Custom random variable with two parts


#1

I need to write a custom random variable, but its outcome consists of two arrays, not a single number. Does Edward's custom RandomVariable _log_prob(self, value) allow for that? Do I simply set value to be a TensorFlow tensor? How exactly would you declare that when you use the "value" variable? (Maybe this is just a Python or TensorFlow question, but I don't know how.)

In fact, even for a single array (or vector) outcome, I don't see any sample code that handles this. How would you optimize the ELBO in this case? Is it handled automatically once you pass in a tensor of the right shape?

Another question: when I implemented this in Stan, I only had to provide the log prob; there was no need for samples. Why does Edward require generating samples?


#3

_log_prob is expected to return a Tensor giving the per-data-point log-likelihood. (Refer to the implementations in tensorflow.contrib.distributions.)

If value has shape [n, 2], then you could return an [n, 1] Tensor.

The various implementations of ed.Inference all sum the value returned by _log_prob, so everything will work automatically.
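For instance, here is a minimal NumPy sketch of that shape convention (a toy density of my own choosing, not Edward's actual code): _log_prob maps a [n, 2] value to an [n, 1] tensor of per-data-point log-likelihoods, which inference then sums into a scalar.

```python
import numpy as np

# Hypothetical batch of n = 4 data points, each a pair -> shape [n, 2].
value = np.array([[0.1, 0.2],
                  [0.3, 0.4],
                  [0.5, 0.6],
                  [0.7, 0.8]])

def log_prob(value):
    # Toy density: independent standard normals on both components,
    # reduced over the event (last) dimension -> shape [n, 1].
    per_element = -0.5 * (value ** 2 + np.log(2 * np.pi))
    return per_element.sum(axis=1, keepdims=True)

lp = log_prob(value)   # shape [4, 1]: one log-likelihood per data point
total = lp.sum()       # inference sums these into the scalar objective
```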

Edward requires samples to check that tensor shapes are compatible (see ed.models.random_variable.RandomVariable.__init__). You can circumvent this by initializing with the keyword argument value.

Edward also uses samples to construct the terms in the objective function corresponding to RandomVariables in data.


#4

Hi, thanks for the explanation. I don't get that last part at all. What is meant by "corresponding to RandomVariables in data"?

There is something I don't understand about using the value input parameter. If I supply it, it does bypass the tensor shape check; however, it also bypasses calling sample_n, so the custom _sample_n method I write in my custom random variable class is never called. I don't think that's a good thing?


#5

The arguments to inference.run are data and latent_vars. By "RandomVariables in data" I mean the keys of the dictionary passed in as data.

For example, if I have

x = tf.placeholder(...)
y = ed.models.Normal(loc=x, ...)

inference.run(data={y: y_train, x: x_train})

Edward uses samples of y to compute the objective function for BBVI.
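To make "samples enter the objective" concrete, here is a rough NumPy sketch of the Monte Carlo ELBO estimate that BBVI optimizes (the toy model, variational family, and all names here are my own illustration, not Edward internals):

```python
import numpy as np

rng = np.random.default_rng(0)

def log_normal(x, loc, scale):
    # Log density of N(loc, scale^2), elementwise.
    return -0.5 * np.log(2 * np.pi * scale ** 2) - (x - loc) ** 2 / (2 * scale ** 2)

# Toy model: z ~ N(0, 1), y | z ~ N(z, 1); variational family q(z) = N(mu, 1).
y_train = 1.0
mu = 0.5
S = 10000                                    # number of Monte Carlo samples

z = rng.normal(mu, 1.0, size=S)              # draw samples from q
elbo = np.mean(log_normal(y_train, z, 1.0)   # log p(y | z)
               + log_normal(z, 0.0, 1.0)     # + log p(z)
               - log_normal(z, mu, 1.0))     # - log q(z)
```

The estimate is noisy but unbiased; Edward builds the analogous terms symbolically in the TensorFlow graph.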

You are correct that initializing a RandomVariable with a value means you never draw a sample.


#6

I am starting to see the real problem now. So the distributions add up the log prob of each data point.

In my model the dimensions are [N, T, 2]. Computing the probability requires matrix multiplication along the time dimension, so it cannot be put into log form at the element level. The log can only be applied after all the elements have gone through the matrix multiplications and been multiplied together. It's fine to take the log prob for each batch sample (the N axis), but not for each time element (the T axis) within a single batch sample.

Can you think of any way to take care of this situation?


#7

OK, I solved this problem by adding up the log probs myself and putting the result in the last column.

I now have a different problem. Instead of adding the log probs, I tried multiplying the probs and then taking the log, but this leads to NaN for all parameters. I tried converting to tf.float64 for this part, but to no avail. I know this is more a TensorFlow problem than an Edward one, but if someone very experienced with TensorFlow can help here, I would really appreciate it.


#8

It is a floating-point precision problem. I am surprised even float64 couldn't cover it.
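The underflow is easy to reproduce in plain NumPy: multiplying many small probabilities hits zero even in float64 (whose smallest positive value is around 1e-308), and log(0) is -inf, while summing the logs stays finite.

```python
import numpy as np

probs = np.full(1000, 1e-3)               # 1000 small probabilities

with np.errstate(divide="ignore"):
    naive = np.log(np.prod(probs))        # product underflows to 0.0 -> log gives -inf

stable = np.sum(np.log(probs))            # sum of logs: 1000 * log(1e-3), finite
```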


#9

Without a mathematical description of the likelihood, I can’t say much, but it seems like you have a model:

x_i1, ..., x_iT, y_i1, ..., y_iT ~ g(x_i, y_i), i = 1..N

where x_i and y_i are length-T vectors. I assume you've concatenated them to get an [N, T, 2] tensor.

In this case, you should be returning an [N, 1] tensor where each element is log g(x_i, y_i).

This doesn’t require taking logs for each time element, only per batch sample.
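A shape-only NumPy sketch of that suggestion (the 2x2 matrices and the trace are toy stand-ins for the actual likelihood): the matrix products run along the T axis inside g, and the log is taken only once per batch sample, giving an [N, 1] result.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 3, 5
value = rng.uniform(0.1, 1.0, size=(N, T, 2))    # concatenated (x_i, y_i) pairs

def log_g(value):
    # Toy g: for each sample i, chain 2x2 matrices built from the
    # (x_it, y_it) pairs, then take the log of a positive scalar summary.
    out = np.empty((value.shape[0], 1))
    for i, sample in enumerate(value):
        m = np.eye(2)
        for x_t, y_t in sample:                   # matrix product along the T axis
            m = m @ np.array([[x_t, 0.0], [0.0, y_t]])
        out[i, 0] = np.log(m.trace())             # log applied once per sample
    return out

lp = log_g(value)                                 # shape [N, 1]
```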


#10

You are right, that worked!


#11

I added some tf.Print statements in the sample_n method, and nothing gets printed out during execution.

Is sample_n not used during training? In other words, if I were debugging my output, I shouldn't even look there?


#12

_sample_n returns a TensorFlow op which gets added to the computation graph. (Refer to tensorflow/tensorflow/python/ops/distributions/distribution.py.)

The function call itself is used to build the computation graph for inference, not to actually perform inference.
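A plain-Python analogy of why the tf.Print never fires (this is an illustration of deferred execution, not actual TensorFlow code): calling the method builds a deferred op, and the body only runs when that op is later executed.

```python
def sample_n(n):
    # Analogue of _sample_n: return a deferred computation (an "op")
    # rather than performing the sampling now.
    def op():
        print("actually sampling")   # fires only when the op is executed
        return list(range(n))
    return op

op = sample_n(3)   # "graph construction": nothing is printed here
samples = op()     # "session run": the print fires and samples are produced
```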