The code looks great. Some comments:
- In `klqp.py`, we don’t place the `build_loss_and_gradients` function as a method inside `KLqp` because we use it across many KLqp algorithms. Since your function is only used in one class, it’s recommended you write it as a method (c.f., `klpq.py`).
- What’s the justification for a default `alpha=0.2`?
- What does a ‘min’ backward pass correspond to? I’m not sure I recall a VR-min; does it correspond to alpha → \infty? I haven’t done the math.
- Is the `logF = tf.reshape(logF, [inference.n_samples, 1])` reshape necessary? It seems you could just do `logF = tf.stack(logF)`.
- Since you only clip on the LHS, you can change `logF = tf.log(tf.clip_by_value(tf.reduce_mean(tf.exp(logF - logF_max), 0), 1e-9, np.inf))` to use `tf.maximum(1e-9, *)`.
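For reference, here is a minimal NumPy sketch (not the TensorFlow code under review) of the stabilized log-mean-exp that line computes, using the suggested one-sided `maximum` floor in place of the two-sided `clip_by_value`:

```python
import numpy as np

def log_mean_exp(logF, eps=1e-9):
    """Numerically stable log-mean-exp over axis 0 with a one-sided floor.

    Subtracting the per-column max before exponentiating avoids overflow;
    flooring the mean at `eps` (one-sided, mirroring the suggested
    tf.maximum(1e-9, ...) change) guards against log(0).
    """
    logF = np.asarray(logF, dtype=np.float64)
    logF_max = np.max(logF, axis=0)
    stabilized = np.mean(np.exp(logF - logF_max), axis=0)
    return np.log(np.maximum(stabilized, eps)) + logF_max
```

Unlike the snippet above, this sketch adds `logF_max` back in so the function is self-contained; in the PR code that correction presumably happens on a later line.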
Would you be interested in submitting a PR? The algorithm would be a nice addition to Edward’s arsenal.
You’re correct. But it also covers trick 1: if the distributions have the property `reparameterization_type == tf.contrib.distributions.FULLY_REPARAMETERIZED`, then gradients with respect to distribution parameters backpropagate through the sampling. See also the discussion in tensorflow/tensorflow#7236 (“Gradient is incorrect for log pdf of Normal distribution”).
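To illustrate why full reparameterization lets gradients flow through sampling, here is a hypothetical NumPy sketch (not Edward’s or TensorFlow’s implementation): writing `z = mu + sigma * eps` with parameter-free noise `eps` makes each sample a differentiable function of the parameters, so the pathwise gradient estimator applies.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 1.5, 0.7, 200_000

eps = rng.standard_normal(n)   # parameter-free noise, eps ~ N(0, 1)
z = mu + sigma * eps           # reparameterized samples of N(mu, sigma^2)

# For f(z) = z**2: df/dz = 2*z and dz/dmu = 1, so the pathwise
# (reparameterization) estimate of d/dmu E[f(z)] is the sample mean of 2*z.
grad_mu = np.mean(2.0 * z)

# Analytic value: d/dmu E[z^2] = d/dmu (mu^2 + sigma^2) = 2*mu = 3.0
print(grad_mu)
```

With a non-reparameterized sampler there is no such differentiable path, which is why algorithms fall back to the score-function estimator in that case.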