Implementing the local reparameterization trick

aksarkar · September 15, 2017, 12:52am

I’m trying to implement the local reparameterization trick (Kingma, Salimans, and Welling 2015) to fit regressions with the spike-and-slab prior (point-normal mixture) on regression coefficients. I previously did this in Theano using the reparameterization gradient and analytical KL.

My current attempt in Edward is https://github.com/aksarkar/nwas/blob/4d6a1332eb39ca2b5876e14912cbf8eae1b2ed3f/analysis/example.org

Is there a better way to do this in Edward?

I defined two new Edward random variables:

SpikeSlab, which supports analytical KL for the prior
GeneticValue (domain-specific jargon for x * theta), which supports sampling (using a tf.contrib.distributions.Normal instance) and a dummy analytical KL (it just returns 0)
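For readers unfamiliar with the analytical KL for a point-normal mixture, here is a minimal NumPy sketch of one common form. It is not the code from the linked repository. The function name `spike_slab_kl` and the parameter names (`pi`, `mu`, `sigma` for the variational posterior; `pi0`, `sigma0` for the prior) are assumptions made for illustration, and the slab of the prior is assumed to be centered at zero.

```python
import numpy as np

def spike_slab_kl(pi, mu, sigma, pi0, sigma0):
    """KL(q || p) summed over coefficients (illustrative names, not Edward API).

    q(theta_j) = pi_j * N(mu_j, sigma_j^2) + (1 - pi_j) * delta_0
    p(theta_j) = pi0  * N(0, sigma0^2)     + (1 - pi0)  * delta_0
    """
    # Keep inclusion probabilities away from 0 and 1 so the logs stay finite.
    pi = np.clip(pi, 1e-12, 1 - 1e-12)
    # KL between the two Gaussian slabs, N(mu, sigma^2) vs. N(0, sigma0^2).
    kl_slab = np.log(sigma0 / sigma) + (sigma**2 + mu**2) / (2 * sigma0**2) - 0.5
    # KL between the Bernoulli inclusion indicators.
    kl_indicator = pi * np.log(pi / pi0) + (1 - pi) * np.log((1 - pi) / (1 - pi0))
    # The slab term only counts when the coefficient is included.
    return np.sum(pi * kl_slab + kl_indicator)
```

When the variational parameters equal the prior (`pi = pi0`, `mu = 0`, `sigma = sigma0`), this evaluates to zero, which is a quick sanity check for any implementation.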

Then, I call ed.ReparameterizationKLKLqp directly since I know it won’t blow up.

I can’t think of a way to do this that doesn’t expose the reparameterization in the model specification, but this solution doesn’t play nicely with ed.copy, so evaluating the model (e.g. computing the coefficient of determination) requires pulling out the coefficients and computing things outside of Edward/TensorFlow.

dustin · September 16, 2017, 1:17pm

If you’d like to use the local reparameterization trick, “expos[ing] the reparameterization in the model specification” is the proper approach. Namely, define the model with the weights marginalized out, so that the neurons are the random variables. Inference should be over logodds, scale, and eta given y and x.
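As a rough illustration of what marginalizing out the weights looks like, here is a minimal NumPy sketch. It assumes a factorized point-normal variational posterior on the coefficients and approximates eta = x @ theta by a Gaussian with matching first two moments, which is an approximation for spike-and-slab coefficients rather than an exact marginal. The names `eta_moments` and `sample_eta` are hypothetical and are not part of Edward.

```python
import numpy as np

def eta_moments(x, pi, mu, sigma):
    """Mean and variance of eta = x @ theta under a factorized point-normal q(theta)."""
    # Moments of each coefficient: theta_j ~ pi_j N(mu_j, sigma_j^2) + (1 - pi_j) delta_0.
    theta_mean = pi * mu
    theta_var = pi * (sigma**2 + mu**2) - theta_mean**2
    # Independent coefficients, so means add and variances add with squared inputs.
    return x @ theta_mean, (x**2) @ theta_var

def sample_eta(x, pi, mu, sigma, rng):
    """Draw one eta per row of x, with independent noise per data point."""
    mean, var = eta_moments(x, pi, mu, sigma)
    return mean + np.sqrt(var) * rng.standard_normal(mean.shape)

# Example usage with made-up data.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))
eta = sample_eta(x, np.full(3, 0.5), np.zeros(3), np.ones(3), rng)
```

The noise is drawn per data point rather than per weight, which is the source of the variance reduction described by Kingma, Salimans, and Welling.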

Alternatively, you can build a new inference algorithm to try to automate local reparameterizations. That said, I prefer the former approach because I personally view the technique more as a choice of model parameterization for efficient VI in the same way we might use non-centered parameterizations for efficient HMC.

“this solution doesn’t play nicely with ed.copy, so evaluating the model (e.g. computing coefficient of determination) requires pulling out the coefficients and computing things outside of Edward/Tensorflow.”

Given the parameters of the marginal distribution on the neurons, you can calculate the parameters of the distribution on the weights, all in Edward/TensorFlow (there’s a one-to-one mapping as in, e.g., Eq. 6 of their paper).
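One way this could look in practice, sketched in NumPy rather than Edward: if the variational parameters are stored on the coefficients (here called `pi` and `mu`, both assumed names) and the neuron parameters are derived from them, the posterior mean coefficients are available directly and the coefficient of determination follows. The helper names below are hypothetical, and the same arithmetic can be written with TensorFlow ops.

```python
import numpy as np

def posterior_mean_coefficients(pi, mu):
    """Posterior mean of each point-normal coefficient: E[theta_j] = pi_j * mu_j."""
    return pi * mu

def coefficient_of_determination(y, y_hat):
    """R^2 = 1 - (residual sum of squares) / (total sum of squares)."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - np.mean(y)) ** 2)
    return 1.0 - ss_res / ss_tot

# Example usage with made-up data: predict with the posterior mean coefficients.
x = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta_hat = posterior_mean_coefficients(np.array([1.0, 0.5]), np.array([2.0, 4.0]))
y = x @ theta_hat
r2 = coefficient_of_determination(y, x @ theta_hat)
```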
