Why does a toy normal model fail with KLqp?

Hello,

I usually work with a fully Bayesian approach, but I have started learning Edward for variational inference on large datasets.

I just tested a simple model, y_i ~ N(mu, sigma), with priors mu ~ N(0, 1.5) and sigma ~ Inv-Gamma(1, 1).
When I run KLqp, the parameters are not well identified: the estimates differ a lot from run to run.

(1) Did I do anything wrong?
(2) This example works well with ADVI in Stan, and I would still expect KLqp to handle it (though I understand the two approaches are different).
(3) How do I monitor convergence? In Stan I can watch the ELBO. Is the ELBO recorded in Edward, or do I need to record it myself via inference.print_progress(info_dict)?

I’ve attached the toy example below. Thank you for any comments; I would really appreciate your advice!

%matplotlib inline
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
 
import edward as ed
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import edward.models as edm
 
plt.style.use('ggplot')
 
ed.set_seed(42)
N = 1000

# Simulated data: y_i ~ N(2, 1)
y_dat = np.random.normal(loc=2, scale=1, size=N)
y_dat = y_dat.astype(np.float32)

# Model: mu ~ N(0, 1.5), sigma ~ Inv-Gamma(1, 1), y_i ~ N(mu, sigma)
mu = edm.Normal(loc=tf.zeros([1]), scale=tf.ones([1]) * 1.5)
sigma = edm.InverseGamma(concentration=tf.ones([1]),
                         rate=tf.ones([1]))  # parameterized by shape and rate, not shape and scale
y = edm.Normal(loc=tf.ones([N]) * mu, scale=sigma)

# Variational families
qmu = edm.Normal(loc=tf.Variable(tf.random_normal([1])),
                 scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))
qsigma = edm.InverseGamma(concentration=tf.nn.softplus(tf.Variable(tf.random_normal([1]))),
                          rate=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))

inference = ed.KLqp({mu: qmu, sigma: qsigma}, data={y: y_dat})
inference.run(n_iter=5000)

Hi,

(1) I don’t see anything wrong per se.
(2) In my experience, KLqp has trouble with the inverse-gamma distribution, though I don’t know why; my guess is that the gamma function is somehow tough to differentiate. If it suits your purposes, use a log-normal distribution in the variational model instead (see the link below and the log-normal sketch after this list).
(3) Instead of calling run, you can initialize the inference and run the updates yourself, recording the loss as you go: see the notebook, and the update-loop sketch below.
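
For (2), a minimal sketch of what I mean, assuming your TensorFlow version exposes TransformedDistribution and the Exp bijector under tf.contrib.distributions (untested, adapt as needed):

tfd = tf.contrib.distributions

# Log-normal variational family for sigma: a Normal on log(sigma) pushed
# through exp(). Assumes tfd.bijectors.Exp and edm.TransformedDistribution
# are available in your TF/Edward versions.
qsigma = edm.TransformedDistribution(
    distribution=tfd.Normal(loc=tf.Variable(tf.random_normal([1])),
                            scale=tf.nn.softplus(tf.Variable(tf.random_normal([1])))),
    bijector=tfd.bijectors.Exp())

inference = ed.KLqp({mu: qmu, sigma: qsigma}, data={y: y_dat})
inference.run(n_iter=5000)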
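
For (3), the manual loop could look roughly like this; info_dict['loss'] is an estimate of the negative ELBO at each iteration, so you can store and plot it to judge convergence:

inference = ed.KLqp({mu: qmu, sigma: qsigma}, data={y: y_dat})
inference.initialize(n_iter=5000)

sess = ed.get_session()
tf.global_variables_initializer().run()

losses = []
for _ in range(inference.n_iter):
    info_dict = inference.update()       # one stochastic gradient step
    inference.print_progress(info_dict)  # progress bar with the running loss
    losses.append(info_dict['loss'])     # -ELBO estimate at this iteration
inference.finalize()

plt.plot(losses)  # should flatten out as the ELBO converges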

Hope that helps :)


Thank you so much!
When I call inference.run(), the results differ too much between runs; I guess that is because of the inverse-gamma variational distribution. I appreciate your notebook!