Hello,

I have mostly worked with full Bayesian approaches, but I have started learning Edward so that I can use variational inference (VI) on large datasets.

I tested a simple model, y_i ~ N(mu, sigma), with priors mu ~ N(0, 1.5) and sigma ~ Inv-Gamma(1, 1).
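To spell out the generative process I mean, here is a minimal NumPy sketch of one draw from the model. It assumes the second argument of each normal is a standard deviation (as in the `scale=` arguments of my Edward code) and that Inv-Gamma(1, 1) is the reciprocal of a Gamma(shape=1, rate=1) draw. The name `sample_model` is just something I made up for illustration.

```python
import numpy as np

def sample_model(n, rng):
    """Draw (mu, sigma, y) once from the model described above."""
    # Prior on the mean: N(0, 1.5), with 1.5 taken as a standard deviation.
    mu = rng.normal(loc=0.0, scale=1.5)
    # Inv-Gamma(a, b): reciprocal of a Gamma(shape=a, rate=b) draw.
    # NumPy's gamma takes shape and *scale*, so rate b becomes scale 1/b.
    a, b = 1.0, 1.0
    sigma = 1.0 / rng.gamma(shape=a, scale=1.0 / b)
    # Likelihood: n i.i.d. observations y_i ~ N(mu, sigma).
    y = rng.normal(loc=mu, scale=sigma, size=n)
    return mu, sigma, y

rng = np.random.default_rng(0)
mu, sigma, y = sample_model(1000, rng)
print(mu, sigma, y.mean(), y.std())
```

With a = b = 1 the shape/rate versus shape/scale distinction makes no difference, so I do not think the parameterization is the cause of my problem.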

When I run KLqp, the parameters are not identified: the estimates differ a lot from run to run.

(1) Did I do anything wrong?

(2) This example works well with ADVI in Stan. I would expect KLqp to work well on it too (though I guess the two methods are different).

(3) How can I monitor convergence? In Stan, I can watch the ELBO. Is the ELBO recorded in Edward, or do I need to record it myself using `inference.print_progress(info_dict)`?
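To be concrete about what I mean by monitoring convergence in (3): if I had the per-iteration loss values in a list (I am assuming the loss Edward reports is the negative ELBO estimate), I would want to run something like the check below. This is only a rough sketch in plain NumPy; `has_converged`, `window` and `rel_tol` are names and defaults I made up, not anything from Edward or Stan.

```python
import numpy as np

def has_converged(losses, window=100, rel_tol=1e-3):
    """Compare the mean loss over the last `window` iterations with the mean
    over the `window` iterations before that; report convergence when the
    relative change is at most `rel_tol`."""
    losses = np.asarray(losses, dtype=float)
    if losses.size < 2 * window:
        return False  # not enough history to compare two windows
    prev = losses[-2 * window:-window].mean()
    last = losses[-window:].mean()
    return bool(abs(last - prev) <= rel_tol * max(abs(prev), 1e-12))

# Synthetic traces, only to show the intended behaviour.
t = np.arange(1000)
flat_trace = 1500.0 + 1000.0 * np.exp(-t / 50.0)   # levels off
falling_trace = 2000.0 - t                         # still decreasing
print(has_converged(flat_trace), has_converged(falling_trace))
```

Averaging over windows is there because the stochastic ELBO estimate is noisy from one iteration to the next, so comparing single values would be unreliable.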

I've attached the toy example below. Thank you for any comments.

I would really appreciate any advice!

```
%matplotlib inline
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import edward as ed
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import edward.models as edm
plt.style.use('ggplot')
ed.set_seed(42)
N = 1000
# Simulated data: true mu = 2, true sigma = 1
y_dat = np.random.normal(loc=2, scale=1, size=N)
y_dat = y_dat.astype(np.float32)
# Priors
mu = edm.Normal(loc=tf.zeros([1]), scale=tf.ones([1]) * 1.5)
sigma = edm.InverseGamma(concentration=tf.ones([1]), rate=tf.ones([1]))  # shape, rate. not shape scale
# Likelihood
y = edm.Normal(loc=tf.ones([N]) * mu, scale=sigma)
# Variational approximations; softplus keeps scale, concentration and rate positive
qmu = edm.Normal(loc=tf.Variable(tf.random_normal([1])),
                 scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))
qsigma = edm.InverseGamma(
    concentration=tf.nn.softplus(tf.Variable(tf.random_normal([1]))),
    rate=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))
inference = ed.KLqp({mu: qmu, sigma: qsigma}, data={y: y_dat})
inference.run(n_iter=5000)
```