Rule of thumb in choosing n_samples

If I understand correctly, n_samples in inference classes like ed.KLqp is the number of Monte Carlo samples S referred to here. Is this correct?

What’s a good rule of thumb for choosing n_samples? Is it simply the more the better within a given time budget?

That is correct.
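
For concreteness, here is a minimal sketch of where n_samples enters, using a toy Bayesian linear regression (Edward 1.x with TensorFlow 1.x; the model and data below are made up for illustration):

```python
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Normal

# Toy data (made up for illustration).
N, D = 50, 3
X_train = np.random.randn(N, D).astype(np.float32)
y_train = (X_train.dot(np.ones(D)) + 0.1 * np.random.randn(N)).astype(np.float32)

# Model: y ~ Normal(X w, 0.1), with a standard normal prior on w.
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
y = Normal(loc=ed.dot(X, w), scale=0.1 * tf.ones(N))

# Variational approximation q(w).
qw = Normal(loc=tf.get_variable("qw/loc", [D]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [D])))

inference = ed.KLqp({w: qw}, data={X: X_train, y: y_train})
# n_samples is the number of Monte Carlo samples S drawn from q at
# each iteration to estimate the ELBO and its gradient.
inference.run(n_samples=5, n_iter=1000)
```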

However, I don’t know of a clear rule of thumb.

In my experience, using more samples reduces the variance of the Monte Carlo estimate of the ELBO (the loss you see during training), but it does not make convergence faster.
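
One way to see this for yourself is to run the same inference with different values of n_samples and record the per-iteration loss. A sketch, reusing the toy model from the snippet above (a hypothetical experiment, not from the thread):

```python
# Record the per-iteration loss for different numbers of MC samples.
# Reuses w, qw, X, y, X_train, y_train from the sketch above.
losses = {}
for S in [1, 5]:
    inference = ed.KLqp({w: qw}, data={X: X_train, y: y_train})
    inference.initialize(n_samples=S, n_iter=500)
    sess = ed.get_session()
    # Re-initializing resets q's parameters so both runs start fresh.
    tf.global_variables_initializer().run()
    losses[S] = [inference.update()['loss'] for _ in range(inference.n_iter)]
```

The loss trace for S=1 is typically much noisier than for S=5, but both settle around the same value, while each iteration with S=5 costs roughly five times as much computation.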

In https://github.com/blei-lab/edward/blob/master/notebooks/tensorboard.ipynb, the authors state:

" With variational inference, we also include information such as the loss function and its decomposition into individual terms. This particular example shows that n_samples=1 tends to have higher variance than n_samples=5 but still converges to the same solution."

which supports your claim that a larger number of MC samples decreases the variance of the loss estimate.
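
As an aside, the loss traces in that notebook come from Edward's TensorBoard support: passing a logdir to run() makes Edward write the loss (and its decomposition) as summaries. A minimal sketch, reusing the inference object from above:

```python
# Write TensorBoard summaries; Edward logs the loss and its
# decomposition when a logdir is given.
inference.run(n_samples=5, n_iter=1000, logdir='log')
```

Launching `tensorboard --logdir=log` then shows the loss curves, where the difference in variance between small and large n_samples is easy to inspect.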