Marginal distribution


Hi, I just finished reading about variational inference. As I understand it, stochastic VI uses the mean-field variational technique, and as long as mean field is involved, individual parameters are not accurately estimated; you are shooting for a good global posterior as the end result. For my model, it is essential to find a good approximation for each individual parameter. Is it fair to say I can start with the full posterior approximation and integrate away all but one parameter to get the marginal distribution of that parameter? What would be the challenges of that approach?


Hi Dustin, you answered all the other questions perfectly. Is this one particularly dubious? I am still unsure whether I am on the right path in integrating the full posterior to obtain marginal distributions. It's a lot of integration, but at least it isn't an optimization problem and won't require iterations. Is there some reason people don't use this technique?


In general, you should infer the joint posterior distribution. This gives you a variational approximation over all dimensions. After training, only keep around the posterior dimensions you care about.


Just to be sure, you mean keep the selected dimensions by marginalizing out the other dimensions, right?


The variational approximation after training represents the posterior jointly over all dimensions. If the variational approximation is a fully-factorized distribution, there is no additional step: you have the marginals immediately. Otherwise, if the approximation is correlated across dimensions, take marginals of the joint in the usual ways (for example, analytically for a Gaussian, or by drawing samples and dropping the coordinates you don't need).
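
To make the two cases concrete, here is a minimal sketch in plain Python. It is only an illustration, not any particular library's API: the helper names `gaussian_marginal` and `sample_marginal` and all the numbers are made up. It assumes the fitted approximation is either a multivariate Gaussian, whose marginal over a subset of dimensions is just the matching sub-vector of the mean and sub-block of the covariance, or any distribution you can draw joint samples from.

```python
import random


def gaussian_marginal(mean, cov, keep):
    """Marginal of N(mean, cov) over the dimensions listed in `keep`.

    For a Gaussian, integrating out the other dimensions amounts to
    selecting the corresponding entries of the mean and covariance.
    """
    sub_mean = [mean[i] for i in keep]
    sub_cov = [[cov[i][j] for j in keep] for i in keep]
    return sub_mean, sub_cov


def sample_marginal(draw_joint, keep, n=1000):
    """Monte Carlo marginal: draw from the joint, keep only `keep`."""
    samples = []
    for _ in range(n):
        z = draw_joint()
        samples.append([z[i] for i in keep])
    return samples


# Hypothetical fitted parameters of a correlated Gaussian approximation.
mean = [0.5, -1.0, 2.0]
cov = [
    [1.0, 0.3, 0.0],
    [0.3, 2.0, 0.5],
    [0.0, 0.5, 1.5],
]

# Analytic marginal over dimensions 0 and 2.
m, c = gaussian_marginal(mean, cov, keep=[0, 2])

# Hypothetical fully-factorized (mean-field) approximation: each
# dimension is already its own marginal, so sampling one coordinate
# is the same as sampling the joint and discarding the rest.
stds = [1.0, 1.4, 1.2]


def draw_joint():
    return [random.gauss(mu, sd) for mu, sd in zip(mean, stds)]


marginal_draws = sample_marginal(draw_joint, keep=[1], n=500)
```

The sampling route works for any approximation you can simulate from, which is why no explicit integration is needed in practice.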