I’m running HMC to learn a normal distribution while conditioning on a categorical distribution. A very simplified example of my model is as follows:
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import edward as ed
import numpy as np
import tensorflow as tf
from edward.models import Empirical, Normal, Categorical
from edward.inferences import HMC
mu = Normal(mu=0.0, sigma=1.0)
x = Normal(mu=tf.ones(50) * mu, sigma=1.0)
cat = Categorical(p=x)
observed = 3
qmu = Empirical(params=tf.Variable(tf.zeros([1000])))
inference = ed.HMC({mu: qmu}, data={cat: observed})
inference.run()
sess = ed.get_session()
print(sess.run(qmu.params))
The acceptance rate is 0, so after inference I end up with an empirical distribution for qmu that is all zeros. I assume this is a conceptual problem rather than a coding one, but I’m not quite sure what I’m doing wrong.
As a follow-up to this, I’m having trouble finding ways to do exact inference on a distribution with discrete support in Edward. For example, suppose I have a joint distribution over a Normal and a Categorical. I assume the Edwardian approach is to compose two inference algorithms, e.g. HMC for the Normal and something that supports discrete distributions for the Categorical. Sorry if this doesn’t make sense - happy to go into more detail, but I’m not sure if this is the right place to sort out my conceptual confusions.
The model is not well-defined: x is a Normal random variable, so its samples are unconstrained real values, but cat’s p argument expects a vector of probabilities. If you want to parameterize a Categorical with real values, use its logit parameterization. Swapping the line with
cat = Categorical(logits=x)
works. (Also, note you are defining a prior for x but not performing any inference on it. This is fine only in toy problems like this.)
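For reference, the full corrected toy model would look roughly like this (a minimal sketch, keeping the same argument names as above):

import edward as ed
import tensorflow as tf
from edward.models import Categorical, Empirical, Normal

mu = Normal(mu=0.0, sigma=1.0)
x = Normal(mu=tf.ones(50) * mu, sigma=1.0)
# x is real-valued, so parameterize the categorical via logits rather than probabilities.
cat = Categorical(logits=x)

qmu = Empirical(params=tf.Variable(tf.zeros(1000)))
# cat is observed, so mu is the only latent variable being inferred.
inference = ed.HMC({mu: qmu}, data={cat: 3})
inference.run()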
suppose I have a joint distribution over a Normal and a Categorical. I assume the Edwardian approach is to compose two inference algorithms, e.g. HMC for the Normal and something that supports discrete distributions for the Categorical.
It depends on what variables are latent and what variables are observed. In the provided example, the data is discrete (the categorical variable is observed). Thus you don’t need to do any “inference” over it; you only need to do inference over the unobserved variables (the normal).
Thanks - yeah, I see that the toy example I gave was ill-defined, sorry about that. As for the other question, I’m having a little trouble working out from the docs how to do exact inference for a discrete distribution. Say I have two latent variables: one is a Normal distribution, the other a Categorical. I want to use HMC for the former and something exact for the latter. My understanding is that I can just do two separate inferences, one for each, but if so, what class of inference should I use for the categorical?
Thanks, and good job with Edward - it’s really awesome!
if so, what class of inference should I use for the categorical?
There’s no universal solution: any inference that works over discrete variables is worth experimenting with (ideally Gibbs, but also KLqp or MetropolisHastings). You can refer to the API reference for what’s available. You mention you want something “exact”. Can you elaborate?
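As a rough illustration, two inferences can be composed so that each conditions on the other’s approximation. The model, variable names, and the choice of KLqp for the discrete latent below are all made up for the sketch:

import edward as ed
import tensorflow as tf
from edward.models import Categorical, Empirical, Normal

K = 3  # hypothetical number of categories
mu = Normal(mu=0.0, sigma=1.0)                           # continuous latent
z = Categorical(logits=tf.zeros(K))                      # discrete latent
x = Normal(mu=mu + tf.cast(z, tf.float32), sigma=1.0)    # toy likelihood
x_data = 2.5

qmu = Empirical(params=tf.Variable(tf.zeros(1000)))
qz = Categorical(logits=tf.Variable(tf.zeros(K)))

# HMC over mu conditions on qz; KLqp over z conditions on qmu.
inference_mu = ed.HMC({mu: qmu}, data={x: x_data, z: qz})
inference_z = ed.KLqp({z: qz}, data={x: x_data, mu: qmu})

inference_mu.initialize()
inference_z.initialize()

sess = ed.get_session()
tf.global_variables_initializer().run()
for _ in range(1000):
  inference_mu.update()
  inference_z.update()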
What I was thinking of was something like webppl’s “enumerate” inference, which just gives a deterministic solution to the inference problem by enumerating through the support of the discrete distribution. I’m currently doing this “by hand” in Edward, but wanted to check if there was a more Edwardian way.
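For context, the by-hand version is roughly in this spirit (an illustrative sketch with a made-up toy model, not the exact code):

import tensorflow as tf
from edward.models import Categorical, Normal

K = 3                                    # toy number of categories
z = Categorical(logits=tf.zeros(K))      # discrete latent
x_obs = 0.7                              # observed value

# Enumerate the support of z: evaluate the joint log-density at each value,
# then normalize with logsumexp to get the exact posterior over z.
log_joint = tf.stack([z.log_prob(k) +
                      Normal(mu=float(k), sigma=1.0).log_prob(x_obs)
                      for k in range(K)])
log_posterior = log_joint - tf.reduce_logsumexp(log_joint)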
You can use the Mixture random variable, which integrates out the categorical assignments. It’s also more efficient than naive approaches in that its log-density uses the logsumexp trick.
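For example, a marginalized mixture can be written roughly as follows (a sketch; the number of components and their locations are arbitrary):

import tensorflow as tf
from edward.models import Categorical, Mixture, Normal

# Mixture weights over 3 components; the discrete assignment never appears
# as a latent variable, since it is summed out inside Mixture's log-density.
cat = Categorical(logits=tf.zeros(3))
components = [Normal(mu=loc, sigma=1.0) for loc in [-2.0, 0.0, 2.0]]
x = Mixture(cat=cat, components=components)

Inference then only needs to deal with any continuous parameters.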
I suppose we could also have something like an ed.enumerate(x) function if that would be helpful. It could return a Mixture random variable integrating over all the {categorical, bernoulli, multinomial, binomial} random variables that x depends on.
Ah, that makes sense, thanks! I guess an enumerate function might be useful in examples for people new to probabilistic programming, to show simple cases of inference, but this solution seems good.