MCMC not working for basic model


I’m trying to get a basic latent variable model working so that I can develop it into something more complicated. Here’s what I’ve got:

# Data
data = tf.random_normal(shape=[n], mean=90, stddev=3).eval()

# Model
mu = Normal(0., 1.)
sigma_2 = Normal(0., 1.)
y = Normal(tf.ones(n) * mu, tf.ones(n) * sigma_2)

# Inference
q_mu = Empirical(tf.Variable(tf.zeros(500)))
q_sigma_2 = Empirical(tf.Variable(tf.zeros(500)))

inference = ed.HMC({mu: q_mu, sigma_2: q_sigma_2}, data={y: data})

I’ve tried step sizes ranging from 100 to 0.0000000001 and no luck — the acceptance rate is never sensible (always 0 or 1) and trying to sample from q_mu or q_sigma_2 returns nan.

Does anyone know what I’m doing wrong here?


The scale parameter of your normal variable y is not guaranteed to be positive: sigma_2 has a Normal(0., 1.) prior, so it can take negative values. Can you try
y = Normal(tf.ones(n) * mu, tf.nn.softplus(tf.ones(n) * sigma_2))
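
To illustrate why the softplus wrapper helps, here is a minimal sketch in plain Python (no Edward or TensorFlow needed). It assumes softplus is defined as log(1 + exp(x)); the softplus helper and the raw_values list are hypothetical names used only for this illustration, and the raw values stand in for draws that sigma_2 could take under its Normal(0., 1.) prior.

```python
import math


def softplus(x):
    """Numerically stable log(1 + exp(x)); the result is always > 0."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


# Hypothetical values that sigma_2 could take under a Normal(0., 1.) prior,
# including negative ones that are not valid scale parameters.
raw_values = [-3.0, -0.5, 0.0, 0.5, 3.0]
scales = [softplus(v) for v in raw_values]

for raw, scale in zip(raw_values, scales):
    print("raw = %5.2f  ->  softplus = %.4f" % (raw, scale))
```

Every transformed value is strictly positive, so the scale passed to Normal stays valid wherever the sampler moves sigma_2.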

I’ve run this code both with and without the change @mrosenkranz suggested. In neither case do q_mu or q_sigma_2 return nan when sampled; instead they return 0, which I suppose isn’t much better. (Changing the step size to 0.0001 produces nonzero samples from q_mu and q_sigma_2, but they aren’t anywhere you’d expect them to be.)

@mrosenkranz @YSanchezAraujo thanks for taking a look at this, really appreciate it.

I too have not had much luck even after making the change @mrosenkranz suggested. q_mu does start moving towards where you’d expect, but still nowhere near where it should get to.
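
As a rough reference for where q_mu "should get to", the posterior for mu has a closed form if the noise scale is treated as known. The sketch below is only that: it assumes n = 100 (the original post does not state n), a known noise standard deviation of 3, a sample mean of exactly 90, and the Normal(0., 1.) prior on mu from the model. The function name normal_posterior_for_mean is hypothetical.

```python
def normal_posterior_for_mean(n, sample_mean, noise_sd,
                              prior_mean=0.0, prior_sd=1.0):
    """Conjugate Normal-Normal update for a mean with known noise scale."""
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * sample_mean) / post_precision
    post_sd = post_precision ** -0.5
    return post_mean, post_sd


# Assumed values: n = 100 observations, sample mean 90, known noise sd 3.
post_mean, post_sd = normal_posterior_for_mean(n=100, sample_mean=90.0,
                                               noise_sd=3.0)
print("posterior mean of mu: %.2f, posterior sd: %.3f" % (post_mean, post_sd))
```

Under these assumptions the posterior mean is about 82.6 rather than 90, because the Normal(0., 1.) prior pulls mu towards zero, and the posterior is narrow (sd around 0.29). A chain initialised at 0 therefore starts a long way from the region it needs to reach.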

Any ideas?

@refrigerator I may be completely wrong, but I think the underlying problem is that your data are completely random, and the generative process in your model has no information about that randomness, or any way to get at it, so the approximation is typically bad.

For example, the model below is essentially linear regression, and because its generative process reflects how the data were produced, the estimate is accurate.

import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal, Empirical

# Data
n = 100  # number of observations
data = np.random.normal(loc=90, scale=3, size=n)
dep_var = data * .6 + np.random.normal(loc=0, scale=1, size=n)

# Model
X = tf.placeholder(tf.float32, [n])
w = Normal(0., 1.)
sigma_2 = Normal(0., 1.)
y = Normal(loc=X*w, scale=tf.nn.softplus(tf.ones(n)*sigma_2))

# Inference
q_w = Empirical(tf.Variable(tf.zeros(10000)))

inference = ed.HMC({w: q_w}, data={X:data, y: dep_var})
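
As a sanity check on what an "accurate" estimate of w would look like here, the sketch below regenerates data the same way using only the standard library and computes the least-squares slope through the origin, plus the posterior mean of w under the Normal(0., 1.) prior. It assumes the noise variance is fixed at 1 (the model above instead draws its scale through softplus(sigma_2)), and the seed and variable names are arbitrary choices for illustration. It does not run Edward.

```python
import random

random.seed(0)  # arbitrary seed so the sketch is reproducible

n = 100
x = [random.gauss(90.0, 3.0) for _ in range(n)]
y = [0.6 * xi + random.gauss(0.0, 1.0) for xi in x]

sxy = sum(xi * yi for xi, yi in zip(x, y))
sxx = sum(xi * xi for xi in x)

# Least-squares slope for a regression through the origin.
w_ls = sxy / sxx

# Posterior mean of w under a Normal(0, 1) prior, assuming unit noise variance.
w_post = sxy / (1.0 + sxx)

print("least-squares w: %.4f, posterior mean w: %.4f" % (w_ls, w_post))
```

Both values land close to the true slope of 0.6, and the prior barely matters because sum(x^2) is so large. The mean of the samples in q_w can be compared against this number once inference has been run.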