Basic Regression Not Working

mig · August 14, 2017, 2:35pm

Hello Everyone,
I am having troubles with a simple regression, any help appreciated.

Code:

Priors:

rate = Gamma(1.0, 1.0, sample_shape=1)
mu = Normal([200.], [100.])
sigma = Gamma([1.0], [1.0])

Model:

z = Poisson(rate=tf.multiply(rate, tf.ones([size, 1])))
x = Normal(loc=tf.add(z, tf.multiply(mu, tf.ones([size, 1]))),
       scale=tf.multiply(sigma, tf.ones([size, 1])))

Posterior:

qrate = PointMass(
tf.nn.softplus(tf.Variable(tf.random_normal([1]))), )
qmu = PointMass(tf.Variable(tf.random_normal([1])), )
qsigma = PointMass(
tf.nn.softplus(tf.Variable(tf.random_normal([1]))), )
qz = Empirical(
tf.nn.softplus(tf.Variable(tf.random_normal([size, size, 1]))), )

Inference:

inference = ed.MAP({rate:qrate, mu:emu, sigma:qsigma}, data={x: x_train, z:qz})
inference.run(n_iter=10000)

The parameter converge to values that are really far off the “true” simulated parameters.

Thanks

dustin · August 14, 2017, 9:55pm

The likelihood x's log-density has mean given by a Poisson variable z. The Poisson variable itself depends on the parameters you aim to optimize over. This means gradient-based optimization fails because it tries to apply chain rule through the Poisson sample.

To use MAP on such a model you need to write a compound random variable x which analytically marginalizes out the discrete variable, i.e., the density is given by

p(x) = sum_{z=0}^\infty Poisson(z | .) Normal(x | .)

mig · August 14, 2017, 10:06pm

I modified the post in a way that I am doing inference but still don’t converge… why is it running but failing?
Can you guide me a little bit on what should I write to be able to do inference? There is no close form analytical solution for the sum you have just written !
is the problem the same as in stan with discrete random variables? (Stan can’t sample from them, here the problem is computing the gradient)

Thanks Dustin !

dustin · August 14, 2017, 10:30pm

This ultimately comes down to what you’re trying to model. Namely, is z observed or latent?

If z is observed as in the edited first post, then MAP works fine. You can double check that it converges to the true parameters using large enough simulated data. If it doesn’t, can you provide a minimal working example?
If z is latent as in the unedited first post, then inference is a lot more difficult because of the intractable likelihood. You have to use ABC methods, which could be as crude as approximating the likelihood. Alternatively, you can perform posterior inference by inferring z as well; this makes the likelihood and priors all tractable but changes the modeling problem.
If z is observed, then Stan and Edward can both easily handle the problem. If z is latent, Stan doesn’t work as you note, but Edward does.

mig · August 15, 2017, 1:13am

Ok, maybe I am not making myself clear.

Z is latent.
I can write the variational inference problem and compute update rules for (z, rate, mu, sigma). I am trying to write down different Edward algorithms and they do not seem to converge to the right solution (probably because I am new to Edward).

For example, by doing
qz = Empirical( tf.nn.softplus(tf.Variable(tf.random_normal([size, size, 1]))), )
and
inference = ed.MAP({rate:qrate, mu:emu, sigma:qsigma}, data={x: x_train, z:qz})
I thought I was doing EM correctly.

Can you post any algorithm (MAP, VB, EM) that would perform inference in the problem?

Thanks !

dustin · August 15, 2017, 6:40pm

The line

inference = ed.MAP({rate:qrate, mu:emu, sigma:qsigma}, data={x: x_train, z:qz})

only specifies the M-step. You still need to specify some inference algorithm to do the E-step and then alternate between the two during training. One approach could be

inference_e = ed.KLqp({z: qz}, data={x: x_train, rate:qrate, mu: qmu, sigma: qsigma})
inference_m = ed.MAP({rate: qrate, mu: qmu, sigma: qsigma}, data={x: x_train, z:qz})

inference_e.initialize()
inference_m.initialize()
tf.global_variables_initializer().run()

for _ in range(10000):
  inference_e.update()
  inference_m.update()

where qz is a Poisson approximating family. If you want Monte Carlo for the E-step, you have to use a non-gradient based MCMC algorithm such as Metropolis-Hastings.

mig · August 15, 2017, 7:19pm

Got it … We will be doing variational EM now…

A brief note, I had to replace qz by the following to make it work.

qz = Poisson(rate=tf.nn.softplus(
tf.Variable(tf.truncated_normal([size, 1], mean=500, stddev=0.05), name=“rate”)))

Thanks !

Topic		Replies	Views
MCMC not working for basic model	4	982	May 8, 2018
Poisson-gamma regression	6	2003	March 19, 2017
Bernoulli distribution parameterized by a mapped Gaussian	11	1770	February 11, 2018
Fixed scales in regression example	6	1395	August 29, 2017
Help with a simple parameter estimation	0	1281	January 22, 2019

Basic Regression Not Working

Related Topics