Variational Inference with Composition of Variables

Hi everyone!
I’m trying to estimate the demand rate of a product subject to inventory stock-outs. I started by simulating the data and then tried to estimate the latent parameter lambda using variational inference. The estimated posterior q_lambda, however, doesn’t seem to capture the true lambda value of 1.5.

I’ve gotten variational inference to work for the case where inventory is always available (i.e. simply estimating the lambda of Poisson-distributed data), but I’m wondering why it breaks down when using a composition of variables.

The code I used is shown below. Thanks!

import tensorflow as tf
from edward.models import Normal, Poisson, Uniform
import edward as ed
import numpy as np
import matplotlib.pyplot as plt

# DATA SIMULATION
# I_train simulates inventory availability (1 if product is available)
I_train = np.array([np.random.choice((1,0), p=(.8,.2)) for i in range(3000)]).reshape(-1,1)
# T is the number of time intervals
T = I_train.shape[0]
# S_train simulates number of purchases of a product per time interval 
S_train = np.multiply(np.random.poisson(lam=1.5, size=(T,1)),I_train)

#MODEL
# this is the latent variable we wish to estimate
var_lambda = Uniform(tf.zeros([1, 1]), 10.0 * tf.ones([1, 1]))

# observed inventory indicator (1 if in stock)
I = tf.placeholder(tf.float32, [T, 1])
# observed sales: Poisson counts zeroed out when the product is out of stock
S = tf.multiply(Poisson(tf.tile(tf.transpose(var_lambda), [T, 1])), I)

#INFERENCE
# using variational inference

# variational approximation: a Normal with trainable location and scale
q_lambda_loc = tf.Variable(tf.random_normal([1, 1]))
q_lambda_scale = tf.nn.softplus(tf.Variable(tf.random_normal([1, 1])))
q_lambda = Normal(loc=q_lambda_loc, scale=q_lambda_scale)

inference = ed.KLqp({var_lambda: q_lambda}, data={I: I_train, S: S_train})
inference.run(n_samples=1, n_iter=5000)

print(q_lambda.mean().eval())

x_range = tf.range(-10, 30, .1)
sess = ed.get_session()
plt.plot(*sess.run([x_range, tf.transpose(var_lambda.prob(x_range))]), color="green")
plt.plot(*sess.run([x_range, tf.transpose(q_lambda.prob(x_range))]), color="blue")
plt.axvline(x = 1.5, color = "red")
plt.show()

A Normal variational approximation can’t be fit against a Uniform prior on [0, 10]; the prior and the approximation have to share the same support. (Note also that Edward’s automated transformations don’t work for parameter-defined supports like a Uniform’s.)

Maybe try a non-negative continuous prior and variational approximation, like a log-normal. See examples/deep_exponential_family.py for an example.
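
For example, a log-normal prior and approximation could be a drop-in replacement for var_lambda and q_lambda above. This is a sketch, not tested; it assumes a TF 1.x setup where Edward wraps tf.contrib.distributions, so TransformedDistribution and the Exp bijector are available:

from edward.models import Normal, TransformedDistribution

# lambda ~ LogNormal(0, 1): exp of a standard Normal, so support is (0, inf)
var_lambda = TransformedDistribution(
    distribution=Normal(loc=tf.zeros([1, 1]), scale=tf.ones([1, 1])),
    bijector=tf.contrib.distributions.bijectors.Exp())

# log-normal variational approximation with trainable parameters
q_lambda = TransformedDistribution(
    distribution=Normal(
        loc=tf.Variable(tf.random_normal([1, 1])),
        scale=tf.nn.softplus(tf.Variable(tf.random_normal([1, 1])))),
    bijector=tf.contrib.distributions.bijectors.Exp())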

Thank you @dustin for the suggestion. I’ll be trying it out.

I’ve also noticed that whether the inference works depends on how the inventory availability variable I (an indicator variable) is used. It works after making two changes:

  1. Instead of multiplying I with the Poisson random variable

    S = tf.multiply(Poisson(tf.tile(tf.transpose(var_lambda),[T,1])),I)

    it’s multiplied into the Poisson’s rate:

    S = Poisson(tf.multiply(tf.tile(tf.transpose(var_lambda),[T,1]),I))

  2. Since I is now part of the Poisson rate, and the rate can’t be zero, a further adjustment was needed. Instead of having 0’s and 1’s as the values of the inventory availability data I_train

    I_train = np.array([np.random.choice((1,0), p=(.8,.2)) for i in range(3000)]).reshape(-1,1)

    the zeros are replaced with a small epsilon value:

    epsilon = 1e-18
    I_train = np.array([np.random.choice((1,epsilon), p=(.8,.2)) for i in range(3000)]).reshape(-1,1)

I’m not sure why variational inference works with this setup, but it does. For reference, the changes are collected together in the sketch below.
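
Here are the changes in one place (a sketch; note that the simulated sales still need a hard 0/1 mask, since multiplying the Poisson draws by epsilon would otherwise produce non-integer observations):

epsilon = 1e-18
# inventory indicator with zeros replaced by epsilon so the rate stays positive
I_train = np.array([np.random.choice((1, epsilon), p=(.8, .2))
                    for i in range(3000)]).reshape(-1, 1)
T = I_train.shape[0]
# simulate sales with a hard 0/1 mask so the observed counts stay integers
S_train = np.multiply(np.random.poisson(lam=1.5, size=(T, 1)),
                      (I_train == 1).astype(int))

I = tf.placeholder(tf.float32, [T, 1])
# the inventory indicator now scales the rate rather than the sampled counts
S = Poisson(tf.multiply(tf.tile(tf.transpose(var_lambda), [T, 1]), I))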

The model you’re trying to fit looks like

y | z = 1 ~ Poisson(lambda)
y | z = 0 = 0

Written like this, you would need to implement S as a mixture of a Poisson and a PointMass.

If instead you write:

y ~ Poisson(lambda)
lambda | z = 1 ~ g
lambda | z = 0 = epsilon

for some prior g, then your implementation matches the model.

But the zero observations can’t tell you anything about lambda, and you’re assuming the latent indicator is observed.
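
A quick way to see this numerically (plain numpy, just for illustration): the Poisson pmf at k = 0 is exp(-rate), so with rate = epsilon * lambda the zero rows are explained almost perfectly no matter what lambda is.

import numpy as np

epsilon = 1e-18
# P(X = 0) = exp(-rate); with rate = epsilon * lambda this is ~1 for any
# plausible lambda, so these terms are essentially flat in lambda
print(np.exp(-epsilon * 1.5))  # ~1.0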

So it seems you should just model the non-zero part of the data and ignore the zeros.
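
Concretely, something like this (a sketch; var_lambda and q_lambda are assumed to be a non-negative prior and approximation like the ones discussed earlier in the thread):

import tensorflow as tf
import edward as ed
from edward.models import Poisson

# keep only the intervals where the product was actually in stock
S_obs = S_train[I_train == 1].reshape(-1, 1)
T_obs = S_obs.shape[0]

# plain Poisson likelihood over the in-stock intervals only
S = Poisson(tf.tile(tf.transpose(var_lambda), [T_obs, 1]))

inference = ed.KLqp({var_lambda: q_lambda}, data={S: S_obs})
inference.run(n_samples=5, n_iter=5000)  # more samples per step reduces gradient variance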

Thank you @aksarkar! I’m still just learning about Bayesian networks and this is a great help.

I have a follow-up question. When you mentioned I would need to implement S as a mixture of a Poisson and a PointMass, does that mean expressing S as a product of a Poisson and a PointMass (i.e., the line below)?

S = tf.multiply(Poisson(tf.tile(tf.transpose(var_lambda),[T,1])),PointMass(params = I))

When I tried variational inference with this, however, it didn’t work any better.

Did you suggest to just model the non-zero part of the data because implementing a mixture of Poisson and PointMass for inference isn’t feasible?

AFAIK you would have to implement a custom RandomVariable for a mixture of a PointMass and Poisson.

I would suggest modeling the non-zero part of the data because you don’t need to learn anything about the zeros in the data. You’ve observed the inventory, so you know whether 0 purchases are explained by the item not being in stock versus not being in demand.

If instead you wanted to learn about the inventory given the observed rate of sales, then this model isn’t powerful enough, because over a single time interval you can’t distinguish the two cases.