How to Make a Model’s Sample Have the Same Shape as the Model?


#1

I think I must be missing something very obvious about how sampling works in Edward and TensorFlow. Can someone point out what I’m doing wrong?

I want to construct a model of a 1-d Gaussian, sample that model N times, and then do an MLE fit of the model to that sample. Since I’m going to do a fit, my Edward model needs to be built with the parameters as tf.Variable(). However, I don’t see any way to get the sample and the model to have the same shape.

I want to be able to pass in an Edward model for sampling, so @willtownes’s solution and the tutorial examples, which build the sample from a numpy distribution and then build an Edward model to fit to that sample, won’t work here. I expect users to pass in more complex composed models, and I don’t want to make them build the model in Edward and then build it again in numpy.

Below is an example of my problem. Can anyone point out to me what I’m missing? I apologize as I realize this is not really an Edward problem, but rather me just not understanding an aspect of TensorFlow well enough, given that ed.models.Normal inherits largely from tf.contrib.distributions.Normal.

import numpy as np
import tensorflow as tf
import edward as ed
# specific modules
from edward.models import Normal

def sample_model(model, n_samples):
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        samples = sess.run(model.sample([n_samples]))
    return samples

# want to perform a fit, so need to use variables
mean = tf.Variable(3.0)
std = tf.Variable(1.0)
N = 100

x = Normal(loc=mean, scale=std)
samples = sample_model(x, N)

print("\nx is a {0} with shape {1}".format(type(x), x.get_shape()))
print("samples is a {0} with shape {1}".format(type(samples), samples.shape))

# fails as x and samples don't have the same shape
#mle = ed.MAP({}, data={x: samples})

# Alternative
x = Normal(loc=mean*tf.ones(N), scale=std*tf.ones(N))
samples = sample_model(x, N)

print("\nx is a {0} with shape {1}".format(type(x), x.get_shape()))
print("samples is a {0} with shape {1}".format(type(samples), samples.shape))
print("samples[0] is a {0} with shape {1}".format(type(samples[0]), samples[0].shape))

# fails as x and samples don't have the same shape
#mle = ed.MAP({}, data={x: samples})
# works but is hugely inefficient, as only using 1 row of a N x N tensor
mle = ed.MAP({}, data={x: samples[0]})
mle.run()

sess = ed.get_session()
print(sess.run(mean))

# As ed.models.Normal inherits from tf.contrib.distributions.Normal the results are the same with pure TF
x = tf.contrib.distributions.Normal(loc=mean*tf.ones(N), scale=std*tf.ones(N))
samples = sample_model(x, N)

print("\nx is a {0} with shape {1}".format(type(x), x.event_shape))
print("samples is a {0} with shape {1}".format(type(samples), samples.shape))

Most likely unimportant, but for good habit:

$ pip freeze | egrep 'tensorflow|edward' 
edward==1.3.3
tensorflow==1.2.0

#2

There are two approaches to understanding the model’s generative process:

  • Define a population distribution which is sampled N times, one for each data point. To do this:
x = Normal(loc=mean, scale=std, sample_shape=N)

Each call to sess.run(x) generates N samples from x. This corresponds to the generative process of N data points.

  • Define a batch of N random variables each of which is sampled once. To do this:
x = Normal(loc=tf.ones(N) * mean, scale=tf.ones(N) * std)

Each call to sess.run(x) generates N samples, one from each random variable. This also corresponds to the generative process of N data points.

As they match the data’s generative process, either can be plugged into Edward’s inference.
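The shape bookkeeping behind both options can be sketched in plain Python. This is only an illustration, assuming the usual TensorFlow distributions convention that a draw has shape sample_shape + batch_shape + event_shape; `rv_shape` is a hypothetical helper written for this post, not part of Edward or TensorFlow.

```python
def rv_shape(sample_shape=(), batch_shape=(), event_shape=()):
    """Shape of a draw under the assumed convention:
    sample_shape + batch_shape + event_shape."""
    return tuple(sample_shape) + tuple(batch_shape) + tuple(event_shape)

N = 100

# Scalar parameters, no sample_shape: the random variable itself is a scalar...
scalar_rv = rv_shape()
# ...but model.sample([N]) prepends N, so the data has shape (N,): a mismatch.
scalar_rv_samples = rv_shape(sample_shape=(N,))

# First option in the reply: sample_shape=N on scalar parameters.
population_rv = rv_shape(sample_shape=(N,))

# Second option in the reply: parameters broadcast to length N (batch_shape=(N,)).
batch_rv = rv_shape(batch_shape=(N,))
# Calling model.sample([N]) on that batch prepends another N: the N x N array
# from the original question.
batch_rv_samples = rv_shape(sample_shape=(N,), batch_shape=(N,))

print(scalar_rv, scalar_rv_samples)
print(population_rv, batch_rv, batch_rv_samples)
```

Under that assumption, both options give the random variable shape (N,), so a single sess.run(x) already has the shape the data argument needs; an extra model.sample([N]) is what adds the unwanted leading dimension.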


#4

Thank you very much for your reply, which clarifies things a lot — sorry, I should have read help(ed.models.Normal) more closely.

This now (obviously) allows me to get the behavior I wanted:

def sample_model(model_template, n_samples):
    model = model_template.copy(sample_shape=n_samples)
    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        samples = sess.run(model)
    return model, samples

#edward model: univariate Normal
mean = tf.Variable(3.0)
std = tf.Variable(1.0)
N = 1000

model_template = Normal(loc=mean, scale=std)

x, samples = sample_model(model_template, N)
fit = ed.MAP({}, data={x: samples})
fit.run()
sess = ed.get_session()
print(sess.run(mean))
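As a rough sanity check on the fitted value, the Gaussian MLE has a closed form: the sample mean and the 1/n standard deviation. The sketch below uses only the Python standard library and its own synthetic data (the seed and variable names are arbitrary choices for illustration, not taken from the Edward run), so it shows what a converged fit should land near rather than reproducing the exact numbers above.

```python
import random
import statistics

random.seed(0)  # arbitrary seed, only for repeatability

true_mean, true_std, n = 3.0, 1.0, 1000
samples = [random.gauss(true_mean, true_std) for _ in range(n)]

# Closed-form Gaussian MLE: sample mean and the 1/n (population) standard deviation.
mean_hat = statistics.mean(samples)
std_hat = statistics.pstdev(samples, mu=mean_hat)

print(mean_hat, std_hat)
```

If the optimizer behind ed.MAP has converged, the fitted mean should be close to the mean of the samples it was given, which in turn should be near 3.0 for N = 1000.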