I’m trying to run criticism of a simple Bayesian linear regression model.

When calling sess.run(Y_post.mean(), feed_dict={X: X_test}), the result varies each time it is run. I suspect this has something to do with the variational parameter distributions not being set to return their means. Is this correct, and if one wanted exactly the same predictive mean each time, would you have to replace the variational distributions with their mean values?

i.e. when calling

mean = sess.run(Y_post.mean(), feed_dict={X: X_test})

should Y_post be defined not as

Y_post = ed.copy(Y, {W: qW, … } )

but as

Y_post = ed.copy(Y, {W: qW.mean(), … } )

Is this the correct way to do this? If not, what is the recommended approach?

Fetching y_post draws new parameters because any random variables it depends on in the computational graph are redrawn. The same happens if you try to fetch x in the program theta = Beta(1.0, 1.0); x = Bernoulli(probs=theta, sample_shape=50).
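The redrawing behaviour can be mimicked outside of Edward with plain NumPy (a minimal sketch; the `fetch_x` helper is made up here to stand in for `sess.run(x)`):

```python
import numpy as np

rng = np.random.default_rng(0)

def fetch_x():
    # Each "fetch" redraws theta from its prior, then x given theta,
    # mirroring how sess.run(x) re-samples every ancestor in the graph.
    theta = rng.beta(1.0, 1.0)
    return rng.binomial(1, theta, size=50)

# Two fetches generally disagree because theta was redrawn in between.
x1, x2 = fetch_x(), fetch_x()
print(x1.mean(), x2.mean())
```

The same logic applies to y_post: every ancestor random variable, not just the output, is re-sampled per fetch.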

Consider what this means mathematically. Fetching y_post corresponds to a draw from the (approximate) posterior predictive, p(y_new | x) ≈ ∫ p(y_new | w, b) q(w) q(b) dw db: the latent variables are re-sampled on every fetch, and then the likelihood is sampled given them.

To clarify: is there any difference between running Y_post, running Y_post.mean(), and averaging the results of multiple Y_post.sample() calls, with Y_post.sample([num_samples]) being the most efficient way to obtain an approximate posterior mean?

np.mean([sess.run(y_post) for _ in range(50)]) fetches a posterior sample, then a likelihood sample; it repeats this 50 times and takes the mean.

sess.run(y_post.mean()) fetches a posterior sample, then takes the likelihood’s mean given the single posterior sample.

sess.run(y_post.sample([num_samples])) fetches a posterior sample, then draws num_samples samples from the likelihood given the single posterior sample.
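To make the three computations concrete, here is a plain NumPy sketch (not the Edward API; q_loc, q_scale, and lik_scale are made-up values for a scalar model with y | w ~ Normal(w, lik_scale) and q(w) = Normal(q_loc, q_scale)):

```python
import numpy as np

rng = np.random.default_rng(42)
q_loc, q_scale = 2.0, 0.5   # assumed variational posterior over scalar w
lik_scale = 1.0             # assumed likelihood noise

# (1) np.mean([sess.run(y_post) for _ in range(50)]):
#     redraw w from q AND y from the likelihood on every fetch.
draws = [rng.normal(rng.normal(q_loc, q_scale), lik_scale) for _ in range(50)]
est1 = np.mean(draws)

# (2) sess.run(y_post.mean()):
#     one draw of w from q, then the likelihood mean given that single w.
w = rng.normal(q_loc, q_scale)
est2 = w  # the mean of Normal(w, lik_scale) is w itself

# (3) sess.run(y_post.sample([num_samples])):
#     one draw of w, then many y samples from the likelihood given that w.
w = rng.normal(q_loc, q_scale)
est3 = rng.normal(w, lik_scale, size=1000).mean()

print(est1, est2, est3)
```

Only (1) averages over the posterior; (2) and (3) condition on a single draw of w, so their values move around with that draw no matter how many likelihood samples are taken.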

Sorry to reawaken this old thread, but I’ve been asking this question for a project I’m working on, too. In the case of variational inference in a linear model consisting strictly of independent, normally distributed random variables, don’t all random variables converge in probability to their posterior means when sampling from y_post? If so, shouldn’t it be correct to evaluate the model by using ed.copy() to replace all random variables (including y) with their posterior means and then generating a single prediction? This would only be for computing a point estimate of the model’s error/likelihood; other kinds of model criticism would still need the full posterior predictive.

The implementation would look like this:

# From the supervised learning tutorial
import edward as ed
import tensorflow as tf
from edward.models import Normal

# X_train, y_train, N, D are assumed to be defined as in the tutorial.
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))

qw = Normal(loc=tf.get_variable("qw/loc", [D]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [D])))
qb = Normal(loc=tf.get_variable("qb/loc", [1]),
            scale=tf.nn.softplus(tf.get_variable("qb/scale", [1])))

inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_samples=5, n_iter=250)

# Posterior predictive.
y_post = ed.copy(y, {w: qw, b: qb})

# Current proposal, used only for point estimation of error/likelihood:
# swap in the variational means and take the likelihood mean.
y_MAP = ed.copy(y.mean(), {w: qw.mean(), b: qb.mean()}, scope='MAP')
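The point-estimate evaluation this aims at can be sketched in plain NumPy (illustrative values only; qw_loc and qb_loc stand in for the fitted qw.mean() and qb.mean()):

```python
import numpy as np

rng = np.random.default_rng(1)
N, D = 40, 3

# Toy test data from a known linear model.
w_true, b_true = np.array([1.0, -2.0, 0.5]), 0.3
X_test = rng.normal(size=(N, D))
y_test = X_test @ w_true + b_true + rng.normal(scale=1.0, size=N)

# Pretend these are the fitted variational means (qw.mean(), qb.mean()).
qw_loc = np.array([0.9, -1.9, 0.6])
qb_loc = 0.25

# Point prediction: plug the posterior means into the likelihood mean,
# analogous to ed.copy(y.mean(), {w: qw.mean(), b: qb.mean()}).
y_map = X_test @ qw_loc + qb_loc
mse = np.mean((y_test - y_map) ** 2)
print("point-estimate MSE:", mse)
```

Because the likelihood mean is linear in w and b here, this plug-in prediction coincides with the mean of the posterior predictive, which is why a single deterministic evaluation can be reasonable for a point estimate of error.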