Any tutorials on MCMC?


#1

I am new to Edward, and I find that most of the available tutorials cover variational inference. Are there any tutorials on inference by MCMC? In particular, I am hoping for a simple Bayesian linear regression example.

Below is my code. There seems to be a dimension mismatch problem, and I am rather confused about it.

from edward.models import Normal,Empirical
import numpy as np
import edward as ed
import tensorflow as tf

def build_toy_dataset(N, w, noise_std=0.1):
  D = len(w)
  x = np.random.randn(N, D)
  y = np.dot(x, w) + np.random.normal(0, noise_std, size=N)
  return x, y

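import time
time_start = time.time()  # time.clock() was removed in Python 3.8

N = 40  # number of data points
D = 10  # number of features
n_chain = 1000  # number of samples stored in each Empirical approximation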
w_true = np.random.randn(D)
X_train, y_train = build_toy_dataset(N, w_true)
X_test, y_test = build_toy_dataset(N, w_true)

X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))

qw = Empirical(params=tf.Variable(tf.zeros([n_chain, D])))
qb = Empirical(params=tf.Variable(tf.zeros([n_chain, 1])))

inference = ed.Gibbs({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run()

#2

Take a look at the examples directory in the GitHub repository, for example the Rasch model using Hamiltonian Monte Carlo.

Running your code, I see the error

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/gibbs.py", line 40, in __init__
    for z in six.iterkeys(latent_vars)}
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/gibbs.py", line 40, in <dictcomp>
    for z in six.iterkeys(latent_vars)}
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/conjugacy/conjugacy.py", line 132, in complete_conditional
    'sufficient statistics.' % str(dist_key))
NotImplementedError: Conditional distribution has sufficient statistics (('#Add', ('#CPow2.0000e+00', (<tf.Tensor 'Reshape:0' shape=(40,) dtype=float32>, (<tf.Tensor 'MatMul:0' shape=(40, 1) dtype=float32>, (<tf.Tensor 'Placeholder:0' shape=(40, 10) dtype=float32>,), (<tf.Tensor 'ExpandDims:0' shape=(10, 1) dtype=float32>, ('#x',), (<tf.Tensor 'ExpandDims/dim:0' shape=() dtype=int32>,))), (<tf.Tensor 'Reshape/shape:0' shape=(1,) dtype=int32>,))), ('#Mul', (2.0,), (<tf.Tensor 'Reshape:0' shape=(40,) dtype=float32>, (<tf.Tensor 'MatMul:0' shape=(40, 1) dtype=float32>, (<tf.Tensor 'Placeholder:0' shape=(40, 10) dtype=float32>,), (<tf.Tensor 'ExpandDims:0' shape=(10, 1) dtype=float32>, ('#x',), (<tf.Tensor 'ExpandDims/dim:0' shape=() dtype=int32>,))), (<tf.Tensor 'Reshape/shape:0' shape=(1,) dtype=int32>,)), (<tf.Tensor 'Normal_4/sample/Reshape:0' shape=(1,) dtype=float32>,))), ('#CPow2.0000e+00', ('#x',)), ('#x',), (<tf.Tensor 'MatMul:0' shape=(40, 1) dtype=float32>, (<tf.Tensor 'Placeholder:0' shape=(40, 10) dtype=float32>,), (<tf.Tensor 'ExpandDims:0' shape=(10, 1) dtype=float32>, ('#x',), (<tf.Tensor 'ExpandDims/dim:0' shape=() dtype=int32>,)))), but no available exponential-family distribution has those sufficient statistics.

This error says that Gibbs sampling doesn’t know how to calculate the complete conditional posterior distributions for sampling. If you strongly prefer Gibbs over other algorithms, you can work out the conditional distributions yourself and pass them into Gibbs (see the API reference page).
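
To give a sense of what "working out the conditional distributions yourself" involves, here is a minimal sketch in plain Python. It does not use Edward's API at all. It assumes a simplified version of the model above with a single feature: w ~ N(0, 1), b ~ N(0, 1), and y_i ~ N(w * x_i + b, 1). The function names, the toy data helper, and the true values 2.0 and -1.0 are illustrative assumptions, not something from the original code.

```python
import random


def make_toy_data(n, w_true, b_true, noise_std=0.1, seed=0):
    """Hypothetical helper: draw n points from y = w_true * x + b_true + noise."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n)]
    ys = [w_true * x + b_true + rng.gauss(0.0, noise_std) for x in xs]
    return xs, ys


def gibbs_linear_regression(xs, ys, n_samples=2000, seed=1):
    """Gibbs sampler for w ~ N(0,1), b ~ N(0,1), y_i ~ N(w*x_i + b, 1)."""
    rng = random.Random(seed)
    n = len(xs)
    sum_xx = sum(x * x for x in xs)
    w, b = 0.0, 0.0
    samples = []
    for _ in range(n_samples):
        # w | b, y is Normal with precision 1 + sum(x_i^2)
        # and mean sum(x_i * (y_i - b)) / precision.
        prec_w = 1.0 + sum_xx
        mean_w = sum(x * (y - b) for x, y in zip(xs, ys)) / prec_w
        w = rng.gauss(mean_w, prec_w ** -0.5)
        # b | w, y is Normal with precision 1 + n
        # and mean sum(y_i - w * x_i) / precision.
        prec_b = 1.0 + n
        mean_b = sum(y - w * x for x, y in zip(xs, ys)) / prec_b
        b = rng.gauss(mean_b, prec_b ** -0.5)
        samples.append((w, b))
    return samples


xs, ys = make_toy_data(n=40, w_true=2.0, b_true=-1.0)
draws = gibbs_linear_regression(xs, ys)[200:]  # drop the first 200 as burn-in
print("posterior mean of w:", sum(w for w, _ in draws) / len(draws))
print("posterior mean of b:", sum(b for _, b in draws) / len(draws))
```

Both conditionals are Normal here because the priors and the likelihood are Gaussian with a known noise scale. The vector-valued case with D features follows the same pattern, with a matrix inverse in place of the scalar precision.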


#3

Thank you, Dustin. I switched to the SGHMC algorithm, as in https://github.com/blei-lab/edward/blob/master/examples/bayesian_linear_regression_sghmc.py
It works now.


#4

Thanks, Dustin. One thing I learned from the examples is that the step_size parameter in the HMC example has a large effect on the final result.
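
For anyone curious why the step size matters so much, here is a toy sketch in plain Python. It is not Edward's HMC implementation. It assumes a one-dimensional standard normal target, and the step sizes 0.1 and 2.5, the 20 leapfrog steps, and the function name are all illustrative choices. For this particular target the leapfrog integrator becomes unstable once the step size exceeds 2, so nearly every proposal is rejected and the chain stops moving.

```python
import math
import random


def hmc_standard_normal(step_size, n_steps=20, n_samples=2000, seed=0):
    """Toy HMC targeting N(0, 1); returns (samples, acceptance_rate)."""
    rng = random.Random(seed)
    q = 0.0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        p = rng.gauss(0.0, 1.0)  # resample the momentum
        q_new, p_new = q, p
        # Leapfrog integration; the potential is U(q) = q^2 / 2, so dU/dq = q.
        p_new -= 0.5 * step_size * q_new
        for i in range(n_steps):
            q_new += step_size * p_new
            if i < n_steps - 1:
                p_new -= step_size * q_new
        p_new -= 0.5 * step_size * q_new
        # Metropolis correction based on the change in total energy.
        h_old = 0.5 * q * q + 0.5 * p * p
        h_new = 0.5 * q_new * q_new + 0.5 * p_new * p_new
        if rng.random() < math.exp(min(0.0, h_old - h_new)):
            q = q_new
            accepted += 1
        samples.append(q)
    return samples, accepted / n_samples


for eps in (0.1, 2.5):
    draws, rate = hmc_standard_normal(step_size=eps)
    mean = sum(draws) / len(draws)
    var = sum((d - mean) ** 2 for d in draws) / len(draws)
    print("step_size=%.1f  acceptance=%.3f  sample variance=%.3f" % (eps, rate, var))
```

With the small step size the sample variance should land close to the true value of 1. With the large one the chain barely leaves its starting point, which is the kind of failure a badly chosen step_size can cause in a real model too.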