Conditioning a random variable on a value determined at run time

Hi all,

It’s my first time using Edward, so this is possibly a very basic question.
Essentially: how do I condition a random variable on a value that is only determined at run time?

I’m trying to build the following model for MCMC inference using a Gibbs sampler (haven’t tackled the inference part yet):

  • T ~ Multinomial(p, N) where p is K dimensional
  • For k \in [1…K], theta[k] ~ Dirichlet(q_k) where q_k are M dimensional
  • Z is a vector of length N built from a sample t of T: component index k is repeated t[k] times, in increasing order of k. For example, if K=3 and the count vector is t = [1, 3, 5] (so N=9), then z = [0,1,1,1,2,2,2,2,2].
  • W of size NxM where W[n] ~ Multinomial(theta[z[n]], C)
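To make the model above concrete, here is a minimal NumPy-only sketch of one forward draw from it. It is not Edward code and does no inference. The seed, the value of p and the shared concentration 5.0 for every q_k are assumptions taken from the test code in this post, and np.repeat is just one convenient way to build z.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for reproducibility

N, C, K, M = 10, 100, 3, 50
p = np.array([0.2, 0.3, 0.5])   # K-dimensional mixing proportions (assumed values)
q = np.full((K, M), 5.0)        # q_k: one M-dimensional concentration per component

# T ~ Multinomial(p, N): how many of the N items fall in each component.
t = rng.multinomial(N, p)

# theta[k] ~ Dirichlet(q_k), stacked into a (K, M) array.
theta = np.stack([rng.dirichlet(q[k]) for k in range(K)])

# z: component index k repeated t[k] times, in increasing order of k.
z = np.repeat(np.arange(K), t)

# W[n] ~ Multinomial(theta[z[n]], C), stacked into an (N, M) array.
w = np.stack([rng.multinomial(C, theta[z[n]]) for n in range(N)])

# The example from the list above: t = [1, 3, 5] gives z = [0,1,1,1,2,2,2,2,2].
z_example = np.repeat(np.arange(3), [1, 3, 5])
```

Each row of w sums to C, and z is sorted by construction, which is the ordering the third bullet describes.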

I thought I could define z as a variable, compute it at run time from a given sample of T, and then feed it into the sampler of W. Unfortunately, I couldn’t get this to work with things along the lines of:

 w[i] = models.Multinomial(probs=tf.gather(theta, z[i]), total_count=np.float64(C))

I also tried modelling T as Categorical random variables instead, but since I would still need to sort the draws, this didn’t help.

The following test code works, but I’m not sure it does what I’m aiming for:

import numpy as np
import tensorflow as tf
from edward import models

N = 10
C = 100
K = 3
M = 50

t = models.Multinomial(probs=np.array([0.2, 0.3, 0.5]), total_count=np.float64(N), name='t')
z = tf.get_variable('z', shape=(N, ), dtype=np.int32)

theta_prior = np.array([5.]*M)
theta = [[0]*M] * K
for k in np.arange(K):
    theta[k] = models.Dirichlet(theta_prior, name='theta')
theta = np.array(theta)

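w = [[0]*M] * N
with tf.Session() as sess:
    # Draw one sample of the counts t (length K, sums to N).
    t_samples = sess.run(t.sample(1))[0]
    # Expand the counts into sorted labels: position n gets the index of the bin it falls in.
    feed_z = np.digitize(np.arange(N), np.cumsum(t_samples))
    assign_op = z.assign(feed_z)
    sess.run(assign_op)
    for i in np.arange(N):
        # theta is indexed with the NumPy value feed_z[i], not with the TF variable z.
        w[i] = models.Multinomial(probs=theta[feed_z[i]], total_count=np.float64(C), name=''.join(['w_', str(i)]))
        w_samples = sess.run(w[i].sample(1))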

Does it even make sense to define w[i] in the context of a running session?

Your help would be greatly appreciated!

Hi @inbalii, looking at the code, it would make sense to define the random variables with two dimensions. For example, theta = Dirichlet(tf.ones([K, M]) * 5.0) defines K Dirichlet distributions of dimension M in a single random variable, which essentially means that your batch size is K.
Also, although it is possible to perform inference by sampling and then feeding the data into the next inference step, I think in your example you could define everything in the computation graph (using tf and Edward constructs), with placeholders where necessary, and then do the inference. Hope this helps.
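As a rough illustration of that suggestion, the sketch below uses NumPy only, not Edward or TensorFlow. It shows the two pieces that make an in-graph version possible: a single (K, M) array for theta, and z computed from t with nothing but a cumulative sum, a comparison and a reduction. The variable names and the seed are assumptions for illustration. In TensorFlow the corresponding ops would presumably be tf.cumsum, tf.range, tf.reduce_sum and tf.gather, but that translation has not been tested here.

```python
import numpy as np

rng = np.random.default_rng(1)  # fixed seed, chosen only for reproducibility

N, C, K, M = 10, 100, 3, 50

# All K Dirichlet draws in one (K, M) array, the NumPy analogue of batch size K.
theta = rng.dirichlet(np.full(M, 5.0), size=K)

# Counts per component, summing to N.
t = rng.multinomial(N, [0.2, 0.3, 0.5])

# z[n] = number of cumulative-count boundaries that position n has reached.
# This needs only cumsum, a broadcasted comparison and a sum over one axis.
boundaries = np.cumsum(t)                                        # shape (K,)
z = (np.arange(N)[:, None] >= boundaries[None, :]).sum(axis=1)   # shape (N,)

# Gather one row of theta per item: row n of probs is theta[z[n]].
probs = theta[z]                                                 # shape (N, M)
```

The comparison-based z matches both np.digitize(np.arange(N), np.cumsum(t)) from the question and np.repeat(np.arange(K), t). Because it avoids a Python loop over a sampled value, it is the kind of expression that can live inside a computation graph.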