Saving ancestor variables in ancestral sampling

deoxyribose · October 12, 2017, 12:22pm

Hi,

I’m trying to write a utility that lets one “freeze” a subset of latent variables during inference, to debug problems when learning from data sampled from a model. The idea is to sample a given variable z, save it, pass it on to the rest of the model to sample the observed variable x, and then perform inference with z treated as another observed variable. The rationale is that if inference works with a particular variable observed rather than latent, then the original problem stems from that variable.
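
To make the idea concrete, here is a minimal, framework-free sketch of ancestral sampling in which the sampled z is saved and then reused as if it were observed. It uses only the Python standard library rather than Edward, and the toy model (a categorical z that picks one of K fixed means for a 1-D Gaussian x) as well as the names sample_joint, pi, mu and mu_hat are illustrative assumptions, not part of the model below.

```python
import random

def sample_joint(pi, mu, sigma, n, seed=0):
    """Ancestral sampling: draw the parent z first, then x given the saved z."""
    rng = random.Random(seed)
    ks = range(len(pi))
    # Parent first: one cluster assignment per data point.
    z = [rng.choices(ks, weights=pi)[0] for _ in range(n)]
    # Child second: x is drawn conditional on the z values we just saved.
    x = [rng.gauss(mu[k], sigma) for k in z]
    return z, x

pi = [0.2, 0.5, 0.3]    # assumed mixing proportions
mu = [-5.0, 0.0, 5.0]   # assumed cluster means
z_saved, x_obs = sample_joint(pi, mu, sigma=0.5, n=2000)

# With z "frozen" (treated as observed), estimating the means reduces to
# averaging the x values assigned to each cluster.
mu_hat = []
for k in range(len(pi)):
    members = [xi for zi, xi in zip(z_saved, x_obs) if zi == k]
    mu_hat.append(sum(members) / len(members))
```

If mu_hat recovers mu with z frozen, but inference fails once z is latent again, that points at z as the source of the problem.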

In addition, I want to do inference on mini-batches. For example, in a Gaussian mixture model, I want to sample a batch of cluster assignments z, keeping the means, variances and mixing proportions constant, and sample a batch of observed variables x using my saved z, like so:

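import numpy as np
import tensorflow as tf
import edward as ed
import matplotlib.pyplot as plt
from edward.models import Dirichlet, Multinomial

M = 100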
K = 3
D = 2

mean_precision_shape,mean_precision_rate,obs_precision_shape,obs_precision_rate = 4.,200.,6.,10.

sess = tf.Session()
with sess.as_default():
    
    # p model
    alpha = 1
    pi = Dirichlet(np.atleast_1d(alpha*np.ones(K)).astype(np.float32))
    z = Multinomial(total_count=1.,probs=tf.reshape(tf.tile(pi,[M]),[M,K]))
    sigma2_mu_k = ed.models.InverseGamma([[mean_precision_shape]],[[mean_precision_rate]])
    sigma2_mu_d = ed.models.InverseGamma(mean_precision_shape*tf.ones([D]),mean_precision_rate*tf.ones([D]))
    sigma2_mu = tf.tile(sigma2_mu_k, [K,1])*sigma2_mu_d
    mu = ed.models.MultivariateNormalDiag(tf.zeros([K,D]), tf.sqrt(sigma2_mu))
    sigma2_obs_n = ed.models.InverseGamma([[obs_precision_shape]],[[obs_precision_rate]])
    sigma2_obs_d = ed.models.InverseGamma(obs_precision_shape*tf.ones([D]),obs_precision_rate*tf.ones([D]))
    sigma2_obs = tf.tile(sigma2_obs_n, [M,1])*sigma2_obs_d
    x = ed.models.MultivariateNormalDiag(tf.matmul(z, mu), tf.sqrt(sigma2_obs))
    
    init = tf.global_variables_initializer()
    init.run()
    
    # identify global and local latent variables
    latent_variables = x.get_ancestors()
    local_latent_variables = [lv for lv in latent_variables if lv.shape[0] == M]
    local_parent = [lv for lv in local_latent_variables[0].get_ancestors()]
    global_latent_variables = [lv for lv in latent_variables if lv not in local_latent_variables and lv not in local_parent]
        
    # sample global variables
    true_global_latent_variables = [lv.sample().eval() for lv in global_latent_variables]
    # sample M local variables M times
    true_local_latent_variables = ed.copy(z,{local_parent[0]:local_parent[0].eval()}).sample((M,)).eval()

    X_sample = np.zeros((M,M,D))
    latent_variables = dict(zip(global_latent_variables, true_global_latent_variables))
    for i in range(2):
        latent_variables.update({z:true_local_latent_variables[i]})
        X_sample[i] = ed.copy(x,latent_variables).eval() # this works only in the first iteration
        #tf.reshape(,(-1,K))
        plt.figure()
        plt.scatter(*X_sample[i].T,color=true_local_latent_variables[i])
        plt.axis('equal');

The output is

[Two scatter plots of X_sample, one per loop iteration; original images: index, index2]

It seems that ed.copy uses the z’s I’ve put in my dict only in the first iteration. In the second iteration it ignores them and samples its own z’s, but still uses the other variables I pass it.

Does it make sense to build something like this?
Is there a better way to do it?
How do I get ed.copy to use my saved z’s in every iteration?

deoxyribose · October 16, 2017, 11:24am

Found the solution here: https://github.com/blei-lab/edward/issues/427 and here: Basics of Graphs / Flow Control
Using .value() one can sample from the joint in a single sess.run call and save the values that the children condition on:

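import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Dirichlet, Multinomial

M = 10000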
K = 3
D = 2

mean_precision_shape,mean_precision_rate,obs_precision_shape,obs_precision_rate = 4.,200.,6.,10.

sess = tf.Session()

# p model
alpha = 1
pi = Dirichlet(np.atleast_1d(alpha*np.ones(K)).astype(np.float32))
z = Multinomial(total_count=1.,probs=tf.reshape(tf.tile(pi,[M]),[M,K]))
sigma2_mu_k = ed.models.InverseGamma([[mean_precision_shape]],[[mean_precision_rate]])
sigma2_mu_d = ed.models.InverseGamma(mean_precision_shape*tf.ones([D]),mean_precision_rate*tf.ones([D]))
sigma2_mu = tf.tile(sigma2_mu_k, [K,1])*sigma2_mu_d
mu = ed.models.MultivariateNormalDiag(tf.zeros([K,D]), tf.sqrt(sigma2_mu))
sigma2_obs_n = ed.models.InverseGamma([[obs_precision_shape]],[[obs_precision_rate]])
sigma2_obs_d = ed.models.InverseGamma(obs_precision_shape*tf.ones([D]),obs_precision_rate*tf.ones([D]))
sigma2_obs = tf.tile(sigma2_obs_n, [M,1])*sigma2_obs_d
x = ed.models.MultivariateNormalDiag(tf.matmul(z, mu), tf.sqrt(sigma2_obs))

latent_variables = x.get_ancestors()
model = [x, *latent_variables]
model_sample = dict(zip(model,sess.run([v.value() for v in model])))

The joint sample can then be divided into mini-batches afterwards.
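
One way to do that split, sketched with plain Python lists standing in for the NumPy arrays in model_sample: variables whose leading dimension equals M are treated as local and sliced per batch, and everything else is treated as global and passed through unchanged. The helper name minibatches, the string keys and the toy values are assumptions for illustration, and the leading-dimension test is the same heuristic as lv.shape[0] == M in the first post. In the snippet above the dict keys are Edward random variables, not strings.

```python
def minibatches(sample, m, batch_size):
    """Yield dicts in which local variables (length m) are sliced per batch
    and global variables are passed through unchanged."""
    for start in range(0, m, batch_size):
        stop = min(start + batch_size, m)
        yield {
            name: value[start:stop] if len(value) == m else value
            for name, value in sample.items()
        }

# Hypothetical joint sample with M = 6 data points and K = 2 clusters.
M = 6
sample = {
    "mu": [[0.0, 0.0], [3.0, 3.0]],              # global, shape (K, D)
    "z": [0, 1, 1, 0, 1, 0],                     # local, length M
    "x": [[0.1, -0.2], [2.9, 3.1], [3.2, 2.8],
          [-0.1, 0.0], [3.0, 3.3], [0.2, 0.1]],  # local, shape (M, D)
}

batches = list(minibatches(sample, M, batch_size=4))
```

The same slicing works on NumPy arrays as long as every value has at least one dimension, since len() is not defined for 0-d arrays. The heuristic would also misclassify a global variable whose leading dimension happens to equal M.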
