Hi,
I’m trying to write a utility that can “freeze” a subset of latent variables during inference, to debug learning problems on data sampled from the model. The idea is to sample a given variable z, save it, pass it on to the rest of the model to sample the observed variable x, and then perform inference with z treated like another observed variable. The rationale: if inference works with a particular variable observed rather than latent, then the original problem stems from that variable.
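For concreteness, with the mixture model defined below, the inference call I’m aiming for would look roughly like this (a sketch only — qpi and qmu are placeholder variational families I haven’t written yet, and x_saved / z_saved are the arrays sampled from the model):

# Sketch of the end goal: the saved z enters the data dict as if it were observed.
# qpi, qmu are hypothetical variational distributions for the global variables;
# x_saved, z_saved are numpy arrays sampled from the model below.
inference = ed.KLqp({pi: qpi, mu: qmu}, data={x: x_saved, z: z_saved})
inference.run(n_iter=1000)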
In addition, I want to do inference on mini-batches. For example, in a Gaussian mixture model, I want to sample a batch of cluster assignments z, keeping the means, variances and mixing proportions constant, and sample a batch of observed variables x using my saved z, like so:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt
import edward as ed
from edward.models import Dirichlet, Multinomial

M = 100  # batch size
K = 3    # number of mixture components
D = 2    # data dimension
mean_precision_shape, mean_precision_rate, obs_precision_shape, obs_precision_rate = 4., 200., 6., 10.

sess = tf.Session()
with sess.as_default():
    # p model
    alpha = 1
    pi = Dirichlet(np.atleast_1d(alpha * np.ones(K)).astype(np.float32))
    z = Multinomial(total_count=1., probs=tf.reshape(tf.tile(pi, [M]), [M, K]))
    sigma2_mu_k = ed.models.InverseGamma([[mean_precision_shape]], [[mean_precision_rate]])
    sigma2_mu_d = ed.models.InverseGamma(mean_precision_shape * tf.ones([D]), mean_precision_rate * tf.ones([D]))
    sigma2_mu = tf.tile(sigma2_mu_k, [K, 1]) * sigma2_mu_d
    mu = ed.models.MultivariateNormalDiag(tf.zeros([K, D]), tf.sqrt(sigma2_mu))
    sigma2_obs_n = ed.models.InverseGamma([[obs_precision_shape]], [[obs_precision_rate]])
    sigma2_obs_d = ed.models.InverseGamma(obs_precision_shape * tf.ones([D]), obs_precision_rate * tf.ones([D]))
    sigma2_obs = tf.tile(sigma2_obs_n, [M, 1]) * sigma2_obs_d
    x = ed.models.MultivariateNormalDiag(tf.matmul(z, mu), tf.sqrt(sigma2_obs))

    init = tf.global_variables_initializer()
    init.run()

    # identify global and local latent variables
    latent_variables = x.get_ancestors()
    local_latent_variables = [lv for lv in latent_variables if lv.shape[0] == M]
    local_parent = [lv for lv in local_latent_variables[0].get_ancestors()]
    global_latent_variables = [lv for lv in latent_variables
                               if lv not in local_latent_variables and lv not in local_parent]

    # sample the global variables once
    true_global_latent_variables = [lv.sample().eval() for lv in global_latent_variables]
    # sample M local variables (cluster assignments) M times, with pi held fixed
    true_local_latent_variables = ed.copy(z, {local_parent[0]: local_parent[0].eval()}).sample((M,)).eval()

    X_sample = np.zeros((M, M, D))
    # dict mapping each latent variable to its saved sample
    latent_variables = dict(zip(global_latent_variables, true_global_latent_variables))
    for i in range(2):  # only two batches, for debugging
        latent_variables.update({z: true_local_latent_variables[i]})
        X_sample[i] = ed.copy(x, latent_variables).eval()  # this works only in the first iteration
        plt.figure()
        plt.scatter(*X_sample[i].T, color=true_local_latent_variables[i])
        plt.axis('equal')
From the output (scatter plots of the two sampled batches), it seems that ed.copy uses the z’s I put into the dict only in the first iteration; in the second iteration it ignores them and samples its own z’s, although it still uses the other variables I pass it.
- Does it make sense to build something like this?
- Is there a better way to do it?
- How do I get ed.copy to use my saved z’s in every iteration?
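For reference, one alternative I’ve been considering (untested, and I’m not sure it’s the intended way to reuse a copied graph) is to call ed.copy only once, swapping z for a placeholder, and then feed my saved z’s at run time, roughly like this (inside the same with sess.as_default(): block):

# Sketch of a placeholder-based alternative (an assumption on my part, not verified):
# copy the graph once, with a placeholder standing in for z, and feed the saved
# assignments batch by batch.
z_ph = tf.placeholder(tf.float32, [M, K])
swap = dict(zip(global_latent_variables, true_global_latent_variables))
swap[z] = z_ph
x_frozen = ed.copy(x, swap)  # single copy of the graph
for i in range(2):
    X_sample[i] = x_frozen.eval(feed_dict={z_ph: true_local_latent_variables[i]})

Would something like this be preferable to calling ed.copy inside the loop?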