Gaussian Mixture Prior VAE


#1

Hello,

I am trying to implement the model from "Deep Unsupervised Clustering with Gaussian Mixture VAEs" in Edward so I can add it to the library. I found Rui Shu's blog post, which raises a nice point, so I ported the "true GMM" model straight into Edward. The original code is here: https://github.com/RuiShu/vae-clustering/blob/master/gmvae.py

# MODEL
y = Categorical(logits=tf.ones([M, K]))  # uniform prior over K clusters
mu_z, sigma_z = generative_network_z(tf.expand_dims(tf.cast(y, tf.float32), 1))
z = Normal(loc=mu_z, scale=sigma_z)  # cluster-conditional Gaussian prior on z
location, scale = generative_network_x(z)
x = Normal(loc=location, scale=scale)  # likelihood p(x | z)

# VARIATIONAL MODEL
x_ph = tf.placeholder(tf.float32, [M, n_input])
qlogits_y, qlocation_z, qscale_z = inference_network(x_ph)
var_z = Normal(loc=qlocation_z, scale=qscale_z)  # q(z | x)
var_y = Categorical(logits=qlogits_y)            # q(y | x)

# INFERENCE
inference = ed.KLqp({z: var_z, y: var_y}, data={x: x_ph})
optimizer = tf.train.RMSPropOptimizer(0.001, epsilon=1.0)
inference.initialize(optimizer=optimizer)
tf.global_variables_initializer().run()

Full code can be found here https://drive.google.com/file/d/0ByfNvrhwPOkFXzN2UWN0OGhzM0E/view?usp=sharing

I have the vanilla autoencoder working, and this code seems to work in the sense that, when I visualize the latent space and the logits, it does something reasonable on simulated datasets. However, the learning is unstable and I sometimes have the following error:

InvalidArgumentError: Received a label value of 2 which is outside the valid range of [0, 2).

How can the output of a Categorical over [0, 2) be 2? Is there some masked NaN? Moreover, the training loss does not stabilize. I'd like to add a nice notebook on this to the library once it's working.

Thanks!
Romain


#2

I just re-read the papers; it seems I would have to write a new inference class if I want a loss that looks like equation (7) of Kingma et al. 2014, "Semi-supervised learning with deep…". I'll post when I have something.

Romain


#3

However, the learning is unstable and I sometimes have the following error:

I’m not surprised this happens when the Categorical’s dimension is large. Namely, the variance of the stochastic gradient updates increases with the dimension. As you note, one way to handle this (and which most VAE papers do) is to marginalize out the Categorical prior.
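The variance growth can be illustrated outside Edward. Here is a minimal NumPy sketch of one-sample score-function (REINFORCE) gradient estimates — a toy illustration with f(y) = 1{y == 0} under a uniform Categorical, not Edward's actual estimator — showing how the signal-to-noise ratio of the estimates degrades as the number of categories K grows:

```python
import numpy as np

rng = np.random.default_rng(0)

def grad_snr(K, n=200_000):
    """Signal-to-noise ratio of one-sample score-function estimates of
    d/d logit_0 E_{y ~ Categorical(uniform over K)}[1{y == 0}]."""
    p0 = 1.0 / K
    y = rng.integers(0, K, size=n)
    f = (y == 0).astype(float)     # toy objective: indicator of class 0
    score = (y == 0) - p0          # d log p(y) / d logit_0 under softmax
    g = f * score                  # per-sample gradient estimates
    return g.mean() / g.std()

print(grad_snr(2), grad_snr(50))   # SNR shrinks as K grows
```

With K = 2 the signal-to-noise ratio is about 1; with K = 50 it drops to roughly 0.14, so many more samples are needed per useful gradient step.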

How can the output of a Categorical over [0, 2) be 2? Is there some masked NaN? Moreover, the training loss does not stabilize. I'd like to add a nice notebook on this to the library once it's working.

If the Categorical's parameters destabilize, the output can be strange:

import edward as ed
from edward.models import Categorical

x = Categorical(probs=[0.0, 0.0])  # not technically valid: probs sum to 0
sess = ed.get_session()
sess.run(x)
## 2
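To see mechanically how degenerate parameters can produce an out-of-range value, here is a plain NumPy sketch (not Edward's actual sampler, but inverse-CDF sampling is a standard way to draw categorical samples): when the probabilities sum to zero, the cumulative distribution never reaches the uniform draw, and the search falls off the end of the support.

```python
import numpy as np

probs = np.array([0.0, 0.0])       # degenerate: mass sums to 0
cdf = np.cumsum(probs)             # [0.0, 0.0] -- never reaches 1
u = 0.5                            # a uniform draw in (0, 1)
sample = np.searchsorted(cdf, u)   # first index with cdf >= u: none exists,
print(sample)                      # so it returns len(probs) = 2, outside [0, 2)
```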

I’ll post when I have something

Do keep us updated!


#4

Thanks for your answer!

As you note, one way to handle this (and which most VAE papers do) is to marginalize out the Categorical prior.

Yeah, that is true, but I am not sure where to start modifying Edward. The result of the marginalization seems very simple (integrate the usual fully-observed ELBO against the variational posterior of the categorical, then add the entropy term), but I have trouble seeing, architecturally speaking, where I should start.
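For reference, the marginalized bound described above is L(x) = Σ_k q(y = k | x) · L_k(x) + H(q(y | x)), where L_k is the usual ELBO with y fixed to cluster k. A minimal NumPy sketch of the arithmetic, with hypothetical per-component ELBO values (the hard part in practice is computing each L_k, which this sketch takes as given):

```python
import numpy as np

def marginalized_elbo(elbo_per_component, q_y):
    """L(x) = sum_k q(y=k|x) * L_k(x) + H(q(y|x)): the per-component
    ELBOs averaged under the categorical posterior, plus its entropy."""
    q_y = np.asarray(q_y, dtype=float)
    entropy = -np.sum(q_y * np.log(np.clip(q_y, 1e-12, 1.0)))
    return float(np.dot(q_y, elbo_per_component) + entropy)

# hypothetical per-component ELBOs for K = 3 clusters
print(marginalized_elbo([-10.0, -12.0, -11.0], [0.5, 0.25, 0.25]))
```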

I am working in pure TF for now to get something working well. Let me know if you have a starting point in mind.
Romain


#5

Hi Romain, any updates on this?