Hello,
I am trying to implement this model in Edward so I can add it to the library (Deep Unsupervised Clustering with Gaussian Mixture VAEs). I came across Rui Shu's blog, which raises a nice point, so I ported the "True GMM" model directly into Edward. The original code is here: https://github.com/RuiShu/vae-clustering/blob/master/gmvae.py
# MODEL
y = Categorical(logits=tf.ones([M, K]))
mu_z, sigma_z = generative_network_z(tf.expand_dims(tf.cast(y, tf.float32), 1))
z = Normal(loc=mu_z, scale=sigma_z)
location, scale = generative_network_x(z)
x = Normal(loc=location, scale=scale)
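For context, `generative_network_z` and `generative_network_x` are small networks; here is a numpy sketch of the shape contract I am assuming for them (the layer sizes and the softplus on the scales are my choices, not taken from the original code):

```python
import numpy as np

def softplus(a):
    # numerically stable softplus; keeps scale parameters strictly positive
    return np.logaddexp(0.0, a)

M, K, d, n_input = 4, 3, 2, 5
rng = np.random.default_rng(0)

def generative_network_z(y_float):
    # maps the cluster label as a float feature [M, 1]
    # to the mean and scale of z, both [M, d]
    W = rng.standard_normal((1, 2 * d))
    h = y_float @ W
    return h[:, :d], softplus(h[:, d:])

def generative_network_x(z):
    # maps z [M, d] to the mean and scale of x, both [M, n_input]
    W = rng.standard_normal((d, 2 * n_input))
    h = z @ W
    return h[:, :n_input], softplus(h[:, n_input:])

# ancestral sample through the model: y -> z -> x
y = rng.integers(0, K, size=M)
y_float = y.astype(np.float64)[:, None]      # mirrors tf.expand_dims(tf.cast(y, ...), 1)
mu_z, sigma_z = generative_network_z(y_float)
z = mu_z + sigma_z * rng.standard_normal((M, d))
loc, scale = generative_network_x(z)
x = loc + scale * rng.standard_normal((M, n_input))
```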
# INFERENCE NETWORK
x_ph = tf.placeholder(tf.float32, [M, n_input])
qlogits_y, qlocation_z, qscale_z = inference_network(x_ph)
var_z = Normal(loc=qlocation_z, scale=qscale_z)
var_y = Categorical(logits=qlogits_y)
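`inference_network` is the encoder; a numpy sketch of the shapes I expect it to return (the single hidden layer and its width are my assumptions):

```python
import numpy as np

M, K, d, n_input = 4, 3, 2, 5
rng = np.random.default_rng(1)

def inference_network(x):
    # single hidden layer; returns cluster logits [M, K] and
    # the variational mean/scale for z, both [M, d]
    W1 = rng.standard_normal((n_input, 16))
    h = np.tanh(x @ W1)
    W2 = rng.standard_normal((16, K + 2 * d))
    out = h @ W2
    logits_y = out[:, :K]
    loc_z = out[:, K:K + d]
    scale_z = np.logaddexp(0.0, out[:, K + d:])  # softplus keeps scale > 0
    return logits_y, loc_z, scale_z

x_batch = rng.standard_normal((M, n_input))
qlogits_y, qloc_z, qscale_z = inference_network(x_batch)
```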
# INFERENCE
inference = ed.KLqp({z: var_z, y: var_y}, data={x: x_ph})
optimizer = tf.train.RMSPropOptimizer(0.001, epsilon=1.0)
inference.initialize(optimizer=optimizer)
tf.global_variables_initializer().run()
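Since the training is unstable, I now watch the loss every step and bail out on the first NaN before it poisons the variational parameters. A sketch of the pattern (in Edward, `inference.update()` returns a dict with a 'loss' key; here it is stubbed so the snippet is self-contained):

```python
import numpy as np

def update():
    # stand-in for inference.update(feed_dict={x_ph: batch});
    # this stub returns a NaN loss after a few steps to show the pattern
    update.t += 1
    return {'loss': np.nan if update.t > 3 else 1.0 / update.t}
update.t = 0

losses = []
for t in range(1000):
    info = update()
    loss = info['loss']
    if np.isnan(loss):
        # stop before NaNs propagate into the variational parameters
        break
    losses.append(loss)
```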
Full code can be found here https://drive.google.com/file/d/0ByfNvrhwPOkFXzN2UWN0OGhzM0E/view?usp=sharing
I have the vanilla autoencoder working, and this code also seems to work, in the sense that when I visualize the latent space and the logits, the results look reasonable on simulated datasets. However, learning is unstable and I sometimes get the following error:
InvalidArgumentError: Received a label value of 2 which is outside the valid range of [0, 2).
How can the output of a Categorical over [0, 2) be 2? Is there a masked NaN somewhere? Moreover, the training loss does not stabilize. I'd like to contribute a nice notebook on this to the library once it's working.
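My guess at where the out-of-range label comes from: if the variational logits go NaN, every comparison in inverse-CDF sampling is False, so the sampler falls off the end of the CDF and returns K itself. A toy reproduction of that mechanism (my guess, not TF's actual sampling kernel):

```python
import math

def sample_categorical(probs, u):
    # naive inverse-CDF sampler: walk the CDF until it exceeds u
    c = 0.0
    for k, p in enumerate(probs):
        c += p
        if u < c:
            return k
    # with NaN probabilities every `u < c` is False, so we fall through
    return len(probs)

print(sample_categorical([0.5, 0.5], 0.25))            # valid draw in [0, 2)
print(sample_categorical([math.nan, math.nan], 0.25))  # falls through: returns 2
```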
Thanks!
Romain