I am trying to reproduce the Convolutional VAE tutorial from TensorFlow (https://www.tensorflow.org/beta/tutorials/generative/cvae), using Keras and Edward's facilities for probabilistic programming (partially inspired by the Bayesian Layers paper as well).
Unfortunately, the loss I get is very high compared to the one reported in the TensorFlow tutorial. The model does not seem to learn anything, and the generated images are pure noise. Is there perhaps something wrong with the way I specified the loss?
```python
encoding = encoder(inputs)
decoding = decoder(encoding)
model = tf.keras.models.Model(inputs=inputs, outputs=decoding.value)

# reconstruction term: negative log-likelihood of the inputs under the decoder
nll = -decoding.distribution.log_prob(inputs)
# regularization term: KL between the encoder distribution and a unit Gaussian
kl = encoding.distribution.kl_divergence(ed.Normal(0., 1.).distribution)

# sum NLL over height/width/channels, KL over latent dims, then average over the batch
loss = tf.keras.backend.sum(nll, axis=[-3, -2, -1]) + tf.keras.backend.sum(kl, axis=[-1])
batch_loss = tf.keras.backend.mean(loss)
model.add_loss(batch_loss)
```
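To make explicit what I intend this loss to compute, here is a standalone NumPy sketch of the same two ELBO terms, assuming a Bernoulli decoder (as in the tutorial) and a diagonal-Gaussian encoder against a unit-Gaussian prior. Shapes and values are toy placeholders, not my actual model.

```python
import numpy as np

def gaussian_kl(mu, sigma):
    # elementwise analytic KL( N(mu, sigma) || N(0, 1) )
    return 0.5 * (mu**2 + sigma**2 - 1.0 - 2.0 * np.log(sigma))

def bernoulli_nll(x, p, eps=1e-7):
    # elementwise negative log-likelihood of x under Bernoulli(p)
    p = np.clip(p, eps, 1.0 - eps)
    return -(x * np.log(p) + (1.0 - x) * np.log(1.0 - p))

# toy batch: 2 "images" of shape 4x4x1, latent dimension 3 (hypothetical)
rng = np.random.default_rng(0)
x = rng.integers(0, 2, size=(2, 4, 4, 1)).astype(np.float64)   # binarized pixels
p = rng.uniform(0.1, 0.9, size=(2, 4, 4, 1))                   # decoder means
mu = rng.normal(size=(2, 3))                                   # encoder means
sigma = rng.uniform(0.5, 1.5, size=(2, 3))                     # encoder stddevs

nll = bernoulli_nll(x, p).sum(axis=(-3, -2, -1))  # sum over H, W, C
kl = gaussian_kl(mu, sigma).sum(axis=-1)          # sum over latent dims
loss = (nll + kl).mean()                          # mean over the batch
print(loss)
```

This is the per-batch negative ELBO I believe the Keras snippet above should be producing; if the two disagree in magnitude, that would point at the reduction axes or the KL term.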
The full notebook: