a) Is RelaxedOneHotCategorical supported in Edward?
b) Is my usage correct?
I think that modeling the temperature parameter as a random variable may not be appropriate.
How about setting the temperature to a constant instead?
tau = tf.constant(0.5)
Please see the paper (https://arxiv.org/abs/1611.01144).
In the first experiment, they used a fixed temperature.
In the second experiment, they anneal the temperature using the schedule
\tau = max(0.5, exp(−rt)) as a function of the global training step t.
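As a rough illustration of the schedule quoted above, here is a minimal sketch in plain Python. The function name annealed_temperature and the default rate of 1e-4 are my own assumptions for the example, not values taken from the paper or from Edward.

```python
import math


def annealed_temperature(step, rate=1e-4, min_temperature=0.5):
    """Compute tau = max(min_temperature, exp(-rate * step)).

    `rate` plays the role of r in the schedule and `step` the role of
    the global training step t. The default rate is only a placeholder
    chosen for illustration.
    """
    return max(min_temperature, math.exp(-rate * step))


# The temperature starts at 1.0 and decays until it reaches the floor.
for t in (0, 1000, 5000, 10000):
    print(t, annealed_temperature(t))
```

The value computed at each step could then be passed to the model as an ordinary number (for example through a tf.placeholder), so the temperature stays a deterministic hyperparameter rather than a random variable.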
It seems that they never modeled the temperature parameter as a random variable.
c) Are there any issues with KLqp inference when using such a model?
When I tried to approximate the Categorical distribution with a OneHotCategorical distribution, NaNs occurred. I am looking for a good way to handle this.
I would be glad to continue this discussion with you.