Thanks for your response. I think I understand the idea of building a manual NN with priors on the weights.
However, I’d just like to put a prior on the outputs of that NN (e.g. the scales or logits). For example, I’d like to be able to say that the scales come from some gamma distribution that places high density on small values.
Is it possible to do this, or would I still need to build a manual NN with priors on the weights?
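For concreteness, here’s a rough numpy sketch of the MAP-style version of what I mean: score the network’s scale outputs under a gamma log-density and add that as a penalty to the loss. All the shapes, weights, and gamma parameters here are made up purely for illustration — this isn’t any particular library’s API, just the idea.

```python
import math
import numpy as np

def gamma_logpdf(x, alpha, beta):
    # Log-density of Gamma(alpha, beta) in the shape/rate parameterization.
    return (alpha * math.log(beta) - math.lgamma(alpha)
            + (alpha - 1) * np.log(x) - beta * x)

def softplus(z):
    # Smooth transform to keep the scale outputs positive.
    return np.log1p(np.exp(z))

rng = np.random.default_rng(0)
W = rng.normal(size=(3, 1))          # hypothetical last-layer weights
features = rng.normal(size=(5, 3))   # hypothetical penultimate activations

# The network's scale outputs for a batch of 5 inputs.
scales = softplus(features @ W)

# "Prior on the outputs": Gamma(2, 10) concentrates near small values,
# so subtracting its log-density from the loss penalizes large scales.
alpha, beta = 2.0, 10.0
prior_penalty = -gamma_logpdf(scales, alpha, beta).sum()
```

In other words, I’m not asking about priors on `W` itself, only about scoring the quantity `scales` that the network emits.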
(I’m finding this meshing of neural networks and probabilistic models very confusing… would it be proper to cast this MDN as a form of VAE, where the NN is the inference network and the generative network simply samples from the mixture?)