I'm successfully using a Mixture Density Network, based on the MDN tutorial code. However, I'd like to modify the loss function to get different behavior from the network. For example, I'd like the model to prefer more Gaussians, with larger weights and smaller standard deviations, rather than a few Gaussians with larger standard deviations.
I think I could accomplish this by adding one or more regularization terms to the loss function. However, the existing maximum likelihood loss appears to be hidden inside the MAP inference method, so I don't see where to add them.
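To make the idea concrete, here is a rough sketch of the kind of loss I have in mind, written in plain NumPy rather than in the tutorial's framework. The function names (`mdn_nll`, `regularized_loss`) and the weights `lam_entropy` and `lam_sigma` are made up for illustration, and the two penalty terms (an entropy bonus on the mixture weights and a penalty on the squared standard deviations) are just one guess at how the preference could be encoded.

```python
import numpy as np

def log_softmax(logits):
    # Numerically stable log of the mixture weights, row by row.
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def mdn_nll(y, logits, mu, sigma):
    # Mean negative log-likelihood of y under a 1-D Gaussian mixture.
    # y has shape (N,); logits, mu and sigma have shape (N, K).
    log_pi = log_softmax(logits)
    z = (y[:, None] - mu) / sigma
    log_norm = -0.5 * z**2 - np.log(sigma) - 0.5 * np.log(2.0 * np.pi)
    a = log_pi + log_norm
    m = a.max(axis=1, keepdims=True)
    # log-sum-exp over the K components
    log_lik = m[:, 0] + np.log(np.exp(a - m).sum(axis=1))
    return -log_lik.mean()

def regularized_loss(y, logits, mu, sigma, lam_entropy=0.1, lam_sigma=0.1):
    # Hypothetical regularized objective (an assumption, not the tutorial's loss):
    #   NLL - lam_entropy * H(pi) + lam_sigma * mean(sigma^2)
    # High entropy of pi spreads weight over more components;
    # the sigma term pushes the standard deviations down.
    log_pi = log_softmax(logits)
    entropy = -(np.exp(log_pi) * log_pi).sum(axis=1).mean()
    sigma_penalty = (sigma**2).mean()
    return mdn_nll(y, logits, mu, sigma) - lam_entropy * entropy + lam_sigma * sigma_penalty

# Example call with random values standing in for the network outputs.
rng = np.random.default_rng(0)
y = rng.normal(size=5)
logits = rng.normal(size=(5, 3))
mu = rng.normal(size=(5, 3))
sigma = np.exp(rng.normal(size=(5, 3)))  # keep standard deviations positive
print(regularized_loss(y, logits, mu, sigma))
```

Setting both weights to zero recovers the plain maximum likelihood loss, so what I'm really asking is where a term like this can be attached when the loss is built inside the inference method.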
Can anyone suggest a good way to do this?