Sure, the KL isn’t strictly defined, because the entropy term blows up (the differential entropy of a Dirac is -\infty), but everything still works. Just view the Dirac as a Gaussian with a tiny fixed variance and this is fine. The entropy of that Gaussian doesn’t depend on the mean, so you can ignore it when optimizing, and \int q(theta) log p(theta) dtheta ~= log p(theta*), where theta* is the mean of q. So you recover MAP inference over the variables with a Dirac q. This approach has been used for some variables in various VB papers.
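
To spell the argument out, here is a sketch of the derivation. It assumes q is an isotropic Gaussian with mean mu and fixed standard deviation sigma in d dimensions, and that log p(x, theta) is smooth enough around mu for a Taylor expansion. The symbols x (the data), mu, sigma and d are notation introduced here for illustration, not taken from any particular paper.

```latex
\begin{align*}
\mathcal{L}(\mu) &= \mathbb{E}_{q}[\log p(x, \theta)] + H[q],
  \qquad q(\theta) = \mathcal{N}(\theta;\, \mu,\, \sigma^2 I) \\
H[q] &= \tfrac{d}{2} \log\!\left(2 \pi e\, \sigma^2\right)
  \quad \text{(independent of } \mu \text{; tends to } -\infty \text{ as } \sigma \to 0\text{)} \\
\mathbb{E}_{q}[\log p(x, \theta)] &= \log p(x, \mu)
  + \tfrac{\sigma^2}{2} \operatorname{tr}\!\left( \nabla_\theta^2 \log p(x, \theta) \big|_{\theta = \mu} \right)
  + O(\sigma^4) \\
\arg\max_{\mu} \mathcal{L}(\mu) &\approx \arg\max_{\mu} \log p(x, \mu)
  = \arg\max_{\mu} \log p(\mu \mid x)
\end{align*}
```

The entropy is a constant (large and negative, but constant) for fixed sigma, so it drops out of the optimization over mu, and the remaining expectation collapses to the log joint evaluated at mu. Maximizing that is MAP estimation, up to a correction of order sigma^2.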