If you’d like to use the local reparameterization trick, “expos[ing] the reparameterization in the model specification” is the proper approach: define the model with the weights marginalized out, so that the neurons (pre-activations) are the random variables. Inference is then over logodds, scale, and eta given y and x; a minimal sketch is below.
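To make this concrete, here is a minimal sketch in Edward (1.x, TensorFlow 1 style) for the simplest case, a Bayesian linear model with a standard normal prior on the weights. The names (`eta`, `qeta`) and shapes are illustrative, and the `logodds`/`scale` pieces of the actual model are omitted. Marginalizing out w ~ N(0, I) means each neuron eta_n = x_n^T w is Normal with scale ||x_n||, so the weights never appear as latent variables:

```python
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal

N, D = 100, 5  # illustrative sizes
X_train = np.random.randn(N, D).astype(np.float32)  # toy data
y_train = np.random.randn(N).astype(np.float32)

X = tf.placeholder(tf.float32, [N, D])

# Weights marginalized out: under w ~ Normal(0, 1) elementwise, the neuron
# eta_n = x_n^T w has marginal Normal(0, ||x_n||^2), so the model is written
# directly over the per-datapoint neurons (independent across n, which is
# exactly what the local reparameterization trick samples).
eta = Normal(loc=tf.zeros(N),
             scale=tf.sqrt(tf.reduce_sum(tf.square(X), 1)))
y = Normal(loc=eta, scale=tf.ones(N))

# Factorized variational family over the neurons, not the weights.
qeta = Normal(loc=tf.Variable(tf.zeros(N)),
              scale=tf.nn.softplus(tf.Variable(tf.zeros(N))))

inference = ed.KLqp({eta: qeta}, data={y: y_train, X: X_train})
inference.run(n_iter=1000)
```

The variational parameters now live on the neurons rather than the weights, which is where the variance reduction of the trick comes from.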
Alternatively, you can build a new inference algorithm to try to automate local reparameterizations. That said, I prefer the former approach: I view the technique as a choice of model parameterization for efficient VI, much as we might use a non-centered parameterization for efficient HMC.
One caveat: this solution doesn’t play nicely with ed.copy, so evaluating the model (e.g., computing the coefficient of determination) requires pulling the coefficients out and doing the computation outside of Edward/TensorFlow.
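For example, a sketch of that evaluation in plain NumPy, with hypothetical stand-ins `w_mean` (posterior-mean weights, recovered as described next), `X_test`, and `y_test`:

```python
import numpy as np

# Hypothetical stand-ins: in practice w_mean comes from your fitted posterior
# (see the Eq. 6 mapping below) and X_test / y_test are held-out data.
w_mean = np.random.randn(5)
X_test = np.random.randn(20, 5)
y_test = X_test.dot(w_mean) + 0.1 * np.random.randn(20)

def r_squared(y_true, y_pred):
    """Coefficient of determination, computed outside Edward/TensorFlow."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - np.mean(y_true)) ** 2)
    return 1.0 - ss_res / ss_tot

y_pred = X_test.dot(w_mean)  # predictions from the extracted coefficients
print(r_squared(y_test, y_pred))
```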
Given parameters for the marginal distribution on the neurons, though, you can calculate the parameters for the distribution on the weights, all in Edward/TensorFlow (there is a one-to-one mapping, as in, e.g., Eq. 6 of their paper).
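Concretely, Eq. 6 maps the weight-space parameters (mu, sigma^2) to neuron-space parameters via gamma = A mu and delta = A^2 sigma^2 (squaring A elementwise). Here is a NumPy sketch of the mapping and its inverse, assuming a single linear layer with a full-column-rank design matrix (all names illustrative):

```python
import numpy as np

np.random.seed(0)
A = np.random.randn(100, 5)               # activations / inputs, shape (N, D)
mu_w = np.random.randn(5, 3)              # weight means, shape (D, K)
sigma2_w = np.exp(np.random.randn(5, 3))  # weight variances, shape (D, K)

# Forward direction (Eq. 6 of Kingma et al., 2015):
#   gamma = A mu_w,  delta = (A ** 2) sigma2_w
gamma = A.dot(mu_w)
delta = (A ** 2).dot(sigma2_w)

# Reverse direction: with A of full column rank, the neuron parameters
# determine the weight parameters exactly (least squares solves the
# linear systems, so the mapping is one-to-one).
mu_w_rec = np.linalg.lstsq(A, gamma, rcond=None)[0]
sigma2_w_rec = np.linalg.lstsq(A ** 2, delta, rcond=None)[0]

assert np.allclose(mu_w, mu_w_rec)
assert np.allclose(sigma2_w, sigma2_w_rec)
```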