Copying is necessary because certain algorithms need to change the connectivity of nodes in the model. For example, the model is written with the likelihood depending on the prior, but the
ed.KLqp algorithm requires evaluating the likelihood under samples from the approximate posterior. This requires copying the likelihood nodes and swapping their dependence on the prior for the approximate posterior.
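To make the copy-and-swap idea concrete, here is a minimal pure-Python sketch. The `Node` class and `copy_with_swap` function are hypothetical stand-ins for illustration, not Edward's actual implementation; the point is only that the likelihood node is duplicated with its parent replaced.

```python
class Node:
    """Toy graph node with named parents (illustration only)."""
    def __init__(self, name, parents=()):
        self.name = name
        self.parents = list(parents)

def copy_with_swap(node, swap):
    """Recursively copy `node`, replacing any ancestor found in `swap`
    (e.g. the prior) with its substitute (e.g. the approx. posterior)."""
    if node in swap:
        return swap[node]
    new_parents = [copy_with_swap(p, swap) for p in node.parents]
    return Node(node.name + "_copy", new_parents)

# Model: the likelihood x depends on the prior z.
z = Node("z")                 # prior
x = Node("x", parents=[z])    # likelihood
qz = Node("qz")               # approximate posterior

# Copy the likelihood, swapping its dependence on z for qz.
x_copy = copy_with_swap(x, {z: qz})
print(x_copy.parents[0].name)   # -> qz
```

The original `x` still points at `z`; only the copy depends on `qz`, which is what lets the same model graph serve both generation and inference.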
I've been dealing with this in my own large-scale experiments by hacking a new inference algorithm that doesn't use
ed.copy and instead assumes the model is already written with whatever connectivity is needed. This is easy, for example, with
ed.MAP if we assume there are no latent variables and we only want to optimize the
tf.Variables written in the model.
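Here is a minimal sketch of that no-copy MAP setting: the model's free parameter is an ordinary variable that we optimize directly by gradient ascent on the log joint, with no graph rewiring. Plain NumPy stands in for tf.Variables, and the conjugate Gaussian model and data are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.0, size=50)   # observed data

def grad_log_joint(mu):
    # log p(mu, data) = log N(mu; 0, 1) + sum_i log N(data_i; mu, 1)
    # d/dmu = -mu + sum_i (data_i - mu)
    return -mu + np.sum(data - mu)

mu = 0.0
for _ in range(1000):
    mu += 0.01 * grad_log_joint(mu)              # gradient ascent

# For this conjugate model the MAP estimate has a closed form:
# sum(data) / (n + 1), so we can sanity-check the optimizer.
print(mu, np.sum(data) / (len(data) + 1))
```

Because the parameter lives directly in the model, there is nothing to copy or swap; the optimizer just differentiates the log joint with respect to it.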
There's more work to be done on intelligently avoiding copies in special cases. Contributions are welcome.
At a higher level, I've been thinking about a dynamic version of Edward that uses lazy evaluation and defers all graph building until inference; this would make graph building significantly faster. If it pans out, it will not land in Edward for at least several months.
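The lazy-evaluation idea can be sketched in a few lines: nodes only record the operation to perform, and nothing is computed until inference forces a value. The `Lazy` class below is purely illustrative and not a proposed Edward API.

```python
class Lazy:
    """Records an operation; computes and caches it only on demand."""
    def __init__(self, fn, *args):
        self.fn, self.args = fn, args
        self._value = None
        self._evaluated = False

    def value(self):
        if not self._evaluated:
            # Force any lazy arguments first, then compute once.
            args = [a.value() if isinstance(a, Lazy) else a
                    for a in self.args]
            self._value = self.fn(*args)
            self._evaluated = True
        return self._value

# Building the "model" does no work...
a = Lazy(lambda: 3)
b = Lazy(lambda x: x * 2, a)

# ...until inference-time evaluation forces it.
print(b.value())   # -> 6
```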