Streamed batch training with manual update - scaling considerations

To some degree, yes. The scale argument is generally used to scale any computation with respect to the random variables; for example, you might use it for masking.
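
As a minimal sketch (hypothetical toy model and variable names, assuming Edward 1.x with TensorFlow 1.x), scale is passed as a dictionary at initialization; here it re-weights a minibatch likelihood, and a 0/1 tensor in the same place would act as a mask:

```python
import tensorflow as tf
import edward as ed
from edward.models import Normal

N = 10000  # assumed total number of data points
M = 100    # assumed minibatch size

# Toy model: a shared mean with Gaussian observations.
mu = Normal(loc=0.0, scale=1.0)
x = Normal(loc=tf.ones(M) * mu, scale=1.0)

# Variational approximation to mu.
qmu = Normal(loc=tf.Variable(0.0),
             scale=tf.nn.softplus(tf.Variable(0.0)))

x_ph = tf.placeholder(tf.float32, [M])
inference = ed.KLqp({mu: qmu}, data={x: x_ph})
# scale multiplies x's log-likelihood; N / M corrects for minibatching,
# while a 0/1 tensor of shape [M] would mask individual observations.
inference.initialize(scale={x: float(N) / M})
```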

n_samples is an algorithm hyperparameter in ed.KLqp: the number of samples drawn from the approximating distribution to estimate the gradient of the loss function. It is relevant whenever you run the algorithm.
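
Continuing the sketch above, n_samples is also passed to initialize(); larger values give lower-variance but more expensive gradient estimates (5 below is an arbitrary choice):

```python
# n_samples draws from q per iteration form the Monte Carlo estimate
# of the ELBO gradient.
inference = ed.KLqp({mu: qmu}, data={x: x_ph})
inference.initialize(n_samples=5, n_iter=1000)
```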

That would only work if, after each streaming batch, you reset the prior distribution to be the inferred posterior from that batch. Otherwise, imagine you had a billion streaming points with a batch size of 1: without scaling the likelihood by a billion, the prior overwhelms the likelihood, so the posterior will not differ much from the prior.
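
Here is a sketch of what that scaling looks like with manual updates on a stream (hypothetical model, synthetic stand-in for the data stream, Edward 1.x): the minibatch likelihood is scaled by N / M so the prior does not dominate.

```python
import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Normal

N = 1000000000  # assumed (approximate) number of streaming points
M = 1           # batch size

mu = Normal(loc=0.0, scale=1.0)
x = Normal(loc=tf.ones(M) * mu, scale=1.0)
qmu = Normal(loc=tf.Variable(0.0),
             scale=tf.nn.softplus(tf.Variable(0.0)))

def next_batch(size):
    # Stand-in for the real data stream: synthetic draws around 3.0.
    return np.random.normal(loc=3.0, scale=1.0, size=size).astype(np.float32)

x_ph = tf.placeholder(tf.float32, [M])
inference = ed.KLqp({mu: qmu}, data={x: x_ph})
# Without scale={x: N / M}, each tiny batch's likelihood would be
# swamped by the prior and the posterior would barely move.
inference.initialize(scale={x: float(N) / M}, n_iter=1000)

sess = ed.get_session()
tf.global_variables_initializer().run()
for _ in range(inference.n_iter):
    info_dict = inference.update(feed_dict={x_ph: next_batch(M)})
    inference.print_progress(info_dict)
```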

The discussion in Iterative estimators ("bayes filters") in Edward? - #2 by dustin provides the most detail.