Confusion about feeding data in batches

Hey all. Potentially dumb question that I wasn’t able to find an answer to in the docs/forum.

I’ve been learning Edward by translating the PyMC3 tutorials into Edward/TensorFlow, and I’m a bit confused about how to do minibatch updates in Edward.

Specifically, I’m confused about what I should be passing into the data parameter of KLqp.__init__() and what I should be passing into the feed_dict parameter of inference.update(). My reading of the documentation is that I pass in the minibatches as I have done below.

If I do that, do I need to pass anything to the data parameter? The docs didn’t seem to indicate that I needed to, but when I omitted it the model training didn’t seem to converge. Training also converges better if I pass in a data dict with the full data set (X_train/Y_train) rather than just the first batch.

Am I misunderstanding passing in data, or is there something more fundamental?

n_epoch = 500
n_iter_per_epoch = 50
batch_size = 10

inference = ed.KLqp({
        W_0: qW_0, b_0: qb_0, W_1: qW_1, b_1: qb_1, W_2: qW_2, b_2: qb_2
    }, 
    data={
        X: X_train[:batch_size, :],
        y: Y_train[:batch_size]
    }
)

inference.initialize(
    scale={y: float(N) / batch_size}
)
tf.global_variables_initializer().run()

losses = []
pbar = Progbar(n_epoch)
for epoch in range(n_epoch):
    loss = 0.0

    for t in range(n_iter_per_epoch):
        tmod = t % n_iter_per_epoch
        xt   = X_train[tmod*batch_size:(tmod+1)*batch_size, :]
        yt   = Y_train[tmod*batch_size:(tmod+1)*batch_size]

        info_dict = inference.update(feed_dict={
            X: xt,  y: yt
        })
        loss += info_dict['loss']

    pbar.update(epoch+1, values={'loss': loss / float(n_iter_per_epoch)})

inference.finalize()

Batch training requires binding the observed variables to TensorFlow placeholders in the data argument of the Inference class. The intuition is that you are conditioning on values that you don’t specify until runtime.

In the feed_dict during update(), you then bind the TensorFlow placeholders to realized values (NumPy arrays).

The WIP tutorial here might be helpful.
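As a rough sketch of the two steps described above (bind the observed variable to a placeholder in data, then feed real arrays through feed_dict), something like the following should work. It assumes Edward 1.x on TensorFlow 1.x, and it swaps in a small Bayesian linear regression rather than the network from the question. The names minibatches, train_linear_regression, X_ph, y_ph, qw and qb are made up for illustration.

```python
def minibatches(X, y, batch_size):
    """Cycle through X and y in order, yielding aligned full-size batches.

    Hypothetical helper: works on anything sliceable (lists, NumPy arrays).
    A trailing partial batch is skipped so every batch has batch_size rows.
    """
    n = len(X)
    if batch_size > n:
        raise ValueError("batch_size must not exceed the number of rows")
    start = 0
    while True:
        stop = start + batch_size
        yield X[start:stop], y[start:stop]
        # Wrap around once the next full batch would run past the end.
        start = stop if stop + batch_size <= n else 0


def train_linear_regression(X_train, y_train, batch_size=10, n_iter=500):
    """Illustrative minibatch KLqp run; expects NumPy arrays (N, D) and (N,)."""
    # Imported here so the batching helper above is usable without Edward.
    import edward as ed
    import tensorflow as tf
    from edward.models import Normal

    N, D = X_train.shape

    # Placeholders stand in for one minibatch of inputs and targets.
    X_ph = tf.placeholder(tf.float32, [batch_size, D])
    y_ph = tf.placeholder(tf.float32, [batch_size])

    # Stand-in model: the likelihood is defined on the input placeholder.
    w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
    b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
    y = Normal(loc=ed.dot(X_ph, w) + b, scale=tf.ones(batch_size))

    # Variational approximations for the latent variables.
    qw = Normal(loc=tf.Variable(tf.zeros(D)),
                scale=tf.nn.softplus(tf.Variable(tf.zeros(D))))
    qb = Normal(loc=tf.Variable(tf.zeros(1)),
                scale=tf.nn.softplus(tf.Variable(tf.zeros(1))))

    # Step 1: bind the observed variable y to a placeholder, not to data.
    inference = ed.KLqp({w: qw, b: qb}, data={y: y_ph})
    # Rescale the minibatch likelihood as if all N points had been seen.
    inference.initialize(n_iter=n_iter, scale={y: float(N) / batch_size})
    tf.global_variables_initializer().run()

    # Step 2: bind the placeholders to real arrays on every update.
    batches = minibatches(X_train, y_train, batch_size)
    for _ in range(inference.n_iter):
        X_batch, y_batch = next(batches)
        inference.update(feed_dict={X_ph: X_batch, y_ph: y_batch})

    inference.finalize()
    return qw, qb
```

The point to notice is that the keys in feed_dict are the placeholders (X_ph, y_ph), not the random variable y. The random variable only appears as a key in data, where it is tied to y_ph.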

Aha, I see. I made it halfway there and instantiated X as a placeholder, but hadn’t done the same for y with an equivalent y_ph. Adding that and passing y_ph instead of y in the feed_dict solved it. Thanks!

The notebook is very much a WIP itself, but if you think the latter parts would make an interesting contribution, I’d be happy to clean it up for a PR.
