Confusion about feeding data in batches


#1

Hey all. Potentially dumb question that I wasn’t able to find an answer to in the docs/forum.

I’ve been learning edward by translating the PyMC3 tutorials into edward/tensorflow, and I’m a bit confused about how to do minibatch updates in edward.

Specifically, I’m not sure what I should be passing to the data parameter of KLqp.__init__() versus the feed_dict parameter of inference.update(). My read of the documentation is that I pass in the minibatches as I have done below.

If I do that, do I need to pass anything to the data parameter at all? The docs didn’t seem to indicate that I needed to, but if I omit it, training doesn’t seem to converge. And training converges better if I pass in a data dict with the full data set (X_train/y_train rather than just the first batch, as below).

Am I misunderstanding how data should be passed in, or is there something more fundamental I’m missing?

n_epoch = 500
n_iter_per_epoch = 50
batch_size = 10

inference = ed.KLqp({
        W_0: qW_0, b_0: qb_0, W_1: qW_1, b_1: qb_1, W_2: qW_2, b_2: qb_2
    }, 
    data={
        X: X_train[:batch_size, :],
        y: Y_train[:batch_size]
    }
)

inference.initialize(
    scale={y: float(N) / batch_size}
)
tf.global_variables_initializer().run()

losses = []
pbar = Progbar(n_epoch)
for epoch in range(n_epoch):
    loss = 0.0

    for t in range(n_iter_per_epoch):
        tmod = t % n_iter_per_epoch
        xt   = X_train[tmod*batch_size:(tmod+1)*batch_size, :]
        yt   = Y_train[tmod*batch_size:(tmod+1)*batch_size]

        info_dict = inference.update(feed_dict={
            X: xt,  y: yt
        })
        loss += info_dict['loss']

    pbar.update(epoch+1, values={'loss': loss / float(n_iter_per_epoch)})

inference.finalize()
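(As an aside, the slicing arithmetic in the inner loop above can be factored into a small helper; this is just a sketch in plain NumPy, and next_batch is a name I made up, not part of edward.)

```python
import numpy as np

def next_batch(X, Y, t, batch_size):
    """Return the t-th minibatch of (X, Y), cycling back to the
    start of the data once the last full batch has been used.
    (Hypothetical helper, not part of edward.)"""
    n_batches = max(len(X) // batch_size, 1)
    i = (t % n_batches) * batch_size
    return X[i:i + batch_size], Y[i:i + batch_size]

# Tiny example: 10 rows, batches of 4 -> 2 full batches that cycle.
X_train = np.arange(20).reshape(10, 2)
Y_train = np.arange(10)

xt, yt = next_batch(X_train, Y_train, t=3, batch_size=4)
# t=3 with 2 full batches wraps around to the second batch (rows 4..7)
```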

#2

Batch training requires binding observed variables to TensorFlow placeholders in the Inference class. The intuition is that you are conditioning on values that you don’t specify until runtime.

In the feed_dict during update(), you then bind the TensorFlow placeholders to realized values (NumPy arrays).
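For concreteness, here is a minimal sketch of that pattern, reusing the model variables from the original post (y_ph is a new placeholder name I’m introducing; shapes are assumptions):

```python
# Placeholders stand in for observed data; their values are
# supplied at runtime through feed_dict.
X = tf.placeholder(tf.float32, [None, D])
y_ph = tf.placeholder(tf.float32, [None])

# Bind the observed random variable y to its placeholder in `data`.
inference = ed.KLqp(
    {W_0: qW_0, b_0: qb_0, W_1: qW_1, b_1: qb_1, W_2: qW_2, b_2: qb_2},
    data={y: y_ph})

# Scale the likelihood so a minibatch stands in for the full data set.
inference.initialize(scale={y: float(N) / batch_size})
tf.global_variables_initializer().run()

for t in range(n_iter_per_epoch):
    xt, yt = ...  # NumPy arrays for the current minibatch
    # Feed realized values to the placeholders (y_ph, not y).
    info_dict = inference.update(feed_dict={X: xt, y_ph: yt})
```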

The WIP tutorial here might be helpful.


#3

Aha, I see. I made it halfway there: I had instantiated X as a placeholder, but hadn’t done the same for y with an equivalent y_ph. Adding that, plus passing y_ph instead of y in the feed_dict, solved it. Thanks!

The notebook is very much a WIP itself, but if you think the latter parts would make an interesting contribution, I’d be happy to clean it up for a PR.