Trying to understand different shape parameters


#1

I’m confused by the various shape parameters in Edward: sample_shape, batch_shape, and event_shape.
Would appreciate some clarification from experts!

Say I create a Dirichlet object:

import tensorflow as tf
from edward.models import Dirichlet

x = Dirichlet([1.0, 2.0, 3.0], sample_shape=(7, 4))

One draw from the object should be a vector of length 3, which should correspond to event_shape.
So:

x.event_shape
# TensorShape([Dimension(3)])
with tf.Session() as sess:
    print(sess.run(x.sample()))
# [ 0.07021441  0.43511733  0.49466828]
# This makes sense so far.

But, how do I understand sample_shape and batch_shape?
The sample_shape parameter specified when x is constructed doesn’t seem to have any effect.

Thank you!


#2

Check out Section 3.3 of the TF Distributions whitepaper, or the associated Colab tutorial (https://drive.google.com/file/d/1Ta88_r9O2GX5WPuhnmaWP842euy6L-BA/view). It illustrates the differences with examples.


#3

Thanks @dustin for pointing out the resources!

It seems that the sample_shape argument in the Dirichlet constructor doesn’t take effect.
(If it did, I would expect to see a (7, 4, 3) tensor in the above example when x.sample() is called.)

The TF Distributions whitepaper you pointed to also says (last line of page 5) that the sample shape is determined by the sample method.
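
For anyone reading along, here is a minimal sketch of that idea using NumPy as a stand-in (this is not Edward's or TF Distributions' API, just an analogy). It assumes the same concentration vector as above and shows that the sample shape is something you request at draw time, and that it is prepended to the event shape:

```python
import numpy as np

rng = np.random.default_rng(0)

# Same concentration as in the question; the event shape is (3,).
alpha = np.array([1.0, 2.0, 3.0])

# No sample shape requested: a single event.
one_draw = rng.dirichlet(alpha)

# Sample shape (7, 4) requested at draw time: it goes in front of the event shape.
many_draws = rng.dirichlet(alpha, size=(7, 4))

print(one_draw.shape)    # (3,)
print(many_draws.shape)  # (7, 4, 3)
```

In NumPy the `size` argument plays roughly the role that the sample shape plays in the whitepaper's description of the sample method.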


#4

I’m new to Edward so I may be wrong, but the way I think of it, what you get is an array proportional to P(event | batch), tiled across an outer array that has shape sample_shape. The event is the innermost index for convenience: summing over the event axis should give 1 when normalized.
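
To make that picture concrete, here is a rough NumPy sketch (again an analogy, not Edward code). The helper `sample_dirichlet` is hypothetical: it draws Dirichlet samples by normalizing Gamma draws, which lets the concentration carry a batch dimension. The assumed layout is sample_shape, then batch_shape, then event_shape:

```python
import numpy as np


def sample_dirichlet(alpha, sample_shape, rng):
    """Hypothetical helper: batched Dirichlet draws via normalized Gamma draws.

    The last axis of alpha is the event axis; leading axes are batch axes.
    """
    alpha = np.asarray(alpha, dtype=float)
    # Gamma(alpha_k, 1) draws, broadcast over the requested sample shape.
    g = rng.gamma(shape=alpha, scale=1.0, size=tuple(sample_shape) + alpha.shape)
    # Normalizing over the event (last) axis gives Dirichlet samples.
    return g / g.sum(axis=-1, keepdims=True)


rng = np.random.default_rng(0)

# Two different Dirichlets: batch_shape (2,), event_shape (3,).
alpha = np.array([[1.0, 2.0, 3.0],
                  [4.0, 5.0, 6.0]])

draws = sample_dirichlet(alpha, sample_shape=(7, 4), rng=rng)

print(draws.shape)                           # (7, 4, 2, 3)
print(np.allclose(draws.sum(axis=-1), 1.0))  # True
```

Each length-3 slice along the last axis is one event and sums to 1, which is why keeping the event as the innermost index is convenient.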