Any difference between ed.dot and tf.reduce_sum(..., axis=-1)?

For the Bernoulli sample code I posted before,

C = Bernoulli(logits=z_logits)

I replaced ed.dot with tf.reduce_sum. The results are mostly the same, but parameters with very small values change, e.g. from -0.0091895 to 0.03451591. There is no other difference than the reshape needed to use ed.dot. Is there any reason a reshape, or swapping out the ed.dot call, would change these small parameter values?
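
For reference, here is a minimal sketch of the two formulations being compared, assuming a logistic-regression-style setup with a feature matrix X and weight vector w (those names and shapes are my guess; only the Bernoulli line appears above):

```python
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli

D = 10                                     # assumed feature dimension
X = tf.placeholder(tf.float32, [None, D])  # assumed design matrix
w = tf.Variable(tf.zeros(D))               # assumed weight vector

# Variant 1: ed.dot, a 2-D x 1-D dot product built on tf.matmul
# (this is the call that needed the reshape mentioned above).
z_logits = ed.dot(X, w)

# Variant 2: elementwise multiply, then sum over the last axis.
z_logits = tf.reduce_sum(X * w, axis=-1)

C = Bernoulli(logits=z_logits)
```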

ed.dot calls into tf.matmul, which calls a specialized matrix multiplication routine.

Multiplying and then summing floating point numbers propagates rounding errors differently than a single matrix multiplication does, because the partial products are accumulated in a different order. Refer to http://floating-point-gui.de/errors/propagation/. Over many training steps these tiny discrepancies can compound and steer the optimization along a slightly different path, which is most noticeable in parameters whose values are close to zero.
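
A quick, self-contained illustration of the effect (NumPy here rather than TensorFlow, but the same principle applies to the ops above; the sizes and seed are arbitrary):

```python
import numpy as np

rng = np.random.RandomState(0)
X = rng.randn(1, 100000).astype(np.float32)
w = rng.randn(100000).astype(np.float32)

# Dedicated matrix multiplication routine (BLAS under the hood).
via_matmul = X.dot(w)[0]

# Elementwise multiply followed by a separate summation.
via_sum = np.sum(X * w, axis=-1)[0]

# The two accumulate rounding error in different orders, so in
# float32 the results typically agree only to a few decimal places.
print(via_matmul, via_sum, via_matmul - via_sum)
```

Both results are equally "correct" to within floating point tolerance; neither reduction order is privileged.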