Any difference between a tf.matmul-based call and tf.reduce_sum(..., axis=-1)?

For the Bernoulli sample code I posted before,

C = Bernoulli(logits=z_logits)

I replaced the original call with tf.reduce_sum. The results are mostly the same, but the parameters with very small values change from -0.0091895 to 0.03451591. There is no other difference than the reshape needed to switch between the two calls. Is there any reason a reshape or a change of call would cause this difference in small parameter values?

The original call goes through tf.matmul, which calls a specialized matrix multiplication routine.

Multiplying and then summing floating-point numbers propagates rounding errors differently, because the two approaches add the products in a different order. Refer to
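
As a rough illustration of the rounding effect, here is a minimal sketch that uses NumPy as a stand-in for the TensorFlow calls (an assumption, so it runs without a TensorFlow install; `x`, `w`, and the shapes are made up for the example). It first shows that float32 addition depends on grouping, then compares a matrix-multiply dot product against multiply-then-sum on the same data.

```python
import numpy as np

# Floating-point addition is not associative: the grouping changes the result.
a, b, c = np.float32(1e8), np.float32(-1e8), np.float32(1.0)
left_to_right = (a + b) + c   # the large values cancel first, so the 1.0 survives
right_to_left = a + (b + c)   # the 1.0 is absorbed into -1e8 before the cancellation

# Two ways to compute the same row-wise dot product in float32.
# x and w are hypothetical data, not the values from the thread.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 1000)).astype(np.float32)
w = rng.standard_normal(1000).astype(np.float32)

via_matmul = x @ w                    # handled by a matrix-multiplication routine
via_reduce = np.sum(x * w, axis=-1)   # elementwise multiply, then reduce

# Largest absolute disagreement between the two results (may be zero on some platforms).
max_gap = float(np.max(np.abs(via_matmul - via_reduce)))
print(left_to_right, right_to_left, max_gap)
```

The two dot products agree to within float32 rounding, but they are not guaranteed to match bit for bit. Each discrepancy is tiny, though it can compound over many training steps, and a parameter that sits near zero is where such a shift is most visible.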