# Combining Bernoulli variables

I am having trouble combining discrete random variables. All the examples in the documentation seem to consist mostly or entirely of continuous variables. The simplest model I could think of is where we have a latent variable of interest, x, and we have an observation which is equal to x with a probability that is high but not equal to 1. That is:

``````
import edward as ed
import tensorflow as tf
from edward.models import Bernoulli

# Model
x = Bernoulli(0.5)  # Latent variable of interest
y = Bernoulli(0.9)  # A "noise"
z = tf.equal(x, y)  # Usually equals x, but not always.
z_data = True

# Inference
qx_p = tf.sigmoid(tf.Variable(tf.random_normal([])))
qx = Bernoulli(qx_p)
inference = ed.KLqp({x: qx}, data={z: z_data})
inference.run()
print("Posterior p(x=1|z)={}={}".format(qx_p.eval(), qx.mean().eval()))
# Prints:  Posterior p(x=1|z)=0.500000=0.622459
# Correct: Posterior p(x=1|z)=0.900000=0.900000
``````
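For reference, the "Correct" value in the comment above can be checked by brute-force enumeration, since the model has only four joint states. This is a minimal sketch in plain Python (no Edward involved); `bernoulli_pmf` is a helper name introduced here for illustration, and 0.5 and 0.9 are taken to be probabilities.

```python
from itertools import product


def bernoulli_pmf(value, p):
    """Probability that a Bernoulli(p) variable takes `value` (0 or 1)."""
    return p if value == 1 else 1.0 - p


p_x, p_y = 0.5, 0.9   # prior for x and for the "noise" y
z_data = True         # observed value of z = (x == y)

# Keep only the joint states (x, y) that are consistent with the observation.
joint = {}
for x, y in product([0, 1], repeat=2):
    if (x == y) == z_data:
        joint[(x, y)] = bernoulli_pmf(x, p_x) * bernoulli_pmf(y, p_y)

evidence = sum(joint.values())  # p(z = True)
posterior_x1 = sum(w for (x, _), w in joint.items() if x == 1) / evidence
print("p(x=1|z) = {:.6f}".format(posterior_x1))
```

Any inference method applied to this model should approach the same number, which makes the enumeration a useful baseline.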

I also tried KLpq for inference but the loss never converges; an example result after 100,000 iterations is:

``````
# KLpq:    Posterior p(x=1|z)=0.982831=0.727670
``````

I tried Gibbs sampling but this gave me the error:

``````
KeyError: "The name 'conjugate_log_joint/Bernoulli_9/_conjugate_log_prob:0' refers to a Tensor which does not exist. The operation, 'conjugate_log_joint/Bernoulli_9/_conjugate_log_prob', does not exist in the graph."
``````

Finally, I tried this model, which is mathematically equivalent (except z now has range +1/-1 rather than 1/0), but with similar problems:

``````
# Mathematically equivalent model
x = Bernoulli(0.5)
y = Bernoulli(0.9)
x_ = 2 * tf.cast(x, dtype=tf.float32) - 1
y_ = 2 * tf.cast(y, dtype=tf.float32) - 1
z = x_ * y_
z_data = 1
``````
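The claimed equivalence is easy to check exhaustively: after mapping 0/1 to -1/+1, the product is +1 exactly when the two bits agree. A small plain-Python sketch (the variable names are illustrative and not part of the Edward code above):

```python
from itertools import product

rows = []
for x, y in product([0, 1], repeat=2):
    z_equal = (x == y)                     # first model: z = tf.equal(x, y)
    z_product = (2 * x - 1) * (2 * y - 1)  # second model: z = x_ * y_
    rows.append((x, y, z_equal, z_product))

# The models agree if z_product is +1 exactly when z_equal is True.
models_agree = all((zp == 1) == ze for _, _, ze, zp in rows)

for row in rows:
    print(row)
print("models agree:", models_agree)
```

So observing `z_data = 1` in the second model carries the same information as `z_data = True` in the first.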

Does anyone have any thoughts? Am I just making a stupid mistake with the way I’m calling the API? Or am I trying to do something totally hopeless? I notice that Stan doesn’t even include support for discrete variables (it advises that you “integrate the discrete variables out” by hand!) - is that because this is never going to work?

As a last resort, I’m considering representing discrete variables using continuous variables instead e.g. with values >0 representing true. That seems a bit crazy to me though? Any thoughts or even examples of this being done before?

Hi John,

I’m learning TensorFlow and Edward and came across your problem. So as a learning exercise for me I tried:

``````
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import tensorflow as tf
import numpy as np
import edward as ed
from edward.models import Bernoulli, Beta, Empirical, OneHotCategorical

# ed.set_seed(42)

# DATA
# x_data = np.array([0, 1, 0, 0, 0, 0, 0, 0, 0, 1])
z_data = True

# MODEL
# p = Beta(1.0, 1.0)
x = Bernoulli(0.55)
y = Bernoulli(0.95)
z = tf.equal(x, y)

# INFERENCE
qx_p = tf.sigmoid(tf.Variable(tf.random_normal([])))
qy_p = tf.sigmoid(tf.Variable(tf.random_normal([])))
qx = Bernoulli(qx_p)
qy = Bernoulli(qy_p)

# qp_a = tf.nn.softplus(tf.Variable(tf.random_normal([])))
# qp_b = tf.nn.softplus(tf.Variable(tf.random_normal([])))
# qp = Beta(qp_a, qp_b)

# qp = Empirical(params=tf.Variable(tf.zeros([1000]) + 0.5))
# proposal_p = Beta(3.0, 9.0)

inference = ed.KLqp({x: qx, y: qy}, data={z: z_data})
inference.run()
print("Posterior x = {}".format(qx_p.eval()))
print("Posterior y = {}".format(qy_p.eval()))
``````

The result I get is:

``````
1000/1000 [100%] ██████████████████████████████ Elapsed: 1s | Loss: -0.000
Posterior x = 0.5499999523162842
Posterior y = 0.949999988079071
``````

regards,
Simon

You’re defining a model whose likelihood (`z`) is not tractable. More specifically, `z` is a `tf.Tensor` and not an `ed.RandomVariable`, so `z` has no `log_prob` method that inference can rely on. Methods such as `ed.KLqp` and `ed.Gibbs` rely on tractable likelihoods; if the data items passed in have the form `tf.Tensor: tf.Tensor`, funny issues will arise. (No error is raised explicitly, because `tf.Tensor: tf.Tensor` items have other use cases.)

You can either (a) rewrite your model with a tractable likelihood; or (b) resort to likelihood-free methods such as `ed.ImplicitKLqp` (which I don’t recommend to non-researchers).


@sabladmin1 Thanks Simon, I see you got the same problem as me: the inferred posteriors are just the same as the priors. In your case, the true posterior for both x and y is Bern(0.9587).
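The 0.9587 figure can be reproduced in closed form. Observing z = True rules out the two states where x and y differ, so x and y must be equal and therefore share one posterior. A minimal sketch in plain Python, treating 0.55 and 0.95 as probabilities (the function name is made up for this example):

```python
def posterior_both_one(p_x, p_y):
    """P(x=1, y=1 | x == y) for independent Bernoulli(p_x) and Bernoulli(p_y)."""
    both_one = p_x * p_y
    both_zero = (1.0 - p_x) * (1.0 - p_y)
    return both_one / (both_one + both_zero)


posterior = posterior_both_one(0.55, 0.95)
print("p(x=1|z) = p(y=1|z) = {:.4f}".format(posterior))
```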

@dustin Thanks, that’s very helpful. I think I’ll steer clear of ImplicitKLqp in that case. Based on what you said, I made an equivalent model where z is an `ed.RandomVariable`, but it didn’t really help:

``````
# Model (and data)
x = Bernoulli(0.5)
z = Bernoulli(0.8 * tf.cast(x, tf.float32) + 0.1)
z_data = True
``````

- True posterior: p(x=1|z)=0.9
- KLqp: qx_p.eval()=0.8042; qx.mean().eval()=0.6909
- KLpq: e.g. qx_p.eval()=0.0303; qx.mean().eval()=0.5078 (does not converge)
- Gibbs: Similar exception as before (“The name ‘conjugate_log_joint/Bernoulli_2_conjugate_log_prob:0’ refers to a Tensor which does not exist. …”)
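To spell out why this model should be equivalent: marginalising the noise out of the original model gives p(z=1|x) = 0.9 when x = 1 and 0.1 when x = 0, which is exactly `0.8 * x + 0.1`. The sketch below, in plain Python rather than Edward, normalises the log joint log p(x) + log p(z|x) over the two values of x. `log_bernoulli` and `log_joint` are helper names invented for this illustration, and all parameters are treated as probabilities.

```python
import math


def log_bernoulli(value, p):
    """Log-probability of `value` (0 or 1) under Bernoulli(p)."""
    return math.log(p if value == 1 else 1.0 - p)


def log_joint(x, z):
    """log p(x) + log p(z | x) for the rewritten model."""
    return log_bernoulli(x, 0.5) + log_bernoulli(z, 0.8 * x + 0.1)


z_data = 1
log_weights = [log_joint(x, z_data) for x in (0, 1)]

# Normalise with log-sum-exp to get the exact posterior over x.
m = max(log_weights)
log_evidence = m + math.log(sum(math.exp(w - m) for w in log_weights))
posterior_x1 = math.exp(log_weights[1] - log_evidence)
print("p(x=1|z) = {:.6f}".format(posterior_x1))
```

Every term in this log joint can be evaluated pointwise, which is the property the inference methods need from the likelihood.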

Is this what you meant by “tractable likelihood”? I realise there are still intermediate `tf.Tensor` objects here, but that seems quite common in the examples. If that’s a problem I could try making my own Bernoulli `ed.RandomVariable` taking three parameters (input Bernoulli, p_false, p_true) but I don’t see how that would differ from the model above. Any thoughts?
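One pattern in the numbers reported in this thread may be relevant. In each case `qx.mean().eval()` is close to the logistic sigmoid of `qx_p.eval()` rather than equal to it. That would be consistent with the first positional argument of `Bernoulli` being read as logits rather than as a probability in the library version used, though this is an assumption that nothing in the thread confirms. The check below uses only the figures quoted above:

```python
import math


def sigmoid(t):
    """Logistic sigmoid, mapping a logit to a probability."""
    return 1.0 / (1.0 + math.exp(-t))


# (label, reported qx_p.eval(), reported qx.mean().eval())
reported = [
    ("KLqp, first model", 0.500000, 0.622459),
    ("KLpq, first model", 0.982831, 0.727670),
    ("KLqp, rewritten model", 0.8042, 0.6909),
    ("KLpq, rewritten model", 0.0303, 0.5078),
]

for label, qx_p, qx_mean in reported:
    gap = abs(sigmoid(qx_p) - qx_mean)
    print("{}: sigmoid(qx_p)={:.6f}, reported mean={:.6f}, gap={:.6f}".format(
        label, sigmoid(qx_p), qx_mean, gap))
```

If that is the cause, priors such as `Bernoulli(0.5)` would be affected in the same way, and passing the parameter by keyword (for example `probs=`, if the installed version supports it) would remove the ambiguity.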
