How to implement a Bayesian hierarchical inference model?

I have the following Bayesian hierarchical graphical (DAG) structure, where all the edges are directed downward, i.e. A --> C,D; D --> H,I, etc.

          (A)                        (B)
         /   \                      /   \
        /     \                    /     \
      (C)     (D)                (E)     (F)
     /   \   /   \              /   \   /   \
    /     \ /     \            /     \ /     \
  (G)     (H)     (I)        (J)     (K)     (L)

I have their co-occurrence counts as follows:

A,B,C,D,E,F,G,H,I,J,K,L
1,1,0,0,1,1,0,0,0,1,1,1
0,0,0,1,1,1,1,0,0,0,0,0
1,0,1,0,1,0,1,0,1,1,1,1
0,0,0,0,0,0,1,1,1,1,1,1

I want to train a Bayesian probabilistic inference model according to the defined graph structure, so that it can learn from the above co-occurrence matrix and I can get insights such as:

  1. P(A,B,C | H=1, I=1, L=1)
  2. P(D,F | G=1, J=1)

Can I do that using Edward? If yes, can you suggest a piece of code to do it?
So far I have tried pgmpy, libpgm, pomegranate, and PyMC3, but they hardly scale to big data.
Thanks in advance.

To implement this, you need to follow the paradigm of modeling -> inference -> criticism. Namely, first build a DAG model where each variable is a Bernoulli distribution and the variables are connected through their parameterization. Then run an inference method such as ed.KLqp. Then check the fit.
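
To make the three stages concrete, here is a minimal sketch on a two-node slice (A -> C) of your graph. The latent weight w and the Normal variational family qw are my assumptions for illustration, not something Edward prescribes; the observed columns for A and C are taken from your matrix.

import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli, Normal

N = 4  # rows of the co-occurrence matrix

# Modeling: A is a root Bernoulli; C depends on A through a latent weight w.
w = Normal(loc=0.0, scale=1.0)
A = Bernoulli(logits=tf.zeros(N))
C = Bernoulli(logits=w * tf.cast(A, tf.float32))

# Inference: fit a variational posterior qw with ed.KLqp.
qw = Normal(loc=tf.Variable(0.0), scale=tf.nn.softplus(tf.Variable(1.0)))
a_train = np.array([1, 0, 1, 0], dtype=np.int32)  # A column of the matrix
c_train = np.array([0, 0, 1, 0], dtype=np.int32)  # C column of the matrix
inference = ed.KLqp({w: qw}, data={A: a_train, C: c_train})
inference.run(n_iter=500)

# Criticism: score the posterior predictive against the observed column.
# (A is re-sampled during evaluation in this simple sketch.)
C_post = ed.copy(C, {w: qw})
print(ed.evaluate('log_likelihood', data={C_post: c_train}))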

How do I connect the variables through parameterization? Can you please provide a small script for this problem?

How you parameterize them is your modeling assumption. For example:

A = Bernoulli(logits=0.0)
C = Bernoulli(logits=f_c(A))
D = Bernoulli(logits=f_d(A))
G = Bernoulli(logits=f_g(C))
...

and so on. Here, the f_* are individual functions which you define and which determine how each variable depends on the others. Given this model, you can then plug it into an inference algorithm to answer the queries you seek.
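
For instance, here is a minimal sketch of one such choice, assuming each f_* is an affine function of its parent with a trainable weight and bias; the helper make_f and the batch size N are my own illustrative names, not part of Edward's API.

import tensorflow as tf
from edward.models import Bernoulli

N = 4  # one Bernoulli per row of the co-occurrence matrix

def make_f(name):
    # One trainable weight and bias per edge (a modeling assumption).
    w = tf.Variable(0.0, name=name + "_w")
    b = tf.Variable(0.0, name=name + "_b")
    return lambda parent: w * tf.cast(parent, tf.float32) + b

f_c, f_d, f_g = make_f("c"), make_f("d"), make_f("g")
f_h1, f_h2 = make_f("h1"), make_f("h2")

A = Bernoulli(logits=tf.zeros(N))        # root, no parents
C = Bernoulli(logits=f_c(A))
D = Bernoulli(logits=f_d(A))
G = Bernoulli(logits=f_g(C))
H = Bernoulli(logits=f_h1(C) + f_h2(D))  # H has two parents, C and D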


Thanks, Dustin, for your quick reply.

For the left tree I want to go upward from the bottom, so I define the model as below, where G, H, and I are Uniform distributions, while C = G + H, D = H + I, and A = C * D.

import tensorflow as tf
from edward.models import Bernoulli, Uniform

G = Uniform(low=0.0, high=1.0)
H = Uniform(low=0.0, high=1.0)
I = Uniform(low=0.0, high=1.0)

C = Bernoulli(logits=tf.add(G, H))
D = Bernoulli(logits=tf.add(H, I))

# Cast the integer Bernoulli samples to float so they can serve as logits.
A = Bernoulli(logits=tf.multiply(tf.cast(C, tf.float32), tf.cast(D, tf.float32)))

Now, how can I build the inference and pass in the input co-occurrence matrix (which will be an ndarray) for training? Please tell me if I am going in the right direction.


The model is okay, but you haven't set up any parameters or latent variables. This implies that, given data, the model's probability distribution would remain the same.

You can use tf.Variable() to parameterize the distributions, and then use something like ed.MAP to maximize the log-likelihood with respect to those parameters. Alternatively, you can do Bayesian inference with latent variables.
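
For instance, here is a minimal sketch building on your model above. The trainable weights w_c, w_d, w_a are hypothetical names I introduce; with an empty latent_vars dict, ed.MAP reduces to maximizing the log-likelihood of the observed columns with respect to those tf.Variables.

import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli, Uniform

# The 4x12 co-occurrence matrix from the question; columns are A..L.
x_train = np.array([
    [1, 1, 0, 0, 1, 1, 0, 0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 0, 1, 0, 1, 0, 1, 0, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1]], dtype=np.int32)
N = x_train.shape[0]

G = Uniform(low=tf.zeros(N), high=tf.ones(N))
H = Uniform(low=tf.zeros(N), high=tf.ones(N))
I = Uniform(low=tf.zeros(N), high=tf.ones(N))

# Trainable weights (hypothetical) so that data can move the distribution.
w_c = tf.Variable(1.0)
w_d = tf.Variable(1.0)
w_a = tf.Variable(1.0)

C = Bernoulli(logits=w_c * (G + H))
D = Bernoulli(logits=w_d * (H + I))
A = Bernoulli(logits=w_a * tf.cast(C * D, tf.float32))

# Maximum likelihood: bind the observed columns for A, C, D and optimize
# the tf.Variables; G, H, I stay unobserved and are sampled each step.
data = {A: x_train[:, 0], C: x_train[:, 2], D: x_train[:, 3]}
inference = ed.MAP({}, data=data)
inference.run(n_iter=1000)

Note that G, H, and I remain unobserved here, so each gradient step draws fresh samples of them; for a fully Bayesian treatment you would instead posit variational posteriors for them and use something like ed.KLqp.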

In general, I recommend looking at the API in more detail.
