Bayesian Model Combination (Kim, Ghahramani 2012)

Hello,

Thank you for a great and interesting package - Edward.

I am trying to implement the Independent Bayesian Classifier Combination (IBCC) model from Kim and Ghahramani (2012). However, I cannot get any of the inference methods available in Edward to run on it. I would appreciate it if someone could point out what I am doing wrong. Thanks!

First, I replicated the directed graphical model in Figure 1 of the paper to generate a dataset together with the true parameter values, so that I can test the inference against them.

def build_dataset(I, J, K, nu_vec, lambda_mat):
    """
    Inputs:
        I - int, number of instances
        J - int, number of classes
        K - int, number of classifiers
        nu_vec - hyperparameter array of shape (J,) for the Dirichlet prior over the categorical for t_i
        lambda_mat - hyperparameter array of shape (J, J) for the exponential prior over the Dirichlet
    Outputs:
        A list c of length K, where each element is an array of shape (I,)
    """

    # Check that nu_vec has shape (J,)
    nu_vec = np.array(nu_vec)
    assert nu_vec.shape == (J,), 'Wrong shape of parameter array nu!'

    # Check that lambda_mat has shape (J, J)
    lambda_mat = np.array(lambda_mat)
    assert lambda_mat.shape == (J, J), 'Wrong shape of parameter lambda_mat'

    # Generate t vector
    p_vec = Dirichlet( concentration = nu_vec )
    t = Categorical( logits= p_vec, sample_shape = I)

    # Reshape lambda_mat to shape (K, J, J), i.e. K copies of the (J, J) matrix
    lambda_mat = tf.reshape( tf.concat([lambda_mat for _ in range(K)], axis=0), [K, J, J] )

    # Continue into generating c vector
    alpha = Exponential( rate = lambda_mat )
    pi = Dirichlet( concentration = alpha ) # Confusion matrices of shape (K, J, J)
    c = [Categorical( logits = tf.gather(pi[k], t) ) for k in range(K)] # iterate over k_i

    return sess.run([c, pi, t])
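
For comparison, here is a minimal NumPy-only sketch of the same generative process, useful as a sanity check that does not depend on Edward. It is an illustration under assumptions, not the poster's code: the function name `sample_ibcc` and the `seed` argument are made up here, and it returns `c` as one array of shape (K, I) rather than a list. It samples from `p` and the rows of `pi` directly as probabilities. The Edward snippet above passes them as `logits`, which would apply a softmax first, so `probs=` may be what is intended there.

```python
import numpy as np

def sample_ibcc(I, J, K, nu_vec, lambda_mat, seed=0):
    """Draw one synthetic data set from the IBCC generative process (sketch)."""
    rng = np.random.default_rng(seed)
    nu_vec = np.asarray(nu_vec, dtype=float)
    lambda_mat = np.asarray(lambda_mat, dtype=float)
    assert nu_vec.shape == (J,), 'Wrong shape of parameter array nu!'
    assert lambda_mat.shape == (J, J), 'Wrong shape of parameter lambda_mat'

    # Class proportions p ~ Dirichlet(nu); true labels t_i ~ Categorical(p).
    p = rng.dirichlet(nu_vec)
    t = rng.choice(J, size=I, p=p)

    # alpha[k, j, l] ~ Exponential(rate=lambda[j, l]); NumPy expects scale = 1 / rate.
    alpha = rng.exponential(scale=1.0 / lambda_mat, size=(K, J, J))

    # Row j of classifier k's confusion matrix: pi[k, j, :] ~ Dirichlet(alpha[k, j, :]).
    pi = np.empty((K, J, J))
    for k in range(K):
        for j in range(J):
            pi[k, j] = rng.dirichlet(alpha[k, j])

    # Classifier outputs: c[k, i] ~ Categorical(pi[k, t_i, :]).
    c = np.empty((K, I), dtype=int)
    for k in range(K):
        for i in range(I):
            c[k, i] = rng.choice(J, p=pi[k, t[i]])
    return c, pi, t
```

Calling `sample_ibcc(100, 3, 4, np.ones(3), np.ones((3, 3)))` mirrors the settings used later in the thread.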

Subsequently, I created a forward and backwards model, and passed it into the inference method:

# Forward model
I = 100
J = 3
K = 4
c_train = c
# Generate t vector
p_vec = Dirichlet( concentration = tf.ones([J])+10)
t = Categorical( logits= p_vec, sample_shape = I )

# Continue into generating c vector
alpha = [Exponential( rate = tf.ones([J, J]) ) for _ in range(K)]
pi = [Dirichlet( concentration = alpha[k] ) for k in range(K)]
c_forward = [Categorical( logits = tf.gather(pi[k], t) ) for k in range(K)] # iterate over k_i

  1. If passed into the Gibbs method:

    T = 1000
    q_p_vec = Empirical(tf.Variable(tf.ones([T, J])))
    q_t = Empirical(tf.cast(tf.Variable(tf.ones([T, I])), tf.int32))

    q_alpha = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]
    q_pi = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]

    latent_vars=dict(list(zip(pi, q_pi))+list(zip(alpha, q_alpha))+[(t, q_t), (p_vec, q_p_vec)])

    inference = ed.Gibbs(latent_vars=latent_vars,
                         data=dict(zip(c_forward, c_train)))

I get AttributeError: 'NoneType' object has no attribute 'shape'.

  2. If passed into HMC:

    T = 1000 # Number of MCMC samples

    q_p_vec = Empirical(tf.Variable(tf.ones([T, J])))
    q_t = Empirical(tf.cast(tf.Variable(tf.ones([T, I])), tf.int32))

    q_alpha = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]
    q_pi = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]

    latent_vars = dict(list(zip(pi, q_pi))+list(zip(alpha, q_alpha))+[(t, q_t), (p_vec, q_p_vec)])

    inference = ed.HMC(latent_vars=latent_vars,
                       data=dict(zip(c_forward, c_train)))

Calling inference.run() gives TypeError: unsupported operand type(s) for *: 'float' and 'IndexedSlices', which I suspect is due to the use of tf.gather(). (As a new user I can only add one image per post, so there is no screenshot for this one.)

Would appreciate any comments! Thank you!


I'm having trouble filling in the missing pieces needed to get this script to run (before even reaching the inference errors). Is the following correct?

import numpy as np
import edward as ed
import tensorflow as tf
from edward.models import *

def build_dataset(I, J, K, nu_vec, lambda_mat):
  """
  Inputs:
      I - int, number of instances
      J - int, number of classes
      K - int, number of classifiers
      nu_vec - hyper parameter array of shape (J,) for dirichlet prior over categorical for t_i
      lambda_mat - a hyper param array of shape (J, J) for exponential prior over Dirichlet
  Outputs:
      A list c of length(K), where each element is an array of shape (I, )
  """

  # Check that nu_vec has shape (J,)
  nu_vec = np.array(nu_vec)
  assert nu_vec.shape == (J,), 'Wrong shape of parameter array nu!'

  # Check that lambda_mat has shape (J, J)
  lambda_mat = np.array(lambda_mat)
  assert lambda_mat.shape == (J, J), 'Wrong shape of parameter lambda_mat'

  # Generate t vector
  p_vec = Dirichlet( concentration = nu_vec )
  t = Categorical( logits= p_vec, sample_shape = I)

  # Reshape lambda_mat to shape (K, J, J), i.e. K copies of the (J, J) matrix
  lambda_mat = tf.reshape( tf.concat([lambda_mat for _ in range(K)], axis=0), [K, J, J] )

  # Continue into generating c vector
  alpha = Exponential( rate = lambda_mat )
  pi = Dirichlet( concentration = alpha ) # Confusion matrices of shape (K, J, J)
  c = [Categorical( logits = tf.gather(pi[k], t) ) for k in range(K)] # iterate over k_i

  sess = ed.get_session()
  return sess.run([c, pi, t])

# Forward model
I = 100
J = 3
K = 4

c, pi, t = build_dataset(I, J, K, np.ones(J), np.ones([J, J]))

c_train = c

# Generate t vector
p_vec = Dirichlet( concentration = tf.ones([J])+10)
t = Categorical( logits= p_vec, sample_shape = I )

# Continue into generating c vector
alpha = [Exponential( rate = tf.ones([J, J]) ) for _ in range(K)]
pi = [Dirichlet( concentration = alpha[k] ) for k in range(K)]
c_forward = [Categorical( logits = tf.gather(pi[k], t) ) for k in range(K)] # iterate over k_i

# Trying the HMC code snippet
T = 1000 # Number of MCMC samples

q_p_vec = Empirical(tf.Variable(tf.ones([T, J])))
q_t = Empirical(tf.cast(tf.Variable(tf.ones([T, I])), tf.int32))

q_alpha = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]
q_pi = [Empirical(tf.Variable(tf.ones([T, J, J]))) for _ in range(K)]

inference = ed.HMC(latent_vars=dict(list(zip(pi, q_pi))+list(zip(alpha, q_alpha))+[(t, q_t), (p_vec, q_p_vec)]), data=dict(zip(c_forward, c_train)))
inference.run()

Hi, thank you for the reply! Yes, that is correct. For me, exactly the same code fails at inference.run() with TypeError: unsupported operand type(s) for *: 'float' and 'IndexedSlices'.

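I got the error below. Reading through the script, the cause is that HMC is being applied to a Categorical latent variable (t). That cannot work: HMC needs the gradient of the log joint density with respect to every latent variable, so all latent variables must have continuous support. For the integer-valued t there is presumably no gradient at all, which would explain the "None values not supported" at the end of the traceback.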

Traceback (most recent call last):
  File "temp.py", line 74, in <module>
    inference.run()
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/inference.py", line 123, in run
    self.initialize(*args, **kwargs)
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/hmc.py", line 64, in initialize
    return super(HMC, self).initialize(*args, **kwargs)
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/monte_carlo.py", line 101, in initialize
    self.train = self.build_update()
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/hmc.py", line 90, in build_update
    self.n_steps)
  File "/Users/dvt/Dropbox/ssh/edward/edward/inferences/hmc.py", line 165, in leapfrog
    r_new[key] = r + 0.5 * step_size * tf.convert_to_tensor(grad_log_joint[i])
  File "/Users/dvt/Envs/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 676, in convert_to_tensor
    as_ref=False)
  File "/Users/dvt/Envs/venv/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 741, in internal_convert_to_tensor
    ret = conversion_func(value, dtype=dtype, name=name, as_ref=as_ref)
  File "/Users/dvt/Envs/venv/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 113, in _constant_tensor_conversion_function
    return constant(v, dtype=dtype, name=name)
  File "/Users/dvt/Envs/venv/lib/python2.7/site-packages/tensorflow/python/framework/constant_op.py", line 102, in constant
    tensor_util.make_tensor_proto(value, dtype=dtype, shape=shape, verify_shape=verify_shape))
  File "/Users/dvt/Envs/venv/lib/python2.7/site-packages/tensorflow/python/framework/tensor_util.py", line 364, in make_tensor_proto
    raise ValueError("None values not supported.")
ValueError: None values not supported.
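
One possible workaround, which is an assumption here and not something confirmed in this thread, is to sum the discrete labels t out of the joint so that only continuous variables (p, pi, alpha) remain. Because the instances are independent given p and pi, the marginal likelihood factorizes as log p(c | p, pi) = sum_i log sum_j p_j prod_k pi[k, j, c[k, i]]. The NumPy sketch below computes that quantity; the function name `log_marginal_likelihood` and the (K, I) layout of `c` are illustrative choices, not Edward API.

```python
import numpy as np

def log_marginal_likelihood(c, p, pi):
    """log p(c | p, pi) with the true labels t summed out (sketch).

    c:  int array of shape (K, I), classifier outputs
    p:  float array of shape (J,), class proportions
    pi: float array of shape (K, J, J), pi[k, j, l] = P(c_k = l | t = j)
    """
    K, I = c.shape
    # log_lik[i, j] = log p_j + sum_k log pi[k, j, c[k, i]]
    log_lik = np.tile(np.log(p), (I, 1))
    for k in range(K):
        # pi[k][:, c[k]] has shape (J, I); transpose it to (I, J).
        log_lik += np.log(pi[k][:, c[k]]).T
    # Numerically stable log-sum-exp over the J possible values of t_i.
    m = log_lik.max(axis=1, keepdims=True)
    per_instance = m[:, 0] + np.log(np.exp(log_lik - m).sum(axis=1))
    return per_instance.sum()
```

A gradient-based sampler could then target this marginal together with the priors on p, pi and alpha. The posterior over each t_i can be recovered afterwards by normalizing the corresponding row of `log_lik`.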