Acceptance Rate 0 for HMC in IRT models

AkmusNie · December 10, 2018, 9:00am

I am running HMC to learn a so called item response model that is common in my practice to estimate student abilities and question difficulties. With only these two groups of parameters, the algorithm seems to work ok, but when I add a third group of parameters, discrimination alphas, the estimation of alphas were not very good. When I add a fourth group of parameters, guessing Cs, the algorithm does not work anymore, showing no learning with acceptance rate 0 from the very beginning of the training. Below you can see the setup of my model and the results (it is a modification of the tutorial of rasch model in the edward library examples).

I am very new to statistical computing and the edward package, so I am not sure if I am setting the model wrong, or the hyperparamters wrong, or it is simply the algorithm not working for this occasion?

“”“Rasch model (Rasch, 1960).”“”
from future import absolute_import
from future import division
from future import print_function
import edward as ed
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
import pandas as pd
from edward.models import Bernoulli, Normal, Empirical,Uniform
from scipy.special import expit

data

n_students = 300
n_questions = 200
bs = 50000
T = 500

define a function to simulate responses from a specified group of items and persons

def build_toy_dataset(n_students, n_questions, n_obs):

student_etas = np.random.normal(size=n_students)
question_etas = np.random.normal(size=n_questions)
question_alphas = np.exp(np.random.normal(loc=0., scale=0.25, size=n_questions))
question_guess = np.random.choice([0,0.2,0.25,0.33,0.5],size = n_questions)

student_ids = np.random.choice(range(n_students), n_obs)
question_ids = np.random.choice(range(n_questions), n_obs)

logits = question_alphas[question_ids]*(student_etas[student_ids] + question_etas[question_ids])
p = question_guess[question_ids] + (1-question_guess[question_ids])*expit(logits)

outcomes = np.random.binomial(1, p, n_obs)

data = pd.DataFrame({‘question_id’: question_ids,
‘student_id’: student_ids,
‘outcomes’: outcomes})

return data, student_etas, question_etas, question_alphas, question_guess

geneartaion of the true theta and true item difficulty

data, true_s_etas, true_q_etas, true_q_alphas, true_q_guess = build_toy_dataset(
n_students, n_questions, n_obs)
obs = data[‘outcomes’].values
student_ids = data[‘student_id’].values.astype(int)
question_ids = data[‘question_id’].values.astype(int)

MODEL

random effects

student_etas = Normal(loc=0.0, scale= 1.0,
sample_shape=n_students)
question_etas = Normal(loc=0.0, scale= 1.0,
sample_shape=n_questions)

log_question_alphas=Normal(loc=0.0, scale= 0.25,
sample_shape=n_questions)

question_alphas = tf.exp(log_question_alphas)

question_guess = Uniform(low=0.,high=1.,sample_shape=n_questions)

the dependent variable

observation_logodds = tf.multiply(tf.gather(question_alphas, question_ids) ,
tf.gather(student_etas, student_ids) + tf.gather(question_etas, question_ids))
probs_obs = tf.gather(question_guess, question_ids)+(tf.ones(data.shape[0])-tf.gather(question_guess,question_ids))*tf.sigmoid(observation_logodds)

outcomes = Bernoulli(probs=probs_obs)

INFERENCE

qstudents = Empirical(params=tf.get_variable(“q_students/params”,
[T, n_students]))
qquestions = Empirical(params=tf.get_variable(“q_questions/params”,
[T, n_questions]))
qlogalphas = Empirical(params=tf.get_variable(“q_log_alphas/params”,
[T,n_questions]))
qguess = Empirical(params=tf.get_variable(“q_guess/params”,
[T,n_questions]))

latent_vars = {
student_etas: qstudents,
question_etas: qquestions,
log_question_alphas: qlogalphas,
question_guess: qguess
}

data = {outcomes: obs}

inference = ed.HMC(latent_vars, data)
inference.run(step_size=0.001)

report

f, (ax1, ax2, ax3, ax4) = plt.subplots(1, 4)
ax1.set_title('Ability Parameters')
ax2.set_title('Difficulty Parameters')
ax3.set_title('Log Discrimiation Parameters')
ax4.set_title('Guessing Parameters')
ax1.set_position([0.0,0,1,1])
ax2.set_position([1.1,0,1,1])
ax3.set_position([2.2,0,1,1])
ax4.set_position([3.3,0,1,1])
ax1.scatter(true_s_etas, qstudents.mean().eval())
ax2.scatter(true_q_etas, qquestions.mean().eval())
ax3.scatter(np.log(true_q_alphas), qlogalphas.mean().eval())
ax4.scatter(true_q_guess, qguess.mean().eval())

plt.show()

AkmusNie · January 18, 2019, 3:14am

I think I kind of find the problem by myself. I have too many parameters to be estimated for the HMC algorithm and many of these parameters are intertwined to influence the observed events? So when the HMC algorithm tries to find the next step to go in the unreasonably high dimensional and intertwined parameter space, it always fails to find a point that is worth moving to, and therefore the acceptance rate is always zero? I am not an expert, but I think it is what’s happening here.

Topic		Replies	Views
Resolve logistic regression parameters on simulated data	20	2182	April 28, 2018
MCMC not working for basic model	4	1032	May 8, 2018
Having trouble setting up basic HMC model	4	1962	May 29, 2017
State space model inference by HMC	0	1413	April 27, 2018
Acceptance Rate 0 for pareto/nbd model	0	1611	January 20, 2019