For the bayesian_logistic_regression.py example, I modified the data-generation procedure to use a fixed, arbitrary w and b (instead of the np.tanh function):
```python
from scipy.special import expit  # logistic sigmoid

def build_toy_dataset(N, noise_std=0.1):
    X = (np.random.uniform(size=N) - 0.5) * 100
    w0 = np.full(FLAGS.D, 1.0, np.float64)
    b0 = 2.0
    # y = np.tanh(X) + np.random.normal(0, noise_std, size=N)
    y = np.multiply(X, w0) + b0 + np.random.normal(0, noise_std, size=N)
    y = expit(y)  # convert logits to probabilities
    threshold = np.random.uniform(size=N)
    y = np.less(threshold, y)  # sample Bernoulli labels
    y = y.astype(int)
    X = X.reshape((N, FLAGS.D))
    return X, y
```
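For reference, here is a standalone sketch (my own, not part of the example) of how expit behaves over the logit range this dataset produces. With X drawn from roughly (-50, 50), w0 = 1, and b0 = 2, the logits span about -48 to 52, so nearly every probability saturates to 0 or 1:

```python
import numpy as np
from scipy.special import expit  # the same logistic sigmoid used above

# Logits spanning the range the toy dataset generates: X * w0 + b0
logits = np.linspace(-50.0, 50.0, 11) + 2.0
probs = expit(logits)
# Everything outside a narrow band around logit 0 saturates to 0 or 1.
print(probs.round(6))
```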
The rest of the program is basically the same as in the tutorial, except that I make a single call to inference.run() instead of calling inference.update() in a loop over the iterations:
```python
def main(_):
    ed.set_seed(42)

    # DATA
    X_train, y_train = build_toy_dataset(FLAGS.N)

    # MODEL
    X = tf.placeholder(tf.float32, [FLAGS.N, FLAGS.D])
    w = Normal(loc=tf.zeros(FLAGS.D), scale=3.0 * tf.ones(FLAGS.D))
    b = Normal(loc=tf.zeros([]), scale=3.0 * tf.ones([]))
    # logits = ed.dot(X, w) + b
    y = Bernoulli(logits=ed.dot(X, w) + b)

    # INFERENCE
    qw = Empirical(params=tf.get_variable("qw/params", [FLAGS.T, FLAGS.D]))
    qb = Empirical(params=tf.get_variable("qb/params", [FLAGS.T]))
    inference = ed.HMC({w: qw, b: qb}, data={X: X_train, y: y_train})
    inference.initialize(n_print=10, step_size=0.6)
    tf.global_variables_initializer().run()
    inference.run()

    sess = ed.get_session()
    print("qw = ", qw.eval(session=sess))
    print("qb = ", qb.eval(session=sess))
```
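For comparison, this is how I would compute a posterior mean from a raw array of MCMC draws (a hypothetical NumPy helper of my own, assuming the (T, D) sample array has already been fetched, e.g. with sess.run(qw.params)):

```python
import numpy as np

def posterior_mean(draws, burn_in_frac=0.5):
    """Mean of MCMC draws after discarding the initial burn-in fraction.

    draws: array of shape (T, D), one sample per row.
    """
    start = int(draws.shape[0] * burn_in_frac)
    return draws[start:].mean(axis=0)

# Toy chain: wanders from 0 toward 1.0 with a little noise, mimicking
# a sampler that converges after an initial transient.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 5000)[:, None]
chain = 1.0 - np.exp(-t) + 0.01 * rng.standard_normal((5000, 1))
print(posterior_mean(chain))  # close to 1.0 once the transient is dropped
```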
After the HMC inference finishes, I print out the mean with print(qw.eval(session=sess)).
However, I am not getting back the w value I set (1.0): with 40 input samples and 5000 draws I get qw = 0.00772677 and qb = 0.008026831, and with 1000 input samples it is even worse (qw = -0.00772677, qb = 0.008026831). In fact, the recovered values are independent of the true values I set; they only change with the input sample size.
What do I need to know to get back the parameter values I set in the first place?