I’m new to Edward. I want to implement linear regression with ARD (automatic relevance determination). This is my current implementation:
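import edward as ed
import tensorflow as tf
from edward.models import Normal
# assumed import path for `bijector`; it depends on the TensorFlow 1.x version
from tensorflow.contrib.distributions import bijectors as bijector

# initiate placeholders (d is the number of features)
X = tf.placeholder(tf.float32, [None, d])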
# initiate the noise
sigma = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())
# initiating the hyperprior
alpha = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=1.0),
    bijector=bijector.Exp())
# initiating the priors
w = Normal(loc=tf.zeros(d), scale=tf.ones(d) * alpha)
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
# initiate the likelihood
y = Normal(loc=ed.dot(X, w) + b, scale=sigma * tf.ones(1))
# initiate the posteriors
qw = Normal(loc=tf.get_variable("qw/loc", [d]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [d])))
qb = Normal(loc=tf.get_variable("qb/loc", [1]),
            scale=tf.nn.softplus(tf.get_variable("qb/scale", [1])))
qsigma = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())
qalpha = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())
# inference
inference = ed.KLqp({w: qw, b: qb, sigma: qsigma, alpha: qalpha},
                    data={X: train_X, y: train_y})
inference.run(n_iter=500)
However, this implementation has a much higher error than linear regression without ARD. It seems that I’m doing something wrong.
I implemented this model by looking at various sources [1], [2].
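To make the model easier to discuss, here is a rough NumPy sketch of the generative process that the Edward code above appears to describe. It is not Edward code; the sizes `N` and `d`, the random seed, and the synthetic `X` are assumptions made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed
N, d = 50, 3                    # hypothetical sizes, not taken from the post

X = rng.normal(size=(N, d))     # synthetic inputs standing in for train_X

# noise scale and hyperprior: exp of a Normal draw
sigma = np.exp(rng.normal(loc=0.0, scale=0.25))
alpha = np.exp(rng.normal(loc=0.0, scale=1.0))  # one scalar, as in the code above

# priors on the weights and the bias
w = rng.normal(loc=np.zeros(d), scale=np.ones(d) * alpha)
b = rng.normal(loc=0.0, scale=1.0)

# likelihood: one noisy observation per row of X
y = rng.normal(loc=X @ w + b, scale=sigma)
```

Note that `alpha` here is a single scalar shared by all weights, mirroring the code above. ARD is usually formulated with one scale (or precision) per weight, so that may be worth checking.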
- I have seen that TransformedDistribution is used to define the noise of the likelihood and the hyperpriors. What is the purpose of the transformed distribution?
- Can’t we define those without a transformed distribution (similar to w and b)?
- What am I doing wrong here? Can someone please help me fix this model?
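For context on the first two questions, here is a small NumPy sketch (not Edward's API) of what exponentiating Normal draws does numerically. A scale parameter has to be positive, and a plain Normal can produce negative values, while the exponentiated draws cannot. The seed and sample size are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed

# unconstrained draws, same loc/scale as the sigma prior above
z = rng.normal(loc=0.0, scale=0.25, size=10_000)

# elementwise exp, the transformation an Exp bijector applies to each draw
s = np.exp(z)

has_negative_normal = bool((z < 0).any())  # a plain Normal goes below zero
all_positive_exp = bool((s > 0).all())     # exp(z) never does

print(has_negative_normal, all_positive_exp)
```

This only illustrates the positivity constraint; whether a plain Normal can be used in place of the transformed distribution inside Edward is still the open question.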