Implementing Linear regression with Automatic Relevance Determination


I’m new to Edward. I want to implement linear regression with Automatic Relevance Determination (ARD). This is my current implementation:

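# imports assumed by the code below (TF 1.x; older releases name the module `bijector` rather than `bijectors`)
import edward as ed
import tensorflow as tf
from edward.models import Normal
from tensorflow.contrib.distributions import bijectors as bijector

# initiate placeholders (d is the number of features)
X = tf.placeholder(tf.float32, [None, d])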

# initiate the noise
sigma = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())

# initiating the hyperprior
alpha = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=1.0),
    bijector=bijector.Exp())

# initiating the priors
w = Normal(loc=tf.zeros(d), scale=tf.ones(d) * alpha)
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))

# initiate the likelihood
y = Normal(loc=ed.dot(X, w) + b, scale=sigma * tf.ones(1))

# initiate the posteriors
qw = Normal(loc=tf.get_variable("qw/loc", [d]),
            scale=tf.nn.softplus(tf.get_variable("qw/scale", [d])))

qb = Normal(loc=tf.get_variable("qb/loc", [1]),
            scale=tf.nn.softplus(tf.get_variable("qb/scale", [1])))

qsigma = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())

qalpha = ed.models.TransformedDistribution(
    distribution=ed.models.Normal(loc=0.0, scale=0.25),
    bijector=bijector.Exp())

# inference
inference = ed.KLqp({w: qw, b: qb, sigma: qsigma, alpha: qalpha},
                    data={X: train_X, y: train_y})
inference.run(n_iter=500)

However, this implementation has a much higher error than linear regression without ARD, so it seems that I’m doing something wrong.

I implemented this model by looking at various sources [1], [2].

  • I have seen that TransformedDistribution is used to define the noise of the likelihood and the hyperpriors. What is the purpose of the transformed distribution?

  • Can’t we define those without a transformed distribution (similar to w and b)?

  • What am I doing wrong here? Can someone please help me to fix this model?
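
To make the first question concrete, here is a minimal sketch in plain Python (not Edward) of what I understand the Exp bijector to do: it pushes Normal draws through exp, so every sample is strictly positive, which a standard deviation has to be. The helper name sample_lognormal and the seeds are made up for illustration, and the interpretation itself is my assumption, not something I have confirmed against the Edward source.

```python
import math
import random


def sample_lognormal(loc, scale, n, seed=0):
    """Draw n samples of exp(z) with z ~ Normal(loc, scale) (hypothetical helper)."""
    rng = random.Random(seed)
    return [math.exp(rng.gauss(loc, scale)) for _ in range(n)]


# Same parameters as the sigma prior in the model above:
# Normal(loc=0.0, scale=0.25) pushed through exp.
sigma_samples = sample_lognormal(loc=0.0, scale=0.25, n=10000)

# A plain Normal(0.0, 0.25) for comparison. Roughly half of its draws
# are negative, which would not be valid as a standard deviation.
rng = random.Random(1)
normal_samples = [rng.gauss(0.0, 0.25) for _ in range(10000)]
negative_fraction = sum(s < 0 for s in normal_samples) / len(normal_samples)

print("all transformed samples positive:", min(sigma_samples) > 0.0)
print("fraction of plain Normal samples below zero:", negative_fraction)
```

If that reading is right, w and b can stay plain Normals because they are allowed to be negative, while sigma and alpha are scales and cannot be.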