Fixed scales in regression example

cshenton · August 25, 2017, 2:55am

Hey, thanks for the great library, looking forward to building stuff in it.

I’m just trying to get my head around things, as I’m coming from Stan (though I’m also familiar with TensorFlow), and I’m a little confused about the linear regression example, where you assume the scales of the priors and the likelihood are known. This seems fine for the priors, but confusing for the scale of the model y:

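import edward as ed
import tensorflow as tf
from edward.models import Normal
# N (number of data points) and D (number of features) are defined earlier in the example
X = tf.placeholder(tf.float32, [N, D])
w = Normal(loc=tf.zeros(D), scale=tf.ones(D))
b = Normal(loc=tf.zeros(1), scale=tf.ones(1))
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N))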

In an equivalent Stan example, I would specify the scale as its own parameter with its own prior and posterior families, representing the uncertainty in the data. But here it is set to a fixed value. Is there something Edward does behind the scenes that allows the scale param of y to vary, even though it was initialised as a tf.Tensor and not a tf.Variable?

I think I understand that the scale of w and b can change, since their posteriors qb and qw are defined with a tf.Variable for scale and location, but I can’t see where the same is done for y.

MushroomHunting · August 25, 2017, 4:02am

Hey cshenton,

You should be able to learn the likelihood scale. Have a look at:

github.com/deoxyribose/auml/blob/master/edward-categorical/toy normal with lognormal sigma.ipynb

cshenton · August 25, 2017, 4:24am

Thanks for that, I was just trying to implement learning the scale param and ran into similar problems with inverse gamma.

I wonder though: in Bayesian linear regression, one typically allows for uncertainty in both beta and sigma, so newcomers may be confused by an example that assumes sigma is fixed. Especially because, in the example, the data are generated with a std of 0.1 and then a scale parameter of 1.0 is assumed, which I thought was odd.

Would the team be open to pull requests that add parameter uncertainty to sigma? It seems like the standard way to do things, since it’s a little odd to put so much effort into parameter uncertainty, only to fix the uncertainty of the data-generating process (DGP).
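
To make the 0.1 versus 1.0 mismatch concrete, here is a minimal sketch in plain Python rather than Edward. It assumes a one-dimensional model y = w*x + noise with no intercept and a Normal(0, 1) prior on w, for which the posterior under a fixed noise scale has a closed form. The function name slope_posterior and the simulated data are made up for illustration and are not taken from the tutorial.

```python
import math
import random


def slope_posterior(xs, ys, noise_scale, prior_scale=1.0):
    """Closed-form posterior (mean, std) for w in y = w*x + noise.

    Assumes w ~ Normal(0, prior_scale) and a fixed, known noise scale.
    """
    precision = 1.0 / prior_scale ** 2 + sum(x * x for x in xs) / noise_scale ** 2
    mean = (sum(x * y for x, y in zip(xs, ys)) / noise_scale ** 2) / precision
    return mean, math.sqrt(1.0 / precision)


# Simulated data: true slope 2.0, true noise std 0.1 (illustrative values).
rng = random.Random(0)
xs = [rng.uniform(-1.0, 1.0) for _ in range(40)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

# Same data, two different assumptions about the likelihood scale.
mean_right, std_right = slope_posterior(xs, ys, noise_scale=0.1)
mean_wrong, std_wrong = slope_posterior(xs, ys, noise_scale=1.0)

print("scale fixed at 0.1: mean=%.3f std=%.3f" % (mean_right, std_right))
print("scale fixed at 1.0: mean=%.3f std=%.3f" % (mean_wrong, std_wrong))
```

Under these assumptions the posterior over w is several times wider when the scale is fixed at 1.0 than when it matches the value used to generate the data, and nothing in the fixed-scale model can correct that.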

dustin · August 25, 2017, 11:15pm

Sure, I think that would be welcome. As @MushroomHunting hints, the tutorial would likely work best with LogNormal scale priors; alternatively, use something like an inverse gamma prior but change the algorithm to ed.HMC instead.

cshenton · August 27, 2017, 11:14am

Would probably be good to keep things using VI for consistency with the current tutorial.

I’ve been reading up on how PyMC3 deals with bounded variables. Behind the scenes it works on (-inf, inf) and then transforms each sample to the (0, inf) interval (I’d guess using softplus; they mention using log odds for bounded variables).

Would the correct approach for Edward (and so the one that should be in the tutorials) be to specify the latent “scale” over -inf, inf, use a diffuse normal prior, then just use NormalWithSoftplusScale to transform it to a scale parameter?
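
For reference, the softplus map being guessed at here can be sketched in a few lines of plain Python. This only illustrates the transform between an unconstrained value and a positive scale; it is not how PyMC3 or Edward implement it, and an exp/log pair is another common choice for positive variables. The function names are illustrative.

```python
import math


def softplus(x):
    """Map a real x to a positive value, log(1 + exp(x)), computed stably."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def softplus_inverse(y):
    """Recover the unconstrained x from a positive y = softplus(x)."""
    return y + math.log(-math.expm1(-y))


# An unconstrained "raw scale" becomes a valid (positive) scale and back again.
for raw in (-5.0, 0.0, 5.0):
    scale = softplus(raw)
    print(raw, scale, softplus_inverse(scale))
```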

dustin · August 27, 2017, 8:45pm

PyMC3 and Stan differ from Edward in that they (1) transform all constrained continuous variables to be on the unconstrained space; (2) perform inference on the unconstrained scale; (3) transform back after convergence. Edward currently does everything on the original scale: if the prior has positive support (e.g., Gamma), then the approximating family should too (e.g., log normal).
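
A small sketch of why the supports need to match when working on the original scale, using only the Python standard library. The location and scale values are hypothetical variational parameters chosen for illustration: a Normal approximation to a positive quantity puts some of its mass on invalid negative values, while a log normal approximation cannot.

```python
import math
import random

rng = random.Random(0)
num_draws = 10000

# Hypothetical variational parameters for a posterior over a scale near 0.5.
loc, spread = 0.5, 0.5

# Normal approximation: support is the whole real line.
normal_draws = [rng.gauss(loc, spread) for _ in range(num_draws)]

# Log normal approximation: exp of a Normal draw, so always positive.
lognormal_draws = [math.exp(rng.gauss(math.log(loc), spread)) for _ in range(num_draws)]

frac_invalid = sum(d <= 0.0 for d in normal_draws) / num_draws
print("fraction of Normal draws that are not valid scales:", frac_invalid)
print("smallest log normal draw:", min(lognormal_draws))
```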

I recommend the Normal random variable. What’s more vanilla for Bayesian linear regression is to place a prior over the scale directly. For example, use an inverse Gamma prior over the (squared) scale parameter and a log Normal variational approximation; or alternatively, a log Normal prior over the scale.
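
As a rough illustration of the second option (a log Normal prior placed directly on the scale), the log joint density of such a model can be written out in plain Python. This is a sketch under simplifying assumptions, not Edward code: a single slope w with no intercept, a Normal(0, 1) prior on w, a prior on sigma whose log is Normal(0, 1), and simulated data. All function names and hyperparameters are illustrative.

```python
import math
import random


def normal_logpdf(x, loc, scale):
    """Log density of Normal(loc, scale) at x."""
    return -0.5 * math.log(2.0 * math.pi) - math.log(scale) - 0.5 * ((x - loc) / scale) ** 2


def lognormal_logpdf(x, loc, scale):
    """Log density at x > 0 when log(x) ~ Normal(loc, scale)."""
    return normal_logpdf(math.log(x), loc, scale) - math.log(x)


def log_joint(w, sigma, xs, ys):
    """log p(w) + log p(sigma) + sum_i log p(y_i | x_i, w, sigma)."""
    lp = normal_logpdf(w, 0.0, 1.0)            # prior on the slope
    lp += lognormal_logpdf(sigma, 0.0, 1.0)    # prior on the likelihood scale
    lp += sum(normal_logpdf(y, w * x, sigma) for x, y in zip(xs, ys))
    return lp


# Simulated data with true slope 2.0 and true noise std 0.1 (illustrative).
rng = random.Random(1)
xs = [rng.uniform(-1.0, 1.0) for _ in range(50)]
ys = [2.0 * x + rng.gauss(0.0, 0.1) for x in xs]

print("log joint at sigma=0.1:", log_joint(2.0, 0.1, xs, ys))
print("log joint at sigma=1.0:", log_joint(2.0, 1.0, xs, ys))
```

With sigma treated as a random variable, the joint density clearly prefers a scale near the one that generated the data, which is the information a fixed scale of 1.0 throws away.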

cshenton · August 29, 2017, 1:14pm

Okay, good to know. That’s also clarified a misunderstanding I’d had earlier. I’d thought KLqp had difficulty with both inverse gamma priors and variational models, but it seems it’s just the latter.

Also good to know that specifying variational models with the correct support removes the need to transform the RVs. One gripe I have with PyMC is that their quickstart example is basically entirely about their variational inference implementation failing on a Gaussian mixture. That’s good to know, but it is something the user is powerless to fix, since PyMC ADVI is constrained to use a Gaussian variational model!
