Using TransformedDistribution for a Jeffreys Prior


#1

I want to use a Jeffreys prior on the scale parameter of a Gaussian latent variable or a Gaussian likelihood.
I used a Uniform distribution and then a TransformedDistribution for the exp transformation. When I use that as the scale parameter and run MetropolisHastings, I get an acceptance rate of 0. However, when I set the scale to 1.0 I get an acceptance rate of ~0.50, so I assume I’m doing something wrong here.

Below is a snippet of the code:

T = 10000

# Latent variables
K = Normal(loc=tf.zeros((len(n_np), len(to_optimize))), scale=1.0)
log_sigma = Uniform(low=[-4.6052], high=[3.453])
sigma = TransformedDistribution(distribution=log_sigma, bijector=ds.bijectors.Exp(), name='sigma')

# Observed
Fourier_sum = tf.reduce_sum(tf.reduce_sum(tf.multiply(phis, K), axis=1), axis=1)
Fourier_sum_rel = Fourier_sum - tf.reduce_min(Fourier_sum)

# Proposal distributions
proposal_K = Normal(loc=K, scale=0.02)
qK = Empirical(tf.Variable(tf.zeros([T, 6, 3])))
proposal_sigma = Normal(loc=sigma, scale=0.02)
qsigma = Empirical(tf.Variable(tf.zeros([T, 1])))

# Likelihood
likelihood = Normal(loc=Fourier_sum, scale=sigma)

# MH inference 
inference = ed.MetropolisHastings(latent_vars={K: qK, sigma: qsigma}, 
                                  proposal_vars={K: proposal_K, sigma: proposal_sigma}, 
                                  data={likelihood: residual_data})

This works when I use scale=1.0 in the likelihood and remove sigma from the latent and proposal variables.
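
For reference, the construction in the snippet (a Uniform on log(sigma) pushed through exp) does give a density proportional to 1/sigma on the interval [exp(low), exp(high)], which is roughly [0.01, 31.6] for the bounds used here. The following is a minimal NumPy sketch of that check, independent of Edward; the helper name `log_uniform_pdf` is made up for illustration.

```python
import numpy as np

# Bounds copied from the post: log(sigma) ~ Uniform(LOW, HIGH)
LOW, HIGH = -4.6052, 3.453

def log_uniform_pdf(sigma, low=LOW, high=HIGH):
    """Density of sigma = exp(u) with u ~ Uniform(low, high).

    Change of variables:
        p(sigma) = p_u(log sigma) * |d log(sigma) / d sigma|
                 = 1 / ((high - low) * sigma)   inside the support, 0 outside.
    """
    sigma = np.asarray(sigma, dtype=float)
    inside = (sigma >= np.exp(low)) & (sigma <= np.exp(high))
    safe = np.where(inside, sigma, 1.0)  # avoid dividing by 0 outside the support
    return np.where(inside, 1.0 / ((high - low) * safe), 0.0)

# Empirical check: a 1/sigma density puts equal mass in equal log-width bins.
rng = np.random.default_rng(0)
samples = np.exp(rng.uniform(LOW, HIGH, size=200_000))
edges = np.exp(np.linspace(LOW, HIGH, 21))  # 20 log-spaced bins
counts, _ = np.histogram(samples, bins=edges)
bin_mass = counts / samples.size

print(bin_mass.round(3))
print(log_uniform_pdf([0.5, 1.0, 2.0]) * np.array([0.5, 1.0, 2.0]))
```

The second print shows that sigma * p(sigma) is constant inside the support, which is the defining property of the 1/sigma prior. Note that sigma = 0 is outside this support.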


#2

Can you post the full script, including hyperparams and data generation? It could be that you’re doing nothing wrong and that MH is just not a very good algorithm.


#3

Here is a simplified version of the problem.
The data is located in this file.

import numpy as np
from edward.models import Normal, Categorical, Uniform, Empirical, TransformedDistribution
import tensorflow as tf
import edward as ed

# data generation
K_true = np.array([[ 0.,  0.,  0.75312,  0.,  0., 0.],
                   [ 0.,  0.15979,  0.,  0.,  0.,-0.13297],
                   [ 0.,  0.,  0.92048,  0.,  0., 0.]])
phis_np = np.load('phis.npy')

# Data to fit to
data = (phis_np.dot(K_true)).trace(axis1=1, axis2=2)
rel_data = data - min(data)

phis_ts = tf.constant(phis_np, dtype=tf.float32)
data_ts = tf.constant(rel_data, dtype=tf.float32)

# Set up Edward model
T = 10000
ds = tf.contrib.distributions

# Latent variables
K = Normal(loc=tf.zeros((6,3)), scale=1.0)
log_sigma = Uniform(low=[-4.6052], high=[3.453])
sigma = TransformedDistribution(distribution=log_sigma, bijector=ds.bijectors.Exp(), name='sigma')

# Proposal variables
proposal_sigma = Normal(loc=sigma, scale=0.02)
proposal_K = Normal(loc=K, scale=0.02)

# Empirical variables
qK = Empirical(tf.Variable(tf.zeros([T, 6, 3])))
qsigma = Empirical(tf.Variable(tf.zeros([T, 1])))

# Observed
Fourier_sum = tf.reduce_sum(tf.reduce_sum(tf.multiply(phis_ts, K), axis=1), axis=1)
Fourier_sum_rel = Fourier_sum - tf.reduce_min(Fourier_sum)

# Likelihood
likelihood = Normal(loc=Fourier_sum_rel, scale=1.0)

# Inference with constant sigma
inference = ed.MetropolisHastings(latent_vars={K: qK}, 
                                  proposal_vars={K: proposal_K}, 
                                  data={likelihood: data_ts})
inference.run()

This script works in the sense that there’s a reasonable acceptance rate. I’m not recovering the true K, but that’s probably happening for several reasons (not enough data points, wrong algorithm, etc.). My main question is why I can’t get a reasonable acceptance rate when I add a prior on sigma. Here is a Jupyter notebook that shows the results.

When I make the changes below (using the TransformedDistribution for sigma), the acceptance rate falls to zero:

# Include Jeffreys prior on likelihood sigma
likelihood = Normal(loc=Fourier_sum_rel, scale=sigma)
inference_2 = ed.MetropolisHastings(latent_vars={K: qK, sigma: qsigma},
                                    proposal_vars={K: proposal_K, sigma: proposal_sigma},
                                    data={likelihood: data_ts})
inference_2.run()

The last cell in the linked notebook has the output.
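
One possibility worth checking (an assumption, not something confirmed in this thread): qsigma is initialised with zeros, and sigma = 0 is both outside the prior's support (roughly [0.01, 31.6]) and an invalid scale for the likelihood. If the sampler starts the chain from those initial values, the log density at the current state is not finite and no proposal can ever be accepted. The sketch below is plain NumPy, not Edward code. It runs a random-walk Metropolis sampler on log(sigma) for a toy problem (synthetic data y ~ Normal(0, sigma); `log_post` and `run_mh` are made-up names) and compares a start outside the support with a start inside it.

```python
import numpy as np

LOW, HIGH = -4.6052, 3.453  # bounds on log(sigma), copied from the post

def log_post(log_sigma, y):
    """Unnormalised log posterior of log(sigma) for y ~ Normal(0, sigma).

    Prior: log(sigma) ~ Uniform(LOW, HIGH), i.e. p(sigma) proportional to 1/sigma.
    """
    if not (LOW <= log_sigma <= HIGH):
        return -np.inf  # zero prior density outside the support
    sigma = np.exp(log_sigma)
    return -y.size * log_sigma - 0.5 * np.sum(y ** 2) / sigma ** 2

def run_mh(y, start, n_steps=5000, step=0.1, seed=0):
    """Random-walk Metropolis on log(sigma); returns draws and acceptance rate."""
    rng = np.random.default_rng(seed)
    current, current_lp = start, log_post(start, y)
    draws, accepted = np.empty(n_steps), 0
    for t in range(n_steps):
        proposal = current + step * rng.normal()
        proposal_lp = log_post(proposal, y)
        # If both log densities are -inf the difference is NaN, the
        # comparison is False, and the proposal is rejected.
        if np.log(rng.uniform()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
            accepted += 1
        draws[t] = current
    return draws, accepted / n_steps

rng = np.random.default_rng(1)
y = rng.normal(0.0, 2.0, size=50)       # synthetic data, true sigma = 2

_, rate_bad = run_mh(y, start=-10.0)    # start outside the prior's support
draws, rate_ok = run_mh(y, start=0.0)   # start at sigma = 1, inside the support
print("acceptance, bad start: ", rate_bad)
print("acceptance, good start:", rate_ok)
print("posterior mean of sigma:", np.exp(draws[1000:]).mean())
```

If this is what is happening in the Edward script, initialising qsigma inside the support (for example with ones instead of zeros) or making log_sigma the latent variable and exponentiating it in the likelihood would be the things to try.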