# Inf and nan loss in Stochastic Volatility Models

Hi,

Sorry for posting in the wrong place earlier; I didn't see this forum.
I’m working on a Stochastic Volatility model, with code

```python
import tensorflow as tf
import edward as ed
import numpy as np
from edward.models import Normal, InverseGamma, Uniform

# Model parameters
T = 50
sigmaTrue = 0.5
gammaTrue = 0.2
phiTrue = 0.8

# Simulate data
hTrue = [0.0] * T
yTrue = [0.0] * T
for t in range(T):
    if t == 0:
        # Initialize from the stationary distribution of the AR(1) process
        hTrue[t] = np.random.normal(loc=gammaTrue / (1 - phiTrue),
                                    scale=sigmaTrue / np.sqrt(1 - phiTrue ** 2))
    else:
        hTrue[t] = np.random.normal(loc=gammaTrue + phiTrue * hTrue[t - 1],
                                    scale=sigmaTrue)
    yTrue[t] = np.random.normal(loc=0, scale=np.exp(hTrue[t] / 2))

# Prior distributions
gamma = Normal(mu=0.0, sigma=5.0)
phi = Uniform(-1.0, 1.0)
sigma = InverseGamma(alpha=1.0, beta=1.0)

# Create h and y as lists of random variables
h = [None] * T
y = [None] * T
for t in range(T):
    if t == 0:
        h[t] = Normal(mu=0.0, sigma=2.0)
    else:
        h[t] = Normal(mu=gamma + phi * h[t - 1], sigma=sigma)
    y[t] = Normal(mu=0.0, sigma=tf.exp(h[t] / 2))

# Create approximating distributions
qGamma = Normal(mu=tf.Variable(tf.random_normal([])),
                sigma=tf.nn.softplus(tf.Variable(tf.random_normal([]))))
qPhi = Normal(mu=tf.Variable(tf.random_normal([])),
              sigma=tf.nn.softplus(tf.Variable(tf.random_normal([]))))
qSigma = InverseGamma(
    alpha=tf.nn.softplus(tf.Variable(tf.random_normal([]))),
    beta=tf.nn.softplus(tf.Variable(tf.random_normal([]))))
qH = [None] * T
for t in range(T):
    qH[t] = Normal(mu=tf.Variable(tf.random_normal([])),
                   sigma=tf.nn.softplus(tf.Variable(tf.random_normal([]))))

# Inference
parameters = {
    gamma: qGamma,
    phi: qPhi,
    sigma: qSigma,
}
parameters.update({ht: qht for ht, qht in zip(h, qH)})
data = {yt: yt_obs for yt, yt_obs in zip(y, yTrue)}
inference = ed.KLqp(parameters, data=data)
inference.run()
```

With the result

```
Iteration    1 [  0%]: Loss = inf
Iteration  100 [ 10%]: Loss = nan
Iteration  200 [ 20%]: Loss = nan
Iteration  300 [ 30%]: Loss = nan
Iteration  400 [ 40%]: Loss = nan
Iteration  500 [ 50%]: Loss = nan
Iteration  600 [ 60%]: Loss = nan
Iteration  700 [ 70%]: Loss = nan
Iteration  800 [ 80%]: Loss = nan
Iteration  900 [ 90%]: Loss = nan
Iteration 1000 [100%]: Loss = nan
```

I assume there's some issue with the setup; the lists for `y` and `h` also feel very awkward. I was trying to follow the approach in https://github.com/blei-lab/edward/pull/480.
Thanks for any help.

Two tips:

1. Gammas and inverse Gammas are difficult to work with under the generic `ed.KLqp` algorithm, so in general I recommend avoiding them unless you use another inference method. If you just want a distribution on the positive reals, look into LogNormal distributions (see the sketch after this list). Removing the Gammas makes the algorithm significantly faster and more stable.
2. `phi` and `qPhi` have different supports. This is why the algorithm returns `Inf` and then `NaN`: `ed.KLqp` evaluates `phi`'s density at samples from `qPhi`, and since `qPhi` is a Normal it can place samples outside [-1, 1], where the Uniform prior's density is zero and its log-density is undefined.
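
For the first tip, one way to get a positive scale without an InverseGamma is to put a Normal prior on the log of the scale, so the scale itself is LogNormal-distributed. A minimal sketch, reusing the `mu`/`sigma` keyword style from the code above (the name `logSigma` is my own):

```python
import tensorflow as tf
from edward.models import Normal

# Normal prior on log(sigma); exp(logSigma) is then LogNormal-distributed,
# i.e. a distribution on the positive reals.
logSigma = Normal(mu=0.0, sigma=1.0)
sigma = tf.exp(logSigma)  # use this wherever a positive scale is needed

# The variational factor is an unconstrained Normal, so ed.KLqp never
# evaluates the prior's density outside its support.
qLogSigma = Normal(mu=tf.Variable(tf.random_normal([])),
                   sigma=tf.nn.softplus(tf.Variable(tf.random_normal([]))))

# Then bind {logSigma: qLogSigma} in the latent-variable dict instead of
# {sigma: qSigma}.
```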

In your code I fixed `sigma` and `phi` to constants, then commented out `qSigma` and `qPhi` as well as the `sigma: qSigma` and `phi: qPhi` entries in the `parameters` dict. This ran successfully.
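
Roughly, the edit looks like this (just a sketch; I reused the true values from your simulation for the constants, but any sensible fixed values work):

```python
# Fix the problematic parameters to constants...
phi = 0.8     # was: Uniform(-1.0, 1.0)
sigma = 0.5   # was: InverseGamma(alpha=1.0, beta=1.0)

# ...and drop their variational factors, leaving only gamma and the states.
parameters = {
    gamma: qGamma,
    # phi: qPhi,      # support mismatch: qPhi can sample outside [-1, 1]
    # sigma: qSigma,  # InverseGamma is unstable under ed.KLqp
}
```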

Hope that helps.

I'm having a similar problem after replacing the Normal weights in a Bayesian NN example with Uniform ones. Any idea what could be causing it?

Here are the relevant weight and bias sections:

```python
weights = []
biases = []
bound = 1

# Uniform priors on all weights and biases
weights.append(Uniform(low=-bound * tf.ones([D, width]), high=bound * tf.ones([D, width])))
biases.append(Uniform(low=-bound * tf.ones(width), high=bound * tf.ones(width)))
for _ in xrange(depth):
    weights.append(Uniform(low=-bound * tf.ones([width, width]), high=bound * tf.ones([width, width])))
    biases.append(Uniform(low=-bound * tf.ones(width), high=bound * tf.ones(width)))
weights.append(Uniform(low=-bound * tf.ones([width, O]), high=bound * tf.ones([width, O])))
biases.append(Uniform(low=-bound * tf.ones(O), high=bound * tf.ones(O)))

x = x_train
y = Normal(loc=neural_network(x, weights, biases, dropout),
           scale=0.1 * tf.ones(N_tests * L_tests))

# Variational factors: Uniform with a trainable low and a high that is
# kept positive via softplus
qweights = []
qbiases = []

qweights.append(Uniform(low=tf.Variable(tf.random_uniform([D, width], minval=-bound, maxval=bound)),
                        high=tf.nn.softplus(tf.Variable(tf.random_uniform([D, width], minval=-bound, maxval=bound)))))
qbiases.append(Uniform(low=tf.Variable(tf.random_uniform([width], minval=-bound, maxval=bound)),
                       high=tf.nn.softplus(tf.Variable(tf.random_uniform([width], minval=-bound, maxval=bound)))))
for _ in xrange(depth):
    qweights.append(Uniform(low=tf.Variable(tf.random_uniform([width, width], minval=-bound, maxval=bound)),
                            high=tf.nn.softplus(tf.Variable(tf.random_uniform([width, width], minval=-bound, maxval=bound)))))
    qbiases.append(Uniform(low=tf.Variable(tf.random_uniform([width], minval=-bound, maxval=bound)),
                           high=tf.nn.softplus(tf.Variable(tf.random_uniform([width], minval=-bound, maxval=bound)))))
qweights.append(Uniform(low=tf.Variable(tf.random_uniform([width, O], minval=-bound, maxval=bound)),
                        high=tf.nn.softplus(tf.Variable(tf.random_uniform([width, O], minval=-bound, maxval=bound)))))
qbiases.append(Uniform(low=tf.Variable(tf.random_uniform([O], minval=-bound, maxval=bound)),
                       high=tf.nn.softplus(tf.Variable(tf.random_uniform([O], minval=-bound, maxval=bound)))))
```