RNN in Edward - Tensor had NaN values


#1

I am new to Edward and relatively new to TensorFlow. I am trying to build a Bayesian RNN to predict remaining useful life on NASA’s CMAPSS dataset.
I started from the RNN example (Edward RNN) and tried to get it running.

import edward as ed
import numpy as np
import pandas as pd
import tensorflow as tf
from edward.models import Normal
from edward.util import Progbar

x_train = np.load("/CMAPSSData/train_norm_sub.npy")
y_rul = np.load("/CMAPSSData/train_norm_rul.npy")

H = 50 # number of hidden units
D = 24 # number of features

def rnn_cell(hprev, xt):
    return tf.tanh(ed.dot(hprev, W_h) + ed.dot(xt, W_x) + b_h)

W_h = Normal(loc=tf.zeros([H, H]), scale=tf.ones([H, H]))
W_x = Normal(loc=tf.zeros([D, H]), scale=tf.ones([D, H]))
W_y = Normal(loc=tf.zeros([H, 1]), scale=tf.ones([H, 1]))
b_h = Normal(loc=tf.zeros(H), scale=tf.ones(H))
b_y = Normal(loc=tf.zeros(1), scale=tf.ones(1))

x = tf.placeholder(tf.float32, [None, D], name='x')
h = tf.scan(rnn_cell, x, initializer=tf.zeros(H))
y = Normal(loc=tf.matmul(h, W_y) + b_y, scale=1.0, name='y')

qW_h = Normal(loc=tf.get_variable("qW_h/loc", [H, H]),
              scale=tf.nn.softplus(tf.get_variable("qW_h/scale", [H, H])))
qW_x = Normal(loc=tf.get_variable("qW_x/loc", [D, H]),
              scale=tf.nn.softplus(tf.get_variable("qW_x/scale", [D, H])))
qW_y = Normal(loc=tf.get_variable("qW_y/loc", [H, 1]),
              scale=tf.nn.softplus(tf.get_variable("qW_y/scale", [H, 1])))
qb_h = Normal(loc=tf.get_variable("qb_h/loc", [H]),
              scale=tf.nn.softplus(tf.get_variable("qb_h/scale", [H])))
qb_y = Normal(loc=tf.get_variable("qb_y/loc", [1]),
              scale=tf.nn.softplus(tf.get_variable("qb_y/scale", [1])))

inference = ed.KLqp({W_h: qW_h, b_h: qb_h,
                     W_x: qW_x,
                     W_y: qW_y, b_y: qb_y},
                    data={x: x_train, y: y_rul})
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
inference.run(optimizer=optimizer)

I get the following error:

InvalidArgumentError: : Tensor had NaN values
[[Node: inference_15/sample_47/scan/while/VerifyFinite_3/CheckNumerics = CheckNumerics[T=DT_FLOAT, _class=["loc:@Normal_1/sample/Reshape"], message="", _device="/job:localhost/replica:0/task:0/device:CPU:0"](inference_15/sample_47/scan/while/VerifyFinite_3/CheckNumerics/Enter, ^inference_15/sample_47/scan/while/Identity)]]

I have already tried different learning rates and used the most recent Git version of Edward.

I have already checked the data by loading it into a DataFrame:

x_train_df = pd.DataFrame(x_train)

output:
0 0
1 0
2 0
3 0
4 0
5 0
6 0
7 0
8 0
9 0
10 0
11 0
12 0
13 0
14 0
15 0
16 0
17 0
18 0
19 0
20 0
21 0
22 0
23 0
dtype: int64

y_rul_df = pd.DataFrame(y_rul)

output:
0 0
dtype: int64


#2

It looks like there really are NaN values in the data (the last two columns).
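
A quick way to confirm this is to count NaN entries per column directly in NumPy. The sketch below is only illustrative: `nan_report` is a hypothetical helper, and the toy array is a stand-in for the real `.npy` file, so the column indices it prints say nothing about your actual data.

```python
import numpy as np


def nan_report(arr):
    """Return (total NaN count, list of column indices that contain NaN).

    Hypothetical helper for illustration; a 1-D array is treated as a
    single column.
    """
    arr = np.asarray(arr, dtype=float)
    if arr.ndim == 1:
        arr = arr[:, None]
    mask = np.isnan(arr)
    # Collapse every leading axis so we count per feature column.
    per_column = mask.reshape(-1, arr.shape[-1]).sum(axis=0)
    return int(mask.sum()), np.flatnonzero(per_column).tolist()


# Toy stand-in with 24 feature columns (assumed shape, not the real file).
demo = np.zeros((5, 24), dtype=np.float32)
demo[2, 22] = np.nan
demo[4, 23] = np.nan

total, bad_columns = nan_report(demo)
print("NaN entries:", total)
print("columns with NaN:", bad_columns)
```

Running the same helper on the arrays returned by `np.load` should show whether any NaN values are present before they reach `ed.KLqp`.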

It also looks like the data are not in the format you’re expecting based on your code (y_rul is the test data, not the training data).
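
To check the format, it may help to compare the array shapes against what the model expects before running inference. In the code above, `x` is a `[None, D]` placeholder and the location of `y` is `tf.matmul(h, W_y) + b_y`, which has one row per time step and a single column. The sketch below assumes the data bound to `y` should therefore have shape `(T, 1)` with the same `T` as `x`; `shape_problems` is a hypothetical helper, and the arrays are toy stand-ins.

```python
import numpy as np


def shape_problems(x_arr, y_arr, n_features):
    """List mismatches between the data and the model's expected shapes.

    Hypothetical helper: expects x_arr of shape (T, n_features) and
    y_arr of shape (T, 1).
    """
    x_arr = np.asarray(x_arr)
    y_arr = np.asarray(y_arr)
    problems = []
    if x_arr.ndim != 2 or x_arr.shape[1] != n_features:
        problems.append("x has shape {}, expected (T, {})".format(
            x_arr.shape, n_features))
    if y_arr.ndim != 2 or y_arr.shape[1] != 1:
        problems.append("y has shape {}, expected (T, 1)".format(y_arr.shape))
    if x_arr.ndim >= 1 and y_arr.ndim >= 1 and x_arr.shape[0] != y_arr.shape[0]:
        problems.append("x has {} rows but y has {}".format(
            x_arr.shape[0], y_arr.shape[0]))
    return problems


# Toy arrays: a matching pair and a mismatched pair.
ok = shape_problems(np.zeros((10, 24)), np.zeros((10, 1)), n_features=24)
bad = shape_problems(np.zeros((10, 24)), np.zeros(7), n_features=24)
print(ok)
for line in bad:
    print(line)
```

If the list is non-empty for `x_train` and `y_rul`, the targets are probably not the array that belongs with the training inputs.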