RNN in Edward - Tensor had NaN values

chdr · April 4, 2018, 3:26pm

I am new to Edward and relatively new to TensorFlow. I am trying to build a Bayesian RNN to predict remaining useful life (RUL) on NASA’s CMAPSS dataset.
I started from the Edward RNN example and tried to get it running:

import edward as ed
import numpy as np
import pandas as pd
import tensorflow as tf
from edward.models import Normal
from edward.util import Progbar

x_train = np.load("/CMAPSSData/train_norm_sub.npy")
y_rul = np.load("/CMAPSSData/train_norm_rul.npy")

H = 50 # number of hidden units
D = 24 # number of features

def rnn_cell(hprev, xt):
    return tf.tanh(ed.dot(hprev, W_h) + ed.dot(xt, W_x) + b_h)

W_h = Normal(loc=tf.zeros([H, H]), scale=tf.ones([H, H]))
W_x = Normal(loc=tf.zeros([D, H]), scale=tf.ones([D, H]))
W_y = Normal(loc=tf.zeros([H, 1]), scale=tf.ones([H, 1]))
b_h = Normal(loc=tf.zeros(H), scale=tf.ones(H))
b_y = Normal(loc=tf.zeros(1), scale=tf.ones(1))

x = tf.placeholder(tf.float32, [None, D], name='x')
h = tf.scan(rnn_cell, x, initializer=tf.zeros(H))
y = Normal(loc=tf.matmul(h, W_y) + b_y, scale=1.0, name='y')

qW_h = Normal(loc=tf.get_variable("qW_h/loc", [H, H]),
              scale=tf.nn.softplus(tf.get_variable("qW_h/scale", [H, H])))
qW_x = Normal(loc=tf.get_variable("qW_x/loc", [D, H]),
              scale=tf.nn.softplus(tf.get_variable("qW_x/scale", [D, H])))
qW_y = Normal(loc=tf.get_variable("qW_y/loc", [H, 1]),
              scale=tf.nn.softplus(tf.get_variable("qW_y/scale", [H, 1])))
qb_h = Normal(loc=tf.get_variable("qb_h/loc", [H]),
              scale=tf.nn.softplus(tf.get_variable("qb_h/scale", [H])))
qb_y = Normal(loc=tf.get_variable("qb_y/loc", [1]),
              scale=tf.nn.softplus(tf.get_variable("qb_y/scale", [1])))

inference = ed.KLqp({W_h: qW_h, b_h: qb_h,
                     W_x: qW_x,
                     W_y: qW_y, b_y: qb_y},
                    data={x: x_train, y: y_rul})
optimizer = tf.train.AdamOptimizer(learning_rate=0.0001)
inference.run(optimizer=optimizer)

I am facing the error:

InvalidArgumentError: : Tensor had NaN values
[[Node: inference_15/sample_47/scan/while/VerifyFinite_3/CheckNumerics = CheckNumerics[T=DT_FLOAT, _class=["loc:@Normal_1/sample/Reshape"], message="", _device="/job:localhost/replica:0/task:0/device:CPU:0"](inference_15/sample_47/scan/while/VerifyFinite_3/CheckNumerics/Enter, ^inference_15/sample_47/scan/while/Identity)]]

I have already tried different learning rates and used the most recent Git version of Edward.

I have already checked the dataframes with:

x_train_df = pd.DataFrame(x_train)

output:
0     0
1     0
2     0
3     0
4     0
5     0
6     0
7     0
8     0
9     0
10    0
11    0
12    0
13    0
14    0
15    0
16    0
17    0
18    0
19    0
20    0
21    0
22    0
23    0
dtype: int64

y_rul_df = pd.DataFrame(y_rul)

output:
0 0
dtype: int64
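
For anyone reproducing this check: assuming the zeros above come from a per-column null count, note that pandas null checks flag NaN but by default do not flag infinite values, which can also trip TensorFlow's CheckNumerics. A minimal NumPy sketch that counts both follows. The helper name `count_non_finite` and the `demo` array are made up for illustration, since the original .npy files are not available here.

```python
import numpy as np

def count_non_finite(a):
    """Return per-column counts of NaN and of +/-inf entries in an array."""
    a = np.asarray(a, dtype=np.float64)
    if a.ndim == 1:
        # Treat a 1-D array as a single column.
        a = a[:, None]
    return np.isnan(a).sum(axis=0), np.isinf(a).sum(axis=0)

# Synthetic stand-in for x_train: 5 rows, 4 columns, two bad entries.
demo = np.ones((5, 4))
demo[1, 2] = np.nan
demo[3, 3] = np.inf

nan_counts, inf_counts = count_non_finite(demo)
print("NaN per column:", nan_counts)
print("inf per column:", inf_counts)
```

Running the same helper on the loaded `x_train` and `y_rul` arrays would show which columns, if any, contain non-finite values.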

aksarkar · April 9, 2018, 12:27am

It looks like there really are NaN values in the data (the last two columns).

It also looks like the data are not in the format you’re expecting based on your code (y_rul is the test data, not the training data).
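
A quick way to check the format point is to compare the array shapes with what the model in the question declares: `x` is a placeholder of shape [None, D], and `y` has one output column per row of `x`. This is only a sketch under those assumptions. `check_feed_shapes` is a hypothetical helper, and the arrays below are synthetic stand-ins for the real files.

```python
import numpy as np

def check_feed_shapes(x_arr, y_arr, n_features):
    """Return a list of shape problems; an empty list means the shapes line up."""
    problems = []
    if x_arr.ndim != 2 or x_arr.shape[1] != n_features:
        problems.append("x has shape %s, expected (N, %d)" % (x_arr.shape, n_features))
    if y_arr.ndim != 2 or y_arr.shape[1] != 1:
        problems.append("y has shape %s, expected (N, 1)" % (y_arr.shape,))
    if x_arr.shape[0] != y_arr.shape[0]:
        problems.append("x has %d rows but y has %d" % (x_arr.shape[0], y_arr.shape[0]))
    return problems

# Synthetic stand-ins: 10 time steps with D = 24 features, and a flat target vector.
x_demo = np.zeros((10, 24))
y_flat = np.zeros(10)

bad = check_feed_shapes(x_demo, y_flat, n_features=24)
good = check_feed_shapes(x_demo, y_flat.reshape(-1, 1), n_features=24)
print(bad)
print(good)
```

A mismatch in the number of rows would suggest that the features and targets come from different splits.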

abdullahnisar92 · March 13, 2019, 9:05am

Any idea how to include different layers in the RNN model?
