# Trying to implement Bayesian NN for regression

I am trying to apply the Bayesian NN presented by Torsten Scholak at PyCon to some real-world data I have, in order to familiarize myself with Edward and TensorFlow, and I am getting very weird results.
The network fits the data well, but only up to a certain point, and then it flatlines. I can’t figure out what to tweak in the code. Here is the code for the network:

``````
import edward as ed
import tensorflow as tf
from edward.models import Normal


def neural_network_with_2_layers(x, W_0, W_1, b_0, b_1):
    # One tanh hidden layer followed by a linear output layer
    h = tf.nn.tanh(tf.matmul(x, W_0) + b_0)
    h = tf.matmul(h, W_1) + b_1
    return tf.reshape(h, [-1])


# N (number of training points) and D (number of input features)
# are assumed to be defined earlier.
dim = 10  # layer dimensions

# Standard normal priors over the weights and biases
W_0 = Normal(loc=tf.zeros([D, dim]),
             scale=tf.ones([D, dim]))
W_1 = Normal(loc=tf.zeros([dim, 1]),
             scale=tf.ones([dim, 1]))
b_0 = Normal(loc=tf.zeros(dim),
             scale=tf.ones(dim))
b_1 = Normal(loc=tf.zeros(1),
             scale=tf.ones(1))

x = tf.placeholder(tf.float32, [N, D])

# Reshaping
a = neural_network_with_2_layers(x, W_0, W_1, b_0, b_1)
b = tf.reshape(a, [len(X_train), 1])
y = Normal(loc=b, scale=(tf.ones([N, 1]) * 0.1))  # constant noise

# BACKWARD MODEL A
# Variational approximations: normals with trainable loc and softplus-constrained scale
q_W_0 = Normal(loc=tf.Variable(tf.random_normal([D, dim])),
               scale=tf.nn.softplus(tf.Variable(tf.random_normal([D, dim]))))
q_W_1 = Normal(loc=tf.Variable(tf.random_normal([dim, 1])),
               scale=tf.nn.softplus(tf.Variable(tf.random_normal([dim, 1]))))
q_b_0 = Normal(loc=tf.Variable(tf.random_normal([dim])),
               scale=tf.nn.softplus(tf.Variable(tf.random_normal([dim]))))
q_b_1 = Normal(loc=tf.Variable(tf.random_normal([1])),
               scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))))

inference = ed.KLqp(latent_vars={W_0: q_W_0, b_0: q_b_0,
                                 W_1: q_W_1, b_1: q_b_1},
                    data={x: X_train, y: Y_train})

inference.run(n_samples=50, n_iter=20000)
``````

Here are the results

and the code to plot them

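``````
import matplotlib.pyplot as plt
import numpy as np
import seaborn as sns

# CRITICISM A
plt.scatter(X_train, Y_train, s=20.0);  # blue
plt.scatter(X_test, Y_test, s=20.0,  # red
            color=sns.color_palette().as_hex()[2]);

sess = ed.get_session()  # the TensorFlow session Edward ran inference in
xp = tf.placeholder(tf.float32, [1000, D])
# Each sess.run draws fresh weights from the fitted q distributions,
# so this plots 10 sampled regression curves.
[plt.plot(np.linspace(-1.0, 1.0, 1000),
          sess.run(neural_network_with_2_layers(xp,
                                                q_W_0, q_W_1,
                                                q_b_0, q_b_1),
                   {xp: np.linspace(-1.0, 1.0, 1000)[:, np.newaxis]}),
          color='black', alpha=0.1)
 for _ in range(10)];
``````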

Cheers


What size is the hidden layer? Also, to verify your code works, have you tried dropping the hidden layer in the `neural_network` code to see if it properly reduces to Bayesian linear regression?
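One way to make that check concrete is to compare the variational fit against the exact posterior, which is available in closed form for Bayesian linear regression with a Gaussian prior and known noise. The sketch below is a minimal NumPy version under assumptions taken from the code in this thread (standard normal priors on the weight and bias, noise scale 0.1). The function name `exact_linear_posterior` and the synthetic data are purely illustrative and are not part of Edward.

```python
import numpy as np


def exact_linear_posterior(X, y, noise_scale=0.1, prior_scale=1.0):
    """Exact Gaussian posterior over (w, b) for y = X w + b + noise.

    Assumes independent N(0, prior_scale^2) priors on every weight and on
    the bias, and Gaussian noise with known standard deviation noise_scale.
    """
    X = np.asarray(X, dtype=np.float64)
    y = np.asarray(y, dtype=np.float64).reshape(-1)
    # Append a column of ones so the bias is treated as one more weight.
    Phi = np.hstack([X, np.ones((X.shape[0], 1))])
    precision = (np.eye(Phi.shape[1]) / prior_scale ** 2
                 + Phi.T @ Phi / noise_scale ** 2)
    cov = np.linalg.inv(precision)
    mean = cov @ Phi.T @ y / noise_scale ** 2
    return mean, cov


# Synthetic data with a known slope (2.0) and intercept (-0.5); made-up values.
rng = np.random.RandomState(0)
X_demo = rng.uniform(-1.0, 1.0, size=(200, 1))
y_demo = 2.0 * X_demo[:, 0] - 0.5 + 0.1 * rng.randn(200)

post_mean, post_cov = exact_linear_posterior(X_demo, y_demo)
print("posterior mean (slope, intercept):", post_mean)
print("posterior std  (slope, intercept):", np.sqrt(np.diag(post_cov)))
```

If the fitted means of the variational distributions on the same data land far from these values, the problem is more likely in the model or inference setup than in the hidden layer.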

The size of the hidden layer is 10, and I just found out that my code does not work even when I drop the hidden layer. Here is what it looked like:

``````
def neural_network_with_1_layer(x, W, b):
    # No hidden layer: this is plain (Bayesian) linear regression
    h = tf.matmul(x, W) + b
    return tf.reshape(h, [-1])


# Standard normal priors
W = Normal(loc=tf.zeros([1, 1]),
           scale=tf.ones([1, 1]))

b = Normal(loc=tf.zeros([1, 1]),
           scale=tf.ones([1, 1]))

x = tf.placeholder(tf.float32, [623, 1])

a = neural_network_with_1_layer(x, W, b)
c = tf.reshape(a, [len(X_train), 1])
y = Normal(loc=c, scale=(tf.ones([1]) * 0.1))  # constant noise

# Variational approximations
q_W = Normal(loc=tf.Variable(tf.random_normal([1, 1])),
             scale=tf.nn.softplus(tf.Variable(tf.random_normal([1, 1]))))

q_b = Normal(loc=tf.Variable(tf.random_normal([1, 1])),
             scale=tf.nn.softplus(tf.Variable(tf.random_normal([1, 1]))))

inference = ed.KLqp(latent_vars={W: q_W, b: q_b},
                    data={x: X_train, y: Y_train})

inference.run(n_samples=10, n_iter=5000)

plt.scatter(X_train, Y_train, s=20.0, label="Training data");  # blue
plt.scatter(X_test, Y_test, s=20.0, label="Test data",  # red
            color=sns.color_palette().as_hex()[2]);

xp = tf.placeholder(tf.float32, [2000, 1])
# Use the 1-layer network here; the 2-layer function takes five arguments
[plt.plot(np.linspace(-.5, 1.0, 2000),
          sess.run(neural_network_with_1_layer(xp, q_W, q_b),
                   {xp: np.linspace(-.5, 1.0, 2000)[:, np.newaxis]}),
          color='black', alpha=0.1)
 for _ in range(10)];
plt.legend()
``````

And this is how I define my data in case it helps

``````
X_test = data_4[::10]
X_test = X_test.reshape(len(X_test), 1)
X_test = X_test.astype("float32")

X_train = data[::10]
X_train = X_train.astype("float32")
X_train = X_train.reshape(len(X_train), 1)

Y_train = RUL_func(X_train)
Y_train = Y_train.astype("float32")
Y_train = Y_train.reshape(len(Y_train), 1)

Y_test = RUL_func(X_test)
Y_test = Y_test.reshape(len(Y_test), 1)
Y_test = Y_test.astype("float32")
``````

Thanks for taking the time to help


Update: The problem may be caused by the activation function. Changing it from tanh to relu gave me a linear fit. Changing it to relu6 gave a more non-linear (over)fit, again with a cut-off point, but at a value that is more convenient for the present data.
This is the relu output

And this is the relu6
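
A possible explanation for the cut-off, offered here as an assumption rather than a diagnosis of this particular dataset, is saturation: `tanh` flattens out at ±1 for inputs of large magnitude and `relu6` clips at 6, whereas `relu` is unbounded above. A small NumPy sketch of the three functions on illustrative inputs (the helper names are hypothetical stand-ins for `tf.nn.relu` and `tf.nn.relu6`):

```python
import numpy as np


def relu(z):
    # max(z, 0): linear for positive inputs, never saturates from above
    return np.maximum(z, 0.0)


def relu6(z):
    # min(max(z, 0), 6): like relu, but clipped at 6
    return np.minimum(np.maximum(z, 0.0), 6.0)


# Pre-activations of increasing magnitude, as unnormalized inputs could produce
z = np.array([-100.0, -1.0, 0.0, 1.0, 5.0, 10.0, 100.0])

tanh_out = np.tanh(z)
relu_out = relu(z)
relu6_out = relu6(z)

print("z     :", z)
print("tanh  :", tanh_out)
print("relu  :", relu_out)
print("relu6 :", relu6_out)
```

Once every hidden unit sits in its flat region, the network output stops changing with the input, which would look like a flatline in the plots.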

Is your data normalized to match your activation function, i.e. to [0, 1] for ReLU and to (-1, 1) for tanh?
Initialize your weights from a random normal and shrink them by multiplying by a small factor, e.g. 0.01.
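
As a minimal sketch of both suggestions, assuming plain min-max scaling is what is meant by normalizing: the helper `min_max_scale` and the sample array are hypothetical, and in practice the scaling constants would be computed on the training set and reused for the test set.

```python
import numpy as np


def min_max_scale(x, low, high, x_min=None, x_max=None):
    """Linearly map x into [low, high] using the given (or observed) min and max."""
    x = np.asarray(x, dtype=np.float32)
    x_min = x.min() if x_min is None else x_min
    x_max = x.max() if x_max is None else x_max
    unit = (x - x_min) / (x_max - x_min)  # now in [0, 1]
    return low + unit * (high - low)


# Made-up raw inputs on a large scale
raw = np.array([[10.0], [250.0], [400.0], [1000.0]], dtype=np.float32)

scaled_for_relu = min_max_scale(raw, 0.0, 1.0)   # range [0, 1]
scaled_for_tanh = min_max_scale(raw, -1.0, 1.0)  # range [-1, 1]

# Small random initial weights: standard normal draws shrunk by 0.01
rng = np.random.RandomState(0)
D, dim = 1, 10
W_0_init = (0.01 * rng.randn(D, dim)).astype(np.float32)

print(scaled_for_relu.ravel())
print(scaled_for_tanh.ravel())
print(W_0_init)
```

An array such as `W_0_init` could then be passed as the initial value of a `tf.Variable` used for the `loc` of a variational distribution.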


That helped a lot. I was already initializing the weights randomly but had not normalized the input. Here are 20 predictions using a tanh activation with (semi-)normalized inputs (still, it does quite well):

It seems to be better, with a lot less overfitting; however, now I am getting a negative loss. I guess there’s some more tweaking to be done. Thanks
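
A negative loss is not necessarily a sign of a bug. As far as I understand it, `KLqp` reports the negative ELBO, which contains the log-density of the data, and a Gaussian with a small scale such as the 0.1 used in the model above has a density greater than 1 near its mean, so its log-density there is positive. A quick check using only the standard library (the helper name is illustrative):

```python
import math


def normal_log_density(y, loc, scale):
    """Log of the Normal(loc, scale) density evaluated at y."""
    return (-0.5 * math.log(2.0 * math.pi)
            - math.log(scale)
            - 0.5 * ((y - loc) / scale) ** 2)


at_mean = normal_log_density(0.0, 0.0, 0.1)     # observation exactly at the mean
one_sd = normal_log_density(0.1, 0.0, 0.1)      # one standard deviation away
unit_scale = normal_log_density(0.0, 0.0, 1.0)  # same point, scale 1.0

print("scale 0.1, at mean:    ", at_mean)
print("scale 0.1, one sd away:", one_sd)
print("scale 1.0, at mean:    ", unit_scale)
```

With several hundred well-fitted points each contributing a positive log-density, the sum can outweigh the KL term, and the reported loss goes below zero.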
