Cholesky decomposition of sklearn RBF kernel output works, but I get a "not positive definite" error when following the example


#1

Following the example:

http://edwardlib.org/tutorials/supervised-classification

I get an error saying the Cholesky decomposition was unsuccessful. However, if I do the following:

import numpy as np
import tensorflow as tf
import edward as ed
from edward.models import Bernoulli, Normal, MultivariateNormalTriL
from sklearn.metrics.pairwise import rbf_kernel

# data (N x P feature array) and response (N binary labels) are already loaded
N, P = data.shape

X = tf.placeholder(tf.float32, [N, P])
chol_X = tf.placeholder(tf.float32, [N, N])

# compute the kernel matrix and its Cholesky factor outside the graph
cov_mat = rbf_kernel(data)
chol_cov_mat = np.linalg.cholesky(cov_mat)

# prior over the latent function, using the precomputed Cholesky factor
f = MultivariateNormalTriL(loc=tf.zeros(N), scale_tril=chol_X)
y = Bernoulli(logits=f)

qf = Normal(loc=tf.get_variable("qf/loc", [N]),
            scale=tf.nn.softplus(tf.get_variable("qf/scale", [N])))

infer = ed.KLqp({f: qf}, data={X: data, y: response, chol_X: chol_cov_mat})
infer.run(n_iter=500)

it runs without error. I’m not sure what is happening here.

Perhaps this is a question for the TensorFlow people, since it seems to be tf.cholesky that raises the error and not np.linalg.cholesky. Or perhaps the RBF kernel I’m using from sklearn is different from ed.rbf?

Going a bit further:

import numpy as np
import tensorflow as tf
from edward.util import rbf
from sklearn.metrics.pairwise import rbf_kernel

tf.enable_eager_execution()

# loading data is done

k1 = rbf(data).numpy()
k2 = rbf_kernel(data)

# the two matrices above are not the same

np.linalg.cholesky(k1) # fails to run
np.linalg.cholesky(k2) # runs

tf.cholesky(k1) # fails to run
tf.cholesky(k2) # runs
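
A minimal sketch of why the two matrices can differ, using NumPy only. It assumes that sklearn's rbf_kernel computes exp(-gamma * ||x - y||^2) with gamma defaulting to 1 / n_features, and that Edward's rbf computes variance * exp(-||x - y||^2 / (2 * lengthscale^2)) with both defaults equal to 1. The function names rbf_gamma and rbf_lengthscale are hypothetical stand-ins for those two parameterizations, not library functions, and the random X is illustrative data only.

```python
import numpy as np

def sq_dists(X):
    # pairwise squared Euclidean distances between the rows of X
    diff = X[:, None, :] - X[None, :, :]
    return np.sum(diff ** 2, axis=-1)

def rbf_gamma(X, gamma=None):
    # gamma parameterization; the default of 1 / n_features is the
    # assumed sklearn rbf_kernel default
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    return np.exp(-gamma * sq_dists(X))

def rbf_lengthscale(X, lengthscale=1.0, variance=1.0):
    # lengthscale parameterization (assumed to match Edward's rbf)
    return variance * np.exp(-sq_dists(X) / (2.0 * lengthscale ** 2))

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))      # 5 points, 4 features, illustrative only

k_gamma = rbf_gamma(X)           # effective gamma = 1 / 4
k_len = rbf_lengthscale(X)       # equivalent to gamma = 1 / 2

# with the defaults the two matrices differ unless n_features == 2
same_by_default = np.allclose(k_gamma, k_len)
# matching gamma = 1 / (2 * lengthscale^2) makes them agree
same_when_matched = np.allclose(rbf_gamma(X, gamma=0.5), k_len)
print(same_by_default, same_when_matched)
```

Under these assumptions, the default sklearn kernel on data with many features uses a smaller effective gamma, which changes the conditioning of the matrix and so can change whether a Cholesky factorization succeeds.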

#2

I see now: the two kernels aren’t exactly the same, and using the Gaussian process RBF kernel from sklearn also gives the Cholesky decomposition error. The problem is in fact my data.
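
For anyone hitting the same error, here is a rough sketch of how one might check whether the data is the cause. The post above does not say what was wrong with the data, so the duplicated row below is only an assumed example of a common cause: two identical inputs give two identical rows in the kernel matrix, which makes it singular. The helpers rbf_gram and cholesky_with_jitter are hypothetical names, and adding jitter to the diagonal is a common workaround rather than anything taken from the Edward tutorial.

```python
import numpy as np

def rbf_gram(X, lengthscale=1.0):
    # RBF kernel matrix exp(-||x - y||^2 / (2 * lengthscale^2))
    diff = X[:, None, :] - X[None, :, :]
    return np.exp(-np.sum(diff ** 2, axis=-1) / (2.0 * lengthscale ** 2))

def cholesky_with_jitter(K, jitter=1e-6, max_tries=5):
    # try a plain Cholesky first; on failure add a growing multiple
    # of the identity to the diagonal and retry
    eye = np.eye(K.shape[0])
    added = 0.0
    for _ in range(max_tries):
        try:
            return np.linalg.cholesky(K + added * eye), added
        except np.linalg.LinAlgError:
            added = jitter if added == 0.0 else added * 10.0
    raise np.linalg.LinAlgError("not positive definite even with jitter")

rng = np.random.default_rng(0)
X = rng.normal(size=(6, 3))
X[1] = X[0]                                # assumed problem: a duplicated row

K = rbf_gram(X)
min_eig = np.linalg.eigvalsh(K).min()      # close to zero: K is singular
n_unique = np.unique(X, axis=0).shape[0]   # fewer unique rows than rows
L, added = cholesky_with_jitter(K)

print("smallest eigenvalue:", min_eig)
print("unique rows:", n_unique, "of", X.shape[0])
print("jitter added:", added)
```

Checking the smallest eigenvalue and the number of unique rows separates a data problem from a kernel problem. If duplicates are the cause, removing them may be preferable to relying on jitter.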