How to handle missing values in Gaussian Matrix Factorization

In the literature, this is known as implicit feedback. From my (limited) understanding, there are a few ways to handle the zeros:

  1. Treat the zeros as part of the data, as you mention. For Gaussian MF, you then have to downweight the zeros during inference, e.g., by giving them a small confidence weight in the squared-error objective, as in weighted MF for implicit feedback (sketches of this and of Poisson MF appear at the end of this post). Poisson MF avoids the issue by defining a naturally sparse generative process: a zero entry contributes only a -rate term to the log-likelihood.
  2. Treat the zeros as missing values (latent variables) and marginalize them out. This can be difficult in general. To do this in Edward, include tf.placeholders for the flat indices of the observed and missing entries (gathering with a 0/1 indicator mask won't work, since tf.gather interprets its argument as indices). Here's an example.
# U, V and their variational factors qU, qV are defined as in your model;
# n_obs and n_mis are the numbers of observed (nonzero) and missing entries.
I_obs = tf.placeholder(tf.int32, [n_obs])  # flat indices of observed entries
I_mis = tf.placeholder(tf.int32, [n_mis])  # flat indices of missing entries
mu = tf.reshape(tf.matmul(U, V, transpose_b=True), [-1])
sigma = tf.ones([M * N])
Y_obs = Normal(mu=tf.gather(mu, I_obs), sigma=tf.gather(sigma, I_obs))
Y_mis = Normal(mu=tf.gather(mu, I_mis), sigma=tf.gather(sigma, I_mis))

# The softplus wraps the variable (not just its initializer) so the
# variational scale stays positive throughout optimization.
qY_mis = Normal(
    mu=tf.Variable(tf.random_uniform([n_mis])),
    sigma=tf.nn.softplus(tf.Variable(tf.random_uniform([n_mis])))
)

inference = ed.KLqp({U: qU, V: qV, Y_mis: qY_mis},
                    data={Y_obs: y_train, I_obs: i_obs_train, I_mis: i_mis_train})
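Here, i_obs_train and i_mis_train are index arrays you build from your data matrix. One way to construct them (an illustrative sketch with made-up data, treating zeros as missing):

import numpy as np

# Hypothetical (M x N) data matrix; its zeros are the entries we treat as missing.
y = np.random.poisson(0.5, size=(M, N)).astype(np.float32)

i_obs_train = np.flatnonzero(y)        # flat indices of nonzero entries
i_mis_train = np.flatnonzero(y == 0)   # flat indices of zero entries
y_train = y.ravel()[i_obs_train]       # observed values, aligned with I_obs

n_obs and n_mis above then need to match i_obs_train.size and i_mis_train.size, since tf.Variable requires a static shape for qY_mis.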

Your mileage may vary depending on how you structure the indices and nonzero values in the matrix.
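For completeness, here's a minimal sketch of approach 1 in plain TensorFlow (not Edward), where a made-up confidence weight alpha downweights the zero entries in a squared-error objective, in the spirit of weighted MF for implicit feedback:

import numpy as np
import tensorflow as tf

M, N, K = 50, 40, 5   # made-up sizes for illustration
alpha = 0.1           # confidence weight on zero entries (assumption)

y = np.random.poisson(0.5, size=(M, N)).astype(np.float32)
Y = tf.constant(y)
U = tf.Variable(tf.random_normal([M, K]))
V = tf.Variable(tf.random_normal([N, K]))

# Nonzero entries get full weight; zeros are heavily downweighted.
W = tf.constant(np.where(y == 0.0, alpha, 1.0).astype(np.float32))
loss = tf.reduce_sum(W * tf.square(Y - tf.matmul(U, V, transpose_b=True)))
train_op = tf.train.AdamOptimizer(0.01).minimize(loss)

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    for _ in range(1000):
        sess.run(train_op)

And here's a sketch of the Poisson MF generative process in Edward (shapes and Gamma hyperparameters are assumptions, with arguments passed positionally):

from edward.models import Gamma, Poisson

# Nonnegative latent factors; the Poisson likelihood makes zeros cheap,
# since log p(y = 0 | rate) = -rate.
U = Gamma(tf.ones([M, K]), tf.ones([M, K]))
V = Gamma(tf.ones([N, K]), tf.ones([N, K]))
Y = Poisson(tf.matmul(U, V, transpose_b=True))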