Matrix factorization with Masking


#1

I’m trying to compute a matrix factorization R = UV', but some entries of R are unknown (e.g. R is a ratings matrix for user-movie recommendations). How do I specify that those entries should be masked out?
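To make the goal concrete, here is a minimal sketch of the objective I have in mind, written in plain Python rather than Edward: a Gaussian log-likelihood summed over the observed entries only. The function name `masked_log_likelihood`, the toy values, and the noise scale of 0.1 are all made up for illustration.

```python
import math

def masked_log_likelihood(R, mask, U, V, scale=0.1):
    """Sum Normal(R[i][j] | U[i] . V[j], scale) log-densities over observed entries."""
    total = 0.0
    for i, row in enumerate(R):
        for j, rating in enumerate(row):
            if not mask[i][j]:
                continue  # unknown entry: contributes nothing to the objective
            pred = sum(u * v for u, v in zip(U[i], V[j]))
            z = (rating - pred) / scale
            total += -0.5 * z * z - math.log(scale) - 0.5 * math.log(2 * math.pi)
    return total

# Hypothetical toy problem: 2 users x 2 items, latent_dim = 1.
R = [[2.0, 4.0], [1.0, 999.0]]  # 999.0 stands in for an unknown rating
mask = [[1, 1], [1, 0]]
U = [[1.0], [0.5]]
V = [[2.0], [4.0]]

ll = masked_log_likelihood(R, mask, U, V)
R[1][1] = -5.0  # change the unknown entry
ll_changed = masked_log_likelihood(R, mask, U, V)
print(ll == ll_changed)  # the masked entry never enters the sum
```

Whatever value sits in a masked position of R should have no effect on the result, which is the behaviour I want from the Edward model.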

My attempt so far:

import edward as ed
import numpy as np
import tensorflow as tf

R_    = np.array([[1,3,3,4,5],
                  [1,2,4,3,5],
                  [5,3,2,5,1]])
mask_ = np.array([[1,1,1,1,1],
                  [1,0,0,0,0],
                  [1,1,1,1,1]])

n_users, n_items = R_.shape
latent_dim = 2

U = ed.models.Normal(loc=tf.zeros([n_users, latent_dim]), scale=tf.ones([n_users, latent_dim]), name="user_matrix")
V = ed.models.Normal(loc=tf.zeros([n_items, latent_dim]), scale=tf.ones([n_items, latent_dim]), name="item_matrix")
R = ed.models.Normal(loc=tf.matmul(U, V, transpose_b=True), scale=0.1 * tf.ones([n_users, n_items]))

mask = tf.placeholder(tf.bool, shape=mask_.shape)
R_masked = tf.boolean_mask(R, mask)

qU = ed.models.Normal(loc=tf.Variable(tf.random_normal([n_users, latent_dim])),
                      scale=tf.nn.softplus(tf.Variable(tf.random_normal([n_users, latent_dim]))))
qV = ed.models.Normal(loc=tf.Variable(tf.random_normal([n_items, latent_dim])),
                      scale=tf.nn.softplus(tf.Variable(tf.random_normal([n_items, latent_dim]))))

sess = ed.get_session()
tf.global_variables_initializer().run()
inference = ed.KLqp({U: qU, V: qV}, data={R_masked: R_[np.where(mask_)]})
# inference = ed.KLqp({U: qU, V: qV}, data={R: R_}) # <- btw this works, but it's not what I want.
inference.run(n_iter=1000, n_samples=5)

This runs, but it doesn’t learn anything. Any idea what I’m doing wrong? Thanks.


#2

Oh hold on, here’s an idea. I can just put a very high variance on the masked entries of R, right?

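# Use the NumPy mask (mask_), cast to float32 to match loc: scale is 100.1 where mask_ == 0 and 0.1 where mask_ == 1.
R = ed.models.Normal(loc=tf.matmul(U, V, transpose_b=True), scale=(100 * (1 - mask_) + 0.1).astype(np.float32))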

This seems to work. Please let me know if this isn’t right!
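A quick sanity check of why the high-variance trick behaves like a mask (a minimal sketch in plain Python, not Edward code; the rating and prediction values are made up). The gradient of a Normal log-density with respect to its mean is (x - loc) / scale^2, so an entry with scale 100.1 pulls on U and V roughly a million times less than an entry with scale 0.1.

```python
def grad_wrt_loc(x, loc, scale):
    """Derivative of log Normal(x | loc, scale) with respect to loc."""
    return (x - loc) / (scale * scale)

observed_scale = 100 * (1 - 1) + 0.1  # mask entry 1 -> scale 0.1
masked_scale = 100 * (1 - 0) + 0.1    # mask entry 0 -> scale 100.1

rating, prediction = 4.0, 2.0  # hypothetical values for a single entry

g_obs = grad_wrt_loc(rating, prediction, observed_scale)
g_masked = grad_wrt_loc(rating, prediction, masked_scale)
print(g_obs, g_masked, g_masked / g_obs)
```

So this is only an approximate mask: the masked entries still contribute a tiny, nonzero gradient, and whatever values are stored at those positions in R_ still have to be finite numbers.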