Matrix factorization - recovering latent factors


#1

Hi,

I am new to Edward and I’m trying to implement a matrix factorization model, say R = U’V. I used the matrix factorization example on GitHub, and I get a reasonable prediction error on the matrix R.

However, when I compare the inferred U and V with the ones used to generate the data, the error can get very large. I know that this model is symmetric and the latent factors can “rotate”, but I can’t figure out how to handle this properly.
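To make the symmetry concrete, here is a minimal NumPy sketch (not Edward code). It assumes U is K x N and V is K x M, so that R = U’V is N x M; the sizes and the matrix A are arbitrary values chosen for illustration. For any invertible K x K matrix A, the pair (A’U, inv(A)V) reproduces R exactly, so the ambiguity is wider than rotations or permutations.

```python
import numpy as np

rng = np.random.default_rng(0)
K, N, M = 2, 5, 4  # latent dimension, rows of R, columns of R (illustrative sizes)

U = rng.normal(size=(K, N))
V = rng.normal(size=(K, M))
R = U.T @ V

# An arbitrary invertible K x K matrix (determinant is 1, so it is invertible).
# It is neither a rotation nor a permutation.
A = np.array([[2.0, 1.0],
              [0.0, 0.5]])

# Transformed factors: (A'U)' (inv(A) V) = U' A inv(A) V = U' V
U2 = A.T @ U
V2 = np.linalg.inv(A) @ V
R2 = U2.T @ V2

print("same R:", np.allclose(R, R2))
print("same U:", np.allclose(U, U2))
```

The two factorizations agree on R but not on U, which matches the behaviour described above.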

The Matchbox documentation states that this can be solved by fixing the upper square of U or V to the identity matrix, making all columns different and thus breaking the symmetry. I tried generating synthetic data this way and putting an arbitrarily strong “identity prior” on the upper square of U or V, but it didn’t seem to help recover the latent factors. Note that I tried with a latent space of dimension 2 and I couldn’t recover the latent factors by permuting them, so they don’t just “rotate”.
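As a point of comparison, the identity constraint can also be imposed after inference instead of through a prior. The sketch below is plain NumPy, not Edward or Matchbox code, and `fix_upper_square` is a hypothetical helper name. It assumes U is K x N with R = U’V, that the leading K x K block of the inferred U is invertible, and that the generating U really had the identity in that block. Here `U_hat` and `V_hat` stand in for inferred factors and are simulated by transforming the true ones.

```python
import numpy as np

def fix_upper_square(U, V):
    """Re-express a factorization R = U'V so that the first K columns of U
    (the upper K x K square of U') equal the identity. R is unchanged.
    Hypothetical helper; assumes the leading K x K block of U is invertible."""
    K = U.shape[0]
    B = U[:, :K]                      # leading K x K block of U
    U_fixed = np.linalg.solve(B, U)   # inv(B) @ U, so U_fixed[:, :K] == I
    V_fixed = B.T @ V                 # compensates so that U_fixed' V_fixed == U' V
    return U_fixed, V_fixed

rng = np.random.default_rng(0)
K, N, M = 2, 6, 5  # illustrative sizes

# Synthetic ground truth generated with the identity in the leading block of U.
U_true = np.hstack([np.eye(K), rng.normal(size=(K, N - K))])
V_true = rng.normal(size=(K, M))
R = U_true.T @ V_true

# Simulated "inferred" factors: a different but equally valid factorization of R.
A = np.array([[2.0, 1.0],
              [0.0, 0.5]])
U_hat = A.T @ U_true
V_hat = np.linalg.inv(A) @ V_true

U_fixed, V_fixed = fix_upper_square(U_hat, V_hat)
print("R preserved:", np.allclose(U_fixed.T @ V_fixed, R))
print("U recovered:", np.allclose(U_fixed, U_true))
print("V recovered:", np.allclose(V_fixed, V_true))
```

With exact factors this recovers the generating U and V. With noisy posterior means the recovery would only be approximate, and it would be sensitive to how well the leading block is estimated.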

Could someone help me either with “fixing” values of latent variables in Edward to achieve this symmetry breaking, or by pointing me towards another way to recover these latent factors? Any contribution will be much appreciated.

Thank you!


#2

Here is an open Colab notebook with my situation. You can see that:

  • the inferred R is correct,
  • the inferred U and V aren’t correct,
  • the product of the inferred U and V gives a correct R.

The inferred U and V are just one valid solution among others. Is there a way to make sure inference converges not just to a valid U and V, but to the ones that were used to generate the data?
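For evaluation on synthetic data, one possible check is to align the inferred factors to the generating ones before computing the error. The sketch below is plain NumPy, not Edward code, and `align_to_truth` is a hypothetical helper name. It assumes U is K x N with R = U’V, fits a K x K matrix T by least squares so that T U_hat is close to U_true, and applies the compensating transform to V_hat so that the product is unchanged. It does not change what inference converges to; it only tests whether the mismatch is explained by an invertible linear transformation.

```python
import numpy as np

def align_to_truth(U_hat, V_hat, U_true):
    """Find T minimizing ||T @ U_hat - U_true||_F, then return the aligned pair
    (T @ U_hat, inv(T') @ V_hat), which has the same product as (U_hat, V_hat).
    Hypothetical helper; assumes U_hat has full row rank and T is invertible."""
    # Solve U_hat' @ T' ~= U_true' for T' in the least-squares sense.
    T_transposed, _, _, _ = np.linalg.lstsq(U_hat.T, U_true.T, rcond=None)
    T = T_transposed.T
    U_aligned = T @ U_hat
    V_aligned = np.linalg.solve(T.T, V_hat)  # inv(T') @ V_hat
    return U_aligned, V_aligned

rng = np.random.default_rng(1)
K, N, M = 2, 6, 5  # illustrative sizes

U_true = rng.normal(size=(K, N))
V_true = rng.normal(size=(K, M))

# Simulated "inferred" factors: valid for R but different from the true ones.
A = np.array([[2.0, 1.0],
              [0.0, 0.5]])
U_hat = A.T @ U_true
V_hat = np.linalg.inv(A) @ V_true

U_aligned, V_aligned = align_to_truth(U_hat, V_hat, U_true)
print("error before alignment:", np.abs(U_hat - U_true).max())
print("error after alignment: ", np.abs(U_aligned - U_true).max())
```

If the error after alignment is small for both U and V, the inferred factors are equivalent to the generating ones up to the symmetry of the model.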

Thanks a lot!