Hi All,
I’m trying to implement an LDA-like mixed membership model in Edward. I’ve seen several people attempt to run Gibbs sampling and KLqp on the version of LDA described in Dustin’s ICLR 2017 paper, and others have tried the same with their own versions of LDA. All of these examples use toy data, and no one claims to have obtained good results. Earlier this year it was suggested that `ParamMixture` should be used to implement LDA, so maybe that is the way to go. However, I’m stuck on how to embed `ParamMixture` within the structure of a mixed membership model. It seems straightforward at first (naively, just put the call to `ParamMixture` inside a loop over your corpus), but things break down when you start calling Edward’s inference methods (e.g. `ed.Gibbs()`). Below is my attempt at setting up the model.
```python
import tensorflow as tf
from edward.models import Categorical, Dirichlet, ParamMixture

# D: number of documents, K: number of topics, V: vocabulary size,
# N[d]: number of words in document d (defined elsewhere for my corpus)
alpha = tf.ones(K) * 0.1   # document-topic Dirichlet prior
eta = tf.ones(V) * 0.01    # topic-word Dirichlet prior

theta = Dirichlet(concentration=alpha, sample_shape=D)  # per-document topic proportions
beta = Dirichlet(concentration=eta, sample_shape=K)     # per-topic word distributions

W = [None] * D
for d in range(D):
    W[d] = ParamMixture(mixing_weights=tf.gather(theta, d),
                        component_params={'probs': beta},
                        component_dist=Categorical,
                        sample_shape=N[d],
                        validate_args=True)
```
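For reference, here is the generative process I believe the snippet above encodes, written as a plain NumPy sampler. The dimensions (`D`, `K`, `V`, `N`) are hypothetical toy values; I’ve found a sampler like this useful for generating toy corpora to sanity-check inference against:

```python
import numpy as np

rng = np.random.default_rng(0)

D, K, V = 4, 3, 20        # toy corpus: documents, topics, vocabulary size
N = [50, 40, 60, 30]      # hypothetical words-per-document counts
alpha, eta = 0.1, 0.01    # symmetric Dirichlet hyperparameters, as above

theta = rng.dirichlet(np.full(K, alpha), size=D)  # D x K doc-topic proportions
beta = rng.dirichlet(np.full(V, eta), size=K)     # K x V topic-word distributions

docs = []
for d in range(D):
    z = rng.choice(K, size=N[d], p=theta[d])             # topic for each word
    w = np.array([rng.choice(V, p=beta[k]) for k in z])  # word draw per topic
    docs.append(w)
```

If the Edward model is set up correctly, samples drawn from it should match the shapes and support of a corpus generated this way.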
I’ve also seen claims that mixed membership models scalable to real data would have to wait until conjugacy had been fully integrated into Edward via `ed.complete_conditional()`. Correct me if I’m wrong, but it seems like this function has already been folded into Gibbs sampling; otherwise, I’m not sure how I would leverage it to get my models off the ground.
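For context on what the conjugacy machinery buys you here: in this model the complete conditional for each topic assignment has a closed form, and the resulting collapsed Gibbs updates are only a few lines of NumPy. This is a sketch on hypothetical toy data (not Edward code), just to show the updates I’d hope `ed.complete_conditional()` could derive automatically:

```python
import numpy as np

rng = np.random.default_rng(1)

# hypothetical toy corpus: one array of word indices per document
docs = [rng.integers(0, 20, size=n) for n in (50, 40, 60)]
D, K, V = len(docs), 3, 20
alpha, eta = 0.1, 0.01

# count matrices and random initial topic assignments
ndk = np.zeros((D, K))   # document-topic counts
nkv = np.zeros((K, V))   # topic-word counts
nk = np.zeros(K)         # total words per topic
z = [rng.integers(0, K, size=len(w)) for w in docs]
for d, w in enumerate(docs):
    for i, v in enumerate(w):
        k = z[d][i]
        ndk[d, k] += 1; nkv[k, v] += 1; nk[k] += 1

for sweep in range(50):
    for d, w in enumerate(docs):
        for i, v in enumerate(w):
            k = z[d][i]  # remove this word's current assignment
            ndk[d, k] -= 1; nkv[k, v] -= 1; nk[k] -= 1
            # complete conditional p(z_i = k | rest), from Dirichlet-Categorical conjugacy
            p = (ndk[d] + alpha) * (nkv[:, v] + eta) / (nk + V * eta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k  # resample and restore counts
            ndk[d, k] += 1; nkv[k, v] += 1; nk[k] += 1
```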
Any help or guidance on getting LDA up and running would be much appreciated.