DPM model for clustering

Dear Edward users,

I’m trying to extend the model for “unsupervised learning by clustering data points” in the Mixture models tutorial (http://edwardlib.org/tutorials/unsupervised). My intention is to make the number of components a parameter of the model and to do inference on it, in the same way as @AustinRochford did in this pymc3 example: https://pymc-devs.github.io/pymc3/notebooks/dp_mix.html

So, I added some new lines to the code (the complete code is below) in order to incorporate the component weights as random variables in the model:

alpha = Gamma(1.0, 1.0, sample_shape=K)
p = Dirichlet(alpha, sample_shape=N)
cat = Categorical(p=p)

The problem is that I don’t know how to plot a histogram of the distribution over the number of components. Could you help me, please?

Cheers.

--------------------------- The complete code ----------------------

#!/usr/bin/env python
"""Dirichlet process for unsupervised classification

We try to reproduce the example in the unsupervised notebook,
adding a Dirichlet process:

https://github.com/blei-lab/edward/blob/master/notebooks/unsupervised.ipynb
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import edward as ed
import matplotlib.pyplot as plt
import matplotlib.cm as cm
import numpy as np
import six
import tensorflow as tf

from edward.models import Categorical, Dirichlet, InverseGamma, \
    MultivariateNormalDiag, Normal, ParamMixture, \
    Gamma, Mixture

sess = tf.Session()

def build_toy_dataset(N):
  pi = np.array([0.4, 0.6])
  mus = [[1, 1], [-1, -1]]
  stds = [[0.1, 0.1], [0.1, 0.1]]
  x = np.zeros((N, 2), dtype=np.float32)
  for n in range(N):
    k = np.argmax(np.random.multinomial(1, pi))
    x[n, :] = np.random.multivariate_normal(mus[k], np.diag(stds[k]))
  return x

N = 500  # number of data points
K = 2  # number of components
D = 2  # dimensionality of data
ed.set_seed(42)

x_train = build_toy_dataset(N)

plt.scatter(x_train[:, 0], x_train[:, 1])
plt.axis([-3, 3, -3, 3])
plt.title("Simulated dataset")
plt.show()

# -----
# Model
# -----
#
K = 10  # maximum number of components
mu = Normal(mu=tf.zeros([K, D]), sigma=tf.ones([K, D]))
sigma = InverseGamma(alpha=tf.ones([K, D]), beta=tf.ones([K, D]))

alpha = Gamma(1.0, 1.0, sample_shape=K)
p = Dirichlet(alpha, sample_shape=N)
cat = Categorical(p=p)

components = [
    MultivariateNormalDiag(mu=tf.ones([N, 1]) * mu[k],
                           diag_stdev=tf.ones([N, 1]) * sigma[k])
    for k in range(K)]

x = Mixture(cat=cat, components=components)

# Alternatively, it would also be possible to write
#
# cat = Categorical(p=p, sample_shape=N)
# x = Normal(mu=tf.gather(mu, cat), sigma=tf.gather(sigma, cat))


# ---------
# Inference
# ---------
#
qmu = Normal(
    mu=tf.Variable(tf.random_normal([K, D])),
    sigma=tf.nn.softplus(tf.Variable(tf.zeros([K, D]))))
qsigma = InverseGamma(
    alpha=tf.nn.softplus(tf.Variable(tf.random_normal([K, D]))),
    beta=tf.nn.softplus(tf.Variable(tf.random_normal([K, D]))))

inference = ed.KLqp({mu: qmu, sigma: qsigma}, data={x: x_train})
inference.initialize(n_samples=20, n_iter=4000)

sess = ed.get_session()

tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)
  t = info_dict['t']
  if t % inference.n_print == 0:
    print("\nInferred cluster means:")
    print(sess.run(qmu.mean()))

# ---------
# Criticism
# ---------
#
# Calculate likelihood for each data point and cluster assignment,
# averaged over many posterior samples. ``x_post`` has shape (N, 100, K, D).
mu_sample = qmu.sample(100)
sigma_sample = qsigma.sample(100)
x_post = Normal(mu=tf.ones([N, 1, 1, 1]) * mu_sample,
                sigma=tf.ones([N, 1, 1, 1]) * sigma_sample)
x_broadcasted = tf.tile(tf.reshape(x_train, [N, 1, 1, D]), [1, 100, K, 1])

# Sum over latent dimension, then average over posterior samples.
# ``log_liks`` ends up with shape (N, K).
log_liks = x_post.log_prob(x_broadcasted)
log_liks = tf.reduce_sum(log_liks, 3)
log_liks = tf.reduce_mean(log_liks, 1)

clusters = tf.argmax(log_liks, 1).eval()

plt.scatter(x_train[:, 0], x_train[:, 1], c=clusters, cmap=cm.bwr)
plt.axis([-3, 3, -3, 3])
plt.title("Predicted cluster assignments")
plt.show()

Can you elaborate on what you mean by “histogram of the distribution over the number of components”?

After inference, you can plot the posterior distribution of p (say, with some approximating distribution qp; note you still need to add qp to the code snippet, since at the moment you’ve only defined a prior). This will show how probability mass is placed over the simplex.
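
Something along these lines could work as a starting point (a rough sketch only; it assumes a single shared weight vector p = Dirichlet(alpha) of shape [K], not the per-data-point p with sample_shape=N in your snippet, and the names qp and p_samples are just illustrative):

# Sketch (assumes p is a single Dirichlet of shape [K]; qp is illustrative, not from the thread).
# Approximate the posterior over the mixture weights with a Dirichlet whose
# concentration parameters are learned.
qp = Dirichlet(tf.nn.softplus(tf.Variable(tf.ones(K))))

# ...then include {p: qp} among the latent variables passed to ed.KLqp...

# After inference, sample qp and plot the average weight of each component.
p_samples = qp.sample(1000).eval()  # shape (1000, K)
plt.bar(np.arange(K), p_samples.mean(axis=0))
plt.xlabel("Component")
plt.ylabel("Posterior expected mixture weight")
plt.show()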

Dear Dustin, I’m so sorry for the huge delay. Please let me show you some code illustrating what I want. It is based on this post, https://pymc-devs.github.io/pymc3/notebooks/dp_mix.html, which builds a model in pymc3 (outputs 18, 19 and 20) using the stick-breaking construction. The stick_breaking function is

def stick_breaking(beta):
    portion_remaining = tt.concatenate([[1], tt.extra_ops.cumprod(1 - beta)[:-1]])
    return beta * portion_remaining

and the model

with pm.Model() as model:
    alpha = pm.Gamma('alpha', 1., 1.)
    beta = pm.Beta('beta', 1., alpha, shape=K)
    w = pm.Deterministic('w', stick_breaking(beta))
   #
    tau = pm.Gamma('tau', 1., 1., shape=K)
    lambda_ = pm.Uniform('lambda', 0, 5, shape=K)
    mu = pm.Normal('mu', 0, tau=lambda_ * tau, shape=K)
    obs = pm.NormalMixture('obs', w, mu, tau=lambda_ * tau,
                           observed=old_faithful_df.std_waiting.values)

I want to rewrite this code in Edward and plot the posterior expected mixture weight of each component (output 23):

fig, ax = plt.subplots(figsize=(8, 6))
plot_w = np.arange(K) + 1

ax.bar(plot_w - 0.5, trace['w'].mean(axis=0), width=1., lw=0);
ax.set_xlim(0.5, K);
ax.set_xlabel('Component');
ax.set_ylabel('Posterior expected mixture weight');

I’ve translated the previous code to Edward using your example in https://github.com/blei-lab/edward/blob/master/examples/mixture_gaussian_mh.py, replacing pi ~ Dirichlet with the following:

# pi = Dirichlet(concentration=tf.constant([1.0] * K))
alpha = Gamma(concentration=2.0, rate=2.0, name="alpha")
beta = Beta(concentration1=1.0, concentration0=alpha, sample_shape=K, name="beta")
pi = tf.concat([tf.reshape(beta[0], [1]),
                tf.reshape(tf.multiply(beta[1:], tf.cumprod(1 - beta[:-1])), [K - 1])], 0)

mu = Normal(loc=tf.zeros([K, D]), scale=tf.ones([K, D]))
sigma = InverseGamma(concentration=tf.ones([K, D]), rate=tf.ones([K, D]))
c = Categorical(logits=tf.tile(tf.reshape(ed.logit(pi), [1, K]), [N, 1]))
x = Normal(loc=tf.gather(mu, c), scale=tf.gather(sigma, c))

Then the inference is

T = 5000
# qpi = Empirical(params=tf.Variable(tf.ones([T, K]) / K))
qalpha = Empirical(params=tf.Variable(tf.ones([T])))
qbeta = Empirical(params=tf.Variable(tf.ones([T,K])))

qmu = Empirical(params=tf.Variable(tf.zeros([T, K, D])))
qsigma = Empirical(params=tf.Variable(tf.ones([T, K, D])))
qc = Empirical(params=tf.Variable(tf.zeros([T, N], dtype=tf.int32)))

# gpi = Dirichlet(concentration=tf.constant([1.4, 1.6]))
galpha = Gamma(concentration=1.0, rate=1.0)
gbeta = Beta(concentration1=1.0, concentration0=1.0, sample_shape=K)

gmu = Normal(loc=tf.constant([[1.0, 1.0], [-1.0, -1.0]]),
             scale=tf.constant([[0.5, 0.5], [0.5, 0.5]]))
gsigma = InverseGamma(concentration=tf.constant([[1.1, 1.1], [1.1, 1.1]]),
                      rate=tf.constant([[1.0, 1.0], [1.0, 1.0]]))
gc = Categorical(logits=tf.zeros([N, K]))

inference = ed.MetropolisHastings(
    latent_vars={alpha: qalpha, beta: qbeta, mu: qmu, sigma: qsigma, c: qc},
    proposal_vars={alpha: galpha, beta: gbeta, mu: gmu, sigma: gsigma, c: gc},
    data={x: x_data})

but I get the following error:

[angel:~/ownCloud … local-density/Edward] [tensorflow] 2s 1 $ python GausianMixtures2DMetropolis.py 
Traceback (most recent call last):
  File "GausianMixtures2DMetropolis.py", line 282, in <module>
    info_dict = inference.update()
  File "/home/angel/src/edward/edward/inferences/monte_carlo.py", line 130, in update
    _, accept_rate = sess.run([self.train, self.n_accept_over_t], feed_dict)
  File "/home/angel/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 778, in run
    run_metadata_ptr)
  File "/home/angel/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 982, in _run
    feed_dict_string, options, run_metadata)
  File "/home/angel/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1032, in _do_run
    target_list, options, run_metadata)
  File "/home/angel/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1052, in _do_call
    raise type(e)(node_def, op, message)
tensorflow.python.framework.errors_impl.InvalidArgumentError: assertion failed: [] [Condition x > 0 did not hold element-wise: x = ] [concat:0] [1 0]
         [[Node: inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/Switch, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_0, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_1, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_2, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/Switch_1)]]

Caused by op u'inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert', defined at:
  File "GausianMixtures2DMetropolis.py", line 276, in <module>
    inference.initialize()
  File "/home/angel/src/edward/edward/inferences/monte_carlo.py", line 98, in initialize
    self.train = self.build_update()
  File "/home/angel/src/edward/edward/inferences/metropolis_hastings.py", line 119, in build_update
    zold = copy(z, dict_swap_old, scope=scope_old)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 228, in copy
    kwargs[key] = copy_default(value, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 84, in copy_default
    x = copy(x, *args, **kwargs)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 308, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 308, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 308, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 308, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 298, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 308, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 244, in copy
    new_op = copy(op, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 298, in copy
    elem = copy(x, dict_swap, scope, True, copy_q)
  File "/home/angel/src/edward/edward/util/random_variables.py", line 290, in copy
    op_def)
  File "/home/angel/anaconda2/envs/tensorflow/lib/python2.7/site-packages/tensorflow/python/framework/ops.py", line 1228, in __init__
    self._traceback = _extract_stack()

InvalidArgumentError (see above for traceback): assertion failed: [] [Condition x > 0 did not hold element-wise: x = ] [concat:0] [1 0]
         [[Node: inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert = Assert[T=[DT_STRING, DT_STRING, DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/cpu:0"](inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/Switch, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_0, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_1, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/data_2, inference_140093265818960/old/assert_positive/assert_less/Assert/AssertGuard/Assert/Switch_1)]]

That’s a pretty nasty error! Try removing key-value pairs in latent_vars and proposal_vars until you can diagnose which variable’s update in particular is throwing that error.
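
For instance (just as an illustration using the variables already defined in your snippet), you could start with mu and sigma only and add the other pairs back one at a time:

# Illustration only: run inference over mu and sigma first, leaving the other
# latent variables out, to see whether the assertion still fires.
inference = ed.MetropolisHastings(
    latent_vars={mu: qmu, sigma: qsigma},
    proposal_vars={mu: gmu, sigma: gsigma},
    data={x: x_data})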

One tip: You can simplify this line

c = Categorical(logits=tf.tile(tf.reshape(ed.logit(pi), [1, K]), [N, 1]))

to be

c = Categorical(probs=pi, sample_shape=N)

which will still produce a random variable of shape [N]. (Note they mean slightly different things: the latter is a single Categorical variable sampled N times; the former is N different Categorical variables, each with the same parameters and each sampled once.) The same goes for other random variables where you use broadcasting instead of sample_shape.
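
A quick way to convince yourself (an illustrative check, not from the original code; it reuses pi, K and N as defined above):

# Illustrative shape check: both constructions have value shape (N,).
c_broadcast = Categorical(logits=tf.tile(tf.reshape(ed.logit(pi), [1, K]), [N, 1]))  # N variables
c_shared = Categorical(probs=pi, sample_shape=N)                                     # 1 variable, N samples
print(c_broadcast.value().shape, c_shared.value().shape)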

Thanks @dustin! I understand your final note in the previous message.
I’ve rewritten the code and it works (it runs the computations) … well, not properly: the posterior mean does not change from (0, 0). I need to spend some time here to understand what happens (maybe the hyperpriors need to be reconfigured).
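
One diagnostic worth adding (a small sketch, not part of the run below) is to track the Metropolis-Hastings acceptance rate that inference.update() reports; if it stays near zero, the proposals are almost never accepted and the chain never moves from its initial values:

# Sketch: monitor the MH acceptance rate reported by inference.update().
accept_rates = []
for _ in range(inference.n_iter):
  info_dict = inference.update()
  accept_rates.append(info_dict['accept_rate'])
print("final acceptance rate:", accept_rates[-1])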

#!/usr/bin/env python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import edward as ed
import numpy as np
import six
import tensorflow as tf
import pandas
import matplotlib.pyplot as plt

from edward.models import \
    Categorical, Dirichlet, Empirical, InverseGamma, Normal, ParamMixture, \
    MultivariateNormalDiag, Mixture, Gamma, Beta
from scipy.stats import norm


def build_toy_dataset(N):
  pi = np.array([0.4, 0.6])
  mus = [[1, 1], [-1, -1]]
  stds = [[0.1, 0.1], [0.1, 0.1]]
  x = np.zeros((N, 2))
  for n in range(N):
    k = np.argmax(np.random.multinomial(1, pi))
    x[n, :] = np.random.multivariate_normal(mus[k], np.diag(stds[k]))
  return x


N = 500  # number of data points
K = 5  # number of components
D = 2  # dimensionality of data
ed.set_seed(42)

# DATASET
# -------
x_data = build_toy_dataset(N)

# MODEL
# -----
# pi = Dirichlet(concentration=tf.ones(K)/K)
alpha = Gamma(concentration=2.0, rate=2.0, name="alpha")
beta = Beta(concentration1=1.0, concentration0=alpha, sample_shape=K, name="beta")
pi = tf.concat([tf.reshape(beta[0], [1]),
                tf.reshape(tf.multiply(beta[1:], tf.cumprod(1 - beta[:-1])), [K - 1])], 0)

mu = Normal(loc=tf.zeros(D), scale=tf.ones(D), sample_shape=K)
sigma = InverseGamma(concentration=tf.ones([D]), rate=tf.ones([D]), sample_shape=K)
c = Categorical(probs=pi, sample_shape=N)
x = Normal(loc=tf.gather(mu, c), scale=tf.gather(sigma, c))

# INFERENCE
# ---------
T = 5000
# qpi = Empirical(params=tf.Variable(tf.ones([T, K]) / K))
qalpha = Empirical(params=tf.Variable(tf.ones([T])))
qbeta = Empirical(params=tf.Variable(tf.ones([T,K])))

qmu = Empirical(params=tf.Variable(tf.zeros([T, K, D])))
qsigma = Empirical(params=tf.Variable(tf.ones([T, K, D])))
qc = Empirical(params=tf.Variable(tf.zeros([T, N], dtype=tf.int32)))

# gpi = Dirichlet(concentration=tf.constant([2.0]*K))
galpha = Gamma(concentration=2.0, rate=2.0)
gbeta = Beta(concentration1=1.0, concentration0=1.0, sample_shape=K)

gmu = Normal(loc=tf.ones([K,D]), scale=tf.ones([K,D]))
gsigma = InverseGamma(concentration=tf.ones([K,D]), rate=tf.ones([K,D]))
gc = Categorical(logits=tf.zeros([N, K]))

inference = ed.MetropolisHastings(
    latent_vars={alpha: qalpha, beta: qbeta, mu: qmu, sigma: qsigma, c: qc},
    proposal_vars={alpha: galpha, beta: gbeta, mu: gmu, sigma: gsigma, c: gc},
    data={x: x_data})

inference.initialize()

sess = ed.get_session()
tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)
  t = info_dict['t']
  if t == 1 or t % inference.n_print == 0:
    qalpha_mean, qbeta_mean, qmu_mean = sess.run([qalpha.mean(), qbeta.mean(), qmu.mean()])
    print("")
    print("Inferred alpha mean:")
    print(qalpha_mean)
    print("")
    print("Inferred beta mean:")
    print(qbeta_mean)
    print("Inferred cluster means:")
    print(qmu_mean)

Dear @dustin and Edward users. Finally, the code below works, i.e., a DPM for a mixture of two-dimensional Gaussians using the stick-breaking method. I had some issues with the stick-breaking function, but I’ve checked the construction using TensorFlow:

K = 5  # number of components to test
alpha = Gamma(concentration=6.0, rate=1.0)
beta = Beta(concentration1=1.0, concentration0=tf.ones(K) * alpha)
pi = tf.concat([tf.reshape(beta[0], [1]),
                tf.reshape(tf.multiply(beta[1:], tf.cumprod(1 - beta[:-1])), [K - 1])], 0)

and using the code by Austin Rochford:

from theano import tensor as tt
def stick_breaking(beta):
    portion_remaining = tt.concatenate([[1], tt.extra_ops.cumprod(1 - beta)[:-1]])
    return beta * portion_remaining

obtaining the same results. Furthermore, you can extend the stick-breaking construction to a different sample_shape for beta:

K = 5  # number of components to test
alpha = Gamma(concentration=6.0, rate=1.0)
beta = Beta(concentration1=1.0, concentration0=tf.ones(K) * alpha, sample_shape=N)
pi = tf.concat([tf.reshape(beta[:, 0], [N, 1]),
                beta[:, 1:] * tf.cumprod(1 - beta[:, :-1], axis=1)], 1)
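
As a quick sanity check (not part of the model itself), the TensorFlow construction can be compared against a plain NumPy version of stick breaking; beta_np below is an arbitrary test vector, not model output:

import edward as ed
import numpy as np
import tensorflow as tf

beta_np = np.array([0.5, 0.3, 0.8, 0.2, 0.6], dtype=np.float32)

# NumPy version (mirrors the theano stick_breaking above).
w_numpy = beta_np * np.concatenate([[1.0], np.cumprod(1 - beta_np)[:-1]])

# TensorFlow version (same construction as pi above).
beta_tf = tf.constant(beta_np)
w_tf = tf.concat([tf.reshape(beta_tf[0], [1]),
                  beta_tf[1:] * tf.cumprod(1 - beta_tf[:-1])], 0)

print(w_numpy)
print(ed.get_session().run(w_tf))  # should match w_numpy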

So, here is the code:

#!/usr/bin/env python
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import edward as ed
import numpy as np
import tensorflow as tf

# import matplotlib.pyplot as plt
# import matplotlib.cm as cm
# plt.style.use('ggplot')

# import seaborn as sns
# sns.set(color_codes=True)

from edward.models import \
    Categorical, Empirical, InverseGamma, Normal, Gamma, Beta
from scipy.stats import norm

def build_toy_dataset(N):
  pi = np.array([0.4, 0.6])
  mus = [[1, 1], [-1, -1]]
  stds = [[0.1, 0.1], [0.1, 0.1]]
  x = np.zeros((N, 2))
  for n in range(N):
    k = np.argmax(np.random.multinomial(1, pi))
    x[n, :] = np.random.multivariate_normal(mus[k], np.diag(stds[k]))
  return x

# DATASET
# -------
N = 500  # number of data points
x_data = build_toy_dataset(N)
N, D = x_data.shape

# MODEL
# -----
K = 5  # number of components to test
# pi = Dirichlet(concentration=tf.ones(K)/K)
alpha = Gamma(concentration=6.0, rate=1.0)
beta = Beta(concentration1=1.0, concentration0=tf.ones(K) * alpha)
pi = tf.concat([tf.reshape(beta[0], [1]),
                tf.reshape(tf.multiply(beta[1:], tf.cumprod(1 - beta[:-1])), [K - 1])], 0)

mu = Normal(loc=tf.zeros(D), scale=tf.ones(D), sample_shape=K)
sigma = InverseGamma(concentration=tf.ones([D]), rate=tf.ones([D]), sample_shape=K)
c = Categorical(probs=pi, sample_shape=N)
x = Normal(loc=tf.gather(mu, c), scale=tf.gather(sigma, c))

# INFERENCE
# ---------
T = 10000
# qpi = Empirical(params=tf.Variable(tf.ones([T, K]) / K))
qalpha = Empirical(params=tf.Variable(tf.ones(T)))
# qbeta = Empirical(params=tf.Variable(tf.ones([T,K])))

qmu = Empirical(params=tf.Variable(tf.zeros([T, K, D])))
qsigma = Empirical(params=tf.Variable(tf.ones([T, K, D])))
qc = Empirical(params=tf.Variable(tf.zeros([T, N], dtype=tf.int32)))

# gpi = Dirichlet(concentration=tf.constant([2.0]*K))
galpha = Gamma(concentration=6.0, rate=2.0)
gbeta = Beta(concentration1=2.0, concentration0=1.0, sample_shape=K)

gmu = Normal(loc=tf.zeros(D), scale=tf.ones(D), sample_shape=K)
gsigma = InverseGamma(concentration=tf.ones(D), rate=tf.ones(D), sample_shape=K)
gc = Categorical(probs=tf.ones(K) / K, sample_shape=N)

inference = ed.MetropolisHastings(
    latent_vars={alpha: qalpha, mu: qmu, sigma: qsigma, c: qc},
    proposal_vars={alpha: galpha, mu: gmu, sigma: gsigma, c: gc},
    data={x: x_data})

inference.initialize()

sess = ed.get_session()
tf.global_variables_initializer().run()

for _ in range(inference.n_iter):
  info_dict = inference.update()
  inference.print_progress(info_dict)
  t = info_dict['t']
  if t == 1 or t % inference.n_print == 0:
    qalpha_mean, qmu_mean = sess.run([qalpha.mean(), qmu.mean()])
    print("")
    print("Inferred alpha mean:")
    print(qalpha_mean)
    print("")
    print("Inferred cluster means:")
    print(qmu_mean)
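
As a rough way to get something like the pymc3 plot of the posterior expected mixture weights (output 23), one could estimate the weights from the sampled assignments in qc; this is only a sketch added here, not part of the run above:

# Rough sketch (not part of the run above): approximate the posterior expected
# mixture weights from the Empirical samples of the cluster assignments qc.
import matplotlib.pyplot as plt

qc_samples = sess.run(qc.params)  # shape (T, N), integer assignments per iteration
burn_in = T // 2
counts = np.stack([np.bincount(s, minlength=K) for s in qc_samples[burn_in:]])
weights = counts.mean(axis=0) / N  # average fraction of points in each component

plt.bar(np.arange(K) + 1, weights, width=1.0)
plt.xlim(0.5, K + 0.5)
plt.xlabel("Component")
plt.ylabel("Posterior expected mixture weight (from assignments)")
plt.show()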

Dear AngelBerihuete and Edward users,

Thanks for sharing your code, AngelBerihuete! I was testing it and the performance does not seem very good: the inferred cluster means are all very close to [0, 0] for your toy data, whose ground-truth means are [1, 1] and [-1, -1]. Do you have an explanation for this? Any tips on how to improve the performance? Should, e.g., the base distribution for the means be changed? Or does it help to increase the number of samples T?

Thanks a lot!


Hi @AngelBerihuete,

Thanks for sharing your code!
Do you think there is a way to use edward.models.DirichletProcess so as not to have to decide a priori the number of components to test? Something like:

N = 1000
base = Normal(0.0, 1.0)
alpha = 1.0
mu = DirichletProcess(alpha, base)
x = Normal(loc=mu, scale=0.1, sample_shape = N)

Hi @emilemathieu!

Have you seen the post written by @dustin about your issue?

Have you made any progress with DPM and Edward?

Cheers!
