Saving Model Parameters


I am working on a variational inference problem and was wondering how I could snapshot the weights after training. I want to be able to restore and then query q, my approximating distribution, and draw samples from it. I am familiar with saving variables with tf.train.Saver. Is there a similarly easy way to save the parameters of my inference model?


hi @GhassanMakhoul | tf.train.Saver applies to variational approximations too. In your code, you should be handling your own tf variables to parameterize your inference model. After (or during) inference, you can call, e.g.,

saver = tf.train.Saver()

sess = ed.get_session()
save_path = saver.save(sess, "/tmp/posterior.ckpt")
print("Inference model saved in file: %s" % save_path)

This is the same way you would save model parameters in TensorFlow. (All extensions apply, such as saving only a subset of the inference model parameters.)
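The subset case mentioned above can be sketched roughly as follows. This is only an illustration, not part of Edward's API: the helper name `variables_under` and the scope name "inference" are assumptions, and the helper relies only on each variable exposing a `.name` attribute, as tf.Variable does.

```python
def variables_under(variables, scope):
    """Return the variables whose name lies under the given scope prefix.

    `variables` is any iterable of objects with a `.name` attribute,
    for example the list returned by tf.global_variables().
    """
    prefix = scope.rstrip("/") + "/"
    return [v for v in variables if v.name.startswith(prefix)]


# Hypothetical usage with TensorFlow 1.x, assuming the inference model's
# variables were created under tf.variable_scope("inference"):
#
#   subset = variables_under(tf.global_variables(), "inference")
#   saver = tf.train.Saver(var_list=subset)
#   saver.save(ed.get_session(), "/tmp/posterior.ckpt")
```

Passing the filtered list as `var_list` means the checkpoint holds only those variables, so anything outside the scope has to be initialized separately when restoring.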

Hi, I’m trying to add model saving to the convolutional variational auto-encoder example, and I’m having some trouble. Here is my code, with the added code marked with #ADDED CODE:

#!/usr/bin/env python
"""Convolutional variational auto-encoder for binarized MNIST.

The neural networks are written with TensorFlow Slim.
"""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import edward as ed
import numpy as np
import os
import tensorflow as tf

from edward.models import Bernoulli, Normal
from edward.util import Progbar
from scipy.misc import imsave
from tensorflow.contrib import slim
from tensorflow.examples.tutorials.mnist import input_data

def generative_network(z):
  """Generative network to parameterize generative model. It takes
  latent variables as input and outputs the likelihood parameters.

  logits = neural_network(z)
  """
  with slim.arg_scope([slim.conv2d_transpose],
                      normalizer_params={'scale': True}):
    net = tf.reshape(z, [M, 1, 1, d])
    net = slim.conv2d_transpose(net, 128, 3, padding='VALID')
    net = slim.conv2d_transpose(net, 64, 5, padding='VALID')
    net = slim.conv2d_transpose(net, 32, 5, stride=2)
    net = slim.conv2d_transpose(net, 1, 5, stride=2, activation_fn=None)
    net = slim.flatten(net)
    return net

def inference_network(x):
  """Inference network to parameterize variational model. It takes
  data as input and outputs the variational parameters.

  loc, scale = neural_network(x)
  """
  with slim.arg_scope([slim.conv2d, slim.fully_connected],
                      normalizer_params={'scale': True}):
    net = tf.reshape(x, [M, 28, 28, 1])
    net = slim.conv2d(net, 32, 5, stride=2)
    net = slim.conv2d(net, 64, 5, stride=2)
    net = slim.conv2d(net, 128, 5, padding='VALID')
    net = slim.dropout(net, 0.9)
    net = slim.flatten(net)
    params = slim.fully_connected(net, d * 2, activation_fn=None)

  loc = params[:, :d]
  scale = tf.nn.softplus(params[:, d:])
  return loc, scale


M = 128  # batch size during training
d = 10  # latent dimension
DATA_DIR = "data/mnist"
IMG_DIR = "img"

if not os.path.exists(DATA_DIR):
  os.makedirs(DATA_DIR)
if not os.path.exists(IMG_DIR):
  os.makedirs(IMG_DIR)

# DATA. MNIST batches are fed at training time.
mnist = input_data.read_data_sets(DATA_DIR)

z = Normal(loc=tf.zeros([M, d]), scale=tf.ones([M, d]))
logits = generative_network(z)
x = Bernoulli(logits=logits)

x_ph = tf.placeholder(tf.int32, [M, 28 * 28])
loc, scale = inference_network(tf.cast(x_ph, tf.float32))
qz = Normal(loc=loc, scale=scale)

# Bind p(x, z) and q(z | x) to the same placeholder for x.
data = {x: x_ph}
inference = ed.KLqp({z: qz}, data)
optimizer = tf.train.AdamOptimizer(0.01, epsilon=1.0)
inference.initialize(optimizer=optimizer)

hidden_rep = tf.sigmoid(logits)


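# ADDED CODE
load_saved_model = True
model_path = r"/tmp/model_vae_edward.ckpt"
sess = ed.get_session()
saver = tf.train.Saver()
if load_saved_model:
  saver.restore(sess, model_path)
  print("Model restored.")
else:
  # Nothing to restore on a fresh run, so initialize the variables instead.
  tf.global_variables_initializer().run()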

n_epoch = 1
n_iter_per_epoch = 10
for epoch in range(n_epoch):
  avg_loss = 0.0

  pbar = Progbar(n_iter_per_epoch)
  for t in range(1, n_iter_per_epoch + 1):
    x_train, _ = mnist.train.next_batch(M)
    x_train = np.random.binomial(1, x_train)
    info_dict = inference.update(feed_dict={x_ph: x_train})
    avg_loss += info_dict['loss']

  # Print a lower bound to the average marginal likelihood for an
  # image.
  avg_loss = avg_loss / n_iter_per_epoch
  avg_loss = avg_loss / M
  print("log p(x) >= {:0.3f}".format(avg_loss))

  # Visualize hidden representations.
  imgs = hidden_rep.eval()
  for m in range(M):
    imsave(os.path.join(IMG_DIR, '%d.png') % m, imgs[m].reshape(28, 28))

  # ADDED CODE
  save_path = saver.save(sess, model_path)
  print("Model saved in file: %s" % save_path)

The model saves ok, but when I try to load the model (set load_saved_model = True), I get the following error:
NotFoundError (see above for traceback): Key optimizer_274937448/fully_connected/weights/Adam_1 not found in checkpoint

What am I doing wrong?

I am also having the same problem when trying to load a trained model using tf.train.import_meta_graph(). I get: KeyError: "The name 'Normal' refers to an Operation not in the graph." Are we meant to save only the tf.Variables and not the operations?

Here is what I’m doing:

    with tf.Session() as sess:
        # Restore variables from disk.
        saver = tf.train.import_meta_graph('./models/posterior.ckpt.meta')
        saver.restore(sess, tf.train.latest_checkpoint('./models/'))

The full error is:

Traceback (most recent call last):
  File "/home/rmason/Github/alpha-i/edward-mock-time-series-test/", line 87, in <module>
  File "/home/rmason/Github/alpha-i/edward-mock-time-series-test/", line 71, in run_testing
saver = tf.train.import_meta_graph('./models/posterior.ckpt.meta')
  File "/home/rmason/anaconda3/envs/time-series-env/lib/python3.4/site-packages/tensorflow/python/training/", line 1686, in import_meta_graph
  File "/home/rmason/anaconda3/envs/time-series-env/lib/python3.4/site-packages/tensorflow/python/framework/", line 536, in import_scoped_meta_graph
ops.prepend_name_scope(value, scope_to_prepend_to_names))
  File "/home/rmason/anaconda3/envs/time-series-env/lib/python3.4/site-packages/tensorflow/python/framework/", line 2584, in as_graph_element
return self._as_graph_element_locked(obj, allow_tensor, allow_operation)
  File "/home/rmason/anaconda3/envs/time-series-env/lib/python3.4/site-packages/tensorflow/python/framework/", line 2644, in _as_graph_element_locked
"graph." % repr(name))
KeyError: "The name 'Normal' refers to an Operation not in the graph."

@paul That looks like a bug. Restoring tries to load the Adam optimizer's parameters, which are stored under a name unique to the inference object (optimizer_274937448). That unique name is different the next time the script is run, so the key is not found in the checkpoint. I have raised an issue for this.
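One way to see this kind of mismatch is to compare the names the graph expects with the keys the checkpoint actually holds. A minimal sketch: `missing_from_checkpoint` is a hypothetical helper, and the TensorFlow calls in the usage comment assume a 1.x version that provides tf.train.list_variables.

```python
def missing_from_checkpoint(graph_var_names, checkpoint_keys):
    """Return, sorted, the graph variable names that have no checkpoint key."""
    return sorted(set(graph_var_names) - set(checkpoint_keys))


# Hypothetical usage with TensorFlow 1.x:
#
#   graph_var_names = [v.op.name for v in tf.global_variables()]
#   checkpoint_keys = [name for name, _ in tf.train.list_variables(model_path)]
#   print(missing_from_checkpoint(graph_var_names, checkpoint_keys))
```

If every missing name starts with an optimizer_... prefix, the model parameters themselves were saved and only the optimizer state fails to match.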

@rpmason I personally don’t have much experience with saving/restoring the graph itself. This is certainly worth more investigation.

@paul @rpmason Both bugs (restoring optimizer tf.Variables and importing the metagraph) are now fixed.

@dustin brilliant! Thanks a lot :).

Hi dustin, I have updated Edward to version 1.3.3, but I still hit this bug. Will a future release fix it, or do I need to modify the files manually?

The bug fix is in Edward’s development version and not in 1.3.3, so you need to install the development version to get it.

Thank you dustin. I have installed the development version. But in the example below, I still cannot load the posterior qw, though I can load the placeholder 'X'. The error message is: "The name 'qw:0' refers to a Tensor which does not exist. The operation 'qw' does not exist in the graph." Is there anything wrong in my code?

def build_toy_dataset(N, w, noise_std=0.1):
  D = len(w)
  x = np.random.randn(N, D)
  y = np.dot(x, w) + np.random.normal(0, noise_std, size=N)
  return x, y

###save all the variable

N = 40  # number of data points
D = 10  # number of features

w_true = np.random.randn(D)
X_train, y_train = build_toy_dataset(N, w_true)
X_test, y_test = build_toy_dataset(N, w_true)

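X = tf.placeholder(tf.float32, [N, D], name='X')
w = Normal(loc=tf.zeros(D), scale=tf.ones(D), name='w')
b = Normal(loc=tf.zeros(1), scale=tf.ones(1), name='b')
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N), name='y')

# Assumed: the scale and name arguments were cut off in the post; a
# softplus-transformed variable and the names 'qw'/'qb' are used here.
qw = Normal(loc=tf.Variable(tf.random_normal([D])),
            scale=tf.nn.softplus(tf.Variable(tf.random_normal([D]))),
            name='qw')
qb = Normal(loc=tf.Variable(tf.random_normal([1])),
            scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))),
            name='qb')

inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_iter=250)

saver = tf.train.Saver()
sess = ed.get_session()
save_path = saver.save(sess, fname)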

#################################run in another part to load variables###################

loader = tf.train.import_meta_graph(fname+'.meta')
graph = tf.get_default_graph()
X_load = graph.get_tensor_by_name("X:0")
qw_load = graph.get_tensor_by_name("qw:0")

Unfortunately random variables aren’t explicitly stored on TensorFlow’s graph. We decided not to as it would require a new data format, similar to how one stores tf.Tensors and tf.Variables.

This means you need to import the tensors associated with qw and then rebuild qw:

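qw_sample = graph.get_tensor_by_name("qw/sample/Reshape:0")
# this wraps the sample to include RV methods
# (the post is cut off after loc; the scale and value arguments are assumed)
qw = Normal(loc=graph.get_tensor_by_name("qw/loc:0"),
            scale=graph.get_tensor_by_name("qw/scale:0"),
            value=qw_sample)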

Maybe there’s an easier approach? Contributions/pull requests welcome.
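When several random variables need rebuilding, the lookup-and-wrap step can be collected into a small helper. This is a sketch under assumptions: `rebuild_normal` is a hypothetical name, the "/scale:0" tensor name is assumed by analogy with "/loc:0", and passing the sample through `value=` assumes the random variable class accepts that keyword.

```python
def rebuild_normal(graph, name, normal_cls):
    """Re-create a Normal random variable from tensors in an imported graph.

    graph      -- object with get_tensor_by_name(), e.g. tf.get_default_graph()
    name       -- name the random variable was given when it was built
    normal_cls -- the class to wrap with, e.g. edward.models.Normal
    """
    loc = graph.get_tensor_by_name(name + "/loc:0")
    scale = graph.get_tensor_by_name(name + "/scale:0")  # assumed tensor name
    sample = graph.get_tensor_by_name(name + "/sample/Reshape:0")
    return normal_cls(loc=loc, scale=scale, value=sample)


# Hypothetical usage after tf.train.import_meta_graph(...):
#
#   graph = tf.get_default_graph()
#   qw = rebuild_normal(graph, "qw", Normal)
#   qb = rebuild_normal(graph, "qb", Normal)
```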

Thank you dustin. Though not perfect, it can solve my model saving problem.

Hi dustin. Now I can save and restore model parameters, but I am having difficulty reconstructing the inference. In the code below, when I try to run a reconstructed inference, I get this error:
ValueError: cannot add op with name optimizer/Variable_1/Adam as that name is already used

It seems that when I save the model, some inference objects are saved too. How can I create a new inference and run it?

import tensorflow as tf
import edward as ed
import numpy as np
from edward.models import Normal

def build_toy_dataset(N, w, noise_std=0.1):
  D = len(w)
  x = np.random.randn(N, D)
  y = np.dot(x, w) + np.random.normal(0, noise_std, size=N)
  return x, y

##################code to save a linear regression model###########################


w_true =np.array([0.4,0.3,-0.1]) # np.random.randn(D)
N = 40 # number of data points
D = 3 # number of features

X_train, y_train = build_toy_dataset(N, w_true)
X_test, y_test = build_toy_dataset(N, w_true)

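X = tf.placeholder(tf.float32, [N, D], name='X')
w = Normal(loc=tf.zeros(D), scale=tf.ones(D), name='w')
b = Normal(loc=tf.zeros(1), scale=tf.ones(1), name='b')
y = Normal(loc=ed.dot(X, w) + b, scale=tf.ones(N), name='y')

# Assumed: the scale and name arguments were cut off in the post; a
# softplus-transformed variable and the names 'qw'/'qb' are used here.
qw = Normal(loc=tf.Variable(tf.random_normal([D])),
            scale=tf.nn.softplus(tf.Variable(tf.random_normal([D]))),
            name='qw')
qb = Normal(loc=tf.Variable(tf.random_normal([1])),
            scale=tf.nn.softplus(tf.Variable(tf.random_normal([1]))),
            name='qb')

inference = ed.KLqp({w: qw, b: qb}, data={X: X_train, y: y_train})
inference.run(n_iter=250)

saver = tf.train.Saver()
sess = ed.get_session()
save_path = saver.save(sess, fname)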

####code to reload a model and run a new inference with new data########################

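loader = tf.train.import_meta_graph(fname + '.meta')
graph = tf.get_default_graph()
X = graph.get_tensor_by_name("X:0")

# Assumed: the scale arguments were cut off in the post; each scale tensor
# is taken to be named like its loc tensor.
w = Normal(loc=graph.get_tensor_by_name("w/loc:0"),
           scale=graph.get_tensor_by_name("w/scale:0"))
b = Normal(loc=graph.get_tensor_by_name("b/loc:0"),
           scale=graph.get_tensor_by_name("b/scale:0"))
qw = Normal(loc=graph.get_tensor_by_name("qw/loc:0"),
            scale=graph.get_tensor_by_name("qw/scale:0"))
qb = Normal(loc=graph.get_tensor_by_name("qb/loc:0"),
            scale=graph.get_tensor_by_name("qb/scale:0"))
y = Normal(loc=graph.get_tensor_by_name("y/loc:0"),
           scale=graph.get_tensor_by_name("y/scale:0"))

N = 40  # number of data points
D = 3  # number of features
X_test, y_test = build_toy_dataset(N, w_true)

inference = ed.KLqp({w: qw, b: qb}, data={X: X_test, y: y_test})
inference.run(n_iter=50)  # the ValueError occurs here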

Because you import the graph, you can’t use inference.run(), which also builds the computation to run the graph. There’s more bookkeeping involved, as you need to write your own training loop.

Thank you dustin. But I just want to restore the saved model parameters and then restart the inference process with new updated samples. Is there any convenient way to achieve that?

I recommend restoring only the tf.Variables and not the metagraph itself. To do so, call inference.initialize(), restore with your parameter saver, and then run inference manually in a loop. Basically, you’re replacing the tf.global_variables_initializer().run() line with the restore.
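The order of operations described above can be sketched as follows. This is not Edward's own code: `resume_inference` and `make_saver` are hypothetical names, and the sketch assumes an inference object with `initialize()` and `update()` methods, where `update()` returns a dict containing a 'loss' entry, and a saver with `restore(sess, path)`.

```python
def resume_inference(inference, sess, checkpoint_path, n_iter, make_saver):
    """Restore variables from a checkpoint, then run inference manually.

    make_saver is a zero-argument callable returning a saver. It is called
    after inference.initialize() so that the saver can see any variables
    that initialize() creates.
    """
    inference.initialize()
    saver = make_saver()
    # Restoring takes the place of tf.global_variables_initializer().run().
    saver.restore(sess, checkpoint_path)
    losses = []
    for _ in range(n_iter):
        info_dict = inference.update()
        losses.append(info_dict["loss"])
    return losses


# Hypothetical usage with Edward and TensorFlow 1.x:
#
#   inference = ed.KLqp({w: qw, b: qb}, data={X: X_test, y: y_test})
#   losses = resume_inference(inference, ed.get_session(), fname, 50,
#                             tf.train.Saver)
```

Any variable that is not in the checkpoint stays uninitialized in this sketch, so the checkpoint needs to cover every variable the update step touches, or the remaining ones need their own initializer.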

Thank you dustin, but how do I restore tf.Variables without restoring the metagraph? I have tried the following code, but it cannot find "qw/loc:0". Also, is it possible to create qw_loc without specifying its shape? At the moment this raises a dimension mismatch error in TensorFlow.

