Model distribution for a list of categorical data

I currently have an autoencoder that compresses data where each sample is an N-rows-by-M-columns matrix.
In this data (DNA sequences), each row represents a site that can be one of the characters A, T, C, or G, which I represent as a one-hot vector: A is [1 0 0 0], T is [0 1 0 0], and so on. Thus one sample is a collection of N one-hot vectors with M choices each. In TensorFlow, I apply a softmax on the last dimension (M) to ensure that each row sums to 1.
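For concreteness, the encoding and the row-wise softmax described above can be sketched in plain NumPy (the A/T/C/G index mapping follows the description; the function names are illustrative, not from my actual code):

```python
import numpy as np

# Map each base to its one-hot index: A->0, T->1, C->2, G->3.
BASE_INDEX = {"A": 0, "T": 1, "C": 2, "G": 3}

def one_hot_encode(seq):
    """Encode a DNA string as an (N sites, 4 bases) one-hot matrix."""
    X = np.zeros((len(seq), 4))
    for i, base in enumerate(seq):
        X[i, BASE_INDEX[base]] = 1.0
    return X

def softmax_last_axis(logits):
    """Softmax over the last dimension, so each row sums to 1."""
    shifted = logits - logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(shifted)
    return e / e.sum(axis=-1, keepdims=True)

X = one_hot_encode("ATCG")
# X[0] is [1 0 0 0] (A), X[1] is [0 1 0 0] (T), and so on.
probs = softmax_last_axis(np.random.randn(100, 1515, 4))
# Every row of probs sums to 1, matching the decoder output constraint.
```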

I would like to change my autoencoder into a variational autoencoder, but I am new to Edward. Based on the tutorials, I can use Multinomial or OneHotCategorical. I tried using OneHotCategorical, but my reconstruction error seems to be stuck. With validate_args=True, I get the following error:

InvalidArgumentError: assertion failed: [] [Condition x <= 0 did not hold element-wise:x (input:0) = ] [[[0 0 1]]...]
[[Node: inference/sample/OneHotCategorical/log_prob/assert_non_positive/assert_less_equal/Assert/AssertGuard/Assert =
Assert[T=[DT_STRING, DT_STRING, DT_FLOAT], summarize=3, _device="/job:localhost/replica:0/task:0/device:CPU:0"]
(inference/sample/OneHotCategorical/log_prob/assert_non_positive/assert_less_equal/Assert/AssertGuard/Assert/Switch, 
inference/sample/OneHotCategorical/log_prob/assert_non_positive/assert_less_equal/Assert/AssertGuard/Assert/data_0, 
inference/sample/OneHotCategorical/log_prob/assert_non_positive/assert_less_equal/Assert/AssertGuard/Assert/data_1, 
inference/sample/OneHotCategorical/log_prob/assert_non_positive/assert_less_equal/Assert/AssertGuard/Assert/Switch_1)]]

....

InvalidArgumentError                      Traceback (most recent call last)
<ipython-input-30-1312f7e5da95> in <module>()
     24         pbar.update(j)
     25         batch_X = next(training_minibatch_generator)
---> 26         info_dict = inference.update(feed_dict={X:batch_X, B:100, mode:'train'})
     27         _train_loss += info_dict['loss']
     28         iterations += 1
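For reference, the log-probability a OneHotCategorical distribution assigns to a one-hot observation x under class probabilities p is just the log-probability of the observed class, i.e. the sum over the last axis of x * log(p). A small NumPy equivalent I use to sanity-check values outside the graph (this is the standard categorical log-likelihood, not Edward code):

```python
import numpy as np

def onehot_categorical_log_prob(x, probs):
    """log p(x) for a one-hot x under class probabilities probs.

    Since exactly one entry of x is 1, the sum over the last axis of
    x * log(probs) picks out log of the observed class's probability.
    """
    return np.sum(x * np.log(probs), axis=-1)

x = np.array([0.0, 0.0, 1.0, 0.0])        # observed base (one-hot)
p = np.array([0.1, 0.2, 0.6, 0.1])        # decoder probabilities for that site
print(onehot_categorical_log_prob(x, p))  # log(0.6)
```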

I checked the shapes of batch_X and of my tensors, but they seem to be fine:

> batch_X.shape
(100, 1515, 4)
> Y.shape
TensorShape([Dimension(None), Dimension(1515), Dimension(4)])
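To rule out malformed inputs before they reach the graph, I also verify that every row of the batch is a strict one-hot vector (all entries 0 or 1, each row summing to 1). A quick NumPy check, with an illustrative function name:

```python
import numpy as np

def is_valid_one_hot(batch):
    """True if every (site, base) row is a single 1 with zeros elsewhere."""
    binary = np.all((batch == 0.0) | (batch == 1.0))
    row_sums_one = np.all(batch.sum(axis=-1) == 1.0)
    return bool(binary and row_sums_one)

# Fake one-hot batch with the same shape as batch_X: (100, 1515, 4).
batch = np.eye(4)[np.random.randint(0, 4, size=(100, 1515))]
print(is_valid_one_hot(batch))                    # True
print(is_valid_one_hot(np.full((2, 3, 4), 0.25))) # False: soft rows, not one-hot
```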