Rui Shu

# 19 Mar 2018 Using a Bernoulli VAE on Real-Valued Observations

The Bernoulli observation VAE is meant to be used when one's observed samples $x \in \sset{0, 1}^n$ are vectors of binary elements. However, I have, on occasion, seen people (and even papers) apply Bernoulli observation VAEs to real-valued samples $x \in [0, 1]^n$. This will be a quick and dirty post going over whether this unholy marriage of Bernoulli VAE with real-valued samples is appropriate.

### Background and Notation for Bernoulli VAE

Given an empirical distribution $\hat{p}(x)$ whose samples are binary $x \in \sset{0, 1}^n$, the VAE objective is

$$\max_{\theta, \phi} \mathbb{E}_{\hat{p}(x)} \mathbb{E}_{q_\phi(z \giv x)} \ln \frac{p_\theta(x, z)}{q_\phi(z \giv x)}.$$
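
The standard single-sample Monte Carlo estimate of this objective can be sketched in plain Python. This is a minimal illustration, not code from the post: the helper name `elbo_single_sample`, the diagonal-Gaussian encoder parameterization, and the analytic KL against a standard-normal prior are all my own assumptions.

```python
import math
import random

def elbo_single_sample(x, enc_mu, enc_logvar, decode):
    """One-sample Monte Carlo estimate of E_{q(z|x)}[ln p(x|z)] - KL(q || p).

    Assumes a diagonal-Gaussian encoder, a standard-normal prior p(z), and
    a Bernoulli decoder `decode` mapping z to probabilities in (0, 1)^n.
    """
    # Reparameterized sample: z = mu + sigma * eps, eps ~ N(0, I)
    z = [m + math.exp(0.5 * lv) * random.gauss(0.0, 1.0)
         for m, lv in zip(enc_mu, enc_logvar)]
    pi = decode(z)
    # Bernoulli reconstruction term ln p(x | z)
    log_px_z = sum(xi * math.log(p) + (1 - xi) * math.log(1 - p)
                   for xi, p in zip(x, pi))
    # Analytic KL(N(mu, sigma^2) || N(0, 1)), summed over dimensions
    kl = sum(0.5 * (m * m + math.exp(lv) - 1.0 - lv)
             for m, lv in zip(enc_mu, enc_logvar))
    return log_px_z - kl
```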

If $p_\theta(x \giv z)$ is furthermore a fully-factorized Bernoulli observation model, then the distribution can be expressed as

$$p_\theta(x \giv z) = \prod_{i=1}^n \text{Bern}(x_i \giv \pi_i(z)) = \prod_{i=1}^n \pi_i(z)^{x_i} (1 - \pi_i(z))^{1 - x_i},$$

where $\pi: \Z \to [0, 1]^n$ is a neural network parameterized by $\theta$. As preparation for the next section, we shall—with a slight abuse of notation—also define

$$p(x \giv \pi) = \prod_{i=1}^n \text{Bern}(x_i \giv \pi_i) = \prod_{i=1}^n \pi_i^{x_i} (1 - \pi_i)^{1 - x_i},$$

where $\pi \in [0, 1]^n$.
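
As a concrete sketch of the fully-factorized Bernoulli likelihood above, here it is in plain Python (the helper name `bernoulli_log_prob` is my own, not from the post). Note that the same expression remains well-defined when `x` is real-valued in $[0, 1]^n$, which is the situation examined next.

```python
import math

def bernoulli_log_prob(x, pi):
    """ln p(x | pi) = sum_i x_i ln(pi_i) + (1 - x_i) ln(1 - pi_i).

    Exact Bernoulli log-likelihood for binary x in {0, 1}^n; the formula
    also evaluates for real-valued x in [0, 1]^n (assumes 0 < pi_i < 1).
    """
    return sum(xi * math.log(p) + (1 - xi) * math.log(1 - p)
               for xi, p in zip(x, pi))
```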

### Applying Bernoulli VAE to Real-Valued Samples

Suppose we have a distribution $r(\pi)$ over the Bernoulli parameters $\pi \in [0, 1]^n$, and $\hat{p}(x)$ is in fact the marginalization of $r(\pi)p(x \giv \pi)$. This is the case for MNIST, where the real-valued samples are interpreted as observations of $\pi$. This allows us to construct the objective as

$$\max_{\theta, \phi} \mathbb{E}_{r(\pi)} \mathbb{E}_{p(x \giv \pi)} \mathbb{E}_{q_\phi(z \giv x)} \ln \frac{p_\theta(x, z)}{q_\phi(z \giv x)}.$$

It turns out there is another equally valid lower bound, in which the approximate posterior conditions on $\pi$ rather than $x$:

$$\max_{\theta, \phi} \mathbb{E}_{r(\pi)} \mathbb{E}_{p(x \giv \pi)} \mathbb{E}_{q_\phi(z \giv \pi)} \ln \frac{p_\theta(x, z)}{q_\phi(z \giv \pi)}.$$

However, since $q_\phi(z \giv \pi)$ does not have access to $x$, it is unlikely to give a better approximation of $p_\theta(z \giv x)$ than $q_\phi(z \giv x)$ does. Consequently, it is likely to be a looser bound (which can be verified empirically). A bit of tedious algebra shows that the objective is equivalent to

$$\max_{\theta, \phi} \mathbb{E}_{r(\pi)} \mathbb{E}_{q_\phi(z \giv \pi)} \left[ \mathbb{E}_{p(x \giv \pi)} \ln p_\theta(x \giv z) - \ln \frac{q_\phi(z \giv \pi)}{p(z)} \right],$$

where the inner-most term

$$\mathbb{E}_{p(x \giv \pi)} \ln p_\theta(x \giv z) = \sum_{i=1}^n \pi_i \ln \pi_i(z) + (1 - \pi_i) \ln (1 - \pi_i(z))$$

is exactly the negative sum of element-wise cross-entropy terms, where each cross-entropy term is

$$H\big(\text{Bern}(\pi_i), \text{Bern}(\pi_i(z))\big) = -\pi_i \ln \pi_i(z) - (1 - \pi_i) \ln (1 - \pi_i(z)).$$

Note that $\mathbb{E}_{p(x \giv \pi)} \ln p_\theta(x \giv z)$ is precisely what one computes by plugging the real-valued sample $\pi$ directly into the Bernoulli log-likelihood in place of a binary $x$; in other words, this is exactly the application of Bernoulli observation VAEs to real-valued samples. So long as the real-valued samples can be interpreted as Bernoulli distribution parameters, this lower bound is valid. However, as noted above, it tends to be looser.
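This equivalence is easy to check numerically. In the sketch below (variable names are mine), the real-valued sample is treated as a vector of Bernoulli parameters `pi_data` and the decoder's output is `pi_dec`; the Bernoulli log-likelihood evaluated at the real-valued sample matches the negative sum of element-wise cross-entropies.

```python
import math

def cross_entropy(p, q):
    # H(Bern(p), Bern(q)) = -p ln(q) - (1 - p) ln(1 - q)
    return -p * math.log(q) - (1 - p) * math.log(1 - q)

def bernoulli_log_lik(x, pi):
    # sum_i x_i ln(pi_i) + (1 - x_i) ln(1 - pi_i)
    return sum(xi * math.log(p) + (1 - xi) * math.log(1 - p)
               for xi, p in zip(x, pi))

pi_data = [0.9, 0.2, 0.55]   # real-valued "sample", read as Bernoulli parameters
pi_dec = [0.8, 0.3, 0.5]     # decoder's predicted Bernoulli parameters

lhs = bernoulli_log_lik(pi_data, pi_dec)
rhs = -sum(cross_entropy(p, q) for p, q in zip(pi_data, pi_dec))
```

The two quantities agree, which is why feeding real-valued inputs to a Bernoulli VAE's reconstruction term computes exactly these cross-entropy terms.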
