Latent Variables

This chapter is dedicated to a family of generative models called latent variable models. When we deal with these models, we differentiate between two types of variables: observable variables and latent variables. Observable variables are the data we have been dealing with so far, the part that we can actually measure. When we are dealing with MNIST, for example, the pixel values of the digits are the observable variables. An image of the digit 1 would typically have higher pixel values in the middle of the drawing and lower values in the surrounding pixels.

Latent variables[1], on the other hand, are not observable, and there is no way for us to measure them directly.

Info

Latent variables are the unobservable variables that determine the characteristics of the observable data.

When you try to imagine latent variables, think about the hidden characteristics of the data. Let's look at the two images of digits below and try to build additional intuition for which hidden variables might be contained in the images.

Depending on your country of origin, there are different ways to write the number 1. In the United States it is common to draw the number 1 as a straight line, while in many countries in the European Union there is an additional line attached to the top. While we do not directly observe the country of origin of the people who drew the digits, we can still make an educated guess that they are most likely from different regions of the world. But you cannot measure the country of origin directly from the picture, because it is a latent variable. There are many more latent variables that could be encoded in a digits dataset: the level of curviness, the cleanliness of the drawing, or the tilt of the digit. None of them is directly observable or measurable in the data.

If you are dealing with human faces, on the other hand, the latent variables might be skin color, the shape of the mouth, gender, haircut, the presence of glasses, and so on.

So instead of generating the face directly, we could first sample the characteristics of the face from the distribution of latent variables and then generate the image based on those latent characteristics.
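The two-step idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the model used later in the chapter: the standard normal distribution over the latent variables, the latent dimension of 4, the 28x28 image size, and the helper names sample_latent and generate_image are all hypothetical choices, and the "decoder" is an untrained random linear map followed by a sigmoid, so the resulting image is noise rather than a digit or a face.

```python
import numpy as np


def sample_latent(rng, latent_dim):
    # Step 1: draw the hidden characteristics z.
    # Assumption: the latent variables follow a standard normal distribution.
    return rng.standard_normal(latent_dim)


def generate_image(z, weights, bias):
    # Step 2: turn the latent characteristics into observable pixel values.
    # Hypothetical decoder: a linear map followed by a sigmoid, so that every
    # pixel intensity lies between 0 and 1.
    logits = weights @ z + bias
    pixels = 1.0 / (1.0 + np.exp(-logits))
    return pixels.reshape(28, 28)


rng = np.random.default_rng(0)
latent_dim = 4  # assumed number of latent variables

# Untrained stand-in parameters; a real model would learn these from data.
weights = rng.standard_normal((28 * 28, latent_dim))
bias = np.zeros(28 * 28)

z = sample_latent(rng, latent_dim)        # latent variables (never observed in data)
image = generate_image(z, weights, bias)  # observable variables (pixel values)

print("latent shape:", z.shape)
print("image shape:", image.shape)
```

The point of the sketch is the order of operations: the image is never sampled directly, it is always produced from a previously sampled latent vector. Changing z changes the generated image, which is how latent characteristics would control the output in a trained model.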

While the above examples make it look as if latent variables could easily be translated into human language, latent variables are often obscure and not easily interpretable. Still, the above description should provide you with the intuition that you will need in the following sections.