Attention

When we look at pictures or read text, we do not give equal weight to all of the available information; instead, we focus on specific parts. In other words, we pay attention to certain elements while disregarding others.

Look at the image below, for example. What part of the image do you notice first? Most likely the yellow blossoms of the plant.

A plant seen through a round window.
Photo by Woody Yan on Unsplash

Did you notice the large leaves outside the window? Did you notice the small defect at the top right of the white circle that surrounds the window? Your mind wandered straight to the plant, automatically filtering out the details that are not relevant to your understanding of the image.

The attention mechanisms that we are going to study in this chapter are designed to learn which parts of the input a neural network should pay attention to. Attention-based models have become the state of the art in natural language processing and, increasingly, in computer vision. These models were originally developed for machine translation, so that is where our journey begins.
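
To make the idea concrete before we dive in, here is a minimal NumPy sketch of scaled dot-product attention, one common way to implement the mechanism: scores between a query and a set of keys are turned into weights that sum to 1, and the output is the correspondingly weighted average of the values. The function names and toy shapes below are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention.

    Q: queries, shape (n_queries, d_k)
    K: keys,    shape (n_keys, d_k)
    V: values,  shape (n_keys, d_v)
    """
    d_k = Q.shape[-1]
    # Similarity score between each query and every key.
    scores = Q @ K.T / np.sqrt(d_k)
    # Softmax turns the scores into attention weights that sum to 1:
    # this is the "focus" the network distributes over the input.
    weights = softmax(scores, axis=-1)
    # The output is a weighted average of the values.
    return weights @ V, weights

# Toy example: one query attending over three input elements.
rng = np.random.default_rng(0)
Q = rng.normal(size=(1, 4))
K = rng.normal(size=(3, 4))
V = rng.normal(size=(3, 4))
output, weights = scaled_dot_product_attention(Q, K, V)
print(weights)  # one row of weights per query; each row sums to 1
```

The attention weights play the role of the "focus" from the window example: elements with large weights dominate the output, while elements with weights near zero are effectively filtered out.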