Neural Network

Let's start this chapter with the obvious question: what is a neural network?

Info

A neural network structures individual neurons into a hierarchy of layers and combines them into a single model by feeding the outputs of each layer as inputs into the next layer.

If the above definition does not make any sense to you, below is a more intuitive explanation.

There are many different activation functions out there, but for now we will assume that we are dealing with the sigmoid activation function. That means that a neuron is essentially a separate logistic regression unit with individual weights and a bias. The neuron below, for example, takes the features $x_1, \dots, x_n$ as inputs, multiplies them with the individual weights $w_1, \dots, w_n$, adds the bias $b$ and applies the sigmoid activation function $\sigma$, producing the output $a = \sigma\left(\sum_{j=1}^{n} w_j x_j + b\right)$.

[Figure: the features are fed into a single neuron, which produces one output.]
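A single neuron can be sketched in a few lines of Python. The sketch below assumes NumPy; the feature, weight and bias values are made up purely for illustration.

```python
import numpy as np

def sigmoid(z):
    # squashes any real number into the range (0, 1)
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b):
    # weighted sum of the inputs plus a bias, passed through the sigmoid
    return sigmoid(np.dot(w, x) + b)

x = np.array([0.5, -1.0, 2.0])   # example features
w = np.array([0.1, 0.4, -0.2])   # example weights
b = 0.3                          # example bias
out = neuron(x, w, b)
print(out)                       # a single value between 0 and 1
```

This is exactly the logistic regression computation from the previous chapter, just viewed as one building block of a larger model.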

In a neural network the same features are used to produce several different neurons. These neurons use different weights and biases and therefore produce different outputs. Such a collection of neurons is called a layer.

[Figure: the same features are fed into several neurons; together these neurons form a layer.]
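A whole layer can be computed in one step by collecting the weights of all neurons into a matrix, one row per neuron. The sizes below (3 features, 4 neurons) are arbitrary choices for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def layer(x, W, b):
    # each row of W holds the weights of one neuron,
    # so W @ x + b computes all weighted sums at once
    return sigmoid(W @ x + b)

rng = np.random.default_rng(0)
x = np.array([0.5, -1.0, 2.0])   # 3 input features
W = rng.normal(size=(4, 3))      # 4 neurons, each with 3 weights
b = np.zeros(4)                  # one bias per neuron
out = layer(x, W, b)
print(out)                       # 4 outputs, one per neuron
```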

We can stack several layers one after the other to produce a neural network. Each subsequent layer uses the outputs of the previous layer as its inputs instead of the original input features. The outputs of these intermediate neurons are therefore often called hidden features.

[Figure: the features feed into Hidden Layer 1, its outputs feed into Hidden Layer 2 and its outputs feed into the Output Layer.]
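Stacking layers then amounts to a simple loop: the output of one layer becomes the input of the next. The layer sizes below (3 features, two hidden layers of 4 and 3 neurons, a single output neuron) are again illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(x, params):
    # feed the output of each layer as input into the next one
    for W, b in params:
        x = sigmoid(W @ x + b)
    return x

rng = np.random.default_rng(0)
params = [
    (rng.normal(size=(4, 3)), np.zeros(4)),  # hidden layer 1: 3 -> 4
    (rng.normal(size=(3, 4)), np.zeros(3)),  # hidden layer 2: 4 -> 3
    (rng.normal(size=(1, 3)), np.zeros(1)),  # output layer:   3 -> 1
]
x = np.array([0.5, -1.0, 2.0])
out = forward(x, params)
print(out)  # a single output between 0 and 1
```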

The output neuron(s) are then fed into the loss function, for example the cross-entropy loss if we are dealing with a classification problem.
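For a single output neuron and a binary label, the cross-entropy loss looks like this. The probabilities in the example calls are made up to show the behaviour.

```python
import numpy as np

def binary_cross_entropy(y_hat, y):
    # y is the true label (0 or 1), y_hat the network's output in (0, 1)
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

print(binary_cross_entropy(0.9, 1))  # small loss: confident and correct
print(binary_cross_entropy(0.9, 0))  # large loss: confident and wrong
```

The loss is small when the network assigns a high probability to the correct class and grows quickly when it is confidently wrong.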

We can train neural networks to classify images, generate text or play computer games. No matter what task we are trying to accomplish or how the neural network is structured, training always follows the same steps that we used in linear and logistic regression.

In the forward pass the features are processed layer by layer and neuron by neuron to finally determine the loss of the neural network and to construct a computational graph.

[Figure: the forward pass, in which the features flow through the hidden layer and the output layer into the loss.]
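The forward pass ties the previous pieces together: the features flow through the layers and the final output is plugged into the loss. As before, the layer sizes and input values are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward_pass(x, y, params):
    # process the features layer by layer, neuron by neuron ...
    for W, b in params:
        x = sigmoid(W @ x + b)
    # ... and feed the final output into the cross-entropy loss
    y_hat = x[0]
    return -(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))

rng = np.random.default_rng(0)
params = [
    (rng.normal(size=(4, 3)), np.zeros(4)),  # hidden layer: 3 -> 4
    (rng.normal(size=(1, 4)), np.zeros(1)),  # output layer: 4 -> 1
]
loss = forward_pass(np.array([0.5, -1.0, 2.0]), 1.0, params)
print(loss)  # a single non-negative number
```

An automatic differentiation framework would record each of these operations in a computational graph during this pass; here we only compute the values.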

In the backward pass we use the backpropagation algorithm to calculate the gradients for all weights and biases.

[Figure: the backward pass, in which gradients flow from the loss back through the output layer and the hidden layer.]
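For a small network the backward pass can even be written out by hand. The sketch below assumes a single hidden layer, sigmoid activations everywhere and the binary cross-entropy loss; the shapes and values are illustrative, not a definitive implementation.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def backward_pass(x, y, W1, b1, W2, b2):
    # forward pass: keep the intermediate activations of the graph
    a1 = sigmoid(W1 @ x + b1)               # hidden layer
    a2 = sigmoid(W2 @ a1 + b2)              # output layer
    # backward pass: apply the chain rule layer by layer
    dz2 = a2 - y                            # sigmoid + cross-entropy combined
    dW2, db2 = np.outer(dz2, a1), dz2
    dz1 = (W2.T @ dz2) * a1 * (1 - a1)      # sigmoid'(z) = a * (1 - a)
    dW1, db1 = np.outer(dz1, x), dz1
    return dW1, db1, dW2, db2

rng = np.random.default_rng(0)
x, y = np.array([0.5, -1.0, 2.0]), 1.0
W1, b1 = rng.normal(size=(4, 3)), np.zeros(4)
W2, b2 = rng.normal(size=(1, 4)), np.zeros(1)
dW1, db1, dW2, db2 = backward_pass(x, y, W1, b1, W2, b2)
```

Each gradient has the same shape as the parameter it belongs to, so a gradient descent step can update every weight and bias in place.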

Conceptually the whole learning process is not much different from what we saw in the previous chapters. The computational graph is larger and broader, but the ideas are the same.