Neural Network Representation
I. A Shallow Neural Network Representation
Let’s use a shallow neural network for educational purposes. Shallow neural networks are neural networks with only one hidden layer. Therefore, there are only 3 layers in this neural network: an input layer, a hidden layer, and finally an output layer.
Let’s denote $x$ as our input vector, which has 3 features. Therefore, $x \in \mathbb{R}^{3 \times 1}$.
Meanwhile, $y$ is our label, which is a scalar.
From the given information, we define the following shallow neural network:
You may notice some notation in the network diagram, but do not worry, as we will explain it right now.
The i-th layer is commonly represented as $a^{[i]}$, where $a$ stands for activation and $[i]$ indicates the i-th layer. Although no activation function is applied to the input layer, it is sometimes referred to as the $a^{[0]}$ layer, whereas the count starts at 1 at the first hidden layer and runs through the output layer. Moreover, $a^{[i]}$ is a vector.
The j-th element of the i-th layer is denoted as $a^{[i]}_j$. For example, the first neuron in the first hidden layer is referred to as $a^{[1]}_1$.
Let’s talk weights and biases. The weight matrix and the bias vector of the i-th layer are denoted as $W^{[i]}$ and $b^{[i]}$, respectively. Their shapes are the following:
$W^{[i]} \in \mathbb{R}^{n^{[i]} \times n^{[i-1]}}$ and $b^{[i]} \in \mathbb{R}^{n^{[i]} \times 1}$
where $n^{[i]}$ is the number of neurons in the i-th layer.
For example, in our network, $b^{[1]} \in \mathbb{R}^{4 \times 1}$, whereas $W^{[1]} \in \mathbb{R}^{4 \times 3}$, since there are 4 neurons in the first hidden layer and 3 features in the input layer.
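A minimal NumPy sketch of these parameter shapes, using the layer sizes from our network (3 input features, 4 hidden neurons, 1 output neuron); the small random initialization scale is a common convention, not something the text prescribes:

```python
import numpy as np

n_x, n_h, n_y = 3, 4, 1  # input features, hidden neurons, output neurons

# W^[i] has shape (n^[i], n^[i-1]); b^[i] has shape (n^[i], 1)
W1 = np.random.randn(n_h, n_x) * 0.01  # (4, 3)
b1 = np.zeros((n_h, 1))                # (4, 1)
W2 = np.random.randn(n_y, n_h) * 0.01  # (1, 4)
b2 = np.zeros((n_y, 1))                # (1, 1)

print(W1.shape, b1.shape)  # (4, 3) (4, 1)
print(W2.shape, b2.shape)  # (1, 4) (1, 1)
```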
II. Computing a Neural Network Output
1. Single training example
Let’s do a forward pass on our neural network. The value of the first neuron in the first hidden layer is:
$z^{[1]}_1 = w^{[1]T}_1 x + b^{[1]}_1, \quad a^{[1]}_1 = \sigma(z^{[1]}_1)$
This applies a sigmoid activation function to a linear combination of the inputs. Similarly, we can compute the remaining neurons:
$z^{[1]}_j = w^{[1]T}_j x + b^{[1]}_j, \quad a^{[1]}_j = \sigma(z^{[1]}_j), \quad j = 1, \dots, 4$
We can vectorize these equations as follows:
$z^{[1]} = W^{[1]} x + b^{[1]}, \quad a^{[1]} = \sigma(z^{[1]})$
$z^{[2]} = W^{[2]} a^{[1]} + b^{[2]}, \quad a^{[2]} = \sigma(z^{[2]})$
where $W^{[1]}$ is the $4 \times 3$ matrix whose rows are the transposed weight vectors $w^{[1]T}_j$, and $b^{[1]}$ is the $4 \times 1$ vector of biases.
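The single-example forward pass above can be sketched in NumPy as follows; the parameter values here are random placeholders, chosen only to make the code runnable:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 1))       # one example with 3 features

W1 = rng.standard_normal((4, 3)) * 0.01
b1 = np.zeros((4, 1))
W2 = rng.standard_normal((1, 4)) * 0.01
b2 = np.zeros((1, 1))

z1 = W1 @ x + b1        # (4, 1) pre-activations of the hidden layer
a1 = sigmoid(z1)        # (4, 1) hidden activations
z2 = W2 @ a1 + b2       # (1, 1) pre-activation of the output layer
a2 = sigmoid(z2)        # (1, 1) prediction

print(a1.shape, a2.shape)  # (4, 1) (1, 1)
```

Note that `W1 @ x` computes all four dot products $w^{[1]T}_j x$ at once, which is exactly the vectorization described above.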
2. Multiple training examples
Let $X = [x^{(1)} \; x^{(2)} \; \cdots \; x^{(m)}]$ be the $3 \times m$ matrix whose columns are the $m$ training examples.
Then, using matrix multiplication, we can calculate the hidden layer’s activation matrix:
$Z^{[1]} = W^{[1]} X + b^{[1]}, \quad A^{[1]} = \sigma(Z^{[1]})$
$Z^{[2]} = W^{[2]} A^{[1]} + b^{[2]}, \quad A^{[2]} = \sigma(Z^{[2]})$
where $A^{[1]} = [a^{[1](1)} \; \cdots \; a^{[1](m)}]$ stacks the hidden activations of each example column-wise, and $b^{[1]}$ is added to every column (broadcasting).
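A sketch of the same forward pass over m examples at once, assuming an arbitrary m = 5 and random placeholder parameters; NumPy broadcasting adds each bias vector to every column:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
m = 5
X = rng.standard_normal((3, m))   # columns are the m training examples

W1 = rng.standard_normal((4, 3)) * 0.01
b1 = np.zeros((4, 1))
W2 = rng.standard_normal((1, 4)) * 0.01
b2 = np.zeros((1, 1))

Z1 = W1 @ X + b1       # (4, m); b1 broadcasts across the m columns
A1 = sigmoid(Z1)       # (4, m) hidden activations, one column per example
Z2 = W2 @ A1 + b2      # (1, m)
A2 = sigmoid(Z2)       # (1, m) one prediction per example

# Column 0 of A1 matches the single-example pass on x^(1)
x0 = X[:, [0]]
assert np.allclose(A1[:, [0]], sigmoid(W1 @ x0 + b1))
print(A2.shape)  # (1, 5)
```

This equivalence between column i of the batched pass and the single-example pass on $x^{(i)}$ is the whole point of the vectorized formulation: one matrix product replaces a loop over examples.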