Artificial Neural Network
An Artificial Neural Network (ANN) is a mathematical model that tries to simulate the structure and functionality of the biological neural networks that make up the human brain, so that a computer can learn and make decisions in a human-like manner.
The ANN model has three simple sets of rules, i.e., Multiplication, Summation, and Activation:
Multiplication- At the entrance of the artificial neuron, the inputs are weighted, which means that every input value is multiplied by an individual weight. Weights are adaptive coefficients that determine the intensity of the input signal as registered by the artificial neuron.
Summation- In the middle section of the artificial neuron, a sum function adds up all the weighted inputs and the bias.
Activation- At the exit of the artificial neuron, the sum of the weighted inputs and the bias passes through an activation function, also called the transfer function. Finally, the artificial neuron passes the processed information on as its output.
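The three rules above can be sketched for a single neuron as follows. This is a minimal illustration in plain Python, not a reference implementation: the function names, the example weights, and the choice of sigmoid as the activation are all assumptions made for the example.

```python
import math

def sigmoid(z):
    """Example activation function (an assumption for this sketch)."""
    return 1.0 / (1.0 + math.exp(-z))

def neuron(inputs, weights, bias, activation=sigmoid):
    """Hypothetical single artificial neuron."""
    # Multiplication: every input value is multiplied by its own weight.
    weighted = [x * w for x, w in zip(inputs, weights)]
    # Summation: add up all weighted inputs and the bias.
    z = sum(weighted) + bias
    # Activation: pass the sum through the activation (transfer) function.
    return activation(z)

# Illustrative values: 1.0 * 0.5 + 2.0 * (-0.25) + 0.0 = 0.0
output = neuron([1.0, 2.0], [0.5, -0.25], bias=0.0)
print(output)  # sigmoid(0.0) = 0.5
```

Changing a weight changes how strongly the corresponding input influences the output, which is what training adjusts.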
Components of ANN:
The input layer receives the data either from input files or directly from electronic sensors in real-time applications. This is the data that the network aims to process or learn about. Each input has its own relative weight, which gives the input the impact that it needs on the processing element's summation function. Some inputs are given more importance than others so that they have a greater effect on the processing element as they combine to produce a neural response. From the input layer, the data passes through one or more hidden layers.
Each neuron in a hidden layer receives the signals from all the neurons in the previous layer, typically the input layer. There can be more than one hidden layer in a neural network. The hidden layers perform computations on the weighted inputs and produce a net input, to which an activation function is then applied to produce the layer's actual output.
The output layer in an artificial neural network is the last layer of neurons, and it produces the network's final outputs for the program.
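The flow from input layer to hidden layer to output layer can be sketched as a small forward pass. This is only an illustration under stated assumptions: the layer sizes (2 inputs, 3 hidden neurons, 1 output), all weight and bias values, the activation choices, and the helper names `dense_layer` and `forward` are hypothetical.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense_layer(inputs, weights, biases, activation):
    """Compute one layer; weights[j] holds the weights of neuron j, one per input."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        # Net input: weighted sum of all signals from the previous layer plus bias.
        z = sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        outputs.append(activation(z))
    return outputs

# Made-up parameters for a 2 -> 3 -> 1 network (not trained values).
hidden_weights = [[0.2, -0.4], [0.7, 0.1], [-0.5, 0.3]]
hidden_biases = [0.0, 0.1, -0.1]
output_weights = [[0.6, -0.3, 0.8]]
output_biases = [0.05]

def forward(inputs):
    # Hidden layer: every hidden neuron sees every input value.
    hidden = dense_layer(inputs, hidden_weights, hidden_biases, math.tanh)
    # Output layer: the last layer produces the network's output.
    return dense_layer(hidden, output_weights, output_biases, sigmoid)

prediction = forward([1.0, 0.5])
print(prediction)  # a list with one value between 0 and 1
```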
Activation functions are functions applied to the weighted sum of a neuron's inputs and bias, and their result decides whether, and how strongly, the neuron fires. An activation function can be either linear or non-linear depending on the function it represents, and it is used to control the outputs of our neural networks.
Some important Activation Functions are:
1. Sigmoid
2. Tanh- Hyperbolic Tangent
3. Relu - Rectified linear unit
Sigmoid:
It is an activation function whose output lies between 0 and 1.
It is usually used in the output layer of a binary classifier, where the result is either 0 or 1. Because the value of the sigmoid function always lies between 0 and 1, the result can easily be predicted as 1 if the value is greater than 0.5 and 0 otherwise.
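A minimal sketch of the sigmoid function and the 0.5 threshold rule described above. The function names and the default `threshold` parameter are assumptions for illustration, and the two-branch form is just one common way to avoid overflow for large negative inputs.

```python
import math

def sigmoid(z):
    """Sigmoid, written so that math.exp never receives a large positive argument."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_class(z, threshold=0.5):
    """Hypothetical helper: class 1 if sigmoid(z) is greater than the threshold, else 0."""
    return 1 if sigmoid(z) > threshold else 0

print(sigmoid(0.0))         # 0.5
print(predict_class(2.0))   # 1
print(predict_class(-2.0))  # 0
```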
Tanh- Hyperbolic Tangent:
It is a scaled and shifted version of the sigmoid function: tanh(x) = 2 * sigmoid(2x) - 1. Because its output is centered on zero, it often works better than the sigmoid function in hidden layers.
Its range is between -1 and 1, i.e., -1 < output < 1.
It is usually used in the hidden layers of a neural network.
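The relationship between Tanh and sigmoid can be checked numerically with the identity tanh(x) = 2 * sigmoid(2x) - 1. The sketch below uses only the standard library; the helper name `tanh_from_sigmoid` and the sample points are hypothetical choices for this example.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def tanh_from_sigmoid(z):
    """Tanh built from sigmoid: scale the input by 2, then stretch and shift the output."""
    return 2.0 * sigmoid(2.0 * z) - 1.0

for z in (-2.0, 0.0, 0.5, 3.0):
    # The two values agree up to floating-point rounding.
    print(z, math.tanh(z), tanh_from_sigmoid(z))
```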
Relu - Rectified linear unit:
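It is one of the most widely used activation functions, especially in hidden layers.
It gives an output x if x is positive and 0 otherwise.
Networks using ReLU often learn much faster than those using the sigmoid or Tanh functions. Errors backpropagate easily through multiple layers of neurons activated by ReLU, because its gradient is 1 for every positive input and does not saturate. This helps mitigate the vanishing gradient problem, although neurons whose input stays negative receive zero gradient.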
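To make the gradient argument concrete, here is a small sketch comparing the ReLU gradient with the sigmoid gradient at a large input. The function names are illustrative, and treating the ReLU derivative at exactly 0 as 0 is a convention assumed for this example.

```python
import math

def relu(z):
    """Output z if z is positive and 0 otherwise."""
    return max(0.0, z)

def relu_grad(z):
    # Gradient is 1 for positive inputs and 0 for negative ones.
    # At z == 0 the derivative is undefined; returning 0 is an assumed convention.
    return 1.0 if z > 0 else 0.0

def sigmoid_grad(z):
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)

print(relu(3.0), relu(-2.0))  # 3.0 0.0
print(relu_grad(10.0))        # 1.0
print(sigmoid_grad(10.0))     # a very small number: sigmoid saturates here
```

Multiplying many small sigmoid gradients across layers shrinks the error signal, while the ReLU gradient of 1 for positive inputs passes it through unchanged.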
The Softmax function is used in multi-class models, where it returns a probability for each class; the class with the highest probability is taken as the prediction. Softmax squeezes the output for each class to between 0 and 1, and the outputs across all classes sum to 1.
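A minimal softmax sketch in plain Python. The input scores are made-up values, and subtracting the maximum before exponentiating is a common numerical-stability step assumed here; it does not change the result.

```python
import math

def softmax(logits):
    """Turn a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # shift by the max so math.exp cannot overflow
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])              # hypothetical scores for 3 classes
predicted_class = probs.index(max(probs))     # index of the most probable class
print(probs, predicted_class)
```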
The learning rate is a tuning parameter that determines the step size at each iteration while moving toward a minimum of a loss function. Smaller learning rates require more training epochs (and so more training time) because each update changes the weights only slightly, whereas larger learning rates produce rapid changes and require fewer training epochs. A learning rate that is too large, however, can overshoot the minimum and fail to converge.
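The effect of the step size can be seen on a toy problem. The sketch below runs gradient descent on the made-up loss (w - 3)^2 and counts how many updates are needed to get close to the minimum; the function name, tolerance, starting point, and the three learning rates are all assumptions chosen for illustration.

```python
def steps_to_converge(learning_rate, start=0.0, tolerance=1e-3, max_steps=1000):
    """Gradient descent on loss(w) = (w - 3)^2; returns the step count, or None if it never converges."""
    w = start
    for step in range(1, max_steps + 1):
        grad = 2.0 * (w - 3.0)      # derivative of (w - 3)^2
        w -= learning_rate * grad   # step size is set by the learning rate
        if abs(w - 3.0) < tolerance:
            return step
    return None

small = steps_to_converge(0.01)     # small steps: many updates
large = steps_to_converge(0.1)      # larger steps: fewer updates
too_large = steps_to_converge(1.1)  # overshoots further each step and diverges
print(small, large, too_large)
```

On this toy loss the larger rate converges in fewer steps than the smaller one, while the rate of 1.1 moves further from the minimum on every update and never converges.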
The Loss Function is one of the important components of a Neural Network. Loss is the prediction error of the Neural Network, and the method used to calculate the loss is called the Loss Function. The loss function measures how far our predictions are from the actual values; the exact form of that difference (absolute, squared, or logarithmic) depends on the loss function chosen.
Mean Absolute Error:
In Mean Absolute Error we take the mean of the absolute values of the errors, i.e., the differences between the actual (target) and predicted values. It is robust to outliers. It is used less often than MSE because its gradient has the same magnitude even for small errors, which can make it harder to converge close to the minimum.
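A short sketch of Mean Absolute Error, assuming the usual definition as the mean of the absolute differences between targets and predictions. The function name and the sample values are made up for the example.

```python
def mean_absolute_error(targets, predictions):
    """Mean of |target - prediction| over all samples."""
    return sum(abs(t - p) for t, p in zip(targets, predictions)) / len(targets)

# Errors are 0.5, 0.0 and 2.0, so the MAE is 2.5 / 3.
mae = mean_absolute_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(mae)
```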
Mean Squared Error:
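MSE loss is used for regression tasks. This function is quite similar to MAE; the only difference is that MSE is calculated by taking the mean of the squared differences between the actual (target) and predicted values. Because the differences are squared, large errors are penalized more heavily, which makes MSE more sensitive to outliers than MAE.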
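A short sketch of Mean Squared Error, assuming the usual definition as the mean of the squared differences. The function name and the sample values are hypothetical.

```python
def mean_squared_error(targets, predictions):
    """Mean of (target - prediction)^2 over all samples."""
    return sum((t - p) ** 2 for t, p in zip(targets, predictions)) / len(targets)

# Squared errors are 0.25, 0.0 and 4.0, so the MSE is 4.25 / 3.
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
print(mse)
```

Note how the single error of 2.0 contributes 4.0 to the sum, so it dominates the result.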
Binary Cross Entropy:
Binary Cross Entropy (BCE) loss is used for binary classification tasks. If you are using the BCE loss function, you need just one output node to classify the data into two classes. The output value should be passed through a sigmoid activation function so that it lies between 0 and 1.
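A minimal sketch of binary cross-entropy for a single output node, assuming the predictions have already been passed through a sigmoid so that they are probabilities. The function name, the sample values, and the clipping constant `eps` (used only to avoid taking the log of 0) are assumptions for this example.

```python
import math

def binary_cross_entropy(targets, probabilities, eps=1e-12):
    """Average of -(y*log(p) + (1-y)*log(1-p)) over all samples."""
    total = 0.0
    for y, p in zip(targets, probabilities):
        p = min(max(p, eps), 1.0 - eps)  # clip so log never sees 0
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(targets)

good = binary_cross_entropy([1, 0], [0.9, 0.1])  # confident and correct: low loss
bad = binary_cross_entropy([1, 0], [0.1, 0.9])   # confident and wrong: high loss
print(good, bad)
```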
Categorical Cross Entropy:
Categorical Cross Entropy (CCE) loss is used for multi-class classification tasks. If you are using the CCE loss function, there must be the same number of output nodes as there are classes. The final layer's output should be passed through a softmax activation so that each node outputs a probability value between 0 and 1.
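A minimal sketch of categorical cross-entropy for one sample with three classes, with softmax applied to the output nodes first. The raw scores, the one-hot targets, the function names, and the `eps` guard against log(0) are all assumptions made for illustration.

```python
import math

def softmax(logits):
    m = max(logits)  # shift by the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot_target, probabilities, eps=1e-12):
    """-sum(t * log(p)); with a one-hot target this is -log of the true class probability."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot_target, probabilities))

logits = [2.0, 1.0, 0.1]          # one output node per class (3 classes)
probs = softmax(logits)
loss_correct = categorical_cross_entropy([1, 0, 0], probs)  # true class is the top-scoring one
loss_wrong = categorical_cross_entropy([0, 0, 1], probs)    # true class is the lowest-scoring one
print(loss_correct, loss_wrong)
```

The loss is smaller when the network assigns a high probability to the true class.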