Mean squared error
The mean squared error (MSE) is also called the quadratic cost function because it uses the squared difference between the network's output and the expected output to measure the magnitude of the error:
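Written out, a standard form of this cost is shown below (some texts include an additional factor of 1/2, which does not change where the minimum lies; it is omitted here):

$$\text{MSE} = \frac{1}{n}\sum_{i=1}^{n}\left(y_i - a_i\right)^2$$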
Here, the following applies:
- a is the output from the ANN
- y is the expected output
- n is the number of samples used
The cost function is pretty straightforward. For example, consider a single neuron with just one sample (n=1). If the expected output is 2 (y=2) and the neuron outputs 3 (a=3), then the MSE is as follows:
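Working this out with the definition above:

$$\text{MSE} = \frac{1}{1}\,(2 - 3)^2 = 1$$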
Similarly, if the expected output is 3 (y=3) and the neuron outputs 2 (a=2), then the MSE is as follows:
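And likewise:

$$\text{MSE} = \frac{1}{1}\,(3 - 2)^2 = 1$$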
Therefore, the MSE quantifies the magnitude of the error made by the neuron, regardless of its sign. One issue with the MSE is that learning slows down when the values in the network become large. In other words, when the weights (w) and bias (b), and hence the weighted input z, grow large, the neuron's activation saturates; with a sigmoid activation, for example, the derivative of the activation approaches zero, so the gradient of the cost becomes tiny and learning becomes very slow. Keep in mind that an ANN contains thousands of such neurons, which is why learning slows down and can eventually stagnate with no further progress.
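To see this effect, the following minimal sketch (a hypothetical single sigmoid neuron with one sample, not code from this chapter) computes the gradient of the squared error for increasingly large weights and biases. As w and b (and therefore z) grow, the sigmoid saturates, its derivative shrinks toward zero, and so do the gradients that drive learning, even though the error itself stays large:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def mse_gradients(w, b, x, y):
    """Gradients of the cost (a - y)**2 for a single sigmoid neuron
    and one sample; both gradients contain the factor sigmoid'(z)."""
    z = w * x + b
    a = sigmoid(z)
    delta = 2 * (a - y) * a * (1 - a)   # 2*(a - y) * sigmoid'(z)
    return delta * x, delta             # dC/dw, dC/db

x, y = 1.0, 0.0                         # one input, expected output 0
for w, b in [(0.5, 0.5), (2.0, 2.0), (5.0, 5.0)]:
    dC_dw, dC_db = mse_gradients(w, b, x, y)
    print(f"w={w}, b={b}: dC/dw={dC_dw:.6f}, dC/db={dC_db:.6f}")
```

Running the snippet shows the gradients dropping by several orders of magnitude between the small and large settings of w and b, even though the neuron's output error barely changes.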