The formula at the end is very simple, but it is an important part of how neural networks learn: gradient descent, used together with the backpropagation algorithm.
I am talking about "Linear Regression". I don't find this term intuitive enough, which is why I haven't used it in the title. Let's break it down: it defines the relationship between two variables, an independent variable (x) and a dependent variable (y).
Dataset
Try out the full dataset from here.
| xi (experience in months) | yi (salary in thousands) |
|---|---|
| 18.290 | 16.522 |
| 17.023 | 11.666 |
| 26.344 | 23.167 |
| 19.106 | 20.877 |
| 27.743 | 23.166 |
| 31.671 | 32.966 |
| 14.186 | 15.294 |
| 29.933 | 33.159 |
| 32.841 | 32.033 |
| 26.874 | 32.348 |
So you will have two types of values: the observed value y (the actual value, from the labeled data) and the predicted value ŷ. Let's assume there is a linear relationship between the feature variable (x) and the predicted variable ŷ, given as ŷ = w·x + b, where w is the weight (slope) and b is the bias (intercept).
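To make this concrete, here is a minimal Python sketch of the linear model on the dataset above; the starting values of w and b are arbitrary guesses, not fitted parameters, and the function name `predict` is my own choice:

```python
# Dataset from the table above.
xs = [18.290, 17.023, 26.344, 19.106, 27.743,
      31.671, 14.186, 29.933, 32.841, 26.874]  # experience in months
ys = [16.522, 11.666, 23.167, 20.877, 23.166,
      32.966, 15.294, 33.159, 32.033, 32.348]  # salary in thousands

def predict(x, w, b):
    """Predicted salary for experience x, using the line y_hat = w*x + b."""
    return w * x + b

w, b = 1.0, 0.0  # arbitrary starting parameters, not the best fit
predictions = [predict(x, w, b) for x in xs]
print(predictions[0])  # → 18.29
```

With w = 1 and b = 0 the prediction is simply x itself, which is clearly not the best line; the rest of the article is about choosing w and b properly.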
When you predict a value, you can't be sure it will be the same as the actual value y; there will be some error. Let's calculate the mean squared error for the whole dataset using the formula below.
Mean Squared Error: a fundamental formula, just the square of the difference between the actual and predicted values, averaged over the whole dataset:

E = (1/n) · Σᵢ (yᵢ − ŷᵢ)²
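The MSE formula can be sketched in Python as follows (the function name `mse` is my own choice):

```python
def mse(xs, ys, w, b):
    """Mean squared error of the line y_hat = w*x + b over the dataset."""
    n = len(xs)
    return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / n

# A line that passes exactly through every point gives zero error:
print(mse([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 2.0, 0.0))  # → 0.0
```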
One thing you would notice here is that the error E is a quadratic function of the parameters. Substituting ŷᵢ = w·xᵢ + b:

E(w, b) = (1/n) · Σᵢ (yᵢ − (w·xᵢ + b))²
We must select the two parameters, w and b, in such a way that our error is minimized. This is crucial because minimizing the error gives us the best-fitting line, one that accurately predicts the trend for any given independent variable. Consequently, our predictions will closely match the actual values.
To find the minimum of a quadratic function, first note that our parabola opens in the positive Y-direction: whatever arbitrary values you choose for w and b, E is always non-negative, because it is an average of squares.
For this type of curve, the minimum is the point where the slope is 0. Unfortunately, to calculate where the slope is 0, we need the equation of the curve: we take its derivative with respect to w and set it equal to 0 to solve for the w value. Remember, our curve lies in the E-versus-w graph. For the MSE above, the derivative works out to:

dE/dw = −(2/n) · Σᵢ xᵢ · (yᵢ − (w·xᵢ + b))
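That derivative, the slope of the error curve with respect to w, can be sketched in Python (the function name `grad_w` is my own choice):

```python
def grad_w(xs, ys, w, b):
    """Slope of the MSE curve with respect to w: dE/dw."""
    n = len(xs)
    return (-2.0 / n) * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys))

# At the bottom of the parabola (a perfect fit), the slope is zero.
slope = grad_w([1.0, 2.0, 3.0], [2.0, 4.0, 6.0], 2.0, 0.0)
print(slope == 0)  # → True
```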
There is one more way to find the minimum value: first take any arbitrary value of w and calculate the slope. If the slope is positive, decrease w by a certain factor; if the slope is negative, increase w by a certain factor. We will call this factor the learning rate (α). This gives the update rule:

w ← w − α · (dE/dw)
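Putting the pieces together, here is a minimal gradient-descent sketch; the learning rate, step count, and starting point are arbitrary choices of mine, not values from the article, and b is updated alongside w using the analogous derivative:

```python
def gradient_descent(xs, ys, lr=0.001, steps=10000):
    """Fit y_hat = w*x + b by repeatedly stepping against the slope."""
    w, b = 0.0, 0.0  # arbitrary starting point
    n = len(xs)
    for _ in range(steps):
        # Slopes of the MSE with respect to w and b.
        dw = (-2.0 / n) * sum(x * (y - (w * x + b)) for x, y in zip(xs, ys))
        db = (-2.0 / n) * sum(y - (w * x + b) for x, y in zip(xs, ys))
        # Positive slope -> decrease the parameter; negative slope -> increase it.
        w -= lr * dw
        b -= lr * db
    return w, b
```

For example, on data that lies exactly on the line y = 2x, this recovers w ≈ 2 and b ≈ 0; too large a learning rate makes the steps overshoot and diverge, too small a one makes convergence very slow.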
You can gain insight into this formula on this website: gradient-descent-visualiser