Lately I've been going through CS229, Stanford's famous machine learning course taught by Andrew Ng.
The course is pretty theory-heavy, and honestly the math gets confusing at times. But instead of just watching the lectures, I wanted to actually build the algorithms from scratch and see what's happening under the hood.

So in this post, I'll walk through how I coded a linear regression model from scratch.
No libraries, no frameworks - just pure math and Python.

Step 1: Defining the Training Data

X = [[1], [2], [3], [4], [5]]
y = [2, 4, 6, 8, 10]
Here X denotes the inputs and y the corresponding outputs.
Our goal is to learn a mapping from X to y.

Step 2: Bias Term and Theta

Next, a bias term is added so the model can learn where the line sits (its intercept), not just its slope.
This is done by prepending a constant 1 to every input example.

Before: X = [ [1], [2], [3] ]
After: X = [ [1,1], [1,2], [1,3] ]

Let's also initialize theta as [0.0, 0.0]; these are the parameters of our model, i.e. the intercept and the slope respectively.
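
In plain Python, this step looks roughly like the following (a minimal sketch; the helper name add_bias is my own):

def add_bias(X):
    # Prepend a constant feature of 1 to every input example
    return [[1.0] + row for row in X]

X = [[1], [2], [3], [4], [5]]
y = [2, 4, 6, 8, 10]

X_b = add_bias(X)    # [[1.0, 1], [1.0, 2], [1.0, 3], [1.0, 4], [1.0, 5]]
theta = [0.0, 0.0]   # theta[0] = intercept, theta[1] = slope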

Step 3: Defining the Cost Function

Let's define the hyperparameters (the learning rate and the number of training epochs) and a cost function to measure our model's error.

A cost function measures how wrong a model’s predictions are compared to the actual results, i.e. it quantifies the error so the model can improve.
The goal is to minimize the cost function by adjusting the parameters of the model.
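
For linear regression the standard choice is the mean squared error cost, J(θ) = (1/2m) * Σ (h(xi) − yi)². Here is roughly what that looks like in code (a sketch; the hyperparameter values below are just one reasonable choice for this tiny dataset):

def predict(x_row, theta):
    # Hypothesis h(x) = theta_0 * 1 + theta_1 * x
    return sum(t * xi for t, xi in zip(theta, x_row))

def cost(X_b, y, theta):
    # J(theta) = (1 / 2m) * sum of squared prediction errors
    m = len(y)
    return sum((predict(X_b[i], theta) - y[i]) ** 2 for i in range(m)) / (2 * m)

# Hyperparameters: illustrative values, not the only possible choice
learning_rate = 0.05
epochs = 1000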

Step 4: Implementing Gradient Descent Algorithm

Now I'll implement the gradient descent algorithm: iteratively adjusting the model parameters to reduce the cost function J(θ). The update is repeated until convergence; since the linear regression cost is convex, θ eventually settles at the global minimum.
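
The batch update rule is θj := θj − α * (1/m) * Σ (h(xi) − yi) * xij, applied to all parameters simultaneously. A sketch of the loop, building on the helpers above:

def gradient_descent(X_b, y, theta, learning_rate, epochs):
    m = len(y)
    for _ in range(epochs):
        # Prediction error for each training example under the current theta
        errors = [predict(X_b[i], theta) - y[i] for i in range(m)]
        # Simultaneous update: theta_j -= alpha * (1/m) * sum(error_i * x_ij)
        theta = [
            theta[j] - learning_rate * sum(errors[i] * X_b[i][j] for i in range(m)) / m
            for j in range(len(theta))
        ]
    return theta

theta = gradient_descent(X_b, y, theta, learning_rate, epochs)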

Step 5: Testing

With training complete, it’s time to test our model.
We’ll input x = 6 and see what the learned parameters predict.
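
The test itself is a single call to the hypothesis; the only thing to remember is to include the bias feature in the new input (a sketch):

x_new = [1.0, 6]  # bias feature + the new input x = 6
print(f"Result: {predict(x_new, theta):.2f}")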

Result: 12.00 - a perfect prediction, since the training data follows y = 2x exactly.

Step 6: Finding Convergence

Let's find out when the model actually converged.
By tracking predictions every 50 epochs, I observed:
At epoch 750: Result = 11.999
At epoch 800: Result = 12.000

This confirms our model reached near-perfect convergence between 750 and 800 epochs and stayed stable afterward!
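
The tracking itself is just the training loop from Step 4 with a print added, reusing the helpers and hyperparameters from the earlier steps, something like:

theta = [0.0, 0.0]
m = len(y)
for epoch in range(1, epochs + 1):
    errors = [predict(X_b[i], theta) - y[i] for i in range(m)]
    theta = [
        theta[j] - learning_rate * sum(errors[i] * X_b[i][j] for i in range(m)) / m
        for j in range(len(theta))
    ]
    if epoch % 50 == 0:
        # Watch the test prediction approach 12.0 as training progresses
        print(f"Epoch {epoch}: prediction for x = 6 is {predict([1.0, 6], theta):.3f}")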

Full Code:
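
Here is the whole thing assembled into one self-contained, runnable script (a sketch of the steps above; again, the hyperparameter values are illustrative):

# Linear regression from scratch - pure Python, no libraries

# Step 1: training data
X = [[1], [2], [3], [4], [5]]
y = [2, 4, 6, 8, 10]

# Step 2: bias term and parameters
def add_bias(X):
    return [[1.0] + row for row in X]

X_b = add_bias(X)
theta = [0.0, 0.0]  # [intercept, slope]

# Hyperparameters: illustrative values
learning_rate = 0.05
epochs = 1000

# Step 3: hypothesis and cost function
def predict(x_row, theta):
    return sum(t * xi for t, xi in zip(theta, x_row))

def cost(X_b, y, theta):
    m = len(y)
    return sum((predict(X_b[i], theta) - y[i]) ** 2 for i in range(m)) / (2 * m)

# Step 4: batch gradient descent
m = len(y)
for epoch in range(1, epochs + 1):
    errors = [predict(X_b[i], theta) - y[i] for i in range(m)]
    theta = [
        theta[j] - learning_rate * sum(errors[i] * X_b[i][j] for i in range(m)) / m
        for j in range(len(theta))
    ]
    # Step 6: track convergence every 50 epochs
    if epoch % 50 == 0:
        print(f"Epoch {epoch}: cost = {cost(X_b, y, theta):.6f}, "
              f"prediction for x = 6 is {predict([1.0, 6], theta):.3f}")

# Step 5: test the learned model on x = 6
print(f"Result: {predict([1.0, 6], theta):.2f}")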

Conclusion

And that is how I coded a linear regression model from scratch!

What I liked about this exercise is that it made the math feel less like something written on a slide and more like a process I could actually follow.

It is obviously a very small example, but building it from scratch helped me understand what is really happening behind the scenes when a model "learns".

Next up, maybe I'll try logistic regression.