Step 2: What Is a Model?

A model is two numbers — drag the knobs and BE the algorithm


What Is a Model, Really?

You'll hear the word model in every ML course, blog post, and product spec — usually without anyone defining it. Here's the unflashy truth:

A model is a small bag of numbers. Nothing more. Not an algorithm. Not code. Not a black box. Just numbers stored in memory that, together, describe a function from input to output.

For linear regression that bag has exactly two numbers: a weight and a bias. They define a single straight line:

```
response_time = (weight * requests_per_second) + bias
```

That equation IS the model. When you save a trained linear regression to disk, those two numbers are what get saved.
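To make "a small bag of numbers" concrete, here is the entire model in plain Python. The specific values are illustrative; a trained model will have whatever values the algorithm picked:

```python
# A linear-regression "model" is literally two numbers.
weight = 1.82   # illustrative values; your trained model will differ
bias = 29.5

def predict(requests_per_second):
    """The whole model at inference time: one multiply, one add."""
    return weight * requests_per_second + bias

# "Saving the model to disk" means saving those two numbers.
print(predict(100))  # predicted response time in ms for 100 req/s
```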

Algorithm vs Model

The two words get used interchangeably and they shouldn't be. Here's the split:

| | Algorithm | Model |
|---|---|---|
| What it is | A procedure (a recipe) | A bag of tuned numbers (the cake) |
| When it runs | Once, during training | Every time you call `.predict()` |
| Linear regression example | "Find w and b that minimise squared error" | w = 1.82, b = 29.5 |
| What you ship to prod | Nothing | The numbers |

The algorithm runs once, looks at all your data, picks the best numbers, and then it is done. The model — those numbers — is the thing you actually use in production.
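Here is that split in runnable form: a minimal sketch of the training algorithm in plain Python, on toy data chosen to land near the w ≈ 1.8, b ≈ 30 used in this lesson. sklearn's `LinearRegression` solves the same problem, just more generally:

```python
# The "algorithm" half: run once over the data, output two numbers.
xs = [10, 20, 30, 40, 50]      # requests per second (toy data)
ys = [48, 66, 85, 103, 121]    # observed response times, ms

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# Closed-form least-squares solution for one feature:
w = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
    sum((x - mean_x) ** 2 for x in xs)
b = mean_y - w * mean_x

# The algorithm is now done. These two numbers ARE the model.
# (sklearn equivalent: LinearRegression().fit(...) then .coef_, .intercept_)
print(w, b)
```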

How Big Is a "Model"?

The number of parameters in the bag depends on what kind of model it is. The mental model is the same at every scale:

| Model | Parameter count | What they represent |
|---|---|---|
| Linear regression (this lesson) | 2 | 1 weight + 1 bias |
| Digits classifier (Lesson 1.1) | ~640 | 64 pixels × 10 digit classes |
| ResNet-50 (image classifier) | ~25 million | Convolutional filter weights |
| GPT-4 (LLM) | ~1.7 trillion | Transformer attention + feed-forward weights |

Same idea, twelve orders of magnitude apart. The bag of numbers gets bigger; what it is doesn't change.
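The digits-classifier row is worth sanity-checking yourself: one weight per (pixel, class) pair, ignoring the optional per-class biases that would push the count slightly above 640:

```python
# Sanity-check the digits-classifier row: a linear model needs
# one weight per (pixel, class) pair.
pixels, classes = 64, 10
print(pixels * classes)  # → 640
```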

Be the Algorithm — Drag Two Knobs

Reading about this is the wrong way to learn it. Run the explore script and physically tune the model by hand: drag two sliders, watch the red line move across the data cloud, watch the error metric (RMSE) climb and fall. Then click "Let the algorithm do it" and watch sklearn snap the knobs to the optimal values in one shot.

After you've tuned the model by hand, the next step in this lesson, `model.fit()`, stops being magic: `.fit()` does exactly what you just did, only analytically and instantly.

```shell
python curriculum/stage1_classic_ml/02_linear_regression/0_interactive_intro/explore_model_knobs.py
```

Challenges to attempt before clicking "Let the algorithm do it":

  1. What is the RMSE of a totally random model (the starting position)?
  2. How low can you push the RMSE by hand? Write your best score down.
  3. By how much does the algorithm beat your best score?
  4. The optimal values are roughly w ≈ 1.8 and b ≈ 30. Translate them into plain English for a SOC analyst monitoring the server.

Think Deeper

Run `explore_model_knobs.py`. By hand, try to push the RMSE below 20 ms. Then click "Let the algorithm do it". By how much did the algorithm beat your best score, and why?

Most learners can get the RMSE into the 20–40 ms range by eye. The algorithm typically lands around 15 ms: better, but not by orders of magnitude. The algorithm wins because it solves for the optimal w and b analytically (via the normal equation) instead of guessing. The key insight isn't that the algorithm is smart; it's that the model is just those two numbers, and the algorithm's only job is picking them. Once it has, the algorithm is done and you ship the numbers.
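"Analytically" can be made concrete: the normal equation is one small linear solve. A NumPy sketch on toy data (the rps/ms values are illustrative, chosen to land near the lesson's w ≈ 1.8, b ≈ 30):

```python
import numpy as np

# The normal equation: solve (XᵀX)θ = Xᵀy for θ = [w, b] directly.
# No guessing, no iteration.
rps = np.array([10.0, 20.0, 30.0, 40.0, 50.0])    # requests/sec (toy data)
ms  = np.array([48.0, 66.0, 85.0, 103.0, 121.0])  # response time, ms

X = np.column_stack([rps, np.ones_like(rps)])  # column of 1s carries the bias
w, b = np.linalg.solve(X.T @ X, X.T @ ms)
print(w, b)
```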
Cybersecurity tie-in: When a vendor says "we shipped a new model to production," they mean they updated a file containing tuned parameters. That file is an asset — it can be stolen (model exfiltration), tampered with (model poisoning), or backdoored before deployment. You cannot defend the model if you do not know what the model physically is. It is a file with numbers in it.
