Bias and Variance
Bias is the error from wrong assumptions. A high-bias model is too simple to capture the true pattern — it underfits. Every time you train it on different data, you get the same wrong answer.
Variance is the error from sensitivity to training data fluctuations. A high-variance model is too complex — it overfits. Train it on different data subsets and you get wildly different models.
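These two definitions can be measured directly: fit the same model class on many fresh noisy samples of a known function, then compare the average prediction to the truth (bias) and the spread of predictions across fits (variance). A minimal sketch — the sine target, noise level, and `prediction_spread` helper are all illustrative choices, not from any particular library:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

def true_fn(x):
    return np.sin(2 * np.pi * x)

x_grid = np.linspace(0, 1, 50).reshape(-1, 1)

def prediction_spread(max_depth, n_repeats=200, n_samples=40):
    """Refit on fresh noisy samples; estimate bias^2 and variance on a grid."""
    preds = np.empty((n_repeats, len(x_grid)))
    for i in range(n_repeats):
        x = rng.uniform(0, 1, n_samples).reshape(-1, 1)
        y = true_fn(x).ravel() + rng.normal(0, 0.3, n_samples)
        model = DecisionTreeRegressor(max_depth=max_depth).fit(x, y)
        preds[i] = model.predict(x_grid)
    avg = preds.mean(axis=0)
    bias2 = np.mean((avg - true_fn(x_grid).ravel()) ** 2)
    variance = preds.var(axis=0).mean()
    return bias2, variance

for depth in (1, 4, None):
    b2, v = prediction_spread(depth)
    print(f"depth={depth}: bias^2={b2:.3f} variance={v:.3f}")
```

The shallow tree gives nearly the same wrong curve every run (high bias, low variance); the unlimited-depth tree tracks each sample's noise (low bias, high variance).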
| | High Bias (underfit) | High Variance (overfit) |
|---|---|---|
| Training error | High | Low |
| Test error | High | High |
| Gap | Small | Large |
| Fix | More complex model | Regularisation, more data |
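One half of the overfitting fix in the table — more data — is easy to verify: train an unregularised tree on growing slices of a training pool and watch the train/validation gap shrink. A sketch using synthetic `make_moons` data as a stand-in for any tabular dataset:

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data: a pool to draw training sets from, plus a fixed validation set
X, y = make_moons(n_samples=4000, noise=0.3, random_state=42)
X_pool, X_val, y_pool, y_val = train_test_split(X, y, test_size=1000, random_state=42)

for n in (50, 200, 1000, 3000):
    model = DecisionTreeClassifier(random_state=42).fit(X_pool[:n], y_pool[:n])
    tr = model.score(X_pool[:n], y_pool[:n])
    va = model.score(X_val, y_val)
    print(f"n={n:5d} train={tr:.3f} val={va:.3f} gap={tr-va:.3f}")
```

The unpruned tree memorises every training set (train accuracy stays at 1.0), but with more data its memorisation generalises better, so the gap narrows.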
Three Regimes Visualised
| Regime | Depth | Train Acc | Val Acc | Gap | Decision boundary |
|---|---|---|---|---|---|
| UNDERFIT | 1 | ~65% | ~65% | ~0% | A single split — too simple to learn the pattern |
| GOOD FIT | 5 | ~99% | ~97% | ~2% | Clean boundary — captures the real decision surface |
| OVERFIT | 50 | 100% | ~92% | ~8% | Wiggly boundary around noise — memorises every point |
The sweet spot is the depth where both bias and variance are acceptably low: validation accuracy peaks and the train-validation gap stays small.
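Finding that depth is a one-line sweep: fit at each candidate depth and keep the one with the best validation score. A sketch on synthetic `make_moons` data (the dataset and depth range are illustrative):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_moons(n_samples=1000, noise=0.25, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)

best_depth, best_val = None, 0.0
for depth in range(1, 21):
    model = DecisionTreeClassifier(max_depth=depth, random_state=42).fit(X_train, y_train)
    va = model.score(X_val, y_val)
    if va > best_val:
        best_depth, best_val = depth, va
print(f"best depth={best_depth} val acc={best_val:.3f}")
```

In practice you would use cross-validation (e.g. `GridSearchCV`) rather than a single validation split, but the idea is the same.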
Comparing Three Models
A self-contained version (using synthetic `make_moons` data as a stand-in for any tabular dataset):

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic two-class data with a held-out validation split
X, y = make_moons(n_samples=1000, noise=0.25, random_state=42)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.3, random_state=42)

models = {
    "Underfit (depth=1)": DecisionTreeClassifier(max_depth=1, random_state=42),
    "Good fit (depth=5)": DecisionTreeClassifier(max_depth=5, random_state=42),
    "Overfit (depth=50)": DecisionTreeClassifier(max_depth=50, random_state=42),
}
for name, model in models.items():
    model.fit(X_train, y_train)
    tr = model.score(X_train, y_train)  # accuracy on data the model has seen
    va = model.score(X_val, y_val)      # accuracy on held-out data
    print(f"{name:25s} train={tr:.3f} val={va:.3f} gap={tr-va:.3f}")
```
Security Implications
| Regime | Security impact |
|---|---|
| Underfit (high bias) | Misses most attacks — model is too simple to learn attack patterns |
| Good fit | Balanced detection rate and false-positive rate |
| Overfit (high variance) | Catches training attacks but fails on new variants in production |
An overfit intrusion detector may pass all tests in the lab but miss novel attack variants the moment it goes live. The bias-variance tradeoff is not just theory — it directly determines whether your model protects the network or gives a false sense of security.
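The lab-versus-production failure can be simulated with distribution shift: train on a "lab" distribution and score on a noisier variant standing in for novel attacks. A sketch — `make_moons` and the noise levels are hypothetical stand-ins for real traffic features:

```python
from sklearn.datasets import make_moons
from sklearn.tree import DecisionTreeClassifier

# "Lab" data the detector is trained on, and a shifted distribution
# standing in for novel attack variants seen only in production
X_lab, y_lab = make_moons(n_samples=2000, noise=0.2, random_state=0)
X_new, y_new = make_moons(n_samples=2000, noise=0.4, random_state=1)

for name, depth in [("regularised (depth=5)", 5), ("overfit (no limit)", None)]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0).fit(X_lab, y_lab)
    lab = model.score(X_lab, y_lab)
    new = model.score(X_new, y_new)
    print(f"{name:22s} lab={lab:.3f} novel={new:.3f} drop={lab-new:.3f}")
```

The overfit model's perfect lab score overstates its real capability: its accuracy drop on the shifted data is larger than the regularised model's.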
Think Deeper
A depth-1 tree and a depth-50 tree both fail in production, but for opposite reasons. Explain what each gets wrong.