# Precision vs Recall
Precision answers: "Of all the alerts my system raised, what fraction were real attacks?"
Recall answers: "Of all the real attacks that occurred, what fraction did my system catch?"
| Metric | Formula | High means | Low means |
|---|---|---|---|
| Precision | TP / (TP + FP) | Few false alarms | Alert fatigue |
| Recall | TP / (TP + FN) | Few missed attacks | Threats slip through |
| F1 | 2 × P × R / (P + R) | Balanced performance | One metric is collapsing |
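The formulas in the table can be checked with a quick worked example. The counts below (80 true positives, 20 false positives, 40 false negatives) are illustrative numbers, not taken from any dataset in this lesson:

```python
# Toy confusion-matrix counts (illustrative, chosen for easy arithmetic)
tp, fp, fn = 80, 20, 40

precision = tp / (tp + fp)                          # 80 / 100 = 0.800
recall = tp / (tp + fn)                             # 80 / 120 ≈ 0.667
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean ≈ 0.727

print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1:        {f1:.3f}")
```

Note that F1, as a harmonic mean, sits closer to the weaker of the two metrics, which is why it "collapses" when either one does.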
## The Fundamental Tradeoff
These two metrics pull in opposite directions:
- Lower threshold → more alerts → higher recall → but lower precision
- Higher threshold → fewer, confident alerts → higher precision → but lower recall
You cannot have both perfect precision and perfect recall simultaneously (unless the model is perfect).
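A minimal sketch of the tradeoff, using made-up model scores and labels (hypothetical numbers, not from the lesson's data): moving the alert threshold down raises recall at the cost of precision, and vice versa.

```python
# Hypothetical attack scores from a model, with ground-truth labels (1 = attack)
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    """Alert on every score >= threshold; return (precision, recall)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

# Low threshold: every attack caught, but 3 of 7 alerts are false
print(precision_recall(0.3))   # (0.571..., 1.0)
# High threshold: every alert is real, but half the attacks are missed
print(precision_recall(0.8))   # (1.0, 0.5)
```

The same sweep over many thresholds is what a precision-recall curve plots.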
## When to Prioritise Which
| Security tool | Priority | Why |
|---|---|---|
| Email phishing scanner | Recall | Missing a phish leads to credential theft |
| Auto-blocking firewall | Precision | Blocking legitimate traffic has business impact |
| SOC triage queue | F1 (balanced) | Analysts need both trust and coverage |
| Malware sandbox | Recall | Missing malware = network compromise |
## Key Code Pattern
```python
from sklearn.metrics import classification_report, precision_score, recall_score, f1_score

# Full per-class report
print(classification_report(y_test, y_pred,
                            target_names=['benign', 'attack']))

# Individual metrics
p = precision_score(y_test, y_pred)
r = recall_score(y_test, y_pred)
f = f1_score(y_test, y_pred)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f:.3f}")
```
## Think Deeper
Try this:
Two models: Model A has precision=0.95, recall=0.60. Model B has precision=0.70, recall=0.95. Which do you deploy for email phishing?
For phishing email scanning, Model B. Missing a phishing email (FN) can lead to credential theft and lateral movement. A false positive only means one legitimate email gets quarantined. Recall matters more here. But for auto-blocking at a firewall? Model A — blocking legitimate traffic has business impact.
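For what it's worth, F1 alone would also favour Model B here, though the real argument is the asymmetric cost of a miss, not the score. A quick check of both models' F1 (using only the precision/recall numbers stated above):

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

f1_a = f1(0.95, 0.60)   # Model A: precise, but misses 40% of phish
f1_b = f1(0.70, 0.95)   # Model B: noisier, but catches 95% of phish
print(f"Model A F1: {f1_a:.3f}, Model B F1: {f1_b:.3f}")
```

Model B scores higher because F1 punishes the weaker metric, and Model A's recall is its weak side.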
Cybersecurity tie-in: Alert fatigue is real. A model with 10% precision means 9 out of 10 alerts are false. Analysts learn to ignore it, and the one real alert gets missed too. High recall without acceptable precision is operationally useless.
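To make that concrete, here is the same arithmetic at a hypothetical daily alert volume (the 200-alerts-per-day figure is an assumption for illustration, not from the text):

```python
alerts_per_day = 200                          # hypothetical SOC volume
precision = 0.10                              # 10% of alerts are real

true_alerts = alerts_per_day * precision      # 20 real incidents
false_alerts = alerts_per_day - true_alerts   # 180 false alarms to wade through

print(f"{false_alerts:.0f} false alerts for every {true_alerts:.0f} real ones")
```

At that ratio, triaging the queue honestly is a full-time job, which is exactly how the real alerts get buried.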