Step 3: Precision, Recall, F1

The metrics that actually matter in security


Precision vs Recall

Precision answers: "Of all the alerts my system raised, what fraction were real attacks?"

Recall answers: "Of all the real attacks that occurred, what fraction did my system catch?"

Metric | Formula | High means | Low means
Precision | TP / (TP + FP) | Few false alarms | Alert fatigue
Recall | TP / (TP + FN) | Few missed attacks | Threats slip through
F1 | 2 × P × R / (P + R) | Balanced performance | One metric is collapsing
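
To make the formulas concrete, here is a minimal sketch that computes all three metrics from raw confusion-matrix counts. The function name `precision_recall_f1` and the counts are hypothetical, chosen only for illustration, and returning 0.0 when a denominator is zero is a convention assumed here rather than something the lesson specifies.

```python
def precision_recall_f1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Compute precision, recall and F1 from confusion-matrix counts."""
    # Guard each division: with no alerts (or no attacks) the ratio is undefined,
    # so this sketch falls back to 0.0 (an assumed convention).
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return precision, recall, f1


# Hypothetical counts: 80 attacks caught, 20 false alarms, 40 attacks missed.
p, r, f1 = precision_recall_f1(tp=80, fp=20, fn=40)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f1:.3f}")
# Precision: 0.800  Recall: 0.667  F1: 0.727
```

Note that true negatives never appear in any of the three formulas, which is why these metrics stay informative even when benign events vastly outnumber attacks.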

The Fundamental Tradeoff

These two metrics pull in opposite directions:

  • Lower threshold → more alerts → higher recall → but lower precision
  • Higher threshold → fewer, confident alerts → higher precision → but lower recall

You cannot reach perfect precision and perfect recall at the same time unless the model's scores separate attacks from benign events completely.
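
The tradeoff can be seen by sweeping a threshold over a small set of scores. This is a sketch under stated assumptions: the `scores` and `labels` lists are made-up toy data, the helper `precision_recall_at` is hypothetical, and an alert is assumed to fire when a score is greater than or equal to the threshold.

```python
# Toy data (invented for illustration): higher score = more attack-like, 1 = attack.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 0, 1, 0, 0]


def precision_recall_at(threshold, scores, labels):
    """Raise an alert when score >= threshold, then score the alerts."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 1)
    fp = sum(1 for y, yhat in zip(labels, preds) if y == 0 and yhat == 1)
    fn = sum(1 for y, yhat in zip(labels, preds) if y == 1 and yhat == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Walk from a strict threshold to a permissive one.
for t in (0.85, 0.65, 0.25):
    p, r = precision_recall_at(t, scores, labels)
    print(f"threshold={t:.2f}  precision={p:.2f}  recall={r:.2f}")
# threshold=0.85  precision=1.00  recall=0.50
# threshold=0.65  precision=0.75  recall=0.75
# threshold=0.25  precision=0.57  recall=1.00
```

On this toy data, each step down in threshold catches more attacks but lets in more false alarms.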

When to Prioritise Which

Security tool | Priority | Why
Email phishing scanner | Recall | Missing a phish leads to credential theft
Auto-blocking firewall | Precision | Blocking legitimate traffic has business impact
SOC triage queue | F1 (balanced) | Analysts need both trust and coverage
Malware sandbox | Recall | Missing malware = network compromise
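
One way to turn a priority from the table into a concrete setting is to put a floor on the metric you care about most and then pick the best remaining operating point. The sketch below assumes this approach; it is not something the lesson prescribes. The operating points, the 0.95 floors and both helper names are hypothetical.

```python
# Hypothetical operating points for one model: (threshold, precision, recall).
# The numbers are illustrative only, not measurements.
operating_points = [
    (0.25, 0.57, 1.00),
    (0.65, 0.75, 0.75),
    (0.85, 1.00, 0.50),
]


def best_recall_with_precision_floor(points, min_precision):
    """Precision-first: keep points meeting the precision floor, maximise recall."""
    eligible = [pt for pt in points if pt[1] >= min_precision]
    return max(eligible, key=lambda pt: pt[2]) if eligible else None


def best_precision_with_recall_floor(points, min_recall):
    """Recall-first: keep points meeting the recall floor, maximise precision."""
    eligible = [pt for pt in points if pt[2] >= min_recall]
    return max(eligible, key=lambda pt: pt[1]) if eligible else None


# Auto-blocking firewall: precision comes first (assumed floor of 0.95).
firewall_choice = best_recall_with_precision_floor(operating_points, min_precision=0.95)
# Phishing scanner: recall comes first (assumed floor of 0.95).
phishing_choice = best_precision_with_recall_floor(operating_points, min_recall=0.95)

print("Firewall threshold:", firewall_choice[0])
print("Phishing threshold:", phishing_choice[0])
```

With these invented numbers, the firewall ends up at the strictest threshold and the phishing scanner at the most permissive one, matching the priorities in the table.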

Key Code Pattern

from sklearn.metrics import classification_report, precision_score, recall_score, f1_score

print(classification_report(y_test, y_pred,
                            target_names=['benign', 'attack']))

# Individual metrics for the positive class (scikit-learn's default pos_label=1, assumed here to be 'attack')
p = precision_score(y_test, y_pred)
r = recall_score(y_test, y_pred)
f = f1_score(y_test, y_pred)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f:.3f}")

Think Deeper

Two models: Model A has precision=0.95, recall=0.60. Model B has precision=0.70, recall=0.95. Which do you deploy for email phishing?

For phishing email scanning, deploy Model B. A missed phishing email (a false negative) can lead to credential theft and lateral movement, whereas a false positive only means a legitimate email gets quarantined, so recall matters more here. For auto-blocking at a firewall the answer flips to Model A, because blocking legitimate traffic has direct business impact.
Cybersecurity tie-in: Alert fatigue is real. At 10% precision, 9 out of 10 alerts are false. Analysts learn to ignore the alerts, and the 1 real one gets missed too. High recall without acceptable precision is operationally useless.
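
The comparison can be made tangible by converting each model's precision and recall into expected counts. This is a rough sketch: the figure of 100 phishing emails is a hypothetical workload, and the helper names `outcomes` and `f1_from` are invented for illustration. The false-positive count comes from rearranging precision = TP / (TP + FP).

```python
def outcomes(precision, recall, n_attacks):
    """Expected TP, FN and FP implied by precision, recall and the attack count."""
    tp = recall * n_attacks
    fn = n_attacks - tp
    fp = tp * (1 - precision) / precision  # rearranged from precision = tp / (tp + fp)
    return tp, fn, fp


def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Precision and recall for the two models in the question above.
models = {"A": (0.95, 0.60), "B": (0.70, 0.95)}

for name, (p, r) in models.items():
    # Hypothetical workload: 100 real phishing emails.
    tp, fn, fp = outcomes(p, r, n_attacks=100)
    print(f"Model {name}: F1={f1_from(p, r):.3f}  missed={fn:.0f}  false alarms={fp:.1f}")
```

Under that assumption, Model A misses about 40 phishing emails with roughly 3 false alarms, while Model B misses about 5 at the cost of roughly 41 false alarms.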
