# Precision vs Recall
Precision answers: "Of all the alerts my system raised, what fraction were real attacks?"
Recall answers: "Of all the real attacks that occurred, what fraction did my system catch?"
| Metric | Formula | High means | Low means |
|---|---|---|---|
| Precision | TP / (TP + FP) | Few false alarms | Alert fatigue |
| Recall | TP / (TP + FN) | Few missed attacks | Threats slip through |
| F1 | 2 × P × R / (P + R) | Balanced performance | One metric is collapsing |
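The formulas in the table can be checked with a quick worked example. The counts below (80 true positives, 20 false positives, 40 false negatives) are illustrative numbers, not taken from any dataset in this lesson:

```python
# Toy confusion-matrix counts (illustrative, chosen for easy arithmetic)
tp, fp, fn = 80, 20, 40

precision = tp / (tp + fp)                          # 80 / 100 = 0.800
recall = tp / (tp + fn)                             # 80 / 120 ≈ 0.667
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean ≈ 0.727

print(f"Precision: {precision:.3f}")
print(f"Recall:    {recall:.3f}")
print(f"F1:        {f1:.3f}")
```

Note that F1, as a harmonic mean, sits closer to the weaker of the two metrics, which is why it "collapses" when either one does.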
## The Fundamental Tradeoff
These two metrics pull in opposite directions:
- Lower threshold → more alerts → higher recall → but lower precision
- Higher threshold → fewer, confident alerts → higher precision → but lower recall
You cannot have both perfect precision and perfect recall simultaneously (unless the model is perfect).
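A minimal sketch of the tradeoff, using made-up model scores and labels (hypothetical numbers, not from the lesson's data): moving the alert threshold down raises recall at the cost of precision, and vice versa.

```python
# Hypothetical attack scores from a model, with ground-truth labels (1 = attack)
scores = [0.95, 0.85, 0.75, 0.65, 0.55, 0.45, 0.35, 0.25]
labels = [1,    1,    0,    1,    0,    1,    0,    0]

def precision_recall(threshold):
    """Alert on every score >= threshold; return (precision, recall)."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, labels))
    fp = sum(p == 1 and y == 0 for p, y in zip(preds, labels))
    fn = sum(p == 0 and y == 1 for p, y in zip(preds, labels))
    return tp / (tp + fp), tp / (tp + fn)

# Low threshold: every attack caught, but 3 of 7 alerts are false
print(precision_recall(0.3))   # (0.571..., 1.0)
# High threshold: every alert is real, but half the attacks are missed
print(precision_recall(0.8))   # (1.0, 0.5)
```

The same sweep over many thresholds is what a precision-recall curve plots.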
## When to Prioritise Which
| Security tool | Priority | Why |
|---|---|---|
| Email phishing scanner | Recall | Missing a phish leads to credential theft |
| Auto-blocking firewall | Precision | Blocking legitimate traffic has business impact |
| SOC triage queue | F1 (balanced) | Analysts need both trust and coverage |
| Malware sandbox | Recall | Missing malware = network compromise |
## Key Code Pattern
```python
from sklearn.metrics import classification_report, precision_score, recall_score, f1_score

# Full per-class report
print(classification_report(y_test, y_pred,
                            target_names=['benign', 'attack']))

# Individual metrics
p = precision_score(y_test, y_pred)
r = recall_score(y_test, y_pred)
f = f1_score(y_test, y_pred)
print(f"Precision: {p:.3f}  Recall: {r:.3f}  F1: {f:.3f}")
```
## Think Deeper
Try this:
Two models: Model A has precision=0.95, recall=0.60. Model B has precision=0.70, recall=0.95. Which do you deploy for email phishing?
For phishing email scanning, Model B. Missing a phishing email (FN) can lead to credential theft and lateral movement. A false positive only means one legitimate email gets quarantined. Recall matters more here. But for auto-blocking at a firewall? Model A — blocking legitimate traffic has business impact.
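For what it's worth, F1 alone would also favour Model B here, though the real argument is the asymmetric cost of a miss, not the score. A quick check of both models' F1 (using only the precision/recall numbers stated above):

```python
def f1(p, r):
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r)

f1_a = f1(0.95, 0.60)   # Model A: precise, but misses 40% of phish
f1_b = f1(0.70, 0.95)   # Model B: noisier, but catches 95% of phish
print(f"Model A F1: {f1_a:.3f}, Model B F1: {f1_b:.3f}")
```

Model B scores higher because F1 punishes the weaker metric, and Model A's recall is its weak side.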
Cybersecurity tie-in: Alert fatigue is real. A model with 10% precision means 9 out of 10 alerts are false. Analysts learn to ignore it, and the one real alert gets missed too. High recall without acceptable precision is operationally useless.
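To make that concrete, here is the same arithmetic at a hypothetical daily alert volume (the 200-alerts-per-day figure is an assumption for illustration, not from the text):

```python
alerts_per_day = 200                          # hypothetical SOC volume
precision = 0.10                              # 10% of alerts are real

true_alerts = alerts_per_day * precision      # 20 real incidents
false_alerts = alerts_per_day - true_alerts   # 180 false alarms to wade through

print(f"{false_alerts:.0f} false alerts for every {true_alerts:.0f} real ones")
```

At that ratio, triaging the queue honestly is a full-time job, which is exactly how the real alerts get buried.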