The Operational Decision
Once you've selected a model, you still have one more decision: what threshold to deploy at. This is not a technical question — it's a business question:
"What is the cost of a missed attack vs the cost of a false alarm in your environment?"
Two Scenarios
| | Scenario A: "Catch All Attacks" | Scenario B: "Trusted Alerts Only" |
|---|---|---|
| Goal | Miss as few attacks as possible | When we alert, we want to be right |
| Accept | Higher false-alarm rate | Some attacks may slip through |
| Threshold | Low (0.2–0.3) | High (0.6–0.8) |
| Key metric | Recall on attack class | Precision on attack class |
| Use case | Email security, malware sandbox | Auto-blocking firewall rules |
Building a Threshold Summary Table
```python
from sklearn.metrics import precision_score, recall_score, f1_score

# Probability of the attack class, using the model and scaled test set from earlier steps
probs = model.predict_proba(X_test_scaled)[:, 1]

print(f"{'Threshold':>10} {'Precision':>10} {'Recall':>10} {'F1':>10} {'Flagged':>10}")
print("-" * 55)
for t in [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]:
    y_pred_t = (probs >= t).astype(int)
    p = precision_score(y_test, y_pred_t, zero_division=0)
    r = recall_score(y_test, y_pred_t, zero_division=0)
    f = f1_score(y_test, y_pred_t, zero_division=0)
    n = y_pred_t.sum()  # number of events flagged at this threshold
    print(f"{t:>10.1f} {p:>10.3f} {r:>10.3f} {f:>10.3f} {n:>10}")
```
This table is what you present to stakeholders. They decide the acceptable tradeoff — you provide the data.
Communicating to Stakeholders
Avoid jargon. Instead of "precision = 0.85 at threshold 0.3", say:
- "At this setting, we catch 95% of attacks, but roughly 1 in 7 alerts is a false alarm"
- "If we tighten the filter, false alarms drop to 1 in 20 but we miss 30% of attacks"
- "Your team handles 50 alerts/day — at threshold X, we generate 45"
The threshold is a dial, not a fixed value. Different deployments may use different thresholds.
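Translating metrics into this "1 in N" phrasing is mechanical enough to script. Here is a minimal sketch (the helper name and rounding choices are my own, not part of any library):

```python
def plain_language(precision: float, recall: float) -> str:
    """Translate precision/recall into stakeholder-friendly phrasing."""
    caught_pct = round(recall * 100)
    missed_pct = round((1 - recall) * 100)
    if precision >= 1.0:
        false_alarms = "essentially no alerts are false alarms"
    else:
        # False-discovery rate 1 - precision; 0.15 -> "about 1 in 7"
        one_in_n = round(1 / (1 - precision))
        false_alarms = f"about 1 in {one_in_n} alerts is a false alarm"
    return f"We catch {caught_pct}% of attacks and miss {missed_pct}%; {false_alarms}."

print(plain_language(precision=0.85, recall=0.95))
# We catch 95% of attacks and miss 5%; about 1 in 7 alerts is a false alarm.
```

Feeding each row of the threshold summary table through a helper like this gives stakeholders one readable sentence per candidate setting.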
Think Deeper
Try this:
Your SOC team says they can handle 50 alerts per day. Your model produces 200 at threshold 0.3 and 30 at threshold 0.6. What threshold do you recommend?
Threshold 0.6 keeps alerts under capacity (30/day), but check how many real attacks it misses. If recall drops from 95% to 40%, you're missing 60% of threats. The answer might be: deploy at 0.45 (find the threshold that produces ~50 alerts), or add auto-triage to handle volume at 0.3. The threshold decision is operational, not just mathematical.
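The "find the threshold that produces ~50 alerts" step has a direct solution: sort the predicted probabilities and read off the k-th largest score. A sketch, assuming `probs` from the summary-table code above (the function name and the synthetic data are illustrative):

```python
import numpy as np

def threshold_for_budget(probs: np.ndarray, target_alerts: int) -> float:
    """Smallest threshold whose (probs >= t) alert count fits the budget.

    Flagging the top-k scored events is equivalent to setting the
    threshold at the k-th largest predicted probability.
    """
    if target_alerts >= len(probs):
        return 0.0  # budget exceeds total traffic: flag everything
    top = np.sort(probs)[::-1]            # scores, highest first
    return float(top[target_alerts - 1])  # k-th largest score

# Synthetic example: 200 scored events/day, SOC budget of 50 alerts/day
rng = np.random.default_rng(0)
probs = rng.beta(2, 5, size=200)
t = threshold_for_budget(probs, target_alerts=50)
print(f"threshold={t:.3f}, alerts={(probs >= t).sum()}")
```

After fitting the threshold to the alert budget, re-check recall at that threshold: capacity tells you where the dial can go, but the miss rate tells you whether it should.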
Cybersecurity tie-in: This is the most operationally important step in the entire module.
A model is only as useful as its deployment threshold. The right threshold depends on your SOC's size,
alert triage automation, and the consequences of both missed threats and false alarms.
Models don't make decisions — operators do.