Step 5: Threshold Tuning

Choose the right operating point for your SOC


The Operational Decision

Once you've selected a model, you still have one more decision: what threshold to deploy at. This is not a technical question — it's a business question:

"What is the cost of a missed attack vs the cost of a false alarm in your environment?"

Two Scenarios

              Scenario A: "Catch All Attacks"     Scenario B: "Trusted Alerts Only"
Goal          Miss as few attacks as possible     When we alert, we want to be right
Accept        Higher false alarm rate             Some attacks may slip through
Threshold     Low (0.2–0.3)                       High (0.6–0.8)
Key metric    Recall on attack class              Precision on attack class
Use case      Email security, malware sandbox     Auto-blocking firewall rules
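
The effect of the two operating points is easy to see on the scores themselves: the same model, thresholded low vs high, flags very different numbers of events. A minimal sketch on synthetic scores (the probabilities here are random stand-ins for a real classifier's `predict_proba` output):

```python
import numpy as np

rng = np.random.default_rng(0)
# Stand-in for attack probabilities from a trained classifier
probs = rng.random(1000)

low_t, high_t = 0.25, 0.7            # Scenario A vs Scenario B operating points
flags_low = (probs >= low_t).sum()   # many alerts: higher recall, more false alarms
flags_high = (probs >= high_t).sum() # few alerts: higher precision, missed attacks

print(f"Scenario A (t={low_t}): {flags_low} events flagged")
print(f"Scenario B (t={high_t}): {flags_high} events flagged")
```

Same model, same scores: only the dial moved.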

Building a Threshold Summary Table

from sklearn.metrics import precision_score, recall_score, f1_score

# Probability of the positive (attack) class for each test sample
probs = model.predict_proba(X_test_scaled)[:, 1]

print(f"{'Threshold':>10} {'Precision':>10} {'Recall':>10} {'F1':>10} {'Flagged':>10}")
print("-" * 55)

# Sweep candidate thresholds and report the metrics at each operating point
for t in [0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]:
    y_pred_t = (probs >= t).astype(int)
    p = precision_score(y_test, y_pred_t, zero_division=0)
    r = recall_score(y_test, y_pred_t, zero_division=0)
    f = f1_score(y_test, y_pred_t, zero_division=0)
    n = y_pred_t.sum()  # how many samples this threshold flags as attacks
    print(f"{t:>10.1f} {p:>10.3f} {r:>10.3f} {f:>10.3f} {n:>10}")

This table is what you present to stakeholders. They decide the acceptable tradeoff — you provide the data.
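Stakeholders often state their constraint as an alert budget rather than a metric target. In that case you can scan the same threshold grid for the lowest threshold whose alert count fits the budget, keeping recall as high as possible. A sketch on synthetic scores (the `find_threshold_for_budget` helper and the budget of 50 are illustrative, not part of the lab):

```python
import numpy as np

def find_threshold_for_budget(probs, budget, grid=None):
    """Return the lowest threshold whose alert count fits the budget.

    Lower thresholds flag more events, so we scan from low to high and
    stop at the first threshold that stays within the budget.
    """
    if grid is None:
        grid = np.arange(0.05, 1.0, 0.05)
    for t in grid:
        if (probs >= t).sum() <= budget:
            return float(t)
    return float(grid[-1])

rng = np.random.default_rng(42)
probs = rng.random(500)  # stand-in for one day's model scores
t = find_threshold_for_budget(probs, budget=50)
print(f"Deploy at threshold {t:.2f}: {(probs >= t).sum()} alerts")
```

The same search works on real `predict_proba` output: swap the synthetic `probs` for your model's scores.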

Communicating to Stakeholders

Avoid jargon. Instead of "precision = 0.85 at threshold 0.3", say:

  • "At this setting, we catch 95% of attacks but 1 in 6 alerts is a false alarm"
  • "If we tighten the filter, false alarms drop to 1 in 20 but we miss 30% of attacks"
  • "Your team handles 50 alerts/day — at threshold X, we generate 45"
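
One way to keep that translation honest is to generate the plain-language phrasing directly from the metric instead of estimating it by eye. A hypothetical helper (not part of the lab) that turns precision into a "1 in N" false-alarm rate:

```python
def false_alarm_phrase(precision):
    """Translate precision into a '1 in N alerts is a false alarm' statement."""
    if precision >= 1.0:
        return "no false alarms observed"
    # (1 - precision) is the fraction of alerts that are false alarms
    n = round(1 / (1 - precision))
    return f"about 1 in {n} alerts is a false alarm"

print(false_alarm_phrase(0.85))
print(false_alarm_phrase(0.95))
```

Precision 0.95 comes out as "about 1 in 20", matching the tightened-filter example above.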

The threshold is a dial, not a fixed value. Different deployments may use different thresholds.
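
Because the threshold varies by deployment, it belongs in configuration, not hard-coded next to the model. A minimal sketch (the deployment names and values are illustrative):

```python
import numpy as np

# Hypothetical per-deployment operating points; tune each to its own cost tradeoff
DEPLOYMENT_THRESHOLDS = {
    "email_gateway": 0.25,        # Scenario A: catch as many attacks as possible
    "auto_block_firewall": 0.75,  # Scenario B: only act on trusted alerts
    "soc_dashboard": 0.45,        # middle ground tuned to analyst capacity
}

def classify(probs, deployment):
    """Apply the deployment-specific threshold to an array of scores."""
    t = DEPLOYMENT_THRESHOLDS[deployment]
    return probs >= t

alerts = classify(np.array([0.3, 0.8]), "auto_block_firewall")
print(alerts)
```

The same scores produce different decisions per deployment; retuning means editing one config entry, not retraining the model.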


Think Deeper

Your SOC team says they can handle 50 alerts per day. Your model produces 200 at threshold 0.3 and 30 at threshold 0.6. What threshold do you recommend?

Threshold 0.6 keeps alerts under capacity (30/day), but check how many real attacks it misses. If recall drops from 95% to 40%, you're missing 60% of threats. The answer might be: deploy at 0.45 (find the threshold that produces ~50 alerts), or add auto-triage to handle volume at 0.3. The threshold decision is operational, not just mathematical.
Cybersecurity tie-in: This is the most operationally important step in the entire module. A model is only as useful as its deployment threshold. The right threshold depends on your SOC's size, alert triage automation, and the consequences of both missed threats and false alarms. Models don't make decisions — operators do.
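
The "find the threshold that produces ~50 alerts" step can be computed directly as a quantile of the day's scores rather than by trial and error. A sketch on synthetic data (the 200 events/day and the capacity of 50 are illustrative):

```python
import numpy as np

rng = np.random.default_rng(7)
daily_scores = rng.random(200)  # stand-in: 200 events scored in one day

capacity = 50
# The threshold that flags roughly `capacity` events sits at the
# (1 - capacity/total) quantile of the day's score distribution
t = np.quantile(daily_scores, 1 - capacity / len(daily_scores))
print(f"Threshold {t:.2f} yields {(daily_scores >= t).sum()} alerts")
```

After picking the capacity-matching threshold, check its recall on labeled test data before deploying; a threshold that fits the queue but misses most attacks is not a solution.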
