## The ROC Curve

The ROC (Receiver Operating Characteristic) curve plots classifier performance at every possible decision threshold:
| Axis | Metric | Formula |
|---|---|---|
| x-axis | False Positive Rate (FPR) | FP / (FP + TN) |
| y-axis | True Positive Rate (TPR = Recall) | TP / (TP + FN) |
Key points on the curve:
| Point | Threshold | Meaning |
|---|---|---|
| (0, 0) | 1.0 | Predict nothing — no TP, no FP |
| (1, 1) | 0.0 | Predict everything — all TP, all FP |
| (0, 1) | — | Perfect classifier — all attacks, no false alarms |
| Diagonal | — | Random guessing (AUC = 0.5) |
A good classifier's curve bulges toward the upper-left corner.
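The endpoints in the table can be traced by hand: sweeping the threshold from above the maximum score down to zero moves the operating point from (0, 0) to (1, 1). A minimal sketch, using made-up scores and labels for illustration:

```python
# Trace ROC points by sweeping the threshold by hand.
# `labels` and `scores` are made-up illustrative data.
labels = [1, 1, 1, 0, 1, 0, 0, 0]            # 1 = attack, 0 = benign
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]

P = sum(labels)          # total positives
N = len(labels) - P      # total negatives

for thr in [1.01, 0.75, 0.5, 0.0]:
    tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
    fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
    # thr above every score gives (FPR, TPR) = (0, 0); thr = 0 gives (1, 1)
    print(f"thr={thr:.2f}  TPR={tp / P:.2f}  FPR={fp / N:.2f}")
```

Each intermediate threshold yields one point on the curve; plotting all of them reproduces the staircase shape a real ROC plot shows.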
## AUC: Area Under the Curve

AUC summarises the entire ROC curve in a single number:
| AUC | Interpretation |
|---|---|
| 1.0 | Perfect separation |
| 0.9–0.99 | Excellent |
| 0.8–0.9 | Good |
| 0.7–0.8 | Fair |
| 0.5 | Random — model has no discriminative power |
AUC is threshold-independent — useful for comparing models before choosing a deployment threshold.
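One way to see why AUC is threshold-independent: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one (ties counted as half). A pure-Python sketch of that ranking interpretation, with made-up scores:

```python
# AUC via its ranking interpretation: the fraction of (positive, negative)
# pairs where the positive is scored higher (ties count as half a win).
# The score lists below are made-up illustrative data.
pos = [0.9, 0.8, 0.7, 0.55]   # scores of positive (attack) samples
neg = [0.6, 0.4, 0.3, 0.1]    # scores of negative (benign) samples

wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
           for p in pos for n in neg)
auc = wins / (len(pos) * len(neg))
print(auc)  # → 0.9375 (15 of the 16 pairs are ranked correctly)
```

No threshold appears anywhere in this computation, which is exactly why AUC can compare models before a deployment threshold is chosen.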
## Key Code Pattern

```python
from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Get the predicted probability of the positive class
probs = model.predict_proba(X_test)[:, 1]

# Compute ROC curve points and the AUC
fpr, tpr, thresholds = roc_curve(y_test, probs)
auc = roc_auc_score(y_test, probs)

# Plot the model's curve against the random-guessing diagonal
plt.plot(fpr, tpr, label=f'Model (AUC={auc:.3f})')
plt.plot([0, 1], [0, 1], 'k--', label='Random (AUC=0.5)')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate (Recall)')
plt.title('ROC Curve')
plt.legend()
plt.show()
```
## Think Deeper
Try this:
Model X has AUC = 0.92 and Model Y has AUC = 0.95. Should you always prefer Model Y?
Not necessarily. AUC averages performance across all thresholds, but you only operate at one. Model X might outperform Y in the low-FPR region you actually care about. Always check the ROC curve shape, not just the AUC number. In security, the region near FPR=0.01 often matters most.
Cybersecurity tie-in: ROC curves are ideal for comparing IDS/IPS models. Plot both on the same axes: the curve closer to the upper-left corner wins. But check the low-FPR region specifically — in production you typically operate at FPR < 1%, not across the full range.
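That low-FPR check can be sketched as a small helper that reports the best TPR achievable within a given FPR budget. This is a pure-Python illustration; the function name, data, and budget are made up, not from any library:

```python
# Hypothetical helper: best TPR achievable while keeping FPR at or below
# a budget (e.g. 0.01 for the FPR < 1% operating region).
def tpr_at_fpr(labels, scores, max_fpr=0.01):
    P = sum(labels)
    N = len(labels) - P
    best = 0.0
    # Sweep every distinct score as a candidate threshold, high to low
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= thr and y == 0)
        if fp / N <= max_fpr:          # within the FPR budget?
            best = max(best, tp / P)
    return best

# Made-up demo data: with a zero-false-positive budget, only thresholds
# that admit no benign samples count.
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1]
print(tpr_at_fpr(labels, scores, max_fpr=0.0))  # → 0.75
```

Running this per model on the same test set compares them in the region you actually deploy in, rather than averaging over thresholds you will never use.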