Step 4: ROC Curve and AUC

Compare classifiers across all thresholds

1 ExplorePlay below
2 ReadUnderstand
3 BuildHands-on lab
4 CompareSolution
💡 ReflectThink deeper

The ROC Curve

The ROC (Receiver Operating Characteristic) curve plots performance at every possible threshold:

AxisMetricFormula
x-axisFalse Positive Rate (FPR)FP / (FP + TN)
y-axisTrue Positive Rate (TPR = Recall)TP / (TP + FN)

Key points on the curve:

PointThresholdMeaning
(0, 0)1.0Predict nothing — no TP, no FP
(1, 1)0.0Predict everything — all TP, all FP
(0, 1)Perfect classifier — all attacks, no false alarms
DiagonalRandom guessing (AUC = 0.5)

A good classifier's curve bulges toward the upper-left corner.

AUC: Area Under the Curve

AUC summarises the entire ROC curve in a single number:

AUCInterpretation
1.0Perfect separation
0.9–0.99Excellent
0.8–0.9Good
0.7–0.8Fair
0.5Random — model has no discriminative power

AUC is threshold-independent — useful for comparing models before choosing a deployment threshold.

Key Code Pattern

from sklearn.metrics import roc_curve, roc_auc_score
import matplotlib.pyplot as plt

# Get probabilities
probs = model.predict_proba(X_test)[:, 1]

# Compute ROC curve
fpr, tpr, thresholds = roc_curve(y_test, probs)
auc = roc_auc_score(y_test, probs)

# Plot
plt.plot(fpr, tpr, label=f'Model (AUC={auc:.3f})')
plt.plot([0, 1], [0, 1], 'k--', label='Random (AUC=0.5)')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate (Recall)')
plt.title('ROC Curve')
plt.legend()
plt.show()
Loading...
Loading...
Loading...

Think Deeper

Model X has AUC=0.92, Model Y has AUC=0.95. Can you always pick Model Y?

Not necessarily. AUC averages performance across all thresholds, but you only operate at one. Model X might outperform Y in the low-FPR region you actually care about. Always check the ROC curve shape, not just the AUC number. In security, the region near FPR=0.01 often matters most.
Cybersecurity tie-in: ROC curves are perfect for comparing IDS/IPS models. Plot both on the same axes: the curve closer to the upper-left corner wins. But check the low-FPR region specifically — in production, you operate at FPR < 1%, not across the full range.

Loading...