Step 7: Accuracy Trap

When 99% accuracy is a lie


The accuracy trap

A real LogisticRegression model is trained on the digits every time you move the slider. As you remove samples of one class, the dataset becomes increasingly imbalanced — and the four metrics below diverge. Accuracy stays near 100%. Recall collapses toward 0%. The gap between them is the trap.
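The mechanics behind the slider can be sketched in a few lines of scikit-learn. This is an illustrative reconstruction, not the widget's actual code: it assumes a binary framing where digit 8 is the rare "target" class and every other digit is the majority.

```python
# Illustrative sketch of the demo (assumption: digit 8 is the "target"
# class, all other digits the majority; the real widget may differ).
import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, recall_score
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)
y = (y == 8).astype(int)  # 1 = target class, 0 = everything else

def metrics_at_removal(removal_frac, seed=0):
    """Drop removal_frac of the target samples, retrain, and score."""
    rng = np.random.default_rng(seed)
    pos = np.flatnonzero(y == 1)
    keep = rng.choice(pos, size=max(4, int(len(pos) * (1 - removal_frac))),
                      replace=False)
    idx = np.concatenate([np.flatnonzero(y == 0), keep])
    X_tr, X_te, y_tr, y_te = train_test_split(
        X[idx], y[idx], test_size=0.25, stratify=y[idx], random_state=seed)
    model = LogisticRegression(max_iter=5000).fit(X_tr, y_tr)
    pred = model.predict(X_te)
    return accuracy_score(y_te, pred), recall_score(y_te, pred)

for frac in (0.0, 0.5, 0.8, 0.95):
    acc, rec = metrics_at_removal(frac)
    print(f"removed {frac:>4.0%}: accuracy {acc:.2f}, recall {rec:.2f}")
```

Running the sweep shows the divergence the page describes: accuracy hovers near the top at every removal level, while recall degrades as the target class thins out.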

Accuracy: of every prediction, what fraction did the model get right? (correct ÷ total)
Precision: when it cries "target", how often is it right? (true positives ÷ predicted positives)
Recall: of every real target, what fraction did it actually catch? (true positives ÷ real positives)
F1 Score: balance of precision and recall; drops if either one collapses. (harmonic mean)
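The four formulas can be checked by hand on a single confusion matrix. The counts below are made up for illustration: a rare target class with 50 real positives out of 1,000 samples.

```python
# Hand-computed metrics from raw confusion-matrix counts
# (illustrative numbers, not from the widget above).
tp, fp = 5, 2     # 5 targets caught, 2 false alarms
fn, tn = 45, 948  # 45 targets missed, 948 correct rejections

accuracy  = (tp + tn) / (tp + tn + fp + fn)                # correct / total
precision = tp / (tp + fp)                                 # TP / predicted positives
recall    = tp / (tp + fn)                                 # TP / real positives
f1        = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"accuracy={accuracy:.3f}  precision={precision:.3f}  "
      f"recall={recall:.3f}  f1={f1:.3f}")
```

With these counts, accuracy comes out above 95% while recall is only 10%, and F1 drops below 0.2 because the harmonic mean is dragged down by the collapsed recall.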

Metric trajectories

Each dot is one model trained at that imbalance level. Watch the green line (accuracy) stay flat near the top while the red line (recall) crashes downward. A model evaluated only on accuracy looks fine all the way to 95% removal — but it has long since stopped finding the target class.


Think Deeper

In the Explore tab, at what removal % does recall drop below 50% while accuracy stays above 90%?

The trap springs around 70-80% removal. Accuracy barely moves because the majority class dominates; recall collapses because the model stops finding the rare class.
Cybersecurity tie-in: in security, the "rare class" is the attack. A malware detector with 99% accuracy but 10% recall misses 9 out of 10 threats. Security models must be evaluated with precision, recall, F1, and ROC-AUC, not accuracy alone.
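The 99%-accuracy / 10%-recall detector is easy to verify with concrete counts. The numbers below are hypothetical, chosen only to reproduce those two percentages: 10 real attacks among 1,000 events.

```python
# Verifying the tie-in arithmetic: 99% accuracy alongside 10% recall.
# Hypothetical counts: 10 real attacks among 1,000 events.
tp, fn = 1, 9     # catches 1 attack, misses 9
fp, tn = 1, 989   # 1 false alarm, 989 benign events passed correctly

accuracy = (tp + tn) / (tp + tn + fp + fn)
recall   = tp / (tp + fn)
print(f"accuracy={accuracy:.0%}, recall={recall:.0%}")
```

Because benign events outnumber attacks 99 to 1, a detector can ignore almost every attack and still post a near-perfect accuracy score.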
