The accuracy trap
A real LogisticRegression model is retrained on the digits dataset every time you move the slider.
As you remove samples of one class, the dataset becomes increasingly imbalanced — and the four metrics below diverge.
Accuracy stays near 100%. Recall collapses toward 0%. The gap between them is the trap.
(correct predictions ÷ total predictions)
(true positives ÷ predicted positives)
(true positives ÷ actual positives)
(harmonic mean of precision and recall)
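The four formulas above can be sketched as plain functions over confusion-matrix counts. This is a minimal stand-in for the demo's live computation, not its actual code, and the example counts are hypothetical:

```python
def metrics(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts.

    Denominators are guarded so a metric is 0.0 when undefined, e.g.
    precision when the model predicts no positives at all.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}

# The trap in miniature: a model that never predicts the rare class.
# Hypothetical counts: 95 true negatives, 5 missed positives.
m = metrics(tp=0, fp=0, fn=5, tn=95)
print(m)  # accuracy is 0.95 even though recall is 0.0
```

Guarding the zero denominators matters here: at extreme imbalance the model may predict no positives at all, which is exactly the regime the slider explores.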
Metric trajectories
Each dot is one model trained at that imbalance level. Watch the green line (accuracy) stay flat near the top while the red line (recall) crashes downward. A model evaluated only on accuracy looks fine all the way to 95% removal — but it has long since stopped finding the target class.
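The same trajectory can be reproduced outside the page. The sketch below is a stand-in, not the demo's implementation: instead of scikit-learn's LogisticRegression on digits, it fits a tiny one-feature logistic regression by gradient descent on synthetic overlapping Gaussian classes, then shrinks the positive class. All data and hyperparameters are invented for illustration, and metrics are computed on the training set for simplicity:

```python
import math
import random

def train_logreg(xs, ys, lr=0.5, steps=2000):
    """Fit 1-D logistic regression (one weight + bias) by full-batch GD."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        gw = gb = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(w * x + b)))
            gw += (p - y) * x
            gb += (p - y)
        w -= lr * gw / n
        b -= lr * gb / n
    return w, b

def run(n_pos, n_neg, rng):
    # Overlapping classes: positives ~ N(+0.5, 1), negatives ~ N(-0.5, 1),
    # so the classes are genuinely hard to separate.
    xs = ([rng.gauss(0.5, 1.0) for _ in range(n_pos)]
          + [rng.gauss(-0.5, 1.0) for _ in range(n_neg)])
    ys = [1] * n_pos + [0] * n_neg
    w, b = train_logreg(xs, ys)
    preds = [1 if w * x + b > 0 else 0 for x in xs]
    tp = sum(p == 1 and y == 1 for p, y in zip(preds, ys))
    accuracy = sum(p == y for p, y in zip(preds, ys)) / len(ys)
    recall = tp / n_pos
    return accuracy, recall

rng = random.Random(0)
for n_pos in (300, 75, 15):  # progressively remove the positive class
    acc, rec = run(n_pos, 300, rng)
    # As positives vanish, the fitted bias shifts toward "predict negative":
    # accuracy stays high while recall collapses.
    print(f"pos={n_pos:3d}  accuracy={acc:.2f}  recall={rec:.2f}")
```

The mechanism is the same one the chart shows: with few positives, the loss is dominated by the majority class, the decision boundary drifts away from the positives, and accuracy barely registers the loss of the class the model no longer finds.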
Think Deeper
In the Explore tab, at what removal % does recall drop below 50% while accuracy stays above 90%?