Every headline number carries a confidence interval, and we track whether our edge is decaying over time rather than quietly hiding any decay. If the model breaks, this page says so first.
EDGE STILL HOLDING
CUSUM change-point test clean · stat 0.00 · baseline accuracy 55.0%
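For readers unfamiliar with CUSUM, here is a minimal sketch of how a one-sided change-point statistic for a decaying hit rate could be computed. The page does not say which CUSUM variant is used, so the function name `cusum_decay`, the `slack` and `threshold` parameters, and their default values are all assumptions for illustration; only the 55.0% baseline comes from the status line above.

```python
from typing import Iterable, Tuple


def cusum_decay(
    outcomes: Iterable[int],
    baseline: float = 0.55,   # baseline accuracy from the status line
    slack: float = 0.0,       # hypothetical allowance for noise
    threshold: float = 5.0,   # hypothetical alarm level
) -> Tuple[float, bool]:
    """One-sided CUSUM that accumulates evidence of accuracy falling below baseline.

    `outcomes` is a sequence of 1 (pick won) and 0 (pick lost), oldest first.
    Returns the current statistic and whether it is above the alarm threshold.
    """
    stat = 0.0
    for hit in outcomes:
        # A miss adds `baseline` to the statistic, a hit subtracts `1 - baseline`.
        # The floor at zero means good runs cannot bank credit against later decay.
        stat = max(0.0, stat + (baseline - hit) - slack)
    return stat, stat > threshold
```

Under this sketch, a model that keeps winning more often than the baseline holds the statistic at or near zero, which would be consistent with a reading of 0.00.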
HEADLINE NUMBERS
95% CI · 1,564 BOOTSTRAP RESAMPLES
OVERALL ACCURACY: 67.0% (95% CI 64.6% – 69.3%)
DISAGREE ROI: +35.7% (95% CI +19.7% – +52.2%)
DISAGREE WIN RATE: 54.7% (95% CI 48.4% – 61.3%)
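As an illustration of how a bootstrap interval on an accuracy figure can be produced, the sketch below uses the percentile method on a list of win/loss outcomes. The page only states that the intervals are 95% bootstrap CIs with 1,564 resamples; the percentile method, the function name `bootstrap_ci`, and the fixed seed are assumptions, not a description of the actual pipeline.

```python
import random
from typing import Sequence, Tuple


def bootstrap_ci(
    outcomes: Sequence[int],
    n_resamples: int = 1564,  # resample count quoted on the page
    level: float = 0.95,
    seed: int = 0,            # hypothetical: fixed only to make runs repeatable
) -> Tuple[float, float]:
    """Percentile bootstrap confidence interval for the mean of 0/1 outcomes."""
    if not outcomes:
        raise ValueError("need at least one outcome")
    rng = random.Random(seed)
    n = len(outcomes)
    # Resample the picks with replacement and record the accuracy of each resample.
    means = sorted(
        sum(rng.choices(outcomes, k=n)) / n for _ in range(n_resamples)
    )
    alpha = (1.0 - level) / 2.0
    lower = means[int(alpha * n_resamples)]
    upper = means[min(n_resamples - 1, int((1.0 - alpha) * n_resamples))]
    return lower, upper
```

Calling `bootstrap_ci` on a list of 1s and 0s, one per graded pick, returns the lower and upper bounds as fractions. The same resampling idea extends to ROI by resampling per-pick profit instead of win/loss flags.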
Tier    Accuracy    95% CI
S       82.5%       75.4% – 89.5%
A+      72.0%       65.9% – 77.6%
A       71.8%       61.5% – 82.1%
B       66.7%       63.4% – 70.0%
C       59.7%       55.0% – 64.4%
CALIBRATION
PREDICTED VS ACTUAL
The model is optimized to pick the right side, not to estimate the exact win probability, so low-confidence buckets win more often than their stated probability. The model is under-confident there, not wrong about the side. The gap column shows the difference between predicted and actual win rates; whiskers are 95% Wilson intervals.
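For reference, the Wilson score interval behind the whiskers can be computed directly from a bucket's win count and sample size. The sketch below is the standard formula; the function name `wilson_interval` and the rounded z-value of 1.96 for a 95% interval are choices made here, and the page does not publish per-bucket counts.

```python
import math
from typing import Tuple


def wilson_interval(wins: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z = 1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1.0 + z * z / n
    # The center is pulled toward 0.5, which keeps small buckets from looking extreme.
    center = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return center - half, center + half
```

Unlike the simple normal approximation, this interval stays inside 0% to 100% and remains usable for buckets with few picks, which is a common reason to prefer it for calibration plots.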