MODEL HONESTY

WE SHOW OUR WORK

Every headline number carries a confidence interval, and we track whether our edge is decaying over time instead of quietly hiding it. If the model breaks, this page says so first.

EDGE STILL HOLDING

CUSUM change-point test clean · stat 2.99 · baseline accuracy 65.3%
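
The status line above comes from a CUSUM change-point test. As a rough illustration, here is a minimal sketch of one common variant: a one-sided CUSUM that accumulates shortfalls below the baseline hit rate. This page does not state which CUSUM variant, slack, or alarm threshold is used, so `slack`, `threshold`, and the function name are hypothetical.

```python
def cusum_downshift(outcomes, baseline, slack=0.05, threshold=5.0):
    """One-sided CUSUM for a drop in hit rate.

    outcomes  : iterable of 1 (pick won) / 0 (pick lost), in time order
    baseline  : expected hit rate, e.g. 0.653
    slack     : hypothetical allowance; smaller shortfalls are ignored
    threshold : hypothetical alarm level for the statistic
    Returns (max_stat, index of first alarm or None).
    """
    stat = 0.0
    max_stat = 0.0
    alarm_at = None
    for i, won in enumerate(outcomes):
        # Accumulate the shortfall versus baseline; the floor at zero
        # stops an old winning streak from masking a later decline.
        stat = max(0.0, stat + (baseline - won) - slack)
        max_stat = max(max_stat, stat)
        if alarm_at is None and stat > threshold:
            alarm_at = i
    return max_stat, alarm_at
```

Under this sketch, a "clean" result would mean the statistic never crossed the alarm level over the tracked picks.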

HEADLINE NUMBERS

95% CI · 1,610 BOOTSTRAP RESAMPLES

66.5%

OVERALL ACCURACY

95% CI 64.1% – 68.8%

Tier	Accuracy	95% CI
S	83.7%	75.6% – 90.7%
A+	75.1%	69.2% – 81.1%
A	68.0%	59.0% – 77.0%
B	66.3%	63.0% – 69.5%
C	58.3%	53.4% – 63.1%
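
The intervals above are described as bootstrap confidence intervals. A minimal percentile-bootstrap sketch, using only the standard library, might look like the following. The outcome list is synthetic stand-in data (1,071 wins in 1,610 picks, chosen to land near the 66.5% headline); the pick count is an assumption taken from the calibration bucket sizes, and the page does not say which bootstrap variant it uses.

```python
import random

def bootstrap_accuracy_ci(outcomes, n_resamples=1610, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the mean of 0/1 outcomes."""
    rng = random.Random(seed)
    n = len(outcomes)
    means = []
    for _ in range(n_resamples):
        # Resample n picks with replacement and record the hit rate.
        sample = rng.choices(outcomes, k=n)
        means.append(sum(sample) / n)
    means.sort()
    lo = means[int((alpha / 2) * n_resamples)]
    hi = means[int((1 - alpha / 2) * n_resamples) - 1]
    return lo, hi

# Synthetic stand-in data, not the real pick log.
outcomes = [1] * 1071 + [0] * 539
low, high = bootstrap_accuracy_ci(outcomes)
print(f"accuracy {sum(outcomes) / len(outcomes):.1%}, 95% CI {low:.1%} to {high:.1%}")
```

The same function applies per tier by passing only that tier's outcomes. Smaller tiers give wider intervals, which is consistent with the S and A rows above.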

CALIBRATION

PREDICTED VS ACTUAL

The raw ensemble is tuned to pick the right side, not to state an exact win probability. The pick is a hard vote among three models, while the probability shown here is the mean of the three models' probabilities, so the middle buckets win a few points more often than predicted (the raw model is underconfident, not wrong). The confidence displayed on picks corrects for this with a per-season Platt scaling layer, fit on prior seasons only and clamped so it never flips the pick. This chart shows the raw model underneath that layer. The Gap column is actual minus predicted; whiskers are 95% Wilson intervals.
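
To make the correction layer concrete, here is a minimal sketch of Platt scaling with a no-flip clamp. It assumes the layer is a logistic fit on the logit of the raw probability, trained by plain gradient descent. The function names, the `(season, raw_prob, outcome)` row layout, and the learning-rate and epoch settings are all hypothetical; the production layer may differ.

```python
import math

def _sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def _logit(p, eps=1e-6):
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def fit_platt(raw_probs, outcomes, lr=0.5, epochs=2000):
    """Fit calibrated = sigmoid(a * logit(raw) + b) by gradient descent on log loss."""
    a, b = 1.0, 0.0  # start at the identity mapping
    xs = [_logit(p) for p in raw_probs]
    n = len(xs)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for x, y in zip(xs, outcomes):
            err = _sigmoid(a * x + b) - y
            grad_a += err * x
            grad_b += err
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b

def fit_for_season(rows, season):
    """rows: (season, raw_prob, outcome). Uses only seasons strictly before `season`."""
    prior = [(p, y) for s, p, y in rows if s < season]
    return fit_platt([p for p, _ in prior], [y for _, y in prior])

def calibrated_confidence(raw_prob, a, b):
    """Apply the Platt layer, then clamp so the favored side never changes."""
    cal = _sigmoid(a * _logit(raw_prob) + b)
    if raw_prob >= 0.5:
        return max(cal, 0.5)
    return min(cal, 0.5)
```

Fitting only on earlier seasons keeps the displayed confidence out-of-sample for the season it is shown in, and the clamp guarantees calibration can change how confident a pick looks but not which side is picked.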

0.065

EXPECTED CAL. ERROR

0.223

BRIER SCORE

0.005

RELIABILITY

0.002

RESOLUTION

Predicted Bucket	Predicted	Actual	Gap	N
40% – 50%	48.9%	54.7%	+5.8%	64
50% – 60%	55.0%	63.7%	+8.6%	842
60% – 70%	64.2%	68.9%	+4.7%	528
70% – 80%	73.6%	75.8%	+2.2%	161
80% – 90%	81.9%	86.7%	+4.8%	15
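
The summary statistics can be cross-checked against the bucket table. The sketch below uses the standard binned definitions: expected calibration error as the N-weighted mean absolute gap, reliability as the N-weighted mean squared gap, and resolution as the N-weighted squared distance of each bucket's actual rate from the overall rate. It also computes a Wilson interval per bucket. The page does not state its exact formulas, so this is an assumed reconstruction; with the rounded table values it gives an ECE, reliability, and resolution that round to 0.065, 0.005, and 0.002.

```python
import math

# (predicted, actual, gap, n) per bucket, copied from the table above.
BUCKETS = [
    (0.489, 0.547, 0.058, 64),
    (0.550, 0.637, 0.086, 842),
    (0.642, 0.689, 0.047, 528),
    (0.736, 0.758, 0.022, 161),
    (0.819, 0.867, 0.048, 15),
]

def wilson_interval(p_hat, n, z=1.96):
    """95% Wilson score interval for a binomial proportion."""
    denom = 1.0 + z * z / n
    center = (p_hat + z * z / (2 * n)) / denom
    half = z * math.sqrt(p_hat * (1 - p_hat) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half

total = sum(n for _, _, _, n in BUCKETS)
base_rate = sum(actual * n for _, actual, _, n in BUCKETS) / total

ece = sum(n * abs(gap) for _, _, gap, n in BUCKETS) / total
reliability = sum(n * gap ** 2 for _, _, gap, n in BUCKETS) / total
resolution = sum(n * (actual - base_rate) ** 2 for _, actual, _, n in BUCKETS) / total

print(f"ECE {ece:.3f}, reliability {reliability:.3f}, resolution {resolution:.3f}")
for predicted, actual, _, n in BUCKETS:
    lo, hi = wilson_interval(actual, n)
    print(f"predicted {predicted:.1%}  actual {actual:.1%}  "
          f"Wilson 95% [{lo:.1%}, {hi:.1%}]  n={n}")
```

The 80% – 90% bucket has only 15 picks, so its Wilson interval is far wider than the others and its gap should be read with caution.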

EDGE DECAY

IS THE MODEL STILL WORKING?

ROLLING ACCURACY (50-PICK)

60.0%
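
A 50-pick rolling accuracy is the hit rate over the most recent 50 graded picks, so 60.0% corresponds to 30 wins in the last 50. A minimal sketch of the computation, assuming outcomes arrive in time order as 1 (win) or 0 (loss); the function name is illustrative.

```python
from collections import deque

def rolling_accuracy(outcomes, window=50):
    """Yield the hit rate over the most recent `window` picks, once the window is full."""
    recent = deque(maxlen=window)
    wins = 0
    for won in outcomes:
        if len(recent) == window:
            wins -= recent[0]  # the oldest pick is about to drop out
        recent.append(won)
        wins += won
        if len(recent) == window:
            yield wins / window
```

A short window reacts quickly to a slump but is noisy, which is why it sits next to the season table rather than replacing it.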

SEASON OVER SEASON

THE EDGE OVER TIME

Season	Accuracy
2020	67.5%
2021	62.7%
2022	66.2%
2023	64.7%
2024	69.5%
2025	68.3%
