Brier Skill Score — bpleone / brain

📚 What does this number mean?

BSS Range	Tier	Interpretation
< 0	broken	Brain is WORSE than always guessing the mean. Something is wrong.
0.00 – 0.05	weak	Barely better than baseline. Probably noise.
0.05 – 0.10	fair	Real edge but small. Hard to trust for sizing.
0.10 – 0.20	useful	Clear edge over baseline. Worth sizing with confidence.
0.20 – 0.30	strong	Significantly better than naive — institutional-grade.
≥ 0.30	excellent	Exceptional edge. Verify it's not data leakage.
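
As a rough illustration, the table can be turned into a lookup. This is a minimal sketch, not the dashboard's actual code: the function name `bss_tier` is hypothetical, and it assumes each range includes its lower bound and excludes its upper bound (the table itself does not say which tier a score of exactly 0.05 or 0.10 falls into).

```python
def bss_tier(bss: float) -> str:
    """Map a Brier Skill Score to a tier name from the table.

    Assumption: ranges are lower-inclusive, upper-exclusive,
    so exactly 0.05 is "fair" and exactly 0.30 is "excellent".
    """
    if bss < 0.0:
        return "broken"
    if bss < 0.05:
        return "weak"
    if bss < 0.10:
        return "fair"
    if bss < 0.20:
        return "useful"
    if bss < 0.30:
        return "strong"
    return "excellent"


# Example: a score of 0.12 lands in the "useful" tier.
print(bss_tier(0.12))
```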

📐 The math

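Brier Score: the mean of (predicted_prob − actual_outcome)² over all predictions, where actual_outcome is 1 if the event happened and 0 if it did not. Bounded in [0, 1], lower is better. Zero = perfect. A constant 50% guess scores 0.25 on any data, which is the no-information level when outcomes are 50/50.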
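
The definition above can be sketched in a few lines of plain Python. The function name `brier_score` and the list-based inputs are assumptions made for illustration. Outcomes are taken to be 0 or 1.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between predicted probabilities and 0/1 outcomes."""
    if len(probs) != len(outcomes) or not probs:
        raise ValueError("probs and outcomes must be non-empty and the same length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Perfect predictions score 0.
print(brier_score([1.0, 0.0, 1.0], [1, 0, 1]))

# A constant 50% guess scores 0.25 regardless of the outcomes.
print(brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```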

Baseline: always predict the population mean rate. The Brier score of this baseline depends only on the base rate p and equals p × (1 − p): for 50/50 outcomes it's 0.5 × 0.5 = 0.25; for skewed outcomes (say 70% wins) it's 0.7 × 0.3 = 0.21.
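
To see where 0.21 comes from: with base rate p, a share p of outcomes are wins with squared error (1 − p)², and a share (1 − p) are losses with squared error p², so the mean is p(1 − p)² + (1 − p)p² = p(1 − p). The sketch below checks this numerically. The name `baseline_brier` is hypothetical, and it assumes the base rate is estimated from the same outcomes being scored.

```python
def baseline_brier(outcomes):
    """Brier score of always predicting the observed base rate."""
    n = len(outcomes)
    if n == 0:
        raise ValueError("outcomes must be non-empty")
    base_rate = sum(outcomes) / n  # assumed: computed on the same sample
    return sum((base_rate - y) ** 2 for y in outcomes) / n


# 7 wins and 3 losses: base rate 0.7, baseline Brier about 0.7 * 0.3 = 0.21.
outcomes = [1] * 7 + [0] * 3
print(round(baseline_brier(outcomes), 4))
```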

Skill Score: BSS = 1 − (BS_model / BS_baseline). This normalizes against the no-information baseline: 1 is a perfect model, 0 matches the baseline exactly, and negative values are worse than the baseline. A skill score above 0 means the brain has an edge over always predicting the average rate.
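
Putting the pieces together, a minimal sketch of the skill score might look like this. The function name `brier_skill_score` is an assumption, as is the choice to take the baseline from the same outcomes being scored. When every outcome is identical the baseline Brier is 0 and the ratio is undefined, so the sketch raises an error in that case.

```python
def brier_skill_score(probs, outcomes):
    """BSS = 1 - BS_model / BS_baseline, with a constant base-rate baseline."""
    n = len(outcomes)
    if n == 0 or len(probs) != n:
        raise ValueError("probs and outcomes must be non-empty and the same length")

    bs_model = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n

    base_rate = sum(outcomes) / n            # assumed: in-sample base rate
    bs_baseline = base_rate * (1 - base_rate)
    if bs_baseline == 0:
        raise ValueError("BSS is undefined when all outcomes are identical")

    return 1 - bs_model / bs_baseline


outcomes = [1, 1, 1, 0]

# Predicting the base rate (0.75) everywhere gives a skill score of about 0.
print(brier_skill_score([0.75] * 4, outcomes))

# Perfect predictions give a skill score of 1.
print(brier_skill_score([1.0, 1.0, 1.0, 0.0], outcomes))
```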

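Why BSS over raw accuracy: a model can have 70% accuracy on a 70/30 dataset just by always predicting "yes". If it outputs a constant 70% probability, its BSS is 0, since it is exactly the trivial baseline; if it outputs a hard 100%, its Brier score is 0.30 against a baseline of 0.21, so its BSS is negative (about −0.43). BSS catches this; raw accuracy doesn't. This is why the Brier score and its skill score are standard metrics in meteorological and other probabilistic forecasting.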
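
The contrast can be checked on a toy 70/30 dataset. This is an illustrative sketch only: the helper names `accuracy` and `bss`, the 0.5 decision threshold, and the ten-row dataset are all assumptions. Two "always yes" models are compared, one that outputs a constant 0.7 and one that outputs a hard 1.0.

```python
def accuracy(probs, outcomes, threshold=0.5):
    """Share of predictions on the correct side of the threshold."""
    hits = sum((p >= threshold) == bool(y) for p, y in zip(probs, outcomes))
    return hits / len(outcomes)


def bss(probs, outcomes):
    """Brier Skill Score against a constant base-rate baseline."""
    n = len(outcomes)
    bs_model = sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / n
    base_rate = sum(outcomes) / n
    return 1 - bs_model / (base_rate * (1 - base_rate))


outcomes = [1] * 7 + [0] * 3      # 70% wins, 30% losses
constant_70 = [0.7] * 10          # always says "yes" with 70% confidence
hard_yes = [1.0] * 10             # always says "yes" with 100% confidence

# Both models call every case "yes", so both are 70% accurate.
print(accuracy(constant_70, outcomes), accuracy(hard_yes, outcomes))

# Their skill scores differ: about 0 for the constant 0.7,
# clearly negative for the overconfident hard "yes".
print(round(bss(constant_70, outcomes), 2), round(bss(hard_yes, outcomes), 2))
```

Accuracy cannot tell these two models apart from a genuinely informative one that is also right 70% of the time, while the skill score shows that neither adds anything over the base rate.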