Training on hard binary labels pushes the model toward saturation. The cross-entropy loss for a correct prediction p=0.99 against the hard label y=1 is -log(0.99) ≈ 0.01, yet the gradient still pulls the model toward p=1.00, which is unreachable. The result: overconfident predictions that don't generalize.
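To make the saturation concrete, here is a small illustrative snippet (not code from this project) that prints the loss and the logit-space gradient p − y for increasingly confident correct predictions; the loss shrinks but the gradient never reaches zero.

```js
// Illustrative arithmetic only, not code from this repo.
// Binary cross-entropy against a hard label y=1, for increasingly confident p.
const bce = (y, p) => -(y * Math.log(p) + (1 - y) * Math.log(1 - p));

for (const p of [0.9, 0.99, 0.999]) {
  // For a sigmoid output, d(loss)/d(logit) = p - y; with y=1 it stays negative,
  // so the optimizer keeps pushing p toward an unreachable 1.0.
  console.log(`p=${p}  loss=${bce(1, p).toFixed(4)}  grad=${(p - 1).toFixed(3)}`);
}
```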
Label smoothing (Szegedy et al. 2016, Müller et al. 2019) replaces hard labels with soft ones:
y_smoothed = y × (1 − ε) + ε / K
For binary classification (K=2) and ε=0.05:
- y = 1 → 1 × 0.95 + 0.025 = 0.975
- y = 0 → 0 × 0.95 + 0.025 = 0.025
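A minimal sketch of the transform in JavaScript; the helper and constant names (smoothLabel, EPSILON) are illustrative assumptions, not identifiers from the codebase.

```js
// Assumed names for illustration; the codebase may organize this differently.
const EPSILON = 0.05; // smoothing strength ε
const K = 2;          // number of classes (binary)

// y_smoothed = y * (1 - ε) + ε / K
function smoothLabel(y, eps = EPSILON, k = K) {
  return y * (1 - eps) + eps / k;
}

console.log(smoothLabel(1), smoothLabel(0)); // ≈ 0.975 and 0.025, matching the list above
```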
The model learns that even "wins" aren't perfectly certain, and stops trying to output 0.99+. Empirically this improves calibration and generalization.
Where it's applied: at every model.train(features, label) call site, meaning the main model trainer (model.js fullRetrain) and the per-horizon models (multi-horizon.js trainHorizon). Set ε to 0 to bypass it.
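A hedged sketch of what a call site could look like, building on the smoothLabel helper above; model.train, fullRetrain, and trainHorizon come from the text, but the loop and trainingSamples are placeholders, not the repo's actual trainer code.

```js
// Illustrative wiring only: each call site passes the smoothed label
// instead of the raw 0/1 label.
for (const { features, label } of trainingSamples) {
  model.train(features, smoothLabel(label)); // 1 → 0.975, 0 → 0.025 at ε=0.05
}
```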
How to tune: ε=0.05 is the standard starting point. Increase up to 0.10 if the Brier Skill Score stagnates while predictions remain overconfident; decrease ε to 0 to disable.
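The three regimes, spelled out against the assumed smoothLabel helper from the sketch above:

```js
console.log(smoothLabel(1, 0.05)); // ≈ 0.975  (default starting point)
console.log(smoothLabel(1, 0.10)); // ≈ 0.95   (stronger smoothing if BSS stagnates)
console.log(smoothLabel(1, 0));    // 1        (ε=0: smoothing disabled, hard label preserved)
```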