βš™ Configuration
Range: Ρ ∈ [0, 0.25]. Default: 0.05.
🎬 Preview
Win label (y=1): 1.000 → 0.975
Loss label (y=0): 0.000 → 0.025
πŸ“š Why label smoothing?
Training on hard binary labels pushes the model toward saturation. The cross-entropy loss for a correct prediction p=0.99 vs y=1 is -log(0.99) β‰ˆ 0.01 β€” but the gradient is still pulling the model toward p=1.00, which is unreachable. The result: overconfident predictions that don't generalize.
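To make the saturation argument concrete, here is a small sketch (not project code) computing the binary cross-entropy loss and its gradient at p=0.99 for a hard y=1 label — the loss is tiny, but the gradient never vanishes short of p=1:

```javascript
// Binary cross-entropy for one example: L = -(y*log(p) + (1-y)*log(1-p))
const bce = (y, p) => -(y * Math.log(p) + (1 - y) * Math.log(1 - p));

// dL/dp for y = 1 is -1/p: still nonzero at p = 0.99, so gradient
// descent keeps pushing p toward the unreachable 1.0.
const gradY1 = (p) => -1 / p;

console.log(bce(1, 0.99));  // ≈ 0.01005 — tiny loss...
console.log(gradY1(0.99));  // ≈ -1.0101 — ...but a non-vanishing pull toward p = 1
```

With a smoothed target of 0.975 instead, the loss minimum sits at a reachable p = 0.975 and the gradient actually goes to zero there.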

Label smoothing (Szegedy et al. 2016; Müller et al. 2019) replaces hard labels with soft ones:
  y_smoothed = y × (1 − ε) + ε / K
For binary classification (K = 2) and ε = 0.05:
  • y = 1 β†’ 1 Γ— 0.95 + 0.025 = 0.975
  • y = 0 β†’ 0 Γ— 0.95 + 0.025 = 0.025
The model learns that even "wins" aren't perfectly certain β€” and stops trying to output 0.99+. Empirically this improves calibration and generalization.
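The formula above is a one-liner; a minimal sketch (the function name `smoothLabel` is illustrative, not the project's actual API):

```javascript
// Label smoothing: y_smoothed = y * (1 - epsilon) + epsilon / K
// K is the number of classes (2 for binary win/loss labels).
function smoothLabel(y, epsilon = 0.05, K = 2) {
  return y * (1 - epsilon) + epsilon / K;
}

console.log(smoothLabel(1)); // win label  → ≈ 0.975
console.log(smoothLabel(0)); // loss label → ≈ 0.025
console.log(smoothLabel(1, 0)); // epsilon = 0 leaves the hard label untouched
```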

Where it's applied: at every model.train(features, label) call site — the main model trainer (fullRetrain in model.js) and the per-horizon models (trainHorizon in multi-horizon.js). Disabling the feature bypasses smoothing at all of these call sites.
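The call-site pattern can be sketched as follows — `trainWithSmoothing`, `config`, and the stub `model` are hypothetical stand-ins, not the actual objects in model.js or multi-horizon.js:

```javascript
// Hypothetical wrapper: smooth the hard label just before the train call.
const config = { labelSmoothing: 0.05 }; // set to 0 to disable

function trainWithSmoothing(model, features, hardLabel) {
  const eps = config.labelSmoothing;
  // Binary case (K = 2), so the uniform term is eps / 2.
  const label = eps > 0 ? hardLabel * (1 - eps) + eps / 2 : hardLabel;
  return model.train(features, label);
}

// Stub model that just reports the label it was trained on.
const stub = { train: (features, label) => label };
trainWithSmoothing(stub, [], 1); // trains on ≈ 0.975 instead of 1
```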

How to tune: Ξ΅=0.05 is the standard starting point. Increase up to 0.10 if Brier Skill Score is stagnating with overconfident predictions; decrease to 0 to disable.