The brain has five different ways to predict the same thing:
- Model: logistic regression on 22 features, trained with Adam
- Ensemble: multi-horizon (10min/30min/4h) average
- Bootstrap: K=5 models trained on resampled data, then averaged (sketched in code after this list)
- k-NN: non-parametric memory of the 10 most similar past resolutions
- SWA: average of late-training weights, sits in a flatter minimum
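For concreteness, here is a minimal sketch of the bootstrap member, assuming scikit-learn-style logistic regressions over the same 22 features; `fit_bootstrap` and `predict_bootstrap` are illustrative names, not the project's actual API.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

K = 5  # number of bootstrap-resampled models, as described above

def fit_bootstrap(X, y, seed=0):
    """Train K logistic regressions, each on a resample (with replacement) of (X, y)."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(K):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap sample of row indices
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def predict_bootstrap(models, x_row):
    """Average the K predicted probabilities for one feature row."""
    probs = [m.predict_proba(x_row.reshape(1, -1))[0, 1] for m in models]
    return float(np.mean(probs))
```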
Previously we blended them with hand-picked weights.
The hand-picked weights are guesses. The Meta-Stacker replaces them with weights learned from the data:
1. At predict time, collect the five base predictions as features:
x = [p_model, p_ensemble, p_bootstrap, p_knn, p_swa, 1]
2. Train a logistic regression:
meta_p = sigmoid(w · x)
3. On resolution, take one online SGD step on the meta-weights using the actual outcome (see the sketch below)
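A minimal sketch of those three steps in plain NumPy; the class name and the learning rate (including its default value) are assumptions, the rest follows the steps above.

```python
import numpy as np

class MetaStacker:
    def __init__(self, lr: float = 0.05):
        self.w = np.zeros(6)  # one weight per base prediction, plus the bias term
        self.lr = lr          # SGD step size (hyperparameter, value assumed)

    @staticmethod
    def _features(p_model, p_ensemble, p_bootstrap, p_knn, p_swa):
        # Step 1: the five base predictions plus a constant 1 for the bias
        return np.array([p_model, p_ensemble, p_bootstrap, p_knn, p_swa, 1.0])

    def predict(self, p_model, p_ensemble, p_bootstrap, p_knn, p_swa) -> float:
        # Step 2: meta_p = sigmoid(w · x)
        x = self._features(p_model, p_ensemble, p_bootstrap, p_knn, p_swa)
        return float(1.0 / (1.0 + np.exp(-self.w @ x)))

    def update(self, base_preds, outcome: int) -> None:
        # Step 3: one SGD step on the log-loss gradient, (meta_p - outcome) * x,
        # taken once the market resolves and outcome (0 or 1) is known
        x = self._features(*base_preds)
        meta_p = 1.0 / (1.0 + np.exp(-self.w @ x))
        self.w -= self.lr * (meta_p - outcome) * x
```

At predict time this is `stacker.predict(p_model, p_ensemble, p_bootstrap, p_knn, p_swa)`; on resolution, `stacker.update((p_model, p_ensemble, p_bootstrap, p_knn, p_swa), outcome)` nudges the weights toward whichever base predictors were right.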
Cold start: Below 30 resolutions, fall back to the hand-picked weights. After that, the learned blend takes over. The Unified Predictor automatically chooses whichever is available.
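A rough sketch of that hand-off: the 30-resolution threshold comes from the text, while `unified_predict` and the `HAND_WEIGHTS` values are placeholders, not the real hand-picked numbers.

```python
import numpy as np

COLD_START_RESOLUTIONS = 30                               # threshold from the text
HAND_WEIGHTS = np.array([0.35, 0.25, 0.15, 0.15, 0.10])   # placeholder values only

def unified_predict(stacker, base_preds, n_resolutions: int) -> float:
    if n_resolutions < COLD_START_RESOLUTIONS:
        # Too few resolved outcomes: fall back to the hand-picked blend
        return float(HAND_WEIGHTS @ np.array(base_preds))
    # Enough history: the learned meta-weights take over
    return stacker.predict(*base_preds)
```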
Why this is good: if k-NN starts outperforming the parametric model in a specific regime, its weight grows automatically. If SWA's average becomes stale during a regime shift, its weight shrinks. The blend self-tunes.