📊 Per-module rolling performance (last 100 resolutions)
Δ acc = module accuracy minus the average across all modules. Positive means the module is contributing genuine signal above the consensus; negative means it's pulling the blend down.
Brier score = mean (predicted - actual)². Lower is better; 0.25 is the no-information baseline (always predicting 0.5).
Log loss = -log of the probability assigned to the outcome that actually occurred. 0.693 (ln 2) is the no-info baseline; below that, the module is informative.
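The two metrics above can be sketched in a few lines. This is an illustrative implementation of the standard definitions, not the dashboard's actual code; the function and variable names are made up for the example:

```python
import math

def brier_score(preds, outcomes):
    """Mean squared gap between predicted probability and the 0/1 outcome."""
    return sum((p - y) ** 2 for p, y in zip(preds, outcomes)) / len(preds)

def log_loss(preds, outcomes, eps=1e-12):
    """Mean -log of the probability assigned to what actually happened."""
    total = 0.0
    for p, y in zip(preds, outcomes):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        total += -math.log(p if y else 1 - p)
    return total / len(preds)

# A module that always predicts 0.5 lands exactly on both baselines:
coin = [0.5, 0.5, 0.5, 0.5]
actual = [1, 0, 1, 0]
# brier_score(coin, actual) -> 0.25
# log_loss(coin, actual)    -> 0.693...
```

Note the clipping in `log_loss`: a single confident-and-wrong prediction at p = 0 or p = 1 would otherwise blow up to infinity.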
📚 Why this matters
The brain has 5 base learners. The Meta-Stacker learns how to blend them, shifting weight toward the best performers as resolutions accumulate. But before the Meta-Stacker has data, the blend is hand-picked. This page shows how each module is actually performing.
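The cold-start-then-reweight behavior described above can be sketched as follows. This is a simplified stand-in, not the Meta-Stacker's actual method (which this page doesn't specify): it falls back to hand-picked weights until enough resolutions exist, then weights each module by its rolling accuracy. All names and the history shape are hypothetical:

```python
def blend(probs, weights):
    """Weighted average of per-module probabilities."""
    total = sum(weights[m] for m in probs)
    return sum(weights[m] * p for m, p in probs.items()) / total

def stacker_weights(history, hand_picked, min_n=30):
    """history: module -> list of (predicted_prob, outcome) pairs.
    Returns hand-picked weights until every module has min_n
    resolutions, then weights proportional to rolling accuracy."""
    n = min((len(h) for h in history.values()), default=0)
    if n < min_n:
        return hand_picked  # cold start: hand-picked blend
    # Accuracy at a 0.5 threshold, as a simple proxy for skill.
    return {m: sum((p > 0.5) == y for p, y in h) / len(h)
            for m, h in history.items()}
```

With this scheme a module that is consistently wrong drifts toward zero weight, which is the "shifts weight toward the best ones" behavior in miniature.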

What to look for:
  • If a module's accuracy is consistently 5+ points below the average, the Meta-Stacker should be (and eventually will be) downweighting it.
  • If a module's log loss is above 0.693 (random baseline), it's actively hurting predictions when included.
  • The "Final blend" row shows what the combined prediction is achieving; it should beat every individual module.
  • If k-NN has only 30 samples while model has 100, the k-NN row is statistically less reliable. The "n" column shows sample size.
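The first check in the list, Δ acc and the 5-point laggard rule, can be sketched as a small helper. Hypothetical names; this illustrates the arithmetic only, not how the dashboard computes it:

```python
def delta_acc(acc_by_module):
    """Δ acc: each module's accuracy minus the cross-module average."""
    avg = sum(acc_by_module.values()) / len(acc_by_module)
    return {m: a - avg for m, a in acc_by_module.items()}

def flag_laggards(acc_by_module, threshold=0.05):
    """Modules 5+ points below average: candidates for downweighting."""
    return [m for m, d in delta_acc(acc_by_module).items()
            if d <= -threshold]

# Example: two modules at 70%, one at 55% (average 65%).
# delta_acc -> {"a": +0.05, "b": +0.05, "c": -0.10}
# flag_laggards -> ["c"]
```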