For ~20% of resolutions (sampled to keep cost low), the brain runs counterfactual replay (sketched in code after this list):
- Take the features that produced the prediction
- For each feature, perturb it by ±10% and re-predict
- Measure the maximum deviation across all 2×N perturbations (N = number of features)
- Compute robustness = 1 − max_dev / 0.5 (1.0 = no change; 0.0 = predictions swing to extremes)
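A minimal sketch of that replay loop, assuming a `predict` function that maps a numeric feature vector to a win probability in [0, 1]; the names here are illustrative, not the actual API:

```typescript
type Features = number[];
type Predict = (features: Features) => number; // assumed: returns a probability in [0, 1]

function counterfactualRobustness(features: Features, predict: Predict): number {
  const base = predict(features);
  let maxDev = 0;
  // 2×N perturbations: each of the N features nudged down and up by 10%
  for (let i = 0; i < features.length; i++) {
    for (const factor of [0.9, 1.1]) {
      const perturbed = [...features];
      perturbed[i] = features[i] * factor;
      maxDev = Math.max(maxDev, Math.abs(predict(perturbed) - base));
    }
  }
  // Normalize by 0.5 per the formula above; clamp so a swing larger
  // than 0.5 reports 0 rather than a negative score.
  return Math.max(0, 1 - maxDev / 0.5);
}
```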
Robustness score interpretation:
- ≥ 0.95 → rock-solid; small feature changes don't move the prediction
- 0.7–0.95 → healthy
- 0.5–0.7 → somewhat fragile (one perturbed feature could flip it)
- < 0.5 → knife-edge; the prediction would flip under normal noise
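For illustration, the thresholds above map directly to labels; the function name and label strings here are hypothetical:

```typescript
type RobustnessLabel = "rock-solid" | "healthy" | "somewhat fragile" | "knife-edge";

// Bucket a robustness score per the interpretation table above.
function interpretRobustness(r: number): RobustnessLabel {
  if (r >= 0.95) return "rock-solid";
  if (r >= 0.7) return "healthy";
  if (r >= 0.5) return "somewhat fragile";
  return "knife-edge";
}
```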
Why this matters: a 70% prediction with 0.95 robustness is much more trustworthy than a 70% prediction with 0.4 robustness. The model assigns both the same win probability, but one sits on a knife edge. If brittlePct is consistently above 30%, the model is overfitting to specific feature configurations and would benefit from more regularization (turn up label smoothing or the confidence penalty).
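A minimal sketch of that brittlePct check, assuming "brittle" means a robustness score below 0.5 (the knife-edge threshold above); both the cutoff and the helper name are assumptions:

```typescript
// Percentage of sampled resolutions whose robustness falls in the
// knife-edge range. The < 0.5 cutoff is an assumption taken from the
// interpretation table above.
function brittlePct(robustnessScores: number[]): number {
  if (robustnessScores.length === 0) return 0;
  const brittle = robustnessScores.filter((r) => r < 0.5).length;
  return (100 * brittle) / robustnessScores.length;
}
```

If this value stays above 30 over a rolling window of sampled resolutions, that is the signal to reach for stronger regularization.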