The problem: the shared 22-feature model captures general patterns but can't see symbol-specific quirks (NVDA's institutional flow profile, TSLA's retail-meme dynamics, MSFT's defensive-tech regime). One model, many symbols, suboptimal fit per symbol.
The mixed-effects fix: add ONE learned offset per symbol. When the shared model predicts p_shared, the final prediction is
p_final = sigmoid(logit(p_shared) + bias[sym])
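The prediction step can be sketched in a few lines. This is a minimal illustration, not the actual implementation; the `bias` dict and `predict` helper are hypothetical names, and an unseen symbol defaults to a zero offset (i.e., the shared prediction passes through unchanged).

```python
import math

def logit(p: float) -> float:
    """Inverse sigmoid: map a probability to logit (log-odds) space."""
    return math.log(p / (1.0 - p))

def sigmoid(x: float) -> float:
    """Map a logit back to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def predict(p_shared: float, bias: dict, sym: str) -> float:
    """Apply the symbol's learned logit-space offset to the shared prediction.

    Symbols with no learned bias get 0.0, leaving p_shared unchanged.
    """
    return sigmoid(logit(p_shared) + bias.get(sym, 0.0))
```

Because the offset is applied in logit space, a symbol with zero bias gets back exactly the shared model's probability.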
Sector transfer: until a symbol has 5+ resolutions of its own, it borrows the average bias of its sector peers. So a newly added AMD effectively inherits NVDA's bias (via the semiconductor-sector average) before AMD has its own data.
Learning rule: after each resolution, EMA update the symbol's bias toward the residual (label vs predicted). LR=0.05, clamped to ±1.5 in logit space.
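One plausible reading of that update is sketched below: nudge the stored bias by LR times the probability-space residual, then clamp. The exact residual definition and update form are assumptions here; the source only specifies the learning rate and the clamp range.

```python
def update_bias(bias: dict, sym: str, label: float, p_final: float,
                lr: float = 0.05, clamp: float = 1.5) -> float:
    """EMA-style update of a symbol's logit-space bias after a resolution.

    label is the realized outcome (0.0 or 1.0); p_final is the
    probability that was predicted. The residual (label - p_final)
    is positive when the model underestimated, pushing the bias up.
    """
    residual = label - p_final                  # in [-1, 1]
    b = bias.get(sym, 0.0) + lr * residual      # small step toward residual
    bias[sym] = max(-clamp, min(clamp, b))      # keep within +/-1.5 logits
    return bias[sym]
```

The clamp bounds the correction: even a maximally biased symbol can shift the shared prediction by at most 1.5 logits.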
Bias is in logit space, not probability space. A bias of +0.5 means: if the shared model predicts 50% (logit 0), the final is sigmoid(0 + 0.5) ≈ 62%. If the shared model predicts 70% (logit 0.85), the final is sigmoid(1.35) ≈ 79%. The bias has a bigger impact near 50%.
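Those numbers can be reproduced directly; this snippet only re-derives the worked example above.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

# Shared model at 50% (logit 0) with a +0.5 bias:
print(round(sigmoid(0.0 + 0.5), 3))   # 0.622 -> ~62%

# Shared model at 70% (logit ~0.85) with the same +0.5 bias:
print(round(sigmoid(0.85 + 0.5), 3))  # 0.794 -> ~79%
```

Note the asymmetry: the same +0.5 bias moved the 50% prediction by ~12 points but the 70% prediction by only ~9, which is the "bigger impact near 50%" effect of a fixed logit-space shift.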
Positive bias = shared model has been UNDERESTIMATING this symbol (it wins more than predicted). The brain adds confidence.
Negative bias = shared model has been OVERESTIMATING. The brain subtracts confidence.
Zero bias = shared model has been calibrated on this symbol; no per-symbol adjustment needed.
Yellow (sector fallback): this symbol doesn't have 5 resolutions yet, so it's borrowing the average bias of its sector peers. Transitions to its own bias once data accumulates.