The principle: the 22-feature logistic regression model is parametric; it compresses everything into 22 weights. That makes it useful for generalization, but it throws away the memory of specific past situations.

k-NN recall fixes this: for every new prediction, find the K=10 most-similar past RESOLVED predictions (Euclidean distance, feature-importance weighted) and look at their realized outcomes. If 8 of 10 similar past cases won, k-NN votes 80%.
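
A minimal sketch of that recall step, assuming each resolved prediction is stored with its feature vector and a boolean outcome (the types and function names here are illustrative, not the actual codebase):

```ts
// Illustrative record shape: the feature vector the model scored plus the
// realized outcome once the prediction resolved.
interface ResolvedPrediction {
  features: number[];
  won: boolean;
}

// Plain Euclidean distance; the importance-weighted variant is sketched
// further down in the distance-metric note.
function euclidean(a: number[], b: number[]): number {
  return Math.sqrt(a.reduce((sum, ai, i) => sum + (ai - b[i]) ** 2, 0));
}

// Find the K most-similar resolved predictions and return the fraction that
// won: 8 winners out of 10 neighbors -> 0.8.
function knnRecall(
  features: number[],
  history: ResolvedPrediction[],
  k = 10,
): number | null {
  if (history.length === 0) return null;
  const neighbors = history
    .map((p) => ({ p, d: euclidean(features, p.features) }))
    .sort((a, b) => a.d - b.d)
    .slice(0, k);
  const wins = neighbors.filter((n) => n.p.won).length;
  return wins / neighbors.length;
}
```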

Blended output: final_prob = 0.7 × model_prob + 0.3 × knn_prob
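
A sketch of the blend, using the 0.7 default and the model-only fallback described below (the function name is illustrative):

```ts
// Convex combination of the parametric model's probability and the k-NN vote.
// If k-NN has no vote yet (not enough resolved history), use the model alone.
function blend(modelProb: number, knnProb: number | null, alpha = 0.7): number {
  if (knnProb === null) return modelProb;
  return alpha * modelProb + (1 - alpha) * knnProb;
}
```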

Why this catches what the model misses: the parametric model averages over all training data. If your most-recent 5 NVDA setups looked exactly like THIS setup and 4 of 5 won, k-NN sees that. The averaged model might dilute it with 100 other unrelated NVDA setups.
History size: 0
K (neighbors): 10
Default alpha: 0.70
Ready? —
🔍 Live per-symbol k-NN votes
For each live symbol, this panel shows the K=10 most-similar past resolved predictions and their outcomes. The k-NN probability is the inverse-distance-weighted win rate among those neighbors.
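
A sketch of that weighting, assuming each neighbor arrives as a (distance, outcome) pair; the epsilon is an illustrative guard against a zero distance:

```ts
interface Neighbor {
  distance: number; // weighted Euclidean distance to the live setup
  won: boolean;     // realized outcome of that past prediction
}

// Closer neighbors count for more: each votes with weight 1/(distance + eps),
// and the k-NN probability is the weighted share of winners.
function inverseDistanceWinRate(neighbors: Neighbor[], eps = 1e-6): number | null {
  if (neighbors.length === 0) return null;
  let winWeight = 0;
  let totalWeight = 0;
  for (const n of neighbors) {
    const w = 1 / (n.distance + eps);
    totalWeight += w;
    if (n.won) winWeight += w;
  }
  return winWeight / totalWeight;
}
```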
🔬 How blending works
α = 0.7 (default): the model carries 70% of the weight and k-NN gets 30%. The model dominates for general patterns; k-NN provides a sanity check from memory.

When k-NN disagrees strongly with the model, the ensemble agreement scorer picks up the divergence and reduces overall confidence, so the two systems naturally check each other.
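
One way such a check could work, shown only as a hypothetical sketch (the threshold and shrink factor here are made up, not the scorer's actual parameters):

```ts
// Hypothetical divergence check: when the model and k-NN disagree by a wide
// margin, pull the reported probability back toward 0.5 (no edge).
function agreementAdjustedConfidence(
  modelProb: number,
  knnProb: number | null,
  blendedProb: number,
  maxDivergence = 0.25, // illustrative threshold
  shrink = 0.5,         // illustrative penalty factor
): number {
  if (knnProb === null) return blendedProb;
  const divergence = Math.abs(modelProb - knnProb);
  if (divergence <= maxDivergence) return blendedProb;
  return 0.5 + (blendedProb - 0.5) * shrink;
}
```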

Distance metric: weighted Euclidean. Each feature's weight comes from FeatureImportance.lrMultiplier, so the high-alpha features dominate the similarity calculation. Noise features barely matter.
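
A sketch of the weighted distance, assuming the per-feature weight is taken from each feature's lrMultiplier value (the record shape shown here is assumed):

```ts
// Assumed shape for an importance record; only lrMultiplier is used here.
interface FeatureImportance {
  lrMultiplier: number;
}

// High-importance features dominate the similarity; near-zero weights make a
// feature effectively invisible to the distance.
function importanceWeights(importances: FeatureImportance[]): number[] {
  return importances.map((f) => Math.abs(f.lrMultiplier));
}

function weightedEuclidean(a: number[], b: number[], weights: number[]): number {
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const diff = a[i] - b[i];
    sum += weights[i] * diff * diff;
  }
  return Math.sqrt(sum);
}
```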

Why not always trust k-NN? It needs enough history. Until 5+ resolved predictions exist, k-NN returns null and the system uses the model alone. After 50+, k-NN becomes meaningful and starts catching the pattern-specific signal the parametric model dilutes.
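
A sketch of that gate, using the 5-prediction minimum described above (the names are illustrative):

```ts
// Below the minimum resolved history, skip k-NN entirely and let the
// blend fall back to the model's probability alone.
const MIN_RESOLVED_FOR_KNN = 5;

function knnProbIfReady(
  resolvedCount: number,
  computeKnnProb: () => number,
): number | null {
  if (resolvedCount < MIN_RESOLVED_FOR_KNN) return null;
  return computeKnnProb();
}
```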