Plain-English definitions for the brain & ML stack
🎯 Core Concepts
Model
A mathematical function that takes inputs (features) and outputs predictions (here: P(win)). Our model is "trained" to make better predictions by adjusting internal numbers (weights) based on past wins and losses.
Like an experienced trader's intuition, but with every gut call written down as a number.
Feature
A single numeric input the model uses to make predictions. We use 22 features: RSI, ATR%, RVOL, MA distance, IV, regime, etc.
If a chef predicts how good a stew will be, features = "saltiness, simmer time, herb amount."
Weight
A number that says how much a feature matters to the prediction. Positive weight = feature pushes toward "win"; negative = toward "loss." Training adjusts these weights to minimize errors.
If "RVOL > 2×" has weight +0.8, high volume reliably contributes to winning trades in your data.
Label
The "correct answer" we're trying to predict. Here labels are binary: 1 = hit (won), 0 = miss (lost). Each finding becomes a labeled training row after the brain rates it 30 min later.
Logistic Regression
The simplest classification model. Takes a weighted sum of features, squashes it through a sigmoid function to get a probability between 0 and 1. Fast, interpretable, hard to overfit. Our primary model.
P(win) = sigmoid(w1·RSI + w2·RVOL + ... + bias)
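In code, that formula is just a weighted sum pushed through a sigmoid. A minimal sketch (the feature values and weights below are made up for illustration, not trained values):

```python
import math

def sigmoid(z):
    # Squash any real number into the range (0, 1).
    return 1.0 / (1.0 + math.exp(-z))

def p_win(features, weights, bias):
    # Weighted sum of features plus bias, squashed to a probability.
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

# Hypothetical example: two features (say, RSI and RVOL) with made-up weights.
prob = p_win([0.4, 2.1], weights=[-0.5, 0.8], bias=0.1)
```

Here the positive RVOL weight pulls the probability above 0.5, matching the "positive weight pushes toward win" intuition above.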
Neural Network (MLP)
Multi-layer perceptron. Stacks logistic regressions with non-linear activations (ReLU) so it can learn feature interactions. More powerful but slower, harder to interpret, more prone to overfitting.
Learns rules like "high RVOL + low IV + bullish regime" without you telling it about that combo.
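A toy forward pass shows what "stacking logistic regressions with ReLU" means; the weights here are arbitrary placeholders, not anything a real trainer produced:

```python
import math

def relu(v):
    # Zero out negatives; this non-linearity lets layers compose usefully.
    return [max(0.0, x) for x in v]

def mlp_p_win(x, W1, b1, w2, b2):
    # Hidden layer: one weighted sum per hidden unit, then ReLU.
    h = relu([sum(wij * xi for wij, xi in zip(row, x)) + bj
              for row, bj in zip(W1, b1)])
    # Output layer: a logistic regression over the hidden activations.
    z = sum(wj * hj for wj, hj in zip(w2, h)) + b2
    return 1.0 / (1.0 + math.exp(-z))
```

With the hidden layer removed this collapses back to plain logistic regression, which is why the MLP is strictly more expressive.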
⚙️ Training
SGD (Stochastic Gradient Descent)
The algorithm that updates weights one sample at a time. For each training row: predict, compare to actual, nudge weights to reduce the error. Repeat thousands of times.
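One such predict-compare-nudge step for a logistic model can be sketched like this (a simplified illustration, not the brain's actual training code):

```python
import math

def sgd_step(weights, bias, x, y, lr=0.05):
    # Predict with the current weights.
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    p = 1.0 / (1.0 + math.exp(-z))
    # For logistic regression with cross-entropy loss, the error signal
    # is simply (predicted probability - actual label).
    err = p - y
    # Nudge each weight against the error, scaled by the learning rate.
    new_w = [w - lr * err * xi for w, xi in zip(weights, x)]
    new_b = bias - lr * err
    return new_w, new_b
```

Training is just calling `sgd_step` on every labeled row, over and over; the learning rate `lr` is the knob described next.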
Learning Rate
How big each weight update is. Too small: model trains slowly. Too big: oscillates / never converges. Sweet spot is typically 0.01-0.1. See LR Tuner.
Epoch
One complete pass through all the training data. We typically do 3-5 epochs in a full retrain. More = better fit but risk of overfitting.
Loss
A number measuring how wrong the model's predictions were. Lower = better. We use binary cross-entropy which heavily penalizes confident-but-wrong predictions.
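A minimal sketch of binary cross-entropy, which makes the confident-but-wrong penalty concrete:

```python
import math

def bce(p, y):
    # Binary cross-entropy for a single prediction.
    # y is the label (0 or 1); p is the predicted P(win),
    # clipped away from exact 0/1 so the log never blows up.
    p = min(max(p, 1e-12), 1 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))
```

A confident miss like `bce(0.95, 0)` costs roughly 3.0, while a hedged miss like `bce(0.55, 0)` costs about 0.8, so the loss pushes the model hardest on its worst overconfident calls.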
Overfitting
When the model memorizes the training data instead of learning generalizable rules. Shows up as high in-sample accuracy but low out-of-sample. Mitigated by: simpler model, more data, regularization.
Like memorizing test answers instead of understanding the subject.
📊 Evaluation
Accuracy
(Correct predictions) / (Total predictions). Random guessing = 50%. Anything above is signal. Above 55% is a meaningful trading edge.
Precision
When the model says "WIN," how often is it actually a win? Of 100 buy signals, if 65 are wins, precision = 65%.
Recall
Of all the actual wins, how many did the model catch? If 100 setups are real wins but the model flagged only 70, recall = 70%.
F1 Score
Harmonic mean of precision and recall. Single number balancing both. Above 0.55 is solid for trading models.
Confusion Matrix
A 2×2 table: True Positives (predicted win, was win), False Positives (predicted win, was loss), True Negatives (predicted loss, was loss), False Negatives (predicted loss, was win). Tells you what KIND of errors the model makes. See Prediction Replay.
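All four evaluation metrics above fall straight out of the confusion-matrix counts; a minimal sketch:

```python
def confusion(y_true, y_pred):
    # Count the four cells of the 2x2 confusion matrix.
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return tp, fp, fn, tn

def metrics(tp, fp, fn, tn):
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    return precision, recall, f1, accuracy
```

For example, labels [1,1,1,0,0] against predictions [1,1,0,1,0] give tp=2, fp=1, fn=1, tn=1, so precision, recall, and F1 are all 2/3 and accuracy is 60%.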
Cross-Validation (K-fold CV)
Robust accuracy estimation. Split data into K folds. For each fold: train on the other K-1, test on this one. Average results. Tells you how the model will perform on unseen data.
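The fold bookkeeping can be sketched as below (index splitting only; the per-fold train/evaluate step is omitted, and real pipelines usually shuffle first):

```python
def kfold_indices(n, k):
    # Yield (train_indices, test_indices) pairs for K-fold CV.
    # Fold sizes differ by at most 1 when n is not divisible by k.
    fold_sizes = [n // k + (1 if i < n % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = [i for i in range(n) if i < start or i >= start + size]
        yield train, test
        start += size
```

Every row appears in exactly one test fold, so averaging the K test scores uses all the data without ever scoring a row the fold's model trained on.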
🧬 Self-Learning Loop
Online Learning
Training one sample at a time as new data arrives. Our brain does this on every newly rated outcome, so the model keeps improving as you use the site.
Feature Engineering
Designing what to feed the model. Bad features = bad predictions. Good features = strong predictions even with simple models. See all features documented.
Ensemble
Running multiple models and combining their predictions. Reduces variance, often improves accuracy. See Ensemble A/B + Model Compare.
SHAP (Feature Attribution)
A method for explaining individual predictions: shows which features pushed the prediction up (toward win) or down (toward loss). Tells you the WHY behind any model output. See Model Explain.
Confidence (probability)
The model's output: a number between 0 and 1 representing P(win). 0.7 = the model thinks there's a 70% chance this works. Not certainty, but a calibrated probability. See Model Confidence.
Cold-Start Problem
A fresh model has no data, so its predictions are random. It takes 50-100 real outcomes before it's calibrated. Use Model Seed to bootstrap with synthetic but realistic data.
💡 Practical Tips
Warm-Start
Initialize a fresh model with pre-existing knowledge instead of zero weights. Our Model Seed feature does this by generating realistic synthetic data based on historical setup hit-rates.
Reading model probabilities
P > 0.7 = high confidence (model lifts conviction). 0.45-0.55 = noise (model has no opinion). P < 0.3 = model says fade. Always interpret in context of overall strategy.
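Those bands can be encoded as a tiny helper. Note the labels, and the fallback for the zones the rule of thumb leaves unspecified (0.3-0.45 and 0.55-0.7), are illustrative assumptions, not part of the product:

```python
def read_confidence(p):
    # Map a calibrated P(win) to the interpretation bands above.
    if p > 0.7:
        return "high confidence"   # model lifts conviction
    if p < 0.3:
        return "fade"              # model says bet against
    if 0.45 <= p <= 0.55:
        return "noise"             # model has no opinion
    return "weak lean"             # assumed label for the in-between zones
```

As the definition says, treat the output as one input to the overall strategy, not a standalone trade trigger.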
Model Version
Every snapshot of weights is a version. Old versions stay archived (up to 50). Promote any past version to current with one click if a training run goes wrong. See Model Versions.