Platt scaling fits a sigmoid. But the true calibration curve may be bumpy or have flat regions, and a sigmoid can't capture that.
Isotonic regression is non-parametric: it fits an arbitrary monotone step function via the Pool Adjacent Violators algorithm:
- Sort (raw, actual) pairs by raw probability ascending
- Start each pair as its own block with weight=1
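- Walk left-to-right. If a block's average y is greater than the next block's average y (a violation of the monotone constraint), merge the two blocks into one whose average is the weight-weighted mean, then re-check the merged block against its left neighbor and keep merging leftward until no violation remains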
- The result is a piecewise-constant, monotone non-decreasing function
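
The steps above can be sketched in plain Python as follows. This is a minimal illustration, not the predictor's actual code: the names `fit_isotonic` and `predict_isotonic` are hypothetical, all samples get weight 1, and tied raw probabilities are not pooled up front (a production version would probably handle that).

```python
from bisect import bisect_left

def fit_isotonic(raw, actual):
    """Fit a monotone non-decreasing step function with Pool Adjacent Violators.

    Returns (uppers, means): block i covers raw values up to uppers[i]
    and predicts means[i]. Hypothetical helper, for illustration only.
    """
    pairs = sorted(zip(raw, actual))  # sort by raw probability ascending
    blocks = []  # each block is [sum_of_y, weight, largest_raw_in_block]
    for x, y in pairs:
        blocks.append([float(y), 1.0, x])  # every pair starts as its own block
        # Merge leftward while the previous block's mean exceeds this one's.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, w, hi = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += w
            blocks[-1][2] = hi
    uppers = [b[2] for b in blocks]
    means = [b[0] / b[1] for b in blocks]
    return uppers, means

def predict_isotonic(uppers, means, x):
    """Look up the calibrated value for raw probability x (clamped at the ends)."""
    i = bisect_left(uppers, x)
    return means[min(i, len(means) - 1)]

# Toy data: the 1 at raw=0.2 violates monotonicity and gets pooled
# with the two 0s that follow it.
uppers, means = fit_isotonic([0.1, 0.2, 0.3, 0.4, 0.5, 0.6], [0, 1, 0, 0, 1, 1])
```

On the toy data, the pairs at raw 0.2, 0.3, and 0.4 end up in a single block whose value is their average, so the fitted function stays flat across that range instead of jumping up and back down.
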
Trade-offs vs Platt: isotonic is more flexible (better fit when the data isn't sigmoidal) but needs more samples to avoid overfitting individual outliers. Use Platt for <100 calibration pairs, isotonic for >500.
Where it fits: the unified predictor currently uses Platt (global) → regime-Platt (per-regime). Isotonic is a third option you can wire in if Platt's coverage diverges from target.