Per-bucket reliability diagram. When the brain says 70% conviction, does it actually win 70% of the time?
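A perfectly calibrated brain sits on the dotted diagonal, where predicted conviction equals actual win rate. Points above the line are under-confident (could size up). Points below are over-confident (gates should tighten). ECE (expected calibration error) summarizes the overall miss as the sample-weighted average gap between predicted and actual win rate across buckets; lower is better.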
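As a rough illustration of how the per-bucket points and ECE could be computed, here is a minimal sketch. It assumes conviction is a probability in [0, 1], outcomes are recorded as win/loss, and buckets are ten equal-width bins; the function names and the bucket count are hypothetical and not taken from the actual implementation.

```python
from typing import List, Sequence, Tuple

Bucket = Tuple[float, float, int]  # (mean predicted conviction, actual win rate, count)


def reliability_buckets(
    convictions: Sequence[float],
    wins: Sequence[bool],
    n_buckets: int = 10,  # assumption: equal-width buckets over [0, 1]
) -> List[Bucket]:
    """Group predictions into buckets and return one diagram point per non-empty bucket."""
    conviction_sum = [0.0] * n_buckets
    win_count = [0] * n_buckets
    count = [0] * n_buckets
    for p, won in zip(convictions, wins):
        # Clamp so that a conviction of exactly 1.0 lands in the last bucket.
        idx = min(int(p * n_buckets), n_buckets - 1)
        conviction_sum[idx] += p
        win_count[idx] += 1 if won else 0
        count[idx] += 1
    return [
        (conviction_sum[i] / count[i], win_count[i] / count[i], count[i])
        for i in range(n_buckets)
        if count[i] > 0
    ]


def expected_calibration_error(buckets: Sequence[Bucket]) -> float:
    """Sample-weighted average of |predicted - actual| across buckets."""
    total = sum(n for _, _, n in buckets)
    if total == 0:
        return 0.0
    return sum(n / total * abs(pred - actual) for pred, actual, n in buckets)


def bucket_verdict(pred: float, actual: float) -> str:
    """Read a point against the diagonal: above = under-confident, below = over-confident."""
    if actual > pred:
        return "under-confident"
    if actual < pred:
        return "over-confident"
    return "calibrated"
```

A bucket with mean conviction 0.75 that wins only 60% of the time plots below the diagonal, so `bucket_verdict(0.75, 0.60)` reports it as over-confident, and that bucket contributes a gap of about 0.15 to ECE in proportion to its share of the samples.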