Vol forecast: tail behaviour

Promoted

2026-05-17 · calibration · tails · production-gate

Hypothesis

When realised vol enters its top-5% (storms), the GBM 4h forecast has risen into its top-10% at least a few hours earlier — providing useful lead time for position reduction. Conversely, top-10% forecasts are followed by top-10% realisations more often than chance.

Verdict

**TAIL-READY**: the forecast is usable in tail-driven sizing. 85.0% of storms are preceded by an alert within 24h (median lead 0.0h), 66.5% of alerts are followed by a storm within 24h, and the top-decile bias is only -0.020. Vol-targeting in storms should be effective.

2026-05-17 · status: promoted · 8.1s

Key metrics

metric	value
n_storms	`2,300`
warning_rate_pct	`+85.0435`
median_lead_hours	`+0.0000`
n_alerts	`4,599`
alert_hit_rate_pct	`+66.5145`
top_decile_bias	`-0.0198`
gates_passed	`3`
gates_total	`3`

Approach

OOS GBM 4h forecasts, walk-forward (21 windows, ~46k hourly observations). Bias-corrected with the production scalar (×1.0322).

Definitions:

- Storm: realised vol > p95 of all realised values = 1.152 ann σ (top-5%).
- Alert: forecast > p90 of all forecasts = 0.842 ann σ (top-10%).
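The two thresholds above can be sketched as plain percentile cut-offs. This is a minimal illustration, not the report's actual code: the helper names `percentile` and `flag_tails` are hypothetical, and linear interpolation between order statistics is an assumption (the report does not say which percentile convention it uses).

```python
import math


def percentile(values, q):
    """Percentile with linear interpolation between order statistics; q in [0, 100]."""
    xs = sorted(values)
    if not xs:
        raise ValueError("empty input")
    pos = (len(xs) - 1) * q / 100.0
    lo, hi = math.floor(pos), math.ceil(pos)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def flag_tails(realised, forecast, storm_q=95.0, alert_q=90.0):
    """Flag storms (realised > p95) and alerts (forecast > p90) over the full sample."""
    storm_thr = percentile(realised, storm_q)
    alert_thr = percentile(forecast, alert_q)
    storms = [r > storm_thr for r in realised]
    alerts = [f > alert_thr for f in forecast]
    return storm_thr, alert_thr, storms, alerts
```

Because both thresholds are taken over all observations, roughly 5% of hours are flagged as storms and 10% as alerts by construction.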

1. Early-warning lead time for storms

Total storms (actual ≥ 1.152): 2,300
Storms preceded by an alert within 24h: 1,956 (85.0%)
Lead time distribution (hours before storm): median 0.0, P25 0.0, P75 0.0

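A 100% warning rate would mean every storm was flagged; a 0% warning rate would mean the forecast is purely lagging. If alerts fired independently at random in 10% of hours, the baseline would be ≈ 1 − (1 − 0.10)^24 = 92.0% at this alert threshold, so the observed 85.0% sits below that baseline.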
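The warning rate, lead times and random baseline discussed here could be computed along the following lines. This is a sketch under stated assumptions: the function names are hypothetical, the lead is taken as the lag to the most recent alert, and "within 24h" is read as lags 0 to 23 (so an alert in the storm hour itself counts with lead 0). The report does not confirm any of these conventions.

```python
def storm_warnings(storms, alerts, window=24):
    """Return (warning_rate, leads) for boolean hourly series.

    For each storm hour, look back over lags 0..window-1 and record the lag of
    the most recent alert (assumed convention). Storms with no alert in the
    window contribute to the denominator only.
    """
    leads = []
    n_storms = 0
    for t, is_storm in enumerate(storms):
        if not is_storm:
            continue
        n_storms += 1
        for lag in range(window):
            if t - lag >= 0 and alerts[t - lag]:
                leads.append(lag)
                break
    rate = len(leads) / n_storms if n_storms else float("nan")
    return rate, leads


def random_baseline(alert_rate=0.10, window=24):
    """Chance that at least one of `window` independent hours carries an alert."""
    return 1.0 - (1.0 - alert_rate) ** window
```

With `alert_rate=0.10` and `window=24`, `random_baseline()` reproduces the 92.0% figure quoted above; the independence assumption is a simplification, since real alerts cluster in time.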

2. False-alarm analysis

Total alerts (forecast ≥ 0.842): 4,599
Alerts followed by a storm within 24h: 3,059 (66.5%)
False-alarm rate: 33.5%

For sizing, a high false-alarm rate is acceptable — we just shrink positions during the alert window and reopen later. The cost is opportunity, not loss.
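The hit rate and false-alarm rate in this section can be sketched as a forward-looking count per alert. The function name `alert_hit_rate` is hypothetical, and the window convention (the alert hour plus the following 23 hours) is an assumption, since the report only says "within 24h".

```python
def alert_hit_rate(alerts, storms, window=24):
    """Return (hit_rate, false_alarm_rate) for boolean hourly series.

    An alert at hour t counts as a hit if any storm occurs in hours
    t..t+window-1 (assumed convention, truncated at the end of the sample).
    """
    n = len(storms)
    n_alerts = 0
    hits = 0
    for t, is_alert in enumerate(alerts):
        if not is_alert:
            continue
        n_alerts += 1
        if any(storms[t:min(n, t + window)]):
            hits += 1
    hit_rate = hits / n_alerts if n_alerts else float("nan")
    return hit_rate, 1.0 - hit_rate
```

As a check on the reported figures, 3,059 hits out of 4,599 alerts gives 66.5%, leaving 33.5% as false alarms.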

3. Conditional error by forecast decile

Is the model unbiased everywhere, or does it under- or over-forecast specifically in the tails? Bias is mean predicted minus mean actual within each forecast decile.

bin	n	pred_mean	act_mean	mae	rel_mae	bias
0	4599	0.1934	0.1959	0.0535	0.283	-0.0025
1	4598	0.2795	0.2788	0.0734	0.276	0.0007
2	4598	0.3398	0.344	0.0933	0.284	-0.0042
3	4599	0.3952	0.3925	0.1052	0.293	0.0027
4	4598	0.4497	0.438	0.1211	0.323	0.0117
5	4598	0.5059	0.4947	0.1344	0.309	0.0113
6	4599	0.5672	0.5567	0.1409	0.276	0.0105
7	4598	0.6456	0.6437	0.1538	0.253	0.0018
8	4598	0.7578	0.7696	0.1737	0.223	-0.0118
9	4599	1.1241	1.1439	0.2777	0.239	-0.0198
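A table of this shape could be built by sorting observations on the forecast, cutting them into ten equal-count bins and averaging within each bin. The sketch below is illustrative only: `decile_table` is a hypothetical name, the equal-count binning rule is an assumption, and `rel_mae` is left out because the report does not define how it is normalised.

```python
from statistics import fmean


def decile_table(pred, actual, n_bins=10):
    """Per-bin stats with bins formed by rank of the forecast.

    bias = mean predicted minus mean actual within the bin.
    """
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # fewer observations than bins
        pred_mean = fmean(pred[i] for i in idx)
        act_mean = fmean(actual[i] for i in idx)
        mae = fmean(abs(pred[i] - actual[i]) for i in idx)
        rows.append({
            "bin": b,
            "n": len(idx),
            "pred_mean": pred_mean,
            "act_mean": act_mean,
            "mae": mae,
            "bias": pred_mean - act_mean,
        })
    return rows
```

A negative bias in the last row, as in bin 9 above, means the forecast sits below realised vol on average in the top decile.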

Figures: lead time, confusion, bias by decile.

Production gates

gate	pass?	actual
warning rate ≥ 70%	OK	85.0%
alert hit-rate ≥ 30%	OK	66.5%
top-decile |bias| ≤ 0.10	OK	-0.0198

Passed: 3/3
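The three gates can be expressed as simple threshold checks on the key metrics. This is a minimal sketch; `evaluate_gates` is a hypothetical helper, and only the thresholds (70%, 30%, 0.10) come from the table above.

```python
def evaluate_gates(warning_rate_pct, alert_hit_rate_pct, top_decile_bias):
    """Return (gates, n_passed, n_total); each gate is (name, passed, actual)."""
    gates = [
        ("warning rate >= 70%", warning_rate_pct >= 70.0, warning_rate_pct),
        ("alert hit-rate >= 30%", alert_hit_rate_pct >= 30.0, alert_hit_rate_pct),
        ("top-decile |bias| <= 0.10", abs(top_decile_bias) <= 0.10, top_decile_bias),
    ]
    n_passed = sum(passed for _, passed, _ in gates)
    return gates, n_passed, len(gates)


# Values taken from the key metrics in this report.
gates, n_passed, n_total = evaluate_gates(85.0435, 66.5145, -0.0198)
print(f"Passed: {n_passed}/{n_total}")
```

Note that the bias gate tests the absolute value, so a top-decile bias of either sign beyond 0.10 would fail it.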