ML — Pattern Discovery

Inverted workflow: find conditional edges in BTC data first, build strategies second.
55 experiments

Vol forecast: tail behaviour

2026-05-17 · status: promoted · 8.1s · tags: calibration, tails, production-gate

Hypothesis: When realised vol enters its top-5% (storms), the GBM 4h forecast has risen into its top-10% at least a few hours earlier, providing useful lead time for position reduction. Conversely, top-10% forecasts are followed by top-10% realisations more often than chance.

Verdict: **TAIL-READY**. The forecast is usable in tail-driven sizing: 85.0% of storms get advance warning (median lead 0.0h), alerts are followed by storms 66.5% of the time, and top-decile bias is only -0.020. Vol-targeting in storms should be effective.

Key metrics

| metric | value |
|---|---|
| n_storms | 2,300 |
| warning_rate_pct | +85.0435 |
| median_lead_hours | +0.0000 |
| n_alerts | 4,599 |
| alert_hit_rate_pct | +66.5145 |
| top_decile_bias | -0.0198 |
| gates_passed | 3 |
| gates_total | 3 |

Approach

Out-of-sample (OOS) GBM 4h forecasts from a walk-forward evaluation (21 windows, ~46k hourly observations), bias-corrected with the production scalar (×1.0322).

Definitions:

  • Storm: realised vol > p95 of all realised values = 1.152 ann σ (top-5%).
  • Alert: forecast > p90 of all forecasts = 0.842 ann σ (top-10%).
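
The two definitions can be sketched as follows. This is a minimal illustration, not the experiment's code: `percentile` and `tail_flags` are hypothetical helper names, and linear interpolation between order statistics is an assumption, since the report does not say how its percentiles are computed.

```python
def percentile(values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a non-empty sequence."""
    xs = sorted(values)
    pos = (len(xs) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(xs) - 1)
    return xs[lo] + (xs[hi] - xs[lo]) * (pos - lo)


def tail_flags(forecast, realised, alert_q=90.0, storm_q=95.0):
    """Flag alert hours (forecast above its p90) and storm hours (realised above its p95).

    Thresholds are taken over the full sample, as in the definitions above.
    """
    alert_thr = percentile(forecast, alert_q)
    storm_thr = percentile(realised, storm_q)
    alerts = [f > alert_thr for f in forecast]
    storms = [r > storm_thr for r in realised]
    return alerts, storms, alert_thr, storm_thr
```

On the report's data the thresholds would come out as the 0.842 and 1.152 quoted above; the sketch itself does not depend on those values.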

1. Early-warning lead time for storms

  • Total storms (actual ≥ 1.152): 2,300
  • Storms preceded by an alert within 24h: 1,956 (85.0%)
  • Lead time distribution (hours before storm): median 0.0, P25 0.0, P75 0.0
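
One way to compute the warning rate and lead times is sketched below. The report does not spell out its exact lead-time definition, so this sketch assumes the lead is the number of hours from the most recent alert (including the storm hour itself) to the storm, and that a storm counts as warned when that gap is at most 24 hours. `warning_stats` is a hypothetical name.

```python
from statistics import median


def warning_stats(alerts, storms, window=24):
    """Return (warning rate in %, list of lead times in hours) for hourly boolean flags.

    Assumption: lead time = hours since the most recent alert at or before the storm hour.
    """
    leads = []
    n_storms = 0
    last_alert = None
    for t, (is_alert, is_storm) in enumerate(zip(alerts, storms)):
        if is_alert:
            last_alert = t
        if is_storm:
            n_storms += 1
            if last_alert is not None and t - last_alert <= window:
                leads.append(t - last_alert)
    rate = 100.0 * len(leads) / n_storms if n_storms else float("nan")
    return rate, leads


# Tiny illustrative series (made-up flags, not the experiment's data).
rate, leads = warning_stats(
    alerts=[False, True, False, False, True, False],
    storms=[False, False, False, True, True, False],
)
median_lead = median(leads)
```

Under this definition a lead of 0 means an alert was active in the storm hour itself, so other definitions (for example, the earliest alert in the window) would give different lead distributions.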

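A 100% warning rate would mean every storm was flagged; a 0% warning rate would mean the forecast is purely lagging. For comparison, if alerts fired independently in 10% of hours, the chance of at least one alert in a 24h window would be 1 − (1 − 0.10)^24 ≈ 92.0%. The observed 85.0% sits below that figure, although the independence assumption likely overstates chance coverage, because alerts cluster in time.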
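
The baseline arithmetic can be checked with a short sketch. `chance_rate` is a hypothetical helper, and it assumes each hour is flagged independently with the same probability, which is a simplification for autocorrelated vol forecasts.

```python
def chance_rate(p_flag, window=24):
    """Probability of at least one flag in `window` independent hours, each flagged with p_flag."""
    return 1.0 - (1.0 - p_flag) ** window


# Alert threshold at the top 10% of forecasts, 24-hour look-back window.
baseline_pct = 100.0 * chance_rate(0.10, window=24)
print(f"{baseline_pct:.1f}%")  # prints 92.0%
```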

2. False-alarm analysis

  • Total alerts (forecast ≥ 0.842): 4,599
  • Alerts followed by a storm within 24h: 3,059 (66.5%)
  • False-alarm rate: 33.5%

For sizing, a high false-alarm rate is acceptable — we just shrink positions during the alert window and reopen later. The cost is opportunity, not loss.
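
The hit rate and false-alarm rate can be sketched in the same style. The function name `alert_hit_rate` is hypothetical, and the sketch assumes an alert counts as a hit when a storm occurs in the alert hour or within the following 24 hours; the report's exact window convention is not stated.

```python
def alert_hit_rate(alerts, storms, window=24):
    """Return (hit rate %, false-alarm rate %) over all alert hours."""
    n = len(alerts)
    # next_storm[t] = index of the first storm at or after hour t, or None.
    next_storm = [None] * n
    nxt = None
    for t in range(n - 1, -1, -1):
        if storms[t]:
            nxt = t
        next_storm[t] = nxt

    hits = total = 0
    for t in range(n):
        if alerts[t]:
            total += 1
            if next_storm[t] is not None and next_storm[t] - t <= window:
                hits += 1
    hit_pct = 100.0 * hits / total if total else float("nan")
    return hit_pct, 100.0 - hit_pct
```

By construction the two rates sum to 100%, which matches the 66.5% and 33.5% reported above.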

3. Conditional error by forecast decile

Is the model unbiased everywhere or does it under-/over-forecast in the tails specifically? bias = mean predicted − mean actual within the decile.

| bin | n | pred_mean | act_mean | mae | rel_mae | bias |
|---|---|---|---|---|---|---|
| 0 | 4599 | 0.1934 | 0.1959 | 0.0535 | 0.283 | -0.0025 |
| 1 | 4598 | 0.2795 | 0.2788 | 0.0734 | 0.276 | 0.0007 |
| 2 | 4598 | 0.3398 | 0.3440 | 0.0933 | 0.284 | -0.0042 |
| 3 | 4599 | 0.3952 | 0.3925 | 0.1052 | 0.293 | 0.0027 |
| 4 | 4598 | 0.4497 | 0.4380 | 0.1211 | 0.323 | 0.0117 |
| 5 | 4598 | 0.5059 | 0.4947 | 0.1344 | 0.309 | 0.0113 |
| 6 | 4599 | 0.5672 | 0.5567 | 0.1409 | 0.276 | 0.0105 |
| 7 | 4598 | 0.6456 | 0.6437 | 0.1538 | 0.253 | 0.0018 |
| 8 | 4598 | 0.7578 | 0.7696 | 0.1737 | 0.223 | -0.0118 |
| 9 | 4599 | 1.1241 | 1.1439 | 0.2777 | 0.239 | -0.0198 |
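
A table of this shape can be produced with a sketch like the one below. `decile_table` is a hypothetical name; the equal-count binning by sorted forecast is an assumption and may split ties or remainders differently from the experiment. The `rel_mae` column is omitted because the report does not define it.

```python
def decile_table(pred, actual, n_bins=10):
    """Sort observations by forecast, split into equal-count bins, summarise each bin."""
    order = sorted(range(len(pred)), key=lambda i: pred[i])
    n = len(order)
    rows = []
    for b in range(n_bins):
        idx = order[b * n // n_bins:(b + 1) * n // n_bins]
        if not idx:
            continue  # fewer observations than bins
        pred_mean = sum(pred[i] for i in idx) / len(idx)
        act_mean = sum(actual[i] for i in idx) / len(idx)
        mae = sum(abs(pred[i] - actual[i]) for i in idx) / len(idx)
        rows.append({
            "bin": b,
            "n": len(idx),
            "pred_mean": pred_mean,
            "act_mean": act_mean,
            "mae": mae,
            # bias = mean predicted minus mean actual within the bin
            "bias": pred_mean - act_mean,
        })
    return rows
```

A negative bias in the top bin, as in the table above, means the forecast sits slightly below realised vol on average when it is at its highest.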

[Figures: lead time · confusion · bias by decile]

Production gates

| gate | pass? | actual |
|---|---|---|
| warning rate ≥ 70% | OK | 85.0% |
| alert hit-rate ≥ 30% | OK | 66.5% |
| top-decile \|bias\| ≤ 0.10 | OK | -0.0198 |

Passed: 3/3
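
The gate check itself is simple enough to sketch directly. `evaluate_gates` is a hypothetical name; the thresholds are the three listed above, and the inputs are the metric values from this report.

```python
def evaluate_gates(warning_rate_pct, alert_hit_rate_pct, top_decile_bias):
    """Evaluate the three production gates; return (per-gate results, passed, total)."""
    gates = [
        ("warning rate >= 70%", warning_rate_pct >= 70.0),
        ("alert hit-rate >= 30%", alert_hit_rate_pct >= 30.0),
        # The bias gate is on the absolute value, so the sign does not matter.
        ("top-decile |bias| <= 0.10", abs(top_decile_bias) <= 0.10),
    ]
    passed = sum(ok for _, ok in gates)
    return gates, passed, len(gates)


gates, gates_passed, gates_total = evaluate_gates(85.0435, 66.5145, -0.0198)
print(f"Passed: {gates_passed}/{gates_total}")  # prints Passed: 3/3
```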