ML — Pattern Discovery

Inverted workflow: find conditional edges in BTC data first, build strategies second.
55 experiments

Regime clustering (k-means)

2026-05-17 · status: inconclusive · 4.0s · tags: clustering, regime, core

Hypothesis: Unsupervised k-means clustering on (rv_1d, ret_24h, vol_z_1d, dd_7d) yields market regimes whose forward-24h returns differ materially in walk-forward (out-of-sample) tests.

Verdict: INCONCLUSIVE. The spread between the best and worst regime means is 32.0 bps, but the t-stats are too weak: the flagged best_regime (2) has t = -0.68, and no regime reaches |t| = 1. The clusters separate the data, but the OOS return differences are noisier than they look in-sample.

Key metrics

| metric | value |
| --- | --- |
| k_clusters | 4 |
| best_regime | 2 |
| best_mean_bps | -18.5632 |
| best_t_stat | -0.6781 |
| regime_spread_bps | +32.0167 |
| n_windows | 21 |

Approach

Features: rv_1d_ann (listed as rv_1d in the hypothesis, presumably annualized), ret_24h, vol_z_1d, dd_7d. Daily observations are sampled at 00:00 UTC (2,308 obs). For each walk-forward window we fit a StandardScaler + KMeans(k=4) on the training segment only, then assign cluster labels to the held-out test segment.
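A minimal sketch of the per-window step described above, assuming scikit-learn's `StandardScaler` and `KMeans` (the class names in the text match scikit-learn, but the experiment's actual code is not shown). The function name `label_test_window`, the `rv_col` index, the seed, and the synthetic data are hypothetical. Because k-means label ids are arbitrary in each fit, the sketch also ranks clusters by their realized-vol centroid so that labels are comparable across windows, which mirrors the low-to-high vol ordering used in the pooled table.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def label_test_window(X_train, X_test, k=4, rv_col=0, seed=0):
    """Fit scaler + k-means on the training segment only, then label the test segment.

    Returned labels are remapped so 0 is the lowest-vol cluster and k-1 the highest.
    rv_col is the (assumed) column index of the realized-vol feature.
    """
    model = make_pipeline(
        StandardScaler(),
        KMeans(n_clusters=k, n_init=10, random_state=seed),
    )
    model.fit(X_train)                 # no test data touches the scaler or centroids
    raw_test = model.predict(X_test)   # scale with train statistics, then assign

    # Rank raw cluster ids by their vol centroid (scaling preserves the ordering).
    centers = model.named_steps["kmeans"].cluster_centers_
    order = np.argsort(centers[:, rv_col])
    rank = np.empty(k, dtype=int)
    rank[order] = np.arange(k)
    return rank[raw_test]


# Synthetic stand-in for the four features; not the experiment's data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
labels = label_test_window(X[:400], X[400:])
print(np.bincount(labels, minlength=4))  # test-segment occupancy per sorted regime
```

Fitting the scaler inside the same pipeline as the clusterer keeps the training-segment mean and variance from being contaminated by the held-out data.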

Walk-forward windows: 21
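
The source does not state the window lengths or whether the training segment expands or rolls. Purely as an illustration, an expanding-window splitter could look like the sketch below; `walk_forward_splits` and the toy sizes are hypothetical and are not the experiment's settings.

```python
def walk_forward_splits(n_obs, initial_train, test_size):
    """Yield (train_range, test_range) pairs for an expanding-window walk-forward.

    Each test segment starts where the previous one ended, and the training
    segment always covers everything before the test segment.
    """
    start = initial_train
    while start < n_obs:
        stop = min(start + test_size, n_obs)  # last window may be shorter
        yield range(0, start), range(start, stop)
        start = stop


# Toy sizes for readability only.
for train_idx, test_idx in walk_forward_splits(n_obs=10, initial_train=4, test_size=2):
    print(f"train [0, {train_idx.stop})  test [{test_idx.start}, {test_idx.stop})")
```

A rolling variant would replace `range(0, start)` with a fixed-length range ending at `start`.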

Pooled OOS per regime (sorted by trailing vol, low → high)

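| k_sorted | mean_bps | se_bps | t_stat | n_windows | total_obs | avg_centroid_rv | avg_centroid_ret | avg_centroid_dd |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | -7.43 | 16.86 | -0.44 | 21 | 590 | 0.424 | 0.0038 | -0.037 |
| 1 | 10.72 | 11.33 | 0.95 | 21 | 831 | 0.502 | 0.0017 | -0.04 |
| 2 | -18.56 | 27.37 | -0.68 | 21 | 272 | 0.812 | 0.0198 | -0.074 |
| 3 | 13.45 | 21.81 | 0.62 | 18 | 223 | 1.097 | -0.0238 | -0.142 |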
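
As a quick consistency check on the table above, each reported t_stat matches mean_bps / se_bps to two decimals, and regime_spread_bps is the gap between the highest and lowest regime means. The sketch below recomputes both from the rounded table values, so the spread comes out at 32.01 rather than the reported 32.0167. How se_bps itself was estimated is not stated in the source.

```python
# (k_sorted, mean_bps, se_bps, reported_t) copied from the pooled OOS table
rows = [
    (0, -7.43, 16.86, -0.44),
    (1, 10.72, 11.33, 0.95),
    (2, -18.56, 27.37, -0.68),
    (3, 13.45, 21.81, 0.62),
]

# t-stat of each regime's pooled mean: mean divided by its standard error
t_stats = {k: mean / se for k, mean, se, _ in rows}
for k, _, _, reported_t in rows:
    print(f"regime {k}: t = {t_stats[k]:+.2f} (reported {reported_t:+.2f})")

# Spread between the best and worst regime means, in bps
means = [mean for _, mean, _, _ in rows]
spread = max(means) - min(means)
print(f"spread = {spread:.2f} bps")

# Largest absolute t-stat across regimes
max_abs_t = max(abs(t) for t in t_stats.values())
print(f"max |t| = {max_abs_t:.2f}")
```

With every |t| below 1, none of the regime means is distinguishable from zero at conventional significance levels, which is consistent with the inconclusive verdict.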

Figures: regime returns, regime occupancy, centroids.