Botty · CUSUM-Filter

Cumulative-sum event filter (López de Prado / Page)

Accumulates returns in two buckets (up/down pressure) and marks an event as soon as a threshold is breached — then resets. A vol-adaptive event sampler ('don't sample every candle'), not a buy/sell signal.

What is the CUSUM filter?

CUSUM stands for cumulative sum. It originated in quality control (E. S. Page, 1954) as a 'control chart' for detecting a drift in the mean. López de Prado popularised it in financial ML as an event filter; this is exactly the variant Thomas Skinner (Delta Trend Trading) shows in the video 'STOP Sampling Every Candle'.

Idea: two running sums of the returns — one for upward, one for downward pressure. When one breaches the threshold, an event is marked and the sum is reset to zero:

S_up = max(0, S_up + r_t)      # accumulated upward pressure
S_dn = min(0, S_dn + r_t)      # accumulated downward pressure

if S_up >=  h:  → Up-Event,   S_up = 0
if S_dn <= -h:  → Down-Event, S_dn = 0

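r_t = (log) return per bar, h = threshold. Crucially, h is vol-/ATR-normalised (h = k·ATR) rather than fixed. A fixed threshold would flood you with events during high-vol phases and yield almost none during quiet ones; with a vol-normalised threshold, events are spread far more evenly across regimes. For the comparison to be meaningful, the vol measure has to be in the same units as r_t, e.g. ATR as a fraction of price or a rolling standard deviation of returns.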
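A minimal Python sketch of the update rule above, for illustration only. The name cusum_filter and its signature are assumptions and not necessarily what ml/events.py exposes. Thresholds are passed per bar so that h can vary with volatility.

```python
from typing import List, Sequence, Tuple


def cusum_filter(
    returns: Sequence[float], thresholds: Sequence[float]
) -> List[Tuple[int, int]]:
    """Return (bar_index, direction) events; direction is +1 (up) or -1 (down).

    Hypothetical helper: mirrors the S_up / S_dn recursion from the text.
    """
    if len(returns) != len(thresholds):
        raise ValueError("returns and thresholds must have the same length")

    s_up, s_dn = 0.0, 0.0
    events: List[Tuple[int, int]] = []
    for t, (r, h) in enumerate(zip(returns, thresholds)):
        # Skip bars without a usable threshold (e.g. NaN during warm-up).
        if not h > 0:
            continue
        s_up = max(0.0, s_up + r)  # accumulated upward pressure
        s_dn = min(0.0, s_dn + r)  # accumulated downward pressure
        if s_up >= h:
            events.append((t, +1))
            s_up = 0.0
        if s_dn <= -h:
            events.append((t, -1))
            s_dn = 0.0
    return events


returns = [0.01, 0.02, -0.005, -0.03, -0.01, 0.04]
thresholds = [0.025] * len(returns)  # constant h, just for the example
events = cusum_filter(returns, thresholds)
print(events)  # [(1, 1), (3, -1), (5, 1)]
```

Each side is reset only when its own threshold is breached, as in the pseudocode; the other running sum keeps accumulating.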

What it gives us

The event sampler for the 'don't sample every candle' discipline:

Noise out. Forecasting at every bar means predicting normal intraday noise with no catalyst → low accuracy. CUSUM filters down to the bars where a meaningful directional move has accumulated — the only ones with anything potentially learnable.
Partner to Triple-Barrier. CUSUM first defines when an event occurs, then Triple-Barrier labels what happens afterwards. Exactly the 'event-defining → outcome features' pipeline.
One tuning lever. k controls the event density: low = many events (towards noise), high = few (sample too small). You tune to a sensible frequency.
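To make the tuning lever concrete, here is a small sketch that sweeps k on synthetic Gaussian returns and reports the share of bars that become events. Everything in it is an assumption for illustration: the helper count_events, the synthetic data, and the use of the known sigma as a stand-in for the vol estimate.

```python
import random
from typing import Dict, Sequence


def count_events(returns: Sequence[float], h: float) -> int:
    """Count up- and down-events of the CUSUM filter for a fixed threshold h."""
    s_up = s_dn = 0.0
    n_events = 0
    for r in returns:
        s_up = max(0.0, s_up + r)
        s_dn = min(0.0, s_dn + r)
        if s_up >= h:
            n_events += 1
            s_up = 0.0
        if s_dn <= -h:
            n_events += 1
            s_dn = 0.0
    return n_events


rng = random.Random(42)  # fixed seed so the sweep is reproducible
sigma = 0.01             # assumed per-bar volatility of the synthetic returns
returns = [rng.gauss(0.0, sigma) for _ in range(5000)]

density: Dict[float, float] = {}
for k in (0.5, 1.0, 2.0, 4.0):
    density[k] = count_events(returns, h=k * sigma) / len(returns)
    print(f"k={k:>3}: {density[k]:.1%} of bars are events")
```

On real data the same sweep would be run per instrument and timeframe, picking the k that leaves enough events for a usable sample without drifting back towards per-bar noise.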

Important: the events are not signals. An up-event does not mean 'long'. It is only a timestamp: 'here it is worth looking'. The edge only emerges from contextualising features + a model on top of them.

Where it sits in Botty — the ML module

Home: ml/, as an event sampler in ml/events.py that returns a list of event timestamps. Experiments in ml/experiments/ use it to subset the bars before computing context features and labelling them via Triple-Barrier.
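A rough sketch of that hand-off, assuming events arrive as bar indices and using a simplified close-only Triple-Barrier rule. The function triple_barrier_label, the symmetric barrier width, and the sample prices are hypothetical and not taken from the Botty code.

```python
from typing import Dict, List, Sequence


def triple_barrier_label(
    closes: Sequence[float], start: int, width: float, max_holding: int
) -> int:
    """Label one event: +1 upper barrier first, -1 lower first, 0 on timeout.

    Simplified assumption: symmetric barriers at +/- width around the entry
    close, checked on closes only.
    """
    entry = closes[start]
    upper = entry * (1.0 + width)
    lower = entry * (1.0 - width)
    end = min(start + max_holding, len(closes) - 1)
    # The label looks forward by design; features must stop at `start`.
    for t in range(start + 1, end + 1):
        if closes[t] >= upper:
            return 1
        if closes[t] <= lower:
            return -1
    return 0  # vertical barrier (time limit) reached first


closes = [100.0, 101.0, 103.0, 102.0, 99.0, 97.0, 98.0, 98.5]
event_bars: List[int] = [0, 1, 3]  # e.g. the output of the event sampler

# Only event bars are labelled; all other bars are never sampled.
labels: Dict[int, int] = {
    t: triple_barrier_label(closes, t, width=0.02, max_holding=3)
    for t in event_bars
}
print(labels)  # {0: 1, 1: 0, 3: -1}
```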

Not strategies/conditions/ — at least not first. This is Botty's architectural boundary: ml/ delivers findings, strategies/ implements signals only after validation. Since CUSUM events are explicitly not signals, an 'entry on CUSUM event' would be edge-less. Only once an ml/ study shows that CUSUM events + context have predictive power does a derived entry move into strategies/.

CUSUM vs. BOCPD

Both detect 'something has changed', but differently:

	CUSUM filter	BOCPD
Type	frequentist, threshold-based	Bayesian, probabilistic
Output	event yes/no (reset)	P(fresh structural break)
Cost	very cheap, 1 parameter	more expensive, model
Status in Botty	implemented, pilot (`ml/events.py`)	live (`ml/forecast/bocpd_live.py`)

For pure event sampling, CUSUM is lighter; as an early vol-warning signal, BOCPD is stronger. They don't compete — they complement each other.

Honest assessment

Research infrastructure (sampling), not an edge in itself. Belongs to the live-readiness discipline.
Don't confuse the two CUSUM variants: the event filter (López de Prado / Skinner — this one) and the classic changepoint detector (mean shift). For Botty we mean the event filter.
Causal & lookahead-free (only past returns).
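The causality claim can be checked mechanically: events computed on any prefix of the series must equal the full-series events up to that point. The sketch below does this for a fixed threshold; cusum_events_fixed and is_prefix_invariant are hypothetical helper names, and the returns are made up.

```python
from typing import List, Sequence, Tuple


def cusum_events_fixed(returns: Sequence[float], h: float) -> List[Tuple[int, int]]:
    """CUSUM event filter with a fixed threshold h; returns (bar, direction)."""
    s_up = s_dn = 0.0
    events: List[Tuple[int, int]] = []
    for t, r in enumerate(returns):
        s_up = max(0.0, s_up + r)
        s_dn = min(0.0, s_dn + r)
        if s_up >= h:
            events.append((t, +1))
            s_up = 0.0
        if s_dn <= -h:
            events.append((t, -1))
            s_dn = 0.0
    return events


def is_prefix_invariant(returns: Sequence[float], h: float) -> bool:
    """True if cutting off the future never changes events already emitted."""
    full = cusum_events_fixed(returns, h)
    for n in range(len(returns) + 1):
        on_prefix = cusum_events_fixed(returns[:n], h)
        if on_prefix != [e for e in full if e[0] < n]:
            return False
    return True


returns = [0.012, -0.004, 0.021, -0.030, -0.008,
           0.015, 0.019, -0.026, 0.003, 0.011]
full_events = cusum_events_fixed(returns, h=0.025)
print(full_events)                          # [(2, 1), (3, -1), (6, 1), (7, -1)]
print(is_prefix_invariant(returns, 0.025))  # True
```

The same check is worth running with the vol-normalised threshold, since a threshold that peeks at future bars would break the invariance even though the recursion itself is causal.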

Status in Botty: implemented in ml/events.py (cusum_events + vol_threshold); pilot pipeline in ml/experiments/cusum_triple_barrier/ (64% downsampling on 1h BTC, +1 base rate, walk-forward stable).

Parameters

	Value	Description
k	1.0	Threshold as a multiple of the ATR/vol normalization; controls event density
vol_window	20	Bars used for the ATR/vol normalization of the threshold
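How the two parameters could combine into a per-bar threshold, as a sketch only. It assumes a rolling standard deviation of returns as the vol proxy (not ATR) and a window that excludes the current bar; rolling_vol_threshold is a hypothetical name and not necessarily how vol_threshold in ml/events.py works.

```python
import math
import statistics
from typing import List, Sequence


def rolling_vol_threshold(
    returns: Sequence[float], k: float = 1.0, vol_window: int = 20
) -> List[float]:
    """Per-bar threshold h_t = k * stdev of the previous vol_window returns.

    Assumption: the window covers bars t - vol_window .. t - 1 only, so the
    threshold at bar t does not depend on r_t. Warm-up bars get NaN.
    """
    if vol_window < 2:
        raise ValueError("vol_window must be at least 2")
    thresholds: List[float] = []
    for t in range(len(returns)):
        if t < vol_window:
            thresholds.append(math.nan)  # not enough history yet
        else:
            window = returns[t - vol_window:t]
            thresholds.append(k * statistics.pstdev(window))
    return thresholds


returns = [0.01, -0.01] * 5
h = rolling_vol_threshold(returns, k=2.0, vol_window=4)
print(h[:4])  # warm-up: four NaN values
print(h[4])   # about 0.02, i.e. k times a window stdev of 0.01
```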

Use in Botty

ml/events.py
ml/experiments/cusum_triple_barrier/

triple_barrier bocpd atr live_readiness