Walk-Forward

Indicator concept
Walk-Forward Analysis (Optimization & Validation)
Robustness test: optimize parameters on a train window (walk-forward optimization, WFO), validate them on the out-of-sample test window that follows, then roll both windows forward and repeat. Helps filter out overfit parameter sets.

Terminology: WFO vs. WFA

Walk-forward optimization (WFO) refers to the optimization step inside each train window; walk-forward analysis/validation (WFA) is the whole rolling process including the out-of-sample validation test. In practice the terms are often used interchangeably — what's always meant is: optimize on train, measure honestly on the unseen test window.

Problem

Optimizing a strategy on an in-sample backtest is trivial: try enough parameter combinations and some will look excellent on past data. That is overfitting: the parameters fit that exact history, including its noise, and there is no evidence they will work in the future.

Solution

Walk-forward simulates honest live trading on historical data:

  1. Take a train window (e.g. 6 months). Optimize the parameters within it.
  2. Test the best parameters on a test window (e.g. 2 months) — this window was not visible during optimization.
  3. Roll both windows forward by the test period (e.g. 2 months ahead).
  4. Repeat steps 1–3 until you have 3–10 windows.
  5. Aggregate the test results (never the train results!).
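
The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Botty's implementation: the names `walk_forward_splits` and `walk_forward`, the `score` callback, and the use of bar counts (rather than calendar months) for window sizes are all hypothetical choices made for the example.

```python
from typing import Callable, Iterator, Sequence, TypeVar

P = TypeVar("P")  # type of one parameter set (hypothetical, e.g. an int or a dict)


def walk_forward_splits(
    n_bars: int, train_size: int, test_size: int
) -> Iterator[tuple[range, range]]:
    """Yield (train, test) index ranges, rolling forward by test_size each step."""
    start = 0
    while start + train_size + test_size <= n_bars:
        train = range(start, start + train_size)
        test = range(start + train_size, start + train_size + test_size)
        yield train, test
        start += test_size  # step 3: roll both windows by the test period


def walk_forward(
    data: Sequence[float],
    param_grid: Sequence[P],
    score: Callable[[Sequence[float], P], float],
    train_size: int,
    test_size: int,
) -> list[tuple[P, float]]:
    """Return (best_params, out_of_sample_score) for every window."""
    results = []
    for train, test in walk_forward_splits(len(data), train_size, test_size):
        train_data = data[train.start:train.stop]
        test_data = data[test.start:test.stop]
        # Step 1: optimize on the train window only.
        best = max(param_grid, key=lambda p: score(train_data, p))
        # Step 2: evaluate the chosen parameters on the unseen test window.
        results.append((best, score(test_data, best)))
    return results


def aggregate_test_scores(results: Sequence[tuple[P, float]]) -> float:
    """Step 5: average the test scores; train scores are never included."""
    return sum(s for _, s in results) / len(results)
```

Only the out-of-sample score of each window enters `aggregate_test_scores`; the train score is used solely to pick `best` and is then discarded.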

What this filters out

  • Curve fitting — strategies that only perform in a specific regime fail as soon as the test window falls into a different regime.
  • Parameter sensitivity — if the optimal parameters swing wildly between windows, the strategy is unstable.
  • Regime dependence: the rolling test windows cover different market phases, so a strategy that only works in one phase shows up as inconsistent.
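
One possible way to put a number on parameters that "swing wildly" is the relative spread of the optimal value across windows. The sketch below uses the coefficient of variation; both the metric and the function name `parameter_dispersion` are assumptions made for illustration, and the text above does not prescribe any particular measure or threshold.

```python
from statistics import mean, pstdev
from typing import Sequence


def parameter_dispersion(values: Sequence[float]) -> float:
    """Coefficient of variation of one parameter's optimum across windows.

    0.0 means the same value was chosen in every window; larger values
    mean the optimum moves more relative to its average.
    """
    spread = pstdev(values)
    centre = mean(values)
    if centre == 0:
        # Avoid division by zero; any spread around zero counts as unbounded.
        return 0.0 if spread == 0 else float("inf")
    return spread / abs(centre)


# Hypothetical optimal lookback lengths from three windows each.
stable = parameter_dispersion([20, 20, 20])
unstable = parameter_dispersion([10, 30])
```

A value near zero suggests the optimum barely moves between windows; what counts as "too unstable" is a judgment call left to the user.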

How Botty uses it

Mega-sweep phase 3 (backtesting/megasweep.py) tests the top N from phase 2 on 3 rolling walk-forward windows:

  • Train: 6 months — parameters are optimized here
  • Test: 2 months — out-of-sample evaluation
  • Roll-forward: 2 months per iteration
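
With these settings, three windows span 6 + 3 × 2 = 12 months of history. The sketch below lays out such a schedule; it is not taken from backtesting/megasweep.py. The helper names, the start date, the alignment to the first of each month, and the half-open `[start, end)` intervals are all assumptions for illustration.

```python
from datetime import date


def add_months(d: date, months: int) -> date:
    """Shift a date to the first day of the month `months` later."""
    idx = d.year * 12 + (d.month - 1) + months
    return date(idx // 12, idx % 12 + 1, 1)


def window_schedule(
    start: date, train_months: int = 6, test_months: int = 2, n_windows: int = 3
) -> list[tuple[date, date, date]]:
    """Return (train_start, test_start, test_end) per window.

    Train covers [train_start, test_start), test covers [test_start, test_end).
    Each window starts `test_months` after the previous one.
    """
    windows = []
    for i in range(n_windows):
        train_start = add_months(start, i * test_months)
        test_start = add_months(train_start, train_months)
        test_end = add_months(test_start, test_months)
        windows.append((train_start, test_start, test_end))
    return windows


# Hypothetical start date, chosen only to make the example concrete.
schedule = window_schedule(date(2023, 1, 1))
for train_start, test_start, test_end in schedule:
    print(f"train {train_start} .. {test_start} | test {test_start} .. {test_end}")
```

Because the roll step equals the test length, the test windows tile the last six months without overlap, while consecutive train windows share four months of data.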

A strategy only survives phase 3 if it performs consistently in the test windows.

Trade-offs

✅ The most honest way to validate on history.
✅ Catches many (not all) cases of overfitting.

❌ Computationally expensive: parameter optimization runs once per window.
❌ Only as good as the metrics used: if you already overfit in phase 2, walk-forward may not detect it.
❌ Works poorly for strategies with very low trade frequency (too few trades per window).