Botty · Calmar Ratio

Annualized return divided by the absolute value of the maximum drawdown. Measures how much return was earned per unit of worst peak-to-trough loss. Higher is better.

Definition

Calmar = annualized_return / |max_drawdown|

Example: 40% p.a. return and 20% max DD ⇒ Calmar = 2.0.
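A minimal Python sketch of the definition, assuming the input is a list of equity values sampled at regular intervals. The function names and the periods_per_year parameter are illustrative and not taken from Botty's code. Annualized return is computed here as a compound growth rate, which is one common convention and may differ from what Botty does.

```python
import math


def max_drawdown(equity):
    """Largest peak-to-trough decline as a negative fraction (e.g. -0.20)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # running high-water mark
        worst = min(worst, value / peak - 1.0)  # decline from that mark
    return worst


def annualized_return(equity, periods_per_year):
    """Compound annual growth rate implied by the first and last values."""
    n_periods = len(equity) - 1
    total_growth = equity[-1] / equity[0]
    return total_growth ** (periods_per_year / n_periods) - 1.0


def calmar(equity, periods_per_year=252):
    """Annualized return divided by |max drawdown|."""
    dd = abs(max_drawdown(equity))
    if dd == 0.0:
        # Assumption: with no drawdown at all the ratio is undefined.
        return math.nan
    return annualized_return(equity, periods_per_year) / dd


# Toy curve: +40% over one "year" of 3 periods, with a 20% dip from 120 to 96.
equity = [100.0, 120.0, 96.0, 140.0]
print(round(calmar(equity, periods_per_year=3), 2))  # 2.0
```

The toy curve reproduces the example above: a 40% annual return against a 20% maximum drawdown.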

Developed by Terry W. Young (1991) as a simple, robust score for commodity trading advisors. It is similar to the Sharpe ratio, but uses the actual worst drawdown rather than volatility as the risk measure. That is closer to a trader's psychological reality: volatility penalizes upside and downside moves alike, whereas only the downside hurts.

Interpretation

Calmar	Assessment
< 0.5	poor return/risk ratio
0.5 – 1.0	OK, not convincing
1.0 – 3.0	good
> 3.0	very good, be cautious with short history

Very high Calmar values (> 5) in backtests are usually a warning sign of overfitting, not of brilliance.
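The table can be expressed as a small lookup, shown here as a hedged Python sketch. The function names are hypothetical, and the handling of the boundary values (0.5, 1.0, 3.0) is an assumption, since the table does not say which band they belong to.

```python
def assess_calmar(calmar):
    """Map a Calmar value to the assessment bands from the table."""
    if calmar < 0.5:
        return "poor return/risk ratio"
    if calmar < 1.0:   # assumption: 0.5 itself counts as "OK"
        return "OK, not convincing"
    if calmar <= 3.0:  # assumption: 1.0 and 3.0 both count as "good"
        return "good"
    return "very good, be cautious with short history"


def looks_overfit(calmar, threshold=5.0):
    """Flag backtest values above the warning threshold named in the text."""
    return calmar > threshold


for value in (0.3, 0.7, 2.0, 4.0, 6.5):
    print(value, assess_calmar(value), looks_overfit(value))
```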

How Botty uses it

The mega-sweep (backtesting/megasweep.py) uses Calmar as the primary score for ranking strategies, with a trade-minimum penalty:

score = calmar × min(1, trades / min_trades)

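The penalty scales the score down linearly while a strategy has fewer than min_trades trades and leaves it unchanged at or above that count. This way strategies that earn consistently with low drawdown win, not strategies with 2 lucky hits.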
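The scoring formula can be sketched in Python as follows. This is an illustration of the formula only, not the actual code in backtesting/megasweep.py. The function name sweep_score and the value min_trades = 30 are assumptions chosen for the example.

```python
def sweep_score(calmar, trades, min_trades):
    """score = calmar * min(1, trades / min_trades)"""
    if min_trades <= 0:
        raise ValueError("min_trades must be positive")
    penalty = min(1.0, trades / min_trades)  # 1.0 once the minimum is reached
    return calmar * penalty


MIN_TRADES = 30  # hypothetical threshold, not taken from Botty

# Two lucky hits with a spectacular Calmar: 6.0 * (2 / 30), about 0.4
lucky = sweep_score(calmar=6.0, trades=2, min_trades=MIN_TRADES)

# A steadier strategy with enough trades keeps its full Calmar of 2.0
steady = sweep_score(calmar=2.0, trades=50, min_trades=MIN_TRADES)

print(lucky < steady)  # True
```

With these numbers the steadier strategy outranks the lucky one even though its raw Calmar is three times lower.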

Limits

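Sensitive to single drawdowns. The denominator is the single worst drawdown over the whole history, so one bad trade can permanently depress the Calmar.
Needs enough history. With < 1 year of data the metric is shaky: annualizing extrapolates from a short sample, and the worst drawdown may not have occurred yet.
Ignores trade frequency. A strategy with 2 trades per year can have a high Calmar but mean nothing statistically, hence the min_trades factor.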