Annualized return divided by the maximum drawdown. Measures how much return was earned per unit of pain. Higher is better.
Definition
Calmar = annualized_return / |max_drawdown|
Example: 40% p.a. return and 20% max DD ⇒ Calmar = 2.0.
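As a rough illustration of the definition, the sketch below derives both inputs from an equity curve. The function names, the `periods_per_year` parameter, and the choice of compound annual growth as the "annualized return" are assumptions made for this example; they are not taken from Botty's code.

```python
from typing import Sequence


def max_drawdown(equity: Sequence[float]) -> float:
    """Largest peak-to-trough decline as a negative fraction (e.g. -0.20)."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)                 # running high-water mark
        worst = min(worst, value / peak - 1.0)  # decline from that mark
    return worst


def annualized_return(equity: Sequence[float], periods_per_year: float) -> float:
    """Compound annual growth rate over the whole curve (an assumed convention)."""
    years = (len(equity) - 1) / periods_per_year
    return (equity[-1] / equity[0]) ** (1.0 / years) - 1.0


def calmar(equity: Sequence[float], periods_per_year: float) -> float:
    """Annualized return divided by the absolute maximum drawdown."""
    ret = annualized_return(equity, periods_per_year)
    dd = abs(max_drawdown(equity))
    if dd == 0.0:
        # No drawdown observed: the ratio is undefined, so pick a sentinel.
        return float("inf") if ret > 0 else 0.0
    return ret / dd


# Toy curve spanning one year in three steps: +40% overall, 20% worst dip.
example = calmar([100.0, 125.0, 100.0, 140.0], periods_per_year=3)
```

With this toy curve the result is approximately 2.0, matching the worked example above (up to floating-point rounding).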
Developed by Terry W. Young (1991) as a simple, robust score for commodity trading advisors. It is similar to the Sharpe ratio, but uses the actual worst drawdown rather than volatility as the risk measure. That is closer to a trader's psychological reality: upside volatility doesn't hurt, only downside does.
Interpretation
| Calmar | Assessment |
|---|---|
| < 0.5 | poor return/risk ratio |
| 0.5 – 1.0 | OK, not convincing |
| 1.0 – 3.0 | good |
| > 3.0 | very good, be cautious with short history |
Very high Calmar values (> 5) in backtests are usually a warning sign of overfitting, not of brilliance.
How Botty uses it
The mega-sweep (backtesting/megasweep.py) uses Calmar as the primary score for ranking strategies, with a trade-minimum penalty:
score = calmar × min(1, trades / min_trades)
This way strategies that earn consistently with low drawdown win — not strategies with 2 lucky hits.
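The penalty can be sketched as a small function. This is a minimal illustration of the formula above, not the code in backtesting/megasweep.py; the function name `penalized_score`, the strategy labels, and the sample numbers are hypothetical.

```python
def penalized_score(calmar: float, trades: int, min_trades: int) -> float:
    """Scale Calmar linearly down when fewer than min_trades were made."""
    if min_trades <= 0:
        return calmar  # assumed behavior: no penalty configured
    return calmar * min(1.0, trades / min_trades)


# Hypothetical candidates: a high Calmar built on 2 trades versus a
# moderate Calmar built on 45 trades, with min_trades = 30.
candidates = {
    "two_lucky_hits": penalized_score(6.0, trades=2, min_trades=30),
    "steady_earner": penalized_score(2.0, trades=45, min_trades=30),
}
best = max(candidates, key=candidates.get)
```

Here the two-trade strategy is scaled by 2/30 and drops to roughly 0.4, while the 45-trade strategy keeps its full Calmar of 2.0 and ranks first. One property of the formula as written: a negative Calmar is pulled toward zero by the same factor, so a losing strategy with few trades scores less badly than one with many.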
Limits
- Sensitive to a single drawdown. The denominator is the one worst drawdown in the history, so a single bad trade can permanently depress the Calmar.
- Needs enough history. With < 1 year of data the metric is shaky.
- Ignores trade frequency. A strategy with 2 trades per year can have a high Calmar but mean nothing statistically, hence the min_trades factor.