Backtest vs Live Trading
Backtesting and live trading are fundamentally different environments. Backtesting runs your strategy on historical data where everything is known in advance. Live trading runs on an unknown future with real money, real emotions, and real execution challenges.
The gap between backtest and live performance is not a bug; it is a feature of reality. Understanding where the gap comes from and how to minimize it is essential.
Backtesting: deterministic, fast, clean data, perfect execution, no emotions.
Live trading: stochastic, real-time, messy data, imperfect execution, fear and greed.
The Performance Degradation Chain
Each factor chips away at backtest results:
Gross backtest return: 25%
- Overfitting bias: -5%
- Survivorship bias: -1%
- Slippage & costs: -4%
- Data quality issues: -1%
- Execution problems: -2%
- Market regime change: -3%
= Realistic live return: ~9%
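In this illustration 16 of the 25 percentage points disappear, a degradation of roughly 64% that leaves about 36% of the backtest return. Degradation on this scale is typical when a backtest has not been corrected for these factors. For a well-constructed backtest, many practitioners use a rule of thumb: expect live performance to be 50-70% of backtest performance.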
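The arithmetic of the chain can be checked in a few lines of Python. The figures are the illustrative ones listed above, not measurements, and the variable names are chosen here purely for the sketch.

```python
# Illustrative haircuts from the chain above, in percentage points of annual return.
gross_backtest_return = 25.0
haircuts = {
    "overfitting bias": 5.0,
    "survivorship bias": 1.0,
    "slippage & costs": 4.0,
    "data quality issues": 1.0,
    "execution problems": 2.0,
    "market regime change": 3.0,
}

total_haircut = sum(haircuts.values())
realistic_live_return = gross_backtest_return - total_haircut
retained_fraction = realistic_live_return / gross_backtest_return

print(f"Total haircut: {total_haircut:.0f} points")
print(f"Realistic live return: {realistic_live_return:.0f}%")
print(f"Share of backtest return retained: {retained_fraction:.0%}")
```

Replacing the placeholder haircuts with your own estimates gives a quick sanity check on what a backtest figure might translate to.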
Concrete Examples
Execution Differences
Backtest: Buy 1,000 shares at $50.00 (the price when the signal triggers). Instant fill, no impact.
Live: By the time the order reaches the exchange (50ms latency), the price is $50.02. Your order takes out shares at $50.02, $50.03, and $50.05 due to limited depth. Average fill: $50.03.
At 1,000 shares per trade and 200 trades/year, $0.03 of slippage per share adds up to 1,000 × 200 × $0.03 = $6,000 a year in hidden costs.
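A minimal sketch of how a market order "walks the book" is shown here. The depth at each price level is hypothetical, chosen only so that the average fill matches the $50.03 in the example; the function name `average_fill_price` is likewise an assumption of this sketch.

```python
def average_fill_price(levels, quantity):
    """Volume-weighted average price for a market buy that walks the ask side.

    levels: list of (price, shares_available), best price first.
    """
    remaining = quantity
    cost = 0.0
    for price, available in levels:
        take = min(remaining, available)
        cost += take * price
        remaining -= take
        if remaining == 0:
            break
    if remaining > 0:
        raise ValueError("not enough displayed depth to fill the order")
    return cost / quantity


# Hypothetical ask-side depth when the order arrives (signal price was $50.00).
ask_levels = [(50.02, 400), (50.03, 400), (50.05, 200)]
signal_price = 50.00
shares = 1_000
trades_per_year = 200

fill = average_fill_price(ask_levels, shares)
slippage_per_share = fill - signal_price
annual_cost = slippage_per_share * shares * trades_per_year

print(f"Average fill: ${fill:.2f}")
print(f"Annual slippage cost: ${annual_cost:,.0f}")
```

A backtest that fills at the signal price implicitly sets `slippage_per_share` to zero, which is where the hidden cost comes from.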
Data Feed Issues
Backtest: Clean, adjusted OHLC data. No gaps, no errors, no missing bars.
Live: Your data feed hiccups at 10:32 AM — one bar is missing. Your indicator miscalculates for the next 20 bars. A signal fires that shouldn’t have. You enter a losing trade.
A failure like this never appears in a backtest on clean data, but it happens regularly in live trading.
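One defensive measure is to check for missing bars before an indicator is updated. The sketch below assumes fixed-interval intraday bars within a single session and uses a hypothetical helper, `find_missing_bars`; it does not handle overnight or weekend breaks.

```python
from datetime import datetime, timedelta


def find_missing_bars(timestamps, interval):
    """Return the bar timestamps that are absent between consecutive received bars."""
    missing = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        expected = prev + interval
        while expected < curr:
            missing.append(expected)
            expected += interval
    return missing


# Hypothetical one-minute bars with the 10:32 bar missing.
received = [
    datetime(2024, 1, 2, 10, 30),
    datetime(2024, 1, 2, 10, 31),
    datetime(2024, 1, 2, 10, 33),
]
gaps = find_missing_bars(received, timedelta(minutes=1))

if gaps:
    # A live system might pause signal generation until the gap is backfilled.
    print("Missing bars:", [ts.strftime("%H:%M") for ts in gaps])
```

What to do once a gap is found (backfill, skip signals, or halt) is a design decision that depends on how sensitive the indicator is to a missing observation.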
Psychological Interference
Backtest: Strategy enters a position, takes a 15% drawdown, then recovers to 25% gain. No hesitation.
Live: The same drawdown occurs. After watching $15,000 evaporate over 3 weeks, you override the system and close the position. Price recovers the next day. You have locked in the $15,000 loss and missed the $25,000 gain the strategy went on to make.
Market Regime Change
Backtest: Momentum strategy optimized on 2015-2020 (trending markets, low rates). Sharpe: 1.8.
Live: Deployed in 2022 (choppy markets, rising rates). Momentum signals whipsaw. Sharpe: 0.3. The strategy isn’t broken — the regime changed.
Capacity Constraints
Backtest: Small-cap strategy tested with $100K produces 40% annual returns.
Live: With $5M, you are trading stocks that turn over only $500K/day. Your orders move the market and positions take days to build. Returns drop to 12%.
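A rough way to see the capacity problem is to estimate how many days it takes to build one position without trading more than a set share of daily volume. The 20-position portfolio and the 10% participation cap below are assumptions of this sketch, not figures from the example.

```python
import math


def days_to_build(position_value, adv_value, max_participation):
    """Trading days needed to build a position while staying under a share of daily volume."""
    daily_capacity = adv_value * max_participation
    return math.ceil(position_value / daily_capacity)


adv_value = 500_000        # average daily traded value per stock, from the example
max_participation = 0.10   # assumed cap: trade at most 10% of daily volume
positions = 20             # assumed number of equally weighted holdings

days_small = days_to_build(100_000 / positions, adv_value, max_participation)
days_large = days_to_build(5_000_000 / positions, adv_value, max_participation)

print(f"$100K account: {days_small} day(s) per position")
print(f"$5M account:   {days_large} day(s) per position")
```

A signal that decays within a day or two loses much of its value if the position takes a week to establish.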
Building Realistic Backtests
- Include all transaction costs: Commissions, slippage, spread, market impact. Use conservative estimates; if in doubt, double them.
- Use survivorship-bias-free data: Include delisted securities and point-in-time index constituents.
- Eliminate look-ahead bias: No future information in decisions. Use point-in-time data for fundamentals.
- Test across regimes: Include bull, bear, high/low volatility, trending and ranging periods. If your data doesn’t include 2008 and 2020, it’s incomplete.
- Model realistic execution: Use next-bar execution (signal on close, execute on next open). Add random slippage variation. Limit order fill rates should be realistic. Account for gaps on stops.
- Apply walk-forward analysis: Re-optimize periodically as you would in real life.
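Two of the points above, next-bar execution and explicit costs, can be sketched in a small long/flat backtest loop. The function `backtest_next_bar`, the 5 bps slippage, and the $1 commission are all assumptions made for illustration; a real engine would also model partial fills, gaps on stops, and random slippage variation.

```python
def backtest_next_bar(bars, signals, slippage_bps=5.0, commission_per_trade=1.0, shares=100):
    """Long/flat backtest: signal read at bar t's close, order filled at bar t+1's open.

    bars: list of dicts with "open" and "close" prices.
    signals: target position per bar (0 or 1); signals[t] is known only after bar t closes.
    Returns net profit in currency units.
    """
    position = 0
    cash = 0.0
    slip = slippage_bps / 10_000
    for t in range(len(bars) - 1):
        target = signals[t]
        if target == position:
            continue
        next_open = bars[t + 1]["open"]
        if target == 1:
            # Buying: pay the next open plus slippage, plus commission.
            cash -= shares * next_open * (1 + slip) + commission_per_trade
        else:
            # Selling: receive the next open minus slippage, minus commission.
            cash += shares * next_open * (1 - slip) - commission_per_trade
        position = target
    if position == 1:
        # Mark any position still open at the final close.
        cash += shares * bars[-1]["close"]
    return cash


# Hypothetical three-bar price path and signals.
bars = [
    {"open": 100.0, "close": 101.0},
    {"open": 102.0, "close": 103.0},
    {"open": 104.0, "close": 105.0},
]
signals = [1, 0, 0]

net_realistic = backtest_next_bar(bars, signals)
net_frictionless = backtest_next_bar(bars, signals, slippage_bps=0.0, commission_per_trade=0.0)
print(f"With costs: {net_realistic:.2f}  Without costs: {net_frictionless:.2f}")
```

Because the loop only ever reads `bars[t + 1]["open"]` after `signals[t]`, the fill can never use information from the bar that generated the signal.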
Bridging the Gap: Backtest to Live
1. Paper Trade First
Run the strategy on live data without real money for 1-3 months. Compare forward test results with what the backtest predicted for the same period.
2. Start Small
Deploy with 10-25% of intended capital. Scale up only after confirming live results are within the expected range.
3. Track Execution Quality
Log every fill. Compare against the backtest’s assumed fills. Calculate actual vs. expected slippage.
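A minimal sketch of the comparison, assuming each logged fill records the side, the price the backtest would have assumed, and the actual fill price. The field names and the 4 bps backtest assumption are hypothetical.

```python
def slippage_bps(side, expected_price, fill_price):
    """Signed slippage in basis points; positive means the fill was worse than assumed."""
    direction = 1 if side == "buy" else -1
    return direction * (fill_price - expected_price) / expected_price * 10_000


# Hypothetical fill log.
fills = [
    {"side": "buy", "expected": 50.00, "fill": 50.03},
    {"side": "sell", "expected": 51.00, "fill": 50.98},
]
assumed_bps = 4.0  # slippage the backtest was run with (assumed)

observed = [slippage_bps(f["side"], f["expected"], f["fill"]) for f in fills]
average_observed = sum(observed) / len(observed)

print(f"Observed: {average_observed:.1f} bps, assumed: {assumed_bps:.1f} bps")
if average_observed > assumed_bps:
    print("Live slippage exceeds the backtest assumption; rerun the backtest with higher costs.")
```

Signing the slippage by side keeps buys and sells comparable: paying more on a buy and receiving less on a sell both count as positive cost.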
4. Monitor in Real-Time
Set alerts for when live performance deviates from expected ranges. Use rolling Sharpe, drawdown, and win rate as monitoring metrics.
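As one example of a monitoring metric, a rolling Sharpe ratio can be computed from daily returns with the standard library. The 63-day window, 252 trading days per year, and a zero risk-free rate are assumptions of this sketch.

```python
import math
import statistics


def rolling_sharpe(daily_returns, window=63, periods_per_year=252):
    """Annualised Sharpe ratio over each trailing window (risk-free rate assumed zero)."""
    values = []
    for end in range(window, len(daily_returns) + 1):
        chunk = daily_returns[end - window:end]
        mean = statistics.fmean(chunk)
        stdev = statistics.stdev(chunk)
        # A flat window has no volatility, so the ratio is undefined.
        values.append(math.nan if stdev == 0 else mean / stdev * math.sqrt(periods_per_year))
    return values


# Hypothetical returns with a short window, just to show the call.
sample_returns = [0.01, -0.01, 0.02, 0.0, -0.005, 0.015]
sharpe_series = rolling_sharpe(sample_returns, window=4)
print([round(s, 2) for s in sharpe_series])
```

An alert could then fire when the latest value falls below a threshold derived from the backtest's own rolling Sharpe distribution.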
5. Define Kill Criteria in Advance
Before going live, write down exactly what would cause you to stop: “If max drawdown exceeds 1.5x the backtest max drawdown, halt and review.”
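The drawdown rule quoted above is simple enough to encode so that it is checked mechanically rather than debated mid-drawdown. The function names and the equity figures below are hypothetical.

```python
def max_drawdown(equity_curve):
    """Largest peak-to-trough decline, as a positive fraction of the peak."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def should_halt(live_equity, backtest_max_drawdown, multiple=1.5):
    """True when live drawdown exceeds the pre-agreed multiple of the backtest drawdown."""
    return max_drawdown(live_equity) > multiple * backtest_max_drawdown


# Hypothetical live equity curve and a backtest max drawdown of 15%.
live_equity = [100_000, 104_000, 97_000, 99_000, 78_000]
backtest_max_drawdown = 0.15

print(f"Live max drawdown: {max_drawdown(live_equity):.0%}")
print("Halt and review:", should_halt(live_equity, backtest_max_drawdown))
```

Here the live drawdown is 25% against a threshold of 22.5%, so the rule says to halt.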
The Comparison Table
| Factor | Backtest | Live | Mitigation |
|---|---|---|---|
| Fills | Instant at signal price | Delayed, at market price | Model 1-bar delay + slippage |
| Data | Clean, complete | Gaps, errors, delays | Build error handling |
| Costs | Often zero or fixed | Variable, higher | Conservative estimates |
| Psychology | None | Significant | Automate execution |
| Capacity | Unlimited | Liquidity-constrained | Realistic position sizes |
| Market impact | None | Real | Model as function of size vs. ADV |
| Regime | Fixed historical | Unknown future | Test across regimes |
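For the market impact row, one commonly used functional form is a square-root model, in which cost grows with the square root of order size relative to ADV. The sketch below is an assumption about how such a model might look, not a calibrated one; the coefficient and the 200 bps daily volatility are placeholders.

```python
import math


def estimated_impact_bps(order_value, adv_value, daily_vol_bps, coefficient=1.0):
    """Square-root impact sketch: cost in bps scales with sqrt(order size / ADV)."""
    participation = order_value / adv_value
    return coefficient * daily_vol_bps * math.sqrt(participation)


adv_value = 500_000     # average daily traded value
daily_vol_bps = 200.0   # assumed daily volatility of the stock, in basis points

for order_value in (5_000, 125_000):
    impact = estimated_impact_bps(order_value, adv_value, daily_vol_bps)
    print(f"Order ${order_value:,}: ~{impact:.0f} bps estimated impact")
```

The coefficient would need to be fitted to your own fill data before the numbers mean anything; the point of the sketch is only that impact is not constant across order sizes.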
When to Worry
70-100% of backtest performance: Normal. Continue monitoring.
50-70%: Investigate. Check execution quality, slippage, data issues. May still be acceptable.
Below 50%: Red flag. Halt trading. Likely overfitting or a fundamental flaw.
Exceeding backtest: Suspicious. You may be in a favorable regime that will revert. Don’t increase size based on a lucky start.
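These bands can be turned into a small classifier for routine reporting. The function name and return labels are hypothetical, and the sketch assumes a positive backtest return measured over the same period as the live return.

```python
def assess_live_performance(live_return, backtest_return):
    """Classify live performance against the backtest using the bands above."""
    if backtest_return <= 0:
        raise ValueError("the bands assume a positive backtest return")
    ratio = live_return / backtest_return
    if ratio > 1.0:
        return "suspicious"   # exceeding the backtest
    if ratio >= 0.7:
        return "normal"       # 70-100%
    if ratio >= 0.5:
        return "investigate"  # 50-70%
    return "red flag"         # below 50%


for live in (0.30, 0.20, 0.15, 0.09):
    print(f"live {live:.0%} vs backtest 25%: {assess_live_performance(live, 0.25)}")
```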