Backtest vs Live Trading

Backtesting and live trading are fundamentally different environments. Backtesting runs your strategy on historical data where everything is known in advance. Live trading runs on an unknown future with real money, real emotions, and real execution challenges.

The gap between backtest and live performance is not a bug — it’s a feature of reality. Understanding this gap, its sources, and how to minimize it is essential.

Backtesting: deterministic, fast, clean data, perfect execution, no emotions.

Live trading: stochastic, real-time, messy data, imperfect execution, fear and greed.

The Performance Degradation Chain

Each factor chips away at backtest results:

Gross backtest return:  25%
- Overfitting bias:      -5%
- Survivorship bias:     -1%
- Slippage & costs:      -4%
- Data quality issues:   -1%
- Execution problems:    -2%
- Market regime change:  -3%
= Realistic live return: ~9%

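This roughly 64% degradation (16 of 25 percentage points lost) is typical when a backtest ignores these factors. For a well-constructed backtest that already accounts for them, many practitioners use a rule of thumb: expect live performance to be 50-70% of backtest performance.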
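The arithmetic behind the chain can be checked with a few lines of Python. This is a minimal sketch using only the illustrative figures listed above; the names `haircuts` and `realistic_return` are hypothetical and exist only for this example.

```python
# Illustrative figures from the breakdown above, in percentage points.
gross_backtest_return = 25.0
haircuts = {
    "overfitting_bias": 5.0,
    "survivorship_bias": 1.0,
    "slippage_and_costs": 4.0,
    "data_quality": 1.0,
    "execution_problems": 2.0,
    "regime_change": 3.0,
}


def realistic_return(gross, deductions):
    """Subtract each haircut (in percentage points) from the gross return."""
    return gross - sum(deductions.values())


live_estimate = realistic_return(gross_backtest_return, haircuts)
retained = live_estimate / gross_backtest_return  # share of the backtest return kept
degradation = 1.0 - retained                      # share of the backtest return lost

print(f"Estimated live return: {live_estimate:.1f}%")
print(f"Retained: {retained:.0%}, degradation: {degradation:.0%}")
```

Swapping in your own haircut estimates shows quickly which factor dominates the gap for a given strategy.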

Concrete Examples

Execution Differences

Backtest: Buy 1,000 shares at $50.00 (the price when the signal triggers). Instant fill, no impact.

Live: By the time the order reaches the exchange (50ms latency), the price is $50.02. Your order takes out shares at $50.02, $50.03, and $50.05 due to limited depth. Average fill: $50.03.

With 200 such trades per year (1,000 shares each) and $0.03 slippage per share, that’s $30 per trade, or $6,000 a year in hidden costs.
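The fill described above can be reproduced by walking a simple order book. The sketch below is hypothetical: the depth at each price level is an assumption chosen so that the average fill matches the $50.03 in the example, and prices are kept in integer cents to avoid floating-point noise.

```python
def walk_book(levels, quantity):
    """Fill `quantity` shares against ask levels [(price_cents, shares), ...].

    Levels must be ordered best price first. Returns the list of fills.
    """
    fills = []
    remaining = quantity
    for price_cents, available in levels:
        if remaining == 0:
            break
        take = min(available, remaining)
        fills.append((price_cents, take))
        remaining -= take
    if remaining:
        raise ValueError(f"book too thin: {remaining} shares unfilled")
    return fills


# Hypothetical ask-side depth (assumption, not market data).
asks = [(5002, 400), (5003, 400), (5005, 200)]
signal_price_cents = 5000   # the $50.00 the backtest assumed
shares = 1000
trades_per_year = 200

fills = walk_book(asks, shares)
total_cents = sum(price * qty for price, qty in fills)
avg_fill_cents = total_cents / shares
slippage_cents_per_share = avg_fill_cents - signal_price_cents
annual_cost_dollars = slippage_cents_per_share * shares * trades_per_year / 100

print(f"Average fill: ${avg_fill_cents / 100:.2f}")
print(f"Annual slippage cost: ${annual_cost_dollars:,.0f}")
```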

Data Feed Issues

Backtest: Clean, adjusted OHLC data. No gaps, no errors, no missing bars.

Live: Your data feed hiccups at 10:32 AM — one bar is missing. Your indicator miscalculates for the next 20 bars. A signal fires that shouldn’t have. You enter a losing trade.

This never appears in backtesting but happens regularly live.
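One defensive measure is to check incoming bars for gaps before updating any indicator. The following is a minimal sketch, assuming fixed-interval bars with sorted timestamps inside a single trading session; `find_missing_bars` is a hypothetical helper and the date is arbitrary.

```python
from datetime import datetime, timedelta


def find_missing_bars(timestamps, interval):
    """Return timestamps of bars expected between consecutive bars but absent.

    Assumes `timestamps` is sorted and covers one continuous session.
    """
    missing = []
    for prev, curr in zip(timestamps, timestamps[1:]):
        expected = prev + interval
        while expected < curr:
            missing.append(expected)
            expected += interval
    return missing


# One-minute bars with the 10:32 bar missing, as in the example above.
bars = [
    datetime(2024, 1, 2, 10, 30),
    datetime(2024, 1, 2, 10, 31),
    datetime(2024, 1, 2, 10, 33),
]
gaps = find_missing_bars(bars, timedelta(minutes=1))
for ts in gaps:
    print(f"Missing bar at {ts:%H:%M}")
```

When a gap is found, a live system could pause signal generation or backfill the bar before recomputing indicators; which is appropriate depends on the strategy.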

Psychological Interference

Backtest: Strategy enters a position, takes a 15% drawdown, then recovers to 25% gain. No hesitation.

Live: The same drawdown occurs. After watching $15,000 evaporate over 3 weeks, you override the system and close the position. Price recovers the next day. You locked in the $15,000 loss and missed the $25,000 gain.

Market Regime Change

Backtest: Momentum strategy optimized on 2015-2020 (trending markets, low rates). Sharpe: 1.8.

Live: Deployed in 2022 (choppy markets, rising rates). Momentum signals whipsaw. Sharpe: 0.3. The strategy isn’t broken — the regime changed.

Capacity Constraints

Backtest: Small-cap strategy tested with $100K produces 40% annual returns.

Live: With $5M, you are trading stocks that turn over only $500K/day. Your orders move the market, and positions take days to build. Returns drop to 12%.
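A rough way to see the capacity problem is to estimate how many days it takes to build a position without exceeding a chosen share of daily volume. In the sketch below, the 10% participation cap and the split into 20 equal positions are assumptions made for illustration; only the $100K, $5M, and $500K/day figures come from the example.

```python
import math


def days_to_build(position_dollars, adv_dollars, max_participation):
    """Trading days needed to build a position at a capped share of daily volume."""
    if adv_dollars <= 0 or not 0 < max_participation <= 1:
        raise ValueError("need positive ADV and participation in (0, 1]")
    per_day = adv_dollars * max_participation
    return math.ceil(position_dollars / per_day)


adv = 500_000            # average daily dollar volume from the example
participation = 0.10     # assumed cap: trade at most 10% of daily volume
positions = 20           # assumed number of equal-weight positions

for capital in (100_000, 5_000_000):
    per_position = capital / positions
    days = days_to_build(per_position, adv, participation)
    print(f"${capital:,} account: ${per_position:,.0f} per position, {days} day(s) to build")
```

Under these assumptions the small account fills each position within a day, while the large account needs about a week per position, during which the signal may decay.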

Building Realistic Backtests

  1. Include all transaction costs: Commissions, slippage, spread, market impact. Use conservative estimates — if in doubt, double them.

  2. Use survivorship-bias-free data: Include delisted securities and point-in-time index constituents.

  3. Eliminate look-ahead bias: No future information in decisions. Use point-in-time data for fundamentals.

  4. Test across regimes: Include bull, bear, high/low volatility, trending and ranging periods. If your data doesn’t include 2008 and 2020, it’s incomplete.

  5. Model realistic execution: Use next-bar execution (signal on close, execute on next open). Add random slippage variation. Assume realistic fill rates for limit orders. Account for gaps on stops.

  6. Apply walk-forward analysis: Re-optimize periodically as you would in real life.
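Items 1 and 5 above can be sketched in a toy long-only backtest loop. This is an illustration under stated assumptions, not a complete engine: the bar format, the fixed slippage in basis points, and the flat commission are all hypothetical, and a position still open on the last bar is simply ignored.

```python
def backtest_next_bar(bars, signals, shares=100, slippage_bps=5.0, commission=1.0):
    """Toy long-only backtest with next-bar execution.

    signals[i] is the desired position (0 or 1) decided at the close of bar i.
    The order is filled at the open of bar i + 1, adjusted for slippage.
    Returns realized profit and loss in dollars.
    """
    pnl = 0.0
    position = 0
    entry_price = 0.0
    slip = slippage_bps / 10_000
    for i in range(len(bars) - 1):
        target = signals[i]
        if target == position:
            continue
        next_open = bars[i + 1]["open"]
        if target == 1:
            entry_price = next_open * (1 + slip)   # pay up when buying
        else:
            exit_price = next_open * (1 - slip)    # receive less when selling
            pnl += (exit_price - entry_price) * shares
        pnl -= commission
        position = target
    return pnl


# Hypothetical bars and signals, for illustration only.
bars = [
    {"open": 100, "close": 101},
    {"open": 102, "close": 103},
    {"open": 104, "close": 105},
    {"open": 103, "close": 104},
]
signals = [1, 1, 0, 0]

frictionless = backtest_next_bar(bars, signals, slippage_bps=0.0, commission=0.0)
realistic = backtest_next_bar(bars, signals)
print(f"Frictionless: ${frictionless:.2f}, with costs: ${realistic:.2f}")
```

Running both versions side by side makes the cost drag explicit. A fuller model would randomize the slippage and handle gaps through stop levels, as the list suggests.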

Bridging the Gap: Backtest to Live

1. Paper Trade First

Run the strategy on live data without real money for 1-3 months. Compare forward test results with what the backtest predicted for the same period.

2. Start Small

Deploy with 10-25% of intended capital. Scale up only after confirming live results are within the expected range.

3. Track Execution Quality

Log every fill. Compare against the backtest’s assumed fills. Calculate actual vs. expected slippage.
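A fill log of this kind can be reduced to a single slippage number per trade. The sketch below assumes each log entry records the side, the price the backtest assumed, and the actual fill; the field names, sample entries, and the 5 bps backtest assumption are hypothetical.

```python
def slippage_bps(side, assumed_price, fill_price):
    """Signed slippage in basis points; positive means worse than assumed."""
    direction = 1 if side == "buy" else -1
    return direction * (fill_price - assumed_price) / assumed_price * 10_000


# Hypothetical fill log entries.
fill_log = [
    {"side": "buy", "assumed": 50.00, "fill": 50.03},
    {"side": "sell", "assumed": 51.00, "fill": 50.98},
]
backtest_assumed_bps = 5.0  # slippage the backtest modeled (assumption)

per_fill = [slippage_bps(f["side"], f["assumed"], f["fill"]) for f in fill_log]
average_bps = sum(per_fill) / len(per_fill)

print(f"Average live slippage: {average_bps:.2f} bps "
      f"(backtest assumed {backtest_assumed_bps:.2f} bps)")
```

If the live average sits consistently above the backtest assumption, the backtest's cost model is the first thing to revise.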

4. Monitor in Real-Time

Set alerts for when live performance deviates from expected ranges. Use rolling Sharpe, drawdown, and win rate as monitoring metrics.
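The three monitoring metrics can be computed with the standard library alone. This is a minimal sketch, assuming daily returns, a zero risk-free rate, and 252 trading days per year; the function names and sample numbers are illustrative only.

```python
import math
import statistics


def rolling_sharpe(returns, window, periods_per_year=252):
    """Annualized Sharpe ratio over each trailing window (risk-free rate assumed 0)."""
    values = []
    for end in range(window, len(returns) + 1):
        chunk = returns[end - window:end]
        mean = statistics.fmean(chunk)
        stdev = statistics.stdev(chunk)
        if stdev > 0:
            values.append(mean / stdev * math.sqrt(periods_per_year))
        else:
            values.append(float("nan"))
    return values


def max_drawdown(equity):
    """Largest peak-to-trough decline of an equity curve, as a fraction of the peak."""
    peak = equity[0]
    worst = 0.0
    for value in equity:
        peak = max(peak, value)
        worst = max(worst, (peak - value) / peak)
    return worst


def win_rate(trade_pnls):
    """Fraction of trades with positive profit."""
    return sum(1 for pnl in trade_pnls if pnl > 0) / len(trade_pnls)


# Hypothetical sample data.
daily_returns = [0.01, -0.005, 0.002, 0.007, -0.012, 0.004]
equity_curve = [100, 120, 90, 130]
trade_pnls = [150.0, -80.0, 40.0, -20.0]

print(rolling_sharpe(daily_returns, window=3))
print(f"Max drawdown: {max_drawdown(equity_curve):.0%}")
print(f"Win rate: {win_rate(trade_pnls):.0%}")
```

An alert would then compare each value with the range observed in the backtest and fire when the live value leaves that range.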

5. Define Kill Criteria in Advance

Before going live, write down exactly what would cause you to stop: “If max drawdown exceeds 1.5x the backtest max drawdown, halt and review.”
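Writing the rule as code removes any room for renegotiation during a drawdown. The sketch below encodes only the quoted example rule; `should_halt` is a hypothetical name, and the 15% backtest drawdown is borrowed from the earlier psychology example as a placeholder.

```python
def should_halt(live_max_drawdown, backtest_max_drawdown, multiple=1.5):
    """True when the live drawdown exceeds `multiple` times the backtest maximum."""
    return live_max_drawdown > multiple * backtest_max_drawdown


backtest_max_dd = 0.15  # placeholder: 15% maximum drawdown in the backtest

for live_dd in (0.10, 0.20, 0.25):
    action = "HALT and review" if should_halt(live_dd, backtest_max_dd) else "continue"
    print(f"Live drawdown {live_dd:.0%}: {action}")
```

With these numbers the halt threshold is 22.5%, so a 20% drawdown continues and a 25% drawdown stops trading.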

The Comparison Table

Factor        | Backtest                | Live                     | Mitigation
Fills         | Instant at signal price | Delayed, at market price | Model 1-bar delay + slippage
Data          | Clean, complete         | Gaps, errors, delays     | Build error handling
Costs         | Often zero or fixed     | Variable, higher         | Conservative estimates
Psychology    | None                    | Significant              | Automate execution
Capacity      | Unlimited               | Liquidity-constrained    | Realistic position sizes
Market impact | None                    | Real                     | Model as function of size vs. ADV
Regime        | Fixed historical        | Unknown future           | Test across regimes
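For the market impact row, one commonly used stylized form scales impact with the square root of order size relative to ADV. The sketch below uses that form purely as an assumption. The source does not specify a model, and the coefficient and daily volatility are hypothetical placeholders that would need to be calibrated against your own fills.

```python
import math


def estimated_impact_bps(order_dollars, adv_dollars, daily_vol_bps, coefficient=1.0):
    """Stylized square-root impact estimate in basis points.

    impact = coefficient * daily volatility * sqrt(order size / ADV)
    """
    if adv_dollars <= 0:
        raise ValueError("ADV must be positive")
    participation = order_dollars / adv_dollars
    return coefficient * daily_vol_bps * math.sqrt(participation)


adv = 500_000          # average daily dollar volume
daily_vol_bps = 200.0  # assumed 2% daily volatility

for order in (5_000, 50_000, 250_000):
    impact = estimated_impact_bps(order, adv, daily_vol_bps)
    print(f"${order:,} order ({order / adv:.0%} of ADV): ~{impact:.0f} bps")
```

The useful property of this form is that impact grows more slowly than size: quadrupling the order roughly doubles the estimated cost per share.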

When to Worry

70-100% of backtest performance: Normal. Continue monitoring.

50-70%: Investigate. Check execution quality, slippage, data issues. May still be acceptable.

Below 50%: Red flag. Halt trading. Likely overfitting or a fundamental flaw.

Exceeding backtest: Suspicious. You may be in a favorable regime that will revert. Don’t increase size based on a lucky start.
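These thresholds translate directly into a small classifier. The sketch below assumes both returns are measured over the same period and that the backtest return is positive; how the exact boundaries at 50% and 70% are assigned is an assumption, since the bands above share their endpoints.

```python
def assess_live_performance(live_return, backtest_return):
    """Map the live/backtest return ratio to the bands described above."""
    if backtest_return <= 0:
        raise ValueError("ratio is only meaningful for a positive backtest return")
    ratio = live_return / backtest_return
    if ratio > 1.0:
        return "suspicious"    # exceeding backtest: possibly a favorable regime
    if ratio >= 0.7:
        return "normal"        # 70-100%: continue monitoring
    if ratio >= 0.5:
        return "investigate"   # 50-70%: check execution, slippage, data
    return "halt"              # below 50%: red flag


for live in (30.0, 20.0, 15.0, 9.0):
    print(f"Live {live:.0f}% vs backtest 25%: {assess_live_performance(live, 25.0)}")
```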
