Closing the Loop: How Trading Strategies Graduate from Backtest to Live Execution

Architecture and implementation of a multi-stage promotion pipeline that moves automated trading strategies from historical backtest through paper trading into live production, with gated qualification at each stage.

Martin Fournier · June 02, 2026 · 7 min read

Every algorithmic trading strategy starts as an idea. Some die in a notebook. Most fail in backtest. A few survive and show consistent edge across decades of historical data. But surviving backtest is not the same as being ready for live money.

The hard problem is the gap between those two states. A strategy that posts a 15 percent annualized return on 20 years of EUR/USD data can still lose money in two weeks of live trading. The reasons are legion: survivorship bias in historical data, regime changes the backtest never saw, latency assumptions that do not hold in production, position sizing that works on paper but breaks under margin constraints.

This post covers how Trading Bridge (the Java monorepo powering my automated forex strategies) closes that gap. The architecture is a multi-stage promotion pipeline with qualification gates at every level.

The three-stage promotion pipeline: backtest qualification through PromoteGates (minTrades, maxDrawdown, goldenBaseline, validationModule), paper trading on OANDA Practice with daily drawdown tracking and kill switch protection, then live execution on OANDA/IBKR with continuous monitoring and automatic demotion.

The Three-Environment Model

The system runs strategies across three environments, each with different data, different execution semantics, and different qualification criteria.

Stage 1: Historical Backtest

Every strategy enters through the backtest engine. The engine runs against 20+ years of 1-minute OHLC bars. The raw data originates as CSV from Dukascopy, then gets converted to binary .bars format (44 bytes per bar, memory-mapped via MappedByteBuffer for zero-copy random access). A Java BarStore handles the binary format with fixed-size records and O(1) index-based lookups. The backtest fills MARKET orders at bar.open() -- conservative, no look-ahead, no fill slippage assumptions.
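To make the fixed-record idea concrete, here is a minimal sketch of a memory-mapped bar reader. It is not the actual BarStore: the class name BarStoreSketch, the field order, and the byte order are assumptions. The only detail taken from above is the 44-byte record size, which this sketch fills with a long timestamp, four double prices, and an int volume (8 + 32 + 4 bytes).

```java
import java.io.IOException;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

/** Assumed 44-byte layout: long epochMillis, 4 x double OHLC, int volume. */
record Bar(long epochMillis, double open, double high, double low, double close, int volume) {}

final class BarStoreSketch {
    static final int RECORD_BYTES = 44; // 8 + 4 * 8 + 4

    private final MappedByteBuffer buffer;
    private final int count;

    private BarStoreSketch(MappedByteBuffer buffer, int count) {
        this.buffer = buffer;
        this.count = count;
    }

    /** Maps the whole file read-only; the mapping stays valid after the channel closes. */
    static BarStoreSketch open(Path file) throws IOException {
        try (FileChannel ch = FileChannel.open(file, StandardOpenOption.READ)) {
            long size = ch.size();
            if (size % RECORD_BYTES != 0) {
                throw new IOException("file length is not a multiple of " + RECORD_BYTES);
            }
            MappedByteBuffer buf = ch.map(FileChannel.MapMode.READ_ONLY, 0, size);
            buf.order(ByteOrder.LITTLE_ENDIAN); // byte order is an assumption
            return new BarStoreSketch(buf, (int) (size / RECORD_BYTES));
        }
    }

    int size() { return count; }

    /** O(1) lookup: the record offset is simply index * RECORD_BYTES. */
    Bar get(int index) {
        if (index < 0 || index >= count) throw new IndexOutOfBoundsException(index);
        int off = index * RECORD_BYTES;
        // Absolute gets read at a given offset without moving the buffer position.
        return new Bar(buffer.getLong(off), buffer.getDouble(off + 8), buffer.getDouble(off + 16),
                buffer.getDouble(off + 24), buffer.getDouble(off + 32), buffer.getInt(off + 40));
    }

    /** Writes bars in the same layout; stands in for the CSV-to-binary converter. */
    static void write(Path file, Bar... bars) throws IOException {
        ByteBuffer out = ByteBuffer.allocate(bars.length * RECORD_BYTES).order(ByteOrder.LITTLE_ENDIAN);
        for (Bar b : bars) {
            out.putLong(b.epochMillis()).putDouble(b.open()).putDouble(b.high())
               .putDouble(b.low()).putDouble(b.close()).putInt(b.volume());
        }
        Files.write(file, out.array());
    }
}
```

Because every record has the same size, jumping to bar 5,000,000 costs the same as reading bar 0, and the operating system pages data in only when it is touched.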

The BacktestEngine logs every RunEvent to JSONL for later analysis. The key metrics that matter at this stage:

  • Net profit, Sharpe ratio, max drawdown
  • Percent winning trades, average win/loss ratio
  • Profit factor, number of trades (statistical significance check)
  • Performance consistency across market regimes
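Several of the metrics listed above fall out of a single pass over per-trade P&L. The sketch below is illustrative only: TradeStats is a hypothetical name, drawdown is measured in account currency from a starting equity of zero, and the Sharpe ratio is left out because it needs a return series and an annualization convention that this post does not specify.

```java
import java.util.List;

/** Summary statistics over per-trade P&L, in account currency (hypothetical type). */
record TradeStats(double netProfit, double winRate, double profitFactor, double maxDrawdown) {

    static TradeStats of(List<Double> tradePnl) {
        double grossWin = 0, grossLoss = 0, equity = 0, peak = 0, maxDd = 0;
        int wins = 0;
        for (double pnl : tradePnl) {
            if (pnl > 0) { grossWin += pnl; wins++; } else { grossLoss -= pnl; }
            equity += pnl;                          // running equity curve
            peak = Math.max(peak, equity);          // highest equity seen so far
            maxDd = Math.max(maxDd, peak - equity); // deepest peak-to-trough drop
        }
        double winRate = tradePnl.isEmpty() ? 0 : (double) wins / tradePnl.size();
        // Profit factor = gross wins / gross losses; unbounded when there are no losses.
        double profitFactor = grossLoss == 0
                ? (grossWin > 0 ? Double.POSITIVE_INFINITY : 0)
                : grossWin / grossLoss;
        return new TradeStats(grossWin - grossLoss, winRate, profitFactor, maxDd);
    }
}
```

For the trade sequence 100, -50, 200, -150, 50 this gives a net profit of 150, a win rate of 0.6, a profit factor of 1.75 (350 / 200), and a maximum drawdown of 150 (equity falls from a peak of 250 to 100).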

A strategy must pass a golden baseline comparison before it qualifies for Stage 2. The baseline is a set of known-good strategies with verified performance -- if a new strategy cannot match or exceed the baseline risk-adjusted metrics, it goes back to the drawing board.

Stage 2: Paper Trading (Qualification)

This is where most systems stop -- or never implement properly. Paper trading is not a simulation of live execution. It is live execution with fake money.

The Runtime module (trading-runtime) manages this through the ControlPlane HTTP API. A strategy registered for paper trading gets the same Broker interface as live strategies, but connected to a paper endpoint. OANDA provides a practice environment that simulates fills, spreads, and overnight swaps against live market data.

The PromoteGate evaluates three things during paper trading:

  1. Execution drift: does the strategy's real-time P&L track the backtest P&L within a tolerance band? Large drift means the backtest made assumptions that do not hold in real time.
  2. Fill quality: OANDA paper fills are optimistic. The gate applies a configurable slippage penalty to paper trades to estimate worst-case live fills.
  3. Operational stability: does the strategy crash, hang, or produce invalid orders? A strategy that runs clean for 30 days in paper qualifies for consideration.
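The first two checks reduce to simple arithmetic. The sketch below shows one way they could be expressed; PaperChecks, the fixed per-trade penalty, and the fractional tolerance are all assumptions made for illustration, not the gate's actual rules.

```java
import java.util.List;

/** Hypothetical paper-stage checks: a slippage haircut and a drift tolerance band. */
final class PaperChecks {
    private PaperChecks() {}

    /** Subtracts a fixed per-trade penalty (account currency) from each paper trade's P&L. */
    static double penalizedPnl(List<Double> paperTradePnl, double slippagePerTrade) {
        double total = 0;
        for (double pnl : paperTradePnl) {
            total += pnl - slippagePerTrade;
        }
        return total;
    }

    /** True when paper P&L lies within +/- tolerance (a fraction) of the backtest projection. */
    static boolean withinDriftBand(double paperPnl, double backtestPnl, double tolerance) {
        double band = Math.abs(backtestPnl) * tolerance;
        return Math.abs(paperPnl - backtestPnl) <= band;
    }
}
```

As a worked example: paper trades of 120, -40, and 80 sum to 160, and a penalty of 5 per trade brings that to 145. Against a backtest projection of 170 for the same period, the gap of 25 sits inside a 20 percent band (34) but outside a 10 percent band (17).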

Stage 3: Live Execution

The live broker connector (OandaExecutor for OANDA, with IBKR on the roadmap) uses the exact same Broker interface. The only difference is the API endpoint and the authentication credentials.
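A rough sketch of what "same interface, different endpoint" can look like follows. The real Broker contract is not shown in this post, so the method signature, BrokerConfig, and OandaBrokerSketch are hypothetical. The host names and the orders path reflect my reading of OANDA's v20 REST documentation and should be checked against it. No network call is made here.

```java
import java.net.URI;

/** Hypothetical shape of the shared contract. */
interface Broker {
    String submitMarketOrder(String instrument, long units);
}

/** Everything that differs between paper and live: endpoint and credentials. */
record BrokerConfig(URI endpoint, String accountId, String apiToken) {
    static BrokerConfig paper(String accountId, String apiToken) {
        return new BrokerConfig(URI.create("https://api-fxpractice.oanda.com"), accountId, apiToken);
    }

    static BrokerConfig live(String accountId, String apiToken) {
        return new BrokerConfig(URI.create("https://api-fxtrade.oanda.com"), accountId, apiToken);
    }
}

/** One connector class serves both stages; only the injected config changes. */
final class OandaBrokerSketch implements Broker {
    private final BrokerConfig config;

    OandaBrokerSketch(BrokerConfig config) {
        this.config = config;
    }

    @Override
    public String submitMarketOrder(String instrument, long units) {
        // A real connector would POST an order body to this URI with the bearer token.
        // The sketch only returns a description of the request it would send.
        return config.endpoint() + "/v3/accounts/" + config.accountId() + "/orders"
                + " <- MARKET " + units + " " + instrument;
    }
}
```

The strategy only ever sees Broker, so promoting it is a matter of constructing the connector with BrokerConfig.live(...) instead of BrokerConfig.paper(...).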

Live strategies run through the same RunManager lifecycle. The runtime logs every event to the SQLite EventStore, enabling full replay and audit. If a live strategy exceeds its configured risk limits, the PromoteGate can demote it back to paper or halt it entirely.

The Close Loop Pattern

The term "Close loop" appears repeatedly in the commit history -- it is the operational cycle that moves strategies through these stages. A close loop involves:

  1. Backtest qualifying a strategy against historical data
  2. Registering the strategy in the runtime catalog via the ControlPlane API
  3. Running it in paper for a qualification period (typically 2-4 weeks)
  4. Reviewing paper results against backtest projections
  5. Promoting to live if qualification passes
  6. Monitoring live performance with automatic demotion triggers
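The cycle above can be read as a small state machine. The enum below is a sketch of that reading, not code from the repository: the stage names, the HALTED state, and the rule that a demoted paper strategy returns to backtest are assumptions.

```java
/** Hypothetical lifecycle stages and the moves a gate decision allows. */
enum Stage {
    BACKTEST, PAPER, LIVE, HALTED;

    /** One step up on a passing gate; LIVE is the top and HALTED stays halted. */
    Stage promote() {
        return switch (this) {
            case BACKTEST -> PAPER;
            case PAPER -> LIVE;
            case LIVE, HALTED -> this;
        };
    }

    /** One step down; BACKTEST and HALTED have nowhere lower to go. */
    Stage demote() {
        return switch (this) {
            case LIVE -> PAPER;
            case PAPER -> BACKTEST;
            case BACKTEST, HALTED -> this;
        };
    }

    /** A hard risk breach stops the strategy from any stage. */
    Stage halt() {
        return HALTED;
    }
}
```

Keeping promotion to a single step at a time means a strategy cannot skip paper trading, whatever its backtest looked like.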

The NFP Week deployment (June 1-5, 2026) is a recent example. The strategy targets EUR/USD specifically during Non-Farm Payroll week -- a known volatility event. It was backtested, paper-qualified, then deployed as a dedicated Docker Compose service alongside the main runtime. Event-specific strategies run in their own process space so they can have different resource profiles, logging verbosity, and shutdown policies.

Why This Architecture Matters

Most retail trading setups skip the paper stage entirely. A strategy gets backtested on clean historical data, looks profitable, and hits live trading directly. The result is almost always the same: the strategy underperforms because the backtest was overfit or the execution environment differed.

The layered promotion model forces patience. A strategy must prove itself in backtest and in paper trading before it touches real capital, and it must keep proving itself once live. Each stage filters out strategies that were lucky in backtest but fragile in real time.

The PromoteGate adds a further safety layer. Even in live trading, the system continuously monitors for anomalies. If a strategy whose worst drawdown across 20 years of backtest was 12 percent suddenly draws down 8 percent in two weeks of live trading, the gate flags it for review. If the drawdown exceeds the configured hard limit, the gate halts the strategy automatically.
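A two-threshold drawdown monitor is one simple way to get the flag-then-halt behavior. The sketch below is an assumption about how such a check could work, not the gate's implementation: DrawdownMonitor and its Action enum are hypothetical, and it looks only at drawdown depth as a fraction of peak equity, ignoring how quickly the drawdown developed.

```java
/** Hypothetical live monitor: tracks peak-to-trough drawdown as a fraction of peak equity. */
final class DrawdownMonitor {
    enum Action { OK, FLAG_FOR_REVIEW, HALT }

    private final double reviewLimit; // soft limit, e.g. 0.08
    private final double hardLimit;   // hard limit, e.g. 0.12
    private double peak = Double.NaN; // highest equity observed so far

    DrawdownMonitor(double reviewLimit, double hardLimit) {
        if (!(reviewLimit > 0 && reviewLimit < hardLimit)) {
            throw new IllegalArgumentException("need 0 < reviewLimit < hardLimit");
        }
        this.reviewLimit = reviewLimit;
        this.hardLimit = hardLimit;
    }

    /** Feed each new equity mark (assumed positive); returns the action it implies. */
    Action onEquity(double equity) {
        if (Double.isNaN(peak) || equity > peak) {
            peak = equity;
        }
        double drawdown = (peak - equity) / peak;
        if (drawdown >= hardLimit) return Action.HALT;
        if (drawdown >= reviewLimit) return Action.FLAG_FOR_REVIEW;
        return Action.OK;
    }
}
```

With limits of 0.08 and 0.12 and equity marks of 10000, 10500, 9600, and 9200, the peak is 10500. The third mark is a drawdown of about 8.6 percent and gets flagged; the fourth is about 12.4 percent and triggers a halt.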

Implementation Details

The PromoteGate reads qualification thresholds from a JSON config file in data/runtime/. Thresholds can be adjusted without recompiling. The EventStore provides the data: every run, every trade, every order is logged to SQLite with timestamps. The gate queries this store to evaluate current performance against baseline.

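public class PromoteGate {
    private final EventStore eventStore;
    private final ThresholdConfig thresholds;

    public PromoteGate(EventStore eventStore, ThresholdConfig thresholds) {
        this.eventStore = eventStore;
        this.thresholds = thresholds;
    }

    public QualificationResult evaluate(StrategyId id, RunEnvironment env) {
        var events = eventStore.getEvents(id, env);
        var stats = RunStatistics.compute(events);

        // A risk breach is checked first: it demotes regardless of sample size.
        if (stats.drawdown() > thresholds.maxDrawdown(env)) {
            return QualificationResult.DEMOTE;
        }
        // Weak risk-adjusted returns keep the strategy where it is.
        if (stats.sharpeRatio() < thresholds.minSharpe(env)) {
            return QualificationResult.HOLD;
        }
        // Good numbers on too few trades are not yet evidence.
        if (stats.totalTrades() < thresholds.minTrades(env)) {
            return QualificationResult.INSUFFICIENT_DATA;
        }
        return QualificationResult.PASS;
    }
}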

The qualification rules are deliberately asymmetric. Passing a gate requires more evidence than failing it. A strategy stays in its current environment unless it clearly demonstrates it belongs in the next one.

The Docker Compose Edge Case

Not every strategy runs in the main runtime. The NFP deployment showed a pattern worth calling out: event-specific strategies run as independent Docker services. They share the same broker API and event store schema, but they have their own lifecycle.

The rationale is operational. A strategy that trades only during a 30-minute window once a month should not compete for resources with round-the-clock strategies. If the NFP strategy crashes, it should not affect the main runtime. If it needs different logging levels or a different JVM heap configuration, it should have its own process.

The shared interface is the key. Because every broker connector implements the same Broker contract, any strategy can run in any environment -- backtest, paper, live, standalone -- without code changes. Only the configuration changes.

Closing Thoughts

The gap between backtest and live trading is where most alpha disappears. Closing it requires infrastructure, not just better strategies. The three-stage promotion model with qualification gates is not complex to implement, but it enforces the discipline that most trading systems lack.

Every new strategy in Trading Bridge goes through this pipeline. The ones that make it to live are not the ones that looked best in backtest. They are the ones that survived all three environments -- and that is a much higher bar.