
Stealing HFT’s mean reversion playbook for your Polymarket bot.

Why the textbook version of mean reversion isn’t how the pros run it, and how to translate the real version into a working Polymarket strategy - with the math, the code, an out-of-sample validation pipeline, and a worked example on the BTC Up/Down 15-min market.

By PR&R Research · ~10 min read · Stack: Python · Updated April 18, 2026

If you Google “mean reversion strategy,” you’ll get a thousand variations of the same advice: “When price is two standard deviations below the moving average, buy. When it’s above, sell.” That’s not how the pros do it. In an HFT shop, mean reversion isn’t about Bollinger Bands - it’s about studying price movements themselves and betting that recent moves get unwound.

The same logic applies brilliantly to Polymarket. Prediction market prices are bounded between 0 and 1, news shocks cause overreactions, and retail traders pile in late on every poll release. That’s a mean reversion playground - if you build the bot the right way.


The core idea

If a price moved down today, bet it goes up tomorrow. If it moved up, bet it goes down. That’s it. No moving averages, no indicators. Just: what goes up must come down.

The trick is proving, statistically, that this pattern actually exists in your data and persists into the future. Most “strategies” overfit a pattern that’s already evaporated by the time you deploy. We use real out-of-sample validation to avoid that.

Figure: Mean reversion, visualized. Price oscillates around its mean between t and t + n.

Every spike above the mean tends to revert below; every dip tends to revert above. Mean reversion is the statistical bet that this oscillation continues.


Step 01: Get the data

The technique

Pull daily OHLC (Open, High, Low, Close) data for the asset you want to study. Each row is one bar: date, symbol, duration, open, close, high, low. For a daily strategy on a liquid asset like Bitcoin Cash, a few years of history is plenty.
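A minimal sketch of this step, assuming the bars arrive as a CSV with the columns listed above. The inline sample rows, the BCHUSD symbol string, and the load_bars helper are made up for illustration; swap in whatever export your data source provides.

```python
import io
import pandas as pd

# Hypothetical sample in the row layout described above (not real prices).
SAMPLE_CSV = """date,symbol,duration,open,close,high,low
2025-08-03,BCHUSD,1d,101.0,103.0,104.0,100.5
2025-08-01,BCHUSD,1d,100.0,102.0,103.0,99.0
2025-08-02,BCHUSD,1d,102.0,101.0,102.5,100.0
"""

def load_bars(csv_source) -> pd.DataFrame:
    """Load OHLC bars, sort them oldest-first, and sanity-check each row."""
    bars = pd.read_csv(csv_source, parse_dates=["date"])
    bars = bars.sort_values("date").reset_index(drop=True)
    # A well-formed bar has low <= min(open, close), high >= max(open, close),
    # and strictly positive prices (needed later for log returns).
    ok = (
        (bars["low"] <= bars[["open", "close"]].min(axis=1))
        & (bars["high"] >= bars[["open", "close"]].max(axis=1))
        & (bars["low"] > 0)
    )
    if not ok.all():
        raise ValueError(f"{int((~ok).sum())} malformed bars")
    return bars

bars = load_bars(io.StringIO(SAMPLE_CSV))
```

Sorting oldest-first up front matters because every later step (lags, the time split) assumes row order is time order.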

One thing worth flagging early: Polymarket prices are probabilities (0 to 1), not unbounded asset prices. That actually helps mean reversion. A market at 0.85 has at most 0.15 of upside left, so a run-up cannot extend indefinitely and an overshoot has more room to retrace than to continue.


Step 02: Convert prices to log returns

The technique

This is the most important conceptual move in the whole approach. Stop looking at prices. Start looking at price movements. Specifically, log returns:

log_return = log(today's close / yesterday's close)

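Why log returns? Two reasons. First, they’re additive: sum them over any stretch of bars and you get the log return for the whole stretch, and exp(sum) - 1 converts that back to the compound return. Second, they’re symmetric. A +5% log return and a -5% log return cancel out exactly, whereas a +5% simple return followed by a -5% simple return leaves you 0.25% down. That makes the math clean.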

Figure: Price vs. log return, the same data through two lenses. Price panel: trends are visible, but the series is noisy and unbounded. Log-return panel: centered on 0; stationary, symmetric, additive. This is what we model.

Same series, two views. Price drifts and trends; log returns oscillate around zero. Statistical models work on the right one. Mean reversion lives in movements, not levels.
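To make the two properties concrete, here is a small sketch using only the standard library. The closes are made-up numbers and log_returns is an illustrative helper name, not something from the original notebook.

```python
import math

def log_returns(closes):
    """Log return of each bar relative to the previous bar's close."""
    return [math.log(curr / prev) for prev, curr in zip(closes, closes[1:])]

closes = [100.0, 105.0, 100.0, 110.0]   # made-up closes
rets = log_returns(closes)

# Additive: per-bar log returns sum to the log return over the whole span.
total = sum(rets)
whole_span = math.log(closes[-1] / closes[0])

# Symmetric: the move 100 -> 105 and the move 105 -> 100 cancel out.
round_trip = rets[0] + rets[1]

# Back to a simple return: exp(sum) - 1, here the 100 -> 110 move.
simple_total = math.exp(total) - 1
```

Four closes give three returns: the first bar has no previous close to compare against, which is why the first row of a log-return column is always empty.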


Step 03: Add the lag (autoregression)

The technique

Create a new column called close_log_return_lag_1, yesterday’s log return, sitting next to today’s. Now every row in the dataset says: “yesterday moved this much, today moved this much.”

This is autoregression, using a previous price movement to predict the next one. It’s the foundation of the whole strategy.

The shift, visualized: rows of data after the lag

DATE        CLOSE_LOG_RETURN  CLOSE_LOG_RETURN_LAG_1
2025-08-01  +0.42%            NaN
2025-08-02  -0.18%            +0.42%
2025-08-03  +0.55%            -0.18%
2025-08-04  -0.31%            +0.55%
2025-08-05  +0.27%            -0.31%

The lag column holds yesterday’s value on today’s row. Now you can ask: does today depend on yesterday?

The shift is just a one-row offset. Now every row pairs today’s return with yesterday’s, which is the whole input the strategy needs.
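In pandas the one-row offset is a single shift(1) call. The sketch below assumes a DataFrame with a close column; the closes are made up, and model_df is just an illustrative name for the rows that survive once the empty lags are dropped.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2025-07-31", periods=6, freq="D"),
    "close": [100.0, 100.42, 100.24, 100.79, 100.48, 100.75],  # made-up
})

# Today's log return: log(today's close / yesterday's close).
df["close_log_return"] = np.log(df["close"] / df["close"].shift(1))

# Yesterday's log return, moved down one row so it sits on today's row.
df["close_log_return_lag_1"] = df["close_log_return"].shift(1)

# The first row has no return and the second has no lag; drop both.
model_df = df.dropna().reset_index(drop=True)
```

Note that shift(1) only ever moves past values forward in time, so the lag column never contains information from the future.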


Step 04: Encode the direction

The technique

Reduce each lagged return to a simple sign, +1 if it went up, -1 if it went down. Throw away the magnitude on purpose. This lets you group the data into two clean buckets: “previous bar was up” vs “previous bar was down.”

direction = +1 if lag > 0 else -1
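A quick sketch of the encoding on a handful of made-up lagged returns. One detail the one-liner above implies: a lag of exactly zero (a flat bar) lands in the -1 bucket. Whether to keep that or drop flat bars is a judgment call this sketch leaves alone.

```python
def encode_direction(lag):
    """+1 if the previous bar rose, -1 otherwise (flat bars count as -1)."""
    return 1 if lag > 0 else -1

lagged = [0.0042, -0.0018, 0.0, 0.0055]      # made-up lagged log returns
direction = [encode_direction(x) for x in lagged]
```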

Step 05: Study the price movements

The technique

This is where the mean reversion either shows up or it doesn’t. Group every row by direction (was the previous bar up or down?) and compute three numbers per bucket:

  • Sum of today’s log returns within each bucket
  • Mean of today’s log returns within each bucket
  • Count (how many bars fell in each bucket)

On Bitcoin Cash daily data from 2022 onward, the result is clean:

  • When the previous bar was down, today’s average return is positive.
  • When the previous bar was up, today’s average return is negative.

Today’s mean return, by previous bar’s direction (BCH daily, 2022-2025)

PREV BAR: DOWN   +0.42% (735 bars)   today bounces back up
PREV BAR: UP     -0.31%              today drifts back down

Both buckets show a positive expected value when traded in the reversion direction: buy after a down bar, sell after an up bar. The mean of each bucket is your expected value (EV) per trade. The edge is small (less than half a percent per trade), and at this point it is only an in-sample observation. Step 06 checks whether it holds on data the analysis has not seen.
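The grouping can be sketched with a pandas groupby. The return series here is tiny and made up, so the numbers only show the mechanics; bucket_stats is an illustrative helper name.

```python
import numpy as np
import pandas as pd

def bucket_stats(log_returns: pd.Series) -> pd.DataFrame:
    """Sum, mean, and count of today's return, grouped by yesterday's direction."""
    df = pd.DataFrame({"close_log_return": log_returns})
    df["close_log_return_lag_1"] = df["close_log_return"].shift(1)
    df = df.dropna().copy()
    df["direction"] = np.where(df["close_log_return_lag_1"] > 0, 1, -1)
    return df.groupby("direction")["close_log_return"].agg(["sum", "mean", "count"])

# Made-up daily log returns.
stats = bucket_stats(
    pd.Series([0.01, -0.02, 0.015, -0.005, 0.02, -0.01, -0.005, 0.01])
)
# stats is indexed by direction: row -1 = "previous bar down", row +1 = "previous bar up".
```

A reversion pattern shows up as a positive mean in the -1 row and a negative mean in the +1 row. The count column tells you how much data sits behind each mean, which is worth reading before trusting either number.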


Step 06: Out-of-sample validation

The technique

This is the single most important step, and the one most retail “quants” skip. Split the data 75/25 by time. The oldest 75% is “in-sample,” the newest 25% is “out-of-sample.” Run the same analysis on each chunk separately.

If the mean reversion pattern shows up in both the old data and the recent data, it’s probably real. If it shows up only in old data, the pattern is dead and you’ll lose money trading it.

Financial data is non-stationary. Patterns shift. Think FTX collapsing overnight: Bitcoin’s return distribution changed dramatically in a single day. A pattern that worked from 2020-2022 might be gone by 2024.

Time-based 75/25 split: the pattern must hold in BOTH segments

SEGMENT                        prev↓ → today   prev↑ → today   RESULT
IN-SAMPLE · 75% (oldest)       +0.41%          -0.30%          ✓ pattern present
OUT-OF-SAMPLE · 25% (newest)   +0.43%          -0.34%          ✓ pattern still present

Timeline: 2022 → split → today.

Run the bucket analysis on the older 75% of bars and the newer 25% independently, then compare the two rows bucket by bucket.
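A sketch of the split and the pass/fail check. It assumes a DataFrame with a date column; time_split and edge_survives are illustrative names, and the bucket means fed to the check are the ones quoted in this section.

```python
import pandas as pd

def time_split(df: pd.DataFrame, in_sample_frac: float = 0.75):
    """Split rows by time, oldest first, with no shuffling."""
    df = df.sort_values("date")
    cut = int(len(df) * in_sample_frac)
    return df.iloc[:cut], df.iloc[cut:]

def edge_survives(in_means: dict, out_means: dict) -> bool:
    """True when both buckets keep the reversion sign in both segments.

    Each dict maps direction (-1 = previous bar down, +1 = previous bar up)
    to the mean of today's log return in that bucket.
    """
    return all(m[-1] > 0 and m[1] < 0 for m in (in_means, out_means))

# Placeholder frame just to show where the cut falls.
bars = pd.DataFrame({"date": pd.date_range("2022-01-01", periods=8, freq="D")})
in_sample, out_of_sample = time_split(bars)

verdict = edge_survives({-1: 0.0041, 1: -0.0030}, {-1: 0.0043, 1: -0.0034})
```

The split is by position in time, never random: a shuffled split would leak recent bars into the in-sample set and hide exactly the regime change this step is meant to catch.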


Step 07: Generate the signal and trade

The technique

The signal is dead simple. Flip the sign of the previous return:

signal = -1 * direction(lag_1)

If yesterday went down (direction = -1), signal = +1 (bet it goes up). If yesterday went up, signal = -1 (bet it goes down). Then:

trade_log_return = signal * close_log_return

This gives you the realized return of each trade. Sum them up cumulatively and you have your equity curve.
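The two formulas above, applied to a short made-up return series with NumPy. Costs are ignored here, as they are in the formulas themselves.

```python
import numpy as np

# Made-up daily log returns.
close_log_return = np.array([0.01, -0.02, 0.015, -0.005, 0.02, -0.01, -0.005, 0.01])

lag_1 = close_log_return[:-1]    # yesterday's return, aligned with...
today = close_log_return[1:]     # ...today's return

direction = np.where(lag_1 > 0, 1, -1)
signal = -direction                       # bet against yesterday's move
trade_log_return = signal * today         # realized log return per trade
equity_curve = np.cumsum(trade_log_return)  # cumulative log return
```

Each signal is built only from the previous bar, so the backtest never uses a return it could not have known at entry.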

Figure: Cumulative log return of the daily reversion signal, 2022-2026 (y-axis 0.0 to 3.0), reaching ~21x over the period.

A 52% win rate turns into a 21x equity curve only because every winning trade increases the next position size. This is what compounding a tiny edge looks like over four years.


Step 08: Evaluate the strategy

Three numbers matter, in this order.

Win rate

On the Bitcoin Cash example, this strategy wins only 52% of trades. That’s it. People obsess over win rate and miss the point. What matters is that your average trade is positive (positive EV). A 49% win-rate strategy with big wins and small losses crushes a 70% win-rate strategy with small wins and huge losses.
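A toy comparison makes the point. Both trade lists below are invented to match the shape described above (49% winners with larger wins, 70% winners with larger losses); they are not results from any backtest.

```python
def win_rate(trades):
    """Fraction of trades with a positive return."""
    return sum(t > 0 for t in trades) / len(trades)

def expected_value(trades):
    """Average return per trade."""
    return sum(trades) / len(trades)

# 49 wins of +2%, 51 losses of -1%.
low_win_rate = [0.02] * 49 + [-0.01] * 51
# 70 wins of +1%, 30 losses of -3%.
high_win_rate = [0.01] * 70 + [-0.03] * 30

ev_low = expected_value(low_win_rate)     # positive despite losing more often
ev_high = expected_value(high_win_rate)   # negative despite winning more often
```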

Total compound return

Convert log returns back to normal returns:

total_return = exp(sum(trade_log_returns)) - 1

On the Bitcoin Cash example, this works out to ~21x over the period. Log returns naturally model compounding: every winning trade increases your next position size, every loss decreases it.
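A short sketch of the conversion, with made-up trade returns. The last line assumes "21x" means the final equity multiple, exp(sum); under that reading the cumulative log return behind it is about 3.04.

```python
import math

def total_return(trade_log_returns):
    """Compound simple return implied by a list of per-trade log returns."""
    return math.exp(sum(trade_log_returns)) - 1

trades = [0.02, -0.01, 0.015]          # made-up per-trade log returns
compound = total_return(trades)

# Same answer the long way: chain each trade's growth factor exp(r).
growth = 1.0
for r in trades:
    growth *= math.exp(r)

# Cumulative log return needed for a 21x final equity multiple.
log_return_for_21x = math.log(21)
```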

Annualized Sharpe ratio

Risk-adjusted return:

sharpe = (mean_trade_return / std_trade_return) * sqrt(N)

Where N is the number of bars per year (365 for daily crypto, 252 for daily equities, way higher for hourly bars). Higher Sharpe = smoother equity curve = safer to use leverage.
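The formula as a small standard-library function. The returns are made up, and the sketch uses the sample standard deviation and no risk-free rate, matching the formula above.

```python
import math
import statistics

def annualized_sharpe(trade_returns, bars_per_year):
    """(mean / sample std) of per-bar returns, scaled by sqrt(bars per year)."""
    mean = statistics.fmean(trade_returns)
    std = statistics.stdev(trade_returns)
    return mean / std * math.sqrt(bars_per_year)

returns = [0.01, -0.005, 0.02, -0.01, 0.005]   # made-up per-bar trade returns

sharpe_daily_crypto = annualized_sharpe(returns, 365)
sharpe_daily_equities = annualized_sharpe(returns, 252)
sharpe_15_min = annualized_sharpe(returns, 96 * 365)   # 35,040 bars per year
```

The sqrt(N) scaling assumes per-bar returns are roughly independent. A strategy built on the idea that consecutive bars are negatively correlated stretches that assumption, so treat the annualized figure as a rough comparison tool.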


Worked example: 15-min BTC Up/Down bot

Everything above, applied end-to-end to one specific Polymarket market series. This is the worked example promised in the introduction; the steps so far used Bitcoin Cash daily bars only to illustrate the method.

The market

Polymarket lists a fresh “Will BTC be up in the next 15 minutes?” market every 15 minutes. The YES contract pays $1 if BTC is up at the next 15-min UTC boundary versus the previous one. Otherwise the NO side pays $1. New market, fresh book, every 15 minutes, all day, every day.

Figure: Anatomy of a 15-min BTC Up/Down market. YES price over a single market lifetime, from open (~0.50) through +7 min to resolution at +15 min; y-axis from 0.00 (NO wins) to 1.00 (YES wins). Annotations: overshoot up, retracement, YES wins.

Single 15-minute market. Price drifts on news, retail piles in late, and overshoots get unwound minute-by-minute. The reversion edge lives inside this oscillation: 96 bars/day × 90 days is 8,640 reversion opportunities to harvest.

Bot loop

Once the per-bucket EV is validated and clears spread cost, the live loop looks like this:

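import math

# Runs once per bar close, at UTC :00, :15, :30, :45.
# Every market helper called here (last_trade_price, exit_at_market,
# subscribe_to_next_market, mid_price, place_marketable_limit, log_trade)
# is a placeholder for your own exchange client.
def on_bar_close(state, modeled_ev, spread_cost, buffer):
    # 1. close the previous bar
    p_close = last_trade_price(state.active_market)
    log_ret = math.log(p_close / state.prev_close)
    state.prev_close = p_close

    # 2. close any open position; record realized PnL
    if state.open_position:
        exit_at_market()
        state.open_position = None

    # 3. open the next market
    state.active_market = subscribe_to_next_market()
    p_open = mid_price(state.active_market)

    # 4. compute the signal
    direction = 1 if log_ret > 0 else -1
    signal = -direction              # mean reversion: bet against last move

    # 5. check edge clears costs: use the EV in the traded direction rather
    #    than abs(), so a bucket whose sign has flipped is skipped
    if signal * modeled_ev[direction] < spread_cost + buffer:
        return None

    # 6. enter
    side = "YES" if signal == 1 else "NO"
    size = 0.02 * state.capital      # 2% of bot capital
    state.open_position = place_marketable_limit(
        state.active_market, side, size, slippage_ticks=1
    )

    # 7. update stats & (weekly) re-run validation
    log_trade(...)
    return state.open_position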
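The loop fires on quarter-hour UTC boundaries. One way to find the next one, sketched with the standard library only; next_boundary is an illustrative helper, and a real bot would sleep until the returned time and then run its bar-close logic.

```python
from datetime import datetime, timedelta, timezone

BAR = timedelta(minutes=15)

def next_boundary(now: datetime) -> datetime:
    """First UTC :00/:15/:30/:45 mark strictly after `now` (must be tz-aware)."""
    now = now.astimezone(timezone.utc)
    floored = now.replace(
        minute=now.minute - now.minute % 15, second=0, microsecond=0
    )
    return floored + BAR

mid_bar = datetime(2026, 4, 18, 12, 7, 30, tzinfo=timezone.utc)
wake_at = next_boundary(mid_bar)                 # 12:15:00 UTC
seconds_to_sleep = (wake_at - mid_bar).total_seconds()
```

Computing the boundary from the clock on every cycle, instead of sleeping a fixed 900 seconds, keeps the loop from drifting as each iteration takes a little time to run.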

Expected per-bar economics

One trade, end-to-end (illustrative): $1,000 size, BTC Up/Down 15-min

RAW EDGE       +$4.50  (+0.45%)
SPREAD         -$1.50
GAS            -$0.10
SLIPPAGE       -$0.90
NET PER BAR    +$2.00  (+0.20%)

Each 15-min bar throws off ~$2 net on a $1,000 trade. Tiny per bar. But you get ~96 bars/day, traded daily for years, with compounding. That’s the math behind the equity curve.
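The per-trade arithmetic as a small function, using the illustrative figures from this section. net_edge is a hypothetical helper name; the cost inputs are dollar amounts per trade.

```python
def net_edge(size, raw_edge_pct, spread, gas, slippage):
    """Return (net dollars, net as a fraction of size) for one trade."""
    gross = size * raw_edge_pct
    net = gross - spread - gas - slippage
    return net, net / size

# $1,000 trade, 0.45% raw edge, costs in dollars.
net_dollars, net_pct = net_edge(
    size=1_000, raw_edge_pct=0.0045, spread=1.50, gas=0.10, slippage=0.90
)
```

Costs eat more than half of the raw edge in this example, which is why the bot checks the edge against costs before every entry.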

Annualized math

Pushing the per-bar net through 96 bars/day, 365 days, with 2% capital sizing per trade (and assuming every bar is traded and every cost, including gas, scales with trade size):

96 bars/day × 365 days       = 35,040 trades/year
2% sizing × $10,000 capital  = $200 per trade
$2 net edge per $1,000       = $0.40 net per trade
0.40 × 35,040                = ~$14,000/year on $10K capital, before recompounding
With reinvestment (Kelly-ish): equity curve climbs ~3-5x per year, if the edge holds at the modeled level
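The same arithmetic in Python, plus the compounding case. It assumes the illustrative 0.20% net edge holds on every bar and that position size is re-set to 2% of current capital before each trade; both are simplifications.

```python
trades_per_year = 96 * 365                 # 35,040
capital = 10_000
sizing = 0.02                              # 2% of capital per trade
net_edge_pct = 0.002                       # $2 net per $1,000

trade_size = sizing * capital              # $200
net_per_trade = trade_size * net_edge_pct  # $0.40
simple_profit = net_per_trade * trades_per_year   # about $14,000, no reinvestment

# With reinvestment, each trade grows capital by sizing * net_edge_pct.
growth_per_trade = 1 + sizing * net_edge_pct       # 1.00004
compounded_multiple = growth_per_trade ** trades_per_year
```

Under these assumptions the compounded multiple comes out a little above 4x, inside the 3-5x band quoted above. It is very sensitive to the net edge: halve net_edge_pct and the multiple drops to roughly 2x.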

That’s not “$200/year retail,” and it’s not “$25M HFT desk” either. It’s the boring middle: a small, validated edge that pays because compute is cheap and the bot trades 35,000 times a year.

Reality check

These numbers are illustrative, not a guarantee. Real performance depends on (a) whether the reversion edge is currently alive on this market, (b) how tight your execution actually is, and (c) regime stability. Run the validation pipeline before you trust any estimate. The bot’s whole job is to keep checking.


The real lesson

The Bitcoin Cash example wins 52% of its trades and 21x’s the capital because of one thing: a tiny statistical edge, traded frequently, with compounding. It’s not magic, it’s not deep learning, it’s not even a particularly sophisticated model. It’s careful data analysis, honest validation, and disciplined execution.

That’s the model to copy for Polymarket. Don’t reach for a neural network. Find a signal with a clean statistical edge, validate it survives out-of-sample, account for fees and slippage, and let compounding do the work. Re-validate constantly, because prediction markets shift faster than crypto.

The bot doesn’t need to be smart. It needs to be honest about its edge.

Building this in production?

The Discord #research-mean-reversion channel has the full notebook, the dataset, and members actively running this on live markets.
