The SPY Put Spread Matrix - 18 Million Spreads, 8 Years, Theoretical vs Realized | FlashAlpha

Every put credit spread permutation on SPY from 2018-04-16 to 2026-04-02 priced with SVI+Black-Scholes, then joined with actual spot at expiry. 18.3 million rows. Average theoretical EV is minus $0.48. Average realized P&L is plus $0.58. The gap is the VRP plus SPY drift that flat-vol pricing cannot see.

Tomasz Dobrowolski Quant Engineer
Apr 22, 2026
39 min read
Options Backtesting VRP PutSpreads SPY Quant HistoricalData

When a blog post claims "sell 30-delta puts at 45 DTE with a 50% profit take," you have no way to know whether that rule is near-optimal, arbitrary, or picked because it happens to look good in the author's sample. The only cure for cherry-picking is to not pick - compute the full surface and show where the edge actually lives.

That is what the SPY Put Spread Matrix is. For every trading day from 2018-04-16 to 2026-04-02, for every SPY expiry with a usable SVI fit, for every short-strike delta in [0.05, 0.40] and every width in {1, 2, 3, 5, 10, 20}, compute the theoretical credit spread under Black-Scholes with surface-consistent IVs, then join the row with the actual SPY close at expiry to measure what the trade would have paid. 18.3M rows. 1,698 trading days. 1,525 unique expiries. Roughly 48 seconds to rebuild from raw SVI parameters and daily forwards.
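As a rough illustration of the "surface-consistent IV" step, the sketch below evaluates one SVI slice at a given strike. It assumes the stored parameters follow the standard raw SVI parameterization in total variance, w(k) = a + b(ρ(k - m) + sqrt((k - m)² + σ²)) with k = ln(K/F); the article does not say which SVI variant or units the fits use, so the function names and the sample parameters here are hypothetical.

```python
import math

def svi_total_variance(k, a, b, rho, m, sigma):
    """Raw SVI total implied variance w(k) at log-moneyness k = ln(K / F)."""
    return a + b * (rho * (k - m) + math.sqrt((k - m) ** 2 + sigma ** 2))

def svi_implied_vol(strike, forward, t_years, a, b, rho, m, sigma):
    """Annualized implied vol at a strike, using w = iv^2 * T."""
    k = math.log(strike / forward)
    w = svi_total_variance(k, a, b, rho, m, sigma)
    if w <= 0.0:
        raise ValueError("SVI slice gives non-positive variance at this strike")
    return math.sqrt(w / t_years)

# Hypothetical parameters for a 45-day slice (not taken from the dataset).
params = dict(a=0.002, b=0.05, rho=-0.7, m=0.0, sigma=0.1)
forward, t = 500.0, 45 / 365

iv_atm = svi_implied_vol(500.0, forward, t, **params)
iv_otm_put = svi_implied_vol(470.0, forward, t, **params)
print(f"ATM IV {iv_atm:.3f}, 470-strike IV {iv_otm_put:.3f}")
```

With a negative rho the out-of-the-money put strike comes out at a higher IV than the at-the-money strike, which is the skew each leg of the spread is priced with.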

What comes out is one of the cleanest pictures of the VRP-vs-skew tradeoff on a liquid US index that you can get from public data.

Avg theoretical EV: -$0.48
Avg realized P&L: +$0.58
Realized edge: +$1.06
18.3M spreads · 2018-04 → 2026-04
92.9% realized winrate · 5.6% max-loss rate
Every bucket -EV by theory. Every year +P&L by reality.

Finding 1 - Every Theoretical Bucket Is Negative EV

Across all 18,307,256 rows, the average theoretical EV is -$0.484. That is the skew premium the market charges for holding the short-put wing. There is no slice of the (DTE, delta, width) grid where theoretical EV is positive. Under the flat-vol lens, put credit spreads are net losers in expectation.

Flat-vol pricing is a trap. If this is the only model you rely on, the conclusion is clear and wrong: don't sell put spreads. Flat-vol does not price the variance risk premium or SPY's drift - it only prices the skew the market is literally quoting. The next finding shows what that missing pricing looks like in dollars.
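The article does not give the exact EV formula, so the following is only one plausible reading and should be treated as an assumption: the credit is priced with each leg's own surface IV, while the expected payoff is taken under a single lognormal distribution that uses the short strike's IV for both legs. Under that reading the EV reduces to the amount by which skew overprices the long leg. All inputs are hypothetical.

```python
import math

def norm_cdf(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def black_put(forward, strike, iv, t_years, discount=1.0):
    """Black-76 put on the forward (undiscounted when discount=1)."""
    sd = iv * math.sqrt(t_years)
    d1 = (math.log(forward / strike) + 0.5 * sd * sd) / sd
    d2 = d1 - sd
    return discount * (strike * norm_cdf(-d2) - forward * norm_cdf(-d1))

def flat_vol_ev(forward, k_short, k_long, iv_short, iv_long, t_years):
    """ASSUMED definition: skew-priced credit minus flat-vol expected payoff."""
    short_put = black_put(forward, k_short, iv_short, t_years)
    credit = short_put - black_put(forward, k_long, iv_long, t_years)
    # Expected spread payoff if both strikes shared the short strike's IV.
    expected_payoff = short_put - black_put(forward, k_long, iv_short, t_years)
    return credit, credit - expected_payoff

credit, ev = flat_vol_ev(forward=500.0, k_short=480.0, k_long=475.0,
                         iv_short=0.20, iv_long=0.21, t_years=45 / 365)
print(f"credit {credit:.2f}, flat-vol EV {ev:.2f}")
```

Because a put's price rises with volatility, this EV is negative whenever the long strike carries a higher IV than the short strike, which is at least consistent with every bucket showing negative theoretical EV.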


Finding 2 - Realized P&L Is Positive Almost Everywhere

Average realized P&L across the full matrix: +$0.577. Average realized edge (realized - theoretical): +$1.061. The skew premium was paid, fully, and then some. The "then some" is the VRP (implied > realized vol) plus the structural upward drift of SPY over the sample.

Model-implied POP: 74.8%
Realized winrate: 92.9%
Full-win rate: 92.6%
Max-loss rate: 5.6%
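Realized P&L is the expiry payoff of the spread against the actual close. A minimal sketch of that join for a single row follows, with hypothetical strikes and closes. The outcome labels are one reading of the figures above (full win when the short strike is untouched, max loss when the long strike is breached), not a definition taken from the study.

```python
def realized_pnl(credit, k_short, k_long, spot_at_expiry):
    """Per-share P&L of a put credit spread held to expiry."""
    width = k_short - k_long
    # The loss is the short put's intrinsic value, capped at the spread width.
    loss = min(max(k_short - spot_at_expiry, 0.0), width)
    return credit - loss

def classify(credit, k_short, k_long, spot_at_expiry):
    """Label the outcome: full win, partial win, partial loss, or max loss."""
    if spot_at_expiry >= k_short:
        return "full_win"
    if spot_at_expiry <= k_long:
        return "max_loss"
    pnl = realized_pnl(credit, k_short, k_long, spot_at_expiry)
    return "partial_win" if pnl > 0 else "partial_loss"

# Hypothetical 480/475 spread sold for $0.80 per share.
for close in (495.0, 479.5, 477.0, 470.0):
    print(close,
          round(realized_pnl(0.80, 480.0, 475.0, close), 2),
          classify(0.80, 480.0, 475.0, close))
```

Under this labeling a partial win counts toward winrate but not toward the full-win rate, which would explain the small gap between the 92.9% and 92.6% figures.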

Yearly breakdown, restricted to a single strategy footprint (width = $5, short delta 0.20-0.30) so the year-to-year comparison is apples-to-apples. Rows are sorted by realized P&L, best year first.

Year            n        Theo EV  Realized P&L  Edge    Winrate  Max-loss rate  Avg IV
2023            119,986  -$0.43   +$0.97        +$1.39  97.8%    1.2%           21.1
2024            124,447  -$0.47   +$0.86        +$1.33  97.5%    1.8%           18.7
2026 (Q1)       29,114   -$0.57   +$0.86        +$1.43  97.0%    1.9%           25.2
2020            131,318  -$0.51   +$0.84        +$1.35  95.6%    3.7%           32.2
2025            117,429  -$0.54   +$0.77        +$1.31  96.3%    3.1%           21.8
2019            70,470   -$0.40   +$0.50        +$0.91  89.8%    8.3%           19.7
2018 (Apr-Dec)  46,479   -$0.41   +$0.28        +$0.69  84.9%    11.5%          21.2
2021            129,952  -$0.62   +$0.21        +$0.83  85.7%    12.2%          25.5
2022            108,386  -$0.52   +$0.02        +$0.54  78.5%    18.2%          28.5

Every year is profitable. Every year's theoretical EV is negative. The only year where realized was close to breakeven is 2022 - the one true bear year in the sample. More on that below.


Finding 3 - The Realized Optimal Entry Is Wide, Deep, and Long-Dated

When ranked by average realized P&L per spread (min 500 observations), the top of the list is dominated by 91+ DTE wide spreads at relatively deep short deltas:

DTE    |Δ|   Width  Realized P&L  Winrate  Max-loss rate  Edge vs theory  n
91+    0.40  $20    +$3.85        87.7%    8.7%           +$6.21          119,452
91+    0.35  $20    +$3.61        89.8%    7.5%           +$5.88          257,875
91+    0.30  $20    +$3.23        91.6%    5.3%           +$5.33          290,954
91+    0.25  $20    +$2.80        93.6%    3.3%           +$4.71          313,083
91+    0.20  $20    +$2.09        95.1%    2.1%           +$3.72          274,737
91+    0.40  $10    +$1.96        86.7%    10.9%          +$3.22          119,450
91+    0.35  $10    +$1.88        89.4%    9.3%           +$3.09          257,875
91+    0.30  $10    +$1.66        91.2%    7.3%           +$2.80          291,077
46-90  0.40  $20    +$1.23        81.8%    10.6%          +$3.03          8,659
91+    0.20  $10    +$1.21        95.7%    3.0%           +$2.13          330,008

Two drivers stack together at the top of that list:

  • DTE captures drift. Over 91+ days SPY has had, on average, enough upward move for even 30-40Δ shorts to end up out-of-the-money most of the time.
  • Width scales the edge. The $20-wide at a given delta collects roughly 4x the credit of the $5-wide and carries roughly 4x the max loss, but realized winrate barely changes. In a regime where realized volatility runs below implied, bigger width is a linear multiplier on dollar edge, not on risk-adjusted outcome.

Per-trade Sharpe (avg realized P&L / stddev) ranges from 0.57 to 0.65 across the top ten rows. Under an IID annualization, the 91+ DTE deep-wide profiles project to an annualized Sharpe around 0.55-0.60 - technically the global peak, but estimated from fewer than one independent trade per year per slot, which means the confidence interval is wide. The 45 DTE profiles analyzed in Finding 5 project to annualized Sharpe ~0.48 from ~8 trades/year, which is lower in point estimate but much tighter in confidence. Pick your tradeoff: a more tightly estimated Sharpe with more turnover, or a slightly higher Sharpe with LEAPS-length capital lockup.


Finding 4 - Capital Efficiency Flips the Verdict on Short DTE

Absolute P&L per spread goes up with DTE. Annualized return on capital does not. The below is for short delta 0.20-0.30, width $5, aggregated by DTE bucket:

DTE bucket  Avg DTE  Avg credit  Avg max loss  Realized P&L  Per-trade ROC  Annualized ROC  Winrate
0-7         4.7      $0.68       $4.32         +$0.042       0.97%          75.3%           83.1%
8-21        14.4     $0.74       $4.26         +$0.069       1.62%          41.0%           84.2%
46-90       67.5     $0.79       $4.21         +$0.255       6.05%          32.7%           88.5%
22-45       32.5     $0.76       $4.24         +$0.121       2.86%          32.1%           85.9%
91+         394      $1.06       $3.94         +$0.752       19.1%          17.7%           93.6%

The 0-7 DTE bucket has the worst winrate and the smallest absolute P&L per trade, but if you could redeploy capital at that turnover the annualized number is 75%. Four caveats:

  1. Fill costs eat this alive. The matrix uses mid-quote theoretical prices reconstructed from SVI. Real bid/ask on 0-7 DTE SPY credit spreads is wide enough to erase most of a $0.04 edge. Treat the 0-7 number as an upper bound, not a strategy.
  2. The 91+ bucket has a huge capital lockup. 394 days of average hold time means capital turns over less than once a year; even a +19% per-trade return only annualizes to ~18%.
  3. The 22-90 DTE band is the comfortable zone. ~32-33% annualized, winrates in the mid-to-high 80s, manageable max-loss rates.
  4. None of this models profit takes or stops. Every trade is held to expiry. Overlaying a 50% PT and a -200% stop changes the distribution materially.
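To make the first caveat concrete, the sketch below recomputes annualized ROC from the table's bucket averages after subtracting a flat per-share cost. The cost levels are hypothetical, and holding max loss fixed while the cost changes is a simplification.

```python
# (bucket, avg_dte, avg_max_loss, avg_realized_pnl) from the DTE table above.
BUCKETS = [
    ("0-7", 4.7, 4.32, 0.042),
    ("8-21", 14.4, 4.26, 0.069),
    ("22-45", 32.5, 4.24, 0.121),
    ("46-90", 67.5, 4.21, 0.255),
    ("91+", 394.0, 3.94, 0.752),
]

def annualized_roc_pct(avg_pnl, avg_max_loss, avg_dte, cost_per_share=0.0):
    """Per-trade return on max loss, scaled by 365 / avg_dte (simple, not compounded)."""
    per_trade = (avg_pnl - cost_per_share) / avg_max_loss
    return per_trade * (365.0 / avg_dte) * 100.0

for cost in (0.0, 0.02, 0.05):  # hypothetical round-trip cost per share
    row = ", ".join(
        f"{name}: {annualized_roc_pct(pnl, max_loss, dte, cost):7.1f}%"
        for name, dte, max_loss, pnl in BUCKETS
    )
    print(f"cost ${cost:.2f} -> {row}")
```

With these inputs the 0-7 bucket turns negative as soon as the cost exceeds its $0.042 average edge, while the 91+ bucket changes far less because its per-trade P&L is much larger relative to the same cost.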

Finding 5 - Risk-Adjusted Returns and Portfolio Scaling

Annualized ROC alone hides risk. A profile that compounds $0.04 of edge 51 times a year is not obviously better than one that earns $0.33 at 8x turnover - stddev scales with turnover too. To get a clean comparison, we need Sharpe, and to get Sharpe we need to push the per-trade distribution through an IID annualization.

The formulas are closed-form from the per-trade distribution of realized P&L - no Monte Carlo, no bootstrap, no path simulation:

trades_per_year = 365 / avg_dte
ann_return_$    = avg_pnl  × trades_per_year
ann_stddev_$    = sd_pnl   × √(trades_per_year)
ann_sharpe      = ann_return_$ / ann_stddev_$
ann_ROC%        = (avg_pnl / avg_max_loss) × trades_per_year × 100
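
A direct transcription of these formulas into Python, as a sketch. The inputs are the figures reported for the 45 DTE 40Δ $5-wide profile; the per-trade stddev is not published, so the `sd_pnl` value is backed out from the $566 per-contract annual stddev quoted in the scaling section and should be read as an assumption.

```python
import math
from dataclasses import dataclass

@dataclass
class Profile:
    avg_pnl: float       # mean realized P&L per share per trade
    sd_pnl: float        # stddev of realized P&L per share per trade
    avg_dte: float       # average days to expiry at entry
    avg_max_loss: float  # average max loss per share (width minus credit)

def annualize(p: Profile) -> dict:
    """IID annualization: the mean scales with N, the stddev with sqrt(N)."""
    trades_per_year = 365.0 / p.avg_dte
    ann_return = p.avg_pnl * trades_per_year
    ann_stddev = p.sd_pnl * math.sqrt(trades_per_year)
    return {
        "trades_per_year": trades_per_year,
        "ann_return": ann_return,
        "ann_stddev": ann_stddev,
        "ann_sharpe": ann_return / ann_stddev,
        "ann_roc_pct": p.avg_pnl / p.avg_max_loss * trades_per_year * 100.0,
    }

# sd_pnl is an assumed value (see above); the other inputs are the article's figures.
stats = annualize(Profile(avg_pnl=0.33, sd_pnl=1.96, avg_dte=43.9, avg_max_loss=3.67))
for key, value in stats.items():
    print(f"{key:>16}: {value:.2f}")
```

Because the mean scales with N and the stddev with √N, the annualized Sharpe is simply the per-trade Sharpe multiplied by √(trades per year).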

Applied to fifteen (DTE, delta, width) profiles, one trade per day picking the spread whose delta is closest to target:

Profile           n      Avg DTE  Avg $/trade  Trades/yr  Ann return $  Ann Sharpe  Ann ROC %  Winrate  Max-loss rate
7 DTE 40Δ 5w      1,562  7.2      $0.11        50.9       $5.40         0.39        142.5%     74.1%    16.1%
45 DTE 40Δ 10w    1,315  43.9     $0.62        8.3        $5.15         0.49        67.6%      79.5%    12.9%
90 DTE 40Δ 20w    959    90.2     $0.97        4.1        $3.92         0.27        25.2%      79.2%    13.0%
7 DTE 25Δ 5w      1,679  7.2      $0.07        50.9       $3.61         0.34        83.5%      84.1%    7.7%
45 DTE 25Δ 10w    1,324  43.9     $0.41        8.3        $3.41         0.41        39.5%      88.4%    7.5%
90 DTE 25Δ 20w    959    90.2     $0.68        4.1        $2.77         0.24        15.9%      88.2%    7.5%
45 DTE 40Δ 5w     1,315  43.9     $0.33        8.3        $2.74         0.48        74.8%      78.2%    17.3%
30 DTE 40Δ 5w     1,673  31.5     $0.21        11.6       $2.43         0.35        65.9%      76.9%    20.0%
45 DTE 30Δ 5w     1,324  43.9     $0.28        8.3        $2.31         0.47        57.2%      85.2%    12.1%
45 DTE 25Δ 5w     1,324  43.9     $0.23        8.3        $1.87         0.42        44.3%      87.9%    9.5%
60 DTE 25Δ 5w     937    60.2     $0.22        6.1        $1.30         0.33        30.9%      87.9%    10.5%
14 DTE 25Δ 5w     1,657  13.9     $0.04        26.3       $1.06         0.13        24.7%      83.8%    9.7%
180 DTE 25Δ 5w    1,061  179.7    $0.46        2.0        $0.94         0.46        22.8%      91.0%    8.7%
90 DTE 25Δ 5w     959    90.2     $0.22        4.1        $0.90         0.28        21.6%      87.7%    10.8%
30 DTE 25Δ 5w     1,679  31.5     $0.07        11.6       $0.79         0.14        18.7%      84.7%    11.7%

Three useful picks fall out of this table, all at 40Δ:

Pick                     Profile              Sharpe  ROC   Capital / contract  Annual $  Note
Best risk-adjusted       45 DTE 40Δ $5 wide   0.48    75%   $367                $274      Winrate 78%, max-loss 17%
Biggest dollar yield     45 DTE 40Δ $10 wide  0.49    68%   $762                $515      Same edge, 2x footprint
Highest ROC (frequency)  7 DTE 40Δ $5 wide    0.39    142%  $379                $540      $0.11/sh avg P&L - costs eat most of it

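
The "one trade per day, closest delta to target" rule behind the profile rows can be sketched as follows. The row fields, the DTE tolerance, and the ordering (nearest DTE first, then nearest delta) are assumptions; the article states only that the spread whose delta is closest to target is picked each day.

```python
from collections import defaultdict
from typing import NamedTuple

class SpreadRow(NamedTuple):
    date: str
    dte: int
    short_delta: float   # absolute delta of the short put
    width: float
    realized_pnl: float

def pick_daily(rows, target_dte, target_delta, width, max_dte_gap=7):
    """Keep one row per date: nearest DTE first, then nearest delta (assumed rule)."""
    by_date = defaultdict(list)
    for r in rows:
        if r.width == width and abs(r.dte - target_dte) <= max_dte_gap:
            by_date[r.date].append(r)
    return {
        date: min(cands, key=lambda r: (abs(r.dte - target_dte),
                                        abs(r.short_delta - target_delta)))
        for date, cands in sorted(by_date.items())
    }

# Tiny hypothetical sample, not real matrix rows.
rows = [
    SpreadRow("2024-01-02", 44, 0.37, 5.0, 0.41),
    SpreadRow("2024-01-02", 44, 0.42, 5.0, -1.10),
    SpreadRow("2024-01-02", 51, 0.40, 5.0, 0.55),
    SpreadRow("2024-01-03", 43, 0.41, 5.0, 0.37),
    SpreadRow("2024-01-03", 43, 0.40, 10.0, 0.80),
]
picks = pick_daily(rows, target_dte=45, target_delta=0.40, width=5.0)
for date, row in picks.items():
    print(date, row.dte, row.short_delta, row.realized_pnl)
```

Fixing the rule before looking at outcomes is the point: the selection uses only entry-time fields (DTE, delta, width), never the realized P&L.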
What Sharpe 0.48 actually means

SPY buy-and-hold over 2018-2026 runs a Sharpe of roughly 0.55-0.70. Typical long-only equity strategies sit in the 0.4-0.8 range. The best put-spread profile in this sample is competitive with passive SPY, not a free lunch. The honest use case is diversification: put-spread returns are dominated by theta decay and VRP, only weakly correlated to the daily SPY path, so a small allocation lowers total-portfolio volatility while adding a structurally different return stream. Treating this as a standalone strategy that dominates buy-and-hold is not what the numbers support.

Scaling to a $100k portfolio

Deploying a notional $100,000 on the best Sharpe profile (45 DTE 40Δ $5 wide) with fully laddered expiries so capital stays continuously deployed:

  • Capital per contract (avg max loss): $367.
  • Contracts concurrently held: $100,000 / $367 ≈ 272.
  • Annual expected P&L: 272 x $274 = ~$74,500 (~74% ROC).
  • Annual stddev: 272 x $566 = ~$154,000 (Sharpe 0.48).
  • 2022-regime stress: winrate drops to ~78% and max-loss rate rises to ~18% - a realistic loss of $20-30k is in-sample.

The IID Sharpe math above understates annualized stddev, so the Sharpe it produces is an upper bound. In practice, losses cluster - when one spread hits max loss in a sharp gap-down, the other 271 in the ladder are on the same side of that move. Realized portfolio variance is wider than sd_per_trade * sqrt(trades_per_year) would suggest, sometimes materially. Sizing should reflect this: deploying 25-50% of capital (~$18-37k expected annual return on $100k) is more defensible than the 100%-sized $74k number.
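
The scaling arithmetic and the clustering caveat can be put side by side. The sketch below reproduces the $100k numbers, then shows how an equal pairwise correlation ρ between a year's sequential trades would inflate annual stddev relative to the IID figure, using the standard variance-of-a-sum identity Var = Nσ²(1 + (N - 1)ρ). The ρ values are purely hypothetical; the article does not estimate one.

```python
import math

CAPITAL = 100_000
MAX_LOSS_PER_CONTRACT = 367    # avg max loss, 45 DTE 40-delta $5 wide
ANNUAL_PNL_PER_CONTRACT = 274
ANNUAL_SD_PER_CONTRACT = 566   # IID estimate quoted in the article
TRADES_PER_YEAR = 8.3

def scale(fraction_deployed=1.0):
    """Contracts, expected annual P&L, and IID annual stddev for a sizing fraction."""
    contracts = int(CAPITAL * fraction_deployed / MAX_LOSS_PER_CONTRACT)
    return (contracts,
            contracts * ANNUAL_PNL_PER_CONTRACT,
            contracts * ANNUAL_SD_PER_CONTRACT)

def clustering_inflation(n_trades, rho):
    """Factor by which annual stddev exceeds the IID value under equal correlation rho."""
    return math.sqrt(1.0 + (n_trades - 1.0) * rho)

for frac in (1.0, 0.5, 0.25):
    contracts, pnl, sd = scale(frac)
    print(f"{frac:>4.0%}: {contracts} contracts, E[P&L] ${pnl:,}, IID sd ${sd:,}")

for rho in (0.0, 0.1, 0.3):  # hypothetical correlations between sequential trades
    k = clustering_inflation(TRADES_PER_YEAR, rho)
    print(f"rho={rho:.1f}: stddev x{k:.2f}, Sharpe {0.48 / k:.2f}")
```

At ρ = 0 the factor is exactly 1 and the IID numbers are recovered; any positive ρ raises the stddev and lowers the Sharpe by the same factor.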

Finding 6 - Delta Tradeoff at the Classic 30-DTE Horizon

The "sell around 30 delta" rule gets most of its intuition from this shape. Holding DTE at 22-45 and width at $5, varying only the short delta:

|Δ|   Realized P&L  Winrate  Max-loss rate  Avg credit  Theo EV  Realized edge
0.40  +$0.202       76.5%    19.7%          $1.31       -$0.51   +$0.71
0.35  +$0.166       78.8%    17.3%          $1.15       -$0.49   +$0.66
0.30  +$0.132       82.1%    14.2%          $0.96       -$0.46   +$0.59
0.25  +$0.123       85.7%    10.9%          $0.77       -$0.42   +$0.54
0.15  +$0.118       93.0%    4.9%           $0.42       -$0.30   +$0.42
0.20  +$0.110       89.2%    8.0%           $0.59       -$0.36   +$0.48
0.10  +$0.110       96.5%    2.5%           $0.26       -$0.22   +$0.33
0.05  +$0.083       98.4%    1.3%           $0.16       -$0.16   +$0.24

Two honest observations:

  • Absolute realized P&L broadly climbs with delta, all the way to 40Δ (the one exception is 15Δ, which edges out 20Δ). Deeper shorts collect more, and in a drift-positive regime that extra credit sticks.
  • Max-loss rate also keeps climbing - from 1.3% at 5Δ to 19.7% at 40Δ. One-in-five 40Δ trades hits the floor. Fine if they're sized so you can take it; not fine if they're not.

Skew premium is monotonic in delta. Realized edge (the gap vs theoretical flat-vol EV) grows from +$0.24 at 5Δ to +$0.71 at 40Δ - the market charges the most skew premium at the strikes closest to the money, and those are also the strikes where drift and VRP pay back the most.


Finding 7 - IV Regime Counter-Intuitively Favors Crisis

Bucketing the 22-45 DTE, 20-30Δ, $5-wide set by short-strike IV:

IV regime         Theo EV  Realized P&L  Edge    Winrate  Max-loss rate  n
crisis (30+)      -$0.42   +$0.54        +$0.95  92.8%    6.1%           15,555
high (22-30)      -$0.42   +$0.16        +$0.59  86.3%    10.6%          26,486
elevated (16-22)  -$0.42   +$0.05        +$0.47  85.1%    11.5%          30,132
normal (12-16)    -$0.38   -$0.10        +$0.28  81.7%    13.1%          17,848
calm (<12 IV)     -$0.37   -$0.66        -$0.30  71.3%    22.6%          1,676

Counter-intuitive but robust

Calm is the worst regime. Crisis is the best. Theoretical EV is nearly flat across regimes, but realized P&L flips from -$0.66 in calm to +$0.54 in crisis. This is the variance risk premium in one table - the highest edge is at the point of maximum perceived danger.

The mechanism:

  • Calm (<12 IV). Small credits, complacent market. These are the setups that sit at vol floors and then get picked off by a regime change. The sample is small (1,676 observations), but the direction is consistent with other VRP work - low VIX is not the same as safe.
  • Crisis (30+ IV). Implied vol overshoots realized. You are selling into the fear, and reality comes in below what the market priced. Winrate is highest, max-loss rate is lowest, despite the regime label.
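
Tagging a row with its regime is a simple lookup. A small sketch using the table's cut points follows; whether each boundary belongs to the lower or the upper bucket is not stated, so the convention here (lower bound inclusive) is an assumption.

```python
from bisect import bisect_right

# Cut points in IV percentage points, taken from the regime table.
CUTS = [12.0, 16.0, 22.0, 30.0]
LABELS = ["calm", "normal", "elevated", "high", "crisis"]

def iv_regime(short_strike_iv_pct: float) -> str:
    """Map short-strike IV (in percent) to a regime label; lower bounds inclusive (assumed)."""
    return LABELS[bisect_right(CUTS, short_strike_iv_pct)]

for iv in (10.5, 12.0, 18.7, 28.5, 32.2):
    print(iv, iv_regime(iv))
```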

Finding 8 - The 2022 Stress Test

2022 is the only real bear year in the sample, and it is the year every backtest of SPY premium selling should be pressure-tested against:

  • Winrate in the strategy footprint (width $5, 20-30Δ) dropped from 85.7% in 2021 to 78.5% in 2022, then recovered to 97.8% in 2023.
  • Max-loss rate went 12.2% (2021) → 18.2% (2022) → 1.2% (2023).
  • Average realized P&L went +$0.21 → +$0.02 → +$0.97.

The strategy survived 2022 in aggregate, but with no margin. Any leverage, outsized sizing, or concentrated tenor would have produced a double-digit drawdown. This is the "how often do you need discipline on sizing and sequencing" question - in every other year of the sample the honest answer is "almost never"; in 2022 the answer is "every trade."


Caveats That Should Be Loud

A clean dashboard is seductive. The numbers above are real in the sense of "this is what the data says," but several assumptions make them an upper bound on live tradability:

  1. No transaction costs. Option spreads have wide bid/ask, commissions, and slippage. Short-DTE results in particular are not clearable: at 7 DTE 40Δ the average realized P&L is about $0.11 per share, and a realistic round-trip commission plus slippage budget of $0.10-0.20 per share erases most or all of it. The 7-DTE 142% annualized ROC is a model ceiling, not a target.
  2. Theoretical prices are mid-quote proxies. IV is reconstructed from SVI and puts are priced with Black-Scholes. No live bid/ask is captured. Actual executable credit is closer to the bid than to the mid, and deep-OTM longer-dated wings (90+ DTE, low delta) can quote $0.10+ wide on their own.
  3. No liquidity filter. Rows exist wherever the SVI surface calibrated. Thin strikes may be included that would not actually fill at the modeled price. This matters most for very deep OTM and long-dated combinations.
  4. No stops, no profit takes. Every trade is held to expiry. Realistic execution includes 50% PT and -200% stops (or other variants), which change both the distribution and the drawdown profile.
  5. IID Sharpe is an upper bound. Annualizing per-trade stddev by sqrt(trades_per_year) assumes independence across trades. In bear regimes - clearly 2022, also the tail of Q1 2020 - losses cluster, so realized annual stddev exceeds the IID estimate. Real drawdowns are worse than the Sharpe math implies; the Sharpe 0.48 figure is therefore an optimistic number, not a conservative one.
  6. SVI calibration survivorship. Rows only exist where the surface fit succeeded. Days of extreme tape-tearing volatility may be under-represented.
  7. Single underlying. SPY has structural upward drift. Single-name, earnings-driven underlyings would look materially worse (fatter tails, no drift).
  8. 8 years is short. One real bear year. Conclusions about "crisis regime wins" and the Sharpe estimate itself carry wide confidence intervals.

None of these invalidate the shape of the findings, but they all push the magnitude of any live strategy built on top of this analysis downward. The theoretical-vs-realized gap is real; the exact basis-point edge is smaller than this model suggests.


Methodology & Data Disclosure

Every number above comes from three ingredients: historical SVI surface parameters, daily forwards, and SPY spot at expiry. All three are available through the FlashAlpha Historical API and the flashalpha Python package - same surface fits the live product uses, walk-forward by default, no look-ahead.

# Shell: install the client package first.
pip install flashalpha

# Python: pull the historical SVI surfaces.
from flashalpha import Client
import pandas as pd

fa = Client(api_key="...")
surfaces = fa.historical.surface("SPY",
    start="2018-04-16", end="2026-04-02")

# surfaces is a DataFrame: ts, expiry, dte, forward,
# svi_a, svi_b, svi_rho, svi_m, svi_sigma
# From there: reconstruct IV at every strike, price BS puts,
# enumerate spread permutations, join with spot at expiry.

Coverage today: SPY, QQQ, IWM with surfaces and forwards since January 2018. Additional symbols are available on request.

Analytical notes. Sharpe ratios above are computed from the realized per-trade P&L distribution (mean and stddev) and annualized under an IID assumption. Realized P&L is computed by joining each modeled spread with the actual SPY close at that expiry - not simulated. No profit takes, no stops, no transaction costs modeled. Per-contract dollar figures are per-share values multiplied by 100.


The One-Line Takeaway

TL;DR

Over eight years and 18 million simulated put credit spreads on SPY, the market priced skew premium into every bucket. Flat-vol BS theoretical EV averaged -$0.48. Actual realized P&L, joined with real spot at expiry, averaged +$0.58. The +$1.06 gap - realized edge - is the variance risk premium plus SPY's drift, combined and paid out in cash. Put selling on SPY is renting the market's fear, and the rent has been paid every year in the sample.

