Why Nobody Else Sells Historical Options Analytics - The Build vs Buy Math | FlashAlpha

Raw historical options data exists - ThetaData, Polygon, ORATS all sell flavors of it. Pre-computed historical analytics - GEX, VRP, dealer regime, max pain, vol surfaces, replayable at any minute - don't exist anywhere else. This is an honest look at why the assembly is a multi-year engineering project and why the economics didn't work until now.

Tomasz Dobrowolski Quant Engineer
Apr 15, 2026
17 min read
HistoricalData OptionsAPI MarketAnalysis DataEngineering Quant

When we announced the Historical Analytics API, the reasonable reaction from anyone who's shopped this space is: "surely somebody already sells this?" The short answer is no. The longer answer - why not - is more interesting than the product announcement itself, because it explains why the competitive picture will look very different for the next year or two than it did last week.

Full disclosure up front: I built this. I'm going to be as honest as I can about what the other providers ship well, because the claim "we're the only one" is easy to make and usually untrue. In this specific case it actually is true, and the reason is a combination of data economics, calculator investment, and product scope that nobody else has chosen to take on all at once.


What Exists Today

Here are the providers a serious buyer would look at, and what each of them ships in the "historical options" space:

Provider | What they ship historically | What they don't
ThetaData | Cheap historical options tick data, EOD chains, some NBBO reconstruction | No greeks at tick resolution out of the box. No pre-computed GEX/VRP/regime. No SVI fits. You bring the pipeline.
Polygon.io | Tick-level historical quotes and trades, options aggregates, stocks history | Same as above - raw data, no analytics layer. Strong infrastructure, empty product layer.
ORATS | 25 years of EOD options data, backtesting engine, 98 proprietary indicators, IV rank | End-of-day only (no intraday minute resolution). No GEX/DEX/VEX/CHEX per strike. No dealer regime classification. Backtesting is hosted, not API-streamed.
Intrinio | Institutional options + fundamentals bundles, historical chains | Raw data-flavored; analytics layer is generic across asset classes. Not focused on dealer-positioning analytics.
Unusual Whales | Historical flow data, unusual options activity, historical sweep archives | Not a dealer-positioning product. No GEX/VRP/vol-surface history.
SpotGamma | Browser-first historical dealer-positioning charts (partial API) | No programmatic at-any-minute replay API. Dashboard-shaped, not developer-shaped.
CBOE DataShop | Official historical options data, institutional pricing | Raw only. You build the entire analytics layer. Pricing is institutional.

The pattern: raw historical data is a commodity. It is available, often cheaply, from multiple vendors. The gap everyone leaves is the analytics layer - the work of turning those raw chains into per-strike exposures, regime labels, leak-free percentiles, vol surfaces, and composite dashboards, all at the same minute-level granularity you'd expect from a live product.

ORATS is the closest to shipping analytics historically. The gap between ORATS and FlashAlpha's Historical API is the dealer-positioning category specifically - GEX, DEX, VEX, CHEX, per-strike hedging flow, regime classification, gamma flip - plus the minute-level resolution. ORATS ships EOD; FlashAlpha ships every minute from 9:30 to 16:00 ET.


Why the Gap Stayed Open

Three reasons, roughly in order of importance.

1. The Data Cost Is Real

SPY alone - 8 years of minute-level options quotes with greeks - is 6.7 billion rows. To cover an S&P 500-sized universe at the same resolution, you're looking at hundreds of billions of rows. That data has to be stored somewhere (we use QuestDB with time-partitioned tables), queried efficiently (LATEST ON queries with partition pruning run in 50-300ms), and extended with a fresh ingest every trading day. The storage isn't exotic in 2026, but it isn't free, and the combination of "minute resolution" + "multi-year history" + "greeks attached" puts it at a scale where most analytics startups pick a narrower scope instead.
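As a back-of-envelope check on that scale, the sketch below divides the 6.7 billion SPY rows by the number of minute bars in 8 years. The 252 trading days per year is an assumption (it ignores half-days and year-to-year holiday differences); the 390 minutes per session follows from a 9:30 to 16:00 ET trading day.

```python
# Back-of-envelope: average number of option rows per minute bar for SPY.
SPY_ROWS = 6.7e9              # figure quoted in the article
YEARS = 8                     # figure quoted in the article
TRADING_DAYS_PER_YEAR = 252   # assumption: typical US equity calendar
MINUTES_PER_SESSION = 390     # 9:30 to 16:00 ET

trading_days = YEARS * TRADING_DAYS_PER_YEAR
minute_bars = trading_days * MINUTES_PER_SESSION

rows_per_minute = SPY_ROWS / minute_bars
rows_per_day = SPY_ROWS / trading_days

print(f"{minute_bars:,} minute bars over {trading_days:,} sessions")
print(f"~{rows_per_minute:,.0f} rows per minute on average")
print(f"~{rows_per_day / 1e6:.1f}M rows per session on average")
```

These are averages over the whole period under the stated assumptions, so individual days can sit well above or below them depending on how many contracts were listed at the time.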

You can see this in how the other providers chose their scope. ORATS picked EOD + 25-year history - big in the time dimension, small in the intraday dimension. ThetaData picked tick resolution + bring-your-own-greeks - big in granularity, small in the analytics dimension. SpotGamma picked browser-first + recent history - rich presentation, short archive. FlashAlpha picked minute + 8 years + pre-computed analytics, which is expensive along every axis simultaneously.

2. Calculator Maturity Is a Prerequisite

Shipping the Historical API required that the live analytics layer already exist and be correct. FlashAlpha spent the first two years of the product building the calculator stack - ExposureCalculator, NarrativeBuilder, VrpCalculator, VolatilityAnalyzer, AdvancedVolatilityCalculator, VolSurfaceGridBuilder - as pure-static classes, so that they could be invoked against any timestamped input without dragging a web stack along.

A provider that grew up shipping dashboards instead of APIs has to refactor its analytics into pure functions of their inputs before it can replay history at arbitrary timestamps. That refactor is a quarter-to-a-year of engineering work by itself, and it has to land on a product team that's already busy servicing live customers. The providers that didn't invest early are now behind by exactly the time that refactor takes.
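To make "pure function of inputs" concrete, here is a minimal sketch of a per-strike gamma-exposure calculation. It is not FlashAlpha's ExposureCalculator: the `Quote` fields, the function name, and the sign convention (calls counted positive, puts negative, exposure expressed per 1% move in spot) are assumptions chosen for illustration, following one commonly used GEX convention.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Quote:
    """Hypothetical snapshot row: one contract at one timestamp."""
    strike: float
    right: str          # "C" for call, "P" for put
    gamma: float
    open_interest: int


def gamma_exposure_by_strike(
    quotes: Iterable[Quote], spot: float, multiplier: int = 100
) -> Dict[float, float]:
    """Net dollar gamma exposure per strike for a 1% move in spot.

    Pure function: the result depends only on the arguments, so the same
    code can run on a live snapshot or on a snapshot from any past minute.
    """
    exposure: Dict[float, float] = {}
    for q in quotes:
        sign = 1.0 if q.right == "C" else -1.0  # assumed dealer-positioning convention
        gex = sign * q.gamma * q.open_interest * multiplier * spot * spot * 0.01
        exposure[q.strike] = exposure.get(q.strike, 0.0) + gex
    return exposure


snapshot = [
    Quote(strike=500.0, right="C", gamma=0.02, open_interest=1000),
    Quote(strike=500.0, right="P", gamma=0.02, open_interest=400),
]
print(gamma_exposure_by_strike(snapshot, spot=500.0))
```

Because nothing here reads a clock, a database connection, or a web request, replaying history reduces to feeding the function a different snapshot.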

3. The Market Had to Get Bigger

Pre-computed dealer-positioning analytics - GEX in particular - was a niche product three years ago. In 2026 it's a recognized category, driven by 0DTE growth, retail options expansion, and quant teams at mid-sized hedge funds starting to build GEX-overlay strategies. Demand for historical dealer-positioning data lagged demand for live data by 12-18 months, because buyers first had to see the value of live before asking "what did this look like during the last drawdown?"

We started the Historical API backfill when that ask hit a threshold in the sales pipeline. Other providers haven't hit that threshold yet, either because their customer mix skews toward other use cases (Polygon's retail developers, ORATS's backtesters, Intrinio's institutional fundamentals bundles) or because the investment case hasn't been approved.


The Bundle, Specifically

What makes FlashAlpha's Historical API a product rather than a feature is the bundle:

  • Minute resolution for greeks, spot, and derived exposures.
  • EOD layer for OI, SVI parameters, forwards, and macro (VIX, VVIX, SKEW, MOVE, SPX, DGS10).
  • Same calculators as live - every endpoint delegates to the same pure-static classes, and bug-fixes land simultaneously.
  • Leak-free percentiles - VRP percentile and z-score are date-bounded in SQL, not by convention.
  • Same response shape as live - code written against api.flashalpha.com works against historical.flashalpha.com with a base-URL swap.
  • Self-describing coverage - /v1/tickers returns loaded symbols, date ranges, and gaps.
  • Repeatable, gap-aware pipeline - the backfill is idempotent; re-runs are minutes, not hours.
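The "gap-aware, idempotent" property in the last bullet can be sketched roughly as follows. This is an illustration, not the actual pipeline: the function names are hypothetical, and weekdays stand in for a real exchange calendar (so market holidays would show up as false gaps).

```python
from datetime import date, timedelta
from typing import Callable, List, Set


def missing_sessions(start: date, end: date, loaded: Set[date]) -> List[date]:
    """Weekdays in [start, end] that are not yet loaded."""
    gaps = []
    day = start
    while day <= end:
        if day.weekday() < 5 and day not in loaded:  # Mon-Fri only (assumption)
            gaps.append(day)
        day += timedelta(days=1)
    return gaps


def backfill(start: date, end: date, loaded: Set[date],
             ingest: Callable[[date], None]) -> int:
    """Ingest only the missing sessions; returns how many were ingested."""
    gaps = missing_sessions(start, end, loaded)
    for day in gaps:
        ingest(day)
        loaded.add(day)
    return len(gaps)


loaded_days = {date(2024, 1, 2), date(2024, 1, 4)}
ingested: List[date] = []

first = backfill(date(2024, 1, 2), date(2024, 1, 8), loaded_days, ingested.append)
second = backfill(date(2024, 1, 2), date(2024, 1, 8), loaded_days, ingested.append)
print(first, second, ingested)
```

A re-run over the same range finds no gaps and does no work, which is the sense in which re-runs stay cheap.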

Each line of that list is fifteen-to-thirty engineer-days of work to replicate. The bundle is what makes the product hard to reproduce, not any single item.
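The leak-free percentile item is the easiest one to get subtly wrong, so here is a minimal sketch of the idea. The article describes the bound as living in SQL; this Python version is only an illustration, the function name is hypothetical, and using strictly earlier dates as the cutoff is an assumption.

```python
from datetime import date
from typing import Optional, Sequence, Tuple


def leak_free_percentile(
    history: Sequence[Tuple[date, float]], as_of: date, value: float
) -> Optional[float]:
    """Percentile rank of `value` against observations dated before `as_of`.

    Rows dated on or after `as_of` are excluded, so a replay at a past
    date never sees data that did not exist yet.
    """
    past = [v for d, v in history if d < as_of]
    if not past:
        return None  # no prior history to rank against
    at_or_below = sum(1 for v in past if v <= value)
    return 100.0 * at_or_below / len(past)


vrp_history = [
    (date(2024, 1, 2), 1.0),
    (date(2024, 1, 3), 3.0),
    (date(2024, 1, 4), 2.0),
    (date(2024, 1, 5), 10.0),
]

# Replayed as of Jan 5, the 10.0 reading is not yet part of history.
print(leak_free_percentile(vrp_history, date(2024, 1, 5), 2.5))
# Replayed as of Jan 8, it is.
print(leak_free_percentile(vrp_history, date(2024, 1, 8), 2.5))
```

The same reading ranks differently depending on the replay date, which is exactly the behavior a backtest needs: the percentile reflects what was knowable at that point.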


Build vs Buy - The Honest Math

For the few teams that would consider building this themselves, here's the rough budget, based on how long it took us:

Milestone | Eng-weeks | Notes
ThetaData ingestion pipeline (parquets, gap-aware) | 4 | Assumes ThetaData subscription + existing infra
BSM greek hydration for ~7M quotes/day | 3 | Correctness, performance, dividend handling
QuestDB (or equivalent) cluster + partitioning + DBA ops | 3 | Columnar store choice is a multi-week decision by itself
Stock minute-bars pipeline + EOD close extraction | 2 |
End-of-day OI pipeline + reconciliation | 2 |
Daily SVI fitter + forward-price calculation | 4 | Arbitrage checks, butterfly constraints
Macro ingest (VIX, VVIX, SKEW, MOVE, DGS10, SPX) | 2 |
Pure-static calculator refactor (exposures, narrative, VRP, vol, advanced vol) | 12 | If the calculators aren't already pure, this is where you live for a quarter
Leak-free percentile infrastructure + DailyVrpSnapshots | 3 | Easy to get wrong
REST layer mirroring live endpoint shapes | 4 |
Data Quality Report + coverage endpoint | 2 | You will want this before the first customer call
Backfill runs + gap detection + re-run idempotency | 3 |
Ops, on-call, cost monitoring | 4 | Ongoing, but front-loaded
Total | ~48 eng-weeks | Roughly one engineer-year, assuming the calculator work is already done. Double it if starting from scratch.

That's the build side. The buy side is an Alpha-tier subscription. For any team whose core product isn't "historical options data infrastructure," the math is one-sided - and the teams whose product is that are the ones who'd be competing with us, which brings us back to why the gap exists.
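For anyone who wants to adjust the budget to their own situation, the table above is easy to tally. The figures below are copied from the table and the milestone labels are shortened; swap in your own estimates as needed.

```python
# Eng-week estimates copied from the build budget table.
budget_weeks = {
    "ThetaData ingestion pipeline": 4,
    "BSM greek hydration": 3,
    "QuestDB cluster + partitioning + DBA ops": 3,
    "Stock minute-bars pipeline": 2,
    "End-of-day OI pipeline": 2,
    "Daily SVI fitter + forwards": 4,
    "Macro ingest": 2,
    "Pure-static calculator refactor": 12,
    "Leak-free percentile infrastructure": 3,
    "REST layer": 4,
    "Data Quality Report + coverage endpoint": 2,
    "Backfill runs + gap detection": 3,
    "Ops, on-call, cost monitoring": 4,
}

total_weeks = sum(budget_weeks.values())
calculator_share = budget_weeks["Pure-static calculator refactor"] / total_weeks

print(f"Total: {total_weeks} eng-weeks")
print(f"Calculator refactor share: {calculator_share:.0%}")
```

The calculator refactor alone accounts for a quarter of the total, which is why teams whose analytics are not already pure functions should expect the larger end of the range.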


What Catches Up First

Predictions, honestly made, about the next 18 months:

  • ORATS is the most likely to add minute-level resolution and dealer-positioning analytics. They have the calculator discipline and the historical depth. Watch for a "real-time analytics" product announcement.
  • SpotGamma is the most likely to add a programmatic historical replay, because their dashboard already visualizes this and the demand is explicit. API-shape discipline is the gap.
  • Polygon is unlikely to move up-stack - they're committed to the infrastructure layer and that's a strong position.
  • ThetaData, similarly, is committed to the "cheap raw data" position and the gap to analytics is larger than it appears.
  • A new entrant is plausible - the demand is proven now, the engineering is hard but bounded, and the economics work at Alpha-tier pricing.

In each case the lag to a comparable product is at least a year, more likely two. Which is the window we have to deepen coverage (more symbols), extend features (minute-level OI changes, surface smoothing), and lock in the "pre-computed historical analytics, same shape as live" category definition before someone else defines it for us.


The Uncomfortable Upside

One unflattering truth: the other providers aren't shipping this because the TAM for "historical pre-computed options analytics, minute resolution, Alpha-tier pricing" is smaller than the TAM for raw data or retail dashboards. We're comfortable with that - it's a high-value niche with clear buyers (quant teams, researchers, ML workflows at mid-market hedge funds) and the competitive moat is real.

If you're one of those buyers and you've been waiting for this dataset to exist, it does now. If you're not - if your workflow is covered by raw chains, EOD snapshots, or browser-first charts - one of the other providers is probably a better fit. The whole point of this post is to be honest about that.


Historical API · Alpha tier · from $1,199/mo
Replay any analytics endpoint at any minute since 2018
Same response shape as live, leak-free percentiles, 6.7B option rows for SPY, more symbols on demand.
Data freshness: intraday data through the previous trading day's close, refreshed by the daily pipeline run. Live coverage status at /v1/tickers.
