Yang-Zhang vs Close-to-Close: Which Realized Volatility Estimator Should You Use? | FlashAlpha Research


Realized volatility is the backbone of every options pricing model and VRP strategy, but the estimator you choose can change your answer by 30% or more. This article compares four OHLC-based realized volatility estimators — close-to-close, Parkinson, Garman-Klass, and Yang-Zhang — with formulas, efficiency analysis, Python implementations, and practical guidance on which to use for volatility risk premium calculations.

Tomasz Dobrowolski
Quant Engineer
Mar 17, 2026 · 39 min read
RealizedVolatility YangZhang Parkinson VolatilityEstimation Quant

The Problem: Measuring "How Volatile" a Stock Has Been

Every options trader needs a number for realized volatility (RV). You need it to compute the volatility risk premium — the spread between implied and realized vol that drives premium-selling strategies. You need it to calibrate models, backtest systems, and assess whether options are cheap or expensive.

But realized volatility is not a single number. It depends entirely on how you measure it. The simplest estimator — close-to-close — throws away most of the information in your price data. More sophisticated estimators use open, high, low, and close (OHLC) prices to extract more signal from the same data. The question is: which one should you use?

Why this matters

If you compute VRP as IV minus RV, and your RV estimator is noisy or biased, your VRP signal is noisy or biased. A poor RV estimate can make you think options are overpriced when they're fairly valued, or vice versa. The estimator you choose directly affects your trading decisions.


Close-to-Close: The Baseline

The close-to-close estimator is the simplest and most widely used. It computes volatility as the standard deviation of log returns, annualized:

Close-to-Close Volatility $$ \sigma_{\text{CC}} = \sqrt{\frac{252}{n-1} \sum_{t=1}^{n} \left( \ln \frac{C_t}{C_{t-1}} - \bar{r} \right)^2 } $$

Where Ct is the closing price on day t, n is the number of observations, and r̄ is the mean log return. The factor of 252 annualizes the daily variance to a yearly figure.
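As a quick numeric illustration (the closing prices below are hypothetical), the formula reduces to a sample standard deviation of log returns scaled by √252:

```python
import numpy as np

# Hypothetical closing prices for six consecutive trading days
closes = np.array([100.0, 101.2, 100.5, 102.0, 101.1, 101.8])

# Daily log returns: ln(C_t / C_{t-1})
log_returns = np.diff(np.log(closes))

# Sample standard deviation (n-1 denominator), annualized with sqrt(252)
cc_vol = log_returns.std(ddof=1) * np.sqrt(252)
print(f"Close-to-close volatility: {cc_vol:.1%}")
```

Five modest daily moves already annualize to a double-digit volatility figure, which is why the √252 scaling surprises people the first time they see it.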

Pros
  • Simple to compute and understand
  • Only requires closing prices
  • Unbiased — converges to true vol with enough data
  • Universal baseline, everyone uses it
Cons
  • Ignores all intraday price action
  • Statistically inefficient — needs many observations to converge
  • A stock can swing 5% intraday and close flat — CC reports zero vol
  • High variance, especially over short windows
[Figure: What Close-to-Close Sees vs What It Misses — five trading days with close-to-close returns of -0.7%, +1.1%, -1.6%, and +1.2% marked as dots, overlaid on faded candlesticks showing the intraday high-low range that close-to-close discards.]

Close-to-close volatility only sees the blue dots and connecting line. The faded candlesticks show the intraday high-low range — information that is entirely discarded. A day can swing 5% intraday yet close flat, registering zero volatility.

The core limitation is efficiency. In statistics, efficiency measures how much information an estimator extracts from available data. Close-to-close uses exactly one data point per day (the close), discarding everything that happened between open and close. A stock that gaps up 3%, drops 5%, and recovers to close unchanged had a volatile day — but CC says nothing happened.


Parkinson: Using the High-Low Range

Parkinson (1980) recognized that the daily high-low range contains far more information about volatility than closing prices alone. His estimator:

Parkinson Volatility $$ \sigma_{\text{P}}^2 = \frac{252}{4n \ln 2} \sum_{t=1}^{n} \left( \ln \frac{H_t}{L_t} \right)^2 $$

Where Ht and Lt are the high and low prices on day t. The constant 1/(4 ln 2) comes from the theoretical relationship between range and volatility under geometric Brownian motion.

  • ~5x more efficient than close-to-close
  • 2 data points used per day (high, low)
  • Biased low — discrete sampling misses the true extremes

Parkinson is approximately 5 times more efficient than close-to-close, meaning it achieves the same estimation accuracy with one-fifth as many observations. A 10-day Parkinson estimate is roughly as accurate as a 50-day close-to-close estimate.

Parkinson ignores overnight gaps. It assumes continuous trading — the price path from yesterday's close to today's open doesn't exist in the model. If a stock gaps down 8% on earnings overnight and then trades in a 1% range during the day, Parkinson only sees the 1% range. This makes it systematically biased low for stocks with significant overnight moves.
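The gap blindness is easy to see with a single day of made-up OHLC numbers (the prices below are hypothetical, chosen to mimic an earnings gap):

```python
import numpy as np

# Hypothetical earnings gap: stock closes at 100, gaps down to open near 92,
# then trades in a narrow 92.5-91.6 range and closes at 92.2
prev_close, high, low, close = 100.0, 92.5, 91.6, 92.2

# Single-day Parkinson vol only sees the ~1% intraday range
park_daily_vol = np.log(high / low) / np.sqrt(4 * np.log(2))

# The close-to-close return captures the full ~8% move
cc_daily_move = abs(np.log(close / prev_close))

print(f"Parkinson daily vol:  {park_daily_vol:.2%}")
print(f"Close-to-close move:  {cc_daily_move:.2%}")
```

Parkinson reports well under 1% for a day on which the stock actually lost 8% — the entire move happened in the overnight session the estimator cannot see.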


Garman-Klass: Adding Open and Close

Garman and Klass (1980) extended Parkinson by incorporating open and close prices alongside the high-low range. Their estimator partially corrects for the overnight gap bias:

Garman-Klass Volatility $$ \sigma_{\text{GK}}^2 = \frac{252}{n} \sum_{t=1}^{n} \left[ \frac{1}{2} \left( \ln \frac{H_t}{L_t} \right)^2 - (2\ln 2 - 1) \left( \ln \frac{C_t}{O_t} \right)^2 \right] $$

Where Ot is the opening price. The first term captures the range-based information (similar to Parkinson), and the second term adjusts for the open-to-close return, which partially accounts for overnight gaps by anchoring the estimate to the open.

Garman-Klass is roughly 7-8 times more efficient than close-to-close. However, it still assumes that the opening price equals the previous close — a violation in any market with overnight sessions, pre-market trading, or earnings announcements.


Yang-Zhang: The Complete Estimator

Yang and Zhang (2000) developed an estimator that explicitly handles overnight gaps, intraday drift, and open-to-close variance. It combines three components:

Yang-Zhang Volatility $$ \sigma_{\text{YZ}}^2 = \sigma_o^2 + k \cdot \sigma_c^2 + (1-k) \cdot \sigma_{\text{RS}}^2 $$

Where:

  • σo² = variance of overnight returns (close-to-open): ln(Ot/Ct-1)
  • σc² = variance of open-to-close returns: ln(Ct/Ot)
  • σRS² = Rogers-Satchell variance (a range-based estimator that handles drift)
  • k = weighting constant, typically k = 0.34 / (1.34 + (n+1)/(n-1))
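The weight k changes only slowly with window length; a quick sketch of its values for the common lookbacks used in this article:

```python
# Yang-Zhang weight k = 0.34 / (1.34 + (n + 1) / (n - 1)) for common windows
for n in (5, 10, 20, 30, 60):
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    print(f"n = {n:2d}:  k = {k:.3f}")
```

For a 20-day window k is roughly 0.14, so most of the intraday weight goes to the Rogers-Satchell component rather than the raw open-to-close variance.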

The Rogers-Satchell component itself is:

Rogers-Satchell Variance $$ \sigma_{\text{RS}}^2 = \frac{1}{n} \sum_{t=1}^{n} \left[ \ln\frac{H_t}{C_t} \cdot \ln\frac{H_t}{O_t} + \ln\frac{L_t}{C_t} \cdot \ln\frac{L_t}{O_t} \right] $$

This three-component architecture is what makes Yang-Zhang unique. The overnight variance captures gap risk. The open-to-close variance captures intraday drift. The Rogers-Satchell component captures range-based volatility without drift bias. The weighted combination produces the minimum-variance unbiased estimator for daily OHLC data.

Yang-Zhang Strengths
  • Handles overnight gaps explicitly
  • Drift-independent (works in trending markets)
  • ~14x more efficient than close-to-close
  • Uses all four OHLC data points
  • Minimum-variance unbiased estimator for OHLC data
Yang-Zhang Limitations
  • More complex to implement correctly
  • Sensitive to window length — short windows can be noisy
  • Requires clean OHLC data (bad highs/lows corrupt the estimate)
  • Assumes independent, identically distributed returns

Efficiency Comparison

Efficiency measures how much variance reduction an estimator achieves relative to close-to-close. Higher efficiency means the estimator converges to the true volatility faster with fewer data points.

Estimator | Data Used | Relative Efficiency | Handles Gaps | Handles Drift
Close-to-Close | Close | 1.0x (baseline) | Yes (implicitly) | No
Parkinson | High, Low | ~5.2x | No | No
Garman-Klass | O, H, L, C | ~7.4x | Partial | No
Rogers-Satchell | O, H, L, C | ~8x | No | Yes
Yang-Zhang | O, H, L, C | ~14x | Yes | Yes

The practical implication: a 5-day Yang-Zhang estimate carries roughly the same statistical precision as a 70-day close-to-close estimate. For short-window VRP calculations — where you need a responsive RV measure — this difference is enormous.

These are theoretical efficiencies derived from the original papers under ideal conditions (continuous trading, no microstructure noise). Real-world efficiency depends on data quality, market microstructure, and lookback window. The ratios above (~5x Parkinson, ~7.4x Garman-Klass, ~14x Yang-Zhang) are useful as relative benchmarks, but actual variance reduction in live markets will vary.
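You can verify the ordering (though not the exact theoretical ratios) with a small Monte Carlo under the idealized assumptions — continuous GBM intraday paths, no overnight gaps, no drift. The discretized path below is a sketch, and with finite intraday steps Parkinson comes out slightly biased low, exactly as the theory predicts:

```python
import numpy as np

rng = np.random.default_rng(42)
true_daily_var = 0.01 ** 2            # 1% daily volatility
n_days, steps, trials = 20, 100, 2000

# Continuous log-price paths: no overnight gaps, no drift
incr = rng.normal(0.0, np.sqrt(true_daily_var / steps),
                  size=(trials, n_days * steps))
log_path = np.cumsum(incr, axis=1).reshape(trials, n_days, steps)

closes = log_path[:, :, -1]
highs = log_path.max(axis=2)
lows = log_path.min(axis=2)

# Per-trial close-to-close daily-variance estimate (19 returns per window)
cc_est = np.diff(closes, axis=1).var(axis=1, ddof=1)

# Per-trial Parkinson estimate (in log space, highs - lows = ln(H/L))
park_est = ((highs - lows) ** 2).mean(axis=1) / (4 * np.log(2))

print(f"CC        std of estimates: {cc_est.std():.2e}")
print(f"Parkinson std of estimates: {park_est.std():.2e}")
print(f"Variance ratio (CC / Parkinson): {cc_est.var() / park_est.var():.1f}x")
```

Across 2,000 simulated 20-day windows, the Parkinson estimates cluster far more tightly around the true daily variance than the close-to-close estimates — the efficiency gain is visible even in a toy simulation.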

Key takeaway

Efficiency is not about being "fancier." It is about extracting maximum information from the data you already have. If you are using OHLC bars — which nearly every data provider delivers — you are throwing away ~93% of the available information by using close-to-close instead of Yang-Zhang.


[Figure: Relative Efficiency of OHLC Volatility Estimators — bar chart: Close-to-Close 1.0x, Parkinson ~5.0x, Garman-Klass ~7.4x, Yang-Zhang ~14.0x.]

Yang-Zhang extracts 14 times more information from the same daily OHLC data compared to close-to-close. A 5-day Yang-Zhang window matches the precision of a 70-day close-to-close window.

What FlashAlpha Uses

The FlashAlpha Volatility Analysis endpoint (/v1/volatility/{symbol}) returns realized volatility across five lookback windows: 5-day, 10-day, 20-day, 30-day, and 60-day. These are computed using the Yang-Zhang estimator, giving you a high-efficiency RV estimate that properly accounts for overnight gaps and intraday range.

from flashalpha import FlashAlphaClient

client = FlashAlphaClient(api_key="your_api_key")
vol = client.get_volatility("SPY")

print(f"SPY Realized Volatility (Yang-Zhang):")
print(f"  RV  5d:  {vol['realized_vol']['rv_5d']}%")
print(f"  RV 10d:  {vol['realized_vol']['rv_10d']}%")
print(f"  RV 20d:  {vol['realized_vol']['rv_20d']}%")
print(f"  RV 30d:  {vol['realized_vol']['rv_30d']}%")
print(f"  RV 60d:  {vol['realized_vol']['rv_60d']}%")
print(f"  ATM IV:  {vol['atm_iv']}%")
print(f"  VRP:     {vol['iv_rv_spreads']['vrp_20d']}%")

The VRP spread (vrp_20d) is computed as ATM IV minus 20-day Yang-Zhang RV. Because the RV estimate is high-efficiency, the VRP signal is more stable and responsive than if you used close-to-close — fewer false signals, faster regime detection.

Get Realized Volatility Across 5 Windows via API

Yang-Zhang RV, ATM IV, VRP spreads, and term structure — all from a single endpoint. Free tier: 10 requests/day.

View Plans and Pricing →

Python: Compute All Four Estimators from OHLC Data

Here is a complete implementation of all four estimators so you can compare them on your own data:

import numpy as np
import pandas as pd
import yfinance as yf

def close_to_close(df, window=20):
    """Standard close-to-close realized volatility."""
    log_returns = np.log(df['Close'] / df['Close'].shift(1))
    return log_returns.rolling(window).std() * np.sqrt(252) * 100

def parkinson(df, window=20):
    """Parkinson (1980) high-low range estimator."""
    log_hl = np.log(df['High'] / df['Low'])
    factor = 252 / (4 * window * np.log(2))
    return np.sqrt(factor * (log_hl ** 2).rolling(window).sum()) * 100

def garman_klass(df, window=20):
    """Garman-Klass (1980) OHLC estimator."""
    log_hl = np.log(df['High'] / df['Low'])
    log_co = np.log(df['Close'] / df['Open'])
    term1 = 0.5 * log_hl ** 2
    term2 = (2 * np.log(2) - 1) * log_co ** 2
    return np.sqrt((252 / window) * (term1 - term2).rolling(window).sum()) * 100

def yang_zhang(df, window=20):
    """Yang-Zhang (2000) estimator with overnight, intraday, and RS components."""
    log_oc = np.log(df['Open'] / df['Close'].shift(1))  # overnight
    log_co = np.log(df['Close'] / df['Open'])            # open-to-close

    # Rogers-Satchell component
    log_hc = np.log(df['High'] / df['Close'])
    log_ho = np.log(df['High'] / df['Open'])
    log_lc = np.log(df['Low'] / df['Close'])
    log_lo = np.log(df['Low'] / df['Open'])
    rs = log_hc * log_ho + log_lc * log_lo

    # Variances (rolling)
    k = 0.34 / (1.34 + (window + 1) / (window - 1))
    overnight_var = log_oc.rolling(window).var()
    close_var = log_co.rolling(window).var()
    rs_var = rs.rolling(window).mean()

    yz_var = overnight_var + k * close_var + (1 - k) * rs_var
    return np.sqrt(yz_var * 252) * 100

# --- Compare estimators on SPY ---
spy = yf.download("SPY", period="1y")

# Recent yfinance versions return MultiIndex columns even for a single
# ticker; flatten so that spy['Close'] is a Series, not a DataFrame
if isinstance(spy.columns, pd.MultiIndex):
    spy.columns = spy.columns.get_level_values(0)

results = pd.DataFrame({
    'Close-to-Close': close_to_close(spy, 20),
    'Parkinson': parkinson(spy, 20),
    'Garman-Klass': garman_klass(spy, 20),
    'Yang-Zhang': yang_zhang(spy, 20)
})

print(results.tail(10).round(2).to_string())
print(f"\nLatest 20-day estimates:")
for col in results.columns:
    print(f"  {col:20s}: {results[col].iloc[-1]:.2f}%")

You will typically observe that Parkinson and Garman-Klass run slightly below close-to-close (because they miss overnight gaps), while Yang-Zhang tracks close to close-to-close on average but with much less noise. The spread between estimators widens during earnings season and macro events — exactly when the overnight gap component matters most.


Practical Guidance: When to Use Which

There is no universally "best" estimator — the right choice depends on your use case, data quality, and the market structure of what you're trading.

Use Case | Recommended Estimator | Why
Quick sanity check | Close-to-Close | Everyone uses it, easy to verify, universal benchmark
Intraday-only assets (crypto, FX) | Parkinson or Garman-Klass | No overnight gaps in 24/7 markets; range estimators shine
VRP calculation (equities) | Yang-Zhang | Handles gaps, drift-independent, most efficient for daily OHLC
Short-window RV (5-10 days) | Yang-Zhang | Efficiency advantage is largest when observations are scarce
Long-window RV (60+ days) | Close-to-Close is acceptable | With 60+ observations, even an inefficient estimator converges
Backtesting with questionable data | Close-to-Close | Range estimators amplify bad highs/lows; CC is more robust to data errors

Data quality is critical for range estimators. Parkinson, Garman-Klass, and Yang-Zhang all depend on accurate high and low prices. If your data source reports erroneous spikes (a common issue with penny stocks and thinly traded ETFs), range-based estimators will produce wildly inflated volatility readings. Always validate your OHLC data before trusting range-based estimates.
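A single bad print is enough to wreck a range-based estimate. The sketch below uses 20 hypothetical days with a ~1% daily range, then injects one erroneous high:

```python
import numpy as np

# 20 hypothetical clean days with roughly a 1% daily high-low range
rng = np.random.default_rng(1)
highs = 100 * (1 + rng.uniform(0.004, 0.006, 20))
lows = 100 * (1 - rng.uniform(0.004, 0.006, 20))

def parkinson_vol(h, l):
    """Annualized Parkinson volatility from arrays of highs and lows."""
    return np.sqrt(252 * np.mean(np.log(h / l) ** 2) / (4 * np.log(2)))

clean = parkinson_vol(highs, lows)

# Inject one erroneous print: a single bad high of 150 (a ~50% spike)
bad_highs = highs.copy()
bad_highs[10] = 150.0
corrupted = parkinson_vol(bad_highs, lows)

print(f"Clean Parkinson RV:   {clean:.1%}")
print(f"With one bad high:    {corrupted:.1%}")
```

One corrupted high out of twenty days inflates the 20-day Parkinson reading severalfold, which is why validating OHLC data matters more for range estimators than for close-to-close.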

[Figure: Which Estimator Should You Use? — decision tree: only closing prices available → Close-to-Close; OHLC data with overnight gaps (equities) → Yang-Zhang; OHLC data with no gaps (crypto, FX) → Garman-Klass.]

Decision tree for choosing a realized volatility estimator. Yang-Zhang is the best default for equity options; Garman-Klass suits continuous markets; close-to-close works when you only have closing prices.


Why This Matters for VRP

The volatility risk premium is defined as:

Volatility Risk Premium $$ \text{VRP} = \sigma_{\text{implied}} - \sigma_{\text{realized}} $$

Your choice of RV estimator directly changes the VRP number you compute. Consider a concrete example:

  • ATM implied volatility: 28.4%
  • RV (Close-to-Close, 20d): 24.6%
  • RV (Parkinson, 20d): 21.1%
  • RV (Yang-Zhang, 20d): 25.8%

Same stock, same period, three different VRP readings: 3.8% (CC), 7.3% (Parkinson), and 2.6% (YZ). Parkinson says premium is rich. Yang-Zhang says it's modest. The difference is that Parkinson missed overnight gaps that added real volatility, making RV look artificially low and VRP look artificially high. Acting on the Parkinson signal could lead you to sell premium that isn't actually overpriced.
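The arithmetic behind those three readings, using the example figures above:

```python
iv = 28.4  # ATM implied volatility (%)
rv = {
    "Close-to-Close": 24.6,
    "Parkinson": 21.1,
    "Yang-Zhang": 25.8,
}

# VRP = implied minus realized, so each RV estimator implies a different premium
vrp = {name: round(iv - r, 1) for name, r in rv.items()}
for name, spread in vrp.items():
    print(f"VRP ({name}): {spread}%")
```

Same implied vol, same stock, and the "edge" you think you have ranges from 2.6% to 7.3% depending purely on the RV estimator.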

Bottom line

If you are computing VRP for equity options, Yang-Zhang is the right default. It captures overnight gaps (which are real risk), handles trending markets without bias, and gives you the most precise estimate from daily data. The FlashAlpha Volatility endpoint returns Yang-Zhang RV across five windows, plus the pre-computed VRP spread — so you don't have to build this yourself.


Frequently Asked Questions

What is the difference between realized volatility and implied volatility?

Realized volatility (RV) measures how much a stock actually moved over a past period, calculated from historical price data. Implied volatility (IV) is the market's forward-looking expectation of future volatility, extracted from option prices. The difference between them — IV minus RV — is the volatility risk premium (VRP), which represents the insurance premium embedded in options.
Why is Yang-Zhang better than close-to-close?

Yang-Zhang is approximately 14 times more efficient than close-to-close, meaning it produces equally accurate estimates with far fewer data points. It also explicitly handles overnight gaps (close-to-open returns) and is drift-independent, so it works correctly in trending markets. For equity options where overnight earnings gaps and macro events create real risk, Yang-Zhang captures volatility that close-to-close and Parkinson miss.
How much data do you need to estimate realized volatility?

It depends on the estimator. With close-to-close, you typically need 20-30 days for a reasonable estimate. With Yang-Zhang, 5-10 days can be sufficient due to its 14x efficiency advantage. For VRP calculations, the FlashAlpha API provides Yang-Zhang RV across five windows (5, 10, 20, 30, and 60 days) so you can assess volatility at multiple time scales without building the estimators yourself.
Are Parkinson and Garman-Klass suitable for crypto and FX?

Yes. Parkinson and Garman-Klass are excellent choices for 24/7 markets like crypto and most FX pairs because these markets have no overnight gaps. The main weakness of Parkinson — ignoring close-to-open jumps — is irrelevant when the market never closes. In these continuous markets, Parkinson's 5x efficiency advantage over close-to-close comes with no downside.
