Yang-Zhang vs Close-to-Close: Which Realized Volatility Estimator Should You Use? | FlashAlpha Research


Realized volatility is the backbone of every options pricing model and VRP strategy, but the estimator you choose can change your answer by 30% or more. This article compares four OHLC-based realized volatility estimators — close-to-close, Parkinson, Garman-Klass, and Yang-Zhang — with formulas, efficiency analysis, Python implementations, and practical guidance on which to use for volatility risk premium calculations.

Tomasz Dobrowolski
Quant Engineer
Mar 17, 2026 · 39 min read
RealizedVolatility YangZhang Parkinson VolatilityEstimation Quant

The Problem: Measuring "How Volatile" a Stock Has Been

Every options trader needs a number for realized volatility (RV). You need it to compute the volatility risk premium — the spread between implied and realized vol that drives premium-selling strategies. You need it to calibrate models, backtest systems, and assess whether options are cheap or expensive.

But realized volatility is not a single number. It depends entirely on how you measure it. The simplest estimator — close-to-close — throws away most of the information in your price data. More sophisticated estimators use open, high, low, and close (OHLC) prices to extract more signal from the same data. The question is: which one should you use?

Why this matters

If you compute VRP as IV minus RV, and your RV estimator is noisy or biased, your VRP signal is noisy or biased. A poor RV estimate can make you think options are overpriced when they're fairly valued, or vice versa. The estimator you choose directly affects your trading decisions.


Close-to-Close: The Baseline

The close-to-close estimator is the simplest and most widely used. It computes volatility as the standard deviation of log returns, annualized:

Close-to-Close Volatility $$ \sigma_{\text{CC}} = \sqrt{\frac{252}{n-1} \sum_{t=1}^{n} \left( \ln \frac{C_t}{C_{t-1}} - \bar{r} \right)^2 } $$

Where Ct is the closing price on day t, n is the number of observations, and r̄ is the mean log return. The factor of 252 annualizes the daily variance to a yearly figure.
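As a quick numeric illustration (the closing prices below are hypothetical), the formula reduces to a sample standard deviation of log returns scaled by √252:

```python
import numpy as np

# Hypothetical closing prices for six consecutive trading days
closes = np.array([100.0, 101.2, 100.5, 102.0, 101.1, 101.8])

# Daily log returns: ln(C_t / C_{t-1})
log_returns = np.diff(np.log(closes))

# Sample standard deviation (n-1 denominator), annualized with sqrt(252)
cc_vol = log_returns.std(ddof=1) * np.sqrt(252)
print(f"Close-to-close volatility: {cc_vol:.1%}")
```

Five modest daily moves already annualize to a double-digit volatility figure, which is why the √252 scaling surprises people the first time they see it.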

Pros
  • Simple to compute and understand
  • Only requires closing prices
  • Unbiased — converges to true vol with enough data
  • Universal baseline, everyone uses it
Cons
  • Ignores all intraday price action
  • Statistically inefficient — needs many observations to converge
  • A stock can swing 5% intraday and close flat — CC reports zero vol
  • High variance, especially over short windows
[Figure: What Close-to-Close Sees vs What It Misses — five trading days with close-to-close returns of -0.7%, +1.1%, -1.6%, and +1.2% marked as dots, overlaid on faded candlesticks showing the intraday high-low range that close-to-close discards.]

Close-to-close volatility only sees the blue dots and connecting line. The faded candlesticks show the intraday high-low range — information that is entirely discarded. A day can swing 5% intraday yet close flat, registering zero volatility.

The core limitation is efficiency. In statistics, efficiency measures how much information an estimator extracts from available data. Close-to-close uses exactly one data point per day (the close), discarding everything that happened between open and close. A stock that gaps up 3%, drops 5%, and recovers to close unchanged had a volatile day — but CC says nothing happened.


Parkinson: Using the High-Low Range

Parkinson (1980) recognized that the daily high-low range contains far more information about volatility than closing prices alone. His estimator:

Parkinson Volatility $$ \sigma_{\text{P}}^2 = \frac{252}{4n \ln 2} \sum_{t=1}^{n} \left( \ln \frac{H_t}{L_t} \right)^2 $$

Where Ht and Lt are the high and low prices on day t. The constant 1/(4 ln 2) comes from the theoretical relationship between range and volatility under geometric Brownian motion.

  • ~5x more efficient than close-to-close
  • 2 data points used per day (high, low)
  • Biased low — discrete sampling misses the true extremes

Parkinson is approximately 5 times more efficient than close-to-close, meaning it achieves the same estimation accuracy with one-fifth as many observations. A 10-day Parkinson estimate is roughly as accurate as a 50-day close-to-close estimate.

Parkinson ignores overnight gaps. It assumes continuous trading — the price path from yesterday's close to today's open doesn't exist in the model. If a stock gaps down 8% on earnings overnight and then trades in a 1% range during the day, Parkinson only sees the 1% range. This makes it systematically biased low for stocks with significant overnight moves.
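The gap blindness is easy to see with a single day of made-up OHLC numbers (the prices below are hypothetical, chosen to mimic an earnings gap):

```python
import numpy as np

# Hypothetical earnings gap: stock closes at 100, gaps down to open near 92,
# then trades in a narrow 92.5-91.6 range and closes at 92.2
prev_close, high, low, close = 100.0, 92.5, 91.6, 92.2

# Single-day Parkinson vol only sees the ~1% intraday range
park_daily_vol = np.log(high / low) / np.sqrt(4 * np.log(2))

# The close-to-close return captures the full ~8% move
cc_daily_move = abs(np.log(close / prev_close))

print(f"Parkinson daily vol:  {park_daily_vol:.2%}")
print(f"Close-to-close move:  {cc_daily_move:.2%}")
```

Parkinson reports well under 1% for a day on which the stock actually lost 8% — the entire move happened in the overnight session the estimator cannot see.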


Garman-Klass: Adding Open and Close

Garman and Klass (1980) extended Parkinson by incorporating open and close prices alongside the high-low range. Their estimator partially corrects for the overnight gap bias:

Garman-Klass Volatility $$ \sigma_{\text{GK}}^2 = \frac{252}{n} \sum_{t=1}^{n} \left[ \frac{1}{2} \left( \ln \frac{H_t}{L_t} \right)^2 - (2\ln 2 - 1) \left( \ln \frac{C_t}{O_t} \right)^2 \right] $$

Where Ot is the opening price. The first term captures the range-based information (similar to Parkinson), and the second term adjusts for the open-to-close return, which partially accounts for overnight gaps by anchoring the estimate to the open.

Garman-Klass is roughly 7-8 times more efficient than close-to-close. However, it still assumes that the opening price equals the previous close — a violation in any market with overnight sessions, pre-market trading, or earnings announcements.


Yang-Zhang: The Complete Estimator

Yang and Zhang (2000) developed an estimator that explicitly handles overnight gaps, intraday drift, and open-to-close variance. It combines three components:

Yang-Zhang Volatility $$ \sigma_{\text{YZ}}^2 = \sigma_o^2 + k \cdot \sigma_c^2 + (1-k) \cdot \sigma_{\text{RS}}^2 $$

Where:

  • σo² = variance of overnight returns (close-to-open): ln(Ot/Ct-1)
  • σc² = variance of open-to-close returns: ln(Ct/Ot)
  • σRS² = Rogers-Satchell variance (a range-based estimator that handles drift)
  • k = weighting constant, typically k = 0.34 / (1.34 + (n+1)/(n-1))
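The weight k changes only slowly with window length; a quick sketch of its values for the common lookbacks used in this article:

```python
# Yang-Zhang weight k = 0.34 / (1.34 + (n + 1) / (n - 1)) for common windows
for n in (5, 10, 20, 30, 60):
    k = 0.34 / (1.34 + (n + 1) / (n - 1))
    print(f"n = {n:2d}:  k = {k:.3f}")
```

For a 20-day window k is roughly 0.14, so most of the intraday weight goes to the Rogers-Satchell component rather than the raw open-to-close variance.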

The Rogers-Satchell component itself is:

Rogers-Satchell Variance $$ \sigma_{\text{RS}}^2 = \frac{1}{n} \sum_{t=1}^{n} \left[ \ln\frac{H_t}{C_t} \cdot \ln\frac{H_t}{O_t} + \ln\frac{L_t}{C_t} \cdot \ln\frac{L_t}{O_t} \right] $$

This three-component architecture is what makes Yang-Zhang unique. The overnight variance captures gap risk. The open-to-close variance captures intraday drift. The Rogers-Satchell component captures range-based volatility without drift bias. The weighted combination produces the minimum-variance unbiased estimator for daily OHLC data.

Yang-Zhang Strengths
  • Handles overnight gaps explicitly
  • Drift-independent (works in trending markets)
  • ~14x more efficient than close-to-close
  • Uses all four OHLC data points
  • Minimum-variance unbiased estimator for OHLC data
Yang-Zhang Limitations
  • More complex to implement correctly
  • Sensitive to window length — short windows can be noisy
  • Requires clean OHLC data (bad highs/lows corrupt the estimate)
  • Assumes independent, identically distributed returns

Efficiency Comparison

Efficiency measures how much variance reduction an estimator achieves relative to close-to-close. Higher efficiency means the estimator converges to the true volatility faster with fewer data points.

Estimator | Data Used | Relative Efficiency | Handles Gaps | Handles Drift
Close-to-Close | Close | 1.0x (baseline) | Yes (implicitly) | No
Parkinson | High, Low | ~5.2x | No | No
Garman-Klass | O, H, L, C | ~7.4x | Partial | No
Rogers-Satchell | O, H, L, C | ~8x | No | Yes
Yang-Zhang | O, H, L, C | ~14x | Yes | Yes

The practical implication: a 5-day Yang-Zhang estimate carries roughly the same statistical precision as a 70-day close-to-close estimate. For short-window VRP calculations — where you need a responsive RV measure — this difference is enormous.

These are theoretical efficiencies derived from the original papers under ideal conditions (continuous trading, no microstructure noise). Real-world efficiency depends on data quality, market microstructure, and lookback window. The ratios above (~5x Parkinson, ~7.4x Garman-Klass, ~14x Yang-Zhang) are useful as relative benchmarks, but actual variance reduction in live markets will vary.
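You can verify the ordering (though not the exact theoretical ratios) with a small Monte Carlo under the idealized assumptions — continuous GBM intraday paths, no overnight gaps, no drift. The discretized path below is a sketch, and with finite intraday steps Parkinson comes out slightly biased low, exactly as the theory predicts:

```python
import numpy as np

rng = np.random.default_rng(42)
true_daily_var = 0.01 ** 2            # 1% daily volatility
n_days, steps, trials = 20, 100, 2000

# Continuous log-price paths: no overnight gaps, no drift
incr = rng.normal(0.0, np.sqrt(true_daily_var / steps),
                  size=(trials, n_days * steps))
log_path = np.cumsum(incr, axis=1).reshape(trials, n_days, steps)

closes = log_path[:, :, -1]
highs = log_path.max(axis=2)
lows = log_path.min(axis=2)

# Per-trial close-to-close daily-variance estimate (19 returns per window)
cc_est = np.diff(closes, axis=1).var(axis=1, ddof=1)

# Per-trial Parkinson estimate (in log space, highs - lows = ln(H/L))
park_est = ((highs - lows) ** 2).mean(axis=1) / (4 * np.log(2))

print(f"CC        std of estimates: {cc_est.std():.2e}")
print(f"Parkinson std of estimates: {park_est.std():.2e}")
print(f"Variance ratio (CC / Parkinson): {cc_est.var() / park_est.var():.1f}x")
```

Across 2,000 simulated 20-day windows, the Parkinson estimates cluster far more tightly around the true daily variance than the close-to-close estimates — the efficiency gain is visible even in a toy simulation.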

Key takeaway

Efficiency is not about being "fancier." It is about extracting maximum information from the data you already have. If you are using OHLC bars — which nearly every data provider delivers — you are throwing away ~93% of the available information by using close-to-close instead of Yang-Zhang.


[Figure: Relative Efficiency of OHLC Volatility Estimators — bar chart: Close-to-Close 1.0x, Parkinson ~5.0x, Garman-Klass ~7.4x, Yang-Zhang ~14.0x.]

Yang-Zhang extracts 14 times more information from the same daily OHLC data compared to close-to-close. A 5-day Yang-Zhang window matches the precision of a 70-day close-to-close window.

What FlashAlpha Uses

The FlashAlpha Volatility Analysis endpoint (/v1/volatility/{symbol}) returns realized volatility across five lookback windows: 5-day, 10-day, 20-day, 30-day, and 60-day. These are computed using the Yang-Zhang estimator, giving you a high-efficiency RV estimate that properly accounts for overnight gaps and intraday range.

from flashalpha import FlashAlphaClient

client = FlashAlphaClient(api_key="your_api_key")
vol = client.get_volatility("SPY")

print(f"SPY Realized Volatility (Yang-Zhang):")
print(f"  RV  5d:  {vol['realized_vol']['rv_5d']}%")
print(f"  RV 10d:  {vol['realized_vol']['rv_10d']}%")
print(f"  RV 20d:  {vol['realized_vol']['rv_20d']}%")
print(f"  RV 30d:  {vol['realized_vol']['rv_30d']}%")
print(f"  RV 60d:  {vol['realized_vol']['rv_60d']}%")
print(f"  ATM IV:  {vol['atm_iv']}%")
print(f"  VRP:     {vol['iv_rv_spreads']['vrp_20d']}%")

The VRP spread (vrp_20d) is computed as ATM IV minus 20-day Yang-Zhang RV. Because the RV estimate is high-efficiency, the VRP signal is more stable and responsive than if you used close-to-close — fewer false signals, faster regime detection.

Get Realized Volatility Across 5 Windows via API

Yang-Zhang RV, ATM IV, VRP spreads, and term structure — all from a single endpoint. Free tier: 10 requests/day.

View Plans and Pricing →

Python: Compute All Four Estimators from OHLC Data

Here is a complete implementation of all four estimators so you can compare them on your own data:

import numpy as np
import pandas as pd
import yfinance as yf

def close_to_close(df, window=20):
    """Standard close-to-close realized volatility."""
    log_returns = np.log(df['Close'] / df['Close'].shift(1))
    return log_returns.rolling(window).std() * np.sqrt(252) * 100

def parkinson(df, window=20):
    """Parkinson (1980) high-low range estimator."""
    log_hl = np.log(df['High'] / df['Low'])
    factor = 252 / (4 * window * np.log(2))
    return np.sqrt(factor * (log_hl ** 2).rolling(window).sum()) * 100

def garman_klass(df, window=20):
    """Garman-Klass (1980) OHLC estimator."""
    log_hl = np.log(df['High'] / df['Low'])
    log_co = np.log(df['Close'] / df['Open'])
    term1 = 0.5 * log_hl ** 2
    term2 = (2 * np.log(2) - 1) * log_co ** 2
    return np.sqrt((252 / window) * (term1 - term2).rolling(window).sum()) * 100

def yang_zhang(df, window=20):
    """Yang-Zhang (2000) estimator with overnight, intraday, and RS components."""
    log_oc = np.log(df['Open'] / df['Close'].shift(1))  # overnight
    log_co = np.log(df['Close'] / df['Open'])            # open-to-close

    # Rogers-Satchell component
    log_hc = np.log(df['High'] / df['Close'])
    log_ho = np.log(df['High'] / df['Open'])
    log_lc = np.log(df['Low'] / df['Close'])
    log_lo = np.log(df['Low'] / df['Open'])
    rs = log_hc * log_ho + log_lc * log_lo

    # Variances (rolling)
    k = 0.34 / (1.34 + (window + 1) / (window - 1))
    overnight_var = log_oc.rolling(window).var()
    close_var = log_co.rolling(window).var()
    rs_var = rs.rolling(window).mean()

    yz_var = overnight_var + k * close_var + (1 - k) * rs_var
    return np.sqrt(yz_var * 252) * 100

# --- Compare estimators on SPY ---
spy = yf.download("SPY", period="1y")

# Recent yfinance versions return MultiIndex columns even for a single
# ticker; flatten so that spy['Close'] is a Series, not a DataFrame
if isinstance(spy.columns, pd.MultiIndex):
    spy.columns = spy.columns.get_level_values(0)

results = pd.DataFrame({
    'Close-to-Close': close_to_close(spy, 20),
    'Parkinson': parkinson(spy, 20),
    'Garman-Klass': garman_klass(spy, 20),
    'Yang-Zhang': yang_zhang(spy, 20)
})

print(results.tail(10).round(2).to_string())
print(f"\nLatest 20-day estimates:")
for col in results.columns:
    print(f"  {col:20s}: {results[col].iloc[-1]:.2f}%")

You will typically observe that Parkinson and Garman-Klass run slightly below close-to-close (because they miss overnight gaps), while Yang-Zhang tracks close to close-to-close on average but with much less noise. The spread between estimators widens during earnings season and macro events — exactly when the overnight gap component matters most.


Practical Guidance: When to Use Which

There is no universally "best" estimator — the right choice depends on your use case, data quality, and the market structure of what you're trading.

Use Case | Recommended Estimator | Why
Quick sanity check | Close-to-Close | Everyone uses it, easy to verify, universal benchmark
Intraday-only assets (crypto, FX) | Parkinson or Garman-Klass | No overnight gaps in 24/7 markets; range estimators shine
VRP calculation (equities) | Yang-Zhang | Handles gaps, drift-independent, most efficient for daily OHLC
Short-window RV (5-10 days) | Yang-Zhang | Efficiency advantage is largest when observations are scarce
Long-window RV (60+ days) | Close-to-Close is acceptable | With 60+ observations, even an inefficient estimator converges
Backtesting with questionable data | Close-to-Close | Range estimators amplify bad highs/lows; CC is more robust to data errors

Data quality is critical for range estimators. Parkinson, Garman-Klass, and Yang-Zhang all depend on accurate high and low prices. If your data source reports erroneous spikes (a common issue with penny stocks and thinly traded ETFs), range-based estimators will produce wildly inflated volatility readings. Always validate your OHLC data before trusting range-based estimates.
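A single bad print is enough to wreck a range-based estimate. The sketch below uses 20 hypothetical days with a ~1% daily range, then injects one erroneous high:

```python
import numpy as np

# 20 hypothetical clean days with roughly a 1% daily high-low range
rng = np.random.default_rng(1)
highs = 100 * (1 + rng.uniform(0.004, 0.006, 20))
lows = 100 * (1 - rng.uniform(0.004, 0.006, 20))

def parkinson_vol(h, l):
    """Annualized Parkinson volatility from arrays of highs and lows."""
    return np.sqrt(252 * np.mean(np.log(h / l) ** 2) / (4 * np.log(2)))

clean = parkinson_vol(highs, lows)

# Inject one erroneous print: a single bad high of 150 (a ~50% spike)
bad_highs = highs.copy()
bad_highs[10] = 150.0
corrupted = parkinson_vol(bad_highs, lows)

print(f"Clean Parkinson RV:   {clean:.1%}")
print(f"With one bad high:    {corrupted:.1%}")
```

One corrupted high out of twenty days inflates the 20-day Parkinson reading severalfold, which is why validating OHLC data matters more for range estimators than for close-to-close.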

[Figure: Which Estimator Should You Use? — decision tree: only closing prices available → Close-to-Close; OHLC data with overnight gaps (equities) → Yang-Zhang; OHLC data with no gaps (crypto, FX) → Garman-Klass.]

Decision tree for choosing a realized volatility estimator. Yang-Zhang is the best default for equity options; Garman-Klass suits continuous markets; close-to-close works when you only have closing prices.


Why This Matters for VRP

The volatility risk premium is defined as:

Volatility Risk Premium $$ \text{VRP} = \sigma_{\text{implied}} - \sigma_{\text{realized}} $$

Your choice of RV estimator directly changes the VRP number you compute. Consider a concrete example:

  • ATM implied volatility: 28.4%
  • RV (Close-to-Close, 20d): 24.6%
  • RV (Parkinson, 20d): 21.1%
  • RV (Yang-Zhang, 20d): 25.8%

Same stock, same period, three different VRP readings: 3.8% (CC), 7.3% (Parkinson), and 2.6% (YZ). Parkinson says premium is rich. Yang-Zhang says it's modest. The difference is that Parkinson missed overnight gaps that added real volatility, making RV look artificially low and VRP look artificially high. Acting on the Parkinson signal could lead you to sell premium that isn't actually overpriced.
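The arithmetic behind those three readings, using the example figures above:

```python
iv = 28.4  # ATM implied volatility (%)
rv = {
    "Close-to-Close": 24.6,
    "Parkinson": 21.1,
    "Yang-Zhang": 25.8,
}

# VRP = implied minus realized, so each RV estimator implies a different premium
vrp = {name: round(iv - r, 1) for name, r in rv.items()}
for name, spread in vrp.items():
    print(f"VRP ({name}): {spread}%")
```

Same implied vol, same stock, and the "edge" you think you have ranges from 2.6% to 7.3% depending purely on the RV estimator.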

Bottom line

If you are computing VRP for equity options, Yang-Zhang is the right default. It captures overnight gaps (which are real risk), handles trending markets without bias, and gives you the most precise estimate from daily data. The FlashAlpha Volatility endpoint returns Yang-Zhang RV across five windows, plus the pre-computed VRP spread — so you don't have to build this yourself.


Frequently Asked Questions

What is the difference between realized volatility and implied volatility?

Realized volatility (RV) measures how much a stock actually moved over a past period, calculated from historical price data. Implied volatility (IV) is the market's forward-looking expectation of future volatility, extracted from option prices. The difference between them — IV minus RV — is the volatility risk premium (VRP), which represents the insurance premium embedded in options.
Why is Yang-Zhang better than close-to-close?

Yang-Zhang is approximately 14 times more efficient than close-to-close, meaning it produces equally accurate estimates with far fewer data points. It also explicitly handles overnight gaps (close-to-open returns) and is drift-independent, so it works correctly in trending markets. For equity options where overnight earnings gaps and macro events create real risk, Yang-Zhang captures volatility that close-to-close and Parkinson miss.
How much data do you need to estimate realized volatility?

It depends on the estimator. With close-to-close, you typically need 20-30 days for a reasonable estimate. With Yang-Zhang, 5-10 days can be sufficient due to its 14x efficiency advantage. For VRP calculations, the FlashAlpha API provides Yang-Zhang RV across five windows (5, 10, 20, 30, and 60 days) so you can assess volatility at multiple time scales without building the estimators yourself.
Are Parkinson and Garman-Klass suitable for crypto and FX?

Yes. Parkinson and Garman-Klass are excellent choices for 24/7 markets like crypto and most FX pairs because these markets have no overnight gaps. The main weakness of Parkinson — ignoring close-to-open jumps — is irrelevant when the market never closes. In these continuous markets, Parkinson's 5x efficiency advantage over close-to-close comes with no downside.
