
Variance Calculator – Population & Sample Variance

Calculate variance and standard deviation for a data set. Supports population and sample variance. Free online statistics calculator for instant results.


What Is Variance?

Variance measures the spread of a dataset — how far the values are from the mean. A low variance means data points cluster near the mean; a high variance means they are spread out widely.

Variance is calculated as the average of squared differences from the mean:

Population variance (σ²): σ² = Σ(xᵢ − μ)² / N
Sample variance (s²): s² = Σ(xᵢ − x̄)² / (N−1)

Where xᵢ is each data point, μ (or x̄) is the mean, and N is the number of values. The standard deviation is simply the square root of variance — it is in the same units as the original data, making it more interpretable.

Why do we square the differences? Two reasons: (1) squaring eliminates negative values so that deviations above and below the mean don't cancel out, and (2) squaring gives disproportionate weight to outliers, making variance sensitive to extreme values. This property is both a strength (outlier detection) and a weakness (outlier sensitivity). For data with extreme outliers, consider using the median absolute deviation (MAD) as a more robust alternative.
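The outlier sensitivity described above can be seen in a minimal Python sketch that compares sample variance with the median absolute deviation. The helper name `median_abs_deviation` and the added outlier value of 100 are illustrative assumptions, not part of the calculator.

```python
import statistics

def median_abs_deviation(data):
    """Median of the absolute deviations from the median (unscaled)."""
    med = statistics.median(data)
    return statistics.median([abs(x - med) for x in data])

clean = [4, 7, 13, 2, 8]
with_outlier = clean + [100]  # hypothetical extreme value

var_clean = statistics.variance(clean)
var_outlier = statistics.variance(with_outlier)
mad_clean = median_abs_deviation(clean)
mad_outlier = median_abs_deviation(with_outlier)

print("variance:", var_clean, "->", round(var_outlier, 1))
print("median abs. deviation:", mad_clean, "->", mad_outlier)
```

A single extreme value inflates the variance many times over, while the median absolute deviation barely moves.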

Population vs. Sample Variance

The key difference is the denominator: N for a population versus N−1 for a sample. Dividing by N−1 is known as Bessel's correction:

Type	Denominator	Use When	Symbol
Population Variance	N	You have data on the entire population	σ²
Sample Variance	N−1	You have a sample from a larger population	s²

In practice, most real-world data is a sample. Using N−1 (sample variance) produces an unbiased estimate of the true population variance. Using N (population variance) on a sample systematically underestimates the true variance.

Example: Testing a new drug on 50 patients means using sample variance (s²). Analyzing all students in a classroom means using population variance (σ²).

Why does Bessel's correction work? Calculating the sample mean uses up one "degree of freedom": the mean is computed from the data itself, so the N deviations from it must sum to zero and only N−1 of them are free to vary. The sample mean also sits closer to the sample's own values than the true population mean does, so the squared deviations come out slightly too small on average. Dividing by (N−1) instead of N compensates for this, producing an unbiased estimator of the population variance. As N grows large, the difference between N and N−1 becomes negligible.
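One way to check the unbiasedness claim without random simulation is to enumerate every possible sample. The sketch below treats the five-value data set used later on this page as the whole population and assumes samples of size 3 drawn with replacement. The function name `avg_variance_estimate` and the sample size are illustrative choices.

```python
from itertools import product
from statistics import mean

population = [4, 7, 13, 2, 8]
mu = mean(population)
pop_var = sum((x - mu) ** 2 for x in population) / len(population)

def avg_variance_estimate(n, ddof):
    """Average the variance estimate over every ordered sample of size n
    drawn with replacement from `population`."""
    estimates = []
    for sample in product(population, repeat=n):
        m = sum(sample) / n
        ss = sum((x - m) ** 2 for x in sample)
        estimates.append(ss / (n - ddof))
    return sum(estimates) / len(estimates)

biased = avg_variance_estimate(3, ddof=0)    # divide by n
unbiased = avg_variance_estimate(3, ddof=1)  # divide by n - 1

print(pop_var, unbiased, biased)
```

Averaged over all samples, the N−1 version should match the population variance, while the N version should come out at two thirds of it for samples of size 3.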

Step-by-Step Variance Calculation

Given the data set: 4, 7, 13, 2, 8

Calculate the mean: (4+7+13+2+8) ÷ 5 = 34/5 = 6.8
Find deviations from mean: (4−6.8)=−2.8; (7−6.8)=0.2; (13−6.8)=6.2; (2−6.8)=−4.8; (8−6.8)=1.2
Square the deviations: 7.84; 0.04; 38.44; 23.04; 1.44
Sum of squares: 7.84+0.04+38.44+23.04+1.44 = 70.8
Population variance: 70.8 ÷ 5 = 14.16
Sample variance: 70.8 ÷ 4 = 17.7
Standard deviation: √14.16 = 3.76 (population) or √17.7 = 4.21 (sample)
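The steps above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library; the variable names are arbitrary.

```python
import math

data = [4, 7, 13, 2, 8]
n = len(data)

mean = sum(data) / n                      # mean of the data
deviations = [x - mean for x in data]     # deviations from the mean
squared = [d ** 2 for d in deviations]    # squared deviations
sum_sq = sum(squared)                     # sum of squares
pop_var = sum_sq / n                      # population variance (divide by N)
sample_var = sum_sq / (n - 1)             # sample variance (divide by N - 1)
pop_sd = math.sqrt(pop_var)
sample_sd = math.sqrt(sample_var)

print(round(pop_var, 2), round(sample_var, 2))
print(round(pop_sd, 2), round(sample_sd, 2))
```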

Shortcut Formula for Variance

There is an equivalent "computational" formula that avoids calculating deviations explicitly, useful when computing by hand or in spreadsheets:

σ² = (Σxᵢ²)/N − (Σxᵢ/N)² = (Σxᵢ² − (Σxᵢ)²/N) / N

For sample variance: s² = (Σxᵢ² − (Σxᵢ)²/N) / (N−1)

Using our example data (4, 7, 13, 2, 8):

Σxᵢ = 34, so (Σxᵢ)² = 1,156
Σxᵢ² = 16 + 49 + 169 + 4 + 64 = 302
Population variance = (302 − 1156/5) / 5 = (302 − 231.2) / 5 = 70.8 / 5 = 14.16 ✓
Sample variance = 70.8 / 4 = 17.7 ✓

This formula is algebraically equivalent to the definition, but it can lose precision in floating-point arithmetic when the values are large relative to their spread, because it subtracts two nearly equal large numbers (catastrophic cancellation). For numerical stability, software implementations generally prefer Welford's online algorithm, which processes one value at a time.
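A minimal sketch of Welford's algorithm, set against the shortcut formula on data with a large offset, might look like this. The function names and the 1e9 offset are illustrative assumptions.

```python
def welford(values):
    """Return (mean, population variance, sample variance) in one pass."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean
        mean += delta / count
        m2 += delta * (x - mean)  # uses both the old and the updated mean
    if count < 2:
        raise ValueError("need at least two values for sample variance")
    return mean, m2 / count, m2 / (count - 1)

def shortcut_sample_variance(values):
    """Computational formula; prone to cancellation for large offsets."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)

small = [4, 7, 13, 2, 8]
shifted = [x + 1e9 for x in small]  # same spread, huge offset

print(welford(small))
print(welford(shifted)[2], shortcut_sample_variance(shifted))
```

Adding a constant to every value does not change the variance, so both calls on `shifted` should ideally agree with the result for `small`. In double precision the shortcut formula typically does not.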

Related Statistical Measures

Variance is one of several measures of spread. Each has different strengths:

Measure	Formula	Units	Robustness to Outliers	Best For
Variance (σ² or s²)	Avg. of squared deviations	Squared units	Low — very sensitive	Theoretical statistics, ANOVA
Standard Deviation (σ or s)	√Variance	Same as data	Low	Reporting spread in original units
Range	Max − Min	Same as data	Very low	Quick check, small samples
Interquartile Range (IQR)	Q3 − Q1	Same as data	High	Skewed distributions, box plots
Mean Absolute Deviation (MAD)	Avg. of |xᵢ − mean|	Same as data	Moderate	Intuitive measure of spread
Coefficient of Variation (CV)	(SD / Mean) × 100%	Percentage	Low	Comparing spread across different scales

For normal (bell-curve) distributions, the standard deviation has a special interpretation: approximately 68% of data falls within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD. This is the empirical rule (68-95-99.7 rule).
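The percentages in the empirical rule follow from the normal distribution's cumulative probabilities. As a small sketch, the coverage within ±k standard deviations can be computed with the standard library's error function; the helper name `normal_coverage` is an assumption for illustration.

```python
import math

def normal_coverage(k):
    """Fraction of a normal distribution within ±k standard deviations."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(k, round(100 * normal_coverage(k), 1))
```

This gives roughly 68.3%, 95.4% and 99.7%, which the rule rounds to 68-95-99.7.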

Variance in Spreadsheets and Programming

Most tools have built-in variance functions. Make sure you choose the correct version (population vs. sample):

Tool	Sample Variance	Population Variance
Excel / Google Sheets	`VAR.S(range)` or `VAR(range)`	`VAR.P(range)` or `VARP(range)`
Python (NumPy)	`np.var(data, ddof=1)`	`np.var(data)`
Python (statistics)	`statistics.variance(data)`	`statistics.pvariance(data)`
R	`var(x)`	`var(x) * (length(x) - 1) / length(x)`
JavaScript	Manual calculation (no built-in)	Manual calculation
SQL (PostgreSQL)	`VAR_SAMP(column)`	`VAR_POP(column)`
MATLAB	`var(x)`	`var(x, 1)`

Note: Python's NumPy defaults to population variance (ddof=0), while R's var() defaults to sample variance. This is a common source of confusion when comparing results across languages.
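To make the distinction concrete, here is a short sketch using Python's built-in `statistics` module (chosen so the example needs no third-party packages). It also shows the conversion used in the R row of the table.

```python
import statistics

data = [4, 7, 13, 2, 8]
n = len(data)

sample_var = statistics.variance(data)   # divides by n - 1
pop_var = statistics.pvariance(data)     # divides by n

# Converting sample variance to population variance by rescaling
pop_from_sample = sample_var * (n - 1) / n

print(sample_var, pop_var, pop_from_sample)
```

If two tools disagree on the same data by a factor of (n − 1)/n, mismatched defaults are a likely cause.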

Practical Applications of Variance

Field	Application	Example
Finance	Investment risk	High variance = more volatile stock returns
Manufacturing	Quality control	Low variance = consistent product dimensions
Medicine	Clinical trials	Measuring variability in patient responses
Sports science	Performance analysis	Variability in athlete performance over season
Education	Test score analysis	Understanding spread of student performance

Variance in Finance: Portfolio Risk

In finance, variance and standard deviation measure investment risk. Higher variance means returns fluctuate more — the investment is riskier. Harry Markowitz's Modern Portfolio Theory (1952, Nobel Prize 1990) uses variance as the central risk measure.

For a portfolio of two assets, the combined variance depends on individual variances and the correlation between assets:

σ²_portfolio = w₁²σ₁² + w₂²σ₂² + 2·w₁·w₂·σ₁·σ₂·ρ₁₂

Where w = weight, σ² = variance, and ρ = correlation. When ρ < 1 (assets don't move in perfect lockstep), the portfolio standard deviation is less than the weighted average of the individual standard deviations. This is the mathematical basis of diversification: combining imperfectly correlated assets reduces overall risk without proportionally reducing expected return.

Asset Class (2000–2023)	Annualized Return	Annualized SD (Volatility)
US Large Cap (S&P 500)	~7.5%	~15%
US Small Cap (Russell 2000)	~7.0%	~20%
International Developed (EAFE)	~4.5%	~17%
US Bonds (Aggregate)	~4.0%	~4%
Gold	~8.0%	~16%

A portfolio combining stocks and bonds typically has a standard deviation significantly lower than stocks alone, while still capturing most of the equity return premium.
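As a rough sketch of the two-asset formula, the code below uses the approximate stock and bond volatilities from the table. The 60/40 weighting and the correlation values are hypothetical and chosen only to show the effect; they are not historical estimates.

```python
import math

def portfolio_sd(w1, sd1, sd2, rho):
    """Standard deviation of a two-asset portfolio; w2 = 1 - w1."""
    w2 = 1.0 - w1
    variance = (w1 * sd1) ** 2 + (w2 * sd2) ** 2 + 2 * w1 * w2 * sd1 * sd2 * rho
    return math.sqrt(variance)

stock_sd, bond_sd = 0.15, 0.04   # approximate values from the table above
w_stock = 0.60                   # hypothetical 60/40 split

for rho in (1.0, 0.5, 0.0, -0.5):  # hypothetical correlations
    print(rho, round(portfolio_sd(w_stock, stock_sd, bond_sd, rho), 4))
```

At ρ = 1 the result equals the weighted average of the two standard deviations (10.6%). Every lower correlation gives a smaller figure.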

Variance in Quality Control (Six Sigma)

Manufacturing uses variance to control product quality. The Six Sigma methodology, developed by Motorola in the 1980s, aims to reduce process variance until virtually no products fall outside specification limits.

Sigma Level	Defects per Million (DPMO)	Yield	Process Capability (Cpk)
1σ	691,462	30.9%	0.33
2σ	308,538	69.1%	0.67
3σ	66,807	93.3%	1.00
4σ	6,210	99.38%	1.33
5σ	233	99.977%	1.67
6σ	3.4	99.99966%	2.00

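A process operating at 6σ produces only 3.4 defects per million opportunities; the DPMO figures above follow the usual Six Sigma convention of allowing a 1.5σ long-term drift in the process mean. The process capability index Cpk directly relates to variance: Cpk = min(USL − μ, μ − LSL) / (3σ), where USL and LSL are the upper and lower specification limits. When only an upper limit applies, this reduces to (USL − μ) / (3σ). Reducing variance (through better machines, training, or materials) increases Cpk and pushes the process toward Six Sigma quality.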
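The DPMO column can be reproduced from the normal distribution. The sketch below assumes the conventional 1.5σ long-term shift and counts defects in one tail only; the function name `dpmo` and its `shift` parameter are illustrative.

```python
from statistics import NormalDist

def dpmo(sigma_level, shift=1.5):
    """Defects per million opportunities, one-sided, with a long-term shift."""
    defect_rate = 1.0 - NormalDist().cdf(sigma_level - shift)
    return defect_rate * 1_000_000

for level in range(1, 7):
    print(level, round(dpmo(level), 1))
```

Setting `shift=0` gives the much smaller one-sided tail probabilities of an unshifted process, which is why the convention matters when comparing published tables.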

Worked Examples from Different Fields

These real-world examples show how variance is calculated and interpreted in practice:

Example 1: Stock Return Volatility

Monthly returns for a stock over 6 months: +3.2%, −1.5%, +4.8%, −0.7%, +2.1%, +1.6%

Mean = (3.2−1.5+4.8−0.7+2.1+1.6) / 6 = 9.5/6 = 1.583%
Deviations: 1.617, −3.083, 3.217, −2.283, 0.517, 0.017
Squared: 2.614, 9.507, 10.347, 5.214, 0.267, 0.0003
Sum of squares = 27.948
Sample variance = 27.948/5 = 5.590 (%²)
Standard deviation = √5.590 = 2.364% per month
Annualized volatility ≈ 2.364% × √12 = 8.19%

This stock has moderate volatility. The S&P 500 historically has ~15% annualized volatility, so this stock is roughly half as volatile as the broad market.
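The same calculation can be sketched with Python's `statistics` module. Scaling by √12 assumes that returns in different months are uncorrelated, which is a common simplification rather than a guarantee.

```python
import math
import statistics

monthly_returns = [3.2, -1.5, 4.8, -0.7, 2.1, 1.6]  # percent

monthly_sd = statistics.stdev(monthly_returns)   # sample SD (divides by n - 1)
annualized = monthly_sd * math.sqrt(12)          # square-root-of-time scaling

print(round(monthly_sd, 3), round(annualized, 2))
```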

Example 2: Manufacturing Quality Control

A factory produces bolts with target length 50.00 mm. A sample of 8 bolts measures: 50.02, 49.98, 50.05, 49.97, 50.01, 50.03, 49.99, 50.00 mm.

Mean = 400.05/8 = 50.00625 mm
Sample variance = 0.0049875/7 = 0.000713 mm²
Standard deviation = 0.0267 mm
With spec limits of 50.00 ± 0.10 mm, the mean sits closer to the upper limit, so Cpk = (50.10 − 50.006) / (3 × 0.0267) = 1.17

A Cpk of 1.17 means the process is capable but has little margin. A common industry target is Cpk ≥ 1.33 (4σ capability), so this process needs tighter control to achieve that level.
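For readers who want to recompute this from the raw measurements, here is a minimal sketch. It takes the distance from the mean to the nearer specification limit and uses the sample standard deviation; the function name `cpk` and its argument names are illustrative.

```python
import statistics

def cpk(measurements, lsl, usl):
    """Process capability index from sample data and spec limits."""
    mean = statistics.mean(measurements)
    sd = statistics.stdev(measurements)  # sample SD (divides by n - 1)
    return min(usl - mean, mean - lsl) / (3 * sd)

bolts = [50.02, 49.98, 50.05, 49.97, 50.01, 50.03, 49.99, 50.00]
print(round(cpk(bolts, lsl=49.90, usl=50.10), 2))
```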

Example 3: Student Test Scores

A class of 10 students scores: 72, 85, 90, 68, 77, 95, 83, 79, 88, 73 on an exam.

Mean = 810/10 = 81.0
Population variance (entire class) = 680/10 = 68.0
Standard deviation = √68.0 = 8.25
Coefficient of variation = 8.25/81.0 × 100% = 10.2%

A CV of 10.2% indicates moderate spread: most students performed within a reasonable range of the mean. If CV exceeded 25%, the instructor might investigate whether the test had questions that were too difficult for some students or whether there was a bimodal distribution (two distinct groups).
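A short sketch for recomputing these figures from the raw scores follows. It uses the population standard deviation because the class is treated as the entire group of interest.

```python
import statistics

scores = [72, 85, 90, 68, 77, 95, 83, 79, 88, 73]

mean_score = statistics.mean(scores)
pop_sd = statistics.pstdev(scores)        # whole class, so divide by N
cv_percent = pop_sd / mean_score * 100    # coefficient of variation

print(mean_score, round(pop_sd, 2), round(cv_percent, 1))
```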

Common Mistakes When Calculating Variance

Avoid these frequent errors:

Mistake	Why It's Wrong	Correction
Using N instead of N−1 for samples	Underestimates true population variance	Use N−1 for any data that's a sample from a larger population
Averaging absolute deviations instead of squaring	Gives the mean absolute deviation, not variance	Square each deviation, then average. Take √ for standard deviation
Forgetting to square before averaging	Positive and negative deviations cancel out, giving exactly 0	Always square deviations first
Comparing variance across different scales	Variance depends on units; $² ≠ kg²	Use coefficient of variation (CV) for cross-scale comparison
Assuming variance = standard deviation	Variance is SD²; units are squared	Take the square root of variance to get SD

ANOVA: Comparing Variance Across Groups

Analysis of Variance (ANOVA) is a statistical test that compares means of multiple groups by analyzing variance. Despite the name, it tests whether group means differ, not whether variances differ.

ANOVA partitions total variance into two components:

Between-group variance: How much group means differ from the overall mean
Within-group variance: How much individual values vary within each group

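The F-statistic = Between-group variance / Within-group variance, where each variance is a sum of squares divided by its degrees of freedom. A large F means the group means differ more than the within-group scatter would explain by chance. If F exceeds the critical value (equivalently, if the p-value falls below the chosen significance level, commonly 0.05), at least one group mean is significantly different.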

Example: Comparing test scores of students taught by three different methods. ANOVA tells you whether the teaching method matters; post-hoc tests (Tukey, Bonferroni) tell you which methods differ.
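The partition into between-group and within-group variance can be sketched directly. The scores below are hypothetical, and the function name `one_way_anova_f` is an illustrative choice; a statistics package would also report the p-value.

```python
def one_way_anova_f(groups):
    """F-statistic for a one-way ANOVA over a list of groups."""
    all_values = [x for g in groups for x in g]
    grand_mean = sum(all_values) / len(all_values)

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)

# Hypothetical scores for three teaching methods
method_a = [70, 75, 80]
method_b = [75, 80, 85]
method_c = [90, 95, 100]

f_stat = one_way_anova_f([method_a, method_b, method_c])
print(round(f_stat, 2))
```

Here the between-group mean square is 325 and the within-group mean square is 25, giving F = 13.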

💡 Did you know?

The term "variance" was coined by Ronald Fisher in his 1918 paper on the correlation between relatives under Mendelian inheritance.
In finance, variance is the basis of Modern Portfolio Theory. A portfolio's variance depends not just on individual asset variance but on correlations between assets.
The coefficient of variation (CV = standard deviation / mean × 100%) allows comparing variability across datasets with different units or scales.
Chebyshev's inequality guarantees that for any distribution (not just normal), at least 75% of data falls within ±2 standard deviations and at least 89% within ±3 standard deviations. This is weaker than the empirical rule but applies universally.
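The Chebyshev figures quoted above come from the bound 1 − 1/k². A tiny sketch, with an illustrative function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of any distribution within ±k SDs (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, round(100 * chebyshev_lower_bound(k), 1))
```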

Frequently Asked Questions

What is the difference between variance and standard deviation?

Variance is the average of squared deviations from the mean; standard deviation is its square root. Standard deviation is in the same units as the original data (e.g., dollars, kg, seconds), making it more interpretable. Variance is useful in mathematical operations (variances of independent variables add directly), while standard deviation is better for describing spread to a non-technical audience.

When should I use sample vs. population variance?

Use population variance when your data contains every member of the group you're analyzing (e.g., all employees in one company). Use sample variance when your data is a subset of a larger group (e.g., a survey of 500 voters to estimate all voters' opinions). In most real-world research and statistics, sample variance is appropriate.

Can variance be negative?

No. Variance is always zero or positive because it is calculated from squared values. Variance = 0 only when all data points are identical (no spread). A negative variance is mathematically impossible and indicates a calculation error.

What is a "high" or "low" variance?

High and low are relative to the scale and context of the data. A variance of 10 is "low" for human heights in cm but "high" for heights in meters. The coefficient of variation (SD / mean × 100%) is scale-independent and allows comparison across different datasets. In quality control, specifications define acceptable variance ranges for each measurement.

How does variance relate to the normal distribution?

The normal (Gaussian) distribution is fully described by just two parameters: the mean (μ) and the variance (σ²). The familiar bell curve is wider when variance is large and narrower when variance is small. For normal data, the empirical rule holds: 68.3% within ±1σ, 95.4% within ±2σ, and 99.7% within ±3σ. Many statistical tests (t-test, ANOVA, regression) assume data follows a normal distribution or that sample means are approximately normal (via the Central Limit Theorem).

What is pooled variance?

Pooled variance is a weighted average of sample variances from two or more groups, used in the two-sample t-test when you assume equal variances across groups. The formula is: s²_pooled = [(n₁−1)s₁² + (n₂−1)s₂²] / (n₁ + n₂ − 2). This produces a single variance estimate that incorporates information from both samples, increasing statistical power when the equal-variance assumption is valid.
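The pooled variance formula translates directly into code. The two groups below are hypothetical data, and `pooled_variance` is an illustrative name.

```python
import statistics

def pooled_variance(sample1, sample2):
    """Pooled sample variance of two groups (equal-variance assumption)."""
    n1, n2 = len(sample1), len(sample2)
    s1_sq = statistics.variance(sample1)
    s2_sq = statistics.variance(sample2)
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

group_a = [1, 2, 3, 4]   # hypothetical data
group_b = [2, 4, 6]      # hypothetical data
print(pooled_variance(group_a, group_b))
```

Each group's variance is weighted by its degrees of freedom, so the larger group has more influence on the pooled estimate.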
