Skip to main content
🔬 Advanced

Variance Calculator – Population & Sample Variance

Calculate variance and standard deviation for a data set. Supports population and sample variance. Free online statistics calculator for instant results.

★★★★★ 4.8/5 · 📊 0 calculations · 🔒 Private & free

What Is Variance?

Variance measures the spread of a dataset — how far the values are from the mean. A low variance means data points cluster near the mean; a high variance means they are spread out widely.

Variance is calculated as the average of squared differences from the mean:

Where xᵢ is each data point, μ (or x̄) is the mean, and N is the number of values. The standard deviation is simply the square root of variance — it is in the same units as the original data, making it more interpretable.

Why do we square the differences? Two reasons: (1) squaring eliminates negative values so that deviations above and below the mean don't cancel out, and (2) squaring gives disproportionate weight to outliers, making variance sensitive to extreme values. This property is both a strength (outlier detection) and a weakness (outlier sensitivity). For data with extreme outliers, consider using the median absolute deviation (MAD) as a more robust alternative.

Population vs. Sample Variance

The key difference is the denominator — N vs. (N−1) — known as Bessel's correction:

TypeDenominatorUse WhenSymbol
Population VarianceNYou have data on the entire populationσ²
Sample VarianceN−1You have a sample from a larger population

In practice, most real-world data is a sample. Using N−1 (sample variance) produces an unbiased estimate of the true population variance. Using N (population variance) on a sample systematically underestimates the true variance.

Example: Testing a new drug on 50 patients means using sample variance (s²). Analyzing all students in a classroom means using population variance (σ²).

Why does Bessel's correction work? When you calculate the sample mean, you use one "degree of freedom" — the mean is computed from the data itself, so the deviations from the mean are not fully independent. Dividing by (N−1) instead of N compensates for this loss of one degree of freedom, producing an unbiased estimator of the population variance. As N grows large, the difference between N and N−1 becomes negligible.

Step-by-Step Variance Calculation

Given the data set: 4, 7, 13, 2, 8

  1. Calculate the mean: (4+7+13+2+8) ÷ 5 = 34/5 = 6.8
  2. Find deviations from mean: (4−6.8)=−2.8; (7−6.8)=0.2; (13−6.8)=6.2; (2−6.8)=−4.8; (8−6.8)=1.2
  3. Square the deviations: 7.84; 0.04; 38.44; 23.04; 1.44
  4. Sum of squares: 7.84+0.04+38.44+23.04+1.44 = 70.8
  5. Population variance: 70.8 ÷ 5 = 14.16
  6. Sample variance: 70.8 ÷ 4 = 17.7
  7. Standard deviation: √14.16 = 3.76 (population) or √17.7 = 4.21 (sample)

Shortcut Formula for Variance

There is an equivalent "computational" formula that avoids calculating deviations explicitly, useful when computing by hand or in spreadsheets:

σ² = (Σxᵢ²)/N − (Σxᵢ/N)² = (Σxᵢ² − (Σxᵢ)²/N) / N

For sample variance: s² = (Σxᵢ² − (Σxᵢ)²/N) / (N−1)

Using our example data (4, 7, 13, 2, 8):

  1. Σxᵢ = 34, so (Σxᵢ)² = 1,156
  2. Σxᵢ² = 16 + 49 + 169 + 4 + 64 = 302
  3. Population variance = (302 − 1156/5) / 5 = (302 − 231.2) / 5 = 70.8 / 5 = 14.16
  4. Sample variance = 70.8 / 4 = 17.7

This formula is numerically identical but can suffer from floating-point precision issues when values are very large. For computational stability, Welford's online algorithm (which processes one value at a time) is preferred in software implementations.

Related Statistical Measures

Variance is one of several measures of spread. Each has different strengths:

MeasureFormulaUnitsRobustness to OutliersBest For
Variance (σ² or s²)Avg. of squared deviationsSquared unitsLow — very sensitiveTheoretical statistics, ANOVA
Standard Deviation (σ or s)√VarianceSame as dataLowReporting spread in original units
RangeMax − MinSame as dataVery lowQuick check, small samples
Interquartile Range (IQR)Q3 − Q1Same as dataHighSkewed distributions, box plots
Mean Absolute Deviation (MAD)Avg. of |xᵢ − mean|Same as dataModerateIntuitive measure of spread
Coefficient of Variation (CV)(SD / Mean) × 100%PercentageLowComparing spread across different scales

For normal (bell-curve) distributions, the standard deviation has a special interpretation: approximately 68% of data falls within ±1 SD of the mean, 95% within ±2 SD, and 99.7% within ±3 SD. This is the empirical rule (68-95-99.7 rule).

Variance in Spreadsheets and Programming

Most tools have built-in variance functions. Make sure you choose the correct version (population vs. sample):

ToolSample VariancePopulation Variance
Excel / Google SheetsVAR.S(range) or VAR(range)VAR.P(range) or VARP(range)
Python (NumPy)np.var(data, ddof=1)np.var(data)
Python (statistics)statistics.variance(data)statistics.pvariance(data)
Rvar(x)var(x) * (n-1)/n
JavaScriptManual calculation (no built-in)Manual calculation
SQL (PostgreSQL)VAR_SAMP(column)VAR_POP(column)
MATLABvar(x)var(x, 1)

Note: Python's NumPy defaults to population variance (ddof=0), while R's var() defaults to sample variance. This is a common source of confusion when comparing results across languages.

Practical Applications of Variance

FieldApplicationExample
FinanceInvestment riskHigh variance = more volatile stock returns
ManufacturingQuality controlLow variance = consistent product dimensions
MedicineClinical trialsMeasuring variability in patient responses
Sports sciencePerformance analysisVariability in athlete performance over season
EducationTest score analysisUnderstanding spread of student performance

Variance in Finance: Portfolio Risk

In finance, variance and standard deviation measure investment risk. Higher variance means returns fluctuate more — the investment is riskier. Harry Markowitz's Modern Portfolio Theory (1952, Nobel Prize 1990) uses variance as the central risk measure.

For a portfolio of two assets, the combined variance depends on individual variances and the correlation between assets:

σ²portfolio = w₁²σ₁² + w₂²σ₂² + 2·w₁·w₂·σ₁·σ₂·ρ₁₂

Where w = weight, σ² = variance, and ρ = correlation. When ρ < 1 (assets don't move in perfect lockstep), the portfolio variance is less than the weighted average of individual variances. This is the mathematical basis of diversification — combining uncorrelated assets reduces overall risk without proportionally reducing expected return.

Asset Class (2000–2023)Annualized ReturnAnnualized SD (Volatility)
US Large Cap (S&P 500)~7.5%~15%
US Small Cap (Russell 2000)~7.0%~20%
International Developed (EAFE)~4.5%~17%
US Bonds (Aggregate)~4.0%~4%
Gold~8.0%~16%

A portfolio combining stocks and bonds typically has a standard deviation significantly lower than stocks alone, while still capturing most of the equity return premium.

Variance in Quality Control (Six Sigma)

Manufacturing uses variance to control product quality. The Six Sigma methodology, developed by Motorola in the 1980s, aims to reduce process variance until virtually no products fall outside specification limits.

Sigma LevelDefects per Million (DPMO)YieldProcess Capability (Cpk)
691,46230.9%0.33
308,53869.1%0.67
66,80793.3%1.00
6,21099.38%1.33
23399.977%1.67
3.499.99966%2.00

A process operating at 6σ produces only 3.4 defects per million opportunities. The process capability index Cpk directly relates to variance: Cpk = (USL − μ) / (3σ), where USL is the upper specification limit. Reducing variance (through better machines, training, or materials) increases Cpk and pushes the process toward Six Sigma quality.

Worked Examples from Different Fields

These real-world examples show how variance is calculated and interpreted in practice:

Example 1: Stock Return Volatility

Monthly returns for a stock over 6 months: +3.2%, −1.5%, +4.8%, −0.7%, +2.1%, +1.6%

  1. Mean = (3.2−1.5+4.8−0.7+2.1+1.6) / 6 = 9.5/6 = 1.583%
  2. Deviations: 1.617, −3.083, 3.217, −2.283, 0.517, 0.017
  3. Squared: 2.615, 9.504, 10.349, 5.212, 0.267, 0.0003
  4. Sum of squares = 27.947
  5. Sample variance = 27.947/5 = 5.589 (%²)
  6. Standard deviation = √5.589 = 2.364% per month
  7. Annualized volatility ≈ 2.364% × √12 = 8.19%

This stock has moderate volatility. The S&P 500 historically has ~15% annualized volatility, so this stock is roughly half as volatile as the broad market.

Example 2: Manufacturing Quality Control

A factory produces bolts with target length 50.00 mm. A sample of 8 bolts measures: 50.02, 49.98, 50.05, 49.97, 50.01, 50.03, 49.99, 50.00 mm.

  1. Mean = 400.05/8 = 50.00625 mm
  2. Sample variance = 0.000655 mm²
  3. Standard deviation = 0.0256 mm
  4. With spec limits of 50.00 ± 0.10 mm: Cpk = (50.10 − 50.006) / (3 × 0.0256) = 1.22

A Cpk of 1.22 means the process is capable but has little margin. The industry standard target is Cpk ≥ 1.33 (4σ capability), so this process needs tighter control to achieve that level.

Example 3: Student Test Scores

A class of 10 students scores: 72, 85, 90, 68, 77, 95, 83, 79, 88, 73 on an exam.

  1. Mean = 810/10 = 81.0
  2. Population variance (entire class) = 72.2
  3. Standard deviation = 8.50
  4. Coefficient of variation = 8.50/81.0 × 100% = 10.5%

A CV of 10.5% indicates moderate spread — most students performed within a reasonable range of the mean. If CV exceeded 25%, the instructor might investigate whether the test had questions that were too difficult for some students or whether there was a bimodal distribution (two distinct groups).

Common Mistakes When Calculating Variance

Avoid these frequent errors:

MistakeWhy It's WrongCorrection
Using N instead of N−1 for samplesUnderestimates true population varianceUse N−1 for any data that's a sample from a larger population
Averaging absolute deviations instead of squaringGives MAD, not varianceSquare each deviation, then average. Take √ for standard deviation
Forgetting to square before averagingPositive and negative deviations cancel out, giving ~0Always square deviations first
Comparing variance across different scalesVariance depends on units; $² ≠ kg²Use coefficient of variation (CV) for cross-scale comparison
Assuming variance = standard deviationVariance is SD²; units are squaredTake the square root of variance to get SD

ANOVA: Comparing Variance Across Groups

Analysis of Variance (ANOVA) is a statistical test that compares means of multiple groups by analyzing variance. Despite the name, it tests whether group means differ, not whether variances differ.

ANOVA partitions total variance into two components:

The F-statistic = Between-group variance / Within-group variance. A large F means the groups are more different from each other than expected by chance. If F exceeds the critical value (or p < 0.05), at least one group mean is significantly different.

Example: Comparing test scores of students taught by three different methods. ANOVA tells you whether the teaching method matters; post-hoc tests (Tukey, Bonferroni) tell you which methods differ.

💡 Did you know?

Frequently Asked Questions

What is the difference between variance and standard deviation?

Variance is the average of squared deviations from the mean; standard deviation is its square root. Standard deviation is in the same units as the original data (e.g., dollars, kg, seconds), making it more interpretable. Variance is useful in mathematical operations (variances of independent variables add directly), while standard deviation is better for describing spread to a non-technical audience.

When should I use sample vs. population variance?

Use population variance when your data contains every member of the group you're analyzing (e.g., all employees in one company). Use sample variance when your data is a subset of a larger group (e.g., a survey of 500 voters to estimate all voters' opinions). In most real-world research and statistics, sample variance is appropriate.

Can variance be negative?

No. Variance is always zero or positive because it is calculated from squared values. Variance = 0 only when all data points are identical (no spread). A negative variance is mathematically impossible and indicates a calculation error.

What is a "high" or "low" variance?

High and low are relative to the scale and context of the data. A variance of 10 is "low" for human heights in cm but "high" for heights in meters. The coefficient of variation (SD / mean × 100%) is scale-independent and allows comparison across different datasets. In quality control, specifications define acceptable variance ranges for each measurement.

How does variance relate to the normal distribution?

The normal (Gaussian) distribution is fully described by just two parameters: the mean (μ) and the variance (σ²). The familiar bell curve is wider when variance is large and narrower when variance is small. For normal data, the empirical rule holds: 68.3% within ±1σ, 95.4% within ±2σ, and 99.7% within ±3σ. Many statistical tests (t-test, ANOVA, regression) assume data follows a normal distribution or that sample means are approximately normal (via the Central Limit Theorem).

What is pooled variance?

Pooled variance is a weighted average of sample variances from two or more groups, used in the two-sample t-test when you assume equal variances across groups. The formula is: s²pooled = [(n₁−1)s₁² + (n₂−1)s₂²] / (n₁ + n₂ − 2). This produces a single variance estimate that incorporates information from both samples, increasing statistical power when the equal-variance assumption is valid.

},{"@type":“Question”,“name”:“Can variance be negative?”,“acceptedAnswer”:{"@type":“Answer”,“text”:“No. Variance is always zero or positive because it is calculated from squared values. Variance = 0 only when all data points are identical.”}},{"@type":“Question”,“name”:“How does variance relate to the normal distribution?”,“acceptedAnswer”:{"@type":“Answer”,“text”:“The normal distribution is fully described by mean and variance. For normal data, 68% falls within ±1 standard deviation, 95% within ±2, and 99.7% within ±3.”}},{"@type":“Question”,“name”:“What is pooled variance?”,“acceptedAnswer”:{"@type":“Answer”,“text”:“Pooled variance is a weighted average of sample variances from two or more groups, used in two-sample t-tests when assuming equal variances.”}}]}