Variance


For a single variate X having a distribution P(x) with known population mean mu, the population variance var(X), commonly also written sigma^2, is defined as

 sigma^2=<(X-mu)^2>,
(1)

where mu is the population mean and <X> denotes the expectation value of X. For a discrete distribution with N possible values of x_i, the population variance is therefore

 sigma^2=sum_(i=1)^N P(x_i)(x_i-mu)^2,
(2)

whereas for a continuous distribution, it is given by

 sigma^2=int P(x)(x-mu)^2 dx.
(3)

The variance is therefore equal to the second central moment mu_2.
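
By way of illustration, the following sketch (Python with NumPy, neither of which appears in this article, and a made-up probability table) evaluates (2) directly for a small discrete distribution.

```python
import numpy as np

# A small discrete distribution: values x_i and their probabilities P(x_i).
x = np.array([1.0, 2.0, 3.0, 4.0])
p = np.array([0.1, 0.2, 0.3, 0.4])    # probabilities must sum to 1

mu = np.sum(p * x)                    # population mean mu
sigma2 = np.sum(p * (x - mu) ** 2)    # population variance, equation (2)

print(mu, sigma2)                     # 3.0 1.0
```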

Note that some care is needed in interpreting sigma^2 as a variance, since the symbol sigma is also commonly used as a parameter related to but not equivalent to the square root of the variance, for example in the log normal distribution, Maxwell distribution, and Rayleigh distribution.

If the underlying distribution is not known, then the sample variance may be computed as

 s_N^2=1/N sum_(i=1)^N (x_i-x^_)^2,
(4)

where x^_ is the sample mean.

Note that the sample variance s_N^2 defined above is not an unbiased estimator for the population variance sigma^2. In order to obtain an unbiased estimator for sigma^2, it is necessary to instead define a "bias-corrected sample variance"

 s_(N-1)^2=1/(N-1) sum_(i=1)^N (x_i-x^_)^2.
(5)

The distinction between s_N^2 and s_(N-1)^2 is a common source of confusion, and extreme care should be exercised when consulting the literature to determine which convention is in use, especially since the uninformative notation s is commonly used for both. The bias-corrected sample variance s_(N-1)^2 for a list of data is implemented as Variance[list].
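
As a concrete illustration (a minimal Python/NumPy sketch with made-up data; the article itself only references the Wolfram Language function Variance), equations (4) and (5) differ only in the 1/N versus 1/(N-1) normalization, which NumPy exposes through the ddof argument:

```python
import numpy as np

data = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
N = len(data)
xbar = data.mean()                               # sample mean x^_

s_N2   = np.sum((data - xbar) ** 2) / N          # equation (4), biased
s_Nm12 = np.sum((data - xbar) ** 2) / (N - 1)    # equation (5), bias-corrected

# np.var defaults to the 1/N form; ddof=1 gives the 1/(N-1) form,
# matching the convention of the Wolfram Language Variance function.
assert np.isclose(s_N2, np.var(data))
assert np.isclose(s_Nm12, np.var(data, ddof=1))
print(s_N2, s_Nm12)                              # 4.0 4.5714...
```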

The square root of the variance is known as the standard deviation.

The reason that s_N^2 gives a biased estimator of the population variance is that two free parameters mu and sigma^2 are actually being estimated from the data itself. In such cases, it is appropriate to use a Student's t-distribution instead of a normal distribution as a model since, very loosely speaking, Student's t-distribution is the "best" that can be done without knowing sigma^2.
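
The bias can be seen numerically. The sketch below (Python/NumPy; a standard normal population with sigma^2=1 and a sample size of 5 are arbitrary choices for the experiment) averages both estimators over many small samples:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 5                          # small sample size, where the bias is pronounced
trials = 200_000               # number of simulated samples

samples = rng.standard_normal((trials, N))      # true sigma^2 = 1
xbar = samples.mean(axis=1, keepdims=True)
ss = np.sum((samples - xbar) ** 2, axis=1)      # squared deviations per sample

print(np.mean(ss / N))         # ~0.8 = (N-1)/N * sigma^2: biased low
print(np.mean(ss / (N - 1)))   # ~1.0 = sigma^2: unbiased
```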

Formally, in order to estimate the population variance sigma^2 from a sample of N elements with a priori unknown mean (i.e., the mean is itself estimated from the sample), we need an unbiased estimator for sigma^2. This is given by the k-statistic k_2=sigma^^^2 (i.e., the estimator sigma-hat squared), where

 k_2=N/(N-1)m_2
(6)

and m_2=s_N^2 is the sample variance uncorrected for bias.
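
A quick check of (6) (Python, with arbitrary data) confirms that applying the N/(N-1) correction to m_2=s_N^2 reproduces the bias-corrected sample variance s_(N-1)^2:

```python
import numpy as np

data = np.array([1.2, 0.7, 3.1, 2.4, 1.9])
N = len(data)

m2 = np.var(data)            # m_2 = s_N^2, the uncorrected sample variance
k2 = N / (N - 1) * m2        # equation (6)

assert np.isclose(k2, np.var(data, ddof=1))    # equals s_(N-1)^2
```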

For samples drawn from a normal distribution, it turns out that the quantity Ns_N^2/sigma^2 has a chi-squared distribution with N-1 degrees of freedom.
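
This can be checked by simulation; the sketch below (Python/NumPy, with normal samples, which is the setting in which the chi-squared result holds) compares the first two moments of N s_N^2/sigma^2 with those of a chi-squared variable with N-1 degrees of freedom:

```python
import numpy as np

rng = np.random.default_rng(1)
N, sigma, trials = 10, 2.0, 100_000

samples = rng.normal(0.0, sigma, size=(trials, N))
s_N2 = np.var(samples, axis=1)     # biased sample variance s_N^2 (1/N form)
q = N * s_N2 / sigma ** 2          # the quantity N s_N^2 / sigma^2

# A chi-squared variable with N-1 degrees of freedom has mean N-1 = 9
# and variance 2(N-1) = 18; the simulated moments should be close.
print(q.mean(), q.var())
```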

For a set of data X, the variance of the data obtained by a linear transformation is given by

var(aX+b)=<[(aX+b)-<aX+b>]^2>
(7)
=<(aX+b-a<X>-b)^2>
(8)
=<(aX-amu)^2>
(9)
=<a^2(X-mu)^2>
(10)
=a^2<(X-mu)^2>
(11)
=a^2var(X).
(12)
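
A quick numerical confirmation of (7)-(12) (Python/NumPy; the data and the constants a and b are arbitrary choices):

```python
import numpy as np

x = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
a, b = 3.0, -7.0

# Shifting by b leaves the variance unchanged; scaling by a multiplies it by a^2.
assert np.isclose(np.var(a * x + b), a ** 2 * np.var(x))
print(np.var(x), np.var(a * x + b))    # 4.0 36.0
```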

For multiple variables, the variance is given using the definition of covariance,

var(sum_(i=1)^(n)X_i)=cov(sum_(i=1)^(n)X_i,sum_(j=1)^(n)X_j)
(13)
=sum_(i=1)^(n)sum_(j=1)^(n)cov(X_i,X_j)
(14)
=sum_(i=1)^(n)sum_(j=1; j=i)^(n)cov(X_i,X_j)+sum_(i=1)^(n)sum_(j=1; j!=i)^(n)cov(X_i,X_j)
(15)
=sum_(i=1)^(n)cov(X_i,X_i)+sum_(i=1)^(n)sum_(j=1; j!=i)^(n)cov(X_i,X_j)
(16)
=sum_(i=1)^(n)var(X_i)+2sum_(i=1)^(n)sum_(j=i+1)^(n)cov(X_i,X_j).
(17)
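
The identity can be verified numerically with correlated data. In the sketch below (Python/NumPy; the three correlated variables are generated only for the test), np.cov is called with bias=True so that the covariances use the same 1/N normalization as np.var:

```python
import numpy as np

rng = np.random.default_rng(2)

# Three correlated variables, each observed 1000 times (rows are variables).
z = rng.standard_normal((3, 1000))
X = np.array([z[0], 0.5 * z[0] + z[1], -0.3 * z[1] + z[2]])

total = X.sum(axis=0)            # the random variable sum_i X_i
C = np.cov(X, bias=True)         # covariance matrix with 1/N normalization

# Equation (14): the variance of the sum is the sum of all covariances.
assert np.isclose(np.var(total), C.sum())

# Equation (17): diagonal variances plus twice the upper-triangle covariances.
assert np.isclose(C.sum(), np.trace(C) + 2 * np.triu(C, k=1).sum())
```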

A linear sum has a similar form:

var(sum_(i=1)^(n)a_iX_i)=cov(sum_(i=1)^(n)a_iX_i,sum_(j=1)^(n)a_jX_j)
(18)
=sum_(i=1)^(n)sum_(j=1)^(n)a_ia_jcov(X_i,X_j)
(19)
=sum_(i=1)^(n)a_i^2var(X_i)+2sum_(i=1)^(n)sum_(j=i+1)^(n)a_ia_jcov(X_i,X_j).
(20)

These equations can be expressed using the covariance matrix.
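
In matrix form, (20) reads var(sum_i a_iX_i) = a^T C a, where C is the covariance matrix of the X_i; a short check (Python/NumPy, with arbitrary weights a_i):

```python
import numpy as np

rng = np.random.default_rng(3)

z = rng.standard_normal((3, 1000))
X = np.array([z[0], 0.5 * z[0] + z[1], -0.3 * z[1] + z[2]])
a = np.array([2.0, -1.0, 0.5])        # weights a_i of the linear sum

combo = a @ X                         # the random variable sum_i a_i X_i
C = np.cov(X, bias=True)              # covariance matrix with 1/N normalization

# Equation (20) in matrix form: var(a . X) = a^T C a.
assert np.isclose(np.var(combo), a @ C @ a)
print(np.var(combo), a @ C @ a)
```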


See also

Central Moment, Charlier's Check, Covariance, Covariance Matrix, Error Propagation, k-Statistic, Mean, Moment, Raw Moment, Sample Variance, Sample Variance Computation, Sample Variance Distribution, Sigma, Standard Error, Statistical Correlation

Cite this as:

Weisstein, Eric W. "Variance." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/Variance.html
