Variance
For a single variate $X$ having a distribution $P(x)$ with known population mean $\mu$, the population variance $\operatorname{var}(X)$, commonly also written $\sigma^2$, is defined as

$$\sigma^2 = \left\langle (X - \mu)^2 \right\rangle \tag{1}$$
where $\mu$ is the population mean and $\langle X \rangle$ denotes the expectation value of $X$. For a discrete distribution with $N$ possible values of $x_i$, the population variance is therefore

$$\sigma^2 = \sum_{i=1}^{N} P(x_i)\,(x_i - \mu)^2, \tag{2}$$
whereas for a continuous distribution, it is given by

$$\sigma^2 = \int P(x)\,(x - \mu)^2\,dx. \tag{3}$$
The variance is therefore equal to the second central moment $\mu_2$.
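As a quick illustration of Eq. (2), the following sketch computes the population variance of a hypothetical discrete distribution (a fair six-sided die; the values and probabilities are illustrative and not from the article).

```python
# Population variance of a discrete distribution via Eq. (2).
values = [1, 2, 3, 4, 5, 6]      # possible values x_i (fair die, illustrative)
probs = [1 / 6] * 6              # P(x_i)

mu = sum(p * x for p, x in zip(probs, values))                   # population mean
sigma2 = sum(p * (x - mu) ** 2 for p, x in zip(probs, values))   # Eq. (2)

print(mu, sigma2)   # 3.5  2.9166...  (= 35/12)
```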
Note that some care is needed in interpreting $\sigma^2$ as a variance, since the symbol $\sigma$ is also commonly used as a parameter related to, but not equivalent to, the square root of the variance, for example in the log normal distribution, Maxwell distribution, and Rayleigh distribution.
If the underlying distribution is not known, then the sample variance may be computed as

$$s_N^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2, \tag{4}$$

where $\bar{x}$ is the sample mean.
Note that the sample variance $s_N^2$ defined above is not an unbiased estimator for the population variance $\sigma^2$. In order to obtain an unbiased estimator for $\sigma^2$, it is necessary to instead define a "bias-corrected sample variance"

$$s_{N-1}^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2. \tag{5}$$
The distinction between $s_N^2$ and $s_{N-1}^2$ is a common source of confusion, and extreme care should be exercised when consulting the literature to determine which convention is in use, especially since the uninformative notation $s$ is commonly used for both. The bias-corrected sample variance $s_{N-1}^2$ for a list of data is implemented as Variance[list].

The square root of the variance is known as the standard deviation.
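The difference between Eqs. (4) and (5) can be checked numerically. The sketch below reuses the four data values from the page's example query; NumPy's `ddof` argument selects which normalization is used.

```python
import numpy as np

x = np.array([21.3, 38.4, 12.7, 41.6])   # example data from the page's query box
xbar = x.mean()
N = len(x)

s2_N = np.sum((x - xbar) ** 2) / N            # s_N^2, Eq. (4): biased
s2_Nm1 = np.sum((x - xbar) ** 2) / (N - 1)    # s_{N-1}^2, Eq. (5): bias-corrected

# NumPy exposes both conventions through ddof ("delta degrees of freedom").
assert np.isclose(s2_N, np.var(x, ddof=0))
assert np.isclose(s2_Nm1, np.var(x, ddof=1))  # matches the bias-corrected Variance[list]
print(s2_N, s2_Nm1)
```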
The reason that $s_N^2$ gives a biased estimator of the population variance is that two free parameters $\mu$ and $\sigma^2$ are actually being estimated from the data itself. In such cases, it is appropriate to use a Student's t-distribution instead of a normal distribution as a model since, very loosely speaking, Student's t-distribution is the "best" that can be done without knowing $\sigma^2$.
Formally, in order to estimate the population variance $\sigma^2$ from a sample of $N$ elements with a priori unknown mean (i.e., the mean is estimated from the sample itself), we need an unbiased estimator for $\sigma^2$. This is given by the k-statistic $k_2$, where

$$k_2 = \frac{N}{N-1}\, m_2 \tag{6}$$

and $m_2 = s_N^2$ is the sample variance uncorrected for bias.
It turns out that the quantity $N s_N^2/\sigma^2$ has a chi-squared distribution with $N-1$ degrees of freedom (for samples drawn from a normal distribution).
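A short Monte Carlo sketch can make this plausible: for normally distributed samples, the mean of $N s_N^2/\sigma^2$ should be close to $N-1$ and its variance close to $2(N-1)$, the moments of a chi-squared distribution with $N-1$ degrees of freedom. The sample size, seed, and parameters below are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
N, sigma, trials = 10, 2.0, 100_000

samples = rng.normal(loc=5.0, scale=sigma, size=(trials, N))
s2_N = samples.var(axis=1, ddof=0)      # biased sample variance s_N^2 per trial
statistic = N * s2_N / sigma**2

print(statistic.mean())   # ~ N - 1 = 9
print(statistic.var())    # ~ 2 (N - 1) = 18
```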
For a set of data $X$, the variance of the data obtained by a linear transformation is given by

$$\begin{aligned}
\operatorname{var}(aX + b) &= \left\langle [(aX + b) - \langle aX + b \rangle]^2 \right\rangle && (7)\\
&= \left\langle [aX + b - a\langle X \rangle - b]^2 \right\rangle && (8)\\
&= \left\langle [a(X - \langle X \rangle)]^2 \right\rangle && (9)\\
&= a^2 \left\langle (X - \langle X \rangle)^2 \right\rangle && (10)\\
&= a^2 \left\langle (X - \mu)^2 \right\rangle && (11)\\
&= a^2 \operatorname{var}(X). && (12)
\end{aligned}$$
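As a numerical check of Eqs. (7)-(12), the following sketch verifies that shifting by $b$ leaves the variance unchanged while scaling by $a$ multiplies it by $a^2$ (the data, $a$, and $b$ are arbitrary illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=10_000)
a, b = 3.0, -7.5

lhs = np.var(a * x + b)    # variance of the transformed data
rhs = a**2 * np.var(x)     # a^2 times the original variance; b drops out
print(lhs, rhs)            # the two agree
```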
For multiple variables, the variance is given using the definition of covariance,

$$\begin{aligned}
\operatorname{var}(X + Y) &= \left\langle [(X + Y) - \langle X + Y \rangle]^2 \right\rangle && (13)\\
&= \left\langle [(X - \langle X \rangle) + (Y - \langle Y \rangle)]^2 \right\rangle && (14)\\
&= \left\langle (X - \langle X \rangle)^2 + 2 (X - \langle X \rangle)(Y - \langle Y \rangle) + (Y - \langle Y \rangle)^2 \right\rangle && (15)\\
&= \left\langle (X - \langle X \rangle)^2 \right\rangle + 2 \left\langle (X - \langle X \rangle)(Y - \langle Y \rangle) \right\rangle + \left\langle (Y - \langle Y \rangle)^2 \right\rangle && (16)\\
&= \operatorname{var}(X) + \operatorname{var}(Y) + 2 \operatorname{cov}(X, Y). && (17)
\end{aligned}$$
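A sketch of Eq. (17) on deliberately correlated data follows; the population ($1/N$) convention is used throughout so that the identity holds exactly.

```python
import numpy as np

rng = np.random.default_rng(2)
x = rng.normal(size=5_000)
y = 0.6 * x + rng.normal(size=5_000)   # correlated with x

cov_xy = np.mean((x - x.mean()) * (y - y.mean()))   # population covariance
lhs = np.var(x + y)
rhs = np.var(x) + np.var(y) + 2 * cov_xy            # Eq. (17)
print(lhs, rhs)                                     # equal up to rounding
```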
A linear sum has a similar form:

$$\begin{aligned}
\operatorname{var}\!\left(\sum_i a_i X_i\right) &= \left\langle \left[\sum_i a_i X_i - \left\langle \sum_i a_i X_i \right\rangle\right]^2 \right\rangle && (18)\\
&= \sum_{i,j} a_i a_j \operatorname{cov}(X_i, X_j) && (19)\\
&= \sum_i a_i^2 \operatorname{var}(X_i) + 2 \sum_{i<j} a_i a_j \operatorname{cov}(X_i, X_j). && (20)
\end{aligned}$$
These equations can be expressed using the covariance matrix.
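In matrix form, Eqs. (18)-(20) say that $\operatorname{var}(\mathbf{a} \cdot \mathbf{X}) = \mathbf{a}^{\mathsf T} C\, \mathbf{a}$, where $C$ is the covariance matrix of the $X_i$. A sketch of this check, with arbitrary synthetic data and coefficients:

```python
import numpy as np

rng = np.random.default_rng(3)
data = rng.normal(size=(3, 2_000))   # three variables, 2000 observations each
data[1] += 0.5 * data[0]             # introduce some correlation
a = np.array([1.0, -2.0, 0.5])       # coefficients a_i

C = np.cov(data, bias=True)          # covariance matrix, population (1/N) convention
lhs = np.var(a @ data)               # variance of the linear sum
rhs = a @ C @ a                      # quadratic form a^T C a
print(lhs, rhs)                      # equal up to rounding
```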


variance {21.3, 38.4,
12.7, 41.6}




