Covariance provides a measure of the strength of the correlation between two or more sets of random variates. The covariance for two random variates x and y, each with sample size N, is defined by the expectation value

\operatorname{cov}(x,y) = \langle (x - \mu_x)(y - \mu_y) \rangle    (1)
                        = \langle x y \rangle - \mu_x \mu_y    (2)

where \mu_x = \langle x \rangle and \mu_y = \langle y \rangle are the respective means, which can be written out explicitly as

\operatorname{cov}(x,y) = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})(y_i - \bar{y})    (3)
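As a concrete illustration of equation (3), here is a minimal Python sketch that computes the covariance of two equal-length samples using the divide-by-N (population) convention above; the function name covariance and the example data are illustrative only, not part of the original text. Note that library routines such as numpy.cov divide by N - 1 by default, so they give a slightly different value unless bias=True is passed.

```python
def covariance(x, y):
    """Population covariance of two equal-length samples (equation (3))."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    return sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 5.0, 9.0]
print(covariance(x, y))  # 2.75, positive: y tends to increase with x
```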
For uncorrelated variates,
\operatorname{cov}(x,y) = \langle x y \rangle - \mu_x \mu_y = \langle x \rangle \langle y \rangle - \mu_x \mu_y = 0    (4)
so the covariance is zero. However, if the variables are correlated in some way, then their covariance will be nonzero. In fact, if \operatorname{cov}(x,y) > 0, then y tends to increase as x increases, and if \operatorname{cov}(x,y) < 0, then y tends to decrease as x increases. Note that while statistically independent variables are always uncorrelated, the converse is not necessarily true.
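As a hedged illustration of that last point (the data below are made up for this example), take x symmetric about zero and y = x^2: y is completely determined by x, yet their covariance vanishes.

```python
x = [-2.0, -1.0, 0.0, 1.0, 2.0]   # symmetric about zero
y = [xi ** 2 for xi in x]         # y is a deterministic function of x

n = len(x)
mean_x = sum(x) / n               # 0.0
mean_y = sum(y) / n               # 2.0
cov_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / n
print(cov_xy)  # 0.0: uncorrelated, yet clearly not independent
```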
In the special case of y = x,

\operatorname{cov}(x,x) = \langle x^2 \rangle - \langle x \rangle^2    (5)
                        = \sigma_x^2    (6)

so the covariance reduces to the usual variance \sigma_x^2 = \operatorname{var}(x). This motivates the use of the symbol \sigma_{xy} = \operatorname{cov}(x,y), which then provides a consistent way of denoting the variance as \sigma_{xx} = \sigma_x^2, where \sigma_x is the standard deviation.
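A quick numerical check of equations (5)-(6), on arbitrary example data, confirms that the covariance of a sample with itself equals its population variance:

```python
import math

x = [1.0, 2.0, 4.0, 7.0]
n = len(x)
mean_x = sum(x) / n

cov_xx = sum(xi * xi for xi in x) / n - mean_x ** 2   # <x^2> - <x>^2, equation (5)
var_x = sum((xi - mean_x) ** 2 for xi in x) / n       # usual population variance

print(math.isclose(cov_xx, var_x))  # True
```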
The derived quantity

\operatorname{cor}(x,y) = \frac{\operatorname{cov}(x,y)}{\sigma_x \sigma_y}    (7)
                        = \frac{\sigma_{xy}}{\sigma_x \sigma_y}    (8)

is called the statistical correlation of x and y.
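A minimal sketch of equations (7)-(8), assuming the divide-by-N convention used above; the helper name correlation and the sample data are illustrative only:

```python
import math

def correlation(x, y):
    """Correlation per equations (7)-(8): covariance scaled by both standard deviations."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    sigma_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    sigma_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (sigma_x * sigma_y)

print(correlation([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 9.0]))  # about 0.96: strong positive correlation
```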
The covariance is especially useful when looking at the variance of the sum of two random variates, since
\operatorname{var}(x + y) = \operatorname{var}(x) + \operatorname{var}(y) + 2\operatorname{cov}(x,y)    (9)
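Equation (9) can be spot-checked numerically. The sketch below uses NumPy with made-up correlated data, passing bias=True so that all three terms use the same divide-by-N definition as above:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=1000)
y = 0.5 * x + rng.normal(size=1000)   # correlated with x by construction

lhs = np.var(x + y)                                              # var(x + y)
rhs = np.var(x) + np.var(y) + 2 * np.cov(x, y, bias=True)[0, 1]  # right side of (9)
print(np.isclose(lhs, rhs))  # True
```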
The covariance is symmetric by definition since
\operatorname{cov}(x,y) = \operatorname{cov}(y,x)    (10)
Given n sets of random variates denoted x_1, ..., x_n, the covariance \sigma_{ij} = \operatorname{cov}(x_i, x_j) of x_i and x_j is defined by

\operatorname{cov}(x_i, x_j) = \langle (x_i - \mu_i)(x_j - \mu_j) \rangle    (11)
                             = \langle x_i x_j \rangle - \mu_i \mu_j    (12)

where \mu_i = \langle x_i \rangle and \mu_j = \langle x_j \rangle are the means of x_i and x_j, respectively. The matrix (V_{ij}) of the quantities V_{ij} = \operatorname{cov}(x_i, x_j) is called the covariance matrix.
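A short sketch of the covariance matrix for n = 3 variates, using numpy.cov with bias=True to match the divide-by-N convention above (the data are illustrative): the resulting matrix is symmetric, and its diagonal holds the individual variances.

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(3, 500))        # 3 variates x_1, x_2, x_3 with 500 samples each
V = np.cov(data, bias=True)             # 3 x 3 covariance matrix V_ij = cov(x_i, x_j)

print(V.shape)                                       # (3, 3)
print(np.allclose(V, V.T))                           # True: symmetric, per equation (10)
print(np.allclose(np.diag(V), data.var(axis=1)))     # True: diagonal holds the variances, per (5)-(6)
```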
The covariance obeys the identities
\operatorname{cov}(x + z, y) = \langle (x + z) y \rangle - \langle x + z \rangle \langle y \rangle    (13)
                             = \langle x y \rangle + \langle z y \rangle - \langle x \rangle \langle y \rangle - \langle z \rangle \langle y \rangle    (14)
                             = (\langle x y \rangle - \langle x \rangle \langle y \rangle) + (\langle z y \rangle - \langle z \rangle \langle y \rangle)    (15)
                             = \operatorname{cov}(x,y) + \operatorname{cov}(z,y)    (16)
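The additivity identity derived in (13)-(16) is easy to verify numerically; the helper cov and the random data below are illustrative only:

```python
import numpy as np

rng = np.random.default_rng(2)
x, y, z = rng.normal(size=(3, 1000))

def cov(a, b):
    """Divide-by-N covariance, matching the convention of equation (3)."""
    return np.mean((a - a.mean()) * (b - b.mean()))

print(np.isclose(cov(x + z, y), cov(x, y) + cov(z, y)))  # True
```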
By induction, it therefore follows that
\operatorname{cov}\!\left(\sum_{i=1}^{n} x_i,\, y\right) = \sum_{i=1}^{n} \operatorname{cov}(x_i, y)    (17)

and hence, using the symmetry (10),

\operatorname{cov}\!\left(\sum_{i=1}^{n} x_i,\, \sum_{j=1}^{m} y_j\right) = \sum_{i=1}^{n} \operatorname{cov}\!\left(x_i,\, \sum_{j=1}^{m} y_j\right)    (18)
    = \sum_{i=1}^{n} \operatorname{cov}\!\left(\sum_{j=1}^{m} y_j,\, x_i\right)    (19)
    = \sum_{i=1}^{n} \sum_{j=1}^{m} \operatorname{cov}(y_j, x_i)    (20)
    = \sum_{i=1}^{n} \sum_{j=1}^{m} \operatorname{cov}(x_i, y_j)    (21)
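Finally, the general result (17)-(21) can also be spot-checked numerically; xs and ys below are arbitrary made-up variates and cov is the same divide-by-N helper as in the previous sketch:

```python
import numpy as np

rng = np.random.default_rng(3)
xs = rng.normal(size=(4, 2000))   # x_1, ..., x_4
ys = rng.normal(size=(3, 2000))   # y_1, ..., y_3

def cov(a, b):
    """Divide-by-N covariance, matching the convention of equation (3)."""
    return np.mean((a - a.mean()) * (b - b.mean()))

lhs = cov(xs.sum(axis=0), ys.sum(axis=0))            # covariance of the two sums
rhs = sum(cov(xi, yj) for xi in xs for yj in ys)     # double sum of pairwise covariances
print(np.isclose(lhs, rhs))  # True
```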