Maximum Likelihood


Maximum likelihood, also called the maximum likelihood method, is the procedure of finding the value of one or more parameters of a given distribution that maximizes the likelihood function for the observed data. The maximum likelihood estimate for a parameter $\mu$ is denoted $\hat{\mu}$.

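For a Bernoulli distribution in which $N$ trials yield $Np$ successes and $Nq$ failures (with $q=1-p$),

$$\frac{d}{d\theta}\left[\binom{N}{Np}\theta^{Np}(1-\theta)^{Nq}\right]=\binom{N}{Np}\theta^{Np-1}(1-\theta)^{Nq-1}\left[Np(1-\theta)-\theta Nq\right]=0, \tag{1}$$

so the bracketed factor must vanish, and maximum likelihood occurs for $\theta=p$. If $p$ is not known ahead of time, the likelihood function is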

\begin{align}
f(x_1,\ldots,x_n\mid p) &= P(X_1=x_1,\ldots,X_n=x_n\mid p) \tag{2}\\
&= p^{x_1}(1-p)^{1-x_1}\cdots p^{x_n}(1-p)^{1-x_n} \tag{3}\\
&= p^{\sum x_i}(1-p)^{\sum(1-x_i)} = p^{\sum x_i}(1-p)^{n-\sum x_i}, \tag{4}
\end{align}

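where each $x_i=0$ or 1, and $i=1,\ldots,n$. Taking the logarithm and setting its derivative with respect to $p$ equal to zero gives

\begin{align}
\ln f &= \sum x_i\ln p+\left(n-\sum x_i\right)\ln(1-p) \tag{5}\\
\frac{d(\ln f)}{dp} &= \frac{\sum x_i}{p}-\frac{n-\sum x_i}{1-p}=0. \tag{6}
\end{align}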

Rearranging gives

$$\sum x_i-p\sum x_i=np-p\sum x_i, \tag{7}$$

so

$$\hat{p}=\frac{\sum x_i}{n}. \tag{8}$$
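
As a numerical check on equations (5) and (8), the following minimal Python sketch compares the closed-form estimate with a brute-force grid search over the log-likelihood. The sample `xs`, the grid resolution, and the function names are illustrative assumptions, not part of the original derivation.

```python
import math

def bernoulli_log_likelihood(p, xs):
    """Log-likelihood of equation (5) for 0/1 observations xs."""
    s = sum(xs)
    n = len(xs)
    return s * math.log(p) + (n - s) * math.log(1 - p)

def bernoulli_mle(xs):
    """Closed-form estimate of equation (8): the sample proportion."""
    return sum(xs) / len(xs)

# Hypothetical sample: 7 successes in 10 trials.
xs = [1, 0, 1, 1, 0, 1, 1, 1, 0, 1]
p_hat = bernoulli_mle(xs)

# Independent check: evaluate the log-likelihood on a grid inside (0, 1)
# and keep the grid point where it is largest.
grid = [k / 1000 for k in range(1, 1000)]
p_grid = max(grid, key=lambda p: bernoulli_log_likelihood(p, xs))

print(p_hat, p_grid)
```

The grid search only approximates the maximizer to the grid spacing, so agreement with the closed form is expected to within that tolerance.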

For a normal distribution,

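\begin{align}
f(x_1,\ldots,x_n\mid\mu,\sigma) &= \prod_{i=1}^{n}\frac{1}{\sigma\sqrt{2\pi}}\,e^{-(x_i-\mu)^2/(2\sigma^2)} \tag{9}\\
&= \frac{(2\pi)^{-n/2}}{\sigma^n}\exp\left[-\frac{\sum(x_i-\mu)^2}{2\sigma^2}\right] \tag{10}
\end{align}

so

$$\ln f=-\tfrac{1}{2}n\ln(2\pi)-n\ln\sigma-\frac{\sum(x_i-\mu)^2}{2\sigma^2} \tag{11}$$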

and

$$\frac{\partial(\ln f)}{\partial\mu}=\frac{\sum(x_i-\mu)}{\sigma^2}=0, \tag{12}$$

giving

$$\hat{\mu}=\frac{\sum x_i}{n}. \tag{13}$$

Similarly,

$$\frac{\partial(\ln f)}{\partial\sigma}=-\frac{n}{\sigma}+\frac{\sum(x_i-\mu)^2}{\sigma^3}=0 \tag{14}$$

gives

$$\hat{\sigma}=\sqrt{\frac{\sum(x_i-\hat{\mu})^2}{n}}. \tag{15}$$

Note that in this case, the maximum likelihood standard deviation is the sample standard deviation computed with divisor $n$ rather than $n-1$, which is a biased estimator for the population standard deviation.
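
The estimates in equations (13) and (15) can be sketched in a few lines of Python. The data values and function names below are illustrative assumptions; the standard-library `statistics.pstdev` (divisor n) and `statistics.stdev` (divisor n-1) are used only to show which convention the maximum likelihood estimate matches.

```python
import math
import statistics

def normal_mle(xs):
    """Return (mu_hat, sigma_hat) from equations (13) and (15)."""
    n = len(xs)
    mu_hat = sum(xs) / n
    sigma_hat = math.sqrt(sum((x - mu_hat) ** 2 for x in xs) / n)
    return mu_hat, sigma_hat

def normal_log_likelihood(mu, sigma, xs):
    """Log-likelihood of equation (11)."""
    n = len(xs)
    ss = sum((x - mu) ** 2 for x in xs)
    return -0.5 * n * math.log(2 * math.pi) - n * math.log(sigma) - ss / (2 * sigma ** 2)

# Hypothetical sample chosen so the arithmetic is easy to follow.
xs = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
mu_hat, sigma_hat = normal_mle(xs)

# The divisor n-1 estimate is larger and gives a lower likelihood.
s_n_minus_1 = statistics.stdev(xs)

print(mu_hat, sigma_hat, statistics.pstdev(xs), s_n_minus_1)
```

Evaluating `normal_log_likelihood` at `sigma_hat` and at `s_n_minus_1` shows that the divisor n version attains the higher likelihood, even though it is the biased one.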

For a weighted normal distribution,

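\begin{align}
f(x_1,\ldots,x_n\mid\mu,\sigma) &= \prod_{i=1}^{n}\frac{1}{\sigma_i\sqrt{2\pi}}\,e^{-(x_i-\mu)^2/(2\sigma_i^2)} \tag{16}\\
\ln f &= -\tfrac{1}{2}n\ln(2\pi)-\sum\ln\sigma_i-\sum\frac{(x_i-\mu)^2}{2\sigma_i^2} \tag{17}\\
\frac{\partial(\ln f)}{\partial\mu} &= \sum\frac{x_i-\mu}{\sigma_i^2}=\sum\frac{x_i}{\sigma_i^2}-\mu\sum\frac{1}{\sigma_i^2}=0 \tag{18}
\end{align}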

gives

$$\hat{\mu}=\frac{\sum x_i/\sigma_i^2}{\sum 1/\sigma_i^2}. \tag{19}$$

The variance of the mean is then

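$$\sigma_\mu^2=\sum_i\sigma_i^2\left(\frac{\partial\mu}{\partial x_i}\right)^2. \tag{20}$$

But

$$\frac{\partial\mu}{\partial x_i}=\frac{\partial}{\partial x_i}\,\frac{\sum_j x_j/\sigma_j^2}{\sum_j 1/\sigma_j^2}=\frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2}, \tag{21}$$

so

\begin{align}
\sigma_\mu^2 &= \sum_i\sigma_i^2\left(\frac{1/\sigma_i^2}{\sum_j 1/\sigma_j^2}\right)^2 \tag{22}\\
&= \frac{\sum_i 1/\sigma_i^2}{\left[\sum_j 1/\sigma_j^2\right]^2} \tag{23}\\
&= \frac{1}{\sum_i 1/\sigma_i^2}. \tag{24}
\end{align}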
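
Equations (19) and (24) amount to an inverse-variance weighted mean. The following Python sketch illustrates them under the assumption that each measurement comes with a known standard deviation; the function name and the sample values are hypothetical.

```python
def weighted_mean_mle(xs, sigmas):
    """Return (mu_hat, var_mu) from equations (19) and (24).

    xs are the measurements and sigmas their known standard deviations.
    """
    weights = [1.0 / s ** 2 for s in sigmas]  # 1 / sigma_i^2
    total = sum(weights)
    mu_hat = sum(w * x for w, x in zip(weights, xs)) / total
    var_mu = 1.0 / total
    return mu_hat, var_mu

# Hypothetical measurements: the second is twice as uncertain as the first,
# so it carries one quarter of the weight.
xs = [10.0, 12.0]
sigmas = [1.0, 2.0]
mu_hat, var_mu = weighted_mean_mle(xs, sigmas)

print(mu_hat, var_mu)
```

When all the sigmas are equal, the weights cancel and the estimate reduces to the ordinary mean of equation (13), with variance equal to the common variance divided by n.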

For a Poisson distribution,

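\begin{align}
f(x_1,\ldots,x_n\mid\lambda) &= \frac{e^{-\lambda}\lambda^{x_1}}{x_1!}\cdots\frac{e^{-\lambda}\lambda^{x_n}}{x_n!}=\frac{e^{-n\lambda}\lambda^{\sum x_i}}{x_1!\cdots x_n!} \tag{25}\\
\ln f &= -n\lambda+(\ln\lambda)\sum x_i-\ln\left(\prod x_i!\right) \tag{26}\\
\frac{d(\ln f)}{d\lambda} &= -n+\frac{\sum x_i}{\lambda}=0 \tag{27}\\
\hat{\lambda} &= \frac{\sum x_i}{n}. \tag{28}
\end{align}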
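
The Poisson result can be checked the same way. This Python sketch evaluates the log-likelihood of equation (26) and confirms that it is lower on either side of the estimate from equation (28); the counts in `xs`, the step of 0.1, and the function names are illustrative assumptions. `math.lgamma(x + 1)` is used for ln(x!).

```python
import math

def poisson_log_likelihood(lam, xs):
    """Log-likelihood of equation (26) for non-negative integer counts xs."""
    n = len(xs)
    log_factorials = sum(math.lgamma(x + 1) for x in xs)  # ln(prod x_i!)
    return -n * lam + math.log(lam) * sum(xs) - log_factorials

def poisson_mle(xs):
    """Closed-form estimate of equation (28): the sample mean."""
    return sum(xs) / len(xs)

# Hypothetical counts.
xs = [3, 1, 4, 1, 5, 9, 2, 6]
lam_hat = poisson_mle(xs)

# The log-likelihood should drop when lambda moves away from lam_hat.
at_hat = poisson_log_likelihood(lam_hat, xs)
below = poisson_log_likelihood(lam_hat - 0.1, xs)
above = poisson_log_likelihood(lam_hat + 0.1, xs)

print(lam_hat, at_hat > below, at_hat > above)
```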

See also

Bayesian Analysis, Likelihood, Likelihood Function, Maximum Likelihood Estimator

Cite this as:

Weisstein, Eric W. "Maximum Likelihood." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/MaximumLikelihood.html
