Chapter 7: Estimation
7.5 Maximum Likelihood Estimators
Maximum likelihood estimation chooses as the estimate of $\theta$ the value of $\theta$ that provides the largest value of the likelihood function.
Definition: Likelihood Function. When the joint p.d.f. or the joint p.f.
$$f_n(\mathbf{x}|\theta) = f(x_1|\theta) \cdots f(x_n|\theta)$$
of the observations in a random sample is regarded as a function of $\theta$ for given values of $x_1, \ldots, x_n$, it is called the likelihood function.
Definition: Maximum Likelihood Estimator/Estimate. For each possible observed vector $\mathbf{x} = (x_1, \ldots, x_n)$, let $\hat{\theta}(\mathbf{x})$ denote a value of $\theta$ for which the likelihood function $f_n(\mathbf{x}|\theta)$ is a maximum, and let $\hat{\theta} = \hat{\theta}(\mathbf{X})$ be the estimator of $\theta$ defined in this way. The estimator $\hat{\theta}$ is called a maximum likelihood estimator of $\theta$. After $\mathbf{X} = \mathbf{x}$ is observed, the value $\hat{\theta}(\mathbf{x})$ is called a maximum likelihood estimate of $\theta$. The set of all possible values of a parameter (or parameters) is called the parameter space.
Examples of Maximum Likelihood Estimators
Example: Choose a sample of size 3 from the exponential distribution with parameter $\lambda > 0$; the observed data are $(x_1, x_2, x_3) = (3, 1.5, 2.1)$.
$$f(x|\lambda) = \begin{cases} \lambda e^{-\lambda x} & x > 0 \\ 0 & x \le 0 \end{cases}$$
The likelihood function is
$$f_3(\mathbf{x}|\lambda) = f(x_1|\lambda) f(x_2|\lambda) f(x_3|\lambda) = \lambda^3 e^{-\lambda(x_1 + x_2 + x_3)} = \lambda^3 e^{-6.6\lambda}$$
Since log is an increasing function, the value of $\lambda$ that maximizes the likelihood function $f_3(\mathbf{x}|\lambda)$ will be the same as the value of $\lambda$ that maximizes $\log f_3(\mathbf{x}|\lambda)$.
$$L(\lambda) = \log f_3(\mathbf{x}|\lambda) = 3\log\lambda - 6.6\lambda$$
Taking the derivative, setting the derivative to 0, and solving for $\lambda$ yields
$$\frac{dL(\lambda)}{d\lambda} = \frac{3}{\lambda} - 6.6, \qquad \frac{d^2L(\lambda)}{d\lambda^2} = -\frac{3}{\lambda^2} < 0$$
$$\frac{dL(\lambda)}{d\lambda} = 0 \;\Rightarrow\; \hat{\lambda} = \frac{3}{6.6} \approx 0.455$$
The maximum likelihood estimate is then 0.455.
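The calculation above can be checked numerically. A minimal sketch in Python (the function names are illustrative, not from the text):

```python
import math

# MLE of the rate parameter lambda for an exponential sample:
# setting dL/d(lambda) = n/lambda - sum(x_i) = 0 gives
# lambda-hat = n / sum(x_i).
def exp_mle(data):
    return len(data) / sum(data)

# Log-likelihood L(lam) = n*log(lam) - lam * sum(x_i), for a sanity check
# that the estimate really sits at a peak.
def exp_loglik(lam, data):
    return len(data) * math.log(lam) - lam * sum(data)

data = [3.0, 1.5, 2.1]      # the observed sample from the example
lam_hat = exp_mle(data)     # 3 / 6.6, about 0.455
```

Evaluating `exp_loglik` at `lam_hat` and at nearby values confirms that the log-likelihood is largest at the closed-form estimate.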
Example: Suppose $X$ has the Bernoulli distribution with parameter $\theta$: $P(X = 0) = 1 - \theta$ and $P(X = 1) = \theta$. The p.f. of $X$ can be rewritten as
$$f(x|\theta) = \theta^x (1-\theta)^{1-x}, \quad x = 0, 1.$$
Let the parameter space be $\Omega = \{0.1, 0.9\}$. If $x = 0$ (sample size is one) is observed,
$$f(x = 0|\theta) = \begin{cases} 0.9 & \text{if } \theta = 0.1 \\ 0.1 & \text{if } \theta = 0.9 \end{cases}$$
Clearly, $\theta = 0.1$ maximizes the likelihood when $x = 0$ is observed. So the MLE is $\hat{\theta} = 0.1$ if $X = 0$.
Question: if $x = 1$ is observed, what is the MLE of $\theta$?
Example: Suppose that the random variables $X_1, \ldots, X_n$ form a random sample from the Bernoulli distribution with parameter $\theta$, which is unknown ($0 \le \theta \le 1$). For all observed values $x_1, \ldots, x_n$, where each $x_i$ is either 0 or 1, the likelihood function is
$$f_n(\mathbf{x}|\theta) = \prod_{i=1}^{n} \theta^{x_i}(1-\theta)^{1-x_i} = \theta^{\sum_{i=1}^n x_i}(1-\theta)^{n - \sum_{i=1}^n x_i}$$
$$L(\theta) = \log f_n(\mathbf{x}|\theta) = \sum_{i=1}^{n} x_i \log\theta + \Big[n - \sum_{i=1}^{n} x_i\Big]\log(1-\theta)$$
$$\frac{dL(\theta)}{d\theta} = \frac{\sum_{i=1}^n x_i}{\theta} - \frac{n - \sum_{i=1}^n x_i}{1-\theta}$$
If $\sum_{i=1}^n x_i = 0$, then $\frac{dL(\theta)}{d\theta} = -\frac{n}{1-\theta} < 0$, so $L(\theta)$ is a decreasing function of $\theta$, and hence $L$ achieves its maximum at $\hat{\theta} = 0 = \bar{x}$.
If $\sum_{i=1}^n x_i = n$, then $\frac{dL(\theta)}{d\theta} = \frac{n}{\theta} > 0$, so $L(\theta)$ is an increasing function of $\theta$, and hence $L$ achieves its maximum at $\hat{\theta} = 1 = \bar{x}$.
If $\sum_{i=1}^n x_i \notin \{0, n\}$, setting $\frac{dL(\theta)}{d\theta} = 0$ gives $\hat{\theta} = \bar{x}$, and
$$\frac{d^2L(\theta)}{d\theta^2} = -\frac{\sum_{i=1}^n x_i}{\theta^2} - \frac{n - \sum_{i=1}^n x_i}{(1-\theta)^2} < 0,$$
so $L$ achieves its maximum at $\hat{\theta} = \bar{x}$.
In all cases, the M.L.E. of $\theta$ is $\hat{\theta} = \bar{X}$.
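Since all three cases give the sample mean, the estimator is one line of code. A minimal sketch in Python (function name is illustrative):

```python
# MLE of theta for a Bernoulli sample is the sample mean x-bar;
# this covers all three cases above (sum = 0, sum = n, 0 < sum < n).
def bernoulli_mle(xs):
    return sum(xs) / len(xs)
```

For example, `bernoulli_mle([1, 0, 1, 1])` returns the sample proportion of ones.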
Example: Suppose that $X_1, \ldots, X_n$ form a random sample from a normal distribution for which the mean $\mu$ is unknown and the variance $\sigma^2$ is known. For all observed values $x_1, \ldots, x_n$, the likelihood function will be
$$f_n(\mathbf{x}|\mu) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\Big]$$
$f_n(\mathbf{x}|\mu)$ will be maximized by the value of $\mu$ that minimizes
$$Q(\mu) = \sum_{i=1}^{n}(x_i - \mu)^2 = \sum_{i=1}^{n} x_i^2 - 2\mu\sum_{i=1}^{n} x_i + n\mu^2$$
We see that $Q$ is a quadratic in $\mu$ with positive coefficient on $\mu^2$. It follows that $Q$ will be minimized where its derivative is 0.
$$\frac{dQ(\mu)}{d\mu} = -2\sum_{i=1}^{n} x_i + 2n\mu = 0 \;\Rightarrow\; \hat{\mu} = \frac{\sum_{i=1}^n x_i}{n} = \bar{x}$$
Hence the M.L.E. of $\mu$ is $\hat{\mu} = \bar{X}$.
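The quadratic argument can be verified directly: the sample mean makes $Q(\mu)$ no larger than nearby values. A short sketch in Python (the sample values are illustrative):

```python
# With sigma^2 known, the MLE of mu minimizes Q(mu) = sum((x_i - mu)^2);
# the minimizer is the sample mean.
def normal_mean_mle(xs):
    return sum(xs) / len(xs)

# The quadratic Q(mu) from the derivation above.
def Q(mu, xs):
    return sum((x - mu) ** 2 for x in xs)

data = [2.0, 3.5, 4.1, 1.4]    # illustrative sample
mu_hat = normal_mean_mle(data)
```

Comparing `Q(mu_hat, data)` with `Q(mu_hat ± 0.1, data)` confirms the minimum sits at $\bar{x}$.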
Example: Suppose again that $X_1, \ldots, X_n$ form a random sample from a normal distribution for which the mean $\mu$ is unknown and the variance $\sigma^2$ is also unknown. For all observed values $x_1, \ldots, x_n$, the likelihood function will be
$$f_n(\mathbf{x}|\theta) = \frac{1}{(2\pi\sigma^2)^{n/2}} \exp\Big[-\frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2\Big]$$
The parameter is $\theta = (\mu, \sigma^2)$, where $-\infty < \mu < \infty$ and $\sigma^2 > 0$.
$$L(\theta) = \log f_n(\mathbf{x}|\theta) = -\frac{n}{2}\log(2\pi) - \frac{n}{2}\log\sigma^2 - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(x_i - \mu)^2$$
We shall find the value of $\theta = (\mu, \sigma^2)$ for which $L(\theta)$ is maximum.
$$\begin{cases} \dfrac{\partial L}{\partial \mu} = \dfrac{1}{\sigma^2}\sum_{i=1}^{n}(x_i - \mu) = 0 \\[2mm] \dfrac{\partial L}{\partial \sigma^2} = -\dfrac{n}{2\sigma^2} + \dfrac{1}{2\sigma^4}\sum_{i=1}^{n}(x_i - \mu)^2 = 0 \end{cases}$$
Solving these two equations, we have
$$\hat{\mu} = \bar{x}, \qquad \hat{\sigma}^2 = \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^2.$$
The M.L.E. of $\theta$ is $\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2) = \Big(\bar{X}, \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\Big)$.
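The joint MLE can be sketched in a few lines of Python (the function name is illustrative). Note the variance estimate divides by $n$, not $n - 1$:

```python
# Joint MLE for a normal sample with both parameters unknown:
# mu-hat = x-bar and sigma2-hat = (1/n) * sum((x_i - x-bar)^2).
# The divisor is n, not n - 1, so this is not the unbiased sample variance.
def normal_mle(xs):
    n = len(xs)
    mu = sum(xs) / n
    sigma2 = sum((x - mu) ** 2 for x in xs) / n
    return mu, sigma2
```

For example, `normal_mle([1.0, 2.0, 3.0])` gives $\hat{\mu} = 2$ and $\hat{\sigma}^2 = 2/3$.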
Example: Suppose that $X_1, \ldots, X_n$ form a random sample from the uniform distribution on the interval $[0, \theta]$, where $\theta > 0$. The p.d.f. of each observation is
$$f(x|\theta) = \begin{cases} \frac{1}{\theta} & 0 \le x \le \theta \\ 0 & \text{otherwise} \end{cases}$$
The joint p.d.f. (likelihood function) $f_n(\mathbf{x}|\theta)$ of $X_1, \ldots, X_n$ has the form
$$f_n(\mathbf{x}|\theta) = \begin{cases} \frac{1}{\theta^n} & 0 \le x_i \le \theta \; (i = 1, \ldots, n) \\ 0 & \text{otherwise} \end{cases}$$
The MLE of $\theta$ must be a value of $\theta$ for which $\theta \ge x_i$ ($i = 1, \ldots, n$) and that maximizes $\frac{1}{\theta^n}$ among all such values. Since $\frac{1}{\theta^n}$ is a decreasing function of $\theta$, the estimate will be the smallest value of $\theta$ such that $\theta \ge x_i$ for $i = 1, \ldots, n$. Since this value is $\theta = \max\{x_1, \ldots, x_n\}$, the MLE of $\theta$ is $\hat{\theta} = \max\{X_1, \ldots, X_n\}$.
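This is an example where the MLE is not found by differentiation, and the resulting estimator is correspondingly simple in code (a minimal sketch; the name is illustrative):

```python
# MLE of theta for Uniform[0, theta]: the smallest theta consistent
# with all observations, i.e. the sample maximum.
def uniform_mle(xs):
    return max(xs)
```

For example, `uniform_mle([0.8, 2.3, 1.1])` returns 2.3, the smallest $\theta$ whose interval $[0, \theta]$ contains every observation.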
7.6 Properties of Maximum Likelihood Estimators
Theorem: Invariance Property of M.L.E.s. If $\hat{\theta}$ is the maximum likelihood estimator of $\theta$ and if $g$ is a one-to-one function, then $g(\hat{\theta})$ is the maximum likelihood estimator of $g(\theta)$.
Example: Suppose that $X_1, \ldots, X_n$ form a random sample from a normal distribution for which both the mean $\mu$ and the variance $\sigma^2$ are unknown. It was found that the MLE of $\theta = (\mu, \sigma^2)$ is
$$\hat{\theta} = (\hat{\mu}, \hat{\sigma}^2) = \Big(\bar{X}, \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2\Big)$$
From the invariance property, we can conclude that the MLE of $\sigma$ is $\hat{\sigma} = \sqrt{\hat{\sigma}^2} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2}$; also, the MLE of $\mu^2 + \sigma^2$ is $\hat{\mu}^2 + \hat{\sigma}^2 = \bar{X}^2 + \frac{1}{n}\sum_{i=1}^{n}(X_i - \bar{X})^2$.
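In code, invariance means simply applying $g$ to the computed estimate. A sketch in Python (the sample values and names are illustrative):

```python
import math

# MLE of (mu, sigma^2) for a normal sample, as in the earlier example.
def normal_mle(xs):
    n = len(xs)
    mu = sum(xs) / n
    return mu, sum((x - mu) ** 2 for x in xs) / n

data = [1.0, 2.0, 3.0, 4.0]   # illustrative sample
mu_hat, sigma2_hat = normal_mle(data)

# By invariance, apply g directly to the MLE:
sigma_hat = math.sqrt(sigma2_hat)              # MLE of sigma
mu2_plus_sigma2 = mu_hat ** 2 + sigma2_hat     # MLE of mu^2 + sigma^2
```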
Consistency
Under some conditions, the maximum likelihood estimator is consistent. Consistency means that with a sufficiently large number of observations $n$, the estimator $\hat{\theta}_n$ recovers the value of $\theta$ with arbitrary precision: for every $\varepsilon > 0$,
$$\lim_{n \to \infty} P(|\hat{\theta}_n - \theta| < \varepsilon) = 1$$
Method of Moments
Definition: Assume that $X_1, \ldots, X_n$ form a random sample from a distribution $X$.
Sample moments: $m_j = \frac{1}{n}\sum_{i=1}^{n} X_i^j$ for $j = 1, \ldots, k$.
Population moments: $\mu_j(\theta) = E(X^j) = E(X_i^j)$.
For a $k$-dimensional parameter $\theta$, set up the $k$ equations $m_j = \mu_j(\theta)$ and solve for $\theta$.
Example: We considered a sample of size $n$ from the gamma distribution with parameters $\alpha$ and 1. We use one equation: $m_1 = \frac{1}{n}\sum_{i=1}^{n} x_i = \bar{x}$ and $\mu_1 = E(X) = \alpha$. Setting $m_1 = \mu_1$ gives $\hat{\alpha} = \bar{x}$. The method of moments estimator is then $\hat{\alpha} = \bar{X}$.
Definition: A random variable $X$ has the gamma distribution with parameters $\alpha > 0$, $\beta > 0$, if $X$ has a continuous distribution for which the p.d.f. is
$$f(x) = \begin{cases} \dfrac{\beta^{\alpha}}{\Gamma(\alpha)} x^{\alpha - 1} e^{-\beta x} & x > 0 \\ 0 & x \le 0 \end{cases}$$
Theorem: If $X \sim \text{Gamma}(\alpha, \beta)$, then
$$E(X) = \frac{\alpha}{\beta}, \qquad Var(X) = \frac{\alpha}{\beta^2}.$$
Example: We considered a sample of size $n$ from the gamma distribution with unknown parameters $\alpha$ and $\beta$.
$$\mu_1 = \frac{\alpha}{\beta}, \qquad \mu_2 = \frac{\alpha(\alpha + 1)}{\beta^2}$$
The method of moments says to replace the population moments in these equations by the sample moments and then solve for $\alpha$ and $\beta$. Let $\mu_1 = m_1$ and $\mu_2 = m_2$; then
$$\hat{\alpha} = \frac{m_1^2}{m_2 - m_1^2}, \qquad \hat{\beta} = \frac{m_1}{m_2 - m_1^2}$$
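The two-moment solution translates directly into code. A minimal sketch in Python (the function name and sample values are illustrative):

```python
# Method-of-moments estimates for Gamma(alpha, beta):
# alpha-hat = m1^2 / (m2 - m1^2), beta-hat = m1 / (m2 - m1^2),
# where m1 and m2 are the first two sample moments.
def gamma_mom(xs):
    n = len(xs)
    m1 = sum(xs) / n
    m2 = sum(x ** 2 for x in xs) / n
    v = m2 - m1 ** 2          # sample variance with divisor n
    return m1 ** 2 / v, m1 / v

alpha_hat, beta_hat = gamma_mom([1.2, 0.7, 2.5, 1.9, 0.4])
```

As a check, the returned estimates satisfy $\hat{\alpha}/\hat{\beta} = m_1$, matching the first moment equation.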
Theorem: The sequence of method of moments estimators based on $X_1, \ldots, X_n$ is a consistent sequence of estimators of $\theta$.