Upload
others
View
3
Download
0
Embed Size (px)
Citation preview
Likelihood Ratio Tests-
Let It :④Era and A : ④ ERA ,where Laura =L
and DH n SLA = 0 .
Definition The Likelihood Ratio Test ( LRT) statistic
is X Ge) = sup FCK lo)Otrasup FCK 10)
O Er
andany
test which rejects it when Hk) E c for some
c Eco . I) is called a likelihood ratio test#
^
Note that by the factorization theorem Ilk) will depend onk only through a sufficient statistic .
Eixample Say Po saysthat X = ( X
, . . . . , Xn ) are i. i. d .
N ( M , o') .
Here 0 = ( M ,02) E r = IR x co, o ) . Say
H : M = Mo and A : M F Mo for some given Mo EIR .
Such an alternative is called a z - sided alternative and
the hypothesis is called a point hypothesis . We havef- Cx Io) =⇐E)" e
- to ( ki - M )'
←
The denominator of Xlk) is the likelihood at the Mceof O
,which is 8 = in
,OT ) = ( I
,ht ( ki - I )
' ).
Under H the likelihood is maximized at M = Mo and
one = 'T ( ki - Mo)'
.
When you plug in thevalues do and OT in the numerator
of Xcx ),and the values I and I in the denominator
,
the exponentials cancel in the ratio .The constant Eta) "
also cancels in the ratio.
What remains is
X Cx) ="'
foil"
= .
( sci -I )' n 's
(EEE)= I!xi -I)
' n'a
⇐i⇒y
"
E. ( ki - I )'
t n CI -mo)'
' fit j"
n CI - Mo )2
This is small if and only if ¥iis big ,
equivalently ,if and only if
is big , equivalenty if and only ifI
II - Mol- is big ,
where S'= IT .
( ki - IT is
5 IF the sample variance .
The test statistic lr is called * z- sided t test
statistic,because when oera we have that M - Mo ,
and
I - Mo =¥oiEz I ntn ,
E. CXi -I )4cn-D
So the power function of this test is constant for all
OER # ,and for a size a test we reject It if
I I - MolE > th
- 1,4 , ← areais E to the night of this
value under the th- i density .
This is the standard z - sided t - test.
P - Hakes-
A p -value is a statistic ( a function of x ) and is
defined for any given family of tests .
Det.
Let the hypothesis be H :① era.
and the alternativeA :④ E DA
. .
Let { 0g : 8 ET} be a family of non randomizedtests of Hrs A indexed by 8 ET
.
Let 218) be the size of the
test 0g .
The p-value for a given observed value of x is
p Cx) = inf { 218) : 0g (x) =L } .
In words,the smallest
size amongall the tests in the family that reject It for
the givin k .
Note that pH ) E Co , D for any K .
Example Say Po says that X = ( X. . . . ., Xn) are iii.d
.
NCO ,07 , o
'known .
Let Hi ④ soo and A i.④ soo,
where Oo is given .
We saw in a previous example that theUMP level a test was
01, Ck)= I if Ios > Z,{ o if I za
where Ze is the value such that the area to the right of it underthe Nco, D pdf is 2 .
we take our family of tests to be { Ola ! 2 C- Co , D} .
The size of the test 02 is 4.Now suppose k =L K . . . . , xn)
is observed .What is p Csc)
,the p - value fo- the observed K .
Ola rejects H if I - OoZ Z,
.
The set of tests , ;
our family that reject H for the given k are all those
for which the threshold Za is less that Io, .As za
increases,the size of the test decreases
.So the p - value
will be the size of the test when Za =.The
size of this test is the p - value ,
put =P (IzI I ④ -
- Oo) = i - Io (I) .
Remarks-aboutthep-ralueiithisexampl.cm ,
① As I increases the p - value decreases ,and also the
strength of evidence against It increases .This agreeswith the usual interpretation of a p - value,which is
that as the p - value decreases ,the strength of evidence
against It is greater .
② In this example , it is true that for a given x ,the
set of tests that do not reject It are more"
conservative "
than the tests that do reject It , and the tests that aremore conservative hare size less than p Cx) . This orderingof the tests according to their size corresponding to howconservative they are is widely held to be true in generalCit 's not true in general ) in various fields where p - valuesare computed and reported .
Eixample ( Lehmann , Testing Statistical Hypotheses)Let X be a single random variable ,
I = { I, 2,33
,
a - I supposethe distributions of X given ④ = O are all
discrete on { I , 2 , 3,43 , given in the following table !K
i 2 3 4
113O 2 4 2 I 6 113
3 4 3 2 4 113
Suppose H : O E { 1,23 and A ! O =3
We take as our family of tests the ones corresponding toall the subsets of { I , 2 , 3,43 ( including the empty set ) ;the subset is the set of x values for which we would rejectH
.There are 16 such tests in our family . Suppose we observe
k =3,
what is pls) ?
Firstly , for a given subset BC { 1
, 2,3 , 4} ,the test
corresponding to the rejection region B has powerat a given
value of O computed by summing over the row correspondingto O and over the columns corresponding the x values in B
.
The size of this test is the maximum of the row sums for0=1 and 0=2 , For example ,
if 'B = { I , 3,43 ,the size of
the test with rejection region B is max {÷,
= IT.
The set of tests that reject It when x =3 is observed is allthose tests whose rejection region contains 3
. The test withthe smallest size will be the one whose rejection region isjust the Singleton { 3) ,
and the size of this test is 3113,
and this is the p - value when K =3 is observed ; i.e .. p (3) = 343 .
Remade If you look at all the tests that do not rejectIt when x =3. is
observed C these are all the tests whose
rejection region does not contain 3) , all of them havesize at least ¥ .
So,it is not true in this example
that the more" conservative
"
tests that did not reject Itwhen k =3 is observed have smaller size than the p - value .
* other than the test that never rejects ,i.e .
the rejection' region is 0 .
Set Estimation-
Approach to inference about a parameter ④ that providesas an estimate a set of O values rather than a singlevalue
,which is what one does in point estimation .
Let ④ = (④ o ,④
, ) .
If ④o=④ then we ignore ④ ,
( i.e.,
it may be that ④ois the entire parameter . Similarly ,
I = So xD , ( ro is the parameter space for ④ oand
r, is
the parameter space for ④ ,)
.If R = Do we ignore
r,( even though in the notation we still write down r , )
.
Say we want to give a set estimator for ④o .
Def A setestimator of ④o is a set - valued function
of X, say
CLX),
whose range is in 2% ( 2^0 is
the set of all subsets of ro ) .
A setestim.at is
a realization of a set estimator, say CCK) .
In the classical framework ,both set estimators and set
estimates are called confidence regions when Do is one -
-dimensional and the set estimator is always an interval ,
the set estimator and the set estimate are called confidenceinteai
Det.
For a given OER the coverage probability-of c ( x) at O is P( Oo C- CH) I ④ = O)
,where 0=100,0 , )
.
If the average probability is Z l - a for all OER,
then the set estimator C Cx) is called aCtttconfidenceset,or a Li - L) - confidence regions#,
or a
( i - L) - confidence interval ( in the care when do is one -
dimensional and CH) is always an interval ) .The value tax
is called the cozfidenceyff.ci of C CX ) . often theconfidence coefficient is multiplied by Loo and the confidenceset is referred to as a Ci -2) x too 90 confidence set
.
Examples Let Po say that X= ( Xi , Xz) are iii. d. given④ =D
Uniform ( o - I , Ott) ,O Er = IR .
Here,④ ,
and D ,
are not there .Let C CX) = ( min ( Xi ,XD , max (Xi , Xs) ) .
Given ④ = Oo ,the coverage probability of ( CX ) is
P ( min. LX , ,Xz) E Oo E max ( X , , Xz) / HQ = Oo )=L - P ( Oo ¢ ( min CX , , XD ,
max ( X, , Xa) I ④ = Oo)
=L - P ( { X , a Oo , Xz < Oo) U { X ,> Oo
,X, > Oo} I ④ = Oo)
=L - LP ( X , Coo , Xz < Oo I④ '- Oo) t PCX,> Oo
, Xz > Oo I④ = Oo) )=L - LP ( X ,
cool ④ '- Oo )'t P ( X
,> Oo I④ = Oo )')
= i - k¥1' t (E)Y '- i - I = I.
This coverage probability is the same for all Oo .
So the
confidence coefficient is I.
That is,(min ( Xi
,K)
,max CX , , Xs))
is a 50% confidence interval for ④.
Remark If the realized interval (min ( ki .kz) , max ( x , , Ks ) )-
is of length greater than I then the interval must containthe true value of O .
The "50%"
does not refer to anythingabout the realized interval .
We can compute the probabilitythat the length of ( min (Xi , Xs) , max (Xi ,Ks)) will begreater than I .
. .
":#six: ''Ii i. the
points in the shaded region .The area of
the shaded region is ITSo with probability 4 the realized interval will containthe true value of O with certainty .
We can construct tests from set estimators and owe can also
construct set estimators from families of tests .
Theorem-
(a) Let { Ooo : Oo C- Do) be a family of tests such that
doo is a level a test of It :④o
'
- Oo vs A : ④o too
,
( same 2 for everytest doo)
.
Define
( ( x) = { Oo C- Do ! Ooo Ck) = o}.
( i.e.,CCK) is the set
of Oo for which the test Ooo accepts H for the givin x ) .
Then C CX ) is a Cl - d) - confidence region .
(b) Let C ( x) be a Ci -2) - confidence regions fo- ④ o.
For
each Oo C- Do define the test
doo (x) = I if Oo f E ( x){ o if Oo C- CCK)
Then too.
is a level a test of H : ④o
- Oo vs A : ④of Oofor all Oo C- Do
.
Proof(a) Need to show that for any O = too
,o
,) that
P ( Oo ECCX) I ④ = O) Z l - 2.
But PCO . E CCx) I ④-
- O) =P ( Ooo CX) -- o l④ -
- O)= I - PC 0%1×1=1 I④ -
- O)-S2 because Ooo isa level a test
Z i -2
(b) We want to show that for any O = ( Oo, Oi )
P ( Ooo CX) =L I ④ -
- o) E 2But PC Ooo Cx) -
- I I ④ = O ) =P ( Oo te C CX) I ④ -
- O)= I - P ( Oo E C CX) I ④ -
- O )-Z l - 2 since C CX) hasconfidence coefficient I - L
E 2
Example Say Po says that X =L X . . . ., Xn) are iii. d .
Nlm ,or) ,
O = ( Mio-)
.
Let the hypotheses beH : M
-
- Mo vs A : MFMo fo- given Mo ER .We had derived
the z - sided t test in our example illustrating the likelihoodratio test .
This test isI - Mo
0/41=1 if ⇐ Z th - I, 212{o '},I÷ etn-i.am
This test is a level 2 test.
This is a family of
level 2 tests as we let the hypothesis no vary over R.
The c - 2) - confidence interval given by part Ca) of thetheorem is
,for a givin k , the set of all Mo for which
the test of H : M = Mo accepts It for the glue . K .
We have
c. Csc) = { no :'Iif Stn - i , an}
= { Mo : II - Mol E Ent n - i.an}= { Mo : I - fatal
,anE Mo II t Ent n - i
,ah}
S . C CX) = ( I - In tn - i.an ,I t Int n - i.an) is
a Ci - 2) x 100% confidence interval for M.
Set Estimation in the Bayesian framework-
Again ,let ④ = (④
o ,④
i) and D= Roth , ,where ⑤
,and
r , can be dropped if ④ = ④oand D=Do
.A set estimator
for ④owill be based on the posterior distribution of ④o given
X = x .
For any given subset of ro ,if the posterior
probability of this set is l - 2 then we say that this set
is [email protected] are many possible ways we might choose this set .
① If ④ ois one - dimensional then a reasonable way
would be
to choose the set of Oo values to be those between the
÷ - quantile and the Cl - E) - quantile of the posteriordistribution of ④ o give in X
= K ( assuming these quartilesare unique .
② A more general way to obtain such aset is to choose
Oo values that have the highest posterior density orprobability . Specifically , for a given c > o one would
look at { Oo ! IT too IK) Z c}.If this set of Oo values
has posterior probability i - 2 then this set is called
a Cl - 2) highest posterior density ( HPD) cred .
-
' ble region
for ④ o .
In practice we would specify d and the. find
the threshold c to give an HPD credible region with
probability I - 2.
This will often entail finding acomputationally .
Eixample In a previous example where Po says thatX = ( X , , i ,
Xn) are iii. d . N ( 0104 , O
'
known, given =D
,and
with prior IT lol being NlMo , Oo') , we computed the
posterior distribution of ④ given X = K to be NCM , , o,
'),
where u.= It - Mo
Mo't 110oz
and of = too)"
A.95 HPD credible interval for ④ is the set of O values
between the .025 - quantile and the
.975 quantile of the
N ( µ , , op) distribution ,since any normal density is
symmetric about its mean and the density values inbetween these z quantile are all higher than the densityvalue , beyond these quartiles .
Example Say Po says that X= CX
, ,Xs) are i. i.d. given④ -
-O,
Uniform ( o - I , Ot t) ,O C- D= R
.
Let the prior TCO)
put positive density at every value of O EIR( e.g , a normal
density) .
The posterior density is proportional to thelikelihood x prior .
TT lol k) h Ic - • , o + z )(max ki) Ico - ±
,→Cm in ki) TCO)
= Imax sci - I , min sci +±)(O) TCO )
Thus,whatever prior IT lol that we choose ,
the support ofthe posterior is ( max ki - I , min kit 's) .
This would be
a look credible interval for ④ .
Note that the length of thisinterval is min kit E - ( max Ki - I)
= I - ( max Ki- min ki )
As max Ki - min Ki increases ,the length of the
interval decreases.