STATISTICS (ST102): LENT TERM MATERIAL
1. Point Estimation

Data (X, Y) -> Estimator (a formula, i.e. a function of the data) -> Estimates

Methods for constructing estimators: Method of Moments, Least Squares, Maximum Likelihood.
Note: The estimator is a random variable, as the value of the estimate changes as different samples are drawn. Thus, the estimator has a probability distribution. The estimates computed from a particular observed sample are regarded as constants.
Measuring the Performance of an Estimator:

Mean Squared Error (MSE) - measures the trade-off between bias and efficiency:

$$\mathrm{MSE}(\hat\theta) = E\big[(\hat\theta - \theta)^2\big] = E\big[(\hat\theta - E(\hat\theta)) + (E(\hat\theta) - \theta)\big]^2 = \operatorname{Var}(\hat\theta) + \{\operatorname{Bias}(\hat\theta)\}^2$$

Bias - whether, on average, the estimator gives the true value:

$$\operatorname{Bias}(\hat\theta) = E(\hat\theta) - \theta$$

Therefore, if an estimator is unbiased: $\operatorname{Bias}(\hat\theta) = 0$, i.e. $E(\hat\theta) = \theta$.

Disadvantage of squared error: it does not weight all errors equally - it magnifies errors with magnitude greater than 1 and shrinks errors with magnitude less than 1.

Mean Absolute Deviation (MAD):

$$\mathrm{MAD}(\hat\theta) = E\,\big|\hat\theta - \theta\big|$$
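The bias-variance decomposition of the MSE can be checked by simulation. Below is a minimal sketch (assuming numpy is available; the parameter values and the choice of the biased variance estimator, which divides by n, are purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n, sigma2, reps = 20, 4.0, 100_000

# Biased variance estimator: divides by n rather than n - 1.
samples = rng.normal(0.0, np.sqrt(sigma2), size=(reps, n))
estimates = samples.var(axis=1)            # ddof=0, i.e. divide by n

mse = np.mean((estimates - sigma2) ** 2)
bias = estimates.mean() - sigma2           # theory: -sigma2 / n
var = estimates.var()
print(f"MSE = {mse:.4f}  vs  Var + Bias^2 = {var + bias**2:.4f}")
```

The two printed values coincide, since the identity $\mathrm{MSE} = \operatorname{Var} + \operatorname{Bias}^2$ holds for the empirical moments as well.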
Method of Moments Estimator (MME):

Set the k-th (non-central) sample moment equal to the k-th population moment:

$$k=1: \quad \frac{1}{n}\sum_{i=1}^{n} X_i = \bar X = E(X)$$

$$k=2: \quad \frac{1}{n}\sum_{i=1}^{n} X_i^2 = E(X^2) = \operatorname{Var}(X) + \{E(X)\}^2$$

$$\dots$$

$$\text{general } k: \quad \frac{1}{n}\sum_{i=1}^{n} X_i^k = E(X^k)$$

The left-hand sides are computable from the data. The right-hand sides are not directly computable, as they depend on the unknown parameters; solving the moment equations for those parameters gives the MME.
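As a minimal sketch of the method (assuming i.i.d. normal data, where the first two moment equations identify $\mu$ and $\sigma^2$; the sample is simulated for illustration):

```python
import numpy as np

def mme_normal(x):
    """Method-of-moments estimates for N(mu, sigma^2).

    Matching m1 = E(X) and m2 = E(X^2) = sigma^2 + mu^2 gives
    mu_hat = m1 and sigma2_hat = m2 - m1**2.
    """
    m1 = np.mean(x)
    m2 = np.mean(x ** 2)
    return m1, m2 - m1 ** 2

rng = np.random.default_rng(1)
x = rng.normal(loc=5.0, scale=2.0, size=1_000)
mu_hat, sigma2_hat = mme_normal(x)
print(mu_hat, sigma2_hat)   # close to the true values 5 and 4
```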
Maximum Likelihood Estimator (MLE): requires independent and identically distributed (i.i.d.) samples.
Step 1: Construct the Likelihood Function

$$L(\theta) = f(X_1, X_2, \dots, X_n; \theta) = f(X_1;\theta)\, f(X_2;\theta) \cdots f(X_n;\theta) = \prod_{i=1}^{n} f(X_i;\theta)$$

It is often easier to work with the log-likelihood function:

$$\ell(\theta) = \ln L(\theta) = \ln \prod_{i=1}^{n} f(X_i;\theta) = \sum_{i=1}^{n} \ln f(X_i;\theta)$$
Step 2: Maximize the Likelihood Function

Maximize the log-likelihood function by differentiation or by observation:

$$\left.\frac{d\ell(\theta)}{d\theta}\right|_{\theta = \hat\theta} = 0$$

Note: In the likelihood function, the X's (sample data) are treated as constants, and $\theta$ (the parameter) is the variable.
First-order conditions are applicable only if the function is continuously differentiable.
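A minimal sketch of Step 2 in code (assuming i.i.d. exponential data with rate $\lambda$, for which the MLE has the closed form $\hat\lambda = 1/\bar X$; the numerical optimiser is shown only to illustrate maximizing $\ell$ directly):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)
x = rng.exponential(scale=1 / 0.5, size=500)   # true rate lambda = 0.5

def neg_log_lik(lam):
    # log f(x; lambda) = log(lambda) - lambda * x, summed over the sample
    return -(len(x) * np.log(lam) - lam * x.sum())

res = minimize_scalar(neg_log_lik, bounds=(1e-6, 10.0), method="bounded")
print(res.x, 1 / x.mean())   # numerical maximizer vs closed-form MLE
```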
Properties:
- Under suitable conditions, the MLE and MME have nice large-sample properties:
- Consistent: as $n \to \infty$, $\mathrm{MSE}(\hat\theta) \to 0$.
- Asymptotically normal: as n approaches infinity, under some regularity conditions,

$$\sqrt{n}\,(\hat\theta - \theta) \to N\!\left(0, \frac{1}{I(\theta)}\right), \qquad \hat\theta - \theta \approx N\!\left(0, \frac{1}{n I(\theta)}\right), \qquad \hat\theta \approx N\!\left(\theta, \frac{1}{n I(\theta)}\right)$$

where $I(\theta)$ is the Fisher information, defined as:

$$I(\theta) = -E\!\left[\frac{\partial^2}{\partial\theta^2} \ln f(X;\theta)\right] = -\int f(x;\theta)\, \frac{\partial^2}{\partial\theta^2} \ln f(x;\theta)\, dx$$
- On top of this, the MLE is also:
- Invariant: if $\hat\theta$ is the MLE of $\theta$, then $g(\hat\theta)$ is the MLE of $g(\theta)$.
- More efficient than the MME: $\operatorname{Var}(\mathrm{MLE}) \le \operatorname{Var}(\mathrm{MME})$.

Conceptually, the MLE uses more of the information in the sample than the MME. Hence, we always use the MLE when possible.
2. Confidence Intervals

A confidence interval for a parameter $\theta$ at the 90/95/99% level takes the form $\hat\theta \pm C \cdot SE(\hat\theta)$, so the length of the confidence interval is $2 \times C \times SE(\hat\theta)$.

x% confidence interval: if one repeats the interval estimation a large number of times, about x% of the time the interval estimator covers the true $\theta$.
Confidence Intervals for the Population Mean:

- Population normally distributed, variance known:
$$\frac{\bar X - \mu}{\sigma/\sqrt{n}} \sim N(0,1) \qquad \text{(use Z-table)}$$

- Population normally distributed, variance unknown:
$$\frac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1} \qquad \text{(use t-table)}$$

- Population not normal: if n is large, by the Central Limit Theorem,
$$\frac{\bar X - \mu}{\sigma/\sqrt{n}} \approx N(0,1) \text{ approximately} \qquad \text{(use Z-table)}$$

- For a proportion (simplification after the CLT):
$$\frac{\hat p - p}{\sqrt{p(1-p)/n}} \approx N(0,1) \qquad \text{(use Z-table)}$$
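A minimal sketch of the variance-unknown case (the sample values are made up; scipy is assumed available):

```python
import numpy as np
from scipy import stats

x = np.array([12.1, 11.4, 13.2, 12.8, 11.9, 12.5, 13.0, 12.2])
n, xbar, s = len(x), x.mean(), x.std(ddof=1)

# 95% t-interval for mu: xbar +/- t_{0.025, n-1} * s / sqrt(n)
t_crit = stats.t.ppf(0.975, df=n - 1)
half = t_crit * s / np.sqrt(n)
print(f"95% CI for mu: ({xbar - half:.3f}, {xbar + half:.3f})")
```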
Chi-Squared Distribution: $\chi^2_k$, the chi-squared distribution with k degrees of freedom.

If $X_i \sim N(0,1)$ independently, then
$$Z = X_1^2 + X_2^2 + \dots + X_k^2 = \sum_{i=1}^{k} X_i^2 \sim \chi^2_k$$

The support of $\chi^2_k$ is $[0, \infty)$, with $E(Z) = k$ and $\operatorname{Var}(Z) = 2k$.

Test for the Population Variance:
$$\frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$

A $(1-\alpha)$ confidence interval for $\sigma^2$ is
$$\left( \frac{(n-1)s^2}{\chi^2_{\alpha/2,\,n-1}},\ \frac{(n-1)s^2}{\chi^2_{1-\alpha/2,\,n-1}} \right)$$

writing $\chi^2_{\alpha,\,k}$ for the upper-$\alpha$ critical value.
Proof

Let $X_i \sim N(\mu, \sigma^2)$, $i = 1, 2, \dots, n$. Then
$$\sum_{i=1}^{n} \left( \frac{X_i - \mu}{\sigma} \right)^2 = \frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 \sim \chi^2_n$$

Decompose the sum:
$$\frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \mu)^2 = \frac{1}{\sigma^2} \sum_{i=1}^{n} \big[ (X_i - \bar X) + (\bar X - \mu) \big]^2 = \frac{1}{\sigma^2} \left[ \sum_{i=1}^{n} (X_i - \bar X)^2 + n(\bar X - \mu)^2 + 2(\bar X - \mu)\sum_{i=1}^{n} (X_i - \bar X) \right]$$

where the cross term vanishes since $\sum_{i=1}^{n} (X_i - \bar X) = 0$. For the second term,
$$\frac{n}{\sigma^2} (\bar X - \mu)^2 = \left( \frac{\bar X - \mu}{\sigma/\sqrt{n}} \right)^2 \sim \chi^2_1$$

and for the first term,
$$\frac{1}{\sigma^2} \sum_{i=1}^{n} (X_i - \bar X)^2 = \frac{(n-1)s^2}{\sigma^2}, \qquad \text{where } s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (X_i - \bar X)^2$$

Since $\bar X$ and $s^2$ are independent for normal samples, $\chi^2_n = \chi^2_1 + \chi^2_{n-1}$, and it follows that $\dfrac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$.
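A minimal sketch of the variance interval in code (data made up; note that scipy's `chi2.ppf` returns quantiles, so the upper-$\alpha$ critical value is `ppf(1 - alpha)`):

```python
import numpy as np
from scipy import stats

x = np.array([4.2, 5.1, 3.8, 4.9, 5.3, 4.4, 4.7, 5.0, 4.1, 4.6])
n, s2, alpha = len(x), x.var(ddof=1), 0.05

# (1 - alpha) CI for sigma^2 from (n-1)s^2 / sigma^2 ~ chi^2_{n-1}
lo = (n - 1) * s2 / stats.chi2.ppf(1 - alpha / 2, df=n - 1)
hi = (n - 1) * s2 / stats.chi2.ppf(alpha / 2, df=n - 1)
print(f"95% CI for sigma^2: ({lo:.3f}, {hi:.3f})")
```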
t-Distribution: $t_k$, the Student t-distribution with k degrees of freedom.

Let $Z \sim N(0,1)$ and $X \sim \chi^2_k$ be independent. Then
$$T = \frac{Z}{\sqrt{X/k}} \sim t_k$$

t is a continuous and symmetric distribution on $(-\infty, \infty)$, with heavier tails than the normal distribution.
As k approaches infinity, the t distribution converges to the standard normal distribution.
Test for the Population Mean:
$$\frac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}$$

A $(1-\alpha)$ confidence interval for $\mu$ is $\bar X \pm t_{\alpha/2,\,n-1} \cdot \dfrac{s}{\sqrt{n}}$.

Proof

$$\frac{\bar X - \mu}{\sqrt{\sigma^2/n}} \sim N(0,1) \qquad \text{and} \qquad \frac{(n-1)s^2}{\sigma^2} \sim \chi^2_{n-1}$$

so, by the definition of the t-distribution (numerator and denominator are independent),
$$\frac{\dfrac{\bar X - \mu}{\sqrt{\sigma^2/n}}}{\sqrt{\dfrac{(n-1)s^2/\sigma^2}{n-1}}} = \frac{\bar X - \mu}{\sqrt{\sigma^2/n}} \cdot \sqrt{\frac{\sigma^2}{s^2}} = \frac{\bar X - \mu}{s/\sqrt{n}} \sim t_{n-1}$$
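A minimal sketch of the corresponding test (made-up data; `stats.ttest_1samp` computes the same $T = (\bar X - \mu_0)/(s/\sqrt{n})$ together with a two-sided p-value):

```python
import numpy as np
from scipy import stats

x = np.array([10.3, 9.8, 10.6, 10.1, 9.7, 10.4, 10.2, 9.9])

# H0: mu = 10 vs H1: mu != 10
t_stat, p_value = stats.ttest_1samp(x, popmean=10.0)
print(f"T = {t_stat:.3f}, p-value = {p_value:.3f}")
```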
F-Distribution: $F_{p,k}$, the F-distribution with degrees of freedom p, k.

Let $U \sim \chi^2_p$ and $V \sim \chi^2_k$ be independent random variables. Then
$$W = \frac{U/p}{V/k} \sim F_{p,k}$$

The support of $F_{p,k}$ is $[0, \infty)$, with
$$E(W) = \frac{k}{k-2}, \ k > 2 \qquad \operatorname{Var}(W) = \frac{2k^2(p+k-2)}{p(k-2)^2(k-4)}, \ k > 4$$

If $W \sim F_{p,k}$, then $W^{-1} \sim F_{k,p}$. If $T \sim t_k$, then $T^2 \sim F_{1,k}$.

Test for the Ratio of Two Normal Variances:
$$H_0: \sigma_Y^2/\sigma_X^2 = r \quad \text{vs} \quad H_1: \sigma_Y^2/\sigma_X^2 \ne r$$

$$T = \frac{\big[(n-1)s_X^2/\sigma_X^2\big] / (n-1)}{\big[(m-1)s_Y^2/\sigma_Y^2\big] / (m-1)} = \frac{\sigma_Y^2}{\sigma_X^2} \cdot \frac{s_X^2}{s_Y^2} = r\,\frac{s_X^2}{s_Y^2} \sim F_{n-1,\,m-1}$$

A $(1-\alpha)$ confidence interval for $\sigma_Y^2/\sigma_X^2$ is
$$\left( F_{1-\alpha/2,\,n-1,\,m-1}\, \frac{s_Y^2}{s_X^2},\ F_{\alpha/2,\,n-1,\,m-1}\, \frac{s_Y^2}{s_X^2} \right)$$
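A minimal sketch of the variance-ratio test with $r = 1$ (simulated data; scipy has no one-call F-test for variances, so the statistic and a two-sided p-value are computed directly):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
x = rng.normal(0, 2.0, size=25)   # sample X, n = 25
y = rng.normal(0, 2.0, size=30)   # sample Y, m = 30

# H0: sigma_Y^2 / sigma_X^2 = 1; statistic s_X^2 / s_Y^2 ~ F_{n-1, m-1}
n, m = len(x), len(y)
f_stat = x.var(ddof=1) / y.var(ddof=1)
p_value = 2 * min(stats.f.cdf(f_stat, n - 1, m - 1),
                  stats.f.sf(f_stat, n - 1, m - 1))
print(f"F = {f_stat:.3f}, two-sided p-value = {p_value:.3f}")
```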
3. Hypothesis Testing

Step 1: State the Null and Alternative Hypotheses

Two-tail test: $H_0: \theta = k$ vs $H_1: \theta \ne k$
One-tail test: $H_0: \theta = k$ vs $H_1: \theta > k$, or $H_0: \theta = k$ vs $H_1: \theta < k$

Step 2: Compute the Test Statistic (T)

Step 3: Look up the critical values at the level of significance $\alpha$ (1%, 5%, 10%), or compute the p-value

Critical value approach:
- One-tail test: find C such that $P(T > C) = \alpha$. If $T > C$, reject $H_0$ at level of significance $\alpha$.
- Two-tail test: find $C_{\alpha/2}$ such that $P(|T| > C_{\alpha/2}) = \alpha$. If $|T| > C_{\alpha/2}$, reject $H_0$ at level of significance $\alpha$.

p-value approach: the p-value is the smallest level of significance at which $H_0$ can be rejected. Let t be the calculated test statistic; then p-value $= P(T > t)$ or $P(T < t)$, depending on the direction of the test.
- One-tail test: if p-value $\le \alpha$, reject at level of significance $\alpha$.
- Two-tail test: if p-value $\le \alpha/2$, reject at level of significance $\alpha$.

Step 4: Conclude, e.g. "Reject $H_0$ even at the 1% level of significance" / "Do not reject $H_0$ even at the 10% level of significance".
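A minimal sketch of the p-value computation (the observed statistic and degrees of freedom are made up):

```python
from scipy import stats

t_obs, df = 2.31, 15      # assumed observed test statistic and df

p_one = stats.t.sf(t_obs, df)       # one-tail p-value: P(T > t_obs)
print(f"one-tail p-value = {p_one:.4f}")
# One-tail test: reject at 5% if p_one <= 0.05
# Two-tail test (this section's convention): reject at 5% if p_one <= 0.025
```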
Errors in Hypothesis Testing
Type 1 Error: Reject the Null Hypothesis when H0 is true
Type 2 Error: Not rejecting H0 when H1 is true
Power: P ( Rejecting H0 when H1 is true )
Note: P ( Type 2 Error ) + Power = 1
For example:

$$H_0: \mu = \mu_0 \quad \text{vs} \quad H_1: \mu \ne \mu_0, \qquad T = \frac{\bar X - \mu_0}{\sigma/\sqrt{n}}$$

Under $H_0$, $T \sim N(0,1)$.

Type 1 Error:
P(Type 1 Error) = P(observed T lies in the critical region) = level of significance $\alpha$.

Type 2 Error: the true mean is not $\mu_0$; instead $\mu = \mu_1$. T is no longer standard normal:
$$T \sim N\!\left( \frac{\mu_1 - \mu_0}{\sigma/\sqrt{n}},\ 1 \right)$$
P(Type 2 Error) = P(T, under H1, lies within critical values)
Power = P(T, under H1, lies outside critical values)
Properties
- There is a trade-off between Type 1 and Type 2 errors: as $\alpha$ falls, P(Type 2 Error) increases.
- As the distance $|\mu_0 - \mu_1|$ increases, P(Type 2 Error) falls.
- If the variance increases, P(Type 2 Error) increases.
- If the number of samples increases, the variance of $\bar X$ falls, so P(Type 2 Error) falls.
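A minimal sketch computing P(Type 2 Error) and power for the two-tail z-test above (the parameter values are made up):

```python
import numpy as np
from scipy import stats

mu0, mu1, sigma, n, alpha = 0.0, 0.5, 1.0, 25, 0.05

# Under H1, T = (xbar - mu0) / (sigma / sqrt(n)) ~ N(delta, 1)
delta = (mu1 - mu0) / (sigma / np.sqrt(n))
c = stats.norm.ppf(1 - alpha / 2)          # two-tail critical value

# P(Type 2) = P(-c <= T <= c) under H1; Power = 1 - P(Type 2)
beta = stats.norm.cdf(c - delta) - stats.norm.cdf(-c - delta)
print(f"P(Type 2 Error) = {beta:.3f}, Power = {1 - beta:.3f}")
```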
Testing the Difference of Two Population Means

Assume the data are normally distributed, or n is large:
$$X \sim N(\mu_X, \sigma_X^2), \quad Y \sim N(\mu_Y, \sigma_Y^2) \quad \Rightarrow \quad \bar X \sim N\!\left(\mu_X, \frac{\sigma_X^2}{n_X}\right), \quad \bar Y \sim N\!\left(\mu_Y, \frac{\sigma_Y^2}{n_Y}\right)$$

and the null hypothesis is $H_0: \mu_X - \mu_Y = a$.

Matched Pairs: the two distributions can be logically linked, with the same sample size $n_X = n_Y = n$.

$$\bar Z = \bar X - \bar Y \sim N\!\left( \mu_X - \mu_Y,\ \frac{\sigma_X^2}{n} + \frac{\sigma_Y^2}{n} \right)$$

Under $H_0$:
- If the variance is known:
$$T = \frac{\bar Z - a}{\sqrt{\dfrac{\sigma_X^2}{n} + \dfrac{\sigma_Y^2}{n}}} \sim N(0,1)$$
- If the variance is unknown, use the sample variance of the differences $Z_i = X_i - Y_i$:
$$T = \frac{\bar Z - a}{s/\sqrt{n}} \sim t_{n-1}, \qquad \text{where } s^2 = \frac{1}{n-1}\left( \sum Z_i^2 - n\bar Z^2 \right)$$

Independent Samples: the two distributions cannot be linked; the sample sizes may differ, $n_X \ne n_Y$.

$$T = \frac{(\bar X - \bar Y) - a}{SE(\bar X - \bar Y)}$$

Under $H_0$:
- If the variances are known:
$$SE(\bar X - \bar Y) = \sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}, \qquad T = \frac{(\bar X - \bar Y) - a}{\sqrt{\dfrac{\sigma_X^2}{n_X} + \dfrac{\sigma_Y^2}{n_Y}}} \sim N(0,1)$$

A $(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar X - \bar Y \pm Z_{\alpha/2}\sqrt{\frac{\sigma_X^2}{n_X} + \frac{\sigma_Y^2}{n_Y}}$$

- If the variances are unknown but equal ($\sigma_X^2 = \sigma_Y^2 = \sigma^2$), use the pooled variance:
$$s_P^2 = \frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1) + (n_Y-1)}$$

The pooled variance is the weighted average of the sample variances.

$$SE(\bar X - \bar Y) = \sqrt{\frac{s_P^2}{n_X} + \frac{s_P^2}{n_Y}} = \sqrt{\left( \frac{1}{n_X} + \frac{1}{n_Y} \right) \frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1) + (n_Y-1)}}$$

$$T = \frac{(\bar X - \bar Y) - a}{\sqrt{\dfrac{s_P^2}{n_X} + \dfrac{s_P^2}{n_Y}}} \sim t_{n_X + n_Y - 2}$$

A $(1-\alpha)$ confidence interval for $\mu_X - \mu_Y$ is
$$\bar X - \bar Y \pm t_{\alpha/2,\,n_X+n_Y-2}\sqrt{\left( \frac{1}{n_X} + \frac{1}{n_Y} \right) \frac{(n_X-1)s_X^2 + (n_Y-1)s_Y^2}{(n_X-1) + (n_Y-1)}}$$
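A minimal sketch of the pooled (equal-variance) case (made-up samples; `ttest_ind` with `equal_var=True` uses exactly the pooled $s_P^2$ above):

```python
import numpy as np
from scipy import stats

x = np.array([5.2, 4.8, 5.5, 5.1, 4.9, 5.3])
y = np.array([4.6, 4.9, 4.4, 4.7, 4.8, 4.5, 4.6])

# Pooled two-sample t-test of H0: mu_X - mu_Y = 0
t_stat, p_value = stats.ttest_ind(x, y, equal_var=True)
print(f"T = {t_stat:.3f}, p-value = {p_value:.4f}")

# Matched pairs would instead use stats.ttest_rel on paired samples.
```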
Test for Correlation

$$\rho = \operatorname{Corr}(X,Y) = \frac{\operatorname{Cov}(X,Y)}{\sqrt{\operatorname{Var}(X)\operatorname{Var}(Y)}} = \frac{E\big[(X-EX)(Y-EY)\big]}{\sqrt{E(X-EX)^2 \, E(Y-EY)^2}}$$

Correlation measures the linear relationship between X and Y. When $\rho = 0$, X and Y are linearly independent.

$$H_0: \rho = 0 \quad \text{vs} \quad H_1: \rho > 0 \ /\ \rho < 0 \ /\ \rho \ne 0$$

Sample correlation coefficient:
$$\hat\rho = \frac{\sum (X_i - \bar X)(Y_i - \bar Y)}{\sqrt{\sum (X_i - \bar X)^2 \sum (Y_i - \bar Y)^2}} = \frac{\sum X_i Y_i - n\bar X \bar Y}{(n-1)s_X s_Y}$$

Under $H_0$:
$$T = \frac{\hat\rho\,\sqrt{n-2}}{\sqrt{1-\hat\rho^2}} = \sqrt{\frac{n-2}{1/\hat\rho^2 - 1}} \sim t_{n-2}$$
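A minimal sketch (made-up data; `pearsonr` reports $\hat\rho$ with a two-sided p-value, and the t statistic is recomputed to match the formula above):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.1, 2.9, 4.2, 5.1, 6.0, 7.2, 8.1])
y = np.array([2.3, 2.9, 4.1, 4.8, 6.2, 6.9, 8.3, 8.8])

r, p_value = stats.pearsonr(x, y)   # two-sided test of H0: rho = 0
n = len(x)
t_stat = r * np.sqrt(n - 2) / np.sqrt(1 - r**2)
print(f"rho_hat = {r:.3f}, T = {t_stat:.3f}, p-value = {p_value:.4f}")
```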
Goodness-of-Fit Test: to assess whether a given distribution fits the data well.

1. $H_0$: the r.v. X follows a certain distribution
   $H_1$: the r.v. X does not follow that distribution

Note: in cases where the parameters of the distribution are not given, use the MLE/MME to get a point estimate.
2. Construct the table and calculate the test statistic. With k categories:

Category                        | 1     | 2     | ... | Total
Observed frequency, Z_i         |       |       |     | n
Probability, p_i                | p_1   | p_2   |     | 1
Expected frequency, E_i = n p_i | np_1  | np_2  |     | n
Difference Z_i - E_i            |       |       |     | 0
(Z_i - E_i)^2 / E_i             |       |       |     | T

Under the null hypothesis,
$$T = \sum_{i=1}^{k} \frac{(Z_i - E_i)^2}{E_i} \sim \chi^2_{k - 1 - (\text{no. of parameters estimated})}$$

A computational shortcut:
$$T = \sum_{i=1}^{k} \frac{(Z_i - E_i)^2}{E_i} = \sum_{i=1}^{k} \frac{Z_i^2}{E_i} - 2\sum_{i=1}^{k} Z_i + \sum_{i=1}^{k} E_i = \sum_{i=1}^{k} \frac{Z_i^2}{E_i} - n$$
Note: If any category has expected cell count < 5, merge groups so that all groups have expected counts greater than or equal to 5. In some cases, intervals or groups can be created by hand; for example, in a test for normality, one can divide the real line into 10 intervals, each with probability 10%.
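A minimal sketch (a made-up fair-die example with no estimated parameters, so df = k - 1):

```python
import numpy as np
from scipy import stats

# Test whether a die is fair, from 120 observed rolls
observed = np.array([18, 24, 16, 21, 22, 19])
expected = np.full(6, observed.sum() / 6)   # E_i = n * p_i = 20

t_stat, p_value = stats.chisquare(observed, f_exp=expected)
print(f"T = {t_stat:.3f}, p-value = {p_value:.4f}")   # df = 6 - 1 = 5
```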
Contingency Tables / Tests of Association: a special application of the goodness-of-fit test.

- p: the number of free counts among the $Z_{ij}$
- d: the number of estimated free parameters
- For most cases, $p - d = (r-1)(c-1)$

Construct the r x c table of observed counts $Z_{ij}$, the matching table of expected counts $E_{ij}$, and the table of contributions $(Z_{ij} - E_{ij})^2 / E_{ij}$.

Test of independence:
$$H_0: p_{ij} = p_{i\cdot}\, p_{\cdot j}, \qquad \hat p_{i\cdot} = \frac{Z_{i\cdot}}{n}, \quad \hat p_{\cdot j} = \frac{Z_{\cdot j}}{n}, \quad \hat p_{ij} = \hat p_{i\cdot}\,\hat p_{\cdot j}, \quad E_{ij} = n\,\hat p_{ij}$$

Test for several binomial distributions:
$$H_0: p_{11} = p_{12} = \dots = p_{1c} = p, \qquad \hat p = \frac{Z_{11} + Z_{12} + \dots + Z_{1c}}{Z_{\cdot 1} + Z_{\cdot 2} + \dots + Z_{\cdot c}}$$

Test statistic:
$$T = \sum_{i=1}^{r} \sum_{j=1}^{c} \frac{(Z_{ij} - E_{ij})^2}{E_{ij}} \sim \chi^2_{p-d}$$
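A minimal sketch (made-up 2x3 table; `chi2_contingency` computes the expected counts $E_{ij}$ and uses df = (r-1)(c-1)):

```python
import numpy as np
from scipy import stats

# Observed counts Z_ij for a 2x3 contingency table
observed = np.array([[20, 30, 25],
                     [30, 20, 25]])

t_stat, p_value, df, expected = stats.chi2_contingency(observed)
print(f"T = {t_stat:.3f}, df = {df}, p-value = {p_value:.4f}")
print(expected)   # E_ij = row total * column total / n
```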
4. Linear Regression

For the simple linear regression model $y_i = \beta_0 + \beta_1 x_i + \varepsilon_i$.

Common Tests:

Test whether $\beta_0$ is significantly non-zero:
$$H_0: \beta_0 = 0, \qquad T = \frac{\hat\beta_0 - \beta_0}{SE(\hat\beta_0)} = \frac{\hat\beta_0 - \beta_0}{\sqrt{\dfrac{\hat\sigma^2 \sum_{i=1}^{n} x_i^2}{n \sum_{i=1}^{n} (x_i - \bar x)^2}}} \sim t_{n-2}$$

A $(1-\alpha)$ confidence interval for $\beta_0$ is
$$\hat\beta_0 \pm t_{\alpha/2,\,n-2}\sqrt{\frac{\hat\sigma^2 \sum x_i^2}{n \sum (x_i - \bar x)^2}}$$

Test whether $\beta_1$ is significantly non-zero:
$$H_0: \beta_1 = 0, \qquad T = \frac{\hat\beta_1 - \beta_1}{SE(\hat\beta_1)} = \frac{\hat\beta_1 - \beta_1}{\sqrt{\hat\sigma^2 / \sum_{i=1}^{n} (x_i - \bar x)^2}} \sim t_{n-2}$$

A $(1-\alpha)$ confidence interval for $\beta_1$ is
$$\hat\beta_1 \pm t_{\alpha/2,\,n-2}\sqrt{\hat\sigma^2 / \textstyle\sum (x_i - \bar x)^2}$$

Testing the variance of the residuals:
$$\frac{(n-2)\hat\sigma^2}{\sigma^2} = \frac{1}{\sigma^2} \sum_{i=1}^{n} \big( y_i - (\hat\beta_0 + \hat\beta_1 x_i) \big)^2 \sim \chi^2_{n-2}$$

ANOVA

Total Sum of Squares (SS) = Regression SS + Residual SS:
$$\sum_{i=1}^{n} (y_i - \bar y)^2 = \hat\beta_1^2 \sum_{i=1}^{n} (x_i - \bar x)^2 + \sum_{i=1}^{n} \big( y_i - (\hat\beta_0 + \hat\beta_1 x_i) \big)^2$$

$$\text{Total SS} = \sum_{i=1}^{n} (y_i - \bar y)^2 = \sum_{i=1}^{n} y_i^2 - n\bar y^2, \qquad \text{Regression SS} = \hat\beta_1^2 \sum_{i=1}^{n} (x_i - \bar x)^2 = \hat\beta_1^2 \left( \sum_{i=1}^{n} x_i^2 - n\bar x^2 \right)$$

Another test for whether $\beta_1$ is significantly non-zero:
$$F = \frac{\text{Regression SS}}{\text{Residual SS} / (n-2)} = \left( \frac{\hat\beta_1}{SE(\hat\beta_1)} \right)^2 \sim F_{1,\,n-2}$$

Coefficient of determination: the percentage of total variation explained by x:
$$R^2 = \frac{\text{Regression SS}}{\text{Total SS}} = 1 - \frac{\text{Residual SS}}{\text{Total SS}}, \qquad R^2_{adj} = 1 - \frac{\text{Residual SS}/(n-2)}{\text{Total SS}/(n-1)}$$
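A minimal sketch of the slope test (made-up data; `linregress` returns $\hat\beta_1$, its standard error, the p-value of the $t_{n-2}$ test, and the correlation $R$):

```python
import numpy as np
from scipy import stats

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1])
n = len(x)

res = stats.linregress(x, y)     # OLS fit of y = b0 + b1 * x
print(f"b1 = {res.slope:.3f}, SE(b1) = {res.stderr:.3f}")
print(f"T = {res.slope / res.stderr:.2f} ~ t_{n-2}, p = {res.pvalue:.2e}")
print(f"R^2 = {res.rvalue**2:.4f}")
```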
Analysis of Minitab Results

[Minitab fitted line plot: Velocity = 18.06 + 0.2818 Stopping Distance, with the fitted regression line, 95% CI and 95% PI bands; Stopping Distance on the horizontal axis, Velocity on the vertical axis. S = 2.14805, R-Sq = 98.4%, R-Sq(adj) = 98.0%.]
STATISTICS (ST102): MICHAELMAS TERM MATERIAL
Notepad for Probability

Basic axioms, independence, mutual exclusivity, pairwise disjoint events/partitions,
the total probability formula, Bayes' theorem, permutations and combinations
Notepad for Discrete and Continuous Random Variables

For each of the discrete and continuous cases, record: p.d.f., c.d.f., E(X), Var(X), m.g.f., misc.
Discrete Random Variables 1: Discrete Uniform Distribution
pdf: cdf: mgf: mean: variance:
Discrete Random Variables 2: Bernoulli Distribution
pdf: cdf: mgf: mean: variance:
Discrete Random Variables 3: Binomial Distribution
pdf: cdf: mgf: mean: variance:
Discrete Random Variables 4: Poisson Distribution
pdf: cdf: mgf: mean: variance:
Continuous Random Variables 1: Uniform Distribution
pdf: cdf: mgf: mean: variance:
Continuous Random Variables 2: Exponential Distribution
pdf: cdf: mgf: mean: variance:
Continuous Random Variables 3: Normal Distribution
pdf: cdf: mgf: mean: variance:
Notepad for Multivariate Random Variables

Joint distributions, marginal distributions, conditional distributions, covariance and correlation