
Stat1012 Cheatsheet Double-sided


Page 1: Stat1012 Cheatsheet Double-sided

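Descriptive statistics

Variance $= \dfrac{1}{k}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \dfrac{1}{k}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right]$, with k = n for population and k = n-1 for sample

Coeff. of variation $= \dfrac{\text{s.d.}}{\bar{x}}$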
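As a quick check of the two variance forms above, here is a minimal Python sketch. The function names and the data values are made up for illustration; the shortcut form $\frac{1}{k}[\sum x_i^2 - n\bar{x}^2]$ is used directly.

```python
import math


def variance(xs, sample=True):
    """Variance with divisor k = n - 1 (sample) or k = n (population)."""
    n = len(xs)
    mean = sum(xs) / n
    k = n - 1 if sample else n
    # Shortcut form: (1/k) * [sum(x_i^2) - n * mean^2]
    return (sum(x * x for x in xs) - n * mean * mean) / k


def coeff_of_variation(xs):
    """Sample standard deviation divided by the sample mean."""
    mean = sum(xs) / len(xs)
    return math.sqrt(variance(xs, sample=True)) / mean


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
pop_var = variance(data, sample=False)
samp_var = variance(data, sample=True)
cv = coeff_of_variation(data)
print(pop_var, samp_var, cv)
```

The sample variance is always the larger of the two, since it divides the same sum of squares by n-1 instead of n.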

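* $V_{p/100} = X_k$, where k is the smallest integer > np/100, if np/100 is not an integer; $V_{p/100} = (X_{np/100} + X_{np/100+1})/2$ when np/100 is an integer

Probability
- Relative Risk $= \dfrac{\Pr(B \mid A)}{\Pr(B \mid \bar{A})}$
- Bayes' rule: $\Pr(A \mid B) = \dfrac{\Pr(B \mid A)\Pr(A)}{\Pr(B \mid A)\Pr(A) + [1 - \Pr(\bar{B} \mid \bar{A})]\Pr(\bar{A})}$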

              Given +ve    Given -ve
Disease       PV+          1 – PV-
No disease    1 – PV+      PV-

                    +ve                 -ve
Given disease       Sensitivity         False –ve (1-SS)
Given no disease    False +ve (1-SF)    Specificity
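Bayes' rule links the two tables: given sensitivity, specificity and the prevalence Pr(disease), the predictive values follow. The sketch below is illustrative only; the function name and the input numbers are assumptions, not part of the cheat sheet.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) from test accuracy and disease prevalence."""
    # PV+ = Pr(disease | +ve), by Bayes' rule
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    pv_pos = true_pos / (true_pos + false_pos)
    # PV- = Pr(no disease | -ve), by the same rule applied to -ve results
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    pv_neg = true_neg / (true_neg + false_neg)
    return pv_pos, pv_neg


# Hypothetical screening test for a rare disease
pv_pos, pv_neg = predictive_values(sensitivity=0.90, specificity=0.95, prevalence=0.01)
print(pv_pos, pv_neg)
```

With a rare disease, PV+ can be low even for an accurate test, because false positives from the large disease-free group outnumber the true positives.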

Probability distribution
- $\mu = E(X)$ ; $\sigma^2 = Var(X) = E\{(X - \mu)^2\} = E(X^2) - [E(X)]^2$ (for all X)
- $\mu = \sum x f(x)$ ; $\sigma^2 = \sum [(x - \mu)^2 f(x)] = \sum [x^2 f(x)] - \mu^2$ (discrete X)
- $\mu = \int x f(x)\,dx$ ; $\sigma^2 = \int (x - \mu)^2 f(x)\,dx = \int x^2 f(x)\,dx - \mu^2$ (continuous X)

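Binomial (limited by n): $\Pr(X = x) = \binom{n}{x} p^x (1 - p)^{n - x}$, with mean $np$ and variance $npq$ (q = 1 - p)

Poisson (not limited by n): $\Pr(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}$ where $\mu = \lambda t$, with mean $\mu$ and variance $\mu$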
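The discrete formulas $\mu = \sum x f(x)$ and $\sigma^2 = \sum x^2 f(x) - \mu^2$ can be checked numerically against the binomial and Poisson results. This is a small sketch with hypothetical helper names; the Poisson sum is truncated at x = 59, which is an assumption that holds only because the tail is negligible for the small μ used here.

```python
import math


def binomial_pmf(x, n, p):
    """Pr(X = x) for X ~ Binomial(n, p)."""
    return math.comb(n, x) * p**x * (1 - p) ** (n - x)


def poisson_pmf(x, mu):
    """Pr(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def mean_and_variance(table):
    """table: list of (x, f(x)) pairs for a discrete distribution."""
    mu = sum(x * f for x, f in table)
    var = sum(x * x * f for x, f in table) - mu**2
    return mu, var


n, p = 10, 0.3
b_mu, b_var = mean_and_variance([(x, binomial_pmf(x, n, p)) for x in range(n + 1)])

# Truncated support: the Poisson tail beyond x = 59 is negligible for mu = 2
p_mu, p_var = mean_and_variance([(x, poisson_pmf(x, 2.0)) for x in range(60)])
print(b_mu, b_var, p_mu, p_var)
```

The binomial results should agree with np and npq, and the Poisson mean and variance should both agree with μ, up to floating-point error.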

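Normal [for $N(0, 1^2)$]

Normalization: $Z = \dfrac{X - \mu}{\sigma}$

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{1}{2\sigma^2}(x - \mu)^2}$  [$f(x; 0, 1) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{1}{2}x^2}$]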

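$\Phi(z) = \Pr(Z \le z) = \int_{-\infty}^{z} f(t; 0, 1)\,dt$

$\Phi(z_u) = \Pr(Z \le z_u) = u$

* $\Phi(-z) = 1 - \Phi(z)$ ; $\Pr(a \le X \le b) = \Pr\left(\dfrac{a - \mu}{\sigma} \le Z \le \dfrac{b - \mu}{\sigma}\right) = \Phi\left(\dfrac{b - \mu}{\sigma}\right) - \Phi\left(\dfrac{a - \mu}{\sigma}\right)$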
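In place of a printed table, Φ can be evaluated with the error function, using the identity $\Phi(z) = \frac{1}{2}[1 + \mathrm{erf}(z/\sqrt{2})]$. The following is a minimal sketch; `phi` and `prob_between` are names chosen here for illustration.

```python
import math


def phi(z):
    """Standard normal c.d.f., Pr(Z <= z), via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def prob_between(a, b, mu, sigma):
    """Pr(a <= X <= b) for X ~ N(mu, sigma^2), by standardizing both limits."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


print(phi(1.96))                       # close to 0.975
print(prob_between(85, 115, 100, 15))  # within one s.d. of the mean
```

The symmetry rule Φ(−z) = 1 − Φ(z) holds for this implementation because erf is an odd function.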

* Pr(X ≤ a) = Pr(X < a) for continuous random variables only (as Pr(X = a) = 0)
* Approximation = equalize E(X) and Var(X) of the different distributions
→ Poisson approximation to binomial when np < 5 (remember to check values)
→ Normal approximation to binomial when npq ≥ 5 (remember continuity correction; X is the binomial variable, Y the approximating normal variable)
→ $\Pr(a \le X \le b) \approx \Pr(a - 0.5 < Y < b + 0.5)$ ; $\Pr(X = a) \approx \Pr(a - 0.5 < Y < a + 0.5)$
→ special cases: $\Pr(X = 0) \approx \Pr(Y < 0.5)$ ; $\Pr(X = n) \approx \Pr(Y > n - 0.5)$

Relationships between random variables
- $p_{xy} = \Pr(X = x \cap Y = y)$ ; E(XY) = E(X)E(Y) if X,Y are independent
- linear combination (l.c.) $L = \sum c_i X_i$ for all Xs: $E(L) = \sum c_i E(X_i) = \sum c_i \mu_i$
- $Cov(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$
→ $Cov(X, X) = Var(X)$ ; $Cov(X, -X) = -Var(X)$ ; $Cov(X, Y) = 0$ if X,Y are independent
→ $Cov(X, aY + bZ) = a\,Cov(X, Y) + b\,Cov(X, Z)$
- $Corr(X, Y) = \rho_{XY} = \dfrac{Cov(X, Y)}{\sigma_X \sigma_Y}$

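- l.c. $L = \sum c_i X_i$ for all Xs: $Var(L) = \sum_i c_i^2\, Var(X_i) + 2\sum_{i<j} c_i c_j\, Cov(X_i, X_j) = \sum_i c_i^2 \sigma_i^2 + 2\sum_{i<j} c_i c_j \sigma_i \sigma_j\, Corr(X_i, X_j)$
- (sample covariance) $s_{xy} = \dfrac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
→ $s_{xx} = \dfrac{1}{n - 1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = s_x^2$ (sample variance) when X = Y
→ sample corr. coeff. $= r_{xy} = \dfrac{s_{xy}}{\sqrt{s_{xx}\, s_{yy}}}$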
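The sample covariance and correlation formulas translate almost line for line into code. This is an illustrative sketch with made-up paired data; `sample_cov` and `sample_corr` are names assumed here.

```python
import math


def sample_cov(xs, ys):
    """s_xy = (1/(n-1)) * sum((x_i - xbar) * (y_i - ybar))."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


def sample_corr(xs, ys):
    """r_xy = s_xy / sqrt(s_xx * s_yy); s_xx is just sample_cov(xs, xs)."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


xs = [1, 2, 3, 4]          # illustrative values only
ys = [2, 4, 6, 8]          # exactly 2 * xs, so the correlation is 1
ys_neg = [8, 6, 4, 2]      # perfectly decreasing in xs
print(sample_cov(xs, ys), sample_corr(xs, ys), sample_corr(xs, ys_neg))
```

Calling `sample_cov(xs, xs)` reproduces the sample variance, which mirrors the X = Y special case in the notes.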

Point estimation
- Choice of estimator: $E(\hat{\theta}) = \theta$ for unbiased; $Var(\hat{\theta}) < Var(\theta^*)$ for minimum variance
- $s^2 = \dfrac{\sum_{i=1}^{n}(x_i - \bar{x})^2}{n - 1}$ is estimator for pop. variance $\sigma^2$
- $\bar{X}$ is the best estimator for pop. mean µ for N.D.
→ standard error $= \sigma_{\bar{X}} = \sqrt{Var(\bar{X})} = \sigma/\sqrt{n}$
→ $s/\sqrt{n}$ is estimator for standard error
- $\hat{p}$ is the best estimator for pop. prop. p for N.D.
→ standard error $= \sigma_{\hat{p}} = \sqrt{Var(\hat{p})} = \sqrt{pq/n}$
→ $\sqrt{\hat{p}\hat{q}/n}$ is estimator for standard error

- Sampling distribution: $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ when $X \sim N(\mu, \sigma^2)$ ; $\bar{X} \approx N\left(\mu, \dfrac{\sigma^2}{n}\right)$ when n ≥ 30 (central limit theorem)
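The estimated standard errors $s/\sqrt{n}$ and $\sqrt{\hat{p}\hat{q}/n}$ can be computed as below. This is a minimal sketch; the function names and the input values are hypothetical.

```python
import math


def se_mean(xs):
    """Estimated standard error of the sample mean, s / sqrt(n)."""
    n = len(xs)
    xbar = sum(xs) / n
    s2 = sum((x - xbar) ** 2 for x in xs) / (n - 1)  # sample variance
    return math.sqrt(s2 / n)


def se_proportion(successes, n):
    """Estimated standard error of p-hat, sqrt(p-hat * q-hat / n)."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)


data = [2, 4, 4, 4, 5, 5, 7, 9]  # illustrative values only
print(se_mean(data))
print(se_proportion(50, 100))
```

Both quantities shrink in proportion to $1/\sqrt{n}$, so quadrupling the sample size halves the standard error.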

Page 2: Stat1012 Cheatsheet Double-sided

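Interval estimation: (1-α)100% C.I. (two-sided)

Mean: $\bar{x} \pm z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ when σ is known, or n > 200 if σ is unknown

Proportion: $\hat{p} \pm z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ (when $n\hat{p}\hat{q} \ge 5$)

Mean: $\bar{x} \pm t_{n-1,\,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$ when σ is unknown; valid for large n even if not normally distributed

Variance: $\left(\dfrac{(n - 1)s^2}{\chi^2_{n-1,\,1-\alpha/2}},\ \dfrac{(n - 1)s^2}{\chi^2_{n-1,\,\alpha/2}}\right)$; not valid for non-normal distribution

90%: $z_{0.95} = 1.645$ ; 95%: $z_{0.975} = 1.960$ ; 99%: $z_{0.995} = 2.576$

$-t_{n-1,\,\alpha/2} = t_{n-1,\,1-\alpha/2}$

No simple relationship between $\chi^2_{n-1,\,\alpha/2}$ and $\chi^2_{n-1,\,1-\alpha/2}$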
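The two z-based intervals can be sketched with Python's standard-library `statistics.NormalDist`, whose `inv_cdf` returns the quantile $z_{1-\alpha/2}$. The function names and inputs are assumptions for illustration. The t and χ² intervals need quantiles that the standard library does not provide, so they are left out here.

```python
import math
from statistics import NormalDist


def z_interval_mean(xbar, sigma, n, conf=0.95):
    """Two-sided C.I. for the mean when sigma is known: xbar +/- z * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def z_interval_proportion(successes, n, conf=0.95):
    """Two-sided C.I. for a proportion: p-hat +/- z * sqrt(p-hat * q-hat / n)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width


# Hypothetical inputs: sigma / sqrt(n) = 1, so the half-width equals z itself
lo, hi = z_interval_mean(xbar=100, sigma=15, n=225, conf=0.95)
p_lo, p_hi = z_interval_proportion(successes=40, n=100, conf=0.90)
print(lo, hi, p_lo, p_hi)
```

The proportion example satisfies the $n\hat{p}\hat{q} \ge 5$ condition (100 × 0.4 × 0.6 = 24), which is worth checking before relying on this interval.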

Hypothesis testing (Given: H0: µ = µ0)

Type I and Type II Error

- p-value = probability of obtaining a test statistic as extreme as, or more extreme than, the observed test statistic
→ p > α implies the observation is a 'general case', so there is not sufficient evidence to reject H0 (vice versa for p < α)
→ (for z, t) one-sided = 1 – Φ(|test statistic|) ; two-sided = 2[1 – Φ(|test statistic|)]
→ (for χ²) two-sided = 2[Φ(χ²)] for $s^2 \le \sigma_0^2$ ; 2[1 – Φ(χ²)] for $s^2 > \sigma_0^2$
→ Φ here stands for the c.d.f. of the test statistic's reference distribution (z, t or χ²)
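For a z statistic, the one-sided and two-sided rules above can be sketched as follows. `z_p_value` is a name assumed here, and the example statistics are illustrative. A t or χ² statistic would need the c.d.f. of that distribution instead.

```python
from statistics import NormalDist


def z_p_value(z_stat, two_sided=True):
    """p-value for a z test statistic using the standard normal c.d.f."""
    tail = 1 - NormalDist().cdf(abs(z_stat))  # 1 - Phi(|z|)
    return 2 * tail if two_sided else tail


p_two = z_p_value(1.96)                    # close to 0.05
p_one = z_p_value(1.96, two_sided=False)   # half of the two-sided value
alpha = 0.05
reject = z_p_value(2.5) < alpha            # True: reject H0 at the 5% level
print(p_two, p_one, reject)
```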

One sample (v.s. population) tests (two-sided)

Two paired-samples test (two-sided)

Two independent-samples tests (two-sided)

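- significance level = α (given) = Pr(Reject H0 | H0 is true)
- Power of test $= 1 - \beta = \Phi\left(\dfrac{\mu_0 + z_{\alpha}\,\sigma/\sqrt{n} - \mu_1}{\sigma/\sqrt{n}}\right)$ (one-sided, H1: µ0 > µ)
- sample size $n = \dfrac{(z_{1-\alpha} + z_{1-\beta})^2\,\sigma^2}{(\mu_0 - \mu_1)^2}$ (one-sided)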
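A minimal sketch of the one-sided power and sample-size calculations, assuming a z test of H0: µ = µ0 against H1: µ < µ0 with known σ. The function names and the numbers are hypothetical; `z_alpha` is the lower α quantile, so it is negative.

```python
import math
from statistics import NormalDist


def power_one_sided(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z test (H1: mu < mu0) when the true mean is mu1."""
    z_alpha = NormalDist().inv_cdf(alpha)  # lower quantile, e.g. about -1.645
    se = sigma / math.sqrt(n)
    # Reject H0 when xbar < mu0 + z_alpha * se; evaluate that chance under mu1
    return NormalDist().cdf((mu0 + z_alpha * se - mu1) / se)


def sample_size_one_sided(mu0, mu1, sigma, alpha=0.05, power=0.80):
    """Smallest n giving at least the requested power for the one-sided z test."""
    z_a = NormalDist().inv_cdf(1 - alpha)
    z_b = NormalDist().inv_cdf(power)      # z_{1 - beta}
    return math.ceil((z_a + z_b) ** 2 * sigma**2 / (mu0 - mu1) ** 2)


n_needed = sample_size_one_sided(mu0=100, mu1=95, sigma=10)
achieved = power_one_sided(mu0=100, mu1=95, sigma=10, n=n_needed)
print(n_needed, achieved)
```

Rounding n up means the achieved power is slightly above the requested 80%, which is the usual convention for sample-size formulas.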

For 2 sample test, $\mu_1$ ≈ sample, compared with $\mu_0$ ≈ population

* Must clarify!