Stat1012 Cheatsheet Double-sided

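Descriptive statistics

Variance $= \frac{1}{k}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{1}{k}\left[\sum_{i=1}^{n} x_i^2 - n\bar{x}^2\right]$, with k = n for population, k = n-1 for sample

Coeff. of variation $= \text{s.d.}/\bar{x}$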
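The variance and coefficient of variation above can be sketched in a few lines of Python. This is a minimal illustration only; the function names `variance` and `coeff_of_variation` and the data set are made up for the example.

```python
import math

def variance(data, sample=True):
    """Sum of squared deviations divided by k (k = n-1 for a sample, n for a population)."""
    n = len(data)
    mean = sum(data) / n
    k = n - 1 if sample else n
    return sum((x - mean) ** 2 for x in data) / k

def coeff_of_variation(data, sample=True):
    """Standard deviation divided by the mean."""
    mean = sum(data) / len(data)
    return math.sqrt(variance(data, sample)) / mean

# Hypothetical data: mean 5, sum of squared deviations 32
data = [2, 4, 4, 4, 5, 5, 7, 9]
pop_var = variance(data, sample=False)    # 32 / 8
samp_var = variance(data, sample=True)    # 32 / 7
pop_cv = coeff_of_variation(data, sample=False)
```

The only difference between the two variances is the divisor k, so the sample variance is always the larger of the two.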

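* Percentile: $V_{p/100} = X_{(k)}$, where k is the smallest integer > np/100, if np/100 is not an integer; $V_{p/100} = \left(X_{(np/100)} + X_{(np/100+1)}\right)/2$ when np/100 is an integer

Probability

- Relative Risk $= \dfrac{\Pr(B \mid A)}{\Pr(B \mid \bar{A})}$

- Bayes rule: $\Pr(B \mid A) = \dfrac{\Pr(A \mid B)\Pr(B)}{\Pr(A \mid B)\Pr(B) + [1 - \Pr(\bar{A} \mid \bar{B})]\Pr(\bar{B})}$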

                   Given +ve           Given -ve
Disease            PV+                 1 – PV-
No disease         1 – PV+             PV-

                   +ve                 -ve
Given disease      Sensitivity         False –ve (1-SS)
Given no disease   False +ve (1-SF)    Specificity
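As a rough illustration of how Bayes' rule links the two tables, the sketch below turns sensitivity, specificity and prevalence into PV+ and PV-. The function name `predictive_values` and the input numbers are hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PV+, PV-) from sensitivity, specificity and disease prevalence."""
    true_pos = sensitivity * prevalence                # Pr(+ve and disease)
    false_neg = (1 - sensitivity) * prevalence         # Pr(-ve and disease)
    false_pos = (1 - specificity) * (1 - prevalence)   # Pr(+ve and no disease)
    true_neg = specificity * (1 - prevalence)          # Pr(-ve and no disease)
    pv_pos = true_pos / (true_pos + false_pos)         # Pr(disease | +ve)
    pv_neg = true_neg / (true_neg + false_neg)         # Pr(no disease | -ve)
    return pv_pos, pv_neg

# Hypothetical test: 90% sensitive, 80% specific, 10% prevalence
pv_pos, pv_neg = predictive_values(0.9, 0.8, 0.1)
```

With a low prevalence, PV+ can be far below the sensitivity even for a fairly accurate test.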

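Probability distribution

- $\mu = E(X)$; $\sigma^2 = \mathrm{Var}(X) = E\{[X - E(X)]^2\} = E(X^2) - [E(X)]^2$ (for all X)

- $\mu = \sum x \Pr(x)$; $\sigma^2 = \sum (x - \mu)^2 \Pr(x) = \sum x^2 \Pr(x) - \mu^2$ (discrete X)

- $\mu = \int x f(x)\,dx$; $\sigma^2 = \int (x - \mu)^2 f(x)\,dx = \int x^2 f(x)\,dx - \mu^2$ (continuous X)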
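A small check of the discrete formulas, using a fair six-sided die as an assumed example. Exact fractions are used so that the definition and the shortcut form of the variance can be compared directly.

```python
from fractions import Fraction

# Assumed example: fair die, Pr(x) = 1/6 for x = 1..6
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

mu = sum(x * p for x, p in pmf.items())                     # E(X)
var_def = sum((x - mu) ** 2 * p for x, p in pmf.items())    # E[(X - mu)^2]
var_short = sum(x * x * p for x, p in pmf.items()) - mu**2  # E(X^2) - mu^2
```

Both variance expressions give the same value, as the identity in the first bullet requires.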

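Binomial (limited by n): $\Pr(X = k) = \binom{n}{k} p^k (1 - p)^{n-k}$; $E(X) = np$, $\mathrm{Var}(X) = npq$ with $q = 1 - p$

Poisson (not limited by n): $\Pr(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}$, where $\mu = E(X) = \mathrm{Var}(X)$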
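The two probability functions can be evaluated directly with the standard library. The sketch below is illustrative; `binomial_pmf` and `poisson_pmf` are names chosen for this example, and the parameters are arbitrary.

```python
import math

def binomial_pmf(k, n, p):
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(x, mu):
    """Pr(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)

# Arbitrary example: n = 10, p = 0.1, so np = 1 is used as the Poisson mean
b = binomial_pmf(2, 10, 0.1)
po = poisson_pmf(2, 1.0)
total = sum(binomial_pmf(k, 10, 0.1) for k in range(11))  # probabilities sum to 1
```

Here the binomial and Poisson values for X = 2 are close but not identical, which is why the approximation should be checked against actual values.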

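Normal [for $N(0, 1^2)$]

Normalization: $Z = \dfrac{X - \mu}{\sigma}$

$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x - \mu)^2}{2\sigma^2}}$ $\left[f(x; 0, 1) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\right]$

$\Phi(x) = \Pr(Z \le x) = \int_{-\infty}^{x} f(t; 0, 1)\,dt$

$\Phi(z_u) = \Pr(Z \le z_u) = u$

* $\Phi(-x) = 1 - \Phi(x)$; $\Pr(a \le X \le b) = \Pr(z_a \le Z \le z_b) = \Phi\!\left(\dfrac{b - \mu}{\sigma}\right) - \Phi\!\left(\dfrac{a - \mu}{\sigma}\right)$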
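The standardization step can be illustrated with `statistics.NormalDist` from the Python standard library. The helper name `normal_interval_prob` and the numbers are assumptions made for this sketch.

```python
from statistics import NormalDist

def normal_interval_prob(a, b, mu, sigma):
    """Pr(a <= X <= b) for X ~ N(mu, sigma^2), via the standard normal cdf."""
    phi = NormalDist().cdf  # cdf of N(0, 1)
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)

# Assumed example: X ~ N(100, 10^2), probability of lying within one s.d. of the mean
p = normal_interval_prob(90, 110, 100, 10)

# Symmetry check: Phi(-x) = 1 - Phi(x)
phi = NormalDist().cdf
sym_gap = abs(phi(-1.3) - (1 - phi(1.3)))
```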

* Pr(X ≤ a) = Pr(X < a) for continuous random variables only (as Pr(X = a) = 0)

* Approximation = equalize E(X) and Var(X) of the different distributions
→ Poisson approximation to binomial when np < 5 (remember to check values)
→ Normal approximation to binomial when npq ≥ 5 (remember continuity correction)
→ $\Pr(a \le X \le b) \approx \Pr(a - 0.5 < Y < b + 0.5)$; $\Pr(X = a) \approx \Pr(a - 0.5 < Y < a + 0.5)$
→ special cases: $\Pr(X = 0) \approx \Pr(Y < 0.5)$; $\Pr(X = n) \approx \Pr(Y > n - 0.5)$

Relationships between random variables

- $E(XY) = \sum_x \sum_y xy \Pr(X = x \cap Y = y)$; E(XY) = E(X)E(Y) if X,Y are independent

- linear combination (l.c.) $L = \sum_i c_i X_i$ for all Xs: $E(L) = \sum_i c_i E(X_i) = \sum_i c_i \mu_i$

- $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$
→ $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$; $\mathrm{Cov}(X, -X) = -\mathrm{Var}(X)$; $\mathrm{Cov}(X, Y) = 0$ if X,Y are independent
→ $\mathrm{Cov}(X, aY + bZ) = a\,\mathrm{Cov}(X, Y) + b\,\mathrm{Cov}(X, Z)$

- $\mathrm{Corr}(X, Y) = \rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$

- l.c. for all Xs: $\mathrm{Var}(L) = \sum_i c_i^2 \mathrm{Var}(X_i) + 2\sum_{i<j} c_i c_j \mathrm{Cov}(X_i, X_j) = \sum_i c_i^2 \sigma_i^2 + 2\sum_{i<j} c_i c_j \sigma_i \sigma_j \mathrm{Corr}(X_i, X_j)$

- (sample covariance) $S_{xy} = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
→ $S_{xx} = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = s^2$ (sample variance) when X = Y
→ sample corr. coeff. $= r_{xy} = \dfrac{S_{xy}}{\sqrt{S_{xx} S_{yy}}}$
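The sample covariance and correlation coefficient can be sketched as follows. The function names and the paired data are hypothetical, chosen so the arithmetic is easy to follow by hand.

```python
import math

def sample_cov(xs, ys):
    """Sample covariance: sum of cross-deviations divided by n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)

def sample_corr(xs, ys):
    """Sample correlation: S_xy / sqrt(S_xx * S_yy)."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))

# Hypothetical paired data
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
s_xy = sample_cov(xs, ys)
s_xx = sample_cov(xs, xs)   # equals the sample variance of xs
r_xy = sample_corr(xs, ys)
```

Calling `sample_cov(xs, xs)` reproduces the sample variance, which mirrors the "when X = Y" case in the notes.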

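Point estimation

- Choice of estimator: $E(\hat{\theta}) = \theta$ for unbiased; $\mathrm{Var}(\hat{\theta}) < \mathrm{Var}(\theta^*)$ for minimum variance

- $S^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is estimator for pop. variance σ²

- $\bar{X}$ is the best estimator for pop. mean µ for N.D.
→ standard error $= \sigma_{\bar{X}} = \sqrt{\mathrm{Var}(\bar{X})} = \sigma/\sqrt{n}$
→ $s/\sqrt{n}$ is estimator for standard error

- $\hat{p}$ is the best estimator for pop. prop. p for N.D.
→ standard error $= \sigma_{\hat{p}} = \sqrt{\mathrm{Var}(\hat{p})} = \sqrt{pq/n}$
→ $\sqrt{\hat{p}\hat{q}/n}$ is estimator for standard error

- $\bar{X} \sim N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ when $X \sim N(\mu, \sigma^2)$; $\bar{X} \approx N\!\left(\mu, \dfrac{\sigma^2}{n}\right)$ when n ≥ 30 (central limit theorem)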
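A brief sketch of the two estimated standard errors, using only the standard library. The helper names `se_mean` and `se_prop` and the inputs are illustrative assumptions.

```python
import math
import statistics

def se_mean(sample):
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    return statistics.stdev(sample) / math.sqrt(len(sample))

def se_prop(successes, n):
    """Estimated standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)

# Hypothetical inputs
se_x = se_mean([2, 4, 4, 4, 5, 5, 7, 9])   # s^2 = 32/7, n = 8
se_p = se_prop(30, 100)                     # p_hat = 0.3
```

`statistics.stdev` uses the n - 1 divisor, matching the estimator $S^2$ for the population variance.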

Sampling distribution


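Interval estimation: (1-α)100% C.I. (two-sided)

- Mean: $\bar{X} \pm z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$ when σ is known, or n > 200 if σ is unknown

- Proportion: $\hat{p} \pm z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}\hat{q}}{n}}$ (when $n\hat{p}\hat{q} \ge 5$)

- Mean: $\bar{X} \pm t_{n-1,\,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$ when σ is unknown; valid for large n even if not normally distributed

- Variance: $\left[\dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}},\ \dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}}\right]$; not valid for non-normal distributions

* $-t_{n-1,\,\alpha/2} = t_{n-1,\,1-\alpha/2}$; no simple relationship between $\chi^2_{n-1,\,\alpha/2}$ and $\chi^2_{n-1,\,1-\alpha/2}$

* 90%: $z_{0.95} = 1.645$; 95%: $z_{0.975} = 1.960$; 99%: $z_{0.995} = 2.576$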
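The two z-based intervals can be sketched with `statistics.NormalDist`, which supplies the quantile $z_{1-\alpha/2}$. The t and chi-square intervals need quantiles that the standard library does not provide, so they are left out here. Function names and inputs are assumptions for illustration.

```python
import math
from statistics import NormalDist

def z_ci_mean(xbar, sigma, n, conf=0.95):
    """Two-sided z interval for the mean with known sigma."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)   # z_{1 - alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width

def z_ci_prop(successes, n, conf=0.95):
    """Two-sided z interval for a proportion (intended for n * p_hat * q_hat >= 5)."""
    p_hat = successes / n
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - half_width, p_hat + half_width

# Hypothetical inputs
mean_lo, mean_hi = z_ci_mean(50, 10, 100)    # sigma / sqrt(n) = 1
prop_lo, prop_hi = z_ci_prop(40, 100)        # n * p_hat * q_hat = 24
z_975 = NormalDist().inv_cdf(0.975)
```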

Hypothesis testing (Given: H0: µ = µ0)

Type I and Type II Error

- p-value = probability, under H0, of obtaining a test statistic as extreme as or more extreme than the observed test statistic
→ p > α implies the observed result is a 'general case' under H0, so there is not sufficient evidence to reject H0 (vice versa for p < α)
→ (for z, t) one-sided = 1 – Φ(|test statistic|); two-sided = 2[1 – Φ(|test statistic|)]
→ (for χ²) two-sided = 2[Φ(χ²)] for $s^2 \le \sigma_0^2$; 2[1 – Φ(χ²)] for $s^2 > \sigma_0^2$
(Φ here stands for the cdf of the test statistic's reference distribution: z, t or χ².)
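For the z case, the p-value rules can be sketched as below. This assumes a one-sample z test with known σ; the function name `z_test_p_values` and the numbers are hypothetical.

```python
import math
from statistics import NormalDist

def z_test_p_values(xbar, mu0, sigma, n):
    """Return (z, one-sided p, two-sided p) for a one-sample z test of H0: mu = mu0."""
    z = (xbar - mu0) / (sigma / math.sqrt(n))
    tail = 1 - NormalDist().cdf(abs(z))   # area beyond |z| in one tail
    return z, tail, 2 * tail

# Hypothetical data: xbar = 52, mu0 = 50, sigma = 10, n = 100
z_stat, p_one, p_two = z_test_p_values(52, 50, 10, 100)
reject_at_5pct = p_two < 0.05
```

The one-sided value is only meaningful when the observed statistic falls on the side predicted by H1.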

One sample (v.s. population) tests (two-sided)

Two paired-samples test (two-sided)

Two independent-samples tests (two-sided)

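- significance level = α (given) = Pr(Reject H0 | H0 is true)

- Power of test (one-sided, H1: µ < µ0, true mean µ1) $= 1 - \beta = \Phi\!\left(\dfrac{\mu_0 - z_{1-\alpha}\,\sigma/\sqrt{n} - \mu_1}{\sigma/\sqrt{n}}\right)$

- sample size (one-sided) $n = \dfrac{(z_{1-\alpha} + z_{1-\beta})^2 \sigma^2}{(\mu_0 - \mu_1)^2}$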
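A rough sketch of the one-sided power and sample-size calculations, assuming a z test with known σ and the alternative H1: µ < µ0 with true mean µ1. The function names and inputs are hypothetical, and rounding n up to the next integer is a choice made for this example.

```python
import math
from statistics import NormalDist

def power_one_sided(mu0, mu1, sigma, n, alpha=0.05):
    """Power of the one-sided z test of H0: mu = mu0 vs H1: mu < mu0 when mu = mu1."""
    nd = NormalDist()
    se = sigma / math.sqrt(n)
    critical = mu0 - nd.inv_cdf(1 - alpha) * se   # reject H0 when xbar < critical
    return nd.cdf((critical - mu1) / se)

def sample_size_one_sided(mu0, mu1, sigma, alpha=0.05, power=0.80):
    """Smallest integer n meeting the sample-size formula (rounded up)."""
    nd = NormalDist()
    z_alpha = nd.inv_cdf(1 - alpha)
    z_beta = nd.inv_cdf(power)                    # z_{1 - beta}
    return math.ceil((z_alpha + z_beta) ** 2 * sigma**2 / (mu0 - mu1) ** 2)

# Hypothetical scenario: mu0 = 50, mu1 = 47, sigma = 10
n_needed = sample_size_one_sided(50, 47, 10)
power_at_n = power_one_sided(50, 47, 10, n_needed)
power_below = power_one_sided(50, 47, 10, n_needed - 1)
```

The two functions are consistent with each other: the power first reaches the 80% target at the n returned by the sample-size formula.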

For a 2-sample test, one group plays the role of the 'sample' and the other the role of the 'population' (compare with the one-sample case)

* Must clarify!
