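Descriptive statistics
- Variance $= \dfrac{1}{k}\sum_{i=1}^{n}(x_i-\bar{x})^2 = \dfrac{1}{k}\left[\sum_{i=1}^{n}x_i^2 - n\bar{x}^2\right]$, with k = n for a population and k = n - 1 for a sample
- Coeff. of variation $= \dfrac{\text{s.d.}}{\bar{x}}$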
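* $V_{p/100} = X_{(k)}$, where k is the smallest integer greater than np/100, if np/100 is not an integer; $V_{p/100} = \left(X_{(np/100)} + X_{(np/100+1)}\right)/2$ if np/100 is an integer
Probability
- Relative risk $= \dfrac{\Pr(B \mid A)}{\Pr(B \mid \bar{A})}$
- Bayes' rule: $\Pr(B \mid A) = \dfrac{\Pr(A \mid B)\Pr(B)}{\Pr(A \mid B)\Pr(B) + [1-\Pr(\bar{A} \mid \bar{B})]\Pr(\bar{B})}$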
              Given +ve    Given -ve
Disease       PV+          1 - PV-
No disease    1 - PV+      PV-

                    +ve                 -ve
Given disease       Sensitivity         False -ve (1-SS)
Given no disease    False +ve (1-SF)    Specificity
(SS = sensitivity, SF = specificity)
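The predictive values in the table follow from Bayes' rule once sensitivity, specificity and prevalence are known. A minimal Python sketch, assuming hypothetical function names and made-up input numbers chosen only for illustration:

```python
def positive_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PV+ = Pr(disease | +ve), computed with Bayes' rule."""
    true_pos = sensitivity * prevalence                    # Pr(+ve | disease) Pr(disease)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)   # Pr(+ve | no disease) Pr(no disease)
    return true_pos / (true_pos + false_pos)


def negative_predictive_value(sensitivity: float, specificity: float, prevalence: float) -> float:
    """PV- = Pr(no disease | -ve), computed with Bayes' rule."""
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    return true_neg / (true_neg + false_neg)


def relative_risk(p_b_given_a: float, p_b_given_not_a: float) -> float:
    """Relative risk = Pr(B | A) / Pr(B | not A)."""
    return p_b_given_a / p_b_given_not_a


if __name__ == "__main__":
    # Illustrative (not real) test: 90% sensitive, 95% specific, 1% prevalence.
    print(positive_predictive_value(0.90, 0.95, 0.01))
    print(negative_predictive_value(0.90, 0.95, 0.01))
```

With a rare disease, PV+ can be low even for an accurate test, because false positives from the large disease-free group dominate the denominator.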
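Probability distribution
- $\mu = E(X)$; $\sigma^2 = \mathrm{Var}(X) = E\{[X - E(X)]^2\} = E(X^2) - [E(X)]^2$ (for all X)
- Discrete X: $\mu = \sum x \Pr(x)$; $\sigma^2 = \sum (x-\mu)^2 \Pr(x) = \left[\sum x^2 \Pr(x)\right] - \mu^2$
- Continuous X: $\mu = \int x f(x)\,dx$; $\sigma^2 = \int (x-\mu)^2 f(x)\,dx = \left[\int x^2 f(x)\,dx\right] - \mu^2$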
Binomial (limited by n): $\Pr(X = k) = \binom{n}{k} p^k (1-p)^{n-k}$, with mean np and variance npq (q = 1 - p)
Poisson (not limited by n): $\Pr(X = x) = \dfrac{e^{-\mu}\mu^x}{x!}$, where $\mu = \lambda t = \sigma^2$
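The two probability mass functions can be evaluated directly with the standard library. The sketch below uses hypothetical function names and checks the stated moments numerically (mean np and variance npq for the binomial, mean equal to variance for the Poisson):

```python
import math


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Pr(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)


def poisson_pmf(x: int, mu: float) -> float:
    """Pr(X = x) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**x / math.factorial(x)


def mean_and_variance(pmf_values):
    """Mean and variance of a discrete distribution given Pr(X = 0), Pr(X = 1), ..."""
    mean = sum(x * pr for x, pr in enumerate(pmf_values))
    second_moment = sum(x * x * pr for x, pr in enumerate(pmf_values))
    return mean, second_moment - mean**2  # Var(X) = E(X^2) - [E(X)]^2


if __name__ == "__main__":
    n, p = 10, 0.3
    print(mean_and_variance([binomial_pmf(k, n, p) for k in range(n + 1)]))
    # The Poisson support is unbounded; 60 terms is an arbitrary truncation for mu = 2.
    print(mean_and_variance([poisson_pmf(x, 2.0) for x in range(60)]))
```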
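Normal [for N(0, 1^2)]
Normalization: $Z = \dfrac{X-\mu}{\sigma}$
$f(x) = \dfrac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}}$ $\left[f(x; 0, 1) = \dfrac{1}{\sqrt{2\pi}}\, e^{-\frac{x^2}{2}}\right]$
$\Phi(x) = \Pr(Z \le x) = \int_{-\infty}^{x} f(t; 0, 1)\,dt$
$\Phi(z_u) = \Pr(Z \le z_u) = u$
* $\Phi(-x) = 1 - \Phi(x)$; $\Pr(a \le X \le b) = \Pr\left(\dfrac{a-\mu}{\sigma} \le Z \le \dfrac{b-\mu}{\sigma}\right) = \Phi\left(\dfrac{b-\mu}{\sigma}\right) - \Phi\left(\dfrac{a-\mu}{\sigma}\right)$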
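The standardization step and the interval probability above can be sketched with the error function from the standard library. Function names are hypothetical, and the numbers in the usage lines are illustrative only:

```python
import math


def phi(x: float) -> float:
    """Standard normal c.d.f., Phi(x) = Pr(Z <= x), via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def normal_interval_prob(a: float, b: float, mu: float, sigma: float) -> float:
    """Pr(a <= X <= b) for X ~ N(mu, sigma^2), after standardizing to Z."""
    return phi((b - mu) / sigma) - phi((a - mu) / sigma)


if __name__ == "__main__":
    # Symmetry check: Phi(-x) = 1 - Phi(x)
    print(phi(-1.3), 1.0 - phi(1.3))
    # Probability within 1.96 standard deviations of the mean (close to 0.95)
    print(normal_interval_prob(100 - 1.96 * 15, 100 + 1.96 * 15, 100, 15))
```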
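* Pr(X ≤ a) = Pr(X < a) for continuous random variables only (as Pr(X = a) = 0)
* Approximation = equalize E(X) and Var(X) of the two distributions
→ Poisson approximation to binomial when np < 5 (remember to check values)
→ Normal approximation to binomial when npq ≥ 5 (remember continuity correction), with Y ~ N(np, npq)
→ $\Pr(a \le X \le b) \approx \Pr(a - 0.5 < Y < b + 0.5)$; $\Pr(X = a) \approx \Pr(a - 0.5 < Y < a + 0.5)$
→ special cases: $\Pr(X = 0) \approx \Pr(Y < 0.5)$; $\Pr(X = n) \approx \Pr(Y > n - 0.5)$
Relationships between random variables
- $E(XY) = \sum_x \sum_y xy \Pr(X = x \cap Y = y)$; E(XY) = E(X)E(Y) if X, Y are independent
- linear combination (l.c.) $L = \sum_i c_i X_i$, for all Xs: $E(L) = \sum_i c_i E(X_i) = \sum_i c_i \mu_i$
- $\mathrm{Cov}(X, Y) = E[(X - \mu_X)(Y - \mu_Y)] = E(XY) - \mu_X \mu_Y$
→ $\mathrm{Cov}(X, X) = \mathrm{Var}(X)$; $\mathrm{Cov}(X, -X) = -\mathrm{Var}(X)$; $\mathrm{Cov}(X, Y) = 0$ if X, Y are independent
→ $\mathrm{Cov}(X, aY + bZ) = a\,\mathrm{Cov}(X, Y) + b\,\mathrm{Cov}(X, Z)$
- $\mathrm{Corr}(X, Y) = \rho_{XY} = \dfrac{\mathrm{Cov}(X, Y)}{\sigma_X \sigma_Y}$
- l.c. for all Xs: $\mathrm{Var}(L) = \sum_i c_i^2\,\mathrm{Var}(X_i) + 2\sum_{i<j} c_i c_j\,\mathrm{Cov}(X_i, X_j) = \sum_i c_i^2 \sigma_i^2 + 2\sum_{i<j} c_i c_j \sigma_i \sigma_j\,\mathrm{Corr}(X_i, X_j)$
- (sample covariance) $s_{xy} = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})$
→ $s_{xx} = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = s^2$ (sample variance) when X = Y
→ sample corr. coeff. $= r_{xy} = \dfrac{s_{xy}}{\sqrt{s_{xx}\,s_{yy}}}$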
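Two of the results above lend themselves to a quick numerical check: the continuity-corrected normal approximation to the binomial, and the sample covariance and correlation formulas. The following is a minimal sketch with hypothetical function names and illustrative data; it assumes the approximating variable is Y ~ N(np, npq).

```python
import math


def phi(x: float) -> float:
    """Standard normal c.d.f."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def binomial_range_exact(a: int, b: int, n: int, p: float) -> float:
    """Exact Pr(a <= X <= b) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(a, b + 1))


def binomial_range_normal(a: int, b: int, n: int, p: float) -> float:
    """Pr(a - 0.5 < Y < b + 0.5) with Y ~ N(np, npq): continuity correction."""
    mu = n * p
    sigma = math.sqrt(n * p * (1 - p))
    return phi((b + 0.5 - mu) / sigma) - phi((a - 0.5 - mu) / sigma)


def sample_cov(xs, ys) -> float:
    """s_xy = sum((x_i - xbar)(y_i - ybar)) / (n - 1)."""
    n = len(xs)
    xbar = sum(xs) / n
    ybar = sum(ys) / n
    return sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys)) / (n - 1)


def sample_corr(xs, ys) -> float:
    """r_xy = s_xy / sqrt(s_xx * s_yy)."""
    return sample_cov(xs, ys) / math.sqrt(sample_cov(xs, xs) * sample_cov(ys, ys))


if __name__ == "__main__":
    # n = 40, p = 0.5 gives npq = 10 >= 5, so the normal approximation applies.
    print(binomial_range_exact(18, 22, 40, 0.5), binomial_range_normal(18, 22, 40, 0.5))
    xs = [1.0, 2.0, 3.0, 4.0, 5.0]
    ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # made-up data
    print(sample_cov(xs, ys), sample_corr(xs, ys))
```

Calling `sample_cov(xs, xs)` returns the sample variance, which mirrors the X = Y case in the notes.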
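Point estimation
- Choice of estimator: $E(\hat{\theta}) = \theta$ for unbiased; $\mathrm{Var}(\hat{\theta}) < \mathrm{Var}(\theta^*)$ for minimum variance
- $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ is the estimator for pop. variance $\sigma^2$
- $\bar{x}$ is the best estimator for pop. mean µ for N.D.
→ standard error $= se(\bar{X}) = \sqrt{\mathrm{Var}(\bar{X})} = \sigma/\sqrt{n}$
→ $s/\sqrt{n}$ is the estimator for the standard error
- $\hat{p}$ is the best estimator for pop. prop. p for N.D.
→ standard error $= se(\hat{p}) = \sqrt{\mathrm{Var}(\hat{p})} = \sqrt{pq/n}$
→ $\sqrt{\hat{p}\hat{q}/n}$ is the estimator for the standard error
- $\bar{X} \sim N\left(\mu, \dfrac{\sigma^2}{n}\right)$ when $X \sim N(\mu, \sigma^2)$; $\bar{X} \approx N\left(\mu, \dfrac{\sigma^2}{n}\right)$ when n ≥ 30 (central limit theorem)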
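The two estimated standard errors above are straightforward to compute from a sample. A short sketch, with hypothetical function names and made-up inputs:

```python
import math
import statistics


def se_mean(sample) -> float:
    """Estimated standard error of the sample mean: s / sqrt(n)."""
    # statistics.stdev uses the n - 1 denominator, matching s in the notes.
    return statistics.stdev(sample) / math.sqrt(len(sample))


def se_proportion(successes: int, n: int) -> float:
    """Estimated standard error of a sample proportion: sqrt(p_hat * q_hat / n)."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1.0 - p_hat) / n)


if __name__ == "__main__":
    print(se_mean([1, 2, 3, 4, 5]))
    print(se_proportion(50, 100))
```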
Sampling distribution
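Interval estimation: (1-α)100% C.I. (two-sided)
- Mean, when σ is known (or n > 200 if σ is unknown): $\bar{x} \pm z_{1-\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$
- Proportion (when $n\hat{p}\hat{q} \ge 5$): $\hat{p} \pm z_{1-\alpha/2}\sqrt{\hat{p}\hat{q}/n}$
- Mean, when σ is unknown: $\bar{x} \pm t_{n-1,\,1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$ (valid for large n even if not normally distributed)
- Variance: $\left(\dfrac{(n-1)s^2}{\chi^2_{n-1,\,1-\alpha/2}},\ \dfrac{(n-1)s^2}{\chi^2_{n-1,\,\alpha/2}}\right)$ (not valid for non-normal distributions)
* $-t_{n-1,\,\alpha/2} = t_{n-1,\,1-\alpha/2}$; no simple relationship between $\chi^2_{n-1,\,\alpha/2}$ and $\chi^2_{n-1,\,1-\alpha/2}$
* 90%: $z_{.95} = 1.645$; 95%: $z_{.975} = 1.960$; 99%: $z_{.995} = 2.576$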
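The z-based intervals can be sketched with `statistics.NormalDist` from the Python standard library. The function names are hypothetical and the inputs illustrative; the t and chi-square intervals need quantiles that the standard library does not provide, so they are left out of this sketch.

```python
import math
from statistics import NormalDist


def z_quantile(u: float) -> float:
    """z_u such that Phi(z_u) = u."""
    return NormalDist().inv_cdf(u)


def ci_mean_known_sigma(xbar: float, sigma: float, n: int, alpha: float = 0.05):
    """Two-sided (1 - alpha)100% C.I. for the mean when sigma is known."""
    half_width = z_quantile(1.0 - alpha / 2.0) * sigma / math.sqrt(n)
    return xbar - half_width, xbar + half_width


def ci_proportion(p_hat: float, n: int, alpha: float = 0.05):
    """Two-sided (1 - alpha)100% C.I. for a proportion (requires n*p_hat*q_hat >= 5)."""
    q_hat = 1.0 - p_hat
    if n * p_hat * q_hat < 5:
        raise ValueError("normal approximation not appropriate: n*p_hat*q_hat < 5")
    half_width = z_quantile(1.0 - alpha / 2.0) * math.sqrt(p_hat * q_hat / n)
    return p_hat - half_width, p_hat + half_width


if __name__ == "__main__":
    # Quantiles for 90%, 95% and 99% intervals
    print(z_quantile(0.95), z_quantile(0.975), z_quantile(0.995))
    print(ci_mean_known_sigma(100.0, 15.0, 36))
    print(ci_proportion(0.5, 100))
```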
Hypothesis testing (Given: H0: µ = µ0)
Type I and Type II Error
- p-value = probability of obtaining a test statistic as extreme as, or more extreme than, the observed test statistic
→ p > α implies the result is a 'general case', so there is not sufficient evidence to reject H0 (vice versa for p < α)
→ (for z, t) one-sided = 1 - Φ(|test statistic|); two-sided = 2[1 - Φ(|test statistic|)]
→ (for χ2) two-sided = 2[Φ(χ2)] for $s^2 \le \sigma_0^2$; 2[1 - Φ(χ2)] for $s^2 > \sigma_0^2$
(Φ here stands for the c.d.f. of the relevant reference distribution: z, t or χ2)
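For a z statistic, the one-sided and two-sided p-value rules can be sketched as follows. The function name is hypothetical, and the one-sided branch assumes the observed statistic falls in the direction of the alternative, as the |test statistic| form in the notes implies.

```python
from statistics import NormalDist


def z_p_value(z: float, two_sided: bool = True) -> float:
    """p-value for a z test statistic under the standard normal distribution."""
    upper_tail = 1.0 - NormalDist().cdf(abs(z))  # 1 - Phi(|z|)
    return 2.0 * upper_tail if two_sided else upper_tail


if __name__ == "__main__":
    alpha = 0.05
    p = z_p_value(2.2)  # illustrative statistic
    print(p, "reject H0" if p < alpha else "do not reject H0")
```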
One sample (v.s. population) tests (two-sided)
Two paired-samples test (two-sided)
Two independent-samples tests (two-sided)
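- significance level = α (given) = Pr(Reject H0 | H0 is true)
- Power of test $= 1 - \beta = \Phi\left(\dfrac{\mu_0 + z_{\alpha}\,\sigma/\sqrt{n} - \mu_1}{\sigma/\sqrt{n}}\right)$ (one-sided, H1: µ = µ1 < µ0; $z_{\alpha} = -z_{1-\alpha}$ is the lower α percentile)
- sample size $n = \dfrac{(z_{1-\alpha} + z_{1-\beta})^2\,\sigma^2}{(\mu_0 - \mu_1)^2}$ (one-sided)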
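The one-sided power and sample-size formulas can be checked against each other numerically. This sketch assumes a z test with known σ and the alternative µ1 < µ0; the function names and the numbers used are illustrative, not taken from the notes.

```python
import math
from statistics import NormalDist

_STD_NORMAL = NormalDist()


def power_one_sided(mu0: float, mu1: float, sigma: float, n: int, alpha: float = 0.05) -> float:
    """Power of the one-sided z test of H0: mu = mu0 against H1: mu = mu1 < mu0."""
    z_alpha = _STD_NORMAL.inv_cdf(alpha)  # lower alpha percentile (negative)
    return _STD_NORMAL.cdf(z_alpha + (mu0 - mu1) * math.sqrt(n) / sigma)


def sample_size_one_sided(mu0: float, mu1: float, sigma: float,
                          alpha: float = 0.05, beta: float = 0.20) -> int:
    """Smallest n giving power >= 1 - beta for the one-sided z test."""
    z_a = _STD_NORMAL.inv_cdf(1.0 - alpha)
    z_b = _STD_NORMAL.inv_cdf(1.0 - beta)
    n = sigma**2 * (z_a + z_b) ** 2 / (mu0 - mu1) ** 2
    return math.ceil(n)  # round up to the next whole subject


if __name__ == "__main__":
    n = sample_size_one_sided(120.0, 115.0, 24.0)
    print(n, power_one_sided(120.0, 115.0, 24.0, n))
```

Rounding n up means the achieved power is at least the target 1 - β, which the usage lines confirm by feeding the computed n back into the power formula.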
For 2 sample test, !! ≈ !"#$%& compare with !! ≈ !"!#$%&'"(
* Must clarify!