Advanced Statistical Methods: Beyond Linear Regression
John R. Stevens, Utah State University
Notes 2. Statistical Methods I
Mathematics Educators Workshop, 28 March 2009
http://www.stat.usu.edu/~jrstevens/pcmi
What would your students know to do with these data?
Obs  Flight  Temp  Damage
1    STS1    66    NO
2    STS9    70    NO
3    STS51B  75    NO
4    STS2    70    YES
5    STS41B  57    YES
6    STS51G  70    NO
7    STS3    69    NO
8    STS41C  63    YES
9    STS51F  81    NO
10   STS4    80
11   STS41D  70    YES
12   STS51I  76    NO
13   STS5    68    NO
14   STS41G  78    NO
15   STS51J  79    NO
16   STS6    67    NO
17   STS51A  67    NO
18   STS61A  75    YES
19   STS7    72    NO
20   STS51C  53    YES
21   STS61B  76    NO
22   STS8    73    NO
23   STS51D  67    NO
24   STS61C  58    YES
Two Sample t-test
data: Temp by Damage
t = 3.1032, df = 21, p-value = 0.005383
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
  2.774344 14.047085
sample estimates:
 mean in group NO  mean in group YES
         72.12500           63.71429
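The reported statistic can be checked by hand. The sketch below assumes the output comes from an equal-variance (pooled) two-sample t-test, which is consistent with df = 21 = 16 + 7 - 2, and it leaves out observation 10 because no damage value is recorded for it. The helper names are illustrative only.

```python
import math

# Launch temperatures (deg F) from the data table, split by O-ring damage.
# Observation 10 (STS4) has no damage entry and is left out.
temp_no = [66, 70, 75, 70, 69, 81, 76, 68, 78, 79, 67, 67, 72, 76, 73, 67]
temp_yes = [70, 57, 63, 70, 75, 53, 58]

def mean(xs):
    return sum(xs) / len(xs)

def sum_sq_dev(xs):
    """Sum of squared deviations from the sample mean."""
    m = mean(xs)
    return sum((x - m) ** 2 for x in xs)

def pooled_t(a, b):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = len(a) + len(b) - 2
    pooled_var = (sum_sq_dev(a) + sum_sq_dev(b)) / df
    se = math.sqrt(pooled_var * (1 / len(a) + 1 / len(b)))
    return (mean(a) - mean(b)) / se, df

t_stat, df = pooled_t(temp_no, temp_yes)
print(f"mean NO = {mean(temp_no):.5f}, mean YES = {mean(temp_yes):.5f}")
print(f"t = {t_stat:.4f}, df = {df}")
```

Up to rounding, this agrees with the t = 3.1032 on 21 degrees of freedom shown in the output. The arithmetic works, but it treats temperature as the response, which is the point the next slides question.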
Does the t-test make sense here?
Traditional: Treatment Group mean vs. Control Group mean
What is the response variable?
Temperature? [Quantitative, Continuous]
Damage? [Qualitative]
Traditional Statistical Model 1
Linear Regression: predict continuous response from [quantitative] predictors
  Y=weight, X=height
  Y=income, X=education level
  Y=first-semester GPA, X=parents' income
  Y=temperature, X=damage (0=no, 1=yes)
Can also control for other [possibly categorical] factors (covariates):
  Sex
  Major
  State of Origin
  Number of Siblings
Traditional Statistical Model 2
Logistic Regression: predict binary response from [quantitative] predictors
  Y=0 (graduate within 5 years) vs. Y=1 (not); X=first-semester GPA
  Y=0 (no damage) vs. Y=1 (damage); X=temperature
  Y=0 (survive) vs. Y=1 (death); X=dosage (dose-response model)
Can also control for other factors, or covariates
  Race, Sex
  Genotype
p = P(Y=1 | relevant factors) = prob. that Y=1, given state of relevant factors
Traditional Dose-Response Model
p = Probability of death at dose d:
Look at what affects the shape of the curve, the LD50 (the dose that is lethal to 50% of subjects), etc.
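To make the LD50 concrete, here is a minimal sketch that assumes the usual logistic form, logit(p) = b0 + b1*d, with the logit linear in the dose itself. Under that assumption the curve crosses p = 0.5 where b0 + b1*d = 0, so LD50 = -b0/b1. The coefficients and function names below are hypothetical, chosen only for illustration.

```python
import math

def death_prob(dose, b0, b1):
    """Logistic dose-response curve: P(death) at a given dose."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * dose)))

def ld50(b0, b1):
    """Dose at which the curve crosses 0.5, i.e. where b0 + b1*dose = 0."""
    return -b0 / b1

# Hypothetical coefficients (not estimated from any data in these notes).
b0, b1 = -4.0, 2.0
d50 = ld50(b0, b1)
print("LD50:", d50)
print("P(death) at LD50:", death_prob(d50, b0, b1))
print("P(death) at dose 3:", death_prob(3.0, b0, b1))
```

A larger b1 steepens the curve around the LD50, while changing b0 slides the whole curve along the dose axis.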
Fitting the Dose-Response Model
Why logistic regression?
  $\beta_0$ = place-holder constant
  $\beta_1$ = effect of dosage d
To estimate parameters:
  Newton-Raphson iterative process to maximize the likelihood of the model
  Compare Y=0 (no damage) with Y=1 (damage) groups
Likelihood Function (to be maximized)
  likelihood for obs. i
  multiply probabilities (independence)
Estimation by IRLS
Iteratively Reweighted Least Squares
equivalent: Newton-Raphson algorithm for iteratively solving score equations
Coefficients:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)   15.0429      7.3786    2.039    0.0415 *
Temp          -0.2322      0.1082   -2.145    0.0320 *
---
Signif. codes: 0 *** 0.001 ** 0.01 * 0.05 . 0.1 1
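The coefficient table can be approximated with a few Newton-Raphson steps on the Bernoulli log-likelihood. This is a minimal sketch in plain Python, not the software that produced the output. It assumes the model logit(p) = b0 + b1*Temp, drops observation 10 (no damage value recorded), and starts from zero. The name `fit_logistic` is illustrative.

```python
import math

# Temperature (deg F) and damage indicator (1 = YES) for the 23 flights
# with a recorded damage status, in table order.
temp = [66, 70, 75, 70, 57, 70, 69, 63, 81, 70, 76, 68,
        78, 79, 67, 67, 75, 72, 53, 76, 73, 67, 58]
damage = [0, 0, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0,
          0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 1]

def fit_logistic(x, y, iters=25):
    """Newton-Raphson for logit(p) = b0 + b1*x; returns estimates and SEs."""
    b0 = b1 = 0.0
    for _ in range(iters):
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        # Score vector (gradient of the log-likelihood).
        g0 = sum(yi - pi for yi, pi in zip(y, p))
        g1 = sum((yi - pi) * xi for yi, pi, xi in zip(y, p, x))
        # Fisher information [[s_w, s_wx], [s_wx, s_wxx]] with weights p(1-p).
        w = [pi * (1.0 - pi) for pi in p]
        s_w = sum(w)
        s_wx = sum(wi * xi for wi, xi in zip(w, x))
        s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = s_w * s_wxx - s_wx ** 2
        # Newton step: beta <- beta + information^{-1} * score.
        b0 += (s_wxx * g0 - s_wx * g1) / det
        b1 += (s_w * g1 - s_wx * g0) / det
    # Standard errors from the inverse information at the last evaluated point.
    return b0, b1, math.sqrt(s_wxx / det), math.sqrt(s_w / det)

b0, b1, se0, se1 = fit_logistic(temp, damage)
print(f"(Intercept) {b0:.4f}  SE {se0:.4f}")
print(f"Temp        {b1:.4f}  SE {se1:.4f}")
```

Because the logit is the canonical link, each Newton-Raphson step here is the same as one iteratively reweighted least squares step with weights p(1-p).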
What if the data were even better?
Complete separation of points
What should happen to our slope estimate?
Coefficients:
             Estimate  Std. Error  z value  Pr(>|z|)
(Intercept)     928.9    913821.4    0.001         1
Temp            -14.4     14106.7   -0.001         1
Failure?
Shape of likelihood function
Large Standard Errors
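A small numerical experiment shows why the fit breaks down under complete separation. The data below are hypothetical (made up so that every damaged flight is colder than every undamaged one), and the model is assumed to be logit(p) = b0 + b1*temp. Steepening the slope while holding the curve's midpoint between the two groups keeps raising the log-likelihood toward 0, so no finite maximum exists.

```python
import math

# Hypothetical, completely separated data (values invented for illustration).
temp = [53, 57, 58, 63, 66, 67, 68, 70, 72, 75]
damage = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

def log_lik(b0, b1):
    """Bernoulli log-likelihood of the logistic model logit(p) = b0 + b1*temp."""
    total = 0.0
    for x, y in zip(temp, damage):
        eta = b0 + b1 * x
        # log p = -log(1 + e^-eta) and log(1 - p) = -log(1 + e^eta)
        if y == 1:
            total -= math.log1p(math.exp(-eta))
        else:
            total -= math.log1p(math.exp(eta))
    return total

# Steepen the curve while keeping its midpoint at 64.5, between the groups.
cut = 64.5
slopes = [-0.5, -1.0, -2.0, -4.0, -8.0]
values = [log_lik(-s * cut, s) for s in slopes]
for s, v in zip(slopes, values):
    print(f"slope {s:5.1f}  log-likelihood {v:.6f}")
```

The likelihood surface flattens out along this direction, which is why the reported estimates are huge and their standard errors larger still.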
Solution only in 2006
Rather than maximizing likelihood, consider a penalty:
Model fitted by Penalized ML
Confidence intervals and p-values by Profile Likelihood
                   coef    se(coef)     Chisq             p
(Intercept)  30.4129282  16.5145441  11.35235  0.0007535240
Temp         -0.4832632   0.2528934  13.06178  0.0003013835
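The penalty formula itself did not survive in these notes. The sketch below assumes it is Firth's penalty, which adds half the log-determinant of the Fisher information to the log-likelihood. That is an assumption based on the output header, which matches the style of R's logistf package. The data are hypothetical and completely separated, and the function names are illustrative. The iteration uses Firth's modified score with simple step-halving.

```python
import math

# Hypothetical, completely separated data (values invented for illustration).
temp = [53, 57, 58, 63, 66, 67, 68, 70, 72, 75]
damage = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
center = sum(temp) / len(temp)
x = [t - center for t in temp]  # centring keeps the 2x2 algebra well conditioned

def softplus(z):
    """log(1 + e^z), computed without overflow."""
    return z + math.log1p(math.exp(-z)) if z > 0 else math.log1p(math.exp(z))

def pieces(b0, b1):
    """Linear predictor, fitted p, weights, and Fisher information entries."""
    eta = [b0 + b1 * xi for xi in x]
    p = [math.exp(-softplus(-e)) for e in eta]
    w = [pi * (1.0 - pi) for pi in p]
    s_w = sum(w)
    s_wx = sum(wi * xi for wi, xi in zip(w, x))
    s_wxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    return eta, p, w, s_w, s_wx, s_wxx

def penalized_loglik(b0, b1):
    """Log-likelihood plus 0.5 * log det(information) (assumed Firth penalty)."""
    eta, _, _, s_w, s_wx, s_wxx = pieces(b0, b1)
    det = s_w * s_wxx - s_wx ** 2
    if det <= 0.0:
        return float("-inf")
    ll = sum(-softplus(-e) if y else -softplus(e) for e, y in zip(eta, damage))
    return ll + 0.5 * math.log(det)

def firth_fit(max_iter=200, tol=1e-10):
    b0 = b1 = 0.0
    for _ in range(max_iter):
        _, p, w, s_w, s_wx, s_wxx = pieces(b0, b1)
        det = s_w * s_wxx - s_wx ** 2
        # Hat-matrix diagonals h_i = w_i * [1, x_i] I^{-1} [1, x_i]'.
        h = [wi * (s_wxx - 2 * s_wx * xi + s_w * xi * xi) / det
             for wi, xi in zip(w, x)]
        # Modified score: sum of (y - p + h * (1/2 - p)) * covariate.
        r = [y - pi + hi * (0.5 - pi) for y, pi, hi in zip(damage, p, h)]
        u0 = sum(r)
        u1 = sum(ri * xi for ri, xi in zip(r, x))
        d0 = (s_wxx * u0 - s_wx * u1) / det
        d1 = (s_w * u1 - s_wx * u0) / det
        # Halve the step until the penalized log-likelihood does not decrease.
        step, current = 1.0, penalized_loglik(b0, b1)
        while penalized_loglik(b0 + step * d0, b1 + step * d1) < current and step > 1e-8:
            step /= 2.0
        b0, b1 = b0 + step * d0, b1 + step * d1
        if abs(step * d0) + abs(step * d1) < tol:
            break
    return b0 - b1 * center, b1  # intercept converted back to the raw scale

intercept, slope = firth_fit()
print(f"(Intercept) {intercept:.4f}")
print(f"Temp        {slope:.4f}")
```

Unlike plain maximum likelihood on separated data, the penalized estimates stay finite, because the log-determinant term drops without bound as the fitted probabilities approach 0 and 1.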
Beetle Data
Phosphine  Total      Total    Total      Survivors Observed at Genotype
Dosage     Receiving  Deaths   Survivors
(mg/L)     Dosage                         -/B   -/H   -/A   +/B   +/H   +/A
0          98         0        98         31    27    10    6     20    4
0.003      100        16       84         18    26    10    6     20    4
0.004      100        68       32         10    4     3     5     7     4
0.005      100        78       22         1     4     7     2     6     2
0.01       100        77       23         0     1     9     8     5     0
0.05       300        270      30         0     0     0     5     20    5
0.1        400        383      17         0     0     0     0     10    7
0.2        750        740      10         0     0     0     0     0     10
0.3        500        490      10         0     0     0     0     0     10
0.4        500        492      8          0     0     0     0     0     8
1.0        7,850      7,806    44         0     0     0     0     0     44
Total      10,798     10,420   378
Dose-response model
Recall simple model:
pij = Pr(Y=1 | dosage level j and genotype level i)
But when is genotype (covariate Gi) observed?
Coefficients:
                Estimate  Std. Error    z value  Pr(>|z|)
(Intercept)   -2.657e+01   8.901e+04  -2.98e-04         1
dose          -7.541e-26   1.596e+07  -4.72e-33         1
G1+           -3.386e-28   1.064e+05  -3.18e-33         1
G2B           -1.344e-14   1.092e+05  -1.23e-19         1
G2H           -3.349e-28   1.095e+05  -3.06e-33         1
dose:G1+       7.541e-26   1.596e+07   4.72e-33         1
dose:G2B       3.984e-12   3.075e+07   1.30e-19         1
dose:G2H       7.754e-26   2.760e+07   2.81e-33         1
G1+:G2B        1.344e-14   1.465e+05   9.17e-20         1
G1+:G2H        3.395e-28   1.327e+05   2.56e-33         1
dose:G1+:G2B  -3.984e-12   3.098e+07  -1.29e-19         1
dose:G1+:G2H  -7.756e-26   2.763e+07  -2.81e-33         1
Before we fix this, first a little detour
A Multivariate Gaussian Mixture
Component j is MVN($\mu_j$, $\Sigma_j$) with proportion $\pi_j$
The Maximum Likelihood Approach
A Possible Work-Around
Keys here:
  the true group memberships are unknown (latent)
  statisticians specialize in unknown quantities
A reasonable approach
1. Randomly assign group memberships $\Delta$, and estimate group means $\mu_j$, covariance matrices $\Sigma_j$, and mixing proportions $\pi_j$
2. Given those values, calculate (for each obs.) $\gamma_j = E[\Delta_j \mid \theta]$ = P(obs. in group j)
3. Update estimates for $\mu_j$, $\Sigma_j$, and $\pi_j$, weighting each observation by these $\gamma_j$
4. Repeat steps 2 and 3 to convergence
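The four steps above can be sketched in a few lines. To stay self-contained this example is simplified to one dimension with two components, so each covariance matrix reduces to a single variance. The data, the seed, and all function names are hypothetical.

```python
import math
import random

# Hypothetical one-dimensional data from two well-separated groups.
data = [i * 0.2 for i in range(-10, 11)] + [6 + i * 0.2 for i in range(-10, 11)]

def normal_pdf(x, mu, var):
    return math.exp(-(x - mu) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def weighted_update(xs, resp):
    """Weighted mean, variance and mixing proportion for one component."""
    total = sum(resp)
    mu = sum(r * x for r, x in zip(resp, xs)) / total
    var = sum(r * (x - mu) ** 2 for r, x in zip(resp, xs)) / total
    return mu, var, total / len(xs)

def log_lik(xs, params):
    """Observed-data log-likelihood of the mixture."""
    return sum(math.log(sum(pi * normal_pdf(x, mu, var) for mu, var, pi in params))
               for x in xs)

def em_two_groups(xs, iters=500, seed=1):
    rng = random.Random(seed)
    # Step 1: random hard memberships give the starting estimates.
    resp = [1.0 if rng.random() < 0.5 else 0.0 for _ in xs]
    params = [weighted_update(xs, resp),
              weighted_update(xs, [1.0 - r for r in resp])]
    history = [log_lik(xs, params)]
    for _ in range(iters):
        # Step 2: P(obs. in first group), given the current estimates.
        resp = []
        for x in xs:
            dens = [pi * normal_pdf(x, mu, var) for mu, var, pi in params]
            resp.append(dens[0] / (dens[0] + dens[1]))
        # Step 3: update each component, weighting observations by resp.
        params = [weighted_update(xs, resp),
                  weighted_update(xs, [1.0 - r for r in resp])]
        history.append(log_lik(xs, params))
    return params, history  # Step 4 is the loop itself.

params, history = em_two_groups(data)
for mu, var, pi in params:
    print(f"mean {mu:.3f}  variance {var:.3f}  proportion {pi:.3f}")
print("log-likelihood: start", history[0], "end", history[-1])
```

The recorded log-likelihood never decreases from one pass to the next. Convergence from a random start can be slow, because the two initial components look almost identical.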
Plotting character and color indicate most likely component
The EM (Baum-Welch) Algorithm
- maximization made easier with Zm = latent (unobserved) data; T = (Z,Zm) = complete data
1. Start with initial guesses for parameters
2. Expectation: At the kth iteration, compute $Q(\theta', \hat\theta^{(k)}) = E[\ell(\theta'; T) \mid Z, \hat\theta^{(k)}]$
3. Maximization: Obtain estimate $\hat\theta^{(k+1)}$ by maximizing $Q(\theta', \hat\theta^{(k)})$ over $\theta'$
4. Iterate steps 2 and 3 to convergence ($?)
Beetle Data Notation
Observed values
Unobserved (latent) values
If Nij had been observed:
How Nij can be [latently] considered:
Likelihood Function
Parameters $\theta$=(p,P) and complete data T=(n,N)
After simplification:
Mechanism of missing data suggests EM algorithm
Missing at Random (MAR)
Necessary assumption for usual EM applications
Covariate x is MAR if probability of observing x does not depend on x or any other unobserved covariate, but may depend on response and other observed covariates (Ibrahim 1990)
Here genotype is observed only for survivors, and for all subjects at zero dosage
Initialization Step
Two classes of marginal information here
  For all dosage levels j observe
  At zero dosage level observe for genotype i
    Allows estimate of Pi
Consider marginal distribution of missing categorical covariate (genotype)
Using zero dosage level:
This is the key: the marginal distribution of the missing categorical covariate
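The initialization formula is missing from these notes. A natural reading, assumed here, is that the starting value for each genotype proportion Pi is the observed genotype share at zero dosage, where every beetle survives and is therefore genotyped. The variable names are illustrative.

```python
# Survivor counts by genotype at zero dosage, copied from the beetle data table.
zero_dose_survivors = {"-/B": 31, "-/H": 27, "-/A": 10, "+/B": 6, "+/H": 20, "+/A": 4}

total = sum(zero_dose_survivors.values())

# Assumed starting values: observed genotype proportions at zero dosage.
initial_P = {g: n / total for g, n in zero_dose_survivors.items()}

print("beetles genotyped at zero dosage:", total)
for genotype, prop in initial_P.items():
    print(f"{genotype}: {prop:.4f}")
```

These proportions describe the genotype mix before any dose-related deaths, which is what the latent genotype counts at the higher dosages are distributed according to.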
Expectation Step
Dropping constants and :
Need to evaluate:
(*)
Expectation Step
Bayes Formula:
Multinomial (*)
Expectation Step
For :
  Not needed for maximization; only affects EM convergence rate
  Direct calculation from multinomial distribution is possible but computationally prohibitive
  Need to employ some approximation strategy
  Second-order Taylor series about , using Binet's formula
(*)
Expectation Step
Consider Binet's formula (like Stirling's):
Have:
Use a second-order Taylor series approximation taken about as a function of :
(*)
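The exact form of Binet's formula used here is not recoverable from these notes. As a stand-in, the sketch below uses Stirling's approximation to log(n!), which the slide describes as similar. It shows how a closed-form expression can replace the log-factorial terms that make direct multinomial calculations expensive. The function name is illustrative.

```python
import math

def stirling_log_factorial(n):
    """Stirling's approximation to log(n!)."""
    return n * math.log(n) - n + 0.5 * math.log(2 * math.pi * n)

for n in (5, 20, 100):
    exact = math.lgamma(n + 1)  # log(n!) from the log-gamma function
    approx = stirling_log_factorial(n)
    print(f"n = {n:3d}  exact {exact:.6f}  Stirling {approx:.6f}  error {exact - approx:.6f}")
```

The error shrinks roughly like 1/(12n). The approximation is also a smooth function of n, so it can be differentiated, which a Taylor expansion requires.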
Maximization Step
Portion of related to :
Portion of related to :
by Lagrange multipliers
by Newton-Raphson iterations, with some parameterization
(*)
Convergence
Dose Response Curves (log scale)
EM Results
Genotype  LD50     L95       U95      t
-/B       0.0035   0.0031    0.0039   3.99
-/H       0.0033   0.0028    0.0038   4.98
-/A       0.0290   -7.1862   7.2442   0.13
+/B       0.0484   0.0123    0.0845   0.09
+/H       0.0664   0.0407    0.0921   4.20
+/A       0.7382   0.1428    1.3336   1.36
L95, U95: lower and upper confidence limits for LD50
t: test statistic for H0: no dosage effect
separation of points
Topics Used Here
Calculus
  Differentiation & Integration (including vector differentiation)
  Lagrange Multipliers
  Taylor Series Expansions
Linear Algebra
  Determinants & Eigenvalues
  Inverting [computationally/nearly singular] Matrices
  Positive Definiteness
Probability
  Distributions: Multivariate Normal, Binomial, Multinomial
  Bayes Formula
Statistics
  Logistic Regression
  Separation of Points
  [Penalized] Likelihood Maximization
  EM Algorithm
Biology: a little time and communication