All of statistics… revisited
Lecture 7
Likelihood Methods in Forest Ecology
October 9th – 20th, 2006




Standard statistics revisited

Bolker

Standard statistics revisited: Simple Variance Structures


Standard statistics revisited

General linear models

• Predictions are a linear function of a set of parameters.
• Includes:
  – Linear models
  – ANOVA
  – ANCOVA
• Assumptions:
  – Normally distributed, independent errors
  – Constant variance
• Not to be confused with generalized linear models!
• Distinction between factors and covariates.

Linear regression

$Y \sim a + bX + N(0, \sigma^2)$

Standard R code:

> lm.reg <- lm(Y ~ X)
> summary(lm.reg)
> anova(lm.reg)

Likelihood R code:

> lmfun <- function(a, b, sigma) {
    Y.pred <- a + b * X
    -sum(dnorm(Y, mean = Y.pred, sd = sigma, log = TRUE))  # negative log-likelihood
  }
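As a minimal sketch of how the likelihood version can actually be fitted, the example below minimises the negative log-likelihood with base R's `optim()` and compares the result with `lm()`. The simulated data, the parameter vector `p`, and fitting `sigma` on the log scale are all assumptions made for illustration; they are not part of the original slides.

```r
# Simulated data (illustrative assumption): true a = 2, b = 0.5, sigma = 1
set.seed(1)
X <- runif(100, 0, 10)
Y <- 2 + 0.5 * X + rnorm(100, sd = 1)

# Negative log-likelihood with parameters packed as p = (a, b, log(sigma));
# sigma is fitted on the log scale so that it stays positive
lmfun <- function(p) {
  Y.pred <- p[1] + p[2] * X
  -sum(dnorm(Y, mean = Y.pred, sd = exp(p[3]), log = TRUE))
}

ml.fit <- optim(c(mean(Y), 0, log(sd(Y))), lmfun, method = "BFGS")
ls.fit <- lm(Y ~ X)

# Maximum-likelihood and least-squares estimates of a and b should agree closely
print(rbind(likelihood = ml.fit$par[1:2], least.squares = coef(ls.fit)))
```

With normal, constant-variance errors the two approaches target the same estimates of `a` and `b`, so any remaining difference reflects the optimiser's tolerance.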


Analysis of variance (ANOVA)

$Y_{ij} \sim \alpha_i + N(0, \sigma^2)$

Standard R code:

> lm.aov <- lm(Y ~ f1)
> summary(lm.aov)
> anova(lm.aov)  # will give you an ANOVA table

Likelihood R code:

> aovfun <- function(a11, a12, sigma) {
    Y.pred <- c(a11, a12)[f1]  # group mean for each observation (f1 is a two-level factor)
    -sum(dnorm(Y, mean = Y.pred, sd = sigma, log = TRUE))
  }


Analysis of variance (ANOVA): H & M (p. 177)
Individual and cage effects on fly wing length

$Y_{ij} \sim \alpha_i + \beta_j + N(0, \sigma^2)$

Table 7.5

Cage  Female  Left wing  Right wing
1     1       58.5       59.5
1     2
1     3
1
2
2
2
2
3
3
3
3     12

Compare:

• Likelihood of mean model
• Likelihood of cage model
• Likelihood of individual fly model

Analysis of covariance (ANCOVA)

$Y \sim \alpha_i + \beta_i X + N(0, \sigma^2)$

Standard R code:

> lm.anc <- lm(Y ~ f * X)
> summary(lm.anc)
> str(summary(lm.anc))

Likelihood R code:

> ancfun <- function(a11, a12, slope1, slope2, sigma) {
    Y.pred <- c(a11, a12)[f] + c(slope1, slope2)[f] * X
    -sum(dnorm(Y, mean = Y.pred, sd = sigma, log = TRUE))
  }


Standard statistics revisited

Nonlinearity: Non-linear least squares

$Y \sim aX^b + N(0, \sigma^2)$

Standard R code:

> nls.fit <- nls(Y ~ a * X^b, start = list(a = 1, b = 1))
> summary(nls.fit)
> str(summary(nls.fit))

Likelihood R code:

> nlsfun <- function(a, b, sigma) {
    Y.pred <- a * X^b
    -sum(dnorm(Y, mean = Y.pred, sd = sigma, log = TRUE))
  }

nls() uses numerical methods similar to those used in likelihood estimation.
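To illustrate that the two routes lead to essentially the same answer, here is a small sketch that fits the power function both with `nls()` and by minimising the negative log-likelihood with `optim()`. The simulated data, starting values, and the log-scale `sigma` are assumptions made for the example.

```r
set.seed(2)
X <- runif(80, 1, 10)
Y <- 1.5 * X^0.7 + rnorm(80, sd = 0.3)   # assumed true values: a = 1.5, b = 0.7

# Standard non-linear least squares
nls.fit <- nls(Y ~ a * X^b, start = list(a = 1, b = 1))

# Likelihood version: p = (a, b, log(sigma))
nlsfun <- function(p) {
  Y.pred <- p[1] * X^p[2]
  -sum(dnorm(Y, mean = Y.pred, sd = exp(p[3]), log = TRUE))
}
ml.fit <- optim(c(1, 1, 0), nlsfun, method = "BFGS")

print(rbind(nls = coef(nls.fit), likelihood = ml.fit$par[1:2]))
```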


Standard statistics revisited

Generalized linear models

• Assumptions:
  – Non-normally distributed errors (but still independent, and only certain kinds of non-normality). These include distributions in the exponential family and are typically used with a specific linearizing function.
  – Non-linear relationships are allowed, but only if they have a linearizing transformation (the link function).
• Typical pairings of error distribution and link function:
  – Poisson: log link
  – Binomial: logit link
  – Gamma: inverse link
• Linearizing transformations:
  – Logit: $y = \frac{e^x}{1 + e^x} \;\Leftrightarrow\; x = \log\left(\frac{y}{1 - y}\right)$
  – Square root: $y = x^2 \;\Leftrightarrow\; x = \sqrt{y}$
  – Log: $y = e^x \;\Leftrightarrow\; x = \log(y)$
• Fit by iteratively reweighted least squares: the variance associated with each point is estimated for each estimate of the parameter(s).
• Not to be confused with general linear models!

GLM: Poisson regression

$y = e^{a + bx}$

Standard R code:

> glm.pois <- glm(Y ~ X, family = poisson)
> summary(glm.pois)

Likelihood R code:

> poisregfun <- function(a, b) {
    Y.pred <- exp(a + b * X)
    -sum(dpois(Y, lambda = Y.pred, log = TRUE))
  }
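The following sketch, which assumes simulated count data and base R's `optim()`, shows that minimising the Poisson negative log-likelihood directly gives approximately the same coefficients as `glm()` with its default log link.

```r
set.seed(3)
X <- runif(200, 0, 2)
Y <- rpois(200, lambda = exp(0.5 + 0.8 * X))   # assumed true values: a = 0.5, b = 0.8

glm.pois <- glm(Y ~ X, family = poisson)

# Negative log-likelihood with p = (a, b)
poisregfun <- function(p) {
  Y.pred <- exp(p[1] + p[2] * X)
  -sum(dpois(Y, lambda = Y.pred, log = TRUE))
}
ml.fit <- optim(c(log(mean(Y)), 0), poisregfun, method = "BFGS")

print(rbind(glm = coef(glm.pois), likelihood = ml.fit$par))
```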

GLM: Logistic regression

$y = \frac{e^x}{1 + e^x}$, with link function $\mathrm{logit}(y) = \log\left(\frac{y}{1 - y}\right) = x$

Standard R code:

> glm2 <- glm(y ~ x, family = "binomial")
> summary(glm2)

Likelihood R code:

> logregfun <- function(a, b, N) {
    p.pred <- exp(a + b * X) / (1 + exp(a + b * X))
    -sum(dbinom(Y, size = N, prob = p.pred, log = TRUE))
  }
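A hedged sketch of the same comparison for the binomial case follows. It assumes simulated data with `N = 10` trials per observation; with more than one trial per row, `glm()` needs the response as a two-column matrix of successes and failures. `plogis()` is base R's inverse logit and is used here in place of the explicit `exp()` ratio.

```r
set.seed(4)
N <- 10                                        # trials per observation (assumed)
X <- runif(150, -2, 2)
Y <- rbinom(150, size = N, prob = plogis(-0.5 + 1.2 * X))

# glm() with successes and failures as a two-column response
glm2 <- glm(cbind(Y, N - Y) ~ X, family = binomial)

# Negative log-likelihood with p = (a, b)
logregfun <- function(p) {
  p.pred <- plogis(p[1] + p[2] * X)            # inverse logit
  -sum(dbinom(Y, size = N, prob = p.pred, log = TRUE))
}
ml.fit <- optim(c(0, 0), logregfun, method = "BFGS")

print(rbind(glm = coef(glm2), likelihood = ml.fit$par))
```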


Standard statistics revisited

Generalized (non)linear least-squares models: Variance changes with a covariate or among groups

$y_i \sim c + N(0, \sigma_i^2)$

Standard R code:

> library(nlme)
> gls.fit <- gls(y ~ 1, weights = varIdent(form = ~1 | f))
> summary(gls.fit)

Likelihood R code:

> vardifffun <- function(a, sd1, sd2) {
    sdval <- c(sd1, sd2)[f]  # group-specific standard deviation
    -sum(dnorm(Y, mean = a, sd = sdval, log = TRUE))  # normal errors
  }
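As an illustrative sketch (simulated data, base R only), the group-specific standard deviations can be estimated by minimising a normal negative log-likelihood in which `sd` is indexed by the grouping factor. The data, the factor `f`, and the log-scale parameterisation are assumptions made for the example.

```r
set.seed(5)
f <- factor(rep(c("g1", "g2"), each = 100))
Y <- rnorm(200, mean = 10, sd = c(1, 3)[f])   # assumed: common mean, sd 1 vs 3

# p = (mean, log(sd1), log(sd2)); indexing by the factor picks each group's sd
vardifffun <- function(p) {
  sdval <- exp(p[2:3])[f]
  -sum(dnorm(Y, mean = p[1], sd = sdval, log = TRUE))
}
fit <- optim(c(mean(Y), log(sd(Y)), log(sd(Y))), vardifffun, method = "BFGS")

sd.est <- exp(fit$par[2:3])
print(c(mean = fit$par[1], sd1 = sd.est[1], sd2 = sd.est[2]))
```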


Standard statistics revisited: Complex Variance Structures

Complex error structures

• Error structures are not independent
• Complex likelihood functions
• Includes:
  – Time series analysis
  – Spatial correlation
  – Repeated measures analysis

[Equation: multivariate normal likelihood, annotated with the vector of data (x), the vector of means (predictions), and the variance-covariance matrix]

Complex error structures

[Figure: three error structures and their likelihood terms]
• Independent: $\exp\left(-(x_i - \mu_i)^2 / 2\sigma^2\right)$
• Increasing variance: $\exp\left(-(x_i - \mu_i)^2 / 2\sigma_i^2\right)$
• General case: $\exp\left(-\tfrac{1}{2}(x - \mu)^T \Sigma^{-1} (x - \mu)\right)$

Complex error structures

• The variance/covariance matrix is symmetric, so we need to specify at most n(n+1)/2 parameters (n variances and n(n-1)/2 covariances).
• The V/C matrix must also be positive definite (logical): all of its eigenvalues must be positive. Positive diagonal values are necessary but not sufficient.
• Select elements of the matrix that define the error structure and ensure positive definiteness.
• In this example, correlation drops off with the number of steps between sites: $c_{ij} = \rho^{|i - j|}$, with $|\rho| < 1$.
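The positive-definiteness requirement can be checked numerically. The sketch below builds a correlation matrix that decays with the number of steps between sites and inspects its eigenvalues; it also shows a symmetric matrix with a positive diagonal that is nevertheless not positive definite. The matrices `C` and `B` are made-up examples.

```r
n <- 5
rho <- 0.5
idx <- seq_len(n)

# Correlation decays with the number of steps between sites: rho^|i - j|
C <- rho^abs(outer(idx, idx, "-"))
ev.C <- eigen(C, symmetric = TRUE, only.values = TRUE)$values
all(ev.C > 0)    # positive definite: every eigenvalue is positive

# Counter-example: symmetric with ones on the diagonal, but not positive definite
B <- matrix(c( 1.0, 0.9, -0.9,
               0.9, 1.0,  0.9,
              -0.9, 0.9,  1.0), nrow = 3)
ev.B <- eigen(B, symmetric = TRUE, only.values = TRUE)$values
min(ev.B) < 0    # at least one negative eigenvalue
```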

Complex error structures: An example
Spatially-correlated errors

R code:

> rho <- 0.5
> m <- matrix(nrow = 5, ncol = 5)
> m <- rho^abs(row(m) - col(m))  # correlation decays with distance
> # OR, correlating nearest neighbours only:
> m <- diag(5)
> m[abs(row(m) - col(m)) == 1] <- rho

> mvlik <- function(a, b, rho) {
    mu <- a + b * x
    n <- length(x)
    m <- diag(n)  # generates a diagonal matrix with n rows and n columns
    m[abs(row(m) - col(m)) == 1] <- rho
    -dmvnorm(y, mu, Sigma = m, log = TRUE)  # dmvnorm() with a Sigma argument, as in the emdbook package
  }
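For readers without a `dmvnorm()` at hand, the multivariate normal log-density can be written out in base R. The sketch below is one way to do it, via a Cholesky factorisation. The helper `ldmvnorm()` is a hypothetical name, and passing `x` and `y` as arguments is a choice made for the example. With `rho = 0` the result should reduce to the independent-errors likelihood.

```r
# Multivariate normal log-density in base R (hypothetical helper)
ldmvnorm <- function(y, mu, Sigma) {
  n <- length(y)
  r <- y - mu
  U <- chol(Sigma)                          # errors if Sigma is not positive definite
  z <- backsolve(U, r, transpose = TRUE)    # solves t(U) %*% z = r
  -0.5 * (n * log(2 * pi) + 2 * sum(log(diag(U))) + sum(z^2))
}

# Negative log-likelihood with nearest-neighbour correlation rho and unit variance
mvlik <- function(a, b, rho, x, y) {
  mu <- a + b * x
  m <- diag(length(x))
  m[abs(row(m) - col(m)) == 1] <- rho
  -ldmvnorm(y, mu, m)
}

set.seed(6)
x <- 1:10
y <- 1 + 0.5 * x + rnorm(10)

nll.indep <- -sum(dnorm(y, mean = 1 + 0.5 * x, sd = 1, log = TRUE))
nll.rho0  <- mvlik(1, 0.5, 0,   x, y)   # should match the independent case
nll.rho3  <- mvlik(1, 0.5, 0.3, x, y)   # correlated neighbours
print(c(independent = nll.indep, rho0 = nll.rho0, rho0.3 = nll.rho3))
```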

Mixed models & Generalized linear mixed models (GLMM)

• Samples within a group (block, site) are equally correlated with each other.
• Fixed effects: effects of covariates
• Random effects: block, site etc.
• GLMMs are generalized linear models with random effects

Complex variance structures

• So how do you incorporate all potential sources of variance?
  – Block effects
  – Individual effects (repeated measures includes both individual and temporal correlation)
  – Measurement vs. process error
  – …


Bolker

Analyses of experimental data

[Figure, Osenberg et al. 2002: density dependence (DD) detected vs. DD undetected; labels: threshold, natural variation, ambient density, variance]


Why variation in experimental conclusions?

• Inference derived from p-value

• No effect size (strength of the process)

• No per-capita effects

• Time difference between studies

• Spatial extent difference between studies

Approach: Analyze data using one equation

[Equation with terms labeled "di" and "dd"]


Results

No difference in per-capita effects

Difference due to initial density

So… beware of experiments!

Hierarchical structure

i = 1 (nematodes present), 2 (nematodes absent)
j = caterpillar treatment j

No. of seedlings dead = binomial random variable

Analyses


Model selection


Results

• 47% died in absence of nematodes

• 11% died in presence of nematodes

Nematodes absent

Nematodes present

Model selection

Four models: β2 positive (negative effect of nematodes) and different from β1 (which = 0 or < β2)

Traditional approach: Logistic regression

Logistic regression

Beware of canned packages! Need to determine hierarchical error structure when testing complex hypotheses


Take-home points

• Non-linear effects and non-normal response variables will often cause problems with canned packages.

• Focus on model construction, parameter estimation and model evaluation.

• Represent variability in your data using the appropriate probability function


Predator-induced hatching plasticity

Vonesh & Bolker 2004


Trait-mediated predator effects

• Density effects: consumptive effects resulting from predators killing prey (affects density).

• Trait effects: non-consumptive effects resulting from changes in prey behavior or morphology in response to predation risk (e.g., growth rates)

Luttbeg & Kerby 2005

Trait-mediated effects

Preisser et al. 2005


Predator-mediated plasticity in anurans

• Prey respond to predators by changing their behavior, morphology and life history.

• Timing of habitat shifts, metamorphosis, and hatching involve change of habitat and often, suite of predators.

• Timing of transition between two life stages should evolve in response to variation in growth and mortality among life history stages.


Postponement of hatching in response to predators may

• Allow hatchlings to reach a greater body size before encountering predators, thus increasing their survival

• But… there may be different predation risk at different life stages (i.e., terrestrial vs. aquatic predators), so it may be best to hatch early.

• How are these tradeoffs determined?


The study system


Predator effects on terrestrial stage

• Both frogs and flies cause embryos to hatch approximately 30% earlier.

• Early hatchlings have lower weights and are at earlier developmental stages

• Frogs can reduce density of tadpoles entering the pond by 60%

• Flies have a much smaller effect.
• So both size and density change over time!

Experiment I: Quantifying the functional response

• Vary larval density in the presence and absence of the dragonfly (aquatic predator).

Scientific Model: Functional response
Mortality as a function of density (keep size fixed)

[Equation: functional response, with terms labeled: number of prey eaten in t days, number of predators, attack rate, handling time]

• Assume that the actual number attacked follows a binomial distribution with p = probability of an individual being killed over the course of the experiment.
• Obtain estimates of α and HD that maximize the likelihood.
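The slide's equation did not survive extraction, so the sketch below assumes a Holling type II functional response, in which the per-capita probability of being killed is a / (1 + a·h·N) for attack rate `a`, handling time `h`, and initial density `N`, with one predator and one unit of time. This form and the simulated data are assumptions for illustration, not necessarily the authors' model. The parameters are estimated by maximising a binomial likelihood.

```r
set.seed(7)
N0 <- rep(c(5, 10, 20, 40, 80), each = 20)        # initial prey densities (hypothetical design)
p.true <- 0.5 / (1 + 0.5 * 0.02 * N0)             # assumed attack rate 0.5, handling time 0.02
killed <- rbinom(length(N0), size = N0, prob = p.true)

# Negative log-likelihood; p = (log attack rate, log handling time)
frfun <- function(p) {
  a <- exp(p[1])                                  # attack rate, kept positive
  h <- exp(p[2])                                  # handling time, kept positive
  prob <- pmin(a / (1 + a * h * N0), 1 - 1e-9)    # per-capita probability of being killed
  -sum(dbinom(killed, size = N0, prob = prob, log = TRUE))
}

fit <- optim(c(log(0.3), log(0.05)), frfun)       # Nelder-Mead by default
est <- exp(fit$par)
names(est) <- c("attack.rate", "handling.time")
print(est)
```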


Experiment II: Effect of larval size on predation risk

• Expose five larval age/size classes to aquatic predators.

• Dragonflies were replaced daily to keep predator densities constant

The Scientific Model Part II: Size-specific mortality
(Keep density fixed)

[Equation: phenomenological scientific model giving the size-specific predation probability as a function of prey size]

Assume that predation follows a binomial distribution with this probability. This function peaks at intermediate prey sizes.

Combining size & density-dependent mortality to predict attack rate

• Two tricks…

[Equation annotations: number of prey eaten per predator per day, which becomes the risk of predation at density N; density of predation in the size experiment]

Monte-Carlo methods

• Population of interest is simulated.
• Draw repeated samples from the pseudo-population.
• Statistic (parameter) computed in each pseudo-sample.
• Sampling distribution of statistic examined.
• Where do true parameters fall within this distribution?

Basic procedure

1. Calculate predicted values with known parameter values (these may also be calculated from data).
2. Add random error to the predicted values to create observed values.
3. Estimate parameter values given observed and predicted.
4. Go back to step 2 and loop through 100-1000 times.
5. Examine the frequency distribution of the estimated parameters of interest.
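The five steps can be sketched for a simple linear model as follows. The "true" parameter values, the sample size, and the number of simulations are arbitrary assumptions chosen for illustration.

```r
set.seed(8)
a <- 2; b <- 0.5; sigma <- 1              # known ("true") parameter values (assumed)
X <- seq(0, 10, length.out = 50)
Y.pred <- a + b * X                       # step 1: predicted values

nsim <- 500
b.hat <- numeric(nsim)
for (i in seq_len(nsim)) {                # step 4: repeat steps 2 and 3
  Y.obs <- Y.pred + rnorm(length(X), sd = sigma)   # step 2: add random error
  b.hat[i] <- coef(lm(Y.obs ~ X))[2]               # step 3: estimate the parameter
}

# step 5: frequency distribution of the estimates; where does the true b fall?
b.quant <- quantile(b.hat, c(0.025, 0.5, 0.975))
print(b.quant)
```

If the estimator behaves well, the true value of `b` should lie near the centre of the simulated distribution.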


Describe the distribution of the predicted variable

• Vonesh & Bolker obtain parameter estimates of their model with CI and variance-covariance matrix.

• Draw repeatedly from these distributions.

• Simulate larval growth and survival from estimated parameters and error around estimates (var-cov matrix).

• Generate expected distribution of the variable of interest (number killed).

• Can do these with just one set of data analyzed in a traditional framework.
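A rough sketch of this idea, not the authors' code, is shown below. It fits a simple binomial model to simulated data, draws parameter sets from a multivariate normal defined by the estimates and their variance-covariance matrix, and turns each draw into a predicted number killed. The model, the data, and the density of 30 are hypothetical.

```r
set.seed(9)
N0 <- rep(c(10, 20, 40), each = 10)                 # hypothetical initial densities
killed <- rbinom(length(N0), size = N0, prob = plogis(0.5 - 0.03 * N0))
fit <- glm(cbind(killed, N0 - killed) ~ N0, family = binomial)

# Draw 1000 parameter sets from N(estimates, var-cov matrix) using a Cholesky factor
ndraw <- 1000
Z <- matrix(rnorm(ndraw * 2), nrow = ndraw, ncol = 2)
draws <- sweep(Z %*% chol(vcov(fit)), 2, coef(fit), "+")

# Propagate parameter uncertainty, then add binomial sampling variation
p.draw <- plogis(draws[, 1] + draws[, 2] * 30)      # predation probability at density 30
n.killed <- rbinom(ndraw, size = 30, prob = p.draw)

# Expected distribution of the variable of interest (number killed)
print(quantile(n.killed, c(0.025, 0.5, 0.975)))
```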


Measurement & Observation Error

Schnute

X measured perfectly (process error)

No process uncertainty (measurement error)


Why should we care?

• Measurement error only affects the current measurement.

• Process error propagates through time.

• This is a big deal in dynamic models.

Bolker


A famous example

Pascual & Kareiva 1996


Fit L-V models to Gause’s data


Traditional conclusion


Observation & Process Error

• Process uncertainty: random events cause the response variable to change in ways that are not predicted by the model. These may be errors in the process itself or in the observer of y. Propagates through time.

• Observer uncertainty: Error in sampling due to measurement. Error in the predictive variable. Does not propagate through time.

Observation error

• We only need initial conditions.
• We take the observation from each time step, I(t), and predict just the next step, I(t+1).
• Contrast trajectory and actual data.
• Minimize the difference between observed and predicted data.
• Often involves non-linear minimization.

Process error

• We need a complete series of observations.
• We take the observation from each time step, I(t), and predict just the next step, I(t+1).
• For estimation we fit a regression between N(t+1) and N(t).
• Minimize the difference between observed and predicted N(t+1).
• One-step-ahead fitting
• Linear regression approach
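The contrast between the two fitting strategies can be sketched with a deliberately simple model, exponential growth N(t+1) = r·N(t), simulated with both kinds of error. The model, the error magnitudes, and the variable names are assumptions chosen for illustration. The observation-error fit projects a whole trajectory from the initial condition, while the process-error fit regresses each observation on the previous one.

```r
set.seed(10)
nT <- 30
r.true <- 1.05                                   # assumed growth rate
N <- numeric(nT)
N[1] <- 10
for (i in 1:(nT - 1)) {
  N[i + 1] <- r.true * N[i] * exp(rnorm(1, sd = 0.03))   # process error propagates
}
obs <- N * exp(rnorm(nT, sd = 0.03))             # observation error does not propagate

# Observation-error fit: project the full trajectory from the initial condition
# p = (log initial size, log growth rate); non-linear minimisation
traj.ss <- function(p) {
  pred <- exp(p[1]) * exp(p[2])^(0:(nT - 1))
  sum((log(obs) - log(pred))^2)
}
fit.obs <- optim(c(log(obs[1]), 0), traj.ss)
r.obs <- exp(fit.obs$par[2])

# Process-error fit: one-step-ahead linear regression of N(t+1) on N(t)
N.now <- obs[-nT]
N.next <- obs[-1]
r.proc <- unname(coef(lm(N.next ~ 0 + N.now)))

print(c(observation.error.fit = r.obs, process.error.fit = r.proc))
```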

Estimation

• To estimate both observation and process error we need independent estimates of either:
  – the magnitude of the errors, OR
  – their relative size
• Otherwise we have to choose between the two.
• Fitting assuming observation errors provides unbiased and more precise estimates even when the data contain only process error. However, it produces downward-biased estimates of variance.
• If the two kinds of errors are uncorrelated, it gives the extreme values of possible parameter estimates.


Statistical flip-flopping

• We often use MLEs to estimate parameters, although information theory (IT) does not have this requirement.

• An AIC does not say anything about our confidence (or error) in the parameter estimate.

• Therefore, we resort to frequentist stats to generate some 95% CI.

• An alternative is to generate a true likelihood profile and chi-square but ultimately this also produces a p value.

• The only consistent statistical logic is Bayesian
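As a sketch of the likelihood-profile alternative mentioned above, the example below profiles the slope of a Poisson regression. For each fixed value of `b`, the negative log-likelihood is minimised over `a`, and the 95% interval is the set of `b` values whose profile lies within `qchisq(0.95, 1) / 2` of the minimum. The simulated data and the grid limits are assumptions.

```r
set.seed(11)
X <- runif(100, 0, 2)
Y <- rpois(100, exp(0.5 + 0.8 * X))              # assumed true values: a = 0.5, b = 0.8

nll <- function(a, b) -sum(dpois(Y, lambda = exp(a + b * X), log = TRUE))

# Maximum-likelihood fit of both parameters
fit <- optim(c(log(mean(Y)), 0), function(p) nll(p[1], p[2]), method = "BFGS")

# Profile: for each fixed b, minimise the negative log-likelihood over a
b.grid <- seq(fit$par[2] - 0.6, fit$par[2] + 0.6, length.out = 121)
prof <- sapply(b.grid, function(b) {
  optimize(function(a) nll(a, b), interval = c(-5, 5))$objective
})

# 95% interval from the chi-square cutoff (1 degree of freedom)
cutoff <- fit$value + qchisq(0.95, df = 1) / 2
ci <- range(b.grid[prof <= cutoff])
print(c(lower = ci[1], estimate = fit$par[2], upper = ci[2]))
```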


Philosophy vs pragmatism

• It is useful to have a broader more encompassing philosophy but…

• Greater generality often implies greater complexity, often computational and mathematical.