Statistical Inference I - UNIL · 2017-08-31 · Statistical Inference I GLM, Contrasts & RFT...

Statistical Inference I GLM, Contrasts & RFT

Guillaume Flandin

Wellcome Trust Centre for Neuroimaging

University College London

SPM Course

Lausanne, November 2013

Normalisation

Statistical Parametric Map

Image time-series

Parameter estimates

General Linear Model Realignment Smoothing

Design matrix

Anatomical

reference

Spatial filter

Statistical

Inference RFT

p <0.05

Normalisation

Image time-series

Parameter estimates

Design matrix

Anatomical

reference

Spatial filter

Statistical

Inference RFT

p <0.05

Passive word

listening

versus rest

7 cycles of

rest and listening

Blocks of 6 scans

with 7 sec TR

Question: Is there a change in the BOLD

response between listening and rest?

Stimulus function

One session

fMRI experiment example

BOLD signal

single voxel

time series

Voxel-wise time series analysis

specification

Parameter

estimation

Hypothesis

Statistic

BOLD signal

= 1 2 + + err

x1 x2 e

Single voxel regression model

exxy 2211

Mass-univariate analysis: voxel-wise GLM

e+ y X

Model is specified by

1. Design matrix X

2. Assumptions about e

N: number of scans

p: number of

regressors

The design matrix embodies all available knowledge about

experimentally controlled factors and potential confounds.

),0(~ 2INe

• one sample t-test

• two sample t-test

• paired t-test

• Analysis of Variance

(ANOVA)

• Analysis of Covariance

(ANCoVA)

• correlation

• linear regression

• multiple regression

GLM: a flexible framework for parametric analyses

Parameter estimation

Ordinary least

squares estimation

(OLS) (assuming i.i.d.

error):

yXXX TT 1)(ˆ

Objective:

estimate

parameters to

minimize

Problems of this model with fMRI time series

1. The BOLD response has a delayed and dispersed shape.

2. The BOLD signal includes substantial amounts of low-frequency noise (eg due to scanner drift).

3. Due to breathing, heartbeat & unmodeled neuronal activity, the errors are serially correlated. This violates the assumptions of the noise model in the GLM.

Boynton et al, NeuroImage, 2012.

Scaling

Additivity

invariance

Problem 1: BOLD response

Hemodynamic response function (HRF):

Linear time-invariant (LTI) system:

u(t) x(t) hrf(t)

Convolution operator:

Problem 1: BOLD response Solution: Convolution model

Convolution model of the BOLD response

Convolve stimulus function

with a canonical

hemodynamic response

function (HRF):

dtgftgf0

)()()(

Stimulus

Undershoot

Informed Basis Set Canonical HRF

Hemodynamic Response Temporal Basis Set

Canonical

Temporal

Dispersion

blue = data

black = mean + low-frequency drift

green = predicted response, taking into

account low-frequency drift

red = predicted response, NOT taking

into account low-frequency drift

Problem 2: Low-frequency noise Solution: High pass filtering

discrete cosine

transform (DCT)

)(eCov

autocovariance

function

Problem 3: Serial correlations

i.i.d:

Multiple covariance components

= 1 + 2

Estimation of hyperparameters with ReML (Restricted Maximum Likelihood).

enhanced noise model at voxel i

error covariance components Q

and hyperparameters

),0(~ ii CNe

Summary: a mass-univariate approach

Summary: Estimation of the parameters

i.i.d. assumptions:

OLS estimates:

Normalisation

Image time-series

Parameter estimates

Design matrix

Anatomical

reference

Spatial filter

Statistical

Inference RFT

p <0.05

Contrasts

[1 0 0 0 0 0 0 0 0 0 0 0 0 0] [0 1 -1 0 0 0 0 0 0 0 0 0 0 0]

Hypothesis Testing

Null Hypothesis H0

Typically what we want to disprove (no effect).

The Alternative Hypothesis HA expresses outcome of interest.

To test an hypothesis, we construct “test statistics”.

Test Statistic T

The test statistic summarises evidence

about H0.

Typically, test statistic is small in

magnitude when the hypothesis H0 is true

and large when false.

We need to know the distribution of T

under the null hypothesis. Null Distribution of T

cT = 1 0 0 0 0 0 0 0

contrast of

estimated

parameters

variance

estimate

box-car amplitude > 0 ?

1 = cT> 0 ?

1 2 3 4 5 ...

T-test - one dimensional contrasts – SPM{t}

Question:

Null hypothesis: H0: cT=0

Test statistic:

)ˆvar(

T-test: one dimensional contrasts – SPM{t}

T-test is a signal-to-noise measure (ratio of estimate to

standard deviation of estimate).

T-contrasts are simple combinations of the betas; the T-statistic does not depend on the scaling of the regressors or the scaling of the contrast.

H0: 0Tc vs HA: 0Tc

Alternative hypothesis:

F-test - the extra-sum-of-squares principle

Model comparison:

Null Hypothesis H0: True model is X0 (reduced model)

Full model ?

or Reduced model?

X0 Test statistic: ratio of

explained variability and

unexplained variability (error)

1 = rank(X) – rank(X0)

2 = N – rank(X)

2ˆfull

2ˆreduced

F-test - multidimensional contrasts – SPM{F}

Tests multiple linear hypotheses:

0 0 0 1 0 0 0 0 0

0 0 0 0 1 0 0 0 0

0 0 0 0 0 1 0 0 0

0 0 0 0 0 0 1 0 0

0 0 0 0 0 0 0 1 0

0 0 0 0 0 0 0 0 1

H0: 4 = 5 = ... = 9 = 0

X1 (4-9) X0

Full model? Reduced model?

H0: True model is X0

test H0 : cT = 0 ?

SPM{F6,322}

F-test example: movement related effects

Design matrix

2 4 6 8

contrast(s)

Design matrix 2 4 6 8

contrast(s)

F-test: summary

F-tests can be viewed as testing for the additional variance explained by a larger model wrt a simpler (nested) model model comparison.

In testing uni-dimensional contrast with an F-test, for example 1 – 2, the result will be the same as testing 2 – 1. It will be exactly the square of the t-test, testing for both positive and negative effects.

F tests a weighted sum of squares of one or several combinations of the regression coefficients .

Hypotheses:

0 : Hypothesis Null 3210 H

0 oneleast at : Hypothesis eAlternativ kAH

Orthogonal regressors

Variability in Y

Correlated regressors

Shared variance

Variability in Y

Normalisation

Image time-series

Parameter estimates

Design matrix

Anatomical

reference

Spatial filter

Statistical

Inference RFT

p <0.05

Inference at a single voxel

Null distribution of test statistic T

Decision rule (threshold) u:

determines false positive

rate α

Null Hypothesis H0:

zero activation

Choose u to give acceptable

α under H0

Multiple tests

Signal

If we have 100,000 voxels,

α=0.05 5,000 false positive voxels.

This is clearly undesirable; to correct

for this we can define a null hypothesis

for a collection of tests.

Multiple tests

11.3% 11.3% 12.5% 10.8% 11.5% 10.0% 10.7% 11.2% 10.2% 9.5%

Use of ‘uncorrected’ p-value, α =0.1

Percentage of Null Pixels that are False Positives

If we have 100,000 voxels,

α=0.05 5,000 false positive voxels.

This is clearly undesirable; to correct

for this we can define a null hypothesis

for a collection of tests.

Family-Wise Null Hypothesis

Use of ‘corrected’ p-value, α =0.1

Use of ‘uncorrected’ p-value, α =0.1

Family-Wise Null Hypothesis:

Activation is zero everywhere

If we reject a voxel null hypothesis at any voxel,

we reject the family-wise Null hypothesis

A FP anywhere in the image gives a Family Wise Error (FWE)

Family-Wise Error rate (FWER) = ‘corrected’ p-value

Bonferroni correction

The Family-Wise Error rate (FWER), αFWE, for a family of N

tests follows the inequality:

where α is the test-wise error rate.

Therefore, to ensure a particular FWER choose:

This correction does not require the tests to be independent but

becomes very stringent if dependence.

Spatial correlations

100 x 100 independent tests Spatially correlated tests (FWHM=10)

Bonferroni is too conservative for spatial correlated data.

Discrete data Spatially extended data

Topological inference

Topological feature: Peak height

Peak level inference

Topological feature: Cluster extent

uclus : cluster-forming threshold

Cluster level inference

Topological feature: Number of clusters

uclus : cluster-forming threshold

Set level inference

RFT and Euler Characteristic

Search volume Roughness

(1/smoothness) Threshold

Expected Euler Characteristic

Random Field Theory

The statistic image is assumed to be a good lattice

representation of an underlying continuous stationary

random field.

Typically, FWHM > 3 voxels

(combination of intrinsic and extrinsic smoothing)

A priori hypothesis about where an activation should be,

reduce search volume Small Volume Correction:

• mask defined by (probabilistic) anatomical atlases

• mask defined by separate "functional localisers"

• mask defined by orthogonal contrasts

• (spherical) search volume around previously reported coordinates

Pre- processings

General Linear Model

Statistical Inference

Contrast c

Random

Field Theory

Statistical Inference I - UNIL · 2017-08-31 · Statistical Inference I GLM, Contrasts & RFT...

Documents

Statistical inference

Statistical Inference Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London SPM Course Zurich, February 2009

Introduction to statistical inference for infectious … to statistical inference for ... Introduction to statistical inference for infectious ... infection with probability S(t)=n).Published

CO902 Probabilistic and statistical inference · CO902 Probabilistic and statistical inference Lecture 1 ... “Probability and Statistics ... Casella & Berger “Statistical Inference

Grossman Statistical Inference

Statistical Inference

Statistical Inference Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London SPM Course London, May 2010

Contrasts & Statistical Inference

Introduction to Statistical Inference - Bioinformatics Grazgenome.tugraz.at/MedicalInformatics2/Statistical... · 2015-01-26 · Statistical Inference • The target of statistical

Statistical Inference - UZH

FMRI Modelling & Statistical Inference Guillaume Flandin Wellcome Trust Centre for Neuroimaging University College London SPM Course Chicago, 22-23 Oct

STATS 200: Introduction to Statistical Inference 200: Introduction to Statistical Inference ... Statistical inference Statistical inference = Probability 1 ... STATS 200: Introduction

Lecture 4: Statistical Inference

Statistical Inference for Diffusion Processes · Ergodicity, LLN, CLTs Statistical inference for SDEs (high frequency data) Statistical inference for SDEs (discrete data) Numerical

Statistical Inference Statistical Inference is the process of making judgments about a population based on properties of the sample Statistical Inference

Big Data, Statistical Inference and Official Statistics • BIG DATA, STATISTICAL INFERENCE AND OFFICIAL STATISTICS † 1351.0.55.054 1 BIG DATA, STATISTICAL INFERENCE AND OFFICIAL

3 . Statistical Inference

APTS: Statistical Inference

Statistical Inference Web

Principles of Statistical Inference