Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
Models of effective connectivity &Dynamic Causal Modelling (DCM)
Presented by: Ariana Anderson
Slides shared by:
Karl Friston Functional Imaging Laboratory (FIL) Wellcome Trust Centre for Neuroimaging University College London
Klaas Enno Stephan Laboratory for Social & Neural Systems Research Institute for Empirical Research in Economics University of Zurich
Overview
• Brain connectivity: types & definitions – anatomical connectivity – functional connectivity – effective connectivity
• Dynamic causal models (DCMs) – DCM for fMRI: Neural and hemodynamic levels – Parameter estimation & inference
• Applications of DCM to fMRI data – Design of experiments and models – Some empirical examples and simulations
Connectivity
A central property of any system
Communication systems Social networks (internet) (Canberra, Australia)
FIgs. by Stephen Eick and A. Klovdahl;see http://www.nd.edu/~networks/gallery.htm
Structural, functional & effective connectivity
• anatomical/structural connectivity = presence of axonal connections
• functional connectivity = statistical dependencies between regional time series
• effective connectivity = causal (directed) influences between neurons or
neuronal populations
Sporns 2007, Scholarpedia
Anatomical connectivity
• neuronal communication via synaptic contacts
• visualisation by tracing techniques
• long-range axons “association fibres”
Different approaches to analysing functional connectivity
• Seed voxel correlation analysis
• Eigen-decomposition (PCA, SVD)
• Independent component analysis (ICA)
• any other technique describing statistical dependencies amongst regional time series
Does functional connectivity not simply correspond to co-activation in SPMs?
No, it does not - see the fictitious example on the right: Here both areas A1 and A2 are correlated identically to task T, yet they have zero correlation among themselves: r(A1,T) = r(A2,T) = 0.71 but r(A1,A2) = 0 !
task T regional response A2 regional response A1
Stephan 2004, J. Anat.
Pros & Cons of functional connectivity analyses
• Pros: – useful when we have no experimental control over the
system of interest and no model of what caused the data (e.g. sleep, hallucinatons, etc.)
• Cons: – interpretation of resulting patterns is difficult / arbitrary – no mechanistic insight into the neural system of interest – usually suboptimal for situations where we have a priori
knowledge and experimental control about the system of interest
models of effective connectivity necessary
For understanding brain function mechanistically, we need models of effective connectivity, i.e.
models of causal interactions among neuronal
populations.
Some models for computing effective connectivity from fMRI data
• Structural Equation Modelling (SEM) McIntosh et al. 1991, 1994; Büchel & Friston 1997; Bullmore et al. 2000
• regression models (e.g. psycho-physiological interactions, PPIs)Friston et al. 1997
• Volterra kernels Friston & Büchel 2000
• Time series models (e.g. MAR, Granger causality)Harrison et al. 2003, Goebel et al. 2003
• Dynamic Causal Modelling (DCM)bilinear: Friston et al. 2003; nonlinear: Stephan et al. 2008
Overview
• Brain connectivity: types & definitions – anatomical connectivity – functional connectivity – effective connectivity
• Dynamic causal models (DCMs) – DCM for fMRI: Neural and hemodynamic levels – Parameter estimation & inference
• Applications of DCM to fMRI data – Design of experiments and models – Some empirical examples and simulations
Dynamic causal modelling (DCM)
• DCM framework was introduced in 2003 for fMRI by Karl Friston, Lee Harrison and Will Penny (NeuroImage 19:1273-1302)
• part of the SPM software package • currently more than 100 published papers on DCM
GLM DCM
Inference Within voxels Among ROIs
On BOLD signal Neuronal Activation
Answering Where the stimulus produced activation.
How the stimulus activated the system of interconnected ROIs.
Using Frequentist Estimation Bayesian Estimation
DCM vs GLM: Cheat Sheet
),,( θuxFdtdx =
Neural state equation:
Electromagnetic forward model:
neural activity, EEGMEG LFP
Dynamic Causal Modeling (DCM)
simple neuronal model complicated forward model
complicated neuronal model simple forward model
fMRI EEG/MEG
inputs
Hemodynamicforward model:neural activity, BOLD
Stephan & Friston 2007, Handbook of Brain Connectivity
DCM for fMRI: the basic idea • Using a bilinear state equation, a cognitive
system is modelled at its underlying neuronal level (which is not directly accessible for fMRI).
• The modelled neuronal dynamics (x) is transformed into area-specific BOLD signals (y) by a hemodynamic forward model (λ).
λ
z
y
The aim of DCM is to estimate parameters at the neuronal level such that the modelled BOLD signals are maximally similar to the experimentally measured BOLD signals.
intrinsic connectivity
direct inputs
modulation of connectivity
Neural state equation CuxBuAx jj ++= ∑ )( )(
uxC
xx
uB
xxA
j
j
∂∂=
∂∂
∂∂=
∂∂=
)(
hemodynamic model λ
x
y
integration
BOLD y y y
activity x1(t)
activity x2(t) activity
x3(t)
neuronal states
t
driving input u1(t)
modulatory input u2(t)
t
Stephan & Friston (2007), Handbook of Brain Connectivity
Bilinear DCM
CuxBuAdtdx m
i
ii +⎟
⎠⎞⎜
⎝⎛ += ∑
=1
)(
Bilinear state equation:
driving input
modulation
...)0,(),(2
0 +∂∂
∂+∂∂+
∂∂+≈= ux
uxfu
ufx
xfxfuxf
dtdx
Two-dimensional Taylor series (around x0=0, u0=0):
LG left
LG right
RVF LVF
FG right
FG left
LG = lingual gyrus FG = fusiform gyrus Visual input in the - left (LVF) - right (RVF) visual field. x1 x2
x4 x3
u2 u1
x1 = a11x1 + a12x2 + a13x3 + c12u2x2 = a21x1 + a22x2 + a24x4 + c21u1x3 = a31x1 + a33x3 + a34x4x4 = a42x2 + a43x3 + a44x4
Example: a linear system of dynamics in visual cortex
Example: a linear system of dynamics in visual cortex
LG = lingual gyrus FG = fusiform gyrus Visual input in the - left (LVF) - right (RVF) visual field.
state changes
effective connectivity
externalinputs
systemstate
input parameters
x1
x2
x3
x4
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
=
a11 a12 a13 0
a21 a22 0 a24
a31 0 a33 a34
0 a42 a43 a44
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
x1
x2
x3
x4
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
+
0c21
00
c12
000
⎡
⎣
⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥
u1
u2
⎡
⎣⎢⎢
⎤
⎦⎥⎥
x = Ax +Cu
},{ CA=θ
LG left
LG right
RVF LVF
FG right
FG left
x1 x2
x4 x3
u2 u1
Extension: bilinear dynamic system
LG left
LG right
RVF LVF
FG right
FG left
x1 x2
x4 x3
u2 u1
CONTEXT u3
x = (A+ u jB( j )
j=1
m
∑ )x +Cu
x1
x2
x3
x4
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
=
a11 a12 a13 0
a21 a22 0 a24
a31 0 a33 a34
0 a42 a43 a44
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
+ u3
0 b12(3) 0 0
0 0 0 00 0 0 b34
(3)
0 0 0 0
⎡
⎣
⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥
⎧
⎨
⎪⎪⎪
⎩
⎪⎪⎪
⎫
⎬
⎪⎪⎪
⎭
⎪⎪⎪
x1
x2
x3
x4
⎡
⎣
⎢⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥⎥
+
0c21
00
c12
000
0000
⎡
⎣
⎢⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥⎥
u1
u2
u3
⎡
⎣
⎢⎢⎢⎢
⎤
⎦
⎥⎥⎥⎥
-
x2
stimuli u1
context u2
x1
+
+
-
-
- +
u 1
Z 1
u 2
Z 2
x = Ax + u2B(2)x +Cu1
x1x2
⎡
⎣⎢⎢
⎤
⎦⎥⎥= σ a12
a21 σ⎡
⎣⎢⎢
⎤
⎦⎥⎥x + u2 11
2
b 0
022
2
b
⎡
⎣
⎢⎢⎢
⎤
⎦
⎥⎥⎥x + c1 0
0 0
⎡
⎣⎢⎢
⎤
⎦⎥⎥
u1u2
⎡
⎣⎢⎢
⎤
⎦⎥⎥
Example: context-dependent decay u1
u2
x2
x1
Penny, Stephan, Mechelli, Friston NeuroImage (2004)
Any design that is good for a GLM of fMRI data.
What type of design is good for DCM?
GLM vs. DCM
DCM tries to model the same phenomena as a GLM, just in a different way:
It is a model, based on connectivity and its modulation, for explaining experimentally controlled variance in local responses.
No activation detected by a GLM → inclusion of this region in a DCM is useless!
Stephan 2004, J. Anat.
Multifactorial design: explaining interactions with DCM
Task factor Task A Task B
Stim
1
Stim
2
Stim
ulus
fact
or
TA/S1 TB/S1
TA/S2 TB/S2
X1 X2
Stim2/Task A
Stim1/ Task A
Stim 1/ Task B
Stim 2/ Task B
GLM
X1 X2
Stim2
Stim1
Task A Task B
DCM Let’s assume that an SPM analysis shows a main effect of stimulus in X1 and a stimulus * task interaction in X2.
How do we model this using DCM?
Stim 1Task A
Stim 2Task A
Stim 1Task B
Stim 2Task B
Simulated data
X1
X2
+++ X1 X2
Stimulus 2
Stimulus 1
Task A Task B
+ +++ +
+++ +
– –
Stephan et al. 2007, J. Biosci.
Stim 1Task A
Stim 2Task A
Stim 1Task B
Stim 2Task B
plus added noise (SNR=1)
X1
X2
DCM parameters = rate constants
dx axdt
= 0( ) exp( )x t x at=
The coupling parameter a thus describes the speed of the exponential change in x(t)
0
0
( ) 0.5exp( )
x xx a
ττ
==
Integration of a first-order linear differential equation gives an exponential function:
a = ln2 /τ
00.5x
a/2ln=τ
Coupling parameter a is inversely proportional to the half life of z(t): τ
Interpretation of DCM parameters • Dynamic model (differential equations)
a neural parameters correspond to rate constants (inverse of time constants = Hz!)
speed at which effects take place
• Identical temporal scaling in all areas by factorising A and B with σ: all connection strengths are relative to the self-connections.
• Each parameter is characterised by the mean (ηθ|y) and covariance of its a posteriori distribution. Its mean can be compared statistically against a chosen threshold γ.
γ ηθ|y
θ n = {A, B,C,σ}
a = ln2 /τ
A →σA =σ−1 a12
a21 −1
⎡
⎣
⎢⎢⎢
⎤
⎦
⎥⎥⎥
τ
The problem of hemodynamic convolution
Goebel et al. 2003, Magn. Res. Med.
sf
tionflow induc
= (rCBF)
s
v
stimulus functions
vq q/vvEf,EEfqτ /α
dHbchanges in1
00 )( −=/αvfvτ volumechanges in
1−=
f
q
)1(
−−−= fγsxssignalryvasodilato
κ
u
s
CuxBuAdtdx m
j
jj +⎟⎟⎠
⎞⎜⎜⎝
⎛+= ∑
=1
)(
t
neural state equation
( ) ( )
εε
ϑ
λ
−===
⎥⎦⎤
⎢⎣⎡ −+⎟
⎠⎞⎜
⎝⎛ −+−≈Δ=
1
3.4
111),(
3
002
001
32100
kTEErkTEEk
vkvqkqkV
SSvq
hemodynamic state equations f
Balloon model
BOLD signal change equation
},,,,,{ ερατγκθ =h
important for model fitting, but (usually) of no interest for statistical inference
• 6 hemodynamic parameters:
• Computed separately for each area (like the neural parameters) region-specific HRFs!
The hemodynamic model in DCM
Friston et al. 2000, NeuroImage Stephan et al. 2007, NeuroImage
5 10 15 20 25 30 35 40
5
10
15
20
25
30
35
40
-1
-0.8
-0.6
-0.4
-0.2
0
0.2
0.4
0.6
0.8
1
A
B
C
ε
How interdependent are neural and hemodynamic parameter estimates?
Stephan et al. 2007, NeuroImage
θ
sf =(rCBF)induction -flow
s
v
f
stimulus function u
modelled BOLD response
vq
changes in dHb q = f E( f,ρ) ρ − v1/q / v/αvfvτ 1
in volume changes−=
f
q
)1(signalry vasodilatodependent -activity
−−−= fγszs κ
s
)(xy λ= y = h(u,θ )+ Xβ + eobservation model
hidden states { , , , , }z x s f v q=
state equation z = F (x,u,θ )
parameters
},{},...,{},,,,{
1
nh
mn
h
CBBAθθθ
θρατγκθ
===
• Combining the neural and hemodynamic states gives the complete forward model.
• An observation model includes measurement error e and confounds X (e.g. drift).
• Bayesian parameter estimation by means of a Levenberg-Marquardt gradient ascent, embedded into an EM algorithm.
• Result:Gaussian a posteriori parameter distributions, characterised by mean ηθ|y and covariance Cθ|y.
Overview:parameter estimation
ηθ|y
neural state equation x = (A+ u jB
j )x +Cu∑
p-value: probability of getting the observed data in the effect’s absence. If small, reject null hypothesis that there is no effect.
0
0
: 0( | )Hp y H
θ =
Limitations: One can never accept the null hypothesis Given enough data, one can always demonstrate a significant effect Correction for multiple comparisons necessary
Solution: infer posterior probability of the effect
Probability of observing the data y, given no effect.
)|( yp θ
Problems of classical (frequentist) statistics
Probability of the effect, given the observed data
Bayes‘ Theorem
Reverend Thomas Bayes 1702 - 1761
“Bayes‘ Theorem describes, how an ideally rational person processes information." Wikipedia
)()()|()|(
yppypyP θθθ =
Likelihood Prior
Evidence
Posterior
Bayesian statistics
p(θ | y)∝ p(y |θ )p(θ )posterior likelihood ∙ prior
)|( θyp )(θp
Bayes theorem allows one to formally incorporate prior knowledge into computing statistical probabilities.
Priors can be of different sorts:empirical, principled or shrinkage priors.
The “posterior” probability of the parameters given the data is an optimal combination of prior knowledge and new data, weighted by their relative precision.
new data prior knowledge
∝
y Observation of data
Likelihood
prior distribution
Formulation of a generative model
Update of beliefs based upon observations, given a prior state of knowledge
p(θ | y)∝ p(y |θ )p(θ )
Principles of Bayesian inference
p(y |θ )p(θ )
• needed for Bayesian estimation, embody constraints on parameter estimation
• express our prior knowledge or “belief” about parameters of the model
• hemodynamic parameters: empirical priors
• temporal scaling: principled prior
• coupling parameters: shrinkage priors
Priors in DCM
posterior likelihood · prior
Bayes Theorem
p(θ | y)∝ p(y |θ ) ⋅ p(θ )∝
Shrinkage Priors Small & variable effect Large & variable effect
Small but clear effect Large & clear effect
• Gaussian assumptions about the posterior distributions of the parameters
• Use of the cumulative normal distribution to test the probability that a certain parameter (or contrast of parameters cT ηθ|y) is above a chosen threshold γ:
• By default, γ is chosen as zero ("does the effect exist?").
Inference about DCM parameters:Bayesian single-subject analysis
⎟⎟⎟
⎠
⎞
⎜⎜⎜
⎝
⎛ −=
cCc
cp
yT
yT
N
θ
θ γηφ
Bayesian single subject inference
LG left
LG right
RVF stim.
LVF stim.
FG right
FG left
LD|RVF
LD|LVF
LD LD
0.34 0.14
-0.08 0.16
0.13 0.19
0.01 0.17
0.44 0.14
0.29 0.14
Contrast: Modulation LG right -> LG links by LD|LVF vs. modulation LG left -> LG right by LD|RVF
p(cTηθ|y>0|y) = 98.7%
Stephan et al. 2005, Ann. N.Y. Acad. Sci.
Inference about DCM parameters: Bayesian fixed-effects group analysis
Because the likelihood distributions from different subjects are independent, one can combine their posterior densities by using the posterior of one subject as the prior for the next:
)|()...|()|(),...,|(...
)|()|( )()|()|(),|(
)()|( )|(
111
12
1221
11
ypypypyyp
ypyppypypyyp
pypyp
NNN θθθθ
θθθθθθ
θθθ
−∝
∝∝∝
1,...,|
1|
1|,...,|
1
1|
1,...,|
11
1
−
=
−
=
−−
⎟⎠⎞⎜
⎝⎛=
=
∑
∑
NiiN
iN
yy
N
iyyyy
N
iyyy
CC
CC
θθθθ
θθ
ηη
Under Gaussian assumptions this is easy to compute:
group posterior covariance
individual posterior covariances
group posterior mean
individual posterior covariances and means
Bayesian model selection (BMS) Given competing hypotheses on structure & functional mechanisms of a system, which model is the best?
For which model m does p(y|m) become maximal?
Which model represents thebest balance between model fit and model complexity?
Pitt & Miyung (2002), TICS
θθθ dmpmypmyp )|(),|()|( ∫ ⋅=Model evidence:
Various approximations, e.g.: - negative free energy - AIC - BIC
Penny et al. (2004) NeuroImage
Bayesian model selection (BMS)
)|()|(
2
1
mypmypBF =
Model comparison via Bayes factor:
)|()|(),|(),|(
mypmpmypmyp θθθ =Bayes’ rules:
accounts for both accuracy and complexity of the model
allows for inference about structure (generalisability) of the model
Bayes factors
)|()|(
2
112 myp
mypB =
But: the log evidence is just some number – not very intuitive!
A more intuitive interpretation of model comparisons is made possible by Bayes factors:
To compare two models, we can just compare their log evidences.
B12 p(m1|y) Evidence 1 to 3 50-75% weak
3 to 20 75-95% positive 20 to 150 95-99% strong
> 150 > 99% Very strong Kass & Raftery classification:
Kass & Raftery 1995, J. Am. Stat. Assoc.
Example studies of DCM for fMRI • DCM now an
established tool for fMRI & M/EEG analysis
• >100 studies published, incl. high-profile journals
• combinations of DCM with computational models
Thank you