Multi-subject models of the resting brain

Gael Varoquaux, France

Rest, a window on intrinsic structures

Anti-correlated functional networks (segregation)

Small-world, highly-connected graphs (integration)

Small-sample biases?
Few spatial modes
Spurious correlations


Challenges to modeling the resting brain

Model selection
Small-sample estimation

Mitigating data scarcity:
Generative multi-subject models
Machine-learning/high-dimensional statistics


Outline

1 Spatial modes

2 Functional interactions graphs


1 Spatial modes


1 Decomposing in spatial modes: a model

$Y = E \cdot S + N$

[Figure: Y (time × voxels) written as time courses E times 25 spatial maps S (maps × voxels), plus residuals N (time × voxels)]

Decomposing time series into:
covarying spatial maps, S
uncorrelated residuals, N

ICA: minimize mutual information across S
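A minimal sketch of this decomposition on synthetic data, not on fMRI. It assumes scikit-learn's `FastICA` as the ICA estimator (it maximizes non-Gaussianity, a common stand-in for minimizing mutual information); the sizes and noise level are illustrative assumptions. Voxels play the role of samples, so the estimator is fit on the transposed data.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_time, n_voxels, n_maps = 120, 500, 3   # illustrative sizes (assumption)

# Synthetic data following Y = E . S + N, with non-Gaussian maps S
S_true = rng.laplace(size=(n_maps, n_voxels))
E_true = rng.normal(size=(n_time, n_maps))
Y = E_true @ S_true + 0.1 * rng.normal(size=(n_time, n_voxels))

# Spatial ICA: each voxel is one sample, so fit on Y.T (voxels x time)
ica = FastICA(n_components=n_maps, random_state=0, max_iter=1000)
S_hat = ica.fit_transform(Y.T).T   # (n_maps, n_voxels): estimated spatial maps
E_hat = ica.mixing_                # (n_time, n_maps): associated time courses

# Residuals N, accounting for the mean over voxels that FastICA removes
N_hat = Y - (E_hat @ S_hat + ica.mean_[:, None])
print("relative residual norm:", np.linalg.norm(N_hat) / np.linalg.norm(Y))
```

The maps are recovered only up to sign, scale and ordering, which is inherent to ICA.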


1 ICA on multiple subjects: group ICA

[Calhoun HBM 2001]

Estimate common spatial maps S:

$Y_1 = E_1 \cdot S + N_1$
$\vdots$
$Y_s = E_s \cdot S + N_s$

(each $Y_s$: time × voxels, same maps $S$ for all subjects)

Concatenate images, minimize norm of residuals
Corresponds to fixed-effects modeling: i.i.d. residuals $N_s$


1 ICA: Noise model

Observation noise: minimize group residuals (PCA):

$Y_{\text{concat}} = W \cdot B + O$ (with $Y_{\text{concat}}$: time × voxels, $B$: sources × voxels)

Learn interesting maps (ICA):

$B = M \cdot S$ (with $B$ and $S$: sources × voxels, $M$: sources × sources)
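The two steps of this noise model can be sketched as follows, assuming a truncated SVD for the PCA step and scikit-learn's `FastICA` for the ICA step. The synthetic subjects, sizes and noise level are assumptions made for illustration only.

```python
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
n_subjects, n_time, n_voxels, n_sources = 4, 60, 400, 3   # assumptions

# Synthetic subjects sharing the same spatial maps
S_true = rng.laplace(size=(n_sources, n_voxels))
Ys = [rng.normal(size=(n_time, n_sources)) @ S_true
      + 0.5 * rng.normal(size=(n_time, n_voxels))
      for _ in range(n_subjects)]

# Step 1 (PCA): concatenate along time, Y_concat = W . B + O
Y_concat = np.vstack(Ys)
Y_concat = Y_concat - Y_concat.mean(axis=0)
U, sv, Vt = np.linalg.svd(Y_concat, full_matrices=False)
W = U[:, :n_sources] * sv[:n_sources]   # (time, sources)
B = Vt[:n_sources]                      # (sources, voxels)
O = Y_concat - W @ B                    # group residuals

# Step 2 (ICA): B = M . S, with voxels as samples
ica = FastICA(n_components=n_sources, random_state=0, max_iter=1000)
S_hat = ica.fit_transform(B.T).T        # (sources, voxels)
M = ica.mixing_                         # (sources, sources)
print("residual fraction:", np.linalg.norm(O) / np.linalg.norm(Y_concat))
```

Truncating the SVD is what minimizes the norm of the group residuals for a given number of sources; the ICA step only rotates the retained subspace.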


1 CanICA: random effects model

[Varoquaux NeuroImage 2010]

Subject level
Observation noise: minimize subject residuals (PCA):

$Y_s = W_s \cdot P_s + O_s$ (with $Y_s$: time × voxels)

Group level
Select signal similar across subjects (CCA):

$\begin{bmatrix} P_1 \\ \vdots \\ P_s \end{bmatrix} = \Lambda \cdot B + R$ (with $B$: sources × voxels)

Learn interesting maps (ICA):

$B = M \cdot S$ (with $S$: sources × voxels)
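A rough numpy sketch of the subject-level PCA and the group-level CCA steps, under the assumption that the CCA is carried out as an SVD of the stacked, orthonormalized subject patterns. The helper names `subject_pca` and `group_patterns` are hypothetical, and the data are synthetic. The final ICA step on B is unchanged and omitted here.

```python
import numpy as np

def subject_pca(Y, n_components):
    """Subject level: Y_s ~ W_s . P_s + O_s, rows of P_s orthonormal."""
    Y = Y - Y.mean(axis=0)
    _, _, Vt = np.linalg.svd(Y, full_matrices=False)
    return Vt[:n_components]

def group_patterns(Ps, n_sources):
    """Group level: SVD of the stacked P_s; returns B and all singular values."""
    _, sv, Vt = np.linalg.svd(np.vstack(Ps), full_matrices=False)
    return Vt[:n_sources], sv

rng = np.random.default_rng(0)
n_subjects, n_time, n_voxels, n_sources = 4, 60, 400, 3   # assumptions
S_true = rng.laplace(size=(n_sources, n_voxels))
Ys = [rng.normal(size=(n_time, n_sources)) @ S_true
      + 0.5 * rng.normal(size=(n_time, n_voxels))
      for _ in range(n_subjects)]

# Keep more subject components than sources: the extra ones are mostly noise
Ps = [subject_pca(Y, n_components=5) for Y in Ys]
B, sv = group_patterns(Ps, n_sources)
print(np.round(sv[:5], 2))
```

With orthonormal subject patterns, a direction present in every subject gets a singular value close to the square root of the number of subjects, while subject-specific directions stay close to 1. Thresholding this spectrum is one way to select signal that is similar across subjects.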


1 ICA: model selection

[Varoquaux NeuroImage 2010]

Metric: reproducibility across control groups

no CCA: .36 (.02)
CanICA: .72 (.05)
MELODIC: .51 (.04)

Quantifies usefulness
But not goodness of fit
Cannot select number of maps


1 CanICA: qualitative observations

Structured components

ICA extracts a brain parcellation
Does not select for what we interpret
No overall control of residuals
Lack of model-selection metric


1 ICA as dictionary learning

$Y = E \cdot S + N$ (with $Y$: time × voxels)

Degenerate model: need prior
ICA is an improper prior ⇒ Noise N must be estimated separately

Impose sparsity, rather than independence


1 Sparse structured dictionary learning

[Jenatton, in preparation]

[Figure: spatial maps and time series]

Model of observed data: $Y = U V^T + E$, $E \sim \mathcal{N}(0, \sigma I)$

Sparsity prior: $V \sim \exp(-\xi\, \Omega(V))$, $\Omega(v) = \|v\|_1$

Structured sparsity
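As an illustration of the plain $\ell_1$ prior only (the structured penalty is not covered), here is a sketch assuming scikit-learn's `SparsePCA`, which fits a factorization with an $\ell_1$ penalty on the components. The synthetic maps, the sizes and the value of `alpha` are assumptions.

```python
import numpy as np
from sklearn.decomposition import SparsePCA

rng = np.random.default_rng(0)
n_time, n_voxels, n_maps = 80, 200, 3   # illustrative sizes (assumption)

# Ground-truth maps V with disjoint, localized supports
V_true = np.zeros((n_voxels, n_maps))
for k in range(n_maps):
    V_true[40 * k: 40 * k + 30, k] = 1.0
U_true = rng.normal(size=(n_time, n_maps))
Y = U_true @ V_true.T + 0.1 * rng.normal(size=(n_time, n_voxels))

# l1-penalized factorization Y ~ U V^T; alpha plays the role of the prior weight
model = SparsePCA(n_components=n_maps, alpha=1.0, random_state=0)
U_hat = model.fit_transform(Y)    # (n_time, n_maps): time series
V_hat = model.components_.T       # (n_voxels, n_maps): sparse spatial maps
print("fraction of exact zeros in V:", np.mean(V_hat == 0))
```

Unlike ICA, the penalty sets most voxels of each map exactly to zero, so the maps read directly as regions.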


1 Sparse structured dictionary learning

[Varoquaux, NIPS workshop 2010]

[Figure: cross-validated likelihood versus number of maps (50 to 200), for SSPCA, SPCA and ICA]

Can learn many regions

1 Sparse structured dictionary learning

[Figure panels: ICA, sparse structured]

Brain parcellations


1 Multi-subject dictionary learning

[Varoquaux IPMI 2011]

[Figure: time series, 25 × subject maps, group maps]

Subject level spatial patterns: $Y_s = U_s V_s^T + E_s$, $E_s \sim \mathcal{N}(0, \sigma I)$

Group level spatial patterns: $V_s = V + F_s$, $F_s \sim \mathcal{N}(0, \zeta I)$

Sparsity and spatial-smoothness prior: $V \sim \exp(-\xi\, \Omega(V))$, $\Omega(v) = \|v\|_1 + \frac{1}{2} v^T L v$




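Estimation: maximum a posteriori

$\operatorname{argmin}_{U_s, V_s, V} \sum_{\text{subjects}} \left( \|Y_s - U_s V_s^T\|_{\text{Fro}}^2 + \mu \|V_s - V\|_{\text{Fro}}^2 \right) + \lambda\, \Omega(V)$

First term: data fit
Second term: subject variability
Third term: penalization (sparse and smooth maps)

Parameter selection
µ: comparing variance (PCA spectrum) at subject and group level
λ: cross-validation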
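To make the three terms of the criterion concrete, the sketch below only evaluates it for given factors; it is not the estimation algorithm. It assumes that L is a graph Laplacian over voxels (the slides do not define it) and uses a tiny hand-built example. The function names are hypothetical.

```python
import numpy as np

def omega(V, L):
    """Sum over maps (columns v of V) of ||v||_1 + 0.5 * v^T L v."""
    return np.abs(V).sum() + 0.5 * np.trace(V.T @ L @ V)

def msdl_objective(Ys, Us, Vs, V, L, mu, lam):
    """Value of the MAP criterion for given subject maps Vs and group maps V."""
    value = lam * omega(V, L)                                 # penalization
    for Y, U, V_s in zip(Ys, Us, Vs):
        value += np.linalg.norm(Y - U @ V_s.T, "fro") ** 2    # data fit
        value += mu * np.linalg.norm(V_s - V, "fro") ** 2     # subject variability
    return value

# Toy example: 3 voxels on a chain (assumed Laplacian), 1 map, 1 subject
L = np.array([[1., -1., 0.], [-1., 2., -1.], [0., -1., 1.]])
V = np.array([[1.], [0.], [-1.]])
U = np.array([[1.], [2.]])
Y = U @ V.T

exact = msdl_objective([Y], [U], [V], V, L, mu=0.5, lam=2.0)
shifted = msdl_objective([Y], [U], [V + 1.0], V, L, mu=0.5, lam=2.0)
print(exact, shifted)
```

When the subject map equals the group map and fits the data exactly, only the penalty remains; moving the subject map away from the group map raises both the data-fit and the variability terms.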




Individual maps + Atlas of functional regions


1 Multi-subject dictionary learning

[Figure panels: ICA, MSDL]

Brain parcellations


Spatial modes: from fluctuations to a parcellation

$Y = E \cdot S + N$ (with $Y$: time × voxels)

Associated time series:
[Figure: the same decomposition, $Y = E \cdot S + N$]


2 Functional interactions graphs

Graphical models of brain connectivity


2 Inferring a brain wiring diagram

Small-world connectivity: sparse graph with efficient transport (integration)

Isolate functional structures: segregation/specialization


2 Independence graphs from correlation matrices

[Varoquaux NIPS 2010, Smith 2011]

For a given correlation matrix:
Multivariate normal $P(X) \propto \sqrt{|\Sigma^{-1}|}\; e^{-\frac{1}{2} X^T \Sigma^{-1} X}$

Parametrized by inverse covariance matrix $K = \Sigma^{-1}$

Covariance matrix: direct and indirect effects
[Figure: graph on nodes 0 to 4]

Inverse covariance: partial correlations ⇒ independence graph
[Figure: graph on nodes 0 to 4]
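The difference between the two matrices can be checked on a small worked example. The sketch below assumes a three-node chain 0 - 1 - 2 (a hypothetical graph, not the one in the figure): nodes 0 and 2 are correlated through node 1, yet their partial correlation is zero.

```python
import numpy as np

def partial_correlations(cov):
    """Partial correlations from a covariance matrix, via K = inv(cov)."""
    K = np.linalg.inv(cov)
    d = np.sqrt(np.diag(K))
    P = -K / np.outer(d, d)
    np.fill_diagonal(P, 1.0)
    return P

# Chain 0 - 1 - 2: the precision matrix has no (0, 2) entry
K_true = np.array([[2., -1., 0.],
                   [-1., 2., -1.],
                   [0., -1., 2.]])
cov = np.linalg.inv(K_true)
P = partial_correlations(cov)

print("covariance(0, 2):", cov[0, 2])        # non-zero: indirect effect
print("partial correlation(0, 2):", P[0, 2]) # zero up to rounding: no edge
```

Zeros of K therefore give the edges that are absent from the independence graph, which the covariance alone does not reveal.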


2 Sparse inverse covariance estimation

Inverse empirical covariance

Background noise confounds small-world properties?

Small-sample estimation problem


2 Sparse inverse covariance estimation: penalized

[Varoquaux NIPS 2010] [Smith 2011]

Maximum a posteriori: fit models with a prior

$K = \operatorname{argmax}_{K \succ 0}\; \mathcal{L}(\Sigma | K) + f(K)$

Sparse prior ⇒ Lasso-like problem: $\ell_1$ penalization

Optimal graph almost dense

[Figure: test-data likelihood and sparsity versus $-\log_{10} \lambda$ (2.5 to 4.0)]
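A sketch of $\ell_1$-penalized inverse covariance estimation with the penalty chosen by cross-validated likelihood, assuming scikit-learn's `GraphicalLassoCV` as the solver. The chain-structured ground truth and the sample sizes are assumptions for illustration.

```python
import numpy as np
from sklearn.covariance import GraphicalLassoCV

rng = np.random.RandomState(0)
p = 8
# Assumed ground truth: chain graph, tridiagonal precision matrix
K_true = np.eye(p) + 0.4 * (np.eye(p, k=1) + np.eye(p, k=-1))
X = rng.multivariate_normal(np.zeros(p), np.linalg.inv(K_true), size=300)
X_train, X_test = X[:200], X[200:]

# The penalty strength is selected by cross-validated likelihood
model = GraphicalLassoCV().fit(X_train)
K_hat = model.precision_
n_edges = int((np.abs(K_hat[np.triu_indices(p, k=1)]) > 1e-8).sum())

print("selected penalty:", model.alpha_)
print("edges kept:", n_edges, "out of", p * (p - 1) // 2)
print("mean test log-likelihood:", model.score(X_test))
```

Comparing `n_edges` to the 7 edges of the true chain gives a feel for how dense the likelihood-optimal graph is on a given dataset.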


2 Sparse inverse covariance estimation: greedy

[Varoquaux J. Physio Paris, accepted]

Greedy algorithm: PC-DAG
1. PC-alg: prune graph by independence tests conditioning on neighbors
2. Learn covariance on resulting structure

High-degree nodes prevent proper estimation

Lattice-like structure with hubs

[Figure: test-data likelihood versus filling factor (percent), 0 to 20]


2 Decomposable covariance estimation


Decomposable models: cliques of nodes, independent conditionally on intersections

Greedy algorithm for estimation

[Figure: cliques C1, C2, C3 with intersections S1, S2]

[Figure: test-data likelihood versus max clique (percent), 20 to 90]

$\ell_1$-penalized not very sparse
PC-DAG limited by high-degree nodes
Models not decomposable in small systems

Modular, small world graphs


2 Multi-subject sparse inverse covariance estimation

[Varoquaux NIPS 2010]

Accumulate samples for better structure estimation

Maximum a posteriori:

$K = \operatorname{argmax}_{K \succ 0}\; \mathcal{L}(\Sigma | K) + f(K)$

New prior: population prior: same independence structure across subjects
⇒ Estimate together all $K^s$ from $\Sigma^s$

Group-lasso (mixed norms): $\ell_{21}$ penalization

$f\left(\{K^s\}\right) = \lambda \sum_{i \neq j} \sqrt{\sum_s \left(K^s_{i,j}\right)^2}$
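The mixed-norm penalty can be written in a few lines of numpy. The sketch also includes the standard group soft-thresholding operator associated with this penalty, to show why it yields the same sparsity pattern in every subject; this is an illustration, not a claim about the solver used in the cited work. Function names and the toy matrices are hypothetical.

```python
import numpy as np

def l21_penalty(Ks, lam):
    """lam * sum over i != j of the l2 norm, across subjects, of K^s[i, j]."""
    Ks = np.asarray(Ks, dtype=float)                 # (subjects, p, p)
    group_norms = np.sqrt((Ks ** 2).sum(axis=0))     # (p, p)
    off_diag = ~np.eye(Ks.shape[1], dtype=bool)
    return lam * group_norms[off_diag].sum()

def group_soft_threshold(Ks, t):
    """Shrink each off-diagonal group (i, j) jointly across subjects."""
    Ks = np.asarray(Ks, dtype=float)
    group_norms = np.sqrt((Ks ** 2).sum(axis=0))
    scale = np.maximum(0.0, 1.0 - t / np.maximum(group_norms, 1e-12))
    np.fill_diagonal(scale, 1.0)                     # diagonal is not penalized
    return Ks * scale

# Two subjects, three nodes: a strong shared edge (0, 1), a weak edge (0, 2)
K1 = np.array([[1., 3., .3], [3., 1., 0.], [.3, 0., 1.]])
K2 = np.array([[1., 4., .4], [4., 1., 0.], [.4, 0., 1.]])
Ks = np.stack([K1, K2])

penalty = l21_penalty(Ks, lam=1.0)
shrunk = group_soft_threshold(Ks, t=1.0)
print(penalty)
print(shrunk[:, 0, 1], shrunk[:, 0, 2])
```

The weak edge is removed in both subjects at once, while the strong edge is kept in both with subject-specific values: the structure is shared, the coefficients are not.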


2 Population-sparse graphs perform better

[Figure panels: $\Sigma^{-1}$, sparse inverse, population prior]

Likelihood of new data (nested cross-validation)

Subject data, $\Sigma^{-1}$: -57.1
Subject data, sparse inverse: 43.0
Group average data, $\Sigma^{-1}$: 40.6
Group average data, sparse inverse: 41.8
Population prior: 45.6
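The quantity compared in this table is a Gaussian likelihood on held-out data. The sketch below shows how such a score can be computed and why the plain inverse of a small-sample covariance scores poorly. It swaps in a simple shrinkage toward the identity as the regularized estimator (an assumption, not the sparse or population-prior estimators of the slides), and its normalization need not match the one behind the reported numbers.

```python
import numpy as np

def gaussian_loglik(test_cov, K):
    """Mean log-likelihood per sample of zero-mean Gaussian data with
    empirical covariance test_cov, under a model with precision K."""
    p = K.shape[0]
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (logdet - np.trace(test_cov @ K) - p * np.log(2 * np.pi))

rng = np.random.default_rng(0)
p, n_train, n_test = 10, 12, 1000      # few training samples (assumption)
X_train = rng.normal(size=(n_train, p))
X_test = rng.normal(size=(n_test, p))
train_cov = X_train.T @ X_train / n_train
test_cov = X_test.T @ X_test / n_test

K_raw = np.linalg.inv(train_cov)                              # plain inverse
K_shrunk = np.linalg.inv(0.5 * train_cov + 0.5 * np.eye(p))   # regularized

ll_raw = gaussian_loglik(test_cov, K_raw)
ll_shrunk = gaussian_loglik(test_cov, K_shrunk)
print("plain inverse:", ll_raw)
print("regularized:  ", ll_shrunk)
```

With barely more samples than variables, the plain inverse amplifies the smallest eigenvalues of the empirical covariance, and the held-out likelihood drops sharply.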


2 Small-world structure of brain graphs


[Figure panels: raw correlations, population prior]


Functional segregation structure:
Graph modularity = divide into communities to maximize intra-class connections versus extra-class
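One common way to score such a division is the Newman-Girvan modularity Q, which compares the weight of within-community edges to what a random graph with the same degrees would give. The sketch assumes this definition (the slides do not state which one is used) and a hypothetical toy graph made of two triangles joined by a single edge.

```python
import numpy as np

def modularity(A, labels):
    """Newman-Girvan modularity of a partition of an undirected graph.

    A: symmetric adjacency (or weight) matrix; labels: community per node.
    """
    A = np.asarray(A, dtype=float)
    labels = np.asarray(labels)
    degrees = A.sum(axis=1)
    two_m = degrees.sum()
    same = labels[:, None] == labels[None, :]
    return ((A - np.outer(degrees, degrees) / two_m) * same).sum() / two_m

# Two triangles (0, 1, 2) and (3, 4, 5) linked by the edge (2, 3)
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

Q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one community per triangle
Q_single = modularity(A, [0, 0, 0, 0, 0, 0])  # everything in one community
print(Q_split, Q_single)
```

Searching over partitions for the highest Q is what "divide into communities" amounts to under this definition; the function above only evaluates a given partition.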


Multi-subject models of the resting brain

From brain networks to brain parcellations
Good models learn many regions
Sparsity, structure and subject-variability ⇒ Population-level atlas

$Y = E \cdot S + N$

Small-world brain networks
High degrees and long cycles hard to estimate
Modular structure reflects functional systems

Small-sample estimation is challenging

Thanks
B. Thirion, J.B. Poline, A. Kleinschmidt

Dictionary learning: F. Bach, R. Jenatton
Sparse inverse covariance: A. Gramfort

Software: in Python
scikit-learn: machine learning
F. Pedregosa, O. Grisel, M. Blondel ...

Mayavi: 3D plotting
P. Ramachandran


Bibliography 1

[Varoquaux NeuroImage 2010] G. Varoquaux, S. Sadaghiani, P. Pinel, A. Kleinschmidt, J.B. Poline, B. Thirion, A group model for stable multi-subject ICA on fMRI datasets, NeuroImage 51 p. 288 (2010) http://hal.inria.fr/hal-00489507/en

[Varoquaux NIPS workshop 2010] G. Varoquaux, A. Gramfort, B. Thirion, R. Jenatton, G. Obozinski, F. Bach, Sparse Structured Dictionary Learning for Brain Resting-State Activity Modeling, NIPS workshop (2010) https://sites.google.com/site/nips10sparsews/schedule/papers/RodolpheJennatton.pdf

[Varoquaux IPMI 2011] G. Varoquaux, A. Gramfort, F. Pedregosa, V. Michel, and B. Thirion, Multi-subject dictionary learning to segment an atlas of brain spontaneous activity, Information Processing in Medical Imaging p. 562 (2011) http://hal.inria.fr/inria-00588898/en

[Varoquaux NIPS 2010] G. Varoquaux, A. Gramfort, J.B. Poline and B. Thirion, Brain covariance selection: better individual functional connectivity models using population prior, NIPS (2010) http://hal.inria.fr/inria-00512451/en



Bibliography 2

[Smith 2011] S. Smith, K. Miller, G. Salimi-Khorshidi et al, Network modelling methods for fMRI, Neuroimage 54 p. 875 (2011)

[Varoquaux J. Physio Paris, accepted] G. Varoquaux, A. Gramfort, J.B. Poline and B. Thirion, Markov models for fMRI correlation structure: is brain functional connectivity small world, or decomposable into networks?, J. Physio Paris, (accepted)

[Ramachandran 2011] P. Ramachandran, G. Varoquaux, Mayavi: 3D visualization of scientific data, Computing in Science & Engineering 13 p. 40 (2011) http://hal.inria.fr/inria-00528985/en

[Pedregosa 2011] F. Pedregosa, G. Varoquaux, A. Gramfort et al, Scikit-learn: machine learning in Python, JMLR 12 p. 2825 (2011) http://jmlr.csail.mit.edu/papers/v12/pedregosa11a.html
