Interpreting predictive models in terms of · Pattern Recognition for NeuroImaging (PR4NI)...

Pattern Recognition for NeuroImaging (PR4NI)

Interpreting predictive

models in terms of

anatomically labelled regions

J. Schrouff

Stanford University, CA, USA

J. Schrouff

Stanford University, CA, USA

Feature

extraction/selection

specification

Accuracy and

significance

interpretation

Pattern Recognition for NeuroImaging (PR4NI) 2

Where in the brain is the

information of interest?

i.e. Which brain areas

contribute most to the

model’s prediction?

Feature

specification

Accuracy and

significance

interpretation

J. Schrouff

Stanford University, CA, USA Pattern Recognition for NeuroImaging (PR4NI) 3

Computing “weight maps”:

- From linear (kernel) methods

- Attributing one weight per feature

- For an SVM model:

𝐰 = 𝑦𝑖𝛼𝑖

𝑖=1

𝐱𝐢

J. Schrouff

Thresholding weight maps:

Can we choose to look only at the largest contributions?

- Linear combination 𝑓 x𝒊 = w, 𝐱𝐢 + 𝑏 - As in an election

- Each vote counts

1 10 1 1 8 1 1 1 1

J. Schrouff

Different strategies:

1. A priori knowledge: ROI selection

2. Searchlight

3. Feature selection

4. Sparse algorithms

5. Multiple Kernel Learning (MKL)

6. A posteriori knowledge: weight summarization

J. Schrouff

Different strategies:

1. A priori knowledge: ROI

selection

2. Searchlight

5. Multiple Kernel Learning

6. A posteriori knowledge:

weight summarization

Feature

specification

Accuracy and

significance

interpretation

J. Schrouff

e.g. Choosing voxels/features based on literature or on

previous univariate results (independent dataset).

J. Schrouff

+ - Good accuracy? Need an anatomical

hypothesis

Anatomically/ functionally

relevant

Cannot threshold obtained

Arbitrary choice of region

number and size

Multiple models = multiple

comparisons

J. Schrouff

2. Searchlight (Kriegeskorte et al., 2006)

J. Schrouff

2. Searchlight (Kriegeskorte et al., 2006)

Threshold for each

neighborhood

Not anatomically/functionally

relevant (spheres)

Locally multivariate

Arbitrary choice of

neighborhood size

Multiple models = multiple

comparisons

Time-consuming

J. Schrouff

Rank the features according to a specific criterion and

select a subset.

Example:

- Univariate t-test

- Recursive Feature Elimination (RFE, De Martino, 2008)

based on SVM weights

J. Schrouff

+ - Data-driven Not anatomically/functionally

relevant (voxel sparse)

Time-consuming: nested CV

Depends on selected

technique: flaws in feature

selection are reflected in

model. user choice must

be informed.

Different strategies = different

patterns with same accuracy.

J. Schrouff

Algorithms with regularization constraints such that the

weight of some features is perfectly null.

Examples:

- LASSO (Tibshirani, 1996)

- Elastic-net (Zou and Hastie, 2005)

J. Schrouff

Data-driven Not anatomically/functionally

relevant (voxel sparse)

Embedded in model Stability of selected voxels

(Baldassarre, et al., 2012):

different regularization terms

= different patterns for same

accuracy.

J. Schrouff

Note: stability selection (Rondina et al., 2014).

+ - Data-driven Not anatomically/functionally

relevant

Embedded in model Stability of selected voxels

(Baldassarre, et al., 2012):

different regularization terms

= different patterns for same

accuracy.

J. Schrouff

Build one kernel per anatomically defined region as input

of a (sparse) MKL algorithm (Filippone, 2012).

𝐾 𝒙, 𝒙′ = 𝑑𝑚𝐾𝑚𝑀𝑚=1 (𝒙, 𝒙′)

with 𝑑𝑚 ≥ 0, 𝑑𝑚𝑀𝑚=1 = 1

J. Schrouff

Decision function of an MKL problem:

𝑓 x𝒊 = w𝒎, 𝐱𝐢𝑚 + 𝑏

Considered algorithm (Rakotomamonjy et al., 2008):

𝑑𝑚𝑚

w𝑚2 + 𝐶 𝜉𝑖

𝑠. 𝑡. 𝑦𝑖 𝐰𝒎, 𝐱𝐢𝑚

+ 𝑏 ≥ 1 − 𝜉𝑖 ∀𝑖

𝜉𝑖 ≥ 0 ∀𝑖, 𝑑𝑚𝑚

= 1, 𝑑𝑚 ≥ 0 ∀𝑚

18 J. Schrouff

Weight map of an MKL model:

𝐰𝒎 = 𝑑𝑚 𝑦𝑖𝛼𝑖

𝑖=1

𝐱𝐢

But also 𝑑𝑚

Can be considered as a hierarchical approach

19 J. Schrouff

J. Schrouff

+ - Anatomically/ functionally

relevant

Stability of the selected

regions

Used the full pattern to

build the model

Depends on atlas

Thresholded and sorted

list of regions according to

Hierarchical view of the

model parameters

J. Schrouff

(Schrouff et al., 2013)

Build weight maps then summarize weights according to

anatomically defined regions (e.g. based on an atlas).

Mathematically:

𝑁𝑊𝑅𝑂𝐼 = 𝑊𝑣𝑣 ∈𝑅𝑂𝐼

#𝑣 ∈ 𝑅𝑂𝐼

J. Schrouff

AAL atlas (Tzourio-Mazoyer,

2002).

J. Schrouff

+ - Anatomically/ functionally

relevant

No thresholding of the sorted

list of regions

Used the full pattern to

build the model

Depends on atlas

Sorted list of regions

according to their

proportion of NWROI

J. Schrouff

How about electrophysiological recordings?

- All strategies explained previously apply

- For MKL, can be applied on each dimension (i.e.

channels, time point, frequency band) or combination

(e.g. channel + frequency band)

J. Schrouff

Conclusions:

- Interpreting machine learning based models can be

complex.

- Different strategies exist, but always a trade-off between

different parameters:

• Thresholded list of voxels/regions contributing

• Anatomical/functional relevance

• Multivariate power

• Stability of the pattern

• Computational time

Your (informed) choice!

J. Schrouff

Acknowledgments:

- F.R.S. – F.N.R.S. Belgian National Research Funds

- Belgian American Educational Foundation (BAEF)

- Medical Foundation of the Liège Rotary Club

- Stanford University

- University College London

- Wellcome Trust

Figures created using PRoNTo

(www.mlnl.cs.ucl.ac.uk/pronto/, development version to be

released soon).

28 J. Schrouff

References: - Baldassarre, L., J. Mourao-Miranda, et M. Pontil. «Structured Sparsity Models for

Brain Decoding from fMRI data.» Proceedings of the 2nd conference on Pattern

Recognition in NeuroImaging. 2012.

- Filippone, M., A. Marquand, C. Blain, C. Williams, J. Mourao-Miranda, et M.

Girolami. «Probabilistic prediction of neurological disorders with a statistical

assessement of neuroimaging data modalities.» Annals of Applied Statistics 6

(2012): 1883-1905.

- Kriegeskorte, Nikolaus, Rainer Goebel, et Peter Bandettini. «Information-based

functional brain mapping.» PNAS 103 (2006): 3863-3868.

- Rakotomamonjy, Alain, Francis R. Bach, Stéphane Canu, et Yves Grandvalet.

«SimpleMKL.» Journal of Machine Learning 9 (2008): 2491-2521.

- Schrouff, Jessica, Julien Cremers, Gaëtan Garraux, Luca Baldassarre, Janaina

Mourão-Miranda, et Christophe Phillips. «Localizing and comparing weight maps

generated from linear kernel machine learning models.» Proceedings of the 3rd

workshop on Pattern Recognition in NeuroImaging. 2013.

- Tibshirani, R. «Regression shrinkage and selection via the lasso.» Journal of the

Royal Statistical Society. 58 (1996): 267-288.

- Tzourio-Mazoyer, N., et al. «Automated Anatomical Labeling of activations in

SPM using a Macroscopic Anatomical Parcellation of the MNI MRI single-subject

brain.» NeuroImage 15 (2002): 273-289.

- Zou, H., et T. Hastie. «Regularization and variable selection via the elastic net.»

J. R. Statist. Soc. B 67 (2005): 301-320.

Interpreting predictive models in terms of · Pattern Recognition for NeuroImaging (PR4NI)...

Documents

INTERPRETING SPECTRAL ANALYSES IN TERMS …L Annals of Economic and Soc in! Measurement, 5/1, 1976 INTERPRETING SPECTRAL ANALYSES IN TERMS OF TIME-DOMAIN MODELS BY ROBERT F. ENGL&

Can Computer-assisted Interpreting Tools Assist Interpreting?

Physical and biogeochemical modulation of ocean acidification in … · ation when designing and interpreting ocean pH monitoring ef-forts and predictive models. carbon cycle climate

Plotting Linear Main Effects Models Interpreting 1 st order terms w/ & w/o interactions Coding & centering… gotta? oughta? Plotting single-predictor models

Interpreting the Scriptures - Church Leadership · PDF fileLesson 19-20 – Interpreting ... book Interpreting the Scriptures by Kevin Conner and Ken Malmin. ... understanding to interpreting

Interpreting Quantum Mechanics in Terms of Random ...philsci-archive.pitt.edu/9589/1/Shan_Gao_-_PhD_thesis.pdfInterpreting Quantum Mechanics in Terms of Random Discontinuous Motion

Interpreting Quantum Mechanics in Terms of Random ... · Interpreting Quantum Mechanics in Terms of Random Discontinuous Motion of Particles by Shan Gao ... The idea of random discontinuous

Interpreting predictive models in terms of Materials/PR4NI_Schrouff_Jessica.pdf- Interpreting machine learning based models can be complex. - Different strategies exist, but always

Predictive Peacekeeping: Strengthening Predictive Analysis

Interpreting universals and interpreting style dokto… · pt. Interpreting universals and interpreting style napisałam samodzielnie. Oznacza to, że przy pisaniu pracy, poza niezb

Terms _____________________: “The process of collecting, synthesizing, and interpreting information to aid in decision-making” (Airasian, 1997, p. 3)

Chapter 1 Starting a Proprietorship. Terms that you need to know Accounting Planning recording analyzing and interpreting financial information

Interpreting Feedback from Baseline Tests – Predictive Data Course: CEM Information Systems for Beginners and New Users Day 1 Session 3 Wednesday 17 th

ON NORMS AND ETHICS IN THE DISCOURSE ON INTERPRETING · conference interpreting, ... Historical instances of translational behaviour can then be explained in terms ... On norms and

CURRICULUM FOR COLLEGE-PREP PRECALCULUS · Interpreting Functions Interpret functions that arise in applications in terms of the context F-IF.4, 5, 6 Interpreting Functions Analyze

Interpreting Seismic Profiles in terms of Structure and ... · ao Majid K, Shahid N, Munawar S, Muhammad H (2016) Interpreting Seismic Profiles in terms of Structure and Stratigraphy

Short Story Terms A Guide to Interpreting Short Stories

Predictive Maintenance & Service and Predictive Quality ... · Predictive Maintenance & Service and Predictive Quality– ... Predictive Maintenance is an Important Building Block

Interpreting terms used in river boundary definition - …fsgk.se/2013/...terms-in-river-boundary-definition-(slides).pdf · Interpreting terms used in river boundary definition

Clarity on Data & Analytics · this will enable more real-time predictive analysis in life sciences. Across many industries, interpreting data patterns will be a key differentiator