PRNI 2016 recap
Dmitry Petrov
July 15, 2016
Overview
— Data sources (small and somewhat cool)
— Data organisation (small and extremely boring)
— Preprocessing and tools (small and boring)
— Machine learning (big and somewhat cool)
— Meta (small and cool)
Data sources
— People mostly use open data sources (‘you know human connectome, right?’)
— If they have data, people aren’t enthusiastic about sharing it
— Community understands that their samples are small
— There is some movement towards more data sharing and data unification
Data sources (may overlap)
http://www.humanconnectomeproject.org/ — ¯\_(ツ)_/¯
https://openfmri.org/dataset/ — about 50 different fMRI datasets with 1812 subjects overall
http://studyforrest.org/ — multimodal 7T data of 20 subjects watching ‘Forrest Gump’ and listening to music
http://www.nature.com/sdata/ — repository of scientific data of ‘Nature’
http://brain-development.org/ixi-dataset/ — ~600 subjects, MRI and DTI data.
http://fcon_1000.projects.nitrc.org/indi/pro/eNKI_RS_TRT/FrontPage.html — fMRI and DWI data of ~1000 subjects (ages 8-65)
https://dataverse.harvard.edu/ — big Harvard database which also contains neuroimaging data (~20 fMRI related search results, ~10 MRI)
http://neurovault.org/ — strangely organised repository of some neuroimaging data
http://link.springer.com/journal/12021, http://www.sciencedirect.com/science/journal/23523409, http://f1000research.com/ — journals which publish datasets, including neuroimaging ones
Data organisation (boooring)
— If we are serious, we need to establish standard pipelines for organising our data
— It is also useful if we are going to combine data from different sources
— It is also useful if we don’t want to rewrite preprocessing algorithms from scratch every time we meet new data
— There is no canonical way of doing it
— There are some movements in this direction in the community (BIDS, for example)
Data organisation
— https://git-annex.branchable.com/ — a git extension which allows managing files with git without checking the file contents into git.
— http://datalad.org/ — platform for accessing neuroimaging data based on git-annex (https://github.com/datalad/datalad)
— http://bids.neuroimaging.io/ — intuitive standard for neuroimaging data organisation (with blackjack, documentation and validators)
Data organisation
BIDS example
Source: https://github.com/INCF/BIDS-examples (July 12 screenshot)
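Since the screenshot itself doesn’t survive the text export, here is a toy Python sketch of what a minimal BIDS-style tree looks like. The subject IDs and file names below are illustrative examples following the BIDS naming convention; see the spec at http://bids.neuroimaging.io/ for the full rules.

```python
# Build a minimal BIDS-style directory tree (toy example, not the full spec).
import os
import tempfile

root = tempfile.mkdtemp()
for sub in ["sub-01", "sub-02"]:
    for modality, fname in [("anat", f"{sub}_T1w.nii.gz"),
                            ("func", f"{sub}_task-rest_bold.nii.gz")]:
        d = os.path.join(root, sub, modality)
        os.makedirs(d)                         # e.g. <root>/sub-01/anat
        open(os.path.join(d, fname), "w").close()  # empty placeholder file

# Print the resulting layout, relative to the dataset root.
for dirpath, _, files in sorted(os.walk(root)):
    for f in files:
        print(os.path.relpath(os.path.join(dirpath, f), root))
```

The point of the convention is exactly this predictability: any tool can find `sub-XX/anat/sub-XX_T1w.nii.gz` without dataset-specific glue code.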
Preprocessing and tools
— Everybody is doing it, one way or another
— Except us (ಥ_ಥ)
— It is an essential skill to learn, because almost nobody is willing to share processed data
— Thankfully, there are a lot of tutorials and ready-made scripts
Preprocessing and tools
http://nipy.org/nipype/users/pipeline_tutorial.html — NiPype pipelines
http://nilearn.github.io/user_guide.html — nilearn documentation (designed mostly for fMRI data)
https://github.com/FBK-NILab/ — FBK pipeline collection (PRNI 2016 organizers)
https://practical-neuroimaging.github.io/ — Berkeley practical course on neuroimaging (found during this presentation preparation)
Machine learning
— Yeah, we are good
— But it doesn’t matter as long as we can’t preprocess data ¯\_(ツ)_/¯
— The most interesting research is mostly based on fMRI: more data, more classes, somewhat less noise
— Kernels are hot
— Many people don’t care about the interpretability of their features
— Everybody talks about Deep Learning
Machine learning: kernel fever 1
— A lot of people are doing some kind of kernels and even specialize in them. There was a tutorial about kernels and an invited talk
— Let’s walk through the kernel tutorial at PRNI
— It gives an overview of some kernels on graphs with/without node correspondence
— And a comparison of some kernels in terms of p-values
— The source code of the presentation’s experiments is available
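To make the "node correspondence" case concrete, here is a minimal sketch of the simplest such kernel: when all graphs share the same node labels, one can vectorise the adjacency matrices and take a linear kernel between them, then feed the precomputed Gram matrix to an SVM. This is an illustrative toy on synthetic graphs, not any specific kernel from the tutorial.

```python
# Toy graph kernel with node correspondence: linear kernel on vectorised
# adjacency matrices, classified with a precomputed-kernel SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
n_graphs, n_nodes = 40, 10

def random_graph(p):
    """Random symmetric adjacency matrix with edge density p, no self-loops."""
    a = (rng.rand(n_nodes, n_nodes) < p).astype(float)
    a = np.triu(a, 1)
    return a + a.T

# Two classes of graphs that differ only in edge density.
graphs = [random_graph(0.2) for _ in range(n_graphs // 2)] + \
         [random_graph(0.5) for _ in range(n_graphs // 2)]
y = np.array([0] * (n_graphs // 2) + [1] * (n_graphs // 2))

X = np.array([g.ravel() for g in graphs])  # node correspondence lets us vectorise
K = X @ X.T                                # linear kernel: Gram matrix of graphs

clf = SVC(kernel="precomputed").fit(K, y)
print("train accuracy:", clf.score(K, y))
```

Kernels without node correspondence (e.g. subtree or random-walk kernels) replace the naive `X @ X.T` step with a permutation-invariant comparison; the SVM part stays the same.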
Machine learning: kernel fever 2
— A lot of people are building pretty complex kernels and don’t care about interpretability at all
— Example: Spatial pyramid match kernels for brain image classification, Jonathan Young, PRNI 2016
— IXI dataset, 538 subjects, gender and age prediction.
— Pretty complex kernel, gives somewhat good results with completely unclear interpretation
Machine learning: kernel fever 3
— More interesting example (with blackjack and Laplacians): Classifying HCP Task-fMRI Networks Using Heat Kernels, Ai Wern Chung et al., PRNI 2016
— Features based on the exponential of the normalized Laplacian, not the kernel itself
— 491 subjects, fMRI HCP data, motor vs working memory task
— Good physics intuition, big sample, good results (for chosen metrics)
— Again: so what? And why didn’t they use it as, you know, a kernel?
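The central object here, the heat kernel H(t) = exp(−t·L) of the normalised graph Laplacian, is easy to sketch. The toy graph and the choice of per-node features below are illustrative; Chung et al.’s exact feature construction differs in its details.

```python
# Heat kernel of a graph's normalised Laplacian (toy example).
import numpy as np
from scipy.linalg import expm

rng = np.random.RandomState(0)
n = 8
A = (rng.rand(n, n) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # symmetric adjacency, no self-loops

# Normalised Laplacian: L = I - D^{-1/2} A D^{-1/2}
deg = A.sum(axis=1)
d_inv_sqrt = np.zeros(n)
d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
L = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

t = 1.0
H = expm(-t * L)                             # heat kernel at diffusion time t
features = np.diag(H)                        # e.g. per-node "retained heat"
print(features)
```

The physics intuition mentioned in the talk is literal: H[i, j] describes how much heat injected at node i remains at node j after time t, so the diagonal measures how strongly each node holds on to its own signal.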
Machine learning: deep learning
— Somebody is trying to use it: Evaluation of weight sparsity control during autoencoder training of resting-state fMRI using non-zero ratio and Hoyer’s sparseness, Hyun-Chul Kim, PRNI 2016
— There were some strange examples, e.g. something that looked like word2vec on neuroscience paper abstracts
— Everybody talks about Deep Learning, but nobody has any sense (or data) of how to apply it to neuroimaging
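Hoyer’s sparseness, the measure in the Kim paper’s title, has a simple closed form (Hoyer, 2004): sparseness(x) = (√n − ‖x‖₁/‖x‖₂) / (√n − 1). It is 1 for a one-hot vector and 0 when all entries have equal magnitude. A minimal implementation:

```python
# Hoyer's sparseness measure for a weight vector (Hoyer, 2004).
import numpy as np

def hoyer_sparseness(x):
    x = np.asarray(x, dtype=float)
    n = x.size
    l1 = np.abs(x).sum()                  # L1 norm
    l2 = np.sqrt((x ** 2).sum())          # L2 norm
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

print(hoyer_sparseness([1, 0, 0, 0]))   # → 1.0 (maximally sparse)
print(hoyer_sparseness([1, 1, 1, 1]))   # → 0.0 (maximally dense)
```

In the paper’s setting this would be computed on the autoencoder’s weight vectors during training, alongside the simpler non-zero ratio.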
Machine learning: Riemann and Barachant
— Barachant talk was hugely successful. We can expect studies applying his methods to fMRI covariance matrices.
— He doesn’t think that the BCI problem is solvable on EEG data.
— Deep Learning hasn’t killed Riemannian metric methods. The Kaggle competitions where they ruled had settings very suitable for them.
— Even if cleaner and better data appears, he thinks his methods will still be good
— He is interested in subject independent algorithms (who doesn’t)
— He is also applying his methods to non-EEG data
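For flavour, here is a sketch of one Riemannian-style distance on the space of covariance matrices: the log-Euclidean distance d(A, B) = ‖logm(A) − logm(B)‖_F. Barachant’s work typically uses the affine-invariant Riemannian metric; the log-Euclidean version shown here on toy SPD matrices is its cheaper relative, not his exact method.

```python
# Log-Euclidean distance between symmetric positive-definite (SPD)
# covariance matrices: d(A, B) = ||logm(A) - logm(B)||_F.
import numpy as np
from scipy.linalg import logm

def log_euclidean_dist(A, B):
    return np.linalg.norm(logm(A) - logm(B), "fro")

rng = np.random.RandomState(0)

def random_spd(n):
    """Random SPD matrix: M M^T plus a diagonal shift for conditioning."""
    M = rng.randn(n, n)
    return M @ M.T + n * np.eye(n)

A, B = random_spd(4), random_spd(4)
print(log_euclidean_dist(A, A))      # → 0.0 (distance to itself)
print(log_euclidean_dist(A, B) > 0)  # → True
```

The appeal over plain Euclidean distance is that covariance matrices live on a curved manifold, and matrix-logarithm-based metrics respect that geometry, which is exactly why these methods transfer naturally from EEG covariances to fMRI covariance matrices.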
Machine learning: this is WeiRD
— Another example of tricky method on obscure features: WeiRD – a fast and performant multivoxel pattern classifier, PRNI 2016
— Binary distance-to-centroid classifier. “It can be described as a voxel-wise (or more generally, feature-wise) voting scheme”.
— Works slightly better than RF and SVM on the paper’s data.
— Again: so what?
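Taking the quoted description literally, a simplified reading of the idea looks like this: each feature votes for the class whose training centroid is closer on that single feature, and the majority wins. This is a sketch of that voting scheme on synthetic data, not the authors’ exact algorithm.

```python
# Simplified feature-wise voting classifier in the spirit of the quoted
# WeiRD description (binary classes; each feature votes independently).
import numpy as np

class FeatureVoteClassifier:
    def fit(self, X, y):
        self.classes_ = np.unique(y)      # assumes exactly two classes
        self.centroids_ = np.array(
            [X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Per-feature distance to each class centroid.
        d0 = np.abs(X - self.centroids_[0])
        d1 = np.abs(X - self.centroids_[1])
        votes_for_1 = (d1 < d0).sum(axis=1)   # features voting for class 1
        return np.where(votes_for_1 > X.shape[1] / 2,
                        self.classes_[1], self.classes_[0])

rng = np.random.RandomState(0)
X0 = rng.randn(30, 50) - 0.5                 # class 0: shifted down
X1 = rng.randn(30, 50) + 0.5                 # class 1: shifted up
X = np.vstack([X0, X1])
y = np.array([0] * 30 + [1] * 30)

clf = FeatureVoteClassifier().fit(X, y)
print("train accuracy:", (clf.predict(X) == y).mean())
```

The speed claim is plausible from the sketch alone: fitting is just two means, and prediction is two subtractions and a comparison, with no optimisation loop at all.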
Machine learning: ADNI classification
— Yet another example of tricky features: Novel histogram-weighted cortical thickness networks and a multi-scale analysis of predictive power in Alzheimer’s disease, PRNI 2016
— Great sample of 412 subjects, good result (~0.87 AUC) with rbf SVM + t-based feature selection (technically very well done)
— Investigation of different edge definitions using patch-wise cortical thickness correlation/similarity/dissimilarity
— But again: very obscure features which don’t reflect the structure of the brain (though maybe we don’t need that for this task)
Meta: multimodal data comparison by Cichy
— A very interesting talk by Radoslaw Cichy. He wants to understand how humans process visual information and found an ingenious way of comparing data from different modalities.
— Idea: show people (or models) the same set of objects while collecting different data, then calculate similarity matrices of these objects according to this data.
— It’s simple but effective. For example, it allows comparing data from fMRI and EEG/MEG.
— Even more, it provides a way to validate ‘theoretic’ brain models.
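The mechanics of the trick can be sketched in a few lines: build a representational dissimilarity matrix (RDM) over the same stimuli in each modality, then correlate the RDMs. Everything below is synthetic toy data (a shared latent "object identity" code read out by two fake modalities); see Cichy’s papers for the real analyses.

```python
# Representational-similarity sketch: same stimuli, two "modalities",
# compare their dissimilarity structure rather than their raw signals.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.RandomState(0)
n_stimuli, latent_dim = 12, 5
latent = rng.randn(n_stimuli, latent_dim)    # shared object-identity structure

# Two modalities: different linear readouts of the same latent code + noise.
fmri = latent @ rng.randn(latent_dim, 100) + 0.3 * rng.randn(n_stimuli, 100)
meg = latent @ rng.randn(latent_dim, 60) + 0.3 * rng.randn(n_stimuli, 60)

# RDM per modality: pairwise correlation distance between stimuli
# (condensed upper triangle, length n*(n-1)/2).
rdm_fmri = pdist(fmri, metric="correlation")
rdm_meg = pdist(meg, metric="correlation")

rho, _ = spearmanr(rdm_fmri, rdm_meg)
print("RDM correlation:", rho)
```

This is why the method is modality-agnostic: fMRI voxels, MEG sensors, and DNN activations never need to be aligned with each other, only the geometry of the stimulus set they induce.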
Meta: selected Cichy works
— Example 1. Dynamics of scene representations in the human brain revealed by magnetoencephalography and deep neural networks (Nature Scientific Reports, 2016).
— Example 2. Similarity-Based Fusion of MEG and fMRI Reveals Spatio-Temporal Dynamics in Human Cortex During Visual Object Recognition (Cerebral Cortex, 2016)
— Example 3. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence (Neuroimage, 2016)
Meta: sidenotes
— Some Hungarian guys found a funny way to measure node centralities using modularity and a fixed partition
— There was a researcher from Italy with two oral talks whose work looked rather weak. For example, she selected features on the whole dataset
— Some people try to move fancy machine learning things into neuroscience. For example, there was a Hadoop/Spark something-something abstract.
— The community mostly doesn’t care about the meaning (Cichy and several others are exceptions)
Meta: some ideas
— Kernels. Validate complex and simple kernels on various synthetic datasets and write up a summary: such-and-such kernels capture such-and-such effects.
— fMRI. Can we estimate the probabilities of signal transitions between different brain regions from resting-state data? If so, how do they relate to structural probabilities?
— Based on Cichy’s data for a person and a task, can we predict the parameters of a DNN that would perform roughly the same on the same data?
— Pairwise classification. ¯\_(ツ)_/¯
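The resting-state transition idea above can be made concrete with a toy sketch: discretise the signal into a per-timepoint "active region" state and count state changes to get a Markov transition matrix. The discretisation rule and all data here are purely illustrative assumptions, not a proposed method.

```python
# Toy sketch: estimate a region-to-region transition-probability matrix
# from a (fake) resting-state time series by counting state changes.
import numpy as np

rng = np.random.RandomState(0)
n_regions, n_timepoints = 4, 500
ts = rng.randn(n_regions, n_timepoints)      # fake regional signals

# Crude state assignment: the "active" region is the one with the
# largest signal at each timepoint (illustrative choice only).
active = ts.argmax(axis=0)

# Count transitions between consecutive states, then row-normalise.
T = np.zeros((n_regions, n_regions))
for a, b in zip(active[:-1], active[1:]):
    T[a, b] += 1
T = T / T.sum(axis=1, keepdims=True)         # rows are probability distributions

print(np.round(T, 2))
```

The open question from the slide is whether such functional transition probabilities, estimated on real data with a sensible state definition, line up with structural connection probabilities from tractography.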
Conclusions
— We need to gather more data (covered it)
— We need to learn how to preprocess it (partly covered it)
— We should do more fMRI analysis
— There are a lot of groups with our same selling point: ‘we are great ML guys, so we’ll teach you how to do it in neuroscience’
— It is still unclear whether there is enough signal in neuroimaging data
Thank you!✿*∗˵╰༼✪‿✪༽╯˵∗*✿