PRNI 2016 recap
Dmitry Petrov
July 15, 2016
Overview
— Data sources (small and somewhat cool)
— Data organisation (small and extremely boring)
— Preprocessing and tools (small and boring)
— Machine learning (big and somewhat cool)
— Meta (small and cool)
Data sources
— People mostly use open data sources (‘you know human connectome, right?’)
— If they have data, people aren’t enthusiastic about sharing it
— Community understands that their samples are small
— There is some movement towards more data sharing and data unification
Data sources (may overlap)
http://www.humanconnectomeproject.org/ — ¯\_(ツ)_/¯
https://openfmri.org/dataset/ — about 50 different fMRI datasets with 1812 subjects overall
http://studyforrest.org/ — multimodal 7T data of 20 subjects watching ‘Forrest Gump’ and listening to music
http://www.nature.com/sdata/ — repository of scientific data of ‘Nature’
http://brain-development.org/ixi-dataset/ — ~600 subjects, MRI and DTI data.
http://fcon_1000.projects.nitrc.org/indi/pro/eNKI_RS_TRT/FrontPage.html — fMRI and DWI data of ~1000 subjects (ages 8-65)
https://dataverse.harvard.edu/ — big Harvard database which also contains neuroimaging data (~20 fMRI related search results, ~10 MRI)
http://neurovault.org/ — strangely organised repository of some neuroimaging data
http://link.springer.com/journal/12021, http://www.sciencedirect.com/science/journal/23523409, http://f1000research.com/ — journals which publish datasets, including neuroimaging ones
Data organisation (boooring)
— If we are serious, we need to establish standard pipelines for organising our data
— It is also useful if we are going to combine data from different sources
— It is also useful if we don’t want to rewrite preprocessing algorithms from scratch every time we meet new data
— There is no canonical way of doing it
— There are some movements in this direction in the community (BIDS, for example)
Data organisation
— https://git-annex.branchable.com/ — a git extension which allows managing files with git without checking the file contents into git.
— http://datalad.org/ — platform for accessing neuroimaging data based on git-annex (https://github.com/datalad/datalad)
— http://bids.neuroimaging.io/ — intuitive standard for neuroimaging data organisation (with blackjack, documentation and validators)
Data organisation
BIDS example
Source: https://github.com/INCF/BIDS-examples (July 12 screenshot)
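Since the screenshot itself doesn’t survive the text export, here is a toy Python sketch of what a minimal BIDS-style tree looks like. The subject IDs and file names below are illustrative examples following the BIDS naming convention; see the spec at http://bids.neuroimaging.io/ for the full rules.

```python
# Build a minimal BIDS-style directory tree (toy example, not the full spec).
import os
import tempfile

root = tempfile.mkdtemp()
for sub in ["sub-01", "sub-02"]:
    for modality, fname in [("anat", f"{sub}_T1w.nii.gz"),
                            ("func", f"{sub}_task-rest_bold.nii.gz")]:
        d = os.path.join(root, sub, modality)
        os.makedirs(d)                         # e.g. <root>/sub-01/anat
        open(os.path.join(d, fname), "w").close()  # empty placeholder file

# Print the resulting layout, relative to the dataset root.
for dirpath, _, files in sorted(os.walk(root)):
    for f in files:
        print(os.path.relpath(os.path.join(dirpath, f), root))
```

The point of the convention is exactly this predictability: any tool can find `sub-XX/anat/sub-XX_T1w.nii.gz` without dataset-specific glue code.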
Preprocessing and tools
— Everybody is doing it, one way or another
— Except us (ಥ_ಥ)
— It is an essential skill to learn, because almost nobody is willing to share processed data
— Thankfully, there are a lot of tutorials and ready-made scripts
Preprocessing and tools
http://nipy.org/nipype/users/pipeline_tutorial.html — NiPype pipelines
http://nilearn.github.io/user_guide.html — nilearn documentation (designed mostly for fMRI data)
https://github.com/FBK-NILab/ — FBK pipeline collection (PRNI 2016 organizers)
https://practical-neuroimaging.github.io/ — Berkeley practical course on neuroimaging (found during this presentation preparation)
Machine learning
— Yeah, we are good
— But it doesn’t matter as long as we can’t preprocess data ¯\_(ツ)_/¯
— The most interesting research is mostly based on fMRI: more data, more classes, somewhat less noise
— Kernels are hot
— Many people don’t care about the interpretability of their features
— Everybody talks about Deep Learning
Machine learning: kernel fever 1
— A lot of people are doing some kind of kernels and even specialize in them. There was a tutorial about kernels and an invited talk
— Let’s walk through the kernel tutorial at PRNI
— It gives an overview of some kernels on graphs with/without node correspondence
— And a comparison of some kernels in terms of p-values
— The source code of the presentation’s experiments is available
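To make the "node correspondence" case concrete, here is a minimal sketch of the simplest such kernel: when all graphs share the same node labels, one can vectorise the adjacency matrices and take a linear kernel between them, then feed the precomputed Gram matrix to an SVM. This is an illustrative toy on synthetic graphs, not any specific kernel from the tutorial.

```python
# Toy graph kernel with node correspondence: linear kernel on vectorised
# adjacency matrices, classified with a precomputed-kernel SVM.
import numpy as np
from sklearn.svm import SVC

rng = np.random.RandomState(0)
n_graphs, n_nodes = 40, 10

def random_graph(p):
    """Random symmetric adjacency matrix with edge density p, no self-loops."""
    a = (rng.rand(n_nodes, n_nodes) < p).astype(float)
    a = np.triu(a, 1)
    return a + a.T

# Two classes of graphs that differ only in edge density.
graphs = [random_graph(0.2) for _ in range(n_graphs // 2)] + \
         [random_graph(0.5) for _ in range(n_graphs // 2)]
y = np.array([0] * (n_graphs // 2) + [1] * (n_graphs // 2))

X = np.array([g.ravel() for g in graphs])  # node correspondence lets us vectorise
K = X @ X.T                                # linear kernel: Gram matrix of graphs

clf = SVC(kernel="precomputed").fit(K, y)
print("train accuracy:", clf.score(K, y))
```

Kernels without node correspondence (e.g. subtree or random-walk kernels) replace the naive `X @ X.T` step with a permutation-invariant comparison; the SVM part stays the same.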
Machine learning: kernel fever 2
— A lot of people are building pretty complex kernels and don’t care about interpretability at all
— Example: Spatial pyramid match kernels for brain image classification, Jonathan Young, PRNI 2016
— IXI dataset, 538 subjects, gender and age prediction.
— Pretty complex kernel, gives somewhat good results with completely unclear interpretation
Machine learning: kernel fever 3
— More interesting example (with blackjack and Laplacians): Classifying HCP Task-fMRI Networks Using Heat Kernels, Ai Wern Chung et al., PRNI 2016
— Features based on the exponential of the normalized Laplacian, not the kernel itself
— 491 subjects, fMRI HCP data, motor vs working memory task
— Good physics intuition, big sample, good results (for chosen metrics)
— Again: so what? And why didn’t they use it as, you know, a kernel?
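The central object here, the heat kernel H(t) = exp(−t·L) of the normalised graph Laplacian, is easy to sketch. The toy graph and the choice of per-node features below are illustrative; Chung et al.’s exact feature construction differs in its details.

```python
# Heat kernel of a graph's normalised Laplacian (toy example).
import numpy as np
from scipy.linalg import expm

rng = np.random.RandomState(0)
n = 8
A = (rng.rand(n, n) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T                                  # symmetric adjacency, no self-loops

# Normalised Laplacian: L = I - D^{-1/2} A D^{-1/2}
deg = A.sum(axis=1)
d_inv_sqrt = np.zeros(n)
d_inv_sqrt[deg > 0] = deg[deg > 0] ** -0.5
L = np.eye(n) - d_inv_sqrt[:, None] * A * d_inv_sqrt[None, :]

t = 1.0
H = expm(-t * L)                             # heat kernel at diffusion time t
features = np.diag(H)                        # e.g. per-node "retained heat"
print(features)
```

The physics intuition mentioned in the talk is literal: H[i, j] describes how much heat injected at node i remains at node j after time t, so the diagonal measures how strongly each node holds on to its own signal.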
Machine learning: deep learning
— Somebody is trying to use it: Evaluation of weight sparsity control during autoencoder training of resting-state fMRI using non-zero ratio and Hoyer’s sparseness, Hyun-Chul Kim, PRNI 2016
— There were some strange examples, e.g. something that looked like word2vec on neuroscience paper abstracts
— Everybody talks about Deep Learning, but nobody has any sense (or data) of how to apply it to neuroimaging
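Hoyer’s sparseness, the measure in the Kim paper’s title, has a simple closed form (Hoyer, 2004): sparseness(x) = (√n − ‖x‖₁/‖x‖₂) / (√n − 1). It is 1 for a one-hot vector and 0 when all entries have equal magnitude. A minimal implementation:

```python
# Hoyer's sparseness measure for a weight vector (Hoyer, 2004).
import numpy as np

def hoyer_sparseness(x):
    x = np.asarray(x, dtype=float)
    n = x.size
    l1 = np.abs(x).sum()                  # L1 norm
    l2 = np.sqrt((x ** 2).sum())          # L2 norm
    return (np.sqrt(n) - l1 / l2) / (np.sqrt(n) - 1)

print(hoyer_sparseness([1, 0, 0, 0]))   # → 1.0 (maximally sparse)
print(hoyer_sparseness([1, 1, 1, 1]))   # → 0.0 (maximally dense)
```

In the paper’s setting this would be computed on the autoencoder’s weight vectors during training, alongside the simpler non-zero ratio.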
Machine learning: Riemann and Barachant
— Barachant talk was hugely successful. We can expect studies applying his methods to fMRI covariance matrices.
— He doesn’t think that the BCI problem is solvable on EEG data.
— Deep Learning hasn’t killed Riemannian metric methods. The Kaggle competitions where they ruled had settings very suitable for them.
— Even if cleaner and better data appears, he thinks his methods will still be good
— He is interested in subject independent algorithms (who doesn’t)
— He is also applying his methods to non-EEG data
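For flavour, here is a sketch of one Riemannian-style distance on the space of covariance matrices: the log-Euclidean distance d(A, B) = ‖logm(A) − logm(B)‖_F. Barachant’s work typically uses the affine-invariant Riemannian metric; the log-Euclidean version shown here on toy SPD matrices is its cheaper relative, not his exact method.

```python
# Log-Euclidean distance between symmetric positive-definite (SPD)
# covariance matrices: d(A, B) = ||logm(A) - logm(B)||_F.
import numpy as np
from scipy.linalg import logm

def log_euclidean_dist(A, B):
    return np.linalg.norm(logm(A) - logm(B), "fro")

rng = np.random.RandomState(0)

def random_spd(n):
    """Random SPD matrix: M M^T plus a diagonal shift for conditioning."""
    M = rng.randn(n, n)
    return M @ M.T + n * np.eye(n)

A, B = random_spd(4), random_spd(4)
print(log_euclidean_dist(A, A))      # → 0.0 (distance to itself)
print(log_euclidean_dist(A, B) > 0)  # → True
```

The appeal over plain Euclidean distance is that covariance matrices live on a curved manifold, and matrix-logarithm-based metrics respect that geometry, which is exactly why these methods transfer naturally from EEG covariances to fMRI covariance matrices.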
Machine learning: this is WeiRD
— Another example of tricky method on obscure features: WeiRD – a fast and performant multivoxel pattern classifier, PRNI 2016
— Binary distance-to-centroid classifier. “It can be described as a voxel-wise (or more generally, feature-wise) voting scheme”.
— Works slightly better than RF and SVM on the paper’s data.
— Again: so what?
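Taking the quoted description literally, a simplified reading of the idea looks like this: each feature votes for the class whose training centroid is closer on that single feature, and the majority wins. This is a sketch of that voting scheme on synthetic data, not the authors’ exact algorithm.

```python
# Simplified feature-wise voting classifier in the spirit of the quoted
# WeiRD description (binary classes; each feature votes independently).
import numpy as np

class FeatureVoteClassifier:
    def fit(self, X, y):
        self.classes_ = np.unique(y)      # assumes exactly two classes
        self.centroids_ = np.array(
            [X[y == c].mean(axis=0) for c in self.classes_])
        return self

    def predict(self, X):
        # Per-feature distance to each class centroid.
        d0 = np.abs(X - self.centroids_[0])
        d1 = np.abs(X - self.centroids_[1])
        votes_for_1 = (d1 < d0).sum(axis=1)   # features voting for class 1
        return np.where(votes_for_1 > X.shape[1] / 2,
                        self.classes_[1], self.classes_[0])

rng = np.random.RandomState(0)
X0 = rng.randn(30, 50) - 0.5                 # class 0: shifted down
X1 = rng.randn(30, 50) + 0.5                 # class 1: shifted up
X = np.vstack([X0, X1])
y = np.array([0] * 30 + [1] * 30)

clf = FeatureVoteClassifier().fit(X, y)
print("train accuracy:", (clf.predict(X) == y).mean())
```

The speed claim is plausible from the sketch alone: fitting is just two means, and prediction is two subtractions and a comparison, with no optimisation loop at all.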
Machine learning: ADNI classification
— Yet another example of tricky features: Novel histogram-weighted cortical thickness networks and a multi-scale analysis of predictive power in Alzheimer’s disease, PRNI 2016
— Great sample of 412 subjects, good result (~0.87 AUC) with rbf SVM + t-based feature selection (technically very well done)
— Investigation of different edge definitions using patch-wise cortical thickness correlation/similarity/dissimilarity
— But again: very obscure features which don’t reflect the structure of the brain (though maybe we don’t need that for this task)
Meta: multimodal data comparison by Cichy
— A very interesting talk by Radoslaw Cichy. He wants to understand how humans process visual information and found an ingenious way of comparing data from different modalities.
— Idea: show people (or models) the same set of objects while collecting different data, then calculate similarity matrices of these objects according to this data.
— It’s simple but effective. For example, it allows comparing data from fMRI and EEG/MEG.
— Even more, it provides a way to validate ‘theoretic’ brain models.
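The mechanics of the trick can be sketched in a few lines: build a representational dissimilarity matrix (RDM) over the same stimuli in each modality, then correlate the RDMs. Everything below is synthetic toy data (a shared latent "object identity" code read out by two fake modalities); see Cichy’s papers for the real analyses.

```python
# Representational-similarity sketch: same stimuli, two "modalities",
# compare their dissimilarity structure rather than their raw signals.
import numpy as np
from scipy.spatial.distance import pdist
from scipy.stats import spearmanr

rng = np.random.RandomState(0)
n_stimuli, latent_dim = 12, 5
latent = rng.randn(n_stimuli, latent_dim)    # shared object-identity structure

# Two modalities: different linear readouts of the same latent code + noise.
fmri = latent @ rng.randn(latent_dim, 100) + 0.3 * rng.randn(n_stimuli, 100)
meg = latent @ rng.randn(latent_dim, 60) + 0.3 * rng.randn(n_stimuli, 60)

# RDM per modality: pairwise correlation distance between stimuli
# (condensed upper triangle, length n*(n-1)/2).
rdm_fmri = pdist(fmri, metric="correlation")
rdm_meg = pdist(meg, metric="correlation")

rho, _ = spearmanr(rdm_fmri, rdm_meg)
print("RDM correlation:", rho)
```

This is why the method is modality-agnostic: fMRI voxels, MEG sensors, and DNN activations never need to be aligned with each other, only the geometry of the stimulus set they induce.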
Meta: selected Cichy works
— Example 1. Dynamics of scene representations in the human brain revealed by magnetoencephalography and deep neural networks (Nature Scientific Reports, 2016).
— Example 2. Similarity-Based Fusion of MEG and fMRI Reveals Spatio-Temporal Dynamics in Human Cortex During Visual Object Recognition (Cerebral Cortex, 2016)
— Example 3. Comparison of deep neural networks to spatio-temporal cortical dynamics of human visual object recognition reveals hierarchical correspondence (Neuroimage, 2016)
Meta: sidenotes
— Some Hungarian guys found a funny way to measure node centralities using modularity and a fixed partition
— There was a researcher from Italy with two oral talks whose work looked rather weak. For example, she selected features on the whole dataset
— Some people try to move fancy machine learning things into neuroscience. For example, there was a Hadoop/Spark something-something abstract.
— The community mostly doesn’t care about the meaning (Cichy and several others are exceptions)
Meta: some ideas
— Kernels. Validate complex and simple kernels on various synthetic datasets and write up a summary: such-and-such kernels capture such-and-such effects.
— fMRI. Can we estimate the probabilities of signal transitions between different brain regions from resting-state data? If so, how do they relate to structural probabilities?
— Based on Cichy’s data for a person and a task, can we predict the parameters of a DNN that would perform roughly the same on the same data?
— Pairwise classification. ¯\_(ツ)_/¯
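The resting-state transition idea above can be made concrete with a toy sketch: discretise the signal into a per-timepoint "active region" state and count state changes to get a Markov transition matrix. The discretisation rule and all data here are purely illustrative assumptions, not a proposed method.

```python
# Toy sketch: estimate a region-to-region transition-probability matrix
# from a (fake) resting-state time series by counting state changes.
import numpy as np

rng = np.random.RandomState(0)
n_regions, n_timepoints = 4, 500
ts = rng.randn(n_regions, n_timepoints)      # fake regional signals

# Crude state assignment: the "active" region is the one with the
# largest signal at each timepoint (illustrative choice only).
active = ts.argmax(axis=0)

# Count transitions between consecutive states, then row-normalise.
T = np.zeros((n_regions, n_regions))
for a, b in zip(active[:-1], active[1:]):
    T[a, b] += 1
T = T / T.sum(axis=1, keepdims=True)         # rows are probability distributions

print(np.round(T, 2))
```

The open question from the slide is whether such functional transition probabilities, estimated on real data with a sensible state definition, line up with structural connection probabilities from tractography.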
Conclusions
— We need to gather more data (covered it)
— We need to learn how to preprocess it (partly covered it)
— We should do more fMRI analysis
— There are a lot of groups with our same selling point: ‘we are great ML guys, so we’ll teach you how to do it in neuroscience’
— It is still unclear whether there is enough signal in neuroimaging data
Thank you!✿*∗˵╰༼✪‿✪༽╯˵∗*✿