Prediction of protein networks through data integration

Lars Juhl Jensen

EMBL Heidelberg


MIPS Retreat, Kloster Frauenchiemsee, Chiemsee, Germany, July 9-10, 2007


prediction of interactions

STRING

functional interactions

373 genomes

model organism databases

Ensembl

Genome Reviews

RefSeq

genomic context methods

gene neighborhood

gene fusion

phylogenetic profiles

[Figure labels: Cell, Cellulosomes, Cellulose]

correct interactions

wrong associations

phylogenetic profiles

SVD (Singular Value Decomposition)

Euclidean distance
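
The distance step named above can be illustrated with a minimal sketch. It assumes binary presence/absence profiles (one entry per genome) and made-up gene names; the SVD step from the previous slide is left out here, so the distances are computed on the raw profiles rather than on a reduced representation.

```python
import math
from itertools import combinations

# Hypothetical presence/absence profiles: one entry per genome (1 = gene present).
profiles = {
    "geneA": [1, 1, 0, 1, 0, 0],
    "geneB": [1, 1, 0, 1, 0, 0],
    "geneC": [0, 0, 1, 0, 1, 1],
    "geneD": [1, 1, 0, 1, 1, 0],
}

def profile_distances(profiles):
    """Return {(gene1, gene2): Euclidean distance} for every gene pair."""
    return {
        (a, b): math.dist(profiles[a], profiles[b])
        for a, b in combinations(sorted(profiles), 2)
    }

distances = profile_distances(profiles)
# Smaller distance = more similar phylogenetic profile.
ranked = sorted(distances.items(), key=lambda item: item[1])
```

Pairs at the top of `ranked` are the candidates for a functional association; how small a distance is reliable is exactly what the later calibration slides address.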

gene neighborhood

sum of intergenic distances
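
As a rough illustration of this raw score, the sketch below sums the gaps between consecutive genes lying between two genes of interest. The gene names and coordinates are invented, and the function assumes all genes are on the same strand and already sorted by start position; the slides do not state these details.

```python
def sum_intergenic_distance(genes, first, second):
    """Sum the intergenic gaps between two genes.

    genes: list of (name, start, end) tuples, sorted by start position
    and assumed to lie on the same strand (an assumption of this sketch).
    """
    names = [g[0] for g in genes]
    i, j = sorted((names.index(first), names.index(second)))
    total = 0
    for left, right in zip(genes[i:j], genes[i + 1:j + 1]):
        # Overlapping neighbours contribute a gap of zero.
        total += max(0, right[1] - left[2])
    return total

# Hypothetical gene coordinates.
genes = [
    ("g1", 100, 400),
    ("g2", 450, 900),
    ("g3", 905, 1500),
    ("g4", 3000, 3600),
]
```

A small sum suggests the two genes sit in the same tightly packed run; a large sum, as for pairs involving `g4` here, argues against it.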

raw quality scores

rank by reliability

not comparable

Euclidean distance

sum of intergenic distances

benchmarking

calibrate vs. gold standard

raw quality scores

probabilistic scores
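
One way to picture the step from raw to probabilistic scores is a binned calibration against a gold standard. The sketch below is an assumption-laden illustration, not the method used in the talk: the gold standard is a hypothetical protein-to-pathway annotation, a pair counts as correct if both proteins share a pathway, and pairs with an unannotated protein are skipped.

```python
def calibrate(scored_pairs, pathways, bin_edges):
    """Fraction of gold-standard-confirmed pairs in each raw-score bin.

    scored_pairs: {(protein_a, protein_b): raw_score}
    pathways:     {protein: set of pathway names} (hypothetical gold standard)
    bin_edges:    ascending edges; bins are half-open [low, high)
    """
    n_bins = len(bin_edges) - 1
    hits, totals = [0] * n_bins, [0] * n_bins
    for (a, b), raw in scored_pairs.items():
        if a not in pathways or b not in pathways:
            continue  # the gold standard cannot judge this pair
        for k in range(n_bins):
            if bin_edges[k] <= raw < bin_edges[k + 1]:
                totals[k] += 1
                hits[k] += bool(pathways[a] & pathways[b])
                break
    # None marks bins without any judged pair.
    return [h / t if t else None for h, t in zip(hits, totals)]

# Toy data, invented for illustration.
pathways = {
    "p1": {"glycolysis"},
    "p2": {"glycolysis"},
    "p3": {"TCA"},
    "p4": {"TCA", "glycolysis"},
    "p5": {"ribosome"},
}
scored_pairs = {
    ("p1", "p2"): 0.9, ("p1", "p4"): 0.8, ("p3", "p5"): 0.85,
    ("p1", "p5"): 0.2, ("p2", "p3"): 0.3, ("p3", "p4"): 0.1,
    ("p1", "p9"): 0.95,  # p9 is unannotated and is ignored
}
calibrated = calibrate(scored_pairs, pathways, [0.0, 0.5, 1.0])
```

Because every evidence type is mapped onto the same scale (fraction of correct pairs), scores from different methods become comparable after this step.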

curated knowledge

many sources

KEGG (Kyoto Encyclopedia of Genes and Genomes)

Reactome

PID (NCI-Nature Pathway Interaction Database)

STKE (Signal Transduction Knowledge Environment)

MIPS (Munich Information center for Protein Sequences)

Gene Ontology

different gene identifiers

synonyms list
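
A synonyms list can be thought of as a lookup from every known name to one canonical identifier. The sketch below uses invented identifiers and an assumed policy of matching case-insensitively and discarding names that point to more than one protein; neither choice is taken from the slides.

```python
def build_lookup(synonyms):
    """Build a name -> canonical identifier lookup.

    synonyms: {canonical_id: [alternative names]}
    Names are matched case-insensitively; names claimed by more than one
    canonical identifier are dropped and reported as ambiguous.
    """
    lookup = {}
    ambiguous = set()
    for canonical, names in synonyms.items():
        for name in [canonical, *names]:
            key = name.upper()
            if key in lookup and lookup[key] != canonical:
                ambiguous.add(key)
            lookup[key] = canonical
    for key in ambiguous:
        del lookup[key]
    return lookup, ambiguous

def translate(name, lookup):
    """Return the canonical identifier for a name, or None if unknown."""
    return lookup.get(name.upper())

# Hypothetical identifiers; "abc1" is deliberately shared by two proteins.
synonyms = {
    "P0001": ["abc1", "ORF_17"],
    "P0002": ["xyz2", "ORF_42"],
    "P0003": ["abc1"],
}
lookup, ambiguous = build_lookup(synonyms)
```

With such a table, records from sources that use different gene identifiers can be translated to one namespace before they are merged.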

literature mining

MEDLINE

SGD (Saccharomyces Genome Database)

The Interactive Fly

OMIM (Online Mendelian Inheritance in Man)

co-mentioning
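
Co-mentioning can be sketched as counting, for each pair of names, the number of abstracts in which both appear. The example sentences below are toy text written for this sketch, and the tokenization is deliberately naive (exact, case-insensitive token matches only, no synonym handling).

```python
import re
from collections import Counter
from itertools import combinations

def comention_counts(abstracts, names):
    """Count abstracts in which each pair of names occurs together."""
    wanted = {n.upper() for n in names}
    counts = Counter()
    for text in abstracts:
        tokens = {t.upper() for t in re.findall(r"[A-Za-z0-9]+", text)}
        found = sorted(tokens & wanted)
        # Each abstract contributes at most once per pair.
        counts.update(combinations(found, 2))
    return counts

# Toy abstracts, invented for illustration.
abstracts = [
    "HAP1 controls the expression of CYC1 and CYC7.",
    "We compared CYC1 and CYC7 under two growth conditions.",
    "The GAL4 gene was studied separately.",
]
counts = comention_counts(abstracts, ["HAP1", "CYC1", "CYC7", "GAL4"])
```

Raw counts like these still need to be normalized for how often each name occurs on its own and then calibrated, since frequently mentioned genes co-occur with many others by chance.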

NLP (Natural Language Processing)

Gene and protein names
Cue words for entity recognition
Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

calibrate vs. gold standard

primary experimental data

gene expression

GEO (Gene Expression Omnibus)

expression compendia

protein interactions

BIND (Biomolecular Interaction Network Database)

BioGRID (General Repository for Interaction Datasets)

DIP (Database of Interacting Proteins)

IntAct

MINT (Molecular Interactions Database)

HPRD (Human Protein Reference Database)

many sources

different gene identifiers

redundancy

not comparable

merge data by publication
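
Merging by publication can be sketched as grouping interaction records by the paper they come from, so that the same experiment imported by several databases is counted once. The record layout and the publication identifiers below are hypothetical.

```python
from collections import defaultdict

def merge_by_publication(records):
    """Group interaction records by publication and collapse duplicates.

    records: iterable of (database, publication_id, protein_a, protein_b)
    Returns {publication_id: {frozenset({a, b}): set of source databases}}.
    """
    merged = defaultdict(lambda: defaultdict(set))
    for database, publication_id, a, b in records:
        # frozenset makes the pair order-independent (a-b equals b-a).
        merged[publication_id][frozenset((a, b))].add(database)
    return {pub: dict(pairs) for pub, pairs in merged.items()}

# Hypothetical records; PUB_A and PUB_B stand in for real publication IDs.
records = [
    ("BIND", "PUB_A", "p1", "p2"),
    ("DIP", "PUB_A", "p2", "p1"),     # same interaction, other database
    ("IntAct", "PUB_A", "p1", "p3"),
    ("MINT", "PUB_B", "p1", "p2"),    # independent publication
]
merged = merge_by_publication(records)
```

After merging, each publication forms one data set that can be scored on its own, and an interaction seen in two different publications still counts as two independent observations.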

raw quality scores

calibrate vs. gold standard

combine all evidence

spread over many species

transfer by orthology
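
A very simplified picture of orthology transfer: evidence for a protein pair in one species is copied to the orthologous pair in another. The sketch assumes a one-to-one ortholog mapping with invented identifiers and keeps scores unchanged; the slides do not say how transferred evidence is weighted, and a real scheme would likely down-weight it.

```python
def transfer_by_orthology(source_scores, orthologs):
    """Copy pair scores from a source species to a target species.

    source_scores: {(protein_a, protein_b): score} in the source species
    orthologs:     {source_protein: target_protein} (assumed one-to-one)
    Returns {frozenset({target_a, target_b}): best transferred score}.
    """
    transferred = {}
    for (a, b), score in source_scores.items():
        if a in orthologs and b in orthologs:
            pair = frozenset((orthologs[a], orthologs[b]))
            if len(pair) == 2:  # skip pairs that collapse onto one protein
                transferred[pair] = max(score, transferred.get(pair, 0.0))
    return transferred

# Hypothetical identifiers: y* in the source species, h* in the target.
source_scores = {("y1", "y2"): 0.8, ("y1", "y3"): 0.6, ("y4", "y2"): 0.5}
orthologs = {"y1": "h1", "y2": "h2", "y4": "h1"}
transferred = transfer_by_orthology(source_scores, orthologs)
```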


naïve Bayesian scoring
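
A common way to combine calibrated scores in a naïve Bayesian fashion is to treat the evidence types as independent and compute the probability that at least one of them is right. The sketch below shows that simple form only; the slide does not give the exact formula, and any correction for prior probabilities is omitted.

```python
import math

def combine_scores(scores):
    """Combine independent probabilistic scores: 1 - prod(1 - s_i)."""
    return 1.0 - math.prod(1.0 - s for s in scores)

# Hypothetical per-evidence scores for one protein pair.
evidence = {"neighborhood": 0.5, "coexpression": 0.5}
combined = combine_scores(evidence.values())
```

The combined score is never lower than the strongest single piece of evidence, and several weak but independent lines of evidence can add up to a confident prediction.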

prediction of interactions

NetworKIN

the idea

phosphoproteomics

mass spectrometry

phosphorylation sites

Phospho.ELM

in vivo

kinases are unknown

computational methods

NetPhosK

Scansite

sequence motifs

kinase families

overprediction

no context

what a kinase could do

not what it actually does

context

co-activators

scaffolders

protein networks

the algorithm

NetworKIN
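
The idea built up over the preceding slides (a sequence motif narrows a site down to a kinase family, network context picks the family member) can be sketched as below. This is an illustration of that idea only, not the published NetworKIN algorithm: the function, the data layout, the names, and the rule of taking the kinase with the highest network score are all assumptions.

```python
def assign_kinase(substrate, motif_family, family_members, network_score):
    """Pick the family member with the best network context to the substrate.

    motif_family:   kinase family suggested by the sequence motif
    family_members: {family: [kinase names]}
    network_score:  {frozenset({protein_a, protein_b}): association score}
    Returns the chosen kinase, or None if no member has network support.
    """
    candidates = family_members.get(motif_family, [])
    scored = [
        (network_score.get(frozenset((kinase, substrate)), 0.0), kinase)
        for kinase in candidates
    ]
    scored = [item for item in scored if item[0] > 0.0]
    # Ties on score are broken alphabetically by kinase name.
    return max(scored)[1] if scored else None

# Hypothetical family, kinases, substrates, and network scores.
family_members = {"familyX": ["K1", "K2", "K3"]}
network_score = {
    frozenset(("K1", "S1")): 0.3,
    frozenset(("K2", "S1")): 0.9,
}
chosen = assign_kinase("S1", "familyX", family_members, network_score)
```

The motif alone would leave `K1`, `K2` and `K3` indistinguishable for this site; the network context is what separates them.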

benchmarking

Phospho.ELM

2.5-fold better accuracy

context is crucial

global statistics

visualization

ATM signaling

experimental validation

summary

reanalysis

benchmarking

integration

complementary data types

computational methods

reproduce what is known

biological discoveries

testable hypotheses

Acknowledgments

The STRING database
– Christian von Mering
– Michael Kuhn
– Berend Snel
– Martijn Huynen
– Sean Hooper
– Samuel Chaffron
– Julien Lagarde
– Mathilde Foglierini
– Peer Bork

Literature mining
– Jasmin Saric
– Rossitza Ouzounova
– Isabel Rojas

The NetworKIN method
– Rune Linding
– Gerard Ostheimer
– Francesca Diella
– Karen Colwill
– Jing Jin
– Pavel Metalnikov
– Vivian Nguyen
– Adrian Pasculescu
– Jin Gyoon Park
– Leona D. Samson
– Rob Russell
– Peer Bork
– Michael Yaffe
– Tony Pawson