The STRING database
Lars Juhl Jensen, EMBL Heidelberg

Introduction to Bioinformatics, Heidelberg University, Heidelberg, Germany, November 29, 2006

Page 1: The STRING database

The STRING database

Lars Juhl Jensen

EMBL Heidelberg

Page 2: The STRING database

data integration

Page 3: The STRING database

Jensen et al., Drug Discovery Today: Targets, 2004

Page 4: The STRING database

functional interactions

Page 5: The STRING database

Bork et al., Current Opinion in Structural Biology, 2005

Page 6: The STRING database

373 proteomes

Page 7: The STRING database

Genome Reviews

Page 8: The STRING database

RefSeq

Page 9: The STRING database

Ensembl

Page 10: The STRING database

model organism databases

Page 11: The STRING database

genomic context methods

Page 12: The STRING database

gene fusion

Page 13: The STRING database
Page 14: The STRING database

gene neighborhood

Page 15: The STRING database
Page 16: The STRING database

phylogenetic profiles

Page 17: The STRING database
Page 18: The STRING database
Page 19: The STRING database
Page 20: The STRING database
Page 21: The STRING database

Cell

Cellulosomes

Cellulose

Page 22: The STRING database

automation

Page 23: The STRING database

scoring scheme

Page 24: The STRING database

correct interactions

Page 25: The STRING database

wrong associations

Page 26: The STRING database

gene fusion

Page 27: The STRING database

sequence similarity

Page 28: The STRING database
Page 29: The STRING database

gene neighborhood

Page 30: The STRING database

sum of intergenic distances

Page 31: The STRING database
Page 32: The STRING database

phylogenetic profiles

Page 33: The STRING database

SVD (Singular Value Decomposition)

Page 34: The STRING database

Euclidean distance
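As a rough illustration of comparing phylogenetic profiles by Euclidean distance, here is a minimal sketch. It assumes binary presence/absence profiles over a hypothetical set of five genomes and skips the SVD step named on the earlier slide; the gene names and profile values are invented for the example and are not taken from STRING.

```python
import math

def profile_distance(profile_a, profile_b):
    """Euclidean distance between two equal-length phylogenetic profiles."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same set of genomes")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(profile_a, profile_b)))

# Hypothetical presence/absence profiles across five genomes (1 = gene present).
gene_x = [1, 0, 1, 1, 0]
gene_y = [1, 0, 1, 0, 0]
gene_z = [0, 1, 0, 0, 1]

d_xy = profile_distance(gene_x, gene_y)  # profiles differ in one genome
d_xz = profile_distance(gene_x, gene_z)  # profiles differ in every genome
```

A small distance means the two genes tend to be present and absent in the same genomes, which is the signal this method looks for.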

Page 35: The STRING database
Page 36: The STRING database

raw quality scores

Page 37: The STRING database

not comparable

Page 38: The STRING database

sequence similarity

Page 39: The STRING database

sum of intergenic distances

Page 40: The STRING database

Euclidean distance

Page 41: The STRING database

benchmarking

Page 42: The STRING database

calibrate vs. gold standard

Page 43: The STRING database
Page 44: The STRING database

raw quality scores

Page 45: The STRING database

probabilistic scores
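One simple way to picture the step from raw quality scores to probabilistic scores is to bin the raw scores and measure, per bin, the fraction of pairs that the gold standard confirms. The sketch below assumes exactly that; the slides do not spell out the calibration procedure, and the pairs, scores, gold standard, and bin edges are all hypothetical.

```python
def calibrate(scored_pairs, gold_standard, bin_edges):
    """Per raw-score bin, return the fraction of pairs found in the gold standard.

    Bins are half-open: bin_edges[i] <= score < bin_edges[i + 1].
    Empty bins yield None.
    """
    stats = [[0, 0] for _ in range(len(bin_edges) - 1)]  # [hits, total] per bin
    for pair, raw_score in scored_pairs:
        for i in range(len(bin_edges) - 1):
            if bin_edges[i] <= raw_score < bin_edges[i + 1]:
                stats[i][1] += 1
                if pair in gold_standard:
                    stats[i][0] += 1
                break
    return [hits / total if total else None for hits, total in stats]

# Hypothetical protein pairs with raw scores from a single evidence type.
scored = [
    (frozenset({"A", "B"}), 0.9),
    (frozenset({"A", "C"}), 0.8),
    (frozenset({"B", "C"}), 0.7),
    (frozenset({"A", "D"}), 0.3),
    (frozenset({"B", "D"}), 0.2),
    (frozenset({"C", "D"}), 0.1),
]
# Hypothetical gold standard of known functional associations.
gold = {frozenset({"A", "B"}), frozenset({"A", "C"}), frozenset({"C", "D"})}

fractions = calibrate(scored, gold, [0.0, 0.5, 1.0])
```

Because every evidence type is mapped onto the same scale (fraction confirmed by the gold standard), scores that were not comparable in raw form become comparable.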

Page 46: The STRING database

curated knowledge

Page 47: The STRING database

KEGG (Kyoto Encyclopedia of Genes and Genomes)

Page 48: The STRING database

Reactome

Page 49: The STRING database

MIPS (Munich Information Center for Protein Sequences)

Page 50: The STRING database

STKE (Signal Transduction Knowledge Environment)

Page 51: The STRING database

primary experimental data

Page 52: The STRING database

many sources

Page 53: The STRING database

many parsers

Page 54: The STRING database

physical protein interactions

Page 55: The STRING database

BIND (Biomolecular Interaction Network Database)

Page 56: The STRING database

GRID (General Repository for Interaction Datasets)

Page 57: The STRING database

MINT (Molecular Interactions Database)

Page 58: The STRING database

DIP (Database of Interacting Proteins)

Page 59: The STRING database

HPRD (Human Protein Reference Database)

Page 60: The STRING database

merge data by publication

Page 61: The STRING database

topology-based scores

Page 62: The STRING database

von Mering et al., Nucleic Acids Research, 2005

Page 63: The STRING database

co-expression

Page 64: The STRING database

GEO (Gene Expression Omnibus)

Page 65: The STRING database

correlation coefficient
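A minimal sketch of scoring co-expression with a correlation coefficient follows. It assumes the Pearson coefficient, which the slide does not specify, and uses made-up expression values over four hypothetical conditions.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical expression levels of three genes across four conditions.
gene_a = [1.0, 2.0, 3.0, 4.0]
gene_b = [2.0, 4.0, 6.0, 8.0]  # rises together with gene_a
gene_c = [4.0, 3.0, 2.0, 1.0]  # falls as gene_a rises

r_ab = pearson(gene_a, gene_b)
r_ac = pearson(gene_a, gene_c)
```

A coefficient near 1 suggests the two genes are co-expressed; like the other raw scores, it would still need calibration before it can be compared with other evidence types.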

Page 66: The STRING database

literature mining

Page 67: The STRING database

different gene identifiers

Page 68: The STRING database

synonym lists

Page 69: The STRING database

MEDLINE

Page 70: The STRING database

SGD (Saccharomyces Genome Database)

Page 71: The STRING database

The Interactive Fly

Page 72: The STRING database

OMIM (Online Mendelian Inheritance in Man)

Page 73: The STRING database

co-mentioning
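Co-mentioning can be sketched as counting how often two gene names appear in the same abstract. The example below assumes name recognition and synonym mapping have already been done, so each abstract is reduced to a list of gene identifiers; the toy abstracts are hypothetical, and the raw counts shown are not the scores STRING reports.

```python
from collections import Counter
from itertools import combinations

def count_comentions(documents):
    """Count, for each pair of gene names, the documents mentioning both."""
    pair_counts = Counter()
    for genes in documents:
        # Deduplicate and sort so each pair is counted once per document
        # and always stored in the same order.
        for a, b in combinations(sorted(set(genes)), 2):
            pair_counts[(a, b)] += 1
    return pair_counts

# Hypothetical abstracts, each reduced to the gene names recognized in it.
abstracts = [
    ["CYC1", "HAP1"],
    ["CYC1", "CYC7", "HAP1"],
    ["GAL4"],
]

counts = count_comentions(abstracts)
```

Here `counts[("CYC1", "HAP1")]` is 2, while `GAL4` forms no pair because it is never mentioned together with another gene.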

Page 74: The STRING database

NLP (Natural Language Processing)

Page 75: The STRING database

Gene and protein names
Cue words for entity recognition
Verbs for relation extraction

[nxgene The GAL4 gene]

[nxexpr The expression of [nxgene the cytochrome genes [nxpg CYC1 and CYC7]]] is controlled by [nxpg HAP1]

Page 76: The STRING database

calibrate vs. gold standard

Page 77: The STRING database
Page 78: The STRING database

combine all evidence
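Once each evidence type carries a probabilistic score, one common way to combine them is to treat the sources as independent and compute the probability that at least one of them is right. The sketch below assumes that independence and shows only the basic formula, 1 - ∏(1 - s_i); the scheme actually used may apply further corrections, and the score values are hypothetical.

```python
def combine_scores(scores):
    """Combine per-evidence probabilities, assuming independent sources.

    Returns 1 minus the probability that every source is wrong.
    """
    p_all_wrong = 1.0
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError("scores must be probabilities in [0, 1]")
        p_all_wrong *= 1.0 - s
    return 1.0 - p_all_wrong

# Hypothetical scores for one protein pair from two evidence types.
combined = combine_scores([0.5, 0.5])
single = combine_scores([0.9])
no_evidence = combine_scores([])
```

Under this formula the combined score is never lower than the strongest single score, so several weak lines of evidence can add up to a confident association.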

Page 79: The STRING database

spread over many species

Page 80: The STRING database

transfer by orthology

Page 81: The STRING database

von Mering et al., Nucleic Acids Research, 2005

Page 82: The STRING database

two modes

Page 83: The STRING database
Page 84: The STRING database

orthologous groups

Page 85: The STRING database

von Mering et al., Nucleic Acids Research, 2005

Page 86: The STRING database

fuzzy orthology

Page 87: The STRING database

von Mering et al., Nucleic Acids Research, 2005

Page 88: The STRING database

Bayesian scoring scheme

Page 89: The STRING database

Bork et al., Current Opinion in Structural Biology, 2005

Page 90: The STRING database

Acknowledgments

The STRING team (EMBL)

– Christian von Mering

– Berend Snel

– Martijn Huynen

– Sean Hooper

– Samuel Chaffron

– Julien Lagarde

– Mathilde Foglierini

– Peer Bork

Literature mining project (EML Research)

– Jasmin Saric

– Rossitza Ouzounova

– Isabel Rojas