Lars Juhl Jensen STRING & STITCH Network integration of heterogeneous data


Lars Juhl Jensen

STRING & STITCH: Network integration of heterogeneous data

interaction networks

association networks

guilt by association

molecular networks

web resources

STRING

string-db.org

STITCH

stitch-db.org

9.6 million proteins

430 thousand chemicals

common foundation

data integration

curated knowledge

protein complexes

3D structures

pathways

signaling pathways

metabolic pathways

Letunic & Bork, Trends in Biochemical Sciences, 2008

known drug targets

text mining

text corpus

PubMed abstracts

open access articles

dictionary

genes/proteins

chemical compounds

“black list”

co-mentioning
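As a rough illustration of how a dictionary, a "black list" and co-mentioning fit together, the sketch below tags names in abstracts and counts how often two entities appear in the same abstract. The toy dictionary, the black-listed name and the example abstracts are all made up for illustration; this is not the actual STRING or STITCH text-mining pipeline.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy dictionary: lower-cased name -> common identifier.
DICTIONARY = {
    "tp53": "TP53",
    "p53": "TP53",
    "mdm2": "MDM2",
    "cdk2": "CDK2",
    "was": "WAS",  # a real gene symbol that collides with an English word
}

# Names that are too ambiguous to trust, so matches are discarded.
BLACKLIST = {"was"}


def tag_entities(text):
    """Return the set of identifiers whose names occur in the text."""
    tokens = re.findall(r"[A-Za-z0-9]+", text.lower())
    return {DICTIONARY[t] for t in tokens if t in DICTIONARY and t not in BLACKLIST}


def comention_counts(abstracts):
    """Count, per pair of identifiers, the abstracts that mention both."""
    counts = Counter()
    for text in abstracts:
        entities = sorted(tag_entities(text))
        counts.update(combinations(entities, 2))
    return counts


abstracts = [
    "MDM2 binds p53 and promotes its degradation.",
    "TP53 was stabilised when MDM2 was inhibited.",
    "CDK2 activity was unchanged.",
]
counts = comention_counts(abstracts)
```

Raw counts like these would still need to be turned into a score and calibrated before they can be compared with other evidence types.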

natural language processing

experimental data

physical interactions

high-throughput assays

Jensen & Bork, Science, 2008

genetic interactions

synthetic lethality

Beyer et al., Nature Reviews Genetics, 2007

transcriptomics

microarrays

RNAseq

gene coexpression
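One simple way to quantify coexpression is the Pearson correlation between two genes' expression profiles across samples. The sketch below uses invented gene names and values, and Pearson correlation is only one of several possible measures; it is not presented as the measure used by STRING.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no coexpression signal
    return cov / sqrt(var_x * var_y)


# Hypothetical expression values across four samples.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],
    "geneC": [4.0, 3.0, 2.0, 1.0],
}

r_ab = pearson(expression["geneA"], expression["geneB"])
r_ac = pearson(expression["geneA"], expression["geneC"])
```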

chemical screens

binding assays

Campillos et al., Science, 2008

activity assays

genomic context

genome comparison

gene fusion

Korbel et al., Nature Biotechnology, 2004

operons

Korbel et al., Nature Biotechnology, 2004

bidirectional promoters

Korbel et al., Nature Biotechnology, 2004

phylogenetic profiles

Korbel et al., Nature Biotechnology, 2004
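The idea behind phylogenetic profiles is that genes present and absent in the same set of genomes are likely to be functionally associated. A minimal sketch, assuming each gene's profile is simply the set of species in which it occurs and using Jaccard similarity as a stand-in measure (the species and gene names are invented, and this is not necessarily the scoring used in STRING):

```python
def jaccard(a, b):
    """Jaccard similarity between two sets of species."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Hypothetical presence profiles: gene -> species in which it is found.
profiles = {
    "geneA": {"sp1", "sp2", "sp4"},
    "geneB": {"sp1", "sp2", "sp4"},
    "geneC": {"sp3"},
}

sim_ab = jaccard(profiles["geneA"], profiles["geneB"])
sim_ac = jaccard(profiles["geneA"], profiles["geneC"])
```

Here geneA and geneB share an identical profile, whereas geneA and geneC never occur in the same genome.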

a real example

Cell

Cellulosomes

Cellulose

complications

many databases

different formats

different identifiers

variable quality

not comparable

not same species

hard work

parsers

mapping files
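To make the role of mapping files concrete, the sketch below loads a tab-separated alias table and uses it to rewrite interaction pairs from a source database into common identifiers. The column names, aliases and identifiers are all hypothetical.

```python
import csv
import io

# Hypothetical mapping file: source-specific alias -> common identifier.
MAPPING_TSV = (
    "alias\tcommon_id\n"
    "YFG1\tprot_0001\n"
    "P00001\tprot_0001\n"
    "YFG2\tprot_0002\n"
)


def load_mapping(handle):
    """Read an alias -> common identifier table from a TSV file handle."""
    reader = csv.DictReader(handle, delimiter="\t")
    return {row["alias"]: row["common_id"] for row in reader}


def normalise(pairs, mapping):
    """Translate pairs to common identifiers; collect pairs that fail to map."""
    mapped, unmapped = [], []
    for a, b in pairs:
        if a in mapping and b in mapping:
            mapped.append(tuple(sorted((mapping[a], mapping[b]))))
        else:
            unmapped.append((a, b))
    return mapped, unmapped


mapping = load_mapping(io.StringIO(MAPPING_TSV))
mapped, unmapped = normalise([("YFG2", "P00001"), ("YFG1", "UNKNOWN")], mapping)
```

Keeping the unmapped pairs around makes it easy to see how much data is lost to identifier problems.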

quality scores

text mining

co-mention score

affinity purification

von Mering et al., Nucleic Acids Research, 2005

phylogenetic profiles

score calibration

gold standard

von Mering et al., Nucleic Acids Research, 2005

implicit weighting by quality

common scale
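Calibration against a gold standard can be sketched as follows: bin the raw scores of one evidence type, then, for each bin, take the fraction of pairs that the gold standard confirms. That fraction serves as the score on the common scale. The raw scores, bin edges and gold-standard labels below are invented, and simple binning is a simplification of whatever curve fitting a real pipeline would use.

```python
from bisect import bisect_right


def calibration_table(raw_scores, is_true, bin_edges):
    """Fraction of gold-standard positives in each raw-score bin."""
    totals = [0] * (len(bin_edges) + 1)
    hits = [0] * (len(bin_edges) + 1)
    for score, truth in zip(raw_scores, is_true):
        i = bisect_right(bin_edges, score)  # index of the bin holding this score
        totals[i] += 1
        hits[i] += int(truth)
    return [h / t if t else None for h, t in zip(hits, totals)]


def calibrated_score(raw, bin_edges, table):
    """Look up the calibrated score for a new raw score."""
    return table[bisect_right(bin_edges, raw)]


# Hypothetical raw scores with gold-standard labels.
raw_scores = [1, 2, 3, 6, 7, 8, 12, 15]
is_true = [False, False, False, True, False, True, True, True]
bin_edges = [5, 10]  # bins: below 5, from 5 up to 10, and 10 or above

table = calibration_table(raw_scores, is_true, bin_edges)
score_for_9 = calibrated_score(9, bin_edges, table)
```

Because every evidence type is mapped to the same kind of value, a noisy data source ends up with low calibrated scores without needing a separate manual weight.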

orthology-based transfer

orthologous groups

Franceschini et al., Nucleic Acids Research, 2013
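Orthology-based transfer can be pictured as copying evidence from a pair of proteins in one species to the corresponding members of the same orthologous groups in another species. The sketch below is a deliberately naive version: the group assignments, protein names and the fixed `penalty` factor are assumptions made for illustration, not the scheme described by Franceschini et al.

```python
def transfer(source_scores, group_of, target_members, penalty=0.8):
    """Transfer pair scores to a target species via orthologous groups.

    source_scores:  {(protein1, protein2): score} in the source species
    group_of:       {source protein: orthologous group}
    target_members: {orthologous group: [target-species proteins]}
    penalty:        hypothetical down-weighting of transferred evidence
    """
    transferred = {}
    for (p1, p2), score in source_scores.items():
        g1, g2 = group_of.get(p1), group_of.get(p2)
        if g1 is None or g2 is None:
            continue  # no orthologous group, nothing to transfer
        for t1 in target_members.get(g1, []):
            for t2 in target_members.get(g2, []):
                if t1 == t2:
                    continue
                key = tuple(sorted((t1, t2)))
                # Keep the strongest transferred evidence per target pair.
                transferred[key] = max(transferred.get(key, 0.0), score * penalty)
    return transferred


source_scores = {("yA", "yB"): 0.9}
group_of = {"yA": "OG1", "yB": "OG2"}
target_members = {"OG1": ["hA"], "OG2": ["hB1", "hB2"]}

result = transfer(source_scores, group_of, target_members)
```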

access

web interface

evidence viewers

good for small networks

not for big networks

Cytoscape

stringApp

handles large networks

great for visualization

steep learning curve

download files
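Working from download files usually starts with reading a list of scored links and keeping only the confident ones. The sketch below assumes a whitespace-separated file with a header line naming the columns `protein1`, `protein2` and `combined_score`, with scores on a 0 to 1000 scale; check the actual file you download, since the layout is an assumption here. The protein identifiers are placeholders.

```python
import io

# Made-up sample in the assumed layout of a links file.
SAMPLE_LINKS = (
    "protein1 protein2 combined_score\n"
    "9606.EXAMPLE_A 9606.EXAMPLE_B 950\n"
    "9606.EXAMPLE_A 9606.EXAMPLE_C 420\n"
    "9606.EXAMPLE_B 9606.EXAMPLE_C 710\n"
)


def read_links(handle, min_score=700):
    """Return (protein1, protein2, score) for links at or above min_score."""
    header = handle.readline().split()
    col = {name: i for i, name in enumerate(header)}
    edges = []
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        score = int(fields[col["combined_score"]])
        if score >= min_score:
            edges.append((fields[col["protein1"]], fields[col["protein2"]], score))
    return edges


edges = read_links(io.StringIO(SAMPLE_LINKS))
```

For a real file, the same function can be given an open file handle instead of the in-memory sample.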

R/Bioconductor

REST API

payload mechanism

requires programming
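As a small taste of the programming involved, the sketch below builds a request URL for a network query. The URL pattern (`/api/<format>/network` with `identifiers` and `species` parameters, identifiers separated by a carriage return) reflects my understanding of the public STRING API and should be checked against the current documentation; the helper name `build_network_url` is hypothetical. Nothing is fetched here.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"


def build_network_url(identifiers, species, output_format="tsv"):
    """Build a network query URL for a list of protein names.

    species is an NCBI taxonomy identifier, for example 9606 for human.
    """
    params = urlencode({
        "identifiers": "\r".join(identifiers),  # encoded as %0D in the URL
        "species": species,
    })
    return f"{STRING_API}/{output_format}/network?{params}"


url = build_network_url(["TP53", "MDM2"], species=9606)
```

The resulting URL could then be retrieved with `urllib.request.urlopen` or any HTTP client, and the tab-separated response parsed like any other table.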

related resources

RAIN

ncRNA networks

Junge et al., Database, 2017

rth.dk/resources/rain

COMPARTMENTS

subcellular localization

Binder et al., Database, 2014

compartments.jensenlab.org

TISSUES

tissue expression

Santos et al., PeerJ, 2015

tissues.jensenlab.org

DISEASES

disease associations

heterogeneous data

common identifiers

quality scores