Meta-Proteome-Analyzer: a graph database backed protein analysis software
Thilo Muth
MPI Magdeburg / Germany
(cooperation with CompOmics group in Ghent / Belgium)
Life Sciences + Graph Database?
Life scientists are interested in genes, proteins, metabolites, relationships, interactions and biological networks...
Introduction
...but not so much in a bunch of boring tables.
Introduction
Classical way:
Most biological data are stored and provided in Excel sheets or SQL databases.
Shortcomings:
• Scalability and performance issues (data growth)
• Complexity
• Lack of information
• Models far away from biological reality
Introduction
Metaproteomics
What is metaproteomics?
• Metaproteomics studies reveal functional activities in microbial communities.
• Metaproteomics results can also be used for taxonomic assignment (proteins to species).
One Protein - Different Organisms
[Diagram: one protein linked to Species A, Species B and Species C (1 : n)]
Metaproteomics
How to handle more complex structures?
Metaproteomics
[Diagram nodes: Peptide A, Peptide B; Protein A, Protein B, Protein C; Species A, Species B]
Find a more „natural“ model:
Transfer the graph model of biological data back to the database storage level.
Metaproteomics
USE A GRAPH DATABASE
Graph Database
We integrated the graph database Neo4j into our free software Meta-Proteome-Analyzer.
Biology + Graphs!
Graph Database
Graph nodes/vertices:
• Peptides
• Proteins
• Species
• Enzymes
• Pathways
• Ontologies
Graph Database
Graph edges/relationships:
• Peptide-BELONGS-TO-Protein
• Species-HOLDS-Protein
• Protein-BELONGS-TO-Pathway
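The node and relationship types listed on these slides can be sketched as a tiny in-memory property graph. This is only an illustration in plain Python, not how Neo4j or the Meta-Proteome-Analyzer stores data; the `Graph` class, the helper `species_for_peptide`, the underscore spelling of the relationship names and the example entries are all assumptions made for the sketch.

```python
from collections import defaultdict


class Graph:
    """Tiny in-memory property graph: labelled nodes, typed directed edges."""

    def __init__(self):
        self.labels = {}                    # node id -> node label
        self.out_edges = defaultdict(list)  # node id -> [(rel_type, target id)]
        self.in_edges = defaultdict(list)   # node id -> [(rel_type, source id)]

    def add_node(self, node_id, label):
        self.labels[node_id] = label

    def add_edge(self, source, rel_type, target):
        self.out_edges[source].append((rel_type, target))
        self.in_edges[target].append((rel_type, source))

    def targets(self, source, rel_type):
        """Nodes reached from `source` over outgoing edges of `rel_type`."""
        return [t for r, t in self.out_edges[source] if r == rel_type]

    def sources(self, target, rel_type):
        """Nodes that point at `target` with an edge of `rel_type`."""
        return [s for r, s in self.in_edges[target] if r == rel_type]


def species_for_peptide(graph, peptide):
    """Peptide -BELONGS_TO-> Protein <-HOLDS- Species, collected as a sorted list."""
    found = set()
    for protein in graph.targets(peptide, "BELONGS_TO"):
        found.update(graph.sources(protein, "HOLDS"))
    return sorted(found)


# Hypothetical example: one peptide shared by two proteins from two species.
g = Graph()
for node_id, label in [
    ("Peptide A", "Peptide"),
    ("Protein A", "Protein"),
    ("Protein B", "Protein"),
    ("Species A", "Species"),
    ("Species B", "Species"),
]:
    g.add_node(node_id, label)
g.add_edge("Peptide A", "BELONGS_TO", "Protein A")
g.add_edge("Peptide A", "BELONGS_TO", "Protein B")
g.add_edge("Species A", "HOLDS", "Protein A")
g.add_edge("Species B", "HOLDS", "Protein B")

print(species_for_peptide(g, "Peptide A"))  # ['Species A', 'Species B']
```

The many-to-many case (one peptide, several proteins, several species) needs no join tables here: it is two hops along typed edges.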
„Complex“ Data
Did you find that guy in here?
„Complex“ Data
No? Me neither!
Asara et al. (Science 2007) reported the detection of collagen protein fragments in a 68-million-year-old Tyrannosaurus rex bone by shotgun proteomics.
This finding has been called into question by researchers who attribute these proteins to contamination from modern species.
Graph Database Queries
Simple Query: Filter out (T. rex) contaminants
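The slide does not show the query itself, so the following is only a guess at the idea: if Species-HOLDS-Protein relationships are available, contaminants can be dropped by discarding every protein held by a species on a contaminant list. The function name `filter_contaminants`, the contaminant list and the example pairs are hypothetical.

```python
def filter_contaminants(holds, contaminant_species):
    """Return proteins that no contaminant species holds.

    holds: list of (species, protein) pairs, i.e. Species-HOLDS-Protein edges.
    contaminant_species: set of species names treated as contamination sources.
    """
    # Proteins held by at least one contaminant species are flagged.
    flagged = {protein for species, protein in holds
               if species in contaminant_species}
    kept = []
    for _, protein in holds:
        # Preserve input order and avoid duplicates in the result.
        if protein not in flagged and protein not in kept:
            kept.append(protein)
    return kept


# Hypothetical example data, not taken from the talk.
holds = [
    ("Homo sapiens", "Keratin"),
    ("Bos taurus", "Serum albumin"),
    ("Methanosarcina barkeri", "Methyl-coenzyme M reductase"),
]
print(filter_contaminants(holds, {"Homo sapiens", "Bos taurus"}))
```

A protein shared between a contaminant and a non-contaminant species is removed in this sketch; whether that is the desired behaviour depends on the analysis.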
Graph Database Queries
Find a certain pathway and its related entities
Graph Database Queries
Neo4j Cypher Query:
START pathways = node:pathways("Identifier:*")
MATCH (pathways)<-[rel:BELONGS_TO_PATHWAY]-(proteins)<-[:IS_METAPROTEIN_OF]-(metaproteins)
WHERE (pathways.Identifier='K00399')
WITH pathways, proteins, metaproteins
MATCH (proteins)-[:BELONGS_TO]->(taxa)
RETURN pathways, taxa, metaproteins, proteins
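To make the traversal in the Cypher query easier to follow, here is a rough plain-Python equivalent over a list of edge triples. It mirrors the pattern (pathway, then proteins, then metaproteins and taxa) but is not how Neo4j evaluates the query; the function name `pathway_neighbourhood` and all node names except the identifier K00399 are made up for the example.

```python
def pathway_neighbourhood(edges, pathway_id):
    """Mimic the query pattern on (source, rel_type, target) triples.

    Returns one (pathway, taxon, metaprotein, protein) row per match,
    in the same column order as the RETURN clause.
    """
    # (pathways)<-[:BELONGS_TO_PATHWAY]-(proteins)
    proteins = [s for s, r, t in edges
                if r == "BELONGS_TO_PATHWAY" and t == pathway_id]
    rows = []
    for protein in proteins:
        # (proteins)<-[:IS_METAPROTEIN_OF]-(metaproteins)
        metaproteins = [s for s, r, t in edges
                        if r == "IS_METAPROTEIN_OF" and t == protein]
        # (proteins)-[:BELONGS_TO]->(taxa)
        taxa = [t for s, r, t in edges
                if r == "BELONGS_TO" and s == protein]
        for metaprotein in metaproteins:
            for taxon in taxa:
                rows.append((pathway_id, taxon, metaprotein, protein))
    return rows


# Hypothetical graph: two proteins in pathway K00399, one in another pathway.
edges = [
    ("P1", "BELONGS_TO_PATHWAY", "K00399"),
    ("P2", "BELONGS_TO_PATHWAY", "K00399"),
    ("P3", "BELONGS_TO_PATHWAY", "other-pathway"),
    ("M1", "IS_METAPROTEIN_OF", "P1"),
    ("M1", "IS_METAPROTEIN_OF", "P2"),
    ("P1", "BELONGS_TO", "Taxon X"),
    ("P2", "BELONGS_TO", "Taxon Y"),
    ("P3", "BELONGS_TO", "Taxon X"),
]
for row in pathway_neighbourhood(edges, "K00399"):
    print(row)
```

Each step scans the whole edge list, which is fine for a sketch; a graph database follows stored relationships from each node instead.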
Performance
Performance comparison

Dataset   Proteins   Client Neo4j (ms)   Server MySQL (ms)
A         1388       247                 2584
B         4946       277                 12095
C         9393       280                 19740
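The timings on this slide can be turned into speedup factors with a few lines of arithmetic. The sketch assumes that 247, 277 and 280 ms are the Neo4j timings and 2584, 12095 and 19740 ms the MySQL timings for datasets A, B and C, which is how the chart legend reads; the variable names are illustrative.

```python
# dataset -> (Neo4j ms, MySQL ms); the column assignment is an assumption
# based on the legend order on the slide.
timings_ms = {
    "A": (247, 2584),
    "B": (277, 12095),
    "C": (280, 19740),
}

# Speedup = MySQL time divided by Neo4j time, rounded to one decimal place.
speedups = {
    dataset: round(mysql_ms / neo4j_ms, 1)
    for dataset, (neo4j_ms, mysql_ms) in timings_ms.items()
}

for dataset, factor in speedups.items():
    print(f"Dataset {dataset}: MySQL takes about {factor}x as long as Neo4j")
```

Under that reading the factor grows with dataset size, from roughly 10x for dataset A to roughly 70x for dataset C, while the Neo4j time itself rises only from 247 ms to 280 ms.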
Software
Current State
• Tested by four different labs (Magdeburg, Leipzig, Helsinki, Alghero)
• Documentation and manuscript in preparation
• Testing and fixing bugs for release version 1.0