DESCRIPTION
Presentation for Network Biology SIG 2013 by David Amar, Tel-Aviv University, Israel. “Algorithms for Mapping Modules in Pairs of Biological Networks”
1
From gene networks to module-maps:
improving interpretability and prediction in systems biology
David Amar
School of Computer Science
Tel Aviv University
July 2013
2
Biological interaction networks
Nodes: genes/proteins or other molecules
Edges: based on evidence for interaction
Voineagu et al. 2011 Nature
Breker and Schuldiner 2009
Gene co-expression
Protein-protein interaction
Genetic interaction
Goal: Integrated analysis of different types of networks
3
Integration of networks
Better picture, reduces noise
Traditional approaches:
  Look for “conserved” clusters: co-clustering (Hanisch et al. 2002); JointCluster (Narayanan et al. 2011)
  Look for clusters with special properties: MATISSE (Ulitsky and Shamir 2008)
4
Analysis of network pairs
Interaction types can differ: within (“positive”) vs. between (“negative”) functional units
Input: networks P, N with the same vertex set
Goal: summarize both networks in a module map
  Node (module): a gene set highly connected in P
  Link: two modules highly interconnected in N
Between-pathway models: Kelley and Ideker 2005; Ulitsky et al. 2008; Kelley and Kingsford 2011; Leiserson et al. 2011
[Figure: networks P and N]
5
Algorithms
Different definitions for the links and for the optimization objective function
The problems are NP-hard
Approximation is also hard (weighted graphs)
Our algorithmic strategy:
  Initiators: find a good initial solution
  Improvers: refine by merging/excluding modules
6
Initiators
Cluster P:
  Hierarchical
  Node addition
Find linked module pairs:
  DICER: local search in P and N (Kelley and Ideker 2005; Amar et al. 2013)
  MBC-DICER: find bicliques
    Define candidate sets U and V that are bicliques in N
    Exhaustive solver (FP-MBC, Li et al. 2007); requires tuning
7
Local Improvement (DICER algorithm, Amar et al. PLoS CB 2013)
Link: the sum of N weights between the modules is positive
Goal: enlarge links
Greedy approach: merge module links or add single nodes to a link
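
The link definition and one greedy enlargement step can be sketched as follows. This is only an illustration, not the DICER implementation: it assumes N is stored as a dictionary of pairwise weights, it ignores the requirement that an added node also fit its module in P, and the names `link_weight` and `best_node_to_add` and the toy weights are hypothetical.

```python
from itertools import product

def link_weight(n_weights, module_u, module_v):
    """Sum of N weights over all pairs (u, v) with u in module_u, v in module_v."""
    total = 0.0
    for u, v in product(module_u, module_v):
        # N is undirected: each pair is stored once under a sorted key.
        total += n_weights.get(tuple(sorted((u, v))), 0.0)
    return total

def best_node_to_add(n_weights, module_u, module_v, candidates):
    """Candidate whose addition to module_u most increases the link weight
    to module_v, or None if no candidate has a positive gain.
    Simplification: module_u is only the receiving side; a real method would
    also check how well the node fits module_u in P."""
    best, best_gain = None, 0.0
    for node in candidates:
        gain = link_weight(n_weights, {node}, module_v)
        if gain > best_gain:
            best, best_gain = node, gain
    return best

# Toy weighted N network (hypothetical values).
n_weights = {
    ("a", "c"): 1.0, ("a", "d"): 0.5, ("b", "c"): 0.8, ("b", "d"): -0.2,
    ("c", "e"): 0.9, ("d", "e"): 0.7, ("c", "f"): -1.0, ("d", "f"): -0.5,
}
U, V = {"a", "b"}, {"c", "d"}
print(link_weight(n_weights, U, V) > 0)   # U and V form a link if this is True
print(best_node_to_add(n_weights, U, V, ["e", "f"]))
```

A full greedy improver would repeat such steps, and the analogous merge step, until no move increases the link weight.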
8
Global analysis: node vs. module
Null hypothesis: edges between v and M are drawn randomly (n = deg(v))
Hypergeometric p-value
Options for weighted graphs:
  Use the Wilcoxon rank-sum test
  Set a threshold and use the same (hypergeometric) test
[Figure: node v and its edges to module M and to the nodes not in M]
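
As a rough illustration of the node vs. module test on an unweighted graph, the hypergeometric tail can be computed directly. The sketch assumes that v is not a member of M and that the population is all other nodes; the function names are hypothetical and the example sets are made up.

```python
from math import comb

def hypergeom_tail(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    total = comb(population, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total

def node_vs_module_pvalue(neighbors_of_v, module, n_nodes):
    """p-value for the number of v's neighbors that fall inside `module`.
    Assumes v itself is not in `module`; draws = deg(v)."""
    k = len(neighbors_of_v & module)
    return hypergeom_tail(k, n_nodes - 1, len(module), len(neighbors_of_v))

# Toy example: 21 nodes, a module of 5 nodes, v has 4 neighbors, 3 of them in M.
module = {1, 2, 3, 4, 5}
neighbors = {1, 2, 3, 10}
print(node_vs_module_pvalue(neighbors, module, n_nodes=21))
```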
9
Global analysis: module vs. module
Calculate a p-value for each node in V and each node in U
Merge the p-values using Fisher’s method
Under the null hypothesis, the Fisher statistic -2 Σ ln(p_i) follows a chi-square distribution with 2k degrees of freedom, where k is the number of p-values
[Figure: modules U and V and the other nodes]
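
Fisher's method can be sketched with the standard library only, because the chi-square survival function has a closed form when the degrees of freedom are even (2k for k p-values). The sketch assumes independent, strictly positive p-values, which is the assumption of the method itself; the function names are hypothetical.

```python
from math import exp, log

def chi2_sf_even_df(x, df):
    """Survival function P(X > x) of a chi-square variable with even df,
    using the closed form exp(-x/2) * sum_{i < df/2} (x/2)^i / i!."""
    if df <= 0 or df % 2:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i
        total += term
    return exp(-half) * total

def fisher_combine(p_values):
    """Combine independent p-values (each > 0) with Fisher's method."""
    stat = -2.0 * sum(log(p) for p in p_values)
    return chi2_sf_even_df(stat, 2 * len(p_values))

# Per-node p-values for the nodes of U and V (made-up numbers).
print(fisher_combine([0.01, 0.03, 0.2, 0.04]))
```

With a single p-value the combined result equals that p-value, which is a quick sanity check on the degrees of freedom.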
10
Global analysis
Given a set of modules M and a set of significant links L, the solution score: [equation image]
Improvement steps: merge modules if the score improves (select the best step iteratively)
Fast and accurate analysis:
  Decide when to recalculate p-values
  Perform many merges simultaneously
11
Experimental Results
12
(0) Simulations
Graphs with 500 nodes; edge weight 1, non-edge weight -1
Plant a tree map with 6 modules (module size 10-20)
Add random Gaussian noise (mean 0, SD = 1.2), additional modules, and bicliques
[Figure: Jaccard scores (axis 0 to 1) of the initiators MBC-DICER, DICER5, Hierarchical, NodeAddition, and DICER in three panels: Global, Local, Initiator only]
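
A simulation of this kind might be set up as below. This is a sketch under several assumptions that the slides do not spell out: within-module pairs are the planted edges of P, pairs between linked modules are the planted edges of N, the tree over modules is drawn at random, and the extra modules and bicliques are omitted. How the Jaccard score was aggregated is also not stated; `mean_best_jaccard` is one plausible, hypothetical choice.

```python
import random
from itertools import combinations

def simulate(n_nodes=500, n_modules=6, size_range=(10, 20), sd=1.2, seed=0):
    """Plant a tree-shaped module map and return (modules, links, P, N)."""
    rng = random.Random(seed)
    nodes = list(range(n_nodes))
    rng.shuffle(nodes)
    # Carve disjoint modules out of the shuffled node list.
    modules, start = [], 0
    for _ in range(n_modules):
        size = rng.randint(*size_range)
        modules.append(set(nodes[start:start + size]))
        start += size
    # Random tree over the modules: module i links to one earlier module.
    links = {(rng.randrange(i), i) for i in range(1, n_modules)}
    module_of = {v: m for m, mod in enumerate(modules) for v in mod}
    P, N = {}, {}
    for u, v in combinations(range(n_nodes), 2):
        mu, mv = module_of.get(u), module_of.get(v)
        same = mu is not None and mu == mv
        linked = (mu is not None and mv is not None
                  and (min(mu, mv), max(mu, mv)) in links)
        # Edge weight 1, non-edge weight -1, plus Gaussian noise.
        P[u, v] = (1.0 if same else -1.0) + rng.gauss(0.0, sd)
        N[u, v] = (1.0 if linked else -1.0) + rng.gauss(0.0, sd)
    return modules, links, P, N

def jaccard(a, b):
    """Jaccard index of two node sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0

def mean_best_jaccard(planted, found):
    """Average, over planted modules, of the best Jaccard match among found modules."""
    if not found:
        return 0.0
    return sum(max(jaccard(p, f) for f in found) for p in planted) / len(planted)

modules, links, P, N = simulate()
print(len(modules), "modules,", len(links), "links,", len(P), "node pairs")
```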
13
(1) Yeast PPI and GI networks
3979 genes
P: protein-protein interactions (45,456 edges)
N: negative genetic interactions (76,267 edges)
Local improvers: poor results (less than 3 links)
Results for the global improver:

Initiator     Modules  Gene coverage  Max module size  Enriched GO terms  Enriched modules (%)  Enriched links (%)  Links
MBC-DICER     100      946            49               243                87                    80                  430
DICER5        103      957            46               249                82                    74                  438
DICER         104      837            34               192                67                    61                  498
Hierarchical  123      877            30               186                68                    59                  394
NodeAddition  102      950            49               240                83                    79                  430
14
The yeast module map
Link p < 10^-50
Chromatin-related hubs, similar to Baryshnikova et al. 2011
15
The top links in the map (p < 10^-70)
Between complexes
Between subcomplexes
16
Comparison to extant methods
Analysis of the Collins et al. 2007 data
Comparing to extant methods that exploit both positive and negative GIs and their weights

Algorithm                            Modules  Gene coverage  Max module size  Enriched GO terms  Enriched modules (%)  Enriched links (%)  Links
MBC-DICER (Global)                   32       238            20               53                 84                    79                  67
Genecentric (Leiserson et al. 2011)  116      1248           25               39                 63                    43                  58
Kelley and Kingsford 2011            117      355            17               32                 17                    6                   403
17
(2) Arabidopsis PPI & MD networks
P: PPIs. N: metabolic dependencies (Tzfadia et al. 2012)
Discover protein complexes and their metabolic links
18
Using the module map for function prediction
Modules were validated by their ability to predict gene functions in MapMan
Function assignment: the best assignment of the gene’s module
LOOCV: precision and recall > 80%

New predictions:
Gene       MapMan term                                Module p-value
AT5G48000  sulfur-containing.glucosinolates           0.0001
AT5G42590  sulfur-containing.glucosinolates           0.0001
AT2G30870  redox.ascorbate and glutathione.ascorbate  0.0028
AT4G15440  isoprenoids.carotenoids                    0.0002
AT1G62830  isoprenoids.carotenoids                    0.0003
AT4G01690  isoprenoids.carotenoids                    0.0003
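
The leave-one-out evaluation can be illustrated roughly as follows. The sketch swaps in a simple majority vote for the module's best term, whereas the talk picks the module's best assignment (reported with a module p-value), and it assumes one term per gene. The gene names, terms, and the function name are hypothetical.

```python
from collections import Counter

def loocv_module_prediction(modules, annotation):
    """Hide each annotated gene in turn, predict its term as the most common
    term among the other annotated genes of its module, and return
    (precision, recall). Majority vote stands in for an enrichment test."""
    correct = predicted = total = 0
    for module in modules:
        for gene in module:
            if gene not in annotation:
                continue
            total += 1
            votes = Counter(
                annotation[g] for g in module if g != gene and g in annotation
            )
            if not votes:
                continue  # no other annotated gene: no prediction is made
            predicted += 1
            term, _ = votes.most_common(1)[0]
            correct += term == annotation[gene]
    precision = correct / predicted if predicted else 0.0
    recall = correct / total if total else 0.0
    return precision, recall

# Toy modules and single-term annotations (made up for illustration).
toy_modules = [{"g1", "g2", "g3", "g4"}, {"g5", "g6"}, {"g7"}]
toy_annotation = {"g1": "A", "g2": "A", "g3": "A", "g4": "B",
                  "g5": "C", "g6": "C", "g7": "D"}
print(loocv_module_prediction(toy_modules, toy_annotation))
```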
19
(3) Human case-control profiles
Data: expression profiles of lung cancer (blood)
P: multi-phenotype co-expression network; N: differential correlation (DC): change in correlation in disease vs. controls
Cross-validation: most links show high DC in the test set
Link example:
  Breakage of immune activation in cancer (enrichment q-value < 1E-10)
  Enrichment for NSCLC-specific causal miRNA (mir-34 family, p = 0.002, miR2Disease database)
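
Differential correlation for a single gene pair can be sketched as the difference between the Pearson correlation in cases and in controls. The sign convention (cases minus controls), the function names, and the toy expression values are assumptions for illustration; the talk does not specify the exact DC statistic.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation of two equal-length sequences (non-zero variance assumed)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def differential_correlation(cases, controls, gene_a, gene_b):
    """Correlation of the gene pair in cases minus its correlation in controls."""
    r_cases = pearson(cases[gene_a], cases[gene_b])
    r_controls = pearson(controls[gene_a], controls[gene_b])
    return r_cases - r_controls

# Toy expression profiles: genes -> values across samples (made-up numbers).
controls = {"geneA": [1, 2, 3, 4, 5], "geneB": [2, 4, 6, 8, 10]}
cases = {"geneA": [1, 2, 3, 4, 5], "geneB": [5, 3, 4, 1, 2]}
print(differential_correlation(cases, controls, "geneA", "geneB"))
```

A strongly negative value here means the pair is co-expressed in controls but not in cases, which is the kind of "breakage" a link in N is meant to capture.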
20
Summary
Integration of networks
  Considering different interaction types
  A summary module map
Algorithms
  Initiators
  Improvers
Algorithms perform well in simulations and real data
  PPI+GI
  PPI+MD
  Human disease: correlation and differential correlation
Next steps (?)
  Cytoscape app (maybe next year…)
  Can we use module maps instead of gene networks for network inference?
21
Thank you!
Ron Shamir