OVERALL - PROJECT SUMMARY

The mission of the National Resource for Network Biology (nrnb.org) is to advance the science of biological networks by providing leading-edge bioinformatic methods, software and infrastructure, and by engaging the scientific community in a portfolio of collaboration and training opportunities. Biomedical research is increasingly dependent on knowledge of biological networks of multiple types and scales, including gene, protein and drug interactions, cell-cell and cell-host communication, and vast social networks. The NRNB technologies enable researchers to assemble and analyze these networks and to use them to better understand biological systems and, in particular, how they fail in disease. The NRNB has been funded as an NIGMS Biomedical Technology Research Resource since 2010; the present application is a competitive renewal. During the previous funding period, NRNB investigators introduced a series of innovative methods for network biology including network-based biomarkers, network-based stratification of genomes, and automated inference of gene ontologies using network data. We put in place major software and hardware infrastructure such as the Cytoscape App Store, an online marketplace of network analysis tools; GeneMANIA, a popular online network query and gene function prediction tool; cBioPortal, a powerful web portal for analyzing cancer mutations in a network and pathway context; and a high-performance computing cluster for network analysis. We established a large portfolio of research collaborations (75 currently active); two annual international network biology meetings, the Network Biology SIG and the Cytoscape Symposium; and a rich set of training opportunities and published network analysis protocols.
Over the next five years, we will seek to catalyze major phase transitions in how biological networks are represented and used, working across three broad themes: (1) From static to differential networks, (2) From descriptive to predictive networks, and (3) From flat to hierarchical networks bridging across scales. All of these efforts leverage and further support our growing stable of network technologies, including the popular Cytoscape network analysis infrastructure.

Overall Vision for NRNB: 2015-2020

OVERALL - PROJECT NARRATIVE

Although we are all familiar with some of the components of biological systems (DNA, proteins, cells, organs, individuals), understanding life involves more than just cataloging its component parts. It is critical to understand the many interactions between these parts, and how this complex network of parts gives rise to biological functions and responses, health and disease. The National Resource for Network Biology provides the biomedical research community with a shared set of computational tools for studying a wide range of biological networks, including networks of genes and proteins, networks of cell-to-cell communication, and social networks of human individuals.

OVERALL - RESEARCH STRATEGY

The mission of the National Resource for Network Biology (NRNB; http://nrnb.org/) is to advance the new science of Biological Networks by providing leading-edge bioinformatic methods, software and infrastructure, and stimulating the scientific community with a portfolio of collaboration and training opportunities. Biomedical research is increasingly dependent on knowledge of biological networks of multiple types and scales, including gene, protein and drug interactions, cell-cell and cell-host communication, and vast social networks. NRNB technologies enable researchers to assemble these networks and to use them to understand and predict how biological systems function and fail. The NRNB has been funded as an NIGMS Biomedical Technology Research Resource since 2010; the present application is a competitive renewal. It supports six major components, including:

§ Technology Research and Development of new bioinformatic methods and tools (3 TRDs)
§ Driving Biomedical Projects closely integrated with research and development (10 DBPs)
§ Collaboration and Service Projects within the greater scientific community (70+ CSPs)
§ Software and hardware infrastructure for Network Biology
§ Dissemination of methods, leading research, bioinformatic tools and databases
§ Training via symposia, workshops and a summer code development program

Technology Research and Development of new bioinformatic resources is the central activity of the NRNB. As a significant deliverable, mature bioinformatic methods developed by each of the three TRD projects are disseminated as freely available open-source software, many as apps for the Cytoscape platform, a popular network analysis package that our team initiated in 2001. Each research and development area is stimulated by an array of collaborators who are generating or using biological network data within Driving Biomedical Projects. Specific DBPs are chosen to ensure that our technology development is actively guided by the needs of the scientific community. Over time, the selection of active DBPs, and the more distal CSPs, is revised and expanded in consultation with our External Advisory Committee. These research efforts are solidified by substantial software and high-performance hardware infrastructure we have put in place over the past several years, along with a broad program of training and dissemination.

During the previous funding period, NRNB investigators introduced a series of innovative methods for network biology including network-based biomarkers1,2, network-based stratification of genomes3,4, and automated inference of gene ontologies using network data5-8. We put in place major software and hardware infrastructure such as the Cytoscape App Store9, an online marketplace of network analysis tools; cBioPortal10, a powerful web portal for analyzing cancer mutations in a network and pathway context; and a high-performance computing cluster for network analysis. We established a large portfolio of research collaborations (75 currently active); two annual international network biology meetings, the Network Biology SIG and the Cytoscape Symposium; and a rich set of training opportunities and published network analysis protocols. All of these efforts leveraged and further supported our popular Cytoscape network analysis infrastructure.

Technology Themes

As a result of these efforts, we are nearing completion of a vision whereby a coordinated suite of tools is able to gather public data on biological networks—including molecular, genetic, functional, and/or social interactions; to visually and computationally integrate these networks with genomics and proteomics data sets; to semi-automatically assemble all of the data into models of biological pathways; to suggest points for experimental validation or clinical intervention; and to publish network models at various stages of refinement online. Although work is still required to mature and iteratively improve on this vision, the stage is now set to critically evaluate the state of the science and to take on a new set of challenges.

During the next funding period, we will seek to bring about a major phase transition in how biological networks are represented and used. In consultation with our External Advisors, we have identified three complementary themes which identify major barriers or areas of need in pushing forward the field of Network Biology, and for which NRNB is well-positioned to catalyze major change:

A. Differential Networks. In this TRD area, we will develop a series of tools and methodologies for conducting differential analyses of biological networks perturbed under multiple conditions. The novel algorithmic methodologies enable us to make use of high-throughput proteomic level data to recover biological networks under specific biological perturbations. The software tools developed in this project enable researchers to further predict, analyze, and visualize the effects of these perturbations and alterations, while enabling researchers to aggregate additional information regarding the known roles of the involved interactions and their participants.

B. From Descriptive to Predictive Networks. Genomics is mapping complex data about human biology and promises major medical advances. However, the routine use of genomics data in medical research is in its infancy, due mainly to the challenges of working with highly complex “big data”. In this TRD theme, we will use network information to help organize, analyze and integrate these data into models that can be used to make clinically relevant diagnoses and predictions about an individual.

C. Multi-scale Network Representations. Although networks have been very useful for representing molecular interactions and mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale hierarchy of components: proteins are subunits of protein complexes which, in turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this technology research project, we will pursue methods that move Network Biology towards such hierarchical, multi-scale views of cell structure and function.
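
As a rough illustration of what a differential analysis under Theme A could look like, the sketch below compares edge weights of the same network measured under two conditions and keeps only the interactions whose weight changes by at least a cutoff. It is a minimal sketch under stated assumptions, not the NRNB methodology: the function names, the edge-weight representation, the example genes and the 0.5 threshold are all hypothetical.

```python
from typing import Dict, Tuple

Edge = Tuple[str, str]


def normalize(edge: Edge) -> Edge:
    """Order the two endpoints so that (A, B) and (B, A) are the same edge."""
    a, b = edge
    return (a, b) if a <= b else (b, a)


def differential_network(
    cond_a: Dict[Edge, float],
    cond_b: Dict[Edge, float],
    threshold: float = 0.5,  # hypothetical cutoff, chosen only for the example
) -> Dict[Edge, float]:
    """Return edges whose weight differs between conditions by >= threshold.

    Positive values mean the interaction is stronger in condition B,
    negative values mean it is stronger in condition A. An edge missing
    from one condition is treated as having weight 0.0 there.
    """
    a = {normalize(e): w for e, w in cond_a.items()}
    b = {normalize(e): w for e, w in cond_b.items()}
    diff: Dict[Edge, float] = {}
    for edge in set(a) | set(b):
        delta = b.get(edge, 0.0) - a.get(edge, 0.0)
        if abs(delta) >= threshold:
            diff[edge] = delta
    return diff


# Toy interaction scores (hypothetical data) for an untreated and a treated state.
untreated = {("A", "B"): 0.9, ("B", "C"): 0.8, ("C", "D"): 0.1}
treated = {("B", "A"): 0.85, ("B", "C"): 0.1, ("D", "E"): 0.7}

changes = differential_network(untreated, treated)
for edge, delta in sorted(changes.items()):
    print(edge, round(delta, 2))
```

In this toy case the A-B interaction is essentially unchanged and drops out, while B-C (lost on treatment) and D-E (gained on treatment) remain as the differential network.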

Figure 1. Themes of network-based research. The NRNB competitive renewal application is organized around three major themes. Two themes advance technology for systematic assembly of network models that are dynamic (Theme I) and multi-scale (Theme III). These advances in network representation support the remaining theme of using such network knowledge to enable general biological and biomedical predictions, such as translation of genotype to phenotype (Theme II).

These three themes serve as the major organizing principles underlying each of the three TRD projects, along with matching DBPs (Figure 1, Table 1). These themes are timely, as the recent accumulation of molecular network data in mammals, and the concomitant growth of new sequencing and other ‘omics technologies, create remarkable new opportunities for using network-based approaches to understand disease states on a genome-scale in human individuals.

Computational Network Biology: An evolving discipline

Biomedical research is at a critical juncture in which vast amounts of information on molecules and molecular interactions have been collected, but methods to analyze these data are still in their infancy. Technologies for interaction mapping are advancing at a staggering pace. Experimental approaches include yeast two-hybrid assays11-19, affinity purification coupled to mass spectrometry20-29, reverse PCA30, chromatin immunoprecipitation of transcription factor binding31-42, chromatin conformation capture of DNA-DNA interactions43-45, synthetic-lethal and epistatic networks46-57, expression quantitative trait loci58-66, and many others. Apart from the molecular level, biological networks of other types are central to a range of biomedical disciplines. For example, vast social networks connecting friends, co-workers, and family members are being charted to lend insight into how disease is transmitted67-70. Such networks are being powerfully integrated with data from systematic profiling experiments, such as changes in mRNA or protein abundance71-73, protein phosphorylation state74 and metabolite concentrations75 quantified with mass spectrometry, and other advanced techniques. This enormous collection of networks and profiles necessitates bioinformatic frameworks to integrate, filter, visualize, and interpret the data. A wide range of techniques are being developed to analyze network data to address specific problems with relevance to basic biology and human health:

⇒ Gene function prediction. Since many genes still have unknown functions, examining genes in a network context can provide important functional clues. For example, a protein connected to a set of proteins involved in the same biological process is likely to also function in that process76-78.

Table 1: Matrix of technology development and matching biomedical projects. Matrix view of Technology Research and Development project aims (rows) grouped by themes (colors) and Driving Biomedical Projects (columns) also grouped by themes (alternating gray), which are labeled in bottom header row. A matrix cell is filled with a color corresponding to the Technology Theme for every case where a particular DBP is driving a particular TRD aim. Each Technology Theme is driven by multiple biomedical projects; 1-6 DBPs are associated with each TRD aim. Similarly, each Driving Biology Project is matched to 1-4 TRD aims, often across themes.

⇒ Detection of protein complexes and other modular structures. There is clear evidence in biological networks for modularity and principles of higher-order organization79,80. For example, molecular interaction networks have been found to cluster into regions that represent complexes81, and statistically over-represented motifs, such as feedback loops, have been discovered82, some of which have been thoroughly analyzed83,84.

⇒ Network evolution. Increasingly, the study of evolution is moving to the network level. A number of techniques have been developed that identify common network structures and organizational principles that are conserved across species85-88, including our PathBLAST89 and NetworkBLAST90 tools which align protein networks to identify conserved interaction paths and clusters.

⇒ Prediction and further specification of interactions. Statistically significant domain-domain correlations in a protein interaction network suggest that certain domains (and domain pairs) mediate protein binding, allowing prediction of new interactions and binding interfaces91,92. Machine learning has also been applied to predict protein-protein or genetic interactions through integration of diverse types of evidence for interaction93,94.

⇒ Identification of disease subnetworks. A recent and powerful trend is to identify neighborhoods of a biological network (i.e., subnetworks) that are transcriptionally active in disease95-99. Such neighborhoods suggest key pathway components in disease progression and provide leads for further study and potential new therapeutic targets.

⇒ Subnetwork-based diagnosis. It has become clear that subnetworks also provide a rich source of biomarkers for disease classification. Several groups3,100-103 have demonstrated that mRNA profiles can be integrated with protein networks to identify subnetwork biomarkers, i.e., subnetworks of interconnected genes whose aggregate expression levels or nucleotide variants are predictive of disease state.

⇒ Subnetwork-based gene association. Despite the growing number of gene-disease association studies104-106 based on Single Nucleotide Polymorphism (SNP) and Copy Number Variation (CNV) profiling, most human genetic variation remains uncharacterized, both in biological mechanism and in impact on disease pathology. We anticipate that molecular networks will continue to provide a powerful framework for mapping the common pathway mechanisms impacted by a collection of genotypes, as several proofs-of-principle have been developed in this area3,107-114.
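
Two of the ideas in the list above can be sketched in a few lines of code: gene function prediction from network neighbors, and scoring a subnetwork by the aggregate expression of its member genes. The sketch below is illustrative only and assumes a simple majority vote and a simple mean of pre-normalized expression values; the cited methods may use different statistics, and every gene name, sample name and function name here is hypothetical.

```python
from collections import Counter
from statistics import mean
from typing import Dict, List, Optional, Set


def predict_function(
    gene: str,
    network: Dict[str, Set[str]],
    annotations: Dict[str, str],
) -> Optional[str]:
    """Guess a gene's function as the most common annotation among its neighbors.

    Returns None when no neighbor is annotated. Ties are broken arbitrarily.
    """
    votes = Counter(
        annotations[n] for n in network.get(gene, set()) if n in annotations
    )
    if not votes:
        return None
    return votes.most_common(1)[0][0]


def subnetwork_activity(
    members: List[str],
    expression: Dict[str, Dict[str, float]],
) -> Dict[str, float]:
    """Average the expression of the subnetwork's member genes in each sample.

    `expression` maps sample -> gene -> value and is assumed to be
    normalized already (for example, z-scored per gene).
    """
    scores: Dict[str, float] = {}
    for sample, profile in expression.items():
        values = [profile[g] for g in members if g in profile]
        if values:
            scores[sample] = mean(values)
    return scores


# Hypothetical toy network: an unannotated gene with three annotated neighbors.
network = {"YFG1": {"RAD51", "RAD52", "CDC28"}}
annotations = {"RAD51": "DNA repair", "RAD52": "DNA repair", "CDC28": "cell cycle"}
predicted = predict_function("YFG1", network, annotations)

# Hypothetical toy expression data for one tumor and one normal sample.
expression = {
    "tumor_1": {"G1": 2.0, "G2": 1.0, "G3": -3.0},
    "normal_1": {"G1": -1.0, "G2": 0.0, "G3": 0.5},
}
activity = subnetwork_activity(["G1", "G2"], expression)

print(predicted)
print(activity)
```

A per-sample activity score of this kind could then be fed to any standard classifier in place of single-gene expression values; that downstream step is left out here.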

Major Changes from Prior Period of Support

NRNB has progressed in important ways since the initial application was funded in 2010. Numerous minor, and several major, changes to the organization and focus of the resource were prompted during regular reviews by our very engaged External Advisory Committee (2011, 2012, and 2014 meetings). Generally, the EAC has found that “scientific progress has been impressive… [as highlighted by] the spectrum of recent publications” and that “the collaboration and outreach strategy developed by the NRNB has been very effective to make available and accessible network biology methods to those researchers less versed in it” (written EAC report, July 12, 2014). They also have made constructive critiques which were taken seriously and catalyzed changes in trajectory. Below are highlights of critiques summarized from the progression of EAC meetings, each followed by our corresponding actions:

Critique: Year-to-year milestones should be tracked more carefully as each TRD aim progresses.
Action: At the suggestion of the EAC, we now track each TRD aim along a progression of four stages: [1] Identification of a resounding need, [2] Research leading to a prototype method or proof-of-principle, [3] Transition to a robust end-user tool, [4] Dissemination and collaboration to a wide audience.

Critique: Focus not only on network visualization but on network-based analysis and decision making.
Action: Exciting progress was made for using networks as diagnostic biomarkers under original TRD A1,3,115. This enabled us, in the renewal, to seed an entire Theme 2 (TRD 2) on ‘predictive’ networks.

Critique: Encourage increased citation of NRNB tools such as Cytoscape. Track classroom use of Cytoscape.
Action: We refactored the nrnb.org and cytoscape.org websites to encourage citation and to better capture both citations and non-cited usage. We identified 32 courses worldwide using Cytoscape (2011 – 2014).

Critique: Better track Cytoscape plugins and do more to encourage developers to write them.
Action: We rebranded plugins as ‘Apps’ and obtained an NRNB award supplement allowing development of the Cytoscape App Store (apps.cytoscape.org), which has greatly stimulated App development and use.

Critique: Transition away from paper counts and toward metrics of community enablement and integration.
Action: We now track metrics such as number of NIH grants using NRNB tools (3,800), third-party software development we enabled (>200 Cytoscape Apps in App Store), and number of Google and NRNB Academy mentees (59). We routinely solicit qualitative ‘testimonials’, of which we have received close to 100.

Critique: Connect to synergistic resources like GenomeSpace and IGV and enable network sharing on the Web.
Action: Cytoscape is now connected to RBVI's Chimera, Sage Bionetworks’ Synapse, GenomeSpace (a CSP which transitioned to independent funding from NHGRI, PI Jill Mesirov), IGV and the Cancer Gene Encyclopedia. We can now export networks to these and other resources using cytoscape.js and d3.js.

Critique: Optimize existing TRDs and DBPs to focus on the core NRNB strengths. Include new, more distal DBPs.
Action: In the renewal, TRDs have been organized around three themes which capture major challenges in Network Biology with significant potential impact. We have added key DBP sites including Sage Bionetworks (Stephen Friend), Dana Farber (Marc Vidal), and Stanford (Mike Cherry). These changes in TRDs and DBPs have also prompted changes to our key personnel: James Fowler, who formerly ran a TRD related to social network analysis, is now positioned as a social networks DBP, which allows general network analysis methodology in multiple TRDs to impact social networks as one driving project alongside other biological network types. We have transitioned former TRD D, related to gene regulatory network inference, into a DBP with collaborators from Sage/DREAM.

NRNB Leadership and Administration

NRNB is led through the coordinated efforts of a team of four pre-eminent faculty in Network Biology and Bioinformatics: Dr. Trey Ideker (University of California San Diego), Dr. Alexander Pico (Gladstone Institute / University of California San Francisco), Dr. Gary Bader (University of Toronto), and Dr. Chris Sander (Memorial Sloan Kettering Cancer Center). These investigators are well-known thought leaders in the field with strong track records of innovative methods development and collaboration in network biology tools and applications.

NRNB administration is based at the University of California with Dr. Ideker as Director and Alexander Pico as Executive Co-Director. Ideker and Pico together coordinate activities among host institutions under a Consortium/Contractual arrangement with UCSD. The four leaders are distributed across major biomedical research sites spanning both U.S. coasts and Canada. This arrangement positions NRNB as a truly national and international organization with broad exposure to, and involvement in, a spectrum of Driving Biomedical Projects and opportunities for Collaboration and Service. Because NRNB expertise is geographically distributed, each NRNB site can harness the best local talent as well as stimulate and respond to collaborative projects within its local research community. This is an inversion of the usual arrangement for resource centers, in which the center is geographically localized and reaches out to investigators located elsewhere. In contrast, NRNB is already global and thus “thinks globally, acts locally.”

The credentials of NRNB Director Dr. Trey Ideker stem from a career of pioneering work in systems biology and a very strong track record of creating innovative and widely-used bioinformatic tools. Ideker received bachelor’s and master’s degrees in computer science from MIT and a PhD in biomedical sciences working with Dr. Lee Hood at the University of Washington.
Since 2001 Ideker has led an independent research laboratory, first as a Whitehead Fellow and then as a UCSD professor, under continuous NIH funding. Ideker’s research is led by the vision that, given the right experimentation and analysis, it will be possible to automatically assemble maps of pathways just as we now assemble maps of genomes. During graduate work, he developed a general iterative framework for how biological systems can be systematically perturbed, interrogated and modeled. This framework laid the foundation for many studies in the future discipline of Systems Biology116,117. Using his new Cytoscape platform, he demonstrated that biological networks could be integrated with gene expression to systematically map pathways118 and aligned, like sequences, to reveal conserved and divergent functions89,119-123. He showed that the best biomarkers of disease are typically not single proteins but aggregates of proteins in networks100. Ideker was named a Top Ten Innovator of 2006 by Tech. Review and received the 2009 Overton Prize, the highest award in computational biology. He serves on the Editorial Boards of Cell, Cell Reports, Molecular Systems Biology, and PLoS Comp. Bio. Directing NRNB builds on previous successful leadership roles, including Director of the San Diego Center for Systems Biology, Genome Program Leader at the UCSD Cancer Center, former Division Chief of Genetics in the Dept. of Medicine, and regular organizer of major conferences such as Cold Spring Harbor Laboratory Systems Biology: Networks and the annual Protein Networks Track at ISMB.

NRNB Executive Director Dr. Alexander Pico is the Associate Director of Bioinformatics at the Gladstone Institutes on the UCSF Mission Bay Campus, where he leads a Systems Biology Group. After obtaining a PhD in molecular neurobiology and biophysics from Rockefeller University in 2003, Dr. Pico joined Gladstone as a postdoctoral fellow in the laboratory of Dr. Bruce Conklin. Dr. Pico develops software tools and resources that help analyze, visualize and explore biomedical data in the context of biomolecular networks. He cofounded WikiPathways in 2007 and has contributed to the development of Cytoscape since 2006. Dr. Pico received the David Rockefeller Award in 2001, the Howard Hughes Medical Institute Predoctoral Fellowship from 1999-2003, and the David and Mary Phillips Distinguished Postdoctoral Fellowship for 2005-2006.

NRNB Investigator Dr. Gary Bader works on biological network analysis and pathway information resources as an Associate Professor at The Donnelly Centre at the University of Toronto. He helped found the field of network biology with early contributions to network visualization and modeling124-127, database resources128 and analysis81. He was included twice (for Biology/Biochemistry and Computer Science) on the Thomson Reuters list of most influential scientists worldwide in 2014 (www.highlycited.com). His research focuses on identifying causal pathways in disease and in precision medicine (predicting patient outcome based on genetic, molecular, pathway and clinical data). For example, using his methods, his team identified novel pathways involved in autism spectrum disorder129 and the first potential therapy for a devastating pediatric brain cancer130. He completed post-doctoral work in Chris Sander’s group in the Computational Biology Center at Memorial Sloan Kettering Cancer Center in New York. Dr. Bader developed the Biomolecular Interaction Network Database (BIND) during his Ph.D. in the lab of Christopher Hogue in the Department of Biochemistry at the University of Toronto and the Samuel Lunenfeld Research Institute at Mount Sinai Hospital in Toronto. He completed a B.Sc. in Biochemistry at McGill University in Montreal.

NRNB Investigator Dr. Chris Sander is Chair of the Computational Biology Program at the Memorial Sloan Kettering Cancer Center in New York City. Sander was initially trained as a physicist, with a first degree in physics and mathematics from the University of Berlin in 1967, and a PhD in theoretical physics after graduate studies at the State University of New York, the Niels Bohr Institute in Copenhagen and the University of California, Berkeley. Sander has made many contributions to structural bioinformatics including developing tools such as FSSP and DSSP. DSSP, Sander’s foundational algorithm for analyzing protein structures, is still used 17 years later as a standard for identifying protein secondary structures in experimental 3D structures. In 1986, he became department chair and founder of the Biocomputing Program at the European Molecular Biology Laboratory (EMBL) in Heidelberg with the support of Sydney Brenner. He was a founding member of the International Society for Computational Biology (ISCB) when the society was established in early 1997, and he is a former Executive Editor for the journal Bioinformatics. In 2010 he was awarded the ISCB Accomplishment by a Senior Scientist Award.

External Advisory Committee. The above leadership team is co-advised by an External Advisory Committee (EAC) which has historically been very active. The EAC consists of:

§ Dr. Stephen Friend, Founder and President of Sage Bionetworks (EAC Chair)
§ Dr. David Hill, Associate Director of the Harvard Center for Cancer Systems Biology
§ Dr. Tamara Munzner, Professor of Computer Science at University of British Columbia
§ Dr. Nik Schork, Director of Human Biology, J. Craig Venter Institute
§ Dr. Gustavo Stolovitzky, Systems Biology Group Leader at IBM/Watson Research Center
§ Dr. Marian Walhout, Professor of Molecular Medicine at University of Massachusetts
§ Dr. Ravichandran Veerasamy, NIGMS Program Representative

Each of these members brings relevant, world-renowned expertise and leadership in functional genomics (Friend), protein networks and interactomics (Walhout, Hill), human genetics (Schork), information visualization (Munzner), and inference of signaling and transcriptional regulatory networks (Stolovitzky founded the well-known DREAM competition). The EAC met with NRNB leadership and investigators to review progress at three separate face-to-face meetings during the past funding period. The EAC also plays a critical role in the continual evolution of DBPs and CSPs, by suggesting new collaborative opportunities in the field and approving all substantive changes to this lineup. Projects are evaluated based on their ability to yield joint publications between resource personnel and their collaborators, with a strong preference towards DBPs that are already NIH funded. For the DBPs described in this proposal, all are already supported separately by the NIH or Genome Canada.

Size of the Research Community to be Served by NRNB

The community of scientists who might be considered “network biologists” has grown rapidly over the past few years. As an illustration of how pervasive networks have become, the U.S. National Institutes of Health currently funds 19,759 active research grants covering the topic “protein protein interactions” which, remarkably, is up from 3,076 at the time of the last NRNB submission four years ago in fall 2010. Network research is also actively supported by a variety of initiatives from other major national and supra-national funding agencies, such as the NSF, DOE, EU, and private foundations. As of 2014, approximately one-third of papers presented at the major Bioinformatics conferences (e.g., ISMB, RECOMB, ECCB) have focused on novel methodologies related to analysis or inference of biological networks; this number is up from less than one tenth a decade ago. Network Biology is the focus of several major conferences each year, recently including the “Systems Biology: Networks” meeting (March 13-16 2013, CSHL); “Computational analysis of protein-protein interactions” (Sept 29 - Oct 3 2014; organized by EMBO), and “NetBio” (July 1 -15 2014, organized by NRNB staff at ISMB).

Another indicator of the growing community of network research is usage of software tools. For instance, use of Cytoscape has increased at a rate of ~50% per year since 2003. In the past year alone, Cytoscape has been downloaded ~90,000 times and the website has had 370,000 sessions and 880,000 pageviews. The NRNB-built and maintained App Store has delivered over 170,000 apps to Cytoscape users since its launch in mid-2012. The original Cytoscape paper131 and the more recently released update papers131-133 have together been cited over 12,000 times and used un-cited in many more. Cytoscape analyses appeared in ~50 Science/Nature/Cell articles in 2013. Figure 2 shows how a subset of these works has been supported by a collection of different NIH institutes as well as other funding agencies. These increases in software usage and development closely parallel the increases in availability of large-scale network information and its potential impacts in biomedicine.

Thus, the importance of Network Biology is very much appreciated by basic scientists and clinicians, many of whom use interaction databases and network analysis tools in their day-to-day research. On the other hand, many other investigators (basic, clinical, and pharmaceutical) as yet have little knowledge of recent developments in network biology and would benefit greatly from outreach and education centered on network-based technologies. With continued support of the NRNB National Resource (focused on technology development supporting research, collaboration, education, and dissemination), the Network Biology community could potentially become a great deal larger.

Progress Report Highlights

The initial period of support (Sept 2010 - Aug 2015) has led to the establishment of a Network Biology resource which provides a unique and fertile environment for bioinformatics research, tool development, dissemination, collaboration and training. The initial years have been very productive by both quantitative metrics (cumulative) and research narrative:

§ Total publications: 83 total, 16 in Nature / Science / Cell family; 63 collaborative with ≥ 2 faculty
§ Cytoscape gets 8,000 dls/mo, 500 cites/yr and has supported 3,800 NIH grants
§ Number of network analysis Apps in Cytoscape App Store: 210
§ 149 collaborations with external investigators on diverse topics (75 active)
§ 92 training events in 12 countries, including 15 course lectures by NRNB staff
§ 70 Google Summer of Code projects = $304,000 invested by Google
§ Workshops: 39 total, ~1,000 attendees
§ Symposia: 7 total, ~700 attendees

The narrative below highlights major progress under each component of the resource, including new technology development, infrastructure, collaborative activities, dissemination, and training:

New Technology: Network Approach to Building Gene Ontologies. NeXO (The Network Extracted Ontology) uses a principled computational approach which integrates evidence from hundreds of thousands of individual gene and protein interactions to construct a complete hierarchy of cellular components and processes5,7. This data-derived ontology aligns with known biological machinery in the GO Database and also uncovers many new structures (Figure 3).
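To make the idea of deriving a hierarchy from interaction data concrete, the following is a minimal sketch and not the NeXO algorithm itself. It swaps in a much simpler stand-in technique (single-linkage grouping: connected components of the interaction graph at successively looser score thresholds), and the gene names, scores, and thresholds are invented for illustration.

```python
from collections import defaultdict


def components(genes, edges, threshold):
    """Connected components (size > 1) using only edges with score >= threshold."""
    parent = {g: g for g in genes}

    def find(g):
        # Union-find lookup with path halving.
        while parent[g] != g:
            parent[g] = parent[parent[g]]
            g = parent[g]
        return g

    for a, b, score in edges:
        if score >= threshold:
            parent[find(a)] = find(b)

    groups = defaultdict(set)
    for g in genes:
        groups[find(g)].add(g)
    return [frozenset(members) for members in groups.values() if len(members) > 1]


def build_hierarchy(edges, thresholds):
    """Map each term (a frozenset of genes) to its smallest enclosing term, or None."""
    genes = {g for a, b, _ in edges for g in (a, b)}
    terms = set()
    for t in thresholds:
        terms.update(components(genes, edges, t))
    hierarchy = {}
    for term in terms:
        supersets = [other for other in terms if term < other]
        hierarchy[term] = min(supersets, key=len) if supersets else None
    return hierarchy


# Hypothetical toy interactions: (gene, gene, interaction score).
edges = [
    ("A", "B", 0.9), ("B", "C", 0.8),  # tightly linked group
    ("D", "E", 0.9),                   # second tight group
    ("C", "D", 0.3),                   # weak link joining the two groups
]
hierarchy = build_hierarchy(edges, thresholds=[0.8, 0.3])
for term, parent in hierarchy.items():
    print(sorted(term), "->", sorted(parent) if parent else None)
```

In this toy case the two tight groups become child terms of a single broader term. A real ontology is a directed acyclic graph in which a term may have several parents, which this tree-shaped sketch does not capture.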

[Figure 2 pie chart, number of grants by funding agency: NIGMS 1005; NCI 459; NCRR 275; NIAID 260; NIDDK 210; NHLBI 202; NHGRI 164; NIEHS 157; NLM 117; NIMH 97; Wellcome Trust 95; NINDS 89; Canadian Institutes of Health Research 85; NIDA 72; Others 541]

Figure 2: Use of Cytoscape by grantees of funding agencies. We have been able to identify at least 3,800 active grants that depend on Cytoscape, up from ~1,500 18 months ago. Information was extracted from the Acknowledgments section of journal articles. Each pie slice represents the percentage of this total support provided by a particular agency. Over 90% of grants are funded by NIH institutes.
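As a worked check of the Figure 2 numbers, the short sketch below recomputes the total and the per-agency percentages from the grant counts printed in the chart. The variable names are illustrative only, and the "Others" slice is kept as a single bucket because the source does not break it down.

```python
# Grant counts per agency, copied from the Figure 2 pie chart labels.
grants_by_agency = {
    "NIGMS": 1005, "NCI": 459, "NCRR": 275, "NIAID": 260, "NIDDK": 210,
    "NHLBI": 202, "NHGRI": 164, "NIEHS": 157, "NLM": 117, "NIMH": 97,
    "Wellcome Trust": 95, "NINDS": 89,
    "Canadian Institutes of Health Research": 85, "NIDA": 72, "Others": 541,
}

total = sum(grants_by_agency.values())

# Percentage of all identified grants attributed to each agency (one pie slice each).
shares = {agency: 100 * count / total for agency, count in grants_by_agency.items()}

print(f"Total grants: {total}")
for agency, pct in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{agency}: {pct:.1f}%")
```

The counts sum to 3,828, consistent with the "at least 3,800" figure quoted in the caption.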

New Technology: Network-based stratification of tumor mutations. NBS is a method for stratification (clustering) of patients in a cancer cohort based on somatic mutations and a gene interaction network3. The method uses network propagation to integrate genome-wide somatic mutation profiles for each patient over a gene interaction network, and a clustering approach based on non-negative matrix factorization to derive a biologically meaningful stratification of a patient cohort.

Key Infrastructure: Cytoscape. Our team has worked very closely together since 2002 to develop Cytoscape, a standard visualization, integration and analysis package for biological networks (http://www.cytoscape.org/). Our previous Cytoscape development efforts, which predate NRNB, demonstrate the strong commitment of our leadership team to open-source bioinformatic tools. Our ability to establish a strong resource center owes in large part to the groundwork we have put in place over the previous decade of Cytoscape development. During the funding period, NRNB helped expand the Cytoscape desktop application into a suite of related tools, Apps, websites and databases which we propose to call the Cytoscape Cyberinfrastructure.

Key Infrastructure: Establishment of the NRNB Compute Cluster. Over the performance period we worked closely with the San Diego Supercomputer Center (SDSC) to establish a high-performance computing environment for network biology applications. Through NRNB funds and matching SDSC support, we constructed an advanced cluster centered on 512 cores (64GB and 256GB HP SL230s Gen8 blade nodes) with 128TB attached storage. This infrastructure has been highly used and critical for new technology development, as manipulation, search, and statistical scoring of large network structures is typically compute intensive. We propose to renovate and expand this infrastructure in the next funding period.

Dissemination: The Cytoscape App Store.
A highlight of NRNB Dissemination efforts is the Cytoscape App Store (http://apps.cytoscape.org/) which was developed under supplemental funding to the main NRNB award. The goals of the App Store are to highlight the important features that apps add to Cytoscape, to enable researchers to find apps they need, and to let developers promote their apps. It has stimulated a sizable community of Cytoscape App developers, enabling over 200 third-party network analysis and visualization tools to date.

Dissemination: The cBio Cancer Genomics Portal (http://cbioportal.org) is an open-access resource for interactive exploration of multi-dimensional cancer genomics data sets, currently providing access to data from more than 5,000 tumor samples from 20 cancer studies10. The cBio Cancer Genomics Portal empowers researchers to translate these rich data sets into biologic insights and clinical applications, for instance by viewing the molecular network context of cancer mutations using cytoscape.js for the web. Following its introduction in 2012, the portal has been used by thousands of basic cancer researchers, and increasingly clinicians, wishing to understand the prevalence, biological context, and outcome of a set of tumor mutations.

Key DBP Collaboration: Synthetic Genetic Analysis of Budding Yeast. NRNB investigator Gary Bader had a very productive DBP with scientists at the University of Toronto (Charles Boone, Brenda Andrews) leading to the first complete genetic interaction network for a cell48,134. This reference map provides a model for expanding genetic network analysis to higher organisms and is stimulating valuable insights into gene function and drug targets. The 2010 publication of roughly 20% of the complete map was in the top 30 most cited papers of that year (Figure 4).

Figure 3. Journal cover showing top level view of the Network-Extracted Ontology (NeXO), a completely data-driven reconstruction of the literature-based GO.

Key CSP Collaboration: Genetic Networks Remodeled by DNA Damage. Although cellular behaviors are dynamic, the networks that govern these behaviors have been mapped primarily as static snapshots. To explore network dynamics, Ideker collaborated with Nevan Krogan at UCSF to develop an approach called differential epistasis mapping, which creates a genetic network (Figure 5) based on changes in interaction observed between two static conditions. They mapped widespread changes in genetic interaction as the cell responds to DNA damage57. Differential networks chart a new type of genetic landscape that will be invaluable for mapping many different cellular responses to stimuli.

Figure 5. Differential genetic interactions specifically highlight pathways responding to DNA damage. Genetic interaction module map in which protein complexes / pathways (nodes) are connected by differential genetic interactions (edges). “Differential positive” scores indicate genetic interactions that increase after exposure to MMS (green) while “differential negative” scores indicate interactions that decrease (red).

Training: Google Summer of Code. One of our most successful training activities is our participation as a mentoring organization in the Google Summer of Code program. Over the past 4 years, we have recruited over 50 mentors and attracted over 100 students around the world to apply to work on network biology projects. For example, Google funded 18 students recruited by NRNB in 2014, ranking us in the top 10% in terms of the number of slots allocated by Google across all accepted organizations (comparable interest to Apache, Mozilla, R, and Python). Our overall application success rate is 94% on GSoC projects. These training efforts have resulted in 10 student co-authored papers9,135-143.

Training: RECOMB/DREAM Conference and Challenges. Another highlight of our training efforts has been to create an annual international symposium on bioinformatic methods for Network Biology. Formerly this was the Cytoscape Symposium, a three-day annual meeting that attracted 200-300 attendees each year.
In 2013 we began discussions with the well-known RECOMB (Research in Computational Molecular Biology) and ISCB (International Society for Computational Biology) conference families to combine efforts. These discussions led to an agreement to expand the existing RECOMB/ISCB conference series on Systems and Regulatory Biology (originally co-founded by Trey Ideker, Ron Shamir, and Eleazar Eskin in 2003) to integrate sessions including a Cytoscape App Expo and Cytoscape user and developer training events. At the same time, NRNB partnered with the popular DREAM Challenge, an annual competition that nucleates thousands of participants to address important problems in inference and validation of network biology models. DREAM is also co-organized with the RECOMB conference and, since 2013, the winning network inference method(s) have been implemented as Cytoscape Apps by the donated time of an NRNB software developer.

Figure 4. The genetic interaction map of budding yeast, as visualized by a DBP between NRNB and the Boone laboratory.

Future Progress Assessment and Evaluation

NRNB progress will be measured using both narrative and numerical information. All investigators provide annually a narrative of research progress and future plans within each of the TRDs and coupled DBPs. We maintain a database of bibliographic data that tracks the number of peer-reviewed publications per TRD and investigator. We track the number of citations to these papers along with near-term metrics such as the ‘Altmetric’, which indicates rapid online interest in a paper even before enough time has passed to garner follow-on citations. Beyond publications, an important NRNB aim is production and maintenance of cutting-edge software tools, websites, and platforms. These resources will be posted to our website in a manner consistent with NIH policies (see Resource Sharing Plan), and we will track usage statistics of all new and existing NRNB-supported resources (e.g. cytoscape.org, apps.cytoscape.org, nexontology.org). We will also track tools developed by third parties, such as Cytoscape Apps, which are enabled or supported by core NRNB efforts. Success of Collaboration and Service Projects is measured by the overall number, complexity, and quality of ongoing CSPs we manage and the publications stemming from these collaborations. We also routinely solicit qualitative feedback about CSPs and tool usage in the form of user testimonials. Products of training and outreach, such as workshops, symposia, and summer training programs like Google Summer of Code, are tracked via event descriptions and post-event surveys and metrics such as the number of events and attendees.

Distinction Between NRNB And Existing Resources

NRNB complements other bioinformatic resources and centers, while having unique aims and deliverables. The following is a partial list of other NIH-supported bioinformatics efforts that are distinct from, but synergistic with, NRNB:

The Cytoscape Project. Cytoscape is a software platform on which to provide tools and infrastructure for biological network analysis. The core codebase of the Cytoscape desktop application is supported by a research project grant from NIGMS (R01 GM070743 to Ideker). In contrast, NRNB provides training and outreach on Cytoscape and its apps, and TRD projects use the Cytoscape platform for analysis and development of new tools as Apps. NRNB also supports a range of other tools and databases such as WikiPathways, Enrichment Map, NeXO, the Cytoscape App Store, and Cytoscape.js/web.

Resource for Biocomputing, Visualization, and Informatics (RBVI) directed by Thomas Ferrin at UCSF. RBVI is another informatics BTRR that develops software for protein structural analysis and visualization, including Chimera. This resource has already developed the StructureViz144 Chimera to Cytoscape App, which links protein networks analyzed in Cytoscape to 3D structures of proteins145,146.

National Resource for Cell Analysis and Modeling (NRCAM) directed by Leslie Loew at University of Connecticut. The NRCAM is another informatics BTRR that develops the VCell modeling tool147 which enables precise physico-chemical simulations of pathway dynamics given information on kinetic parameters, cellular sub-structure, and localization. In contrast, the focus of NRNB is to integrate heterogeneous interactions and genome-scale data sets together into consolidated pathway maps of the cell. These maps may be useful as starting points for building simulation models using VCell.

NIGMS Centers for Systems Biology. NRNB has productive interactions with NIGMS Systems Biology Centers, including the San Diego Center for Systems Biology (SDCSB). The SDCSB brings together 18 systems biology faculty at UCSD, Salk, and surrounding institutes, organized into project teams who apply systems approaches to advance our understanding of fundamental cellular processes, including chromatin architecture and transcription, protein turnover, and cell-cell communication. In 2013 the original PI, Alexander Hoffmann, moved to UCLA and passed leadership to Ideker.

Annotated Timeline

The goal of the annotated timeline is not to set explicit milestone dates (per FOA instructions), but rather to provide target dates for technology availability and to illustrate how the various sub-project aims relate to each other. We are using Munzner's 4-Stage Model to show the preliminary work (gray bars) and the proposed extension (open bars) for each project (each row) in the context of an abstracted technology development cycle. Our goal is to push each project to either Stage 3 (implemented tool) or Stage 4 (widely adopted tool) during the next support period. Target dates per project are provided right-justified within each open bar.

Stage 1: Identify problem, driving biological question, and target community
Stage 2: Develop new method or approach as a solution; establish proof of concept
Stage 3: Implement solution as a mature resource (software, website, DB)
Stage 4: Broad adoption of resource via dissemination and collaboration

TRD / NRNB Lead / Specific Aim or Subaim

1.1 MSK / Sander: Tools for inference of differential networks
- Integrated tools for perturbation response analysis
- Predictive power using tissue-specific priors; PERA 2.0
- UX and visualization for network comparison

1.2 Toronto / Bader: Protein network alignment algorithm and viewer
- IIN network alignment algorithm (GreedyPlus)
- Develop network alignment viewer (YEAR 1)
- Accessible IIN data via PSICQUIC web service (YEAR 1, DBP2)

1.3 Gladstone / Pico: Facilitating the interpretation of AP-MS data as interaction networks
- Tools for AP-MS import and assessment (YEAR 1)
- Tools for network augmentation and comparative analysis

2.1 Toronto / Bader: Predicting clinical outcome using patient similarity networks
- Outcome prediction methods
- Patient population stratification
- Patient similarity network visualization

2.2 UCSD / Ideker & MSK / Sander: Predicting cellular response to perturbation w/ network-guided regression
- Predicting cellular response to treatment using molecular profile data
- Predicting cellular response to treatment using diverse, integrated data

2.3 Gladstone / Pico: Network analysis of genetic variant data
- Variant analysis and visualization tools
- Variant data and annotation tools

3.1 UCSD / Ideker: Data-driven assembly & refinement of gene ontologies from networks
- Inference of gene ontologies from network data
- NeXO (nexontology.org) data-driven ontology web browser (YEAR 1)
- Procedures for integrating datasets & iterative ontology improvement

3.2 UCSD / Ideker: Functionalized gene ontologies as a hierarchy of functional prediction
- Procedures for propagation of state on gene ontologies

3.3 Toronto / Bader: Bridging ligand-receptor networks to cell-cell communication networks
- Cell-cell interaction network inference
- Identify key players and rules
- Intracellular pathway control
- Multi-scale visualization technology

[Timeline chart bars not reproduced. Bar labels include: Java tool; Perturbation biology method; PERA algorithm; 'Siri' essay in Cell; NeXO and CliXO Algorithms; NeXO Web Browser; Morris (2014) Nature Protocol; Web service resources; PSICQUIC hosted web services; Prototypes & mock-ups; Prototype Java tools; Beta rendering system; Optimize prototype; Mock-up; Regression method; Extension to new data; Cytoscape Integration; Visualization app; Pathway analysis; Topology analysis. Target dates on the bars span YEAR 1 to YEAR 5 and reference DBPs 1-10 and CSPs.]

OVERALL - SPECIFIC AIMS The mission of the National Resource for Network Biology is to advance the science of Biological Networks by providing leading-edge bioinformatic methods, software and infrastructure, and by engaging the scientific community in a portfolio of collaboration and training opportunities. Biomedical research is increasingly dependent on knowledge of biological networks of multiple types and scales, including gene, protein and drug interactions, cell-cell and cell-host communication, and vast social networks. NRNB technologies enable researchers to assemble and analyze these networks and to use them to better understand biological systems and, in particular, how they fail in disease. NRNB has been funded as an NIGMS Biomedical Technology Research Resource since 2010. Our overall mission is accomplished through the following five Specific Aims:

Specific Aim 1. To mount aggressive programs of research in cutting-edge bioinformatic technology to address major challenges and opportunities in Network Biology. In past years, we introduced a series of innovative methods including network-based biomarkers and network-based stratification of genomes. In the next support period, we will research approaches to represent and analyze network architecture across conditions or times; a general engine for genotype-to-phenotype prediction using network knowledge; and a platform to crowd-source construction of a gene ontology based wholly on network data from the community. Research proceeds in frequent communication with Driving Biomedical Projects, which provide data and applications from the labs of our close collaborators.

Specific Aim 2. To catalyze phase transitions in how biological networks are represented and used in biomedical research. We are well-positioned to catalyze change along three complementary themes: I. Moving from static network data and models to networks that are differential or dynamic, II. Moving from networks that are primarily descriptive to those that are predictive of a range of phenotypes and behaviors, and III. Moving from flat networks (lists of pairwise interactions) to multi-scale representations that capture the hierarchy of modules comprising a biological system and reflected in its data.

Specific Aim 3. To establish and disseminate robust end-user software, databases and high-performance computing infrastructure that enable network analysis and visualization methods for a broad biomedical research community. We will further develop the popular Cytoscape desktop application and App Store database of network analysis tools into a mature platform for network-based research on the web and in the cloud, to be called the Cytoscape Cyberinfrastructure. We will grow the collection of supported network tools distributed through nrnb.org. We will also update and expand the current NRNB computing hardware to keep pace with the growing size of network datasets and computing tasks.

Specific Aim 4. To engage with leading biomedical investigators in productive collaborations uniquely enabled by NRNB methods and tools. Since our inception in 2010, the NRNB has maintained an active and rolling portfolio of approximately 60-80 Collaborative and Service Projects. We will use best practices learned during this initial period of support to continue to acquire, triage, and/or complete collaborative projects, with an increasing focus on deploying new network methodologies developed under the three major technology themes outlined in Aim 2.

Specific Aim 5. To train the current and next generation of biomedical investigators in the science of Network Biology and its applications in disease research. The NRNB runs a highly attended annual network biology symposium, which starting in 2014 is planned jointly with DREAM and ISCB/RECOMB. We run a broad collection of network biology workshops and training events, including a very popular Google Summer of Code program which over the past 4 years has recruited over 50 mentors and 100 students around the world to work on network biology projects under matching support from Google.

OVERALL - PROJECT NARRATIVE Although we are all familiar with some of the components of biological systems – DNA, proteins, cells, organs, individuals – understanding life involves more than just cataloging its component parts. It is critical to understand the many interactions between these parts, and how this complex network of parts gives rise to biological functions and responses, health and disease. The National Resource for Network Biology provides the biomedical research community with a shared set of computational tools for studying a wide range of biological networks, including networks of genes and proteins, networks of cell-to-cell communication, and social networks of human individuals.

OVERALL - RESEARCH STRATEGY The mission of the National Resource for Network Biology (NRNB; http://nrnb.org/) is to advance the new science of Biological Networks by providing leading-edge bioinformatic methods, software and infrastructure, and stimulating the scientific community with a portfolio of collaboration and training opportunities. Biomedical research is increasingly dependent on knowledge of biological networks of multiple types and scales, including gene, protein and drug interactions, cell-cell and cell-host communication, and vast social networks. NRNB technologies enable researchers to assemble these networks and to use them to understand and predict how biological systems function and fail. The NRNB has been funded as an NIGMS Biomedical Technology Research Resource since 2010; the present application is a competitive renewal. It supports six major components, including:

§ Technology Research and Development of new bioinformatic methods and tools (3 TRDs)
§ Driving Biomedical Projects closely integrated with research and development (10 DBPs)
§ Collaboration and Service Projects within the greater scientific community (70+ CSPs)
§ Software and hardware infrastructure for Network Biology
§ Dissemination of methods, leading research, bioinformatic tools and databases
§ Training via symposia, workshops and a summer code development program

Technology Research and Development of new bioinformatic resources is the central activity of the NRNB. As a significant deliverable, mature bioinformatic methods developed by each of the three TRD projects are disseminated as freely available open-source software, many as apps to the Cytoscape platform, a popular network analysis package that our team initiated in 2001. Each research and development area is stimulated by an array of collaborators who are generating or using biological network data within Driving Biomedical Projects. Specific DBPs are chosen to ensure that our technology development is actively guided according to the needs of the scientific community. Over time, the selection of active DBPs, and the more distal CSPs, is revised and expanded in consultation with our External Advisory Committee. These research efforts are solidified by substantial software and high-performance hardware infrastructure we have put in place over the past several years, along with a broad program of training and dissemination.

During the previous funding period, NRNB investigators introduced a series of innovative methods for network biology including network-based biomarkers1,2, network-based stratification of genomes3,4, and automated inference of gene ontologies using network data5-8. We put in place major software and hardware infrastructure such as the Cytoscape App Store9, an online marketplace of network analysis tools; cBioPortal10, a powerful web portal for analyzing cancer mutations in a network and pathway context; and a high-performance computing cluster for network analysis. We established a large portfolio of research collaborations (75 currently active); two annual international network biology meetings, the Network Biology SIG and the Cytoscape Symposium; and a rich set of training opportunities and published network analysis protocols. All of these efforts leveraged and further supported our popular Cytoscape network analysis infrastructure.
Technology Themes

As a result of these efforts, we are nearing completion of a vision whereby a coordinated suite of tools is able to gather public data on biological networks—including molecular, genetic, functional, and/or social interactions; to visually and computationally integrate these networks with genomics and proteomics data sets; to semi-automatically assemble all of the data into models of biological pathways; to suggest points for experimental validation or clinical intervention; and to publish network models at various stages of refinement online. Although work is still required to mature and iteratively improve on this vision, the stage is now set to critically evaluate the state of the science and to take on a new set of challenges.

During the next funding period, we will seek to bring about a major phase transition in how biological networks are represented and used. In consultation with our External Advisors, we have identified three complementary themes which identify major barriers or areas of need in pushing forward the field of Network Biology, and for which NRNB is well-positioned to catalyze major change:

A. Differential Networks. In this TRD area, we will develop a series of tools and methodologies for conducting differential analyses of biological networks perturbed under multiple conditions. The novel algorithmic methodologies enable us to make use of high-throughput proteomic level data to recover biological networks under specific biological perturbations. The software tools developed in this project enable researchers to further predict, analyze, and visualize the effects of these perturbations and alterations, while enabling researchers to aggregate additional information regarding the known roles of the involved interactions and their participants.

B. From Descriptive to Predictive Networks. Genomics is mapping complex data about human biology and promises major medical advances. However, the routine use of genomics data in medical research is in its infancy, due mainly to the challenges of working with highly complex “big data”. In this TRD theme, we will use network information to help organize, analyze and integrate these data into models that can be used to make clinically relevant diagnoses and predictions about an individual.

C. Multi-scale Network Representations. Although networks have been very useful for representing molecular interactions and mechanisms, network diagrams do not visually resemble the contents of cells. Rather, the cell involves a multi-scale hierarchy of components: proteins are subunits of protein complexes which, in turn, are parts of pathways, biological processes, organelles, cells, tissues, and so on. In this technology research project, we will pursue methods that move Network Biology towards such hierarchical, multi-scale views of cell structure and function.

Figure 1. Themes of network-based research. The NRNB competitive renewal application is organized around three major themes. Two themes advance technology for systematic assembly of network models that are dynamic (Theme I) and multi-scale (Theme III). These advances in network representation support the remaining theme of using such network knowledge to enable general biological and biomedical predictions, such as translation of genotype to phenotype (Theme II).

These three themes serve as the major organizing principles underlying each of the three TRD projects, along with matching DBPs (Figure 1, Table 1). These themes are timely, as the recent accumulation of molecular network data in mammals, and the concomitant growth of new sequencing and other ‘omics technologies, create remarkable new opportunities for using network-based approaches to understand disease states on a genome-scale in human individuals.

Computational Network Biology: An evolving discipline

Biomedical research is at a critical juncture in which vast amounts of information on molecules and molecular interactions have been collected, but methods to analyze these data are still in their infancy. Technologies for interaction mapping are advancing at a staggering pace. Experimental approaches include yeast two-hybrid assays11-19, affinity purification coupled to mass spectrometry20-29, reverse PCA30, chromatin immunoprecipitation of transcription factor binding31-42, chromatin conformation capture of DNA-DNA interactions43-45, synthetic-lethal and epistatic networks46-57, expression quantitative trait loci58-66, and many others. Apart from the molecular level, biological networks of other types are central to a range of biomedical disciplines. For example, vast social networks connecting friends, co-workers, and family members are being charted to lend insight into how disease is transmitted67-70. Such networks are being powerfully integrated with data from systematic profiling experiments, such as changes in mRNA or protein abundance71-73, protein phosphorylation state74 and metabolite concentrations75 quantified with mass spectrometry, and other advanced techniques. This enormous collection of networks and profiles necessitates bioinformatic frameworks to integrate, filter, visualize, and interpret the data. A wide range of techniques are being developed to analyze network data to address specific problems with relevance to basic biology and human health:

⇒ Gene function prediction. Since many genes still have unknown functions, examining genes in a network context can provide important functional clues. For example, a protein connected to a set of proteins involved in the same biological process is likely to also function in that process76-78.

Table 1: Matrix of technology development and matching biomedical projects. Matrix view of Technology Research and Development project aims (rows) grouped by themes (colors) and Driving Biomedical Projects (columns) also grouped by themes (alternating gray), which are labeled in bottom header row. A matrix cell is filled with a color corresponding to the Technology Theme for every case where a particular DBP is driving a particular TRD aim. Each Technology Theme is driven by multiple biomedical projects; 1-6 DBPs are associated with each TRD aim. Similarly, each Driving Biology Project is matched to 1-4 TRD aims, often across themes.

⇒ Detection of protein complexes and other modular structures. There is clear evidence in biological networks for modularity and principles of higher-order organization79,80. For example, molecular interaction networks have been found to cluster into regions that represent complexes81, and statistically over-represented motifs, such as feedback loops, have been discovered82, some of which have been thoroughly analyzed83,84.

⇒ Network evolution. Increasingly, the study of evolution is moving to the network level. A number of techniques have been developed that identify common network structures and organizational principles that are conserved across species85-88, including our PathBLAST89 and NetworkBLAST90 tools which align protein networks to identify conserved interaction paths and clusters.

⇒ Prediction and further specification of interactions. Statistically significant domain-domain correlations in a protein interaction network suggest that certain domains (and domain pairs) mediate protein binding, allowing prediction of new interactions and binding interfaces91,92. Machine learning has also been applied to predict protein-protein or genetic interactions through integration of diverse types of evidence for interaction93,94.

⇒ Identification of disease subnetworks. A recent and powerful trend is to identify neighborhoods of a biological network (i.e., subnetworks) that are transcriptionally active in disease95-99. Such neighborhoods suggest key pathway components in disease progression and provide leads for further study and potential new therapeutic targets.

⇒ Subnetwork-based diagnosis. It has become clear that subnetworks also provide a rich source of biomarkers for disease classification. Several groups3,100-103 have demonstrated that mRNA profiles can be integrated with protein networks to identify subnetwork biomarkers—i.e. subnetworks of interconnected genes whose aggregate expression levels or nucleotide variants are predictive of disease state.

⇒ Subnetwork-based gene association. Despite the growing number of gene-disease association studies104-106 based on Single Nucleotide Polymorphism (SNP) and Copy Number Variation (CNV) profiling, most human genetic variation remains uncharacterized, both in biological mechanism and in impact on disease pathology. We anticipate that molecular networks will continue to provide a powerful framework for mapping the common pathway mechanisms impacted by a collection of genotypes, as several proofs-of-principle have been developed in this area3,107-114.
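
As an illustration of the first technique in the list above (gene function prediction from network context), the following is a minimal sketch of a neighbor-voting scheme. It is not the method of any cited work or NRNB tool; the function name `predict_function`, the `min_support` threshold, and the toy gene names and annotations are all hypothetical and chosen only to show the idea that a protein connected to several proteins in the same process is likely to share that process.

```python
from collections import Counter


def predict_function(network, annotations, gene, min_support=2):
    """Guess a function for `gene` by majority vote among its annotated neighbors.

    network:     dict mapping each gene to the set of genes it interacts with
    annotations: dict mapping annotated genes to a set of function labels
    min_support: hypothetical cutoff; at least this many neighbors must share
                 the winning label, otherwise no prediction is made
    """
    votes = Counter()
    for neighbor in network.get(gene, ()):
        for term in annotations.get(neighbor, ()):
            votes[term] += 1
    if not votes:
        return None
    term, count = votes.most_common(1)[0]
    return term if count >= min_support else None


# Toy undirected interaction network with made-up gene names.
network = {
    "geneX": {"geneA", "geneB", "geneC"},
    "geneA": {"geneX", "geneB"},
    "geneB": {"geneX", "geneA"},
    "geneC": {"geneX"},
}
# Known annotations; geneX is the uncharacterized gene.
annotations = {
    "geneA": {"DNA repair"},
    "geneB": {"DNA repair"},
    "geneC": {"transport"},
}

prediction = predict_function(network, annotations, "geneX")
print(prediction)  # two of three neighbors vote for "DNA repair"
```

Real methods typically weight interactions by confidence and assess statistical significance rather than using a fixed vote threshold; the sketch only conveys the underlying reasoning.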

Major Changes from Prior Period of Support

NRNB has progressed in important ways since the initial application was funded in 2010. Numerous minor, and several major, changes to the organization and focus of the resource were prompted during regular reviews by our very engaged External Advisory Committee (2011, 2012, and 2014 meetings). Generally, the EAC has found that “scientific progress has been impressive… [as highlighted by] the spectrum of recent publications” and that “the collaboration and outreach strategy developed by the NRNB has been very effective to make available and accessible network biology methods to those researchers less versed in it” (written EAC report, July 12, 2014). They also have made constructive critiques which were taken seriously and catalyzed changes in trajectory. Below are highlights of critiques summarized from the progression of EAC meetings, followed by our corresponding actions:

Critique: Year-to-year milestones should be tracked more carefully as each TRD aim progresses. Action: At the suggestion of the EAC, we now track each TRD aim along a progression of four stages: [1] Identification of a resounding need, [2] Research leading to a prototype method or proof-of-principle, [3] Transition to a robust end-user tool, [4] Dissemination and collaboration to a wide audience.

Critique: Focus not only on network visualization but on network-based analysis and decision making. Action: Exciting progress was made for using networks as diagnostic biomarkers under original TRD A1,3,115. This enabled us, in the renewal, to seed an entire Theme 2 (TRD 2) on ‘predictive’ networks.

Critique: Encourage increased citation of NRNB tools such as Cytoscape. Track classroom use of Cytoscape. Action: We refactored nrnb.org and cytoscape.org websites to encourage citation and to better capture both citations and non-cited usage. We identified 32 courses worldwide using Cytoscape (2011 – 2014).

Critique: Better track Cytoscape plugins and do more to encourage developers to write them. Action: We rebranded plugins as ‘Apps’ and obtained an NRNB award supplement allowing development of the Cytoscape App Store (apps.cytoscape.org) which has greatly stimulated App development and use.

Critique: Transition away from paper counts and toward metrics of community enablement and integration. Action: We now track metrics such as number of NIH grants using NRNB tools (3,800), third-party software development we enabled (>200 Cytoscape Apps in App Store), and number of Google and NRNB Academy mentees (59). We routinely solicit qualitative ‘testimonials’, of which we have received close to 100.

Critique: Connect to synergistic resources like GenomeSpace and IGV and enable network sharing on the Web. Action: Cytoscape is now connected to RBVI's Chimera, Sage Bionetworks’ Synapse, GenomeSpace (a CSP which transitioned to independent funding from NHGRI, PI Jill Mesirov), IGV and the Cancer Gene Encyclopedia. We can now export networks to these and other resources using cytoscape.js and d3.js.

Critique: Optimize existing TRDs and DBPs to focus the core NRNB strengths. Include new, more distal DBPs. Action: In the renewal, TRDs have been organized around three themes which capture major challenges in Network Biology with significant potential impact. We have added key DBP sites including Sage Bionetworks (Stephen Friend), Dana Farber (Marc Vidal), and Stanford (Mike Cherry). These changes in TRDs and DBPs have also prompted changes to our key personnel: James Fowler, who formerly ran a TRD related to social network analysis, is now positioned as a social networks DBP, which allows general network analysis methodology in multiple TRDs to impact social networks as one driving project alongside other biological network types. We have transitioned former TRD D, related to gene regulatory network inference, into a DBP with collaborators from Sage/DREAM.

NRNB Leadership and Administration

NRNB is led through the coordinated efforts of a team of four pre-eminent faculty in Network Biology and Bioinformatics: Dr. Trey Ideker (University of California San Diego), Dr. Alexander Pico (Gladstone Institute / University of California San Francisco), Dr. Gary Bader (University of Toronto), and Dr. Chris Sander (Memorial Sloan Kettering Cancer Center). These investigators are well-known thought leaders in the field with strong track records of innovative methods development and collaboration in network biology tools and applications. NRNB administration is based at the University of California with Dr. Ideker as Director and Alexander Pico as Executive Co-Director. Ideker and Pico together coordinate activities among host institutions under a Consortium/Contractual arrangement with UCSD. The four leaders are distributed across major biomedical research sites spanning both U.S. coasts and Canada. This arrangement positions NRNB as a truly national and international organization with broad exposure to, and involvement in, a spectrum of Driving Biomedical Projects and opportunities for Collaboration and Service. Because NRNB expertise is geographically distributed, each NRNB site can harness the best local talent and can stimulate and respond to collaborative projects within its local research community. This is an inversion of the usual arrangement for resource centers, in which the center is geographically localized and reaches out to investigators located elsewhere. In contrast, NRNB is already global and thus “thinks globally, acts locally.”

The credentials of NRNB Director Dr. Trey Ideker stem from a career of pioneering work in systems biology and a very strong track record of creating innovative and widely-used bioinformatic tools. Ideker received bachelor’s and master’s degrees in computer science from MIT and a PhD in biomedical sciences working with Dr. Lee Hood at the University of Washington.
Since 2001 Ideker has led an independent research laboratory, first as a Whitehead Fellow then UCSD professor, under continuous NIH funding. Ideker’s research is led by the vision that, given the right experimentation and analysis, it will be possible to automatically assemble maps of pathways just as we now assemble maps of genomes. During graduate work, he developed a general iterative framework for how biological systems can be systematically perturbed, interrogated and modeled. This framework laid the foundation for many studies in the future discipline of Systems Biology116,117. Using his new Cytoscape platform, he demonstrated that biological networks could be integrated with gene expression to systematically map pathways118 and aligned, like sequences, to reveal conserved and divergent functions89,119-123. He showed that the best biomarkers of disease are typically not single proteins but aggregates of proteins in networks100. Ideker was named a Top Ten Innovator of 2006 by Tech. Review and received the 2009 Overton Prize, the highest award in computational biology. He serves on the Editorial Boards of Cell, Cell Reports, Molecular Systems Biology, and PLoS Comp. Bio. Directing NRNB builds on previous successful leadership such as Director of the San Diego Center for Systems Biology, Genome Program Leader at the UCSD Cancer Center, former Division Chief of Genetics in the Dept. of Medicine, and as a regular organizer of major conferences such as Cold Spring Harbor Laboratory Systems Biology: Networks and the annual Protein Networks Track at ISMB.

NRNB Executive Director, Dr. Alexander Pico is the Associate Director of Bioinformatics at the Gladstone Institutes on the UCSF Mission Bay Campus where he leads a Systems Biology Group. After obtaining a PhD in molecular neurobiology and biophysics from Rockefeller University in 2003, Dr. Pico joined Gladstone as a postdoctoral fellow in the laboratory of Dr. Bruce Conklin. Dr. Pico develops software tools and resources that help analyze, visualize and explore biomedical data in the context of biomolecular networks. He cofounded WikiPathways in 2007 and has contributed to the development of Cytoscape since 2006. Dr. Pico received the David Rockefeller Award in 2001, the Howard Hughes Medical Institute Predoctoral Fellowship from 1999-2003, and the David and Mary Phillips Distinguished Postdoctoral Fellowship for 2005-2006.

NRNB Investigator, Dr. Gary Bader works on biological network analysis and pathway information resources as an Associate Professor at The Donnelly Centre at the University of Toronto. He helped found the field of network biology with early contributions to network visualization and modeling124-127, database resources128 and analysis81. He was included twice (for Biology/Biochemistry and Computer Science) on the Thomson Reuters list of most influential scientists worldwide in 2014 (www.highlycited.com). His research focuses on identifying causal pathways in disease and in precision medicine (predicting patient outcome based on genetic, molecular, pathway and clinical data). For example, using his methods, his team identified novel pathways involved in autism spectrum disorder129 and the first potential therapy for a devastating pediatric brain cancer130. He completed post-doctoral work in Chris Sander’s group in the Computational Biology Center at Memorial Sloan-Kettering Cancer Center in New York. Bader developed the Biomolecular Interaction Network Database (BIND) during his Ph.D. in the lab of Christopher Hogue in the Department of Biochemistry at the University of Toronto and the Samuel Lunenfeld Research Institute at Mount Sinai Hospital in Toronto. He completed a B.Sc. in Biochemistry at McGill University in Montreal.

NRNB Investigator, Dr. Chris Sander is Chair of the Computational Biology Program at the Memorial Sloan–Kettering Cancer Center in New York City. Sander was initially trained as a physicist, with a first degree in physics and mathematics from the University of Berlin in 1967, and a PhD in theoretical physics after graduate studies at the State University of New York, the Niels Bohr Institute in Copenhagen and the University of California in Berkeley. Sander has made many contributions to structural bioinformatics including developing tools such as FSSP and DSSP. DSSP, Sander’s foundational algorithm for analyzing protein structures, is still used 17 years later as a standard for identifying protein secondary structures in experimental 3D structures. In 1986, he became department chair and founder of the Biocomputing Program at the European Molecular Biology Laboratory (EMBL) in Heidelberg with the support of Sydney Brenner. He was a founding member of the International Society for Computational Biology (ISCB) when the society was established in early 1997, and he is a former Executive Editor for the journal Bioinformatics. In 2010 he was awarded the ISCB Accomplishment by a Senior Scientist Award.

External Advisory Committee. The above leadership team is co-advised by an External Advisory Committee (EAC) which has historically been very active. The EAC consists of:

Dr. Stephen Friend, Founder and President of Sage Bionetworks (EAC Chair)
Dr. David Hill, Associate Director of the Harvard Center for Cancer Systems Biology
Dr. Tamara Munzner, Professor of Computer Science at University of British Columbia
Dr. Nik Schork, Director of Human Biology, J. Craig Venter Institute
Dr. Gustavo Stolovitzky, Systems Biology Group Leader at IBM/Watson Research Center
Dr. Marian Walhout, Professor of Molecular Medicine at University of Massachusetts
Dr. Ravichandran Veerasamy, NIGMS Program Representative

Each of these members brings relevant, world-renowned expertise and leadership in functional genomics (Friend), protein networks and interactomics (Walhout, Hill), human genetics (Schork), information visualization (Munzner), and inference of signaling and transcriptional regulatory networks (Stolovitzky founded the well-known DREAM competition). The EAC met with NRNB leadership and investigators to review progress at three separate face-to-face meetings during the past funding period. The EAC also plays a critical role in the continual evolution of DBPs and CSPs, by suggesting new collaborative opportunities in the field and approving all substantive changes to this lineup. Projects are evaluated based on their ability to yield joint publications between resource personnel and their collaborators, with a strong preference towards DBPs that are already NIH funded. For the DBPs described in this proposal, all are already supported separately by the NIH or Genome Canada.

Size of the Research Community to be Served by NRNB

The community of scientists who might be considered “network biologists” has grown rapidly over the past few years. As an illustration of how pervasive networks have become, the U.S. National Institutes of Health currently funds 19,759 active research grants covering the topic “protein protein interactions” which, remarkably, is up from 3,076 at the time of the last NRNB submission four years ago in fall 2010. Network research is also actively supported by a variety of initiatives from other major national and supra-national funding agencies, such as the NSF, DOE, EU, and private foundations. As of 2014, approximately one-third of papers presented at the major Bioinformatics conferences (e.g., ISMB, RECOMB, ECCB) have focused on novel methodologies related to analysis or inference of biological networks; this number is up from less than one tenth a decade ago. Network Biology is the focus of major conferences each year, recently including the “Systems Biology: Networks” meeting (March 13-16 2013, CSHL); “Computational analysis of protein-protein interactions” (Sept 29 - Oct 3 2014; organized by EMBO), and “NetBio” (July 1 -15 2014, organized by NRNB staff at ISMB).

Another indicator of the growing community of network research is usage of software tools. For instance, use of Cytoscape has increased at a rate of ~50% per year since 2003. In the past year alone, Cytoscape has been downloaded ~90,000 times and the website has had 370,000 sessions and 880,000 pageviews. The NRNB-built and maintained App Store has delivered over 170,000 apps to Cytoscape users since its launch in mid-2012. The original Cytoscape paper131 and the more recently released update papers131-133 have together been cited over 12,000 times, and Cytoscape has been used without citation in many more studies. Cytoscape analyses appeared in ~50 Science/Nature/Cell articles in 2013. Figure 2 shows how a subset of these works has been supported by a collection of different NIH institutes as well as other funding agencies. These increases in software usage and development closely parallel the increases in availability of large-scale network information and its potential impacts in biomedicine.

Thus, the importance of Network Biology is very much appreciated by basic scientists and clinicians, many of whom use interaction databases and network analysis tools in their day-to-day research. On the other hand, many other investigators (basic, clinical, and pharmaceutical) as yet have little knowledge of recent developments in network biology and would benefit greatly from outreach and education centered on network-based technologies. With continued support of the NRNB National Resource, focused on technology development supporting research, collaboration, education, and dissemination, the Network Biology community could potentially become a great deal larger.

Progress Report Highlights

The initial period of support (Sept 2010 - Aug 2015) has led to the establishment of a Network Biology resource which provides a unique and fertile environment for bioinformatics research, tool development, dissemination, collaboration and training. The initial years have been very productive by both quantitative metrics (cumulative) and research narrative:

Total publications: 83 total, 16 in Nature / Science / Cell family; 63 collaborative with ≥ 2 faculty
Cytoscape gets 8,000 downloads/month, 500 citations/year and has supported 3,800 NIH grants
Number of network analysis Apps in Cytoscape App Store: 210
149 collaborations with external investigators on diverse topics (75 active)
92 training events in 12 countries, including 15 course lectures by NRNB staff
70 Google Summer of Code projects = $304,000 invested by Google
Workshops: 39 total, ~1,000 attendees
Symposia: 7 total, ~700 attendees

The narrative below highlights major progress under each component of the resource, including new technology development, infrastructure, collaborative activities, dissemination, and training:

New Technology: Network Approach to Building Gene Ontologies. NeXO (The Network Extracted Ontology) uses a principled computational approach which integrates evidence from hundreds of thousands of individual gene and protein interactions to construct a complete hierarchy of cellular components and processes5,7. This data-derived ontology aligns with known biological machinery in the GO Database and also uncovers many new structures (Figure 3).

Figure 2: Use of Cytoscape by grantees of funding agencies. We have been able to identify at least 3,800 active grants that depend on Cytoscape, up from ~1,500 18 months ago. Information was extracted from the Acknowledgments section of journal articles. One pie slice represents the percent of this total support provided by a particular agency. Over 90% of grants are by NIH institutes. Pie chart values (agency, number of grants): NIGMS, 1005; NCI, 459; NCRR, 275; NIAID, 260; NIDDK, 210; NHLBI, 202; NHGRI, 164; NIEHS, 157; NLM, 117; NIMH, 97; Wellcome Trust, 95; NINDS, 89; Canadian Institutes of Health Research, 85; NIDA, 72; Others, 541.

New Technology: Network-based stratification of tumor mutations. NBS is a method for stratification (clustering) of patients in a cancer cohort based on somatic mutations and a gene interaction network3. The method uses network propagation to integrate genome-wide somatic mutation profiles for each patient over a gene interaction network, and a non-negative matrix factorization-based clustering approach in order to derive biologically meaningful stratification of a patient cohort.

Key Infrastructure: Cytoscape. Our team has worked very closely together since 2002 to develop Cytoscape, a standard visualization, integration and analysis package for biological networks (http://www.cytoscape.org/). Our previous Cytoscape development efforts, which predate NRNB, demonstrate the strong commitment of our leadership team to open-source bioinformatic tools. Our ability to establish a strong resource center owes in large part to the groundwork we have put in place over the previous decade of Cytoscape development. During the funding period, NRNB helped expand the Cytoscape desktop application into a suite of related tools, Apps, websites and databases which we propose to call the Cytoscape Cyberinfrastructure.

Key Infrastructure: Establishment of the NRNB Compute Cluster. Over the performance period we worked closely with the San Diego Supercomputer Center (SDSC) to establish a high-performance computing environment for network biology applications. Through NRNB funds and matching SDSC support, we constructed an advanced cluster centered on 512 cores (64GB and 256GB HP SL230s Gen8 blade nodes) with 128TB attached storage. This infrastructure has been highly used and critical for new technology development, as manipulation, search, and statistical scoring of large network structures is typically compute intensive. We propose to renovate and expand this infrastructure in the next funding period.

Dissemination: The Cytoscape App Store.
A highlight of NRNB Dissemination efforts is the Cytoscape App Store (http://apps.cytoscape.org/) which was developed under supplemental funding to the main NRNB award. The goals of the App Store are to highlight the important features that apps add to Cytoscape, to enable researchers to find apps they need, and to let developers promote their apps. It has stimulated a sizable community of Cytoscape App developers, enabling over 200 third-party network analysis and visualization tools to date.

Dissemination: cBioPortal. The cBio Cancer Genomics Portal (http://cbioportal.org) is an open-access resource for interactive exploration of multi-dimensional cancer genomics data sets, currently providing access to data from more than 5,000 tumor samples from 20 cancer studies10. The cBio Cancer Genomics Portal empowers researchers to translate these rich data sets into biologic insights and clinical applications, for instance by viewing the molecular network context of cancer mutations using cytoscape.js for the web. Following its introduction in 2012, the portal has been used by thousands of basic cancer researchers, and increasingly clinicians, wishing to understand the prevalence, biological context, and outcome of a set of tumor mutations.

Key DBP Collaboration: Synthetic Genetic Analysis of Budding Yeast. NRNB investigator Gary Bader had a very productive DBP with scientists at the University of Toronto (Charles Boone, Brenda Andrews) leading to the first complete genetic interaction network for a cell48,134. This reference map provides a model for expanding genetic network analysis to higher organisms and is stimulating valuable insights into gene function and drug targets. The 2010 publication of roughly 20% of the complete map was in the top 30 most cited papers of that year (Figure 4).

Figure 3. Journal cover showing top level view of the Network-Extracted Ontology (NeXO), a completely data-driven reconstruction of the literature-based GO.

Figure 5. Differential genetic interactions specifically highlight pathways responding to DNA damage. Genetic interaction module map in which protein complexes / pathways (nodes) are connected by differential genetic interactions (edges). “Differential positive” scores indicate genetic interactions that increase after exposure to MMS (green) while “differential negative” scores indicate interactions that decrease (red).

Key CSP Collaboration: Genetic Networks Remodeled by DNA Damage. Although cellular behaviors are dynamic, the networks that govern these behaviors have been mapped primarily as static snapshots. To explore network dynamics, Ideker collaborated with Nevan Krogan at UCSF to develop an approach called differential epistasis mapping, which creates a genetic network (Figure 5) based on changes in interaction observed between two static conditions. They mapped widespread changes in genetic interaction as the cell responds to DNA damage57. Differential networks chart a new type of genetic landscape that will be invaluable for mapping many different cellular responses to stimuli.

Training: Google Summer of Code. One of our most successful training activities is our participation as a mentoring organization in the Google Summer of Code program. Over the past 4 years, we have recruited over 50 mentors and attracted over 100 students around the world to apply to work on network biology projects. For example, Google funded 18 students recruited by NRNB in 2014, ranking us in the top 10% in terms of the number of slots allocated by Google across all accepted organizations (comparable interest to Apache, Mozilla, R, and Python). Our overall application success rate is 94% on GSoC projects. These training efforts have resulted in 10 student co-authored papers9,135-143.

Training: RECOMB/DREAM Conference and Challenges. Another highlight of our training efforts has been to create an annual international symposium on bioinformatic methods for Network Biology. Formerly this was the Cytoscape Symposium, a three-day annual meeting that attracted 200-300 attendees each year.
In 2013 we began discussions with the well-known RECOMB (Research in Computational Molecular Biology) and ISCB (International Society for Computational Biology) conference families to combine efforts. These discussions led to an agreement to expand the existing RECOMB/ISCB conference series on Systems and Regulatory Biology (originally co-founded by Trey Ideker, Ron Shamir, and Eleazar Eskin in 2003) to integrate sessions including a Cytoscape App Expo and Cytoscape user and developer training events. At the same time, NRNB partnered with the popular DREAM Challenge, an annual competition that nucleates thousands of participants to address important problems in inference and validation of network biology models. DREAM is also co-organized with the RECOMB conference and, since 2013, the winning network inference method(s) have been implemented as Cytoscape Apps by the donated time of an NRNB software developer.

Figure 4. The genetic interaction map of budding yeast, as visualized by a DBP between NRNB and the Boone laboratory.

Future Progress Assessment and Evaluation

NRNB progress will be measured using both narrative and numerical information. All investigators provide annually a narrative of research progress and future plans within each of the TRDs and coupled DBPs. We maintain a database of bibliographic data that tracks the number of peer-reviewed publications per TRD and investigator. We track the number of citations to these papers along with near-term metrics such as the ‘Altmetric’, which indicates rapid online interest in a paper even before enough time has passed to garner follow-on citations. Beyond publications, an important NRNB aim is production and maintenance of cutting-edge software tools, websites, and platforms. These resources will be posted to our website in a manner consistent with NIH policies (see Resource Sharing Plan), and we will track usage statistics of all new and existing NRNB-supported resources (e.g. cytoscape.org, apps.cytoscape.org, nexontology.org). We will also track tools developed by third parties, such as Cytoscape Apps, which are enabled or supported by core NRNB efforts. Success of Collaboration and Service Projects is measured by the overall number, complexity, and quality of ongoing CSPs we manage and the publications stemming from these collaborations. We also routinely solicit qualitative feedback about CSPs and tool usage in the form of user testimonials. Products of training and outreach, such as workshops, symposia, and summer training programs like Google Summer of Code, are tracked via event descriptions and post-event surveys and metrics such as the number of events and attendees.

Distinction Between NRNB And Existing Resources

NRNB complements other bioinformatic resources and centers, while having unique aims and deliverables. The following is a partial list of other NIH-supported bioinformatics efforts that are distinct from, but synergistic with, NRNB:

The Cytoscape Project. Cytoscape is a software platform that provides tools and infrastructure for biological network analysis. The core codebase of the Cytoscape desktop application is supported by a research project grant from NIGMS (R01 GM070743 to Ideker). In contrast, NRNB provides training and outreach on Cytoscape and its apps, and TRD projects use the Cytoscape platform for analysis and development of new tools as Apps. NRNB also supports a range of other tools and databases such as WikiPathways, Enrichment Map, NeXO, the Cytoscape App Store, and Cytoscape.js/web.

Resource for Biocomputing, Visualization, and Informatics (RBVI) directed by Thomas Ferrin at UCSF. RBVI is another informatics BTRR that develops software for protein structural analysis and visualization, including Chimera. This resource has already developed the StructureViz144 Chimera to Cytoscape App, which links protein networks analyzed in Cytoscape to 3D structures of proteins145,146.

National Resource for Cell Analysis and Modeling (NRCAM) directed by Leslie Loew at University of Connecticut. The NRCAM is another informatics BTRR that develops the VCell modeling tool147, which enables precise physico-chemical simulations of pathway dynamics given information on kinetic parameters, cellular sub-structure, and localization. In contrast, the focus of NRNB is to integrate heterogeneous interactions and genome-scale data sets together into consolidated pathway maps of the cell. These maps may be useful as starting points for building simulation models using VCell.

NIGMS Centers for Systems Biology. NRNB has productive interactions with NIGMS Systems Biology Centers, including the San Diego Center for Systems Biology (SDCSB). The SDCSB brings together 18 systems biology faculty at UCSD, Salk, and surrounding institutes, organized into project teams who apply systems approaches to advance our understanding of fundamental cellular processes, including chromatin architecture and transcription, protein turnover, and cell-cell communication. In 2013, the original PI, Alexander Hoffmann, moved to UCLA and passed leadership of the SDCSB to Ideker.

Annotated Timeline

The goal of the annotated timeline is not to set explicit milestone dates (per FOA instructions), but rather to provide target dates for technology availability and to illustrate how the various sub-project aims relate to each other. We use Munzner's 4-Stage Model to show the preliminary work (gray bars) and the proposed extension (open bars) for each project (each row) in the context of an abstracted technology development cycle. Our goal is to push each project to either Stage 3 (implemented tool) or Stage 4 (widely adopted tool) during the next support period. Target dates per project are provided right-justified within each open bar.

[Figure: Annotated timeline. Rows are TRD projects; columns are the four stages of the technology development cycle. Per-bar target dates and associated DBPs/CSPs are given in the original figure.]

Stages of the technology development cycle:

Stage 1: Identify problem, driving biological question, and target community
Stage 2: Develop new method or approach as a solution; establish proof of concept
Stage 3: Implement solution as a mature resource (software, website, DB)
Stage 4: Broad adoption of resource via dissemination and collaboration

Projects (TRD, NRNB Lead, Specific Aim or Subaim):

1.1 MSK / Sander: Tools for inference of differential networks
- Integrated tools for perturbation response analysis
- Predictive power using tissue-specific priors; PERA 2.0
- UX and visualization for network comparison

1.2 Toronto / Bader: Protein network alignment algorithm and viewer
- IIN network alignment algorithm (GreedyPlus)
- Develop network alignment viewer
- Accessible IIN data via PSICQUIC web service

1.3 Gladstone / Pico: Facilitating the interpretation of AP-MS data as interaction networks
- Tools for AP-MS import and assessment
- Tools for network augmentation and comparative analysis

2.1 Toronto / Bader: Predicting clinical outcome using patient similarity networks
- Outcome prediction methods
- Patient population stratification
- Patient similarity network visualization

2.2 UCSD / Ideker & MSK / Sander: Predicting cellular response to perturbation w/ network-guided regression
- Predicting cellular response to treatment using molecular profile data
- Predicting cellular response to treatment using diverse, integrated data

2.3 Gladstone / Pico: Network analysis of genetic variant data
- Variant analysis and visualization tools
- Variant data and annotation tools

3.1 UCSD / Ideker: Data-driven assembly & refinement of gene ontologies from networks
- Inference of gene ontologies from network data
- NeXO (nexontology.org) data-driven ontology web browser
- Procedures for integrating datasets & iterative ontology improvement

3.2 UCSD / Ideker: Functionalized gene ontologies as a hierarchy of functional prediction
- Procedures for propagation of state on gene ontologies

3.3 Toronto / Bader: Bridging ligand-receptor networks to cell-cell communication networks
- Cell-cell interaction network inference
- Identify key players and rules
- Intracellular pathway control
- Multi-scale visualization technology

Bar labels in the figure include: Perturbation biology method; PERA algorithm; 'Siri' essay in Cell; NeXO and CliXO algorithms; NeXO Web Browser; Morris (2014) Nature Protocol; PSICQUIC hosted web services; web service resources; prototypes & mock-ups; prototype Java tools; beta rendering system; regression method; extension to new data; Cytoscape integration; visualization app; pathway analysis; topology analysis.

Resource Sharing Plan

Plan for Sharing of Data. The Principal Investigator, co-Investigators, and all key personnel of this proposal are firmly committed to the NIH policy on sharing of research data (NOT-OD-03-032). Central to this policy is the timely sharing of data and technologies produced in the course of our research and development. A major function of the National Resource for Network Biology is to build a resource of methods and tools, which we foresee will be widely used and re-used in the course of network biology research by the biomedical research community. The NRNB website already serves effectively as a central portal for major tools and resources generated by resource members in the course of their research. Relevant information can be found at http://nrnb.org/tools.html.

Plan for Sharing Network Models. In the course of our Technology Research and Development we will be generating network maps, data-driven ontologies and predictive models. These will be derived as important primary products through analysis of the raw data. For example, one of the major outcomes of the analysis of AP-MS data is the production of augmented interaction networks. These networks will be viewable and reusable in Cytoscape using Cytoscape format (CYS) and distributed through our websites and publications. In future years, we will leverage the developing work on the Network Data Exchange (NDEx, http://ndexbio.org/) for sharing network data in as many relevant contexts as possible.

Plan for Sharing of Software. The Principal Investigator, co-Investigators, and all key personnel of this proposal are committed to open source software and open data access. All of the software we develop will be made available under the open source GNU Library General Public License (LGPL), and all other computational resources developed through this project will also be made available under the LGPL or another open source license. All software will be disseminated through publicly accessible online avenues in the form of public web servers, FTP sites, source code version management servers (e.g. GitHub repositories) and publications. Other content, such as training material, will be made available under a Creative Commons license that allows redistribution without restrictions. Through these provisions, the project meets the four criteria for software dissemination as specified by the NIH. Specifically:

1) The software will be freely available to biomedical researchers and educators in the non-profit sector, such as institutions of education, research institutes, and government laboratories.

2) The terms of software availability permit the commercialization of enhanced or customized versions of the software, or incorporation of the software (or portions thereof) into other software packages.

3) The terms of software availability permit researchers who are not directly funded by this proposal to modify the source code and to share modifications with other colleagues.

4) The software will be maintained in a form such that if the applicant team loses interest in the software subsequent to the life of the project, another individual or team can make use of previous work to continue development.

We have transitioned all software code to a single source code hosting platform, GitHub. This has made our software easier for software developers to access and modify and will ensure releases remain available even if contributing groups become unavailable to maintain the system. Platforms like GitHub are advantageous because they enable others to modify source code while the center maintains the official version, make it easy for the center to integrate contributions from others into the official core (via pull requests), and allow projects to be transferred to others to continue development if needed. We will develop all software from inception in a public code repository, such that it will immediately and always be publicly accessible.

 

OVERALL - PROGRESS REPORT PUBLICATION LIST

As a renewal application, we are including complete references to the 118 publications and manuscripts accepted for publication resulting from the NRNB since it was last reviewed, including TR&D, DBP and CSP-related publications. The 83 publications that explicitly acknowledge support from this award are highlighted in bold.

1. Aksoy BA, Demir E, Babur O, Wang W, Jing X, et al. (2014) Prediction of individualized therapeutic vulnerabilities in cancer from genomic profiles. Bioinformatics 30: 2051-2059.

2. Babur O, Aksoy BA, Rodchenkov I, Sumer SO, Sander C, et al. (2014) Pattern search in BioPAX models. Bioinformatics 30: 139-140.

3. Brakefield TA, Mednick SC, Wilson HW, De Neve J-E, Christakis NA, et al. (2014) Same-sex sexual attraction does not spread in adolescent social networks. Arch Sex Behav 43: 335-344.

4. Carvunis AR, Ideker T (2014) Siri of the cell: what biology could learn from the iPhone. Cell 157: 534-538.

5. Christakis NA, Fowler JH (2014) Friendship and natural selection. Proc Natl Acad Sci U S A 111 Suppl 3: 10796-10801.

6. Dutkowski J, Ono K, Kramer M, Yu M, Pratt D, et al. (2014) NeXO Web: the NeXO ontology database and visualization platform. Nucleic Acids Res 42: D1269-1274.

7. Gao J, Ciriello G, Sander C, Schultz N (2014) Collection, integration and analysis of cancer genomic profiles: from data to insight. Curr Opin Genet Dev 24: 92-98.

8. Garcia-Herranz M, Moro E, Cebrian M, Christakis NA, Fowler JH (2014) Using friends as sensors to detect global-scale contagious outbreaks. PLoS One 9: e92413.

9. Gao J, Zhang C, van Iersel M, Zhang L, Xu D, Schultz N, Pico AR (2014) BridgeDB app: unifying identifier mapping services for Cytoscape. F1000Research 3.

10. Gross AM, Orosco RK, Shen JP, Egloff AM, Carter H, et al. (2014) Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss. Nat Genet 46: 939-943.

11. Kim MS, Pinto SM, Getnet D, Nirujogi RS, Manda SS, et al. (2014) A draft map of the human proteome. Nature 509: 575-581.

12. Kramer M, Dutkowski J, Yu M, Bafna V, Ideker T (2014) Inferring gene ontologies from pairwise similarity data. Bioinformatics 30: i34-42.

13. Kutmon M, Lotia S, Evelo CT, Pico AR (2014) WikiPathways App for Cytoscape: Making biological pathways amenable to network analysis and visualization. F1000Research 3.

14. Lee AY, St Onge RP, Proctor MJ, Wallace IM, Nile AH, et al. (2014) Mapping the cellular response to small molecules using chemogenomic fitness signatures. Science 344: 208-211.

15. Leung A, Bader GD, Reimand J (2014) HyperModules: identifying clinically and phenotypically significant network modules with disease mutations for biomarker discovery. Bioinformatics 30: 2230-2232.

16. Morris JH, Kuchinsky A, Ferrin TE, Pico AR (2014) enhancedGraphics: a Cytoscape app for enhanced node graphics. F1000Research 3.

17. Morris JH, Knudsen GM, Verschueren E, Johnson JR, Cimermancic P, Greninger AL, Pico AR (2014) Affinity Purification-Mass Spectrometry and Network Analysis to Understand Protein-Protein Interactions. Nat Protoc (accepted, pending publication).

18. Northcott PA, Lee C, Zichner T, Stutz AM, Erkek S, et al. (2014) Enhancer hijacking activates GFI1 family oncogenes in medulloblastoma. Nature 511: 428-434.

19. Novarino G, Fenstermaker AG, Zaki MS, Hofree M, Silhavy JL, et al. (2014) Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders. Science 343: 506-511.

20. Qiao W, Wang W, Laurenti E, Turinsky AL, Wodak SJ, et al. (2014) Intercellular network structure and regulatory motifs in the human hematopoietic system. Mol Syst Biol 10: 741.

21. Shakya HB, Christakis NA, Fowler JH (2014) Association between social network communities and health behavior: an observational sociocentric network study of latrine ownership in rural India. Am J Public Health 104: 930-937.

22. Sheikh MO, Schafer CM, Powell JT, Rodgers KK, Mooers BH, et al. (2014) Glycosylation of Skp1 affects its conformation and promotes binding to a model f-box protein. Biochemistry 53: 1657-1669.

23. Walsh KM, Codd V, Smirnov IV, Rice T, Decker PA, et al. (2014) Variants near TERT and TERC influencing telomere length are associated with high-grade glioma risk. Nat Genet 46: 731-735.

24. Aksoy BA, Gao J, Dresdner G, Wang W, Root A, et al. (2013) PiHelper: an open source framework for drug-target and antibody-target data. Bioinformatics 29: 2071-2072.

25. Alshalalfa M, Bader GD, Bismar TA, Alhajj R (2013) Coordinate microRNA-mediated regulation of protein complexes in prostate cancer. PLoS One 8: e84261.

26. Antoniotti M, Bader GD, Caravagna G, Crippa S, Graudenzi A, et al. (2013) GESTODIFFERENT: a Cytoscape plugin for the generation and the identification of gene regulatory networks describing a stochastic cell differentiation process. Bioinformatics 29: 513-514.

27. Bilal E, Dutkowski J, Guinney J, Jang IS, Logsdon BA, et al. (2013) Improving breast cancer survival analysis through competition-based multidimensional modeling. PLoS Comput Biol 9.

28. Carter H, Hofree M, Ideker T (2013) Genotype to phenotype via network analysis. Curr Opin Genet Dev 23: 611-621.

29. Christakis NA, Fowler JH (2013) Social contagion theory: examining dynamic social networks and human behavior. Stat Med 32: 556-577.

30. Ciriello G, Miller ML, Aksoy BA, Senbabaoglu Y, Schultz N, et al. (2013) Emerging landscape of oncogenic signatures across human cancers. Nat Genet 45: 1127-1133.

31. Ciriello G, Sinha R, Hoadley KA, Jacobsen AS, Reva B, et al. (2013) The molecular diversity of Luminal A breast tumors. Breast Cancer Res Treat 141: 409-420.

32. Dutkowski J, Kramer M, Surma MA, Balakrishnan R, Cherry JM, et al. (2013) A gene ontology inferred from molecular networks. Nat Biotechnol 31: 38-45.

33. Elo LL, Schwikowski B (2013) Analysis of time-resolved gene expression measurements across individuals. PLoS One 8: e82340.

34. Fijten RRR, Jennen DGJ, van Delft JHM (2013) Pathways for ligand activated nuclear receptors to unravel the genomic responses induced by hepatotoxicants. Curr Drug Metab 14: 1022-1028.

35. Fried JY, van Iersel MP, Aladjem MI, Kohn KW, Luna A (2013) PathVisio-Faceted Search: an exploration tool for multi-dimensional navigation of large pathways. Bioinformatics 29: 1465-1466.

36. Gao J, Aksoy BA, Dogrusoz U, Dresdner G, Gross B, et al. (2013) Integrative analysis of complex cancer genomics and clinical profiles using the cBioPortal. Sci Signal 6: pl1.

37. Guo Y, Darshi M, Ma Y, Perkins GA, Shen Z, et al. (2013) Quantitative proteomic and functional analysis of liver mitochondria from high fat diet (HFD) diabetic mice. Mol Cell Proteomics 12: 3744-3758.

38. Gwinner F, Acosta-Martin AE, Boytard L, Chwastyniak M, Beseme O, et al. (2013) Identification of additional proteins in differential proteomics using protein interaction networks. Proteomics 13: 1065-1076.

39. Ho AS, Kannan K, Roy DM, Morris LG, Ganly I, et al. (2013) The mutational landscape of adenoid cystic carcinoma. Nat Genet 45: 791-798.

40. Hofree M, Shen JP, Carter H, Gross A, Ideker T (2013) Network-based stratification of tumor mutations. Nat Methods 10: 1108-1115.

41. Hui S, Xing X, Bader GD (2013) Predicting PDZ domain mediated protein interactions from structure. BMC Bioinformatics 14: 27-27.

42. Jhas B, Sriskanthadevan S, Skrtic M, Sukhai MA, Voisin V, et al. (2013) Metabolic adaptation to chronic inhibition of mitochondrial protein synthesis in acute myeloid leukemia cells. PLoS One 8: e58367.

43. Jones JJ, Bond RM, Fariss CJ, Settle JE, Kramer ADI, et al. (2013) Yahtzee: an anonymized group level matching procedure. PLoS One 8.

44. Jones JJ, Settle JE, Bond RM, Fariss CJ, Marlow C, et al. (2013) Inferring tie strength from online directed behavior. PLoS One 8: e52168.

45. Jordan JJ, Rand DG, Arbesman S, Fowler JH, Christakis NA (2013) Contagion of Cooperation in Static and Fluid Social Networks. PLoS One 8.

46. Karagiannis GS, Weile J, Bader GD, Minta J (2013) Integrative pathway dissection of molecular mechanisms of moxLDL-induced vascular smooth muscle phenotype transformation. BMC Cardiovasc Disord 13: 4-4.

47. Lotia S, Montojo J, Dong Y, Bader GD, Pico AR (2013) Cytoscape app store. Bioinformatics 29: 1350-1351.

48. Miller ML, Molinelli EJ, Nair JS, Sheikh T, Samy R, et al. (2013) Drug synergy screen and network modeling in dedifferentiated liposarcoma identifies CDK4 and IGF1R as synergistic drug targets. Sci Signal 6: ra85.

49. Mitra K, Carvunis A-R, Ramesh SK, Ideker T (2013) Integrative approaches for finding modular structure in biological networks. Nat Rev Genet 14: 719-732.

50. Molinelli EJ, Korkut A, Wang W, Miller ML, Gauthier NP, et al. (2013) Perturbation biology: inferring signaling networks in cellular systems. PLoS Comput Biol 9: e1003290.

51. Omberg L, Ellrott K, Yuan Y, Kandoth C, Wong C, et al. (2013) Enabling transparent and collaborative computational analysis of 12 tumor types within The Cancer Genome Atlas. Nat Genet 45: 1121-1126.

52. Piersma S, Denham EL, Drulhe S, Tonk RH, Schwikowski B, et al. (2013) TLM-Quant: an open-source pipeline for visualization and quantification of gene expression heterogeneity in growing microbial cells. PLoS One 8: e68696.

53. Potts MB, Kim HS, Fisher KW, Hu Y, Carrasco YP, et al. (2013) Using functional signature ontology (FUSION) to identify mechanisms of action for natural products. Sci Signal 6: ra90.

54. Reimand J, Bader GD (2013) Systematic analysis of somatic mutations in phosphorylation signaling predicts novel cancer drivers. Mol Syst Biol 9: 637-637.

55. Reimand J, Wagih O, Bader GD (2013) The mutational landscape of phosphorylation signaling in cancer. Sci Rep 3: 2651.

56. Rice T, Zheng S, Decker PA, Walsh KM, Bracci P, et al. (2013) Inherited variant on chromosome 11q23 increases susceptibility to IDH-mutated but not IDH-normal gliomas regardless of grade or histology. Neuro Oncol 15: 535-541.

57. Rudolph AE, Crawford ND, Latkin C, Fowler JH, Fuller CM (2013) Individual and neighborhood correlates of membership in drug using networks with a higher prevalence of HIV in New York City (2006-2009). Ann Epidemiol 23: 267-274.

58. Sharma K, Karl B, Mathew AV, Gangoiti JA, Wassel CL, et al. (2013) Metabolomics reveals signature of mitochondrial dysfunction in diabetic kidney disease. J Am Soc Nephrol 24: 1901-1912.

59. Srivas R, Costelloe T, Carvunis AR, Sarkar S, Malta E, et al. (2013) A UV-induced genetic network links the RSC complex to nucleotide excision repair and shows dose-dependent rewiring. Cell Rep 5: 1714-1724.

60. Tamborero D, Gonzalez-Perez A, Perez-Llamas C, Deu-Pons J, Kandoth C, et al. (2013) Comprehensive identification of mutational cancer driver genes across 12 tumor types. Sci Rep 3: 2650-2650.

61. Walsh KM, Anderson E, Hansen HM, Decker PA, Kosel ML, et al. (2013) Analysis of 60 reported glioma risk SNPs replicates published GWAS findings but fails to replicate associations from published candidate-gene studies. Genet Epidemiol 37: 222-228.

62. Walsh KM, Rice T, Decker PA, Kosel ML, Kollmeyer T, et al. (2013) Genetic variants in telomerase-related genes are associated with an older age at diagnosis in glioma patients: evidence for distinct pathways of gliomagenesis. Neuro Oncol 15: 1041-1047.

63. Xin X, Gfeller D, Cheng J, Tonikian R, Sun L, et al. (2013) SH3 interactome conserves general function over specific form. Mol Syst Biol 9: 652-652.

64. Zhang C, Wang J, Hanspers K, Xu D, Chen L, et al. (2013) NOA: a cytoscape plugin for network ontology analysis. Bioinformatics 29: 2066-2067.

65. Zuberi K, Franz M, Rodriguez H, Montojo J, Lopes CT, et al. (2013) GeneMANIA prediction server 2013 update. Nucleic Acids Res 41: 115-122.

66. Alshalalfa M, Bader GD, Goldenberg A, Morris Q, Alhajj R (2012) Detecting microRNAs of high influence on protein functional interaction networks: a prostate cancer case study. BMC Syst Biol 6: 112-112.

67. Apicella CL, Marlowe FW, Fowler JH, Christakis NA (2012) Social networks and cooperation in hunter-gatherers. Nature 481: 497-501.

68. Bond RM, Fariss CJ, Jones JJ, Kramer AD, Marlow C, et al. (2012) A 61-million-person experiment in social influence and political mobilization. Nature 489: 295-298.

69. Buescher JM, Liebermeister W, Jules M, Uhr M, Muntel J, et al. (2012) Global network reorganization during dynamic adaptations of Bacillus subtilis metabolism. Science 335: 1099-1103.

70. Califano A, Butte AJ, Friend S, Ideker T, Schadt E (2012) Leveraging models of cell regulation and GWAS data in integrative network-based association studies. Nat Genet 44: 841-847.

71. Cerami E, Gao J, Dogrusoz U, Gross BE, Sumer SO, et al. (2012) The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2: 401-404.

72. Chandan K, van Iersel MP, Aladjem MI, Kohn KW, Luna A (2012) PathVisio-Validator: a rule-based validation plugin for graphical pathway notations. Bioinformatics 28: 889-890.

73. Chuang HY, Rassenti L, Salcedo M, Licon K, Kohlmann A, et al. (2012) Subnetwork-based analysis of chronic lymphocytic leukemia identifies pathways that associate with disease progression. Blood 120: 2639-2649.

74. Ciriello G, Cerami E, Sander C, Schultz N (2012) Mutual exclusivity analysis identifies oncogenic network modules. Genome Res 22: 398-406.

75. Diezmann S, Michaut M, Shapiro RS, Bader GD, Cowen LE (2012) Mapping the Hsp90 genetic interaction network in Candida albicans reveals environmental contingency and rewired circuitry. PLoS Genet 8: e1002562.

76. Fiume M, Smith EJM, Brook A, Strbenac D, Turner B, et al. (2012) Savant Genome Browser 2: visualization and analysis for population-scale genomics. Nucleic Acids Res 40: 615-621.

77. Fu F, Nowak MA, Christakis NA, Fowler JH (2012) The evolution of homophily. Sci Rep 2: 845-845.

78. Jenkins RB, Xiao Y, Sicotte H, Decker PA, Kollmeyer TM, et al. (2012) A low-frequency variant at 8q24.21 is strongly associated with risk of oligodendroglial tumors and astrocytomas with IDH1 or IDH2 mutation. Nat Genet 44: 1122-1125.

79. Kelder T, van Iersel MP, Hanspers K, Kutmon M, Conklin BR, et al. (2012) WikiPathways: building research communities on biological pathways. Nucleic Acids Res 40: D1301-1307.

80. Labbe RM, Irimia M, Currie KW, Lin A, Zhu SJ, et al. (2012) A comparative transcriptomic analysis reveals conserved features of stem cell pluripotency in planarians and mammals. Stem Cells 30: 1734-1745.

81. Lechman ER, Gentner B, van Galen P, Giustacchini A, Saini M, et al. (2012) Attenuation of miR-126 activity expands HSC in vivo without exhaustion. Cell Stem Cell 11: 799-811.

82. Liu JC, Voisin V, Bader GD, Deng T, Pusztai L, et al. (2012) Seventeen-gene signature from enriched Her2/Neu mammary tumor-initiating cells predicts clinical outcome for human HER2+:ERalpha- breast cancer. Proc Natl Acad Sci U S A 109: 5832-5837.

83. Michaut M, Bader GD (2012) Multiple genetic interaction experiments provide complementary information useful for gene function prediction. PLoS Comput Biol 8: e1002559.

84. Network TCGAR (2012) Comprehensive molecular portraits of human breast tumours. Nature 490: 61-70.

85. Network TCGAR (2012) Comprehensive genomic characterization of squamous cell lung cancers. Nature 489: 519-525.

86. Network TCGAR (2012) Comprehensive molecular characterization of human colon and rectal cancer. Nature 487: 330-337.

87. Nicolas P, Mader U, Dervyn E, Rochat T, Leduc A, et al. (2012) Condition-dependent transcriptome reveals high-level regulatory architecture in Bacillus subtilis. Science 335: 1103-1106.

88. Northcott PA, Shih DJ, Peacock J, Garzia L, Morrissy AS, et al. (2012) Subgroup-specific structural variation across 1,000 medulloblastoma genomes. Nature 488: 49-56.

89. O'Malley AJ, Arbesman S, Steiger DM, Fowler JH, Christakis NA (2012) Egocentric social network structure, health, and pro-social behaviors in a national panel study of Americans. PLoS One 7.

90. Saito R, Smoot ME, Ono K, Ruscheinski J, Wang P-L, et al. (2012) A travel guide to Cytoscape plugins. Nat Methods 9: 1069-1076.

91. Shakya HB, Christakis NA, Fowler JH (2012) Parental influence on substance use in adolescent social networks. Arch Pediatr Adolesc Med 166: 1132-1139.

92. Strully KW, Fowler JH, Murabito JM, Benjamin EJ, Levy D, et al. (2012) Aspirin use and cardiovascular events in social networks. Soc Sci Med 74: 1125-1129.

93. Zhang C, Hanspers K, Kuchinsky A, Salomonis N, Xu D, et al. (2012) Mosaic: making biological sense of complex networks. Bioinformatics 28: 1943-1944.

94. Zhang L, Lim SL, Du H, Zhang M, Kozak I, et al. (2012) High temperature requirement factor A1 (HTRA1) gene regulates angiogenesis through transforming growth factor-beta family member growth differentiation factor 6. J Biol Chem 287: 1520-1526.

95. Aranda B, Blankenburg H, Kerrien S, Brinkman FS, Ceol A, et al. (2011) PSICQUIC and PSISCORE: accessing and scoring molecular interactions. Nat Methods 8: 528-529.

96. Atwood A, DeConde R, Wang SS, Mockler TC, Sabir JS, et al. (2011) Cell-autonomous circadian clock of hepatocytes drives rhythms in transcription and polyamine synthesis. Proc Natl Acad Sci U S A 108: 18560-18565.

97. Beauchamp JP, Cesarini D, Johannesson M, van der Loos MJ, Koellinger PD, et al. (2011) Molecular Genetics and Economics. J Econ Perspect 25: 57-82.

98. Bellay J, Han S, Michaut M, Kim T, Costanzo M, et al. (2011) Bringing order to protein disorder through comparative genomics and genetic interactions. Genome Biol 12.

99. Dutkowski J, Ideker T (2011) Protein networks as logic functions in development and cancer. PLoS Comput Biol 7: e1002180.

100. Fowler JH, Settle JE, Christakis NA (2011) Correlated genotypes in friendship networks. Proc Natl Acad Sci U S A 108: 1993-1997.

101. Ideker T, Dutkowski J, Hood L (2011) Boosting signal-to-noise in complex biology: prior knowledge is power. Cell 144: 860-863.

102. Merico D, Isserlin R, Bader GD (2011) Visualizing gene-set enrichment results using the Cytoscape plug-in enrichment map. Methods Mol Biol 781: 257-277.

103. Michaut M, Baryshnikova A, Costanzo M, Myers CL, Andrews BJ, et al. (2011) Protein complexes are central in the yeast genetic landscape. PLoS Comput Biol 7: e1001092.

104. Morris JH, Apeltsin L, Newman AM, Baumbach J, Wittkop T, et al. (2011) clusterMaker: a multi-algorithm clustering plugin for Cytoscape. BMC Bioinformatics 12: 436.

105. Northcott PA, Korshunov A, Witt H, Hielscher T, Eberhart CG, et al. (2011) Medulloblastoma comprises four distinct molecular variants. J Clin Oncol 29: 1408-1414.

106. Oesper L, Merico D, Isserlin R, Bader GD (2011) WordCloud: a Cytoscape plugin to create a visual semantic summary of networks. Source Code Biol Med 6: 7-7.

107. Rosenquist JN, Fowler JH, Christakis NA (2011) Social network determinants of depression. Mol Psychiatry 16: 273-281.

108. Smoot M, Ono K, Ideker T, Maere S (2011) PiNGO: a Cytoscape plugin to find candidate genes in biological networks. Bioinformatics 27: 1030-1031.

109. Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T (2011) Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27: 431-432.

110. Somwar R, Erdjument-Bromage H, Larsson E, Shum D, Lockwood WW, et al. (2011) Superoxide dismutase 1 (SOD1) is a target for a small molecule identified in a screen for inhibitors of the growth of lung adenocarcinoma cell lines. Proc Natl Acad Sci U S A 108: 16375-16380.

111. Srivas R, Hannum G, Ruscheinski J, Ono K, Wang P-L, et al. (2011) Assembling global maps of cellular function through integrative analysis of physical and genetic networks. Nat Protoc 6: 1308-1323.

112. Wallace IM, Bader GD, Giaever G, Nislow C (2011) Displaying chemical information on a biological network using Cytoscape. Methods Mol Biol 781: 363-376.

113. Witt H, Mack SC, Ryzhova M, Bender S, Sill M, et al. (2011) Delineation of two clinically and molecularly distinct subgroups of posterior fossa ependymoma. Cancer Cell 20: 143-157.

114. Bandyopadhyay S, Mehta M, Kuo D, Sung MK, Chuang R, et al. (2010) Rewiring of genetic networks in response to DNA damage. Science 330: 1385-1389.

115. Christakis NA, Fowler JH (2010) Social network sensors for early detection of contagious outbreaks. PLoS One 5: e12948.

116. Chuang HY, Hofree M, Ideker T (2010) A decade of systems biology. Annu Rev Cell Dev Biol 26: 721-744.

117. Isserlin R, Merico D, Alikhani-Koupaei R, Gramolini A, Bader GD, et al. (2010) Pathway analysis of dilated cardiomyopathy using global proteomic profiling and enrichment maps. Proteomics 10: 1316-1327.

118. Kirouac DC, Ito C, Csaszar E, Roch A, Yu M, Sykes EA, Bader GD, Zandstra PW (2010) Dynamic interaction networks in a hierarchically organized tissue. Mol Syst Biol 6: 417.

OVERALL - BIBLIOGRAPHY AND REFERENCES CITED

1. Chuang, H.Y. et al. Subnetwork-based analysis of chronic lymphocytic leukemia identifies pathways that associate with disease progression. Blood 120, 2639-49 (2012).

2. Chuang, H.Y., Hofree, M. & Ideker, T. A decade of systems biology. Annu Rev Cell Dev Biol 26, 721-44 (2010).

3. Hofree, M., Shen, J.P., Carter, H., Gross, A. & Ideker, T. Network-based stratification of tumor mutations. Nat Methods 10, 1108-15 (2013).

4. Gross, A.M. et al. Multi-tiered genomic analysis of head and neck cancer ties TP53 mutation to 3p loss. Nat Genet 46, 939-43 (2014).

5. Kramer, M., Dutkowski, J., Yu, M., Bafna, V. & Ideker, T. Inferring gene ontologies from pairwise similarity data. Bioinformatics 30, i34-42 (2014).

6. Dutkowski, J. et al. NeXO Web: the NeXO ontology database and visualization platform. Nucleic Acids Res 42, D1269-74 (2014).

7. Dutkowski, J. et al. A gene ontology inferred from molecular networks. Nat Biotechnol 31, 38-45 (2013).

8. Zhang, C. et al. NOA: a cytoscape plugin for network ontology analysis. Bioinformatics 29, 2066-7 (2013).

9. Lotia, S., Montojo, J., Dong, Y., Bader, G.D. & Pico, A.R. Cytoscape app store. Bioinformatics 29, 1350-1 (2013).

10. Cerami, E. et al. The cBio cancer genomics portal: an open platform for exploring multidimensional cancer genomics data. Cancer Discov 2, 401-4 (2012).

11. Fields, S. High-throughput two-hybrid analysis. The promise and the peril. Febs J 272, 5391-9 (2005).

12. Giot, L. et al. A protein interaction map of Drosophila melanogaster. Science 302, 1727-36 (2003).

13. Li, S. et al. A map of the interactome network of the metazoan C. elegans. Science 303, 540-3 (2004).

14. Rual, J.F. et al. Towards a proteome-scale map of the human protein-protein interaction network. Nature 437, 1173-8 (2005).

15. Stelzl, U. et al. A human protein-protein interaction network: a resource for annotating the proteome. Cell 122, 957-68 (2005).

16. Uetz, P. et al. A comprehensive analysis of protein-protein interactions in Saccharomyces cerevisiae. Nature 403, 623-7 (2000).

17. Weimann, M. et al. A Y2H-seq approach defines the human protein methyltransferase interactome. Nat Methods 10, 339-42 (2013).

18. Hegele, A. et al. Dynamic protein-protein interaction wiring of the human spliceosome. Mol Cell 45, 567-80 (2012).

19. Rajagopala, S.V. et al. The binary protein-protein interaction landscape of Escherichia coli. Nat Biotechnol 32, 285-90 (2014).

20. Gavin, A.C. et al. Proteome survey reveals modularity of the yeast cell machinery. Nature 440, 631-6 (2006).

21. Krogan, N.J. et al. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440, 637-43 (2006).

22. Jager, S. et al. Global landscape of HIV-human protein complexes. Nature 481, 365-70 (2012).

23. Yasui, N. et al. Directed network wiring identifies a key protein interaction in embryonic stem cell differentiation. Mol Cell 54, 1034-41 (2014).

24. Zheng, Y. et al. Temporal regulation of EGF signalling networks by the scaffold protein Shc1. Nature 499, 166-71 (2013).

25. Bisson, N. et al. Selected reaction monitoring mass spectrometry reveals the dynamics of signaling through the GRB2 adaptor. Nat Biotechnol 29, 653-8 (2011).

26. Breitkreutz, A. et al. A global protein kinase and phosphatase interaction network in yeast. Science 328, 1043-6 (2010).

27. Varjosalo, M. et al. Interlaboratory reproducibility of large-scale human protein-complex analysis by standardized AP-MS. Nat Methods 10, 307-14 (2013).

28. Maeda, K. et al. Interactome map uncovers phosphatidylserine transport by oxysterol-binding proteins. Nature 501, 257-61 (2013).

29. Bonn, S. et al. Cell type-specific chromatin immunoprecipitation from multicellular complex samples using BiTS-ChIP. Nat Protoc 7, 978-94 (2012).

30. Lev, I. et al. Reverse PCA, a systematic approach for identifying genes important for the physical interaction between protein pairs. PLoS Genet 9, e1003838 (2013).

31. Harbison, C.T. et al. Transcriptional regulatory code of a eukaryotic genome. Nature 431, 99-104 (2004).

32. Ren, B. et al. Genome-wide location and function of DNA binding proteins. Science 290, 2306-9 (2000).

33. Loh, Y.H. et al. The Oct4 and Nanog transcription network regulates pluripotency in mouse embryonic stem cells. Nat Genet 38, 431-40 (2006).

34. Wei, C.L. et al. A global map of p53 transcription-factor binding sites in the human genome. Cell 124, 207-19 (2006).

35. Boyle, A.P. et al. Comparative analysis of regulatory information and circuits across distant species. Nature 512, 453-6 (2014).

36. Gerstein, M.B. et al. Architecture of the human regulatory network derived from ENCODE data. Nature 489, 91-100 (2012).

37. Kasowski, M. et al. Variation in transcription factor binding among humans. Science 328, 232-5 (2010).

38. Stefflova, K. et al. Cooperativity and rapid evolution of cobound transcription factors in closely related mammals. Cell 154, 530-40 (2013).

39. Schmidt, D. et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science 328, 1036-40 (2010).

40. Kwiatkowski, N. et al. Targeting transcription regulation in cancer with a covalent CDK7 inhibitor. Nature 511, 616-20 (2014).

41. Whyte, W.A. et al. Master transcription factors and mediator establish super-enhancers at key cell identity genes. Cell 153, 307-19 (2013).

42. Novershtern, N. et al. Densely interconnected transcriptional circuits control cell states in human hematopoiesis. Cell 144, 296-309 (2011).

43. Naumova, N., Smith, E.M., Zhan, Y. & Dekker, J. Analysis of long-range chromatin interactions using Chromosome Conformation Capture. Methods 58, 192-203 (2012).

44. Miele, A., Gheldof, N., Tabuchi, T.M., Dostie, J. & Dekker, J. Mapping chromatin interactions by chromosome conformation capture. Curr Protoc Mol Biol Chapter 21, Unit 21.11 (2006).

45. Dixon, J.R. et al. Topological domains in mammalian genomes identified by analysis of chromatin interactions. Nature 485, 376-80 (2012).

46. Ryan, O. et al. Global gene deletion analysis exploring yeast filamentous growth. Science 337, 1353-6 (2012).

47. Cokol, M. et al. Systematic exploration of synergistic drug pairs. Mol Syst Biol 7, 544 (2011).

48. Costanzo, M. et al. The genetic landscape of a cell. Science 327, 425-31 (2010).

49. Ooi, S.L., Shoemaker, D.D. & Boeke, J.D. DNA helicase gene interaction network defined using synthetic lethality analyzed by microarray. Nat Genet 35, 277-86 (2003).

50. Collins, S.R., Schuldiner, M., Krogan, N.J. & Weissman, J.S. A strategy for extracting and analyzing large-scale quantitative epistatic interaction data. Genome Biol 7, R63 (2006).

51. Lehner, B., Crombie, C., Tischler, J., Fortunato, A. & Fraser, A.G. Systematic mapping of genetic interactions in Caenorhabditis elegans identifies common modifiers of diverse signaling pathways. Nat Genet 38, 896-903 (2006).

52. Braberg, H. et al. Quantitative analysis of triple-mutant genetic interactions. Nat Protoc 9, 1867-81 (2014).

53. Braberg, H. et al. From structure to systems: high-resolution, quantitative genetic analysis of RNA polymerase II. Cell 154, 775-88 (2013).

54. Guenole, A. et al. Dissection of DNA damage responses using multiconditional genetic interaction maps. Mol Cell 49, 346-58 (2013).

55. Ryan, C.J. et al. Hierarchical modularity and the evolution of genetic interactomes across species. Mol Cell 46, 691-704 (2012).

56. Beltrao, P., Cagney, G. & Krogan, N.J. Quantitative genetic interactions reveal biological modularity. Cell 141, 739-45 (2010).

57. Bandyopadhyay, S. et al. Rewiring of genetic networks in response to DNA damage. Science 330, 1385-9 (2010).

58. Bao, L. et al. Combining gene expression QTL mapping and phenotypic spectrum analysis to uncover gene regulatory relationships. Mamm Genome 17, 575-83 (2006).

59. Chesler, E.J., Lu, L., Wang, J., Williams, R.W. & Manly, K.F. WebQTL: rapid exploratory analysis of gene expression and genetic networks for brain and behavior. Nat Neurosci 7, 485-6 (2004).

60. Petretto, E. et al. Heritability and tissue specificity of expression quantitative trait loci. PLoS Genet 2, e172 (2006).

61. Schadt, E.E. et al. Genetics of gene expression surveyed in maize, mouse and man. Nature 422, 297-302 (2003).

62. Albert, F.W., Treusch, S., Shockley, A.H., Bloom, J.S. & Kruglyak, L. Genetics of single-cell protein abundance variation in large yeast populations. Nature 506, 494-7 (2014).

63. Ehrenreich, I.M. et al. Dissection of genetically complex traits with extremely large pools of yeast segregants. Nature 464, 1039-42 (2010).

64. Zhang, B. et al. Integrated systems approach identifies genetic nodes and networks in late-onset Alzheimer's disease. Cell 153, 707-20 (2013).

65. Grundberg, E. et al. Mapping cis- and trans-regulatory effects across multiple tissues in twins. Nat Genet 44, 1084-9 (2012).

66. Schadt, E.E., Woo, S. & Hao, K. Bayesian method to predict individual SNP genotypes from gene expression data. Nat Genet 44, 603-8 (2012).

67. Kumar, R., Novak, J. & Tomkins, A. Structure and Evolution of Online Social Networks. Link Mining: Models, Algorithms, and Applications, 337-357 (2010).

68. Christakis, N.A. & Fowler, J.H. The spread of obesity in a large social network over 32 years. N Engl J Med 357, 370-9 (2007).

69. Christakis, N.A. & Fowler, J.H. The collective dynamics of smoking in a large social network. N Engl J Med 358, 2249-58 (2008).

70. Fowler, J.H. & Christakis, N.A. Dynamic spread of happiness in a large social network: longitudinal analysis over 20 years in the Framingham Heart Study. British Medical Journal 337 (2008).

71. DeRisi, J.L., Iyer, V.R. & Brown, P.O. Exploring the metabolic and genetic control of gene expression on a genomic scale. Science 278, 680-6 (1997).

72. Mortazavi, A., Williams, B.A., McCue, K., Schaeffer, L. & Wold, B. Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nat Methods 5, 621-8 (2008).

73. Gygi, S.P. et al. Quantitative analysis of complex protein mixtures using isotope-coded affinity tags. Nat Biotechnol 17, 994-999 (1999).

74. Zhou, H., Watts, J.D. & Aebersold, R. A systematic approach to the analysis of protein phosphorylation. Nat Biotechnol 19, 375-8 (2001).

75. Griffin, J.L., Mann, C.J., Scott, J., Shoulders, C.C. & Nicholson, J.K. Choline containing metabolites during cell transfection: an insight into magnetic resonance spectroscopy detectable changes. FEBS Lett 509, 263-6 (2001).

76. Schwikowski, B., Uetz, P. & Fields, S. A network of protein-protein interactions in yeast. Nat Biotechnol 18, 1257-61 (2000).

77. Lee, I., Date, S.V., Adai, A.T. & Marcotte, E.M. A probabilistic functional network of yeast genes. Science 306, 1555-8 (2004).

78. Vanunu, O., Magger, O., Ruppin, E., Shlomi, T. & Sharan, R. Associating genes and protein complexes with disease via network propagation. PLoS Comput Biol 6, e1000641 (2010).

79. Ravasz, E., Somera, A.L., Mongru, D.A., Oltvai, Z.N. & Barabasi, A.L. Hierarchical organization of modularity in metabolic networks. Science 297, 1551-5 (2002).

80. Barabasi, A.L. & Oltvai, Z.N. Network biology: understanding the cell's functional organization. Nat Rev Genet 5, 101-13 (2004).

81. Bader, G.D. & Hogue, C.W. An automated method for finding molecular complexes in large protein interaction networks. BMC Bioinformatics 4, 2 (2003).

82. Milo, R. et al. Network motifs: simple building blocks of complex networks. Science 298, 824-7 (2002).

83. Mangan, S. & Alon, U. Structure and function of the feed-forward loop network motif. Proc Natl Acad Sci U S A 100, 11980-5 (2003).

84. Prill, R.J., Iglesias, P.A. & Levchenko, A. Dynamic Properties of Network Motifs Contribute to Biological Network Organization. PLoS Biol 3, e343 (2005).

85. Flannick, J., Novak, A., Srinivasan, B.S., McAdams, H.H. & Batzoglou, S. Graemlin: general and robust alignment of multiple large interaction networks. Genome Res 16, 1169-81 (2006).

86. Singh, R., Xu, J. & Berger, B. Global alignment of multiple protein interaction networks with application to functional orthology detection. Proc Natl Acad Sci U S A 105, 12763-8 (2008).

87. Kuchaiev, O. & Przulj, N. Integrative network alignment reveals large regions of global network similarity in yeast and human. Bioinformatics 27, 1390-6 (2011).

88. Patro, R. & Kingsford, C. Global network alignment using multiscale spectral signatures. Bioinformatics 28, 3105-14 (2012).

89. Kelley, B.P. et al. PathBLAST: a tool for alignment of protein interaction networks. Nucleic Acids Res 32, W83-8 (2004).

90. Kalaev, M., Smoot, M., Ideker, T. & Sharan, R. NetworkBLAST: comparative analysis of protein networks. Bioinformatics 24, 594-6 (2008).

91. Wang, X. et al. Three-dimensional reconstruction of protein networks provides insight into human genetic disease. Nat Biotechnol 30, 159-64 (2012).

92. Mosca, R., Ceol, A., Stein, A., Olivella, R. & Aloy, P. 3did: a catalog of domain-based interactions of known three-dimensional structure. Nucleic Acids Res 42, D374-9 (2014).

93. Bock, J.R. & Gough, D.A. Predicting protein-protein interactions from primary structure. Bioinformatics 17, 455-60 (2001).

94. Wong, S.L. et al. Combining biological networks to predict genetic interactions. Proc Natl Acad Sci U S A 101, 15682-7 (2004).

95. Ideker, T. & Sharan, R. Protein networks in disease. Genome Res 18, 644-52 (2008).

96. Halldorsson, B.V. & Sharan, R. Network-based interpretation of genomic variation data. J Mol Biol 425, 3964-9 (2013).

97. Vandin, F., Upfal, E. & Raphael, B.J. De novo discovery of mutated driver pathways in cancer. Genome Res 22, 375-85 (2012).

98. Ng, S. et al. PARADIGM-SHIFT predicts the function of mutations in multiple cancers using pathway impact analysis. Bioinformatics 28, i640-i646 (2012).

99. Vaske, C.J. et al. Inference of patient-specific pathway activities from multi-dimensional cancer genomics data using PARADIGM. Bioinformatics 26, i237-45 (2010).

100. Chuang, H.Y., Lee, E., Liu, Y.T., Lee, D. & Ideker, T. Network-based classification of breast cancer metastasis. Mol Syst Biol 3, 140 (2007).

101. Efroni, S., Schaefer, C.F. & Buetow, K.H. Identification of key processes underlying cancer phenotypes using biologic pathway analysis. PLoS ONE 2, e425 (2007).

102. Crosby, M.A., Goodman, J.L., Strelets, V.B., Zhang, P. & Gelbart, W.M. FlyBase: genomes by the dozen. Nucleic Acids Res 35, D486-91 (2007).

103. Tuck, D.P., Kluger, H.M. & Kluger, Y. Characterizing disease states from topological properties of transcriptional regulatory networks. BMC Bioinformatics 7, 236 (2006).

104. Dixon, A.L. et al. A genome-wide association study of global gene expression. Nat Genet 39, 1202-7 (2007).

105. Goring, H.H. et al. Discovery of expression QTLs using large-scale transcriptional profiling in human lymphocytes. Nat Genet 39, 1208-16 (2007).

106. Stranger, B.E. et al. Population genomics of human gene expression. Nat Genet 39, 1217-24 (2007).

107. Lage, K. et al. A human phenome-interactome network of protein complexes implicated in genetic disorders. Nat Biotechnol 25, 309-16 (2007).

108. Oti, M. & Brunner, H.G. The modular nature of genetic diseases. Clin Genet 71, 1-11 (2007).

109. Magger, O., Waldman, Y.Y., Ruppin, E. & Sharan, R. Enhancing the prioritization of disease-causing genes through tissue specific protein interaction networks. PLoS Comput Biol 8, e1002690 (2012).

110. Carter, H., Hofree, M. & Ideker, T. Genotype to phenotype via network analysis. Curr Opin Genet Dev 23, 611-21 (2013).

111. Novarino, G. et al. Exome sequencing links corticospinal motor neuron disease to common neurodegenerative disorders. Science 343, 506-11 (2014).

112. Evangelou, M., Dudbridge, F. & Wernisch, L. Two novel pathway analysis methods based on a hierarchical model. Bioinformatics 30, 690-7 (2014).

113. Bakir-Gungor, B., Egemen, E. & Sezerman, O.U. PANOGA: a web server for identification of SNP-targeted pathways from genome-wide association study data. Bioinformatics 30, 1287-9 (2014).

114. Ciriello, G., Cerami, E., Sander, C. & Schultz, N. Mutual exclusivity analysis identifies oncogenic network modules. Genome Res 22, 398-406 (2012).

115. Lee, K. et al. Proteome-wide discovery of mislocated proteins in cancer. Genome Res 23, 1283-94 (2013).

116. Ideker, T., Galitski, T. & Hood, L. A new approach to decoding life: systems biology. Annu Rev Genomics Hum Genet 2, 343-72 (2001).

117. Ideker, T. et al. Integrated genomic and proteomic analyses of a systematically perturbed metabolic network. Science 292, 929-34 (2001).

118. Ideker, T., Ozier, O., Schwikowski, B. & Siegel, A.F. Discovering regulatory and signalling circuits in molecular interaction networks. Bioinformatics 18 Suppl 1, S233-40 (2002).

119. Kelley, B.P. et al. Conserved pathways within bacteria and yeast as revealed by global protein network alignment. Proc Natl Acad Sci U S A 100, 11394-9 (2003).

120. Sharan, R. et al. Conserved patterns of protein interaction in multiple species. Proc Natl Acad Sci U S A 102, 1974-9 (2005).

121. Suthram, S., Sittler, T. & Ideker, T. The Plasmodium protein network diverges from those of other eukaryotes. Nature 438, 108-12 (2005).

122. Ravasi, T. et al. An atlas of combinatorial transcriptional regulation in mouse and man. Cell 140, 744-52 (2010).

123. Sharan, R. & Ideker, T. Modeling cellular machinery through biological network comparison. Nat Biotechnol 24, 427-33 (2006).

124. Ho, Y. et al. Systematic identification of protein complexes in Saccharomyces cerevisiae by mass spectrometry. Nature 415, 180-3 (2002).

125. Tong, A.H. et al. A combined experimental and computational strategy to define protein interaction networks for peptide recognition modules. Science 295, 321-4 (2002).

126. Tong, A.H. et al. Systematic genetic analysis with ordered arrays of yeast deletion mutants. Science 294, 2364-8 (2001).

127. Bader, G.D. & Hogue, C.W. Analyzing yeast protein-protein interaction data obtained from different sources. Nat Biotechnol 20, 991-7 (2002).

128. Bader, G.D. et al. BIND--The Biomolecular Interaction Network Database. Nucleic Acids Res 29, 242-5 (2001).

129. Pinto, D. et al. Functional impact of global rare copy number variation in autism spectrum disorders. Nature (2010).

130. Mack, S.C. et al. Epigenomic alterations define lethal CIMP-positive ependymomas of infancy. Nature 506, 445-50 (2014).

131. Shannon, P. et al. Cytoscape: a software environment for integrated models of biomolecular interaction networks. Genome Res 13, 2498-504 (2003).

132. Cline, M.S. et al. Integration of biological networks and gene expression data using Cytoscape. Nat Protoc 2, 2366-82 (2007).

133. Smoot, M.E., Ono, K., Ruscheinski, J., Wang, P.L. & Ideker, T. Cytoscape 2.8: new features for data integration and network visualization. Bioinformatics 27, 431-2 (2011).

134. Koh, J.L. et al. DRYGIN: a database of quantitative genetic interaction networks in yeast. Nucleic Acids Res 38, D502-7 (2010).

135. Zhang, C. et al. Mosaic: making biological sense of complex networks. Bioinformatics 28, 1943-4 (2012).

136. Zhang, C. et al. NOA: a cytoscape plugin for network ontology analysis. Bioinformatics 29, 2066-2067 (2013).

137. Fried, J.Y., van Iersel, M.P., Aladjem, M.I., Kohn, K.W. & Luna, A. PathVisio-Faceted Search: an exploration tool for multi-dimensional navigation of large pathways. Bioinformatics 29, 1465-1466 (2013).

138. Fijten, R.R.R., Jennen, D.G.J. & van Delft, J.H.M. Pathways for ligand activated nuclear receptors to unravel the genomic responses induced by hepatotoxicants. Curr Drug Metab 14, 1022-1028 (2013).

139. Fiume, M. et al. Savant Genome Browser 2: visualization and analysis for population-scale genomics. Nucleic Acids Res 40, 615-621 (2012).

140. Chandan, K., van Iersel, M.P., Aladjem, M.I., Kohn, K.W. & Luna, A. PathVisio-Validator: a rule-based validation plugin for graphical pathway notations. Bioinformatics 28, 889-890 (2012).

141. Kutmon, M., Lotia, S., Evelo, C.T. & Pico, A.R. WikiPathways App for Cytoscape: making biological pathways amenable to network analysis and visualization. F1000Research 3 (2014).

142. van Iersel, M.P. et al. Integrated visualization of a multi-omics study of starvation in mouse intestine. J Integr Bioinform 11, 235 (2014).

143. Gao, J., Zhang, C., van Iersel, M., Zhang, L., Xu, D., Schultz, N. & Pico, A.R. BridgeDB app: unifying identifier mapping services for Cytoscape. F1000Research 3 (2014).

144. Morris, J.H., Huang, C.C., Babbitt, P.C. & Ferrin, T.E. structureViz: linking Cytoscape and UCSF Chimera. Bioinformatics 23, 2345-7 (2007).

145. Aloy, P. et al. Structure-based assembly of protein complexes in yeast. Science 303, 2026-9 (2004).

146. Kim, P.M., Lu, L.J., Xia, Y. & Gerstein, M.B. Relating three-dimensional structures to protein networks provides evolutionary insights. Science 314, 1938-41 (2006).

147. Loew, L.M. & Schaff, J.C. The Virtual Cell: a software environment for computational cell biology. Trends Biotechnol 19, 401-6 (2001).