
Protocol

Exploratory Analysis of Biological Networks through Visualization, Clustering, and Functional Annotation in Cytoscape

Anastasia Baryshnikova1

Lewis-Sigler Institute for Integrative Genomics, Princeton University, Princeton, New Jersey 08544

Biological networks define how genes, proteins, and other cellular components interact with one another to carry out specific functions, providing a scaffold for understanding cellular organization. Although in-depth network analysis requires advanced mathematical and computational knowledge, a preliminary visual exploration of biological networks is accessible to anyone with basic computer skills. Visualization of biological networks is used primarily to examine network topology, identify functional modules, and predict gene functions based on gene connectivity within the network. Networks are excellent at providing a bird’s-eye view of data sets and have the power of illustrating complex ideas in simple and intuitive terms. In addition, they enable exploratory analysis and generation of new hypotheses, which can then be tested using rigorous statistical and experimental tools. This protocol describes a simple procedure for visualizing a biological network using the genetic interaction similarity network for Saccharomyces cerevisiae as an example. The visualization procedure described here relies on the open-source network visualization software Cytoscape and includes detailed instructions on formatting and loading the data, clustering networks, and overlaying functional annotations.

MATERIALS

Equipment

BiNGO (Maere et al. 2005) or other relevant app for network annotation (see Step 17)

Computer system

System requirements are available at www.cytoscape.org/documentation_users.html.

Cytoscape 3.0 (Smoot et al. 2011) (see Step 1)

Network data to be analyzed or sample data (see Steps 2 and 3)

METHOD

Install Cytoscape and Prepare Data

1. Download Cytoscape 3.0 from www.cytoscape.org/download.php.

Instructions for installation are available at www.cytoscape.org/documentation_users.html.

Proceed to either Step 2 (if using sample data) or Step 3.

1Correspondence: [email protected]

© 2016 Cold Spring Harbor Laboratory Press

Cite this protocol as Cold Spring Harb Protoc; doi:10.1101/pdb.prot077644


Downloaded from http://cshprotocols.cshlp.org/ on June 22, 2018 - Published by Cold Spring Harbor Laboratory Press


2. Download the following sample data from www.baryshnikova-lab.org/publications/4/.

• Data file 1. A tab-delimited text file containing the genetic interaction similarity data from Costanzo et al. (2010): Costanzo_Science_2010_correlation_network.txt.

This file contains a list of gene pairs (Columns 1 and 2) whose genetic interaction profile similarity, as measured by Pearson correlation coefficient, is >0.2 (Column 3). Here, a genetic interaction is defined as an unexpected phenotype arising from combining two mutations in the same organism. In Costanzo et al. (2010), genetic interactions were quantitatively measured for 30% of all possible double-mutant combinations in the yeast Saccharomyces cerevisiae. Genes sharing similar genetic interactions are known to share similar functions. We can therefore build a network of functional relationships between all genes in the data set by computing a quantitative measure of profile similarity, such as Pearson correlation, and selecting the gene pairs with the highest observable correlation.

• Data file 2. A tab-delimited text file containing the functional annotation standard used for visualization in Figure 1 in Costanzo et al. (2010): Costanzo_Science_2010_functional_annotation.txt.

This file contains the list of genes in the Costanzo et al. network and four different node attributes: CommonName (Column 2; the three-letter code for the gene), ORF (Column 3; the systematic name for the gene), FunctionalAnnotation (Column 4; discrete values indicating the association of the gene with one of 13 different functional groups; see Step 18), and Essentiality (Column 5; binary 0–1 values indicating whether the gene is known to be essential or not). Column 1 contains a unique StrainID that matches the node label used in Data file 1.
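The way a similarity network such as Data file 1 is derived can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the pipeline used by Costanzo et al. (2010): the gene names and scores in `toy_profiles` are made up, and the function names `pearson` and `similarity_network` are hypothetical. Each gene is represented by a profile of interaction scores, every pair of profiles is compared by Pearson correlation, and only pairs above a threshold (0.2, as in Data file 1) are kept as edges.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a flat profile carries no similarity information
    return cov / sqrt(var_x * var_y)


def similarity_network(profiles, threshold=0.2):
    """Return (gene_a, gene_b, r) for every gene pair whose r exceeds threshold."""
    edges = []
    for gene_a, gene_b in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_a], profiles[gene_b])
        if r > threshold:
            edges.append((gene_a, gene_b, r))
    return edges


# Hypothetical interaction scores for three made-up genes (not real data).
toy_profiles = {
    "geneA": [-0.5, 0.1, 0.3, -0.2],
    "geneB": [-0.4, 0.0, 0.4, -0.1],
    "geneC": [0.3, -0.2, -0.4, 0.2],
}
toy_edges = similarity_network(toy_profiles)
```

In this toy input, geneA and geneB have similar profiles and are connected, whereas geneC is anticorrelated with both and stays unconnected. Each resulting tuple corresponds to one row of a three-column edge file.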

3. (Optional) If not using sample data, prepare network data for import.

Network data can be imported into Cytoscape using a variety of file formats (wiki.cytoscape.org/Cytoscape_3/UserManual/Network_Formats/). The simplest format is a delimited text file that defines a network by listing its interactions (Fig. 1A; Data file 1). Each interaction (or “edge”) must report the labels of the two connected nodes (i.e., gene or protein names) and, optionally, may also include any number of interaction properties (or “edge attributes”), such as type, strength, and confidence of the interaction. Columns containing node labels and edge attributes must be separated by a special character (e.g., a tab or a comma) that is never encountered within a node label (Fig. 1A; Data file 1).

The delimited text file is the simplest way to store and exchange network data because it can be generated and edited in many applications on many platforms and is easily interpretable by both humans and computers. The limitation of delimited text files is that they do not carry any information relative to the positioning of nodes in the network and do not specify any node attribute, such as alternative labels, shape, color, or annotation. This information, if available, must be stored in an additional file and loaded separately (see Step 18i, Loading node attributes). More complex file formats, such as XGMML, store all network information in a single file and allow a user to import organized and annotated networks in a single step. XGMML files, however, must be generated using a specialized tool, involve relatively complex syntax (http://cgi5.cs.rpi.edu/research/groups/pb/punin/public_html/XGMML/), and are not easily manipulated. As a result, XGMML files are more appropriate for advanced users to exchange visualizations across different applications or across different networks within the same application.
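As an illustration of the delimited format, the following Python sketch writes a list of interactions as tab-delimited text and rejects any node label that contains the delimiter, since such a label would be split across columns on import. The helper name `write_edge_list` and the example labels are hypothetical; only the standard library `csv` module is used.

```python
import csv
import io


def write_edge_list(edges, handle, delimiter="\t"):
    """Write (source, target, attribute...) rows as delimited text.

    Raises ValueError if a node label contains the delimiter, because such a
    label would be split across columns when the file is imported.
    """
    writer = csv.writer(handle, delimiter=delimiter, lineterminator="\n",
                        quoting=csv.QUOTE_NONE)
    for source, target, *attributes in edges:
        for label in (source, target):
            if delimiter in label:
                raise ValueError(f"delimiter found inside node label: {label!r}")
        writer.writerow([source, target, *attributes])


# Hypothetical edges: two node labels plus one edge attribute (a correlation).
example_edges = [("geneA", "geneB", 0.31), ("geneA", "geneC", 0.27)]
buffer = io.StringIO()
write_edge_list(example_edges, buffer)
edge_text = buffer.getvalue()
```

Writing to an open file instead of `io.StringIO` produces a file with one interaction per row, with the node labels in Columns 1 and 2 and the edge attribute in Column 3.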

Load the Network

4. In Cytoscape, go to File → Import → Network → File. Choose the file containing the network data and click Open.

In the sample data provided, the file is Costanzo_Science_2010_correlation_network.txt.

5. View the Preview section of the window, which displays a sample of the input file and shows how information in the file will be parsed.

6. If parsing appears incorrect (e.g., two node labels are merged in the same column or, conversely, a single node label is split into multiple columns), click on Show Text File Import Options in the Advanced section of the window to specify a different column delimiter character. In addition, indicate whether the input file contains any header rows (Start Import Row) and/or whether the first row should be used to label the columns (Transfer first line as column names).

7. In the Interaction Definition section of the window, specify which columns contain the labels of the interacting nodes (Source Interaction and Target Interaction).

The corresponding columns in the Preview section will be highlighted in purple and orange, respectively.


For undirected edges (e.g., protein–protein interactions), it does not matter which node is labeled as Source and which one as Target, unless, for example, it is important to preserve the order in which the interaction was detected (Source and Target may correspond to bait and prey proteins, respectively). For directed edges (e.g., kinase–substrate relationships), Source and Target should indicate the upstream (kinase) and downstream (substrate) nodes, respectively.

In the example provided (undirected edges), Columns 1 and 2 should be labeled as Source and Target (or vice versa) (Fig. 1A).

8. Click on the columns containing edge attributes to select them. Each selected column will be highlighted in blue. To name an attribute and/or to specify whether it contains text or numerical values, right-click (or Command-click in Mac OS X) on the column and input the relevant information.

In the example provided, Column 3 contains a measure of similarity (Pearson correlation coefficients) between the genetic interaction profiles of Source and Target genes. This attribute should be named Correlation and assigned to the Floating Point data type (Fig. 1A). If, instead, this column contained a discrete score (from 1 to 5) representing our level of confidence in the correlation, we would name the edge attribute Confidence and assign it to the Integer data type.

9. Click OK to load the network.

10. If the network does not appear automatically, right-click on the name of the network in the Control Panel on the left and select Create View.

This step is usually necessary for large networks because, to conserve memory, they are not visualized automatically.

FIGURE 1. Key steps for visualizing a large-scale biological network using Cytoscape 3.0. (A) Loading the network data from a tab-delimited text file. (B) Creating a preliminary view of the network. (C) Organizing the network using the edge-weighted spring-embedded layout. (D) Adjusting the visual appearance of the network by changing color, shape, size, and transparency of nodes and edges. (E) Loading customized functional annotations as node attributes. (F) Adjusting node appearance based on node attributes.


In the sample data provided, the Control Panel should list a network named Costanzo_Science_2010_correlation_network.txt with 2838 nodes and 10,189 edges (Fig. 1B).

11. Once the network has been loaded and visualized, save it as a Cytoscape Session file (*.cys), which can then be reopened at a later time on the same or on a different computer and will maintain all the properties (layout, annotations, and visual style) of the original network.
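Before or after importing, it can be useful to check independently how a network file will be parsed and how many nodes and edges it contains. The following Python sketch is one way to do this outside Cytoscape. It assumes a three-column file (source, target, numeric attribute) such as the one described for Data file 1; the function names and the toy rows are hypothetical.

```python
import csv
import io


def read_edge_list(handle, delimiter="\t", has_header=False):
    """Parse rows of source, target, correlation into typed edge tuples."""
    reader = csv.reader(handle, delimiter=delimiter)
    if has_header:
        next(reader, None)  # skip the column-name row
    edges = []
    for row_number, row in enumerate(reader, start=1):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            # Usually a sign that the wrong delimiter was chosen.
            raise ValueError(
                f"data row {row_number}: expected 3 columns, got {len(row)}")
        source, target, correlation = row
        edges.append((source, target, float(correlation)))
    return edges


def network_size(edges):
    """Count distinct nodes and distinct undirected edges."""
    nodes = {label for source, target, _ in edges for label in (source, target)}
    pairs = {frozenset((source, target)) for source, target, _ in edges}
    return len(nodes), len(pairs)


# Toy input with made-up labels; the last row repeats an edge in reverse order.
toy_text = "geneA\tgeneB\t0.31\ngeneA\tgeneC\t0.27\ngeneC\tgeneA\t0.27\n"
toy_network = read_edge_list(io.StringIO(toy_text))
node_count, edge_count = network_size(toy_network)
```

Because the edges are treated as undirected, the pair listed in both orders is counted once. Note that Cytoscape itself may count such duplicate rows as separate edges, so the two tallies can differ for files that contain them.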

Organize the Network

For a discussion of possible ways to organize the network see the Discussion section.

12. As an example, here we apply the edge-weighted spring-embedded layout, which mimics the behavior of a system of connected springs and reorganizes the network such that densely connected nodes are positioned close to each other, whereas disconnected nodes are spread farther apart. To apply the layout, select Layout → Edge-weighted Spring Embedded. As an option, an edge attribute can be used to specify the spring force coefficient. In our example, choose the Correlation edge attribute to apply stronger forces to highly correlated gene pairs or select (none) to assign equal forces to all gene pairs.

Because of a randomization step in the algorithm, the edge-weighted spring-embedded layout generates a new network configuration at every run. Although these configurations are equivalent in terms of force balance between the edges connecting each node, it is not possible to reproduce the same exact network visualization in two independent Cytoscape sessions.
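The idea behind a spring-embedded layout, and the reason it yields a different picture at every run, can be illustrated with a toy force-directed layout in Python. This sketch is not the algorithm implemented in Cytoscape. It is a simplified scheme in the spirit of Fruchterman and Reingold (1991), in which all node pairs repel each other and each edge pulls its endpoints together with a force scaled by an edge weight (e.g., a correlation). The function name `spring_layout` and all parameter values are assumptions made for illustration.

```python
import math
import random


def spring_layout(nodes, weighted_edges, iterations=200, seed=None):
    """Toy force-directed layout for a non-empty list of nodes.

    Returns {node: (x, y)}. Starting positions are random, so the result
    depends on the seed of the random number generator.
    """
    rng = random.Random(seed)
    pos = {n: [rng.random(), rng.random()] for n in nodes}  # random start
    k = 1.0 / math.sqrt(len(nodes))  # preferred distance between nodes
    step = 0.1  # maximum movement per iteration, reduced over time
    for _ in range(iterations):
        disp = {n: [0.0, 0.0] for n in nodes}
        # Repulsion between every pair of nodes.
        for i, a in enumerate(nodes):
            for b in nodes[i + 1:]:
                dx = pos[a][0] - pos[b][0]
                dy = pos[a][1] - pos[b][1]
                dist = math.hypot(dx, dy) or 1e-9
                push = k * k / dist
                disp[a][0] += dx / dist * push
                disp[a][1] += dy / dist * push
                disp[b][0] -= dx / dist * push
                disp[b][1] -= dy / dist * push
        # Attraction along edges, scaled by the edge weight.
        for a, b, weight in weighted_edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            dist = math.hypot(dx, dy) or 1e-9
            pull = weight * dist * dist / k
            disp[a][0] -= dx / dist * pull
            disp[a][1] -= dy / dist * pull
            disp[b][0] += dx / dist * pull
            disp[b][1] += dy / dist * pull
        # Move each node along its net force, capped at the current step.
        for n in nodes:
            length = math.hypot(disp[n][0], disp[n][1]) or 1e-9
            move = min(length, step)
            pos[n][0] += disp[n][0] / length * move
            pos[n][1] += disp[n][1] / length * move
        step *= 0.98  # cool down so the layout settles
    return {n: (p[0], p[1]) for n, p in pos.items()}


# Made-up network: two connected pairs with no edges between them.
toy_nodes = ["a", "b", "c", "d"]
toy_weighted_edges = [("a", "b", 1.0), ("c", "d", 1.0)]
layout_run_1 = spring_layout(toy_nodes, toy_weighted_edges, seed=1)
layout_run_2 = spring_layout(toy_nodes, toy_weighted_edges, seed=2)
```

In this sketch, fixing `seed` makes a run repeatable, whereas different seeds give different coordinates for the same network. Cytoscape's layout does not expose such a seed in the procedure described here, which is why saving the session (Step 11) is the practical way to preserve a given arrangement.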

Adjust the Visual Style of the Network

The procedure described below reproduces the visual style of the correlation network published as Fig. 1 in Costanzo et al. (2010). It is provided as an example of the visual properties that can be adjusted in Cytoscape to generate informative network visualizations (Fig. 1D). Additional options can be explored by the user within Cytoscape.

13. Set the background color to black.

i. Go to Control Panel → VizMapper. Click on the image in the Defaults (Click to edit) section.

ii. Click on Network at the bottom right of the window. Double-click on Network Background Paint and choose the black color.

14. Make the edges white and semitransparent.

i. In the same window as above, click on Edge at the bottom right of the window.

ii. In the bottom center, click on Show All.

iii. Scroll through the properties to find Edge Stroke Color (Unselected), double-click, and choose the white color.

iv. Scroll through the properties to find Edge Transparency, double-click, and input 50 (values 0 and 255 correspond to 0% and 100% opacity, respectively).

15. Make the nodes white, small, and round.

i. In the same window as above, click on Node at the bottom right of the window.

ii. In the bottom center, click on Show All.

iii. Scroll through the properties to find Node Size, double-click, and input 20.

iv. Scroll through the properties to find Node Fill Color, double-click, and choose the white color.

v. Scroll through the properties to find Node Shape, double-click, and choose Ellipse.

16. Click on Apply in the bottom left corner of the window (Fig. 1D).

Depending on the network size, some of the visual changes might not be immediately visible in the Cytoscape main window. To see the changes, it may be necessary to zoom in or out of the current network view (“+” and “−” magnifying glasses in the top toolbar) or export the current network view into a PDF file (see Step 19).
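Because the Edge Transparency property is entered on a 0 to 255 scale, it can help to convert between that scale and percent opacity. The short Python sketch below does this arithmetic, assuming a linear relationship between the two scales as described in Step 14iv; the function names are hypothetical.

```python
def opacity_percent(setting):
    """Convert a 0-255 transparency setting to percent opacity."""
    if not 0 <= setting <= 255:
        raise ValueError("setting must be between 0 and 255")
    return 100.0 * setting / 255


def setting_for_percent(percent):
    """Convert percent opacity (0-100) to the nearest 0-255 setting."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(255 * percent / 100)
```

Under this assumption, the value 50 used in Step 14iv corresponds to roughly 20% opacity, that is, mostly transparent edges, which keeps dense regions of a large network readable.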


Annotate the Network

For a discussion of how and why to use network annotation, see the Discussion section.

17. Perform functional enrichment.

While functional enrichment tests are not directly implemented in the default version of Cytoscape, numerous third-party apps have been developed that provide this capability (http://apps.cytoscape.org/apps/with_tag/enrichmentanalysis). As an example, I describe the app BiNGO (Maere et al. 2005) with most of its default settings. BiNGO is particularly suitable for functional annotation of biological networks because it enables the user to test selected network regions for enrichment against common functional annotation standards, such as Gene Ontology biological process terms, and returns a standard significance p-value.

i. To download and install BiNGO, choose the menu Apps → App Manager. In the Search window, type BiNGO, select the app from the list, and click on Install.

ii. To run a functional enrichment analysis, select a subset of nodes (e.g., a cluster of interest) from the network and choose the menu Apps → BiNGO.

iii. Type in a name for the selected cluster.

By default, BiNGO extracts node labels directly from the network (Get Cluster from Network, selected by default) and compares them against the specified Gene Ontology file (lower in this window, Select ontology file: GO_Biological_Process, GO_Cellular_Component, GOSlim_Yeast, etc.). Some of the node labels, however, may not match the gene identifiers listed in GO due to the existence of gene name aliases (e.g., SRS2 or HPR5 for YJL092W) and/or custom annotations appended to node labels (e.g., in our example, YAL041W_tsq148 denotes one of the several CDC24 temperature-sensitive alleles present in the data set). In theory, it is possible to provide an additional node attribute containing a standard gene identifier for all nodes in the network (see Step 18i, Loading node attributes). However, BiNGO cannot perform functional enrichment on node attributes. Instead, standard gene identifiers corresponding to the selected nodes can be manually copied from the network window (or any other source) and pasted directly into the BiNGO settings window (following the selection of the Paste Genes from Text option).

iv. Leave all other options as their defaults (for more details, see Maere et al. 2005).

v. Click Start BiNGO.

vi. View results.

The results window lists the GO terms whose members are significantly overrepresented/enriched among the genes in the selected cluster. In addition, BiNGO generates a network visualizing the distribution of significantly overrepresented GO terms in the overall hierarchy of GO.
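To make the notion of overrepresentation concrete, the following Python sketch computes an enrichment p-value for a single term using the hypergeometric test, a common choice for this kind of analysis. It is an independent illustration and not BiNGO's code. The function name and argument names are hypothetical, and correction for testing many terms at once is deliberately omitted.

```python
from math import comb


def hypergeom_enrichment_p(cluster_hits, cluster_size, annotated_total,
                           population_size):
    """Probability of seeing at least cluster_hits annotated genes by chance.

    Models drawing cluster_size genes without replacement from a population
    of population_size genes, annotated_total of which carry the term.
    """
    max_hits = min(cluster_size, annotated_total)
    denominator = comb(population_size, cluster_size)
    tail = sum(
        comb(annotated_total, k)
        * comb(population_size - annotated_total, cluster_size - k)
        for k in range(cluster_hits, max_hits + 1)
    )
    return tail / denominator
```

For example, `hypergeom_enrichment_p(5, 5, 5, 10)` asks how likely it is that a cluster of 5 genes drawn from a population of 10 contains all 5 genes annotated to a term; only one of the 252 possible clusters does, so the result is 1/252. In a real analysis, one such test is performed per term and the resulting p-values are adjusted for multiple testing.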

18. Perform visual annotation using functional standards.

In Cytoscape, node attributes (i.e., numerical values or text labels associated with each node) can be used to associate nodes with unique visual properties, such as size, shape, and color. Using this feature, genes acting in known biological pathways can be easily identified and assessed qualitatively with respect to network topology. For example, we can determine whether members of a biological pathway cluster together in a network module or whether essential genes tend to be highly connected and act as network hubs.

i. Check the file containing node attribute data: Similarly to network data, the simplest format for importing node attributes is a delimited text file where each row corresponds to a node and each column corresponds to a different node attribute. Columns must be separated by a special character (e.g., a tab or a comma) (Fig. 1E; Data file 2). One of the columns must contain a node identifier: the primary identifier used in the network data file (see Step 3) or a secondary identifier previously loaded as a node attribute (e.g., see Step 18v).

In our example (Fig. 1E), the node attribute file contains a customized functional annotation: genes are assigned a functional group (numbered from 1 to 13) if they (a) belong to one of the large functionally enriched clusters in the network and (b) are annotated to the biological process most represented in that cluster. These numbers were generated specifically for the genetic interaction similarity network using a multistep procedure involving network clustering, functional enrichment, and manual annotation (Costanzo et al. 2010).

ii. Go to File → Import → Table → File. Choose the file containing the node attribute data and click Open.

iii. View the Preview section of the window, which displays a sample of the input file showing how it will be parsed for information.


iv. If parsing appears incorrect, click on Show Text File Import Options in the Advanced section of the window to specify a different column delimiter character. In addition, you can indicate whether or not the first row of the file should be used to label the columns.

v. To select a different node label in the network, click on the Key Column for Network drop-down menu (second from the top of the window) and select the desired node label.

By default, Cytoscape will match the first column of the node attribute file to the primary node label in the network. Both of these options can be changed.

It may be necessary to load additional node labels before being able to load the functional annotation. For example, the primary node identifier in the network may be the common name of a gene, whereas the functional annotation standard lists genes by their systematic ORF (open reading frame) names. As a result, it is first necessary to create a node attribute containing the systematic ORF name corresponding to each common name label. This can be done using the same exact procedure described in this section (Steps i–v). Once the node attribute containing the systematic ORF name has been created, it can be used as Key Column for Network to load the functional annotation standard.

vi. To select a different column from the node attribute file, click on Show Mapping Options and select the correct column from the Select the primary key column in table drop-down menu. The corresponding column will be highlighted in blue in the Preview section.

vii. By default, all columns of the node attribute file will be loaded as attributes and will be named Column 1, Column 2, Column 3, etc. To avoid loading unnecessary information, click on a column in the Preview section to deselect it. To rename an attribute and/or specify whether it contains text or numerical values, right-click (or Command-click in Mac OS X) on the column and input the relevant information.

viii. To load the node attribute file, click OK.

ix. To assign visual properties to node attributes (Fig. 1F): Go to Control Panel → VizMapper.

x. In the Visual Mapping Browser section, click on Show All. Scroll down all the Node and Edge properties and double-click on Node Fill Color.

xi. Click on Please select a value and choose the node attribute that should be represented as a color.

In our example, choose the FunctionalAnnotation node attribute.

xii. Click on the cell next to Mapping Type and select Discrete Mapping.

Discrete Mapping will associate a distinct color to each of the unique values in the FunctionalAnnotation node attribute. As a result, if the network contains 10 nodes annotated to functional group 1 and this functional group is associated with the color red, the network will show 10 red nodes.

xiii. Right-click (or Command-click in Mac OS X) on Node Fill Color, select Mapping Value Generators → Rainbow (or any other option). This will assign random colors to the functional annotation groups.

xiv. To adjust a color associated with a specific functional group, click on the category in VizMapper, then click on … and select the appropriate color.

xv. Similarly, node attributes can be associated with different node shape, size, and transparency.
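The two ideas at the core of Step 18, matching a table to network nodes through a key column and then mapping each distinct attribute value to one color, can be sketched in Python as follows. This is an illustration of the logic rather than of Cytoscape's internals. The sketch assumes that the table has a header row, and the identifiers, column contents, and function names are made up.

```python
import csv
import io


def load_node_table(handle, key_column, delimiter="\t"):
    """Read a delimited table with a header row into {key: {column: value}}."""
    reader = csv.DictReader(handle, delimiter=delimiter)
    return {row[key_column]: row for row in reader}


def discrete_color_mapping(node_labels, table, attribute, palette,
                           default="#FFFFFF"):
    """Assign one color per distinct attribute value.

    Nodes that have no matching row in the table keep the default color.
    """
    values = sorted({row[attribute] for row in table.values()})
    color_of = {value: palette[i % len(palette)]
                for i, value in enumerate(values)}
    return {
        label: color_of[table[label][attribute]] if label in table else default
        for label in node_labels
    }


# Hypothetical attribute table keyed by the node label used in the network.
toy_table_text = (
    "StrainID\tFunctionalAnnotation\n"
    "strain_1\t1\n"
    "strain_2\t1\n"
    "strain_3\t2\n"
)
toy_table = load_node_table(io.StringIO(toy_table_text), key_column="StrainID")
toy_colors = discrete_color_mapping(
    ["strain_1", "strain_2", "strain_3", "strain_4"],
    toy_table,
    attribute="FunctionalAnnotation",
    palette=["#FF0000", "#0000FF"],
)
```

Here the two nodes in functional group 1 share the first palette color, the node in group 2 receives the second, and the node that is missing from the table keeps the default. A mismatch between the key column and the node labels shows up in the same way in Cytoscape: the affected nodes simply receive no attribute value.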

19. Finalize network visualization: It is often convenient to manually adjust the visualization of a network using a vector graphic editor such as Adobe Illustrator. To do this, export the network from Cytoscape into an editable PDF file by choosing File → Export → Network View as Graphics → PDF File (*.pdf). Click on Save a File to specify the name and the location of the PDF file.

DISCUSSION

Network Organization

Cytoscape provides a wide range of automatic network layout options (http://wiki.cytoscape.org/Cytoscape_3/UserManual#Cytoscape_3.2BAC8-UserManual.2BAC8-Navigation_Layout.Automatic_Layout_Algorithms) that may be useful for different visualization strategies. For example, to examine the frequency of interactions between specific pathways or protein complexes, nodes can be grouped according to one of their attributes (Layout → Group Attributes Layout). The procedure for loading functional annotations as a node attribute is described in Step 18i, Loading node attributes. Alternatively, nodes can be sorted by their degree (i.e., number of interactions) to identify network hubs and analyze their common properties (Layout → Degree Sorted Circle Layout).

To explore the network in an unbiased way, it is often useful to arrange nodes based on their interaction strength and density. This approach visually clusters the network, revealing densely connected regions and their positioning with respect to each other, and has the potential to uncover biologically relevant modules. Different network layouts use different algorithms to achieve this effect. In the edge-weighted spring-embedded layout, for example, every edge acts as a spring that pulls nodes together with a strength proportional to an edge attribute (e.g., genetic interaction profile similarity) that is specified by the user (Kamada and Kawai 1989; Fruchterman and Reingold 1991). This and other similar network layouts have been widely adopted to represent biological networks because they often produce clean, symmetrical, and esthetically pleasant visualizations, with a minimal number of crossing edges and overlapping nodes (Fig. 1C) (Freeman et al. 2007; Atkinson et al. 2009; Costanzo et al. 2010; Breuer et al. 2013).
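Node degree, the quantity used by the degree-sorted layout mentioned above, is simple to compute directly from an edge list. The Python sketch below counts the distinct neighbors of each node in an undirected network and lists the most connected nodes. The function names and the toy edges are hypothetical, and self-interactions are ignored as a simplifying assumption.

```python
def node_degrees(edges):
    """Count the distinct neighbors of each node in an undirected network."""
    neighbors = {}
    for source, target in edges:
        if source == target:
            continue  # ignore self-interactions in this sketch
        neighbors.setdefault(source, set()).add(target)
        neighbors.setdefault(target, set()).add(source)
    return {node: len(adjacent) for node, adjacent in neighbors.items()}


def top_hubs(edges, n=3):
    """Return the n highest-degree nodes as (node, degree), ties by label."""
    degrees = node_degrees(edges)
    return sorted(degrees.items(), key=lambda item: (-item[1], item[0]))[:n]


# Made-up edges; the pair ("c", "a") repeats ("a", "c") in reverse order.
toy_pairs = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("c", "a")]
toy_hubs = top_hubs(toy_pairs, n=2)
```

In this toy network, node "a" has three distinct neighbors and is the top hub. Sorting nodes this way gives the same ordering that a degree-sorted arrangement is designed to reveal visually.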

Network Annotation

To evaluate a network and determine how well it recapitulates known biology, it is often useful to visually associate nodes and/or node clusters with specific biological processes. For example, a functional enrichment test may be required to determine whether members of a particular network cluster are overrepresented/enriched for genes annotated to a particular Gene Ontology term. Gene Ontology (GO) is a standardized vocabulary that describes every gene’s biological role, molecular function, and cellular localization (Ashburner et al. 2000), and provides a unique resource for evaluating the output of genomic experiments and large-scale networks (see Step 17, Functional enrichment). As an alternative to functional enrichment and GO, the user may consider directly labeling the network using a custom set of functional annotations such as, for example, a set of mutants with a phenotype of interest or a hit list from a systematic experiment, and visually assessing the distribution of functional groups throughout the network (see Step 18, Visual annotation using functional standards). This latter approach, while less rigorous, often provides a rapid and useful overview of functional information within the network that may motivate the implementation of more accurate statistical tests.

The level of detail with which a network should be annotated depends on the purpose of the visualization and on the availability of properly defined annotation standards. Although GO is certainly the most widely adopted standard, its hierarchical structure often hinders its direct use in network visualization without preliminary manipulation, such as compression and/or filtering (e.g., Myers et al. 2006). One potential alternative is GO Slim, which has a flat structure, fewer terms, and broader definitions. The list of GO Slim terms for Saccharomyces cerevisiae can be downloaded from the Gene Ontology webpage (www.geneontology.org/GO_slims/goslim_yeast.obo), whereas the gene to term mapping can be obtained from the Saccharomyces Genome Database (SGD, http://downloads.yeastgenome.org/curation/literature/go_slim_mapping.tab).

Although simpler than GO, GO Slim may still provide a functional categorization that is too detailed for many visualization purposes (e.g., 167 terms in the yeast version). To address this issue, more general annotation standards have been developed. For example, the most recent yeast genetic interaction study grouped 4414 genes into 17 functional categories (Supplementary Data File S6 in Costanzo et al. (2010)).

Similarly, it is often useful to visualize functional annotations that are more specific than GO Slim, such as, for example, individual biological pathways or protein complexes. In addition to a few systematic analyses of co-complex associations (Gavin et al. 2006; Krogan et al. 2006), several groups have curated and integrated the results of multiple experimental data sets to produce a compendium of protein complexes for yeast (http://downloads.yeastgenome.org/curation/literature/go_protein_complex_slim.tab) (Pu et al. 2009; Baryshnikova et al. 2010).

ACKNOWLEDGMENTS

I thank Michael Costanzo for providing critical feedback on the manuscript.

REFERENCES

Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, et al. 2000. Gene ontology: Tool for the unification of biology. The Gene Ontology Consortium. Nat Genet 25: 25–29.

Atkinson HJ, Morris JH, Ferrin TE, Babbitt PC. 2009. Using sequence similarity networks for visualization of relationships across diverse protein superfamilies. PLoS One 4: e4345.

Baryshnikova A, Costanzo M, Kim Y, Ding H, Koh J, Toufighi K, Youn JY, Ou J, San Luis BJ, Bandyopadhyay S, et al. 2010. Quantitative analysis of fitness and genetic interactions in yeast on a genome scale. Nat Methods 7: 1017–1024.

Breuer K, Foroushani AK, Laird MR, Chen C, Sribnaia A, Lo R, Winsor GL, Hancock RE, Brinkman FS, Lynn DJ. 2013. InnateDB: Systems biology of innate immunity and beyond—Recent updates and continuing curation. Nucleic Acids Res 41: D1228–D1233.

Costanzo M, Baryshnikova A, Bellay J, Kim Y, Spear ED, Sevier CS, Ding H, Koh JL, Toufighi K, Mostafavi S, et al. 2010. The genetic landscape of a cell. Science 327: 425–431.

Freeman TC, Goldovsky L, Brosch M, van Dongen S, Maziere P, Grocock RJ, Freilich S, Thornton J, Enright AJ. 2007. Construction, visualisation, and clustering of transcription networks from microarray expression data. PLoS Comput Biol 3: 2032–2042.

Fruchterman TMJ, Reingold EM. 1991. Graph drawing by force-directed placement. Softw Pract Exp 21: 1129–1164.

Gavin AC, Aloy P, Grandi P, Krause R, Boesche M, Marzioch M, Rau C, Jensen LJ, Bastuck S, Dumpelfeld B, et al. 2006. Proteome survey reveals modularity of the yeast cell machinery. Nature 440: 631–636.

Kamada T, Kawai S. 1989. An algorithm for drawing general undirected graphs. Inf Process Lett 31: 7–15.

Krogan NJ, Cagney G, Yu H, Zhong G, Guo X, Ignatchenko A, Li J, Pu S, Datta N, Tikuisis AP, et al. 2006. Global landscape of protein complexes in the yeast Saccharomyces cerevisiae. Nature 440: 637–643.

Maere S, Heymans K, Kuiper M. 2005. BiNGO: A Cytoscape plugin to assess overrepresentation of gene ontology categories in biological networks. Bioinformatics 21: 3448–3449.

Myers CL, Barrett DR, Hibbs MA, Huttenhower C, Troyanskaya OG. 2006. Finding function: Evaluation methods for functional genomic data. BMC Genomics 7: 187.

Pu S, Wong J, Turner B, Cho E, Wodak SJ. 2009. Up-to-date catalogues of yeast protein complexes. Nucleic Acids Res 37: 825–831.

Smoot ME, Ono K, Ruscheinski J, Wang PL, Ideker T. 2011. Cytoscape 2.8: New features for data integration and network visualization. Bioinformatics 27: 431–432.


Anastasia Baryshnikova. Exploratory Analysis of Biological Networks through Visualization, Clustering, and Functional Annotation in Cytoscape. Cold Spring Harb Protoc; doi: 10.1101/pdb.prot077644; published online March 17, 2016
