3-D Structural Analysis of Protein Interaction Networks Gives New Insight Into Protein Function, Network Topology and Evolution Philip M. Kim, Ph.D. Yale University GCB 2006, Tuebingen, Germany September 21st, 2006

Page 1: Philip M. Kim, Ph.D. Yale University

3-D Structural Analysis of Protein Interaction Networks Gives New Insight Into Protein Function, Network Topology and Evolution

Philip M. Kim, Ph.D., Yale University

GCB 2006, Tuebingen, Germany, September 21st, 2006

Page 2: Philip M. Kim, Ph.D. Yale University

060921_GCB2006_Talk_PMK

2

MOTIVATION

[Figure (illustrative): network perspective (node A with partners B1, B2, B3 and B4) versus structural biology perspective (Cdk/cyclin complex; part of the RNA-pol complex)]

There remains a rich source of knowledge unmined by network theorists!

Page 3: Philip M. Kim, Ph.D. Yale University

OUTLINE

Interaction Networks and their properties

Network properties revisited

A 3-D structural point of view

Conclusions

Page 5: Philip M. Kim, Ph.D. Yale University

PROTEIN INTERACTION NETWORKS IN YEAST

Source: Gavin et al. Nature (2002), Uetz et al. Nature (2000), Cytoscape and DIP

Description and methodologies

• Determined by:

– Large-scale yeast two-hybrid

– TAP-tagging

– Literature curation

• Currently over 20,000 unique interactions available in yeast

• Spawned a field of computational “graph theory” analyses that view proteins as “nodes” and interactions as “edges”

[Figure (illustrative): a snapshot of the current interactome, from DIP (Database of Interacting Proteins)]

Page 6: Philip M. Kim, Ph.D. Yale University

TINY GLOSSARY: DEGREE AND HUBS

A: Degree = 5

C: Degree = 1

A is a “Hub”*

*The definition of hubs is somewhat arbitrary; usually a cutoff is used

Source: PMK

Topology is dominated by hubs! (“Scale-free”)
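
The glossary above can be made concrete with a minimal sketch. The edge list, the protein names and the cutoff value are illustrative assumptions, not data from the talk.

```python
from collections import defaultdict

def degrees(edges):
    """Return {node: number of distinct interaction partners}."""
    partners = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # self-interactions are ignored in this sketch
        partners[a].add(b)
        partners[b].add(a)
    return {node: len(p) for node, p in partners.items()}

def hubs(edges, cutoff):
    """Nodes whose degree is at least `cutoff` (the cutoff is arbitrary)."""
    return {n for n, k in degrees(edges).items() if k >= cutoff}

# Hypothetical toy network: A has five partners, every other node has one.
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("A", "E"), ("A", "F")]
toy_degrees = degrees(toy_edges)
toy_hubs = hubs(toy_edges, cutoff=5)
```

Changing `cutoff` changes which proteins count as hubs, which is the arbitrariness the footnote refers to.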

Page 7: Philip M. Kim, Ph.D. Yale University

HUBS TEND TO BE IMPORTANT PROTEINS: THEY ARE MORE LIKELY TO BE ESSENTIAL AND TEND TO BE MORE CONSERVED

Source: Jeong et al. Nature (2001), Yu et al. TiG (2004) and Fraser et al. Science (2002)

• By now it is well documented that proteins with a large degree tend to be essential proteins in yeast.

(“Hubs are essential”)

• Likewise, it has been found that hubs tend to evolve more slowly than other proteins

(“Hubs are slower evolving”)

There is some controversy regarding this relationship

Page 8: Philip M. Kim, Ph.D. Yale University

THERE IS A RELATIONSHIP BETWEEN NETWORK TOPOLOGY AND GENE EXPRESSION DYNAMICS

Source: Han et al. Nature (2004) and Yu*, Kim* et al. (Submitted)

[Figure: frequency versus co-expression correlation]

Page 9: Philip M. Kim, Ph.D. Yale University

SCALE FREENESS GENERALLY EVOLVES THROUGH PREFERENTIAL ATTACHMENT (THE RICH GET RICHER)

Source: Albert et al. Rev. Mod. Phys. (2002) and Middendorf et al. PNAS (2005)

Description

• Theoretical work shows that a mechanism of preferential attachment leads to a scale-free topology (“The rich get richer”)

• In interaction networks, gene duplication followed by mutation of the duplicated gene is generally thought to lead to preferential attachment

• Simple reasoning: The partners of a hub are more likely to be duplicated than the partners of a non-hub

[Figure (illustrative): The Duplication Mutation Model. Labels: gene duplication; the interaction partners of A are more likely to be duplicated]
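
The duplication-mutation idea can be sketched as a small simulation. This is a generic toy version written for illustration, not the specific model of the cited papers; the function name, the `p_keep` parameter and the two-node seed network are assumptions.

```python
import random

def duplication_mutation(n_nodes, p_keep, seed=0):
    """Grow a network by repeated gene duplication followed by mutation.

    Each step copies a randomly chosen node. The copy inherits each of the
    parent's interactions with probability p_keep; the rest are lost, which
    stands in for mutation of the duplicated gene.
    """
    rng = random.Random(seed)
    adj = {0: {1}, 1: {0}}  # assumed seed network: one interacting pair
    while len(adj) < n_nodes:
        parent = rng.choice(sorted(adj))
        child = len(adj)
        adj[child] = set()
        for partner in sorted(adj[parent]):
            if rng.random() < p_keep:
                adj[child].add(partner)
                adj[partner].add(child)
    return adj

net = duplication_mutation(n_nodes=200, p_keep=0.5, seed=1)
max_degree = max(len(partners) for partners in net.values())
```

A node with k partners gains a new interaction whenever one of those k partners is duplicated and the copied edge survives, so its chance of growing scales with k. That is the "rich get richer" reasoning in code form.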

Page 10: Philip M. Kim, Ph.D. Yale University

OUTLINE

Interaction Networks and their properties

Network properties revisited

A 3-D structural point of view

Conclusions

Page 11: Philip M. Kim, Ph.D. Yale University

THERE IS A PROBLEM WITH SCALE-FREENESS AND REALLY BIG HUBS IN INTERACTION NETWORKS

Source: DIP, Institut fuer Festkoerperchemie (Univ. Tuebingen)

A really big hub (>200 Interactions)

Gedankenexperiment

What is the maximum number of neighbors a protein can have?

• Clearly, a protein is very unlikely to have >200 simultaneous interactors.

• Some of the >200 are most likely false positives

• Some others are going to be mutually exclusive interactors (i.e. binding to the same interface).

Conclusion

• There appears to be an obvious discrepancy between >200 and 12 (the number of neighbors each sphere has in a close packing of equal spheres).

ILLUSTRATIVE

Wouldn’t it be great to be able to see the different binding interfaces?

Page 12: Philip M. Kim, Ph.D. Yale University

UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES

ILLUSTRATIVE

Interactome (~20000 interactions)

– Use a high-confidence filter

– Map Pfam domains to all proteins in the interactome

PDB (~10000 structures of interactions*)

– Homology mapping of Pfam domains to all structures of interactions

Combine with all structures of yeast protein complexes

Annotate interactions with available structures, discard all others

Distinguish interfaces

*Many redundant structures

Source: PMK

Page 13: Philip M. Kim, Ph.D. Yale University

THIS IS WHAT THE RESULTING NETWORK LOOKS LIKE

Source: PDB, Pfam, iPfam and PMK

[Figure: The Structural Interaction Dataset (SID)]

Properties

• Represents a “very high confidence” network

• Total of 873 nodes and 1269 interactions, each of which is structurally characterized

• 438 interactions are classified as mutually exclusive and 831 as simultaneously possible

• While much smaller than DIP, it is similar in size to other high-confidence datasets
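
One way such a classification could work is sketched below. The rule used here is an assumption made for illustration: an interaction counts as mutually exclusive if, on either protein, the interface it uses is also used by a different partner. The tuple layout and the interface labels are hypothetical.

```python
from collections import defaultdict

def classify_interactions(annotated):
    """Label interactions by whether they share an interface.

    annotated: list of (protein_a, interface_a, protein_b, interface_b).
    Returns {(protein_a, protein_b): label}.
    """
    usage = defaultdict(set)  # (protein, interface) -> partners bound there
    for pa, ia, pb, ib in annotated:
        usage[(pa, ia)].add(pb)
        usage[(pb, ib)].add(pa)
    labels = {}
    for pa, ia, pb, ib in annotated:
        shared = len(usage[(pa, ia)]) > 1 or len(usage[(pb, ib)]) > 1
        labels[(pa, pb)] = (
            "mutually exclusive" if shared else "simultaneously possible"
        )
    return labels

# Hypothetical example: B1 and B2 compete for interface if1 of A,
# while B3 binds a separate interface if2.
toy = [
    ("A", "if1", "B1", "x"),
    ("A", "if1", "B2", "x"),
    ("A", "if2", "B3", "y"),
]
toy_labels = classify_interactions(toy)
```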

Page 14: Philip M. Kim, Ph.D. Yale University

OUTLINE

Interaction Networks and their properties

Network properties revisited

A 3-D structural point of view

Conclusions

Page 15: Philip M. Kim, Ph.D. Yale University

THERE DO NOT APPEAR TO BE THE KINDS OF REALLY BIG HUBS SEEN BEFORE – IS THE TOPOLOGY STILL SCALE-FREE?

Source: PMK

[Figure: Degree distribution]

Properties

• With the maximum number of interactions at 13, there are no “really big hubs” in this network

• Note that in other high-confidence datasets (of similar size), there are still proteins with a much higher degree

• The degree distribution appears to top out much earlier and to be less scale-free than that of other networks
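
A degree distribution of this kind can be tabulated with a few lines of code. The degrees below are made-up toy values, not the dataset from the talk.

```python
from collections import Counter

def degree_distribution(degree_by_node):
    """Return {k: fraction of nodes with degree k}, ordered by k."""
    counts = Counter(degree_by_node.values())
    total = sum(counts.values())
    return {k: counts[k] / total for k in sorted(counts)}

# Hypothetical degrees for eight proteins.
toy = {"A": 5, "B": 1, "C": 1, "D": 1, "E": 1, "F": 1, "G": 2, "H": 2}
dist = degree_distribution(toy)
max_degree = max(toy.values())
```

Plotting `dist` on log-log axes is the usual way to judge by eye whether the tail looks like a power law or tops out early.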

Page 16: Philip M. Kim, Ph.D. Yale University

IT’S REALLY ONLY THE MULTI-INTERFACE HUBS THAT ARE SIGNIFICANTLY MORE LIKELY TO BE ESSENTIAL

[Figure: bar chart of the percentage of essential proteins. Categories: entire genome; all proteins in our dataset; single-interface hubs only; multi-interface hubs only. Values as extracted: 64.9%, 31.8%, 32.3%, 15.1%]

Source: PMK
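
The comparison behind this chart can be sketched as follows. The hub cutoff, the protein names and the essentiality labels are hypothetical and serve only to show the bookkeeping.

```python
def split_hubs(degree, n_interfaces, cutoff):
    """Split hubs (degree >= cutoff) into single- and multi-interface sets."""
    hub_set = {p for p, k in degree.items() if k >= cutoff}
    single = {p for p in hub_set if n_interfaces[p] == 1}
    multi = {p for p in hub_set if n_interfaces[p] > 1}
    return single, multi

def essential_fraction(proteins, essential):
    """Fraction of `proteins` that are in the `essential` set."""
    proteins = set(proteins)
    if not proteins:
        return float("nan")
    return len(proteins & set(essential)) / len(proteins)

# Hypothetical toy data.
degree = {"P1": 6, "P2": 7, "P3": 5, "P4": 2}
n_interfaces = {"P1": 1, "P2": 3, "P3": 2, "P4": 1}
essential = {"P2", "P3"}

single, multi = split_hubs(degree, n_interfaces, cutoff=5)
frac_single = essential_fraction(single, essential)
frac_multi = essential_fraction(multi, essential)
```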

Page 17: Philip M. Kim, Ph.D. Yale University

DATE-HUBS AND PARTY-HUBS ARE REALLY SINGLE-INTERFACE AND MULTI-INTERFACE HUBS

[Figure: bar chart of expression correlation. Categories: all proteins in our dataset; single-interface hubs only; multi-interface hubs only. Values as extracted: 0.2, 0.17, 0.25]

[Figure: frequency versus expression correlation]

Source: Han et al. Nature (2004) and PMK

Page 18: Philip M. Kim, Ph.D. Yale University

AND ONLY MULTI-INTERFACE PROTEINS ARE EVOLVING MORE SLOWLY; SINGLE-INTERFACE HUBS ARE NOT

[Figure: bar chart of evolutionary rate (dN/dS). Categories: entire genome; all proteins in our dataset; single-interface hubs only; multi-interface hubs only. Values as extracted: 0.029, 0.077, 0.047, 0.051]

Source: PMK

Page 19: Philip M. Kim, Ph.D. Yale University

IN FACT, EVOLUTIONARY RATE CORRELATES BEST WITH THE FRACTION OF AVAILABLE SURFACE AREA INVOLVED IN INTERFACES

Source: PMK

DATA IN BINS

Small portion of surface area involved in interfaces – fast evolving

Large portion of surface area involved in interfaces – slow evolving
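
Binning of this kind can be sketched as below. The interface fractions, the dN/dS values and the bin edges are invented for illustration and are not the data from the talk.

```python
from statistics import mean

def binned_means(x, y, edges):
    """Mean of y within each half-open bin [edges[i], edges[i+1]) of x.

    Returns None for a bin that contains no data points.
    """
    result = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        ys = [yi for xi, yi in zip(x, y) if lo <= xi < hi]
        result.append(mean(ys) if ys else None)
    return result

# Hypothetical values: fraction of surface area in interfaces, and dN/dS.
interface_fraction = [0.05, 0.10, 0.30, 0.35, 0.60, 0.70]
dn_ds = [0.08, 0.06, 0.05, 0.05, 0.03, 0.01]
means = binned_means(interface_fraction, dn_ds, edges=[0.0, 0.25, 0.5, 1.0])
```

In this toy data the per-bin mean rate falls as the interface fraction rises, which is the pattern the slide describes.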

Page 20: Philip M. Kim, Ph.D. Yale University

IS THERE A DIFFERENCE BETWEEN SINGLE-INTERFACE HUBS AND MULTI-INTERFACE HUBS WITH RESPECT TO NETWORK EVOLUTION?

Source: PMK

The Duplication Mutation Model

[Figure: gene duplication; the interaction partners of A are more likely to be duplicated]

In the structural viewpoint

If these models were correct, there would be an enrichment of paralogs among B

Page 21: Philip M. Kim, Ph.D. Yale University

MULTI-INTERFACE HUBS DO NOT APPEAR TO EVOLVE BY GENE DUPLICATION – THE DUPLICATION MUTATION MODEL CAN ONLY EXPLAIN THE EXISTENCE OF SINGLE-INTERFACE HUBS

[Figure: bar chart of the fraction of paralogs between pairs of proteins. Categories: random pair; same partner; same partner, different interface; same partner, same interface. Values as extracted: 0.00%, 0.15%, 0.07%, 0.003%]

Source: PMK

But that also means that the duplication-mutation model cannot explain the full current interaction network!
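
The paralog test can be sketched as follows. The tuple layout, the interface labels and the paralog list are hypothetical; the sketch only shows how pairs sharing a partner are split by interface and then checked for paralogy.

```python
from itertools import combinations

def partner_pairs(annotated, same_interface):
    """Pairs of proteins that bind one common partner.

    annotated: list of (hub, interface_on_hub, partner).
    same_interface=True keeps pairs bound at the same interface of the hub;
    False keeps pairs bound at different interfaces.
    """
    pairs = set()
    for (h1, i1, p1), (h2, i2, p2) in combinations(annotated, 2):
        if h1 == h2 and p1 != p2 and (i1 == i2) == same_interface:
            pairs.add(frozenset((p1, p2)))
    return pairs

def paralog_fraction(pairs, paralogs):
    """Fraction of unordered protein pairs that are listed as paralogs."""
    pairs = {frozenset(p) for p in pairs}
    if not pairs:
        return float("nan")
    known = {frozenset(p) for p in paralogs}
    return len(pairs & known) / len(pairs)

# Hypothetical example: B1 and B2 are paralogs and share interface if1 of A.
toy = [("A", "if1", "B1"), ("A", "if1", "B2"), ("A", "if2", "B3")]
toy_paralogs = [("B1", "B2")]

same = paralog_fraction(partner_pairs(toy, True), toy_paralogs)
different = paralog_fraction(partner_pairs(toy, False), toy_paralogs)
```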

Page 22: Philip M. Kim, Ph.D. Yale University

OUTLINE

Interaction Networks and their properties

Network properties revisited

A 3-D structural point of view

Conclusions

Page 23: Philip M. Kim, Ph.D. Yale University

CONCLUSIONS

• The topology of a direct physical interaction network is much less dominated by hubs than previously thought

• Several genomic features that were previously thought to be correlated with the degree are in fact related to the number of interfaces and not the degree

• Specifically, a protein's evolutionary rate appears to be dependent on the fraction of surface area involved in interactions rather than the degree

• The current network growth model can only explain a part of currently known networks

PRELIMINARY

Source: PMK

Page 24: Philip M. Kim, Ph.D. Yale University

ACKNOWLEDGEMENTS

Mark Gerstein

Long Jason Lu

Yu Brandon Xia

The Gersteinlab, in particular:

Jan Korbel

Joel Rozowsky

Tom Royce

Page 25: Philip M. Kim, Ph.D. Yale University

BACKUP

Page 26: Philip M. Kim, Ph.D. Yale University

INTERESTING PROPERTIES OF INTERACTION NETWORKS

Source: Various, see following slides

Network topology

Network Evolution

Relationship of topology and genomic features

Examples of studies

• What distribution does the degree (number of interaction partners) follow?

• What is the relationship between the degree and a protein's essentiality?

• Is there a relationship between a protein's connectivity and expression profile?

• What is the relationship between a protein's evolutionary rate and its degree?

• How did the observed network topology evolve?

OVERVIEW

Page 27: Philip M. Kim, Ph.D. Yale University

INTERACTION NETWORKS ARE SCALE-FREE – THEIR TOPOLOGY IS DOMINATED BY SO-CALLED HUBS

Source: Barabasi, A. and Albert, R., Science (1999)

• So-called scale-free topology has been observed in many kinds of networks (among them interaction networks)

• Scale-freeness: A small number of hubs and a large number of poorly connected nodes (“Power-law behavior”)

• Topology is dominated by “hubs”

• Scale-freeness is in stark contrast to a normal (Gaussian) distribution

p(k) ~ k^(-γ), with γ > 0
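
Assuming the power-law form p(k) ~ k^(-γ) with γ > 0, the exponent can be estimated from a list of degrees. The sketch below uses a widely used approximate maximum-likelihood estimator for discrete data, γ ≈ 1 + n / Σ ln(k_i / (k_min − 1/2)). It is a generic technique chosen for illustration, not a method taken from the talk, and `k_min` is an assumed lower cutoff.

```python
import math

def estimate_gamma(degrees, k_min=1):
    """Approximate ML exponent for p(k) ~ k^(-gamma), using k >= k_min only."""
    if k_min < 1:
        raise ValueError("k_min must be at least 1")
    tail = [k for k in degrees if k >= k_min]
    if not tail:
        raise ValueError("no degrees at or above k_min")
    log_sum = sum(math.log(k / (k_min - 0.5)) for k in tail)
    return 1.0 + len(tail) / log_sum

# Hypothetical degree lists: the second has one large hub.
gamma_flat = estimate_gamma([1, 1, 1, 1])
gamma_hub = estimate_gamma([1, 1, 1, 50])
```

A heavier tail (more or bigger hubs) gives a smaller estimated exponent. An estimate alone does not show that the data follow a power law; a goodness-of-fit check would be needed for that.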

Page 28: Philip M. Kim, Ph.D. Yale University

… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE

Source: See text

EXAMPLES

Yes, hubs are more conserved

• Fraser et al. Science (2002)

• Fraser et al. BMC Evol. Biol. (2003)

• Wuchty Genome Res. (2004)

• Jordan et al. Genome Res. (2002)

• Hahn et al. J. Mol. Evol. (2004)

• Jordan et al. BMC Evol. Biol. (2003)

No, the relationship is unclear

?

• Fraser Nature Genetics (2005)

• But the “Yes” side appears to be winning

Page 29: Philip M. Kim, Ph.D. Yale University

SHORT DIGRESSION: THIS ALLOWS US TO DISTINGUISH SYSTEMATICALLY BETWEEN SIMULTANEOUSLY POSSIBLE AND MUTUALLY EXCLUSIVE INTERACTIONS

Simultaneously possible interactions

Mutually exclusive interactions

Source: PMK

Page 30: Philip M. Kim, Ph.D. Yale University

SIMULTANEOUSLY POSSIBLE INTERACTIONS (“PERMANENT”) MORE OFTEN LINK PROTEINS THAT ARE FUNCTIONALLY SIMILAR, COEXPRESSED AND CO-LOCATED

[Figure: four bar charts comparing mutually exclusive interactions with simultaneously possible interactions: fraction same biological process; fraction same molecular function; fraction same cellular component; co-expression correlation. Each comparison is marked p<<0.001. Value pairs as extracted: 0.24 and 0.14; 0.33 and 0.18; 0.23 and 0.17; 0.27 and 0.12]

Source: PMK

Page 31: Philip M. Kim, Ph.D. Yale University

REMEMBER THE NETWORK PROPERTIES AS WE DESCRIBED BEFORE?

Source: Various, see following slides

Network topology

Network Evolution

Relationship of topology and genomic features

Examples of studies

• What distribution does the degree (number of interaction partners) follow?

• Does the network easily separate into more than one component?

• What is the relationship between the degree and a protein's essentiality?

• Is there a relationship between a protein's connectivity and expression profile?

• What is the relationship between a protein's evolutionary rate and its degree?

• How did the observed network topology evolve?

OVERVIEW

Page 32: Philip M. Kim, Ph.D. Yale University

… OR ARE THEY? THERE IS AN ONGOING DEBATE ABOUT THE RELATIONSHIP BETWEEN EVOLUTIONARY RATE AND DEGREE

Source: See text

Yes, hubs are more conserved

• Fraser et al. Science (2002)

• Fraser et al. BMC Evol. Biol. (2003)

• Wuchty Genome Res. (2004)

• Jordan et al. Genome Res. (2002)

• Hahn et al. J. Mol. Evol. (2004)

• Jordan et al. BMC Evol. Biol. (2003)

No, the relationship is unclear

?

• But the “Yes” side appears to be winning

This debate may have arisen because the two different sides were all looking at the wrong variable!

Page 33: Philip M. Kim, Ph.D. Yale University

OUT

Page 34: Philip M. Kim, Ph.D. Yale University

UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES

Source: PMK

Combine with all structures of yeast protein complexes

• Start with a high-confidence interactome dataset

• Collect dimer and multimer structures and map Pfam domains onto the corresponding proteins

• Remove ubiquitous domains (e.g., WD40)

• All interactions that contain Pfam domains found to interact in a crystal structure are annotated with this structural information (all others are removed)

• Dataset: ~1269 interactions (combined with all structures that were from yeast).
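
The steps above can be sketched as a small filter. The function name, the data layout and the toy inputs are assumptions for illustration; the domain names are used only as example labels, and the real pipeline presumably involves homology mapping that is not modeled here.

```python
def annotate_interactions(interactions, domains, structural_pairs,
                          ubiquitous=frozenset()):
    """Keep interactions supported by a domain pair seen in a structure.

    interactions: iterable of (protein_a, protein_b)
    domains: {protein: set of Pfam domain ids}
    structural_pairs: set of frozensets of domain ids observed interacting
    ubiquitous: domain ids to ignore (e.g. WD40)
    Returns {(protein_a, protein_b): sorted list of supporting domain pairs}.
    """
    annotated = {}
    for a, b in interactions:
        doms_a = set(domains.get(a, ())) - set(ubiquitous)
        doms_b = set(domains.get(b, ())) - set(ubiquitous)
        support = sorted(
            (da, db)
            for da in doms_a
            for db in doms_b
            if frozenset((da, db)) in structural_pairs
        )
        if support:  # interactions without structural support are discarded
            annotated[(a, b)] = support
    return annotated

# Hypothetical toy inputs.
toy_interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P4")]
toy_domains = {
    "P1": {"Cyclin_N"},
    "P2": {"Pkinase", "WD40"},
    "P3": {"WD40"},
    "P4": {"WD40"},
}
toy_structures = {frozenset(("Cyclin_N", "Pkinase")), frozenset(("WD40",))}

kept = annotate_interactions(
    toy_interactions, toy_domains, toy_structures, ubiquitous={"WD40"}
)
```

Removing the ubiquitous domain before matching is what stops the WD40-only pair P2/P4 from being kept on the strength of a promiscuous domain.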

ILLUSTRATIVE

Pfam -- Homology

Explain methodology….

Page 36: Philip M. Kim, Ph.D. Yale University

SOME NETWORK STATISTICS – SCALE FREENESS?

Source: PMK

• In the Pfam dataset, the vast majority (570 out of 790) of the proteins (even hubs) have only one distinct interface.

• 220 proteins (~28%) have 2 or more interfaces.

• Most hubs are mediated by promiscuous interfaces rather than many interfaces (~2.6 interactions/interface)

[Figure: summary statistics for two groups, 161 nodes (degree >5) and 220 nodes (numint>1). Quantities: MaxDegree; MaxInterfaces; Avg.Degree; Avg.Interfaces. Values as extracted: 6.0, 1.4, 3.5, 19]

Page 37: Philip M. Kim, Ph.D. Yale University

UTILIZING PROTEIN CRYSTAL STRUCTURES, WE CAN DISTINGUISH THE DIFFERENT BINDING INTERFACES

Source: PMK

ILLUSTRATIVE

PDB

Interactome

Page 38: Philip M. Kim, Ph.D. Yale University

CLIQUES, K-PLEXES AND K-CORES IN SOCIAL NETWORKS

Source:…

• …

• …

• …

Page 39: Philip M. Kim, Ph.D. Yale University

AUTOMORPHIC EQUIVALENCE

Source: …

• …

… …

Page 40: Philip M. Kim, Ph.D. Yale University

NETWORKS IN MANAGEMENT SCIENCE - THE FIELD OF ORGANIZATION THEORY

… …

Page 41: Philip M. Kim, Ph.D. Yale University

DECISION MAKING IN ORGANIZATIONS: DECENTRALIZATION OF CERTAIN ISSUES

… …

Page 42: Philip M. Kim, Ph.D. Yale University

GROWING ORGANIZATIONS NEED TO DEPARTMENTALIZE

Source: …

• …

… …

Page 43: Philip M. Kim, Ph.D. Yale University

DOES SIZE MATTER?

Source: …

• …

… …

Page 44: Philip M. Kim, Ph.D. Yale University

ENVIRONMENTAL EFFECTS ON ORGANIZATIONAL STRUCTURE

* …

Source: …

• …

• …

• …

• …

• …

Page 45: Philip M. Kim, Ph.D. Yale University

• …

FIVE DIFFERENT ORGANIZATIONAL CONFIGURATIONS

* …

Source: …

• …

• …

• …

• …

• …

• …

• …

• …

• …

• …

• …

• …