
Statistical mechanics approach to complex networks: from abstract to biological networks. Vittoria Colizza. Supervisor: Prof. Amos Maritan


Page 1

Statistical mechanics approach to complex networks: from abstract to biological networks

Vittoria Colizza

Supervisor: Prof. Amos Maritan

Page 2

Protein-protein Interaction Networks

biological networks

Page 3

Outline

• PIN
• Methods
• Topological analysis
• Renormalization
• Topology / functionality correlations
• Function prediction
  - mixed Global Optimization Model
  - Maximum Entropy Estimate Model
• Conclusions & Perspectives

SISSA - PHD - October, 18th 2004

Page 4

Protein Interaction Networks

Involved in almost every cellular process:

• DNA replication, transcription and translation
• intracellular communication
• cell cycle control
• the workings of complex molecular motors
• ...

Page 5

Protein Interaction Networks

Undirected network:

• nodes: proteins
• links: direct interaction

S. Maslov & K. Sneppen, Science 296, 910 (2002)

Page 6

Interaction-detection Methods

experimental techniques: physical bindings
• yeast two-hybrid systems
• mass spectrometry analysis of purified complexes

interaction prediction methods: functional associations
• correlated mRNA expression profiles
• genetic interaction-detection methods
• in silico approaches: gene fusion, gene neighborhood, phylogenetic profiles

Page 7

Yeast two-hybrid system (Y2H)

Advantages:
• simple, rapid, sensitive, inexpensive: suitable for large-scale applications
• detects virtually every protein-protein interaction, even transient, unstable or weak interactions

Limitations:
• no cooperative binding
• some kinds of proteins not suitable, e.g. transcription factors
• false negative interactions (artificially made hybrids)
• false positive interactions (spatio-temporal constraints)

Page 8

Protein Complex analysis

Advantages:
• identification of whole complexes: cooperative binding
• in vivo technique; one artificially made protein
• physiological settings
• several components as tagged baits for test

Limitations:
• tagging procedure interference with complex formation
• false negative interactions (weakly associated proteins)

Page 9

Topological analysis

Saccharomyces cerevisiae

• network (I): Y2H binary interactions from 2 distinct experiments
• network (II): interactions from complex analysis (tandem affinity purification, TAP)
• network (III): mixed collection of interactions from different experimental techniques (Database of Interacting Proteins, DIP)

P. Uetz et al., Nature 403, 623 (2000)
T. Ito et al., Proc. Natl. Acad. Sci. USA 98, 4569 (2001)
A.C. Gavin et al., Nature 415, 141 (2002)
Database of Interacting Proteins (DIP), http://dip.doe-mbi.ucla.edu/

Page 10

Topological analysis

              (I)       (II)      (III)
# proteins    2152      1361      4713
# links       2831      3221      14846
<k>           2.63      4.73      6.30
<C>           0.10      0.22      0.09
<C_rand>      0.0064    0.019     0.018
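The average degree row can be checked directly from the first two rows, since each link of an undirected network contributes to the degree of two nodes, so <k> = 2L/N. A minimal sketch, with the dictionary layout and function name chosen here for illustration only:

```python
# (number of proteins N, number of links L) as listed in the table for each network
networks = {"I": (2152, 2831), "II": (1361, 3221), "III": (4713, 14846)}

def average_degree(n_nodes, n_links):
    """Mean degree of an undirected network: every link is shared by two nodes."""
    return 2 * n_links / n_nodes

mean_k = {name: round(average_degree(n, l), 2) for name, (n, l) in networks.items()}
print(mean_k)
```

The rounded values agree with the <k> row of the table. The <C_rand> row is not reproduced here because the slide does not state which randomization was used.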

Page 11

degree distrib. P(k)

$P(k) \propto (k + k_0)^{-\gamma} \, e^{-(k + k_0)/k_c}$

H. Jeong, S.P. Mason, A.-L. Barabási & Z.N. Oltvai, Nature 411, 41 (2001)

Page 12

$P(k) \propto (k + k_0)^{-\gamma} \, e^{-(k + k_0)/k_c}$

$\gamma^{(I)} \approx 2.5$, $\gamma^{(II)} \approx 2.1$, $\gamma^{(III)} \approx 2.5$

Page 13

clustering coeff. C(k)

$C_i = \frac{2 e_i}{k_i (k_i - 1)}$

$C(k) = \frac{1}{N P(k)} \sum_i \delta_{k_i, k} \, C_i$

Power-law scaling of $C(k)$ with $k$ for network (II), exponent magnitude 0.48.
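As an illustration of the clustering spectrum on this slide, the sketch below computes the local coefficient C_i = 2 e_i / (k_i (k_i - 1)), where e_i is the number of links among the neighbours of node i, and then averages it over all nodes of the same degree to obtain C(k). The adjacency-dict representation, the toy graph, and the convention C_i = 0 for k_i < 2 are assumptions made here, not taken from the slides.

```python
from collections import defaultdict

def clustering_by_degree(adj):
    """adj: dict node -> set of neighbours (undirected, no self-loops).
    Returns (C_i for every node, C(k) averaged over the nodes of degree k)."""
    c = {}
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            c[i] = 0.0  # assumed convention: no pair of neighbours to connect
            continue
        # e_i: links among the neighbours of i, each pair counted once
        e = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        c[i] = 2 * e / (k * (k - 1))
    by_k = defaultdict(list)
    for i, ci in c.items():
        by_k[len(adj[i])].append(ci)
    return c, {k: sum(v) / len(v) for k, v in by_k.items()}

# Hypothetical toy graph: triangle 0-1-2 plus a pendant node 3 attached to node 2
toy = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
c_i, c_k = clustering_by_degree(toy)
print(c_i, c_k)
```

Grouping nodes by degree is equivalent to the prefactor 1/(N P(k)), since N P(k) is the number of nodes with degree k.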

Page 14

neighb. degree k_nn(k)

$k_{nn}(k) = \frac{1}{N P(k)} \sum_i \delta_{k_i, k} \, \frac{1}{k_i} \sum_j A_{ij} k_j$

Power-law scaling of $k_{nn}(k)$ with $k$ for network (III), exponent magnitude 0.24.
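The average nearest-neighbour degree can be sketched in the same way: for each node, average the degrees of its neighbours, then average that quantity over all nodes of degree k. The adjacency-dict input and the star-graph example are illustrative assumptions.

```python
from collections import defaultdict

def knn_by_degree(adj):
    """adj: dict node -> set of neighbours (undirected).
    Returns k_nn(k): mean neighbour degree, averaged over nodes of degree k."""
    by_k = defaultdict(list)
    for i, nbrs in adj.items():
        k = len(nbrs)
        if k == 0:
            continue  # isolated nodes have no neighbours to average over
        # (1/k_i) * sum_j A_ij k_j
        by_k[k].append(sum(len(adj[j]) for j in nbrs) / k)
    return {k: sum(v) / len(v) for k, v in by_k.items()}

# Hypothetical star graph: hub 0 linked to leaves 1, 2, 3
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
knn = knn_by_degree(star)
print(knn)
```

In the star, the hub (degree 3) sees only degree-1 neighbours while the leaves see the hub, so k_nn decreases with k, the kind of degree anticorrelation this measure is designed to reveal.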

Page 15

rich-club phenomenon

$\phi(k) = \frac{2 e_{>k}}{N_{>k} (N_{>k} - 1)}$

Power-law scaling of $\phi(k)$ with $k$, exponent magnitude: 1.96 (I), 1.01 (II), 1.14 (III).
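A minimal sketch of a rich-club coefficient, assuming the common definition in which φ(k) is the density of links among the nodes whose degree is larger than k (2 e / (n (n - 1)) for n such nodes sharing e links). The strict "larger than" threshold, the function name and the toy graph are assumptions of this example.

```python
def rich_club(adj, k):
    """Density of links among nodes with degree > k.
    adj: dict node -> set of neighbours (undirected).
    Returns None when fewer than two nodes qualify."""
    rich = {i for i, nbrs in adj.items() if len(nbrs) > k}
    n = len(rich)
    if n < 2:
        return None
    # links with both endpoints in the rich set, each counted once
    e = sum(1 for i in rich for j in adj[i] if j in rich and i < j)
    return 2 * e / (n * (n - 1))

# Hypothetical toy graph: triangle 0-1-2 with pendant nodes 3 (on 0) and 4 (on 1)
toy = {0: {1, 2, 3}, 1: {0, 2, 4}, 2: {0, 1}, 3: {0}, 4: {1}}
phi = {k: rich_club(toy, k) for k in range(4)}
print(phi)
```

For k = 0 the coefficient reduces to the overall link density of the network, which gives a natural baseline for the higher-degree values.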

Page 16

• Y2H network: no correlations
• TAP network: only 3-point correlations (complexes)
• DIP network: hierarchical structure, degree correlations

Page 17

Topological analysis through network renormalization

Page 18

Network renormalization

• investigation of critical behaviors of complex networks through RG approach
• coarse-graining: decimation of less relevant details to elucidate critical properties
• 'simplification': simpler and more understandable versions of large-scale networks; network visualization?

Page 19

Original links take the values 0, 1; renormalized (primed) links take values not only 0, 1: weighted networks

Page 20

PIN renormalization

Page 21

PIN renormalization

[Figure panels, labeled:]
• power-law + exp. cut-off
• pure power-law
• no pure power-law (+ exp. cut-off)
• power-law (+ exp. cut-off)

Page 22

Functional characterization

Change in the view of protein function: individual task → cooperative behaviour (protein interactions, functional relationships)

MIPS Comprehensive Yeast Genome Database (CYGD), http://mips.gsf.de/proj/yeast/CYGD/db

Page 23

Function prediction

• about 30% of encoded proteins per sequenced genome are still uncharacterized
• network-based methods for function prediction:
  - Majority rule (MR)
  - Global optimization (GOM)
  - Topological redundancies
  - Functional clustering (PRODISTIN)
  - Mixed GOM
  - MEE model

B. Schwikowski, P. Uetz & S. Fields, Nature Biotech. 18, 1257 (2000)
H. Hishigaki, K. Nakai, T. Ono & A. Tanigami, Yeast 18, 523 (2001)
A. Vazquez, A. Flammini, A. Maritan & A. Vespignani, Nature Biotech. 21, 697 (2003)
M.P. Samanta & S. Liang, Proc. Natl. Acad. Sci. USA 100, 12579 (2003)
C. Brun et al., Genome Biol. 5, R6 (2003)
V. Colizza, P. De Los Rios, A. Flammini & A. Maritan, in preparation

Page 24

Function prediction

Basic strategy: close proteins → closely related functional annotations

Rate of links whose two proteins have a function in common, for networks (I), (II), (III). Values are listed in the order extracted from the slide; their assignment to the three columns is not recoverable.

exp    82.90%             82.89%             72.36%
NM1    (60.55 ± 0.19)%    (65.35 ± 0.22)%    (49.28 ± 0.15)%
NM2    (49.62 ± 0.16)%    (64.05 ± 0.20)%    (60.64 ± 0.20)%

Page 25

Majority rule

function assigned = most common function(s) among classified partners

[Figure: an unclassified protein (???) linked to classified partners labeled 2; 3, 4, 10; 12]

links between unclassified proteins are completely neglected!
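The majority rule stated on this slide can be sketched in a few lines: count the functions annotated on the classified partners of a protein and return the most frequent one(s). The data structures, protein names and function labels below are hypothetical; unclassified partners simply contribute no votes, which is exactly the limitation the slide points out.

```python
from collections import Counter

def majority_rule(adj, functions, protein):
    """adj: dict protein -> set of interaction partners.
    functions: dict classified protein -> set of function labels.
    Returns the most common function(s) among the classified partners."""
    counts = Counter()
    for partner in adj[protein]:
        # unclassified partners are absent from `functions` and cast no vote
        counts.update(functions.get(partner, ()))
    if not counts:
        return set()
    top = max(counts.values())
    return {f for f, c in counts.items() if c == top}

# Hypothetical example: protein "x" has three classified partners and one unclassified
adj = {"x": {"a", "b", "c", "u"}, "a": {"x"}, "b": {"x"}, "c": {"x"}, "u": {"x"}}
functions = {"a": {2, 4}, "b": {3, 4, 10}, "c": {12}}
prediction = majority_rule(adj, functions, "x")
print(prediction)
```

A protein whose partners are all unclassified receives an empty prediction, which motivates methods that also use links between unclassified proteins.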

Page 26

Global Optimization Model (GOM)

links between unclassified proteins also taken into account

whole set of interactions of each uncharacterized protein: self-consistency

[Figure: unclassified proteins (???) linked to each other and to classified partners labeled 2, 4; 3, 4, 10; 12 and 2; 3, 4, 10; 12]

Page 27

Global Optimization Model (GOM)

functional assignment → score

$E = -\sum_{i,j} J_{ij} \, \delta_{\sigma_i, \sigma_j} - \sum_i h_i(\sigma_i)$

• first term: links between unclassified proteins
• second term: links between unclassified and classified proteins

global optimization: minimum E → functional assignment proposed
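A minimal sketch of the GOM score, assuming the form E = -Σ J_ij δ(σ_i, σ_j) - Σ h_i(σ_i), with J_ij = 1 for linked pairs of unclassified proteins (each pair counted once here) and h_i(σ) the number of classified partners of i annotated with function σ. The exhaustive search below is swapped in purely for illustration on a toy case and is not the optimization procedure of the original work; all names and data are hypothetical.

```python
from itertools import product

def gom_energy(adj, known, assignment):
    """adj: protein -> set of partners; known: classified protein -> set of functions;
    assignment: unclassified protein -> one trial function sigma_i."""
    energy = 0
    for i, sigma in assignment.items():
        for j in adj[i]:
            if j in assignment:
                # unclassified/unclassified link: rewarded when both share sigma
                if i < j and assignment[j] == sigma:
                    energy -= 1
            elif sigma in known.get(j, ()):
                # unclassified/classified link: contribution to h_i(sigma_i)
                energy -= 1
    return energy

def gom_minimize(adj, known, unclassified, labels):
    """Brute-force minimum of E over all assignments (toy-sized inputs only)."""
    best = min(product(labels, repeat=len(unclassified)),
               key=lambda s: gom_energy(adj, known, dict(zip(unclassified, s))))
    return dict(zip(unclassified, best))

# Hypothetical toy network: u1 and u2 are unclassified and interact with each other
adj = {"u1": {"u2", "a", "b"}, "u2": {"u1", "c"},
       "a": {"u1"}, "b": {"u1"}, "c": {"u2"}}
known = {"a": {2}, "b": {2, 4}, "c": {2}}
best = gom_minimize(adj, known, ["u1", "u2"], [2, 4])
print(best, gom_energy(adj, known, best))
```

The number of assignments grows exponentially with the number of unclassified proteins, so a real network needs a stochastic or heuristic minimizer instead of enumeration.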

Page 28

mixed GOM & MEE models

Designed to take full advantage of the observed correlations between the pattern of interactions among proteins & their functionalities

• mixed GOM: more thorough investigation of the topology
• MEE model: observed correlations between the functions of interacting proteins

Page 29

mixed Global Optimization Model

II neighbors

• experimental reasons: direct interaction / mediated interaction
• evolution by duplication/divergence
• topological redundancies

M.P. Samanta & S. Liang, Proc. Natl. Acad. Sci. USA 100, 12579 (2003)
A. Edwards et al., Trends Genet. 18, 529 (2002)
A. Force et al., Genetics 151, 1531 (1999)
M. Lynch and A. Force, Genetics 154, 459 (2000)

Page 30

mixed Global Optimization Model

I neighbors: GOM_1

$E^{(1)} = -\sum_{i,j} J_{ij} \, \delta_{\sigma_i, \sigma_j} - \sum_i h_i^{(1)}(\sigma_i)$

II neighbors: GOM_2

$E^{(2)} = -\sum_{i,j} S_{ij} \, \delta_{\sigma_i, \sigma_j} - \sum_i h_i^{(2)}(\sigma_i)$

Page 31

mixed Global Optimization Model

• random initial functional assignment
• independent optimization of GOM_1 and GOM_2
• frustration: multiple optimal solutions
• functional assignment: function(s) with highest frequency of occurrence
• mixed GOM functional assignment: merging GOM_1 and GOM_2
• role of topological redundancies: S_ij = # paths of length 2 connecting proteins i and j
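The second-neighbour weight S_ij, the number of paths of length 2 connecting proteins i and j, equals the number of partners the two proteins share (the (i, j) entry of the squared adjacency matrix). A minimal sketch under that reading, with a hypothetical four-node example:

```python
def second_neighbour_weights(adj):
    """adj: dict node -> set of neighbours (undirected).
    Returns {(i, j): S_ij} for i < j, where S_ij counts paths of length 2,
    i.e. the common neighbours of i and j. Pairs with S_ij = 0 are omitted."""
    s = {}
    nodes = sorted(adj)
    for a, i in enumerate(nodes):
        for j in nodes[a + 1:]:
            common = len(adj[i] & adj[j])
            if common:
                s[(i, j)] = common
    return s

# Hypothetical example: a 4-cycle 0-1-2-3-0, where opposite corners share two partners
square = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
s_ij = second_neighbour_weights(square)
print(s_ij)
```

Proteins 0 and 2 do not interact directly in this example, yet they receive weight 2 because two distinct intermediates connect them, which is the topological redundancy that GOM_2 exploits.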

Page 32

Maximum Entropy Estimate Model

k-points correlation functions

$E^{(1)} = -\sum_{i,j} J_{ij} \, \delta_{\sigma_i, \sigma_j} - \sum_i h_i^{(1)}(\sigma_i)$

$J_{ij} \, \delta_{\sigma_i, \sigma_j} \in \{0, 1\}$ is replaced by a function of the pair $(\sigma_i, \sigma_j)$: a measure of the functional correlations

Page 33

Maximum Entropy Estimate Model

[Figure: example network whose nodes carry functional annotations such as (2), (2,3,4), (2,3), (5,6), (3,4), (4), (2,4), illustrating the contribution $J_{ij} \, \delta_{\sigma_i, \sigma_j}$ for $\sigma_i = \sigma_j = 2$ and $\sigma_i = \sigma_j = 4$, and a pair $(\sigma_i, \sigma_j)$ with $\sigma_i = 2$, $\sigma_j = 3$]

Page 34

Maximum Entropy Estimate Model

[Equations: the cost function E and the quantity L(σ, σ'), both written as sums over links (i, j) and over the functions f_i = 1, ..., F_i and f_j = 1, ..., F_j of the two proteins; the exact expressions are not recoverable from the extracted text]

Info extracted from the partial knowledge of the network (maximum entropy estimate criterion)

Page 35

Results: Statistical reliability

Self-consistency test:
• fraction f_n of classified proteins set unclassified
• function prediction
• rate of success in recovering correct functions of test proteins

Page 36

Results: Statistical reliability

                                  (I)     (II)
totally overlapping               36%     51%
    success                       79%     80%
    unsuccess                     21%     20%
partly overlapping                37%     26%
    success GOM_1                 8%      9%
    success GOM_2                 4%      1%
    success GOM_1 & GOM_2         74%     75%
    unsuccess                     14%     15%
not overlapping                   27%     23%
    success GOM_1                 20%     20%
    success GOM_2                 23%     24%
    success GOM_1 & GOM_2         34%     42%
    unsuccess                     23%     14%

Page 37

Results: Statistical reliability

[Equation: success rate SR(f_t), defined as 1 minus a ratio of cardinalities (#) of set combinations of P and T; the set operators are not recoverable from the extracted text]

P: ensemble of predicted functions
T: ensemble of true functions

[Plots, with curves labeled:]
MR: majority rule
random: random guessing

Page 38

Results: Statistical reliability

mixed GOM / MEE comparison

Page 39

Results: Robustness

• random rewiring: degree of dissimilarity f_l
• function prediction on original & rewired networks
• prediction overlap:

[Equation: prediction overlap as a function of f_l, comparing the predictions for each protein i at dissimilarity 0 and at dissimilarity f_l; the exact expression is not recoverable from the extracted text]

Page 40

Conclusions & Perspectives

• PIN: underlying architecture and organization
  - standard tools of the theory of complex networks
  - renormalization group approach
• functional relevance and correlations
• function prediction methods
  - Mixed GOM: topological extension of GOM; 2 parameters with a priori assigned values
  - MEE model: no free parameters, extracting info from given knowledge
• improvement of predictive ability (success rate, robustness)

Page 41

Acknowledgments

• Amos Maritan
• Alessandro Flammini
• Paolo De Los Rios
• Alessandro Vespignani
• Jayanth Banavar
• Andrea Rinaldo