Statistical mechanics approach to complex networks: from abstract to biological networks
Vittoria Colizza
Supervisor: Prof. Amos Maritan
Protein-protein Interaction Networks
Outline
• PIN
• Methods
• Topological analysis
• Renormalization
• Topology / functionality correlations
• Function prediction: mixed Global Optimization Model, Maximum Entropy Estimate Model
• Conclusions & Perspectives
SISSA - PHD - October, 18th 2004
Protein Interaction Networks
Involved in almost every cellular process:
• DNA replication, transcription and translation
• intracellular communication
• cell cycle control
• the workings of complex molecular motors
• ...
Protein Interaction Networks
Undirected network:
• nodes: proteins
• links: direct interactions
S. Maslov & K. Sneppen, Science 296, 910 (2002)
Interaction-detection Methods
experimental techniques: physical bindings
• yeast two-hybrid systems
• mass spectrometry analysis of purified complexes
interaction prediction methods: functional associations
• correlated mRNA expression profiles
• genetic interaction-detection methods
• in silico approaches: gene fusion, gene neighborhood, phylogenetic profiles
Yeast two-hybrid system (Y2H)
• simple, rapid, sensitive, inexpensive: suitable for large-scale applications
• detects virtually every protein-protein interaction, even transient, unstable or weak interactions
• no cooperative binding
• some kinds of proteins not suitable, e.g. transcription factors
• false negative interactions (artificially made hybrids)
• false positive interactions (spatio-temporal constraints)
Protein Complex analysis
• identification of whole complexes: cooperative binding
• in vivo technique, one artificially made protein: physiological settings
• several components as tagged baits for test
• tagging procedure interference with complex formation
• false negative interactions (weakly associated proteins)
Topological analysis
Saccharomyces cerevisiae
• network (I): Y2H binary interactions from 2 distinct experiments
• network (II): interactions from complex analysis (tandem affinity purification, TAP)
• network (III): mixed collection of interactions from different experimental techniques (Database of Interacting Proteins, DIP)
P. Uetz et al., Nature 403, 623 (2000)
T. Ito et al., Proc. Natl. Acad. Sci. USA 98, 4569 (2001)
A. C. Gavin et al., Nature 415, 141 (2002)
Database of Interacting Proteins (DIP), http://dip.doe-mbi.ucla.edu/
Topological analysis
              (I)       (II)      (III)
# proteins    2152      1361      4713
# links       2831      3221      14846
<k>           2.63      4.73      6.30
<C>           0.10      0.22      0.09
<C_rand>      0.0064    0.019     0.018
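As a quick consistency check on the table, the average degree of an undirected network follows from the node and link counts as <k> = 2E/N. The sketch below recomputes it from the quoted sizes; the dictionary keys and the function name are illustrative choices, not part of the original slides.

```python
# Network sizes quoted in the table: (number of proteins N, number of links E).
networks = {
    "I (Y2H)": (2152, 2831),
    "II (TAP)": (1361, 3221),
    "III (DIP)": (4713, 14846),
}

def average_degree(n_nodes: int, n_links: int) -> float:
    """Mean degree of an undirected network: each link adds 1 to two degrees."""
    return 2.0 * n_links / n_nodes

mean_k = {name: average_degree(n, e) for name, (n, e) in networks.items()}
for name, k in mean_k.items():
    print(f"{name}: <k> = {k:.2f}")
```

Rounded to two decimals, the three values agree with the <k> row of the table.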
degree distribution P(k)
$P(k) \propto (k + k_0)^{-\gamma} \, e^{-(k + k_0)/k_c}$
H. Jeong, S. P. Mason, A.-L. Barabasi & Z. N. Oltvai, Nature 411, 41 (2001)
Fitted exponents γ: (I) 2.5, (II) 2.1, (III) 2.5
clustering coefficient C(k)
$C_i = \frac{2 e_i}{k_i (k_i - 1)}$
$C(k) = \frac{1}{N P(k)} \sum_i \delta_{k_i,k} \, C_i$
Scaling exponent of C(k) with k, network (II): 0.48
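The clustering spectrum can be sketched in a few lines: C_i = 2 e_i / (k_i (k_i - 1)) counts the links e_i among the neighbours of node i, and C(k) averages C_i over the nodes of degree k. The adjacency-dictionary representation, the function names, the toy network, and the convention C_i = 0 for k_i < 2 are all assumptions made for illustration.

```python
from collections import defaultdict

def local_clustering(adj):
    """C_i = 2 e_i / (k_i (k_i - 1)), with e_i the links among i's neighbours."""
    coeffs = {}
    for node, neigh in adj.items():
        k = len(neigh)
        if k < 2:
            coeffs[node] = 0.0  # convention assumed here for k_i < 2
            continue
        neigh_list = list(neigh)
        # Count each linked pair of neighbours exactly once.
        e = sum(1 for a_idx, a in enumerate(neigh_list)
                for b in neigh_list[a_idx + 1:] if b in adj[a])
        coeffs[node] = 2.0 * e / (k * (k - 1))
    return coeffs

def clustering_spectrum(adj):
    """C(k): average of C_i over all nodes with degree k."""
    coeffs = local_clustering(adj)
    by_degree = defaultdict(list)
    for node, neigh in adj.items():
        by_degree[len(neigh)].append(coeffs[node])
    return {k: sum(v) / len(v) for k, v in by_degree.items()}

# Toy network (hypothetical): a triangle A-B-C with a pendant node D on C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
spectrum = clustering_spectrum(toy)
```

In the toy network the two degree-2 corners have all their neighbours linked, while the degree-3 node has only one linked pair out of three.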
neighbor degree k_nn(k)
$k_{nn}(k) = \frac{1}{N P(k)} \sum_i \delta_{k_i,k} \, \frac{1}{k_i} \sum_j A_{ij} k_j$
Scaling exponent of k_nn(k) with k, network (III): 0.24
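A minimal sketch of the average nearest-neighbour degree: for each node the degrees of its neighbours are averaged, and the result is then averaged over all nodes with the same degree k. The adjacency-dictionary input, the function name, and the toy star network are illustrative assumptions.

```python
from collections import defaultdict

def average_neighbor_degree(adj):
    """k_nn(k): mean over nodes of degree k of the average degree of their neighbours."""
    degree = {node: len(neigh) for node, neigh in adj.items()}
    by_degree = defaultdict(list)
    for node, neigh in adj.items():
        if degree[node] == 0:
            continue  # isolated nodes have no neighbours to average over
        knn_i = sum(degree[j] for j in neigh) / degree[node]
        by_degree[degree[node]].append(knn_i)
    return {k: sum(v) / len(v) for k, v in by_degree.items()}

# Toy star (hypothetical): hub H linked to three leaves.
star = {"H": {"a", "b", "c"}, "a": {"H"}, "b": {"H"}, "c": {"H"}}
knn = average_neighbor_degree(star)
```

In the star, low-degree nodes attach to the high-degree hub and vice versa, so k_nn(k) decreases with k, the signature of a disassortative network.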
rich-club phenomenon
$\phi(k) = \frac{2 e_{>k}}{N_{>k} (N_{>k} - 1)}$
Scaling exponents of φ(k) with k: (I) 1.96, (II) 1.01, (III) 1.14
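The rich-club coefficient compares the number of links among the nodes of degree larger than k with the maximum number such a group could hold. The sketch below assumes the usual form φ(k) = 2 e_{>k} / (N_{>k} (N_{>k} - 1)); the function name, the adjacency-dictionary input, and the toy network are hypothetical.

```python
def rich_club(adj, k):
    """phi(k) = 2 e_{>k} / (N_{>k} (N_{>k} - 1)) for nodes with degree > k."""
    rich = {node for node, neigh in adj.items() if len(neigh) > k}
    n_rich = len(rich)
    if n_rich < 2:
        return None  # undefined with fewer than two nodes in the club
    # Each link inside the club is seen from both ends, hence the division by 2.
    e_rich = sum(1 for node in rich for j in adj[node] if j in rich) // 2
    return 2.0 * e_rich / (n_rich * (n_rich - 1))

# Toy network (hypothetical): triangle A-B-C, each corner carrying one leaf.
toy = {
    "A": {"B", "C", "a"}, "B": {"A", "C", "b"}, "C": {"A", "B", "c"},
    "a": {"A"}, "b": {"B"}, "c": {"C"},
}
phi_1 = rich_club(toy, 1)  # club = {A, B, C}
phi_0 = rich_club(toy, 0)  # club = all six nodes
```

Here the three high-degree nodes form a complete triangle, so the club at k = 1 is fully connected, while the whole network is much sparser.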
Y2H network: no correlations
TAP network: only 3-point correlations (complexes)
DIP network: hierarchical structure, degree correlations
Topological analysis through network renormalization
Network renormalization
• investigation of critical behaviors of complex networks through the renormalization group (RG) approach
• coarse-graining: decimation of less relevant details to elucidate critical properties
• 'simplification': simpler and more understandable versions of large-scale networks; network visualization?
• links: 0, 1 → renormalized links: not only 0, 1 (weighted networks)
PIN renormalization
[Figure panels: power-law + exp. cut-off; pure power-law; no pure power-law (+ exp. cut-off); power-law (+ exp. cut-off)]
Functional characterization
Change in the view of protein function:
• individual task → cooperative behaviour
• protein interactions → functional relationships
MIPS Comprehensive Yeast Genome Database (CYGD), http://mips.gsf.de/proj/yeast/CYGD/db
Function prediction
• about 30% of encoded proteins per sequenced genome are still uncharacterized
• network-based methods for function prediction:
  • Majority rule (MR)
  • Global optimization (GOM)
  • Topological redundancies
  • Functional clustering (PRODISTIN)
  • Mixed GOM
  • MEE model
B. Schwikowski, P. Uetz & S. Fields, Nature Biotech. 18, 1257 (2000)
H. Hishigaki, K. Nakai, T. Ono & A. Tanigami, Yeast 18, 523 (2001)
A. Vazquez, A. Flammini, A. Maritan & A. Vespignani, Nature Biotech. 21, 697 (2003)
M. P. Samanta & S. Liang, Proc. Natl. Acad. Sci. USA 100, 12579 (2003)
C. Brun et al., Genome Biol. 5, R6 (2003)
VC, P. De Los Rios, A. Flammini & A. Maritan, in preparation
Function prediction
Basic strategy: close proteins ↔ closely related functional annotations
Rate (link ↔ common function f)
        (I)                (II)               (III)
exp     82.90%             82.89%             72.36%
NM1     (60.55 ± 0.19%)    (65.35 ± 0.22%)    (49.28 ± 0.15%)
NM2     (49.62 ± 0.16%)    (64.05 ± 0.20%)    (60.64 ± 0.20%)
Majority rule
function assigned = most common function(s) among classified partners
[Figure: unclassified protein (???) linked to classified partners with functions 2; 3,4,10; 12]
links between unclassified proteins are completely neglected!
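The majority rule can be sketched as a vote among the classified partners of an unclassified protein. The data layout (an adjacency dictionary plus a dictionary of known function sets), the function name, and the toy proteins are assumptions made for illustration; ties are returned together, following the "function(s)" wording of the slide.

```python
from collections import Counter

def majority_rule(adj, functions, protein):
    """Most common function(s) among the classified partners of `protein`.

    adj: dict protein -> set of interaction partners
    functions: dict classified protein -> set of function labels
    Unclassified partners (absent from `functions`) are ignored, as in MR.
    """
    counts = Counter()
    for partner in adj[protein]:
        counts.update(functions.get(partner, ()))
    if not counts:
        return set()  # no classified partner: no prediction
    best = max(counts.values())
    return {f for f, c in counts.items() if c == best}

# Hypothetical toy example: proteins "X" and "U" are unclassified.
adj = {"X": {"P1", "P2", "P3", "U"}, "P1": {"X"}, "P2": {"X"}, "P3": {"X"}, "U": {"X"}}
functions = {"P1": {2}, "P2": {2, 4}, "P3": {12}}
prediction = majority_rule(adj, functions, "X")
no_prediction = majority_rule(adj, functions, "U")
```

Protein "U" interacts only with the unclassified "X", so the vote is empty: this is the information the majority rule discards.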
Global Optimization Model (GOM)
links between unclassified proteins are also taken into account
whole set of interactions of each uncharacterized protein → self-consistency
[Figure: unclassified proteins (???) and classified partners with functions 2,4; 3,4,10; 12; 2; 3,4,10; 12]
Global Optimization Model (GOM)
functional assignment → score
$E = -\sum_{i,j} J_{ij} \, \delta(\sigma_i, \sigma_j) - \sum_i h_i(\sigma_i)$
• first term: links between unclassified proteins
• second term: links between unclassified and classified proteins
global optimization: minimum E → functional assignment proposed
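A minimal sketch of the optimization step, assuming the score has the form E = -Σ J_ij δ(σ_i, σ_j) - Σ h_i(σ_i), with the first sum running once over each link between unclassified proteins and h_i(σ) counting the classified partners of i that carry function σ. The Metropolis annealing schedule, parameter values, function names, and toy data are illustrative assumptions, not the procedure used in the talk.

```python
import math
import random

def gom_energy(assign, uu_links, h):
    """E = - sum over uncl./uncl. links of delta(s_i, s_j) - sum_i h_i(s_i)."""
    pair_term = sum(1 for i, j in uu_links if assign[i] == assign[j])
    field_term = sum(h[i].get(assign[i], 0) for i in assign)
    return -pair_term - field_term

def anneal(uu_links, h, labels, steps=2000, t_start=2.0, t_end=0.05, seed=0):
    """Metropolis simulated annealing over single-protein relabelings."""
    rng = random.Random(seed)
    proteins = sorted(h)
    assign = {p: rng.choice(labels) for p in proteins}
    energy = gom_energy(assign, uu_links, h)
    for step in range(steps):
        # Geometric cooling from t_start towards t_end.
        temp = t_start * (t_end / t_start) ** (step / steps)
        p = rng.choice(proteins)
        old = assign[p]
        assign[p] = rng.choice(labels)
        new_energy = gom_energy(assign, uu_links, h)
        delta = new_energy - energy
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            energy = new_energy
        else:
            assign[p] = old  # reject the move
    return assign, energy

# Hypothetical toy data: unclassified proteins u1 and u2 interact with each other;
# h[i][s] counts the classified partners of i that carry function s.
uu_links = [("u1", "u2")]
h = {"u1": {2: 2, 4: 1}, "u2": {4: 1}}
best_assign, best_energy = anneal(uu_links, h, labels=[2, 4, 12])
```

The toy system has several assignments with the same minimal score, so repeated runs with different seeds can end in different optimal solutions.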
mixed GOM & MEE models
Designed to take full advantage of the observed correlations between the pattern of interactions among proteins and their functionalities
• more thorough investigation of the topology → mixed GOM
• observed correlations between the functions of interacting proteins → MEE model
mixed Global Optimization Model
II neighbors:
• experimental reasons: direct interaction / mediated interaction
• evolution by duplication/divergence
• topological redundancies
M. P. Samanta & S. Liang, Proc. Natl. Acad. Sci. USA 100, 12579 (2003)
A. Edwards et al., Trends Genet. 18, 529 (2002)
A. Force et al., Genetics 151, 1531 (1999)
M. Lynch and A. Force, Genetics 154, 459 (2000)
mixed Global Optimization Model
I neighbors → GOM1:
$E^{(1)} = -\sum_{i,j} J_{ij} \, \delta(\sigma_i, \sigma_j) - \sum_i h_i^{(1)}(\sigma_i)$
II neighbors → GOM2:
$E^{(2)} = -\sum_{i,j} S_{ij} \, \delta(\sigma_i, \sigma_j) - \sum_i h_i^{(2)}(\sigma_i)$
mixed Global Optimization Model
• random initial functional assignment
• independent optimization of GOM1 and GOM2
• frustration → multiple optimal solutions
• functional assignment: function(s) with highest frequency of occurrence
• mixed GOM functional assignment: merging GOM1 and GOM2
• role of topological redundancies: S_ij = # paths of length 2 connecting proteins i and j
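The second-neighbor weights can be sketched directly from the definition: the number of paths of length 2 between proteins i and j equals the number of partners they share. The adjacency-dictionary input, the function name, and the toy network are assumptions for illustration.

```python
def paths_of_length_two(adj):
    """S[i][j] = number of common neighbours of i and j (i != j)."""
    nodes = sorted(adj)
    S = {i: {} for i in nodes}
    for a_idx, i in enumerate(nodes):
        for j in nodes[a_idx + 1:]:
            shared = len(adj[i] & adj[j])
            if shared:
                S[i][j] = S[j][i] = shared
    return S

# Hypothetical toy network: A and B share the two partners C and D.
toy = {"A": {"C", "D"}, "B": {"C", "D"}, "C": {"A", "B"}, "D": {"A", "B"}}
S = paths_of_length_two(toy)
```

Proteins A and B are not directly linked, yet they are connected by two redundant length-2 paths, which is the kind of topological redundancy GOM2 weights.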
Maximum Entropy Estimate Model
k-point correlation functions
$E^{(1)} = -\sum_{i,j} J_{ij} \, \delta(\sigma_i, \sigma_j) - \sum_i h_i^{(1)}(\sigma_i)$
$J_{ij} = 0, 1 \;\to\; J(\sigma_i, \sigma_j)$ for each pair (i, j): measure of the functional correlations
Maximum Entropy Estimate Model
[Figure: example network with function annotations on the proteins, such as (2), (2,3,4), (2,3), (5,6), (3,4), (4), (2,4); the coupling J depends on the functions σ_i, σ_j of the linked pair, e.g. σ_i = σ_j = 2, σ_i = σ_j = 4, or σ_i = 2, σ_j = 3]
Maximum Entropy Estimate Model
Info extracted from the partial knowledge of the network (maximum entropy estimate criterion)
[Equations: cost function E, built from sums over linked pairs (i, j) and over the functions f = 1, ..., F of each protein, with a logarithmic term]
Results: Statistical reliability
Self-consistency test:
• a fraction f_n of classified proteins is set unclassified
• function prediction
• rate of success in recovering the correct functions of the test proteins
Results: Statistical reliability
                              (I)     (II)
totally overlapping           36%     51%
  success                     79%     80%
  unsuccess                   21%     20%
partly overlapping            37%     26%
  success GOM1                8%      9%
  success GOM2                4%      1%
  success GOM1 & GOM2         74%     75%
  unsuccess                   14%     15%
not overlapping               27%     23%
  success GOM1                20%     20%
  success GOM2                23%     24%
  success GOM1 & GOM2         34%     42%
  unsuccess                   23%     14%
Results: Statistical reliability
$SR(f_t) = 1 - \frac{\#(P \cup T) - \#(P \cap T)}{\#(P \cup T) + \#(P \cap T)}$
P: ensemble of predicted functions
T: ensemble of true functions
MR: majority rule
random: random guessing
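One possible reading of the success rate, taken here as an assumption, is SR = 1 - (#(P ∪ T) - #(P ∩ T)) / (#(P ∪ T) + #(P ∩ T)) for the sets P of predicted and T of true functions of a test protein. Under that reading SR equals 1 for a perfect prediction and 0 when the two sets are disjoint. The function name and the example sets below are hypothetical.

```python
def success_rate(predicted, true):
    """SR = 1 - (#(P u T) - #(P n T)) / (#(P u T) + #(P n T))."""
    union = len(predicted | true)
    inter = len(predicted & true)
    if union == 0:
        return None  # undefined when both sets are empty
    return 1.0 - (union - inter) / (union + inter)

sr_exact = success_rate({2, 4}, {2, 4})    # prediction matches the annotation
sr_partial = success_rate({2, 4}, {2})     # one spurious predicted function
sr_miss = success_rate({12}, {2, 4})       # no function in common
```

Because the measure penalizes both missing and spurious functions, predicting every possible function does not inflate the score.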
Results: Statistical reliability
mixed GOM / MEE comparison
Results: Robustness
• random rewiring → degree of dissimilarity f_l
• function prediction on original & rewired networks → prediction overlap
[Equation: prediction overlap, comparing for each protein i the predictions at dissimilarity 0 and at dissimilarity f_l]
Conclusions & Perspectives
• PIN: underlying architecture and organization
  • standard tools of the theory of complex networks
  • renormalization group approach
• functional relevance and correlations
• function prediction methods
  • Mixed GOM: topological extension of GOM; 2 parameters with a priori assigned values
  • MEE model: no free parameters, extracting info from given knowledge
• improvement of predictive ability (success rate, robustness)
Acknowledgments
Amos Maritan
Alessandro Flammini
Paolo De Los Rios
Alessandro Vespignani
Jayanth Banavar
Andrea Rinaldo