Decomposition for Reasoning with Biological Network
Gauvain Bourgne, Katsumi Inoue
ISSSB’11, Shonan Village, November 13th-17th, 2011


Automated problem decomposition for efficient reasoning in metabolic pathways

Motivation

  • In bioinformatics, there is a need to reason over huge amounts of data
      • Huge networks (e.g. metabolic pathways, signaling pathways)
  • On such problems, centralized methods suffer from
      • Long computation time
      • Memory overflow
  • Problem decomposition
      • Divide the problem into smaller problems or steps, then recompose a global solution
      • Need for (1) an automated process to decompose and (2) an algorithm to solve local problems and recompose the global solution

Example Problem (Krebs Cycle)

[Figure: metabolic network around the Krebs cycle. Nodes are metabolites (e.g. succinate, fumarate, citrate, isocitrate, 2-oxo-glutarate, pyruvate, acetyl-CoA, glucose, lactate, urea) connected by reactions labeled with EC numbers (e.g. 2.3.3.1, 4.2.1.3, 1.1.1.42, 1.3.99.1), with a link to glycolysis.]

Example Problem (Krebs Cycle)

[Figure: the same network divided among six agents, Ag0 to Ag5.]

Overview

  • Reasoning task
  • Partition-based algorithm
  • Automated decomposition
  • Experimental evaluation
  • Conclusion

Logical representation

  • Metabolic pathways: set of reactions $R_i$
      • $R_i$: $m_1, m_2, \ldots, m_p \rightarrow p_1, p_2, \ldots, p_n$
  • Such a reaction can be represented as
      • one activation rule: $\neg m_1 \vee \neg m_2 \vee \ldots \vee \neg m_p \vee R_i$
      • $n$ production rules: $\neg R_i \vee p_1$, $\neg R_i \vee p_2$, ..., $\neg R_i \vee p_n$
  • Result: a clausal theory

Problems

  • (Conditional) accessibility problems
      • Sources ($s_i$), conditional sources ($c_i$), targets ($t_i$)
      • Find which $t_i$ can be produced from the $s_i$, possibly with the addition of some $c_i$ as new sources
      • Find all consequences of the form $\neg c_i \vee \ldots \vee \neg c_k \vee t_j$
  • Extraction of sub-networks
  • Pathway completion (abduction)
      • Find reactions (sets of clauses)
      • Hypotheses on the state of reactions given experiments

These problems reduce to consequence finding (with a specific form).

Main reasoning task

  • Consequence Finding (CF) in clausal theories
  • Input
      • A clausal theory $T$
      • A production field $P = \langle L, Cond \rangle$
          • $L$ is a list of literals
          • $Cond$ is a condition (maximal length of the consequences, or number of occurrences of some literals)
  • Output
      • All the consequences of $T$ that are subsumption-minimal and belong to $P$ (formed with literals of $L$ and respecting the condition $Cond$)
      • Denoted $Carc(T, P)$

Partition-based CF

  • The task: Consequence Finding (CF) in clausal theories
  • Input
      • A set of clausal theories $T_i$ such that $\bigcup_i T_i = T$, and a set of reasoners $a_i$, one associated with each partition
      • A production field $P = \langle L, Cond \rangle$
  • Output
      • $Carc(T, P)$
  • Where
      • The output should be produced through local computations and interactions between reasoners (message exchange)

Partition-based Consequence Finding

  • Generalization of Partition-based Theorem Proving [Amir & McIlraith, 2005]
  • Based on Craig's Interpolation Theorem:
      • If $C$ entails $D$, then there is a formula $F$ involving only symbols common to $C$ and $D$ such that $C$ entails $F$ and $F$ entails $D$
  • Principles
      • Identify common symbols (communication languages)
      • Build a tree structure (cycle-cut)
      • Forward relevant consequences from leaf to root

Communication languages

  • Graph induced from the partition
  • Problem: eliminate cycles from it while ensuring a proper labeling
  • Cycle-cut
      While G is not acyclic:
          Take a minimal cycle $S = (i_1,i_2), (i_2,i_3), \ldots, (i_p,i_1)$
          Choose $(i,j)$ in $S$ such that [expression lost in extraction] is minimal
          For each $(q,r) \neq (i,j)$ in $S$: $l(q,r) \leftarrow l(q,r) \cup l(i,j)$
          Remove $(i,j)$ from $E$

[Figure: example partition graph with edges labeled by shared symbols; labels not recoverable.]
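The cycle-cut loop can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' code: the edge-selection criterion is missing from the text, so the sketch assumes that the cut edge is the one whose label, once merged into the other edges of the cycle, gives the smallest total label size. "Minimal cycle" is read as "shortest cycle", and the names `shortest_cycle` and `cycle_cut` are hypothetical. Partitions are integers and each edge carries the set of symbols `l(i,j)`.

```python
from collections import deque
from typing import Dict, FrozenSet, List, Optional, Set

Edge = FrozenSet[int]          # undirected edge {i, j} between two partitions
Labels = Dict[Edge, Set[str]]  # l(i, j): symbols exchanged along the edge


def shortest_cycle(labels: Labels) -> Optional[List[Edge]]:
    """Return the edges of a shortest cycle, or None if the graph is acyclic."""
    best: Optional[List[Edge]] = None
    for edge in labels:
        u, v = tuple(edge)
        # Adjacency of the graph without `edge`; a u-v path then closes a cycle.
        adj: Dict[int, List[int]] = {}
        for other in labels:
            if other != edge:
                a, b = tuple(other)
                adj.setdefault(a, []).append(b)
                adj.setdefault(b, []).append(a)
        parent: Dict[int, Optional[int]] = {u: None}
        queue = deque([u])
        while queue and v not in parent:  # breadth-first search from u
            x = queue.popleft()
            for y in adj.get(x, []):
                if y not in parent:
                    parent[y] = x
                    queue.append(y)
        if v in parent:
            cycle, node = [edge], v
            while parent[node] is not None:
                cycle.append(frozenset({node, parent[node]}))
                node = parent[node]
            if best is None or len(cycle) < len(best):
                best = cycle
    return best


def cycle_cut(labels: Labels) -> Labels:
    """Remove edges until the graph is acyclic, merging each removed label into its cycle."""
    labels = {e: set(l) for e, l in labels.items()}
    while True:
        cycle = shortest_cycle(labels)
        if cycle is None:
            return labels

        # ASSUMED criterion: total label size of the remaining cycle edges after merging.
        def cost(cut: Edge) -> int:
            return sum(len(labels[e] | labels[cut]) for e in cycle if e != cut)

        cut = min(cycle, key=cost)
        for e in cycle:
            if e != cut:
                labels[e] |= labels[cut]
        del labels[cut]


# Hypothetical triangle of three partitions.
e01, e12, e02 = frozenset({0, 1}), frozenset({1, 2}), frozenset({0, 2})
tree = cycle_cut({e01: {"a"}, e12: {"b", "c"}, e02: {"a", "d"}})
```

In the triangle example the edge between partitions 0 and 1 is removed, and its symbol `a` is added to the labels along the remaining path 0-2-1, so anything partitions 0 and 1 needed to exchange can still travel through partition 2.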

Forward Message-passing Algorithm (Sequential)

  • Preprocessing
      • Determine the initial $l(i,j)$
      • Apply cycle-cut
      • Determine $P_i$
          • Non-root agents $a_i$ (with parent $a_j$): $P_i =$ [definition lost in extraction]
          • Root $a_k$: $P_k = P$
  • Consequence finding
      • From leaves to root
      • Determine $Cn_i = Carc(T_i, P_i)$
      • Forward $Cn_i$
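The sequential leaf-to-root pass can be illustrated with a naive propositional sketch in Python. Several points are assumptions rather than content of the slides: `carc` is a brute-force resolution saturation that ignores the condition part of the production field, the production field of a non-root agent is assumed to consist of the literals over the symbols shared with its parent together with the literals of the global field, and the function names and the two-agent example are hypothetical. It is not the solver used by the authors.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, Optional, Set

Clause = FrozenSet[str]  # literals are strings; "-x" denotes the negation of "x"


def neg(lit: str) -> str:
    return lit[1:] if lit.startswith("-") else "-" + lit


def minimal(clauses: Iterable[Clause]) -> Set[Clause]:
    """Drop tautologies and every clause strictly subsumed by another one."""
    kept = {c for c in clauses if not any(neg(lit) in c for lit in c)}
    return {c for c in kept if not any(d < c for d in kept)}


def carc(theory: Iterable[Clause], field: Set[str]) -> Set[Clause]:
    """Naive Carc(T, P): saturate by resolution, keep minimal clauses built from `field`."""
    known = minimal(theory)
    while True:
        derived = set(known)
        for c, d in combinations(known, 2):
            for lit in c:
                if neg(lit) in d:
                    derived.add((c - {lit}) | (d - {neg(lit)}))
        derived = minimal(derived)
        if derived == known:
            return {c for c in known if c <= field}
        known = derived


def forward_cf(
    theories: Dict[str, Set[Clause]],
    parent: Dict[str, Optional[str]],  # None marks the root
    link: Dict[str, Set[str]],         # symbols shared with the parent, after cycle-cut
    field: Set[str],
) -> Set[Clause]:
    """Process agents from the deepest leaves up to the root, forwarding consequences."""
    local = {i: set(t) for i, t in theories.items()}

    def depth(i: str) -> int:
        return 0 if parent[i] is None else 1 + depth(parent[i])

    for i in sorted(theories, key=depth, reverse=True):
        if parent[i] is None:
            return carc(local[i], field)
        # ASSUMED local production field: link literals plus the global field.
        local_field = {lit for s in link[i] for lit in (s, "-" + s)} | field
        local[parent[i]] |= carc(local[i], local_field)
    raise ValueError("no root agent")


# Hypothetical two-agent example: ag1 is a leaf, ag0 is the root, they share symbol c.
t1 = {frozenset({"-a", "-b", "R1"}), frozenset({"-R1", "c"})}
t0 = {frozenset({"-c", "R2"}), frozenset({"-R2", "t"})}
P = {"-a", "-b", "t"}
distributed = forward_cf({"ag0": t0, "ag1": t1}, {"ag0": None, "ag1": "ag0"}, {"ag1": {"c"}}, P)
centralized = carc(t0 | t1, P)
```

On this example the leaf forwards only the clause over `a`, `b` and the shared symbol `c`, and the root then derives the same single consequence as the centralized computation.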

[Figure: agent tree with nodes labeled Carc.]

Parallel Variant

[Figure: agent tree with nodes labeled Carc and Newcarc.]

  • Incremental computations: $Newcarc(T \cup C, P) = Carc(T \cup C, P) \setminus Carc(T, P)$

Decomposition of clausal theories

  • Given a clausal theory $T$
  • Find a set of partitions $T_i$ such that
      • $\bigcup_i T_i = T$
      • Reasoning is easier, i.e. the application of the partition-based algorithm to this decomposition is as efficient as possible
          • Minimize the size of the communication languages
          • Ensure that some simplification can be done locally
  • Partitions should be cohesive and loosely coupled

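Graph representation

  • Example clausal theory (symbols of each clause)
      • c1: b, c, e, f
      • c2: a, d, e
      • c3: d, g, h
      • c4: e, g
      • c5: g, h, i
  • A clausal theory can be represented as a graph
  • Focus on common symbols

[Figure: the clauses c1 to c5 drawn as a graph over the symbols a to i, then as a graph on clauses whose edges carry the shared symbols (e, d, g, h).]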
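The graph on clauses with shared-symbol labels can be built in a few lines. The Python sketch below uses the symbols of the example clauses c1 to c5 (polarity is ignored); the function names `clause_graph` and `communication_language` and the two-block partition are hypothetical and only meant to show how the coupling between blocks could be measured. The edge set it computes is the full pairwise one and may differ from the reduced graph drawn in the slides.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def clause_graph(clauses: Dict[str, Set[str]]) -> Dict[Tuple[str, str], Set[str]]:
    """Link every pair of clauses that share symbols; label the edge with those symbols."""
    edges = {}
    for (n1, s1), (n2, s2) in combinations(sorted(clauses.items()), 2):
        shared = s1 & s2
        if shared:
            edges[(n1, n2)] = shared
    return edges


def communication_language(clauses: Dict[str, Set[str]], block: Dict[str, int]) -> Set[str]:
    """Symbols carried by edges whose two clauses fall in different blocks."""
    language: Set[str] = set()
    for (n1, n2), shared in clause_graph(clauses).items():
        if block[n1] != block[n2]:
            language |= shared
    return language


clauses = {
    "c1": {"b", "c", "e", "f"},
    "c2": {"a", "d", "e"},
    "c3": {"d", "g", "h"},
    "c4": {"e", "g"},
    "c5": {"g", "h", "i"},
}
graph = clause_graph(clauses)
# Hypothetical partition: {c1, c2} in block 0 and {c3, c4, c5} in block 1.
language = communication_language(clauses, {"c1": 0, "c2": 0, "c3": 1, "c4": 1, "c5": 1})
```

With this partition only `d` and `e` cross the boundary, while `g` and `h` stay internal to the second block, which is the kind of cohesive and loosely coupled split the decomposition aims for.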

[Figure: reduced graph on the clauses c1 to c5 with edge weights.]

Architecture

  • Initial theory (.sol file)
  • buildGraph: produces the reduced graph representation
  • kmetis, given the number of partitions: produces the partitioned graph
  • graph2dcf: produces the partitioned clausal theory (.dcf file)
  • Root choice heuristic: selects the root
      • Choose the root with maximal average clause size
  • Partition-based CF: produces the solution
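The root choice heuristic is simple enough to sketch directly. The Python snippet below assumes that "clause size" means the number of literals in a clause and that every partition is non-empty; the function name `choose_root` and the two example partitions are hypothetical.

```python
from typing import Dict, FrozenSet, List


def choose_root(partitions: Dict[str, List[FrozenSet[str]]]) -> str:
    """Return the partition whose clauses have the largest average number of literals."""

    def average_size(name: str) -> float:
        clauses = partitions[name]
        return sum(len(c) for c in clauses) / len(clauses)

    return max(partitions, key=average_size)


# Hypothetical partitions: ag0 averages 2.5 literals per clause, ag1 averages 2.0.
partitions = {
    "ag0": [frozenset({"-a", "-b", "R1"}), frozenset({"-R1", "c"})],
    "ag1": [frozenset({"-c", "R2"}), frozenset({"-R2", "t"})],
}
root = choose_root(partitions)
```

Ties are broken by dictionary order here, which is an arbitrary choice of this sketch.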

Problem Decomposition

[Figure: the Krebs cycle network of the earlier example, divided among six agents ag0 to ag5.]

Benchmark Problems

  • Biological networks
  • TPTP problems
      • Production field
          • Vocabulary of the conjecture (+ removing the conjecture)
          • Full vocabulary with length limit
  • SAT problems
      • Production field
          • Based on frequency of literals
          • N% most/least frequent literals
  • Size
      • Problems still not tractable as CF problems
      • Solving only a cohesive sub-problem (obtained by partition of the clause graph)

Problems characteristics

[Table not recoverable.]

Results: Biological Networks

[Table not recoverable; surviving values: 2 682 252 (3 321 857).]

Results: SAT problems

[Table not recoverable.]

Results: TPTP problems

[Table not recoverable.]

Results: summary

[Summary charts not recoverable.]

Results

  • For almost all problems, decomposition reduces the number of resolve operations needed
      • In particular, it solves some problems that could not be solved otherwise
  • Time is not often improved
      • Due to communication time (parsing and the like)
  • Approximate decomposition with metis: OK
  • Root choice heuristic: still insufficient, though not bad for biological network problems

Conclusion

  • A sound and complete algorithm combined with automated problem decomposition
  • Can increase efficiency (number of operations) for almost all problems
  • But results depend on the choice of root

Future work

  • Partition-based algorithm
      • Variant for Newcarc computations
      • Common theories for first-order representations
      • Ordered partitions to break cycles (without removing links)
  • Decomposition
      • Directly from the metabolic pathway
  • Root choice heuristic
      • Learning a preference relation on root choice
  • Choosing the number of partitions

Thank you for your attention

Any questions?