caring about sharing: little b, a language for
building modular models
aneil mallavarapu
department of systems biology
harvard medical school
alife-boston
wednesday june 15th, 2005
the motivation
• today, models are monolithic and used only by a small cadre of computational biologists
• how can models become a part of everyday scientific life, as gene sequences have become?
• we need a computational framework for building models in a modular and incremental way
how can we make McModelling a reality?
what is a model?
• a formal description of a system of interacting parts
• which enables some useful analysis, for example…
• mechanistic simulation: ODE/PDE, stochastic, boolean, multivalued discrete, hybrid, etc.
• steady-state analysis: flux-balance, metabolic control, null-cline analysis
• statistical analysis: bayes nets
i’ll show you “little b”,
• a programming language built in LISP,
• designed to enable modular description of biological systems,
• and to write mathematical models for you.
modularity and extensibility
(layers shown in the diagram; an arrow x → y means “x is used by y”)
• common lisp (ansi x3j13): widely supported standard; free & commercial compilers, tools, libraries; core language syntax, types and functions; integer, bignum, float, complex, string, list, array types; support for classes, structures, functions
• little b core language + syntax: object-oriented data structures & syntax + rule-based logic; define, defcon, defprop, defrule, defmethod; { } – infix operator, [ ] – object operator, . – field-access operator
• symbolic math, units & dimensions: symbolic mathematics; unit, dimension, base-dimension, gauss, quantity, polynomial, rational-polynomial, var, tvar, cvar, dvar, …
• biology: data structures and logical rules for building biochemical models; reactant, reactant-type, reaction, reaction-type, location, membrane, compartment, aggregate, enzymatic-reaction, …
• tools: MATLAB, SBML?, PI?, GUI tools?, Database?
• models
toy egf receptor model - parts:
• egf
• egfr
• mapkkk, mapkkk*
• mapkk, mapkk*
• mapk, mapk*
“egfr+egf”
“under the hood”
toy egf receptor model - mathematics:
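• given a reaction with $n$ LHS reactants, $R_i$ with stoichiometries $s_i$:
  $s_1 R_1 + \dots + s_i R_i + \dots + s_n R_n$
• where the reaction occurs in a location of size $Z$ (which may be a volume, area or length).
• reaction rate, $T$ (moles / size-units / seconds):
  $T = k \times [R_1]^{s_1} \times \dots \times [R_i]^{s_i} \times \dots \times [R_n]^{s_n}$
• reaction rate in moles/seconds $= T \times Z$
• $d[R_i]/dt = T \times Z / C_i$
• where $C_i$ is the size of the compartment containing $R_i$

e.g., $2A \rightarrow B$
$T = k[A]^2$
$d[A]/dt = \dots - 2T \dots$
$d[B]/dt = \dots + T \dots$

e.g., $A + B \rightarrow C$, occurring in a membrane of size $Z_{membrane}$, with $B$ in a compartment of size $Z_{compartment}$
$T = k[A][B]$
$d[A]/dt = \dots - T \dots$
$d[B]/dt = \dots - T \, Z_{membrane} / Z_{compartment} \dots$
$d[C]/dt = \dots + T \dots$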
the mass action rate-method calculates T, the rate of the reaction:
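the little b source of the rate-method is not reproduced in this transcript. as a rough illustration of the mass action formula only, here is a minimal python sketch (not little b). the names `mass_action_rate` and `concentration_change` are hypothetical, and the sign and stoichiometric factor are assumptions taken from the worked examples (−2T for 2A → B).

```python
from math import prod

def mass_action_rate(k, concentrations, stoichiometries):
    """Return T = k * [R1]^s1 * ... * [Rn]^sn over the LHS reactants."""
    return k * prod(c ** s for c, s in zip(concentrations, stoichiometries))

def concentration_change(T, Z, C, stoichiometry=1, consumed=True):
    """Contribution of one reaction to d[R]/dt.

    Z is the size of the location where the reaction occurs and C is the
    size of the compartment containing the reactant. The sign convention
    (negative when consumed) is an assumption of this sketch.
    """
    sign = -1.0 if consumed else 1.0
    return sign * stoichiometry * T * Z / C

# 2A -> B inside one compartment (Z == C, so the size ratio is 1)
T = mass_action_rate(k=0.5, concentrations=[2.0], stoichiometries=[2])
dA_dt = concentration_change(T, Z=1.0, C=1.0, stoichiometry=2)
dB_dt = concentration_change(T, Z=1.0, C=1.0, consumed=False)

# membrane reaction: a compartment reactant is scaled by Z_membrane / Z_compartment
dB_membrane = concentration_change(T=1.0, Z=2.0, C=4.0)
```

the last line mirrors the membrane example: a reactant that lives in a compartment sees the membrane rate scaled by the ratio of the two sizes.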
the rate-method is modular: it is implemented as a function, which can be substituted:
or the adventurous can build their own…
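to make the idea of a substitutable rate-method concrete, here is a small python sketch (not little b). the `Reaction` class and both rate constructors are hypothetical names. the michaelis-menten form is chosen here only as a familiar alternative to mass action; the slides do not say which alternatives little b ships with.

```python
from dataclasses import dataclass
from typing import Callable, Dict

RateFn = Callable[[Dict[str, float]], float]

def mass_action(k: float, lhs: Dict[str, int]) -> RateFn:
    """Build a rate function T = k * prod([R]^s) for the given LHS."""
    def rate(conc: Dict[str, float]) -> float:
        T = k
        for name, s in lhs.items():
            T *= conc[name] ** s
        return T
    return rate

def michaelis_menten(vmax: float, km: float, substrate: str) -> RateFn:
    """Build a saturating rate function T = vmax * [S] / (km + [S])."""
    def rate(conc: Dict[str, float]) -> float:
        s = conc[substrate]
        return vmax * s / (km + s)
    return rate

@dataclass
class Reaction:
    name: str
    rate_fn: RateFn  # the pluggable part

    def rate(self, conc: Dict[str, float]) -> float:
        return self.rate_fn(conc)

conc = {"S": 2.0}
reaction = Reaction("S->P", mass_action(0.5, {"S": 1}))
linear = reaction.rate(conc)

# swap the rate-method without touching the reaction's description
reaction.rate_fn = michaelis_menten(vmax=3.0, km=1.0, substrate="S")
saturating = reaction.rate(conc)
```

the description of the reaction stays fixed while the mathematical assumption behind its rate changes, which is the separation the slide is pointing at.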
now imagine….
• libraries of such components have been previously defined by experts, and are available
  – over the web
  – in a database in your lab
  – in your own personal collection
• b enables these parts to be combined
let’s describe a situation composed of predefined parts:
[figure labels: dish, cell-a, egf, egfr, mapkkk, mapkkk*, mapkk, mapkk*, mapk, mapk*, ES complex (mapkkk*-mapkk), ES complex (mapkk*-mapk)]
b builds symbolic mathematical expressions:
“object-oriented syntax meets symbolic math” enables programmers and theorists to write & debug functions which translate between the world of objects and the world of mathematical expressions.
the symbolic math subsystem is an extensible toolkit for theoreticians to express mathematical concepts:
• units, dimensions
• quantities, gaussian distributions
• polynomials
• rational-polynomials
system is extensible:
possible additions:
• radical expressions
• matrices
• poisson distributions
• …others?
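one way to picture what a units & dimensions layer buys you is a quantity type that carries its dimension along and refuses inconsistent arithmetic. the python sketch below is an illustration under that assumption; the `Quantity` class and its (mole, length, time) exponent tuple are hypothetical and are not little b's representation.

```python
from dataclasses import dataclass
from typing import Tuple

Dims = Tuple[int, int, int]  # exponents of (mole, length, time)

@dataclass(frozen=True)
class Quantity:
    value: float
    dims: Dims

    def __mul__(self, other: "Quantity") -> "Quantity":
        # multiplying quantities adds dimension exponents
        dims = tuple(a + b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value * other.value, dims)

    def __truediv__(self, other: "Quantity") -> "Quantity":
        # dividing quantities subtracts dimension exponents
        dims = tuple(a - b for a, b in zip(self.dims, other.dims))
        return Quantity(self.value / other.value, dims)

    def __add__(self, other: "Quantity") -> "Quantity":
        # addition is only defined for matching dimensions
        if self.dims != other.dims:
            raise ValueError(f"dimension mismatch: {self.dims} vs {other.dims}")
        return Quantity(self.value + other.value, self.dims)

# T in moles / volume / second, Z as a volume
T = Quantity(2.0, (1, -3, -1))
Z = Quantity(3.0, (0, 3, 0))
flux = T * Z  # moles / second
```

here `T * Z` comes out in moles per second, matching "reaction rate in moles/seconds = T x Z", while adding a rate to a volume raises an error instead of silently producing a number.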
ok… back to the model:
set initial conditions… and perform numerical integration in matlab
shorthand for setting the initial condition of all reactants of a particular type
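the slides hand the generated equations to matlab. as a self-contained stand-in, the python sketch below integrates the earlier 2A → B example with a fixed-step fourth-order runge-kutta scheme. the rate constant, initial conditions and step size are assumptions chosen for illustration, and the function names are hypothetical.

```python
def derivatives(state, k=0.5):
    """Right-hand side for 2A -> B with T = k[A]^2."""
    a, b = state
    T = k * a ** 2
    return (-2.0 * T, T)

def rk4_step(f, state, dt):
    """Advance the state by one classical Runge-Kutta step."""
    k1 = f(state)
    k2 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k1)))
    k3 = f(tuple(s + 0.5 * dt * d for s, d in zip(state, k2)))
    k4 = f(tuple(s + dt * d for s, d in zip(state, k3)))
    return tuple(
        s + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
        for s, d1, d2, d3, d4 in zip(state, k1, k2, k3, k4)
    )

def integrate(f, state, t_end, dt):
    """Integrate from t = 0 to t_end with a fixed step."""
    for _ in range(round(t_end / dt)):
        state = rk4_step(f, state, dt)
    return state

initial = (1.0, 0.0)  # [A] = 1, [B] = 0
a_end, b_end = integrate(derivatives, initial, t_end=1.0, dt=0.01)
```

for this system the exact solution is [A](t) = A0 / (1 + 2 k A0 t), so the numerical result can be checked against it, and [A] + 2[B] should stay constant throughout.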
aggregates:
• an aggregate is a biochemical species which is composed of some number of other molecules
[figure labels: R, S1, S2, “RS12”]
dimerizing-aggregate calculates every pairwise reaction-type which leads to formation of the complex
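the slides do not show how dimerizing-aggregate enumerates its reactions. the python sketch below is one possible reading, under the loud assumption that every sub-aggregate may form and that any two disjoint pieces may bind; little b may restrict this. the function name is hypothetical and the parts are assumed to be distinct.

```python
from itertools import combinations

def pairwise_formation_reactions(parts):
    """List ((left, right), product) for every way two pieces can combine.

    Every subset of two or more parts is treated as a possible product,
    and every split of it into two nonempty pieces as a binding reaction.
    """
    parts = tuple(sorted(parts))
    reactions = set()
    for size in range(2, len(parts) + 1):
        for product in combinations(parts, size):
            for left_size in range(1, size):
                for left in combinations(product, left_size):
                    right = tuple(p for p in product if p not in left)
                    # sort the pair so that A + B and B + A count once
                    pair = tuple(sorted((left, right)))
                    reactions.add((pair, product))
    return sorted(reactions)

reactions = pairwise_formation_reactions(["R", "S1", "S2"])
for (left, right), product in reactions:
    print("".join(left), "+", "".join(right), "->", "".join(product))
```

for R, S1 and S2 this yields six binding reactions: three that form a two-part intermediate and three that complete the three-part complex.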
situation-independent encoding of reactions
• location-class
• location
• reaction-type
• reactant-type
• location-requirement
• reaction
• reactant
reactant-type / reactant
• reactant-type is used to describe types of (bio)chemical species
(define ion [simple-reactant-type :location-class compartment])
(define ion-channel [simple-reactant-type :location-class membrane])
• reactant is used to describe a population of molecules of a particular reactant-type in a particular location:
(define c1 [compartment])
(define m1 [membrane])
A.(in cell.inner) :#= [reactant A c1]      molecules of reactant-type A in c1
R.(in cell.membrane) :#= [reactant R m1]   molecules of reactant-type R in m1
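the type / instance split can be mirrored in a few lines of python. this is a sketch of the idea only, not little b's object model: the class names are hypothetical, and the check that a reactant's location matches its type's location-class is an assumption about what `:location-class` is for.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Location:
    name: str
    location_class: str  # "compartment" or "membrane"

@dataclass(frozen=True)
class ReactantType:
    name: str
    location_class: str  # class of location this species may occupy

@dataclass(frozen=True)
class Reactant:
    type: ReactantType
    location: Location

    def __post_init__(self):
        # a population can only exist in a location of the right class
        if self.type.location_class != self.location.location_class:
            raise ValueError(
                f"{self.type.name} cannot be placed in {self.location.name}"
            )

ion = ReactantType("ion", "compartment")
ion_channel = ReactantType("ion-channel", "membrane")
c1 = Location("c1", "compartment")
m1 = Location("m1", "membrane")

ions_in_c1 = Reactant(ion, c1)
channels_in_m1 = Reactant(ion_channel, m1)
```

because the dataclasses are frozen and compared by value, asking for the same type in the same location twice yields equal objects, which is convenient when rules need to test whether a population already exists.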
reaction-type / reaction
• a reaction-type describes the logical requirements for a reaction:
2 A → B
(define A [simple-reactant-type :location-class compartment])
(define B [simple-reactant-type :location-class compartment])
(define RT1 [reaction-type {2 A} {B} compartment])
• {2 A}: what is required for the reaction to proceed + stoichiometry
• {B}: what is required if the reaction does proceed + stoichiometry
• compartment: class of location in which the reaction happens
• a reaction is a reaction-type in a particular location:
• e.g., if A.(in c1) exists, then [reaction RT1 c1] will be created
• if [reaction RT1 c1] exists, then B.(in c1) will be created
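the two "if … exists, then … will be created" rules amount to forward chaining. the python sketch below shows that inference loop in isolation; it ignores location classes and stoichiometry, the function name `infer` is hypothetical, and the second reaction-type RT2 is invented here purely to show the chaining.

```python
def infer(reaction_types, reactants):
    """Forward-chain reactions and products until nothing new appears.

    reaction_types maps a name to (lhs, rhs) dicts of species -> stoichiometry.
    reactants is a set of (species, location) pairs.
    """
    reactants = set(reactants)
    reactions = set()
    changed = True
    while changed:
        changed = False
        locations = {loc for _, loc in reactants}
        for rt_name, (lhs, rhs) in reaction_types.items():
            for loc in locations:
                ready = all((species, loc) in reactants for species in lhs)
                if ready and (rt_name, loc) not in reactions:
                    # rule 1: all LHS reactants exist -> create the reaction
                    reactions.add((rt_name, loc))
                    # rule 2: the reaction exists -> create its products
                    reactants.update((species, loc) for species in rhs)
                    changed = True
    return reactions, reactants

reaction_types = {
    "RT1": ({"A": 2}, {"B": 1}),  # 2 A -> B, as on the slide
    "RT2": ({"B": 1}, {"C": 1}),  # hypothetical follow-on step
}
reactions, reactants = infer(reaction_types, {("A", "c1")})
```

starting from A in c1 alone, the loop creates RT1 in c1, then B, then RT2 and C, which is how a whole model can grow from a short description of what is initially present.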
membrane reactions
(a membrane has a “:c1” side and a “:c2” side)
• ligand binding: R + L → RL
[reaction-type {R + L.(required :c1)} {RL} membrane]
• membrane transport: C + I → C + I, with I moving from the “:c1” side to the “:c2” side
[reaction-type {C + I.(required :c1)} {C + I.(required :c2)} membrane]
• inversion: R → R, inverted in the membrane
[reaction-type {R} {R.(required :inverse)} membrane]
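to show how a side-relative requirement such as `(required :c1)` could be turned into a concrete location, here is a python sketch. it is an assumption about the mechanism, not little b's implementation: the `Membrane` class, the `resolve` and `instantiate` helpers, and the choice of which compartment is on the ":c1" side are all hypothetical, and `:inverse` is not modelled.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Membrane:
    name: str
    c1: str  # name of the compartment on the ":c1" side
    c2: str  # name of the compartment on the ":c2" side

def resolve(membrane, species, side=None):
    """Map (species, side) to (species, concrete location name)."""
    if side is None:
        return (species, membrane.name)  # lives in the membrane itself
    if side == "c1":
        return (species, membrane.c1)
    if side == "c2":
        return (species, membrane.c2)
    raise ValueError(f"unknown side: {side}")

def instantiate(membrane, lhs, rhs):
    """Place a membrane reaction-type's reactants in one membrane."""
    return (
        [resolve(membrane, s, side) for s, side in lhs],
        [resolve(membrane, s, side) for s, side in rhs],
    )

pm = Membrane("plasma-membrane", c1="extracellular", c2="cytoplasm")

# ligand binding: R + L(:c1) -> RL
binding = instantiate(pm, [("R", None), ("L", "c1")], [("RL", None)])

# membrane transport: C + I(:c1) -> C + I(:c2)
transport = instantiate(
    pm, [("C", None), ("I", "c1")], [("C", None), ("I", "c2")]
)
```

the reaction-type itself never names a compartment, so the same description can be instantiated in any membrane.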
engineer’s “modularity” ≠ biochemist’s “modularity”
• hierarchical composition: circuits & software
• “flat spaghetti” composition: biochemistry
[figure labels: component, pathway]
how can we name objects?
s
e
pr
by user definition
2-step mechanism implies “r.es.1”
[simple-reactant-type (_id rs.es.1)]
by hierarchical definition
A
B
{A + B}
[aggregate {A + B}]
by composition
A
B
C
A B C
B A C
by structure
B
A C
graph-based reactants
molecular complexes may be defined according to coarse-grained structure:
e.g., scaffold (S) and two kinases (K1, K2)
• atomic reactant-types are defined with user-generated symbols (as before), but… also include sites of interaction
• reactant-types representing multimeric complexes are described using graphs
[figure labels: scaffold s with sites 1 and 2; kinases k1 and k2, each with sites sh3 and ps; ps states u and p]
G = {V, E} where V = vertices, E = edges
V = s.(site 1), s.(site 2), k1.(site :sh3) …
E = s.(site 1) <-> k.(site :sh3), s.(site 2) <-> k.(site :sh3) … k1.(site :ps) <-> :u
scaffold bound to kinases where kinase1 is unphosphorylated and kinase2 is phosphorylated
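a graph description like the one above can be held as plain sets, which makes two descriptions of the same complex compare equal regardless of the order in which bonds were listed. the python sketch below illustrates this; the representation (sites as `(molecule, site)` pairs, bonds as unordered pairs, states as a mapping) and the name `make_complex` are assumptions of this sketch, not little b's data structures.

```python
def make_complex(bonds, states):
    """Build an order-independent description of a molecular complex.

    bonds  : iterable of ((molecule, site), (molecule, site)) pairs
    states : dict mapping (molecule, site) -> state label, e.g. "u" or "p"
    """
    edges = frozenset(frozenset(pair) for pair in bonds)
    vertices = frozenset(v for edge in edges for v in edge) | frozenset(states)
    return (vertices, edges, frozenset(states.items()))

# scaffold s bound to k1 (unphosphorylated) and k2 (phosphorylated)
bonds = [
    (("s", 1), ("k1", "sh3")),
    (("s", 2), ("k2", "sh3")),
]
states = {("k1", "ps"): "u", ("k2", "ps"): "p"}
complex_a = make_complex(bonds, states)

# the same complex, written with bonds and endpoints in another order
complex_b = make_complex(
    [(("k2", "sh3"), ("s", 2)), (("k1", "sh3"), ("s", 1))],
    {("k2", "ps"): "p", ("k1", "ps"): "u"},
)

# a different phosphorylation state gives a different reactant-type
complex_c = make_complex(bonds, {("k1", "ps"): "p", ("k2", "ps"): "p"})
```

note that this only removes ordering differences. recognising two complexes as the same when their molecules are labelled differently would need a graph isomorphism check, which is beyond this sketch.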
some thoughts on language design
“I think conventional languages are for the birds. They're just extensions of the von Neumann computer, and they keep our noses in the dirt of dealing with individual words and computing addresses, and doing all kinds of silly things like that, things that we've picked up from programming for computers; we've built them into programming languages; we've built them into Fortran; we've built them in PL/1; we've built them into almost every language.”
John Backus
“Programs must be written for people to read, and only incidentally for machines to execute”
Abelson & Sussman, Structure and Interpretation of Computer Programs
“Intellectually, it is just as worthwhile to design a language programmers will love as it is to design a horrible one that embodies some idea you can publish a paper about.”
Paul Graham, Five Questions about Language Design
programming languages as a medium of communication: from human to computer
• human: familiarity, brevity, comprehensibility, …
• computer: uniformity, non-redundancy, computability, code safety, …
the core language
object-oriented data structures & syntax + rule-based logic.
• define, defcon, defprop, defrule, defmethod
• { } – infix operator
• [ ] – object operator
• . – field-access operator
future
• generalized scaffold and multisite phosphorylation models
• markov chain-based model for representing protein-DNA interactions
• concepts for sharing stochastic models
• implementation of shareable mapk, nfkb models with scaffold
• gui tools
modular extensible model building
• separation of biological understanding from mechanism and mathematical assumptions
• automated model construction from reusable parts
• extensible libraries
• physical units and dimensions
• free, open source software
http://littleb.org
thank you
• jeremy gunawardena
• craig muir, millennium pharmaceuticals
• matt thomson
• vlado gelev
some current approaches
Mathematical notation:
Hoffmann A, Levchenko A, Scott ML, Baltimore D. The IkappaB-NF-kappaB signaling module: temporal control and selective gene activation. Science. 2002 Nov 8;298(5596):1241-5.
SBML: MAPK Scaffold Model
Levchenko A, Bruck J, Sternberg PW. Scaffold proteins may biphasically affect the levels of mitogen-activated protein kinase signaling and reduce its threshold properties. Proc Natl Acad Sci U S A. 2000 May 23;97(11):5818-23.
little b was inspired by work in qualitative physics
• a branch of artificial intelligence which aimed to emulate human-like qualitative reasoning about the physical world (see Kuipers, Forbus, CML)
• a scenario is described in terms of different types of objects and their relationships
• “water is in a pot”
• “a flame is under the pot”
• what happens?
• the computer needs to be able to compute the implications of a scenario
• based on general rules for reasoning about physical systems