
How does science work? Nonlinear fits to data: …pages.physics.cornell.edu/~sethna/teaching/BasicTraining/HW13/13... · How does science work? Nonlinear fits to data: Sloppiness,


How does science work? Nonlinear fits to data: Sloppiness, differential geometry, and algorithms

JPS, Mark Transtrum, Isabel Kloumann, Ben Machta, Kevin Brown, Ryan Gutenkunst, Josh Waterfall, Chris Myers, …

Cell Dynamics

Sloppiness

Model Manifold

Hyper-ribbons

Coarse-Grained Models

Fitting Exponentials

Friday, May 3, 13

Fitting Decaying Exponentials
Classic ill-posed inverse problem

Given Geiger counter measurements from a radioactive pile, can we recover the identity of the elements and/or predict future radioactivity? Good fits with bad decay rates!

Isotopes: ^32P, ^35S, ^125I

6 Parameter Fit
y(A, γ, t) = A_1 e^{-γ_1 t} + A_2 e^{-γ_2 t} + A_3 e^{-γ_3 t}

[Figure: Fit; Ensemble: Interpolation; Ensemble: Extrapolation]
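The "good fits with bad decay rates" claim can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the amplitudes, decay rates and time grid are made up, and derivatives are taken with respect to log-parameters (a common but here assumed choice). It builds the Gauss-Newton Hessian J^T J of the least-squares cost for the six-parameter model and looks at the spread of its eigenvalues.

```python
import numpy as np

def model(params, t):
    """y(A, gamma, t) = sum_k A_k exp(-gamma_k t); params = (A1, A2, A3, g1, g2, g3)."""
    A, gamma = params[:3], params[3:]
    return (A[:, None] * np.exp(-np.outer(gamma, t))).sum(axis=0)

def jacobian_log(params, t):
    """Jacobian dy/d(log p_k) = p_k * dy/dp_k, shape (n_times, 6)."""
    A, gamma = params[:3], params[3:]
    decay = np.exp(-np.outer(gamma, t))            # shape (3, n_times)
    d_amplitude = decay                            # dy/dA_k
    d_rate = -A[:, None] * t[None, :] * decay      # dy/dgamma_k
    J = np.vstack([d_amplitude, d_rate]).T
    return J * params[None, :]

# Illustrative values only (not taken from the slides)
t = np.linspace(0.0, 5.0, 50)
params = np.array([1.0, 1.0, 1.0, 0.5, 1.0, 2.0])

J = jacobian_log(params, t)
hessian = J.T @ J                                  # Gauss-Newton Hessian of the cost
eigenvalues = np.linalg.eigvalsh(hessian)[::-1]    # sorted from stiffest to sloppiest

print("eigenvalues:", eigenvalues)
print("stiffest / sloppiest:", eigenvalues[0] / abs(eigenvalues[-1]))
```

A large ratio between the first and last eigenvalue means some parameter combinations can change a great deal while the fitted curve barely moves.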


PC12 Differentiation

[Figure: two panels of ERK* vs. Time (tick at 10’), labeled +NGF and +EGF]

[Figure: signaling network with EGFR, NGFR, Sos, Ras, Raf-1, MEK1/2, ERK1/2]

Pumps up signal (Mek)

Tunes down signal (Raf-1)

Biologists study which proteins talk to which. Modeling?

48 Parameter Fit!

Systems Biology: Cell Protein Reactions


Ensemble of Models

We want to consider not just minimum cost fits, but all parameter sets consistent with the available data. New level of abstraction: statistical mechanics in model space.

Cost is least-squares fit

Boltzmann weights exp(-C/T)

O is chemical concentration y(t_i), or rate constant θ_n…

Don’t trust predictions that vary

[Figure labels: bare, eigen]
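One way to build such an ensemble is Metropolis sampling with acceptance probability min(1, exp(-ΔC/T)). The sketch below is a toy illustration rather than the authors' code: the two-exponential model, the synthetic data, the error bar `sigma`, the step size and the sampling in log-parameters are all assumptions. It samples parameter sets and then reports the mean and spread of one observable O, here a prediction at a later time.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "measurements" from an assumed two-exponential model
t = np.linspace(0.0, 3.0, 12)
data = np.exp(-1.0 * t) + np.exp(-3.0 * t)
sigma = 0.05                                   # assumed measurement error

def cost(log_theta):
    """Least-squares cost C for rate constants theta = exp(log_theta)."""
    theta = np.exp(log_theta)
    y = np.exp(-theta[0] * t) + np.exp(-theta[1] * t)
    return 0.5 * np.sum(((y - data) / sigma) ** 2)

def sample_ensemble(n_steps=5000, T=1.0, step=0.1):
    """Metropolis walk in log-parameter space with Boltzmann weights exp(-C/T)."""
    x = np.log([1.0, 3.0])                     # start at the best fit
    c = cost(x)
    samples = []
    for _ in range(n_steps):
        x_new = x + step * rng.normal(size=2)
        c_new = cost(x_new)
        # Accept with probability min(1, exp(-(C_new - C)/T))
        if rng.random() < np.exp(min(0.0, -(c_new - c) / T)):
            x, c = x_new, c_new
        samples.append(x.copy())
    return np.exp(np.array(samples))

ensemble = sample_ensemble()

# Observable O: predicted signal at a time outside the fitted range
t_new = 5.0
predictions = np.exp(-ensemble[:, 0] * t_new) + np.exp(-ensemble[:, 1] * t_new)
mean_O, std_O = predictions.mean(), predictions.std()
print("rate ranges:", ensemble.min(axis=0), ensemble.max(axis=0))
print("O = %.4g +/- %.2g" % (mean_O, std_O))
```

A prediction whose spread `std_O` is comparable to its mean is one the data do not constrain, which is the sense of "don't trust predictions that vary".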

Parameters Fluctuate over Enormous Range

• All parameters vary by minimum factor of 50, some by a million
• Not robust: four or five “stiff” linear combinations of parameters; 44 sloppy

[Figure: log_e eigenparameter fluctuation vs. sorted eigenparameter number, with stiff and sloppy ends marked]

[Figure: relative parameter fluctuation vs. parameter (sorted)]

Are predictions possible?

Predictions are Possible

Model predicts that the left branch isn’t important

[Figure: Brown’s Experiment compared with Model Prediction]

Parameters fluctuate orders of magnitude, but still predictive!

Parameter Indeterminacy and Sloppiness

48 parameter fits are sloppy: Many parameter sets give almost equally good fits

~5 stiff, ~43 sloppy directions

A few ‘stiff’ constrained directions allow model to remain predictive

[Figure: Cost contours for fits of decaying exponentials on “bare parameter axes”, with the stiff and sloppy directions marked; vertical axis from -10^-6 to 10^-6, horizontal axis from -10^-3 to 10^-3]

Note: Horizontal scale shrunk by 1000 times

Aspect ratio = Human hair

Systems Biology
Seventeen models

Enormous Ranges of Eigenvalues
(3^48 is a big number)
Sloppy Range ~ √λ
Gutenkunst

(a) eukaryotic cell cycle
(b) Xenopus egg cell cycle
(c) eukaryotic mitosis
(d) generic circadian rhythm
(e) nicotinic acetylcholine intra-receptor dynamics
(f) generic kinase cascade
(g) Xenopus Wnt signaling
(h) Drosophila circadian rhythm
(i) rat growth-factor signaling
(j) Drosophila segment polarity
(k) Drosophila circadian rhythm
(l) Arabidopsis circadian rhythm
(m) in silico regulatory network
(n) human purine metabolism
(o) Escherichia coli carbon metabolism
(p) budding yeast cell cycle
(q) rat growth-factor signaling

Where is Sloppiness From?
Fitting Polynomials to Data

Fitting Monomials to Data
y = ∑ a_n x^n
Functional Forms Same
Hessian H_ij = 1/(i+j+1)
Hilbert matrix: famous

Orthogonal Polynomials
y = ∑ b_n L_n(x)
Functional Forms Distinct
Eigen Parameters
Hessian H_ij = δ_ij

Sloppiness arises when bare parameters skew in eigenbasis
Small Determinant! |H| = ∏ λ_n
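The contrast between the two bases can be checked directly. In the sketch below, the fitting interval [0, 1], the choice of six basis functions, and the reading of L_n(x) as orthonormal shifted Legendre polynomials are assumptions made for illustration. The monomial Hessian H_ij = ∫ x^i x^j dx is the Hilbert matrix, while the orthonormal basis gives the identity.

```python
import numpy as np
from numpy.polynomial import legendre

n = 6  # number of basis functions (illustrative)

# Monomial basis on [0, 1]: H_ij = integral of x^i x^j = 1/(i + j + 1), i, j = 0..n-1
i, j = np.indices((n, n))
hilbert = 1.0 / (i + j + 1)
hilbert_eigs = np.linalg.eigvalsh(hilbert)

# Orthonormal shifted Legendre basis on [0, 1]: sqrt(2k + 1) * P_k(2x - 1)
nodes, weights = legendre.leggauss(2 * n)          # Gauss-Legendre rule on [-1, 1]
basis = np.array([
    np.sqrt(2 * k + 1) * legendre.legval(nodes, np.eye(n)[k])
    for k in range(n)
])
# Mapping [-1, 1] onto [0, 1] halves the quadrature weights
gram = (basis * (weights / 2.0)) @ basis.T

print("Hilbert eigenvalues:", hilbert_eigs)
print("eigenvalue span:", hilbert_eigs.max() / hilbert_eigs.min())
print("determinant:", np.prod(hilbert_eigs))
print("orthonormal-basis Hessian is identity:", np.allclose(gram, np.eye(n)))
```

The same family of polynomials is being fit in both cases; only the parameterization differs, which is where the tiny determinant comes from.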

Sloppy Models Outside Bio

Sloppy Systems
• Enormous range of eigenvalues
• Roughly equal density in log
• Observed in broad range of systems

From accelerator design to insect flight, multiparameter fits are sloppy

[Figure: Eigenvalues (log scale, 10^0 down to 10^-10) for Diffusive hopping, Ising model, Radioactive decay, Neural network, Systems biology, Random]

Some systems aren’t sloppy

Sloppy Systems
• Quantum Monte Carlo (best molecule excitations)
• Monomial sums
• Not linear regression models
• Not random matrix theory

[Figure: Eigenvalue spectra for Systems Bio, QMC, Radioactivity, Many exps, GOE, Product, Fitting plane, Monomials; one group marked NOT SLOPPY]

Exploring Parameter Space
Rugged? More like Grand Canyon (Josh)

Glasses: Rugged Landscape
Metastable Local Valleys
Transition State Passes

Optimization Hell: Golf Course

Sloppy Models
Minima: 5 stiff, N-5 sloppy
Search: Flat planes with cliffs

Variational Wavefunctions for Quantum Monte Carlo
Cyrus Umrigar, Josh Waterfall

The most accurate method for solving Schrödinger’s equation for molecules rests on a variational wavefunction (followed by diffusion Monte Carlo):

Ψ(R) = ∑_m a_m exp(−∑_{i<j} u(r_ij)) det[φ_k(r_l)]_{m↑} det[φ_k(r_l)]_{m↓} ...

Each determinant is taken over the N×N matrix of orbitals:

| φ_1(r_1)  φ_1(r_2)  ...  φ_1(r_N) |
| φ_2(r_1)  φ_2(r_2)  ...  φ_2(r_N) |
| ...       ...       ...  ...      |
| φ_N(r_1)  φ_N(r_2)  ...  φ_N(r_N) |

Often parameters θ are varied to minimize energy (find ground states).

Umrigar developed a method which fits the local energies at randomly chosen R_j,

Data = H Ψ_θ(R_j) / |Ψ_θ(R_j)| = E(R_j),

then varies parameters to minimize fluctuations in local E’s away from eigen-energy.

Interatomic Potentials
Søren Frederiksen, Karsten Jacobsen, Kevin Brown, JPS

• Need atomic forces and energies
• Guess functional form for potential (17 parameters for MEAM, 5 for Finnis-Sinclair)
• Use electronic ground state energy U(R_1, R_2, …) (Born-Oppenheimer approximation)
• Least-squares fit to DFT calculations of energy, forces for a variety of “important” atomic configurations

Universal Scaling Functions
Yan-Jiun Chen, Zapperi, Durin, Papanikolaou

Avalanches through Windows

11 parameter universal scaling form depends on demag length L_κ, window width W

[Figure labels: A00; Size s]

Machine Learning
Mark Transtrum

• Neural net “trained” to predict option price, or recognize handwriting
• Neural weights sloppy!
• Output can be viewed as small dimensional representation of Big Data set
• Is the inverse image of output a hyperribbon in Data space?

V      t      S     OP
0.20   5.0    75.   25.0000
0.40   5.0    93.   7.2537
0.40   15.0   79.   21.0225
0.66   10.0   91.   10.3957

[Figure: neural network with inputs V, t, S and output OP; other labels: 320, One]

Alex Alemi, Thorsten Joachims (CS)

Accelerator Optimization
Photoinjector system for Energy Recovery Linac

• Laser hits flexible mirror, which shapes beam on target
• Beam of electrons nonlinear distortion through photoinjector
• Optimize beam profile for emittance
• Use 3D beam shape (point on model manifold), find “stiff knobs” to optimally adjust beam

Colin Clement, Ivan Bazarov (Physics)

Climate Change?
Climate models contain many unknown parameters, fit to data

Stainforth et al., Uncertainty in predictions of the climate response to rising levels of greenhouse gases, Nature 433, 403-406 (2005)

Yan-Jiun Chen, Natalie Mahowald (EAS)

• General Circulation Model (air, oceans, clouds), exploring doubling of CO2
• 21 total parameters
• Initial conditions and (only) 6 “cloud dynamics” parameters varied
• Heating typically 3.4K, ranged from < 2K to > 11K

Macroeconomics
Many ‘structural’ parameters unknown

Ricky Chachra, Karel Mertens (Econ)

• Dynamic stochastic general equilibrium models (DSGE), such as Smets and Wouters
• ~25 parameters
• Larger uncertainties and less predictive than much more primitive models with many more parameters?
• Parameter sensitivities: is it sloppy?
• Model reduction: can we simplify it? Can we distill one from more primitive models?
• Are extensions of the model different?

The Universe
ΛCDM fit for cosmic microwave background radiation

Isabel Kloumann, Michael Niemack

• Six parameter ΛCDM model is sloppy fit to CMB; SNe and BAO determine
• More general models introduce degeneracies

Figure 5. ΛCDM model: 68.3%, 95.4%, and 99.7% confidence regions of the (Ω_m, Ω_Λ) plane from SNe Ia combined with the constraints from BAO and CMB. The left panel shows the SN Ia confidence region only including statistical errors while the right panel shows the SN Ia confidence region with both statistical and systematic errors.

Figure 6. wCDM model: 68.3%, 95.4%, and 99.7% confidence regions in the (Ω_m, w) plane from SNe Ia, BAO and CMB. The left panel shows the SN Ia confidence region for statistical uncertainties only, while the right panel shows the confidence region including both statistical and systematic uncertainties. We note that CMB and SN Ia constraints are orthogonal, making this combination of cosmological probes very powerful for investigating the nature of dark energy.

Universe is flat, mostly unknown dark stuff

The Model Manifold

Parameter space
Stiff and sloppy directions
Canyons, Plateaus

Data space
Manifold of model predictions
Parameters as coordinates

Model boundaries θ_n = ±∞, θ_m cause Plateaus

Metric g_μν from distance to data

Two exponentials θ_1, θ_2 fit to three data points y_1, y_2, y_3

y_n = exp(-θ_1 t_n) + exp(-θ_2 t_n)

[Figure axes: θ_1, θ_2]
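The two-exponential example is small enough to compute explicitly. The sketch below is an illustration with assumed measurement times and an assumed parameter grid (neither is given on the slide). It maps parameters (θ_1, θ_2) to points (y_1, y_2, y_3) in data space and evaluates the induced metric g = J^T J, where J is the Jacobian of the predictions.

```python
import numpy as np

t = np.array([1.0 / 3.0, 1.0, 3.0])   # three assumed measurement times

def predictions(theta):
    """Point on the model manifold: y_n = exp(-theta_1 t_n) + exp(-theta_2 t_n)."""
    theta = np.asarray(theta, dtype=float)
    return np.exp(-theta[0] * t) + np.exp(-theta[1] * t)

def jacobian(theta):
    """J[n, mu] = d y_n / d theta_mu."""
    theta = np.asarray(theta, dtype=float)
    return np.stack(
        [-t * np.exp(-theta[0] * t), -t * np.exp(-theta[1] * t)], axis=1
    )

def metric(theta):
    """Induced metric g_mu_nu = sum_n (dy_n/dtheta_mu)(dy_n/dtheta_nu)."""
    J = jacobian(theta)
    return J.T @ J

# Sample the manifold on a grid of rate constants (grid range is an assumption)
grid = np.linspace(0.0, 20.0, 41)
surface = np.array([[predictions((a, b)) for b in grid] for a in grid])

print("manifold sample shape:", surface.shape)             # (41, 41, 3)
print("metric at (1, 3):\n", metric((1.0, 3.0)))
print("metric at (50, 50), deep in a plateau:\n", metric((50.0, 50.0)))
print("det g on the line theta_1 = theta_2:", np.linalg.det(metric((1.0, 1.0))))
```

The whole infinite parameter plane lands in a bounded region of data space, and the metric shrinks toward zero as either rate grows, which is the plateau seen in parameter space.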

Geodesics

“Straight line” in curved space

Shortest path between points

Easy to find cost minimum using polar geodesic coordinates γ_1, γ_2

Cost contours in geodesic coordinates nearly concentric circles! Use this for algorithms…

[Figure axes: γ_1, γ_2]

The Model Manifold is a Hyper-Ribbon

• Hyper-ribbon: object that is longer than wide, wider than thick, thicker than ...
• Thick directions traversed by stiff eigenparameters, thin as sloppy directions varied.

Sum of many exponentials, fit to y(0), y(1) data; predictions at y(1/4), y(1/2), y(3/4)

Widths along geodesics track eigenvalues almost perfectly!

Edges of the model manifold
Fitting Exponentials

Top: Flat model manifold; articulated edges = plateau
Bottom: Stretch to uniform aspect ratio (Isabel Kloumann)

[Figure axes: θ_1, θ_2]


Curvatures

Intrinsic curvature R^μ_ναβ
• determines geodesic shortest paths
• independent of embedding, parameters

Extrinsic curvature
• also measures bending in embedding space (e.g., cylinder)
• independent of parameters
• Shape operator, geodesic curvature

Parameter effects “curvature”
• Usually much the largest
• Defined in analogy to extrinsic curvature (projecting out of surface, rather than into)

[Figure labels: Shape Operator; Geodesic Curvature; No intrinsic curvature]

Hierarchy of widths and curvatures

Hierarchy of widths
Cross sections: fixing f at 0, ½, 1
Theorem: interpolation good with many data points
Geometrical convergence

Multi-decade span of widths, curvatures, eigenvalues
Widths ~ √λ sloppy eigs
Parameter curvature K_P = 10^3 × K >> extrinsic curvature

[Figure: widths and curvatures (log scale, ticks at 10^-5, 10^1, 10^7) vs. eigendirection at best fit; cross-section widths W_3, W_4, W_5]

Why is it so thin and flat?

Model f(t, θ) analytic: f^(n)(t)/n! ≤ R^-n

Polynomial fit P_{m-1}(t) to f(t_1), …, f(t_m)

Interpolation convergence theorem:
Δf_{m+1} = f(t) - P_{m-1}(t) < (t-t_1)(t-t_2)…(t-t_m) f^(m)(ξ)/m! ~ (Δt/R)^m

More than one data per R

[Figure labels: W_j; ε = Δf_{N+1}; 2Δf_5; P_3(t)]
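The geometric convergence in the interpolation theorem is easy to see numerically. In this sketch the analytic test function, the interval [0, 1] and the equally spaced data points are all assumptions chosen for illustration. It fits the polynomial P_{m-1} through m data points and records the largest remaining deviation from f.

```python
import numpy as np

def f(t):
    """Assumed analytic model prediction (a sum of two exponentials)."""
    return np.exp(-t) + np.exp(-3.0 * t)

t_dense = np.linspace(0.0, 1.0, 501)
errors = []
for m in range(2, 9):
    nodes = np.linspace(0.0, 1.0, m)                  # m data points t_1..t_m
    coeffs = np.polyfit(nodes, f(nodes), m - 1)       # P_{m-1} through all m points
    deviation = np.abs(f(t_dense) - np.polyval(coeffs, t_dense))
    errors.append(deviation.max())                    # max |f - P_{m-1}| on [0, 1]

errors = np.array(errors)
ratios = errors[1:] / errors[:-1]
for m, err in zip(range(2, 9), errors):
    print(f"m = {m}: max deviation = {err:.3e}")
print("successive ratios:", ratios)
```

Each added data point shrinks the room left for the model to move between the points by a roughly constant factor, which is the hierarchy of widths.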

Which Rate Constants are in the Stiffest Eigenvector?

Eigenvector components along the bare parameters reveal which ones are most important for a given eigenvector.

[Figure: components (marked *) of the stiffest and 2nd stiffest eigenvectors along the bare parameters; labels: Ras, Raf1, Oncogenes]

Physics: Sloppiness and Emergence
Ben Machta, Ricky Chachra, Mark Transtrum

Both sloppy at long-wavelengths

Emergence of distilled laws from microscopic complexity

Ising: long bonds
Diffusion: long hops

Irrelevant on macroscale

Sloppiness and Continuum Limits
Ben Machta, Ricky Chachra, Mark Transtrum

Microscopic hopping gives continuum diffusion equation
Sloppy after coarse graining in time

[Figure: Densities (ticks 0.07, 0.14) vs. Sites (-10 to +10), and Eigenvalues (10^-8 to 10^2) vs. Diffusion time steps (1 to 7); hops labeled -3 to +3]

Sloppiness and Critical Points
Ben Machta, Ricky Chachra, Mark Transtrum

Ising model with long-range bonds
Kullback-Leibler divergence metric
Sloppy after coarse graining in space

[Figure: Eigenvalues (10^-5 to 10^5) vs. Coarsening Steps (0 to 6), with sample Configurations; labels T_c, ∞]

Generation of Reduced Models
Mark Transtrum (not me)

Can we coarse-grain sloppy models? If most parameter directions are useless, why not remove some?

Transtrum has systematic method!
(1) Geodesic along sloppiest direction to nearby point on manifold boundary
(2) Eigendirection simplifies at model boundary to chemically reasonable simplified model

Coarse-graining = boundaries of model manifold.

[Figure labels: Sloppiest Eigendirection; Simplified at Boundary (Unsaturable reaction)]

Generation of Reduced Models
Mark Transtrum (not me)

Full model: 48 params, 29 ODEs

Reduced model: 12 params, 6 ODEs

Effective ‘renormalized’ params

Reduced model fits all experimental data

Sloppy Applications
Several applications emerge

A. Fitting data vs. measuring parameters (Gutenkunst)
B. Finding best fits by geodesic acceleration (Transtrum)
C. Optimal experimental design (Casey)
D. Sloppy fitness and evolution (Gutenkunst)
E. Estimating systematic errors (Jacobsen et al.)
F. Sloppiness and Robustness

[Figure label: Fits good; measured bad]


A. Are rate constants useful?
Fits vs. measurements

• Easy to Fit (14 expts); Measuring huge job (48 params, 25%)
• One missing parameter measurement = No predictivity
• Sloppy Directions = Enormous Fluctuations in Parameters
• Sloppy Directions often do not impinge on predictivity

[Figure: Fits good; measured bad. Labels: Monte Carlo (anharmonic); Missing one param; Measured; Fit; Best Fit; bare params; eigen params]


B. Finding best fits: Geodesic acceleration

Geodesic Paths nearly circles

Follow local geodesic velocity? Gauss-Newton
δθ^μ = −g^μν ∇_ν C
Hits manifold boundary

Model Graph: add weight λ of parameter metric yields Levenberg-Marquardt
Step size now limited by curvature

Follow parabola, geodesic acceleration
Cheap to calculate; faster; more success
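The three steps on this slide can be sketched in a few lines. The code below is a toy illustration, not Transtrum's implementation: the two-exponential model, the synthetic data, the starting point, the central finite difference for the second directional derivative, the factor-of-3 damping updates and the acceptance threshold `alpha` are all assumptions. The velocity v is the Levenberg-Marquardt step −(g + λI)^-1 ∇C with g = J^T J, and the step taken is the parabola v + a/2.

```python
import numpy as np

t = np.linspace(0.0, 2.0, 10)
data = np.exp(-1.0 * t) + np.exp(-3.0 * t)     # noiseless synthetic data (assumed)

def residuals(theta):
    return np.exp(-theta[0] * t) + np.exp(-theta[1] * t) - data

def jacobian(theta):
    return np.stack(
        [-t * np.exp(-theta[0] * t), -t * np.exp(-theta[1] * t)], axis=1
    )

def cost(theta):
    r = residuals(theta)
    return 0.5 * r @ r

def fit(theta0, lam=1e-2, n_iter=200, h=0.1, alpha=0.75):
    """Levenberg-Marquardt with a geodesic-acceleration correction (sketch)."""
    theta = np.array(theta0, dtype=float)
    for _ in range(n_iter):
        r, J = residuals(theta), jacobian(theta)
        A = J.T @ J + lam * np.eye(theta.size)     # metric g plus damping lambda
        v = -np.linalg.solve(A, J.T @ r)           # geodesic velocity (LM step)
        # Second directional derivative of the residuals along v (central difference)
        r_vv = (residuals(theta + h * v) - 2.0 * r + residuals(theta - h * v)) / h**2
        a = -np.linalg.solve(A, J.T @ r_vv)        # geodesic acceleration
        step = v + 0.5 * a                         # follow the parabola
        small_accel = np.linalg.norm(a) <= alpha * np.linalg.norm(v)
        if small_accel and cost(theta + step) < cost(theta):
            theta, lam = theta + step, lam / 3.0   # accept, relax damping
        else:
            lam *= 3.0                             # reject, damp harder
    return theta

theta_fit = fit((0.8, 3.5))
print("fitted rates:", theta_fit, "cost:", cost(theta_fit))
```

The acceleration costs only two extra residual evaluations per step and no extra Jacobian, which is the sense in which it is cheap to calculate.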

B. Finding best fits: Model manifold dynamics (Isabel Kloumann)

Dynamics on the model manifold: Searching for the best fit
• Jeffreys prior plus noise
• Big noise concentrates on manifold edges
• Note scales: flat
• Top: Levenberg-Marquardt
• Bottom: Geodesic acceleration
• Large points: Initial conditions which fail to converge to best fit


C. EGFR Trafficking Model
Fergal Casey, Cerione lab

Receptor activation of MAPK pathway and other effectors (e.g. Src)

• Active research, Cerione lab: testing hypothesis, experimental design (Cool1≡β-PIX)
• 41 chemicals, 53 rate constants; only 11 of 41 species can be measured
• Does Cool-1 triple complex sequester Cbl, delay endocytosis in wild type NIH3T3 cells?

[Figure labels: internalization; recycling]


C. Trafficking: experimental design
Which experiment best reduces prediction uncertainty?

• Amount of triple complex was not well predicted
• V-optimal experimental design: single & multiple measurements
• Total active Cdc42 at 10 min.; Cerione independently concurs
• Experiment indicates significant sequestering in wild type
• Predictivity without decreasing parameter uncertainty
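A rough sketch of how a V-optimal choice can be made in the linearized (Fisher information) approximation: each candidate measurement adds a rank-one term to the Fisher information matrix, and the candidate that most reduces the average variance of the predictions of interest is selected. The matrices here are random stand-ins, not the trafficking model, and the function names are hypothetical.

```python
import numpy as np

def prediction_variance(fim, pred_sens):
    """Average linearized variance of the predictions:
    mean over k of g_k^T F^{-1} g_k, with g_k the rows of pred_sens."""
    cov = np.linalg.inv(fim)
    return float(np.mean(np.einsum("ki,ij,kj->k", pred_sens, cov, pred_sens)))

def best_candidate(fim, cand_sens, pred_sens):
    """Score each candidate measurement (one sensitivity row each, already
    divided by its noise level) and return the index of the best one."""
    scores = [prediction_variance(fim + np.outer(c, c), pred_sens)
              for c in cand_sens]
    return int(np.argmin(scores)), scores

rng = np.random.default_rng(0)
n_params = 5
J = rng.normal(size=(8, n_params))            # sensitivities of existing data
fim = J.T @ J + 1e-3 * np.eye(n_params)       # small prior keeps F invertible
cand_sens = rng.normal(size=(4, n_params))    # four hypothetical experiments
pred_sens = rng.normal(size=(3, n_params))    # three predictions of interest

baseline = prediction_variance(fim, pred_sens)
k, scores = best_candidate(fim, cand_sens, pred_sens)
print("baseline variance:", baseline)
print("best candidate:", k, "variance after measuring it:", scores[k])
```

Note that the criterion targets prediction variance directly, so a measurement can score well even when it leaves most individual parameters poorly constrained.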


D. Evolution in Chemotype space
Implications of sloppiness?

• Culture of identical bacteria, one mutation at a time
• Mutation changes one or two rate constants (no pleiotropy): orthogonal moves in rate constant (chemotype) space
• Cusps in first fitness gain (one for each rate constant, big gap)
• Multiple mutations get stuck on ridge in sloppy landscape

[Figure: Fitness gain from first successful mutation]


E. Bayesian Errors for Atoms
‘Sloppy Model’ Approach to Error Estimation of Interatomic Potentials

Søren Frederiksen, Karsten W. Jacobsen, Kevin Brown, JPS

[Figure labels: Atomistic potential, 820,000 Mo atoms (Jacobsen, Schiøtz); Quantum Electronic Structure (Si), 90 atoms (Mo) (Arias)]

Interatomic Potentials $V(r_1, r_2, \ldots)$
• Fast to compute
• Limit $m_e/M \to 0$ justified
• Guess functional form
   Pair potential $\sum V(r_i - r_j)$ poor
   Bond angle dependence
   Coordination dependence
• Fit to experiment (old)
• Fit to forces from electronic structure calculations (new)

17 Parameter Fit


E. Interatomic Potential Error Bars

Best fit is sloppy: ensemble of fits that aren’t much worse than best fit. Ensemble in Model Space!

$T_0$ set by equipartition energy = best cost

Error Bars from quality of best fit

Ensemble of Acceptable Fits to Data
Not transferable
Unknown errors
• 3% elastic constant
• 10% forces
• 100% fcc-bcc, dislocation core

[Figure: Green = DFT, Red = Fits; $T_0$ marked]
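One way to read "T0 set by equipartition energy = best cost" is that the ensemble weight is $\exp(-C/T_0)$ with $T_0 = 2C_0/N_p$, so that the mean excess cost $N_p T_0/2$ equals the best-fit cost $C_0$. The sketch below checks this on a toy linear model, where the ensemble is exactly Gaussian and can be sampled directly. The model, the reference data, and the direct Gaussian sampling (instead of Monte Carlo on a real potential) are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy "reference data" that the 3-parameter model cannot reproduce exactly.
x = np.linspace(-1.0, 1.0, 30)
y = np.sin(2.0 * x) + 0.3 * x ** 2
J = np.column_stack([np.ones_like(x), x, x ** 3])   # linear model y = J @ theta

# Best fit and its cost C0 = (1/2) * sum of squared residuals.
theta_best, *_ = np.linalg.lstsq(J, y, rcond=None)
c0 = 0.5 * np.sum((J @ theta_best - y) ** 2)

# Temperature from equipartition: N_p * T0 / 2 = C0.
n_params = J.shape[1]
t0 = 2.0 * c0 / n_params

# For a quadratic cost, exp(-C/T0) is Gaussian with covariance T0 * H^{-1},
# where H = J^T J is the Hessian of the cost.
cov = t0 * np.linalg.inv(J.T @ J)
samples = rng.multivariate_normal(theta_best, cov, size=4000)

# Mean excess cost over the ensemble should be close to C0.
costs = 0.5 * np.sum((J @ samples.T - y[:, None]) ** 2, axis=0)
mean_excess = costs.mean() - c0

# Ensemble error bar on a prediction outside the fitted range.
x_new = 1.5
preds = samples @ np.array([1.0, x_new, x_new ** 3])
print("C0 =", c0, " mean excess cost =", mean_excess)
print("prediction at x = 1.5:", preds.mean(), "+/-", preds.std())
```

The error bar here comes from how badly the best fit misses the reference data, not from any stated measurement noise, which matches the slide's "Error Bars from quality of best fit".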


Sloppy Molybdenum: Does it Work?
Estimating Systematic Errors

Bayesian error $\sigma_i$ gives total error if ratio $r = \mathrm{error}_i/\sigma_i$ distributed as a Gaussian: cumulative distribution $P(r) = \mathrm{Erf}(r/\sqrt{2})$

Three potentials
• Force errors
• Elastic moduli
• Surfaces
• Structural
• Dislocation core
• 7% < $\sigma_i$ < 200%

Note: tails… Worst errors underestimated by ~ factor of 2

“Sloppy model” systematic error most of total

~2 << 200%/7%
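The Gaussian test can be made concrete with a short check: if the ratios $r$ are unit Gaussians, the fraction with $|r|$ below a given value should follow $\mathrm{Erf}(r/\sqrt{2})$. The sample below is synthetic, drawn from a unit Gaussian purely for illustration; in practice the ratios would be actual error over Bayesian error for each predicted quantity.

```python
import math
import random

def gaussian_cdf_abs(r):
    """P(|x| < r) for a unit Gaussian x, i.e. Erf(r / sqrt(2))."""
    return math.erf(r / math.sqrt(2.0))

random.seed(0)
ratios = [abs(random.gauss(0.0, 1.0)) for _ in range(20000)]

def empirical_cdf(r):
    """Fraction of sampled |ratios| below r."""
    return sum(x < r for x in ratios) / len(ratios)

for r in (0.5, 1.0, 2.0, 3.0):
    print(r, gaussian_cdf_abs(r), empirical_cdf(r))
```

Heavier-than-Gaussian tails would show up as the empirical curve lying below the Erf curve at large $r$, which is the kind of deviation the slide's "Note: tails…" remark refers to.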


Systematic Error Estimates for DFT
GGA-DFT as Multiparameter Fit?

J. J. Mortensen, K. Kaasbjerg, S. L. Frederiksen, J. K. Nørskov, JPS, K. W. Jacobsen, (Anja Tuftelund, Vivien Petzold, Thomas Bligaard)

Enhancement factor $F_x(s)$ in the exchange energy $E_x$

Large fluctuations

[Figure: Actual error / predicted error]

Deviation from experiment well described by ensemble!


F. Parameter robustness and sloppiness
Do parameters matter at all?

Bryan Daniels, Yanjiun Chen, Ryan Gutenkunst, Chris Myers

Fruit fly embryo development model robust: 1/200 parameter sets in “allowed region” [± 3000%] fit data. (Naïve green cube: $L^{48} = 1/200$)

Segment polarity model is sloppy (eigenvalues, left) and robust (PCA, phenotype-preserving in cube, right).

Model is sloppy: only four parameter directions vary less than ± 3000%. (Red brick: allowed regions of three stiffest to give 1/200 acceptance).


F. Environmental robustness and sloppiness
Circadian Rhythms

Bacteria know the time of day! How do they keep their clocks on time in the cold? All reaction rates exponentially dependent on temperature! Delicate cancellation?

Sloppiness facilitates finding parameter values for robust response to environmental change.

Three sloppy directions, 18 rates exponentially dependent on temperature. Three stiff directions at each temperature? 18 – 3 (25°C) – 3 (30°C) – 3 (35°C) = 9 dimensional robust parameter space


F. Evolvability, robustness, and sloppiness
If it’s robust, can it evolve?

How can evolution proceed if mutating the parameters doesn’t matter? More robust, less evolvable!

Evolutionary force F

Sloppy signal transduction model
Individual evolvabilities $e_c(F)$ within phenotype
Population evolvability $e_d$ within phenotype
[Sloppy neutral spaces allow species to explore large ranges of parameters]


Proposed universal ensemble
Why are they sloppy?

Assumptions: (Not one experiment per parameter)
i. Model predictions all depend on every parameter, symmetrically: $y_i(\theta_1, \theta_2, \theta_3) = y_i(\theta_2, \theta_3, \theta_1)$
ii. Parameters are nearly degenerate: $\theta_i = \theta_0 + \epsilon_i$

Vandermonde Matrix
$d = N-1$

• Implies enormous range of eigenvalues
• Implies equal spacing of log eigenvalues
• Like universality for random matrices
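The eigenvalue claims can be explored numerically with a small sketch. It builds a Vandermonde matrix $V_{ik} = \epsilon_i^k$ from small, distinct offsets and looks at the eigenvalues of $V^T V$ as a stand-in for the Hessian. Taking the Hessian to be exactly $V^T V$ (dropping any model-dependent prefactor), and the particular values of `N` and `eps`, are assumptions for illustration.

```python
import numpy as np

N = 5
eps = 0.2 * np.linspace(-1.0, 1.0, N)        # nearly degenerate offsets eps_i
V = np.vander(eps, N, increasing=True)       # V[i, k] = eps_i ** k

# Eigenvalues of V^T V are the squared singular values of V; the SVD
# avoids losing the smallest ones to round-off.
sing = np.linalg.svd(V, compute_uv=False)    # sorted largest to smallest
eigs = sing ** 2
log_eigs = np.log10(eigs)
gaps = -np.diff(log_eigs)                    # spacing between successive logs

print("eigenvalues:", eigs)
print("log10 eigenvalues:", log_eigs)
print("gaps between log10 eigenvalues:", gaps)
print("range (largest / smallest):", eigs[0] / eigs[-1])
```

Shrinking the scale of `eps` widens the range further, since successive columns of `V` carry successive powers of the small offsets.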


48 Parameter “Fit” to Data

[Figure: ERK* versus Time (10’ marked), two panels: +EGF and +NGF; labels: bare, eigen]

Cost is Energy

Ensemble of Fits Gives Error Bars

Error Bars from Data Uncertainty


Why is it so thin?

[Figure labels: $2\Delta f_5$, $P_3(t)$]

Model $f(t,\theta)$ analytic: $f^{(n)}(t)/n! \le R^{-n}$

Polynomial fit $P_{m-1}(t)$ to $f(t_1), \ldots, f(t_m)$
Interpolation convergence theorem
$\Delta f_{m+1} = f(t) - P_{m-1}(t) < (t-t_1)\cdots(t-t_m)\, f^{(m)}(\xi)/m! \sim (\Delta t/R)^m$

More than one data point per $R$

Hyper-ribbon: Cross-section constraining $m$ points has width $W_{m+1} \sim \Delta f_{m+1} \sim (\Delta t/R)^m$
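The geometric shrinkage of the widths can be checked on a toy function. The sketch below interpolates $f(t) = 1/(1 + t/R)$, whose scaled derivatives satisfy $|f^{(n)}(t)|/n! \le R^{-n}$ for $t \ge 0$, at $m$ equally spaced points on an interval of length $\Delta t$, and compares the worst interpolation error with $(\Delta t/R)^m$. The choice of function, interval, and node placement are assumptions for illustration.

```python
import numpy as np

R = 2.0
dt = 0.5

def f(t):
    """Analytic test function with |f^(n)(t)| / n! <= R**(-n) for t >= 0."""
    return 1.0 / (1.0 + t / R)

def interpolation_error(m, n_eval=201):
    """Max |f - P_{m-1}| on [0, dt], where P_{m-1} is the polynomial of
    degree m-1 passing through f at m equally spaced nodes."""
    t_nodes = np.linspace(0.0, dt, m)
    coeffs = np.polyfit(t_nodes, f(t_nodes), m - 1)
    t_eval = np.linspace(0.0, dt, n_eval)
    return float(np.max(np.abs(f(t_eval) - np.polyval(coeffs, t_eval))))

ms = list(range(2, 7))
errors = [interpolation_error(m) for m in ms]
bounds = [(dt / R) ** m for m in ms]

for m, err, bound in zip(ms, errors, bounds):
    print(m, err, bound)
```

Each extra constrained point shrinks the remaining freedom by roughly another factor of $\Delta t/R$, which is the sense in which successive cross-sections of the hyper-ribbon get thinner.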


B. Finding sloppy subsystems
Model reduction?

• Sloppy model as multiple redundant parameters?
• Subsystem = subspace of parameters $p_i$ with similar effects on model behavior
• Similar = same effects on residuals $r_j$
• Apply clustering algorithm to rows of $J^T_{ij} = \partial r_j/\partial p_i$

[Figure: PC12 differentiation model; axes: parameter i, residual j]

Continuum mechanics, renormalization group, Lyapunov exponents can also be viewed as sloppy model reduction
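A minimal sketch of the clustering idea, assuming a simple greedy grouping by cosine similarity between rows of the sensitivity matrix (one row per parameter, one column per residual). The slides do not say which clustering algorithm was used; the greedy scheme, the threshold, the use of the absolute value (so parameters with opposite effects are grouped together), and the synthetic matrix are all assumptions.

```python
import numpy as np

def cluster_parameters(jt, threshold=0.95):
    """Greedy clustering of the rows of jt by |cosine similarity|.
    jt[i, j] = d r_j / d p_i. Returns one integer label per parameter."""
    unit = jt / np.linalg.norm(jt, axis=1, keepdims=True)
    sim = np.abs(unit @ unit.T)
    labels = -np.ones(len(jt), dtype=int)
    next_label = 0
    for i in range(len(jt)):
        if labels[i] >= 0:
            continue                      # already assigned to a cluster
        members = (sim[i] >= threshold) & (labels < 0)
        labels[members] = next_label
        next_label += 1
    return labels

# Synthetic sensitivities: parameters 0-2 act alike, as do parameters 3-4.
rng = np.random.default_rng(0)
n_resid = 20
base_a = rng.normal(size=n_resid)
base_b = rng.normal(size=n_resid)
jt = np.array([
    base_a,
    2.0 * base_a + 0.01 * rng.normal(size=n_resid),
    -0.5 * base_a,
    base_b,
    3.0 * base_b + 0.01 * rng.normal(size=n_resid),
])

labels = cluster_parameters(jt)
print(labels)
```

Parameters that land in the same cluster shift the residuals in nearly the same direction, so they are candidates for being merged into a single effective parameter.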


References
• “The sloppy model universality class and the Vandermonde matrix”, J. J. Waterfall, F. P. Casey, R. N. Gutenkunst, K. S. Brown, C. R. Myers, P. W. Brouwer, V. Elser, and James P. Sethna, http://arxiv.org/abs/cond-mat/0605387.
• “Sloppy systems biology: tight predictions with loose parameters”, Ryan N. Gutenkunst, Joshua J. Waterfall, Fergal P. Casey, Kevin S. Brown, Christopher R. Myers & James P. Sethna (submitted).
• “The Statistical Mechanics of Complex Signaling Networks: Nerve Growth Factor Signaling”, K. S. Brown, C. C. Hill, G. A. Calero, C. R. Myers, K. H. Lee, J. P. Sethna, and R. A. Cerione, Physical Biology 1, 184-195 (2004).
• “Statistical Mechanics Approaches to Models with Many Poorly Known Parameters”, K. S. Brown and J. P. Sethna, Phys. Rev. E 68, 021904 (2003).
• “Bayesian Ensemble Approach to Error Estimation of Interatomic Potentials”, Søren L. Frederiksen, Karsten W. Jacobsen, Kevin S. Brown, and James P. Sethna, Phys. Rev. Letters 93, 165501 (2004).
• “Bayesian Error Estimation in Density Functional Theory”, J. J. Mortensen, K. Kaasbjerg, S. L. Frederiksen, J. K. Norskov, James P. Sethna, K. W. Jacobsen, Phys. Rev. Letters 95, 216401 (2005).
• “SloppyCell” systems biology modeling software, Ryan N. Gutenkunst, Christopher R. Myers, Kevin S. Brown, Joshua J. Waterfall, Fergal P. Casey, James P. Sethna, http://www.lassp.cornell.edu/sethna/GeneDynamics/, SourceForge repository at http://sloppycell.sourceforge.net/
