Data Assimilation with Stochastic Model Reduction of ...feilu/slides/19_5DA.pdf · !Improve performance of DA 22/24. Ongoing work: noisy data: state estimation and model inference

Data Assimilation with Stochastic Model Reduction of Chaotic Systems

Fei Lu

Department of Mathematics, Johns Hopkins

Joint work with: Alexandre J. Chorin (UC Berkeley), Kevin K. Lin (U. of Arizona), Xuemin Tu (U. of Kansas)

Snowbird, SIAM-DS, May 23, 2019


Model error from sub-grid scales

ECMWF: 16 km horizontal grid → $10^9$ degrees of freedom

From: Wikipedia, ECMWF

The Lorenz 96 system

Wilks 2005





Wilks 2005

Diagram labels: discrete partial data; high-dimensional full system; prediction.

$$x' = f(x) + U(x, y), \qquad y' = g(x, y).$$

Observe only $\{x(nh)\}_{n=1}^{N}$. Forecast $x(t)$, $t \ge Nh$.

High-dimensional multiscale full systems:
- can only afford to resolve $x' = f(x)$
- $y$: unresolved variables (subgrid scales)

Discrete noisy observations: missing initial conditions

Ensemble prediction: need many simulations

→ How to account for the model error $U(x, y)$?






Given a high-dimensional multiscale full system:

$$x' = f(x) + U(x, y), \qquad y' = g(x, y).$$

Ensemble prediction: can afford to resolve $x' = f(x)$ online.

Accounting for model error $U(x, y)$ from subgrid scales

Indirect approaches: correct the forecast ensemble
- e.g. inflation and localization [1]; relaxation/bias correction [2]
- act in the assimilation step; the deficiency in the forecast model remains

Direct approach: improve the forecast model
- parametrization methods [3], non-Markovian models [4]
- random perturbation, averaging + homogenization [5]

[1] Mitchell-Houtekamer (00), Hamill-Whitaker (05), Anderson (07)
[2] Zhang et al. (04), Dee-Da Silva (98)
[3] Palmer, Arnold+ (01, 13), Wilks (05), Meng-Zhang (07), Danforth-Kalnay-Li+ (08, 09), Berry-Harlim (14), Mitchell-Carrassi (15), Van Leeuwen et al. (18)
[4] Chorin+ (00-15), Majda-Timofeyev-Harlim+ (03-13), Chekroun-Kondrashov-Ghil+ (11, 15), Crommelin+Vanden-Eijnden (08), Gottwald+ (15)
[5] Hamill-Whitaker (05), Houtekamer+ (09), Pavliotis-Stuart (08), Gottwald+ (12-13)

Outline

1. Stochastic model reduction (reduction from simulated data)
   - Discrete-time stochastic parametrization (NARMA)

2. Data assimilation with the reduced model (noisy data + reduced model → state estimation and prediction)

Stochastic model reduction

$$x' = f(x) + U(x, y), \qquad y' = g(x, y).$$

Data: $\{x(nh)\}_{n=1}^{N}$

Why stochastic reduced models?

The system is “ergodic”:

$$\frac{1}{N}\sum_{n=1}^{N} F(x(nh)) \xrightarrow{N\to\infty} \int F(x)\,\mu(dx)$$

$U(x, y)$ acts like a stochastic force

Memory effects (Mori, Zwanzig, Chorin, Kubo, Majda, Wilks, Ghil, ...)

Mori-Zwanzig formalism → generalized Langevin equation

$$\frac{dx}{dt} = \underbrace{E[\mathrm{RHS} \mid x]}_{\text{Markov term}} + \underbrace{\int_0^t K(x(s), t - s)\,ds}_{\text{memory}} + \underbrace{W_t}_{\text{noise}},$$

Fluctuation-dissipation theory → hypoelliptic SDEs

$$dX = a(X, Y)\,dt + Y; \qquad dY = b(X, Y)\,dt + c(X, Y)\,dW,$$

Parametrization: multi-layer stochastic models

Goal: develop a non-Markovian stochastic reduced system for $x$


Discrete-time stochastic parametrization

NARMA(p, q)

$$X_n = X_{n-1} + R_h(X_{n-1}) + Z_n, \qquad Z_n = \Phi_n + \xi_n,$$

$$\Phi_n = \underbrace{\sum_{j=1}^{p} a_j X_{n-j} + \sum_{j=1}^{r} \sum_{i=1}^{s} b_{i,j} P_i(X_{n-j})}_{\text{auto-regression}} + \underbrace{\sum_{j=1}^{q} c_j \xi_{n-j}}_{\text{moving average}}$$

- $R_h(X_{n-1})$ comes from a numerical scheme for $x' \approx f(x)$
- $\Phi_n$ depends on the past

Tasks:
- Structure derivation: terms and orders $(p, r, s, q)$ in $\Phi_n$;
- Parameter estimation: $a_j$, $b_{i,j}$, $c_j$, and $\sigma$.
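The NARMA(p, q) recursion above can be sketched for a scalar state as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `simulate_narma`, the drift $f(x) = -x$ with a forward-Euler $R_h$, the single basis function $P_1(x) = x^3$, the zero initial noise history, and every coefficient value are made up for the example. The noise $\xi_n$ is assumed Gaussian with standard deviation `sigma`.

```python
import numpy as np

def simulate_narma(x_init, n_steps, R_h, a, b, c, sigma, polys, rng):
    """Simulate X_n = X_{n-1} + R_h(X_{n-1}) + Phi_n + xi_n for a scalar state.

    a: shape (p,), b: shape (s, r), c: shape (q,), polys: list of s functions P_i.
    x_init: at least max(p, r) initial values, oldest first.
    """
    p, (s, r), q = len(a), b.shape, len(c)
    x = list(x_init)
    xi = [0.0] * q                      # assumed: past noise starts at zero
    for _ in range(n_steps):
        # x[-1 - j] is X_{n-(j+1)}; a[j] plays the role of a_{j+1}
        phi = sum(a[j] * x[-1 - j] for j in range(p))
        phi += sum(b[i, j] * polys[i](x[-1 - j]) for i in range(s) for j in range(r))
        phi += sum(c[j] * xi[-1 - j] for j in range(q))
        xi_n = sigma * rng.standard_normal()
        x.append(x[-1] + R_h(x[-1]) + phi + xi_n)
        xi.append(xi_n)
    return np.array(x)

h = 0.05
rng = np.random.default_rng(0)
R_h = lambda x: h * (-x)                # forward Euler for the assumed drift f(x) = -x
path = simulate_narma([0.1, 0.2], 200, R_h,
                      a=np.array([0.05, -0.02]),
                      b=np.array([[-0.01]]),
                      c=np.array([0.3]),
                      sigma=0.1, polys=[lambda x: x**3], rng=rng)
```

Setting `c` to an empty array gives the purely autoregressive case q = 0.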

Overview:

Full system: $x' = f(x) + U(x, y)$, $y' = g(x, y)$. Data: $\{x(nh)\}_{n=1}^{N}$

Discrete-time stochastic parametrization (NARMA):

$$X_n = X_{n-1} + R_h(X_{n-1}) + Z_n, \qquad Z_n = \Phi_n + \xi_n,$$

$$\Phi_n = \sum_{j=1}^{p} a_j X_{n-j} + \sum_{j=1}^{q} c_j \xi_{n-j} + \sum_{j=1}^{r} \sum_{i=1}^{s} b_{i,j} P_i(X_{n-j}).$$

1. Compute $R_h(x)$

2. Derive the structure

3. Estimate the parameters
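When q = 0 the moving-average terms drop out of $\Phi_n$, and the parameter-estimation step can be written as an ordinary least-squares regression of the residuals $z_n = x_n - x_{n-1} - R_h(x_{n-1})$ on the lagged terms. The sketch below illustrates only that special case for a scalar series. The function name `fit_narma_q0`, the choice r = p, and the monomial basis $P_i(x) = x^{i+1}$ are assumptions made for the example; the slides do not specify the estimator, and the q > 0 case needs an iterative fit that is not shown.

```python
import numpy as np

def fit_narma_q0(x, R_h, p, s):
    """Least-squares fit of z_n = sum_j a_j x_{n-j} + sum_{i,j} b_{i,j} x_{n-j}^(i+1) + xi_n."""
    x = np.asarray(x, dtype=float)
    z = x[1:] - x[:-1] - R_h(x[:-1])          # z[k] is the residual at time n = k + 1
    rows, targets = [], []
    for n in range(p, len(x)):
        lags = x[n - p:n][::-1]               # x_{n-1}, ..., x_{n-p}
        feats = [lags] + [lags ** (i + 2) for i in range(s)]
        rows.append(np.concatenate(feats))
        targets.append(z[n - 1])
    A, y = np.array(rows), np.array(targets)
    theta, *_ = np.linalg.lstsq(A, y, rcond=None)
    sigma = np.std(y - A @ theta)             # residual spread as the noise estimate
    a, b = theta[:p], theta[p:].reshape(s, p)
    return a, b, sigma

# Synthetic check: data generated with a_1 = -0.2, no polynomial term, noise std 0.5.
rng = np.random.default_rng(0)
h = 0.1
R_h = lambda v: -h * v                        # assumed forward-Euler step for f(x) = -x
x = np.zeros(5000)
for n in range(1, len(x)):
    x[n] = x[n - 1] + R_h(x[n - 1]) - 0.2 * x[n - 1] + 0.5 * rng.standard_normal()
a, b, sigma = fit_narma_q0(x, R_h, p=1, s=1)
```

Here `b[i, j]` multiplies the power i + 2 of the lag j + 1, so `b[0, 0]` is the coefficient of $x_{n-1}^2$.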

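Application to the chaotic Lorenz 96 system

A chaotic dynamical system (a simplified atmospheric model)

$$\frac{d}{dt} x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10 - \frac{1}{J} \sum_j y_{k,j},$$

$$\frac{d}{dt} y_{k,j} = \frac{1}{\varepsilon} \left[ y_{k,j+1}(y_{k,j-1} - y_{k,j+2}) - y_{k,j} + x_k \right],$$

where $x \in \mathbb{R}^{18}$, $y \in \mathbb{R}^{360}$.

Wilks 2005

Find a reduced system for $x \in \mathbb{R}^{18}$ based on

- Data $\{x(nh)\}_{n=1}^{N}$

- $\frac{d}{dt} x_k \approx x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10$.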
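The two right-hand sides above can be written compactly with cyclic shifts. The following is a sketch under assumptions rather than the code behind the slides: $K = 18$ and $J = 20$ follow from $x \in \mathbb{R}^{18}$ and $y \in \mathbb{R}^{360}$, but the value `EPS = 0.5`, the treatment of the fast variables as one cyclic chain of length $KJ$, and the use of classical RK4 for $R_h$ are choices made here for illustration.

```python
import numpy as np

K, J = 18, 20            # x in R^18, y in R^(18*20) = R^360, as on the slide
EPS = 0.5                # assumed time-scale separation; the slide gives no value

def l96_truncated_rhs(x):
    """Truncated model: dx_k/dt = x_{k-1} (x_{k+1} - x_{k-2}) - x_k + 10."""
    return np.roll(x, 1) * (np.roll(x, -1) - np.roll(x, 2)) - x + 10.0

def l96_full_rhs(x, y):
    """Two-layer system; x has shape (K,), y has shape (K, J)."""
    dx = l96_truncated_rhs(x) - y.mean(axis=1)       # subtract (1/J) sum_j y_{k,j}
    yf = y.ravel()       # assumed: fast variables form one cyclic chain of length K*J
    dyf = (np.roll(yf, -1) * (np.roll(yf, 1) - np.roll(yf, -2)) - yf + np.repeat(x, J)) / EPS
    return dx, dyf.reshape(K, J)

def R_h(x, h):
    """Increment of one classical RK4 step of the truncated model."""
    f = l96_truncated_rhs
    k1 = f(x)
    k2 = f(x + 0.5 * h * k1)
    k3 = f(x + 0.5 * h * k2)
    k4 = f(x + h * k3)
    return h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
```

The difference between `l96_full_rhs` and `l96_truncated_rhs` in the slow equation is exactly the coupling term $-\frac{1}{J}\sum_j y_{k,j}$, which is the model error that the reduced model has to represent.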

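NARMA:

$$x_n = x_{n-1} + R_h(x_{n-1}) + z_n; \qquad z_n = \Phi_n + \xi_n,$$

$$\Phi_n = a + \sum_{j=1}^{p} \sum_{l=1}^{d_x} b_{j,l} (x_{n-j})^l + \sum_{j=1}^{p} c_j R_h(x_{n-j}) + \sum_{j=1}^{q} d_j \xi_{n-j}.$$

$$p = 2, \quad d_x = 3; \qquad q = \begin{cases} 1, & h = 0.01; \\ 0, & h = 0.05. \end{cases}$$

Polynomial autoregression (POLYAR) [6]:

$$\frac{d}{dt} x_k = x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10 + U,$$

$$U = P(x_k) + \eta_k, \quad \text{with } d\eta_k(t) = \phi\,\eta_k(t)\,dt + dB_k(t),$$

where $P(x) = \sum_{j=0}^{d_x} a_j x^j$. Optimal $d_x = 5$.

[6] Wilks 05: an MLR model in atmospheric science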


Long-term statistics

Empirical probability density function (PDF)

[Figure: PDF of $X_1$ for L96, POLYAR, and NARMA; panels for h = 0.01 and h = 0.05.]

Empirical autocorrelation function (ACF)

[Figure: ACF versus time (0 to 10) for L96, POLYAR, and NARMA; panels for h = 0.01 and h = 0.05.]
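The two long-term statistics compared here can be computed from a single long trajectory of one component. The sketch below is one common way to do it, assuming a histogram estimate for the PDF and the biased-free sample mean of lagged products for the ACF; the function names and the default bin count are illustrative, and the slides do not say which estimators were used.

```python
import numpy as np

def empirical_acf(x, max_lag):
    """Sample autocorrelation at lags 0..max_lag (lag k corresponds to time k*h)."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    var = np.mean(x * x)
    return np.array([np.mean(x[:len(x) - k] * x[k:]) / var for k in range(max_lag + 1)])

def empirical_pdf(x, bins=50):
    """Histogram density estimate; returns bin centers and density values."""
    density, edges = np.histogram(x, bins=bins, density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    return centers, density
```

Applying both functions to trajectories of the full system and of each reduced model, at the same observation step h, gives curves that can be overlaid as in the figures.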

Prediction (h = 0.05)

A typical ensemble forecast:

[Figure: $X_1$ versus time (0 to 5), one panel each for POLYAR and NARMAX; forecast trajectories in cyan, true trajectory in blue.]

RMSE of many forecasts:

[Figure: RMSE versus lead time (0 to 5) for POLYAR and NARMAX.]

Forecast time:
- POLYAR: T ≈ 1.25
- NARMA: T ≈ 2.5

(Full model: T ≈ 2.5. “Best” forecast time achieved!)
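An RMSE-versus-lead-time curve of this kind can be computed as sketched below. The definition used is an assumption, since the slides do not spell it out: for each forecast the ensemble mean is compared with the true trajectory in the Euclidean norm, and the squared errors are averaged over all forecasts before taking the square root. The function name and array layout are illustrative.

```python
import numpy as np

def forecast_rmse(ens_forecasts, truth):
    """RMSE of the ensemble-mean forecast as a function of lead time.

    ens_forecasts: shape (n_forecasts, n_members, n_times, d)
    truth:         shape (n_forecasts, n_times, d)
    """
    mean_forecast = ens_forecasts.mean(axis=1)
    sq_err = np.sum((mean_forecast - truth) ** 2, axis=-1)   # squared Euclidean error
    return np.sqrt(sq_err.mean(axis=0))                      # average over forecasts
```

A forecast time T can then be read off as the lead time at which this curve crosses a chosen error threshold.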



Data assimilation with the reduced model

$$x' = f(x) + U(x, y), \qquad y' = g(x, y).$$

Noisy data: $x(nh) + W(n)$, $n = 1, 2, \dots$

Data assimilation:
- estimate the state of a forward model
- make predictions (by ensembles of solutions)

The widely used method: ensemble Kalman filters (EnKF)
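For noisy data of the form $x(nh) + W(n)$, the state is observed directly, so one EnKF analysis step can be sketched as below. This is a generic perturbed-observation update, assumed here for illustration; the slides do not state which EnKF variant was used, and inflation and localization are deliberately left out. The function name, the Gaussian observation noise with standard deviation `obs_std`, and the identity observation operator are assumptions.

```python
import numpy as np

def enkf_analysis(ensemble, z, obs_std, rng):
    """Perturbed-observation EnKF update with direct observation of the state (H = I).

    ensemble: shape (M, d) forecast members; z: shape (d,) noisy observation.
    """
    M, d = ensemble.shape
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (M - 1)            # sample forecast covariance
    R = obs_std**2 * np.eye(d)                       # observation noise covariance
    gain = np.linalg.solve(P + R, P).T               # K = P (P + R)^{-1}, using symmetry
    perturbed = z + obs_std * rng.standard_normal((M, d))
    return ensemble + (perturbed - ensemble) @ gain.T
```

Between two observations, each member is advanced by the forward model, which is where a truncated model or a reduced stochastic model would be plugged in.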


Wilks 2005

Estimate and predict $x$ based on

- Noisy data $z(n) = x(nh) + W(n)$

- Forward models
  - L96x: the truncated model $\frac{d}{dt} x_k \approx x_{k-1}(x_{k+1} - x_{k-2}) - x_k + 10$ (account for the model error by inflation and localization (IL) in the EnKF)
  - NARMA (account for the model error by parametrization in the forward model)

Relative error of state estimation

[Figure: relative error versus standard deviation of the observation noise (0.05 to 0.4) for L96x + IL, NARMA, and Full model + IL.]

Relative error for different observation noises (ensemble size: 1000 for L96x and NARMA; 10 for the full model).

RMSE of state prediction

[Figure: RMSE versus time (0 to 4) for L96x + IL, NARMA, and Full model + IL.]

RMSE of $10^4$ ensemble forecasts (ensemble size: 1000 for L96x and NARMA; 10 for the full model).

Summary: The stochastic model improves performance of DA.

Summary and ongoing work

Accounting for model error $U(x, y)$ from subgrid scales

$$x' = f(x) + U(x, y), \qquad y' = g(x, y).$$

Data: $\{x(nh)\}_{n=1}^{N}$

Diagram labels: inference of “$X' = f(X) + Z(t, \omega)$” followed by discretization; inference of “$X_{n+1} = X_n + R_h(X_n) + Z_n$” for prediction.

Stochastic model reduction by discrete-time stochastic parametrization
- simplifies the inference from data
- incorporates memory flexibly
- effective reduced model (NARMA)
- captures key statistical-dynamical features
- makes medium-range forecasting

Improve the forecast model → improve performance of DA

Ongoing work:

Noisy data: state estimation and model inference
- data assimilation with non-Markovian models
- inference for hidden non-Markovian models

Model reduction for (stochastic) PDEs
- stochastic Burgers equation, N-S equation

References

Data-driven stochastic model reduction
- Chorin-Lu: Discrete approach to stochastic parametrization and dimension reduction in nonlinear dynamics. PNAS 112 (2015), no. 32, 9804–9809.
- Lu-Lin-Chorin: Comparison of continuous and discrete-time data-based modeling for hypoelliptic systems. CAMCoS, 11 (2016), no. 8, 4227–4246. (With K. Lin and A. Chorin)
- Lu-Lin-Chorin: Data-based stochastic model reduction for the Kuramoto–Sivashinsky equation. Physica D, 340 (2017), 46–57. (With K. Lin and A. Chorin)

Data assimilation
- Lu-Tu-Chorin: Accounting for model error from unresolved scales in EnKFs: improving the forecast model. MWR, 340 (2017).

Thank you!
