Advanced data assimilation methods with evolving forecast error covariance
Four-dimensional variational analysis(4D-Var)
Shu-Chih Yang (with EK)
Find the optimal analysis
• Least squares approachFind the optimal weights to minimize the analysis error covariance
€
Ta =σ 2
2
σ 12 + σ 2
2
⎛
⎝ ⎜
⎞
⎠ ⎟T1 +
σ 12
σ 12 + σ 2
2
⎛
⎝ ⎜
⎞
⎠ ⎟T2
€
J(T) =1
2
(T − T1)2
σ 12
+(T − T2)2
σ 22
⎡
⎣ ⎢
⎤
⎦ ⎥ ,
∂J
∂T= 0 for T = Ta
• Variational approachFind the analysis that will minimize a cost function, measuring its distance to the background and to the observation
€
T1 = Tt + ε1
T2 = Tt + ε2
(forecast)
(observation)Best estimate the true value
Both methods give the same Ta !
J(x)= (x-xb)TB-1(x-xb) + [yo-H(x)]TR-1[yo-H(x)]
J(xa)=0 at J(xa)=Jmin
3D-Var
Distance to forecast (Jb) Distance to observations (Jo)
€
1
2
€
1
2
How do we find an optimum analysis of a 3-D field of model variable xa, given a background field, xb, and a set of observations, yo?
find the solution in 3D-Var
Directly set J(xa)=0 and solve
(B-1+HTR-1H)(xa-xb)=HTR-1[yo-H(xb)] (Eq. 5.5.9)
Usually solved as
(I+ B HTR-1H)(xa-xb)= B HTR-1[yo-H(xb)]
€
δJ ≈∂J
∂x
⎡ ⎣ ⎢
⎤ ⎦ ⎥
T
⋅δx; ∇J =∂J
∂x
Minimize the cost function, J(x)
A descent algorithm is used to find the minimum of the cost function.
This requires the gradient of the cost function, J.
-J(x)
-J(x+1)
J(x)
J(x+1)
min
Ex: “steepest descent” method
4D-Var
J(x(t0))= [x(t0)-xb(t0)]TB0
-1[x(t0)-xb(t0)]+ [yoi-H(xi)]
TRi-1[yo
i-H(xi)]
Need to define J(x(t0)) in order to minimize J(x(t0))€
1
2
€
1
2 i= 0
i= N
∑
Find the initial condition such that its forecast best fits the observations within the assimilation interval
assimilation windowt0
tnti
yo
yo
yo
yo
previous forecast
xb corrected forecast
xa
J(x) is generalized to include observations at different times.
Jb =12[x(t0 )−xb(t0 )]
T B−1[x(t0 )−xb(t0 )]
€
∂Jb
∂x(t0)= B−1[x(t0) − xb (t0)]
Given a symmetric matrix A, and
a function , the gradient is give by
€
J =1
2xTAx
€
∂J
∂x= Ax
J =J b + J o,∂J
∂x(t0 )=
∂J b
∂x(t0 )+
∂J o
∂x(t0 )
First, let’s consider Jb(x(t0))
Separate J(x(t0)) into “background” and “observation” terms
∂(H (xi ) − yio )
∂x0
= ∂H
∂xi
∂M i
∂x0
= HiL(t0 , ti ) = HiLi-1Li-2 L L0
Jo is more complicated, because it involves the integration of the model:
Jo =12
[H(xi )−yio]Ri
-1[H(xi )−yio]
i=0
N
∑
∂Jo
∂x(t0 )
⎡
⎣⎢
⎤
⎦⎥= LT (t0 ,ti )Hi
T R i-1[H (xi ) − yi
o ]i=0
N
∑weighted increment at observation time, ti, in model coordinates
If J = yTAy and y = y(x), then ,
where is a matrix.€
∂J
∂x=
∂y
∂x
⎡ ⎣ ⎢
⎤ ⎦ ⎥
T
Ax
€
∂y
∂x
⎡ ⎣ ⎢
⎤ ⎦ ⎥k,l
=∂yk
∂x l
xi=Mi[x(ti-1)]
[HiLi-1Li-2 ...L0 ]T =L 0T L L i−2
T L i−1T H i
T =LT (ti ,t0 )H iT
Adjoint model integrates increment backwards to t0
€
d i = H iTRi
−1[H(x i) − y io]
t0 t1 t3t2 t4
€
d 4
€
d 3
€
d 2
€
d 1
€
d 0 + L0T (d 1 + L1
T (d 2 + L2T (d 3 + L3
T d 4 )))
€
B0−1[x(t0) − xb (t0)]
+Jo/x0
Jb/x0
€
d 0
• In each iteration, J is used to determine the direction to search the Jmin.
• 4D-Var provides the best estimation of the analysis state and error covariance is evolved implicitly.
Simple example:Use the adjoint model to integrate backward in time
Start from the end!
Jo
t0tnti
assimilation window
yo
yo
yo
yo
previous forecast
corrected forecastxb
Jb
Jo
Jo
Jo
xa
3D-V
ar
1. 4D-Var assumes a perfect model. It will give the same credence to older observations as to newer observations.• algorithm modified by
Derber (1989)2. Background error covariance
is time-independent in 3D-Var, but evolves implicitly in 4D-Var.
3. In 4D-Var, the adjoint model is required to compute J.
Figure from http://www.ecmwf.int/
3D-Var vs. 4D-Var
With this form, it is possible to choose a “simplification operator, S” to solve the cost function in a low dimension space (change the control variable).Now, δw=Sδx and minimize J(δw)
where and
Practical implementation: use the incremental form
The choice of the simplification operator• Lower resolution• Simplification of physical process
€
J(δx0) =1
2(δx0)
T B0−1δx0 +
1
2[H iL(t0, ti)δx0 − di
o]T R−1
i= 0
N
∑ [H iL(t0, ti)δx0 − dio]
€
δx = x − xb
€
d = yo − H(x)
Example of using simplification operator
Both TLM and ADJ use a low resolution and also simplified physics due to the limitation of the computational cost.
Example with the Lorenz 3-variable model
• The background state is needed in both L and LT (need to save the model trajectory)
• In a complex NWP model, it is impossible to write explicitly this matrix form
€
dx1
dt= −px1 + px2
dx2
dt= rx1 − x1x3 − x2
dx3
dt= x1x2 − bx3
Nonlinear model x=[x1,x2,x3]
Tangent linear model δx=[δx1, δx2, δx3]
Adjoint model δx*=[δx*
1, δx*2, δx*
3]
€
L =∂M
∂x=
∂M
∂x i
=
−p p 0
r − x3 −1 −x1
x2 x1 −b
⎡
⎣
⎢ ⎢ ⎢
⎤
⎦
⎥ ⎥ ⎥
x3 x1
x2 x1
€
LT =∂M
∂x i
⎡
⎣ ⎢
⎤
⎦ ⎥
T
=
−p r − x3 x2
p −1 x1
0 −x1 −b
⎡
⎣
⎢ ⎢ ⎢
⎤
⎦
⎥ ⎥ ⎥x1
x1
x2x3
In tangent linear model
forward in time
In adjoint model
backward in time
Example of tangent linear and adjoint codes
€
δx 3(t + Δt) −δx3(t)
Δt= x2(t)δx1(t) + x1(t)δx2(t) − bδx3(t)
δx 3(t + Δt) = δx3(t) + [x2(t)δx1(t) + x1(t)δx2(t) − bδx3(t)]Δt
€
δx3*(t) = δx3
*(t) + (1− bΔt)δx3*(t + Δt)
δx2*(t) = δx2
*(t) + (x1(t)Δt)δx3*(t + Δt)
δx1*(t) = δx1
*(t) + (x2(t)Δt)δx1*(t + Δt)
δx1*(t + Δt) = 0
* Try an example in Appendix B (B.1.15)
use forward scheme to integrate in time
RMS error of 3D-Var and 4D-Var in Lorenz model
Experiments: DA cycle and observations: 8t, R=2*I 4D-Var assimilation window: 24t
observation error
3DVar
4DVar
4D-Var in the Lorenz model (Kalnay et al., 2005)
Impact of the window length
Win=8 16 24 32 40 48 56 64 72
Fixed window 0.59 0.59 0.47 0.43 0.62 0.95 0.96 0.91 0.98
Start with short window
0.59 0.51 0.47 0.43 0.42 0.39 0.44 0.38 0.43
• Lengthening the assimilation window reduces the RMS analysis error up 32 steps.
• For the long windows, error increases because the cost function has multiple minima.
• This problem can be overcome by the quasi-static variational assimilation approach (Pires et al, 1996), which needs to start from a shorter window and progressively increase the length of the window.
Schematic of multiple minima and increasing window size (Pires et al, 1996)
failed
1 2final
……
Jmin1
Jmin2
J(x)
Dependence of the analysis error on B0
Dependence of the analysis error on the B0
•Since the forecast state from 4D-Var will be more accurate than 3D-Var, the amplitude of B should be smaller than the one used in 3D-Var.• Using a covariance proportional to B3D-Var and tuning its amplitude is a good strategy to estimate B.
Win=8 B= B3D-Var 50%
B3D-Var
40%
B3D-Var
30%
B3D-Var
20%
B3D-Var
10%
B3D-Var
5%
B3D-Var
RMSE 0.78 0.59 0.53 0.52 0.50 0.51 0.65 >2.5