My PhD talk "Application of H-matrices for computing partial inverse"

Application of H-matrices for solving multiscale problems with stochastic right-hand side

Litvinenko Alexander

WIRE, TU Braunschweig,

5 April, 2007.

www.hlib.org www.wire.tu-bs.de


Contents

1. Problem Setup

2. Hierarchical Domain Decomposition (HDD)

3. HDD in the H-matrix arithmetic

4. Computational resources of HDD

5. Functionals of the solution

6. Numerical results


Problem setup

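The stochastic elliptic boundary value problem: find $u : \Omega \times D \to \mathbb{R}$ such that, almost surely,

\[
\begin{aligned}
\sum_{1 \le i,j \le 2} \frac{\partial}{\partial x_i}\, \alpha_{i,j}(x)\, \frac{\partial}{\partial x_j} u &= f(x,\omega) && \text{in } D,\\
u &= g(x,\omega) && \text{on } \partial D,
\end{aligned}
\tag{1}
\]

where $\alpha_{i,j} \in L^\infty(D)$ are such that $A(x) = (\alpha_{i,j})_{i,j=1,2}$ satisfies

\[ 0 < \underline{\lambda} \le \lambda_{\min}(A(x)) \le \lambda_{\max}(A(x)) \le \overline{\lambda}, \quad \forall x \in D. \]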

⇒ Oscillatory or jumping coefficients are allowed.


The motivation and goals

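[Figure, panels a), b), c): the domain $D$ with boundary data $u|_{\partial D}$ and an interface $\gamma$, a subdomain $\nu$ with the trace $u|_{\partial\nu}$, and the relation $u|_\gamma = \Phi(u|_{\partial D}, f)$.]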

Computing the complete inverse of the stiffness matrix can be too expensive!

Often only a few functionals of the solution are of interest!

Compute the stochastic solution a) on γ, b) on ∂ν, c) on the interface.


From H.Matthies and A.Keese Bericht Nr. 08-2003

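\[
K u(\omega) = K \Bigl( \sum_{\alpha \in J} u^{(\alpha)} H_\alpha(\omega) \Bigr) = \sum_{\alpha \in J} K u^{(\alpha)} H_\alpha(\omega) = f(\omega) = \sum_{\alpha \in J} f^{(\alpha)} H_\alpha(\omega),
\]

where the coefficients $u^{(\alpha)}$ can be found by solving the uncoupled systems

\[ \forall \alpha \in J : \quad K u^{(\alpha)} = f^{(\alpha)}. \]

How to compute all $u^{(\alpha)}$ efficiently: a) $K^{-1}$, b) LU, or c) CG (GMRES)?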
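Option b) can be illustrated on a toy problem. The sketch below is only an illustration under stated assumptions: `K` is a small dense 1D Laplacian standing in for the stiffness matrix, the columns of `F` are random stand-ins for the coefficient vectors $f^{(\alpha)}$, and `solve_all_modes` is a hypothetical helper name. All $|J|$ uncoupled systems are passed to one solver call, so the factorisation of `K` is computed once and reused for every right-hand side.

```python
import numpy as np

def solve_all_modes(K, F):
    """Solve K u^(alpha) = f^(alpha) for every column f^(alpha) of F.

    np.linalg.solve LU-factorises K once and applies the factors
    to all columns of F.
    """
    return np.linalg.solve(K, F)

n, n_modes = 50, 8                       # hypothetical sizes: dofs and |J|
K = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)   # 1D Laplacian stand-in
rng = np.random.default_rng(0)
F = rng.standard_normal((n, n_modes))    # column alpha holds f^(alpha)
U = solve_all_modes(K, F)                # column alpha holds u^(alpha)
residual = np.linalg.norm(K @ U - F)     # should be at rounding level
```

The cost of the factorisation is paid once, while each additional right-hand side only costs a pair of triangular solves; this is the same trade-off that HDD exploits with its two passes.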


The idea of HDD

Apply Galerkin FE discretisation to (1).

We construct the discrete solution in the form

\[ u_h(\cdot,\omega) = F_h f_h(\cdot,\omega) + G_h g_h(\cdot,\omega), \tag{2} \]

where $F_h$, $G_h$ are two solution operators, $f_h$ is the stochastic FE right-hand side and $g_h$ are the stochastic FE Dirichlet boundary values.

E.g. $u_h(f_h, g_h)$ only for $f_h$ in a coarser space.


Domain decomposition tree (TTh)

FE discretisation: triangulation $T_h$, $D := D_h = \bigcup_{t \in T_h} t$.

[Figure: a triangulated domain with numbered nodal points and its successive splitting into subdomains, forming the domain decomposition tree.]

• $D$ is the root of the tree,

• $T_{T_h}$ is a binary tree,

• if $\nu \in T_{T_h}$ has two sons $\nu_1, \nu_2 \in T_{T_h}$, then $\nu = \nu_1 \cup \nu_2$ and $\gamma = \partial\nu_1 \cap \partial\nu_2$,

• $\nu \in T_{T_h}$ is a leaf if and only if $\nu \in T_h$.
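The four properties of the tree can be illustrated by recursive bisection. The sketch below is a toy under stated assumptions: the elements are plain integer labels on a line (standing in for the triangles $t \in T_h$), and `Node`, `build_tree` and `leaves` are hypothetical names, not part of any library.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Node:
    elements: Tuple[int, ...]            # the elements t in T_h that make up nu
    left: Optional["Node"] = None        # son nu_1
    right: Optional["Node"] = None       # son nu_2

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

def build_tree(elements: Tuple[int, ...]) -> Node:
    """Bisect the element set until every leaf is a single element."""
    node = Node(elements)
    if len(elements) > 1:
        mid = len(elements) // 2
        node.left = build_tree(elements[:mid])
        node.right = build_tree(elements[mid:])
    return node

def leaves(node: Node):
    """Collect the leaves of the subtree rooted at node, left to right."""
    if node.is_leaf():
        return [node]
    return leaves(node.left) + leaves(node.right)

root = build_tree(tuple(range(8)))       # the root represents the whole domain D
```

Each inner node is the union of its two sons, and the recursion stops exactly at single elements, which mirrors the leaf condition $\nu \in T_h$.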


Notations

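Let $\nu \in T_{T_h}$, $\nu = \nu_1 \cup \nu_2$.

$\Gamma_{\nu,1} := \partial\nu \cap \nu_1$,

$\Gamma_{\nu,2} := \partial\nu \cap \nu_2$,

$\gamma := \partial\nu_1 \setminus \partial\nu = \partial\nu_2 \setminus \partial\nu$,

$I := I(D)$ = set of all nodal points in $D$,

$I(\nu) := \{ i \in I : x_i \in \nu \}$.

[Figure: the subdomain $\nu$ split into $\nu_1$ and $\nu_2$ by the interface $\gamma$, with the boundary parts $\Gamma_{\nu,1}$, $\Gamma_{\nu,2}$ and $\Gamma_\nu$.]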


FE Galerkin Discretisation

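For $\nu \in T_{T_h}$ define $f_\nu := (f_i)_{i \in I(\nu)}$, $g_\nu := (g_i)_{i \in I(\partial\nu)}$, $d_\nu := (f_\nu, g_\nu)$.

Let $b_j$, $j = 1, \dots, N$, be a piecewise linear basis,

$V_h := \operatorname{span}\{b_1, \dots, b_N\}$, $V_h \subset V = H^1(D)$.

Variational Galerkin formulation of (1): find $u_h \in V_h$ such that

\[
\begin{aligned}
a_\nu(u_h, b_j) &= (f_\nu, b_j)_{L^2(\nu)} && \forall j \in I(\nu),\\
u_h(x_j) &= g_j && \forall j \in I(\partial\nu),
\end{aligned}
\tag{3}
\]

where

\[ a_\nu(b_i, b_j) = \int_D \alpha(x)\,(\nabla b_i, \nabla b_j)\,dx, \qquad (f_\nu, b_j) = \int_{\operatorname{supp} b_j} f_\nu\, b_j \,dx. \]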


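Let $U_\nu \in V_h$ be the solution of (3). Then $U_\nu = U_\nu^f + U_\nu^g$, where $U_\nu^f$ is the solution of

\[ a_\nu(U_\nu^f, b_j) = (f_\nu, b_j)_{L^2(\nu)} \quad \forall j \in I(\mathring{\nu}), \qquad U_\nu^f(x_j) = 0 \quad \forall j \in I(\partial\nu), \]

and $U_\nu^g$ is the solution of

\[ a_\nu(U_\nu^g, b_j) = 0 \quad \forall j \in I(\mathring{\nu}), \qquad U_\nu^g(x_j) = g_j \quad \forall j \in I(\partial\nu). \]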


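The main point of HDD is to build the mapping $\Phi_\nu := (\Phi_\nu^g, \Phi_\nu^f)$, where $\Phi_\nu^g : \mathbb{R}^{I(\partial\nu)} \to \mathbb{R}^{I(\gamma)}$ and $\Phi_\nu^f : \mathbb{R}^{I(\nu)} \to \mathbb{R}^{I(\gamma)}$, for each $\nu \in T_{T_h}$.

1. Definition of the mapping $\Phi_\nu$:

\[ (\Phi_\nu(d_\nu))_i := u_h(x_i), \quad \forall i \in I(\gamma). \]

Hence $\Phi_\nu(d_\nu)$ is the trace of $u_h$ on $\gamma$. In fact, $\Phi_\nu d_\nu = \Phi_\nu^g g_\nu + \Phi_\nu^f f_\nu$.

2. Definition of the auxiliary mapping $\Psi_\nu := (\Psi_\nu^g, \Psi_\nu^f)$:

\[ \Psi_\nu(d_\nu) = \bigl((\Psi_\nu(d_\nu))_i\bigr)_{i \in I(\partial\nu)} \quad \text{with} \quad (\Psi_\nu(d_\nu))_i := a_\nu(u_h, b_i) - (f_\nu, b_i)_{L^2(\nu)}, \]

\[ \Psi_\nu d_\nu = \Psi_\nu^f f_\nu + \Psi_\nu^g g_\nu. \]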


Construction of the mappings Ψν and Φν

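Lemma 1: Let $\nu_1$ and $\nu_2$ be the two sons of $\nu \in T_{T_h}$. Let $d_1$ and $d_2$ be the data associated with $\nu_1$ and $\nu_2$ such that:

• (consistency conditions for the Dirichlet data)

$g_{1,i} = g_{2,i}$, $\forall i \in I(\nu_1) \cap I(\nu_2)$,

• (consistency conditions for the right-hand side)

$f_{1,i} = f_{2,i}$, $\forall i \in I(\nu_1) \cap I(\nu_2)$.

[Figure: the subdomains $\nu_1$ and $\nu_2$ of $\nu$ with the interface $\gamma$ and nodes $x_j$.]

Let $u_1$ and $u_2$ be the local FE solutions of problem (3) for the data $d_1$, $d_2$.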


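If $u_1$, $u_2$ satisfy

\[ \Psi_1(d_1)|_\gamma + \Psi_2(d_2)|_\gamma = 0, \]

then $u_\nu$, defined by assembling

\[ u_\nu(x_i) := \begin{cases} u_1(x_i) & \text{for } i \in I(\nu_1),\\ u_2(x_i) & \text{for } i \in I(\nu_2), \end{cases} \]

is the solution of (3) for the data $d_\nu = (f_\nu, g_\nu)$ given by

\[ f_\nu := \begin{cases} f_{1,i} & \text{for } i \in I(\nu_1),\\ f_{2,i} & \text{for } i \in I(\nu_2), \end{cases} \qquad g_\nu := \begin{cases} g_{1,i} & \text{for } i \in I(\partial\nu_1),\\ g_{2,i} & \text{for } i \in I(\partial\nu_2). \end{cases} \]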


Construction of Φν

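Given: $d_1 := d_{\nu_1} = (f_1, g_{1,\Gamma}, g_{1,\gamma})$, where $g_{1,\Gamma} := (g_1)_{i \in I(\Gamma_{\nu,1})}$ and $g_{1,\gamma} := (g_1)_{i \in I(\gamma)}$. Then

\[
\begin{aligned}
\Psi_1 d_1 &= \Psi_1^f f_1 + \Psi_1^\Gamma g_{1,\Gamma} + \Psi_1^\gamma g_{1,\gamma},\\
\Psi_2 d_2 &= \Psi_2^f f_2 + \Psi_2^\Gamma g_{2,\Gamma} + \Psi_2^\gamma g_{2,\gamma}.
\end{aligned}
\]

Restricting to $I(\gamma)$ and summing, with $g_{1,\gamma} = g_{2,\gamma} = g_\gamma$:

\[ \bigl(\Psi_1^\gamma|_\gamma + \Psi_2^\gamma|_\gamma\bigr)\, g_\gamma = \bigl(-\Psi_1^f f_1 - \Psi_1^\Gamma g_{1,\Gamma} - \Psi_2^f f_2 - \Psi_2^\Gamma g_{2,\Gamma}\bigr)\big|_\gamma. \]

We set

\[ M := -\bigl(\Psi_1^\gamma|_\gamma + \Psi_2^\gamma|_\gamma\bigr), \]

compute $M^{-1}$ and solve for $g_\gamma$:

\[ g_\gamma = M^{-1} \bigl(\Psi_1^f f_1 + \Psi_1^\Gamma g_{1,\Gamma} + \Psi_2^f f_2 + \Psi_2^\Gamma g_{2,\Gamma}\bigr)\big|_\gamma. \]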
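The interface condition can be checked on a one-dimensional model problem. The sketch below is a minimal illustration under assumptions that are not part of the talk: the problem is $-u'' = f$ on $(0,1)$ with constant $f$ and coefficient $\alpha \equiv 1$, linear elements of size `h` are used, the interface is a single node (so $M$ is a scalar), and all function names are hypothetical. The value $g_\gamma$ is chosen so that the boundary residuals $\Psi_1$ and $\Psi_2$ of the two subdomains cancel at the interface node.

```python
import numpy as np

def local_system(n_el, h, f):
    """Stiffness matrix and load vector of one subdomain (no boundary conditions)."""
    A = np.zeros((n_el + 1, n_el + 1))
    b = np.zeros(n_el + 1)
    for e in range(n_el):
        A[e:e + 2, e:e + 2] += np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        b[e:e + 2] += f * h / 2.0        # exact element load for constant f
    return A, b

def boundary_residual(A, b, g_left, g_right):
    """Solve the local Dirichlet problem and return Psi(d) = A u - b.

    The residual vanishes at interior nodes; its first and last entries
    are the values of Psi at the two boundary nodes of the subdomain.
    """
    n = len(b)
    u = np.zeros(n)
    u[0], u[-1] = g_left, g_right
    if n > 2:
        rhs = b[1:-1] - A[1:-1, 0] * g_left - A[1:-1, -1] * g_right
        u[1:-1] = np.linalg.solve(A[1:-1, 1:-1], rhs)
    return A @ u - b

def interface_value(n1, n2, h, f, g0, g1):
    """Find g_gamma such that Psi_1 + Psi_2 = 0 at the interface node."""
    A1, b1 = local_system(n1, h, f)
    A2, b2 = local_system(n2, h, f)

    def psi_sum(g):
        return (boundary_residual(A1, b1, g0, g)[-1]
                + boundary_residual(A2, b2, g, g1)[0])

    # psi_sum is affine in g, so two evaluations determine its root.
    r0, r1 = psi_sum(0.0), psi_sum(1.0)
    return -r0 / (r1 - r0)

# -u'' = 2 on (0, 1) with u(0) = u(1) = 0 has the solution u(x) = x (1 - x).
g_gamma = interface_value(n1=4, n2=4, h=1.0 / 8.0, f=2.0, g0=0.0, g1=0.0)
```

For this model problem the computed `g_gamma` agrees with the exact value $u(0.5) = 0.25$ up to rounding, because linear elements with exact loads are nodally exact in one dimension.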


HDD consists of two algorithms

I. Construction of $\Phi_\nu$ for all $\nu \in T_{T_h}$

1. Compute $\Psi_\nu$ for all leaves of $T_{T_h}$ (matrices in $\mathbb{R}^{3 \times 3}$).

2. Recursion from the leaves to the root (end if $\nu = D$):

(a) Compute $\Psi_\nu$ and $\Phi_\nu$ from $\Psi_1$, $\Psi_2$.

(b) Store $\Phi_\nu$ and delete $\Psi_1$, $\Psi_2$.


II. Application of Φν

1. Given $d_\nu = (f_\nu, g_\nu)$, compute the solution $u_h$ on the interface $\gamma$ by $\Phi_\nu(d_\nu)$.

2. Build the data $d_1 = (f_{\nu_1}, g_{\nu_1})$, $d_2 = (f_{\nu_2}, g_{\nu_2})$ from $d_\nu = (f_\nu, g_\nu)$ and $g_\gamma = \Phi_\nu(d_\nu)$.

3. Repeat for the sons of $\nu_1$ and $\nu_2$.
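Both algorithms can be sketched end to end in one dimension. The code below is a toy under explicit assumptions, not the implementation behind the talk: the problem is $-u'' = f$ on $(0,1)$ with $\alpha \equiv 1$, the load is lumped from nodal values of $f$, each leaf is a single element (so the leaf matrices are $2 \times 2$ rather than $3 \times 3$), every interface is a single node (so $M$ is a scalar), dense NumPy arrays replace H-matrices, and the names `Node`, `build` and `apply_phi` are hypothetical.

```python
import numpy as np

class Node:
    def __init__(self, a, b):
        self.a, self.b = a, b            # nu covers elements a..b-1, nodes a..b
        self.sons = None
        self.phi_g = self.phi_f = None   # Phi_nu, stored for inner tree nodes

def build(node, h):
    """Leaves-to-root pass: return (Psi^g, Psi^f) of node and store Phi."""
    a, b = node.a, node.b
    if b - a == 1:                       # leaf: one element, both nodes on the boundary
        S = np.array([[1.0, -1.0], [-1.0, 1.0]]) / h
        T = -0.5 * h * np.eye(2)         # lumped load, so Psi(d) = S g + T f
        return S, T
    m = (a + b) // 2                     # interface node gamma
    node.sons = (Node(a, m), Node(m, b))
    S1, T1 = build(node.sons[0], h)
    S2, T2 = build(node.sons[1], h)
    E1 = np.zeros((2, b - a + 1)); E1[:, :m - a + 1] = T1   # embed into I(nu)
    E2 = np.zeros((2, b - a + 1)); E2[:, m - a:] = T2
    M = -(S1[1, 1] + S2[0, 0])           # scalar in 1D
    node.phi_g = np.array([S1[1, 0], S2[0, 1]]) / M
    node.phi_f = (E1[1] + E2[0]) / M
    S = np.array([[S1[0, 0], 0.0], [0.0, S2[1, 1]]])
    S[0] += S1[0, 1] * node.phi_g        # eliminate g_gamma from the row of node a
    S[1] += S2[1, 0] * node.phi_g        # and from the row of node b
    T = np.vstack([E1[0] + S1[0, 1] * node.phi_f,
                   E2[1] + S2[1, 0] * node.phi_f])
    return S, T                          # Psi of the sons is not kept

def apply_phi(node, f, g_a, g_b, u):
    """Root-to-leaves pass: fill u with the nodal values of the solution."""
    u[node.a], u[node.b] = g_a, g_b
    if node.sons is None:
        return
    g_m = node.phi_g @ np.array([g_a, g_b]) + node.phi_f @ f[node.a:node.b + 1]
    apply_phi(node.sons[0], f, g_a, g_m, u)
    apply_phi(node.sons[1], f, g_m, g_b, u)

n = 8
h = 1.0 / n
root = Node(0, n)
build(root, h)                           # algorithm I, done once
f = np.full(n + 1, 2.0)                  # nodal values of f = 2
u = np.zeros(n + 1)
apply_phi(root, f, 0.0, 0.0, u)          # algorithm II, repeated per right-hand side
x = np.linspace(0.0, 1.0, n + 1)
```

For $f = 2$ and homogeneous boundary values the nodal result coincides with $x(1 - x)$. Only `apply_phi` has to be rerun for a new right-hand side or new boundary data, which is the reason for splitting HDD into the two passes.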


HDD in the H-matrix arithmetic

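The system of linear equations for $\nu \in T_{T_h}$ is $Au = Fc$.

Rewrite it in block matrix form:

\[ \begin{pmatrix} A_{BB} & A_{BI}\\ A_{IB} & A_{II} \end{pmatrix} \begin{pmatrix} u_B\\ u_I \end{pmatrix} = \begin{pmatrix} F_B\\ F_I \end{pmatrix} c, \]

where $u_B \in \mathbb{R}^{I(\partial\nu)}$, $u_I \in \mathbb{R}^{I(\gamma)}$, $c \in \mathbb{R}^{I(\nu)}$,

$A_{BB} : \mathbb{R}^{I(\partial\nu)} \to \mathbb{R}^{I(\partial\nu)}$, $A_{II} : \mathbb{R}^{I(\gamma)} \to \mathbb{R}^{I(\gamma)}$.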


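Eliminate $u_I$:

\[ \begin{pmatrix} A_{BB} - A_{BI} A_{II}^{-1} A_{IB} & 0\\ A_{IB} & A_{II} \end{pmatrix} \begin{pmatrix} u_B\\ u_I \end{pmatrix} = \begin{pmatrix} F_B - A_{BI} A_{II}^{-1} F_I\\ F_I \end{pmatrix} c. \]

\[ (A_{BB} - A_{BI} A_{II}^{-1} A_{IB})\, u_B = (F_B - A_{BI} A_{II}^{-1} F_I)\, c, \]

\[ u_I = A_{II}^{-1} F_I c - A_{II}^{-1} A_{IB} u_B, \]

\[ \Psi_\nu^g := A_{BB} - A_{BI} A_{II}^{-1} A_{IB} \quad \text{(Schur complement)}, \qquad \Psi_\nu^f := F_B - A_{BI} A_{II}^{-1} F_I, \]

\[ \Phi_\nu^g := A_{II}^{-1} A_{IB}, \qquad \Phi_\nu^f := A_{II}^{-1} F_I. \]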

Exact HDD requires expensive matrix arithmetic.

Apply the H-matrix techniques.
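The block elimination can be checked numerically with dense matrices before any H-matrix approximation enters. The sketch below uses assumptions that are not from the talk: `A` is a random symmetric positive definite stand-in for the stiffness matrix, `F`, `c` and `u_B` are random, and the block sizes `nB`, `nI` are hypothetical. It follows the sign convention of the slide, $u_I = \Phi_\nu^f c - \Phi_\nu^g u_B$.

```python
import numpy as np

rng = np.random.default_rng(1)
nB, nI = 4, 6                            # hypothetical sizes of I(boundary of nu) and I(gamma)
n = nB + nI
X = rng.standard_normal((n, n))
A = X @ X.T + n * np.eye(n)              # random SPD stand-in for the stiffness matrix
F = rng.standard_normal((n, n))          # stand-in for the right-hand side operator
c = rng.standard_normal(n)
u_B = rng.standard_normal(nB)            # given values on the boundary

A_BB, A_BI = A[:nB, :nB], A[:nB, nB:]
A_IB, A_II = A[nB:, :nB], A[nB:, nB:]
F_B, F_I = F[:nB], F[nB:]

# The four mappings, using solves instead of an explicit inverse of A_II.
Phi_g = np.linalg.solve(A_II, A_IB)      # A_II^{-1} A_IB
Phi_f = np.linalg.solve(A_II, F_I)       # A_II^{-1} F_I
Psi_g = A_BB - A_BI @ Phi_g              # Schur complement
Psi_f = F_B - A_BI @ Phi_f

# u_I from the second block row: A_IB u_B + A_II u_I = F_I c.
u_I = Phi_f @ c - Phi_g @ u_B
u_I_direct = np.linalg.solve(A_II, F_I @ c - A_IB @ u_B)

# Residual of the first block row, expressed through Psi.
row_B = A_BB @ u_B + A_BI @ u_I - F_B @ c
```

In exact arithmetic `row_B` equals `Psi_g @ u_B - Psi_f @ c`, which is the sense in which $\Psi_\nu$ collects the boundary residual after $u_I$ has been eliminated.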


H-matrices (Hackbusch ’99)

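Rank-k matrices

$R \in \mathbb{R}^{n \times m}$, $A \in \mathbb{R}^{n \times k}$, $B \in \mathbb{R}^{m \times k}$, $k \ll \min(n, m)$.

The storage of $R = AB^T$ in factored form is $k(n + m)$ instead of $n \cdot m$ for $R$ represented in the full matrix format.

[Figure: $R$ ($n \times m$) $=$ $A$ ($n \times k$) $\cdot$ $B^T$ ($k \times m$).]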
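The storage count can be made concrete with a few lines of NumPy. This is a minimal sketch with hypothetical sizes `n`, `m`, `k` and random factors; it only illustrates the rank-k format itself, not the hierarchical block partitioning of an H-matrix.

```python
import numpy as np

n, m, k = 200, 150, 5                    # hypothetical sizes with k << min(n, m)
rng = np.random.default_rng(2)
A = rng.standard_normal((n, k))
B = rng.standard_normal((m, k))
x = rng.standard_normal(m)

R_full = A @ B.T                         # full format: n * m entries
y_full = R_full @ x                      # O(n m) operations
y_fact = A @ (B.T @ x)                   # O(k (n + m)) operations, R is never formed

storage_full = n * m                     # entries in the full format
storage_fact = k * (n + m)               # entries in the factored format
```

The matrix-vector product through the factors gives the same result as the full product while touching only the $k(n + m)$ stored entries.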


[Figure: block structure of an H-matrix approximation to $\Psi_\nu^g$, $k \le 12$.]


Let $n := \max(|I|, |J|)$, let $d = 1, 2, 3$ be the spatial dimension, $q$ the number of processors and $k \ll n$ a maximal rank.

Operation          Sequential complexity         Parallel complexity
                   (Hackbusch et al. ’99-’06)    (Kriemann ’05)
storage(M)         N = O(k n log n)              N/q
Mx                 N = O(k n log n)              N/q
M1 ⊕ M2            N = O(k^2 n log n)            N/q
M1 ⊙ M2, M^{-1}    N = O(k^2 n log^2 n)          N/q + O(n)
H-LU               N = O(k^2 n log^2 n)          N/q + O(k^2 n log^2 n / n^{1/d})


Computational resources for ν ∈ TTh

Lemma 2: Let $\nu \in T_{T_h}$, $n_\nu := |I(\nu)|$, and let $\sqrt{n_\nu}$ be the number of dofs on the interface. Then the storage costs and computational complexities of $\Psi_\nu^g$, $\Psi_\nu^f$, $\Phi_\nu^g$, $\Phi_\nu^f$ are as shown in the table.

         Storage                   Comput. complexity          Application
Ψ^g_ν    O(k √n_ν log √n_ν)*       O(k^2 √n_ν log^2 √n_ν)      -
Ψ^f_ν    O(k n_ν log n_ν)*         O(k^2 n_ν log^2 n_ν)        -
Φ^g_ν    O(k √n_ν)                 -                           O(k √n_ν)
Φ^f_ν    O(k n_ν log n_ν)          -                           O(k n_ν log n_ν)

Lemma 3: The total storage cost of HDD is $O(k n_h \log^2 n_h)$ and the total complexity is $O(k^2 n_h \log^3 n_h)$.


HDD with fH ∈ VH ⊂ Vh

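Given: $h \ll H$, $f_H \in V_H \subset V_h$, and the mappings

\[ \Psi_\nu^f : \mathbb{R}^{I(\nu_h)} \to \mathbb{R}^{I(\partial\nu_h)}, \qquad \Phi_\nu^f : \mathbb{R}^{I(\nu_h)} \to \mathbb{R}^{I(\gamma_h)}, \]

we want to build

\[ \Psi_\nu^f : \mathbb{R}^{I(\nu_H)} \to \mathbb{R}^{I(\partial\nu_h)}, \qquad \Phi_\nu^f : \mathbb{R}^{I(\nu_H)} \to \mathbb{R}^{I(\gamma_h)}. \]

[Figure: the mapping $\Phi_\nu^f$ acting on coarse data (column dimension $H$) equals the mapping $\Phi_\nu^f$ acting on fine data (column dimension $h$) multiplied by $P_\nu^{h \leftarrow H}$.]

Lemma: The total storage cost of HDD is $O(k \sqrt{n_h n_H} \log^2 \sqrt{n_h n_H})$ and the total complexity is $O(k^2 \sqrt{n_h n_H} \log^3 \sqrt{n_h n_H})$.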


HDD with truncation of the small scales:

[Figure: (left) the domain decomposition tree $T_{T_h}$ over $D$, split at the scale $H$ into $T_{T_h}^{\ge H}$ and $T_{T_h}^{<H}$, with the label "mean value"; (right) $2\sqrt{n_h n_H}$ dofs.]

Application: Multiscale problems (e.g. the skin problem, porous

medium).

Use the microscopic model to extract all microscale details and then

compute the macroscale behaviour.


Truncation of the scales < H

Memory costs of all $\Phi_\nu^g$ (in kB). The maximal rank is $k = 7$.

dofs      Φ^g, H = h     Φ^g, H = 0.125
33^2      2.45 * 10^2    2 * 10^2
65^2      1.1 * 10^3     7.9 * 10^2
129^2     5 * 10^3       2.6 * 10^3
257^2     2.1 * 10^4     7.4 * 10^3


Memory costs of all $\Phi_\nu^f$ (in kB). The maximal rank is $k = 7$.

dofs      Φ^f, H = h     Φ^f, H = 0.125
33^2      4 * 10^2       2.8 * 10^2
65^2      2.4 * 10^3     1.8 * 10^3
129^2     1.4 * 10^4     1.2 * 10^4
257^2     7.9 * 10^4     6.9 * 10^4


The mean value of the solution in ν

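Lemma 4: Let $\nu, \nu_1, \nu_2 \in T_{T_h}$ and $\nu = \nu_1 \cup \nu_2$. Let

\[ \lambda_{\nu_i}(d_{\nu_i}) = (\lambda_{\nu_i}^g, g_{\nu_i}) + (\lambda_{\nu_i}^f, f_{\nu_i}) \]

compute the mean value in $\nu_i$, $i = 1, 2$. Then

\[ \lambda_\nu(d_\nu) = (\lambda_\nu^f, f_\nu) + (\lambda_\nu^g, g_\nu) \]

computes the mean value in $\nu$. Here

\[ \lambda_\nu^f : \mathbb{R}^{I(\nu)} \to \mathbb{R}, \quad f_\nu \in \mathbb{R}^{I(\nu)}, \qquad \lambda_\nu^g : \mathbb{R}^{I(\partial\nu)} \to \mathbb{R}, \quad g_\nu \in \mathbb{R}^{I(\partial\nu)}, \]

\[ \lambda_\nu^f = c_1 \lambda_{\nu_1}^f + c_2 \lambda_{\nu_2}^f, \qquad \lambda_\nu^g = c_1 \lambda_{\nu_1}^g + c_2 \lambda_{\nu_2}^g, \]

and $g_\nu$ is built from $g_{\nu_1}$, $g_{\nu_2}$ and $g|_\gamma := \Phi_\nu(d_\nu)$.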


Many right-hand sides

The skin problem with highly oscillatory coefficients.

$K u^{(\alpha)} = f^{(\alpha)}$, $K \in \mathbb{R}^{129^2 \times 129^2}$.

“Leaves to Root” ⇒ $t_1$,

“Root to Leaves” ⇒ $t_2$.

|J|     t_1 + t_2, sec    t_cg, sec
10      38 + 2.8          29
100     38 + 27           117
1000    38 + 240          1048

The total computational times of HDD and of CG with an H-Cholesky preconditioner for |J| right-hand sides.
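The trade-off in the table can be read off with a little arithmetic. The snippet below only recombines the timings listed above (no new measurements); the variable names are hypothetical. It adds the one-off cost $t_1$ to the per-batch cost $t_2$ and compares the sum with the CG time.

```python
# (|J|, t1, t2, t_cg) in seconds, copied from the table above.
timings = [
    (10, 38.0, 2.8, 29.0),
    (100, 38.0, 27.0, 117.0),
    (1000, 38.0, 240.0, 1048.0),
]

for n_rhs, t1, t2, t_cg in timings:
    t_hdd = t1 + t2                      # total HDD time for this batch
    print(f"|J| = {n_rhs:4d}: HDD {t_hdd:6.1f} s, CG {t_cg:6.1f} s, "
          f"CG/HDD = {t_cg / t_hdd:.2f}")

# Ratio of CG time to total HDD time for each row of the table.
speedups = [t_cg / (t1 + t2) for _, t1, t2, t_cg in timings]
```

For 10 right-hand sides the setup cost $t_1$ dominates and CG is faster; for 100 and 1000 right-hand sides the ratio exceeds one, since $t_1$ is paid only once.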


Conclusion:

1. HDD computes $F_h$, $G_h$ and $u_h(\cdot,\omega) = F_h f_h(\cdot,\omega) + G_h g_h(\cdot,\omega)$.

2. $F_h$ and $G_h$ are successfully approximated in the H-matrix format.

3. The storage requirement is $O(k n_h \log^2 n_h)$.

4. The complexity is $O(k^2 |J| n_h \log^3 n_h)$.

5. HDD allows computing functionals of the solution.

6. HDD parallelizes well and needs only a small data exchange.


To do:

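1) For the problem

\[
\begin{aligned}
\sum_{1 \le i,j \le 2} \frac{\partial}{\partial x_i}\, \alpha_{i,j}(x,\omega)\, \frac{\partial}{\partial x_j} u &= f(x,\omega), && x \in D,\ \omega \in \Omega,\\
u &= g(x,\omega), && x \in \partial D,\ \omega \in \Omega,
\end{aligned}
\]

we would like to get

\[ u = \Bigl[ \sum_{\gamma} \Delta_\gamma \otimes B_\gamma \Bigr] f + \Bigl[ \sum_{\beta} \Lambda_\beta \otimes C_\beta \Bigr] g, \]

where $\Delta_\gamma$, $\Lambda_\beta$ are the stochastic parts and $B_\gamma$, $C_\beta$ are the deterministic parts.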

2) functionals of the solution u for nonlinear problems.


Thanks for your attention!

Questions?
