Iterative Linear Solvers for Sparse Matrices
Kengo Nakajima
Information Technology Center, The University of Tokyo, Japan
2013 International Summer School on HPC Challenges in Computational Sciences
New York University, New York, NY
June 24-28, 2013
TOC
• Sparse Matrices
• Iterative Linear Solvers
  − Preconditioning
  − Parallel Iterative Linear Solvers
  − Multigrid Method
  − Recent Technical Issues
• Example of Parallel MGCG
• ppOpen-HPC
ISS-2013
Parallel Iterative Solvers
• Both convergence (robustness) and efficiency (single/parallel) are important
• Global communications needed
  – Mat-Vec (P2P communications, MPI_Isend/Irecv/Waitall): Local Data Structure with HALO, effect of latency
  – Dot-Products (MPI_Allreduce)
  – Preconditioning (up to algorithm)
• Remedy for Robust Parallel ILU Preconditioner
  – Additive Schwarz Domain Decomposition
  – HID (Hierarchical Interface Decomposition, based on global nested dissection) [Henon & Saad 2007], ext. HID [KN 2010]
• Parallel "Direct" Solvers (e.g. SuperLU, MUMPS etc.)
Around the multigrid in a single slide
• Multigrid is a scalable method for solving linear equations.
• Relaxation methods (smoother/smoothing operator in MG world) such as Gauss-Seidel efficiently damp high-frequency error but do not eliminate low-frequency error.
• The multigrid approach was developed in recognition that this low-frequency error can be accurately and efficiently solved on a coarser grid.
• Multigrid method uniformly damps all frequencies of error components with a computational cost that depends only linearly on the problem size (= scalable).
  – Good for large-scale computations
• Multigrid is also a good preconditioning algorithm for Krylov iterative solvers.
Convergence of Gauss-Seidel & SOR
[Figure: RESIDUAL vs. ITERATION#]
• Rapid Convergence (high-frequency error: short wavelength)
• Slow Convergence (low-frequency error: long wavelength)
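The two convergence regimes can be reproduced with a small experiment. The following is a minimal Fortran sketch, assuming a 1D model problem that is not taken from the slides (stencil [-1 2 -1], zero Dirichlet boundaries, zero right-hand side, so the iterate itself is the error). The module and routine names are illustrative only.

```fortran
module gs_smoothing_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! Apply nsweep Gauss-Seidel sweeps to -u'' = 0 on a grid e(0:n) with
  ! e(0) = e(n) = 0. The exact solution is zero, so e is the error.
  subroutine gs_sweeps(e, nsweep)
    real(dp), intent(inout) :: e(0:)
    integer,  intent(in)    :: nsweep
    integer :: n, i, s
    n = ubound(e, 1)
    do s = 1, nsweep
      do i = 1, n - 1
        e(i) = 0.5_dp*(e(i-1) + e(i+1))
      end do
    end do
  end subroutine gs_sweeps

  ! Max-norm of the error left after nsweep sweeps when the initial error
  ! is the sine mode sin(k*pi*i/n) (k small: smooth, k large: oscillatory).
  function remaining_error(n, k, nsweep) result(emax)
    integer, intent(in) :: n, k, nsweep
    real(dp) :: emax
    real(dp) :: e(0:n)
    real(dp), parameter :: pi = 3.141592653589793_dp
    integer :: i
    do i = 0, n
      e(i) = sin(k*pi*i/n)
    end do
    e(0) = 0.0_dp
    e(n) = 0.0_dp
    call gs_sweeps(e, nsweep)
    emax = maxval(abs(e))
  end function remaining_error
end module gs_smoothing_demo
```

Comparing `remaining_error(64, 48, 10)` with `remaining_error(64, 1, 10)` should show the oscillatory mode almost gone while the smooth mode is nearly untouched, which is the behavior the two plots describe.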
Multigrid is scalable
Weak Scaling: Problem Size/Core Fixed, for 3D Poisson Eqn's ($\nabla^2 \phi = q$)
MGCG = Conjugate Gradient with Multigrid Preconditioning
[Figure: Iterations (0 to 3000) vs. DOF (1.E+06 to 1.E+08) for ICCG and MGCG; additional tick labels 16, 32, 64, 128]
Comp. time of MGCG for weak scaling is constant: => scalable
Procedure of Multigrid (1/3)
Multigrid is a scalable method for solving linear equations. Relaxation methods such as Gauss-Seidel efficiently damp high-frequency error but do not eliminate low-frequency error. The multigrid approach was developed in recognition that this low-frequency error can be accurately and efficiently solved on a coarser grid. This concept is explained here in the following simple 2-level method. If we have obtained the following linear system on a fine grid:
  $A_F u_F = f$
and $A_C$ as the discrete form of the operator on the coarse grid, a simple coarse grid correction can be given by:
  $u_F^{(i+1)} = u_F^{(i)} + R^T A_C^{-1} R \, ( f - A_F u_F^{(i)} )$
where $R^T$ is the matrix representation of linear interpolation from the coarse grid to the fine grid (prolongation operator) and $R$ is called the restriction operator. Thus, it is possible to calculate the residual on the fine grid, solve the coarse grid problem, and interpolate the coarse grid solution on the fine grid.
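A short derivation may help to see why the coarse grid correction removes smooth error. It is a sketch under two assumptions that the slide does not state: $u_F$ denotes the exact fine-grid solution, and the coarse operator is the Galerkin product $A_C = R A_F R^T$.

```latex
\begin{align*}
e^{(i)} &= u_F - u_F^{(i)}, \qquad f - A_F u_F^{(i)} = A_F e^{(i)} \\
e^{(i+1)} &= e^{(i)} - R^T A_C^{-1} R A_F e^{(i)}
           = \left( I - R^T A_C^{-1} R A_F \right) e^{(i)} \\
\text{if } e^{(i)} = R^T e_C \text{ and } A_C = R A_F R^T: \quad
e^{(i+1)} &= R^T e_C - R^T A_C^{-1} \left( R A_F R^T \right) e_C = 0
\end{align*}
```

Under these assumptions, any error that can be represented by interpolation from the coarse grid is eliminated in one correction step; the remaining oscillatory error is what the relaxation method has to damp.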
Procedure of Multigrid (2/3)
This process can be described as follows:
1. Relax the equations on the fine grid and obtain the result $u_F^{(i)} = S_F(A_F, f)$. This operator $S_F$ (e.g., Gauss-Seidel) is called the smoothing operator (or smoother).
2. Calculate the residual term on the fine grid by $r_F = f - A_F u_F^{(i)}$.
3. Restrict the residual term on to the coarse grid by $r_C = R \, r_F$.
4. Solve the equation $A_C u_C = r_C$ on the coarse grid; the accuracy of the solution on the coarse grid affects the convergence of the entire multigrid system.
5. Interpolate (or prolong) the coarse grid correction on the fine grid by $\Delta u_F^{(i)} = R^T u_C$.
6. Update the solution on the fine grid by $u_F^{(i+1)} = u_F^{(i)} + \Delta u_F^{(i)}$.
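The six steps can be sketched for a concrete case. The Fortran module below is a minimal illustration, not the author's implementation. It assumes a 1D Poisson problem $-u'' = f$ on the unit interval with zero Dirichlet boundaries and an even number of intervals, a restriction $R$ with rows (1/2, 1, 1/2) so that $R^T$ is linear interpolation, the Galerkin coarse operator $A_C = R A_F R^T$, and an exact tridiagonal coarse solve. All names are hypothetical.

```fortran
module two_grid_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! r = f - A_F u for A_F = tridiag(-1, 2, -1)/h**2 with zero Dirichlet ends.
  subroutine residual(u, f, h, r)
    real(dp), intent(in)  :: u(0:), f(0:), h
    real(dp), intent(out) :: r(0:)
    integer :: i
    r = 0.0_dp
    do i = 1, ubound(u, 1) - 1
      r(i) = f(i) - (2.0_dp*u(i) - u(i-1) - u(i+1))/h**2
    end do
  end subroutine residual

  ! One two-grid cycle on u(0:n), n even; rnorm is the max-norm of the
  ! fine-grid residual after the update.
  subroutine two_grid_cycle(u, f, nsmooth, rnorm)
    real(dp), intent(inout) :: u(0:)
    real(dp), intent(in)    :: f(0:)
    integer,  intent(in)    :: nsmooth
    real(dp), intent(out)   :: rnorm
    real(dp), allocatable   :: r(:), rc(:), uc(:), cp(:)
    real(dp) :: h, dgn
    integer  :: n, nc, i, j, s

    n = ubound(u, 1)
    nc = n/2
    h = 1.0_dp/n
    allocate(r(0:n), rc(0:nc), uc(0:nc), cp(0:nc))

    ! Step 1: relax on the fine grid (Gauss-Seidel as the smoother S_F).
    do s = 1, nsmooth
      do i = 1, n - 1
        u(i) = 0.5_dp*(u(i-1) + u(i+1) + h**2*f(i))
      end do
    end do

    ! Step 2: fine-grid residual r_F = f - A_F u_F.
    call residual(u, f, h, r)

    ! Step 3: restrict, r_C = R r_F.
    rc = 0.0_dp
    do j = 1, nc - 1
      rc(j) = 0.5_dp*r(2*j-1) + r(2*j) + 0.5_dp*r(2*j+1)
    end do

    ! Step 4: solve A_C u_C = r_C with the Thomas algorithm, where
    ! A_C = R A_F R^T = tridiag(-1, 2, -1)/(2 h**2).
    uc = 0.0_dp
    cp = 0.0_dp
    do j = 1, nc - 1
      dgn   = 2.0_dp + cp(j-1)
      cp(j) = -1.0_dp/dgn
      uc(j) = (2.0_dp*h**2*rc(j) + uc(j-1))/dgn
    end do
    do j = nc - 2, 1, -1
      uc(j) = uc(j) - cp(j)*uc(j+1)
    end do

    ! Steps 5 and 6: prolong the correction (R^T u_C) and update u_F.
    do j = 1, nc - 1
      u(2*j) = u(2*j) + uc(j)
    end do
    do j = 0, nc - 1
      u(2*j+1) = u(2*j+1) + 0.5_dp*(uc(j) + uc(j+1))
    end do

    call residual(u, f, h, r)
    rnorm = maxval(abs(r))
  end subroutine two_grid_cycle
end module two_grid_demo
```

Calling `two_grid_cycle` repeatedly on the same `u` plays the role of the iteration index $i$; replacing the exact coarse solve in step 4 by a recursive call would turn the sketch into a multilevel cycle.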
[Figure: two-level multigrid procedure between the fine and the coarse level]
  $w_1^k$: Approx. Solution
  $v_k$: Correction
  $I_k^{k-1}$: Restriction Operator
  $I_{k-1}^{k}$: Prolongation Operator
  $w_2^k$: Approx. Solution by Multigrid
  $L_k W_k = F_k$ (Linear Equation: Fine Level)
  $R_k = F_k - L_k w_1^k$
  $v_k = W_k - w_1^k$, $L_k v_k = R_k$
  $R_{k-1} = I_k^{k-1} R_k$
  $L_{k-1} v_{k-1} = R_{k-1}$ (Linear Equation: Coarse Level)
  $v_k = I_{k-1}^{k} v_{k-1}$
  $w_2^k = w_1^k + v_k$
Procedure of Multigrid (3/3)
• Recursive application of this algorithm for 2-level procedure to consecutive systems of coarse-grid equations gives a multigrid V-cycle. If the components of the V-cycle are defined appropriately, the result is a method that uniformly damps all frequencies of error with a computational cost that depends only linearly on the problem size.
  − In other words, multigrid algorithms are scalable.
• In the V-cycle, starting with the finest grid, all subsequent coarser grids are visited only once.
  − In the down-cycle, smoothers damp oscillatory error components at different grid scales.
  − In the up-cycle, the smooth error components remaining on each grid level are corrected using the error approximations on the coarser grids.
• Alternatively, in a W-cycle, the coarser grids are solved more rigorously in order to reduce residuals as much as possible before going back to the more expensive finer grids.
[Figure: (a) V-Cycle and (b) W-Cycle, each drawn between the fine and the coarse levels]
Multigrid as a Preconditioner
• Multigrid algorithms tend to be problem-specific solutions and less robust than preconditioned Krylov iterative methods such as the IC/ILU methods.
• Fortunately, it is easy to combine the best features of multigrid and Krylov iterative methods into one algorithm
  − multigrid-preconditioned Krylov iterative methods.
• The resulting algorithm is robust, efficient and scalable.
• Multigrid solvers and Krylov iterative solvers preconditioned by multigrid are intrinsically suitable for parallel computing.
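As an illustration of where multigrid enters a Krylov method, the standard preconditioned conjugate gradient iteration for $A x = b$ is written out below. The notation ($M$, $z_k$, $p_k$) is the usual textbook one and is not taken from the slides; the assumption here is that applying $M^{-1}$ means running one multigrid cycle on the current residual.

```latex
\begin{align*}
& r_0 = b - A x_0, \quad z_0 = M^{-1} r_0, \quad p_0 = z_0 \\
& \text{for } k = 0, 1, 2, \dots: \\
& \qquad \alpha_k = \frac{r_k^T z_k}{p_k^T A p_k} \\
& \qquad x_{k+1} = x_k + \alpha_k p_k, \qquad r_{k+1} = r_k - \alpha_k A p_k \\
& \qquad z_{k+1} = M^{-1} r_{k+1}
  \quad \text{(one multigrid cycle applied to } A z = r_{k+1} \text{ from } z = 0\text{)} \\
& \qquad \beta_k = \frac{r_{k+1}^T z_{k+1}}{r_k^T z_k}, \qquad p_{k+1} = z_{k+1} + \beta_k p_k
\end{align*}
```

For conjugate gradients the preconditioner should act as a symmetric positive definite operator, which is why a multigrid cycle used this way is normally arranged symmetrically (for example, matching pre- and post-smoothing).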
Geometric and Algebraic Multigrid
• One of the most important issues in multigrid is the construction of the coarse grids.
• There are 2 basic multigrid approaches
  − geometric and algebraic
• In geometric multigrid, the geometry of the problem is used to define the various multigrid components.
• In contrast, algebraic multigrid methods use only the information available in the linear system of equations, such as matrix connectivity.
• Algebraic multigrid method (AMG) is suitable for applications with unstructured grids.
• Many tools for both geometric and algebraic methods on unstructured grids have been developed.
“Dark Side” of Multigrid Method
• Its performance is excellent for well-conditioned simple problems, such as homogeneous Poisson equations.
• But convergence could be worse for ill-conditioned problems.
• Extension of applicability of multigrid method is an active research area.
References
• Briggs, W.L., Henson, V.E. and McCormick, S.F. (2000) A Multigrid Tutorial, Second Edition, SIAM
• Trottenberg, U., Oosterlee, C. and Schüller, A. (2001) Multigrid, Academic Press
• https://computation.llnl.gov/casc/
• Hypre (AMG Library)
  – https://computation.llnl.gov/casc/linear_solvers/sls_hypre.html
Key-Issues for Appl's/Algorithms towards Post-Peta & Exa Computing
Jack Dongarra (ORNL/U. Tennessee) at ISC 2013
• Hybrid/Heterogeneous Architecture
  – Multicore + GPU/Manycores (Intel MIC/Xeon Phi)
    • Data Movement, Hierarchy of Memory
• Communication/Synchronization Reducing Algorithms
• Mixed Precision Computation
• Auto-Tuning/Self-Adapting
• Fault Resilient Algorithms
• Reproducibility of Results
Recent Technical Issues in Parallel Iterative Solvers
• Communication overhead becomes significant
• Communication-Computation Overlap
  – Not so effective for Mat-Vec operations
• Communication Avoiding/Reducing Algorithms
• OpenMP/MPI Hybrid Parallel Programming Model
  – (Next section)
Communication overhead becomes larger as node/core number increases
Weak Scaling: MGCG on T2K Tokyo
[Figure: share of time (0% to 100%) for Comm. and Comp. vs. core # (64, 128, 256, 512, 1024, 2048, 4096, 6144, 8192)]
Comm.-Comp. Overlapping
[Figure: Internal Meshes, External (HALO) Meshes, Internal Meshes on Boundaries]
Mat-Vec operations
• Overlapping of computations of internal meshes, and importing external meshes.
• Then computation of internal meshes on boundaries
• Difficult for IC/ILU on Hybrid
Communication Avoiding/Reducing Algorithms for Sparse Linear Solvers
• Krylov Iterative Method without Preconditioning
  – Demmel, Hoemmen, Mohiyuddin etc. (UC Berkeley)
• s-step method
  – Just one P2P communication for each Mat-Vec during s iterations. Convergence becomes unstable for large s.
  – matrix powers kernel: $Ax, A^2x, A^3x, \dots$
    • additional computations needed
• Communication Avoiding ILU0 (CA-ILU0) [Moufawad & Grigori, 2013]
  – First attempt to CA preconditioning
  – Nested dissection reordering for limited geometries (2D FDM)
Comm. Avoiding Krylov Iterative Methods using “Matrix Powers Kernel”
Avoiding Communication in Sparse Matrix Computations. James Demmel, Mark Hoemmen, Marghoob Mohiyuddin, and Katherine Yelick, 2008 IPDPS
Required Information of Local Meshes for s-step CA computations (2D 5pt.)
[Figure: panels s=1 (original), s=2, s=3]
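The growth of the required halo with s can be estimated by counting cells. The block below is a rough sketch under assumptions that are not in the slides: a local block of $m \times m$ cells with neighbors on all four sides, a 5-point stencil, and a halo made of all external cells within graph distance $s$ of the block.

```latex
\begin{align*}
\text{standard: } & x \to Ax \to A^2 x \to \dots \to A^s x
  \quad (s \text{ halo exchanges, halo depth } 1) \\
\text{matrix powers kernel: } & [\,Ax,\ A^2x,\ \dots,\ A^s x\,]
  \quad (1 \text{ halo exchange, halo depth } s) \\
\text{external cells needed: } & \sum_{d=1}^{s} \bigl( 4m + 4(d-1) \bigr) = 4ms + 2s(s-1)
\end{align*}
```

With this count the halo holds $4m$ cells for s=1, $8m+4$ for s=2 and $12m+12$ for s=3, and the intermediate products on those external cells have to be recomputed locally, which is the additional computation the s-step approach pays for fewer messages.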
Motivation of This Study
• Large-scale 3D Groundwater Flow
  – Poisson equations
  – Heterogeneous porous media
• Parallel (Geometric) Multigrid Solvers for FVM-type appl. on Fujitsu PRIMEHPC FX10 at University of Tokyo (Oakleaf-FX)
• Flat MPI vs. Hybrid (OpenMP+MPI)
• Expectations for Hybrid Parallel Programming Model
  – Number of MPI processes (and sub-domains) to be reduced
  – $O(10^8$-$10^9)$-way MPI might not scale in Exascale Systems
  – Easily extended to Heterogeneous Architectures
    • CPU+GPU, CPU+Manycores (e.g. Intel MIC/Xeon Phi)
    • MPI+X: OpenMP, OpenACC, CUDA, OpenCL
Target Application: pGW3D-FVM
• 3D Groundwater Flow via Heterogeneous Porous Media
  – Poisson's equation: $\nabla \cdot ( \lambda(x,y,z) \nabla \phi ) = q$, with $\phi = 0$ at $z = z_{max}$
  – Randomly distributed water conductivity
  – Distribution of water conductivity is defined through methods in geostatistics [Deutsch & Journel, 1998]
• Finite-Volume Method on Cubic Voxel Mesh
• Distribution of Water Conductivity
  – $10^{-5}$ to $10^{+5}$, Condition Number ~ $10^{+10}$
  – Average: 1.0
• Cyclic Distribution: $128^3$
Keywords
• Parallel Geometric Multigrid
• OpenMP/MPI Hybrid Parallel Programming Model
• Localized Block Jacobi Preconditioning
  − Overlapped Additive Schwarz Domain Decomposition (ASDD)
• OpenMP Parallelization with Coloring
• Coarse Grid Aggregation (CGA), Hierarchical CGA
Flat MPI vs. Hybrid
Hybrid: Hierarchical Structure
Flat-MPI: Each Core -> Independent
[Figure: groups of four cores attached to one memory, drawn for the Hybrid and for the Flat-MPI configuration]
Fujitsu PRIMEHPC FX10 (Oakleaf-FX) at the U. Tokyo
• SPARC64 IXfx (4,800 nodes, 76,800 cores)
• Commercial version of K computer
• Peak: 1.13 PFLOPS (1.043 PF, 26th in the 41st TOP500 list, June 2013)
• Memory BWTH 398 TB/sec.
Multigrid
• Scalable Multi-Level Method using Multilevel Grid for Solving Linear Eqn's
  – Computation Time ~ O(N) (N: # unknowns)
  – Good for large-scale problems
• Preconditioner for Krylov Iterative Linear Solvers
  – MGCG
[Figure: computation time (sec., 0 to 400) vs. problem size (DOF, 1.E+06 to 1.E+08) for ICCG and MGCG]
Linear Solvers
• Preconditioned CG Method
  – Multigrid Preconditioning (MGCG)
  – IC(0) for Smoothing Operator (Smoother): good for ill-conditioned problems
• Parallel Geometric Multigrid Method
  – 8 fine meshes (children) form 1 coarse mesh (parent) in isotropic manner (octree)
  – V-cycle
  – Domain-Decomposition-based: Localized Block-Jacobi, Overlapped Additive Schwarz Domain Decomposition (ASDD)
  – Operations using a single core at the coarsest level (redundant)
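Overlapped Additive Schwarz Domain Decomposition Method
ASDD: Localized Block-Jacobi Precond. is stabilized
Global Operation:
  $M z = r$
Local Operation:
  $z_{\Omega_1} = M_{\Omega_1}^{-1} r_{\Omega_1}$, $z_{\Omega_2} = M_{\Omega_2}^{-1} r_{\Omega_2}$
Global Nesting Correction:
  $z_{\Omega_1}^{n} = z_{\Omega_1}^{n-1} + M_{\Omega_1}^{-1} ( r_{\Omega_1} - M_{\Omega_1} z_{\Omega_1}^{n-1} - M_{\Gamma_1} z_{\Gamma_1}^{n-1} )$
  $z_{\Omega_2}^{n} = z_{\Omega_2}^{n-1} + M_{\Omega_2}^{-1} ( r_{\Omega_2} - M_{\Omega_2} z_{\Omega_2}^{n-1} - M_{\Gamma_2} z_{\Gamma_2}^{n-1} )$
$\Omega_i$: Internal ($i \le N$), $\Gamma_i$: External ($i > N$)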
Computations on Fujitsu FX10
• Fujitsu PRIMEHPC FX10 at U. Tokyo (Oakleaf-FX)
  – 16 cores/node, flat/uniform access to memory
• Up to 4,096 nodes (65,536 cores) (Large-Scale HPC Challenge)
  – Max 17,179,869,184 unknowns
  – Flat MPI, HB 4x4, HB 8x2, HB 16x1
    • HB MxN: M-threads x N-MPI-processes on each node
• Weak Scaling
  – $64^3$ cells/core
• Strong Scaling
  – $128^3 \times 8$ = 16,777,216 unknowns, from 8 to 4,096 nodes
• Network Topology is not specified
  – 1D
[Figure: FX10 node with 16 cores (C), each with its own L1 cache, a shared L2 cache, and Memory]
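The problem sizes quoted above follow from simple powers of two. The arithmetic is spelled out here as a check; the last line (cells per core in the strong-scaling case) is derived from the quoted figures and is not stated on the slide.

```latex
\begin{align*}
\text{weak scaling: } & 64^3 = 2^{18} = 262{,}144 \text{ cells/core}, \\
 & 4{,}096 \text{ nodes} \times 16 \text{ cores/node} = 65{,}536 = 2^{16} \text{ cores}, \\
 & 2^{18} \times 2^{16} = 2^{34} = 17{,}179{,}869{,}184 \text{ unknowns} \\
\text{strong scaling: } & 128^3 \times 8 = 2^{21} \times 2^{3} = 2^{24} = 16{,}777{,}216 \text{ unknowns} \\
 & 2^{24} / 2^{16} = 2^{8} = 256 \text{ cells/core at 4,096 nodes}
\end{align*}
```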
HB M x N
  M: Number of OpenMP threads per a single MPI process
  N: Number of MPI processes per a single node
[Figure: FX10 node with 16 cores (C), each with its own L1 cache, a shared L2 cache, and Memory]
HB 8 x 2: 8 threads/process
MEPA 2014
Reordering for extracting parallelism in each domain (= MPI Process)
• Krylov Iterative Solvers
  – Dot Products
  – SMVP
  – DAXPY
  – Preconditioning
• IC/ILU Factorization, Forward/Backward Substitution
  – Global Data Dependency
  – Reordering needed for parallelism ([KN 2003] on the Earth Simulator, KN@CMCIM-2002)
  – Multicoloring, RCM, CM-RCM
Parallelization of ICCG

IC Factorization:
      do i= 1, N
        VAL= D(i)
        do k= indexL(i-1)+1, indexL(i)
          VAL= VAL - (AL(k)**2) * W(itemL(k),DD)
        enddo
        W(i,DD)= 1.d0/VAL
      enddo

Forward Substitution:
      do i= 1, N
        WVAL= W(i,Z)
        do k= indexL(i-1)+1, indexL(i)
          WVAL= WVAL - AL(k) * W(itemL(k),Z)
        enddo
        W(i,Z)= WVAL * W(i,DD)
      enddo
(Global) Data Dependency: Writing/reading may occur simultaneously, hard to parallelize
In both the IC factorization and the forward substitution loops, row i reads W(itemL(k),DD) or W(itemL(k),Z), values that earlier iterations of the same i loop have written.
OpenMP for SpMV: Straightforward, NO data dependency

!$omp parallel do private(ip,i,VAL,k)
      do ip= 1, PEsmpTOT
        do i= INDEX(ip-1)+1, INDEX(ip)
          VAL= D(i)*W(i,P)
          do k= indexL(i-1)+1, indexL(i)
            VAL= VAL + AL(k)*W(itemL(k),P)
          enddo
          do k= indexU(i-1)+1, indexU(i)
            VAL= VAL + AU(k)*W(itemU(k),P)
          enddo
          W(i,Q)= VAL
        enddo
      enddo
Ordering Methods
Elements in “same color” are independent: to be parallelized
[Figure: numbering of an 8x8 mesh (cells 1 to 64) under three orderings]
  RCM: Reverse Cuthill-McKee
  MC (Color#=4): Multicoloring
  CM-RCM (Color#=4): Cyclic MC + RCM
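The statement that elements of the same color are independent can be checked on the simplest case. The sketch below is an assumption-laden illustration, not the ordering code behind the slides: it uses two colors (red-black) instead of four, a 2D 5-point Laplacian with unit mesh size, and a Gauss-Seidel sweep instead of IC substitution. The `reverse` flag exists only to show that the visiting order inside a color does not matter, which is what makes the OpenMP loop safe.

```fortran
module multicolor_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! One red-black Gauss-Seidel sweep on the interior points 1..n of
  ! u(0:n+1,0:n+1). Points with mod(i+j,2) == color only read points of
  ! the other color, so each color can be processed in parallel.
  subroutine rb_sweep(u, f, reverse)
    real(dp), intent(inout) :: u(0:,0:)
    real(dp), intent(in)    :: f(0:,0:)
    logical,  intent(in)    :: reverse
    integer :: n, color, i, j, j0, j1, js
    n = ubound(u, 1) - 1
    j0 = 1
    j1 = n
    js = 1
    if (reverse) then
      j0 = n
      j1 = 1
      js = -1
    end if
    do color = 0, 1
      !$omp parallel do private(i, j)
      do j = j0, j1, js
        do i = 1, n
          if (mod(i + j, 2) == color) then
            u(i,j) = 0.25_dp*(u(i-1,j) + u(i+1,j) + u(i,j-1) + u(i,j+1) + f(i,j))
          end if
        end do
      end do
      !$omp end parallel do
    end do
  end subroutine rb_sweep
end module multicolor_demo
```

Running the sweep once with `reverse = .false.` and once with `reverse = .true.` from the same starting values should give identical results; with a natural (uncolored) ordering the two traversals would generally differ.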
Recent Progress (2013-2014)
• Optimization of Parallel MGCG
  – Conjugate Gradient Solver with Multigrid Preconditioning
  – OpenMP/MPI Hybrid Parallel Programming Model
  – Efficiency & Convergence
• Communications are expensive
  – Serial Communications: Data Transfer through Hierarchical Memory
  – Parallel Communications: Message Passing through Network
• Parallel Multigrid
  – “Coarse Grid Solver” is important: Efficiency & Convergence
  − HPCG (High-Performance Conjugate Gradients): MGCG by Geometric Multigrid
Parallel MG Solvers: pGW3D-FVM
• 3D Groundwater Flow via Heterogeneous Porous Media
  − Poisson's equation: $\nabla \cdot ( \lambda(x,y,z) \nabla \phi ) = q$
  − Randomly distributed water conductivity
  − Finite-Volume Method on Cubic Voxel Mesh
  − $\lambda = 10^{-5}$ ~ $10^{+5}$, Average: 1.00
  – MGCG Solver
• Storage format of coefficient matrices (Serial Comm.)
  – CRS (Compressed Row Storage)
  – ELL (Ellpack-Itpack)
• Comm./Synch. Reducing MG (Parallel Comm.)
  – Coarse Grid Aggregation (CGA)
  – Hierarchical CGA: Communication Reducing CGA
ELL: Fixed Loop-length, Nice for Pre-fetching
[Figure: the same sparse matrix stored in (a) CRS and (b) ELL]
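The difference between the two formats can be made concrete with a small conversion routine. This is a minimal Fortran sketch, not the solver's code: the routine and array names are hypothetical, the row pointer follows the `index(i-1)+1 .. index(i)` convention used in the listings of this deck, and padded ELL slots are assumed to carry a zero coefficient and the row's own column index so that every read stays in bounds.

```fortran
module ell_demo
  implicit none
  integer, parameter :: dp = kind(1.0d0)
contains
  ! y = A*x with A in CRS: row i owns entries ptr(i-1)+1 .. ptr(i).
  subroutine spmv_crs(n, ptr, col, val, x, y)
    integer,  intent(in)  :: n
    integer,  intent(in)  :: ptr(0:n), col(:)
    real(dp), intent(in)  :: val(:), x(:)
    real(dp), intent(out) :: y(n)
    integer :: i, k
    do i = 1, n
      y(i) = 0.0_dp
      do k = ptr(i-1) + 1, ptr(i)      ! loop length varies from row to row
        y(i) = y(i) + val(k)*x(col(k))
      end do
    end do
  end subroutine spmv_crs

  ! Convert CRS to ELL with fixed width w = longest row.
  subroutine crs_to_ell(n, ptr, col, val, w, ecol, evl)
    integer,  intent(in)  :: n
    integer,  intent(in)  :: ptr(0:n), col(:)
    real(dp), intent(in)  :: val(:)
    integer,  intent(out) :: w
    integer,  allocatable, intent(out) :: ecol(:,:)
    real(dp), allocatable, intent(out) :: evl(:,:)
    integer :: i, j, k
    w = 0
    do i = 1, n
      w = max(w, ptr(i) - ptr(i-1))
    end do
    allocate(ecol(w,n), evl(w,n))
    do i = 1, n
      ecol(:,i) = i                    ! padding: in-bounds column index
      evl(:,i)  = 0.0_dp               ! padding: zero coefficient
      j = 0
      do k = ptr(i-1) + 1, ptr(i)
        j = j + 1
        ecol(j,i) = col(k)
        evl(j,i)  = val(k)
      end do
    end do
  end subroutine crs_to_ell

  ! y = A*x with A in ELL: every row runs the same fixed-length inner loop.
  subroutine spmv_ell(n, w, ecol, evl, x, y)
    integer,  intent(in)  :: n, w
    integer,  intent(in)  :: ecol(w,n)
    real(dp), intent(in)  :: evl(w,n), x(:)
    real(dp), intent(out) :: y(n)
    integer :: i, j
    real(dp) :: s
    do i = 1, n
      s = 0.0_dp
      do j = 1, w
        s = s + evl(j,i)*x(ecol(j,i))
      end do
      y(i) = s
    end do
  end subroutine spmv_ell
end module ell_demo
```

The fixed inner loop length `w` is the property the slide points to as helpful for prefetching; the price is the storage and arithmetic spent on padded entries when row lengths differ.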
Special Treatment for “Boundary” Cells connected to “Halo”
• Distribution of Lower/Upper Non-Zero Off-Diagonal Components
• If we adopt RCM (or CM) reordering ...
• Pure Internal Cells
  – L: ~3, U: ~3
• Boundary Cells
  – L: ~3, U: ~6
[Figure: External Cells, Internal Cells on Boundary and Pure Internal Cells in x-y-z; legend: ● Internal (lower), ● Internal (upper), ● External (upper)]
Original ELL: Backward Subst.
Cache is not well-utilized: IAUnew(6,N), AUnew(6,N)

      do icol= NHYP(lev), 1, -1
        if (mod(icol,2).eq.1) then
!$omp parallel do private (ip,icel,j,SW)
          do ip= 1, PEsmpTOT
            do icel= SMPindex(icol-1,ip,lev)+1, SMPindex(icol,ip,lev)
              SW= 0.0d0
              do j= 1, 6
                SW= SW + AUnew(j,icel)*Rmg(IAUnew(j,icel))
              enddo
              Rmg(icel)= Rmg(icel) - SW*DDmg(icel)
            enddo
          enddo
        else
!$omp parallel do private (ip,icel,j,SW)
          do ip= 1, PEsmpTOT
            do icel= SMPindex(icol-1,ip,lev)+1, SMPindex(icol,ip,lev)
              SW= 0.0d0
              do j= 1, 3
                SW= SW + AUnew(j,icel)*Rmg(IAUnew(j,icel))
              enddo
              Rmg(icel)= Rmg(icel) - SW*DDmg(icel)
            enddo
          enddo
        endif
      enddo

IAUnew(6,N), AUnew(6,N) are used both for Pure Internal Cells and for Boundary Cells.
[Figure: storage layout for Pure Internal Cells, AUnew(6,N), and for Boundary Cells, AUnew(6,N)]
Original Approach (restriction)
Coarse grid solver at a single core [KN 2010]
[Figure: levels from Fine to Coarse: Level=1, Level=2, ..., Level=m-3, Level=m-2, Level=m-1, Level=m; Mesh # for each MPI = 1; Coarse grid solver on a single core (further multigrid)]
Communication Overhead at Coarser Levels
Coarse Grid Aggregation (CGA)
Coarse Grid Solver is multithreaded [KN 2012]
[Figure: levels from Fine to Coarse: Level=1, Level=2, ..., Level=m-3, Level=m-2; Coarse grid solver on a single MPI process (multi-threaded, further multigrid)]
• Communication overhead could be reduced
• Coarse grid solver is more expensive than original approach.
• If process number is larger, this effect might be significant
Results

CASE  Matrix          Coarse Grid
C0    CRS             Single Core
C1    ELL (original)  Single Core
C2    ELL (original)  CGA
C3    ELL (new)       CGA
C4    ELL (new)       hCGA

Class           Size
Weak Scaling    64^3 cells/core   262,144
Strong Scaling  256^3 cells       16,777,216
Results at 4,096 nodes ($1.72 \times 10^{10}$ DOF)
(Fujitsu FX10: Oakleaf-FX): HB 8x2
lev: switching level to “coarse grid solver”, Opt. Level= 7
[Figure: bar chart, sec. (0.0 to 20.0), split into Rest, Coarse Grid Solver, MPI_Allgather and MPI_Isend/Irecv/Allreduce; ■ Parallel, ■ Serial/Redundant; Fine to Coarse]
  ELL-CGA, lev=6: 51 (C2)
  ELL-CGA, lev=7: 55 (C2)
  ELL-CGA, lev=8: 60 (C2)
  ELL: 65 (NO CGA) (C1)
  CRS: 66 (NO CGA) (C0)

      Matrix        Coarse Grid
C0    CRS           Single Core
C1    ELL (org)     Single Core
C2    ELL (org)     CGA
C3    ELL (sliced)  CGA
Weak Scaling: C2 (with CGA)
Time for Coarse Grid Solver
Efficiency of coarse grid solver for HB 16x1 is x256 of that of flat MPI (1/16 problem size, x16 resource for coarse grid solver)
[Figure: sec. (0.00 to 4.00) vs. CORE# (1024, 2048, 4096, 8192, 16384, 32768, 49152, 65536) for Flat MPI, HB 4x4, HB 8x2, HB 16x1]
Summary so far ...
• “Coarse Grid Aggregation (CGA)” is effective for stabilization of convergence at $O(10^4)$ cores for MGCG
  – Smaller number of parallel domains
  – HB 8x2 is the best at 4,096 nodes
  – Flat MPI, HB 4x4
    • Coarse grid solvers are more expensive, because their number of MPI processes is larger than that of HB 8x2 and HB 16x1.
• ELL format is effective!
  – C0 (CRS) -> C1 (ELL-org.): +20-30%
  – C2 (ELL-org.) -> C3 (ELL-new): +20-30%
  – C0 -> C3: +80-90%
• Coarse Grid Solver
  – (May be) very expensive for cases with more than $O(10^5)$ cores
  – Memory of a single node is not enough
  – Multiple nodes should be utilized for coarse grid solver
Hierarchical CGA: Comm. Reducing MG
Reduced number of MPI processes [KN 2013]
[Figure: levels from Fine to Coarse: Level=1, Level=2, ..., Level=m-3, Level=m-3, Level=m-2; Coarse grid solver on a single MPI process (multi-threaded, further multigrid)]
hCGA: Related Work
• Not a new idea, but very few implementations.
  – Not effective for peta-scale systems (Dr. U.M. Yang (LLNL), developer of Hypre)
• Existing Works: Repartitioning at Coarse Levels
  – Lin, P.T., Improving multigrid performance for unstructured mesh drift-diffusion simulations on 147,000 cores, International Journal for Numerical Methods in Engineering 91 (2012) 971-989 (Sandia)
  – Sundar, H. et al., Parallel Geometric-Algebraic Multigrid on Unstructured Forests of Octrees, ACM/IEEE Proceedings of the 2012 International Conference for High Performance Computing, Networking, Storage and Analysis (SC12) (2012) (UT Austin)
  – Flat MPI, Repartitioning if DOF < $O(10^3)$ on each process
hCGA in the present work
• Accelerate the coarser grid solver
  – using multiple processes instead of a single process in CGA
  – Only 64 cells on each process of lev=6 in the figure
• Straightforward Approach
  – MPI_Comm_split, MPI_Gather, MPI_Bcast etc.
[Figure: bar chart, sec. (0.0 to 20.0), for ELL-CGA lev=6: 51, ELL-CGA lev=7: 55, ELL-CGA lev=8: 60, ELL: 65 (NO CGA), CRS: 66 (NO CGA); split into Rest, Coarse Grid Solver, MPI_Allgather and MPI_Isend/Irecv/Allreduce]
Summary
• hCGA is effective, but not so significant (except flat MPI)
  – flat MPI: x1.61 for weak scaling, x6.27 for strong scaling at 4,096 nodes of Fujitsu FX10
  – hCGA will be effective for HB 16x1 with more than $2.50 \times 10^5$ nodes (= $4.00 \times 10^6$ cores) of FX10 (= 60 PFLOPS)
    • effect of coarse grid solver is significant for Flat MPI with > $10^3$ nodes
  – Communication overhead has been reduced by hCGA
• Future/On-Going Works and Open Problems
  – Improvement of hCGA
    • Overhead by MPI_Allreduce etc. -> P2P comm.
  – Algorithms
    • CA-Multigrid (for coarser levels), CA-SPAI, Pipelined Method
  – Strategy for Automatic Selection
    • switching level, number of processes for hCGA, optimum color #
    • effects on convergence
  – More Flexible ELL for Unstructured Grids
  – Xeon Phi Clusters
    • Hybrid 240(T)x1(P) is not the only choice
Overhead by Collective Communication
[Figure: Overhead by MPI_Allreduce for MGCG case; sec./MPI_Allreduce (0.00E+00 to 7.00E-03) vs. MPI Process # (100 to 100000) for Flat MPI, HB 4x4, HB 8x2, HB 16x1]
• Overhead by global collective comm. (e.g. MPI_Allreduce)
• Change original Krylov solver so that comm. overhead by global coll. comm. is hidden by overlapping with other computations (Gropp's asynch. CG, s-step, pipelined ...)
• “MPI_Iallreduce” in MPI-3 specification