A Nested Dissection Parallel Direct Solver for Simulations of 3D DC/AC Resistivity Measurements
Maciej Paszyński (1,2), David Pardo (2), Carlos Torres-Verdín (2)
(1) Department of Computer Science,
AGH University of Science and Technology, Kraków, Poland
e-mail: [email protected]
home.agh.edu.pl/~paszynsk
(2) Department of Petroleum and Geosystems Engineering,
The University of Texas at Austin

OUTLINE
• Formulation of the resistivity measurement simulation model problem
• Sequential algorithm
• Parallel algorithm
• Scalability of the parallel solver
• Parallel solver details
• Conclusions and future work (inversions)

FORMULATION OF THE MODEL PROBLEM
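Time-harmonic Maxwell's equations:
$\nabla \times \mathbf{H} = (\sigma + j\omega\varepsilon)\,\mathbf{E} + \mathbf{J}^{imp}$   (Ampere's Law)
$\nabla \times \mathbf{E} = -j\omega\mu\,\mathbf{H} - \mathbf{M}^{imp}$   (Faraday's Law)
$\nabla \cdot (\varepsilon\mathbf{E}) = \rho$   (Gauss' Law of Electricity)
$\nabla \cdot (\mu\mathbf{H}) = 0$   (Gauss' Law of Magnetism)
where
$\mathbf{H}$: magnetic field
$\mathbf{E}$: electric field
$\varepsilon$: dielectric permittivity
$\mu$: magnetic permeability
$\sigma$: electrical conductivity
$\rho$: electric charge distribution
$\omega$: angular frequency
$\mathbf{J}^{imp}$: impressed electric current
$\mathbf{M}^{imp}$: impressed magnetic current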
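DC formulation ($\omega = 0$):
$\nabla \times \mathbf{H} = \sigma\mathbf{E} + \mathbf{J}^{imp}$   (Ampere's Law)
$\nabla \times \mathbf{E} = 0$   (Faraday's Law)
$\nabla \cdot (\varepsilon\mathbf{E}) = \rho$   (Gauss' Law of Electricity)
$\nabla \cdot (\mu\mathbf{H}) = 0$   (Gauss' Law of Magnetism)
where
$\mathbf{H}$: magnetic field
$\mathbf{E}$: electric field
$\varepsilon$: dielectric permittivity
$\mu$: magnetic permeability
$\rho$: electric charge distribution
$\mathbf{J}^{imp}$: impressed electric current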
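Taking the divergence of Ampere's law and utilizing Gauss' law of electricity, we obtain the conductive media equation
$\nabla \cdot (\sigma \nabla u) = \nabla \cdot \mathbf{J}^{imp}$
where $u$ is the scalar potential such that $\mathbf{E} = -\nabla u$.
Variational formulation
Find $u \in u_D + H^1_D(\Omega)$ such that
$\langle \nabla v, \sigma \nabla u \rangle_{L^2(\Omega)} = -\langle v, \nabla \cdot \mathbf{J}^{imp} \rangle_{L^2(\Omega)} + \langle v, h \rangle_{L^2(\Gamma_N)} \quad \forall v \in H^1_D(\Omega)$
where
$u_D$: lift of the essential Dirichlet b.c.
$h = \sigma \nabla u \cdot \mathbf{n}$: prescribed flux on $\Gamma_N$
$H^1_D(\Omega) = \{ u \in H^1(\Omega) : u|_{\Gamma_D} = 0 \}$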
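The step from the conductive media equation to the variational formulation can be sketched as follows. This is a standard integration by parts, written under the assumption that the strong form reads $\nabla \cdot (\sigma \nabla u) = \nabla \cdot \mathbf{J}^{imp}$ with $\mathbf{E} = -\nabla u$, and that the test function $v$ vanishes on $\Gamma_D$; it is not taken from the slides themselves.

```latex
\begin{align*}
\int_\Omega v \, \nabla \cdot (\sigma \nabla u) \, dV
  &= \int_\Omega v \, \nabla \cdot \mathbf{J}^{imp} \, dV
  && \text{(multiply by } v \text{ and integrate)} \\
-\int_\Omega \nabla v \cdot \sigma \nabla u \, dV
  + \int_{\partial\Omega} v \, \sigma \nabla u \cdot \mathbf{n} \, dS
  &= \int_\Omega v \, \nabla \cdot \mathbf{J}^{imp} \, dV
  && \text{(divergence theorem)} \\
\int_\Omega \nabla v \cdot \sigma \nabla u \, dV
  &= -\int_\Omega v \, \nabla \cdot \mathbf{J}^{imp} \, dV
  + \int_{\Gamma_N} v \, h \, dS
  && \text{(} v = 0 \text{ on } \Gamma_D,\ h = \sigma \nabla u \cdot \mathbf{n} \text{ on } \Gamma_N \text{)}
\end{align*}
```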
• 1 transmitter
• 2 receiver electrodes
• transmitter electrode modeled by the impressed electric current $\mathbf{J}^{imp}$
• five different layers in the formation with resistivities 100 Ω·m (sand), 5 Ω·m (shale), 200 Ω·m (oil), 1 Ω·m (water), 1000 Ω·m (rock)
• borehole with resistivity 0.1 Ω·m
• zero Dirichlet b.c. on the external boundary of the domain
• zero Neumann b.c. on the axis of symmetry
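The model equations use the conductivity σ, whereas the materials are listed by resistivity. A minimal sketch of the conversion, assuming σ is simply the reciprocal of the resistivity; the dictionary keys and function name are hypothetical and chosen only for illustration:

```python
# Resistivities in ohm-metres, as listed for the model problem.
RESISTIVITY_OHM_M = {
    "sand": 100.0,
    "shale": 5.0,
    "oil": 200.0,
    "water": 1.0,
    "rock": 1000.0,
    "borehole": 0.1,
}


def conductivity(resistivity_ohm_m: float) -> float:
    """Return conductivity in S/m as the reciprocal of resistivity."""
    if resistivity_ohm_m <= 0.0:
        raise ValueError("resistivity must be positive")
    return 1.0 / resistivity_ohm_m


CONDUCTIVITY_S_PER_M = {
    name: conductivity(value) for name, value in RESISTIVITY_OHM_M.items()
}

# Ratio between the most and the least resistive material.
contrast = max(RESISTIVITY_OHM_M.values()) / min(RESISTIVITY_OHM_M.values())
print(CONDUCTIVITY_S_PER_M["borehole"], contrast)
```

The contrast of four orders of magnitude between borehole and rock is one reason the coefficient σ varies so strongly across the domain.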
We use a 2D self-adaptive, goal-oriented hp finite element strategy combined with a Fourier series expansion in a non-orthogonal system of coordinates.

SEQUENTIAL ALGORITHM
Loop over electrode locations
  Iterations of the goal-oriented self-adaptive hp FEM
    Solve the problem over the coarse mesh
    Solve the problem over the fine mesh
    Compute relative error estimations over finite elements
    If maximum relative error < required accuracy then exit
    Make decisions about optimal refinements
    Perform optimal refinements
  End
  Store second vertical difference of potential at receiver electrodes
End
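The inner loop above follows a coarse/fine pattern that can be illustrated on a toy problem. The sketch below is not the hp FEM code: as a stand-in, it approximates a 1D function by linear interpolation, treats the halved mesh as the "fine" solution, and applies h-refinement only. The function name `adapt_mesh`, the 50% refinement threshold, and the test function are all assumptions made for illustration.

```python
import math


def adapt_mesh(f, nodes, required_accuracy, max_iterations=50):
    """Toy adaptive loop: refine a 1D mesh until the coarse/fine gap is small."""
    for _ in range(max_iterations):
        elements = list(zip(nodes[:-1], nodes[1:]))
        # "Coarse solve": linear interpolant on the current mesh at midpoints.
        coarse = [(f(a) + f(b)) / 2 for a, b in elements]
        # "Fine solve": every element halved, so midpoints carry exact values.
        fine = [f((a + b) / 2) for a, b in elements]
        # Relative error estimate per element.
        scale = max(abs(f(x)) for x in nodes) or 1.0
        errors = [abs(vf - vc) / scale for vf, vc in zip(fine, coarse)]
        max_error = max(errors)
        if max_error < required_accuracy:
            return nodes, max_error
        # Refinement decision: split elements holding at least half the max error.
        refined = [nodes[0]]
        for (a, b), error in zip(elements, errors):
            if error >= 0.5 * max_error:
                refined.append((a + b) / 2)
            refined.append(b)
        nodes = refined
    raise RuntimeError("required accuracy not reached")


# A steep front at x = 0.5 forces local refinement.
mesh, error = adapt_mesh(
    lambda x: math.atan(50 * (x - 0.5)), [0.0, 0.25, 0.5, 0.75, 1.0], 1e-3
)
print(len(mesh), error)
```

The resulting mesh is much denser near the front than near the ends, which is the behaviour the adaptive strategy is designed to produce.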

PARALLEL ALGORITHM
Each processor is assigned a set of finite elements.
Each node on the interface is assigned to multiple processor owners.
A local copy of the entire data structure is stored on every processor,
but each processor performs computations only on its assigned set of finite elements.
Only local solution degrees of freedom are stored, to save memory.
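The ownership rule can be illustrated with a small sketch. It assumes a 1D chain of four elements split between two processors; the names `node_owners`, `element_nodes` and `element_owner` are hypothetical and do not come from the solver itself.

```python
from collections import defaultdict


def node_owners(element_nodes, element_owner):
    """Map each node to the set of processors owning an element that touches it."""
    owners = defaultdict(set)
    for element, nodes in element_nodes.items():
        for node in nodes:
            owners[node].add(element_owner[element])
    return dict(owners)


# Four elements in a chain: element e connects nodes e and e + 1.
element_nodes = {0: (0, 1), 1: (1, 2), 2: (2, 3), 3: (3, 4)}
# Elements 0-1 go to processor 0, elements 2-3 to processor 1.
element_owner = {0: 0, 1: 0, 2: 1, 3: 1}

owners = node_owners(element_nodes, element_owner)
# Interface nodes are those with more than one owner.
interface_nodes = sorted(n for n, procs in owners.items() if len(procs) > 1)
print(interface_nodes)  # [2]
```

Only node 2 lies on the interface here, so it is the only node with two owners; all other nodes are interior to one processor's set of elements.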
Redistribute the computational mesh between processors
Loop over electrode locations
  Iterations of the goal-oriented self-adaptive hp FEM
    Solve the coarse mesh problem by the parallel solver
    Solve the fine mesh problem by the parallel solver
    Compute relative error estimations over finite elements
    Compute global maximum relative error (mpi_allreduce)
    If maximum relative error < required accuracy then exit
    Make local decisions about optimal refinements
    Broadcast required refinements
    Perform optimal refinements on the whole local mesh
  End
  Store second vertical difference of potential at receiver electrodes
  (requires communication to gather the distributed solution)
End

SCALABILITY OF THE PARALLEL SOLVER
Fine mesh, 10 Fourier modes, 141 000 degrees of freedom
211 seconds on 1 processor (per single electrode location)
1.75 seconds on 192 processors (per single electrode location)
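These two timings give the speedup and efficiency directly. A minimal sketch, assuming the usual definitions (speedup is the one-processor time divided by the parallel time, and efficiency is the speedup divided by the number of processors); the function names are illustrative only:

```python
def speedup(time_one_processor: float, time_parallel: float) -> float:
    """Ratio of the single-processor time to the parallel time."""
    return time_one_processor / time_parallel


def efficiency(time_one_processor: float, time_parallel: float, processors: int) -> float:
    """Speedup divided by the number of processors used."""
    return speedup(time_one_processor, time_parallel) / processors


s = speedup(211.0, 1.75)
e = efficiency(211.0, 1.75, 192)
print(f"speedup {s:.1f}, efficiency {e:.1%}")  # speedup 120.6, efficiency 62.8%
```

Under these definitions the measured timings correspond to an efficiency of roughly 63% on 192 processors.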

PARALLEL SOLVER DETAILS
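Element matrix and right-hand side entries for the hp basis functions $e^i_{hp}$:
$b(e^i_{hp}, e^j_{hp}) = \langle \nabla e^j_{hp}, \sigma \nabla e^i_{hp} \rangle_{L^2(\Omega)}$
$l(e^j_{hp}) = -\langle e^j_{hp}, \nabla \cdot \mathbf{J}^{imp} \rangle_{L^2(\Omega)} + \langle e^j_{hp}, h \rangle_{L^2(\Gamma_N)}$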
Forward elimination on the whole matrix: O(15^3)
Partial forward elimination (performed for each of the two elements): O(6*9^2)
Full forward elimination: O(3^3)
Forward elimination over the whole matrix:
15^3 = 3375 operations
Partial forward eliminations over elements:
6*9^2 + 6*9^2 + 3^3 = 486 + 486 + 27 = 999 operations
The idea is generalized to the whole domain.
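The arithmetic above can be reproduced with a short sketch. It reads the numbers as two elements with 9 unknowns each, of which 6 are interior to the element and 3 are shared on the interface (so 6 + 6 + 3 = 15 unknowns in total); this reading, and the simplified cost model, are assumptions inferred from the operation counts.

```python
def full_elimination_cost(size: int) -> int:
    """Cost model for eliminating every unknown of a dense size x size matrix."""
    return size ** 3


def partial_elimination_cost(eliminated: int, size: int) -> int:
    """Cost model for eliminating `eliminated` unknowns of a size x size matrix."""
    return eliminated * size ** 2


interior, interface = 6, 3
element_size = interior + interface        # 9 unknowns per element
total_unknowns = 2 * interior + interface  # 15 unknowns in the merged matrix

whole_matrix = full_elimination_cost(total_unknowns)
element_wise = (
    2 * partial_elimination_cost(interior, element_size)
    + full_elimination_cost(interface)
)
print(whole_matrix, element_wise)  # 3375 999
```

Even for two elements, eliminating interior unknowns locally before merging cuts the modelled cost by more than a factor of three.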

SOLVER RECURSIVE ALGORITHM
matrix function recursive_solver(tree_node)
  if tree_node has no son nodes then
    eliminate leaf element stiffness matrix internal nodes
    return Schur complement sub-matrix
  else if tree_node has son nodes then
    do for each son
      son_matrix = recursive_solver(tree_node_son)
      merge son_matrix into new_matrix
    enddo
    decide which unknowns of new_matrix can be eliminated
    perform partial forward elimination on new_matrix
    return Schur complement sub-matrix
  endif
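The recursion can be tried out on a toy problem. The sketch below is not the authors' implementation: it assumes a 1D chain of two-node elements (so leaves have no internal nodes), stores matrices as dictionaries keyed by `(row, column)`, and treats an unknown as eliminable once every element touching it has been merged. The names `schur_complement`, `build_tree` and `condense_chain` are hypothetical.

```python
def schur_complement(matrix, eliminate):
    """Partial forward elimination; returns the matrix on the remaining unknowns."""
    m = dict(matrix)
    dofs = sorted({row for row, _ in m})
    for p in eliminate:
        pivot = m[(p, p)]
        rest = [d for d in dofs if d != p]
        for r in rest:
            factor = m.get((r, p), 0.0) / pivot
            if factor:
                for c in rest:
                    m[(r, c)] = m.get((r, c), 0.0) - factor * m.get((p, c), 0.0)
        dofs = rest
        m = {(r, c): v for (r, c), v in m.items() if r != p and c != p}
    return m


def recursive_solver(tree_node, touching, keep):
    """Return (Schur complement, merged-element count per unknown) for a subtree."""
    if "sons" not in tree_node:  # leaf: a single two-node element
        a, b = tree_node["dofs"]
        k = tree_node["stiffness"]
        new_matrix = {(a, a): k, (a, b): -k, (b, a): -k, (b, b): k}
        merged = {a: 1, b: 1}
    else:
        new_matrix, merged = {}, {}
        for son in tree_node["sons"]:
            son_matrix, son_merged = recursive_solver(son, touching, keep)
            for key, value in son_matrix.items():  # merge son_matrix into new_matrix
                new_matrix[key] = new_matrix.get(key, 0.0) + value
            for dof, count in son_merged.items():
                merged[dof] = merged.get(dof, 0) + count
    # An unknown is fully assembled once all elements touching it are merged.
    ready = [d for d, n in merged.items() if n == touching[d] and d not in keep]
    for d in ready:
        del merged[d]
    return schur_complement(new_matrix, ready), merged


def build_tree(elements):
    """Binary elimination tree over a contiguous list of elements."""
    if len(elements) == 1:
        return elements[0]
    mid = len(elements) // 2
    return {"sons": [build_tree(elements[:mid]), build_tree(elements[mid:])]}


def condense_chain(stiffnesses):
    """Condense a chain of elements onto its two end nodes."""
    elements = [{"dofs": (i, i + 1), "stiffness": k} for i, k in enumerate(stiffnesses)]
    touching = {}
    for element in elements:
        for dof in element["dofs"]:
            touching[dof] = touching.get(dof, 0) + 1
    keep = {0, len(stiffnesses)}  # end nodes stay, e.g. for boundary conditions
    root_matrix, _ = recursive_solver(build_tree(elements), touching, keep)
    return root_matrix


root = condense_chain([1.0, 2.0, 4.0, 4.0])
print(root)
```

For this chain the result should match springs in series: 1/k = 1 + 1/2 + 1/4 + 1/4 = 2, so the root matrix couples the two end nodes with stiffness 0.5.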

PARALLEL SOLVER RECURSIVE ALGORITHM
matrix function recursive_solver(tree_node)
  if tree_node has no son nodes then
    eliminate leaf element stiffness matrix internal nodes
    return Schur complement sub-matrix
  else if tree_node has son nodes then
    do for each son
      if son node is assigned to current processor
        son_matrix = recursive_solver(tree_node_son)
      if current processor k is the first processor in processors group
        RECEIVE son_matrix from processor 2k+1
        merge son_matrix into new_matrix
      else
        SEND son_matrix to the first processor in processors group
    enddo
    decide which unknowns of new_matrix can be eliminated
    perform partial forward elimination on new_matrix
    return Schur complement sub-matrix
  endif

CONCLUSIONS AND FUTURE WORK
• A new parallel direct solver has been developed.
• The solver recursively utilizes the Schur complement pattern to eliminate fully aggregated degrees of freedom on every level of the elimination tree.
• The solver recursively traverses the refinement trees, the tree of initial mesh elements, as well as the tree of sub-domains.
• The parallel version of the solver provides over 60% relative efficiency on 200 processors.
• The solver reduces the solution time of a problem with 141 000 degrees of freedom from 211 seconds to 1.75 seconds on 192 processors.
• A single logging position can be solved within 2 seconds on 200 processors.
• Future work will involve application of the parallel solver to inverse problem modeling.