

Numerical methods and optimizations that enhance semi-Lagrangian gyrokinetic calculations

G. Latu, Y. Asahi, J. Bigot, G. Dif-Pradalier, P. Donnel, C. Ehrlacher, X. Garbet, P. Ghendrih, V. Grandgirard, M. Ottaviani, C. Passeron, Y. Sarazin
CEA, France

N. Bouzat, M. Mehrenberger
INRIA + U. Strasbourg

Y. Güçlü, E. Sonnendrücker
IPP, Germany

Acknowledgements to: Eurofusion funding, EoCoE project funding

G. Latu et al., Enhancing semi-Lagrangian gyrokinetic calculations, 20/10/2016


Outline

• Short introduction to Gysela parallel setting

• Aligned advection along θ,ϕ

• Cubic splines versus Lagrange interpolants in Gysela

• Removing boundary condition at magnetic axis


Numerical scheme: overview

• Main unknown: f^n(r, θ, ϕ, v‖, µ)

Input : physics parameters, f^0
Output : diagnostics

for time step n ≥ 0 do
    Integrals: N_i^n(r, θ, ϕ) = ∫∫ f^n B(r, θ) J(k⊥ρ_C) dv‖ dµ;
    Push fields (Poisson eq.): N_i^n(r, θ, ϕ) → Φ^n(r, θ, ϕ);
    Diagnostics/outputs for time step n;
    Push particles (Vlasov eq. + other terms): Φ^n(r, θ, ϕ), f^n → f^(n+1);

Algorithm 1: Simplified overall Gysela algorithm

• In practice: predictor-corrector time integration scheme, O(∆t²)


Hybrid parallelization: MPI + OpenMP

Fortran 90 code, hybrid MPI+OpenMP

• MPI parallelization in variables µ, r, θ (most of the time)
  (in: integrals, diagnostics, Vlasov; not in: Poisson solver)
  → Work is well balanced between processors
  → Parallel overhead: mainly MPI communications

• OpenMP fine-grain parallelization
  → Avoids MPI communication costs
  → OpenMP parallelization is easier to change than MPI


Parallel algorithm for a 1D advection (in ϕ direction)

Input : f⋆(r, θ, ϕ, v‖, µ)
Output : f†(r, θ, ϕ, v‖, µ)

for µ do in parallel MPI   /* One MPI communicator per µ-value */
    for r do in parallel MPI
        for θ do in parallel MPI
            for θ do in parallel OpenMP
                for v‖ do
                    Compute cubic spline representation of f⋆(r, θ, ϕ = ∗, v‖, µ)
                    for ϕ do
                        ∆ϕ ← (v‖ + other terms) ∆t
                        f†(r, θ, ϕ, v‖, µ) = spline_interpolate(f⋆(r, θ, ϕ − ∆ϕ, v‖, µ))

Cubic splines are used in every advection of Gysela
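
The innermost part of the loop nest above can be illustrated with a minimal Python sketch. It is not Gysela code: it assumes a periodic, uniform ϕ grid and a constant shift ∆ϕ = v‖ ∆t per line (the "other terms" are dropped), it leaves out MPI and OpenMP, and the names `periodic_spline_coeffs`, `spline_eval` and `advect_phi` are hypothetical.

```python
import numpy as np

def periodic_spline_coeffs(g):
    """Cubic B-spline coefficients eta for periodic samples g on a uniform grid.

    Solves (eta[i-1] + 4*eta[i] + eta[i+1]) / 6 = g[i]. The matrix is
    circulant, so it is diagonal in Fourier space.
    """
    n = len(g)
    k = np.arange(n)
    eigenvalues = (4.0 + 2.0 * np.cos(2.0 * np.pi * k / n)) / 6.0
    return np.real(np.fft.ifft(np.fft.fft(g) / eigenvalues))

def spline_eval(eta, x, dx):
    """Evaluate the periodic cubic spline with coefficients eta at positions x."""
    n = len(eta)
    s = np.asarray(x) / dx
    i = np.floor(s).astype(int)      # index of the grid point to the left of x
    t = s - i                        # offset inside the cell, in [0, 1)
    w0 = (1.0 - t) ** 3 / 6.0
    w1 = (3.0 * t**3 - 6.0 * t**2 + 4.0) / 6.0
    w2 = (-3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0) / 6.0
    w3 = t**3 / 6.0
    return (w0 * eta[(i - 1) % n] + w1 * eta[i % n]
            + w2 * eta[(i + 1) % n] + w3 * eta[(i + 2) % n])

def advect_phi(f, v_par, dt, dphi):
    """One semi-Lagrangian step along phi for f[v_index, phi_index]."""
    n_phi = f.shape[1]
    phi = dphi * np.arange(n_phi)
    f_new = np.empty_like(f)
    for iv, v in enumerate(v_par):
        eta = periodic_spline_coeffs(f[iv, :])        # built once per line
        f_new[iv, :] = spline_eval(eta, phi - v * dt, dphi)  # values at the feet
    return f_new
```

The spline coefficients are computed once per (r, θ, v‖, µ) line and then reused for every ϕ point of that line, which mirrors the structure of the pseudocode.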


Parallelization of Vlasov solver: transposition used

for time step n ≥ 0 do
    Integrals, Poisson, Diagnostics
    /* Vlasov solver */
    1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
    1D advection in ϕ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
    Transposition of f;
    2D advection in (r, θ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
    Transposition of f;
    1D advection in ϕ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
    1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);

Algorithm 2: Transposes, two parallel data decompositions

• Two transposes in each µ communicator (per Vlasov solve)
• No CFL constraint on the time step ∆t for the 2D advection


Communication schemes - Transpose

[Figure 1: Transpose communication scheme within an MPI communicator identified by µ. Forward transpose: from local subdomain D1 (block_r × block_θ, larger than 32×32 points, with v‖ = ∗, ϕ = ∗) to local subdomain D2 (block_ϕ × block_v‖, with r = ∗, θ = ∗). Backward transpose: from D2 to D1.]

• Large communication volume: Θ(Nr Nθ Nϕ Nv‖ Nµ)
• However, it scales well up to 64k cores
• Less than 15% of total elapsed time


Motivation: aligned advection

• Anisotropy within the (θ, ϕ) plane [at a given r]: structures are aligned along field lines
• Strong gradients perpendicular to field lines
• Safety factor q(r) gives the direction of the field line
• Aim: interpolate along the field line, where variations are smooth


Numerical method - aligned advection

Aligned scheme: interpolation of a target point at position (θ⋆, ϕ⋆).
Assumption: smooth variation along the green line.

[Figure: (θ, ϕ) grid around the target point (θ⋆, ϕ⋆), with the grid lines ϕ_{j⋆−1}, ϕ_{j⋆}, ϕ_{j⋆+1} and the green line passing through the target point]

• Squares are located at the intersections of the green line and the grid lines along θ.
• Values at the squares are interpolated using values known at the black points (grid).
• The value at the red circle position (θ⋆, ϕ⋆) is interpolated using the values known at the square positions.


Algorithm - aligned advection

/* Input: distribution function on the (θ, ϕ) plane + feet (θ⋆, ϕ⋆) */
Input : g(:, :), theta⋆(:, :), phi⋆(:, :)
/* Output: distribution function on the (θ, ϕ) plane */
Output : g†(:, :)

for j = 0, Nϕ do
    η(i = ∗, j) ← spline coefficients along θ for g(i = ∗, j)
for j = 0, Nϕ do
    for i = 0, Nθ do
        ϕ⋆ ← phi⋆(i, j); θ⋆ ← theta⋆(i, j);
        j⋆ ← index of the grid point immediately to the left of ϕ⋆;
        for k = −d, d+1 do
            θ_k ← fieldline_θ(θ⋆, ϕ⋆, j⋆ + k);
            u_k ← 1D spline interpolation along θ at θ_k using η(i = ∗, j⋆ + k);
        g†(i, j) ← 1D Lagrange interpolation using values (u_k), k = −d, ..., d+1

Algorithm 3: Aligned interpolation in (θ,ϕ) plane
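
A minimal serial sketch of the aligned interpolation may help to fix ideas. It rests on assumptions that are not in the slides: field lines are taken as straight with a constant safety factor q, so that θ_k = θ⋆ + (ϕ_k − ϕ⋆)/q stands in for fieldline_θ; a 4-point Lagrange interpolation along θ replaces the cubic splines to keep the example short; both directions are periodic. All function names are hypothetical.

```python
import math

def lagrange_weights(nodes, x):
    """Lagrange basis weights at x for the given node abscissae."""
    weights = []
    for j, xj in enumerate(nodes):
        w = 1.0
        for k, xk in enumerate(nodes):
            if k != j:
                w *= (x - xk) / (xj - xk)
        weights.append(w)
    return weights

def interp_theta(g_row, theta, dtheta):
    """Periodic 4-point Lagrange interpolation along theta (stand-in for splines)."""
    n = len(g_row)
    i = math.floor(theta / dtheta)
    nodes = [i - 1, i, i + 1, i + 2]              # indices in grid units
    w = lagrange_weights(nodes, theta / dtheta)
    return sum(wk * g_row[m % n] for wk, m in zip(w, nodes))

def aligned_interp(g, theta_s, phi_s, q, dtheta, dphi, d=1):
    """Value at the foot (theta_s, phi_s), interpolated along the field line.

    g[j][i] holds the value at (theta_i, phi_j).
    """
    n_phi = len(g)
    j_s = math.floor(phi_s / dphi)                # grid index left of phi_s
    ks = list(range(-d, d + 2))
    u = []
    for k in ks:
        phi_k = (j_s + k) * dphi
        theta_k = theta_s + (phi_k - phi_s) / q   # assumed field-line equation
        u.append(interp_theta(g[(j_s + k) % n_phi], theta_k, dtheta))
    w = lagrange_weights([j_s + k for k in ks], phi_s / dphi)
    return sum(wk * uk for wk, uk in zip(w, u))
```

For a function that is constant along field lines, the interpolation along ϕ only sees nearly constant values u_k, so the accuracy is set by the θ resolution even when Nϕ is small.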


Estimate of derivatives

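• Reduced number of points along ϕ
  → Which method gives an accurate derivative along ϕ?
  → Major issue: derivative of the electric potential Φ

• Method used (based on aligned interpolation):

[Figure: the two points (θ_i, ϕ_j + ε) and (θ_i, ϕ_j − ε) in the (θ, ϕ) plane, between the grid lines ϕ_{j⋆−1}, ϕ_{j⋆}, ϕ_{j⋆+1}]

$$\frac{\partial f}{\partial \varphi}(\theta_i, \varphi_j) \approx \frac{f(\theta_i, \varphi_j + \varepsilon) - f(\theta_i, \varphi_j - \varepsilon)}{2\,\varepsilon}$$

The two f-values are obtained by aligned interpolation.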


Global algorithm for Vlasov solver

1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
2D aligned advection in (θ, ϕ);
2D advection in (r, θ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
2D aligned advection in (θ, ϕ);
1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);

Algorithm 4: Aligned method, Vlasov algorithm

• Which domain decomposition for the aligned advection?
  → θ = ∗ needed for splines
• Which MPI communications should be done?


Parallel algorithm (v1)

• Whole plane (θ = ∗, ϕ = ∗) known → simplifies implementation

1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
Get feet for 2D advection in (θ, ϕ) (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
Transpose f, and redistribute feet;
2D aligned advection in (θ, ϕ) (∀(µ, v‖) = [local], ∀(r, θ, ϕ) = [∗]);
Transpose f;
2D advection in (r, θ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
Get feet for 2D advection in (θ, ϕ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
Transpose f, and redistribute feet;
2D aligned advection in (θ, ϕ) (∀(µ, v‖) = [local], ∀(r, θ, ϕ) = [∗]);
Transpose f;
1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);

Algorithm 5: Aligned Vlasov solver (v1)

• 3 domain decompositions
• Overheads: communication + memory + compute costs


Performance - aligned method (v1)

Execution time    Standard Nphi=32    Standard Nphi=64    Aligned Nphi=32
Transposes              34.6                46.0               145.2
Advections              81.8               159.6               162.1
Others                  94.1               184.6                96.4
Total run time         210.6               403.7               390.3

Table 1: Time (in s.) of a short Gysela run

• Domain size: Nr = 256, Ntheta = 256, Nvpar = 48, Nmu = 4, nbtimestep = 16, nbcpus = 256
• Algo (v1) → Problem 1: execution time → significant overhead
• Algo (v1) → Problem 2: memory footprint increased by 3×
  → data structures containing feet + MPI buffers
• Algo (v1) → Aligned Nphi = 32 results are close to standard Nphi = 128 or 256


Numerical experiments (Standard Nϕ = 256 close to Aligned Nϕ = 32)

Linear simulations (mode n = 10, full torus) [Y. Sarazin]
ρ⋆ = 1/150, Nr = 256, Ntheta = 256, Nvpar = 128, Nmu = 16

Plotting the amplitude of the 4 most unstable modes

[Figure: amplitude of the 4 most unstable modes, standard scheme, panels for Nϕ = 32, Nϕ = 128 and Nϕ = 256]


Parallel algorithm (v2): improved performance

1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);
Transpose f and get ghost cells on f along ϕ direction;
2D aligned advection in (θ, ϕ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
Comm: if feet outside of local domain, interpolate on another MPI process;
2D advection in (r, θ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
Comm: get ghost cells on f along ϕ direction;
2D aligned advection in (θ, ϕ) (∀(µ, ϕ, v‖) = [local], ∀(r, θ) = [∗]);
Comm: if feet outside of local domain, interpolate on another MPI process;
Transpose f;
1D advection in v‖ (∀(µ, r, θ) = [local], ∀(ϕ, v‖) = [∗]);

Algorithm 6: Aligned Vlasov solver (v2)

• Subdomain (θ = ∗, ϕ = [local]) → communication at ϕ boundaries
• Transmit particles escaping the subdomain → communication
• Extra communication costs → lower than in (v1)


Performance - aligned method (v2)

Execution time    Standard Nphi=32    Standard Nphi=64    Aligned (v2) Nphi=32    Aligned (v1) Nphi=32
Transposes              34.6                46.0                  42.2                   145.2
Advections              81.8               159.6                 143.0                   162.1
Others                  94.1               184.6                  96.6                    96.4
Total run time         210.6               403.7                 281.8                   390.3

Table 2: Time (in s.) of a short Gysela run

• Domain size: Nr = 256, Ntheta = 256, Nvpar = 48, Nmu = 4, nbtimestep = 16, nbcpus = 256
• Algo (v2) → No more 3 domain decompositions as in (v1)
• Algo (v2) → Almost no memory cost compared with the standard version
• Algo (v2) → Execution time reduced compared with (v1)


Overheads estimate - aligned method

• Algorithm v1
    • 2D advection instead of 1D advection along ϕ
    • Feet computation is decoupled from advection in (θ, ϕ)
      → bad cache effect (temporal locality)
    • Feet (θ⋆, ϕ⋆) must be stored/sent to other processes
    • Extra communication cost due to another domain decomposition (∀(µ, v‖) = [local], ∀(r, θ, ϕ) = [∗])

• Algorithm v2 (best)
    • 2D advection instead of 1D advection along ϕ
    • Ghost cell exchange of f along ϕ → small communication costs
    • Points outside the subdomain during (θ, ϕ) advection:
      → send a request for each point (small communication cost)
      → message to send back the interpolated value (small cost)


Aligned method - Conclusion

• Main achievement of the aligned method:
  → Reduced number of points along ϕ (4×)

• Algorithm v2 has reasonable overheads for production

• However, spurious modes (electric potential Φ) grow in long-time runs for small Nϕ. Solutions:
  → Filter them out in Fourier space in (θ, ϕ); many gyrokinetic codes do that, for various reasons
  → Explain the cause: some modes are not damped as they should be (if Nϕ is too small) [Ottaviani, Physics Letters A 375 (2011) 1677]; still under investigation


Interpolations in Gysela

• Interpolations play a main role in advections:
    • along ϕ, along v‖, along (r, θ), along (θ, ϕ) (aligned method)

• Interpolations are used in derivative estimates:
    • along ϕ (aligned method)

• Cubic splines are used:
    • frequently in semi-Lagrangian codes (plasma physics, atmospheric models, ...)
    • non-local → couples all values along one direction (drawback)
    • good compromise between computational efficiency and numerical accuracy
    • smooth interpolation, C1 continuity
    • degrade gracefully when the distribution is underresolved on the mesh


Uniform cubic spline interpolation (for semi-Lagrangian method)

Input : set of values G = (g_0, g_1, ..., g_N)^T, a location x
Output : g(x), interpolated value at location x

begin
    1. Compute g_{−1} and g_{N+1} depending on boundary conditions;
    2. Solve A (η_{−1}, η_0, ..., η_{N+1})^T = (g_{−1}, G, g_{N+1})^T with A = L U;
    3. Set index i ← ⌊(x − x_0)/dx⌋   /* local support */;
    4. Interpolate g(x) using coefficients η_{i−1}, η_i, η_{i+1}, η_{i+2};
end

For several interpolations at different locations using the same G, the LU system is solved only once.
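
The four steps can be sketched in Python under explicit assumptions: clamped (prescribed end-derivative) boundary conditions are chosen for illustration, so the first and last right-hand-side entries play the role of g_{−1} and g_{N+1}; the system is solved with a dense `numpy.linalg.solve` call rather than a stored LU factorization. The function names are hypothetical.

```python
import numpy as np

def spline_coefficients(g, dx, dg_left, dg_right):
    """Solve A eta = rhs once; eta[0] stores eta_{-1}, eta[-1] stores eta_{N+1}."""
    n = len(g)                                   # n = N + 1 grid values
    a = np.zeros((n + 2, n + 2))
    rhs = np.zeros(n + 2)
    # Boundary rows: spline derivative (eta_{i+1} - eta_{i-1}) / (2 dx) at both ends.
    a[0, 0], a[0, 2] = -1.0, 1.0
    rhs[0] = 2.0 * dx * dg_left
    a[-1, -3], a[-1, -1] = -1.0, 1.0
    rhs[-1] = 2.0 * dx * dg_right
    # Interior rows: (eta_{i-1} + 4 eta_i + eta_{i+1}) / 6 = g_i.
    for i in range(n):
        a[i + 1, i:i + 3] = [1.0 / 6.0, 4.0 / 6.0, 1.0 / 6.0]
        rhs[i + 1] = g[i]
    return np.linalg.solve(a, rhs)

def spline_eval(eta, x, x0, dx):
    """Evaluate the spline at x from the 4 coefficients with local support."""
    n = len(eta) - 2
    i = min(max(int(np.floor((x - x0) / dx)), 0), n - 2)   # clamp to the last cell
    t = (x - x0) / dx - i
    weights = np.array([
        (1.0 - t) ** 3,
        3.0 * t**3 - 6.0 * t**2 + 4.0,
        -3.0 * t**3 + 3.0 * t**2 + 3.0 * t + 1.0,
        t**3,
    ]) / 6.0
    # eta[i:i+4] holds eta_{i-1}, eta_i, eta_{i+1}, eta_{i+2} (storage shifted by one).
    return float(np.dot(weights, eta[i:i + 4]))
```

Once `spline_coefficients` has been called for a given G, `spline_eval` can be called for any number of locations at a cost that does not depend on N.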


Lagrange interpolants

• Lagrange polynomials, an alternative to cubic splines:
    • more local than splines

• Definition: g a discrete function (defined on x ∈ [x_0, x_N]).

$$L(x) = \sum_{j=1}^{n} L_j(x), \qquad L_j(x) = g(x_j) \prod_{\substack{k=1 \\ k \neq j}}^{n} \frac{x - x_k}{x_j - x_k}$$

• Property: ∀j ∈ [1, n], L(x_j) = g(x_j)

• Property: with n points, the polynomial L has degree n − 1
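
The definition translates directly into a few lines of Python. This is only an illustration of the formula (the function name is hypothetical), without the optimizations a production code would need.

```python
def lagrange_interpolate(xs, gs, x):
    """Evaluate L(x) = sum_j g(x_j) * prod_{k != j} (x - x_k) / (x_j - x_k)."""
    total = 0.0
    for j, (xj, gj) in enumerate(zip(xs, gs)):
        term = gj
        for k, xk in enumerate(xs):
            if k != j:
                term *= (x - xk) / (xj - xk)
        total += term
    return total
```

With n = 4 points the interpolant is a cubic, so any polynomial of degree 3 or lower sampled at the four points is reproduced up to rounding.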


One drawback of Lagrange polynomial...

• Assumption: sharp gradients in input data
    • Often the case for turbulent simulations

• Everything seems fine; let's zoom in ...

[Figure: input points and interpolants (cubic splines, Lagrange 6-pts, Lagrange 8-pts) for x between 5 and 6, values between −1 and 2]


One drawback of Lagrange polynomial: lack of continuity

• Lagrange polynomial: lack of continuity

[Figure: zoom on x between 5.64 and 5.8, values between −0.9 and 0.1; curves: cubic splines, Lagrange 6-pts, Lagrange 8-pts, input points. Annotation: "Pb: Lagrange not C1"]


Hybrid Lagrange-Hermite polynomials

• Recipe for new Lagrange polynomials [M. Mehrenberger]:
    • Take the Lagrange polynomial (number of points: 2m), which is C0 but not C1
    • Remove the first and last points
    • Increase the multiplicity of 2 zeros to fix 2 derivatives → C1 continuity
    • Finally: 2m points in input, degree of polynomial: 2m − 1

[Figure: zoom on x between 5.64 and 5.8, values between −0.9 and 0.1; curves: cubic splines, new Lagrange 6-pts, new Lagrange 8-pts, input points. Annotations: "Success: new Lagrange is C1!" and "new Lagrange close to cubic spline"]
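
The continuity issue and one possible fix can be illustrated with a small Python experiment. The slides do not give the actual construction, so this sketch makes strong assumptions: it takes the smallest case m = 2 (4 input points, degree 3), keeps the two central values and fixes the two end derivatives with centered finite differences (a cubic Hermite interpolant). It may differ from the hybrid Lagrange-Hermite polynomials used in Gysela; all names are hypothetical, and the grid has unit spacing.

```python
import math

def lagrange4(g, x):
    """Piecewise cubic Lagrange: on cell [i, i+1] it uses the nodes i-1 .. i+2."""
    i = math.floor(x)
    value = 0.0
    for j in range(i - 1, i + 3):
        term = g[j]
        for k in range(i - 1, i + 3):
            if k != j:
                term *= (x - k) / (j - k)
        value += term
    return value

def hermite4(g, x):
    """Cubic Hermite on cell [i, i+1] with centered-difference end derivatives."""
    i = math.floor(x)
    t = x - i
    d0 = 0.5 * (g[i + 1] - g[i - 1])      # assumed derivative estimate at node i
    d1 = 0.5 * (g[i + 2] - g[i])          # assumed derivative estimate at node i+1
    h00 = 2.0 * t**3 - 3.0 * t**2 + 1.0
    h10 = t**3 - 2.0 * t**2 + t
    h01 = -2.0 * t**3 + 3.0 * t**2
    h11 = t**3 - t**2
    return h00 * g[i] + h10 * d0 + h01 * g[i + 1] + h11 * d1

def slope_jump(interp, g, node, eps=1e-6):
    """Difference between the right and left one-sided slopes at a grid node."""
    right = (interp(g, node + eps) - g[node]) / eps
    left = (g[node] - interp(g, node - eps)) / eps
    return right - left
```

On step-like data such as `[0, 0, 0, 1, 1, 1]`, `slope_jump(lagrange4, g, 2)` is clearly non-zero, whereas `slope_jump(hermite4, g, 2)` vanishes up to the finite-difference error, because both neighbouring cells share the same derivative estimate at the node.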


Amplification factor

• Measures the quality of the numerical scheme
• Definition: amplitude error for a linear advection of a wave
• Higher is better (but it must not exceed 1 for the scheme to be stable)

[Figure, first panel: "Amplification for Omega = PI/2 (Lagrange)"; curves: Lagrange 6pts, Lagrange 8pts, Lagrange 10pts, cubic splines; horizontal axis from 0 to 1, vertical axis from 0.94 to 1.02]

[Figure, second panel: "Amplification for Omega = PI/2 (NEW Lagrange)"; curves: New Lagrange 6pts, New Lagrange 8pts, New Lagrange 10pts, cubic splines; same axes]

• New Lagrange 8 & 10 pts → better than splines
• New Lagrange has zero derivative at 0 and 1 → C1 continuity


Dispersion/Phase error

• Also measures the quality of the numerical scheme
• Definition: phase error for a linear advection of a wave
• Closer to 0 is better [Filbet, Sonnendrücker, CPC 2003]

[Figure, first panel: "Dispersion for Omega = PI/2"; curves: Lagrange 6pts, Lagrange 8pts, Lagrange 10pts, cubic splines; horizontal axis from 0 to 1, vertical axis from −0.04 to 0.04]

[Figure, second panel: "Dispersion for Omega = PI/2"; curves: New Lagrange 6pts, New Lagrange 8pts, New Lagrange 10pts, cubic splines; same axes]

• Lagrange performs better than New Lagrange
• New Lagrange 10 pts is better than splines; 8 pts is good
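
Both quality measures can be computed for the standard Lagrange interpolant with a short Python sketch. The assumptions are mine: a wave exp(i ω j) on a unit-spaced grid is interpolated once at a normalized displacement α in [0, 1] (taken here to be the horizontal axis of the plots), with a stencil of n points centered on the cell [0, 1]; the function names are hypothetical. The modulus of the resulting complex factor is the amplification, and its argument relative to the exact shift is the phase error.

```python
import cmath
import math

def lagrange_symbol(n_points, alpha, omega):
    """Complex factor applied to exp(i*omega*j) by one interpolation at offset alpha."""
    half = n_points // 2
    nodes = range(-half + 1, half + 1)       # stencil centered on the cell [0, 1]
    total = 0.0 + 0.0j
    for j in nodes:
        w = 1.0
        for k in nodes:
            if k != j:
                w *= (alpha - k) / (j - k)
        total += w * cmath.exp(1j * omega * j)
    return total

def amplification(n_points, alpha, omega):
    """Amplitude of the interpolated wave relative to the exact one."""
    return abs(lagrange_symbol(n_points, alpha, omega))

def phase_error(n_points, alpha, omega):
    """Phase of the interpolated wave minus the exact phase omega * alpha."""
    exact = cmath.exp(1j * omega * alpha)
    return cmath.phase(lagrange_symbol(n_points, alpha, omega) / exact)
```

For example, `amplification(6, 0.5, math.pi / 2)` evaluates the 6-point scheme halfway between two grid points, where the damping is expected to be strongest, while at α = 0 and α = 1 the interpolation is exact and the amplification is 1.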


Numerical experiment with Gysela

• Advections with Lagrange interpolations instead of splines
    • Many short-time runs behave well (no significant changes compared to splines) with Lagrange or New Lagrange using 5 to 10 points

• Run a difficult case: kinetic electrons (e−), not-so-fine mesh, aligned advection in (θ, ϕ)
  Nr = 256, Ntheta = 256, Nvpar = 48, Nphi = 32, Nmu = 4, ρ⋆ = 1/95

Interpolation          Status
Spline                 OK
Lagrange 6-pts         fail
Lagrange 7-pts         fail
Lagrange 8-pts         fail
New Lagrange 6-pts     OK
New Lagrange 8-pts     OK

Table 3: Status after a Gysela run of ≈ 400 time steps

• Failures of Lagrange interpolation are due to errors on the distribution function
• New Lagrange degrades gracefully if the distribution is underresolved on the mesh


Time measurements with Gysela

• Run a difficult case: kinetic electrons (e−), not-so-fine mesh, aligned advection in (θ, ϕ)

Interpolation          Total    1D advection v‖    2D advections
Lagrange 8-pts          628          102                332
New Lagrange 8-pts      595           99                300
Spline                  590          110                280
New Lagrange 6-pts      566           96                273
Lagrange 6-pts          560           95                267

Table 4: Execution time (in s.) of a Gysela run of 20 time steps

• New Lagrange is competitive with splines
  Vectorization/optimizations will be undertaken soon


Costs of Lagrange 8-pts versus splines

• Considering one advection step, assuming a large enough cache and 64-bit computations, and excluding the cost of finding the grid index to the left of the target location

• Average cost of 1D interpolation (Lagrange 8-pts):
    • 1 load, 1 store, 48 multiplications, 37 additions

• Average cost of 1D interpolation (cubic spline):
    • 1 load, 1 store, 26 multiplications, 16 additions, 1 division

• Average cost of 2D interpolation (Lagrange 8-pts):
    • 1 load, 1 store, 144 multiplications, 122 additions

• Average cost of 2D interpolation (cubic spline):
    • 1 load, 1 store, 60 multiplications, 40 additions, 2 divisions

• Why is the Lagrange 8-pts execution time so close to that of splines, then?
    • under investigation; possibly memory bandwidth, vectorization, instruction-level parallelism


Lagrange polynomials - Conclusion

• Results obtained:

    • Standard Lagrange polynomials → good Gysela simulations
      → However, some underresolved simulations fail
      → New hybrid Hermite-Lagrange polynomials solve this problem

    • To recover the same accuracy as splines with New Lagrange
      → Needs 6, 8 or 10 points (higher order than splines)
      → Computational costs are competitive for 6 or 8 points (1D & 2D)
      → Vectorization/optimizations will be undertaken


Issue near r = 0: artificial radial inner boundary condition

• Cause: assuming there is a point at r = 0
    • Several operators/solvers contain terms in 1/r
      Problem: field solver, field derivative computations, ...
    • Mesh has a singularity near r = 0 (large number of θ points)

• Gysela simulations use an inner-radius (r_min) boundary condition
    • Physical submodels are needed for all operators at r_min (up to now)
    • Transport solver: what if an eddy goes through the center?
    • Field solver: how to avoid adding an artificial boundary at r_min?

[Figure: zoom on a (r, θ) cut of the distribution function at ϕ = 0, µ = 0.05, v‖ = 3.1 v_th, showing a numerical artifact at r_min due to boundary conditions]


Poisson solver upgraded

• Method to deal with the inner boundary condition:
  Lai, M.-C., Wang, W.-C., Fast direct solvers for Poisson equation on 2D polar and spherical geometries. Numer. Methods for Partial Differential Equations (2002)

• Method directly integrated into Gysela; recipe for the new 2D poloidal solver:
    • Finite differences along r, spectral along θ (as before)
    • First radial point fixed at r_min = ∆r/2 (clever trick): cancellation of 2 terms → no boundary condition at r_min

• Results:
    • No more boundary condition at the magnetic axis (used to be Dirichlet or Neumann in the past)
    • No more numerical artifacts due to the boundary condition at r_min
    • Eddies can go through the magnetic axis
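
The half-cell shift can be illustrated on a model problem in Python. This is a sketch under my own assumptions, not the Gysela solver: a single poloidal Fourier mode m of −(1/r)(r u')' + (m²/r²) u = f with u(r_max) = 0, second-order finite differences on the grid r_i = (i − 1/2) ∆r, and a dense solve. The function name is hypothetical.

```python
import numpy as np

def solve_radial_mode(f, r_max, m=0):
    """Solve -(1/r)(r u')' + (m^2/r^2) u = f on (0, r_max) with u(r_max) = 0."""
    n = len(f)
    dr = r_max / (n + 0.5)                     # chosen so that r_{n+1} = r_max
    r = (np.arange(1, n + 1) - 0.5) * dr       # r_1 = dr/2: no grid point at r = 0
    a = np.zeros((n, n))
    for i in range(n):
        r_minus = r[i] - 0.5 * dr              # equals 0 for the first point
        r_plus = r[i] + 0.5 * dr
        a[i, i] = (r_minus + r_plus) / (r[i] * dr**2) + m**2 / r[i]**2
        if i > 0:
            a[i, i - 1] = -r_minus / (r[i] * dr**2)
        if i < n - 1:
            a[i, i + 1] = -r_plus / (r[i] * dr**2)
        # First row: the coefficient of the would-be inner neighbour is
        # r_minus = 0, so no inner boundary value is needed.
        # Last row: the outer neighbour is u(r_max) = 0 (Dirichlet).
    return r, np.linalg.solve(a, f)
```

For m = 0 and f = 1 the exact solution is u = (r_max² − r²)/4, which this discretization reproduces at the grid points without any condition imposed near the axis.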


Poisson upgrade - result

Removing the boundary condition at r_min: zoom on the electric potential, poloidal cut, at a given time step

Figure 2: Neumann at r_min (old), artifact in the center

Figure 3: Lai & Wang trick at r_min (new), nothing bad in the center


Poisson upgrade - result

Removing the boundary condition at r_min: zoom on the electric potential, poloidal cut, at a given time step

Figure 4: Neumann at r_min (old), artifact in the center

Figure 5: Lai & Wang trick at r_min (new), nothing bad in the center


Vlasov solver - upgrade

• Interpolation at r ≫ r_min:
  using cubic splines as usual (or Lagrange polynomials)

• Interpolation at the very center (near r = 0):
  using bilinear interpolation in (x, y),
  removing the dependency along the θ direction to avoid the singularity

• Interpolation in between:
  mix of 2D cubic spline interpolation and bilinear interpolation,
  with a weighting coefficient depending on the value of r

[Figure: weighting coefficients (between 0 and 1) of the bilinear interpolation and of the 2D cubic splines as functions of r, with marks at r = 0, r = 1.5∆r and r = r_max. Regions: bilinear interpolation in (x, y) for r ∈ [0, r_min]; smooth transition from 2D splines to bilinear; cubic splines 2D interpolation in (r, θ)]
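
The mixing of the two interpolants can be sketched as follows. The slides only state that the weighting coefficient depends on r; the linear ramp and the two radii `r_inner` and `r_outer` used here are hypothetical choices made for illustration, as are the function names.

```python
def spline_weight(r, r_inner, r_outer):
    """Weight of the 2D spline value: 0 for r <= r_inner, 1 for r >= r_outer."""
    if r <= r_inner:
        return 0.0
    if r >= r_outer:
        return 1.0
    return (r - r_inner) / (r_outer - r_inner)   # assumed linear ramp

def blended_value(r, spline_value, bilinear_value, r_inner, r_outer):
    """Mix the two interpolated values according to the radial position r."""
    w = spline_weight(r, r_inner, r_outer)
    return w * spline_value + (1.0 - w) * bilinear_value
```

Because the two weights always sum to 1, the blended value stays between the two interpolated values and the transition is continuous in r.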


Vlasov upgrade - result

Removing the boundary condition at r_min

Figure 6: No interpolation in [0, r_min] (old), artifact at r = 0

Figure 7: Interpolation in [0, r_min] (new), nothing specific at r = 0


Vlasov upgrade - result

Removing the boundary condition at r_min

Figure 8: No interpolation in [0, r_min] (old), artifact at r = 0

Figure 9: Interpolation in [0, r_min] (new), nothing specific at r = 0


Conclusion - inner boundary condition

• Two methods integrated to suppress the inner boundary condition
    • Poisson solver: Lai & Wang method
    • Vlasov solver: specific interpolation in the center, r ∈ [0, r_min]
      → Alternatives to bilinear interpolation will be investigated

• Results:
    • Removes possible artifacts close to r = 0
    • In general, simulations are close to those using the previous boundary condition
      → does not invalidate previous Gysela simulations
