
International Workshop on Numerical Methods for High-Performance Computers

Heidelberg, December 1-3, 2014

Organized by Peter Bastian, Hester Bijl, Christian Klingenberg and Barbara Wohlmuth.

We would like to thank Ingrid Hellwig and Ole Klein for their help with the preparation of the workshop.

This workshop is supported financially by the DFG Priority Programme 1648 “Software for Exascale Computing” as well as the “Heidelberg Graduate School of Mathematical and Computational Methods for the Sciences”.

Contents

1 Schedule 4

2 Invited Talks 7

2.1 Assyr Abdulle: Multiscale and reduced order modeling methods for linear and non-linear homogenization problems  7

2.2 Alexandre Ern: A nonintrusive Reduced Basis Method applied to aeroacoustic simulations  7

2.3 Robert D. Falgout: Multigrid Methods and Software for Exascale Computing  8

2.4 Mike Giles: Multilevel Monte Carlo methods  9

2.5 Mark Hoemmen: Getting the Right Answer Despite Incorrect Hardware  9

2.6 Martin Kronbichler: Fast matrix-free methods for adaptive higher order elements  10

2.7 Claus-Dieter Munz: High Order Schemes for Complex Flow Simulations  11

2.8 Philip L. Roe: Reassessing Lax-Wendroff and similar schemes  12

2.9 Chi-Wang Shu: Positivity-preserving high order schemes in CFD  12

2.10 Carol S. Woodward: A Reconsideration of Fixed Point Methods for Nonlinear Systems  12

3 Contributed Talks 14

3.1 Juan A. Acebrón: Efficient parallel solution of the telegraph equations subject to general boundary conditions by a novel Monte Carlo method  14

3.2 Michael Bader: Petascale Earthquake Simulations with SeisSol  15

3.3 Santiago Badia: A Highly Scalable Asynchronous Implementation of Balancing Domain Decomposition by Constraints  15

3.4 Martin Hanek: Numerical solution of Navier-Stokes equations using Balancing Domain Decomposition by Constraints  17

3.5 Mario Heene: Hierarchical Numerics for High-Dimensional Exascale Computing  18

3.6 Verena Krupp: Exploiting modern HPC systems with hybrid parallelism in high order DG for linear equations  18

3.7 Stéphane Lanteri: High performance discontinuous finite element time-domain solvers for computational nanophotonics  19

3.8 Martin Lanser: A massively parallel domain decomposition / AMG method for elasticity problems  20

3.9 René Milk: Scalable, Hybrid-Parallel Multiscale Methods using DUNE  21

3.10 Miriam Mehl: Recent Advances in Parallel Fluid-Structure-Acoustics Simulations  21

3.11 Eike Hermann Müller: Performance portable multigrid preconditioners for mixed finite element discretisations in atmospheric models  22

3.12 Steffen Müthing: Efficient Discontinuous Galerkin schemes on hybrid HPC architectures  23

3.13 Hannah Rittich: Local Fourier Analysis of Pattern Structured Operators  23

3.14 D. G. Roehm: Parallel Runtime Environments with Cloud Database: A Performance Study for the Heterogeneous Multiscale Method with Adaptive Sampling  24

3.15 Jonatan Nunez-de la Rosa: Higher-Order Discontinuous Galerkin Spectral Element Methods for Computational Astrophysics  25


3.16 Daniel Ruprecht: The parallel-in-time methods Parareal and PFASST: Library development and applications  26

3.17 Gero Schnücke: Towards a high order moving grid method for compressible flow equations  26

3.18 Martin Siebenborn: Structured inverse modeling in diffusive processes  27

3.19 Jonas Thies: A 3D-parallel interior eigenvalue solver  27

3.20 Christian Waluga: Physics-aware solver concepts for geophysical applications on massively parallel architectures  28

3.21 Peter Zaspel: Optimal parallel uncertainty quantification in large-scale flow problems  28

4 Important Information 30

4.1 Places  30

4.2 Public Transport  30

5 List of Participants 32


1 Schedule

The workshop is taking place in room 432 on the fourth floor of building 368 on the “Neuenheimer Feld” campus of Heidelberg University. For information on how to get there see chapter 4 of this document. Coffee breaks are in the “common room” on the fifth floor.

Monday, December 1, 2014

08:50 - 09:20  Registration
09:20 - 09:30  Opening
09:30 - 10:20  Robert D. Falgout (Livermore): Multigrid Methods and Software for Exascale Computing (page 8)
10:20 - 10:45  Eike Hermann Müller (Bath): Performance portable multigrid preconditioners for mixed finite element discretisations in atmospheric models (page 22)
10:45 - 11:15  Coffee break
11:15 - 11:40  Christian Waluga (München): Physics-aware solver concepts for geophysical applications on massively parallel architectures (page 28)
11:40 - 12:05  Daniel Ruprecht (Lugano): The parallel-in-time methods Parareal and PFASST: Library development and applications (page 26)
12:05 - 12:30  Hannah Rittich (Wuppertal): Local Fourier Analysis of Pattern Structured Operators (page 23)
12:30 - 14:00  Lunch break (Mensa)
14:00 - 14:50  Mike Giles (Oxford): Multilevel Monte Carlo methods (page 9)
14:50 - 15:15  Juan A. Acebrón (Lisbon): Efficient parallel solution of the telegraph equations subject to general boundary conditions by a novel Monte Carlo method (page 14)
15:15 - 15:40  Peter Zaspel (Bonn): Optimal parallel uncertainty quantification in large-scale flow problems (page 28)
15:40 - 16:10  Coffee break
16:10 - 17:00  Assyr Abdulle (Lausanne): Multiscale and reduced order modeling methods for linear and non-linear homogenization problems (page 7)
17:00 - 17:25  René Milk (Münster): Hybrid-Parallel Multiscale Methods using DUNE (page 21)
17:25 - 17:50  D. G. Roehm (Stuttgart): Parallel Runtime Environments with Cloud Database: A Performance Study for the Heterogeneous Multiscale Method with Adaptive Sampling (page 24)
18:00 - 19:00  Math meets HPC: Get-together in common room (514, fifth floor)


Tuesday, December 2, 2014

09:00 - 09:50  Alexandre Ern (Paris): A nonintrusive Reduced Basis Method applied to aeroacoustic simulations (page 7)
09:50 - 10:15  Mario Heene (Stuttgart): Hierarchical Numerics for High-Dimensional Exascale Computing (page 18)
10:15 - 10:40  Martin Siebenborn (Trier): Structured inverse modeling in diffusive processes (page 27)
10:40 - 11:10  Coffee break
11:10 - 12:00  Claus-Dieter Munz (Stuttgart): High Order Schemes for Complex Flow Simulations (page 11)
12:00 - 12:25  Stéphane Lanteri (Sophia Antipolis): High performance discontinuous finite element time-domain solvers for computational nanophotonics (page 19)
12:25 - 14:00  Lunch break (Mensa)
14:00 - 14:50  Martin Kronbichler (München): Fast matrix-free methods for adaptive higher order elements (page 10)
14:50 - 15:15  Jonatan Nunez-de la Rosa (Stuttgart): Higher-Order Discontinuous Galerkin Spectral Element Methods for Computational Astrophysics (page 25)
15:15 - 15:40  Verena Krupp (Siegen): Exploiting modern HPC systems with hybrid parallelism in high order DG for linear equations (page 18)
15:40 - 16:05  Steffen Müthing (Heidelberg): Efficient Discontinuous Galerkin schemes on hybrid HPC architectures (page 23)
16:05 - 16:30  Coffee break
16:30 - 17:20  Mark Hoemmen (Albuquerque): Getting the Right Answer Despite Incorrect Hardware (page 9)
17:20 - 17:45  Jonas Thies (Köln): A 3D-parallel interior eigenvalue solver (page 27)
17:45 - 18:10  Martin Lanser (Köln): A massively parallel domain decomposition / AMG method for elasticity problems (page 20)
19:30  Conference dinner (Wirtshaus “Zum Seppl”)


Wednesday, December 3, 2014

09:00 - 09:50  Philip L. Roe (Ann Arbor): Reassessing Lax-Wendroff and similar schemes (page 12)
09:50 - 10:15  Michael Bader (München): Petascale Earthquake Simulations with SeisSol (page 15)
10:15 - 10:40  Miriam Mehl (Stuttgart): To Be Announced (page 21)
10:40 - 11:10  Coffee break
11:10 - 12:00  Carol S. Woodward (Livermore): A Reconsideration of Fixed Point Methods for Nonlinear Systems (page 12)
12:00 - 12:25  Santiago Badia (Castelldefels): A Highly Scalable Asynchronous Implementation of Balancing Domain Decomposition by Constraints (page 15)
12:25 - 14:00  Lunch break (Mensa)
14:00 - 14:50  Chi-Wang Shu (Providence): Positivity-preserving high order schemes in CFD (page 12)
14:50 - 15:15  Martin Hanek (Prague): Numerical solution of Navier-Stokes equations using Balancing Domain Decomposition by Constraints (page 17)
15:15 - 15:40  Gero Schnücke (Würzburg): Towards a high order moving grid method for compressible flow equations (page 26)
15:40  Final remarks and workshop end


2 Invited Talks

2.1 Multiscale and reduced order modeling methods for linear and nonlinear homogenization problems

Assyr Abdulle
Ecole Polytechnique Fédérale de Lausanne (EPFL), Station 8, CH-1015 Lausanne, Switzerland

Abstract: In this talk we will present recent developments in the design and analysis of numerical homogenization methods. Numerical methods for linear and nonlinear partial differential equations that combine multiscale methods with reduced order modeling techniques such as the reduced basis method will be discussed.
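For orientation (generic notation assumed here, not taken from the abstract), a prototypical target of numerical homogenization is the elliptic model problem with rapidly oscillating coefficients,

    -\nabla \cdot \left( a^{\varepsilon}(x) \nabla u^{\varepsilon} \right) = f \quad \text{in } \Omega, \qquad u^{\varepsilon} = 0 \quad \text{on } \partial\Omega,

where the tensor a^{\varepsilon} varies on a fine scale \varepsilon \ll \operatorname{diam}(\Omega); multiscale and reduced order modeling methods aim to approximate the macroscopic behaviour of u^{\varepsilon} without globally resolving the scale \varepsilon.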

The talk is based upon a series of joint works with various collaborators [1,2,3,4,5].

[1] A. Abdulle and Y. Bai, Reduced basis finite element heterogeneous multiscale method for high-order discretizations of elliptic homogenization problems, J. Comput. Phys., vol. 191, num. 1, p. 18-39, 2012.

[2] A. Abdulle and Y. Bai, Reduced order modelling numerical homogenization, Philosophical Transactions of the Royal Society A, vol. 372, num. 2021, 2014.

[3] A. Abdulle, Y. Bai and G. Vilmart, Reduced basis finite element heterogeneous multiscale method for quasilinear elliptic homogenization problems, Discrete Contin. Dyn. Syst., vol. 8, num. 1, 2015.

[4] A. Abdulle and O. Budac, An adaptive finite element heterogeneous multiscale method for Stokes flow in porous media, to appear in SIAM MMS.

[5] A. Abdulle and P. Henning, A reduced basis localized orthogonal decomposition, preprint submitted for publication.

2.2 A nonintrusive Reduced Basis Method applied to aeroacoustic simulations

Fabien Casenave, Alexandre Ern
Université Paris-Est, CERMICS (ENPC), 6 & 8 av Blaise Pascal, 77455 Marne-la-Vallée Cedex 2, France

Tony Lelièvre
Université Paris-Est, CERMICS (ENPC), 6 & 8 av Blaise Pascal, 77455 Marne-la-Vallée Cedex 2, France, and INRIA Rocquencourt, Matherials Team-Project, Domaine de Voluceau, B.P. 105, 78153 Le Chesnay Cedex, France


In many problems such as optimization, uncertainty propagation, and real-time simulations, one needs to solve a parametrized problem for many values of some parameter(s). The Reduced Basis Method (see, e.g., [1]) can be exploited in an efficient way in this multi-query context only if the so-called affine dependence assumption on the operator and right-hand side of the considered problem with respect to the parameters is satisfied. When it is not, the Empirical Interpolation Method (see, e.g., [2]) is usually used to recover this assumption approximately. In both cases, the Reduced Basis Method requires access to and modification of the assembly routines of the corresponding computational code. This leads to an intrusive procedure which is not feasible when working with large (industrial) codes. In this work, we show how the EIM algorithm can be used to turn the Reduced Basis Method into a nonintrusive procedure. We present examples of aeroacoustic problems solved by integral equations. The methodology has been described in more detail in [3,4]; see also [5] for the boundary element method coupled with finite elements applied to the convected Helmholtz equation.
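For orientation (standard reduced basis notation, assumed here rather than taken from the abstract), the affine dependence assumption mentioned above requires that the parametrized operator and right-hand side can be written as

    A(\mu) = \sum_{q=1}^{Q_a} \theta_q^{a}(\mu)\, A_q, \qquad f(\mu) = \sum_{q=1}^{Q_f} \theta_q^{f}(\mu)\, f_q,

with parameter-independent matrices A_q and vectors f_q and scalar coefficient functions \theta_q(\mu). This separation is what allows the expensive assembly to be done once offline, while the online stage only combines precomputed quantities; the Empirical Interpolation Method constructs an approximate expansion of this form when an exact one is not available.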

[1] L. Machiels, Y. Maday, A. T. Patera, C. Prud’homme, D. V. Rovas, G. Turinici, and K. Veroy. Reliable real-time solution of parametrized partial differential equations: Reduced-basis output bound methods. J. Fluids Engineering, 124:70–80, 2002.

[2] M. Barrault, Y. Maday, N. C. Nguyen, and A. T. Patera. An ’empirical interpolation’ method: application to efficient reduced-basis discretization of partial differential equations. Comptes Rendus Mathematique, 339(9):667–672, 2004.

[3] F. Casenave, A. Ern, and T. Lelièvre, A nonintrusive Reduced Basis Method applied to aeroacoustic simulations. Advances Comput. Math., in press, 2014. [URL: link.springer.com/article/10.1007%2Fs10444-014-9365-0]

[4] F. Casenave, Model reduction methods applied to aeroacoustic problems solved by integral equations. PhD Thesis, University Paris-Est, 2013.

[5] F. Casenave, A. Ern, and G. Sylvand, Coupled BEM/FEM for the convected Helmholtz equation with non-uniform flow in a bounded domain. J. Comput. Phys., 257:627–644, 2014.

2.3 Multigrid Methods and Software for Exascale Computing

Robert D. Falgout
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, P.O. Box 808, L-561, Livermore, CA 94551

Multigrid methods are important techniques for efficiently solving huge linear systems and they have already been shown to scale effectively on millions of cores. Future exascale architectures will require solvers to exhibit even higher levels of concurrency (1B cores), minimize data movement, exploit machine heterogeneity, and demonstrate resilience to faults. While considerable research and development remains to be done, multigrid approaches are ideal for addressing these challenges. In this talk, we will discuss issues related to developing multigrid for exascale computing and give an overview of several approaches being pursued in the hypre library to reduce communication costs in algebraic multigrid (AMG) (see [1] for example). We will also discuss our new software library XBraid for doing parallel time integration based on multigrid reduction (MGR) techniques [2]. The advantage of the approach is that it is easily integrated into existing codes because it simply calls the user’s time-stepping routine.


[1] Robert D. Falgout and Jacob B. Schroder, Non-Galerkin coarse grids for algebraic multigrid, SIAM J. Sci. Comput., 36(3):C309–C334, 2014.

[2] R. D. Falgout, S. Friedhoff, Tz. V. Kolev, S. P. MacLachlan, and J. B. Schroder, Parallel time integration with multigrid, SIAM J. Sci. Comput., submitted. LLNL-JRNL-645325.

2.4 Multilevel Monte Carlo methods

Mike Giles
Mathematical Institute, University of Oxford, Oxford, UK

Recently there has been a great increase in interest in stochastic modelling and uncertainty quantification. In both cases, one is usually concerned with estimating the expected value of some function of the solution of a problem involving random inputs from some probability space.

Monte Carlo simulation is a very old approach to treating such problems. In the past, it has often been prohibitively expensive, but due to increases in computing power together with algorithmic improvements such as multilevel Monte Carlo methods, it is now a viable and practical approach for an increasing range of applications.

In this talk, I will explain the very simple ideas behind multilevel Monte Carlo methods, and give a range of examples of their use. Most research in this area is very recent, within the past 5 years or so, and there are lots of opportunities for future work.
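As a pointer to the basic idea (standard MLMC notation, assumed here rather than taken from the talk): if P_\ell denotes the quantity of interest computed on discretization level \ell, the multilevel estimator exploits the telescoping identity

    \mathbb{E}[P_L] = \mathbb{E}[P_0] + \sum_{\ell=1}^{L} \mathbb{E}[P_\ell - P_{\ell-1}],

estimating each term with independent samples. Since the corrections P_\ell - P_{\ell-1} have small variance on fine levels, most samples can be taken on the coarse, cheap levels, which is the source of the cost reduction over standard Monte Carlo.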

[1] M.B. Giles. ’Multi-level Monte Carlo path simulation’. Operations Research, 56(3):607-617, 2008

[2] M.B. Giles. ’Multilevel Monte Carlo methods’, pp. 79-98 in Monte Carlo and Quasi-Monte Carlo Methods 2012, Springer, 2014

[3] M.B. Giles. ’Multilevel Monte Carlo methods’, Acta Numerica 2015 article and accompanying MATLAB codes: http://people.maths.ox.ac.uk/gilesm/acta/

[4] MLMC community webpage: http://people.maths.ox.ac.uk/gilesm/mlmc_community.html

2.5 Getting the Right Answer Despite Incorrect Hardware

Mark Hoemmen
Sandia National Laboratories, P.O. Box 5800, Albuquerque, NM, USA 87185-1320

Users demand that numerical computations produce correct results, despite many sources of uncertainty. Algorithm developers have studied how to control some of these sources. One source that they understand poorly comes from incorrect behavior of computer hardware. Cosmic rays or other events may corrupt stored data or cause arithmetic mistakes. Today’s hardware can correct some of these errors before they disturb running applications. However, correction costs energy, performance, or both. Energy increasingly constrains modern computer hardware, especially for the largest parallel computers. If it becomes too expensive to correct all errors in hardware, applications may experience silent data corruption (SDC). Hardware vendors fight this trend, but also guard information about fault causes and rates as proprietary. Thus, there is considerable uncertainty both about how SDC will manifest in applications (the “fault model”), and how often it may occur (the “fault rate”).


We present a two-part approach to mitigate this uncertainty. First, we advocate skeptical programming. Software developers should trust intermediate results less. Inexpensive checks of key invariants can exclude large intermediate errors. This bounds any remaining undetected errors, making it easier for algorithms to converge through them. While this requires algorithm-specific analysis, it identifies and rewards algorithms with natural “fault tolerance.” Checking invariants also helps detect programmer error. A healthy skepticism about subroutines’ correctness makes sense even with perfectly reliable hardware, especially since modern applications may have millions of lines of code and rely on many third-party libraries. Skeptical programming works today and is a good idea anyway, regardless of hardware trends.

Skepticism does not suffice for correctness despite SDC. This calls for a selective reliability programming model that lets us choose parts of an algorithm that we insist be correct. True selective reliability requires collaboration with computer architects and systems experts. We encourage such collaboration and participate in it ourselves, but realize that it will take time to bear fruit. Nevertheless, we have made progress in algorithms by assuming that such a programming model exists. By combining selective reliability and skeptical programming, we have made a set of iterative linear solvers that can get the right answer despite unbounded faults.
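To make the skeptical-programming idea concrete, here is a minimal sketch in Python (an illustration only, not code from the talk; the stationary iteration, the mismatch threshold and the recovery action are all assumptions made for this example): a Richardson iteration that periodically recomputes its residual from scratch and treats a large disagreement with the cheaply updated residual as a sign of silent corruption or a bug.

    import numpy as np

    def skeptical_richardson(A, b, omega, tol=1e-10, max_iter=1000, check_every=10):
        """Richardson iteration x_{k+1} = x_k + omega * r_k with an invariant check:
        the recursively updated residual must agree with an independently
        recomputed one; a large discrepancy triggers a cheap recovery."""
        x = np.zeros_like(b)
        r = b - A @ x
        for k in range(max_iter):
            x = x + omega * r
            r = r - omega * (A @ r)          # cheap recursive residual update
            if k % check_every == 0:
                r_true = b - A @ x           # independent recomputation (the invariant)
                if np.linalg.norm(r - r_true) > 1e-8 * np.linalg.norm(b):
                    r = r_true               # recover by discarding the suspect value
            if np.linalg.norm(r) <= tol * np.linalg.norm(b):
                return x, k + 1
        return x, max_iter

    # Example: a diagonally dominant tridiagonal system; omega is chosen so the
    # iteration converges (spectral radius of I - omega*A is about 0.5 here).
    n = 100
    A = np.diag(np.full(n, 4.0)) + np.diag(np.full(n - 1, -1.0), 1) + np.diag(np.full(n - 1, -1.0), -1)
    x, iters = skeptical_richardson(A, np.ones(n), omega=0.25)
    print(iters, np.linalg.norm(np.ones(n) - A @ x))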

2.6 Fast matrix-free methods for adaptive higher order elements

Martin Kronbichler
Institute for Computational Mechanics, Technische Universität München, Boltzmannstr. 15, 85748 Garching b. München, Germany

Most adaptive finite element codes rely on sparse matrix kernels that are memory bandwidth bound and have thus only seen modest performance gains with advancing computer architectures in recent years. The work in [1] proposed element-wise quadrature as a means to evaluate operators more quickly. For hexahedral elements with tensor-product basis functions, so-called sum factorization algorithms that apply and integrate shape functions along one dimension at a time provide the best operation counts known. They have their roots in the spectral element community where high polynomial degrees of about six or higher are popular. For lower degrees, integration still considerably increases the operation count as compared to a linear operator in assembled sparse matrix form. However, our implementation from [1] shows that, when implemented well, a speedup of a factor three over sparse matrix-vector products is possible already for quadratic elements, with the gap increasing with the order. Most important to this development is a significant reduction in the global memory access. While this already pays off for serial execution, the effect is more pronounced on today’s parallel architectures where computing power increases more than memory access speed. The implementation can successfully exploit the arithmetic resources provided by SIMD (vectorization) as well as intra-node parallelism, reaching 40%–70% of peak on modern CPUs. For the latter, both dynamic scheduling of work in shared memory that avoids conflicting writes on neighboring elements [2] as well as plain MPI are available.
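The following toy sketch in Python (an assumption of this summary, not the deal.II implementation) illustrates the sum-factorization idea for a 3D tensor-product operator: the 1D matrix is applied along one dimension at a time, giving roughly O(n^4) work per element instead of the O(n^6) of applying the assembled matrix; for simplicity the same square 1D matrix is used in every direction.

    import numpy as np

    def apply_sum_factorized(A1, u):
        """Apply the tensor-product operator A1 (x) A1 (x) A1 to coefficients u
        of shape (n, n, n), one dimension at a time."""
        v = np.einsum('ai,ijk->ajk', A1, u)   # contract the first index
        v = np.einsum('bj,ajk->abk', A1, v)   # then the second
        v = np.einsum('ck,abk->abc', A1, v)   # then the third
        return v

    n = 5
    rng = np.random.default_rng(0)
    A1 = rng.standard_normal((n, n))          # 1D shape-function/quadrature matrix
    u = rng.standard_normal((n, n, n))

    # Reference: assemble the full Kronecker-product matrix and apply it directly.
    full = np.kron(np.kron(A1, A1), A1)
    assert np.allclose(apply_sum_factorized(A1, u), (full @ u.ravel()).reshape(n, n, n))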

Our matrix-free methods are implemented in the open-source finite element library deal.II [3] and have translated to considerable speedups in several applications, including geometric multigrid solvers, algebraic multigrid with matrix-free operators used as a replacement for the finest-level operator, and various explicit time stepping schemes. For the incompressible Navier–Stokes equations with implicit time stepping in 3D, the performance gain allows us to run quadratic elements at a similar cost as a sparse-matrix-based implementation of linear elements with the same element count.


In my talk, I will also discuss the perspectives of the matrix-free algorithm on current and future throughput processors (GPUs, Xeon Phi), in particular with respect to the control flow divergence on SIMD/SIMT level for adaptive meshes with hanging node constraints or discontinuous Galerkin face integrals, as well as the cache usage.

[1] M. Kronbichler, K. Kormann, A generic interface for parallel cell-based finite element operator application, Computers & Fluids 63:135–147, 2012

[2] K. Kormann, M. Kronbichler, Parallel finite element operator application: Graph partitioning and coloring, 2011 IEEE 7th International Conference on E-Science, pp. 332–339, 2011

[3] W. Bangerth, T. Heister, L. Heltai, G. Kanschat, M. Kronbichler, M. Maier, B. Turcksin, T.D. Young, The deal.II library, version 8.1, http://arxiv.org/abs/1312.2266v4, 2013

2.7 High Order Schemes for Complex Flow Simulations

Claus-Dieter Munz, Andrea Beck, Florian Hindenlang
Institute of Aerodynamics and Gas Dynamics, University of Stuttgart, Pfaffenwaldring 21, 70550 Stuttgart

Gregor Gassner
Institute of Mathematics, University of Cologne, Weyertal 86-90, 50931 Köln

Computational fluid dynamics has become a key technology in the development of new products in many industrial fields. Despite the progress made, the simulation of unsteady flow problems with high fidelity turbulence models is still impractical for various important applications in terms of user time and computational resources. Besides the increase of computer power, a substantial improvement of the numerical methods is necessary to perform simulations of turbulent flows for industrial problems within the development and design process. While the standard schemes in industry are still the second order finite volume schemes, in recent times high order schemes are intensively studied for their use in industrial applications.

We believe that high order discontinuous Galerkin schemes on high performance computers are the most promising candidates to tackle the challenging complex flow problems in the future. In this talk we mainly consider two aspects in the development of novel numerical schemes for unsteady flow. After a short introduction to the spectral element discontinuous Galerkin schemes, we present results with respect to their scalability and efficiency on massively parallel systems. Another focus of the talk will be an attempt to answer the question: Are high order schemes really favorable for under-resolved turbulent flow simulations? Efficient strategies for large eddy simulation with high order discontinuous Galerkin schemes, including both numerical and mathematical modeling aspects, are addressed, see [1,2]. These topics are completed by showing simulations of well-known benchmark problems and of industrial problems.

[1] A. Beck, T. Bolemann, D. Flad, H. Frank, G. Gassner, F. Hindenlang, C.-D. Munz: High order discontinuous Galerkin spectral element methods for transitional and turbulent flow simulations, International Journal for Numerical Methods in Fluids 76 (2014), 522-548

[2] G. Gassner, A. Beck: On the accuracy of high-order discretizations for under-resolved turbulence simulations, Theoretical and Computational Fluid Dynamics 27 (2013), 221-237


2.8 Reassessing Lax-Wendroff and similar schemes

Philip L. Roe
University of Michigan, Ann Arbor

2.9 Positivity-preserving high order schemes in CFD

Chi-Wang Shu
Division of Applied Mathematics, Brown University, Providence, RI 02912, USA

We give a survey of our recent work with collaborators on the construction of uniformly high order accurate discontinuous Galerkin (DG) and weighted essentially non-oscillatory (WENO) finite volume (FV) and finite difference (FD) schemes which satisfy a strict maximum principle for nonlinear scalar conservation laws, passive convection in incompressible flows, and nonlinear scalar convection-diffusion equations, and preserve positivity for density and pressure for compressible Euler systems in computational fluid dynamics. These schemes are referred to as bound-preserving high order schemes. A general framework (for arbitrary order of accuracy) is established, which consists of the following 3 ingredients: (1) a first order accurate scheme which has the bound-preserving property under a suitable CFL condition; (2) a simple scaling limiter applied to the high order DG or FV method, involving only evaluations of the polynomial solution at certain quadrature points, which does not affect high order accuracy and guarantees the same bound-preserving property for the first order Euler forward time discretization under a modified CFL condition; (3) a strong stability preserving (SSP) high order Runge-Kutta or multistep time discretization which increases time accuracy and maintains the same bound-preserving property. One remarkable property of this approach is that the second ingredient above is straightforward to extend to two and higher dimensions on arbitrary triangulations, and the scaling limiter is local in the cell, thus it keeps the parallel efficiency of the original algorithm. The schemes constructed in this framework are extremely robust, especially for problems involving strong shocks or even δ-function singularities. We will list applicability of the method for problems including arbitrary equations of state, source terms, integral terms, shallow water equations, Lagrangian schemes, and positivity-preserving high order finite volume schemes and piecewise linear DG schemes for convection-diffusion equations. Numerical tests demonstrating the good performance of the scheme will be reported.
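As an illustration of ingredient (2) in the scalar case (notation assumed here, not quoted from the talk), the scaling limiter typically replaces the cell polynomial p_j by a version squeezed towards the cell average \bar{u}_j,

    \tilde{p}_j(x) = \bar{u}_j + \theta_j \left( p_j(x) - \bar{u}_j \right), \qquad \theta_j = \min\left\{ 1,\; \frac{\bar{u}_j - m}{\bar{u}_j - \min_{x \in S_j} p_j(x)},\; \frac{M - \bar{u}_j}{\max_{x \in S_j} p_j(x) - \bar{u}_j} \right\},

where [m, M] is the admissible bound and S_j the set of quadrature points in cell j; since \tilde{p}_j preserves the cell average, the modification does not destroy the order of accuracy.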

2.10 A Reconsideration of Fixed Point Methods for Nonlinear Systems

Carol S. Woodward
Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, P.O. Box 808, L-561, Livermore, CA 94551

Newton-Krylov methods have proven to be very effective for solution of large-scale, nonlinear systems of equations resulting from discretizations of PDEs. However, increasing complexities and newer models are giving rise to nonlinear systems with characteristics that challenge this commonly used method. In particular, for many problems, Jacobian information may not be available or it may be too costly to compute. Moreover, linear system solves required to update the linear model within each Newton iteration may be too costly on newer machine architectures.


Fixed point iteration methods have not been as commonly used for PDE systems due to their slow convergence rate. However, these methods do not require Jacobian information nor do they require a linear system solve. In addition, recent work has employed Anderson acceleration as a way to speed up fixed point iterations [4, 1, 2, 3].
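For reference, a compact Python sketch of Anderson acceleration for a fixed-point map x = g(x) (a generic textbook variant with window size m and no damping; an illustration under these assumptions, not the implementation used in the references):

    import numpy as np

    def anderson_fixed_point(g, x0, m=3, tol=1e-10, max_iter=200):
        """Anderson-accelerated iteration for x = g(x): combine the last few
        g-values with least-squares weights that minimize the mixed residual."""
        x = np.asarray(x0, dtype=float)
        G_hist, F_hist = [], []              # g(x_k) values and residuals f_k = g(x_k) - x_k
        for k in range(max_iter):
            gx = g(x)
            f = gx - x
            if np.linalg.norm(f) < tol:
                return x, k
            G_hist.append(gx)
            F_hist.append(f)
            if len(F_hist) > m + 1:          # keep a window of at most m+1 entries
                G_hist.pop(0)
                F_hist.pop(0)
            F = np.column_stack(F_hist)
            if F.shape[1] == 1:
                alpha = np.array([1.0])
            else:
                # minimize ||F alpha|| subject to sum(alpha) = 1 by eliminating
                # the last coefficient: alpha = (gamma, 1 - sum(gamma)).
                dF = F[:, :-1] - F[:, [-1]]
                gamma, *_ = np.linalg.lstsq(dF, -F[:, -1], rcond=None)
                alpha = np.append(gamma, 1.0 - gamma.sum())
            x = np.column_stack(G_hist) @ alpha   # next iterate: weighted g-values
        return x, max_iter

    # Example: the scalar fixed point x = cos(x).
    root, iters = anderson_fixed_point(np.cos, np.array([1.0]))
    print(root, iters)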

In this presentation, we will discuss reasons for success of Newton’s method as well as its weaknesses. Fixed point and Anderson acceleration will be presented along with a summary of known convergence results for this accelerated method. Results will show benefits from this method for variably saturated subsurface flow and for a materials science application. In addition, the impacts of these methods will be discussed for large-scale problems on next generation architectures.

[1] Anderson D G, "Iterative procedures for nonlinear integral equations," J Assoc Comput Mach,1965;12:547–60.

[2] Carlson N N, Miller K. Design and application of a gradient-weighted moving finite elementcode I: in one dimension. SIAM J Sci Comput 1998;183:275–87.

[3] Lott, P. A., H. F. Walker, C. S. Woodward, U. M. Yang, “An accelerated Picard method for nonlinear systems related to variably saturated flow,” Adv. Wat. Resour., 38 (2012), pp. 92-101. DOI: 10.1016/j.advwatres.2011.12.013.

[4] Walker H. F. and Ni P., "Anderson acceleration for fixed-point iterations," SIAM J Num Anal2011;49:1715–35.


3 Contributed Talks

3.1 Efficient parallel solution of the telegraph equations subject to general boundary conditions by a novel Monte Carlo method

Juan A. Acebrón
Department of Information Science and Technology, ISCTE-University Institute of Lisbon, Av. Forças Armadas, 1649-026 Lisboa, Portugal, and INESC-ID/IST, Technical University of Lisbon, Rua Alves Redol, 9, 1000-029 Lisboa, Portugal

Marco A. Ribeiro
Department of Information Science and Technology, ISCTE-University Institute of Lisbon, Av. Forças Armadas, 1649-026 Lisboa, Portugal, and Instituto de Telecomunicações, Av. Rovisco Pais 1, 1049-001 Lisboa, Portugal

In this paper, a new algorithm based on Monte Carlo simulations [1] is presented for solving the one-dimensional telegraph equations governing the evolution of voltage and current in a two-wire transmission line in a bounded domain subject to general boundary conditions. The proposed numerical scheme does not require discretizing any time variable to obtain the random time, as in the classical Kac theory, and the underlying algorithm is therefore much more efficient from the computational point of view. Moreover, the algorithm has been validated by comparing the results with those obtained with the FDTD method for a typical two-wire transmission line terminated at both ends with Dirichlet and Neumann boundary conditions. Since the algorithm is based on Monte Carlo simulations, it does not suffer from any numerical dispersion and dissipation issues, as standard finite-difference-based numerical schemes do on a lossy medium. In practice this allowed us to develop an efficient numerical method capable of outperforming the classical FDTD method for large-scale problems and high-frequency signals. The most important difference compared with the classical methods used so far for solving such a hyperbolic partial differential equation rests on the possibility of computing the solution at a single point, making it essentially a meshless-type method. Since no computational mesh is required, the well-known memory constraints for solving large-scale and high-dimensional problems are minimized [2]. Moreover, since the solution is computed by taking the average of independent calculations, the underlying algorithms are indeed well suited for parallel computing [3].
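For context (standard transmission-line notation, assumed here and not taken from the abstract), the one-dimensional telegraph equations for the line voltage v(x, t) and current i(x, t) read

    \frac{\partial v}{\partial x} = -L \frac{\partial i}{\partial t} - R\, i, \qquad \frac{\partial i}{\partial x} = -C \frac{\partial v}{\partial t} - G\, v,

with R, L, G, C the per-unit-length resistance, inductance, conductance and capacitance of the two-wire line; eliminating one of the unknowns yields the damped second-order hyperbolic (telegraph) equation referred to above.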

[1] M.N.O. Sadiku, Monte Carlo methods for electromagnetics, CRC press, 2009.

[2] J.A. Acebrón, A. Rodríguez-Rozas, and R. Spigler, Efficient parallel solution of nonlinear parabolic partial differential equations by a probabilistic domain decomposition, J. Sci. Comput., 43:135–157, 2010.

[3] J.A. Acebrón, and R. Spigler, Supercomputing applications to the numerical modeling ofindustrial and applied mathematics problems, J. Supercomput, 40:67–80, 2007.


3.2 Petascale Earthquake Simulations with SeisSol

Michael Bader, Alexander Breuer, Sebastian Rettenberger
Department of Informatics, Technische Universität München, Boltzmannstr. 3, 85748 Garching, Germany

Alice-Agnes Gabriel, Christian Pelties
Department of Earth and Environmental Sciences, Ludwig-Maximilians-University Munich, Theresienstr. 41, 80333 Munich, Germany

Alexander Heinecke
Intel Parallel Computing Lab, Intel Corporation, 2200 Mission College Blvd, Santa Clara, CA 95054, USA

We present recent work on optimizing the earthquake simulation code SeisSol for current petascale supercomputers, esp. the Xeon Phi platforms Stampede and Tianhe-2. SeisSol solves the 3D elastic wave equation using an Arbitrary high order DERivative Discontinuous Galerkin (ADER-DG) discretization on unstructured adaptive tetrahedral meshes, thus allowing high-order accuracy in space and time. The dynamic rupture process on geometrically complex tectonic faults is simulated together with seismic wave propagation, leading to earthquake simulations with unprecedented model complexity.

SeisSol’s ADER-DG approach is particularly attractive for petascale simulations and beyond: the high-order DG discretization fosters efficient parallelization and provides high computational intensity; local time stepping offered by the ADER method promises a cure for the small time steps expected for large-scale adaptive seismic wave simulations. Key to realizing petascale simulations with high peak efficiency was the hardware-oriented optimization of sparse and dense matrix operations in time, volume and boundary kernels [1]. To fully exploit Xeon Phi coprocessors, a careful hybrid parallelization was necessary, including an offload scheme tailored to the multiphysics simulation [2].

Recently, we set up a simulation of the 1992 Landers Earthquake using a mesh with 191 million elements (10^11 degrees of freedom). On SuperMUC, a production run (over 7 h of computation time) for this scenario achieved 1.25 PFLOPS sustained performance. On Stampede and Tianhe-2, strong scaling tests ran at up to 85 % parallel efficiency and more than 20 % peak performance. For the wave propagation component, a performance of 8.6 PFLOPS was achieved in a weak scaling test on 8192 nodes on Tianhe-2.

[1] A. Breuer, A. Heinecke, S. Rettenberger, M. Bader, A.-A. Gabriel and C. Pelties: Sustained Petascale Performance of Seismic Simulations with SeisSol on SuperMUC. In: Supercomputing – 29th International Conference, ISC 2014, LNCS 8488, p. 1–18. Springer, Heidelberg, 2014. PRACE ISC Award 2014.

[2] A. Heinecke, A. Breuer, S. Rettenberger, M. Bader, A.-A. Gabriel, C. Pelties, A. Bode, W. Barth, K. Vaidyanathan, M. Smelyanskiy and P. Dubey: Petascale High Order Dynamic Rupture Earthquake Simulations on Heterogeneous Supercomputers. In: Supercomputing 2014, The International Conference for High Performance Computing, Networking, Storage and Analysis. Accepted as Gordon Bell Finalist.

3.3 A Highly Scalable Asynchronous Implementation of Balancing Domain Decomposition by Constraints

Santiago Badia, Alberto F. Martín, Javier Principe
Centre Internacional de Mètodes Numèrics en Enginyeria (CIMNE), Universitat Politècnica de Catalunya, Parc Mediterrani de la Tecnologia, Esteve Terrades 5, 08860 Castelldefels, Spain.

The numerical approximation of partial differential equations (PDEs) by the finite element (FE) method requires the solution of sparse linear systems with several hundreds and even thousands of millions of equations/unknowns, which is only possible by appropriately exploiting current multicore-based distributed-memory machines. A natural strategy to achieve this goal is to rely on domain decomposition (DD) methods, in which the divide and conquer principle is exploited through the solution of local problems and communication among neighboring subdomains only, resulting in highly parallel preconditioners. Two-level DD preconditioners also include a global correction in order to achieve quasi-optimal condition number bounds, i.e., independent of the number of subdomains and the global problem size, for second-order coercive problems. The global correction involves the solution of a “small” coarse-grid problem that couples all the subdomains, providing a global mechanism for exchanging information. We focus on the Balancing DD by Constraints (BDDC) preconditioner [1], for which optimal condition number bounds can be proved.
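For context, a standard result from the BDDC literature (stated here in generic notation, not quoted from this abstract): for second-order coercive problems the preconditioned operator satisfies a quasi-optimal bound of the form

    \kappa\left( M_{\mathrm{BDDC}}^{-1} A \right) \le C \left( 1 + \log \frac{H}{h} \right)^{2},

with H the subdomain size, h the mesh size, and C independent of the number of subdomains, which is the sense in which the condition number is independent of the number of subdomains and the global problem size.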

However, how the scalability of optimal algorithms behaves in practice depends on a number of factors, the most important being the cost of the solution of the coarse problem, whose size increases (at best) linearly with respect to the number of subdomains. For large-scale problems, the coarse problem rapidly becomes the bottleneck of the algorithm [2]. In this contribution we discuss three actions to tackle this problem.

The first action, developed in [3], consists in exploiting the orthogonality (with respect to the energy inner product) of the coarse and fine correction spaces of the BDDC preconditioner to develop an algorithm in which the corresponding corrections are computed in parallel. We propose a novel parallelization approach of BDDC based on overlapped/asynchronous fine-grid and coarse-grid duties in time, and we present a discussion of how these novel techniques are exploited in order to reach maximum performance benefit.

The second action is to use inexact/approximate solvers with linear complexity for the coarse problem. The BDDC preconditioner requires the solution of several local problems and the global problem, and different strategies can be used for their (approximate) solution. On top of the first strategy, in [4] we explore several combinations, including a single application of an algebraic multigrid (AMG) preconditioner or even an AMG-preconditioned Krylov method with a coarse relative residual tolerance, for the approximate solution of local Dirichlet/Neumann problems and/or the global coarse-grid problem.

The third action is to recursively apply the BDDC method to solve the coarse problem, resulting in a multilevel algorithm. This idea has already been developed in [5], but we combine it with the first action above in a work currently under development. A key aspect to efficiently generalize this strategy in a multilevel setting is how to map the computations/communications at each level to MPI tasks in such a way that a high degree of overlapping is achieved. In order to reach this goal, a strategically defined hierarchy of MPI communicators is employed. Besides, in order to implement the transfer of coarse-grid problem related data among pairs of consecutive levels, say level k and k + 1, there are as many groups as MPI tasks on level k + 1, and each group has a root MPI task in level k + 1 that collects (partial) contributions to the coarse problem from MPI tasks in level k. This way, the coarse problem matrix and vectors are never centralized in an MPI task but always distributed among the MPI tasks of the next level, which makes it possible to maximize multilevel parallelism.

We show weak scalability studies of our approach for the 3D Poisson and linear elasticity problems, up to 100,000 cores on structured meshes and up to 8K cores on unstructured meshes, on state-of-the-art multicore-based distributed-memory machines (JUQUEEN, HELIOS, and CURIE), which confirm the success of this strategy.


[1] C. R. Dohrmann. A preconditioner for substructuring based on constrained energy minimization. SIAM Journal on Scientific Computing, Vol. 25 (1), 246–258, 2003.

[2] S. Badia, A. F. Martín, J. Principe. Implementation and scalability analysis of balancing domain decomposition methods. Archives of Computational Methods in Engineering, Vol. 20 (3), 239–262, 2013.

[3] S. Badia, A. F. Martín, J. Principe. A Highly Scalable Parallel Implementation of Balancing Domain Decomposition by Constraints. SIAM Journal on Scientific Computing, Vol. 36 (2), C190–C218, 2014.

[4] S. Badia, A. F. Martín, J. Principe. On an overlapped coarse/fine implementation of balancing domain decomposition with inexact solvers. Submitted, 2014.

[5] J. Mandel, B. Sousedík and C. R. Dohrmann, Multispace and multilevel BDDC, Computing, Vol. 83 (2), 55–85, 2008.

3.4 Numerical solution of Navier-Stokes equations using Balancing Domain Decomposition by Constraints

Martin Hanek, Pavel Burda
Dept. of Mathematics, Faculty of Mechanical Engineering, Czech Technical University, Karlovo náměstí 13, 121 35 Prague 2, Czech Republic

Jakub Šístek
Institute of Mathematics, Academy of Sciences of the Czech Republic, Žitná 25, 115 67 Prague 1, Czech Republic

We deal with numerical simulation of the incompressible flow using Balancing Domain Decomposition by Constraints (BDDC). This method was introduced for elasticity problems in [1]. In [2], the method was used for the Stokes problem discretised by finite elements with continuous pressure approximation. BDDC was modified for nonsymmetric matrices arising from solving Euler equations in [3].

In our contribution, we explore the applicability of the BDDC method for nonsymmetric problems arising from linearisation of incompressible Navier-Stokes equations. Picard iteration is considered as the nonlinear solver. One step of the BDDC method is applied as the preconditioner for the full linearised system when it is solved by the BiCGstab method.

We present results for three 3-D problems — lid-driven cavity, backward-facing step and a hydrostatic bearing — computed on 64 cores of a parallel computer.

[1] Dohrmann, CR., A preconditioner for substructuring based on constrained energy minimization, SIAM J Sci Comput 25(1), 246-258, 2003

[2] Šístek, J., Sousedík, B., Burda, P., Mandel, J., Novotný, J., Application of the parallel BDDC preconditioner to Stokes flow, Computers and Fluids 46, 429-435, 2011

[3] Yano, M., Massively Parallel Solver for the High-Order Galerkin Least-Squares Method, PhD. thesis, MIT, Massachusetts, 2009


3.5 Hierarchical Numerics for High-Dimensional Exascale Computing

Mario Heene, Dirk Pflüger
Institute for Parallel and Distributed Systems, University of Stuttgart, Stuttgart, Germany

Alfredo Parra Hinojosa, Hans-Joachim Bungartz
Chair of Scientific Computing, Technische Universität München, Munich, Germany

High-dimensional problems pose a challenge for tomorrow’s supercomputing. Problems that require the joint discretization of more dimensions than space and time are among the most compute-hungry ones and thus standard candidates for exascale computing and even beyond. Our SPPEXA project EXAHD tackles such problems by a hierarchical extrapolation approach: the sparse grid combination technique [1]. This approach provides novel ways to deal with central problems in (future) high-performance computing, such as scalability, load balancing and resilience [2].
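For orientation, the classical combination formula from [1] (written here in a generic zero-based indexing convention, which is an assumption of this summary): the sparse grid solution of a d-dimensional problem on level n is approximated by a weighted sum of solutions u_{\mathbf{l}} computed on coarse anisotropic full grids of level \mathbf{l} = (l_1, \dots, l_d),

    u_n^{(c)} = \sum_{q=0}^{d-1} (-1)^{q} \binom{d-1}{q} \sum_{|\mathbf{l}|_1 = n - q} u_{\mathbf{l}}.

Each partial solution is comparatively cheap and can be computed independently of the others; this is the additional level of parallelism referred to in the next paragraph.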

As an exemplary prototype for high-dimensional problems, we study the massively parallel simulation of plasma turbulence with the code GENE. Exploiting the additional level of parallelism – introduced by the combination technique – is promising to ensure the scalability of application codes like GENE on future exascale machines. Furthermore, by mitigating the curse of dimensionality, it offers means to tackle problem sizes that would typically be out of scope merely due to the number of unknowns. We have developed efficient communication schemes that minimize the global communication and synchronization overhead by exploiting the hierarchical structure of the combination solution, as well as suitable load balancing schemes. Additionally, by incorporating ideas from dimensionally adaptive sparse grids and the generalized combination technique, it is possible to achieve full algorithm-based fault tolerance without the need for checkpoint restart.

First experiments applying the combination technique to large-scale initial value computations with GENE on the supercomputers Hermit and SuperMUC show promising results with respect to the approximation quality and the use of computational resources.

[1] M. Griebel, M. Schneider, C. Zenger. A combination technique for the solution of sparse grid problems. Iterative Methods in Lin. Alg., pp. 263–28, 1992.

[2] D. Pflüger, H.-J. Bungartz, M. Griebel, F. Jenko, T. Dannert, M. Heene, A. Parra Hinojosa, C. Kowitz, and P. Zaspel. EXAHD: An Exa-Scalable Two-Level Sparse Grid Approach for Higher-Dimensional Problems in Plasma Physics and Beyond. Workshop on Software for Exascale Computing at Euro-Par 2014.

3.6 Exploiting modern HPC systems with hybrid parallelism in high order DG for linear equations

Verena Krupp, Jens Zudrop, Harald Klimach, Sabine Roller
Simulationstechnik und wissenschaftliches Rechnen, Universität Siegen, [email protected]

Spectral and high order methods are attractive for linear equations like the acoustic wave equation or Maxwell equations. Such methods can be implemented efficiently, as an arbitrary modal basis can be used to represent the solution. We show that a high order discontinuous Galerkin (DG) scheme can exploit hybrid parallelism on modern supercomputers and efficiently solves three-dimensional problems on massively parallel systems like the SuperMUC. The implementation uses Legendre polynomials to represent the solution and thereby can exploit their orthogonality as well as their recursive definition to allow fast matrix multiplications. Overall, the computational complexity for linear problems is proportional to the number of degrees of freedom.

Due to the low dissipation and dispersion achieved by spectral discretizations, it would be optimal to use them for linear equations. However, there are usually geometrical restrictions that do not allow the deployment of spectral methods. The DG method offers an elegant solution, as it confines the spectral approximation to a local basis in finite elements. Furthermore, in the context of HPC, the need for partitioned distributed computations arises. While the coupling of the degrees of freedom within each element is tight, neighboring elements only interact through surface information. Therefore, to take advantage of HPC systems and comply with geometrical constraints, large problems can be distributed to an appropriate number of elements. High resolution can then be achieved by increasing the modal representation within each element. The two levels of operation offered by the DG scheme can be computationally exploited by hybrid parallelization. Data parallelism within elements, with a tight coupling of degrees of freedom, can be exploited by OpenMP threads, while partitions of elements can be computed in parallel by MPI processes. Our solver Ateles [1] combines this numerical scheme with an octree mesh that allows for an efficient neighborhood identification on distributed systems. By free choice of the spatial scheme order and the hybrid parallelism, the solver can be adapted to the executing machine. With the presented approach it is possible to achieve a parallel efficiency of 88.7 % for a 64th order discretization on 131,072 cores of SuperMUC with respect to the performance of a single shared memory node (16 cores). The largest contribution to the loss of efficiency (from 100 to 94.2 %) is due to the step from a single node to two nodes and thus the need for communication over the network. The highest performance can be achieved when deploying 4 OpenMP threads per MPI process.

[1] J. Zudrop, H. Klimach, M. Hasert, K. Masilamani, and S. Roller, A fully distributed CFD framework for massively parallel systems. In Cray User Group, Stuttgart (2012)

3.7 High performance discontinuous finite element time-domain solvers for computational nanophotonics

Stéphane Lanteri, Raphaël Léger, Jonathan Viquerat
Inria Sophia Antipolis-Méditerranée research center, Nachos project-team, 06902 Sophia Antipolis Cedex, France

Claire Scheid
University of Nice - Sophia Antipolis, J.A. Dieudonné Laboratory UMR CNRS 7351, 06108 Nice Cedex 02, France

Tristan Cabel, Gabriel Hautreux
CINES, 950 rue de Saint-Priest, Montpellier, France

Nanophotonics is the field of science and technology which aims at establishing and using the peculiar properties of light and light-matter interaction in various nanostructures. Nanophotonics includes all the phenomena that are used in optical sciences for the development of optical devices. Therefore, nanophotonics finds numerous applications such as in optical microscopy, the design of optical switches and electromagnetic chip circuits, transistor filaments, etc. Because of its numerous scientific and technological applications (e.g. in relation to telecommunication, energy production and biomedicine), nanophotonics represents an active field of research increasingly relying on numerical modeling besides experimental studies. The numerical study of electromagnetic wave propagation in interaction with nanometer scale structures generally relies on the solution of the system of time-domain Maxwell equations, taking into account an appropriate physical dispersion model, such as the Drude or Drude-Lorentz models, for characterizing the material properties of the involved nanostructures at optical frequencies. When dealing numerically with Drude and Drude-Lorentz models, the FDTD method is a widely used approach for solving the resulting system of partial differential equations. However, for nanophotonic applications, the space and time scales, in addition to the geometrical characteristics and the physical parameters of the considered nanostructures (or structured layouts of the latter), are particularly challenging for an accurate and efficient application of the FDTD method. During the last ten years, numerical methods formulated on unstructured meshes have drawn a lot of attention in computational electromagnetics with the aim of dealing with irregularly shaped structures and heterogeneous media. In particular, the discontinuous Galerkin time-domain (DGTD) method has progressively emerged as a viable alternative to well-established finite-difference time-domain (FDTD) and finite-element time-domain (FETD) methods for the numerical simulation of electromagnetic wave propagation problems in the time domain. In this talk, we will present our recent efforts aiming at the design of high performance DGTD methods for three-dimensional nanophotonic applications. We will consider more particularly DGTD methods for solving the system of Drude-Maxwell equations. The talk will cover both theoretical and numerical aspects, but the emphasis will be put on high performance computing issues studied in the context of a recently awarded PRACE Preparatory Access project, aiming at improving the scalability of the DGTD solver for current petascale and future exascale systems.

3.8 A massively parallel domain decomposition / AMG method for elasticity problems

Axel Klawonn, Martin Lanser
Mathematisches Institut, Universität zu Köln, Weyertal 86-90, 50931 Köln

Oliver Rheinbach
Institut für Numerische Mathematik und Optimierung, Technische Universität Bergakademie Freiberg, 09596 Freiberg

Tzanio V. Kolev, Ulrike Meier Yang
Computation, Lawrence Livermore National Laboratory, Livermore, CA 94550

The numerical solution of partial differential equations such as, e.g., linear or nonlinear elasticity problems on modern and future supercomputers requires fast and highly scalable parallel solvers. Domain decomposition methods such as FETI-DP (Finite Element Tearing and Interconnecting - Dual Primal) are well known as robust and fast solvers for elasticity problems, and algebraic multigrid (AMG) methods have been shown to be highly scalable up to 100K and more MPI ranks for elasticity problems. The combination of the strengths of both methods may help to overcome scalability limits on modern architectures. In this talk, a combined FETI-DP / BoomerAMG method will be introduced. Additionally, algebraic multigrid interpolations which preserve the rigid body modes on all levels are used for elasticity problems. This approach is known as the global matrix (GM) approach. Parallel weak scaling results up to 262K cores will be presented, using the PETSc 3.4.3 and BoomerAMG software packages.


3.9 Scalable, Hybrid-Parallel Multiscale Methods using DUNE

René Milk, Mario Ohlberger
Institute for Numerical and Applied Mathematics, University of Münster

Sven Kaulmann
Institute for Numerical and Applied Mathematics, University of Münster
Institute for Applied Analysis and Numerical Simulation, University of Stuttgart

Among numerous other applications, multiscale problems, as for example arising in the simulation of complex flows in reservoir engineering, promise good applicability of exa-scale computing techniques. In our contribution, we will introduce a mathematical abstraction of multiscale methods [5]. Based on this unified mathematical abstraction layer, we introduce a hybrid parallelization approach that reflects the different layers of these multiscale methods. We present our implementation based on the Distributed and Unified Numerics Environment DUNE [1] and the DUNE Generic Discretization Toolbox [4]. As a pilot application within EXA-DUNE [2], part of the DFG Priority Programme SPP 1648-1 "Software for Exascale Computing", we are combining techniques such as shared- and distributed-memory parallelization and GPU/MIC accelerators to achieve scalability and efficiency on current and future, potentially highly heterogeneous, peta- and exa-scale computing clusters. We conclude our presentation with new scalability results for the Multiscale Finite Element method [3].
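As a reminder of the method named at the end of the abstract (stated in generic notation from the MsFEM literature [3], not from this contribution), the Multiscale Finite Element method replaces the standard coarse basis functions \varphi_i by multiscale ones that solve the fine-scale operator locally on each coarse element K,

    -\nabla \cdot \left( a^{\varepsilon}(x) \nabla \phi_i^{K} \right) = 0 \ \text{in } K, \qquad \phi_i^{K} = \varphi_i \ \text{on } \partial K,

so that the coarse stiffness matrix assembled from the \phi_i^{K} carries fine-scale information at coarse-grid cost.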

[1] Peter Bastian, Markus Blatt, Andreas Dedner, Christian Engwer, Robert Klöfkorn, Ralf Kornhuber, Mario Ohlberger, and Oliver Sander. A generic grid interface for parallel and adaptive scientific computing. Part II: Implementation and Tests in DUNE. Computing, 82(2-3):121–138, June 2008.

[2] P. Bastian, C. Engwer, D. Göddeke, O. Iliev, O. Ippisch, M. Ohlberger, S. Turek, J. Fahlke, S. Kaulmann, S. Müthing, and D. Ribbrock. EXA-DUNE: Flexible PDE Solvers, Numerical Methods and Applications. To appear in: Proceedings EuroPar 2014, Workshop on Software for Exascale Computing. Springer, August 2014, Porto, Portugal.

[3] Thomas Y Hou. A Multiscale Finite Element Method for Elliptic Problems in CompositeMaterials and Porous Media. Journal of Computational Physics, 134(1):21–21, May 1997.

[4] Felix Schindler. DUNE-Gdt. https://github.com/pymor/dune-gdt/

[5] Mario Ohlberger. Error control based model reduction for multiscale problems. In AngelaHandlovičová, Zuzana Minarechová, and Daniel Ševčovič, editors, Algoritmy 2012, pages 1–10.Slovak University of Technology in Bratislava, April 2012.

3.10 Recent Advances in Parallel Fluid-Structure-Acoustics Simulations

Miriam Mehl
Institut für Parallele und Verteilte Systeme, Universität Stuttgart, Universitätsstraße 38, D-70569, Germany


The simulation of the interaction between turbulent fluid flow, elastic structures and acoustics is a prominent and challenging example for the increasingly important class of multi-field simulations. They involve at least two different types of equation systems that are in many cases simulated by separate single-field solvers in order to minimize code development times and to maintain maximal flexibility in terms of exchanging solvers, coupling methods and even the considered fields. Such a setting obviously poses challenges if executed on a massively parallel system. Just to mention a few, dynamical load balancing, communication, and robust parallel coupling numerics are important issues. Fluid-structure-acoustics interactions are a very good representative for multi-field simulations as they involve many of the typical difficulties such as bidirectional volume and surface coupling, instabilities and different scales in time and space. We present recent advances achieved for such simulations in the project ExaFSA of the SPPExa priority program. These advances include improved coupling numerics, faster data mapping between non-matching grids and further steps towards a highly scalable parallel implementation of the in-house coupling tool preCICE.

3.11 Performance portable multigrid preconditioners for mixed finite element discretisations in atmospheric models

Eike Hermann Müller, Robert Scheichl
Department of Mathematical Sciences, University of Bath, Bath BA2 7AY, United Kingdom

Colin Cotter(a), David Ham(a), Lawrence Mitchell(b)
Department of Mathematics(a) and Department of Computing(b), Imperial College, London SW7 2AZ, United Kingdom

When implementing solvers for very large elliptic PDEs, algorithmic- and parallel- scalability haveto be guaranteed if the algorithms should run efficiently on future exascale systems. With rapidsucession of supercomputer installations and the advent of novel chip architectures such as GPUs,it becomes equally important to implement the algorithms in a performance portable way. Thisallows a “separation of concerns”, i.e. it hides the details of the parallelisation and the low-levelimplementation for a particular architecture from the domain specialist and the designer of thenumerical algorithm. In this talk we describe this approach for an elliptic PDE arising from implicittime stepping in atmospheric models.

Modern climate and weather forecast models on semi-structured grids rely on (mimetic) mixed finite element discretisations to ensure the exact conservation of physical quantities and to suppress computational modes. If a (semi-)implicit time discretisation is used, this requires the efficient solution of a mixed finite element system for pressure and velocity at every time step. Compared to more standard finite-difference discretisations, this approach creates new challenges for the iterative solver algorithm which is used for inverting the mixed system, in particular if higher-order finite element schemes are used for improved accuracy.

We study the algorithmic and parallel performance of iterative solver algorithms for the solution of this equation. In particular we focus on the performance of geometric multigrid preconditioners, which require a very small number of iterations to converge.
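For readers less familiar with the method, the following minimal two-grid sketch for a 1D Poisson problem illustrates the multigrid idea generically (weighted Jacobi smoothing, full-weighting restriction, linear interpolation, exact coarse solve); it is a textbook illustration and not the mixed-system solver discussed in the talk.

    import numpy as np

    def poisson_matrix(n):
        """1D Poisson matrix with Dirichlet boundaries on n interior points, h = 1/(n+1)."""
        h = 1.0 / (n + 1)
        return (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

    def jacobi(A, x, b, nu, omega=2.0 / 3.0):
        """nu sweeps of weighted Jacobi smoothing."""
        D = np.diag(A)
        for _ in range(nu):
            x = x + omega * (b - A @ x) / D
        return x

    def two_grid(A, b, x, nu=2):
        """One two-grid cycle: pre-smooth, coarse-grid correction, post-smooth."""
        n = len(b)
        nc = (n - 1) // 2                            # coarse grid: every second interior point
        x = jacobi(A, x, b, nu)                      # pre-smoothing
        r = b - A @ x                                # fine-grid residual
        rc = 0.25 * (r[0:-2:2] + 2.0 * r[1:-1:2] + r[2::2])   # full-weighting restriction
        ec = np.linalg.solve(poisson_matrix(nc), rc) # exact coarse-grid solve
        e = np.zeros(n)                              # linear interpolation of the correction
        e[1::2] = ec
        ec_pad = np.concatenate(([0.0], ec, [0.0]))
        e[0::2] = 0.5 * (ec_pad[:-1] + ec_pad[1:])
        return jacobi(A, x + e, b, nu)               # post-smoothing

    n = 63                                           # n = 2^k - 1 so that the coarse grid nests
    A, b, x = poisson_matrix(n), np.ones(63), np.zeros(63)
    for cycle in range(10):
        x = two_grid(A, b, x)
        print(cycle, np.linalg.norm(b - A @ x))      # residual drops by a constant factor per cycle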

The solvers were tested in the performance-portable Firedrake/PyOP2 framework [1,2,3], which allows rapid testing and code auto-generation for different hardware architectures.

[1] Markall, G. R. et al. Performance-Portable Finite Element Assembly Using PyOP2 and FEniCS. In 28th International Supercomputing Conference (ISC), Proceedings, volume 7905 of Lecture Notes in Computer Science, pages 279–289. Springer, 2013.


[2] Rathgeber, F. et al. PyOP2: A High-Level Framework for Performance-Portable Simulations on Unstructured Meshes. In High Performance Computing, Networking, Storage and Analysis (SC Companion), pages 1116–1123, Los Alamitos, CA, USA, 2012. IEEE Computer Society.

[3] The Firedrake project. http://firedrakeproject.org/

3.12 Efficient Discontinuous Galerkin schemes on hybrid HPC architectures

Steffen Müthing, Peter Bastian
Interdisciplinary Center for Scientific Computing, Heidelberg University

Christian Engwer, Jorrit Fahlke
Institute for Computational and Applied Mathematics, University of Münster

We present a scalable solver for porous media problems based on operator splitting and a Discontinuous Galerkin discretization using the frameworks DUNE and PDELab, which we are developing as part of the EXA-DUNE project [1]. In particular, we focus on the efficient exploitation of current and next generation HPC architectures and on algorithmic improvements at the assembly stage based on a sum factorization approach. Moreover, we investigate the performance and scalability trade-offs of explicit and implicit time integrators and demonstrate the impact of matrix-free methods on the solver performance.
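As a minimal illustration of the sum-factorization idea (a generic sketch, not the actual EXA-DUNE/PDELab assembly kernels): evaluating a 3D tensor-product basis at tensor-product quadrature points can be done by three successive one-dimensional contractions instead of one large dense matrix-vector product, which lowers the per-cell cost from O(p^6) to roughly O(p^4) for polynomial degree p.

    import numpy as np

    p, q = 5, 6                     # 1D basis functions and 1D quadrature points per direction
    B = np.random.rand(q, p)        # 1D evaluation matrix: B[j, i] = phi_i(xi_j)
    u = np.random.rand(p, p, p)     # coefficients of one cell as a p x p x p tensor

    # Naive evaluation: assemble the full (q^3 x p^3) matrix and multiply.
    naive = np.kron(np.kron(B, B), B) @ u.reshape(-1)

    # Sum-factorized evaluation: contract one direction at a time.
    t = np.einsum('ja,abc->jbc', B, u)      # direction 0
    t = np.einsum('kb,jbc->jkc', B, t)      # direction 1
    t = np.einsum('lc,jkc->jkl', B, t)      # direction 2

    print(np.allclose(naive, t.reshape(-1)))    # True: identical values at all quadrature points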

[1] P. Bastian, C. Engwer, D. Göddeke, O. Iliev, O. Ippisch, M. Ohlberger, S. Turek, J. Fahlke, S. Kaulmann, S. Müthing, and D. Ribbrock. EXA-DUNE: Flexible PDE solvers, numerical methods and applications. In Euro-Par 2014: Parallel Processing Workshops, Lecture Notes in Computer Science. Springer Berlin Heidelberg, to appear.

3.13 Local Fourier Analysis of Pattern Structured Operators

M. Bolten, K. Kahl, H. Rittich

University of Wuppertal

Multigrid methods [5] are used to compute the solution u of the system of equations Lu = f, where L is typically a discretization of a partial differential equation (PDE) and f a corresponding, given right-hand side. Local Fourier Analysis (LFA) [2, 5, 6] is well known to provide quantitative estimates for the speed of convergence of multigrid methods by analyzing the involved operators in the frequency domain.

For the initial formulation of LFA [1] it was crucial to assume that all involved operators have constant coefficients. For many PDE operators the coefficients vary continuously in space. Thus, if the grid is fine enough, the discrete operator L will only vary slightly between neighboring grid points and hence can be well approximated by an operator with locally constant coefficients. Constant coefficients are therefore often a reasonable assumption.

However, when analyzing more complex problems or even the multigrid method as a whole, this assumption is too restrictive. Interpolation and restriction operators typically act differently on variables that have a coarse-grid representative and those that do not have one. Another example is given by pattern relaxation schemes like the red-black Gauß-Seidel method, where red points of the grid are treated differently from the black ones.

It is possible to analyze these cases [3, 4] when allowing for the interaction of certain frequencies (see also [5, 6]). Moreover, it turns out that when we allow more frequencies to interact, we can analyze operators given by increasingly complex patterns. In our talk we will illustrate a general framework for analyzing pattern-structured operators, i.e., operators whose action is invariant under certain shifts of the input function. Furthermore, we discuss different applications.
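As a concrete (and entirely standard) example of constant-coefficient LFA, independent of the pattern framework presented in the talk, the sketch below evaluates the symbols of the 2D five-point Laplacian and of the weighted Jacobi error propagator and computes the classical smoothing factor, i.e. the worst-case damping over the high frequencies.

    import numpy as np

    omega = 4.0 / 5.0                  # damping parameter of the weighted Jacobi smoother

    # Sample the frequency domain [-pi, pi)^2
    theta = np.linspace(-np.pi, np.pi, 256, endpoint=False)
    t1, t2 = np.meshgrid(theta, theta)

    # Symbols (Fourier representations) of the operators:
    L_hat = 4.0 - 2.0 * np.cos(t1) - 2.0 * np.cos(t2)   # 5-point Laplacian (scaled by h^2)
    S_hat = 1.0 - omega * L_hat / 4.0                   # Jacobi error propagator I - omega*D^{-1}*L

    # High frequencies: those that cannot be represented on the coarse grid
    high = np.maximum(np.abs(t1), np.abs(t2)) >= np.pi / 2

    print("smoothing factor mu =", np.abs(S_hat[high]).max())   # approx. 0.6 for omega = 4/5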

[1] A. Brandt. Multi-level adaptive solutions to boundary-value problems. Math. Comput., 31(138):333–390, 1977.

[2] A. Brandt. Rigorous quantitative analysis of multigrid, I: Constant coefficients two-level cycle with L2-norm. SIAM J. Numer. Anal., 31(6):1695–1730, 1994.

[3] K. Stüben and U. Trottenberg. Multigrid methods: fundamental algorithms, model problem analysis and applications. In Multigrid methods (Cologne, 1981), volume 960 of Lecture Notes in Math., pages 1–176. Springer, Berlin-New York, 1982.

[4] C.-A. Thole and U. Trottenberg. Basic smoothing procedures for the multigrid treatment of elliptic 3D operators. Appl. Math. Comput., 19(1-4):333–345, 1986.

[5] U. Trottenberg, C. W. Oosterlee, and A. Schüller. Multigrid. Academic Press, 2001.

[6] R. Wienands and W. Joppich. Practical Fourier Analysis for Multigrid Methods. Chapman & Hall/CRC Press, 2005.

3.14 Parallel Runtime Environments with Cloud Database: A Performance Study for the Heterogeneous Multiscale Method with Adaptive Sampling

D. G. Roehm, C. Junghans
Institute for Computational Physics, Universität Stuttgart, 70569 Stuttgart, Germany, and Computer and Computational Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

R. S. Pavel, T. C. Germann, A. L. McPherson
Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA; Theoretical Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA; Computer and Computational Sciences Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA

We present an adaptive sampling method for heterogeneous multiscale simulations with stochastic data. Within the Heterogeneous Multiscale Method, a finite-volume scheme integrates the macro-scale differential equations for elastodynamics, which are supplemented by momentum and energy fluxes evaluated at the micro-scale. Therefore, light-weight MD simulations have to be launched for every volume element. Our adaptive sampling scheme replaces costly micro-scale simulations with fast table lookup and prediction. The cloud database Redis serves as a plain table lookup, and with locality-aware hashing we gather input data for our prediction scheme. For the latter we use ordinary kriging, which estimates an unknown value at a certain location by using weighted averages of the neighboring points. As the independent tasks can have very different runtimes, we used four different parallel computing frameworks, i.e. OpenMP [1], Charm++ [2], Intel CnC [3] and MPI-only using libcircle [4], for the implementation and compared their performance.
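The ordinary kriging predictor described above can be written in a few lines. The sketch below is a generic implementation with a Gaussian covariance model and made-up sample data, not the code coupled to Redis in the study; the correlation length and the test response are arbitrary.

    import numpy as np

    def ordinary_kriging(X, y, x_star, corr_len=0.3):
        """Predict y(x_star) as a weighted average of the samples (X, y).

        The weights solve the ordinary kriging system, which also enforces
        that they sum to one (unbiasedness) via a Lagrange multiplier."""
        def cov(a, b):
            d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
            return np.exp(-(d / corr_len) ** 2)

        n = len(y)
        K = np.zeros((n + 1, n + 1))
        K[:n, :n] = cov(X, X) + 1e-10 * np.eye(n)   # small nugget for numerical stability
        K[:n, n] = K[n, :n] = 1.0                   # Lagrange multiplier row/column
        rhs = np.append(cov(X, x_star[None, :]).ravel(), 1.0)
        weights = np.linalg.solve(K, rhs)[:n]
        return weights @ y

    # Made-up data: samples of a smooth response surface over a 2D input space.
    rng = np.random.default_rng(0)
    X = rng.random((20, 2))
    y = np.sin(2 * np.pi * X[:, 0]) + X[:, 1] ** 2
    x_star = np.array([0.4, 0.5])
    print("kriging prediction:", ordinary_kriging(X, y, x_star))
    print("true value:        ", np.sin(2 * np.pi * 0.4) + 0.5 ** 2)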

[1] Dagum, Leonardo and Menon, Ramesh. OpenMP: an industry standard API for shared-memory programming. Computational Science & Engineering, IEEE, 1998.

[2] Kale, Laxmikant V and Krishnan, Sanjeev. CHARM++: a portable concurrent object-oriented system based on C++. ACM, 1998.

[3] Knobe, Kathleen. Ease of use with concurrent collections (CnC). Proceedings of the First USENIX Conference on Hot Topics in Parallelism, 2009.

[4] LaFon, Jharrod and Misra, Satyajayant and Bringhurst, Jon. On Distributed File Tree Walk of Parallel File Systems. Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, 2012.

3.15 Higher-Order Discontinuous Galerkin Spectral Element Methods for Computational Astrophysics

Jonatan Nunez-de la Rosa, Claus-Dieter Munz
Institut für Aerodynamik und Gasdynamik, Universität Stuttgart

Pfaffenwaldring 21, D-70569 Stuttgart, Germany

A higher-order discontinuous Galerkin spectral element method (DGSEM) for solving the magnetohydrodynamics equations in three-dimensional domains is presented [1], [2]. Because of the presence of shocks in astrophysical scenarios, the numerical framework consists of a novel hybrid scheme, where for smooth parts of the flow the DGSEM is used, and regions with strong shocks are evolved with a robust finite volume (FV) method with WENO3 reconstruction. In this approach, we interpret the nodal DG values in the troubled element as FV subcell values for their further time evolution [3]. For the time discretization, an explicit fourth-order strong stability-preserving Runge-Kutta method is employed [4]. Numerical results with very high polynomial degree include the two-dimensional Orszag-Tang vortex, the spherical blast wave problem and the Kelvin-Helmholtz instability. Additionally, for the first time, the simulation of an accretion disc with high-order discontinuous Galerkin methods is shown.
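For illustration, the classical third-order SSP Runge-Kutta scheme of Shu and Osher is sketched below as a lower-order stand-in for the fourth-order scheme of [4]. Each stage is a forward-Euler step and the stages are combined convexly, which is what preserves the strong-stability property of the spatial operator; the toy upwind residual stands in for a DG or FV-subcell operator.

    import numpy as np

    def ssprk3_step(u, dt, R):
        """One step of the third-order SSP Runge-Kutta scheme (Shu & Osher)."""
        u1 = u + dt * R(u)                               # forward-Euler stage
        u2 = 0.75 * u + 0.25 * (u1 + dt * R(u1))         # convex combination of Euler stages
        return u / 3.0 + 2.0 / 3.0 * (u2 + dt * R(u2))

    # Toy semi-discretization: linear advection on a periodic grid with first-order upwinding.
    n, dt = 200, 0.004
    x = np.linspace(0.0, 1.0, n, endpoint=False)
    dx = x[1] - x[0]
    R = lambda u: -(u - np.roll(u, 1)) / dx              # conservative upwind residual

    u0 = np.exp(-200.0 * (x - 0.5) ** 2)
    u = u0.copy()
    for _ in range(100):                                 # CFL number dt/dx = 0.8
        u = ssprk3_step(u, dt, R)
    print("total mass conserved:", np.isclose(u.sum(), u0.sum()))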

[1] Florian Hindenlang, Gregor Gassner, Christoph Altmann, Andrea Beck, Marc Staudenmaier, and Claus-Dieter Munz. Explicit discontinuous Galerkin methods for unsteady problems. Computers & Fluids, 61:86–93, 2012.

[2] David Kopriva. Implementing spectral methods for partial differential equations. Springer-Verlag, Berlin, 2009.

[3] Matthias Sonntag and Claus-Dieter Munz. Shock capturing for discontinuous Galerkin methods using finite volume subcells. In Jürgen Fuhrmann, Mario Ohlberger and Christian Rohde, editors, Finite Volumes for Complex Applications VII, pages 945–953. Springer, 2014.

[4] Raymond Spiteri and Steven Ruuth. A new class of optimal high-order strong-stability-preserving time discretization methods. SIAM Journal on Numerical Analysis, 40(2):469–491, 2002.


3.16 The parallel-in-time methods Parareal and PFASST: Library development and applications

Daniel Ruprecht, Andreas Kreienbuehl, Mathias Winkel, Rolf Krause
Institute of Computational Science, Università della Svizzera italiana, Lugano, Switzerland

Robert Speck, Torbjörn Klatt
Jülich Supercomputing Centre, Forschungszentrum Jülich, Germany

Matthew Emmett
Center for Computational Sciences and Engineering, Lawrence Berkeley National Laboratory, USA

In view of the rapidly increasing number of cores in modern supercomputers, novel mathematical algorithms that provide additional levels of concurrency will become necessary to fully leverage high-performance architectures. For initial value problems, parallel-in-time integration methods are a promising way to introduce parallelism along the temporal axis in conjunction with parallelization in space.

Parareal – introduced by Lions, Maday and Turinici in 2001 [1] – is probably the most extensively studied time-parallel method. It relies on the iterative application of a computationally expensive fine integrator, applied in parallel, and a coarse, computationally cheap method that propagates corrections in serial. Parareal is relatively straightforward to implement and can be coupled to existing codes through rather lightweight interfaces. Unfortunately, however, it usually provides relatively low parallel efficiency.
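The Parareal update can be stated in a few lines: with a fine propagator F and a coarse propagator G over one time slice, the iteration reads U_{n+1}^{k+1} = G(U_n^{k+1}) + F(U_n^k) - G(U_n^k). The sketch below emulates this serially for the scalar test problem u' = lambda*u; in a real implementation the fine propagations inside one iteration run concurrently, one time slice per process.

    lam, T, N = -1.0 + 6.0j, 4.0, 40                   # test problem u' = lam*u on [0, T], N slices
    u0, dt = 1.0 + 0.0j, 4.0 / 40

    def coarse(u):                                     # G: one implicit Euler step per slice
        return u / (1.0 - lam * dt)

    def fine(u, m=50):                                 # F: m small implicit Euler steps (expensive)
        for _ in range(m):
            u = u / (1.0 - lam * dt / m)
        return u

    U_ref = [u0]                                       # serial fine reference solution
    for n in range(N):
        U_ref.append(fine(U_ref[-1]))

    U = [u0]
    for n in range(N):                                 # initial guess: serial coarse sweep
        U.append(coarse(U[-1]))

    for k in range(8):                                 # Parareal iterations
        F_old = [fine(U[n]) for n in range(N)]         # these solves run in parallel in practice
        G_old = [coarse(U[n]) for n in range(N)]
        for n in range(N):                             # cheap serial correction sweep
            U[n + 1] = coarse(U[n]) + F_old[n] - G_old[n]
        err = max(abs(U[n] - U_ref[n]) for n in range(N + 1))
        print(f"iteration {k + 1}: max deviation from serial fine solution = {err:.2e}")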

The “parallel full approximation scheme in space and time” (PFASST) was introduced in 2012 by Emmett and Minion [2]. PFASST is an iterative multilevel strategy for the temporal parallelization of ODEs and discretized PDEs. It can be considered as a time-parallel variant of “multi-level spectral deferred corrections” [3], an iterative multi-level solver for collocation problems where the inversion of the preconditioner corresponds to “sweeps” of a low-order time-stepper (e.g. implicit Euler). The tighter interweaving of spatial and temporal discretization in PFASST can lead to good parallel efficiency, but also results in a more complicated interface.

The talk will sketch both the Parareal and PFASST algorithms. A number of application examples and benchmarks will be presented, including recent results on inexact spectral deferred corrections (ISDC) and on interweaving PFASST with a massively parallel multigrid solver in space. The status of an ongoing collaborative effort to develop modern C++ software libraries for both methods will also be reported.

[1] J. Lions, Y. Maday, G. Turinici. A “parareal” in time discretization of PDEs. C. R. Acad. Sci., 2001.

[2] M. Emmett, M. Minion. Toward an efficient parallel in time method for partial differential equations. CAMCOS, 2012.

[3] R. Speck, D. Ruprecht, M. Emmett, M. Minion, M. Bolten, R. Krause. A multi-level spectral deferred correction method. BIT Numerical Mathematics, 2014.

3.17 Towards a high order moving grid method for compressible flow equations

Juan Pablo Gallego, Christian Klingenberg, Gero Schnücke (speaker)
Dept. of Mathematics, Würzburg University, Germany


Praveen Chandrashekarappa
Tata Institute of Fundamental Research, Centre of Applicable Mathematics, Bangalore, India

Yinhua Xia, Yan Xu
University of Science and Technology of China, Hefei, China

As part of the priority program of the German Science Foundation (DFG) on exascale computing, we will present the project devoted to developing a discontinuous Galerkin method on a moving mesh for compressible flow. It is called “Exascale simulations of the evolution of the universe including magnetic fields”, or EXAMAG. This is a joint project with the astrophysicist Volker Springel of the Heidelberg Institute for Theoretical Studies.

The Springel group has developed a code based on a MUSCL scheme on a moving Voronoi grid to simulate the formation of galaxies and the evolution of the universe. The aim of the cooperation is to extend the code to a high order moving grid Runge-Kutta discontinuous Galerkin (RK-DG) method. For this reason we split the project into two parts.

First, an arbitrary Lagrangian-Eulerian (ALE) RK-DG method for scalar conservation laws on a simplex mesh will be introduced. Both theoretical and numerical results, like numerical test examples on convergence rates of the method, will be shown. Second, an RK-DG code on a fixed grid for the Euler equations will be presented. The code works on a Cartesian grid with adaptive mesh refinement. For the future, the plan is to combine these two strands of work.

3.18 Structured inverse modeling in diffusive processes

Martin Siebenborn, Volker Schulz

University of Trier, Universitätsring 15a, 54296 Trier, Germany

In many applications which are modeled by diffusion processes, there is a small number of different materials involved, with distinct boundaries. It is thus reasonable not only to estimate the permeability parameter itself but also the contour of the spatial distribution by methods of shape optimization. Depending on the complexity of the interfaces, this requires the solution of several very large systems of equations resulting from fine discretizations of PDE systems. It is thus obligatory to develop efficient software for supercomputers that guarantees scalability also for problems where most degrees of freedom are located on interfaces. This talk introduces a limited-memory BFGS approach for shape optimization in diffusive flow processes. It is shown that superlinear convergence can be achieved, which is a significant speedup compared to optimizations based only on shape gradients. These techniques are utilized in order to fit a model of the human skin to measurement data and to estimate not only the permeability coefficients but also the shape of the cells.
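The limited-memory BFGS update underlying the approach is typically realized via the standard two-loop recursion. The following sketch applies it, together with a simple Armijo backtracking line search, to a small analytic test function; it is a generic illustration and not the shape-optimization code discussed in the talk.

    import numpy as np

    def lbfgs_direction(g, s_hist, y_hist):
        """Two-loop recursion: apply the limited-memory inverse Hessian approximation to -g."""
        q, alphas = g.copy(), []
        for s, y in zip(reversed(s_hist), reversed(y_hist)):
            a = (s @ q) / (y @ s)
            q -= a * y
            alphas.append(a)
        if s_hist:                                          # standard initial scaling
            s, y = s_hist[-1], y_hist[-1]
            q *= (s @ y) / (y @ y)
        for s, y, a in zip(s_hist, y_hist, reversed(alphas)):
            b = (y @ q) / (y @ s)
            q += (a - b) * s
        return -q

    def armijo(f, x, g, d, t=1.0, c=1e-4):
        """Simple backtracking line search along the descent direction d."""
        for _ in range(50):
            if f(x + t * d) <= f(x) + c * t * (g @ d):
                return t
            t *= 0.5
        return t

    f = lambda x: (1 - x[0]) ** 2 + 100 * (x[1] - x[0] ** 2) ** 2       # Rosenbrock test function
    grad = lambda x: np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0] ** 2),
                               200 * (x[1] - x[0] ** 2)])

    x, m, s_hist, y_hist = np.array([-1.2, 1.0]), 10, [], []
    for it in range(200):
        g = grad(x)
        if np.linalg.norm(g) < 1e-8:
            break
        d = lbfgs_direction(g, s_hist, y_hist)
        x_new = x + armijo(f, x, g, d) * d
        s, y = x_new - x, grad(x_new) - g
        if s @ y > 1e-12:                                   # keep only curvature-positive pairs
            s_hist, y_hist = (s_hist + [s])[-m:], (y_hist + [y])[-m:]
        x = x_new
    print(f"after {it} iterations: x = {x}, f(x) = {f(x):.2e}")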

3.19 A 3D-parallel interior eigenvalue solver

Jonas Thies

German Aerospace Center (DLR), Simulation and Software Technology, Linder Höhe, 51147 Köln

Some applications in quantum physics require the computation of a relatively large part of the interior of the spectrum of the Hamiltonian matrix. The sparse matrices in question have a dimension of billions to trillions, and on the order of 1000 eigenpairs are required.


We present a purely iterative solver based on the FEAST algorithm (Polizzi ’09) with a fault-tolerant and hybrid-parallel row-projection method for the linear systems that have to be solved. The subspace is distributed in both the ‘horizontal’ and ‘vertical’ direction, and the key operations exploit any available intra-node parallelism.
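The core FEAST idea – approximating the spectral projector onto the eigenspace inside a contour by numerical quadrature over shifted linear solves, followed by a Rayleigh-Ritz step – can be illustrated on a small dense symmetric matrix. The sketch below is a bare-bones serial illustration; it omits everything that makes the actual solver interesting, in particular the fault-tolerant row-projection solver for the shifted systems and the two-level (‘horizontal’/‘vertical’) distribution of the subspace.

    import numpy as np

    rng = np.random.default_rng(42)
    n = 200
    A = rng.standard_normal((n, n))
    A = (A + A.T) / 2.0                                  # symmetric test matrix

    lmin, lmax = 4.0, 5.0                                # search interval for interior eigenvalues
    c, r = (lmin + lmax) / 2.0, (lmax - lmin) / 2.0      # circular contour enclosing the interval
    nq, m = 8, 15                                        # quadrature nodes, subspace dimension

    Y = rng.standard_normal((n, m))                      # random starting subspace
    for _ in range(4):                                   # FEAST subspace iterations
        Q = np.zeros((n, m))
        for j in range(nq):                              # trapezoidal rule on the circle:
            theta = 2.0 * np.pi * (j + 0.5) / nq         # P ~ (1/nq) sum_j r e^{i th_j} (z_j I - A)^{-1}
            z = c + r * np.exp(1j * theta)
            Q += np.real(r * np.exp(1j * theta) * np.linalg.solve(z * np.eye(n) - A, Y))
        Q, _ = np.linalg.qr(Q / nq)                      # orthonormal basis of the filtered subspace
        w, V = np.linalg.eigh(Q.T @ A @ Q)               # Rayleigh-Ritz step
        Y = Q @ V                                        # Ritz vectors become the next iterate

    exact = np.linalg.eigvalsh(A)
    print("Ritz values in the interval:  ", np.sort(w[(w > lmin) & (w < lmax)]))
    print("exact eigenvalues in interval:", exact[(exact > lmin) & (exact < lmax)])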

3.20 Physics-aware solver concepts for geophysical applications on massively parallel architectures

Christian Waluga, Lorenz John, Barbara Wohlmuth
M2 - Zentrum Mathematik, Techn. Univ. München, Boltzmannstr. 3, 85748 Garching

Björn Gmeiner, Markus Huber, Ulrich Rüde
Informatik 10, Univ. Erlangen-Nürnberg, Cauerstr. 11, 91058 Erlangen

In this talk we consider a Boussinesq-type model as a prototype for mantle convection problems. The mass and momentum balance are discretized using stabilized linear finite elements, and the energy balance is discretized by vertex-centered finite volumes. Due to a duality of the respective computational meshes we obtain a natural transfer of the solution coefficients back and forth between the involved discrete function spaces. Moreover, we demonstrate that this combination offers the possibility to obtain exact mass conservation in a fully coupled simulation at negligible extra cost [1]. We illustrate the performance and scalability of our solver by recent results obtained with our Terra-Neo prototype, which is based on the concept of hierarchical hybrid multigrid [2,3].

[1] B. Gmeiner, C. Waluga, and B. Wohlmuth. Local mass-corrections for continuous pressure approximations of incompressible flow, submitted.

[2] B. Bergen and F. Hülsemann. Hierarchical hybrid grids: data structures and core algorithms for multigrid. Numerical Linear Algebra with Applications, 2004.

[3] B. Gmeiner, U. Rüde, H. Stengel, C. Waluga, and B. Wohlmuth. Performance and scalability of hierarchical hybrid multigrid solvers for Stokes systems, submitted.

3.21 Optimal parallel uncertainty quantification in large-scale flow problems

Peter Zaspel, Christian Rieger, Michael Griebel
Institute for Numerical Simulation, University of Bonn, Wegelerstraße 6, 53115 Bonn

One big problem in simulations for real-world engineering applications is the appropriate handling of small uncertainties in the involved quantities. These uncertainties include, but are not limited to, varying material parameters, physical constraints (e.g. temperature, gravitation) and shapes of geometrical objects. We have introduced techniques for uncertainty quantification into the field of incompressible two-phase flows with the random Navier-Stokes equations. The solution of the underlying random PDE problem is computed by a non-intrusive stochastic collocation method. Here, we use reproducing kernel Hilbert space methods, namely radially symmetric basis function (RBF) kernels, to approximate in stochastic space. Depending on the smoothness in the parameter space, we can achieve higher-order algebraic or even exponential convergence rates by the choice of appropriate kernels.

The major challenge of non-intrusive stochastic collocation methods is their requirement to perform calculations of hundreds or thousands of highly resolved CFD problems to extract stochastic data such as expectation value, variance or covariance. Obviously, these computations push current high performance computing (HPC) clusters to their limits. Furthermore, the extraction of stochastic data out of hundreds of gigabytes to tens of terabytes needs fast and optimal-complexity numerical methods.

One part of overcoming this computational challenge is to introduce the massively parallel compute power of GPU clusters. Here, the fluid solver NaSt3DGPF, a solver for the two-phase incompressible Navier-Stokes equations, was ported to multi-GPU hardware. Furthermore, the stochastic collocation framework is parallelized in a multi-GPU fashion.

The other part concerns optimal numerical methods. RBF kernel methods, with their close relationship to kriging, promise small errors in the stochastic approximation with very few stochastic samples. Furthermore, an anisotropic dimension-independent framework for RBF stochastic collocation is developed. To achieve optimal computational complexity, a classical hybrid GPU algebraic multigrid method and memory-hierarchy-aware preconditioners for the kernel approximation problem are considered. Overall, this leads to a (multi-)GPU parallel random PDE solution framework with optimal approximation and computational complexity for the proposed application.
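As a toy illustration of kernel-based stochastic collocation (deliberately simplified and not the NaSt3DGPF framework): sample a quantity of interest at a few collocation points in a one-dimensional stochastic parameter, interpolate with a Gaussian RBF kernel, and integrate the interpolant against the uniform parameter density to estimate the expectation. The shape parameter, the number of collocation points and the model response are all invented for the example.

    import numpy as np

    # Quantity of interest as a function of one uniform random parameter xi in [0, 1];
    # in the application each evaluation would be a full (two-phase flow) CFD run.
    qoi = lambda xi: np.exp(np.sin(2.0 * np.pi * xi))

    eps = 3.0                                               # shape parameter of the Gaussian kernel
    kernel = lambda a, b: np.exp(-(eps * (a[:, None] - b[None, :])) ** 2)

    xi = np.linspace(0.0, 1.0, 9)                           # 9 collocation points ("9 simulations")
    coeff = np.linalg.solve(kernel(xi, xi), qoi(xi))        # RBF interpolation coefficients

    # Averaging the interpolant over a fine uniform grid approximates its integral
    # against the uniform density on [0, 1], i.e. the expectation of the QoI.
    x_fine = np.linspace(0.0, 1.0, 20001)
    estimate = (kernel(x_fine, xi) @ coeff).mean()
    reference = qoi(np.random.default_rng(1).random(200000)).mean()
    print("RBF collocation estimate:", estimate)
    print("Monte Carlo reference:   ", reference)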

In our presentation, we will highlight the numerical methods and some of the implementation details for parallel scalability and showcase several large-scale applications on multi-GPU clusters.


4 Important Information

4.1 Places

• All lectures take place in room 432 on the fourth floor of the IWR building, located at “Im Neuenheimer Feld 368, D-69120 Heidelberg”; see the campus map in Figure 4.1.

• Coffee breaks are in the common room (room 514) on the fifth floor, directly above the lecture room.

• Lunch is served in the student mensa, which is in building 304 on the campus. Use the map or follow one of the locals.

• The get-together on Monday after the last presentation takes place in the common room.

• The conference dinner takes place in the Wirtshaus “Zum Seppl”, Hauptstraße 213, 69117 Heidelberg, in downtown Heidelberg. It can be reached by public transport and is within walking distance from either the stop “Neckarmünzplatz” on bus lines 33 and 35 or from the stop “Universitätsplatz” on bus line 31. Bus 31 stops close to the IWR (stop “Kopfklinik”); see the information below.

4.2 Public Transport

A map of the Heidelberg tram and bus network can be found here:

https://www.rnv-online.de/uploads/media/Liniennetzplan_HD_01.pdf

A single-fare ticket costs €2.40 and can be bought at ticket machines located at all larger stops or from the bus driver (but not the tram drivers!). A pack of five tickets can also be obtained at the ticket machines (“Mehrfahrtenkarte”), which makes them slightly cheaper (€2.20). Holders of the BahnCard can obtain single-fare tickets for €1.80. Do not forget to stamp your tickets in the bus/tram.

To get around:

• The closest stops to the IWR building are “Kopfklinik” on bus line 31 and “Bunsen-Gymnasium” on bus line 31 and tram lines 21 and 24.

• Bus line 31 connects directly to “Bismarckplatz” and “Universitätsplatz” in downtown Heidelberg.

• Trams 21 and 24 connect directly to the train station (“Hauptbahnhof”). Bus number 32 also goes from the train station to Kopfklinik but takes rather long.

• The easiest way to/from the Holiday Inn Express Hotel is with tram 24 from the stop “Römerkreis Süd”.


[Campus map graphic (“Campus Im Neuenheimer Feld”) not reproduced in this text version.]

Figure 4.1: Campus map.


5 List of Participants

Assyr Abdulle, EPFL Lausanne, Mathematics Section - MATHICSE, École Polytechnique Fédérale de Lausanne (EPFL), Station 8, CH-1015, [email protected]

Juan Acebron, Department of Information Science and Technology, ISCTE-University Institute of Lisbon, Av. Forças Armadas, 1649-026, [email protected]

Martin Alkämper, Universität Stuttgart, Fakultät für Mathematik und Physik, Pfaffenwaldring 57, D-70569, [email protected]

Michael Bader, Technische Universität München, Department of Informatics, Boltzmannstr. 3, 85748, [email protected]

Santiago Badia, CIMNE / UPC, Esteve Terrades 5, 08860 Castelldefels (Barcelona), [email protected]

Peter Bastian, Universität Heidelberg, IWR, Im Neuenheimer Feld 368, D-69120, [email protected]

Andreas Bauer, HITS, Schloss-Wolfsbrunnenweg 35, 69118, [email protected]

Dörte Beigel, IWR, INF 368, 69120, [email protected]

Hester Bijl, Delft University of Technology, Faculty of Aerospace Eng., P.O. Box 5058, NL-2600 GB Delft, The Netherlands, [email protected]

Matthias Bolten, Bergische Universität Wuppertal, Gaußstraße 20, 42119, [email protected]

Christian Engwer, WWU Münster, Institut für Numerische und Angewandte Mathematik, Orleans-Ring 10, 48149 Münster, [email protected]


Alexandre Ern, Université Paris-Est, CERMICS, ENPC, 6 et 8 avenue Blaise Pascal, F-77455 Marne la Vallée cedex 2, France, [email protected]

Jorrit Fahlke, Institute for Comp. and Appl. Math., University of Münster, Einsteinstraße 62, 48149 Münster, [email protected]

Rob Falgout, Lawrence Livermore National Laboratory, Center for Computational Sciences & Engineering, P.O. Box 808, L-316, Livermore, CA 94551, [email protected]

Juan-Pablo Gallego-Valencia, PhD student, Universität Würzburg, Campus Hubland Nord, Emil-Fischer-Straße 31, 97074 Würzburg, Physik Ost, Zi. [email protected]

Mike Giles, University of Oxford, Mathematical Institute, Oxford, [email protected]

Martin Hanek, Department of Mathematics, Faculty of Mechanical Engineering, Czech Technical University, Karlovo namesti 13, 121 35 Praha 2, Czech Republic, [email protected]

Mario Heene, Institut für Parallele und Verteilte Systeme (IPVS), Universität Stuttgart, Universitätsstr. 38, 70569, [email protected]

Mark Hoemmen, Sandia National Laboratories, P.O. Box 5800, Albuquerque, NM 87185-1320, [email protected]

Markus Huber, Universität Erlangen-Nürnberg, Lehrstuhl Informatik 10 (Systemsimulation), Cauerstraße 11, D-91058, [email protected]

Olaf Ippisch, TU Clausthal, Institut für Mathematik, Erzstr. 1, 38678, [email protected]

Guido Kanschat, IWR, INF 368, 69120, [email protected]

Christian Klingenberg, Würzburg University, Institut für Mathematik, Emil Fischer Str. 30, 97074 Würzburg, [email protected]

Andreas Kreienbuehl, Institute of Computational Science, Faculty of Informatics, University of Lugano, Via Giuseppe Buffi 13, CH-6900 Lugano, [email protected]

Martin Kronbichler, TU München, Lehrstuhl Numerische Mathematik, Boltzmannstr. 15, D-85747 Garching, [email protected]

Verena Krupp, Simulationstechnik und Wissenschaftliches Rechnen, Universität Siegen, Hölderlinstr. 3, 57076 Siegen, [email protected]

Martin Lanser, Universität zu Köln, Weyertal 86-90, 50931 Köln, Raum [email protected]

Stéphane Lanteri, Inria, 2004 Route des Lucioles - BP 93, 06902 Sophia Antipolis, [email protected]

Jose Pablo Lucero Lorca, IWR - Mathematische Methoden der Simulation, Im Neuenheimer Feld 368 - Room 232, 69120, [email protected]

René Milk, University of Münster, Institut für Numerische und Angewandte Mathematik, Einsteinstr. 62, D-48149 Münster, [email protected]

Eike Mueller, Department of Mathematical Sciences, University of Bath, BA2 7AY, United Kingdom, [email protected]

Claus-Dieter Munz, Universität Stuttgart, Institut für Aerodynamik und Gasdynamik, Pfaffenwaldring 21, D-70569, [email protected]

Jonatan Nunez-de la Rosa, Universitaet Stuttgart, Pfaffenwaldring 21, D-70569, [email protected]

Alfredo Parra Hinojosa, Technische Universität München, Boltzmannstraße 3, 85748, [email protected]

Aihui Peng, Faculty of Technology, Linnaeus University, Sweden, 351 95 Växjö, [email protected]

Dirk Ribbrock, Technische Universität Dortmund, Fakultät für Mathematik, Lehrstuhl LSIII, Vogelpothsweg 87, 44227, [email protected]

Hannah Rittich, Bergische Universität Wuppertal, Arbeitsgruppe Angewandte Informatik, Fachbereich C – Fachgruppe Mathematik, Gaußstraße 20, 42097 Wuppertal, [email protected]


Philip L. Roe, University of Michigan, College of Engineering, Aerospace Engineering, University of Michigan in Ann Arbor, 3021 FXB, Michigan, [email protected]

Dominic Roehm, Universitaet Stuttgart, Allmandring 3, 70569, [email protected]

Daniel Ruprecht, Institute of Computational Science, Universita della Svizzera italiana, Via Giuseppe Buffi 13, CH-6900, [email protected]

Kevin Schaal, Heidelberg Institute for Theoretical Studies, Schloß-Wolfsbrunnenweg 35, 69118, [email protected]

Gero Schnücke, Universität Würzburg, Institut für Mathematik, Emil Fischer Str. 30, 97074 Würzburg, [email protected]

Chi-Wang Shu, Brown University, Division of Applied Mathematics, Providence, RI 02912, [email protected]

Martin Siebenborn, University of Trier, Universitätsring 15a, 54296 Trier, Germany, [email protected]

Volker Springel, Heidelberg Institute for Theoretical Studies (HITS) and Heidelberg University, Schloss-Wolfsbrunnenweg 35, 69118, [email protected]

Dörte Carla Sternel, TU Darmstadt/FNB, Dolivostraße 15, 64293, [email protected]

Jonas Thies, German Aerospace Center (DLR), Simulation and Software Technology, Linder Höhe, 51147 Köln, [email protected]

Christian Waluga, M2 - Zentrum Mathematik, Technische Universität München, Boltzmannstraße 3, 85748, [email protected]

Gabriel Wittum, GCSC, Informatik, Universität Frankfurt, Kettenhofweg 139, 60325 Frankfurt am Main, [email protected]

Barbara Wohlmuth, TU München, M2 - Zentrum Mathematik, Boltzmannstr. 3, D-85748, [email protected]

Carol S. Woodward, Center for Applied Scientific Computing, Lawrence Livermore National Laboratory, PO Box 808, 94550, California, [email protected]

Peter Zaspel, Institute for Numerical Simulation, University of Bonn, Wegelerstraße 6, 53115, [email protected]


For your notes.
