CS11 Abstracts

IP1

Extreme-Scale Computing: Accelerating Discovery and Innovation

With petascale systems now being applied to a wide variety of challenging problems, attention is turning to exascale computing capability and the promise of simulation-based science and engineering. The challenges of making the transition to exascale are formidable, but the expected benefits, including opportunities to accelerate scientific discovery and enable new technologies, provide a powerful argument for a substantial investment in delivering new capabilities that will enable high-fidelity simulations of real-world systems. * Managed by UT-Battelle, LLC, for the U.S. Department of Energy under contract DE-AC05-00OR22725.

Thomas [email protected]

IP2

Biomolecular Modeling and Simulation: A Field Coming of Age

I will present a recent assessment of the progress in biomolecular modeling and simulation, focusing on structure prediction and dynamics, by presenting the field’s history, metrics for its rise in popularity, early expressed expectations, and current significant applications. Despite early unrealistic expectations and the realization that computer technology alone will not quickly bridge the gap between experimental and theoretical time frames, increases in computational power and improvements in algorithms and force fields are propelling the field onto a productive trajectory to become a full partner with experiment. Research on RNA and chromatin folding will also be presented.

Tamar Schlick
Howard Hughes Medical Inst
Department of Mathematics and [email protected]

IP3

Kinetic Methods for CFD

Computational fluid dynamics is based on direct discretizations of the Navier-Stokes equations. The traditional approach of CFD is now being challenged as new multi-scale and multi-physics problems have begun to emerge in many fields – in nanoscale systems, the scale separation assumption does not hold; macroscopic theory is therefore inadequate, yet microscopic theory may be impractical because it requires computational capabilities far beyond our present reach. Methods based on mesoscopic theories, which connect the microscopic and macroscopic descriptions of the dynamics, provide a promising approach. We will present two mesoscopic methods: the lattice Boltzmann equation and the gas-kinetic scheme, and their applications to simulate various complex flows.

Li-Shi Luo
Old Dominion [email protected]

IP4

Application Performance in the Multi-Core Era: Lessons to Be Learned for Exascale Computing

Tremendous efforts are put into hardware and software technologies to enable Exaflop computing before the end of this decade. Technological limitations will enforce architectural disruptions in the design of this new class of supercomputers. Most of these changes are still not clearly identified. However, they will challenge the sustainability of most existing software infrastructures and programming principles used today in computational science and engineering. Focusing on technological trends already implemented at moderate scale in modern parallel computers and clusters, we evaluate their impact on programming and use of parallel computers for scientific applications. Recognizing some long known golden performance rules and adapting them to the latest widespread compute resources is the basis for getting good performance today. This will also be a prerequisite for coping with the same problems at extreme scale in Exaflop computers.

Gerhard Wellein
Erlangen Regional Computing Center (RRZE), [email protected]

IP5

Reverse Engineering The Face

Creating animated computer generated faces which can withstand scrutiny on the large screen is a daunting task. How does the face move? How does it reflect light? What information is relevant? How can it be captured and then transformed to convincingly breathe life into a digital human or fantastic creature? The talk will give examples of new technologies and methodologies developed to achieve this in blockbuster films including "Avatar" and will point the way to the next generation of computer generated characters by showing the increasing importance of computational simulation and discovering what is really going on underneath the skin.

Mark Sagar
Weta [email protected]

IP6

Discovery of Patterns in Global Earth Science Data using Data Mining

The climate and earth sciences have recently undergone a rapid transformation from a data-poor to a data-rich environment. In particular, climate and ecosystem related observations from remote sensors on satellites, as well as outputs of climate or earth system models from large-scale computational platforms, provide terabytes of temporal, spatial and spatio-temporal data. These massive and information-rich datasets offer huge potential for understanding and predicting the behavior of the Earth’s ecosystem and for advancing the science of climate change. This talk will present our recent research in discovering interesting relationships among climate variables from various parts of the Earth.

Vipin Kumar
University of [email protected]

IP7

Interoperable Unstructured Mesh Technologies for Petascale Computations

The advent of petascale computing will enable increasingly complex, realistic simulations of PDE-based applications. Numerous software tools are used to help manage the complexity of these simulations, including computer-aided design systems used to represent the geometry of the computational domain, advanced mesh generation tools to discretize those domains, solution adaptive methods to improve the accuracy and efficiency of simulation techniques, and parallel tools such as dynamic partitioning to ease implementation on today’s computer architectures. However, managing the complexity of interactions between these services on massively parallel distributed memory computers is becoming increasingly difficult, leaving developers little time to focus on the science of their applications. The Interoperable Tools for Advanced Petascale Simulations (ITAPS) center focuses on providing tools to fill specific technology gaps, along with underlying interfaces providing interoperability between these tools. In this talk, I will describe the foundational concepts behind the ITAPS interoperability goals and show several examples of applications leveraging ITAPS technologies to increase simulation accuracy, operate more effectively on complex computational domains, or reduce the total time to solution.

Lori A. Diachin
Lawrence Livermore National [email protected]

IP8

Complexity Reduction in Partial Differential Equations

Mathematical models of complex physical problems can be based on heterogeneous partial differential equations (PDEs), i.e. on boundary-value problems of different kinds in different subregions of the computational domain. In different circumstances, especially in control and optimization problems for parametrized PDEs, reduced order models such as the reduced basis method can be used to alleviate the computational complexity. After introducing some illustrative examples, in this presentation several solution algorithms will be proposed and a few representative applications to blood flow modeling, sports design, and the environment will be addressed.

Alfio Quarteroni
Ecole Pol. Fed. de [email protected]

CP1

The Discontinuous Galerkin Method for Hyperbolic Problems on Tetrahedral Meshes: A Posteriori Error Estimation

We construct simple, efficient and asymptotically correct a posteriori error estimates for discontinuous finite element solutions of three-dimensional scalar first-order hyperbolic partial differential problems on tetrahedral meshes. We explicitly write the basis functions for the error spaces corresponding to several finite element spaces (P√, U√, V√,). The leading term of the discretization error on each tetrahedron is estimated by solving a local problem. The a posteriori error estimates are tested on several linear problems to show their efficiency and accuracy under mesh refinement for smooth solutions.

Idir Mechai
Department of Mathematics, Virginia [email protected]

Slimane Adjerid
Department of Mathematics
Virginia [email protected]

CP1

Assessment of Collocation and Galerkin Approaches to Linear Diffusion Equations with Random Data

We compare the performance of stochastic Galerkin and collocation methods for solving PDEs with random data. The Galerkin method requires solution of a single algebraic system that is dramatically larger than deterministic systems. The collocation method requires many deterministic solves, which facilitates use of existing software. However, the number of unknowns for collocation methods can be considerably larger than for Galerkin methods. Experimental results indicate that for stochastically linear problems, Galerkin methods are highly competitive.

Howard C. Elman, Christopher Miller
University of Maryland, College [email protected], [email protected]

Eric Phipps
Sandia National Laboratories
Optimization and Uncertainty Quantification [email protected]

Raymond Tuminaro
Sandia National [email protected]

CP1

New Hermite Multiwavelet Based Finite Elements for Numerical Solution of Biharmonic Equation

We present a new family of Hermite multiwavelets with significantly improved quantitative properties and an incrementally solved, adaptive finite element method based on these wavelets. The local generalized semiorthogonal wavelets lead to asymptotically optimal conditioning. The resulting multilevel finite element system is sparse and block diagonal, and it is solved in an incremental and adaptive manner. We finally demonstrate adaptive, incremental solution of a biharmonic equation in two dimensions.

Sarosh M. Quraishi
Department of Mechanical Engg., Institute of Technology
Banaras Hindu [email protected]

CP1

Solving a NIST Suite of PDE Benchmark Problems with Adaptive Low-Order FEM and hp-FEM

This presentation is based on a recently published NIST Report NISTIR 7668 (2010) by William Mitchell that contains a collection of PDE benchmark problems. These problems feature a spectrum of phenomena that pose challenges to adaptive finite element algorithms. We begin with a brief review of the main differences in automatic adaptivity for low-order and higher-order FEM including hanging nodes and spatially and polynomially anisotropic refinements. Then we use the benchmarks in the suite to discuss and compare various aspects of automatic adaptivity in low-order FEM and hp-FEM.

Erick A. Santiago, Frank Proa
University of Nevada, [email protected], [email protected]

CP1

A Lagrangian Vortex Method for the Barotropic Vorticity Equation on a Rotating Sphere

We present a Lagrangian vortex method for the barotropic vorticity equation (BVE) on a rotating sphere. The solution of the BVE involves solving a conservative transport equation for the vorticity fields and a Poisson equation for the stream function. The vortex method tracks the flow map and absolute vorticity using Lagrangian particles and panels. The velocity is computed from the Biot-Savart integral on the sphere. An adaptive refinement strategy is implemented to resolve small-scale features and a treecode is used for efficient computation. A fourth-order Runge-Kutta scheme is used for time integration. We start our investigation with the point vortex method, and the first test case is the Rossby-Haurwitz wave, which is an exact solution of the BVE. A convergence study shows that the method is fourth order in time and first order in space for a uniform panel discretization of the sphere. We then switch to the vortex blob method for stability reasons. We also test the evolution of vortex patches, for which the vorticity field is highly nonuniform on the surface of the sphere. The adaptive refinement strategy improves the computational efficiency.
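The kind of temporal convergence study mentioned above can be sketched on a much simpler problem. The following is a minimal illustration, assuming the scalar test equation y' = -y in place of the vortex dynamics on the sphere; the function names and the step counts are hypothetical choices made for this example and are not taken from the talk.

```python
import math

def rk4_step(f, t, y, h):
    """One classical fourth-order Runge-Kutta step for y' = f(t, y)."""
    k1 = f(t, y)
    k2 = f(t + h / 2, y + h / 2 * k1)
    k3 = f(t + h / 2, y + h / 2 * k2)
    k4 = f(t + h, y + h * k3)
    return y + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)

def integrate(f, y0, t_end, n_steps):
    """Advance from t = 0 to t_end with n_steps equal RK4 steps."""
    h = t_end / n_steps
    t, y = 0.0, y0
    for _ in range(n_steps):
        y = rk4_step(f, t, y, h)
        t += h
    return y

def observed_order(n_coarse=20):
    """Estimate the temporal order from the errors at step sizes h and h/2."""
    f = lambda t, y: -y          # assumed test problem, exact solution exp(-t)
    exact = math.exp(-1.0)
    err_coarse = abs(integrate(f, 1.0, 1.0, n_coarse) - exact)
    err_fine = abs(integrate(f, 1.0, 1.0, 2 * n_coarse) - exact)
    return math.log2(err_coarse / err_fine)

if __name__ == "__main__":
    print(f"observed order: {observed_order():.2f}")
```

Halving the step size and taking the base-2 logarithm of the error ratio gives the observed order, which should be close to four for this scheme on a smooth problem.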

Lei Wang
University of [email protected]

Robert Krasny
University of Michigan
Department of [email protected]

John P. Boyd
University of Michigan
Ann Arbor, MI 48109-2143 [email protected]

CP1

Immersed Finite Element Methods for Moving Interface Problems

We consider parabolic moving interface problems with discontinuous coefficients. The interface jump conditions are employed in the immersed finite element (IFE) functions, and the mesh need not be aligned with the location of the interface. We can therefore solve the problem on a fixed mesh even though the interface is moving. Several fully discretized algorithms based on the Crank-Nicolson idea will be presented. These algorithms will be compared both analytically and computationally. Numerical examples will show optimal convergence for these schemes.
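The Crank-Nicolson idea on a fixed mesh can be sketched as follows. This is a simplified illustration only: it assumes a one-dimensional heat equation with a constant coefficient and a standard finite difference Laplacian, so it contains neither the moving interface nor the IFE spaces of the talk. The function name and default parameters are hypothetical.

```python
import numpy as np

def crank_nicolson_heat(n=50, dt=1e-3, t_end=0.1, beta=1.0):
    """Solve u_t = beta * u_xx on (0, 1) with u = 0 at both ends.

    Initial data is sin(pi x), so the exact solution is
    sin(pi x) * exp(-beta * pi**2 * t).
    """
    x = np.linspace(0.0, 1.0, n + 1)
    h = x[1] - x[0]
    m = n - 1                                  # number of interior nodes
    lap = (np.diag(-2.0 * np.ones(m))
           + np.diag(np.ones(m - 1), 1)
           + np.diag(np.ones(m - 1), -1)) * beta / h**2
    eye = np.eye(m)
    lhs = eye - 0.5 * dt * lap                 # implicit half of the step
    rhs = eye + 0.5 * dt * lap                 # explicit half of the step
    u = np.sin(np.pi * x[1:-1])
    for _ in range(int(round(t_end / dt))):
        u = np.linalg.solve(lhs, rhs @ u)
    return x[1:-1], u

if __name__ == "__main__":
    xs, u = crank_nicolson_heat()
    exact = np.sin(np.pi * xs) * np.exp(-np.pi**2 * 0.1)
    print("max error:", np.abs(u - exact).max())
```

Averaging the spatial operator between the old and new time levels is what makes the step second order in time and unconditionally stable for this model problem.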

Xu Zhang
Virginia [email protected]

Xiaoming He
Department of Mathematics and Statistics
Missouri University of Science and [email protected]

Tao Lin
Department of Mathematics, Virginia [email protected]

CP2

A Discontinuous Immersed Finite Element Method for Interface Problems

We present piecewise quadratic immersed finite element (IFE) spaces that are used with an interior penalty (IP) method for solving two-dimensional second-order elliptic boundary value problems with discontinuous shape functions without requiring the mesh to be aligned with the material interfaces. Two different IFE spaces are developed, a discontinuous finite element formulation with interior penalty is discussed, and numerical experiments are presented to show the optimality of our method.

Mohamed Ben Romdhane
Virginia [email protected]

Slimane Adjerid
Department of Mathematics
Virginia [email protected]


CP2

A Conservative Meshless Framework with Error Bound Minimization

We present a finite-volume-like conservative meshless framework for solving conservation laws. The framework allows the incorporation of existing flux schemes and contains a central-like scheme as a special case. We discuss the necessary conditions for the existence of meshless coefficients and algorithms for generating coefficients that minimize an upper bound of the global approximation error. Numerical examples in fluid mechanics will also be presented to demonstrate the practicality of the framework.

Edmond Kwan-Yu Chiu
Ph.D. Candidate, Department of Aeronautics & Astronautics
Stanford [email protected]

Qiqi Wang
Massachusetts Institute of [email protected]

Antony Jameson
Professor, Department of Aeronautics & Astronautics
Stanford [email protected]

CP2

An Error Analysis for the Corner Singularity Expansion of a Compressible Stokes System on a Non-Convex Polygon

In this talk we consider a finite element scheme approximating the regular part and the stress intensity factor, show the existence and uniqueness of the discrete solution, and derive a (nearly) optimal error estimate. Some numerical experiments confirming these results are given.

Young Pyo Kim
POSTECH
Dept. [email protected]

CP2

Performance of hp-Adaptive Finite Element Methods

The hp version of the finite element method has been shown to have exponential rates of convergence with respect to the number of degrees of freedom, and consequently can be of optimal efficiency for many problems. Several strategies for the choice between h- and p-refinement have been proposed over the years. In this talk we present some numerical results to demonstrate the performance of these strategies on some test problems.

William F. Mitchell
NIST, Gaithersburg, [email protected]

Marjorie McClain
Mathematical and Computational Sciences Division
National Institute of Standards and [email protected]

CP2

An Integral Equation Method for the Modified Biharmonic Equation

An integral equation formulation for the modified biharmonic equation in two dimensions is presented. This PDE arises from temporal discretizations of the stream function for the two-dimensional incompressible Navier-Stokes equations. Numerical treatment, fast algorithms and preconditioning of the integral equation will be discussed, followed by some simple examples.

Bryan D. Quaife
Simon Fraser University
Burnaby, British Columbia, [email protected]

CP2

Multigrid Solution of the Distributed Optimal Control of Semilinear Elliptic PDE

In this work we design efficient multilevel methods for the distributed optimal control of semilinear elliptic PDEs. Specifically, the control problem is solved using Newton’s method, and specially designed multigrid methods are applied to precondition the Hessian of the cost functional. It is shown that the quality of the resulting multigrid preconditioner increases at an optimal rate with respect to the mesh-size, as in the case of optimal control of linear elliptic PDEs.

Jyoti Saraswat
Department of Mathematics & Statistics
University of Maryland Baltimore [email protected]

Andrei Draganescu
Department of Mathematics and Statistics
University of Maryland, Baltimore [email protected]

CP3

The Reduced Basis Method for Incompressible Fluid Flow in Parametrized Domains

We present a reduced basis method for rapid and reliable simulation of incompressible fluid flow problems in parametrized domains. We compare different strategies for 1) ensuring the stability of the reduced basis approximation, and 2) the computation of rigorous a posteriori error bounds for both the velocity and the pressure approximations. We apply the method to the Stokes equations, and present results for a model problem relevant in the design of microfluidic devices.

Anna-Lena Gerner, Karen Veroy-Grepl
Graduate School AICES
RWTH Aachen [email protected], [email protected]

David J. [email protected]

CP3

Modeling of Oscillatory Stimulation of Fluid Flows

Oscillatory stimulation is a promising method for increasing drainage of non-Newtonian fluids through porous structures in various applications. We developed a mathematical model for unsteady flow of a bi-viscous incompressible fluid in a circular straight channel. Longitudinal vibrations are superimposed on the flow driven by a pressure gradient along the channel. Calculations have been carried out to observe the effect of oscillations of the longitudinal pressure gradient on the mean flow rate of non-Newtonian fluids. It is found that oscillations enhance the flow rate of a bi-viscous shear-thinning fluid while the effect of oscillations on the shear-thickening fluid is the opposite. The Bingham fluid exhibits even more significant augmentation of the mean flow rate when subjected to vibrations.

Sergey Lapin
Washington State University
Department of [email protected]

Mohsen Asle Zaeem
Mississippi State [email protected]

Konstantin Matveev
Washington State University
School of Mechanical and Materials [email protected]

CP3

Recent Advances in Numerical Study of Maxwell’s Equations in Metamaterials

Since 2000, there has been a growing interest in the study of metamaterials across many disciplines. In this talk, I’ll first introduce metamaterials and some of their interesting applications. Then I’ll present the metamaterial modeling equations, and some time-domain finite element methods recently developed for solving the time-dependent metamaterial Maxwell’s equations. Finally, some numerical results and open issues will be discussed.

Jichun Li
Department of Mathematical Sciences
University of Nevada, Las Vegas, NV [email protected]

CP3

Uncertainty Quantification and Data Compression in Numerical Aerodynamics

We study how uncertainties in the input parameters (the angle of attack and the Mach number) and in the airfoil geometry propagate in the solution (lift, drag, etc.). We show that uncertainties in the Mach number and in the angle of attack weakly affect the lift coefficient (1% to 3%) and strongly affect the drag coefficient (around 14%). Uncertainties in the geometry influence both the lift and drag coefficients weakly. The RANS solver is the TAU code with the k-ω turbulence model. For quantification of uncertainties we used KLE, PCE and low-rank matrices. The results obtained via a collocation method on a sparse Gauss-Hermite grid match Monte Carlo results, but require far fewer deterministic evaluations and, as a consequence, less computing time.
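The trade-off between Gauss-Hermite collocation and Monte Carlo sampling can be illustrated in one dimension. The sketch below is an assumption-laden toy: the function `model` is a hypothetical stand-in for an expensive deterministic solve (it is not the TAU solver), the input is a single standard normal variable, and a full one-dimensional rule is used rather than a sparse grid.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def model(xi):
    """Hypothetical cheap surrogate for one deterministic solve."""
    return np.exp(0.3 * xi)

def collocation_mean(n_nodes=8):
    """Mean of model(xi), xi ~ N(0, 1), from n_nodes Gauss-Hermite points."""
    nodes, weights = hermegauss(n_nodes)       # weight function exp(-x^2 / 2)
    return np.sum(weights * model(nodes)) / np.sqrt(2.0 * np.pi)

def monte_carlo_mean(n_samples=20000, seed=0):
    """Mean of model(xi) from n_samples pseudo-random draws."""
    rng = np.random.default_rng(seed)
    return model(rng.standard_normal(n_samples)).mean()

# Closed form for this toy model: E[exp(a * xi)] = exp(a^2 / 2).
EXACT_MEAN = np.exp(0.3**2 / 2)

if __name__ == "__main__":
    print("collocation error:", abs(collocation_mean() - EXACT_MEAN))
    print("Monte Carlo error:", abs(monte_carlo_mean() - EXACT_MEAN))
```

For a smooth response like this one, eight model evaluations at quadrature nodes are far more accurate than twenty thousand random samples, which is the effect the abstract reports for its sparse-grid setting.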

Alexander Litvinenko
TU Braunschweig, [email protected]

CP3

Experiences Extending the CFD Solver of the PDE Framework Peano

The C++ PDE framework Peano [H.-J. Bungartz and M. Mehl and T. Neckel and T. Weinzierl, The PDE framework Peano applied to fluid dynamics: an efficient implementation of a parallel multiscale fluid dynamics solver on octree-like adaptive Cartesian grids, Computational Mechanics 46, pp. 103–114, 2010] has been designed for realising efficient applications on regular or adaptive Cartesian grids. In this contribution, we evaluate Peano’s expandability by incorporating thermal energy transfer into the existing incompressible flow solver. A series of numerical benchmarks has been computed to validate the approach. These experiences are currently used, first, to further improve Peano’s integration and reuse capabilities by automatic code generation etc. and, second, to achieve fully adaptive simulations of 3D thermal-hydraulic phenomena in the nuclear reactor coolant system.

Tobias Neckel, Michael Lieb, Ralf Sangl
Technische Universitat [email protected], [email protected], [email protected]

Philipp J. Schoeffel, Fabian Weyermann
Gesellschaft fuer Anlagen- und Reaktorsicherheit (GRS)[email protected], [email protected]

CP4

On the Numerical Solution of Fuzzy Elliptic PDEs by Means of Polynomial Response Surfaces

We consider the solution of elliptic partial differential equations (PDE) with a fuzzy diffusion coefficient. Inspired by popular solution techniques for stochastic elliptic PDEs, we make use of orthogonal sets of polynomials and a Galerkin projection to construct a response surface. We derive an upper bound for the error of this Galerkin approximation in the maximum norm and use it to prove (near-)exponential convergence of this approximation to the exact solution in the fuzzy sense.

Samuel Corveleyn
Katholieke Universiteit [email protected]

Stefan Vandewalle
Scientific Computing Research [email protected]

CP4

An ETD Crank-Nicolson Method for Reaction-Diffusion Systems

A novel Exponential Time Differencing (ETD) Crank-Nicolson (CN) method is developed which is stable, second order convergent, and highly efficient as compared to the standard CN method and BDF2. Stability and convergence are discussed for semilinear parabolic problems. Numerical experiments are presented for a wide variety of examples, including multi-dimensional chemotaxis and bio-chemical stiff reactions.
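The general idea of exponential time differencing can be sketched on a scalar stiff problem. The example below is not the ETD Crank-Nicolson scheme of the talk: it assumes the simpler first-order variant (often called ETD1 or exponential Euler) applied to u' = -c u + F(u), and the stiffness c, the nonlinearity cos(u), and all function names are hypothetical choices for illustration.

```python
import math

def etd1_step(u, h, c, nonlinear):
    """One ETD1 step for u' = -c*u + nonlinear(u).

    The linear part is integrated exactly through exp(-c*h); the
    nonlinear term is frozen at its value at the start of the step.
    """
    decay = math.exp(-c * h)
    return decay * u + (1.0 - decay) / c * nonlinear(u)

def euler_step(u, h, c, nonlinear):
    """One explicit Euler step for the same equation, for comparison."""
    return u + h * (-c * u + nonlinear(u))

def run(step, u0, h, n_steps, c, nonlinear):
    """Apply the given one-step method n_steps times."""
    u = u0
    for _ in range(n_steps):
        u = step(u, h, c, nonlinear)
    return u

if __name__ == "__main__":
    c, h = 1000.0, 0.01          # c*h = 10, far outside Euler's stability range
    print("ETD1 :", run(etd1_step, 1.0, h, 50, c, math.cos))
    print("Euler:", run(euler_step, 1.0, h, 50, c, math.cos))
```

With c*h = 10 the explicit Euler iterates grow without bound, while the exponential step relaxes to the steady state where c*u = cos(u). Treating the stiff linear part exactly is the property that ETD schemes share, whatever their order.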

Abdul M. Khaliq
Middle Tennessee State University
Department of Mathematical [email protected]

CP4

Explicit Local Time-Step Method for 3D Maxwell’s Equations on Tetrahedral Mesh

An explicit local time-step (LTS) method is developed to solve the time-dependent Maxwell’s equations on three-dimensional unstructured tetrahedral meshes. The method is based on the finite volume discretization in space and an LTS second order total variation diminishing (TVD) Runge-Kutta scheme in time. The algorithm is an extension of the scheme proposed by Tang and Warnecke to solve conservation laws in one- and two-dimensional space. It solves partial differential equations on a finite number of subdomains with LTSs using a simple projection of the solution at each step. In the present work, we remove the restriction, present in most of the electromagnetics literature, that the time-steps be integer multiples of the minimum time-step. We optimize the distribution of local time steps to further minimize computational time. The application to reference problems for three-dimensional Maxwell equations shows better performance of the proposed method.

Marina Kotovshchikova
University of [email protected]

Dmitry K. Firsov
Geomodeling [email protected]

CP4

Accuracy Enhancement of Discontinuous Galerkin Solutions over Structured Triangular Meshes

Theoretically we can effectively increase the order of accuracy of a discontinuous Galerkin (DG) solution from order k+1 to 2k+1. However, this is a computationally complex task to execute in an efficient manner. This becomes an even greater issue when we consider non-quadrilateral mesh structures. We present post-processing of DG solutions for accuracy enhancement over structured triangular meshes. We demonstrate that we can obtain 2k+1 order accuracy in an efficient manner for linear hyperbolic equations.

Hanieh Mirzaee
Scientific Computing and Imaging Institute
School of Computing, University of [email protected]

Jennifer K. Ryan
Delft University of [email protected]

Robert Kirby
University of [email protected]

CP4

Development and Analysis of Computer Simulations of Neuron-Astrocyte Networks to Better Understand the Role of Astrocytes

In the brain, astrocytes had been sidelined to a supporting role in neural networks. The past decade revealed that they influence and propagate neural signals, suggesting an important role for astrocytes in neural networks. By running multiple simulations of neural networks, with and without astrocytes, we observed that they have the ability to synchronize neural activity. Such findings resemble brain recordings of epileptic patients whose neuronal activity is synchronized during seizures correlated with hypertrophied astrocytes.

Tahirali H. Motiwala
Wofford [email protected]

Anne Catlla
Wofford College
Dept. of [email protected]

CP4

Investigations into the Boris Approach for Dealing with Fast Magnetosonic Wave Speeds in an ALE MHD Modeling Context

The classical magnetohydrodynamic (MHD) approximation admits fast magnetosonic wave speeds which become unbounded as the material density vanishes. This can create wave speeds that are detrimental to an explicit time stepping algorithm in regions which have little relevance to the critical magneto-mechanical issues. Boris proposed a semi-relativistic approach in which the displacement current terms from Ampere’s law are kept in the formulation of the momentum equation. This leaves open the possibility that for purposes of modeling efficiency one may choose to limit the maximum computational wave speed by adjusting the value of the speed of light. We describe algorithms of the Boris style in the context of an arbitrary Lagrangian-Eulerian MHD modeling approach and illustrate the effectiveness of these approaches with relevant examples. Sandia National Laboratories is a multi-program laboratory managed and operated by Sandia Corporation, a wholly owned subsidiary of Lockheed Martin Corporation, for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-AC04-94AL85000.
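The effect of keeping the displacement current can be illustrated with the Alfven speed alone. The sketch below assumes the commonly quoted semi-relativistic form v_A / sqrt(1 + v_A^2 / c^2), with c replaced by an artificially reduced value; this formula, the function names, and the sample numbers are assumptions made for illustration and are not taken from the talk, which concerns the full fast magnetosonic speed in an ALE code.

```python
import math

MU0 = 4e-7 * math.pi   # vacuum permeability in SI units (classical value)

def alfven_speed(b, rho):
    """Classical Alfven speed B / sqrt(mu0 * rho); unbounded as rho -> 0."""
    return b / math.sqrt(MU0 * rho)

def boris_limited_alfven_speed(b, rho, c_reduced):
    """Semi-relativistic Alfven speed, bounded above by c_reduced.

    c_reduced plays the role of an adjustable 'speed of light' chosen
    for modeling efficiency (an assumption of this sketch).
    """
    va = alfven_speed(b, rho)
    return va / math.sqrt(1.0 + (va / c_reduced) ** 2)

if __name__ == "__main__":
    c_reduced = 3.0e6
    for rho in (1e-3, 1e-8, 1e-12):
        print(rho, alfven_speed(1.0, rho),
              boris_limited_alfven_speed(1.0, rho, c_reduced))
```

In dense regions the limited speed is indistinguishable from the classical one, while in near-vacuum regions it saturates just below the reduced speed of light, which is what keeps an explicit time step from collapsing.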

Allen C. Robinson
Sandia National [email protected]

CP5

Accelerated Numerical Algorithms for Polyenergetic Digital Breast Tomosynthesis Reconstruction

Tomosynthesis imaging involves acquiring projection images over a limited angular range, which, after reconstruction, results in a pseudo-3D representation of the object. In breast cancer imaging, tomosynthesis is a viable alternative to standard mammography. In this talk, we discuss the mathematical methods for the reconstruction based on a polyenergetic model, focusing on algorithm implementation accelerators using the OpenCL and CUDA frameworks. We will show how vectorization and minimizing communication result in faster reconstruction time.

Veronica M. Bustamante
Emory [email protected]

Julianne Chung
University of [email protected]

Ioannis Sechopoulos
Department of Radiology and Winship Cancer Institute
Emory [email protected]

James G. Nagy
Emory University
Department of Math and Computer [email protected]

CP5

Effect of Flow Topology in Biophysical Interactions

Lagrangian flow topology may affect the dynamics of biological tracers, such as chlorophyll. We quantify such effects by simulating a carrying capacity-phytoplankton-zooplankton interaction in an archetypical 2D flow characterized by eddies and filaments, representing the ocean surface. We find that the confined geometry of eddies initially helps the phytoplankton population grow and that filaments enhance carrying-capacity transport, yet do not enhance bio-production. In both cases, the phytoplankton population approaches an asymptote at long times.

Juan Durazo
Arizona State [email protected]

CP5

Biophysical Models of Brain Cancer Invasion

Glioblastomas are brain cancers characterized by diffusive invasion of brain tissues and a high incidence of recurrence after surgery. The growth and spread of glioblastoma cell populations is frequently modeled as a reaction-diffusion process with spatially varying diffusion. In the present work, the anisotropy of tumor cell diffusion is modeled with consideration of biophysical interactions in the tumor microenvironment and brain fiber tract maps acquired from medical imaging. Results are compared with real patient histories.
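A reaction-diffusion process with spatially varying diffusion can be sketched in one dimension. The example below assumes logistic growth (a Fisher-KPP type model) and a scalar, piecewise-constant diffusion coefficient; it does not include the anisotropic diffusion tensor or the fiber tract data described in the abstract, and the coefficient values and the function name are hypothetical.

```python
import numpy as np

def simulate(n=101, t_end=2.0, rho=1.0):
    """Explicit finite differences for u_t = (D(x) u_x)_x + rho*u*(1 - u).

    u is a normalized cell density on [0, 1] with no-flux boundaries.
    """
    x = np.linspace(0.0, 1.0, n)
    dx = x[1] - x[0]
    # Hypothetical diffusion map: five times faster on the right half.
    d = np.where(x > 0.5, 5e-3, 1e-3)
    d_face = 0.5 * (d[:-1] + d[1:])            # coefficient at cell faces
    dt = 0.2 * dx**2 / d.max()                 # within the explicit stability limit
    u = 0.5 * np.exp(-((x - 0.5) / 0.05) ** 2) # small seed at the center
    mass0 = u.sum() * dx
    t = 0.0
    while t < t_end:
        flux = d_face * (u[1:] - u[:-1]) / dx  # diffusive flux at faces
        div = np.zeros_like(u)
        div[1:-1] = (flux[1:] - flux[:-1]) / dx
        div[0] = flux[0] / dx                  # no flux through x = 0
        div[-1] = -flux[-1] / dx               # no flux through x = 1
        u = u + dt * (div + rho * u * (1.0 - u))
        t += dt
    return x, u, mass0

if __name__ == "__main__":
    x, u, mass0 = simulate()
    print("initial mass:", mass0, "final mass:", u.sum() * (x[1] - x[0]))
```

Writing the diffusion term in flux form means the discrete diffusion moves cells without creating or destroying them, so any change in total mass comes from the growth term. With these assumed coefficients the population spreads further into the high-diffusion half of the domain.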

John B. Ingraham
Arizona State University
School of Mathematical and Statistical [email protected]

CP5

Quantitative Analysis of Brain Tumor Images

Magnetic resonance images (MRIs) are an essential part of diagnosis and follow-up treatment for patients with brain tumors. Usually two types of MRIs are processed for quantification of tumor and edema volumes. For clinical assessment, T1+Contrast images are used to assess tumor size and T2 images to assess edema. I will describe an approach to separating the edema from the tumor core by processing both sets of images simultaneously, improving the accuracy of the edema volume.

Helene A. Rhodes
Arizona State University
School of Mathematical and Statistical [email protected]

CP5

Radiotherapy in Modeling Brain Tumors

Because radiotherapy regimens are so common in the treatment of glioblastoma and other brain tumors, it is necessary to have an accurate model with which to simulate potential treatments. The key points of this presentation will be: 1. The different ways that radiotherapy can damage cells. 2. The attenuation of a radiation dose across tissue. 3. The potential effects of radiotherapy aside from cell death.

Keith A. [email protected]

CP6

Analysis of Quasicontinuum Methods for Unconstrained and Circular Chains

We present the results of a theoretical analysis of stability of unconstrained straight and circular chains of atoms held together by nearest and next-nearest neighbor interactions. We study the stability from the point of view of eigenvalues of the Hessian of the quasicontinuum (QC) energy functional, obtained by combining atomistic and continuum models. We present sharp stability estimates for the chain both under compression and expansion, as well as error estimates for the quasi-nonlocal QC approximation.

Pavel Belik
Mathematics Department
Augsburg [email protected]

Mitchell Luskin
School of Mathematics
University of [email protected]

CP6

The Edge Theory

Does there show "asymptotic-uniqueness" along a network using PMA and control, and can it be precluded from external sharing? A supposition for application and testing of a controlled system program for use in Polyelectrolyte Multilayer Assemblies. The method of differential analysis is interpolated into a system of equations and regressed onto an original IC schematic of a serial-parallel type. The numericism predicts and enhances material-specific results that would be observed from the use of any PMA so far. The essay further describes possible theories that allow the control to have a different aspect of control on the input than does the rating on the load.

Cory J. Cheetham
diamondback [email protected]

Mike [email protected]

CP6

A Fracture Curve Generated by the Nearest Flaws

In this work, we study a two-dimensional model for fragmentation in which the fracture curve is generated recursively by visiting the nearest defect. The following assumptions were also considered: the fracture forces are proportional to the size of the common boundary between two fragments; the material defects are represented by random point flaws; the total mass is conserved. Our main result establishes that the visualizations present complex fracture patterns that resemble real systems.

Gonzalo J. HernandezUniversity of Valparaiso, School of Industrial [email protected]

Roberto LeonUSM Department of [email protected]

CP6

An Open Curve Level-Set Based Algorithm for the Motion of Multiple Junctions

We propose an efficient open level-set method based on energy minimization, which captures the behavior of junctions. These junctions arise in the study of material boundaries, where the energy is made up of surface tension and bulk modulus. We solve the PDE using Sobolev gradients, avoiding the need for regularization or re-initialization while also accelerating convergence to the steady state configuration. The algorithm is fairly simple and easy to generalize to more complicated energies.

Rami Mohieddine

UCLA Math [email protected]

Luminita A. VeseUniversity of California, Los AngelesDepartment of [email protected]

CP6

Two-Dimensional Lagrangian Solver for Elasto-Plastic Deformation of Solids

We present a two-dimensional, cell-centered Lagrangian approach for capturing the elasto-plastic deformation response of solids under intense loading conditions. In this model, the primitive variables are stored and evolved at the cell centers. The nodal velocities and forces required to update the mesh are computed in a manner consistent with the numerical fluxes. The hydrodynamic response of the material is modeled using the Mie-Gruneisen equation of state. The deviatoric stress components are evolved in accordance with hypo-elastic stress-strain relations and the J2 von Mises yield condition. We demonstrate the capability of our approach by presenting preliminary one- and two-dimensional problems with and without material models.

Shiv Kumar Sambasivan, Shengnian Luo, MikhailShashkovLos Alamos National [email protected], [email protected], [email protected]

Pierre-Henri MaireUMR CELIA, Universite Bordeaux [email protected]

CP6

Augmented Lagrangian-Based Preconditioners for Steady Incompressible Navier-Stokes Equations

We discuss augmented Lagrangian-based preconditioners for linear systems arising from stable finite element discretizations of the steady incompressible Navier-Stokes equations. Spectral properties of the preconditioned matrices are established, and the choice of the augmentation parameter using Fourier analysis is presented. Moreover, we use field-of-values analysis to prove that the rate of convergence of GMRES with this type of preconditioner is independent of the mesh size for suitable choices of the augmentation parameter. The dependence on viscosity will also be discussed.

Zhen WangEmory [email protected]

Michele BenziDepartment of Mathematics and Computer ScienceEmory [email protected]

Maxim OlshanskiiDepartment of Mechanics and MathematicsMoscow State M.V.Lomonosov University, Moscow,[email protected]

CP7

Modeling Brain Tumor Growth Using Matlab

I will discuss the implementation and visualization of a mathematical model of glioblastoma multiforme (GBM) brain tumor growth. Using Matlab, I have implemented a model developed by Steffen Eikenberry and co-workers to simulate treatment by surgery and radiation. With John Ingraham (another student at ASU working on brain tumor analysis), I have produced code to convert the Matlab simulation results into MRI-looking images that can be visually compared with the actual patient scans.

Eric AdamsArizona State University, CSUMSSchool of Mathematical and Statistical Sciencese to [email protected]

CP7

Implicit Solution of Free-Surface Flows in Glaciology

Ice sheets and glaciers are usually modeled as non-Newtonian Stokes flows with free surface and nonlinear slip on bumpy surfaces. Local conservation is critical for many applications, but most conservative discretizations generate spurious tangent forces which produce non-physical recirculation. We combine a well-balanced conservative discretization of slip with fully-implicit time integration to enable analysis of stability and long-term behavior.

Jed BrownETH [email protected]

CP7

Robust Renewable Resource Exploitation Strategy with Sustainability

In this paper, we introduce a dynamic Nash game among harvesting firms modeled as a robust continuous-time variational inequality for renewable resource allocation problems. For the formulation, a continuous-time fixed-point method is presented. We discuss in detail the application of the proposed framework to the fishery game among firms as well as among nations. We prove that our framework is effective, in particular for computationally challenging renewable resource management problems. A numerical example is presented.

Sung Hoon Chung, Terry FrieszThe Pennsylvania State [email protected], [email protected]

CP7

Bayesian Markov Chain Monte Carlo Optimization of a Physiologically Based Pharmacokinetics Model of Nicotine

In an effort to reduce harm caused by tobacco products, the Family Smoking Prevention and Tobacco Control Act was enacted by Congress and the Food and Drug Administration (FDA) was tasked with regulating tobacco. The FDA is now charged to use the best available science to guide the development and implementation of effective public health strategies to reduce the burden of illness and death caused by tobacco products. Therefore, one significant challenge is to reduce toxicant exposure in tobacco products such as cigarettes. Nicotine is the active drug within tobacco that is delivered during cigarette smoking. Smoking behavior is driven by the need for nicotine and will influence delivery of other tobacco toxicants into the lungs. To better understand nicotine dosimetry, a physiologically-based pharmacokinetics (PBPK) model was developed which captures the essential nicotine metabolism and takes into account the variability of kinetic parameters among individuals.

Rudy Gunawan, Chuck Timchalk, Justin TeeguardenPacific Northwest National [email protected], [email protected],[email protected]

CP7

A Multi-Scale Model of Tumor Growth

Mathematical and computational modeling holds great promise for medicine to predict outcomes such as tumor metastasis or drug response. Fibroblasts and myofibroblasts near the tumor microenvironment are important players in tumor growth and metastasis because of their unique ability to coordinate events which increase cell proliferation, especially in breast cancer. It has been experimentally shown that fibroblasts play an important role in promoting tumor growth in vitro. A multi-scale model of this interaction between stroma and transformed epithelial cells near a breast duct will be presented. The EGF-TGFbeta pathway controls these interactions, and our multi-scale model describes these phenomena at different time and spatial scales, i.e., intracellular dynamics, cell dynamics at the cellular level, and mechanical interaction between tumor cells and surrounding stromal tissue (continuum).

Yangjin KimUniversity of Michigan, [email protected]

Hans G. OthmerUniversity of MinnesotaDepartment of [email protected]

Magdalena StolarskaUniversity of St. ThomasDepartment of [email protected]

CP7

Finite Difference Time Domain (FDTD) Analysis of Near-Infrared Optical Pulse Propagation in An Adult Head Model for Functional Imaging of Brain Activities

Finite difference time domain (FDTD) analysis is proposed for calculating near-infrared optical pulse propagation in an adult head model composed of scattering layers and a clear cerebrospinal fluid (CSF) layer. The integral form of the diffusion equations, with additional sources due to photons radiated from an adjoining scattering layer, is solved by the FDTD analysis utilizing a new boundary condition at the CSF interface. The numerical results are in excellent agreement with experiments and Monte Carlo simulations.

Tadatoshi TanifujiDepartment of Electrical and Electronics EngineeringKitami Institute of [email protected]

CP8

Numerical Solution of Control-State Constrained Optimal Control with An Inexact Smoothing Newton Method

This talk is concerned with a globalized inexact smoothing Newton method for the numerical solution of optimal control problems subject to mixed control-state constraints. The method uses the smoothed Fischer-Burmeister function to reformulate first order necessary conditions and aims at minimizing the squared residual norm using Newton steps and gradient-like steps. Numerical experiments are provided to illustrate the convergence results.
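For orientation, one common smoothing of the Fischer-Burmeister function is phi_mu(a, b) = a + b - sqrt(a^2 + b^2 + 2 mu^2). For mu = 0 its zeros are exactly the pairs with a >= 0, b >= 0 and ab = 0, and for mu > 0 they satisfy ab = mu^2. The sketch below assumes this particular form; the abstract does not state which smoothing is used, and the names `smoothed_fb` and `squared_residual` are illustrative.

```python
import numpy as np

def smoothed_fb(a, b, mu):
    """Smoothed Fischer-Burmeister function (assumed form, see lead-in)."""
    return a + b - np.sqrt(a * a + b * b + 2.0 * mu * mu)

def squared_residual(a, b, mu):
    """Merit function 0.5 * ||phi_mu(a, b)||^2 that a globalized Newton method would decrease."""
    phi = smoothed_fb(np.asarray(a, dtype=float), np.asarray(b, dtype=float), mu)
    return 0.5 * float(np.sum(phi ** 2))
```

Driving mu to zero while reducing `squared_residual` recovers the original complementarity conditions in the limit.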

Jinhai ChenUniversity of Colorado [email protected]

CP8

A Multilevel Monte Carlo for Simulating Extreme Quantiles and Probabilities

Let X denote a generic random vector with probability distribution μ on R^d, and let Φ be a mapping from R^d to R. That mapping can be a black box, e.g., the result of some computer experiments for which no analytical expression is available. Our goal is to estimate the probability p = P[Φ(X) > q] for an arbitrary real number q, or conversely, when p is fixed, to estimate the quantile q such that P[Φ(X) > q] = p. Naive Monte Carlo estimation of that probability with a prescribed signal-to-noise ratio becomes computationally intractable for quantiles q lying far out in the right-hand tail of the distribution of the random variable Φ(X). In this talk we present and analyze a novel simulation algorithm for this problem. It proceeds by successive elementary steps, each one being based on the Metropolis-Hastings algorithm. We demonstrate the practical usefulness of our method by applying it to a problem in watermarking.
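The intractability of the naive estimator can be made concrete with a short calculation. The naive estimate (fraction of n independent samples with Φ(X) > q) has relative standard deviation sqrt((1 - p) / (n p)), so the sample size needed for a fixed relative accuracy grows like 1/p. The helper names below are illustrative, and the sketch covers only the naive baseline, not the algorithm of the talk.

```python
import math

def naive_mc_relative_error(p, n):
    """Relative standard deviation of the naive estimator (hits / n) of a probability p."""
    return math.sqrt((1.0 - p) / (n * p))

def naive_mc_samples_needed(p, rel_err):
    """Smallest n for which the relative standard deviation is at most rel_err."""
    return math.ceil((1.0 - p) / (p * rel_err ** 2))
```

For example, p = 1e-6 with a 10% relative error requires on the order of 1e8 evaluations of Φ, which is prohibitive when each evaluation is a computer experiment.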

Nicolas HengartnerInformation Sciences GroupLos Alamos National [email protected]

Arnaud GuyaderINRIA RennesUniversity of Haute [email protected]

Eric Matzner-LoberDepartment of StatisticsUniversity of Haute [email protected]

CP8

Calculating Derivatives in Software for Designing Optical Fibers

Maxwell’s equations in one dimension may be used to model an optical fiber. When solving an inverse problem to determine the refractive index profile of the fiber, the computation of the gradient of the dispersion, involving derivatives of the eigenvalues and effective area, an integral of a function of the eigenvector, is more expensive than solving the underlying eigenproblem. By interchanging the order of differentiation, taking advantage of the local support of the design parameters, and using separation of variables, we greatly decrease the computation time.

Linda Kaufman, Nick Lworonekin, William Sierchio,William Landon, Seokmin Bang, Ryan Petho, DanielSavacoolWilliam Paterson [email protected], [email protected],[email protected],[email protected], [email protected],[email protected], [email protected]

CP8

Time-Optimal Bang-Bang Control of Burgers’ Equation in 1D

A numerical method for the approximate solution of a time-optimal control problem constrained by the viscous Burgers’ equation in one spatial dimension is presented. The control formulation requires a desired, but not fixed, final condition on the state solution. The novel algorithm converges based on iteratively solving the transversality condition for the final time. Numerical results for illustrative cases are given. The method can be adapted to hyperbolic systems of PDEs.

Nathan D. MoshmanNaval Postgraduate [email protected]

CP8

Shape Optimization of Functions of Dirichlet-Laplacian Eigenvalues

We consider the shape optimization of functions of Dirichlet-Laplacian eigenvalues over the set of star-shaped, symmetric, bounded planar regions with smooth boundary. The regions are represented using Fourier-cosine coefficients and the optimization problem is solved numerically using a quasi-Newton method. The method is applied to generalizations of the Payne-Polya-Weinberger ratio. Optimal values and attaining regions are presented and interpreted as a study of the range of Dirichlet-Laplacian eigenvalues.
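A Fourier-cosine representation of a star-shaped region can be sketched as a radius function r(θ) = Σ_k a_k cos(kθ), with the coefficients a_k acting as design variables. The exact parameterization and normalization used in the talk are not given, so the form below is an assumption, and the eigenvalue solver and quasi-Newton loop are not shown. The function names are illustrative.

```python
import numpy as np

def boundary_radius(coeffs, theta):
    """r(theta) = sum_k coeffs[k] * cos(k * theta); cosine-only terms give symmetry about the x-axis."""
    k = np.arange(len(coeffs))
    return np.cos(np.outer(theta, k)) @ np.asarray(coeffs, dtype=float)

def enclosed_area(coeffs, n=2048):
    """Area from the polar formula A = 1/2 * integral of r^2 dtheta."""
    theta = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    r = boundary_radius(coeffs, theta)
    # A uniform sum is highly accurate here because r is smooth and periodic.
    return 0.5 * np.sum(r ** 2) * (2.0 * np.pi / n)
```

An area computation like this is useful because ratios of Dirichlet-Laplacian eigenvalues are scale invariant, whereas individual eigenvalues must be compared at fixed area.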

Braxton OstingColumbia [email protected]

CP8

Variational Multiscale Proper Orthogonal Decomposition: Convection-Dominated Convection-Diffusion Equations

We introduce a variational multiscale closure modeling strategy for the numerical stabilization of proper orthogonal decomposition reduced-order models of convection-dominated equations. As a first step, the new model is analyzed and tested for convection-dominated convection-diffusion equations. The numerical analysis of the finite element discretization of the model is presented. Numerical tests show the increased numerical accuracy over a standard reduced-order model and illustrate the theoretical convergence rates.

Zhu WangDepartment of MathematicsVirginia [email protected]

Traian IliescuInterdisciplinary Center for Applied [email protected]

CP9

Edu @ JSC

Fostering a sound education of students and young researchers at bachelor, master and PhD level in high-performance computing, mathematics and computational science is an essential task of the Julich Supercomputing Centre (JSC). This talk will give an overview of the educational activities at JSC and report on the guest student programme, the summer/winter schools for PhD students, joint bachelor and master courses with universities, and the graduate school GRS (German Research School for Simulation Sciences).

Johannes GrotendorstResearch Centre [email protected]

CP9

Sculpture, Geometry and Computer Science

We present a project for middle and high school students that blends mathematics and computer science with art appreciation. The goal is to model a challenging geometric sculpture, Indiana Arc. After some basic instruction in the Python scripting language, students use their math skills to generate 3-D geometric data and then visualize and analyze it using open source software. The project also pays homage to ’Geometry and the Imagination’ (Hilbert and Cohn-Vossen).

Randy HeilandOpen Systems LabIndiana [email protected]

Charles [email protected]

Barbara ReamInternational School of [email protected]

Andrew LumsdaineOpen Systems LaboratoryIndiana [email protected]

CP9

Assessment Tools for the Computational Science Curriculum

We report on the ongoing development of a diagnostic assessment and a project evaluation rubric intended for use across the computational science curriculum. The diagnostic assessment has predicted success in our introductory computational science course with reasonable reliability. The project evaluation rubric has been used in both introductory and upper-level undergraduate courses and in graduate courses. Emphasis is given to the process of refining the project evaluation rubric through several semesters of use.

Robert Olsen, Russell MansonRichard Stockton College of New [email protected], [email protected]

CP10

Performance of a Boundary Element Method Solver on General Purpose Graphics Processing Unit

The 3D Laplace boundary element code available from www.intetec.org was adapted to run on an Nvidia Tesla general purpose graphics processing unit (GPU). Global matrix assembly and LU factorization of the resulting dense matrix were performed on the GPU. Out-of-core techniques were used to solve problems larger than available GPU memory. The code achieved over eight times speedup in matrix assembly and nearly 60 Gflops/sec in the LU factorization using only 512 Mbytes of GPU memory. This research is sponsored by the Office of Advanced Scientific Computing Research; U.S. Department of Energy. The work was performed at the Oak Ridge National Laboratory, which is managed by UT-Battelle, LLC under Contract No. DE-AC05-00OR22725.

Eduardo F. D’AzevedoOak Ridge National LaboratoryMathematical Sciences [email protected]

Sylvain Nintcheu FataOak Ridge National [email protected]

CP10

Gradient Free Design of Microfluidic Structures on a GPU Cluster

For certain fluid optimization problems such as those of a multi-physics or multi-phase nature, traditional gradient based design methods may break down at sharp interfaces. In these scenarios gradient free methods are necessary. Unfortunately, gradient free methods require more function evaluations than gradient based design methods. In this work we present a design optimization method for microfluidic T-junctions used in lab-on-a-chip devices. We utilize the highly parallel multidirectional search (MDS) optimization procedure along with a multi-core/GPU accelerated flow solver, making the problem ideal for GPU clusters.

Austen C. DuffyDepartment of MathematicsFlorida State [email protected]

CP10

Efficient Numerical Algorithms for Uncertainty Quantification in Computational Mechanics Using GPUs

Graphics processing units (GPUs) are rapidly emerging as economical and highly competitive alternatives to CPU-based parallel computing. This study implements several existing algorithms, such as vector convolution, explicit Runge-Kutta methods, multi-dimensional numerical quadrature, and Monte Carlo simulation on a GPU, and discusses their utilization for uncertainty quantification problems arising in computational mechanics and dynamics. Numerical examples are presented to demonstrate the computational efficiency of the GPU-based algorithms compared to corresponding multi-core CPU-based algorithms.

Gaurav Gaurav, Steven WojtkiewiczDepartment of Civil EngineeringUniversity of [email protected], [email protected]

CP10

Investigation of Soft Errors as Integrated Components of a Simulation

As architectures become more complex, failures come increasingly into play and simulation behavior becomes less predictable. Especially worrisome are soft errors, i.e. bit flips, due to systems operating at such low voltages. Soft errors are especially insidious because they may not even be detected. Linear solvers must be considered as integrated components of the algorithmic stack in a simulation. We investigate the effects of soft errors in the linear solver on the overall simulation.
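A single soft error can be emulated in software by flipping one bit of a floating-point value, which is a simple way to inject faults into a solver for experiments of this kind. The sketch below assumes IEEE 754 double precision; the helper name `flip_bit` is illustrative, and the abstract does not describe the injection mechanism the authors used.

```python
import struct

def flip_bit(x, bit):
    """Return the float64 obtained by flipping one bit (0 = least significant) of x."""
    if not 0 <= bit < 64:
        raise ValueError("bit must be in 0..63")
    # Reinterpret the 8 bytes of the double as an unsigned 64-bit integer.
    (as_int,) = struct.unpack("<Q", struct.pack("<d", x))
    (flipped,) = struct.unpack("<d", struct.pack("<Q", as_int ^ (1 << bit)))
    return flipped
```

The effect depends strongly on the position: a flip in the low mantissa bits perturbs the value at rounding-error level, while a flip in the exponent or sign bit changes it by orders of magnitude or negates it.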

Victoria HowleTexas [email protected]

Michael A. Heroux, Patricia D. HoughSandia National [email protected], [email protected]

CP10

Parallel Implementation of ADI as Preconditioner

The alternating directions implicit (ADI) method is a classical iterative method for numerically solving linear systems arising from discretizations of partial differential equations. We use ADI as a preconditioner for Krylov subspace methods for problems in two and three space dimensions, because it allows for a highly efficient, completely matrix-free implementation. This talk will demonstrate that effective parallel implementations of the method are possible.
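To illustrate why ADI never needs the assembled two-dimensional matrix, the sketch below applies Peaceman-Rachford sweeps to the 5-point Laplacian on an n-by-n grid, written as H + V with H acting along one grid direction and V along the other. Only one-dimensional solves appear. The model problem, the unscaled stencil, the single fixed shift `rho`, and the dense 1D solves are all assumptions made for brevity; a production code would use tridiagonal solves, and the authors' implementation is not shown in the abstract.

```python
import numpy as np

def laplacian_1d(n):
    """Unscaled 1D second-difference matrix tridiag(-1, 2, -1)."""
    return 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)

def adi_apply(rhs, n, rho, sweeps):
    """Approximate the solution of (H + V) u = rhs with Peaceman-Rachford sweeps.

    u is stored as an n-by-n array U; H acts as T @ U and V as U @ T,
    so only 1D solves with T + rho*I are needed.
    """
    T = laplacian_1d(n)
    shifted = T + rho * np.eye(n)
    B = rhs.reshape(n, n)
    U = np.zeros((n, n))
    for _ in range(sweeps):
        # Half step: (H + rho I) U_half = B - (V - rho I) U
        U_half = np.linalg.solve(shifted, B - U @ T + rho * U)
        # Full step: (V + rho I) U_new = B - (H - rho I) U_half
        U = np.linalg.solve(shifted, (B - T @ U_half + rho * U_half).T).T
    return U.ravel()
```

Used as a preconditioner, `adi_apply` would be called with a small fixed number of sweeps on each Krylov residual; each 1D solve is independent across grid lines, which is the source of parallelism.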

Kyle SternUniversity of Maryland, Baltimore [email protected]

Matthias K. GobbertUniversity of Maryland, Baltimore CountyDepartment of Mathematics and [email protected]

Andrei DraganescuDepartment of Mathematics and StatisticsUniversity of Maryland, Baltimore [email protected]

CP10

Paladins: A Parallel Algebraic Adaptive Navier-Stokes Solver

In some applications of incompressible fluid dynamics, fast transients, which require small time steps, are present only in some periods of the overall time interval of interest. A class of second and third order time-accurate splitting schemes has been introduced by Gervasio, Saleri and Veneziani, that features a hierarchical structure prone to time adaptivity. We will present some technical details and parallelization issues of this time adaptive scheme based on algebraic splitting.

Umberto E. VillaDept. of Mathematics and Computer ScienceEmory [email protected]

Alessandro VenezianiMathCS, Emory University, Atlanta, [email protected]

CP11

OSPRI: A New Communication Runtime System for Global Arrays and Other One-sided Programming Models

We have developed a new communication runtime system named OSPRI (One-Sided PRImitives) for PGAS programming libraries and languages. Significantly improved performance versus ARMCI is observed on the Blue Gene/P architecture for NWChem and Global Arrays benchmarks. The design and implementation strategy of OSPRI targets modern networks in Cray and IBM systems as well as multithreaded and hybrid systems. Finally, interaction with MPI-3 and MPICH-based implementations thereof will be discussed.

Jeff R. HammondArgonne National LaboratoryLeadership Computing [email protected]

CP11

Coupling Multi-Body Fluid-Solid Mechanical Systems

The coupling of multi-physics systems is essential for a variety of problems in engineering and computational physiology. Increasingly, these problems demand the coupling of multiple fluid and solid bodies which may exhibit varied physical responses and interactions through time. We focus on the integration and solution of these systems using a Lagrange multiplier finite element coupling approach and examine – both theoretically and numerically – the accuracy, solvability and uniqueness of these general coupled problems.

David NordslettenMassachusetts Institute of [email protected]

CP11

Preconditioning Surface and Subsurface Flow Coupling for Arbitrary Geometries on a Structured Grid

Due to complex dynamics inherent in the physical models, numerical formulations of subsurface and overland flow coupling can be challenging to solve. ParFlow is a subsurface flow code that couples with overland flow via an overland boundary condition prescribed at the top surface. This talk will present a preconditioning approach to discrete systems arising from implicit coupling of these flow regimes in ParFlow. Numerical results will explore the effectiveness of the preconditioner and its cost.

Carol WoodwardLawrence Livermore National [email protected]

Daniel Osei-KuffuorUniversity of [email protected]

Reed M. MaxwellDepartment of Geology and Geologic EngineeringColorado School of [email protected]

Steven SmithLawrence Livermore National [email protected]

CP11

Simulations of a Model for Calcium Waves in a Heart Cell Using COMSOL Multiphysics

Experiments have shown that calcium waves can occur from spontaneously generated calcium sparks in a heart cell. A model for this process is given by a system of reaction-diffusion equations. We have in the past developed a special-purpose C code with MPI for this model that successfully handles the high-resolution meshes of the desired three-dimensional domains. We will evaluate whether a general-purpose package such as COMSOL Multiphysics can be used at least for lower-resolution studies.

David TrottUniversity of Maryland, Baltimore [email protected]


CP11

Algorithms for Interface Treatment and Load Computation in Embedded Boundary Methods for Fluid-Structure Interaction Problems

This talk focuses on the treatment of fluid-structure interfaces and load computation in embedded boundary methods. First, it presents a numerical method for treating simultaneously the fluid pressure and velocity conditions on static and dynamic embedded interfaces based on the exact solution of local fluid-structure Riemann problems. Next, it describes two consistent and conservative approaches for computing the flow-induced loads on embedded structures. Finally, it discusses applications in aeronautics and underwater implosion.

Kevin WangInstitute for Computational and Mathematical EngineeringStanford University, Stanford, [email protected]

Charbel FarhatStanford University

[email protected]

CP12

Efficient Model Reduction of Large-Scale Nonlinear Systems in Fluid Dynamics

Time critical applications in fluids, such as flow control, demand numerical simulations that are computationally inexpensive, yet extremely accurate. To this end, the authors have developed a model reduction methodology for nonlinear systems that reduces the complexity of evaluating large-scale computational models while demonstrating accuracy and stability. In this talk, results for this method applied to difficult real-world fluids problems are presented for the first time. Orders of magnitude speedups are observed.

Kevin T. Carlberg, David AmsallemStanford [email protected], [email protected]

Charbel Bou-MoslehNotre Dame [email protected]

Charbel FarhatStanford [email protected]

CP12

Interpolation-Based Model Reduction of Bilinear Systems: New Results and An Approach to Optimal Approximation

Bilinear systems capture the nonlinear features of many problems, ranging from blood circulation to nuclear fission, while retaining much of the simplicity and structure of linear models. In this talk, we will present a method for computing reduced-order models of MIMO bilinear systems satisfying interpolation conditions, and show how to use the system’s d-term to recover stability in the reduced-order model. Finally, we present H2 optimality conditions, and consider their implications in choosing interpolation points.

Garret M. FlaggVirginia [email protected]

Serkan GugercinVirginia Tech.Department of [email protected]

Christopher A. BeattieVirginia Polytechnic Institute and State [email protected]

CP12

Subspace Tracking for Dimension Reduction in Streaming Data

The real-time analysis of streaming data from sensors can be challenging when the number of sensors is large, the sampling rate is high, and the statistics of the data vary over time. We discuss how we can identify the important data streams using subspace tracking methods such as incremental SVD, PAST, and SPIRIT. These methods track the reduced dimension space as new data come in and the old data become obsolete.
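Of the three methods named, the incremental SVD is the easiest to sketch. Each new sample is split into its component inside the tracked subspace and a residual; a small (k+1)-by-(k+1) SVD then rotates and, if needed, truncates the basis. The version below is a generic textbook-style update under stated assumptions, not the author's implementation. The optional `forget` factor is one simple way to let old data fade, and PAST and SPIRIT are different algorithms that are not shown.

```python
import numpy as np

def isvd_update(U, s, x, rank, forget=1.0, tol=1e-10):
    """Fold one new sample x into a rank-limited SVD (U, s) of the data seen so far."""
    p = U.T @ x                       # component of x inside the tracked subspace
    r = x - U @ p                     # component orthogonal to it
    r_norm = np.linalg.norm(r)
    k = len(s)
    if r_norm > tol * np.linalg.norm(x):
        # x brings a new direction: enlarge the basis by one column.
        K = np.zeros((k + 1, k + 1))
        K[:k, :k] = np.diag(forget * s)   # forget < 1 down-weights old data
        K[:k, k] = p
        K[k, k] = r_norm
        basis = np.hstack([U, (r / r_norm)[:, None]])
    else:
        # x already lies in the subspace: keep the basis size.
        K = np.hstack([np.diag(forget * s), p[:, None]])
        basis = U
    Uk, sk, _ = np.linalg.svd(K, full_matrices=False)
    return (basis @ Uk)[:, :rank], sk[:rank]

# Assumed test stream: 50 samples from a fixed 2-dimensional subspace of R^6.
rng = np.random.default_rng(1)
Q, _ = np.linalg.qr(rng.normal(size=(6, 2)))
X = Q @ rng.normal(size=(2, 50))
U = X[:, :1] / np.linalg.norm(X[:, 0])
s = np.array([np.linalg.norm(X[:, 0])])
for j in range(1, X.shape[1]):
    U, s = isvd_update(U, s, X[:, j], rank=2)
```

The cost per sample depends on the number of sensors and the tracked rank, not on the length of the stream, which is what makes this kind of update suitable for high sampling rates.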

Chandrika KamathLawrence Livermore National [email protected]

CP12

Direct Method for Calculus of Variations Problems Via Combined Block-Pulse and Orthogonal Functions

In this work, we present a new direct computational method to solve calculus of variations problems. The approach is based on reducing calculus of variations problems to a set of algebraic equations by first expanding the candidate function as a hybrid function with unknown coefficients. The hybrid functions, which consist of combined block-pulse and orthogonal functions, are first introduced. Some properties together with illustrative examples are given.

Mohsen RazzaghiMississippi State [email protected]

CP12

Approximation of ML Estimates of Parameters of Hierarchical Logistic Models Via Monte Carlo Simulations

Maximum Likelihood (ML) estimation of parameters of hierarchical logistic models requires the approximation of integrals that do not have a closed form. Several approximation techniques have been proposed. This project explores the performance of four approximation techniques via Monte Carlo simulations when using a hierarchical logistic model to analyze cluster-randomized interventions in epidemiological studies. Recommendations are made, based on estimation properties, on which approximation technique to use under scenarios where these techniques produce different results.
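The kind of integral involved can be seen in the simplest case: a random-intercept model logit(p) = β0 + b with b ~ N(0, σ²), where the marginal success probability is an integral of the logistic function against a normal density. Gauss-Hermite quadrature is one widely used approximation for it. The abstract does not say which four techniques were compared, so the sketch below is an illustration of the problem rather than of the study's methods, and the function name is hypothetical.

```python
import numpy as np

def marginal_prob_success(beta0, sigma, n_nodes=20):
    """P(y = 1) for logit(p) = beta0 + b, b ~ N(0, sigma^2), by Gauss-Hermite quadrature."""
    # Nodes and weights for the weight function exp(-x^2).
    x, w = np.polynomial.hermite.hermgauss(n_nodes)
    # Change of variables b = sqrt(2) * sigma * x maps the weight to the N(0, sigma^2) density.
    b = np.sqrt(2.0) * sigma * x
    p = 1.0 / (1.0 + np.exp(-(beta0 + b)))
    return float(np.sum(w * p) / np.sqrt(np.pi))
```

In a full likelihood, one such integral appears per cluster, with the integrand replaced by the product of the Bernoulli probabilities of all observations in that cluster.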

Rafael E. DiazCalifornia State University [email protected]

Coskun CetinCalifornia State University, [email protected]

Matthew Steinwachs, Lydia Palma, Shawnee AndersonCalifornia State [email protected], [email protected],[email protected]

CP12

Low-Rank Tensor Methods for Uncertainty Quantification

Uncertainty quantification using spectral stochastic methods like the stochastic Galerkin method involves solving systems with a huge number of unknowns. However, the underlying space has a tensor product structure that can be used for more efficient representation of the quantities involved. We show how to modify standard iterative methods in order to use this structure during the whole solution process. Error estimates and convergence results for those methods will be given. Further, numerical results will be presented that show how the range of problems feasible to compute can be extended using these methods.

Elmar ZanderTU BraunschweigInst. of Scientific [email protected]

Hermann G. MatthiesInstitute of Scientific ComputingTechnische Universitat [email protected]

CP13

Recycling Bi-Lanczos Algorithms

Science and engineering problems frequently require solving a long sequence of linear systems. Since CGS and BiCGSTAB are popular methods for solving large nonsymmetric linear systems, we generalize and extend the framework of our Recycling BiCG to CGS and BiCGSTAB. We modify these algorithms to use a recycle space, which is built from left and right approximate eigenvectors. We also propose a Recycling QMR. Initial experiments on various applications give promising results.

Kapil Ahuja, Eric De SturlerVirginia [email protected], [email protected]

CP13

Uintah: A Scalable Computational Science Framework for Hazard Analysis

The Uintah software system was developed to provide an environment for solving fluid-structure interaction problems on structured adaptive grids for large-scale, long-running, data-intensive problems. Uintah uses a novel asynchronous task-based approach with fully automated load balancing. The application of Uintah to a petascale problem in hazard analysis arising from “sympathetic” explosions, in which the collective interactions of a large ensemble of explosives result in dramatically increased explosion violence, is considered. The advances in scalability and combustion modeling needed to begin to solve this problem are discussed and illustrated by prototypical computational results.

Martin BerzinsSCI InstituteUniversity of [email protected]

Justin LuitjensSCI Institute and School of ComputingUniversity of [email protected]

Qingyu MengSCI InstituteUniversity of [email protected]

Todd HarmanMechanical EngineeringUniversity of [email protected]

Charles WightDept. of ChemistryUniversity of [email protected]

Joseph PetersonDept of ChemistryUniversity of [email protected]

CP13

Trilinos-Based Solvers for Large Scale Inverse Problems

Solving large scale inverse problems efficiently is integral to computational mathematics. A suite of iterative solvers for linear inverse problems is presented, which has been written utilizing the Trilinos framework. To illustrate the use of the solvers, an application for reducing movement degradation of PET brain scans is described. Removing these blurs through computational post-processing requires solving a large, sparse linear inverse problem.

Sarah KnepperEmory UniversityDepartment of Mathematics and Computer [email protected]


CP13

Real Mathematics

Mathematics today more and more frequently faces a task for which it is not prepared. It has at its disposal some time series describing objective reality, and the task is: are these time series a manifestation of a real object? Real mathematics (RM), meaning mathematics that solves this problem, can proceed in the following manner. First of all, it needs to have at its disposal a model of interactions. RM uses the knowledge of the exact philosophy about objects, which says that all real objects have a common structure. The procedure is as follows. If the existence of an object has to be proved and its properties described, it is necessary to prove that the given time series describe, to some extent, parts of that common structure. RM takes the model of interactions, makes a forecast, and measures its confidence using a special criterion. If incorporating the properties of the time series as parts of an object improves the forecast, the task is fulfilled. The optimal set of properties for the individual time series is found in this manner, and RM can determine with what confidence there is an object behind them.

Martin VlcekMinistry of Finance of the Czech [email protected]

CP14

For the Stationary Compressible Viscous Navier-Stokes Equations with the No-Slip Condition on a Convex Polygon

Our concern is with existence and regularity of solutions of the stationary compressible viscous Navier-Stokes equations with no-slip condition on convex polygonal domains. Note that [u, p] = [0, c], c a constant, is the eigenpair for the singular value l = 1 of the Stokes problem on the convex sector. It is shown that, except for the pair [0, c], the leading order of the corner singularities for the nonlinear equations is the same as that of the Stokes problem. We split the leading corner singularity from the solution and show an increased regularity for the remainder. As a consequence, the pressure solution changes sign at the convex corner and its derivatives blow up. From the numerical point of view, we discretize the stress intensity factor and the regular part of the solution for the linearized problem, and show error estimates for them.

Hyung Jun ChoiDepartment of [email protected]

CP14

A Perfectly Matched Layer for the Boltzmann-BGK Equation and its Application to the Lattice Boltzmann Method

Perfectly Matched Layer (PML) absorbing boundary conditions are proposed for the discrete velocity Boltzmann-BGK equation (DVBE). Following a study of the linear waves supported by the DVBE, nonreflecting absorbing boundary conditions are derived using a space-time transformation. Linear analysis shows that the proposed equations are stable for practical numerical calculations. To validate the accuracy of the boundary condition, the DVBE is solved by a finite difference scheme and by a Lattice Boltzmann method (LBM).

Elena CraigOld Dominion [email protected]

Fang HuDepartment of MathematicsODU, [email protected]

CP14

For the Stationary Navier-Stokes Flows with Non-standard Boundary Conditions on Polygonal Domains

In this talk we show existence and regularity for the vorticity-velocity-pressure variables of the stationary Navier-Stokes equations on polygonal domains. Near the non-convex vertex, the solution has the same corner singularity order as the solutions of the Stokes operator with the vorticity boundary condition on the non-convex infinite sector, and the regular part obtained by splitting the corner singularity from the solution has further regularity.

Oh Sung [email protected]

CP14

RBF Methods for 2D Fluid Flow on the Sphere

We use a pseudo-spectral method to solve advection problems on the unit sphere. Rather than Chebyshev polynomials or Fourier series, we use Radial Basis Functions (RBFs) as an interpolation basis. We investigate the stability of this method with the aim of producing a stable Navier-Stokes solver. Specifically, we are interested in defining the subset of velocity fields for which the advection operator is stable.

Jordan M. MartelArizona State UniversitySchool of Mathematical and Statistical [email protected]

CP14

Numerical Simulation of Cell/cell and Cell/particle Interaction in Microchannels

A spring model is applied to simulate the skeleton structure of the red blood cell membrane and to study red blood cell rheology in Poiseuille flow with an immersed boundary method. The lateral migration properties of many cells in Poiseuille flow have been investigated. We have also combined the above methodology with a distributed Lagrange multiplier/fictitious domain method to simulate the interaction of red blood cells and neutrally buoyant particles in a microchannel in order to study the margination of particles.

Tsorng-Whay Pan, Lingling ShiDepartment of MathematicsUniversity of [email protected], lingling2math.uh.edu

Roland GlowinskiUniversity of [email protected]

CP14

Numerical Treatment of a Slider Bearing Mechanism: Performance and Evaluation

Numerical solutions are obtained for the thermohydrodynamic lubrication of a tilted-pad slider bearing with heat conduction to the pad as well as to the slider. The fluid film momentum, continuity, and energy equations are coupled to the heat conduction equations for the pad and slider; density and viscosity are assumed to be temperature dependent. The results, depicted through plots, reveal a thermal boundary layer and recirculation zones in the temperature field.

Pentyala Srinivasa RaoIndian School of [email protected]

CP15

Framework for Loosely Coupled Fusion Plasma Simulations

We present the Integrated Plasma Simulator (IPS), a highly portable framework for file-based, loose coupling of high performance, parallel multi-physics codes. The framework adopts a component-based approach, where each code is adapted to a standard interface using a thin wrapping layer. Data is exchanged among participating components using a common Plasma State layer. Framework services are provided for the management of multi-level concurrent task execution, resource allocation, data movement, fault tolerance, and asynchronous event propagation.

Wael R. ElwasifOak Ridge National [email protected]

David E. BernholdtOak Ridge National LaboratoryComputer Science and Mathematics [email protected]

Samantha Foley, Aniruddha ShetOak Ridge National [email protected], [email protected]

Randall B. BramleyIndiana [email protected]

CP15

Lead-Acid Battery Model Under Discharge With Fast Splitting Method

A mathematical model of a valve-regulated lead-acid battery under discharge is presented as simplified from a standard electrodynamics model. This nonlinear reaction-diffusion model of a battery cell is solved using an operator splitting method to quickly and accurately simulate sulfuric acid concentration. This splitting method preserves continuity over material interfaces encompassing discontinuous parameters. Numerical results are compared with measured data by calculating battery voltage from the acid concentration using the Nernst equation.

R. Corban Harwood, V. S. ManoranjanWashington State [email protected], [email protected]

Dean B. EdwardsUniversity of [email protected]

CP15

Numerical Application Using Invariant Distribution of the Elasto-Plastic Oscillator to Compute Frequency of Deformations

The mean frequency of threshold crossing of the response of an elasto-plastic oscillator under standard white noise is studied in this paper by means of its invariant measure and Rice’s formula. Due to the lack of regularity of this measure, an approximation of Rice’s formula is introduced as a solution of a partial differential equation. Finally, a useful criterion is given for engineering problems to evaluate the mean frequencies of deformations.

Laurent MertzUniversity Pierre et Marie Curie,Paris [email protected]

Cyril FeauCommissariat a l’energie [email protected]

CP15

3D Shape Optimization in Viscous Incompressible Fluid

A new integral equation for the axially symmetric Oseen equations is obtained based on the Cauchy integral formula for generalized analytic functions. The integral equation has a computational advantage over the existing integral equation based on fundamental solutions (Oseenlets). An integral equation constrained optimization approach to finding three-dimensional minimum-drag shapes for bodies translating in a viscous incompressible fluid under the Oseen approximation of the Navier-Stokes equations is presented. The approach formulates the Oseen flow problem as a boundary integral equation and finds solutions to this equation and its adjoint in the form of function series. Minimum-drag shapes, also represented by function series, are then found by the adjoint equation-based method with a gradient-based algorithm, in which the gradient for the shape series coefficients is determined analytically. Compared to PDE constrained optimization coupled with the finite element method (FEM), the approach reduces the dimensionality of the flow problem, resolves the issue of region truncation in exterior problems, finds minimum-drag shapes in semi-analytical form, and converges fast. As an illustration, the approach solves three drag minimization problems for different Reynolds numbers: (i) for a body of constant volume, (ii) for a torpedo with only the fore and aft noses being optimized, and (iii) for a body of constant volume following another body of fixed shape (a torpedo chasing a vessel). The minimum-drag shapes in problem (i) are in good agreement with the existing optimality conditions and conform to those obtained by PDE constrained optimization.

Michael ZabarankinDepartment of Mathematical SciencesStevens Institute of [email protected]

CP16

A Note on the Use of Optimal Control on a Discrete Time Model of Influenza Dynamics

A discrete time Susceptible-Asymptomatic-Infectious-Treated-Recovered (SAITR) model is introduced in the context of influenza transmission. We evaluate the potential effect of control measures such as social distancing and antiviral treatment on the dynamics of a single outbreak. Optimal control theory is applied to identify the best way of reducing morbidity and mortality at a minimal cost. The problem is solved by using a discrete version of Pontryagin’s maximum principle. Numerical results show that dual strategies have a stronger impact on the reduction of the final epidemic size.

Paula A. Gonzalez-ParraProgram in Computational ScienceThe University of Texas at El [email protected]

Sunmi LeeMathematical, Computational and Modeling SciencesCenter,Arizona State University,[email protected]

Leticia VelazquezComputational Science ProgramUniversity of Texas, El [email protected]

Carlos Castillo-ChavezArizona State University andDepartment of [email protected]

CP16

Shear Flow Instabilities of an Upper Convected Maxwell Model

This work is concerned with the linear stability of viscoelastic shear flows of an upper convected Maxwell fluid under the effect of elasticity. We focus on the stability problem for a few classes of simple parallel flows in the limit of infinite Weissenberg and Reynolds numbers, and we will discuss the numerical stability results. We consider plane Couette and Poiseuille flow, the hyperbolic tangent shear layer, and the Bickley jet. For all these flows, we consider free surface boundary conditions as well as wall boundary conditions. In the inviscid case, all the flows are unstable for free surfaces. For wall bounded flows, the Couette and Poiseuille flows are stable, while stability of the shear layer and Bickley jet depends on the ratio of the channel width to the characteristic length scale of the profile. In all cases, we find that elasticity stabilizes and ultimately suppresses the instability. Our numerical approach is based on the spectral Chebyshev collocation method. We also show that some flows, such as plane Poiseuille flow between two parallel free surfaces, have short wave instabilities. This is in marked contrast to the wall bounded case, in which no smooth velocity profiles unstable to short waves are known, and for certain classes of flows there are even results ruling out short wave instability.

Ahmed Kaffelvirginia [email protected]

CP16

Block Preconditioners for Fully-Implicit Integration of the Shallow Water Equations

We discuss the development of block preconditioners in an effort to reduce computational costs associated with implicit time integration of atmospheric climate models within CAM-HOMME. We construct a fully implicit framework based on the shallow water equations and view the subsidiary linear system as a block matrix. Formal LU decomposition is performed and block preconditioners are derived based on approximations to the upper triangular block.
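The idea of deriving a preconditioner from the upper triangular factor of a formal block LU decomposition can be illustrated with a small dense sketch. Everything below is an assumption made for illustration: the 2x2 block partition, the random well-conditioned blocks `A`, `B`, `C`, `D`, and the use of the exact Schur complement (a practical preconditioner would approximate it). With the exact factor `U`, the right-preconditioned matrix `M U^{-1}` equals the unit lower triangular factor, so `N = M U^{-1} - I` satisfies `N N = 0`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Hypothetical well-conditioned blocks standing in for a linearized system.
A = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))
B = 0.1 * rng.standard_normal((n, n))
C = 0.1 * rng.standard_normal((n, n))
D = 4.0 * np.eye(n) + 0.1 * rng.standard_normal((n, n))

M = np.block([[A, B], [C, D]])
S = D - C @ np.linalg.solve(A, B)               # Schur complement of A in M
U = np.block([[A, B], [np.zeros((n, n)), S]])   # upper triangular block factor

# Right preconditioning with the exact U leaves the unit lower block factor,
# so N is strictly block lower triangular and N @ N vanishes.
N = M @ np.linalg.inv(U) - np.eye(2 * n)
nilpotency_defect = np.linalg.norm(N @ N)
```

Because the preconditioned operator differs from the identity by a nilpotent block, a Krylov method such as GMRES would converge in at most two iterations in exact arithmetic; replacing `S` by a cheaper approximation trades that property for lower cost.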

P. Aaron LottLawrence Livermore National [email protected]

Kate EvansOak Ridge National [email protected]

Howard C. ElmanUniversity of Maryland, College [email protected]


CP16

A Girsanov Monte Carlo Approach to Particle Filtering for Multi-Target Tracking

We present a novel approach for improving particle filters for multi-target tracking. The suggested approach is based on Girsanov’s change of measure theorem for stochastic differential equations. Girsanov’s theorem is used to design a Markov Chain Monte Carlo step which is appended to the particle filter and aims to bring the particle filter samples closer to the observations. The numerical results show that the suggested approach can significantly improve the performance of a particle filter.

Panagiotis StinisUniversity of Minnesota, Twin [email protected]

Vasileios MaroulasUniversity of Tennessee, [email protected]

CP16

IMPICE Method for the Simulation of 2D and 3D Compressible Flow Problems in Uintah

The Implicit Continuous-fluid Eulerian (ICE) method, a semi-implicit finite-volume solver, is a successful and widely used method that applies to flows ranging from supersonic to subsonic regimes. The classical ICE method has been expanded to problems in multiphase flow, which span a wide area of science and engineering. The ICE method is utilized by the C-SAFE code Uintah, written at the University of Utah, to simulate explosions, fires, and other fluid and fluid-structure interaction phenomena. The implementation of the ICE method used in Uintah is described in many papers by Kashiwa at Los Alamos and was extended to solve multifield cases by Harman at Utah. We improve the method by using slope limiters and an approximate Riemann solver to suppress nonphysical oscillations. Results of the improved ICE method (IMPICE) for compressible flow problems governed by the 2D and 3D Euler equations are shown, along with a spatial and temporal error analysis.

Lethuy T. TranUniversity of [email protected]

Martin BerzinsSCI InstituteUniversity of [email protected]

CP17

Distributional Properties of Stochastic Shortest Paths for Smuggled Nuclear Material

There are many well-studied optimization problems on transportation networks, including optimal routing between selected nodes and interdiction. Existing algorithms easily handle weights on both nodes and links; examples include Dijkstra’s and the Floyd-Warshall shortest path algorithms. We are particularly interested in the scenario where a nuclear material smuggler tries to successfully reach his/her target. The Pathway Analysis, Threat Response and Interdiction Options Tool (PATRIOT), a tool developed at the Los Alamos National Laboratory, readily handles this problem using a multi-modal world transportation network by identifying the most likely threat pathway to the target. The identification of this path relies on a set of fixed reliabilities associated with each arc and terminal in the network that represent the adversary’s perceived probability of successfully traversing the respective links or nodes. This path thus represents the adversary’s most reliable path. In order to account for the adversary’s uncertainty and to perform sensitivity analysis, we consider random reliabilities. The main quantities of interest are the resulting stochastic most reliable paths, or stochastic shortest paths for short, and their properties. These properties may depend on the network and on the specific origin and destination; nevertheless, to gain a better understanding, we perform some preliminary controlled experiments by running Monte Carlo simulations on a grid. This paper presents the resulting findings.
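A most reliable path can be computed with a standard shortest path algorithm, because maximizing a product of traversal probabilities is the same as minimizing the sum of their negative logarithms. The sketch below is not PATRIOT; it is a minimal illustration that assumes reliabilities on arcs only (node reliabilities are omitted), and the function name `most_reliable_path` and the toy network are hypothetical.

```python
import heapq
import math

def most_reliable_path(graph, source, target):
    """Dijkstra on weights -log(p); graph maps node -> [(neighbor, p), ...]
    with each arc reliability p in (0, 1]."""
    dist = {source: 0.0}
    prev = {}
    heap = [(0.0, source)]
    done = set()
    while heap:
        d, u = heapq.heappop(heap)
        if u in done:
            continue
        done.add(u)
        if u == target:
            break
        for v, p in graph.get(u, []):
            nd = d - math.log(p)          # -log(p) >= 0, so Dijkstra applies
            if nd < dist.get(v, math.inf):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, 0.0
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return path, math.exp(-dist[target])

# Hypothetical four-node network with fixed arc reliabilities.
network = {
    "A": [("B", 0.9), ("C", 0.5)],
    "B": [("D", 0.5)],
    "C": [("D", 0.95)],
    "D": [],
}
path, reliability = most_reliable_path(network, "A", "D")
```

For the stochastic setting described above, one would redraw the reliabilities from a chosen distribution and call the function once per Monte Carlo sample, recording which path wins each time.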

Leticia Cuellar, Feng PanLos Alamos National [email protected], [email protected]

Kevin Saeger, Fred [email protected], [email protected]

CP17

Selection Strategies for Genetic Algorithm to Solve Travelling Salesman Problem

A genetic algorithm is mainly composed of three important operations: selection, crossover, and mutation. This study explores the performance of various selection strategies in solving the travelling salesman problem. Results show that the tournament selection strategy outperformed proportional and rank-based selection, achieving the highest solution quality with low computing times. However, results reveal that tournament and proportional selection can be superior to rank-based selection for smaller problems and become susceptible to premature convergence as problem size increases.
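Tournament selection, the strategy reported above as performing best, can be sketched in a few lines: draw `k` individuals at random and keep the one with the shortest tour. The helper names `tour_length` and `tournament_select`, the tournament size, and the 4-city distance matrix are assumptions made for illustration and are not taken from the study.

```python
import random

def tour_length(tour, dist):
    """Length of a closed tour over the distance matrix dist."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def tournament_select(population, dist, k=3, rng=random):
    """Sample k distinct tours and return the shortest one."""
    contenders = rng.sample(population, k)
    return min(contenders, key=lambda t: tour_length(t, dist))

# Hypothetical symmetric 4-city instance.
dist = [
    [0, 1, 4, 1],
    [1, 0, 1, 4],
    [4, 1, 0, 1],
    [1, 4, 1, 0],
]
population = [[0, 1, 2, 3], [0, 2, 1, 3], [0, 1, 3, 2]]
winner = tournament_select(population, dist, k=3, rng=random.Random(0))
```

A larger `k` raises selection pressure, which is one plausible route to the premature convergence mentioned above; a smaller `k` keeps more diversity in the mating pool.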

Noraini Mohd [email protected]

John [email protected]

CP17

Scalable Distributed Coalition Formation for Large-Scale Collaborative Multi-Agent Systems

We study an important problem in Distributed AI and collaborative multi-agent systems, that of decentralized coalition formation. We have designed, implemented and extensively analyzed a scalable, fully distributed algorithm for dynamic coalition formation that ensembles of collaborative autonomous agents can use as a basic coordination subroutine. We summarize our algorithm and share recent simulation results that validate our earlier claims about high scalability and efficiency of our approach to distributed coalition formation. We also outline our ongoing and future work on (i) scaling the implementation from hundreds to thousands of autonomous agents, and (ii) analyzing the potential benefits of multi-tiered reinforcement learning on how to coordinate and form coalitions more effectively.

Predrag T. TosicUniversity of [email protected]

Naveen GinneDepartment of Computer Science,University of [email protected]

CP17

Strategies for Iterated Travelers’ Dilemma: The Good, The Bad and The Ugly

We study the iterated travelers’ dilemma (ITD) game from an algorithmic and experimental standpoint. ITD is a two-person game that is not zero-sum, i.e., that in general incorporates both cooperation and competition. We find ITD very interesting because (i) it has a unique Nash equilibrium that nevertheless corresponds to very low payoffs to both players, (ii) there is a unique pair of strategies that results in maximal social welfare, yet that strategy pair is highly unstable, and (iii) what constitutes an optimal or even good play from the individual welfare standpoint critically depends on the adopted notion of individual utility. We propose and then analyze a broad range of ITD strategies via a round-robin, everyone-against-everyone tournament made up of a large number of rounds. We share some interesting and, in several instances, surprising findings on the relative performance of various strategies (such as variants of tit-for-tat, well-known from the iterated prisoners’ dilemma context). We also outline our ongoing and future work focusing on (i) the value of learning and modeling the opponent’s behavior for maximizing one’s own long-term utility and (ii) co-learning and meta-learning approaches to evolving agent pairs whose long-term behavior maximizes the overall, joint utility (that is, social welfare).
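For readers unfamiliar with the game, a single round of the travelers' dilemma can be written as a payoff function. The abstract does not state its parameters, so the sketch assumes the classical formulation due to Basu: integer claims between 2 and 100, both players paid the lower claim, with a bonus of 2 to the lower claimant and a penalty of 2 to the higher one. The function name `td_payoffs` is hypothetical.

```python
def td_payoffs(claim_a, claim_b, low=2, high=100, bonus=2):
    """Payoffs (to A, to B) for one round of the travelers' dilemma.

    Assumed rules (classical formulation): both receive the lower claim;
    the lower claimant gains `bonus`, the higher claimant loses `bonus`.
    """
    if not (low <= claim_a <= high and low <= claim_b <= high):
        raise ValueError("claims must lie in [low, high]")
    lower = min(claim_a, claim_b)
    if claim_a == claim_b:
        return lower, lower
    if claim_a < claim_b:
        return lower + bonus, lower - bonus
    return lower - bonus, lower + bonus

cooperative = td_payoffs(100, 100)   # socially optimal pair of claims
undercut = td_payoffs(99, 100)       # A gains by undercutting B by one
equilibrium = td_payoffs(2, 2)       # the unique Nash equilibrium
```

The three calls show points (i) and (ii) above in miniature: mutual claims of 100 maximize joint payoff, a one-unit undercut is individually profitable, and repeated undercutting unravels to the low-payoff equilibrium.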

Predrag T. TosicUniversity of [email protected]

Philip DaslerDepartment of Computer ScienceUniversity of [email protected]

CP17

The Effect of Preconditioning in Interpolatory Model Reduction

We expand upon our results in “Inexact Solves in Interpolatory Model Reduction” by Beattie, Gugercin, and Wyatt (2010), which reported benefits of using a Petrov-Galerkin framework for the linear systems in interpolatory model reduction. We discuss the role of preconditioning in the Petrov-Galerkin framework and the resulting backward error formulation from a systems theory perspective. We also investigate the backward error formulation for the complex case.

Sarah WyattVirginia [email protected]

Serkan GugercinVirginia Tech.Department of Mathematics

[email protected]


CP18

Explicit Stable Schemes for Advection Equations

Explicit time differencing of advection terms based on an augmented spatial stencil is considered both for cases of pure advection and for more complex systems that include advection as a part of general phenomena. It is shown that the proposed approximation possesses extended stability and simplifies the numerical algorithm without loss of accuracy of the numerical solution. Numerical experiments show that the proposed scheme halves the required computational time in comparison with the standard leap-frog approximation.

Andrei Bourchtein, Ludmila BourchteinPelotas State UniversityMathematics [email protected], [email protected]

CP18

P-Adaptive Hermite Methods for Initial Value Problems

A straightforward order-adaptive (p-adaptive) implementation of Hermite methods for hyperbolic and singularly perturbed parabolic initial value problems is presented, exploiting the fact that Hermite methods allow the degree of the local polynomial representation to vary arbitrarily from cell to cell. Examples illustrating its application to linear and nonlinear wave propagation are included. Stability and convergence analysis of the p-adaptive Hermite methods will be addressed.

XI ChenUniversity of New MexicoDepartment of Mathematics and [email protected]

Thomas M. HagstromSouthern Methodist UniversityDepartment of [email protected]

CP18

Spectral Methods for Systems of Coupled Elliptic Equations

In this presentation I will discuss how our newly developed spectral method solvers can be applied to highly nonlinear and high-order evolution equations, such as strongly anisotropic Cahn-Hilliard equations from materials science. Both theoretical and empirical points of view will be given on this topic.

Feng ChenPurdue University, West [email protected]

Jie ShenDepartment of MathematicsPurdue [email protected]

CP18

An Error-Controlled Fast Multipole Method

We present our two-stage error estimation scheme for the fast multipole method (FMM). This scheme can be applied to any particle system with open or periodic boundary conditions. The scaling of the scheme with respect to the number of particles N is O(N). We show results for homogeneous and clustered distributions consisting of up to one billion particles. The mathematical basics are described briefly.

Holger DachselJuelich Supercomputing CentreResearch Centre [email protected]

Ivo KabadshowResearch Centre [email protected]

CP18

A Positivity-Preserving High-Order Semi-Lagrangian Discontinuous Galerkin Scheme for the Vlasov-Poisson Equations

The Vlasov-Poisson equations describe the evolution of a collisionless plasma. The large velocities of the system create a severe time-step restriction, which is why the dominant approach in the plasma physics community is the particle-in-cell (PIC) method. We focus on an alternative approach which evolves a grid-based representation through Lagrangian dynamics, followed by a projection onto the original mesh. In particular, we develop a high-order discontinuous Galerkin (DG) method in phase space, and an operator split, semi-Lagrangian method in time. Our novel application of a 4th order split approach maintains mass conservation exactly and is positivity preserving.

David Seal, James A. RossmanithUniversity of WisconsinDepartment of [email protected], [email protected]

CP19

Chordal Graph Preconditioners for Solving Large Sparse Linear Systems

We describe a chordal subgraph based preconditioner that can be used to accelerate the convergence of solvers for sparse linear systems. The resulting solvers combine features of both direct and iterative solvers; they provide fast convergence to the solution without requiring any additional space for fill-in. We present a parallel algorithm for computing a maximal chordal subgraph and provide experimental results of linear system solutions using the adjacency matrix of the subgraph as a preconditioner.

Sanjukta BhowmickDepartment of Computer ScienceUniversity of Nebraska, [email protected]

Tzu-Yi ChenPomona [email protected]

Kanimathi DuraisamyUniversity of Nebraska at Omaha

[email protected]

Lisa PeairsPomona [email protected]

CP19

An Integer Programming Formulation for Vertex Elimination Problem

We propose a 0-1 Integer Program (IP) formulation for a variant of the optimal Jacobian accumulation problem that employs a vertex elimination strategy. The vertex elimination problem is first modeled as a graph problem and then cast as a 0-1 IP. The number of variables and constraints of this IP blows up as the number of nodes and edges in the graph increases, so solving such an IP even for a moderate-size graph is already a computational challenge. We investigate two different strategies to help solve such large IPs: first, we add valid constraints that help tighten the bounds, as well as symmetry-breaking constraints that lead to fewer nodes in branch-and-bound; second, we use properties derived from the graph to significantly reduce the size of the IP. Computational results and considerations will be discussed.

Jieqiu ChenArgonne National LaboratoryMCS [email protected]

Paul D. HovlandArgonne National LaboratoryMCS Division, 221/[email protected]

Todd MunsonArgonne National LaboratoryMathematics and Computer Science [email protected]

CP19

Modeling Large Nonlinear Anisotropies with JFNK Methods

Jacobian-Free Newton-Krylov (JFNK) methods have been shown to allow for the accurate computation of nonlinear terms and large time-steps for advancing magnetohydrodynamic (MHD) simulations. We present results on the efficacy of a JFNK method applied to MHD systems with large nonlinear anisotropy due to the dependence of the (thermal) conductivity tensor on the magnetic field and temperature. This massively parallel algorithm is implemented in NIMROD, an extended MHD code supported by DOE Fusion Energy Sciences.

Ben [email protected]

Scott Kruger, Travis M. AustinTech-X [email protected], [email protected]

CP19

Multilevel ILU Preconditioning for Indefinite Linear Systems

Many applications in science and engineering require the solution of ill-conditioned and indefinite linear systems, which can be challenging to solve by iterative methods. This talk will present a multilevel preconditioning approach that combines ideas from multigrid with strategies like modified ILU and diagonal perturbation techniques to efficiently handle indefinite linear systems. Numerical results will explore the effectiveness of the preconditioner and its cost on problems from electromagnetics and crystal simulation.

Daniel Osei-KuffuorUniversity of [email protected]

Yousef SaadDepartment of Computer ScienceUniversity of [email protected]

CP19

On the Memory and I/O Requirements of Multifrontal Sparse Factorization

The multifrontal method organizes the factorization of sparse matrices as a tree of tasks. The tasks should be executed in a topological order. Among all topological orders, which one minimizes the memory requirements? If the main memory is not large enough, what is the minimum size of the input-output operations? We investigate the known alternatives and propose a new solution for the first problem. We show that the second problem is NP-hard and propose heuristics.

Bora UcarLIP, ENS-Lyon, [email protected]

Mathias Jacquelin, Loris MarchalENS [email protected], [email protected]

Yves RobertENS LYon, France& Institut Universitaire de [email protected]

CP19

Convergence Analysis of SART by Bregman Iteration and Dual Gradient Descent

We provide two approaches to proving the convergence of the simultaneous algebraic reconstruction technique (SART): linearized Bregman iteration and dual gradient descent. These proofs can also be applied to other Landweber-like schemes such as Cimmino’s algorithm and component averaging (CAV). Furthermore, the noisy case is considered and an error estimate is given. Numerical experiments are provided to demonstrate the convergence results.
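For orientation, the SART update itself is short. The sketch below uses the standard form of the iteration for a matrix with nonnegative entries and positive row and column sums, x ← x + ω V⁻¹ Aᵀ W (b − A x), where W and V hold the reciprocal row sums and column sums; it does not reproduce the Bregman or dual gradient descent arguments. The function name `sart`, the relaxation parameter value, and the 2x2 test system are assumptions made for illustration.

```python
import numpy as np

def sart(A, b, omega=1.0, iters=200):
    """SART iteration for A x = b, assuming A has nonnegative entries
    and strictly positive row and column sums."""
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    row_sums = A.sum(axis=1)
    col_sums = A.sum(axis=0)
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        residual = (b - A @ x) / row_sums            # row-normalized residual
        x = x + omega * (A.T @ residual) / col_sums  # column-normalized update
    return x

# Small consistent system with a known solution.
A = np.array([[2.0, 1.0], [1.0, 3.0]])
x_true = np.array([1.0, 2.0])
x_sart = sart(A, A @ x_true)
```

Viewed as a Landweber-like scheme, this is a gradient step on a weighted least-squares functional with a diagonal preconditioner, which is the structure the convergence proofs above exploit.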

Ming YanUniversity of California, Los AngelesDepartment of [email protected]

CP20

LU Factorization of Finite Difference Matrices in O(N) Operations

The talk describes recently developed techniques for computing a highly accurate LU-factorization of the stiffness matrix arising from the finite difference discretization of an elliptic PDE. While inspired by nested dissection methods, the new techniques have linear complexity. The new approach is also more stable, more versatile, and far faster for problems involving multiple right-hand sides than established O(N) techniques based on iterative methods (multigrid, pre-conditioned GMRES, etc.).

Adrianna GillmanUniversity of Colorado at BoulderDepartment of Applied [email protected]

Gunnar MartinssonUniv. of Colorado at [email protected]

CP20

Progress on the Development of an Elliptic Operator Matrix Assembly Extension to the Chombo AMR Library

Block-structured Adaptive Mesh Refinement (AMR) offers the prospect of solving elliptic equations with near linear complexity by locally refining the mesh where needed, applying Gauss-Seidel smoothing on the error recursively from the finest to the coarsest mesh, and prolonging the error from coarse to fine grids. Although this matrix-free procedure is known to be extremely efficient for Laplace type problems, scaling to tens of thousands of processes or more, the classical multigrid algorithm can encounter difficulties for highly anisotropic elliptic operators and/or when the conductivity tensor is misaligned with the coordinate system. To enable difficult physics problems to leverage multigrid, we are developing an extension to the Chombo AMR library that will build a multi-level representation of the sparse matrix. The sparse linear system can then be solved using algebraic multigrid, among other methods provided by the PETSc library. In addition to extending the multigrid applicability domain, assembling the matrix will allow users to study the discretized properties of the system (determinant, eigenvalues, etc.), which are not readily accessible when using a matrix-free approach. Results for highly anisotropic heat equations in two and three dimensions are presented.

Alexander PletzerTech-X [email protected]


Travis M. AustinTech-X [email protected]

Yongjun [email protected]

Ravi Samtaney

[email protected]

Phillip ColellaLawrence Berkeley National [email protected]

Brian Van StraalenLawrence Berkeley National LaboratoryDepartment N E R S [email protected]

Lois Curfman McInnesArgonne National [email protected]

Louis HowellLawrence Livermore National [email protected]

CP20

A Schur Complement Approach for Solving a Nearly Hermitian System

We discuss an approach for solving an n × n linear system with a rank-s skew-Hermitian part (s ≪ n). Such systems arise in discretizations of certain integral equations, e.g., in wave scattering applications. Our approach is based on the observation that such a linear system can be interpreted as the Schur complement of a larger (n + s) × (n + s) system. We can thus solve the original system by solving s + 1 Hermitian systems of order n, e.g., using MINRES, which leads to significant storage savings. We then directly solve one s × s system. We present numerical results demonstrating the Schur algorithm’s competitiveness with both a standard Krylov method (GMRES) and a short-term recurrence method (IDR). This method can also serve as a preconditioner for systems whose skew-Hermitian part is well-approximated by a rank-s matrix.
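The count of s + 1 Hermitian solves plus one s × s solve can be reproduced with a small real-valued sketch. It assumes the system is written as A = H + U Vᵀ with H symmetric and U Vᵀ skew-symmetric of rank s, so that A is the Schur complement of the −I block in [[H, U], [Vᵀ, −I]]; eliminating that block gives the Sherman-Morrison-Woodbury formula used below. The authors' exact formulation may differ, dense `np.linalg.solve` calls stand in for MINRES, and all matrices and names are hypothetical test data.

```python
import numpy as np

rng = np.random.default_rng(1)
n, s = 50, 2

G = rng.standard_normal((n, n))
H = G + G.T + 2 * n * np.eye(n)            # symmetric part (shifted to be nonsingular)
B = rng.standard_normal((n, s))
C = np.array([[0.0, 1.0], [-1.0, 0.0]])    # skew-symmetric s x s core
U, V = B @ C, B                            # skew part K = U V^T has rank s
A = H + U @ V.T
b = rng.standard_normal(n)

# s + 1 solves with the symmetric matrix H (dense stand-ins for MINRES).
y = np.linalg.solve(H, b)
Z = np.linalg.solve(H, U)

# One small s x s system, then assemble the solution of A x = b.
small = np.eye(s) + V.T @ Z
x = y - Z @ np.linalg.solve(small, V.T @ y)
```

Only H (or a routine that applies it) and the two thin n × s factors need to be stored, which is where the storage savings over GMRES on A would come from.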

Mark EmbreeDepartment of Computational and Applied MathematicsRice [email protected]

Josef SifuentesRice [email protected]

Kirk M. Soodhalter, Daniel B. SzyldTemple UniversityDepartment of [email protected], [email protected]

CP20

Solving Symmetric Indefinite Systems with Symmetric Positive Definite Preconditioners

We consider an iterative solution of a symmetric indefinite linear system with a symmetric positive definite preconditioner. We describe a novel preconditioning strategy, which is based on the idea of approximating the inverse of the absolute value of the coefficient matrix (absolute value preconditioners). A simple example of the (geometric) multigrid absolute value preconditioner is constructed for a model problem of the discrete real Helmholtz equation with a relatively low wavenumber.

Eugene VecharynskiUniversity of Colorado [email protected]

Andrew [email protected]

CP20

A Stable Runge-Free, Gibbs-Free Rational Interpolation Scheme

We construct a rational interpolation scheme by blending polynomial interpolants on subsets of the interpolation points. For a large class of grids, including uniform points, we prove that our interpolant converges uniformly to analytic functions exponentially fast and does not exhibit the Runge phenomenon. We show that the exponential sensitivity proven for this kind of scheme by Platte and Trefethen does not lead to numerical instability for this method. We also demonstrate that the interpolant does not exhibit the Gibbs phenomenon for discontinuous, piecewise analytic functions.


CP20

Sensitivity Analysis of Limit Cycle Oscillations

Many unsteady problems equilibrate to periodic or quasi-periodic behavior. For these problems, the sensitivities of periodic outputs to system parameters are often desired and must be estimated from a finite time calculation. We show that sensitivities computed over a finite time can take excessive time to converge, or fail altogether to reach an equilibrium value. Further, we demonstrate that output windowing enables the accurate computation of periodic output sensitivities.
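A toy example makes the effect of windowing visible. The output sin²(ωt) has exact long-time mean 1/2 for every ω, so the exact sensitivity of the mean to ω is zero. The sketch below differentiates a finite-time average by central differences, once unweighted and once with the window 1 − cos(2πt/T); the model output, the window, the averaging time, and all function names are assumptions chosen for illustration and are not the authors' test case.

```python
import numpy as np

def averaged_output(omega, T, windowed, n=200001):
    """Finite-time average of sin^2(omega t) over [0, T], optionally windowed."""
    t = np.linspace(0.0, T, n)
    f = np.sin(omega * t) ** 2                  # periodic output, exact mean 1/2
    if windowed:
        w = 1.0 - np.cos(2.0 * np.pi * t / T)   # smooth window with unit mean
    else:
        w = np.ones_like(t)
    y = w * f
    dt = t[1] - t[0]
    return dt * (y.sum() - 0.5 * (y[0] + y[-1])) / T   # trapezoid rule

def sensitivity(omega, T, windowed, h=1e-6):
    """Central-difference derivative of the averaged output w.r.t. omega."""
    return (averaged_output(omega + h, T, windowed)
            - averaged_output(omega - h, T, windowed)) / (2.0 * h)

plain = sensitivity(1.0, 50.0, windowed=False)
smooth = sensitivity(1.0, 50.0, windowed=True)
```

In this toy problem the unweighted sensitivity oscillates with T without decaying, because the endpoint term of the average is differentiated, while the window suppresses the endpoint contributions and the windowed sensitivity approaches zero as T grows.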

Joshua A. Krakos, Qiqi WangMassachusetts Institute of [email protected], [email protected]

David l. DarmofalDepartment of Aeronautics & AstronauticsMassachusetts Institute of [email protected]

Steven HallMassachusetts Institute of [email protected]
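
A toy calculation can make the windowing claim in the abstract above concrete. The sketch below uses an assumed scalar output g(t) = cos^2(omega t), whose long-time mean is 0.5 for every omega, so the exact sensitivity of the mean to omega is zero. The Hann-type window, the averaging time `T`, and all function names are illustrative choices, not taken from the talk.

```python
import numpy as np

def windowed_average(omega, T, window, n=200001):
    """Approximate (1/T) * integral_0^T window(t/T) * cos(omega t)^2 dt (trapezoid rule)."""
    t = np.linspace(0.0, T, n)
    g = window(t / T) * np.cos(omega * t) ** 2
    dt = t[1] - t[0]
    return dt * (g.sum() - 0.5 * (g[0] + g[-1])) / T

def flat(s):
    """No windowing: plain finite-time average."""
    return np.ones_like(s)

def hann(s):
    """Smooth window on [0, 1] with unit mean that vanishes at both ends."""
    return 1.0 - np.cos(2.0 * np.pi * s)

def sensitivity(omega, T, window, h=1e-5):
    """Central finite difference of the windowed average with respect to omega."""
    up = windowed_average(omega + h, T, window)
    down = windowed_average(omega - h, T, window)
    return (up - down) / (2.0 * h)

omega, T = 1.0, 20.0
s_flat = sensitivity(omega, T, flat)   # exact long-time sensitivity is 0
s_hann = sensitivity(omega, T, hann)
print("unwindowed sensitivity:", s_flat)
print("windowed sensitivity:  ", s_hann)
```

For the plain average the spurious sensitivity contains a term cos(2 omega T) / (2 omega) that does not decay as T grows, while the windowed estimate shrinks with T in this example.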

MS1

Molecular Dynamics Simulations of Biomolecules on GPUs using the Multilevel Summation Method

Molecular dynamics simulations of biomolecules require at each time step the costly calculation of electrostatic forces between all pairs of N atoms. Of the fast methods for solving the N-body problem, the multilevel summation method (MSM) offers particular advantages to molecular dynamics. MSM also turns out to be well-suited to GPU computation, with a short-range part calculated entirely by square root, multiply, and add instructions and a long-range part arranged as localized grid calculations.

David J. HardyTheoretical Biophysics Group, Beckman InstituteUniversity of Illinois at [email protected]
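
The short-range/long-range split mentioned in the abstract above can be sketched as follows. The softening used here is the second-order Taylor expansion of s^(-1/2) about s = 1 with s = (r/a)^2, a common choice for this kind of splitting; treating it as the talk's exact kernel, and the cutoff value `a`, are assumptions of this example.

```python
import numpy as np

def gamma(rho):
    """Softened 1/rho: polynomial in rho^2 for rho < 1, exactly 1/rho otherwise."""
    rho = np.asarray(rho, dtype=float)
    s = rho * rho
    inside = 15.0 / 8.0 - 5.0 / 4.0 * s + 3.0 / 8.0 * s * s
    outside = 1.0 / np.maximum(rho, 1.0)   # maximum() avoids dividing by zero
    return np.where(rho < 1.0, inside, outside)

def split_kernel(r, a):
    """Split 1/r into a short-range part (zero for r >= a) and a smooth long-range part."""
    long_range = gamma(r / a) / a
    short_range = 1.0 / r - long_range
    return short_range, long_range

a = 2.0                                    # cutoff distance (illustrative)
r = np.linspace(0.1, 5.0, 50)
short, long_ = split_kernel(r, a)
print("largest short-range value beyond the cutoff:", np.abs(short[r >= a]).max())
```

Because the softened part is a polynomial in r^2 inside the cutoff, the short-range term needs only a reciprocal square root plus multiplies and adds, and the smooth long-range term is what gets interpolated from grids.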

MS1

High-Order Discontinuous Galerkin Methods by GPU Metaprogramming

I will describe techniques and tools to tap the enormous performance potential of GPUs for discontinuous Galerkin finite element solvers. Particular emphasis will be on the advantages that high-order discretizations offer on modern SIMD-like architectures. I will explain design considerations and tricks that enabled sustained single-chip performance of above 200 GFlop/s across a wide range of discretization parameters and equation types. I will also briefly touch upon methods and tools for run-time code generation and empirical optimization from a high-level language, which were crucial to the present effort’s success.

Andreas KloecknerCourant Institute of the Mathematical SciencesNew York [email protected]

Timothy WarburtonDepartment of Computational and Applied MathematicsRice [email protected]

Jan S. HesthavenBrown UniversityDivision of Applied [email protected]

MS1

A Case Study of GPUs in Scientific Computing: Low-Order FEM

We present a model for scientific applications based upon a Python framework, in which linear algebra and solver work is handled by libraries, and integration, or physics, is handled by massively parallel accelerators, in this case a GPU, through the PyCUDA framework. We illustrate this paradigm by discussing the step-by-step development of a high performance, portable engine for low-order FEM integration.

Matthew G. KnepleyUniversity of [email protected]
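
To give a sense of what batched low-order FEM integration looks like, the sketch below computes P1 (linear triangle) stiffness matrices for all cells at once with NumPy array operations, which plays the role of the data-parallel kernel. It runs on the CPU and does not use PyCUDA; the function name, array layout, and two-cell mesh are assumptions for illustration only.

```python
import numpy as np

# Gradients of the three P1 basis functions on the reference triangle.
GRAD_REF = np.array([[-1.0, -1.0], [1.0, 0.0], [0.0, 1.0]])

def p1_stiffness_batch(vertices, cells):
    """Element stiffness matrices for the Laplacian on all triangles at once.

    vertices: (V, 2) coordinates, cells: (E, 3) vertex indices. Returns (E, 3, 3).
    """
    p = vertices[cells]                                    # (E, 3, 2)
    # Jacobian of the affine map; its columns are the two edge vectors.
    J = np.stack([p[:, 1] - p[:, 0], p[:, 2] - p[:, 0]], axis=-1)
    area = 0.5 * np.abs(np.linalg.det(J))                  # (E,)
    invJ = np.linalg.inv(J)                                # batched inverse
    # Physical gradients as row vectors: grad_x = grad_ref @ J^{-1}.
    G = np.einsum("ij,ejk->eik", GRAD_REF, invJ)           # (E, 3, 2)
    return area[:, None, None] * np.einsum("eik,ejk->eij", G, G)

vertices = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
cells = np.array([[0, 1, 2], [1, 3, 2]])
K = p1_stiffness_batch(vertices, cells)
print(K[0])
```

Each cell is independent, so the same arithmetic maps naturally onto one GPU thread (or thread group) per element, while assembly and the linear solve stay in library code.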

MS1

Large Scale Multi-GPU FMM for Bioelectrostatics

Abstract not available at time of publication.

Rio YokotaUniversity of [email protected]

MS2

On the Construction of Preconditioners for Systems of PDEs

The purpose of this talk is to discuss a general approach to the construction of preconditioners for the linear systems of algebraic equations arising from discretizations of systems of partial differential equations. We construct the preconditioners based on the mapping properties of the differential operator in Sobolev spaces. In particular, the exact preconditioners can be seen as the Riesz isomorphisms between properly chosen Sobolev spaces, while computationally efficient preconditioners are constructed as spectrally equivalent isomorphisms by using multigrid or domain decomposition methods. Finally, we discuss how to employ this technique in FEniCS.

Kent-Andre MardalSimula Research Laboratory and Department ofInformaticsUniversity of [email protected]
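
The notion of a spectrally equivalent preconditioner can be checked numerically on a small scalar example. The sketch below, which uses 1D finite differences and plain NumPy rather than FEniCS, preconditions a variable-coefficient diffusion operator with the constant-coefficient one (a discrete stand-in for the Riesz map) and computes the spectrum of the preconditioned operator. The coefficient function and mesh sizes are assumptions chosen for illustration.

```python
import numpy as np

def stiffness(k):
    """D^T diag(k) D for 1D diffusion with Dirichlet ends; k has one value per cell."""
    m = len(k)                 # number of cells
    n = m - 1                  # number of interior nodes
    h = 1.0 / m
    D = (np.eye(m, n) - np.eye(m, n, k=-1)) / h   # cell-wise difference operator
    return D.T @ (k[:, None] * D)

def preconditioned_spectrum(n, kfun):
    """Eigenvalues of B^{-1} A, computed from the symmetric form L^{-1} A L^{-T}."""
    m = n + 1
    xc = (np.arange(m) + 0.5) / m                  # cell midpoints
    A = stiffness(kfun(xc))                        # variable-coefficient operator
    B = stiffness(np.ones(m))                      # constant-coefficient preconditioner
    Lc = np.linalg.cholesky(B)
    M = np.linalg.solve(Lc, np.linalg.solve(Lc, A).T)
    return np.linalg.eigvalsh(0.5 * (M + M.T))

def kfun(x):
    """Diffusion coefficient with values between 1 and 10 (illustrative)."""
    return 1.0 + 9.0 * np.sin(np.pi * x) ** 2

for n in (20, 80):
    ev = preconditioned_spectrum(n, kfun)
    print(n, "interior nodes: spectrum in [%.3f, %.3f]" % (ev.min(), ev.max()))
```

The eigenvalues stay between the bounds of the coefficient regardless of the mesh size, whereas the condition number of the unpreconditioned matrix grows like the inverse square of the mesh width.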

MS2

An Efficient Implementation of Nitsche’s Method on Overlapping Meshes in 3D

Frequent remeshing is required when large deformations occur in ALE-based solution algorithms for fluid-structure interaction problems. Therefore, fixed fluid mesh methods provide a promising approach when dealing with large deformations. Nitsche’s method can be used to derive a systematic finite element formulation for problems with overlapping meshes that is stable and has optimal order. Furthermore, it avoids the introduction of Lagrange multipliers used in other domain decomposition approaches to enforce the interface conditions. Despite its promising properties, Nitsche’s method on overlapping meshes has so far only been implemented in 2D with several restrictions on how meshes overlap. This is mainly because of the complexity of arbitrary mesh intersections in 3D. In this contribution, we discuss in detail the general aspects of an efficient 3D implementation of Nitsche’s method for overlapping meshes, ranging from computing mesh intersections to integration on intersected elements. Finally, a realization within the FEniCS framework is discussed and we present numerical examples that include the Poisson equation and linear elasticity.

Andre Massing, Anders LoggSimula Research [email protected], [email protected]

Mats G. LarsonDepartment of MathematicsUmea [email protected]

MS2

Automated Goal-Oriented Error Control

In many areas of computer simulation, the assessment of the quality of a computed numerical approximation can be a vital task. It can, however, also be a challenging task: mathematically, computationally, and, from the programmer’s point of view, implementationally. In this talk, I will present an automated framework for goal-oriented adaptivity and error control for finite element methods. Since the framework is automated, a user can take advantage of state-of-the-art adaptivity techniques with minimal implementational effort. The framework is implemented within the FEniCS project, a project for the development of concepts and software for automated solution of differential equations.

Marie E. Rognes, Anders LoggSimula Research [email protected], [email protected]
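
The core identity behind goal-oriented error control can be shown with plain linear algebra. In the sketch below, the error in a linear goal functional J(u) = c^T u equals the residual of the approximate solution weighted by the solution z of the adjoint (dual) problem A^T z = c. The matrix, the goal vector, and the crude approximate solve are all assumptions for illustration; this is an algebraic analogue, not the FEniCS framework from the abstract.

```python
import numpy as np

n = 30
h = 1.0 / (n + 1)
# Nonsymmetric model operator: 1D diffusion plus a mild convection term.
L = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2
C = (np.eye(n, k=1) - np.eye(n, k=-1)) / (2.0 * h)
A = L + 5.0 * C
b = np.ones(n)
c = h * np.ones(n)                    # goal: approximate integral of u

# A deliberately inaccurate "discrete solution": a few Jacobi sweeps.
u_h = np.zeros(n)
for _ in range(10):
    u_h += (b - A @ u_h) / np.diag(A)

u = np.linalg.solve(A, b)             # reference solution (for checking only)
z = np.linalg.solve(A.T, c)           # dual solution
r = b - A @ u_h                       # residual of the approximate solution

true_error = c @ (u - u_h)
estimate = z @ r                      # dual-weighted residual
indicators = z * r                    # per-unknown contributions
print("true goal error:", true_error, " estimate:", estimate)
```

The entries of `indicators` sum to the estimate, which is why quantities of this kind are used to decide where refinement would reduce the goal error most.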

MS2

Automated Solution of Optimal Control Problems for Coupled PDEs

We will present various approaches to solving optimal control problems with multiple coupled equations. Different solution methods can be implemented and examined rapidly using automated code generation, and in particular using automatic differentiation. For ‘one-shot methods’, when changing from one equation to another it is possible to simply change the functional, in a high-level domain-specific language, whose stationary points correspond to the solution of the control problem. The necessary derivatives will be computed automatically, and the remainder of the program will be generated and compiled automatically. Other methods for finding stationary points of a functional can be constructed easily using a collection of basic building blocks that mirror mathematical operations. A number of examples using different algorithms will be presented for Stokes flow and flow through porous media.

Andi MerxhaniDepartment of Engineering, University of [email protected]

Garth WellsUniversity of [email protected]

MS3

Algorithms for Linear Elasticity on Overlapping Grids

In this talk we describe, evaluate, and compare two numerical approaches for solving the equations of linear elasticity on composite overlapping grids. In the first approach we solve the elastodynamic equations posed as a second-order system using a conservative finite difference approximation. In the second approach we solve the equations written as a first-order system using a high-order characteristic-based (Godunov) finite-volume method.

Jeffrey W. BanksLawrence Livermore National [email protected]

William D. HenshawCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

Donald W. SchwendemanRensselaer Polytechnic InstituteDepartment of Mathematical [email protected]

Daniel AppeloDivision of Engineering and Applied ScienceCalifornia Institute of [email protected]

MS3

Hybrid and Dynamically Adaptive Higher-Order Shock-Capturing Methods for Compressible Gas Dynamics

We present an advanced high-resolution approach for simulating turbulence with shock interaction in multi-component gases that aims at approximating interfacial mixing with centered schemes and capturing shock waves with higher-order upwind methods. Numerical stability is achieved by the utilization of deep hierarchies of block-structured adaptive mesh refinement combined with a reliable scheme switching criterion. The design of the software will be discussed and large-scale parallel direct numerical and large eddy simulation examples will be presented.

Ralf DeiterdingOak Ridge National LaboratoryComputer Science and Mathematics [email protected]

Jack L. Ziegler, Manuel Lombardini, Dale I. PullinGraduate Aeronautical LaboratoriesCalifornia Institute of [email protected], [email protected],[email protected]

MS3

Deforming Composite Grids for Fluid Structure Interactions

We describe the use of deforming composite overlapping grids for the solution of problems coupling fluid flow and deforming solids. The method is based on a mixed Eulerian-Lagrangian technique. Local moving boundary-fitted grids are used near the deforming interface and these overlap non-moving grids which cover the majority of the domain. The approach is described and validated for some fluid-structure problems involving high speed compressible flow and linear elastic solids.

William D. HenshawCenter for Applied Scientific ComputingLawrence Livermore National [email protected]


Donald W. SchwendemanRensselaer Polytechnic InstituteDepartment of Mathematical [email protected]

MS3

Stable Grid Refinement for Seismic Wave Simulation

We will present an energy conserving extension of our finite difference method for the elastic wave equation to a composite grid, consisting of a set of structured rectangular component grids with hanging nodes on the grid refinement interface. We prove that the resulting, non-dissipative, method with variable coefficients is stable.

Bjorn SjogreenCenter for Applied Scientific ComputingLawrence Livermore National Laboratory
[email protected]

Anders PeterssonCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

MS4

Full Waveform Inversion using Diffraction Waves

Diffracted and reflected waves are fundamentally different physical phenomena. Most of seismic imaging, as practiced in the industry, is tuned to processing reflected waves, which carry most of the information about the subsurface. However, diffracted waves are also important, both because they are a direct response of small geologically important subsurface features and because they behave differently from reflections in the process of imaging. We propose to separate diffractions from reflections in the recorded reflection seismic data and to use focusing of diffractions in the imaging process as an objective function for full waveform inversion for seismic velocity parameters. Preliminary tests indicate that the focusing objective function may allow for an efficient gradient-based optimization.

Sergey FomelUniversity of Texas at [email protected]

MS4

Full-waveform Inversion with Compressive Updates

Full-waveform inversion relies on large multi-experiment data volumes. While improvements in acquisition and inversion have been extremely successful, the current push for higher quality models reveals fundamental shortcomings handling increasing problem sizes numerically. To address this fundamental issue, we propose a randomized dimensionality-reduction strategy motivated by recent developments in stochastic optimization and compressive sensing. In this formulation conventional Gauss-Newton iterations are replaced by dimensionality-reduced sparse recovery problems with source encodings.

Felix J. HerrmannThe University of British ColumbiaDepartment of Earth and Ocean [email protected]

Aleksandr Aravkin, Tristan van Leeuwen, Xiang LiThe University of British [email protected], [email protected], [email protected]
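
One ingredient of the dimensionality reduction described in the abstract above, randomized source encoding, can be illustrated in a few lines. In the sketch below the columns of a residual matrix `R` stand for the data misfit of individual sources; combining sources with random +1/-1 weights gives an unbiased estimate of the full misfit from far fewer "experiments" per evaluation. The random residuals, the matrix sizes, and the function name are assumptions; the sparse recovery step of the talk is not reproduced.

```python
import numpy as np

def encoded_misfit(R, n_encodings, rng):
    """Estimate ||R||_F^2 from random +/-1 combinations of the source columns."""
    n_src = R.shape[1]
    W = rng.choice([-1.0, 1.0], size=(n_src, n_encodings))   # encoding vectors
    # Each column of R @ W is the residual of one simultaneous-source experiment.
    return np.sum((R @ W) ** 2) / n_encodings

rng = np.random.default_rng(0)
R = rng.standard_normal((50, 20))        # 50 receivers, 20 sources (synthetic)
full = np.sum(R ** 2)                    # misfit using every source separately
estimate = encoded_misfit(R, 2000, rng)  # many encodings, to show unbiasedness
print("full misfit:", full, " encoded estimate:", estimate)
```

In practice only a handful of encodings would be drawn per iteration; the large number used here just makes the agreement in expectation visible.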

MS4

A Discontinuous Galerkin Method for Coupled Elastic-acoustic Inverse Wave Propagation

Optimization-based methods for full-waveform inversion to recover locally varying seismic wave speeds will be presented. The forward equation is discretized by a higher-order discontinuous Galerkin (dG) method, which uses an upwind numerical flux. The computation of discretely consistent gradients leads to a dG discretization for the adjoint equation based on a downwind flux. To quantify the uncertainty in the reconstruction, an estimate of the variance at the maximum a posteriori point is computed.

Tan Bui-Thanh, Carsten BursteddeThe University of Texas at [email protected], [email protected]

Omar GhattasUniversity of Texas at [email protected]

James R. MartinUniversity of Texas at AustinInstitute for Computational Engineering and [email protected]

Georg Stadler, Lucas WilcoxUniversity of Texas at [email protected], [email protected]

MS4

From Simulation to Inversion: Algorithm and Software Organization for Imaging and Inversion

We describe a few mild design constraints which permit rapid adaptation of simulation code for linear wave problems to inversion or design optimization applications, retaining the parallel and other performance enhancements of the underlying simulator. We also describe a framework taking advantage of these concepts which we have used to build a variety of inversion applications. Wave inverse problems tend to be afflicted by a variety of features, including extreme ill-conditioning and nonlinearity, which degrade the performance of straightforward optimization methods. Variants of data-fitting inversion, motivated by standard methods in exploration geophysics, may relieve some of these difficulties. The framework approach also accommodates these extensions to standard inversion.

William Symes, Marco Enriquez, Dong Sun, RamiNammour, Xin WangRice [email protected], [email protected],[email protected], [email protected],[email protected]

Igor TerentyevDept. of Computational and Applied MathematicsRice [email protected]

MS5

Treed Gaussian Processes for Mixed-Integer Surrogate Modeling

The ability to understand and analyze computer simulation models can rely heavily on the ability to approximate the model with a good statistical surrogate. While traditional approaches have focused on the case of only continuous input variables, we present new developments with treed Gaussian processes making them applicable for both discrete and continuous inputs. Thus they are an ideal surrogate for mixed-integer problems, as they allow modeling of nonstationarity and quantification of uncertainty.

Yuning HeUC Santa [email protected]

Herbie LeeUniversity of California, Santa CruzDept. of Applied Math and [email protected]

MS5

Evaluation of Mixed Continuous-discrete Surrogate Approaches

Evaluating the performance of surrogate modeling approaches is essential for determining their viability in optimization or uncertainty analysis. To this end, we evaluated categorical regression, ACOSSO splines, and treed GPs on a set of test functions. We describe the principles and metrics we used for this evaluation, the characteristics of the test functions we considered, and our software testbed. Additionally, we present our numerical results and discuss our observations regarding the merits of each approach.

Herbie LeeUniversity of California, Santa CruzDept. of Applied Math and [email protected]

Laura SwilerSandia National LaboratoriesAlbuquerque, New Mexico [email protected]

Patricia D. HoughSandia National [email protected]

MS5

Design and Analysis of Computer Experiments with Qualitative and Quantitative Factors

We propose a simple yet efficient approach for building Gaussian process models for computer experiments with both qualitative and quantitative factors. This approach uses the hypersphere parameterization to model the correlations of the qualitative factors, thus avoiding the need to directly solve optimization problems with positive definite constraints. The effectiveness of this method is illustrated by several examples. New classes of space-filling designs for building mixed-integer surrogate models will also be discussed.

Peter Qian, Qiang Zhou, Shiyu ZhouUniversity of Wisconsin - [email protected], [email protected],[email protected]
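
The hypersphere parameterization mentioned in the abstract above can be sketched as follows: each row of a lower-triangular factor is a point on the unit sphere written in spherical coordinates, so the product of the factor with its transpose is a valid correlation matrix for any angles in (0, pi). The function name and the random angles are assumptions for this example, and the exact variant used by the authors may differ in detail.

```python
import numpy as np

def hypersphere_correlation(theta):
    """Build a correlation matrix from angles.

    theta is an (m, m) array; only the entries theta[r, s] with s < r are used,
    and each is assumed to lie in (0, pi).
    """
    m = theta.shape[0]
    L = np.zeros((m, m))
    L[0, 0] = 1.0
    for r in range(1, m):
        sin_prod = 1.0
        for s in range(r):
            L[r, s] = sin_prod * np.cos(theta[r, s])
            sin_prod *= np.sin(theta[r, s])
        L[r, r] = sin_prod          # remaining length so that the row has unit norm
    return L @ L.T

rng = np.random.default_rng(1)
theta = rng.uniform(0.1, np.pi - 0.1, size=(4, 4))   # angles for 4 qualitative levels
T = hypersphere_correlation(theta)
print(np.round(T, 3))
```

An optimizer can then vary the angles subject only to simple box bounds, instead of enforcing positive definiteness of the correlation matrix directly.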

MS5

Functional ANOVA Decomposition and Discrete Inputs in Computer Model Emulation


Curtis StorlieLos Alamos National [email protected]

Brian ReichNorth Carolina State Universitybrian [email protected]

MS6

Parameter Estimation for Photosynthesis

We have developed an extensive model of (leaf) photosynthesis based on ordinary differential equations. Using this model, we have shown how carbon fixation might be doubled (more biofuel), and we can analyze questions from evolutionary biology and the influence of climate change. An important problem is estimating the many parameters that govern the photosynthesis process, but that cannot be measured directly inside a cell, and that may vary considerably among different plants. This is a collaboration with Xinguang Zhu (Plant Systems Biology Group, Chinese Academy of Sciences, Shanghai), and Stephen Long (Plant Biology and Crop Science, University of Illinois at Urbana-Champaign).

Eric De Sturler, Eun ChangVirginia [email protected], [email protected]

MS6

Numerical Solution for Time Dependent Optimal Transport

In this talk we present a new computationally efficient numerical scheme for the computation of the optimal L2 mass transport mapping. Our starting point is the Benamou and Brenier approach to the problem which computes the complete transport path. We review the approach and discuss its numerical shortcomings. We then derive an efficient discretization and a solution technique for the problem. We demonstrate the effectiveness of our approach using a number of numerical experiments.

Eldad HaberDepartment of MathematicsThe University of British [email protected]

MS6

Rank-Deficient Nonlinear Least Squares Problems and Subset Selection

We examine the local convergence of the Levenberg-Marquardt method for the solution of nonlinear least squares problems that are rank deficient and have nonzero residual. We show that replacing the Jacobian by a truncated singular value decomposition can be numerically unstable. We recommend instead the use of subset selection. We corroborate our recommendations by perturbation analyses and numerical experiments.

Ilse IpsenNorth Carolina State UniversityDepartment of [email protected]

Carl T. KelleyNorth Carolina State UnivDepartment of Mathematicstim [email protected]

Scott PopeSAS [email protected]

MS6

Regularized Gauss-Newton for Parameter Estimation with Applications in Tomographic Imaging

We present a new algorithm for the solution of nonlinear least squares problems arising from low-order, parametric tomographic imaging models. The ill-conditioning of the Jacobian, together with the presence of noise in the data, motivates us to devise a regularized, trust-region-based Gauss-Newton approach for determining search directions. Examples show the success of our approach relative to well-known alternatives.

Misha E. KilmerTufts [email protected]

Eric De SturlerVirginia [email protected]

MS7

Optimization based Modeling. Part I. Additive Decomposition of Multiphysics Problems

We formulate an approach, based on additive operator decomposition and reconnection via optimization, which allows one to synthesize robust and efficient solvers for coupled multiphysics problems from simpler solvers for their constituent components. To illustrate the scope of the approach we show how a robust and efficient solver for nearly hyperbolic PDEs can be derived from standard, off-the-shelf algebraic multigrid solvers for the Poisson equation that do not work for the original equations.

Pavel BochevSandia National LaboratoriesComputational Math and [email protected]

Denis RidzalSandia National [email protected]

MS7

Optimization-based Modeling, Part II: Monotone, Bound Preserving Transport and Remap

Remap, broadly defined as the transfer of simulation data between different computational meshes subject to physical constraints, is a critical component in a number of numerical algorithms, e.g. ALE transport schemes. We present a new mathematical framework for remap, based on ideas from constrained optimization. Optimization-based remap (OBR) is formulated as the solution of an optimization problem, in which accuracy considerations, handled by an objective functional, are separated from monotonicity considerations, handled by a carefully defined set of inequality constraints. As such, OBR naturally applies to unstructured meshes and meshes comprised of arbitrary polyhedral cells. In addition, we demonstrate that OBR can generate significantly more accurate solutions at or below the cost of the best currently used remapping techniques.


Pavel Bochev
Sandia National LaboratoriesComputational Math and [email protected]

MS7

Advanced Variational Methods on Tetrahedral Meshes for Hyperbolic Systems: Stabilization, Compatibility, Accuracy

This talk describes a new variational multiscale formulation for Lagrangian-ALE shock hydrodynamics for tetrahedral finite elements. To the author’s knowledge this is the only robust and accurate tetrahedral formulation developed to date for shock hydrodynamics computations. The formulation preserves conservation of mass/momentum/total energy, which can be interpreted as compatibility relationships. A specific ALE remap algorithm based on flux-corrected transport (FCT) of systems of equations invokes another compatibility relationship, the Geometric Conservation Law, which is shown to be necessary in ensuring monotonicity of the remapped fields.

Guglielmo ScovazziSandia National [email protected]

MS7

Optimization-Based Synchronized Flux-Corrected Conservative Interpolation (remapping) of Mass and Momentum for Arbitrary Lagrangian-Eulerian Methods

A new optimization-based synchronized flux-corrected conservative interpolation (remapping) of mass and momentum for arbitrary Lagrangian-Eulerian hydro methods is described. Fluxes of conserved variables (mass and momentum) are limited in a synchronized way to preserve local bounds of primitive variables (density and velocity).

Mikhail ShashkovLos Alamos National [email protected]

Richard LiskaCzech Technical UniversityFac Nuclear Science & Phys [email protected]

Pavel VachalCzech Technical University in [email protected]

Burton WendroffLos Alamos National [email protected]

MS8

Finite Element Method (FEM) for the Modified Poisson Nernst Planck Equations (MPNPE)

The dynamics of ions in implicit solvent models are described by the coupled Poisson-Nernst-Planck equations (PNPE). The PNPE are an alternative to the computationally expensive fully explicit particle based methods. However, the PNPE may exhibit unbounded concentration of ions. The MPNPE remove this drawback by taking steric effects into account. This talk addresses FEM for the MPNPE for ion flow through a protein pore. We show that the MPNPE lead to more accurate calculation of ionic current than the classical PNPE.

Jehanzeb H. ChaudhryUniversity of [email protected]

MS8

Goal-Oriented Error Estimation for the Poisson-Boltzmann Equation

Studying solvation effects in biomolecules is important for an accurate representation of their physical environment. In this regard, a meaningful quantity is the solvation free energy, which can be written as a linear functional of the solution to the nonlinear PDE known as the Poisson-Boltzmann equation (PBE). In this talk we present an adaptive mesh refinement algorithm that drives refinement in an effort to control the error in the solvation free energy. Key to this algorithm is the development of goal-oriented error indicators. We show how these indicators are developed and give the steps needed for their computation and use in adaptive mesh refinement. An additional aspect of this refinement algorithm is the development of a split-domain marking strategy that proved critical to its success. To show the efficacy of the adaptive refinement algorithm, we present results based on computing the solvation free energy of the protein Fasciculin-1.

Eric C. CyrScalable Algorithms DepartmentSandia National [email protected]

Stephen D. BondSandia National [email protected]

Burak AksoyluTOBB University of Economics and Technology, [email protected]

Michael HolstUniversity of California, San Diego, CA, [email protected]

MS8

Recovery-type Error Estimators for the Adaptive Finite Element Approximation of the Poisson-Boltzmann Equation

In this talk, we discuss recovery-type a posteriori error estimators for use in the adaptive finite element method applied to the Poisson-Boltzmann equation. Specifically, we derive and study both gradient and flux recovery methods. These quantities are important in molecular dynamics simulations and the computation of reaction rates. Finally, we will present numerical results illustrating convergence of the adaptive method using each indicator.

Long ChenUniversity of California at [email protected]

Ryan Szypowski, Michael HolstUC San DiegoDepartment of [email protected], [email protected]

Yunrong ZhuDepartment of MathematicsUniversity of California at San [email protected]

MS8

Adaptive Multiscale Methods for the Poisson-Boltzmann and Nernst-Planck Equations

In this lecture, we give an overview of several related projects based at UCSD involving the design and analysis of high-resolution, high-fidelity mathematical and numerical modeling techniques for solvation and diffusion phenomena in biophysics. We first outline some new theoretical results on the robust numerical discretization of the Poisson-Boltzmann equation using adaptive finite element techniques. We then give an analysis of a coupled solvation-large-deformation nonlinear elasticity model, describe an iterative method for its simulation, and give some convergence results. We then consider reaction diffusion models such as the Poisson-Nernst-Planck (PNP) and Smoluchowski-Poisson-Boltzmann (SPB) models, and describe our recent work on the development of robust simulation tools using modern geometric modeling techniques and adaptive finite element methods.

Michael J. HolstUniv. of California, San DiegoDepartment of [email protected]

Yongcheng ZhouDepartment of MathematicsColorado State [email protected]

MS9

STK-Mesh Example Computation Setup and Parallel Execution

HPC computational kernels applied to parallel heterogeneous unstructured meshes require setup of and access to discretization knowledge, mesh data structures, and computational fields. An STK-Mesh example of problem setup and hybrid parallel execution of a non-trivial computational kernel is presented.

Todd Coffey, H. Carter EdwardsSandia National [email protected], [email protected]

MS9

STK-Mesh Domain-driven Design

An unstructured mesh providing hybrid parallel (distributed + threaded), heterogeneous discretization, and dynamic modifications for HPC scientific and engineering computations has significant inherent complexity. Managing this complexity through domain-driven design has been critical to the success of the STK-Mesh project. The STK-Mesh domain model (conceptual design) developed to manage this complexity is presented.

H. Carter EdwardsSandia National [email protected]

MS9

STK-Mesh Example Dynamic Mesh Modification

Evolving solutions to problems may require advanced modeling and simulation strategies to dynamically modify the discretization of the problem’s unstructured mesh. Dynamic load balancing, domain cracking or erosion, and solution feature resolution through mesh refinement exemplify advanced problems that require mesh modification. An STK-Mesh example of robust and efficient parallel dynamic mesh modification is presented.

Daniel Sunderland, H. Carter EdwardsSandia National [email protected], [email protected]

MS9

Sierra Toolkit Capabilities and Future

The STK-Mesh is the first component deployed within a larger SIERRA Toolkit project. This project will be releasing advanced HPC CS&E capabilities in the public domain through Sandia National Laboratory’s Trilinos (http://trilinos.sandia.gov) repository. This presentation summarizes the current set of STK-Mesh and other SIERRA Toolkit component capabilities and future plans for the SIERRA Toolkit.

Alan B. WilliamsSandia National LaboratoryDistributed Systems Research [email protected]

H. Carter EdwardsSandia National [email protected]

MS10

Latency? Not a Problem When You Use FG

If your computation works with large amounts of data, then it may very well suffer from high latency due to disk I/O and/or interprocessor communication. You can mitigate the effect of high latency by overlapping high-latency operations with other, useful work, but implementing programs that do so can be a painstaking, time-consuming, and error-prone venture. In this talk, I will describe the FG system, which makes it easy to overlap high-latency operations with other work.

Thomas CormenDepartment of Computer ScienceDartmouth [email protected]
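
The general pattern of overlapping high-latency operations with useful work can be sketched with a two-stage pipeline built from the Python standard library. This is a generic illustration only: it does not use or imitate the FG API, and the function name `run_pipeline`, the buffer depth, and the stand-in read and compute stages are assumptions.

```python
import queue
import threading
import time

_DONE = object()   # sentinel marking the end of the stream

def run_pipeline(blocks, read, compute, depth=2):
    """Overlap a high-latency read stage with a compute stage via a bounded buffer."""
    buf = queue.Queue(maxsize=depth)
    results = []

    def reader():
        for b in blocks:
            buf.put(read(b))       # blocks when the buffer is full
        buf.put(_DONE)

    t = threading.Thread(target=reader)
    t.start()
    while True:
        item = buf.get()           # waits until the reader has produced a block
        if item is _DONE:
            break
        results.append(compute(item))
    t.join()
    return results

def slow_read(b):
    """Stand-in for disk I/O or communication: sleep, then return some data."""
    time.sleep(0.01)
    return list(range(b))

out = run_pipeline([3, 5, 8], slow_read, sum)
print(out)
```

While the main thread computes on one block, the reader thread is already waiting on the next one, so the latency of the read stage is hidden behind the computation whenever the two take comparable time.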

MS10

A Parallel Fast Algorithm for Applying Fourier Integral Operators

Fourier Integral Operators (FIOs) encompass a wide range of transforms including several variants of the Fourier Transform and the Generalized Radon Transform. A new scheme for the distributed-memory parallelization of the butterfly algorithm for FIOs is presented and results are shown on thousands of cores. This is joint work with L. Demanet and N. Maxwell.

Jack PoulsonInstitute for Computational Engineering and Sciences
The University of Texas at [email protected]

Lexing YingUniversity of TexasDepartment of [email protected]

Nicholas MaxwellThe University of [email protected]

Laurent DemanetMathematics, [email protected]

MS10

Evaluation of OpenMP Task-Based Parallelism for the Adaptive Fast Multipole Algorithm

Parallel execution of the Adaptive Fast Multipole Algorithm on a multicore shared-memory computer presents complex load balancing and memory locality issues. We report on the use of OpenMP 3.0 task parallel directives to express the nested parallelism in the algorithm in a clear and concise fashion, and report on the execution times obtained using a novel OpenMP runtime system to measure and tune overall performance as well as parallel scaling.

Jan F. PrinsDept. of Computer ScienceUniversity of North Carolina at Chapel [email protected]

Robert Overman, Stephen OlivierUNC Chapel [email protected], [email protected]

Allan PorterfieldRENCIUNC Chapel [email protected]

Bo ZhangDuke [email protected]

Jingfang HuangUNC Chapel HillApplied [email protected]

Michael MinionUniversity of North Carolina-Chapel HillMathematics [email protected]

MS10

Parallelization of New-Version Adaptive Fast Multipole Method on Multicore Machines

We present a parallelization scheme for the adaptive new version fast multipole method on multicore machines. We organize the algorithm into upward, interaction, and downward stages, where all involved operators are implemented to have an intrinsic mutual exclusion property. A spatio-temporal partition strategy is developed. The scheme has been implemented for Laplace and Yukawa kernels and is shown to be highly efficient in several numerical experiments.

Bo ZhangDuke [email protected]

Jingfang HuangDepartment of MathematicsUniversity of North Carolina, Chapel [email protected]

Nikos PitsianisDepartment of Electrical & Computer EngineeringDuke [email protected]

Xiaobai SunDepartment of Computer ScienceDuke [email protected]

MS11

Next-Generation Sequencing


Srinivas AluruIowa State [email protected]

MS11

Sensor Placement


William E. HartSandia National Labs, [email protected]

MS11

Towards Hypothesis Testing for Communities in Networks

Community detection methods for social network analysis must generally account for uncertainty in relationship observation. Community detection algorithms approximately optimize an objective that leads to higher internal connectivity than external. Many believe that communities should not have statistically significant sub-communities and communities should not be readily explained as statistical fluctuations in larger communities. We describe hypothesis testing methods to help determine if a set of communities has been computed to the correct level of resolution.

Jonathan Berry, Cynthia PhillipsSandia National [email protected], [email protected]

MS12

Velvetrope: a Parallel, Bitwise Algorithm for Finding Homologous Regions within Multiple Sequences


Scott ClarkCornell [email protected]

MS12

Parallel, Precision-Limited PDE Evolution using Variable-Order Series Expansion


Hal FinkelYale [email protected]

MS12

Subgraph Isomorphism Algorithms for Multi-threaded Shared Memory Architectures


Claire RalphCornell [email protected]

MS12

Bringing OpenCL to Supercomputing: Finite Volume Solvers on GPU Clusters


Noah F. ReddellUniversity of [email protected]

MS13

Hierarchical Matrix Preconditioners for Saddle Point Problems

Hierarchical (H-) matrices provide a powerful technique to compute and store approximations to dense matrices in a data-sparse format. The basic idea is the approximation of matrix data in hierarchically structured subblocks by low rank representations. The usual matrix operations can be computed approximately with almost linear complexity. We use such an H-arithmetic to set up preconditioners for the iterative solution of sparse linear systems of saddle point type as they arise in the finite element discretization of systems of PDEs.

Sabine Le BorneTennessee Technological UniversityDepartment of [email protected]

MS13

Fast Direct Solvers for Elliptic PDEs

That the linear systems arising upon the discretization of elliptic PDEs can be solved very rapidly is well known, and many successful iterative solvers with linear complexity have been constructed (multigrid, Krylov methods, etc.). Interestingly, it has recently been demonstrated that it is possible to directly compute an approximate inverse to the coefficient matrix in linear (or close to linear) time. The talk will survey some recent work in the field and demonstrate the advantages of direct solvers in terms of stability, and, sometimes, speed.

Gunnar MartinssonUniv. of Colorado at [email protected]

MS13

Exploiting the Natural Block-Structure of Hierarchically-Decomposed Variational Discretizations of Elliptic Boundary Value Problems

Hierarchical bases for higher-order finite element discretizations provide a natural block structure with two key properties which suggest that a block Gauss-Seidel (or Jacobi) approach will be effective: only the linear-linear diagonal block is ill-conditioned, so it is the only part which requires sophisticated techniques; and the off-diagonal coupling is weak enough that (at least) a fixed error reduction is guaranteed in each iteration. The performance of the approach is demonstrated using H-matrix techniques.

Jeffrey S. OvallUniversity of Kentucky, [email protected]

MS14

Hybrid Hermite-Discontinuous-Galerkin Methods for Hyperbolic Systems

A class of hybrid methods for hyperbolic problems based on the combination of Hermite-Taylor-Runge-Kutta methods and nodal Discontinuous Galerkin methods is presented. Examples illustrating its application to the wave equation and Maxwell’s equations will be given.

Daniel AppeloDivision of Engineering and Applied ScienceCalifornia Institute of [email protected]

Ronald ChenDepartment of MathematicsThe University of New [email protected]


MS14

Implicit High-order Compact Schemes for Incompressible Flow

This talk describes an implicit, high-order accurate method for incompressible flow combining compact spatial discretizations with approximate factorization schemes on overlapping grids. The approach introduces techniques that minimize the number of factors used in the implicit scheme while maintaining up to fourth order spatial accuracy at physical boundaries. Implicit time discretization is achieved via a second order accurate approximately factored Crank-Nicolson method that incorporates the compact spatial approximations into the banded solves.

Kyle Chand
Lawrence Livermore National [email protected]

MS14

Accurate Methods for Time-Domain Scattering

Volume-based methods for solving scattering problems in the time domain require an accurate near-field radiation boundary condition. Complete radiation boundary conditions represent an essentially optimal solution to this problem for isotropic systems. They are local and inexpensive, they can be stably implemented on polygonal boundaries, and their accuracy is guaranteed by a priori error estimates. We will review their construction and analysis and describe our efforts to produce generally usable implementations.


MS14

Using Adaptive Overset Grid Methods for Shape Optimization of Medical Devices

Adaptive overset grid methods are well-suited for shape optimization problems. Rapid and local grid generation techniques facilitate the low-cost construction of computational grids required during the optimization procedure. This work couples the overset grid method with a derivative-free optimization algorithm to determine the optimal shape and configuration of blood clots that are trapped by medical devices. The methods are fully automated and demonstrate an adaptive design optimization framework that is broadly applicable.

Michael SingerLawrence Livermore National [email protected]

Stephen WangKaiser [email protected]

Darin DiachinKanoga [email protected]

MS15

On the Application of Parallel Scalable Solvers for the Integrated Hydrologic Model ParFlow

Integrated models with coupled nonlinear physics have particular solver requirements. Here, the ParFlow model, a parallel hydrology model with integrated land-surface processes, and its solver and model framework will be discussed. The different solver requirements at different scales of application, the disparate timescales of physical processes, and the different degrees of model coupling between components will be highlighted. Finally, physical applications and parallel scaling will be addressed.

Reed M. MaxwellDepartment of Geology and Geologic EngineeringColorado School of [email protected]

MS15

A Subdomain-based Parallel PCG Solver for the Cell Centered Finite Difference Groundwater Flow Equations using Incomplete Cholesky Preconditioning

An algorithm based on non-overlapping subdomains for solving sparse symmetric matrix systems that might result from the cell-centered finite difference (CCFD) algorithm has been developed for a multi-processor environment. This scheme is based on the preconditioned conjugate gradient (PCG) algorithm using incomplete Cholesky preconditioning with zero fill (IC(0)); parallelization is based on non-overlapping subdomains, whereby application over each subdomain corresponds to a computational process. As the IC(0) preconditioner does not lend itself easily to parallelization, certain compromises in parallelization must be accommodated. The parallelization itself is instituted by means of standard Message-Passing Interface (MPI) functions. As a result, this parallel solver can be run on any Beowulf cluster or any multi-processor machine on which an implementation of MPI has been installed. The parallel PCG scheme with IC(0) preconditioning is compared with similar parallel PCG schemes where preconditioning is instituted by either block Jacobi or block Gauss-Seidel methods.

Richard L. NaffU S Geological [email protected]
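As a serial illustration of the PCG iteration with IC(0) preconditioning named in the abstract above, the following is a minimal sketch only. It makes several assumptions that are not from the talk: a small dense 5-point Laplacian stands in for a CCFD matrix, there is no MPI or subdomain decomposition, and all function names (`laplacian_2d`, `ic0`, `solve_llt`, `pcg`) are hypothetical.

```python
import math

def laplacian_2d(m):
    """Dense 5-point Laplacian on an m-by-m cell grid (assumed stand-in for a CCFD matrix)."""
    n = m * m
    A = [[0.0] * n for _ in range(n)]
    for i in range(m):
        for j in range(m):
            k = i * m + j
            A[k][k] = 4.0
            if i > 0:
                A[k][k - m] = -1.0
            if i < m - 1:
                A[k][k + m] = -1.0
            if j > 0:
                A[k][k - 1] = -1.0
            if j < m - 1:
                A[k][k + 1] = -1.0
    return A

def ic0(A):
    """Incomplete Cholesky with zero fill: L is nonzero only where tril(A) is nonzero."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            if A[i][j] == 0.0:
                continue  # zero fill: never create an entry outside A's pattern
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            L[i][j] = math.sqrt(s) if i == j else s / L[j][j]
    return L

def solve_llt(L, r):
    """Apply the preconditioner: solve L L^T z = r by forward and back substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (r[i] - sum(L[i][k] * y[k] for k in range(i))) / L[i][i]
    z = [0.0] * n
    for i in reversed(range(n)):
        z[i] = (y[i] - sum(L[k][i] * z[k] for k in range(i + 1, n))) / L[i][i]
    return z

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def matvec(A, v):
    return [dot(row, v) for row in A]

def pcg(A, b, L, tol=1e-10, max_iter=200):
    """Preconditioned conjugate gradients; returns (solution, iterations used)."""
    x = [0.0] * len(b)
    r = b[:]
    z = solve_llt(L, r)
    p = z[:]
    rz = dot(r, z)
    b_norm = math.sqrt(dot(b, b))
    for it in range(1, max_iter + 1):
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        if math.sqrt(dot(r, r)) <= tol * b_norm:
            return x, it
        z = solve_llt(L, r)
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter

A = laplacian_2d(4)
b = [1.0] * len(A)
x, iters = pcg(A, b, ic0(A))
residual = math.sqrt(sum((bi - ai) ** 2 for bi, ai in zip(b, matvec(A, x))))
print(iters, residual)
```

The two triangular solves in `solve_llt` are inherently sequential, which is one way to see why IC(0) does not parallelize easily and why subdomain-local variants involve compromises.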

MS15

Newton-GMRES Solvers for the Integrated Water Flow Model (IWFM) with Adaptive Stopping Criteria

We will discuss the use of an inexact Newton method in conjunction with GMRES to solve systems of nonlinear equations in IWFM, a water resources management and planning model developed by the California State Department of Water Resources. A strategy to adaptively control GMRES accuracy based on information from the inexact Newton iterates is proposed. Numerical results show that the new strategy could save a significant number of GMRES iterations. The efficiency of several preconditioners will also be presented.

Hieu NguyenUniversity of California, [email protected]

Zhaojun BaiUniversity of [email protected]

Matthew F. DixonDepartment of Mathematics, UC [email protected]

Charles Brush, Emin Dogrul, Tariq Kadir, Francis ChungCalifornia State Department of Water [email protected], [email protected],[email protected], [email protected]
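For orientation, the standard inexact Newton condition, in which the linear solver (here GMRES) is stopped once the linear residual falls below a fraction $\eta_k$ of the nonlinear residual, can be written as in the first line below. The second line is one classical adaptive choice of $\eta_k$ due to Eisenstat and Walker; it is given only as a familiar example and is not claimed to be the strategy proposed in the talk above. The symbols $F$, $x_k$, $s_k$, $\eta_k$, $\gamma$, and $\alpha$ are notation assumed here.

```latex
\| F(x_k) + F'(x_k)\, s_k \| \le \eta_k \, \| F(x_k) \|, \qquad x_{k+1} = x_k + s_k,

\eta_k = \gamma \left( \frac{\| F(x_k) \|}{\| F(x_{k-1}) \|} \right)^{\alpha},
\qquad \gamma \in [0, 1], \quad \alpha \in (1, 2].
```

A loose tolerance $\eta_k$ far from the solution avoids spending GMRES iterations on steps that will be discarded, while a tightening $\eta_k$ near the solution preserves fast local convergence.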

MS15

Improvement of Performance and Applicability of MODFLOW-2005: New NWT Solver and xMD Matrix Solver Package

The efficiency of Picard’s method used by MODFLOW degrades when a model system is strongly nonlinear. MODFLOW has difficulties solving problems that involve large hydraulic conductivity contrasts among its geological units. These difficulties are caused by a hard-to-solve matrix. The Newton method is combined with a new preconditioned conjugate-gradient type matrix solver to improve MODFLOW’s performance. This talk will present results, including the model’s ability to provide a solution for a difficult unconfined groundwater-flow problem.

Rich NiswongerUS Geological [email protected]

Sorab PandayAMEC Geomatrix [email protected]

Motom IbarakiThe Ohio State [email protected]

MS16

A Fast Algorithm for the Time Harmonic Elastic Inverse Medium with Multiple Events

We consider the inversion of the 3-D time-harmonic elastodynamic equation in a lossy medium. In particular, we focus on broadband multi-point illumination problems for low frequency regimes. Such a problem finds many applications in geosciences. We use an integral-equation formulation for the forward problem and consider only small perturbations of the background medium (Born approximation). To solve this inverse problem we use a least squares formulation. If $N_\omega$ is the number of excitation frequencies, $N_s$ the number of incoming waves, $N_d$ the number of detectors, and $N$ the discretization size for the scatterer, a dense SVD for the overall input-output map will have $[\min(N_s N_\omega N_d, N)]^2 \times \max(N_s N_\omega N_d, N)$ cost. We have developed a fast algorithm that brings the cost down to $O(N N_\omega N_s + N N_\omega N_d)$, thus providing orders of magnitude improvements. Also, we propose an adaptive method in space to optimize the ratio of accuracy to the number of degrees of freedom in space (the number of spatial discretization points).

Stephanie [email protected]

George BirosGeorgia Institute of [email protected]

MS16

Large Scale Inversion for Electromagnetic Wave Propagation Problems

In this talk we discuss the inversion of electromagnetic signals. In particular we discuss time domain EM inversion. We address the main computational bottlenecks and show how to reduce them.


MS16

Solving Time-Harmonic Inverse Medium Optimization Problems on the Cray XE6 Supercomputer

We formulate the inverse medium problem as a PDE-constrained optimization problem, where the underlying wave field solves the time-harmonic Helmholtz equation. Ill-posedness is tackled through regularization while the inclusion of inequality constraints is used to encode prior knowledge. The resulting nonconvex optimization problem is solved by a primal-dual interior-point algorithm with inexact step computation. Numerical results both in two and three space dimensions on the Cray XE6 illustrate the usefulness of the approach.

Olaf SchenkDepartment of Mathematics and Computer ScienceUniversity of Basel, [email protected]

Grote Marcus, Johannes HuberUniversity of [email protected], [email protected]

MS16

A Collection of Computational Experiments in Parameter Estimation

Full Wavefield Seismic Inversion (FWI) represents a challenging parameter estimation problem. In this problem, we require the solution of an optimization problem with one PDE constraint for every seismic source. As a result, applying fast converging algorithms that require us to store second order information becomes challenging. In the following presentation, we present several computational experiments where we compare and contrast a variety of first and second order low memory algorithms applied to FWI.

Joseph Young, Scott Collis, Bart G. Van BloemenWaandersSandia National [email protected], [email protected], [email protected]

MS17

Basis Design for Polynomial Regression with Derivatives

We discuss polynomial regression with derivative (PRD) information, a method for approximating the stochastic response of a complex system. We demonstrate that the method needs less sampling information compared to its derivative-free version. Nevertheless, the method also poses other challenges in that the polynomial basis of choice for derivative-free methods is no longer suitable in this case. We propose a novel basis based on orthogonalization with respect to an inner product that involves gradient information. We demonstrate that the basis results in more accurate approximation at the same information level.

Yiou LiIllinois Institute of [email protected]

Mihai AnitescuArgonne National LaboratoryMathematics and Computer Science [email protected]


Oleg RoderickArgonne National [email protected]

MS17

Ensemble Emulators

Constructing Gaussian process/Kriging models requires the repeated inversion of an $M$ by $M$ matrix where $M$ is the number of design points. For $M = O(10^{6})$ points, the cost is $O(10^{18})$ operations and the matrix is sure to be numerically singular. Using an ensemble of $O(10^{6})$ small emulators made from $O(100)$ points each reduces the cost to $O(10^{12})$ operations, avoids singularity issues, allows the ensemble emulator to represent non-stationary processes, and enables its concurrent construction and evaluation.

Keith DalbeySandia National [email protected]

Matthew JonesCenter for Comp. Research,Univ. at Buffalo, Buffalo, NY [email protected]

Abani PatraDept. of Mechanical and Aerospace EngineeringUniversity at [email protected]

Eliza CalderUniversity of [email protected]
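The cost argument in the Ensemble Emulators abstract above follows from the cubic cost of a dense matrix inversion. The sketch below works through that arithmetic with illustrative sizes chosen here as assumptions (one million design points, and one million small emulators of 100 points each); the function names are hypothetical and constant factors are ignored.

```python
def dense_inversion_cost(num_points):
    """Leading-order operation count for inverting a num_points-by-num_points matrix."""
    return num_points ** 3

def ensemble_cost(num_emulators, points_each):
    """Total cost of building num_emulators independent small emulators."""
    return num_emulators * dense_inversion_cost(points_each)

# Assumed illustrative sizes, not measurements from the talk.
single = dense_inversion_cost(10 ** 6)   # one global emulator
ensemble = ensemble_cost(10 ** 6, 100)   # many small emulators
print(single, ensemble, single // ensemble)
```

Because the small emulators are independent of one another, they can also be built and evaluated concurrently, which is the last benefit the abstract lists.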

MS17

Countering the Curse of Dimensionality Using Higher-order Derivatives

Surrogate model approaches can reduce the computational cost for uncertainty analysis dramatically since their estimated function values can be used for an inexpensive Monte Carlo simulation. The surrogate model construction can be enhanced by using higher-order derivatives, whereby the information gain with higher dimensionality at reduced additional cost through adjoint methods can be exploited. Some uncertainty analysis examples applying these ideas will be given involving analytic test functions as well as computational fluid dynamics simulations.

Markus P. RumpfkeilUniversity of [email protected]

Wataru YamazakiNagaoka University of [email protected]

Dimitri MavriplisDepartment of Mechanical EngineeringUniversity of [email protected]

MS17

Improving the Scalability of Simplex Stochastic Collocation using Gradient-enhanced Response Surface Approximation

Simplex Stochastic Collocation (SSC) is an efficient method for the propagation of multiple aleatoric uncertainties based on an adaptive Delaunay triangulation of probability space and higher degree polynomial interpolation. SSC is here extended to include gradient-enhanced response surface approximation for improving the scalability to higher dimensional probability spaces using adjoint information. The computational complexity is further reduced by a pointwise construction of the response surface instead of the Delaunay triangulation.

Gianluca IaccarinoCTR, Stanford University, Stanford,CA 94305-3035, [email protected]

Jeroen WitteveenStanford UniversityCenter for Turbulence [email protected]

MS18

Proper Generalized Decomposition based Dynamic Data-Driven Application

Dynamic Data-Driven Application Systems (DDDAS) appear as a new paradigm in the field of applied sciences and engineering, and in particular in simulation-based engineering sciences. By DDDAS we mean a set of techniques that allow the linkage of simulation tools with measurement devices for real-time control of systems and processes. DDDAS entails the ability to dynamically incorporate additional data into an executing application and, in reverse, the ability of an application to dynamically steer the measurement process. DDDAS needs accurate and fast simulation tools that make use, where possible, of off-line computations in order to limit the on-line computations as much as possible. We could define efficient solvers by introducing all the sources of variability as extra coordinates, so that the model is solved off-line only once to obtain its most general solution, which is then used on-line. However, such models are defined in highly multidimensional spaces and suffer from the so-called curse of dimensionality. We recently proposed a technique, the Proper Generalized Decomposition (PGD), able to circumvent the curse of dimensionality. The marriage of DDDAS concepts and tools with PGD off-line computations could open unimaginable possibilities in the field of dynamic data-driven application systems. In this work we explore some possibilities in the context of parameter estimation.

Francoise MassonGEM - Centrale Nantes, [email protected]

Elias CuetoI3A, University of Zaragoza, [email protected]

Francisco ChinestaEcole Centrale de NantesEADS Corporate Foundation International [email protected]

David GonzalezI3A, University of Zaragoza, [email protected]


MS18

Towards Optimal Interpolatory Model Reduction for Parameterized Systems

Optimal interpolatory model reduction has received great attention recently due to numerically effective methods for proving optimal point selection strategies. In this talk, after briefly reviewing the interpolation framework for parameterized systems, we show how to extend the concept of optimality to the parametric setting.

Serkan GugercinVirginia Tech.Department of [email protected]

MS18

Applications of DEIM in Nonlinear Model Reduction

A dimension reduction method called Discrete Empirical Interpolation (DEIM) is described and shown to dramatically reduce the computational complexity of the popular Proper Orthogonal Decomposition (POD) method for constructing reduced-order models for parametrized nonlinear partial differential equations (PDEs). POD reduces dimension in the sense that far fewer variables are present, but the complexity of evaluating the nonlinear term remains that of the original problem. DEIM is a modification of POD that reduces complexity of the nonlinear term of the reduced model to a cost proportional to the number of reduced variables obtained by POD. The method applies to arbitrary systems of nonlinear ODEs, not just those arising from discretization of PDEs. Applications from Neural Modeling, Porous Media Flow, and Shape Optimization will be presented to illustrate the wide applicability of the DEIM approach.

Danny C. SorensenRice [email protected]
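The complexity argument in the DEIM abstract above can be made concrete with the usual formulation, sketched here under assumed notation that does not appear in the abstract: a full model of dimension $n$ with linear part $A$ and nonlinear term $F$, a POD basis $V \in \mathbb{R}^{n \times k}$, a basis $U \in \mathbb{R}^{n \times m}$ for snapshots of the nonlinear term, and a selection matrix $P \in \mathbb{R}^{n \times m}$ whose columns are columns of the identity.

```latex
% Full model and POD-Galerkin reduced model (k << n):
\dot{y} = A y + F(y), \qquad
\dot{\hat{y}} = V^{T} A V \, \hat{y} + V^{T} F(V \hat{y}).

% The term V^T F(V \hat{y}) still requires all n components of F.
% DEIM interpolates F at the m rows selected by P:
F(V \hat{y}) \approx U \, (P^{T} U)^{-1} P^{T} F(V \hat{y}),

% so the reduced nonlinear term becomes
V^{T} F(V \hat{y}) \approx
\underbrace{V^{T} U \, (P^{T} U)^{-1}}_{k \times m,\ \text{precomputed}}
\; P^{T} F(V \hat{y}).
```

Only the $m$ selected components $P^{T} F(V \hat{y})$ are evaluated at each step, so the on-line cost no longer depends on the full dimension $n$ when $F$ acts componentwise.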

MS18

A Comparison of Reduced Basis Methods and POD for an Optimal Control Problem

In this contribution, a linear-quadratic optimal control problem governed by the Helmholtz equation is considered. For the computation of suboptimal solutions, two different model reduction techniques are compared: the reduced basis method (RBM) and proper orthogonal decomposition (POD). By an a-posteriori error estimator for the optimal control problem the accuracy of the suboptimal solutions is ensured. The efficiency of both model reduction approaches is illustrated by a numerical example for the stationary Helmholtz equation.

Karsten UrbanInstitute of Numerical Mathematics, University of [email protected]

MS19

Isogeometric Analysis for Solids and Structures

A Reissner-Mindlin shell formulation based on a degenerated solid is implemented for NURBS-based isogeometric analysis. Its performance is examined on a set of elastic and nonlinear elasto-plastic benchmark examples. The analyses were performed with LS-DYNA, a general-purpose finite element code, for which a user-defined shell element capability was implemented. This new feature, to be reported on in subsequent work, allows for the use of NURBS and other non-standard discretizations in a sophisticated nonlinear analysis framework.

David J. BensonDepartment of Structural EngineeringUniversity of California, San [email protected]

MS19

Finite Element Methods for Nonlocal Models of Diffusion and Mechanics


Max GunzburgerFlorida State UniversitySchool for Computational [email protected]

MS19

Nonlocal Continuum Balances in Continuum Mechanics

I review the nonlocal balances of linear momentum, angular momentum and energy. The nonlocal balances represent extensions to the classical balances and this relationship is given. I also highlight the current mathematical and phenomenological understanding of the nonlocal balances.

Richard B. LehoucqSandia National [email protected]

MS19

Computational Peridynamics

Peridynamics is a nonlocal extension of classical continuum mechanics. Whereas classical continuum mechanics is governed by familiar partial differential equations, peridynamics is governed by an integro-differential equation. We discuss the impact of this nonlocal formulation upon the computational structure of the problem, reviewing discretization techniques, conditioning results, and solution methods. We also survey the state-of-the-art in computational peridynamics, discussing available codes and showing demonstration problems.

Michael L. ParksSandia National [email protected]
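To make the contrast drawn in the Computational Peridynamics abstract above concrete, the classical momentum balance and the commonly cited bond-based peridynamic equation of motion are shown side by side. The notation is assumed here rather than taken from the abstract: density $\rho$, displacement $u$, stress $\sigma$, body force $b$, pairwise force function $f$, and neighborhood $\mathcal{H}_x$ of the point $x$.

```latex
% Classical (local) balance of linear momentum: a PDE in space and time
\rho \, \ddot{u}(x, t) = \nabla \cdot \sigma(x, t) + b(x, t)

% Bond-based peridynamics: spatial derivatives replaced by an integral
\rho \, \ddot{u}(x, t) = \int_{\mathcal{H}_x}
    f\bigl( u(x', t) - u(x, t),\; x' - x \bigr) \, dV_{x'} + b(x, t)
```

After discretization each point interacts with every point in its neighborhood rather than only with adjacent mesh nodes, which is why the resulting matrices are denser than those of a local finite element model.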

MS20

Sparse Bayesian Kernel Techniques for the Solution of SPDEs

The sparse grid method has been widely utilized in uncertainty propagation problems of SPDEs with well-known limitations. As the need for alternative approaches is evident, we explore the performance of sparse Bayesian kernel techniques applied to the problem. We employ sophisticated model selection methods to adaptively identify the functional form of the kernels and estimate the scale parameters. The use of Bayesian variance enables us to select highly informative data points, which minimizes the calls to the FE solver.

Ilias BilionisCenter for Applied MathematicsCornell University, [email protected]

Xinzeng Feng, Nicholas ZabarasCornell [email protected], [email protected]

MS20

Bayesian Inversion without Markov Chains

Bayesian inference provides a natural framework for quantifying uncertainty in PDE-constrained inverse problems, for fusing heterogeneous sources of information, and for conditioning successive predictions on data. In this setting, simulating from the posterior via Markov chain Monte Carlo (MCMC) constitutes a fundamental computational bottleneck. We present a new technique that entirely avoids MCMC by constructing functional approximations that enable rapid sampling and characterization of the posterior distribution. The approximations are implemented efficiently using optimization methods.

Tarek El Moselhy, Youssef M. MarzoukMassachusetts Institute of [email protected], [email protected]

MS20

Uncertainty Quantification given Discontinuities, Long-tailed Distributions, and Computationally Intensive Models

Conventional global spectral methods for uncertainty quantification are challenged when computer model predictions are discontinuous. On the other hand, local methods can be inefficient given computationally intensive models. The presence of fat-tail distributions in model predictions is also challenging, as excessive sampling can be prohibitive when forward models are computationally intensive. To circumvent these challenges we demonstrate a methodology that employs Bayesian inference to locate discontinuities in the model output, followed by efficient spectral propagation of uncertainty using domain mapping. We also illustrate how the use of tailored basis functions improves the convergence of spectral expansions in the tail regions.

Cosmin Safta, Khachik SargsyanSandia National [email protected], [email protected]

Bert J. DebusschereEnergy Transportation CenterSandia National Laboratories, Livermore [email protected]

Habib N. NajmSandia National LaboratoriesLivermore, CA, [email protected]

MS20

An Application of Rare Event Tools to a Bimodal Ocean Current Model

I will review some recent results on importance sampling methods for rare event simulation as well as discuss their application to the filtering problem for a simple model of the Kuroshio current. Assuming a setting in which both observation noise and stochastic forcing are present but small, I will demonstrate numerically and theoretically that sophisticated importance sampling techniques can achieve very small statistical error. Standard filtering methods deteriorate in this small noise setting.

Jonathan [email protected]
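The idea of importance sampling for rare events, mentioned in the abstract above, can be illustrated with a textbook example that is not the speaker's method and has nothing to do with the Kuroshio model: estimating the small probability that a standard normal variable exceeds a threshold by sampling from a normal distribution shifted to that threshold and reweighting. The function name, threshold, sample count, and seed are all assumptions made for this sketch.

```python
import math
import random

def tail_prob_importance_sampling(threshold, num_samples, seed=0):
    """Estimate P(Z > threshold), Z ~ N(0, 1), using samples from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(threshold, 1.0)
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            total += math.exp(-threshold * x + 0.5 * threshold ** 2)
    return total / num_samples

threshold = 4.0
exact = 0.5 * math.erfc(threshold / math.sqrt(2.0))
estimate = tail_prob_importance_sampling(threshold, 20000)
print(exact, estimate)
```

With plain Monte Carlo, 20000 standard normal samples would usually contain no exceedance of 4 at all, whereas about half of the shifted samples land in the rare region and the weights correct for the shift.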

MS21

Thread Level Parallelism of Modern Architectures: New Degrees of Freedom for Mixed Parallel and Serial Processing Optimization of Algorithm Computation

The massively parallel SIMD architecture of the GPU combined with its ability to barrier synchronize computation has enabled a new area of loop optimization. In particular, nested loops with dependencies can be decomposed into different levels of granularity of parallelism and serial processing. Applying unimodular linear transformations to the dependence matrix that represents the order of computation can optimize certain classes of algorithms. Restating the problem in a dual forward path dynamic programming optimization can be used to discover algorithm optimization.

Dennis [email protected]

MS21

Optimization of Cartesian Treecodes

Many choices go into designing a fast method for computing long-range particle interactions. Here we discuss a relatively simple particle-cluster treecode that uses Cartesian Taylor series to implement the far-field expansions. An update is provided on our efforts to optimize the serial and parallel performance of this algorithm. One application of special interest is the Poisson-Boltzmann model for solvated biomolecules.

Robert KrasnyUniversity of MichiganDepartment of [email protected]

Weihua GengDepartment of MathematicsUniversity of [email protected]

MS21

Parallel Rank-1 Updates of QR Factorizations

The QR factorizations of matrices under successive rank-1 updates are used in adaptive signal processing. The traditional approach is highly sequential. We review a recently introduced algorithm that enables efficient parallel computation, has lower complexity in total arithmetic operations, and behaves better numerically. We extend the methodology to a few other array operations frequently used in signal processing.


MS21

Parallelizing the Fast Gauss Transform

The fast Gauss transform allows for calculation of the sum of N Gaussians at M points in O(N + M) time. We present new algorithms for its efficient parallelization. Computing the transform to six-digit accuracy at 120 billion points took approximately 140 s using 4096 cores on the Jaguar supercomputer. Our algorithms can also be used for other “Gaussian-type” kernels and thereby form a new class of core computational machinery for solving parabolic PDEs on massively parallel architectures.

Shravan VeerapaneniCourant InstituteNew York [email protected]

Rahul [email protected]

Hari SundarSiemens [email protected]
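For reference, the quantity that the fast Gauss transform in the abstract above accelerates can be evaluated directly in O(N·M) time. The sketch below is that direct baseline only, in one dimension; it is not the fast algorithm or its parallelization, and the function name, the bandwidth parameter `delta`, and the sample data are assumptions made for illustration.

```python
import math

def direct_gauss_transform(sources, weights, targets, delta):
    """Evaluate G(y) = sum_i q_i * exp(-(y - x_i)^2 / delta) at every target y.

    Direct double loop: cost grows as len(sources) * len(targets).
    """
    return [
        sum(q * math.exp(-((y - x) ** 2) / delta) for x, q in zip(sources, weights))
        for y in targets
    ]

# Two sources at 0 and 1 with weights 1 and 2, evaluated at the same two points.
values = direct_gauss_transform([0.0, 1.0], [1.0, 2.0], [0.0, 1.0], delta=1.0)
print(values)
```

The fast transform replaces the inner sum with truncated series expansions shared among nearby points, which is what brings the cost down to O(N + M).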

MS22

The General Curvilinear Environmental Model

The General Curvilinear Environmental Model (GCEM) is a very high-resolution model composed of two sub-models: (1) the General Curvilinear Atmosphere Model (GCAM) and (2) the General Curvilinear Coastal Ocean Model (GCCOM). GCEM is written in fully 3D curvilinear coordinates and can work with both non-orthogonal and orthogonal grids in all three dimensions. It uses a non-hydrostatic approach to solve for pressure and Large Eddy Simulation (LES) to solve the primitive Navier-Stokes equations with Boussinesq approximations.

Mohammad AboualiCSRCSan Diego State [email protected]

MS22

Using the Distributed Coupling Toolkit (DCT) to Couple Model Components of Generalized Curvilinear Environmental Model

The General Curvilinear Environmental Model (GCEM) is an on-going modeling project to simulate a high resolution earth model system. Current GCEM components are a General Curvilinear Coastal Ocean Model (GCCOM) and a General Curvilinear Atmospheric Model (GCAM). Both models can run in parallel or sequential computational environments. The Distributed Coupling Toolkit (DCT) is a library to couple multi-physics and multi-resolution models in a truly distributed manner. The DCT has a user-friendly interface to formulate the coupling of variables and fields within pairs of model components. The DCT's distributed approach guarantees scalability both at the model complexity and parallel processing levels. Here we use the DCT to weakly couple different components of GCEM. Also, we present some new capabilities implemented in DCT. Lastly, we show some preliminary performance results of DCT in a GCEM based application.

Dany De CecchisComputational Science Research CenterSan Diego State [email protected] or [email protected]

Leroy A. DrummondLawrence Berkeley National LaboratoryComputational Research [email protected]

Jose E. CastilloComputational Science Research [email protected]

MS22

Data Assimilation Techniques for the GCEM


Mariangel GarciaCSRCSan Diego State [email protected]

MS22

A Cyberinfrastructure-based Computational Environment for the GCEM

The GCEM computational environment (CE) allows clients (human or application) to perform tasks including: running simulations across heterogeneous computing environments; hosting sub-models as services for applications; nesting sub-models within other models. Utilizing the SDSU Cyberinfrastructure Web Application Framework (CyberWeb), the CE provides middleware and back-end services including: dynamic job creation, submission, history, tracking, and management; data migration and management; quick visualization of results; and resources and services management. In this talk, we describe the design, architecture and status of the project.

Mary ThomasSan Diego State [email protected]

MS23

Accelerated Simulation of Analog Systems Using Linear and Nonlinear Robust Compact Models

One common approach for accelerating the simulation of complex analog systems is to replace subsystems of the circuit with simpler compact models. However, such procedures often result in unstable and non-physical behavior of the resulting system, particularly when employing nonlinear models. Using recently developed optimization-based model reduction techniques, it is possible to generate robust models, for both linear and nonlinear subsystems, that are guaranteed to be stable and passive, ensuring simulation of the overall system remains well-behaved.

Brad BondSandia National [email protected]

MS23

A Nonlinear Timing/Phase Macromodeling Technique for General Circuits

We extend the concept of timing/phase macromodels, previously established rigorously only for oscillators, to apply to general circuits, both non-oscillatory and oscillatory. We derive a timing/phase macromodel via nonlinear perturbation analysis, and provide numerical methods to compute the macromodel. The macromodel that emerges is a scalar, nonlinear time-varying equation that accurately characterizes the system’s phase/timing responses. We validate the technique on non-oscillatory circuits and show that our macromodel accurately captures the timing response.

Chenjie Gu, Jaijeet RoychowdhuryUniversity of California, [email protected], [email protected]

MS23

Parallel Circuit Simulation Using Multi-Level Newton as a Graph Mitigation Strategy

Parallel SPICE-style circuit simulation is challenging for a number of reasons. Traditional circuit simulation involves implicitly solving a potentially large set of differential-algebraic equations (DAEs), which requires constructing and solving a linear system at each Newton step. Unfortunately, modern integrated circuit topologies frequently lead to matrix structures that are problematic for preconditioned iterative solvers. In a previous work, a combination of techniques, including singleton removal, Dulmage-Mendelsohn decomposition, and hypergraph partitioning, has been shown to be an effective preconditioning strategy for some integrated circuits. However, there are circumstances where this strategy is ineffective, as it depends upon the Jacobian matrices having a particular structure. Circuit features that can break the expected structure include highly connected nodes such as non-ideal power supply and clock nodes (common in many integrated circuits), as well as feedback loops typical in phase-locked loops (PLLs). Through strategic application of multi-level Newton methods it is possible to remove problematic structures from the matrix and/or circuit graph, and enable the preconditioner to be effective.

Eric KeiterSandia National [email protected]

MS23

Efficient Preconditioners for Large-Scale Parallel Circuit Simulation

While direct linear solvers have long been regarded as a requirement for successful circuit simulation, the parallel transistor-level simulation of large-scale integrated circuits necessitates the use of iterative linear solvers. However, the linear systems generated through circuit simulation are challenging for conventional matrix ordering, load balancing, and preconditioning techniques. We will discuss the challenges presented by these linear systems, the current techniques used for preconditioning circuit matrices, and the ongoing work in developing efficient preconditioners for scalable circuit simulation using Xyce.

Heidi K. Thornquist
Sandia National [email protected]

MS24

A Computational Quest for Quantum Subsystem Codes

Quantum error correction is important because it allows us to build quantum computers that work despite having noise in their components. Previous approaches for investigating quantum error correcting codes have focused on applying theoretical analysis to look for interesting codes and to investigate their properties. In this talk we present an alternative approach that uses computational analysis to accomplish the same goals. Specifically, we present an algorithm that computes the optimal quantum subsystem code that can be implemented given an arbitrary set of measurements that are tensor products of Pauli operators. We then demonstrate the utility of this algorithm by performing a systematic investigation of the quantum subsystem codes that exist in the setting where the measurements are limited to 2-body measurements between neighbors on lattices derived from the convex uniform tilings of the plane.

Gregory M. Crosswhite
University of [email protected]

MS24

A Large Time Step and Low Communication Finite Volume Method for Atmospheric Simulation

Matthew Norman
North Carolina State University
Raleigh, [email protected]

MS24

Strategies for In-situ Analysis and Visualization in Large-scale Cosmological Simulation

Paul Matthew Sutter
University of Illinois at [email protected]

MS25

Gradient and Hessian Consistency in Discontinuous Galerkin Solution of Inverse Wave Propagation Problems

Tan Bui-Thanh
The University of Texas at [email protected]

MS25

The Inverse Medium Problem in Site Characterization

I discuss recent progress in the full-waveform imaging of probed solids/soils, with site characterization applications in mind that typically involve arbitrarily heterogeneous domains, and mandate domain truncation via Perfectly-Matched-Layers (PMLs). I discuss a variational, symmetric, hybrid, non-convolutional, unsplit-field approach for wave simulations in the combined truncated domain and PML layer, and report on numerical experiments, including inversion attempts for the Marmousi profile.

Loukas F. Kallivokas
The University of Texas at [email protected]

MS25

On the Effect of Boundary Conditions in Seismic Full Waveform Inversion

In seismic full waveform inversion, the enormous size of the computational earth models that are of interest to the oil and gas industry makes solving this problem challenging, even with today’s supercomputers. In order to reduce the computational domain of earth models to a manageable size, the forward simulation of wave propagation on the unbounded domain of the problem is truncated to a finite computational domain and some form of absorbing boundary condition (ABC) is applied. The truncated computational domain is typically a minimal volume spanned by the receiver spread at the earth surface. By doing so, however, one implicitly discards reflections that would come from near the outside of the truncated domain, caused by the inherent inhomogeneities of the earth’s material properties. We present results from numerical experiments on the inversion of material properties from acoustic wave propagation data as obtained through typical geometries used in standard reflection seismology. The forward problem is discretized with discontinuous Galerkin finite elements, and we compare results obtained from a very large computational domain to those obtained on a truncated computational domain with absorbing boundaries modeled either with the simple characteristic ABC, or with the state-of-the-art perfectly matched layer (PML). We conclude that while the PML conditions are clearly superior for modeling a purely absorbing boundary, their effect on the inversion does not always justify the increased computational cost. Our work suggests that the error introduced in the inversion by the use of a truncated domain, which cannot account for the reflections that come from outside the domain of interest and are present in the signal, could often be larger than the residual reflections due to imperfectly absorbing boundaries.

Dimitar Trenev
IMA Univ of [email protected]

Volkan Akcelik, Huseyin Denli, Alex Kanevsky, Kines K Patel, Laurent White, Martin-Daniel [email protected], [email protected], [email protected], [email protected], [email protected], [email protected]

MS25

Full Wave Form Inversion using Discontinuous Galerkin

We demonstrate the use of Discontinuous Galerkin (DG) to solve a linear least squares problem constrained by full wave form propagation. An adjoint provides sensitivity information which is combined with Nonlinear Conjugate Gradient and phase encoding to solve a large multi right-hand side problem. DG offers flexibility to provide added resolution in areas where the subsurface is complex. We demonstrate an initial implementation using a synthetic two dimensional dataset.

Bart G. Van Bloemen Waanders, Scott Collis, Curt Ober
Sandia National [email protected], [email protected], [email protected]

MS26

Real-Time Parametric Adaptation of Reduced-Order Models by Consistent Interpolation on a Manifold

The concept of parametric adaptation of reduced-order bases using a database and interpolation algorithms on manifolds is extended to linearized reduced-order models. Two steps are involved: (1) the transformation of the reduced operators in consistent bases, and (2) the interpolation of the transformed operators on a suitable manifold, resulting in a fully on-line method. It is illustrated with applications to parametric studies of structural and aeroelastic systems in subsonic to supersonic regimes. Its ability to detect mode crossing and veering is highlighted.

David Amsallem, Charbel Farhat
Stanford [email protected], [email protected]

MS26

Parametric Adaptation of Reduced-Order Bases by Interpolation on a Manifold

Physics-based Reduced-Order Models can enable real-time computations but remain expensive to generate. They also generally lack robustness with respect to variations of underlying model parameters. To address these variations, a computational approach based on an offline database of reduced-order bases equipped with online algorithms for interpolation on appropriate manifolds is presented. The proposed approach incorporates a machine learning-based training algorithm. It is demonstrated here by application to the flutter analysis of an F-16 aircraft configuration.

Charbel Farhat, David Amsallem
Stanford [email protected], [email protected]

Julien Cortial
Stanford University
Institute for Computational and [email protected]

MS26

The Certified Reduced Basis Method for High-Fidelity Simulations in Real-Time Deployed Applications

We employ the certified reduced basis method to achieve high-fidelity numerical simulation of parametrized partial differential equations on “lightweight” deployed devices. The computational approach is divided into a computationally intensive Offline stage (performed on a supercomputer, for example) in which the reduced order model is generated, and a very inexpensive Online stage in which quantities of interest and rigorous error bounds are evaluated in real-time. This methodology allows us to consider “in the field” inverse and design problems and we present a number of examples from heat transfer, fluid mechanics and acoustics.

David J. [email protected]

MS26

Empirical Operator Interpolation for Reduced Basis Approximations of Nonlinear Evolution Equations

In this contribution we present a new approach to treat nonlinear operators in reduced basis approximations of parametrized evolution equations. The approach is based on empirical interpolation of nonlinear differential operators and their Fréchet derivatives. Efficient online/offline decomposition is obtained for discrete operators that satisfy an H-independent DOF dependence for a certain set of interpolation functionals, where H denotes the dimension of the underlying high dimensional discretization space. The resulting reduced basis method is applied to nonlinear parabolic and hyperbolic equations based on explicit or implicit finite volume discretizations. We show that the resulting reduced scheme is able to capture the evolution of both smooth and discontinuous solutions. In case of symmetries of the problem, the approach realizes an automatic and intuitive space-compression or even space-dimensionality reduction. We perform empirical investigations of the error convergence and runtimes. In all cases we obtain a runtime acceleration of at least one order of magnitude.

Martin Drohmann
WWU [email protected]

Bernard Haasdonk
University of [email protected]

Mario Ohlberger
Universität Münster
Institut für Numerische und Angewandte [email protected]

MS27

Edge Functions – A Sound Basis

The reconstruction map I, also known as the Whitney map, plays an important role in mimetic/compatible discretizations, as described in [1]. I maps cochains onto differential forms and needs to satisfy the following criteria:

1. I is linear;

2. RI ≡ Id, the reconstruction map is the right inverse of the reduction map R;

3. IR = Id + O(h^p), the reconstruction map is an approximate left inverse of the reduction map;

4. I satisfies dI = Iδ, where d is the exterior derivative acting on differential forms and δ is the coboundary operator acting on cochains.

This talk will address edge functions, which are high-order polynomials that satisfy the criteria for reconstruction maps listed above. All properties of the reconstruction map are retained when highly deformed curvilinear spectral elements are concerned [2].

[1] Bochev and Hyman, Principles of Mimetic Discretizations of Differential Operators, Vol. 142 of the IMA Volumes in Mathematics and its Applications, Springer, Berlin, pp. 89-119, 2006.

[2] Gerritsma, Edge functions for spectral element methods, in: Spectral and High Order Methods for Partial Differential Equations, eds. Jan S. Hesthaven and Einar M. Rønquist, Lecture Notes in Computational Science and Engineering, 76, Springer, pp. 199-208, 2010.

Mark Gerritsma
Delft University of [email protected]
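
As a rough illustration of criterion 2 (RI ≡ Id) in one dimension, the sketch below assumes the usual construction in which the i-th edge function is minus the sum of the derivatives of the first i nodal Lagrange polynomials, and in which the reduction map R integrates a 1-form over each edge between consecutive nodes. The Chebyshev-Lobatto nodes and all function names are choices made for this example only; the abstract does not specify them.

```python
import numpy as np
from numpy.polynomial import Polynomial


def lagrange_basis(nodes):
    """Nodal Lagrange polynomials h_k with h_k(x_j) = delta_kj."""
    basis = []
    for i, xi in enumerate(nodes):
        others = np.delete(nodes, i)
        basis.append(Polynomial.fromroots(others) / np.prod(xi - others))
    return basis


def edge_functions(nodes):
    """Edge functions e_i = -sum_{k<i} h_k' (assumed 1D construction)."""
    h = lagrange_basis(nodes)
    edges = []
    running = Polynomial([0.0])
    for k in range(len(nodes) - 1):
        running = running - h[k].deriv()
        edges.append(running)
    return edges


def reduction_of_edge_functions(nodes):
    """Matrix M[i, j] = integral of e_i over the j-th edge [x_j, x_{j+1}]."""
    e = edge_functions(nodes)
    n = len(e)
    M = np.empty((n, n))
    for i, ei in enumerate(e):
        antiderivative = ei.integ()
        for j in range(n):
            M[i, j] = antiderivative(nodes[j + 1]) - antiderivative(nodes[j])
    return M


# Chebyshev-Lobatto nodes on [-1, 1], sorted in increasing order (an assumption).
N = 4
nodes = np.cos(np.pi * np.arange(N + 1) / N)[::-1]
M = reduction_of_edge_functions(nodes)
print(np.round(M, 12))
```

If the construction is as assumed, the printed matrix is the identity: integrating the i-th edge function over the j-th edge gives the Kronecker delta, which is the discrete statement that reducing a reconstructed cochain returns the original cochain.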

MS27

Fast Finite Element Algorithms using Bernstein Polynomials

Bernstein polynomials on the d-simplex possess special structure that allows spectral-complexity matrix-free application of finite element operators, with both constant and variable coefficients. I shall describe this structure and discuss application of these ideas for the de Rham complex of H(grad), H(curl), and H(div) elements.

Robert C. Kirby
Texas Tech [email protected]

MS27

A Locally Conservative, Discontinuous Least-Squares Finite Element Method for the Stokes Equations

Conventional least-squares finite element methods (LSFEMs) for incompressible flows do not lead to exact conservation of mass in the resulting approximation. In this talk we formulate a new, locally conservative LSFEM for the Stokes equations wherein a discontinuous velocity field is computed that is point-wise divergence free on each element. The effect of the new LSFEM approach on improved local and global mass conservation is compared with a conventional LSFEM for the Stokes equations employing standard C^0 elements.

James Lai, Luke Olson
Department of Computer Science
University of Illinois at [email protected], [email protected]


MS27

Application of a Discontinuous Petrov-Galerkin Method to the Stokes Equations

The discontinuous Petrov-Galerkin finite element method proposed by L. Demkowicz and J. Gopalakrishnan guarantees the optimality of the solution in what they call the energy norm. An important choice that must be made in the application of the method is the definition of the inner product on the test space. In this work, we apply the DPG method to the Stokes problem in two dimensions, analyzing it to determine appropriate inner products, and perform a series of numerical experiments.

Nathan Roberts
University of Texas at [email protected]

Leszek Demkowicz
Institute for Computational Engineering and Sciences (ICES)
The University of [email protected]


MS28

A Posteriori Error Estimates for Polynomial Chaos Expansions of Response Surfaces for Differential Equations

We develop computable a posteriori error estimates for linear functionals of a solution to a general nonlinear stochastic differential equation with random model/source parameters. These error estimates are based on a variational analysis applied to stochastic Galerkin methods for forward and adjoint problems. The result is a representation for the error estimate as a polynomial in the random model/source parameter. The advantage of this method is that we use polynomial chaos representations for the forward and adjoint systems to cheaply produce error estimates by simple evaluation of a polynomial. By comparison, the typical method of producing such estimates requires repeated forward/adjoint solves for each new choice of random parameter. We present numerical examples showing that there is excellent agreement between these methods.

Troy Butler, Clint Dawson
Institute for Computational Engineering and Sciences
University of Texas at [email protected], [email protected]

Tim M. Wildey
The University of Texas at Austin
Austin, [email protected]

MS28

Numerical Solutions of SPDEs with Multiplicative Noise Forcing Terms

This talk focuses on the numerical solution of parabolic SPDEs with multiplicative noise forcing terms. We first convert the SPDE to forward-backward doubly stochastic differential equations (FBDSDEs), building on the theory of forward-backward stochastic differential equations (FBSDEs) and FBDSDEs. We propose a new numerical algorithm, called the binomial tree method, for solving FBDSDEs. Error analysis as well as numerical experiments will be presented. We shall demonstrate that our method is superior to the standard finite difference method.

Yanzhao Cao
Department of Mathematics & Statistics
Auburn [email protected]

Feng Bao
Auburn [email protected]

Weidong Zhao
Shandong [email protected]

MS28

On the use of ANOVA Expansions in UQ

We consider the use of ANOVA expansions in the context of uncertainty quantification and parameter compression in high-dimensional problems. The discussion includes attention to both Lebesgue and Dirac ANOVA expansions, and in the latter case we highlight a close connection between sparse grid integration and the optimal choice of the anchor point. The accuracy and efficiency of the proposed techniques are illustrated through examples of dynamical systems with high-dimensional parametric uncertainty.

Zhen Gao
Ocean University of China, Qingdao, [email protected]
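
To make the Dirac (anchored) ANOVA idea concrete, here is a minimal sketch of a first-order anchored expansion, f(x) ≈ f(c) + Σ_i [f(c with x_i substituted) − f(c)], where c is the anchor point. The function names and test functions are hypothetical, and the truncation at first order is a simplification chosen for this example; the abstract does not state which truncation or anchor the authors use.

```python
import numpy as np


def anchored_anova_first_order(f, anchor):
    """Return a surrogate built from the constant and first-order anchored terms."""
    anchor = np.asarray(anchor, dtype=float)
    f0 = f(anchor)  # zeroth-order term: f evaluated at the anchor point

    def surrogate(x):
        x = np.asarray(x, dtype=float)
        total = f0
        for i in range(anchor.size):
            y = anchor.copy()
            y[i] = x[i]           # vary one coordinate, keep the rest anchored
            total += f(y) - f0    # first-order term f_i(x_i)
        return total

    return surrogate


def additive(x):
    """Additively separable function: first-order expansion is exact."""
    return np.sin(x[0]) + x[1] ** 2 + 3.0 * x[2]


def coupled(x):
    """Function with a pairwise interaction that first order cannot capture."""
    return x[0] * x[1] + x[2]


anchor = np.zeros(3)
point = np.array([0.4, -1.2, 2.0])
print(anchored_anova_first_order(additive, anchor)(point) - additive(point))
print(anchored_anova_first_order(coupled, anchor)(point) - coupled(point))
```

The first difference is zero up to rounding because the function has no interaction terms; the second equals minus the neglected interaction x_0 x_1, which is the kind of truncation error that the choice of anchor point influences.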

MS28

Stochastic Elliptic Modeling based on the Wick Product

Based on the study of two commonly used stochastic elliptic models, I: $-\nabla \cdot (a(x,\omega) \cdot \nabla u(x,\omega)) = f(x)$ and II: $-\nabla \cdot (a(x,\omega) \diamond \nabla u(x,\omega)) = f(x)$, we constructed a new Wick-type stochastic elliptic model III: $-\nabla \cdot \left( (a^{-1})^{\diamond(-1)} \diamond \nabla u(x,\omega) \right) = f(x)$, where $\diamond$ denotes the Wick product. The difference between models I and II is twofold: a scaling factor induced by the way of applying the Wick product and the regularization induced by the Wick product itself. We show that model III has the same scaling factor as model I, and present a detailed discussion about the difference between models I and III with respect to the two characteristic parameters of the random coefficient, i.e., the standard deviation $\sigma$ and the correlation length $l_c$. Numerical results are presented for both one- and two-dimensional cases.

Xiaoliang Wan
Louisiana State University
Department of [email protected]

MS29

Extending the Time Scales of Biomolecular Simulations on Emerging Computing Architectures

A significant challenge in molecular dynamics (MD) simulation is extending its attainable time scale. There are two ways to do this: make each sequential step run faster by using fast algorithms, and improve the parallel efficiency so that more processors can be used. In this talk, I will discuss more scalable treatment of long-range electrostatic interactions in simulation using Ewald-mesh based explicit models and PB/GB based implicit models.

Xiaolin Cheng
Oak Ridge National [email protected]

MS29

Accelerating Separable Tensor Operations with Throughput-Oriented Processors

Streamlining memory performance is a critical issue for utilizing the full computational bandwidth of throughput-oriented processors such as graphics cards, especially for memory-bound and high-dimensional applications. There are significant related scientific computing challenges, both within the context of the underlying architecture and at the application level. In this work we present a technique for improving the memory efficiency of separable tensor computations that matches application requirements against architectural constraints.

Christopher McKinlay
University of California Los [email protected]

MS29

The Parallel Full Approximation Scheme in Space and Time

I will discuss a strategy for the parallelization of numerical methods for partial differential equations in both the spatial and temporal directions. The method is based on an iterative multilevel approach whereby spectral deferred correction sweeps are applied to a hierarchy of discretizations at different spatial and temporal resolutions. Connections to the parareal algorithm and space-time multigrid methods will be discussed, and the parallel efficiency and speedup for three dimensional problems will be presented.

Michael Minion
University of North Carolina-Chapel Hill
Mathematics [email protected]
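
The abstract mentions a connection to the parareal algorithm. The sketch below is plain parareal, not the scheme presented in the talk, applied to the scalar test equation y' = λy. It assumes backward Euler for both the coarse propagator G (one step per time slice) and the fine propagator F (many steps per slice); all names and parameter values are illustrative.

```python
def make_propagators(lam, dT, n_fine):
    """Coarse (G) and fine (F) backward Euler propagators over one slice of length dT."""
    def G(y):
        return y / (1.0 - lam * dT)

    def F(y):
        h = dT / n_fine
        for _ in range(n_fine):
            y = y / (1.0 - lam * h)
        return y

    return G, F


def serial_fine(lam, y0, T, n_slices, n_fine):
    """Reference: apply the fine propagator slice after slice."""
    _, F = make_propagators(lam, T / n_slices, n_fine)
    U = [y0]
    for _ in range(n_slices):
        U.append(F(U[-1]))
    return U


def parareal(lam, y0, T, n_slices, n_fine, n_iter):
    """Parareal iteration U[n+1] <- G(U_new[n]) + F(U_old[n]) - G(U_old[n])."""
    G, F = make_propagators(lam, T / n_slices, n_fine)
    U = [y0]
    for _ in range(n_slices):          # initial guess from the coarse solver
        U.append(G(U[-1]))
    for _ in range(n_iter):
        # These fine solves are independent and could run in parallel.
        fine = [F(u) for u in U[:-1]]
        coarse = [G(u) for u in U[:-1]]
        new = [y0]
        for n in range(n_slices):      # cheap sequential correction sweep
            new.append(G(new[n]) + fine[n] - coarse[n])
        U = new
    return U


reference = serial_fine(-1.0, 1.0, 2.0, 8, 20)
for k in (0, 1, 2, 8):
    approx = parareal(-1.0, 1.0, 2.0, 8, 20, k)
    print(k, abs(approx[-1] - reference[-1]))
```

After as many iterations as there are time slices, parareal reproduces the serial fine solution up to rounding; the practical interest lies in reaching acceptable accuracy in far fewer iterations.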

MS29

On the Parallelization of Non-Uniform Convolutions

The non-uniform convolution (NUCONV) refers to the discrete convolution with non-equally spaced data at input, or output, or both. This irregularity in sample locations complicates the data structure and the concurrent relation in parallel computation. We introduce a systematic methodology for parallel NUCONV in one to three dimensions, and present an auto-tuning library for NUCONV on multicore processors using POSIX threads.

Nikos Pitsianis
Department of Electrical & Computer Engineering
Duke [email protected]


MS30

Flipping Edges and Vertices in Graphs

We study a certain random process on a graph G which is a variation of a classical voter model and is also a special case of the so-called Tsetlin library random walk. Initially each vertex of G is colored either blue or red. At each step an edge is chosen at random and both endpoints change their colors to blue with probability p and to red otherwise. This edge-flipping process corresponds to a random walk on the associated state graph in which each coloring configuration is a node. We show that the eigenvalues for the random walk on the state graph can be indexed by subsets of the vertex set of G. For example, for the uniform case of p = 1/2, for each subset T of the vertex set V of G, the eigenvalue λ_T (with multiplicity 1) is the ratio of the number of edges in the induced subgraph of T over the total number of edges in G. We analyze the stationary distribution of the state graph of colorings of G for several special families of graphs, such as paths, cycles and trees. We also mention related problems in connection with memoryless games.

Fan Chung-Graham
Department of Mathematics
University of California at San [email protected]

Ron [email protected]
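
The eigenvalue statement for p = 1/2 can be checked numerically on small graphs. The sketch below builds the transition matrix of the edge-flipping walk on all 2^n colorings and compares the power sums trace(P^k) with Σ_T λ_T^k, where λ_T is the fraction of edges lying inside the vertex subset T. Comparing power sums rather than computed eigenvalues is a choice made here for numerical robustness; the function names are hypothetical.

```python
from itertools import combinations, product

import numpy as np


def transition_matrix(n, edges, p=0.5):
    """Edge-flipping walk on colorings of n vertices (1 = blue, 0 = red)."""
    states = list(product((0, 1), repeat=n))
    index = {s: i for i, s in enumerate(states)}
    P = np.zeros((len(states), len(states)))
    for s in states:
        for (u, v) in edges:               # pick an edge uniformly at random
            for color, prob in ((1, p), (0, 1.0 - p)):
                t = list(s)
                t[u] = color
                t[v] = color               # both endpoints take the same color
                P[index[s], index[tuple(t)]] += prob / len(edges)
    return P


def predicted_eigenvalues(n, edges):
    """lambda_T = (# edges inside T) / (# edges), over all subsets T of vertices."""
    values = []
    for size in range(n + 1):
        for T in combinations(range(n), size):
            members = set(T)
            inside = sum(1 for (u, v) in edges if u in members and v in members)
            values.append(inside / len(edges))
    return sorted(values)


def power_sums_match(P, values, max_power=8, tol=1e-10):
    """Check trace(P^k) == sum(lambda^k) for k = 1..max_power."""
    values = np.asarray(values)
    for k in range(1, max_power + 1):
        lhs = np.trace(np.linalg.matrix_power(P, k))
        if abs(lhs - np.sum(values ** k)) > tol:
            return False
    return True


path_edges = [(0, 1), (1, 2)]
P = transition_matrix(3, path_edges)
print(predicted_eigenvalues(3, path_edges))
print(power_sums_match(P, predicted_eigenvalues(3, path_edges)))
```

For the path on three vertices the predicted spectrum is 1, 1/2, 1/2 and five zeros, one value for each of the eight vertex subsets.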

MS30

The Spectre of the Spectrum: Spectral Insights Across Graphs

In this talk, we present insights from analyzing a suite of common discrete graph metrics (connected components, clustering coefficients, cores, for example) across a range of real-world graphs and synthetic graph models. We also analyze spectral properties of the adjacency, Laplacian, and modularity matrices of these graphs in normalized and unnormalized versions. This includes finding complete spectra for graphs with hundreds of thousands of nodes.

David F. Gleich
Sandia National [email protected]


MS30

Eigenspace Analysis for Subgraph Detection

We describe statistical tests for anomalous subgraph detection and localization using spectral properties of a graph’s modularity matrix. In addition to a Chi-squared test in the matrix’s principal two-dimensional subspace, we use a test based on L1 properties of less significant eigenvectors to detect weaker anomalies. The techniques are extended to dynamic graphs and demonstrated on real and simulated data.

Nadya Bliss, Benjamin Miller
MIT Lincoln [email protected], [email protected]

Patrick J. Wolfe
Statistics and Information Sciences Laboratory
Harvard [email protected]
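
For readers unfamiliar with the modularity matrix, the following sketch uses the standard definition B = A − k kᵀ/(2m), with A the adjacency matrix, k the degree vector and m the number of edges, and projects the vertices onto the principal two-dimensional eigenspace of B. It does not reproduce the statistical tests described in the abstract; the toy graph (two triangles joined by one edge) and the function names are assumptions for illustration.

```python
import numpy as np


def modularity_matrix(A):
    """Modularity matrix B = A - k k^T / (2m) of an undirected graph."""
    k = A.sum(axis=1)            # degree vector; k.sum() equals 2m
    return A - np.outer(k, k) / k.sum()


def principal_coordinates(A, dim=2):
    """Largest `dim` eigenvalues of B and the matching eigenvectors."""
    vals, vecs = np.linalg.eigh(modularity_matrix(A))
    order = np.argsort(vals)[::-1][:dim]
    return vals[order], vecs[:, order]


def two_triangles():
    """Toy graph: triangles {0,1,2} and {3,4,5} joined by the edge (2,3)."""
    A = np.zeros((6, 6))
    for u, v in [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]:
        A[u, v] = A[v, u] = 1.0
    return A


A = two_triangles()
vals, coords = principal_coordinates(A)
print(vals)
print(np.sign(coords[:, 0]))
```

On this toy graph the leading eigenvector has one sign on the first triangle and the opposite sign on the second, which is the kind of structure a test in the principal subspace can pick up.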

MS30

The Laplacian Paradigm: Emerging Algorithms for Massive Graphs

This presentation describes an emerging paradigm for the design of efficient algorithms for massive graphs. This paradigm, which we will refer to as the Laplacian Paradigm, is built on a recent suite of nearly-linear time primitives in spectral graph theory developed by Spielman and Teng, especially their solver for linear systems Ax = b, where A is the Laplacian matrix of a weighted, undirected n-vertex graph and b is an n-place vector. In the Laplacian Paradigm for solving a problem (on a massive graph), we reduce the optimization or computational problem to one or multiple linear algebraic problems that can be solved efficiently by applying the nearly-linear time Laplacian solver. The Laplacian Paradigm has already had some successes. It has been applied to obtain nearly-linear-time algorithms for applications in semi-supervised learning, image processing, web-spam detection, eigenvalue approximation, and for solving elliptic finite element systems. It has also been used to design faster algorithms for generalized lossy flow computation and for random sampling of spanning trees. The goal of this presentation is to encourage more researchers to consider the use of the Laplacian Paradigm to develop faster algorithms for solving fundamental problems in combinatorial optimization (e.g., the computation of matchings, flows and cuts), in scientific computing (e.g., spectral approximation), in machine learning and data analysis (such as for web-spam detection and social network analysis), and in other applications that involve massive graphs.

Shanghua Teng
University of Southern [email protected]

Daniel Spielman
Yale [email protected]
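
The linear system at the heart of the paradigm can be illustrated on a tiny graph. The sketch below assembles the Laplacian of a weighted, undirected graph and solves Ax = b with a dense least-squares call, which is only a stand-in for the nearly-linear time solver and would not scale to massive graphs. The helper names and the path-graph example are assumptions made for this illustration.

```python
import numpy as np


def laplacian(n, weighted_edges):
    """Laplacian of a weighted undirected graph given as (u, v, weight) triples."""
    L = np.zeros((n, n))
    for u, v, w in weighted_edges:
        L[u, u] += w
        L[v, v] += w
        L[u, v] -= w
        L[v, u] -= w
    return L


def solve_laplacian(L, b):
    """Dense stand-in solver: minimum-norm least-squares solution of L x = b.

    L is singular (constant vectors lie in its null space), so b must sum to
    zero on each connected component for an exact solution to exist.
    """
    x, *_ = np.linalg.lstsq(L, b, rcond=None)
    return x


# Path graph 0 - 1 - 2 - 3 with unit weights; push one unit of flow from 0 to 3.
edges = [(0, 1, 1.0), (1, 2, 1.0), (2, 3, 1.0)]
L = laplacian(4, edges)
b = np.array([1.0, 0.0, 0.0, -1.0])
x = solve_laplacian(L, b)
print(x[0] - x[3])
```

Reading the graph as a resistor network, the printed potential difference is the effective resistance between the two end vertices, here three unit resistors in series.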

MS31

Integrating Efficiency Into Designing Performance Models

Characterizing the performance of scientific applications is essential for effective code optimization. Limited tools exist to support the model building process that involves extracting detailed information about the application, platforms, and their interactions. We have designed a suite of tools to automate the model building process, providing tools to design models, validate models, and measure model parameters. We will present an example in designing a performance model for selecting messaging schemes for an electromagnetic simulation.

Van Bui
University of Illinois at [email protected]

Boyana Norris
Argonne National [email protected]

William D. Gropp
University of Illinois at Urbana-Champaign
Dept of Computer [email protected]

MS31

Predictive Models of Memory Subsystem Performance for Sparse Scientific Computing

A large number of computational modeling and simulation applications rely on sparse data structures and algorithms. Such sparse computations are inherently scalable yet difficult to tune and adapt for energy-aware high performance on modern multicore architectures, since the irregular and complex access patterns that characterize these codes create many bottlenecks. Consequently, we develop memory-based interaction models between such applications and hardware to enable performance optimizations at the software and hardware layers of advanced computing systems.

Michael Frasca
The Pennsylvania State [email protected]

Padma Raghavan
The Pennsylvania State Univ.
Dept of Computer Science [email protected]

MS31

Performance Modeling for Systematic Performance Tuning

The performance of parallel scientific applications depends on many factors. Especially on large systems, it is too expensive to explore the solution space with a series of benchmarks. Analytical models allow estimating and extrapolating their execution performance, bottlenecks, and the potential impact of optimization options. We propose to use such performance modeling techniques from the beginning of the application’s design. We will motivate the use of performance modeling with several examples.

William D. Gropp
University of Illinois at Urbana-Champaign
Dept of Computer [email protected]

Torsten Hoefler
University of [email protected]

Marc Snir
University of [email protected]

MS31

A Dynamic Runtime Optimization Framework for OpenUH’s OpenMP Runtime

We present the design and implementation of an OpenMP dynamic optimization framework based on the OpenUH Collector Interface. This framework allows user-defined dynamic libraries to register callback functions for predefined events with the purpose of gathering runtime information and affecting runtime system behavior. Users are able to focus on implementing various optimizations without needing to modify the program. Interesting applications of this framework related to performance modeling are discussed.

Besar Wicaksono, Ramachandra Nanjegowda, Brett Estrade, Barbara Chapman
University of [email protected], [email protected], [email protected], [email protected]

MS32

Fast Higher Order FD like Schemes with Sobolev Type Norm Minimization

Fast and stable numerical discretization schemes based on a Sobolev-like norm minimization are presented. These are used to construct Finite Difference like weights for differentiation and integration, and to construct higher-order PDE solvers for elliptic boundary value problems in planar domains. Specific examples derived from engineering areas are considered over complex domains; the related stability and convergence issues are discussed. The performance of the schemes is compared to that of FEM techniques in terms of convergence and computational speed.

Shivkumar Chandrasekaran
University of California, Santa Barbara
Department of Electrical and Computer [email protected]

Karthik Jayaraman Raghuram
Dept. of Electrical Engineering
University of California, Santa [email protected]

Joseph Moffitt
Dept. of Electrical and Computer Engineering
U C Santa [email protected]

Hrushikesh Mhaskar
California State University, Los Angeles
Dept. of [email protected]

Ming Gu
University of California, Berkeley
Mathematics [email protected]

MS32

Structured Low-Rank Matrix Recovery via Convex Optimization

In this talk, we discuss new algorithms for computing structured low-rank matrix approximations to a given matrix using convex optimization techniques. The structures we consider include Hankel matrices and banded matrices. We show how our techniques can be used to solve practical problems in signal processing applications.


MS32

Existence of H-matrix Approximants to the Inverse Finite-Element Matrix of Electrodynamic Problems and H-Based Fast Direct Finite-Element Solvers

We prove that the inverse of the sparse matrix resulting from a finite-element-based analysis of electrodynamic problems has a data-sparse H-matrix approximation. We thus develop an H-matrix-based direct finite-element solver having O(kN log N) storage units and O(k^2 N log^2 N) operation counts for electromagnetic analysis, where k is a parameter that is adaptively determined based on accuracy requirements. The complexity is further reduced to O(M log M) in storage and O(N log^2 M) in time for layered structures, where M is the number of single-layer unknowns.

Dan Jiao
Electrical and Computer Engineering
Purdue [email protected]

Haixin Liu
Purdue University
Electrical and Computer [email protected]

MS32

Parallel Algorithms for Hierarchically Semi-separable Structures

Much progress has been made in fast algorithms for structured linear systems, such as those involving hierarchically semi-separable (HSS) matrices. Nearly linear time factorization algorithms have been developed to solve these systems. A key idea behind these algorithms is to fully exploit numerical low rankness in these structured matrices. In this talk, we present new parallel algorithms for HSS matrix operations and their use in the context of factorization-based sparse solvers and preconditioners.

Xiaoye Sherry Li
Computational Research Division
Lawrence Berkeley National [email protected]

Shen Wang
Purdue [email protected]

Jianlin Xia
Department of Mathematics
Purdue [email protected]


MS32

Efficient Structured Solution of Large Sparse Linear Systems

We present some new techniques for the structured solution of sparse linear systems. They include new semiseparable structure generation under different circumstances, new structured factorizations, and their applications to sparse problems. The efficiency, robustness and scalability of this sparse solution are analyzed. Numerical examples in terms of PDEs and some real-world applications are shown.

Jianlin Xia
Department of Mathematics
Purdue [email protected]


MS33

Cool@hpc

A non-standard numerical approach called COOL (Constraint Oriented Library) is presented. It makes it possible to solve PDEs without generating non-physical solutions. External constraints such as ∇·u = 0 can be handled by an elimination process, thus reducing the number of variables to that of the physical problem. Operators with singularities in the spectrum, as in Maxwell’s equations, are well represented. It is shown how this method will be implemented to run on HPC machines.

Medji Azaiez
Université de [email protected]

MS33

An Eigen-Based High Order Expansion Basis for Spectral Elements

We present an efficient high-order expansion basis for the spectral element approach. This belongs to the category of modal basis, but it is not hierarchical. The interior modes are constructed by solving a small generalized eigenvalue problem, while the boundary modes are constructed based on such eigenfunctions in lower dimensions. We compare this expansion basis with the commonly-used Jacobi polynomial-based expansion basis, and demonstrate the significantly superior numerical efficiency of the new basis in terms of conditioning and the number of iterations to convergence for iterative solvers.

Xiaoning Zheng, Suchuan Dong
Purdue [email protected], [email protected]

MS33

Radial Basis Functions for Planetary Scale Flows: Recent Developments

The talk will concentrate on the development of radial basis function methods for fluid modeling. Applications will be geared towards the geosciences. Recent advances in algorithm development to make the methodology more computationally effective will be discussed.

Natasha Flyer
National Center for Atmospheric Research
Institute for Mathematics Applied to [email protected]

MS33

The Lower Bounds for Eigenvalues of Elliptic Operators - by Nonconforming Finite Element Methods

We propose a condition and prove that it is sufficient to guarantee that nonconforming finite element methods produce lower bounds for eigenvalues of symmetric elliptic operators. We show that this condition holds for the most used nonconforming elements, e.g., the Wilson element, the nonconforming linear element by Crouzeix and Raviart, the nonconforming rotated Q1 element by Rannacher and Turek, and the enriched nonconforming rotated Q1 element by Lin, Tobiska and Zhou for the second order elliptic operators; the Morley element, the Adini element and the enriched Adini element by Hu and Shi for the fourth order elliptic operators; and the Morley-Wang-Xu element for the 2m-th order elliptic operator. Hence they will give lower bounds for eigenvalues of these operators. Moreover, we follow the sufficient condition to propose two new classes of nonconforming elements for the second order elliptic operators and prove that they will yield lower bounds for eigenvalues.

Yunqing Huang
Xiangtan University, China
Department of [email protected]

MS33

New Efficient Spectral Methods for High-Dimensional PDEs

Many scientific, engineering and financial applications require solving high-dimensional PDEs. However, traditional tensor product based algorithms suffer from the so-called "curse of dimensionality". We shall construct a new sparse spectral method for high-dimensional problems, and present, in particular, rigorous error estimates as well as efficient numerical algorithms for some typical PDEs.

Jie ShenDepartment of MathematicsPurdue [email protected]
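
To give a feel for the "curse of dimensionality" mentioned in the abstract above, the following sketch counts basis functions in a full tensor-product index set and in a hyperbolic cross index set. The hyperbolic cross is only one common choice of sparse index set and is an assumption here; the abstract does not say which sparse construction the talk uses, and the function names are hypothetical.

```python
def full_tensor_count(n, d):
    """Number of multi-indices k in {0, ..., n}^d (full tensor-product basis)."""
    return (n + 1) ** d


def hyperbolic_cross_count(n, d):
    """Number of multi-indices k in {0, ..., n}^d with prod(max(k_i, 1)) <= n."""
    def count(budget, dims):
        # budget is the largest product still allowed for the remaining dims
        if dims == 0:
            return 1
        total = 0
        for k in range(budget + 1):
            weight = max(k, 1)
            # integer products: rest <= budget / weight  <=>  rest <= budget // weight
            total += count(budget // weight, dims - 1)
        return total
    return count(n, d)


if __name__ == "__main__":
    for d in (2, 4, 6):
        print(d, full_tensor_count(16, d), hyperbolic_cross_count(16, d))
```

The full tensor-product count grows like (n + 1)^d, while the hyperbolic cross count grows far more slowly with d for fixed n.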

MS34

Comparison of Search Strategies in Empirical Performance Tuning of Linear Algebra Kernels

We have developed an empirical tuning system, Orio, which is aimed at improving both application performance and developer productivity by generating and evaluating multiple optimizations based on developer annotations of key computations. The size of the search space that Orio explores empirically depends exponentially on the number of performance parameters, making exhaustive search impractical in most cases. We compare different search strategies in the context of tuning several linear algebra kernels on multiple platforms.

Thomas NelsonUniversity of Colorado at Boulder

Argonne National [email protected]

Qian Zhu, Boyana Norris, Prasanna BalaprakashArgonne National [email protected], [email protected], [email protected]
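
The trade-off between exhaustive and budgeted search described in the abstract above can be sketched on a toy parameter space. This is not Orio's interface: the parameter names, the synthetic cost function and both search routines below are hypothetical stand-ins for timing a generated kernel variant.

```python
import itertools
import random

# Hypothetical performance parameters; a real tuner would time generated code.
SPACE = {
    "tile_i": [8, 16, 32, 64, 128],
    "tile_j": [8, 16, 32, 64, 128],
    "unroll": [1, 2, 4, 8],
}


def toy_cost(cfg):
    """Synthetic 'run time' with its minimum at tile_i=32, tile_j=64, unroll=4."""
    tile_i, tile_j, unroll = cfg
    return 1.0 + abs(tile_i - 32) / 32 + abs(tile_j - 64) / 64 + abs(unroll - 4) / 4


def exhaustive_search(space, cost):
    """Evaluate every configuration; the count grows exponentially with the number of parameters."""
    best = min(itertools.product(*space.values()), key=cost)
    return best, cost(best)


def random_search(space, cost, budget, seed=0):
    """Evaluate only `budget` randomly sampled configurations."""
    rng = random.Random(seed)
    names = list(space)
    best, best_cost = None, float("inf")
    for _ in range(budget):
        cand = tuple(rng.choice(space[name]) for name in names)
        c = cost(cand)
        if c < best_cost:
            best, best_cost = cand, c
    return best, best_cost
```

With three parameters the full space has 100 points; each added parameter multiplies that count, which is why budgeted strategies become necessary.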

MS34

Automatic Performance Tuning with BTO BLAS

Current hardware trends require that basic linear algebra routines take advantage of multiple cores. The BTO tool fuses multiple linear algebra operations and automatically recognizes opportunities for parallelism as it generates basic linear algebra routines. This talk describes how the BTO compiler extracts parallelism and shows that such parallelism can achieve further speedups for fused routines.

Jeremy SiekDepartment of Electrical and Computer EngineeringUniversity of Colorado at [email protected]

Geoffrey BelterDept. of Electrical, Computer, and Energy EngineeringUniversity of Colorado at [email protected]

MS34

Diagnosis, Tuning, and Redesign for Multicore

We describe how we improved the within-node scalability of the fast multipole method (FMM) through a systematic sequence of modeling, analysis, and tuning steps. On a quad-socket Intel Nehalem-EX system, we show speedups of 1.7 over the previous best multithreaded implementation and 19.3 over a sequential but highly tuned (e.g., SIMD-vectorized) code, and we match or outperform a GPGPU implementation. Our study suggests a more general tuning process that practitioners and tools could themselves apply.

Aparna Chandramowlishwaran, Richard VuducGeorgia Institute of [email protected], richie@cc.gatech.edu

Kamesh MadduriLawrence Berkeley National [email protected]

MS34

Modeling Loop Fusion on Shared Memory Parallel Architectures

The performance of many scientific applications is most accurately expressed in terms of data movement. Loop fusion is a technique that can decrease or increase data movement. In this talk we describe a model that accurately captures large performance effects of fusing loops on shared memory parallel architectures. We show how the model economically compares different implementations of the same routine with varying amounts of loop fusion within a compiler framework without sacrificing kernel efficiency.

Ian Karlin
Department of Computer Science
University of Colorado at Boulder
[email protected]

Elizabeth JessupUniversity of Colorado at BoulderDepartment of Computer [email protected]
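
As a minimal illustration of why loop fusion changes data movement, the sketch below computes z = 2(x + y) with two separate loops and with one fused loop, alongside a deliberately crude traffic count. The count assumes no cache reuse between loops and is not the model presented in the talk; all names are hypothetical.

```python
def unfused(x, y):
    """Two passes: the temporary t is written to memory and then read back."""
    t = [a + b for a, b in zip(x, y)]   # reads x and y, writes t
    return [2.0 * v for v in t]         # reads t, writes z


def fused(x, y):
    """One pass: the temporary stays in a register."""
    return [2.0 * (a + b) for a, b in zip(x, y)]   # reads x and y, writes z


def traffic_words(n, is_fused):
    """Vector elements moved to or from memory, assuming no cache reuse across loops."""
    return 3 * n if is_fused else 5 * n
```

In this example fusion removes the write and re-read of the temporary. Fusion can also increase traffic, as the abstract notes, for instance when the fused loop body no longer fits in cache.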

MS35

Presenters To Be Announced


TBD [email protected]

MS36

Sparse Interpolatory Models for Molecular Potential Energy Surfaces

We describe a parallel algorithm for generating interpolatory approximations to molecular potential energy surfaces. We show how that algorithm can be applied to efficiently model a transition from a stable ground state, to an excited state, and finally to a different stable ground state.

Carl T. KelleyNorth Carolina State UnivDepartment of Mathematicstim [email protected]

Alexei BykhovskiDept of Electrical and Computer EngineeringNC State [email protected]

David MokrauerDept of MathematicsNC State [email protected]

MS36

Optimal Placement of Gamma-monitoring Beacons using the Mesh Adaptive Direct Search Algorithm

The deployment of gamma-monitoring (GMON) beacons is considered by the Research Institute of Hydro-Quebec in order to improve the accuracy of the snowpack estimate and thereby manage the hydrological forecast throughout the year, especially at critical times such as spring snowmelt. The placement of these beacons is critical and may be seen as an optimization problem in which the GMON locations are the variables and the objective is to minimize the approximation error. Map constraints as well as the error computation categorize this problem as a difficult blackbox problem, for which the Mesh Adaptive Direct Search (MADS) algorithm is designed. In addition, the fact that the variables correspond to two-dimensional locations of objects suggests grouping variables in order to decompose the problem. Different grouping and regrouping strategies are developed and supported by a convergence analysis.

Sebastien Le DigabelGERAD - Polytechnique [email protected]

Stephane Alarie

[email protected]

Charles AudetEcole Polytechnique de Montreal - [email protected]

Vincent GarnierPolytechnique [email protected]

Louis-Alexandre [email protected]

MS36

Robust Design for Cardiovascular Surgery Applications using the Surrogate Management Framework

Recent work has demonstrated substantial progress in patient-specific cardiovascular flow simulations, and a need for customization of treatment plans for individual patients. We present a unified framework for derivative-free optimization of cardiovascular geometries with pulsatile flow that incorporates uncertainties. Optimization is performed using the surrogate management framework with mesh adaptive direct search. We incorporate uncertainties using adaptive stochastic collocation to perform robust design. We apply these tools to several pediatric and adult cardiovascular surgery applications.

Alison MarsdenDepartment of Mechanical and Aerospace EngineeringUniversity of California, San [email protected]

MS36

Applied Optimization Without Derivatives

Through a set of diverse applications, we provide an overview of challenges faced when optimizing functions of complex simulations. These difficulties are compounded by the computational expense of many simulations and include computational noise, constraints without relaxations, mixtures of continuous and discrete variables, and availability of some derivatives. These examples, and the applications presented throughout the minisymposium, illustrate the potential for methods in this area to make significant contributions to computational science and engineering.

Stefan Wild, Jorge More’Argonne National [email protected], [email protected]

MS37

Stability of Methods for Quasiseparable Matrices

In this talk we discuss computation of the matrix-vector product for a quasiseparable matrix, a computation at the heart of many fast algorithms involving the class. The usual parametrization of a quasiseparable matrix can introduce errors, even when algorithms are structured backward stable. We derive a nested factorization involving Householders to give a fast, stable algorithm. This talk is based on joint work with Vadim Olshevsky and Michael Stewart.

Tom Bella
University of Rhode Island
Department of [email protected]

Vadim OlshevskyUniversity of ConnecticutDepartment of [email protected]

Michael StewartGeorgia State UniversityDepartment of [email protected]

MS37

Polynomial Representations for Matrices: How they Work and What One can do with Them

An input vector presented to a matrix can be viewed as a sequence of data that is processed in order of appearance, producing a similarly linearly ordered output vector. Block-diagonal matrices can then be viewed as instantaneous operators producing an output at the same index number as the input. In a further step, a matrix can be viewed as a collection of shifted (block-)diagonals. A banded matrix is then polynomial in the shift operator. The question arises whether there exist (minimal) rational representations for matrices in a sense that parallels rational representations for functions in a variable, whether general matrices can be so approximated, whether there exist banded matrices with banded inverses and whether interpolation theories can be derived to obtain low complexity approximations of general matrices. The results are in many ways surprising as most classical theories have matricial counterparts. The theory produces new representations for the celebrated semi-separable matrices and a new vista on classical interpolation problems such as Löwner and Schur-Takagi interpolation.

Patrick Dewilde
Institute for Advanced Study, Technische Universität München
and CAS Section, EEMCS, TU [email protected]

MS37

A Structured Linear Algebra Problem in Space Imaging

Resolution quality of image deblurring algorithms depends on accurate knowledge of the blurring operator, or point spread function (PSF). When imaging objects in space using telescopes, a wavefront sensor (WFS) is used to estimate the PSF. However, the estimation quality is limited because the WFS collects low resolution measurements. We describe a new approach to obtain high resolution PSFs from multiple low resolution WFS measurements. Efficiency is obtained by exploiting structured and sparse matrix computations.


MS37

Orthogonal Methods for the Solution of Quasiseparable Systems

This talk describes a nested product decomposition of a quasiseparable matrix A. The decomposition represents A as a nested product of Householders and a sequence of very sparse matrices that each have only a few nonzeros. The cost of computing the decomposition from an n × n matrix with off-diagonal ranks bounded by r is O(r^2 n^2). The cost of using the decomposition to solve a system of equations or multiply a vector by A is O(rn). The decomposition and the system solver are both backward stable, and a proof of this fact is briefly summarized in the talk.

Michael StewartGeorgia State UniversityDepartment of [email protected]
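
To make the O(rn) matrix-vector cost concrete, here is a sketch for the simplest case r = 1, using the generator form A[i, j] = u[i] v[j] below the diagonal, d[i] on it, and p[i] q[j] above it. This is the conventional generator-based product for a special (generator-representable) subclass, not the nested Householder decomposition of the talk, and it makes no claim about backward stability; the function names are hypothetical.

```python
import numpy as np


def dense_from_generators(d, u, v, p, q):
    """Assemble the n x n matrix explicitly (O(n^2) storage), for checking only."""
    return np.tril(np.outer(u, v), -1) + np.diag(d) + np.triu(np.outer(p, q), 1)


def fast_matvec(d, u, v, p, q, x):
    """Compute A @ x in O(n) operations using two running sums."""
    n = len(x)
    y = d * x                      # diagonal contribution (new array)
    lower = 0.0                    # accumulates sum_{j < i} v[j] * x[j]
    for i in range(n):
        y[i] += u[i] * lower
        lower += v[i] * x[i]
    upper = 0.0                    # accumulates sum_{j > i} q[j] * x[j]
    for i in range(n - 1, -1, -1):
        y[i] += p[i] * upper
        upper += q[i] * x[i]
    return y
```

For off-diagonal rank r the running sums become length-r vectors, which gives the O(rn) count quoted in the abstract.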

MS38

Modelling, Algorithm and Simulation of Wave Motion in Quantum and Plasma Physics

In this talk, I begin with a review of several mathematical models for describing wave motion in quantum and plasma physics. Computational difficulties for simulating wave propagation and interaction in quantum and plasma physics are discussed. Efficient and accurate numerical algorithms for computing ground and excited states as well as the dynamics of the nonlinear Schroedinger equation are presented. Extensive simulation results of wave propagation and interaction in quantum and plasma physics are reported. Finally, some conclusions are drawn.

Weizhu BaoNational University of SingaporeDepartment of [email protected]

MS38

Optimal Error Estimates of Finite Difference Methods for the Gross-Pitaevskii Equation with Angular Momentum Rotation

We analyze finite difference methods for the Gross-Pitaevskii equation with an angular momentum rotation term in two and three dimensions and obtain the optimal convergence rate, for the conservative Crank-Nicolson finite difference (CNFD) method and the semi-implicit finite difference (SIFD) method, at the order of O(h^2 + τ^2) in the l^2-norm and discrete H^1-norm with time step τ and mesh size h. Besides the standard techniques of the energy method, the key technique in the analysis of the SIFD method is mathematical induction, while for the CNFD method it is to obtain an a priori bound of the numerical solution in the l^∞-norm by using the inverse inequality and the l^2-norm error estimate.

Yongyong CaiNational University of [email protected]

MS38

Title Not Available at Time of Publication


Laurent Di Menza
University Paris 11
[email protected]

MS38

Critical Angular Velocities of Vortex Nucleation in Rotating Bose-Einstein Condensation

Recently, nucleation of vortices in rotating Bose-Einstein condensates (BEC) has been the subject of intensive experimental and theoretical research. For a two-dimensional rotating BEC, we numerically study the critical angular velocities of vortex nucleation. The Gross-Pitaevskii equation in a rotating frame is used as the mathematical model, which is solved by using the time-splitting Fourier spectral method. Our numerical results agree with the theoretical predictions in the literature.

Yanzhi Zhang
Missouri University of Science and [email protected]
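
The time-splitting Fourier spectral idea named in the abstract above can be sketched in a much simpler setting: one space dimension, no rotation term, periodic boundary conditions, and the equation i psi_t = -0.5 psi_xx + V psi + beta |psi|^2 psi. These simplifications, the parameter values and the function names are assumptions made for illustration; the rotating two-dimensional solver of the talk is more involved.

```python
import numpy as np


def strang_step(psi, dt, k, V, beta):
    """One Strang splitting step: half potential, full kinetic, half potential."""
    # The potential/nonlinear part only changes the phase, so |psi| is constant
    # during this substep and the exponential solves it exactly.
    psi = psi * np.exp(-0.5j * dt * (V + beta * np.abs(psi) ** 2))
    # The kinetic part is diagonal in Fourier space.
    psi = np.fft.ifft(np.exp(-0.5j * dt * k ** 2) * np.fft.fft(psi))
    psi = psi * np.exp(-0.5j * dt * (V + beta * np.abs(psi) ** 2))
    return psi


n, length = 128, 16.0
dx = length / n
x = -length / 2 + dx * np.arange(n)
k = 2 * np.pi * np.fft.fftfreq(n, d=dx)      # angular wavenumbers
V = 0.5 * x ** 2                              # harmonic trap


def mass(psi):
    """Discrete L2 norm squared, which every substep conserves."""
    return np.sum(np.abs(psi) ** 2) * dx


def evolve(psi, steps, dt=1e-3, beta=10.0):
    for _ in range(steps):
        psi = strang_step(psi, dt, k, V, beta)
    return psi


psi0 = np.exp(-x ** 2 / 2).astype(complex)
psi0 /= np.sqrt(mass(psi0))
```

Each substep multiplies by a unit-modulus factor in physical or Fourier space, so the discrete mass is conserved up to rounding.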

MS39

Structure Preserving Interpolatory Model Reduction

Dynamical systems are the basic framework for modeling and control of an enormous variety of complex systems. Rational Krylov methods are often capable of providing nearly optimal approximating subspaces for efficient reduction of large-scale, complex dynamical systems. An interpolation framework for model reduction is presented that includes rational Krylov methods as a special case. This broader framework allows retention of special structure in the reduced order models that is often encoded in the system parameterization such as symmetry, internal delays, and port-Hamiltonian structure.


MS39

Structure-preserving Model Reduction for MEMS

Simulations of damped vibration in micro-electro-mechanical systems (MEMS) typically lead to non-Hermitian problems which depend nonlinearly on a frequency parameter. In this talk, we discuss model reduction methods which use three types of structure present in the full discrete model: algebraic structure, such as complex symmetry of the system matrices, that is inherited from the underlying PDEs; geometric structures such as symmetry groups or the presence of beam-like or plate-like components; and perturbative structure that arises from disparate time scales in the physics. We illustrate our methods with models of several resonant microstructures.

David BindelCornell [email protected]

MS39

Model Order Reduction for RC Networks with Many Terminals

A novel approach for reducing multi-terminal RC circuits is proposed. Using graph-based circuit partitionings in combination with state-of-the-art model reduction methods, structure and sparsity in the reduced model are enhanced. The reduced models are easily converted to a circuit representation. These contain many fewer nodes and circuit elements than the original circuit, allowing faster simulations at little accuracy loss.

Roxana IonutiuTU [email protected]

Joost RommesNXP Semiconductors, The [email protected]

Wil SchildersTechnical University of [email protected]

MS39

Model Reduction and Vertex Cuts

We report on an algorithm for model reduction that relies heavily on the computation of small cardinality vertex cuts in a graph representing resistances between the external nodes of a circuit. The approach we propose is unified and conceptually simple. Furthermore, our results are at least as good as those previously reported in the literature.

Odile MarcotteCRM and [email protected]

Suzanne M. ShontzThe Pennsylvania State [email protected]

Wil SchildersTechnical University of [email protected]

Petko KitanovUniversity of [email protected]

MS40

Reduced Basis Method for Radar Cross Section Computation of a Pacman Scattering Problem

We consider the scattering of TM-polarized electromagnetic waves by a perfectly conducting 2D cylinder with a cut-out wedge. The parameter of this problem is the angle of the wedge. An interesting phenomenon of this problem is that the fields, and thus the bistatic radar cross section (RCS), change dramatically with only a small change in the wedge angle. The reduced basis method (RBM) is applied to this nonlinear problem. An extensive test of the algorithm shows exponential convergence of the RB solutions with roughly constant effectivity index. A study of the monostatic scattering as a function of the wedge angle reveals the capability of RBM to capture the critical wedge angle for the optimal reduction of the backscatter.

Yanlai ChenDepartment of MathematicsUniversity of Massachusetts [email protected]

Jan S. Hesthaven
Brown University
Division of Applied [email protected]

Yvon MadayUniversite Pierre et Marie Curieand Brown [email protected]

MS40

Reduced Basis A Posteriori Error Bounds for Linear Quadratic Optimal Control Problems

We present a posteriori error bounds for the solution of linear quadratic optimal control problems using reduced basis methods. We consider error bounds for the optimal control input and the corresponding output of interest as well as for the error in the optimal cost function. Our bounds are online efficient, i.e., the computation cost is independent of the underlying truth approximation, and are provable upper bounds for the quantities of interest.

Martin Grepl, Daniel Amat, Mark KaercherRWTH Aachen [email protected], [email protected],[email protected]

MS40

Reduced Basis and Freeform Deformation for Shape Optimization

We present a new approach for shape optimization that combines two different types of model reduction: a suitable low-dimensional parametrization of the geometry (yielding a geometrical reduction) combined with reduced basis methods (yielding a reduction of computational complexity). More precisely, free-form deformation techniques are introduced for the geometry description and its parametrization, while reduced basis methods are used upon a finite element discretization to solve systems of parametrized partial differential equations. This allows an efficient flow field computation and cost functional evaluation during the iterative optimization procedure, resulting in effective computational savings with respect to usual shape optimization strategies. This approach is very general and can be applied to a broad variety of problems. To demonstrate its effectiveness, we apply it to find the optimal shape of aorto-coronaric bypass anastomoses based on vorticity minimization in the down-field region.

Alfio QuarteroniEcole Pol. Fed. de [email protected]

Gianluigi RozzaMathematics Institute of Computational Science andEng, [email protected]

Andrea ManzoniEcole Pol. Fed. de [email protected]

MS40

An "Uncertainty Region" Reduced Basis Approach to Parameter Estimation

We present a novel reduced basis approach for the solution of parameter estimation problems. The method allows efficient calculation of the "uncertainty region" in parameter space, where the uncertainty region is the set of all parameter values consistent with measurements. The method exploits the reduced basis method to rapidly solve the underlying parametrized PDEs. The computed uncertainty region incorporates (reduced-order) modelling and experimental error, needs no regularization, and can handle non-convex and non-simply connected domains.

Karen VeroyAICESRWTH Aachen [email protected]

Martin GreplRWTH Aachen [email protected]

MS41

On the Application of Estimation Theory to Complex System Design Under Uncertainty

Complex system design is a multifidelity multidisciplinary task that often begins with large initial uncertainties in all aspects of the system. We present a method for systematically reducing these uncertainties by casting the design process as a discovery procedure built on the tools of estimation theory. Further, we incorporate methods of global sensitivity analysis to manage the multifidelity multidisciplinary nature of the system throughout the process.

Doug Allaire, Karen WillcoxMassachusetts Institute of [email protected], [email protected]

MS41

Advanced Dynamically Adaptive Algorithms for Stochastic Simulations on Extreme Scales

In spite of the recent and continuing growth in computational resources available to the scientific community, full-scale resolution of most complex stochastic models is completely intractable. The situation becomes even more serious because of the need to conduct uncertainty quantification. The traditional deterministic models of complex systems are now stochastic models in higher dimensional spaces, whose dimensionality is determined by a potentially very large set of random variables. We discuss the challenges that must be overcome in order to apply advanced numerical algorithms for modeling uncertainty in complex stochastic systems at extreme scales.

Cory HauckOak Ridge National [email protected]

Ralf DeiterdingOak Ridge National LaboratoryComputer Science and Mathematics [email protected]

Dongbin XiuPurdue [email protected]

Rick ArchibaldComputational Mathematics GroupOak Ridge National [email protected]

MS41

Fast Generation of Nested Space-filling Latin Hypercube Sample Designs

When "binning optimal" sample designs are recursively divided into uniform grids of cube bins, every same-sized bin contains the same number of points until the bins are small enough to all contain either 1 or 0 points. Binning optimal designs are space-filling. We construct sequences of Latin Hypercube Sample designs in which subsequent designs inherit all points from previous designs. These designs are binning optimal in the highest, lowest and certain intermediate subsets of dimensions.


George KarystinosDepartment of Electronic and Computer EngineeringTechnical University of [email protected]
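
For readers unfamiliar with Latin Hypercube Sampling, the sketch below generates a basic (non-nested) design and illustrates its defining property: in every coordinate, each of the n equal-width bins contains exactly one point. It does not implement the nested or binning-optimal constructions of the talk, and the function name is hypothetical.

```python
import numpy as np


def latin_hypercube(n, d, seed=0):
    """Return an (n, d) array in [0, 1)^d with one point per 1-D bin in each coordinate."""
    rng = np.random.default_rng(seed)
    pts = np.empty((n, d))
    for j in range(d):
        perm = rng.permutation(n)                 # which bin each point uses in dimension j
        pts[:, j] = (perm + rng.random(n)) / n    # random position inside that bin
    return pts


def bins_per_dimension(pts):
    """Sorted bin indices occupied in each coordinate."""
    n = pts.shape[0]
    return [sorted(np.floor(pts[:, j] * n).astype(int).tolist()) for j in range(pts.shape[1])]
```

A plain random sample of the same size would typically leave some one-dimensional bins empty and put several points in others.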

MS41

Optimal Uncertainty Quantification

We propose a rigorous framework for Uncertainty Quantification (UQ) in which the UQ objectives and the assumptions/information set are brought to the forefront. This framework, which we call Optimal Uncertainty Quantification (OUQ), is based on the observation that, given a set of assumptions and information about the problem, there exist optimal bounds on uncertainties: these are obtained as extreme values of well-defined optimization problems corresponding to extremizing probabilities of failure, or of deviations, subject to the constraints imposed by the scenarios compatible with the assumptions and information. In particular, this framework does not implicitly impose inappropriate assumptions, nor does it repudiate relevant information. Although OUQ optimization problems are extremely large, we show that under general conditions, they have finite-dimensional reductions. As an application, we develop Optimal Concentration Inequalities of Hoeffding and McDiarmid type. Surprisingly, contrary to the classical sensitivity analysis paradigm, these results show that uncertainties in input parameters do not necessarily propagate to output uncertainties. In addition, a general algorithmic framework is developed for OUQ and is tested on the Caltech surrogate model for hypervelocity impact, suggesting the feasibility of the framework for important complex systems. This is joint work with C. Scovel, T. Sullivan, M. McKerns and M. Ortiz. A preprint is available at http://arxiv.org/abs/1009.0679v1

Houman OwhadiApplied [email protected]

Clint ScovelLos Alamos National [email protected]

Timothy J. Sullivan, Michael McKerns, Michael Ortiz

California Institute of [email protected], [email protected],[email protected]

MS42

The Method of Regularized Stokeslets with Applications to Biological Flows

Biological flows, such as those surrounding swimming microorganisms or beating cilia, are often modeled using the Stokes equations due to the small length scales. The organism surfaces can be viewed as flexible interfaces imparting force on the fluid. I will present the latest results on the Method of Regularized Stokeslets and other elements that are used to compute Stokes flows interacting with immersed flexible bodies or moving through obstacles. The method treats the flexible bodies as sources of force or torque in the equations and the resulting velocity is the superposition of flows due to all the elements. A set of images is used to compute flows bounded by a plane. Exact flows are derived for forces that are smooth but supported in small spheres, rather than point forces. I will present the idea of the method, some of the known results and several examples from biological applications.

Ricardo CortezTulane UniversityMathematics [email protected]
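
The superposition described in the abstract above can be sketched for free-space flow in three dimensions. The kernel below is the widely used closed form associated with one particular smooth blob function; the choice of blob, the absence of image terms for a bounding plane, and the function name are all assumptions of this sketch.

```python
import numpy as np


def regularized_stokeslet_velocity(x, sources, forces, eps, mu=1.0):
    """Velocity at point x induced by regularized point forces in unbounded 3-D Stokes flow.

    Each source contributes
        [f (r^2 + 2 eps^2) + (f . r) r] / (8 pi mu (r^2 + eps^2)^(3/2)),
    with r = x - x0; the field stays finite at r = 0, unlike the singular Stokeslet.
    """
    x = np.asarray(x, dtype=float)
    u = np.zeros(3)
    for x0, f in zip(sources, forces):
        f = np.asarray(f, dtype=float)
        r = x - np.asarray(x0, dtype=float)
        r2 = r @ r
        u += (f * (r2 + 2 * eps ** 2) + (f @ r) * r) / (r2 + eps ** 2) ** 1.5
    return u / (8 * np.pi * mu)
```

Far from the source (r much larger than eps) the expression approaches the classical singular Stokeslet, and at the source it equals f / (4 pi mu eps).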

MS42

Front Tracking Method on Fluid Structure Interaction

Front tracking is a Lagrangian tool for the propagation of material interfaces and is known for its excellent preservation of interface geometry. In this presentation, we will discuss how to use this software tool to track the motion of rigid body structures and airfoils. We will present simulations, and comparisons with experiments, of two interesting problems: the windmill power generator and the inflation of a parachute.

Xiaolin LiDepartment of Applied Math and StatSUNY at Stony [email protected]

Joungdong Kim, Yan LiSUNY at Stony BrookStony Brook, NY [email protected], [email protected]

MS42

Immersed Finite Element Methods for Interface Problems

This presentation will give a quick survey on the research of immersed finite element methods for solving interface problems. In engineering and sciences, many simulations need to be carried out over domains consisting of multiple homogeneous materials separated from each other by curves or surfaces. This often leads to the so-called interface problems of partial differential equations whose coefficients are piecewise constants. Traditional finite element methods can be used to solve interface problems satisfactorily with meshes constructed according to the material interfaces; otherwise, convergence cannot be guaranteed. This requires each element in a mesh to be occupied essentially by one of the materials. In other words, each element needs to be on one side of a material interface. Therefore, the mesh in a traditional finite element method for solving an interface problem has to be unstructured to handle non-trivial interface configurations. This restriction usually causes a substantial negative impact on the simulation cost for applications whose material interfaces are not fixed because the involved simulation domain has to be remeshed repeatedly whenever the interfaces are adjusted. In this talk, we will discuss a class of recently developed immersed finite element (IFE) methods intended to alleviate this limitation of traditional finite element methods. The basic idea of the IFE methods will be described through particular applications and numerical examples will also be presented to illustrate features of IFE methods.


MS42

Numerical Methods for Elliptic Systems with Moving Interfaces

Linear constant-coefficient elliptic systems of partial differential equations occur frequently in computational science, and challenge standard solution methodologies. We present new "locally-corrected spectral boundary integral methods" for their accurate discretization and fast solution. Arbitrary elliptic systems such as Poisson, Yukawa, Helmholtz, Maxwell, Stokes and elasticity equations are transformed to an overdetermined first-order form amenable to unified solution. A simple well-conditioned boundary integral equation, for solutions satisfying arbitrary boundary conditions posed on complex interfaces, is derived from first principles. A fast Ewald summation formula for the periodic fundamental solution is derived by Fourier analysis, linear algebra, and local asymptotic expansion. Ewald summation evaluates box, volume and layer potentials efficiently, and separates the boundary integral equation into a low-rank system with regular spectral structure, followed by a simple local correction formula. A new "geometric nonuniform fast Fourier transform" (GNUFFT) produces accurate Fourier coefficients of discontinuous piecewise-polynomial data on a d-dimensional simplicial tessellation in R^D, for arbitrary dimensions d and D, and costs as little as a few standard FFTs.

John A. StrainDepartment of MathematicsUniversity of California at [email protected]

MS43

Multilevel Aggregation Methods for Small-World Graphs with Application to Random-Walk Ranking

We describe multilevel aggregation in the context of using Markov chains to rank the nodes of graphs. Aggregation successfully generates efficient multilevel methods for various eigenproblems from discretized PDEs, which involve mesh-like graphs. Our goal is to extend the applicability to similar problems on small-world graphs. For a class of small-world model problems, we show how multilevel hierarchies formed with non-overlapping aggregation are efficiently employed to accelerate convergence of methods that calculate the ranking vector.

Hans De SterckUniversity of WaterlooApplied [email protected]

Van HensonLawrence Livermore National [email protected]

Geoffrey D. SandersCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

MS43

Directed Graph Embedding: Continuous Limit of Laplacian-based Operators

We consider the problem of embedding directed graphs in Euclidean space while retaining directional information. We view a directed graph as a finite sample from a diffusion process on a manifold endowed with a vector field. The algorithms we design separate and recover the features of this process: the geometry of the manifold, the data density and the vector field. The application of our method to both artificially constructed and real data highlights its strengths.

Marina MeilaStatisticsUniversity of [email protected]

Dominique Perrault-JoncasDepartment of StatisticsUniversity of [email protected]

MS43

Algebraic Multigrid for Spectral Calculations on Large Complex Networks

Approximations to the expected commute time between any two vertices within a graph are quickly obtained using an accurate set of eigenpairs corresponding to the lower eigenvalues of the graph Laplacian matrix. This and other examples motivate the development of scalable eigensolvers for matrices associated with scale-free graphs. We investigate the use of Lanczos iteration on such matrices and develop a multilevel restart method, which is based on aggregation multigrid, to accelerate the Lanczos convergence.

Geoffrey D. SandersCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

Panayot Vassilevski, Van HensonLawrence Livermore National [email protected], [email protected]
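
The link between commute times and Laplacian eigenpairs mentioned in the abstract above can be sketched with a dense eigensolver on a tiny graph. The sketch assumes the combinatorial Laplacian L = D - A of a connected undirected graph (the abstract does not specify which Laplacian is meant) and uses the identity C(i, j) = vol(G) * sum_k (v_k[i] - v_k[j])^2 / lambda_k over the nonzero eigenvalues. The function name is hypothetical, and a scalable code would use Lanczos rather than `numpy.linalg.eigh`.

```python
import numpy as np


def commute_time(A, i, j, k=None):
    """Expected commute time between vertices i and j from Laplacian eigenpairs.

    If k is given, only the k smallest nonzero eigenvalues are used, which
    yields a lower bound because every omitted term is nonnegative.
    """
    deg = A.sum(axis=1)
    lap = np.diag(deg) - A
    lam, vec = np.linalg.eigh(lap)       # eigenvalues in ascending order
    lam, vec = lam[1:], vec[:, 1:]       # drop the zero eigenvalue (connected graph)
    if k is not None:
        lam, vec = lam[:k], vec[:, :k]
    diff = vec[i] - vec[j]
    return deg.sum() * np.sum(diff ** 2 / lam)


# Path graph on 4 vertices: 0 - 1 - 2 - 3
PATH4 = np.array([[0, 1, 0, 0],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [0, 0, 1, 0]], dtype=float)
```

The terms are weighted by 1 / lambda_k, so the lowest eigenpairs dominate the sum, which is why an accurate set of low eigenpairs already gives a useful approximation.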

MS43

Counting Triangles in Real-World Networks

The number of triangles is a computationally expensive graph statistic which is frequently used in complex network analysis (e.g., transitivity ratio), in various random graph models (e.g., exponential random graph model) and in important real world applications such as spam detection, uncovering of the hidden thematic structure of the Web and link recommendation. In this talk we will present a family of spectral algorithms for triangle counting which perform efficiently on 'real-world' networks, due to the special spectral properties which they typically possess. We will also present Triangle Sparsifiers, i.e., sparse graphs which approximate the original graph G with respect to the count of triangles within a factor of ε with high probability as long as G contains Ω(n) triangles. Triangle Sparsifiers allow us to justify significant expected speedups (e.g., if G has O(n^(1.5+ε)) triangles, we obtain in expectation a speedup of O(n)) and obtain excellent running times for large networks in practice. Finally, we will provide potential research directions.

Charalampos TsourakakisCARNEGIE MELLON [email protected]

Mihalis KolountzakisDepartment of MathematicsUniversity of [email protected]

Gary MillerCarnegie Mellon [email protected]
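
The spectral approach to triangle counting rests on a standard identity: for a simple undirected graph with adjacency matrix A, the number of triangles equals trace(A^3) / 6, which is the sum of the cubed eigenvalues of A divided by 6. The sketch below checks this on a small graph and shows the truncated variant that keeps only the k eigenvalues of largest magnitude. It uses a dense eigensolver for brevity, the function names are hypothetical, and it does not implement the Triangle Sparsifiers of the talk.

```python
import numpy as np


def triangles_exact(A):
    """Exact triangle count via trace(A^3) / 6."""
    return int(round(np.trace(A @ A @ A) / 6))


def triangles_spectral(A, k=None):
    """Triangle count from eigenvalues; with k set, use only the k largest in magnitude."""
    lam = np.linalg.eigvalsh(A)
    if k is not None:
        lam = lam[np.argsort(-np.abs(lam))[:k]]
    return np.sum(lam ** 3) / 6


# Complete graph on 4 vertices, which contains 4 triangles.
K4 = np.ones((4, 4)) - np.eye(4)
```

Because the eigenvalues are cubed, a few eigenvalues of large magnitude can dominate the sum when the spectrum is skewed, so a small k may already give a good estimate on such graphs.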

MS44

Energy and Performance Tuning Framework for Accelerator Environment

Accelerator computing with GPUs for HPC has become widespread, and the energy consumption of applications has become as important as execution performance. We propose a framework for tuning the energy consumption and execution performance of applications in accelerated computing environments. The framework accepts a user-specified execution policy covering energy consumption and execution performance, and tunes applications according to that policy.

Shoichi HirasawaThe University of [email protected]

Satoshi OhshimaUniv. of [email protected]

Hiroki HondaThe University of [email protected]

MS44

Development of C Language Version of ABCLibScript - The Impact To Auto-tuning Software

Automated tuning systems for arbitrary user programs are among the most eagerly awaited techniques for current computing environments. Compiler support for tuning is limited by complex architectures: multicore, heterogeneous configurations of CPU and GPU, and very deep cache hierarchies. One solution is to supply a general computer language and preprocessor for auto-tuning. ABCLibScript was designed to meet this requirement. However, the previously released version supported only source code written in Fortran 90. We are therefore developing a C language version. In this presentation, the author will explain the impact of the C version on HPC software for heterogeneous environments, such as CPUs with GPU acceleration.

Takahiro KatagiriInformation Technology Center, The University of [email protected]

MS44

Development of a Multiple Precision Version of LAPACK and Performance Measurement

We have been developing a multiple precision version of BLAS and LAPACK (widely used linear algebra packages) called MPACK, available at http://mplapack.sourceforge.net/ . This project (i) provides a reference implementation of multiple precision versions of BLAS and LAPACK, (ii) is rewritten in C++, (iii) can use many multiple precision arithmetic libraries such as mpfr, gmp and qd, (iv) currently has 76 BLAS routines and 100 LAPACK routines implemented and tested, and (v) is distributed under the 2-clause BSD license. In this presentation we present the current status of MPACK and provide some benchmark results.

Maho [email protected]

MS44

Computer Assisted Proof of Non-singularity of a Floating-point Matrix based on High Performance Functions

This talk is concerned with verified numerical computations, focusing on verification of the non-singularity of a floating-point matrix A. Let R be an approximate inverse of A; the problem is then to prove that the maximum norm of RA − I is smaller than 1, where I is the identity matrix. Our algorithm does only as much work as necessary to prove this, using high performance functions supported by BLAS.
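The criterion works because a singular A has a nonzero vector x with Ax = 0, so (RA − I)x = −x and the norm of RA − I is at least 1. The sketch below checks the criterion rigorously, but by a different route than the talk: instead of BLAS with rounding-error bounds, it converts the floating-point entries to exact rationals with Python's `fractions.Fraction`, which is far slower and is used here only as an assumption-free illustration. The function name and test matrices are hypothetical.

```python
from fractions import Fraction


def proves_nonsingular(A, R):
    """Return True if the max-norm of R*A - I is provably < 1 (exact arithmetic)."""
    n = len(A)
    # Fraction(float) is exact, so no rounding error enters the check.
    Af = [[Fraction(a) for a in row] for row in A]
    Rf = [[Fraction(r) for r in row] for row in R]
    norm = Fraction(0)
    for i in range(n):
        row_sum = Fraction(0)
        for j in range(n):
            entry = sum(Rf[i][k] * Af[k][j] for k in range(n))
            if i == j:
                entry -= 1
            row_sum += abs(entry)
        norm = max(norm, row_sum)  # max-norm = largest absolute row sum
    return norm < 1


A = [[4.0, 1.0], [2.0, 3.0]]
R = [[0.3, -0.1], [-0.2, 0.4]]          # floating-point approximation of inv(A)
verified = proves_nonsingular(A, R)

A_singular = [[1.0, 2.0], [2.0, 4.0]]
I2 = [[1.0, 0.0], [0.0, 1.0]]
not_verified = proves_nonsingular(A_singular, I2)
```

A `False` result only means the proof failed for this R; it does not by itself prove that A is singular.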

Katsuhisa OzakiShibaura Institute of Technology / [email protected]

Takeshi OgitaTokyo Woman’s Christian [email protected]

Shin’ichi OishiWaseda [email protected]

MS45

Presenters To Be Announced


TBD [email protected]

MS46

Designing Derivative-Free Hybrid Optimization Methods for Hydrological Applications

Derivative-free methods have emerged as invaluable for finding solutions to black-box optimization problems, and each has distinct advantages and disadvantages. Often, these strengths and weaknesses are problem dependent. We consider hybrid approaches which combine beneficial elements of multiple methods in order to search the design space more efficiently. In this talk, we will examine some common derivative-free optimization approaches and describe some appropriate hybrids which address the specific characteristics of hydrology applications. Using numerical examples, we will illustrate how these hybrids can find solutions under conditions in which the traditional optimal search methods fail.

Genetha GraySandia National [email protected]

Katie FowlerClarkson [email protected]

MS46

Exploiting Uncertainty Quantification in Derivative-free Optimization

We describe a model-based trust-region algorithm for derivative-free optimization using weighted regression. The approach provides a simple mechanism for exploiting uncertainty quantification, allowing greater weight to be given to function evaluations that are known with greater certainty, resulting in better local models. We describe a strategy for choosing weights based on this idea, but which takes other factors into account, including proximity to the trust-region center and the geometry of the sample set.
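The effect of weighting can be seen in a minimal one-dimensional sketch. It fits a linear local model a + b·x by weighted least squares, minimizing the sum of w_i (a + b·x_i − f_i)^2. The linear model, the sample points and the particular weights are assumptions chosen for illustration; the abstract's weighting strategy (which also accounts for distance to the trust-region center and sample geometry) is not reproduced here.

```python
def weighted_linear_fit(xs, fs, ws):
    """Minimize sum_i w_i * (a + b*x_i - f_i)**2 over (a, b) via the normal equations."""
    sw = sum(ws)
    sx = sum(w * x for w, x in zip(ws, xs))
    sf = sum(w * f for w, f in zip(ws, fs))
    sxx = sum(w * x * x for w, x in zip(ws, xs))
    sxf = sum(w * x * f for w, x, f in zip(ws, xs, fs))
    det = sw * sxx - sx * sx
    b = (sw * sxf - sx * sf) / det
    a = (sf - b * sx) / sw
    return a, b


# Underlying function f(x) = 1 + 2x; the last evaluation is known to be unreliable.
xs = [0.0, 1.0, 2.0, 3.0]
fs = [1.0, 3.0, 5.0, 9.0]            # exact value at x = 3 would be 7.0

_, b_uniform = weighted_linear_fit(xs, fs, [1.0, 1.0, 1.0, 1.0])
_, b_weighted = weighted_linear_fit(xs, fs, [1.0, 1.0, 1.0, 0.01])
```

Giving the uncertain evaluation a small weight pulls the fitted slope back toward the true value of 2.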

Stephen C. Billups, Jeffrey LarsonUniversity of Colorado [email protected],[email protected]

MS46

Parameter Estimation of Complex Simulations

Model-based methods evaluate the objective function at trial points and construct a model (or surrogate) of the function. We discuss algorithmic and performance issues in a new model-based trust-region algorithm (Pounder) that constructs a quadratic model of least change by interpolating the function at a selected set of previous trial points. We discuss performance issues on a set of model problems and on parameter estimation problems in nuclear energy density functional theory.

Jorge J. MoreArgonne National [email protected]

Stefan WildArgonne National [email protected]

MS46

Using Derivative-free Algorithms to Identify Surrogate Models of Energy Systems

We propose a method of simulation-based optimization with accurate, low-complexity surrogate models. Derivative-free and derivative-based algorithms are used to identify, improve, and validate a set of surrogate models of simulations. The methodology is applied to a power plant simulation, where a set of surrogate models is identified by separating the plant into less complex components, modeling the components, then combining the surrogate models to formulate an algebraic nonlinear program for optimization.

Alison CozadCarnegie Mellon [email protected]

Nick [email protected] mellon university

MS47

A Well-Conditioned Hierarchical Basis for Triangular H(Curl)-Conforming Elements for Maxwell’s Equations

We construct a well-conditioned hierarchical basis for triangular H(curl)-conforming elements with selected orthogonality. The basis functions are grouped into edge and interior functions, and the latter are further grouped into normal and bubble functions. In our construction, the traces of the edge shape functions are orthonormal on the associated edge. The interior normal functions, which are perpendicular to an edge, and the bubble functions are both orthonormal among themselves over the reference element. The construction is made possible with classic orthogonal polynomials, viz., Legendre and Jacobi polynomials. For both the mass matrix and the quasi-stiffness matrix, better conditioning of the new basis is shown by a comparison with the existing high order basis.

Wei CaiUniversity of North Carolina, [email protected]

Steve XinUNC [email protected]

MS47

A Parareal in Time Algorithm

The parareal algorithms are efficient tools for solving evolution problems in real time. The principle is the following: one first approximates the solution on a coarse time grid using a Coarse Solver, and then locally solves the problem on fine time subgrids using a Fine Solver on parallel computers. The associated iterative procedure ensures an accuracy which is of the same order as the accuracy of the fine scheme when applied on the fine time grid. We present new applications of the method.
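The coarse/fine iteration can be sketched for the scalar test equation y' = λy. The sketch assumes the standard parareal update U_{n+1}^{k+1} = G(U_n^{k+1}) + F(U_n^k) − G(U_n^k), with one forward Euler step as the Coarse Solver G and many small forward Euler steps as the Fine Solver F; these solver choices, the function names and the problem parameters are illustrative assumptions, and the fine solves are run sequentially here although they are independent and would run in parallel in practice.

```python
def euler(y, lam, dt, steps):
    """Forward Euler for y' = lam * y."""
    for _ in range(steps):
        y = y + dt * lam * y
    return y


def parareal(y0, lam, T, n_slices, fine_steps, iterations):
    dT = T / n_slices
    coarse = lambda y: euler(y, lam, dT, 1)
    fine = lambda y: euler(y, lam, dT / fine_steps, fine_steps)

    # Initial guess: one sequential sweep with the coarse solver.
    U = [y0]
    for n in range(n_slices):
        U.append(coarse(U[n]))

    for _ in range(iterations):
        # These evaluations are independent across slices (the parallel part).
        F_old = [fine(U[n]) for n in range(n_slices)]
        G_old = [coarse(U[n]) for n in range(n_slices)]
        # Cheap sequential correction sweep.
        new = [y0]
        for n in range(n_slices):
            new.append(coarse(new[n]) + F_old[n] - G_old[n])
        U = new
    return U


def serial_fine(y0, lam, T, n_slices, fine_steps):
    """Reference: the fine solver applied sequentially over the whole interval."""
    return euler(y0, lam, (T / n_slices) / fine_steps, n_slices * fine_steps)


approx = parareal(1.0, -1.0, 1.0, n_slices=4, fine_steps=50, iterations=4)
reference = serial_fine(1.0, -1.0, 1.0, n_slices=4, fine_steps=50)
```

After as many iterations as there are time slices the result matches the sequential fine solution up to rounding; the method pays off when far fewer iterations suffice.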

Sidi M. KaberUniversity Pierre & Marie [email protected]

MS47

Maximum-principle-satisfying and Positivity-preserving High Order Discontinuous Galerkin and Finite Volume Schemes for Conservation Laws

We construct uniformly high order accurate discontinuous Galerkin (DG) and weighted essentially non-oscillatory (WENO) finite volume (FV) schemes satisfying a strict maximum principle for scalar conservation laws and passive convection in incompressible flows, and preserving positivity of density and pressure for compressible Euler equations. A general framework (for arbitrary order of accuracy) is established to construct a limiter for the DG or FV method with first order Euler forward time discretization solving one-dimensional scalar conservation laws. Strong stability preserving (SSP) high order time discretizations will keep the maximum principle and make the scheme uniformly high order in space and time. One remarkable property of this approach is that it is straightforward to extend the method to two and higher dimensions. The same limiter can be shown to preserve the maximum principle for the DG or FV scheme solving two-dimensional incompressible Euler equations in the vorticity stream-function formulation, or any passive convection equation with an incompressible velocity field. A suitable generalization results in a high order DG or FV scheme satisfying the positivity preserving property for density and pressure for compressible Euler equations. Numerical tests demonstrating the good performance of the scheme will be reported.
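A limiter of this kind can be sketched on the point values of a single cell. The abstract does not give the formula; the sketch assumes the linear scaling form p̃ = ū + θ(p − ū), where ū is the cell average and θ in [0, 1] is the largest factor that brings all point values into [m, M]. The function name, the three-point quadrature weights and the sample values are illustrative assumptions.

```python
def scaling_limiter(values, weights, m, M):
    """Scale point values toward the cell average so that they lie in [m, M].

    `weights` are quadrature weights summing to 1; the cell average
    sum(w * v) is assumed to satisfy m <= average <= M already.
    """
    mean = sum(w * v for w, v in zip(weights, values))
    hi, lo = max(values), min(values)
    theta = 1.0
    if hi > M:
        theta = min(theta, (M - mean) / (hi - mean))
    if lo < m:
        theta = min(theta, (mean - m) / (mean - lo))
    return [mean + theta * (v - mean) for v in values]


weights = [1 / 6, 2 / 3, 1 / 6]      # three-point Gauss-Lobatto weights on a unit cell
values = [-0.2, 0.5, 1.3]            # overshoots the bounds [0, 1] at both ends
mean_before = sum(w * v for w, v in zip(weights, values))

limited = scaling_limiter(values, weights, 0.0, 1.0)
mean_after = sum(w * v for w, v in zip(weights, limited))
```

Because the scaling is about the cell average, the average itself (and hence conservation) is unchanged by the limiter.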

Xiangxiong ZhangBrown [email protected]

Chi-Wang ShuBrown UniversityDiv of Applied [email protected]

MS47

Numerical Methods of the Space-Time Fractional Differential Equations

We consider initial boundary value problems of fractional differential equations and their numerical solutions. Two definitions of the fractional derivative, i.e., the Riemann–Liouville definition and the Caputo one, are considered in parallel. A theoretical framework for the weak solutions of the space-time fractional diffusion equations is developed, and the well-posedness of the weak problems is proved. Moreover, based on the proposed weak formulation, we construct an efficient spectral method for numerical approximations of the weak solution.

Chuanju XuSchool of Mathematical SciencesXiamen [email protected]

MS47

Spectral Collocation/p-Version Finite Element Methods for Hamiltonian Dynamical Systems

We make a systematic study of spectral Galerkin methods (or p-version finite element methods) and spectral collocation methods in numerically solving Hamiltonian dynamical systems. Different strategies, including Legendre-Lobatto collocation, Chebyshev-Lobatto collocation, and spectral Galerkin/p-version, are discussed and compared, especially with symplectic methods. Numerical tests on some benchmark nonlinear problems are provided.

Zhimin Zhang, Nairat KenyameeWayne State [email protected], [email protected]

MS48

Artificial Boundary Conditions for Schrodinger Equations with General Potentials and Nonlinearities

This talk addresses the construction of absorbing boundary conditions for the two-dimensional Schrodinger equation with a general variable repulsive potential or with a cubic nonlinearity. Semi-discrete time schemes, based on Crank-Nicolson approximations or a relaxation scheme, are built for the associated initial boundary value problems. Finally, some numerical simulations give a comparison of the various absorbing boundary conditions to analyse their accuracy and efficiency.

Christophe BesseUniversite des Sciences et Technologies de [email protected]

MS48

Propagators for the Time-dependent Schrodinger Equation in an Adaptive, Discontinuous, Multiwavelet Basis


Jun JiaOak Ridge National [email protected]

MS48

Fourth Order Time-stepping for Kadomtsev-Petviashvili and Davey-Stewartson Equations

Purely dispersive partial differential equations such as the Korteweg-de Vries equation, the nonlinear Schrodinger equation and higher dimensional generalizations thereof can have solutions which develop a zone of rapid modulated oscillations in the region where the corresponding dispersionless equations have shocks or blow-up. To numerically study such phenomena, fourth order time-stepping in combination with spectral methods is beneficial to resolve the steep gradients in the oscillatory region. We compare the performance of several fourth order methods for the Kadomtsev-Petviashvili and the Davey-Stewartson equations, two integrable equations in 2+1 dimensions: exponential time-differencing, integrating factors, time-splitting, and Driscoll’s IMEX method. The accuracy in the numerical conservation of integrals of motion is discussed.

Christian KleinInstitut de Mathematiques de Bourgogne9 avenue Alain Savary, BP 47870 21078 DIJON [email protected]

Kristelle RoidotIMBUniversite de [email protected]

MS48

Fast Gaussian Wavepacket Transforms and Gaussian Beams for the Schrodinger Equation


Jianliang QianDepartment of MathematicsMichigan State [email protected]

MS49

Toward Extreme Scale: Performance Analysis on the IBM BG/P for High-Order Electromagnetic Modeling

We demonstrate a massively parallel, open-source electromagnetic code, based on advanced high-order numerical algorithms targeting production runs on petascale computer architectures. Computational results include cutting-edge simulations in accelerator physics and nanotechnology-based science applications, involving research and development in applied mathematics, computer science, and software. We present convergence and stability analysis, an efficient parallel communication kernel, I/O performance, and realistic simulations and validation of the computational results in comparison with other methods. A detailed analysis of the parallel performance of the spectral-element discontinuous Galerkin (SEDG) method for time-domain electromagnetic simulations based on an explicit timestepping scheme will be discussed, with strong and weak scaling results up to 131K cores on the Argonne BG/P.

MiSun MinArgonne National LaboratoryMathematics and Computer Science [email protected]

Jing FuRensselaer Polytechnic [email protected]

MS49

Large Scale Computational Dynamics on Next Generation Heterogeneous Clusters using DDEP

The talk discusses the Dynamic Data Exchange Protocol (DDEP), a programming framework developed to assist in large-scale computational dynamics simulations on heterogeneous GPU-accelerated clusters. The problem domain is subdivided spatially with each subdivision assigned to a specific GPU; the developed framework facilitates the movement of body data between GPUs when bodies cross these subdivisions. It is shown that using DDEP, a GPU-accelerated cluster may decrease simulation time by an order of magnitude.

Andrew SeidlUniversity of [email protected]

Hammad Mazhar, Toby Heyn, Dan NegrutUniversity of Wisconsin MadisonDepartment of Mechanical [email protected], [email protected], [email protected]

MS49

Finite-Element Electromagnetic Modeling of Accelerator at Extreme Scale

The past decade has seen tremendous advances in electromagnetic modeling for accelerator applications with the use of high performance computing on state-of-the-art supercomputers. Under the support of the DOE SciDAC computing initiative, a comprehensive set of parallel electromagnetic codes based on the finite-element method has been developed, aimed at tackling some of the most computationally challenging problems in accelerator R&D. Aided by collaborative efforts in computational science, these powerful tools have enabled large-scale simulations of complex systems to be performed with unprecedented detail and accuracy. Significant progress has been made in scalable eigensolvers for accelerator cavity prototyping and optimization, calculations of wakefields for large accelerator systems and for ultra-short charged particle bunches, and self-consistent particle-in-cell simulations for space-charge dominated devices. The impact of these capabilities on current and future accelerator projects worldwide will be described.

Cho-Kuen NgStanford Linear Accelerator [email protected]

MS49

Large Scale Ab Initio Calculations for Photovoltaic Materials

Large scale ab initio calculations can play an important role in understanding the underlying mechanisms in solar cells. Not only can ab initio calculations be used to study the atomic structures of many systems (e.g., interfaces and defects), they can also be used to study electronic structures, optical properties and carrier transport. In order to study these properties, large scale computations are needed. In this talk, the computational requirements for different types of calculations will be discussed, and the main challenges will be presented. I will also discuss how such calculations can be scaled to hundreds of thousands of processors.

Lin-Wang WangLawrence Berkeley National [email protected]

MS50

Multi-GPU Calculations in Lattice Quantum Chromodynamics

Quantum chromodynamics (QCD) is the fundamental theory describing the interactions of quarks and gluons. In the lattice formulation of QCD, the equations governing these interactions are solved numerically on a four-dimensional spacetime grid. In this talk, I will discuss the implementation of "QUDA," a library of linear solvers tailored for QCD that is capable of leveraging many graphics processing units (GPUs) in parallel. MPI is used for communication, while kernels executing on the GPUs are written in CUDA C. I will discuss the strategies we employed both to obtain high performance on a single GPU and to maintain efficiency on large clusters.

Ronald BabichCenter for Computational Science

Boston [email protected]

Kipton BarrosCenter for Nonlinear StudiesLos Alamos National [email protected]

Richard BrowerDepartment of PhysicsBoston [email protected]

Michael ClarkHarvard-Smithsonian Center for [email protected]

Balint JooJefferson [email protected]

Claudio RebbiDepartment of PhysicsBoston [email protected]

MS50

Overlapping Communication and Calculation with a PGAS Language

The high complexity of climate applications poses difficulties for porting and validation on new platforms. Efforts to improve the programmability and performance of such applications cannot be disruptive and must lead to clear benefits. Partitioned Global Address Space (PGAS) languages have emerged as alternative programming models for highly parallel systems, and climate applications offer opportunities for assessing the efficiency of these models. Here we report on overlapping communication and calculation in an ocean circulation model.

Osni A. MarquesLawrence Berkeley National LaboratoryBerkeley, [email protected]

MS50

Multi-GPU MapReduce

MapReduce greatly simplifies many processing tasks. We modify portions of MapReduce such as chunking and implement extensions such as Partial Reduction to better fit the GPU and show how to efficiently implement MapReduce on GPUs. We demonstrate the ability of a GPU MapReduce implementation through several benchmarks including Word Occurrence and Matrix Multiplication.
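The roles of chunking and partial reduction can be sketched on the Word Occurrence benchmark. This is plain sequential Python on the CPU, not the GPU implementation from the talk: the assumption is that input is split into fixed-size chunks, each chunk's key/value pairs are combined locally before anything leaves the chunk, and the small per-chunk results are merged at the end. All function names and the chunk size are hypothetical.

```python
from collections import Counter


def chunks(items, size):
    """Split the input into fixed-size chunks (stand-in for per-GPU work units)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def partial_reduce(pairs):
    """Combine pairs inside one chunk so less data has to be exchanged later."""
    local = Counter()
    for key, value in pairs:
        local[key] += value
    return local


def word_occurrence(lines, chunk_size):
    partials = []
    for block in chunks(lines, chunk_size):
        pairs = [pair for line in block for pair in map_words(line)]
        partials.append(partial_reduce(pairs))
    # Final reduce: merge the per-chunk partial results.
    total = Counter()
    for local in partials:
        total.update(local)
    return dict(total)


counts = word_occurrence(["a b a", "b c", "a"], chunk_size=2)
```

The result does not depend on the chunk size; only the amount of intermediate data does.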

Jeff StuartDepartment of Computer ScienceUniversity of California, [email protected]

MS50

Using UPC for Hybrid Programming

Unified Parallel C (UPC) is a Partitioned Global Address Space (PGAS) language that provides programming convenience similar to shared-memory programming models and can deliver scalable performance on diverse computer architectures ranging from commodity multi-core systems to customized supercomputers. Two aspects of using UPC for hybrid programming will be presented: 1) combining UPC with MPI or OpenMP for programming multi-core clusters and 2) a UPC extension with CUDA/OpenCL for programming GPU clusters.

Yili ZhengLawrence Berkeley National [email protected]

MS51

Uncertainty Quantification and Numerical Error Control in Simulation of Coupled Porous Flow and Mechanical Deformation

We present recent work on interactions between numerical error and uncertainty quantification (UQ), especially in the context of multiphysics applications. Our approach involves error correction using goal-oriented error estimation, especially over the large parameter space induced by random variables and random spatial fields. We also investigate approaches for generating surrogate models of numerical error that can be used to correct model realizations in UQ calculations.

Brian CarnesSandia National [email protected]

MS51

A Jacobian-Free Newton Krylov Method for Mortar-Discretized Thermomechanical Contact Problems

The thermomechanical contact approach studied here is based on the mortar finite element method, where Lagrange multipliers are used to enforce weak continuity constraints at participating interfaces. In this formulation, the heat equation couples to linear mechanics (in weak form) through a thermal expansion term. Lagrange multipliers are used to formulate the continuity constraints for both heat flux and the normal component of interface traction at contact interfaces. The resulting system of nonlinear algebraic equations is normalized and cast in residual form for solution of the transient problem. A preconditioned Jacobian-free Newton Krylov method is used to obtain the solution of the coupled constraint, thermal contact, and heat equations.

Glen A. HansenIdaho National [email protected]

MS51

Conservative and Noise Resistant Data Remapping for Coupled Regional Climate Modeling

The coupling of WRF (the Weather Research and Forecast Model) and CAM (Community Atmospheric Model) for regional climate modeling requires transferring data (such as wind velocities, temperature, moisture, etc.) between the grids of WRF and CAM. These grids are in general non-matching and may use different map projections. We investigate a common-refinement based data transfer method that satisfies physical conservation, numerical accuracy, and noise resistance.
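The conservation property of a common-refinement transfer can be seen in a one-dimensional analogue. The sketch assumes piecewise-constant cell averages on two non-matching interval grids covering the same domain; the common refinement is then the set of overlaps between source and target cells, and integrating the source field over each overlap preserves the total exactly. The grids, values and function names are illustrative assumptions and do not represent the WRF or CAM grids or the authors' higher-order method.

```python
def conservative_remap(src_edges, src_avg, tgt_edges):
    """Transfer cell averages between two 1D grids via their cell overlaps."""
    tgt_avg = []
    for j in range(len(tgt_edges) - 1):
        lo, hi = tgt_edges[j], tgt_edges[j + 1]
        mass = 0.0
        for i in range(len(src_edges) - 1):
            # Length of the intersection of source cell i and target cell j.
            overlap = min(hi, src_edges[i + 1]) - max(lo, src_edges[i])
            if overlap > 0:
                mass += src_avg[i] * overlap
        tgt_avg.append(mass / (hi - lo))
    return tgt_avg


def total_mass(edges, avg):
    """Integral of a piecewise-constant field."""
    return sum(a * (edges[k + 1] - edges[k]) for k, a in enumerate(avg))


src_edges = [0.0, 0.25, 0.5, 1.0]
src_avg = [4.0, 2.0, 1.0]
tgt_edges = [0.0, 0.4, 1.0]

tgt_avg = conservative_remap(src_edges, src_avg, tgt_edges)
mass_src = total_mass(src_edges, src_avg)
mass_tgt = total_mass(tgt_edges, tgt_avg)
```

Plain pointwise interpolation between the same two grids would generally not preserve the integral, which is why the overlap-based transfer is used.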

Ying ChenStony Brook [email protected]

Xiangmin JiaoStony Brook UniversityStony Brook, [email protected]

Wuyin LinBrookhaven National [email protected]

Minghua Zhang, Juanxiong HeStony Brook [email protected], [email protected]

MS51

Multiscale Simulation of the Effect of Irradiation-Induced Microstructure Evolution on Reactor Fuel Performance

Fuel performance in nuclear reactors is highly dependent on irradiation-induced microstructure evolution. Therefore, a fuel performance code must consider atomistic and mesoscale effects in order to provide a predictive capability. In this work, we present the multiscale fuel performance modeling approach currently being employed at Idaho National Laboratory (INL). Atomistically-informed mesoscale phase field simulations are used to determine the effect of irradiation-induced microstructure evolution on bulk properties, such as thermal conductivity and density. Continuum expressions describing the effect of irradiation on the bulk properties as a function of temperature are then determined from the mesoscale simulations. Finally, these expressions are used in INL's BISON fuel performance code to model fuel pellet behavior.

Michael Tonks, Paul Millet, Derek Gaston, Bulent BinerIdaho National [email protected], [email protected],[email protected], [email protected]

Anter El-AzabFlorida State [email protected]

MS52

Lessons Learned and Open Issues from the Development of the Proteus Toolkit for Coastal and Hydraulics Modeling

The Proteus toolkit evolved to support research on new models for coastal and hydraulic processes and improvements in numerical methods. The models considered include multiphase flow in porous media, shallow water flow, turbulent free surface flow, and flow-driven processes such as sediment and species transport. Python was used for implementing high-level class hierarchies and prototyping new algorithms, while performance critical sections were optimized using compiled languages. We discuss the toolkit design, performance, and open issues.

Chris KeesU.S. Army Engineer Research and Development CenterCoastal and Hydraulics [email protected]

Matthew FarthingUS Army Engineer Research and Development [email protected]

MS52

Python, Clawpack, and PyClaw

Clawpack (Conservation Laws Package, www.clawpack.org) is an open source software package for solving nonlinear conservation laws and other hyperbolic systems of PDEs. The core computational routines are in Fortran, but Python is now used for the user interface, both on the input end and for graphics and visualization of results. Recent developments will be summarized with discussion of some lessons learned that may be useful to developers of other packages.

Randall J. LeVequeApplied MathematicsUniversity of Washington (Seattle)[email protected]

Kyle T. MandliUniversity of WashingtonDept. of Applied [email protected]

MS52

FEniCS: An Attempt to Combine Simplicity, Generality, Efficiency and Reliability

FEniCS allows PDE problems expressed in terms of variational forms to be stated in Python, with a syntax that closely resembles the mathematics. The Python definition of the problem is compiled to highly efficient, problem-specific C++ code and linked with general libraries for finite elements and linear algebra. FEniCS can now also automatically generate tailored error estimators and provide error control. Many examples of using FEniCS to solve PDEs will be shown.

Kent-Andre MardalSimula Research Laboratory and Department ofInformaticsUniversity of [email protected]

Hans Petter LangtangenCenter for Biomedical ComputingSimula Research Laboratory and University of [email protected]

Anders LoggSimula Research [email protected]

Garth N. WellsDept. of Engineering,Univ. of [email protected]

MS52

FEMhub Online Numerical Methods Laboratory

The FEMhub Online Numerical Methods Laboratory (http://lab.femhub.org) is an Ext-JS-based web interface to FEMhub (http://femhub.org), an open source distribution of PDE solvers with a unified Python interface. The objective of the Online Lab is to make the codes in FEMhub widely accessible without installation. We will explain the structure of FEMhub (Python wrappers, Python build system etc.), describe the Online Lab, and perform sample finite element computations over the web. We will also present a new FEMhub Mesh Editor that makes it possible to generate 2D geometries and finite element meshes in the web browser.

Pavel Solin, Ondrej CertikDepartment of Mathematics and StatisticsUniversity of Nevada, [email protected], [email protected]

Mateusz PaprockiTechnical University of WroclawWroclaw, [email protected]

Aayush PoudelUniversity of Nevada, [email protected]

MS53

ReALE: A Reconnection-based Arbitrary-Lagrangian-Eulerian Method

We present a new reconnection-based arbitrary Lagrangian-Eulerian (ALE) method. The main elements in a standard ALE simulation are an explicit Lagrangian phase in which the solution and grid are updated, a rezoning phase in which a new grid is defined, and a remapping phase in which the Lagrangian solution is conservatively interpolated onto the new mesh. In our new method we allow the connectivity of the mesh to change in the rezone phase, which leads to a general polygonal mesh. The rezone strategy is based on using a Voronoi mesh. In our talk we will discuss details of the rezoning phase and show some numerical results.


Raphael LoubereUniversity of Toulouse, [email protected]


Jerome BreilUMR [email protected]

Stephan GaleraINRIA Bordeaux - Team [email protected]

MS53

Vertex Reordering for Local Mesh Quality Improvement of Tetrahedral Meshes

Vertex reordering can be performed within the context of local mesh quality improvement in order to decrease the amount of time spent on the mesh optimization. We propose several static vertex reordering schemes, based on the performance of the optimization algorithm and the mesh quality, and investigate the trade-offs between ordering and overall performance of the optimization algorithm. We employ Laplace smoothing, within the Mesquite package, to optimize tetrahedral mesh quality.
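Why the visiting order matters can be seen in a minimal sketch of Laplace smoothing with in-place updates: each free vertex is moved to the average of its neighbours, and later vertices already see the new positions of earlier ones. This is a generic illustration with hypothetical names and data structures; it does not use or reproduce the Mesquite API, and the reordering schemes from the talk are not implemented.

```python
def laplace_smooth(coords, neighbors, order, sweeps=1):
    """Move each vertex in `order` to the centroid of its neighbours.

    coords:    dict vertex -> coordinate tuple
    neighbors: dict vertex -> list of adjacent vertices
    order:     sequence of free (interior) vertices, visited in this order
    Updates are applied in place, so the result depends on `order`.
    """
    coords = dict(coords)  # leave the caller's mesh untouched
    for _ in range(sweeps):
        for v in order:
            nbrs = neighbors[v]
            dim = len(coords[v])
            coords[v] = tuple(
                sum(coords[n][d] for n in nbrs) / len(nbrs) for d in range(dim)
            )
    return coords


# One interior vertex (4) connected to the four corners of a tetrahedron.
coords = {
    0: (0.0, 0.0, 0.0),
    1: (1.0, 0.0, 0.0),
    2: (0.0, 1.0, 0.0),
    3: (0.0, 0.0, 1.0),
    4: (0.9, 0.1, 0.2),
}
neighbors = {4: [0, 1, 2, 3]}

smoothed = laplace_smooth(coords, neighbors, order=[4])
```

With several free vertices, passing a different `order` changes the intermediate positions and therefore how many sweeps are needed, which is the trade-off the abstract studies.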

Jeonghyung Park, Suzanne M. ShontzThe Pennsylvania State [email protected], [email protected]

Patrick M. KnuppSandia National [email protected]

MS53

New Developments on Image-based Mesh Generation

We have developed a high-fidelity image segmentation approach using a variant of the graph-cut method. From the segmented boundaries, high-quality and smooth surface meshes are generated with the center-to-center connection scheme. Tetrahedral meshes are then generated using an octree-based approach. The experiments have shown promising results in mesh generation from 3D imaging data. A user-friendly GUI has also been developed to encapsulate all these algorithms, which will be made available at www.bimos.org.

Zeyun YuUniversity of [email protected]

MS53

Guaranteed-Quality Quadrilateral Mesh Generation for Planar Domains

Provably good-quality triangular mesh generation methods are well developed. However, fewer algorithms exist for all-quadrilateral mesh generation, and most of these algorithms are heuristic. Here, two novel approaches are presented to generate guaranteed-quality all-quadrilateral meshes. It is proved that for any planar domain, all the angles in the constructed mesh are within [45° ± ε, 135° ± ε] (ε ≤ 5°) for the quadtree-based method and [60° ± ε, 120° ± ε] for the hexagon-based method.

Yongjie ZhangDepartment of Mechanical EngineeringCarnegie Mellon [email protected]

MS54

Adaptive High Dimensional Smoothing using Iterative Bias Reduction Algorithms

Smoothing high dimensional noisy data is challenging because of the curse of dimensionality. This talk presents a simple numerical procedure that iteratively fits the residuals to correct for the bias of the current smoother. While conceptually simple, this method is a practical algorithm for achieving adaptivity to the underlying smoothness of the true regression function. The iterative algorithm works for thin plate splines, and for kernel smoothers with positive definite kernels. In practical examples, our smoother has smaller out-of-sample prediction error than competing state-of-the-art multivariate smoothers. Finally, the algorithm is distributed as the ibr R package available at CRAN.
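The residual-fitting loop can be sketched in a few lines. The sketch assumes a one-dimensional Gaussian kernel smoother with row-normalized weights as the base smoother S, and the update fit ← fit + S(y − fit); the bandwidth, the fixed iteration count and the function names are illustrative assumptions. It does not show the interface of the ibr R package or its rule for choosing the number of iterations.

```python
import math


def smoother_matrix(xs, bandwidth):
    """Row-normalized Gaussian kernel weights (a simple base smoother S)."""
    S = []
    for xi in xs:
        k = [math.exp(-0.5 * ((xi - xj) / bandwidth) ** 2) for xj in xs]
        total = sum(k)
        S.append([v / total for v in k])
    return S


def apply(S, v):
    """Matrix-vector product S @ v with plain lists."""
    return [sum(s * x for s, x in zip(row, v)) for row in S]


def iterative_bias_reduction(xs, ys, bandwidth, iterations):
    S = smoother_matrix(xs, bandwidth)
    fit = apply(S, ys)                       # initial, typically oversmoothed fit
    for _ in range(iterations):
        residuals = [y - m for y, m in zip(ys, fit)]
        correction = apply(S, residuals)     # smooth the residuals ...
        fit = [m + c for m, c in zip(fit, correction)]  # ... and add them back
    return fit


xs = [0.0, 0.25, 0.5, 0.75, 1.0]
ys = [0.0, 0.7, 1.0, 0.7, 0.0]
plain = iterative_bias_reduction(xs, ys, bandwidth=0.5, iterations=0)
corrected = iterative_bias_reduction(xs, ys, bandwidth=0.5, iterations=5)
```

Each pass removes part of the bias of the previous fit at the cost of added variance, so the number of iterations acts as the smoothing parameter.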

Nicolas HengartnerInformation Sciences GroupLos Alamos National [email protected]

MS54

Inverse Methods for Event Reconstruction Problems

When a contaminant has been released into the atmosphere, event reconstruction algorithms are used to determine where, when and how much has been released. A limited number of measurements are available from environmental sensors deployed in urban areas, and inverse algorithms are used to find atmospheric release parameters based on these data. These parameters can then be used in flow simulations to predict current and future locations of the material. Many inverse algorithms use Bayesian inference to solve the inverse problem. This can be viewed as a stochastic process, and typically Markov chain Monte Carlo sampling is used to generate an ensemble of predictive simulations. We will give an overview of these methods, and also discuss adjoint methods from the Bayesian perspective and their potential impact on these problems. Relevant issues include non-Gaussian distribution of parameters and measurements, nonlinear models, uncertainty in data, parameters, models and solution, and efficient algorithms for real-time emergency response.

Jodi MeadBoise State UniversityDepartment of [email protected]

MS54

Projected Conjugate Gradient Methods on GPUs for Improving Efficiency in Large Scale Image Reconstruction

Graphical processing units improve the capability for large scale computations using low-end computing platforms. For the solution of linear systems of equations, much effort has been devoted to their use for efficient Krylov subspace-based solvers. Here we discuss implementation of a projected CG algorithm, taking advantage of BLAS3 operations, and Krylov subspace recycling. It is extended for solution of regularized least squares problems arising in large scale image restoration situations. Numerical results are presented.

Rosemary A. RenautArizona State UniversitySchool of Mathematical and Statistical [email protected]

Tao LinDepartment of Mathematical SciencesNew Jersey Institute of [email protected]

MS54

Rapid-response Simulations of Forward and Inverse Problems in Atmospheric Transport and Dispersion

Event reconstruction of an atmospheric contaminant plume, detected by a sensor network, is a challenging inverse problem in the threat reduction field. Dispersion modeling in urban areas and over complex terrain constitutes a forward model, and benefits from computationally intensive prognostic models. To this end, we present the development and performance of an incompressible wind solver with a geometric multigrid algorithm on GPU clusters and discuss its efficient use within an inverse methodology.

Inanc SenocakMechanical & Biomedical EngineeringBoise State [email protected]

Dana JacobsenBoise State [email protected]

MS55

Auto-tuning for BLAS-based Matrix Computations

We discuss performance tuning of matrix computations based on BLAS routines. In such computations, blocked algorithms are generally used, and how the target matrices are partitioned into submatrices determines the performance of each BLAS routine. To achieve high performance, blocking should be done depending on the characteristics of the BLAS routines. In this talk, we clarify issues toward auto-tuning of BLAS-based computations and explain our approach, called generalized recursive blocking, that tries to optimize the partitioning using dynamic programming.

Takeshi FukayaDepartment of Computational Science and EngineeringNagoya [email protected]

Yusaku YamamotoKobe [email protected]

Shao-Liang ZhangNagoya University, [email protected]

MS55

Performance Evaluation for a Dense Eigenvalue Solver for the Next-generation Petascale System

We developed a high-performance, highly scalable eigenvalue solver for the next-generation petascale system, ‘K-computer’, which is expected to have almost a million cores. We adopt MPI-OpenMP hybrid parallelization, two-dimensional topological communication, and auto-tuning technology in order to achieve higher performance. Currently, our solver runs on the T2K supercomputer housed at the University of Tokyo with 256 AMD Barcelona processors (4096 cores). At the workshop, we will report the performance on other systems and estimate the performance on the full system of ‘K-computer’.

Toshiyuki Imamura, Huu Phuong PhamThe University of [email protected], [email protected]

Susumu Yamada, Masahiko MachidaJapan Atomic Energy [email protected], [email protected]

MS55

Autotuning of Sparse Matrix-Vector Multiplication by Selecting Storage Schemes on GPU

Sparse matrix vector multiplication is one of the most frequently used functions in computer science. Many storage schemes for sparse matrices have been proposed; however, the optimal storage scheme differs from matrix to matrix. In this talk, we propose an auto-tuning algorithm for sparse matrix vector multiplication that selects the storage scheme automatically on the GPU. We evaluated our algorithm using a conjugate gradient solver. As a result, we found that our algorithm was effective for many sparse matrices.
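
To make the idea of selecting a storage scheme concrete, here is a small CPU-side sketch that builds CSR and ELLPACK representations and chooses between them. The abstract does not state the selection criterion; the padding-ratio rule and the threshold `max_padding` below are assumptions chosen only for illustration.

```python
def to_csr(dense):
    """Convert a dense matrix (list of rows) to CSR arrays."""
    values, cols, row_ptr = [], [], [0]
    for row in dense:
        for j, a in enumerate(row):
            if a != 0:
                values.append(a)
                cols.append(j)
        row_ptr.append(len(values))
    return values, cols, row_ptr

def row_width(row_ptr):
    """Length of the longest row, i.e. the ELL row width."""
    return max((row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)), default=0)

def to_ell(csr):
    """Pad every row to the longest row length (ELLPACK layout)."""
    values, cols, row_ptr = csr
    n, width = len(row_ptr) - 1, row_width(row_ptr)
    ell_vals = [[0.0] * width for _ in range(n)]
    ell_cols = [[0] * width for _ in range(n)]
    for i in range(n):
        for slot, k in enumerate(range(row_ptr[i], row_ptr[i + 1])):
            ell_vals[i][slot] = values[k]
            ell_cols[i][slot] = cols[k]
    return ell_vals, ell_cols

def spmv_csr(csr, x):
    values, cols, row_ptr = csr
    return [sum(values[k] * x[cols[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
            for i in range(len(row_ptr) - 1)]

def spmv_ell(ell, x):
    ell_vals, ell_cols = ell
    return [sum(v * x[c] for v, c in zip(rv, rc)) for rv, rc in zip(ell_vals, ell_cols)]

def choose_format(csr, max_padding=1.5):
    """Assumed rule: use ELL only when padded storage stays close to nnz."""
    _, _, row_ptr = csr
    n, nnz = len(row_ptr) - 1, row_ptr[-1]
    if nnz > 0 and n * row_width(row_ptr) <= max_padding * nnz:
        return "ELL"
    return "CSR"
```

A tridiagonal matrix has nearly uniform row lengths and is assigned ELL by this rule, while a matrix with one dense row is assigned CSR because padding every row to the longest one would waste storage and bandwidth.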

Yuji Kubota, Daisuke TakahashiGraduate School of Systems and Information EngineeringUniversity of [email protected], [email protected]

MS55

Development of Xabclib: A Sparse Iterative Solver with Numerical Computation Policy Interface

Matrix libraries require many input parameters from the user. These include parameters whose values are difficult to set, so an approach that sets them automatically is needed. In this talk, we propose a matrix library named “Xabclib” with a numerical computation policy interface. By using Xabclib, users can specify their numerical policy, such as minimizing computation time, saving memory, or meeting an accuracy requirement for the residual of the solution, without difficult parameter setting. A performance evaluation on one node of the T2K Open Supercomputer shows that the proposed approach can reduce the number of parameters by over 60% and satisfy user needs.

Takao SakuraiCentral Research LaboratoryHitachi [email protected]

Ken NaonoHITACHI [email protected]

Takahiro KatagiUniversity of [email protected]

Hisayasu KurodaUniv. of Tokyo / Ehime [email protected]

Kengo NakajimaThe University of TokyoInformation Technology [email protected]

Satoshi Ohshima, Shoji ItohUniv. of [email protected], [email protected]

Mitsuyoshi IgaiHitachi ULSI Systems [email protected]

MS56

Fluid-Structure Interaction of Wind Turbines in 3D: Methods and Applications

In this talk I will present a collection of numerical methods combined into a single framework for wind turbine rotor modeling and simulation. I will cover structural discretization for wind turbine blades and details of the fluid-structure interaction computational procedures, focusing on the challenges of FSI coupling and mesh motion in the presence of large rotation. I will present simulations of the NREL 5MW offshore baseline wind turbine rotor, including validation against published data.

Yuri BazilevsaDepartment of Structural EngineeringUniversity of California, San [email protected]

MS56

Time Adaptive SDIRK Methods for Thermal Fluid Structure Interaction

To model cooling processes in steel manufacturing, we consider the unsteady thermal coupling of the compressible Navier-Stokes equations with the heat equation. In time, stiffly stable SDIRK methods are employed. The coupled problem on each stage is solved using partitioned FSI methods. The SDIRK schemes allow for error estimation, saving considerable computing time and guaranteeing a maximal time integration error. The method is tested on parallel flow along a heated plate, allowing for verification by experiments.

Philipp BirkenDepartment 10, Mathematics and Natural SciencesUniversity of Kassel, [email protected]

Andreas MeisterUniversity of KasselInstitute of [email protected]

Stefan HartmannClausthal University of TechnologyInstitute of Applied [email protected]

Karsten J. QuintUniversity of KasselDepartment of [email protected]

Detlef KuhlUniversity of KasselInstitute of Statics and [email protected]

MS56

Multiphysics Coupling Via Lime: Lightweight Integrated Multiphysics Environment

LIME is an enabling solution technology built as extensions to the Trilinos Solver Framework. It accommodates codes ranging from mature legacy applications to novel applications undergoing active development. LIME’s coupling algorithms range from loose fixed-point iterations to various implementations of Newton’s method. It is lightweight in that it only requires an application’s response to inputs, but it can incorporate additional information, e.g., residuals, Jacobians, etc., to effect commensurate improvements in robustness and efficiency of coupled solutions.

Russell W. HooperSandia National [email protected]

Roger PawlowskiSandia National [email protected]

Kenneth Belcourt, Rodney SchmidtSandia National [email protected], [email protected]

MS56

Stable and Efficient Coupling for Fluid-Structure Interactions

Partitioned fluid-structure interaction simulations pose major challenges, in particular for incompressible fluids, as instabilities caused by the added mass effect cannot be reduced by smaller time steps. Different approaches, ranging from Gauss-Seidel coupling iterations with constant underrelaxation over Aitken underrelaxation to quasi-Newton methods, are known as possible remedies. We propose a new hierarchical coupling method inducing very low overhead and at the same time yielding fast convergence of the coupling iterations.
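
One of the known remedies named above, Aitken underrelaxation, can be sketched for a scalar fixed-point problem x = g(x). This is the classical scheme, not the hierarchical method proposed in the talk; the function name, the initial relaxation factor, and the test map are assumptions for illustration. In a real coupling, x would be the interface displacement vector and the products would become inner products.

```python
def aitken_fixed_point(g, x0, omega0=0.5, tol=1e-12, max_iter=50):
    """Solve x = g(x) with dynamically relaxed fixed-point iterations.

    Returns the approximate fixed point and the number of residual
    evaluations used.
    """
    x, omega = x0, omega0
    r_old = None
    for it in range(1, max_iter + 1):
        r = g(x) - x                      # coupling residual
        if abs(r) < tol:
            return x, it
        if r_old is not None and r != r_old:
            # Aitken update of the relaxation factor (scalar form).
            omega = -omega * r_old / (r - r_old)
        x = x + omega * r                 # relaxed update
        r_old = r
    return x, max_iter

# A map whose plain fixed-point iteration diverges (slope -2 at the fixed point x = 1).
def unstable_map(x):
    return -2.0 * x + 3.0

solution, iterations = aitken_fixed_point(unstable_map, 0.0)
```

For this linear scalar map the unrelaxed iteration 0, 3, -3, 9, ... diverges, whereas the Aitken-relaxed iteration reaches the fixed point after two updates.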

Miriam MehlDepartment of Computer ScienceTechnische Universitaet [email protected]

MS56

Partitioned Fluid-Structure Interaction Using Multilevel Acceleration

In partitioned fluid-structure interaction, sub-iterations are required for strongly coupled problems, increasing computational time as flow and structure have to be resolved multiple times every time step. In this paper we apply a multilevel acceleration technique to the sub-iterations, which resolves a defect-correction in the flow domain on a coarse mesh level to reduce computing time. For different sub-iteration techniques, computational efficiency is determined in combination with multilevel acceleration.

Alexander H. van Zuijlen, Hester BijlFaculty Aerospace EngineeringDelft University of Technology, [email protected], [email protected]

MS57

A Highly Scalable Method for Coulomb Interactions Based on Multigrid

The simulation of particle interactions due to electrostatics can be reformulated in a consistent way as the solution of a Poisson equation plus near-field correction terms. As multigrid methods are well known to be optimal Poisson solvers, a method incorporating multigrid was developed. The resulting method scales linearly in the number of particles. The parallelization is based on a domain-splitting approach that respects the underlying computer architecture; thus a very good scaling behavior is achieved.

Matthias BoltenUniversity of [email protected]

MS57

Maxwell Equations Molecular Dynamics - A Local Algorithm for a Global Interaction

The presented electrostatics algorithm is designed to propagate the electric field through the simulation system via a local update scheme rather than solving the Poisson equation globally. This directly yields algorithmic advantages such as linear scaling with system size and trivial parallelization. In the talk, the algorithm is presented and the scaling behaviour of a simple implementation is shown. Finally, we discuss for which systems it might be advantageous to use.

Florian W. FahrenbergerUniversity of [email protected]

MS57

A Parallel Fast Multipole Method for Particle Simulations

In this talk we present the parallel version of our error-controlled FMM implementation for long-range interactions in particle simulations. The algorithm can handle open as well as 1D, 2D and 3D periodic boundary conditions efficiently. We will show results and timings of low-precision and high-precision computations for up to several billion particles. The cross-over point with the direct summation as well as the scaling behavior of the code on different supercomputers (e.g. BlueGene) will be discussed.

Ivo KabadshowResearch Centre [email protected]

Holger DachselJuelich Supercomputing CentreResearch Centre [email protected]

MS57

Parallel Fast Ewald Summation Based on Nonequispaced Fourier Transforms

The fast calculation of long-range interactions is a demanding problem in particle simulation. For periodic boundary conditions, Ewald proposed to split interactions into two fast converging parts. We show that the reciprocal space part can be efficiently evaluated by nonequispaced fast Fourier transforms. The resulting algorithms are very similar to particle-particle particle-mesh methods. Furthermore, we present a massively parallel implementation of the nonequispaced fast Fourier transform, which yields a straightforward parallelization of our Ewald method.
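
The Ewald splitting mentioned above rests on the identity 1/r = erfc(αr)/r + erf(αr)/r: the first term decays quickly in real space and the second is smooth, so it converges quickly in reciprocal space. The short sketch below only checks this identity numerically; the splitting parameter name `alpha` and its value are assumptions, and the nonequispaced FFT evaluation itself is not shown.

```python
import math

def ewald_split(r, alpha):
    """Split the Coulomb kernel 1/r into short- and long-range parts."""
    short_range = math.erfc(alpha * r) / r   # summed directly in real space
    long_range = math.erf(alpha * r) / r     # smooth part, handled in Fourier space
    return short_range, long_range

short_part, long_part = ewald_split(2.0, 0.8)
```

Larger values of `alpha` shorten the real-space cutoff at the price of more Fourier modes, which is the usual trade-off when tuning such methods.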

Michael PippigChemnitz University of TechnologyFaculty of [email protected]

MS58

Recent Progress in Radial Basis Functions

RBFs are a meshless spectral method that has been applied successfully in complicated geometry. We review recent developments in analytic approximations to RBF cardinal functions, asymptotics of RBF coefficients, RBFs and group theory, boundary layer effects, RBFs and the Runge Phenomenon, solving the vorticity equation for flow on a rotating sphere, and damping and dealiasing.

John P. BoydUniversity of MichiganAnn Arbor, MI 48109-2143 [email protected]

MS58

Computing at the Speed of Thought: Solving Differential Equations in Chebfun

Chebfun is a free MATLAB-based software system for computing with functions as readily as numbers by using fast and powerful Chebyshev series approximations. Using operators and boundary conditions expressed in a compact, natural syntax, many boundary-value, eigenvalue, and continuation problems can be solved by spectral collocation methods to high accuracy in moments, without requiring knowledge of the numerical discretizations. The ability to interact with such models in real time accelerates the process of understanding their solutions.
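
The Chebyshev series approximation underlying this approach can be illustrated with a few lines of plain Python. This is not Chebfun's API or algorithm (Chebfun is MATLAB-based and chooses the degree adaptively); the function names and the fixed number of points are assumptions for illustration.

```python
import math

def cheb_coeffs(f, n):
    """Coefficients of the degree n-1 Chebyshev interpolant of f on [-1, 1]."""
    thetas = [math.pi * (j + 0.5) / n for j in range(n)]
    fvals = [f(math.cos(t)) for t in thetas]      # samples at Chebyshev roots
    coeffs = []
    for k in range(n):
        s = sum(fv * math.cos(k * t) for fv, t in zip(fvals, thetas))
        coeffs.append((1.0 if k == 0 else 2.0) * s / n)
    return coeffs

def cheb_eval(coeffs, x):
    """Evaluate sum_k c_k T_k(x) with the Clenshaw recurrence."""
    b1 = b2 = 0.0
    for c in reversed(coeffs[1:]):
        b1, b2 = c + 2.0 * x * b1 - b2, b1
    return coeffs[0] + x * b1 - b2

coeffs = cheb_coeffs(math.exp, 20)
approx = cheb_eval(coeffs, 0.3)
```

For smooth functions such as the exponential, the coefficients decay geometrically, so twenty terms already approximate the function to roughly machine precision on the whole interval.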

Tobin DriscollUniversity of DelawareMathematical [email protected]

MS58

Petascale Applications of the Spectral Element Method

We describe recent advances in the spectral element code Nek5000 for simulation of incompressible and low-Mach number flows. The SEM can significantly reduce computational complexity, particularly for large-scale problems enabled by peta- and exascale architectures where cumulative effects dictate tight control of local discretization errors. We describe our underlying discretization and solution strategies, including a scalable multigrid solver, and present several recent simulation and performance results including sustained performance exceeding 20% of peak and 70% efficiency for P=262000.

Paul F. FischerArgonne National [email protected]

MS58

High-Order Integral Equations Methods for High-Frequency Scattering by Diffraction Gratings

In this talk we present a new high-order Integral Equation algorithm for the simulation of high-frequency scattering returns by diffraction gratings. For shallow gratings (those for which Geometric Optics indicates that there will be no multiple reflections) the method amounts to a phase-extraction technique resulting in a slowly-varying amplitude as unknown which requires only a small number of degrees of freedom to resolve. For deeper gratings we follow the work of Bruno, Reitich, and collaborators (e.g., Phil. Trans. Roy. Soc. London A 362, 2004) who utilize Geometric Optics corrections to iteratively update the rapidly varying amplitude which consists of many slowly-varying components. Our current contribution shows that the iterative update scheme can be eliminated and replaced with a simultaneous solution procedure. While our central ideas can be extended to the full vector electromagnetic time-harmonic Maxwell equations, we focus upon the case of two-dimensional linear acoustics for simplicity.

David P. NichollsUniversity of Illinois at [email protected]

Fernando ReitichSchool of MathematicsUniversity of [email protected]

MS59

WaLBerla: HPC Software Design for Computational Engineering Simulations

WaLBerla is a massively parallel software framework supporting a wide range of physical phenomena. We describe the software designs realizing the major goal of the framework, a good balance between expandability and scalable, highly optimized, hardware-dependent, special purpose kernels. In this talk, we discuss the coupling of our Lattice-Boltzmann fluid flow solver and a method for fluid structure interaction. Additionally, we show a software design for heterogeneous computations on GPU and CPU utilizing optimized kernels.

Christian FeichtingerChair for System SimulationUniversity of Erlangen-Nuremberg, [email protected]

Jan GotzUniversitat [email protected]

Stefan DonathComputer Science 10 - System SimulationFriedrich-Alexander-Universitat Erlangen-Nurnberg,[email protected]

Harald KostlerUniversitat [email protected]

Ulrich RudeComputer Science 10 - System SimulationFriedrich-Alexander-University Erlangen-Nuremberg,[email protected]

MS59

Parallel Performance of Spectral-Element Discontinuous Galerkin Lattice Boltzmann Method (SEDG-LBM) for Flow Simulations on BG/P

We present the spectral-element discontinuous Galerkin lattice Boltzmann method for 3D flow simulations based on the 19-velocity model. Our SEDG-LBM employs body-fitted unstructured meshes that enable us to deal with complex geometry with high-order accuracy. Explicit time marching, diagonal mass matrix, and minimum communication cost for numerical flux make our SEDG-LBM highly efficient in parallel. We demonstrate scalable parallel algorithms and parallel performance on the IBM BG/P at large scale.
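
For readers unfamiliar with 19-velocity lattices, the sketch below builds the standard D3Q19 velocity set with its usual weights (1/3, 1/18, 1/36). That the talk uses exactly this standard set is an assumption; the function name is illustrative.

```python
from itertools import product

def d3q19():
    """Return the standard D3Q19 discrete velocities and lattice weights."""
    velocities, weights = [], []
    for c in product((-1, 0, 1), repeat=3):
        speed2 = sum(ci * ci for ci in c)
        if speed2 == 0:
            w = 1.0 / 3.0       # rest velocity
        elif speed2 == 1:
            w = 1.0 / 18.0      # six axis-aligned velocities
        elif speed2 == 2:
            w = 1.0 / 36.0      # twelve edge-diagonal velocities
        else:
            continue            # the eight corner vectors are not part of D3Q19
        velocities.append(c)
        weights.append(w)
    return velocities, weights

velocities, weights = d3q19()
# Second moment: sum_i w_i c_ix c_ix equals the squared lattice sound speed 1/3.
second_moment_xx = sum(w * c[0] * c[0] for c, w in zip(velocities, weights))
```

The weights sum to one and reproduce the isotropic second moment, which are the properties the equilibrium distribution relies on.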

Taehun LeeDepartment of Mechanical EngineeringThe City College of New [email protected]

MiSun MinArgonne National LaboratoryMathematics and Computer Science [email protected]

MS59

Million Body Simulations of Granular Dynamics on the GPU

This paper describes the formulation and implementation of an algorithm capable of simulating dynamic systems with over one million bodies on the GPU. The problem is formulated as a cone complementarity problem, and the corresponding quadratic optimization problem is solved via an iterative method. The GPU is leveraged via NVIDIA’s CUDA architecture, solving the dynamics and the collision detection tasks in parallel. An example simulation of a tracked vehicle moving on granular terrain is shown.

Hammad Mazhar, Toby HeynUniversity of Wisconsin MadisonDepartment of Mechanical [email protected], [email protected]

Alessandro TasoraUniversita di [email protected]


Dan NegrutUniversity of Wisconsin MadisonDepartment of Mechanical [email protected]

MS59

In-Situ Visualization for Large-Scale Turbulent Combustion Simulations

As scientific supercomputing moves toward petascale and exascale, in-situ visualization clearly becomes one of the most feasible ways to enable scientists to see the full extent of the data generated by their simulations. This is particularly crucial for capturing and understanding highly intermittent transient phenomena, e.g. ignition/extinction events in turbulent combustion. However, integrating the visualization process into a simulation pipeline poses several challenges which have not been fully addressed. In this talk, we present our study of in-situ visualization for large-scale turbulent combustion simulations. We describe our design decisions and optimization strategies on domain decomposition, rendering, and image compositing. We present an in-depth evaluation of our implementation on the Cray XT5 at the NCCS/ORNL, and discuss the lessons learned through the study. We demonstrate that in-situ visualization is a feasible solution and our work provides a valuable reference to those who will pursue a similar solution.

Hongfeng YuSandia National [email protected]

MS60

An MPI/OpenMP Implementation of 3D FFTs for Plane Wave First Principles Materials Science Codes

Plane wave first principles codes based on 3D FFTs are among the largest users of supercomputer cycles in the world. I will present results for an MPI/OpenMP version of our specialized 3D FFT that gives improved scaling over a pure MPI version on multicore machines. I will also present some preliminary results for the electronic structure code PARATEC using the new 3D FFT and threaded libraries to gain improved scaling to large core counts.

Andrew M. CanningLawrence Berkeley National [email protected]

MS60

Hybrid Programming Models for Computational Chemistry

We describe and compare two programming models and their performance on the Cray XT-5 up to 120K cores on a number of simulations from computational chemistry and density functional theory. The first is a more traditional model based on a NUMA programming paradigm, Global Array/ARMCI, for NWChem, and the second is a message-passing/multi-threaded task based model for MADNESS, an adaptive pseudo-spectral code.

George FannOak Ridge National [email protected]

MS60

Evaluation of MPI/OpenMP Efficiency in Particle-in-cell Beam Dynamics Simulation

We present performance analysis of the particle-field decomposition method and the domain decomposition method in a parallel particle-in-cell beam dynamics simulation. Both parallelization approaches can be effective depending on the ratio of macroparticles to grid points. In both cases, the performance of the flat MPI implementation deteriorates rapidly with increasing number of cores per node. Using a mixed MPI/OpenMP model reduces both communication and intra-node memory contention, leading to a large performance gain.

Xiaoye Sherry LiComputational Research DivisionLawrence Berkeley National [email protected]

Ji QiangLawrence Berkeley National [email protected]

Robert D. [email protected]

MS60

Multicore Parallelization of Determinant Quantum Monte Carlo Simulations

QUEST is a software package that implements the determinant Quantum Monte Carlo method for simulations of interacting electrons. Its main computational kernel is the evaluation of Green’s functions. The existing approach is based on computing a graded decomposition of a long product of small matrices using the QR decomposition with column pivoting (QRP). Current implementations of QRP are limited by BLAS level 2 performance and do not scale well on modern multicore processors. In this talk, we will first discuss a number of alternative approaches and then focus on a structured orthogonal factorization that exploits the problem structure and avoids the communication cost of pivoting. This new approach shows promising parallel performance on multicore architectures.

Che-Rung LeeNational Tsing Hua University, [email protected]

Andres Tomas, Zhaojun BaiUniversity of [email protected], [email protected]

Richard ScalettarDepartment of Physics,University of California, Davis, [email protected]

MS61

A Posteriori Analysis of a Multirate Numerical Method for Evolution Problems

We consider multirate integration methods for evolution models that present significantly different scales within the components of the model. We describe both an a priori error analysis and a hybrid a priori-a posteriori error analysis. Both analyses distinguish the effects of the discretization of each component from the effects of multirate solution, while the effects on stability are reflected in perturbations to certain associated adjoint operators.

Donald EstepColorado State [email protected] or [email protected]

Victor E. GintingDepartment of MathematicsUniversity of [email protected]

S TavenerDept. of MathematicsColorado State [email protected]

MS61

On the use of Newton-Krylov Methods in a General, Moment-based, Scale-bridging Algorithm

We are developing a consistent, scale-bridging algorithm which accelerates an accurate solution of the fine scale equations to the point where coarse time and length scales are achievable. Our prototype fine scale problem has time, configuration space, and phase space as independent variables. This fine scale model is often referred to as the “kinetic” problem, and the discretized version of this problem will be referred to as the High Order (HO) problem. The kinetic problem could be solved by either a deterministic or Monte-Carlo approach. We will accelerate the solution to this fine problem using a Low Order (LO) problem as a coarse space preconditioner. The LO problem is derived from a small number of phase space moments of the HO problem, and can also be solved on a coarser configuration space mesh. We self-consistently determine the higher-order moments required by the LO problem with the local (in time and space) phase-space solution of the HO problem. The potential impact of these algorithmic advancements is best summarized by a selected list of applications. These could include neutron transport, photon transport, plasma kinetic simulation, rarefied gas dynamics, and transport in condensed matter (e.g., semiconductors), to name a few.

Dana KnollLos Alamos National [email protected]

MS61

Simulating Multi-Scale Multi-Physics Particle Accelerator Problems with the Adaptive Discontinuous Galerkin Method

Simulating the dynamics of charged particles in large accelerator facilities poses a variety of problems. It requires the simultaneous and self-consistent solution of Maxwell’s equations and the relativistic equations of motion of the charge carriers. Moreover, the problem exhibits a pronounced multi-scale character. Short particle bunches (μm to a few mm) excite electromagnetic fields up to the THz regime. This is contrasted by a length of some meters of the accelerator sections considered. We present a time-domain discontinuous Galerkin (DG) method with hp-adaptation for solving such problems. Details of both refinement types, the local regularity estimation and the dispersive properties are presented.

Sascha SchneppGraduate School of Computational EngineeringTechnische Universitaet [email protected]

Thomas WeilandInstitut fuer Theorie Elektromagnetischer Felder, TEMFTechnische Universitaet [email protected]

MS61

Adaptive Solution of Multiphysics Coupled Problems Using the HERMES Library

Hermes2D is a C++ library for rapid development of space- and space-time adaptive hp-FEM/hp-DG solvers. Novel PDE-independent hp-adaptivity and multimesh assembling algorithms allow the user to solve a large variety of PDE problems ranging from stationary linear equations to complex time-dependent nonlinear multiphysics PDE systems. The library comes with a free interactive online lab powered by UNR computing facilities. A detailed tutorial enhanced with many benchmarks and examples allows the user to employ Hermes2D without being an expert in object-oriented programming, finite element methods, or in the theory of partial differential equations. There is a very active user community where help can be obtained quickly. The code is distributed under the GNU General Public License.

Pavel SolinDepartment of Mathematics and StatisticsUniversity of Nevada, [email protected]

MS62

Mpi4py and Petsc4py: Using Python to Develop Scalable PDE Solvers

PETSc is a suite of routines and data structures for the scalable (parallel, MPI-based) solution of scientific applications modeled by PDEs. PETSc for Python (aka petsc4py) is a Cython-based wrapper to PETSc components like distributed vectors and matrices, Krylov-based linear solvers, Newton-based nonlinear solvers, and basic timesteppers. petsc4py facilitates the Cython/SWIG/f2py wrapping of C/C++/Fortran codes using PETSc. New mechanisms are being worked on to cross language boundaries and easily employ Python as an extension language.

Lisandro DalcinCentro Int. de Metodos Computacionales en [email protected]

MS62

Paper to GPU: Optimizing and Executing Discontinuous Galerkin Operators in Python

I will present a domain-specific language that represents discontinuous Galerkin discretizations of linear and nonlinear hyperbolic partial differential equations. While trying to remain faithful to the human-readable, paper-based representation of such operators uncluttered by implementation detail, an optimizing processing pipeline brings this human-suited representation into a form that is suited to high-performance machine execution. The core of the talk is devoted to the design and components of this pipeline, which emits optimized C/C++ code that can be targeted at CPUs and GPUs in single-core and distributed memory arrangements.


MS62

A Distributed Architecture for Very Large-scale Geometric Computing

Field modelling and simulation dominate computational science and engineering, but entered biology only recently, to help understand the mechanisms of life on a hierarchy of scales, from proteins to cells to tissues, organs and systems. Computer biomodels require the integration of petascale geometric data, multiscale and multiphysics simulations, concurrency-based adaptive behaviour and functional specialization. We introduce a computational format capable of combining geometric and physical representations and algorithms on distributed environments of any size.

Alberto Paoluzzi


Roma Tre [email protected]

Valerio PascucciUniversity of [email protected]

Giorgio ScorzelliUniversity of Utah andUniversita Roma [email protected]

MS62

Making a Python based PDE Solver Work Efficiently in Parallel with the Available Open Source Interfaces to MPI

This talk will describe our experiences in porting a serial Python-based PDE solver (FiPy, http://www.ctcms.nist.gov/fipy) to run efficiently in parallel. These experiences include migrating a serial test suite to parallel, effective partitioning, and interfacing with the available open-source parallel tools, while making minimal alterations to the original code base. Results from materials science models simulated on large clusters will be used to demonstrate the efficiency benefits and overhead associated with using these tools.

Daniel WheelerNational Institute of Standards and [email protected]

Jonathan GuyerNISTMetallurgy [email protected]

James O’BeirneGeorge Mason [email protected]

MS63

Mesh Quality and Monge-Kantorovich Optimization

Equidistribution of a monitor function has been a guiding principle in mesh generation for a long time. While in 1D equidistribution determines the mesh uniquely, an infinite number of meshes can satisfy a given equidistribution principle in 2D and 3D, implying that there is room for optimization. In this context, Monge-Kantorovich (MK) optimization chooses one equidistributed mesh by minimizing the L2 norm of the grid point displacement of that mesh relative to an initial mesh. This procedure gives rise to a very robust and efficient multidimensional mesh generation/adaptation method, based on the solution of the nonlinear Monge-Ampere equation. The latter is solved with the multigrid preconditioned Jacobian Free Newton-Krylov method, delivering a scalable algorithm under grid refinement [G.L. Delzanno, et al., JCP 227, 9841 (2008)]. We will present mesh quality comparisons against alternative equidistribution methods for 2D static problems, showing that MK optimization produces good quality grids with nearly minimal grid cell distortion. It is also very robust against mesh tangling, maintaining good quality grids when other competitor methods fail. We will also present results on the application of MK optimization to 3D time-dependent mesh adaptation for the advection of a passive scalar with various levels of flow shear. Our results confirm the effectiveness and robustness (algorithmic and against mesh tangling) of the method, in addition to good mesh quality.
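
The statement that equidistribution determines the mesh uniquely in 1D can be made concrete: each cell must carry the same integral of the monitor function, so the nodes follow from inverting its cumulative integral. The sketch below does this numerically on [0, 1]; it covers only the 1D case, not the Monge-Kantorovich method of the talk, and the function name and the resolution `n_fine` are assumptions.

```python
def equidistribute(monitor, n_cells, n_fine=2000):
    """Place n_cells + 1 nodes on [0, 1] so each cell has equal monitor integral.

    The monitor function must be strictly positive.
    """
    xs = [i / n_fine for i in range(n_fine + 1)]
    w = [monitor(x) for x in xs]
    # Cumulative integral of the monitor function (trapezoidal rule).
    cum = [0.0]
    for i in range(n_fine):
        cum.append(cum[-1] + 0.5 * (w[i] + w[i + 1]) * (xs[i + 1] - xs[i]))
    total = cum[-1]
    mesh, j = [0.0], 0
    for k in range(1, n_cells):
        target = total * k / n_cells
        while cum[j + 1] < target:
            j += 1
        # Linear interpolation inside the fine cell that brackets the target.
        frac = (target - cum[j]) / (cum[j + 1] - cum[j])
        mesh.append(xs[j] + frac * (xs[j + 1] - xs[j]))
    mesh.append(1.0)
    return mesh

uniform_mesh = equidistribute(lambda x: 1.0, 4)
graded_mesh = equidistribute(lambda x: 1.0 + 100.0 * x, 10)
```

A constant monitor yields a uniform mesh, while a monitor that grows toward x = 1 clusters the nodes there.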

Gian Luca [email protected]

Luis ChaconOak Ridge National [email protected]

John M. FinnLos Alamos National [email protected]

MS63

Mesh Generation for Modeling and Simulation of Carbon Sequestration Processes

To support modeling and simulation of carbon sequestration processes at specific geologic sites, we introduce an algorithm to generate Voronoi meshes conforming to non-convex domains that might include internal fractures. The Voronoi mesh is required to be non-regular to mimic properties found in geologic fracture systems. Thus the algorithm first generates a random point cloud using maximal Poisson disk sampling. Our target is to generate one million Voronoi cells in less than thirty seconds using a single processor.
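
A Poisson disk sample is a random point set in which no two points are closer than a given radius. The sketch below uses plain dart throwing with a background grid in the unit square; unlike the algorithm of the talk, it does not guarantee maximality and handles neither non-convex domains nor fractures. The function name and the attempt budget are assumptions.

```python
import math
import random

def poisson_disk(radius, n_attempts=5000, seed=0):
    """Dart-throwing Poisson disk sampling in the unit square."""
    rng = random.Random(seed)
    cell = radius / math.sqrt(2.0)   # a cell of this size holds at most one point
    grid = {}
    points = []
    for _ in range(n_attempts):
        p = (rng.random(), rng.random())
        gi, gj = int(p[0] / cell), int(p[1] / cell)
        accepted = True
        # Only points in the surrounding 5 x 5 block of cells can be too close.
        for i in range(gi - 2, gi + 3):
            for j in range(gj - 2, gj + 3):
                q = grid.get((i, j))
                if q is not None and math.hypot(p[0] - q[0], p[1] - q[1]) < radius:
                    accepted = False
        if accepted:
            grid[(gi, gj)] = p
            points.append(p)
    return points

points = poisson_disk(0.1)
```

The accepted points could then serve as Voronoi seeds; reaching a truly maximal sample requires tracking the remaining uncovered regions rather than relying on a fixed number of random attempts.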

Mohamed S. Ebeida, Patrick M. Knupp, Vitus LeungSandia National [email protected], [email protected],[email protected]

MS63

Generation and Optimization of Prismatic-tetrahedral Hybrid Meshes for Complex Biomedical Geometries

We present a method to generate and optimize hybrid meshes with prisms and tetrahedra. Our method first generates a layered tetrahedral mesh, and then generates prismatic boundary layers by advancing the surface and shrinking the tetrahedra using a variational procedure that optimizes the prisms while preserving the quality of tetrahedra. The resulting mesh is further improved using local edge flipping and smoothing. Experimental results demonstrate its effectiveness for complex biological geometries.

Xiangmin JiaoStony Brook UniversityStony Brook, [email protected]

Daniel R. EinsteinPacific Northwest National [email protected]

Vladimir DyedovDepartment of Applied Mathematics & StatisticsStony Brook [email protected]

Andrew Kuprat


Pacific Northwest National [email protected]

Navamita RayStony Brook [email protected]

MS63

A New Strategy for Untangling Meshes via Node-Movement

A new mesh optimization strategy for untangling quadrilateral meshes, based on node-movement, is investigated. The strategy relies on a set of Propositions which show that, for certain non-negative quality metrics within the Target-matrix paradigm, if the value of the metric is less than 1, then the local area is positive. The Propositions are exploited to devise a new strategy for simultaneous mesh untangling and quality improvement. Numerical results confirm the expected behavior.
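
The notion of positive local area can be illustrated by computing, at each corner of a quadrilateral, the signed area spanned by the two edges that meet there; a quadrilateral counts as untangled here when all four values are positive. This sketch only shows that check. It does not implement the Target-matrix quality metrics or the untangling strategy of the talk, and the function names are assumptions.

```python
def corner_areas(quad):
    """Signed local areas at the four corners of a quadrilateral.

    quad is a list of four (x, y) nodes in counterclockwise order.
    """
    areas = []
    for k in range(4):
        x0, y0 = quad[k]
        x1, y1 = quad[(k + 1) % 4]     # next node
        x3, y3 = quad[(k - 1) % 4]     # previous node
        # 2D cross product of the two edge vectors leaving this corner.
        areas.append((x1 - x0) * (y3 - y0) - (x3 - x0) * (y1 - y0))
    return areas

def is_untangled(quad):
    return min(corner_areas(quad)) > 0.0

unit_square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
bow_tie = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
```

An untangling method based on node movement would relocate nodes until this check passes for every element of the mesh.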

Patrick M. KnuppSandia National [email protected]

Jason FranksUniversity of WashingtonSeattle, [email protected]

MS64

AC-SAMMM - The Aachen Platform for Structured Automatic Manipulation of Mathematical Models - A Case Study

We apply second-order adjoint sensitivity analysis to an industrial polymerization process given by about 2000 parametric differential-algebraic equations (DAEs). A Modelica model of the process is automatically transformed into a C-implementation of the DAE residuals. The derivative code compiler dcc is applied to this C-implementation to yield first-order tangent-linear and adjoint derivatives. Second-order adjoint derivatives are obtained by reapplication of dcc to its own output. The obtained derivative models are embedded into an extension of the ESO class (equation set object class) of the CAPE-OPEN standard.

Ralf HannemannAachener VerfahrenstechnikRWTH Aachen [email protected]

Boris Gendler, Michael ForsterSTCERWTH Aachen [email protected],[email protected]

Jutta Wyes, Moritz Schmitz, Wolfgang MarquardtAachener VerfahrenstechnikRWTH Aachen [email protected],[email protected],[email protected]

Uwe NaumannRWTH Aachen UniversitySoftware and Tools for Computational Engineering

[email protected]

MS64

Three Dimensional Tomography in Atmospheric Remote Sensing

The upper troposphere / lower stratosphere plays a key role in the climate system and is one of the least understood atmospheric regions. Next generation remote sensing instruments allow for tomographic measurement of this part of the atmosphere and will significantly improve our understanding of dynamical, radiative, and chemical processes in this region. The amount of data obtained by these new instruments is enormous, and new retrieval schemes have to be developed. In this talk we present a new algorithm optimized for massive 3-D retrievals of several hundred thousand measurements and atmospheric constituents. Different regularization approaches and minimizers to solve the non-linear problem are discussed. A thematic focus will be on automatic differentiation by operator overloading and source code transformation.

Martin KaufmannInstitute for Energy and Climate Research - StratosphereJuelich Research [email protected]

Joern Ungermann, Lars Hoffmann, Joerg BlankInstitute of Chemistry and Dynamics of the GeosphereJuelich Research [email protected], [email protected],[email protected]

Martin RieseInstitute of Chemistry and Dynamics of the GeosphereJuelich Research [email protected]

Johannes LotzLuFG Software and Tools for Computational EngineeringRWTH Aachen [email protected]

Ebadollah VarnikSoftware and Tools for Computational EngineeringRWTH Aachen [email protected]

Uwe NaumannRWTH Aachen UniversitySoftware and Tools for Computational [email protected]

MS64

A Gentle Introduction to Algorithmic Differentiation and Discrete Adjoints

Potentially very large gradients and Hessians of multivariate functions $F : \mathbb{R}^n \to \mathbb{R}$ play an often crucial role in modern CSE algorithms. Traditional numerical differentiation by finite differences as well as exact (up to machine accuracy) differentiation using tangent-linear models exhibit a computational complexity of $O(n) \cdot \mathrm{Cost}(F)$ ($O(n^2) \cdot \mathrm{Cost}(F)$) for the evaluation of the gradient (Hessian), which is likely to become infeasible for large-scale numerical simulations. Discrete adjoints compute the same gradients (Hessians) with machine accuracy at $O(1) \cdot \mathrm{Cost}(F)$ ($O(n) \cdot \mathrm{Cost}(F)$), thus saving an order of magnitude in the computational cost. Their (semi-)automatic generation by tools for Algorithmic Differentiation (AD) yields a number of challenges, both for AD-tool developers and users. This talk is meant to set the stage for the following talks by giving a gentle introduction to AD, discrete adjoints, and their use within various numerical algorithms.
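As a rough illustration of why adjoint (reverse) mode delivers a full gradient at a cost independent of n, the following minimal sketch records a computational graph by operator overloading and then propagates adjoints backwards in a single sweep. The class `Var`, the helpers `sin` and `gradient`, and the test function `F` are illustrative assumptions, not part of any AD tool mentioned in the talk.

```python
import math

class Var:
    """Scalar graph node storing its value and the local partials to its parents."""
    def __init__(self, value, parents=()):
        self.value = value
        self.parents = parents      # tuple of (parent_node, local_partial)
        self.adjoint = 0.0

    def __add__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value + other.value, ((self, 1.0), (other, 1.0)))

    def __mul__(self, other):
        other = other if isinstance(other, Var) else Var(other)
        return Var(self.value * other.value,
                   ((self, other.value), (other, self.value)))

def sin(x):
    return Var(math.sin(x.value), ((x, math.cos(x.value)),))

def gradient(output, inputs):
    """One reverse sweep: visit nodes in reverse topological order."""
    order, seen = [], set()
    def visit(node):
        if id(node) in seen:
            return
        seen.add(id(node))
        for parent, _ in node.parents:
            visit(parent)
        order.append(node)          # parents are appended before their consumers
    visit(output)
    output.adjoint = 1.0
    for node in reversed(order):
        for parent, partial in node.parents:
            parent.adjoint += node.adjoint * partial
    return [x.adjoint for x in inputs]

def F(xs):
    # Hypothetical test function: F(x) = sum_i x_i * sin(x_i)
    total = Var(0.0)
    for x in xs:
        total = total + x * sin(x)
    return total

points = (0.5, 1.0, 2.0)
xs = [Var(p) for p in points]
grad = gradient(F(xs), xs)          # all n partials from one backward pass
```

A tangent-linear (forward) evaluation of the same function would need one pass per input direction, which is the source of the O(n) factor quoted above.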

Uwe NaumannRWTH Aachen UniversitySoftware and Tools for Computational [email protected]

MS64

Error Estimation in Geophysical Fluid Dynamics with Algorithmic Differentiation

Uncertainty quantification is an essential part of model development in the geosciences. We present in this talk variations of dual weight error estimation for goals calculated from the solution of a shallow water model. The necessary adjoint solutions are obtained with an Algorithmic Differentiation tool. The local error estimates are obtained with deterministic and stochastic functionals of the flow. We show that the stochastic interpretation of dual weight error estimation allows ensemble-type uncertainty quantification for physical quantities derived from a single model run.

Florian RauserMax Planck Institute for MeteorologyOcean in the Earth [email protected]

Peter KornMax Planck Institute for [email protected]

MS65

Programming Motifs 2: Adaptive Mesh Refinement (High Level)

Efforts to write adaptive mesh refinement code in Chapel have been strikingly successful, demonstrating a reduction in code size by factors of 10 to 20 compared to existing implementations. This talk will present a high-level look at block-structured AMR, detailing its various geometric challenges and the Chapel features used to address them.

Jonathan ClaridgeUniversity of WashingtonApplied [email protected]

MS65

Programming Motifs 2: Adaptive Mesh Refinement (Details)

This talk will explore in greater detail several core elements of the code. This includes Chapel's capacity for dimension-independent programming, our method for cleanly handling unions of rectangular index sets, and a remarkably simple implementation of the Berger-Rigoutsos partitioning algorithm.

Jonathan ClaridgeUniversity of WashingtonApplied [email protected]

MS65

Programming Motifs 1: Linear Algebra

Our evaluation of Chapel is built around implementations of the Berkeley programming motifs, which we discuss briefly. We illustrate Chapel's power with some specific examples from blocked dense linear algebra algorithms and from graph algorithms relevant to sparse matrix computations.

John G. LewisCray, [email protected]

MS65

An Overview of Chapel

Chapel is a new programming language, under development by Cray Inc., which is designed to simplify parallel programming on multicore workstations, commodity clusters, and supercomputers alike. Chapel strives to increase productivity for users of all levels by supporting greater abstraction than current parallel programming models, while also providing performance that meets or exceeds current technologies. This talk will provide an overview of the Chapel language, focusing on features utilized throughout the remainder of the minisymposium.

Jonathan TurnerUniversity of ColoradoComputer [email protected]

MS66

Accelerating Linear Algebra Calculations using Statistical Techniques

We illustrate with two applications how linear algebra calculations can be enhanced by statistical techniques. The first example is related to condition numbers in linear systems or linear least-squares problems, whose computation might be as expensive as the solution itself. In this case the computational cost can be reduced by one order of magnitude using statistical estimates. Another application concerns the solution of linear systems, in which preconditioning by random matrices enables us to avoid pivoting.

Marc BaboulinINRIA/University of [email protected]

MS66

Solving Linear Systems of Equations on Parallel Computers with Multiple Levels of Parallelism

In this talk we present an algorithm for solving dense systems of equations on hierarchical parallel machines that exploits multiple levels of parallelism. The algorithm is based on CALU, a communication optimal LU factorization algorithm initially developed for distributed memory machines. We evaluate its performance on clusters of multicore processors. We show that an implementation based on MPI and Pthreads leads to better performance than routines implementing LU factorization in well-known numerical libraries.

Laura GrigoriINRIAFrance

[email protected]

Simplice DonfackINRIA, [email protected]

MS66

Towards High Performance Algorithms for the Symmetric Eigenvalue Problems

The goal of this talk is to solve the symmetric eigenvalue problem on multicore/GPU architectures more quickly than existing implementations. The bulk of the work in existing algorithms is the reduction to tridiagonal form, so it is worthwhile to develop more efficient algorithms. In order to achieve this goal, we have developed an approach based on tile algorithms that reduces a symmetric matrix to banded form and then finds the eigenvalues using a band divide and conquer algorithm.

Azzam Haidar, Hatem LtaiefDepartment of Electrical Engineering and ComputerScienceUniversity of Tennessee, [email protected], [email protected]

Jack J. DongarraDepartment of Computer ScienceThe University of [email protected]

MS66

Speeding Householder Bidiagonalization

Iterations to find singular values of bidiagonal matrices are quick, due to the small number of nonzero entries. Householder reduction to bidiagonal form is relatively slower than QR or LU decomposition, which can be accomplished by BLAS-3 (matrix matrix multiplications) at near peak computational speeds. In LAPACK Householder bidiagonalization, half the operations are matrix vector multiplications (BLAS-2), which are relatively slow because only two floating-point operations are performed for each matrix element fetched to a CPU or GPU. One way to speed the computation is to combine two matrix vector multiplications (BLAS-2.5). Householder reduction from full to banded triangular form is BLAS-3, thus fast. This talk addresses the "bottleneck" reduction from banded to bidiagonal form.
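For orientation, the textbook (unblocked, Golub-Kahan style) Householder bidiagonalization can be sketched as follows. This is not the speaker's BLAS-2.5 or banded scheme; it is a plain-Python sketch, assuming a dense matrix with at least as many rows as columns, and the function names are illustrative. The inner sums are the matrix-vector style products that dominate the BLAS-2 cost.

```python
import math

def householder(x):
    """Unit vector v such that (I - 2 v v^T) x is a multiple of e_1, or None if x = 0."""
    norm = math.sqrt(sum(t * t for t in x))
    if norm == 0.0:
        return None
    v = list(x)
    v[0] += math.copysign(norm, x[0])   # sign choice avoids cancellation
    vnorm = math.sqrt(sum(t * t for t in v))
    return [t / vnorm for t in v]

def bidiagonalize(A):
    """Return an upper bidiagonal B = U^T A V for an m x n matrix with m >= n."""
    B = [row[:] for row in A]           # work on a copy
    m, n = len(B), len(B[0])
    for k in range(n):
        # Left reflector: zero column k below the diagonal.
        v = householder([B[i][k] for i in range(k, m)])
        if v is not None:
            for j in range(k, n):
                s = 2.0 * sum(v[i - k] * B[i][j] for i in range(k, m))
                for i in range(k, m):
                    B[i][j] -= s * v[i - k]
        # Right reflector: zero row k to the right of the superdiagonal.
        if k < n - 2:
            v = householder(B[k][k + 1:])
            if v is not None:
                for i in range(k, m):
                    s = 2.0 * sum(v[j - k - 1] * B[i][j] for j in range(k + 1, n))
                    for j in range(k + 1, n):
                        B[i][j] -= s * v[j - k - 1]
    return B

A = [[4.0, 3.0, 2.0],
     [1.0, 5.0, 7.0],
     [2.0, 8.0, 1.0],
     [6.0, 2.0, 9.0]]
B = bidiagonalize(A)
```

Because only orthogonal reflections are applied from the left and right, `B` has the same singular values and the same Frobenius norm as `A`.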

Gary W. HowellNorth Carolina State Universitygary [email protected]

MS67

Recovery-Based A Posteriori Error Estimation

Estimators of the recovery type possess a number of attractive features (ease of implementation, generality, and asymptotic exactness) that have led to their popularity in the engineering community. However, it is well known that for interface problems with large jumps, existing estimators of the recovery type over-refine regions where there are no errors and, hence, fail to reduce the global error. In this talk, I will first explain why they fail and how to fix this structural failure. I will then introduce new recovery-based estimators for conforming, nonconforming, mixed, and discontinuous Galerkin elements. It is shown theoretically and numerically that these estimators are robust with respect to the jumps of diffusion coefficients. Moreover, these estimators do not require the triangulation to be aligned with physical interfaces, which is essential for their application to nonlinear interface problems of practical interest.

Zhiqiang CaiPurdue UniversityDepartment of [email protected]

MS67

Adjoint Error Estimation Formulations for Nonlinear Algorithms Applied to Linear Advection and Advection-Diffusion Equations

A theoretical framework is developed for adjoint-based methods of a posteriori error estimation with nonlinear algorithms. Application is restricted to linear advection and advection-diffusion equations to delineate the technical difficulties introduced specifically by the algorithmic nonlinearity. Development of the correct adjoint operator and efficiency in implementation are discussed, with computational examples. This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

Jeffrey M. ConnorsLawrence Livermore National LaboratoryCenter for Applied Scientific [email protected]


Jeffrey A. HittingerCenter for Applied Scientific ComputingLawrence Livermore National [email protected]


MS67

A Posteriori Error Estimates and Adaptive Error Control for Probabilistic Sensitivity Analysis

We consider the nonparametric density estimation problem for both the forward and inverse sensitivity analysis of differential equations. We describe an a posteriori error analysis for computed probability distributions that accounts for both stochastic and deterministic sources of error. The error estimate is precise and can be used for a general adaptive error control that balances all sources of error. We describe applications to both ordinary differential equations and elliptic problems.

Troy ButlerInstitute for Computational Engineering and SciencesUniversity of Texas at [email protected]


Axel MalqvistDepartment of Information TechnologyUppsala [email protected]

Simon TavenerColorado State [email protected]

MS67

A Posteriori Error Measures and Adaptive Local Refinement with First-Order System Least-Squares (FOSLS) Finite-Element Methods

The standard FOSLS formulation of a system of PDEs enjoys a simple a posteriori error measure that is both locally sharp and globally reliable. One would prefer an error measure that is also a local upper bound on the error. The FOSLS local error measure provides an upper bound in a special semi-norm of the error, similar to an $H^1$ semi-norm. This talk will include a discussion of the implications of this distinction. A hybrid FOSLS/FOSLL* formulation will be introduced that reduces the local $L^2$ error in the approximation, while retaining the FOSLS semi-norm bounds.

Thomas ManteuffelUniversity of [email protected]

Steve McCormickDepartment of Applied [email protected]

Lei Tang, Kuo LiuUniversity of [email protected], [email protected]

MS68

Immersed Elastic Structure Dynamics in Viscoelastic Fluids

Many biological fluids are non-Newtonian and exhibit viscoelastic responses. Here we discuss recent results obtained using an immersed boundary framework to study the interaction between immersed elastic structures and surrounding viscoelastic fluids.

John ChrispellMathematics DepartmentTulane [email protected]

Lisa J. FauciTulane UniversityDepartment of [email protected]

MS68

A Monolithic, Conservative Finite Element Method for Darcy-Stokes Coupling: Towards an Optimal Error Analysis

Recently, finite element methods for Darcy-Stokes coupling based on Raviart-Thomas elements have been introduced. These elements have the right continuity properties at the interface to implement the Beavers-Joseph-Saffman conditions. Indeed, mass conservation holds strongly independent of permeability, and the possibility of tangential slip avoids boundary layers in the porous medium. We show optimal convergence of the method in $L^2$ on the whole domain. Furthermore, we address the question of preconditioning the resulting discrete problems.

Guido KanschatDepartment of MathematicsTexas A&M [email protected]

MS68

Error Estimate for Discontinuous Finite Element Approximation to Interface Problems under Minimum Regularity

An error analysis of discontinuous Galerkin methods is derived for interface problems under a minimum regularity assumption. This is achieved by using the technique involving an a posteriori error estimate introduced by Gudi [1]. [1] T. Gudi, A new error analysis for discontinuous finite element methods for linear elliptic problems, Math. Comp., electronically published on April 12, 2010.

Xiu YeUniversity of Arkansas, Little [email protected]

MS68

A High Order Cartesian Grid Method for the Helmholtz Equation with Geometrically Complicated Material Interfaces

Across the dielectric interfaces, the electromagnetic wave solutions are usually non-smooth or even discontinuous, so that our effort in designing efficient algorithms is easily foiled unless the complex interfaces are properly treated. We have recently introduced a novel higher order finite difference method, the matched interface and boundary (MIB) method, for solving the Helmholtz equation with arbitrarily curved dielectric interfaces based on a simple Cartesian grid. Like other Cartesian grid methods, the MIB method in some sense fits the numerical differentiation operators to the complicated geometries. Nevertheless, the MIB method distinguishes itself from the existing interface methods by avoiding the use of the Taylor series expansion and by introducing the concept of the iterative use of low order jump conditions. The difficulty associated with other interface approaches in extending to ultra high order is thus bypassed in the MIB method. In solving waveguides with straight interfaces, the MIB interface treatment can be carried out systematically so that the proposed approach is of arbitrarily high order, in principle. Orders up to 12 are confirmed numerically for both transverse magnetic (TM) and transverse electric (TE) modes. In solving waveguides with curved interfaces, the enforcement of jump conditions couples two transverse magnetic field components, so that the resulting MIB method becomes a full vectorial approach. The full vectorial MIB method has been shown to be able to deliver fourth order accuracy in treating arbitrarily curved interfaces.

Shan ZhaoDepartment of MathematicsUniversity of [email protected]

MS69

Comparison of Some Aspects of Nodal and hp-FEM Methods for Nuclear Reactor Simulations

The highly efficient nodal methods originated in the nuclear engineering community, where they quickly overshadowed the more demanding FVM or low-order FEM. To the best of our knowledge, though, nodal methods have received little attention in other engineering fields, where FEM was more dominant. In an effort to clarify their relationship, we will solve a neutron diffusion problem with both an hp-adaptive FEM and a transverse integrated nodal method and discuss their similarities and differences.

Milan HanusDepartment of MathematicsUniversity of West Bohemia, Czech [email protected]

MS69

Combined Adaptive Multimesh hp-FEM/hp-DG for Multiphysics Coupled Problems Involving Compressible Inviscid Flows

During the last decade, Discontinuous Galerkin (DG) methods have been studied by numerous researchers in the context of different problems ranging from linear elliptic equations to Euler equations of compressible inviscid flow. It is well known that DG methods yield larger discrete problems than standard continuous finite element methods (FEM). On the other hand, their implicit stabilization through embedded numerical fluxes makes them particularly well suited for hyperbolic flow problems. Thus for multiphysics problems involving compressible flow we propose to combine the best of both worlds: DG is used for the flow part only, while standard FEM is employed for second-order equations where it works very well.

Lukas KorousCharles University in PragueFaculty of Mathematics and [email protected]

Milan HanusDepartment of MathematicsUniversity of West Bohemia, Czech [email protected]

MS69

Fully-coupled AMG Preconditioned Newton-Krylov Solution Methods for Strongly-coupled Multi-physics Applications

This study considers the performance of a fully-coupled algebraic multilevel preconditioner for Newton-Krylov solution methods. The performance of the preconditioner is demonstrated on a set of challenging multiphysics PDE applications: drift-diffusion approximations for semiconductor devices; a low Mach number formulation for flow, transport and non-equilibrium chemical reactions; and a low flow Mach number resistive magnetohydrodynamics (MHD) system. The AMG preconditioner is based on an aggressive-coarsening graph-partitioning of the nonzero block structure of the Jacobian matrix.

John ShadidSandia National LaboratoriesAlbuquerque, [email protected]

Paul LinSandia National [email protected]

Ray S. TuminaroSandia National LaboratoriesComputational Mathematics and [email protected]

MS69

Rheagen: The Rheology Application Engine

The FEniCS project with Automated Scientific Computing aims to present a higher level of abstraction to users of scientific software. Equations are entered directly as weak forms in Python, from which efficient back-end finite element code is generated. We have enriched the language in order to accommodate many complex fluid models. We will demonstrate this for several difficult benchmark problems in Oldroyd fluids using numerous discretization and stabilization schemes.

Andy TerrelTexas Advanced Computing CenterUniversity of [email protected]

Matthew G. KnepleyUniversity of [email protected]

Ridgway ScottDepartment of Computer ScienceUniversity of [email protected]

MS70

Analyzing Optimization Models Developed with the Pyomo Modeling Software

We describe Pyomo, an open source tool for modeling optimization applications in Python. Pyomo can be used to define symbolic problems, create concrete problem instances, and solve these instances with standard solvers. Pyomo provides a capability that is commonly associated with algebraic modeling languages such as AMPL, AIMMS, and GAMS, but Pyomo's modeling objects are embedded within a full-featured high-level programming language with a rich set of supporting libraries. Pyomo leverages the capabilities of the Coopr software library, which provides interfaces to optimizers that can analyze a rich array of optimization models, including linear and integer programs, stochastic programs and generalized disjunctive programs.

William E. HartSandia National Labs, [email protected]

Jean-Paul WatsonSandia National LaboratoriesDiscrete Math and Complex [email protected]

David WoodruffUniversity of California, [email protected]

MS70

A Flexible Python Environment for PDE-Constrained Optimization

This talk gives an overview of NLPy, a toolkit for optimization in Python, and describes how NLPy interacts with the FEniCS framework for the description of variational problems to yield a convenient and powerful environment for PDE-constrained optimization. A new solver and its implementation are covered. Several applications are used to illustrate the concepts and demonstrate usage of the software.

Dominique OrbanGERAD and Dept. Mathematics and IndustrialEngineeringEcole Polytechnique de [email protected]

Nick GouldNumerical Analysis GroupRutherford Appleton [email protected]

Sue ThorneRutherford Appleton [email protected]

MS70

Algorithmic Differentiation in Python

We discuss what kinds of derivatives can be evaluated efficiently using Algorithmic Differentiation and give a brief overview of univariate Taylor polynomial arithmetic. We also describe how numerical linear algebra functions like the QR decomposition fit into the framework, describe how sparsity in derivative tensors can be exploited, and show how arbitrary mixed partial derivatives can be evaluated using interpolation methods. We show examples and give pointers to available software in Python.
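To make the idea of univariate Taylor polynomial arithmetic concrete, here is a minimal sketch in plain Python, assuming truncated coefficient lists `[x_0, x_1, ..., x_d]` that represent x(t) = x_0 + x_1 t + ... + x_d t^d. The function names are illustrative and do not correspond to any particular AD package.

```python
import math

def taylor_mul(a, b):
    """Truncated product: (a*b)_k = sum_{i=0..k} a_i * b_{k-i}."""
    return [sum(a[i] * b[k - i] for i in range(k + 1)) for k in range(len(a))]

def taylor_exp(x):
    """y = exp(x) via y' = y * x', i.e. k*y_k = sum_{i=1..k} i * x_i * y_{k-i}."""
    y = [math.exp(x[0])] + [0.0] * (len(x) - 1)
    for k in range(1, len(x)):
        y[k] = sum(i * x[i] * y[k - i] for i in range(1, k + 1)) / k
    return y

# Propagate x(t) = 2 + t through f(x) = x**3 up to degree 3.
x = [2.0, 1.0, 0.0, 0.0]
cube = taylor_mul(taylor_mul(x, x), x)            # coefficients of (2 + t)**3

# The k-th derivative of f at x = 2 is k! times the k-th coefficient.
derivatives = [math.factorial(k) * c for k, c in enumerate(cube)]

# Propagate x(t) = t through exp: the coefficients are 1/k!.
exp_coeffs = taylor_exp([0.0, 1.0, 0.0, 0.0])
```

A single propagation of this kind yields all derivatives up to the truncation degree along one direction, which is the building block that interpolation-based schemes for mixed partial derivatives combine.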

Sebastian F. WalterHumboldt-Universitat zu [email protected]

MS70

Stochastic Nonlinear Programming with Pyomo

We describe PySP, an open-source extension of Pyomo (a Python-based modeling language for mathematical programming) that enables modeling and solution of stochastic mixed-integer programs. PySP contains a number of generic decomposition-based solution strategies made possible through Python language features such as introspection. We discuss the design and implementation of these generic strategies, in addition to computational results on standard stochastic benchmarks.

Carl LairdAssistant ProfessorTexas A&M [email protected]


MS71

System Requirements for Enabling Graph Algorithms on Massively Parallel Computers

Many scientific, national security, and business applications need to process irregular, unstructured data. Unlike scientific applications based on linear algebra routines, these applications comprise large graph computations with irregular memory patterns, low computation to memory ratios, and abundant fine grain parallelism that synchronizes frequently. Traditional architectures optimized to run floating point intensive simulations are inadequate. In this talk I discuss the requirements for graph based applications motivated by a clustering algorithm for generating Delaunay meshes.

John Feo, Jace MogillPacific Northwest [email protected], [email protected]

MS71

Approximate Graph Operations on Parallel Platforms


Ananth GramaPurdue UniversityDepartment of Computer [email protected]

MS71

Multithreaded Algorithms for Graph Coloring

Graph coloring is an abstraction for partitioning a set of binary related objects into a few independent subsets. It is used, among other things, to discover concurrency in parallel computing. There exist linear-time heuristic algorithms that give good solutions for this problem, but they are challenging to parallelize due to limited concurrency and a high ratio of data access to computation. We present two kinds of multithreaded coloring algorithms: the first is targeted at the Cray XMT and the second at general shared-memory architectures. We present results on carefully chosen input graphs, covering various classes of graphs, with up to billions of edges.
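The sequential baseline behind such work is typically a greedy, first-fit heuristic that runs in time linear in the number of edges. The sketch below shows that serial heuristic only, not the multithreaded algorithms of the talk; the adjacency-list dictionary and the function name are illustrative assumptions.

```python
def greedy_coloring(adjacency):
    """First-fit coloring: give each vertex the smallest color unused by its neighbors."""
    colors = {}
    for v in adjacency:                       # vertices in the given order
        used = {colors[u] for u in adjacency[v] if u in colors}
        c = 0
        while c in used:
            c += 1
        colors[v] = c
    return colors

# A triangle (0, 1, 2) with a pendant vertex 3 attached to vertex 2.
graph = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3], 3: [2]}
coloring = greedy_coloring(graph)

# Vertices sharing a color form an independent set and can be processed concurrently.
independent_sets = {}
for v, c in coloring.items():
    independent_sets.setdefault(c, []).append(v)
```

Each vertex depends on the colors already assigned to its neighbors, which is the sequential dependence that makes parallelization difficult.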

Mahantesh HalappanavarPacific Northwest National [email protected]

Umit V. CatalyurekThe Ohio State UniversityDepartment of Biomedical [email protected]

John FeoPacific Northwest [email protected]

Assefaw H. GebremedhinPurdue [email protected]

Alex PothenPurdue UniversityDepartment of Computer [email protected]

MS71

Optimizing Short-read Genome Assembly Algorithms for Emerging Multicore Platforms

We present a new parallel implementation for the de novo assembly problem of large-scale genomic data on multicore clusters. This new approach belongs to the family of Eulerian path-based methods for de novo genome assembly, and involves construction, traversal, and simplification of a large string graph. We will discuss parallelization of various steps of the assembly process, focusing on techniques for scalable graph construction and simplification.
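As a toy serial illustration of the Eulerian path idea, the sketch below builds a de Bruijn graph from the k-mers of a few error-free reads and spells a sequence along an Eulerian path found with Hierholzer's algorithm. The reads, the choice k = 4, and all function names are assumptions made for illustration; real assemblers add error correction, graph simplification, and the parallel construction discussed in the talk.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Nodes are (k-1)-mers; each distinct k-mer contributes one edge prefix -> suffix."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in kmers:
        graph[kmer[:-1]].append(kmer[1:])
    return graph

def eulerian_path(graph):
    """Hierholzer's algorithm; assumes the graph has an Eulerian path."""
    in_degree = defaultdict(int)
    for successors in graph.values():
        for node in successors:
            in_degree[node] += 1
    # Start at the node with one more outgoing than incoming edge, if any.
    start = next((u for u in graph if len(graph[u]) - in_degree[u] == 1),
                 next(iter(graph)))
    remaining = {u: list(vs) for u, vs in graph.items()}
    stack, path = [start], []
    while stack:
        u = stack[-1]
        if remaining.get(u):
            stack.append(remaining[u].pop())
        else:
            path.append(stack.pop())
    return path[::-1]

def assemble(reads, k):
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])

reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]    # overlapping, error-free toy reads
contig = assemble(reads, 4)
```

With a smaller k, repeated (k-1)-mers create branches and the Eulerian path need not be unique, which is one reason graph simplification matters in practice.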

Kamesh MadduriLawrence Berkeley National [email protected]

MS72

Emulating the Nonlinear Matter Power Spectrum for the Universe

Many of the most exciting questions in astrophysics and cosmology, including most observational probes of dark energy, rely on an understanding of the nonlinear regime of structure formation. In order to fully exploit the information available from this regime and to extract cosmological constraints, accurate theoretical predictions are needed. Currently such predictions can only be obtained from costly, precision numerical simulations. This work is aimed at constructing an accurate predictor of the nonlinear mass power spectrum on Mpc scales for a wide range of currently viable cosmological models, including dark energy. We use the Coyote Universe simulation suite, which comprises nearly 1,000 N-body simulations at different force and mass resolutions, spanning 38 wCDM cosmologies. This large simulation suite enables us to construct a prediction scheme for the nonlinear matter power spectrum accurate at the percent level for large wave numbers. We present this scheme, discuss the tests we have done to ensure its accuracy, and discuss how it can be used to estimate cosmological parameters.

Dave HigdonLos Alamos National LaboratoryStatistical [email protected]

Earl Lawrence, Katrin Heitmann, Salman HabibLos Alamos National [email protected], [email protected],[email protected]

Martin WhiteLawrence Berkeley National [email protected]

Christian WagnerAstrophysikalisches Institut Potsdam (AIP), Potsdam,[email protected]

MS72

Gradient Enhanced Universal Kriging Model for Inexpensive Uncertainty Quantification in Reactor Safety Simulations

In this work we discuss an approach for uncertainty propagation through computationally expensive physics simulation codes. Our approach incorporates gradient information to provide a higher quality surrogate with fewer simulation results compared with derivative-free approaches. In turn, we create a Gaussian probabilistic model for the system response which, coupled with input uncertainty information, provides a complete uncertainty approach when the physics simulation code can be run only a small number of times. We demonstrate that explicitly modeling the mean process substantially increases the efficiency of the approach when compared with ordinary kriging, and that using a Gaussian process substantially reduces the error when compared to regression approaches using the same information. We demonstrate our findings on synthetic functions as well as nuclear reactor models.

Brian LockwoodUniversity of [email protected]


MS72

Parallelization Schemes for Constructing Effective Surrogate Models of Large Scale Computer Simulations

We will present in this talk some effective computational strategies for constructing surrogate models (Bayes Linear Models) from an ensemble of expensive computer models. Such surrogates are crucial to uncertainty quantification of many classes of models of physical systems modeled by partial differential equations. Careful strategies for parallel construction of the inverse of the covariance matrix are used to render this computation tractable. While similarities do exist between these and more classical iterative solution methods, there are also interesting differences.

Abani K. PatraSUNY at BuffaloDept of Mechanical [email protected]


E. Bruce PitmanDept. of MathematicsUniv at Buffalo, Buffalo, NY [email protected]

Elena StefanescuSUNY at Buffalo Dept of Mechanical [email protected]

Matthew JonesCenter for Comp. Research,Univ. at Buffalo, Buffalo, NY [email protected]

MS72

Bayesian Model Analysis Using Parallel Adaptive Multi-Level Sampling Algorithms

We present a parallel adaptive algorithm for sampling multimodal distributions and computing Bayesian model evidences. The algorithm samples a sequence of distributions (levels) that converge to the final multimodal distribution, communicating sampling information among adaptively selected levels in order to correctly capture the volume proportions among final modes. The parallel version of the algorithm considers load balancing as well. We show parallel computational examples on the analysis of some turbulence models.

Sai Hung CheungInstitute for Computational Engineering and [email protected]

Todd OliverPECOS/ICES, The University of Texas at [email protected]

Ernesto E. PrudencioInstitute for Computational Engineering and [email protected]

Karl W. SchulzInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

MS73

Computational Islet Simulations of Pancreatic Beta Cells

One of the causes of diabetes is the failure of beta cells to secrete insulin in response to blood glucose levels. The beta cell is the most prevalent cell type in the islets of the endocrine system. By using robust numerical methods and efficient programming techniques, we have created an extensible, efficient, and functional computational islet simulator in Matlab that can simulate islets with 1000 or more beta cells, sufficient for physiologically representative simulations. Advisor: Dr. Matthias Gobbert

Christopher RaastadUniversity of [email protected]

MS73

Solving a NIST Suite of PDE Benchmark Problems with Adaptive Low-Order FEM and hp-FEM

This paper is based on a recent NIST report by Mitchell containing a collection of PDE benchmark problems. These feature various phenomena that pose challenges to adaptive finite element algorithms. After a brief review of the main differences in automatic adaptivity for low-order and higher-order FEM, including hanging nodes and spatially and polynomially anisotropic refinements, we use the benchmarks in the suite to discuss and compare various aspects of automatic adaptivity in low-order FEM and hp-FEM. Advisor: Dr. Pavel Solin, University of Nevada, Reno

Erick A. SantiagoUniversity of Nevada, [email protected]

MS73

The Physics and Engineering of Skateboarding's MegaRamp

The MegaRamp was designed by a professional skateboarder based on his experience. There has been no formal research performed to analyze the ramp. This research uses lumped-parameter modeling techniques supplemented with motion tracking data to develop equations of motion that accurately model the dynamics of a skateboarder on the MegaRamp. This model will be used as a quantitative tool to determine if the ramp can be modified to improve a skateboarder's performance. Advisors: Kevin Fite (Mechanical Engineering) and Aaron Luttman (Mathematics)

Emily A. StefanoClarkson [email protected]

MS73

Developing Computational Tools to Predict Material Behavior

By solving the Schrödinger Equation for many different atoms, we can predict the behavior of materials without having to make higher level assumptions. However, it is currently difficult or even impossible to solve the equation for millions of atoms. We present a method that uses a Discrete Wavelet Transform to downsize the matrix arising from the discretization of the Schrödinger Equation in order to compute the ground state energy of the system more efficiently. Advisors: Dr. Malena Espanol and Prof. Michael Ortiz
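The abstract does not say which wavelet or how many levels are used, so the following is only a sketch of the general idea under stated assumptions: one level of the orthonormal Haar transform applied to both sides of a symmetric matrix, keeping only the coarse (scaling) block. The function name and the model matrix (a 1D discrete Laplacian standing in for a discretized Hamiltonian) are illustrative.

```python
def haar_coarsen(H):
    """Coarse block of W H W^T for one Haar level; assumes an even-sized square matrix.

    Each coarse entry combines a 2x2 block of H with weight (1/sqrt(2))**2 = 1/2.
    """
    half = len(H) // 2
    return [[0.5 * (H[2 * i][2 * j] + H[2 * i][2 * j + 1]
                    + H[2 * i + 1][2 * j] + H[2 * i + 1][2 * j + 1])
             for j in range(half)]
            for i in range(half)]

# Model problem: 8 x 8 tridiagonal matrix with 2 on the diagonal and -1 beside it.
n = 8
laplacian = [[2.0 if i == j else (-1.0 if abs(i - j) == 1 else 0.0)
              for j in range(n)]
             for i in range(n)]

coarse = haar_coarsen(laplacian)        # 4 x 4 matrix, a quarter of the entries
```

Because the coarse block equals P H P^T for a matrix P with orthonormal rows, its smallest eigenvalue is an upper bound on the smallest eigenvalue of H (a Rayleigh-Ritz argument), so the reduced problem gives an approximation of the ground state energy from above.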

Stephanie TsueiCalifornia Institute of [email protected]

MS73

A Generalized Monte Carlo Loop Algorithm for Frustrated Ising Models

Monte Carlo (MC) simulation of some frustrated 2D lattice spin models (Ising models) at low temperatures can be prohibitively slow due to an extensive number of ground-state or near-ground-state spin configurations separated by large energy barriers to single spin flips. In this paper, we introduce a Generalized Loop Move (GLM) that uses the dual graph of a 2D Ising model to overcome this slowness and demonstrate its effectiveness in several cases where standard MC is ineffective. Advisors: Hans De Sterck, University of Waterloo; Roger Melko, University of Waterloo

Yuan WangUniversity of [email protected]
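For context, the "standard MC" baseline that the abstract above refers to is usually a single-spin-flip Metropolis update. The sketch below shows that baseline only, not the Generalized Loop Move. It assumes a square lattice with periodic boundaries and a uniform coupling `J`, which is a simplification; the frustrated models studied in the talk are not specified here.

```python
import math
import random

def energy(spins, J=1.0):
    """Total energy -J * sum over bonds of s_i * s_j on a periodic L x L lattice."""
    L = len(spins)
    e = 0.0
    for i in range(L):
        for j in range(L):
            # Count each bond once by looking only right and down.
            e -= J * spins[i][j] * (spins[i][(j + 1) % L] + spins[(i + 1) % L][j])
    return e

def delta_energy(spins, i, j, J=1.0):
    """Energy change caused by flipping the spin at site (i, j)."""
    L = len(spins)
    neighbours = (spins[(i - 1) % L][j] + spins[(i + 1) % L][j]
                  + spins[i][(j - 1) % L] + spins[i][(j + 1) % L])
    return 2.0 * J * spins[i][j] * neighbours

def metropolis_sweep(spins, T, rng, J=1.0):
    """Attempt L*L single spin flips at temperature T (Metropolis acceptance)."""
    L = len(spins)
    for _ in range(L * L):
        i, j = rng.randrange(L), rng.randrange(L)
        dE = delta_energy(spins, i, j, J)
        # Always accept downhill moves; accept uphill moves with exp(-dE / T).
        if dE <= 0 or rng.random() < math.exp(-dE / T):
            spins[i][j] = -spins[i][j]
    return spins
```

At low temperature the factor exp(-dE / T) becomes very small for any uphill flip, which is the slowdown the abstract describes when low-energy configurations are separated by barriers to single spin flips.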

MS74

Adjoint Methods for Error Control and Adaptivity in Ocean Models

We present an error control framework, based on the solutions of adjoint problems on potentially coarser scales, that allows for reliable error estimation and grid adaption. The framework supports explicit, multirate time integration. In addition, resource saving methods such as block adaptivity and compensated domain decomposition can be used to solve the adjoint problem. We illustrate the method on a variety of single and multi-stack shallow water models.

Varis Carey, Donald EstepColorado State [email protected], [email protected] [email protected]

MS74

A Multirate Time Integration Scheme and A Posteriori Error Estimates for Ocean Circulation Modeling

In ocean circulation modeling, the system of modeling equations is often split into a 2D, depth-independent barotropic subsystem and a 3D, depth-dependent baroclinic system. In this presentation we describe a multirate discontinuous Galerkin method for integrating the subsystems in time. The scheme allows us to use different time steps for different subsystems. We also present an a posteriori estimate of the error at the final time of computation. This estimator may be used for adaptive meshing.

Brandon ChabaudPennsylvania State [email protected]

MS74

A Scale-invariant Formulation of the Anticipated Potential Vorticity Method

This is the first of a series of efforts to develop scale-aware subgrid parametrization schemes on variable-resolution grids. We focus on the anticipated potential vorticity method (APVM) on quasi-uniform grids with varying resolutions. By a scale analysis technique and the phenomenological theories for two-dimensional turbulence, we derive a scale-invariant formulation of the APVM, which depends on one single parameter. Then utilizing a basic optimization technique, we determine the optimal value of the parameter for the APVM through numerical experiments.

Qingshan ChenFlorida State [email protected]


Todd RinglerLos Alamos National [email protected]

MS74



Todd RinglerLos Alamos National [email protected]

MS75

Automatic Differentiation and Geometry Adaptation Techniques for Shape Optimization of Fluid-Based Systems

In the context of shape optimization of fluid-based systems involving complex fluids, the aspect of sensitivity of optimal shapes with respect to the fluid model parameters is examined. The parametric sensitivities are obtained by applying automatic differentiation tools to the entire optimization tool chain. A second aspect of practical geometry modification for complex shapes is also addressed. Applications include blood flow devices and plastics extrusion dies.

Marek BehrRWTH Aachen UniversityChair for Computational Analysis of Technical [email protected]

Stefanie Elgeti, Mike NicolaiRWTH Aachen [email protected], [email protected]

Markus ProbstRWTH Aachen UniversityChair for Computational Analysis of Technical [email protected]

MS75

Development and Application of a Discrete Adjoint Approach for Unsteady Flow Control

We discuss the development and application of an efficient discrete adjoint solver for unsteady, viscous, incompressible flow control problems. The discrete adjoint flow solver is developed by the use of automatic differentiation in reverse mode. Here, the binomial checkpointing algorithm 'revolve' is combined with the adjoint flow solver for optimal reduction in memory requirements.

Nicolas R. Gauger, Anil Nemili, Emre OezkayaDepartment of Mathematics and CCESRWTH Aachen [email protected], [email protected], [email protected]
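The memory/recomputation trade-off behind binomial checkpointing can be illustrated with the bound usually quoted in the checkpointing literature: with c checkpoints and at most r repeated forward evaluations of any time step, up to binomial(c + r, c) time steps can be reversed. The sketch below only evaluates that bound; it is not the 'revolve' schedule itself, and the function names are hypothetical.

```python
from math import comb

def max_steps(checkpoints, repetitions):
    """Largest number of time steps reversible under the binomial bound."""
    return comb(checkpoints + repetitions, checkpoints)

def min_repetitions(steps, checkpoints):
    """Smallest repetition count r with max_steps(checkpoints, r) >= steps."""
    if checkpoints < 1:
        raise ValueError("at least one checkpoint is required")
    r = 0
    while max_steps(checkpoints, r) < steps:
        r += 1
    return r
```

For example, `min_repetitions(10, 3)` asks how many repeated forward evaluations per step suffice to reverse 10 time steps with 3 checkpoints under this bound; increasing the number of checkpoints lowers the answer at the cost of memory.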

MS75

Heliostat Layout Optimization via Automatic Differentiation: Adjoints and Convex Relaxations

Concentrated solar thermal power generation with a central receiver is considered. Its major costs are the land area required and the heliostats used to concentrate the insolation. Heliostats are placed optimally using a model developed for this purpose. Automatic differentiation techniques are used for the calculation of derivatives, convex relaxations and their subgradients. The theory of convex relaxation of algorithms is summarized and numerical results from the heliostat placement and other case studies are presented.

Corey Noone, Alexander MitsosDepartment of Mechanical EngineeringMassachusetts Institute of [email protected], [email protected]

MS75

Memory-Efficient Newton/Pantoja Method for Optimal Control Problems

We investigate a memory-efficient Newton/Pantoja method for the numerical solution of optimal control problems. Using a local minimum principle we derive the first order necessary optimality conditions, which are equivalent to a nonlinear equation in appropriate Banach spaces. This equation is solved by a combination of the Newton and Pantoja methods. Each iteration of the Newton method, i.e. each application of the Pantoja method to evaluate a search direction, contains three alternative sweeps through a time horizon, with a specific information dependence between different sweeps, so that a straightforward implementation of the algorithm would require a huge amount of memory to store all intermediate variables. To reduce this memory requirement we develop nested checkpointing techniques and prove some theoretical results concerning them. Finally, we discuss numerical experiences with an optimal control problem for laser surface hardening of steel.

Julia Sternberg-KalettaDepartment of MathematicsUniversity of [email protected]

Andreas GriewankHU Berlin, MATHEON Research Center, [email protected]

MS76

Shape Optimization of Chiral Structures for Low Reynolds Number Propulsion

Recent advances in micron-scale fabrication techniques allow for the construction of helically shaped and magnetically controlled artificial micro-swimmers. To facilitate the design of these novel transport devices, we conduct shape optimization simulations based on a boundary integral representation of the Stokes equations and a variational approach for the optimization problem. We determine those swimmer shapes that maximize speed in a particular direction and identify key improvements to the shapes fabricated in experimental studies.

Eric E. KeavenyCourant InstituteNew York [email protected]

MS76

A Two-dimensional Numerical Study of a Permeable Capsule under Stokes Flow Condition

In the present work, a healthy red blood cell and a malaria-infected red blood cell in the presence of membrane permeability are modeled using a two-dimensional immersed interface method. The fluid properties of the enclosed fluid and the surrounding fluid are assumed to be the same. The results show that the healthy red blood cell gradually moves away from the vessel wall while the malaria-infected red blood cell rolls on the vessel wall due to adhesion. It is found that the resistance to the blood flow due to the malaria-infected red blood cell is higher than the corresponding resistance due to the healthy red blood cell. Moreover, the mass transfer characteristics of the healthy and malaria-infected red blood cells are investigated for various flow fields and membrane properties.

Pahala Jayathilake, B.C. KhooNational University of SingaporeSingapore

[email protected], [email protected]

MS76

Wetting Dynamics and Particle Deposition for An Evaporating Colloidal Drop: A Lattice Boltzmann Study

A three-dimensional (3D) lattice Boltzmann method (LBM) has been developed for multiphase (liquid and vapor) flows with solid particles suspended within the liquid phases. The method generalizes our recent 2D model to 3D, extends the implicit scheme to include inter-particle forces and introduces an evaporation model to simulate drying of the colloidal drop. The LBM is used to examine the dynamical wetting behavior of drops containing suspended solid particles on homogeneous and patterned substrates. The influence of the particle volume fraction and particle size on the drop spreading dynamics is studied as is the final deposition of suspended particles on the substrate after the carrier liquid evaporates. The final particle deposition can be controlled by substrate patterning, adjusting the substrate surface energies and by the rate of evaporation.

Ying Sun, Abhijit JoshiDrexel [email protected], [email protected]

MS76

Comparing Slender-body Formulation with the Method of Regularized Stokeslets

In this work we compare the slender-body formulation for an elastic slender body immersed in a viscous fluid to the method of regularized Stokeslets. A quantitative comparison is conducted, and both formulations are applied to model and simulate the dynamics of a primary cilium in order to elucidate both formulations. In particular, experimental data on primary cilia are utilized and comparative results will be presented. This work is supported by NSF/CBET-0853673.

Yuan-Nan YoungDepartment of Mathematical [email protected]

MS77

Scalable Algorithms for Large-scale Inverse Wave Propagation

We consider algorithms for geophysical inverse problems with the goal of achieving large-scale parallel scalability. To this end, all parts of the computational pipeline need to be examined, and redeveloped where necessary. We present inexact Newton-Krylov iterative methods, where the Hessian is applied via the solution of forward and adjoint problems. These are solved in parallel using continuous and discontinuous Galerkin methods, where dynamic mesh adaptivity is applied to both the state and parameter fields.


Omar GhattasUniversity of Texas at Austin
[email protected]



MS77

Parallel Multiscale Optimization Techniques in Nonlinear Mechanics

The parallel numerical solution of realistic mechanical problems, such as large-deformation contact between an elastic body and a rigid obstacle, often gives rise to nonlinear and possibly non-convex optimization problems. Thus, in order to succeed in computing a local minimizer of such optimization problems, their solution is most often carried out employing globalization strategies, i.e., Trust-Region and Linesearch methods. As is well-known, the paradigm of globalization strategies is to compute and to damp a search direction in order to achieve a descent in the value of a given objective function. Usually, search directions are computed as the solution of constrained quadratic programming problems. But, even if these quadratic programming problems are solved exactly, the damping of the search directions might yield a slow convergence of the overall scheme. Unfortunately, this effect often increases with the size of the optimization problem. Therefore, we present a class of nonlinear preconditioning strategies where a possible slow convergence is bypassed by computing search directions in parallel. In particular, the paradigm of these strategies is to locally solve certain nonlinear programming problems employing either Trust-Region or Linesearch methods. But, since these globalization strategies asynchronously compute local corrections, we must furthermore take care of the overall convergence of the method. As it turns out, this can be done by employing global control strategies yielding globally convergent, inherently parallel Linesearch (APLS) and Trust-Region (APTS) strategies. In this talk, we will therefore review the concept of nonlinear additively preconditioned globalization strategies and focus on the application of such strategies to the parallel solution of large-scale nonlinear optimization problems arising from the discretization of large deformation contact problems.

Christian GrossUniversita’ della Svizzera ItalianaInstitute of Computational [email protected]

Rolf KrauseUniversita’ della Svizzera [email protected]
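The idea of "computing and damping a search direction to achieve descent" mentioned in the abstract above can be sketched with a textbook backtracking (Armijo) line search. This is a generic serial illustration under assumed parameter values (`shrink`, `c`), not the APLS or APTS strategies of the talk.

```python
def backtracking_step(f, grad, x, direction, alpha=1.0, shrink=0.5, c=1e-4,
                      max_halvings=50):
    """Damp `direction` until the Armijo sufficient-decrease condition holds.

    f maps a list of floats to a float; grad returns the gradient as a list.
    Returns the new point and the accepted damping factor alpha.
    """
    fx = f(x)
    slope = sum(g * d for g, d in zip(grad(x), direction))
    if slope >= 0:
        raise ValueError("direction is not a descent direction")
    for _ in range(max_halvings):
        trial = [xi + alpha * di for xi, di in zip(x, direction)]
        if f(trial) <= fx + c * alpha * slope:
            return trial, alpha
        alpha *= shrink  # damp the step and try again
    raise RuntimeError("no acceptable step found")
```

For a steep objective such as f(x) = x^4 the full step along the negative gradient overshoots and must be damped several times, which hints at how repeated damping can slow the overall scheme as the abstract notes.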

MS77

On the Treatment of Uncertainties in Aerodynamic Shape Optimization

The unavoidable presence of uncertainties poses several difficulties to the numerical treatment of optimization tasks. In this talk, we discuss a novel approach towards aleatory uncertainties for the specific application of optimal aerodynamic design under uncertainties. An appropriate robust formulation of the underlying deterministic problem and efficient approximation techniques of the probability space are investigated. Finally, algorithmic approaches based on multiple-setpoint ideas in combination with one-shot methods are presented as well as numerical results.

Claudia SchillingsUniversity of Trier, [email protected]

Volker SchulzUniversity of TrierDepartment of [email protected]

MS77

HPC Issues in Shape and Topology Optimization

Shape and topology optimization problems are a special sub-class of PDE constrained optimization. Concerning shape optimization problems, special care must be taken to treat the resulting mesh sensitivities properly, as otherwise the sensitivity information of the objective with respect to a change of the shape of the underlying domain can become prohibitively expensive to compute for a fine parameterization of the domain. One possible remedy here is shape calculus which can be used to arrive at a surface formulation of the shape gradient. The resulting numerical procedure is very efficient allowing one to use every surface mesh node position as the unknown for the shape. Concerning topology optimization, the special structure of the SIMP method creates a very regular memory access pattern, which also leads to potentially very efficient schemes for novel hardware. As such, the implementation of this topology optimization approach is studied on modern GPUs.

Stephan SchmidtUniversity of [email protected]


MS78

Adjoint-Based Numerical Error Estimation for the Unsteady Compressible Navier-Stokes Equations

We present practical adjoint-based strategies for estimating numerical discretization errors in scalar outputs obtained from unsteady simulations of the compressible Navier-Stokes equations. The discretization is discontinuous Galerkin in space and time, and the adjoint is obtained in a discrete fashion. We investigate various approximations of the fine-space adjoint in terms of error estimate cost and accuracy, and we present a method for separating effects of the spatial and temporal discretizations.

Krzysztof FidkowskiUniversity of [email protected]

Isaac [email protected] of Michigan


MS78

Space-Time Error Estimation Via Dual Problems for CFD Problems

Space-time a-posteriori error estimates for derived outputs $M(u)$ are expressed in terms of residuals of the primal numerical problem $u_h$ and weights involving the solution of a locally linearized dual problem $\phi$:

\[
|M(u) - M(u_h)| < \sum_{\text{time}} \sum_{\text{space}} W(\phi) \cdot \mathrm{Residual}(u_h).
\]

This error representation can also be used as the basis for mesh adaptivity and error control. Primal physical systems with genuine linearity may exhibit local instability (e.g. hydrodynamic instability, MHD instability, etc) while remaining globally stable. This often leads to a rapid growth in the dual problem and the weights $W(\phi)$ appearing in a-posteriori error estimates. This rapid growth can make the control of errors in certain computed outputs problematic. In this presentation, we investigate the growth of dual problems for nonlinear systems such as the compressible Navier-Stokes equations. Particular attention is given to

• the choice of computed output M(u),

• approximate local linearization,

• long time integration.

Numerical calculations include compressible flow simulation at low to moderate Reynolds number and preliminary calculations arising in 2-fluid plasma simulation.

Timothy J. BarthNASA Ames Research [email protected]
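The dual-weighted residual idea in the abstract above can be seen in its simplest form on a linear algebraic problem, where the representation is exact rather than a bound. Assume a primal system A u = f, a linear output M(u) = g . u and an approximate solution u_h. If phi solves the dual problem A^T phi = g, then M(u) - M(u_h) = phi . (f - A u_h). The 2x2 setting and all names below are illustrative assumptions, not the space-time CFD estimator of the talk.

```python
def solve2(A, b):
    """Solve a 2x2 linear system by Cramer's rule."""
    det = A[0][0] * A[1][1] - A[0][1] * A[1][0]
    return [(b[0] * A[1][1] - A[0][1] * b[1]) / det,
            (A[0][0] * b[1] - b[0] * A[1][0]) / det]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def adjoint_error_estimate(A, f, g, u_h):
    """Return phi . (f - A u_h), where phi solves the dual problem A^T phi = g."""
    At = [[A[0][0], A[1][0]], [A[0][1], A[1][1]]]
    phi = solve2(At, g)                                   # dual (adjoint) solve
    residual = [f[i] - dot(A[i], u_h) for i in range(2)]  # primal residual
    return dot(phi, residual)
```

In this linear setting the estimate reproduces the output error up to rounding. For nonlinear problems the dual problem is only a local linearization, which is why the growth of the dual solution matters for the reliability of the estimate.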

MS78

Error Estimation for Aleatoric Uncertainty Propagation using Stochastic Adjoints

Stochastic adjoint equations are introduced within a non-intrusive discrete sampling framework to enable efficient propagation of aleatoric uncertainties in systems governed by algebraic or differential equations involving random parameters. Adjoints are used to estimate the error (due to inexact reconstruction of the solution in stochastic space) in statistical moments of interest and the procedure is shown to exhibit super-convergence, in accordance with the underlying theoretical rates. Goal-oriented error indicators are then built using the adjoint solution and used to identify regions for adaptive sampling.

Karthik [email protected]

MS78

Adjoint-based Discretization Error Estimation for Time Dependent Problems

Adjoint-based error estimation techniques for functional outputs in time-dependent computational fluid dynamics problems are discussed. Unsteady fluid flow problems on dynamically deforming unstructured meshes in two dimensions are considered, where the governing flow equations are discretized in arbitrary Lagrangian Eulerian (ALE) form. The discrete adjoint for the coupled fluid flow/mesh deformation equations is derived and solved using a backwards integration in time. The adjoint solution is then used to derive error estimates for spatial, temporal and algebraic error in the simulation, and for driving adaptive refinement strategies for reducing these error sources. Prospects for extending these techniques to more complex multiphysics simulations will also be considered.


MS79

Numerical Methods for Subsurface Flow in Karst Aquifer

In a karst aquifer, free flow and porous media flow are tightly coupled together, for which the Stokes-Darcy model has higher fidelity than either the Darcy or Stokes systems on their own. The Stokes-Darcy model has attracted significant attention in the past ten years since it also arises in many other applications such as surface water flows, petroleum extraction and industrial filtration. However, coupling the two constituent models leads to a very complex system. This presentation discusses numerical methods for solving several types of Stokes-Darcy system. Computational results are presented to illustrate their features and some convergence analysis is demonstrated.

Yanzhao CaoDepartment of Mathematics & StatisticsAuburn [email protected]


Xiaoming HeDepartment of Mathematics and StatisticsMissouri University of Science and [email protected]

Xiaoming [email protected]

MS79

Numerical Method for Tracking Interfaces in Fluids

We describe the sliding interface method (SIM) for tracking a sharp interface separating two materials as it moves through a structured logically rectangular grid in two spatial dimensions. The interfaces between the different solution regions are represented by a piecewise linear curve along grid cell edges or diagonals. Accurate numerical approximations to partial differential equations (PDEs) with moving discontinuous interfaces require methods that explicitly account for the movement of the interface and the interaction between the interface and the solution of the underlying PDE. The best existing numerical algorithm to account for these discontinuities is problem dependent. We will review the strengths and weaknesses of other interface tracking methods to identify the problems for which our new sliding interface method is most appropriate. The SIM provides a simple approach to incorporate a sharp approximation of moving interfaces that separate regions with vastly different properties within existing methods based on finite volume discretizations of the underlying system of PDEs. We describe the advantages of the SIM on a series of problems where a moving interface immersed in a fluid can be accurately approximated.

Michael NicholaMathematics DepartmentTulane [email protected]

MS79

Ciliary Dynamics and Chlamydomonas

Chlamydomonas is a tiny unicellular green alga with two hair-like flagella for propelling the cell body through its fluid environment. The flagella, with both sensory and motility functions, are very similar to animal cilia, which makes Chlamydomonas an outstanding model organism to study. In recent years, Chlamydomonas has been intensively used for the study of numerous fundamental biological processes in cell and molecular biology, including cilia-pathology in human health. However, the mechanism governing the bi-flagellar motilities is still poorly understood. We present a fluid-dynamical model, using computational approaches, that examines swimming and a variety of the bi-flagellar motilities of Chlamydomonas. This model couples the time-dependent fluid dynamics and the internal force generation mechanism by ATP-induced molecular motor proteins.

Xingzhou YangDepartment of Mathematics and StatisticsMississippi State [email protected]

Lisa J. FauciTulane UniversityDepartment of [email protected]

MS79

Matched Interface and Boundary (MIB) Method for Solving Multi-Flow Navier-Stokes Equations with Applications to Geophysics

Based on the Matched Interface and Boundary (MIB) method, we propose a high-order numerical approach for solving incompressible Navier-Stokes equations with discontinuous viscosity and singular forces at internal interfaces. The continuity of the viscous stress across the interface is enforced by coupling all velocity components when modifying the local finite difference scheme for computing spatial derivatives of any velocity component. This allows the application of the projection method to ensure a divergence-free velocity field, and hence the approach is suitable for applications to real geophysical problems with strong nonlinearity.

Yongcheng Zhou, James LiuDepartment of MathematicsColorado State [email protected], [email protected]

Dennis HarryDepartment of GeosciencesColorado State [email protected]

MS80

Cython: Compiled Code meets Dynamic Python

Cython is an extension to the Python language that allows explicit type declarations and is compiled directly to C. This addresses Python's large overhead for numerical loops and makes it very easy to make efficient use of existing C, C++ and Fortran code, which Cython code can interact with natively. Cython combines the speed of C with the power and simplicity of Python, and is one of the core tools for Python-based scientific computation.

Lisandro DalcinCentro Int. de Metodos Computacionales en [email protected]

Robert BradshawGoogle, [email protected]

MS80

Matplotlib - from Interactive Exploration to Publication Graphics

matplotlib is a Python library for generating scientific graphs and visualizations. While often used for ease of use in interactive and scripted environments, the library has a full featured API for publication quality graphics and production applications. In this talk we explore some of the new features of matplotlib such as the HTML5 canvas, with an emphasis on publication quality features such as sophisticated axes grid layouts, dropped spines, fonts and mathematical text.

John HunterTradelink, [email protected]

MS80

Capabilities and Recent Developments of NumPy for Scientific Computing

NumPy is a library that provides a flexible and powerful N-dimensional array system, along with core numerical functionality, and lays the foundation for scientific computing in Python. The array system allows creation of an array of arbitrary but strongly-typed data along with basic math functions that can operate at compiled speeds on the entire array in an element-by-element fashion. This talk will describe NumPy as well as the ports to the Python 3 and IronPython platforms.

Travis E. OliphantEnthought, [email protected]

MS80

Why Modern, High-performance Networking Matters for Interactive Computing

IPython is widely used as a terminal-based environment for interactive scientific computing. Recently, we have redesigned its architecture around the ZeroMQ library for high-performance networking, to provide vastly enhanced capabilities. Based on a protocol for frontends communicating with kernels that execute user code, we can build multiple user interfaces, provide facilities for interactive remote collaboration and much more. We will present this design and showcase some of the possibilities for novel interactive interfaces it enables.

Fernando PerezHelen Wills Neuroscience InstituteUniversity of California, [email protected]

Brian GrangerCal Poly, San Luis [email protected]

Evan PattersonDepartment of PhysicsCalifornia Institute of [email protected]

MS81

Implicit Newton-Krylov Drift-Diffusion Semiconductor Simulations on Multicore Architectures

This talk will discuss the performance of a massively parallel simulation code for semiconductor devices on a few current multicore architectures. The drift-diffusion equations are discretized by a finite element method, and solved with a fully-implicit Newton-Krylov approach. While this approach is robust, it is also heavily dependent on sparse matrix algorithms. We discuss the impact of preconditioners on the scaling and performance, and the efficiency of the MPI-only code on multicore architectures.

Paul LinSandia National [email protected]

John ShadidSandia National LaboratoriesAlbuquerque, [email protected]

MS81

Multithreaded Hybrid Solver for Sparse Linear Systems


Sivasankaran RajamanickamSandia National [email protected]

Erik G. BomanSandia National Labs, NMScalable Algorithms [email protected]

Michael A. HerouxSandia National [email protected]

MS81

TraceMin: A Scalable Parallel Symmetric Eigensolver

The Trace Minimization algorithm (TraceMIN) obtains a few of the smallest eigenpairs of a symmetric generalized eigenvalue problem. It is ideally suited for implementation on parallel architectures. We illustrate its scalability on a variety of problems including the simulation of car body dynamics at high frequencies. In addition, we show the effectiveness of TraceMIN in computing the Fiedler vector for weighted spectral reordering of large sparse matrices.

Ahmed SamehPurdue [email protected]

MS81

Parallel Scalability Analysis of Sparse Hybrid Linear Solvers on the Cray XE6

Achieving high parallel scalability of sparse linear system solvers implemented on large-scale computing platforms comprised of tens of thousands of multicore processors is a task that offers many challenges. Towards achieving such a solver, we build on the success of the PARDISO-SPIKE family of parallel solvers for sparse linear systems. In this talk, I present a generalization of a hybrid family of schemes for handling general sparse linear systems. Application results will be presented from seismic inversion and 3D oil reservoir modeling.

Olaf SchenkDepartment of Mathematics and Computer ScienceUniversity of Basel, [email protected]

MS82

Parallel Stochastic Newton MCMC for Large-scale Statistical Inverse Problems

We present a new parallel MCMC method for the solution of the Bayesian statistical inverse problem. Local Hessian and gradient information is used to adaptively construct a radial basis function approximation (RBF) of the posterior, which is used as proposal distribution for the Metropolis-Hastings algorithm across several parallel chains. Parallelism allows for rapid convergence of this approximation, and thus minimal sample correlation in the resulting MCMC chains.

Tan BuiUniversity of Texas at [email protected]

Carsten BursteddeThe University of Texas at [email protected]



Lucas WilcoxUniversity of Texas at [email protected]

MS82

Benchmarking of Discrete Stochastic Galerkin Solver for Computational Fluid Dynamics

In this paper we describe an unintrusive approach to Galerkin-based generalized polynomial chaos (gPC) using templating and operator overloading. We have developed a CFD solver using this approach which is easy to read and can address new classes of physics automatically, without extensive re-implementation of the gPC equations. We benchmark the performance of the code for canonical CFD problems, and present results for accuracy, timing and scaling on large-scale parallel computing platforms.

Sanjay MathurPurdue [email protected]

Jayathi MurthyPurdue University, Dept. of Mechanical EngineeringWest Lafayette, IN [email protected]

MS82

Uncertainty Quantification and Robust Design of Cardiovascular Bypass Graft Surgeries

We present a framework for performing uncertainty quantification and stochastic optimization of systems governed by partial differential equations. A novel and non-intrusive adaptive stochastic collocation technique is used to account for uncertainties. Optimization is performed using a derivative-free technique called the Surrogate Management Framework (SMF), in which Kriging interpolation functions are used to approximate the objective function. Applications of the method to computationally intensive simulation and design of cardiovascular bypass graft surgeries are presented.

Sethuraman SankaranUniversity of California at San [email protected]

Alison MarsdenDepartment of Mechanical and Aerospace EngineeringUniversity of California, San [email protected]

MS82

High-Performance Preconditioning for Intrusive Stochastic Projections

Several solution methods for stochastic Galerkin discretization of partial differential equations with random input data are compared. Less intrusive approaches based on Jacobi and Gauss-Seidel iterations are compared with more intrusive Krylov-based approaches. A set of preconditioners for Krylov-based methods is examined, including mean-based, Gauss-Seidel, approximate Gauss-Seidel and approximate Jacobi preconditioners. The Krylov-based approach using approximate Gauss-Seidel and Jacobi preconditioners is found to be most effective. Sandia's Trilinos software is used to implement the above algorithms.

Ramakrishna TipireddyUniversity of Southern CaliforniaAerospace and Mechanical Engineering and [email protected]


Roger GhanemUniversity of Southern CaliforniaAerospace and Mechanical Engineering and [email protected]

MS83

Reduced Models of Multiscale Kinetic Systems under Uncertainty

Chemical kinetic systems contain both uncertain rate parameters and dynamics at multiple time scales. The latter feature aids model reduction in the deterministic case, but model reduction under uncertainty raises new challenges. We use computational singular perturbation to calculate ‘importance indices’ for species-reaction pairs. Distributions of these indices are used to form reduced models that (1) yield predictions within probabilistic bounds determined by the full model, or (2) preserve entire output distributions of the full model.

Thomas ColesDepartment of Aeronautics and AstronauticsMassachusetts Institute of [email protected]

Youssef M. MarzoukMassachusetts Institute of [email protected]

MS83

Coupling Algorithms for Stochastic Multiphysics

We present recent developments for the numerical resolution of coupled partial differential equations with stochastic coefficients. These equations arise naturally in the context of UQ for multiphysics problems. The proposed algorithms explore different probabilistic representations of information exchanged between the coupled equations. Issues of accuracy and efficiency, including model and dimension reduction, are stressed. An application to a problem arising in nuclear safety is used to demonstrate the mathematical constructions and associated algorithms.

Maarten ArnstUniversity of Southern [email protected]


John Red-HorseValidation and Uncertainty Quantification ProcessesSandia National [email protected]


MS83

Stochastic Atomistic-to-Continuum Coupling using Bayesian Inference

We perform a multiscale simulation in a model operating under uncertainty. The uncertainty takes the form of both parametric uncertainty and sampling noise intrinsic to atomistic simulations. We present a mathematical formulation that enables the exchange of information and propagation of uncertainty between the discrete and continuum components. We implement a Bayesian inference machinery to build the polynomial chaos expansion (PCE) of the exchanged variables. We consider a simple Couette flow model where the variable of interest is the wall velocity. Results show convergence toward the analytical solution at a reasonable computational cost.

Maher SalloumSandia National LaboratoriesScalable & Secure Systems Research [email protected]

Khachik SargsyanSandia National [email protected]

Reese Jones, Bert Debusschere,[email protected], [email protected]


Helgi AdalsteinssonSandia National Laboratories, Livermore [email protected]

MS83

Kernel Principal Component Analysis for Stochastic Input Generation of Multiscale Systems

We apply kernel principal component analysis (KPCA) to construct a reduced-order stochastic input model for the material property variation in heterogeneous media. KPCA can be considered as a nonlinear version of PCA. Through use of kernel functions, KPCA enables the preservation of high-order statistics of the random field. Thus, this method can model non-Gaussian, non-stationary random fields. In addition, polynomial chaos (PC) expansion is used to represent the random coefficients in KPCA, which provides a parametric stochastic input model. Thus, realizations, which are consistent statistically with the experimental data, can be generated in an efficient way. We showcase the methodology by constructing a low-dimensional stochastic input model to represent channelized permeability in porous media.
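
The KPCA step can be illustrated with a short sketch. This is not the authors' implementation: it assumes a Gaussian (RBF) kernel, a hypothetical bandwidth `gamma`, and made-up two-dimensional samples, and it uses plain power iteration to extract only the leading kernel principal component. The polynomial chaos representation of the coefficients is not shown.

```python
import math

def rbf_gram(samples, gamma):
    """Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2) (assumed kernel)."""
    return [[math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(p, q)))
             for q in samples] for p in samples]

def center_gram(K):
    """Center the mapped data in feature space (double centering of K)."""
    n = len(K)
    row_mean = [sum(row) / n for row in K]
    total_mean = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - row_mean[j] + total_mean
             for j in range(n)] for i in range(n)]

def leading_component(K, iterations=200):
    """Power iteration for the largest eigenpair of the centered Gram matrix."""
    n = len(K)
    v = [float(i + 1) for i in range(n)]  # non-constant start vector
    for _ in range(iterations):
        w = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    Kv = [sum(K[i][j] * v[j] for j in range(n)) for i in range(n)]
    eigenvalue = sum(a * b for a, b in zip(v, Kv))
    return eigenvalue, v

# Toy samples forming two well-separated groups (illustrative data only).
samples = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
           (3.0, 3.0), (3.1, 3.0), (3.0, 3.1)]
K = center_gram(rbf_gram(samples, gamma=0.5))
eigenvalue, v = leading_component(K)
# Projection of each sample onto the leading kernel principal component.
scores = [math.sqrt(eigenvalue) * x for x in v]
```

In this toy case the sign of the leading score separates the two groups, which a linear PCA of strongly non-Gaussian data would not generally guarantee.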

Nicholas Zabaras, Xiang MaCornell [email protected], [email protected]

MS84

ToppGene Computational Analysis: Identifying Potential Genes Necessary for Acute Myeloid Leukemia Viability

Acute myeloid leukemia (AML) is a disease characterized by various cytogenetic abnormalities and poor clinical outcome. The focus of this study was to synthesize the diverse gene expression data stemming from these abnormalities and their subsequent AML subtypes. To accomplish this, we utilized the ToppGene Computational Analysis Software Suite, which provided statistically significant identification of potentially therapeutic targets. As a result, we identified MCL1, KLF44, KLF6, and the miR-29 family as likely targets for therapy. Advisor: Dr. Bruce Aronow, Cincinnati Children’s Hospital

Zach BeaverWofford [email protected]

MS84

Understanding How Errors in Model Parameters Impact the Simulation of Subsurface Temperature Profiles

We consider simulation of heat flow in the shallow subsurface. This work is motivated by trying to match temperature profiles taken at meteorology stations using a simulation-based approach and analytic approaches. Specifically, we determine soil parameters using derivative-free optimization to solve the nonlinear least-squares problems. We also study how errors in the initial and boundary conditions propagate over time. Advisor: K. R. Fowler

Xiaojing FuClarkson [email protected]

MS84

Optimization for Modeling Brain Activity During Threatening Scenarios

The focus of this work is determining the architecture representing how the brain detects threats. A model was developed containing seventeen connections between sensory, response attention, and threat detection nodes for somatosensory and visual tasks. Optimization was used to fit these parameters, but results indicated that the model contained undesirable local minima. To account for this, physical constraints were adapted to the model and an analysis of variance determined the sensitivities of the parameters. Advisors: Dr. Katie Fowler, Dr. Robert Dowman

Brian LeventhalClarkson [email protected]

MS84

Development and Analysis of Computer Simulations of Neuron-Astrocyte Networks to Better Understand the Role of Astrocytes

In the brain, astrocytes had been sidelined to a supporting role in neural networks. The past decade revealed that they influence and propagate neural signals, suggesting an important role for astrocytes in neural networks. By running multiple simulations of neural networks, with and without astrocytes, we observed that they have the ability to synchronize neural activity. Such findings resemble brain recordings of epileptic patients whose neuronal activity is synchronized during seizures correlated with hypertrophied astrocytes. Advisor: Dr. Anne J. Catlla

Tahirali H. Motiwala
Wofford College
[email protected]

MS84

Multimaterial Multiphase Deflagration-to-Detonation Model in PetaScale Computational Framework

High-performance science-based computer simulation tools can predict explosion violence in accidents involving high explosives and propellants. Our model combines steady-state reaction models for thermally-activated combustion and pressure-activated detonation in the Uintah computational framework, which computes coupled fluid/structure interactions in multiphase multimaterial domains. The unified reaction model was validated against aluminum-flyer impact, Steven and exploding cylinder tests for PBX9501. Preliminary results show that leveraging the massively parallel capability of Uintah allows prediction in large domains with high fidelity. Advisors: Chuck A. Wight and Martin Berzins

Joseph PetersonUniversity of [email protected]

MS85



Marino ArroyoUniversitat Politecnica de [email protected]

MS85

An AFEM for Fluid-Structure Interaction

A parametric FEM for free boundary problems is discussed. Some applications include geometric and fluid-membrane interaction in bio-membranes. With slight modification, the method can be successfully applied to shape optimization problems such as the design of an obstacle with minimal drag or a by-pass design. The shape optimization is carried out in the context of inexact sequential quadratic programming. Inexactness is a consequence of using adaptive finite element methods (AFEM) to approximate the state equation, update the boundary, and compute the geometric functional.

Miguel S. PaulettiTexas A&M [email protected]

MS85

Electric-field-induced Ordering and Pattern Formation in Colloidal Suspensions

The long-time dynamics and pattern formation in semi-dilute suspensions of colloidal spheres in a viscous electrolyte under a uniform electric field are investigated using numerical simulations. The rapid chain formation that occurs in the field direction as a result of dielectrophoretic interactions is found to be followed by a slow coarsening process by which chains coalesce into hexagonal sheets and eventually rearrange to form mesoscale cellular structures, in agreement with recent experiments. The morphology and characteristic wavelength of the equilibrium phases that emerge are shown to depend on suspension volume fraction, electrode spacing and field strength, suggesting novel ways of controlling effective suspension properties in practical applications.

Jae Sung Park, David SaintillanDepartment of Mechanical Science and EngineeringUniversity of Illinois at [email protected], [email protected]

MS85

The Microfluidics of Particle/Vesicle Mixtures in the Microvasculature

Many dispersions of colloidal and noncolloidal particles with application in medicine contain elongated particles. For example, such particles are useful at the nanoscale for the delivery of anti-cancer agents to tumor endothelial cells. In a different context, hemostasis in the small vessels relies on naturally occurring blood particles such as platelets being concentrated near the vessel walls. We will review our computer simulations of these processes with a view toward understanding and engineering these therapies.

Eric S. ShaqfehProfessor, Chemical Engineering & MechanicalEngineeringStanford University, Stanford, [email protected]

Hong ZhaoDepartment of Mechanical EngineeringStanford University, Stanford, CA [email protected]

Andrew SpannInstitute of Computational and MathematicalEngineeringStanford University, Stanford CA [email protected]

MS86

On Multilevel Diffusion-based Load Balancing for Parallel Adaptive Numerical Simulations

Load balancing is an important requirement for the efficient execution of parallel numerical simulations. In particular, when the simulation domain changes over time, the mapping of computational tasks to processors needs to be modified accordingly. Here we further explore the very promising diffusion-based multilevel graph partitioning algorithm DibaP, which uses two different strategies (algebraic multigrid and matchings) for constructing a multilevel hierarchy. The presented experiments with graph sequences that imitate adaptive numerical simulations demonstrate the applicability and quality of DibaP for load balancing by repartitioning.
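
As background for the diffusion idea, the sketch below shows the classical first-order diffusion scheme on a processor graph, in which every edge carries a flow proportional to the load difference of its endpoints. It is only an illustration of the underlying principle and is not DibaP itself; the ring topology, the initial loads, and the step size `alpha` are assumptions made for the example.

```python
def diffuse(loads, edges, alpha, steps):
    """First-order diffusion: each step moves alpha * (load difference) across every edge."""
    loads = [float(x) for x in loads]
    for _ in range(steps):
        change = [0.0] * len(loads)
        for i, j in edges:
            flow = alpha * (loads[i] - loads[j])  # positive: i sends load to j
            change[i] -= flow
            change[j] += flow
        loads = [x + c for x, c in zip(loads, change)]
    return loads

# Hypothetical example: four processors in a ring, all work initially on processor 0.
ring = [(0, 1), (1, 2), (2, 3), (3, 0)]
balanced = diffuse([8, 0, 0, 0], ring, alpha=0.25, steps=60)
```

The total load is conserved at every step because each flow is subtracted from one endpoint and added to the other. Choosing `alpha` below the reciprocal of the maximum node degree keeps this iteration stable.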

Henning MeyerhenkeGeorgia Institute of TechnologyCollege of [email protected]

Burkhard MonienUniversity of PaderbornDepartment of Computer [email protected]

MS86

Multilevel and Graph-based Methods in Data-mining

A number of nonlinear dimensionality reduction methods which exploit affinity graphs have been developed in the past for solving various problems in data-mining. This talk will explore a multilevel framework based on graph coarsening whose goal is to reduce the cost of these techniques. Among the applications, we will consider the problem of unsupervised manifold learning and spectral clustering. An application to information retrieval will also be discussed.


Haw-ren FangDepartment of Computer Science & EngineeringUniversity of [email protected]

MS86

Relaxation-based Coarsening and Multiscale Graph Organization with Applications to Linear and Compression-friendly Orderings

In this talk we generalize and improve the multiscale organization of graphs by introducing a measure that quantifies the “closeness” between two nodes. The calculation of the measure is linear in the number of edges in the graph and involves just a small number of relaxation sweeps. A similar notion of distance is then calculated and used at each coarser level. We demonstrate the use of this measure in multiscale methods for several important graph/matrix linear and compression-friendly ordering problems and discuss the multiscale graph organization.

Achi E. BrandtWeizmann Institute of ScienceDept of Applied Mathematicsachi.brandtweizmann.ac.il

Jie ChenDepartment of Computer Science and EngineeringUniversity of [email protected]

Dorit RonThe Weizmann Institute of [email protected]

Ilya SafroArgonne National [email protected]

MS86

Engineering Multilevel Graph Partitioning Algorithms

We describe two different approaches to multi-level graph partitioning (MGP). The first algorithm is based on the extreme idea to contract only a single edge on each level of the hierarchy. The second algorithm is an approach to parallel graph partitioning that scales to hundreds of processors. Quality improvements compared to previous systems are due to better prioritization of edges to be contracted and FM local search algorithms that work more locally than previous approaches.

Peter Sanders
Karlsruhe Institute of Technology
Faculty of [email protected]

Christian Schulz
Karlsruhe Institute of Technology
Faculty of [email protected]

MS87



M AdamsTexas A&M [email protected]

MS87

A Posteriori Error Estimation via Error Transport

Error estimation for time-dependent hyperbolic problems is challenging for theoretical and practical reasons. In these systems, error propagates long distances and produces effects far from the point of generation. In addition, non-linear interaction of error plays an important role. Time marching and non-linear discretizations must also be addressed. In this talk we investigate the use of error equations for a posteriori error estimation. These auxiliary PDEs are treated numerically to yield field estimates of error.



MS87

Quantification of Numerical Uncertainty for Entire Curves or Fields

A new capability is proposed to assess the asymptotic convergence of entire solution fields without having to reduce them to scalars. The technique estimates lower and upper bounds of solution error when the exact solution is unknown. The technique also generalizes the Grid Convergence Index to estimate bounds of numerical uncertainty for entire fields. Applications are presented using test problems analyzed with finite element or hydrodynamics codes.

Francois M. HemezLos Alamos National [email protected]

MS87

The Role of Theory in Calculation Verification

Error estimation is a recommended practice in computational science and engineering. The approach employs a sequence of discretizations, a measure from the solution, and extrapolation to estimate errors. The standard application of this methodology usually does not involve practices beyond the rote application of the methodology, but often produces results that defy explanation. Appropriate theories of numerical approximation can be used to clarify results and suggest improvements to the simulation methodology.
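
The "sequence of discretizations plus extrapolation" procedure is commonly realized as Richardson extrapolation. The sketch below is a minimal illustration under stated assumptions: three grids refined by a constant ratio, a scalar solution measure in the asymptotic range, and manufactured data `f(h) = 1 + 0.5 * h**2` invented for the example. The function names are hypothetical.

```python
import math

def observed_order(f_coarse, f_medium, f_fine, ratio):
    """Observed convergence order from three solutions on grids refined by a constant ratio."""
    return math.log(abs(f_coarse - f_medium) / abs(f_medium - f_fine)) / math.log(ratio)

def richardson_extrapolate(f_medium, f_fine, ratio, order):
    """Estimate of the zero-mesh-size value of the solution measure."""
    return f_fine + (f_fine - f_medium) / (ratio ** order - 1.0)

# Manufactured data (an assumption for illustration): f(h) = 1 + 0.5 * h**2.
f_coarse, f_medium, f_fine = (1.0 + 0.5 * h ** 2 for h in (0.4, 0.2, 0.1))
p = observed_order(f_coarse, f_medium, f_fine, ratio=2.0)
f_extrapolated = richardson_extrapolate(f_medium, f_fine, ratio=2.0, order=p)
error_estimate = abs(f_extrapolated - f_fine)  # estimated error of the fine-grid value
```

When the observed order differs markedly from the formal order of the scheme, the data are probably outside the asymptotic range, and this is the kind of result that calls for theoretical interpretation rather than rote application.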

William J. RiderSandia National [email protected]

MS88

A Two-dimensional hp-adaptive Discontinuous Galerkin Model for the Shallow Water Equations

Unstructured meshes are becoming more and more popular in geophysical flow models. We present a two-dimensional model solving the shallow water equations on unstructured meshes. The mesh is dynamically adapted using adaptive mesh refinement (AMR) to minimize the discretization error. The interpolation order is also adapted during the solution process. Classical test cases on the sphere are used to validate the model, as well as the global simulation of the 2010 tsunami in Chile.

Sebastien BlaiseNational Center for Atmospheric [email protected]

Amik St-CyrNational Center for Atmospheric ResearchInstitute for Mathematics Applied to the [email protected]

David HallNational Center for Atmospheric Research1850 Table Mesa Drive, Boulder CO [email protected]

MS88

A Comparison of Gauss-Legendre and Gauss-Legendre-Lobatto Discontinuous Galerkin Spectral Element Methods

In this talk, different implementations of the collocation discontinuous Galerkin spectral element method, namely the Gauss-Legendre and the Gauss-Lobatto-Legendre variants, are discussed and compared. Analysis of the different wave propagation properties as well as computational aspects are presented. An application to the simulation of a simple turbulent flow is used to illustrate the differences between the two methods.

Gregor GassnerInstitute for Aerodynamics and GasdynamicsUniversitaet [email protected]

David KoprivaDepartment of MathematicsThe Florida State [email protected]

MS88

Two-level Optimized Schwarz Preconditioning for SEM-based MHD

We present a new two-level preconditioner for a “pseudo-Laplacian” operator that emerges from an explicit spectral-element discretization of the incompressible magnetohydrodynamics equations. Extending work in JCP 133:84 (1997), we use new overlap stenciling and corner communication for the fine grid, whose preconditioner utilizes a theoretical result, SISC 29(6):2402, enabling trivial conversion to an optimized overlapping Schwarz method. Results are presented for the (pseudo-)Poisson equation, and for magnetohydrodynamics simulations of the Orszag-Tang vortex.


Duane RosenbergNCARInstitute for Mathematics Applied to [email protected]

James LottesOxford [email protected]

Paul F. FischerArgonne National [email protected]

Annick [email protected]

MS88

GPU Accelerated Discontinuous Galerkin Methods

We will discuss the trend towards many-core architectures in modern computers and in particular how the current non-uniform memory hierarchy should be factored into comparisons of competing formulations for numerically solving partial differential equations. We will focus primarily on the discontinuous Galerkin methods, in particular a customized version that is designed to deliver an accurate treatment of curvilinear domains with relatively low storage overhead. Example simulations from electromagnetics and gas dynamics will be shown with benchmarks indicating the performance obtained on current graphics processing units.

Tim WarburtonRice UniversityDepartment of Computational and [email protected]

MS89

Real-time Classification of Astronomical Events with Python

Over the next several years, massive sky surveys will record more astronomical transient events than all previously recorded by mankind. Discovering those events (and maximizing the scientific value of a subset of them) is rapidly becoming a needle-in-the-haystack problem that cannot be tackled with humans in the loop. Here I describe a Python-based framework for ingesting massive astronomical datastreams, applying machine-learning techniques on time-series-derived data, and producing probabilistic classifications of transients.

Joshua BloomUniversity of California, [email protected]

Dan Starr, Joseph Richards, Nathaniel ButlerUC [email protected], [email protected],[email protected]

Dovi PoznanskiUC BerkeleyLawrence Berkeley National [email protected]

MS89

High-Order, Adaptive Finite Element Methods for Atomic Structure Calculations

We present high-order, adaptive finite element solvers for radial Schroedinger and Dirac equations. These solvers are written in C++ and Fortran, and exposed to Python using Cython and Fwrap. We explain the importance and the technical details of this exposition. Moreover, we highlight how the resulting solvers may advance quantum chemistry by providing a robust, variational alternative to conventional shooting (or Gaussian-based) methods, finding all desired states simultaneously, with accuracy and orthogonality approaching machine precision.

Ondrej Certik, Pavel SolinDepartment of Mathematics and StatisticsUniversity of Nevada, [email protected], [email protected]

MS89

FEMhub Online Numerical Methods Laboratory

The FEMhub Online Numerical Methods Laboratory (http://lab.femhub.org) is an Ext-JS application based on Codenode, whose objective is to facilitate remote scientific computing with open source packages included in FEMhub. Currently, these packages include FiPy, Hermes, Phaml and SfePy, and their number will grow in the future. We will mention how the Online Lab can be used to enhance teaching of numerical methods courses and perform sample finite element computations with codes included in FEMhub. We will also present a new FEMhub Mesh Editor that makes it possible to generate 2D finite element meshes inside a web browser window.


Mateusz PaprockiTechnical University of WroclawWroclaw, [email protected]

Aayush PoudelUniversity of Nevada, [email protected]

MS89

Interactive Parallel Python with ZeroMQ

ZeroMQ is an advanced, lightweight message passing library, written in C/C++. It is based around rich Sockets that provide simple, but powerful, messaging primitives including publish/subscribe, request/reply, point-to-point and broadcast. These building blocks can be used to construct sophisticated and high-performance messaging architectures. IPython is an open-source package that provides high-level parallel computing capabilities in the Python programming language. By moving both the serial and parallel IPython computing models to ZeroMQ, we gain access to higher performance, and move to a fundamentally different programming model with many advantages. We will present some details of ZeroMQ, how it is used in IPython, and the new model for interactive parallel computing.

Min Ragan-KelleyUC Berkeley AS&[email protected]


Brian GrangerCal Poly, San Luis [email protected]

MS90

Stochastic Galerkin and Collocation Methods for PDEs with Random Coefficients

We focus on PDE-based models with random input parameters and consider multivariate polynomial approximations of the solution as a function of the random parameters. We review and compare Galerkin and Collocation-type approximations with particular attention to the choice of the polynomial space depending on the structure of the PDE and the input probability measure. These methods typically require many “deterministic” solves and therefore call for HPC. We also comment on implementation issues and parallelization.
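
To make the role of the "deterministic" solves concrete, here is a minimal collocation sketch for a single random parameter. Everything specific in it is an assumption chosen for illustration: the model problem `-a u'' = 1` on (0, 1) with homogeneous Dirichlet conditions, the lognormal coefficient `a(Y) = exp(0.1 * Y)` with `Y` standard normal, the finite-difference solver, and the three-point Gauss-Hermite rule. It is not taken from the talk.

```python
import math

def solve_poisson(a, n=9):
    """Finite-difference solve of -a u'' = 1 on (0,1), u(0)=u(1)=0 (Thomas algorithm)."""
    h = 1.0 / (n + 1)
    off = -a / h ** 2                 # sub- and super-diagonal entry
    diag = [2.0 * a / h ** 2] * n
    rhs = [1.0] * n
    for i in range(1, n):             # forward elimination
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]
    u = [0.0] * n
    u[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):    # back substitution
        u[i] = (rhs[i] - off * u[i + 1]) / diag[i]
    return u

# Three-point Gauss-Hermite rule for a standard normal variable Y.
nodes = [-math.sqrt(3.0), 0.0, math.sqrt(3.0)]
weights = [1.0 / 6.0, 2.0 / 3.0, 1.0 / 6.0]

# One independent deterministic solve per collocation node;
# index 4 of the 9 interior grid points is x = 0.5.
midpoint_values = [solve_poisson(math.exp(0.1 * y))[4] for y in nodes]
mean_u_mid = sum(w * q for w, q in zip(weights, midpoint_values))
```

The solves at different nodes are independent of each other, which is why collocation parallelizes trivially, whereas a Galerkin formulation couples the unknowns into one larger system.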

Fabio NobileMOX, Dip. di MatematicaPolitecnico di [email protected]

Joakim BackApplied Mathematics and Computational Science,[email protected]

Lorenzo TamelliniMOX, Department of MathematicsPolitecnico di [email protected]

Raul F. TemponeMathematics, Computational Sciences & EngineeringKing Abdullah University of Science and [email protected]

MS90

Practical Considerations in Modeling Random Variables, Vectors and Fields Using Expansion Methods

Expansion-based probability methods are at the core of Sandia’s HPC random field simulation capabilities development, and are key elements in this regard to both the SIERRA and Trilinos software suites. We address many of the relevant probability-related aspects that are implicit in these approaches, including the existence of certain transformations of random variables and the sense of equality under which these transformations are understood. We also discuss various aspects of equality and conditions on existence for, and the convergence of, approximations for these random variables.

John Red-HorseValidation and Uncertainty Quantification ProcessesSandia National [email protected]

Philippe PebaySandia National [email protected]

Paul BoggsSandia National [email protected]

MS91

Application of Control Theory to Spacecraft Attitude Stability

We describe results from a year-long capstone research problem in satellite control posed by Space Systems / Loral of Palo Alto, CA. Our goal is to find criteria that guarantee stability for a given feedback control system, even in the presence of parameter uncertainties, delays in the signals, and disturbances to the system. We have constructed a computer simulation and derived an analytical transfer function for the spacecraft control system. We present a comparison of numerical and theoretical results.

Jacob Bouricius, Max Lee, Andrea Levy, Maggie RogersHarvey Mudd [email protected], [email protected],andrea b [email protected], [email protected]

MS91

Spectral Counting Functions of Atomic Measures

For this talk, we will be working with a family of atomic measures supported on the boundary of a generalized Cantor-like string. We will develop and analyze geometric counting functions and spectral zeta functions for a family of fractal strings. We will also discuss regularity, partition zeta functions, and complex dimensions as they pertain to our example.

Kate EllisCalifornia State University, [email protected]

MS91

Simulating Stochastic Inertial Manifolds by a Backward-Forward Approach

We construct stochastic inertial manifolds numerically for stochastic differential equations with multiplicative noises. After splitting the stochastic differential equations into a backward part and a forward part, we use the theory of solving backward stochastic differential equations to achieve our goal.

Xingye Kan, Jinqiao DuanIllinois Institute of [email protected], [email protected]

Yannis KevrekidisDept. of Chemical EngineeringPrinceton [email protected]

Anthony J. RobertsUniversity of [email protected]

MS91

Aspects on Robust Shape Optimization in CFD

The proper treatment of uncertainties in the context of aerodynamic shape optimization is a very important challenge to ensure a robust performance of the optimized airfoil under real life conditions. This talk will propose a general framework to identify, quantify and include the uncertainties in the overall optimization procedure. Efficient discretization techniques of the probability space, algorithmic approaches based on multiple-setpoint ideas in combination with one-shot methods as well as numerical results will be presented.

Claudia SchillingsUniversity of Trier, [email protected]


MS91

Integral Tau Leap Method for Stochastic Chemical Systems

Tau leaping methods can efficiently simulate models of stiff stochastic chemical systems. However, most existing methods do not naturally preserve some chemical structures, such as integer-valued and nonnegative molecular population states. In this talk, I will present structure-preserving tau methods for simulating stochastic chemical systems. We illustrate the new methods through a number of biochemically motivated examples, and provide comparisons with existing implicit tau methods.
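
The structural properties in question (integer-valued, nonnegative populations) can be illustrated with a simple binomial leap for a reversible isomerization A <-> B. This is a stand-in from the general tau-leaping literature and not the integral tau method of the talk; the rate constants, step size, and initial populations are hypothetical.

```python
import math
import random

def binomial(n, p, rng):
    """Number of successes in n Bernoulli(p) trials (simple O(n) sampler)."""
    return sum(1 for _ in range(n) if rng.random() < p)

def binomial_leap(a, b, c_forward, c_backward, tau, steps, seed=0):
    """Leap A <-> B forward in time, drawing binomial firing counts per step."""
    rng = random.Random(seed)
    p_forward = 1.0 - math.exp(-c_forward * tau)    # chance one A molecule converts in tau
    p_backward = 1.0 - math.exp(-c_backward * tau)  # chance one B molecule converts in tau
    history = [(a, b)]
    for _ in range(steps):
        forward = binomial(a, p_forward, rng)    # never exceeds the current A count
        backward = binomial(b, p_backward, rng)  # never exceeds the current B count
        a, b = a - forward + backward, b - backward + forward
        history.append((a, b))
    return history

# Hypothetical rates and populations, chosen only for illustration.
history = binomial_leap(a=100, b=0, c_forward=1.0, c_backward=0.5, tau=0.2, steps=50)
```

Because each firing count is bounded by the population it consumes, the states stay nonnegative integers and the total number of molecules is conserved. A Poisson-based explicit leap offers no such bound and can drive a population negative when the step is large.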

Yushu YangUniversity of Maryland Baltimore [email protected]

Muruhan RathinamUniversity of Maryland, Baltimore [email protected]

Jinglai Shen
University of Maryland Baltimore County
[email protected]

MS91

A Diffuse Interface Model of Multicomponent Vesicle Adhesion and Fusion

Adhesion and fusion of lipid bilayer vesicle membranes are important biological processes, which play key roles in exocytosis, endocytosis, etc. In this talk, a diffuse interface approach is introduced to model the multicomponent vesicle adhesion and fusion. Various equilibrium configurations are examined for the adhered vesicles, pre-fusion and post-fusion between vesicles. Our model predicts that adhesion can promote phase separation of multicomponent vesicle membrane. The effects of spontaneous curvatures, bending rigidities, and adhesion strength on adhered vesicles and the fusion process are discussed.

Yanxiang ZhaoPenn State [email protected]

Sovan DasIndian Institute of Technology [email protected]

MS92

StarPU: A Runtime System for Scheduling Tasks on Accelerator-Based Multicore Machines

We introduce StarPU, a runtime system designed to exploit accelerator-based multicore machines. StarPU provides a simple task-based API, allowing programmers to focus on algorithmic issues. The task scheduling engine automatically balances the tasks across available processing units and uses prefetching to hide data transfer latency. We evaluate our system with dense linear algebra kernels, using a scheduling policy based on auto-tuned performance models. We detail how our approach provides portability of performance.

Cedric AugonnetUniversity of Bordeaux - INRIA [email protected]

MS92

Scheduling Strategies in StarSs

The StarSs programming model enables writing sequential applications whose inherent concurrency is exploited at runtime. OpenMP-like pragmas are inserted to annotate tasks, indicating the directionality of the subroutine parameters (input, output or inout). A data-dependence graph of the application is dynamically built and scheduled on the different cores of a multicore platform. The talk will present the scheduling strategies that have been implemented in the different StarSs runtimes (i.e., for the Cell, SMP or GPU).
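
How directionality annotations give rise to a dependence graph can be sketched as follows. This is a conceptual illustration only and does not use or reproduce the StarSs API: tasks are plain tuples, the function names are hypothetical, and the "scheduler" merely groups tasks into waves that could run concurrently.

```python
def build_dependencies(tasks):
    """tasks: list of (name, inputs, outputs); an inout parameter appears in both lists."""
    last_writer, readers, deps = {}, {}, {}
    for name, inputs, outputs in tasks:
        needed = set()
        for datum in inputs:                       # read-after-write
            if datum in last_writer:
                needed.add(last_writer[datum])
        for datum in outputs:                      # write-after-write, write-after-read
            if datum in last_writer:
                needed.add(last_writer[datum])
            needed.update(readers.get(datum, ()))
        needed.discard(name)
        deps[name] = needed
        for datum in inputs:
            readers.setdefault(datum, set()).add(name)
        for datum in outputs:
            last_writer[datum] = name
            readers[datum] = set()
    return deps

def schedule_waves(deps):
    """Group tasks into waves; tasks within one wave have no mutual dependencies."""
    done, waves = set(), []
    while len(done) < len(deps):
        ready = sorted(t for t in deps if t not in done and deps[t] <= done)
        waves.append(ready)
        done.update(ready)
    return waves

# A sequential program order with hypothetical task names and data.
tasks = [
    ("init_a", [], ["a"]),
    ("init_b", [], ["b"]),
    ("add", ["a", "b"], ["c"]),
    ("scale_a", ["a"], ["a"]),      # inout parameter
    ("total", ["c", "a"], ["d"]),
]
deps = build_dependencies(tasks)
waves = schedule_waves(deps)
```

Here `scale_a` must wait for `add` because it overwrites data that `add` still reads. A runtime could remove such write-after-read edges by renaming data, which is one of the choices that distinguishes scheduling strategies.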

Rosa M. BadiaBarcelona Supercomputing [email protected]

MS92

CnC for HPC

We discuss our experiences in expressing and tuning computations from high-performance computing (HPC) in Intel Concurrent Collections (CnC), a dataflow programming model. Our case studies include asynchronous-parallel dense linear algebra algorithms, where we can match or exceed Intel MKL for representative computations, as well as more irregular computations like the kernel-independent fast multipole method.

Aparna ChandramowlishwaranGeorgia Institute of [email protected]

Kathleen [email protected]

Richard Vuduc
Georgia Institute of Technology
[email protected]

MS92

Development of a Task Execution Environment Driven by Linear Algebra Applications

We describe the design and development of QUARK (QUeuing And Runtime for Kernels), which is a multicore execution environment for applications composed of tasks with data dependencies between the tasks. The development of QUARK is driven by the needs of the PLASMA linear algebra library, and some of the optimizations specific to linear algebra will be presented. QUARK is intended to be easy to use, efficient and scalable, and we discuss our progress on those goals. Performance results on linear algebra applications will be presented.

Asim YarKhanDepartment of Electrical Engineering and ComputerScienceUniversity of Tennessee, [email protected]

Jakub KurzakUniversity of Tennessee [email protected]


MS93

Importance of Parallel Linear Solvers in Battlespace Environments

Characterizing the battlespace environment is crucial to US Department of Defense mission planning because it enhances safety and warfighting effectiveness. ERDC is in a unique position because of its stewardship of surface and groundwater codes solving large real-world problems. Often parallel linear solvers become the performance bottleneck of these simulations when an implicit time scheme is used. The performance of these applications may be capable of advancing toward exascale by understanding the effect of physical attributes and linear operators.

Jing-Ru C. ChengScientific Computing Research CenterU.S. Army Engineer Research and Development Center(ERDC)[email protected]

MS93

Using GPU-Based Accelerators to Prepare for Exascale Computing in Defense Applications

CPU clock speeds are peaking due to thermodynamic limits and other limitations, requiring the use of parallelism to continue gaining improved performance. In order to gain exascale performance, we must take advantage of extreme parallelism requiring millions/billions of threads. One approach towards reaching exascale involves hybrid architectures using many-core CPUs with GPU-based accelerators. A 2-D finite difference time domain simulation using multiple GPUs demonstrates GPU programming techniques and performance.

Paul EllerU.S. Army Engineer Research and Development [email protected]

MS93

The ParalleX Execution Model for Exascale Computation

Achieving Exaflops sustained performance by the end of this decade will require dramatic improvements in efficiency, scalability, and power. ParalleX is a new model of computation being explored to expose significant parallelism and provide latency hiding. It provides a framework for co-design of all future system layers from programming models, through system software, to parallel system and core architectures. This talk will present the foundational ideas of ParalleX and early results in its use on conventional parallel systems for Adaptive Mesh Refinement as applied to numerical relativity computations.

Thomas Sterling

LSU Center For Computation and Technology (CCT)[email protected]

MS93

Opportunities in the Present and Future of Defense Applications


John WestU.S. Army Engineer Research and Development (ERDC)CenterInformation Technology [email protected]

MS94

An Energy Based Stochastic Mapping between Model Parameters for Complementary Scales

In multiscale modeling and multifidelity model representations, one of the recent research interests is proper characterization of the relationship between fine-scale and coarse-scale model parameters. The present work focuses on how this relationship can be characterized, particularly when a limited amount of highly accurate fine scale/high fidelity information (computational/experimental) is available. The proposed work treats this relationship as a stochastic mapping that results in stochastic coarse model parameters even when the fine scale information is deterministic. The uncertainty induced by the stochastic mapping reflects the ‘effects of loss of information’ in constructing the coarse scale model parameters. The proposed scheme shifts the domain of knowledge from the deterministic (highly accurate) fine scale/high fidelity regime to the stochastic coarse scale/low fidelity regime. The stochastic mapping is developed such that it is constrained by highly accurate fine scale/high fidelity energy observables in a certain sense.

Sonjoy Das, Youssef M. MarzoukMassachusetts Institute of [email protected], [email protected]

MS94

Stochastic Multiscale Modeling from Molecular Reactions to Catalytic Reactor

By integrating first-principles kinetic Monte Carlo (KMC) simulation with a stochastic continuum model, we developed a stochastic hybrid model to study the effects of heat and mass transfer on heterogeneous reaction kinetics. The stochastic hybrid reaction model consists of a surface phase domain where catalytic surface reactions occur and a gas-phase boundary layer domain imposed on the catalyst surface where the temperature and pressure gradients exist. The surface phase is described using site-explicit first-principles KMC simulations. The heat and mass flux fluctuation in the gas-phase boundary layer domain, which is represented by thermal and molecular diffusion, is characterized using the stochastic grid-based continuum model. At each time step, the heat and mass exchanges between the two domains are simulated simultaneously until the steady-state reaction condition is reached. At the steady-state reaction condition, the activity, the surface coverage of each reaction species, as well as the temperature and pressure gradient profiles in the gas-phase boundary domain are statistically constant with very small fluctuations. To illustrate that the stochastic hybrid model more accurately elucidates the experimentally observed reaction kinetics by considering the fluctuation of heat and mass transfer, we investigated the surface kinetics of CO oxidation over RuO2(110) catalysts under various operating reaction conditions. Simulation results indicate that the stochastic hybrid model captures the coupling effect between the heat and mass flux fluctuation in the gas-phase boundary layer and the reaction kinetics more accurately than traditional KMC simulation.

Guang Lin, Donghai MeiPacific Northwest National [email protected], [email protected]

MS94

Bayesian Inference of Atomic Diffusivity in a Binary Ni/Al System based on Molecular Dynamics

Atomic mixing in Ni/Al nanolaminates is characterized using MD computations. The diffusivity is extracted using a Bayesian inference framework, based on contrasting the moments of the cumulative distribution functions of the constituents with corresponding moments of a dimensionless concentration evolving according to a Fickian process. Posterior estimates of the diffusivity are compared with experimental measurements, and conclusions are drawn regarding the potential of the present framework to refine continuum modeling approaches.

Francesco RizziDepartment of Mechanical EngineeringJohns Hopkins [email protected]

Maher SalloumSandia National LaboratoriesScalable & Secure Systems Research [email protected]

Youssef M. MarzoukMassachusetts Institute of [email protected]

Rong-Guang XuDepartment of Physics and AstronomyJohns Hopkins [email protected]

Michael Falk, Timothy P. Weihs, Gregory FritzMaterials Science and EngineeringJohns Hopkins [email protected], [email protected], [email protected]

Omar M. KnioDept. of Mech. Eng.Johns Hopkins [email protected]

MS94

PC-based Uncertainty Propagation in the Gulf of Mexico using HYCOM


Carlisle ThackerNational Oceanic and Atmospheric [email protected]

MS95

Computation of Spherical Delaunay Triangulations and Centroidal Voronoi Tessellations in Parallel

Spherical centroidal Voronoi tessellations (SCVT) are used in many applications in a variety of fields, one being climate modeling. They are a natural choice for spatial discretizations on the Earth. New modeling techniques have recently been developed which allow the simulation of ocean and atmosphere dynamics on arbitrarily unstructured meshes, which could include SCVTs. Since these communities are beginning to focus on exascale computing for large scale climate simulations, a need is brought to light for fast and efficient grid generators. Current high resolution simulations on the Earth call for a spatial resolution of about 0.1°, which corresponds to about 11.1 km. In terms of a SCVT this corresponds to a quasi-uniform SCVT with roughly 2 million generators. Computing this grid in serial is incredibly expensive, and can take on the order of weeks to converge sufficiently for the needs of climate modelers. Utilizing conformal mapping techniques, as well as planar triangulation algorithms, and basic domain decomposition, this presentation outlines a new algorithm that can be used to compute SCVTs in parallel, thus reducing the overall time to convergence. This reduces the actual time needed to create a grid on the Earth, as well as allows for new techniques to be explored when modeling the ocean and atmosphere.
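The conversion from a 0.1 degree resolution to roughly 11.1 km can be checked with a short arc-length calculation. The sketch assumes a spherical Earth with a mean radius of 6371 km, a value that is not given in the abstract.

```python
import math

# Assumed mean Earth radius in kilometres (not stated in the abstract).
EARTH_RADIUS_KM = 6371.0


def arc_length_km(angle_degrees, radius_km=EARTH_RADIUS_KM):
    """Great-circle arc length subtended by an angle on a sphere."""
    return math.radians(angle_degrees) * radius_km


spacing_km = arc_length_km(0.1)
print(f"{spacing_km:.1f} km")  # prints "11.1 km"
```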

Doug JacobsonFlorida State [email protected]

MS95

Modeling Storm Surges Using the Multilayer Shallow Water Equations

Storm surge prediction is becoming more integral to the safety of the increasing populations on the coastlines. Current efforts have focused on using the single layer shallow water equations to balance the dominant physics of surges with computational efficiency. We are examining the advantages of using the multilayer shallow water equations to model storm surges. We will present idealized simulations using this new model and discuss their effectiveness in capturing additional storm surge physics.

Kyle T. MandliUniversity of WashingtonDept. of Applied [email protected]


MS95



Chris NewmanLos Alamos National [email protected]

MS95

A Numerical Study of Boundary Value Problems for the Shallow Water Equations with Topography

Limited Area Models (LAMs) have been used to achieve high resolution over a region of interest in geophysical fluid dynamics. The model equations under consideration are the nonviscous Shallow Water equations with topography in space dimension one. In this work, our goals are twofold: first, to find boundary conditions which are physically suitable, namely that let the waves move freely out of the domain without nonphysical spurious reflections at the boundary; second, to implement these boundary conditions in a numerically effective way. This is achieved by applying a suitable extension of the central-upwind method for the spatial discretization and the Runge-Kutta method of second order for the time discretization. Several numerical simulations in which we tested the proposed boundary conditions and the numerical schemes are presented.
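The abstract does not say which second-order Runge-Kutta variant is used. As a minimal sketch, assuming the common Heun (strong-stability-preserving) form and a generic right-hand side in place of the central-upwind spatial discretization, one time step looks like this; heun_step and integrate are hypothetical names.

```python
def heun_step(rhs, u, t, dt):
    """One second-order Runge-Kutta (Heun) step for du/dt = rhs(t, u).

    u is a list of floats; rhs returns a list of the same length.
    """
    k1 = rhs(t, u)
    stage = [ui + dt * ki for ui, ki in zip(u, k1)]  # forward Euler predictor
    k2 = rhs(t + dt, stage)
    # Average of the old state and a second Euler step from the predictor.
    return [0.5 * ui + 0.5 * (si + dt * ki) for ui, si, ki in zip(u, stage, k2)]


def integrate(rhs, u0, t0, t1, steps):
    """Advance u0 from t0 to t1 with a fixed number of Heun steps."""
    dt = (t1 - t0) / steps
    u, t = list(u0), t0
    for _ in range(steps):
        u = heun_step(rhs, u, t, dt)
        t += dt
    return u
```

In a shallow water solver the right-hand side would be the discretized flux and topography terms; here any function of (t, u) can be passed in.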

Ming-Cheng ShiueIndiana [email protected]

Jacques LaminieUniversite des Antilles et de la [email protected]

Roger M. TemamInst. f. Scientific Comput. and Appl. Math.Indiana [email protected]

Joseph J. TribbiaNat’l Center for Atmospheric

[email protected]

MS95

Some Issues Related to the Boundary Conditions for the Inviscid Equations of the Atmosphere and the Oceans

In this lecture we will present some issues related to the boundary conditions for the primitive equations of the atmosphere and the oceans. We will present some nonlocal boundary conditions for these equations which appear to be well suited, and show the results of three dimensional numerical simulations performed with these boundary conditions (based on joint works with Q. Chen, M.-C. Shiue and J. Tribbia).

Roger M. TemamInst. f. Scientific Comput. and Appl. Math.Indiana [email protected]

MS96

A High-Resolution Fast, Spectral Boundary-Integral Method for Multiple Interacting Blood Cells

We discuss a boundary integral method for simulation of flowing red blood cells. A particle-mesh-Ewald approach (PME) achieves an overall O(N log N) scaling. The cell shapes are represented by spherical harmonics for their perfect resolution and because such global basis functions facilitate control of aliasing errors without explicit filtering or implicit numerical dissipation. Example simulations include a cell flowing through a constriction, blood flow in round tubes which reproduce measurements, white cell transport by many red cells, and flow in a model vessel network.

Jonathan B. [email protected]

MS96

Front Tracking Method on Precipitation with Subsurface Flow

We present the simulation results of solute precipitation with subsurface flow using the FronTier software library. The FronTier interface propagation tool is used to model the growing crystal surface in pore scale simulation. The phase transition code is coupled with the incompressible Navier-Stokes solver for the convection due to subsurface flow. We measure the upstream and downstream growth rates at different flow velocities and the effect of precipitation on Darcy’s law.

Xiaolin Li, Yijing Hu, Saurabh JoglekarDepartment of Applied Math and StatSUNY at Stony [email protected], [email protected],[email protected]

MS96

Stabilizing fluid-fluid Interfaces using Colloidal Particles

Bicontinuous interfacially jammed emulsion gels (‘bijels’) were proposed in 2005 as a hypothetical new class of soft materials in which interpenetrating, continuous domains of two immiscible fluids are maintained in a rigid state, by a jammed layer of colloidal particles at their interface. Such gels offer an important route to materials with unique combinations of properties not available in a single phase material. In 2007, the first bijels were created experimentally. In joint work with Sebastian Aland and Axel Voigt, we develop a continuum model for such systems which combines a Cahn-Hilliard-Navier-Stokes model for the macroscopic two-phase fluid system with a surface Phase-Field-Crystal model for the microscopic colloidal particles along the interface. We demonstrate the feasibility of this approach and present numerical simulations that confirm the ability of the colloids to stabilize interfaces for long times.

John LowengrubDepartment of MathematicsUniversity of California at [email protected]

MS96

Large Scale Simulations of Vesicles Suspended in 3D Viscous Flows

Vesicles are locally-inextensible closed membranes that possess tension and bending energies. Vesicle flows model numerous biophysical phenomena that involve deforming particles interacting with a Stokesian fluid. We will present new schemes for simulating the three-dimensional hydrodynamic interactions of a large number of vesicles. They incorporate a stable time-stepping scheme, high-order spatio-temporal discretizations, spectral preconditioners, and a new reparameterization scheme capable of resolving extreme mesh distortions in dynamic simulations.

Shravan VeerapaneniCourant InstituteNew York [email protected]

Abtin Rahimian, George BirosGeorgia Institute of [email protected], [email protected]

Denis ZorinComputer Science DepartmentCourant Institute, New York [email protected]

MS97

Video Background Subtraction Using Communication-Avoiding QR on a GPU

Reducing communication between the GPU cores and memory can give us substantial speedups, turning a bandwidth-bound problem into a compute-bound problem. Communication-Avoiding QR (CAQR) is a recent algorithm for computing a QR decomposition that is optimal with regard to communication. This talk will describe the implementation of CAQR on the GPU. As an application, we use CAQR to efficiently compute the SVD of a tall-skinny matrix, which allows us to perform robust background subtraction on surveillance videos.
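The core idea behind communication-avoiding QR for a tall-skinny matrix can be sketched on the CPU in a few lines: factor independent row blocks, stack the small R factors, and factor the stack once more. If A = QR and R = U S V^T, then A = (QU) S V^T, so the singular values of A are those of the small matrix R. The code below is a one-level, pure-Python illustration under the assumption that every block has full column rank; r_factor, tsqr_r and gram are hypothetical helper names, and nothing here reflects the authors' GPU implementation.

```python
import math


def r_factor(rows):
    """Upper-triangular R of a thin QR factorization (modified Gram-Schmidt).

    Assumes the matrix has at least as many rows as columns and full column rank.
    """
    m, n = len(rows), len(rows[0])
    cols = [[rows[i][j] for i in range(m)] for j in range(n)]  # column-wise copy
    r = [[0.0] * n for _ in range(n)]
    for j in range(n):
        for k in range(j):
            # Project out the already-orthonormalized column k.
            r[k][j] = sum(a * b for a, b in zip(cols[k], cols[j]))
            cols[j] = [b - r[k][j] * a for a, b in zip(cols[k], cols[j])]
        r[j][j] = math.sqrt(sum(b * b for b in cols[j]))
        cols[j] = [b / r[j][j] for b in cols[j]]
    return r


def tsqr_r(rows, block_rows):
    """R factor of a tall-skinny matrix via one level of block reduction."""
    blocks = [rows[i:i + block_rows] for i in range(0, len(rows), block_rows)]
    stacked = [row for block in blocks for row in r_factor(block)]
    return r_factor(stacked)


def gram(rows):
    """Gram matrix rows^T rows, used only to check the factorization."""
    n = len(rows[0])
    return [[sum(row[i] * row[j] for row in rows) for j in range(n)]
            for i in range(n)]
```

Because R^T R equals A^T A, comparing the two Gram matrices is a cheap way to confirm that the reduced R carries the same singular values as A. Each block factorization touches only its own rows, which is what makes the scheme attractive when data movement dominates.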

Michael AndersonUniversity of California [email protected]

James W. DemmelUniversity of CaliforniaDivision of Computer [email protected]

MS97

A Critical Path Approach to Excluding Algorithmic Variants

Many algorithms can provide competing variants simply by changing the mathematical approach. We introduce a critical path metric for tiled algorithms on multi-core architectures which can help determine the performance of these competing variants. Our metric provides a more intuitive grasp of the performance of a variant early in development, to assess whether or not more effort should be applied to tune the algorithmic variant. One of our case studies is the two-sided tile reduction to block Hessenberg form, which is the main topic of this minisymposium.
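The abstract does not define its metric in detail. As a generic illustration of a critical path, the sketch below computes the longest weighted chain of dependent tasks in a task graph, which bounds the run time from below no matter how many cores are available. The task names, costs, and the function critical_path_length are hypothetical.

```python
def critical_path_length(durations, dependencies):
    """Length of the longest weighted path through a task DAG.

    durations: dict mapping task name to its cost.
    dependencies: dict mapping task name to the tasks it must wait for.
    """
    finish = {}

    def finish_time(task):
        # Earliest finish = latest prerequisite finish + own cost (memoized).
        if task not in finish:
            start = max((finish_time(p) for p in dependencies.get(task, ())),
                        default=0.0)
            finish[task] = start + durations[task]
        return finish[task]

    return max(finish_time(task) for task in durations)


# Hypothetical four-task tile kernel graph: b and c both depend on a, d on both.
costs = {"a": 2.0, "b": 3.0, "c": 1.0, "d": 4.0}
deps = {"b": ["a"], "c": ["a"], "d": ["b", "c"]}
print(critical_path_length(costs, deps))  # prints 9.0
```

Here the total work is 10.0 but the chain a, b, d takes 9.0, so adding cores can save at most one unit of time for this variant.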

Henricus M. BouwmeesterDepartment of MathematicsUniversity of Colorado at [email protected]

Julien LangouUniversity of Colorado at Denver and Health [email protected]

MS97

Solving the Two-Stage Symmetric Eigenvalue Problem on Multicore Architectures

This talk will describe the two-stage symmetric eigenvalue problem on multicore architectures. The first stage consists in reducing the symmetric matrix to band tridiagonal form using level 3 BLAS operations. The second stage further reduces the band form to the required tridiagonal form using a "Left-Looking" bulge chasing technique to reduce memory traffic and improve data locality. A dynamic runtime system is used to concurrently schedule both stages on multicore architectures.


Hatem Ltaief, Piotr LuszczekDepartment of Electrical Engineering and ComputerScienceUniversity of Tennessee, [email protected], [email protected]

MS97

Hybrid Two-Sided Transformations using GPU Accelerators

We present GPU algorithms for three main two-sided dense matrix factorizations: bidiagonalization for SVD, and reductions to upper Hessenberg and tridiagonal forms for general and symmetric eigenvalue problems. These fundamental algorithms are not yet adequately accelerated on homogeneous multicores. Our approach, implemented in the MAGMA library, is based on "hybridization" of the LAPACK algorithms, where the corresponding sequential algorithms are split into tasks and the tasks properly scheduled for execution over the hybrid hardware components.

Stanimire TomovInnovative Computing LaboratoryUniversity of Tennessee; [email protected]


MS98

Direct Numerical Simulation and A Priori Analysis for Large Eddy Simulation of a Two-Phase Multicomponent Mixing Layer

Direct Numerical Simulation (DNS) is performed to study the physics of evaporating multicomponent fuel drops in a transitional, compressible, gaseous mixing layer by using a coupled Eulerian/Lagrangian approach. The Navier-Stokes equations for multiphase flows are extended by four equations to compute the evolution of the statistical vapour composition. The DNS database is used during an a priori study for Large Eddy Simulation to examine two different formulations of the filtered composition equations. The study identifies the modeling requirements of the subgrid scale (SGS) contributions and tests the performance of two different SGS models.

Michael GloorSeminar for Applied MathematicsETH [email protected]

Josette BellanJet Propulsion LaboratoryCalifornia Institute of Technology, Pasadena, CA [email protected]

MS98

Adaptive Finite Element Methods for Compressible Flows

We present adaptive finite element methods for the compressible Euler equations using continuous piecewise linear approximation in space and time, stabilized by the standard SUPG and entropy viscosity. To capture and resolve features such as shocks, rarefaction waves, contact discontinuities and boundary layers, the mesh must be sufficiently refined. Hence, the construction of adaptive algorithms is necessary for efficient simulations. We present a duality based adaptive finite element method for compressible flow in three space dimensions.

Murtazo NazarovPhD [email protected]

MS98

Introductory Talk

The CSE community has grown enormously over the last decade and so have the education programs in this field. Besides individual characteristics of the many and diverse CSE programs, including their strategies, concepts and perspectives, this talk will address new and innovative ideas for MSc and PhD education and training. A particular focus of this contribution is on factors and parameters for sustainability and a high level of quality in CSE education.

Martin RuessTechnische Universitat MunchenComputation in [email protected]

MS98

Uncertainty Quantification Methods in Data Assimilation for Numerical Weather Prediction

Uncertainty quantification plays a crucial role in data assimilation: overestimating background uncertainty leads to overfitting of observations, while underestimating uncertainty restricts the applicability of observations, especially where they are needed most. Using the Weather Research and Forecasting (WRF) model and in collaboration with Argonne National Laboratory, methods to estimate the background error covariance matrix in the context of data assimilation in numerical weather prediction, including the "NMC" method, estimation using a Gaussian based on geostrophic balance correlation lengths, an adjoint technique, and a front-based method, are presented. The broader impact of these improvements will be briefly discussed as well.

Jeffrey L. StewardDepartment of Scientific ComputingFlorida State [email protected]

MS99

Introduction to the ITAPS Field Interface

Scientific computing applications focus on physical tensors: scalar temperature, vector velocity, stress tensors, etc. These are formally tensor fields, assigning a tensor quantity to each point in a domain. We are working to develop a simple, flexible applications programming interface (API) for tensor fields that will support the needs of all common families of numerical discretizations with a modest-sized data model and API. Domain support will leverage our previously developed mesh API.

Carl Ollivier-GoochUniversity of British [email protected]

Mark MillerLawrence Livermore National [email protected]

Fabien DelalondreScientific Computation Research CenterRensselaer Polytechnic [email protected]

MS99

ITAPS Tools in Computational Evaluation of Alternative Methods of Nuclear Fusion

A computational model and software for the simulation of plasma liner driven magnetoinertial fusion have been developed based on ITAPS front tracking libraries. The formation of plasma liners via the merger of several hundred plasma jets, the liner implosion and target compression have been simulated and compared with theoretical predictions and other available numerical studies. Simulations relevant to the future Plasma Liner Experiment at LANL have been obtained as well as simulations at extreme liner energies leading to larger fusion energy gains.

Roman SamulyakBrookhaven National [email protected]

Lingling WuSUNY at Stony [email protected]

Paul ParksGeneral [email protected]

MS99

Tools to Support Unstructured Meshes on Massively Parallel Computers

A set of tools for managing unstructured meshes on massively parallel computers is being developed. These tools interact with a mesh infrastructure for parallel unstructured meshes based on the DOE ITAPS iMeshP interface. Mesh management tools developed to support parallel adaptive meshing operations include neighborhood aware message packing, predictive load balancing, and incremental partition improvement. The iMeshP interface and mesh management tools are available for download from http://www.tstt-scidac.org/.

Mark S. ShephardRensselaer Polytechnic InstituteScientific Computation Research [email protected]

Kenneth Jansen, Min Zhou, Misbah MubarakRensselaer Polytechnic [email protected], [email protected],[email protected]

Seegyoung SeolRensselaer Polytechnic InstituteScientific Computation Research [email protected]

Ting Xie, Aleksandr OvcharenkoRensselaer Polytechnic [email protected], [email protected]

MS99

MeshKit: An Open-Source Toolkit for Mesh Generation

Mesh generation typically requires the use of a collection of algorithms, to address application-specific constraints on the mesh, to apply different meshing approaches in different parts of the domain, or to apply pre- or post-generation algorithms like smoothing. From the algorithm developer point of view, interactions with other meshing algorithms, or support services outside the algorithm being developed, are often necessary. In this talk, we present MeshKit, an open-source library for mesh generation. MeshKit is designed to support both end users and mesh generation algorithm developers. For the former, MeshKit provides both tri/tet and quad/hex algorithms, along with services to coordinate and automate the mesh generation process. For the latter, MeshKit provides important pre-/post-meshing services like support for geometric models, smoothing, and mesh I/O. MeshKit uses the ITAPS mesh and geometry interfaces internally, and therefore can interact with any geometry and mesh databases providing those interfaces.

Timothy J. TautgesArgonne National [email protected]

Jason KraftcheckThe University of [email protected]

James PorterUniversity of [email protected]

MS100

Simulations of ICRF Heating using the Delta-f PIC Method

Heating using the ion cyclotron range of frequencies (ICRF) has been explored in a number of tokamak devices, and, in addition, is being considered for ITER. ICRF power can be transferred from the edge to the core without destroying the favorable wave properties of the plasma. We present recent progress on modeling ICRF heating in a tokamak using the δf PIC method. Both 1D and 2D results will be presented from the VORPAL Computational Framework.

Travis M. Austin, David N. Smithe, Chet Nieter, ScottSides, C. D. ZhouTech-X [email protected], [email protected], [email protected], [email protected], [email protected]

MS100

Towards a Particle-in-Cell Solver based on Discontinuous Galerkin Methods

We discuss the ongoing development of a particle-in-cell (PIC) method for the modeling of kinetic plasma phenomena in which the field solver is based on a general high-order accurate discontinuous Galerkin method. The presentation shall motivate the need for such a development, present a detailed discussion of the current state of the effort and an account of open algorithmic challenges. The performance and accuracy will be highlighted through 1D-3D benchmarks to illustrate the potential for such an approach to serve as a modeling tool for complex high-speed kinetic plasma problems in complex geometries.


Gustaaf JacobsDepartment of Aerospace EngineeringSan Diego State [email protected]

Andreas KloecknerCourant Institute of the Mathematical SciencesNew York University

[email protected]

Tim WarburtonRice UniversityDepartment of Computational and [email protected]

MS100

High-Order Vlasov Simulation with Adaptive Mesh Refinement for Laser-Plasma Interaction

We will report on our development of algorithms for Vlasov-Maxwell discretization with adaptive mesh refinement (AMR) for the simulation of laser-plasma interactions. Our approach is based on an explicit, high-order, nonlinear, finite-volume discretization that is discretely conservative, controls oscillations, and can explicitly enforce positivity. AMR algorithms particular to Vlasov simulation, e.g. those required for inter-dimensional reductions and injections, will be discussed. Physically-motivated results in 1+1D and 2+2D will be presented to demonstrate our progress.



Bruce CohenFusion Energy ProgramLawrence Livermore National [email protected]

Richard BergerAX DIVISIONLawrence Livermore National [email protected]

Stephan BrunnerCentre de Recherches en Physique des PlasmasEcole Polytechnique Federale de [email protected]

MS100

Dynamic Particle Weighting and Velocity Distributions

In this presentation we will discuss our recent work in implementing novel dynamic particle weighting schemes in an electrostatic PIC code. Examples of prior approaches for addressing similar problems include radial particle weighting and the CPK method. Our method is based on selecting pairs of particles to merge based on velocity phase-space matching criteria. A short history of our efforts and our current status, including sheath modeling, will be presented.
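The abstract does not give the merge rule or the matching criteria. One simple possibility, shown purely as an assumption, is to replace two macro-particles of the same species by one whose weight is the sum and whose velocity is the weight-averaged velocity. This conserves total weight and momentum but loses some kinetic energy unless the two velocities coincide, which is one reason a velocity-space closeness test is a natural selection criterion. All names below are hypothetical.

```python
import math


def merge_pair(p1, p2):
    """Merge two macro-particles given as (weight, velocity tuple).

    Conserves total weight and momentum (per unit particle mass).
    """
    w1, v1 = p1
    w2, v2 = p2
    weight = w1 + w2
    velocity = tuple((w1 * a + w2 * b) / weight for a, b in zip(v1, v2))
    return weight, velocity


def kinetic_energy(particle):
    """Kinetic energy per unit particle mass of one macro-particle."""
    weight, velocity = particle
    return 0.5 * weight * sum(c * c for c in velocity)


def close_in_velocity(p1, p2, tolerance):
    """Assumed matching criterion: Euclidean velocity difference below a tolerance."""
    return math.dist(p1[1], p2[1]) < tolerance
```

For example, merging (1.0, (1.0, 0.0)) with (3.0, (0.0, 2.0)) gives weight 4.0 and velocity (0.25, 1.5); the momentum (1.0, 6.0) is preserved while the kinetic energy drops from 6.5 to 4.625, showing why such a distant pair would be a poor candidate.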

Matthew HopkinsSandia National [email protected]

Jeremiah BoernerNanoscale and Reactive ProcessesSandia National Laboratories

[email protected]

Paul Crozier, Thomas HughesSandia National [email protected], [email protected]

Matthew BettencourtSandia National LaboratoriesUS Air Force Research [email protected]

MS101

3D Nonlinear Electromagnetic Inversion using a Compressed Implicit Jacobian Scheme

We present a compressed implicit Jacobian scheme for the regularized Gauss-Newton inversion algorithm for reconstructing three-dimensional conductivity distribution from electromagnetic data. In this scheme, the Jacobian matrix, whose storage usually requires a large amount of memory, is decomposed in terms of electric fields excited by sources located and oriented identically to the physical sources and receivers. As a result, the memory usage for the Jacobian matrix reduces from $O(N_F N_S N_R N_P)$ to $O[N_F (N_S + N_R) N_P]$, in which $N_F$ is the number of frequencies, $N_S$ is the number of sources, $N_R$ is the number of receivers and $N_P$ is the number of conductivity cells to be inverted. Moreover, we apply the adaptive cross approximation (ACA) to compress these fields in order to further reduce the memory requirement and to improve the efficiency of the method. This implicit Jacobian scheme provides a good balance between the memory usage and the computational time and renders the Gauss-Newton algorithm more efficient. We demonstrate the benefits of this scheme using numerical examples including both synthetic and field data for both cross-well and surface electromagnetic applications.
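To get a feel for the two storage estimates, the sketch below counts stored entries for the explicit and implicit forms. The survey sizes are invented for illustration only, and constant factors hidden by the O-notation are ignored.

```python
def explicit_entries(n_f, n_s, n_r, n_p):
    """Entries in an explicitly stored Jacobian: N_F * N_S * N_R * N_P."""
    return n_f * n_s * n_r * n_p


def implicit_entries(n_f, n_s, n_r, n_p):
    """Entries when only source and receiver fields are kept: N_F * (N_S + N_R) * N_P."""
    return n_f * (n_s + n_r) * n_p


# Hypothetical survey: 5 frequencies, 50 sources, 50 receivers, one million cells.
sizes = (5, 50, 50, 1_000_000)
ratio = explicit_entries(*sizes) / implicit_entries(*sizes)
print(ratio)  # prints 25.0
```

The saving factor is N_S N_R / (N_S + N_R), so it grows with the number of sources and receivers and is independent of the number of frequencies and cells.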

Aria AbubakarSchlumberger Doll [email protected]

Maokun Li, Guangdong Pan, Jianguo LiuSchlumberger-Doll [email protected], [email protected], [email protected]

Tarek [email protected]

MS101

Finite-difference Optimal Gridding Approach for 3D EM Modeling of Geophysical Applications in Time- and Frequency Domain

I present an automatic optimal-gridding tool for finite-difference (FD) 3D modeling of various geophysical electromagnetic (EM) applications: resistivity logging, marine/land EM survey, cross-well EM, ground-penetrating radar. The latest development of the FD technique permits a medium-independent automatic grid refinement that allows accurate computation of multi-frequency/multi-time and multi-spacing responses in one run at the approximate cost of a single-frequency/time and single-spacing run. The method allows analyzing 3D models of various geological formations and enables handling exceptionally challenging test cases. Examples of various geophysical applications are presented. The method enabled development of a new separation technique for new-generation triaxial induction/propagation logging data, in order to separate different azimuthal effects to which the older-generation logging tools have weak sensitivity (the tool eccentricity effect, the dipping anisotropy and dipping boundary effects). The technique is based on the symmetrization and rotation of the tensor measurements. This allows extraction of some formation properties directly from the logging data, which makes reduced-dimension inversion schemes applicable. The method also enabled development of a new Focused Source EM (FSEM) survey that allows separation of effects of deep resistive bodies from unwanted shallow effects, based on using a proper combination of measurements. FSEM allows simple visual interpretation of deep reservoir responses, due to higher spatial resolution and higher sensitivity to deep resistors than the standard EM surveys can offer.

Sofia DavydychevaRice University and 3DEM [email protected]

MS101

Efficient TEM Forward Modelling using Krylov Subspace Approximations

In this talk we describe the solution of the time-dependent quasi-static Maxwell equations to simulate the transient electromagnetic (TEM) method for geophysical exploration. Our approach is based on Nédélec finite element discretization, an exact boundary condition at the air-earth interface and rational Krylov subspace methods for the time-stepping. We present numerical experiments to demonstrate the effectiveness of our method.

Oliver G. ErnstTU Bergakademie FreibergFakultaet Mathematik und [email protected]

MS101

Large-Scale Joint Imaging of Geophysical Attributes

Large-scale three-dimensional (3D) modeling and imaging is receiving considerable attention in the interpretation of controlled source electromagnetic (CSEM) and magnetotelluric (MT) data in offshore hydrocarbon exploration and geothermal exploration in complex volcanic terrains. 3D modeling and imaging is necessary as the search for energy resources now increasingly occurs in highly complex situations where hydrocarbon effects and geothermal fluids are subtle aspects of their particular geological environment. Further complicating matters is the realization that electrical anisotropy also needs to be incorporated directly into the imaging process. Failure to properly treat anisotropy can produce misleading and sometimes uninterpretable results when broadside/wide azimuth CSEM data is included. Merely excluding broadside data detecting antennas is frequently an issue when 3D coverage is desired. In this workshop we discuss 3D modeling and imaging approaches that treat both CSEM and MT data jointly as well as transverse anisotropy, which appears to be relevant for many practical exploration scenarios. We discuss effective strategies for large scale modeling and imaging using multiple levels of parallelization on distributed machines, and new developments in porting our 3D modeling and imaging algorithms to GPU computational platforms.

Gregory [email protected]

MS102

Implicit Particle Filter

Particle filters are presented in the setting of nonlinear stochastic differential equations (SDE). The task is to estimate the solution of the SDE conditioned by noisy observations. Traditional particle filters approximate the conditional probability by sequential Monte Carlo; at each step in time, one first guesses a “prior” incorporating the information in the SDE, which is then corrected by sampling weights determined by the observations, yielding a “posterior” density. The catch is that, in common weighting schemes, most of the weights become very small very fast, leading to a catastrophic growth in the number of required particles, especially if the dimension of the state space is large. The implicit filter overcomes this problem by reversing the standard procedure. Rather than find samples and then determine their probabilities, the implicit filter picks probabilities and then generates samples that assume them. The posterior density for each new particle position $x^{n+1}$ is written as $\exp(-F(x^{n+1}))$ (this defines a function $F$ for each particle). We then represent $x^{n+1}$ as a function of a fixed Gaussian variable $\xi$ by solving the equation $F(x^{n+1}) - \phi = \xi^T \xi / 2$, where $\phi = \min F$ and $T$ denotes a transpose. The solution of this equation maps high probability samples of $\xi$ onto high probability samples $x^{n+1}$; $x^{n+1}$ is a sample of the posterior density with weight $\exp(-\tfrac{1}{2}\phi) J$, where $J$ is the Jacobian of the map $\xi \to x^{n+1}$. There is a great deal of freedom in choosing a map $\xi \to x^{n+1}$ that satisfies the equation, and we use this freedom to produce maps that are easy to perform with a Jacobian that is easy to evaluate. Nothing is assumed in advance about the pdf one is sampling. We demonstrate the power of implicit filters with several challenging examples, including a Kuramoto-Sivashinsky equation driven by white noise and a Lorenz system with very sparse data.
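A one-dimensional toy version of the map from the Gaussian variable to a sample can make the construction concrete. The sketch assumes a made-up function F(x) = x^2/2 + x^4/4, whose minimum is phi = 0 at x = 0, solves F(x) - phi = xi^2/2 by bisection on the branch with the same sign as xi, and obtains the Jacobian from implicit differentiation, F'(x) dx/dxi = xi. This is only one of the many admissible maps the abstract alludes to, and the names are hypothetical.

```python
def F(x):
    """Assumed toy negative log-density (non-Gaussian, minimum at x = 0)."""
    return 0.5 * x * x + 0.25 * x ** 4


def dF(x):
    """Derivative of F."""
    return x + x ** 3


PHI = 0.0  # minimum value of F for this toy choice


def sample_from_xi(xi):
    """Map a Gaussian reference value xi to a sample x and the Jacobian dx/dxi."""
    if xi == 0.0:
        return 0.0, 1.0  # limiting value for this F, whose curvature at 0 is 1
    target = 0.5 * xi * xi
    lo, hi = 0.0, 1.0
    while F(hi) - PHI < target:  # bracket the root on the positive branch
        hi *= 2.0
    for _ in range(200):  # fixed-count bisection, F is increasing for x >= 0
        mid = 0.5 * (lo + hi)
        if F(mid) - PHI < target:
            lo = mid
        else:
            hi = mid
    x = 0.5 * (lo + hi)
    if xi < 0.0:
        x = -x  # F is even, so the negative branch mirrors the positive one
    jacobian = xi / dF(x)
    return x, jacobian
```

Drawing xi from a standard normal distribution and passing it through sample_from_xi yields samples concentrated where F is small, together with the Jacobian factor needed for the sample weight.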

Alexander J. ChorinUniversity of California, BerkeleyMathematics [email protected]

Matthias MorzfeLawrence Berkeley National [email protected]

Xuemin TuUniversity of [email protected]

MS102

A Measure Theoretic Approach to Inverse Sensitivity Analysis

We discuss a numerical method for inverse sensitivity analysis of a deterministic map from a set of parameters to a randomly perturbed quantity of interest. The solution method has two stages: (1) approximate the unique set-valued solution to the inverse problem using derivative information and (2) apply measure theory to compute the approximate probability measure on the parameter space that solves the inverse problem. We discuss convergence and numerical analysis of the method.

Jay BreidtDepartment of StatisticsColorado State [email protected]

Troy ButlerInstitute for Computational Engineering and SciencesUniversity of Texas at [email protected]


Jeff SandelinDepartment of MathematicsColorado State [email protected]

MS102

Stochastic Collocation based on Interpolation with Arbitrary Nodes

We present a generalized algorithm for the ‘least interpolant’ method of Carl de Boor and Amos Ron for polynomial interpolation on arbitrary data nodes in multiple dimensions. Our variation on the least interpolant produces an interpolant that can be tailored for various probability distributions. We empirically analyze conditioning of the associated Vandermonde-like matrices and present a few examples illustrating utility of the method for generalized Polynomial Chaos collocation methods and for generating response surfaces.

Dongbin Xiu, Akil NarayanPurdue [email protected], [email protected]

MS102

Predictability in Stochastic Reaction Networks

Predictability analysis in stochastic reaction networks is typically challenged by intrinsic noise. We utilize non-intrusive spectral expansions to efficiently propagate input parametric uncertainties in the presence of intrinsic stochasticity. To address the curse of dimensionality, orthogonal spectral projections are performed using a sparse quadrature approach that is shown to perform better than High Dimensional Model Representation (HDMR) for the benchmark problem. The methodology is illustrated for the gene regulation network of the Bacillus subtilis bacterium.



Khachik SargsyanSandia National [email protected]

MS103

Distributed Control of Cascading Blackouts

We describe continuing experience with online, adaptive control algorithms for minimizing the impact of a cascading blackout. Assuming that a contingency has taken place, the algorithms compute a locally-optimum control that will be applied as the cascade unfolds. The control will take as inputs observable quantities, such as line overloads, and output control actions such as load shedding. The objective is to stop the cascade in a stable state with minimal loss of demand. We present computational experience using large real-life grids.

Daniel BienstockColumbia University IEOR and APAM DepartmentsIEOR [email protected]

MS103

Models and Algorithms for N-k Survivable Grid Design

Not available at time of publication.

Richard Li-Yang ChenSandia National LaboratoriesLivermore, [email protected]

Amy CohnUniversity of Michigan, Ann [email protected]

Ali PinarSandia National [email protected]

MS103

Flexibility in Electricity Markets

Greater flexibility is especially important in the context of a smarter electric grid. In this talk we examine the flexibility of electric assets in electric network optimization models, specifically the unit commitment and dispatch problems. We review how flexibility is represented now, how it might be represented better, and issues that need to be addressed.

Kory HedmanArizona State [email protected]

MS103

Protecting Electric Power Grids from Terrorist Attack: Solving Full-scale, Three-stage Stackelberg Games


Kevin WoodNaval Postgraduate SchoolMonterey, [email protected]

MS104

Preconditioned Krylov Subspace Methods on GPU Applied to Multiphase CFD

Using the C for CUDA environment, we have successfully implemented and evaluated various kinds of preconditioning methods for Krylov subspace solvers on GPU, including SOR variants, multigrid, line-by-line, and incomplete Cholesky decomposition. These preconditioned Krylov subspace methods have then been applied to multiphase CFD simulation and real-time visualization on GPU.

Hidetoshi Ando, Takuma Shibuya, Koji ToriyamaUniversity of [email protected], [email protected],[email protected]
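
To make the structure of a preconditioned Krylov solver concrete, the following minimal sketch implements preconditioned conjugate gradients on the CPU with NumPy. A Jacobi (diagonal) preconditioner is swapped in for simplicity; it is not one of the preconditioners listed in the abstract, and the test matrix and function name are assumptions for illustration.

```python
import numpy as np

def preconditioned_cg(A, b, apply_preconditioner, tol=1e-10, max_iter=500):
    """Solve A x = b for symmetric positive definite A; returns (x, iterations)."""
    x = np.zeros_like(b)
    r = b - A @ x
    z = apply_preconditioner(r)      # z = M^{-1} r
    p = z.copy()
    rz = r @ z
    for iteration in range(1, max_iter + 1):
        Ap = A @ p
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            return x, iteration
        z = apply_preconditioner(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p    # new search direction
        rz = rz_new
    return x, max_iter

# Hypothetical SPD test problem: 1D Laplacian plus a varying diagonal shift.
n = 50
A = 2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1) + np.diag(np.linspace(0.0, 10.0, n))
b = np.ones(n)
inv_diag = 1.0 / np.diag(A)
x, iterations = preconditioned_cg(A, b, lambda r: inv_diag * r)
print(iterations, np.linalg.norm(A @ x - b))
```

The preconditioner enters only through `apply_preconditioner`, so any of the methods named in the abstract could be substituted behind that same interface.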

MS104

Building the Next Generation of Scalable Many-core Applications and Libraries

Multicore nodes have become the standard building block of scalable computer systems, and core counts per node promise to increase dramatically over the next decade. Although single-level MPI-only programming approaches can work for some applications, we will need different approaches going forward. In this talk we make observations about existing successful approaches to parallel application development and highlight some key requirements for success in developing the next generation of scalable applications.

Michael A. HerouxSandia National [email protected]

MS104

Efficient Use of GPUs and a Large Number of CPUs in a Dense Linear Algebra Library

Efforts to implement dense linear algebra on multicore processors and accelerators have been pursued in two different directions, one that emphasizes the efficient use of multicore processors, represented by the PLASMA project, and another that emphasizes the use of accelerators, represented by the MAGMA project. While the former makes great use of multicores, it lacks support for accelerators. While the latter makes great use of GPUs, it seriously underutilizes CPU resources. This presentation introduces an approach for efficiently combining GPU accelerators with a large number of classic multicores for dense linear algebra computations.

Jakub KurzakUniversity of Tennessee [email protected]

MS104

Shared Memory and GPU Algorithms for Finite-element Linear-system Assembly

Computing element-stiffness matrices in a finite-element analysis application and assembling them into a global sparse linear system is a computationally intensive problem which could benefit from parallelism. In a multicore CPU setting it seems natural to have separate threads compute each element-matrix, but the challenge is that threads could then collide when making contributions to the same global matrix row. Algorithms suitable for multicore CPUs and GPUs will be discussed, along with performance results.

Alan B. WilliamsSandia National LaboratoryDistributed Systems Research [email protected]
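
The collision problem described in this abstract can be reproduced serially. In the sketch below, a 1D mesh of linear elements is assembled in one vectorized scatter: neighboring elements contribute to the same global entries, so a plain indexed `+=` silently drops contributions, while `np.add.at` accumulates them (the serial analogue of an atomic add). The mesh, the dense storage, and the function name are assumptions for illustration, not the algorithms of the talk.

```python
import numpy as np

def assemble_1d_stiffness(n_elements, length=1.0, accumulate=True):
    """Assemble the global stiffness matrix for linear elements on a uniform 1D mesh."""
    h = length / n_elements
    k_elem = (1.0 / h) * np.array([[1.0, -1.0], [-1.0, 1.0]])  # element matrix
    # Connectivity: element e joins global nodes e and e + 1.
    conn = np.column_stack([np.arange(n_elements), np.arange(1, n_elements + 1)])
    rows = np.repeat(conn, 2, axis=1).ravel()    # i, i, j, j per element
    cols = np.tile(conn, (1, 2)).ravel()         # i, j, i, j per element
    vals = np.tile(k_elem.ravel(), n_elements)   # k00, k01, k10, k11 per element
    K = np.zeros((n_elements + 1, n_elements + 1))
    if accumulate:
        np.add.at(K, (rows, cols), vals)         # duplicates are summed
    else:
        K[rows, cols] += vals                    # duplicates overwrite each other
    return K

K = assemble_1d_stiffness(4)
K_lossy = assemble_1d_stiffness(4, accumulate=False)
print(K[1, 1], K_lossy[1, 1])
```

On a GPU or with threads, the same hazard is typically handled with atomic updates, element coloring, or a gather-by-row formulation; which of these the talk covers is not stated in the abstract.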

MS105

An h-p adaptive Discontinuous Galerkin Method for the Navier-Stokes Equations

We discuss the implementation of an h-p adaptive approach for high-order accurate discontinuous Galerkin discretizations of the Navier-Stokes equations. The discretization operates on mixed element triangular and quadrilateral meshes in two dimensions and h-adaptivity is incorporated using non-conforming elements. Both the primal (Navier-Stokes) and dual (adjoint) equations are converged using an efficient h-p multigrid solver. The adjoint solution is used to estimate discretization error in selected simulation output functionals which in turn drives the adaptation process. A smoothness indicator is also implemented in order to choose between h and p refinement options. Predicted and observed error reduction at each refinement level are compared and used to gauge convergence of the error estimate. Dual consistency of the discretization and its effect on error estimates and adaptive convergence are also discussed.


MS105

High Fidelity Simulations of Flapping Wings Designed for Energetically Optimal Flight

We present our work on a multi-fidelity framework for inverse design of flapping wings. A panel method-wake only energetics solver is used to define the energetically optimal wing shape and flapping kinematics. Candidate designs are then simulated using our high-order accurate discontinuous Galerkin solver, based on fully unstructured curved meshes of tetrahedra, a mapping-based ALE solver, and efficient parallel Newton-Krylov solvers, to gain insight into practical wing designs and the influence of viscous effects.

Per-Olof PerssonUniversity of California BerkeleyDept. of [email protected]

MS105

On the Implementation of Rosenbrock-W Methods


Adrian SanduVirginia Polytechnic Institute andState [email protected]

MS105

New Practices on High Order Numerical Methods

High order numerical methods have been applied in practical applications and their advantages have been shown in terms of accuracy, flexibility and efficiency. The multigrid technique has been used to improve the performance of the high order method. Different high order bases have been used in the simulations of different physical phenomena. Their performances have been analyzed and compared. Practices and challenges in applying high order methods in large scale computing will be shown and discussed.

Jin XuMath. & Computer / Physics Division, Argonne NationalLab.jin [email protected]

MS106

A Fast Optimization Method for the Chan-Vese Model in Image Segmentation

The Chan-Vese model is used to segment images that are approximately piecewise continuous (a large class of images of practical interest, e.g. in medical imaging). The goal of this work is to devise a fast and robust optimization method to compute the boundaries minimizing the Chan-Vese model. To achieve this, we compute the second shape derivative of the Chan-Vese model and incorporate this in the gradient descent process in order to reduce the number of iterations needed for convergence. We discretize the gradient descent using the finite element method. We also employ adaptive discretizations for both the curves and the domains, resulting in further reductions in computations.

Gunay DoganTheiss ResearchNational Institute of Standards and [email protected]
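
For orientation, the energy that such an optimization method minimizes can be evaluated directly on a pixel grid. The sketch below assumes the standard two-phase, piecewise-constant form of the Chan-Vese energy (a length penalty plus two region-fitting terms) and approximates the boundary length by counting label changes between neighboring pixels. It does not implement the shape-derivative or finite element machinery of the talk; the parameter values and the function name are illustrative assumptions.

```python
import numpy as np

def chan_vese_energy(image, inside, mu=0.1, lam1=1.0, lam2=1.0):
    """Evaluate a discrete two-phase Chan-Vese energy for a boolean region mask."""
    inside = inside.astype(bool)
    outside = ~inside
    # Optimal region constants are the mean intensities inside and outside.
    c1 = image[inside].mean() if inside.any() else 0.0
    c2 = image[outside].mean() if outside.any() else 0.0
    fitting = (lam1 * np.sum((image[inside] - c1) ** 2)
               + lam2 * np.sum((image[outside] - c2) ** 2))
    # Crude boundary length: count neighboring pixel pairs with different labels.
    length = (np.sum(inside[:, 1:] != inside[:, :-1])
              + np.sum(inside[1:, :] != inside[:-1, :]))
    return fitting + mu * length, c1, c2

# Hypothetical test image: a bright 4x4 square on a dark 8x8 background.
image = np.zeros((8, 8))
image[2:6, 2:6] = 1.0
exact_mask = image > 0.5
shifted_mask = np.zeros((8, 8), dtype=bool)
shifted_mask[3:7, 3:7] = True

energy_exact, c1, c2 = chan_vese_energy(image, exact_mask)
energy_shifted, _, _ = chan_vese_energy(image, shifted_mask)
print(energy_exact, energy_shifted)
```

The mask aligned with the object has zero fitting error, so only the length term remains; a misplaced mask pays a fitting penalty on both sides of the boundary.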

MS106

Multilevel Methods for Image Deblurring

We present multilevel methods for discrete ill-posed problems arising from the discretization of Fredholm integral equations of the first kind. In particular, we present wavelet-based multilevel methods for signal and image restoration problems as well as for blind deconvolution problems. We show results that indicate the promise of these approaches.

Malena I. EspanolGraduate Aeronautical LaboratoriesCalifornia Institute of [email protected]

Misha E. KilmerTufts [email protected]

MS106

Adaptive Methods in Total Variation based Image Restoration


Michael HintermuellerHumboldt-University of [email protected]

MS106

Multiphase Scale Segmentation and a Regularized K-means

A typical Mumford-Shah-based image segmentation is driven by the intensity of objects in a given image, and we consider image segmentation using additional scale information. Using the scale of objects, one can classify objects in a given image further than is possible using only the intensity value. We develop a fast automatic method for the data clustering. The model automatically gives a reasonable number of clusters by a choice of a parameter. We explore various properties of this classification model and present different numerical results.

Sung Ha KangGeorgia Inst. of [email protected]

Berta SandbergAdel Research, [email protected]

Andy YipDepartment of MathematicsNational University of [email protected]

MS107

High Resolution Simulation of the Intracranial Arterial Network

Simulating the human arterial tree is a grand challenge requiring state-of-the-art mathematical algorithms and computers. In this talk, we will discuss modeling of arterial flow in a patient-specific intracranial arterial tree and present a methodology we have developed. Our focus is on ultrascale parallel algorithms based on a two-level domain decomposition (2DD) method for the solution of Navier-Stokes equations with billions of unknowns on thousands of computer processors.

Leopold GrinbergBrown [email protected]

George E. KarniadakisBrown UniversityDivision of Applied [email protected]

MS107

Vascular Modelling Toolkit: An Open-source Framework for Image-based Modeling and Analysis of Blood Vessels

Thanks to the development of imaging modalities, image-based computational fluid dynamics (CFD) has shown its potential in elucidating the role of hemodynamics in cardiovascular diseases. However, the process from images to CFD is still a time-consuming and operator-dependent task, while the variability of real anatomies challenges CFD application to large patient populations. VMTK represents a concrete step in addressing these concerns: robust and objective computational techniques will be presented for geometric and hemodynamic modeling.

Marina PiccinelliDepartment of Math&CSEmory University

[email protected]

Luca AntigaMario Negri Institute, Bergamo,ItalyOrobix, Bergamo, [email protected]

MS107

Some Recent Challenges in Patient-Specific Cardiovascular Mathematics: from Forward to Inverse Problems

Combining data from medical images with numerical solutions is a fundamental step for patient-specific simulations in computational hemodynamics. We will consider some examples of data assimilation for merging velocity measurements and numerical simulations and for estimating physical parameters such as vessel compliance. Emphasis will be placed on controllability problems for the incompressible Navier-Stokes equations and the impact of measurement noise on the reliability of the numerical results.


Marta D’EliaMathCS, Emory [email protected]

Mauro PeregoMathCS, Emory University, Atlanta, GAPolitecnico di Milano, [email protected]

Christian VergaraUniversita’ degli Studi di BergamoBergamo, [email protected]

MS107

Surgical Modeling of the Fontan Procedure: Growing the Knowledge Base One Patient at a Time

Long-term morbidities of single ventricle patients can often be related to the blood dynamics through the surgical construct (the total cavopulmonary connection - TCPC). To understand/improve the outcomes of these patients, we have generated patient-specific anatomical (experimental and computational) models and developed software to mimic different surgical alternatives for the TCPC design, to optimize energetic efficiency/flow distribution. We review our experience, and demonstrate the value of a patient-specific surgical approach.

Ajit P. YoganathanDepartment of Biomedical EngineeringGeorgia Institute of [email protected]

Diane De ZelicourtDepartment of Biomedical EngineeringGeorgia Institute of Technology, Atlanta, Georgia;[email protected]

Christopher Haggerty, Maria Restrepo, KartikSundareswaranDepartment of Biomedical Engineering

Georgia Institute of Technology, Atlanta, [email protected], [email protected],[email protected]

Jarek RossignacGeorgia Institute of [email protected]

Kirk KanterDivision of Cardiothoracic SurgeryEmory University School of Medicine, Atlanta, [email protected]

William Gaynor, Thomas SprayDivision of Cardiothoracic SurgeryChildrens Hospital of Philadelphia, [email protected], [email protected]

MS108

Direct Numerical Simulation of Particulate Flows on 294912 Processor Cores

High performance on current supercomputers is mainly obtained by highly optimized kernels, which handle one specific problem with given restrictions. waLBerla, in contrast, is a large C++ software framework that aims at high performance. It is our extensible implementation of a parallel coupled fluid-structure solver. In this talk, the suitability of waLBerla for current and upcoming supercomputers is discussed and performance values on up to 294 912 processor cores of the Blue Gene/P are given.

Jan GoetzSystem Simulation , Friedrich-Alexander-UniversityErlangen-Nuernberg, [email protected]

Klaus IglbergerComputer Science 10 - System SimulationFriedrich-Alexander-University Erlangen-Nuremberg,[email protected]

Ulrich J. RuedeUniversity of Erlangen-NurembergDepartment of Computer Science (Simulation)[email protected]

MS108

Massively Parallel Solution of 3D Ice Sheet Flow Models

The flow of polar ice sheets is characterized by a wide range of length scales, with localized flow features many orders of magnitude smaller than continental scales. To capture this wide range of scales in 3D ice sheet models, we employ adaptive mesh refinement and higher-order methods. We investigate the application of multigrid preconditioners to the resulting systems, as well as issues arising from capturing the free surface at the ice/air boundary.

Tobin IsaacUniversity of Texas at [email protected]

Carsten BursteddeThe University of Texas at [email protected]

Georg Stadler, Omar GhattasUniversity of Texas at [email protected], [email protected]

MS108

Massively Parallel Stabilized Finite Element Methods for Fluid Dynamics



Kenneth JansenRensselaer Polytechnic [email protected]

MS108

Multiscale Simulation of Blood Flow in the Coronary Arteries

We present a computational model for blood flow in coronary arteries. The simulation uses the Lattice Boltzmann method coupled with microscopic Molecular Dynamics modeling of the red blood cells, which interact with one another and the surrounding fluid, to provide a multiphysics and multiscale representation of the flow in patient-specific geometries. We will present the modeling methods as well as the techniques leveraged to achieve excellent scaling on up to 294,912 processor cores.

Amanda PetersHarvard School of Engineering and Applied [email protected]

Simon [email protected]

Jonas LattEcole Polytechnique Federale de [email protected]

Sauro Succi, Massimo BernaschiIstituto per le Applicazioni del [email protected], [email protected]

Efthimios KaxirasDept. of Physics and School of Engg. and AppliedSciencesHarvard [email protected]

Mauro BissonDipartimento di InformaticaUniversita’ La [email protected]

MS109

Performance Optimizations for Heterogeneous and Hybrid 3D Lattice Boltzmann Simulations on Highly Parallel On-Chip Architectures

Optimized CFD solvers for GPU computing can be up to an order of magnitude faster than on current standard x86-type servers. We use a lattice Boltzmann based flow solver kernel, which is proven to perform well both on CPUs and GPUs. The focus is to go beyond a single compute node and show the potential of multinode GPU clusters for our solver as well as the capability of utilizing heterogeneous hardware setups together with hybrid OpenMP+MPI programming techniques.

Johannes HabichRegional Computing Center [email protected]

MS109

GPU Implementation of a Numerical Wave Tank for the Simulation of Non-linear and Turbulent Free Surface Flow

We present an efficient GPU implementation for the numerical simulation of three-dimensional free surface flow on the basis of the Lattice Boltzmann method (LBM) and the NVIDIA CUDA framework. Several validations and applications in the field of civil and environmental engineering will be presented. The runtimes of the numerical simulations are only up to one order of magnitude higher than the time scale of the real world event, for single precision simulations on one GPU.

Christian JanssenInstitute for Computational Modeling in Civil Eng.Technical University of Braunschweig, [email protected]

Martin SchonherrInstitute for Computational Modeling in CivilEngineeringTechnische Universitat [email protected]

Manfred KrafczykTech. Univ. of [email protected]
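
The core update of a lattice Boltzmann solver is compact enough to sketch. The example below assumes the standard D2Q9 lattice with a single-relaxation-time (BGK) collision and periodic streaming, written in NumPy for the CPU. It is a two-dimensional illustration only: the three-dimensional lattice, free-surface treatment, turbulence modeling, and CUDA kernels of the talk are not represented, and all names and parameter values are assumptions.

```python
import numpy as np

# D2Q9 lattice: discrete velocities and their weights.
C = np.array([[0, 0], [1, 0], [0, 1], [-1, 0], [0, -1],
              [1, 1], [-1, 1], [-1, -1], [1, -1]])
W = np.array([4 / 9] + [1 / 9] * 4 + [1 / 36] * 4)

def equilibrium(rho, u):
    """Second-order equilibrium; rho has shape (nx, ny), u has shape (nx, ny, 2)."""
    cu = np.einsum("qd,xyd->xyq", C, u)            # c_q . u at every node
    usq = np.sum(u ** 2, axis=-1)[..., None]
    return W * rho[..., None] * (1.0 + 3.0 * cu + 4.5 * cu ** 2 - 1.5 * usq)

def collide_and_stream(f, tau):
    """One BGK collision followed by periodic streaming; f has shape (nx, ny, 9)."""
    rho = f.sum(axis=-1)
    u = np.einsum("xyq,qd->xyd", f, C) / rho[..., None]
    f_post = f - (f - equilibrium(rho, u)) / tau   # relax toward equilibrium
    for q, (cx, cy) in enumerate(C):
        f_post[..., q] = np.roll(f_post[..., q], shift=(int(cx), int(cy)), axis=(0, 1))
    return f_post

# Hypothetical initial state: nearly uniform density with a small random velocity.
rng = np.random.default_rng(0)
rho0 = 1.0 + 0.01 * rng.random((8, 8))
u0 = 0.01 * rng.standard_normal((8, 8, 2))
f0 = equilibrium(rho0, u0)
f1 = collide_and_stream(f0, tau=0.8)
print(f0.sum(), f1.sum())
```

Each node is updated independently in the collision step and streaming touches only nearest neighbors, so the method maps naturally onto GPUs and is largely limited by memory bandwidth.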

MS109

Comparing Performance and Energy Efficiency of Lightweight Manycore to GPU for LBM methods

The rising power consumption of conventional cluster technology has prompted investigation of architectural alternatives that offer higher computational efficiency. This presentation compares the performance and energy efficiency of highly tuned code on three architectural alternatives: the Intel Nehalem X5530 multicore processor, the NVIDIA Tesla C2050 GPU, and a gate-level architectural model of a manycore chip design called “Green Wave.”

John ShalfLawrence Berkeley National [email protected]

MS109

The Lattice Boltzmann Method: Basic Performance Characteristics and Performance Modeling

High memory bandwidth requirements and regular data access patterns are common features of many CFD applications. Choosing a 3D Lattice Boltzmann kernel and focusing on single node performance we first demonstrate basic optimization techniques for modern multicore CPUs and GPUs. Concepts and potentials of more advanced multicore-aware temporal blocking techniques will also briefly be covered. Guided by established performance models we compare typical performance characteristics of the latest x86-CPU and GPU generations.

Gerhard Wellein, Georg HagerErlangen Regional Computing Center (RRZE), [email protected],[email protected]

Johannes HabichRegional Computing Center [email protected]

MS110

Toward Future Environmental Modeling


Mohammad AboualiCSRCSan Diego State [email protected]

MS110

Isogeometric Analysis in the World of CAD

Isogeometric analysis is a new finite element approach. It unifies the worlds of computer aided design (CAD) and finite element analysis (FEA) and makes it possible to use the same free-form-surfaces model for CAD and FEA. The presentation shows a full static analysis in the commercial NURBS-based geometry modeling tool “Rhino” using the isogeometric approach. The use of the same geometric description for both the CAD and FEA model has significant advantages in practical application. The presentation shows illustrative examples from structural analysis to demonstrate the practical application of the approach.

Michael BreitenbergerBGCETechnical University [email protected]

MS110

Local-isotropy in Direct Numerical Simulation of Turbulent Channel Flow at High-Reynolds Number

Turbulent channel flow (TCF) is one of the most canonical wall-bounded turbulent flows, and there have therefore been extensive studies on TCF by direct numerical simulations (DNS). High-resolution DNS of TCF provides us not only with detailed practical data for modeling of wall-bounded turbulence but also with detailed fundamental data to explore universality in the small-scale statistics of high-Reynolds-number wall-bounded turbulence. In the present study, we developed a DNS code of TCF for the current version of the Earth Simulator (ES), attained a 5.9 Tflops computation (11.3% of the peak performance) in the DNS of TCF on 1024x1536x1024 grid points using 64 nodes of ES, and achieved the friction Reynolds number Reτ = 2560, which is the world’s largest Reτ so far achieved in DNS of TCF. The analysis of the DNS data shows that the one-dimensional longitudinal energy spectrum which is consistent with the prediction of Kolmogorov’s theory (K41) is realized in the so-called log-law layer of high-Reynolds-number wall-bounded turbulence, but shows also that local isotropy, which is a key hypothesis in K41, does not necessarily hold at sufficiently small scales.

Koji MorishitaDept Computational Science and EngineeringNagoya [email protected]

MS110

Hybrid Discontinuous Galerkin Methods for Incompressible Flow Problems

We propose a hybrid discontinuous Galerkin method for incompressible flow that allows for nonconforming h-refinements and approximations with locally varying polynomial degrees (hp-adaptivity). We present order-optimal a priori estimates and a simple, efficient and reliable error estimator. The agreement of the theoretical bounds with numerical results will be demonstrated.

Christian WalugaAICES Graduate SchoolRWTH Aachen [email protected]

Herbert EggerInstitute for Mathematics and Scientific ComputingUniversity of [email protected]

MS111

Parallel Mesh Generation by Distributing CAD Geometry

Geometry information is important for generating good-quality meshes, for example on highly curved geometry or in adaptive meshing. Therefore, an approach to parallel mesh generation by distributing assembly CAD geometry will be described in this talk. This method is based on partitioning of geometric surfaces and volumes across processors, with some surfaces shared between processors. Meshing of interface surfaces and volume bodies is partitioned across processors using a graph partitioning method to minimize communication cost and maximize load balance. Model edges are meshed serially by the root processor, interface surface meshes are generated by one processor and sent to the other sharing processor, and non-interface surfaces and volume interiors are meshed in parallel. Non-interface surface meshing and asynchronous messages are used to hide the cost of message passing latency. It is implemented using ITAPS interfaces to read/distribute geometry by iGeom, to store/communicate mesh information by iMesh and to relate them for meshing geometry in parallel by iRel.

Hong-Jun KimArgonne National [email protected]

Timothy J. TautgesArgonne National [email protected]

MS111

Parallel Hybrid Mesh Adaptation

A procedure for parallel anisotropic mesh adaptation accounting for mixed element types is presented. The parallel adaptive approach uses local mesh modification procedures in a manner that maintains layered and graded elements near the walls, which are popularly known as boundary layer or semi-structured meshes, with highly anisotropic elements of mixed topologies. The technique developed is well suited for parallel viscous flow applications where the required mesh resolution is not known a priori.

Aleksandr OvcharenkoRensselaer Polytechnic [email protected]

MS111

Towards Large-Scale Predictive Flow Simulations using ITAPS Services

This talk will present advances made in massively parallel fluid-dynamics simulations based on tools/services provided by ITAPS. Examples including complex geometries and flow physics will be covered. It will be shown that even with these advances, unknowns associated with real-world problems limit the use of current tools in a comprehensive and predictive manner. Therefore, we will also present basic steps to advance in this regard, for example, based on ITAPS-related manipulation services for geometry, etc.

Onkar SahniPECOS/ICES at the University of Texas at [email protected]

Kenneth JansenRensselaer Polytechnic [email protected]


MS112

Damping of Spurious Reflections off of Coarse-fine Adaptive Mesh Refinement Grid Boundaries

Adaptive mesh refinement (AMR) is an efficient technique for solving systems of partial differential equations numerically. The underlying algorithm determines where and when a base spatial and temporal grid must be resolved further in order to achieve the desired precision and accuracy in the numerical solution. However, systems of PDEs with low dissipation prove problematic for AMR. In such a system, a wave traveling from a finely resolved region into a coarsely resolved region encounters a numerical impedance mismatch, resulting in spurious reflections off of the coarse-fine grid boundary. Here, we present a scheme for damping these spurious reflections and apply it to Maxwell’s Equations.

Sven ChiltonUniversity of California, [email protected]

Phillip ColellaLawrence Berkeley National [email protected]

MS112

Fast Summation Method for Electro-Magnetics using the Yukawa Screening Potential


Andrew J. ChristliebMichigan State UniversityDepartment of [email protected]

MS112

Asymptotic-Based Numerical Methods for Plasmas in the Quasineutral and Strong Magnetic Confinement Regimes

An efficient and accurate numerical scheme for the solution of highly anisotropic elliptic equations is presented. The anisotropy is driven by a magnetic field. Hence, it can be strong in some regions and weak in others. Moreover, its direction may vary. Our method is based on the Asymptotic Preserving reformulation of the original problem, permitting an accurate resolution independently of the anisotropy strength and direction without the need for a mesh adapted to the anisotropy.

Pierre DegondCNRS, Institut de Mathematiques de ToulouseCNRS et Universite Paul [email protected]

Fabrice DeluzetInstitut de [email protected]

Alexei Lozinski, Jacek NarskiInstitut de Mathematiques de ToulouseUniversite Paul Sabatier - Toulouse [email protected],[email protected]

Claudia NegulescuCMI/LATPUniversite de [email protected]

MS112

Design and Preliminary Results for PIC on GPUs with Python

The Particle-in-Cell (PIC) model, popular for many multiscale plasma physics problems, is well suited to parallel architectures. Our goals include mapping PIC to GPUs, and providing a programmatic interface. We exploit multiple levels of parallelism using tools such as PyCUDA/PyOpenCL, IPython, and ZeroMQ. The design and promising preliminary performance results will be presented. Our GPU code achieved one order of magnitude performance improvement over a single CPU core on a simplified push subproblem.

Min Ragan-KelleyUC Berkeley AS&[email protected]

John VerboncoeurDept. Nuclear EngineeringUniversity of [email protected]
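
The particle push is the most data-parallel stage of a PIC cycle, since every particle is advanced independently. The abstract does not say which pusher is used; the sketch below assumes the widely used Boris scheme, vectorized over particles with NumPy on the CPU. The field values, particle count, and function name are illustrative assumptions.

```python
import numpy as np

def boris_push(x, v, E, B, q_over_m, dt):
    """Advance positions x and velocities v (shape (n, 3)) by one Boris step."""
    E = np.broadcast_to(E, v.shape)
    B = np.broadcast_to(B, v.shape)
    half_kick = 0.5 * q_over_m * dt
    v_minus = v + half_kick * E                        # first half electric kick
    t = half_kick * B                                  # rotation vector
    s = 2.0 * t / (1.0 + np.sum(t * t, axis=-1, keepdims=True))
    v_prime = v_minus + np.cross(v_minus, t)
    v_plus = v_minus + np.cross(v_prime, s)            # magnetic rotation
    v_new = v_plus + half_kick * E                     # second half electric kick
    return x + dt * v_new, v_new

# Hypothetical test: 1000 particles in a uniform magnetic field along z, no E field.
rng = np.random.default_rng(1)
x0 = np.zeros((1000, 3))
v0 = rng.standard_normal((1000, 3))
x1, v1 = boris_push(x0, v0, E=np.zeros(3), B=np.array([0.0, 0.0, 1.0]),
                    q_over_m=1.0, dt=0.1)
print(np.abs(np.linalg.norm(v1, axis=1) - np.linalg.norm(v0, axis=1)).max())
```

With no electric field the Boris rotation preserves each particle's speed, which makes a convenient correctness check when porting the kernel to a GPU.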

MS113

Eigenanalysis of Polynomial Chaos Representations of Uncertain ODE Systems

Ordinary differential equations (ODEs) with uncertain parameters and/or initial conditions are of practical interest in the analysis and reduction of many physical systems. We use Galerkin projection onto polynomial chaos (PC) basis functions to reformulate an uncertain system of ODEs as a deterministic ODE system that describes the evolution of the PC modes. We analyze the eigenstructure of the uncertain Jacobian and that of the Jacobian of the projected PC-system, outlining general statements about the relation between the two solutions. We also discuss the use of eigenvalues and eigenvectors of both systems for the reduction of ODE systems under uncertainty.

Robert D. BerrySandia National [email protected]



MS113

Efficient Computation of Failure Probability

Computing failure probability is a critical step in many applications such as reliability based optimization. The most straightforward method is to sample the response space. This can be prohibitively expensive because each sample requires a full scale simulation of the underlying physical system. An alternative way is to generate an accurate surrogate model for the system and sample the surrogate. This can be extremely efficient as long as one can construct such a surrogate. In this talk we demonstrate that an explicit surrogate approach is fundamentally flawed, no matter how accurate the surrogate is. Furthermore, we present a hybrid algorithm that combines the surrogate and sampling approaches and addresses the robustness problem described above. Rigorous error estimates will be presented as well as numerical examples.

Dongbin Xiu, Jing LiPurdue [email protected], [email protected]
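
One way to combine a surrogate with direct sampling is to trust the surrogate only where its prediction is far from the failure boundary and to re-evaluate the true model elsewhere. The sketch below illustrates that idea under stated assumptions: failure is defined as g(x) < 0, the re-evaluation band `threshold` is chosen by hand, and the two toy models are hypothetical. It is not claimed to be the algorithm or the error estimate presented in the talk.

```python
import numpy as np

def hybrid_failure_probability(true_model, surrogate, samples, threshold):
    """Estimate P[g(x) < 0], calling true_model only near the surrogate's limit state."""
    g_surrogate = surrogate(samples)
    failed = g_surrogate < 0.0
    near_limit = np.abs(g_surrogate) <= threshold      # surrogate sign is not trusted here
    failed[near_limit] = true_model(samples[near_limit]) < 0.0
    return failed.mean(), int(near_limit.sum())

# Hypothetical setup: x ~ N(0, 1), true limit state g(x) = 2 - x,
# and a surrogate that is off by a small constant bias of 0.05.
rng = np.random.default_rng(0)
samples = rng.standard_normal(100_000)
true_model = lambda x: 2.0 - x
surrogate = lambda x: 2.05 - x

p_direct = np.mean(true_model(samples) < 0.0)          # expensive reference
p_surrogate = np.mean(surrogate(samples) < 0.0)        # biased by the surrogate error
p_hybrid, n_true_evals = hybrid_failure_probability(true_model, surrogate, samples, 0.1)
print(p_direct, p_surrogate, p_hybrid, n_true_evals)
```

In this toy case the band is wider than the surrogate error, so the hybrid estimate matches direct sampling on the same samples while evaluating the true model on only a small fraction of them.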

MS113

Multiscale Methods for Statistical Inference in Elliptic Problems

Estimating the coefficients of an elliptic PDE from observations of the solution is an ill-posed inverse problem. Finite data resolution and the smoothing character of the forward operator limit one’s ability to recover fine-scale information, but simultaneously suggest an intrinsically multiscale approach to inversion. We formulate a new Bayesian inference approach in this context. Essential components of the approach are (1) constructing a prior distribution over stiffness matrices at the coarse scale; and (2) generating realizations of the fine scale conditioned on coarse-scale information, via nonlinear constraints. Numerical examples demonstrate the efficiency and accuracy of the method.

Matthew Parno, Youssef M. MarzoukMassachusetts Institute of [email protected], [email protected]

MS113

Sensitivity-based Reduced Order Modeling of High Dimensional Uncertain Inputs

We present an approach to reducing the dimensionality of the input space of a stochastic simulation using the SVD of randomly sampled sensitivity gradients. Such a reduction of the dimension of the parameter space has deep implications for the number of samples required to build accurate surrogate surfaces. The curse of dimensionality encountered by most uncertainty propagation schemes makes the required number of samples grow very fast with the dimensionality of the parameter space. Therefore, reducing the parameter dimensions through our method can dramatically reduce the computational cost of uncertainty quantification with a high dimensional stochastic space.

Alireza DoostanDepartment of Aerospace Engineering SciencesUniversity of Colorado, [email protected]


Paul ConstantineSandia National [email protected]

Gianluca IaccarinoStanford [email protected]
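
The idea of extracting dominant input directions from sampled gradients can be sketched with a plain SVD. The example below assumes gradients are available as rows of a matrix and keeps the right singular vectors that capture a chosen fraction of the squared singular values; the 99% cutoff, the test function, and the function name are assumptions for illustration rather than details from the talk.

```python
import numpy as np

def dominant_input_directions(gradients, energy=0.99):
    """Return input-space directions capturing `energy` of the sampled gradient variation.

    gradients: array of shape (n_samples, dim), one sensitivity gradient per row.
    """
    _, singular_values, vt = np.linalg.svd(gradients, full_matrices=False)
    cumulative = np.cumsum(singular_values ** 2) / np.sum(singular_values ** 2)
    rank = int(np.searchsorted(cumulative, energy) + 1)
    return vt[:rank].T, singular_values

# Hypothetical model f(x) = sin(a . x) in 20 dimensions: every gradient is
# parallel to a, so a single direction should be recovered.
rng = np.random.default_rng(0)
dim, n_samples = 20, 50
a = rng.standard_normal(dim)
X = rng.standard_normal((n_samples, dim))
gradients = np.cos(X @ a)[:, None] * a            # analytic gradient of sin(a . x)
directions, singular_values = dominant_input_directions(gradients)
alignment = abs(directions[:, 0] @ (a / np.linalg.norm(a)))
print(directions.shape, alignment)
```

A surrogate can then be built in the reduced coordinates `X @ directions`, which here has one column instead of twenty.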

MS114

Challenges for Optimization in Future Electric Power Systems


Jeremy [email protected]

MS114

Resource Commitment and Dispatch in the PJM Wholesale Electricity Market

This presentation will provide an overview of the size and scope of the PJM wholesale electricity market and the scheduling and operational challenges we expect to encounter in the future with increased penetration of intermittent and distributed resources. It will also provide a brief overview of the development and implementation of the mixed-integer programming based unit commitment and dispatch programs in the PJM market and will describe the benefits achieved to date.

Andrew L. [email protected]

MS114

Long-term Planning for the Power Grid



MS114

High-Performance Computing for Transmission Expansion Planning

We address computational issues arising in transmission expansion planning. The large-scale adoption of smart-grid programs and intermittent renewables, coupled with changing market designs, requires highly flexible transmission systems. A powerful technique that can be used to address this problem is stochastic mixed-integer optimization. However, the associated network complexity, time scales, number of uncertain scenarios, and number of integer variables arising in transmission planning problems severely limit the scope of applications. In this project, we leverage high-performance computing capabilities through the development of a set of optimization tools that enable the solution of problems of unprecedented complexity. We present scalability studies on BlueGene/P. In addition, we present a novel set of complementarity formulations to handle transmission switching decisions in DC and AC power flow models.

Victor Zavala, Zhen XieArgonne National [email protected], [email protected]

Sven LeyfferArgonne National [email protected]

MS115

Performance of a Sparse Hybrid Linear Solver on a Multicore Cluster

Sparse hybrid solvers are a trade-off between direct methods and iterative methods. Part of the computation is first performed with a direct method in order to ensure numerical robustness; the algorithm then switches to an iterative method to alleviate the computational complexity and memory usage. The convergence and number of iterations depend on the amount of computation performed in the direct part. We present the benefits of such a hierarchical approach on a multicore cluster.

Emmanuel [email protected]

Luc GiraudINRIAToulouse, [email protected]

Abdou GuermoucheLaBRI-INRIA [email protected]

Azzam Haidar
Department of Electrical Engineering and Computer Science
University of Tennessee, [email protected]

Yohan [email protected]

Jean RomanINRIA / [email protected]

MS115

Auto-Tuned Linear Algebra Computations for Krylov Methods on Multicore and GPUs

We discuss performance optimizations for sparse Krylov solvers. By modeling and then micro-benchmarking matrix and vector computations, we choose a better sparse structure for the matrices and a better data distribution for the vector computations on multicore processors and GPUs. Our experiments on different generations of processors and three generations of GPUs, from the first CUDA-capable GPUs to the Fermi architecture, show different optima, with speed-ups of 2-3x on processors and up to 10x on GPUs versus the respective original performance. We conclude that autotuning is a good way to deliver better performance on both kinds of hardware.

Serge G. PetitonCNRS/LIFL and [email protected]

Christophe CalvinCEA-Saclay/DEN/DANS/DM2S/SERMA/[email protected]

Jerome [email protected]
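As a rough illustration of the micro-benchmark-then-choose idea in the abstract above, the following Python sketch times two candidate sparse matrix-vector kernels (COO and CSR layouts) on the same matrix and keeps the faster one. The kernels, the helper names (`spmv_coo`, `spmv_csr`, `autotune`) and the tridiagonal test matrix are illustrative assumptions, not the authors' code.

```python
import time

def spmv_coo(rows, cols, vals, x, n):
    """y = A @ x with A stored as coordinate triplets (COO)."""
    y = [0.0] * n
    for r, c, v in zip(rows, cols, vals):
        y[r] += v * x[c]
    return y

def spmv_csr(indptr, indices, data, x, n):
    """y = A @ x with A stored in compressed sparse row (CSR) form."""
    y = [0.0] * n
    for i in range(n):
        s = 0.0
        for k in range(indptr[i], indptr[i + 1]):
            s += data[k] * x[indices[k]]
        y[i] = s
    return y

def tridiagonal_coo(n):
    """Hypothetical test matrix: the (-1, 2, -1) stencil, emitted row by row."""
    rows, cols, vals = [], [], []
    for i in range(n):
        for j, v in ((i - 1, -1.0), (i, 2.0), (i + 1, -1.0)):
            if 0 <= j < n:
                rows.append(i)
                cols.append(j)
                vals.append(v)
    return rows, cols, vals

def coo_to_csr(rows, cols, vals, n):
    """Convert row-sorted COO triplets to CSR arrays."""
    indptr = [0] * (n + 1)
    for r in rows:
        indptr[r + 1] += 1
    for i in range(n):
        indptr[i + 1] += indptr[i]
    return indptr, list(cols), list(vals)

def autotune(candidates, repeats=5):
    """Time each zero-argument kernel; return (name of fastest, all timings)."""
    timings = {}
    for name, kernel in candidates.items():
        best = float("inf")
        for _ in range(repeats):
            t0 = time.perf_counter()
            kernel()
            best = min(best, time.perf_counter() - t0)
        timings[name] = best
    return min(timings, key=timings.get), timings

n = 2000
rows, cols, vals = tridiagonal_coo(n)
indptr, indices, data = coo_to_csr(rows, cols, vals, n)
x = [1.0] * n
candidates = {
    "coo": lambda: spmv_coo(rows, cols, vals, x, n),
    "csr": lambda: spmv_csr(indptr, indices, data, x, n),
}
best_name, timings = autotune(candidates)
```

Which layout wins depends on the machine and the matrix, which is the point of tuning at run time rather than fixing the choice in advance.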

MS115

Developing a 3-D Sweep Radiation Transport Code for Large-scale GPU Systems

Solution of the seven-dimensional Boltzmann equation is a computationally expensive operation used in multiple science applications, typically solved using 3-D sweep algorithms which are difficult to parallelize efficiently. This talk discusses a 3-D sweep algorithm implemented on NVIDIA GPUs, as part of the effort to port the Denovo radiation transport code to ORNL’s next-generation GPU-based petascale system. This talk presents strategies used to expose thread parallelism as well as performance results on NVIDIA Fermi GPUs.

Wayne JoubertOak Ridge National [email protected]

MS115

Parallel Preconditioning Methods for Ill-Conditioned Problems on Multicore Clusters

In this work, the authors developed a method for parallel preconditioning based on HID (Hierarchical Interface Decomposition) for finite-element applications with ill-conditioned coefficient matrices, and implemented the method on multi-core/multi-socket clusters using an OpenMP/MPI hybrid parallel programming model. In this method, HID is also applied to computations on each domain. Robustness and efficiency of the developed method have been demonstrated on the T2K Open Supercomputer (Tokyo) and the Cray XT4 (LBNL).

Kengo Nakajima, Masae HayashiThe University of TokyoInformation Technology [email protected], [email protected]


MS116

Viscosity based Shock Capturing for High Order Space and Time Adaptive Discontinuous Galerkin Schemes

This talk deals with shock capturing strategies using the explicit space-time expansion discontinuous Galerkin (STE-DG) scheme. With its high order local time-stepping functionality, the increasing time-step restrictions, introduced by using viscosity for capturing shocks, can be overcome. Starting with ideas by Persson and Peraire, we adopted and refined their approach, while searching for alternative, less parameter-dependent strategies. The talk will introduce the scheme, shock sensors and capturing mechanisms, together with applications for Navier-Stokes and Magnetohydrodynamics.

Christoph AltmannInstitut fuer Aerodynamik und GasdynamikUniversitaet [email protected]

Gregor GassnerInstitute for Aerodynamics and GasdynamicsUniversitaet [email protected]

Claus-Dieter MunzInstitut fur Aerodynamik und Gasdynamik (IAG)[email protected]

MS116

Shock Capturing in a Time-Explicit Discontinuous Galerkin Method on the GPU

Having recently shown that high-order unstructured discontinuous Galerkin (DG) methods form a spatial discretization that maps well onto graphics processing units (GPUs), I will report on ongoing work aiming to design a GPU-capable (i.e. data-local) shock capturing scheme based on DG. I will discuss the mathematical and algorithmic motivation for a shock sensor and the design of a GPU-suited smoother that uses the sensor’s findings. In addition, I will touch upon accuracy results for the resulting schemes on a variety of nonlinear conservation laws.


Timothy WarburtonDepartment of Computational and Applied MathematicsRice [email protected]


MS116

A h-p Adaptive Non-hydrostatic Atmospheric Flow Solver

We present a framework aimed at problems arising in the geosciences: it can solve the non-hydrostatic equations of the atmosphere, shallow-water problems, tsunami prediction and advection tests. The DG method is employed; however, any element-based discretization can be supported. A mesh database enables parallel non-conforming mesh refinements in both h and p. Modern coding techniques are employed. They enable the seamless optimization of compute kernels: SSE vector operations are supported but could be extended to GPUs.


Sebastien BlaiseNational Center for Atmospheric [email protected]

David HallNational Center for Atmospheric Research1850 Table Mesa Drive, Boulder CO [email protected]

MS116

Exponential Integrators: Construction, Implementation and Performance

Exponential integrators offer an efficient alternative to explicit and implicit methods for integration of large stiff systems of ODEs. While the first exponential schemes were introduced as early as 1958, they did not attain wide popularity since the algorithms were deemed too computationally expensive. However, recent research on exponential integrators has allowed the construction of efficient methods which are competitive with commonly used implicit and explicit schemes. In this talk we will provide an overview of the structure and performance of exponential integrators. We will describe what design principles allow for construction of efficient exponential schemes and quantitatively illustrate their computational advantages compared to Krylov-based implicit and explicit integrators using a set of test problems. We will present a new class of exponential propagation iterative schemes (EPI) and discuss how these methods can be derived using the B-series theory. The major building blocks of the new EPI methods, such as their structure, adaptivity and performance, will be discussed.

Mayya TokmanUniversity of California, MercedSchool of Natural [email protected]

John LoffeldUniversity of [email protected]
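To make the basic idea behind exponential integrators concrete, here is a minimal Python sketch of the simplest such scheme, exponential Euler, on a stiff scalar test problem u' = lam*u + N(t, u). It is not one of the EPI schemes discussed in the abstract; the test problem, step size and function names are assumptions chosen for illustration. The linear part is propagated exactly through exp(h*lam), and the remainder is weighted by phi_1(z) = (exp(z) - 1)/z.

```python
import math

def phi1(z):
    """phi_1(z) = (exp(z) - 1) / z, with the limit value 1 at z = 0."""
    return math.expm1(z) / z if z != 0.0 else 1.0

def exponential_euler(lam, nonlin, u0, t0, h, steps):
    """Integrate u' = lam*u + nonlin(t, u); the linear part is treated exactly."""
    u, t = u0, t0
    for _ in range(steps):
        u = math.exp(h * lam) * u + h * phi1(h * lam) * nonlin(t, u)
        t += h
    return u

def explicit_euler(lam, nonlin, u0, t0, h, steps):
    """Forward Euler on the same problem, for comparison."""
    u, t = u0, t0
    for _ in range(steps):
        u = u + h * (lam * u + nonlin(t, u))
        t += h
    return u

lam = -1000.0                       # stiff linear coefficient (assumed test value)
forcing = lambda t, u: math.cos(t)  # slowly varying remainder term
h, steps = 0.01, 100                # h*lam = -10, far outside forward Euler's stability region

u_exp = exponential_euler(lam, forcing, 1.0, 0.0, h, steps)
u_fe = explicit_euler(lam, forcing, 1.0, 0.0, h, steps)
```

With these values forward Euler amplifies the solution by a factor of about 9 in magnitude per step, while the exponential scheme stays close to the slowly varying solution near cos(t)/1000.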

MS117

Designing Optimal Filters for Ill-posed Inverse Problems

Ill-posed inverse problems arise in many scientific and engineering applications. Regularization via filtering of the singular value decomposition can be used to compute reasonable solutions. In this talk, a general framework for designing optimal filters is developed, where techniques from stochastic and numerical optimization are utilized. We employ prior knowledge of the application and investigate error metrics, such as p-norms. Numerical examples from image deblurring illustrate better performance compared to well-established filtering methods.

Julianne ChungUniversity of [email protected]

Matthias ChungDepartment of MathematicsTexas State [email protected]

Dianne P. O’LearyUniversity of Maryland, College ParkDepartment of Computer [email protected]
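The SVD filtering mentioned in the abstract above can be sketched in a few lines of Python. The example below shows two well-established filters (Tikhonov and truncated SVD), not the optimal filters developed in the talk; it works directly with coefficients in the singular-vector basis, and the singular values, noise level and regularization parameters are made-up illustrative values.

```python
import math

def tikhonov_filter(sigma, alpha):
    """Tikhonov filter factor: close to 1 for sigma >> alpha, close to 0 for sigma << alpha."""
    return sigma**2 / (sigma**2 + alpha**2)

def tsvd_filter(sigma, cutoff):
    """Truncated-SVD filter factor: keep a component only if sigma >= cutoff."""
    return 1.0 if sigma >= cutoff else 0.0

def filtered_solution(singular_values, coeffs, filt):
    """coeffs[i] stands for u_i^T b; returns the coefficients of x along the v_i."""
    return [filt(s) * c / s for s, c in zip(singular_values, coeffs)]

def error(x, x_true):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, x_true)))

sigma = [1.0, 0.1, 1e-4]     # assumed singular values, one of them tiny
x_true = [1.0, 1.0, 1.0]     # assumed true solution coefficients
noise = 1e-3                 # assumed perturbation of every data coefficient
coeffs = [s * xt + noise for s, xt in zip(sigma, x_true)]

x_naive = filtered_solution(sigma, coeffs, lambda s: 1.0)
x_tik = filtered_solution(sigma, coeffs, lambda s: tikhonov_filter(s, 1e-2))
x_tsvd = filtered_solution(sigma, coeffs, lambda s: tsvd_filter(s, 1e-2))

err_naive = error(x_naive, x_true)
err_tikhonov = error(x_tik, x_true)
err_tsvd = error(x_tsvd, x_true)
```

Without filtering, the noise in the component with the smallest singular value is amplified by a factor of 1/sigma and dominates the error; both filters suppress that component at the cost of losing it.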

MS117

A Fast GPU-based Method for Image Segmentation

The goal of image segmentation is to partition an image into two or more regions that are characterized by, e.g., similar intensity or texture in order to simplify image analysis. Applications range from location of tumors to face recognition. In many of these applications real-time or close to real-time performance is required. GPUs offer high computational performance at low cost and are therefore an interesting architecture, especially for the data-parallel algorithms found in imaging. Therefore, we have implemented a variational shape optimization approach for image segmentation based on a Mumford-Shah functional on GPUs using the Open Computing Language (OpenCL). We present performance results on different GPUs and CPUs and additional scaling results for an MPI-parallel version on multiple GPUs.

Harald KoestlerUniversity of [email protected]

MS117

Computational Challenges in SPECT Reconstruction

This talk presents a comprehensive reconstruction framework for the task of Single Photon Emission Computed Tomography (SPECT) reconstruction. Different configurations of this framework lead to different reconstruction problems. Selected problems are discussed in detail and the challenges with respect to the computational complexity are highlighted. The talk concludes with basic approaches to master the aforementioned challenges, and the corresponding reconstruction results are discussed.

Jan ModersitzkiUniversity of LubeckInstitute of Mathematics and Image [email protected]

Sven BarendtInstitute of Mathematics and Image [email protected]

MS117

Open Curve Evolution for Mumford-Shah Segmentation Models

In two dimensions, the Mumford-Shah functional allows for piecewise-smooth minimizers u with the edge set K made of open or closed curves. Prior level set formulations for Mumford-Shah only allow for segmentation of images with closed edge sets. We propose here an efficient level set based algorithm for segmenting images with edges made of open curves or crack tips, adapting Smereka’s approach combined with M-S energy minimization. Numerical results on synthetic and real images will be presented.

Rami MohieddineUCLA Math [email protected]

Luminita A. VeseUniversity of California, Los AngelesDepartment of [email protected]

MS118

Adaptive Mesh Refinement for Time-Harmonic Inverse Scattering Problems



MS118

Shape Optimization for Free Boundary Problems— Analysis and Numerics

In this talk the solution of a Bernoulli type free boundary problem by means of shape optimization is considered. Four different formulations are compared from an analytical and numerical point of view. By analyzing the shape Hessian in case of matching data, we distinguish between well-posed and ill-posed formulations. A nonlinear Ritz-Galerkin method is applied for discretizing the shape optimization problem. In case of well-posedness, existence and convergence of the approximate shapes are proven. In combination with a fast boundary element method, efficient first and second order shape optimization algorithms are obtained.

Karsten EpplerTechnische Universitaet [email protected]

Helmut Harbrecht
Universitaet Stuttgart
Institut fuer Angewandte Analysis und [email protected]

MS118

Optimization of Shell Structure Acoustics

This work analyzes a mathematical model for shell structure acoustics based on the Naghdi shell equations, and thin boundary integral equations, with full coupling at the shell mid-surface. The use of adjoint equations allows the computation of derivatives with respect to large parameter sets in shape optimization problems where the thickness and mid-surface of the shell are computed so as to generate a radiated sound field subject to broad-band design requirements.

Sean HardestyRice [email protected]

MS118

Boundary Element Methods for Dirichlet Control Problems

We present a boundary integral approach for the solution of Dirichlet boundary control problems with box constraints. The reduced minimization problem results in a variational inequality for which we analyse equivalent boundary integral formulations involving Bi-Laplace boundary integral operators. In addition to the numerical analysis we also comment on the solution of the resulting discrete variational inequality.

Olaf SteinbachTU Graz, Institute of Computational [email protected]

Guenther Of, Thanh Phan XuanTU [email protected], [email protected]

MS119

Sensitivity Analysis Concerning Geometry and Modeling for Aneurysms

In this work examples of cerebral aneurysms will be used to discuss effects of uncertainties in the model reconstruction from medical images and in the rheological models used to describe blood. Preliminary findings indicate an acute sensitivity to regional variations in the geometry definition. Furthermore, within the aneurysm, a change of rheological model affects the wall shear stress noticeably, at levels comparable to the uncertainty in the geometry definition.

Alberto GambarutoDepartamento de MatematicaInstituto Superior [email protected]

Alexandra Moura, Joao Janela, Adelia SequeiraDepartment of MathematicsInstituto Superior [email protected], [email protected],[email protected]

MS119

Hemodynamics and Morphology in the Parent Vessels could Predict the Aneurysm Phenomenology

In the cerebral circulation, the morphological features of the feeding arteries could determine a hemodynamic environment which is more or less protective against the formation and development of cerebral aneurysms. This research aims at defining a set of parameters describing both the morphological features and hemodynamics of cerebral arteries, which significantly and jointly correlate with the location of the disease and possibly with its tendency to rupture.

Tiziano PasseriniDepartment of Math & CSEmory [email protected]

Marina PiccinelliDepartment of Math&CSEmory [email protected]


MS119

Stress Analysis of Cerebral Aneurysms

The rupture of intracranial aneurysms is a mechanical phenomenon caused by underlying biological events. The distribution of pressure-induced tissue wall tension may provide some insights into this phenomenon and perhaps help assess rupture risk. Patient-specific aneurysm models reconstructed from diagnostic imaging data have been considered. We employed anisotropic finite elastic material models and principal curvature-based material fiber directions. Strain energy stored during deformation is minimized when material fiber directions conform to the principal curvatures.

Madhavan RaghavanDepartment of Biomedical EngineeringUniversity of [email protected]

MS119

Functional Data Analysis of Three-dimensional Cerebral Vascular Geometries for the Study of Aneurysms Pathogenesis

We perform statistical analysis of three-dimensional cerebral vascular geometries, obtained from reconstructions of angiographic images. This exploratory study highlights the role of vascular morphology on the pathogenesis of cerebral aneurysms. Advanced techniques are developed for the statistical analysis of these functional data, including methods for multidimensional curve fitting, dimension reduction, registration and classification.

Laura M. SangalliMOX Department of MathematicsPolitecnico Milano, [email protected]

Piercesare Secchi, Simone Vantini
MOX - Department of Mathematics
Politecnico di Milano, [email protected], [email protected]

MS120

Viscoelastic Flows with Microscopic Evaluation of Kramers Rod Forces


Gregory H. MillerUC Davis & [email protected]

MS120

Time-parallel Continuum-kinetic-molecular Computation of Polymer Rod Models


Sorin MitranUniversity of North Carolina Chapel [email protected]

MS121

Heterogeneous Simulation of Particulate Flows on GPU Clusters

Particulate flows are crucial for various industrial processes; however, understanding the underlying transport processes is still the subject of ongoing research. For our numerical approach, we couple a rigid body dynamics solver and a lattice Boltzmann flow solver, which fully resolves the particles. Hence, the simulation of real-world scenarios is very compute intensive. In this talk we present a performance study of our approach on GPU clusters including heterogeneous simulations, and a comparison between CPU and GPU.

Christian Feichtinger
Chair for System Simulation
University of Erlangen-Nuremberg, [email protected]
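For readers unfamiliar with the lattice Boltzmann method used as the flow solver above, the following Python sketch shows one ingredient only: the single-relaxation-time (BGK) collision step on a standard D2Q9 lattice for one cell. It is a textbook form, not the authors' implementation; streaming, boundaries and the coupling to rigid bodies are omitted, and the sample populations and relaxation time are arbitrary.

```python
# D2Q9 lattice: discrete velocities and their weights
C = [(0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1)]
W = [4 / 9] + [1 / 9] * 4 + [1 / 36] * 4

def moments(f):
    """Density and velocity of one cell from its nine populations."""
    rho = sum(f)
    ux = sum(fi * c[0] for fi, c in zip(f, C)) / rho
    uy = sum(fi * c[1] for fi, c in zip(f, C)) / rho
    return rho, ux, uy

def equilibrium(rho, ux, uy):
    """Second-order equilibrium populations (lattice units, c_s^2 = 1/3)."""
    usq = ux * ux + uy * uy
    feq = []
    for (cx, cy), w in zip(C, W):
        cu = cx * ux + cy * uy
        feq.append(w * rho * (1.0 + 3.0 * cu + 4.5 * cu * cu - 1.5 * usq))
    return feq

def bgk_collide(f, tau):
    """Relax the populations toward equilibrium with relaxation time tau."""
    rho, ux, uy = moments(f)
    feq = equilibrium(rho, ux, uy)
    return [fi - (fi - fe) / tau for fi, fe in zip(f, feq)]

# Arbitrary non-equilibrium populations for a single cell
f = [0.40, 0.12, 0.10, 0.11, 0.10, 0.03, 0.02, 0.03, 0.02]
g = bgk_collide(f, tau=0.8)
```

The collision uses only data local to the cell and conserves mass and momentum, which is one reason the method maps well onto GPUs.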

MS121

LBM Simulation of Multi-phase Flow: Large-scale Parallel Implementation on the Mole-8.5 GPGPU Supercomputer

Mole-8.5 is the first GPGPU supercomputer in the world using NVIDIA Tesla C2050 (Rpeak of about 1100 Tflops), designed and established in April 2010 by the Institute of Process Engineering (IPE), Chinese Academy of Sciences. It holds the No. 19 spot on the 35th TOP500 list of worldwide supercomputers and debuted at No. 8 among the most energy-efficient supercomputers in the Green500 list of July 2010. The Mole-8.5 system has already carried out many CFD applications covering such areas as chemical engineering, oil exploitation and recovery, and metallurgy, based on discrete particle methods, among which the lattice Boltzmann method is representative due to its natural parallelism and high efficiency. Here we present an application to the direct numerical simulation (DNS) of particle-fluid systems. Particle-fluid systems exist widely in both nature and engineering, such as sandstorms, debris flow, sediment transport, blood flow, fluidized bed reactors, pneumatic conveying, etc. Understanding the physical mechanisms underlying the complex multi-scale behaviors of these systems requires detailed physical information at high resolution below the particle scale, which cannot be provided adequately by experiments so far. We have implemented a coupled numerical method for DNS of particle-fluid systems in both 2D and 3D. In our proposed coupling scheme, the particle motion is described by the particle method, while the hydrodynamic equations governing fluid flow are solved by LBM. Particle-fluid coupling is realized by an immersed boundary method (IBM). As a result of the fast and efficient simulation of the present scheme and the computational capability of Mole-8.5, the scale at which DNS of particle-fluid systems can be performed can even be expanded to an engineering scale, that is, meters in magnitude. This implementation makes DNS an attractive alternative for exploring the complexity of particle-fluid systems and a potential tool for industrial application.

Limin Wang, Xiaowei Wang, Qingang Xiong, Guofeng Zhou
State Key Laboratory of Multiphase Complex Systems, Institute of Process Engineering, Chinese Academy [email protected], [email protected], [email protected], [email protected]

Wei GeState Key Laboratory of Multiphase Complex SystemsChinese Academy of Sciences, [email protected]

MS121

Highly Interactive Computational Steering of CFD Simulations Utilizing Multiple GPUs

Traditionally, computational fluid dynamics (CFD) is done in a cyclic sequence of independent steps. As CFD simulations are computationally intensive, they are usually executed on high performance systems with rather limited user interaction. Not surprisingly, it is a long-term wish of scientists and engineers to closely interact with their running simulations. We will show that the convergence of the massively parallel computational power of GPUs and a steering environment into a single system significantly improves the usability, application quality and user-friendliness.

Jan LinxweilerInstitute for Computational Modeling in CivilEngineeringTechnical University of Braunschweig, Braunschweig,[email protected]


MS121

Subject-specific Pulmonary Airflow Simulation by Lattice Boltzmann Method on GPU Cluster

We propose a novel meshing method for multi-GPU LBM computation of subject-specific pulmonary airflows. Because GPUs cannot reach their full performance with unoptimized memory access, there have been few GPU computations of complex geometries such as the pulmonary airway. We overcome this issue by decomposing the computational domain into small cubes, each consisting of a 4^3 Cartesian mesh. With a GeForce GTX 480, this method computes airflow 40 times faster than a Core i7-930.

Takahito MikiDepartment of Biomedical EngineeringTohoku University, [email protected]

Xian WangTokyo Institute of [email protected]

Takayuki AokiTokyo Institute of [email protected]

Yohsuke Imai, Takuji IshikawaDept. of Bioengineering and RoboticsTohoku University, [email protected],[email protected]

Kei TakaseDept. of Diagnostic RadiologyTohoku University, [email protected]

Takami YamaguchiDept. of Biomedical EngineeringTohoku University, [email protected]
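The block decomposition described in the pulmonary airflow abstract above can be illustrated with a short Python sketch that covers a voxelized geometry with small cubes and keeps only the cubes containing fluid. Reading the abstract's cube size as 4x4x4 cells is an assumption, as are the function name `active_blocks` and the straight-tube test geometry; the authors' actual data layout is not described in the abstract.

```python
def active_blocks(is_fluid, shape, block=4):
    """Return the sorted origins of block^3 cubes holding at least one fluid cell."""
    nx, ny, nz = shape
    blocks = set()
    for x in range(nx):
        for y in range(ny):
            for z in range(nz):
                if is_fluid(x, y, z):
                    blocks.add((x // block * block,
                                y // block * block,
                                z // block * block))
    return sorted(blocks)

def in_tube(x, y, z):
    """Hypothetical geometry: a straight tube of radius 3 cells along the z axis."""
    return (x - 7.5) ** 2 + (y - 7.5) ** 2 <= 9.0

shape = (16, 16, 16)
blocks = active_blocks(in_tube, shape)
total_blocks = (shape[0] // 4) * (shape[1] // 4) * (shape[2] // 4)
stored_fraction = len(blocks) / total_blocks
```

Inside each retained cube the mesh is a small dense Cartesian array, so memory access within a cube is regular even though the overall geometry is not; here only 16 of the 64 cubes need to be stored.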

MS122

Establishing a Consortium of CSE Programs (Open Discussion)



Martin RuessTechnische Universitat MunchenComputation in [email protected]

Michael HankeKTH Royal Institute of TechnologySchool of Computer Science and [email protected]

MS122

Toward a Predictive Model of Tumor Growth

A general continuum theory of mixtures is used as a basis for diffuse-interface models of multiple interacting constituents. The resulting models are of Cahn-Hilliard type, governed by systems of evolution equations in the volume fractions of species. We develop a framework of statistical inverse analysis based on Bayesian methods and employ methods of statistical calibration and validation of various tumor models that lead to the quantification of uncertainties in specific quantities of interest. Virtual medical imaging data are used to inform representative tumor growth models for the investigation of the initial robustness of the scheme.

Andrea J. Hawkins-DaarudInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

J. Tinsley OdenThe University of Texas at [email protected]

Kris van Der ZeeInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

Serge PrudhommeICESThe University of Texas at [email protected]

MS122

Continuum Mechanical Modeling and Numerical Simulation of Finite Damage in Multi-phasic Materials within the Framework of the Extended Finite Element Method

In this contribution, fluid-saturated materials are modeled by use of the Theory of Porous Media (TPM), which allows the description of multi-component continua with internal interaction. Multiple numerical models for the computation of tearing mechanisms are discussed in the context of the TPM. For the numerical treatment of strong discontinuities, the Extended Finite Element Method (XFEM) is applied. Computational examples show the numerical realization of the presented damage mechanisms.

Hans-Uwe Rempler
Institute of Applied Mechanics (Civil Engineering)
University [email protected]

MS122

Mathematical Models for Inverse Design Problem of Painless Electrode and Automatic Detection of Brain Metastases

This talk presents two mathematical models: an inverse design problem for a painless electrode and automatic detection of brain metastases. The inverse design problem of finding an optimal electrode geometry aims to deal with the edge singularity problem, which causes painful sensations or skin burns around the perimeter of the electrode. In the second model, for automatic brain metastases detection, we develop a robust detection method using a special filtering function which is designed to pick out tumor-like anomalies having a certain size. Theoretical results for the two models are supported by numerical simulations.

Yizhuang SongDepartment of Computational Science & EngineeringYonsei [email protected]

Hyeuknam Kwon
Department of Computational Science & Engineering
Yonsei University
[email protected]

MS123

Generic and Adaptive PDE Solvers on Multicore and Distributed Machines

Implementations of parallel algorithms to solve partial differential equations using adaptively changing meshes are large and complex. At the same time, if implemented intelligently, they can efficiently make use of parallelism both within a single machine using multiple cores as well as across nodes of a cluster computer. We will review our experiences in extending the widely used finite element library deal.II to use both of these paradigms, along with the difficulties one encounters and how they can be resolved.

Wolfgang BangerthTexas A&M [email protected]

MS123

Petascale Simulations of Reacting Flows

We investigate local and parallel performance issues for direct numerical simulation of turbulent combustion using the spectral element code Nek5000. We analyze single-node multi-core performance across the range of kernels encountered in combustion with detailed chemistry and also discuss a variety of scaling issues, including an AMG-based coarse grid solve, a parallel communication framework, and I/O. We present simulation results for several applications, including reacting and non-reacting turbulent flows, and demonstrate scaling for up to 290,000 cores of the IBM BG/P.

Stefan Kerkemeier, Paul F. FischerArgonne National [email protected], [email protected]

James LottesOxford [email protected]

MS123

Trading Memory Usage for Improved Communication/compute Overlap in a Peta-scale Reacting Flow Solver

Effective scaling of massively parallel PDE solvers using finite-difference techniques with a physical domain decomposition is limited by the inter-process and inter-node communication necessary to communicate 'halo' data between neighboring tasks. We have reorganized a peta-scale reacting flow solver, S3D, to improve overlap between the communication and computation, resulting in improved scaling and increased efficiency when running at the full machine size of Jaguar, the XT5 at Oak Ridge National Laboratory (224,000 MPI tasks). The reorganization is enabled by increasing the amount of scratch memory used during the computation to break the conventional data dependency limitations. Specifically, derivative operands are computed and halo zone communication begun asynchronously before evaluation of the chemical reaction rates. Traditionally, each operand is computed immediately before evaluation of the derivative to minimize working memory requirements, and reaction rates are evaluated after computing the derivatives. Reacting flow solvers such as S3D tend to be compute and memory-bandwidth bound and are rarely limited by available memory; with the proposed reorganization we have significant flexibility to vary the amount of memory used in order to minimize work array space while fully overlapping communication with computation.

Ray W. GroutNational Renewable Energy [email protected]

John M. [email protected]

Ramanan Sankaran
National Centre for Computational Sciences
Oak Ridge National [email protected]

Steven HammondNational Renewable Energy [email protected]

Jacqueline ChenSandia National [email protected]
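The reordering described in the S3D abstract above (start the halo exchange first, evaluate reaction rates while it is in flight, then compute derivatives) can be mimicked in a small Python sketch. This is only a single-process illustration: a worker thread stands in for a nonblocking MPI exchange, and `exchange_halo`, `reaction_rate` and the 1D central-difference stencil are hypothetical placeholders, not S3D code.

```python
from concurrent.futures import ThreadPoolExecutor

def exchange_halo(left_edge, right_edge):
    """Stand-in for a nonblocking halo exchange with the two neighboring tasks."""
    return left_edge, right_edge

def reaction_rate(v):
    """Placeholder pointwise source term; it needs no neighbor data."""
    return -0.5 * v

def step(u, left_edge, right_edge, dx):
    """Overlap the halo exchange with pointwise work, then differentiate."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(exchange_halo, left_edge, right_edge)  # begin communication
        rates = [reaction_rate(v) for v in u]      # local work while the exchange runs
        halo_l, halo_r = pending.result()          # wait for the halo values
    padded = [halo_l] + u + [halo_r]               # extra scratch memory: operand plus halo
    dudx = [(padded[i + 1] - padded[i - 1]) / (2 * dx)
            for i in range(1, len(padded) - 1)]
    return dudx, rates

# Local chunk of u(x) = x^2 on x = 1..5; the neighbors own x = 0 and x = 6
u = [1.0, 4.0, 9.0, 16.0, 25.0]
dudx, rates = step(u, left_edge=0.0, right_edge=36.0, dx=1.0)
```

The price of the overlap is the padded copy that must stay alive while the pointwise work runs, which mirrors the memory-for-overlap trade described in the abstract.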

MS123

MAESTRO and CASTRO - Petascale AMR Codes for Astrophysical Applications

We present a suite of AMR hydrodynamics codes for astrophysical applications developed at the Center for Computational Sciences and Engineering at LBNL. MAESTRO is suitable for low Mach number flows and CASTRO is a general compressible code. Both codes scale to 100k-200k cores using a hybrid MPI/OpenMP approach on the Jaguar XT5 supercomputer at OLCF. We are currently studying a variety of astrophysical phenomena including Type Ia supernovae.

Andy Nonaka, Ann S. AlmgrenLawrence Berkeley National [email protected], [email protected]

John B. BellCCSELawrence Berkeley [email protected]

Michael LijewskiLawrence Berkeley National [email protected]

Michael ZingaleDept. of Physics and AstronomyStony Brook [email protected]

MS124

Progress in Large-Scale Differential Variational Inequalities for Heterogeneous Materials

Modeling the mesoscale behavior of irradiated materials is an essential aspect of developing a computationally predictive, experimentally validated, multiscale understanding of the thermo-mechanical behavior of nuclear fuel. Phase field models provide a flexible representation of time-dependent heterogeneous materials. We explain how differential variational inequalities (DVIs) naturally arise in phase field models, and we discuss recent work in developing advanced numerical techniques and scalable software for DVIs as applied to large-scale, heterogeneous materials problems.



Lois Curfman McInnes, Todd MunsonArgonne National LaboratoryMathematics and Computer Science [email protected], [email protected]

Barry F. SmithArgonne National LabMCS [email protected]

Lei WangUniversity of [email protected]

MS124

Scalable Solvers for the Cahn-Hilliard Equation with Applications in Nuclear Fuel Modeling

The Cahn-Hilliard (CH) equation is often used for the simulation of phase separation. In order to use large time steps, we consider some implicit methods, which are traditionally unconditionally stable. But for the CH equation, standard fully implicit methods are not stable due to the anti-diffusion term in the equation. We propose and test some stabilized domain decomposed fully implicit methods. The parallel scalability of a PETSc-based implementation will be reported for 2D and 3D problems.

Chao YangUniversity of Colorado, [email protected]

Xiao-Chuan CaiUniversity of Colorado, BoulderDept. of Computer [email protected]

Michael PerniceIdaho National [email protected]
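For reference, a common textbook form of the Cahn-Hilliard equation shows where the anti-diffusion term mentioned in the abstract above comes from. The form below is not taken from the abstract; the concentration c, mobility M, interface parameter epsilon and the quartic double-well potential are assumptions.

```latex
\frac{\partial c}{\partial t}
  = \nabla \cdot \Big( M \, \nabla \big( f'(c) - \epsilon^2 \Delta c \big) \Big),
\qquad
f(c) = \tfrac{1}{4}\,(c^2 - 1)^2 .
```

With this potential, f'(c) = c^3 - c, so for constant M the right-hand side expands to M \Delta(c^3) - M \Delta c - M \epsilon^2 \Delta^2 c. The middle term is a backward-diffusion contribution, which drives phase separation and is balanced at short wavelengths by the fourth-order term.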

MS124

A Posteriori Error Analysis for a Cut Cell Finite Volume Method

We consider elliptic problems where the diffusion coefficient changes discontinuously across a smooth curved interface. The discontinuity has a strong impact on the accuracy of numerical methods. We derive goal-oriented a posteriori error estimates for the numerical solutions by describing a systematic approach to discretizing a cut-cell problem that handles complex geometry in the interface in a natural fashion. Our approach reduces to the well-known Ghost Fluid Method in simple cases.


Michael PerniceIdaho National [email protected]

Simon TavenerColorado State [email protected]

Haiying WangMichigan Technical [email protected]

MS124

Phase-Field Modeling for Nuclear Materials Applications

In nuclear materials, the characteristics of irradiation-induced gas bubble structures are dependent on the defect production rates as well as microstructure. Here, we implement a phase-field model capable of capturing multi-component defect diffusion, bubble nucleation and growth, and bubble/grain boundary interactions to investigate these processes throughout time and for varying irradiation conditions. Furthermore, we have utilized this simulation capability to develop models of bubble percolation and the effective thermal transport across heterogeneous microstructures.

Paul MillettIdaho National LaboratoryNuclear Fuels and [email protected]


Michael TonksIdaho National [email protected]

MS125

Two-Phase Flow Simulation on GPU cluster using an AMG Preconditioned Sparse Matrix Solver

Multi-GPU computing is applied to a gas-liquid two-phase flow simulation, which is one of the challenging themes in CFD applications. The volume of fluid should be locally conserved, and the interface between gas and liquid is captured by a level-set function and the WENO scheme. The pressure Poisson equation, which includes coefficients with a high density ratio, is solved by an AMG-preconditioned BiCGStab solver. High performance is achieved on an InfiniBand-connected GPU cluster.

Takayuki Aoki, Kenta SugiharaTokyo Institute of [email protected], [email protected]

MS125

Engineering a Kernel-agnostic Distributed Linear Algebra Library for Multi/many-core in Trilinos

In a shared-memory parallel library, user-authored kernels are required to be parallelized to avoid the introduction of serial bottlenecks. In the context of multiple programming models, this is a significant burden for users, especially those unaccustomed to shared-memory programming. Conversely, vendors/heroes may wish to replace generic library kernels with tuned versions. However, this can be an intrusive and complicated substitution. We discuss Trilinos efforts to decouple distributed objects and their kernels to address these difficulties.

Christopher G. BakerOak Ridge National [email protected]

MS125

Preparing Multi-physics, Multi-scale Codes for Hybrid HPC

Effective scientific code teams create development environments that allow them to focus on their scientific goals. Here we examine how one such team is preparing the unstructured-mesh shock hydrodynamics capability of their multi-scale, multi-physics application program for GPU-accelerated multicore computers. The focus is on the organization of data structures and computations so that they execute effectively in this new environment as well as in existing homogeneous multi-core environments.

Richard Barrett, Richard R. Drake, Allen C. RobinsonSandia National [email protected], [email protected],[email protected]

MS125

A Sequential Programming Framework for Large-Scale GPU-Accelerated Structured Grids

Although structured-grid applications have demonstrated significant speedups with GPUs, the complexities of the memory system involved in large-scale GPU clusters make programming such machines very challenging. We propose a programming framework that allows the user to express structured grid applications in a concise and declarative way by automatically translating the user-written sequential program code to parallel programs in MPI and CUDA. We report performance studies using several benchmarks on the TSUBAME2 supercomputer.
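The idea of writing a sequential per-point kernel and letting a framework handle decomposition can be illustrated with a small sketch. This is not the authors' framework or its API: the names diffuse_point, run_sequential and run_partitioned are hypothetical, the grid is 1D, and plain Python lists stand in for MPI ranks and GPU memory. The sketch only shows that a partitioned run with a one-cell halo exchange reproduces the sequential result.

```python
# Hypothetical sketch; none of these names come from the framework in the abstract.

def diffuse_point(left, center, right, alpha=0.25):
    """User-written sequential kernel: one 3-point diffusion update."""
    return center + alpha * (left - 2.0 * center + right)


def run_sequential(grid, steps):
    """Reference run: apply the kernel to every interior point (fixed ends)."""
    u = list(grid)
    for _ in range(steps):
        new = list(u)
        for i in range(1, len(u) - 1):
            new[i] = diffuse_point(u[i - 1], u[i], u[i + 1])
        u = new
    return u


def run_partitioned(grid, steps, parts):
    """Same computation on `parts` subdomains with a one-cell halo exchange.

    Each subdomain stands in for one MPI rank or GPU; copying the neighbour's
    edge value stands in for the communication a real framework would emit.
    Assumes len(grid) >= parts so that no subdomain is empty.
    """
    n = len(grid)
    bounds = [(p * n // parts, (p + 1) * n // parts) for p in range(parts)]
    chunks = [list(grid[lo:hi]) for lo, hi in bounds]
    for _ in range(steps):
        # Halo exchange: read one value from each neighbouring subdomain.
        halos = []
        for p in range(parts):
            left = chunks[p - 1][-1] if p > 0 else None
            right = chunks[p + 1][0] if p < parts - 1 else None
            halos.append((left, right))
        new_chunks = []
        for chunk, (left, right) in zip(chunks, halos):
            padded = [left] + chunk + [right]
            new = list(chunk)
            for k in range(len(chunk)):
                lo_val, mid_val, hi_val = padded[k], padded[k + 1], padded[k + 2]
                if lo_val is None or hi_val is None:
                    continue  # global boundary point stays fixed
                new[k] = diffuse_point(lo_val, mid_val, hi_val)
            new_chunks.append(new)
        chunks = new_chunks
    return [value for chunk in chunks for value in chunk]


grid = [0.0] * 16
grid[8] = 1.0
sequential = run_sequential(grid, 10)
partitioned = run_partitioned(grid, 10, 4)
```

Because both runs call the same kernel with the same arguments, the two results agree exactly; only the bookkeeping around the kernel changes.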

Tatsuo Nomura, Naoya Maruyama, Toshio EndoTokyo Institute of [email protected],[email protected], [email protected]

Satoshi MatsuokaTokyo Institute of [email protected]

MS126

Optimum Experimental Design for Nonlinear Dynamic Processes

Modeling, simulation and optimization are a powerful ’enabling technology’ for mastering the challenges of today’s science and engineering. An important precondition is the validation and calibration of the models by means of experimental data. We present new algorithms to optimize the expensive yet often ambiguous experiments so that they provide significant data, to solve the corresponding nonlinear mixed-integer control problems, and to incorporate model uncertainties. Applications in the chemical industry and cell biology demonstrate a truly amazing potential of Nonlinear Optimum Experimental Design.

Hans Georg BockIWR, University of [email protected]

Ekaterina KostinaFachbereich Mathematik und InformatikPhilipps-Universitaet [email protected]

Stefan Koerkel, Johannes SchloederIWR, University of Heidelberg, [email protected], [email protected]

MS126

Designing Maximally Informative Experiments in Systems Biology

Biological measurements are technically challenging and resource demanding. In this talk, we show how to identify maximally informative experimental protocols for the selection between conflicting nonlinear dynamical hypotheses. The expected information gain is optimized by specifying configurations of crucial subsets of chemical species, sampling schedules and interventions. Our approach is validated with complex mechanistic biochemical networks. Automatically cycling through hypothesis generation, design and testing, this computational framework enables the feedback loop between modeling and experimentation.

Alberto Giovanni BusettoETH ZurichDepartment of Computer Science and [email protected]

Mikael Sunnaker, Sotiris Dimopoulos, Jorg StellingETH [email protected],[email protected],[email protected]

Joachim M. BuhmannETH ZurichDepartement Informatik and [email protected]

MS126

Approximation Algorithms for Bayesian Experimental Design

Many sensor management and experimental design problems require us to adaptively select observations to obtain the most useful information. These problems involve sequential stochastic optimization under partial observability – a fundamental but notoriously difficult challenge. Fortunately, many observation selection problems have a structural property that makes them easier than general sequential stochastic optimization. In this talk, I will introduce this structural property – a new concept that we call adaptive submodularity – which generalizes submodular set functions to adaptive policies. In many respects adaptive submodularity plays the same role for adaptive problems (such as sequential experimental design) as submodularity plays for nonadaptive problems (such as placing a fixed set of sensors). Specifically, just as many nonadaptive problems with submodular objectives have efficient algorithms with good approximation guarantees, so too do adaptive problems with adaptive submodular objectives. I will illustrate the usefulness of the concept by giving several examples of adaptive submodular objectives arising in diverse applications including sensor selection, viral marketing and active learning. Proving adaptive submodularity for these problems allows us to recover existing results in these applications as special cases and handle natural generalizations. In an application to Bayesian experimental design, we show how greedy optimization of a novel adaptive submodular criterion outperforms standard myopic heuristics such as information gain and value of information.
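The nonadaptive case mentioned in the abstract (placing a fixed set of sensors) can be sketched with the classical greedy rule for a submodular coverage objective. This is only the nonadaptive analogue, not the adaptive policy from the talk; the sensor names and the locations they observe are made up for illustration. For monotone submodular objectives under a cardinality budget, this greedy rule is known to reach at least a (1 - 1/e) fraction of the optimal value.

```python
def coverage(selected, sensors):
    """Number of distinct locations observed by the selected sensors."""
    covered = set()
    for name in selected:
        covered |= sensors[name]
    return len(covered)


def greedy_select(sensors, budget):
    """Pick up to `budget` sensors, always adding the largest marginal gain."""
    chosen = []
    for _ in range(budget):
        best, best_gain = None, 0
        for name in sorted(sensors):
            if name in chosen:
                continue
            gain = coverage(chosen + [name], sensors) - coverage(chosen, sensors)
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no remaining sensor adds coverage
        chosen.append(best)
    return chosen


# Hypothetical sensors and the locations each one observes.
sensors = {
    "a": {1, 2, 3, 4},
    "b": {3, 4, 5},
    "c": {5, 6},
    "d": {1, 6},
}
picked = greedy_select(sensors, 2)
```

Here the second pick is "c" rather than "b": once "a" is chosen, "b" adds only one new location, which is the diminishing-returns behaviour that submodularity formalizes.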

Andreas Krause, Daniel GolovinCalifornia Institute of [email protected], [email protected]

MS126

An Optimal Simultaneous Source for Inverse Problems with Multiple Sources

PDE-constrained parameter estimation problems typically involve multiple right-hand sides. For large-scale problems in 3D, the computational cost and memory requirements to solve these problems increase exponentially. In this work, we introduce techniques from stochastic optimization to reduce the number of PDE solves drastically. This reduction is made possible by a superposition principle where data from a multitude of sources is replaced by data from a single designed source. We demonstrate the viability of this principle on realistic examples.
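A small numerical check may help make the superposition idea concrete. When the forward problem is linear in the source, the residual for a weighted combination of sources is the same weighted combination of the per-source residuals. The sketch below assumes random sign weights (a common stochastic choice, not necessarily the designed source of the talk) and a made-up residual matrix; it verifies that the misfit of a single combined source, averaged over all sign patterns, equals the total misfit over all individual sources.

```python
from itertools import product


def frobenius_sq(residual):
    """Total squared misfit summed over all receivers and all sources."""
    return sum(value * value for row in residual for value in row)


def combined_misfit(residual, weights):
    """Squared misfit of the single combined source sum_j weights[j] * source_j."""
    return sum(
        sum(r * w for r, w in zip(row, weights)) ** 2 for row in residual
    )


# Hypothetical residual matrix: rows are receivers, columns are sources.
residual = [
    [0.5, -1.0, 2.0],
    [1.5, 0.0, -0.5],
]

n_sources = len(residual[0])
signs = list(product([-1.0, 1.0], repeat=n_sources))
mean_combined = sum(combined_misfit(residual, w) for w in signs) / len(signs)
total = frobenius_sq(residual)
```

In practice one would draw a few random weight vectors instead of enumerating all of them, so each misfit evaluation costs one PDE solve rather than one per source.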


MS126

Simulation-Based Optimal Bayesian Experimental Design

We propose a Bayesian framework for optimal experimental design with nonlinear simulation-based models. The formulation accounts for uncertainty in model parameters, experimental conditions, and observables, capturing their interdependence using polynomial chaos expansions. The objective function is constructed from information theoretic measures, reflecting expected information gain from potential sequences of experiments. Stochastic approximation algorithms are then used to make optimization feasible in computationally intensive and high-dimensional applications. The setup is demonstrated on a stiff combustion kinetics problem.
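Expected information gain is commonly estimated by nested Monte Carlo, and a toy version may clarify what the objective measures. The sketch below is an assumption-laden illustration, not the authors' formulation: the model y = d * theta + noise, the standard normal prior, and all function names are hypothetical, and no polynomial chaos surrogate or stochastic approximation is used. The linear-Gaussian toy model is chosen because its information gain has a closed form to compare against.

```python
import math
import random


def log_likelihood(y, theta, d, noise_sd):
    """log p(y | theta, d) for the toy model y = d * theta + Gaussian noise."""
    z = (y - d * theta) / noise_sd
    return -0.5 * z * z - math.log(noise_sd * math.sqrt(2.0 * math.pi))


def expected_information_gain(d, n_outer=400, n_inner=400, noise_sd=1.0, seed=0):
    """Nested Monte Carlo estimate of the expected information gain at design d."""
    rng = random.Random(seed)
    inner_thetas = [rng.gauss(0.0, 1.0) for _ in range(n_inner)]
    total = 0.0
    for _ in range(n_outer):
        theta = rng.gauss(0.0, 1.0)               # draw from the prior
        y = d * theta + rng.gauss(0.0, noise_sd)  # simulate the experiment
        log_lik = log_likelihood(y, theta, d, noise_sd)
        # Evidence p(y | d), approximated by averaging over prior samples.
        evidence = sum(
            math.exp(log_likelihood(y, t, d, noise_sd)) for t in inner_thetas
        ) / n_inner
        total += log_lik - math.log(evidence)
    return total / n_outer


def exact_gain(d, noise_sd=1.0):
    """Closed form for this linear-Gaussian toy problem (unit prior variance)."""
    return 0.5 * math.log(1.0 + (d / noise_sd) ** 2)


low_design = expected_information_gain(0.1)
high_design = expected_information_gain(3.0)
```

A design with larger d makes the observation more sensitive to theta, so its estimated gain should be clearly larger; an optimizer over d would exploit exactly this ordering.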

Xun Huan, Youssef M. MarzoukMassachusetts Institute of [email protected], [email protected]

MS127

Exploration of a Cell-Centered Lagrangian Hydrodynamics Method


Donald [email protected]

Scott Runnels, Theodore Carney, Mikhail ShashkovLos Alamos National [email protected], [email protected], [email protected]

MS127

High Order Finite Elements for Lagrangian Hydrodynamics, Part I: General Framework

This talk presents a general Lagrangian framework for discretization of compressible shock hydrodynamics using high order finite elements. The novelty of our approach is in the use of high order polynomial spaces to define both the mapping and the reference basis functions. This leads to improved robustness and symmetry preservation properties, better representation of the mesh curvature that naturally develops with the material motion, significant reduction in mesh imprinting, and high-order convergence for smooth problems.

Veselin Dobrev, Truman EllisLawrence Livermore National [email protected], [email protected]

Tzanio V. KolevCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

Robert RiebenLawrence Livermore National [email protected]

MS127

High Order Finite Elements for Lagrangian Hydrodynamics, Part II: Numerical Results

This talk focuses on several practical considerations of the high order finite element discretization from Part I, including the generalization of traditional concepts such as corner forces and artificial viscosity. We consider an extensive set of test problems to examine shock wave propagation over unstructured/distorted meshes and symmetry preservation for radial flows in 2D, 3D and axisymmetric geometry. In each case we demonstrate robust performance of the high order FEM implementation in our research code BLAST.

Veselin Dobrev, Truman EllisLawrence Livermore National [email protected], [email protected]

Tzanio V. KolevCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

Robert RiebenLawrence Livermore National [email protected]

MS127

Shock Hydrodynamics on Tetrahedral Meshes

A new variational multiscale stabilized formulation for Lagrangian shock hydrodynamics is presented. To the author’s knowledge, it is the only hydrocode that can accurately compute highly unsteady shock hydrodynamics transients on triangular/tetrahedral meshes in two/three dimensions, as well as on the more commonly used quadrilateral/hexahedral meshes. Piecewise linear, equal-order interpolation is adopted for velocities, displacements, and thermodynamic variables. This last aspect makes the current formulation insensitive to the typical pathologies affecting standard hydrocodes (namely hourglass on quadrilateral/hexahedral meshes, and artificial stiffness on triangular/tetrahedral meshes). Numerical tests for the unsteady Euler equations of gas dynamics are presented in two and three dimensions.

Guglielmo ScovazziSandia National [email protected]

MS128

Fully Implicit Methods for Kinetic Simulation of Plasmas

Kinetic plasma simulation presents formidable challenges due to its strongly nonlinear nature, the many dimensions of phase space (up to 6D), and the time-scale disparity. Current algorithmic approaches rely on inefficient explicit time stepping. Here, we explore fully implicit methods, employing Newton-Krylov solvers, for particle-based kinetic methods. Key to the viability of the approach is a suitable formulation of the nonlinear residual. We present proof-of-principle results in 1D+1D electrostatic PIC to demonstrate the potential of the approach.

Luis ChaconOak Ridge National [email protected]

Guangye [email protected]

Daniel BarnesCoronado [email protected]

MS128

3D Penning Trap Simulations using Space-time Parallel Particle Solvers on GPGPUs


Andrew J. ChristliebMichigan State UniversityDepartment of [email protected]

MS128

High Order Hybrid Semi-Lagrangian Scheme for Kinetic Equations

In my talk, I will discuss the coupling of the semi-Lagrangian framework with high order finite volume/difference/element methods. We aim to combine the advantage of semi-Lagrangian schemes, namely the absence of a CFL time step restriction, with the state-of-the-art high order finite volume/difference WENO scheme and the discontinuous Galerkin method. Vlasov simulation results from the proposed methodology will be presented to demonstrate the effectiveness and efficiency of the method.
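Why semi-Lagrangian schemes escape the CFL restriction can be seen on the simplest possible case. The sketch below is a minimal illustration under strong assumptions: constant-velocity 1D advection on a periodic grid with linear interpolation, rather than the high order WENO or discontinuous Galerkin reconstructions discussed in the talk. Each grid value is traced back along its characteristic and interpolated from the old solution, so the update is a convex combination of old values no matter how large the time step is.

```python
import math


def semi_lagrangian_step(u, velocity, dt, dx):
    """Advance u_t + velocity * u_x = 0 by one step on a periodic grid.

    Each grid value is traced back along the characteristic to its departure
    point, where the old solution is linearly interpolated.
    """
    n = len(u)
    shift = velocity * dt / dx          # displacement in cells (the CFL number)
    new = [0.0] * n
    for i in range(n):
        departure = i - shift
        j = math.floor(departure)
        frac = departure - j            # position inside cell [j, j + 1)
        new[i] = (1.0 - frac) * u[j % n] + frac * u[(j + 1) % n]
    return new


n = 64
dx = 1.0 / n
u0 = [math.exp(-200.0 * (i * dx - 0.5) ** 2) for i in range(n)]

u = list(u0)
for _ in range(20):
    # CFL number 3.7: an explicit upwind scheme would be unstable here.
    u = semi_lagrangian_step(u, velocity=1.0, dt=3.7 * dx, dx=dx)
```

The solution stays bounded by its initial extrema and conserves the discrete sum, but linear interpolation smears the pulse; recovering sharp profiles is the motivation for pairing the approach with high order reconstructions.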

Jing-Mei QiuMathematical and Computer SciencesColorado School of [email protected]

Wei GuoThe Department of Mathematical & Computer SciencesColorado School of [email protected]

MS128

Constrained Transport Schemes for Ideal MHD on Unstructured Grids

Standard shock-capturing numerical methods fail to give accurate solutions to the equations of magnetohydrodynamics (MHD). The essential reason for this failure is that by ignoring the divergence-free constraint on the magnetic field, these methods can be shown to be entropy unstable. In this talk we will briefly review the entropy stability theorem for ideal MHD. We will then present a class of discontinuous Galerkin constrained transport (DG-CT) methods on unstructured grids that give both stable and accurate results. The proposed CT approach can be viewed as a predictor-corrector method, where an approximate magnetic field is first predicted by a standard DG method, and then corrected through the use of a magnetic potential.
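The reason a magnetic potential helps can be checked directly: a field computed as the discrete curl of a potential has zero discrete divergence. The sketch below shows only this corrector ingredient, under simplifying assumptions that differ from the talk: a uniform 2D Cartesian grid with a nodal potential and face-centred field components instead of a DG discretization on unstructured grids. The potential and function names are hypothetical.

```python
import math


def field_from_potential(a, dx, dy):
    """Face-centred B = curl(A_z) on a uniform 2D grid; `a` holds nodal A_z."""
    nx, ny = len(a) - 1, len(a[0]) - 1
    # Bx = dA/dy on faces of constant x, By = -dA/dx on faces of constant y.
    bx = [[(a[i][j + 1] - a[i][j]) / dy for j in range(ny)] for i in range(nx + 1)]
    by = [[-(a[i + 1][j] - a[i][j]) / dx for j in range(ny + 1)] for i in range(nx)]
    return bx, by


def max_divergence(bx, by, dx, dy):
    """Largest absolute cell-centred discrete divergence of the face field."""
    nx, ny = len(by), len(bx[0])
    return max(
        abs((bx[i + 1][j] - bx[i][j]) / dx + (by[i][j + 1] - by[i][j]) / dy)
        for i in range(nx)
        for j in range(ny)
    )


nx = ny = 16
dx = dy = 1.0 / nx
# Hypothetical smooth potential, chosen only for illustration.
a = [
    [
        math.sin(2.0 * math.pi * i * dx) * math.cos(3.0 * math.pi * j * dy)
        for j in range(ny + 1)
    ]
    for i in range(nx + 1)
]
bx, by = field_from_potential(a, dx, dy)
divergence = max_divergence(bx, by, dx, dy)
```

The four nodal values of the potential around each cell cancel in the divergence, so the result is zero up to rounding regardless of how the potential itself was obtained.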

James A. RossmanithUniversity of WisconsinDepartment of [email protected]

MS129

Investigation of Air Quality Forecasting Biases through Mass Budget and Process Analysis of Science Algorithms in the Model

Predictability of an air quality model is determined by the accuracy of model algorithms representing atmospheric processes as well as the initial and boundary conditions. As the present forecasting models rely on previous forecasting results as initial conditions, errors in meteorological and air quality predictions may accumulate and propagate to different places for multiple days. To understand the causal relations of biases in ozone and particulate matter predictions, mass budget and process analysis of individual science algorithms is applied to the Community Multiscale Air Quality (CMAQ) model. By analyzing differences in the process contributions between results of sensitivity simulations, we bound how much of the original forecasting error can be reduced by the improved initial and boundary conditions.

Daewon ByunNOAA Air Resources [email protected]

Yunsoo Choi, Richard Saylor, Tianfeng ChaiAir Resources Laboratory, [email protected], [email protected],[email protected]

MS129

Sensitivity and Process Analysis of Daytime and Nighttime Atmospheric Chemistry

Process analysis is used to quantify changes in air pollutant mixing ratios attributable to specific processes, while sensitivity analysis is used to calculate the response of mixing ratios to small perturbations in parameters or to the initial state. Process and sensitivity analysis are critical for the development of chemical data assimilation methods. An analysis for a number of modeling cases with models employing the Regional Atmospheric Chemistry Mechanism, version 2 (RACM2) will be presented.

Charlene LawsonDepartment of Chemistry, Howard [email protected]

William R. StockwellHoward [email protected]

Wendy GoliffCE-CERT University of California, [email protected]

John LewisNOAA National Severe Storms Laboratory and [email protected]

MS129

The Mathematics of the Air Quality Modeling System

Deterministic air quality models link meteorology with atmospheric chemistry. The three-dimensional Eulerian approach is the most commonly used modeling method. Eulerian air quality models typically solve tens of thousands of continuity equations. These are partial differential equations that describe the sum total of processes affecting the mixing ratio of a chemical species within each grid box. An overview of the processes, the equations that are employed by the models, and their solutions will be presented.
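For orientation, a generic form of the species continuity equation solved by Eulerian models is sketched below. The notation is an assumption made for illustration and is not taken from the talk: c_i is the concentration of species i, u the wind field, K an eddy diffusivity tensor, R_i the net chemical production, E_i the emissions and D_i the deposition and other removal terms. Individual models differ in detail, for example by working with mixing ratios and carrying air-density factors.

```latex
\frac{\partial c_i}{\partial t}
  = -\nabla \cdot (\mathbf{u}\, c_i)
  + \nabla \cdot (\mathbf{K}\, \nabla c_i)
  + R_i(c_1, \dots, c_N)
  + E_i - D_i ,
  \qquad i = 1, \dots, N
```

The chemistry term R_i couples the N equations in every grid box, which is why these systems are usually solved by splitting transport and chemistry into separate steps.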

William R. StockwellHoward [email protected]

Charlene LawsonDepartment of Chemistry, Howard [email protected]


MS130

Multiscale Modeling for Stochastic Forest Dynamics

Individual-based models are widely employed to represent and simulate complex systems. Those descriptions, however, come with a high computational cost and perhaps unnecessary degrees of detail. In this work, we start with a spatially explicit representation of the interacting agents and attempt to explain the system’s dynamics at multiple scales by means of successive coarse-graining steps. We apply our technique to a forest model subjected to different disturbance regimes.

Maud ComboulUniversity of Southern [email protected]


MS130

Length-scale Dependent Active Microrheology of Biological Materials


M. Gregory ForestUniversity of North Carolina at Chapel HillDept of Math & Biomedical [email protected]

MS130

Correlated Brownian Motions and the Depletion Effect in Colloids

We first review the model of correlated Brownian motions as derived from deterministic dynamics (Kotelenez 1995, 2005). We then describe the qualitative behavior of correlated Brownian motions at short distances. In particular, we obtain that at short distances and for random times two correlated Brownian motions are attracted to each other (K., Leitman and Mann 2008). This attractive behavior is in good agreement with the depletion phenomena experimentally observed in colloids (Asakura and Oosawa (1954)). [Based on joint work with Marshall Leitman (CWRU) and Jay Mann (CWRU)]

Peter KotelenezCase Western Reserve [email protected]

MS130

A Continuum Model for Moving Contact Lines and the Spreading of Liquid Thin Films

We will discuss a continuum model for the moving contact line problem derived from thermodynamic principles and molecular dynamics simulations. A macroscopic thermodynamic argument is used to place constraints on the form of the boundary conditions; molecular dynamics is then used to measure the detailed functional dependence of the boundary conditions. A new continuum model is obtained for the case of partial wetting as well as the case of complete wetting in a unified form.

Weiqing RenCourant Institute of Mathematical [email protected]

MS131

A Novel Sparse Preconditioner for High-Order Finite Element Problems

In the 1980s, Orszag [J. Comp. Phys., 37, 70 (1980)] introduced a method for preconditioning high-order finite element problems with a low-order discretization. The low-order discretization is performed on a finer mesh defined by the high-order finite element nodes. We describe an approach for constructing a low-order preconditioner that only needs the high-order element stiffness matrices. New results on 2D and 3D meshes for various diffusion-type systems will be presented.

Travis M. AustinTech-X [email protected]


Chetan Jhurani, Srinath VadlamaniTech-X [email protected], [email protected]

Marian BrezinaU. of Colorado, BoulderApplied Math [email protected]


John RugeUniversity of Colorado at [email protected]

MS131

Application of C1 Finite Elements to Two-Fluid Magnetohydrodynamics

The vector potential/stream function representation of the magnetic and velocity fields can be advantageous when computing solutions of the two-fluid magnetohydrodynamic equations, but results in fourth-order derivatives. In applications where shocks are not expected to form, the use of finite elements having C1-continuity is particularly useful. The M3D-C1 code, which employs reduced-quintic C1 elements on an unstructured mesh, is discussed, and results primarily regarding tokamak fusion applications are presented.

Nathaniel FerraroOak Ridge Institute for Science and EducationGeneral [email protected]

Stephen JardinPrinceton Plasma Physics [email protected]

MS131

Smoothers for High Order H(curl) Bases

We consider algebraic multigrid for systems discretized by high order compatible finite elements. In particular, we focus on H(curl) discretizations. Algebraic multigrid methods require efficient smoothers to damp high frequency error. However, the nullspace of purely H(curl) discretizations is large, hence standard smoothers are insufficient for these types of problems. Hybrid smoothers targeting the nullspace have been introduced for lowest order elements, and we show how to generalize this smoother to high order elements for hierarchical and interpolatory bases.

James Lai, Luke OlsonDepartment of Computer ScienceUniversity of Illinois at [email protected], [email protected]

MS131

High Order Elements in FOSLS; Or Why Linears Suck

We will talk about the use of high order elements in the context of first-order system least-squares (FOSLS) discretization. Numerical results show that high order elements perform better than standard analysis predicts. That is, in general, FOSLS yields locally-sharp a-posteriori error estimates equivalent to H1-seminorms. The L2-error is bounded through coercivity constants that depend on domain shape and boundary conditions. It suffers when using linears; however, the accuracy is greatly enhanced when high order elements are used. We will present results that show dramatic improvement in the accuracy per degree of freedom (DOF) for high order elements, and discuss the theoretical analysis.


Steve McCormickDepartment of Applied [email protected]

John RugeUniversity of [email protected]

Lei TangDepartment of Applied MathematicsUniversity of [email protected]

MS132

Apex to Base Heterogeneous Electrophysiology in a Ventricular Dog Model of the Heart

We created a transmurally and longitudinally heterogeneous, fully-coupled electromechanical model of the dog left ventricle. With pacing, the spread of the Action Potential restitution (AP90) was larger in the longitudinal heterogeneous cases compared with the transmural (AP90 longitudinal = 25.9 ± 10 ms; AP90 transmural = 6 ± 2 ms). Longitudinal plus transmural heterogeneity improved stroke volume from 6.7 ± 0.2 ml to 8.7 ± 0.2 ml compared with a transmurally heterogeneous only model.

Jazmin Aguado-SierraUCSDDept of [email protected]

Roy KerckhoffsUCSDDepartment of [email protected]

MS132

Finite Element Analysis of the Mitral Valve with Active Muscle Fibres

The mitral valve, located between the left atrium and left ventricle of the heart, prevents blood from flowing back into the atrium when the ventricle contracts. Several experimental findings showed that this valve contains contractile cells, meaning that the mitral valve is not only a passive structure. We present a transversely isotropic hyperelastic material model for the leaflets in which we add a contractile element to simulate the mechanical function of the muscle cells present in the leaflets.

Victorien Prot, Bjørn SkallerudNorwegian University of Science and Technology [email protected], [email protected]

MS132

Electromechanical Models of the Heart

Over the last decade multiscale models of cardiac electrical behavior that incorporate biophysical models of myocyte subcellular processes as well as continuum models incorporating anatomy and passive mechanics have been developed. Advancements in modeling of these processes have occurred largely independently. However, in the last few years coupled models of cardiac electromechanics have emerged. We present the state-of-the-art in ventricular electromechanical modeling and discuss areas of clinical significance where these models could make important contributions.

Natalia A. TrayanovaJohns Hopkins UniversityInstitute for Computational [email protected]

MS132

Modeling the Infarct Injured Heart, Insights into Mechanical Dysfunction

As direct stress measurements in the beating heart are not possible, computational methods are required to estimate the mechanical stress that the heart experiences during the cardiac cycle. We present a fully-coupled electromechanical model which was built to match the in vivo measurements of myocardial strain in an infarct injured ovine left ventricle. This validated model was used to evaluate stress fields as the ventricle beats and to assess function / dysfunction of the injured heart.

Samuel WallSimula Research LaboratoryCenter for Biomedical [email protected]

MS133

Lattice Boltzmann Approach to the Numerical Modeling of Electrowetting Phenomena

A 3D lattice Boltzmann approach for the modeling of single-component multiphase (SCMP) fluid is applied to the modeling of basic transport and splitting processes in digital microfluidics. The liquid free surface is modeled by using a Shan-Chen interaction potential, and the physical properties of this surface are set to the case of a water-air interface. The contact angle at the walls and the electrowetting actuation at the electrodes are modeled by considering an anisotropy in the interaction of fluid particles with the solid obstacles. Different contact angles and different geometries of the device are then used in order to simulate basic processes consisting in displacements of droplets from one electrode to another as well as splitting of single droplets in a three-electrode approach. Comparisons of numerical results to both analytical models and experimental measurements reveal good agreement and point out strengths and limitations of both the numerical model and electrowetting actuation.
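For readers unfamiliar with the Shan-Chen model, the interaction forces are usually written in the following textbook form. The notation is assumed for illustration and is not taken from the talk: psi is the pseudopotential (a function of the local density), G the fluid-fluid interaction strength, G_ads the fluid-solid interaction strength, w_i the lattice weights, e_i the lattice velocities, and s an indicator that equals 1 on solid nodes and 0 elsewhere.

```latex
\begin{align}
\mathbf{F}(\mathbf{x}, t)
  &= -G\, \psi(\mathbf{x}, t) \sum_i w_i\,
     \psi(\mathbf{x} + \mathbf{e}_i \Delta t,\, t)\, \mathbf{e}_i , \\
\mathbf{F}_{\mathrm{ads}}(\mathbf{x}, t)
  &= -G_{\mathrm{ads}}\, \psi(\mathbf{x}, t) \sum_i w_i\,
     s(\mathbf{x} + \mathbf{e}_i \Delta t)\, \mathbf{e}_i .
\end{align}
```

In this form the equilibrium contact angle is tuned through G_ads. Letting the fluid-solid strength vary with position or direction is one plausible way to introduce the anisotropy mentioned in the abstract, but that reading is our assumption; the abstract does not give the authors' formula.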

Liviu Clime, Daniel Brassard, Teodor VeresNational Research Council [email protected], [email protected], [email protected]

MS133

Capillary Flows with Parallel Free-surface Lattice Boltzmann Method

Many interesting flow phenomena in micro scales such as porous media and nano tubes are influenced by capillary effects. Our free surface lattice Boltzmann method (FSLBM) simulates gas-liquid flows, including surface tension effects, and has been incorporated in waLBerla, which is a framework combining several lattice Boltzmann applications in one efficient and parallel code. A recently developed extension of the FSLBM introduces capillary forces to enable the simulation of capillary effects for gas-liquid systems.

Stefan DonathComputer Science 10 - System SimulationFriedrich-Alexander-Universitat Erlangen-Nurnberg,[email protected]

Ulrich RudeComputer Science 10 - System SimulationFriedrich-Alexander-University Erlangen-Nuremberg,[email protected]

MS133

Drop-on-demand Drop Impact on Patterned Surfaces

Lattice Boltzmann simulations of micron-scale water drop impact on dry patterned surfaces are carried out over a wide range of impact velocities and equilibrium contact angles. Minimization of the total free energy subject to the polynomial wall free energy determines the contact angle and the density profile at solid surfaces. Time evolution of dimensionless kinetic energy and surface energy of an impacting drop at various contact angles will be discussed.

Taehun LeeDepartment of Mechanical EngineeringThe City College of New [email protected]

MS133

Lattice Boltzmann Simulation of Isothermal Vaporization in a Porous Material

We apply a lattice Boltzmann multiphase approach to model the drying process at the pore scale and compare the simulated liquid distribution in a soil sample with laboratory measurements to study effects of soil and transport properties on the evaporation behavior. The parallel efficiency and simplicity of the lattice Boltzmann method pave the way for such a computational challenge. Quantitative comparisons with measured water retention curves and transient drying front depth will be presented.

Ying WangTechnische Universitat [email protected]


Benjamin AhrenholzTechnische Universitat [email protected]

Peter Lehmann, Dani OrEidgenossische Technische Hochschule [email protected], [email protected]

MS134

Uncertainty Assessment for Dynamical Systems, with Applications to Chemical Plants

We discuss the issue of material tracking under uncertainty in a dynamical system as a model of a chemical process. As opposed to typical state estimation approaches, we are interested in using current information to reduce the uncertainty of past states. Due to the high dimensionality and complexity of the distribution, sequential Monte Carlo (SMC) methods are used to estimate the material amounts in the production phases. We discuss sufficient conditions on observed quantities that would result in slowly increasing variance in time and thus in a tractable problem. We prove that SMC methods converge when the proposal density at each time step is normal if the distribution is defined on a compact space. We demonstrate our findings using SISTOS, a parallel code that was developed for this purpose.


Xiaoyan ZengArgonne National [email protected]

MS134

A Scalable Algorithm for Solutions of Large-scale Statistical Inversions

We present an attempt to reduce the cost of evaluating the probability density by employing a Bayesian response surface method based on a Gaussian process model. Our method incorporates gradient and Hessian information into the prior Gaussian process, which is expected to better explore variations of the posterior and reduce the number of sampling points. We apply the proposed method to various synthesized problems and inverse shape scattering problems governed by Maxwell’s equations in the time domain. The shape gradient and shape Hessian-vector products are computed via adjoint equations and continuous shape derivatives. Scatterer shape is parametrized via Fourier bases, yielding increasingly higher dimensional inverse problems.

Tan Bui-ThanhThe University of Texas at [email protected]


MS134

A Discretized Law of Total Probability Approach to Inverse Problems

Given a set of physical measurements with a known error model, the goal of model calibration is to find the input parameters of a simulation that bring the output into agreement with measurements. We examine a procedure based on the law of total probability, which formulates the inverse problem as a standard convex optimization problem. In contrast to MCMC-based inversion, this procedure naturally separates the measurement error from the model error.

Paul ConstantineSandia National [email protected]

Christopher Maes, Peter GlynnStanford [email protected], [email protected]

MS134

Sparse Approximation of SPDEs

We propose a method for the approximation of solutions of PDEs with stochastic coefficients based on the direct, i.e., non-adapted, sampling of solutions. This sampling can be done by using any legacy code for the deterministic problem as a black box. The method converges in probability (with probabilistic error bounds) as a consequence of sparsity and a concentration of measure phenomenon on the empirical correlation between samples. We show that the method is well suited for truly high-dimensional problems (with slow decay in the spectrum).

Alireza DoostanDepartment of Aerospace Engineering SciencesUniversity of Colorado, [email protected]

Houman OwhadiApplied [email protected]

MS135

Large Sparse Matrix Problems in Ab-initio Nuclear Physics

A microscopic theory for the structure of light nuclei poses formidable challenges for high-performance computing. The ab-initio no-core full configuration method frames this quantum many-body problem as a large sparse matrix eigenvalue problem which is solved for the lowest eigenvalues (binding energies) and their associated eigenvectors. The eigenvectors are employed to evaluate experimental quantities. We discuss different strategies for distributing and solving this large sparse matrix on current multicore computer architectures.

Pieter MarisIowa State [email protected]

MS135

An Interpolatory Parallel Method for Large-scale Nonlinear Eigenvalue Problems

We present a coarse-grained parallel method for computing interior eigenvalues and their corresponding eigenvectors of large-scale nonlinear eigenvalue problems. Our method consists of a rational interpolation that is constructed from solutions of independent systems of linear equations with different shift points. This enables us to avoid communication between the computing nodes assigned to each linear solver, and provides good scalability on many-core processors with massively parallel computing resources.

Tetsuya Sakurai, Hiroto TadanoDepartment of Computer ScienceUniversity of [email protected], [email protected]

Tsutomu IkegamiInformation Technology Research [email protected]

Ichitaro YamazakiComputational Research DivisionLawrence Berkeley National [email protected]

MS135

Sparse Matrix Techniques in a Parallel Hybrid Solver for Large-scale Linear Systems

A parallel hybrid linear solver based on the Schur complement method has great potential to utilize thousands of processors for solving large-scale linear systems that are becoming increasingly difficult to solve using standard techniques. In this talk, we outline the algorithm implemented in our parallel hybrid solver, particularly focusing on the sparse matrix techniques for achieving high performance. We also present numerical results of solving highly-indefinite linear systems from real applications.

Ichitaro Yamazaki, Xiaoye Sherry LiComputational Research DivisionLawrence Berkeley National [email protected], [email protected]

Esmond G. NgLawrence Berkeley National [email protected]

MS135

Sparse Matrix Techniques in X-ray Diffractive Imaging

An emerging technique in X-ray diffractive imaging is Ptychography. In a Ptychography experiment, a large number of small and overlapping diffraction patterns of an unknown object are collected. These diffraction images are used in an iterative procedure to recover both the phase and amplitude of the object. An efficient implementation of this algorithm relies on using sparse matrix vector multiplications to update the approximate reconstruction. We will describe the use of such a technique on high performance computers in this talk.

Chao Yang, Stefano MarchesiniLawrence Berkeley National [email protected], [email protected]

Filipe MaiaNERSCLawrence Berkeley National [email protected]

Andre SchirotzekAdvanced Light SourceLawrence Berkeley National [email protected]

MS136

Inexact Newton Methods for Large-Scale Nonlinear Optimization

Inexact Newton methods play a fundamental role in the solution of large-scale unconstrained optimization problems and nonlinear equations. The key advantage of these approaches is that they can be made to emulate the properties of Newton’s method while allowing flexibility in the computational cost per iteration. Due to the multi-objective nature of *constrained* optimization problems, however, which require an algorithm to find both a feasible and optimal point, it has not been known how to successfully apply an inexact Newton method within a globally convergent framework. In this talk, we present a new methodology for applying inexactness to the most fundamental iteration in constrained optimization: a line-search primal-dual Newton algorithm. We illustrate that the choice of merit function is crucial for ensuring global convergence and discuss novel techniques for handling non-convexity, ill-conditioning, and the presence of inequality constraints in such an environment. Numerical results are presented for PDE-constrained optimization problems.
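The basic inexact Newton idea for the unconstrained case can be sketched in a few lines: the Newton system is solved by conjugate gradients only until the linear residual drops below a fraction eta of the current gradient norm. The sketch below is an illustration under stated assumptions, not the constrained primal-dual algorithm of the talk: the convex test objective, the use of the gradient norm as merit function, and all function names are hypothetical choices made for brevity.

```python
import math


def laplacian(v):
    """Gradient of the coupling term 0.5 * sum (v[i+1] - v[i])**2 (linear in v)."""
    out = [0.0] * len(v)
    for i in range(len(v) - 1):
        diff = v[i + 1] - v[i]
        out[i] -= diff
        out[i + 1] += diff
    return out


def gradient(x, b):
    """Gradient of sum(0.5*x_i^2 + 0.25*x_i^4 - b_i*x_i) plus the coupling term."""
    return [l + xi + xi**3 - bi for l, xi, bi in zip(laplacian(x), x, b)]


def hessian_vector(x, v):
    """Hessian-vector product; the Hessian matrix is never formed."""
    return [l + (1.0 + 3.0 * xi**2) * vi for l, xi, vi in zip(laplacian(v), x, v)]


def norm(v):
    return math.sqrt(sum(vi * vi for vi in v))


def truncated_cg(x, g, eta):
    """Solve H s = -g only until ||H s + g|| <= eta * ||g|| (the inexactness)."""
    s = [0.0] * len(g)
    r = [-gi for gi in g]            # residual of H s = -g at s = 0
    p = list(r)
    rr = sum(ri * ri for ri in r)
    target = eta * norm(g)
    iters = 0
    while math.sqrt(rr) > target and iters < 10 * len(g):
        hp = hessian_vector(x, p)
        alpha = rr / sum(pi * hi for pi, hi in zip(p, hp))
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rr_new = sum(ri * ri for ri in r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iters += 1
    return s


def inexact_newton(b, eta=0.1, tol=1e-8, max_iter=50):
    x = [0.0] * len(b)
    for _ in range(max_iter):
        g = gradient(x, b)
        g_norm = norm(g)
        if g_norm <= tol:
            break
        s = truncated_cg(x, g, eta)
        step = 1.0
        # Backtrack until the gradient norm, used here as merit function, decreases.
        while norm(gradient([xi + step * si for xi, si in zip(x, s)], b)) > (
            1.0 - 1e-4 * step
        ) * g_norm:
            step *= 0.5
        x = [xi + step * si for xi, si in zip(x, s)]
    return x


b = [1.0 + 0.1 * i for i in range(10)]
x_star = inexact_newton(b)
```

A larger eta makes each outer iteration cheaper but slows the outer convergence; that trade-off is the flexibility in cost per iteration that the abstract refers to.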

Frank E. CurtisIndustrial and Systems EngineeringLehigh [email protected]
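
The basic inexact Newton idea for the unconstrained case can be sketched as follows. This is only an illustration under stated assumptions, not the constrained primal-dual method of the talk: the test function, the forcing-term rule `eta = min(0.5, sqrt(||g||))`, and all function names are hypothetical choices. Each Newton system is solved by conjugate gradients only until the linear residual falls below `eta` times the gradient norm, and a backtracking search on the gradient norm safeguards the step.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


def laplacian(v):
    """Apply the tridiagonal matrix L = tridiag(-1, 2, -1)."""
    n = len(v)
    return [2.0 * v[i] - (v[i - 1] if i > 0 else 0.0) - (v[i + 1] if i < n - 1 else 0.0)
            for i in range(n)]


def grad(x, b):
    """Gradient of the assumed test function f(x) = sum(x^4/4 + x^2/2) + x.L.x/2 - b.x."""
    return [xi ** 3 + xi + li - bi for xi, li, bi in zip(x, laplacian(x), b)]


def hess_vec(x, v):
    """Hessian-vector product; the Hessian diag(3x^2 + 1) + L is positive definite."""
    return [(3.0 * xi * xi + 1.0) * vi + li for xi, vi, li in zip(x, v, laplacian(v))]


def truncated_cg(apply_h, rhs, rel_tol, max_iter):
    """Conjugate gradients stopped once the residual is below rel_tol * ||rhs||."""
    s = [0.0] * len(rhs)
    r = list(rhs)
    p = list(r)
    rr = dot(r, r)
    target = rel_tol * math.sqrt(rr)
    for _ in range(max_iter):
        if math.sqrt(rr) <= target:
            break
        hp = apply_h(p)
        alpha = rr / dot(p, hp)
        s = [si + alpha * pi for si, pi in zip(s, p)]
        r = [ri - alpha * hi for ri, hi in zip(r, hp)]
        rr_new = dot(r, r)
        p = [ri + (rr_new / rr) * pi for ri, pi in zip(r, p)]
        rr = rr_new
    return s


def inexact_newton(b, tol=1e-10, max_outer=50):
    x = [0.0] * len(b)
    for k in range(max_outer):
        g = grad(x, b)
        gnorm = norm(g)
        if gnorm <= tol:
            return x, k
        eta = min(0.5, math.sqrt(gnorm))  # forcing term: looser far away, tighter near a solution
        s = truncated_cg(lambda v: hess_vec(x, v), [-gi for gi in g], eta, 10 * len(b))
        t = 1.0
        while True:
            x_trial = [xi + t * si for xi, si in zip(x, s)]
            # Sufficient decrease of the gradient norm, or give up shrinking the step.
            if norm(grad(x_trial, b)) <= (1.0 - 1e-4 * t * (1.0 - eta)) * gnorm or t < 1e-8:
                break
            t *= 0.5
        x = x_trial
    return x, max_outer


b = [1.0] * 20
x_star, iters = inexact_newton(b)
```

The forcing term controls the trade-off named in the abstract: a loose tolerance keeps early iterations cheap, while tightening it near the solution recovers fast local convergence.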

MS136

A Stochastic Newton Method for Large-scale Statistical Inverse Problems with Application to Geophysical Inverse Problems

We present a Langevin-accelerated MCMC method for sampling high-dimensional, expensive-to-evaluate probability densities that characterize the solution to PDE-based statistical inverse problems. The method builds on previous work in Metropolized Langevin dynamics, which uses gradient information to guide the sampling in useful directions, improving acceptance probabilities and convergence rates. We extend the Langevin idea to exploit local Hessian information, leading to what is effectively a stochastic version of Newton’s method. We apply the method to the Bayesian solution of a seismic inverse problem, for which we observe convergence several orders of magnitude faster than with a reference black-box MCMC method.
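
A one-dimensional sketch of a Hessian-preconditioned Metropolized Langevin step is given below. It assumes a target density proportional to exp(-V(x)) with V''(x) > 0 everywhere; the function name `stochastic_newton_mcmc` and the Gaussian test target are hypothetical, and the sketch is not the authors' PDE-based implementation. The proposal is a Gaussian centred at the Newton iterate x - V'(x)/V''(x) with variance 1/V''(x), followed by a standard Metropolis-Hastings accept/reject test.

```python
import math
import random


def stochastic_newton_mcmc(V, dV, d2V, x0, n_samples, rng):
    """Sample from a density proportional to exp(-V) using Newton-type proposals."""

    def log_q(y, x):
        # Log density at y of the proposal N(x - V'(x)/V''(x), 1/V''(x)).
        h = d2V(x)
        mean = x - dV(x) / h
        return 0.5 * math.log(h / (2.0 * math.pi)) - 0.5 * h * (y - mean) ** 2

    x = x0
    samples = []
    accepted = 0
    for _ in range(n_samples):
        h = d2V(x)
        y = x - dV(x) / h + rng.gauss(0.0, 1.0) / math.sqrt(h)
        log_alpha = (-V(y) + log_q(x, y)) - (-V(x) + log_q(y, x))
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < log_alpha:
            x = y
            accepted += 1
        samples.append(x)
    return samples, accepted / n_samples


# Hypothetical Gaussian target with mean 1 and standard deviation 2.
V = lambda x: (x - 1.0) ** 2 / 8.0
dV = lambda x: (x - 1.0) / 4.0
d2V = lambda x: 0.25

samples, rate = stochastic_newton_mcmc(V, dV, d2V, x0=5.0, n_samples=4000,
                                       rng=random.Random(0))
sample_mean = sum(samples) / len(samples)
```

For a Gaussian target the proposal coincides with the target itself, so essentially every proposal is accepted; for non-Gaussian targets the local Hessian only approximates the curvature and the accept/reject step corrects for the mismatch.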





MS136

Proper Orthogonal Decomposition (POD) Basis Design by Means of Non-linear Inversion

A framework for the design of Proper Orthogonal Decomposition (POD) bases is introduced in this study. While basis selection for PDE-based problems has been extensively studied, limited research has addressed the problem from a non-linear inversion standpoint. In this study, we formulate the basis design problem in terms of stochastic optimization, based on the Stochastic Sampling Average (SSA) approach. We demonstrate the utility and effectiveness of the proposed approach over a set of PDE-based problems.

Lior HoreshBusiness Analytics and Mathematical SciencesIBM TJ Watson Research [email protected]


Marta D’EliaMathCS, Emory [email protected]

MS136

A Control-theoretic Approach to Inference for Prediction

Design of systems with uncertain parameters is ubiquitous in engineering. Many approaches proceed sequentially with experimentation, inference, and design. These methods may waste resources by conducting experiments that do not contribute significantly to the final control objective or constraints. We propose a goal-oriented, control-theoretic approach to inference for prediction. The algorithm is demonstrated on a 2-D contamination mitigation problem.

Chad E. [email protected]

Karen WillcoxMassachusetts Institute of Technology

[email protected]

MS137

Career Preparation of Undergraduate and Graduate Students through an Interdisciplinary Consulting Approach

It is vital for mathematicians and statisticians to interact effectively with colleagues from other fields. We have implemented a training approach based on consulting with clients from application areas, which gives our students demonstrated experience in this skill. At the graduate level, we use a consulting center, and at the undergraduate level, we have created an REU Site using the same philosophy. This talk will show both the ideas and their implementation in more detail.


Nagaraj NeerchalDepartment of Mathematics and StatisticsUniversity of Maryland, Baltimore [email protected]

MS137

Undergraduate Foundations of Applied Mathematics

In this talk we explore some of the fundamental concepts in applied mathematics and demonstrate how they can be taught to undergraduates. What should be the foundational core of applied mathematics at that level? We are motivated by the desire to create an undergraduate degree program in applied and computational mathematics that will provide students with a strong background in computation, without sacrificing the mathematical rigor of a traditional mathematics degree. Audience participation is encouraged.

Jeffrey HumpherysBrigham Young [email protected]

MS137

Undergraduate Research Experiences in Computational Mathematics


Eric J. KostelichArizona State UniversityDepartment of [email protected]

MS137

Undergraduate CSE Programs in the U.S.

This talk will discuss the current state of Computational Science programs in the US (and abroad) with reference to the updated SIAM Undergraduate CSE report. The new SIAM Undergraduate Research Online publication will also be presented as a suitable outlet for publishing quality undergraduate research in applied and computational mathematics.

Peter R. TurnerClarkson UniversitySchool of Arts & [email protected]

MS138

Compressive Sensing: A New Approach to Image Acquisition

Imaging systems are under increasing pressure to accommodate larger data sets. The foundation of today’s digital data acquisition systems is the Shannon/Nyquist sampling theorem. However, the physical limitations and inherently high Nyquist rates of current systems impose a performance barrier. We will give an overview of our work on compressive sensing, which digitizes analog signals via measurements using more general, even random, test functions. The implications of compressive sensing are promising for medical image reconstruction.

Richard G. BaraniukRice UniversityElectrical and Computer Engineering [email protected]

MS138

Hardware Acceleration of Medical Imaging Applications

Medical imaging applications often involve a pipeline of algorithms that include image restoration, registration, and analysis (e.g., segmentation, feature extraction). However, most algorithms are constrained to the research environment due to a lack of computational power. This talk discusses the challenges of deploying these algorithms in clinical practice and focuses on how domain-specific computing accelerates pipeline algorithms using a customizable heterogeneous platform that is tuned to the specific needs of each image processing algorithm.

William Hsu, Gene Auyeung, Sugeun Chae, Yi Zou, AlexBuiUniversity of California, Los [email protected], [email protected],[email protected], [email protected], [email protected]

MS138

Numerical Challenges in Constrained Image Registration

This talk introduces image registration and explains the necessity of application-conforming constraints such as volume and mass preservation. The talk presents a variational framework for constrained image registration and its numerical implementation. The presented implementation is based on a sequence of coupled discretizations, where for each discretization a finite dimensional constrained optimization problem is to be solved. Central challenges are the size of the problems, which can reach billions of unknowns and constraints, and the nature of the non-linear constraints.

Jan ModersitzkiUniversity of LubeckInstitute of Mathematics and Image [email protected]

MS138

Modeling Image Processing Algorithms Using the Concurrent Collections Coordination Language

Concurrent Collections (CnC) is a graphical parallel coordination language in which a node corresponds to a step, data or item collection, and a directed edge corresponds to a put, get, or step-creation operation. A CnC program execution is guaranteed to be deterministic and data-race-free. We summarize our experience with modeling image-processing algorithms using CnC, and discuss how CnC can be used to naturally uncover multiple levels of parallelism intrinsic to an algorithm or application.

Vivek SarkarRice [email protected]

MS138

Expectation Maximization and Total Variation Based Model for Computed Tomography Reconstruction from Undersampled Data

Computerized tomography plays an important role in medical imaging. However, the relatively high radiation dose of CT increases radiation exposure in the population. We propose a method combining expectation maximization and total variation regularization, called EM+TV, to preserve the quality of the image while reducing the exposure. The method reconstructs the image from undersampled views while still providing good results, thereby reducing the overall radiation dose.

Ming Yan, Luminita A. VeseUniversity of California, Los AngelesDepartment of [email protected], [email protected]

MS139

A High Order Cell Centred Lagrangian Godunov Scheme for Shock Hydrodynamics

A new cell centred Lagrangian Godunov scheme for shock hydrodynamics is proposed. The new method uses a transient dual grid to define the motion of the vertices in a way that is consistent with the geometric conservation law. The extension of the scheme to second order accuracy in space is considered and an initial assessment of the performance of the method is made by comparison with results obtained with a compatible staggered grid scheme.

Andrew J. BarlowAtomic Weapons Establishment, United [email protected]

MS139

Staggered Lagrangian Discretization based on Cell-centered Riemann Solver - A Bridge from Staggered to Cell-centered Lagrangian Schemes

In this presentation we provide a new formalism to bridge well-known staggered Lagrangian and newly developed cell-centered Lagrangian schemes. We will present this new approach, which leads to a formally second-order accurate scheme in space and time. A 3D implementation of this scheme will provide numerical results on classical test cases of hydrodynamics.

Raphael Loubere

University of Toulouse, [email protected]


Pavel VachalCzech Technical University in [email protected]

MS139

A General Formalism to Derive Cell-centered Schemes for Two-dimensional Lagrangian Hydrodynamics on Unstructured Grids

The aim of this work is to develop a general formalism to derive cell-centered schemes for 2D Lagrangian hydrodynamics on unstructured grids that meet the compatibility GCL requirement. The high-order extension of this general cell-centered scheme is constructed using the two-dimensional extension of the Generalized Riemann Problem methodology in its acoustic version. Various numerical results on representative compressible fluid flows are presented to demonstrate the accuracy and the robustness of these schemes.


MS139

Issues in Designing a 2D Cylindrically Symmetric Conservative Lagrange Hydro Scheme

Work by Burton, Caramana, et al. developed practical energy conserving Lagrangian hydrodynamics schemes in the late 1990s. The Wilkins area-weighted discretization, which preserves spherical symmetry in cylindrical geometry, can be made energy conserving, but the change requires Lagrangian corner masses, which in turn raise some issues about the definitions of corner volumes. We present one such scheme and some test results, and discuss the impact of our definitions on other packages in a multi-physics code.

Douglas S. MillerLawrence Livermore Nat. [email protected]

MS140

Energy Stable Space-Time Discontinuous Galerkin Approximations of the 2-Fluid Plasma Equations

Energy stable variants of the space-time discontinuous Galerkin (DG) finite element method are developed that approximate the ideal two-fluid plasma equations. Using standard symmetrization techniques, the two-fluid plasma equations are symmetrized via a convex entropy function and the introduction of entropy variables. Analysis results for the DG formulation assuming general unstructured meshes in space and arbitrary order polynomial approximation include

• a cell entropy bound for the semi-discrete formulation,

• a global two-sided entropy bound and L2 stability for the space-time formulation,

• a modification of the DG method with provable stability when entropy variables are not used.

Numerical results of the 2-fluid system including GEM magnetic reconnection are presented, verifying the analysis and assessing properties of the formulation.

Timothy J. BarthNASA Ames Research [email protected]


MS140

Discontinuous Galerkin Method Applied to the Multi-fluid Plasma Model

The multi-fluid plasma model only assumes local thermodynamic equilibrium within each fluid. Physical parameters indicate the importance of the two-fluid effects. The algorithm implements a discontinuous Galerkin method with approximate Riemann fluxes for the fluids and electromagnetic fields. The two-fluid plasma model has time scales on the order of the electron and ion cyclotron frequencies, the electron and ion plasma frequencies, the electron and ion sound speeds, and the speed of light.

Uri ShumlakAerospace and Energetics Research ProgramUniversity of [email protected]

MS140

Simulation Challenges in Industrial Plasma Applications


Manuel TorrilhonSchinkelstrasse 2RWTH Aachen [email protected]

MS140

Robust and Efficient Schemes for Highly Compressible Magnetohydrodynamics with Applications to Interstellar Turbulence and Stratified Magneto-atmospheres

Ideal magnetohydrodynamics is a widely used fluid model for astrophysical plasma. The dynamics often contain shocks, large density fluctuations and vast scale ranges, making numerically stable simulations challenging. We present stable finite volume schemes based on techniques of entropy stability, positivity and well-balancing. Simulation examples include chromospheric waves and interstellar turbulence.

Knut WaaganCenter for Scientific Computation and MathematicalModelingUniversity of [email protected]

MS141

Analysis of Nudging as a Method for Dynamic Data Assimilation

While nudging as a method for dynamic data assimilation has been around since the early 1970s, the analysis of its effectiveness is largely based on empirical testing. In this talk, for the first time, using a simple linear scalar model, we prove that the nudged forecast is closer to the data than the non-nudged model. In particular, we derive a closed expression for the optimal value for the nudging parameter.

S. LakshmivarahanUniversity of [email protected]
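
The effect of nudging on a linear scalar model can be sketched as follows. This is a toy illustration only: the model x_{k+1} = a x_k, the gain value 0.5, the noise-free observations, and the function names are all assumptions, and no attempt is made to reproduce the optimal nudging parameter derived in the talk.

```python
def forecast(a, x0, n_steps, obs=None, gain=0.0):
    """Run x_{k+1} = a * x_k + gain * (obs_k - x_k); gain = 0 gives the free run."""
    xs = [x0]
    for k in range(n_steps):
        x = xs[-1]
        innovation = obs[k] - x if obs is not None else 0.0
        xs.append(a * x + gain * innovation)
    return xs


def misfit(xs, obs):
    """Sum of squared differences between a forecast and the observations."""
    return sum((x - z) ** 2 for x, z in zip(xs, obs))


a, n_steps = 0.95, 30
truth = forecast(a, 1.0, n_steps)   # reference trajectory
obs = truth                         # assumption: noise-free observations of the truth
free_run = forecast(a, 2.0, n_steps)                     # wrong initial condition, no nudging
nudged = forecast(a, 2.0, n_steps, obs=obs, gain=0.5)    # same start, nudged toward obs

misfit_free = misfit(free_run, obs)
misfit_nudged = misfit(nudged, obs)
```

In this setting the forecast error obeys e_{k+1} = (a - gain) e_k, so any gain with |a - gain| < |a| makes the nudged trajectory approach the data faster than the free run.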


MS141

Forward Sensitivity Approach to Dynamic Data Assimilation



Ramesh VelloreDesert Research [email protected]

MS141

Inverse Modeling of Black Carbon and NOx Emissions over North America Using CMAQ: Influence of Data Sets and Types Used


Armistead RussellGeorgia Institute of [email protected]

MS141

New Computational Tools for Chemical Data Assimilation


Adrian SanduVirginia Polytechnic Institute andState [email protected]

MS142

Variance Reduction as a Multiscale Tool

Fluctuations are one of the defining characteristics of molecular simulation methods. Although they are sometimes responsible for interesting physics, in many cases of practical interest they amount to an inconvenience, due to the considerable computational cost required for their elimination through statistical sampling. In this talk we present variance reduction methods for drastically reducing the statistical uncertainty associated with Monte Carlo methods for solving the Boltzmann transport equation. The variance reduction, achieved by simulating the deviation from equilibrium, provides a speedup which increases quadratically as the deviation from equilibrium goes to zero, thus enabling the simulation of arbitrarily small deviations from equilibrium. In addition to enabling classical hybrid methods based on domain decomposition by facilitating the coupling process, these control variate methods can themselves be thought of as multiscale methods in which the continuum/molecular decomposition occurs throughout the computational domain, namely by having part of the solution described by a deterministic component (equilibrium) and the remainder described by a molecular simulation.

Nicolas HadjiconstantinouMassachusetts Institute of [email protected]
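
The control variate idea behind simulating only the deviation from a known part can be illustrated with a generic Monte Carlo toy problem. The sketch below is not a Boltzmann solver: the integrand x^2 + eps*cos(x) with X ~ N(0, 1), the parameter `eps`, and the function names are hypothetical. The "equilibrium" part E[X^2] = 1 is treated analytically and only the small deviation eps*cos(X) is sampled.

```python
import math
import random


def mean_and_variance(values):
    m = sum(values) / len(values)
    var = sum((v - m) ** 2 for v in values) / (len(values) - 1)
    return m, var


def compare_estimators(eps, n_samples, seed=0):
    """Estimate E[X^2 + eps*cos(X)] for X ~ N(0, 1) with and without a control variate."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Plain Monte Carlo: sample the full quantity.
    plain_mean, plain_var = mean_and_variance([x * x + eps * math.cos(x) for x in xs])
    # Control variate: E[X^2] = 1 is known exactly, so sample only the deviation.
    dev_mean, dev_var = mean_and_variance([eps * math.cos(x) for x in xs])
    return plain_mean, plain_var, 1.0 + dev_mean, dev_var


eps = 0.01
plain_mean, plain_var, cv_mean, cv_var = compare_estimators(eps, n_samples=2000)
exact = 1.0 + eps * math.exp(-0.5)  # E[cos X] = exp(-1/2) for a standard normal X
```

The per-sample variance of the deviational estimator scales with eps squared while that of the plain estimator does not, which is consistent with a speedup growing quadratically as the deviation shrinks.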

MS142

Accelerated Kinetic Monte Carlo Methods: Hierarchical Parallel Algorithms and Coarse-graining

In this talk we present two intimately related approaches to speeding up molecular simulations via Monte Carlo simulations. First, we discuss coarse-graining algorithms for systems with complex, and often competing, particle interactions, both in the equilibrium and non-equilibrium settings, which rely on multilevel sampling and communication. Second, we address mathematical, numerical and algorithmic issues arising in the parallelization of spatially distributed Kinetic Monte Carlo simulations, by developing a new hierarchical operator splitting of the underlying high-dimensional generator, as a means of decomposing efficiently and systematically the computational load and communication between multiple processors. The common theme in both methods is the desire to identify and decompose the particle system into components that communicate minimally, so that local information can be either described by suitable coarse variables (coarse-graining) or computed locally on individual processors within a parallel architecture.

Markos A. KatsoulakisUMass, AmherstDept of [email protected]

MS142

Multiscale Computation of Rolie-Poly Flow


Paula A. VasquezDepartment of MathematicsUniversity of North Carolina at Chapel [email protected]

MS143

A Full Navier-Stokes Solver on Irregular Domains coupled with a Poisson-Boltzmann Solver with Neumann or Robin Boundary Conditions on Non-Graded Adaptive Grid

A second-order solver for the full Navier-Stokes equations coupled with the Poisson-Boltzmann equation on irregular domains is introduced. The Poisson-Boltzmann solver is on Quadtree/Octree grids. The fluid solver is an improved projection method where Neumann boundary conditions for pressure are easily enforced at the irregular domain. The projection matrix is positive definite. The Poisson-Boltzmann matrix is an invertible M-matrix, leading to a simple and robust second-order accurate solver.

Asdis HelgadottirDept. Mechanical [email protected]

MS143

Computational Modeling of Porous Supercapacitors Using Non-graded Adaptive Cartesian Grids

An efficient finite difference discretization of the non-linear Poisson-Boltzmann equation is presented for complex geometries. The level-set method is adopted to represent the object while Octree/Quadtree data structures are used to generate adaptive grids, required in the singular limit of thin double layers. Several numerical experiments indicate second order accuracy in L1 and L∞ norms. Finally, we use our method to model porous geometries as electrochemical supercapacitors.

Mohammad MirzadehDept. Mechanical [email protected]

MS143

A Level Set Approach for Diffusion and Stefan Problems with Robin Boundary Conditions on Non-graded Adaptive Grids


Joe PapacDept. Mechanical [email protected]

MS143

Multigrid Method on Octree Data Structures

We present a new numerical method for solving the equations of linear elasticity on irregular domains in two and three spatial dimensions. We combine a finite volume and a finite difference approach to derive second-order discretizations. The domain’s boundary is represented implicitly by a level set function. Our model is a sharp model in the sense that we solve for the solution inside the domain only, without smearing of the solution near the interface. The approach is well suited for handling shape optimization problems, in which the boundary of the domain is evolving in time with a velocity that depends on the solution of the elasticity problem.

Maxime TheillardDept. Mechanical EngineeringUCSBmaxime [email protected]

MS144

A High-order Finite-volume Method for Hyperbolic Conservation Laws on Locally-refined Grids

We present a fourth-order-accurate finite-volume method for solving time-dependent hyperbolic systems of conservation laws on Cartesian grids with multiple levels of refinement. From coarser to finer levels, we interpolate in time using the fourth-order Runge–Kutta method, and in space by solving a least-squares problem over a neighborhood of each target cell. The method uses slope limiters, slope flattening, and artificial viscosity. We show results demonstrating fourth-order convergence for some smooth problems in 2D and 3D.

Peter Mccorquodale, Phillip ColellaLawrence Berkeley National [email protected], [email protected]

MS144

Higher Order Finite Volume Methods on Mapped Grids, with Application to Gyrokinetic Plasma Modeling

We describe the development and application of high-order, mapped-grid, finite-volume methods for the solution of gyrokinetic models of magnetically confined fusion plasmas. Phase space advection of plasma species distribution functions and solution of the associated field equations are performed using fourth-order discretizations facilitated by a general formalism. Coordinate mapping is employed to solve the gyrokinetic system on locally rectangular grids aligned with magnetic flux surfaces. Performance of the approach on benchmark problems will be demonstrated.

Milo DorrCenter for Applied Scientific ComputingLawrence Livermore National [email protected]

Ronald Cohen, Mikhail DorfLawrence Livermore National [email protected], [email protected]


Phillip Colella, Daniel MartinLawrence Berkeley National [email protected], [email protected]

MS144

Another Look at H-box Methods

We consider an explicit finite volume method for the approximation of hyperbolic conservation laws on embedded boundary grids. By using a method of lines approach with a strong stability preserving Runge-Kutta method in time, the complexity of our previously introduced h-box method is greatly reduced. Our method is second order accurate and stable for time steps that are appropriate for the regular part of the computational mesh.

Christiane HelzelDepartment of [email protected]

Marsha BergerCourant Institute of Mathematical SciencesNew York [email protected]
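
The method of lines combined with a strong stability preserving (SSP) Runge-Kutta scheme can be sketched on a regular periodic grid as below. This is an assumption-laden toy, not the h-box method: there is no embedded boundary, the spatial operator is plain first-order upwinding for u_t + u_x = 0, and the names `upwind_rhs` and `ssp_rk2_step` are hypothetical. The two-stage SSP scheme is written as a convex combination of forward Euler steps, which is what lets it inherit the total variation bound of forward Euler.

```python
def upwind_rhs(u, dx):
    """Semi-discrete right-hand side for u_t + u_x = 0 with periodic boundaries."""
    # u[i - 1] wraps around to the last cell for i = 0, giving periodicity.
    return [-(u[i] - u[i - 1]) / dx for i in range(len(u))]


def ssp_rk2_step(u, dt, dx):
    """One step of the two-stage, second-order SSP Runge-Kutta scheme."""
    k1 = upwind_rhs(u, dx)
    u1 = [ui + dt * ki for ui, ki in zip(u, k1)]          # forward Euler stage
    k2 = upwind_rhs(u1, dx)
    # Average of the old state and a second forward Euler step from u1.
    return [0.5 * ui + 0.5 * (u1i + dt * ki) for ui, u1i, ki in zip(u, u1, k2)]


def total_variation(u):
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))


n = 100
dx = 1.0 / n
dt = 0.8 * dx  # CFL number 0.8, within the forward Euler stability limit
u0 = [1.0 if 0.25 <= (i + 0.5) * dx < 0.5 else 0.0 for i in range(n)]  # square pulse
u = list(u0)
for _ in range(50):
    u = ssp_rk2_step(u, dt, dx)
```

Because each stage is a forward Euler step under the same time step restriction, the total variation of the square pulse does not grow and the cell sum is conserved.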

MS144

Fourth-order Finite Volume Cut Cell Approach for Elliptic and Parabolic PDEs


Hans JohansenLawrence Berkeley National LaboratoryComputational Research [email protected]

MS145

Microfluidic Simulation of Dewetting over Solid Plane with Micro-structured Array

DNA molecules in a solution can be immobilized and stretched into a highly ordered array on a solid surface containing micro-features by the molecular combing technique. In this paper, the microfluidic dynamics of dewetting over substrates covered with microwell and micropillar arrays was simulated based on a front capturing approach with the deforming body-fitted grid system. The simulation results provide insights for explaining the stretching, immobilizing and patterning of DNA nanowires observed in the experiments.

Chun-Hsun LinChung Yuan [email protected]

Jingjiao GuanFlorida State [email protected]

Shiu-Wu ChauNational Taiwan University of Science and [email protected]

L. James LeeOhio State [email protected]

MS145

Lattice Boltzmann Simulations of Capillary Flows in Patterned Channels

Recent years have seen rapid progress in the technology of fabricating channels at micron length scales. As narrower channels are used, to conserve space and reagents, surface effects will have an increasing influence on the fluid flow. In particular, it may be possible to exploit surface patterning to control the flow within the channels. In this talk, I will discuss how lattice Boltzmann simulations may be used to guide smart designs of such channels. Firstly, I will show how the capillary filling of microchannels is affected by posts or ridges on the sides of the channels. Secondly, I will consider the equilibrium behaviour and dynamics of liquid drops on superhydrophobic surfaces patterned with sawtooth ridges or posts. On these surfaces, liquid drops exhibit complex anisotropic behaviors. Interestingly, this observation allows us to interpret recent experiments describing the motion of water drops on butterfly wings.

Halim KusumaatmajaMax Planck Institute of Colloids and InterfacesGolm, [email protected]

Julia YeomansThe Rudolf Peierls Centre for Theoretical PhysicsOxford, [email protected]

MS145

Lattice Boltzmann Modeling of Microchannel Flow

We present the lattice Boltzmann equation (LBE) with multiple relaxation times to simulate pressure-driven flows in a long microchannel. We use the first-order slip boundary conditions at the walls. The LBE results are validated against those of the compressible Navier-Stokes equations, the information-preservation direct simulation Monte Carlo (IP-DSMC) and DSMC methods. The LBE results agree very well with IP-DSMC and DSMC results in the slip-flow regime, but not in the transition-flow regime, in part due to the inadequacy of the slip velocity model. We also compare the LBE simulations of high-Kn flows with molecular dynamics simulations.

Li-Shi LuoOld Dominion [email protected]

MS145

Monte Carlo Study on Water Cluster Formation and Degradation Mechanisms in PEMFC GDLs

The water household within the gas diffusion layer (GDL) plays a prominent role both for the performance and degradation of PEM fuel cells. To improve understanding of the significant processes and investigate their dependence on the relevant parameters, a Monte Carlo (MC) model working on the μm scale has been developed to simulate the water distribution within a GDL. A description of the model will be given together with results highlighting water cluster analysis capabilities.

Florian WilhelmZentrum fur Sonnenenergie- und Wasserstoff-ForschungBaden-Wurttemberg, Helmholtzstraße 8, 89081 Ulm,[email protected]

Katrin Seidenberger, Tanja Schmitt, Joachim ScholtaZentrum fur Sonnenenergie- und [email protected], [email protected],[email protected]

MS146

Uncertainty Quantification for Fluid Mixing

Uncertainty Quantification (UQ) for fluid mixing has to start with length scales for observation: macro, meso and micro, each with its own UQ requirements. New results are presented for each. For the micro observables, recent theories argue that convergence in the LES regime should be governed by pdfs (Young measures) which satisfy the Euler equation. VV results for Rayleigh-Taylor mixing, jet breakup and chemical processing are presented to support this point of view.

James G. GlimmStony Brook UniversityBrookhaven National [email protected] or [email protected]

MS146

Probabilistic Reduced-order Models for Uncertainty Quantification

This paper is concerned with the development of probabilistic reduced-order models for complex systems in view of applications in uncertainty quantification. We consider systems characterized by a high-dimensional vector of input parameters which are random variables. Given a density for the input uncertainties, the proposed reduced-order models attempt to learn the induced density on the solution vector. We discuss hierarchical mixture models and parallelizable, adaptive inference strategies for learning such probabilistic models.

Phaedon-Stelios KoutsourelakisSchool of Civil and Environmental Engineering &Center for Applied Mathematics, Cornell [email protected]

MS146

Generalised Polynomial Chaos for Differential Algebraic Equations with Random Parameters

We consider dynamical systems consisting of differential algebraic equations, where some physical parameters include uncertainties. Thus the parameters are replaced by random variables. We resolve the stochastic model by the generalised polynomial chaos. A stochastic Galerkin method yields a larger coupled system of differential algebraic equations, which is satisfied by an approximation of the unknown random process. We analyse the properties of the larger coupled system. In particular, the index of the system of differential algebraic equations is investigated. Numerical simulations of an illustrative example are presented.

Roland PulchUniversity of [email protected]

MS146

A Framework for Managing the Combined Effect of Aleatoric Uncertainty, Epistemic Uncertainty, and Numerical Error


Gianluca IaccarinoStanford [email protected]

Jeroen WitteveenStanford UniversityCenter for Turbulence [email protected]

MS147




MS147

An Effective Method for Parameter Estimation with PDE Constraints with Multiple Right Hand Sides

Many parameter estimation problems involve parameter-dependent PDEs with multiple right hand sides. The computational cost and memory requirements of such problems increase linearly with the number of right hand sides. For many applications this is the main bottleneck of the computation. In this paper we show that problems with multiple right hand sides can be reformulated as stochastic optimization problems that are much cheaper to solve. We discuss the solution methodology and use the direct current resistivity as a model problem to show the effectiveness of our approach.
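
One common way to turn a sum over many right hand sides into a stochastic objective rests on the identity E_w ||R w||^2 = ||R||_F^2 for random vectors w with independent ±1 entries, where column j of R is the residual for right hand side j. The sketch below checks this identity exactly on a small matrix by enumerating all sign vectors. It is offered as an assumed illustration of the general idea, not as the authors' formulation; the residual matrix `R` and the function names are hypothetical. In a PDE setting, R w would correspond to a single solve with the combined source Q w instead of one solve per column of Q.

```python
import itertools
import random


def frobenius_sq(R):
    """Sum over all right hand sides of the squared residual norms."""
    return sum(v * v for row in R for v in row)


def mat_vec(R, w):
    return [sum(rij * wj for rij, wj in zip(row, w)) for row in R]


def expected_sq_norm(R):
    """Exact expectation of ||R w||^2 over all sign vectors w in {-1, +1}^k."""
    k = len(R[0])
    total = 0.0
    for signs in itertools.product((-1.0, 1.0), repeat=k):
        total += sum(v * v for v in mat_vec(R, signs))
    return total / 2 ** k


def sampled_sq_norm(R, n_samples, rng):
    """Monte Carlo estimate of the same quantity from a few random sign vectors."""
    k = len(R[0])
    total = 0.0
    for _ in range(n_samples):
        w = [rng.choice((-1.0, 1.0)) for _ in range(k)]
        total += sum(v * v for v in mat_vec(R, w))
    return total / n_samples


# Hypothetical residual matrix: 3 data points, 4 right hand sides.
R = [[1.0, -2.0, 0.5, 3.0],
     [0.0, 1.5, -1.0, 2.0],
     [2.0, 0.0, 1.0, -0.5]]
exact = frobenius_sq(R)
averaged = expected_sq_norm(R)
estimate = sampled_sq_norm(R, n_samples=8, rng=random.Random(0))
```

The cross terms between different right hand sides cancel in expectation, so a handful of random combinations gives an unbiased, if noisy, estimate of the full misfit.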


MS147

“Map-based” Bayesian Inference for PDE-constrained Inverse Problems

Computational expense and convergence challenges associated with MCMC can hinder the application of Bayesian methods to large-scale inverse problems. We present a new technique that entirely avoids Markov chain-based simulation, by constructing a map under which the posterior becomes the pushforward measure of the prior. Existence and uniqueness of a suitable map are established by casting our algorithm in the context of optimal transport theory. Functional descriptions of the posterior distribution also facilitate subsequent stages of uncertainty propagation, such as posterior prediction and sequential analysis.

Youssef M. Marzouk, Tarek El MoselhyMassachusetts Institute of [email protected], [email protected]

MS147

Compressive Sensing and Underwater Acoustics

We will discuss the application of the theory of compressive sensing to two problems encountered in acoustics: source localization/tracking and communication over an uncertain channel. For the first problem, we show how randomized group testing can significantly reduce the number of acoustic simulations needed to locate and then track a source. For the second problem, we treat the scenario of communicating over an unknown channel as a blind deconvolution problem, and show that if the (unknown) channel is sparse and a random encoding scheme with a small amount of redundancy is used by the source, then the receiver can discover both the message and the channel response by solving a well-posed optimization program.

William Mantzel, Salman AsifGeorgia Institute of [email protected], [email protected]

Justin RombergSchool of ECEGeorgia [email protected]

Karim SabraGeorgia Institute of [email protected]

MS148

Reproducible Research, Lessons from the Madagascar Project

The Madagascar open-source project is a community effort, which implements reproducible research practices, as envisioned by Jon Claerbout. More than 100 geophysical papers have been published, together with open software code and data, and are maintained by the community. We have learned that continuous maintenance and repeated testing are necessary for enabling long-term reproducibility. As noted by Claerbout and others, the main beneficiary of the reproducible research discipline is the author.

Sergey FomelUniversity of Texas at [email protected]

MS148

Top 10 Reasons to NOT Share your Code and Why you Should Anyway

The research codes used to produce results (tables, plots, etc.) in publications are rarely made available, limiting the readers’ ability to understand the algorithms that are actually implemented. Many objections are typically raised to doing so. Although there are some valid concerns, my view is that there are good counter-arguments or ways to address most of these issues. In this talk I will discuss what may be the top 10 reasons.


MS148

The Challenge of Reproducible Research in the Computer Age

Computing is increasingly central to the practice of mathematical and scientific research. This has provided many new opportunities as well as new challenges. In particular, modern scientific computing has strained the ability of researchers to reproduce their own (as well as their colleagues’) work. In this talk, I will outline some of the obstacles to reproducible research as well as some potential solutions and opportunities.

Kenneth J. MillmanUniversity of California, [email protected]

MS148

Intellectual Contributions to Digitized Science: Implementing the Scientific Method

Our stock of scientific knowledge is now accumulating in digital form, and the underlying reasoning often resides in the code that generated the findings, which is frequently never published. The case for open data is being made, but open code must be recognized as equally important in a principled approach, that of reproducibility of computational results. Issues involved with code and data disclosure are presented, along with possible solutions.

Victoria StoddenColumbia University

[email protected]

MS149

Entropy Viscosity for Lagrangian Hydrodynamics

A new technique for approximating nonlinear conservation equations is described (entropy viscosity method). The novelty is that a nonlinear viscosity based on the local size of an entropy production is added to the numerical discretization at hand. The methodology initially introduced in Eulerian coordinates is adapted to Lagrangian Hydrodynamics. The methodology is numerically illustrated on standard benchmark problems.

Jean-Luc GuermondDepartment of MathematicsTexas A&M, [email protected]

MS149

A Mimetic Tensor Artificial Viscosity Method for Arbitrary Polygonal and Polyhedral Meshes

We construct a new mimetic tensor artificial viscosity on general polygonal and polyhedral meshes. The tensor artificial viscosity is based on discretization of coordinate invariant operators, divergence of a tensor and gradient of a vector. We consider both non-symmetric and symmetric forms of the tensor artificial viscosity. We demonstrate performance of the new viscosity for the Noh implosion, Sedov explosion and Saltzman piston problems on a set of various meshes.

Konstantin Lipnikov, Mikhail ShashkovLos Alamos National [email protected], [email protected]

MS149

A Corner and Dual Mesh ALE Remapping Algorithm for use with the Compatible Energy Lagrangian Discretization

The energy conserving Lagrangian hydrodynamic discretization introduced by Caramana et al. in the late 90’s has proven to be quite successful. However, the introduction of subzonal Lagrangian elements complicates ALE remapping algorithms for such methods, both in remapping the subzonal masses as well as properties on the dual mesh. We will discuss new ALE algorithms for computing such remapped properties which obey important goals such as consistency with the primary mesh remapping, conservation, and monotonicity.

J. Michael OwenLawrence Livermore Nat. [email protected]


MS149

Recent Advances in Lagrangian Hydro Methods

I will describe recent advances in Lagrangian hydro methods: symmetry preservation, shock resolution, and energy conservation; novel finite element and cell-centered discretization techniques; new forms of artificial viscosity; and methods for axisymmetric problems.


MS150

A Massively Parallel FMM Algorithm

We present new scalable algorithms and implementations of the kernel-independent fast multipole method (KIFMM), employing hybrid distributed memory message passing (via MPI) and graphics processing unit (GPU) acceleration to rapidly evaluate two-body non-oscillatory potentials.


MS150

A Domain Decomposition and Load Balancing Algorithm for Large Scale Molecular Dynamics

After reviewing some general aspects of large scale molecular dynamics on modern supercomputers, we will present some of the new computational algorithms implemented in ddcMD to achieve unprecedented performance. In particular, we will focus on a new automatic load balancing algorithm which swaps particles between nearest neighbor processors to continuously balance work. We will also present a heterogeneous decomposition of the force computation to improve parallel scaling for applications with long-range Coulomb interactions.

Jean-Luc FattebertLawrence Livermore National [email protected]

David F. Richards, Jim N. Glosli, Erik W. DraegerLawrence Livermore Nat. [email protected], [email protected], [email protected]


Fred H. StreitzLawrence Livermore Nat. [email protected]

MS150

Multilevel Summation Method for Large-scale Simulations of Biomolecules

An ongoing challenge in the simulation of biomolecules is to extend computational techniques to larger systems and longer timescales. However, the use of particle–mesh Ewald (PME) to account for long-range electrostatic interactions has become a restrictive bottleneck for large-scale parallel computation. The multilevel summation method is being investigated as an alternative approach, offering comparable accuracy to PME while permitting better parallel scalability by avoiding the need to calculate 3D FFTs.

David J. HardyTheoretical Biophysics Group, Beckman Institute

University of Illinois at [email protected]

MS150

A Metascalable Approach to Large Reactive Molecular-dynamics Simulations for Energy and Nanoscience Applications

We are developing a metascalable algorithmic framework that is likely to scale on future many-core clusters. The embedded divide-and-conquer algorithms combine globally informed local solutions into a global solution conforming to correct symmetry. The framework has achieved parallel efficiency over 0.95 on 212,992 IBM BlueGene/L processors for 1.68 trillion electronic degrees-of-freedom quantum molecular dynamics. I will discuss atomistic mechanisms of rapid hydrogen production from water using nanocatalysts and mechanically enhanced reaction kinetics in nanoenergetics-on-a-chip.

Aiichiro Nakano, Rajiv K. Kalia, Priya VashishtaUniversity of Southern [email protected], [email protected], [email protected]

Fuyuki ShimojoKumamoto [email protected]

MS151

Multi-level Optimization Algorithms for the Design of Nano-porous Materials

Multi-level optimization problems in nano-porous materials are characterized by the facts that the objective function is different at each level and that it is necessary to get the physics correct at all levels to construct a correct design. We discuss extensions to multi-grid optimization algorithms that allow us to address nano-porous problems that span a wide range of scales, include general constraints, and have different objectives at different scales. Some numerical results will be given.

Paul BoggsSandia National [email protected]

David M. GayAMPL Optimization LLCUniversity of New [email protected]

Robert M. LewisCollege of William and [email protected]

Stephen G. NashGeorge Mason UniversitySystems Engineering & Operations Research [email protected]

MS151

Anomalous Diffusion in Soft Matter Materials and Passive Microrheology

A relatively new field, passive microrheology, has emerged to infer viscoelastic properties of complex media from stochastic fluctuations of passive tracers. It is a generalization of fluctuation-dissipation to complex fluids. The seminal paper was in 1995 by Tom Mason and Dave Weitz; since then the field has evolved extensively. Our theory group focuses on experimental data of our collaborator David Hill in the UNC Cystic Fibrosis Center on human lung mucus and various synthetic simulants. Passive tracer beads from nanometers to microns are tracked using advanced microscopy, revealing normal and anomalous mean squared displacement statistics. The lecture will survey experimental data, inference methods, models, and simulations. The potential for this methodology to be applied to assess nano-porous materials design is a motivation for this topic in this session.


Scott McKinleyUniversity of [email protected]

John FricksDept of StatisticsPenn State [email protected]

Lingxing YaoUtah [email protected]

Christel HoheneggerUniversity of UtahDepartment of [email protected]
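The mean squared displacement (MSD) statistics mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' inference method: it assumes a single one-dimensional track sampled at uniform time steps, uses a time-averaged MSD, and estimates the scaling exponent by a least-squares fit on log-log axes. All function names are hypothetical. An exponent near 1 indicates normal diffusion; other values indicate anomalous behavior.

```python
import math
import random


def mean_squared_displacement(path, max_lag):
    """Time-averaged MSD of a 1D track for lags 1..max_lag (in samples)."""
    msd = []
    for lag in range(1, max_lag + 1):
        squared = [(path[i + lag] - path[i]) ** 2 for i in range(len(path) - lag)]
        msd.append(sum(squared) / len(squared))
    return msd


def scaling_exponent(msd):
    """Least-squares slope of log(MSD) against log(lag)."""
    xs = [math.log(lag) for lag in range(1, len(msd) + 1)]
    ys = [math.log(value) for value in msd]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator


def random_walk(n_steps, seed=0):
    """Synthetic Brownian track with unit-variance Gaussian steps (test data only)."""
    rng = random.Random(seed)
    position, path = 0.0, [0.0]
    for _ in range(n_steps):
        position += rng.gauss(0.0, 1.0)
        path.append(position)
    return path


if __name__ == "__main__":
    track = random_walk(20000)
    print(scaling_exponent(mean_squared_displacement(track, 20)))
```

Real tracking data would be two- or three-dimensional and noisy, so the squared displacement would sum over coordinates and the fit range would need care.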

MS151

Pore Networks Optimizing Transport in Permeable Energy Storage Materials

The physics and advantages of gas and electrical energy storage in nanoporous materials are briefly described. Although nanoscale pores enhance storage density, small pores impede transport, requiring introduction of wider transport channels to achieve rapid charge and discharge. To this end, the aperture and spacing of hierarchical channel networks are optimized to obtain maximum discharge from a fixed volume of storage material within a specified discharge period.

Robert Nilson, Stewart GriffithsSandia National [email protected], [email protected]

MS151

Hydrodynamic Theory and Simulations of Nano-Rod and Nano-Platelet Dispersions

We present a new hydrodynamic theory for nano-rod and nano-plate dispersions in the polymer matrix, which are the intermediate phases during processing of most high performance nanocomposite materials. In this theory, we model the surface contact interaction between the nano-inclusion and the polymer matrix explicitly along with the semi-flexibility of the nanorods or nano-platelets. We then explore the phase behavior of the flowing nano-composites in equilibrium and in shear flows. Using a bifurcation study, we explore the various possible phases in simple shear geometry along with their corresponding rheological consequences.

Qi WangUniversity of South [email protected]


Jun LiNankai Universitynkjunlimail.com@g

MS152

A GPU/Multi-core Accelerated Multigrid Preconditioned Conjugate Gradient Method for Adaptive Mesh Refinement


Austen C. DuffyDepartment of MathematicsFlorida State [email protected]

MS152

A Robust Symmetric Second-Order Discretization of the Poisson Equation with Mixed Dirichlet-Neumann Boundary Conditions on Arbitrary Geometry

The Poisson equation has countless applications in important engineering problems. Many different approaches have been proposed for solving the Poisson problem subjected to different boundary conditions. However, there is a lack of a straightforward, unified method for dealing with mixed boundary conditions. We propose an approach for imposing mixed Dirichlet and Neumann boundary conditions on irregular domains, which can be encountered for example in the simulation of free surface flows on an arbitrarily shaped topography.

Yen Ting NgDept. Computer [email protected]

Chohong MinkyungHee [email protected]

Frederic G. GibouUC Santa [email protected]

MS152

Computation of Three-dimensional Standing Water Waves


Chris RycroftDepartment of MathematicsLawrence Berkeley National [email protected]

MS152

A Second Order Virtual Node Algorithm for Elliptic Interface Problems on Irregular Domains


Joseph [email protected]

MS153

Adaptive p-refinement Approaches in Stochastic Expansion Methods

This presentation describes automated refinement of stochastic expansions, including uniform and adaptive p-refinement within nonintrusive polynomial chaos expansion and stochastic collocation methods. Adaptive p-refinement techniques include anisotropic tensor and sparse grids, with anisotropy detected from online variance-based decomposition or online or offline spectral coefficient decay rates. These techniques employ general-purpose refinement controls based on response covariance. Alternatively, generalized sparse grids admit refinement controls in statistical quantities of interest, resulting in refinement approaches that are goal-oriented.

Michael S. EldredSandia National LaboratoriesOptimization and Uncertainty Quantification [email protected]

MS153

Computational Strategies for UQ: Fault Tolerant Collocation and Input Model Generation

This work focuses on two aspects of scalable UQ. The first part showcases a fault tolerant computational framework for adaptive sparse grid collocation. The second aspect deals with developing a scalable, parallel, computational framework for data-driven model reduction. This framework utilizes various linear and non-linear dimensionality reduction methods to extract the most important features of the given high-dimensional data. This framework allows estimating quantitative information about the data including linearity, compactness and convexity.

Baskar Ganapathysubramanian, Saikiran Samudrala,Jaroslaw Zola, Yu Xie, Srinivas AluruIowa State [email protected], [email protected],[email protected], [email protected], [email protected]

MS153

Optimal Multi-domain Stochastic Collocation

Traditionally, stochastic multi-element collocation methods decompose the parametric space into multi-dimensional rectangles. We propose a method based upon high-dimensional edge detection to efficiently decompose the parametric space into arbitrarily shaped regions of high regularity. A high-degree polynomial interpolant is then constructed in each element using a modified version of De Boor’s algorithm. The use of De Boor’s ‘least interpolant’ removes the need for hyper-rectangular elements by allowing construction of polynomial interpolants based upon arbitrarily distributed collocation points.

Dongbin Xiu

Purdue [email protected]

John [email protected]

MS153

Data Analysis for Uncertainty Quantification of Inverse Problems

We present exploratory data analysis methods to assess inversion estimates using examples based on ℓ2- and ℓ1-regularization. These methods can be used to reveal the presence of systematic errors such as bias and discretization effects, or to validate assumptions made on the statistical model used in the analysis. The methods include: confidence intervals and bounds for the bias, resampling methods for model validation, and construction of training sets of functions with controlled local regularity.

Luis Tenorio, Christian LuceroColorado School of [email protected], [email protected]

MS154

Fast Schur-complement Approximations for KKT Systems



MS154

Preconditioning for Optimization Problems with Partial Integrodifferential Equations

We consider an optimization problem with a partial integrodifferential equation. Examples are calibration problems for PIDE models which occur in finance and biology. The discretized versions of these problems lead to dense systems, which however exhibit a particular structure. We use this structure in the design of preconditioners for solving the original system and the resulting necessary optimality conditions. We analyze the preconditioners in a framework that includes aspects of mesh independence. Numerical results are presented for a calibration problem using Lévy models in option price modeling.

Ekkehard W. SachsUniversity of TrierVirginia [email protected]

MS154

All-at-once Solution of Time-dependent PDE-constrained Optimization Problems

One-shot methods for time-dependent PDEs approximate the solution in a single iteration that solves for all time-steps at once. Here, we look at one-shot approaches for the optimal control of time-dependent PDEs and focus on fast solution methods using Krylov solvers and efficient preconditioners. We solve only approximate time-evolutions and compute accurate solutions of the control problem only at convergence of the overall iteration. We show that our approach can give competitive results.

Martin StollComputational Methods in Systems and Control TheoryMax Planck Institute for Dynamics of Complex [email protected]

Andrew J WathenOxford UniversityNumerical Analysis [email protected]

MS154

Robust Preconditioners for Distributed Optimal Control of the Stokes Equations

The velocity tracking problem for Stokes flows with distributed control is considered in the steady-state case. The formulation of this optimal control problem involves a regularization parameter, say α, in the cost functional. We will present a preconditioner for the discretized optimality system. If used in Krylov subspace methods like the minimal residual method, we obtain an iterative method for this problem whose convergence rate can be shown to be uniformly bounded in α.

Walter ZulehnerUniversity of Linz, [email protected]

MS155

Publishing Reproducible Results with VisTrails

VisTrails is an open-source provenance management and scientific workflow system designed to support scientific discovery. It combines and substantially extends useful features of visualization and scientific workflow systems. Similar to visualization systems, VisTrails makes advanced scientific visualization techniques available to users, allowing them to explore and compare different visual representations of their data; and similar to scientific workflow systems, VisTrails enables the composition of workflows that combine specialized libraries, distributed computing infrastructure, and Web services.

Juliana FreireScientific Computing and Imaging InstituteSchool of Computing University of [email protected]

Claudio SilvaScientific Computing and Imaging InstituteUniversity of [email protected]

MS155

Reproducible Research: Lessons from the Open Source World

Why are the practices of open source software development often more consistent with our ideas of openness and reproducibility in science than science itself? Today’s scientific praxis falls short of our ideals of reproducibility, and these problems are particularly acute in computational domains where they should be less prevalent. I will compare the incentive structure in both cultures to show how this plays an important role, since many of the incentives for scientific career advancement are directly at odds with the expectations of reproducibility in research. In addition, we can also draw from the everyday toolchain and practices of open source software development to improve our scientific workflows. I will highlight some key tools that can be easily integrated into our everyday research workflows and that provide real benefits without imposing undue burdens.


MS155

Reproducible Models and Reliable Simulations: Current Trends in Computational Neuroscience

Computational neuroscientists simulate models of neuronal networks to further our understanding of brain dynamics. Unfortunately, the validity of models of neuronal dynamics and of the simulation software implementing the models is difficult to ascertain, challenging the validity of computational neuroscience. We will describe how the computational neuroscience community is addressing validity through software reviews, best practices, increasing use of established software packages, meta-simulators, systematic testing, and simulator-independent model-specification languages.

Hans E. PlesserDepartment of Mathematical Sciences and TechnologyNorwegian University of Life [email protected]

Sharon M. CrookDept of Mathematics and Statistics & School of LifeSciencesArizona State [email protected]

Andrew P. DavisonUnite de Neuroscience, Information et ComplexiteCNRS, Gif sur Yvette, [email protected]

MS155

FEMhub, a Free Distribution of Open Source Finite Element Codes

FEMhub (http://femhub.org) is an open source distribution of finite element codes with a unified Python interface. The goal of the project is to reduce heterogeneity in installation and usage of open source finite element codes, facilitate their interoperability and comparisons, and improve reproducibility of results. FEMhub is available for download as a desktop application, but all codes are also automatically available in the Online Numerical Methods Laboratory (http://lab.femhub.org).


Aayush Poudel, Sameer RegmiUniversity of Nevada, [email protected], [email protected]

Mateusz Paprock

Technical University of Wroclaw, Wroclaw, [email protected]

PP1

Block Preconditioning for Multi-Physics Problems

The coupling of multi-physics systems is essential for a variety of problems in engineering and computational physiology. Due to their size and complexity, these systems require robust and efficient solution methods. We focus on enhancing monolithic solvers by using a linearly scalable block preconditioner, which is shown to result in mesh-independent convergence of GMRES. This approach enables the extension of the accurate and stable monolithic methods to complex large-scale engineering problems.

Liya AsnerUniversity of OxfordComputing [email protected]

David NordslettenMassachusetts Institute of [email protected]

David KayUniversity of OxfordComputing [email protected]

PP1

Rabies Epizootic Modeling

I propose an updated model of the rabies epizootic. In this model, a system of partial differential equations represents changes in susceptible, exposed, and infected compartments for both skunk and bat species. The model expands upon one developed by Steinhaus, Kuang, and Gardner (2010). The predictive ability of the model is tested by comparing simulations with the confirmed case data from Texas.

Rebecca BorcheringArizona State UniversitySchool of Mathematical and Statistical [email protected]

PP1

Bachelor Course Scientific Programming

For the Bachelor course scientific programming at Aachen University of Applied Sciences, students are also required to train as mathematical technical software developers (MATSE) based at Forschungszentrum Julich or at external companies. The theoretical part of the MATSE qualification (almost equally from mathematics and computer science) is organised as lecture courses at Julich Supercomputing Centre and is part of the bachelor course. Both the vocational and academic training are designed to last three years.

Oliver BueckerResearch Centre [email protected]

PP1

Spinal Biomechanics Mathematical Model for Lumbar Intervertebral Ligaments

Lumbar Spinal Surgery constitutes a Surgical Specialization that deals with rather complicated operations because of the anatomical and neurophysiological difficulties of the operation field. Both in Lumbar Spinal Surgery and Orthopedic Lumbar Surgery, the vertebrae distraction and/or forced position changes are usual maneuvres carried out during many interventions at the surgical theatre. Vertebrae are tightly joined by a number of strong ligaments to keep the Biomechanical functionality of the Spine. Each ligament has specific mechanical characteristics and its proper functionality. We present a Mathematical-Biomechanical Model of the Lumbar Intervertebral Ligaments, related mainly to the necessary distraction forces that have to be exerted during the operations to separate two adjacent vertebrae (for any specific spinal surgery purpose, e.g., Intervertebral Disc Replacement, Scoliosis, Lumbar Spinal Fusion, Spine Trauma, Surgical Spinal Decompression, etc.). The Model has been theoretically fitted to the proper stress and elastic parameters data for each type of Ligament, and simulated computational/numerical results are shown. Applications of fundamental algorithms in Approximation Theory are used to optimize this research. An overview of the Mathematical and Biomechanical calculations, together with the designed software, is also presented in this contribution. F Casesnoves MSc (Physics) MD (MPhil Medicine and Surgery).

Francisco CasesnovesSIAM Activity Group on Geometric DesignNG7 4DW [email protected]

PP1

A Multi-Numerics Scheme for a Multi-Physics Coupling of Free Flow with Porous Media Flow

We present a coupled continuous finite element method with discontinuous Galerkin method scheme for a coupled free flow with porous media flow model. We prove existence and uniqueness of both the weak and numerical solution. Convergence rates are verified for the fully coupled model using grid studies. A two grid method which allows the coupled model to be decoupled into two smaller problems is also presented. The two grid method is tested numerically and compared to solving the fully coupled problem.

Prince ChidyagwaiTemple UniversityDepartment of [email protected]

Beatrice RiviereRice UniversityHouston, Texas, [email protected]

PP1

Recent Advances in bout++

Challenges in turbulent edge simulations in plasma physics include scalability and coupling to other codes for different computational regions. This poster presents recent advances in the solution of time-dependent, nonlinear partial differential equations arising in BOUT++, with emphasis on exploration of various preconditioners. We also discuss improvements in interfaces that enable BOUT++ to function within the FACETS framework, which has the goal of providing modeling of the core, edge, and wall regions of fusion devices.

Sean FarleyMathematics and Computer Science DivisionArgonne National [email protected]

Ben DudsonPhysics DepartmentUniversity of [email protected]

PP1

A Numerical Algorithm for the Solution of a Phase-Field Model of Polycrystalline Alloys

We describe an algorithm for the numerical solution of a phase-field model of microstructure evolution in alloys using physical parameters from thermodynamic and kinetic databases. The system of equations includes a local order parameter, a quaternion representation of local crystal orientation and a species composition parameter. The implicit time integration of the system uses a backward difference formula combined with a preconditioned Newton-Krylov algorithm to solve the nonlinear system at each time step.

Jean-Luc FattebertLawrence Livermore National [email protected]


Michael E. Wickett, James F. Belak, Patrice E. A. TurchiLawrence Livermore Nat. [email protected], [email protected], [email protected]

PP1

Scalable Methods for Large-Scale Statistical Inverse Problems, with Applications to Subsurface Flow and Transport

We consider the problem of estimating uncertainty in large-scale nonlinear statistical inverse problems with high-dimensional parameter spaces within the framework of Bayesian inference. The solution is a posterior probability distribution function (pdf) over the desired parameter. Standard Markov chain Monte Carlo approaches are intractable for such high-dimensional problems. We approximate the pdf by constructing a Gaussian process response surface approximation, using the structure of the Hessian matrix and optimization algorithms to achieve scalability.

H. Pearl FlathInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

Tan Bui, Omar GhattasUniversity of Texas at [email protected], [email protected]

PP1

Graph Coloring Software for Sparse Derivative Computation and Beyond

We present a software package, called ColPack, comprising implementations of fast and effective algorithms for a variety of graph coloring and related problems arising in efficient computation of sparse derivative matrices using automatic differentiation. The capabilities of ColPack cover Jacobian as well as Hessian computation using direct as well as indirect methods. Several of the coloring problems supported by ColPack also find important applications in many areas outside derivative computation. In this poster, we will give an overview of the functionalities available in ColPack, highlight its key algorithms, and as an example of an application enabled by ColPack, present results from an optimization problem in chemical engineering.

Assefaw H. GebremedhinPurdue [email protected]
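To illustrate why coloring helps sparse derivative computation, here is a minimal sketch of a generic greedy column coloring for a Jacobian sparsity pattern. It is not ColPack code and does not use the ColPack API; the function name and the input layout (one set of column indices per row) are assumptions made for this example. Columns that receive the same color share no row, so they can be evaluated together in one directional derivative.

```python
def color_columns(rows, n_cols):
    """Greedy coloring of Jacobian columns.

    rows[i] is the set of column indices with a structural nonzero in row i.
    Two columns may share a color only if no row contains both of them.
    """
    # Build the column conflict sets: j conflicts with every column sharing a row.
    conflicts = [set() for _ in range(n_cols)]
    for row in rows:
        for j in row:
            conflicts[j].update(row)
            conflicts[j].discard(j)

    colors = [-1] * n_cols
    for j in range(n_cols):
        used = {colors[k] for k in conflicts[j] if colors[k] >= 0}
        color = 0
        while color in used:  # smallest color not used by a conflicting column
            color += 1
        colors[j] = color
    return colors


def tridiagonal_pattern(n):
    """Sparsity pattern of an n-by-n tridiagonal matrix (test data only)."""
    return [{j for j in (i - 1, i, i + 1) if 0 <= j < n} for i in range(n)]


if __name__ == "__main__":
    print(color_columns(tridiagonal_pattern(6), 6))
```

For a tridiagonal pattern three colors suffice regardless of the matrix size, so the full Jacobian can be recovered from three directional derivatives instead of one per column. The visiting order affects the number of colors a greedy heuristic uses on general patterns.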

PP1

Bayesian Markov Chain Monte Carlo Optimization of a Physiologically-Based Pharmacokinetic Model of Nicotine

In an effort to reduce harm caused by tobacco products, the Family Smoking Prevention and Tobacco Control Act was enacted by Congress and the Food and Drug Administration (FDA) was tasked with regulating tobacco. The FDA is now charged to use the best available science to guide the development and implementation of effective public health strategies to reduce the burden of illness and death caused by tobacco products. Therefore, one significant challenge is to reduce toxicant exposure in tobacco products such as cigarettes. Nicotine is the active drug within tobacco that is delivered during cigarette smoking. Smoking behavior is driven by the need for nicotine and will influence delivery of other tobacco toxicants into the lungs. To better understand nicotine dosimetry, a physiologically-based pharmacokinetics (PBPK) model which captures the essential nicotine metabolism and takes into account the variability of kinetic parameters among individuals was developed.

Rudy Gunawan, Chuck Timchalk, Justin TeeguardenPacific Northwest National [email protected], [email protected],[email protected]

PP1

High-Dimensional Adaptive Grids for Time-Dependent PDEs

Accurate solution of time-dependent, high-dimensional PDEs requires massive-scale parallel computing. In particular, the memory requirements for uniform grids with fine enough resolution will become prohibitively large. We discuss an on-going project where a parallel framework for block-adaptive discretization on Cartesian grids of well localized Hamiltonian systems is developed. Numerical simulations from quantum chemistry are presented, where we solve the time-dependent Schrödinger equation using exponential propagators based on the Lanczos algorithm.

Magnus GustafssonUppsala University, [email protected]

Katharina Kormann, Anna NissenDepartment of Information TechnologyUppsala [email protected], [email protected]

Sverker HolmgrenUppsala UniversityDepartment of Scientific [email protected]

PP1

H-P Spectral Element Methods for Three Dimensional Elliptic Problems on Non-Smooth Domains

We propose a nonconforming h−p spectral element method to solve three dimensional elliptic boundary value problems on non-smooth domains to exponential accuracy. To overcome the singularities which arise in the neighborhoods of the vertices, vertex-edges and edges we use local systems of coordinates. Away from these neighborhoods standard Cartesian coordinates are used. In each of these neighborhoods we use a geometrical mesh which becomes finer near the corners and edges. We then derive differentiability estimates in this new set of variables and a stability estimate on which our method is based. A parallel preconditioner to solve the normal equations is defined, using the stability estimates. The Sobolev spaces in vertex-edge and edge neighborhoods are anisotropic and become singular at the corners and edges.

Akhlaq HusainIndian Institute of Technology [email protected]

Pravir DuttDepartment of Mathematics,IIT Kanpur, [email protected]

A S Vasudeva MurthyTIFR Centre for Applicable MathematicsTIFR Bangalore, [email protected]

Chandra Shekhar UpadhyayIndian Institute of Technology Kanpur [email protected]

PP1

Constructing Sparse Preconditioners for High-Order Finite Element Problems

We present a method for constructing sparse preconditioners for elliptic problems discretized using higher-order polynomial basis functions. The method computes local sparse preconditioners from dense element stiffness matrices. This is done by solving a sparse least-squares problem on an element. The sparse matrices are assembled to be used in an algebraic multigrid method. We implement a fast and memory efficient method for these local problems. The results also show that solving the global problem requires less time and memory.

Chetan Jhurani, Travis M. AustinTech-X [email protected], [email protected]


Srinath Vadlamani, Scott KrugerTech-X [email protected], [email protected]

PP1

Variance Reduction Techniques for Stochastic Differential Equations

Stochastic differential equations are essential to modeling many physical phenomena including option prices in mathematical finance. The stochastic simulation of the quantities of interest is done using repeated realizations of Monte Carlo methods. However, these methods are well known for converging slowly with an order of only 1/2 in terms of the number of realizations. Therefore there is an increasing need for variance reduction in order to improve efficiency of Monte Carlo simulations. Here we use the variance reduction technique known as control variates to improve the computational efficiency. The results presented demonstrate the magnitude of the achieved variance reduction.

Albi Kavo, Todd Caskey, Mandeep SinghNew Jersey Institute of [email protected], [email protected], [email protected]
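The control variate idea in the abstract above can be sketched on a textbook case. This is an illustration under stated assumptions, not the authors' experiment: the asset follows geometric Brownian motion and is sampled exactly at maturity (no time stepping), the quantity of interest is a discounted European call payoff, and the control is the terminal price, whose mean under the risk-neutral measure is s0 * exp(rate * maturity). The function name and default parameters are hypothetical.

```python
import math
import random
import statistics


def call_price_control_variate(n_paths, s0=100.0, strike=100.0, rate=0.05,
                               sigma=0.2, maturity=1.0, seed=0):
    """Monte Carlo call price with and without a control variate."""
    rng = random.Random(seed)
    discount = math.exp(-rate * maturity)
    drift = (rate - 0.5 * sigma ** 2) * maturity
    vol = sigma * math.sqrt(maturity)

    terminal, payoff = [], []
    for _ in range(n_paths):
        s_t = s0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        terminal.append(s_t)
        payoff.append(discount * max(s_t - strike, 0.0))

    known_mean = s0 * math.exp(rate * maturity)  # exact mean of the control
    t_mean = statistics.fmean(terminal)
    p_mean = statistics.fmean(payoff)
    covariance = sum((p - p_mean) * (t - t_mean)
                     for p, t in zip(payoff, terminal)) / (n_paths - 1)
    # Coefficient minimizing the variance, estimated from the same samples
    # (this reuse introduces a small bias that vanishes as n_paths grows).
    beta = covariance / statistics.variance(terminal)
    adjusted = [p - beta * (t - known_mean) for p, t in zip(payoff, terminal)]

    return {
        "plain": p_mean,
        "controlled": statistics.fmean(adjusted),
        "plain_variance": statistics.variance(payoff),
        "controlled_variance": statistics.variance(adjusted),
    }


if __name__ == "__main__":
    print(call_price_control_variate(20000))
```

Because the standard error scales as the square root of the variance divided by the number of realizations, a variance ratio of R reduces the number of realizations needed for a given accuracy by about the same factor R.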

PP1

Markov Order Estimation and Forecasting

We develop an exact Bayes factor (EBF) classifier to distinguish finite sequences that have Markov order k = 0 versus k > 0. Through extensive simulation and testing against a classifier based on the Bayesian Information Criterion (BIC), we find that EBF classifiers can be significantly more accurate, especially when prior knowledge is available. Applying these classifiers to five real-world data sets, we find that EBF-based models have higher out-of-sample predictive accuracy than BIC-based models.

Harish S. Bhat, Nitesh KumarUniversity of California, MercedApplied [email protected], [email protected]
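The BIC baseline mentioned in the abstract above can be sketched as follows. This illustrates only generic BIC-based Markov order selection; it does not reproduce the exact Bayes factor classifier, and the function names and scoring conventions (maximum-likelihood transition estimates, all orders scored on the same symbols) are assumptions made for this example.

```python
import math
from collections import Counter


def markov_bic(seq, order, max_order, alphabet_size):
    """BIC of a Markov chain of the given order, fitted by maximum likelihood.

    Transitions are counted from index max_order onward so that every
    candidate order is scored on the same symbols.
    """
    context_counts = Counter()
    transition_counts = Counter()
    for t in range(max_order, len(seq)):
        context = tuple(seq[t - order:t])  # empty tuple when order == 0
        context_counts[context] += 1
        transition_counts[(context, seq[t])] += 1

    log_likelihood = sum(
        count * math.log(count / context_counts[context])
        for (context, _symbol), count in transition_counts.items()
    )
    n_params = (alphabet_size ** order) * (alphabet_size - 1)
    n_obs = len(seq) - max_order
    return -2.0 * log_likelihood + n_params * math.log(n_obs)


def select_order(seq, max_order, alphabet_size):
    """Return the order in 0..max_order with the smallest BIC."""
    scores = [markov_bic(seq, k, max_order, alphabet_size)
              for k in range(max_order + 1)]
    return scores.index(min(scores))


if __name__ == "__main__":
    print(select_order([0, 1] * 200, 2, 2))
```

A classifier for k = 0 versus k > 0 in this style would simply report whether the selected order is nonzero.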

PP1

GPU-Accelerated Preconditioned Iterative Linear Solvers

Sparse linear algebra computations can benefit from the fine-grained massive parallelism of Graphic Processing Units (GPUs). Performance of sparse matrix-vector product kernels can achieve up to 12 GFLOPS in double precision for unstructured matrices. This work focuses on developing high-performance iterative linear solvers accelerated by GPUs. Incomplete Cholesky factorization preconditioned CG method and incomplete LU factorization preconditioned GMRES method are adapted to a GPU environment. Other parallel preconditioners appropriate for GPUs are also explored.

Ruipeng LiDepartment of Computer Science & EngineeringUniversity of [email protected]


PP1

Solar Design Optimization Process Model

The purpose of this research is to develop a simulation model process algorithm for optimizing solar energy system design. This optimization model will be developed to minimize the cost of a solar energy system so that it is more affordable to marginalized communities. The costing analysis will include a levelized cost of energy (LCE) in determining the maximum power and minimized costs of a system, together with parameters such as site location and solar radiation.
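
A levelized cost of energy is commonly computed as discounted lifetime cost divided by discounted lifetime energy. The sketch below uses that common definition, which may differ from the author's costing model; the function name, the optional degradation term, and the two candidate designs are hypothetical illustration values, not data from this work.

```python
def levelized_cost_of_energy(capital_cost, annual_om_cost, annual_energy_kwh,
                             discount_rate, lifetime_years, degradation=0.0):
    """Discounted lifetime cost divided by discounted lifetime energy (cost per kWh)."""
    discounted_cost = capital_cost
    discounted_energy = 0.0
    for year in range(1, lifetime_years + 1):
        factor = (1.0 + discount_rate) ** year
        discounted_cost += annual_om_cost / factor
        # Optional yearly output loss, applied from the second year onward.
        output = annual_energy_kwh * (1.0 - degradation) ** (year - 1)
        discounted_energy += output / factor
    return discounted_cost / discounted_energy


# Hypothetical candidate designs: (name, capital cost, yearly O&M cost, yearly kWh).
candidates = [
    ("A", 12000.0, 150.0, 7000.0),
    ("B", 9000.0, 200.0, 5000.0),
]
scored = [
    (name, levelized_cost_of_energy(cap, om, kwh, discount_rate=0.05, lifetime_years=25))
    for name, cap, om, kwh in candidates
]
best = min(scored, key=lambda item: item[1])
```

In a design optimization, `annual_energy_kwh` would itself depend on site location and solar radiation, so the function above would sit inside the loop over candidate system configurations.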

Scott M. LittleSunWize TechnologiesNorthcentral [email protected]

PP1

Modeling the Dynamics of an Insect Heart

The tubular heart of the fly, Drosophila melanogaster, contains excitable cells that cause the heart to contract and pump hemolymph throughout the body. In this work, we model the electrical activity of the heart based on a network of connected excitable cells, where each cell is modeled using a system of ordinary differential equations. Computational studies are performed to investigate the roles of specific ion channels and network connectivity in the generation of the heartbeat.
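
A network of coupled excitable cells can be sketched as below. This is a generic stand-in, not the authors' model: each cell uses the two-variable FitzHugh-Nagumo equations rather than explicit ion channels, the cells form a simple chain with diffusive coupling, and every parameter value is a hypothetical choice. Forward Euler is used only for brevity.

```python
def simulate_chain(n_cells=5, coupling=0.2, stimulus=0.5, dt=0.05, steps=4000):
    """Integrate a chain of FitzHugh-Nagumo cells; returns the voltage history."""
    a, b, eps = 0.7, 0.8, 0.08
    v = [-1.2] * n_cells      # fast, voltage-like variable
    w = [-0.625] * n_cells    # slow recovery variable
    trace = []
    for _ in range(steps):
        dv = []
        for i in range(n_cells):
            # Constant stimulus on cell 0 only; neighbours interact diffusively.
            current = stimulus if i == 0 else 0.0
            if i > 0:
                current += coupling * (v[i - 1] - v[i])
            if i < n_cells - 1:
                current += coupling * (v[i + 1] - v[i])
            dv.append(v[i] - v[i] ** 3 / 3.0 - w[i] + current)
        dw = [eps * (v[i] + a - b * w[i]) for i in range(n_cells)]
        v = [vi + dt * d for vi, d in zip(v, dv)]
        w = [wi + dt * d for wi, d in zip(w, dw)]
        trace.append(list(v))
    return trace


trace = simulate_chain()
```

Changing `coupling` or the chain topology is the kind of connectivity experiment the abstract describes; whether a given parameter set produces a propagating beat has to be checked by inspecting `trace`.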

Miles ManningStudent, School of Mathematical and Statistical SciencesArizona State [email protected]

April Chiustudent, School of Mathematical and Statistical SciencesArizona State [email protected]

PP1

A Scalable Preconditioner for Edge Plasma Transport with Neutral Gas Species

Simulating multi-component plasma transport in the edge region of a tokamak is computationally intensive; after discretization, a nonlinear system must be solved at each time step. Including neutral gases in the simulation magnifies the challenge and has limited the scalability of solvers. This work describes numerical experiments with UEDGE, placing emphasis on the development of a component-wise preconditioner that has improved the speed and scalability of the nonlinear solve in the presence of neutral gases.

Michael MccourtArgonne National [email protected]

Tom RognlienLawrence Livermore National [email protected]

Lois Curfman McInnesArgonne National LaboratoryMathematics and Computer Science [email protected]

Hong ZhangArgonne National [email protected]

PP1

Efficient Implementation of Smoothness-Increasing Accuracy-Conserving Filters for Discontinuous Galerkin Solutions

Smoothness-increasing accuracy-conserving post-processors, when applied to a discontinuous Galerkin solution, can increase the order of accuracy from k+1 to 2k+1. Computationally, this involves solving geometric intersection problems and evaluating several numerical quadratures per mesh element, which in turn makes this technique expensive to use. However, as the post-processor is a suitable candidate for parallelization, we demonstrate how filtering an entire domain can be achieved in a fast and efficient manner using different parallelization techniques.

Hanieh MirzaeeScientific Computing and Imaging InstituteSchool of Computing, University of [email protected]

Jennifer K. RyanDelft University of [email protected]

Robert KirbyUniversity of [email protected]

PP1

Local Time-Stepping Adaptive Mesh Refinement Techniques for Discontinuous Galerkin Methods

Adaptive mesh refinement is the process of dynamically adding and deleting mesh elements based on some sort of local error estimator. Time stepping will be achieved using local time-stepping techniques, which allow each mesh element to have its own time step. This differs from the Berger-Oliger method, which uses a hierarchy of increasingly refined grid patches, with uniform time steps taken on each grid patch. Various error estimators will be evaluated based on a comparison to the actual known error for both smooth and discontinuous examples. One possible error estimator will involve associating a large gradient with a large error; others will originate from more mathematically rigorous derivations. The LTS-AMR method will be applied to solving the advection equation in one dimension.
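
Two ingredients mentioned above, a gradient-based refinement indicator and per-element time steps, can be sketched in one dimension as follows. The sketch assumes piecewise-constant cell averages and a CFL-type step restriction; the function names, the threshold, and the CFL number are hypothetical, and the discontinuous Galerkin discretization itself is not shown.

```python
def flag_cells_for_refinement(u, h, threshold):
    """Flag both neighbours of any cell interface whose difference quotient is large."""
    flags = [False] * len(u)
    for i in range(len(u) - 1):
        if abs(u[i + 1] - u[i]) / h > threshold:
            flags[i] = True
            flags[i + 1] = True
    return flags


def local_time_steps(cell_widths, speed, cfl=0.5):
    """One stable step per element for u_t + speed * u_x = 0: dt_i = cfl * h_i / |speed|."""
    return [cfl * h / abs(speed) for h in cell_widths]


# A step profile: only the two cells next to the jump are flagged.
u = [0.0, 0.0, 0.0, 1.0, 1.0, 1.0]
flags = flag_cells_for_refinement(u, h=0.1, threshold=5.0)

# Refined (narrower) cells receive proportionally smaller time steps.
steps = local_time_steps([0.1, 0.05], speed=2.0)
```

With local time stepping, the small cells near the jump advance with their own small step while the coarse cells keep a larger one, instead of the whole mesh being limited by the smallest element.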

Scott A. MoeUniversity of Wisconsin [email protected]


PP1

Implementation of SpMV Using Indexed Segmented Scan Method for CUDA

We present a new SpMV implementation for CUDA that uses the Indexed Segmented Scan method. The implementation is based on the Segmented Scan method, with improvements in the pre-calculation and in the GPU reduction. In this talk, we show the details of the implementation, its performance, and its application to a numerical library.
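
For readers unfamiliar with the underlying technique, the following is a sequential emulation of SpMV by plain segmented scan on a CSR matrix. It does not reproduce the indexed variant or the pre-calculation described in this abstract, and on a GPU steps 1 and 3 would run as parallel kernels; the function name and the test matrix are hypothetical.

```python
def spmv_segmented_scan(indptr, indices, data, x):
    """Compute y = A x for a CSR matrix via products plus a segmented scan."""
    # Step 1: one product per stored nonzero (independent, hence parallel on a GPU).
    products = [data[k] * x[indices[k]] for k in range(len(data))]
    # Step 2: head flags mark the first nonzero of each nonempty row.
    n_rows = len(indptr) - 1
    head = [False] * len(data)
    for row in range(n_rows):
        if indptr[row] < indptr[row + 1]:
            head[indptr[row]] = True
    # Step 3: inclusive segmented scan; the running sum restarts at each head flag.
    scanned = []
    running = 0.0
    for k, value in enumerate(products):
        running = value if head[k] else running + value
        scanned.append(running)
    # Step 4: the last entry of each segment is that row's result.
    y = [0.0] * n_rows
    for row in range(n_rows):
        if indptr[row] < indptr[row + 1]:
            y[row] = scanned[indptr[row + 1] - 1]
    return y


# A = [[1, 0, 2], [0, 0, 0], [3, 4, 0]] in CSR form, including an empty row.
indptr = [0, 2, 2, 4]
indices = [0, 2, 0, 1]
data = [1.0, 2.0, 3.0, 4.0]
y = spmv_segmented_scan(indptr, indices, data, [1.0, 2.0, 3.0])
```

Because the work is organized per nonzero rather than per row, the load is balanced even when row lengths vary widely, which is the usual argument for scan-based SpMV on GPUs.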


Takao SakuraiCentral Research LaboratoryHitachi [email protected]

Takahiro KatagiriInformation Technology Center, The University of [email protected]

Kengo NakajimaThe University of TokyoInformation Technology [email protected]

Hisayasu KurodaUniv. of Tokyo / Ehime [email protected]

Ken NaonoHITACHI [email protected]

Mitsuyoshi IgaiHitachi ULSI Systems [email protected]

Shoji ItohUniv. of [email protected]

PP1

Adjoint Methods for Inversion of Rheology Parameters in Ice Sheet Flows

Modeling the dynamics of polar ice sheets is critical for projections of future sea level rise. Yet, there remain large uncertainties in the non-Newtonian constitutive relationships employed within ice sheet models. Here we formulate an inverse problem to infer the rheological parameters that minimize the misfit between observed and modeled surface flow velocities and surface elevation. The inverse problem is solved using an adjoint-based Gauss-Newton method.
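
The Gauss-Newton iteration for a misfit-minimization problem can be shown on a toy example. The sketch below is far removed from an ice sheet model: the forward model is a hypothetical scalar decay `exp(-m * t)`, the single parameter `m` plays the role of the rheological parameter, and the Jacobian is written analytically, whereas a PDE-constrained problem would obtain derivative information through adjoint solves.

```python
import math


def gauss_newton_decay(times, observations, m0, iterations=20):
    """Fit m in the model exp(-m * t) to observations by Gauss-Newton."""
    m = m0
    for _ in range(iterations):
        # Misfit residuals between observed and modeled data.
        residuals = [d - math.exp(-m * t) for t, d in zip(times, observations)]
        # Derivative of the model output with respect to m.
        jacobian = [-t * math.exp(-m * t) for t in times]
        # Gauss-Newton step: solve (J^T J) delta = J^T r.
        rhs = sum(j * r for j, r in zip(jacobian, residuals))
        gn_hessian = sum(j * j for j in jacobian)
        m += rhs / gn_hessian
    return m


# Synthetic, noise-free observations generated with m = 1.5 (hypothetical value).
times = [0.1 * i for i in range(1, 21)]
observations = [math.exp(-1.5 * t) for t in times]
m_estimate = gauss_newton_decay(times, observations, m0=0.5)
```

Gauss-Newton drops the second-derivative terms of the model from the Hessian, so each step needs only first-derivative information, which is what makes it attractive when every derivative evaluation costs a PDE solve.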

Noemi Petra, Hongyu ZhuInstitute for Computational Engineering and SciencesUniversity of Texas at [email protected], [email protected]

Georg Stadler, Omar GhattasUniversity of Texas at [email protected], [email protected]

PP1

Building a PDE Code from Components, Including Shape Optimization and Embedded UQ

A general-purpose PDE code is being written to drive the development of independent yet interoperable software components. This code makes use of dozens of libraries from the Trilinos, Dakota, Cubit, and Sierra projects. To avoid a monolithic framework, we use abstract interfaces at many levels. The heavy use of Automatic Differentiation technology means that any new physics can instantly be driven by Newton-based solvers, sensitivity analysis, shape optimization, non-intrusive UQ, or embedded UQ.

Andy SalingerSandia National [email protected]


Steve J. OwenSandia National LaboratoriesAlbuquerque, [email protected]

Matt Staten, Jake [email protected], [email protected]

PP1

Understanding Memory Effects of Loop Fusion for Linear Algebra Kernels

The performance of scientific applications is often limited by the cost of memory accesses inside linear algebra kernels. We developed a compiler that tunes such kernels for memory efficiency using loop fusion. In this poster, we use statistical methods to determine when fusing vector operations in the presence of matrix operations significantly impacts performance. We present circumstances where the fusion of vector operations can be disregarded when searching for an optimal routine within our compiler.
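
Loop fusion itself can be illustrated with two vector operations, here the hypothetical sequence `t = alpha * x + y` followed by `out = t * z`. The sketch only shows the transformation and that it preserves the result; interpreted Python will not exhibit the memory-traffic savings that a compiled kernel would.

```python
def unfused(alpha, x, y, z):
    """Two separate loops: the intermediate vector t is stored, then read back."""
    t = [alpha * xi + yi for xi, yi in zip(x, y)]
    return [ti * zi for ti, zi in zip(t, z)]


def fused(alpha, x, y, z):
    """One loop: each intermediate value lives in a local and is never stored."""
    out = []
    for xi, yi, zi in zip(x, y, z):
        ti = alpha * xi + yi
        out.append(ti * zi)
    return out


x = [1.0, 2.0, 3.0]
y = [0.5, 0.5, 0.5]
z = [2.0, 2.0, 2.0]
result_unfused = unfused(2.0, x, y, z)
result_fused = fused(2.0, x, y, z)
```

In the fused form the intermediate vector is never written to or read from memory, so for long vectors the number of memory accesses per element drops; whether that matters next to a dominant matrix operation is the question the poster addresses.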

Erik SilkensenUniversity of Colorado at [email protected]

Ian KarlinDepartment of Computer ScienceUniversity of Colorado at [email protected]

Geoffrey BelterDept. of Electrical, Computer, and Energy EngineeringUniversity of Colorado at [email protected]

Elizabeth JessupUniversity of Colorado at BoulderDepartment of Computer [email protected]

Thomas NelsonUniversity of Colorado at BoulderArgonne National [email protected]

Pavel ZelinksyUniversity of Colorado at [email protected]

Jeremy SiekDepartment of Electrical and Computer EngineeringUniversity of Colorado at [email protected]

PP1

Measuring Glioma Tumor Volumes with the Talairach Transformation

Glioblastoma multiforme is a high-grade glioma that significantly lowers patient survival to under two years. Serial magnetic resonance images have been processed to obtain volumetric measurements of brain structures that surround the tumor based on the Talairach Transformation. These measured volumes of the surrounding structures provide an estimate of the tumor size. Based on these estimated tumor sizes, the glioma growth rate can be calculated for forecast models and implemented in MATLAB treatment simulations.

Maple SoArizona State UniversitySchool of Mathematical and Statistical [email protected]

PP1

Inversion of Rheological Parameters of Mantle Flow Models from Observed Plate Motions

Modeling the dynamics of the Earth’s mantle is critical for understanding the dynamics of the solid earth. Yet, there remain large uncertainties in the constitutive parameters employed within mantle convection models. Here we formulate an inverse problem to infer the rheological parameters that minimize the misfit between observed and modeled tangential surface velocity fields. The inverse problem is solved using an adjoint-based inexact Newton-CG method.

Jennifer A. Worthen, Georg StadlerUniversity of Texas at [email protected], [email protected]

Michael [email protected]


PP1

A Shape Hessian-Based Analysis of Roughness Effects on Viscous Flow

We study the influence of boundary roughness characteristics on the rate of dissipation in a viscous fluid using first- and second-order shape derivatives. We show that the flat boundary is a stationary point for the dissipation functional, and thus its local behavior is governed by the shape Hessian, whose eigenvectors are shown to be the Fourier modes. We show numerical examples for laminar flow and present preliminary results for turbulent flow.

Shan Yang, Georg Stadler, Omar Ghattas, Robert MoserUniversity of Texas at [email protected], [email protected],[email protected], [email protected]

PP1

Comparison of Model Reduction Techniques on High-Fidelity Linear and Nonlinear Electrical, Mechanical, and Biological Systems

An impartial and rigorous comparative study of model order reduction (MOR) techniques is presented. Some of these techniques are classical, while others have been recently developed. Large-scale linear and nonlinear, static and dynamical systems are considered, with a particular focus on nonlinear systems. The study emphasizes accuracy and computational speedups, and considers systems arising in a wide variety of engineering fields. All MOR methods exhibit a trade-off between accuracy and speedup.

Matthew J. ZahrUniversity of California, BerkeleyStanford [email protected]

Charbel Farhat, Kevin T. Carlberg, David AmsallemStanford [email protected], [email protected],[email protected]

PP1

Adjoint Methods for Inversion of Basal Boundary Conditions in Ice Sheet Flows

Modeling the dynamics of polar ice sheets is critical for projection of future sea level rise. Yet, there remain large uncertainties in the boundary conditions at the base of the ice sheet, which represent slip along the basal surface. Here we study mathematical and computational issues in the inverse problem of inverting for basal slip parameters from observations of surface flow velocities and surface elevation. We employ adjoint-based Gauss-Newton methods to solve the inverse problem.

Hongyu ZhuInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

Georg StadlerUniversity of Texas at [email protected]

Thomas HughesInstitute for Computational Engineering and SciencesThe University of Texas at [email protected]

