Parallelized Computing Slides

  • 7/25/2019 Parallelized Computing Slides

    Adaptive Mesh Applications

    Sathish Vadhiyar

    Sources:

    - Schloegel, Karypis, Kumar. Multilevel Diffusion Schemes for Repartitioning of Adaptive Meshes. JPDC 1997 (Taken verbatim)

    Adaptive Applications

    - Highly adaptive and irregular applications

    - Amount of work per task can vary drastically throughout the execution; similar to earlier applications, but...

    - Has notions of "interesting" regions

    - Computations in the interesting regions of the domain are larger than for other regions

    - It is difficult to predict which regions will become interesting

    AMR Applications

    - An example of such applications is Parallel Adaptive Mesh Refinement (AMR) for multi-scale applications

    - Adaptive mesh: mesh or grid size is not fixed as in Laplace/Jacobi, but interesting regions are refined to form finer-level grids/meshes

    - E.g., to study crack growth through a macroscopic structure under stress

    AMR Applications - Crack propagation

    - Such a system is subject to the laws of plasticity and elasticity and can be solved using the finite element method

    - Crack growth forces the geometry of the domain to change

    - This in turn necessitates localized remeshing

    AMR Applications - Adaptivity

    - Adaptivity arises when the advancing crack crosses from one subdomain to another

    - It is unknown in advance when or where the crack growth will take place and which subdomains will be affected

    - The computational complexity of a subdomain can increase dramatically due to greater levels of mesh refinement

    - Difficult to predict future workloads

    Repartitioning

    - In adaptive mesh computations, areas of the mesh are selectively refined or derefined in order to accurately model the dynamic computation

    - Hence, repartitioning and redistributing the adapted mesh across processors is necessary

    Repartitioning

    - The challenge is to keep the repartitioning cost to a minimum

    - Similar problems to MD, GoL

    - The primary difference in AMR is that loads can change drastically and cannot be predicted; one has to wait for refinement, then repartition

    Structure of Parallel AMR

    Repartitioning

    - 2 methods for creating a new partitioning from an already distributed mesh that has become load imbalanced due to mesh refinement and coarsening

    - Scratch-remap schemes create an entirely new partition

    - Diffusive schemes attempt to tweak the existing partition to achieve better load balance, often minimizing migration costs

    Graph Representation of Mesh

    - For irregular mesh applications, the computations associated with a mesh can be represented as a graph

    - Vertices represent the grid cells; vertex weights represent the amount of computation associated with the grid cells

    - Edges represent the communication between the grid cells; edge weights represent the amount of interaction

    Graph Representation of Mesh

    - The objective is to partition across P processors

        - Each partition has an equal amount of vertex weight

        - Total weight of the edges cut by the partition is minimized
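
    The two quantities in this objective can be computed directly from the weighted graph. The sketch below is illustrative only: the dictionary-based graph layout, the function names, and the 2 x 3 example grid are assumptions, not part of the slides.

```python
from collections import defaultdict


def partition_weights(vertex_weight, part):
    """Sum the vertex weights assigned to each partition."""
    weights = defaultdict(int)
    for v, w in vertex_weight.items():
        weights[part[v]] += w
    return dict(weights)


def edge_cut(edge_weight, part):
    """Total weight of the edges whose endpoints lie in different partitions."""
    return sum(w for (u, v), w in edge_weight.items() if part[u] != part[v])


# Hypothetical 2 x 3 grid of cells (row 0: cells 0 1 2, row 1: cells 3 4 5),
# unit vertex and edge weights, split into two partitions.
vertex_weight = {v: 1 for v in range(6)}
edge_weight = {(0, 1): 1, (1, 2): 1, (3, 4): 1, (4, 5): 1,
               (0, 3): 1, (1, 4): 1, (2, 5): 1}
part = {0: 0, 1: 0, 3: 0, 2: 1, 4: 1, 5: 1}

print(partition_weights(vertex_weight, part))
print(edge_cut(edge_weight, part))
```

    A partitioner searches over `part` to keep the per-partition weights equal while making the edge-cut as small as possible.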

    Scratch-remap Method

    - Partitioning from scratch will result in high vertex migration, since the partitioning does not take the initial location of the vertices into account

    - Hence a partitioning method should incrementally construct a new partition as simply a modification of the input partition

    Notations

    - Let B(q) be the set of vertices in partition q

    - Weight of any partition q can be defined as:

    - Average partition weight:

    - A graph is imbalanced if it is partitioned, and:
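
    The formulas for this slide did not survive extraction. A plausible reconstruction, consistent with the definitions of over-balanced partitions that follow, is given below. The symbols w(v) for the weight of vertex v, P for the number of partitions, and ε for the imbalance tolerance are assumptions.

```latex
W(q) = \sum_{v \in B(q)} w(v)
\qquad
\overline{W} = \frac{1}{P} \sum_{q=1}^{P} W(q)
\qquad
\exists\, q : \; W(q) > (1 + \epsilon)\, \overline{W}
```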

    Terms

    - A partition is over-balanced if its weight is greater than the average partition weight times (1 + ε)

    - If less, under-balanced

    - The graph is balanced when no partition is over-balanced

    - Repartitioning: existing partition used as an input to form a new partition
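
    The balance test can be written in a few lines. This is a minimal sketch: the function names and the tolerance parameter `eps` are assumptions, and only the over-balanced check is coded because the slide does not spell out the exact threshold for under-balanced partitions.

```python
def overbalanced(weights, eps):
    """Partitions whose weight exceeds the average weight times (1 + eps)."""
    avg = sum(weights.values()) / len(weights)
    return sorted(q for q, w in weights.items() if w > avg * (1 + eps))


def is_balanced(weights, eps):
    """The graph is balanced when no partition is over-balanced."""
    return not overbalanced(weights, eps)


# Hypothetical partition weights after a refinement step.
print(overbalanced({0: 10, 1: 4, 2: 4}, eps=0.05))
print(is_balanced({0: 6, 1: 6, 2: 6}, eps=0.05))
```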

    Terms

    - A vertex is clean if its current partition is its initial partition; else dirty

    - Border vertex: has an adjacent vertex in another partition; those partitions are neighbor partitions

    - TotalV: sum of the sizes of the vertices which change partitions, i.e., sum of the sizes of the dirty vertices
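
    These terms map onto short set computations. The sketch below assumes an adjacency-list graph and per-vertex sizes stored in dictionaries; the names `total_v` and `border_vertices` and the four-vertex path example are hypothetical.

```python
def total_v(size, initial_part, current_part):
    """Sum of the sizes of the dirty vertices (those that changed partition)."""
    return sum(size[v] for v in size if current_part[v] != initial_part[v])


def border_vertices(adjacency, part):
    """Vertices with at least one neighbor in a different partition."""
    return {v for v, nbrs in adjacency.items()
            if any(part[u] != part[v] for u in nbrs)}


# Hypothetical path graph 0 - 1 - 2 - 3; vertex 2 migrates from partition 0 to 1.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
size = {0: 1, 1: 1, 2: 5, 3: 1}
initial_part = {0: 0, 1: 0, 2: 0, 3: 1}
current_part = {0: 0, 1: 0, 2: 1, 3: 1}

print(total_v(size, initial_part, current_part))
print(border_vertices(adjacency, current_part))
```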

    Objectives

    - Maintain balance between partitions

    - Minimize edge cuts

    - Minimize TotalV

    Different Schemes

    - Repartitioning from scratch

    - Cut-and-paste repartitioning: excess vertices in an overbalanced partition are simply swapped into one or more underbalanced partitions in order to bring these partitions up to balance

    - The method can optimize TotalV, but can have a negative effect on the edge-cut

    Different Schemes

    - Another method is analogous to diffusion

    - Concept is for vertices to move from overbalanced to neighboring underbalanced partitions
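
    A toy version of this idea is sketched below. It is a single greedy pass that moves border vertices out of overweight partitions into lighter neighboring partitions. It is not the multilevel diffusion scheme of the cited paper; the function name, the data layout, and the rule that a target must stay at or below the average weight are simplifying assumptions.

```python
def diffuse_once(adjacency, weight, part, eps=0.0):
    """One greedy pass of vertex migration; returns a new assignment."""
    part = dict(part)
    load = {}
    for v, q in part.items():
        load[q] = load.get(q, 0) + weight[v]
    avg = sum(load.values()) / len(load)

    for v in sorted(adjacency):
        src = part[v]
        if load[src] <= avg * (1 + eps):
            continue  # source partition is not overbalanced
        # Partitions reachable through v's edges (v must be a border vertex).
        neighbors = {part[u] for u in adjacency[v]} - {src}
        # Keep only targets that stay at or below the average after the move.
        targets = [q for q in neighbors if load[q] + weight[v] <= avg]
        if targets:
            dst = min(targets, key=lambda q: (load[q], q))
            part[v] = dst
            load[src] -= weight[v]
            load[dst] += weight[v]
    return part


# Hypothetical path graph 0-1-2-3-4-5 with unit weights; partition 0 holds 4 vertices.
adjacency = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
weight = {v: 1 for v in adjacency}
part = {0: 0, 1: 0, 2: 0, 3: 0, 4: 1, 5: 1}
print(diffuse_once(adjacency, weight, part))
```

    Because only border vertices move, and only to a neighboring partition, the new partition stays close to the old one, which is what keeps both TotalV and the edge-cut low.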

    Example (Assuming edge and vertex weights as equal to 1)

    Example (contd.)

    Analysis of the schemes

    - Thus, cut-and-paste repartitioning minimizes TotalV, while completely ignoring edge-cut

    - Partitioning the graph from scratch minimizes edge-cut, while resulting in high TotalV

    - Diffusion attempts to keep both TotalV and edge-cut low

    Space Filling Curves for Partitioning and Load Balancing

    Space Filling Curves

    - The underlying idea is to map a multidimensional space to one dimension, where the partitioning is trivial

    - There are many different ways

    - But a mapping for partitioning algorithms should preserve the proximity information present in the multidimensional space to minimize communication costs

    Space Filling Curve

    - Space filling curves are quick to run, can be implemented in parallel, and produce good load balancing with locality

    - A space-filling curve is formed over grid/mesh cells by using the centroid of the cells to represent them

    - The SFC produces a linear ordering of the cells such that cells that are close together in the linear ordering are also close together in the higher-dimensional space

    Space Filling Curve

    - The curve is then broken into segments based on the weights of the cells (weights computed using size and number of particles)

    - The segments are distributed to processors; thus cells that are close together in space are assigned to the same processor

    - This reduces the overall amount of communication that occurs, i.e., increases locality
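
    Once the cells are ordered along the curve, cutting it into segments is a one-dimensional problem. The sketch below assumes the cell weights are already listed in curve order; the function name and the midpoint rule used to choose the owner are illustrative choices, not taken from the slides.

```python
def split_curve(weights_in_curve_order, num_procs):
    """Assign each cell to a processor so that consecutive cells along the
    curve form segments of roughly equal total weight."""
    total = sum(weights_in_curve_order)
    owners = []
    running = 0.0
    for w in weights_in_curve_order:
        # The processor whose weight interval contains this cell's midpoint.
        mid = running + w / 2
        owners.append(min(int(mid * num_procs / total), num_procs - 1))
        running += w
    return owners


# Hypothetical cell weights along the curve, split over 3 processors.
print(split_curve([2, 1, 1, 3, 1, 2, 1, 1], 3))
```

    Each processor receives a contiguous run of the curve, so the locality of the ordering carries over to the partition.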

    SFC representation

    Z-curve or Morton ordering

    - The curve for a 2^k x 2^k grid is composed of four 2^(k-1) x 2^(k-1) curves, one in each quadrant of the 2^k x 2^k grid
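
    The Morton index of a cell is commonly obtained by interleaving the bits of its coordinates, which yields exactly this quadrant-by-quadrant recursion. In the sketch below the function name and the choice of placing the x bits in the even positions are assumptions; the opposite convention gives a mirrored curve.

```python
def morton_index(x, y, k):
    """Interleave the k low bits of x (even positions) and y (odd positions)."""
    z = 0
    for i in range(k):
        z |= ((x >> i) & 1) << (2 * i)
        z |= ((y >> i) & 1) << (2 * i + 1)
    return z


# Visit order for a 4 x 4 grid (k = 2): the first four cells fill one quadrant.
k = 2
cells = [(x, y) for y in range(2 ** k) for x in range(2 ** k)]
order = sorted(cells, key=lambda c: morton_index(c[0], c[1], k))
print(order[:4])
```

    Dropping the two lowest bits of an index gives the index of the enclosing quadrant on the coarser grid, which is the recursive structure described above.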

    Graycode Curve

    - Uses same interleaving function as Z-curve

    - But visits points in the graycode order

    - Graycode: two successive values differ in only one bit

    - The one-bit gray code is (0,1)

    Graycode Curve

    - The gray code list for n bits can be generated recursively using n-1 bits

        - By reflecting the list (reversing the list): [1,0]

        - Concatenating original with the reflected: [0,1,1,0]

        - Prefixing entries in the original list with 0, and prefixing entries in the reflected list with 1: [00,01,11,10]

    Graycode Curve

    - 3-bit gray code: 000,001,011,010,110,111,101,100
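
    The reflect-and-prefix construction translates directly into a short recursive function. The name `gray_code` and the use of bit strings are illustrative choices.

```python
def gray_code(n):
    """Return the n-bit reflected Gray code as a list of bit strings."""
    if n == 1:
        return ["0", "1"]
    original = gray_code(n - 1)
    reflected = original[::-1]  # reflect (reverse) the (n-1)-bit list
    # Prefix the original entries with 0 and the reflected entries with 1.
    return ["0" + code for code in original] + ["1" + code for code in reflected]


print(gray_code(2))
print(gray_code(3))
```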

    Hilbert Curve

    - Hilbert curve is a smooth curve that avoids the sudden jumps in Z-curve and graycode curve

    - Curve composed of four curves of previous resolution in four quadrants

    - Curve in the lower left quadrant rotated clockwise by 90 degrees, and curve in lower right quadrant rotated anticlockwise by 90 degrees
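
    A widely used iterative way to compute a cell's position along the Hilbert curve is sketched below. It applies the quadrant rotations one resolution level at a time. The function name and the orientation (curve starts at the lower-left cell and ends at the lower-right cell, with y pointing up) are assumptions, and the grid side n must be a power of two.

```python
def hilbert_index(n, x, y):
    """Position of cell (x, y) along the Hilbert curve of an n x n grid."""
    d = 0
    s = n // 2
    while s > 0:
        rx = 1 if x & s else 0
        ry = 1 if y & s else 0
        d += s * s * ((3 * rx) ^ ry)
        # Rotate or flip the quadrant so the sub-curve has the base orientation.
        if ry == 0:
            if rx == 1:
                x = n - 1 - x
                y = n - 1 - y
            x, y = y, x
        s //= 2
    return d


n = 4
cells = [(x, y) for x in range(n) for y in range(n)]
order = sorted(cells, key=lambda c: hilbert_index(n, c[0], c[1]))
print(order)
```

    Unlike the Z-curve, every pair of consecutive cells in `order` shares an edge, which is the "no sudden jumps" property.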

    SFCs for AMR

    - All these curve-based partitioning techniques can also be applied to adaptive meshes by forming hierarchical SFCs