Swift: implicitly parallel dataflow scripting for clusters, clouds, and multicore systems www.ci.uchicago.edu/swift Michael Wilde - [email protected]


When do you need Swift? Typical application: protein-ligand docking for drug screening

Figure: O(10) proteins implicated in a disease × O(100K) drug candidates ≈ 1M compute jobs, yielding tens of fruitful candidates for the wet lab & APS. (Work of M. Kubal, T. A. Binkowski, and B. Roux.)
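This pattern is a two-level sweep of independent tasks, which is exactly what Swift's foreach expresses. A minimal sketch of it in Swift, where dock(), the dock_exe binary, the file patterns, and the score paths are all hypothetical names for illustration:

type Protein;
type Ligand;
type Score;

app (Score s) dock (Protein p, Ligand l)
{
  dock_exe @p @l stdout=@filename(s);    // hypothetical docking executable
}

Protein proteins[ ] <filesys_mapper; pattern="proteins/*.pdb">;
Ligand candidates[ ] <filesys_mapper; pattern="ligands/*.mol2">;

foreach p, i in proteins {
  foreach c, j in candidates {           // O(10) x O(100K) ~ 1M independent jobs
    Score s <single_file_mapper; file=@strcat("scores/", i, "_", j, ".txt")>;
    s = dock(p, c);
  }
}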


Numerous many-task applications

A. Simulation of super-cooled glass materials
B. Protein folding using homology-free approaches
C. Climate model analysis and decision making in energy policy
D. Simulation of RNA-protein interaction
E. Multiscale subsurface flow modeling
F. Modeling of power grid for OE applications

A-E have published science results obtained using Swift.

Figure: protein loop modeling, courtesy A. Adhikari. Initial, predicted, and native structures for T0623, 25 res., improving from 8.2Å to 6.3Å (excluding tail).


Language-driven: Swift parallel scripting

Swift runs parallel scripts on a broad range of parallel computing resources.

Figure: a Swift script on a submit host (login node, laptop, Linux server) drives application programs on clusters, clouds (Amazon EC2, XSEDE Wispy, …), and systems spanning 10^15 to 10^18 ops/s, with files moved through a data server.


Example – MODIS satellite image processing

• Input: tiles of earth land cover (forest, ice, water, urban, etc.)
• Output: regions with maximal specific land types

Figure: the MODIS dataset feeds the MODIS analysis script, which selects the 5 largest forest land-cover tiles in the processed region.


Goal: Run MODIS processing pipeline in cloud

The MODIS script is automatically run in parallel: each loop level can process tens to thousands of image files.

Figure: the pipeline's dataflow graph: getLandUse (×317) → analyzeLandUse → markMap, with colorMODIS (×317) running alongside, and assemble combining the selected tiles and colored images.


MODIS script in Swift: main data flow

foreach g, i in geos {                       // one task per input tile
  land[i] = getLandUse(g, 1);
}
(topSelected, selectedTiles) =
  analyzeLandUse(land, landType, nSelect);   // waits for every land[i]
foreach g, i in geos {
  colorImage[i] = colorMODIS(g);             // runs in parallel with the analysis
}
gridMap = markMap(topSelected);
montage =
  assemble(selectedTiles, colorImage, webDir);


Parallel distributed scripting for clouds

Swift runs parallel scripts on cloud resources provisioned by Nimbus's Phantom service.

Figure: a Swift script on the submit host (login node, laptop, Linux server) dispatches work to clouds (Amazon EC2, NSF FutureGrid, Wispy, …) provisioned through Nimbus and Phantom, with files moved through a data server.


Example of Cloud programming: Nimbus-Phantom-Swift on FutureGrid

1. User provisions 5 nodes with Phantom
   – Phantom starts 5 VMs
   – Swift worker agents in the VMs contact the Swift coaster service to request work
2. User starts the Swift application script "MODIS"
   – Swift places application jobs on free workers
   – Workers pull input data, run the app, push output data
3. 3 nodes fail and shut down
   – Jobs in progress fail; Swift retries them
4. User adds more nodes with Phantom
   – User asks Phantom to increase the node allocation to 12
   – New Swift worker agents register; Swift picks up the new workers and runs more jobs in parallel
5. Workload completes
   – Science results are available on the output data server
   – The worker infrastructure remains available for new workloads


Programming model: all execution driven by parallel data flow

(int r) myproc (int i)
{
  j = f(i);
  k = g(i);
  r = j + k;
}

f() and g() are computed in parallel; myproc() returns r when they are done. This parallelism is automatic, and works recursively throughout the program's call graph.
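To make the ordering concrete, here is a minimal sketch (f and g are hypothetical functions, as above) showing that data dependence, not statement order, drives execution:

(int r) pipeline (int i)
{
  int j = f(i);    // starts immediately
  int m = g(i);    // independent of j, so it runs in parallel with f(i)
  int k = g(j);    // consumes j, so it waits for f(i) to complete
  r = k + m;       // r is assigned once both k and m are available
}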


Encapsulation enables distributed parallelism

Encapsulation is the key to transparent distribution, parallelization, and automatic provenance capture.

Figure: a Swift app( ) function is an interface definition wrapping an application program; it connects typed Swift data objects to the files the program expects or produces.


app( ) functions specify command-line argument passing

The Swift app function predict() wraps the PSim application. To run it by hand:

psim -s 1ubq.fas -pdb p -t 100.0 -d 25.0 >log

In Swift code:

app (PDB pg, File log) predict (Protein pseq, Float temp, Float dt)
{
  psim "-c" "-s" @pseq.fasta "-pdb" @pg "-t" temp "-d" dt;
}

Protein p <ext; exec="Pmap", id="1ubq">;
PDB structure;
File log;
(structure, log) = predict(p, 100., 25.);

Figure: predict()'s inputs (the Fasta file behind pseq, temp = 100.0, dt = 25.0) map to psim's -s, -t, and -d flags; the PDB output pg and the log output (psim's stdout) flow back to the caller.
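The slide does not show how psim's stdout is captured into log; in Swift's app syntax this is normally declared with a stdout redirect. A sketch, assuming the stdout keyword of Swift releases from this era:

app (PDB pg, File log) predict (Protein pseq, Float temp, Float dt)
{
  psim "-c" "-s" @pseq.fasta "-pdb" @pg "-t" temp "-d" dt stdout=@filename(log);
}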


Large scale parallelization with simple loops

foreach sim in [1:1000] {
  (structure[sim], log[sim]) = predict(p, 100., 25.);
}
result = analyze(structure);

Figure: 1000 runs of the "predict" application feed a single analyze() step; predicted structures for targets T1af7, T1r69, and T1b72 are shown.


Nested parallel prediction loops in Swift

Sweep()
{
  int nSim = 1000;
  int maxRounds = 3;
  Protein pSet[ ] <ext; exec="Protein.map">;
  float startTemp[ ] = [ 100.0, 200.0 ];
  float delT[ ] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];
  foreach p, pn in pSet {
    foreach t in startTemp {
      foreach d in delT {
        ItFix(p, nSim, maxRounds, t, d);
      }
    }
  }
}
Sweep();

10 proteins × 1000 simulations × 3 rounds × 2 temps × 5 deltas = 300K tasks.

Spatial normalization of functional run

Figure: the dataset-level workflow (left) and the expanded 10-volume workflow (right). The stages reorient → reorient → random_select → alignlinear → reslice → softmean → alignlinear → combine_warp → reslice_warp → strictmean → binarize → gsmooth each expand into many per-volume tasks (reorient/01 … gsmooth/50).


Complex scripts can be well-structured: programming in the large, an fMRI spatial normalization script example

(Run snr) functional (Run r, NormAnat a, Air shrink)
{
  Run yroRun = reorientRun( r, "y" );
  Run roRun = reorientRun( yroRun, "x" );
  Volume std = roRun[0];
  Run rndr = random_select( roRun, 0.1 );
  AirVector rndAirVec = align_linearRun( rndr, std, 12, 1000, 1000, "81 3 3" );
  Run reslicedRndr = resliceRun( rndr, rndAirVec, "o", "k" );
  Volume meanRand = softmean( reslicedRndr, "y", "null" );
  Air mnQAAir = alignlinear( a.nHires, meanRand, 6, 1000, 4, "81 3 3" );
  Warp boldNormWarp = combinewarp( shrink, a.aWarp, mnQAAir );
  Run nr = reslice_warp_run( boldNormWarp, roRun );
  Volume meanAll = strictmean( nr, "y", "null" );
  Volume boldMask = binarize( meanAll, "y" );
  snr = gsmoothRun( nr, boldMask, "6 6 6" );
}

(Run or) reorientRun (Run ir, string direction)
{
  foreach Volume iv, i in ir.v {
    or.v[i] = reorient(iv, direction);
  }
}


Dataset mapping example: fMRI datasets

type Study {
  Group g[ ];
}
type Group {
  Subject s[ ];
}
type Subject {
  Volume anat;
  Run run[ ];
}
type Run {
  Volume v[ ];
}
type Volume {
  Image img;
  Header hdr;
}

A mapping function or script connects the on-disk data layout to Swift's in-memory data model.
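A minimal sketch of binding such a structure to disk and walking it in parallel, assuming a hypothetical external mapper script fmri_mapper.sh that emits one file path per leaf field (trace() and @filename() are Swift built-ins):

Study study <ext; exec="fmri_mapper.sh">;   // hypothetical mapper script

foreach g in study.g {
  foreach subj in g.s {
    // every subject's anatomical volume arrives as a typed, file-backed object
    trace( @filename( subj.anat.img ) );
  }
}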


Simple campus Swift usage: running locally on a cluster login host

Swift can send work to multiple sites.

Figure: Swift on any host submits work through a Condor or GRAM provider to the local scheduler (via GRAM) of each remote campus cluster, while a remote data provider moves files through a data server (GridFTP, scp, …).


Flexible worker-node agents for execution and data transport

• Main role is to efficiently run Swift tasks on allocated compute nodes, local and remote
• Handles short tasks efficiently
• Runs over Grid infrastructure: Condor, GRAM
• Also runs with few infrastructure dependencies
• Can optionally perform data staging
• Provisioning can be automatic or external (manually launched or externally scripted); a site-definition sketch follows this list
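For concreteness, a coaster site definition of the kind used by Swift sites files of this era; the exact element and key names are an assumption to verify against your version's documentation, so treat this as a sketch rather than a canonical config:

<pool handle="campus-cluster">
  <execution provider="coaster" jobmanager="local:pbs"/>  <!-- boot coaster workers via the local PBS scheduler -->
  <profile namespace="globus" key="workersPerNode">8</profile>
  <profile namespace="globus" key="maxWalltime">00:10:00</profile>
  <filesystem provider="local"/>
  <workdirectory>/scratch/swiftwork</workdirectory>
</pool>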


Coaster architecture handles diverse environments


Coasters deployment can be automatic or manual

Swift can deploy coasters automatically, or the user can deploy them manually or with a script. Swift provides deployment scripts for clouds, ad-hoc clusters, etc.

Figure: in a centralized campus cluster environment, Swift's coaster auto-boot starts a coaster service on a remote host via GRAM or SSH; the cluster scheduler then launches coaster workers on the compute nodes, while a data provider moves files (f1, a1) through a data server.


• Swift is a parallel scripting system for grids, clouds and clusters
  – for loosely-coupled applications: application and utility programs linked by exchanging files
• Swift is easy to write: a simple, high-level, C-like functional language
  – small Swift scripts can do large-scale work
• Swift is easy to run: it contains all services for running Grid workflow in one Java application
  – untar and run: it acts as a self-contained Grid client
• Swift is fast: it uses the efficient, scalable and flexible "Karajan" execution engine
  – scaling close to 1M tasks; 0.5M in live science work, and growing
• Swift usage is growing:
  – applications in neuroscience, proteomics, molecular dynamics, biochemistry, economics, statistics, and more
• Try Swift! www.ci.uchicago.edu/swift and www.mcs.anl.gov/exm

Further reading: the Swift paper in Parallel Computing, Sep 2011.

Acknowledgments

• Swift is supported in part by NSF grants OCI-1148443 and PHY-636265, NIH DC08638, and the UChicago SCI Program
• ExM is supported by the DOE Office of Science, ASCR Division
• Structure prediction supported in part by NIH
• The Swift team: Mihael Hategan, Justin Wozniak, Ketan Maheshwari, Ben Clifford, David Kelly, Jon Monette, Allan Espinosa, Ian Foster, Dan Katz, Ioan Raicu, Sarah Kenny, Mike Wilde, Zhao Zhang, Yong Zhao
• GPSI Science portal: Mark Hereld, Tom Uram, Wenjun Wu, Mike Papka
• Java CoG Kit, used by Swift, developed by Mihael Hategan, Gregor von Laszewski, and many collaborators
• ZeptoOS: Kamil Iskra, Kazutomo Yoshii, and Pete Beckman
• Scientific application collaborators and usage described in this talk:
  – U. Chicago Open Protein Simulator Group (Karl Freed, Tobin Sosnick, Glen Hocky, Joe Debartolo, Aashish Adhikari)
  – U. Chicago Radiology and Human Neuroscience Lab (Dr. S. Small)
  – SEE/CIM-EARTH: Joshua Elliott, Meredith Franklin, Todd Munson
  – ParVis and FOAM: Rob Jacob, Sheri Mickelson (Argonne); John Dennis, Matthew Woitaszek (NCAR)
  – PTMap: Yingming Zhao, Yue Chen
