Overhead Supercomputing 2011

Workflow Overhead Analysis and Optimizations

Weiwei Chen, Ewa Deelman
Information Sciences Institute, University of Southern California
{wchen,deelman}@isi.edu

WORKS11, Nov 14 2011, Seattle WA

Outline

• Introduction
• Overhead modeling
• Cumulative overhead
• Experiments and evaluations
• Conclusions and future work

Introduction

• Workflow Optimization
• Scheduling
• Reducing Runtime
• Reducing and Overlapping Overheads
• Overheads
• Benefits
• Workflow modeling and simulation
• Performance evaluation
• New optimization methods

Fig. 1: System Overview

Modeling Overheads

Workflow Events:
• Workflow Engine Start
• Workflow Engine Finished

Job Events:
• Job Release
• Job Submit
• Job Execute
• Job Terminate
• Postscript Start
• Postscript Terminate

Timeline intervals (figure labels):
• Workflow engine delay
• Queue delay
• Runtime
• Postscript delay
• Makespan
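
The job events above suggest a per-job breakdown of the timeline. The following is a minimal sketch, not the authors' code. It assumes that workflow engine delay runs from Job Release to Job Submit, queue delay from Job Submit to Job Execute, runtime from Job Execute to Job Terminate, and postscript delay from Postscript Start to Postscript Terminate. The `JobEvents` class, the function names, and the timestamps are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class JobEvents:
    """Timestamps (seconds) of the six job events; field names are hypothetical."""
    release: float
    submit: float
    execute: float
    terminate: float
    postscript_start: float
    postscript_terminate: float


def job_intervals(ev: JobEvents) -> dict:
    """Durations per interval, under the event-to-interval mapping assumed above."""
    return {
        "workflow engine delay": ev.submit - ev.release,
        "queue delay": ev.execute - ev.submit,
        "runtime": ev.terminate - ev.execute,
        "postscript delay": ev.postscript_terminate - ev.postscript_start,
    }


def makespan(engine_start: float, engine_finished: float) -> float:
    """Makespan taken as Workflow Engine Finished minus Workflow Engine Start."""
    return engine_finished - engine_start


# Hypothetical timestamps for a single job.
example = JobEvents(release=0, submit=10, execute=30, terminate=90,
                    postscript_start=90, postscript_terminate=100)
intervals = job_intervals(example)
```

With these made-up timestamps the job has a 10 s workflow engine delay, a 20 s queue delay, a 60 s runtime, and a 10 s postscript delay.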

1 http://pegasus.isi.edu/wms/docs/3.1/monitoring_debugging_stats.php#plotting_statistics

Cumulative Overhead (O1)

O1(workflow engine delay) = 10 + 10 + 10 = 30
O1(queue delay) = 10 + 20 + 10 = 40
O1(data transfer delay) = 10
O1(postscript delay) = 10 + 20 + 10 = 40

• O1 simply sums the overheads of the same type across all jobs.

Cumulative Overhead (O2)

O2(workflow engine delay) = 20
O2(queue delay) = 30
O2(data transfer delay) = 10
O2(postscript delay) = 40

•  O2 subtracts from O1 the overlaps of the same type of overhead.

Cumulative Overhead (O3)

O3(workflow engine delay) = 20
O3(queue delay) = 20
O3(data transfer delay) = 10
O3(postscript delay) = 30

• O3 subtracts the overlap of dissimilar overheads from O2.
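
The three metrics can be sketched over time intervals. This is an illustrative reading, not the authors' implementation. O1 sums durations, O2 measures the union of same-type intervals, and O3 is assumed here to subtract from O2 the time during which a type overlaps any other overhead type. The function names and the sample intervals are hypothetical and do not reproduce the figure behind the numbers above.

```python
from collections import defaultdict


def union_length(intervals):
    """Length covered by (start, end) intervals, counting overlapping time once."""
    total = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            if cur_end is not None:
                total += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            cur_end = max(cur_end, end)
    if cur_end is not None:
        total += cur_end - cur_start
    return total


def cumulative_overheads(events):
    """events: iterable of (overhead_type, start, end). Returns {type: (O1, O2, O3)}."""
    by_type = defaultdict(list)
    for kind, start, end in events:
        by_type[kind].append((start, end))
    result = {}
    for kind, spans in by_type.items():
        o1 = sum(end - start for start, end in spans)  # plain sum of durations
        o2 = union_length(spans)                       # same-type overlap removed
        others = [s for k, v in by_type.items() if k != kind for s in v]
        # Inclusion-exclusion: |A and B| = |A| + |B| - |A or B|
        shared = o2 + union_length(others) - union_length(spans + others)
        o3 = o2 - shared                               # assumed reading of O3
        result[kind] = (o1, o2, o3)
    return result


# Hypothetical intervals (seconds); not taken from the slides.
sample = [
    ("queue delay", 0, 10),
    ("queue delay", 5, 25),
    ("postscript delay", 20, 30),
    ("postscript delay", 40, 50),
]
metrics = cumulative_overheads(sample)
```

For these sample intervals, queue delay comes out as O1 = 30, O2 = 25, O3 = 20, and postscript delay as O1 = 20, O2 = 20, O3 = 15. Other ways of attributing the overlap between dissimilar overheads are possible.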

Experiments

• Environments:
  • Amazon EC2
  • FutureGrid
  • HPCC
  • Other clusters

• Applications:
  • Biology: Epigenomics, Proteomics, SIPHT
  • Earthquake science: Broadband, CyberShake
  • Astronomy: Montage
  • Physics: LIGO

• Optimizations:
  • Job Clustering
  • Resource Provisioning
  • Data Pre-Staging
  • Throttling

Data are available at http://pegasus.isi.edu/workflow_gallery/

Experiments

Distribution of Overheads

Job Clustering

With job clustering, the cumulative overheads decrease greatly due to the decreased number of jobs.

[Figure: panels without clustering and with clustering]

•  Merging small jobs into a clustered job  
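
The effect of fewer jobs can be illustrated with a toy model. This is a rough sketch that assumes every job pays the same fixed overhead, which real workflows do not guarantee. The function names, the cluster size, and the 10 s figure are hypothetical.

```python
def cluster_tasks(tasks, cluster_size):
    """Group consecutive tasks into clustered jobs of at most cluster_size tasks."""
    return [tasks[i:i + cluster_size] for i in range(0, len(tasks), cluster_size)]


def total_per_job_overhead(num_jobs, overhead_per_job_s):
    """Summed (O1-style) overhead if each job pays the same fixed overhead."""
    return num_jobs * overhead_per_job_s


tasks = [f"task_{i}" for i in range(10)]      # hypothetical small tasks
clusters = cluster_tasks(tasks, cluster_size=4)

unclustered = total_per_job_overhead(len(tasks), overhead_per_job_s=10)
clustered = total_per_job_overhead(len(clusters), overhead_per_job_s=10)
```

Ten tasks become three clustered jobs, so the summed per-job overhead in this toy model drops from 100 s to 30 s.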

Percentage(%) = cumulative overhead (seconds) / makespan (seconds)
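
A small sketch of the percentage calculation, assuming the ratio is scaled by 100 to express it in percent. The function name and the sample values are hypothetical.

```python
def overhead_percentage(cumulative_overhead_s, makespan_s):
    """Cumulative overhead as a percentage of the makespan."""
    if makespan_s <= 0:
        raise ValueError("makespan must be positive")
    return 100.0 * cumulative_overhead_s / makespan_s


pct = overhead_percentage(30, 120)        # hypothetical values
pct_over = overhead_percentage(150, 120)  # summed overheads larger than the makespan
```

Because O1 counts overlapping overheads once per job, an O1-based percentage can exceed 100, as the second call shows.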

Resource Provisioning

O2 and O3 show more clearly than O1 that the share of runtime increases with provisioning.

[Figure: panels without provisioning and with provisioning]

•  Deploy pilot jobs as placeholders  

Conclusions and Future Work

Conclusions
• Overhead Analysis
• A complete view of these three metrics

Future Work
• More optimization methods
• Dynamic provisioning

Q & A

• Pegasus Group: http://pegasus.isi.edu/
• FutureGrid: https://portal.futuregrid.org/
• Scripts are available at http://isi.edu/~wchen/techniques.html
• Data are available at http://pegasus.isi.edu/workflow_gallery/
