
Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware


Page 1: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Evaluation of Agent Teamwork

A High Performance Distributed Computing Middleware

Solomon Lane

Agent Teamwork Research Assistant

October 2006 – March 2007

Page 2: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

What is Agent Teamwork?

• HPDC Middleware

• Job Dispatch & Termination

• Programming Framework

• Under Ongoing Development

Page 3: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Project Objectives

• Evaluate Agent Teamwork’s performance against a contemporary alternative
– Job Dispatch & Termination Performance
– Framework Performance

• Build a Reference Platform

• Write 3 benchmark programs that exercise the framework

Page 4: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Job Dispatch & Termination Performance Evaluation

• Globus Based Reference Platform
– Globus Toolkit
– OpenPBS scheduler
– MPICH-G2

Page 5: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Reference Platform Hardware

Medusa Cluster: a 32-node cluster for research use

• Head node (outbound 100Mbps): 1.8GHz Xeon x2, 512MB memory, and 70GB HD

• Computing nodes
– 24 nodes (inbound 1Gbps): 3.2GHz Xeon, 512MB memory, and 36GB HD
– 8 nodes (inbound 2Gbps): 2.8GHz Xeon, 512MB memory, and 60GB HD

Phoebe Cluster: a 32-node cluster for instructional use

• Head node (outbound 100Mbps): 1.5GHz Xeon, 256MB memory, and 40GB HD

• Computing nodes
– 16 nodes (inbound 100Mbps): 1.5GHz Xeon, 512MB memory, and 30GB HD
– 16 nodes (inbound 1Gbps): 1.5GHz Xeon, 512MB memory, and 30GB HD

Page 6: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Reference Platform Overview

Page 7: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Reference Platform Challenges

• Administrator Access to Machines

• Host Config & Cryptic Error Messages
– DNS vs hosts files
– Inconsistent hosts files
– Inconsistent PTR records
– Inconsistent port ACLs
– : globus_init: failed

• GTK Authentication
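
One way to catch the host-name inconsistencies listed above is to compare a forward lookup with the reverse lookup of the address it returns. The following is only a minimal sketch using the standard `java.net.InetAddress` class; the class name `HostCheck` and its methods are hypothetical and are not part of the project's tooling. Note that the JVM uses the system resolver, so whether the answer comes from DNS or from a hosts file depends on how each machine is configured.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Locale;

// Hypothetical helper: flags hosts whose forward and reverse names disagree.
class HostCheck {
    // Normalizes a host name for comparison: trimmed, lower case, no trailing dot.
    static String normalize(String name) {
        String n = name.trim().toLowerCase(Locale.ROOT);
        return n.endsWith(".") ? n.substring(0, n.length() - 1) : n;
    }

    // True when two names refer to the same host name after normalization.
    static boolean sameHost(String a, String b) {
        return normalize(a).equals(normalize(b));
    }

    // Forward lookup, then reverse lookup of the resulting address.
    static String check(String host) {
        try {
            InetAddress addr = InetAddress.getByName(host);
            // Returns the address literal itself when no reverse name is found.
            String reverse = addr.getCanonicalHostName();
            String verdict = sameHost(host, reverse) ? "OK" : "MISMATCH";
            return host + " -> " + addr.getHostAddress() + " -> " + reverse + " " + verdict;
        } catch (UnknownHostException e) {
            return host + " -> unresolved";
        }
    }

    public static void main(String[] args) {
        // Pass fully qualified host names on the command line.
        for (String host : args) {
            System.out.println(check(host));
        }
    }
}
```

Running the same check on every node and comparing the output is a cheap way to spot a machine whose hosts file or PTR record differs from the others.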

Page 8: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Debugging

• Strace

• TcpDump

• GDB

Page 9: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Job Dispatching and Termination Function Evaluation

• Not evaluating the job execution performance

• Methodology
– Ported available test program to the MPICH-G2 framework
– Measure how long it takes a job submission to be deployed, executed and cleaned up
– Run with 2-64 nodes across the two clusters in a depth-first node distribution series and a breadth-first node distribution series
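
The two distribution series can be sketched as follows. The slides do not define the terms, so this assumes that depth-first means filling one cluster before using the other, and breadth-first means splitting the nodes as evenly as possible across both. The class `NodeSeries`, its method names, and the choice of power-of-two job sizes are all illustrative assumptions.

```java
import java.util.Arrays;

// Hypothetical sketch of how n nodes might be split across two clusters.
class NodeSeries {
    static void requireFits(int total, int capA, int capB) {
        if (total < 0 || total > capA + capB) {
            throw new IllegalArgumentException("cannot place " + total + " nodes");
        }
    }

    // Depth-first (assumed meaning): fill cluster A before touching cluster B.
    static int[] depthFirst(int total, int capA, int capB) {
        requireFits(total, capA, capB);
        int a = Math.min(total, capA);
        return new int[] {a, total - a};
    }

    // Breadth-first (assumed meaning): keep the split as even as capacity allows.
    static int[] breadthFirst(int total, int capA, int capB) {
        requireFits(total, capA, capB);
        int b = Math.min(total / 2, capB);
        int a = total - b;
        if (a > capA) {          // cluster A is full: move the overflow to B
            a = capA;
            b = total - a;
        }
        return new int[] {a, b};
    }

    public static void main(String[] args) {
        int cap = 32; // each cluster in the slides has 32 computing nodes
        for (int n = 2; n <= 64; n *= 2) { // job sizes are an assumption
            System.out.println(n + " nodes: depth-first "
                + Arrays.toString(depthFirst(n, cap, cap))
                + ", breadth-first " + Arrays.toString(breadthFirst(n, cap, cap)));
        }
    }
}
```

Under these assumptions a 16-node job runs entirely on one cluster in the depth-first series but spans both clusters (8 and 8) in the breadth-first series, which is what makes the two series useful for separating intra-cluster from inter-cluster dispatch cost.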

Page 10: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Results

Page 11: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Results

Page 12: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Results

Page 13: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Framework Function Evaluation

• Framework Issues
– Agent Teamwork MPI implementation
– MPICH-G2 C++
– MPIJava

• MPI Framework
– Communication functions
– Initialization, Barrier, Broadcast, Gather, Scatter, etc.

• Goal to write 3 benchmark programs that have communication intensive algorithms.
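
For readers unfamiliar with the collective operations named above, the sketch below imitates what Broadcast, Scatter and Gather do to the data, using plain Java arrays in a single process. It is not the Agent Teamwork, MPICH-G2 or mpiJava API; the class `CollectiveSketch` and its methods are hypothetical stand-ins, with one inner array per rank.

```java
import java.util.Arrays;

// Single-process imitation of three MPI collectives; index r stands for rank r.
class CollectiveSketch {
    // Broadcast: every rank ends up with its own copy of the root's buffer.
    static int[][] broadcast(int[] rootData, int ranks) {
        int[][] copies = new int[ranks][];
        for (int r = 0; r < ranks; r++) {
            copies[r] = rootData.clone();
        }
        return copies;
    }

    // Scatter: the root's buffer is cut into equal chunks, one chunk per rank.
    static int[][] scatter(int[] rootData, int ranks) {
        if (ranks <= 0 || rootData.length % ranks != 0) {
            throw new IllegalArgumentException("data must divide evenly among ranks");
        }
        int chunk = rootData.length / ranks;
        int[][] perRank = new int[ranks][];
        for (int r = 0; r < ranks; r++) {
            perRank[r] = Arrays.copyOfRange(rootData, r * chunk, (r + 1) * chunk);
        }
        return perRank;
    }

    // Gather: the chunks are concatenated at the root in rank order.
    static int[] gather(int[][] perRank) {
        int total = 0;
        for (int[] part : perRank) {
            total += part.length;
        }
        int[] out = new int[total];
        int pos = 0;
        for (int[] part : perRank) {
            System.arraycopy(part, 0, out, pos, part.length);
            pos += part.length;
        }
        return out;
    }

    public static void main(String[] args) {
        int[][] parts = scatter(new int[] {1, 2, 3, 4, 5, 6, 7, 8}, 4);
        for (int[] part : parts) {           // each "rank" doubles its own chunk
            for (int i = 0; i < part.length; i++) {
                part[i] *= 2;
            }
        }
        System.out.println(Arrays.toString(gather(parts))); // [2, 4, 6, 8, 10, 12, 14, 16]
    }
}
```

In a real MPI program each rank is a separate process and these calls move data over the network, which is why benchmarks built around them stress the framework's communication layer.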

Page 14: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Benchmark Programs

• MD - a molecular dynamics simulation

• Wave2D - a wave dissemination simulation

• Mandelbrot - a Mandelbrot generator

• Code each program twice
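
As an illustration of how a benchmark such as the Mandelbrot generator can be divided among ranks, the sketch below assigns each rank a contiguous block of image rows and computes escape times for those rows only. This is not the project's benchmark code; the class `MandelbrotSketch`, the block distribution, and the plotted region are assumptions chosen for the example.

```java
// Hypothetical sketch: each rank renders its own block of rows of the image.
class MandelbrotSketch {
    // Number of iterations before z = z*z + c leaves the radius-2 disc (capped at maxIter).
    static int escapeTime(double cr, double ci, int maxIter) {
        double zr = 0.0, zi = 0.0;
        int i = 0;
        while (i < maxIter && zr * zr + zi * zi <= 4.0) {
            double t = zr * zr - zi * zi + cr;
            zi = 2.0 * zr * zi + ci;
            zr = t;
            i++;
        }
        return i;
    }

    // Block distribution: returns {firstRow, endRowExclusive} for the given rank.
    static int[] rowRange(int rank, int ranks, int height) {
        int base = height / ranks;
        int extra = height % ranks;              // the first `extra` ranks get one more row
        int start = rank * base + Math.min(rank, extra);
        int len = base + (rank < extra ? 1 : 0);
        return new int[] {start, start + len};
    }

    // Renders only the rows owned by `rank`; width and height must be at least 2.
    static int[][] renderRows(int rank, int ranks, int width, int height, int maxIter) {
        int[] range = rowRange(rank, ranks, height);
        int[][] rows = new int[range[1] - range[0]][width];
        for (int y = range[0]; y < range[1]; y++) {
            double ci = -1.5 + 3.0 * y / (height - 1);   // imaginary axis: -1.5 .. 1.5
            for (int x = 0; x < width; x++) {
                double cr = -2.0 + 3.0 * x / (width - 1); // real axis: -2.0 .. 1.0
                rows[y - range[0]][x] = escapeTime(cr, ci, maxIter);
            }
        }
        return rows;
    }

    public static void main(String[] args) {
        int ranks = 4, width = 16, height = 10;
        for (int r = 0; r < ranks; r++) {
            int[][] rows = renderRows(r, ranks, width, height, 100);
            System.out.println("rank " + r + " rendered " + rows.length + " rows");
        }
    }
}
```

In an MPI version each rank would call `renderRows` with its own rank number and the root would collect the blocks with a gather.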

Page 15: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Agent Teamwork Programming

• Snapshots

• Programming model
– func_n

    int func_0 (String[] Args){
        …
        return 1;
    }

    int func_1 () {
        …
    }

• Code Maturity
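
The func_n model can be pictured as a small driver loop. The sketch below assumes that each function's return value is the index of the next function to run (so `return 1` in `func_0` hands control to `func_1`), and that the middleware can take a snapshot between calls. The driver class `FuncDriver`, the sample program `CounterApp`, and the `DONE` sentinel are hypothetical and are not taken from the Agent Teamwork code.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

// Hypothetical driver: calls func_0, then whichever func_n each call names next.
class FuncDriver {
    static final int DONE = -1; // assumed end marker, not from the source

    // Runs the program and returns the order in which the functions were called.
    static List<Integer> run(Object app, String[] args) {
        List<Integer> trace = new ArrayList<>();
        int next = 0;
        try {
            while (next != DONE) {
                trace.add(next);
                Object result;
                if (next == 0) {
                    Method m = app.getClass().getMethod("func_0", String[].class);
                    result = m.invoke(app, (Object) args); // func_0 receives the arguments
                } else {
                    Method m = app.getClass().getMethod("func_" + next);
                    result = m.invoke(app);
                }
                // A middleware could capture a snapshot of `app` at this boundary.
                next = (Integer) result;
            }
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("func_" + next + " could not be called", e);
        }
        return trace;
    }

    public static void main(String[] argv) {
        CounterApp app = new CounterApp();
        List<Integer> trace = run(app, new String[] {"a", "b"});
        System.out.println(trace + " total=" + app.total); // [0, 1] total=12
    }
}

// Hypothetical user program written in the func_n style.
class CounterApp {
    int total = 0;                      // state that a snapshot would have to capture

    public int func_0(String[] args) {
        total = args.length;
        return 1;                       // continue with func_1
    }

    public int func_1() {
        total += 10;
        return FuncDriver.DONE;         // stop
    }
}
```

Because all progress between functions lives in object fields rather than on the call stack, a program in this style can be saved at any function boundary and resumed from the recorded function index.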

Page 16: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Partial Results

Page 17: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Partial Results

Page 18: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Future Work

• Framework debugging

• Develop a pre-processor to convert conventionally programmed code into the snapshot-able func_n model

Page 19: Evaluation of Agent Teamwork A High Performance Distributed Computing Middleware

Skills Developed During Project

• Significant experience with Globus, OpenPBS and MPI

• Extensive debugging with tcpdump, strace, and gdb

• Experience with performance analysis and writing MPI programs

• New insights and understanding of HPDC