Evaluation of Agent Teamwork
A High Performance Distributed Computing Middleware
Solomon Lane
Agent Teamwork Research Assistant
October 2006 – March 2007
What is Agent Teamwork?
• HPDC Middleware
• Job Dispatch & Termination
• Programming Framework
• Under Ongoing Development
Project Objectives
• Evaluate Agent Teamwork’s performance against a contemporary alternative
– Job Dispatch & Termination Performance
– Framework Performance
• Build a Reference Platform
• Write 3 benchmark programs that exercise the framework
Job Dispatch & Termination Performance Evaluation
• Globus Based Reference Platform
– Globus Toolkit
– OpenPBS scheduler
– MPICH-G2
Reference Platform Hardware
Medusa Cluster: a 32-node cluster for research use
• Head node: 1.8GHz Xeon x2, 512MB memory, and 70GB HD; outbound 100Mbps
• Computing nodes:
– 24 nodes: 3.2GHz Xeon, 512MB memory, and 36GB HD; inbound 1Gbps
– 8 nodes: 2.8GHz Xeon, 512MB memory, and 60GB HD; inbound 2Gbps
Phoebe Cluster: a 32-node cluster for instructional use
• Head node: 1.5GHz Xeon, 256MB memory, and 40GB HD; outbound 100Mbps
• Computing nodes:
– 16 nodes: 1.5GHz Xeon, 512MB memory, and 30GB HD; inbound 100Mbps
– 16 nodes: 1.5GHz Xeon, 512MB memory, and 30GB HD; inbound 1Gbps
Reference Platform Overview
Reference Platform Challenges
• Administrator Access to Machines
• Host Config & Cryptic Error Messages
– DNS vs hosts files
– Inconsistent hosts files
– Inconsistent PTR records
– Inconsistent port ACLs
– : globus_init: failed
• GTK Authentication
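One of the host configuration problems listed above, inconsistent hosts files, can be checked mechanically. The following is a minimal sketch, not a tool used in the project: the class name `HostsFileCheck`, its methods, and the sample addresses and host names are all hypothetical. It parses the text of two hosts files and reports names that both files define but map to different addresses.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical helper: compares the name-to-address mappings of two hosts files. */
final class HostsFileCheck {

    /** Parses hosts-file text ("address name [alias...]", '#' starts a comment) into name -> address. */
    static Map<String, String> parse(String hostsText) {
        Map<String, String> byName = new LinkedHashMap<>();
        for (String line : hostsText.split("\\R")) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);   // drop the trailing comment
            }
            String[] fields = line.trim().split("\\s+");
            if (fields.length < 2) {
                continue;                         // blank line, or an address with no name
            }
            for (int i = 1; i < fields.length; i++) {
                byName.put(fields[i].toLowerCase(), fields[0]);
            }
        }
        return byName;
    }

    /** Returns the names that both files define but map to different addresses. */
    static Map<String, String> conflicts(String hostsA, String hostsB) {
        Map<String, String> a = parse(hostsA);
        Map<String, String> b = parse(hostsB);
        Map<String, String> result = new TreeMap<>();
        for (Map.Entry<String, String> entry : a.entrySet()) {
            String other = b.get(entry.getKey());
            if (other != null && !other.equals(entry.getValue())) {
                result.put(entry.getKey(), entry.getValue() + " vs " + other);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Made-up file contents standing in for /etc/hosts copied from two nodes.
        String nodeA = "127.0.0.1 localhost\n10.0.0.1 head\n10.0.0.2 node1 # compute\n";
        String nodeB = "127.0.0.1 localhost\n10.0.0.1 head\n10.0.0.9 node1\n";
        System.out.println(conflicts(nodeA, nodeB));
    }
}
```

The sketch only covers the hosts-file case; mismatches between DNS and hosts files, PTR records and port ACLs would need separate checks.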
Debugging
• Strace
• TcpDump
• GDB
Job Dispatching and Termination Function Evaluation
• Not evaluating the job execution performance
• Methodology
– Ported available test program to the MPICH-G2 framework
– Measure how long it takes a job submission to be deployed, executed and cleaned up
– Run with 2-64 nodes across the two clusters in a depth-first node distribution series and a breadth-first node distribution series
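The methodology above can be sketched in a few lines. The slides do not define the two series, so this sketch assumes that "depth-first" fills one cluster before using the other and that "breadth-first" spreads nodes evenly over both. The class `NodeDistribution`, its method names, the power-of-two node counts and the sleeping placeholder job are hypothetical; no real Globus submission is made.

```java
/** Hypothetical sketch of the two node-distribution series and the timing harness. */
final class NodeDistribution {

    /** Assumed "depth-first" split: fill cluster A before placing any node on cluster B. */
    static int[] depthFirst(int total, int capA, int capB) {
        requireFits(total, capA, capB);
        int onA = Math.min(total, capA);
        return new int[] {onA, total - onA};
    }

    /** Assumed "breadth-first" split: spread nodes over both clusters as evenly as capacity allows. */
    static int[] breadthFirst(int total, int capA, int capB) {
        requireFits(total, capA, capB);
        int onB = Math.min(total / 2, capB);
        int onA = total - onB;
        if (onA > capA) {                 // cluster A is full: move the remainder to B
            onA = capA;
            onB = total - onA;
        }
        return new int[] {onA, onB};
    }

    private static void requireFits(int total, int capA, int capB) {
        if (total < 0 || total > capA + capB) {
            throw new IllegalArgumentException("cannot place " + total + " nodes");
        }
    }

    /** Wall-clock seconds for one job; the Runnable must block until cleanup has finished. */
    static double timeJob(Runnable submitAndWait) {
        long start = System.nanoTime();
        submitAndWait.run();
        return (System.nanoTime() - start) / 1e9;
    }

    public static void main(String[] args) {
        for (int n = 2; n <= 64; n *= 2) {        // node counts are an assumption
            int[] d = depthFirst(n, 32, 32);
            int[] b = breadthFirst(n, 32, 32);
            // Placeholder job: a real run would submit to the grid and wait for termination.
            double seconds = timeJob(() -> {
                try {
                    Thread.sleep(10);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            System.out.printf("%2d nodes  depth-first %d+%d  breadth-first %d+%d  %.3f s%n",
                    n, d[0], d[1], b[0], b[1], seconds);
        }
    }
}
```

Timing the whole submission with one wall-clock interval matches the stated goal of measuring deployment, execution and cleanup together rather than the job's own execution speed.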
Results
Framework Function Evaluation
• Framework Issues
– Agent Teamwork MPI implementation
– MPICH-G2 C++
– MPIJava
• MPI Framework
– Communication functions
– Initialization, Barrier, Broadcast, Gather, Scatter, etc.
• Goal to write 3 benchmark programs that have communication intensive algorithms.
Benchmark Programs
• MD - a molecular dynamics simulation
• Wave2D - a wave dissemination simulation
• Mandelbrot - a Mandelbrot generator
• Code each program twice
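To make the Mandelbrot benchmark concrete, here is a minimal sketch of how such a generator might divide its work among ranks. It is not the project's benchmark code: the class `MandelbrotSketch`, the contiguous row split, and the plotted region are assumptions, and the ranks run one after another in plain Java where an MPI version would run them in parallel and combine the blocks with a Gather.

```java
/** Hypothetical sketch: a Mandelbrot image computed in per-rank blocks of rows. */
final class MandelbrotSketch {

    /** Iterations before the point c = (cx, cy) escapes |z| > 2, capped at maxIter. */
    static int escapeTime(double cx, double cy, int maxIter) {
        double x = 0.0, y = 0.0;
        int i = 0;
        while (i < maxIter && x * x + y * y <= 4.0) {
            double nextX = x * x - y * y + cx;
            y = 2.0 * x * y + cy;
            x = nextX;
            i++;
        }
        return i;
    }

    /** Rows [start, end) owned by rank when height rows are split into contiguous blocks. */
    static int[] rowRange(int rank, int size, int height) {
        int base = height / size;
        int extra = height % size;            // the first `extra` ranks take one more row
        int start = rank * base + Math.min(rank, extra);
        int end = start + base + (rank < extra ? 1 : 0);
        return new int[] {start, end};
    }

    /** Computes one rank's rows over the region [-2, 1] x [-1, 1]; width and height must exceed 1. */
    static int[][] computeBlock(int rank, int size, int width, int height, int maxIter) {
        int[] range = rowRange(rank, size, height);
        int[][] block = new int[range[1] - range[0]][width];
        for (int row = range[0]; row < range[1]; row++) {
            double cy = -1.0 + 2.0 * row / (height - 1);
            for (int col = 0; col < width; col++) {
                double cx = -2.0 + 3.0 * col / (width - 1);
                block[row - range[0]][col] = escapeTime(cx, cy, maxIter);
            }
        }
        return block;
    }

    public static void main(String[] args) {
        int width = 60, height = 20, size = 4, maxIter = 50;
        StringBuilder image = new StringBuilder();
        for (int rank = 0; rank < size; rank++) {   // sequential stand-in for parallel ranks
            for (int[] row : computeBlock(rank, size, width, height, maxIter)) {
                for (int iterations : row) {
                    image.append(iterations == maxIter ? '#' : '.');
                }
                image.append('\n');
            }
        }
        System.out.print(image);
    }
}
```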
Agent Teamwork Programming
• Snapshots
• Programming model
– func_n
    int func_0 (String[] Args){
        …
        return 1;
    }
    int func_1 () {
        …
    }
• Code Maturity
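The func_n programming model above can be illustrated with a small driver. This is a sketch under stated assumptions, not Agent Teamwork's implementation: it assumes that each func_n returns the index of the next function to call, that a negative return value ends the program, and that the boundary between two functions is where a snapshot could be taken. `FuncDriver` and `CounterProgram` are hypothetical names.

```java
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical driver that runs a program written as func_0, func_1, ... */
final class FuncDriver {

    /** Calls func_0, then the func_n named by each return value; returns the number of function boundaries passed. */
    static int run(Object program, String[] args) {
        int next = 0;
        int boundaries = 0;
        try {
            while (next >= 0) {               // assumption: a negative value ends the program
                if (next == 0) {
                    Method m = program.getClass().getMethod("func_0", String[].class);
                    next = (Integer) m.invoke(program, (Object) args);
                } else {
                    Method m = program.getClass().getMethod("func_" + next);
                    next = (Integer) m.invoke(program);
                }
                boundaries++;                 // a framework could snapshot the program's state here
            }
        } catch (NoSuchMethodException | IllegalAccessException | InvocationTargetException e) {
            throw new IllegalStateException("cannot run func_" + next, e);
        }
        return boundaries;
    }

    public static void main(String[] args) {
        CounterProgram program = new CounterProgram();
        int boundaries = run(program, new String[] {"a", "b"});
        System.out.println(boundaries + " boundaries, total = " + program.total);
    }
}

/** Hypothetical user program in the func_n style; its fields hold the state between functions. */
class CounterProgram {
    int total = 0;
    List<String> trace = new ArrayList<>();

    public int func_0(String[] args) {
        total = args.length;
        trace.add("func_0");
        return 1;                             // continue with func_1
    }

    public int func_1() {
        total *= 10;
        trace.add("func_1");
        return -1;                            // assumed end-of-program marker
    }
}
```

Keeping all live state in fields rather than local variables is what makes a function boundary a natural point to save and resume the program.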
Partial Results
Future Work
• Framework debugging
• Develop a pre-processor to convert conventionally programmed code into the snapshot-able func_n model
Skills Developed During Project
• Significant experience with Globus, OpenPBS and MPI
• Extensive debugging with tcpdump, strace, and gdb
• Experience with performance analysis and writing MPI programs
• New insights into and understanding of HPDC