DAS-2 workshop, June 6 2002
Experiences with Globus on DAS-2 in an educational setting
Herbert Bos & Lex Wolters
LIACS, Leiden University
{herbertb,llexx}@liacs.nl
Seminar Grid Computing
• Fall 2001
• 11 students (year 3 or 4) started, 8 finished
• Once a week, 2 hours
• 13 classes
• Programming assignment
Goals
• Try to separate Grid hype from Grid reality
• Show the underlying technologies that are currently being developed and used to provide a 'pervasive computing grid'
Topics by lecturers
• What is Grid Computing?
• Requirements
• History
• Grid architecture
• Basic Services
• Taxonomy of Grids
• QoS (final class)
Presentations by students
• Legion
• Globus
• Resource management: GRAM
• Scheduling: AppLeS
• Communication (Nexus, etc.)
• Information service: MDS, GRIS, GIIS
• Data access: GASS + RSL
• Security: GSS-API
• Language support
Programming assignment
Using Globus to implement a Grid application:
– Computation split into subtasks, which are distributed to computational nodes
– Final result is the combination of the results of the subtasks
– Resource discovery
– At least one of the following options:
  • Data is distributed in a secure fashion
  • Incorporate costs
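The split/distribute/combine pattern required by the assignment can be sketched without Globus. This is a minimal local stand-in, assuming a Python thread pool in place of Grid nodes; the names split, subtask and run_distributed are hypothetical, and resource discovery, security and costs are not modelled.

```python
from concurrent.futures import ThreadPoolExecutor


def split(data, n_chunks):
    """Chop the input into at most n_chunks contiguous, near-equal pieces."""
    size, extra = divmod(len(data), n_chunks)
    chunks, start = [], 0
    for i in range(n_chunks):
        end = start + size + (1 if i < extra else 0)
        if end > start:
            chunks.append(data[start:end])
        start = end
    return chunks


def subtask(chunk):
    """Hypothetical unit of work: sum of squares over one chunk."""
    return sum(x * x for x in chunk)


def run_distributed(data, n_workers=4):
    """Distribute subtasks to workers and combine the partial results."""
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(subtask, split(data, n_workers)))
    return sum(partials)


if __name__ == "__main__":
    print(run_distributed(list(range(1000))))
```

In a real Grid setting each call to subtask would become a job submitted to a remote node; only the final combination step stays with the submitting process.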
Topics
• Willem de Bruijn: Distributed Evolutionary Algorithm
• Hongqin Chen: RSA Key Breaking
• Jeroen Laros: GridCrafty
• Hui Li: Parallel Fractal Image Generation
• Yafei Sun: Adaptive Quadrature
• Arjan Tijms & Shlomo Raikin: Parallel Genetic Algorithm
Situation
• Delivery of DAS-2 delayed
• Globus installation on SUN server
  – System managers unfamiliar with Globus
  – Incorrect installation, e.g. certificates
• Jan 21, 2002: DAS-2 operational
  – Platform for students
  – Focus on PBS, MPI; not Globus
Distributed Evolutionary Algorithm
• Purpose: minimize an arbitrary function
• Strategy:
  – self-adaptation, no distinction between worker and controller nodes
  – predefined number of runs
• Language: C++
• Modules:
  – communication: Globus IO
  – resource management: GRAM
• Results:
  – master/slave set-up gives the best results in the shortest time span
  – other strategies increase self-adaptiveness, but give worse results in the current setting
Distr. Evolutionary Algorithm (cont’d)
• Problems:
  – Distinction between fileserver and compute node: starting up new processes
  – Wall-time value (60 s) of scheduler cannot be altered (not even by maxTime in RSL): waiting processes are killed
• Suggestions for improvement:
  – Symbolic links to Globus libraries
  – Documentation on Globus:
    • Overall idea is neglected
    • Q&A forum, globus.org
RSA Key Breaking
• Purpose: factoring large numbers
• Strategy:
  – Pollard’s Rho factoring algorithm
  – Master/slave framework
• Language: C
• Modules:
  – Communication: Nexus
  – Job allocation: GRAM and PBS
• Results:
  – Significant speed-ups, depending on work-load/distribution
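For readers unfamiliar with Pollard's Rho, the textbook form of the algorithm (Floyd cycle detection on x -> x*x + c mod n) can be sketched as follows. This is not the student's C code; the function names and the idea of handing each slave a different constant c are assumptions made for illustration.

```python
from math import gcd


def pollard_rho(n, c=1):
    """Return a non-trivial factor of composite n > 3, or None if this c fails."""
    if n % 2 == 0:
        return 2
    x = y = 2
    d = 1
    while d == 1:
        x = (x * x + c) % n   # slow pointer: one step
        y = (y * y + c) % n   # fast pointer: two steps
        y = (y * y + c) % n
        d = gcd(abs(x - y), n)
    return d if d != n else None


def find_factor(n, max_c=20):
    """Try several constants c in turn; in a master/slave set-up each
    slave could be given its own c (an assumption, not from the slides)."""
    for c in range(1, max_c + 1):
        d = pollard_rho(n, c)
        if d is not None:
            return d
    return None


if __name__ == "__main__":
    print(find_factor(8051))
```

Because runs with different constants are independent, they distribute naturally over nodes, which fits the observation that the speed-up depends on how the work is distributed.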
RSA Key Breaking (cont’d)
• Problems:
  – Start-up
    • Problems getting the correct certificate
    • Libraries were not installed correctly
    • Functions were not available
  – ‘Real’ problems
    • GRAM macro definitions not in the corresponding header file
  – Documentation
    • Lack of practical guidelines and examples
GridCrafty
• Purpose: shell script which parallelises the chess engine Crafty
• Strategy:
  – Master: enumerates all possible moves; workers: grade the moves
• Modules:
  – Storage access: GASS, globus_rcp, openssh
• Results:
  – Due to problems with the Globus implementation, a second version bypassed Globus entirely, which led to a speed-up of 17.5 (theoretical 22)
GridCrafty (cont’d)
• Problems:
  – Start-up
    • GASS did not work properly
    • globus_rcp was not installed
    • Openssh did not work
  – ‘Real’ problem
    • Scheduling of tasks takes a lot of time
• Final implementation:
  – Connect to all nodes; query load:
    • Static: < 10% host free
    • Dynamic: clients check load before start of intensive calculations
  – ssh implementation much faster than Globus (speed-ups 17.5 versus 5-9)
Parallel Fractal Image Generation
• Purpose: see title
• Strategy
  – Master distributes work, collects output, draws image
  – Slaves calculate points line-wise
• Language: C and C++
• Modules
  – Resource management: GRAM
  – Communication: MPI
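The line-wise master/slave division can be illustrated with a small local sketch. It assumes the Mandelbrot set as the fractal (the slides do not name one), a thread pool instead of GRAM and MPI, and a text rendering instead of a real image; all names and constants are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

WIDTH, HEIGHT, MAX_ITER = 60, 24, 50


def compute_line(row):
    """Slave: iteration counts for every point on one image line."""
    y = -1.2 + 2.4 * row / (HEIGHT - 1)
    counts = []
    for col in range(WIDTH):
        c = complex(-2.0 + 3.0 * col / (WIDTH - 1), y)
        z, n = 0j, 0
        while abs(z) <= 2.0 and n < MAX_ITER:
            z = z * z + c
            n += 1
        counts.append(n)
    return row, counts


def render(n_workers=4):
    """Master: hand out lines, collect them, and draw the image as text."""
    image = [None] * HEIGHT
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        for row, counts in pool.map(compute_line, range(HEIGHT)):
            image[row] = "".join("#" if n == MAX_ITER else "." for n in counts)
    return image


if __name__ == "__main__":
    print("\n".join(render()))
```

Lines are independent of each other, so the result does not depend on the number of workers or on the order in which lines are finished.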
Par. Fractal Image Generation (cont’d)
• Problem
  – Conflict between current MPI set-up and GRAM job submit script (temporarily fixed only on UvA-cluster)
• Suggestions for improvement:
  – Installation of MPICH-G2
  – Where can one find good examples on exploiting Globus to get started?
Adaptive Quadrature
• Purpose: calculate the quadrature (area under the curve) of an arbitrary function
• Strategy:
  – Divide curve into smaller ones
  – Ring of processes
  – Results via files
• Language: gcc
• Modules
  – Process control and allocation: DUROC
  – Communication: Nexus
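The idea of dividing the curve into smaller pieces can be made concrete with adaptive Simpson quadrature, a standard technique. The slides do not say which rule the student used, so the choice of Simpson's rule, the tolerance handling and all names below are assumptions; the ring of processes and the file-based result exchange are not modelled.

```python
import math


def simpson(f, a, b):
    """Simpson's rule on the single interval [a, b]."""
    m = (a + b) / 2.0
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adaptive_quadrature(f, a, b, tol=1e-8, depth=50):
    """Integrate f over [a, b], splitting intervals until the estimate settles."""
    m = (a + b) / 2.0
    whole = simpson(f, a, b)
    left, right = simpson(f, a, m), simpson(f, m, b)
    # Standard error estimate for Simpson's rule: |left + right - whole| / 15.
    if depth == 0 or abs(left + right - whole) <= 15.0 * tol:
        return left + right + (left + right - whole) / 15.0
    # Each half is an independent subtask that could go to another process.
    return (adaptive_quadrature(f, a, m, tol / 2.0, depth - 1)
            + adaptive_quadrature(f, m, b, tol / 2.0, depth - 1))


if __name__ == "__main__":
    print(adaptive_quadrature(math.sin, 0.0, math.pi))
```

The recursion refines only where the function is hard to approximate, so subintervals carry unequal amounts of work, which is what makes load distribution over processes non-trivial.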
Adaptive Quadrature (cont’d)
• Problems
  – Start-up
    • Getting the correct certificate
    • Using the right RSL parameter (hostCount)
  – ‘Real’ problem
    • Conflict between duroc_runtime_barrier and PBS: fixed only on UvA-cluster
• Suggestions for improvement
  – Info on different communication techniques
Parallel Genetic Algorithm
• Purpose: improving results of GAs
• Strategy
  – Start independent searches at different locations of the solution landscape
  – Periodically exchange the fittest individuals
  – Init process: job dispatching and bootstrap communication set-up
  – Master process: relays communications, synchronizes the start of worker processes, collects final results, and sets up GUI for monitoring and progress display
  – Worker processes: each runs a single N-generation run
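The strategy of independent searches with periodic exchange of the fittest individuals is often called an island model. The sketch below simulates it sequentially in one process; the objective function, mutation-only evolution, ring migration and every name and parameter are assumptions for illustration, not the students' implementation.

```python
import random


def fitness(x):
    """Hypothetical objective with its maximum at x = 3."""
    return -(x - 3.0) ** 2


def evolve(population, rng, generations):
    """Worker: run a few generations of mutation plus truncation selection."""
    size = len(population)
    for _ in range(generations):
        children = [x + rng.gauss(0.0, 0.5) for x in population]
        # Keep the best individuals, sorted from fittest to least fit.
        population = sorted(population + children, key=fitness, reverse=True)[:size]
    return population


def island_ga(n_islands=4, pop_size=10, epochs=5, generations=10, seed=0):
    """Independent searches with periodic exchange of each island's best."""
    rng = random.Random(seed)
    # Each island starts in a different region of the search space.
    islands = [[rng.uniform(-50 + 25 * i, -25 + 25 * i) for _ in range(pop_size)]
               for i in range(n_islands)]
    for _ in range(epochs):
        islands = [evolve(pop, rng, generations) for pop in islands]
        best = [max(pop, key=fitness) for pop in islands]
        # Ring migration: island i replaces its worst member with the
        # best individual of island i - 1.
        for i, pop in enumerate(islands):
            pop[-1] = best[i - 1]
    return max((x for pop in islands for x in pop), key=fitness)


if __name__ == "__main__":
    print(island_ga())
```

In a distributed version each call to evolve would run in its own worker process, and the migration step is the only point where communication through the master is needed.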
Parallel Genetic Algorithm (cont’d)
• Language: C and C++
• Modules
  – Communication: NEXUS (RPC)
  – Job submission: GRAM
  – Thread creation: Globus_Common
• Preliminary results
  – Parallel algorithm achieves results that are 8-17% better than sequential algorithm
Parallel Genetic Algorithm (cont’d)
• Problems
  – Start-up
    • Environment and path setting
    • Obtaining certificates
    • Who is responsible for Globus on DAS-2?
    • Different versions of Globus (1.1.3 versus 2.0 beta)
  – ‘Real’ problems
    • Shared libraries are not installed at nodes
    • Delegating proxies
    • Information about resource availability static or not present
    • Globus 2.0 is a beta version: things not implemented or missing
Parallel Genetic Algorithm (cont’d)
• Suggestions for improvement:
  – Default Globus environment
  – Globus libraries on nodes via
    • NFS partition
    • Symbolic link to the ‘strange’ globus-edg beta 2.1 names
  – ‘fork’ is the default jobmanager, which only ‘schedules’ jobs to the local file server (adding PBS makes code dependent on this scheduler)
  – Installation of a cluster monitor better than beowulf
  – Examples and makefiles
Conclusions
• Seminar quite successful
• DAS-2
  – Great environment for teaching purposes
  – Start-up problems
  – Current setting not optimal
  – Who is responsible for DAS-2?
  – Who determines policies, implementations?
• Globus
  – Documentation, examples (probably better with current training material on globus.org)
  – Installation not trivial
• IBM
  – Pre-sales OK, after-sales???