
DAS-2 workshop, June 6 2002 Experiences with Globus on DAS-2 in an educational setting Herbert Bos & Lex Wolters LIACS, Leiden University {herbertb,llexx}@liacs.nl



Seminar Grid Computing
• Fall 2001
• 11 students (year 3 or 4) started, 8 finished
• Once a week, 2 hours
• 13 classes
• Programming assignment

Goals
• Try to separate Grid hype from Grid reality
• Show the underlying technologies that are currently being developed and used to provide a 'pervasive computing grid'


Topics by lecturers

• What is Grid Computing?

• Requirements

• History

• Grid architecture

• Basic Services

• Taxonomy of Grids

• QoS (final class)


Presentations by students
• Legion
• Globus
• Resource management: GRAM
• Scheduling: AppLeS
• Communication (Nexus, etc.)
• Information service: MDS, GRIS, GIIS
• Data access: GASS + RSL
• Security: GSS-API
• Language support


Programming assignment

Using Globus to implement a Grid application:
– Computation chopped up in subtasks which are distributed to computational nodes
– Final result is combination of results of subtasks
– Resource discovery
– At least one of the following options:
  • Data is distributed in secure fashion
  • Incorporate costs
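
The decomposition asked for in the assignment can be sketched as follows. This is only an illustration in Python, not any student's code (the projects were written in C and C++ on top of Globus): local threads stand in for the computational nodes, and the names `split_range`, `subtask` and `run`, as well as the sum-of-squares workload, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def split_range(start, stop, parts):
    """Chop [start, stop) into `parts` contiguous subranges."""
    step, extra = divmod(stop - start, parts)
    bounds = []
    lo = start
    for i in range(parts):
        # The first `extra` subranges get one additional element.
        hi = lo + step + (1 if i < extra else 0)
        bounds.append((lo, hi))
        lo = hi
    return bounds


def subtask(bounds):
    """Hypothetical unit of work: sum of squares over one subrange."""
    lo, hi = bounds
    return sum(n * n for n in range(lo, hi))


def run(start, stop, workers=4):
    # Assumption: local threads stand in for remote compute nodes.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(subtask, split_range(start, stop, workers)))
    # The final result is the combination of the subtask results.
    return sum(partials)
```

In a Grid setting the `pool.map` call would be replaced by job submission to discovered resources; the split and combine steps stay the same.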


Topics

• Willem de Bruijn: Distributed Evolutionary Algorithm

• Hongqin Chen: RSA Key Breaking

• Jeroen Laros: GridCrafty

• Hui Li: Parallel Fractal Image Generation

• Yafei Sun: Adaptive Quadrature

• Arjan Tijms & Shlomo Raikin: Parallel Genetic Algorithm


Situation

• Delivery DAS-2 delayed

• Globus installation on SUN server
  – System-managers unfamiliar with Globus
  – Incorrect installation, e.g. certificates

• Jan 21, 2002: DAS-2 operational
  – Platform for students
  – Focus on PBS, MPI; not Globus


Distributed Evolutionary Algorithm
• Purpose: minimize an arbitrary function
• Strategy:
  – self-adaptation, no distinction between worker and controller nodes
  – predefined number of runs
• Language: C++
• Modules:
  – communication: Globus IO
  – resource management: GRAM
• Results:
  – master/slave set-up gives the best results in the shortest time-span
  – other strategies increase self-adaptiveness, but give worse results in the current setting
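
The slides do not give the details of the algorithm, so the following is only a generic sketch of what "self-adaptation" usually means in an evolutionary algorithm: each individual carries its own mutation step size, which is mutated and selected together with the solution. The (mu + lambda) selection scheme, the `sphere` test function and all parameter values are assumptions made for illustration; the sketch is in Python and runs on a single node, whereas the student project was written in C++.

```python
import math
import random


def sphere(x):
    """Hypothetical test objective: minimum 0 at the origin."""
    return sum(v * v for v in x)


def self_adaptive_es(f, dim=5, mu=5, lam=20, generations=50, seed=0):
    """Minimise f with a (mu + lambda) strategy and per-individual step sizes."""
    rng = random.Random(seed)
    tau = 1.0 / math.sqrt(dim)  # learning rate for the step size
    # An individual is a pair (solution vector, step size sigma).
    pop = [([rng.uniform(-5, 5) for _ in range(dim)], 1.0) for _ in range(mu)]
    history = [min(f(x) for x, _ in pop)]
    for _ in range(generations):  # predefined number of runs
        offspring = []
        for _ in range(lam):
            x, sigma = rng.choice(pop)
            # Self-adaptation: mutate sigma first, then use it on the solution.
            new_sigma = sigma * math.exp(tau * rng.gauss(0.0, 1.0))
            child = [v + new_sigma * rng.gauss(0.0, 1.0) for v in x]
            offspring.append((child, new_sigma))
        # Parents compete with offspring, so the best value never gets worse.
        pop = sorted(pop + offspring, key=lambda ind: f(ind[0]))[:mu]
        history.append(f(pop[0][0]))
    return pop[0], history


best, history = self_adaptive_es(sphere)
```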


Distr. Evolutionary Algorithm (cont’d)

• Problems:
  – Distinction between fileserver and compute node: starting up new processes
  – Wall-time value (60 s) of scheduler cannot be altered (also not by maxTime in RSL): waiting processes are killed

• Suggestions for improvement:
  – Symbolic links to Globus libraries
  – Documentation on Globus:
    • Overall idea is neglected
    • Q&A forum, globus.org


RSA Key Breaking
• Purpose: factoring large numbers
• Strategy:
  – Pollard’s Rho factoring algorithm
  – Master/slave framework
• Language: C
• Modules:
  – Communication: Nexus
  – Job allocation: GRAM and PBS
• Results:
  – Significant speed-ups, depending on work-load/distribution
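
For readers unfamiliar with the method named on this slide, here is a minimal sketch of Pollard's Rho with Floyd cycle detection. It is written in Python for brevity (the student project used C), and the way work is divided is an assumption: the slides do not say how the master distributed it, so `factor` simply tries several constants `c` in turn, which is one way slaves could be given independent searches.

```python
from math import gcd


def pollard_rho(n, c=1, x0=2):
    """Return a non-trivial factor of composite n, or None if this (c, x0) fails."""
    if n % 2 == 0:
        return 2

    def f(x):
        # Pseudo-random iteration x -> x^2 + c (mod n).
        return (x * x + c) % n

    tortoise, hare, d = x0, x0, 1
    while d == 1:
        tortoise = f(tortoise)   # one step
        hare = f(f(hare))        # two steps
        d = gcd(abs(tortoise - hare), n)
    # d == n means the cycle closed without revealing a factor.
    return d if d != n else None


def factor(n, max_c=20):
    """Try several constants c; each could be handed to a different slave."""
    for c in range(1, max_c + 1):
        d = pollard_rho(n, c)
        if d is not None:
            return d
    return None
```

For example, `pollard_rho(8051)` finds the factor 97 of 8051 = 83 × 97 after three iterations.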


RSA Key Breaking (cont’d)

• Problems:
  – Start-up
    • Problems obtaining the correct certificate
    • Libraries were not installed correctly
    • Functions were not available
  – ‘Real’ problems
    • GRAM macro-definitions not in corresponding header-file
  – Documentation
    • Lack of practical guidelines and examples


GridCrafty

• Purpose: shell script which parallelises the chess engine Crafty

• Strategy:
  – Master: generates all possible moves; workers: grade moves

• Modules:
  – Storage access: GASS, globus_rcp, openssh

• Results:
  – Because of problems with the Globus implementation, a version that bypasses Globus entirely was also built; it reaches a speed-up of 17.5 (theoretical 22)


GridCrafty (cont’d)
• Problems:
  – Start-up
    • GASS did not work properly
    • Globus_rcp was not installed
    • Openssh did not work
  – ‘Real’ problem
    • Scheduling of tasks takes a lot of time
• Final implementation:
  – connect to all nodes; query load:
    • Static: < 10% host free
    • Dynamic: client checks load before start of intensive calculations
  – ssh implementation much faster than Globus (speed-ups 17.5 versus 5-9)
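
To put the reported speed-ups in perspective, they can be expressed as parallel efficiencies. This reads the "theoretical 22" as the maximum achievable speed-up and applies it to both implementations; both readings are assumptions, since the slides do not say how the figure was derived.

```latex
E = \frac{S_{\text{measured}}}{S_{\text{theoretical}}}, \qquad
E_{\text{ssh}} = \frac{17.5}{22} \approx 0.80, \qquad
E_{\text{Globus}} \in \left[ \frac{5}{22},\ \frac{9}{22} \right] \approx [0.23,\ 0.41]
```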


Parallel Fractal Image Generation

• Purpose: see title
• Strategy
  – Master distributes work, collects output, draws image
  – Slaves calculate points line-wise
• Language: C and C++
• Modules
  – Resource management: GRAM
  – Communication: MPI
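
The line-wise division of work can be sketched as below. The slides do not say which fractal was generated, so the Mandelbrot set, the viewing window and the function names are assumptions; the sketch is in Python and sequential, with the loop in `render` standing in for the master handing row indices to slaves over MPI.

```python
def mandelbrot_row(y, width, height, max_iter=50):
    """Escape-iteration counts for one image row (one slave's unit of work)."""
    row = []
    # Assumed window: real axis [-2, 1], imaginary axis [-1.5, 1.5].
    im = -1.5 + 3.0 * y / (height - 1)
    for x in range(width):
        re = -2.0 + 3.0 * x / (width - 1)
        c = complex(re, im)
        z = 0j
        n = 0
        while n < max_iter and abs(z) <= 2.0:
            z = z * z + c
            n += 1
        row.append(n)
    return row


def render(width=61, height=25, max_iter=50):
    """Master: distribute row indices, collect the rows in image order."""
    return [mandelbrot_row(y, width, height, max_iter) for y in range(height)]
```

Rows are independent of each other, which is what makes a line-wise split attractive; rows crossing the interior of the set cost more iterations, so handing out rows on demand balances the load better than a fixed assignment.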


Par. Fractal Image Generation (cont’d)

• Problem
  – Conflict between current MPI set-up and GRAM job submit script (temporarily fixed only on UvA-cluster)

• Suggestions for improvement:
  – Installation of MPICH-G2
  – Where can one find good examples on exploiting Globus to get started?


Adaptive Quadrature
• Purpose: calculate the quadrature of the curve of an arbitrary function
• Strategy:
  – Divide curve into smaller ones
  – Ring of processes
  – Results via files
• Language: gcc
• Modules
  – Process control and allocation: DUROC
  – Communication: Nexus
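
As an illustration of the numerical technique only, here is a minimal sequential sketch of adaptive quadrature in Python. The choice of Simpson's rule and the tolerance handling are assumptions, since the slides do not name the rule used; in the student project each subinterval would be handled by one process in the ring rather than by a recursive call.

```python
import math


def simpson(f, a, b):
    """Simpson's rule on a single interval [a, b]."""
    m = 0.5 * (a + b)
    return (b - a) / 6.0 * (f(a) + 4.0 * f(m) + f(b))


def adaptive_quad(f, a, b, tol=1e-8, depth=40):
    """Integrate f over [a, b], subdividing only where the estimate is poor."""
    m = 0.5 * (a + b)
    whole = simpson(f, a, b)
    left, right = simpson(f, a, m), simpson(f, m, b)
    err = left + right - whole
    if depth == 0 or abs(err) <= 15.0 * tol:
        # Accept, with the standard Richardson correction term.
        return left + right + err / 15.0
    # Otherwise divide the curve into two smaller ones.
    return (adaptive_quad(f, a, m, tol / 2.0, depth - 1) +
            adaptive_quad(f, m, b, tol / 2.0, depth - 1))


# The integral of sin over [0, pi] is exactly 2.
area = adaptive_quad(math.sin, 0.0, math.pi)
```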


Adaptive Quadrature (cont’d)

• Problems
  – Start-up
    • Getting the correct certificate
    • Using the right RSL parameter (hostCount)
  – ‘Real’ problem
    • Conflict between duroc_runtime_barrier and PBS: fixed only on UvA-cluster

• Suggestions for improvement
  – Info on different communication techniques


Parallel Genetic Algorithm

• Purpose: improving results of GAs
• Strategy
  – Start independent searches at different locations of the solution landscape
  – Periodically exchange the fittest individuals
  – Init process: job dispatching and bootstrap communication set-up
  – Master process: relay for communications, synchronizes the start of worker processes, collects final results, and sets up GUI for monitoring and progress display
  – Worker processes: each runs a single N-generation run
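
The strategy above (independent searches that periodically exchange their fittest individuals) is commonly called an island model. The following Python sketch shows the idea on a single machine; it is not the students' C/C++ code. The bit-counting objective, the ring-shaped migration (the slides describe a master relaying communication instead) and all parameter values are assumptions for illustration.

```python
import random


def fitness(bits):
    """Hypothetical objective: number of 1 bits (higher is better)."""
    return sum(bits)


def evolve(pop, rng, generations):
    """One worker: run a simple elitist GA on its own population."""
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: len(pop) // 2]          # the better half survives
        children = []
        for _ in range(len(pop) - len(parents)):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(a))      # one-point crossover
            child = a[:cut] + b[cut:]
            child[rng.randrange(len(child))] ^= 1  # flip one random bit
            children.append(child)
        pop[:] = parents + children
    return pop


def migrate(islands):
    """Each island's fittest individual replaces the next island's worst."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for i, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda k: fitness(pop[k]))
        pop[worst] = bests[i - 1]


def island_ga(n_islands=4, pop_size=10, length=32, epochs=5, gens=5, seed=1):
    rng = random.Random(seed)
    # Independent searches start at different random locations.
    islands = [[[rng.randint(0, 1) for _ in range(length)]
                for _ in range(pop_size)] for _ in range(n_islands)]
    start_best = max(fitness(ind) for pop in islands for ind in pop)
    for _ in range(epochs):
        for pop in islands:
            evolve(pop, rng, gens)
        migrate(islands)                         # periodic exchange
    end_best = max(fitness(ind) for pop in islands for ind in pop)
    return start_best, end_best
```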


Parallel Genetic Algorithm (cont’d)

• Language: C and C++
• Modules
  – Communication: NEXUS – RPC
  – Job submission: GRAM
  – Thread creation: Globus_Common

• Preliminary results
  – Parallel algorithm achieves results that are 8-17% better than sequential algorithm


Parallel Genetic Algorithm (cont’d)

• Problems
  – Start-up
    • Environment and path setting
    • Obtaining certificates
    • Who is responsible for Globus on DAS-2?
    • Different versions of Globus (1.1.3 versus 2.0 beta)
  – ‘Real’ problems
    • Shared libraries are not installed at nodes
    • Delegating proxies
    • Information about resource availability static or not present
    • Globus 2.0 is a beta version: things not implemented or missing


Parallel Genetic Algorithm (cont’d)

• Suggestions for improvement:
  – Default Globus environment
  – Globus libraries on nodes via
    • NFS partition
    • Symbolic link to the ‘strange’ globus-edg beta 2.1 names
  – ‘fork’ is default jobmanager, which only ‘schedules’ jobs to local file server (adding PBS makes code dependent on this scheduler)
  – Installation of a cluster monitor better than beowulf
  – Examples and makefiles


Conclusions
• Seminar quite successful
• DAS-2
  – Great environment for teaching purposes
  – Start-up problems
  – Current setting not optimal
  – Who is responsible for DAS-2?
  – Who determines policies, implementations?
• Globus
  – Documentation, examples (probably better with current training material on globus.org)
  – Installation not trivial
• IBM
  – Pre-sales OK, after-sales???


Thanks

Many thanks to David Groep who helped our students many, many times without any hesitation! Great job!