Introduction to Grid Computing and Applications in Computational Sciences Barry Wilkinson Department of Mathematics and Computer Science Western Carolina University WSSU September 13, 2004


Introduction to Grid Computing and Applications in Computational Sciences

Barry Wilkinson
Department of Mathematics and Computer Science
Western Carolina University

WSSU
September 13, 2004

2

Talk Outline

What is Grid Computing?

Applications

Evolution of Grid Computing

Grid Computing Course

3

Grid Computing

Using usually geographically distributed and interconnected computers together for high performance computing and/or for resource sharing.

Notice “usually”, “and/or” - many definitions of grid computing and applications.

4

The interconnection - now “usually” made through the Internet to multiple administrative domains.

Resource sharing - can involve geographically distributed resources other than computers, such as software and experimental equipment.

5

Some think that grid computing is just cluster computing in the “large”

[Figure: Scalable Computing. Axis label: PERFORMANCE + QoS. Platforms shown: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, Inter Planet Grid. Administrative Barriers: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe.]

Figure due to Rajkumar Buyya, University of Melbourne, Australia, www.gridbus.org

7

But grid computing is more than this.

It offers the potential of virtual organizations: groups of people, both geographically and organizationally distributed, working together on a problem and sharing computers AND other resources such as databases and experimental equipment.

8

The grid virtualizes heterogeneous, geographically dispersed resources

From "Introduction to Grid Computing with Globus," IBM Redbooks

9

[Figure: Distributed Collaborative Experiment. Components shown: Grid Infrastructure; Grid-enabled Real Field Experiment; Grid-enabled Virtual Laboratory; Grid-enabled Algorithms (i.e., data mining and models); Grid-enabled Data Collection.]

Figure from M. Faramawi and B. Ramamurthy, SUNY-Buffalo

10

Some Grid Projects & Initiatives

Australia
– Nimrod-G
– Gridbus
– GridSim
– Virtual Lab
– DISCWorld
– GrangeNet
– ..etc.

Europe
– UK eScience
– EU Data Grid
– Cactus
– XtremeWeb
– ..etc.

India
– I-Grid

Japan
– Ninf
– DataFarm

Korea
– N*Grid

Singapore
– NGP

USA
– AppLeS
– Globus
– Legion
– Sun Grid Engine
– NASA IPG
– Condor-G
– Jxta
– NetSolve
– AccessGrid
– and many more...

Cycle Stealing & .com Initiatives
– Distributed.net
– SETI@Home, ….
– Entropia, UD, SCS, ….

Public Forums
– Global Grid Forum
– Australian Grid Forum
– IEEE TFCC
– CCGrid conference
– P2P conference

http://www.gridcomputing.com

Figures due to Rajkumar Buyya, University of Melbourne, Australia, www.gridbus.org

11

Example Grid Networks

Numerous very high performance computing projects were developed in the late 1990s and 2000s.

Examples: USA TeraGrid, UK e-Science Grid, and others

12

TeraGrid

13

TeraGrid

14

UK e-Science Grid

15

EU grid

16

Computational Grid Applications

Biomedical research
Industrial research
Engineering research
Studies in Physics and Chemistry

17

Some “Computational” Grid Projects

Large Hadron Collider experimental facility for complex particle experiments at CERN (European Organization for Nuclear Research, near Geneva, Switzerland).

DOE Particle Physics Data grid
DOE Science grid
AstroGrid Project
Comb-e-Chem project

18

CERN grid

19

Key aspects of these grids

State-of-the-art interconnection networks.

Sharing resources.

Community of scientists.

20

Shared Resources

Can be much more than just computers:

Storage
Sensors for experiments at particular sites in the grid
Application Software
Databases, ...

21

Resource sharing and collaborative computing

Grid computing is about collaborating and resource sharing as much as it is about high performance computing.

Many projects

22

Key aspects

Using distributed computers and resources collectively.

Often crossing organizational boundaries

Fueled by the Internet, which provides the communication network.

23

Evolution of grid computing

Started as a form of distributed computing.

Previous distributed computing systems:
– 1980s - Remote procedure calls (RPC), client-server model with a service registry.
– 1990s - Distributed object systems:
• CORBA (Common Object Request Broker Architecture)
• Java RMI (Remote Method Invocation)

24

Internet-Based Grid computing

Grid computing is now based upon the Internet.

Enables using existing Internet protocols, security mechanisms, etc.

Uses a form of web services.

25

Applications

Originally e-Science applications
– Computationally intensive, not necessarily one big problem but a problem that has to be solved repeatedly.
– Data intensive.
– Experimental collaborative projects

Now also e-Business applications to improve business models and practices.

26

Background

Emergence and immense success of the Internet and the world-wide web, with agreed upon Internet standards for communication and access.

Continual improvement in computer and network technology and speeds.

27

Need to harness computers

The original driving force behind the Internet was the same as that behind grid computing!

– The need for high performance computing by connecting computers at distributed sites.

28

Economic Development

Cohen report, September 2003:
– Projected impact of grid computing on NC’s economy: could lead to 240,000 new jobs and $10 billion in economic growth in North Carolina by 2010.

29

Grid Computing Course
Fall 2004

Barry Wilkinson, Western Carolina University
and
Clayton Ferner, UNC-Wilmington

30

Originates from WCU on the NCREN network and is broadcast to students and faculty at 8 participating institutions:
– UNC-Wilmington
– NC State University
– UNC-Asheville
– UNC-Greensboro
– Appalachian State University
– NC Central University
– Cape Fear Community College
– Elon University

31

Level

Listed as an undergraduate course but can be taken for graduate credit.

Graduate students expected to do more demanding work.

Most students are undergraduate, but there are a few graduate students at NCSU and UNC-W.

32

Prerequisites

Preferably programming skills in Java on a Linux system.

Some later work involves C/C++ programming.

33

Topics

Review of Internet technologies
Introduction to grid computing
Web services
Grid services
Security, Public Key Infrastructure
Open Grid Services Architecture (OGSA)
Globus 3.2
Condor-G
MPI and grid-enabled MPI
UNC-W workflow editor and other GUI tools
Grid computing applications

34

Assessment

6 “simple” programming tasks
– Creating a web service
– Creating a grid service
– Submitting a Globus job
– Submitting a Condor-G job
– MPI-G2 program
– Using UNC-W GUI workflow editor

Programming Project
Class tests (1 or 2)
Final test

Small print: Subject to change. The instructor reserves the right to change the assignments to make them easier or harder.

35

Weeks 1 - 3

Grid computing: Virtual organizations, computational grid projects, grid computing networks, TeraGrid, grid projects in the US and around the world, grid challenges

Internet Technologies: IP addresses, HTTP, URL, HTTP, XML, Telnet, FTP, SSL

Web Services I: Service-Oriented Architecture (SOA), service registry, XML documents, XML schema, namespaces, SOAP, XML/SOAP examples, Axis

Web Services II: WSDL, portType, message definition, WSDL to/from code

Assignment 1: "Simple" Web service Java programming assignment. Tomcat environment, Axis, JWS facility
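To make the SOAP topic concrete, here is a minimal sketch of the kind of XML envelope a Web service client sends. It is written in Python for brevity (the course assignments use Java with Axis) and only builds the message; it does not contact any server. The service namespace `http://example.org/math`, the operation name `add`, and the parameters `a` and `b` are hypothetical and chosen purely for illustration.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/math"  # hypothetical service namespace (assumption)


def build_request(operation, params):
    """Return a SOAP 1.1 request envelope as a string."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    # The operation element lives in the service's own namespace.
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


# Hypothetical operation "add" with two integer arguments.
request = build_request("add", {"a": 2, "b": 3})
print(request)
```

In practice a toolkit such as Axis generates and parses these envelopes from the WSDL description, so the programmer rarely writes them by hand.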

36

Weeks 3 - 4

Grid Service Concepts: differences to Web services, stateful/stateless/transient/non-transient, Open Grid Services Architecture (OGSA), OGSI, grid service factory, Web Services Resource Framework (WSRF)

Assignment 2: "Simple" grid service Java programming assignment. Globus 3.2 environment. Tools: ant

37

Weeks 4 - 6

Security: Secure connection, authorization requirements, symmetric and asymmetric (public/private) key cryptography, non-repudiation, digital signatures, certificates, certificate authorities, X.509 certificate

Globus: Part I. Basic structure (version 3.2), grid service container, service browser, Globus Resource Allocation Manager (GRAM), job submission with managed-job-globusrun, Grid Security Infrastructure (GSI), Globus certificates, simpleCA, proxies, creating a proxy

Globus: Part II. Resource management, Master Managed Job Factory Service (MMJFS), more on managed-job-globusrun, Resource Specification Language (RSL and RSL-2), syntax and examples in RSL and RSL-2

Assignment 3: Submitting a Job to the Grid, GT3 managed-job-globusrun, job specified in RSL-2 (XML file)
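The idea behind asymmetric keys and digital signatures can be sketched with textbook RSA. The example below is an illustration only, written in Python rather than the course's Java: it uses the classic toy key pair built from the primes 61 and 53, which is far too small to be secure, and it is not how GSI or X.509 certificates are implemented. The function names `digest`, `sign`, and `verify` and the sample message are assumptions made for this sketch.

```python
import hashlib

# Textbook toy key pair (p = 61, q = 53). Far too small for real use.
N = 61 * 53   # public modulus (3233)
E = 17        # public exponent
D = 2753      # private exponent, chosen so that (E * D) % 3120 == 1


def digest(message: bytes) -> int:
    """Reduce a SHA-256 digest modulo N so that it fits the toy key."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign the message digest with the private exponent."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Check the signature using only the public values E and N."""
    return pow(signature, E, N) == digest(message)


message = b"submit job 42"   # hypothetical message
signature = sign(message)
print(verify(message, signature))
```

Only the holder of the private exponent can produce a signature that verifies, while anyone with the public key can check it. A certificate authority applies the same principle when it signs a user's certificate.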

38

Weeks 6 - 7

Globus: Part III. Information Directory Services, LDAP, resource discovery

Schedulers and resource brokers: Condor, submit description file, DAGMan, checkpointing, ClassAd, Condor-G, other systems

Assignment 4: Submitting a Condor-G Job

39

Weeks 7 - 8

High performance computing (HPF): Grand challenge problems, parallel computing, potential speed-up, types of parallel computers, shared memory multiprocessors, programming, message-passing multicomputers

Parallel Programming: Techniques suitable for a Grid, embarrassingly parallel computations, Monte Carlo, parameter studies, sample "big" problems, gravitational N-body problem

Cluster Computing: Basic message passing techniques, history, Beowulf clusters, system software, programming models (MPMD, SPMD), synchronous message passing, asynchronous message passing, message tags, collective routines
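As an illustration of an embarrassingly parallel Monte Carlo computation, the sketch below estimates pi by splitting the sampling into independent tasks that never communicate with each other. It is a minimal Python example, not course material: the function names `count_hits` and `estimate_pi`, the task sizes, and the use of a local thread pool as a stand-in for grid nodes are all assumptions. In CPython the thread pool does not speed up this CPU-bound loop; it is there only to show that the tasks can be farmed out in any order.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(task):
    """Count random points in the unit square that fall inside the quarter circle."""
    seed, samples = task
    rng = random.Random(seed)  # per-task generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(num_tasks, samples_per_task, map_fn=map):
    """Run independent tasks through map_fn, then combine their counts."""
    tasks = [(seed, samples_per_task) for seed in range(num_tasks)]
    total_hits = sum(map_fn(count_hits, tasks))
    return 4.0 * total_hits / (num_tasks * samples_per_task)


if __name__ == "__main__":
    # A local thread pool stands in for remote workers.
    with ThreadPoolExecutor(max_workers=4) as pool:
        print(estimate_pi(8, 20_000, map_fn=pool.map))
```

Because each task needs only a seed and a sample count, and returns a single integer, the same decomposition suits a grid where communication between sites is slow.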

40

Weeks 8 - 9

MPI: Process creation, communicators, unsafe message passing, point-to-point message-passing, blocking, non-blocking, communication modes, collective communication, running an MPI program on a cluster

Grid-enabled MPI: MPI-G2 internals, mpirun command, RSL script

Assignment 5: Running a simple MPI-G2 program

41

Weeks 10 to 15

Grid portals
UNC-W GUI
Scientific and business applications

Guest Speaker: Professor Dan Reed, University of North Carolina at Chapel Hill, NC State University, and Duke University.

42

Course Text

There is no assigned course textbook

Materials and links are provided on the home page.

43

Course Home Page

http://www.cs.wcu.edu/~abw/CS493F04

for announcements, slides, assignments, reading materials, test dates, etc.


50

Acknowledgements

This course is a team effort of:

Mountain Area Grid Innovation Collaborative (MAGIC)

Faculty: Barry Wilkinson and Mark Holliday

Students (Wizards): Jeffrey House and Sam Daoud

http://www.cs.wcu.edu/~abw/MAGIC

and:

University of North Carolina at Wilmington


55

Acknowledgements

Partial support for this work was provided by the National Science Foundation’s Course, Curriculum, and Laboratory Improvement program under grant 0410667 and by two grants from the University of North Carolina, Office of the President.

MAGIC gratefully acknowledges their support.