Grid Computing Course Across North Carolina Barry Wilkinson Department of Computer Science, University of North Carolina at Charlotte UNC-C, February 11th, 2005


Grid Computing Course Across North Carolina

Barry Wilkinson

Department of Computer Science,

University of North Carolina at Charlotte

UNC-C, February 11th, 2005


Talk Outline

– Evolution of Grid Computing
– Software Tools
– Grid Computing Course


Grid Computing

Using geographically distributed and interconnected computers for high performance computing and/or for resource sharing.

The grid virtualizes heterogeneous, geographically dispersed resources.

From "Introduction to Grid Computing with Globus," IBM Redbooks


Need to Harness Computers

The original driving force behind grid computing was the same as for the Internet!

– Need for high performance computing by connecting computers at distributed sites.

But it has developed into more than just a form of distributed computing.


Shared Resources

Can share much more than just computers:

– Storage
– Sensors for experiments at particular sites
– Application software
– Databases
– Network capacity, …


Virtual Organizations

Grid computing offers the potential of virtual organizations:

– groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.

Crosses multiple administrative domains.


Interconnections and Protocols

Focus now on:

– Using standard Internet protocols and technology, e.g. HTTP, SOAP, and web services.
– Establishing grid computing standards around web services.


Applications

Originally e-Science applications:

– Computationally intensive: not necessarily one big problem, but a problem that has to be solved repeatedly with different parameters.
– Data intensive.
– Experimental collaborative projects.

Now also e-Business applications to improve business models and practices.
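The "solved repeatedly with different parameters" pattern can be sketched as a parameter sweep. This is only an illustration: `simulate` is a hypothetical stand-in for a real scientific code, and a local thread pool stands in for the machines of a grid.

```python
from concurrent.futures import ThreadPoolExecutor


def simulate(rate: float) -> float:
    """Hypothetical model: value of 1000 units after 10 periods of growth."""
    return 1000.0 * (1.0 + rate) ** 10


def sweep(rates):
    # Each parameter value is an independent job. On a grid, each call
    # could be submitted to a different machine; here a local thread
    # pool stands in for those machines (an assumption of this sketch).
    with ThreadPoolExecutor(max_workers=4) as pool:
        return dict(zip(rates, pool.map(simulate, rates)))


if __name__ == "__main__":
    for rate, result in sweep([0.01, 0.02, 0.05]).items():
        print(f"rate={rate:.2f} -> {result:.2f}")
```

Because the runs do not communicate with each other, they can be spread over as many machines as are available.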


Economic Development

Cohen report, September 2003:

– Projected impact of grid computing on NC’s economy: could lead to 240,000 new jobs and $10 billion in economic growth in North Carolina by 2010.


History

Began in mid 1990’s with experiments using computers at geographically dispersed sites interconnected as a high speed computing platform.

Seminal experiment: the “I-way” experiment at the 1995 Supercomputing conference (SC’95), using 17 sites across the US running:

– 60+ applications.
– Existing networks (10 networks).


Grid Networks and Projects

Numerous very high performance computing projects developed in late 1990’s and 2000’s.

Examples: USA TeraGrid, UK e-Science Grid, and others


TeraGrid


TeraGrid


Other Grid Computing Projects

Very large number, in many countries:

Most countries now have their own grid computing networks and projects.


UK e-Science Grid


Computational Grid Applications

– Biomedical research
– Industrial research
– Engineering research
– Studies in Physics and Chemistry


Some “Computational” Grid Projects

– Large Hadron Collider experimental facility for complex particle experiments at CERN (European Center for Nuclear Research, near Geneva, Switzerland)
– DOE Particle Physics Data Grid
– DOE Science Grid
– AstroGrid project
– Comb-e-Chem project


CERN grid


Key aspects of these grids

State-of-the-art interconnection networks.

Sharing resources.

Community of scientists.


National Science Foundation Middleware Initiative (NMI)

Started in 2001, initially over 3 years, “to create and deploy advanced network services that simplify access to diverse Internet information and services.”

Provides a centralized location for important grid software.

Current NMI package includes Globus, Condor, MPI-G2, and a new grid portal project called OGCEGrid (funding started Sept 2003).


Globus

Toolkit provides:

– Underlying Grid Security Infrastructure
– Resource Management
– Data Management
– Information services

Higher level tools are meant to be implemented above these basic services.


1. secure environment

2. discover resource

3. submit job

4. transfer data

From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.


Key Components

GSI (Grid Security Infrastructure): Grid security.

GRAM (Grid Resource Allocation Manager): Remote job submission and control.

GridFTP: Secure data transfer.

MDS (Monitoring and Discovery Service): Interface to system and service information.


Grid Security Infrastructure (GSI)

Provides security functions including:

– Authentication
– Authorization
– Delegation
– Confidential communication

Uses PKI (Public Key Infrastructure) with X.509 certificates and a certificate authority.
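The public-key idea behind PKI can be sketched with a toy RSA signature: a certificate authority signs with its private key, and anyone holding the public key can check the signature. This is an illustration only, assuming a tiny key built from the primes 61 and 53; real X.509 certificates use far larger keys and standardized padding, and nothing here is GSI or Globus code. All function names are hypothetical.

```python
import hashlib

# Toy key pair (assumption: illustration only, trivially breakable).
P, Q = 61, 53
N = P * Q                            # public modulus, 3233
E = 17                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (needs Python 3.8+)


def digest(message: bytes) -> int:
    # Reduce a SHA-256 hash into the range of the toy modulus.
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    # Only the holder of the private exponent D can produce this value.
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    # Anyone with the public values (E, N) can check the signature.
    return pow(signature, E, N) == digest(message)


if __name__ == "__main__":
    sig = sign(b"certificate contents")
    print(verify(b"certificate contents", sig))
```

A certificate is, in essence, a public key plus identity information signed in this way by a certificate authority that both parties trust.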


1. GSI

2. MDS

3. GRAM

4. GridFTP

From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.


Resource Management

– Job submission
– Job status
– Resource allocation

Globus does not have its own job scheduler to find resources and automatically send jobs to suitable machines. For that, use a separate scheduler - we used Condor-G.

Scheduling

From "Introduction to Grid Computing with Globus," IBM Redbooks


Software - Globus Project

Open source software toolkit developed for grid computing.

Foster championed grid computing concept. Roots in I-way experiment. Started in 1996.

– GT version 1 (late 1990’s)
– GT version 2 (early 2000’s)
– GT version 3 (2003)
– GT version 4 (2005)

De facto standard for grid computing.


Globus Toolkit: Recent History

GT2 (2.4 released in 2002): reference implementation of Grid fabric protocols

– GRAM for job submissions
– MDS for resource discovery
– GridFTP for data transfer
– GSI security

GT3 (3.2 released mid-2004): redesign

– OGSI based
– Grid services, built on SOAP and XML

GT4 (final planned for April 2005): redesign

– WSRF based
– Grid standards merged with Web services

From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.


Supercomputing 2003 Demonstration

We used Globus version 2.4 in a Supercomputing 2003 demo organized by the University of Melbourne.

21 countries involved, numerous sites.


Version 3

A re-implementation based upon the Open Grid Services Architecture (OGSA) standard.

We used version 3.2 for the Fall 2004 course.

The underlying implementation of version 3.x used OGSI (Open Grid Services Infrastructure), which was not embraced by the community.


Version 4

Currently under development, to be released in early 2005.

OGSA kept but OGSI abandoned in favor of new implementation standards based around web services. (Version 3 used “extended” web services)


Source: globus.org


Web Services-Based Grid Computing

Grid Computing is now strongly based upon web services.

Large number of newly proposed grid computing standards (WS-xxxx standards):

– WS-Resource Framework
– WS-Addressing
– … etc.


Grid Computing Course (Fall 2004)

Originated from WCU on the NCREN network.

Broadcast to:

– UNC-Wilmington
– NC State University
– UNC-Asheville
– UNC-Greensboro
– Appalachian State University
– NC Central University
– Cape Fear Community College
– Elon University

43 students, several faculty.

Uses NMI package.

Instructors: Barry Wilkinson and Clayton Ferner (UNC-Wilmington).


Participating Sites


Level and Prerequisites

Listed as an undergraduate course but can be taken for graduate credit.

Graduate students expected to do more work.

Preferably have programming skills in Java on a Linux system.


Topics

– Review of Internet technologies
– Introduction to grid computing
– Web services
– Grid services
– Security, Public Key Infrastructure
– Open Grid Services Architecture (OGSA)
– Globus 3.2
– Condor-G
– MPI and grid-enabled MPI
– UNC-W workflow editor and other GUI tools
– Grid computing applications


Assessment

Programming assignments:

– Creating a web service
– Creating a grid service
– Submitting a Globus job
– Submitting a Condor-G job
– MPI-G2 program (not done in Fall 2004)
– Using UNC-W GUI workflow editor

Class tests (2)

Final test


Weeks 1 - 3

Grid computing: Virtual organizations, computational grid projects, grid computing networks, TeraGrid, grid projects in the US and around the world, grid challenges

Internet Technologies: IP addresses, HTTP, URL, XML, Telnet, FTP, SSL

Web Services I: Service-Oriented Architecture (SOA), service registry, XML documents, XML schema, namespaces, SOAP, XML/SOAP examples, Axis

Web Services II: WSDL, portType, message definition, WSDL to/from code

Assignment 1: "Simple" Web service Java programming assignment. Tomcat environment, Axis, JWS facility
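To make the XML/SOAP topics concrete, here is a minimal sketch of what a SOAP request looks like on the wire. It is not the course's Axis/Java code: it only builds and parses a SOAP 1.1 envelope with the Python standard library. The service namespace, the `add` operation, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"   # SOAP 1.1 namespace
SERVICE_NS = "http://example.org/mathservice"            # hypothetical service


def build_request(operation: str, params: dict) -> bytes:
    """Wrap an operation call and its parameters in a SOAP envelope."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="utf-8")


def parse_request(xml_bytes: bytes):
    """Recover the operation name and parameters, as a server would."""
    envelope = ET.fromstring(xml_bytes)
    body = envelope.find(f"{{{SOAP_ENV}}}Body")
    call = body[0]
    operation = call.tag.split("}", 1)[1]   # strip the namespace part
    return operation, {child.tag: child.text for child in call}


if __name__ == "__main__":
    request = build_request("add", {"a": 23, "b": 6})
    print(request.decode("utf-8"))
    print(parse_request(request))
```

In practice a toolkit such as Axis generates this plumbing from the WSDL description, so the programmer works with ordinary method calls.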


Weeks 3 - 4

Grid Service: Concepts, differences from Web services, stateful/stateless/transient/non-transient, Open Grid Services Architecture (OGSA), OGSI, grid service factory, Web Services Resource Framework (WSRF)

Assignment 2: "Simple" grid service Java programming assignment. Globus 3.2 environment. Tools: ant
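The stateless/stateful distinction and the factory idea can be sketched in a few lines. This is a conceptual illustration in plain Python, not the Globus 3.2 API: every class and method name below is hypothetical.

```python
import itertools


class StatelessAdder:
    """Stateless service: each call is independent of earlier calls."""

    def add(self, a, b):
        return a + b


class CounterInstance:
    """Stateful, transient service instance: keeps a running total."""

    def __init__(self):
        self.total = 0

    def add(self, value):
        self.total += value
        return self.total


class CounterFactory:
    """Factory: creates instances on request and returns a handle to each."""

    def __init__(self):
        self._instances = {}
        self._ids = itertools.count(1)

    def create(self):
        handle = f"counter-{next(self._ids)}"
        self._instances[handle] = CounterInstance()
        return handle

    def lookup(self, handle):
        return self._instances[handle]

    def destroy(self, handle):
        # Transient instances have a lifetime: they can be destroyed.
        del self._instances[handle]


if __name__ == "__main__":
    factory = CounterFactory()
    mine, yours = factory.create(), factory.create()
    factory.lookup(mine).add(5)
    print(factory.lookup(mine).add(3), factory.lookup(yours).add(1))
```

Each client gets its own instance with its own state, which is the main difference from a plain stateless web service.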

 


Weeks 4 - 6

Security: Secure connection, authorization requirements, symmetric and asymmetric (public/private) key cryptography, non-repudiation, digital signatures, certificates, certificate authorities, X.509 certificate

Globus, Part I: Basic structure (version 3.2), grid service container, service browser, Globus Resource Allocation Manager (GRAM), job submission with managed-job-globusrun, Grid Security Infrastructure (GSI), Globus certificates, simpleCA, proxies, creating a proxy

Globus, Part II: Resource management, Master Managed Job Factory Service (MMJFS), more on managed-job-globusrun, Resource Specification Language (RSL and RSL-2), syntax and examples in RSL and RSL-2

Assignment 3: Submitting a job to the grid, GT3 managed-job-globusrun, job specified in RSL-2 (XML file)
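As a rough illustration of what a job description contains, the sketch below formats attribute/value pairs in the classic parenthesized RSL notation. It is an assumption-laden simplification: the helper `to_rsl` is hypothetical, quoting rules for values containing spaces are omitted, and the RSL-2 XML form used in the assignment is not shown.

```python
def to_rsl(attributes: dict) -> str:
    """Format attribute/value pairs as a conjunction (&) of RSL relations."""
    relations = "".join(f"({name}={value})" for name, value in attributes.items())
    return "&" + relations


if __name__ == "__main__":
    job = {"executable": "/bin/echo", "arguments": "hello", "count": 1}
    print(to_rsl(job))  # prints &(executable=/bin/echo)(arguments=hello)(count=1)
```

The same information (executable, arguments, number of processes) appears as XML elements in RSL-2.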


Weeks 6 - 7

Globus, Part III: Information Directory Services, LDAP, resource discovery

Schedulers and resource brokers: Condor, submit description file, DAGMan, checkpointing, ClassAd, Condor-G, other systems

Assignment 4: Submitting a Condor-G job


Weeks 7 - 8

High performance computing (HPC): Grand challenge problems, parallel computing, potential speed-up, types of parallel computers, shared memory multiprocessors, programming, message-passing multicomputers

Parallel Programming Techniques suitable for a Grid: embarrassingly parallel computations, Monte Carlo, parameter studies, sample "big" problems, gravitational N-body problem

Cluster Computing: Basic message passing techniques, history, Beowulf clusters, system software, programming models (MPMD, SPMD), synchronous message passing, asynchronous message passing, message tags, collective routines
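An embarrassingly parallel Monte Carlo computation, one of the techniques listed above, can be sketched as follows. The example estimates pi from random points in the unit square; the task count, sample count, and function names are assumptions, and a local thread pool stands in for separate grid machines.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed: int, samples: int) -> int:
    """Count random points in the unit square that fall inside the quarter circle."""
    rng = random.Random(seed)   # separate seeded generator per task
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks: int = 8, samples_per_task: int = 50_000) -> float:
    # The tasks share nothing and never communicate, so each could run
    # on a different machine; only the final counts are combined.
    with ThreadPoolExecutor(max_workers=tasks) as pool:
        hits = pool.map(count_hits, range(tasks), [samples_per_task] * tasks)
        total = sum(hits)
    return 4.0 * total / (tasks * samples_per_task)


if __name__ == "__main__":
    print(estimate_pi())
```

The only communication is the final sum, which is why this style of problem tolerates the high latency between grid sites.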


Weeks 8 - 9

MPI: Process creation, communicators, unsafe message passing, point-to-point message-passing, blocking, non-blocking, communication modes, collective communication, running an MPI program on a cluster

Grid-enabled MPI: MPI-G2 internals, mpirun command, RSL script

Assignment 5: Running a simple MPI-G2 program


Grid Portal Design

“A web-based application server enhanced with the necessary software to communicate to grid services and resources”

“Provides application scientist a customized view of software and hardware resources from a web browser” [1]

Weeks 10 to 11


OGCEGrid


Workflow Technique

Functional decomposition - dividing a problem into separate functions which take results from other functional units and pass on results to functional units - the interconnection pattern depends upon the problem.

Workflow - describes the flow of information between the units.
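One way to picture a workflow is as a dependency graph whose units run once their inputs are ready. The sketch below uses the standard-library `graphlib` module (Python 3.9+) to derive a valid execution order; the unit names are hypothetical and do not come from any course assignment.

```python
from graphlib import TopologicalSorter  # Python 3.9+

# Hypothetical functional units: each maps to the set of units
# whose results it needs before it can run.
workflow = {
    "read_input": set(),
    "clean": {"read_input"},
    "model_a": {"clean"},
    "model_b": {"clean"},
    "combine": {"model_a", "model_b"},
}


def execution_order(graph):
    """Return the units in an order that respects every dependency."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    print(execution_order(workflow))
```

Units with no dependency between them, such as `model_a` and `model_b` here, are candidates for running on different grid resources at the same time.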

Weeks 11 to 12


Workflow Example: Climate Modeling

[Figure: coupled climate model components (atmospheric circulation, atmospheric chemistry, hydrology, land surface, oceanic circulation, ocean chemistry) exchanging: heating rates; water vapor content, humidity, pressure, wind velocities, temperature; sea surface temperature; wind stress, heat flux, water flux.]


GridNexus Workflow Editor Developed by UNC-Wilmington

www.gridnexus.org


Simple Workflow Example

Computes -2 * (23 + 6)
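The same calculation can be sketched as a tiny dataflow graph, where each node is either a constant or a function applied to the outputs of other nodes. This is only an illustration of the idea in plain Python; it is not GridNexus code, and the node names are hypothetical.

```python
import operator

# Each node is either a constant or a (function, input node names) pair.
workflow = {
    "a": 23,
    "b": 6,
    "c": -2,
    "sum": (operator.add, ["a", "b"]),
    "product": (operator.mul, ["c", "sum"]),
}


def evaluate(graph, node):
    """Evaluate a node by first evaluating the nodes that feed it."""
    spec = graph[node]
    if not isinstance(spec, tuple):
        return spec                      # constant: nothing to compute
    func, inputs = spec
    return func(*(evaluate(graph, name) for name in inputs))


if __name__ == "__main__":
    print(evaluate(workflow, "product"))  # prints -58
```

In a grid workflow editor, the function nodes would be services or jobs rather than local arithmetic.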


More Complex Example using Grid Services


Guest Speakers

Professor Daniel A. Reed, Chancellor's Eminent Professor, Vice Chancellor for IT and CIO, UNC-Chapel Hill, Director of Institute for Renaissance Computing, University of North Carolina at Chapel Hill, Duke University, and NC State University. Title of presentation: “Grid computing: 21st Century Challenges.”

Wolfgang Gentzsch, Managing Director, MCNC Grid Computing and Networking Services. Title of presentation: “Grid Computing in the Industry.”

Chuck Kesler, Director, Grid Deployment and Data Center Services, MCNC. Title of presentation: “Security Policy, Legal, and Regulatory Challenges in Grid Computing Environments.”

Professor Ian Foster: Taped presentation “The Grid: Beyond the Hype,” by Ian Foster, Argonne National Laboratory and University of Chicago, (originally given at Duke University, Sept. 14th, 2004).

Weeks 14 to 15


Course Home Page

No assigned course textbook. Materials and links are provided on the home page.

http://www.cs.wcu.edu/~abw/CS493F04

Used for posting announcements, slides, assignments, reading materials, test dates, etc.

WebCT also used for quizzes.


Fall 2005 Grid Computing Course

To originate from UNC-Charlotte as ITCS4010

Again in collaboration with UNC-Wilmington

Can be for undergraduate or graduate credit.


Acknowledgements

Partial support for this work was provided by the National Science Foundation’s Course, Curriculum, and Laboratory Improvement program and the University of North Carolina, Office of the President.

1. National Science Foundation, “Introducing Grid Computing into the Undergraduate Curricula,” ref. DUE 0410667, PI: A. B. Wilkinson, $100,000, 2004-2006.

2. University of North Carolina Office of President, “A Consortium to Promote Computational Science and High Performance Computing,” PI: B. Kurtz (Appalachian State University) co-PI B. Wilkinson and others at various universities, total $650,000, 2004-2006.

3. University of North Carolina Office of President, “Fostering Undergraduate Research Partnerships through a Graphical User Environment for the North Carolina Computing Grid,” PI: R. Vetter (UNC-Wilmington), co-PI B. Wilkinson and others at various universities, total $557,634, 2004-2006.


Papers Since Fall 2004

• B. Wilkinson, M. Holliday, and C. Ferner, “Experiences in Teaching a Geographically Distributed Undergraduate Grid Computing Course,” Workshop on Collaborative and Learning Applications of Grid Technology and Grid Education, IEEE International Symposium on Cluster Computing and the Grid (CCGrid2005), Cardiff, UK, May 9 - 12, 2005, accepted.

• M. A. Holliday, B. Wilkinson, J. House, S. Daoud, and C. Ferner, “A Geographically-Distributed, Assignment-Structured Undergraduate Grid Computing Course,” SIGCSE 2005 Technical Symposium on Computer Science Education, St. Louis, Missouri, February 23 - 27, 2005, accepted.


Opportunities

I am looking for one or two students to work with me over the summer of 2005 full time, and continue through Fall 2005 and onwards.

Must be able to handle complexities of Linux-based software.

Could lead to MS or PhD work.

$$$ Paid!!


Questions?