Grid Computing Course Across North Carolina
Barry Wilkinson
Department of Computer Science,
University of North Carolina at Charlotte
UNC-C, February 11th, 2005
Talk Outline
– Evolution of Grid Computing
– Software Tools
– Grid Computing Course
Grid Computing
Using geographically distributed and interconnected computers for high performance computing and/or for resource sharing.
The grid virtualizes heterogeneous geographically disperse resources
From "Introduction to Grid Computing with Globus," IBM Redbooks
Need to Harness Computers
Original driving force behind grid computing was the same as for the Internet:
– Need for high performance computing by connecting computers at distributed sites.
But it has developed into more than just a form of distributed computing.
Shared Resources
Can share much more than just computers:
– Storage
– Sensors for experiments at particular sites
– Application software
– Databases
– Network capacity, …
Virtual Organizations
Grid computing offers the potential of virtual organizations:
– groups of people, both geographically and organizationally distributed, working together on a problem, sharing computers AND other resources such as databases and experimental equipment.
Crosses multiple administrative domains.
Interconnections and Protocols
Focus now on:
– Using standard Internet protocols and technology, e.g. HTTP, SOAP, web services.
– Establishing grid computing standards around web services.
Applications
Originally e-Science applications:
– Computationally intensive: not necessarily one big problem, but a problem that has to be solved repeatedly with different parameters.
– Data intensive.
– Experimental collaborative projects.
Now also e-Business applications to improve business models and practices.
Economic Development
Cohen report (September 2003):
– Projected impact of grid computing on NC’s economy: could lead to 240,000 new jobs and $10 billion in economic growth in North Carolina by 2010.
History
Began in mid 1990’s with experiments using computers at geographically dispersed sites interconnected as a high speed computing platform.
Seminal experiment: the “I-way” experiment at the 1995 Supercomputing conference (SC’95), using 17 sites across the US running:
– 60+ applications.
– Existing networks (10 networks).
Grid Networks and Projects
Numerous very high performance computing projects developed in late 1990’s and 2000’s.
Examples: USA TeraGrid, UK e-Science Grid, and others
TeraGrid
Other Grid Computing Projects
Very large number, in many countries:
Most countries now have their own grid computing networks and projects.
UK e-Science Grid
Computational Grid Applications
– Biomedical research
– Industrial research
– Engineering research
– Studies in Physics and Chemistry
Some “Computational” Grid Projects
– Large Hadron Collider experimental facility for complex particle experiments at CERN (European Center for Nuclear Research, near Geneva, Switzerland)
– DOE Particle Physics Data grid
– DOE Science grid
– AstroGrid Project
– Comb-e-Chem project
CERN grid
Key aspects of these grids
State-of-the-art interconnection networks.
Sharing resources.
Community of scientists.
National Science Foundation Middleware Initiative (NMI)
Started in 2001, initially over 3 years, “to create and deploy advanced network services that simplify access to diverse Internet information and services.”
Provides a centralized location for important grid software.
Current NMI package includes Globus, Condor, MPI-G2, and a new grid portal project called OGCEGrid (funding started Sept 2003).
Globus
Toolkit provides:
– Underlying Grid Security Infrastructure
– Resource Management
– Data Management
– Information services
Higher level tools are meant to be implemented above these basic services.
1. secure environment
2. discover resource
3. submit job
4. transfer data
From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.
Key Components
– GSI (Grid Security Infrastructure): grid security.
– GRAM (Grid Resource Allocation Manager): remote job submission and control.
– GridFTP: secure data transfer.
– MDS (Monitoring and Discovery Service): interface to system and service information.
Grid Security Infrastructure (GSI)
Provides security functions including:
– Authentication
– Authorization
– Delegation
– Confidential communication
Uses PKI (Public Key Infrastructure) with X.509 certificates and a certificate authority.
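To make the public/private key idea behind PKI concrete, here is a minimal sketch of signing with a private key and verifying with the matching public key. It is a toy textbook-RSA illustration with tiny hard-coded primes, chosen here purely as an assumption for teaching purposes. It is not how GSI or X.509 libraries implement signatures, and it offers no real security.

```python
import hashlib

# Toy textbook-RSA key pair (illustrative assumption; far too small to be secure).
P, Q = 61, 53
N = P * Q                    # public modulus
PHI = (P - 1) * (Q - 1)
E = 17                       # public exponent
D = pow(E, -1, PHI)          # private exponent (modular inverse; needs Python 3.8+)


def digest(message: bytes) -> int:
    """Hash the message and reduce the hash into the range [0, N)."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Sign with the PRIVATE key: only the key owner can compute this value."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verify with the PUBLIC key (E, N): anyone can check the signature."""
    return pow(signature, E, N) == digest(message)


if __name__ == "__main__":
    msg = b"submit job 42"
    sig = sign(msg)
    print(verify(msg, sig))            # the genuine signature is accepted
    print(verify(msg, (sig + 1) % N))  # an altered signature is rejected
```

In a real PKI the public key is not handed over directly: it arrives inside a certificate that a certificate authority has itself signed, so the verifier only needs to trust the authority's key.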
1. GSI
2. MDS
3. GRAM
4. GridFTP
From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.
Resource Management
– Job submission
– Job status
– Resource allocation
Globus does not have its own job scheduler to find resources and automatically send jobs to suitable machines. For that, use a separate scheduler - we used Condor-G.
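Because the choice of machine is left to an external scheduler, the following minimal sketch shows the kind of matchmaking such a scheduler performs: from the machines that advertise themselves, pick one that satisfies a job's requirements. The `Machine` and `Job` records, their fields, and the least-loaded tie-break are all hypothetical. This is not Condor-G's ClassAd mechanism or any Globus API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Machine:
    """What a (hypothetical) resource advertises about itself."""
    name: str
    os: str
    free_cpus: int
    memory_mb: int
    load: float


@dataclass
class Job:
    """What a (hypothetical) job requires."""
    name: str
    os: str
    cpus: int
    memory_mb: int


def match(job: Job, machines: List[Machine]) -> Optional[Machine]:
    """Return the least-loaded machine that meets the job's requirements, or None."""
    candidates = [
        m for m in machines
        if m.os == job.os and m.free_cpus >= job.cpus and m.memory_mb >= job.memory_mb
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.load)


if __name__ == "__main__":
    pool = [
        Machine("node-a", "linux", free_cpus=2, memory_mb=2048, load=0.9),
        Machine("node-b", "linux", free_cpus=4, memory_mb=4096, load=0.2),
        Machine("node-c", "solaris", free_cpus=8, memory_mb=8192, load=0.1),
    ]
    chosen = match(Job("sim", "linux", cpus=2, memory_mb=1024), pool)
    print(chosen.name if chosen else "no match")
```

A real scheduler adds queueing, priorities, and two-way matching (machines can also state which jobs they accept), which this sketch leaves out.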
Scheduling
From "Introduction to Grid Computing with Globus," IBM Redbooks
Software - Globus Project
Open source software toolkit developed for grid computing.
Foster championed grid computing concept. Roots in I-way experiment. Started in 1996.
– GT version 1 (late 1990’s)
– GT version 2 (early 2000’s)
– GT version 3 (2003)
– GT version 4 (2005)
De facto standard for grid computing.
Globus Toolkit: Recent History
GT2 (2.4 released in 2002): reference implementation of Grid fabric protocols
– GRAM for job submissions
– MDS for resource discovery
– GridFTP for data transfer
– GSI security
GT3 (3.2 released mid-2004): redesign
– OGSI based
– Grid services, built on SOAP and XML
GT4 (final planned for April 2005): redesign
– WSRF based
– Grid standards merged with Web services
From “Globus Toolkit 4 Tutorial,” MCNC Jan-Feb, 2005, Pawel Plaszczak and Bogdan Lobodzinski, Gridwise Technologies.
Supercomputing 2003 Demonstration
We used Globus version 2.4 in a Supercomputing 2003 demo organized by the University of Melbourne.
21 countries involved, numerous sites.
Version 3
A re-implementation based upon the Open Grid Services Architecture (OGSA) standard.
We used version 3.2 for the Fall 2004 course.
The underlying implementation of version 3.x used OGSI (Open Grid Services Infrastructure), which was not embraced by the community.
Version 4
Currently under development to be released early 2005.
OGSA kept but OGSI abandoned in favor of new implementation standards based around web services. (Version 3 used “extended” web services)
Source: globus.org
Web Services-Based Grid Computing
Grid Computing is now strongly based upon web services.
Large number of newly proposed grid computing standards (WS-xxxx standards):
– WS-Resource Framework
– WS-Addressing
– … etc.
Grid Computing Course (Fall 2004)
Originated from WCU on the NCREN network. Broadcast to:
– UNC-Wilmington
– NC State University
– UNC-Asheville
– UNC-Greensboro
– Appalachian State University
– NC Central University
– Cape Fear Community College
– Elon University
43 students, several faculty.
Uses NMI package.
Instructors: Barry Wilkinson and Clayton Ferner (UNC-Wilmington).
Participating Sites
Level and Prerequisites
Listed as an undergraduate course but can be taken for graduate credit.
Graduate students expected to do more work.
Preferably have programming skills in Java on a Linux system.
Topics
– Review of Internet technologies
– Introduction to grid computing
– Web services
– Grid services
– Security, Public Key Infrastructure
– Open Grid Services Architecture (OGSA)
– Globus 3.2
– Condor-G
– MPI and grid-enabled MPI
– UNC-W workflow editor and other GUI tools
– Grid computing applications
Assessment
Programming assignments:
– Creating a web service
– Creating a grid service
– Submitting a Globus job
– Submitting a Condor-G job
– MPI-G2 program (not done in Fall 2004)
– Using UNC-W GUI workflow editor
Class tests (2)
Final test
Weeks 1 - 3
Grid computing: Virtual organizations, computational grid projects, grid computing networks, TeraGrid, grid projects in the US and around the world, grid challenges
Internet Technologies: IP addresses, HTTP, URL, XML, Telnet, FTP, SSL
Web Services I: Service-Oriented Architecture (SOA), service registry, XML documents, XML schema, namespaces, SOAP, XML/SOAP examples, Axis
Web Services II: WSDL, portType, message definition, WSDL to/from code
Assignment 1: "Simple" Web service Java programming assignment. Tomcat environment, Axis, JWS facility
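As a rough illustration of the XML/SOAP material in these weeks, the sketch below builds the request message a client might send to a simple web service. The SOAP 1.1 envelope namespace is standard. The service namespace `http://example.org/math`, the `add` operation, and its parameter names are hypothetical. The assignment itself used Java with Axis, which generates such messages automatically, and this is not the Axis API.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace
SERVICE_NS = "http://example.org/math"                  # hypothetical service namespace


def build_soap_request(operation: str, params: dict) -> str:
    """Return a SOAP envelope whose Body holds one operation element with its arguments."""
    ET.register_namespace("soapenv", SOAP_ENV)
    envelope = ET.Element(f"{{{SOAP_ENV}}}Envelope")
    body = ET.SubElement(envelope, f"{{{SOAP_ENV}}}Body")
    call = ET.SubElement(body, f"{{{SERVICE_NS}}}{operation}")
    for name, value in params.items():
        arg = ET.SubElement(call, name)  # one child element per argument
        arg.text = str(value)
    return ET.tostring(envelope, encoding="unicode")


if __name__ == "__main__":
    print(build_soap_request("add", {"a": 2, "b": 3}))
```

The resulting text would be sent as the body of an HTTP POST to the service endpoint, and the reply comes back as a similar envelope containing the result.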
Weeks 3 - 4
Grid Service: Concepts, differences from Web services, stateful/stateless/transient/non-transient, Open Grid Services Architecture (OGSA), OGSI, grid service factory, Web Services Resource Framework (WSRF)
Assignment 2: "Simple" grid service Java programming assignment. Globus 3.2 environment. Tools: ant
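The difference between a stateless web service operation and a stateful, transient service instance created by a factory can be sketched in plain Python. The class and method names below are hypothetical and only mimic the idea. They are not the OGSI or Globus 3.2 interfaces, which the assignment exercised through Java.

```python
import itertools
from typing import Dict, Optional


def stateless_add(a: float, b: float) -> float:
    """Stateless operation: the result depends only on the arguments."""
    return a + b


class CounterInstance:
    """Stateful service instance: it remembers a running total between calls."""

    def __init__(self, handle: str) -> None:
        self.handle = handle
        self.total = 0.0

    def add(self, value: float) -> float:
        self.total += value
        return self.total


class CounterFactory:
    """Creates service instances on request and destroys them when asked."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._instances: Dict[str, CounterInstance] = {}

    def create(self) -> CounterInstance:
        handle = f"counter-{next(self._ids)}"  # unique handle per instance
        instance = CounterInstance(handle)
        self._instances[handle] = instance
        return instance

    def lookup(self, handle: str) -> Optional[CounterInstance]:
        return self._instances.get(handle)

    def destroy(self, handle: str) -> None:
        # Transient: the instance and its state exist only until destroyed.
        self._instances.pop(handle, None)


if __name__ == "__main__":
    factory = CounterFactory()
    mine, yours = factory.create(), factory.create()
    mine.add(5)
    print(mine.add(2), yours.add(1))  # each instance keeps its own total
```

Each client gets its own instance with private state, which is the property a plain stateless operation such as `stateless_add` cannot provide.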
Weeks 4 - 6
Security: Secure connection, authorization requirements, symmetric and asymmetric (public/private) key cryptography, non-repudiation, digital signatures, certificates, certificate authorities, X.509 certificate
Globus: Part I: Basic structure (version 3.2), grid service container, service browser, Globus Resource Allocation Manager (GRAM), job submission with managed-job-globusrun, Grid Security Infrastructure (GSI), Globus certificates, simpleCA, proxies, creating a proxy
Globus: Part II: Resource management, Master Managed Job Factory Service (MMJFS), more on managed-job-globusrun, Resource Specification Language (RSL and RSL-2), syntax and examples in RSL and RSL-2
Assignment 3: Submitting a job to the grid, GT3 managed-job-globusrun, job specified in RSL-2 (XML file)
Weeks 6 - 7
Globus: Part III: Information Directory Services, LDAP, resource discovery
Schedulers and resource brokers: Condor, submit description file, DAGMan, checkpointing, ClassAd, Condor-G, other systems
Assignment 4: Submitting a Condor-G job
Weeks 7 - 8
High performance computing (HPF): Grand challenge problems, parallel computing, potential speed-up, types of parallel computers, shared memory multiprocessors, programming, message-passing multicomputers
Parallel Programming Techniques suitable for a Grid: Embarrassingly parallel computations, Monte Carlo, parameter studies, sample "big" problems, gravitational N-body problem
Cluster Computing: Basic message passing techniques, history, Beowulf clusters, system software, programming models (MPMD, SPMD), synchronous message passing, asynchronous message passing, message tags, collective routines
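An embarrassingly parallel Monte Carlo computation, one of the techniques listed above as suitable for a grid, can be sketched as follows. The example estimates pi from independent batches of random points. The function names and the seeding scheme are illustrative assumptions, and the local thread pool merely stands in for separate grid jobs, so it demonstrates the structure rather than any real speed-up.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(seed: int, samples: int) -> int:
    """One independent task: count random points that land inside the quarter circle."""
    rng = random.Random(seed)  # private generator, so tasks share no state
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(tasks: int, samples_per_task: int) -> float:
    """Run the tasks independently, then combine their counts in one final step."""
    seeds = range(tasks)  # a different seed per task gives a different random stream
    with ThreadPoolExecutor() as pool:
        total_hits = sum(pool.map(count_hits, seeds, [samples_per_task] * tasks))
    return 4.0 * total_hits / (tasks * samples_per_task)


if __name__ == "__main__":
    print(estimate_pi(tasks=8, samples_per_task=20000))
```

The tasks never communicate with each other. Only the final sum needs all the results, which is why this style of computation tolerates the high latency between grid sites. A parameter study has the same shape, with a different parameter value per task in place of a different seed.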
Weeks 8 - 9
MPI: Process creation, communicators, unsafe message passing, point-to-point message-passing, blocking, non-blocking, communication modes, collective communication, running an MPI program on a cluster
Grid-enabled MPI: MPI-G2 internals, mpirun command, RSL script
Assignment 5: Running a simple MPI-G2 program
Grid Portal Design
“A web-based application server enhanced with the necessary software to communicate to grid services and resources”
“Provides application scientist a customized view of software and hardware resources from a web browser” [1]
Weeks 10 to 11
OGCEGrid
Workflow Technique
Functional decomposition - dividing a problem into separate functions, which take results from other functional units and pass results on to further functional units. The interconnection pattern depends upon the problem.
Workflow - describes the flow of information between the units.
Weeks 11 to 12
Workflow Example: Climate Modeling
[Figure: coupled climate-model workflow. Components: atmospheric circulation, atmospheric chemistry, hydrology, land surface, oceanic circulation, and ocean chemistry models. Data exchanged: heating rates; water vapor content, humidity, pressure, wind velocities, temperature; sea surface temperature; wind stress, heat flux, water flux.]
GridNexus Workflow Editor Developed by UNC-Wilmington
www.gridnexus.org
Simple Workflow Example
Computes
-2 * (23 + 6)
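The same computation can be written as a tiny dataflow program, which shows what a workflow editor does behind its boxes and arrows: each unit runs once all of its inputs are available. The unit names and the dictionary layout are assumptions made for this sketch and have nothing to do with GridNexus's own file format or API.

```python
import operator
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each unit maps to (function, names of the units that feed it).
# Constants are units with no inputs.
workflow = {
    "a": (lambda: 23, []),
    "b": (lambda: 6, []),
    "c": (lambda: -2, []),
    "sum": (operator.add, ["a", "b"]),
    "product": (operator.mul, ["c", "sum"]),
}


def run_workflow(units):
    """Evaluate every unit after its inputs and return all results by unit name."""
    dependencies = {name: set(inputs) for name, (_, inputs) in units.items()}
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        func, inputs = units[name]
        results[name] = func(*(results[i] for i in inputs))
    return results


if __name__ == "__main__":
    print(run_workflow(workflow)["product"])  # -2 * (23 + 6) = -58
```

In a grid workflow the units would be remote services or jobs rather than local functions, but the ordering rule is the same.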
More Complex Example using Grid Services
Guest Speakers
Professor Daniel A. Reed, Chancellor's Eminent Professor, Vice Chancellor for IT and CIO, UNC-Chapel Hill, Director of Institute for Renaissance Computing, University of North Carolina at Chapel Hill, Duke University, and NC State University. Title of presentation: “Grid computing: 21st Century Challenges.”
Wolfgang Gentzsch, Managing Director, MCNC Grid Computing and Networking Services. Title of presentation: “Grid Computing in the Industry.”
Chuck Kesler, Director, Grid Deployment and Data Center Services, MCNC. Title of presentation: “Security Policy, Legal, and Regulatory Challenges in Grid Computing Environments.”
Professor Ian Foster: Taped presentation “The Grid: Beyond the Hype,” by Ian Foster, Argonne National Laboratory and University of Chicago, (originally given at Duke University, Sept. 14th, 2004).
Weeks 14 to 15
Course Home Page
No assigned course textbook. Materials and links are provided on the home page.
http://www.cs.wcu.edu/~abw/CS493F04
Used for posting announcements, slides, assignments, reading materials, test dates, etc.
WebCT also used for quizzes.
Fall 2005 Grid Computing Course
To originate from UNC-Charlotte as ITCS4010
Again in collaboration with UNC-Wilmington
Can be for undergraduate or graduate credit.
Acknowledgements
Partial support for this work was provided by the National Science Foundation’s Course and University of North Carolina, Office of the President.
1. National Science Foundation, “Introducing Grid Computing into the Undergraduate Curricula,” ref. DUE 0410667, PI: A. B. Wilkinson, $100,000, 2004-2006.
2. University of North Carolina Office of President, “A Consortium to Promote Computational Science and High Performance Computing,” PI: B. Kurtz (Appalachian State University) co-PI B. Wilkinson and others at various universities, total $650,000, 2004-2006.
3. University of North Carolina Office of President, “Fostering Undergraduate Research Partnerships through a Graphical User Environment for the North Carolina Computing Grid,” PI: R. Vetter (UNC-Wilmington), co-PI B. Wilkinson and others at various universities, total $557,634, 2004-2006.
Papers Since Fall 2004
• B. Wilkinson, M. Holliday, and C. Ferner, “Experiences in Teaching a Geographically Distributed Undergraduate Grid Computing Course,” Workshop on Collaborative and Learning Applications of Grid Technology and Grid Education, IEEE International Symposium on Cluster Computing and the Grid (CCGrid2005), Cardiff, UK, May 9 - 12, 2005, accepted.
• M. A. Holliday, B. Wilkinson, J. House, S. Daoud, and C. Ferner, “A Geographically-Distributed, Assignment-Structured Undergraduate Grid Computing Course,” SIGCSE 2005 Technical Symposium on Computer Science Education, St. Louis, Missouri, February 23 - 27, 2005, accepted.
Opportunities
I am looking for one or two students to work with me over the summer of 2005 full time, and continue through Fall 2005 and onwards.
Must be able to handle complexities of Linux-based software.
Could lead to MS or PhD work.
$$$ Paid!!
Questions?