Transcript
Page 1: Grid Computing:  like herding cats?

Stephen Jarvis, High Performance Systems Group

University of Warwick, UK

Grid Computing: like herding cats?

Page 2: Grid Computing:  like herding cats?

• What are we going to cover today?

– A brief history

– Why we are doing it

– Applications

– Users

– Challenges

– Middleware

• What are you going to cover next week?

– A technical talk on the specifics of our work

– Including application to e-Business and e-Science

Sessions on Grid

Page 3: Grid Computing:  like herding cats?

An Overused Analogy

• Electrical Power Grid

• Computing power might somehow be like electrical power

– plug in

– switch on

– have access to unlimited power

• We don’t know who supplies the power, or where it comes from

– just pick up the bill at the end of the month

• Is this the future of computing?

Page 4: Grid Computing:  like herding cats?

• Is the computing infrastructure available?

• Computing power

– 1986: Cray X-MP ($8M)

– 2000: Nintendo-64 ($149)

– 2003: Earth Simulator (NEC), ASCI Q (LANL)

– 2005: Blue Gene/L (IBM), 360 Teraflops

– Look at www.top500.org for current supercomputers!

Sounds great - but how long?

Page 5: Grid Computing:  like herding cats?

• Storage capabilities

– 1986: Local data stores (MB)

– 2002: Goddard Earth Observation System – 29TB

• Network capabilities

– 1986: NSFNET 56Kb/s backbone

– 1990s: Upgraded to 45Mb/s (gave us the Internet)

– 2000s: 40 Gb/s

Storage & Network
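
To put the storage and bandwidth figures above side by side, the rough calculation below estimates how long a 29TB archive would take to move at each backbone speed. It is a back-of-envelope sketch only: it assumes decimal units (1 TB = 10^12 bytes), a fully dedicated link and no protocol overhead, none of which is stated on the slide, and the function names are illustrative.

```python
# Rough transfer times for a 29 TB archive at each backbone speed.
# Assumptions (not from the slides): decimal units, a dedicated link, no overhead.

ARCHIVE_BITS = 29 * 10**12 * 8  # 29 TB expressed in bits

LINK_RATES = {  # bits per second
    "56 kb/s (1986)": 56 * 10**3,
    "45 Mb/s (1990s)": 45 * 10**6,
    "40 Gb/s (2000s)": 40 * 10**9,
}


def transfer_seconds(bits: int, rate_bps: int) -> float:
    """Ideal transfer time: payload size divided by link rate."""
    return bits / rate_bps


def humanise(seconds: float) -> str:
    """Express a duration in the largest convenient unit."""
    for unit, size in (("years", 365 * 86400), ("days", 86400), ("hours", 3600)):
        if seconds >= size:
            return f"{seconds / size:.1f} {unit}"
    return f"{seconds:.0f} seconds"


for label, rate in LINK_RATES.items():
    print(f"{label}: {humanise(transfer_seconds(ARCHIVE_BITS, rate))}")
```

Under these assumptions the same archive needs roughly 131 years at 56 kb/s, about 60 days at 45 Mb/s, and under two hours at 40 Gb/s.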

Page 6: Grid Computing:  like herding cats?

Many Potential Resources

GRID

Terabyte databases

Space telescopes

Millions of PCs, 30% Utilisation

Supercomputing Centres

10k PS/2 per week

50M Mobile Phones

Page 7: Grid Computing:  like herding cats?

• The vision … mid ’90s

– to promote a revolution in how NASA addresses large-scale science and engineering by providing a persistent HPC infrastructure

• Computing and data management services

– on-demand

– locate and co-schedule multi-Center resources

– address large-scale and/or widely distributed problems

• Ancillary services

– workflow management and coordination

– security, charging …

Some History:

NASA's Information Power Grid

Page 8: Grid Computing:  like herding cats?

• Lift Capabilities

• Drag Capabilities

• Responsiveness

• Thrust performance

• Reverse Thrust performance

• Responsiveness

• Fuel Consumption

• Braking performance

• Steering capabilities

• Traction

• Dampening capabilities

Crew Capabilities: accuracy, perception, stamina, reaction times, SOPs

Engine Models

Airframe Models

Landing Gear Models

Stabilizer Models

Human Models

Whole system simulations are produced by coupling all of the sub-system simulations

Page 9: Grid Computing:  like herding cats?

[Figure: network diagram of connected sites and resources. Labels: SDSC, LaRC, GSFC, MSFC, KSC, JSC, NCSA, Boeing, JPL, NGIX, EDC, NREN, CMU, GRC, 300 node Condor pool, NTON-II/SuperNet, MCAT/SRB, DMF, MDS, CA, O2000, cluster]

Page 10: Grid Computing:  like herding cats?

Virtual National Air Space (VNAS)

GRC

Engine Models

LaRC

Airframe Models

Landing Gear Models

ARC

Wing Models

Stabilizer Models

Human Models

• FAA Ops Data

• Weather Data

• Airline Schedule Data

• Digital Flight Data

• Radar Tracks

• Terrain Data

• Surface Data

22,000 Commercial US Flights a day

50,000 Engine Runs

22,000 Airframe Impact Runs

132,000 Landing/Take-off Gear Runs

48,000 Human Crew Runs

66,000 Stabilizer Runs

44,000 Wing Runs

Simulation Drivers

(Being pulled together under the NASA AvSP Aviation ExtraNet (AEN))

National Air Space Simulation Environment

Page 11: Grid Computing:  like herding cats?

• A computational grid is a hardware and software infrastructure that provides dependable, consistent, pervasive and inexpensive access to high-end computational capabilities.

• The capabilities need not be high end.

• The infrastructure needs to be relatively transparent.

What is a Computational Grid?

Page 12: Grid Computing:  like herding cats?

Selected Grid Projects

• US Based

– NASA Information Power Grid

– DARPA CoABS Grid

– DOE Science Grid

– NSF National Virtual Observatory

– NSF GriPhyN

– DOE Particle Physics Data Grid

– NSF DTF TeraGrid

– DOE ASCI DISCOM Grid

– DOE Earth Systems Grid etc…

• EU Based

– DataGrid (CERN, ..)

– EuroGrid (Unicore)

– Damien (Metacomputing)

– DataTag (TransAtlantic Testbed, …)

– Astrophysical Virtual Observatory

– GRIP (Globus/Unicore)

– GRIA (Industrial applications)

– GridLab (Cactus Toolkit, ..)

– CrossGrid (Infrastructure Components)

– EGSO (Solar Physics)

• Other National Projects

– UK – e-Science Grid

– Netherlands – VLAM-G, DutchGrid

– Germany – UNICORE Grid, D-Grid

– France – Etoile Grid

– Italy – INFN Grid

– Eire – Grid-Ireland

– Scandinavia – NorduGrid

– Poland – PIONIER Grid

– Hungary – DemoGrid

– Japan – JpGrid, ITBL

– South Korea – N*Grid

– Australia – Nimrod-G, ….

– Thailand – Singapore – AsiaPacific Grid

Page 13: Grid Computing:  like herding cats?

The Big Spend: two examples

– US Tera Grid

• $100 Million US Dollars (so far…)

• 5 supercomputer centres

• New ultra-fast optical network ≤ 40Gb/s

• Grid software and parallel middleware

• Coordinated virtual organisations

• Scientific applications and users

– UK e-Science Grid

• £250 Million (so far…)

• Regional e-Science centres

• New infrastructure

• Middleware development

• Big science projects

SuperJANET4

Page 14: Grid Computing:  like herding cats?

Cambridge

Newcastle

Edinburgh

Oxford

Glasgow

Manchester

Cardiff

Soton

London

Belfast

DL

RL Hinxton

Lancaster White Rose

Birmingham/Warwick

Bristol UCL

e-Science Grid

Page 15: Grid Computing:  like herding cats?

• NASA

– Aerospace simulations, Air traffic control

– NWS, In-aircraft computing

– Virtual Airspace

– Free fly, Accident prevention

• IBM

– On-demand computing infrastructure

– Protect software

– Support business computing

• Governments

– Simulation experiments

– Biodiversity, genomics, military, space science…

Who wants Grids and why?

Page 16: Grid Computing:  like herding cats?

Classes of Grid applications

Category | Examples | Characteristics
Distributed supercomputing | DIS, Stellar dynamics, Chemistry | Very large problems, lots of CPU, memory
High Throughput | Chip design, cryptography | Harnessing idle resources
On Demand | Medical, Weather prediction | Remote resources, time bounded
Data Intensive | Physics, Sky surveys | Synthesis of new information
Collaborative | Data exploration, virtual environments | Connection between many parties

Page 17: Grid Computing:  like herding cats?

Classes of Grid

Category | Examples | Characteristics
Data Grid | EU DataGrid | Lots of data sources from one site, processing off site
Compute Grid | Chip design, cryptography | Harnessing and connecting rare resources
Scavenging Grid | SETI | CPU cycle stealing, commodity resources
Enterprise Grid | Banking | Multi-site, but one organisation

Page 18: Grid Computing:  like herding cats?

Discovery Net Project

Scientific Discovery In Real Time

Using Distributed Resources

[Figure labels: Scientific Information; Real Time Integration; Dynamic Application Integration; Workflow Construction; Interactive Visual Analysis; Literature; Databases; Operational Data; Images; Instrument Data]

Page 19: Grid Computing:  like herding cats?

Nucleotide Annotation Workflows

Download sequence from Reference Server

Save to Distributed Annotation Server

Execute distributed annotation workflow

NCBI

EMBL

TIGR SNP

InterPro

SMART

SWISSPROT

GO

KEGG

1800 clicks, 500 Web accesses, 200 copy/pastes: 3 weeks' work in 1 workflow and a few seconds' execution

Page 20: Grid Computing:  like herding cats?

An e-science challenge – non-trivial

NASA IPG as a possible paradigm

Need to integrate rigorously if we are to deliver accurate & hence biomedically useful results

Noble (2002) Nature Rev. Mol. Cell Biol. 3:460

Sansom et al. (2000) Trends Biochem. Sci. 25:368

molecular

cellular

organism

Grand Challenge: Integrating Different Levels of Simulation

Page 21: Grid Computing:  like herding cats?

Classes of Grid users

Class | Purpose | Makes Use Of | Concerns
End Users | Solve problems | Applications | Transparency, performance
Application Developers | Develop applications | Programming models, tools | Ease of use, performance
Tool Developers | Develop tools & prog. models | Grid services | Adaptivity, security
Grid Developers | Provide grid services | Existing grid services | Connectivity, security
System Administrators | Management of resources | Management tools | Balancing concerns

Page 22: Grid Computing:  like herding cats?

• Composed of a hierarchy of sub-systems

• Scalability is vital

• Key elements:

– End systems

• Single compute nodes, storage systems, IO devices etc.

– Clusters

• Homogeneous networks of workstations; parallel & distributed management

– Intranet

• Heterogeneous collections of clusters; geographically distributed

– Internet

• Interconnected intranets; no centralised control

Grid architecture

Page 23: Grid Computing:  like herding cats?

• State of the art

– Privileged OS; complete control of resources and services

– Integrated nature allows high performance

– Plenty of high level languages and tools

• Future directions

– Lack features for integration into larger systems

– OS support for distributed computation

– Mobile code (sandboxing)

– Reduction in network overheads

End Systems

Page 24: Grid Computing:  like herding cats?

• State of the art

– High-speed LAN, 100s or 1000s of nodes

– Single administrative domain

– Programming libraries like MPI

– Inter-process communication, co-scheduling

• Future directions

– Performance improvements

– OS support

Clusters

Page 25: Grid Computing:  like herding cats?

• State of the art

– Grids of many resources, but one admin. domain

– Management of heterogeneous resources

– Data sharing (e.g. databases, web services)

– Supporting software environments inc. CORBA

– Load sharing systems such as LSF and Condor

– Resource discovery

• Future directions

– Increasing complexity (physical scale etc)

– Performance

– Lack of global knowledge

Intranets

Page 26: Grid Computing:  like herding cats?

• State of the art

– Geographical distribution, no central control

– Data sharing is very successful

– Management is difficult

• Future directions

– Sharing other computing services (e.g. computation)

– Identification of resources

– Transparency

– Internet services

Internets

Page 27: Grid Computing:  like herding cats?

• Authentication

– Can the users use the system; what jobs can they run?

• Acquiring resources

– What resources are available?

– Resource allocation policy; scheduling

• Security

– Is the data safe? Is the user process safe?

• Accounting

– Is the service free, or should the user pay?

Basic Grid services

Page 28: Grid Computing:  like herding cats?

• Grid computing is a relatively new area

– There are many challenges

• Nature of Applications

– New methods of scientific and business computing

• Programming models and tools

– Rethinking programming, algorithms, abstraction etc.

– Use of software components/services

• System Architecture

– Minimal demands should be placed on contributing sites

– Scalability

– Evolution of future systems and services

Research Challenges (#1)

Page 29: Grid Computing:  like herding cats?

• Problem solving methods

– Latency- and fault-tolerant strategies

– Highly concurrent and speculative execution

• Resource management

– How are the resources shared?

– How do we achieve end-to-end performance?

– Need to specify QoS requirements

– Then need to translate this to resource level

– Contention?

Research Challenges (#2)
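
One way to picture the step from a QoS requirement to the resource level is turning a deadline into a node count. The sketch below assumes a deliberately simple, hypothetical performance model (a fixed serial part plus a parallel part that divides evenly across nodes, ignoring contention); the function names and figures are illustrative and are not taken from the talk.

```python
from typing import Optional


def predicted_runtime(serial_s: float, parallel_s: float, nodes: int) -> float:
    """Assumed model: the serial part runs once, the parallel part divides evenly."""
    return serial_s + parallel_s / nodes


def nodes_for_deadline(serial_s: float, parallel_s: float,
                       deadline_s: float, max_nodes: int) -> Optional[int]:
    """Return the smallest node count that meets the deadline, or None if none does."""
    for nodes in range(1, max_nodes + 1):
        if predicted_runtime(serial_s, parallel_s, nodes) <= deadline_s:
            return nodes
    return None


# QoS requirement: finish within 10 minutes on a pool of at most 64 nodes.
print(nodes_for_deadline(serial_s=60, parallel_s=3600, deadline_s=600, max_nodes=64))
```

A `None` result corresponds to a requirement that cannot be met at any size the pool offers, which is where negotiation or rejection would have to take over.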

Page 30: Grid Computing:  like herding cats?

• Security

– How do we safely share data, resources, tasks?

– How is code transferred?

– How does licensing work?

• Instrumentation and performance

– How do we maintain good performance?

– How can load-balancing be controlled?

– How do we measure grid performance?

• Networking and infrastructure

– Significant impact on networking

– Need to combine high and low bandwidth

Research Challenges (#3)

Page 31: Grid Computing:  like herding cats?

• Many people see middleware as the vital ingredient

• Globus toolkit

– Component services for security, resource location, resource management, information services

• OGSA

– Open Grid Services Architecture

– Drawing on web services technology

• GGF

– International organisation driving Grid development

– Contains partners such as Microsoft, IBM, NASA etc.

Development of middleware

Page 32: Grid Computing:  like herding cats?

Workload Generation, Visualization…

Middleware Conceptual Layers

Discovery, Mapping, Scheduling, Security, Accounting…

Computing, Storage, Instrumentation…

Page 33: Grid Computing:  like herding cats?

Requirements include:

• Offers up useful resources

• Accessible and useable resources

• Stable and adequately supported

• Single user ‘Laptop feel’

Middleware has much of this responsibility

Page 34: Grid Computing:  like herding cats?

Demanding management issues

• Users are (currently) likely to be sophisticated

• but probably not computer ‘techies’

• Need to hide detail & ‘obscene’ complexity

• Provide the vision of access of full resources

• Provide contract for level(s) of support (SLAs)

Page 35: Grid Computing:  like herding cats?

Key Interface between Applications & Machines

Gate Keeper / Manager

• Acts as resource manager.

• Responsible for mapping applications to resources.

• Scheduling tasks.

• Ensuring service level agreements (SLAs)

• Distributed / Dynamic.
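
The mapping and scheduling role listed above can be sketched as a greedy placement loop. This is a minimal illustration under stated assumptions, not the design of any particular gate keeper: `Resource`, `schedule` and the "work units per second" speed metric are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Resource:
    name: str
    speed: float              # work units per second (assumed metric)
    busy_until: float = 0.0   # time at which the resource next becomes free


def schedule(tasks: Dict[str, float], resources: List[Resource]) -> Dict[str, str]:
    """Greedy mapping: take tasks largest first and place each on the
    resource that would finish it earliest."""
    placement = {}
    for task, work in sorted(tasks.items(), key=lambda kv: kv[1], reverse=True):
        best = min(resources, key=lambda r: r.busy_until + work / r.speed)
        best.busy_until += work / best.speed
        placement[task] = best.name
    return placement


pool = [Resource("fast", speed=2.0), Resource("slow", speed=1.0)]
print(schedule({"a": 8.0, "b": 4.0, "c": 2.0}, pool))
```

A static loop like this ignores the distributed and dynamic aspects named on the slide; a real manager would have to re-plan as resources join, leave or slow down.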

Page 36: Grid Computing:  like herding cats?

Middleware Projects

• Globus, Argonne National Labs, USA

• AppLeS, UC San Diego, USA

• Open Grid Services Architecture (OGSA)

• ICENI, Imperial, UK

• Nimrod, Melbourne, Australia

• Many others... including us!!

Page 37: Grid Computing:  like herding cats?

HPSG’s approach:

• Determine what resources are required

– (advertise)

• Determine what resources are available

– (discovery)

• Map requirements to available resources

– (scheduling)

• Maintain contract of performance

– (service level agreement)

• Performance drives the middleware decisions

– PACE
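
The steps above (advertise, discover, schedule, keep to the agreed service level) can be sketched as a small matching routine driven by a performance prediction. Everything here is an assumption for illustration: `Request`, `Offer`, `choose_host` and `toy_predict` are hypothetical names, and the toy predictor merely stands in for a real prediction tool; it does not represent PACE or its interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:            # what the application advertises
    app: str
    min_memory_gb: int
    deadline_s: float     # the agreed service level


@dataclass
class Offer:              # what resource discovery returns
    host: str
    memory_gb: int
    speed: float          # work units per second (assumed metric)


WORK_UNITS = {"sim": 1000.0}  # hypothetical application cost table


def toy_predict(app: str, offer: Offer) -> float:
    """Stand-in predictor: runtime = work / speed."""
    return WORK_UNITS[app] / offer.speed


def choose_host(req: Request, offers: List[Offer],
                predict: Callable[[str, Offer], float]) -> Optional[str]:
    """Keep offers that satisfy the requirements and the deadline,
    then pick the one with the shortest predicted runtime."""
    feasible = [o for o in offers
                if o.memory_gb >= req.min_memory_gb
                and predict(req.app, o) <= req.deadline_s]
    if not feasible:
        return None  # no offer can honour the service level
    return min(feasible, key=lambda o: predict(req.app, o)).host


offers = [Offer("a", 4, 10.0), Offer("b", 16, 5.0), Offer("c", 16, 8.0)]
print(choose_host(Request("sim", min_memory_gb=8, deadline_s=150.0), offers, toy_predict))
```

Passing the predictor in as a parameter keeps the matching logic independent of how predictions are produced, which is the sense in which performance can drive the decision.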

Page 38: Grid Computing:  like herding cats?

• ‘[The Grid] intends to make access to computing power, scientific data repositories and experimental facilities as easy as the Web makes access to information.’

Tony Blair, 2002

• High Performance Systems Group, Warwick

– www.dcs.warwick.ac.uk/research/hpsg

Page 39: Grid Computing:  like herding cats?

• And herding cats …

– 100,000s of computers

– Sat. links, miles of networking

– Space telescopes, atomic colliders, medical scanners

– Terabytes of data

– Software stack a mile high…

Page 40: Grid Computing:  like herding cats?