Introduction to Grid Computing: Ding Qing SSE USTC


Page 1: Introduction to Grid Computing:

Introduction to Grid Computing:

Ding Qing, SSE USTC

Page 2: Introduction to Grid Computing:

2

Overview

1. Background

2. Globus Toolkit

3. Future directions

4. Related tools

Page 3: Introduction to Grid Computing:

3

1. Background

• Introduction
• Towards global (Grid) computing
• Grid Challenges and Technologies
• Grid Architectures
• Grid Applications

Page 4: Introduction to Grid Computing:

4

Introduction

Page 5: Introduction to Grid Computing:

5

Computing and Communication Technologies Evolution

[Figure: timeline from 1960 to 2010 with a COMPUTING track and a Communication track. Milestones shown: Sputnik, ARPANET, Email, Ethernet, TCP/IP, IETF, Internet Era, WWW Era, Mosaic, HTML, XML, W3C, Web Services, XEROX PARC worm, Mainframes, Minicomputers, PCs, Workstations, Crays, MPPs, PC Clusters, WS Clusters, HTC, PDAs, P2P, Grids, e-Science, e-Business, Computing Utility, SocialNet.]

Page 6: Introduction to Grid Computing:

6

Scalable Computing

[Figure: PERFORMANCE + QoS plotted across platforms of growing scale: Personal Device, SMPs or SuperComputers, Local Cluster, Enterprise Cluster/Grid, Global Grid, Inter Planet Grid. Administrative barriers along the scale: Individual, Group, Department, Campus, State, National, Globe, Inter Planet, Universe.]

Page 7: Introduction to Grid Computing:

7

Cluster of Clusters

[Figure: three clusters (Cluster 1, Cluster 2, Cluster 3) connected by a LAN/WAN. Each cluster shows a Scheduler, a Master Daemon, an Execution Daemon, and Clients with Submit and Graphical Control.]

Page 8: Introduction to Grid Computing:

8

Towards global (Grid) computing

Metaphor: Applications draw computing power from a Computational Grid in the same way electrical devices draw power from an electrical grid.

http://www.sun.com/hpc/

Grid enables:
• Resource Sharing
• Selection
• Aggregation

Grid: An Internet Computing model for coordinated resource sharing

Page 9: Introduction to Grid Computing:

9

A Typical Grid Computing Environment

[Figure: a typical Grid environment showing an Application, Grid Resource Brokers, Grid Information Services, a database, and resources R1 to R6 through RN.]

Page 10: Introduction to Grid Computing:

10

What is a Grid? (There are several definitions.)

A type of parallel and distributed system that enables the sharing, selection, & aggregation of geographically distributed (wide-area) “autonomous” resources:
• Computers: PCs, workstations, clusters, supercomputers, laptops, notebooks, mobile devices, PDAs, etc.
• Software: e.g., ASPs renting expensive special-purpose applications on demand.
• Catalogued data and databases: e.g., transparent access to the human genome database.
• Special devices/instruments: e.g., a radio telescope (SETI@Home searching for life in the galaxy).
• People/collaborators.

depending on their availability, capability, cost, and user QoS requirements.
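The definition above can be made concrete with a small sketch of selection and aggregation. It is only an illustration: the `Resource` fields, the cheapest-first rule, and the sample pool are assumptions made for this example, not part of any Grid broker.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    available: bool        # availability
    mips: float            # capability (hypothetical unit)
    cost_per_hour: float   # cost


def select_resources(resources, min_mips, budget_per_hour):
    """Pick available resources that meet the capability floor,
    cheapest first, until the hourly budget (a stand-in for the
    user's QoS requirement) is used up."""
    candidates = sorted(
        (r for r in resources if r.available and r.mips >= min_mips),
        key=lambda r: r.cost_per_hour,
    )
    chosen, spent = [], 0.0
    for r in candidates:
        if spent + r.cost_per_hour <= budget_per_hour:
            chosen.append(r)
            spent += r.cost_per_hour
    return chosen


# Hypothetical pool of geographically distributed resources.
pool = [
    Resource("pc-lab", True, 500, 1.0),
    Resource("cluster-a", True, 8000, 6.0),
    Resource("super-b", False, 50000, 40.0),
    Resource("cluster-c", True, 6000, 5.0),
]
selected = select_resources(pool, min_mips=1000, budget_per_hour=12.0)
print([r.name for r in selected])
```

Here the unavailable supercomputer and the underpowered PC lab are filtered out (selection), and the two clusters are combined within the budget (aggregation).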

Page 11: Introduction to Grid Computing:

11

Various Types of Grid Services

• Computational Services (Computational Grid): CPU cycles. Examples: SETI@Home, NASA IPG, TeraGrid, I-Grid, …
• Data Services (Data Grid): data replication, management, secure access. Examples: LHC Grid, Napster.
• Application Services (ASP Grid): access to remote software/libraries and license management. Example: NetSolve.
• Interaction Services (Interaction Grid): eLearning, Virtual Tables, Group Communication (Access Grid), Gaming.
• Knowledge Services (Knowledge Grid): the way knowledge is acquired and managed, e.g., data mining.
• Utility Computing Services (Utility Grid): towards market-based Grid computing, leasing and delivering Grid services as ICT utilities.

Page 12: Introduction to Grid Computing:

12

Prominent Grid Drivers: Emerging e-Science and e-Business Apps

Next-generation experiments, simulations, sensors, satellites, and even people and businesses are creating a flood of data. They all involve numerous experts and resources from multiple organizations in synthesis, modeling, simulation, analysis, and interpretation.

• Life Sciences
• Digital Biology
• Finance: Portfolio analysis

~PBytes/sec

• Newswire & data mining: Natural language engineering
• Astronomy
• Internet & Ecommerce
• High Energy Physics
• Brain Activity Analysis
• Quantum Chemistry

Page 13: Introduction to Grid Computing:

13

E-Science Elements

• Distributed instruments
• Distributed computation
• Distributed data
• Remote Visualization
• Data & Compute Service
• E-Scientist: peers sharing ideas and collaborative interpretation of data/results

Page 14: Introduction to Grid Computing:

14

Molecular Docking for Drug Design

It involves screening millions of chemical compounds (molecules) in the Chemical Databases to identify those having potential to serve as drug candidates.

Protein

Molecules

Chemical Databases (legacy, in .MOL2 format)

[Collaboration with WEHI for Medical Science, Melbourne]
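Screening millions of independent molecules is a natural fit for splitting into many small jobs. The sketch below shows only that partitioning step. The function name and the batch size are hypothetical, and the docking computation itself is not modelled.

```python
def split_into_jobs(num_molecules, molecules_per_job):
    """Partition molecule indices [0, num_molecules) into half-open
    (start, stop) ranges, one range per Grid job."""
    jobs = []
    for start in range(0, num_molecules, molecules_per_job):
        stop = min(start + molecules_per_job, num_molecules)
        jobs.append((start, stop))
    return jobs


# Ten molecules in batches of four: the last job is shorter.
print(split_into_jobs(10, 4))
```

Each range can then be handed to a different resource, since no molecule's screening depends on another's.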

Page 15: Introduction to Grid Computing:

15

LHC – High Energy Physics Collaboration

(fundamental investigation on the origin of mass)

Page 16: Introduction to Grid Computing:

16

LHC Grid Computing Model

[Figure: tiered LHC computing model. Labels recovered from the diagram:]

• Online System, with a ~PBytes/sec label, and an Offline Processor Farm (~20 TIPS)
• Tier 0: CERN Computer Centre
• Tier 1: France Regional Centre, US Regional Centre, Italy Regional Centre, Asia Pacific Centre (~4 TIPS)
• Tier 2: Tier2 Centres (~1 TIPS each), Australian Centre (~1 TIPS)
• Tier 3: Institutes, including Melbourne (~0.25 TIPS), with a physics data cache
• Tier 4: Physicist desktop computers
• Link labels: ~100 MBytes/sec (twice), ~622 Mbits/sec (three times), ~10 to 100 Mbits/sec

Notes on the diagram:
• There is a “bunch crossing” every 25 nsecs.
• There are 100 “triggers” per second.
• Each triggered event is ~1 MByte in size.
• 1 TIPS is approximately 25,000 SpecInt95 equivalents.
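The figures quoted on this slide are enough for a rough consistency check of the ~100 MBytes/sec label. The sketch below uses only the three slide values. The per-day total assumes continuous running with decimal units, which is an assumption of this example, not a claim from the slide.

```python
# Values taken from the slide.
BUNCH_CROSSING_NS = 25       # one bunch crossing every 25 ns
TRIGGERS_PER_SECOND = 100    # triggered (kept) events per second
EVENT_SIZE_MBYTES = 1        # approximate size of one triggered event

# Crossings per second: 1e9 ns in a second divided by the spacing.
crossings_per_second = 1e9 / BUNCH_CROSSING_NS

# Fraction of crossings that are kept by the trigger.
kept_fraction = TRIGGERS_PER_SECOND / crossings_per_second

# Recorded data rate in MBytes/sec.
recorded_mbytes_per_second = TRIGGERS_PER_SECOND * EVENT_SIZE_MBYTES

# Assumption: continuous running, 1 TByte = 1e6 MBytes.
SECONDS_PER_DAY = 86_400
tbytes_per_day = recorded_mbytes_per_second * SECONDS_PER_DAY / 1e6

print(crossings_per_second, kept_fraction)
print(recorded_mbytes_per_second, tbytes_per_day)
```

The recorded rate of 100 MBytes/sec matches the link label in the diagram, and only a few crossings in a million survive the trigger.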

Page 17: Introduction to Grid Computing:

17

Enterprise Computing Applications

Traditional Model
• Email server, Web server, Database server, Apps server
• Upgrade to a new server to handle more users

Grid Based Model
• Horizontal integration of Email, Web, Data, and Apps servers
• Service Virtualization Layer & Load Balancing

Page 18: Introduction to Grid Computing:

18

Oracle 10g: Towards Enterprise Grid Model

Traditional (e.g., Oracle 9i)
• Tight/vertical integration of Storage, Database, Application Hosting Server, and Application Elements
• They reside on a single computing resource.
• Enhancing capability means a new investment:
  • Replace a machine with a new one, or upgrade it.
  • Can’t leverage existing resources.
  • Expensive approach.

Grid Based (e.g., Oracle 10g)
• Disintegration of Storage, Database, Application Hosting Server, and Application Elements
• They reside on different resources in a Grid environment.
• Enhancing capability means:
  • Leveraging existing resources
  • Dynamic provisioning
  • Cost-effective approach

Page 19: Introduction to Grid Computing:

19

Grid Challenges and Technologies

Security

Resource Allocation & Scheduling

Data locality

Network Management

System Management

Resource Discovery

Uniform Access

Application Construction

Page 20: Introduction to Grid Computing:

Realizing the Grid

The Grid Architecture


Page 22: Introduction to Grid Computing:

22

Virtual Organizations
• Distributed resources and people
• Linked by networks, crossing admin domains
• Sharing resources, common goals
• Dynamic
• Fault tolerant

[Figure: resources (R) grouped into two virtual organizations, VO-A and VO-B.]

Page 23: Introduction to Grid Computing:

23

Grid Realization Steps

1. The integration of individual s/w & h/w components into a combined networked resource (single system image cluster).
2. Low-level middleware to provide secure and uniform access to services provided by different resources.
3. User-level middleware to support application development and aggregation of distributed resources.
4. The construction of distributed applications.

Page 25: Introduction to Grid Computing:

25

Layered Grid Architecture

• APPLICATIONS
  • Applications and Portals: Scientific, Engineering, Collaboration, Prob. Solving Env., Web enabled Apps, …
• USER LEVEL MIDDLEWARE
  • Development Environments and Tools: Languages/Compilers, Libraries, Debuggers, Monitors, Web tools
  • Resource Management, Selection, and Aggregation (BROKERS)
• CORE MIDDLEWARE
  • Distributed Resources Coupling Services: Security, Information, Data, Process, Trading, QoS
• SECURITY LAYER
• FABRIC
  • Local Resource Managers: Operating Systems, Queuing Systems, Libraries & App Kernels, Internet Protocols
  • Networked Resources across Organizations: Computers, Networks, Storage Systems, Data Sources, Scientific Instruments

Page 26: Introduction to Grid Computing:

Major Grid Projects and Initiatives

Page 27: Introduction to Grid Computing:

27

Some Grid Projects & Initiatives

• Australia: Nimrod-G, Gridbus, GridSim, Virtual Lab, DISCWorld, GrangeNet, etc.
• Europe: UK eScience, EU Data Grid, Cactus, XtremeWeb, etc.
• India: I-Grid
• Japan: Ninf, DataFarm
• Korea: ... N*Grid
• Singapore: NGP
• USA: AppLeS, Globus, Legion, Sun Grid Engine, NASA IPG, Condor-G, Jxta, NetSolve, AccessGrid, and many more...
• Cycle Stealing & .com Initiatives: Distributed.net, SETI@Home, …, Entropia, UD, SCS, …
• Public Forums: Global Grid Forum, Australian Grid Forum, IEEE TFCC, CCGrid conference, P2P conference

http://www.gridcomputing.com

Page 28: Introduction to Grid Computing:

28

[Figure: Grid projects grouped by approach. Labels recovered: mix-and-match; Object-oriented; Internet/partial-P2P; Network enabled Solvers; Economic-based Utility / Service-Oriented Computing; Nimrod-G.]

Page 29: Introduction to Grid Computing:

29

Overview

1. Background

2. Globus Toolkit

3. Future directions

4. Related tools

Page 30: Introduction to Grid Computing:

30

The Role of the Globus Toolkit

• A collection of solutions to problems that come up frequently when building collaborative distributed applications
• Heterogeneity
  • A focus, in particular, on overcoming heterogeneity for application developers
• Standards
  • We capitalize on and encourage use of existing standards (IETF, W3C, OASIS, GGF)
  • GT also includes reference implementations of new/proposed standards in these organizations

Page 31: Introduction to Grid Computing:

31

Layers in the Grid

Page 33: Introduction to Grid Computing:

33

Without the Globus Toolkit

[Figure: an example Grid application drawn in four layers: users work with client applications; application services organize VOs & enable access to other services; collective services aggregate &/or virtualize resources; resources implement standard access & management interfaces. Components shown: Web Browser, Data Viewer Tool, Chat Tool, Simulation Tool, Telepresence Monitor, Web Portal, Registration Service, Credential Repository, Certificate authority, Data Catalog, Compute Server (twice), Database service (three times), Camera (twice), and components labelled A to E.]

Component counts by origin:
• Application Developer: 10
• Off the Shelf: 12
• Globus Toolkit: 0
• Grid Community: 0

Page 34: Introduction to Grid Computing:

34

With the Globus Toolkit

[Figure: the same example Grid application in four layers: users work with client applications; application services organize VOs & enable access to other services; collective services aggregate &/or virtualize resources; resources implement standard access & management interfaces. Components shown: Web Browser, Data Viewer Tool, CHEF Chat Teamlet, Simulation Tool, Telepresence Monitor, CHEF, MyProxy, Certificate Authority, Globus MCS/RLS, Globus Index Service, Globus GRAM (twice), Globus DAI (three times), Compute Server (twice), Database service (three times), Camera (twice).]

Component counts by origin:
• Application Developer: 2
• Off the Shelf: 9
• Globus Toolkit: 4
• Grid Community: 4

Page 35: Introduction to Grid Computing:

35

The Globus Toolkit:“Standard Plumbing” for the Grid

• Not turnkey solutions, but building blocks & tools for application developers & system integrators
  • Some components (e.g., file transfer) go farther than others (e.g., remote job submission) toward end-user relevance
• Easier to reuse than to reinvent
  • Compatibility with other Grid systems comes for free
• Today the majority of the GT public interfaces are usable by application developers and system integrators
  • Relatively few end-user interfaces
  • In general, not intended for direct use by end users (scientists, engineers, marketing specialists)

Page 37: Introduction to Grid Computing:

37

Bridging the Gap: Grid Infrastructure

• Service-oriented applications
  • Wrap applications as services
  • Compose applications into workflows
• Service-oriented Grid infrastructure
  • Provision physical resources to support application workloads

[Figure labels: Users, Workflows, Composition, Invocation, Appln Service (twice), Provisioning.]

Page 38: Introduction to Grid Computing:

38

Grid Infrastructure

• Distributed management
  • Of physical resources
  • Of software services
  • Of communities and their policies
• Unified treatment
  • Build on Web services framework
  • Use WS-RF, WS-Notification (or WS-Transfer/Man) to represent/access state
  • Common management abstractions & interfaces

Page 39: Introduction to Grid Computing:

39

Globus is Open Source Grid Infrastructure

• Implement key Web services standards
  • State, notification, security, …
• Software for Grid infrastructure
  • Service-enable new & existing resources
  • E.g., GRAM on computer, GridFTP on storage system, custom application services
  • Uniform abstractions & mechanisms
• Tools to build applications that exploit Grid infrastructure
  • Registries, security, data management, …
• Enabler of a rich tool & service ecosystem

Page 40: Introduction to Grid Computing:

40

An eBusiness Use of Globus: SAP Demonstration @ GlobusWorld

• 3 Globus-enabled applications:
  • CRM: Internet Pricing Configurator (IPC)
  • CRM: Workforce Management (WFM)
  • SCM: Advanced Planner & Optimizer (APO)
• Applications modified to:
  • Adjust to varying demand & resources
  • Use Globus to discover & provision resources

[Figure: SAP AG R/3 Internet Pricing & Configurator (IPC). Web browsers / batch processes (typically several thousand requests) send a price query request to the IPC Dispatcher; the request is delegated to IPC Servers; the response is a pricelist depending on time, discount, number of items, … Steps in the figure are numbered 1 to 3.]

Page 41: Introduction to Grid Computing:

41

Overview

• Background and Globus approach
• Globus Toolkit
• Future directions
• Related tools

Page 42: Introduction to Grid Computing:

42

The Globus Toolkit is a Collection of Components

• A set of loosely-coupled components, with:
  • Services and clients
  • Libraries
  • Development tools
• GT components are used to build Grid-based applications and services
  • GT can be viewed as a Grid SDK
• GT components can be categorized across two different dimensions
  • By broad domain area
  • By protocol support

Page 43: Introduction to Grid Computing:

43

GT Domain Areas

• Core runtime: infrastructure for building new services
• Security: apply uniform policy across distinct systems
• Execution management: provision, deploy, & manage services
• Data management: discover, transfer, & access large data
• Monitoring: discover & monitor dynamic services

Page 44: Introduction to Grid Computing:

44

GT Protocols

• Web service protocols
  • WSDL, SOAP
  • WS Addressing, WSRF, WSN
  • WS Security, SAML, XACML
  • WS-Interoperability profile
• Non Web service protocols
  • Standards-based, such as GridFTP
  • Custom

Page 45: Introduction to Grid Computing:

45

“Stateless” vs. “Stateful” Services

• Without state, how does a client:
  • Determine what happened (success/failure)?
  • Find out how many files completed?
  • Receive updates when interesting events arise?
  • Terminate a request?
• Few useful services are truly “stateless”, but WS interfaces alone do not provide built-in support for state

[Figure: a Client calls move (A to B) on a FileTransferService.]

Page 46: Introduction to Grid Computing:

46

FileTransferService (without WSRF)

• Developer reinvents the wheel for each new service
  • Custom management and identification of state: transferID
  • Custom operations to inspect state synchronously (whatHappen) and asynchronously (tellMeWhen)
  • Custom lifetime operation (cancel)

[Figure: the Client calls move (A to B) on the FileTransferService and receives a transferID; the service holds the state and exposes move, whatHappen, tellMeWhen, and cancel.]
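To illustrate the cost of the hand-rolled approach, here is a minimal in-memory sketch of such a service. The operation names mirror the slide (move, whatHappen, tellMeWhen, cancel). Everything else is an assumption of this example: the class layout, the status strings, and the `complete` helper that stands in for a real transfer finishing. It is not Globus code.

```python
import itertools


class FileTransferService:
    """Every piece of state handling below is specific to this service."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._transfers = {}   # transferID -> state
        self._listeners = {}   # transferID -> callbacks

    def move(self, source, destination):
        # Custom state identification: the service invents its own ID.
        transfer_id = next(self._ids)
        self._transfers[transfer_id] = {
            "source": source, "destination": destination, "status": "active",
        }
        self._listeners[transfer_id] = []
        return transfer_id

    def what_happen(self, transfer_id):
        # Custom synchronous inspection.
        return self._transfers[transfer_id]["status"]

    def tell_me_when(self, transfer_id, callback):
        # Custom asynchronous inspection.
        self._listeners[transfer_id].append(callback)

    def cancel(self, transfer_id):
        # Custom lifetime operation.
        self._set_status(transfer_id, "cancelled")

    def complete(self, transfer_id):
        # Hypothetical helper: simulates the transfer finishing.
        self._set_status(transfer_id, "done")

    def _set_status(self, transfer_id, status):
        self._transfers[transfer_id]["status"] = status
        for callback in self._listeners[transfer_id]:
            callback(transfer_id, status)


service = FileTransferService()
events = []
tid = service.move("A", "B")
service.tell_me_when(tid, lambda t, s: events.append((t, s)))
service.complete(tid)
print(service.what_happen(tid), events)
```

A client of a different service would have to learn a different set of names for the same four ideas, which is the duplication the slide points at.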

Page 47: Introduction to Grid Computing:

47

WSRF in a Nutshell

• Service
• State representation
  • Resource
  • Resource Property
• State identification
  • Endpoint Reference
• State Interfaces
  • GetRP, QueryRPs, GetMultipleRPs, SetRP
• Lifetime Interfaces
  • SetTerminationTime
  • ImmediateDestruction
• Notification Interfaces
  • Subscribe
  • Notify
• ServiceGroups

[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over a Resource with its RPs; Resources are addressed by EPRs.]

Page 48: Introduction to Grid Computing:

48

FileTransferService (w/ WSRF)

• Developer specifies a custom method to createResource and leaves the rest to WSRF standards:
  • State exposed as Resource + Resource Properties and identified by Endpoint Reference (EPR)
  • State inspected by standard interfaces (GetRP, QueryRPs)
  • Lifetime management by standard interfaces (Destroy)

[Figure: the Client calls createResource (A to B) on the FileTransferService and receives an EPR; the Transfer resource and its RPs are then reached through getRP, queryRPs, and destroy.]
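The division of labour described above can be sketched as a generic state store plus a thin service. This is an in-memory analogy, not the WSRF wire protocol or the GT4 API: the `ResourceHome` class here, the string used as an EPR, and the property names are all assumptions of this example.

```python
import uuid


class ResourceHome:
    """Generic store reusable by any service: nothing here knows
    about file transfers."""

    def __init__(self):
        self._resources = {}

    def create(self, properties):
        # Assumption: a URN string stands in for a real Endpoint Reference.
        epr = f"urn:example:resource:{uuid.uuid4()}"
        self._resources[epr] = dict(properties)
        return epr

    def get_rp(self, epr, name):
        return self._resources[epr][name]

    def query_rps(self, epr, predicate):
        return {k: v for k, v in self._resources[epr].items() if predicate(k, v)}

    def destroy(self, epr):
        del self._resources[epr]

    def exists(self, epr):
        return epr in self._resources


class FileTransferService:
    """The only custom operation is the one that creates the resource."""

    def __init__(self, home):
        self.home = home

    def create_resource(self, source, destination):
        return self.home.create({
            "source": source,
            "destination": destination,
            "status": "active",
            "filesCompleted": 0,
        })


home = ResourceHome()
service = FileTransferService(home)
epr = service.create_resource("A", "B")
print(home.get_rp(epr, "status"))
print(home.query_rps(epr, lambda name, value: isinstance(value, int)))
home.destroy(epr)
print(home.exists(epr))
```

The inspection and lifetime calls live in the shared store, so a second service would reuse them unchanged and add only its own create operation.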

Page 49: Introduction to Grid Computing:

Globus Toolkit version 4 (GT4)

[Figure: GT4 component map (www.globus.org). Columns: Security, Data Mgmt, Execution Mgmt, Info Services, Common Runtime. Rows separate Web Services Components from Non-WS Components; a legend marks components as Core, Contrib/Preview, or Deprecated. Components shown: Pre-WS Authentication Authorization, GridFTP, Pre-WS Grid Resource Alloc. & Mgmt, Pre-WS Monitoring & Discovery, C Common Libraries, Authentication Authorization, Reliable File Transfer, Data Access & Integration, Grid Resource Allocation & Management, Index, Java WS Core, Community Authorization, Replica Location, eXtensible IO (XIO), Credential Mgmt, Community Scheduling Framework, Delegation, Data Replication, Trigger, C WS Core, Python WS Core, WebMDS, Workspace Management, Grid Telecontrol Protocol.]

Page 50: Introduction to Grid Computing:

50

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component map (www.globus.org). Columns: Data Mgmt, Security, Common Runtime, Execution Mgmt, Info Services. Components shown: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location, Authentication Authorization, Community Authorization, Delegation, Credential Mgmt, Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol, Index, Trigger, WebMDS, Java Runtime, C Runtime, Python Runtime.]

Page 51: Introduction to Grid Computing:

51

GT4 Components

[Figure: GT4 components on the SERVER and CLIENT sides. Server groups: Java Services in Apache Axis plus GT Libraries and Handlers; Python hosting, GT Libraries; C Services using GT Libraries and Handlers. Components shown: Your Java Service, Your Python Service, Your C Service, RFT, GRAM, Delegation, Index, Trigger, Archiver, pyGlobus WS Core, C WS Core, RLS, Pre-WS MDS, CAS, Pre-WS GRAM, SimpleCA, MyProxy, OGSA-DAI, GTCP, GridFTP. Client side: Your Java Client, Your C Client, Your Python Client. Annotations: interoperable WS-I-compliant SOAP messaging; X.509 credentials = common authentication.]

Page 52: Introduction to Grid Computing:

52

Goals for GT4

• Usability, reliability, scalability, …
  • Web service components have quality equal or superior to pre-WS components
  • Documentation at acceptable quality level
• Consistency with latest standards (WS-*, WSRF, WS-N, etc.) and Apache platform
  • WS-I Basic Profile compliant
  • WS-I Basic Security Profile compliant
• New components, platforms, languages
  • And links to larger Globus ecosystem

Page 53: Introduction to Grid Computing:

53

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component map (www.globus.org), repeated as a section divider.]

Page 54: Introduction to Grid Computing:

54

GT4 Web Services Runtime

• Supports both GT (GRAM, RFT, Delegation, etc.) & user-developed services
• Redesign to enhance scalability, modularity, performance, usability
• Leverages existing WS standards
  • WS-I Basic Profile: WSDL, SOAP, etc.
  • WS-Security, WS-Addressing
• Adds support for emerging WS standards
  • WS-Resource Framework, WS-Notification
• Java, Python, & C hosting environments
  • Java is standard Apache

Page 55: Introduction to Grid Computing:

55

GT4 WS Core in a Nutshell

• Implementation of WSRF: Resources, EndpointReferences, ResourceProperties
• Operation Providers: pre-built implementations of WSRF operations
• Notification implementation: Topics, TopicSet, embedded Notification Consumer service
• Implementations of Resources (ReflectionResource, PersistentReflectionResource) and ResourceProperties (SimpleResourceProperty, ReflectionResourceProperty)

[Figure: a Service exposing GetRP, GetMultRPs, SetRP, QueryRPs, Subscribe, SetTermTime, and Destroy over a Resource with its RPs, addressed by EPRs.]

Page 56: Introduction to Grid Computing:

57

GT4 WS Core in a Nutshell

• Service Container: hosts multiple services in one container, in one JVM process
• More details: based on the AXIS service container, processes SOAP messages, ResourceContext extension

[Figure: a Service Container holding three services, each with a ResourceHome managing Resources and their RPs, addressed by EPRs.]

Page 57: Introduction to Grid Computing:

58

GT4 WS Core in a Nutshell

• Secure Communication: Transport, Message, Conversation (Transport demonstrates best performance)
• Configurable Security Policies: Policy Information Points (PIPs) and Policy Decision Points (PDPs), chained
• Example authorization PDPs: GridMap, SAML implementations, XACML policies

[Figure: the Service Container with its three services and ResourceHomes, plus a PIP and a PDP.]
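The idea of chaining PIPs and PDPs can be sketched in a few lines. This is a toy model only: the function names, the request dictionary, the sample distinguished names, and the combining rule (any deny wins, and at least one permit is required) are assumptions of this example and do not reproduce the GT4 authorization framework.

```python
PERMIT, DENY, NOT_APPLICABLE = "permit", "deny", "not_applicable"


def local_account_pip(gridmap):
    """PIP: collects an attribute (the local account) for later decisions."""
    def collect(request):
        request["local_user"] = gridmap.get(request["subject"])
    return collect


def gridmap_pdp(request):
    """PDP: permit only subjects that map to a local account."""
    return PERMIT if request.get("local_user") else DENY


def per_operation_pdp(policy):
    """PDP: restrict selected operations to named local users."""
    def decide(request):
        allowed = policy.get(request["operation"])
        if allowed is None:
            return NOT_APPLICABLE
        return PERMIT if request.get("local_user") in allowed else DENY
    return decide


def authorize(request, pips, pdps):
    request = dict(request)          # do not mutate the caller's request
    for pip in pips:                 # first gather attributes
        pip(request)
    decisions = [pdp(request) for pdp in pdps]   # then evaluate the chain
    return DENY not in decisions and PERMIT in decisions


# Hypothetical grid-map and per-operation policy.
gridmap = {"/O=Grid/CN=Alice": "alice", "/O=Grid/CN=Bob": "bob"}
pips = [local_account_pip(gridmap)]
pdps = [gridmap_pdp, per_operation_pdp({"destroy": {"alice"}})]

for subject, operation in [
    ("/O=Grid/CN=Alice", "destroy"),
    ("/O=Grid/CN=Bob", "destroy"),
    ("/O=Grid/CN=Bob", "getRP"),
    ("/O=Grid/CN=Mallory", "getRP"),
]:
    print(subject, operation, authorize({"subject": subject, "operation": operation}, pips, pdps))
```

Because each PDP sees the attributes gathered by the PIPs, a new policy source can be added to the chain without touching the services behind it.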

Page 58: Introduction to Grid Computing:

59

GT4 WS Core in a Nutshell

• WorkManager: “thread pool”, site independent “work” manager
• Apache Database Connection Pool library (JDBC “DataSource” implementation)
• JNDI Directory: manages internal, shared objects (ResourceHomes, WorkManager, Configuration objects, …)

[Figure: the Service Container with its three services, ResourceHomes, PIP, and PDP, plus WorkManager, DB Conn Pool, and JNDI Directory.]

Page 59: Introduction to Grid Computing:

60

GT4 WS Core in a Nutshell

• Deploy the Service Container “standalone” or within Apache Tomcat

[Figure: the Service Container (three services with ResourceHomes, PIP, PDP, WorkManager, DB Conn Pool, JNDI Directory) drawn inside Apache Tomcat.]

Page 60: Introduction to Grid Computing:

61

GT4 Web Services Runtime

[Figure: layered stack. User Applications sit above Custom Web Services, Custom WSRF Web Services, and GT4 WSRF Web Services; below these are WS-Addressing, WSRF, WS-Notification, then WSDL, SOAP, WS-Security, then the GT4 Container. Registry and Administration run alongside the stack.]

Page 61: Introduction to Grid Computing:

62

Modeling State in Web Services

[Figure: a service requestor (e.g., user application), a Registry, a Factory service, and Stateful Entities. Arrows labelled: Discovery, Create Stateful Entity, State Address, Resource allocation, Register Stateful Entity, State inspection, Lifetime mgmt, Notifications.]

• Interactions standardized using WSDL and SOAP
• Authentication & Authorization are applied to all requests

Page 62: Introduction to Grid Computing:

63

WSRF & WS-Notification

• Naming and bindings (basis for virtualization)
  • Every resource can be uniquely referenced, and has one or more associated services for interacting with it
• Lifecycle (basis for fault resilient state mgmt)
  • Resources created by services following factory pattern
  • Resources destroyed immediately or scheduled
• Information model (basis for monitoring, discovery)
  • Resource properties associated with resources
  • Operations for querying and setting this info
  • Asynchronous notification of changes to properties
• Service groups (basis for registries, collective svcs)
  • Group membership rules & membership management
• Base Fault type
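Two of the points above, scheduled destruction and asynchronous notification of property changes, can be sketched with plain objects. The class, the numeric clock, and the `sweep` function are hypothetical names chosen for this example. They mimic the ideas behind SetTerminationTime and Subscribe/Notify, not their actual message formats.

```python
class Resource:
    def __init__(self, properties, termination_time=None):
        self.properties = dict(properties)
        self.termination_time = termination_time   # None means no scheduled end
        self._subscribers = []

    def subscribe(self, callback):
        """Register interest in property changes."""
        self._subscribers.append(callback)

    def set_rp(self, name, value):
        """Set a property and notify subscribers if it changed."""
        old = self.properties.get(name)
        self.properties[name] = value
        if old != value:
            for notify in self._subscribers:
                notify(name, old, value)

    def set_termination_time(self, when):
        """Extend or shorten the scheduled lifetime."""
        self.termination_time = when

    def expired(self, now):
        return self.termination_time is not None and now >= self.termination_time


def sweep(resources, now):
    """Scheduled destruction: keep only resources that have not expired."""
    return {key: r for key, r in resources.items() if not r.expired(now)}


# Times are plain numbers (seconds on a hypothetical clock).
resources = {"epr-1": Resource({"status": "active"}, termination_time=100)}
changes = []
resources["epr-1"].subscribe(lambda name, old, new: changes.append((name, old, new)))
resources["epr-1"].set_rp("status", "done")
resources["epr-1"].set_termination_time(200)   # client keeps the resource alive

alive_at_150 = sweep(resources, now=150)
alive_at_250 = sweep(resources, now=250)
print(changes, list(alive_at_150), list(alive_at_250))
```

A client that disappears simply stops extending the termination time, and the resource is eventually reclaimed, which is why the slide calls the lifecycle a basis for fault-resilient state management.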

Page 63: Introduction to Grid Computing:

64

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component map (www.globus.org), repeated as a section divider.]

Page 64: Introduction to Grid Computing:

65

Globus Security

• Control access to shared services
  • Address autonomous management, e.g., different policy in different work-groups
• Support multi-user collaborations
  • Federate through mutually trusted services
  • Local policy authorities rule
• Allow users and application communities to set up dynamic trust domains
  • Personal/VO collection of resources working together based on trust of user/VO

Page 65: Introduction to Grid Computing:

66

Virtual Organization (VO) Concept

• VO for each application or workload
• Carve out and configure resources for a particular use and set of users

[Figure: Organization A and Organization B contribute people and resources to Virtual Community C. Labels in the organizations: Compute Server C1, Compute Server C2, Compute Server C3, File server F1 (disks A and B), Person A (Faculty), Person B (Staff), Person C (Student), Person D (Staff), Person E (Faculty), Person F (Faculty). Labels in Virtual Community C: Person A (Principal Investigator), Person B (Administrator), Person D (Researcher), Person E (Researcher), Compute Server C1', File server F1 (disk A).]

Page 66: Introduction to Grid Computing:

67

GT4 Security

[Figure: GT4 security flow among Users, a VO, Services (running on user’s behalf), and a Compute Center. Labels recovered: Rights, Rights’, Access; CAS or VOMS issuing SAML or X.509 ACs; SSL/WS-Security with Proxy Certificates; Authz Callout: SAML, XACML; local policy on VO identity or attribute authority; KCA; MyProxy.]

Page 67: Introduction to Grid Computing:

68

GT4 Security

• Public-key-based authentication
• Extensible authorization framework based on Web services standards
  • SAML-based authorization callout
    • As specified in GGF OGSA-Authz WG
  • Integrated policy decision engine
    • XACML policy language, per-operation policies, pluggable
• Credential management service
  • MyProxy (one-time password support)
• Community Authorization Service
• Standalone delegation service

Page 68: Introduction to Grid Computing:

69

GT4’s Use of Security Standards

[Table: only three cell annotations survive from this slide: “Supported, but slow”, “Supported, but insecure”, and “Fastest, so default”.]

Page 69: Introduction to Grid Computing:

70

GT-XACML Integration

• eXtensible Access Control Markup Language
  • OASIS standard, open source implementations
• XACML: sophisticated policy language
• Globus Toolkit ships with XACML runtime
  • Included in every client and server built on GT
  • Turned on through configuration
• … that can be called transparently from runtime and/or explicitly from application …
• … and we use the XACML “model” for our Authz Processing Framework

Page 70: Introduction to Grid Computing:

71

GT Authorization Framework

Page 71: Introduction to Grid Computing:

72

Other Security Services Include …

• MyProxy
  • Simplified credential management
  • Web portal integration
  • Single-sign-on support
• KCA & kx.509
  • Bridging into/out-of Kerberos domains
• SimpleCA
  • Online credential generation
• PERMIS
  • Authorization service callout

Page 72: Introduction to Grid Computing:

73

Example: Globus Security Architecture

Diagram of Globus security architecture.

Page 73: Introduction to Grid Computing:

74

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component map (www.globus.org), repeated as a section divider.]

Page 74: Introduction to Grid Computing:

75

GT4 Data Management

• Stage/move large data to/from nodes
  • GridFTP, Reliable File Transfer (RFT)
  • Alone, and integrated with GRAM
• Locate data of interest
  • Replica Location Service (RLS)
• Replicate data for performance/reliability
  • Distributed Replication Service (DRS)
• Provide access to diverse data sources
  • File systems, parallel file systems, hierarchical storage: GridFTP
  • Databases: OGSA DAI

Page 75: Introduction to Grid Computing:

76

GridFTP in GT4

• 100% Globus code
  • No licensing issues
  • Stable, extensible
• IPv6 Support
• XIO for different transports
• Striping: multi-Gb/sec wide area transport
  • 27 Gbit/s on 30 Gbit/s link
• Pluggable
  • Front-end: e.g., future WS control channel
  • Back-end: e.g., HPSS, cluster file systems
  • Transfer: e.g., UDP, NetBLT transport

[Figure: “Bandwidth Vs Striping”, disk-to-disk on TeraGrid. x-axis: Degree of Striping (0 to 70); y-axis: Bandwidth (Mbps), 0 to 20000; one series each for # Stream = 1, 2, 4, 8, 16, and 32.]

Page 76: Introduction to Grid Computing:

77

Reliable File Transfer: Third Party Transfer

• Fire-and-forget transfer
• Web services interface
• Many files & directories
• Integrated failure recovery
• Has transferred 900K files

[Figure: an RFT Client exchanges SOAP Messages with the RFT Service and optionally receives Notifications; two GridFTP Servers are shown, each with a Protocol Interpreter, Master DSI, Slave DSI, IPC Receiver, IPC Link, and Data Channels.]

Page 77: Introduction to Grid Computing:

78

Replica Location Service

• Identify location of files via logical to physical name map
• Distributed indexing of names, fault tolerant update protocols
• GT4 version scalable & stable
• Managing ~40 million files across ~10 sites

[Figure labels: Index, Index.]

Local DB | Update send (secs) | Bloom filter (secs) | Bloom filter (bits)
10K      | <1                 | 2                   | 1 M
1 M      | 2                  | 24                  | 10 M
5 M      | 7                  | 175                 | 50 M
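The table refers to Bloom filters, which let a site send the index a compact summary of its logical names instead of the full list. The sketch below is a generic Bloom filter, not the RLS implementation: the hash construction, filter size, site name, and file names are all assumptions of this example.

```python
import hashlib


class BloomFilter:
    """Fixed-size bit array; membership tests may give false positives
    but never false negatives."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, item):
        # Derive num_hashes bit positions by salting SHA-256 with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for p in self._positions(item):
            self.bits[p // 8] |= 1 << (p % 8)

    def __contains__(self, item):
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(item))


# Hypothetical local catalog at one site: logical name -> physical locations.
local_catalog = {
    "lfn://example/run1.dat": ["gsiftp://site-a.example.org/data/run1.dat"],
    "lfn://example/run2.dat": ["gsiftp://site-a.example.org/data/run2.dat"],
}

# The site summarizes its logical names and ships only the bit array.
summary = BloomFilter(num_bits=1024, num_hashes=3)
for logical_name in local_catalog:
    summary.add(logical_name)

# The index keeps one summary per site.
index = {"site-a": summary}


def candidate_sites(logical_name):
    """Sites whose summary may contain the name; each must still be
    asked for the actual physical locations."""
    return [site for site, bloom in index.items() if logical_name in bloom]


print(candidate_sites("lfn://example/run1.dat"))
```

The trade-off is that a lookup can occasionally point at a site that does not hold the file, so the local catalog remains the authority for the logical-to-physical mapping.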

Page 78: Introduction to Grid Computing:

79

Reliable Wide Area Data Replication

LIGO Gravitational Wave Observatory

• Replicating >1 Terabyte/day to 8 sites
• >30 million replicas so far
• MTBF = 1 month

[Figure: map of sites, including Cardiff, AEI/Golm, and Birmingham.]

www.globus.org/solutions

Page 79: Introduction to Grid Computing:

80

OGSA-DAI

Provide service-based access to structured data resources as part of Globus

Specify a selection of interfaces tailored to various styles of data access—starting with relational and XML

Page 80: Introduction to Grid Computing:

81

The OGSA-DAI Framework

[Figure: an application uses the client toolkit to call an OGSA-DAI service. The service's engine runs activities (SQLQuery, XPath, readFile, XSLT, GZip, GridFTP) against data resources: databases through JDBC (MySQL, DB2, SQL Server), XMLDB (Xindice), and files (SWISSPROT).]

Page 81: Introduction to Grid Computing:

82

Extensibility Example

[Figure: an OGSA-DAI service whose engine runs an SQLQuery activity over JDBC against MySQL, alongside a MultipleSQL GDS whose SQLQuery activity spans several JDBC/SQL resources.]

Page 82: Introduction to Grid Computing:

83

OGSA-DAI: A Framework for Building Applications

Supports data access, insert and update
  Relational: MySQL, Oracle, DB2, SQL Server, Postgres
  XML: Xindice, eXist
  Files: CSV, BinX, EMBL, OMIM, SWISSPROT, …

Supports data delivery
  SOAP over HTTP
  FTP; GridFTP
  E-mail
  Inter-service

Supports data transformation
  XSLT
  ZIP; GZIP

Supports security
  X.509 certificate based security

Page 83: Introduction to Grid Computing:

84

OGSA-DAI: Other Features

A framework for building data clients
  Client toolkit library for application developers

A framework for developing functionality
  Extend existing activities, or implement your own
  Mix and match activities to provide functionality you need
  Highly extensible

Customise our out-of-the-box product
  Provide your own services, client-side support, and data-related functionality
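The "mix and match activities" idea can be illustrated with a small pipeline in which each activity consumes the previous activity's output. This is only a sketch of the pattern: the function names are hypothetical, an in-memory SQLite database stands in for a relational data resource, and a CSV formatter stands in for a transformation step such as XSLT. None of it is the OGSA-DAI API.

```python
import gzip
import sqlite3


def sql_query(connection, statement):
    """Build a query activity bound to one data resource."""
    def activity(_ignored):
        return connection.execute(statement).fetchall()
    return activity


def to_csv(rows):
    """Transformation activity: rows -> CSV text."""
    return "\n".join(",".join(str(v) for v in row) for row in rows)


def gzip_text(text):
    """Transformation activity: text -> gzip-compressed bytes."""
    return gzip.compress(text.encode("utf-8"))


def run_pipeline(activities, data=None):
    """Engine: feed each activity's output into the next one."""
    for activity in activities:
        data = activity(data)
    return data


# Stand-in data resource with two rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE protein (id TEXT, length INTEGER)")
conn.executemany("INSERT INTO protein VALUES (?, ?)", [("P1", 120), ("P2", 348)])

payload = run_pipeline([
    sql_query(conn, "SELECT id, length FROM protein ORDER BY id"),
    to_csv,
    gzip_text,
])
```

Adding a new capability then means writing one more function with the same shape and inserting it into the list, which is the extensibility point the slide describes.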

Page 84: Introduction to Grid Computing:

85

Globus Toolkit: Open Source Grid Infrastructure

Globus Toolkit v4 (www.globus.org)

Security: Authentication & Authorization, Community Authorization, Delegation, Credential Mgmt
Data Mgmt: GridFTP, Reliable File Transfer, Data Access & Integration, Data Replication, Replica Location
Execution Mgmt: Grid Resource Allocation & Management, Community Scheduling Framework, Workspace Management, Grid Telecontrol Protocol
Info Services: Index, Trigger, WebMDS
Common Runtime: Java Runtime, C Runtime, Python Runtime

Page 85: Introduction to Grid Computing:

86

Execution Management (GRAM)

Common WS interface to schedulers
  Unix, Condor, LSF, PBS, SGE, …

More generally: interface for process execution management
  Lay down execution environment
  Stage data
  Monitor & manage lifecycle
  Kill it, clean up

A basis for application-driven provisioning
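The lifecycle steps listed above (stage data, run, monitor, kill, clean up) can be sketched as a small state machine. The state names and the `ManagedJob` class are assumptions made for this illustration; they are not GRAM's actual job states or interfaces.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "pending"
    STAGED_IN = "staged_in"
    ACTIVE = "active"
    DONE = "done"
    FAILED = "failed"
    CLEANED_UP = "cleaned_up"


# Which transitions the (hypothetical) manager accepts from each state.
ALLOWED = {
    JobState.PENDING: {JobState.STAGED_IN, JobState.FAILED},
    JobState.STAGED_IN: {JobState.ACTIVE, JobState.FAILED},
    JobState.ACTIVE: {JobState.DONE, JobState.FAILED},
    JobState.DONE: {JobState.CLEANED_UP},
    JobState.FAILED: {JobState.CLEANED_UP},
    JobState.CLEANED_UP: set(),
}


class ManagedJob:
    def __init__(self, executable):
        self.executable = executable
        self.state = JobState.PENDING
        self.history = [self.state]  # lets a client monitor progress

    def advance(self, new_state):
        if new_state not in ALLOWED[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)

    def kill(self):
        """Kill it, clean up: fail an unfinished job, then release resources."""
        if self.state not in (JobState.DONE, JobState.FAILED, JobState.CLEANED_UP):
            self.advance(JobState.FAILED)
        if self.state is not JobState.CLEANED_UP:
            self.advance(JobState.CLEANED_UP)


job = ManagedJob("/bin/hostname")
job.advance(JobState.STAGED_IN)
job.advance(JobState.ACTIVE)
job.kill()
```

Keeping the allowed transitions in one table makes it easy to put different local schedulers behind the same interface, since each adapter only has to report state changes.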

Page 86: Introduction to Grid Computing:

87

GT4 WS GRAM

2nd-generation WS implementation optimized for performance, flexibility, stability, scalability

Streamlined critical path
  Use only what you need

Flexible credential management
  Credential cache & delegation service

GridFTP & RFT used for data operations
  Data staging & streaming output
  Eliminates redundant GASS code

Page 87: Introduction to Grid Computing:

88

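GT4 WS GRAM Architecture

[Figure: a client invokes job functions on the GRAM services in the GT4 Java container (service host(s) and compute element(s)) and delegates credentials to the Delegation service. The GRAM services reach the local scheduler and user job on the compute element through sudo and the GRAM adapter (local job control), and receive job events from the SEG. Transfer requests go to RFT File Transfer, which uses FTP control and FTP data connections between GridFTP on the compute element and GridFTP on the remote storage element(s).]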

Page 88: Introduction to Grid Computing:

89

GT4 WS GRAM Architecture

[Figure: GT4 WS GRAM architecture diagram, repeated.]

Delegated credential can be: made available to the application

Page 89: Introduction to Grid Computing:

90

GT4 WS GRAM Architecture

[Figure: GT4 WS GRAM architecture diagram, repeated.]

Delegated credential can be: used to authenticate with RFT

Page 90: Introduction to Grid Computing:

91

GT4 WS GRAM Architecture

[Figure: GT4 WS GRAM architecture diagram, repeated.]

Delegated credential can be: used to authenticate with GridFTP

Page 91: Introduction to Grid Computing:

92

WS GRAM Performance

Time to submit a basic GRAM job
  Pre-WS GRAM: < 1 second
  WS GRAM: 2 seconds

Concurrent jobs
  Pre-WS GRAM: 300 jobs
  WS GRAM: 32,000 jobs

Various studies are underway to test latest software

Page 92: Introduction to Grid Computing:

93

GT4 WS GRAM Performance

Rows: sustained job load per client thread (N). Columns: number of client threads (M). A "-" marks an empty cell.

N \ M     1    2    4    8   16   32   64  128
1         7   15   29   57   80   69   69   70
2        15   29   58   79   74   70   70   64
4        29   58   78   77   68   69   52   69
8        59   77   77   72   65   27    -   69
16       77   77   75   64   27    -    -   50
32       76   75   68   64   67    -    -    -
64       75   73   70   66   65    -    -    -
128      80   72   64   63   71    -    -    -

All numbers are simple jobs/minute, no delegation or staging

Page 93: Introduction to Grid Computing:

94

Workspace Service: The Hosted Activity

[Figure: a client, subject to policy, uses the resource provider's interface to allocate/provision and configure an environment, and to initiate, monitor, and control an activity.]

Page 94: Introduction to Grid Computing:

95

Activities Can Be Nested

[Figure labels: Policy, Client, Environment, Interface, Resource provider, Client, Client]

Page 95: Introduction to Grid Computing:

96

For Example …

Physical machine: procure hardware
Hypervisor/OS: deploy hypervisor/OS
VM, VM: deploy virtual machine
JVM: deploy container
JVM: deploy service

Provisioning, management, and monitoring at all levels

Page 96: Introduction to Grid Computing:

97

Dynamic Service Deployment

Community A … Community Z

• Community scheduling logic
• Data distribution
• Community management
• Science services
• ...

Requirements:
• Community control
• Persistence
• Resource guarantees
• Non-interference

Page 97: Introduction to Grid Computing:

98

Virtual Machine Costs

GRAM job

GRAM job in paused VM

Job in booted VM

Page 98: Introduction to Grid Computing:

99

Virtual OSG Clusters

OSG cluster

Xen hypervisors

TeraGrid cluster

OSG

Page 99: Introduction to Grid Computing:

100

Globus Toolkit: Open Source Grid Infrastructure

[Figure: Globus Toolkit v4 component overview (Security, Data Mgmt, Execution Mgmt, Info Services, Common Runtime), www.globus.org]

Page 100: Introduction to Grid Computing:

101

Monitoring and Discovery

“Every service should be monitorable and discoverable using common mechanisms”
  WSRF/WSN provides those mechanisms

A common aggregator framework for collecting information from services, thus:
  MDS-Index: XPath queries, with caching
  MDS-Trigger: perform action on condition
  (MDS-Archiver: XPath on historical data)

Deep integration with Globus containers & services: every GT4 service is discoverable
  GRAM, RFT, GridFTP, CAS, …
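A toy aggregator shows how an index with cached path queries and a trigger that acts on a condition fit together. The `IndexSketch` class, the element names (`Entry`, `Host`, `FreeCpus`), and the host names are assumptions for illustration; Python's `xml.etree.ElementTree` supports only a subset of XPath, and none of this is the MDS4 API.

```python
import xml.etree.ElementTree as ET


class IndexSketch:
    """Aggregates service entries; answers path queries; fires triggers."""

    def __init__(self):
        self.root = ET.Element("Index")
        self._cache = {}
        self._triggers = []

    def register(self, service, **properties):
        entry = ET.SubElement(self.root, "Entry", service=service)
        for name, value in properties.items():
            ET.SubElement(entry, name).text = str(value)
        self._cache.clear()  # new data invalidates cached results
        for path, condition, action in self._triggers:
            matches = self.query(path)
            if condition(matches):
                action(matches)

    def query(self, path):
        # Cache results so repeated queries do not rescan the tree.
        if path not in self._cache:
            self._cache[path] = self.root.findall(path)
        return self._cache[path]

    def add_trigger(self, path, condition, action):
        self._triggers.append((path, condition, action))


alerts = []
index = IndexSketch()
index.add_trigger(
    "./Entry[@service='GRAM']/FreeCpus",
    lambda nodes: any(int(n.text) == 0 for n in nodes),
    lambda nodes: alerts.append("GRAM has no free CPUs"),
)
index.register("GridFTP", Host="gridftp.example.org")
index.register("GRAM", Host="gram.example.org", FreeCpus=0)

gram_hosts = [n.text for n in index.query("./Entry[@service='GRAM']/Host")]
```

The cache is cleared on every registration here for simplicity; a real index would need a more careful invalidation policy.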

Page 101: Introduction to Grid Computing:

102

GT4 Monitoring & Discovery

[Figure: GT4 containers host services such as GRAM, RFT, and a user service, each registered automatically with the MDS-Index in its container. Indexes are linked through WS-ServiceGroup registration and WSRF/WSN access. GridFTP, a non-WSRF entity, is reached through an adapter using custom protocols. Clients (e.g., WebMDS) query the indexes.]

Page 102: Introduction to Grid Computing:

103

Index Server Performance

As the MDS4 Index grows, query rate and response time both slow, although sublinearly

Response time slows due to increasing data transfer size
  Full Index is being returned
  Response is re-built for every query

Real question – how much over simple WS-N performance?

Page 103: Introduction to Grid Computing:

104

Information Providers

GT4 information providers collect information from some system and make it accessible as WSRF resource properties

Growing number of information providers
  Ganglia, CluMon, Nagios
  SGE, LSF, OpenPBS, PBSPro, Torque

Many opportunities to build additional ones
  E.g., network monitoring, storage systems, various sensors

Page 104: Introduction to Grid Computing:

105

GT4 Summary

[Figure: GT4 server and client sides.
SERVER: Java services in Apache Axis plus GT libraries and handlers (your Java services, RFT, GRAM, Delegation, Index, Trigger, Archiver); Python hosting with GT libraries (your Python service, pyGlobus WS Core); C services using GT libraries and handlers (your C service, C WS Core). Other components shown: RLS, pre-WS MDS, CAS, pre-WS GRAM, SimpleCA, MyProxy, OGSA-DAI, GTCP, GridFTP.
CLIENT: your Java, C, and Python clients.
Interoperable WS-I-compliant SOAP messaging; X.509 credentials = common authentication.]

Page 105: Introduction to Grid Computing:

GT4 Documentation is Much Improved!

Page 106: Introduction to Grid Computing:

107

Overview

1. Background

2. Globus Toolkit

3. Future directions

4. Related tools

Page 107: Introduction to Grid Computing:

108

The Future: Content

We now have a solid and extremely powerful Web services base

Next, we will build an expanded open source Grid infrastructure
  Virtualization
  New services for provisioning, data management, security, VO management
  End-user tools for application development
  Etc., etc.

And of course responding to user requests for other short-term needs

Page 109: Introduction to Grid Computing:

110

Short-Term Priorities: Security

Improve GSI error reporting & diagnostics
Secure password, one-time password, Kerberos support for initial log on
Trust roots, use of GridLogon
Identity/attribute assertions in GT auth. callouts (e.g., Shib, PERMIS, VOMS, SAML)
Extend CAS admin & policy support
Security logging with management control for audit purposes

Page 110: Introduction to Grid Computing:

111

Short-Term Priorities: Data Management

Space & bandwidth management in GridFTP

Concurrency in globus-url-copy
Priorities in RFT
Data replication service
Enhance policy support in data services
Physical file name creation service
Scalable & distributed metadata manager

Page 111: Introduction to Grid Computing:

112

Short-Term Priorities: Execution Management

Implement GGF JSDL once finalized
Advance reservation support
Policy-driven restart of “persistent” jobs
Improved information collection for jobs
Improved management of job collections
Credential refresh
Development of workspace service
Integration of virtual machines (Xen, VMware) and associated services
Windows port of WS GRAM

Page 112: Introduction to Grid Computing:

113

Short-Term Priorities: Information Services

Many more information sources, including gateways to other systems

Automated configuration of monitoring
Specialized monitoring displays
Performance optimization of registry
Archiver service
Helper tools to streamline integration of new information sources

Page 113: Introduction to Grid Computing:

114

Short-Term Priorities: WS Core

Streamlined container configuration
Remote management interface
Dynamic service deployment
Service isolation: multiple service instances
WS-Notification, subscription performance
Full functionality in C WS Core
Optimized WS-ServiceGroup support
WS-SecureConversation support

Page 114: Introduction to Grid Computing:

115

Overview

1. Background

2. Globus Toolkit

3. Future directions

4. Related tools

Page 115: Introduction to Grid Computing:

116

The Globus Ecosystem

Globus components address core issues relating to resource access, monitoring, discovery, security, data movement, etc.
  GT4 being the latest version

A larger Globus ecosystem of open source and proprietary components provides complementary functionality
  A growing list of components

These components can be combined to produce solutions to Grid problems
  We’re building a list of such solutions

Page 116: Introduction to Grid Computing:

117

Many Tools Build on, or Can Contribute to, GT4-Based Grids

Condor-G, DAGman
MPICH-G2
GRMS
Nimrod-G
Ninf-G
Open Grid Computing Env.
Commodity Grid Toolkit
GriPhyN Virtual Data System
Virtual Data Toolkit
GridXpert Synergy
Platform Globus Toolkit
VOMS
PERMIS
GT4IDE
Sun Grid Engine
PBS scheduler
LSF scheduler
GridBus
TeraGrid CTSS
NEES
IBM Grid Toolbox
…

Page 117: Introduction to Grid Computing:

118

Documenting The Grid Ecosystem

The Grid Ecosystem: Software Components for Grid Systems and Applications

www.grids-center.org

Page 118: Introduction to Grid Computing:

119

Example Solutions

Portal-based User Reg. System (PURSE)
VO Management Registration Service
Service Monitoring Service
TeraGrid TGCP Tool
Lightweight Data Replicator
GriPhyN Virtual Data System

Page 119: Introduction to Grid Computing:

120

Condor-G

The Condor Project @ U Wisconsin Madison develops software for high-throughput computing on collections of distributed compute resources

Condor-G is an interface to GRAM created by the Condor team that allows users to submit jobs to GRAM servers

Page 120: Introduction to Grid Computing:

121

GridShib

Allows the use of Shibboleth-transported attributes for authorization in GT4 deployments
  And, more generally, SAML support

2 year project started December 1, 2004

Participants
  Von Welch, UIUC/NCSA (PI)
  Kate Keahey, UChicago/Argonne (PI)
  Frank Siebenlist, Argonne
  Tom Barton, UChicago

Beta software released September 16, 2005

Page 121: Introduction to Grid Computing:

122

Handle System

The Handle System from CNRI (http://www.handle.net) is a general-purpose global name service enabling secure name resolution over the internet

The Handle System-GT Integration Project leverages the Handle System for identifier and resolution services through tight integration with GT4’s Web services protocols

Page 122: Introduction to Grid Computing:

123

MPICH-G2

MPICH-G2, developed at Northern Illinois University and Argonne National Lab, is a grid-enabled implementation of the MPI v1.1 standard

MPICH-G2 is implemented using the pre-WS GRAM component in GT4; integration with GT4 WS GRAM is expected in the near future

Page 123: Introduction to Grid Computing:

124

Nimrod/G

Nimrod is a specialized parametric modeling system from Monash University

Nimrod/G uses a simple declarative parametric modeling language to express parameter sweep experiments. Based on GT4 WS services, Nimrod/G enables the formulation, execution and monitoring of multiple individual parametric experiments
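A parameter sweep of the kind described here can be sketched by expanding every combination of parameter values into one job per point. The helper names, the parameter names, and the `./model` command line are hypothetical, and this is plain Python, not Nimrod/G's own declarative language.

```python
from itertools import product


def expand_sweep(parameters):
    """Return one dict per point in the cross product of parameter values."""
    names = list(parameters)
    value_lists = (parameters[name] for name in names)
    return [dict(zip(names, values)) for values in product(*value_lists)]


def to_command(template, point):
    """Fill a command-line template with one parameter point."""
    return template.format(**point)


# Two pressures x three angles = six independent experiments.
sweep = {"pressure": [1.0, 2.0], "angle": [0, 45, 90]}
jobs = [
    to_command("./model --pressure {pressure} --angle {angle}", point)
    for point in expand_sweep(sweep)
]
```

Because the points are independent of one another, each command can be submitted as a separate Grid job and monitored individually.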

Page 124: Introduction to Grid Computing:

125

Ninf-G4

Ninf-G4, from AIST, is a reference implementation of the GGF standard GridRPC API

Ninf-G4 provides higher-level programming APIs for the development and execution of parallel applications on the Grid

Page 125: Introduction to Grid Computing:

126

PERMIS

PERMIS is an EU-funded Privilege Management service that implements Role-Based Access Control

Thanks to the work of the UK Grid Engineering Task Force, services running in a Java WS Core container can use PERMIS via GT4’s SAML authorization callouts
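Role-Based Access Control can be sketched as two lookups: subject to roles, then roles to permitted (resource, action) pairs. The role names, subject names, and the `is_authorized` function are made up for this example; PERMIS policies and GT4's SAML callouts are considerably richer than this.

```python
# Hypothetical policy: which (resource, action) pairs each role grants.
ROLE_PERMISSIONS = {
    "analyst": {("dataset", "read")},
    "operator": {("dataset", "read"), ("job", "submit")},
    "admin": {
        ("dataset", "read"), ("dataset", "write"),
        ("job", "submit"), ("job", "cancel"),
    },
}

# Hypothetical role assignments, keyed by certificate subject name.
USER_ROLES = {
    "/O=Grid/CN=Alice": {"analyst"},
    "/O=Grid/CN=Bob": {"operator", "analyst"},
}


def is_authorized(subject, resource, action):
    """Permit the request if any of the subject's roles grants it."""
    roles = USER_ROLES.get(subject, set())
    return any(
        (resource, action) in ROLE_PERMISSIONS.get(role, set())
        for role in roles
    )


alice_can_submit = is_authorized("/O=Grid/CN=Alice", "job", "submit")
bob_can_submit = is_authorized("/O=Grid/CN=Bob", "job", "submit")
```

Permissions attach to roles rather than to individuals, so changing what an operator may do is one policy edit instead of one edit per user.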

Page 126: Introduction to Grid Computing:

127

SRB

SRB is a package from SDSC providing a uniform interface for connecting to network-based heterogeneous data resources

GT4’s GridFTP includes an interface to SRB data sources, and vice versa

Page 127: Introduction to Grid Computing:

128

Sun Grid Engine

Sun Grid Engine is an open source distributed resource management system from Sun Microsystems

In a collaboration between the London e-Science Centre, Gridwise and MCNC, the Sun Grid Engine has been integrated with GT4

Page 128: Introduction to Grid Computing:

129

Thank you