Nov. 9, 2002 Chan-Hyun Youn Information and Communications University Grid Middleware Service


Page 1: Grid Middleware Service

Nov. 9, 2002

Chan-Hyun Youn

Information and Communications University

Page 2: Contents

Int’l DataGrid Workshop

• Grid and Middleware Services

• Architectural Model for Resource Management

– Hierarchical Resource Management

– Abstract Owner

– Market Model

• Scheduling Algorithms in Economy Grid

• Example of Application level Scheduler

• Concluding Remarks

Page 3: Architecture of a Grid

[Figure labels]

• Applications: Simulations, Data Analysis, etc.

• Discipline Specific Portals and Scientific Workflow Management Systems

• Toolkits: Visualization, data publish/subscribe, etc.

• Grid Common Services: Standardized Services and Resources Interfaces

– Grid Information Service, Uniform Resource Access, Brokering, Global Queuing, Global Event Services, Co-Scheduling, Data Cataloguing, Uniform Data Access, Collaboration and Remote Instrument Services, Network Cache, Communication Services, Authentication, Authorization, Security Services, Auditing, Fault Management, Monitoring

– A legend in the figure marks some of these as Globus services

• Resources: Condor pools, network caches, tertiary storage, national user facilities, clusters, national supercomputer facility, high-speed networks and communications services

Source: IPG (Johnston)

Page 4: Heterogeneous Computing: IPG Milestone Completed 10/2000

Two problem solving environments use IPG services for uniform access to heterogeneous resources.

1) Condor Workstation Pool manager

• Molecular design application for nanotechnology devices and materials

• Uses 0.5 million otherwise idle CPU hours/year scavenged from 60-100 Sun and SGI workstations, a subset of the NAS Condor pool

• The Condor system is an IPG middleware service

2) Parameter Study Manager

• ILab aerospace design parameter study manager uses IPG to access distributed computing and data resources

[Figure labels: study concept; study object; results; IPG Grid Common Services: Standardized services and uniform resource access; IPG managed compute and data management resources]

Page 5: Online Instrumentation: Real-time Experiment Interaction

[Figure labels: Unitary Plan Wind Tunnel; real-time collection; real-time experiment control; multi-source data analysis; computer simulations; archival storage; desktop & VR clients with shared controls]

Page 6: Grid from Services View

• Applications: Chemistry, Biology, Cosmology, High Energy Physics, Environment

• Application Toolkits: Distributed Computing Toolkit, Data-Intensive Applications Toolkit, Collaborative Applications Toolkit, Remote Visualization Applications Toolkit, Problem Solving Applications Toolkit, Remote Instrumentation Applications Toolkit

• Grid Services (Middleware): resource-independent and application-independent services. E.g., authentication, authorization, resource location, resource allocation, events, accounting, remote data access, information, policy, fault detection

• Grid Fabric (Resources): resource-specific implementations of basic services. E.g., transport protocols, name servers, differentiated services, CPU schedulers, public key infrastructure, site accounting, directory service, OS bypass

Page 7: Middleware

• Layered collection of middleware services that provide to applications uniform views of distributed resource components and the mechanisms for assembling them into systems

– Grid Workload Management, Data Management, Monitoring services

– Management of the Local Computing Fabric

– Mass Storage

• Services extend both “up and down” through the various layers of the computing and communications infrastructure

Page 8: Functions in Middleware

• Workload management

– The workload is chaotic: unpredictable job arrival rates and data access patterns

– The goal is maximising the global system throughput (events processed per second)

• Data management

– Management of petabyte-scale data volumes, in an environment with limited network bandwidth and heavy use of mass storage (tape)

– Caching, replication, synchronisation, object database model

• Application monitoring

– Tens of thousands of components, thousands of jobs and individual users

– End-user: tracking of the progress of jobs and aggregates of jobs

– Understanding application and grid level performance

– Administrator: understanding which global-level applications were affected by failures, and whether and how to recover

Page 9: Middleware (in Local Fabric)

• Effective local site management of giant computing fabrics

– Automated installation, configuration management, system maintenance

– Automated monitoring and error recovery: resilience, self-healing

– Performance monitoring

– Characterisation, mapping, management of local Grid resources

• Mass storage management

– Multi-PetaByte data storage

– “Real-time” data recording requirement

– Active tape layer, 1,000s of users

– Uniform mass storage interface

– Exchange of data and meta-data between mass storage systems

Page 10: Technical Approach in Layered Network

Applications need uniform views of resources, and middleware must deal with the fact that most “real” resources are “locally” owned.

[Figure labels]

• Applications

• Global Middleware Services: Resource Scheduling, Network Cache, QoS Broker, Monitoring & Management, Access Control

• Networks: vBNS, IDREN, Campus, Internet 2, GigaPop, ESNet, Internet

• Sites: LBNL, Ames, ANL, NCAR

• Local Services: Cache, Tertiary (mass) storage, Supercomputer, Wind Tunnel, Cluster

Source: Grid’98 Workshop (Johnston)

Page 11: Operation Model (1)

• Some services are provided in the middleware

• Most services drill down to institutional resources

• Some services drill down to the various network layers

• Middleware must actually reach well !

[Figure labels]

• Applications

• Global Middleware Services: Resource Characteristics, Resource Scheduling, Monitoring & Management, Data Catalogues, Network Cache, QoS Broker, Access Control

• Networks: vBNS, IDREN, Campus, Internet 2, GigaPop, ESNet, Internet

• Sites: LBNL, Ames, ANL, NCAR

• Local Services: Cache, Tertiary (mass) storage, Supercomputer, Wind Tunnel, Cluster

Source: Grid’98 Workshop (Johnston)

Page 12: Operation Model (2)

• Some services are provided in the middleware

• The middleware layer and infrastructure provide transparent access for applications

[Figure labels]

• Applications

• Global Middleware Services: Resource Characteristics, Resource Scheduling, Monitoring & Management, Data Catalogues, Network Cache, QoS Broker, Access Control

• Proxy management for multi-site resources: Configure, Analyzer, Monitor, Re-configure, Cache

• Networks: vBNS, IDREN, Campus, Internet 2, GigaPop, ESNet, Internet

• Sites: LBNL, Ames, ANL, NCAR

• Local Services: Cache, Tertiary (mass) storage, Supercomputer, Wind Tunnel, Cluster

Source: Grid’98 Workshop (Johnston)

Page 13: Middleware Approach

• Toolkit and services addressing key technical problems

– Modular “bag of services” model

– Not a vertically integrated solution

– Can be applied to many application domains

• Inter-domain issues, rather than clustering

– Integration of intra-domain solutions

Page 14: Globus

Page 15: Globus Approach

• A software toolkit addressing key technical problems

– Offer a modular bag of technologies

– Enable incremental development of grid-enabled tools and applications

– Define and standardize grid protocols and APIs

• Focus is on inter-domain issues, not clustering

– Supports collaborative resource use spanning multiple organizations

– Integrates cleanly with intra-domain services

– Creates a collective service layer

Page 16: Globus Approach

• Focus on architecture issues

– Provide implementations of grid protocols and APIs as basic infrastructure

– Use to construct high-level, domain-specific solutions

• Design principles

– Keep participation cost low

– Enable local control

– Support for adaptation

[Figure labels: Applications; Diverse global services; Core Globus services; Local OS]

Page 17: Four Key Protocols

• The Globus Toolkit centers around four key protocols

– Connectivity layer:

• Security: Grid Security Infrastructure (GSI)

– Resource layer:

• Resource Management: Grid Resource Allocation Management (GRAM)

• Information Services: Grid Resource Information Protocol (GRIP)

• Data Transfer: Grid File Transfer Protocol (GridFTP)

Page 18: Grid Security Infrastructure in Action

[Figure labels]

• User: single sign-on via “grid-id” & generation of proxy credential; or retrieval of proxy credential from online repository

• User Proxy: holds the proxy credential

• Sites: Site A (Kerberos), Site B (Unix), Site C (Kerberos)

• Resources: Computer (two instances), Storage system

• GSI-enabled GRAM server (two instances): Authorize, Map to local id, Create process, Generate credentials

• GSI-enabled FTP server: Authorize, Map to local id, Access file

• Process (two instances): each holds a restricted proxy and a local id; one also holds a Kerberos ticket

• Interactions: Remote process creation requests*, Remote file access request*, Communication*

* With mutual authentication
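The “Map to local id” step in the figure can be pictured as a table lookup from an authenticated certificate subject to a local account. The following is a minimal sketch under stated assumptions: it uses a mapping file in the style of a Globus grid-mapfile (a quoted subject name followed by a local account name), and the function names `parse_gridmap` and `map_to_local_id`, as well as the subject names, are hypothetical and not taken from the slides.

```python
import shlex


def parse_gridmap(text):
    """Parse lines of the form: "<certificate subject>" <local account>."""
    mapping = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)  # honours the quotes around the subject
        if len(parts) != 2:
            raise ValueError(f"malformed entry: {line!r}")
        subject, local_id = parts
        mapping[subject] = local_id
    return mapping


def map_to_local_id(mapping, subject):
    """Return the local account for an authenticated subject, or None to deny."""
    return mapping.get(subject)


# Illustrative entries only (hypothetical subjects and accounts).
EXAMPLE = '''
# subject name, then local account
"/O=Grid/OU=example.org/CN=Jane Doe" jdoe
"/O=Grid/OU=example.org/CN=John Roe" jroe
'''

if __name__ == "__main__":
    table = parse_gridmap(EXAMPLE)
    print(map_to_local_id(table, "/O=Grid/OU=example.org/CN=Jane Doe"))
```

Authorization and credential generation, which the figure lists alongside the mapping, are outside the scope of this sketch.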

Page 19: Resource Management

• The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity

• Resource Specification Language (RSL) is used to communicate requirements

• A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services

– Integrated with Condor, MPICH-G2, …
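To make the role of RSL concrete, here is a minimal sketch that formats a flat set of requirements as an RSL-style conjunction of the form `&(attribute=value)(attribute=value)`. The helper name `to_rsl` is hypothetical, the attribute names in the example are illustrative, and the quoting rule is a simplifying assumption rather than the full RSL grammar.

```python
def to_rsl(request):
    """Format a flat dict as an RSL-style conjunction: &(attr=value)(attr=value)..."""
    clauses = []
    for attr, value in request.items():
        text = str(value)
        if '"' in text:
            # Escaping rules are out of scope for this sketch.
            raise ValueError("values containing double quotes are not supported")
        # Quote values containing whitespace or parentheses so they stay one token.
        if any(ch.isspace() or ch in "()" for ch in text):
            text = f'"{text}"'
        clauses.append(f"({attr}={text})")
    return "&" + "".join(clauses)


if __name__ == "__main__":
    job = {"executable": "/bin/hostname", "count": 4, "maxTime": 10}
    print(to_rsl(job))
```

A broker or co-allocator in the layered architecture would rewrite such a request into more specific ones before handing them to GRAM; the sketch covers only the final string form.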

Page 20: Resource Management Issues for Grid Computing

• Site autonomy

– Resources owned by different organizations, in different administrative domains

– Local policies for use, scheduling, security

• Heterogeneous substrate

– Different local resource management systems

• Policy extensibility

– Local sites need ability to customize their resource management policies

• Co-allocation

– May need resources at several sites

– Mechanism for allocating multiple resources, initiating computation, monitoring and managing

• On-line control

– Adapt application requirements to resource availability

Page 21: Resource Management Architecture

[Figure labels]

• Application, which issues RSL

• Broker: RSL specialization; exchanges Queries & Info with the Information Service

• Ground RSL, passed to the Co-allocator

• Simple ground RSL, passed to GRAM (three instances)

• Local resource managers: LSF, EASY-LL, NQE

Page 22: Local Resource Managers

• Implemented with Globus Resource Allocation Manager (GRAM)

– Processing RSL specifications representing resource requests

• Deny request

• Create one or more processes (jobs) that satisfy request

– Enable remote monitoring and management of jobs

– Periodically update MDS information service with current availability and capabilities of resources

• GRAM is responsible for

– Parsing and processing RSL

– Job monitoring

– MDS update
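The three responsibilities listed above (process a request, monitor jobs, publish status) can be sketched as a toy local resource manager. This is an illustration only, not GRAM's implementation: the class `LocalResourceManager`, its methods, and the CPU-count admission rule are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class LocalResourceManager:
    """Toy manager that admits jobs while enough CPUs are free."""
    total_cpus: int
    jobs: dict = field(default_factory=dict)  # job id -> CPUs in use
    next_id: int = 1

    def free_cpus(self):
        return self.total_cpus - sum(self.jobs.values())

    def submit(self, request):
        """Deny the request (return None) or create a job that satisfies it."""
        count = int(request.get("count", 1))
        if "executable" not in request or count > self.free_cpus():
            return None  # deny request
        job_id = self.next_id
        self.next_id += 1
        self.jobs[job_id] = count
        return job_id

    def finish(self, job_id):
        """Release the CPUs held by a completed job."""
        self.jobs.pop(job_id, None)

    def status_record(self):
        """Snapshot that would be published periodically to an information service."""
        return {
            "total_cpus": self.total_cpus,
            "free_cpus": self.free_cpus(),
            "running_jobs": len(self.jobs),
        }


if __name__ == "__main__":
    lrm = LocalResourceManager(total_cpus=8)
    first = lrm.submit({"executable": "/bin/hostname", "count": 6})
    second = lrm.submit({"executable": "/bin/date", "count": 4})
    print(first, second, lrm.status_record())
```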

Page 23: Grid Information Services

• System information is critical to operation of the grid and construction of applications

– What resources are available?

• Resource discovery

– What is the “state” of the grid?

• Resource selection

– How to optimize resource use?

• Application configuration and adaptation

• We need a general information infrastructure to answer these questions

Page 24: GIS Architecture

[Figure labels: Users; Customized Aggregate Directories (A); Standard Resource Description Services (R); Registration Protocol; Enquiry Protocol]

Page 25: A Model Architecture for Data Grids

[Figure labels]

• Application

• Metadata Catalog: Attribute Specification in; Logical Collection and Logical File Name out

• Replica Catalog: Multiple Locations out

• Replica Selection: uses Performance Information & Predictions from MDS and NWS; yields the Selected Replica

• GridFTP Control Channel and GridFTP Data Channel

• Replica Location 1, Replica Location 2, Replica Location 3: Tape Library, Disk Cache, Disk Array, Disk Cache
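The replica selection box in the figure chooses one of several physical locations using performance predictions. A minimal sketch of that decision follows, assuming predicted bandwidth per location is already available (in the figure it would come from NWS and MDS). The function name `select_replica`, the location names, and the lowest-transfer-time criterion are assumptions for illustration.

```python
def select_replica(replicas, predicted_mbps, size_mb):
    """Pick the replica location with the lowest estimated transfer time.

    replicas: iterable of location names returned by a replica catalog
    predicted_mbps: dict mapping location -> predicted bandwidth in Mbit/s
    size_mb: file size in megabytes
    Returns (location, seconds), or (None, inf) if no location is usable.
    """
    best, best_time = None, float("inf")
    for location in replicas:
        rate = predicted_mbps.get(location, 0.0)
        if rate <= 0:
            continue  # no usable prediction for this location
        seconds = size_mb * 8 / rate  # megabytes -> megabits
        if seconds < best_time:
            best, best_time = location, seconds
    return best, best_time


if __name__ == "__main__":
    locations = ["tape.siteA", "disk.siteB", "cache.siteC"]
    predictions = {"tape.siteA": 20.0, "disk.siteB": 400.0, "cache.siteC": 100.0}
    print(select_replica(locations, predictions, size_mb=1000))
```

A fuller selector would also weigh storage latency (tape versus disk) and current load, which this sketch ignores.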

Page 26: GridFTP: Basic Approach

• FTP protocol is defined by several IETF RFCs

• Start with most commonly used subset

– Standard FTP: get/put etc., 3rd-party transfer

• Implement standard but often unused features

– GSS binding, extended directory listing, simple restart

• Extend in various ways, while preserving interoperability with existing servers

– Striped/parallel data channels, partial file, automatic & manual TCP buffer setting, progress monitoring, extended restart
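The idea behind parallel data channels and partial-file transfer can be illustrated by dividing a file into byte ranges, one per channel. This sketch shows only the range arithmetic; it is not the GridFTP wire protocol, and the function name `split_ranges` is hypothetical.

```python
def split_ranges(file_size, channels):
    """Split [0, file_size) into contiguous (offset, length) blocks, one per channel.

    Earlier channels receive one extra byte when the size does not divide evenly.
    Channels that would receive zero bytes are omitted.
    """
    if file_size < 0 or channels < 1:
        raise ValueError("file_size must be >= 0 and channels >= 1")
    base, extra = divmod(file_size, channels)
    ranges, offset = [], 0
    for i in range(channels):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


if __name__ == "__main__":
    for offset, length in split_ranges(10, 3):
        print(f"channel transfers bytes {offset}..{offset + length - 1}")
```

Each (offset, length) pair corresponds to one partial-file request; a restart after failure would only need to re-request the ranges that did not complete.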

Page 27: Striped GridFTP Server

[Figure labels]

• GridFTP client, GridFTP Control Channel, Control socket

• GridFTP server master, mpirun

• GridFTP Server Parallel Backend: several Plug-in/Control pairs, MPI (Comm_World), MPI (Sub-Comm)

• MPI-IO, Parallel File System (e.g. PVFS, PFS, etc.)

• GridFTP Data Channels: To Client or Another Striped GridFTP Server

Page 28: Condor

Page 29: What is Condor?

• Condor converts collections of distributively owned workstations and dedicated clusters into a distributed high-throughput computing facility.

• All jobs: Resource finder, Batch queue manager, Scheduler

• Jobs linked with the Condor library: Checkpoint/Restart, Process migration, Remote system calls

Page 30: Layered Design

[Figure labels]

• Layers: Application, Application RM, Request Agent, Match-Making, Access Control, Resource

• Roles: Customer/User, System Administrator, Resource Owner

• Condor

Page 31: Unique Mechanisms

• Checkpointing

– Enables Preemptive Resume Resource Allocation (essential in an opportunistic environment)

• Remote I/O

– Enables computation across administrative domains (essential for HTC)

• ClassAds

– Enables flexible resource matchmaking (essential in a distributively owned environment)
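Matchmaking in the ClassAd style is two-sided: a job and a machine are paired only when each one's requirements accept the other, and the job's rank picks among acceptable machines. The sketch below illustrates that idea with plain Python dictionaries and callables standing in for ClassAd expressions; real ClassAds are a declarative expression language, and every name here (`matches`, `best_match`, the attribute names) is hypothetical.

```python
def matches(job, machine):
    """Two-sided match: each ad's requirements must accept the other ad."""
    return job["requirements"](machine) and machine["requirements"](job)


def best_match(job, machines):
    """Among mutually acceptable machines, pick the one the job ranks highest."""
    candidates = [m for m in machines if matches(job, m)]
    if not candidates:
        return None
    return max(candidates, key=job["rank"])


# Machine ads: owners state which jobs they are willing to run.
MACHINES = [
    {"name": "ws1", "arch": "SUN4u", "memory_mb": 256,
     "requirements": lambda job: job["owner"] != "guest"},
    {"name": "ws2", "arch": "INTEL", "memory_mb": 512,
     "requirements": lambda job: True},
    {"name": "ws3", "arch": "INTEL", "memory_mb": 128,
     "requirements": lambda job: True},
]

# Job ad: the user states what the job needs and what it prefers.
JOB = {
    "owner": "alice",
    "requirements": lambda m: m["arch"] == "INTEL" and m["memory_mb"] >= 128,
    "rank": lambda m: m["memory_mb"],
}

if __name__ == "__main__":
    chosen = best_match(JOB, MACHINES)
    print(chosen["name"] if chosen else "no match")
```

Because the owner's requirements are evaluated alongside the job's, a resource owner keeps control over who may use the machine, which is the point of matchmaking in a distributively owned environment.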

Page 32: Condor System Structure

[Figure labels]

• Submit Machine: Customer Agent

• Execution Machine: Resource Agent

• Central Manager: Collector, Negotiator

• Abbreviations used in the diagram: CA, RA, CN

• Ads: [...A], [...B], [...C]

Page 33

[Figure labels]

• Job Submission Machine: End User Requests, Condor-G Scheduler, Persistent Job Queue, Condor-G GridManager, GASS Server, Condor-G Collector, Condor Shadow Process for Job X

• Job Execution Site: Globus Daemons + Local Site Scheduler [See Figure 1], Condor Daemons, Job X, Condor System Call Trapping & Checkpoint Library

• Links: Job, Transfer Job X, Resource Information, Redirected System Call Data

Page 34: TENT

Page 35: TENT

• A distributed workflow management and integration system for engineering applications developed by

– German Aerospace Center (DLR), Simulation and Software Technology (SISTEC) http://www.sistec.dlr.de

– German National Research Center for Information Technology (GMD), Institute for Algorithms and Scientific Computing (SCAI) http://www.gmd.de/scai

Page 36: TENT - The Integration Framework

[Figure label: visualization]

Page 37: TENT Packages

Page 38: TENT - Software architecture

Page 39: Architectural Models for Resource Management in the Grid

Page 40: Typical Grid Computing Environment

[Figure labels: Application; Resource Broker; Grid Resource Broker (two instances); Grid Information Service (two instances); database; resources R1, R2, R3, R4, R5, R6, …, RN]

Page 41: Sources of Complexity in Grid Resource Management

• No single administrative control.

• No single ownership policy:

– Each resource owner has their own policies or scheduling mechanisms

– Users must honour them (particularly external Grid users)

• Heterogeneity

– resources: PCs, workstations, clusters, supercomputers, instruments, databases, software …

– fabric management systems and management policies

– application requirements

• Dynamic availability: resources may appear and disappear…

Page 42: Sources of Complexity in Grid Resource Management

• Unreliable resources: they may disappear from view

• No uniform cost model: cost varies from one user’s resource to another and with the time of day.

• No single access mechanism: Web, custom interfaces, command line…

Page 43: Grid Resource Management Issues

• Authentication (once).

• Specify (code, resources, etc.).

• Discover resources.

• Negotiate authorization, acceptable use, cost, etc.

• Acquire resources.

• Schedule jobs.

• Initiate computation.

• Steer computation.

• Access remote data-sets.

• Collaborate with results.

• Account for usage.

[Figure labels: Domain 1, Domain 2. The steps from “Discover resources” through “Steer computation” appear a second time in the figure.]

Rajkumar Buyya (Monash Univ.)

Page 44: Data Access for Resource Management

[Figure labels]

• Components: Grid Status Registry Manager, Grid Space Manager, Data Disseminator, Request Router/Allocator (1), Request Router/Allocator (2)

• Stores: Grid Status Registry, Gridspace, Gridspace Cache

• Messages in: Status update, Gridspace update, Resource request

• Messages out: Update, Route or Allocation

• Route or allocation with single choice

Page 45: Architectural Models for RM

• Hierarchical

– Remarks: It captures the model followed in most contemporary systems.

– Systems: Globus, Legion, CCS, AppLeS, NetSolve, Ninf.

• Abstract Owner (AO)

– Remarks: An order and delivery model that focuses on long-term goals.

– Systems: Expected to emerge; most peer-to-peer computing systems are likely to be based on it.

• Market Model

– Remarks: It follows an economic model for resource discovery, sharing, and scheduling.

– Systems: GRACE, Nimrod/G, JavaMarket, Mariposa.

Page 46: Hierarchical RM

[Figure labels: user; Access/Admission Control Agent; Persistent Job control agent; Global Scheduler (several instances); Connection control; monitor; Deployment Agent; Domain Resource manager or control agent; Control domain task; Local Scheduler; resource; Grid information service]

Page 47: Resource Management in Globus

• The Grid Resource Allocation Management (GRAM) protocol and client API allow programs to be started on remote resources, despite local heterogeneity

• Resource Specification Language (RSL) is used to communicate requirements

• A layered architecture allows application-specific resource brokers and co-allocators to be defined in terms of GRAM services

– Integrated with Condor, MPICH-G2, …

Page 48: Resource Management Architecture in Globus

[Figure labels]

• Application, which issues RSL

• Broker: RSL specialization; exchanges Queries & Info with the Information Service

• Ground RSL, passed to the Co-allocator

• Simple ground RSL, passed to GRAM (three instances)

• Local resource managers: LSF, EASY-LL, NQE

Page 49: Local Resource Managers

• Implemented with Globus Resource Allocation Manager (GRAM)

– Processing RSL specifications representing resource requests

• Deny request

• Create one or more processes (jobs) that satisfy request

– Enable remote monitoring and management of jobs

– Periodically update MDS information service with current availability and capabilities of resources

• GRAM is responsible for

– Parsing and processing RSL

– Job monitoring

– MDS update

Page 50: Globus/MPICH-G2 components

[Figure labels]

• MPI Apps, MPICH-G2: Process MPI messages

• Client calls: MDS client API calls to locate resources; MDS client API calls to get resource info; Client API calls to request resource allocation and process creation

• MDS: Grid Index Info Server; MDS: Grid Resource Info Server (Query current status of resource)

• Globus Security Infrastructure

• Local site boundary

• Globus Resource Manager: Globus Gatekeeper, Launch, Globus-job-manager, RSL Library (Parse), Request, Allocate & create processes, Process (two instances), Monitor & control, Provide state change callbacks to client

Page 51: High throughput workload management system architecture (simplified design)

[Figure labels]

• Master: Submit jobs (using Class-Ads)

• Condor-G: condor_submit (Globus Universe), globusrun

• GIS: Resource Discovery; Information on characteristics and status of local resources

• Sites (Site1, Site2, Site3), each with a GRAM in front of a local scheduler (CONDOR, LSF, PBS)

Page 52: Condor Globus Universe

Page 53: AO General Model

[Figure panels]

• (a) External view of AO model

• (b) AO is Resource Owner

• (c) AO is broker

• (d) Job scheduling step AO

• (e) Job Shop

[Figure labels: Abstract Owner; Resource Manager; Physical Resource; Order window; Pickup Window; Manager; Sales Rep.; Delivery Rep.; AO1, AO2, AO3; AO for Grid; Job; Result; Job shop (Estimator & Execution); Estimator; Executor list]

Page 54: AO is owner or broker

• User negotiates with AO through “order window”

• That AO may own some resources, and/or it may broker with other AOs for those resources

• After negotiation, resources are delivered through “pickup window”

[Figure labels: User Requests Resources; Order Window; Pickup Window; AO; Manager; Sales; Delivery; Resource Manager; Physical Resource; AO1, AO2, AO3]

Page 55: AO Resources

• Resources are objects

• Classes are

– Instrument

• Data source, sink, transform

• e.g. programs, people, files, data collection devices

– Channel

• Moves data among instruments

– Complexes of above

• Attributes define sizes, times, connections, etc.

[Figure labels: Instrument (File), Instrument (Program), Instrument (File), Instrument (Program), Instrument (Telescope), Instrument (Person), Channels]

Page 56: Negotiating with an AO

[Figure: exchange between USER and AO]

• USER, to the Order Window: make dummy resource (with attributes set to constants, variables, or “don’t care”) + bid + delivery plan + variable constraints

• AO, in reply: resource candidates (values for variables/attributes + asking price for each)

• USER: pick one, try again, or give up

• Perhaps later... the resource arrives at the Delivery Window

• USER: assign tasks to resource, use, relinquish
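The negotiation above can be sketched as two small functions: the AO filters its inventory against the user's dummy resource, and the user picks an offer within the bid. This is a simplified illustration under stated assumptions: attributes are either constants or “don’t care” (variables with constraints and the delivery plan are left out), and all names (`DONT_CARE`, `candidates_for`, `pick`, the inventory entries) are hypothetical.

```python
DONT_CARE = object()  # sentinel for attributes the user does not constrain


def candidates_for(order, inventory):
    """AO side: return resources whose attributes satisfy the dummy resource."""
    offers = []
    for resource in inventory:
        ok = all(
            want is DONT_CARE or resource["attrs"].get(name) == want
            for name, want in order["attrs"].items()
        )
        if ok:
            offers.append(resource)
    return offers


def pick(order, offers):
    """User side: pick the cheapest offer within the bid, or None (retry or give up)."""
    affordable = [o for o in offers if o["asking_price"] <= order["bid"]]
    return min(affordable, key=lambda o: o["asking_price"], default=None)


INVENTORY = [
    {"id": "r1", "attrs": {"kind": "cpu", "os": "linux"}, "asking_price": 12},
    {"id": "r2", "attrs": {"kind": "cpu", "os": "irix"}, "asking_price": 8},
    {"id": "r3", "attrs": {"kind": "disk", "os": "linux"}, "asking_price": 3},
]

if __name__ == "__main__":
    order = {"attrs": {"kind": "cpu", "os": DONT_CARE}, "bid": 10}
    offers = candidates_for(order, INVENTORY)
    chosen = pick(order, offers)
    print(chosen["id"] if chosen else "try again or give up")
```

When `pick` returns None the user can raise the bid or relax attributes and submit a new order, which corresponds to the “try again” branch in the figure.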

Page 57: Economic Models for Trading

• Commodity Market Model

• Posted Prices Models

• Bargaining Model

• Tendering (Contract Net) Model

• Auction Model

• Proportional Resource Sharing Model

• Shareholder Model

• Partnership Model
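As one concrete instance from the list, proportional resource sharing gives each user a share of a resource in proportion to that user's bid. A minimal sketch follows, assuming a single divisible resource and non-negative bids; the function name `proportional_shares` and the example figures are hypothetical.

```python
def proportional_shares(bids, capacity):
    """Divide capacity among users in proportion to their bids.

    bids: dict mapping user -> non-negative bid
    capacity: total amount of the resource to divide
    """
    total = sum(bids.values())
    if total <= 0:
        return {user: 0.0 for user in bids}
    return {user: capacity * bid / total for user, bid in bids.items()}


if __name__ == "__main__":
    # Two users bid 30 and 10 credits for 100 CPU-hours.
    print(proportional_shares({"a": 30, "b": 10}, capacity=100))
```

The other models in the list differ mainly in how the price is set (posted, bargained, tendered, or auctioned) rather than in how capacity is divided.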

Page 58: Economy Grid = Globus + GRACE

[Figure labels, by layer]

• Grid Apps. (Applications): Science, Engineering, Commerce, Portals, ActiveSheet, …

• Grid Tools (High-level Services and Tools): DUROC, globusrun, MPI-G, Nimrod/G, MPI-IO, CC++, GlobusView, Grid Status

• Grid Middleware (Core Services): MDS, GRAM, Globus Security Interface, Heartbeat Monitor, Nexus, GASS, DUROC, GARA, GRACE-TS, GBank, GMD

• Grid Fabric (Local Services): LSF, Condor, GRD, QBank, PBS, eCash, JVM, TCP, UDP, Solaris, Irix, Linux

Source: Rajkumar Buyya (Monash Univ.)

Page 59: Grid Architecture for Computational Economy

[Figure labels]

• Grid User: Application

• Grid Resource Broker: Grid Explorer, Schedule Advisor, Trade Manager, Job Control Agent, Deployment Agent

• Grid Middleware Services: Sign-on, Info ?, Secure, Trading, QoS, Storage, JobExec, Health Monitor, Information Server(s), Grid Market Services

• Grid Service Providers (Grid Node 1 … Grid Node N): Trade Server, Resource Allocation, Resource Reservation, Pricing Algorithms, Accounting, Misc. services, resources R1, R2, … Rm

Source: Rajkumar Buyya (Monash Univ.)

Page 60: GRACE components

• A resource broker (e.g., Nimrod/G)

• Various resource trading protocols for different economic models

• A mediator for negotiating between users and grid service providers (Grid Market Directory)

• A deal template for specifying resource requirements and services offers

• Grid Trading Server

• Pricing policy specification

• Accounting (e.g., QBank) and payment management (GridBank, not yet implemented)

Page 61: Flow Diagram for Pricing, Accounting, Allocations and Job Scheduling

0. Make Deposits, Transfers, Refunds, Queries/Reports

1. Client negotiates for access cost.

2. Negotiation is performed per owner-defined policies.

3. If client is happy, TS informs QB about access deal.

4. Job is submitted.

5. Check with QB for “go ahead”.

6. Job starts.

7. Job completes.

8. Inform QB about resource utilization.

[Figure labels: GRID Bank (digital transactions); QBank; Trade Server; Pricing Policy; DB @ Each Site; Resource Manager (IBM-LL/PBS/…); Compute Resources (clusters/SGI/SP/...)]

Rajkumar Buyya (Monash Univ.)
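Steps 0 to 8 can be traced in a few lines of code. The sketch below is a toy model, not the QBank or Trade Server interface: the class `QBank`, the functions `negotiate` and `run_job`, the posted-price rule, and the per-CPU-second pricing are all assumptions chosen to keep the example short.

```python
class QBank:
    """Toy stand-in for the accounting service (QB in the steps above)."""

    def __init__(self):
        self.balances = {}
        self.deals = {}  # (client, resource) -> agreed price per CPU-second

    def deposit(self, client, amount):  # step 0
        self.balances[client] = self.balances.get(client, 0.0) + amount

    def record_deal(self, client, resource, price):  # step 3
        self.deals[(client, resource)] = price

    def go_ahead(self, client, resource, est_cpu_s):  # step 5
        price = self.deals.get((client, resource))
        if price is None:
            return False
        return self.balances.get(client, 0.0) >= price * est_cpu_s

    def charge(self, client, resource, used_cpu_s):  # step 8
        cost = self.deals[(client, resource)] * used_cpu_s
        self.balances[client] -= cost
        return cost


def negotiate(asking_price, client_max):  # steps 1-2
    """Posted-price policy: the deal closes only if the client accepts the price."""
    return asking_price if asking_price <= client_max else None


def run_job(qb, client, resource, asking_price, client_max, est_cpu_s, used_cpu_s):
    price = negotiate(asking_price, client_max)
    if price is None:
        return "no deal"
    qb.record_deal(client, resource, price)
    if not qb.go_ahead(client, resource, est_cpu_s):  # steps 4-5
        return "rejected"
    # Steps 6-7: the job would start and complete here.
    qb.charge(client, resource, used_cpu_s)
    return "done"


if __name__ == "__main__":
    bank = QBank()
    bank.deposit("alice", 100.0)
    print(run_job(bank, "alice", "cluster1", 0.5, 1.0, est_cpu_s=100, used_cpu_s=80))
    print(bank.balances["alice"])
```

Checking the balance before the job starts (step 5) and charging for actual use afterwards (step 8) are kept as separate calls, mirroring the two separate interactions with QB in the flow.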

Page 62: Nimrod/G: A Grid Resource Broker

• A resource broker for managing, steering, and executing task farming (parametric sweep/SPMD model) applications on the Grid, based on deadline and computational economy.

• Based on users’ QoS requirements, our Broker dynamically leases services at runtime depending on their quality, cost, and availability.

• Key Features

– A single window to manage & control experiment

– Persistent and Programmable Task Farming Engine

– Resource Discovery

– Resource Trading

– Scheduling & Predictions

– Generic Dispatcher & Grid Agents

– Transportation of data & results

– Steering & data management

– Accounting

Source: Rajkumar Buyya (Monash Univ.)

Page 63: A Glance at Nimrod-G Broker

[Figure labels]

• Nimrod/G Client (three instances)

• Nimrod/G Engine, Schedule Advisor, Trading Manager, Grid Explorer, Grid Store, Grid Dispatcher

• Grid Information Server(s)

• Grid Middleware: Globus, Legion, Condor, etc.

• Nodes, each with RM & TS: Globus enabled node (G), Legion enabled node (L), Condor enabled node (C)

• Abbreviations: RM: Local Resource Manager, TS: Trade Server; link labels GE, GIS, TM, TS

Source: Rajkumar Buyya (Monash Univ.)

Page 64: Nimrod/G Interactions

[Figure labels]

• Nodes: User Node, Grid Node, Compute Node

• Nimrod-G Grid Broker: Task Farming Engine, Grid Scheduler, Grid Dispatcher

• Grid Tools and Applications

• Grid Info Server, Grid Trade Server, Local Resource Manager, Process Server, File Server, Nimrod Agent, User Process, File access

• Example request: “Do this in 30 min. for $10?”

Source: Rajkumar Buyya (Monash Univ.)

Page 65: Adaptive Scheduling Steps

[Figure labels: a scheduling loop]

• Discover Resources

• Establish Rates

• Compose & Schedule

• Distribute Jobs

• Evaluate & Reschedule

• Meet requirements? Remaining Jobs, Deadline, & Budget?

• Discover More Resources

Source: Rajkumar Buyya (Monash Univ.)
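One pass of the “Compose & Schedule” step can be sketched as a greedy assignment that fills the cheapest resources first while respecting a deadline and a budget. This is an illustrative heuristic, not the algorithm used by Nimrod/G; the function `schedule`, the resource fields, and the assumption that jobs on one resource run one after another are all made up for the example.

```python
def schedule(num_jobs, resources, deadline_s, budget):
    """Assign jobs to the cheapest resources first, within deadline and budget.

    resources: list of dicts with 'name', 'secs_per_job', 'cost_per_job'.
    Jobs on one resource are assumed to run sequentially; resources run in parallel.
    Returns (plan, unscheduled), where plan maps resource name -> job count.
    """
    plan, remaining, spent = {}, num_jobs, 0.0
    for r in sorted(resources, key=lambda r: r["cost_per_job"]):
        if remaining == 0:
            break
        # How many jobs this resource can finish before the deadline.
        by_deadline = int(deadline_s // r["secs_per_job"])
        # How many jobs the remaining budget can pay for on this resource.
        if r["cost_per_job"] > 0:
            by_budget = int((budget - spent) // r["cost_per_job"])
        else:
            by_budget = remaining
        n = min(remaining, by_deadline, by_budget)
        if n > 0:
            plan[r["name"]] = n
            remaining -= n
            spent += n * r["cost_per_job"]
    return plan, remaining


if __name__ == "__main__":
    pool = [
        {"name": "fast", "secs_per_job": 60, "cost_per_job": 3.0},
        {"name": "slow", "secs_per_job": 300, "cost_per_job": 1.0},
    ]
    print(schedule(20, pool, deadline_s=1800, budget=100))
```

The adaptive part of the loop corresponds to calling such a function again with the remaining jobs, the time left, the budget left, and refreshed rates; a non-zero `unscheduled` count is the signal to discover more resources.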

Page 66: Concluding Remarks

• Restriction in Grid Middleware

– Difficulties in distributed computing and resource management policy

– Difficulties of middleware implementation required for heterogeneous systems in meta-computing infrastructure

• Globus, Condor, TENT, PARIS, Cactus, ….

• Difficulties of Resource Management in Grid Computing

• Models for Grid resource management architecture

– Hierarchical, AO, and Market-model ….