Rob Gardner

Page 1: Rob Gardner

6/23/2005 R. GARDNER OSG Baseline Services 1

OSG Baseline Services

In my talk I’d like to discuss two questions:

What capabilities are we aiming for in 2005/6?

How do we introduce new services into the OSG?

Page 2: Rob Gardner

Guidance for Capabilities - take a wish list of the present

- Principles and paths to deployment are guided by the essential needs of the participating VOs

Example list:

- Ability to store, serve, catalog, manage and discover collaboration-wide datasets on a very large scale
- Ability to opportunistically access non-dedicated resources
- Ability to host VO-managed services and agents on gatekeeper hosts

Page 3: Rob Gardner

…with hard lessons from the past

- Protect grid services that are vulnerable in a multi-VO environment
- Managed data transfer services required, since data staging is the most likely point of failure
- Policy-based authorization infrastructure to distinguish user roles within a VO
- Delay binding jobs to resources to the last possible moment to optimize utilization
- Robustness and reliability required to keep operating costs low
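
The point about delaying the binding of jobs to resources can be illustrated with a toy model. The sketch below is not from the talk: it assumes a fixed set of workers with hypothetical relative speeds, and compares binding every job up front (round-robin) against letting whichever worker becomes free first pull the next job. All function and parameter names are illustrative.

```python
import heapq

def early_binding_makespan(job_sizes, speeds):
    """Bind jobs to workers round-robin before anything runs (toy model)."""
    finish = [0.0] * len(speeds)
    for i, size in enumerate(job_sizes):
        w = i % len(speeds)            # decision made without knowing worker load
        finish[w] += size / speeds[w]
    return max(finish)

def late_binding_makespan(job_sizes, speeds):
    """Bind each job only when a worker becomes free (pull model, toy model)."""
    # Min-heap of (time at which the worker becomes free, worker index).
    free_at = [(0.0, w) for w in range(len(speeds))]
    heapq.heapify(free_at)
    makespan = 0.0
    for size in job_sizes:
        t, w = heapq.heappop(free_at)  # earliest available worker takes the job
        done = t + size / speeds[w]
        makespan = max(makespan, done)
        heapq.heappush(free_at, (done, w))
    return makespan

if __name__ == "__main__":
    jobs = [1.0] * 12                  # hypothetical: twelve equal-sized jobs
    speeds = [1.0, 1.0, 0.25]          # hypothetical: one worker is 4x slower
    print("early binding:", early_binding_makespan(jobs, speeds))
    print("late binding: ", late_binding_makespan(jobs, speeds))
```

In this model the slow worker receives a full share of the jobs under early binding, while under late binding it only takes a job when it is actually idle, so the total completion time is shorter.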

Page 4: Rob Gardner

…and performance targets for the (near!) future.

- Submission of collections of O(1000) jobs should happen within a few seconds. It is expected that even a typical data analysis task will translate into submission of O(1000) jobs.
- WMS needs to be able to keep all available resources busy.
- Overall reliability must be such that task completion is generally guaranteed within fewer than 3 retries, even for large tasks. For late 2007 this implies a >95% success rate per job.
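
As a rough check on the reliability target above, the sketch below estimates the probability that every job in a task succeeds within a given number of attempts. It assumes that attempts fail independently and with the same probability, which the slides do not state; the function names are illustrative.

```python
def job_failure_after_attempts(p_success, attempts):
    """Probability that a single job fails on every one of `attempts` tries.

    Assumption: tries are independent with the same success probability.
    """
    return (1.0 - p_success) ** attempts

def task_completion_probability(p_success, n_jobs, attempts):
    """Probability that all `n_jobs` jobs succeed within `attempts` tries each."""
    per_job_ok = 1.0 - job_failure_after_attempts(p_success, attempts)
    return per_job_ok ** n_jobs

if __name__ == "__main__":
    # 95% per-job success rate and O(1000) jobs, as quoted on the slide.
    for attempts in (1, 2, 3, 4):
        p = task_completion_probability(0.95, 1000, attempts)
        print(f"attempts={attempts}: P(task complete) = {p:.4f}")
```

Under these assumptions a 1000-job task at 95% per-job success completes within three attempts with probability of roughly 0.88, and within four attempts with probability above 0.99, which is the sense in which a per-job rate near 95% makes completion within a few retries realistic.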

Page 5: Rob Gardner

Example: ATLAS Production System

[Diagram: ATLAS production system. Labels: prodDB (CERN); AMI (Metadata); data management, Don Quijote “DQ”; Windmill; four supervisors ("super") linked by jabber and soap; executors Lexor, Dulcinea, Capone (LCGexe, NGexe, G3exe, Legacyexe); targets LCG, NorduGrid, Grid3, LSF; RLS (x3). Callouts: Requirements for VO and Core Services; DDMS; WMS.]

Page 6: Rob Gardner

…introduces challenging distributed data management issues

- A number of catalogs in play - standardize on an interface
- Require traffic shaping and load balancing for I/O (SRM-dCache deployed on US Tier2 Centers)
- I/O & space resources utilized based on VO policies
- Site-level catalogs for managed Tier2 storage elements
- Reliable transfer agents used also by WMS

Page 7: Rob Gardner

Managing Persistent Services

- VO-owned and dedicated sites will allow running of “persistent” VO-specific services and agents for cataloging, space management, replication, web caching, and WMS-related services
- Non-dedicated sites can be managed with VO services running at another site…
- Or by dynamically deploying VO services & agents

Page 8: Rob Gardner

Portal Partitions & Edge Services

- Strategy for multiple VOs: partition resources into VO-managed and shared
- Provision for hosting persistent “Guest” VO services & agents

[Diagram: gatekeepers in front of resources. Labels: GRAM, GRIDFTP, GIP, SRM; VO Services, Agents, Proxy Caches, Catalogs; Guest VO Services, Agents, Proxy Caches; Shared OSG; Resources.]

Page 9: Rob Gardner

Edge Services

- Need to support VO-specific agents and services on “leased” gateways
- Typical:
  - Need a local MySQL database
  - Access to port 80 (or generally, a port range)
  - Will give I/O requirements in advance
  - Need this for a time >> job execution time

Page 10: Rob Gardner

Example Edge Service: Scalable Remote Data Access

- ATLAS reconstruction and analysis jobs require access to remote database servers at CERN and elsewhere.
- This presents additional traffic on the network, long latencies for remote sites, and a bottleneck at the central DB.
- Suggest use of local mechanisms, such as web proxy caches, to minimize this impact.
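
The caching idea can be sketched as a small read-through cache placed in front of a remote lookup. This is only an illustration of the mechanism, not the web proxy cache software the talk has in mind: the class name, the `fetch` callable standing in for the remote database query, and the time-to-live policy are all assumptions.

```python
import time
from typing import Callable, Dict, Tuple

class ReadThroughCache:
    """Serve repeated lookups locally; contact the remote source only on a miss
    or when the cached entry is older than the time-to-live (hypothetical policy)."""

    def __init__(self, fetch: Callable[[str], bytes], ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch            # stand-in for the remote database request
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, bytes]] = {}
        self.remote_calls = 0          # counts trips to the remote source

    def get(self, key: str) -> bytes:
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]              # fresh local copy, no remote traffic
        self.remote_calls += 1
        value = self._fetch(key)
        self._entries[key] = (now, value)
        return value

if __name__ == "__main__":
    # Hypothetical remote lookup: here it just echoes the key.
    cache = ReadThroughCache(lambda key: key.encode(), ttl_seconds=60.0)
    for _ in range(100):
        cache.get("conditions/run-1")
    print("remote calls:", cache.remote_calls)
```

Many jobs at one site asking for the same data then cost the central server a single request per expiry period rather than one request per job.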

Page 11: Rob Gardner

Summary of OSG 0.4 targets (end 2005)

- GT4 gridftp already deployed; move to GT4 gram
- Managed computing elements for multi-VO
- Edge services framework providing capabilities, VO-managed late binding of job-to-resources
- Job sandbox inspection, globally
- Policy and trust infrastructure
- Data location services and transfer agents dynamically deployed via edge services
- Site catalogs, providing VO space management

Page 12: Rob Gardner

New Services in OSG - how?

- Requirements and schedule are determined with the OSG deployment activity
- Architectural coherence will be maintained through participation with the blueprint group
- Integrate middleware services from technology providers targeted for the OSG
- Provide testbed for evaluation and testing of new services and applications
- Test and exercise installation and distribution methods
- Provide feedback to service providers and VO application developers
- Prepare release candidates for provisioning.

Page 13: Rob Gardner

Service Readiness and Integration Plans

- Service proponents come to the integration testbed with an appropriately scoped functionality and integration plan
  - Purpose, scope
  - Service Description
  - Packaging Description
  - Dependencies: resources and services needed
  - Test use cases identified
  - Testing tools – clients, harness; metrics for success clearly defined
  - Effort to contribute to the OSG-IVC and schedule
  - Links to appropriate documentation, WSDL, etc.
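
The plan items listed above lend themselves to a simple completeness check before a service enters the testbed. The sketch below is one hypothetical way to record them; the class, field names and the rule that every item must be non-empty are assumptions, not an OSG format.

```python
from dataclasses import dataclass, field, fields
from typing import List

@dataclass
class ReadinessPlan:
    """Hypothetical record mirroring the checklist items on the slide."""
    purpose_and_scope: str = ""
    service_description: str = ""
    packaging_description: str = ""
    dependencies: List[str] = field(default_factory=list)      # resources and services needed
    test_use_cases: List[str] = field(default_factory=list)
    testing_tools: List[str] = field(default_factory=list)     # clients, harness
    success_metrics: List[str] = field(default_factory=list)
    effort_and_schedule: str = ""
    documentation_links: List[str] = field(default_factory=list)

def missing_items(plan: ReadinessPlan) -> List[str]:
    """Return the names of checklist items that are still empty."""
    return [f.name for f in fields(plan) if not getattr(plan, f.name)]

if __name__ == "__main__":
    plan = ReadinessPlan(
        purpose_and_scope="Example service: purpose and scope go here",
        test_use_cases=["example use case"],
    )
    print("still missing:", missing_items(plan))
```

A service with genuinely no dependencies would need an explicit marker rather than an empty list under this rule; that choice is left open here.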

Page 14: Rob Gardner

Path for New Services in OSG

[Diagram: path for new services. Labels: OSG Deployment Activity; OSG Integration Activity; OSG Operations-Provisioning Activity; Readiness plan; Readiness plan adopted; Effort; Resources; Software & packaging; Middleware Interoperability; Functionality & Scalability Tests; VO Application Software Installation; Application validation; Metrics & Certification; Release Description; Release Candidate; Service deployment; feedback.]

Page 15: Rob Gardner

OSG Integration Testbed Layout

[Diagram: OSG Integration Testbed. Labels: Stable Production Release; Integration Release; Resources enter and leave as necessary; Service platform; VO contributed applications, test harness, clients.]

Page 16: Rob Gardner

Deployed ITB

Page 17: Rob Gardner

Validation: CMS-MOP and ATLAS-Capone

Page 18: Rob Gardner

Validation Jobs on ITB 0.1.5

Page 19: Rob Gardner

Validation: GT4 GridFTP

http://osg.ivdgl.org/twiki/bin/view/Integration/GridFTP

Page 20: Rob Gardner

Conclusions

- OSG driven by VO requirements, core capabilities, and principles for guidance
- Capabilities for next major release to introduce flexibility for maturing middleware and VO boxes for delegated responsibility, via Edge Services
- Paths for new services, validation and release process within a contributed, consortium model are working reasonably well so far
- Process is leading to a reliable, core computing substrate, with VO flexibility