First Steps with Grid Computing & Oracle Application Server 10g

Session id: 40187

Venkata Ravipati, Product Manager, Oracle Corporation

Sastry Malladi, CMTS, Oracle Corporation

Jamie Shiers, IT Division, [email protected]

Agenda

– Introduction Grid Computing
– OracleAS 10g Features
– CERN Case Study
– OracleAS 10g Roadmap
– Q&A

Introduction Grid Computing

IT Challenges

Enterprise IT is highly fragmented, leading to poor utilization, excess capacity, and systems inflexibility.

– Adding capacity is complex and labor-intensive
– Systems are fragmented into inflexible “islands”
– Expensive server capacity sits underutilized
– Installing, configuring, and managing application infrastructure is slow and expensive
– Poorly integrated applications with redundant functionality increase costs and limit business responsiveness

Grid Computing Solves IT Problems

IT Problem | Grid Solution
High cost of adding capacity | Pool modular, low-cost hardware components
Islands of inflexible systems | Virtualize system resources
Underutilized server capacity | Dynamically allocate workloads and information
Hard to configure and manage | Unify management and automate provisioning
Poorly integrated applications with redundant functions | Compose applications from reusable services

What is Grid Computing?

Grid computing is a hardware and software infrastructure that enables:

– Transparent resource sharing across an enterprise: divisions, data centers, and resource categories (computers, storage, databases, application servers, applications)
– Coordination of resources that are not subject to centralized control
– Use of standard, open, general-purpose protocols and interfaces
– Delivery of nontrivial qualities of service

Enterprise Grid Infrastructure Must Be Comprehensive

Management

Middleware

Database

Storage

Agenda


OracleAS 10g Features

Introducing Oracle 10g

Complete, integrated grid infrastructure

Oracle Application Server 10g

– Software Provisioning
– User Provisioning
– Application Availability
– Application Development
– Application Monitoring
– Workload Management

Workload Management

IT Problem: Adding and allocating computing capacity is expensive and too slow to adapt to changing business requirements

Oracle 10g Solution:

– Virtualize servers as modular HW resources
– Virtualize software as reusable run-time services
– Manage workloads automatically based on pre-defined policies

Virtualized Hardware Resources

Add Capacity Quickly and Economically

Virtualized Middleware Services

Accounting Application

Group Collections of Resources and Runtime Services into Logical Applications

HTTP Server

Web Cache

J2EE Server

Policy-based Workload Management

Workload Manager

– Policy Manager: stores application-specific policies
– Resource Manager: manages resource availability/status
– Dispatcher & Scheduler: distribute workloads based on application-specific policies
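The three Workload Manager roles named on this slide can be illustrated with a minimal sketch. Everything below (class names, the `priority` and `min_nodes` fields, the node names) is hypothetical and chosen for illustration; it is not the OracleAS 10g API.

```python
from dataclasses import dataclass, field


@dataclass
class Policy:
    """Application-specific policy (hypothetical fields)."""
    app: str
    priority: int   # higher value is served first
    min_nodes: int  # number of nodes the application asks for


@dataclass
class WorkloadManager:
    policies: dict = field(default_factory=dict)  # Policy Manager: app -> Policy
    nodes: dict = field(default_factory=dict)     # Resource Manager: node -> available?

    def add_policy(self, policy):
        self.policies[policy.app] = policy

    def set_node_status(self, node, available):
        self.nodes[node] = available

    def dispatch(self):
        """Dispatcher & Scheduler: hand available nodes to applications by priority."""
        free = sorted(n for n, ok in self.nodes.items() if ok)
        plan = {}
        for policy in sorted(self.policies.values(), key=lambda p: -p.priority):
            plan[policy.app] = free[:policy.min_nodes]
            free = free[policy.min_nodes:]
        return plan


wm = WorkloadManager()
wm.add_policy(Policy("web_store", priority=2, min_nodes=2))
wm.add_policy(Policy("accounting", priority=1, min_nodes=2))
for name, up in [("node1", True), ("node2", True), ("node3", False), ("node4", True)]:
    wm.set_node_status(name, up)
plan = wm.dispatch()
```

With one of the four nodes unavailable, the higher-priority application is served in full and the lower-priority one receives whatever is left.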

Middleware Services

– HTTP servers
– Web caches
– J2EE servers
– EJB processes
– Portal services
– Wireless services
– Web services
– Integration services
– Directory services
– Authentication services
– Authorization services
– Enterprise Reporting services
– Query Analysis services

Metrics-based Workload Reallocation

Unexpected demand! Shift more capacity to Web Store

– Employee Portal: Portal
– Accounting: Discoverer, reports
– Web Store: HTTP, J2EE Server
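A metrics-driven shift of capacity toward the Web Store might look like the sketch below. The utilization figures, the 0.8 threshold and the one-node-at-a-time rule are assumptions made for the example, not values from the presentation.

```python
def reallocate(allocation, load, threshold=0.8):
    """Move one node from the least-loaded application to each overloaded one.

    allocation: app -> number of nodes; load: app -> utilization in [0, 1].
    """
    new_alloc = dict(allocation)
    for app in sorted(load, key=load.get, reverse=True):
        if load[app] <= threshold:
            break  # remaining applications are within the threshold
        donors = [a for a in new_alloc
                  if a != app and load[a] <= threshold and new_alloc[a] > 1]
        if not donors:
            break  # nothing can be spared
        donor = min(donors, key=load.get)
        new_alloc[donor] -= 1
        new_alloc[app] += 1
    return new_alloc


allocation = {"employee_portal": 2, "accounting": 3, "web_store": 3}
load = {"employee_portal": 0.40, "accounting": 0.25, "web_store": 0.95}
new_allocation = reallocate(allocation, load)
```

Here the Web Store exceeds the threshold, so it gains a node from Accounting, the least-loaded application that can spare one.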

Scheduled Workload Reallocation

[Figure: capacity allocated to General Ledger and Order Entry at Start of Quarter and at End of Quarter]

Policy-based Edge Caching

Virtualized pools of storage enable sharing and transfer of data between nodes

Adaptive caching policies flexibly accommodate changing demand

[Figure: Client, Virtual HTTP Server, Grid Caches]



Software Provisioning

IT Problem: Installing, configuring, upgrading and patching systems is labor-intensive and too slow to adapt to changing business requirements

Oracle 10g Solution:

– Manage virtualized HW and SW resources as one system
– Automate installation, configuration, upgrading, and patching processes



Grid Control Repository (GCR) with centralized inventories for installation and configuration

– Provision servers
– Provision software
– Provision users

Automated Deployment

– Install and configure a single server node
– Register configuration to the Repository
– Automatically deploy to nodes as they are added to the grid
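The register-then-deploy flow of Automated Deployment can be sketched as follows. The class is a toy stand-in for a central repository, and the configuration keys and host names are made up; none of it reflects the real Grid Control interfaces.

```python
class GridControlRepository:
    """Toy stand-in for a central configuration repository (not Oracle's API)."""

    def __init__(self):
        self.master_config = None
        self.nodes = {}

    def register_configuration(self, config):
        # Record the configuration of the node that was installed by hand.
        self.master_config = dict(config)

    def add_node(self, hostname):
        # A node joining the grid receives its own copy of the registered configuration.
        if self.master_config is None:
            raise RuntimeError("no configuration registered yet")
        self.nodes[hostname] = dict(self.master_config)
        return self.nodes[hostname]


repo = GridControlRepository()
repo.register_configuration({"http_port": 7777, "j2ee_instances": 2})
repo.add_node("node2.example.com")
repo.add_node("node3.example.com")
```

Each added node gets an independent copy, so a later local change on one node does not silently alter the others.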

Software Cloning

1. Select Software and Instances to Clone
2. Clone to Selected Targets
3. Update Configuration Inventory in GCR

– Automated provisioning based on master node
– Archive & replicate specific configurations
  – e.g.: Payroll config. optimized for Fridays at 4:00pm
– Context-specific adjustments
  – e.g.: IP address, host name, web listener
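Cloning from a master node with context-specific adjustments could be sketched like this. The configuration keys, host names and addresses are illustrative assumptions; the point is only that everything is copied except the per-host values.

```python
import copy


def clone_instance(master, overrides):
    """Copy a master configuration and apply context-specific adjustments."""
    clone = copy.deepcopy(master)
    clone.update(overrides)
    return clone


master = {
    "software": ["http_server", "web_cache", "j2ee_server"],
    "host_name": "master.example.com",
    "ip_address": "192.0.2.10",
    "web_listener_port": 7777,
}
targets = {
    "node2.example.com": "192.0.2.11",
    "node3.example.com": "192.0.2.12",
}
# Clone to the selected targets, then keep the result as the configuration inventory.
inventory = {
    host: clone_instance(master, {"host_name": host, "ip_address": ip})
    for host, ip in targets.items()
}
```

The deep copy keeps the clones independent of the master, which matters once a clone's software list is changed locally.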

Patch and Update Management

1. Patch Published
2. Determine Applicability
3. Apply Patch/Upgrade
4. Update Patch Inventory in GCR

– Real-time discovery of new patches
– Automated staging and application of patches
– Rolling application upgrades
– Patch history tracking
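The four patch steps (published, applicability check, apply, inventory update) can be sketched in a few lines. The patch identifier, product names and version strings are hypothetical, and the "apply" step is a placeholder rather than a real installer call.

```python
def is_applicable(patch, target):
    """A patch applies if product and version match and it is not yet installed."""
    return (
        patch["product"] == target["product"]
        and target["version"] in patch["versions"]
        and patch["id"] not in target["applied"]
    )


def apply_published_patch(patch, targets, patch_inventory):
    """Check applicability, apply, then record the patch in the inventory."""
    for target in targets:
        if is_applicable(patch, target):
            target["applied"].append(patch["id"])  # placeholder for the real apply step
            patch_inventory.setdefault(target["host"], []).append(patch["id"])
    return patch_inventory


patch = {"id": "P-001", "product": "j2ee_server", "versions": {"10.1"}}
targets = [
    {"host": "node1", "product": "j2ee_server", "version": "10.1", "applied": []},
    {"host": "node2", "product": "j2ee_server", "version": "9.0", "applied": []},
    {"host": "node3", "product": "web_cache", "version": "10.1", "applied": []},
]
patch_inventory = apply_published_patch(patch, targets, {})
```

Because applied patches are recorded per target, running the same patch again is a no-op, which is what makes patch history tracking useful.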



User Provisioning

IT Problem:

– It takes too long to register new users
– Users have too many accounts, passwords, and privileges to manage
– Developers re-implement authentication for each new application

Oracle 10g Solution:

– Centralized identity management
– Shared authentication service


Single Sign-on Across the Grid

[Figure: Client, Directory, Accounting, Sales Portal, Support Portal]

– Consolidate accounts
– Simplify management
– Facilitate re-use

User Provisioning

– Create users once
  – Centrally manage roles, privileges, preferences
– Support single password for all applications
– Delegate administration
  – Locally administered departments, LOBs, etc.
  – User self-service
– Interoperate with existing security infrastructure



Application Availability

IT Problem: Ensuring required levels of availability is too expensive

Oracle 10g Solution:

– Modular components provide inexpensive redundancy
– Coordinated response to system failures ensures application availability



– Transparent Application Failover (TAF)
  – Automatic session migration
– Fast-Start Fault Recovery™
  – Automatic failure detection and recovery
– Multi-tier Failover Notification (FaN)
  – Speeds end-to-end application failover time
  – From 15 minutes to <15 seconds




Transparent Application Failover

Resource failure! Fail over the service to additional nodes

Fast-Start Fault Recovery™

Recovered nodes are re-instated automatically

Multi-tier Failover Notification (FaN)

Overcomes TCP/IP timeout delays associated with cross-tier application failovers:

            | RAC Failover | AS Detection | Total Downtime
Without FaN | < 8 secs*    | 15 mins      | > 15 mins
With FaN    | < 8 secs*    | < 4 secs     | < 12 secs
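The downtime figures on this slide add up as a simple sum of the two phases. The sketch below assumes one reading of the flattened chart: RAC failover under 8 seconds in both cases, and application server (AS) detection taking 15 minutes without FaN versus under 4 seconds with it. The constants are upper bounds taken from that reading, not measurements.

```python
RAC_FAILOVER_S = 8                     # "< 8 secs" in both cases (assumed reading)
AS_DETECTION_WITH_FAN_S = 4            # "< 4 secs"
AS_DETECTION_WITHOUT_FAN_S = 15 * 60   # "15 mins", dominated by the TCP/IP timeout

# Total downtime is the database failover plus the time the middle tier
# needs to notice that its connections are dead.
total_with_fan = RAC_FAILOVER_S + AS_DETECTION_WITH_FAN_S
total_without_fan = RAC_FAILOVER_S + AS_DETECTION_WITHOUT_FAN_S
improvement = total_without_fan / total_with_fan
```

Under these assumptions the totals come to 12 seconds with FaN and 908 seconds (just over 15 minutes) without it, which matches the "< 12 secs" and "> 15 mins" labels.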



Application Monitoring

IT Problem: Insufficient performance data to plan, tune, and manage systems effectively

Oracle 10g Solution:

– Software pre-instrumented to provide status and fine-grained performance data
– Centralized console analyzes and summarizes Grid performance



– Monitor virtual application resources
  – e.g.: J2EE containers, HTTP servers, Web caches, firewalls, routers, software components, etc.
– Root cause diagnostics
– Track real-time and historic performance metrics
  – App. availability, business transactions, end user perf.
– Notifications and alerts
– Administer service level agreements (SLAs)

Repository-based Management

Centralized repository-based management provides a unified view of entire infrastructure

Manage all your end-to-end application infrastructure from any device

[Figure: Grid Control Repository with Computer Host, Database, Storage System, App Server, Client, Router/Switch, Firewall, Portals, Clusters, Integration, Web Sites, Custom Apps]

Performance Monitoring

– Capture real-time and historical performance data
– Analyze and tune workload policies
– Answer questions like:
  – “How much time is being spent in just the JDBC part of this application?”
  – “What was the average response time over the past 3, 6, and 9 months?”

Policy-based Alerts

– User specified targets, metrics, and thresholds
  – e.g.: CPU utilization, user response times, etc.
– Flexible notification methods
  – e.g.: Phone, e-mail, fax, SMS, etc.
– Self-correction via pre-defined responses
  – e.g.: Execute a script to shut down low priority jobs
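A threshold rule with an attached self-correcting response might be sketched as below. The rule format, metric names and threshold values are assumptions for illustration; the response function stands in for "execute a script to shut down low priority jobs".

```python
def check_thresholds(metrics, rules):
    """Return alert messages and run the pre-defined response for each breached rule."""
    alerts = []
    for rule in rules:
        value = metrics.get(rule["metric"])
        if value is not None and value > rule["threshold"]:
            alerts.append(
                f"{rule['target']}: {rule['metric']}={value} exceeds {rule['threshold']}"
            )
            response = rule.get("response")
            if response is not None:
                response(rule["target"])
    return alerts


stopped_jobs = []


def stop_low_priority_jobs(target):
    # Placeholder for a script that shuts down low priority jobs on the target.
    stopped_jobs.append(target)


rules = [
    {"target": "node1", "metric": "cpu_utilization", "threshold": 90,
     "response": stop_low_priority_jobs},
    {"target": "node1", "metric": "response_time_ms", "threshold": 2000},
]
alerts = check_thresholds({"cpu_utilization": 97, "response_time_ms": 350}, rules)
```

Only the CPU rule is breached here, so one alert is produced and its response runs once; the returned messages could then be handed to whichever notification method is configured.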

Agenda


LHC Computing Grid Project

Oracle-based Production Services for LCG 1

Goals

To offer production quality services for LCG 1 to meet the requirements of forthcoming (and current!) data challenges

– e.g. CMS PCP/DC04, ALICE PDC-3, ATLAS DC2, LHCb CDC’04

To provide distribution kits, scripts and documentation to assist other sites in offering production services

To leverage the many years’ experience in running such services at CERN and other institutes

– Monitoring, backup & recovery, tuning, capacity planning, …

To understand the experiments’ requirements for how these services should be established and extended, and to clarify current limitations

Not targeting small-medium scale DB apps that need to be run and administered locally (to user)

What Services?

POOL file catalogue using EDG-RLS (also non-POOL!)
– LRC + RLI services + client APIs
– For GUID <-> PFN mappings

and EDG-RMC
– For file-level meta-data: POOL currently stores: filetype (e.g. ROOT file), fully registered, job status
– Expect also ~10 items from CMS DC04: others?

plus (service behind) EDG Replica Manager client tools

Need to provide robustness, recovery, scalability, performance, …

File catalogue is a critical component of the Grid!
– Job scheduling, data access, …

The Supported Configuration

All participating sites should run:

– A “Local Replica Catalogue” (LRC)
  – Contains GUID <-> PFN mapping for all local files
– A “Replica Location Index” (RLI) <-- independent of EDG deadlines
  – Allows files at other sites to be found
  – All LRCs are configured to publish to all remote RLIs
    – Scalability beyond O(10) sites??
    – Hierarchical and other configurations may come later…
– A “Replica Metadata Catalogue” (RMC)
  – Not proposing a single, central RMC
  – Jobs should use local RMC
  – Short-term: handle synchronisation across RMCs
    – In principle possible today “on the POOL-side” (to be tested)
  – Long-term: middleware re-engineering?
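The relationship between LRC and RLI in this configuration can be shown with a toy model: each site's LRC holds GUID-to-PFN mappings for local files, and every registration is published to the RLI of every other site. The class, the GUID string and the PFN below are hypothetical and do not reflect the EDG-RLS client API.

```python
class Site:
    """One grid site with a Local Replica Catalogue and a Replica Location Index."""

    def __init__(self, name):
        self.name = name
        self.lrc = {}    # GUID -> set of PFNs for files stored at this site
        self.rli = {}    # GUID -> set of remote site names holding a replica
        self.peers = []

    def register(self, guid, pfn):
        """Record a local replica and publish the GUID to every remote RLI."""
        self.lrc.setdefault(guid, set()).add(pfn)
        for peer in self.peers:
            peer.rli.setdefault(guid, set()).add(self.name)

    def locate(self, guid):
        """Return (site, PFNs) pairs: local catalogue first, then sites named by the RLI."""
        found = []
        if guid in self.lrc:
            found.append((self.name, sorted(self.lrc[guid])))
        by_name = {p.name: p for p in self.peers}
        for site_name in sorted(self.rli.get(guid, ())):
            found.append((site_name, sorted(by_name[site_name].lrc[guid])))
        return found


def connect(sites):
    # "All LRCs publish to all remote RLIs": every site knows every other site.
    for site in sites:
        site.peers = [s for s in sites if s is not site]


cern, ral = Site("CERN"), Site("RAL")
connect([cern, ral])
ral.register("guid-0001", "srm://ral.example/data/file1.root")
replicas = cern.locate("guid-0001")
```

The all-to-all publishing is what raises the slide's scalability question: each registration costs one update per remote site, so the fan-out grows with the number of sites.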

Component Overview

[Figure: each of the sites CNAF, RAL, CERN and IN2P3 has a Replica Location Index, a Local Replica Catalog and a Storage Element]

Where should these services be run?

– At sites that can provide supported h/w & O/S configurations (next slide)
– At sites with existing Oracle support team
– We do not yet know whether we can make Oracle-based services easy enough to set up (surely?) and run (should be for canned apps?) where existing Oracle experience is not available
  – Will learn a lot from current roll-out
  – Pros: can benefit from scripts / doc / tools etc.
  – Other sites: simply re-extract catalog subset from nearest Tier1 in case of problems?
  – Need to understand use-cases and service level

Requirements for Deployment

– A farm node running Red Hat Enterprise Linux and Oracle9iAS
  – Runs Java middleware for LRC, RLI etc.
  – One per VO
– A disk server running Red Hat Enterprise Linux and Oracle9i
  – Data volume for LCG 1 small (~10^5 – 10^6 entries, each < 1KB)
  – Query / lookup rate low (~1 every 3 seconds)
  – Projection to 2008: 100 – 1000Hz; 10^9 entries
  – Shared between all VOs at a given site
– Site responsible for acquiring and installing h/w and RHEL
  – $349 for ‘basic edition’: http://www.redhat.com/software/rhel/es/
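The sizing figures on this slide can be turned into rough upper bounds. The calculation reads the entry counts as powers of ten (10^5 to 10^6 now, 10^9 projected), takes 1 KB as 1000 bytes, and assumes entries stay under 1 KB in 2008; the last point is an assumption, since the slide gives no projected entry size.

```python
ENTRY_BYTES = 1000                  # upper bound: each entry is "< 1KB"

lcg1_entries = (10**5, 10**6)
lcg1_bytes = tuple(n * ENTRY_BYTES for n in lcg1_entries)   # at most 100 MB to 1 GB

SECONDS_PER_LOOKUP = 3              # "~1 every 3 seconds", i.e. about 0.33 Hz
rate_2008_hz = (100, 1000)
# Growth factor = projected rate divided by the current rate of 1/3 Hz.
rate_growth = tuple(r * SECONDS_PER_LOOKUP for r in rate_2008_hz)

entries_2008 = 10**9
bytes_2008 = entries_2008 * ENTRY_BYTES   # about 1 TB under the stated assumption
```

So the LCG 1 catalogue fits comfortably on a single disk server, while the 2008 projection implies a lookup rate 300 to 3000 times higher and roughly a thousand times more data.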







What if?

– DB server dies
  – No access to catalog until new server configured & DB restored
  – ‘Hot standby’ or clustered solution offers protection against most common cases
  – Regular dump of full catalog into alternate format, e.g. POOL XML?
– Application server dies
  – Stateless, hence relatively simple move to a new host
    – Could share with another VO
  – Handled automatically with application server clusters
– Data corrupted
  – Restore or switch to alternate catalog
– Software problems
  – Hardest to predict and protect against
  – Could cause running jobs to fail and drain batch queues!
  – Very careful testing, including by experiments, before move to a new version of the middleware (weeks, including smallish production run?)
– Need to foresee all possible problems, establish recovery plan and test!

What happens during period when catalog is unavailable?

Backup & Recovery, Monitoring

– Backend DB included in standard backup scheme
  – Daily full, hourly incrementals + archive log – allows point in time recovery
  – Need additional logging plus agreement with experiments to understand ‘point in time’ to recover to – and testing!
– Monitoring: both at box-level (FIO) and DB/AS/middleware
– Need to ensure problems (inevitable, even if undesirable) are handled gracefully
– Recovery tested regularly, by several members of the team
– Need to understand expectations:
  – Catalog entries guaranteed for ever?
  – Granularity of recovery?
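How a daily full, hourly incremental and archive log scheme supports point-in-time recovery can be sketched as a selection problem: restore the last full backup before the target time, apply the incrementals taken since, then replay the archive log up to the target. The function and the dates are illustrative assumptions and do not describe the actual CERN backup tooling.

```python
from datetime import datetime, timedelta


def restore_plan(target, fulls, incrementals):
    """Pick the latest full backup before `target` and the incrementals taken after it."""
    base = max(t for t in fulls if t <= target)
    steps = sorted(t for t in incrementals if base < t <= target)
    last = steps[-1] if steps else base
    # The archive log covers the gap between the last backup and the target time.
    return {"full": base, "incrementals": steps, "replay_archive_log": (last, target)}


day = datetime(2003, 9, 1)  # arbitrary example date
fulls = [day, day + timedelta(days=1)]  # daily full at midnight
incrementals = [day + timedelta(hours=h) for h in range(1, 48) if h != 24]  # hourly
target = day + timedelta(days=1, hours=3, minutes=20)
recovery = restore_plan(target, fulls, incrementals)
```

For a target of 03:20 on the second day, the plan uses that day's midnight full backup, three hourly incrementals, and twenty minutes of archive log, which is why agreeing on the exact 'point in time' with the experiments matters.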

Recommended Usage - Now

POOL jobs: recommend extracting catalog sub-set prior to job and post-cataloging new entries as separate step

Non-POOL jobs, e.g. EDG-RM client: minimum, test RC and implement simple retry + provide enough output in job log for manual recovery if necessary

– Perpetual retry inappropriate if e.g. configuration error

In all cases, need to foresee hiccoughs in service, e.g. 1 hour, particularly during ramp-up phase

Please provide us with examples of your usage so that we can ensure adequate coverage by test suite!

Strict naming convention essential for any non-trivial catalogue maintenance
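The advice for non-POOL jobs (check the result, retry simply, log enough for manual recovery, and do not retry forever on a configuration error) could be sketched as follows. The sketch uses Python exceptions in place of a return code, and `ConfigurationError`, `call_with_retry` and `flaky_register` are hypothetical names, not part of the EDG-RM client.

```python
import logging
import time

log = logging.getLogger("catalog_job")


class ConfigurationError(Exception):
    """Raised for errors that retrying cannot fix."""


def call_with_retry(operation, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Run `operation`, retrying a bounded number of times on transient failure."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except ConfigurationError:
            log.error("configuration error on attempt %d, not retrying", attempt)
            raise
        except Exception as exc:
            # Log enough detail for manual recovery from the job log.
            log.warning("attempt %d/%d failed: %s", attempt, attempts, exc)
            if attempt == attempts:
                raise
            sleep(delay_s)


calls = []


def flaky_register():
    # Simulated catalogue call that fails twice before succeeding.
    calls.append(1)
    if len(calls) < 3:
        raise IOError("catalogue unavailable")
    return "registered"


result = call_with_retry(flaky_register, delay_s=0, sleep=lambda s: None)
```

For a service hiccough on the order of an hour, a real job would want a longer delay between attempts than shown here; the bounded attempt count is what keeps a misconfigured job from looping indefinitely.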

Status

– RLS/RLI/RMC services deployed at CERN for each experiment + DTEAM
  – RLSTEST service also available, but should not be used for production!
– Distribution mechanism, including kits, scripts and documentation available and ‘well’ debugged
– Only 1 outside site deployed so far (Taiwan) – others in the pipeline
  – FZK, RAL, FNAL, IN2P3, NIKHEF …
– We need help to define list and priorities!
– Actual installation rather fast (max a few hours)
– Lead time can be long
  – Assign resources etc – a few weeks!
– Plan is (still) to target first sites with Oracle experience to make scripts & doc as clear and smooth as possible
  – Then see if it makes sense to go further…

Registration for Access to Oracle Kits

– Well known method of account registration in dedicated group (OR)
– Names will be added to mailing list to announce e.g. new releases of Oracle s/w, patch sets etc.
– Foreseeing much more gentle roll-out than for previous packages
– Initially just DBAs supporting canned apps
  – RLS backend, later potential conditions DB if appropriate
– For simple, moderate-scale DB apps, consider use of central Sun cluster, already used by all LHC experiments
– Distribution kits, scripts etc in afs
  – /afs/cern.ch/project/oracle/export/
– Documentation also via Web
  – http://cern.ch/db/RLS/

Links

http://cern.ch/wwwdb/grid-data-management.html

High level overview of the various components; pointers to presentations on use-cases etc

http://cern.ch/wwwdb/RLS/

Detailed installation & configuration instructions

http://pool.cern.ch/talksandpubl.html

File catalog use-cases, DB requirements, many other talks…


Future Possibilities

Investigating resilience against h/w failure using Application Server & Database clusters

AS clusters also facilitate move of machines, addition of resources, optimal use of resources etc.

DB clusters (RAC) can be combined with stand-by databases and other techniques for even greater robustness

(Greatly?) simplified deployment, monitoring and recovery can be expected with Oracle10g

Summary

Addressing production-quality DB services for LCG 1

Clearly work in progress, but basic elements in place at CERN, deployment just starting outside

Based on experience and knowledge of Oracle products, offering distribution kits, documentation and other tools to those sites that are interested

Need more input on requirements and priorities of experiments regarding production plans

Q&A: Questions and Answers
