PAGE 1 Fujitsu Computer SystemsKCCMG 2005 IMPACT A Framework for Grid and Utility Computing Automatic Provisioning and Load Management for Server Farms


A Framework for Grid and Utility Computing: Automatic Provisioning and Load Management for Server Farms and Blade Servers

Joe Bell, Bob Equitz
Fujitsu Computer Systems

KCCMG - Fall IMPACT 2005


TRADEMARKS

• SuperDome and HP Open View Automation Manager are trademarks of HP

• PRIMEPOWER, PRIMEQUEST and Adaptive Services Control Center are trademarks of Fujitsu Ltd., Fujitsu Computer Systems and Fujitsu-Siemens Corp.

• eServer, pSeries, P5, zSeries, z990, Tivoli Provisioning Manager (TPM), Tivoli Intelligent Orchestrator (TIO), JES2, JES3, NJE, YES/MVS, S/360 and SYSPLEX are trademarks of IBM.

• SunFire E25k, N1 Grid System, N1 Grid Service Provisioning System (GSPS), N1 Provisioning Server Blades Edition, N1 Grid Engine, and N1 System Manager are trademarks of SUN.

• OpForce is a trademark of Veritas.

• EMC (Legato) AutoStart is a trademark of EMC (Legato).

• All automated provisioning and virtualization concepts, and the notes relating to them in this presentation, are based on Fujitsu Siemens Corporation documentation material.


What are we going to cover?

- Joe Bell -

• Define Grid and/or Utility Computing
• What’s wrong with today’s typical environment
• The factors pushing Grid and Utility designs
– Business drivers
– Scientific & engineering
• Why can’t we just make everything run faster?
– Speed of light
– Moore’s Law
– Parallelism
– Bottlenecks


What are we going to cover?

- Bob Equitz -

• Review today’s environment
• What are the new technologies we need for Utilities/Grids?
• Why virtualize, and what’s the general process?
– Remove boundaries
– Implement management
– Assign/provision services
• The Road Map to Autonomic Systems
• The process to maintain Autonomic Systems
– Why are they important to our goals
– General configuration end-to-end
• Product review
• Summary


• Bell’s simple minded approach – start with the basics

• Merriam/Webster –

– Grid: a framework, lattice, network, net, or web …
1 : GRATING
2 a (1) : a perforated or ridged metal plate used as a conductor in a storage battery (2) : an electrode consisting of a mesh or a spiral of fine wire in an electron tube (3) : a network of conductors for distribution of electric power; also : a network of radio or television stations
b : a network of uniformly spaced horizontal and perpendicular lines (as for locating points on a map); also : something resembling such a network

What is Grid or Utility Computing Anyway?


• Merriam/Webster –

– Utility: usefulness, service, convenience, function, …
1 : fitness for some purpose or worth to some end
2 : something useful or designed for use
3 a : PUBLIC UTILITY b (1) : a service (as light, power, or water) provided by a public utility (2) : equipment or a piece of equipment to provide such service or a comparable service
4 : a program or routine designed to perform or facilitate especially routine operations (as copying files or editing text) on a computer



• Bell’s feeble mind –

– A Computing or IT Grid Utility:

• A network of heterogeneous computer server and storage platforms that provides the framework and infrastructure for useful, convenient and functional computational services supporting a given set of business and/or scientific IT requirements.

– Or roll your own variation within the boundaries of the previous definitions.



Is this really new stuff?

• Shared Disk – Loosely Coupled Systems
• What was JES2/NJE? JES3?
• Home Grown Load Balancers?
• Resource Affinity Scheduling?
• Serially Reusable Resource Controls Across Multiple Systems?
• SYSPLEX
• YES/MVS


Today’s computing geography – Static and unshared islands

• Dedicated IT systems
• Inflexible use of resources
• Low resource utilization
• Hard to manage
• High TCO, low ROI


Business Imperative: Increase Utilization


Other Business Requirements & Issues

– Databases and files too large for the required nightly linear or sequential massaging and reorganization, so they get split across more isolated systems.

– Peak processing requirements that demand huge amounts of CPU which sit unused during off-peak periods.

– Outage costs that are prevented with duplicate idle hardware resources and software licenses (HA clustering).

– Transactions that must process too much data to meet SLAs: more smarts are required in the application or middleware, or else a very big CPU for a relatively small gain.

– OSs and program products that don’t manage more than one instance very well, and/or don’t share resources (all for one, one for all).

• All of the above and more press IT in the direction of lower resource utilization, or of additional resources that are isolated and underutilized: 180º away from the goal of higher utilization!


Business Objectives

[Charts: "Utilization before Grid Utility" and "Utilization after Grid Utility" (axis 0% to 60%); "Service before Grid Utility – typical bimodal" and "Service with Grid Utility" (axis 0% to 100%)]

better utilization of hundreds of pooled systems more achievable.

• Administering and maintaining workloads - Provisioning


Today’s Low Agility & Efficiency of IT

– “To prevent the data center from consuming the entire IT budget, increased manageability and utilization through standardization and automation are essential”
• Source: 2003 META Group – The Data Center of the Future

– 75% of all IT staff is absorbed by maintaining the existing IT
• Source: Andy Butler, Gartner Group, April 2004

– Utilization of UNIX/Windows servers is low (< 25% over 24 hours across all servers)
• Source: 2003 META Group – The Data Center of the Future


Scientific and Engineering Requirements

• The traditional scientific paradigm
– First do theory on paper
– Then perform experiments to confirm or deny the theory.

• The traditional engineering paradigm
– First do a design
– Then build a laboratory prototype.

• These paradigms are being replaced by numerical experiments and digital prototyping – why?
– Real phenomena are too complicated to model on paper (e.g. climate prediction).
– Real experiments are
• too hard, too expensive, too slow, or too dangerous for a laboratory
• e.g. oil reservoir simulation, large wind tunnels, overall aircraft design, galactic evolution, whole factory or product life cycle design and optimization, weather prediction, nuclear fusion control etc.


Why even use a grid parallel process?

• Scientific and engineering problems requiring the most computing power to simulate are commonly called “Grand Challenges” or “largest problems”

– For example, predicting the climate 50 years ahead is estimated to require computers computing at the rate of 1 TFLOP and with a memory size of 1 TB

• 1 MFLOP = 10^6 floating point operations per second

• 1 GFLOP = 10^9 floating point operations per second

• 1 TFLOP = 10^12 floating point operations per second


• Weather prediction for one week requires 56 GFLOPS.
– Climate prediction for 50 years requires 4.8 TFLOPS.

• The actual grid resolution used in climate codes today is 4 degrees of latitude by 5 degrees of longitude, or about 450 km by 560 km.
– A near term goal is to improve this resolution to 2 degrees by 2.5 degrees, which is four times as much data.

• NASA has launched weather satellites expecting to collect 1 TB of data per day for a period of years
– Totaling > 6 PB (10^15 bytes, petabytes) of data over time.
– No existing system is large enough to store this data today.
– The Sequoia 2000 Global Change Research Project is concerned with building this database.
– http://appl.nasa.gov/pdf/61537main_eosdis_case_study_602904.pdf

• Some other sites: http://www.cio.noaa.gov/hpcc/ http://www.noaa.gov/
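A quick arithmetic check of the resolution and data-volume figures on this slide. This is only a sketch: it counts cells in a single horizontal layer of a uniform latitude/longitude grid (vertical levels and time steps are ignored), and it assumes decimal units (1 PB = 1000 TB). The function and variable names are illustrative, not taken from any climate code.

```python
def grid_cells(lat_step_deg, lon_step_deg):
    """Cells in one horizontal layer of a uniform lat/lon grid over the globe."""
    return (180.0 / lat_step_deg) * (360.0 / lon_step_deg)

coarse_cells = grid_cells(4.0, 5.0)   # today's resolution quoted on the slide
fine_cells = grid_cells(2.0, 2.5)     # the near term goal
growth = fine_cells / coarse_cells    # halving both steps quadruples the cells

# Assumption: decimal units, 1 PB = 1000 TB.
TB_PER_PB = 1000.0
days_to_6_pb = 6.0 * TB_PER_PB / 1.0  # at 1 TB per day
years_to_6_pb = days_to_6_pb / 365.0

print(f"{coarse_cells:.0f} cells -> {fine_cells:.0f} cells ({growth:.0f}x)")
print(f"6 PB at 1 TB/day takes {days_to_6_pb:.0f} days (~{years_to_6_pb:.1f} years)")
```

Halving the step in both directions doubles the cell count along each axis, which is where "four times as much data" comes from.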

A stake in the ground - Weather forecasting


Why Parallelism is Essential

• The clock speed is increasing – can’t we just push it all the way up to 1 THz?

– 1 flop / Hz => 1 TFLOPS

– No: the speed of light sets a limit upon the speed of a computer.

• Now assume a completely sequential computer with 1 TB of memory running at 1 TFLOP.
– If the data has to travel a distance d to get from the memory to the CPU, and it has to travel this distance 10^12 times per second at the speed of light c = 3 x 10^8 m/s, then d <= (3 x 10^8) / 10^12 m = 0.3 mm.
– So the computer theoretically has to fit into a 0.3 mm cube.

• Now consider the 1 TB memory.
– Memory is conventionally built as a planar grid of bits, in our case say a 10^6 x 10^6 grid of words.
– If this grid is 0.3 mm by 0.3 mm, then one word occupies about 3 Angstroms (Å) by 3 Angstroms ((0.3 x 10^-3 m) / (1 x 10^6) per side), or the size of a typical atom.

• Getting close to 3 Angstroms? 1 nm = 10 Å

– 45 nm (current Fujitsu leading edge chip etching) = 450 Å
– 450 Å / 3 Å => 150 atoms for the etching size
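The speed-of-light argument above can be reproduced in a few lines. This sketch uses the slide's own rounded inputs (c = 3 x 10^8 m/s, 10^12 memory-to-CPU trips per second, a 10^6 x 10^6 grid of words, a 45 nm feature size); the function names are made up for illustration.

```python
C_M_PER_S = 3e8        # speed of light, rounded as on the slide
ANGSTROM_M = 1e-10     # 1 Angstrom in metres

def max_signal_distance_m(trips_per_second, c=C_M_PER_S):
    """Farthest a signal can travel per trip if it makes this many trips a second."""
    return c / trips_per_second

def cell_side_m(grid_side_m, words_per_side):
    """Side length available to one word in a square planar grid."""
    return grid_side_m / words_per_side

d_m = max_signal_distance_m(1e12)                      # about 0.3 mm
cell_angstrom = cell_side_m(d_m, 1e6) / ANGSTROM_M     # about 3 Angstroms
feature_angstrom = 45 * 10                             # 45 nm at 10 Angstroms per nm
atoms_across_feature = feature_angstrom / cell_angstrom

print(f"d = {d_m * 1e3:.2f} mm, cell = {cell_angstrom:.1f} A, "
      f"feature = {atoms_across_feature:.0f} cell widths")
```

The result is a bound on a strictly sequential design, which is the slide's point: the only way past it is to do many things at once.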


How Small Can We Go?

From the beginning to the present: on the left, an early computing machine built from mechanical gears; on the right, a state-of-the-art IBM chip with 0.25 micron features. The production version will contain 200 million transistors.

http://www.qubit.org/library/intros/nano/nano.html


Nanocomputing with Quantum Effects

The transition from microtechnology to nanotechnology. The structure on the right is a single-electron transistor (SET) which was carved by the tip of a scanning tunneling microscope (STM). According to classical physics, there is no way that electrons can get from the 'source' to the 'drain', because of the two barrier walls on either side of the 'island'. But the structure is so small that quantum effects occur, and electrons can, under certain circumstances, tunnel through the barriers (but only one electron at a time can do this!). Thus the SET wouldn't work without quantum mechanics.

http://www.qubit.org/library/intros/nano/nano.html


Fast and Parallel

• As of Jan 1996, the fastest machine was an Intel Paragon with 6768 processors and a peak speed of 50 Megaflops/proc, for an overall peak speed of 6768*50 = 338 GFLOPS.
– Doing Gaussian elimination, the machine got 281 GFLOPS on a 128600x128600 matrix; the whole problem takes 84 min.

• The Linpack Benchmark sorts all machines by the speed with which they can solve systems of linear equations Ax=b, of various dimensions, using Gaussian elimination.

• In the Netlib repository there is a long list of computers, together with performance benchmark information.

• As of June 2005, the fastest machine on the TOP500 list (see Top500.org) is the IBM Blue Gene with a peak speed of 183 TFLOPS

• Trips, or the Tera-op Reliable Intelligently Adaptive Processing System. “Our goal is to exploit concurrency, …”

– Part of the Defense Advanced Research Projects Agency's Polymorphous Computing Architectures project. DARPA, which is contributing $15.4 million to Trips, is looking for a chip that is able to scale to 1 trillion sustained operations (tera-op) per second on many applications.
http://www.computerworld.com/hardwaretopics/hardware/story/0,10801,104911,00.html?source=NLT_EMC&nid=104911
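The Paragon figures quoted on this slide are mutually consistent, which a short calculation shows. The sketch assumes the standard leading-order operation count for Gaussian elimination on a dense n x n system, about (2/3) n^3 floating point operations; lower-order terms are ignored, and the function names are illustrative.

```python
def peak_gflops(processors, mflops_per_processor):
    """Aggregate peak rate in GFLOPS."""
    return processors * mflops_per_processor / 1000.0

def lu_flops(n):
    """Leading-order flop count for Gaussian elimination on an n x n matrix."""
    return (2.0 / 3.0) * n ** 3

def solve_minutes(n, sustained_gflops):
    """Run time in minutes at a given sustained rate."""
    return lu_flops(n) / (sustained_gflops * 1e9) / 60.0

paragon_peak = peak_gflops(6768, 50)            # the slide rounds this to 338
paragon_minutes = solve_minutes(128600, 281)    # the slide quotes 84 min
paragon_efficiency = 281 / paragon_peak         # fraction of peak achieved

print(f"peak {paragon_peak:.1f} GFLOPS, run time {paragon_minutes:.0f} min, "
      f"{paragon_efficiency:.0%} of peak")
```

Even on a benchmark this favorable, the machine sustains noticeably less than its peak rate, which sets up the next slide.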


Writing Fast Programs is Hard

• Where do the FLOPS go? Why does the speed depend so much on the problem size? The answer lies in understanding the memory hierarchy. All computers, even cheap ones, look something like this (since IBM S/360 – or 3rd generation):


• The memory at the bottom level of the hierarchy, disk, is large, slow and cheap
– Useful work, such as floating point operations, can only be done on the data at the top of the hierarchy.
– Transferring data among levels is slow, much slower than the rate at which we can do useful work on data in the registers.

• In fact, this data transfer is the bottleneck in almost all computation and numerical analyses
– More time is spent moving data in the hierarchy than doing useful work.
– These are the non-compute related tasks that significantly impact scalability of compute clusters
– Thus enhancing these systems provides future potential




• Good algorithmic designs require keeping active data near the top of the hierarchy for as long as possible, as well as minimizing movement between levels.
– For many problems, like Gaussian elimination, only if the problem is large enough is there enough work to do at the top of the hierarchy to mask the time spent transferring data between lower levels.
– Otherwise, you’re no better than a few sequential processors…

• The more processors one has, the larger the problem has to be to mask this transfer time.

• These mechanisms are inherently inefficient
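A toy model of why the problem must grow with the processor count. It rests on loud assumptions: Gaussian elimination does about (2/3) n^3 flops on n^2 words, every word is moved once over a shared channel of fixed bandwidth, and transfer is fully hidden once the flops available per word match the flops the machine can execute in the time it takes to move one word. The 50 MFLOPS per processor comes from the Paragon slide; the 1e6 words per second channel is a hypothetical figure chosen only to make the numbers readable.

```python
import math

def flops_per_word(n):
    """Work available per word of data for n x n Gaussian elimination (~2n/3)."""
    return ((2.0 / 3.0) * n ** 3) / (n ** 2)

def min_problem_size(processors, flops_per_proc, words_per_second):
    """Smallest n at which compute time covers transfer time in this toy model."""
    # Flops the whole machine can execute while one word is being moved.
    machine_balance = processors * flops_per_proc / words_per_second
    # flops_per_word(n) = 2n/3 >= balance  =>  n >= 1.5 * balance
    return math.ceil(1.5 * machine_balance)

n_one_proc = min_problem_size(1, 50e6, 1e6)
n_hundred_procs = min_problem_size(100, 50e6, 1e6)

print(f"1 processor: n >= {n_one_proc}; 100 processors: n >= {n_hundred_procs}")
```

In this model the required matrix dimension grows linearly with the number of processors, so the data volume grows with its square. Real machines have several hierarchy levels and blocked algorithms that reuse data, so treat the numbers as illustrative only.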



• Moore’s Law
– Speeds of basic microprocessors grow by approximately a factor of 2 every 18 months because:
– The number of transistors doubles every 18 months
– One of the reasons Moore's Law is true is that microprocessor manufacturers are adopting many of the tricks of parallel computing and accounting for memory hierarchies.

– Getting the peak speed from the processor is becoming increasingly difficult.

– Fact: there is no way around the issue today without radical new technology.


To COMPUTE or To COMMUNICATE?

• Which takes longer always depends upon
– The application in hand
– The speed of the processor-memory architecture
– The speed of the network

• For a given problem, any of the above can be a huge “bottleneck”, whether in Business or Scientific Computing

• The bottleneck can be reduced – maybe ..

– at least partially, by introducing large SMP-based entities as elements of the Grid/Utility, with massive interconnect backbones (e.g., HP SuperDome, Fujitsu PRIMEPOWER & PRIMEQUEST, eServer p5 Series, zSeries z990, Sun Fire E25K), to reconcile these mutually exclusive grid design constraints.

– Analyze the requirement for speed and availability versus costs: several large SMPs versus large clusters of 1U commodity servers, both arranged into a grid structure.

– Potential to produce a hybrid of both
– Good R&D Project!


Which one would you challenge?

James Montgomery Doohan (March 3, 1920 – July 20, 2005) was an Irish-Canadian character and voice actor best known for his portrayal of Scotty in the television and movie series Star Trek.

Pig in Mud

“Arguing with an engineer is like wrestling in mud with a pig: After a while you realize the pig likes it!!” --Mark Simmons, Sr. Consulting Engineer and Marketing Product Specialist, FCS


Today’s computing geography – Static and unshared islands

• Inefficient

• Over / Under provisioned

• Hard to manage

• Inflexible

Remember: this is where most of us are today...


Required Core Technologies

Virtualization

Separation of business applications and data from the need for dedicated technology

Automation

Automatic adjustment of platforms and infrastructure to changes in operation & environment

Integration

Low-cost, low-risk implementations & upgrades, re-usable technology, unified processes and services as validated product integration templates


Business Efficiency through Virtualization

• IT resources are shared, not isolated as in today’s “islands of computing” model

• Business priorities determine the allocation of IT resources

• Service levels are predictable and consistent, despite the unpredictable demands for IT services

Server Virtualization Storage Virtualization

Application

Services


Pooling and sharing of the overall resource

• Remove Server Boundaries

• Establish overall management

• Assign Services

• Consolidate Storage

[Figure: Services A, B, C, D and E drawn from the pooled resource, with a chart of Load (axis 0 to 60) for A B C D E]


Automatic provisioning & load management for Applications

Application QoS Metrics

Application instances

Workload Graph

Resource allocation


QoS Monitoring & Management

[Figure: QoS Metric plotted over Time against a Target Metric Range bounded by a High Water Mark and a Low Water Mark]

• Measured QoS metric exceeds the specified maximum acceptable value: allocate more satellite nodes and deploy the needed application to meet the QoS target.

• Measured QoS metric is below the specified minimum acceptable value (too many resources): perform an orderly shutdown of some instances, reducing cost and freeing the resources for other work.
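The high/low water mark rule on this slide amounts to a small threshold controller. The sketch below is a generic illustration, not the logic of any product named in this presentation: the one-instance step size, the instance limits, and the sample metric values (imagined as response times in milliseconds) are all assumptions.

```python
def plan_instances(metric, low_water, high_water, instances,
                   min_instances=1, max_instances=8):
    """Return the instance count to run next, given one QoS measurement."""
    if low_water > high_water:
        raise ValueError("low water mark must not exceed high water mark")
    if metric > high_water:
        # QoS worse than acceptable: provision one more node, up to the limit.
        return min(instances + 1, max_instances)
    if metric < low_water:
        # More resources than needed: shut one instance down in an orderly way.
        return max(instances - 1, min_instances)
    # Inside the target metric range: leave the allocation alone.
    return instances

# Hypothetical response-time samples (ms) and thresholds.
samples = [120, 260, 310, 180, 60, 40]
instances = 2
history = []
for metric in samples:
    instances = plan_instances(metric, low_water=80, high_water=250,
                               instances=instances)
    history.append(instances)

print(history)
```

The gap between the two marks acts as a dead band, so a metric that wobbles around a single threshold does not cause instances to be started and stopped on every sample.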


Virtualization: On the way to Autonomic Systems

[Figure: stages plotted against FLEXIBILITY and COST axes]

Autonomic Systems

• Self configuring

• Self optimizing

• Self protecting

• Self healing

Resource management

• Standardization

• Consolidation

• Automation

Virtualization

• Dynamic provisioning

• Allocation policies

• Consistent QoS


Managing the autonomic cycle

• Autonomic system monitoring & control

[Figure: the Autonomic Cycle (self configuring, self optimizing, self protecting, self healing). Labels as they appear in the figure:]

• Scripts, Policies

• Commands, SNMP set, ...

• “Autonomic” rules, ...

• SNMP values, Commands, System parameters, ...

• Event rules, Thresholds, ...

Resource A (HW, FW, OS, Middleware, Application, System)

Resource B (HW, FW, OS, Middleware, Application, System)

Interface for Monitoring

Monitoring Execution

Event Generation Measures

Event Handling

Interface for Control


Importance of Autonomic Functions

• Benefits include:
– 1) fast and unattended adaptation of IT infrastructure to changing business requirements;

– 2) automated monitoring and immediate reaction to changing workloads without operator intervention and with lower operating risks;

– 3) no changes to applications or operating systems are required; Linux, Windows and other operating systems, and their applications, are managed transparently;

– 4) better response to changing demands;

– 5) easier accommodation of SLAs.

• Total effect: further reduction of wasted or over-utilized resources, fewer personnel monitoring and manually adjusting the systems, and a further reduction of TCO per unit of work accomplished.


Tie it all together: End-to-End Solution

[Figure labels: PRIMERGY (e.g. BX600); Adaptive Service Manager; Storage network; Control network; Client network; Clients; Server; Spare; Console; Monitoring; Adaptation (Restart, Remote Deploy); OS Deployment; Terminal Server Deploy; Shared storage (NAS); Images; Data Areas; Actions; Policies; Inventory]

• Automatic provisioning

• Efficient management of large environment

• Automatic workload management

• Automated policy-based management

• Centralized fully automated operating system and application deployment

• Single image administration


Products for Investigation

• Fujitsu Siemens Adaptive Services Control Center 1.1: automatically provisions, deploys, monitors, and allocates resources, controlling load, utilization, and service level and quality metrics for each application service, per user requirements.

• HP Open View Automation Manager is a data center automation solution that extends the Open View Change and Configuration Management solution to automatically re-provision resources in accordance with business priorities. Automation Manager runs under Windows and supports Windows and Linux servers.

• IBM Tivoli Provisioning Manager (TPM) combined with Tivoli Intelligent Orchestrator (TIO) automates tasks in anticipation of, or in response to, changing conditions. TIO manages pooled resources and prioritizes allocations. TPM provisions resources. TIO monitors performance and decides what actions to take in order to maintain committed application service levels.


Products for Investigation

• EMC (Legato) AutoStart is a cluster solution integrating EMC’s suite of storage products with application availability. AutoStart supports automated switching of servers, networks, and data.

• SUN N1 Grid System, “a collection of architectures, products, and services...” Products available today: N1 Grid Service Provisioning System (GSPS), N1 Provisioning Server Blades Edition, and N1 Grid Engine. N1 System Manager GSPS automates application provisioning on Solaris, Linux, AIX, and Windows servers.

• Veritas OpForce is based on software acquired from Jareva Technologies in 2003. VERITAS positions OpForce for server automation and provisioning, and for managing the IT resource lifecycle. OpForce automates tasks associated with controlling, provisioning, and updating heterogeneous data center environments, including bare-metal discovery, resource pooling, and application and OS software deployment.


Summary

Requirements for Utility/Grid are driven by
• Business TCO pressures
• Scientific/Engineering problem solving requirements

Full-featured Framework for Utility/Grid Computing
• High availability
• High scalability
• Disaster recovery
• Automatic provisioning
• Enterprise applications
• Multi-platform support: Solaris, Linux, Windows, VMware, ...

... And easy installation / operation with instrumentation
• Self managing, self scaling, self healing, self adapting, auto configuration updates
