
Page 1: Deep Computing with IBM Systems

Barry Bolding, Ph.D.
IBM Deep Computing
SciComp 2005

Page 2: Deep Computing Components

High Performance Computing Leadership
Research and Innovation
Systems Expertise
– pSeries
– xSeries
– Storage
– Networking
Innovative Systems

Page 3: Deep Computing Focus

Government Research Labs
– Energy and Defense
Weather/Environmental
– Weather Forecasting Centers
– Climate Modeling
Higher Education/Research Universities
Life Sciences
– Pharma, BioTech, Chemical
Aero/Auto
Petroleum
Business Intelligence, Digital Media, Financial Services, On Demand HPC

Page 4: Deep Computing Teams and Organization

Page 5: Deep Computing Technical Team

Kent Winchell – Technical Team, Deep Computing
Barry Bolding – Technical Manager, Public Sector
Farid Parpia – HPC Applications, Life Sciences
John Bauer – HPC Storage, Government, HPC
Wei Chen – EDA, Asia Pacific HPC
Charles Grassl – Government, Higher Ed.
Stephen Behling – Higher Ed., CFD
Ray Paden – GPFS, HPC Storage
James Abeles – Weather/Environment
Joseph Skovira – Schedulers, CSM
Marcus Wagner – Government, Life Sciences
Jeff Zais – Technical Manager, Industrial Sector
Doug Petesch – Auto/Aero
Martin Feyereisen – Auto/Aero, Business Intelligence
"Suga" Sugavanam – HPC Applications, BlueGene/L
Guangye Li – Auto/Aero
Si MacAlester – Digital Media
Harry Young – Digital Media
Scott Denham – Petroleum
Janet Shiu – Petroleum, Visualization
Len Johnson – Digital Media/Storage

Page 6: IBM Deep Computing Summary of Technology Directions

Page 7: HPC Cluster System Direction – Segmentation Based on Implementation

[Roadmap chart, systems over 2004–2007, segmented by implementation tier:]
– High End – High Value Segment
– Midrange – 'Good Enough' Segment
– Blades – Density Segment
– High Volume – Off Roadmap Segment

Page 8: HPC Cluster Directions

[Roadmap chart, 2004–2010: performance over time for Capacity Clusters and Capability Machines, with milestones at 100 TF and a PF. Annotations: Limited Configurability (Memory Size, Bisection); Extended Configurability; BlueGene; Power, AIX, Federation; Linux Clusters, Less Demanding Communication; Power, Intel, BG, Cell Nodes, Blades; Power, Linux, HCAs; Power (w/Accelerators?), Linux, HCAs; PERCS.]

Page 9: Deep Computing Architecture

[Architecture diagram: an HPC network, a backbone network, and a storage network connect large-memory/bandwidth-driven systems, high-density computing, emerging technologies, and shared storage behind a SAN switch, with gateways, webservers, firewalls and on-demand access serving the user community.]

Page 10: Deep Computing Architecture (Multicluster GPFS)

[Same architecture diagram, with multicluster GPFS spanning several shared-storage pools: HPC, backbone and storage networks connect large-memory/bandwidth-driven systems, high-density computing, emerging technologies, gateways/webservers/firewalls/on-demand access, the user community, and multiple shared storage clusters.]

Page 11: IBM Offerings are Deep and Wide

Storage, networking, system management, tools
pSeries / eServer 1600 – IBM Power4 and Power5 chips – AIX/Linux
xSeries / eServer 1350 – Intel Xeon, AMD Opteron, BladeCenter – Linux, Server 2003
Workstations
HPC Clusters, Grids, Blades
Software, expertise and Business Partners "to tie it all together for your HPC solution"

Page 12: Processor Directions

Power Architectures
– Power4 → Power5 → Power6
– PPC970 → Power6 technology
– BlueGene/L → BlueGene/P
– Cell Architectures (Sony, Toshiba, IBM)
Intel
– IA32 → EM64T (Nocona)
AMD Opteron
– Single-core → dual-core

Page 13: System Design

Power Consumption (not heat dissipation)
– Chips might only be 10-20% of the power on a system/node
New metrics
– Power/ft^2
– Performance/ft^2
– Total cost of ownership (including power/cooling)
Densities today:
– Power5 clusters (p575) = 96 CPUs/rack
– 1U rack-optimized clusters = 128 CPUs/rack
– BladeCenter (PPC/Intel/AMD) = 168 CPUs/rack (dual core will increase this)
– BlueGene = 2048 CPUs/rack
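A minimal sketch of the "performance per floor space" idea raised above, using the per-rack CPU densities from this slide. The per-CPU peak GFLOPS values and the 10 sq ft of floor space per rack (including service clearance) are illustrative assumptions, not figures from the presentation:

    # Hedged sketch: compare the per-rack CPU densities quoted on the slide and
    # derive rough performance-per-square-foot figures. Per-CPU peak GFLOPS and
    # the rack footprint are assumed values for illustration only.
    RACK_FOOTPRINT_SQFT = 10.0  # assumed floor space per rack, incl. clearance

    systems = {
        # name: (CPUs per rack from the slide, assumed peak GFLOPS per CPU)
        "p575 POWER5 cluster":  (96,   7.6),
        "1U rack-optimized IA": (128,  7.2),
        "BladeCenter":          (168,  7.2),
        "BlueGene":             (2048, 2.8),
    }

    for name, (cpus, gflops_per_cpu) in systems.items():
        peak_gflops = cpus * gflops_per_cpu
        print(f"{name:22s} {cpus:5d} CPUs/rack  "
              f"{peak_gflops/1000:6.1f} TFLOPS/rack  "
              f"{peak_gflops/RACK_FOOTPRINT_SQFT:7.1f} GFLOPS/sq ft")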

Page 14: Systems Directions

Optimizing Nodes
– 2, 4, 8, 16 CPU nodes
– Large SMPs
– Rack Optimized Servers and BladeCenter
Optimizing Interconnects
– Higher Performance Networks: HPS, Myrinet, InfiniBand, Quadrics, 10GigE
– Utility Networks: Ethernet, Gigabit, 10GigE
Optimizing Storage
– Global Filesystems (MultiCluster GPFS)
– Avoiding Bottlenecks (NFS, spindle counts, FC adapters and switches)
Optimizing Grid Infrastructure

Page 15: Systems Directions

pSeries
– Power4 systems (p-6xx)
– 2, 4, 8, 16-way Power5 clusters (p-5xx, OpenPower-7xx)
– 32, 64-way Power5 SMPs (p-595)
– BladeCenter cluster (JS20)
xSeries
– Intel EM64T, rack optimized and BladeCenter: x335, x336, HS20, HS40
– AMD Opteron rack optimized: x325, x326, LS20
BlueGene/L
Interconnects
– HPS, Myrinet, IB, GigE, 10GigE

Page 16: Software Directions

System Software
– Unix (AIX, Solaris)
– Linux: Linux on POWER, Linux on Intel and Opteron
– Windows
HPC Software
– Same software on AIX and Linux on POWER: compilers, libraries, tools
– Same HPC infrastructure on Linux/Intel/Opteron and POWER: GPFS, LoadLeveler, CSM
– MultiCluster GPFS
– Grid software
– Backup and storage management

Page 17: Linux Software Matrix

Kernels (not even considering distros)
– 2.4, 2.6
Interconnects
– IB (3 different vendors), Myrinet, Quadrics, GigE (MPICH and LAM)
32- and 64-bit binaries and libraries
Compiler options (Intel, Pathscale, PGI, gcc)
Geometric increase in the number of binaries and sets of libraries that any code developer might need to support.
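A minimal sketch of the combinatorial explosion described above: every combination of kernel, interconnect/MPI, word size and compiler is potentially a separate binary an ISV has to build and test. The option lists are taken loosely from the slide and are illustrative only:

    # Hedged sketch of the build-matrix explosion described on this slide.
    from itertools import product

    kernels       = ["2.4", "2.6"]
    interconnects = ["IB-vendor-A", "IB-vendor-B", "IB-vendor-C",
                     "Myrinet", "Quadrics", "GigE (MPICH)", "GigE (LAM)"]
    word_sizes    = ["32-bit", "64-bit"]
    compilers     = ["Intel", "Pathscale", "PGI", "gcc"]

    combos = list(product(kernels, interconnects, word_sizes, compilers))
    print(f"{len(combos)} distinct build configurations")   # 2 * 7 * 2 * 4 = 112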

Page 18: There are passengers and there are drivers!

IBM is a driver
– POWER (www.power.org)
– Linux on Power and Intel/Opteron, LTC
– BlueGene/L
– STI Cell Architectures
– Open Platform Support
HP, SGI, SUN, Cray are passengers
– Rely primarily on external innovations.

Page 19: Introducing IBM's Deep Computing Organization

Application areas: government, weather forecasting, petroleum exploration, digital media, drug discovery, chip design, crash analysis, financial services

• Clear #1 position in High Performance Computing (Top500, Gartner, IDC, …)
• "Our goal is to solve consistently larger and more complex problems more quickly and at lower cost."

Page 20: The CAE World is in Flux

– Hardware vendors
– Software vendors
– Operating systems
– Cluster computing
– Microprocessors

Most users are seeing dramatic changes in their CAE environment.

Page 21: Evolution of Hardware: drive towards commonality

Mainframes (~1979) → Vectors (~1983) → RISC SMPs (~1994) → Clusters (~2002)

Mostly MSC.Nastran in the early years.
Beginning in 1986, crash simulation drove CAE compute requirements.
SMP architecture was often first introduced in the CFD department and helped push parallel computing.
Cluster architecture (Unix & Linux) now dominates crash and CFD environments.

Page 22: Transition of the CAE environment

[Three bar charts, 1998–2004, showing the percent of workload run serial, on SMPs, on 4–30 CPUs, and on >30 CPUs for crash simulation, CFD simulation, and structural analysis.]

Page 23: Recent Trends – Top 20 Automotive Sites

[Stacked chart, 1997–2003: percent of installed GigaFLOPS by processor family – POWER, IA-32, IA-64, Vector, SPARC, Alpha, PA-RISC, MIPS, Other.]

Source: TOP500 website, http://www.top500.org/lists/2003/11/

Page 24: IBM Power Technology and Products

Page 25: POWER: The Most Scalable Architecture

Binary compatibility across the line:
– Servers: POWER2 → POWER3 → POWER4 → POWER4+ → POWER5
– Embedded: PPC 401, PPC 405GP, PPC 440GP, PPC 440GX
– Desktop/Games: PPC 603e, PPC 750, PPC 750CXe, PPC 750FX, PPC 750GX, PPC 970FX

Page 26: IBM powers Mars exploration

PowerPC is at the heart of the BAE Systems RAD6000 Single Board Computer, a specialized system enabling the Mars Rovers — Spirit and Opportunity — to explore, examine and even photograph the surface of Mars.

In fact, a new generation of PowerPC-based space computers is ready for the next trip to another planet. The RAD750, also built by BAE Systems, is powered by a licensed radiation-hardened PowerPC 750 microprocessor that will power space exploration and Department of Defense applications in the years to come.

IBM returns to Mars.

Page 27: IBM OpenPower / eServer p5 Server Product Line

[Product line chart, entry to high end:]
– High end (no compromises): p5-595 (Std & Turbo), p5-590
– Midrange: p5-570 (Express, Std & Turbo), p5-575
– OpenPower (Linux): OP 720, OP 710
– Entry towers and entry rack: p5-550 (Express & Std), p5-520 (Express & Std), p5-510 (Express & Std)
– POWER4+ systems and workstations: IntelliStation, Model 275
– PPC970+ systems: BladeCenter JS20+ blades
– IBM eServer Cluster 1600: p520, p550, p570, p575, p590, p595

Page 28: pSeries – POWER5 Technology Bottom to Top

p5-510: 19-inch rack; max. rPerf 9.86; 1 or 2 CPUs/node; 1.5/1.65 GHz; 587.2GB internal storage; 0.5 to 32GB memory; 3 PCI-X slots; 0 I/O drawers; 20 LPARs
p5-520: 19-inch rack or deskside; max. rPerf 9.86; 1 or 2 CPUs/node; 1.5/1.65 GHz; 8.2TB internal storage; 0.5 to 32GB memory; 6 to 34 PCI-X slots; 4 I/O drawers; 20 LPARs
p5-550: 19-inch rack or deskside; max. rPerf 19.66; 1, 2 or 4 CPUs/node; 1.5/1.65 GHz; 15.2TB internal storage; 0.5 to 64GB memory; 5 to 60 PCI-X slots; 8 I/O drawers; 40 LPARs
p5-570: 19-inch rack; max. rPerf 77.45; 2, 4, 8, 12 or 16 CPUs/node; 1.5/1.65/1.9 GHz; 38.7TB internal storage; 2 to 512GB memory; 6 to 163 PCI-X slots; 20 I/O drawers; 160 LPARs
p5-575: 24-inch frame, by node; max. rPerf 46.36; 8 CPUs/node; 1.9 GHz; 1.4TB internal storage; 1 to 256GB memory; 0 to 24 PCI-X slots; 1 I/O drawer; 80 LPARs
p5-590: 24-inch frame; max. rPerf 151.72; 8 to 32 CPUs; 1.65 GHz; 9.3TB internal storage; 8 to 1,024GB memory; 20 to 160 PCI-X slots; 8 I/O drawers; 254 LPARs
p5-595: 24-inch frame; max. rPerf 306.21; 16 to 64 CPUs; 1.65/1.9 GHz; 14.0TB internal storage; 8 to 2,048GB memory; 20 to 240 PCI-X slots; 12 I/O drawers; 254 LPARs

All models support Cluster 1600; HACMP™ (AIX 5L™ V5.2) is available across the line (one entry listed as 2Q05).

Page 29: POWER5 architecture

POWER5 enhancements over the POWER4 design:
– Simultaneous multi-threading
– Hardware support for Micro-Partitioning (sub-processor allocation)
– Enhanced distributed switch
– Enhanced memory subsystem: larger L3 cache (36MB), memory controller on-chip
– Improved High Performance Computing (HPC)
– Dynamic power saving (clock gating)

[Chip diagram: two POWER5 cores, 1.9MB L2 cache, L3 directory/control, on-chip memory controller, enhanced distributed switch, with GX+, chip-chip/MCM-MCM and SMP link interfaces to memory and L3.]

1.5, 1.65 and 1.9 GHz; 276M transistors; 0.13 micron

Page 30: eServer p5: Simultaneous multi-threading

[Diagram: per-cycle utilization of the execution units (FX0, FX1, LS0, LS1, FP0, FP1, BRZ, CRL) for POWER4 single-threaded vs. POWER5 simultaneous multi-threading, marking cycles where thread 0, thread 1, or no thread is active, and the resulting system throughput gain (SMT vs. ST).]

– Utilizes unused execution unit cycles
– Presents symmetric multiprocessing (SMP) programming model to software
– Natural fit with superscalar out-of-order execution core
– Dispatch two threads per processor: "It's like doubling the number of processors."
– Net result: better performance, better processor utilization
– Appears as 4 CPUs per chip to the operating system (AIX 5L V5.3 and Linux)
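A minimal sketch of the "4 CPUs per chip" point above: the logical CPU count seen by the OS is simply chips × cores per chip × hardware threads per core. Node sizes in the loop are illustrative:

    # Hedged sketch: logical CPUs visible to the OS on a POWER5 node with SMT
    # (2 cores per chip, 2 hardware threads per core, as described above).
    CORES_PER_CHIP = 2
    SMT_THREADS_PER_CORE = 2

    def logical_cpus(chips: int, smt_enabled: bool = True) -> int:
        threads = SMT_THREADS_PER_CORE if smt_enabled else 1
        return chips * CORES_PER_CHIP * threads

    for chips in (1, 2, 4, 8):
        print(f"{chips} chip(s): {logical_cpus(chips, False):2d} logical CPUs (ST), "
              f"{logical_cpus(chips, True):2d} with SMT")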

Page 31: p5-575 design innovations

Power Distribution Module (DCA): distinctive, high-efficiency, intelligent DC power conversion and distribution subsystem
CPU/Memory Module: single-core POWER5 chips support high memory bandwidth; packaging designed to accommodate dual-core technology
Cooling Module: high-capacity 400 CFM impellers; high-efficiency motors with intelligent control
I/O Module: versatile I/O and service processor; designed to easily support changes in I/O options

Page 32: POWER5 p5-575 vs. p655

                      p5-575                    p655
Drawers / rack:       12                        16
Architecture:         8/16-way POWER5           4/8-way POWER4+
L3 cache:             36 MB / chip / core       32 MB / chip / core, shared in 8-way config
Memory:               1GB – 256GB               4GB – 64GB
Packaging:            42U (24" rack)            42U (24" rack)
DASD / bays:          Two                       Two
I/O expansion:        4 PCI-X                   2 PCI-X
Integrated SCSI:      Two                       One
Integrated Ethernet:  Four 10/100/1000          Two 10/100
RIO2 drawers:         0, ½, or 1 drawer         0, ½, or 1 drawer
Dynamic LPAR:         Yes                       Yes
Redundant power:      Yes (frame)               Yes (frame)
Redundant cooling:    Yes                       Yes

p5-575 system: 42U rack chassis, 2U drawers, 12 drawers per rack.

Page 33: p5-575 and Blue Gene

Largest p5-575 configuration: 12,000+ CPUs (ASCI Purple, LLNL). Largest Blue Gene/L configuration: 131,000 CPUs (LLNL).

p5-575: 64-bit AIX 5L/Linux® cluster node suitable for applications requiring high memory bandwidth and large memory (32GB) per 64-bit processor.
– Scalable systems: 16 to 1,024 POWER5 CPUs (more by special order)
– "Off-the-shelf" and custom configurations
– Standard IBM service and support
– 1,000s of applications supported

Blue Gene®: 32-bit Linux cluster suitable for highly parallel applications with limited memory requirements (256MB per 32-bit processor) and limited or highly parallelized I/O.
– Very large systems: up to 100,000+ PPC440 CPUs
– Custom configurations
– Custom service and support
– Highly effective in highly specialized applications

[Blue Gene/L packaging: chip (2 processors, 2.8/5.6 GF/s, 4MB) → compute card (2 chips, 5.6/11.2 GF/s, 0.5GB DDR) → node board (32 chips, 90/180 GF/s, 8GB DDR) → cabinet (2.9/5.7 TF/s, 256GB DDR) → system (180/360 TF/s, 16TB DDR).]

Page 34: IBM HPC Clusters: Power/Intel/Opteron

Page 35: Cluster 1350 - Value

Leading-edge Linux cluster technology
– Employs high-performance, affordable Intel®, AMD® and IBM PowerPC® processor-based servers
– Capitalizes on IBM's decade of experience in clustering
Thoroughly tested configurations / components
– Large selection of industry-standard components
– Tested for compatibility with major Linux distributions
Configured and tested in our factories
– Assembled by highly trained professionals, tested before shipment to client site
Hardware setup at client site included (except 11U)
– Enables rapid, accurate deployment
Single point of contact for the entire Linux cluster, including third-party components
– Warranty services provided/coordinated for the entire system, including third-party components
– Backed by IBM's unequalled worldwide support organization

Page 36: Or would you rather deal with this?

Page 37: Cluster 1350 - Overview

Integrated Linux cluster solution
– Factory integrated & tested (in Greenock for EMEA); delivered and supported as one product
– Complemented by 3-year IBM warranty services including OEM parts (Cisco, Myrinet, ...)
Broad solution stack portfolio
– Servers: xSeries 336/x346 (Xeon EM64T), eServer 326 (Opteron), blades
– Storage: TotalStorage DS4100/4300/4400/4500
– Networking: Cisco / SMC / Force10 Gigabit Ethernet (commodity networks); Myrinet / InfiniBand (high-performance, low-latency (< 5 µs) networks)
– Software: Cluster Systems Management 1.4 (CSM) for cluster installation & administration; General Parallel File System 2.3 (GPFS) as optional cluster file system
– Services: factory integration & testing, onsite hardware setup (included); SupportLine and software installation (both optional)
– Currently supported and recommended Linux distributions: SUSE Linux Enterprise Server (SLES) 8 & 9, Red Hat Enterprise Linux (RHEL) 3
– More options available via special bid, e.g. other networking gear
For more info see http://www-1.ibm.com/servers/eserver/clusters/

Page 38: Cluster 1350 - Node Choices

xSeries 346: high-availability node for application serving – dual processor support (Nocona/Irwindale), 16GB maximum memory, 6 hot-swap SCSI HDDs, integrated system management, 2U
xSeries 336: highly manageable rack-dense node – dual processor support (Nocona/Irwindale), 16GB maximum memory*, 8 RDIMMs, 2 hot-swap SCSI HDDs, integrated system management, 1U
eServer 326: high-performance rack-dense node – dual processor support (Opteron), 16GB maximum memory (with option), 8 RDIMMs, 2 hot-swap SCSI or 2 fixed SATA HDDs, integrated system management, 1U
BladeCenter with HS20: dual processor support – Nocona (12/04), 14 blades per chassis, 8GB maximum memory per blade, integrated system management, 7U chassis
BladeCenter with JS20: POWER-based BladeCenter blade – 2.2 GHz PPC 970, 2-way blade, 4GB maximum memory, 2 x 60GB IDE drives, 3 daughter cards available (Ethernet, Fibre Channel with boot support, Myrinet)
BladeCenter with LS20*: AMD Opteron-based BladeCenter – single- or dual-core 2-socket blade, 8GB maximum memory, SFF SCSI drives and daughter cards, integrated systems management

Page 39: Blade portfolio continues to build

Common chassis and infrastructure across the BladeCenter family:

HS20 (2-way Xeon): Intel Xeon DP, EM64T, mainstream rack-dense blade, optional hot-swap HDD – edge and mid-tier workloads, collaboration, web serving, high-availability apps
HS40 (4-way Xeon): Intel Xeon MP processors, 4-way SMP capability, supports Windows, Linux and NetWare – back-end workloads, large mid-tier apps
JS20 (PowerPC): two PowerPC® 970 processors, 32-bit/64-bit solution for Linux & AIX 5L™, performance for deep computing clusters – 32- or 64-bit HPC, VMX acceleration, UNIX server consolidation
LS20 (AMD Opteron): two-socket AMD, single and dual core, similar feature set to HS20 – 32- or 64-bit HPC, high memory bandwidth apps

Page 40: Introducing the AMD Opteron LS20

HPC performance with "enterprise" availability feature set:
– Two sockets, 68W processors, single and dual core
– 4 DDR VLP (very low profile) DIMM slots
– Ultra320 non-hot-swap disk with RAID 1
– Supports SFF and legacy I/O expansion cards
– Broadcom dual-port Ethernet

Planned OS support:
– RHEL 4 for 32-bit and x64
– RHEL 3 for 32-bit and x64
– RHEL 2.1 (not at announce)
– SUSE Linux ES 9 for 32-bit and x64

Page 41: How much can you fit in one rack? Your choice!

IBM eServer xSeries 336 (Xeon DP 3.6 GHz)
– IA-32, up to 84 CPUs (8.7 kW) / rack
– price/performance (1058.4 total SPECfp_rate)
– $268.9k list price (604.8 GFLOP peak)
IBM eServer 326 (Opteron 250)
– x86-64, up to 84 CPUs (7.5 kW) / rack
– memory bandwidth (1432.2 total SPECfp_rate)
– $241.9k list price (403.2 GFLOP peak)
IBM eServer BladeCenter HS20 (Xeon DP 3.6 GHz)
– IA-32, up to 168 CPUs (17.3 kW) / rack
– footprint, integration (2116.4 total SPECfp_rate)
– $574.7k list price (1209.6 GFLOP peak)
IBM eServer BladeCenter JS20 (PPC970 2.2 GHz)
– PPC-64, up to 168 CPUs (10.1 kW) / rack
– performance, footprint (1680 total SPECfp_rate)
– $389.7k list price (1478 GFLOP peak)

*Prices are current as of (the date) and subject to change without notice
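A minimal sketch that turns the per-rack list prices, SPECfp_rate totals and peak GFLOPS quoted above into $/GFLOP and SPECfp_rate-per-$1k figures, so the four rack options can be compared on a single scale. All inputs are copied from the slide and were already dated when it was written:

    # Hedged sketch: price/performance for the four rack options above.
    racks = {
        # name: (list price $k, total SPECfp_rate, peak GFLOPS)
        "x336 (Xeon 3.6 GHz)":        (268.9, 1058.4,  604.8),
        "e326 (Opteron 250)":         (241.9, 1432.2,  403.2),
        "BladeCenter HS20 (Xeon)":    (574.7, 2116.4, 1209.6),
        "BladeCenter JS20 (PPC970)":  (389.7, 1680.0, 1478.0),
    }

    for name, (price_k, specfp, gflops) in racks.items():
        dollars_per_gflop = 1000 * price_k / gflops
        specfp_per_1k     = specfp / price_k
        print(f"{name:27s} ${dollars_per_gflop:6.0f}/GFLOP peak   "
              f"{specfp_per_1k:5.1f} SPECfp_rate per $1k")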

Page 42: Cluster 1350 - Compute Node Positioning

e326 nodes – leading price/performance for memory-intensive applications in a server platform that supports both 32-bit and 64-bit applications
x336 and x346 nodes – leading performance and manageability for processor-intensive applications in an IA platform that supports both 32-bit and 64-bit applications
HS20 blades – performance density, integration, and investment protection in an IA platform that supports both 32-bit and 64-bit applications
JS20 blades – leading 64-bit price/performance in a POWER™ processor-based blade architecture, or for applications that can exploit the unique capabilities of VMX

Page 43: Cluster 1350 - Storage Selections

DS300, iSCSI-SCSI: 3U chassis, single or dual 1Gb controllers, up to 2TB
DS400, FC-SCSI: 3U chassis, single or dual controllers, up to 2TB
DS4100 SATA (FAStT 100): 3U chassis, single or dual controllers, up to 3.5TB single, up to 28TB dual
DS4300 / DS4300 Turbo FC (FAStT 600): 3U chassis, up to 8TB (4300), up to 16TB (4300 Turbo), up to 28TB with SATA
DS4400 FC/SATA (FAStT 700): 3U chassis, up to 32TB Fibre, up to 56TB SATA
DS4500 FC (FAStT 900): 3U chassis, up to 32TB Fibre, up to 56TB SATA

Page 44: Interconnect options of e1350 (Intel/Opteron)

Page 45: Cluster 1350 - Network Selections / Ethernet

Cisco 6509
– Used as core switch or aggregation switch
– 8 slots for configuration
– Up to 384 1Gb copper ports
– Up to 32 10Gb fiber ports
– Non-blocking
Cisco 6503
– Used as an aggregation switch or as a small cluster core switch
– 2 slots for configuration
– Up to 96 1Gb copper ports
– Up to 8 10Gb fiber ports
– Max of 5:4 oversubscribed, 'near line rate'
Cisco 4006
– Used as core switch or aggregation switch in very large clusters
– 5 slots for line cards only
– Up to 240 1Gb copper ports
– Max of 3.75:1 oversubscribed
Force10 E600
– Used as core switch or aggregation switch
– Up to 324 ports
– Non-blocking
– Alternative to 6509
SMC 8648T
– Used as core switch in small clusters or aggregation switch in mid-size clusters
– 1U form factor / 48 ports
– Non-blocking by itself
– At best 5:1 blocking in distributed mode
SMC 8624T
– Used as core switch in small clusters or aggregation switch in small clusters
– 1U form factor / 24 ports
– Non-blocking by itself
– At best 5:1 blocking in distributed mode
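A minimal sketch of the oversubscription ratios quoted above: the ratio is just the bandwidth of the node-facing ports divided by the uplink bandwidth of a switch. The 40/8 port split is an illustrative assumption for how a 48-port leaf switch might be used:

    # Hedged sketch: oversubscription ratio for a leaf/aggregation GigE switch.
    def oversubscription(node_ports: int, uplink_ports: int,
                         port_gbps: float = 1.0, uplink_gbps: float = 1.0) -> float:
        return (node_ports * port_gbps) / (uplink_ports * uplink_gbps)

    # e.g. a 48-port switch with 8 ports reserved as uplinks -> 40:8
    print(oversubscription(40, 8))    # 5.0, the "at best 5:1" figure above
    # a non-blocking core needs a ratio of at most 1.0
    print(oversubscription(24, 24))   # 1.0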

Page 46: Some more Cisco Components

Catalyst 3750G-24TS 24-port GigE 1U switch
– 32Gbit backplane, stackable
– Used for small clusters & distributed switch networks
Catalyst 4500 Series Switches
– 3-slot & 6-slot versions
– Lower cost, oversubscribed
– Up to 384 GigE ports
Catalyst 6500 Series Switches
– 3-slot & 9-slot versions
– Higher cost, non-blocking (720Gbit backplane)
– Up to 384 GigE ports
– 10GigE ports available

Page 47: Gigabit Ethernet Details – Force10 Overview

Why Force10?
– High-port-count GigE switches capable of non-blocking throughput are hard to find; the Force10 series is one of the few.
– E600 specifications:
  – 900 Gbps non-blocking switch fabric
  – 1/3-rack chassis (19" rack width)
  – 500 million packets per second
  – 7 line card slots
  – 1+1 redundant RPMs
  – 8:1 redundant SFMs
  – 3+1 & 2+2 redundant AC power supplies
  – 1+1 redundant DC Power Entry Modules

Page 48: New Myrinet switches for large clusters

[Diagram: Clos256 and Clos256+256 spine switches scaling from 256 to 512, 768, 1,024 and 1,280 hosts, with only 320 inter-switch cables in the largest configuration.]

All inter-switch cabling on quad ribbon fiber.

Page 49: TopSpin InfiniBand Switch Portfolio

Topspin 120: 1U fixed chassis; max 8 12X ports; max 24 4X ports; fixed configuration (rear): 8-port 12X (optical or copper) or 24-port 4X (copper); popular configs (4X/12X): 0/8, 24/0; high availability: redundant power, redundant cooling, dual-box fault tolerance; embedded fabric manager; available Q1CY04
Topspin 270: 6U modular chassis; max 32 12X ports; max 96 4X ports; 8 horizontal slots (rear) taking 4x12X (optical or copper), 12x4X (copper), or hybrid 9x4X + 1x12X modules; popular configs (4X/12X): 96/0, 64/8, 48/16, 0/32; high availability: redundant power/cooling, redundant control, hot-swap interfaces, dual-box fault tolerance; embedded fabric manager; available Q2CY04
Topspin 720*: 8U modular chassis; max 64 12X ports (32 if dual-fabric config); max 192 4X ports (96 if dual-fabric config); 16 vertical slots (rear) taking 4x12X (optical or copper), 12x4X (copper), or hybrid 9x4X + 1x12X modules; popular configs (4X/12X): single fabric 192/0, 128/16, 96/32, 0/64; dual fabric 96/0, 64/8, 48/16, 0/32; high availability: redundant power/cooling, redundant control, hot-swap interfaces, dual fabric or dual-box fault tolerance; embedded fabric manager; available Q4CY04

Page 50: Voltaire InfiniBand Switch Router 9288

Voltaire's largest InfiniBand switch
– 288 4X or 96 12X InfiniBand ports
– Non-blocking bandwidth
– Ideal for clusters ranging from tens to thousands of nodes
Powerful multi-protocol capabilities for SAN/LAN connectivity
– Up to 144 GbE or FC ports
No single point of failure
– Redundant and hot-swappable Field Replaceable Units (FRUs)
– Non-disruptive software update, processor fail-over

Page 51: Cluster 1350 … Today and Tomorrow

Cluster 1350 will continue to expand client choice and flexibility by offering leading-edge technology and innovation in a reliable, factory-integrated and tested cluster system.

High Performance Nodes
– Today: x336, x346, e326, HS20 and JS20
– Tomorrow: dual-core technology, expanded blade-based offerings, new PowerPC technology, emerging technologies, expanded third-party offerings, focus on both commercial and HPC environments
High Speed Switches, Interconnects, and Storage
– Today: Gigabit Ethernet, Myrinet, and InfiniBand; high-performance local and network storage solutions
Leading OS & Cluster Management Software
– Today: Red Hat and SUSE enterprise offerings; LCIT, CSM, GPFS, SCALI
– Tomorrow: leading Linux distributions, enhanced cluster and HPC software
Worldwide Service and Support
– Today: factory hardware integration, single point of contact for warranty service, custom IGS services
– Tomorrow: custom hardware, OS and DB integration services; enhanced cluster support offerings

Page 52: Thank you very much for your attention!

Links for more detailed information and further reading:
– IBM eServer Clusters: http://www-1.ibm.com/servers/eserver/clusters/
– IBM eServer Cluster 1350: http://www-1.ibm.com/servers/eserver/clusters/hardware/1350.html
– Departmental Supercomputing Solutions: http://www-1.ibm.com/servers/eserver/clusters/hardware/dss.html
– IBM eServer Cluster Software (CSM, GPFS, LoadLeveler): http://www-1.ibm.com/servers/eserver/clusters/software/
– IBM Linux Clusters Whitepaper: http://www-1.ibm.com/servers/eserver/clusters/whitepapers/linux_wp.html
– Linux Clustering with CSM and GPFS Redbook: http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246601.html?Open

Page 53: Innovative Technologies

Page 54: Over 460 TF Total IBM Solution (ASCI Purple)

1.5X the total power of the Top 500 list. IBM's proven capability to deliver the world's largest production-quality supercomputers: ASCI Blue (3.9 TF) & ASCI White (12.3 TF), ASCI Pathforward (Federation 4GB Switch).

Three IBM technology roadmaps:
– 100 TF eServer 1600 pSeries cluster: 12,544 POWER5-based processors (7 TF POWER4+ system in 2003)
– 9.2 TF eServer 1350 Linux cluster: 1,924 Intel Xeon processors
– 360 TF Blue Gene/L (from IBM Research): 65,536 PowerPC-based nodes
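A minimal check of the headline figure: the three roadmap systems listed above add up to just under 470 TF, i.e. "over 460 TF":

    # Hedged sketch: sum of the three ASCI Purple roadmap systems above.
    systems_tf = {
        "eServer 1600 pSeries cluster (POWER5)": 100.0,
        "eServer 1350 Linux cluster (Xeon)":       9.2,
        "Blue Gene/L":                           360.0,
    }
    print(f"Total: {sum(systems_tf.values()):.1f} TF")   # 469.2 TF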

Page 55: Blue Gene/L

Packaging hierarchy:
– Chip (2 processors): 2.8/5.6 GF/s, 4 MB
– Compute card (2 chips, 2x1x1): 5.6/11.2 GF/s, 0.5 GB DDR
– Node board (32 chips, 4x4x2; 16 compute cards): 90/180 GF/s, 8 GB DDR
– Cabinet (32 node boards, 8x8x16): 2.9/5.7 TF/s, 256 GB DDR
– System (64 cabinets, 64x32x32): 180/360 TF/s, 16 TB DDR
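A minimal sketch that rebuilds the hierarchy above and checks that the per-level peak and memory numbers follow from 5.6 GF/s and 0.25 GB of DDR per dual-processor chip (the 180 TF/s system figure is the half-peak number quoted on the slide; 360 TF/s is full peak):

    # Hedged sketch: per-level aggregates for the Blue Gene/L packaging hierarchy.
    GF_PER_CHIP = 5.6          # both cores busy, peak
    GB_PER_CHIP = 0.25         # 0.5 GB DDR per 2-chip compute card

    levels = [
        # (name, chips at this level)
        ("Compute card (2 chips)",     2),
        ("Node board (32 chips)",      32),
        ("Cabinet (32 node boards)",   32 * 32),
        ("System (64 cabinets)",       64 * 32 * 32),
    ]

    for name, chips in levels:
        gf = chips * GF_PER_CHIP
        gb = chips * GB_PER_CHIP
        print(f"{name:26s} {gf:9.1f} GF/s peak   {gb:9.1f} GB DDR")
    # System line: ~367,000 GF/s (the slide's 360 TF/s) and 16,384 GB = 16 TB DDR.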

Page 56: Blue Gene/L – The Machine

65,536 nodes interconnected with three integrated networks:

Ethernet
– Incorporated into every node ASIC
– Disk I/O
– Host control, booting and diagnostics

3-Dimensional Torus
– Virtual cut-through hardware routing to maximize efficiency
– 2.8 Gb/s on all 12 node links (total of 4.2 GB/s per node)
– Communication backbone
– 134 TB/s total torus interconnect bandwidth
– 1.4/2.8 TB/s bisectional bandwidth

Global Tree
– One-to-all or all-to-all broadcast functionality
– Arithmetic operations implemented in the tree
– ~1.4 GB/s of bandwidth from any node to all other nodes
– Tree latency of less than 1 µs
– ~90 TB/s total binary tree bandwidth (64k machine)
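A minimal sanity check of the torus figures above for the full 64x32x32 system, assuming 2.8 Gb/s on each of the 12 links per node and counting each link once per node pair; the bisection estimate assumes the minimal cut runs across the longest (64-long) dimension and counts both directions:

    # Hedged sketch: Blue Gene/L 3D torus bandwidth figures.
    X, Y, Z = 64, 32, 32
    NODES = X * Y * Z                    # 65,536
    LINK_GBIT = 2.8                      # Gb/s per link
    LINKS_PER_NODE = 12                  # 6 neighbours, in + out

    per_node_GB = LINKS_PER_NODE * LINK_GBIT / 8          # ~4.2 GB/s per node
    aggregate_TB = NODES * per_node_GB / 2 / 1024         # each link joins 2 nodes

    # Minimal bisection: 2 wrap-around planes of Y*Z links across the X dimension,
    # both directions counted.
    bisection_TB = 2 * Y * Z * 2 * LINK_GBIT / 8 / 1024

    print(f"per node:  {per_node_GB:.1f} GB/s")            # 4.2 GB/s
    print(f"aggregate: {aggregate_TB:.1f} TB/s")           # ~134 TB/s
    print(f"bisection: {bisection_TB:.1f} TB/s")           # ~1.4 TB/s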

Page 57: The BlueGene computer as a central processor for radio telescopes

Bruce Elmegreen, IBM Watson Research Center, 914 945 [email protected]

LOFAR = Low Frequency Array
LOIS = LOFAR Outrigger in Sweden

BlueGene/L at ASTRON: 6 racks, 768 I/Os, 27.5 Tflops

Page 58: Enormous Data Flows from Antenna Stations

LOFAR will have 46 remote stations and 64 stations in the central core.

Each remote station transmits:
– 32,000 channels/ms in one beam, or 8 beams with 4,000 channels
– 8+8 bit (or 16+16 bit) complex data
– 2 polarizations
→ 1–2 Gbps from each station

Each central core station transmits the same data rate in several independent sky directions (for the epoch-of-recombination experiment).
→ 110–300 Gbps input rates to the central processor

[Figure: sample LOFAR station array and antenna array for each station.]
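A minimal sketch showing that the per-station rate follows directly from the numbers above (32,000 channels per millisecond, complex samples, two polarizations); the function name and parameters are just for illustration:

    # Hedged sketch: LOFAR per-station data rate from the slide's parameters.
    def station_gbps(channels_per_ms: int = 32_000,
                     bits_per_component: int = 8,   # 8+8 bit complex; 16+16 doubles it
                     polarizations: int = 2) -> float:
        bits_per_sample = 2 * bits_per_component    # real + imaginary parts
        bits_per_ms = channels_per_ms * bits_per_sample * polarizations
        return bits_per_ms * 1000 / 1e9             # per ms -> per s, bits -> Gb

    print(station_gbps(bits_per_component=8))    # ~1.0 Gb/s
    print(station_gbps(bits_per_component=16))   # ~2.0 Gb/s
    # About 110 stations at 1-2 Gb/s each (plus multiple beams from the core)
    # gives the 110-300 Gb/s aggregate into the central processor.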

Page 59: BlueGene Replaces Specialized Processors

32-processor Mark IV digital correlator (MIT, Jodrell Bank, ASTRON) vs. a 32-node BlueGene/L board with:
1. 64x64-bit comp. product every 2 clock cycles
2. Four Gbps Ethernet I/Os
3. One chip type (dual-core PowerPC)
4. A Linux "feel"

Page 60: Thank You!

Thank you for your time & attention.

Questions?