Deep Computing with IBM Systems
Barry Bolding, Ph.D., IBM Deep Computing, SciComp 2005
Deep Computing Components
– High Performance Computing Leadership
– Research and Innovation
– Systems Expertise
  – pSeries
  – xSeries
  – Storage
  – Networking
– Innovative Systems
Deep Computing Focus
– Government Research Labs: Energy and Defense
– Weather/Environmental: Weather Forecasting Centers, Climate Modeling
– Higher Education/Research Universities
– Life Sciences: Pharma, BioTech, Chemical
– Aero/Auto
– Petroleum
– Business Intelligence, Digital Media, Financial Services, On Demand HPC
Deep Computing Teams and Organization
Deep Computing Technical Team
– Kent Winchell: Technical Team, Deep Computing
– Barry Bolding: Technical Manager, Public Sector
– Farid Parpia: HPC Applications, Life Sciences
– John Bauer: HPC Storage; Government, HPC
– Wei Chen: EDA, Asia Pacific HPC
– Charles Grassl: Government, Higher Ed.
– Stephen Behling: Higher Ed, CFD
– Ray Paden: GPFS, HPC Storage
– James Abeles: Weather/Environment
– Joseph Skovira: Schedulers, CSM
– Marcus Wagner: Government, Life Sciences
– Jeff Zais: Technical Manager, Industrial Sector
– Doug Petesch: Auto/Aero
– Martin Feyereisen: Auto/Aero, Business Intelligence
– “Suga” Sugavanam: HPC Applications, BlueGene/L
– Guangye Li: Auto/Aero
– Si MacAlester: Digital Media
– Harry Young: Digital Media
– Scott Denham: Petroleum
– Janet Shiu: Petroleum, Visualization
– Len Johnson: Digital Media/Storage
IBM Deep Computing Summary of Technology Directions
HPC Cluster System Direction: Segmentation Based on Implementation
[Chart, 2004-2007: system classes (High End, Midrange, Blades, High Volume) mapped to market segments (High Value, 'Good Enough', Blades/Density, Off Roadmap).]
HPC Cluster Directions
[Roadmap chart, 2004-2010, performance scale 100TF to PF, split into capability machines and capacity clusters:
– Capability machines (limited configurability: memory size, bisection; extended configurability: BlueGene): Power, AIX, Federation today; Power, Linux, HCAs; Power (w/accelerators?), Linux, HCAs; PERCS toward 2010.
– Capacity clusters (less demanding communication): Linux clusters; Power, Intel, BG, Cell nodes; blades.]
Deep Computing Architecture
[Diagram: HPC network, backbone network, and storage network linking large-memory/bandwidth-driven systems, high-density computing, emerging technologies, and shared storage (via SAN switch), with gateways, web servers, firewalls, and on-demand access connecting the user community.]
Deep Computing Architecture (Multicluster GPFS)
[Diagram: the same architecture with Multicluster GPFS; the HPC, backbone, and storage networks now span multiple shared-storage pools accessible to all clusters.]
IBM Offerings are Deep and Wide
– Storage, networking, system management, and tools
– pSeries / eServer 1600: IBM Power4 and Power5 chips, AIX/Linux
– xSeries / eServer 1350: Intel Xeon, AMD Opteron, BladeCenter; Linux, Windows Server 2003
– Workstations
– HPC clusters, grids, and blades
– Software, expertise and Business Partners “to tie it all together for your HPC solution”
Processor Directions
– Power architectures
  – Power4 → Power5 → Power6
  – PPC970 → Power6 technology
  – BlueGene/L → BlueGene/P
  – Cell architectures (Sony, Toshiba, IBM)
– Intel: IA32 → EM64T (Nocona)
– AMD Opteron: single-core → dual-core
System Design
– Power consumption (not just heat dissipation): chips may account for only 10-20% of the power in a system/node
– New metrics:
  – Power/ft^2
  – Performance/ft^2
  – Total cost of ownership (including power/cooling)
– Rack densities today (a worked sketch follows):
  – Power5 clusters (p575) = 96 CPUs/rack
  – 1U rack-optimized clusters = 128 CPUs/rack
  – BladeCenter (PPC/Intel/AMD) = 168 CPUs/rack (dual core will increase this)
  – BlueGene = 2,048 CPUs/rack
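To make the proposed metrics concrete, here is a minimal sketch that turns the CPUs-per-rack figures above into per-square-foot numbers. The CPUs/rack values come from this slide; the watts-per-CPU, GFLOPS-per-CPU, and rack-footprint figures are illustrative assumptions only.

```python
# A minimal sketch of the slide's proposed metrics. CPUs/rack values are
# from the slide; watts/CPU, GFLOPS/CPU and the assumed ~10 ft^2 of floor
# space (rack plus service clearance) are illustrative assumptions.
RACK_FOOTPRINT_FT2 = 10.0

racks = {
    # name: (cpus_per_rack, assumed_watts_per_cpu, assumed_gflops_per_cpu)
    "Power5 p575":       (96,   200, 7.6),
    "1U rack-optimized": (128,  120, 7.2),
    "BladeCenter":       (168,  100, 7.2),
    "BlueGene":          (2048,  10, 2.8),
}

for name, (cpus, watts, gflops) in racks.items():
    print(f"{name:18s}  {cpus * watts / RACK_FOOTPRINT_FT2:8.0f} W/ft^2"
          f"  {cpus * gflops / RACK_FOOTPRINT_FT2:8.0f} GFLOPS/ft^2")
```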
Systems Directions
– Optimizing nodes
  – 2-, 4-, 8-, 16-CPU nodes
  – Large SMPs
  – Rack-optimized servers and BladeCenter
– Optimizing interconnects
  – Higher-performance networks: HPS, Myrinet, InfiniBand, Quadrics, 10GigE
  – Utility networks: Ethernet, Gigabit, 10GigE
– Optimizing storage
  – Global filesystems (MultiCluster GPFS)
  – Avoiding bottlenecks (NFS, spindle counts, FC adapters and switches)
– Optimizing grid infrastructure
Systems Directions
– pSeries
  – Power4 systems (p6xx)
  – 2-, 4-, 8-, 16-way Power5 clusters (p5xx, OpenPower 7xx)
  – 32-, 64-way Power5 SMPs (p595)
  – BladeCenter cluster (JS20)
– xSeries
  – Intel EM64T, rack-optimized and BladeCenter: x335, x336, HS20, HS40
  – AMD Opteron, rack-optimized: x325, x326, LS20
– BlueGene/L
– Interconnects: HPS, Myrinet, IB, GigE, 10GigE
Software Directions
– System software
  – Unix (AIX, Solaris)
  – Linux: Linux on POWER; Linux on Intel and Opteron
  – Windows
– HPC software
  – Same software on AIX and Linux on POWER: compilers, libraries, tools
  – Same HPC infrastructure on Linux/Intel/Opteron and POWER: GPFS, LoadLeveler, CSM
  – MultiCluster GPFS
  – Grid software
  – Backup and storage management
Linux Software Matrix
– Kernels (not even considering distros): 2.4, 2.6
– Interconnects: IB (3 different vendors), Myrinet, Quadrics, GigE (mpich and lam)
– 32- and 64-bit binaries and libraries
– Compiler options (Intel, Pathscale, PGI, gcc)
– The result: a geometric increase in the number of binaries and sets of libraries that any code developer might need to support (a quick enumeration follows).
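The "geometric increase" is just the product of the independent choices listed above. A minimal sketch, treating the slide's dimensions as independent axes (splitting GigE into mpich and lam MPI stacks is my reading of the slide):

```python
# Enumerate the build matrix implied by the slide. Treating each
# dimension as fully independent is the simplifying assumption here.
from itertools import product

kernels       = ["2.4", "2.6"]
interconnects = ["IB (vendor 1)", "IB (vendor 2)", "IB (vendor 3)",
                 "Myrinet", "Quadrics", "GigE + mpich", "GigE + lam"]
word_sizes    = ["32-bit", "64-bit"]
compilers     = ["Intel", "Pathscale", "PGI", "gcc"]

builds = list(product(kernels, interconnects, word_sizes, compilers))
print(len(builds), "distinct builds")   # 2 * 7 * 2 * 4 = 112
```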
There are passengers and there are drivers!
– IBM is a driver
  – POWER (www.power.org)
  – Linux on Power and Intel/Opteron, LTC
  – BlueGene/L
  – STI Cell architectures
  – Open platform support
– HP, SGI, SUN, Cray are passengers
  – They rely primarily on external innovations.
Introducing IBM’s Deep Computing Organization
Application areas: government, weather forecasting, petroleum exploration, digital media, drug discovery, chip design, crash analysis, financial services
• Clear #1 position in High Performance Computing (Top500, Gartner, IDC, …)
• “Our goal is to solve consistently larger and more complex problems more quickly and at lower cost.”
The CAE World is in Flux
– Hardware vendors
– Software vendors
– Operating systems
– Cluster computing
– Microprocessors
Most users are seeing dramatic changes in their CAE environment.
Evolution of Hardware: Drive Towards Commonality
Mainframes (~1979) → Vectors (~1983) → RISC SMPs (~1994) → Clusters (~2002)
– The mainframe era was mostly MSC.Nastran.
– Beginning in 1986, crash simulation drove CAE compute requirements.
– SMP architecture was often first introduced in the CFD department and helped push parallel computing.
– Cluster architecture (Unix & Linux) now dominates crash and CFD environments.
Transition of the CAE Environment
[Three charts, 1998-2004: percent of workload by job class (Serial, SMP, 4-30 CPUs, >30 CPUs) for Crash Simulation, CFD Simulation, and Structural Analysis.]
Recent Trends – Top 20 Automotive Sites
[Chart, 1997-2003: percent of installed GigaFLOPS by processor architecture (Other, IA-64, IA-32, Vector, SPARC, Alpha, PA-RISC, POWER, MIPS).]
Source: TOP500 website, http://www.top500.org/lists/2003/11/
IBM Power Technology and Products
POWER: The Most Scalable Architecture (binary compatibility across the line)
– Servers: POWER2 → POWER3 → POWER4 → POWER4+ → POWER5
– Embedded: PPC 401 → PPC 405GP → PPC 440GP → PPC 440GX
– Desktop/Games: PPC 603e → PPC 750 → PPC 750CXe → PPC 750FX → PPC 750GX → PPC 970FX
IBM powers Mars exploration
PowerPC is at the heart of the BAE Systems RAD6000 Single Board Computer, a specialized system enabling the Mars Rovers — Spirit and Opportunity — to explore, examine and even photograph the surface of Mars.
In fact, a new generation of PowerPC based space computers is ready for the next trip to another planet. The RAD750, also built by BAE Systems, is powered by a licensed radiation-hardened PowerPC 750 microprocessor that will power space exploration and Department of Defense applications in the years to come.
IBM returns to Mars.
IBM OpenPower / eServer™ p5 Server Product Line: High-End, No Compromises
[Product-positioning diagram:
– POWER5 systems: p5-510, p5-520, p5-550 (Express & Std) entry towers and racks; p5-570 (Express, Std & Turbo) midrange; p5-575; p5-590 and p5-595 (Std & Turbo) high end
– OpenPower Linux systems: OP 710, OP 720
– PPC970+ systems: IntelliStation workstations (Mdl 275) and JS20+ blades
– POWER4+ systems
– IBM eServer Cluster 1600 configurations spanning p520, p550, p570, p575, p590 and p595]
pSeries: POWER5 Technology Bottom to Top

Model:                 p5-510 / p5-520 / p5-550 / p5-570 / p5-575 / p5-590 & p5-595
Footprint:             19-inch rack / 19-inch rack or deskside / 19-inch rack or deskside / 19-inch rack / 24-inch frame (by node) / 24-inch frame
Max. rPerf:            9.86 / 9.86 / 19.66 / 77.45 / 46.36 / 306.21
CPUs per node:         1, 2 / 1, 2 / 1, 2, 4 / 2, 4, 8, 12, 16 / 8 / 16 to 64
Clock (GHz):           1.5, 1.65 / 1.5, 1.65 / 1.5, 1.65 / 1.5, 1.65, 1.9 / 1.9 / 1.65, 1.9
Int. storage:          587.2GB / 8.2TB / 15.2TB / 38.7TB / 1.4TB / 14.0TB
Memory (GB):           0.5 to 32 / 0.5 to 32 / 0.5 to 64 / 2 to 512 / 1 to 256 / 8 to 2,048
PCI-X slots:           3 / 6 to 34 / 5 to 60 / 6 to 163 / 0 to 24 / 20 to 240
I/O drawers:           0 / 4 / 8 / 20 / 1 / 12
LPARs:                 20 / 20 / 40 / 160 / 80 / 254
Cluster 1600:          Yes for all models
HACMP™ (AIX 5L™ V5.2): Yes (2Q05) / Yes / Yes / Yes / Yes / Yes
POWER5 Architecture
– Simultaneous multi-threading
– Hardware support for Micro-Partitioning (sub-processor allocation)
– Enhanced distributed switch
– Enhanced memory subsystem: larger L3 cache (36MB), memory controller on-chip
– Improved High Performance Computing (HPC) features
– Dynamic power saving: clock gating
– 1.5, 1.65 and 1.9 GHz; 276M transistors; 0.13-micron process
[Chip diagram: two POWER5 cores sharing a 1.9 MB L2 cache, with L3 directory/control, on-chip memory controller, and the enhanced distributed switch; GX+, chip-chip/MCM-MCM, and SMP link interfaces to off-chip L3 and memory.]
~ p5: Simultaneous Multi-Threading
[Diagram: execution-unit pipelines (FX0, FX1, LS0, LS1, FP0, FP1, BRZ, CRL) for POWER4 (single-threaded) vs. POWER5 (simultaneous multi-threading), showing which thread is active each cycle and the system throughput of ST vs. SMT.]
– Utilizes unused execution-unit cycles
– Presents a symmetric multiprocessing (SMP) programming model to software
– A natural fit with a superscalar out-of-order execution core
– Dispatches two threads per processor: “It’s like doubling the number of processors.”
– Net result: better performance and better processor utilization
– Appears as 4 CPUs per chip to the operating system (AIX 5L V5.3 and Linux)
p5-575 Design Innovations
– Power Distribution Module (DCA): distinctive, high-efficiency, intelligent DC power conversion and distribution subsystem
– CPU/Memory Module: single-core POWER5 chips support high memory bandwidth; packaging designed to accommodate dual-core technology
– Cooling Module: high-capacity 400 CFM impellers; high-efficiency motors with intelligent control
– I/O Module: versatile I/O and service processor; designed to easily support changes in I/O options
POWER5 p5-575 vs. p655

Attribute:           p5-575 / p655
Drawers per rack:    12 / 16
Architecture:        8/16-way POWER5 / 4/8-way POWER4+
L3 cache:            36 MB per chip/core / 32 MB per chip/core (shared in 8-way config)
Memory:              1GB-256GB / 4GB-64GB
Packaging:           42U (24" rack) / 42U (24" rack)
DASD / bays:         Two / Two
I/O expansion:       4 PCI-X / 2 PCI-X
Integrated SCSI:     Two / One
Integrated Ethernet: Four 10/100/1000 / Two 10/100
RIO2 drawers:        0, ½, or 1 drawer / 0, ½, or 1 drawer
Dynamic LPAR:        Yes / Yes
Redundant power:     Yes (frame) / Yes (frame)
Redundant cooling:   Yes / Yes

p5-575 system: 42U rack chassis, 2U drawers, 12 drawers per rack.
p5-575 and Blue Gene
Largest p5-575 configuration: 12,000+ CPUs (ASCI PURPLE, LLNL)
– p5-575: 64-bit AIX 5L/Linux® cluster node suitable for applications requiring high memory bandwidth and large memory (32GB) per 64-bit processor.
– Blue Gene®: 32-bit Linux cluster suitable for highly parallel applications with limited memory requirements (256MB per 32-bit processor) and limited or highly parallelized I/O.
p5-575 vs. Blue Gene:
– Scalable systems of 16 to 1,024 POWER5 CPUs (more by special order) / very large systems of up to 100,000+ PPC440 CPUs
– “Off-the-shelf” and custom configurations / custom configurations
– Standard IBM service and support / custom service and support
– 1,000s of applications supported / highly effective in highly specialized applications
[Diagram: Blue Gene/L packaging hierarchy from chip (2 processors, 2.8/5.6 GF/s, 4 MB) through compute card (2 chips, 5.6/11.2 GF/s, 0.5 GB DDR), node board (32 chips / 16 compute cards, 90/180 GF/s, 8 GB DDR), and cabinet (2.9/5.7 TF/s, 256 GB DDR) to system (180/360 TF/s, 16 TB DDR).]
Blue Gene/L configuration at LLNL: 131,000 CPUs
IBM HPC Clusters: Power/Intel/Opteron
Cluster 1350 - Value
– Leading-edge Linux cluster technology
  – Employs high-performance, affordable Intel®, AMD® and IBM PowerPC® processor-based servers
  – Capitalizes on IBM’s decade of experience in clustering
– Thoroughly tested configurations and components
  – Large selection of industry-standard components
  – Tested for compatibility with major Linux distributions
– Configured and tested in our factories
  – Assembled by highly trained professionals; tested before shipment to the client site
– Hardware setup at the client site included (except 11U)
  – Enables rapid, accurate deployment
– Single point of contact for the entire Linux cluster, including third-party components
  – Warranty services provided/coordinated for the entire system, including third-party components
  – Backed by IBM’s unequalled worldwide support organization
Or would you rather deal with this?
Cluster 1350 - Overview
Integrated Linux cluster solution
– Factory integrated & tested (in Greenock for EMEA); delivered and supported as one product
– Complemented by 3-year IBM warranty services including OEM parts (Cisco, Myrinet, ...)
Broad solution stack portfolio
– Servers: xSeries x336/x346 (Xeon EM64T), eServer 326 (Opteron), blades
– Storage: TotalStorage DS4100/4300/4400/4500
– Networking: Cisco / SMC / Force10 Gigabit Ethernet (commodity networks); Myrinet / InfiniBand (high-performance, low-latency (<5 µs) networks)
– Software: Cluster Systems Management 1.4 (CSM) for cluster installation & administration; General Parallel File System 2.3 (GPFS) as an optional cluster file system
– Services: factory integration & testing, onsite hardware setup (included); SupportLine and software installation (both optional)
– Currently supported and recommended Linux distributions: SUSE Linux Enterprise Server (SLES) 8 & 9; Red Hat Enterprise Linux (RHEL) 3
– More options available via ‘special bid’, e.g. other networking gear
For more info, see http://www-1.ibm.com/servers/eserver/clusters/
Cluster 1350 - Node Choices
– xSeries 346: high-availability node for application serving; dual processor support (Nocona/Irwindale); 16GB maximum memory; 6 hot-swap SCSI HDDs; integrated system management; 2U
– xSeries 336: highly manageable rack-dense node; dual processor support (Nocona/Irwindale); 16GB maximum memory*, 8 RDIMMs; 2 hot-swap SCSI hard disk drives; integrated system management; 1U
– eServer 326: high-performance rack-dense node; dual processor support (Opteron); 16GB maximum memory (with option), 8 RDIMMs; 2 hot-swap SCSI or 2 fixed SATA HDDs; integrated system management; 1U
– BladeCenter with HS20: dual processor support, Nocona (12/04); 14 blades per chassis; 8GB maximum memory per blade; integrated system management; 7U chassis
– BladeCenter with JS20: POWER4-technology BladeCenter blade; 2.2 GHz PPC 970, 2-way; 4GB maximum memory; 2 × 60GB IDE drives; 3 daughter cards available (Ethernet, Fibre Channel w/ boot support, Myrinet)
– BladeCenter with LS20*: AMD Opteron-based BladeCenter blade; single- or dual-core, 2-socket; 8GB maximum memory; SFF SCSI drives and daughter cards; integrated systems management
Blade portfolio continues to build
Common chassis and infrastructure across the line:
– LS20 (AMD Opteron): two-socket AMD; single and dual core; similar feature set to HS20. Target apps: 32- or 64-bit HPC; high-memory-bandwidth apps.
– HS20 (2-way Xeon): Intel Xeon DP with EM64T; mainstream rack-dense blade; optional hot-swap HDD. Target apps: high-availability apps; edge and mid-tier workloads; collaboration; web serving.
– HS40 (4-way Xeon): Intel Xeon MP processors; 4-way SMP capability; supports Windows, Linux, and NetWare. Target apps: back-end workloads; large mid-tier apps.
– JS20 (PowerPC): two PowerPC® 970 processors; 32-bit/64-bit solution for Linux & AIX 5L™; performance for deep computing clusters. Target apps: 32- or 64-bit HPC with VMX acceleration; UNIX server consolidation.
Introducing the AMD Opteron LS20
HPC performance with an “enterprise” availability feature set:
– Two sockets; 68W processors; single and dual core
– 4 DDR VLP (very low profile) DIMM slots
– Ultra320 non-hot-swap disk w/ RAID 1
– Supports SFF and legacy I/O expansion cards
– Broadcom dual-port Ethernet
Planned OS support:
– RHEL 4 for 32-bit and x64
– SuSE Linux ES 9 for 32-bit and x64
– RHEL 3 for 32-bit and x64
– RHEL 2.1 (not at announce)
– …
How much can you fit in one rack? Your choice!
– IBM eServer xSeries 336 (Xeon DP 3.6 GHz): IA-32, up to 84 CPUs (8.7 kW) per rack; price/performance (1058.4 total SPECfp_rate); $268.9k list price (604.8 GFLOP peak)
– IBM eServer 326 (Opteron 250): x86-64, up to 84 CPUs (7.5 kW) per rack; memory bandwidth (1432.2 total SPECfp_rate); $241.9k list price (403.2 GFLOP peak)
– IBM eServer BladeCenter HS20 (Xeon DP 3.6 GHz): IA-32, up to 168 CPUs (17.3 kW) per rack; footprint, integration (2116.4 total SPECfp_rate); $574.7k list price (1209.6 GFLOP peak)
– IBM eServer BladeCenter JS20 (PPC970 2.2 GHz): PPC-64, up to 168 CPUs (10.1 kW) per rack; performance, footprint (1680 total SPECfp_rate); $389.7k list price (1478 GFLOP peak)
*Prices are current as of (the date) and subject to change without notice.
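The GFLOP-peak figures above follow from CPUs × clock × floating-point operations per cycle. The flops/cycle values below are my assumptions (2 for Xeon and Opteron, 4 for the PPC970 with its two fused multiply-add units), but they reproduce the slide's numbers:

```python
# Peak GFLOPs = CPUs/rack * clock (GHz) * flops/cycle. The flops/cycle
# values are assumptions chosen to match the slide's quoted figures.
racks = [
    # (node, cpus_per_rack, clock_ghz, flops_per_cycle, quoted_on_slide)
    ("x336 (Xeon 3.6 GHz)",    84, 3.6, 2,  604.8),
    ("e326 (Opteron 250)",     84, 2.4, 2,  403.2),
    ("HS20 (Xeon 3.6 GHz)",   168, 3.6, 2, 1209.6),
    ("JS20 (PPC970 2.2 GHz)", 168, 2.2, 4, 1478.0),
]
for name, cpus, ghz, fpc, quoted in racks:
    print(f"{name:24s} computed {cpus * ghz * fpc:7.1f}  quoted {quoted:7.1f}")
```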
Cluster 1350 - Compute Node Positioning
– e326 nodes: leading price/performance for memory-intensive applications in a server platform that supports both 32-bit and 64-bit applications
– x336 and x346 nodes: leading performance and manageability for processor-intensive applications in an IA platform that supports both 32-bit and 64-bit applications
– HS20 blades: performance density, integration, and investment protection in an IA platform that supports both 32-bit and 64-bit applications
– JS20 blades: leading 64-bit price/performance in a POWER™ processor-based blade architecture, and a fit for applications that can exploit the unique capabilities of VMX
Cluster 1350 - Storage Selections
– DS300 (iSCSI-SCSI): 3U; single or dual 1Gb controllers; up to 2 TB
– DS400 (FC-SCSI): 3U chassis; single or dual controllers; up to 2 TB
– DS4100 (FAStT 100, SATA): 3U chassis; single or dual controllers; up to 3.5 TB single, 28 TB dual
– DS4300 (FAStT 600, FC): 3U chassis; up to 8 TB (4300), 16 TB (4300 Turbo), 28 TB with SATA
– DS4400 (FAStT 700, FC/SATA): 3U chassis; up to 32 TB Fibre, 56 TB SATA
– DS4500 (FAStT 900, FC): 3U chassis; up to 32 TB Fibre, 56 TB SATA
Interconnect options of e1350 (Intel/Opteron)
Cluster 1350 - Network Selections / Ethernet
– Cisco 6509: core or aggregation switch; 8 slots for configuration; up to 384 1Gb copper ports; up to 32 10Gb fiber ports; non-blocking
– Cisco 6503: aggregation switch or small-cluster core switch; 2 slots for configuration; up to 96 1Gb copper ports; up to 8 10Gb fiber ports; max of 5:4 oversubscribed (‘near line rate’)
– Force10 E600: core or aggregation switch; up to 324 ports; non-blocking; alternative to the 6509
– Cisco 4006: core or aggregation switch in very large clusters; 5 slots for line cards only; up to 240 1Gb copper ports; max of 3.75:1 oversubscribed
– SMC 8648T: core switch in small clusters or aggregation switch in mid-size clusters; 1U form factor, 48 ports; non-blocking by itself; at best 5:1 blocking in distributed mode
– SMC 8624T: core or aggregation switch in small clusters; 1U form factor, 24 ports; non-blocking by itself; at best 5:1 blocking in distributed mode
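The blocking ratios above are host-ports-to-uplinks arithmetic. A minimal sketch; the 40/8 port split for an SMC 8648T in distributed mode is an assumption chosen to reproduce the slide's "at best 5:1" figure:

```python
# Oversubscription (blocking) ratio = host-facing ports / uplink ports,
# with all ports at the same line rate. The 40/8 split is an assumption.
def blocking_ratio(host_ports: int, uplink_ports: int) -> float:
    return host_ports / uplink_ports

# SMC 8648T as a leaf switch: 48 ports, 8 reserved as uplinks to the core.
print(blocking_ratio(48 - 8, 8))   # -> 5.0, i.e. 5:1 at best
```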
Some more Cisco Components
– Catalyst 3750G-24TS (24-port GigE, 1U): 32Gbit backplane, stackable; used for small clusters & distributed switch networks
– Catalyst 4500 series switches (3-slot & 6-slot versions): lower cost, oversubscribed; up to 384 GigE ports
– Catalyst 6500 series switches (3-slot & 9-slot versions): higher cost, non-blocking (720Gbit backplane); up to 384 GigE ports; 10GigE ports available
Gigabit Ethernet Details – Force10 Overview
Why Force10? High-port-count GigE switches capable of non-blocking throughput are hard to find; the Force10 series is one of the few.
E600 specifications:
– 900 Gbps non-blocking switch fabric
– 1/3-rack chassis (19" rack width)
– 500 million packets per second
– 7 line-card slots
– 1+1 redundant RPMs
– 8:1 redundant SFMs
– 3+1 & 2+2 redundant AC power supplies
– 1+1 redundant DC Power Entry Modules
New Myrinet switches for large clusters
[Diagram: Clos256 and Clos256+256-spine configurations scaling from 256 to 512, 768, 1,024, and 1,280 hosts with only 320 cables; all inter-switch cabling on quad ribbon fiber.]
TopSpin InfiniBand Switch Portfolio
– Topspin 120: 1U fixed chassis; max 8 12X / 24 4X ports; fixed configuration (rear): 8-port 12X (optical or copper) or 24-port 4X (copper); popular configs (4X/12X): 0/8, 24/0; high availability: redundant power, redundant cooling, dual-box fault tolerance; embedded fabric manager; available Q1CY04
– Topspin 270: 6U modular chassis; max 32 12X / 96 4X ports; 8 horizontal slots (rear) taking 4-port 12X (optical or copper), 12-port 4X (copper), or hybrid (9 × 4X + 1 × 12X) modules; popular configs (4X/12X): 96/0, 64/8, 48/16, 0/32; high availability: redundant power/cooling, redundant control, hot-swap interfaces, dual-box fault tolerance; embedded fabric manager; available Q2CY04
– Topspin 720*: 8U modular chassis; max 64 12X / 192 4X ports (32 / 96 in dual-fabric config); 16 vertical slots (rear) taking the same 12X, 4X, or hybrid modules; popular configs (4X/12X): single fabric 192/0, 128/16, 96/32, 0/64; dual fabric 96/0, 64/8, 48/16, 0/32; high availability: redundant power/cooling, redundant control, hot-swap interfaces, dual-fabric or dual-box fault tolerance; embedded fabric manager; available Q4CY04
Voltaire InfiniBand Switch Router 9288
– Voltaire’s largest InfiniBand switch: 288 4X or 96 12X InfiniBand ports; non-blocking bandwidth; ideal for clusters ranging from tens to thousands of nodes
– Powerful multi-protocol capabilities for SAN/LAN connectivity: up to 144 GbE or FC ports
– No single point of failure: redundant and hot-swappable Field Replaceable Units (FRUs)
– Non-disruptive software updates; processor fail-over
Cluster 1350 … Today and Tomorrow
Cluster 1350 will continue to expand client choice and flexibility by offering leading-edge technology and innovation in a reliable, factory-integrated and tested cluster system.
High-performance nodes
– Today: x336, x346, e326, HS20 and JS20
– Tomorrow: dual-core technology; expanded blade-based offerings; new PowerPC technology; emerging technologies
High-speed switches, interconnects, and storage
– Today: Gigabit Ethernet, Myrinet, and InfiniBand; high-performance local and network storage solutions
– Tomorrow: expanded third-party offerings; focus on both commercial and HPC environments
Leading OS & cluster management software
– Today: Red Hat and SUSE enterprise offerings; LCIT, CSM, GPFS, SCALI
– Tomorrow: leading Linux distributions; enhanced cluster and HPC software
Worldwide service and support
– Today: factory hardware integration; single point of contact for warranty service; custom IGS services
– Tomorrow: custom hardware, OS and DB integration services; enhanced cluster support offerings
Thank you very much for your attention!
Links for more detailed information and further reading:
– IBM eServer Clusters: http://www-1.ibm.com/servers/eserver/clusters/
– IBM eServer Cluster 1350: http://www-1.ibm.com/servers/eserver/clusters/hardware/1350.html
– Departmental Supercomputing Solutions: http://www-1.ibm.com/servers/eserver/clusters/hardware/dss.html
– IBM eServer Cluster Software (CSM, GPFS, LoadLeveler): http://www-1.ibm.com/servers/eserver/clusters/software/
– IBM Linux Clusters Whitepaper: http://www-1.ibm.com/servers/eserver/clusters/whitepapers/linux_wp.html
– Linux Clustering with CSM and GPFS Redbook: http://publib-b.boulder.ibm.com/Redbooks.nsf/RedbookAbstracts/sg246601.html?Open
Innovative Technologies
Over 460 TF Total IBM Solution (1.5X the total power of the Top500 list)
IBM's proven capability to deliver the world's largest production-quality supercomputers:
– ASCI Blue (3.9 TF) & ASCI White (12.3 TF); ASCI Pathforward (Federation 4GB switch)
– Three IBM technology roadmaps:
  – 100 TF eServer 1600 pSeries cluster (ASCI PURPLE): 12,544 POWER5-based processors; 7 TF POWER4+ system in 2003
  – 9.2 TF eServer 1350 Linux cluster: 1,924 Intel Xeon processors
  – 360 TF Blue Gene/L (from IBM Research): 65,536 PowerPC-based nodes
Blue Gene/L
– Chip (2 processors): 2.8/5.6 GF/s, 4 MB
– Compute card (2 chips, 2x1x1): 5.6/11.2 GF/s, 0.5 GB DDR
– Node board (32 chips, 4x4x2; 16 compute cards): 90/180 GF/s, 8 GB DDR
– Cabinet (32 node boards, 8x8x16): 2.9/5.7 TF/s, 256 GB DDR
– System (64 cabinets, 64x32x32): 180/360 TF/s, 16 TB DDR
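The per-level figures roll up by simple multiplication from one core's peak (700 MHz × 4 floating-point ops per cycle = 2.8 GF/s; the second number at each level is the two-core "double" mode). A small sketch reproducing the hierarchy, modulo the slide's rounding:

```python
# Roll up Blue Gene/L peak performance through the packaging levels.
# 2.8 GF/s per PPC440 core (700 MHz x 4 flops/cycle) is the base figure.
gf = 2.8
for level, factor in [("chip", 2),          # 2 processors per chip
                      ("compute card", 2),  # 2 chips per card
                      ("node board", 16),   # 16 compute cards per board
                      ("cabinet", 32),      # 32 node boards per cabinet
                      ("system", 64)]:      # 64 cabinets per system
    gf *= factor
    value, unit = (gf / 1000, "TF/s") if gf >= 1000 else (gf, "GF/s")
    print(f"{level:12s} {value:8.1f} {unit} peak")
# chip 5.6 GF/s, card 11.2 GF/s, board 179.2 GF/s (~180),
# cabinet 5.7 TF/s, system 367 TF/s (~360 after the slide's rounding)
```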
Blue Gene/L - The Machine
65,536 nodes interconnected with three integrated networks:
– Ethernet: incorporated into every node ASIC; used for disk I/O, host control, booting and diagnostics
– 3-dimensional torus: virtual cut-through hardware routing to maximize efficiency; 2.8 Gb/s on all 12 node links (a total of 4.2 GB/s per node); the communication backbone; 134 TB/s total torus interconnect bandwidth; 1.4/2.8 TB/s bisectional bandwidth
– Global tree: one-to-all or all-to-all broadcast functionality; arithmetic operations implemented in the tree; ~1.4 GB/s of bandwidth from any node to all other nodes; tree latency less than 1 µs; ~90 TB/s total binary-tree bandwidth (64k machine)
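The per-node torus figure is straightforward link arithmetic, and the aggregate lands in the same range as the slide's 134 TB/s (the small residual gap is presumably rounding or accounting in the original). A minimal check:

```python
# Per-node torus bandwidth: 6 neighbours x 2 directions = 12 links,
# each at 2.8 Gb/s (0.35 GB/s).
per_node_GBs = 12 * 2.8 / 8
print(per_node_GBs)                      # 4.2 GB/s, as on the slide

# Aggregate: halve the per-node sum, since each link joins two nodes.
nodes = 65536
print(nodes * per_node_GBs / 2 / 1000)   # ~137 TB/s vs. the slide's 134
```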
The BlueGene computer as a central processor for radio telescopes.
Bruce Elmegreen, IBM Watson Research Center, 914 945 [email protected]
– LOFAR = Low Frequency Array; LOIS = LOFAR Outrigger in Sweden
– BlueGene/L at ASTRON: 6 racks, 768 I/Os, 27.5 Tflops
Enormous Data Flows from Antenna Stations
– LOFAR will have 46 remote stations and 64 stations in the central core
– Each remote station transmits:
  – 32,000 channels/ms in one beam, or 8 beams with 4,000 channels
  – 8+8 bit (or 16+16 bit) complex data
  – 2 polarizations
  – 1-2 Gbps from each station (a back-of-envelope check follows)
– Each central-core station transmits the same data rate in several independent sky directions (for the epoch-of-recombination experiment)
– 110-300 Gbps input rates to the central processor
[Figure: sample LOFAR station array and the antenna array for each station]
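The 1-2 Gbps per station follows directly from the parameters above; a back-of-envelope check, reading "32000 channels/ms" as 32,000 complex samples per millisecond (that reading is my assumption):

```python
# Per-station data rate = samples/s x bits per complex sample x polarizations.
samples_per_s = 32000 * 1000               # 32,000 channels every millisecond
for bits in (8 + 8, 16 + 16):              # 8+8 or 16+16 bit complex data
    gbps = samples_per_s * bits * 2 / 1e9  # x2 for the two polarizations
    print(f"{bits:2d} bits/sample -> {gbps:.2f} Gb/s")
# -> 1.02 and 2.05 Gb/s: the slide's "1-2 Gbps from each station"
```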
BlueGene Replaces Specialized Processors
– 32-processor Mark IV digital correlator (MIT, Jodrell Bank, ASTRON), replaced by a 32-node BlueGene/L board with:
  1. A 64x64-bit comp. prod every 2 clock cycles
  2. Four Gbps Ethernet I/Os
  3. One chip type (dual-core PowerPC)
  4. A Linux "feel"
Thank You!
for your time & attention
Questions?