Essential Overview
Louisiana Tech University, Ruston, Louisiana
Charles Grassl, IBM
January, 2006
© 2005 IBM Corporation

Slide 2: Agenda
  Hardware
  Software
  Documentation

Slide 3: Hardware Overview
  Processors
  Nodes
  Clusters

Slide 4: Product Naming
  New Name   Old Name          Market       Processor
  iSeries    AS400             Commercial   RS64
  pSeries    RS6000, SP, SP2   Technical    POWER3, POWER4, POWER5
  xSeries    IA-32, IA-64      Server       Xeon, AMD
  zSeries    ES9000            Mainframe    RS64

Slide 5: Processor Progression
  Processor   Years   Clock Rate   Feature
  POWER               60 MHz       RISC
  P2SC                150 MHz      Bandwidth
  POWER               450 MHz      Single Chip
  POWER               1.9 GHz      Dual Core
  POWER               1.9 GHz      Multi-Thread

Slide 6: POWER5 Systems
  POWER5 processors
    Single and Dual processor chips
  Modules
    Dual Chip Modules (DCM)
    Multi Chip Modules (MCM)
  Nodes
    Multiple modules
    p5-575
    p5-595
  Cluster
    Multiple nodes
    Connected with High Performance Switch (HPS)

Slide 7: Systems (Nodes)
  Table columns: Model, Processors, Clock Rate (GHz), Memory (x 2^30 byte)
  [table values not recoverable]

Slide 8: POWER5 Processor Systems
  [Diagram labels: Processor, Chip, DCM, MCM, p5-575, p5-595, Cluster]

Slide 9: Cluster 1600
  Multi Processor Nodes
  Physical View
  Logical View
  Network, Disk System

Slide 10: Local System
  Name
  IBM p5-575 nodes
    1.9 GHz POWER5 processors
    Single processor chips
    8 processors per node
    HPS interconnect
  575 distinction: Dual Chip Module (DCM)
    8 DCMs
    One or two processors per chip
      Single Core (SC)
      Dual Core (DC)
  595 distinction: Multi Chip Module (MCM) construction
    8 MCMs

Slide 11: POWER5 Processors
  Multi-processor chip
  High clock rate: Multiple GHz
  Three cache levels
    Bandwidth
    Latency hiding
  Shared Memory
    Large memory size

Slide 12: POWER5 Features
  Private L1 cache
  Shared L2 cache
  Shared L3 cache
  Interleaved memory
  Hardware Prefetch
  Multiple Page Size support

Slide 13: Processor Characteristics
  High frequency clocks
  Deep pipelines
  High asymptotic rates
  Superscalar
  Speculative out-of-order instructions
  Up to 8 outstanding cache line misses
  Large number of instructions in flight
  Branch prediction
  Hardware Prefetching

Slide 14: Processor Features
                       POWER4                 POWER5
  Clock                1.0 to 1.9 GHz         1.5 GHz
  Caches               Three levels           Three levels
  L3 Speed             1/3 clock frequency    clock frequency
  Virtualization       Up to 32 partitions    Up to 254 partitions
  Partitions           Unit processor         Fractional
  Power Management     Static                 Dynamic
  Thread Execution     Single Thread          Multi Threading
  Memory Store         Single Buffer          Double Buffer
  Renaming Registers   GP: 72, FP: 80         GP: 120, FP: 120

Slide 15: Caches and Memory
                     POWER4                   POWER5
  L1 Cache           Data: 32 kbyte           Data: 32 kbyte
                     Instruction: 64 kbyte    Instruction: 64 kbyte
                     2-way Assoc., FIFO       4-way Assoc., LRU
  L2 Cache           1.5 Mbyte                1.9 Mbyte
                     8-way Assoc., FIFO       10-way Assoc., LRU
  L3 Cache           32 Mbyte                 36 Mbyte
                     8-way Assoc., LRU        12-way Assoc., LRU
                     120 Cycles               ~80 Cycles
  Memory Bandwidth   4 Gbyte/s per chip       16 Gbyte/s per chip

Slide 16: POWER4 POWER5 Comparison
  Table columns: POWER4+, POWER5
  [most table values not recoverable; surviving digits shown as extracted]
  Frequency (GHz)
  L2 Latency (Cycles)                  12
  L3 Latency (Cycles)
  Memory Latency (Cycles)
  Copy Bandwidth, 4 proc. (Gbyte/s)    818
  Linpack Rate, N=1000 (Gflop/s)
  SPECint_base
  SPECfp_base

Slide 17: POWER5 Design: Summary
  More gates
    170 million -> 260 million
  Enhancements
    Increased cache associativity
    Increased number of rename registers
    Reduced L3 and cache latency
  New features
    Simultaneous Multi Threading
    Dynamic power management

Slide 18: Processor Systems (Nodes)
  Multiple processors
  Multiple modules
  Various construction formats
    Multi Chip Modules
    Dual Chip Modules
  Shared memory

Slide 19: Multi Chip and Dual Chip Modules
  Multi Chip Module (MCM)
    p5-590
    p5-595
  Dual Chip Module (DCM)
    p5-570
    p5-575
  [Diagram labels: Chip, POWER5 Processor]

Slide 20: Dual Chip Module
  Each Module:
    1 processor chip
    1 L3 cache
    1 Memory card
  Each Processor Chip
    2 processors
      L1 caches
      Registers
      Functional units
    1 L2 cache
    1 path to memory
  [Diagram labels: 36 Mbyte L3, Memory]

Slide 21: Multi Chip Module
  Each Module:
    4 processor chips
    4 L3 cache chips
    2 Memory cards
  Each Processor Chip
    2 processors
      L1 caches
      Registers
      Functional units
    1 L2 cache
    1 path to memory
  [Diagram label: Memory]

Slide 22: POWER5 Multi Chip Module
  Four POWER5 chips
  Four L3 cache chips
  95 mm x 95 mm
  4,491 signal I/Os
  89 layers of metal

Slide 23: POWER5 Dual Chip Module
  One POWER5 chip
    Single or Dual Core
  One L3 cache chip

Slide 24: Modifications to POWER4 System Structure
  [Diagram labels: P, L2, L3, Fab Ctl, Mem Ctl, Memory]

Slide 25: Switch Technology
  Internal network
    In lieu of Gigabit Ethernet, Myrinet, Quadrics, etc.
  Fourth generation
    HPS Switch (POWER2 generation)
    SP Switch (POWER2 -> POWER3)
    SP Switch 2 (POWER3 -> POWER4)
    HPS (POWER4 -> POWER5)
  Multiple links per node
    Match number of links to number of processors

Slide 26: High Performance Switch (HPS)
  Also known as Federation
  Follow on to SP Switch2
    Also known as Colony
  Specifications:
    2 Gbyte/s (bidirectional)
    5 microsecond latency
  Configuration:
    Up to four adaptors per node
    2 links per adaptor
    16 Gbyte/s per node

Slide 27: HPS Specifications
  Table columns: Latency [microsec.], Bandwidth, single [Mbyte/s], Bandwidth, multiple [Mbyte/s]
  Table rows: SP Switch, HPS
  [table values not recoverable]

Slide 28: Software Overview
  Operating System
    AIX
  Compilers
    C
    C++
    Fortran
  Batch Queue
    LoadLeveler (IBM)
    LSF (Platform)
    PBS
    Gridware

Slide 29: AIX
  Current Version: AIX 5.3
  Processors:
    POWER3
    POWER4
    POWER5
  Linux Affinity
  Logical PARtitions (LPAR)
    Nodes
    Operating system
    Memory
    Network connections
  Kernel Address Size:
    64-bit
    32-bit

Slide 30: Linux on POWER
  Native Linux, SuSE7, SuSE8
  RPMs and package managers
  Cluster Systems Manager
  64-bit kernel
  32/64-bit applications support (SuSE8)

  Compiler   User Name
  C          xlc
  C++        xlC
  Fortran    xlf

Slide 31: Compilers
  C and C++
    VisualAge C and C++ Professional for AIX
    Versions 6, 7, 8
    ANSI C
    C++
    Compiler names: xlc, xlC
  Fortran
    XL Fortran for AIX
    Versions 8, 9, 10
    Fortran 77
    Fortran 90
    Compiler names: xlf77, xlf90

Slide 32: Compiler Names
  Compiler      User Name
  Fortran 77    xlf77
  Fortran 90    xlf90
  C             xlc
  C++           xlC
  MPI compile   mpxlf, mpcc
  Reentrant     xlf_r, xlc_r

  AIX uses different compiler names to perform some tasks which are
  handled by compiler flags on most other systems.

Slide 33: Compiler Usage
  Language      Command    Feature       Extension
  ANSI C        xlc        ANSI          .c
                xlc_r      Thread safe   .c
  Extended C    cc         Pre-ANSI      .c
  MPI, C        mpxlc      MPI           .c
  C++           xlC                      .C .cc .cpp
                xlC_r      Thread safe   .C .cc .cpp
  Fortran 77    xlf                      .f
                xlf_r      Thread safe   .f
  Fortran 90    xlf90                    .f
                xlf90_r    Thread safe   .f
  MPI Fortran   mpxlf      MPI           .f

Slide 34: User Limits
  Set by the system administrator
  Ulimit: C or K shell built-in
    Sets or reports resource limits
  Limits are defined in /etc/security/limits
    Sizes are in 512 byte blocks
    Times are in seconds

  $ ulimit -a

Slide 35: Ulimit Defaults
  Limit       Definition              Default Value    Typical Value
  fsize       File Size                                Unlimited (-1)
  core        Core File Size                           Unlimited (-1)
  cpu         Per Process limit       -1 (unlimited)   Unlimited (-1)
  data        Data Segment Size       262144           Unlimited (-1)
  stack       Stack Segment Size      65536*           Unlimited (-1)
  No. files   File Descriptor Limit   2000

  * 64-bit address mode

Slide 36: Other Defaults
  Thread control
  /etc/environment
    AIXTHREAD_SCOPE=S
    AIXTHREAD_MNRATIO=1:1
    AIXTHREAD_COND_DEBUG=OFF
    AIXTHREAD_GUARDPAGES=4
    AIXTHREAD_MUTEX_DEBUG=OFF
    AIXTHREAD_RWLOCK_DEBUG=OFF

Slide 37: Batch Queuing
  Compile on any AIX node
    Use -qarch=pwr5
  Submit job with available batch utility
    Use appropriate queue name
  Available queuing systems:
    LoadLeveler
    PBS
    Gridware
    LSF

Slide 38: Cluster Layout
  [Diagram labels: Compile And Submit Node, Node 0, Node 1, Node 2, Network]

Slide 39: Documentation
  Software: Products A-Z
    X -> xl C, xl C/C++, xl Fortran
  Compilers
    /usr/vac/doc
    /usr/vacpp/doc
    /usr/lpp/xlf/doc
  Redbooks: IBM eServer p5 590 and 595 System Handbook

Slide 40: Documentation
  AIX Commands Reference
    AIX command: /usr/sbin/infocenter
    /opt/ibm_help/help_start.sh
    http://www.unet.univie.ac.at/aix/aixgen/wbinfnav/aixcmdsrefbooks.htm
    Google search: AIX Commands Reference

Slide 41: Documentation Library
  Google search: AIX 5L documentation Library

Slide 42: Summary: Architecture
  System architecture
    Processors
    Nodes
    Cluster
  Processors
    POWER5
    Three levels of cache
  Nodes:
    Eight processor p5-575
  Cluster:
    14 p5-575 nodes
    HPS interconnect
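
Worked check of the quoted figures

Several numbers in the slides can be cross-checked with simple arithmetic: the processor count of the local cluster, the per-node HPS bandwidth, and the ulimit defaults expressed in 512 byte blocks. The Python sketch below is only an illustration. It assumes that the 2 Gbyte/s HPS figure applies to each link (which is consistent with the 16 Gbyte/s per node quoted on the HPS slide) and that "Mbyte" means 2^20 bytes. All constant and function names are hypothetical and do not come from the slides.

```python
# Figures copied from the slides; all names here are illustrative.
NODES = 14                 # "Cluster: 14 p5-575 nodes"
PROCESSORS_PER_NODE = 8    # "8 processors per node"
ADAPTORS_PER_NODE = 4      # "Up to four adaptors per node"
LINKS_PER_ADAPTOR = 2      # "2 links per adaptor"
LINK_GBYTE_PER_S = 2       # "2 Gbyte/s (bidirectional)", assumed to be per link
BLOCK_BYTES = 512          # "Sizes are in 512 byte blocks"


def total_processors(nodes=NODES, per_node=PROCESSORS_PER_NODE):
    """Number of processors in the whole cluster."""
    return nodes * per_node


def hps_node_bandwidth(adaptors=ADAPTORS_PER_NODE,
                       links=LINKS_PER_ADAPTOR,
                       per_link=LINK_GBYTE_PER_S):
    """Aggregate HPS bandwidth of one node in Gbyte/s."""
    return adaptors * links * per_link


def blocks_to_mbyte(blocks, block_bytes=BLOCK_BYTES):
    """Convert a ulimit size given in blocks to Mbyte (2^20 bytes)."""
    return blocks * block_bytes / 2**20


if __name__ == "__main__":
    print("processors in cluster:", total_processors())
    print("HPS Gbyte/s per node: ", hps_node_bandwidth())
    print("default data limit, Mbyte: ", blocks_to_mbyte(262144))
    print("default stack limit, Mbyte:", blocks_to_mbyte(65536))
```

Under these assumptions the cluster has 112 processors, the node bandwidth reproduces the 16 Gbyte/s on the HPS slide, and the default data and stack limits come to 128 Mbyte and 32 Mbyte, which is why sites typically raise them to unlimited.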