High-Confidence Clustering with Intel® Cluster Ready
Clem Cole
ICR Architect
Cluster Software and Technologies
Intel Software and Solutions Group
INFORMATION IN THIS DOCUMENT IS PROVIDED IN CONNECTION WITH INTEL® PRODUCTS. NO LICENSE, EXPRESS OR IMPLIED, BY ESTOPPEL OR OTHERWISE, TO ANY INTELLECTUAL PROPERTY RIGHTS IS GRANTED BY THIS DOCUMENT. EXCEPT AS PROVIDED IN INTEL'S TERMS AND CONDITIONS OF SALE FOR SUCH PRODUCTS, INTEL ASSUMES NO LIABILITY WHATSOEVER, AND INTEL DISCLAIMS ANY EXPRESS OR IMPLIED WARRANTY, RELATING TO SALE AND/OR USE OF INTEL PRODUCTS INCLUDING LIABILITY OR WARRANTIES RELATING TO FITNESS FOR A PARTICULAR PURPOSE, MERCHANTABILITY, OR INFRINGEMENT OF ANY PATENT, COPYRIGHT OR OTHER INTELLECTUAL PROPERTY RIGHT.
UNLESS OTHERWISE AGREED IN WRITING BY INTEL, THE INTEL PRODUCTS ARE NOT DESIGNED NOR INTENDED FOR ANY APPLICATION IN WHICH THE FAILURE OF THE INTEL PRODUCT COULD CREATE A SITUATION WHERE PERSONAL INJURY OR DEATH MAY OCCUR.
Intel may make changes to specifications and product descriptions at any time, without notice. Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them. The information here is subject to change without notice. Do not finalize a design with this information.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an order number and are referenced in this document, or other Intel literature, may be obtained by calling 1-800-548-4725, or by visiting Intel's Web Site. Intel may make changes to specifications and product descriptions at any time, without notice.
All products, computer systems, dates, and figures specified are preliminary based on current expectations, and are subject to change without notice.
64-bit computing on Intel architecture requires a computer system with a processor, chipset, BIOS, operating system, device drivers and applications enabled for Intel® 64 architecture. Performance will vary depending on your hardware and software configurations. Consult with your system vendor for more information.
The linked sites are not under the control of Intel and Intel is not responsible for the content of any linked site or any link contained in a linked site. Intel reserves the right to terminate any link or linking program at any time. Intel does not endorse companies or products to which it links and reserves the right to note as such on its web pages. If you decide to access any of the third party sites linked to this Site, you do this entirely at your own risk.
Intel, the Intel logo, Intel Core, Intel. Leap ahead., the Intel. Leap ahead. logo, Xeon, and Xeon Inside are trademarks of Intel Corporation in the U.S. and other countries.
*Other names and brands may be claimed as the property of others.
Copyright © 2008, Intel Corporation. All rights reserved.
Legal Notices
Why are you here? or What you should learn
• Why is Intel doing this?
– A better understanding of the HPC market and how Intel contributes
• What does Intel bring to the table?
– The value of Intel® Cluster Ready clusters and Intel® cluster technology
• Why should I care?
– Intel processor-based clusters make it easier for you
Silicon Technology: Intel Execution On-Time 2 Year Cycle
[Figure: process technology by year: 180 nm (1999), 130 nm (2001), 90 nm (2003), 65 nm (2005), 45 nm (2007)]
The Promise of HPC Clustering with Intel® Architecture
• High CPU performance
• Linux maturity
• Broad application availability
Compelling raw performance and price/performance of clusters
[Figure: FLOPS (log scale from 1.E+06 to 1.E+15, with Giga, Tera, and Peta marked) versus year (1985 to 2010) for the 386, 486, Pentium®, Pentium® II, Pentium® III, Pentium® 4, Intel® Core™, and Intel® Core™ 2 architectures. Source: Intel]
The Case for Clusters: Price/performance
• SMP price: ~$300K-$500K
• Volume server price: < $3,000
• Cluster price: 24 x ($3,000 + $1,000/port) = $96,000
• 3-5x price/performance advantage
[Figure: volume servers attached to the ports of a cluster interconnect network]
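The cluster price arithmetic on this slide can be reproduced in a few lines. This is an illustrative sketch only: the function and variable names are hypothetical, and the final ratio compares prices alone, which equals a price/performance ratio only under the assumption that the 24-node cluster and the SMP deliver comparable performance.

```python
# Illustrative arithmetic only; names are hypothetical, figures are from the slide.

def cluster_price(nodes: int, server_price: float, port_price: float) -> float:
    """Total price of a cluster: each node needs one server and one interconnect port."""
    return nodes * (server_price + port_price)


price = cluster_price(nodes=24, server_price=3_000, port_price=1_000)
print(f"Cluster price: ${price:,.0f}")

# SMP price range quoted on the slide (~$300K-$500K).
smp_low, smp_high = 300_000, 500_000

# Assumption: comparable delivered performance, so the price ratio
# stands in for the price/performance advantage.
print(f"Price ratio: {smp_low / price:.1f}x to {smp_high / price:.1f}x")
```

The script reports a cluster price of $96,000 and a price ratio of roughly 3x to 5x, in line with the slide.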
The Cluster Challenge: Many Components Make Cluster Success Hard to Reach
SMP System
• Easy to build, closed environment
• Low latency shared memory
• Simple programming model
• Cache coherency drives up cost
[Figure: SMP system with processors uP 0, uP 1, ... uP (N-1), uP (N) on a coherent memory bus with shared memory]
Cluster System
• Lots of cooperating components
• Higher latency for sharing
• Software complexity
• Lower cost
[Figure: cluster system with nodes, each holding processors, distributed memory, and a network adapter, joined by a cluster interconnect network]
Market Dynamics*
High End (>$500K system)
• Mainly capability & constellations
• Risk takers, early adopters
• IDC: 2007: $3.0B; 2012: $3.9B
Volume
• Mainly capacity
• Clusters >50% of revenue
• Rapid growth since 2002, driven by broader commercial HPC usage
• IDC: 2007: $8.6B; 2012: $14.0B
• > 20% of overall server segment CPU volume
Total segment > $11.6B in 2007. Forecast > $17.9B in 2012.
IDC Segment (System Size)   2007    2012    CAGR
$250K-$499K                 $1.8B   $3.4B   13.9%
$100K-$249K                 $4.2B   $7.1B   11.2%
<$100K                      $2.6B   $3.5B   5.9%
* Source: IDC Worldwide High-Performance and Technical Computing Server 2008-2012 Forecast, March 2008
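The table can be sanity-checked with the standard compound annual growth rate formula, CAGR = (end / start)^(1 / years) - 1. The sketch below is illustrative, with hypothetical names. It works from the rounded revenue figures on the slide, so the computed rates differ from the listed ones by a few tenths of a percent, presumably because IDC computed them from unrounded data.

```python
# Illustrative check of the table; figures in $B are copied from the slide.

def cagr(start: float, end: float, years: int) -> float:
    """Compound annual growth rate, as a percentage."""
    return ((end / start) ** (1 / years) - 1) * 100


# segment: (2007 revenue, 2012 forecast, CAGR as printed on the slide)
segments = {
    "$250K-$499K": (1.8, 3.4, 13.9),
    "$100K-$249K": (4.2, 7.1, 11.2),
    "<$100K": (2.6, 3.5, 5.9),
}

for name, (y2007, y2012, listed) in segments.items():
    print(f"{name:>12}: computed {cagr(y2007, y2012, 5):.1f}%, listed {listed}%")

# The three volume segments should add up to the volume totals on the slide.
volume_2007 = round(sum(v[0] for v in segments.values()), 1)
volume_2012 = round(sum(v[1] for v in segments.values()), 1)
print(f"Volume total: ${volume_2007}B (2007), ${volume_2012}B (2012)")
```

The segment sums come to $8.6B for 2007 and $14.0B for 2012, matching the volume totals quoted on the slide.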
What Do Most HPC Buyers Really Want?
• To get their job done
– Complete their task using HPC applications, e.g. modeling products, fluid flow, chemical analysis, etc.
– Hardware specifics do not matter but …
– must be easy to choose, deploy, and maintain
– must be cost effective
– must scale
• A cluster with characteristics of an enterprise server
– Very positive out of the box experience
– Does not need specialized staff to keep it running
SMP Solutions
• Get Job Done
• Limited in Scale
• Expensive
Cluster Solutions
• Cost Effective
• Scale Well
• Historically Hard to Deploy
Our answer is Intel® Cluster Ready
“Keep it simple… Use a good idea again… Divide and conquer.”
-- “Hints for Computer System Design” by Butler Lampson, Proceedings of Ninth ACM SOSP, Oct 1983.
Intel® Cluster Ready
A validated, interoperable cluster application platform
One application runs on many solutions
One cluster solution runs many applications
One to one application and cluster solutions
Improves quality and compatibility of systems and applications
Intel enables components to work together
From Silicon to Solutions in Volume HPC
• Customer Ready Solutions
• Comprehensive Developer Tool Suites
• Ethernet and Advanced Fabric Components
• ISV Enabling
• Certified Cluster Designs
• Architecture Influence
• Performance-Leading Processors
• HPC-Optimized Platforms
• HPC Systems Engineering
• Interconnect Software
• Registered Applications
• Intel® Cluster Checker Tool
And much more…
• Operating System enabling
• Middleware enabling
• Performance tuning
• Customer support
Why are you here? or What you should learn
• Why is Intel doing this?
– A better understanding of the HPC market and how Intel contributes
• What does Intel bring to the table?
– The value of Intel® Cluster Ready clusters and Intel® cluster technology
• Why should I care?
– Intel processor-based clusters make it easier for you
Intel® Cluster Ready makes it easier to acquire and use clusters built with Intel-based platforms
Architecture Specification
Certification program
Tools
Labeling
Communication
Registered application
Intel® Cluster Ready Program
http://www.intel.com/go/cluster
Intel® Cluster Ready Architecture: Linux Binding
• A minimal common basis for clusters
Industry standards
Minimum runtime interface
Binary portability
Solutions quality
Embraces innovation and differentiation
Full specification is available: http://www.intel.com/go/cluster
[Figure: Intel® Cluster Ready stack. Components shown: application registered with Intel® Cluster Ready; Intel® Cluster Checker; Intel® C/C++, Intel® Fortran, and Intel® MPI Library runtimes; Intel® Runtime & Cluster Math Kernel Libs; Sockets, DAPL, OFED; Linux* distribution and OS kernel; DFS, messaging fabric, cluster installer; cluster hardware node architecture]
Architecture: Example Compute Node Requirements
• At least one Intel® 64 processor
• Minimum of 1 gibibyte† of memory or 0.5 gibibyte per core, whichever is greater
• Ethernet port operating at one Gibibit
• InfiniBand* fabric based on 4X channels or greater
• 10 gibibytes of long term storage per node
† gibibyte = giga-binary-byte. This is equal to exactly 2^30 bytes and is slightly greater than 1 gigabyte (10^9 bytes). Binary SI units are used to remove any ambiguity in the Intel® Cluster Ready specification.
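The memory rule ("1 gibibyte or 0.5 gibibyte per core, whichever is greater") is easy to misread, so here is a minimal sketch of it alongside the processor and storage requirements. All names are hypothetical and this is not part of the specification or of Intel® Cluster Checker. The network requirements are left out because they cannot be expressed as a simple size comparison.

```python
# Hypothetical helper names; thresholds are the example requirements from the slide.
from dataclasses import dataclass
from typing import List

GIB = 2**30  # one gibibyte in bytes


@dataclass
class ComputeNode:
    intel64_processors: int
    cores: int
    memory_bytes: int
    storage_bytes: int


def min_memory_bytes(cores: int) -> int:
    """1 GiB, or 0.5 GiB per core, whichever is greater."""
    return max(1 * GIB, cores * GIB // 2)


def failed_requirements(node: ComputeNode) -> List[str]:
    """Return the requirements the node fails; an empty list means it passes."""
    failures = []
    if node.intel64_processors < 1:
        failures.append("at least one Intel 64 processor")
    if node.memory_bytes < min_memory_bytes(node.cores):
        failures.append("minimum memory")
    if node.storage_bytes < 10 * GIB:
        failures.append("10 GiB of long term storage")
    return failures


node = ComputeNode(intel64_processors=2, cores=8, memory_bytes=4 * GIB, storage_bytes=80 * GIB)
print(failed_requirements(node) or "meets the example requirements")
```

With 8 cores the per-core rule dominates, so the node needs at least 4 GiB. With 1 or 2 cores the 1 GiB floor applies.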
Architecture: Example Kernel Requirements
• Shall materially conform to the Linux* Standard Base (LSB)
• The following kernel versions are sufficient:
• The kernel based on distributions from http://www.kernel.org equal to or later than version 2.6.17
• 2.6.9-42.ELsmp in Red Hat* Enterprise Linux* 4, Update 4
• 2.6.16.21-0.8-smp in SUSE* Linux* Enterprise Edition 10
• A list of tested and known sufficient Linux* distributions and associated kernels can be found at: http://www.intel.com/go/cluster
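As a rough illustration, the kernel version rules above could be checked as follows. This is a simplified sketch with hypothetical names, not how the specification or Intel® Cluster Checker evaluates kernels. It assumes the two distribution kernels are matched exactly as written on the slide, and that any other release string whose leading x.y.z is at least 2.6.17 counts as a sufficiently recent kernel.org-based kernel. LSB conformance is not checked.

```python
# A simplified sketch; function and constant names are hypothetical.
import re

MIN_KERNEL_ORG = (2, 6, 17)

# Distribution kernels named on the slide as sufficient.
KNOWN_SUFFICIENT = {
    "2.6.9-42.ELsmp",     # Red Hat Enterprise Linux 4, Update 4
    "2.6.16.21-0.8-smp",  # SUSE Linux Enterprise Edition 10
}


def kernel_is_sufficient(release: str) -> bool:
    """Check a kernel release string (e.g. from `uname -r`) against the slide's list."""
    if release in KNOWN_SUFFICIENT:
        return True
    match = re.match(r"(\d+)\.(\d+)\.(\d+)", release)
    if match is None:
        return False  # unrecognized format: treat as not sufficient
    version = tuple(int(part) for part in match.groups())
    return version >= MIN_KERNEL_ORG


for release in ("2.6.17", "2.6.16", "2.6.9-42.ELsmp", "2.6.24.3"):
    print(release, kernel_is_sufficient(release))
```

For an authoritative answer, consult the list of tested distributions and kernels referenced on the slide.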
Specification Requires and Advises
• Specification requires freely available Intel runtime binaries
• Specification does not require using Intel tools
• Specification defines how interfaces are installed
• Specification does not say which ones must be used, only that if you install one, where and how it must exist
• Specification advises in areas where there is no “Best Known Method” or industry agreed upon solution
– E.g., it requires that a cluster installer be provided, but does not say which one. It can only advise that a resource manager be made available, although we feel strongly that such a tool is necessary to help keep a cluster healthy
Intel® Cluster Ready reference implementations are built by Intel
Intel® Cluster Ready products are built by Intel partners
Intel® Cluster Checker
Every Intel® Cluster Ready system includes the new Intel® Cluster Checker
• Verifies conformance to the specification
• Checks ‘wellness’ of a cluster
– Example: verifies functionality and performance of the cluster messaging networks
• Provides diagnostic data
• Reduces support
– Example: Easily identifies system issues
• Configurable and extensible
• Intel® Cluster Checker is modular, scalable, and extensible
Intel® Cluster Checker Architecture
Intel Cluster Ready - Roles
ISV
• Build Compatible Apps
• Register Products as CB compliant
• Execute joint marketing
System Vendor
• Design, Build, and Sell Compatible Systems
• Certify Product
• Provide Intel Cluster Checker
Intel
• Define & Enable ICR Platform Architecture
• Build Reference Designs
• Define Certification and Registration Program
• Solutions
Intel® Cluster Ready Ecosystem: Who does what?
• Intel:
• Publishes cluster specification
• Makes reference cluster available to ISVs for testing & verification
• Makes reference recipes available to cluster vendors
• Provides Intel® Cluster Checker to help cluster vendors check systems for spec compliance
• Cluster Vendors
• Design & build systems to meet cluster specification
• Use Intel® Cluster Checker to “certify” system configuration is compliant to the specification
• “Certified” cluster config
• OEM/VAR & Channel Partners use recipes to build compliant clusters
• ISVs
• Write applications stack to the cluster specification
• Test applications against reference clusters
• “Register” their SW
• Component Vendors
• Build HW and SW system components
• Work with cluster vendors and Intel to add to certified configurations
• Once in one recipe, easy to move to another
Intel® Cluster Ready: ecosystem acceleration for all players in the fastest growing HPC segment
“All boats are lifted in a rising tide”
Why are you here? or What you should learn
• Why is Intel doing this?
– A better understanding of the HPC market and how Intel contributes
• What does Intel bring to the table?
– The value of Intel® Cluster Ready clusters and Intel® cluster technology
• Why should I care?
– Intel processor-based clusters make it easier for you
Cluster Solutions Platform Vision: High Confidence Clustering
• The Intel® Cluster Ready Solution provides confidence
• End Users are confident that applications and systems are interoperable providing enhanced productivity and saving time, trouble, and cost
• ISVs are confident that their application is interoperable with Intel® Cluster Ready systems
• Cluster vendors are confident that Intel® Cluster Ready applications are interoperable with their systems
• By
• Comprehensive specification
• Reference implementations for Intel platforms
• Verification and certification tools and program for Intel® Cluster Ready systems
• Software registration to demonstrate applications are interoperable with Intel® Cluster Ready systems
Complete Solution
[Figure: the path to a complete solution. Labels shown: Intel® Cluster Ready Component Certification; Intel® Cluster Ready Hardware Certification (PI hardware, installer); Intel® Cluster Ready Platform Certification (platform solutions provider delivers a SKU that is a certified solution; HW, SW); Intel® Cluster Ready Application Registration (ISV applications); Intel® Cluster Ready Solution Deployment (end users); certified solution from a specific platform integrator]
Benefits of Choosing an Intel® Cluster Ready Certified System
• In production faster
– Simplified selection of compatible systems and applications
– Faster and higher confidence initial deployment
• Less downtime
– Ability to detect and correct problems soon after they occur
– Better support from ISVs and your system vendor
• Lower initial and ongoing costs
• Higher ongoing productivity
Demand an Intel® Cluster Ready certified system for your next cluster
Members of the Intel® Cluster Ready Community
A final word from our sponsor...
• Why talk about Intel Cluster Ready at a FOSS Grid & Clustering Conference?
– The ICR specification and documentation are freely available on the Intel Web Site ( http://www.intel.com/go/cluster )
– The Cluster Checker is an implementation of a tool to help check whether an instance meets the specification
• The interfaces are modular so that 3rd parties can add more tests to the tool for their own value add
• Cluster Checker is free to all Intel ICR partners
• An ICR “certified” cluster is required to pass on a copy of the Cluster Checker
– Allows End Users, Administrators, FEs from PI or ISVs to run CC in wellness mode
– Allows check for degradation since factory installation
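The idea of checking for degradation since factory installation can be sketched as a comparison between a stored baseline and fresh measurements. This is a hypothetical illustration, not the Cluster Checker's wellness mode: the metric names, the 10% tolerance, and the numbers are all made up for the example.

```python
# Hypothetical sketch of a degradation check; metric names and numbers are made up.

def degraded_metrics(baseline: dict, current: dict, tolerance: float = 0.10) -> list:
    """Return metrics (higher is better) that fell more than `tolerance` below baseline."""
    return sorted(
        name
        for name, reference in baseline.items()
        if current.get(name, 0.0) < reference * (1.0 - tolerance)
    )


# Illustrative values only, not measurements from any real cluster.
factory = {"interconnect_bandwidth_mb_s": 900.0, "memory_bandwidth_mb_s": 5000.0}
today = {"interconnect_bandwidth_mb_s": 700.0, "memory_bandwidth_mb_s": 4900.0}

print(degraded_metrics(factory, today))
```

Here only the interconnect figure is flagged, because it dropped by more than 10% while the memory figure stayed within tolerance. A metric missing from the current measurements is also flagged.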
Biography: Clem Cole
• an old school hacker and “Open Sourcer” with more than 25 years of free and open source system development experience.
• first encountered the early editions of UNIX while a university student, writing a variety of device drivers, kernel enhancements, and microprocessor support tools for the early 4- and 8-bit CPUs
• has been developing Operating Systems and Technical Computing Systems ever since
• most recent enterprise at Intel Corporation is leading the Architecture and Development of Intel Cluster Ready
• degrees in EE, Math & CS from CMU and UCB, with the usual multi-page list of publications and talks to his credit
• helped to write one of the original TCP/IP implementations in the late 1970’s
• is also known as one of the authors of the precursor to IM, the UNIX talk program, as well as other more humorous and notorious hacks
• is honored to be the current VP of the USENIX Association
Thank You
http://www.intel.com/go/cluster
Joining the Intel® Cluster Ready Program
Intel® Cluster Ready Participants:
• Solutions Providers
• Component Vendors
• Application ISVs
Sign Up On-line
Just a couple clicks away…
Go on-line at http://www.intel.com/go/cluster and follow links to sign up as a participant.
Intel® Cluster Ready participants can use the badge on certified solutions or registered applications. Participants are also listed on the Intel® Cluster Ready website.
For more information about Intel® Cluster Ready:
• Intel® Cluster Ready website:
• http://www.intel.com/go/cluster
• Intel® Enabled Server Acceleration Alliance website:
• http://www.intel.com/design/servers/esaa
• Email:
•
Cluster Components
• Processors
• Computer nodes
• High performance interconnect with fast message passing software
• Storage network, NFS, etc. (e.g., Ethernet)
• Cluster management – OS & Middleware
• Parallel libraries, compilers, mathematical libraries (e.g., Intel® MPI Library, OpenMP*, Intel® Math Kernel Library)
• Parallel application(s)
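[Figure: example cluster layout. Labels shown: console, head node, compute nodes, storage, and management (Mgt), LAN, storage, and application (Appl) networks]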