
An Assessment of Leadership Performance with

POWER6 Processors and Red Hat Enterprise Linux 5.1

January 25th, 2008

Bill Buros

Elisabeth Stahl

© IBM Corporation 2007. All Rights Reserved.


IBM Systems and Technology Group

Executive Overview

Compute-intensive performance is increasingly required for today's high-performance environments. The IBM System p 570 with POWER6 processors provides blazing performance and demonstrates excellent scalability moving from one node to four nodes in this environment, while offering easy SMP scaling and growth.

This paper highlights the leadership performance of IBM's POWER6 systems running the latest Red Hat Enterprise Linux (RHEL) Version 5.1, based on recently audited and published SPEC CPU2006 and SPECjbb2005 results, plus LINPACK, on IBM System p 570 4-core, 8-core, and 16-core systems. With POWER6 Simultaneous Multi-Threading (SMT) support turned on, Linux easily provides effective scheduling of the 32 CPUs seen on the 16-core System p 570 system.

We discuss the impact of high performance computing on organizations, highlight POWER6 systems and Red Hat Enterprise Linux solutions, and conclude with some examples of easy software stack performance tuning considerations and recommendations for achieving leadership results with Linux and POWER6.


The Business of Compute Intensive Performance

Compute-intensive performance is increasingly required for today's high-performance environments. Many industries depend on this level of performance; it is a necessity to run their business. Industries are finding that increased performance can have a positive impact on revenue in many ways, even for companies that do not follow the classic scientific or high performance computing models.

As an example, many companies in the financial services sector depend on compute-intensive, business-critical applications that require high performance, scalable and reliable processing power. These organizations thrive on complexity and rely on high-volume, complex trades to generate profits. These complex applications require compute-intensive system performance leadership to differentiate the owning organization from competitors.

The workloads presented here are great examples of foundational compute-intensive workloads where easy customer tuning techniques can be highlighted. IBM and Red Hat are leaders in providing infrastructure and system solutions that are easy to manage, help reduce complexity, and lower energy costs by consolidating workloads. Using the examples here, across CPU-intensive integer, floating point, and Java applications, we show that POWER6 and RHEL 5.1 can provide customers with the performance, flexibility, and effective solutions they need.


IBM POWER6 Systems – Design, Performance, Function

The recent introduction of the POWER6™ processor-based servers brought to market advances in server design, performance, and function.

The IBM POWER6 processor-based System p™ 570 server delivers outstanding price/performance, reliability and availability features, flexible capacity upgrades and innovative virtualization technologies to enable management of growth, complexity and risk. These systems excel at database and application serving, as well as server consolidation, demonstrating resource optimization, secure and dependable performance and the flexibility to change with business needs. (1)

POWER6 processors can easily run 64-bit applications, while concurrently supporting 32-bit applications to enhance flexibility. The processors feature Simultaneous Multi-Threading (SMT) support, allowing two application “threads” to run at the same time on each core, which in many cases can significantly reduce the time to complete tasks.

IBM and Red Hat recently completed work on publishing three industry-standard benchmark suites on the System p 570 servers with the latest Red Hat Enterprise Linux Version 5.1 update. The benchmark suites included SPEC CPU2006, SPECjbb2005, and single-system Linpack, which were executed on 4-core, 8-core, and 16-core p 570 systems. The workloads easily demonstrate the SMP scalability that comes naturally with Linux and the IBM p 570 systems.


The Red Hat Enterprise Linux (RHEL) 5.1 Solution

On one certified platform, Red Hat Enterprise Linux offers your choice of:

* Applications - Thousands of certified ISV applications
* Deployment - Including standalone or virtual servers, cloud computing, or software appliances
* Hardware - Wide range of platforms from the world's leading hardware vendors

This gives IT departments unprecedented levels of operational flexibility, and it gives ISVs unprecedented market reach when delivering applications: certify once, deploy anywhere, all while providing world-class performance, security, stability, and unbeatable value.

Red Hat Enterprise Linux is available in two variants for servers. A base Red Hat Enterprise Linux server is designed for small deployments while Red Hat Enterprise Linux Advanced Platform is designed for mainstream customers and provides the most cost-effective, flexible, and scalable environment. Both versions are based on common core technology. Both include a comprehensive suite of open source server applications and virtualization capabilities.(2)

Red Hat provides:

* Thousands of certified applications from Independent Software Vendors (ISVs)
* Hundreds of certified hardware systems and peripherals from leading OEM vendors, spanning multiple processor architectures
* A range of partner programs
* Comprehensive service offerings, up to 24x7 support with 1-hour response, available from Red Hat and selected ISV/OEM partners
* Excellent performance, security, scalability, and availability, with audited industry benchmarks
* Open source technologies rigorously tested and matured through the Red Hat sponsored Fedora project
* With each major version, stable application interfaces and 7 years of product support
* A homogeneous client/server product family that enables seamless inter-operation of systems from the laptop to the data-center to the mainframe
* Plus, excellent interoperability with existing Unix and Microsoft® Windows® deployments

Red Hat Enterprise Linux (RHEL) 5.1 for POWER systems provides excellent performance right “out of the box”, and can be easily tuned to provide maximum performance for IBM’s latest POWER6 systems.


Benchmark Configuration Performance Recommendations

To demonstrate how easy it is to attain outstanding POWER6 and RHEL 5.1 performance in the compute intensive environment, we use recently published SPECfp™_rate2006, SPECint™_rate2006, Linpack, and SPECjbb2005 benchmarks as examples on 4-core, 8-core, and 16-core p570 systems. All of these systems were booted with simultaneous multi-threading on, so the three tested systems had 8 CPUs, 16 CPUs, and 32 CPUs, respectively, available for Linux to schedule and leverage.

SPEC CPU2006

SPEC CPU2006 contains two benchmark suites designed to provide performance measurements that can be used to compare compute-intensive workloads. CINT2006 measures and compares compute-intensive integer performance and CFP2006 measures and compares compute-intensive floating point performance. CPU2006 is SPEC's next-generation, industry-standardized, CPU-intensive benchmark suite, stressing a system's processor, memory subsystem and compiler. SPEC designed CPU2006 to provide a comparative measure of compute-intensive performance across the widest practical range of hardware using workloads developed from real user applications.(3)

We use the “throughput” (rate) runs here to demonstrate how a fully loaded system completes the specified workloads. One of the key advantages of the CPU2006 benchmark programs is that they are developed from actual end-user applications, as opposed to synthetic benchmarks. The workloads were modified to be highly portable. CINT2006 (the integer workloads) is made up of 12 benchmarks: 9 use C and 3 use C++. CFP2006 (the floating-point workloads) has 17 benchmarks: 4 use C++, 3 use C, 6 use Fortran, and 4 use a mixture of C and Fortran.

For the throughput runs, the SPEC infrastructure runs multiple copies of the workload, in our case one copy for each schedulable CPU. These runs highlight how effectively the hardware and software support SMT. We have found that in most cases running a POWER6 system with SMT on is preferred.
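As a quick illustration (not part of the published configurations), the number of logical CPUs available to the Linux scheduler and the current SMT state can be checked from the shell; the ppc64_cpu utility shown here is assumed to come from the powerpc-utils package:

grep -c "^processor" /proc/cpuinfo   # count the logical CPUs Linux can schedule (SMT threads included)
ppc64_cpu --smt                      # query the current SMT state (run as root to change it)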

LINPACK HPL

The LINPACK benchmark compares the performance of different computer systems in solving dense systems of linear equations (5). In this assessment, the Highly Parallel Computing (HPC) benchmark was run and OpenMPI was used. OpenMPI has proven to be effective with these types of workloads and is a great example of an efficient open-source implementation.

The LINPACK benchmark metrics are defined as follows:

* Rmax: the performance in Gflop/s for the largest problem run on a machine.
* Nmax: the size of the largest problem run on a machine.
* N1/2: the problem size at which half the Rmax execution rate is achieved.
* Rpeak: the theoretical peak performance in Gflop/s for the machine.
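For reference, Rpeak for these systems follows directly from the hardware specifications, assuming each POWER6 core can complete up to four double-precision floating-point operations per cycle (two floating-point units, each capable of a fused multiply-add):

Rpeak = (number of cores) x (clock rate) x (flops per cycle per core)
      = 16 x 4.7 GHz x 4 = 300.8 Gflop/s for the 16-core p 570

which matches the Rpeak reported with the published Linpack results at the end of this paper.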

The number of processors and the cycle time are also listed when reporting results of the LINPACK benchmark. Full or half precision indicates whether the computation was performed using 64-bit or 32-bit floating-point arithmetic, respectively. In our case, we use 64-bit floating point, leveraging the POWER6 hardware.

We use LINPACK as a classic single-program floating point example which is heavily dependent on math libraries and an MPI protocol. LINPACK is the basic workload run for the TOP500 list of supercomputers, where hundreds and even thousands of computer systems or nodes are interconnected to provide massive parallelization of the workload. In our case, we run LINPACK HPL in a single SMP image, leveraging multiple threads on the SMP system. The work on the LINPACK single system extends easily to clustered systems.


SPECjbb2005

SPECjbb2005 evaluates the performance of servers running typical Java business applications and represents an order processing application for a wholesale supplier. The benchmark can be used to evaluate performance of hardware and software aspects of Java Virtual Machine (JVM) servers.

SPECjbb2005 (Java Server Benchmark) is SPEC's benchmark for evaluating the performance of server side Java by emulating a three-tier client/server system (with emphasis on the middle tier) on a single hardware system. The benchmark exercises the implementations of the JVM (Java Virtual Machine), JIT (Just-In-Time) compiler, garbage collection, threads and aspects of the operating system. It also measures the performance of CPUs, caches, memory hierarchy and the scalability of shared memory processors.(4)

We use SPECjbb2005 as a classic test of the IBM-provided Java engine, in this case the latest Java 1.6 release from IBM.


SPECcpu2006

The SPECcpu2006 benchmark suite consists of several metrics, two of which we present here.

In this example, we used the “rate” run mode of the SPEC CPU2006 suite to compare how the p 570 system handles normal workloads, starting with the 4-core system, then a separate 8-core system, and finally a 16-core p 570 system, all running the latest RHEL 5.1 release from Red Hat.

As you can see in the graph below, the workload scaled very nicely from 4-core, to 8-core, to 16-core on the RHEL 5.1 based POWER6 system.

The SPEC CPU2006 results here were easily obtained by leveraging the following software pieces, all available to customers. The results demonstrate how easy it is to get leadership performance with Linux and POWER6 systems.

1. Red Hat’s Enterprise Linux (RHEL) Server Version 5.1 for POWER.

2. IBM’s XL C/C++ and XL Fortran Advanced Edition Compilers, Ver 9.0 and 11.1, respectively

3. The Linux “libhugetlbfs” project for Transparent Access to 16MB Large Pages included with RHEL 5.1 – libhugetlbfs is an ongoing sourceforge project.

4. MicroQuill’s SmartHeap™ Version 8.1

5. IBM’s Engineering and Scientific Subroutine Library (ESSL) for Linux on Power, Version 4.3

6. IBM’s Post-Link Optimization for Linux on Power, Version 5.4 (aka FDPRpro)

Each software component listed provides critical enabling support for improving the performance of these example CPU-intensive benchmarks. SPEC.org makes it easy to see what options were used for each workload component in the suites.


In the output of each SPEC.org report listed at the end of this paper, there are two primary sections to which the reader's attention is directed. There is a section in the published result file for “Base Optimization Flags”, where the common C, C++, and Fortran optimizations are defined. A “Peak” run can then be done by individually tuning the optimizations for each individual benchmark.

In our example, for the SPEC CPU2006 runs, the C, C++, and Fortran programs used the “-O5” compiler optimization in the base run. Interestingly enough, some of the benchmark programs ran slightly better with lower optimization levels specified.

Red Hat Enterprise Linux Server

RHEL 5.1 for Power ships with 64KB memory pages built in the kernel as the default, which enables easy performance gains for these types of HPC workloads, with no application source code changes. The 64KB pages allow applications to access memory more efficiently on the POWER6 hardware platform.
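As a quick sanity check (not part of the published configurations), the base page size of a running system can be queried from the shell; on RHEL 5.1 for POWER this reports 65536 bytes, i.e. 64KB:

getconf PAGESIZE   # base page size in bytes; 65536 = 64KB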

In general, SPEC CPU2006 “rate” workloads were run by binding each workload process to a specific CPU executing under the control of the Linux operating system. This is easy to do with the system command “taskset” as follows, where # is the CPU number.

taskset -c <#> <command>
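As a hedged illustration of the same idea outside of the SPEC run harness (the copy count and program name are placeholders), one copy of a workload can be started and pinned to each of the 32 logical CPUs on the 16-core system:

for cpu in $(seq 0 31); do
    taskset -c $cpu ./workload_copy &   # pin one copy to each logical CPU
done
wait                                    # wait for all copies to complete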

For the “rate” runs, each system was booted with “SMT on”, which allowed the operating system and hardware to best juggle idle cycles between two simultaneous threads executing on a single core. On the POWER systems, each schedulable CPU is backed by one of the SMT threads on a core.

RHEL 5.1 also ships with automatically selected CPU-tuned libraries. These give application programs access to the right set of libraries for the system they are executing on; in our case, libraries tuned specifically for POWER6. This selection is handled automatically and seamlessly by the operating system.

IBM’s XL C/C++ and XL Fortran Advanced Edition Compilers

On Linux, the normal gcc compilers and tool-chain provide good “out-of-the-box” performance and excellent cross-platform compilation support for enabling applications. Where needed and desired, the IBM compilers for Linux on POWER provide the proven ability to deliver the best performance for the hardware system. The IBM Linux on Power compilers are the “same” compilers used on AIX on Power systems, so Linux customers take advantage of years of development and customer usage of the compilers.

In our experience, every workload component has its own unique characteristics and performance signatures, so there are several common tuning techniques used:

1. First, while the “-O5” optimization level is our default, we found that in some cases the lower optimization levels of “-O4”, “-O3”, and surprisingly even “-O2” could provide slightly better performance characteristics. This is something easy for customers to test.

2. Keep in mind that when compiling at the “-O5” level, the compiler automatically produces executable code targeted at the system you are compiling on (see the example invocations following this list). For SPEC CPU2006 workloads, most companies that publish results will compile the executable specifically targeted at the hardware being tested.

This technique provides incremental performance gains by leveraging the compiler's knowledge of the hardware. The POWER systems have been designed for long-term customer stability and upwards compatibility, so customers can easily choose to compile their applications for the more general case of POWER systems and still achieve good performance.

3. In some cases there was a noticeable difference in performance between compiling for 32-bit or 64-bit. Most applications in the SPEC CPU2006 suite did fine with the 32-bit compilation mode. Compiling for 64-bit is usually done to increase the capacity of the application program, allowing it easy access to a much bigger address space and more memory. 64-bit does not automatically mean the program will run faster, though.

4. In one case on POWER, we found that the workload component “447.dealII” ran a little faster when statically linked. Most customers would think twice before statically linking an executable; we use this simply as an example of a performance technique that can be leveraged, and for a publish we obviously wanted to use the faster result.

5. Another compilation technique supported by the IBM compilers is to employ “profile-directed feedback” (PDF) optimization runs. This technique is a two-step process: first, the application is compiled, automatically instrumented, and executed with a typical data set. During the test run, profile data is gathered and subsequently used in the second-step compilation to further optimize the final executable. This step requires a representative sample data set for the profiling run, which is something to consider for this approach.
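The compile invocations below are a hedged sketch of these techniques with the IBM XL compilers; the source and executable names are placeholders, and the exact option sets used for the published results are documented in the SPEC.org reports.

xlc -O5 -q32 -o app app.c                          # base optimization; -O5 targets the machine doing the compile
xlc -O5 -q64 -qarch=pwr6 -qtune=pwr6 -o app app.c  # explicitly target and tune for POWER6, 64-bit mode
xlc -O5 -qpdf1 -o app app.c                        # PDF step 1: build an instrumented executable
./app training_input                               # run with a representative data set to gather a profile
xlc -O5 -qpdf2 -o app app.c                        # PDF step 2: recompile using the gathered profile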

“Libhugetlbfs”

A Linux community performance improvement project, libhugetlbfs (8) is now being embraced by enterprise Linux customers in special cases to transparently load portions of an executable into the 16MB large pages supported by POWER hardware systems. By leveraging the hardware assist for accessing memory in larger pages, the time spent finding and accessing memory pages can in some cases be significantly reduced, improving the performance of applications.

“libhugetlbfs” provides a flexible interface that allows an executable to back malloc'ed memory with 16MB large pages, or the .bss segment (for example, the uninitialized arrays in a program), or the combination of the .bss segment, the .data segment, and the executable text of the program, with memory backed by 16MB large pages.
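A minimal sketch of the malloc-backing approach is shown below, assuming root access; the mount point and program name are placeholders, and the relinking options for backing .bss, .data, and text are described in the libhugetlbfs documentation:

echo 200 > /proc/sys/vm/nr_hugepages                     # reserve some 16MB huge pages
mkdir -p /mnt/hugetlbfs
mount -t hugetlbfs none /mnt/hugetlbfs                   # mount a hugetlbfs filesystem for libhugetlbfs to use
LD_PRELOAD=libhugetlbfs.so HUGETLB_MORECORE=yes ./app    # back the application's malloc'ed memory with 16MB pages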

On RHEL 5.1, the majority of the memory page-size performance gains are obtained with the 64KB memory pages. In some cases, additionally leveraging the 16MB large pages can add small incremental performance gains to the workloads. In other words, 64KB pages provide great out-of-the-box performance, and 16MB pages can be further leveraged to squeeze out a few more potential percentage points of performance.

To clarify a common question, libhugetlbfs only operates on the 16MB large pages on POWER systems and is optionally used on application programs. 64KB memory page support is built into the Linux kernel (it’s a kernel build option) and is always available on RHEL 5.1 and future RHEL releases.

MicroQuill’s SmartHeap™

MicroQuill's SmartHeap library is a fast and portable malloc library replacement. SmartHeap supports multiple memory pools and improves memory management performance for multi-threaded applications. In the Linux community, the malloc routines shipping in the common base are extremely robust and have evolved over the years to make cross-platform development easy and consistent.

In the SPEC CPU2006 suite, the benchmark component “483.xalancbmk” is commonly referred to as a “malloc abuser”. With MicroQuill's SmartHeap leveraged for this workload, the performance of the benchmark is significantly improved, almost doubling. The benchmark components “471.omnetpp” and “473.astar” are also improved, by about 30% and 20% respectively. Not all benchmarks or workloads will improve from this technique, but if your workload depends heavily on malloc, you should consider trying the libraries from MicroQuill. SmartHeap is available across operating systems and hardware platforms.
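As a generic sketch of the drop-in allocator approach (the library path and file name below are hypothetical; consult MicroQuill's documentation for the actual names and the preferred link-time options), a shared-library malloc replacement can be preloaded without relinking the application:

LD_PRELOAD=/opt/smartheap/lib/libsmartheap.so ./app   # hypothetical library name; overrides malloc/free at run time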


IBM’s Engineering and Scientific Subroutine Library (ESSL)

IBM’s ESSL product is a collection of state-of-the-art mathematical subroutines specifically designed to improve the performance of engineering and scientific applications on the IBM POWER™ processor-based servers and BladeCenter® JS20, JS21, and JS22 POWER blades. ESSL is commonly used in the aerospace, automotive, electronics, petroleum, utilities and scientific research industries for a large suite of HPC applications.

ESSL supports 32-bit and 64-bit Fortran, C and C++ applications, and serial and symmetric multiprocessing (SMP) applications running under AIX® and Linux®. As with the IBM compilers, ESSL for Linux on Power provides the same time-tested software base used by IBM’s AIX customers.

ESSL is used selectively in many of the CPU2006 Fortran floating-point programs, and is used extensively in Linpack HPL publications. The ESSL product is specifically tuned for the POWER hardware and systems and is highly recommended. As seen in the Linpack results graph, the results scaled very nicely from 4-core to 8-core and finally to 16-core.
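A hedged sketch of pulling ESSL into a build is shown below; the source and executable names are placeholders, and the SMP variant of the library is selected with -lesslsmp instead:

xlf -O5 -qarch=pwr6 -qtune=pwr6 -o solver solver.f -lessl   # link a Fortran application against the serial ESSL library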

Post-Link Optimization for Linux on POWER (aka FDPR-Pro)

Post-Link Optimization for Linux® on POWER™, also known as FDPR-Pro, is a performance-tuning utility used to improve the execution time and the real memory utilization of user-level application programs, based on their run-time profiles. The approach is similar to the profile-directed feedback technique used by the IBM compilers, but in this case it is a standalone product that leverages technologies and innovative approaches for restructuring an executable based on execution with representative data sets.

Compute Intensive Summary

When maximum performance is desired for an application, it is easy to start the process of optimizing and tuning a compute-intensive application, be it C, C++, or Fortran.

1. Leverage a state-of-the-art Linux operating system like RHEL 5.1

2. With RHEL 5.1, the system automatically has support for 64KB memory pages

3. Where possible, take advantage of compilers which exploit the hardware

4. Assess simple combinations of compiler optimization levels (-O5, -O4, -O3) and then 32-bit vs 64-bit targets

5. Consider the use of transparent access to system large pages (16MB pages on POWER)

6. Consider whether profile-directed feedback techniques are possible with access to representative data sets


SPECjbb2005

Another common set of compute-intensive workloads are Java server-side workloads. SPEC provides a benchmark which exercises the implementations of the Java Virtual Machine (JVM), the Just-In-Time (JIT) compiler, garbage collection, and threads. The workload allows multiple JVMs to be used, which in some cases simplifies scaling the benchmark.

The results below show the easy scalability of this workload on the POWER 6 p 570 systems. The workload as we ran it specified two JVMs for each set of 4 cores. So the 4-core system ran two JVMs. On the 8-core system, four JVMs were run. And on the 16-core system, eight JVMs were run.

For these publishes, there wasn't very much tuning done. A heap size of 2253 MB was specified (in essence, close to 2GB of memory) for the 32-bit JVMs. When multiple JVMs are used, as in these publishes, we found that the 32-bit IBM JVM had better performance than the equivalent 64-bit JVM. Conversely, if the runs had been done with a single JVM on the 4-core, 8-core, and 16-core systems, we may have seen that a 64-bit JVM would be better due to the larger memory address space available for the heap. This provides nice flexibility for customers, depending on their choice of implementation.

In these Java runs, the Java heap was configured to run with the 16MB large pages. This was done easily by first defining 16MB large pages to the system, and then setting the appropriate environment variable controls. 500 large pages were defined for each 4-core node used. So in the case of the 16-core POWER6 p 570 system, the following “root user” controls were used to reserve 2000 16MB large pages:

echo 2000 > /proc/sys/vm/nr_hugepages        # reserve 2000 x 16MB huge pages (32GB in total)

echo 4000000000 > /proc/sys/kernel/shmmax    # maximum size, in bytes, of a single shared memory segment

echo 500000 > /proc/sys/kernel/shmall        # total shared memory allowed, in base (64KB) pages

We of course checked to be sure that the 2000 16MB pages were allocated and available for the Java engines, as illustrated below.
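As a hedged illustration (the class name and option set are placeholders rather than the published command line), the reservation can be verified from /proc/meminfo, and an IBM JVM can then be started with its 2253 MB heap backed by large pages using the -Xlp option:

grep Huge /proc/meminfo                              # HugePages_Total / HugePages_Free show the 16MB page pool
java -Xms2253m -Xmx2253m -Xlp com.example.Workload   # one 32-bit JVM instance with a large-page-backed heap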


On RHEL 5.1, with 64KB pages automatically leveraged, the majority of the performance gains were realized without needing to specify the 16MB huge pages. For a marketing publish, with the focus being the best possible metric, 16MB pages were used.

IBM Java 1.6 holds the latest functional, enterprise, and performance enhancements for customers wanting to get the most out of POWER6 systems. IBM's engineering and software teams continue to incrementally improve Java performance, so it is recommended that customers stay on top of service packs and updates as they are made available.


Conclusion

Corporations in varied industries rely on compute-intensive performance to distinguish their organizations and positively impact their revenue. POWER6 processors with Red Hat Enterprise Linux are a winning combination, as demonstrated by the SPECcpu2006, SPECjbb2005, and LINPACK benchmarks. Linux on POWER can deliver.

IBM and Red Hat continue to expand the breadth and coverage of performance improvements for customers in real-life situations. We encourage the reader to leverage the lessons demonstrated with these leadership publishes.


References

[1] IBM POWER6 System p 570 - http://www.ibm.com/systems/p/hardware/midrange_highend/p570/

[2] Red Hat Enterprise Linux 5 - http://www.redhat.com/rhel/

IBM POWER6 System p 570 Performance Data, including the recent RHEL 5 results - http://www-03.ibm.com/systems/p/hardware/midrange_highend/p570/perfdata.html

[3] SPEC CPU2006 - http://www.spec.org/cpu2006/

[4] SPECjbb2005 - http://www.spec.org/jbb2005/

[5] Linpack HPL - http://www.netlib.org/benchmark/performance.pdf

[6] IBM's XL C/C++ Advanced Edition Compilers for Linux on POWER - http://www-306.ibm.com/software/awdtools/xlcpp/features/linux/xlcpp-linux.html

[7] IBM's XL Fortran Advanced Edition Compilers for Linux on POWER - http://www-306.ibm.com/software/awdtools/fortran/xlfortran/features/linux/xlf-linux.html

[8] libhugetlbfs - http://sourceforge.net/projects/libhugetlbfs/

[9] MicroQuill's SmartHeap™ - http://microquill.com/smartheap/

[10] IBM's ESSL - http://www-03.ibm.com/systems/p/software/essl/

[11] Post-Link Optimization for Linux on POWER (aka FDPR-Pro) - http://www.alphaworks.ibm.com/tech/fdprpro/

[12] IBM SDK for Java Version 6 Early Release Program - https://www14.software.ibm.com/iwm/web/cc/earlyprograms/ibm/java6/index.shtml


Published Results

SPECint®_rate2006

IBM System p 570 (4.7 GHz, 4-core, RHEL) - 122
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02417.html

IBM System p 570 (4.7 GHz, 8-core, RHEL) - 243
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02418.html

IBM System p 570 (4.7 GHz, 16-core, RHEL) - 484
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02415.html

SPECfp®_rate2006

IBM System p 570 (4.7 GHz, 4-core, RHEL) - 116
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02421.html

IBM System p 570 (4.7 GHz, 8-core, RHEL) - 216
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02422.html

IBM System p 570 (4.7 GHz, 16-core, RHEL) - 430
http://www.spec.org/cpu2006/results/res2007q4/cpu2006-20071030-02419.html

SPECjbb2005

IBM System p 570 (4.7 GHz, 4-core, RHEL) - 169304
http://www.spec.org/jbb2005/results/res2007q4/jbb2005-20071030-00400.html

IBM System p 570 (4.7 GHz, 8-core, RHEL) - 335424
http://www.spec.org/jbb2005/results/res2007q4/jbb2005-20071030-00401.html

IBM System p 570 (4.7 GHz, 16-core, RHEL) - 664167
http://www.spec.org/jbb2005/results/res2007q4/jbb2005-20071030-00402.html

Linpack HPL

http://www.netlib.org/benchmark/performance.ps

IBM System p 570 (4.7 GHz POWER6, RHEL 5.1), 4-core: Rmax = 60.37 Gflop/s, Rpeak = 75.2 Gflop/s
IBM System p 570 (4.7 GHz POWER6, RHEL 5.1), 8-core: Rmax = 116.4 Gflop/s, Rpeak = 150.4 Gflop/s
IBM System p 570 (4.7 GHz POWER6, RHEL 5.1), 16-core: Rmax = 229.7 Gflop/s, Rpeak = 300.8 Gflop/s


© IBM Corporation 2007
IBM Corporation
Systems and Technology Group
Route 100
Somers, New York 10589

Produced in the United States of America
November 2007
All Rights Reserved

This document was developed for products and/or services offered in the United States. IBM may not offer the products, features, or services discussed in this document in other countries.

The information may be subject to change without notice. Consult your local IBM business contact for information on the products, features and services available in your area.

All statements regarding IBM future directions and intent are subject to change or withdrawal without notice and represent goals and objectives only.

IBM, the IBM logo, Micro-Partitioning, POWER, POWER5, POWER6, System p, System p5 are trademarks or registered trademarks of International Business Machines Corporation in the United States or other countries or both. A full list of U.S. trademarks owned by IBM may be found at: http://www.ibm.com/legal/copytrade.shtml.

UNIX is a registered trademark of The Open Group in the United States, other countries or both.

Linux is a trademark of Linus Torvalds in the United States, other countries or both.

Other company, product, and service names may be trademarks or service marks of others.

IBM hardware products are manufactured from new parts, or new and used parts. In some cases, the hardware product may not be new and may have been previously installed. Regardless, our warranty terms apply.

This equipment is subject to FCC rules. It will comply with the appropriate FCC rules before final delivery to the buyer.

Information concerning non-IBM products was obtained from the suppliers of these products or other public sources. Questions on the capabilities of the non-IBM products should be addressed with those suppliers.

All performance information was determined in a controlled environment. Actual results may vary. Performance information is provided “AS IS” and no warranties or guarantees are expressed or implied by IBM. Buyers should consult other sources of information, including system benchmarks, to evaluate the performance of a system they are considering buying.

When referring to storage capacity, 1TB equals total GB divided by 1000; accessible capacity may be less.

The IBM home page on the Internet can be found at: http://www.ibm.com.

The IBM System p home page on the Internet can be found at: http://www.ibm.com/systems/p.
