Taking the Complexity out of Cluster Computing

Vendor Update, HPC User Forum

Arend Dittmer, Director Product Management HPC

April 17, 2009



Copyright © 2009 Penguin Computing, Inc. All rights reserved

Penguin Vision and Focus

Founded 1998 – One of HPC industry’s longest track records of success

Donald Becker, CTO – Inventor of Beowulf architecture and primary contributor to Linux kernel

Over 2500 Customers in Enterprise, Academia and Government

Focus on integrated ‘turnkey’ HPC clusters


Penguin Solutions Delivered “Ready-to-Run”

Hardware

>Servers

>GPU Accelerators

>Storage

>Interconnects

>Racks and PDUs

Software

>Cluster Management

>Applications and Workload Managers

>Compilers and Tools

Rack Integration

Software Integration

• Scyld ClusterWare

• Schedulers

• Development tools

• Applications

Solution Testing

• System level burn-in

• Full cluster testing

24x7 Support


Trends in Cluster Computing

Cluster Management Software


The HPC Cluster Management Challenge

Linux clusters deliver unmatched price/performance

Linux clusters dominate the HPC Market (Market share >75%)

however…

Compute power delivered by many systems introduces complexity

> Configuration consistency

> Distributed applications

> Workload Management

Scyld ClusterWare designed to make cluster management easy


Scyld ClusterWare Design

Master node is the single point of control

Compute nodes are attached 'stateless' memory and processor resources

Scyld maintains consistency across the cluster

Designed for Ease-of-Use and Manageability

‘Manage a Cluster like a Single System’


Web Based Monitoring Framework

One web based interface to all HPC cluster components

Integrates existing tools e.g. IPMI, Ganglia, TORQUE

Customizable, extensible Framework

>Based on XML, JavaScript and ExtJS


Trends in Cluster Computing

Hardware


Heterogeneous Computing: GPUs + CPUs

Massive processing power introduces I/O challenge

>Getting data to and from the processing units can take as long as the processing itself

>Requires careful software design and deep understanding of algorithms and architecture of

• Processors (Cache effects, memory bandwidth)

• GPU accelerators

• Interconnects (Ethernet, IB, 10 Gigabit Ethernet)

• Storage (local disks, NFS, parallel file systems)

[Diagram label: 4 cores]
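The I/O point on this slide can be made concrete with a small back-of-the-envelope model. The sketch below is not from the presentation: the function names, the assumption that the same amount of data moves in each direction, and the assumption that transfer and compute do not overlap are all illustrative simplifications.

```python
def offload_time(data_bytes, link_bytes_per_s, compute_s):
    """Wall time for one offload: copy in, compute, copy out (no overlap)."""
    # Assumption: the same number of bytes travels to and from the accelerator.
    transfer_s = 2 * data_bytes / link_bytes_per_s
    return transfer_s + compute_s


def effective_speedup(cpu_s, data_bytes, link_bytes_per_s, accel_compute_s):
    """Speedup over the CPU-only run once transfer time is counted."""
    return cpu_s / offload_time(data_bytes, link_bytes_per_s, accel_compute_s)


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to make the arithmetic easy to follow.
    cpu_s = 10.0   # CPU-only run time in seconds
    accel_s = 1.0  # accelerator compute time (a kernel that is 10x faster)
    data = 4e9     # bytes moved in each direction
    link = 4e9     # bytes per second over the host-to-device link
    print(offload_time(data, link, accel_s))
    print(effective_speedup(cpu_s, data, link, accel_s))
```

With these made-up inputs the transfers take 2 s against 1 s of compute, so a kernel that is 10x faster yields only about 3.3x end to end. Overlapping transfers with compute, or keeping data resident on the accelerator, would change the picture; the model deliberately ignores both.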


Application Case Study: ANSYS / Acceleware

ANSYS Direct Sparse Solver (DSS) - Single System Mode

Matrix Decomposition offloaded to NVIDIA Tesla C1060 GPU Accelerator

ANSYS standard benchmark BM-7 – 500K-1750K DoF

Overall speedup up to 3.7X for Single Precision runs, 2.7X for Double Precision


Sample of Penguin’s Advanced Compute Offering

NVIDIA Tesla S1070 GPU Accelerator

>Four processors, 240 cores each

>Native double precision floating point support

>Supports NVIDIA’s CUDA API

Niveus HTX Personal Supercomputer

>Engineered to support Tesla coprocessors

>720 GPU cores

Relion Intel 1702

>1U Chassis housing two independent x86 nodes

>Two Xeon 5500 Series 'Nehalem' processors per node

>Up to 96GB of RAM on each node


Thank You

April 17, 2009


Application Case Study: ANSYS / Acceleware

ANSYS

> Direct Sparse Solver (DSS) - SMP/Single System Mode

Acceleware Plug-In for ANSYS

> Matrix Decomposition offloaded to NVIDIA Tesla C1060 GPU Accelerator

Benchmark

> ANSYS standard benchmark BM-7 – 500K-1750K Degrees of Freedom (DoF)

> Intel Xeon E5405 – Dual core runs

> Overall speedup up to 3.7X for Single Precision runs, 2.7X for Double Precision
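One way to read the overall speedup figures is through Amdahl's law. The slide does not say what share of the solver's run time is spent in the offloaded matrix decomposition, so the sketch below only computes the lower bound that Amdahl's law implies for that share; the function names are illustrative and nothing here comes from ANSYS or Acceleware.

```python
def amdahl_speedup(fraction, factor):
    """Overall speedup when `fraction` of the run time is sped up by `factor`."""
    return 1.0 / ((1.0 - fraction) + fraction / factor)


def min_accelerated_fraction(overall_speedup):
    """Smallest share of run time that must be accelerated to reach
    `overall_speedup`, i.e. the limit where that share takes no time at all."""
    return 1.0 - 1.0 / overall_speedup


if __name__ == "__main__":
    # Overall speedups quoted on the slide.
    for label, s in (("single precision", 3.7), ("double precision", 2.7)):
        print(label, round(min_accelerated_fraction(s), 3))
```

Under this reading, an overall 3.7X would require at least about 73% of the original run time to sit in the accelerated part, and 2.7X at least about 63%. The real shares are higher, since the GPU part still takes some time.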


Integrated Management Framework

One web based interface to all HPC cluster components

Follows Scyld ‘Ease-of-Use’ Philosophy

Integrates existing tools e.g. IPMI, Ganglia, TORQUE


A Sample of our 2500+ Customers

National Labs

Aerospace/Defense

Universities/Institutions

Enterprise


Hardware Effects: Multicore-Multithreading

Moore’s Law is doubling the number of transistors on an integrated circuit every 18 months

However, clock speeds are not scaling

Multicore and Multithreaded Programming is critical for continued software scalability

Rather than reinvent the wheel, use existing frameworks and tools

> OpenMP

> MPI

> Threading Building Blocks

> ATLAS, FFTW, MKL, ACML, etc.