HPC as a Driver for Computing Technology and Education Tarek El-Ghazawi The George Washington University Washington D.C., USA

HPC as a Driver for Computing Technology and Education › High Power Computing as a Driver for Computing... · 2015-10-03 · HPC as a Driver for Computing Technology and Education


NOW - July 2015: The TOP 10 Systems

| Rank | Site | Computer | Cores | Rmax [Pflops] | % of Peak | Power [MW] | MFlops/Watt |
|---|---|---|---|---|---|---|---|
| 1 | National Super Computer Center in Guangzhou, China | Tianhe-2 NUDT, Xeon 12C 2.2GHz + Intel Xeon Phi (57c) + Custom | 3,120,000 | 33.9 | 62 | 17.8 | 1905 |
| 2 | DOE / OS Oak Ridge Nat Lab, USA | Titan, Cray XK7, AMD (16C) + Nvidia Kepler GPU (14c) + Custom | 560,640 | 17.6 | 65 | 8.3 | 2120 |
| 3 | DOE / NNSA L Livermore Nat Lab, USA | Sequoia, BlueGene/Q (16c) + Custom | 1,572,864 | 17.2 | 85 | 7.9 | 2063 |
| 4 | RIKEN Advanced Inst for Comp Sci, Japan | K computer, Fujitsu SPARC64 VIIIfx (8c) + Custom | 705,024 | 10.5 | 93 | 12.7 | 827 |
| 5 | DOE / OS Argonne Nat Lab, USA | Mira, BlueGene/Q (16c) + Custom | 786,432 | 8.16 | 85 | 3.95 | 2066 |
| 6 | Swiss CSCS | Piz Daint, Cray XC30, Xeon 8C + Nvidia Kepler (14c) + Custom | 115,984 | 6.27 | 81 | 2.3 | 2726 |
| 7 | KAUST, Saudi | Shaheen II, Cray XC30, Xeon 16C + Custom | 196,608 | 5.54 | 77 | 4.5 | 1146 |
| 8 | TACC, USA | Stampede, Dell Intel (8c) + Intel Xeon Phi (61c) + IB | 204,900 | 5.17 | 61 | 4.5 | 1489 |
| 9 | Forschungszentrum Juelich (FZJ), Germany | JuQUEEN, BlueGene/Q, Power BQC 16C 1.6GHz + Custom | 458,752 | 5.01 | 85 | 2.30 | 2178 |
| 10 | DOE / NNSA LLNL, USA | Vulcan, BlueGene/Q, Power BQC 16C 1.6GHz + Custom | 393,216 | 4.29 | 85 | 1.97 | 2177 |
| 500 (422) | Software Comp | HP Cluster USA | 18,896 | .309 | 48 | | |


HPC is a Top National Priority!

Establishment of a National Strategic Computing Initiative (NSCI) – 29 July 2015

Executive Order from the White House


National Strategic Computing Initiative

Five strategic themes of the NSCI:

1) Create systems that can apply exaflops of computing power to exabytes of data

2) Keep the United States at the forefront of HPC capabilities

3) Improve HPC application developer productivity

4) Make HPC readily available

5) Establish hardware technology for future HPC systems


Future/Investments - International Exascale HPC Programs

| Country | Funding | Year(s) | Remarks |
|---|---|---|---|
| European Union | €700M | 2014-20 | Private-Public Partnership commitment through European Tech Platform for HPC (ETP4HPC); €143.4M in 2014-15 |
| European Union | €74M | 2011- | 6 dedicated FP7 Exascale projects |
| India | $2B | 2014-20 | Led by IISc (Indian Institute of Science) and ISRO (Indian Space Research Organization). Targeting a 132 ExaFLOP/s machine |
| India | $750M | 2014-19 | C-DAC (Center for Development of Advanced Computing) to set up 70 supercomputers over 5 years |
| Japan | $1.38B | 2013-20 | Post-K computer to be installed at RIKEN; tentatively based on Extreme SIMD chip “PACS-G” |
| China | - | | Due to U.S./DoC ban, will use Chinese parts to upgrade current #1 system |


Why is HPC Important?

Critical for economic competitiveness (highlighted by Minister Daoudi) because of its wide applications (through simulations and intensive data analyses)

Drives computer hardware and software innovations for future conventional computing

Is becoming ubiquitous, i.e. all computing/information technology is turning parallel

Is that why it is turning into an international HPC muscle-flexing contest?


Why is HPC Important? (1) Competitiveness

[Figure: two product development cycles. Labels on the first: Design, Build, Test. Labels on the second: Design, Model, Simulate, Build.]


Why is HPC Important? Competitiveness

HPC Application Examples

Molecular Dynamics (HIV-1 Protease Inhibitor Drug), simulation for 2ns:
• 2 weeks on a desktop
• 6 hours on a supercomputer

Gene Sequence Alignment (Phylogenetic Analysis):
• 32 days on a desktop
• 1.5 hours on a supercomputer

Car Crash Simulations (2 million elements simulation):
• 4 days on a desktop
• 25 minutes on a supercomputer

Understanding Fundamental Structure of Matter:
• Requires a billion-billion calculations per second
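As a rough illustration, the desktop and supercomputer times quoted above translate into the speedups computed below. This is a minimal sketch: the names `examples` and `speedup` are illustrative and not from the talk, and a week is taken as 7 days.

```python
# Wall-clock times quoted on the slide, converted to hours.
# All names here are illustrative; they do not come from the talk.
HOURS_PER_DAY = 24

examples = {
    "Molecular dynamics (2 ns)": (14 * HOURS_PER_DAY, 6.0),   # 2 weeks vs 6 hours
    "Phylogenetic analysis": (32 * HOURS_PER_DAY, 1.5),       # 32 days vs 1.5 hours
    "Car crash (2M elements)": (4 * HOURS_PER_DAY, 25 / 60),  # 4 days vs 25 minutes
}


def speedup(desktop_hours, super_hours):
    """Ratio of desktop time to supercomputer time."""
    return desktop_hours / super_hours


speedups = {name: speedup(d, s) for name, (d, s) in examples.items()}

for name, value in speedups.items():
    print(f"{name}: about {value:.0f}x faster")
```

The ratios span roughly two orders of magnitude, which is the difference between a result that arrives within a working day and one that does not.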


Why is HPC Important? (2) HPC of Today is Conventional Computing for Tomorrow

The ASCI Red Supercomputer: 9000 chips for 3 TeraFLOPs in 1997

Intel 80 Core Chip: 1 chip and 1 TeraFLOPs in 2007


Why is HPC Important? (3) HPC Concepts are Becoming Ubiquitous

Sony PS3: uses Cell processors!

The Road Runner: was the fastest supercomputer in 2008, uses the Cell processors!

Tile64: a 64 CPU chip, can be in your future laptop!

Samsung S6: 8 cores

HPC is ubiquitous! All computing is becoming HPC. Can we become bystanders?


How Did we Get Here - Supercomputers in recent History

| Computer | Processor | # Pr. | Year | Rmax (TFlops) |
|---|---|---|---|---|
| Tianhe-2 (MilkyWay-2) | TH-IVB-FEP Cluster, Intel Xeon E5-2692 12C 2.200GHz, TH Express-2, Intel Xeon Phi 31S1P | 3120000 | 2013-till now | 33,862 |
| Titan | Cray XK7, Opteron 16 Cores, 2.2GHz, Nvidia K20X | 560640 | 2012 | 17,600 |
| K-Computer, Japan | SPARC64 VIIIfx 2.0GHz | 705024 | 2011 | 10,510 |
| Tianhe-1A, China | Intel EM64T Xeon X56xx (Westmere-EP) 2930 MHz (11.72 Gflops) + NVIDIA GPU, FT-1000 8C | 186368 | 2010 | 2,566 |
| Jaguar, Cray | Cray XT5-HE Opteron Six Core 2.6 GHz | 224162 | 2009 | 1,759 |
| Roadrunner, IBM | PowerXCell 8i 3200 MHz (12.8 GFlops) | 122400 | 2008 | 1,026 |
| BlueGene/L - eServer Blue Gene Solution, IBM | PowerPC 440 700 MHz (2.8 GFlops) | 212992 | 2007 | 478 |
| BlueGene/L - eServer Blue Gene Solution, IBM | PowerPC 440 700 MHz (2.8 GFlops) | 131072 | 2005 | 280 |
| BlueGene/L beta-System, IBM | PowerPC 440 700 MHz (2.8 GFlops) | 32768 | 2004 | 70.7 |
| Earth-Simulator / NEC | NEC 1000 MHz (8 GFlops) | 5120 | 2002 | 35.8 |
| IBM ASCI White, SP | POWER3 375 MHz (1.5 GFlops) | 8192 | 2001 | 7.2 |
| IBM ASCI White, SP | POWER3 375 MHz (1.5 GFlops) | 8192 | 2000 | 4.9 |
| Intel ASCI Red | Intel IA-32 Pentium Pro 333 MHz (0.333 GFlops) | 9632 | 1999 | 2.4 |


How Did we Get Here - Supercomputers in recent History

See: http://spectrum.ieee.org/tech-talk/computing/hardware/china-builds-worlds-fastest-supercomputer


How Did we Get Here - Supercomputers in recent History

[Figure: performance versus time, rising from TeraFLOPS to PetaFLOPS across three eras: Vector Machines; Massively Parallel Processors (1993, HPCC); MPPs with Multicores and Heterogeneous Accelerators (discrete, then integrated). Annotation: 2008-2011, end of Moore’s Law in clocking.]


How to Make Progress

Launch a competitive funding cycle or a large national project

Pose a system challenge
• ~33.8 PFLOPS / 17.8 MW provides about 2 GF/Watt
• To get to Exascale using the same total power we need 200 GF/Watt

Pose an application challenge(s)

Let the community compete for government funding with innovative ideas
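The efficiency arithmetic behind the system challenge can be checked with a short calculation. This is a sketch using the Tianhe-2 figures quoted above (33.8 PFLOPS, 17.8 MW); the function and variable names are illustrative. With these inputs, holding total power at 17.8 MW works out to roughly 56 GF/Watt for an exaflop, while the 200 GF/Watt figure would instead correspond to an exaflop delivered in about 5 MW.

```python
def gflops_per_watt(pflops, megawatts):
    """Energy efficiency in GFLOPS per watt."""
    flops = pflops * 1e15
    watts = megawatts * 1e6
    return flops / watts / 1e9


# Today's top system, using the slide's numbers.
today = gflops_per_watt(33.8, 17.8)

# One exaflop (1000 PFLOPS) at the same 17.8 MW.
exa_same_power = gflops_per_watt(1000.0, 17.8)

# Power budget (in MW) implied by an exaflop at 200 GF/Watt.
power_at_200_mw = (1000.0 * 1e15) / (200 * 1e9) / 1e6

print(f"Today:            {today:.2f} GF/W")
print(f"Exaflop @17.8 MW: {exa_same_power:.1f} GF/W")
print(f"Exaflop @200 GF/W needs {power_at_200_mw:.1f} MW")
```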


Challenges - The End of Moore’s Law

The phenomenon of exponential improvements in processors was observed in 1965 by Intel co-founder Gordon Moore

The speed of a microprocessor doubles every 18-24 months, assuming the price of the processor stays the same (Wrong, not anymore!)

The price of a microchip drops about 48% every 18-24 months, assuming the same processor speed and on-chip memory capacity (Ok, for now)

The number of transistors on a microchip doubles every 18-24 months, assuming the price of the chip stays the same (Ok, for now)
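To get a feel for what a doubling every 18 to 24 months implies, the sketch below compounds it over a fixed period. The function name and the ten-year horizon are illustrative choices, not taken from the slide.

```python
def growth_factor(years, doubling_months):
    """Multiplicative growth after `years` if a quantity doubles
    every `doubling_months` months."""
    return 2 ** (years * 12 / doubling_months)


fast = growth_factor(10, 18)  # doubling every 18 months
slow = growth_factor(10, 24)  # doubling every 24 months

print(f"10 years, 18-month doubling: about {fast:.0f}x")
print(f"10 years, 24-month doubling: about {slow:.0f}x")
```

Even the slower rate gives a 32-fold increase per decade, which is why losing it for clock speed forced the shift toward more cores.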


No faster clocking but more Cores?

Source: Ed Davis, Intel


Accelerators and Dealing with the Moore’s Law Challenge Through Parallelism

| Device | Fab. Process (nm) | Freq (GHz) | # Cores | Peak SPFP (GFlops) | Peak DPFP (GFlops) | Peak Power (W) | DP GFlops/W | Memory BW (GB/s) | Memory type |
|---|---|---|---|---|---|---|---|---|---|
| PowerXCell 8i | 65 | 3.2 | 1 + 8 | 204 | 102.4 | 92 | 1.11 | 25.6 | XDR |
| Nvidia Kepler K40 | 28 | 0.75 | 2880 | 4290 | 1430 | 235 | 6.1 | 288 | GDDR5 |
| Intel Xeon Phi 7120P | 22 | 1.24 | 61 (244 threads) | 2417 | 1208 | 300 | 4.0 | 352 | GDDR5 |
| Intel Xeon 12-core 2.7 GHz E5-2697v2 | 22 | 2.7 | 12 | 518.4 | 259.2 | 130 | 1.99 | 59.7 | DDR3-1866 |
| AMD Opteron 6370P Interlagos | 32 | 2.5 | 16 | 320 | 160 | 99 | 1.62 | 42.7 | DDR3-1333 |
| Xilinx XC7VX1140T | 28 | - | - | 801 | 241 | 43 | 5.6 | - | - |
| Xilinx XCUV440 | 20 | - | - | 1306 | 402 | 80* | 5.0* | | |
| Altera Stratix V GSB8 | 28 | - | - | 604 | 296 | 59 | 5.0 | - | - |
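The DP Flops/W column can be reproduced by dividing peak double-precision GFlops by peak power. A minimal sketch using the table's own figures follows; the dictionary layout and shortened device names are illustrative choices.

```python
# (peak double-precision GFlops, peak power in watts), copied from the table.
devices = {
    "PowerXCell 8i": (102.4, 92),
    "Nvidia Kepler K40": (1430, 235),
    "Intel Xeon Phi 7120P": (1208, 300),
    "Intel Xeon E5-2697v2": (259.2, 130),
    "AMD Opteron 6370P": (160, 99),
    "Xilinx XC7VX1140T": (241, 43),
    "Altera Stratix V GSB8": (296, 59),
}

# Efficiency in GFlops per watt for each device.
efficiency = {name: gflops / watts for name, (gflops, watts) in devices.items()}

# Device with the highest double-precision efficiency.
best = max(efficiency, key=efficiency.get)

for name, value in sorted(efficiency.items(), key=lambda kv: -kv[1]):
    print(f"{name:24s} {value:5.2f} GFlops/W")
```

Sorted this way, the GPU and the FPGAs sit well above the conventional CPUs, which is the point the table makes about accelerators.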


Accelerators/Heterogeneous Computing

[Figure: a microprocessor coupled with accelerators: FPGAs, Cell, GPUs, Phi, …]

| Application | Speedup | Cost savings | Power savings | Size savings |
|---|---|---|---|---|
| DNA Match | 8723 | 22x | 779x | 253x |
| DES Breaker | 38514 | 96x | 3439x | 1116x |

El-Ghazawi et al. The Promise of HPRCs. IEEE Computer, February 2008


A General Execution Model for Heterogeneous Computers

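[Figure: a PC microprocessor (µP) paired with an accelerator. From µP to accelerator: transfer of control, input data. From accelerator to µP: output data, transfer of control. Accelerator examples: FPGA, GPU, Clearspeed, CELL B.E., Intel Xeon Phi.]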
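The execution model above implies a simple cost model: an offloaded kernel pays for input transfer, accelerator compute, and output transfer. The sketch below estimates when offloading beats staying on the host. All names and example numbers (link bandwidth, kernel speedup, data sizes) are hypothetical and chosen only for illustration; the two control transfers are folded into a single `latency_s` term.

```python
def offload_time(bytes_in, bytes_out, host_compute_s, kernel_speedup,
                 link_gb_per_s, latency_s=0.0):
    """Estimated time for one offload cycle:
    control + input transfer, accelerator compute, output transfer + control."""
    transfer_s = (bytes_in + bytes_out) / (link_gb_per_s * 1e9)
    accel_compute_s = host_compute_s / kernel_speedup
    return 2 * latency_s + transfer_s + accel_compute_s


def worth_offloading(bytes_in, bytes_out, host_compute_s, kernel_speedup,
                     link_gb_per_s, latency_s=0.0):
    """True when the whole offload cycle is faster than running on the host."""
    total = offload_time(bytes_in, bytes_out, host_compute_s,
                         kernel_speedup, link_gb_per_s, latency_s)
    return total < host_compute_s


# Hypothetical case 1: 1 GB each way over an 8 GB/s link, 10 s host kernel, 20x faster.
big_kernel = offload_time(1e9, 1e9, 10.0, 20.0, 8.0)

# Hypothetical case 2: same data volume, but only 0.2 s of host work to accelerate.
small_ok = worth_offloading(1e9, 1e9, 0.2, 20.0, 8.0)

print(f"Large kernel offload time: {big_kernel:.2f} s (host: 10.00 s)")
print(f"Small kernel worth offloading? {small_ok}")
```

In the second case the transfer alone costs more than the host run, so a 20x faster accelerator still loses.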


Challenges for Accelerators

1. Application must lend itself to the 90-10 rule, and different accelerators suit different types of computations

2. Programmer partitions the code across the CPU and accelerator

3. Programmer co-schedules CPU and accelerator, and ensures good utilization of the expensive accelerator resources

4. Programmer explicitly transfers data between CPU and accelerator

5. Accelerators are fast compared to the link, and the transfer overhead can render the use of the accelerator useless or harmful

6. Multiple programming paradigms are needed

7. New accelerator means learning/porting to a new programming interface

8. Changing the ratio of CPUs to accelerators also requires substantial programming unless accelerators are virtualized
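The first challenge can be made concrete with Amdahl's law. Assuming the 90-10 rule means that about 90% of run time sits in the code that gets offloaded (an interpretation, not stated on the slide), the overall speedup is capped by the 10% that stays on the CPU. The function name is illustrative.

```python
def overall_speedup(accelerated_fraction, kernel_speedup):
    """Amdahl's law: whole-application speedup when only a fraction
    of the run time is accelerated by `kernel_speedup`."""
    serial_fraction = 1.0 - accelerated_fraction
    return 1.0 / (serial_fraction + accelerated_fraction / kernel_speedup)


# Assumed 90% of run time in the accelerated kernel.
with_10x = overall_speedup(0.9, 10.0)
with_1000x = overall_speedup(0.9, 1000.0)

print(f"10x kernel   -> {with_10x:.2f}x overall")
print(f"1000x kernel -> {with_1000x:.2f}x overall (limit is 10x)")
```

This bound ignores the data transfer cost in item 5, which only lowers the achievable speedup further.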


Challenges for Advancing or for Exascale

1. Energy Efficiency
2. Interconnect Technology
3. Memory Technology
4. Scalable System Software
5. Programming Systems
6. Data Management
7. Exascale Algorithms
8. Algorithms for Discovery, Design & Decision
9. Resilience and Correctness
10. Scientific Productivity

Source: DoE ASCAC Subcommittee Report, Feb 2014

Legend: data movement and/or programming related (highlighting not preserved)


Exascale Technological Challenges

The Power Wall
• Frequency scaling is no longer possible, power increases rapidly

The Memory Wall
• Gap between processor speed and memory speed is widening

The Interconnect Wall
• Available bandwidth per compute operation is dropping
• Power needed for data movement is increasing

Programmability Wall, Resilience Wall, ...
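The power wall follows from the standard CMOS dynamic power relation, P ≈ C·V²·f. The sketch below assumes, as a common simplification not stated on the slide, that supply voltage must rise roughly in proportion to frequency, so dynamic power grows with the cube of clock speed. The function name and its flag are illustrative.

```python
def relative_dynamic_power(freq_scale, voltage_tracks_frequency=True):
    """Dynamic power relative to baseline, using P ~ C * V^2 * f.
    If voltage scales with frequency (a simplifying assumption),
    power scales as freq_scale ** 3."""
    voltage_scale = freq_scale if voltage_tracks_frequency else 1.0
    return voltage_scale ** 2 * freq_scale


# Doubling the clock versus doubling the core count at the same clock.
double_clock = relative_dynamic_power(2.0)  # cubic growth in power
double_cores = 2 * relative_dynamic_power(1.0)  # two cores, baseline clock

print(f"2x clock: {double_clock:.0f}x power")
print(f"2x cores: {double_cores:.0f}x power")
```

Under this assumption, doubling cores costs a quarter of the power that doubling the clock would, for the same peak throughput gain.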


The Data Movement Challenge

Locality matters a lot: cost (energy and time) increases rapidly with distance

Locality should be exploited at short distances, and is needed even more at long distances

[Figures: bandwidth density vs. system distance; energy vs. system distance. Source: ASCAC 14]


Data Movement and the Hierarchical Locality Challenge


Locality is Not Flat Anymore– Chip and System


Locality is Not Flat in Extreme Scale – Chip and System


Cray XC40


Locality in Extreme Scale – Chip and System Perspectives

[Figures: Cray XC40 (system), Tile64 (chip)]


What Does that Mean for Programmers

Exploiting Hierarchical Locality

Machine level and Chip level

Hierarchical Tiled Data Structures

Hierarchical Locality Exploitation with RTS

MPI+X
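One common way to exploit locality at the chip level is tiling: finish the work on a small block that fits in fast memory before moving on. The sketch below is a plain-Python blocked matrix multiply meant only to show the loop structure. The function name and the `tile` parameter are illustrative; a production code would use an optimized library and choose tile sizes per level of the memory hierarchy.

```python
def tiled_matmul(a, b, tile=2):
    """Multiply matrices a (n x m) and b (m x p), given as lists of lists,
    visiting the iteration space one tile at a time."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0.0] * p for _ in range(n)]
    # Outer loops walk over tiles; inner loops stay inside one tile,
    # so the same small blocks of a, b and c are reused while they are "hot".
    for ii in range(0, n, tile):
        for kk in range(0, m, tile):
            for jj in range(0, p, tile):
                for i in range(ii, min(ii + tile, n)):
                    for k in range(kk, min(kk + tile, m)):
                        a_ik = a[i][k]
                        for j in range(jj, min(jj + tile, p)):
                            c[i][j] += a_ik * b[k][j]
    return c


a = [[1, 2, 3], [4, 5, 6]]
b = [[7, 8], [9, 10], [11, 12]]
result = tiled_matmul(a, b, tile=2)
print(result)
```

Applying the same idea again, with tiles of tiles, gives a hierarchical tiled structure: large tiles map to nodes (the MPI level), smaller ones to cores or caches (the X level).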


General Implications

Short term programming challenge

Golden opportunity for smart programmers

New hardware advances are needed first, and they will influence software

These may be silicon based or nanotechnologies such as IBM's carbon nanotube transistors (9nm), and may keep things the way they are on the software side for a while


General Implications- Longer Run

Long-term hardware technology may move toward:
• Nano-photonics for computing
• Quantum computing

Many of the new hardware computing innovations may show up first as discrete accelerators, then as on-chip accelerators, and then move closer to the processor's internal circuitry (data path)


Longer term

The bad news: as the limits of silicon are approached, we may see departures from conventional methods of computing, which may dramatically change the way we conceive software

The good news: history has shown that good ideas from the past get resurrected in new ways


Conclusions

Graduating an intelligent IT workforce can be a golden egg for countries like Morocco

You can teach skills, but it is imperative to teach and stress concepts in the curriculum
• Stress parallelism
• Stress locality

See the recommendations by IEEE/NSF and SIAM for incorporating parallelism in Computer Science, Computer Engineering, and Computational Science and Engineering curricula, and add locality

For the very long term, there is nothing better than having good foundations in Physics and Math, even for CS and CE majors


Conclusions cont.

Integrate teaching soft skills, as President Ouaouicha said
• Communications
• Entrepreneurism and marketing, individually and in groups
• Patenting and legal