A Computational Physicist’s View of Reconfigurable High Performance Computing

Vincent Natoli
Reconfigurable Systems Summer Institute
11 July 2005


Outline

I. High Performance Computing: Past and Present
II. Field Programmable Gate Arrays
III. Reconfigurable High Performance Computing

PART I

High Performance Computing Past and Present

Brief History of HPC: The Early Years

[Figure: timeline of early machines and vendors, 1945 to 1990. Machines: Z3, Colossus, ENIAC, BINAC, UNIVAC, ERA 1101, SEAC, CDC 6600, IBM 360, Intel 4004, Cray 1, Cray XMP, Cray YMP. Vendors: ERA, RAND, CDC, IBM, Sperry, Fujitsu, CRAY, Hitachi, NEC, Alliant, Convex.]

Brief History of HPC: The Phantom Menace

1993-2000
− Decline of Vector Processors
− Rise of Commodity Processors

Brief History of HPC: Attack of the Clones

2000-2005
− Rise of Clusters (ASC Red, Blue, White, Q)

Brief History of HPC: The Empire Strikes Back

2002: Japanese Earth Simulator. Computnik?
− 5,120 (640 8-way nodes) 500 MHz NEC CPUs
− 8 GFLOPS per CPU (41 TFLOPS total)
− 2 GB Memory per CPU (10 TB total)
− 20 kVA power consumption per node
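The machine-wide totals quoted for the Earth Simulator follow from the per-CPU and per-node figures. The sketch below redoes that arithmetic; the function name is illustrative, and treating 1 kVA as roughly 1 kW is an assumption made only to estimate watts per GFLOPS.

```python
# Per-CPU and per-node figures quoted on the slide.
NODES = 640
CPUS_PER_NODE = 8
GFLOPS_PER_CPU = 8
GB_PER_CPU = 2
KVA_PER_NODE = 20


def earth_simulator_totals():
    """Derive machine-wide totals from the per-CPU and per-node figures."""
    cpus = NODES * CPUS_PER_NODE
    peak_gflops = cpus * GFLOPS_PER_CPU
    total_va = NODES * KVA_PER_NODE * 1000.0
    return {
        "cpus": cpus,
        "peak_tflops": peak_gflops / 1000.0,
        "memory_tb": cpus * GB_PER_CPU / 1024.0,
        "power_mva": total_va / 1e6,
        # Assumption: 1 VA is counted as 1 W for this rough estimate.
        "watts_per_gflops": total_va / peak_gflops,
    }


if __name__ == "__main__":
    for name, value in earth_simulator_totals().items():
        print(f"{name}: {value}")
```

Under that assumption the estimate comes out a little above 310 W/GFLOPS, in line with the 320 W/GFLOPS that the Metric Comparison slide lists for the JES.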

HPC Performance 1993-2005

All is well …
− #Proc: 1.3/yr
− Scalar: 1.4/yr
− Total: 1.8/yr

Or is it?
− 1 PFLOP by 2009
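The total growth rate is the product of the processor-count and scalar rates, and a projection such as "1 PFLOP by 2009" is a compound-growth extrapolation. The sketch below shows both steps; the 100 TFLOPS starting point for 2005 is a round illustrative assumption, not a figure from the slides.

```python
import math


def combined_rate(proc_rate, scalar_rate):
    """Total yearly growth when processor count and per-processor speed both grow."""
    return proc_rate * scalar_rate


def years_to_reach(start, target, rate_per_year):
    """Years of compound growth at rate_per_year needed to go from start to target."""
    return math.log(target / start) / math.log(rate_per_year)


if __name__ == "__main__":
    print(combined_rate(1.3, 1.4))  # close to the quoted 1.8/yr
    # Hypothetical starting point: 100 TFLOPS in 2005, target 1000 TFLOPS (1 PFLOP).
    print(2005 + years_to_reach(100.0, 1000.0, 1.8))
```

With that assumed starting point, a 1.8/yr rate needs just under four years for a tenfold increase, which is consistent with a 2009 date.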

Current Problems in HPC

The Studies
− (2002) DARPA: HPCS
− (2003) DoD: IHEC
− (2004) NCO/NITRD: HECRTF
− (2004) NRC: Future of Supercomputing
− (2004) DOE: HEC Revitalization Act
− “The Coming Crisis in Computational Science”, Doug Post

Summary of Results
Good News! Only two big problems: Hardware and Software

Hardware: Moore’s law
− Power Dissipation: More difficult to wring out clock speed increase
− Memory wall: Time to access memory in clock cycles is rising
− Divergence problem: sustained performance < 10% of Peak

Software: The Law of More
− Machines more and more complicated to program
− Machines are obsolete by the time software is ready

Moore’s Law

In 1965, Gordon Moore sketched out his prediction of the pace of silicon technology. Decades later, Moore’s Law remains true, driven largely by Intel’s unparalleled silicon expertise. Copyright © 2005 Intel Corporation.

1965

“The complexity for minimum component costs has increased at a rate of roughly a factor of two per year (see graph on next page). Certainly over the short term this rate can be expected to continue, if not to increase. Over the longer term, the rate of increase is a bit more uncertain, although there is no reason to believe it will not remain nearly constant for at least 10 years. That means by 1975, the number of components per integrated circuit for minimum cost will be 65,000.”

2005

“In terms of size [of transistor] you can see that we're approaching the size of atoms which is a fundamental barrier, but it'll be two or three generations before we get that far - but that's as far out as we've ever been able to see. We have another 10 to 20 years before we reach a fundamental limit. By then they'll be able to make bigger chips and have transistor budgets in the billions.”

Power Dissipation

[Figure: Power (Watts, log scale from 0.1 to 1000) for Intel processors: 4004, 8008, 8080, 8085, 8086, 80286, 80386, 80486, Pentium I, Pentium II, Pentium III, Pentium IV.]

The increase in power dissipation must stop. New engineering techniques have to be implemented to cap the rise in power.

Source: “IC Power: The Influence and Impact of Semiconductor Technology”, Presentation by Marc Knox (IBM), Burn-in & Test Socket Workshop, March 7-10, 2004. http://www.bitsworkshop.org/archive/archive2004/2004s1.pdf

Memory Wall

[Figure: Memory Access Times (SIA Clock Est), log scale from 1 to 100, versus Cache Capacity (KB), log scale from 1 to 10000, with one curve per technology node: 180nm, 130nm, 100nm, 70nm, 50nm.]

Divergence Problem

Requirements of HPC for Science and Engineering and requirements for the commercial market are diverging
− Memory Bandwidth
− Interconnect Latency
− Interconnect Bandwidth
− HP Parallel I/O

Source: Horst Simon, The Divergence Problem, 18th International Supercomputer Conference ISC2003, Heidelberg, Germany, June 2003. http://www.nersc.gov/~simon/Talks/ISC2003_rev5.pdf

Software: The Law of More

− Researchers spending more time on software development
− Problem will get more acute as the number of processors increases
− Software done… Machine obsolete

The Capability Gap

Reliance on commodity solutions has led us on a path that has delivered great capacity computing solutions but has left a gap in capability computing.

Summary

− Maturing Field: Scientific HPC is still a very young field. Its final state is as yet unknown.
− Commodity Rules: We have had a free ride on Moore’s law that has led commodity solutions to dominate the HPC market.
  − Good: Excellent price/performance; Wide dissemination of skills.
  − Bad: Low sustained performance ~10% of peak; Difficult programming model
− Technology Divergence: Dependence on increased clock speed and increased number of processors is now in jeopardy
  − End of the Roadmap?
  − Signs of a transition: Multi-core chips; Less attention to clock speed
  − Prediction: Intel will solve the power problem, but…
  − How do you divide work among 100,000 processors? Good for huge problems but what about doing small to medium sized problems faster? What about capacity computing?

Summary

− Market Divergence: Increasingly, the interest of HPC is diverging from the commodity market.
  − Less motivation for chip vendors to provide massive FP performance
  − Who ordered multi-core chips? Will they share the FP units?
− JES: HPC market still dominated by US manufacturers.
  − JES slipped to #4
  − $7,500/GFLOP
  − Limited Access… not the path to widespread use of supercomputing

How do we continue the logical progression of commodity-based supercomputing?

History of HPC, Next Chapter: A New Hope

2005: Reconfigurable Computing; FPGAs

PART II

Field Programmable Gate Arrays

“The real significance of a great invention ultimately rests in its ability to transcend its original purpose and empower radically new ideas. Reconfigurable hardware holds such an extraordinary potential. Not only will it give us additional flexibility in our current directions, but more importantly, it will create the substrate for emergent capabilities of awesome reach.” --- Federico Faggin

Field Programmable Gate Arrays

The old way (1984)
− Lots of wire
− Vast quantities of Mountain Dew
− Healthy disregard for personal hygiene

The new way (2005)
− Very little wire
− Vast quantities of Mountain Dew
− Healthy disregard for personal hygiene

Field Programmable Gate Arrays

− Mask Programmable Gate Arrays (MPGAs)
− Field Programmable Gate Arrays

Field Programmable Gate Arrays

...a general purpose multi-level programmable device that is customized in the package by the end-user...

[Figure: Disconnected Gates + Software description of circuit (10110111100) → Connected Gates]

Field Programmable Gate Arrays (FPGAs)

[Figure: Cost versus Volume for ASIC and FPGA, with the NRE cost marked.]

Field Programmable Gate Arrays

Advantages of FPGAs
− Reduced design cycle
− Increased security
− Commodity parts (better reliability, lower cost)
− Faster upgrades (technology generation, application)
− Design for speed

Disadvantages of FPGAs
− Historically trail by one technology generation
− Switching fabric takes up space
− Slow clock (~200-300 MHz vs. 3 GHz)

Field Programmable Gate Arrays (FPGAs)

Traditional uses
− Communications
− Data Acquisition
− Signal Processing
− Embedded Processing
− Hardware testing

Industries using FPGAs
− Military Aerospace and Defense
− Automotive
− Consumer (STBs, Broadband, PDAs, HDTV)
− Networking (Routers, Switches, Modems)

Key Features
− Programmability
− Re-Programmability

FPGAs and General Computing

Algorithms implemented in hardware can be many times faster than software implementations

Examples:
− Graphics cards and chips
− Grape; Grape-MD
− QCDOC Lattice Gauge Theory on a chip

FPGAs and General Computing

IDEA! Offload bottleneck calculations to a programmable circuit
− 90/10 rule

[Figure: CPU coupled to an FPGA]
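One way to quantify the offload idea is Amdahl's law, which is brought in here as an illustration and is not named on the slide. The sketch assumes the 90/10 rule means that about 90% of the run time sits in the kernels moved to the FPGA; the function name and the sample speedups are hypothetical.

```python
def overall_speedup(offloaded_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of the run time is accelerated."""
    if not 0.0 <= offloaded_fraction <= 1.0:
        raise ValueError("offloaded_fraction must lie in [0, 1]")
    if kernel_speedup <= 0.0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - offloaded_fraction  # part that stays on the CPU
    return 1.0 / (remaining + offloaded_fraction / kernel_speedup)


if __name__ == "__main__":
    # Hypothetical kernel speedups; 0.9 is the assumed offloaded share of run time.
    for s in (10.0, 100.0, 1000.0):
        print(s, overall_speedup(0.9, s))
```

With 90% of the time offloaded, the overall gain approaches but never exceeds 10x, so the part left on the CPU soon dominates.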

Spatial vs. Temporal Computing

CPUs are temporal processors
− Algorithm translated into set of instructions and data which are interpreted sequentially

FPGAs are spatial processors
− Algorithm is laid out in space in digital hardware

[Figure: (a) Spatial and (b) Temporal computations for the expression y[i]=w1*x[i]+w2*x[i-1]+w3*x[i-2]+w4*x[i-3]]

“The Density Advantage of Configurable Computing”, Andre DeHon, Computer, April 2000
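The four-tap expression can be modelled in software both ways. The sketch below is only a Python model, not hardware: `fir_temporal` performs one multiply-accumulate at a time, as a CPU would, while `fir_spatial` mimics a shift register feeding four multipliers and an adder tree that all act in the same clock tick. Both function names are illustrative, and samples before the start of the input are assumed to be zero.

```python
def fir_temporal(x, w):
    """CPU-style evaluation: one multiply-accumulate per step, in program order."""
    y = []
    for i in range(len(x)):
        acc = 0
        for k in range(len(w)):  # each tap is a separate sequential step
            if i - k >= 0:
                acc += w[k] * x[i - k]
        y.append(acc)
    return y


def fir_spatial(x, w):
    """Model of a spatial datapath: every loop iteration is one clock tick."""
    taps = [0] * len(w)  # shift register holding x[i], x[i-1], ...
    y = []
    for sample in x:
        taps = [sample] + taps[:-1]  # shift the new sample in
        # In hardware, all multipliers would fire side by side in this tick.
        products = [wk * tk for wk, tk in zip(w, taps)]
        y.append(sum(products))  # adder tree
    return y


if __name__ == "__main__":
    x = [1, 2, 3, 4, 5]
    w = [1, 2, 3, 4]  # w1..w4
    print(fir_temporal(x, w))
    print(fir_spatial(x, w))
```

Both functions return the same values. In hardware the spatial form yields one output per clock tick however many taps there are, whereas the temporal form needs a number of steps per output that grows with the tap count.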

Why are FPGAs fast?

− Parallelism
− Pipelining
− Instruction efficiency
− Latency

Maximum Filter: 3x3 box on 1024x1024 image

                    Clock (MHz)   Clock Cycles   Parallelism   Instr/Pixel          CPI
Pentium III         1,000         236,865,600    1             210                  1.08
FPGA (XC2V2000E)    40            131,072        8             8 (Pipeline Depth)   N/A

Time(CPU) = (1/Clock) X N(Pixels) X (Clocks/Instr.) X (Instr/Pixel)
Time(FPGA) = (1/Clock) X N(Pixels) / P
Ratio = .04 X 8 X 1.08 X 210 = 70.1

“A Quantitative Analysis of the Speedup Factors of FPGAs over Processors”, Guo, Najjar, Vahid, Vissers, FPGA’04
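The two timing formulas can be evaluated directly with the table's values. This is a sketch using illustrative function names; it takes the full 1024x1024 image as the pixel count, which is an assumption about how the slide counts pixels.

```python
def cpu_time(clock_hz, n_pixels, cpi, instr_per_pixel):
    """Time(CPU) = (1/Clock) x N(Pixels) x (Clocks/Instr) x (Instr/Pixel)."""
    return n_pixels * cpi * instr_per_pixel / clock_hz


def fpga_time(clock_hz, n_pixels, parallelism):
    """Time(FPGA) = (1/Clock) x N(Pixels) / P."""
    return n_pixels / parallelism / clock_hz


N_PIXELS = 1024 * 1024  # assumption: every pixel of the image is processed

t_cpu = cpu_time(1000e6, N_PIXELS, 1.08, 210)  # Pentium III row
t_fpga = fpga_time(40e6, N_PIXELS, 8)          # FPGA row
speedup = t_cpu / t_fpga

if __name__ == "__main__":
    print(f"CPU time:  {t_cpu * 1e3:.1f} ms")
    print(f"FPGA time: {t_fpga * 1e3:.2f} ms")
    print(f"Speedup:   {speedup:.1f}")
```

With these inputs the FPGA cycle count matches the table's 131,072, and the factors .04 x 8 x 1.08 x 210 multiply to about 72.6, slightly above the 70.1 quoted on the slide.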

Scaling Advantage of FPGAs

Moore’s Law: Number of transistors doubles every 2.03 years over last 35 years.
− CPUs: Lower capacitance, faster clock, more FLOPS
− Memory: Smaller feature size, higher density, more capacity
− FPGAs: Faster clock and more capacity, even more FLOPS
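A doubling period converts to a yearly growth factor as 2^(1/T). The sketch below applies that to the 2.03-year figure; the function names are illustrative.

```python
def annual_growth_factor(doubling_period_years):
    """Yearly multiplier implied by a given doubling period."""
    return 2.0 ** (1.0 / doubling_period_years)


def growth_over(years, doubling_period_years):
    """Total multiplier accumulated over a span of years."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    print(annual_growth_factor(2.03))  # yearly factor
    print(growth_over(35, 2.03))       # total growth across 35 years
```

Doubling every 2.03 years works out to roughly 1.4x per year, the same yearly factor the deck quotes for scalar and CPU performance growth.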

DeHon’s Law

The computational capacity of FPGAs exceeds that of CPUs and the gap is increasing.

1996 DEC Alpha
− .18µ technology
− 208mm2 die size
− 6.8x10^9 λ^2
− 2.3ns Clock
− 128 ALU bitops/clock
− 8.6

“The Density Advantage of Configurable Computing”, Andre DeHon, Computer, April 2000

DeHon’s Law: FP Addition

[Figure: Historical and Projected Scaling of FP Addition on FPGAs and CPUs. Marked 2003 data points: Virtex II Pro 100-6 and Pentium 4 3.2GHz.]

“FPGAs vs. CPUs: Trends in Peak Floating-Point Performance”, Keith Underwood, FPGA’04

DeHon’s Law: FP Multiplication

[Figure: Historical and Projected Scaling of FP Multiplication on FPGAs and CPUs. Marked 2003 data points: Virtex II Pro 100-6 and Pentium 4 3.2GHz.]

“FPGAs vs. CPUs: Trends in Peak Floating-Point Performance”, Keith Underwood, FPGA’04

Trends in FPGA FP Performance

[Figure: Xilinx Products. Logic Cells (thousands, 0 to 250) versus Time (1996 to 2005) for XC4000XL, Virtex-E, Virtex-2 and Virtex-4. Data from Xilinx 2005 Annual Report.]

CPU: Flops increase = 1.4 /yr
FPGA: Flops increase = 1.59 (Spatial Factor) x 1.39 (Time Factor) = 2.2 /yr

Each year FPGA FP capability increases 50% faster than that of CPUs!
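The FPGA rate is the product of a spatial factor (more logic cells) and a time factor (faster clock), and the "50% faster" claim is the ratio of the two yearly rates. The sketch below compounds that ratio; the function names and the five-year horizon are illustrative choices.

```python
CPU_RATE = 1.4         # yearly FLOPS growth quoted for CPUs
SPATIAL_FACTOR = 1.59  # yearly growth in FPGA capacity
TIME_FACTOR = 1.39     # yearly growth in FPGA clock


def fpga_rate(spatial, time):
    """Yearly FPGA FLOPS growth as the product of capacity and clock growth."""
    return spatial * time


def relative_advantage(fpga, cpu, years):
    """How much further the FPGA curve pulls ahead of the CPU curve after some years."""
    return (fpga / cpu) ** years


if __name__ == "__main__":
    rate = fpga_rate(SPATIAL_FACTOR, TIME_FACTOR)
    print(rate)                                   # about 2.2 per year
    print(relative_advantage(rate, CPU_RATE, 1))  # a bit over 1.5 per year
    print(relative_advantage(rate, CPU_RATE, 5))  # compounded over five years
```

If both rates held, the gap would widen by nearly a factor of ten in five years.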

Summary

− FPGAs are a candidate to fill the capability gap
  − Can provide significant hardware acceleration to a wide variety of problems
  − Fits profile for continuation of commodity performance solutions
− DeHon’s law: The computational capacity of FPGAs is increasing at a faster rate than that of CPUs
  − Sometime in the last few years FPGA FP performance surpassed that of CPUs
  − FPGAs and CPUs benefit equally from coming semiconductor processing technology improvements

PART III

Reconfigurable High Performance Computing

“The most constant difficulty in contriving the engine has arisen from the desire to reduce the time in which the calculations were executed to the shortest which is possible.” --- Charles Babbage

“I feel the need…the need for speed” --- Tom Cruise (Mav)

Pentium Prescott

− 90 nm CMOS
− Clock 3.4 GHz

[Figure: annotated die photo marking the L2 Cache and the Floating Point Operations area (~7% of chip area). Source: www.chip-architect.com]

What HPC Users Want

How Computational Physicists see chips
[Figure: chip area divided into FP and Other]

How Computational Physicists would design chips
[Figure: chip area divided into FP, Other, Cache and Bessel Functions]

The House and the Dishwasher

We have lots of dishes to do
− What Computer Scientists have given us
− What we want

Cray Quote

"If you were plowing a field, which would you rather use? Two strong oxen or 1024 chickens?" --- Seymour Cray

Xilinx XC4VLX200

− 90 nm CMOS
− 200,448 Logic Cells
− 750 kB BRAM
− 96 18x18 bit Multipliers
− Rocket IO 1.5 GB/s
− Clock <500MHz

Xilinx XC4VLX200

32 bit Integer and Fixed Point
− Thousands of Arithmetic Units

Floating Point
− 600 SP Floating Point Multipliers
− 100 SP Floating Point Dividers
− 100 DP Floating Point Multipliers
− 20 DP Floating Point Dividers
− SP != 2 X DP

Theoretical Peaks
− SP Floating Point 20-120 GFLOPs
− DP Floating Point 4-20 GFLOPs
− Integer .5-1 TOP

Nallatech Floating Point Core data used
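The peak ranges can be reproduced as unit count times clock rate, assuming each floating-point unit delivers one result per clock. The 200 MHz clock below is an assumption chosen because it reproduces the quoted ranges; the slides only state that the clock is below 500 MHz.

```python
ASSUMED_CLOCK_HZ = 200e6  # assumption, not stated on the slide

# Unit counts quoted on the slide.
UNITS = {
    "SP multipliers": 600,
    "SP dividers": 100,
    "DP multipliers": 100,
    "DP dividers": 20,
}


def peak_gflops(unit_count, clock_hz=ASSUMED_CLOCK_HZ):
    """Peak rate if every unit produces one result per clock cycle."""
    return unit_count * clock_hz / 1e9


if __name__ == "__main__":
    for name, count in UNITS.items():
        print(f"{name}: {peak_gflops(count):.0f} GFLOPs")
```

Under that assumption the divider counts give the low end of each range (20 SP, 4 DP) and the multiplier counts give the high end (120 SP, 20 DP).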

Metric Comparison

                         Pentium-4   Virtex-4     JES
GFLOPs                   6-8         20-120       40,000
Density (mm2/GFlops)     16.9        0.14-0.81    ?
Power (W/GFlops)         16.5        0.05-0.32    320
Economics ($/GFlops)     $60         $15-$99      $7,500

HPC Concerns

− Hey! The clock speed is slower!
− Can I do Floating Point?
− Where is the chip?
− What about the memory wall?
− What about the bandwidth?
− How do I program it?
− Where is the Mountain Dew?

Floating Point Cores

Commercial
− Nallatech Single and Double Precision FP Cores
− Dillon Engineering
− Xilinx Coregen

Academic
− Miriam Leeser (Northeastern U.)
− Viktor Prasanna (USC)
− Peter Athanas (VA Tech)

FPGAs and The Memory Wall

− No instruction fetch
− Larger “Register Set”
− More flexibility in memory configuration and access
− Slower clock

Nallatech BenData-WS
− 6 Banks of ZBT-SRAM
− 8GB/s throughput

PCI Bandwidth Problem

Overall FP Performance = FP Ops / (Data xferred / PCI BW + FP Ops / FPGA Performance)

Blas 1
− Vector multiplication: X·X
− Assume vector length N
− Data Transferred: 8N bytes
− Number of operations: 2N
− FP Operations per Byte xferred: 1/4
− PCI 64Bit 64 MHz = 512MB/s
− Performance: 64 MFlops

Blas 3
− Matrix Multiplication: A·A
− Assume NxN matrices
− Data Transferred: 8N^2 bytes
− Number of operations: N^3
− FP Operations per Byte xferred: N/8
− Performance: FPGA performance for N>100-200
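The overall-performance expression (FP ops divided by transfer time plus compute time) can be evaluated for both cases. In this sketch the 4 GFLOPS FPGA rate is a hypothetical value taken from the low end of the DP range quoted earlier, and the function names are illustrative.

```python
PCI_BW = 512e6    # bytes/s: 64-bit bus at 64 MHz, as on the slide
FPGA_FLOPS = 4e9  # hypothetical sustained FPGA rate (assumption)


def overall_flops(fp_ops, bytes_moved, bus_bytes_per_s, fpga_flops):
    """FP ops divided by (transfer time + compute time)."""
    transfer_time = bytes_moved / bus_bytes_per_s
    compute_time = fp_ops / fpga_flops
    return fp_ops / (transfer_time + compute_time)


def blas1(n):
    """Dot product X.X: 2N operations on 8N bytes."""
    return overall_flops(2 * n, 8 * n, PCI_BW, FPGA_FLOPS)


def blas3(n):
    """Matrix product A.A: N^3 operations on 8N^2 bytes, using the slide's counts."""
    return overall_flops(n ** 3, 8 * n ** 2, PCI_BW, FPGA_FLOPS)


if __name__ == "__main__":
    print(f"BLAS 1, N=1e6:  {blas1(10 ** 6) / 1e6:.0f} MFLOPS")
    for n in (10, 100, 200, 1000):
        print(f"BLAS 3, N={n}: {blas3(n) / 1e9:.2f} GFLOPS")
```

The BLAS 1 case stays pinned near the bus limit whatever the FPGA rate, while the BLAS 3 case approaches the FPGA rate as N grows. With 8N bytes the bus limit is 128 MFLOPS; the 64 MFlops on the slide may correspond to transferring two distinct vectors (16N bytes).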

Barriers to Use

Development Environment
− Unfamiliar to application programmers
− Human is least programmable part of development chain
− Biased toward digital designers

Bandwidth
− PCI boards currently present a bottleneck
− PCI-Express will help (1Q06)

Hardware is expensive
− $8K - $250K+

Software is expensive
− $2,500 - $25,000 for development tools

Technology Adoption
− HPC developers are willing to adopt new hardware and make reasonable changes to codes

RHPC Algorithms

− Time stepping
− Monte Carlo
− Integer based
  − e.g. BioInformatics, Encryption
− Spatially Local (e.g. FD)
− Pixel processing
− Digital filtering
− Convolutions and Transforms (FFT, WT)
− Data Compression; Encryption

− High Computation to Bandwidth ratio
− Trivially Parallel
− Minimize data transfer
− Repetitive application of kernels
− Long Pipelines of repetitive kernels

Recent FP FPGA Applications

CFD: Schnore and Smith (GE)

138 x 66 x 52 Mesh

             P4 (GFlops)   RCC (GFlops)   Speedup
Euler        .154          6.4            41x
Viscous      .077          10.3           133x
Smoothing    .086          2.2            25x

− 2 Nallatech BenNuey
− 4 Nallatech BenData-WS

“Towards an RCC-based Accelerator for Computational Fluid Dynamics Applications”, Schnore and Smith, GE

Future

− Bigger FPGAs (2006-2007) >500K logic cells
− Lots of headroom on the FPGA clock
− FPGAs with more embedded hardware
− Multi-core chips; Less mention of Clock
− Tighter coupling between GPP and RC

Conclusions

− The HPC community is confronting a capability gap
  − Hardware and software challenges threaten the progression of system performance
− Reconfigurable computing and FPGAs are a potential solution
  − Intrinsic dimensionally rooted scaling laws favor reconfigurable spatial computing over temporal computing
  − FPGAs have a greater intrinsic computational density and the gap is growing
− Many challenges are ahead (e.g. Development environments, bandwidth, cost)
− Merging of hardware and software
  − The hardware is the software; The software is the hardware

An Operating System, Software tools and Applications to enable the use of reconfigurable devices by HPC programmers and users

Frontier
− Logical extension of UNIX to manage reconfigurable co-processing.

SCC (Stone Ridge Compiler Collection)
− Compiler tools that target Frontier and conform to HPC development methodologies

[email protected]