Reiner Hartenstein (keynote address): "From Organic Computing to Reconfigurable Computing"; The 8th Workshop on Parallel Systems and Algorithms (PASA 2006), co-located with the 19th Int'l Conf. on Architecture of Computing Systems (ARCS 2006) Frankfurt/Main, Germany, March 13-16, 2006 1
Reiner Hartenstein, TU Kaiserslautern, Germany http://hartenstein.de
From Organic Computing to Reconfigurable Computing
Reiner Hartenstein
TU Kaiserslautern
PASA, Frankfurt, March 16, 2006
© 2005, [email protected] http://hartenstein.de
TU Kaiserslautern
Reconfigurable Computing (RC) and FPGA*
in the media
Design Starts until 2010: from 80,000 to 110,000
[Dataquest]
June 2005
fastest growing segment of the semiconductor market:
~6 billion US-$ [Dataquest]
*) Field-Programmable Gate Array
Google: 10 million hits
The Pervasiveness of RC
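[Chart: number of Google hits for the search “FPGA and ….”; the search terms themselves were part of the image and are not recoverable]
Hit counts shown: 162,000; 127,000; 158,000; 113,000; 171,000; 194,000; 1,620,000; 915,000; 398,000; 272,000; 647,000; 1,490,000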
>> Outline <<
• Reconfigurable Computing Paradox
• Von Neumann losing its dominance
• Software vs. Configware
• The dual paradigm approach
• Coarse-grained Reconfigurable Devices
• Conclusions
The RC Paradox
Effective integration density much worse than the Gordon Moore curve: by a factor of more than 10,000
„very power-hungry“ [Rick Kornfeld*]
*) personal communication
application development: until recently still Logic Design on a very strange platform
The awful technology of FPGAs:
FPGAs run at lower clock frequencies, draw more power and are more expensive.
fine-grained RC: low effective integration density
immense area inefficiency
reconfigurability overhead
routing congestion
wiring overhead
overhead: > 10,000
[Chart: transistors / microchip (log scale, 10^0 to 10^9) over the years 1980 to 2010; density curves: FPGA physical, FPGA routed, FPGA logical]
published speed-up factors
[Chart: relative performance (log scale, 10^0 to 10^9) over the years 1980 to 2010; microprocessor reference points: 8080, P4]
Source: http://xputers.informatik.uni-kl.de/faq-pages/fqa.html
Application domains shown: DSP and wireless; image processing, pattern matching, multimedia; bioinformatics; astrophysics; crypto
• Los Alamos traffic simulation: 47
• real-time face detection: 6000
• video-rate stereo vision: 900
• pattern recognition: 730
• SPIHT wavelet-based image compression: 457
• Smith-Waterman pattern matching: 288
• BLAST: 52
• protein identification: 40
• molecular dynamics simulation: 88
• Reed-Solomon Decoding: 2400
• Viterbi Decoding: 400
• FFT: 100
• Grid-based DRC (no FPGA: DPLA on MoM by TU-KL): 2000
• 2-D FIR filter (no FPGA: DPLA by TU-KL): 39.4
• Lee Routing (DPLA by TU-KL): 160
• Grid-based DRC („fair comparison“): 15000
• GRAPE: 20
Further labels in the chart whose pairing is not recoverable: 100 000; 1000 MAC
DeHon‘s Law
[Chart: MOPS / milliWatt (log scale, 1 to 1000) versus feature size in µ (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07); curves: RISC, FPGA]
However ....
Application migration [from supercomputer] resulting in performance increase up to 4 orders of magnitude
Reducing electricity bill by an order of magnitude
Hits the memory wall from a different direction
People think that high-performance must mean expensive
why the RC paradigm shift is so important
Move the stool or the grand piano?
by Software
by Configware
Cray XD1
vN paradigm losing its dominance
Xilinx inside !
von Neumann is not the common model
[Diagram: von Neumann instruction-stream-based machine: CPU (DPU with program counter) and RAM memory, connected through the von Neumann bottleneck]
mainframe age: CPU (instruction-stream-based, software)
microprocessor age: CPU (instruction-stream-based, software) plus accelerator co-processors (data-stream-based, hardware)
the tail is wagging the dog
vN paradigm dominance ?
Here is the common model
[Diagram: von Neumann instruction-stream-based machine: CPU (DPU with program counter) and RAM memory, connected through the von Neumann bottleneck]
mainframe age: CPU (instruction-stream-based, software)
microprocessor age: CPU (instruction-stream-based, software) plus accelerator co-processors (data-stream-based, hardware)
configware age: CPU, accelerator hardwired, accelerator reconfigurable (morphware)
Here is the common model
[Diagram: von Neumann instruction-stream-based machine: CPU (DPU with program counter) and RAM memory, connected through the von Neumann bottleneck]
mainframe age: CPU (instruction-stream-based, software)
microprocessor age: CPU (instruction-stream-based, software) plus accelerator co-processors (data-stream-based, hardware)
configware age: CPU, accelerator reconfigurable (morphware)
software/configware co-compiler
Fundamentally different mind set
no program counter
non-von-Neumann
completely different OS principles
no instruction fetch at run time
it’s configware: definitely it is not software
Compilation: Software vs. Configware
source languages: C, FORTRAN, MATLAB
Software Engineering: source program -> software compiler -> software code
Configware Engineering: source „program“ -> configware compiler, consisting of
  mapper (placement & routing) -> configware code
  data scheduler -> flowware code
Nick Tredennick’s Paradigm Shifts explain the differences
Software Engineering (software, CPU): resources: fixed; algorithm: variable; 1 programming source needed
Configware Engineering: resources: variable (configware); algorithm: variable (flowware); 2 programming sources needed
Co-Compilation
Software / Configware Co-Compiler
source languages: C, FORTRAN, MATLAB
automatic SW / CW partitioner, feeding
  software compiler -> software code
  configware compiler: mapper -> configware code; data scheduler -> flowware code
Organic Computing ? Bio-inspired use of FPGAs
• evolvable „hardware“ community:
• crossover of chromosomes
• In love with genetic algorithms: Darwinian way to fitness through generations of populations
• inefficient, but unexpected results possible
• simulated annealing (genetic morphing) - fitness by synthesis: highly efficient
Software / Configware Co-Compilation
Juergen Becker’s CoDe-X, 1996
C language source -> Partitioner (with Analyzer / Profiler, and Resource Parameters supporting different platforms)
  SW compiler -> SW code (“vN" machine paradigm)
  CW compiler -> CW Code and FW Code (Kress/Kung machine paradigm)
Co-Compiler for Hardwired Kress/Kung Machine [e. g. Brodersen]
Software / Flowware Co-Compiler
source -> automatic SW / CW partitioner, feeding
  software compiler -> software code
  flowware compiler: data scheduler -> flowware code
The dual paradigm approach
von Neumann paradigm Kress-Kung paradigm
Software Engineering
Configware Engineering
ASM
CPU
DPA
[Diagram: data path array (DPA) of operator nodes, fed by input data streams and producing output data streams; each stream is drawn over time and port #; the array is surrounded by ASM blocks and forms a pipe network]
Data streams (flowware)
Flowware defines: ... which data item at which time at which port
algebraic synthesis algorithms: H. T. Kung paradigm (systolic array)
ASM: Auto-Sequencing Memory (RAM with GAG), implemented by distributed memory
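The statement that flowware defines which data item arrives at which time at which port can be illustrated with a small sketch. The Python below is only an illustration: the function name `skewed_schedule` and the choice of a diagonal (time-skewed) wavefront, as is common when feeding systolic arrays, are assumptions and are not taken from the slides.

```python
from typing import Dict, List, Tuple


def skewed_schedule(streams: List[List[str]]) -> Dict[Tuple[int, int], str]:
    """Return a mapping (time step, port number) -> data item.

    Hypothetical scheduling rule: port p receives its k-th item at
    time step k + p, so the streams enter the array as a diagonal wavefront.
    """
    schedule: Dict[Tuple[int, int], str] = {}
    for port, items in enumerate(streams):
        for k, item in enumerate(items):
            schedule[(k + port, port)] = item
    return schedule


if __name__ == "__main__":
    # Three input data streams, one per port (made-up item names).
    streams = [["a0", "a1", "a2"], ["b0", "b1", "b2"], ["c0", "c1", "c2"]]
    sched = skewed_schedule(streams)
    last_step = max(t for t, _ in sched)
    for t in range(last_step + 1):
        row = [sched.get((t, p), "-") for p in range(len(streams))]
        print(f"t={t}: {' '.join(row)}")
```

The printed rows resemble the slide's picture of data items laid out over time and port number: a dash marks a port that receives nothing in that time step.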
DSP platform FPGA [courtesy Xilinx Corp.]
• 500 MHz Flexible Soft Logic Architecture, 200K Logic Cells
• 500 MHz Programmable DSP Execution Units
• 0.6-11.1 Gbps Serial Transceivers
• 500 MHz PowerPC™ Processors (680 DMIPS) with Auxiliary Processor Unit
• 1 Gbps Differential I/O
• 500 MHz multi-port Distributed 10 Mb SRAM
• 500 MHz DCM Digital Clock Management
Generalization of the systolic array ....
discard algebraic synthesis methods
[Rainer Kress]
use optimization algorithms instead
for example: simulated annealing
the achievement: non-linear and non-uniform pipes, and even wilder pipe structures, also become possible
now reconfigurability makes sense
remedy?
Coarse grain is about computing, not logic
Example: mapping onto rDPA by DPSS: based on simulated annealing
SNN filter on KressArray (mainly a pipe network) [Ulrich Nageldinger]
array size: 10 x 16 = 160 rDPUs
rDPU: reconfigurable function block, e. g. 32 bits wide
Legend: rDPU not used; rDPU used for routing only (route-through only); operator and routing; port location marker; backbus connect
no CPU
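To make the idea of mapping by simulated annealing more tangible, here is a minimal sketch that places operators on a small grid of rDPUs and tries to shorten the connections between them. It is not the DPSS algorithm: the names `wire_length` and `anneal_placement`, the Manhattan-distance cost, the linear cooling schedule and all parameter values are assumptions chosen for illustration.

```python
import math
import random
from typing import Dict, List, Optional, Tuple

Pos = Tuple[int, int]


def wire_length(place: Dict[str, Pos], nets: List[Tuple[str, str]]) -> int:
    """Total Manhattan distance over all operator-to-operator connections."""
    return sum(abs(place[a][0] - place[b][0]) + abs(place[a][1] - place[b][1])
               for a, b in nets)


def anneal_placement(ops: List[str], nets: List[Tuple[str, str]],
                     rows: int, cols: int, steps: int = 4000,
                     t_start: float = 4.0, seed: int = 0):
    """Place operators on a rows x cols grid; return (best placement, cost)."""
    if len(ops) > rows * cols:
        raise ValueError("more operators than rDPUs")
    rng = random.Random(seed)
    cells = [(r, c) for r in range(rows) for c in range(cols)]
    # grid[i] holds the operator placed on cells[i], or None if unused.
    grid: List[Optional[str]] = list(ops) + [None] * (len(cells) - len(ops))
    rng.shuffle(grid)

    def placement() -> Dict[str, Pos]:
        return {op: cells[i] for i, op in enumerate(grid) if op is not None}

    cost = wire_length(placement(), nets)
    best, best_cost = placement(), cost
    for step in range(steps):
        temp = t_start * (1.0 - step / steps) + 1e-6  # linear cooling
        i, j = rng.sample(range(len(cells)), 2)
        grid[i], grid[j] = grid[j], grid[i]           # propose a swap
        new_cost = wire_length(placement(), nets)
        delta = new_cost - cost
        if delta <= 0 or rng.random() < math.exp(-delta / temp):
            cost = new_cost                           # accept the move
            if cost < best_cost:
                best, best_cost = placement(), cost
        else:
            grid[i], grid[j] = grid[j], grid[i]       # reject: undo the swap
    return best, best_cost


if __name__ == "__main__":
    ops = ["mul0", "mul1", "add0", "add1", "out"]
    nets = [("mul0", "add0"), ("mul1", "add0"), ("add0", "add1"), ("add1", "out")]
    best, cost = anneal_placement(ops, nets, rows=4, cols=4)
    print(cost, best)
```

Because worse placements are occasionally accepted while the temperature is high, the search can leave local minima, which is what allows irregular, non-uniform pipe structures to be found without algebraic synthesis methods.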
coarse-grained RC: high integration density
The Reconfigurable Computing Paradox
[Chart: transistors / microchip (log scale, 10^0 to 10^9) over the years 1980 to 2010; labels shown: FPGA routed; > 10,000]
Claassen‘s Law + Hartenstein‘s Amendment
[Chart: MOPS / milliWatt (log scale, 0.001 to 1000) versus feature size in µ (2, 1, 0.5, 0.25, 0.13, 0.1, 0.07); curve label: DSP]
commercial rDPA example: PACT XPP - XPU128
XPP128 rDPA
• Full 32 or 24 Bit Design working silicon
• 2 Configuration Hierarchies
• Evaluation Board available, and
• XDS Development Tool with Simulator
[Diagram: (r)DPA built from rDPUs; each rDPU is a PAE core with ALU, Ctrl and CFG blocks; buses not shown]
© PACT AG, http://pactcorp.com
Conclusions
RC is reducing cost without loss of performance and flexibility.
FPGAs may be configured like a microprocessor to run C/C++ code.
An FPGA can perform a specific algorithm at very high speed.
Using a high-level language, the FPGA can be programmed for a wide variety of algorithms without any deep knowledge of the underlying architecture.
RC is reducing the electricity bill and the required building floor area
Speed-up factors of up to 4 orders of magnitude have been reported
Compared to ASICs, prototyping time is on the order of hours rather than months, with a cost less than a tenth of that for an ASIC.
The personal supercomputer is near
Conclusions (2)
We urgently need Reconfigurable Computing Education
An Update of CS curricula is overdue
END
thank you
The first archetype machine model
Software Industry’s Secret of Success: simple basic Machine Paradigm (“von Neumann”)
main frame, CPU
compile or assemble: procedural personalization
personalization: RAM-based
instruction-stream-based mind set
An Archetype Common Model needed
Guidance for organizing efficient solutions
Make the project manageable
Allow sharing lessons between applications and between application areas
Useful simple archetype not widely accepted
Archetype common model should provide ....
Progress stalled by the software/configware chasm
Configware Industry from the
The 2nd archetype machine model
Configware Industry’s Secret of Success: simple basic Machine Paradigm (“Kress-Kung”)
accelerator reconfigurable
compile: structural personalization
personalization: RAM-based
data-stream-based mind set
configware solution: computing in space
[Diagram: array of rDPUs, 4 columns wide; for demo: a tiny section of the pipe network, with a “+” node producing S]
inter-rDPU-communication: no memory cycles needed
Compare it to software solution on CPU
on a very simple CPU
S = R + (if C then A else B endif);
C = 1
Clock 200
Steps (the columns for memory cycles and nanoseconds carry no values on this slide):
  if C then read A: read instruction; instruction decoding; read operand*; operate & register transfers
  if not C then read B: read instruction; instruction decoding
  add & store: read instruction; instruction decoding; operate & register transfers; store result
  total
[Diagram: pipe network section with inputs A, B, R, C, nodes “=1” and “+”, output S]
hypothetical branching example to illustrate software-to-configware migration
S = R + (if C then A else B endif);
C = 1, simple conservative CPU example; clock: 200 MHz (5 nanosec)

step                              memory cycles   nanoseconds
if C then read A
  read instruction                      1             100
  instruction decoding
  read operand*                         1             100
  operate & reg. transfers
if not C then read B
  read instruction                      1             100
  instruction decoding
add & store
  read instruction                      1             100
  instruction decoding
  operate & reg. transfers
  store result                          1             100
total                                   5             500

*) if no intermediate storage in register file
[Diagram: section of a major pipe network on rDPU, with inputs A, B, R, C, nodes “=1” and “+”, output S]
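The branching example can be replayed as a small sketch that counts the memory cycles of the software version and contrasts it with the data-path version. The 100 ns memory cycle and the step list come from the slide; the names `CPU_STEPS`, `cpu_memory_cost` and `pipe_network`, and the reading of the “=1” node as a selector feeding the adder, are assumptions made for illustration.

```python
from typing import List, Tuple

MEMORY_CYCLE_NS = 100  # from the slide: one memory cycle costs 100 ns

# Software version for C = 1; True marks a step that needs a memory cycle.
CPU_STEPS: List[Tuple[str, bool]] = [
    ("if C then read A: read instruction", True),
    ("if C then read A: instruction decoding", False),
    ("if C then read A: read operand", True),
    ("if C then read A: operate & register transfers", False),
    ("if not C then read B: read instruction", True),
    ("if not C then read B: instruction decoding", False),
    ("add & store: read instruction", True),
    ("add & store: instruction decoding", False),
    ("add & store: operate & register transfers", False),
    ("add & store: store result", True),
]


def cpu_memory_cost(steps: List[Tuple[str, bool]]) -> Tuple[int, int]:
    """Return (memory cycles, nanoseconds spent in memory cycles)."""
    cycles = sum(1 for _, uses_memory in steps if uses_memory)
    return cycles, cycles * MEMORY_CYCLE_NS


def pipe_network(a: int, b: int, r: int, c: bool) -> int:
    """Configware view: the decision is wired into the data path.

    No instruction is fetched; a selector node picks A or B and an
    adder node produces S (assumed reading of the slide's diagram).
    """
    selected = a if c else b
    return r + selected


if __name__ == "__main__":
    cycles, nanos = cpu_memory_cost(CPU_STEPS)
    print(f"CPU: {cycles} memory cycles, {nanos} ns")
    print("pipe network result:", pipe_network(a=2, b=3, r=10, c=True))
```

The point of the comparison is where the time goes: in the software version every instruction fetch and operand access costs a memory cycle, whereas in the pipe network the operands arrive as data streams and the decision costs no memory cycle at all.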
The wrong mind set ....
S = R + (if C then A else B endif);
=1
+
A B R C
section of a very large pipe network:
not knowing this solution:
symptom of the hardware / software chasm
and the configware / software chasm
„but you can‘t implement decisions!“
The hardware / software chasm
If I use the term "software", a variety of images might appear in the engineering audience's mind.
Still we have "hardware" engineers and "software" engineers that go to different schools, attend different conferences, avoid each other's cocktail parties, and almost never play on the same volleyball teams at the company picnic. System designers begin to plan their creations around the skill sets and development processes of hardware engineers and software engineers. The two become oil and water.
Blurred line between hardware and software
The line between "hardware" and "software" is rapidly blurring and even becoming irrelevant from a system design perspective. As this happens, the traditional roles and skillsets of hardware and software engineers are being challenged, and a new generation of designers is emerging as a result.
the obfuscation caused by the pervasiveness of softness.
We need Reconfigurable Computing Education
We need a unification in dealing with problems, which are shared across many different application domains
There is an urgent need to cure severe qualification deficiencies of our graduates.
We need new curricula in CS and CE for providing an integrating dual paradigm mind set instead of vN-only
Terminology clean-up
Software: for scheduling instruction streams
Flowware: for scheduling data streams
Configware: for configuring morphware
Programming sources:
von Neumann
primarily
non-von Neumann
Why coarse grain
much more MOPS/milliWatt
reconfigurable Data Path Unit (e. g. rALU)
mind set close to classical computing background
instead of rLB (~1 bit wide) use rDPU (e. g. 32 bits wide)
instead of FPGA use rDPA
[Diagram: 4 x 4 array of rDPUs] Reconfigurable Computing (RC)
much more area-efficient
much less reconfigurability overhead
„data stream“: an ambiguous definition
Reconfigurable Computing is not instruction-stream-based
it‘s data-stream-based
it‘s different from the operation of the (indeterministic) „dataflow machine“
other definition also from multimedia area
usable definition from systolic array area
>> Outline <<
• Reconfigurable Devices
• Coarse-grained Reconfigurable Devices
• Data-stream-based Computing
• The contemporary Common Model
• Reconfigurable Supercomputing
• Conclusions
Why the speed-up ...
... although the FPGA clock is slower by a factor of 3 or even more (most know-how from the „high level synthesis“ discipline)
decisions need neither memory cycles nor clock cycles
most „data fetch“ without memory cycle
data moved around by software
i.e. by memory-cycle-hungry instruction streams which fully hit the memory wall
P&R: move locality of operation, not data !
stolen from Bob Colwell
Replace Caches by ...
stolen from Bob Colwell
caches
… by 16 x 16 reconfigurable data path array (rDPA)
which fits on the same chip
Similarly skilled
With hardware description languages, hardware engineers had to adopt the methodologies and techniques of software engineers. Increased softness has an impact even on our products themselves.
The required skills for your respective jobs are converging (against the grain in an age of increased specialization) and you'll soon be working with (and competing against) a new generation of embedded engineers that are similarly skilled in both disciplines.
Using FPGAs
Reducing cost without loss of performance and flexibility.
An FPGA may be configured like a general, flexible micro-processor executing conventional C/C++ code, or as a highly specific circuit. This programmability distinguishes FPGAs from ASICs.
An FPGA can perform a specific algorithm at very high speed. Compared to ASICs, prototyping time is on the order of hours rather than months, with a cost less than a tenth of that for an ASIC.
Using a high-level language, the FPGA can be programmed for a wide variety of algorithms without any deep knowledge of the underlying architecture.
Field-programmable FPGAs
Co-Compiler Enabling Technology
is available from academia
only a small team needed for commercial re-implementation
on the road map to the Personal Supercomputer
Conclusions (1)
We need a unification in dealing with problems, which are shared across many different application domains.
RC suffers from fragmentation into different cultures of the many application domains.
CS is the only domain qualified for such an effort
Conclusions (2)
IEEE Computer Society should advocate to improve application development methodologies
and, a common educational approach useful for the wide variety of application domains
inside IEEE Computer Society, a TC on RC should lobby for more
Conclusions (3)
reverse the downtrend in CS enrolment
educate not only students …
increase membership
make CS more fascinating
Strategic issue for entire IEEE Computer Society
Conclusions (4)
The personal supercomputer is near, not only for the desktop, but also for a new road map to large scale supercomputing of up to now unthinkable highest performance dimensions.
IEEE-CS should accept this fascinating challenge, by spearheading the paradigm shift.
IEEE-CS is needed as a translator to explain the impact to managers and to a wide public.
RC education last week at Karlsruhe
Attendees declared ready to work for a task force
35 submissions from
Australia, Brazil, India, USA, and throughout Europe
But education is just one of several facets ……
However ....
“What did you say again that your company does?” My father posed the question, “Gate arrays,” I replied, “They’re chips used to…”
“Oh yes, that’s right, Gatorade.” ….. “I used to give that to my marching band members so they wouldn’t get dehydrated on hot days. Don’t remember it coming in chip form …..”
Explain to your grandmother what it means if you’re one of the world’s leading experts on optical proximity correction (OPC) for nanometer-scale semiconductor lithography?
Could you perhaps relate it to some difficulty she has with needlepoint and her cataracts?
Even those with a scientific or technical background often won’t understand precisely what we do. A PhD in molecular biology won’t help to understand VHDL and Verilog synthesis for FPGAs.
Trying to relate DNA sequences to LUT truth tables might offer a starting point, but somebody has to be able to bridge the technology and terminology gap, even to initiate that analogy.
Try explaining FPGAs with the consumer electronics approach. “People tend to relate when you tell them what your part goes into. Today, finally, ‘chip’ seems universally understood. I never get people asking about potato chips anymore.”
However ....
Abstract. Google’s jaw-dropping hit rates illustrate the pervasiveness of Reconfigurable Computing (RC), mainstream in embedded systems already for years, and now being adopted by supercomputing (Cray, sgi, etc.). From FPGA usage as accelerators, speed-up factors of up to two orders of magnitude are reported, as well as floor space requirements and electricity invoice amounts reduced by one order of magnitude. About 3 orders of magnitude and more is obtained by using coarse-grained reconfigurable datapath arrays (rDPAs) available from a number of start-ups. This is astonishing, since FPGAs and rDPAs have a substantially lower clock speed than microprocessors. Algorithmic cleverness is the secret of success, based on software to configware migration mechanisms, striving away from memory-cycle-hungry instruction-stream-based computing paradigms. The main benefit of RC platforms - having replaced the use of hardwired accelerators - is their flexibility by non-procedural programmability. This also contributes to those concepts of Organic Computing, which rely on processes of evolution, self-organization, adaptation and fault tolerance. The main hurdles on the way to heart-stopping new horizons of cheap highest performance are CS-related educational deficits causing the configware / software chasm and a methodology fragmentation between the different cultures of application domains. Current CS curricula do not sufficiently meet their transdisciplinary responsibility. The talk gives a survey on fundamental issues in RC and on new directions in CS-related curricula, focused on a dual paradigm organic computing approach.
However ....
Application migration [from supercomputer] resulting in performance increase up to 4 orders of magnitude
„Saves more than $10,000 in electricity bills per year (7¢ / kWh) - .... per 64-processor 19" rack“
[Herb Riley, R. Associates]
Reducing electricity bill by an order of magnitude
Hits the memory wall from a different direction
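The quoted electricity saving can be turned into an implied power figure with a few lines of arithmetic. The sketch below uses only the numbers from the quote (more than $10,000 per year, 7 cents per kWh, 64 processors per rack); the assumption of continuous operation over 8760 hours per year and the function name `implied_savings` are not from the slides, and the result is a lower bound because the quote says "more than".

```python
SAVINGS_USD_PER_YEAR = 10_000   # quoted lower bound per 64-processor rack
PRICE_USD_PER_KWH = 0.07        # quoted electricity price
PROCESSORS_PER_RACK = 64        # quoted rack size
HOURS_PER_YEAR = 24 * 365       # assumption: the rack runs around the clock


def implied_savings():
    """Return (kWh saved per year, average kW saved, W saved per processor)."""
    kwh_per_year = SAVINGS_USD_PER_YEAR / PRICE_USD_PER_KWH
    average_kw = kwh_per_year / HOURS_PER_YEAR
    watts_per_processor = average_kw * 1000 / PROCESSORS_PER_RACK
    return kwh_per_year, average_kw, watts_per_processor


if __name__ == "__main__":
    kwh, kw, watts = implied_savings()
    print(f"{kwh:,.0f} kWh/year, {kw:.1f} kW per rack, {watts:.0f} W per processor")
```

Under these assumptions the quote corresponds to roughly 143,000 kWh per year, or about 16 kW of average power per rack.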
Conclusions
IEEE Computer Society should advocate to introduce a dual paradigm approach – away from the monopoly of the vN mind set
IEEE Computer Society should advocate a common model useful for the wide variety of application domains
Conclusions
We need a unification in dealing with problems, which are shared across many different application domains.
RC suffers from fragmentation into different cultures of the many application domains.
Each domain uses its own trick box. We should teach the world to think outside the box
CS is the only domain qualified for this unification
An Archetype Common Model needed
Configware Industry from the
IEEE Computer Society should advocate to introduce a dual paradigm transdisciplinary education by using Configware Engineering as the counterpart of Software Engineering by new curricula in CS and CE for providing an integrating dual paradigm mind set supporting a unification in dealing with problems, which are shared across many different application domains - to cure severe qualification deficiencies of our graduates.