http://vsp2.ecs.umass.edu/vspg/vspg.html
http://lester.univ-ubs.fr/
10th Reconfigurable Architecture Workshop, RAW’03, Nice, France, Tuesday, April 22, 2003
Targeting Tiled Architectures in Design Exploration
Lilian Bossuet (1), Wayne Burleson (2), Guy Gogniat (1), Vikas Anand (2), Andrew Laffely (2), Jean-Luc Philippe (1)
(1) LESTER Lab, Université de Bretagne Sud, Lorient, France
{lilian.bossuet, guy.gogniat, jean-luc.philippe}@univ-ubs.fr
(2) Department of Electrical and Computer Engineering, University of Massachusetts, Amherst, USA
{burleson, vanand, alaffely}@ecs.umass.edu
Outline
Introduction: Design Space Exploration
Design Space of Reconfigurable Architecture
A Target Architecture: aSoC
Proposition of Design Space Exploration Flow
Results
Conclusion and Future Work
Design Space Exploration: Motivations
Design solutions for new telecommunication and multimedia applications targeting embedded systems
• Optimize and reduce SoC power consumption
• Increase computing performance: increase parallelism, increase speed
• Be flexible: take run-time reconfiguration into account, target multi-granularity (heterogeneous) architectures
Design Space Exploration: Flow
[Figure: design space exploration funnel. Starting from the Algorithmic Specification, a first run (generic synthesis or estimations on a functional description of the design space) narrows the candidates archi1 to archi12 at the Architectural Specification level; a second run (dedicated tool on the physical models of Archi2 and Archi10) yields an accurate model of Archi2 at the RTL Specification level. Axes: abstraction level (high to low) and performance accuracy (low to high).]
Progressive design space reduction:
• Iterative exploration
• Refinement of the architecture model
• Increase of performance estimation accuracy
One level of abstraction for one level of estimation accuracy
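The two-run reduction above can be sketched in a few lines. This is only an illustration under stated assumptions: the function `explore`, the cost tables and all numbers are hypothetical and are not taken from the authors' tools.

```python
from typing import Callable, List

def explore(candidates: List[str],
            coarse_cost: Callable[[str], float],
            accurate_cost: Callable[[str], float],
            keep: int = 2) -> str:
    """Two-run exploration: prune with a cheap estimate, then refine."""
    # First run: rank every candidate with the cheap, low-accuracy model.
    shortlist = sorted(candidates, key=coarse_cost)[:keep]
    # Second run: spend the expensive, accurate model on the shortlist only.
    return min(shortlist, key=accurate_cost)

# Hypothetical costs (lower is better); only shortlisted designs need an accurate cost.
coarse = {"archi1": 9.0, "archi2": 3.0, "archi3": 7.5, "archi10": 3.5}
accurate = {"archi2": 3.2, "archi10": 4.1}
best = explore(list(coarse), coarse.__getitem__, accurate.__getitem__)
print(best)
```

The accurate model is evaluated on two candidates instead of four, which is the point of refining the model only as the design space shrinks.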
Reconfigurable Architectures
Bridging the flexibility gap between ASICs and microprocessors [Hartenstein DATE 2001]
Energy-efficient solution for low-power programmable DSP [Rabaey ICASSP 1997, FPL 2000]
Run-time reconfigurable [Compton & Hauck 1999]
=> A key ingredient for future silicon platforms [Schaumont et al. DAC 2001]
Design Space of Reconfigurable Architecture
RECONFIGURABLE ARCHITECTURES (R-SOC)
FINE GRAIN (FPGA)
  Island topology: Xilinx Virtex, Xilinx Spartan, Atmel AT40K, Lattice ispXPGA
  Hierarchical topology: Altera Stratix, Altera Apex, Altera Cyclone
MULTI GRANULARITY (Heterogeneous)
  Processor + coprocessor
    Coarse grain coprocessor: Chameleon, REMARC, Morphosys
    Fine grain coprocessor: Pleiades, Garp, FIPSOC, Triscend E5, Triscend A7, Xilinx Virtex-II Pro, Altera Excalibur, Atmel FPSLIC
  Tile-based architecture: aSoC, E-FPFA
COARSE GRAIN (Systolic)
  Linear topology: Systolic Ring, RaPiD, PipeRench
  Hierarchical topology: DART, FPFA
  Mesh topology: RAW, CHESS, MATRIX, KressArray, Systolix Pulsedsp
A Target Architecture: aSoC
Adaptive System-on-a-Chip (aSoC)
Tiled architecture containing many heterogeneous processing cores (RISC, DSP, FPGA, Motion Estimation, Viterbi Decoder)
Mesh communication network controlled by a statically determined communication schedule
A scalable architecture.
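aSoC Architecture
[Figure: mesh of tiles holding heterogeneous cores (FPGA, uProc, MUL) joined by point-to-point connections. Each tile pairs a core with a communication interface whose crossbar (ctrl) links the North, South, East, West and Core ports.]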
aSoC Communications Interface
[Figure: block diagram of the communication interface. Inputs and outputs on North, South, East and West, plus Coreports toward the Core; Instruction Memory, PC, Controller and Decoder (example instruction: "North to South & East"); Local Config.; Local Frequency & Voltage.]
• Interface Crossbar: inter-tile transfers and tile-to-core transfers
• Interconnect/Instruction Memory: contains the instructions that configure the interface crossbar (cycle by cycle)
• Interface Controller: selects the instruction
• Coreports: data interface and storage for transfers with the tile IP core
• Dynamic Voltage and Frequency Selection: dynamic power management
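A minimal sketch of a cycle-by-cycle crossbar schedule follows, assuming one instruction per cycle and a program counter that wraps around. The data layout, the port names as strings and the function `run_interface` are hypothetical and do not reproduce the aSoC instruction format.

```python
from typing import Dict, List

# Hypothetical instruction memory: one crossbar configuration per cycle,
# mapping each driven output port to the input port it reads from.
Schedule = List[Dict[str, str]]

def run_interface(schedule: Schedule,
                  inputs_per_cycle: List[Dict[str, int]]) -> List[Dict[str, int]]:
    """Replay a static schedule; the program counter wraps around."""
    trace = []
    for cycle, inputs in enumerate(inputs_per_cycle):
        pc = cycle % len(schedule)   # the controller selects the instruction
        config = schedule[pc]
        # Crossbar: each configured output forwards the value on its source input.
        outputs = {out: inputs[src] for out, src in config.items() if src in inputs}
        trace.append(outputs)
    return trace

schedule = [
    {"south": "north", "east": "north"},  # cycle 0: North to South & East
    {"core": "west"},                     # cycle 1: West to the local core
]
trace = run_interface(schedule, [{"north": 7}, {"west": 3}, {"north": 9}])
print(trace)
```

Because the schedule is fixed before execution, no arbitration happens at run time: the same two instructions repeat for the whole stream.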
aSoC Exploration ...
Type of tiles
Number of each type of tile
Placement of the tiles
Internal architecture of reconfigurable tiles (FPGA core)
Communication scheduling
Design Space Exploration: Goals
Goal: Rapid exploration of various architectural solutions to be implemented on heterogeneous reconfigurable architectures (aSoC) in order to select the most efficient architecture for one or several applications
Takes place before architectural synthesis (algorithmic specification written in a high-level language)
Estimations are based on a functional architecture model (generic, technology-independent)
Iterative exploration flow to progressively refine the architecture definition, from a coarse model to a dedicated model
Design Exploration Flow Targeting Tiled Architecture
[Figure: exploration flow. Blocks, in order: C Specification; C to HCDFG parser; HCDFG graphs of the application (Application App1 with Functions F1 and F2); Model of the aSoC architectures (aSoC A1 with Tiles T1 and T2); Application Analysis; Tile Exploration; Results of the tile exploration step; aSoC Builder (static communication scheduling); Final model of the aSoC architecture; aSoC Analysis. Other labels: THF Model, HF Model; F1 and F2 mapped onto T1 and T2.]
Results of the tile exploration step:
Function   Tile   Performance
F1         T1     T11, C11, Occ11
F1         T2     T21, C21, Occ21
F2         T1     T12, C12, Occ12
F2         T2     T22, C22, Occ22
Application Analysis
Use of algorithmic metrics and dedicated scheduling algorithms to highlight the target architectures
Algorithmic metrics:
• Characterize the application orientation: processing, memory, control
• Characterize the application's potential parallelism: processing, memory
Tile Exploration: three steps
• Projection: links the necessary resources (application) to the available resources (tile), using an allocation algorithm based on communication cost reduction
• Composition: takes the function scheduling into account to estimate additional resources (registers, muxes, …)
• Estimation: computes performance intervals (lower and upper bounds) to characterize speed, resource utilization and power
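The idea of a performance interval can be sketched as below. This is a simplified stand-in, not the allocation algorithm named above: it assumes every operation takes one cycle, ignores communication costs and the composition step, and the function `estimate_cycles` and its operation counts are hypothetical.

```python
from math import ceil
from typing import Dict, Tuple

def estimate_cycles(needed_ops: Dict[str, int],
                    tile_units: Dict[str, int],
                    critical_path: int) -> Tuple[int, int]:
    """Return (lower, upper) bounds on cycles for one function on one tile.

    Assumes unit-latency operations (illustrative only).
    """
    # Projection check: every operation type needs at least one unit on the tile.
    missing = [op for op in needed_ops if tile_units.get(op, 0) == 0]
    if missing:
        raise ValueError(f"tile lacks units for: {missing}")
    # Lower bound: limited by the scarcest resource and by the critical path.
    resource_bound = max(ceil(n / tile_units[op]) for op, n in needed_ops.items())
    lower = max(resource_bound, critical_path)
    # Upper bound: fully sequential execution of all operations.
    upper = sum(needed_ops.values())
    return lower, upper

bounds = estimate_cycles({"mul": 8, "add": 7}, {"mul": 2, "add": 4}, critical_path=3)
print(bounds)
```

Here the two multipliers are the bottleneck (8 multiplications on 2 units), so the lower bound comes from the resource limit rather than from the critical path.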
aSoC Builder
Environment: AppMapper
• Partition and assignment based on run-time estimation
• Compilation: communication scheduling, core compilation
• Generation of tile configurations: communication instructions, bitstreams (for reconfigurable tiles), RISC instructions
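Assignment driven by run-time estimates can be illustrated with a greedy heuristic. This is a sketch under stated assumptions and not AppMapper's actual algorithm: the function `assign` is hypothetical, the run times are invented, and data dependencies and communication delays are ignored.

```python
from typing import Dict, List

def assign(functions: List[str],
           runtime: Dict[str, Dict[str, float]]) -> Dict[str, str]:
    """Greedy assignment: each function goes to the tile where it finishes earliest."""
    busy_until: Dict[str, float] = {}
    mapping: Dict[str, str] = {}
    for f in functions:
        # Finish time on a tile = time the tile becomes free + estimated run time.
        tile = min(runtime[f], key=lambda t: busy_until.get(t, 0.0) + runtime[f][t])
        busy_until[tile] = busy_until.get(tile, 0.0) + runtime[f][tile]
        mapping[f] = tile
    return mapping

# Hypothetical run-time estimates of functions F1 and F2 on tiles T1 and T2.
runtime = {"F1": {"T1": 4.0, "T2": 6.0}, "F2": {"T1": 3.0, "T2": 5.0}}
mapping = assign(["F1", "F2"], runtime)
print(mapping)
```

F2 is faster on T1 in isolation, but T1 is already busy with F1, so the heuristic places it on T2.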
aSoC Analysis
• Uses the results of the previous steps: function scheduling, tile allocation, communication scheduling
• Complete estimation of the proposed solution: global execution time, global power consumption, total area
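One way to combine per-tile figures into the three global estimates is sketched below. The aggregation rules are assumptions made for illustration (tiles run in parallel, communication time is ignored, power and area add up over the tiles in use); the function `analyse` and all values are hypothetical.

```python
from typing import Dict, Tuple

def analyse(mapping: Dict[str, str],
            exec_time: Dict[str, float],
            tile_area: Dict[str, float],
            tile_power: Dict[str, float]) -> Tuple[float, float, float]:
    """Return (global time, total power, total area) for a mapped application."""
    busy: Dict[str, float] = {}
    for function, tile in mapping.items():
        busy[tile] = busy.get(tile, 0.0) + exec_time[function]
    global_time = max(busy.values())                # tiles run in parallel
    total_power = sum(tile_power[t] for t in busy)  # only tiles that are used
    total_area = sum(tile_area[t] for t in busy)
    return global_time, total_power, total_area

result = analyse({"F1": "T1", "F2": "T2", "F3": "T1"},
                 {"F1": 4.0, "F2": 5.0, "F3": 2.0},
                 {"T1": 1.5, "T2": 2.0},
                 {"T1": 0.25, "T2": 0.5})
print(result)
```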
Results
aSoC architecture (UMASS): prototype of the aSoC interconnect
• Technology: 0.18 µm
• Clock speed: 400 MHz
AppMapper (UMASS): several mapped applications
• Matrix operations
• Median filter
• Viterbi decoder
• DCT
Tile exploration (UBS): application analysis
• Intelligent Camera (motion detection)
• Matching Pursuit Projection step
• Lee DCT
• Matrix operations
Conclusion and Future Work
Conclusion
• Original design exploration flow working at a high level of abstraction
• Fast and flexible (uses a functional view of the architectures)
• Targets an efficient reconfigurable architecture: aSoC
• Statically scheduled, point-to-point communications
Future Work
• Development of a larger set of design exploration benchmarks
• Exploration of other configurable systems
Thank you ...
Previous Work
Xplorer - University of Kaiserslautern, Germany [Hartenstein PATMOS 2000]
• Targets a mesh coarse-grain architecture: the KressArray (fast reconfigurable ALUs)
• Gives design guidance concerning the size of the array, the available operators, the communication architecture and the connection structure
• Controlled by performance and power estimations
• Starts with a high-level specification of the application (ALE-X language)
RAW - Massachusetts Institute of Technology, USA [Moritz FCCM 1998]
• Targets an architecture reminiscent of a coarse-grained FPGA: the MIT Raw microprocessor
• Addresses the balance problem: determining the best division of VLSI resources among computing, memory and communication
• Addresses the grain problem: determining the optimum size of each architecture tile
• Uses several models: architecture model, cost model and performance model
HCDFG: Hierarchical Control Data Flow Graph
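[Figure: hierarchy of graph levels. Tasks (Task 1, Task 2) contain functions F1 to F5; the levels shown are HCDFG, HDFG, LOOP, CDFG, DFG and loop core. Node labels in the lowest-level graph include X, Y, C, A, MAC and ALU.]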
Application’s Metrics
Average Parallelism metric (a lot of parallelism if γ is high)
\[ \gamma = \frac{\text{Nb of global memory accesses and processing operations}}{\text{Critical Path}} \]
Memory Orientation Metric, in [0,1]
\[ \mathrm{MOM} = \frac{\text{Nb of global memory accesses}}{\text{Nb of processing operations} + \text{Nb of global memory accesses}} \]
Control Orientation Metric, in [0,1]
\[ \mathrm{COM} = \frac{\text{Nb of tests}}{\text{Nb of global memory accesses} + \text{Nb of processing operations} + \text{Nb of tests}} \]
Y. Le Moullec, N. Ben Amor, J-Ph. Diguet, M. Abid and J-L. Philippe. Multi-Granularity Metrics for the Era of Strongly Personalized SOCs.
In DATE 2002, Munich, Germany, March 2002
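The three metrics are simple ratios of operation counts, as the following sketch shows. The function name, its parameter names and the sample counts are hypothetical; the critical path is assumed to be expressed in the same unit as the operation counts.

```python
from typing import Tuple

def application_metrics(n_proc: int, n_mem: int, n_test: int,
                        critical_path: int) -> Tuple[float, float, float]:
    """Return (gamma, MOM, COM) computed from operation counts of a graph."""
    gamma = (n_mem + n_proc) / critical_path   # average parallelism
    mom = n_mem / (n_proc + n_mem)             # memory orientation, in [0, 1]
    com = n_test / (n_mem + n_proc + n_test)   # control orientation, in [0, 1]
    return gamma, mom, com

# Hypothetical counts: 30 processing operations, 10 global memory accesses, 10 tests.
gamma, mom, com = application_metrics(n_proc=30, n_mem=10, n_test=10, critical_path=8)
print(gamma, mom, com)
```

With these counts the application is processing-oriented (MOM = 0.25) with little control (COM = 0.2), and on average five operations could run in parallel.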