POLITECNICO DI MILANO Polaris

3D-DRESD Polaris


Page 1: 3D-DRESD Polaris

POLITECNICO DI MILANO

Polaris

Page 2: 3D-DRESD Polaris

A workflow to manage allocation and relocation of tasks in a reconfigurable architecture

Final goal: complete architecture (bitstreams) generation

Polaris

Page 3: 3D-DRESD Polaris

POLITECNICO DI MILANO

Management of 2D Reconfiguration in a Reconfigurable System

Massimo [email protected]

Page 4: 3D-DRESD Polaris

Outline

Introduction
- Problem description
- Project Goals and Contributions

Project in detail
- Phases
- Results

Future Work

Page 5: 3D-DRESD Polaris

Problem Description

New generation of FPGAs
- Virtex-4 and Virtex-5
- Allow bi-dimensional reconfiguration

This makes it possible to:
- Better exploit the reconfigurable area
- Obtain module performance optimizations

More complex management:
- Handle one more degree of freedom
- Avoid increased fragmentation
- Make good placement choices to keep the Task Rejection Rate (TRR) low
- Keep intra-module routing paths acceptable

Page 6: 3D-DRESD Polaris

Project Goals and Contributions

Analyze effects of 2D reconfiguration
- New advantages
- New problems

Examine possible solutions to new problems
- Explore literature to find promising ideas
- Evaluate those solutions in various scenarios

Propose a new solution
- Combining ideas from literature with new ones
- Obtaining a good cost-quality tradeoff

Page 7: 3D-DRESD Polaris

Setting and Advantages Definition

Definition of the setting:
- 2D self, partial, dynamic run-time reconfiguration

Analysis of the advantages of 2D reconfiguration
- In area usage and performance

Page 8: 3D-DRESD Polaris

2D Fragmentation Problem

Analysis of the 2D fragmentation problem
- The area is generally more fragmented
- Fragmentation can nullify the area optimizations obtained

Page 9: 3D-DRESD Polaris

Placement Decisions

Analysis of the effects of 2D placement choices:
- Again, bad choices can lead to performance loss

Page 10: 3D-DRESD Polaris

Allocation manager

Definition of allocation manager desired features:
- Low TRR
- Low management overhead
- High routing efficiency
- Low fragmentation

Definition of allocation manager structure:
- Empty space manager
  - Complete space
  - Heuristic selection
- Fitter
  - General (FF, BL, BF, WF…)
  - Focused (FA, RA…)
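The difference between general fitting strategies can be illustrated with a minimal sketch. It assumes that FF, BF and WF stand for first fit, best fit and worst fit (the slides do not expand the acronyms), and that the empty space manager hands the fitter a plain list of free rectangles. The `FreeRect` type and the function names are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class FreeRect:
    """An empty rectangle on the device (hypothetical type for illustration)."""
    x: int
    y: int
    w: int
    h: int

    @property
    def area(self) -> int:
        return self.w * self.h


def candidates(free: List[FreeRect], w: int, h: int) -> List[FreeRect]:
    """Free rectangles large enough to host a w x h task, in list order."""
    return [r for r in free if r.w >= w and r.h >= h]


def first_fit(free: List[FreeRect], w: int, h: int) -> Optional[FreeRect]:
    """Take the first rectangle that fits; cheapest to evaluate."""
    fitting = candidates(free, w, h)
    return fitting[0] if fitting else None


def best_fit(free: List[FreeRect], w: int, h: int) -> Optional[FreeRect]:
    """Take the smallest rectangle that fits, leaving large areas intact."""
    return min(candidates(free, w, h), key=lambda r: r.area, default=None)


def worst_fit(free: List[FreeRect], w: int, h: int) -> Optional[FreeRect]:
    """Take the largest rectangle that fits, leaving a large remainder."""
    return max(candidates(free, w, h), key=lambda r: r.area, default=None)
```

All three strategies look only at the geometry of the free space, which is what makes them "general"; a focused fitter would additionally score candidates against a goal such as routing cost.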

Page 11: 3D-DRESD Polaris

Most relevant works

Maintain complete information on empty space:
- KAMER:
  - Keep All Maximally Empty Rectangles
  - Apply a general fitting strategy
- CUR:
  - Maintain the Contour of a Union of Rectangles
  - Apply a focused fitting strategy

Heuristically prune part of the information:
- KNER:
  - Keep Non-overlapping Empty Rectangles
  - Apply a general fitting strategy
- 2D-HASHING:
  - Keep Non-overlapping Empty Rectangles in an optimized data structure
  - Apply (exclusively) a general fitting strategy

Page 12: 3D-DRESD Polaris

Evaluation and Proposed Approach

Proposed Approach
- Heuristic (KNER-like) empty space manager, to keep complexity low for use in a self-reconfigurable system
- Fitting strategy focused on minimizing routing paths, to maintain high performance of the reconfigurable system (chosen metric to minimize: Manhattan distance)

High placement quality => high complexity
Lowest complexity => no focused fitting (bad especially for routing)
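A routing-focused fitter of this kind can be sketched as follows. The slides only state that Manhattan distance is the metric being minimized; measuring it between task centers and summing it unweighted over all communicating tasks are assumptions made here, and every function name is hypothetical.

```python
from typing import Iterable, Optional, Sequence, Tuple

Point = Tuple[float, float]


def manhattan(a: Point, b: Point) -> float:
    """Manhattan (L1) distance between two points."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def center(x: float, y: float, w: float, h: float) -> Point:
    """Center of a w x h rectangle whose corner is at (x, y)."""
    return (x + w / 2, y + h / 2)


def communication_cost(pos: Point, size: Tuple[float, float],
                       partners: Iterable[Point]) -> float:
    """Sum of distances from the task center to each partner center (assumed metric)."""
    c = center(pos[0], pos[1], size[0], size[1])
    return sum(manhattan(c, p) for p in partners)


def pick_position(positions: Sequence[Point], size: Tuple[float, float],
                  partners: Sequence[Point]) -> Optional[Point]:
    """Among feasible corner positions, choose the one with the lowest cost."""
    return min(positions,
               key=lambda pos: communication_cost(pos, size, partners),
               default=None)
```

Here `positions` would come from the empty space manager (for example one corner per free rectangle that fits the task), and `partners` holds the centers of the already placed tasks that the new task communicates with.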

Page 13: 3D-DRESD Polaris

Structure of the allocation manager

Task, defined by:
- Arrival time, ASAP, (ALAP), H, W, Latency, Communicating Tasks
- Hosted in a queue, which also adds a pointer to the rectangle where the task is placed

Reconfigurable Device, represented as:
- Binary tree structure: each node is a Rectangle, each leaf is an empty Rectangle
- Navigation through pointers to left child, right child and next leaf, plus a function to find the previous leaf (for bookkeeping after a split or merge)

Rectangle, defined by:
- X, Y, H, W
- Initially one, with (X,Y)=(0,0), H=FPGA Rows, W=FPGA Cols
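The device representation can be sketched as a small binary tree in which placing a task splits an empty leaf into two non-overlapping empty rectangles. This is a simplified illustration, not the actual implementation: the split rule (task in the leaf's origin corner, one child beside it and one above it) and the first-fit leaf choice are assumptions, and leaves are found by traversal instead of the next-leaf pointers mentioned on the slide.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass
class Node:
    """A rectangle of the device; a leaf with non-zero area is empty space."""
    x: int
    y: int
    h: int
    w: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None

    def is_leaf(self) -> bool:
        return self.left is None and self.right is None

    def leaves(self) -> Iterator["Node"]:
        """Yield the empty rectangles, skipping zero-area leftovers."""
        if self.is_leaf():
            if self.h > 0 and self.w > 0:
                yield self
        else:
            for child in (self.left, self.right):
                if child is not None:
                    yield from child.leaves()


def place(root: Node, h: int, w: int) -> Optional[Tuple[int, int]]:
    """Place an h x w task in the first leaf that fits and split that leaf.

    Returns the (X, Y) corner assigned to the task, or None if it is rejected.
    """
    for leaf in root.leaves():
        if leaf.h >= h and leaf.w >= w:
            # Remaining space beside the task, spanning the full leaf height.
            leaf.left = Node(x=leaf.x + w, y=leaf.y, h=leaf.h, w=leaf.w - w)
            # Remaining space above the task, as wide as the task itself.
            leaf.right = Node(x=leaf.x, y=leaf.y + h, h=leaf.h - h, w=w)
            return (leaf.x, leaf.y)
    return None
```

Starting from `Node(x=0, y=0, h=rows, w=cols)`, each successful `place` call consumes one leaf and produces at most two smaller ones, so the empty rectangles never overlap. Merging freed rectangles back together is left out of this sketch.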

Page 14: 3D-DRESD Polaris

The Placement Algorithm

Page 15: 3D-DRESD Polaris

Experimental Results

Benchmark of 100 randomly generated tasks:
- Size from 5% to 25% of the FPGA, randomly interconnected

Results of the proposed approach:
- Execution time: 3x lower than CUR, close to KNER
- Communication cost: 3x lower than KNER, close to CUR
- Task Rejection Rate: all solutions quite close

Page 16: 3D-DRESD Polaris

Future Work

Apply the proposed solution to self reconfiguration:
- Adapt the algorithm to run on the internal processor
- Create a validation reconfigurable architecture
- Integrate the architecture with relocation

Tune the algorithm to improve results:
- Experiment with techniques to reduce TRR
- Optimize the code to lower the running time of the algorithm

Evaluate other fitting strategies

Page 17: 3D-DRESD Polaris

Questions?

Page 18: 3D-DRESD Polaris

POLITECNICO DI MILANO

Relocation for 2D Reconfigurable Systems

Marco [email protected]

Page 19: 3D-DRESD Polaris

Project Outline

Introduction
- Problem description
- Project Goals

Project in detail
- Phases
- Results

What’s next

Page 20: 3D-DRESD Polaris

Problem Description

Self, dynamic run-time 2D reconfiguration
- Xilinx Virtex-4 and Virtex-5

Relocation, different solutions
- Software
- Hardware

We chose a hardware solution
- BiRF Square

Page 21: 3D-DRESD Polaris

Project Goals

Study of the new FPGA families
- Examination of Xilinx documentation on V4 and V5

Analysis of the new bitstream structure
- Generation of V4 and V5 bitstreams

Development of the new version of BiRF
- Implementation
- Validation

Page 22: 3D-DRESD Polaris

Frame Addressing (1/2)

New Frame Addressing:
- Possibility of addressing rows and columns
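Row and column addressing means that moving a module amounts to rewriting the row and column fields of each frame address in its bitstream. The sketch below illustrates that idea only. The field positions follow the layout commonly documented for the Virtex-4 frame address register, but they are an assumption here and should be checked against the Xilinx configuration guide; the function names are hypothetical, and a real relocator must also handle details such as the top/bottom half of the device.

```python
from typing import Dict, Tuple

# Field name -> (least significant bit, width in bits).
# ASSUMED layout, modelled on the Virtex-4 frame address register; verify before use.
V4_LIKE_LAYOUT: Dict[str, Tuple[int, int]] = {
    "top_bottom": (22, 1),
    "block_type": (19, 3),
    "row": (14, 5),
    "column": (6, 8),
    "minor": (0, 6),
}


def unpack(far: int, layout: Dict[str, Tuple[int, int]] = V4_LIKE_LAYOUT) -> Dict[str, int]:
    """Split a frame address word into its named fields."""
    return {name: (far >> lsb) & ((1 << width) - 1)
            for name, (lsb, width) in layout.items()}


def pack(fields: Dict[str, int], layout: Dict[str, Tuple[int, int]] = V4_LIKE_LAYOUT) -> int:
    """Build a frame address word, rejecting values that overflow a field."""
    far = 0
    for name, (lsb, width) in layout.items():
        value = fields[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        far |= value << lsb
    return far


def relocate(far: int, d_row: int, d_col: int,
             layout: Dict[str, Tuple[int, int]] = V4_LIKE_LAYOUT) -> int:
    """Shift a frame address by d_row rows and d_col columns."""
    fields = unpack(far, layout)
    fields["row"] += d_row
    fields["column"] += d_col
    return pack(fields, layout)
```

Keeping the layout in a table rather than in hard-coded shifts makes it straightforward to swap in a different family's field positions.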

Page 23: 3D-DRESD Polaris

Frame Addressing (2/2)

Page 24: 3D-DRESD Polaris

New Parser

Page 25: 3D-DRESD Polaris

CRC Calculation

Particular CRC value, used by Xilinx tools

Two versions of BiRF Square:
- Using the “predefined” value
- With actual CRC calculation

An optimized algorithm has been used

Page 26: 3D-DRESD Polaris

Synthesis results

On a Virtex-4 with speed grade -12:
- General purpose version: max frequency of 160 MHz
- Specific version: max frequency of 290 MHz

Page 27: 3D-DRESD Polaris

Target Device

Page 28: 3D-DRESD Polaris

Validation Architecture

Page 29: 3D-DRESD Polaris

Results (1/2)

BiRF Square
- Makes it possible to apply relocation in a self, partially and dynamically 2D-reconfigurable system
- The occupation ratio is relatively small
- The frequency is more than acceptable
- Reduction of internal memory requirements

Page 30: 3D-DRESD Polaris

Results (2/2)

Throughput of 7.3 MB/s:
- The total configuration file size is about 1 MB
- Considering an architecture with:
  - 1/3 of the area as fixed part
  - 2/3 as reconfigurable part with 6 slots
- Under this hypothesis:
  - The size of a partial bitstream will be about 110 KB
  - Relocation time of about 15 ms
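The two estimates follow from simple arithmetic, reproduced here as a small sketch. It assumes that bitstream size scales linearly with area, that the six slots are equal in size, and that MB and KB are decimal units; none of these assumptions is stated on the slide, and the names are hypothetical.

```python
TOTAL_BITSTREAM_KB = 1000.0      # "about 1 MB", decimal units assumed
RECONFIGURABLE_FRACTION = 2 / 3  # share of the area that is reconfigurable
SLOTS = 6                        # equally sized slots (assumption)
THROUGHPUT_KB_PER_S = 7300.0     # 7.3 MB/s


def partial_bitstream_kb(total_kb: float, fraction: float, slots: int) -> float:
    """Size of one slot's bitstream if size scales linearly with area."""
    return total_kb * fraction / slots


def relocation_time_ms(size_kb: float, throughput_kb_per_s: float) -> float:
    """Time to stream size_kb through the relocator, in milliseconds."""
    return size_kb / throughput_kb_per_s * 1000.0


# Roughly 111 KB per slot, which the slide rounds to about 110 KB.
slot_kb = partial_bitstream_kb(TOTAL_BITSTREAM_KB, RECONFIGURABLE_FRACTION, SLOTS)

# Roughly 15 ms for a 110 KB partial bitstream at 7.3 MB/s.
time_ms = relocation_time_ms(110.0, THROUGHPUT_KB_PER_S)
```

With binary units (1 MB = 1024 KB) the figures shift by a few percent but still round to the values quoted on the slide.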

Page 31: 3D-DRESD Polaris

What’s Next

Future improvements:
- Direct access to the memory (DMA)
  - Direct manipulation of the bitstream
  - Portability
- Integration with ICAP
  - Elimination of the relocation overhead
  - Relocation time << reconfiguration time

Future work:
- Provide a simulation framework to monitor the evolution of the reconfigurable system and to evaluate different choices

The final goal:
- Creation of a real architecture that exploits self, partial and dynamic 2D reconfiguration, with relocation

Page 32: 3D-DRESD Polaris

Questions