Electronics Lab, Physics Dept., Aristotle Univ. of Thessaloniki, Greece 17th IEEE International Conference on Electronics, Circuits, and Systems ICECS 2010 – Athens - Greece


Outline: Motivation, Architecture Platform, Design Space Exploration, Results, Conclusions

C.-L. Sotiropoulou – Design Space Exploration for FPGA-based Multiprocessing Systems – AUTH-eLab 2

Motivation

Modern multimedia applications
  › Increased need for computational power

High resolution/throughput imaging/digital signal processing
  › Need for larger memory space

Modern FPGA devices
  › Larger
  › More powerful
  › Offer a variety of memory architectures
  › MPSoC capabilities

Formulate the complete Design Space Exploration problem

Design space of MPSoC FPGA platforms, taking into account:
  › The number of processors
  › Data/task level parallelism
  › Different interconnection strategies
  › Different memory architectures offered

Architecture Platform

MPSoC based on the Microblaze processor
  › Exploration of systems with one to four processors
  › Interconnection of the processors chosen: Fast Simplex Links (FSL), which are FIFO-based and can therefore also serve as data buffers

Different memory architectures used
  › External DDR2 memory on the XUPV5-LX110T (Virtex-5) board
  › Local BRAM
  › Combination of both for all architectures

Design Space Exploration

Algorithm selection
  › A widely used streaming multimedia application
  › Different types of parallelism

The Powerstone JPEG decoder
  › 4 stages:
      1-D DC prediction stage (DC)
      Entropy Decoder (AC)
      DeQuantization (DeQ)
      2-D IDCT (IDCT)

Single Microblaze implementations for profiling the application (execution cycles)
  › With external DDR2 memory
  › With BRAM

Share of execution cycles per JPEG decoding stage, single Microblaze (MB):

Memory   DC prediction   Entropy Decoding   DeQuantization   2D-IDCT
DDR2     3.81%           28.83%             10.81%           48.69%
BRAM     4.11%           29.92%             8.50%            51.00%
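The profiling shares above give a quick way to estimate how much a stage-wise split across processors could help. The sketch below is a rough upper bound, not a measurement from the slides: it assumes the slowest processor sets the pipeline rate and ignores communication cost, memory contention, and any work outside the four profiled stages. The function name `pipeline_speedup_bound` and the example split are illustrative choices.

```python
# Stage shares of total execution cycles, taken from the profiling table (percent).
STAGE_SHARE = {
    "DDR2": {"DC": 3.81, "AC": 28.83, "DeQ": 10.81, "IDCT": 48.69},
    "BRAM": {"DC": 4.11, "AC": 29.92, "DeQ": 8.50, "IDCT": 51.00},
}

def pipeline_speedup_bound(shares, partition):
    """Upper bound on speed up when stages are split across processors.

    `partition` lists, per processor, the stages it runs. The busiest
    processor sets the pipeline rate, so the bound is
    (work of all listed stages) / (work of the busiest processor).
    Communication, memory contention, and unprofiled work are ignored.
    """
    loads = [sum(shares[stage] for stage in group) for group in partition]
    return sum(loads) / max(loads)

# Example split over two processors: one runs DC, AC, DeQ and the other runs IDCT.
two_stage = [["DC", "AC", "DeQ"], ["IDCT"]]
for memory, shares in STAGE_SHARE.items():
    print(memory, round(pipeline_speedup_bound(shares, two_stage), 2))
```

Because the 2D-IDCT alone takes about half of the cycles, a split that keeps the IDCT on one processor cannot reach a speed up of 2 under this model, which is one reason to look at splitting the IDCT itself.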

8 x 8 pixel blocks are used in every calculation stage

There is no data dependence between the blocks apart from the DC prediction stage

Architecture with only external memory
  › FSL depth = 4: only pointers are propagated

Architecture with use of BRAMs
  › FSL depth = 64: the FSLs are also used as data buffers
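The difference between the two FSL depths can be pictured with a toy FIFO model. This is a sketch under assumptions: `FslModel` is a hypothetical Python class, not a Xilinx API, and the idea that one 8 x 8 block occupies 64 link words (one coefficient per word) is an assumption of this example; the slides do not state why a depth of 64 was chosen.

```python
from collections import deque

class FslModel:
    """Toy model of a FIFO link with a fixed depth (in words)."""

    def __init__(self, depth):
        self.depth = depth
        self.fifo = deque()

    def try_put(self, word):
        # A full FIFO refuses the word; a blocking write would stall here.
        if len(self.fifo) >= self.depth:
            return False
        self.fifo.append(word)
        return True

    def get(self):
        return self.fifo.popleft()

BLOCK_WORDS = 8 * 8  # one 8 x 8 block, assuming one coefficient per word

# External-memory case: blocks stay in shared memory, the link carries an address.
pointer_link = FslModel(depth=4)
pointer_link.try_put(0x1000)  # hypothetical block address

# BRAM case: the link carries the block itself, word by word.
data_link = FslModel(depth=64)
block = list(range(BLOCK_WORDS))
accepted = sum(data_link.try_put(word) for word in block)
```

In this model a depth-4 link is enough when only a pointer per block crosses it, while the depth-64 link can hold a whole block so the sender can move on before the receiver starts reading.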

One Microblaze serves as a master
FSL depth of 4 for synchronization purposes

[Figure: block diagrams of the architectures in which every processor runs the AC, DeQ, IDCT chain]
  › 2 processors: MB0 (DC, AC, DeQ, IDCT), MB1 (AC, DeQ, IDCT); link FSL 0
  › 3 processors: MB0 (AC, DeQ, IDCT), MB1 (DC, AC, DeQ, IDCT), MB2 (AC, DeQ, IDCT); links FSL 0, FSL 1
  › 4 processors: MB0 (AC, DeQ, IDCT), MB1 (DC, AC, DeQ, IDCT), MB2 (AC, DeQ, IDCT), MB3 (AC, DeQ, IDCT); links FSL 0, FSL 1, FSL 2
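In these configurations only one processor runs the DC stage, which fits the earlier remark that DC prediction is the only stage with a dependence between blocks: JPEG stores each block's DC coefficient as a difference from the previous block. A minimal sketch of that split follows. The round-robin policy in `distribute_blocks` is an assumption of this example, since the slides do not say how blocks are scheduled, and both function names are illustrative.

```python
def dc_prediction(dc_differences):
    """Rebuild absolute DC values. Each block stores the difference from
    the previous block's DC, so this step must run in block order."""
    dc_values, previous = [], 0
    for diff in dc_differences:
        previous += diff
        dc_values.append(previous)
    return dc_values

def distribute_blocks(num_blocks, num_processors):
    """Round-robin assignment of block indices to processors (an assumed
    policy; the slides do not state how blocks are scheduled)."""
    assignment = [[] for _ in range(num_processors)]
    for block in range(num_blocks):
        assignment[block % num_processors].append(block)
    return assignment

dc = dc_prediction([5, -2, 3, 0])      # sequential, done by the master
work = distribute_blocks(len(dc), 3)   # remaining stages are independent per block
```

Once the DC values are resolved, each block can go through AC, DeQ, and IDCT on any processor without waiting for its neighbours.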

One Microblaze serves as a master
  › External memory: FSL depth of 4 (pointer propagation)
  › Internal memory: FSL depth of 64 (data propagation)

[Figure: block diagrams of the architectures in which the decoding stages are split across processors]
  › 2 processors: MB0 (DC, AC, DeQ), MB1 (IDCT); link FSL 0
  › 3 processors: MB0 (DC, AC (DeQ)), MB1 (DeQ/1D-IDCT), MB2 (IDCT/1D-IDCT); links FSL 0, FSL 1
  › 4 processors: MB0 (DeQ), MB1 (1D-IDCT), MB2 (DC, AC), MB3 (1D-IDCT); links FSL 0, FSL 1, FSL 2
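The "1D-IDCT" labels on two processors suggest that the 2-D IDCT is split into its two 1-D passes, which is possible because the transform is separable. The sketch below shows that separation in plain Python. It is an illustration of the general technique, not the code that ran on the Microblaze cores, and the mapping of `row_pass` and `column_pass` to particular processors is an assumption.

```python
import math

N = 8  # block edge length

def idct_1d(coeffs):
    """Orthonormal 8-point inverse DCT of one row or column."""
    out = []
    for n in range(N):
        total = 0.0
        for k, value in enumerate(coeffs):
            scale = math.sqrt(1.0 / N) if k == 0 else math.sqrt(2.0 / N)
            total += scale * value * math.cos((2 * n + 1) * k * math.pi / (2 * N))
        out.append(total)
    return out

def transpose(block):
    return [list(column) for column in zip(*block)]

def row_pass(block):
    """First 1-D pass: transform every row (could run on one processor)."""
    return [idct_1d(row) for row in block]

def column_pass(block):
    """Second 1-D pass: transform every column (could run on another)."""
    return transpose([idct_1d(col) for col in transpose(block)])

def idct_2d(block):
    return column_pass(row_pass(block))

# A block with only a DC coefficient decodes to a flat block.
dc_only = [[0.0] * N for _ in range(N)]
dc_only[0][0] = 8.0
flat = idct_2d(dc_only)
```

Splitting at the boundary between the two passes divides the most expensive stage roughly in half, at the cost of sending the intermediate 8 x 8 block over a link.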

Microblaze 0 serves as a master
Data equally divided between Microblaze 0 and 1

[Figure: block diagram of the four-processor architecture: MB0 (DC, AC, DeQ), MB1 (AC, DeQ), MB2 (IDCT), MB3 (IDCT); links FSL 0, FSL 1, FSL 2]

Results

Calculations of speed up and efficiency with reference to the single Microblaze architectures

Introducing a new parameter, hardware efficiency, to associate the area increase of the design with the speed up

$$\text{SpeedUp} = \frac{\text{execution time of single processor}}{\text{execution time of multiprocessor}}$$

$$\text{Efficiency} = \frac{\text{SpeedUp}}{\text{Number of Cores}}$$

$$\text{HW\_efficiency} = \frac{\text{SpeedUp}}{\text{Area Increase}}$$
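As a worked example, the three metrics can be computed as below. All input numbers are hypothetical and chosen only to make the arithmetic easy to follow; they are not results from the slides. The slides do not define how area increase is measured, so this sketch assumes it is the ratio of multiprocessor area to single-processor area.

```python
def speed_up(t_single, t_multi):
    """Single-processor execution time divided by multiprocessor time."""
    return t_single / t_multi

def efficiency(speedup, num_cores):
    """Speed up obtained per core."""
    return speedup / num_cores

def hw_efficiency(speedup, area_increase):
    """Speed up obtained per unit of area increase."""
    return speedup / area_increase

# Hypothetical measurements, for illustration only (not results from the slides).
t_single_cycles = 1_200_000
t_multi_cycles = 400_000
cores = 4
area_increase = 2.5  # assumed to be multiprocessor area / single-processor area

s = speed_up(t_single_cycles, t_multi_cycles)
print(s, efficiency(s, cores), hw_efficiency(s, area_increase))
```

Efficiency penalises cores that sit idle, while HW_efficiency penalises area that does not translate into speed up, so the two can rank the same set of designs differently.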

[Figure: results charts with panels for External Memory, External Memory + Local BRAM, and Local BRAM]

Speed up is limited when only external memory is used, because the processors issue simultaneous memory requests

This limitation is overcome by using both external and internal memory

The greatest speed up (x3.27) is achieved by the system with 4 Microblaze processors (4MB) that combines data and task level parallelism and uses both external memory and local BRAM
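One simple way to see why a shared external memory caps the speed up is an Amdahl-style bound, in which some fraction of the run time is spent on a resource that serves one request at a time. This is an illustrative model only, not the analysis used in the presentation, and the fractions below are hypothetical.

```python
def contended_speedup(num_processors, shared_fraction):
    """Amdahl-style bound: `shared_fraction` of the single-processor run
    time is spent on a resource that serves one request at a time."""
    parallel_fraction = 1.0 - shared_fraction
    return 1.0 / (shared_fraction + parallel_fraction / num_processors)

for fraction in (0.0, 0.1, 0.3):  # hypothetical shares of time on the shared memory
    print(fraction, [round(contended_speedup(n, fraction), 2) for n in (1, 2, 3, 4)])
```

Moving part of the traffic to per-processor BRAM lowers the serialized fraction in this model, which is consistent with the observation that mixed external and internal memory lifts the limit.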

The systems with the greatest hardware efficiency are those that use only internal BRAMs

The greatest HW_efficiency (3.27) is achieved by the system with 2 Microblaze processors (2MB) and local BRAMs, followed by the system with 3 Microblaze processors (3MB) (2.8)

HW_efficiency expresses the speed up gained for a given hardware cost (area)

Conclusions

A design space exploration of FPGA-based multiprocessing and memory architectures, based on the JPEG algorithm

20 different system implementations with 3 different memory approaches and 4 different processor architectures were examined

Higher speed ups were observed for the architectures that use both internal and external memories and have 4 processors

Higher hardware efficiencies are achieved by architectures that use only internal memories, at the expense of total BRAM usage

Our goal is to formulate a methodology for selecting an optimum MPSoC architecture based on the performance (speed up) and the cost-effectiveness (HW_efficiency) chosen by the user
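The methodology itself is left as future work, so the following is only one possible shape such a selection step could take: keep the candidates that meet a user-chosen speed up target, then prefer the one with the highest HW_efficiency. The `Candidate` class, the `select` function, and every number below are hypothetical placeholders, not the authors' method or measured results.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Candidate:
    name: str
    speed_up: float
    hw_efficiency: float

def select(candidates, min_speed_up):
    """Among candidates meeting the speed up target, pick the one with
    the highest HW_efficiency; return None if none qualifies."""
    eligible = [c for c in candidates if c.speed_up >= min_speed_up]
    if not eligible:
        return None
    return max(eligible, key=lambda c: c.hw_efficiency)

# Hypothetical candidates; the numbers are placeholders, not measured results.
candidates = [
    Candidate("A", speed_up=1.8, hw_efficiency=1.5),
    Candidate("B", speed_up=2.6, hw_efficiency=1.1),
    Candidate("C", speed_up=3.1, hw_efficiency=0.9),
]
choice = select(candidates, min_speed_up=2.5)
```

Other policies are equally plausible, for example fixing an area budget first and maximising speed up within it.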

The research activities that led to these results were co-financed by Hellenic Funds and by the European Regional Development Fund (ERDF) under the Hellenic National Strategic Reference Framework (ESPA) 2007-2013, according to Contract no. MICRO2-49-project LoC.

Thank you very much for your attention!