22
1 An Architectural Space Exploration Tool for Domain Specific Reconfigurable Computing Gayatri Mehta Alex Jones Electrical Engineering Electrical & Computer Engineering University of North Texas University of Pittsburgh email: [email protected] email: [email protected] April 19, 2010

An Architectural Space Exploration Tool for Domain

  • Upload
    others

  • View
    1

  • Download
    0

Embed Size (px)

Citation preview

Page 1: An Architectural Space Exploration Tool for Domain

1

An Architectural Space Exploration Tool for Domain Specific Reconfigurable Computing

Gayatri Mehta Alex JonesElectrical Engineering Electrical & Computer EngineeringUniversity of North Texas University of Pittsburghemail: [email protected] email: [email protected]

April 19, 2010

Page 2: An Architectural Space Exploration Tool for Domain

2

Outline

Motivation A domain-specific fabric Design space exploration case studies Automation of design space exploration Results Conclusions

Page 3: An Architectural Space Exploration Tool for Domain

Motivation

Designing a complex SoC design requires the evaluation of many potential architectural options

Exploring the design space manually would be very time consuming and may not be feasible for complex designs

Page 4: An Architectural Space Exploration Tool for Domain

One approach is…

To develop design space exploration tools– Allow application developers to explore

architectural tradeoffs efficiently and reach solutions quickly

– Consider the applications of interest that will run on the device

Page 5: An Architectural Space Exploration Tool for Domain

5

Signal and image processing applications

Core signal processing benchmarks from MediaBench benchmark suite

Edge detection benchmarks from image processing domain

Page 6: An Architectural Space Exploration Tool for Domain

6

Mapping of a benchmark onto the fabric

temp1 = x * y;temp2 = z *temp1;if (temp2 < 255)

result1 = temp2<<16 ;else

result1 = temp1;result1 += (((temp1)>>16)-0xffff);

Page 7: An Architectural Space Exploration Tool for Domain

7

CharacteristicsCombinational structureASIC-like powerProgrammable

Domain specific fabricThe fabric is comprised of

Power-optimized Arithmetic and Logic UnitsConfigurable interconnect

Page 8: An Architectural Space Exploration Tool for Domain

Different architectural implementations of functional units

Inputs

Inputs

Inputs

Output

Output

Output

Hardware

ALU

Optimized ALU

Page 9: An Architectural Space Exploration Tool for Domain

9

Interconnect Options

4:1 cardinality interconnect

5:1 cardinality interconnect

Page 10: An Architectural Space Exploration Tool for Domain

10

Implementation of the fabric Implemented the fabric in parameterized VHDL

– same base VHDL description for the fabric model– by varying different parameters, generate different architectural instances

Global ParametersData width DW={8,16,32}

Fabric ParametersWidth of the fabric W

Height of the fabric H

Arithmetic and Logic Unit ParametersNumber of Operands O={3}

Number of Operations OP={8,10,16, 23}

Interconnect ParametersMultiplexer Cardinality C={2,4,8,16,32}

Page 11: An Architectural Space Exploration Tool for Domain

11

Design space exploration case studies

ALU granularity Multiplexer cardinality Dedicated pass gates Heterogeneous ALUs

Conclusion: 10 ops per ALU (32-bit) with 8:1 interconnect and 33% dedicated pass gates

Page 12: An Architectural Space Exploration Tool for Domain

12

Design Space Exploration

•Width/height of the fabric

•Number and Type of operators

•Data width

•Dedicated pass gates

•Interconnect cardinality

FabricMapper

FabricGenerator

SynthesisSimulation

Power/Performance/Area Analysis

Applications

Applications

Final Domain-Specific Fabric

Automation of generation of Domain-Specific Fabric

FIM

FIM

Physical Characteristics

Page 13: An Architectural Space Exploration Tool for Domain

Fabric Instance Model (FIM)<ftudefinename="alu0“ noop="10111" useic="false">

<op code="00001"> + </op><op code="00010"> - </op><op code="00011"> * </op><op code="10011"> == </op><op code="00111"> ^ </op><op code="01110"> &gt; </op><op code="10000"> &gt;= </op><op code="01111"> &lt; </op><op code="10001"> &lt;= </op><op code="10010"> != </op><op code="00100"> &amp; </op><op code="00101"> | </op><op code="01001"> &lt;&lt; </op><op code="01011"> &gt;&gt; </op><op code="00000"> pass </op><op code="10100" order="reverse"> pass </op><op code="11111"> mux </op><op code="01000"> ! </op>

</ftudefine>

<rowpattern repeat="forever"><row><ftupattern repeat="forever"><FTU type="alu0">

<operand number="0">

<range left ="-3" right ="4"/>

</operand>

<operand number="1">

<range left ="-3" right ="4"/>

</operand>

<operand number="2">

<range left ="-3" right ="4"/>

</operand></FTU></ftupattern></row>

</rowpattern>

Page 14: An Architectural Space Exploration Tool for Domain

Energy Consumption

Two factors that affect the energy consumption of the device– Increase in total path length of the

mapped application onto the device– Number of ALUs used as pass gates

14

Page 15: An Architectural Space Exploration Tool for Domain

15

Design Space ExplorationObservation: 11 ops per ALU with 8:1 interconnect & 33%

dedicated pass gates is the best candidate

0

1

2

3

4

5

6

7

8

9

10

33:1 No DPs-homo ALUs

32:1 No DPs-homo ALUs

17:1 No DPs-homo ALUs

16:1 No DPs-homo ALUs

9:1 No DPs-homo ALUs

8:1 No DPs-homo ALUs

5:1 No DPs-homo ALUs

8:1 25% DPs-homo ALUs

8:1 33% DPs-homo ALUs

8:1 50% DPs-homo ALUs

8:1 33% DPs-14ops

8:1 33% DPs-13ops

8:1 33% DPs-12ops

8:1 33% DPs-11ops

8:1 33% DPs-10ops

Ave

rag

e p

ath

le

ng

th i

ncr

ea

se

Architecture

Page 16: An Architectural Space Exploration Tool for Domain

16

Design Space Exploration

Page 17: An Architectural Space Exploration Tool for Domain

17

Comparison of manual and tool DSE results

0

50

100

150

200

250

300

350

400

450

En

erg

y (

pJ)

adpcmencoder

adpcmdecoder

idct row idct col gsm sobel laplace dwt average

Benchmark

10ops-8:1-33%DP (M) 11ops-8:1-33%DP (T)

Page 18: An Architectural Space Exploration Tool for Domain

18

Design Space ExplorationObservation: 11 ops per ALU with 8:1 interconnect & 50%

dedicated pass gates is the best candidate

0

1

2

3

4

5

6

7

8

9

10

33:1 No DPs-homo ALUs

32:1 No DPs-homo ALUs

17:1 No DPs-homo ALUs

16:1 No DPs-homo ALUs

9:1 No DPs-homo ALUs

8:1 No DPs-homo ALUs

5:1 No DPs-homo ALUs

8:1 25% DPs-homo ALUs

8:1 33% DPs-homo ALUs

8:1 50% DPs-homo ALUs

8:1 66% DPs-homo ALUs

8:1 50% DPs-

14ops

8:1 50% DPs-

13ops

8:1 50% DPs-

12ops

8:1 50% DPs-

11ops

8:1 50% DPs-

10ops

Ave

rag

e p

ath

le

ng

th i

ncr

ea

se

Architecture

Page 19: An Architectural Space Exploration Tool for Domain

19

Design Space Exploration

Page 20: An Architectural Space Exploration Tool for Domain

20

Comparison of manual and tool DSE results

0

50

100

150

200

250

300

350

400

450

En

erg

y (

pJ)

adpcmencoder

adpcmdecoder

idct row idct col gsm sobel laplace dwt average

Benchmark

10ops-8:1-33%DP (M) 11ops-8:1-50%DP (T)

Page 21: An Architectural Space Exploration Tool for Domain

21

Reconfigurable Fabric ComparisonObservation: Optimized fabric is 3X of ASIC energy

1

10

100

1000

10000

100000

1000000

En

erg

y (

pJ)

adpcmenc

adpcmdec

idct row idct col gsm sobel laplace average

Benchmark

Xscale (0.18um) Virtex-2P (0.13um) Fabric (initial)(0.16um)

Fabric (optimized)(0.16um) Fabric (optimized)(0.13um) ASIC (0.16um)

ASIC (0.13 um)

Page 22: An Architectural Space Exploration Tool for Domain

22

Thank You

Final thoughts and discussion