
ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design


Page 1: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

ENG3050 Embedded Reconfigurable Computing Systems

Hardware Software Co-design
Embedded System Design

Page 2: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Topics

• Embedded Systems
• H/S Co-Design Definition
• Motivation
• Design Steps
  – Profiling
  – Partitioning
  – Allocation
• Xilinx
  – Soft and Hard Cores
  – EDK/SDK, Vivado vs. ISE

Page 3: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

References

• "Embedded System Design: A Unified Hardware/Software Introduction", Frank Vahid, Wiley, 2002.
• "Hardware/Software Codesign: A systematic approach targeting data-intensive applications", Wayne Luk, IEEE Signal Processing Magazine, May 2005.
• "Hardware-Software Co-synthesis for Digital Systems", R. Gupta, G. De Micheli, IEEE Design & Test of Computers, September 1993, pp. 29-41.
• "Hardware/Software Design Space Exploration for a Reconfigurable Processor", A. Rosa, 2003.
• "A Framework for Hardware/Software Co-design", S. Kumar, Q. Wulf, IEEE, 1993.

Page 4: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

What is an embedded system?

Page 5: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Embedded Systems

• Single functional
  – e.g. pager, mobile phone
• Tightly constrained
  – cost, size, performance, power, etc.
• Reactive & real-time
  – e.g. car's cruise controller
  – delay in computation => failure of system

Page 6: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Embedded Systems?

Another definition for embedded systems:
• Embedded = dedicated
• Fixed functionality: not general purpose
• Fixed in hardware

Page 7: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Universität Dortmund

Characteristics of Embedded Systems

• Must be efficient
  – Energy efficient
  – Code-size efficient (especially for systems on a chip)
  – Run-time efficient
  – Weight efficient
  – Cost efficient
• Dedicated towards a certain application
  – Knowledge about behavior at design time can be used to minimize resources and to maximize robustness
• Dedicated user interface (no mouse, keyboard and screen)

Page 8: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Universität Dortmund

Characteristics of Embedded Systems

• Many ES must meet real-time constraints
  – A real-time system must react to stimuli from the controlled object (or the operator) within the time interval dictated by the environment.
  – For real-time systems, right answers arriving too late are wrong.
  – "A real-time constraint is called hard, if not meeting that constraint could result in a catastrophe" [Kopetz, 1997].
  – All other time constraints are called soft.
  – A guaranteed system response has to be explained without statistical arguments.

Page 9: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design


A "short list" of embedded systems

Anti-lock brakes, Auto-focus cameras, Automatic teller machines, Automatic toll systems, Automatic transmission, Avionic systems, Battery chargers, Camcorders, Cell phones, Cell-phone base stations, Cordless phones, Cruise control, Curbside check-in systems, Digital cameras, Disk drives, Electronic card readers, Electronic instruments, Electronic toys/games, Factory control, Fax machines, Fingerprint identifiers, Home security systems, Life-support systems, Medical testing systems,

Modems, MPEG decoders, Network cards, Network switches/routers, On-board navigation, Pagers, Photocopiers, Point-of-sale systems, Portable video games, Printers, Satellite phones, Scanners, Smart ovens/dishwashers, Speech recognizers, Stereo systems, Teleconferencing systems, Televisions, Temperature controllers, Theft tracking systems, TV set-top boxes, VCRs, DVD players, Video game consoles, Video phones, Washers and dryers

And the list goes on and on.

Page 10: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Hardware is not the whole system!

An embedded system is the result of a projection of
– Architecture
– Hardware
– Software
distinguished by its gross functional behaviour.

• Software is an important part of the product and must be part of the design process, or we are only designing a component of the system.

Page 11: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Definition – Hardware/Software Co-Design

• The design of computer systems that incorporate both standardized off-the-shelf processors (software) and specialized hardware.
• The cooperative design of hardware and software components.
• The unification of currently separate hardware and software paths.
• The movement of functionality between hardware and software.

Page 12: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

H/S Co-design: Example

• Optical wheel speed sensor.
• System constraints: area – 40 units, time – 100 cycles.
• This could be implemented using either standardized processors, specialized hardware, or a combination of both.

[Figure: processing pipeline with blocks Input Decoding, FIR Filter, Tick to Speed Inversion, Output Encoding]

Page 13: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

H/S Co-design: Software

• Design implemented in software
• System constraints
  – Area – 48 units > 40 units
  – Time – 132 cycles > 100 cycles
• Design time – 2 months

[Figure: implementation on two processors, Processor #1 and Processor #2]

Page 14: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

H/S Co-design: Hardware

• Design implemented in custom RTL hardware
• System constraints
  – Area – 24 units < 40 units
  – Time – 52 cycles << 100 cycles
• Beats both the area and timing constraints by at least 40%
• Design time – 9 months
• Delay in design is unacceptable in a competitive world.

Page 15: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

H/S Co-design

• Design implemented in hardware & software
• System constraints
  – Area – 37 units < 40 units
  – Time – 95 cycles < 100 cycles
• Design time – 3.5 months
• Not as efficient as the hardware-only design. However, it establishes a balance between two extremes.

[Figure: implementation using Processor #1]
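The comparison across the three implementations can be sketched in a few lines of Python. The figures are the ones quoted on the slides; the class, the function names, and the idea of picking the feasible design with the shortest design time are illustrative assumptions, not part of the course material.

```python
from dataclasses import dataclass

AREA_LIMIT = 40     # area units, from the example's system constraints
CYCLE_LIMIT = 100   # cycles, from the example's system constraints


@dataclass(frozen=True)
class Design:
    name: str
    area: int        # area units
    cycles: int      # execution time in cycles
    months: float    # design time in months


DESIGNS = [
    Design("software only", 48, 132, 2.0),
    Design("hardware only", 24, 52, 9.0),
    Design("hardware/software", 37, 95, 3.5),
]


def meets_constraints(design):
    """True when the design fits both the area and the timing budget."""
    return design.area <= AREA_LIMIT and design.cycles <= CYCLE_LIMIT


def quickest_feasible(designs):
    """Feasible design with the shortest design time (hypothetical criterion)."""
    feasible = [d for d in designs if meets_constraints(d)]
    return min(feasible, key=lambda d: d.months) if feasible else None


if __name__ == "__main__":
    for d in DESIGNS:
        status = "ok" if meets_constraints(d) else "violates constraints"
        print(f"{d.name}: area={d.area}, cycles={d.cycles}, "
              f"months={d.months} ({status})")
    print("quickest feasible:", quickest_feasible(DESIGNS).name)
```

Under these numbers the software-only design is rejected, and the co-design wins on design time among the two feasible options.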

Page 16: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Motivations

• Achieve performance by moving software bottlenecks to hardware.
• Use hardware to meet time & area constraints that cannot be met using general-purpose processors alone.
• It is not possible to put everything in hardware due to limited resources.
• Some code is more appropriate for sequential implementation (i.e. achieve flexibility).
• Today's designs focus on embedded systems, which require both hardware and software modules.

Page 17: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Motivations … cont

• The complexity and functionality of computer systems are increasing at a dramatic rate (System-on-Chip, SoC).
• It is difficult for custom systems to be designed, built, and verified within an acceptable time period, even with advanced CAD tools, unless standardized parts are used. (Solution?)
• Take advantage of previously designed and tested processors and IP cores to reduce design time and improve reliability.

Page 18: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Trade-offs/Decisions

• Given a set of specified goals, an implementation technology, and constraints, designers consider trade-offs in how hardware and software components work together.
• Decisions, constraints and evaluations?
  – Performance
  – Area
  – Power
  – Flexibility (programmability)
  – Development & manufacturing costs
  – Reliability
  – Robustness
  – Maintenance
  – Design evolution

Page 19: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Hw/Sw Co-Design: Research

Research in hardware-software co-design encompasses many interesting areas, such as:
• System specification and modeling
• Design exploration
• System co-verification and co-simulation
• Code generation for hardware/software
• Hardware/software interfacing
• Partitioning
• Scheduling

However, the most important objective is to develop a unified design methodology/tool for creating systems containing both hardware and software.

Page 20: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

A Simple Approach

[Figure: design-flow diagram with blocks Application, Profiling, Partitioning (S/W, H/W), Schedule tasks, Evaluation, Decision]

Page 21: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling and Partitioning

[Figure: a profiler identifies the critical regions of software running on a processor; those regions are moved to an ASIC/FPGA while the rest stays in software. Bar charts compare time and energy for SW only versus HW/SW.]

Benefits
• Speedups of 2X to 10X typical
• Far more potential than dynamic SW optimizations (1.2x)
• Energy reductions of 25% to 95% typical
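One way to sanity-check speedup figures like these is Amdahl's law, which the slides do not name; it is brought in here as an assumption. If a fraction f of the run time lies in the critical regions and hardware makes those regions s times faster, the whole application speeds up by 1 / ((1 - f) + f / s). A minimal sketch, with hypothetical function and parameter names:

```python
def overall_speedup(critical_fraction, kernel_speedup):
    """Amdahl's law: whole-program speedup when only part of it is accelerated.

    critical_fraction: share of software run time spent in the regions
                       moved to hardware (0.0 to 1.0).
    kernel_speedup:    how many times faster those regions run in hardware.
    """
    if not 0.0 <= critical_fraction <= 1.0:
        raise ValueError("critical_fraction must be between 0 and 1")
    if kernel_speedup <= 0:
        raise ValueError("kernel_speedup must be positive")
    remaining = 1.0 - critical_fraction            # part left in software
    accelerated = critical_fraction / kernel_speedup
    return 1.0 / (remaining + accelerated)


if __name__ == "__main__":
    # Hypothetical case: 90% of the time is in loops that run 10x faster in HW.
    print(round(overall_speedup(0.9, 10.0), 2))
```

The formula makes the role of profiling explicit: the achievable speedup is capped by how much of the run time the critical regions actually cover.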

Page 22: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling

• Profiling allows you to learn where your program spent its time and which functions called which other functions while it was executing.
• The profiler uses information collected during the actual execution of your program; therefore, it can be used on programs that are too large or too complex to analyze by reading the source.
• This information can show you which pieces of your program are slower than you expected. These might be candidates for either:
  – Rewriting code to make your program execute faster.
  – Moving these functions to hardware.

Page 23: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling

• Profiling is an analysis of software performance
  – Where execution time is being spent
  – How many times functions are being called
  – Included as a tool in the SDK
  – Which algorithms to consider moving to hardware
• Results come in two useful formats
  – Samples per function: how much time is spent in each routine
  – Function call graph: which routine calls which function, and how many times

Page 24: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

How Does Profiling Work?

• Hardware/software intrusive
  – Requires a hardware timer
  – Requires a dedicated area in memory
  – Executable is modified with profiler routines
• A dedicated hardware timer interrupts the processor at a fixed interval
  – The interrupt routine keeps track of the program counter (PC) at each interrupt
  – A histogram of PC locations is kept in profile RAM
  – Interrupt interval time is programmable
• Every function call in the software application is annotated by the compiler to track which functions are being called
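The sampling mechanism described above can be imitated in software to see how a PC histogram turns into "time per function". The sketch below is a simulation only: the symbol table, addresses, and sample values are made up for illustration, and no real timer or profile RAM is involved.

```python
from bisect import bisect_right
from collections import Counter

# Hypothetical symbol table: (start address, function name), sorted by address.
SYMBOLS = [(0x000, "main"), (0x100, "fir_filter"), (0x200, "encode")]
STARTS = [start for start, _ in SYMBOLS]

# Hypothetical PC values captured by the timer interrupt at a fixed interval.
SAMPLES = [0x010, 0x104, 0x108, 0x1F0, 0x120, 0x204, 0x110, 0x130, 0x014, 0x150]


def function_at(pc):
    """Map a sampled program counter to the function whose range contains it."""
    index = bisect_right(STARTS, pc) - 1
    return SYMBOLS[index][1]


def build_histogram(pc_samples):
    """Count samples per function, as the interrupt routine would in profile RAM."""
    return Counter(function_at(pc) for pc in pc_samples)


def percent_time(histogram):
    """Convert sample counts to an estimated share of execution time."""
    total = sum(histogram.values())
    return {name: 100.0 * count / total for name, count in histogram.items()}


if __name__ == "__main__":
    histogram = build_histogram(SAMPLES)
    for name, share in sorted(percent_time(histogram).items()):
        print(f"{name}: {histogram[name]} samples, {share:.1f}% of time")
```

Because the timer fires at a fixed interval, the share of samples landing in a function approximates the share of run time spent there; a shorter interval gives finer resolution at the cost of more intrusion.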

Page 25: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling: Steps

• You must compile and link your program with profiling enabled.
    cc -o myprog.exe myprog.c utils.c -g -pg
• You must then execute your program to generate a profile data file.
  – Your program will write the profile data into a file called 'gmon.out' just before exiting.
• You must run gprof to analyze the profile data.
    gprof options myprog.exe gmon.out > outfile
  – The gprof program prints a flat profile and a call graph.

Page 26: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling: Useful Hints

Options:
• -e function_name: tells gprof NOT to print information about the function function_name (and its children …) in the call graph.
• -f function_name: causes gprof to limit the call graph to the function function_name and its children.
• -b: gprof doesn't print the verbose blurbs that try to explain the meaning of all of the fields in the tables.

Page 27: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Profiling: Flat Profile

• % time: the percentage of the total execution time your program spent in this function.
• cumulative seconds: the cumulative total number of seconds the computer spent executing this function, plus the time spent in all the functions above it in the table.
• self seconds: the number of seconds accounted for by this function alone.
• calls: the total number of times the function was called.
• self ms/call: the average number of milliseconds spent in this function per call.
• total ms/call: the average number of milliseconds spent in this function and its descendants per call.
• name: the name of the function.
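How the flat-profile columns relate to one another is easiest to see by deriving them from raw numbers. The sketch below does that for made-up measurements; it mirrors the column definitions listed above but is not gprof's implementation, and the input format is an assumption.

```python
def flat_profile(rows):
    """Derive flat-profile columns from raw per-function measurements.

    rows: (name, self_ms, total_ms, calls) tuples, where self_ms is time in
    the function alone and total_ms also includes its descendants.
    """
    ordered = sorted(rows, key=lambda row: row[1], reverse=True)
    total_self_ms = sum(row[1] for row in ordered)
    table, running_ms = [], 0
    for name, self_ms, total_ms, calls in ordered:
        running_ms += self_ms  # this function plus all functions above it
        table.append({
            "name": name,
            "percent_time": 100.0 * self_ms / total_self_ms,
            "cumulative_seconds": running_ms / 1000,
            "self_seconds": self_ms / 1000,
            "calls": calls,
            "self_ms_per_call": self_ms / calls,
            "total_ms_per_call": total_ms / calls,
        })
    return table


# Hypothetical measurements, in milliseconds.
MEASUREMENTS = [
    ("main", 100, 1000, 1),
    ("fir_filter", 600, 600, 200),
    ("encode", 300, 300, 100),
]

if __name__ == "__main__":
    for row in flat_profile(MEASUREMENTS):
        print(row)
```

In this made-up data, `fir_filter` tops the table with 60% of the time, which is the kind of function the earlier slides suggest rewriting or moving to hardware.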

Page 28: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Simple Approach: Drawbacks

I. Some functions might not be easily mapped onto hardware.
II. Decisions taken very early, at the profiling phase, might not be optimal.
III. No consideration for interfacing and communication.
IV. If the application changes slightly, then we need to re-profile and re-partition.

Page 29: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Applications Not Suitable for RCS

Not all applications are suitable for Reconfigurable Computing:
• Applications that involve extensive recursion, for example, are a poor match because the synthesized "hardware" must be of fixed size.
• Applications that have only a small percentage of parallelism (1-5%) will not take advantage of RCS.
• Applications that are I/O bound will also suffer due to memory I/O transfer.
• Applications that require floating-point arithmetic.

Page 30: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Design Space Exploration

[Figure: two candidate architectures (Architecture #1 and Architecture #2) assembled from computation templates (RISC, DSP, FPGA, Cipher, LookUp, SDRAM), communication templates, and scheduling/arbitration policies (static, dynamic, fixed priority, proportional share, WFQ, EDF, TDMA, FCFS)]

Which architecture is better suited for our application?

Page 31: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

H/S Codesign: A Framework

• System Representation
• Co-Design
  – Decomposition (break down system functions into a collection of sub-functions)
  – H/S Partitioning (determine which of the sub-functions should be implemented in H/W or S/W)
  – Refinement (produce a hardware/software alternative via evaluation)
• System Evaluation
• System Integration

Page 32: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Co-Synthesis/Co-Design

Page 33: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Partitioning & Scheduling

• Task partitioning and task scheduling are required in many applications, for instance co-design systems, multiprocessing systems, and high-level synthesis.
• Sub-tasks extracted from the input description should be implemented in:
  – Where? The right place (using the partitioner/placer)
  – When? The right time (using the scheduler)
• It is well known that such scheduling and partitioning problems are NP-complete.
• Optimization techniques based on heuristic methods are generally employed to explore the search space so that feasible and near-optimal solutions can be found.

Page 34: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

System Partitioning

Good partitioning mechanism:
1) Minimize communication across the bus
2) Allow parallelism: both hardware (FPGA) and processor operating concurrently
3) Load balancing: near-peak processor utilization at all times (performing useful work)

[Figure: flow with stages Specification, Capture, Model, Partition, Synthesize, and Interface, targeting a Processor and an FPGA. The specification is shown as
    process (a, b, c) in port a, b; out port c; { read(a); … write(c); }
and the processor code as
    Line () { a = … … detach }]

Page 35: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Terminology: Hypergraphs

• A netlist is a hypergraph.
• Hypergraphs can be approximated as graphs by breaking each hyperedge into a clique of edges.
• A hypergraph $H = \langle V, E_h \rangle$
  – $V$ is a set of vertices
  – each hyperedge $h \in E_h$ is a subset of the vertices, so $E_h \subseteq 2^V$
• A graph $G = \langle V, E \rangle$
  – $V$ is a set of vertices
  – each edge $e \in E$ is a pair of vertices $(u, v)$
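The clique approximation mentioned above can be written down directly: every hyperedge is replaced by all pairwise edges among its vertices. The sketch below is a minimal illustration with made-up net names; it ignores edge weights, which practical tools often scale down for large hyperedges.

```python
from itertools import combinations


def clique_expansion(hyperedges):
    """Approximate a hypergraph by a graph.

    Each hyperedge (a collection of vertices) becomes a clique: one ordinary
    edge for every pair of vertices it contains. Duplicate edges collapse.
    """
    edges = set()
    for hyperedge in hyperedges:
        for u, v in combinations(sorted(set(hyperedge)), 2):
            edges.add((u, v))
    return edges


if __name__ == "__main__":
    # Hypothetical netlist: one 3-pin net and one 2-pin net.
    nets = [{"a", "b", "c"}, {"c", "d"}]
    print(sorted(clique_expansion(nets)))
```

A k-pin net produces k(k-1)/2 edges, which is why the expansion is only an approximation: cutting a large net once can sever many graph edges.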

Page 36: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Bi-partitioning Problem

• Given a hyper/graph $G$, find a partition $P$ of $V$ into $V_1, V_2$ such that $V_1 \cap V_2 = \emptyset$ and $V_1 \cup V_2 = V$,
• minimizing the number of edges that cross the cut:
  $\min c(P) = \sum_{h} w(h)$, summed over all edges $h$ that connect some $u \in V_1$ to some $v \in V_2$,
• subject to a capacity (balance) constraint that bounds the ratio $|V_1| / |V_2|$ from above and below.
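The objective and the constraint can be evaluated for a concrete partition as follows. This is a sketch under assumptions: the bound values passed to `is_balanced` are placeholders, since the slide's own bound symbols did not survive, and the example netlist is made up.

```python
def cut_cost(hyperedges, weights, v1):
    """c(P): total weight of the edges that have vertices on both sides."""
    cost = 0
    for hyperedge, weight in zip(hyperedges, weights):
        inside = sum(1 for vertex in hyperedge if vertex in v1)
        if 0 < inside < len(hyperedge):   # some vertices in V1, some in V2
            cost += weight
    return cost


def is_balanced(v1, v2, lower, upper):
    """Capacity constraint: lower < |V1| / |V2| < upper (bounds are assumed)."""
    return lower < len(v1) / len(v2) < upper


if __name__ == "__main__":
    # Hypothetical netlist with unit weights.
    nets = [("a", "b"), ("b", "c"), ("c", "d"), ("a", "c", "d")]
    unit_weights = [1, 1, 1, 1]
    v1, v2 = {"a", "b"}, {"c", "d"}
    print("cut cost:", cut_cost(nets, unit_weights, v1))
    print("balanced:", is_balanced(v1, v2, 0.5, 2.0))
```

Here the nets (b, c) and (a, c, d) span the cut, so the cost is 2; a hyperedge counts once no matter how many of its pins sit on each side.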

Page 37: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Bipartitioning Approaches

• Exact Methods:
  – Mixed Integer Programming (using Branch and Bound)
  – min-cut / max-flow (Ford-Fulkerson 1962)
    • maximum flow through graph = minimum cut
    • useful for establishing unconstrained bound
• Heuristics (Local Search)
  – Kernighan-Lin (1970)
    • operates on graphs
    • swap all nodes once, in pairs that yield max. gain
    • choose greatest gain over pass, repeat until no improvement
    • $O(n^2 \log n)$
  – Fiduccia-Mattheyses (1982)
    • operates on hypergraphs
    • $O(p)$, linear time!
• Meta Heuristics (avoid getting stuck in local minima)
  – Simulated annealing
    • select some random moves based on "temperature"
    • design hopefully "cools" into optimal solution
    • computationally intensive
  – Tabu Search
  – Genetic Algorithms
  – Particle Swarm Optimization

Page 38: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Fiduccia-Mattheyses

- generate initial partition
- calculate gain g(c) of moving each cell
while improvement {
    unlock all cells;
    while max g(c) > 0 | c not locked {      // one pass: O(p)
        select cell with max g(c) | c not locked;
        move c across the cut;
        c → locked;
        update g(c) for all of c's neighbors;
    }
}
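The loop structure of the pseudocode can be tried out with a small Python sketch. It is a simplification under stated assumptions: gains are recomputed from scratch instead of being kept in a bucket array, so it does not achieve the O(p) pass time; the balance rule (a size difference of at most `tolerance` after each move) and the six-cell netlist are hypothetical choices for illustration.

```python
def cut_size(nets, side):
    """Number of nets with cells on both sides of the cut."""
    return sum(1 for net in nets if len({side[cell] for cell in net}) == 2)


def gain(cell, nets, side):
    """Reduction in cut size if `cell` alone moves to the other side."""
    before = cut_size(nets, side)
    side[cell] ^= 1                      # tentatively move the cell
    after = cut_size(nets, side)
    side[cell] ^= 1                      # undo the tentative move
    return before - after


def balanced_after_move(cell, side, tolerance):
    """True if moving `cell` keeps the two block sizes within `tolerance`."""
    size0 = sum(1 for s in side.values() if s == 0)
    size1 = len(side) - size0
    if side[cell] == 0:
        size0, size1 = size0 - 1, size1 + 1
    else:
        size0, size1 = size0 + 1, size1 - 1
    return abs(size0 - size1) <= tolerance


def fm_partition(nets, initial_side, tolerance=2):
    """Repeat passes until a pass makes no move with positive gain."""
    side = dict(initial_side)
    improved = True
    while improved:
        improved = False
        locked = set()                   # locks are cleared at each pass
        while True:
            candidates = [(gain(c, nets, side), c) for c in sorted(side)
                          if c not in locked
                          and balanced_after_move(c, side, tolerance)]
            if not candidates:
                break
            best_gain, best_cell = max(candidates, key=lambda gc: gc[0])
            if best_gain <= 0:
                break
            side[best_cell] ^= 1         # move the cell across the cut
            locked.add(best_cell)        # and lock it for this pass
            improved = True
    return side


# Hypothetical netlist: two triangles (a, b, c) and (d, e, f) joined by c-d.
NETS = [("a", "b"), ("a", "c"), ("b", "c"),
        ("d", "e"), ("d", "f"), ("e", "f"), ("c", "d")]
INITIAL = {"a": 0, "b": 0, "d": 0, "c": 1, "e": 1, "f": 1}

if __name__ == "__main__":
    final = fm_partition(NETS, INITIAL)
    print("initial cut:", cut_size(NETS, INITIAL))
    print("final cut:", cut_size(NETS, final))
```

On this netlist the first pass moves c and then d, after which no unlocked cell has positive gain and the two triangles end up on opposite sides of the cut. The tolerance of 2 is needed because any single move from an even 3/3 split temporarily produces a 2/4 split.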

Page 39: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Example

[Figure: graph with six cells a, b, c, d, e, f]

• all edges have unit weight
• given balance criteria: $|V_1| - 1 \le |V_2| \le |V_1| + 1$
• goal: partition the graph into two disjoint halves so as to minimize the number of hyperedges that span the cut

Page 40: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Example (cont'd)

Step 1.
• random partition assigned to keep balance
• number of cuts = 5

[Figure: the six-cell graph with the initial random cut]

Page 41: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Example (cont'd)

Step 2.
• initial gains are calculated for each cell
• results are placed into the bucket array
• number of cuts = 5

[Figure: the six-cell graph annotated with cell gains +1, +2, +2, +1, -1, +2]

Page 42: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Example (cont'd)

Step 3.
• cell is selected
• gains of critical nets are updated
• cell is locked from further movement
• number of cuts = 3

[Figure: the six-cell graph after the first move, annotated with cell gains +1, 0, 0, +1, -1, 0]

Page 43: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Example (cont'd)

Step 3 (repeated).
• another cell is selected
• gains of critical nets are updated
• cell is locked from further movement
• number of cuts = 2

[Figure: the six-cell graph after the second move, annotated with cell gains 0, 0, 0, -1, -1, 0]

Page 44: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Interfacing

Interfacing between software and hardware modules is crucial for successful co-design:
I. How data is passed between sub-modules efficiently.
II. The rate of exchange of information between modules.

[Figure: co-design flow with stages System Description, Hw/Sw Partitioning, Co-Synthesis (Software, Interface, Hardware), System Integration, Co-Simulation]

Page 45: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Interface Models: FIFO

• Synchronization through a FIFO
• FIFO can be implemented either in hardware or in software
• Effectively reconfigure hardware (FPGA) to allocate buffer space as needed
• Interrupts used for software version of FIFO

[Figure: processes p1, p2, p3 exchanging data (d1, d2, d3; r2, r3) through an FPGA control/data FIFO]
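The synchronization role of the FIFO comes from its full and empty conditions: a producer stalls when the buffer is full, and a consumer waits when it is empty. The sketch below models that behavior as a fixed-depth ring buffer in software; the class and method names are hypothetical and it does not model interrupts or any specific FPGA FIFO core.

```python
class Fifo:
    """Fixed-depth first-in, first-out buffer with full/empty flags."""

    def __init__(self, depth):
        self._buffer = [None] * depth
        self._depth = depth
        self._head = 0      # index of the oldest item
        self._count = 0     # number of items currently stored

    def full(self):
        return self._count == self._depth

    def empty(self):
        return self._count == 0

    def push(self, item):
        """Store an item; return False when full so the producer can stall."""
        if self.full():
            return False
        tail = (self._head + self._count) % self._depth
        self._buffer[tail] = item
        self._count += 1
        return True

    def pop(self):
        """Remove and return the oldest item, or None when empty."""
        if self.empty():
            return None
        item = self._buffer[self._head]
        self._head = (self._head + 1) % self._depth
        self._count -= 1
        return item


if __name__ == "__main__":
    fifo = Fifo(depth=2)
    for word in ("d1", "d2", "d3"):
        print(word, "accepted" if fifo.push(word) else "stalled (FIFO full)")
    print("consumer reads", fifo.pop())
```

The depth is the design knob the slide alludes to: a deeper buffer absorbs larger rate mismatches between the hardware and software sides at the cost of more storage.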

Page 46: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Warp Processors

1. Initially execute application in software only
2. Profile application to determine critical regions
3. Partition critical regions to hardware
4. Program configurable logic & update software binary
5. Partitioned application executes faster with lower energy consumption

[Figure: warp processor with a MIPS/ARM core, instruction cache (I$), data cache (D$), configurable logic, profiler, and Dynamic Partitioning Module (DPM)]

Page 47: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Summary

• Hardware/software co-design is becoming the common design style for building systems.
• H/S co-design allows the majority of a system to be designed quickly with standardized parts, while special-purpose hardware is used for time-critical portions of the system.
• Xilinx and Altera provide a complete flow for H/S co-design.
• Issues:
  I. How to partition the system?
  II. Communication overhead!
  III. Platforms to be used
  IV. Languages that support this paradigm

Page 48: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Xilinx Tools

Page 49: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Embedded CPUs

• PowerPC 405 (hard core)
  – 32-bit embedded PowerPC RISC architecture
  – Up to 450 MHz
  – 2 x 16 kB instruction and data caches
  – Memory management unit (MMU)
  – Embedded in Virtex-II Pro and Virtex-4/5/6
• ARM Cortex-A9 (hard core)
  – 32-bit multicore processor
  – Up to 900 MHz
  – Xilinx Zynq-7000 processing platform
  – Device is processor based, attached to FPGA
  – High level of performance
  – Reduces power, cost, size
• MicroBlaze (soft core)
  – 32-bit RISC architecture
  – 2 to 64 kB instruction and data caches
  – Hardware multiply and divide
  – OPB and LMB bus interfaces
  – ...

Page 50: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Embedded Processors

Virtex-4 Processors:

Embedded Processor     Core Type  Max Clock Frequency  Slices  PLBs  Block RAMs
PowerPC                Hard       222 MHz              1000    250   9
MicroBlaze             Soft       180 MHz              940     235   9
PicoBlaze              Soft       221 MHz              333     84    3
PicoBlaze (optimized)  Soft       233 MHz              274     69    3

• Hard core
  – Faster
  – Fixed position
  – Few devices
• Soft core
  – Slower
  – Can be placed anywhere
  – Applicable to many devices

Page 51: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

PowerPC-based Embedded Design

Full system customization to meet performance, functionality, and cost goals

[Figure: block diagram. Dedicated hard IP: PowerPC 405 core, RocketIO. Flexible soft IP: Processor Local Bus (PLB, instruction and data) with arbiter, hi-speed peripheral, GB E-Net, memory controller; bus bridge to the On-Chip Peripheral Bus (OPB) with arbiter, UART, GPIO, on-chip peripheral; DCR bus; ISOCM and DSOCM BRAM; off-chip memory (ZBT SRAM, DDR SDRAM, SDRAM).]

IBM CoreConnect™ on-chip bus standard: PLB, OPB, and DCR

Page 52: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

MicroBlaze-based Embedded Design

[Figure: block diagram. Flexible soft IP: MicroBlaze 32-bit RISC core with BRAM local memory bus, I-cache and D-cache BRAM (configurable sizes), LocalLink™ FIFO channels (0, 1 … 32) to custom functions, and an On-Chip Peripheral Bus (OPB) with arbiter connecting UART, 10/100 E-Net, on-chip peripheral, and off-chip memory (FLASH/SRAM). Dedicated hard IP (possible in Virtex-II Pro): PowerPC 405 core on a Processor Local Bus (PLB, instruction and data) with arbiter, hi-speed peripheral, GB E-Net, and memory controller, connected to the OPB through a bus bridge.]

Page 53: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

MicroBlaze: Architecture & Features

• RISC
• Thirty-two 32-bit general-purpose registers
• 32-bit instruction word with three operands and two addressing modes
• Separate 32-bit instruction and data buses: OPB (On-chip Peripheral Bus)
• Separate 32-bit instruction and data buses: LMB (Local Memory Bus)

[Figure: MicroBlaze architecture diagram showing the OPB and LMB interfaces]

Page 54: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Embedded Development Tool Flow Overview

[Figure: two parallel flows. Standard embedded SW development flow: C code, compiler/linker, (simulator), object code, debugger; CPU code in on-chip memory or in off-chip memory; download to board & FPGA. Standard FPGA HW development flow: VHDL/Verilog, synthesizer, simulator, place & route, download to FPGA.]

Page 55: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

EDK

• The Embedded Development Kit (EDK) consists of the following:
  – Xilinx Platform Studio – XPS
  – Base System Builder – BSB
  – Create and Import Peripheral Wizard
  – Hardware generation tool – PlatGen
  – Library generation tool – LibGen
  – Simulation generation tool – SimGen
  – GNU software development tools
  – System verification tool – XMD
  – Virtual Platform generation tool – VPgen
  – Software Development Kit (Eclipse)
  – Processor IP
  – Drivers for IP
  – Documentation
• Use the GUI or the shell command tool to run EDK

Page 56: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

EDK Files

• MHS = Microprocessor Hardware Specification
• MSS = Microprocessor Software Specification
• MPD = Microprocessor Peripheral Description
• PAO = Peripheral Analyze Order
• BBD = Black-Box Definition
• MDD = Microprocessor Driver Description
• BMM = BRAM Memory Map

Page 57: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Design Flow: Hardware I

[Figure: hardware flow in EDK / Xilinx Platform Studio: the platform definition (peripherals, configuration, connectivity, address space) in *.mhs feeds Generate Netlist]

Page 58: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Design Flow: Hardware II, ISE Env

[Figure: hardware flow: the platform definition (peripherals, configuration, connectivity, address space) in *.mhs feeds Generate Netlist in XPS; ISE (Platform Ext., Proj. Nav. / VHDL) with *.ucf feeds Generate Bitstream, producing *.bit]

EDK: Embedded Development Kit
XPS: Xilinx Platform Studio
ISE: Integrated Software Environment
MHS: Microprocessor Hardware Specification

Page 59: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

Design Flow: Software

[Figure: the hardware flow as before (platform definition in *.mhs, Generate Netlist in XPS, ISE with *.ucf, Generate Bitstream, *.bit), plus the software flow: Gen. Libs, *.h, *.c and *.asm sources feed Compile & Link, producing *.elf]

EDK: Embedded Development Kit
XPS: Xilinx Platform Studio
ISE: Integrated Software Environment
MHS: Microprocessor Hardware Specification

Design Flow: Combine HW + SW

Diagram summary:
– Hardware side (XPS, ISE): *.mhs -> Generate Netlist; *.ucf -> Generate Bitstream -> *.bit
– Software side: Gen. Libs -> *.h; *.c, *.asm and *.h -> Compile & Link -> *.elf
– Combine: *.bit, *.elf and *.bmm -> Update Bitstream -> *.bit

The PS and the PL

The Zynq-7000 AP SoC architecture consists of two major sections
– PS: Processing system
• Dual ARM Cortex-A9 processor based
• Multiple peripherals
• Hard silicon core
– PL: Programmable logic
• Uses the same 7 series programmable logic
• Artix™-based devices: Z-7010, Z-7015, Z-7020 (high-range I/O banks only)
• Kintex™-based devices: Z-7030, Z-7035, Z-7045, Z-7100 (mix of high-range and high-performance I/O banks)

© Copyright 2014 Xilinx

Zynq-7000 AP SoC Block Diagram

Zynq Architecture Built-in Peripherals

Two USB 2.0 OTG/Device/Host
Two Tri-Mode GigE (10/100/1000)
Two SD/SDIO interfaces
– Memory, I/O and combo cards
Two each of CAN 2.0B, SPI, I2C, UART
Four 32-bit GPIO blocks
– 54 bits available through MIO; the others available through EMIO
Multiplexed Input/Output (MIO)
– Multiplexed pinout of peripherals and static memories
Extended MIO (EMIO)
– Maps PS peripheral ports to the PL

General-Purpose I/O

GPIO blocks
– Four separate banks of 32 GPIO bits each
• Two banks connect to the 54 MIO pins (32 bits and 22 bits, respectively)
• Two banks connect to EMIO (64 bits)
– Each GPIO bit can be dynamically programmed as input or output
– Reset values independently configurable for each bit
– Programmable interrupt generation for each bit
• One interrupt generated per GPIO bank
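The bank layout above can be made concrete with a small model. The following Python sketch is illustrative only: it assumes a flat pin numbering in which the 54 MIO signals come first and the 64 EMIO signals follow, and the names `BANK_WIDTHS`, `BANK_ROUTES`, and `locate_gpio` are hypothetical, not part of any Xilinx driver.

```python
# Bank layout taken from the slide: banks 0-1 serve the 54 MIO pins
# (32 + 22 bits); banks 2-3 serve the 64 EMIO signals (32 + 32 bits).
BANK_WIDTHS = (32, 22, 32, 32)
BANK_ROUTES = ("MIO", "MIO", "EMIO", "EMIO")


def locate_gpio(pin):
    """Return (bank, bit, route) for a flat GPIO pin index.

    The flat numbering (MIO first, then EMIO) is an assumption of this sketch.
    """
    if pin < 0:
        raise ValueError("pin index must be non-negative")
    base = 0
    for bank, width in enumerate(BANK_WIDTHS):
        if pin < base + width:
            return bank, pin - base, BANK_ROUTES[bank]
        base += width
    raise ValueError(f"pin {pin} is outside the {base} GPIO signals")


if __name__ == "__main__":
    for pin in (0, 31, 32, 53, 54, 117):
        print(pin, locate_gpio(pin))
```

Because one interrupt is generated per bank, the bank index returned here also tells which of the four bank interrupts a given pin would share.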

Embedded System Design Flow using Vivado & SDK

Create a new project, or open an existing project
Add/Create a new embedded source in Vivado
Use the IP integrator, block automation, and connection automation features of Vivado to construct (or modify) the hardware portion of the embedded design
Create (or update) the top-level HDL model
Add additional logic at the top level
Synthesize, implement, and generate the design in Vivado
Export the bitstream and the processor hardware description, and launch SDK

Embedded System Design Flow using Vivado & SDK (continued)

Create a new:
– Software board support package (BSP) and
– Application projects in the SDK
Compile the software with the GNU cross-compiler in SDK
[optional] Download the programmable logic’s completed bitstream using SDK or through a hardware session in Vivado
Use SDK to download the program (the ELF file)

Embedded System Design using Vivado and SDK

Vivado
1. Launch Vivado
2. Create Block Design
3. Add PS7
4. Configure PS settings
5. Run Block Automation
6. Add and configure IPs
7. Run Connection Automation
8. Create Top-Level HDL
9. Add Constraints (file)
10. Generate Bitstream => .bit
11. Export hardware to SDK

SDK
12. Create Board Support Package
13. Create or add Software Project
14. Build application => .elf

ZedBoard
15. Program bitstream & .elf into Zynq

Zynq Device Processing System Configuration

Run Block Automation

Create default configuration for the platform
Based on the board specified in the project settings
– E.g. ZedBoard: DDR, GPIO, UART, USB, QSPI

Extending Hardware in IP Integrator

Add IPs
Configure IPs
Run Connection Automation

Extending Hardware in Vivado

Create a top-level HDL model
Optionally, add other HDL files to the design
Add user constraint files to connect PL pins
– PS/MIO handled automatically
– If you miss any pin constraints (the I/O standard must be explicitly specified), the tools will error out during bitstream generation
Generate bitstream for PL

Embedded System Tools: Software

Eclipse IDE-based Software Development Kit (SDK)
– Board support package creation: LibGen
– GNU software development tools
– C/C++ compiler for the ARM Cortex-A9 processor (gcc)
– Debugger for the ARM Cortex-A9 processor (gdb)
Board support packages (BSPs)
– Stand-alone BSP
• Free basic device drivers and utilities from Xilinx
• NOT an RTOS

SDK Workbench Views

1. C/C++ project outline displays the elements of a project with file decorators (icons) for easy identification
2. C/C++ editor for integrated software creation
3. Code outline displays elements of the software file under development with file decorators (icons) for easy identification
4. Problems, Console, Properties views list output information associated with the software development flow

Export Hardware Design to SDK

Software development is performed with the Xilinx Software Development Kit (SDK)
An XML description of the hardware (.hdf, Hardware Description File) is imported into SDK
– The hardware platform is built on this description
– Only one hardware platform for an SDK project
SDK then associates user software projects with the hardware

Build Software Application in SDK

Create software platform
– System software, board support package
– LibGen program
Create software application
Optionally, create linker script
Build project
– Compile, assemble, link
– Output file <app_project>.elf

Software Management Settings

Software is managed in three major areas
– Compiler/Linker Options
• Application program
– Software Platform Settings
• Board support package
– Linker Script Generation
• Assigning software to memory resources

GNU Tools: GCC

GCC translates C source code into assembly language
GCC also functions as the user interface to the GNU assembler and the GNU linker, calling each with the appropriate parameters
Supported cross-compilers
ARM processor compiler
– GNU GCC (arm-xilinx-eabi-gcc)
– GNU Linux GCC (arm-xilinx-linux-eabi-gcc)

GNU Tools: AS

Input: assembly language files
– File extension: .s
Output: object code
– File extension: .o
Contains
– Assembled piece of code
– Constant data
– External references
– Debugging information
Typically, the compiler automatically calls the assembler
Use the -Wa switch to pass options to the assembler when the source files are assembly only and gcc is used to build them

GNU Tools: Linker (LD)

Inputs
– Several object files
– Archived object files (library)
– Linker script (*.ld)
Outputs
– Executable image (ELF)
– Map file

Integrated Xilinx Tools

Xilinx additions to the Eclipse IDE
– Software Repositories
– BSP Settings
– Generate Linker Script
– Program the programmable logic
• Bitstream must be available
– Program Flash Memory
– Launch XMD Console
– Launch Shell
– Create Zynq Boot Image
– SysGen Co-Debug Settings
– Configure JTAG Settings

Configuring FPGA and Downloading Application

Download the bitstream
– Only required if PL is used
– Input file <top_name>.bit
The Xilinx hardware session allows downloading the bitstream into the target
The hardware session can be created from
– SDK
– Vivado
Requires that the download cable is connected

Profiling Procedure

Set profiling for the BSP
– Enable software intrusive profiling
– Enable the -pg option
Set profiling for the application
– Enable the compiler for profiling with the -pg option
– Configure the profiler memory
– Set the interrupt frequency and bin value
Compile, link, and generate the ELF executable
Download the executable to the hardware or to a software simulator
Run the software application until completion or for a set amount of time
Execute the GNU gprof tool to view the generated profile report

Configuring the Software Platform Settings

Select Xilinx Tools > Board Support Package Settings
Select standalone
Enable software profiling
Select the profiling timer
Select CPU_cortexa9
– Add -pg to the Value column for the extra_compiler_flags option

Profile Configuration: Create a Run Configuration

If any of the embedded design resides in programmable logic, download the bitstream to the programmable logic
– Select Xilinx Tools > Program FPGA
Select Run > Run Configurations and create a new configuration
– Give it an appropriate name
– Select the ELF file that was compiled with -pg

Set Profile Options in Run Configuration

In the Profile Options tab
– Enable profiling
– Set the sampling frequency at which the timer will interrupt
• A higher frequency requires more memory but gives a finer resolution
– Set the histogram bin size
– Set the location of RAM that the profiler can use
• Make sure that the software application is not using this memory
Click Run to download the program and begin execution
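To see how the sampling frequency and the histogram bin size interact, here is a minimal Python model of a sampling profiler's histogram. It is a sketch of the general gprof-style technique, not the Xilinx profiler's implementation; the function names, addresses, and sizes are hypothetical.

```python
def build_histogram(samples, text_start, text_size, bin_size):
    """Count program-counter samples per address bin.

    Each timer interrupt records the interrupted PC; a bin covers
    bin_size bytes of the code region [text_start, text_start + text_size).
    """
    if bin_size <= 0 or text_size <= 0:
        raise ValueError("bin_size and text_size must be positive")
    n_bins = -(-text_size // bin_size)  # ceiling division
    hist = [0] * n_bins
    for pc in samples:
        if text_start <= pc < text_start + text_size:
            hist[(pc - text_start) // bin_size] += 1
    return hist


def seconds_per_bin(hist, sample_hz):
    """Estimate time spent in each bin: one sample represents 1/sample_hz s."""
    return [count / sample_hz for count in hist]


if __name__ == "__main__":
    # Hypothetical 64-byte code region sampled at 1 kHz with 16-byte bins.
    pcs = [0x100000, 0x100004, 0x100010, 0x10003C]
    hist = build_histogram(pcs, 0x100000, 0x40, 16)
    print(hist, seconds_per_bin(hist, 1000))
```

In this model a smaller bin size means more bins (more profiler memory) but attributes samples to code more precisely, while a higher sampling frequency means each sample stands for a shorter slice of time.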

Viewing Profiling Reports: Launching gprof

Double-click gmon.out to launch gprof
Point to the executable ELF; usually selected by default
The gprof report launches
The report toolbar controls report options and view capabilities
– Sort samples per file
– Sort samples per function
– Sort samples per line
– Display function call graph
– Switch sample/time

Profiled Output in SDK

1: Sort Samples per File
2: Sort Samples per Function
3: Sort Samples per Line
4: Display Function Call Graph

Profiling Report Options

Gprof report options allow report view flexibility and export
1. Show/hide columns
2. Export to CSV
3. Sorting
4. Switch time<>samples

Task Implementation Decision

Keep it in software
– Not in critical path
– Enough "free" cycles
– Easier to code in software than in hardware
• Uses math library functions
– NEON co-processor
• Supports integer vector operations
• Single-precision floating-point operations
Move to hardware
– Programmable logic co-processor
• Customized to user's needs
• Excellent for iterative and pipelined processing
– Add soft core processor in PL
• Both Cortex-A9 and MicroBlaze processors can co-exist in the AP SoC
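One common way to put a number on this decision is Amdahl's law, which the slides do not name but which fits the profile-then-partition flow. The Python sketch below is an illustration under stated assumptions: `hw_fraction` is the share of run time (from the gprof report) spent in the task being moved, `hw_accel` is how much faster the hardware version runs, and `overhead_fraction` is a hypothetical allowance for data transfer and synchronization, expressed as a share of the original run time.

```python
def overall_speedup(hw_fraction, hw_accel, overhead_fraction=0.0):
    """Amdahl's-law estimate of whole-application speedup.

    hw_fraction:       share of original run time in the migrated task (0..1)
    hw_accel:          speedup of that task once implemented in hardware
    overhead_fraction: added transfer/sync cost, as a share of original time
    """
    if not 0.0 <= hw_fraction <= 1.0:
        raise ValueError("hw_fraction must be between 0 and 1")
    if hw_accel <= 0.0:
        raise ValueError("hw_accel must be positive")
    if overhead_fraction < 0.0:
        raise ValueError("overhead_fraction must be non-negative")
    new_time = (1.0 - hw_fraction) + hw_fraction / hw_accel + overhead_fraction
    return 1.0 / new_time


if __name__ == "__main__":
    # Hypothetical numbers: a task taking 50% of run time, made 2x faster.
    print(overall_speedup(0.5, 2.0))
```

The estimate shows why tasks off the critical path are best left in software: if a task accounts for only a small share of run time, even a very large hardware acceleration barely changes the total, and any communication overhead can cancel the gain.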

Task Implementation Decision: FIR Filters

Figure slides; one panel is captioned "Profiling the application with the hardware IP"

AXI is Part of ARM’s AMBA

AMBA: Advanced Microcontroller Bus Architecture
AXI: Advanced Extensible Interface
Timeline figure (older to newer, increasing performance): AXI was introduced with AMBA 3.0 (2003)

AXI is Part of AMBA

AMBA 3.0 (2003)
AMBA 4.0 (2010)
Same Spec
Enhancements for FPGAs

Interface: Memory Map / Full (AXI4)
– Features: traditional address/data burst (single address, multiple data)
– Similar to: PLBv46, PCI
Interface: Streaming (AXI4-Stream)
– Features: data-only, burst
– Similar to: Local Link / DSP Interfaces / FIFO / FSL
Interface: Lite (AXI4-Lite)
– Features: traditional address/data, no burst (single address, single data)
– Similar to: PLBv46-single, OPB

AXI Interconnect

AXI is an interconnect system used to tie processors to peripherals
– AXI Full memory map: full-performance bursting interconnect
– AXI Lite: lower-performance non-bursting interconnect (saves programmable logic resources)
– AXI Streaming: non-addressed packet-based or raw interface

Basic AXI Signaling – 5 Channels

1. Read Address Channel
2. Read Data Channel
3. Write Address Channel
4. Write Data Channel
5. Write Response Channel

All AXI Channels Use a Basic “VALID/READY” Handshake

SOURCE asserts and holds VALID when DATA is available
DESTINATION asserts READY if able to accept DATA
DATA is transferred when VALID and READY are both 1
SOURCE sends the next DATA (if an actual data channel) or deasserts VALID
DESTINATION deasserts READY if no longer able to accept DATA
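The handshake rules can be checked with a small cycle-by-cycle model. This Python sketch is a behavioural illustration only, not RTL; `simulate_handshake` and its arguments are hypothetical names. The source holds VALID while it has data queued, the destination's READY is supplied as one value per clock cycle, and a transfer happens only in cycles where both are 1.

```python
def simulate_handshake(data, ready_pattern):
    """Return (cycle, item) pairs for every completed transfer.

    data:          items the SOURCE wants to send, in order
    ready_pattern: DESTINATION's READY value (0/1) for each clock cycle
    """
    pending = list(data)
    transfers = []
    for cycle, ready in enumerate(ready_pattern):
        valid = bool(pending)  # SOURCE asserts and holds VALID while data is available
        if valid and ready:    # transfer only when VALID and READY are both 1
            transfers.append((cycle, pending.pop(0)))
    return transfers


if __name__ == "__main__":
    # READY is low in cycles 0 and 3, so the source simply waits there.
    print(simulate_handshake(["A", "B", "C"], [0, 1, 1, 0, 1, 1]))
```

Note that the source never drops VALID while waiting for READY; the stalled item stays at the head of the queue until the destination accepts it.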

The AXI Interface: AXI4-Lite

No burst
Data width 32 or 64 bits only
– Xilinx IP only supports 32 bits
Very small footprint
Bridging to AXI4 handled automatically by AXI_Interconnect (if needed)
Figures: AXI4-Lite Read, AXI4-Lite Write

The AXI Interface: AXI4

Sometimes called “Full AXI” or “AXI Memory Mapped”
– Not ARM-sanctioned names
Single address, multiple data
– Burst up to 256 data beats
Data width parameterizable
– Up to 1024 bits
Figures: AXI4 Read, AXI4 Write

The AXI Interface: AXI4-Stream

No address channel, no read and write, always just master to slave
– Effectively an AXI4 “write data” channel
Unlimited burst length
– AXI4 max 256
– AXI4-Lite does not burst
Virtually the same signaling as the AXI data channels
– Protocol allows merging, packing, width conversion
– Supports sparse, continuous, aligned, unaligned streams
Figure: AXI4-Stream Transfer
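The burst limits of the three interface types can be compared with a short sketch. The Python function below is a simplified illustration with hypothetical names: it only splits a transfer of `total_beats` data beats into bursts no longer than the interface allows, and it ignores other protocol rules such as address-boundary restrictions on memory-mapped bursts.

```python
AXI4_MAX_BEATS = 256   # from the slides: AXI4 bursts up to 256 data beats
AXI4_LITE_MAX_BEATS = 1  # AXI4-Lite does not burst: single address, single data
AXI4_STREAM_MAX_BEATS = None  # AXI4-Stream: unlimited burst length


def burst_lengths(total_beats, max_beats):
    """Split a transfer into burst lengths; max_beats=None means unlimited."""
    if total_beats < 0:
        raise ValueError("total_beats must be non-negative")
    if max_beats is not None and max_beats < 1:
        raise ValueError("max_beats must be at least 1")
    if total_beats == 0:
        return []
    if max_beats is None:
        return [total_beats]
    bursts = []
    remaining = total_beats
    while remaining > 0:
        length = min(remaining, max_beats)
        bursts.append(length)
        remaining -= length
    return bursts


if __name__ == "__main__":
    for name, limit in (("AXI4", AXI4_MAX_BEATS),
                        ("AXI4-Stream", AXI4_STREAM_MAX_BEATS)):
        print(name, burst_lengths(600, limit))
```

For a 600-beat transfer this model needs three address phases on AXI4, 600 on AXI4-Lite, and none on AXI4-Stream, which is the trade-off between footprint and throughput that the three slides describe.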

Summary

• Xilinx provides CAD tools (EDK/ISE/Vivado) to implement soft cores and hard cores and to manage the whole hardware/software development process.
• Soft cores such as MicroBlaze and hard cores such as the ARM processors enable hardware/software co-design: sequential code runs on the processor, while bottlenecks run on a dedicated hardware accelerator attached to the soft or hard core.


Co-design: Tools

• Co-design tools should provide an almost automatic framework for producing a balanced and optimized design from some initial high-level specification.
• However, the goal of co-design tools and platforms is not to push towards this kind of total automation.
• Designer interaction and continuous feedback are considered essential.
• The main goal is to incorporate, in the "black box" of co-design tools, support for shifting functionality and implementation between HW and SW with effective and efficient evaluation.

H/S Co-Design: Approaches

• Opposite strategies
– Vulcan (“primal” approach)
  Functionality all in HW (HardwareC) initially
  Move some to CPU to reduce architecture cost
– Cosyma (“dual” approach)
  Functionality all in SW (Cx) initially
  Move some to ASIC to meet performance goals
• Lycos
– Convert all functionality to neutral form

Partitioning Algorithms

• Assume everything is initially in software
• Select a task for swapping
• Migrate it to hardware and evaluate the cost
– Timing, hardware resources, program and data storage, synchronization overhead
• Cost evaluation and move evaluation are similar to what we’ve seen for the min-cut FM algorithm.
Diagram: a task moves from the software list of tasks to the hardware list of tasks.
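The move-and-evaluate loop above can be sketched in a few lines of Python. This is a deliberately simplified greedy variant, not the FM algorithm itself and not any particular tool's method: each task carries an assumed software time, hardware time, and hardware area, and the cost model (time saved per unit of area under an area budget) is a hypothetical stand-in for the fuller evaluation of timing, storage, and synchronization overhead.

```python
def partition(tasks, area_budget):
    """Greedy HW/SW partitioning sketch.

    tasks:       dict mapping name -> (sw_time, hw_time, hw_area), hw_area > 0
    area_budget: total hardware area available
    Returns (software_set, hardware_set).
    """
    software = set(tasks)  # assume everything is initially in software
    hardware = set()
    used_area = 0
    while True:
        best, best_gain = None, 0.0
        for name in sorted(software):  # select a candidate task for swapping
            sw_time, hw_time, hw_area = tasks[name]
            if used_area + hw_area > area_budget:
                continue  # the move would not fit in the remaining area
            gain = (sw_time - hw_time) / hw_area  # time saved per unit area
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no remaining move improves the design
        software.remove(best)  # migrate the best task to hardware
        hardware.add(best)
        used_area += tasks[best][2]
    return software, hardware


if __name__ == "__main__":
    # Hypothetical task set: (software time, hardware time, hardware area).
    demo = {"fir": (100, 10, 50), "fft": (80, 20, 40), "ui": (5, 4, 30)}
    print(partition(demo, area_budget=100))
```

Unlike FM, this sketch never moves a task back to software and never accepts a temporarily worse move, so it can stop at a local optimum; it is meant only to show the select, migrate, and evaluate steps.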

Automation

• Compiler profiler determines dependence and rough performance estimates
• Result of compilation is synthesizable HDL and assembly code for the processor

Soft and Hard cores in current FPGAs

Block diagram components: Power Supply, CLK, custom IF-logic, SDRAM, SRAM, Memory Controller, UART, LC, Display Controller, Interrupt Controller, Timer, Audio Codec, CPU (uP / DSP), Co-Proc., GP I/O, Address Decode Unit, Ethernet MAC

Next Step...

Block diagram with an FPGA region; components shown: CLK, custom IF-logic, SDRAM, SRAM, Memory Controller, UART, Display Controller, Timer, Power Supply, LC, Audio Codec, CPU (uP / DSP), Co-Proc., GP I/O, Address Decode Unit, Ethernet MAC, Interrupt Controller

InterruptController

Page 109: ENG3050 Embedded Reconfigurable Computing Systems Hardware Software Co-design Embedded System Design

ENG3050 ERCS 109

Configurable System on a Chip (CSoC)Configurable System on a Chip (CSoC)

Power Supply

SDRAM SDRAMSRAM SRAMSRAM

LC

AudioCodec EPROM

Soft CPU Core: "MicroBlaze" (Xilinx Inc.)

MicroBlaze: Bus Configurations

Figure: six numbered bus configurations (1 to 6) of the MicroBlaze core

• LMB: Memory Controller (BRAMs)
• OPB: Ext. Memory Ctrl., Interrupt Ctrl., UART, Timer, Watchdog, SPI, JTAG-UART, etc.