A Device-Controlled Dynamic Configuration Framework Supporting Heterogeneous Resource Management

Heng Tan, Ronald Demara

Previous Work - Tool Level

Approach | Device | Supported On-chip System | Bit Stream Reuse | Potential Limitations
---------|--------|--------------------------|------------------|----------------------
Moraes, Mesquita, Palma, Moller | Virtex XCV300 devices | No | N | Lack of area relocation
Raghavan, Sutton | Xilinx Virtex devices | No | N | Cumbersome CAD flow
Blodget, McMillan | Virtex II devices | Partial | Yes | Direct bit stream reuse required

Proposed Work: Multilayer Runtime Reconfiguration Architecture (MRRA)

[Figure: MRRA block diagram. Labeled blocks: Logic Applications, Translation Engine, RAM, Microprocessor, System Bus, Autonomous System, Reconfigurable Unit, PCI, ICAP, PC, PowerPC. Legend: bold labels apply to the loosely coupled system; italic labels apply to the SoC.]

• Develop the MRRA fast reconfiguration paradigm for the CRR approach
• Validate on a real hardware platform, along with a detailed performance analysis
• Serve as the first general-purpose framework for a wide variety of applications that require reconfiguration during operation
• Extend existing theories on reconfiguration

Loosely Coupled Solution

[Figure: loosely coupled solution. Labeled blocks: Avnet FPGA Development Board, PCI Interface, Virtex-II Pro FPGA, Off-Chip RAM, Control hosted on PC, FPGA. Data paths: Output, Bit file, Input Data.]

The entire system operates on a 32-bit basis.

The Virtex-II Pro is mounted on a development board, which can then be interfaced with a workstation running Xilinx EDK and ISE.

LCS Implementation

Resource name | Number Available | Number Used | Utilization
--------------|------------------|-------------|------------
IOBs | 396 | 85 | 21%
Slices | 4928 | 1805 | 36%
BRAM | 44 | 24 | 54%
TBUFs | 2464 | 352 | 14%
PPC405 | 1 | 1 | 100%
BUFGMUXs | 4 | 1 | 25%

APIs on Host PC

API name | Input Parameter | Operation | Data Width
---------|-----------------|-----------|-----------
Initial | N/A | Recognizes and initializes the FPGA board | N/A
WriteBitFile | char Filename[] | Reads the configuration file from the board and writes it to the host PC | File length
ReadBitFile | char Filename[] | Reads the configuration file from the host PC and writes it to the FPGA board | File length
ByteRead | unsigned long StartAddr, unsigned long EndAddr, int AccessBar | Reads the on-board memory | 8 bits
WordRead | unsigned long StartAddr, unsigned long EndAddr, int AccessBar | Reads the on-board memory | 16 bits
DWordRead | unsigned long StartAddr, unsigned long EndAddr, int AccessBar, unsigned long AccessData | Reads the on-board memory | 32 bits
ByteWrite | unsigned long StartAddr, unsigned long EndAddr, int AccessBar, unsigned long AccessData | Writes to the on-board memory | 8 bits
WordWrite | unsigned long StartAddr, unsigned long EndAddr, int AccessBar, unsigned long AccessData | Writes to the on-board memory | 16 bits
DWordWrite | unsigned long StartAddr, unsigned long EndAddr, int AccessBar, unsigned long AccessData | Writes to the on-board memory | 32 bits
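The host-side read and write calls differ only in access width (8, 16, or 32 bits) over an address range. The following is a minimal Python model of that behavior, not the actual driver: the class name `BoardMemory`, the exclusive end address, the little-endian byte order, and the omission of the `AccessBar` selector are all assumptions made for illustration.

```python
import struct


class BoardMemory:
    """Hypothetical stand-in for the on-board memory seen from the host PC."""

    def __init__(self, size_bytes):
        self.data = bytearray(size_bytes)

    def _read(self, start_addr, end_addr, fmt):
        width = struct.calcsize(fmt)
        if start_addr % width or (end_addr - start_addr) % width:
            raise ValueError("range must be aligned to the access width")
        # One value per access; end_addr is treated as exclusive (assumption).
        return [struct.unpack_from(fmt, self.data, addr)[0]
                for addr in range(start_addr, end_addr, width)]

    def _write(self, start_addr, end_addr, value, fmt):
        width = struct.calcsize(fmt)
        if start_addr % width or (end_addr - start_addr) % width:
            raise ValueError("range must be aligned to the access width")
        value &= (1 << (8 * width)) - 1  # keep only the bits that fit
        for addr in range(start_addr, end_addr, width):
            struct.pack_into(fmt, self.data, addr, value)

    # Little-endian formats are an assumption, not taken from the slides.
    def byte_read(self, start, end):
        return self._read(start, end, "<B")

    def word_read(self, start, end):
        return self._read(start, end, "<H")

    def dword_read(self, start, end):
        return self._read(start, end, "<I")

    def byte_write(self, start, end, value):
        self._write(start, end, value, "<B")

    def word_write(self, start, end, value):
        self._write(start, end, value, "<H")

    def dword_write(self, start, end, value):
        self._write(start, end, value, "<I")


mem = BoardMemory(16)
mem.dword_write(0, 8, 0x11223344)          # fill two 32-bit words
print([hex(v) for v in mem.byte_read(0, 4)])
print([hex(v) for v in mem.word_read(0, 4)])
```

Reading the same region at different widths shows how a single 32-bit write appears as four bytes or two 16-bit words under the assumed byte order.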

APIs on Chip CPU Core

API name | Input Parameter | Operation | Data Width
---------|-----------------|-----------|-----------
Intc_setup | N/A | Initializes and enables the interrupt controller | N/A
DeviceDriverHandler | *CallbackRef | The corresponding routine for the SRAM ownership request interrupt from the host PC | N/A
mem_dump | unsigned start_addr, unsigned end_addr | Reads the on-board and on-chip memory and registers | 32 bits
mem_write | unsigned wr_addr, unsigned wr_value | Writes the on-board and on-chip memory and registers | 32 bits
flash_test | unsigned start_addr, unsigned end_addr | Thorough validation test on the flash | 32 bits
mem_test | unsigned start_addr, unsigned end_addr | Thorough validation test on the flash memory | 32 bits
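The slides do not say how the validation tests work. One common approach for a 32-bit memory check is to write fixed bit patterns and the address itself to every word, then read each back. The sketch below illustrates that generic technique in Python; `WordMemory`, the pattern set, word-indexed addresses, and the stuck-bit fault model are all hypothetical and are not taken from the on-chip `mem_test` routine.

```python
PATTERNS = (0x00000000, 0xFFFFFFFF, 0xAAAAAAAA, 0x55555555)


class WordMemory:
    """Hypothetical 32-bit memory; addresses are word indices (assumption)."""

    def __init__(self, n_words, stuck_at_zero=None):
        self.words = [0] * n_words
        # Optional fault injection: address -> mask of bits forced to 0.
        self.stuck_at_zero = stuck_at_zero or {}

    def write32(self, addr, value):
        value &= 0xFFFFFFFF
        value &= ~self.stuck_at_zero.get(addr, 0)
        self.words[addr] = value

    def read32(self, addr):
        return self.words[addr]


def pattern_test(mem, start_addr, end_addr):
    """Return a list of (address, expected, actual) mismatches."""
    failures = []
    # Pass 1: fixed patterns catch stuck data bits.
    for pattern in PATTERNS:
        for addr in range(start_addr, end_addr):
            mem.write32(addr, pattern)
        for addr in range(start_addr, end_addr):
            actual = mem.read32(addr)
            if actual != pattern:
                failures.append((addr, pattern, actual))
    # Pass 2: writing each address into itself catches aliased locations.
    for addr in range(start_addr, end_addr):
        mem.write32(addr, addr)
    for addr in range(start_addr, end_addr):
        actual = mem.read32(addr)
        if actual != addr:
            failures.append((addr, addr, actual))
    return failures


good = WordMemory(8)
bad = WordMemory(8, stuck_at_zero={3: 0x1})   # bit 0 of word 3 stuck low
print(pattern_test(good, 0, 8))
print(pattern_test(bad, 0, 8))
```

Returning the mismatches rather than a single pass/fail flag makes it easier to see which address and which bits are affected.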

Future Theoretical Work

• Communication overhead, throughput, and overall speed-up analysis

• Translation Complexity Analysis
  – The quantity of information that needs to be translated to generate the reconfiguration bitstream
  – Simplification from file level to bit level is expected

• Storage Complexity Analysis
  – The memory space that is required for the run-time algorithms

Resources Utilization

Resource name | Number Available | Number Used | Utilization
--------------|------------------|-------------|------------
IOBs | 396 | 77 | 19%
Slices | 4928 | 1352 | 27%
BRAM | 44 | 8 | 18%
TBUFs | 2464 | 352 | 14%
PPC405 | 1 | 1 | 100%
BUFGMUXs | 4 | 1 | 25%
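The utilization column is the number used divided by the number available, expressed as a whole percentage. A short Python check of that arithmetic, using the figures from the table above; rounding down is an assumption, although rounding to the nearest integer gives the same values for these rows.

```python
# (available, used) pairs copied from the table above.
resources = {
    "IOBs": (396, 77),
    "Slices": (4928, 1352),
    "BRAM": (44, 8),
    "TBUFs": (2464, 352),
    "PPC405": (1, 1),
    "BUFGMUXs": (4, 1),
}


def utilization_percent(available, used):
    """Whole-number utilization, rounded down (assumed convention)."""
    return used * 100 // available


for name, (available, used) in resources.items():
    print(f"{name}: {utilization_percent(available, used)}%")
```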

Overall Design

PR Module

Operational Characteristics

Task: A function synthesized to a digital circuit in the form of a module that can be programmed and downloaded into the reconfigurable device. A task has a size and a shape.

Task Modularity: The smallest granularity that this architecture deals with is the task level. The size and shape determine the area requirement of the task in CLBs.

General-purpose application scenario: The architecture may carry out an arbitrary number of tasks. There are no predefined constraints on the tasks. The functions of the tasks are also unknown a priori.

Runtime scenario: The architecture does not know in advance when tasks will arrive, what they will be, or what their properties will be. When a task is generated, the system processes it online at runtime.
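The task model and the online scenario can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the framework's implementation: the rectangular shape, the field names, and the accept-if-area-fits policy in `process_online` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Task:
    """A task with a shape, assumed rectangular and measured in CLBs."""
    name: str
    width_clbs: int
    height_clbs: int

    @property
    def area_clbs(self):
        # Size and shape together give the CLB area requirement.
        return self.width_clbs * self.height_clbs


def process_online(arrivals, free_clbs):
    """Handle tasks one at a time in arrival order, with no look-ahead."""
    accepted, rejected = [], []
    for task in arrivals:  # each task is seen only when it arrives
        if task.area_clbs <= free_clbs:
            free_clbs -= task.area_clbs
            accepted.append(task.name)
        else:
            rejected.append(task.name)
    return accepted, rejected


arrivals = iter([Task("fir", 4, 6), Task("fft", 8, 8), Task("crc", 2, 3)])
print(process_online(arrivals, free_clbs=40))
```

Passing an iterator rather than a list mirrors the runtime scenario: the loop cannot inspect tasks that have not yet arrived.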

Issues to Address

Partitioning: Selecting the computational resources to instantiate as a component

Placement: Determining the target location of the component on the reconfigurable fabric of the device

Routing: Interfacing the component to its surrounding resources

Generation: Generating the bitstream of the component at the target location

Configuration: Writing the generated bitstream to the appropriate portions of the reconfigurable fabric
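Of the five steps above, placement is the easiest to illustrate in isolation. The sketch below uses a first-fit scan over a 2D occupancy grid of CLBs; this is a generic technique chosen for illustration, and the grid model and function names are assumptions rather than the placement method used by the framework.

```python
def first_fit(grid, width, height):
    """Return (top, left) of the first free width x height region, or None."""
    rows, cols = len(grid), len(grid[0])
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            region_free = all(
                not grid[r][c]
                for r in range(top, top + height)
                for c in range(left, left + width)
            )
            if region_free:
                return top, left
    return None


def place(grid, width, height):
    """Mark the first free region as occupied and return its location."""
    spot = first_fit(grid, width, height)
    if spot is None:
        return None  # no room: the task must wait or be rejected
    top, left = spot
    for r in range(top, top + height):
        for c in range(left, left + width):
            grid[r][c] = True
    return spot


# Hypothetical fabric of 4 rows x 6 columns of CLBs, initially empty.
fabric = [[False] * 6 for _ in range(4)]
for width, height in [(3, 2), (3, 2), (4, 2), (3, 3)]:
    print((width, height), "->", place(fabric, width, height))
```

First fit is simple and fast but can fragment the fabric; the location it returns is what the generation step would need in order to produce a bitstream for that target region.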

Routing: Reconfiguration Module Template

[Figure: reconfiguration module template. Labeled elements: OPB, Addr decoder, Slave attach, MIR/Reset, User Logic, IPIF, Reconfigurable Module, Bus Macro, Reconfigurable or Fixed Module. Legend: reconfiguration module template; intermodule signal.]

Generation: Partial Reconfiguration Flow

[Figure: partial reconfiguration flow. Stages, in order:]

• Design Entry (Top-Level Design): HDL Entry/Synthesis
• Initial Budgeting (Top-Level Design)
• Design Entry (Module): HDL Entry/Synthesis
• Active Module Implementation (Module): Mapping, Placement, Routing
• Final Assembly (Top-Level Design and Modules): Mapping, Placement, Routing
• Download to device