Dynamic Support of Processor Extensions in Cross Development Tools
Vladimir Rubanov
Institute for System Programming of RAS
SYRCoSE 2007, 31 May 2007
Extensible Embedded System (1)
System Components:
• Processor Core
• Processor Extensions = Accelerators (new FUs or co-processors)
• Memory Subsystem:
  • A single program memory.
  • Core’s data memories.
  • Shared data memory.
  • Accelerators' local data memories.
Uniform instruction set for the application developer.
Extensible Embedded System (2)
[Figure: system block diagram. Labels: Processor Core P; Internal Core’s Memory {Ep, Xp}; Core’s Local Memory MP; Core’s Data Memories Mp; Program Memory PM; Shared Memory MS; Accelerators Ai; Accelerator’s Internal Memory Ea; Accelerators’ Local Memories {Ma}; i-th Accelerator’s Local Memory Mai. Arrows: Control Flow, Data Flow.]
Accelerator Model
[Figure: accelerator block diagram. Labels: Accelerator; Execution Memory Ea; Main Accelerator Memory MA; Decoder DAc; Execution Block EA; Shared Memory MS; Local Memory Ma; CA; AA; instruction entries (f1, t1, p1) ... (fn, tn, pn). Arrows: Control Flow, Data Flow, Activation.]
System Design Process
Stage 1: Core Design
Stage 2: Accelerators Design
Stage 3: SoC System Design (combining the core and selected extensions)
Cross Development Tools
• Cycle-Accurate SW Simulator
• Profilers
• Macro Assembler
• Disassembler
• Linker and Librarian
• Visual Debugger
• Integrated Development Environment (IDE)
Cross Development Workflow
[Figure: tool chain workflow inside the Integrated Development Environment (IDE). C Source Code → Compiler → Assembly Sources (.asm) → Assembler → Object Code → Linker → Absolute Binary Module → Simulator, Debugger, Profilers, Analysis Tools → User.]
Cross Development Tools Role
At the Design Stage:
• Design space exploration and prototyping by simulator-based profiling.
• Early development of optimized software.
• HDL verification.
At the Deployment Stage:
• Development of various production software.
Formal Accelerator Description
Special language (ISE) for describing:
• Accelerator’s Memory Structure
• Accelerator’s Resources
• Accelerator’s Instruction Set:
  • assembly syntax;
  • binary coding;
  • cycle-accurate behavior and resource usage.
Memory Structure
DECLARE_MEMORY(INT(16, 3), 4096) LDM;
DECLARE_MEMORY(INT(64, 3), 2048) TM;
MEMORY(LDM, "Acc LDM");
MEMORY(TM, "Acc TM");
DECLARE_REGISTERS_FILE(INT(16), 4) grn;
REGFILE_BEGIN(grn, "General Registers")
  REGISTER(0, "GR0");
  REGISTER(1, "GR1");
  REGISTER(2, "GR2");
  REGISTER(3, "GR3");
REGFILE_END()
Accelerator Instruction Set (1)
.types
grn = [GR0:0] [GR1:1] [GR2:2] [GR3:3]
acr = [ACR1:0] [ACR2:1]
.operands
GRs = {grn : SS}
GRt = {grn : TT}
ACRa = {acr : A}
ACRb = {acr : B}
Accelerator Instruction Set (2)
.instructions
ALU01 {
  ADD GRs, GRt // syntax
  0110-00SS-0111-T0-T1 // coding
  constraints {
    GRs<>GRt : "GRs and GRt must be different"
  }
  properties {
    wgrn:GRs, rgrn:GRs, rgrn:GRt
  }
}
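To make the coding line concrete, here is a minimal Python sketch of how a pattern such as 0110-00SS-0111-T0-T1 could be turned into a mask, a match value and operand bit positions. Python is used only for illustration and is not part of the ISE tool chain. The sketch assumes that 0 and 1 are fixed bits, that letters mark operand bits, that dashes are only separators, and that the first occurrence of a letter is the most significant bit of its field. The helper names parse_pattern, encode, decode and encode_add are hypothetical.

```python
# Illustrative only: the pattern conventions below are assumptions inferred
# from the slide, not the documented semantics of the ISE language.
PATTERN = "0110-00SS-0111-T0-T1"


def parse_pattern(pattern):
    """Return (width, mask, match, fields) for a coding pattern.

    fields maps each operand letter to the bit positions it occupies,
    most significant bit first.
    """
    bits = pattern.replace("-", "")
    width = len(bits)
    mask = 0
    match = 0
    fields = {}
    for i, ch in enumerate(bits):
        pos = width - 1 - i  # leftmost character is the highest bit
        if ch in "01":
            mask |= 1 << pos
            match |= int(ch) << pos
        else:
            fields.setdefault(ch, []).append(pos)
    return width, mask, match, fields


def encode(pattern, **values):
    """Build an opcode word from operand values keyed by field letter."""
    _, _, match, fields = parse_pattern(pattern)
    word = match
    for letter, positions in fields.items():
        value = values[letter]
        if not 0 <= value < (1 << len(positions)):
            raise ValueError("operand %s out of range" % letter)
        for k, pos in enumerate(positions):
            bit = (value >> (len(positions) - 1 - k)) & 1
            word |= bit << pos
    return word


def decode(pattern, word):
    """Return operand values, or None if the fixed bits do not match."""
    _, mask, match, fields = parse_pattern(pattern)
    if word & mask != match:
        return None
    result = {}
    for letter, positions in fields.items():
        value = 0
        for pos in positions:
            value = (value << 1) | ((word >> pos) & 1)
        result[letter] = value
    return result


def encode_add(gr_s, gr_t):
    """Encode ADD GRs, GRt, enforcing the slide's GRs<>GRt constraint."""
    if gr_s == gr_t:
        raise ValueError("GRs and GRt must be different")
    return encode(PATTERN, S=gr_s, T=gr_t)
```

Under these assumptions an assembler would call something like encode_add(2, 1) for "ADD GR2, GR1", and a disassembler or simulator would call decode on the fetched word.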
Accelerator Instruction Set (3)
ALU01 {
…
behavior {
GRs := GRs + GRt;
// GRs ≡ grn[#GRs]
}
}
Accelerator Instruction Set (4)
void ALU01 (OPCODE opcode)
{
  // SS occupies opcode bits 9..8.
  UINT<2> GRs_ind = (opcode >> 8) & 3;
  // TT is split across bits 3 and 1; the first T is taken as the high bit.
  UINT<2> GRt_ind = (((opcode >> 3) & 1) << 1)
                  | ((opcode >> 1) & 1);
  grn[GRs_ind] =
    grn[GRs_ind] + grn[GRt_ind];
  FinishCycle();
}
Accelerator Instruction Set (5)
MAC01 {
  MAC ACRa, GRs, GRt
  …
  behavior {
    // the first cycle
    mulres := GRs * GRt;
    FinishCycle();
    // the second cycle
    ACRa := ACRa + mulres;
  }
}
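The two-cycle MAC behavior can be mimicked with a Python generator, where each yield plays the role of FinishCycle(). This is only a sketch of the idea of running an instruction as an explicitly resumed fiber. The state layout, the function names mac01 and run_cycles, and the absence of any word-width truncation are assumptions made for illustration.

```python
def mac01(state, acr, gr_s, gr_t):
    """Two-cycle multiply-accumulate written as a generator (fiber stand-in)."""
    # First cycle: multiply the two general registers.
    mulres = state["grn"][gr_s] * state["grn"][gr_t]
    yield  # stands in for FinishCycle()
    # Second cycle: accumulate into the selected accumulator register.
    state["acr"][acr] += mulres


def run_cycles(fiber):
    """Resume a fiber once per simulated cycle; return the cycles consumed."""
    cycles = 0
    while True:
        cycles += 1
        try:
            next(fiber)
        except StopIteration:
            return cycles


state = {"grn": [3, 4, 0, 0], "acr": [10, 0]}
used = run_cycles(mac01(state, acr=0, gr_s=0, gr_t=1))
print(used, state["acr"][0])
```

Because the simulator decides when to resume each fiber, it can interleave core and accelerator instructions cycle by cycle: after the first resume the accumulator is still unchanged, and only the second resume commits the sum.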
Inter-Instruction Conflicts
.inter-constraints
[@e2_write_acr = read_acr] % error: "Write After Read conflict for accumulator"
[@p_write && memory_access] % warning: "1 cycle stall: memory access immediately after pointer update"
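A rough Python sketch of how an assembler might apply the second rule is given below. It assumes that the @ prefix refers to a property of the immediately preceding instruction, which matches the wording of the warning but is not stated on the slide. The first rule is left out because the meaning of its = operator and e2_ prefix is not given. The Rule type, the check and insert_nops functions, and the mnemonics INCP and LD are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    prev_prop: str   # property required on the preceding instruction
    cur_prop: str    # property required on the current instruction
    severity: str
    message: str


RULES = [
    Rule("p_write", "memory_access", "warning",
         "1 cycle stall: memory access immediately after pointer update"),
]


def check(program, rules=RULES):
    """Scan adjacent instruction pairs; program is [(mnemonic, properties)]."""
    diagnostics = []
    for i in range(1, len(program)):
        prev_props = program[i - 1][1]
        cur_props = program[i][1]
        for rule in rules:
            if rule.prev_prop in prev_props and rule.cur_prop in cur_props:
                diagnostics.append((i, rule.severity, rule.message))
    return diagnostics


def insert_nops(program, rules=RULES):
    """Separate every conflicting pair with a NOP (one possible resolution)."""
    fixed = []
    for instr in program:
        if fixed and check([fixed[-1], instr], rules):
            fixed.append(("NOP", frozenset()))
        fixed.append(instr)
    return fixed


program = [
    ("INCP", frozenset({"p_write"})),        # hypothetical pointer update
    ("LD", frozenset({"memory_access"})),    # hypothetical memory load
    ("ADD", frozenset()),
]
for index, severity, message in check(program):
    print(index, severity, message)
```

Representing properties as plain sets keeps the rule check independent of any particular accelerator, which is the point of describing conflicts declaratively.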
Cross Tools Reconfiguration (1)
1. Accelerator ISE description is created either visually or in plain text.
2. Accelerator description is compiled into a shared library module (.dll on Windows, .so on Linux).
3. Such modules are specified in the cross system configuration.
4. API is used by Core Simulator to execute accelerator instructions as fibers (explicitly controlled threads).
5. API is used by Assembler, Disassembler, Debugger and IDE to extract necessary meta-information about the accelerators.
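The steps above can be pictured with a small in-process Python sketch. The real tools load compiled .dll or .so modules; here a module is just a Python object so that the example stays runnable. All names (InstructionInfo, AcceleratorModule, ToolConfig, alu01) are hypothetical, the mask and match values are derived from the ADD coding under the assumption that letters mark operand bits, and the 16-bit wrap-around is an assumption based on the INT(16) register declaration.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class InstructionInfo:
    mnemonic: str                           # meta-information for tools
    mask: int                               # fixed-bit mask of the coding
    match: int                              # fixed-bit values of the coding
    execute: Callable[[dict, int], None]    # behavior used by the simulator


@dataclass
class AcceleratorModule:
    name: str
    instructions: List[InstructionInfo]


class ToolConfig:
    """Stand-in for the cross system configuration listing plugged modules."""

    def __init__(self):
        self.modules = []

    def plug_in(self, module):
        self.modules.append(module)

    def lookup(self, opcode) -> Optional[InstructionInfo]:
        for module in self.modules:
            for info in module.instructions:
                if opcode & info.mask == info.match:
                    return info
        return None

    def disassemble(self, opcode):
        info = self.lookup(opcode)
        return info.mnemonic if info else ".word 0x%04X" % opcode

    def execute(self, state, opcode):
        info = self.lookup(opcode)
        if info is None:
            raise ValueError("unknown opcode 0x%04X" % opcode)
        info.execute(state, opcode)


def alu01(state, opcode):
    """ADD GRs, GRt with the field layout assumed from 0110-00SS-0111-T0-T1."""
    s = (opcode >> 8) & 3
    t = (((opcode >> 3) & 1) << 1) | ((opcode >> 1) & 1)
    state["grn"][s] = (state["grn"][s] + state["grn"][t]) & 0xFFFF


config = ToolConfig()
print(config.disassemble(0x6273))   # unknown before the module is plugged in
config.plug_in(AcceleratorModule("acc", [InstructionInfo("ADD", 0xFCF5, 0x6071, alu01)]))
print(config.disassemble(0x6273))   # resolved through the module's metadata
```

The same lookup serves both the disassembler (meta-information only) and the simulator (behavior), which mirrors items 4 and 5 of the list.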
Cross Tools Reconfiguration (2)
[Figure: Visual ISE Editor → ISE Specification → (automatic generation) → Accelerator Cross Module (.dll or .so) → (on-the-fly reconfiguration) → The Cross Tools.]
MetaDSP: General Layout
MetaDSP: Instruction Set Tree
MetaDSP: Instruction Properties
Results
Generalized model of extensible embedded systems.
A formalism for specifying particular accelerators in ISE language.
Tools for visual editing, analysis and verification of ISE specifications.
Framework for generating executable accelerator modules with meta-information.
Reconfigurable cross development tool chain that is dynamically extensible through plugged-in accelerator modules.
MetaDSP Framework
The results have been used in MetaDSP, a framework for fast construction and modification of cross development tools for embedded systems.
Proven in 5 commercial projects for different extensible processor families:
• RISC 32/16 bit DSPs
• ARM-like RISC
• VLIW DSP
Used by customers’ development teams in Sweden, Taiwan, China, and the USA.
Implemented Accelerators
• Fast Fourier Transform (FFT).
• Echo cancellation algorithms.
• Complex (imaginary) arithmetic operations.
• Image processing operations (JPEG accelerator).
• Digital voice filtering operations (FIR, IIR).
• Voice coding/decoding (AMR).
• MP3 music decoding.
Assembler
• Parameterized macros support.
• Conditional assembly (if, switch, repeat, etc.).
• Multi-dimensional arrays.
• Constant expression calculation.
• Inter-instruction conflict detection.
• Automatic NOP insertion.
• C debugging info in DWARF 2 format.
Linker
Memory holes optimization (both at variable and module levels).
Visual interface to control memory layout with advanced features (fixed/floating address, alignment).
SW Simulator
• Cycle-accurate.
• High speed (50 MCPS on a 2000 MHz Core 2).
• Pipeline simulation with stalls, zero-overhead loops, interrupts and timers.
• Dynamically modifiable code support.
• Run-time semantics checks.
• Breakpoints, sample points, trace points.
Debugger and IDE
• Mixed C/Asm/Disasm source level debugging
• Projects support
• Full C expressions in Watch window
• Call Stack with frame switch
• Various display formats for register and memory contents
• Code helper in editor
• Syntax highlighting in editor
• Breakpoints / Samplepoints / Tracepoints / Watchpoints
• Source Browser
• RTOS debugging support
• Various profilers (linear, call graph, instruction tree, RTOS load, RTOS sequence)