© 2003 Intel Corporation
Software Development Environment for Reconfigurable Communications Architecture
Vladimir Ivanov, Radio Communications Lab/Corporate Technology Group
Contributor: Vicki Tsai, Radio Communications Lab/Corporate Technology Group
Reconfigurable Computing Tutorial, International Symposium on System-on-Chip Conference
Tampere, Finland
Intel Corporation
18 November 2003
Communications Technology Lab
Outline
- RCA Review
  - What are the specific architectural features which impact software development tools?
- Programming flow
  - How do specific architectural features impact software development process?
- Software Development Environment
  - Goals and Challenges
  - Process specifics
  - System-level issues
- Development Environment Concept
  - Algorithm and compiler point of view
RCA Review
What are the specific architectural features which impact software development tools?
RCA Review
- Scalable mesh interconnect of heterogeneous processing elements (PEs)
- Interconnect with Nearest Neighbour Mesh
- Clock frequency dependent on load and process
RCA Review
Ubiquitous wireless communication across multiple protocols
A scalable mesh interconnect of heterogeneous processing elements (PEs):
- Configurable basebands for multiple (concurrent) PHY/MAC operation
- Power and size conserving when compared to “multiple” dedicated cores or “traditional” SDR (S/W defined radio) approaches
- Tools for simple programming and portability to different arrays of elements
[Figure: mesh of PEs (types A to E) with IO (EC) and IO (AFE 1 to 3) blocks, connected to CMOS AFE 1 to 3 and UMAC 1 to 3. Protocols shown: Ultra-wideband WPAN, 802.11a WLAN, WCDMA WWAN.]
Figure source Intel research and development
Programming Flow
How do specific architectural features impact the software development process?
1. Divide the protocol into modes
Preamble Detect:
Diversity Selection:
…
Steady-State Data:
Each mode refers to a different, non-overlapping period in time
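
One way to picture non-overlapping modes is a small state machine in which exactly one mode is active at a time. The sketch below is illustrative only: the mode names follow the slide labels, while the events that end each mode are assumptions (the slide does not name them), and the modes elided by "…" on the slide are simply left out.

```c
#include <assert.h>

/* Mode names follow the slide labels; modes elided on the slide are omitted. */
typedef enum {
    MODE_PREAMBLE_DETECT,
    MODE_DIVERSITY_SELECTION,
    MODE_STEADY_STATE_DATA,
    MODE_COUNT
} rx_mode_t;

/* Hypothetical events that end a mode; not taken from the slides. */
typedef enum {
    EV_NONE,
    EV_PREAMBLE_FOUND,
    EV_ANTENNA_CHOSEN,
    EV_PACKET_DONE
} rx_event_t;

/* Exactly one mode is active at any time, so the modes never overlap. */
rx_mode_t next_mode(rx_mode_t current, rx_event_t ev)
{
    assert(current < MODE_COUNT);
    switch (current) {
    case MODE_PREAMBLE_DETECT:
        return ev == EV_PREAMBLE_FOUND ? MODE_DIVERSITY_SELECTION : current;
    case MODE_DIVERSITY_SELECTION:
        return ev == EV_ANTENNA_CHOSEN ? MODE_STEADY_STATE_DATA : current;
    case MODE_STEADY_STATE_DATA:
        return ev == EV_PACKET_DONE ? MODE_PREAMBLE_DETECT : current;
    default:
        return current;
    }
}
```

Because the mode is a single value, the set of functions that must be resident on the PEs can be chosen per mode, which is what the following steps rely on.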
2. Partitioning: Specify functions for each mode
Preamble Detect: AFE1 (ant. 1), AGC, Dec. Filter, Preamble Det.
Diversity Selection: AFE1 (ant. 1), AGC, Dec. Filter, SNR Calc.; AFE1 (ant. 2), AGC, Dec. Filter, SNR Calc.; Diversity Sel.
Steady-State Data: AFE1 (ant. 1), Dec. Filter, AFC, Fixed IQ Imb. Corr., Guard Int. Removal, Deinter., Adapt. IQ Imb Corr, FEQ, 64-Pt FFT, QAM Demap, Viterbi, Descram., Host IO
…
Note: This description is function based and not hardware based.
3. Communication: Establish communication structure among functions
[Diagram: the per-mode functions of Step 2 (Preamble Detect, Diversity Selection, Steady-State Data, …), now with connections drawn between communicating functions.]
4. Aggregation: Determine onto which PE types the functions could be mapped
[Diagram: the same per-mode functions, each marked with a candidate PE type. Legend: PE type A, PE type B, PE type C, PE type D, PE type E.]
5. Check if resources are available for the current hardware layout
[Diagram: the same per-mode functions marked with candidate PE types, plus a chart of Resource Usage (%) for PE types A to E.]
HW topology (PE types):
B A
C D
C C
E D
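
The resource check of Step 5 can be sketched as a per-type budget comparison. This is only an illustration under stated assumptions: the `load_percent` cost model (percent of one PE of the given type) and all names are hypothetical, and the check ignores the fact that a single function cannot be split across PEs. The PE counts a caller would pass for the topology grid on the slide are A: 1, B: 1, C: 3, D: 2, E: 1.

```c
#include <assert.h>
#include <stddef.h>

enum { PE_A, PE_B, PE_C, PE_D, PE_E, PE_TYPE_COUNT };

/* One protocol function, aggregated onto a PE type in Step 4. */
typedef struct {
    const char *name;
    int pe_type;      /* PE_A .. PE_E */
    int load_percent; /* assumed cost: percent of one PE of that type */
} func_req_t;

/* Fill usage[t] with the percentage of type-t capacity that is requested
   (rounded up). Returns 1 if every type fits (usage <= 100), else 0. */
int check_resources(const func_req_t *funcs, size_t n,
                    const int pe_count[PE_TYPE_COUNT],
                    int usage[PE_TYPE_COUNT])
{
    int demand[PE_TYPE_COUNT] = {0};
    int fits = 1;

    for (size_t i = 0; i < n; i++) {
        assert(funcs[i].pe_type >= 0 && funcs[i].pe_type < PE_TYPE_COUNT);
        demand[funcs[i].pe_type] += funcs[i].load_percent;
    }
    for (int t = 0; t < PE_TYPE_COUNT; t++) {
        if (pe_count[t] == 0) {
            /* Demand for a type that is absent from the layout is over budget. */
            usage[t] = demand[t] > 0 ? 101 : 0;
        } else {
            usage[t] = (demand[t] + pe_count[t] - 1) / pe_count[t];
        }
        if (usage[t] > 100)
            fits = 0;
    }
    return fits;
}
```

If the check fails for some PE type, the flow returns to aggregation (Step 4) to pick a different candidate type for some functions.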
6. Mapping: Place functions onto specific PEs
[Diagram: each function of the per-mode lists (including AGC and Host IO) is placed on one of six PEs, PE1 to PE6.]
7. Generate “code” for this mapping
[Diagram: the mapped functions of Step 6; one Binary Image is generated for each of PE1 to PE6.]
8. Check if desired performance has been reached
[Diagram: the six Binary Images, Stimulus Data and the HW topology are fed to the System Profiler, which produces Performance results.]
If desired performance has been met, output the binary images. Otherwise, use the results to adjust the mapping and go to Step 2 or 4 or 6.
Programming Flow Summary
1. Divide the protocol into modes
2. Specify functions for each mode
3. Establish communication structure among functions
4. Determine onto what PE types the functions could be mapped
5. Check if we have the resources in the hardware
6. Place functions onto specific PEs
7. Generate “code” for this mapping
   - If code cannot be generated because the PE cannot fit the assigned functions, try a different mapping
8. Check if desired performance has been reached
   - If not, try a different mapping
   - Otherwise, output the generated code from Step 7
[Side labels on the slide: Programmer, Tools]
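
The retry behaviour of Steps 6 to 8 can be sketched as a loop over candidate mappings. Everything here is hypothetical: the hook names, the integer mapping identifiers and the two stand-in functions exist only so the sketch compiles; the slides do not describe the tools at this level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical tool hooks for Steps 7 and 8. */
typedef struct {
    int (*generate_code)(int mapping_id);     /* 1 if every PE fits its functions */
    int (*meets_performance)(int mapping_id); /* 1 if profiler results are acceptable */
} flow_tools_t;

/* Try candidate mappings 0 .. n_candidates-1 in order (Step 6).
   Returns the first mapping that passes Steps 7 and 8, or -1 if none does,
   in which case the programmer would go back to Step 2 or Step 4. */
int run_mapping_loop(const flow_tools_t *tools, int n_candidates)
{
    assert(tools && tools->generate_code && tools->meets_performance);
    for (int id = 0; id < n_candidates; id++) {
        if (!tools->generate_code(id))
            continue;  /* a PE cannot fit its functions: try another mapping */
        if (tools->meets_performance(id))
            return id; /* output the binary images for this mapping */
    }
    return -1;
}

/* Stand-in hooks for illustration only: mapping 0 does not fit,
   mapping 1 fits but is too slow, mappings 2 and up pass both checks. */
static int demo_generate(int id) { return id != 0; }
static int demo_meets_performance(int id) { return id >= 2; }
```

With the stand-in hooks, `run_mapping_loop` returns 2 when given five candidates and -1 when given only two.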
Software Development Environment
- Goals and Challenges
- Process specifics
- System-level issues
Tools Goals
- Primary goal is to assure development of effective code for RCA
  - Developed code should effectively use all RCA capabilities
  - Implemented protocols should meet user requirements
- Abstract code development from hardware
  - If the total number of PEs changes, or the number of PEs of a certain type changes, the algorithm does not need to be altered
- Give reasonable programming abstraction level for the programmer
- Provide effective environment for development, debugging and testing of software
Tools Challenges
- Reasonable balance for abstracting software development from hardware
- Classical challenges for parallel architecture
  - Decomposition of program into parallel processes
  - Effective mapping of processes to PEs
  - Effective communication among processes
  - Synchronization among processes
- Protocol concurrency implies dynamic RCA resource distribution among protocols
- Heterogeneity of PEs mesh
  - Variety of Processing Elements (PEs)
  - PEs may not be processor-based
  - Methods to program PEs differ greatly
- Guaranteed protocol performance
- Effective data visualization from multiple PEs
- High performance simulation of RCA
Software Development Process
[Flowchart. Stages: START, Algorithm development, Source code development, Program code translation, Debugging, Testing, Performance measurement, then the decision "Meets user’s reqs?" (Yes: AWARD; No: Redevelopment or Algorithm redevelopment).]
Tools shown alongside the stages:
- Tools for algorithm development
- Tools for source code development
- Translation tools, Linkage tools
- Debugger, Simulator
- Profiler
Software Development Environment
[Block diagram. Recoverable elements:]
- IDE: Source Code Editor, Process diagram editor, Descriptions editor, etc.
- Tools: Descriptions Translator, Mapper, Packager, Translation tool, Linkage tool, Librarian, Loader, Simulation/Execution
- Inputs and intermediate data: Hardware description, Software description, Parsed Descriptions (XML), User Constraints, Map Directives, Execution statistics, Processes Layout (XML), Source code of processes, Library, Relocatable images of processes, Makefile, Relocatable Image, Loadable Image
- Note on the slide: Packager creates Makefile in accordance with layout scheme and runs make for the loadable image building
- Legend: Algorithm and source code development tools; Translation and linkage tools; Specific tools
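Input Example
[Process diagram: processes myFn (Auto), realFIR, myFn2 (PE type A) and myFn3 (Auto), connected through ports In0, In1 and Out0 between "in data" and "out data". Other labels: L (Auto).]

myFn2.ccs:
    _myFn2:
    ...
    X0:X7=*P0++8 || Y0:Y7=*P1++8 ||
    M0=X0*Y0 || M1=X1*Y1 || M2=X2*Y2 || M3=X3*Y3 ||
    M4=X4*Y4 || M5=X5*Y5 || M6=X6*Y6 || M7=X7*Y7 ||
    A00=M0+M1 || A20=M2+M3 || A4=M4+M5 || A6=M6+M7;
    ....
    DONE

myFn.c:
    /* Squares each input sample and sends it to output port 0.
       int16, IN0LEN and send_output() are assumed to come from the RCA tools. */
    void myFn(int16 in0[], int16 out0[])
    {
        int16 i, x;
        for (i = 0; i < IN0LEN; i++) {
            x = in0[i] * in0[i];
            send_output(0, x);
        }
    }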
System Simulator
- Cycle accurate simulation
- High performance
- Allows evaluation of latency and computational overhead
- Possibility to connect two instances of the System Simulator to each other
- Provides debugging facilities
System Simulator
- SysSim contains Simulator Core (SC) and Individual Simulators (IS)
- Two abstraction layers for IS representation
  - High level object
  - Scheduled Object
- Object design principle: if an object is in state S1 and receives an input signal In, then after delay D it changes its state to S2 and produces an output signal Out
[Block diagram: the RCA System Simulator contains the Simulator Core, Individual PE simulators and the Scheduled Objects Layer (efficient cycle-accurate scheduling). It is configured by a Hardware Configuration File and exchanges AFE Data (to data files or to another instance of the Simulator), Host Data and JTAG Control. The Debugger sends debug queries and receives debug events/responses. The User Application reaches the simulator through the RCA Device Driver (Host Data, driver control).]
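
The object design principle above (state S1, input In, delay D, new state S2, output Out) can be sketched as a rule table plus a per-cycle tick. This is a minimal illustration, not the SysSim implementation: all type and function names are hypothetical, and the sketch assumes at most one transition in flight per object.

```c
#include <assert.h>

#define NO_SIGNAL (-1)

/* One rule: in state s1, on input in, after delay d cycles,
   go to state s2 and emit out. */
typedef struct {
    int s1, in;
    unsigned d;
    int s2, out;
} rule_t;

typedef struct {
    int state;          /* current state */
    int pending;        /* 1 while a transition waits for its delay */
    int next_state;     /* state to enter when the delay expires */
    int out_signal;     /* output to emit when the delay expires */
    unsigned due_cycle; /* cycle at which the transition fires */
} sched_obj_t;

/* Deliver input `in` at cycle `now`; returns 1 if a rule matched. */
int obj_input(sched_obj_t *o, const rule_t *rules, int n_rules,
              int in, unsigned now)
{
    assert(o && rules);
    if (o->pending)
        return 0; /* simplification: one transition in flight */
    for (int i = 0; i < n_rules; i++) {
        if (rules[i].s1 == o->state && rules[i].in == in) {
            o->pending = 1;
            o->next_state = rules[i].s2;
            o->out_signal = rules[i].out;
            o->due_cycle = now + rules[i].d;
            return 1;
        }
    }
    return 0;
}

/* Advance to cycle `now`; returns the output signal if the pending
   transition fires on this cycle, NO_SIGNAL otherwise. */
int obj_tick(sched_obj_t *o, unsigned now)
{
    assert(o);
    if (o->pending && now >= o->due_cycle) {
        o->state = o->next_state;
        o->pending = 0;
        return o->out_signal;
    }
    return NO_SIGNAL;
}
```

A scheduler built on this idea only needs to visit an object at its `due_cycle`, which is one plausible reading of why a scheduled-objects layer can stay cycle accurate while doing little work per cycle.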
Simulation Performance
- Comparing SystemC core and SysSim core
- SC_METHOD process was used for SystemC
- Simulated object is N instances of D flip-flop objects
- Simulation on Intel 2.4 GHz Pentium 4
- 4*4 Mesh (~1000 objects), 400 MHz
- 1 sec simulation takes ~100 hours for SystemC Core and ~13 hours for SysSim Core
[Diagram: Source, a chain of N D flip-flop instances, Destination.]
[Chart: Simulation time (sec), 0 to 900, versus N of scheduled objects, 0 to 12000. Legend: CTL core, SystemC core.]
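
The figures quoted for the 4*4 mesh can be turned into a throughput estimate. One simulated second at 400 MHz is 4×10^8 clock cycles, so ~100 hours corresponds to roughly 1,100 simulated cycles per wall-clock second and ~13 hours to roughly 8,500, a ratio of about 7.7. The helper below just reproduces that arithmetic; its name is illustrative.

```c
#include <assert.h>

/* Simulated clock cycles processed per second of wall-clock time. */
double cycles_per_wall_second(double clock_hz, double simulated_seconds,
                              double wall_hours)
{
    assert(wall_hours > 0.0);
    return (clock_hz * simulated_seconds) / (wall_hours * 3600.0);
}
```

Calling it with `(400e6, 1.0, 100.0)` and `(400e6, 1.0, 13.0)` gives the two rates quoted above.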
Development Environment Concept
Algorithm and Compiler point of view
Tools Development Concepts
- Naive Phase:
  - Manual program partitioning
  - Manual code optimization
  - Independent compiler tools
  - Static hardware and software
- Mature Phase:
  - Automatic program partitioning
  - Automatic code optimization
  - Common compiler tools
  - Static hardware and software
- Advanced Phase:
  - Macro architecture description tools
  - Automatic generation of micro architecture description
  - Automatic software tools generation
  - Protocol partitioning for joint hardware-software optimization
Tools Development Naive Phase
- Enhanced Traditional Model
  - Networking (communication architecture)
  - Mapping (distributable compilation)
- Traditional tool-suite for RCA
  - Complete development tool-suite
  - Integration of tools for sequential programming
- Solution constraints
  - Aided mapping (user-defined mapping of process to PE)
Enhanced Traditional Model
[Block diagram. Recoverable elements:]
- Inputs: C source code for PE i, Assembly code for PE j, C source code for PEC, Specialized code for VMCA, Assembly code for FMCA
- Per-PE tool chains: C Compiler, Assembler, Configurator, Linker, producing Object modules, Executable modules and a Reconfiguration vector
- RCA Linker, producing the Loadable image
- TinyMapper and Description Translator, producing Make directives and Link directives
- RCA Simulator and Debugger
Tools Development Mature Phase
- True distributable compilation
  - Automated mapping
- Global optimization
  - Intermodule optimization
  - Optimization on heterogeneous environment
- Enhanced development tools
  - C Compiler with high-level IR generation
  - High-level IR Linker
  - Retargetable Code Generator
Distributable Compilation Architecture
[Block diagram. Recoverable flow:]
- C source code → C Front-End → IR 1
- Assembly code → Assembler → IR 2
- Specialized code → Configurator → IR 3
- IR 1, IR 2, IR 3 and IR Libs → IR Linker → General IR
- General IR → Mapper → IR for PE1, IR for PE2, IR for PE3
- IR for PEn → CG for PEn → Object module n (n = 1, 2, 3)
- Obj Libs
Tools Development Advanced Phase
- Distributable compilation
- Retargetable development tools
  - Retargetable C Compiler (tunable CG and optimization)
  - Retargetable Assembler (target architecture templates)
  - Retargetable Simulator (for RCA configurations)
- Comprehensive Target Descriptive Language
- Target Tools Generator
- HDL code generation
- Joint hardware and software optimization
Co-Design Architecture
[Block diagram. Recoverable elements:]
- RCA Hardware Design: Comprehensive Target Description, Target Representation, HDL Output, HDL (VHDL)
- Tools Generator, producing the Target Tools: C Compiler, Assemblers, RCA Simulator, Debugger
- Software Design: Source Code for RCA, High Level IR, CGi
Summary
- RCA programming process characteristics
  - Parallel running processes with message exchange
  - Procedure level parallelism
  - “Partitioning-communication-aggregation-mapping” based optimization cycle
- RCA software development environment contains standard set of tools for
  - Algorithm and source code development
  - Source code translation and linking
- RCA software development environment contains specific set of tools for the optimization cycle
- 3 phases of software tools development
  - Main goal of the naive and mature phases is to assure (manually or automatically) program code effectiveness
  - Main goal of advanced phase is to assure joint hardware-software effectiveness of PHY/MAC algorithms implementation
Acknowledgements
Ernest Tsui, Vladimir Pudovkin, Vladimir Pavlov, Sergey Mironov, Veronica Mikheeva, Tony Chun, Michael K. Chen