1
Effective System Design with ARM System IP
Mentor Technical Forum 2009
Serge Poublan
Product Marketing Manager
ARM
2
Higher level of integration
WiFi
Bluetooth
Camera
Platform OS Graphic 13 days standby
H.264
MP3
Flash 9
128 MB DDR
Skype
3
World-class market-proven technology
20+ processors for every application
200+ silicon partners
500+ licenses
15 billion units shipped
Processors are evolving, e.g. MP (x1-4)
[Figure: ARM processor roadmap across the ARMv4, ARMv5, ARMv6 and ARMv7 (Cortex) architectures. Processors shown: ARM7TDMI(S), ARM920T, ARM922T, SC100, ARM7EJ-S, SC200, ARM926EJ-S, ARM946E-S, ARM966E-S, ARM968E-S, ARM1026EJ-S, ARM1136J(F)-S, ARM1156T2(F)-S, ARM1176JZ(F)-S, ARM11 MPCore, Cortex-M0, Cortex-M1, Cortex-M3, SC300, Cortex-R4, Cortex-R4F, Cortex-A8, Cortex-A9.]
4
ARM Mali GPU - Scalable Performance to over 1G Pixel/s
[Figure: Mali™-55, Mali™-200 and Mali™-400 MP plotted by visual complexity (vertical axis) against screen resolution (horizontal axis). Use cases shown: Next Generation Navigation, Flash Lite, Mobile Gaming, Web Browsing, Java, Gaming, 3D Navigation, Flash 10, TV HD UI, Video Post Processing, HD Video Post Processing, 2D/3D Presentations, HD 3D Gaming, Console 3D Gaming.]
5
Higher Mobile Device Resolution
[Figure: timeline from 2007 to 2013 of mobile display resolutions: QVGA (320x240), VGA (640x480), WVGA (800x480), WSVGA (1024x600), WXGA (1280x800), 1080p30 (1920x1080), 1080p60 (1920x1080).]
Requirements of next-generation mobile platforms
- Increasing bandwidth requirements simply to refresh the display
- These figures ignore fill rate, input vertex data and texture bandwidth

Display mode   Resolution   Frame rate   Refresh bandwidth (MB/s)
1080p60        1920x1080    60 fps       475
1080p30        1920x1080    30 fps       237
720p           1280x720     30 fps       105
WVGA           800x480      30 fps       44
VGA            640x480      30 fps       35
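
The refresh figures can be reproduced with a short calculation. The sketch below assumes 4 bytes per pixel (32-bit colour) and that "MB" means 2**20 bytes; neither assumption is stated on the slide, but together they match the listed values after rounding. The function name and the mode table are illustrative only.

```python
# Assumption: the slide's "MB" is 2**20 bytes.
BYTES_PER_MB = 1024 * 1024


def refresh_bandwidth_mb_s(width, height, fps, bytes_per_pixel=4):
    """Bandwidth needed only to scan the frame buffer out to the display.

    bytes_per_pixel=4 (32-bit colour) is an assumption, not a slide value.
    """
    return width * height * fps * bytes_per_pixel / BYTES_PER_MB


# Display modes taken from the slide: (width, height, frames per second).
MODES = {
    "1080p60": (1920, 1080, 60),
    "1080p30": (1920, 1080, 30),
    "720p": (1280, 720, 30),
    "WVGA": (800, 480, 30),
    "VGA": (640, 480, 30),
}

for name, (w, h, fps) in MODES.items():
    print(f"{name:8s} {refresh_bandwidth_mb_s(w, h, fps):6.1f} MB/s")
```

Fill rate, vertex data and texture traffic come on top of this baseline, so the real memory load of a graphics workload is higher.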
6
Example SoC Mobile Platform
[Block diagram. Blocks shown: CPU with L2 cache controller (L2CC) and L2 cache, media engine, video engine, graphic engine, DMA engine, LCD controller, interrupt controller, AMBA interconnect, dynamic memory controller, static memory controller, SDRAM controller, LPDDR2 (64 or 128), NAND flash, and the peripherals UART0, UART1, SPI, WDT, RTC, Timer0, Timer1 and GPIO. The blocks are annotated along two axes: latency requirement and bandwidth requirement.]
7
Example SoC Mobile Platform
[Block diagram: the example SoC again (CPU with L2CC and L2 cache, media, video and graphic engines, DMA engine, LCD controller, interrupt controller, AMBA interconnect, dynamic and static memory controllers, SDRAM controller, LPDDR2 (64 or 128), NAND flash), now with the “Digital Highway” label added.]
8
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
9
AMBA Ecosystem
The on-chip infrastructure is critical to system performance
- Increased focus on processor memory performance
- Different types of processors have different requirements
ARM has grown the AMBA architecture ecosystem to help accelerate SoC design:
- 70+ Connected Community partners have AMBA-compatible products
- 10+ AMBA specification downloads a day
“… the de facto standard is of course the ARM bus architecture, AMBA.” Ron Wilson, EETimes
10
Design to Minimise Latency
Each path must be designed to minimise the inherent pipeline latency
- Next generation AXI Interconnect halves the interconnect latency
- Masters which issue multiple AXI requests effectively hide latency
PrimeCell Cache Controllers
- Trade an increase in minimum latency for dramatically reduced average latency
[Figure: round trip memory latency along the path processor sub-system, AXI interconnect, dynamic memory controller, DDR2 PHY, DDR2 SDRAM. Contributions shown: address format and arbitration, DDR2 SDRAM CAS latency, de-skew and capture, data FIFO and bus interface.]
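
Why multiple outstanding requests hide latency can be illustrated with Little's law: sustained throughput is roughly the data in flight divided by the round trip latency, up to the peak the memory can deliver. The sketch below is a back-of-envelope model, not a description of any ARM component; the burst size, latency and peak figures are made-up illustration values.

```python
def sustained_bandwidth_mb_s(outstanding, burst_bytes, latency_ns, peak_mb_s):
    """Little's law estimate of what one master can sustain.

    outstanding : requests the master keeps in flight
    burst_bytes : bytes returned per request
    latency_ns  : round trip memory latency per request
    peak_mb_s   : ceiling set by the memory system (decimal MB/s)
    """
    # bytes per nanosecond * 1000 = decimal MB per second
    demand = outstanding * burst_bytes * 1000.0 / latency_ns
    return min(demand, peak_mb_s)


# Hypothetical numbers: 64-byte bursts, 100 ns round trip, 3200 MB/s peak.
for n in (1, 2, 4, 8):
    bw = sustained_bandwidth_mb_s(n, burst_bytes=64, latency_ns=100, peak_mb_s=3200)
    print(f"{n} outstanding -> {bw:.0f} MB/s")
```

With these numbers a master limited to one request in flight reaches only a fifth of the peak, while one that keeps enough requests in flight is bounded by the memory rather than by the latency.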
11
Design to Maximise Throughput
Effective on-chip Quality of Service depends on the co-operation of the interconnect and memory controller
- Support for multiple outstanding requests
- The best use of memory pages by scanning the list of requests
- Controlling the order of queued transactions to:
  - Meet maximum latency targets
  - Ensure throughput-dependent processors are well serviced
  - Provide low latency paths
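
The idea of scanning the request queue for page hits while still honouring latency targets can be sketched as a simple selection policy. This is a generic illustration in the spirit of first-ready scheduling, not the algorithm used by any ARM memory controller; the `Request` fields, the `max_wait` budget and the master names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Request:
    master: str    # who issued the request (illustrative label)
    row: int       # DRAM row (page) the request targets
    arrival: int   # cycle at which the request was queued
    max_wait: int  # hypothetical latency budget in cycles


def pick_next(queue, open_row, now):
    """Choose the next request to issue to memory.

    Priority: 1) oldest request that has used up its latency budget,
              2) oldest request that hits the currently open row,
              3) oldest request overall.
    """
    if not queue:
        return None
    overdue = [r for r in queue if now - r.arrival >= r.max_wait]
    if overdue:
        return min(overdue, key=lambda r: r.arrival)
    hits = [r for r in queue if r.row == open_row]
    if hits:
        return min(hits, key=lambda r: r.arrival)
    return min(queue, key=lambda r: r.arrival)


queue = [
    Request("dma", row=7, arrival=0, max_wait=100),
    Request("gpu", row=3, arrival=1, max_wait=100),
    Request("lcd", row=9, arrival=2, max_wait=5),
]
print(pick_next(queue, open_row=3, now=4).master)   # page hit is preferred
print(pick_next(queue, open_row=3, now=10).master)  # latency budget wins
```

The ordering of the three rules is the design choice: page hits raise throughput, but a master with a hard deadline must be able to override them.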
12
ARM Level 2 Cache Controllers
[Block diagram: the example SoC (CPU, media engine, video engine, graphic engine, DMA engine, LCD controller, interrupt controller, AMBA interconnect, dynamic and static memory controllers, SDRAM controller, LPDDR2 (64 or 128), NAND flash, “Digital Highway”), with the L2 cache highlighted.]
13
L2CC Increases Processor Performance
MPEG4 Decode on ARM1136EJ-S, relative performance
Benchmark: MPEG4 decode
System: ARM PrimeXsys Platform for ARM1136J-S
CPU: 400 MHz ARM1136J-S, 16K I & D caches
Memory: 100 MHz 32-bit SDRAM
L2 cache: L210 128K unified L2 cache

L2 cache size   Performance relative to no L2
No L2           baseline
128K L2         +74%
256K L2         +102%
512K L2         +104%

Web Page Render Time as a function of L2 Cache Size
[Chart: speed-up compared to 0K L2 (axis from 0.0 to 4.0) for L2 cache sizes of 0, 128, 256 and 512 KB, with separate series for first-time and subsequent renders.]
Benchmark: Linux + Mozilla (5 HTML pages from I-Bench looped 4 times)
CPU: Cortex-A8 (speed, L1 cache), L2 part of Cortex-A8
Results may vary with system configuration and web content
14
L2CC Increases System Performance
Reduced system power consumption
- An external memory access costs ~10x more energy than an on-chip access
- External memory accesses are reduced with an L2 cache
- Enables use of a lower-power and lower-cost memory sub-system
  - E.g. 16-bit instead of 32-bit external interface
  - Or LPDDR instead of DDR2
Reduced on-chip traffic and contention
- Only cache misses are propagated to the interconnect
- Improves overall system performance
- Provides more bandwidth to other SoC components
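
The power and traffic arguments can be made concrete with a toy model. It takes the slide's ~10x ratio as the cost of an external access relative to an on-chip one and treats the L2 hit rate as a free parameter; the hit rates below are illustrative, not measured, and the model ignores leakage and write-back traffic.

```python
def relative_memory_energy(l2_hit_rate, external_cost=10.0, on_chip_cost=1.0):
    """Average energy per L1 miss, in units of one on-chip access.

    Every L1 miss pays for an L2 lookup; only L2 misses also pay for
    the external access. external_cost=10.0 follows the slide's ~10x.
    """
    miss_rate = 1.0 - l2_hit_rate
    return on_chip_cost + miss_rate * external_cost


def external_traffic_fraction(l2_hit_rate):
    """Share of L1 misses that still reach the interconnect and DRAM."""
    return 1.0 - l2_hit_rate


NO_L2_ENERGY = 10.0  # without an L2, every L1 miss goes off-chip

for hit_rate in (0.5, 0.8, 0.95):  # hypothetical hit rates
    energy = relative_memory_energy(hit_rate)
    traffic = external_traffic_fraction(hit_rate)
    print(f"hit rate {hit_rate:.2f}: energy {energy:.1f} vs {NO_L2_ENERGY:.1f}, "
          f"external traffic {traffic:.0%}")
```

In this model the L2 pays for itself once its hit rate exceeds 10%, and the same miss fraction is what frees interconnect bandwidth for the other masters.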
15
ARM AMBA Interconnect
[Block diagram: the example SoC with a Cortex-A8 CPU and L2CC, media engine, video engine, graphic engine, DMA engine, LCD controller, interrupt controller, dynamic and static memory controllers, SDRAM controller, LPDDR2 (64 or 128) and NAND flash. The interconnect (“Digital Highway”) is labelled NIC-301.]
16
AMBA Interconnect (NIC-301)
Low latency communication for ARM CPUs
High bandwidth for ARM Graphics and Video
Supporting: AXI, AHB & APB
Data widths from 32- to 128-bit
Supporting both synchronous & GALS implementations
Quality of service
Configurable through AMBA Designer for minimum area and maximum frequency
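
The data width options translate directly into peak bandwidth per data channel: one beat of `width / 8` bytes per clock cycle. The sketch below tabulates this for the 32- to 128-bit widths listed above; the 200 MHz and 400 MHz clocks are borrowed from the NIC-301 labels on a later topology slide and are used here only as example operating points.

```python
def peak_bandwidth_mb_s(data_width_bits, clock_mhz):
    """Peak rate of one data channel moving one beat per clock cycle.

    Result is in decimal MB/s: bytes per beat * million beats per second.
    """
    return data_width_bits / 8 * clock_mhz


for width in (32, 64, 128):
    for clock in (200, 400):
        print(f"{width:3d}-bit @ {clock} MHz: "
              f"{peak_bandwidth_mb_s(width, clock):6.0f} MB/s peak")
```

AXI has separate read and write data channels, so this peak applies to each direction independently; sustained figures are lower once arbitration and memory latency are included.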
17
Optimise your Interconnect Topology
Use properties of the traffic to influence the topology
High connectivity and increasing numbers of IP cores do not scale with a single interconnect
[Figure: two interconnect topologies, each connecting a Cortex-A9, real-time masters and low bandwidth peripherals to RAM, SMC and DMC. Clock annotations: Freq F and F x 2.5.]
18
Topology Optimisation with ARM Interconnect
[Block diagram: Cortex CPU with Neon and L2CC, video engine, graphic engine, DMA engine, LCD controller, interrupt controller, dynamic and static memory controllers, SDRAM controller, LPDDR2 (64 or 128) and NAND flash. Interconnects shown: NIC-301 at 200 MHz, NIC-301 at 400 MHz and a low latency interconnect.]
19
ARM Memory Controllers
[Block diagram: Cortex CPU with Neon and L2CC, video engine, graphic engine, DMA engine, LCD controller, interrupt controller, SDRAM controller, LPDDR2 (64 or 128), NAND flash and a low latency interconnect. The dynamic memory controller is labelled DMC-34x and the static memory controller SMC-35x.]
20
ARM Memory Controllers
Synthesizable, configurable soft cores
Wide range of memory types, silicon processes and target applications
AXI Dynamic Memory Controllers for SDR, DDR, LPDDR, DDR2 and LPDDR2 (DMC-34x)
- Over 20 licensees to date
AXI Static Memory Controllers for NOR Flash, NAND Flash and SRAM (SMC-35x)
- Over 40 licensees to date
AHB Memory Controllers for Dynamic and Static Memories (PL24x)
- Over 60 licensees to date
21
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
22
What is AMBA Designer?
Flow: Topology, Configure, Cross-configure, Stitch & Check
Step shown: Cross-configure
23
What is AMBA Designer?
Flow: Topology, Configure, Cross-configure, Stitch & Check
Step shown: Stitch & Check (export as individual signals)
Interface checking on:
• Signal widths
• Signal direction
• Interface properties
• Valid response types
• Interleave depth
• …
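
To illustrate the kind of static check that stitching performs, the sketch below compares two interface descriptions signal by signal for width and direction. It is a hypothetical data model written for this example and does not reflect AMBA Designer's actual file formats or API; only the AXI signal names are real.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    name: str
    width: int      # number of bits
    direction: str  # "out" or "in", seen from the owning interface


def check_connection(master, slave):
    """Return a list of human-readable mismatches between two interfaces."""
    errors = []
    slave_by_name = {s.name: s for s in slave}
    for m in master:
        s = slave_by_name.get(m.name)
        if s is None:
            errors.append(f"{m.name}: missing on slave interface")
            continue
        if m.width != s.width:
            errors.append(f"{m.name}: width mismatch ({m.width} vs {s.width})")
        if m.direction == s.direction:
            errors.append(f"{m.name}: both ends are '{m.direction}'")
    return errors


master_if = [Signal("AWADDR", 32, "out"), Signal("WDATA", 64, "out"),
             Signal("BRESP", 2, "in")]
slave_if = [Signal("AWADDR", 32, "in"), Signal("WDATA", 128, "in"),
            Signal("BRESP", 2, "in")]

for problem in check_connection(master_if, slave_if):
    print(problem)
```

Checks on interface properties, response types and interleave depth would follow the same pattern: compare a declared attribute on both ends before any simulation is run.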
24
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
25
AVIP Features for RTL Simulation
[Figure: IEEE 1800 SystemVerilog testbench. Components shown: UUT (block or sub-system) with AXI slave and AXI master interfaces, AXI masters, AXI slave, AXI monitor, user IP and user VIP, with stimulus labelled Directed Vectors and Prof. Data.]
Functional Verification
For verification engineers, AVIP is a set of SystemVerilog modules that enable faster and higher quality verification of AXI-based IP.
Performance Exploration
For SoC architects, HW and verification engineers. AXI-based SoC performance can be explored and verified.
26
AVIP Features for RTL Simulation
[Figure: IEEE 1800 SystemVerilog testbench. Components shown: UUT (block or sub-system) with AXI slave and AXI master interfaces, AXI masters, AXI slave, AXI monitor, user IP and user VIP.]
Protocol Checkers
OVL and SVA assertion libraries provided for AXI protocol checking.
AXI Protocol Coverage
Channel level, transaction level and sequence level pre-defined coverage points for AXI protocol coverage.
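
As an illustration of what a channel-level protocol check looks for, the sketch below applies one rule from the AXI specification to a recorded trace: once VALID is asserted it must stay asserted until the handshake completes with READY. The real checkers are OVL and SVA assertions evaluated in simulation; this Python version and its trace format are assumptions made for the example.

```python
def valid_stability_violations(trace):
    """Find cycles where VALID was dropped before the handshake completed.

    trace: list of (valid, ready) samples, one per rising clock edge.
    Returns the cycle indices at which the rule is broken.
    """
    violations = []
    pending = False  # VALID was high last cycle without READY
    for cycle, (valid, ready) in enumerate(trace):
        if pending and not valid:
            violations.append(cycle)
        pending = bool(valid) and not ready
    return violations


# Hypothetical trace: a clean transfer, then VALID withdrawn too early.
trace = [(0, 0), (1, 0), (1, 1), (0, 0), (1, 0), (0, 0), (0, 1)]
print(valid_stability_violations(trace))
```

Transaction-level and sequence-level checks build on the same idea, tracking state across several channels instead of one.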
27
AMBA Designer + AVIP: RTL Design Flow
To optimise interconnect and memory architecture, ARM recommends the following flow:
Configuration
- Set the correct parameters and check the components
Integration
- Assemble the sub-system and statically check the design
Simulation
- Run test scenarios to check usage modes
Analysis
- Check results and loop back
[Figure: cycle of Configuration, Integration, Simulation and Analysis.]
28
Fabric Design Tools: What is AVIP?
[Figure: IEEE 1800 SystemVerilog testbench. Components shown: UUT (block or sub-system) with AXI slave and AXI master interfaces, AXI masters, AXI slave, AXI monitor, user IP and user VIP.]
29
Fabric Design Tools: What is AVIP?
It enables System Exploration at RTL level
TTT = Time to tweak = 20s
TTS = Time to simulate = 5 mins
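
The two figures give a feel for how many design points can be explored. The sketch below adds them into a per-iteration cost and counts iterations in a time budget; the 8-hour budget is an arbitrary example, not a slide value.

```python
TTT_S = 20       # time to tweak, from the slide (seconds)
TTS_S = 5 * 60   # time to simulate, from the slide (seconds)


def iterations(budget_hours, tweak_s=TTT_S, simulate_s=TTS_S):
    """Whole tweak-and-simulate iterations that fit in the budget."""
    return int(budget_hours * 3600 // (tweak_s + simulate_s))


# Hypothetical budget: one 8-hour working day.
print(iterations(8))
```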
30
System Exploration Methods
RTL simulation, AVIP, User VIP
Industry standards VIP
Block-level, Internal bus, RTL simulation
Spreadsheet
Analysis
SoC, static
Acceleration/Emulation
VIP, Logic Tiles, SW
SoC, Real Stimulus, external I/F
Silicon/Applications
Real-time Behavior
31
Iteration time vs Realism
[Chart: cycle time (low to high) plotted against realistic behaviour (low to high) for four exploration methods.]

Method                               Cycle time   Source of realism
Spreadsheet (static analysis)        mins/hrs     Mathematical formula, not dynamic
AVIP (internal bus simulation)       mins/hrs     Statistical or recorded traffic profiles
SoC + s/w (emulation/prototype)      days/wks     Adding S/W, external I/F with realistic scenarios
Silicon + application (CoreSight™)   mths/yrs     Observe actual behaviour

AVIP: the iteration time of a spreadsheet with accuracy approaching RTL simulation
32
ARM Design Flow for Digital Highway
Design Your Intelligent Digital Highway
Configure and connect your RTL
Verification & performance exploration in simulation
Improve your software
AMBA Designer
AVIP
CoreSight
33
Improve the Performance of Your SoC
Analyzing real silicon performance enables you to confidently improve the next design
- If you want to find out how a car really performs, drive it
CoreSight Design Kit & Performance Profiling
- Provide accurate, real-time ‘telemetry’ from your system
- Essential tools for delivering system performance improvements
Your SoC may be optimized, but is the software?
- ARM Profiler analyzes system performance, enabling optimization via Profile Driven Compilation
34
CoreSight Debug & Trace
The Debug & Trace Architecture for the Digital World
Open Standard available on www.arm.com
Optimise software productivity on your multi-core SoC
SW Debug
SW Performance Optimisation
SoC Performance Optimisation
Visibility and trace of the whole SoC
ARM trace and performance sources (ETM, PTM, Interconnect)
Leverage CoreSight architecture for YOUR IP
35
ARM Digital Highway
ARM Digital Highway technology delivers to YOU
- Key Soft IP and Physical IP elements
- The de facto communication standard
- Tools to analyze and optimize your system design before committing to silicon
- Solution to debug and optimise once your silicon has been manufactured
Faster time to revenue through reducing design effort and ensuring quality of results