Altera SoC
Your User-Customizable System on Chip
Providing the Best of Both Worlds
SoC FPGA
Dual-core ARM® Cortex®-A9 MPCore processor
Hard memory controller
Peripherals
ARM Processor System
FPGA
ARM + Altera = SoC
28nm SoC System Architecture
Processor
Dual-core ARM® Cortex-A9 MPCore™ processor
4,000 MIPS (up to 800 MHz per core)
NEON coprocessor with double-precision FPU
32-KB/32-KB L1 caches per core
512-KB shared L2 cache
Multiport SDRAM controller
Up to 533-MHz DDR3 and LPDDR2
Up to 400-MHz DDR2
Up to 200-MHz Mobile DDR
Integrated ECC support
High-bandwidth on-chip interfaces
> 125-Gbps HPS-to-FPGA interface
> 125-Gbps FPGA-to-SDRAM interface
Cost- and power-optimized FPGA fabric
Lowest power transceivers
Up to 1,600 GMACS, 300 GFLOPS
Up to 25 Mb of on-chip RAM
More hard intellectual property (IP): PCIe® and memory controllers
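As a rough illustration of what the SDRAM controller clock rates listed above imply, the following sketch converts each clock rate into a transfer rate and a theoretical peak bandwidth. It assumes the quoted figures are interface clock frequencies, that every listed memory type transfers data on both clock edges, and that the data bus is 32 bits wide. The bus width is an assumption made for illustration and is not stated on the slide, and the function name is hypothetical.

```python
def peak_bandwidth_mb_s(clock_mhz: float, bus_width_bits: int = 32) -> float:
    """Theoretical peak bandwidth in MB/s for a double-data-rate interface.

    bus_width_bits defaults to 32, which is an assumption and not a slide figure.
    """
    transfers_mt_s = clock_mhz * 2          # two transfers per clock cycle
    return transfers_mt_s * bus_width_bits / 8


# Clock rates taken from the slide (MHz).
SDRAM_CLOCKS_MHZ = {
    "DDR3 / LPDDR2": 533,
    "DDR2": 400,
    "Mobile DDR": 200,
}

for name, mhz in SDRAM_CLOCKS_MHZ.items():
    print(f"{name}: {mhz * 2} MT/s, "
          f"{peak_bandwidth_mb_s(mhz):.0f} MB/s peak (32-bit bus assumed)")
```

These are upper bounds only; sustained bandwidth depends on access patterns, refresh, and arbitration between the controller ports.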
[Block diagram: 28 nm SoC system architecture]
Hard Processor System (HPS): 2 x ARM Cortex-A9, each with NEON / FPU and L1 cache; shared L2 cache; 64-KB RAM; DMA (8 channels); JTAG debug / trace (1); timers (x11); USB OTG (x2) (1); Ethernet (x2) (1); SD / SDIO / MMC (1); NAND flash (1) (2); QSPI flash control; I2C (x2); CAN (x2); SPI (x2); UART (x2); GPIO; shared multiport DDR SDRAM controller (2)
HPS-FPGA interfaces: HPS to FPGA; FPGA to HPS; FPGA configuration
FPGA: hard PCIe; hard multiport DDR SDRAM controller (2); 3-, 5-, 6-, and 10-Gbps transceivers
Notes:
(1) Integrated direct memory access (DMA)
(2) Integrated ECC
• 28LP process
• 8-input ALMs
• Variable-precision DSP
• M10K memory and 640-bit MLABs
• fPLLs
HPS I/Os
FPGA General Purpose I/Os
Embedded SW Availability
Vendor               OS/RTOS                 Development Tools
Open Source          Linux 3.10              Linaro compiler
Wind River Systems   VxWorks 6.9.2           Workbench 3.3.2
Micriµm              µC/OS-II, µC/OS-III     GNU compiler
Enea                 OSE 5.5.3               Optima 2.6
Express Logic        ThreadX 5.5.3           GNU compiler
Wind River Systems   Wind River Linux 5      Workbench/GNU
QNX                  QNX/Neutrino 6.5.3      Momentics
Fujisoft             Android                 GNU compiler
Green Hills          INTEGRITY               Multi/Green Hills
Microsoft            Windows Embedded 7      Microsoft/Studio
System Development Flow
Standard FPGA Flow (hardware development) and Standard Software Flow (software development)
Design
• Hardware: Quartus II design software; Qsys system integration tool; standard RTL flow; Altera and partner IP
• Software: ARM Development Studio 5; GNU toolchain; OS/BSP: Linux, VxWorks; Etc…
Simulate
• Hardware: ModelSim, VCS, NCSim, etc.; AMBA-AXI and Avalon bus functional models (BFMs)
• Software: Virtual Target
Debug
• Hardware: SignalTap™ II logic analyzer; System Console
• Software: GNU, Lauterbach, DS5 and ARM ecosystem
Release
• Hardware: Quartus II Programmer; In-system Update
• Software: Flash Programmer
FPGA in the Loop
ARM® DS-5™
Altera® Edition Toolkit
One Device, Two Debugging Tools?
ARM® DS-5™ Toolkit: dedicated JTAG connection (DSTREAM™); visualize & control CPU subsystem
Altera Quartus® II Software: dedicated JTAG connection; visualize & control FPGA
ARM® Development Studio 5 (DS-5™) Altera® Edition Toolkit
Industry’s First FPGA-Adaptive Software Toolkit
Removes debugging barrier between CPUs and FPGA
Unique OEM agreement between Altera and ARM
Result of innovation in silicon, software, and business model
Altera USB-Blaster™ Connection
Unprecedented Combination of Productivity-Boosting Features
Single USB-Blaster target connection for software and hardware debug
Automatic creation of register views of FPGA peripherals
Non-intrusive trace of CPU software instructions correlated with application events and FPGA hardware events
Hardware cross-triggering between the CPU and FPGA domains
Simultaneous debug and trace for Cortex-A9 cores and CoreSight™-compliant cores synthesized on FPGA
Streamline support: statistical analysis of software load and bus traffic spanning the CPUs and FPGA
Altera SoC Roadmap
Long-Term SoC Roadmap Commitment
[Roadmap chart; vertical axis: Features & Performance]
1st Generation (28 nm TSMC): 1.05 GHz dual ARM Cortex-A9 MPCore™
2nd Generation (20 nm TSMC): 1.5 GHz dual ARM Cortex-A9 MPCore™
3rd Generation (14 nm Intel Tri-Gate): quad ARM Cortex-A53 ARMv8 processor
More Coming
Arria 10 SoCs: Hard Processor Subsystem
Faster: 1.5 GHz per core, 7,500 MIPS total
Software compatible: extensive reuse of software, OS/BSP, and tools from 28 nm SoCs
More secure: AES encryption; SHA and ECDSA authentication; root of trust support (certification authority); anti-tamper
Improved architecture: device I/O can be shared between FPGA logic and HPS; third EMAC; 256-KB scratch RAM; 8- and 16-bit NAND flash devices supported; eMMC and SD card support
Summary: 50% more processor performance, full software compatibility, superior architecture
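The MIPS totals quoted for the two Cortex-A9 generations (4,000 MIPS at up to 800 MHz per core, and 7,500 MIPS at 1.5 GHz per core) can be reproduced with simple arithmetic. The sketch below assumes 2.5 DMIPS/MHz per Cortex-A9 core, which is ARM's commonly published figure and does not appear on the slides; the function name is hypothetical. It also computes the clock ratio against the 1.05 GHz first-generation part from the roadmap slide, which is one plausible reading of the "50% more processor performance" claim rather than a confirmed derivation.

```python
# Assumption: 2.5 DMIPS/MHz per Cortex-A9 core (ARM's published figure,
# not stated in the slides).
DMIPS_PER_MHZ = 2.5


def aggregate_mips(cores: int, clock_mhz: float,
                   dmips_per_mhz: float = DMIPS_PER_MHZ) -> float:
    """Total MIPS for identical cores running at the same clock rate."""
    return cores * clock_mhz * dmips_per_mhz


gen1_mips = aggregate_mips(cores=2, clock_mhz=800)      # 28 nm SoC slide
arria10_mips = aggregate_mips(cores=2, clock_mhz=1500)  # Arria 10 SoC slide

# Clock ratio against the 1.05 GHz part shown on the roadmap slide.
clock_ratio = 1500 / 1050

print(f"28 nm SoC: {gen1_mips:.0f} MIPS")
print(f"Arria 10 SoC: {arria10_mips:.0f} MIPS")
print(f"Clock ratio 1.5 GHz / 1.05 GHz: {clock_ratio:.2f}x")
```

Under that assumption both slide figures come out exactly, which suggests the MIPS numbers are clock-scaled estimates rather than measured benchmark results.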
ARM Cortex®-A53 on Intel 14 nm Tri-Gate Process
High Performance + Power Efficiency
>6x throughput improvement
Highest power efficiency of any 64-bit processor
Wide portfolio reusability
Software compatibility with previous generation (32-bit mode)
Cortex-A53 target markets overlap with Stratix 10 SoCs (communications infrastructure, enterprise, datacenter)
Summary
Altera Cyclone V / Arria V SoC Silicon in Mass Production
Comprehensive Ecosystem
ARM DS-5 Altera Edition: industry’s first FPGA-adaptive software toolkit
Hardware cross-triggering between the CPU and FPGA domains
Altera SoC roadmap commitment: 28 nm SoC, 20 nm SoC, 14 nm SoC, and more…
Thank You
Qsys User Interface
[Screenshot callouts: validation display; interfaces exported for hierarchy; toolbar; tabs; IP library; system interconnect]
A10 SoC Logic and Processor Functions By Application
Wireless Infrastructure
• Target applications: Remote Radio Head; Mobile Backhaul; Active Antenna; Basestation (BTS); 4G/LTE Macro eNB; 4G/LTE Micro eNB
• Logic functions: RF Processing; Digital Pre-Distortion (DPD); Baseband Interface
• Processor functions: OAM & Link; Digital Pre-Distortion (DPD); L2 Switch; IO, Protocol Control; Host Offload
Access, Metro & Core
• Target applications: 40G GPON, EPON, FTTH, Switch; 100G / 200G NGPON; 100G Traffic Management
• Logic functions: Aggregation; Bridging; Switching; Traffic Management; IO
• Processor functions: OAM & Link; L2 Switch; IO, Protocol Control; Chassis Mgmt; Host Offload
Transmission
• Target applications: N x 100G OTU 4; 2 x OTU 4; 4 x OTU 4
• Logic functions: FEC; Aggregation; Muxponding; IO
• Processor functions: OAM & Link; IO Control; Chassis Mgmt
Cloud Servers and Storage
• Target applications: Flash Cache; Cloud; Server; Acceleration
• Logic functions: Flash Cache Processing; Acceleration; SATA/SAS; PCIe Gen 3
• Processor functions: Flash Cache Control; Host Offload; Co-processing & Acceleration Control
Broadcast
• Target applications: Pro A/V Equipment; Switcher; Server; Transport; Head End; VoD Mux
• Logic functions: Video Format Conversion; Muxing; Switching; Bridging
• Processor functions: Audio Processing; Video Compression; Link Management
Stratix 10 SoC: Altera’s Most Advanced Product Ever
Delivering exceptional adaptability, performance, and power efficiency
Quad 64-bit processor
+ Advanced tools
+ 1 GHz FPGA logic
+ >10 TFLOPS DSP
+ Altera® SDK for OpenCL™
Stratix 10 SoC In Datacenter Compute Acceleration
Design Challenge
Surging demand for compute-intensive applications (e.g., search acceleration)
Power budgets critical
Workload balancing between host processor and accelerators
Stratix 10 SoC Heterogeneous Computing
High-performance search acceleration engines implemented in high-performance logic and hard floating-point DSP blocks
Highest performance per watt vs. competing solutions
Load balancing and flow control with integrated A53 processor
[Diagram: workload distribution]
Host processor (control plane)
A53 layer: embedded OpenCL host; flow control & load balancing; FPGA link, config, power management; search rules; data mining; deep packet inspection (page level)
DSP layer: floating point computation; matrix manipulations
Logic layer: deep packet inspection (bit level); search acceleration engines
Stratix 10 SoC In Networking and Transmission
Design Challenge
Subscriber bandwidths increasing
Cooling & Opex challenges due to system power & cost
Network complexity
Stratix 10 SoC Smart Line Card Processing
Increases subscriber bandwidth per line card with high-performance transceivers
Reduces system power & cost by integrating local line card processor with FPGA
Simplifies networking by offloading local line card processing functions to on-chip A53 processor
[Diagram: workload distribution]
Host processor (control plane)
A53 layer: line card chassis management; Operations & Maintenance (OAM); link management (EMAC, PCIe); FPGA config and remote update; control plane processing; packet processing; Quality of Service management
DSP layer: limited usage of DSP blocks
Logic layer: bridging; transponding; Forward Error Correction (FEC); MAC control; PHY
Stratix 10 SoC In Radar Applications
Design Challenge
Performance- and power-efficient SoC design
TFLOPS-class floating-point performance
Security & encryption
Stratix 10 SoC Heterogeneous Computing
Highest performance per watt vs. competing solutions
High-performance waveform and signal processing with up to 10 TFLOPS hard floating-point DSP blocks
Secure boot, DSP and flow control with integrated A53 processor
Design productivity with OpenCL
[Diagram: workload distribution]
A53 layer: embedded OpenCL host; flow control & load balancing; secure boot; FPGA config & power management; secure communication; guidance and control; waveform processing
DSP layer: DSP transforms, matrix operations; floating point computations
Logic layer: deep packet inspection; waveform processing; crypto engines