
© Copyright 2015 Xilinx.

Extending the Power of FPGAs

The Journey has Begun

Salil Raje, Xilinx Corporate Vice President, Software and IP Products Development


Agenda

• The Evolution of FPGAs and FPGA Programming
• IP-Centric Design with High Level Languages
• Software Defined Systems


The Evolution of FPGAs and FPGA Programming


IP-Centric Design with High Level Languages



Step 1: Leverage Hard and Soft IP + Embedded Processors

Example of Hard IP: Zynq MPSoC

Examples of Complex Soft IP:
• OTN Subsystem
• Video Subsystem
• HMC Controller
• Digital Pre-Distortion
• SmartConnect

[Figure: Video Subsystem block diagram. Blocks: AXI4-S router (10x10), VDMA, Deinterlacer, V Scaler, H Scaler, CSC, 422-444, 420-422, Letterboxing. The blocks are linked by AXI4-S streams, with AXI-MM and AXI-Lite interconnects.]


Step 2: Develop New IP blocks in C/C++

• Create IP from C/C++/SystemC algorithm specification
• Abstract algorithm verification 10,000x faster than RTL simulation
• Traditional FPGA design experience not required

Flow: Algorithmic Specification → Micro-architecture Exploration → RTL Implementation → FPGA Integration


Step 3: Use Automated IP Assembly

IP Assembly Example: Zynq Processor Subsystem + Video Subsystem + 6 IP Blocks = 4700 lines of VHDL (top-level connectivity only)

[Figure: Video Processing IP Subsystem]


High Level Design Case Study: GainSpeed

• Venture-backed start-up
• Products for cable operators to:
  • Meet skyrocketing capacity requirements of streaming video
  • Cost-effectively migrate networks to a software-driven, all-IP architecture
• Need to be 10x better and 10x cheaper than much larger incumbents
• Have a much smaller team and need to work smarter



Previous Approach to Design

• Used Virtex-6 240Ts, targeting 200+ MHz
• Ran P&R on 100 servers
• Spent 20% of time designing and 80% making it work
• Took a team of 10 engineers working for 2 years

[Figure: previous design flow. Inputs: RTL (VHDL), 100K+ lines of RTL; Test Bench (SystemC), testbench same as driver code. Stages: ModelSim, Synthesis, P&R, System Debug (ChipScope). Other labels: Minimal Test Cases; Exhaustive Corner Cases.]


Current Design Methodology: HLS + IPI

• Kintex 480T + off-the-shelf parts
• Used HLS to build 80% of the IP Blocks
  • DSP functions, closed-loop timing recovery, DMA engines, etc.
• Fast functional simulation in C
• Much better coverage achieved earlier
• Team of 2 people working for 6 months

[Figure: current design flow. Inputs: IP Blocks (C code), low 1000s of lines of C code; Test Bench (C code). Stages: C Compiler, HLS, IPI, Synthesis, P&R, System Debug (ChipScope). Other labels: Exhaustive Test Cases; System-level Debug.]
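The staffing figures quoted for the two approaches can be compared as rough person-months. This is only a sketch: it assumes full-time staffing on both projects and comparable scope, neither of which the slides state, and the helper name `person_months` is illustrative.

```python
def person_months(engineers, months):
    """Total effort, assuming every engineer is full-time for the whole period."""
    return engineers * months

# Previous approach: a team of 10 engineers working for 2 years.
previous = person_months(engineers=10, months=24)

# HLS + IPI approach: a team of 2 people working for 6 months.
current = person_months(engineers=2, months=6)

# Ratio of the two efforts under the assumptions above.
effort_reduction = previous / current

print(f"previous: {previous} person-months, current: {current} person-months")
print(f"effort reduction: {effort_reduction:.0f}x")
```

Under these assumptions the quoted figures work out to 240 versus 12 person-months.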


Automated IP Assembly

• Eliminated grunt-work in wiring IP



Overall Project Results

• Elapsed time from project start to running system in lab: 6 months
• Total number of IP blocks integrated: 30+
• Leveraged key IP cores:
  • SRIO, 10G Ethernet MAC, MIG controller, FIR Compiler, Reed-Solomon
• Design running at 368 MHz in Kintex-7
• Enabled co-debug with software developers


The Era of Software Defined Systems


Why FPGAs for Software Defined Systems?

• The Era of Virtualization
  – Reconfigurable computing, storage and networking in the cloud
• The Thirst for Acceleration
  – Heterogeneous computing
  – Compute-intensive algorithms
    • DNA sequencing
    • Search engines
    • Video processing
    • Encryption/Decryption
    • Packet routing

FPGAs and Programmable SoCs:
• Power-efficient
• Reconfigurable
• Massively-Parallel
• Compute Engines


Example of FPGAs as Accelerators: Smith-Waterman DNA Sequencing Application

Compares Query(N) with Reference(M) genome strings

Involves MxN Matrix Computation and Dynamic Programming

Maximal parallelism along diagonals
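The parallelism along diagonals follows from the Smith-Waterman recurrence: each matrix cell depends only on its left, upper, and upper-left neighbours, so every cell on one anti-diagonal can be computed independently. The sketch below walks the matrix in that order. It assumes a linear gap penalty and illustrative scores (match +2, mismatch -1, gap -1) that do not come from the slides, and the function name is hypothetical. It runs sequentially in software; it only shows the dependency order a parallel implementation could exploit.

```python
def smith_waterman_score(query, reference, match=2, mismatch=-1, gap=-1):
    """Return the best local-alignment score, sweeping anti-diagonals."""
    n, m = len(query), len(reference)
    # H carries an extra zero row and column as the boundary condition.
    H = [[0] * (m + 1) for _ in range(n + 1)]
    best = 0
    # Cells with i + j == d form one anti-diagonal. They read only from
    # diagonals d-1 and d-2, so they are independent of each other.
    for d in range(2, n + m + 1):
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            s = match if query[i - 1] == reference[j - 1] else mismatch
            H[i][j] = max(
                0,                    # local alignment never goes negative
                H[i - 1][j - 1] + s,  # align query[i-1] with reference[j-1]
                H[i - 1][j] + gap,    # gap in the reference
                H[i][j - 1] + gap,    # gap in the query
            )
            best = max(best, H[i][j])
    return best


print(smith_waterman_score("TTAC", "GATTACA"))
```

The inner loop is the part a hardware implementation could unroll into parallel processing elements, one per cell on the diagonal.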

[Figure: M x N scoring matrix with axes labeled Reference and Query.]

Metric       Xilinx Virtex-7 690T   Intel® Xeon E5-2697   Ratio, Virtex-7      Intel® Xeon Phi 5110P   Ratio, Virtex-7
             (reference)            (12 core)             vs. Intel 12 core    (60 core)               vs. Intel 60 core
GCUPS        77.00                  19.75                 3.90                 30.00                   2.57
Watts        28.00                  130.00                0.22                 225.00                  0.12
GCUPS/Watt   2.75                   0.15                  18.10                0.13                    20.63
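The ratio columns and the GCUPS/Watt row can be re-derived from the GCUPS and Watts rows alone (GCUPS is giga cell updates per second, the usual throughput metric for Smith-Waterman). The following sketch does that arithmetic; the dictionary keys and function names are illustrative, and only the six measured figures are taken from the table.

```python
# (GCUPS, Watts) pairs copied from the table.
MEASURED = {
    "virtex7_690t": (77.00, 28.00),
    "xeon_e5_2697": (19.75, 130.00),
    "xeon_phi_5110p": (30.00, 225.00),
}


def gcups_per_watt(device):
    """Power efficiency of one device."""
    gcups, watts = MEASURED[device]
    return gcups / watts


def ratios(baseline, other):
    """Return (throughput, power, efficiency) ratios of baseline over other."""
    b_gcups, b_watts = MEASURED[baseline]
    o_gcups, o_watts = MEASURED[other]
    return (
        b_gcups / o_gcups,
        b_watts / o_watts,
        gcups_per_watt(baseline) / gcups_per_watt(other),
    )


for cpu in ("xeon_e5_2697", "xeon_phi_5110p"):
    speed, power, eff = ratios("virtex7_690t", cpu)
    print(f"{cpu}: {speed:.2f}x GCUPS, {power:.2f}x Watts, {eff:.2f}x GCUPS/Watt")
```

The computed values agree with the table up to rounding in the last digit.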


SDSoC: Software Defined SoC Development

Applications:
– Machine Vision
– Driver Assistance/ADAS
– Software-Defined Radio (SDR)
– Wireless Radio
– Surveillance
– UAV / Drones

[Figure: SDSoC development flow. Labels: Standard Eclipse IDE; C/C++ Development; System-level Profiling; Mark C/C++ Functions for Acceleration; Full System Optimizing Compiler; Embedded ARM Processor Subsystem (ARM Code, Main( ), GCC); Connectivity; Programmable Logic (Accelerator, Func( ), HLS+SP&R).]


SDAccel: Software Defined Algorithm Acceleration

Sample Applications:
– Machine Learning
– Bioinformatics
– Graph Processing
– Stringology
– Data Analytics
– Modelling
– Science Codes
– Signal Processing
– Video & Image Processing

Software-Defined FPGA Acceleration


Platforms Enable Software Defined FPGA Systems

[Figure: platform stack. Labels: Board Support; Hardware; System Design & Host Software Stack; Partial Reconfig; Performance Analysis; Pre-defined Platform; Algorithms.]


Summary

• HW designers: C-based IP development + high-level IP assembly are the next step beyond RTL
• SW developers: Software-defined algorithm development + platforms will enable you to exploit the power of FPGAs & SoCs
• We’re making major investments in next generation silicon and tools that will revolutionize FPGA design