Adding C Programmability to Data Path Design

Gert Goossens, Sr. Director R&D, Synopsys

May 6, 2015

© Synopsys, Inc. All rights reserved

Smart Products Drive SoC Developments

• Need for reusable SoC platforms

• SoC platforms must become software programmable, without compromising PPA (performance, power, area)

Feature-Rich

Multi-Sensing

Wirelessly Connected

Green

Always-On

Multi-Output

Programmable Processors in SoCs

[Figure: SoC block diagram. Labels recovered from the diagram:]

• Processing: Control Processor (shown twice), Signal Processing, Datapath
• Buses: AMBA 3 AXI & AMBA 2.0 AHB, AMBA APB
• Peripherals: UART, GPIO, I2C, SD/MMC controller
• Memories: Embedded Memories (SRAM, ROM, NVM)
• Interface IP (controller + PHY): SATA, DDR, USB, PCIe, HDMI, Ethernet controller with 10G PHY, UniPro / UFS / CSI-3 / DigRFv4 controller with MIPI M-PHY, CSI-2 / DSI controller with MIPI D-PHY
• Application subsystems: Wireless Modem, Radio Front End, Audio, Audio Codecs, Video / Imaging, Video Front End, ADC, DAC

Processor Solutions Spectrum

Spectrum: Microprocessor, Extensible Processor, Application-Specific µP / DSP, Programmable Datapath, Hardwired Datapath

ASIP = Application-Specific Instruction-set Processor

Maximize performance

• Architectural specialization

• Parallelism: instruction-level, data-level, task-level

Minimize power consumption

• Architectural specialization

• Parallelism: instruction-level, data-level, task-level

• Power-optimised RTL generation

• Power-gating of cores

Programmability

• Support changing requirements, product differentiation, new features… without SoC respin!

• Quick algorithm mapping from C to silicon, with easy debugging

ASIP Architectural Optimization Space

• Architectural space beyond configurable templates

• Can be captured by processor description language

• Architectural exploration enabled by retargetable ASIP design tools

[Figure: ASIP architectural optimization space. Tree diagram; the grouping below is reconstructed from the extracted labels.]

• Parallelism
  – Instruction-level parallelism: orthogonal instruction set (VLIW), encoded instruction set
  – Data-level parallelism: vector processing (SIMD)
  – Task-level parallelism: multi-core, multi-threading

• Specialization
  – App.-specific data types: integer, fractional, floating-point, bits, complex, vector…
  – App.-specific instructions
    – App.-spec. data processing: any exotic operator; single or multi-cycle
    – App.-spec. memory addressing: direct, indirect, post-modification, indexed, stack indirect…
    – App.-spec. control processing: jumps, subroutines, interrupts, HW do-loops, residual control, predication; relative or absolute, address range, delay slots
  – Connectivity & storage matching application’s data-flow: distributed regs, sub-ranges; multiple mem’s, sub-ranges

• Pipeline
  – Pipeline depth
  – Hazards: HW/SW stall, bypass
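
As a rough illustration of what an application-specific data type and operator can look like from the C side, the sketch below emulates a Q15 fractional multiply-accumulate with saturation in portable C. The Q15 format and the function names (`sat_q15`, `mul_q15`, `dot_q15`) are assumptions chosen for illustration; they are not taken from the slides or from any tool. On an ASIP with a native fractional type, each helper could plausibly collapse into a single instruction.

```c
#include <stddef.h>
#include <stdint.h>

/* Saturate a 32-bit intermediate value to the Q15 range [-32768, 32767]. */
int16_t sat_q15(int32_t v)
{
    if (v > INT16_MAX) return INT16_MAX;
    if (v < INT16_MIN) return INT16_MIN;
    return (int16_t)v;
}

/* Q15 multiply with rounding and saturation.
 * Assumes an arithmetic right shift for negative values, which is
 * implementation-defined in C but the common behavior. */
int16_t mul_q15(int16_t a, int16_t b)
{
    int32_t p = (int32_t)a * (int32_t)b;    /* Q30 product */
    return sat_q15((p + (1 << 14)) >> 15);  /* round, rescale to Q15 */
}

/* Dot product of two Q15 vectors with a saturating Q15 accumulator
 * (hypothetical kernel, for illustration only). */
int16_t dot_q15(const int16_t *x, const int16_t *y, size_t n)
{
    int16_t acc = 0;
    for (size_t i = 0; i < n; i++)
        acc = sat_q15((int32_t)acc + mul_q15(x[i], y[i]));
    return acc;
}
```

For example, `mul_q15(16384, 16384)` multiplies 0.5 by 0.5 and returns 8192, which is 0.25 in Q15.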

ASIP Designer – Retargetable ASIP Design Tool

Typical users: ASIC/SoC design teams

ASIP Designer – History

ASIP Designer

• Processor description language: nML

• Consolidated product, combining strengths of IP Designer and Processor Designer

• Stepwise deployment in 2015-2016 time frame

• Legacy products remain available

IP Designer

• Processor description language: nML

• Roots in architectural exploration and retargetable compilation

Processor Designer

• Processor description language: LISA

• Roots in modeling and fast simulation

Adding C Programmability to SoC Design

• Graph-based compilation technology combines retargetability with high code efficiency

• Instruction-set graph (ISG)

– Graph-based optimization algorithms operate on (any) ISG

– Closer to HW than other compilers’ machine models

– HW resources, data types, connectivity, instruction encoding, instruction-level parallelism, instruction pipeline

– Supports “irregular” architectures

• Enables rapid architectural exploration with compiler-in-the-loop

• Enables algorithm development in C, even for highly specialized ASIPs

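[Figure: retargetable compilation flow. Inputs: the application (C), read by the C front-end into a CDFG, and the processor model (nML), read by the nML front-end into the ISG. The compilation engine (phase coupling) applies source-level transformations, code selection, register allocation, scheduling and code emission. Output: machine code (ELF / DWARF). Example labels in the figure: CDFG operators * and +; ISG elements mul, add, sub, X[2], Y[2], A[2].]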

Applicable to “Any” Application Domain

Domains: Audio, Video & imaging, Wireless, Wireline, Medical, Network processing, Automotive, High-perf. computing, Crypto & identification, Industrial, Graphics

• Publicly announced IP Designer and Processor Designer customers

Examples: Wireless Communication

Spectrum: Microprocessor, Domain-Specific Processor, Application-Specific Processor, Programmable Datapath, Hardwired Datapath

• “BoT” [1]: configurable inner-modem processor (LTE(A) + 11ac + 11ad + WPAN + GPS + DVBT...)

• “FlexFEC” [2]: 3-standard FEC engine (LDPC + Turbo + Viterbi)

• “BLOX” [1]: single-function sliced accelerators (FFT | LDPC | Matrix inv.)

[1] L. Van der Perre, “Radios in need of (Multi-)ASIP - wanted: flexibility and energy efficiency”, Synopsys User Group, Munich, May 2013.

[2] F. Naessens, “Unified C-programmable ASIP architecture for multi-standard Viterbi, Turbo and LDPC decoding”, IP-SoC Conference, Dec. 2011.

BoT: Configurable Inner-Modem Processor [1]

• Mixed scalar/vector processor

– 10-slot VLIW: 3 scalar, 2 vector L/S, 3 vector compute, 2 pack/unpack

– Vector compute units with increased specialization

  – VU1: alu, mul, shift

  – VU2: alu, cabs, interleave, shift

  – VU3: alu, recip, sqrt, tan, cexp, slope, interleave, softdemap

– Vector packing/unpacking

– Low power: clock gating exploits low duty cycle

– C programmable

[Figure labels: Vector RF; BoT profile, average: 45 mW (40 nm @ 400 MHz)]
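
To give a feel for the kind of operation the vector units expose to C code, here is a minimal sketch of an element-wise interleave, one of the operations the slide lists for VU2 and VU3. The function name `vec_interleave`, the 16-bit element type, and the run-time length `n` are assumptions made for illustration; the slide does not give the actual vector width or intrinsic names.

```c
#include <stddef.h>
#include <stdint.h>

/* Interleave two n-element vectors: out = a[0], b[0], a[1], b[1], ...
 * The caller must provide room for 2 * n elements in out.
 * Hypothetical helper; on a vector machine this loop would typically
 * map to one or a few interleave instructions per vector. */
void vec_interleave(int16_t *out, const int16_t *a, const int16_t *b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[2 * i]     = a[i];
        out[2 * i + 1] = b[i];
    }
}
```

A typical use would be merging separate real and imaginary arrays into one interleaved complex stream.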

FlexFEC 3-Standard Forward Error-Correction (FEC) Engine [2]

• Application-specific mixed scalar/vector processor

– SIMD: n-way x 8-bit

– VLIW: 1 scalar and 5 vector issue slots

– App.-specific primitive functions

  – LDPC decode, Turbo decode, Viterbi decode (e.g. add-compare-select), special addressing modes

– App.-specific complex instructions

  – “abs() + abs()”, element-wise vector shift, cross correlation with programmable spreading code

– Transparent background memory access through lookup address generator

– C programmable
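
The add-compare-select step mentioned for Viterbi decoding can be sketched in plain C as follows. This is a generic textbook formulation, not the FlexFEC primitive itself: the type `acs_result`, the function `acs`, the 16-bit metric width, and the saturating behavior are all assumptions for illustration. A SIMD implementation would apply the same step to every lane in parallel.

```c
#include <stdint.h>

/* Result of one add-compare-select step (hypothetical layout). */
typedef struct {
    uint16_t metric;    /* surviving path metric */
    uint8_t  decision;  /* 0: predecessor 0 survived, 1: predecessor 1 survived */
} acs_result;

/* Add each branch metric to its predecessor's path metric, compare the
 * two candidates, and select the smaller one. Ties keep predecessor 0.
 * The surviving metric saturates at UINT16_MAX. */
acs_result acs(uint16_t pm0, uint16_t bm0, uint16_t pm1, uint16_t bm1)
{
    uint32_t c0 = (uint32_t)pm0 + bm0;   /* add */
    uint32_t c1 = (uint32_t)pm1 + bm1;
    acs_result r;

    r.decision = (uint8_t)(c1 < c0);     /* compare */
    uint32_t best = r.decision ? c1 : c0; /* select */
    r.metric = (uint16_t)(best > UINT16_MAX ? UINT16_MAX : best);
    return r;
}
```

For instance, `acs(10, 5, 3, 4)` compares candidates 15 and 7, so predecessor 1 survives with metric 7.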

FlexFEC 3-Standard Forward Error-Correction (FEC) Engine [2]

• Specialization: e.g. LDPC decode function “vq()”

– Standard 32-bit RISC: 3,040 cycles

– After (mild) specialization, 32-bit RISC with predicated add/sub: 2,707 cycles

– After adding data-level parallelism, 96-lane, 16-bit SIMD with vector-predicated add/sub: 32 cycles

– After further specialization, 96-lane, 16-bit SIMD with LDPC decode instruction (synthesized from C code): 1 cycle

Note: cycle counts obtained for randomized input data
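
The predicated add/sub steps in this progression can be illustrated with a small C sketch. The code below is not the “vq()” function from the slide, whose source is not shown; it only demonstrates the idea of an add or subtract selected by a predicate, first for one value and then across 96 lanes. The names `pred_addsub` and `vpred_addsub` are hypothetical, and the sketch assumes results stay within the 16-bit range (no saturation).

```c
#include <stddef.h>
#include <stdint.h>

#define LANES 96  /* lane count from the slide; the rest is assumed */

/* Predicated add/sub on one value: a + b when pred is nonzero,
 * a - b otherwise. Written without an if/else branch so that a
 * compiler can map it to a select or a predicated instruction. */
int16_t pred_addsub(int16_t a, int16_t b, int pred)
{
    int32_t signed_b = pred ? (int32_t)b : -(int32_t)b;
    return (int16_t)((int32_t)a + signed_b);
}

/* Vector form: one predicate per lane. On a 96-lane SIMD machine the
 * whole loop could correspond to a single vector-predicated operation. */
void vpred_addsub(int16_t out[LANES], const int16_t a[LANES],
                  const int16_t b[LANES], const uint8_t pred[LANES])
{
    for (size_t i = 0; i < LANES; i++)
        out[i] = pred_addsub(a[i], b[i], pred[i]);
}
```

Using the slide's own numbers, going from the 3,040-cycle baseline to 32 cycles is a 95x reduction, close to what 96 lanes would suggest.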

FlexFEC 3-Standard Forward Error-Correction (FEC) Engine [2]

• Instruction-level parallelism: 1 scalar + 5 vector (SIMD) slots

– C compiler efficiently exploits VLIW issue slots

BLOX: Single-Function Sliced Accelerators [1]

• Highly regular vector processors

– In each SIMD lane, stack elementary operators, limited HW multiplexing

– Low power, thanks to

– Short active wires and modularity

– Simple operators

– Very wide register-files (asymmetric access)

– Examples

  – FFT for 11ac

  – LDPC for 11ac

  – FFT for 11ad

  – LDPC for 11ad

  – Matrix ops

– C programmable (but requires C code refactoring)

Conclusions

• ASIP design tools introduce C programmability in SoC design

– Better design reuse

– Functional enhancements even after tapeout

– Productivity increase by raising abstraction from RTL to C

• “Compiler-in-the-Loop” concept

– Rapid architectural exploration

– Highly differentiating architectures

• Full control over PPA (performance, power, area)

• Software development kit for end users is available automatically

• Same tool supports wide range of IP needs

• Royalty-free solutions