15
NVIDIA RISC-V Story 4 th RISC-V Workshop 7/2016

NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

  • Upload
    others

  • View
    13

  • Download
    0

Embed Size (px)

Citation preview

Page 1: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

NVIDIA RISC-V Story

4th RISC-V Workshop 7/2016

Page 2: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Outlines

Introduce NVIDIA falcon CPU

Why a new CPU?

Introduce NV-RISCV

Page 3: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

FALCON

int

CS

B

inte

rru

pts

priv

1

ext

CSB

ext

IRQIRQ to

host

ctxsw from

host

methods from

host

vectored

IRQ

dma to

FBIF

ptimer from

host

ctsw to

FBIF

priv

2

ext mem

bus

PIC

TMR

MTHD

CTXSW

CFG

BUSDMA

ICD

MISC

PIPE

DMEM

IMEM

EMEM IF

AES

SHA

scp-cmd

i/f

csb

csbRSA/ECC

Secure

bus

pf1

pf2

ifetch

dec

ex

wb

NVIDIA Falcon overview

Falcon = FAst Logic CONtroller

Introduced over 10 years ago, and

used in >15 different hardware

engines today

Design for flexibility

Design for long memory latency

Design for low area

Design for security

Page 4: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Why Falcon Next Gen?

New use cases requiring more horsepower & feature

Wide addressing range

More performance

Not limit to code size

Rich OS support

Falcon has limits

Small addressing range

Poor performance (0.67DMIPS/Mhz, 1.4Coremark/Mhz)

No D$

No rich OS support

Page 5: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Falcon Next Gen – Options

Buy

ARM (A,R family)

Synopsys (ARC family)

MIPS

Cadence

Build

Improve falcon

Move to a new ISA (And this is when RISC-V came into the picture..)

Page 6: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

CPU comparison

Conclusion –RISC-V is the right direction to next generation of ISA

Build our own implementation of RISCV core

Item Requirement ARM A53 ARM A9 ARM R5 SNPS HS RISC-V Rocket Falcon

(improved)

Core perf >2x falcon Yes Yes Yes Yes Yes No

Area (16ff) <0.1mm^2 No No Yes Yes Yes Yes

Security Yes TZ TZ No No Yes Yes

TCM Yes Yes No Yes Yes No Yes

L1 I/D $ Yes Yes Yes Yes Yes Yes Yes

Addressing 64bit Yes No No No Yes No

Extensible ISA Yes No No No Yes Yes Yes

Safety

(ECC/Parity)

Yes Yes Yes Yes Yes Yes Yes

Functional

Simulation

model

Yes Yes No No Yes No Yes

Page 7: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Introducing NV-RISCV

Page 8: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Summary: Falcon -> NV RISC-V

Falcon NV-RISCV

Falcon NV-RISCV

ISA Falcon-ISA RISCV-RV64

Address Width 32/24 64

Data width 32 64

GPR Num. 16 32

Stage 6 5

Micro-arch

In-order issue In-order issue

out-of-order exec out-of-order exec

out-of-order WB (diff regs) in-order WB

WAW Hazard Stall ROB

Cache No D cache I/D configurable

TCM I/D configurable I/D configurable

Prediction static BTB/BHT/RAS

Load-store In-order Load out-of-order

Memory protection No MPU

Address mapping TCM tagging Base and bound

Page 9: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Falcon Next Gen with RISCV

RISCV plugged-in as 2nd

core

Back compatibility on

interface, easy to

integrate

Isolation between security

and non-security

applications

Page 10: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

NV-RISCV Core perf/area

Area data under 16ff

Core (TSMC) Falcon (Today) RISCV rocket chip NV-RISCV BOOM

Dhrystone (no inline) 0.67 1.72 1.9~2.0 N/A

EEMBC Core Mark 1.4 2.3 ~2.5 3.91

Core Area(mm2) 0.030.055

(no FPU, no L2)0.05~0.06 N/A

Frequency(GHz) 1.5 ~1 >1.5 N/A

Page 11: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Cache design to tolerate large latency

Configurable cache size/line size/associativity/write policy

Cache Optimizations

Store Buffer

Write merging

Line-fill Buffer

Victim Buffer

Stream buffer

SW pre-fetch (future)

L2 (future)

Banked cache (future)

I/DTCM

Page 12: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

D$ perf - btree (8k nodes)

371.35%

220.53%

169.02%

100.00%

0.00%

50.00%

100.00%

150.00%

200.00%

250.00%

300.00%

350.00%

400.00%

0

10

20

30

40

50

60

RISCV Falcon RISV optimized Falcon ideal

Millions

Total cycles (8k nodes, 64KB D Cache), less is better

Total cycles Total cycles (normalized)

Page 13: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

RISCV – Area of Interests to NV

Tool chain

Tool for automotive – To meet ISO26262/ASIL-D or SIL3 requirements

Tool for debug – To debug ucode on silicon/FPGA/emulator

Tool for performance tuning – For ucode profiling and tuning

Tool for flexibility – That users can easily customize ISA

Other compiler features – ILP32/LP64

Security

Crypto instruction & extensions

More instructions

Cache instruction – pre-fecth/invalidate/flush…

Page 14: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Conclusion

Falcon is NVIDIA proprietary control processor

New use-cases require more feature and performance from falcon

It is hard to improve the current CPU/ISA to meet all new requirements

We evaluated different options in the market, result showed that RISC-V is overall

best choice as next generation of falcon

We will build a new core from RISC-V ISA

Page 15: NVIDIA RISC-V StoryNVIDIA RISC-V Story 4th RISC-V Workshop 7/2016 Outlines Introduce NVIDIA falcon CPU Why a new CPU? Introduce NV-RISCV FALCON B u s priv 1 ext CSB ext IRQ to IRQ

Thank You!