Accelerating SDN/NFV with transparent offloading architecture

Copyright©2014 NTT corp. All Rights Reserved.

Koji Yamazaki*, Takeshi Osaka†, Sadayuki Yasuda*, Shoko Ohteru*, and Akihiko Miyazaki*

*NTT Microsystem Integration Laboratories, Japan
†NTT Network Service Systems Laboratories, Japan

Open Networking Summit, Mar. 3-5, 2014, Santa Clara, CA, USA


Outline

• Challenge

• Our approach

• Experimental results

• Conclusion


Challenge

How can we enhance the performance of virtual network functions without increasing CAPEX or OPEX?

• Background

Lots of COTS accelerators and SDKs (FPGAs, NPUs, and GPUs)

Framework for saving energy in future networks (ITU-T Y.3021)

• Two objectives

To reduce programming effort

To enable high-performance, energy-efficient operations


Our approach

Goal: To accelerate required functions easily and efficiently

*ASIP: Application Specific Instruction-set Processor

1. Transparent offloading architecture

New programmable accelerator (ASIP*)

Harmonization among x86 environments

2. Design of application-specific instruction set

Optimal instructions for coarse DPI

Implementation of simple data structure (Bloom filter)
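The simple data structure named above is a Bloom filter keyed by three hash functions (the sax, sdbm, and bernstein hashes that appear later in the deck). A minimal software sketch of the idea — the filter size and string-keyed interface are assumptions for illustration, not the authors' implementation:

```c
#include <stdint.h>

#define BLOOM_BITS 1024              /* assumed filter size (k = 3 hashes) */

static uint8_t bloom[BLOOM_BITS / 8];

/* Three classic string hashes, matching the names used on the ASIP. */
static uint32_t sax_hash(const char *k)
{
    uint32_t h = 0;
    while (*k) h ^= (h << 5) + (h >> 2) + (uint8_t)*k++;
    return h;
}

static uint32_t sdbm_hash(const char *k)
{
    uint32_t h = 0;
    while (*k) h = (uint8_t)*k++ + (h << 6) + (h << 16) - h;
    return h;
}

static uint32_t bernstein_hash(const char *k)
{
    uint32_t h = 5381;
    while (*k) h = h * 33 + (uint8_t)*k++;
    return h;
}

static void bloom_set(uint32_t h)
{
    uint32_t i = h % BLOOM_BITS;
    bloom[i / 8] |= (uint8_t)(1u << (i % 8));
}

static int bloom_get(uint32_t h)
{
    uint32_t i = h % BLOOM_BITS;
    return (bloom[i / 8] >> (i % 8)) & 1;
}

void bloom_add(const char *key)
{
    bloom_set(sax_hash(key));
    bloom_set(sdbm_hash(key));
    bloom_set(bernstein_hash(key));
}

/* May report a false positive, but never a false negative. */
int bloom_may_contain(const char *key)
{
    return bloom_get(sax_hash(key)) &&
           bloom_get(sdbm_hash(key)) &&
           bloom_get(bernstein_hash(key));
}
```

One membership test costs three hashes and three bit reads, which is the small, fixed amount of work the ASIP instructions later collapse into a handful of cycles.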


Overview of ASIP architecture

[Diagram: five-stage pipeline (Fetch, Decode, Execute, Memory Access, Writeback) with ALUs, MUXes, a general-purpose register file (GPR), state registers, and load/store paths to SRAM, GPIO, and SDRAM]

*ISA: Instruction Set Architecture


• Embedded RISC CPUs (e.g., MIPS, Cadence, Synopsys)

• Tunable architecture

• ISA* extension with tailored-compiler support

• Customize HW resources
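The profiled disassembly later in the deck (entry, retw.n, beqz.n) suggests an Xtensa-style core, where designer-defined instructions surface in C as intrinsics. A minimal sketch of how application code can stay portable between the ASIP build and a plain-C x86 build — the header path, intrinsic name, and step semantics here are assumptions, not the authors' actual extension definitions:

```c
#include <stdint.h>

#ifdef __XTENSA__
/* On the ASIP, the toolchain maps the intrinsic to the custom instruction.
 * (Hypothetical extension header and intrinsic name.) */
#include <xtensa/tie/coarse_dpi.h>
#else
/* Plain-C fallback with the same semantics, for x86 development. */
static inline uint32_t sax_hash_step(uint32_t h, uint8_t c)
{
    /* One step of the shift-add-xor (sax) string hash. */
    return h ^ ((h << 5) + (h >> 2) + c);
}
#endif

/* Hash a key one byte at a time; on the ASIP each step is one instruction. */
uint32_t sax_hash(const uint8_t *key, unsigned len)
{
    uint32_t h = 0;
    while (len--)
        h = sax_hash_step(h, *key++);
    return h;
}
```

The same source then compiles for both targets, which is one way the "harmonization among x86 environments" goal can be met at the source-code level.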


ASIP for packet stream processing

[Diagram: the same ASIP pipeline extended with ingress and egress stream FIFOs to configure a fast U-plane]


Transparent offloading

C-plane configuration

[Diagram: an x86 host (RAM) and the ASIP (core, I-RAM, D-RAM, DMAC) connected over PCIe; the host issues DMA instructions and the DMAC performs the transfers]

Concept: Control ASIP functions from x86 environment
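The host-side control path can be modeled in software. The sketch below is illustrative only: the register layout, field names, and start-bit convention are assumptions, not NTT's actual DMAC interface (in a real driver the struct would be mapped over a PCIe BAR rather than living in ordinary memory):

```c
#include <stdint.h>

/* Software model of a PCIe-attached DMA controller's register file. */
struct dmac_regs {
    uint64_t src;    /* x86 RAM source address          */
    uint64_t dst;    /* ASIP I-RAM/D-RAM destination    */
    uint32_t len;    /* transfer length in bytes        */
    uint32_t ctrl;   /* bit 0 = start                   */
};

#define DMAC_CTRL_START 0x1u

/* Issue one DMA transfer: program the descriptor fields first,
 * then set the start bit last so the device sees a complete request. */
void dmac_issue(volatile struct dmac_regs *d,
                uint64_t src, uint64_t dst, uint32_t len)
{
    d->src  = src;
    d->dst  = dst;
    d->len  = len;
    d->ctrl = DMAC_CTRL_START;
}
```

Writing the start bit last is the usual convention for memory-mapped DMA engines; it prevents the device from latching a half-programmed descriptor.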


Transparent offloading

Search ASIP section

[Diagram: the x86 memory map is scanned for the ASIP section, which the DMAC transfers over PCIe toward the ASIP's I-RAM]


Transparent offloading

Forward functions

[Diagram: the located functions are DMA-transferred from x86 RAM into the ASIP's I-RAM over PCIe]


Invoke coarse DPI function


... // Test data
#define QUEUE_CHECK 50000
...

int main() {
    ... // scan 50000 packets
    loop = 0;
    do {
        bloom_scan();
        loop++;
    } while (loop < QUEUE_CHECK);
    bloom_destroy();
    return EXIT_SUCCESS;
}

main()


Invoke coarse DPI function


void bloom_scan() {
    // Invoke instructions as intrinsics
    if (queue_vacancy_check()) {
        pop_queue();
        sax_hash_match();
        sdbm_hash_match();
        bernstein_hash_match();
        forward_data();
    }
}

bloom_scan()

22 instructions and 14 registers were added for coarse DPI


Disassembly of bloom filter matching


# of cycles   Profiled disassembly
3             entry a1, 32
1             queue_vacancy_check a2
2             beqz.n a2, 60000465 <bloom_scan+0x19>
1             pop_queue
2             sdbm_hash_match
1             bernstein_hash_match
1             sax_hash_match
1             forward_data
1             retw.n
              <bloom_scan+0x19>:
0             retw.n

13 cycles per bloom_scan()


Experimental results

Evaluation item                               w/o acceleration   w/ our instructions
Run-time (mean # of cycles)*
  hash(sax)                                   116                1
  hash(sdbm)                                  115                2
  hash(bernstein)                             98                 1
  bloom_scan                                  678                13   (down 98%)
Hardware size (logic gate count), core+SRAM   75 KGates          79 KGates
Power dissipation, core+SRAM                  < 100 mW           < 100 mW   (extremely low power)
Performance (64-byte packets)
  pps (packets/s)                             1 Mpps             57 Mpps   (50x faster)
  bps (bits/s)                                723 Mbps           38 Gbps

*50000 packets, 64-bit fixed field, 45-nm sim library.
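The table's figures are mutually consistent: 13 cycles per bloom_scan() yields ~57 Mpps if the ASIP clock is around 740 MHz (the clock is an inference; the slides do not state it), and 57 Mpps reaches ~38 Gbps once the 20 bytes of Ethernet preamble and inter-frame gap are counted on the wire alongside each 64-byte frame. A quick arithmetic check:

```c
/* Packets per second implied by the cycle count, under an assumed clock. */
double scan_pps(double clock_hz, double cycles_per_scan)
{
    return clock_hz / cycles_per_scan;
}

/* Wire-level bit rate, including 20 bytes of preamble + inter-frame gap. */
double wire_bps(double pps, double frame_bytes)
{
    return pps * (frame_bytes + 20.0) * 8.0;
}
```

With clock_hz = 741e6 and cycles_per_scan = 13, scan_pps gives ~57 Mpps, and wire_bps at 64-byte frames gives ~38 Gbps — matching the last two table rows.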


Conclusion

Designing an optimal instruction set that harmonizes with x86 environments will reduce the costs required for acceleration.


More challenging issues

Can an open ISA transform the ecosystem of accelerators?

My assumption: Common, open ISA-based APIs will reduce programming costs further.

Accelerators have "the Force":
• Dark side (proprietary architecture): Intel's AVX ISA extension; other black-box SDKs of COTS (ASSPs, NPUs)
• Light side (white-box architecture): Berkeley RISC-V open ISA; emerging open-source SDKs (e.g., Centec's Lantern)

Thank you! Questions?
