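ゼロから作る高速パケット転送用OS (Building a High-Speed Packet-Forwarding OS from Scratch). 浅井大史 (Hirochika Asai) <[email protected]>, Project Assistant Professor, Graduate School of Information Science and Technology, The University of Tokyo. November 18, 2014, Internet Week 2014, S6.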


Page 1: ゼロから作るパケット転送用OS (Internet Week 2014)

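Building a High-Speed Packet-Forwarding OS from Scratch

Graduate School of Information Science and Technology, The University of Tokyo

Project Assistant Professor 浅井大史 (Hirochika Asai) <[email protected]>

November 18, 2014

Internet Week 2014

S6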

Page 2: ゼロから作るパケット転送用OS (Internet Week 2014)

• SDN: Software Defined Network
  – Forwarding Plane, Control Plane […]
    · Control Plane […]
• NFV: Network Function Virtualization
  – […]
    · CPU […]

Page 3: ゼロから作るパケット転送用OS (Internet Week 2014)

• CPU […] = […] OS […]
  – Inexpensive
  – Flexible
  – Extensible

X as a Service

Network Function Virtualization

Service Function Chaining

Page 4: ゼロから作るパケット転送用OS (Internet Week 2014)

“Networked” Operating System manages “computing” resources

“Networking” Operating System manages “computing” and “network” resources

Page 5: ゼロから作るパケット転送用OS (Internet Week 2014)

• Not good for networking
  – VM […]
    · Tick […]
    · TLB etc.
    · I/O […]


Page 8: ゼロから作るパケット転送用OS (Internet Week 2014)

• NIC […]
  – CPU?
  – Memory?
  – PCIe bus?
  – or something else?

Page 9: ゼロから作るパケット転送用OS (Internet Week 2014)

• Ethernet
  – 64-byte […] = […]
    · 1GbE: 1.488Mpps = 672 ns/packet
    · 10GbE: 14.88Mpps = 67.2 ns/packet
    · 40GbE: 59.52Mpps = 16.8 ns/packet
    · 100GbE: 148.8Mpps = 6.72 ns/packet

Page 10: ゼロから作るパケット転送用OS (Internet Week 2014)

• […]
  – CPU […] CPU […]
  – […]
    · netmap [Rizzo, USENIX ATC 2012]
    · Intel® DPDK
  – […]
    · Linux NAPI
    · Intel® DPDK

Page 11: ゼロから作るパケット転送用OS (Internet Week 2014)

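[Figure: system block diagram showing two CPUs with integrated memory controllers and attached memory, PCIe, the I/O Hub, and the I/O Controller Hub with an on-board NIC attached via the Direct Media Interface. Labels (a) to (c) mark the components discussed in the list that follows.]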

(a) 3.3GHz clock CPU
  • 0.3ns per cycle (220 cycles / packet)
  • + […]
(b) CPU-Memory bus (N.B., 64 bit wide access)
  • DDR3-1333 Dual Channel: 21.333GB/s (170.667Gbps)
  • DDR3-1600 Dual Channel: 25.600GB/s (204.800Gbps)
  • DDR3-1866 Dual Channel: 29.867GB/s (238.933Gbps)
(c) PCIe bus
  • Gen2: 500MB/s (x1) = 4Gbps
    – usually x8 for a two-port 10GbE NIC
    – x16 is not enough for a two-port 40GbE NIC
  • Gen3: 985MB/s (x1) = 7.88Gbps
(d) DMI bus
  • v1.0: 2GB/s (1GB/s per direction = 8Gbps)
  • v2.0: 4GB/s (2GB/s per direction = 16Gbps)

Bottleneck?

Page 12: ゼロから作るパケット転送用OS (Internet Week 2014)

• Data access latency (*)
  – L1 cache: 4-5 cycles ~ 1.2-1.5ns
  – L2 cache: 12 cycles ~ 3.6ns
  – L3 cache: 27.85 cycles ~ 8.4ns
  – RAM: 28 cycles + 49-56 ns ~ 65ns

(*) http://www.7-cpu.com/cpu/SandyBridge.html


Page 14: ゼロから作るパケット転送用OS (Internet Week 2014)

• PCIe […] = Memory Mapped I/O (MMIO)
  – 1529.17 cycles / read
    · 392.1 ns / read
  – 282.621 cycles / write
    · 72.47 ns / write

1M […] CPU Performance Monitoring Counter (PMC)

CPU: Intel Core i7 4770K
Memory: Corsair DDR3-1866 8GB x4
NIC: Intel X520-DA2

Page 15: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Generic NIC architecture, showing the ring buffer, its descriptors, and the packet buffers]

Page 16: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Generic NIC architecture, showing the ring buffer, its descriptors, and the packet buffers, with markers (2), (3) head, and (5) tail]

Packet reception

1. NIC receives a packet
2. NIC transfers the packet data to a buffer in RAM via DMA
3. NIC advances the head pointer
4. Software processes the packet
5. Software advances the tail pointer to release the packet

Page 17: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Generic NIC architecture, showing the ring buffer, its descriptors, and the packet buffers, with markers (1), (2) tail, and (5) head]

Packet transmission

1. Software writes a packet to a buffer in RAM
2. Software advances the tail pointer to commit the packet
3. NIC transfers the packet data from the buffer in RAM via DMA
4. NIC transmits the packet
5. NIC advances the head pointer to signal that the packet has been transmitted


Page 20: ゼロから作るパケット転送用OS (Internet Week 2014)

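• […]
  – UDP […] CPU […]
  – Tx […]
• n […]
  – Descriptor […]
  – n […] Tx tail […]

    /* Bulk transmission: queue n packets, then commit them with one tail write. */
    txq_tail = 0;
    for ( ;; ) {
        /* MMIO read of the Tx head register: ~392.1ns */
        txq_head = read_txq_head();
        /* Available Tx queue length; one slot stays unused so that
           head == tail always means "empty" */
        txq_len = txq_sz - 1 - (txq_sz - txq_head + txq_tail) % txq_sz;
        /* Check the available Tx queue length */
        if ( txq_len < n ) continue;
        for ( i = 0; i < n; i++ ) {
            /* Set the packet in the ring buffer slot at txq_tail */
            txq_ring[txq_tail].pkt = pkt_to_transmit;
            txq_tail = (txq_tail + 1) % txq_sz;
        }
        /* Commit: MMIO write of the Tx tail register: ~72.47ns */
        write_txq_tail(txq_tail);
    }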

Page 21: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Packet rate [Mpps] (0 to 16) versus bulk transfer size n [packets] (1 to 8), one curve per frame size: 64B, 96B, 128B, 192B, 256B, 384B, 512B, 768B, 1024B, 1536B. The 14.88Mpps line rate is marked. Annotations: ~500ns/packet, ~250ns/packet, ~125ns/packet.]

Page 22: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: RX queue ring and TX queue ring along a time axis, with hardware (hw) and software (sw) stages]

Strategy

• […]
• PCIe […]

Page 23: ゼロから作るパケット転送用OS (Internet Week 2014)

    rxq_tail = txq_tail = 0;
    blkcnt = 0;
    /* # of packets to be routed in bulk transfer */
    nr_blk = 256; /* can be another value */
    for ( ;; ) {
        /* Rx queue head */
        rx_desc = GET_RX_DESC_HEAD(netdev);

        if ( DMA_COMPLETED(rx_desc) ) {
            // Lookup routing table and copy from Rx to Tx
            // Rewrite destination MAC address, TTL--,
            // and calculate checksum
            blkcnt++;
            if ( blkcnt >= nr_blk ) {
                blkcnt = 0;
                write_rxq_tail(rxq_tail);
                write_txq_tail(txq_tail);
            }
        } else {
            blkcnt = 0;
            write_rxq_tail(rxq_tail);
            write_txq_tail(txq_tail);
        }
    }

Page 24: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: test topology with a Transmitter and a Router (port labels: RX, TX, RX; links labeled "untag")]

CPU: Intel(R) Core(TM) i7 4770K (3.90GHz, quad core)
Memory: 32GiB, DDR3-1866
NIC: Intel(R) X520-DA2 (2 ports)

Page 25: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Throughput [Gbps] (0 to 10) versus frame size [byte] (0 to 1600). Series: My implementation, Linux, Line rate. Annotation: 1 […] TTL […] CPU […].]

Page 26: ゼロから作るパケット転送用OS (Internet Week 2014)

• […]
  – Spirent Communications Spirent TestCenter
    · Interop Tokyo 2014 […]
    · SPT-N4U-110
    · CV-10G-S8
  – PC […]
    · CPU: Intel® Core i7 4770K
    · Memory: DDR3-1866 (8GB x4)
    · NIC: Intel® X520-DA2 (1 […])

Page 27: ゼロから作るパケット転送用OS (Internet Week 2014)

[Figure: Latency [us] (log scale, 1 to 1000) versus test traffic (64-byte frame) [Gbps] (1 to 10). Series: avg, min, max. Annotations: 90 […] ~10us; 0.001Mpps […].]

Page 28: ゼロから作るパケット転送用OS (Internet Week 2014)

• Networking Operating System
  – I/O […] etc.
  – […] OS […]
• 40GbE NIC […]

Page 29: ゼロから作るパケット転送用OS (Internet Week 2014)

• […]
  – Not CPU
  – Not memory
  – PCIe MMIO
• […] OS […]
  – 10GbE […]
  – 10GbE x4 […] Tx […]