40
Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

  • View
    217

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Router Construction II

OutlineNetwork ProcessorsAdding ExtensionsScheduling Cycles

Page 2: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 2

Observations

• Emerging commodity components can be used to build IP routers– switching fabrics, network processors,

• Routers are being asked to support a growing array of services– firewalls, proxies, p2p nets, overlays, ...

Page 3: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 3

Data Plane(IP)

Control Plane(BGP, RSVP…)

Router Architecture

Page 4: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 4

Software-Based Router

+ Cost

+ Programmability

– Performance (~300 Kpps)

– RobustnessData Plane(IP)

Control Plane(BGP, RSVP…)

PC

Page 5: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 5

Hardware-Based Router

– Cost

– Programmability

+ Performance (25+ Mpps)

+ RobustnessData Plane(IP)

Control Plane(BGP, RSVP…)

ASIC

PC

Page 6: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 6

NP-Based Router Architecture

+ Cost ($1500)

+ Programmability

? Performance

? RobustnessData Plane(packet flows)

Control Plane(packet flows)

IXP1200

PC

Page 7: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 7

In General...

IXP

Pentium

IXP

Pentium

...

Pentium

Page 8: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 8

Architectural Overview

. . . Network Services . . .

VirtualRouter

. . . Hardware Configurations . . .

Packet Flows

Forwarding Paths

Switching Paths

Page 9: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 9

Virtual Router

• Classifiers

• Schedulers

• Forwarders

Page 10: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 10

Simple Example

IP

Proxy

Active Protocol

Page 11: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 11

Intel IXP

Scratch

DRAM

SRAM

6 Micro-Engines

StrongARM

FIFOs

IX B

us

MA

C P

orts

IXP1200 Chip

PC

I B

us

Page 12: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 12

Processor Hierarchy

MicroEngines

Pentium

StrongArm

Page 13: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 13

Data Plane Pipeline

DRAM(buffers)

SRAM(queues,

state)

InputFIFO Slots

OutputFIFO Slots

InputContexts

OutputContexts

64B

Page 14: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 14

Data Plane Processing

INPUT context loopwait_for_datacopy in_fiforegsBasic_IP_processingcopy regsDRAMif (last_fragment)

enqueueSRAM

OUTPUT context loop

if (need_data)

select_queue

dequeueSRAMcopy DRAMout_fifo

Page 15: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 15

0123456789

0 4 8 16 24

MicroEngine Contexts

For

war

ding

Rat

e (M

pps)

input output

Pipeline Evaluation

100Mbps Ether 0.142Mpps

Measured independently

Page 16: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 16

What We Measured

• Static context assignment– 16 input / 8 output

• Infinite offered load• 64-byte (minimum-sized) IP packets• Three different queuing disciplines

Page 17: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 17

Single Protected Queue

• Lock synchronization• Max 3.47 Mpps• Contention lower bound 1.67 Mpps

O

I

I

I

Output FIFO

Page 18: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 18

Multiple Private Queues

• Output must select queue• Max 3.29 Mpps

O

I

I

I

Output FIFO

Page 19: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 19

Multiple Protected Queues

• Output must select queue• Some QoS scheduling (16 priority levels)• Max 3.29 Mpps

O

I

I

I

Output FIFO

Page 20: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 20

Data Plane ProcessingINPUT context loopwait_for_datacopy in_fiforegsBasic_IP_processingcopy regsDRAMif (last_fragment)

enqueueSRAM

OUTPUT context loop

if (need_data)

select_queue

dequeueSRAMcopy DRAMout_fifo

Page 21: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 21

Cycles to WasteINPUT context loopwait_for_datacopy in_fiforegsBasic_IP_processingnopnop…nopcopy regsDRAMif (last_fragment)

enqueueSRAM

OUTPUT context loop

if (need_data)

select_queue

dequeueSRAMcopy DRAMout_fifo

Page 22: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 22

How Many “NOPs” Possible?

00.5

11.5

22.5

33.5

0/0 40/4 80/8 160/16 320/32 640/64

Extra Register/SRAM Operations

For

war

din

g R

ate

(Mp

ps)

IXP1200 Evalualtion Board1.2 Mpps = 8x100Mbps

Page 23: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 23

Data Plane Extensions

Processing Memory Ops Register Ops

Basic IP 6 32

TCP Splicer 6 45

TCP SYN Monitor 1 5

ACK Monitor 3 15

Port Filter 5 26

Wavelet Dropper 2 28

Page 24: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 24

Control and Data Plane

Smart Dropper

Layered Video Analysis(control plane)

(data plane)

Shared State

Page 25: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 25

What About the StrongARM?

• Shares memory bus with MicroEngines– must respect resource budget

• What we do– control IXP1200 Pentium DMA– control MicroEngines

• What might be possible– anything within budget– exploit instruction and data caches

• We recommend against– running Linux

Page 26: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 26

Performance

MicroEngines

Pentium

StrongArm

310Kpps with1510 cycles/packet

3.47Mpps w/ no VRP or1.13Mpps w/ VRP buget

Page 27: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 27

Pentium

• Runs protocols in the control plane– e.g., BGP, OSPF, RSVP

• Run other router extensions– e.g., proxies, active protocols,

overlays

• Implementation– runs Scout OS + Linux IXP driver– CPU scheduler is key

Page 28: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 28

Processes...

.

.

.

.

.

.

.

.

.

Input Port

Pentium

Output PortP

P

PPP

Page 29: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 29

Performance

0

50

100

150

200

250

300

350

0 50 100 150 200 250 300 350 400

Aggregate Offered Load (Kpps)

Aggre

gate

Forw

ard

ing R

ate

(K

pps)

Interrupt

Polling, 1 Process

Polling, 2 Process

Polling, 3 Process

Polling, 3 Process,w/ o Batching

Page 30: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 30

Performance (cont)

0

50

100

150

200

250

300

IP - -IP ++

Active IP (native)

Active IP (Java)

Transparent Proxy

Classic Proxy

450MHz P-I ICisco 7200

Kpps

Page 31: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 31

Scheduling Mechanism• Proportional share forms the base

– each process reserves a cycle rate– provides isolation between processes– unused capacity fairly distributed

• Eligibility– a process receives its share only when its

source queue is not empty and sink queue is not full

• Batching– to minimize context switch overhead

Page 32: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 32

Share Assignment

• QoS Flows– assume link rate is given, derive cycle rate– conservative rate to input process– keep batching level low

• Best Effort Flows– may be influenced by admin policy– use shares to balance system (avoid livelock)– keep batching level high

Page 33: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 33

Experiment

A (BE)

B (QoS)

C (QoS)

A + C

B

Page 34: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 34

Mixing Best Effort and QoS

0

50

100

150

200

250

0 20 40 60 80 100 120 140

Flow A Offered Load (Kpps)

Forw

ard

ing R

ate

(Kpps)

Aggreate

Flow A (BE)

Flow B(QoS,90Kpps)

Flow C(QoS,90Kpps)

• Increase offered load from A

Page 35: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 35

CPU vs Link

0

50

100

150

200

250

0 10 20 30 40 50 60

Flow A Additional Processing Delay (ms)

Forw

ard

ing R

ate

(Kpps)

Aggreate

Flow A (BE)

Flow B(QoS,90Kpps)

Flow C(QoS,90Kpps)

• Fix A at 50Kpps, increase its processing cost

Page 36: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 36

Turn Batching Off

0

50

100

150

200

250

0 10 20 30 40 50 60

Flow A Additional Processing Delay (us)

Forw

ard

ing R

ate

(Kpps)

Aggreate

Flow A (BE)

Flow B(QoS,90Kpps)

Flow C(QoS,90Kpps)

• CPU efficiency: 66.2%

Page 37: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 37

Enforce Time Slice

0

50

100

150

200

250

0 10 20 30 40 50 60

Flow A Additional Processing Delay (us)

Forw

ard

ing R

ate

(Kpps)

Aggreate

Flow A (BE)

Flow B(QoS,90Kpps)

Flow C(QoS,90Kpps)

• CPU efficiency: 81.6% (30us quantum)

Page 38: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 38

Batching Throttle• Scheduler Granularity: G

– flow processes as many packets as possible w/in G

• Efficiency Index: E, Overhead Threshold: T– keep the overhead under T%, then 1 / (1+T) < E

• Batch Threshold: Bi– don’t consider Flow i active until it has

accumulated at least Bi packets, where Csw / (Bi x

Ci) < T

• Delay Threshold: Di– consider a flow that has waited Di active

Page 39: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 39

Dynamic Control

• Flow specifies delay requirement D• Measure context switch overhead

offline• Record average flow runtime• Set E based on workload• Calculate batch-level B for flow

Page 40: Router Construction II Outline Network Processors Adding Extensions Scheduling Cycles

Spring 2003 CS 461 40

Packet Trace

0

500

1000

1500

2000

2500

3000

3500

4000

4500

5000

5500

0 5 10 15 20 25 30 35 40 45 50 55

Serv

ice T

ime (

us)

WFQ

T=10%D=1000

T=25%D=1000

T=10%D=500

T=10%D=300