80
CCL/N300; Paul Huang 3/21/99 1 ITRI CCL Basic Requirements of a Switch Router What does / can a router do ? What can be done in hardware / software ? How do protocols / standards influence the design ?

Switch Router Design - NCTU

  • Upload
    others

  • View
    23

  • Download
    1

Embed Size (px)

Citation preview

Page 1: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 1

ITRICCL

Basic Requirements of aSwitch Router

What does / can a router do ?

What can be done in hardware / software ?

How do protocols / standards influence the design ?

Page 2: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 2

ITRICCL Types of Inter-network Nodes

1. Physical

2. Data Link

3. Network

4. Transport

5. Session

6. Presentation

7. Application

HubBridgeSwitch

Router

Gateway

Page 3: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 3

ITRICCL

Layer Purpose of the Layer Role of Switching

(7) Application(6) Presentation(5) Session

Defines user-oriented services such asfile transfer, messaging, and transactionprocessing; provides for structuringapplications, coding the data, andexchanging information

Application switching (e.g. e-mailforwarding); gateways between differentapplication types; support formanagement functions; selection ofdestination for messages

(4) Transport Delivery of data to applications, divisionof messages into packets

Directs the messages to the specificdestination application or protocol type

(3) Network End-to-end communications through oneor more subnets; selects optimal routes;controls loops; manages addressing

Forwards packets through aninterconnected set of networks

(2) Data Link Transfer of frames across a singlenetwork link such as a LAN; managescontention

Controls switched circuits, switchedLANs, and recovers from link errors

(1) Physical Transmission over a physical circuitincluding physical connectors, bitencoding, etc.

Circuit switching as is used for telephonyand port switching for LAN physicalmedia

What’s the difference? Switch vs Router

Page 4: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 4

ITRICCL Anatomy of a Node

Rep

eate

rR

eal T

ime

OS

Application

Network Protocol

NIC Driver

ApplicationLevel API

KernelLevel API

KernelLevel API

Fast Ethernet NIC

Another Node

Transmit

Receive

Transmit

Receive

ProtocolAPI

DriverSpecification

HardwareInterface

Page 5: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 5

ITRICCL Fast Ethernet System

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

NODE

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

AutoNegPMD

PCSPMA

AutoNegPMD

PCSPMA

MII

MDI

Fast Ethernet Standard (802.3u)

Soft

war

eH

ardw

are

Dev

ice

Net

wor

k In

terf

ace

Media Media

Page 6: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 5

ITRICCL Fast Ethernet System

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

NODE

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

AutoNegPMD

PCSPMA

AutoNegPMD

PCSPMA

MII

MDI

Fast Ethernet Standard (802.3u)

Soft

war

eH

ardw

are

Dev

ice

Net

wor

k In

terf

ace

Media Media

Baseband Repeater Unit

Page 7: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 5

ITRICCL Fast Ethernet System

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

NODE

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

AutoNegPMD

PCSPMA

AutoNegPMD

PCSPMA

MII

MDI

Fast Ethernet Standard (802.3u)

Soft

war

eH

ardw

are

Dev

ice

Net

wor

k In

terf

ace

Media Media

L2 Switch

Page 8: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 5

ITRICCL Fast Ethernet System

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

NODE

AutoNegPMDPMAPCS

Reconciliation

MACLLC

Protocol

RTOS / Applications

AutoNegPMD

PCSPMA

AutoNegPMD

PCSPMA

MII

MDI

Fast Ethernet Standard (802.3u)

Soft

war

eH

ardw

are

Dev

ice

Net

wor

k In

terf

ace

Media Media

L3 Switch - Router

Page 9: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 6

ITRICCL Media Access Controller

l Determine when a node can transmit a packet

l Send frames to the PHY for conversion into packets and transmissionon the media

l Receive frames from the PHY and send them to the software thatprocesses frames (protocols and applications).

l Frame checking

» Valid Frames– Frame size between 64 bytes & 1518 bytes

– Valid frame check sequence (CRC)

– Even number of octets

» Non-valid Frames– Runts: Any frame that is shorter than 64 bytes (512 bits) in size

– Jabber: Data transmission greater than 400 ms (largest packet: 120.56 ms)

– Dribble: Invalid number of octets

l Media independent

Page 10: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 7

ITRICCL IEEE 802.3 — CSMA / CD

l Media Access Rules

» Listen before sending– CSMA — Carrier Sense Multiple Access

– Interpacket Gap (IPG = 96 bit time or 0.96 µs for FE)

» Backoff– CD — Collision Detection

– Collision domain

– Collision window

– Slot time == maximum allowable collision window (512 bit times)

l minimum frame size (512 bit / 64 bytes)

l maximum network diameter

– Truncated binary exponential backoff

l RAND(0, 2 min(N,10) ), where N is the transmit attempt counter

l Integer multiple of 512 bit slot time (i.e. 512, 1024, 1536, 2048, … , 4096, etc.)

l Maximum backoff time is 5.3 ms.

Page 11: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 8

ITRICCL CSMA / CD Flow Chart

Transmit Frame

À Assemble PacketÁ Set Attempt counter to 0

Transmit 1st bit of the Packet

Is media busy ?

IPG passed ?

Collision ?

Tx Completed ?

Frame Tx Success

Delay Rest of IPGN

Other Tx Finished?

N

Y

Transmit next bit

N

Send Jam (32 bits)

Increment attemptcounter

Attempts > 16

Compute back-off

Delay back-off

Y

Y

Frame Tx Failed

Page 12: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 9

ITRICCL Packet Format

Preamble SFD Data Frame EFD

Preamble 10101010SFD Start Frame Delimiter 10101011EFD End of Frame Delimiter ----DA Destination AddressSA Source AddressL/T Length / TypeFCS Frame Check SequenceI/G Individual / GroupU/L Universal / Local AdministrationOUI Organizationally Unique IdentifierOUA Organizationally Unique Address

DA L/T DataSA FCS6 Bytes 6 Bytes 2B 46 ~ 1500 Bytes 4 Bytes

I/G OUI Address (OUA)U/L48 47 45 24 23 0

IP Info CHK DataProtocol IP InfoDest NA Src NA

Page 13: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 10

ITRICCL Wire Speed — Efficiency

74.3%

86.2%96.4%97.5%92.9%

97.1%

54.8%

38.1%

28.6%

19.0%

9.5%3.6%

32.2% 32.0% 31.8% 30.6%

28.5% 24.5%18.1%

12.6% 9.4% 6.3% 3.1% 1.2%0.0%

20.0%

40.0%

60.0%

80.0%

100.0%

1500 1262 1006 494 238 110 46 32 24 16 8 3Data 1500 1262 1006 494 238 110 46 32 24 16 8 3

Packet 1518 1280 1024 512 256 128 64 64 64 64 64 64

Page 14: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 11

ITRICCL

Best Ok Bad

Network Utilization

Util

izat

ion

Full Utilization 30% Saturation

Offered Load

Page 15: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 12

ITRICCL Basic & Worst Case Collision Detection w/ Cat-5 Cable

Repeater (70 bt)

A

B

C

10 m5.7 bt

1 m0.57 bt

100 m57 bt

25 bt

25 bt

25 bt

Node-to-node Path Delay Value (rounded to the nearest whole bit time)

A Õ B 126 = 25 + 5.7 + 70 + 0.57 + 25A Õ C 183 = 15 + 5.7 + 70 +57 + 25B Õ C 178 = 25 + 0.57 + 70 +57 +25

C Õ D Õ C 468 = 2 * (25 + 57 + 70 + 57 + 25)Safety margin 4Bit-time margin 40 = 512 - 472 -4

183

126

96

96

178

178

222 279 400 457

A B

C

D

100 m57 bt

25 bt

Page 16: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 13

ITRICCL Worst-case Collision Window w/ Class-I Hub and Fiberoptic Cable

Repeater (70 bt)

A

C

B

134 m67 bt

10 m5 bt

134 m67 bt

25 bt

25 bt

25 bt

Node-to-node Path Delay Value (rounded to the nearest whole bit time)

A Õ B 254 = 25 + 67 + 70 + 67 + 25A Õ C 192 = 15 + 67 + 70 + 5 + 25B Õ C 192 = 25 + 5 + 70 + 67 +25

A Õ B Õ A 508 = 2 * (25 + 67 + 70 + 67 + 25)Safety Margin 4Bit time margin 0 = 512 -508 - 4

254

253

192

192 253 384 507 699

A

B

C192 192

254

Page 17: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 14

ITRICCL Missed Collision w/ oversized network

Repeater (70 bt)

A

C

B

134 m67 bt

10 m5 bt

150 m75 bt

25 bt

25 bt

25 bt

Node-to-node Path Delay Value (rounded to the nearest whole bit time)

A Õ B 262 = 25 + 67 + 70 + 75 + 25A Õ C 192 = 15 + 67 + 70 + 5 + 25B Õ C 200 = 25 + 5 + 70 + 75 +25

A Õ B Õ A 524 = 2 * (25 + 67 + 70 + 75 + 25)Safety Margin 4Bit time margin -16 = 512 -524 - 4

262

261

192

192 261 392 512

A

B

C200

262

523

Page 18: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 15

ITRICCL 100Base-TX / FX Connection

MA

C

RE

C

PCS

PMA

MA

C

RE

C

PCS

PMA

+

– R

PCS

PMA

RPCS

PMA

+

Pair 1

Pair 3

Pin 1

Pin 2

Pin 3

Pin 6

Twisted Pair

R

PCS

PMA

RPCS

PMA

MA

C

RE

C

PCS

PMA

MA

C

RE

C

PCS

PMA

Fiberoptic pair

MII MIIR

epeater Unit

Nod

e

Page 19: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 16

ITRICCL 100Base-T4 Connection

MA

C

RE

C

PCS

PMA

MA

C

RE

C

PCS

PMA

+

– R

PCS

PMA

RPCS

PMA

+

Pair 2

Pair 3

Pin 1

Pin 2

Pin 3

Pin 6

Twisted PairMII MII

Repeater U

nit

Nod

e

+

+

+

+

Pair 1

Pair 4

Pin 5

Pin 4

Pin 7

Pin 8

Page 20: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 17

ITRICCL Repeater Testing

l Function

» Transmit / Receive event– Data handling: forward packet

– Receive event handling: carrier sense

» Error handling via partition– False carrier events: invalid start-of-stream delimiter

l Partition and set LINK UNSTABLE state after two false carrier event

l Send jam signal to all other ports on the repeater for 5 µs or until end of FCE

l Unset LINK UNSTABLE state after detecting no activity for more than 331 µs ordetecting a valid incoming packet after the the line has been idle for the interpacketgap time of 640 µs.

– Excessive collision: more than 60 collisions in a row

l Partition after receiving more than 60 collisions in a row

l Clear after detecting activity without a collision for more than 5 µs

– Receiver Jabber: data transmission greater than 400 µs (largest packet: 120.56 µs)

l Clear after jabber stops

Page 21: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 18

ITRICCL Basic Bridge Operation

Receive Frame

Address in Table ?

Ports the same ?

Unicast DA ?

Record Address &Port # in Table

MAC AddressTable

Lookup SAin Table

Lookup DA in table& get corresponding

port #

Address in Table ?Forward Frame to All Ports

except the inbound port

Inbound Port = DA port # Filter the FrameForward Frame to DA port #

N

Y

N

Output

Input

N

YN

Page 22: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 19

ITRICCL Multiple Bridges

A B C

D E F

AlphaM N O

P Q R

Beta

G H I

J K LGamma

Y ZDelta

S T U

V W X

Epsilon

Segment Alpha Gamma Beta Epsilon Delta

MAC Address ABCDEF GHIJKL MNOPQR STUVWX YZ

Bridge #1 111111 222222 333333 222222 22

Bridge #2 111111 111111 111111 333333 22

B11

2

3

B21

2

3

Page 23: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 20

ITRICCL Multiple Bridges

A B C

D E F

AlphaM N O

P Q R

Beta

G H I

J K LGamma

Y ZDelta

S T U

V W X

Epsilon

B3

B11

2

3

B21

2

3

Problems caused by looping• Broadcast storming• Learning problems• Cloned unicast frames

Solution• Spanning Tree Protocol

Page 24: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 20

ITRICCL Multiple Bridges

A B C

D E F

AlphaM N O

P Q R

Beta

G H I

J K LGamma

Y ZDelta

S T U

V W X

Epsilon

B3

B11

2

3

B21

2

3

Problems caused by looping• Broadcast storming• Learning problems• Cloned unicast frames

Solution• Spanning Tree Protocol

Page 25: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 21

ITRICCL Ethernet Switching

l Basic techniques

» Cut-through– Advantages

l low latency

– Disadvantages

l forwards runt & error frames

l internal speedup not possible

l mixed speeds difficult

» Interim Cut-through– Same as CT, but less runt frames

» Store & Forward– Advantage

l reduces error frames

l architecturally flexible

– Disadvantage

l longer latency (not really bad !!)

Page 26: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 22

ITRICCL Router vs. Routing function

l What does a router do

» Routing function– IP packet forwarding

– Route calculation/ convergence

– Route management

» Multicast– IP packet duplication

– Multicast routing

» Traffic Mgt. (QoS)– Packet Classification

– Packet Filtering

– Queue Management

» Network Mgt.

» Security– Firewall

– Authentication

l Router

» Conventional stand-alone routerperforms an IP routing function

– Bus based

– Central CPU

– Cached forwarding tables

– Centralized routing tables

– SW table lookup

» Calculations required– 10 Gbps throughput

– 64 byte packets = 50 ns / packet

– < 50 ns to make each routing decision.

Page 27: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 23

ITRICCL Network Processor

GeneralPurpose

Processor

CustomASICs

Poor ⇐⇐ Price / Performance ⇒⇒ Good

Poo

r ⇐⇐

Fle

xibi

lity,

TT

M

⇒⇒ G

ood

NetworkProcessor

High PerformanceLow Cost

Highly FlexibleFast time-to-market

l Evolution to network processors

» Software programmable

» Optimized instruction set fornetworking

» Breakthrough performance

» Switching, routing, and features

l Customer-specific differentiation

» Base level instruction set

» Empowers the higher level software

» Addresses all networking markets

l Enables high-level functions at thesame speed as basic switching wirespeed

Page 28: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 24

ITRICCL

So, What’s so hard aboutswitching & routing ?!#

Page 29: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 25

ITRICCL Typical Router Architecture

Route Mgt.

Bus /Switch Fabric

Slow

pat

h

Fast

pat

h

CPU

Memory

CPU

Memory

Interface

CPU

Memory

Interface

* Cache assistedIP routing

Packet Forwarding

Node Mgt.Traffic Mgt.

Security Mgt.

Page 30: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 26

ITRICCL The Easy Part: Basic IP Forwarding

l To forward an IP Unicast packet, you need to:

» Parse the IP header

» Lookup the Address in a large table of address prefixes (100,000+ entries)

» Check the checksum

» Decrement the TTL and adjust the checksum

l This stuff is easy to do at high speed

» This is straightforward for ASIC implementation

» Clever implementers can do OC-192 (or even OC-768)

» With today’s technology, this is not even close to being the bottleneck

Page 31: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 27

ITRICCL So What’s So Hard?

Switch Fabric

Packet Processing

Operating System

System performance = function of ALL elements“A chain is only as strong as its weakest link”

Protocol Application

Page 32: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 28

ITRICCL So What’s So Hard?

l There are things which can make high speed forwarding hard:

» Where data flows come together (backplane)

» Where parallelism is difficult– e.g. Optics, software, protocol

» Protocol standards– Unstable or poorly designed or under-defined standards

– Need mature implementations

– Multi-lingual

– Too many standards

» Lots of options and alternative paths

» Maintaining per-packet state that comes and goes

Page 33: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 29

ITRICCL So What’s So Hard?

l Proliferation of standards make system implementation hard:

» Support for legacy protocol (i.e. Multi-protocol & conformance)

» Interoperability (i.e. Multi-vendor)

» Addressing

» Routing

» Multicasting

» Traffic mgt. (QoS)

» Network mgt.

» Mobility

» Security

» Virtual Private Network

Page 34: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 30

ITRICCL So What’s So Hard?

l Reliability, maintainability, redundancy

» Hot swappable, Hot standby router

» Coherent network state

» Online upgrade

» Redundancy (power supply, link failure, etc.)

l Scalability

l Additional

» Frame translation

» Load balancing

» Port mirroring

Page 35: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 31

ITRICCL One Hard Part: The Backplane

l Given 100 OC-xx ports, served by 100 line cards, somehowpackets have to get between the line cards

l The design of the switched backplane is “non- trivial

» If there are “n” line cards, you have an n*log( n) problem

» You want very high switch utilization to push performance (thiseffects where packet is buffered)

» Power and heat become important

» Cost of hardware is meaningful

l It is easy to lose your QoS guarantees across the switchedbackplane

Page 36: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 32

ITRICCL ASIC Designer’s Nightmare: Options

l MPLS and IP forwarding

l Filters (source or destination address, RSVP, … )

l Tunneling: Encapsulation and decapsulation(particularly if reassembly is needed)

l Multicast

l IP Options

l Multipath (ECMP)

l NAT (application addresses plus state)

l IPv6 alternate headers

Page 37: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 33

ITRICCL What does MPLS do to a router ?

l Answer: Provide lots of alternative forwarding paths<IP> → IP

<IP> → <Shim> + <IP>

<IP> → <ATM+ shim> + <IP>

<Shim> → <Shim> ; <Shim>+<Shim>; <ATM+ shim>

<Shim>+< IP> → <IP> (with or without IP lookup)

<ATM+ shim> → <ATM+ shim>; <Shim>;…

<ATM+ shim>+< IP> → <IP> (with our without IP lookup)

<ATM+ shim>+< Shim> → <Shim>

etc …

l This is not popular with hardware developers L

Page 38: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 34

ITRICCL A Hardware Developer’s View

l Generally: Hardware engineers wish that folks who writestandards paid attention to hardware issues

l IP Forwarding can be done very fast, no problem

l CLNP forwarding can be done very fast, …

l But: Please don’t give us so many options!

l “It is clear that IP standards (including IPv6) were designedby folks who don’t pay any attention to what it takes to builda fast router?”

l (On the other hand, things are still within a top hardwareteam’s capabilities)

Page 39: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 35

ITRICCL The Bottom Line on Speed

l There are really three bottlenecks:

» The switched backplane

» The optics

» How much extra complexity and flexibility you want(Filtering, MPLS, options, all make it harder to go fast

l It really doesn’t matter what the forwarding looks like, ifits straightforward and well defined

l At very high speed, IP, MPLS, ATM, Frame Relay, are allconstrained by the same issues are all constrained by thesame issues

Page 40: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 36

ITRICCL Can Routers go fast enough ?

l There is some limit to how fast routers can go

l Or, more correctly, there is some limit onhow fast electronics can go

» Given today’s chip technology, and reasonableeconomics, the limit might be on the order of a fewthousand * OC-192

» In four years, possibly ditto but * OC-768

l Past this point, we need optics

» Core switches become WDM switches

» Very fast, very branchy routers (and ATMswitches) become feeders for WDM in the core

Page 41: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 37

ITRICCL Issues affecting Reliability

l Hardware robustness

» Reliable hardware, Redundancy at many levels

l Software quality and robustness

» e. g., How good is your routing software?

l Protocol Design Protocol Design

l Response to congestion Response to congestion

l Failover of links (“Sonet- Like” failover rates)

l Network Management

l Avoid mistakes, Diagnose failures

l Testing, testing, testing

Page 42: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 38

ITRICCL

Key Design Considerations

Are your assumptions reasonable ?

Page 43: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 39

ITRICCL Design Constraints

l Target: Right product at the right time at the right price

» Cost

» System

» Market Segment

» Market Timing (Market window)

» Specifications

l Competition

» Advantages & Weaknesses

» Targeted Market

l Resources

» Engineering team (Experience / Stability)

» Management team (Financing / Supportiveness)

» Standards / Customer / Industry tracking

Page 44: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 40

ITRICCL What is Required ??? (Perceived vs. Real)

l Optimizing Performance:

» Wire-speed switching at Layer 2

» Wire-speed forwarding at Layer 3

l Minimize Latency: Cut-through switching vs. Store-and-forward

l Increased Scalability: SOHO, Departmental, Enterprise, Backbone

l Maximize Integration: Multi-chip vs. Single chip solution

l Increased Functionality:

» VLAN (Port, MAC, IP, IEEE 802.1q Tagging, etc.)

» Port Trunking / Port Snooping

» Support Layer 3, Layer 4, … , Layer 7

» Support IP, IPX, SNA, …

» Support IEEE 802.3x flow control, jamming

» CoS / QoS / RSVP / SBM / Differentiated Service– Multiple loss / delay queues

– per VC queueing

Page 45: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 41

ITRICCL Are these assumptions reasonable ?

l Maintain multicast / unicast packet sequence.

l Multicast packet needs to switched at the same time

l Support 8 k / 16 k / 32 k MAC addresses.

l Support 8 k / 16 k / 32 k IP addresses.

l Support full SNMP / RMON statistic collection.

Page 46: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 42

ITRICCL

Key Design Considerations

Can you successfully overcome today’stechnology limitations ?

Page 47: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 43

ITRICCL Technology Assumptions

l Memory speeds, size, types» DRAM, SRAM, SDRAM, SSRAM, Rambus, NetRAM

l Semiconductor technology

» Dimension: 0.8 µm, … , 0.35 µm, 0.25 µm, 0.18 µm, etc.

» Power: 5 V, 3.3 V, 2.5 V, etc.

» Embedded Memory

l Design Tools» Simulation: RTL level, Behavioral, Cycle-base

» Layout Capabilities

» Emulation Technology

Page 48: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 44

ITRICCL Do these assumptions hold ?

l Freebies

» Memory speed / size

» Silicon cost

» Computational power

l Does these assumption still holds in a hyper-competitiveenvironment?

» No, because everybody have access to the samecomponents and semiconductor foundry.

Page 49: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 45

ITRICCL Improved Competitiveness

l Creating advantages by speeding up design cycle» Increase engineering experience

» Invest in the state-of-the-art engineering tools

– Latest simulation / CAD tools

– Advance computers

– SOC Emulation / FPGA hardware

l to improve simulation time

l to reduce potential errors, thus less debugging

Page 50: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 46

ITRICCL

Design Goals

Design specifications

Page 51: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 47

ITRICCL Design Specifications

l System Features

» Single chip eight 10/100 Mbps Ethernet ports with RMII interface

» Provides two 32-bit memory interfaces which support SSRAM

» Supports a 16-bit CPU interface

» Statistics collection to support SNMP, RMON-1

l Layer 3 Features

» Supports wire-speed IP routing (1.2Mpps) with line rate address lookup

» Supports 10K routes

» Supports IP Multicast

» Supports two level of user data priority (Class of Service Support)

l Layer 2 Features

» Supports IEEE 802.1d bridging and spanning tree algorithm

» Supports port or IEEE802.1Q compliant tag based VLANs

» Supports 8K MAC address entries

» IEEE 802.3x flow control for full duplex operation

» Supports port snooping

Page 52: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 48

ITRICCL

Design Goals

Architecture

Page 53: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 49

ITRICCL Key Common Architectural Components

ForwardingDecision

ForwardingDecision

Output LinkScheduling

Output LinkScheduling

Backplane

Forwarding Table Management, Network Management, and

System Management

In high performance systems, the forwarding decision, backplane and output linkscheduling must be performed in hardware, while the less timely management andmaintenance functions are performed in software.

Page 54: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 50

ITRICCL Architectural Evolution

Line Card #1

Line Card #2

Line Card #N

CPU

Memory

• • •

Line Card #1

Line Card #2

Line Card #N

CPU

Memory

• • •

CPU / Memory

CPU / Memory

CPU / Memory

Page 55: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 51

ITRICCL Architectural Evolution

Line Card #1

Line Card #2

Line Card #N

CPU

Memory

• • •

ForwardingEngine

ForwardingEngine

ForwardingEngine

Line Card #1

Line Card #2

Line Card #N

CPU

Memory

• • •

ForwardingEngine

ForwardingEngine

ForwardingEngine

Crossbar

Page 56: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 52

ITRICCL Architectural Renewal w/ Advance Technology

Trade-offCentralized vs. DistributedHardware vs. SoftwarePacket vs. CellConnection-less vs. Connection-orientedShared Bus vs. Crossbar

Transmission vs. Switching“Big Pipe” vs. “Managed BW”“Dumb Network” vs. “Intelligent”Wired vs. Wireless

Technology FactorsSemiconductor Advances

• Computing Power (CPU)• Memory Size• Analog / RF / Optical technology

Material Advances• Optical transmission

Page 57: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 53

ITRICCL Conceptual Model of L3 Switch

MAC

MAC

MAC

MAC

MAC

MAC

MAC

MAC

Buffer Mgt.

CPU

DescriptorLinks

PacketM

emoryR

outin

gT

able

I/O Scheduler

Packet Control

ForwardingEngine

Page 58: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 54

ITRICCL Packet Flow (Tetris)

Packet ControlForwarding

EngineBuffer Mgt.

Data Hdr

Hdr

Routing

Decision

Hdr

Output Scheduler

Data Hdr

Priority / Normal / Multicast Pkt (Ptr & Hdr)

Page 59: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 55

ITRICCL Packet Controller

MAC

Packet FIFO

PacketMemory

MemoryInterfaceModule

MemoryController

Scheduler

Page 60: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 56

ITRICCL Output Scheduler

Priority Normal Multicast

Header Process

MAC

Packet FIFO

From Packet Control

From Buffer Mgt.

Scheduler

Nex

t Wri

te

Nex

t Rea

d

Page 61: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 57

ITRICCL Buffer Management

Tail Maintenance

Routing DecisionI/O ports, Pkt Location, QoS

Modified IP Header

Head Maintenance

FreeDescriptor

To OutputScheduler

Temporary Descriptors

Page 62: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 58

ITRICCL Forwarding Engine

L2 Table Lookup (DA)

L2 Learning (SA)

L2 Learning (SA)

HeaderVerification

MulticastLookup

UnicastLookup(ARP)

Header Modification

Routing Decision

Route

Port SnoopingMap

Port TrunkingMap

Page 63: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 59

ITRICCL

Design Trade-offs

Packet Memory Design

Page 64: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 60

ITRICCL Variable Length Format

l Advantage

» Simple, no link list required

» No descriptors required

» No gaps between packets

» Easy to debug

l Disadvantage

» No sharing among ports

» Fast route decision required

» Large temporary FIFO required

» Parity bit or packet length write-back required

» Look-ahead forwarding not allowed formulticast packets

l Variations

» Parity bit vs. Packet Length

Port #1 Port #2 Port #3

Page 65: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 61

ITRICCL

Port #1 Port #2 Port #3

Fixed Packet Size Format

l Advantage

» Sharing among ports

» Routing decision relaxed

» Look-ahead forwarding allowed

» Small temporary FIFO

l Disadvantage

» Inefficient for small packets

» Link list required

» Difficult to debug

l Variations

» 1536 bytes vs 2048 bytes

Page 66: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 62

ITRICCL

Port #1 Port #2 Port #3

Cell Format

l Advantage

» Sharing among ports

» Efficient for most packets

» Routing decision relaxed

» Look-ahead forwarding allowed

» Small temporary FIFO

l Disadvantage

» Large descriptor memory required

» Link list required

» Complex logic / Longer design cycle

» Prone to error

» Very difficult to debug

l Variations

» 64 / 128 / 256 bytes

Page 67: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 63

ITRICCL

Design Trade-offs

Buffer Management

Page 68: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 64

ITRICCL Queue Methodology #1

A2

A4

A9

A3

A8

A6

A5

A7

xx

A1

A10

Link Listfor Port #1

Link Listfor Port #2

Link Listfor Port #3

Link Listfor Port #4

Link Listfor FreeUnicast

Descriptor

Link Listfor Free

MulticastDescriptor

Page 69: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 65

ITRICCL Queue Methodology #2

A2

A4

A9

A3

A8

A6

A5

A7

xx

A1

Link Listfor Port #1

Link Listfor Port #2

Link Listfor Port #3

Link Listfor Port #4

Link Listfor FreeUnicast

Descriptor

Link Listfor Free

MulticastDescriptor

A10 A10

A1

Page 70: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 66

ITRICCL Queue Methodology #3

A2

A4

A9

A3

A8

A6

A5

A7

xx

Link Listfor Port #1

Link Listfor Port #2

Link Listfor Port #3

Link Listfor Port #4

Link Listfor FreeUnicast

DescriptorA1

Link Listfor Multicast

A10

Page 71: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 67

ITRICCL Queue Methodology #4

Link Listfor Port #1

Link Listfor Port #2

Link Listfor Port #3

Link Listfor Port #4

Link Listfor Free

Packet Buffer

A2

A4

A9

A3

A8

A1

A6

A10

A5

A1

A7

A10

Page 72: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 68

ITRICCL

Design Trade-offs

IP Forwarding

Page 73: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 69

ITRICCL IP Routing

l Routing Algorithms calculate the routes

» Unicast: RIP, OSPF

» Multicast: DVMRP, PIM, MOSPF, CBT

l Routes are converted to table format

l Route tables are written into memory

» Initialization

» Route updates

l Route search looks up forwarding instruction / packet

Page 74: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 70

ITRICCL Route Search Operation

l Longest Prefix Match

l Lookup Criteria

» # memory access required

» size of the data structure

» # instruction required

Depth 32

232 leaves (IP Address)

l Lookup Methods

» Hashing

» Cache hit

» CAM

» Tree search

» Table lookup

» CPU search

» Protocol based (Tagging)

Routing Table

R1 = 0101R2 = 0101101R3 = 010110101011

IP = 010101101011

IP = 010110101101

Page 75: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 71

ITRICCL Histogram of Prefix Length Distribution

100 101 102 103 104 105 106

891011121314151617181920212223242526272829303132

Page 76: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 72

ITRICCL Mutated Binary Search on Hashing Table

“Waldvogel, et. Al. “Scalable High Speed IP Routing Lookup”

Length Hash5712

01010

01010110110110

011011010101

Hash Table

16

18

8

14

17

19

10

15

2420 21 22

2627

28 30 32

9

12 11

13

Page 77: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 73

ITRICCL Mutated Binary Search on Hashing Table

l Criteria

» Memory reference: 2X (Average); 5X (Worst)

» Memory usage: 1.2 Mbyte

» Lookup time: 100 ns (Ave); 450 ns (Worst); 2 ~ 10 Mpps

l Advantage:

» The speed of IP lookup is independent of forwarding table size

» Relatively few memory access

» Fast enough to support Gigabit rates

l Disadvantage:

» Routing update requires the tree to be rebuilt

» Insertion and deletion of routes from memory table is complex

Page 78: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 74

ITRICCL Direct Table Lookup

“P. Gupta, et. al. “Routing Lookups in Hardware at Memory Access Speeds”

0

2324

31

224 Entries

24

31 28 Entries

NextHop

Page 79: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 75

ITRICCL Direct Table Lookup

l Criteria

» Memory reference: 2X (Maximum)

» Memory usage: 33 Mbyte

» Lookup time: 10 ~ 20 Mpps

l Advantage:

» Few memory references

» Enabling pipelined implementation

l Disadvantage:

» Inefficient memory usage

» Insertion and deletion of routes from memory table is complex

Page 80: Switch Router Design - NCTU

CCL/N300; Paul Huang 3/21/99 76

ITRICCL Conclusion

l Understand and always keep the “Big Picture” inmind

» Market

» Technology

» Brain Power

l Be aware of the “Hype” vs. “Reality”

l Remember the “KISS” Principle

» Keep it simple, stupid !

» Successful technologies are not about perfection, but aboutcompromise between complexity, performance, ease ofdeployment and cost