SPP V2 Router Design
John DeHart and Mike Wilson
2 - Mike Wilson - 04/21/23
Revision History
3 June 2008
»Initial release, presentation
25 June 2008
»Updates on feedback from presentation
27 July 2009
»Current status, changes, Control documentation
24 August 2009
»Updates from debugging, simulation

Current Status: Summary
Memory Layout
»Done, may need revisiting
Scripts (.ind files) done, missing TCAM initialization
NPUA blocks written, simulates, some GPE-to-NPE problems
NPUB broken, needs some changes
»Needs RxB SRAM Ring fix
»HdrFmt needs internal header fix
»Recent changes to LookupB/Copy not yet added
»Need some changes to TxB for chained buffers
Recent changes:
»Exception, Local Delivery packets omitted in original design
  Necessitates changes to Parse
»Changed ResultTable indexing
  Impacts LookupB/Copy

SPP Versions
SPP Version 0:
»What we used for SIGCOMM Paper
SPP Version 1:
»Bare minimum we would need to release something to PlanetLab Users
SPP Version 2:
»What we would REALLY like to release to PlanetLab users.

Objectives for SPP-NPE version 2
Deal with constraints imposed by switch
»can send to only one NPU; can receive from only one NPU
»split processing across NPUs
  parsing, lookup on one; queuing on other
Provide more resources for slice-specific processing
Decouple QM schedulers from links
»collection of largely independent schedulers
»may use several to send to the same link
  e.g. separate rate classes (1-10M, 10-100M, 100-1000M)
  optionally adjust scheduler rates dynamically
Provide support for multicast
»requires addition of next-hop IP address after queueing
Enable single slice to operate at 10 Gb/s
Support “slow” code options
»Use separate rate classes to limit rate to slow code options
»LCI QMs for Parse, NPUB QMs for HdrFmt

SPP Version 2 System Architecture
[Figure: system architecture, default data path. LC 7010 blade: RTM (1×10 Gb/s or 10×1 Gb/s), LC Ingress, LC Egress, SPI switch, switch FIC. Switch blade. GPE blades. NPE 7010 blade: NPUA (Decap, Parse, Lookup, AddShim), NPUB (Copy, QM, HdrFormat), SPI switch, FIC.]

SPP Version 2 System Architecture
[Figure: the same system architecture, showing the fast-path data path.]

SPP Version 2 System Architecture
[Figure: the same system architecture, showing the exception data path and local delivery.]

NPE Version 2 Block Diagram
[Figure: block diagram. NPUA: RxA (2 ME), Decap (1 ME), Parse (8 ME), LookupA (1 ME), AddShim (1 ME), TxA (2 ME), StatsA (1 ME), with TCAM and SRAM. NPUB: RxB (2 ME), LookupB&Copy (2 ME), QueueManager (4 MEs), HdrFmt/SubEncap (4 MEs), Scr2NN/Freelist (1 ME), TxB (2 ME), StatsB (1 ME), with SRAM (labels include SRAM/0 and SRAM/3). Both NPUs connect through the SPI switch to the switch blade and GPE.]

NPE Version 2 Block Diagram
[Figure: the same block diagram, annotated with the rings between blocks: NN, Scr/256, Scr/512, Scr/1024, SRAM.]

PlanetLab NPE Input Frame from LC or GPE
Ethernet Header:
»DstAddr: MAC address of NPE
»SrcAddr: MAC address of LC or GPE
»VLAN: One VLAN per MR (MR == Slice)
  Only use lower 11 bits of VLAN tag
IP Header:
»Dst Addr: IP address of this node
  How many IP addresses can a NODE have?
»Src Addr: IP address of previous hop
»Protocol: UDP
UDP Header:
»Dst Port: Identifies input tunnel
»Src Port: with IP Src Addr identifies sending entity
[Figure: input frame layout, marked at 8-byte boundaries, assuming no IP options. Ethernet header: DstAddr (6B), SrcAddr (6B), Type=802.1Q (2B), VLAN (2B), Type=IP (2B). IP header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B). UDP header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B). UDP Payload (MN Packet). Ethernet trailer: PAD (nB), CRC (4B).]
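
As a rough illustration of the frame layout above, the following C sketch pulls the slice ID and tunnel fields out of an 802.1Q + IPv4 + UDP frame held in a byte buffer. The struct and function names (`rx_tunnel_info`, `parse_input_frame`) are hypothetical and not taken from the SPP code; the offsets follow the standard Ethernet/IPv4/UDP layouts, and the real microengine code works on buffers in DRAM rather than a flat array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields a Decap-style block would extract (illustrative names). */
struct rx_tunnel_info {
    uint16_t slice_id;   /* lower 11 bits of the VLAN tag */
    uint32_t ip_saddr;   /* IP address of previous hop */
    uint16_t udp_sport;
    uint16_t udp_dport;  /* identifies the input tunnel */
    size_t   mn_offset;  /* offset of the UDP payload (MN packet) */
};

static uint16_t rd16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }
static uint32_t rd32(const uint8_t *p) { return ((uint32_t)rd16(p) << 16) | rd16(p + 2); }

/* Returns 0 on success, -1 if the frame is too short or is not
 * 802.1Q + IPv4 + UDP. */
int parse_input_frame(const uint8_t *f, size_t len, struct rx_tunnel_info *out)
{
    assert(f != NULL && out != NULL);
    if (len < 18 + 20 + 8) return -1;               /* Eth+VLAN, min IP, UDP */
    if (rd16(f + 12) != 0x8100) return -1;          /* Type = 802.1Q */
    if (rd16(f + 16) != 0x0800) return -1;          /* Type = IP */

    const uint8_t *ip = f + 18;
    if ((ip[0] >> 4) != 4) return -1;               /* IPv4 only */
    size_t ihl = (size_t)(ip[0] & 0x0F) * 4;        /* header length incl. options */
    if (ihl < 20 || len < 18 + ihl + 8) return -1;
    if (ip[9] != 17) return -1;                     /* Protocol = UDP */

    const uint8_t *udp = ip + ihl;
    out->slice_id  = rd16(f + 14) & 0x07FF;         /* only lower 11 bits used */
    out->ip_saddr  = rd32(ip + 12);
    out->udp_sport = rd16(udp);
    out->udp_dport = rd16(udp + 2);
    out->mn_offset = 18 + ihl + 8;
    return 0;
}
```

With no IP options the MN packet starts at byte 46 (18B Ethernet header with VLAN, 20B IP header, 8B UDP header).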

Local Delivery / Exceptions
GPE has separate tunnels for LD and EX
»Standard filters handle these packets
»No internal packet headers required, although we can still use internal headers for exceptions
Return path from GPE uses same tunnels
»Standard filters handle re-classify cases
»Internal packet headers from GPE to NPE are MNet-specific
  Provides filter key for GPE-routed packets
  Substrate headers unchanged
  MN frames carry code-option-specific details, filter key
  For IPv4, MN frame has IP version 0, payload has 112b lookup key to use.
  If GPE wants to reclassify, it sends a normal packet.

[Figure: NPE Version 2 Block Diagram (repeated), with the RxA output format: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); Eth. Frame Len (16b), Reserved (12b), Port (4b).]

RxA
No change from V1

[Figure: NPE Version 2 Block Diagram (repeated), with Decap data formats. Input: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); Eth. Frame Len (16b), Reserved (12b), Port (4b). Output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Slice ID (VLAN) (16b), Rx UDP DPort (16b); Rx IP SAddr (32b); Rx UDP SPort (16b), Reserved (12b), Code (4b); Slice Data Ptr (32b).]

Decap
Inputs:
»Packet from RxA
Outputs:
»Meta-frame (handle, offset and length)
»Slice ID (VLAN tag)
  Actually, lower 11b of VLAN tag and lower 4b of Rx DA (for RxID)
»Metainterface (Rx Saddr, Rx Sport, Rx Dport)
»Code Option (4b, only 16 available)
»Slice data pointer
Initialization:
»VLAN table, NPE MAC Address
Functionality:
»Read VLAN tag from DRAM, determine correct code option.
»Validate packet. Drop invalid, unmatched packets.
  IP Options for NPE dropped in LC, should never arrive here!
»Enqueue valid packets to Scratch ring.
»Update stats
Status:
»Works for valid packets, invalid packet handling untested

VLAN table
VLAN    code_opt  slice_data_ptr  slice_data_size
0       0         0               0
1       0         0               0
…       …         …               …
0x0aa   1
…       …         …               …
0x7ff   0         0               0
[Figure: the slice_data_ptr of entry 0x0aa points to that slice's data area, shown as SD data and P data.]
code_option = 0 implies invalid slice
»“on switch” for a slice in the data plane
SD data is currently only counters
56B slice data
Only use lower 11b of VLAN tag (2048 VLANs)
Only changes from V1:
»No longer need all data on NPUA, drop HF data, per-slice buffer limits
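
The table lookup described above can be sketched in C as follows. The entry layout mirrors the three-field struct given later in the Control slides; the function name `vlan_lookup` and the constant `VLAN_TABLE_ENTRIES` are hypothetical, and the real table lives in SRAM at a fixed base address rather than in a C array.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VLAN_TABLE_ENTRIES 2048u   /* indexed by lower 11b of the VLAN tag */

struct vlan_entry {
    unsigned int code_opt;         /* only 4 lsb used; 0 means invalid slice */
    unsigned int slice_data_ptr;
    unsigned int slice_data_size;
};

/* Returns the table entry for this VLAN tag, or NULL when the slice is
 * not enabled (code_opt == 0) and the packet should be dropped. */
const struct vlan_entry *vlan_lookup(const struct vlan_entry *table,
                                     uint16_t vlan_tag)
{
    assert(table != NULL);
    const struct vlan_entry *e = &table[vlan_tag & (VLAN_TABLE_ENTRIES - 1u)];
    return (e->code_opt & 0xFu) != 0 ? e : NULL;
}
```

Because only the low 11 bits index the table, two VLAN tags that differ only in their upper bits land on the same entry.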

[Figure: NPE Version 2 Block Diagram (repeated), with Parse data formats. Input: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Slice ID (VLAN) (16b), Rx UDP DPort (16b); Rx IP SAddr (32b); Rx UDP SPort (16b), Reserved (12b), Code (4b); Slice Data Ptr (32b). Output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Lookup Key[143-112] Type (1b)/RxID (4b)/Slice ID (11b)/Rx UDP DPort (16b); Lookup Key[111-80] DA (32b); Lookup Key[79-48] SA (32b); Lookup Key[47-16] Ports (32b); Lookup Key[15-0] Proto/TCP_Flags (16b), Exception Bits (12b), Code (4b).]

Parse
Inputs:
»Meta-frame (handle, offset and length)
»Slice ID (VLAN tag, RxID)
»Tunnel ID (Rx Saddr, Rx Sport, Rx Dport)
»Code Option (4b, only 16 available)
»Slice data pointer
Outputs:
»Meta-frame (handle, offset and length)
»Lookup key (Includes slice ID, Rx UDP dport)
»Code Option (4b, only 16 available)
»Exception bits (MN-specific) – do we still need these? (Probably)
Initialization:
»Slice Data
Functionality:
»Slice-specific processing:
  Parse meta-frame.
  Extract lookup key.
  Raise any relevant exceptions.
  Can pass slice data to HdrFmt in bytes 16..30 of packet. (0..15 are reserved for AddShim)
»Substrate processing:
  Add substrate-specific information to lookup key (32b: Lookup type, RxID, Slice ID, Rx UDP dport)
Status:
»Needs internal packet handling from GPE for GPE-specified filter keys
»Needs to use "special" filter key for exception path, 0x0. Substrate processing should still prepend substrate-specific key information (slice, MiID)
»Works for normal (LCI-to-NPE) packets
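
To make the key layout concrete, here is a minimal C sketch of how the 32b substrate word and the 112b slice-controlled portion could be assembled into a 144b (18-byte) key for the IPv4 code option. The function names are hypothetical, and placing the Type bit in the most significant position is an assumption based on the field order Type/RxID/Slice ID/Rx UDP DPort; the actual bit positions should be checked against the Parse code.

```c
#include <assert.h>
#include <stdint.h>

#define LOOKUP_KEY_BYTES 18   /* 144b key */

/* Substrate word, Lookup Key[143-112]:
 * Type (1b) / RxID (4b) / Slice ID (11b) / Rx UDP DPort (16b).
 * Bit placement (Type in the MSB) is an assumption. */
uint32_t substrate_key_word(unsigned type, unsigned rx_id,
                            unsigned slice_id, uint16_t rx_udp_dport)
{
    assert(type <= 1u && rx_id <= 0xFu && slice_id <= 0x7FFu);
    return ((uint32_t)type << 31) | ((uint32_t)rx_id << 27) |
           ((uint32_t)slice_id << 16) | rx_udp_dport;
}

static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

/* Builds the full key, most significant byte first. */
void build_ipv4_key(uint8_t key[LOOKUP_KEY_BYTES], uint32_t substrate,
                    uint32_t da, uint32_t sa, uint32_t ports,
                    uint16_t proto_flags)
{
    put32(key + 0, substrate);                 /* Key[143-112] */
    put32(key + 4, da);                        /* Key[111-80]  */
    put32(key + 8, sa);                        /* Key[79-48]   */
    put32(key + 12, ports);                    /* Key[47-16]   */
    key[16] = (uint8_t)(proto_flags >> 8);     /* Key[15-0]    */
    key[17] = (uint8_t)proto_flags;
}
```

The first 4 bytes are owned by the substrate; a code option only fills in the remaining 14 bytes (112b).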

[Figure: NPE Version 2 Block Diagram (repeated), with LookupA data formats. Input: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Lookup Key[143-112] Type (1b)/RxID (4b)/Slice ID (11b)/Rx UDP DPort (16b); Lookup Key[111-80] DA (32b); Lookup Key[79-48] SA (32b); Lookup Key[47-16] Ports (32b); Lookup Key[15-0] Proto/TCP_Flags (16b), Exception Bits (12b), Code (4b). Output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Slice ID (VLAN) (16b), Exception Bits (12b), Code (4b); Result Index (32b); Rsvd (16b), Stats Index (16b).]

LookupA
Inputs:
»Meta-frame (handle, offset and length)
»Lookup key (Includes slice ID, RxID, Rx UDP dport)
»Code Option (4b, only 16 available)
»Exception bits
Outputs:
»Meta-frame (handle, offset and length)
»Lookup Result (Index into SRAM table on NPUB)
  Actual max index is 0x3FFFF (Unicast), with single-bit type flag = 19 bits
»Slice ID (VLAN tag)
»Code Option (4b, only 16 available)
»Exception bits (from Parse)
»Stats Index (from TCAM)
  Can this fit in the 13 bits leftover from the result index? No, result is bigger now.
Initialization:
»Filters set in TCAM by control
Functionality:
»Look up key in TCAM
»On miss, drop the packet
»Local Delivery is now a normal lookup
»Lookup result is now just a 32b index (and stats index)
Status:
»Written; untested.
»Result size currently 48b; would like to reduce to 32b.

[Figure: NPE Version 2 Block Diagram (repeated), with AddShim data formats. Input: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b); MN Frm Offset (16b), MN Frm Length (16b); Slice ID (VLAN) (16b), Exception Bits (12b), Code (4b); Result Index (32b); Rsvd (16b), Stats Index (16b). Output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

AddShim
Inputs:
»Meta-frame (handle, offset and length)
»Lookup Result (Index into SRAM table on NPUB)
»Slice ID (VLAN tag)
»Code Option (4b, only 16 available)
»Exception bits (from Parse)
»Stats Index (from TCAM)
Outputs:
»Shim Packet (buffer handle)
  Buffer descriptor contains updated offset and length, if needed
Initialization:
»None.
Functionality:
»Prepend shim header to preserve packet annotations across NPUs
»Overwrite the existing Ethernet header (up to 18B) with:
  Slice ID (16b)
  Code Option (4b)
  Exception Bits (12b)
  MN Frame Offset (16b)
  MN Frame Length (16b)
  Result Index (32b)
  Stats Index (16b) [This is the same on NPUA, NPUB]
»30B for opaque slice data.
  Proper memory alignment required
  This is written by Parse, not AddShim!
Status:
»Written. Works for properly aligned packets. Needs optimization.

[Figure: NPE Version 2 Block Diagram (repeated), with the TxA input format: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

TxA
Sends shim packet to NPUB.
Unmodified 10 Gbps Tx 2×ME.

SPP Version2 NPUA to NPUB Frame
SHIM (16B)
»Slice ID (16b)
»Code Option (4b)
»Exception Bits (12b)
»Result Index (32b)
»Stats Index (16b)
»Offset of MN Packet (16b)
»Length of MN Packet (16b)
»Memory Alignment Padding (2B)
IP Header, UDP Header may be overwritten by:
»opaque slice data, written in Parse
[Figure: NPUA to NPUB frame layout, marked at 8-byte boundaries, assuming no IP options. SHIM (16B), Type=IP (2B). IP header: Ver/HLen/Tos/Len (4B), ID/Flags/FragOff (4B), TTL (1B), Protocol = UDP (1B), Hdr Cksum (2B), Src Addr (4B), Dst Addr (4B), IP Options (0-40B). UDP header: Src Port (2B), Dst Port (2B), UDP length (2B), UDP checksum (2B). UDP Payload (MN Packet). Ethernet trailer: PAD (nB), CRC (4B).]
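
The 16B shim listed above adds up exactly: 16 + 4 + 12 + 32 + 16 + 16 + 16 bits is 14 bytes, plus 2 bytes of alignment padding. A minimal C sketch of packing and unpacking it follows. The struct and function names are hypothetical, and the byte order of the fields within the shim is an assumption taken from the order of the bullet list, not from the AddShim source.

```c
#include <assert.h>
#include <stdint.h>

#define SHIM_BYTES 16

struct shim {
    uint16_t slice_id;
    uint8_t  code_opt;        /* 4b */
    uint16_t exception_bits;  /* 12b */
    uint32_t result_index;
    uint16_t stats_index;
    uint16_t mn_offset;       /* offset of MN packet */
    uint16_t mn_length;       /* length of MN packet */
};

static void put16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)(v >> 8); p[1] = (uint8_t)v; }
static uint16_t get16(const uint8_t *p) { return (uint16_t)((p[0] << 8) | p[1]); }

/* Serialize the shim; field order within the 16 bytes is assumed. */
void shim_pack(const struct shim *s, uint8_t out[SHIM_BYTES])
{
    assert(s->code_opt <= 0xFu && s->exception_bits <= 0xFFFu);
    put16(out + 0, s->slice_id);
    put16(out + 2, (uint16_t)((s->code_opt << 12) | s->exception_bits));
    put16(out + 4, (uint16_t)(s->result_index >> 16));
    put16(out + 6, (uint16_t)s->result_index);
    put16(out + 8, s->stats_index);
    put16(out + 10, s->mn_offset);
    put16(out + 12, s->mn_length);
    out[14] = 0;              /* memory alignment padding (2B) */
    out[15] = 0;
}

void shim_unpack(const uint8_t in[SHIM_BYTES], struct shim *s)
{
    uint16_t code_exc = get16(in + 2);
    s->slice_id       = get16(in + 0);
    s->code_opt       = (uint8_t)(code_exc >> 12);
    s->exception_bits = code_exc & 0x0FFFu;
    s->result_index   = ((uint32_t)get16(in + 4) << 16) | get16(in + 6);
    s->stats_index    = get16(in + 8);
    s->mn_offset      = get16(in + 10);
    s->mn_length      = get16(in + 12);
}
```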

[Figure: NPE Version 2 Block Diagram (repeated), with the RxB output format: Buffer Handle (24b), Reserved (8b); Eth. Frame Len (16b), Reserved (12b), Port (4b).]

RxB
Needs to switch from NN output to Scratch or SRAM
»Comments in code indicate SRAM should work
»Supporting code seems to be only for scratch rings
  Needs further examination
  DZar notes there are some obscure #define's needed for SRAM rings.

[Figure: NPE Version 2 Block Diagram (repeated), with LookupB/Copy data formats. Input: Buffer Handle (24b), Reserved (8b); Eth. Frame Len (16b), Reserved (12b), Port (4b). Output: Buffer Handle (24b), Reserved (8b); Frame Length (16b), Stats Index (16b); Reserved (12b), QM (2b), Sch (3b), PerSchedQID (15b).]

LookupB/Copy
Inputs:
»Shim packet (buffer handle, frame length)
Outputs:
»Packet (buffer handle, frame length)
»QueueID (QM, Scheduler, Queue ID)
»Stats Index
Initialization:
»ResultTable (unicast+multicast)
»local endpoint table
»Ethernet SAddr
»Per-slice Packet Limits
Functionality (Overview)
»Copy shim header into buffer descriptor
»Look up routing information from result index
»If multicast, make the copies
»Enqueue to correct QM (from ResultTable)
Status
»Written, broken.
»Needs changes to handling of ResultTable; result indices are now absolute, not per-slice.
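
The QueueID output combines three fields that total 20 bits: QM (2b), scheduler (3b), and per-scheduler queue ID (15b). The C sketch below shows one way to pack and unpack it. The names are hypothetical, and putting the QM number in the top bits is an assumption; the slides give the field widths but not the bit order.

```c
#include <assert.h>
#include <stdint.h>

struct qid {
    unsigned qm;      /* which Queue Manager, 2b */
    unsigned sched;   /* scheduler within that QM, 3b */
    unsigned queue;   /* per-scheduler queue ID, 15b */
};

/* Pack into the low 20 bits of a word (QM in the top bits: assumed). */
uint32_t qid_pack(struct qid q)
{
    assert(q.qm <= 0x3u && q.sched <= 0x7u && q.queue <= 0x7FFFu);
    return ((uint32_t)q.qm << 18) | ((uint32_t)q.sched << 15) | q.queue;
}

struct qid qid_unpack(uint32_t v)
{
    struct qid q;
    q.qm    = (v >> 18) & 0x3u;
    q.sched = (v >> 15) & 0x7u;
    q.queue = v & 0x7FFFu;
    return q;
}
```

LookupB/Copy would use the QM field to choose which ring to write to, and pass the remaining bits along for the QM to select the scheduler and queue.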

LookupB/Copy – Code Sketch
if not currently processing mcast packet
  read packet from SRAM ring
  extract shim
  load ResultTable value
  fill buffer descriptor
  if unicast
    if per-slice packet limit permits
      update per-slice packet count
      write to SRAM ring for correct QM. (By qmschedID in result table value).
    else drop buffer
  else
    start mcast processing
    if per-slice packet limit permits
      update per-slice packet count
      fetch first header buffer descriptor
      if payload length ≠ 0
        write ref count into payload descriptor
      else drop payload buffer
    else
      drop buffer
      finish mcast processing
else (Currently processing buffer, have empty header buffer handle)
  fill header buffer descriptor
    only chain if payload buffer is not empty
  if still making copies
    fetch next header buffer descriptor
  else finish mcast processing
  write current header buffer handle to SRAM ring for correct QM. (By qmschedID).
signal next ME

ResultTable – Unicast
Data needed to enqueue, rewrite packet:
»Fanout: Ignored (Memory padding)
»QID
  QMID, SchedID, QID (20b) (Lookup Result)
»Src MI:
  IP Saddr (32b) (Per SchedID Table)
  UDP Sport (16b) (Lookup Result)
»Tunnel Next Hop
  IP DAddr (32b) (Lookup Result)
  UDP DPort (16b) (Lookup Result)
»Chassis Addressing
  Ethernet Dst MAC (48b) (Per SchedID Table)
»Slice Specific Lookup Result Data (?) (Lookup Result)
Ethernet Src MAC
»Should be constant across all pkts.
Results Entry: Fanout (4b), QID (20b), IP DAddr (32b), UDP DPort (16b), UDP SPort (16b), HFIndex (16b)
Per Sched Entry: IP SAddr (32b), Eth DA (48b)

ResultTable – Multicast
Fanout gives the number of copies (0..15)
Data needed per copy on NPUB:
»QID
  QMID, SchedID, QID (20b) (Lookup Result)
»Src MI:
  IP Saddr (32b) (Per SchedID Table)
  UDP Sport (16b) (Lookup Result)
»Tunnel Next Hop
  IP DAddr (32b) (Lookup Result)
  UDP DPort (16b) (Lookup Result)
»Chassis Addressing
  Ethernet Dst MAC (48b) (Per SchedID Table)
»Slice Specific Lookup Result Data (?) (Lookup Result)
Ethernet Src MAC
»Should be constant across all pkts.
Support Multicast but optimize for Unicast
Results Entry (×16): Fanout (4b), QID (20b), IP DAddr (32b), UDP DPort (16b), UDP SPort (16b), HFIndex (16b)
Per Sched Entry: IP SAddr (32b), Eth DA (48b)

[Figure: NPE Version 2 Block Diagram (repeated), with Queue Manager data formats. Input: Buffer Handle (24b), Reserved (8b); Frame Length (16b), Stats Index (16b); Reserved (12b), QM (2b), Sch (3b), PerSchedQID (15b). Output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

QM
No change from V1
»Incorporates change to limit queues by #pkts
Some changes in how control allocates bandwidth
»Need to ensure that slow HdrFmt blocks can’t tie up the system
»Currently looking at worst-case engineering (everyone runs at slowest block speed)

[Figure: NPE Version 2 Block Diagram (repeated), with HdrFmt/SubEncap data formats. Input and output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

HdrFmt / SubEncap
Inputs:
»Buffer Handle
»Remaining inputs come from Buffer Descriptor:
  Multicast or Unicast (from buffer_next)
  Frame length, offset
  HFIndex (index into HFTable, a slice-specific table)
  ResultIndex (for tunnel headers)
Outputs:
»Packet (buffer handle)
  Buffer descriptor contains updated offset and length
Initialization:
»HFTable, containing slice-specific data. For IPv4, this is unused.
»ResultTable, tunnel header information
Functionality:
»Substrate level:
  read buffer descriptor and pass frame offset, length, HFIndex, mcast/ucast to slice-specific HdrFmt
»Slice level:
  arbitrary processing. For IPv4, this writes any next-hop information. Except for redirects such as exception packets, effectively does nothing.
»Substrate level:
  Encapsulate for output tunnel (from ResultTable)
  Update stats
Status:
»Revisit multicast model
»Needs Internal Header code (Missing!)

[Figure: NPE Version 2 Block Diagram (repeated), with Scr2NN/Freelist data formats. Input and output: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

Scr2NN/FreelistMgr
Inputs:
»Buffer Handle (possibly chained)
Outputs:
»Buffer Handle (possibly chained)
Initialization:
»None
Functionality:
»Combines Freelist Manager with Scr2NN glue
»FM: Read from scratch ring. Free buffers, correctly handling chained buffers and reference counts.
»Scr2NN: Read from Scratch, write to NN.
Status:
»Needs to be reworked from scratch; my method of combining was wrong and could (probably would) deadlock.
»Both blocks exist, but combining them is not straightforward.
  Open question: how should we prioritize among these tasks? The author should ensure that no deadlock is possible. (TxB writes to FM; if FM ring is full, TxB stalls. If Scr2NN is writing to TxB, it stalls. Gridlock.)
»As of August 2009, we'll use a temporary 4×4 thread split and revisit later.
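
The Freelist Manager's job of freeing chained buffers with reference counts can be sketched as below. This is an illustration under stated assumptions, not the FM implementation: the descriptor is reduced to three hypothetical fields, buffers live in ordinary memory, and "returning to the freelist" is modeled as setting a flag. It assumes a multicast copy is a header buffer (ref count 1) chained to a shared payload buffer whose ref count equals the number of copies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Reduced buffer descriptor (hypothetical field names). */
struct buf_desc {
    struct buf_desc *buffer_next;  /* next buffer in the chain, or NULL */
    uint8_t ref_cnt;               /* number of outstanding references */
    int on_freelist;               /* set once the buffer has been freed */
};

/* Drops one reference along a chain. A buffer whose count reaches zero is
 * freed and the walk continues; a buffer that is still referenced by other
 * copies stops the walk. Returns how many buffers were freed. */
unsigned release_chain(struct buf_desc *b)
{
    unsigned freed = 0;
    while (b != NULL) {
        struct buf_desc *next = b->buffer_next;
        assert(b->ref_cnt > 0);
        if (--b->ref_cnt > 0)
            break;                 /* shared payload still in use */
        b->buffer_next = NULL;
        b->on_freelist = 1;
        freed++;
        b = next;
    }
    return freed;
}
```

With two multicast copies sharing one payload, releasing the first header frees one buffer and releasing the second frees both the header and the payload.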

[Figure: NPE Version 2 Block Diagram (repeated), with the TxB input format: Buffer Handle (24b), Rsv (3b), Intf (4b), V (1b).]

TxB
Must support chained buffers
»Multicast uses header buffers and payload buffers
»Headers are slice-specific; we can’t rely on known, static lengths as we did in ONL.
Sends header from one buffer, payload from chained buffer.
»Can TX do this? Comments in the code seem to imply that chained (non-SOP) buffers must start at offset 0. Our payloads usually won’t.
  This will probably take some TX modification, but there’s no reason why it won’t work. Might have a performance penalty, of course…. [DZar]

SPP V2 SideB SRAM Buffer Descriptor
[Figure: buffer descriptor of 8 longwords, LW0..LW7. LW0: Buffer_Next (32b). LW7: Packet_Next (32b). Fields in LW1..LW6: Buffer_Size (16b), Offset (16b), Packet_Size (16b), Free_list 0000 (4b), Reserved (4b), Ref_Cnt (8b), Stats Index (16b), Reserved (4b), Slice ID (xsid) (12b), ResultIndex (32b), MR Exception Bits (16b), HFIndex (16b), MR Bits (optional) (32b). Writer annotations: Written by Freelist Mgr; Written by Rx; Written by LookupB/Copy; Written by QM; Written by Rx or LookupB/Copy. Ref_Cnt (8b): written by Rx, added to by Copy, decremented by Freelist Mgr.]

SPP V2 SideB SRAM Buffer Descriptor
HFIndex is an index into the HFTable. Unused in IPv4.
»May not be needed in Buffer Descriptor, since SubstrateEncap can fetch it using ResultIndex
ResultIndex is used to get tunnel header info from the ResultTable
[Figure: the same SideB SRAM buffer descriptor layout, without the writer annotations.]

SPP v2 Control
New data path adds new Control requirements
Heterogeneous MNet execution times
»Control must select parameters for LCI QMs, NPUB QMs to avoid Parse, HdrFmt execution lag
Slice is now partial VLAN tag
»Must ensure all VLAN tags have distinct low 11b
Filter/Results now split across NPUA, NPUB
»Must coordinate updates to multiple data locations
»Synchronization issues require some care in Control
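
The requirement that all VLAN tags have distinct low 11 bits is easy to check when Control assigns tags. The following C sketch, with a hypothetical function name, scans a list of tags and reports the first one that collides with an earlier tag once truncated to 11 bits.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns the index of the first tag whose low 11 bits repeat those of an
 * earlier tag in the list, or -1 if all slice IDs are distinct. */
int first_low11_collision(const uint16_t *tags, size_t n)
{
    uint8_t seen[2048 / 8] = {0};          /* one bit per possible slice ID */
    assert(tags != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        unsigned id = tags[i] & 0x7FFu;    /* the data path only sees 11 bits */
        if (seen[id / 8] & (1u << (id % 8)))
            return (int)i;
        seen[id / 8] |= (uint8_t)(1u << (id % 8));
    }
    return -1;
}
```

For example, tags 0x0AA and 0x08AA are different VLANs but map to the same slice ID 0x0AA, so the second one would be rejected.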

SPP v2 Control
NPUA Data areas requiring Control setup
NPE MAC address at
»IPV4_SD_MAC_ADDR_HI32
»IPV4_SD_MAC_ADDR_lo16
VLAN Table
»Used by Decap, Parse
»Maps VLANs to code options, data areas
»2048-entry table at PL_SD_VLAN_CODE_OPT_TABLE_BASE
  struct {
    unsigned int code_opt;         // only 4 lsb used
    unsigned int slice_data_ptr;
    unsigned int slice_data_size;
  }

SPP v2 Control
Data areas requiring Control setup
VLAN Table (continued)
»Pointer to slice-specific SRAM areas
»Slice owners request amount needed (IPv4 code option needs 72B for counters)
»Control must pass along Slice owner initialization data
»Control can allocate in any 4B aligned location within Bank 3 addresses 0x300000..0x7FFFFF (upper 5MB of BANK3)
»Each slice-specific region must be at least SLICE_DATA_ENTRY_SIZE_MINIMUM (56B) in size
»Each code option has different additional size needs
  E.g., for IPv4, 56+72 = 128B total
  E.g., for i3, 56+3200 = 3256B total
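
The allocation rules above (4B alignment, Bank 3 range 0x300000..0x7FFFFF, minimum region size) can be expressed as a small validation routine that Control might run before writing a VLAN table entry. This is a sketch: `slice_region_ok` and the two range constants are hypothetical names; only `SLICE_DATA_ENTRY_SIZE_MINIMUM` comes from the slides.

```c
#include <assert.h>
#include <stdint.h>

#define SLICE_DATA_ENTRY_SIZE_MINIMUM 56u
#define SLICE_AREA_FIRST 0x300000u   /* first usable Bank 3 address */
#define SLICE_AREA_LAST  0x7FFFFFu   /* last usable Bank 3 address */

/* Returns 1 if [ptr, ptr + size) is a legal slice data region, else 0. */
int slice_region_ok(uint32_t ptr, uint32_t size)
{
    if (ptr % 4u != 0) return 0;                          /* 4B aligned */
    if (size < SLICE_DATA_ENTRY_SIZE_MINIMUM) return 0;   /* substrate minimum */
    if (ptr < SLICE_AREA_FIRST || ptr > SLICE_AREA_LAST) return 0;
    return size - 1u <= SLICE_AREA_LAST - ptr;            /* must not run past the end */
}

/* Total region size for a code option needing `extra` bytes of its own. */
uint32_t slice_region_size(uint32_t extra)
{
    assert(extra <= UINT32_MAX - SLICE_DATA_ENTRY_SIZE_MINIMUM);
    return SLICE_DATA_ENTRY_SIZE_MINIMUM + extra;
}
```

For the i3 example, `slice_region_size(3200)` gives the 3256B total quoted above.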

SPP v2 Control
Data areas requiring Control setup
TCAM filters
»Used by LookupA
»Tightly interlinked with tables on NPUB

SPP v2 Control
NPUB Data areas requiring Control setup
NPE source MAC address (HdrFmt/SubstrateEncap)
»LC_MAC_ADDR_HI32
»LC_MAC_ADDR_LO32
Per-Slice (2048) packet limits table (LookupB/Copy) at LC_PER_SLICE_PACKET_LIMIT_BASE
  struct {
    unsigned int current;
    unsigned int maximum;
    unsigned int overLimits;
  }
Queue Manager parameters
»Must properly rate limit both bandwidth and slow HdrFmt code options
»No heterogeneous HdrFmt code options yet
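
One plausible reading of the per-slice packet limits entry is sketched below in C: `current` counts packets the slice has in the system, `maximum` is the cap set by Control, and `overLimits` counts packets refused because of the cap. These field semantics and the function names are assumptions; the slides give only the field names. The real code would also need atomic SRAM operations, since several microengine threads update the same entry.

```c
#include <assert.h>
#include <stddef.h>

struct slice_pkt_limit {
    unsigned int current;     /* packets currently held for this slice (assumed) */
    unsigned int maximum;     /* limit configured by Control (assumed) */
    unsigned int overLimits;  /* packets dropped for exceeding the limit (assumed) */
};

/* Called when a packet arrives. Returns 1 if the packet may be enqueued,
 * 0 if it must be dropped. */
int slice_pkt_admit(struct slice_pkt_limit *l)
{
    assert(l != NULL);
    if (l->current >= l->maximum) {
        l->overLimits++;
        return 0;
    }
    l->current++;
    return 1;
}

/* Called when a packet of this slice leaves the system. */
void slice_pkt_release(struct slice_pkt_limit *l)
{
    assert(l != NULL && l->current > 0);
    l->current--;
}
```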

SPP v2 Control
NPUB Data areas requiring Control setup
Result Table
»Used by LookupB/Copy, HdrFmt/SubstrateEncap
»Results corresponding to TCAM lookups
»Links to per-QM scheduler tunnel endpoint values
»Also links to per-slice HdrFmt data areas

Filters and Results
Slice owner maps filters to results
»Filter is 144b key, first 32b is substrate's Meta-Interface ID
»Slice owner controls remaining 112b
Results have multiple pieces
»Type: unicast / multicast
»Output QID('s) (associated with Meta-Interface)
  Control translates slice representation to substrate's tunnel
»Index into slice data in HFTable for Header Format to use

Adding (Multicast) Filter
Slice Owner View: x filters, y unicast results, z multicast results
1. Add filter <Meta-Interface In, IP DA, IP SA, DPort, Sport, Proto> with result <Type=Multicast, Result R (in 1..z)>
   Result R = <Fanout, Meta-Interface, Index> [up to 16× entries]
2. Update HFTable (index, length, bytestream)
[Figure: how Control maps the slice owner's request onto substrate tables. Step 1: the filter is mapped to a TCAM entry (slice/RxID/Dport key, ResultIndex); the result is copied into the Multicast ResultTable (fanout; qm, sched, qid; Tunnel: DA/DPort/SPort; up to 16 entries); the Meta-Interface is mapped to the Local "subnet" table (Tunnel SA, Eth DA; 32 entries); the HFIndex is range-validated and mapped to the HFTable. Step 2: the opaque next-hop data is range-validated and copied into the HFTable.]

Filters and Results
First, some things to remember:
»This is the NPE: we are supporting protocols that may not be IP!
»Order of filters in a TCAM database defines those filters’ priority
  Lower numbered filter indices are higher priority
»TCAM filter lookup is done on the A-Side.
»TCAM filter result gives us a pointer to a full result which resides in SRAM on the B-Side.
  Thus the A-Side filter and the B-Side result need not be a 1-1 mapping
  We could have many filters using the same B-Side result.
»We are supporting Unicast and Multicast filters and results
  Multicast supports a maximum fanout of 16.

Filters and Results (continued)
Slice owners allocate N unicast filters and M multicast filters.
»They get:
  N+M Filter ids (0 – (N+M-1))
    Contiguous in the TCAM
    Order in TCAM indicates priority, lower id higher priority
  N Unicast Result indices (0 – (N-1))
    Contiguous in the Unicast portion of Result table
  M Multicast Result indices (0 – (M-1))
    Contiguous in the Multicast portion of the Result table
»Filter id and Result index (unicast or multicast) are referenced separately.
  Example: Filter id 4 might use unicast result index 12
»Unicast and Multicast filters in TCAM can be mingled.
  Remember: Order in TCAM is important.
  Example: A unicast catch-all (all wildcards) filter should probably be the LAST filter in a slice’s set of filters so it does not override other filters including multicast filters.
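
The priority rule can be illustrated with a software model of a ternary match: filters are checked in index order and the first match wins, so a catch-all placed early hides everything after it. This C sketch is only a model of the behavior described above; the struct and function names are hypothetical, and a real TCAM performs the comparison in hardware across all entries at once.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define KEY_BYTES 18   /* 144b key */

struct filter {
    uint8_t value[KEY_BYTES];
    uint8_t mask[KEY_BYTES];   /* 1 bits are compared, 0 bits are wildcards */
    int valid;
};

/* Returns the index of the lowest-numbered valid filter matching the key,
 * or -1 on a miss (LookupA drops the packet on a miss). */
int tcam_lookup(const struct filter *f, size_t n, const uint8_t key[KEY_BYTES])
{
    assert(f != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!f[i].valid)
            continue;
        int match = 1;
        for (size_t b = 0; b < KEY_BYTES; b++) {
            if ((key[b] ^ f[i].value[b]) & f[i].mask[b]) {
                match = 0;
                break;
            }
        }
        if (match)
            return (int)i;     /* lower index means higher priority */
    }
    return -1;
}
```

If filter 0 is an all-wildcard catch-all, `tcam_lookup` returns 0 for every key and a more specific filter at index 1 is never reached, which is why the catch-all should go last.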

Filters and Results (continued)
Slice owners will have the ability to disable a filter.
»Control removes the filter from the TCAM (LookupA)
»Result is left on NPUB for "in-flight" packets
Slice owners can also remove a filter
»This deletes the results from the B-side

Filter / Result Operations
TCAM Result (A-side): Type (1b), ResultIndex (31b); MC Result Bit Mask (16b), Stats Index (16b)
Result Table (B-side):
»Unicast: N Results, 16B per Result
»Multicast: M Blocks, 16 Results per Block, 16B per Result
Result Entry (B-side), 16B: Valid (1b), Pad (11b), QID (20b); IP DAddr (32b); UDP DPort (16b), UDP SPort (16b); HFIndex (16b), Pad (16b)
If we use entire SRAM Bank:
»SRAM Banks are 8MB
»Result size is 16B
»TCAM has 128K 144b entries
»N + M = 128K
»(N + 16*M) * 16 <= 8MB
»N = 104858, M = 26214
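
The N and M figures follow from the two constraints: an 8MB bank holds 8MB / 16B = 524288 result slots, and substituting N = 128K − M into N + 16·M <= 524288 gives 15·M <= 393216, so M = 26214 and N = 104858. A small C sketch of the same arithmetic, with hypothetical names, is shown here so the split can be recomputed if the bank size, result size, or TCAM size changes.

```c
#include <assert.h>
#include <stdint.h>

struct result_split {
    uint32_t n_unicast;     /* N: unicast results */
    uint32_t m_multicast;   /* M: multicast blocks */
};

/* Largest M such that N + M = tcam_entries and
 * (N + mc_block * M) * result_bytes <= bank_bytes. */
struct result_split max_result_split(uint32_t tcam_entries, uint32_t bank_bytes,
                                     uint32_t result_bytes, uint32_t mc_block)
{
    uint32_t slots = bank_bytes / result_bytes;   /* result slots in the bank */
    assert(mc_block > 1 && slots >= tcam_entries);

    struct result_split s;
    /* N + mc_block*M = tcam_entries + (mc_block - 1)*M <= slots */
    s.m_multicast = (slots - tcam_entries) / (mc_block - 1);
    if (s.m_multicast > tcam_entries)
        s.m_multicast = tcam_entries;
    s.n_unicast = tcam_entries - s.m_multicast;
    return s;
}
```

Calling `max_result_split(128 * 1024, 8 * 1024 * 1024, 16, 16)` reproduces the values on the slide.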

Filter / Result Operations
add_mc_filter(fid, RxMI, Key, Mask, mcResultIndex, statIndex)
update_mc_filter(fid, mcResultIndex, resultMask)
add_mc_result(fid, mcResultIndex, entryIndex, Qinfo, DestInfo)
update_mc_result(fid, mcResultIndex, entryIndex, Qinfo, DestInfo)
remove_mc_filter(fid)
remove_mc_result(mcResultIndex)
add_uc_filter(fid, RxMI, Key, Mask, ucResultIndex, statIndex)
update_uc_filter(fid, ucResultIndex, statIndex)
add_uc_result(fid, ucResultIndex, Qinfo, DestInfo)
update_uc_result(fid, ucResultIndex, Qinfo, DestInfo)
remove_uc_filter(fid)
remove_uc_result(ucResultIndex)
TCAM Result (A-side): Type (1b), ResultIndex (31b); Result Bit Mask (16b), Stats Index (16b)
Result Entry (B-side): Valid (1b), QID (20b), IP DAddr (32b), UDP DPort (16b), UDP SPort (16b), HFIndex (16b)

Multicast Filter / Result Operations
add_mc_filter(fid, RxMI, Key, mcResultIndex, resultMask, statIndex)
»Adds multicast filter to TCAM
update_mc_filter(fid, mcResultIndex, resultMask, statIndex)
»Updates (re-writes) the TCAM result
add_mc_result(mcResultIndex, entryIndex, Qinfo, DestInfo)
»Writes a MC result entry into Result Table
»Marks result as valid
update_mc_result(mcResultIndex, entryIndex, Qinfo, DestInfo)
»Updates (re-writes) a MC result entry in the Result Table
»Marks result as valid
»Implementation will almost certainly be same as add_mc_result so why have both?
remove_mc_filter(fid)
»Removes the filter from the TCAM, leaves B-side results unchanged.
remove_mc_result(mcResultIndex)
»Invalidates a multicast filter result

Unicast Filter / Result Operations
add_uc_filter(fid, RxMI, Key, ucResultIndex, statIndex)
»Adds unicast filter to TCAM
update_uc_filter(fid, ucResultIndex, statIndex)
»Updates (re-writes) the TCAM result
add_uc_result(ucResultIndex, Qinfo, DestInfo)
»Writes a UC result entry into the Result table
»Marks result as valid
update_uc_result(ucResultIndex, Qinfo, DestInfo)
»Updates (re-writes) a UC result entry in the Result table
»Marks result as valid
»Implementation will almost certainly be same as add_uc_result so why have both?
remove_uc_filter(fid)
»Removes the filter from the TCAM, leaves the B-Side results unchanged
remove_uc_result(ucResultIndex)
»Invalidates a unicast filter result

Extra Slides
The rest of the slides are old or for extra information

Design Questions
Small hole for abuse in HdrFmt
»QM rate limits on payload length
»HdrFmt (after QM) can vastly increase packet length
»Should the LookupB table give the padding size for each entry?
  Enforced in SubEncap?
»ANSWER: No, we will resort to our control of HdrFmt to force it to behave. (We write all of the code options right now.)
What are the best places to update stats on NPUB?
»ANSWER: Post-Q only
Is there any remaining reason that NPUB would need the source tunnel information?
»ANSWER: No. If a code option needs it, put it into opaque slice data.

Questions/Issues
4/28/08:
»How many code options?
  Limit of 16?
»To handle slow Code Options:
  LCI Queues would control traffic to Fast/Slow Parse Code
    Classes of code options defined by how long their Parse code takes.
    Scheduler assigned to a class of code option.
  NPE Queues would control traffic to Fast/Slow HF Code
  LCE Queues control the output rate to Interfaces.
»Multicast Problems:
  Impact of multicast traffic overloading Lookup/Copy and becoming a bottleneck.
»Rx on SideB, can it use SRAM output ring?
  All our other 10G Rx’s have NN output ring.
»Option for HF to send out additional pkts?
»How to pass MR and substrate hdrs to TxB?
  Through Ring or through Hdr Buffer associated with Hdr Buffer descriptor.
  If the latter then what are the constraints in Tx for buffer chaining?
Meeting Notes
1/15/08:
» QM: Add Pkt count to Queue Params, change limit from QLen to PktCount
» Add Per Slice Pkt limit to NPUA and NPUB
» Limit Fanout to 16
» MCast: Control will allocate all 16 entries for a multicast result entry. A result entry will be typed as multicast or unicast and will not transition from one to the other.
» What happens to pkts in queues when there is a route change that sends that flow’s pkts to a different interface and queue? Pkt ordering problems?
NPE Version 2 Block Diagram
[Figure: block diagram. Labels: Switch Blade, GPE, SPI Switch, NPUA, NPUB, TCAM, SRAM, "flow control?"]
NPUA blocks: RxA (2 ME); Decap, Parse, LookupA, AddShim (8 MEs); TxA (2 ME); Stats (1 ME)
NPUB blocks: RxB (2 ME); LookupB&Copy (2 ME); QueueManager (4 MEs); HdrFmt (4 MEs); TxB (2 ME); Stats (1 ME)
Diagram annotations:
» Lookup produces resultIndx, statsIndx
» slice#, resultIndx, etc., passed in shim
» Lookup on <slice#, resultIndx> yields fanout, list of QiDs; copy to queues, adding copy#; (slice#, resultIndx remain in packet buffer)
» Use slice# to select slice to format packet; use resultIndx to get next-hop
» For unicast, resultIndx replaced by QiD, allowing output side to skip lookup
Questions/Issues
Where are exit and entry points for packets sent to and from the GPE for exception processing?
» Parse (NPUA) and LookupA (NPUA) are where most exceptions are generated:
    IP Options
    No Route
    Etc.
» HdrFormat (NPUB) is where we do Ethernet header processing
What needs to be in the SHIM going from NPUA to NPUB?
» ResultIndex (32b)
» Exception Bits (12b)
» StatsIndex (16b)
» Slice# (12b)
» ???
Will we support multi-copy in a way similar to the ONL Router?
How big can the fanout be?
» How many QIDs need to be stored with the LookupB Result?
Is there some encoding for the QIDs that can take into account support for multicast and the copy#? For example:
    Multicast QID (20b)
        – Multicast (1b): 1
        – Copy# (4b)
        – PerMulticast QID (15b): One PerMulticast QID allocated for each Multicast
    Unicast QID (20b)
        – Unicast (1b): 0
        – QID (19b)
Are there timing/synchronization issues with adding, deleting or changing lookup entries between the two NPUs' databases?
Do we need flow control between TxA and RxB?
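The example 20-bit QID encoding above can be sketched as follows. The slide gives only the field widths; placing the multicast flag in the top bit, Copy# next, and the per-multicast QID in the low 15 bits is an assumption of this sketch, as are the function names.

```python
QID_BITS = 20
MCAST_FLAG = 1 << (QID_BITS - 1)  # assumed position: most significant bit
COPY_SHIFT = 15                   # assumed: Copy# sits above the 15-bit QID
COPY_MASK = 0xF                   # 4 bits, so at most 16 copies
PER_MC_QID_MASK = (1 << 15) - 1
UC_QID_MASK = (1 << 19) - 1


def encode_mc_qid(copy_num, per_mc_qid):
    """Multicast QID: flag=1, Copy# (4b), PerMulticast QID (15b)."""
    if not 0 <= copy_num <= COPY_MASK:
        raise ValueError("copy# must fit in 4 bits")
    if not 0 <= per_mc_qid <= PER_MC_QID_MASK:
        raise ValueError("per-multicast QID must fit in 15 bits")
    return MCAST_FLAG | (copy_num << COPY_SHIFT) | per_mc_qid


def encode_uc_qid(qid):
    """Unicast QID: flag=0, QID (19b)."""
    if not 0 <= qid <= UC_QID_MASK:
        raise ValueError("unicast QID must fit in 19 bits")
    return qid


def decode_qid(value):
    """Return (kind, copy_num, qid); copy_num is None for unicast."""
    if value & MCAST_FLAG:
        return ("multicast", (value >> COPY_SHIFT) & COPY_MASK,
                value & PER_MC_QID_MASK)
    return ("unicast", None, value & UC_QID_MASK)
```

A 4-bit Copy# field is consistent with the fanout limit of 16 recorded in the meeting notes.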
NPE Version 2 Block Diagram
[Figure: NPE Version 2 block diagram, repeated]
NPUA:
» RxA: Same as Version 0
» TxA: New 10Gb/s
» Decap: Same as Version 0
» Parse: Same as Version 0
    New code options?
» LookupA: Results will be different from Version 0
» AddShim: New
NPE Version 2 Block Diagram
[Figure: NPE Version 2 block diagram, repeated]
NPUB:
» RxB: Same as Version 0
» TxB: New 10Gb/s
    With L2 Header coming in on input ring?
» LookupB: New
» Copy: New, may be able to use some code from ONL Copy
» QM: New, decoupled from Links
» HF: New, may use some code from Version 0
NPE Version 2 Block Diagram
[Figure: NPE Version 2 block diagram, repeated, with the Stats blocks labeled StatsA (1 ME) and StatsB (1 ME) and these additional blocks: FreeList MgrA (1 ME), FreeList MgrB (1 ME), Scr2NN (1 ME), Sram2NN (1 ME)]
NPUB has 17 MEs currently spec’ed
SPP V2: MR Specific Code
Where does the MR Specific Code reside in V2?
» Parse
» HdrFormat
What about LookupA and LookupB?
» Lookup is a “service” provided to the MRs by the Substrate.
» No MR Specific code needed in LookupA or LookupB.
What about SideA AddShim?
» The Exception bits that go in the shim are MR Specific, but they should be passed to AddShim, which will write them into the Shim.
» No MR Specific code needed in AddShim.
What about SideB Copy?
» Is there anything MR specific about setting up multiple copies of a packet? There shouldn’t be.
    We will have the Copy block allocate a new hdr buffer descriptor, link it to the existing data buffer descriptor, and take care of reference counts.
    The actual building of the new header(s) for the copies will be left to HF.
» No MR Specific code needed in Copy.
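The Copy behaviour described above (one new header buffer descriptor per copy, all linked to the same data buffer descriptor, with a reference count) might look roughly like this. It is a toy model under assumed names (DataBufDesc, HdrBufDesc, make_copies, release); the real descriptors live in SRAM and their layout is not given in these slides.

```python
from dataclasses import dataclass


@dataclass
class DataBufDesc:
    payload: bytes
    ref_count: int = 0  # number of header descriptors pointing at this buffer


@dataclass
class HdrBufDesc:
    data: DataBufDesc  # link to the shared data buffer descriptor
    copy_num: int
    qid: int


def make_copies(data, qids):
    """Allocate one header descriptor per destination queue.

    No MR-specific work happens here; building the actual headers is
    left to HF, as the slide says.
    """
    copies = []
    for copy_num, qid in enumerate(qids):
        data.ref_count += 1
        copies.append(HdrBufDesc(data, copy_num, qid))
    return copies


def release(hdr):
    """Drop one reference; return True when the data buffer can be freed."""
    hdr.data.ref_count -= 1
    return hdr.data.ref_count == 0
```

Only the last copy to be released reports that the shared data buffer can be freed, which is the point of keeping the count on the data buffer descriptor.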
SPP V2: Hdr Format
Lots of changes for HF:
» Move behind QM
» More general:
    Support multiple source IP Addresses
    General support for Tunnels
    Eventually different kinds of tunnels (UDP/IP, GRE, …)?
» Support for Multicast
    Dealing with header buffer descriptors
    Reading Fanout table
» Substrate portion of HF will need to do Decap type table lookup
    Slice ID (Code Option, Slice Memory Pointer, Slice Memory Size)
HF gets a buffer descriptor from the QM
» The Substrate portion of HF must determine:
    Code Option (8b)
    Slice ID (12b)
    Location of Next Hop information (20b - 32b)
    LD vs. FWD?
    Stats Index (16b)
        Should HF do this or QM?
» The MR portion of HF must determine:
    Exception bits (16b)
Let's put all of the above data in the Buf Desc
» LookupB/Copy will need to write it there based on what comes across from SideA in the shim
SPP V2: Result
We need to be much more general in our support for Tunnels, Interfaces, MetaInterfaces, and Next Hops.
SideB Result:
» Interface:
    IP SAddr (32b)
    Eth MAC DAddr (48b) (LC, GPE1, GPE2, …, GPEn)
    SchedulerId (8b): which QM should handle pkt
» TxMI:
    IP SPort (16b)
» TxNextHop:
    IP DAddr (32b)
    IP DPort (16b)
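The SideB Result fields listed above add up to 152 bits (19 bytes). One way to picture the record is a packed structure. The field order, the network byte order and the names SideBResult and RESULT_FMT are assumptions of this sketch; the slides give only the field names and widths.

```python
import ipaddress
import struct
from dataclasses import dataclass

# Assumed layout, no padding: IP SAddr (4 bytes), Eth MAC DAddr (6),
# SchedulerId (1), IP SPort (2), IP DAddr (4), IP DPort (2) = 19 bytes.
RESULT_FMT = "!4s6sBH4sH"


@dataclass
class SideBResult:
    ip_saddr: str       # Interface: source IP address
    eth_daddr: bytes    # Interface: destination MAC (LC or a GPE)
    scheduler_id: int   # Interface: which QM should handle the packet
    ip_sport: int       # TxMI: source UDP port
    ip_daddr: str       # TxNextHop: destination IP address
    ip_dport: int       # TxNextHop: destination UDP port

    def pack(self):
        return struct.pack(
            RESULT_FMT,
            ipaddress.IPv4Address(self.ip_saddr).packed,
            self.eth_daddr,
            self.scheduler_id,
            self.ip_sport,
            ipaddress.IPv4Address(self.ip_daddr).packed,
            self.ip_dport,
        )

    @classmethod
    def unpack(cls, raw):
        saddr, mac, sched, sport, daddr, dport = struct.unpack(RESULT_FMT, raw)
        return cls(str(ipaddress.IPv4Address(saddr)), mac, sched, sport,
                   str(ipaddress.IPv4Address(daddr)), dport)
```

Grouping the fields as Interface, TxMI and TxNextHop in the comments mirrors the slide; an implementation could equally store each group in its own table and keep only indices in the result.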
Data Areas
Where are the tables and what data is transmitted from SideA to SideB?
SideA Tables
Shim between SideA and SideB
SideB Tables
Pkt Processing Data and Tables
SideA:
» MR/Slice Table:
    Generated by Control
    Used by:
        Substrate Decap to retrieve a MR/Slice’s parameters
    Indexed by SliceId == VLAN
    Contains:
        – Code option
        – Slice Memory ptr
        – Slice Memory size
        – ???
» TCAM:
    Generated by Control
    Used by:
        LookupA
    Contains:
        Key:
        Result:
Data Areas
Shim between SideA and SideB
» Written to DRAM Buffer to be sent from SideA to SideB
» Contains:
    resultIndex (32b):
        Generated by Control
        Result of TCAM lookup on SideA
        Translates into an SRAM Address on SideB
    exceptionBits (12b)
        Generated by SideA Parse/Lookup
        Used by:
            – SideB HF
    statsIndex (16b)
        Generated by Control
        Result of TCAM lookup on SideA
        Used by:
            – SideA Lookup/AddShim to increment counters
            – SideB Lookup/Copy to increment PreQ Cntrs (or perhaps the SideA counters are the PreQ Cntrs)
            – SideB HF or QM to increment PostQ Cntrs
    sliceId (12b)
        Generated by Control
        Result of Decap read of Ethernet hdr (VLAN)
        Used by:
            – ???
    codeOption (4b)
    Slice Memory Ptr (32b)
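The shim fields listed above total 108 bits. A minimal sketch of packing and unpacking them follows, assuming the fields are laid out in the order listed, most significant first, and padded with zero bits to 14 bytes. The ordering, the padding and the function names are assumptions; the slides specify only the field names and widths.

```python
# (name, width in bits), in the order the slide lists them.
SHIM_FIELDS = [
    ("resultIndex", 32),
    ("exceptionBits", 12),
    ("statsIndex", 16),
    ("sliceId", 12),
    ("codeOption", 4),
    ("sliceMemPtr", 32),
]
SHIM_BITS = sum(width for _, width in SHIM_FIELDS)  # 108
SHIM_BYTES = (SHIM_BITS + 7) // 8                   # 14
PAD_BITS = SHIM_BYTES * 8 - SHIM_BITS               # 4


def pack_shim(**values):
    """Pack the named fields into SHIM_BYTES bytes, big-endian."""
    word = 0
    for name, width in SHIM_FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return (word << PAD_BITS).to_bytes(SHIM_BYTES, "big")


def unpack_shim(raw):
    """Inverse of pack_shim; returns a dict of field values."""
    word = int.from_bytes(raw, "big") >> PAD_BITS
    fields = {}
    for name, width in reversed(SHIM_FIELDS):
        fields[name] = word & ((1 << width) - 1)
        word >>= width
    return fields
```

The range check in pack_shim is where a too-wide value would be caught, for example a sliceId that does not fit the 12-bit VLAN-sized field.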
Data Areas
SideB
» Data Buffer Descriptor
» Hdr Buffer Descriptor
    Used for multi-copy packets
    SPP V2 may require Tx to handle multi-buffer packets.
    It is unclear if we can cleanly do the same thing that we do with ONL, where HF passes the Ethernet header to Tx.
    We may also need to have support for MR specific per copy data
» Results Table
    Generated by Control
    Used by:
        LookupB/Copy
        HF
            – Should HF get its per copy info from here as well?
    Contains:
        Fanout (if fanout is > 1 we can overload some of the following fields with a pointer into a Fanout table)
        QID
        InterfaceId
        TxMI Id
            – Probably doesn’t help to make it an index into a table for UDP Tunnels since UDP Port is 16 bits
            – But for tunnels other than UDP tunnels it may help?
        Tx NextHop Id
            – Index into a table of Tunnel Next Hops
Data Areas (continued)
SideB (continued)
» Fanout Table
    Generated by Control
    Used by:
        LookupB/Copy
        HF
    Contains:
        QID[Fanout]
        InterfaceId
        TxMI Id
        Tx Next Hop ID[Fanout]
    Implementation Choices:
        One contiguous block of memory
            – Fixed size or variable sized
        Chained with one set of values per entry
        Chained with N (N=4?) sets of values per entry
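Of the implementation choices above, the "chained with N sets of values per entry" option can be sketched as below, with N = 4 and the fanout limit of 16 from the meeting notes. The names FanoutEntry, build_chain and walk_chain, the use of a Python list as the table, and the (QID, next hop ID) tuple for one set of values are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

SETS_PER_ENTRY = 4  # the "N=4?" from the slide
MAX_FANOUT = 16     # fanout limit from the meeting notes


@dataclass
class FanoutEntry:
    sets: List[Tuple[int, int]]       # up to N (qid, next_hop_id) pairs
    next_index: Optional[int] = None  # index of the next entry, None at end


def build_chain(table, copies):
    """Append a chain holding `copies` to `table`; return its first index."""
    if not 1 <= len(copies) <= MAX_FANOUT:
        raise ValueError("fanout must be between 1 and 16")
    chunks = [copies[i:i + SETS_PER_ENTRY]
              for i in range(0, len(copies), SETS_PER_ENTRY)]
    first = len(table)
    for k, chunk in enumerate(chunks):
        is_last = k + 1 == len(chunks)
        table.append(FanoutEntry(list(chunk),
                                 None if is_last else first + k + 1))
    return first


def walk_chain(table, index):
    """Collect every (qid, next_hop_id) pair reachable from `index`."""
    out = []
    while index is not None:
        entry = table[index]
        out.extend(entry.sets)
        index = entry.next_index
    return out
```

With N = 4, a full fanout of 16 needs four chained entries, while a fanout of 2 still occupies a whole entry; that trade-off is what separates this option from the one-set-per-entry and contiguous-block choices.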