DESIGN AND IMPLEMENTATION OF A MULTI-DIMENSIONAL PACKET CLASSIFIER FOR NETWORK PROCESSORS

Ing. Fabio Vitucci

Gruppo RETI di TELECOMUNICAZIONI
Dipartimento di Ingegneria dell’Informazione - Università di Pisa

VIII Workshop PisaTel - December 6th, 2005 - SSSUP

Outline

• Summary of previous activities
• Implementation of the classification module
• Programming problems
• Measurements
• Future work

• Conclusions

Summary of previous activities/1

• Detailed analysis of the Intel® IXP2400 Network Processor and of the available board (RadiSys ENP-2611)
• Choice of a suitable application to implement on NPs: packet classification
• Comparative analysis of several algorithms from the research literature

Source Address   Layer 4 Destination   Layer 4 Protocol   ...   Rule
11.14.2.21       www                   TCP                ...   R1
13.11.23.*       gt 1023               TCP                ...   R2
112.*.*.*        www                   UDP                ...   R3
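
To make the rule table concrete, the sketch below classifies a packet by scanning the three rules in order (plain linear search, not the trie used later in the talk). It assumes that "www" means destination port 80 and that "gt 1023" means any destination port above 1023; the function names are illustrative only.

```python
# Rule table from the slide. Port numbers are assumptions:
# "www" is read as port 80, "gt 1023" as the range 1024..65535.
RULES = [
    ("R1", "11.14.2.21", (80, 80), "TCP"),
    ("R2", "13.11.23.*", (1024, 65535), "TCP"),
    ("R3", "112.*.*.*", (80, 80), "UDP"),
]


def address_matches(pattern, address):
    """Compare dotted-quad octets; '*' in the pattern matches any octet."""
    return all(p == "*" or p == a
               for p, a in zip(pattern.split("."), address.split(".")))


def classify(src, dport, proto):
    """Return the name of the first matching rule, or None (O(N) scan)."""
    for name, pattern, (low, high), rule_proto in RULES:
        if (address_matches(pattern, src)
                and low <= dport <= high
                and proto == rule_proto):
            return name
    return None
```

For instance, `classify("13.11.23.7", 5000, "TCP")` walks past R1 and stops at R2. The cost grows with the number of rules, which is the reason for looking at other data structures.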

Summary of previous activities/2

Comparative analysis of several algorithms from the research literature

Algorithm             Worst-case time   Worst-case storage
Linear Search         O(N)              O(N)
Hierarchical tries    O(W^D)            O(NDW)
Set-pruning tries     O(DW)             O(N^D)
Grid-of-tries         O(W^(D-1))        O(NDW)
Cross-producting      O(DW)             O(N^D)
Area-Based Quadtree   O(NW)             O(W)
FIS-tree              O((L+1)W)         O(L·N^(1+1/L))
RFC                   O(D)              O(N^D)
Bitmap-intersection   O(DW+N/W)         O(D·N^2)
HiCuts                O(D)              O(N^D)
Ternary CAMs          O(1)              O(N)

N = number of entries; W = maximum number of bits per level

D = number of fields to be processed; L = number of levels of the data structure

Summary of previous activities/3

Multidimensional Multibit Trie

• Fields:
  – IP Source Address and IP Destination Address
  – Layer 4 Source Port and Destination Port
  – Layer 4 Protocol Type
• Hierarchical trie: one tree per dimension
  – Several levels per dimension
  – A fixed number of bits per level
• Performance parameters:
  – Search speed: 5×O(W/K)
  – Memory accesses: 12
  – Storage complexity: 5×O(2^(K-1)×N×W/K)

[Figure: five tries, one per field: SA Trie, DA Trie, SP Trie, DP Trie, PR Trie]
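
A minimal sketch of a fixed-stride multibit trie chained over the five fields is given here to illustrate the O(W/K) search cost per dimension. It handles exact-match rules only, uses nested dictionaries instead of SRAM nodes, and assumes a stride of K = 8 bits; the stride, the names and the field widths are illustrative assumptions, not values taken from the talk.

```python
K = 8  # stride: bits consumed per trie level (assumed value)
FIELD_WIDTHS = (32, 32, 16, 16, 8)  # SA, DA, SP, DP, PR, in bits


def chunks(fields):
    """Split each field into K-bit chunks, most significant chunk first."""
    out = []
    for value, width in zip(fields, FIELD_WIDTHS):
        for shift in range(width - K, -1, -K):
            out.append((value >> shift) & ((1 << K) - 1))
    return out


def insert(root, fields, rule):
    """Add an exact-match rule; the last level stores the rule itself."""
    keys = chunks(fields)
    node = root
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = rule


def lookup(root, fields):
    """Return (rule or None, number of levels visited)."""
    node = root
    visited = 0
    for key in chunks(fields):
        visited += 1
        node = node.get(key)
        if node is None:
            return None, visited
    return node, visited
```

With these widths and K = 8, a successful lookup visits 4 + 4 + 2 + 2 + 1 levels, one per K-bit chunk, independently of how many rules are stored.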

Summary of previous activities/4

• Main limitation:
  – Memory consumption
  – Rules with unspecified fields (e.g. 131.114.*.*) must be expanded into all the rules they cover
• Modification:
  – A level transition in case of wildcards
    • Fewer nodes
    • Sometimes more memory accesses
    • More complexity
• Validation tests with a C simulator
  – Large saving in memory consumption (table in SRAM)
  – Small increase in instruction store size
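
The memory cost of wildcard expansion can be illustrated with a small calculation. The sketch assumes one trie level per address octet (stride of 8 bits) and reads the "level transition" as stopping the descent at the first wildcarded octet; both points are assumptions made for illustration.

```python
def parse(pattern):
    """Return (specified octets before the first '*', number of wildcard octets)."""
    octets = pattern.split(".")
    specified = []
    for octet in octets:
        if octet == "*":
            break
        specified.append(int(octet))
    return specified, len(octets) - len(specified)


def entries_with_expansion(pattern):
    """Exact-match entries needed when every wildcard octet is enumerated."""
    _, wildcard_octets = parse(pattern)
    return 256 ** wildcard_octets


def levels_with_transition(pattern):
    """Levels actually traversed if the descent stops at the first wildcard."""
    specified, _ = parse(pattern)
    return len(specified)
```

For the rule 131.114.*.* the expansion needs 256 × 256 entries, while the transition variant only keeps the two specified levels.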

Implementation of module/1

IPv4 Forwarder Intel

[Figure: block diagram of the Intel IPv4 Forwarder on the IXP2400. Recoverable labels:]
• Blocks: MSF (input), Packet_RX (uE 0:0), Eth_Decap_Classify + IPv4 Fwd + L2_Validate (three instances: uE 0:1, uE 0:2, uE 1:3), Packet_QM (uE 0:3), Scheduler (uE 1:0), Sphy_mphy4_tx (uE 0:3, uE 1:1), MSF (output)
• Rings:
  – ETH_RX_TO_IPV4_SRC_RING: ID = 4 (x 4), BASE_ADDRESS = 0, SIZE = 1024
  – IPV4_TO_QM_SCR_RING (alias QM_RING_IN, QM_RING_IN_0, ENQ_RING_NUMBER): ID = 5 (x 4), BASE_ADDRESS = 4096, SIZE = 512
  – SCHEDULER_TO_QM_SCR_RING (alias QM_RING_IN_1, DEQ_RING_NUMBER): ID = 6 (x 4), BASE_ADDRESS = 6144, SIZE = 512
  – QM_TO_PACKET_TX_SCR_RING_0 (alias PACKET_TX_IN_0): ID = 7 (x 4), BASE_ADDRESS = 8192, SIZE = 128
• Other links: NN_RING, Reflector Bus

Implementation of module/1 (continued)

IPv4 Forwarder Intel

[Figure: the same IPv4 Forwarder block diagram (same blocks, microengines and rings), with three Classify blocks added]

Implementation of module/2

• Functions of the XScale core (implemented in C):
  – Receiving classification rules
  – Building the multidimensional trie from the received rules, to calculate the number of nodes per level and the SRAM addresses
  – Rebuilding the multidimensional trie to write the data into SRAM at the precalculated addresses
• Functions of the Microengines:
  – Receiving packets
  – Retrieving the relevant fields from the packet headers
  – Finding matching rules using the data structure in SRAM
  – Modifying the TOS field
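
The two-pass construction on the XScale side can be sketched as follows: a first pass counts the nodes per level and derives a base address for each level, and a second pass writes every node at its precalculated address. The actual implementation is in C; this Python sketch models SRAM as a dictionary and assumes a fixed node size, and all names are hypothetical.

```python
def build_trie(rules):
    """Nested-dict trie; each rule maps to a sequence of per-level keys."""
    root = {}
    for rule, keys in rules.items():
        node = root
        for key in keys[:-1]:
            node = node.setdefault(key, {})
        node[keys[-1]] = rule
    return root


def count_nodes_per_level(root, depth):
    """Pass 1a: breadth-first count of the nodes on each level."""
    counts = []
    level_nodes = [root]
    for level in range(depth):
        counts.append(len(level_nodes))
        if level < depth - 1:
            level_nodes = [c for n in level_nodes for c in n.values()]
    return counts


def level_base_addresses(counts, node_size):
    """Pass 1b: each level starts right after the previous one."""
    bases, address = [], 0
    for count in counts:
        bases.append(address)
        address += count * node_size
    return bases


def place(root, depth, bases, node_size):
    """Pass 2: write each node at its precalculated address."""
    sram = {}
    level_nodes = [root]
    for level in range(depth):
        next_nodes = []
        for i, node in enumerate(level_nodes):
            entry = {}
            for key, child in node.items():
                if level < depth - 1:
                    # Pointer to the slot the child will occupy on the next level.
                    entry[key] = bases[level + 1] + len(next_nodes) * node_size
                    next_nodes.append(child)
                else:
                    entry[key] = child  # last level stores the rule
            sram[bases[level] + i * node_size] = entry
        level_nodes = next_nodes
    return sram
```

Knowing the per-level counts before writing anything is what allows child pointers to be stored as final addresses in a single write pass.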

Implementation of module/3

SRAM Data Table

[Figure: layout of the SRAM data table in long words. Node formats shown:]
• index of node, followed by indices of nodes of the 2nd level
• index of node, followed by pairs (value of field, index of next node)
• index of node, index of next node, minimum value, maximum value
• index of node, followed by pairs (value of field, number of rule)

Implementation of module/4

• Functions of the µ-engines (implemented in µ-code assembly):
  – Receiving packets
  – Retrieving the relevant fields from the packet headers
  – Finding matching rules using the data structure in SRAM
  – Modifying the TOS field
• Number of added cycles: 1600
  – 50 = initialization of memory registers
  – 180 = reading the first node
  – 150 × 2 = reading the port nodes
  – 145 × 7 = reading the other nodes
  – 15 = final matching
  – 40 = writing the TOS field
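
The cycle budget can be checked with a few lines of arithmetic. The per-step figures are those listed on the slide; the conversion to time takes the clock frequency as a parameter, because the slide does not state it (the 600 MHz used in the usage note is only an illustrative value).

```python
# Cycle counts per step, as listed on the slide.
CYCLE_BUDGET = {
    "memory registers initialization": 50,
    "reading first node": 180,
    "reading nodes of ports": 150 * 2,
    "reading other nodes": 145 * 7,
    "final matching": 15,
    "writing TOS field": 40,
}


def total_cycles():
    """Sum of all the added cycles."""
    return sum(CYCLE_BUDGET.values())


def node_read_share():
    """Fraction of the added cycles spent reading trie nodes."""
    reads = sum(cycles for step, cycles in CYCLE_BUDGET.items()
                if step.startswith("reading"))
    return reads / total_cycles()


def cycles_to_microseconds(cycles, clock_hz):
    """Convert a cycle count to microseconds for a given clock frequency."""
    return cycles / clock_hz * 1e6
```

The items add up to the 1600 cycles quoted, and node reads account for 1495 of them, so SRAM access dominates the added cost. At an assumed 600 MHz clock, `cycles_to_microseconds(1600, 600e6)` gives roughly 2.7 µs.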

Programming problems/1

• Main problems:
  – Number of SRAM accesses
  – Rate of SRAM accesses

[Figure: the IPv4 Forwarder block diagram with the three Classify blocks, repeated from "Implementation of module/1"]

Programming problems/2

Multithreaded Programming

We want to reduce the idle time

[Figure: timeline of threads 0 to 7 on one µ-engine. Legend: running thread, context swap, idle thread, idle µe, µe control, memory access latency. Horizontal axis: time]
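
A simple way to reason about idle time is the usual latency-hiding model: each thread computes for some cycles and then waits for a memory reference, and other threads run during the wait. The sketch ignores the context-swap cost and assumes identical threads; the numbers in the usage note are illustrative, not measured values.

```python
import math


def utilization(threads, compute_cycles, memory_latency):
    """Fraction of time the µ-engine executes instructions.

    Each thread is busy compute_cycles out of every
    (compute_cycles + memory_latency) cycles; with enough threads
    the engine is never idle, so the value is capped at 1.
    """
    per_thread = compute_cycles / (compute_cycles + memory_latency)
    return min(1.0, threads * per_thread)


def threads_to_hide_latency(compute_cycles, memory_latency):
    """Smallest number of threads that keeps the µ-engine always busy."""
    return math.ceil((compute_cycles + memory_latency) / compute_cycles)
```

With 20 compute cycles per 140-cycle memory access, one thread keeps the engine busy 12.5% of the time, four threads 50%, and eight threads are needed to remove the idle periods in this model.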

Programming problems/3

Stalling

[Figure: two timelines of threads 0 to 3 on one µ-engine. Legend: running thread, context swap, idle thread, idle µe, µe control, memory access latency. Horizontal axis: time]

• Decrease the number of active threads per µ-engine

Programming problems/4

Filling

[Figure: two timelines of threads 0 to 3 on one µ-engine. Legend: running thread, context swap, idle thread, idle µe, µe control, memory access latency. Horizontal axis: time]

• Consolidate adjacent memory accesses
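
Consolidating adjacent accesses means replacing several single-word reads with one burst read over contiguous words. The sketch below groups word addresses into bursts; the maximum burst length is a parameter, and the value 8 in the usage note is an assumption rather than a figure from the talk.

```python
def consolidate(addresses, max_burst):
    """Group word addresses into (start, count) bursts of contiguous words."""
    bursts = []
    for address in sorted(addresses):
        if bursts:
            start, count = bursts[-1]
            # Extend the current burst if the word is adjacent and there is room.
            if address == start + count and count < max_burst:
                bursts[-1] = (start, count + 1)
                continue
        bursts.append((address, 1))
    return bursts
```

For example, `consolidate([100, 101, 102, 200], 8)` turns four single-word reads into two memory references, which lowers both the number and the rate of SRAM accesses.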

Measurements/1

[Figure: test bed. Labels: AdTech AX4000; Cross-Compiler (XScale programming); Serial Cable; Developers’ Workbench (Microengines programming)]

Measurements/2

AdTech AX4000

Measurements/3

• Max packet rate: 2033000 pkt/s (0 lost packets)
• Number of supported rules: 10000
• Performance independent of the number of rules
• A fundamental feature: robustness

Note (Fabio): the limit on the number of rules is set only by the size of the SRAM (so it is an intrinsic limit of the board)

Measurements/4

• Packet delay

[Figure: packet delay plot. Values shown: 35 μsec, 100 μsec, 1130 μsec]

Future Work: Resources/Link Scheduler

[Figure: planned architecture. Recoverable labels:]
• Blocks: MSF (input and output), Packet_RX, Classifier, ResourceScheduler, LinkScheduler, Packet_TX, Scratchpad Memory
• Microengines: uE 0:0, uE 0:1, uE 0:2, uE 0:3, uE 1:0, uE 1:1, uE 1:2, uE 1:3
• Repeated processing block: Eth_Decap, Classify, IPv4 Fwd, L2_Validate, FB

Conclusions

• Analyse the Intel® IXP2400 hardware architecture
• Select a suitable packet classification algorithm for the IXP2400
• Modify the algorithm to capitalize on the properties of our hardware
• Build a C simulator to test the new version

• Implement the XScale functions in C (building the rule table)
• Implement the µ-engine functions in µ-code (finding the matching rule)

• Analyse multithreaded programming
• Study stalling, filling, and other “phenomena”

• Test the operation and performance of the classifier
• Characteristics: 1600 added cycles, 2 Mpkt/s, 10000 rules supported, scalability, robustness in case of congestion
