Pipelined Architecture for Multi-String Matching
Department of Computer Science and Information Engineering National Cheng Kung University, Taiwan R.O.C.
Authors: Derek Pao, Wei Lin, and Bin Liu
Publisher: IEEE Computer Architecture Letters, 30 May 2008. IEEE Computer Society Digital Library. IEEE Computer Society
Presenter: Chia-Ming Chuang
Date: 10, 8, 2008
Outline
1. Introduction
2. Pipelined Architecture
3. Performance evaluation
4. Conclusion
Introduction (1/6)
A string Y of length n is a sequence of characters c1c2...cn. Let Σ = {Y1, Y2, ..., YN} be a finite set of strings called keywords or signatures.
Proposed hardware solutions are based on the well-known Aho-Corasick (AC) algorithm, where the system is modeled as a deterministic finite automaton (DFA).
We present a pipelined processing approach to the implementation of the AC algorithm, called P-AC.
Introduction (2/6)
CLOCK  CURRENT STATE  INPUT  NEXT STATE  EDGE
1      ROOT           a      a           forward edge
2      a              p      ap          forward edge
3      ap             p      app         forward edge
4      app            a      NULL        cross edge
5      pa             s      pas         forward edge
6      pas            t      past        forward edge
7      past
Transition rule table
Input data: .appastxyz
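The walk through the transition rule table can be reproduced with a small software model. The sketch below is not the P-AC hardware design: it builds an ordinary Aho-Corasick automaton and labels each transition as a forward edge (a trie edge) or a cross edge (here taken to mean any other transition that lands on a non-root state). The keyword set {"apple", "past"} is a hypothetical choice, since the slide does not list the keywords, and the names build_ac, step, and trace are illustrative only.

```python
from collections import deque


def build_ac(keywords):
    """Build the keyword trie (forward edges) and the AC failure links."""
    goto = [{}]    # goto[s][ch] -> child state reached by a forward edge
    label = [""]   # label[s]    -> prefix spelled out by state s
    for word in keywords:
        s = 0
        for ch in word:
            if ch not in goto[s]:
                goto.append({})
                label.append(label[s] + ch)
                goto[s][ch] = len(goto) - 1
            s = goto[s][ch]

    fail = [0] * len(goto)
    queue = deque(goto[0].values())  # depth-1 states fail to ROOT
    while queue:
        s = queue.popleft()
        for ch, t in goto[s].items():
            f = fail[s]
            while f and ch not in goto[f]:
                f = fail[f]
            fail[t] = goto[f].get(ch, 0)
            queue.append(t)
    return goto, fail, label


def step(goto, fail, s, ch):
    """Return (next_state, edge_kind) for one input character."""
    if ch in goto[s]:
        return goto[s][ch], "forward"
    while s and ch not in goto[s]:
        s = fail[s]
    nxt = goto[s].get(ch, 0)
    # Assumption: a non-forward move to a non-root state is a cross edge.
    return nxt, ("cross" if nxt else "to ROOT")


def trace(keywords, text):
    """List (clock, current, input, next, edge) rows for the input text."""
    goto, fail, label = build_ac(keywords)
    rows, s = [], 0
    for clock, ch in enumerate(text, start=1):
        nxt, kind = step(goto, fail, s, ch)
        rows.append((clock, label[s] or "ROOT", ch, label[nxt] or "ROOT", kind))
        s = nxt
    return rows


if __name__ == "__main__":
    # Hypothetical keywords; the input is the first six characters of the slide's data.
    for row in trace(["apple", "past"], "appast"):
        print(*row)
```

With these assumed keywords, the trace visits the states a, ap, app, pa, pas, past, and the only cross edge is taken at clock 4, which is consistent with pa being the current state at clock 5 in the table.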
Introduction (3/6)
Forward edges
cross edges
Introduction (4/6)
Introduction (5/6)
Introduction (6/6)
Pipelined Architecture (1/6)
Pipelined Architecture (2/6)
Assume the pipeline has k+1 stages, numbered from 0 to k.
The last stage is used only to buffer the search result of stage k-1.
Strings longer than k characters are divided into segments of length k, except for the last segment, whose length can be less than k.
Pipelined Architecture (3/6)
The local transition tables (LT) can be implemented using hardware hashing.
For long strings with more than k characters, the match result is generated by the aggregation unit (AU).
The transition rule table (TD) is used to determine the next state.
The segment ID is passed to the corresponding partial match unit (PMi) of the AU.
PMi sends a lookup request to the conditional match table (CMT).
Pipelined Architecture (4/6)
and Test Instructions instrument
and Test Inst ruct ions inst rume nt
and Test Inst ruct ions rume nt
nt and ions Inst ruct rume Test
The Boolean flagL is equal to 1 if the segment is part of a long string (a string with more than k characters).
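The segmentation shown on this slide can be sketched in a few lines. The code below assumes k = 4 (inferred from the segment lengths on the slide, not stated there), treats "and", "Test", "Instructions", and "instrument" as the keyword set, merges segments case-insensitively (as the slide appears to do with "Inst" and "inst"), and sets flagL to 1 when any string containing the segment is longer than k. The function names are illustrative, and the merge and flag rules are assumptions rather than the authors' stated procedure.

```python
def split_segments(keyword, k):
    """Split a keyword into segments of length k; the last one may be shorter."""
    return [keyword[i:i + k] for i in range(0, len(keyword), k)]


def build_segment_list(keywords, k):
    """Return (segment, flagL) pairs in first-seen order, without duplicates."""
    seen = {}  # case-folded segment -> [first-seen spelling, flagL]
    for word in keywords:
        is_long = int(len(word) > k)  # 1 if the keyword is a long string
        for seg in split_segments(word, k):
            entry = seen.setdefault(seg.lower(), [seg, 0])
            entry[1] |= is_long       # assumption: any long owner sets flagL
    return [(spelling, flag) for spelling, flag in seen.values()]


if __name__ == "__main__":
    keywords = ["and", "Test", "Instructions", "instrument"]
    for segment, flag_l in build_segment_list(keywords, k=4):
        print(segment, flag_l)
```

Under these assumptions the distinct segments come out as and, Test, Inst, ruct, ions, rume, nt, with flagL = 0 for the two short strings and flagL = 1 for the segments of the two long strings.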
Pipelined Architecture (5/6)
Pipelined Architecture (6/6)
Performance evaluation (1/2)
There are k+3 memories in the system: k memories for the pipeline unit, 2 memories for the DFA unit, and 1 memory for the CMT.
The DFA unit requires 2 memories because there are 2 types of edges (forward edges and cross edges).
Performance evaluation (2/2)
Conclusion (1/1)
If 2-port memories are available, we can implement 2 pipelines on the same device that share the lookup tables.
The system throughput can be doubled with little overhead.
Using a Xilinx Virtex-5 FPGA that operates at 550 MHz, the throughput of P-AC is up to 8.8 Gbps.
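The 8.8 Gbps figure is consistent with a simple back-of-the-envelope calculation. The sketch below assumes each pipeline consumes one 8-bit character per clock cycle and that the quoted figure refers to the two-pipeline configuration; neither assumption is stated explicitly on the slide, and throughput_gbps is an illustrative helper.

```python
def throughput_gbps(clock_hz, bits_per_char=8, pipelines=1):
    """Peak throughput, assuming one character per clock per pipeline."""
    return clock_hz * bits_per_char * pipelines / 1e9


if __name__ == "__main__":
    print(throughput_gbps(550e6))               # single pipeline
    print(throughput_gbps(550e6, pipelines=2))  # two pipelines sharing the tables
```

Under these assumptions a single pipeline at 550 MHz gives 4.4 Gbps, and two pipelines give 8.8 Gbps.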