
# String Matching Problem.ppt


• 7/27/2019 String Matching Problem.ppt


String Matching Problem

Given a text string T of length n and a pattern string P of length m, the exact string matching problem is to find all occurrences of P in T.

Example: T = AGCTTGA, P = GCT
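To make the problem statement concrete, here is a minimal sketch that reports every occurrence of P in T using Python's built-in `str.find`. The helper name `find_all` and the choice of 0-based positions are assumptions made for illustration; they are not part of the slides.

```python
def find_all(text: str, pattern: str) -> list[int]:
    """Return the 0-based start index of every occurrence of pattern in text."""
    positions = []
    start = text.find(pattern)
    while start != -1:
        positions.append(start)
        # Advance by one position so overlapping occurrences are also reported.
        start = text.find(pattern, start + 1)
    return positions


print(find_all("AGCTTGA", "GCT"))  # [1]
```

With the slide's example, GCT occurs once in AGCTTGA, starting at the second character (index 1 when counting from 0).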

Applications:

* Searching keywords in a file
* Search engines (like Google and Openfind)
* Database searching (GenBank)


What is pattern matching?

Problem/issue: finding an occurrence of a pattern (string) P in a string S, and also finding the position in S where the pattern match occurs.


Brute Force algorithm

The brute-force pattern matching algorithm compares the pattern P with the text T for each possible shift of P relative to T, until either

* a match is found, or
* all placements of the pattern have been tried


Brute-force

Worst case: O(m*n)

Best case: O(n)

algorithm brute-force:

input: an array of characters, T (the string to be analyzed), length n
       an array of characters, P (the pattern to be searched for), length m

for i := 0 to n-m do
    for j := 0 to m-1 do
        compare T[i+j] with P[j]
        if not equal, exit the inner loop
    if the inner loop ran to completion, report a match at shift i
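The pseudocode above can be sketched in Python as follows. The function name `brute_force_search` and the decision to collect every matching shift in a list are assumptions for illustration, not something the slides prescribe.

```python
def brute_force_search(text: str, pattern: str) -> list[int]:
    """Return every 0-based shift at which pattern occurs in text."""
    n, m = len(text), len(pattern)
    matches = []
    for i in range(n - m + 1):  # every possible shift of P relative to T
        j = 0
        # Compare T[i+j] with P[j]; stop at the first mismatch.
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:  # the inner loop ran to completion: full match
            matches.append(i)
    return matches


print(brute_force_search("AGCTTGA", "GCT"))  # [1]
```

The outer loop tries n-m+1 shifts and the inner loop does at most m comparisons per shift, which gives the O(m*n) worst case.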


Example

Compare each character of P with S: if they match, continue; otherwise shift one position.

String S:  a b c a b a a b c a b a c

Pattern p: a b a a


Step 1: compare p with S

S  a b c a b a a b c a b a c

p  a b a a

Step 2: compare p with S

S  a b c a b a a b c a b a c

p  a b a a


Step 3: compare p with S

S  a b c a b a a b c a b a c

p  a b a a

Mismatch occurs here: the third character of p (a) does not match the third character of S (c).

Since a mismatch is detected, shift p one position to the right and repeat the matching procedure, performing steps analogous to steps 1 to 3.
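The shift-and-retry procedure for this example can be traced with a short sketch. The function `trace_brute_force` is a hypothetical helper; it records, for each shift, how many characters of p matched before a mismatch, and stops at the first full match.

```python
def trace_brute_force(s: str, p: str) -> list[tuple[int, int]]:
    """Return (shift, matched) pairs, where matched counts the leading
    characters of p that agreed with s at that shift."""
    steps = []
    for shift in range(len(s) - len(p) + 1):
        matched = 0
        while matched < len(p) and s[shift + matched] == p[matched]:
            matched += 1
        steps.append((shift, matched))
        if matched == len(p):  # full match: stop at the first occurrence
            break
    return steps


for shift, matched in trace_brute_force("abcabaabcabac", "abaa"):
    print(f"shift {shift}: {matched} character(s) matched")
```

For S = abcabaabcabac and p = abaa, the first shift matches two characters and fails on the third, the next two shifts fail immediately, and the fourth shift (three positions to the right) matches all four characters.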


The Knuth-Morris-Pratt Algorithm

Knuth, Morris and Pratt proposed a linear time algorithm for the string matching problem.

A matching time of O(n) is achieved by avoiding comparisons with elements of S that have previously been involved in comparison with some element of the pattern p to be matched, i.e., backtracking on the string S never occurs.
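One way to see the effect of never backtracking on S is to count character comparisons. The sketch below is an illustration under stated assumptions: both counting functions are hypothetical helpers, the KMP variant uses a common 0-based formulation and assumes a non-empty pattern, and the test input (a long run of a's with a pattern ending in b) is chosen as a bad case for brute force.

```python
def count_brute_force(text: str, pattern: str) -> int:
    """Count character comparisons made by the brute-force scan."""
    comparisons = 0
    for i in range(len(text) - len(pattern) + 1):
        for j in range(len(pattern)):
            comparisons += 1
            if text[i + j] != pattern[j]:
                break
    return comparisons


def count_kmp(text: str, pattern: str) -> int:
    """Count text-vs-pattern comparisons made by a KMP scan (pattern non-empty)."""
    m = len(pattern)
    pi = [0] * m  # 0-based prefix table
    k = 0
    for q in range(1, m):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k

    comparisons = 0
    q = 0  # number of pattern characters currently matched
    for ch in text:  # the text index only ever moves forward
        while True:
            comparisons += 1
            if pattern[q] == ch:
                q += 1
                break
            if q == 0:
                break
            q = pi[q - 1]  # fall back in the pattern, not in the text
        if q == m:
            q = pi[q - 1]
    return comparisons


text, pattern = "a" * 1000, "a" * 9 + "b"
print(count_brute_force(text, pattern), count_kmp(text, pattern))
```

On this input the brute-force count grows with m times n, while the KMP count stays within 2n, because every comparison either advances in the text or shortens the current partial match.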


Components of KMP algorithm

The prefix function, π

The prefix function π for a pattern encapsulates knowledge about how the pattern matches against shifts of itself. This information can be used to avoid useless shifts of the pattern p. In other words, this enables avoiding backtracking on the string S.

The KMP Matcher

With string S, pattern p and prefix function π as inputs, finds the occurrence of p in S and returns the number of shifts of p after which the occurrence is found.


Knuth-Morris-Pratt algorithm

Compute-Prefix-Function(P)

1. m ← length[P]
2. π[1] ← 0
3. k ← 0
4. for q ← 2 to m
5.     do while k > 0 and P[k+1] ≠ P[q]
6.            do k ← π[k]    /* if k = 0 or P[k+1] = P[q], going out of the while-loop. */
7.        if P[k+1] = P[q]
8.            then k ← k+1
9.        π[q] ← k
10. return π
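A minimal Python sketch of Compute-Prefix-Function follows. The pseudocode above is 1-based, while this version is 0-based, so `pi[q]` here holds the slide's π[q+1] and the fallback reads `pi[k - 1]`; the function name is an assumption for illustration.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """Return pi, where pi[q] is the length of the longest proper prefix of
    pattern[:q + 1] that is also a suffix of it (0-based indexing)."""
    m = len(pattern)
    pi = [0] * m
    k = 0  # length of the currently matched prefix
    for q in range(1, m):
        # Fall back to shorter prefixes until one can be extended or k is 0.
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


print(compute_prefix_function("ababababca"))  # [0, 0, 1, 2, 3, 4, 5, 6, 0, 1]
```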


Knuth-Morris-Pratt algorithm

KMP-Matcher(T, P)

1. n ← length[T]
2. m ← length[P]
3. π ← Compute-Prefix-Function(P)
4. q ← 0
5. for i ← 1 to n
6.     do while q > 0 and P[q+1] ≠ T[i]
7.            do q ← π[q]
8.        if P[q+1] = T[i]
9.            then q ← q+1
10.       if q = m
11.           then print "pattern occurs with shift" i − m
12.                q ← π[q]
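KMP-Matcher can be sketched in Python along the same lines. This is a 0-based version under stated assumptions: the names `compute_prefix_function` and `kmp_search` are illustrative, matches are returned as a list of shifts instead of being printed, and an empty pattern is treated as having no matches.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """0-based prefix table: pi[q] is the longest border of pattern[:q + 1]."""
    pi = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def kmp_search(text: str, pattern: str) -> list[int]:
    """Return every shift (0-based start index) where pattern occurs in text."""
    n, m = len(text), len(pattern)
    if m == 0:
        return []  # assumption: an empty pattern yields no matches
    pi = compute_prefix_function(pattern)
    shifts = []
    q = 0  # number of pattern characters matched so far
    for i in range(n):  # i never moves backwards
        while q > 0 and pattern[q] != text[i]:
            q = pi[q - 1]  # next shorter candidate prefix
        if pattern[q] == text[i]:
            q += 1
        if q == m:  # whole pattern matched, ending at text[i]
            shifts.append(i - m + 1)
            q = pi[q - 1]  # keep going to find later (overlapping) matches
    return shifts


print(kmp_search("AGCTTGA", "GCT"))  # [1]
```

The reported shift `i - m + 1` is the same quantity as the pseudocode's i − m, because the pseudocode counts i from 1.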


Compute prefix function

P = ababababca, T = ababaababababca

π[1] = 0

k = 0

q = 2: P[k+1] = P[1] = a, P[q] = P[2] = b, so P[k+1] ≠ P[q]

π[q] ← k (π[2] = 0)

q = 3: P[k+1] = P[1] = a, P[q] = P[3] = a, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[3] = 1)

k = 1

q = 4: P[k+1] = P[2] = b, P[q] = P[4] = b, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[4] = 2)


k = 2

q = 5: P[k+1] = P[3] = a, P[q] = P[5] = a, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[5] = 3)

k = 3

q = 6: P[k+1] = P[4] = b, P[q] = P[6] = b, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[6] = 4)

k = 4

q = 7: P[k+1] = P[5] = a, P[q] = P[7] = a, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[7] = 5)

k = 5

q = 8: P[k+1] = P[6] = b, P[q] = P[8] = b, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[8] = 6)


k = 6

q = 9: P[k+1] = P[7] = a, P[q] = P[9] = c, so P[k+1] ≠ P[q]

k ← π[k] (k = 4)

P[k+1] = P[5] = a, P[q] = P[9] = c, so P[k+1] ≠ P[q]

k ← π[k] (k = 2)

P[k+1] = P[3] = a, P[q] = P[9] = c, so P[k+1] ≠ P[q]

k ← π[k] (k = 0)

k = 0

q = 9: P[k+1] = P[1] = a, P[q] = P[9] = c, so P[k+1] ≠ P[q]

π[q] ← k (π[9] = 0)

q = 10: P[k+1] = P[1] = a, P[q] = P[10] = a, so P[k+1] = P[q]

k ← k+1, π[q] ← k (π[10] = 1)


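After prefix computation, the table is shown below

P = ababababca

| i    | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 |
|------|---|---|---|---|---|---|---|---|---|----|
| P[i] | a | b | a | b | a | b | a | b | c | a  |
| π[i] | 0 | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 0 | 1  |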

[Figure: P = ababababca aligned against shifted copies of itself. The prefixes P8 = abababab, P6 = ababab, P4 = abab, P2 = ab and P0 (the empty string) are marked, corresponding to π[8] = 6, π[6] = 4, π[4] = 2, π[2] = 0.]
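The chain π[8] = 6, π[6] = 4, π[4] = 2, π[2] = 0 can be reproduced by applying π repeatedly. In this sketch `fallback_chain` is a hypothetical helper that takes the 1-based index used on the slides and converts it to the 0-based table internally.

```python
def compute_prefix_function(pattern: str) -> list[int]:
    """0-based prefix table: pi[q] is the longest border of pattern[:q + 1]."""
    pi = [0] * len(pattern)
    k = 0
    for q in range(1, len(pattern)):
        while k > 0 and pattern[k] != pattern[q]:
            k = pi[k - 1]
        if pattern[k] == pattern[q]:
            k += 1
        pi[q] = k
    return pi


def fallback_chain(pattern: str, q: int) -> list[int]:
    """Return π[q], π[π[q]], ... down to 0, where q is 1-based as on the slides."""
    pi = compute_prefix_function(pattern)
    chain = []
    k = q
    while k > 0:
        k = pi[k - 1]  # the slide's π[k] lives at index k - 1 of the 0-based table
        chain.append(k)
    return chain


print(fallback_chain("ababababca", 8))  # [6, 4, 2, 0]
```

These are the values k takes during the q = 9 step of the prefix computation, where the mismatch against c forces k to fall back from 6 to 4 to 2 to 0.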

Another Example for KMP Algorithm

Phase 1: First finish the prefix computation.

Phase 2: Next, the search phase computation.

[Figure annotations: f(4−1)+1 = f(3)+1 = 0+1 = 1; f(13−1)+1 = 4+1 = 5, matched]