In the name of God, the Most Gracious, the Most Merciful

# Approximate String Matching


### Text of Approximate String Matching

1393 (Solar Hijri year)

Parallel Algorithm for Approximate String Matching with k Differences


Levenshtein (2)

k-difference

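Definition 1 (edit distance): for strings X = X[0..n-1] and Y = Y[0..m-1], Edit(X, Y) is the minimum number of edit operations needed to transform X into Y. The operations are: deleting a character from X, inserting a character into X, and substituting a character of X.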
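Definition 1 can be illustrated with a short dynamic-programming sketch. The function name `edit_distance` and the two-row layout are choices made here for illustration and do not come from the slides; each of the three operations is assumed to cost 1.

```python
def edit_distance(x: str, y: str) -> int:
    """Minimum number of insertions, deletions and substitutions turning x into y."""
    n, m = len(x), len(y)
    prev = list(range(m + 1))          # distance from the empty prefix of x to y[:j]
    for i in range(1, n + 1):
        curr = [i] + [0] * m           # distance from x[:i] to the empty string is i
        for j in range(1, m + 1):
            cost = 0 if x[i - 1] == y[j - 1] else 1
            curr[j] = min(prev[j] + 1,         # delete x[i-1]
                          curr[j - 1] + 1,     # insert y[j-1]
                          prev[j - 1] + cost)  # keep or substitute
        prev = curr
    return prev[m]


print(edit_distance("ACT", "AGT"))  # 1 (one substitution)
```

Only two rows are kept in memory because each entry depends on the current and the previous row alone.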

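Definition 2 (k-differences problem): given a pattern P[0..m-1], a text T, and an integer k, find every end position j in T for which some substring T[i..j-1] satisfies Edit(P[0..m-1], T[i..j-1]) ≤ k.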


Example 1: P = ACTACG, T = TAGTACG.

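Base cases:

- If i = 0, then D[i,j] = 0: the empty pattern matches at every text position with no differences.
- If j = 0, then D[i,j] = i: turning p[1..i] into the empty string t[1..0] takes i deletions.

If p[1..i-1] matches t[l..j-1] with k differences and p[i] = t[j], then p[1..i] matches t[l..j] with k differences as well:

If T[j-1] = P[i-1], then D[i,j] = D[i-1,j-1].

Otherwise, one of three cases applies:

- p[1..i] matches t[l..j-1] with k differences; inserting t[j] gives a match with t[l..j] with k+1 differences.
- p[1..i-1] matches t[l..j] with k differences; deleting p[i] gives a match of p[1..i] with k+1 differences.
- p[1..i-1] matches t[l..j-1] with k differences; substituting t[j] for p[i] gives a match of p[1..i] with t[l..j] with k+1 differences.

D[i,j] = 1 + min{D[i-1,j], D[i-1,j-1], D[i,j-1]}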

Example: T = TAGT, P = ACT

[Table: Edit(P, T) values and their minimum]
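For the example T = TAGT, P = ACT, the matrix D can be filled in directly from the base cases and the recurrence stated above. The following is a minimal sketch; the function name `k_diff_matrix` is hypothetical, and rows are indexed by pattern position, columns by text position.

```python
def k_diff_matrix(p: str, t: str) -> list[list[int]]:
    """Fill D using D[0][j] = 0, D[i][0] = i and the recurrence from the text."""
    m, n = len(p), len(t)
    d = [[0] * (n + 1) for _ in range(m + 1)]   # row 0 stays all zeros
    for i in range(m + 1):
        d[i][0] = i
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if p[i - 1] == t[j - 1]:
                d[i][j] = d[i - 1][j - 1]
            else:
                d[i][j] = 1 + min(d[i - 1][j], d[i - 1][j - 1], d[i][j - 1])
    return d


for row in k_diff_matrix("ACT", "TAGT"):
    print(row)
# [0, 0, 0, 0, 0]
# [1, 1, 0, 1, 1]
# [2, 2, 1, 1, 2]
# [3, 2, 2, 2, 1]
```

The smallest value in the last row is 1, at the final column: P matches the substring AGT at the end of T with a single substitution.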

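Sequential algorithm (m = pattern length, n = text length, indices start at 0):

    for j = 0:n                      % row 0: the empty pattern matches everywhere
        d(0,j) = 0;
    end
    for i = 0:m                      % column 0: i deletions against the empty text
        d(i,0) = i;
    end
    for i = 1:m
        for j = 1:n
            if p(i-1) == t(j-1)
                d(i,j) = d(i-1,j-1);
            else
                d(i,j) = min(d(i,j-1), d(i-1,j), d(i-1,j-1)) + 1;
            end
        end
    end

The two initialization loops take O(n) and O(m); the nested loops take O(nm).

Total time for the k-differences problem: T(n) = O(n) + O(m) + O(mn) = O(mn)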


D[i,j] = 1 + min{D[i-1,j-1], D[i-1,j], D[i,j-1]}

D[i , j] N 1 N j-1 1 :

D[i-1, j-1] D[i , j-1] < D[i-1, j-1]


Step 1: compute matrix X.

    FOR i = 0 TO |Σ|-1 PARALLEL DO
        FOR j = 0 TO n DO
            compute X[i,j] according to formula (3);
        END FOR
    END FOR PARALLEL

The inner loop takes O(n), so step 1 takes O(n) time.

Step 2: compute matrix D.

    FOR i = 0 TO m DO
        FOR j = 0 TO n PARALLEL DO
            compute D[i,j] according to formula (4);
        END FOR PARALLEL
        Barrier synchronization
    END FOR

Each parallel inner loop takes O(1), so step 2 takes O(m) time.

Step 3: report the matches.

    FOR j = 0 TO n-1 PARALLEL DO
        IF (D[m,j+1] ≤ k)
            Result[j] ← j;
        ELSE
            Result[j] ← -1;
    END FOR PARALLEL

Step 3 takes O(1) time.
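The reporting step can be sketched in a few lines. The function name `report_matches` is hypothetical, the last row of D is assumed to be already available as a list, and the loop is written sequentially here, although every j is independent and could be handled by its own thread.

```python
def report_matches(d_last: list[int], k: int) -> list[int]:
    """Result[j] = j if D[m, j+1] <= k, otherwise -1, for j = 0..n-1."""
    n = len(d_last) - 1                # d_last holds D[m, 0..n]
    # No iteration reads another iteration's output, so the loop parallelizes trivially.
    return [j if d_last[j + 1] <= k else -1 for j in range(n)]


# Last row of D for P = "ACT", T = "TAGT" under the recurrence D[i,j] given earlier.
print(report_matches([3, 2, 2, 2, 1], k=1))  # [-1, -1, -1, 3]
```

Entries equal to -1 mark text positions where no occurrence with at most k differences ends.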

T(n) = t(1) + t(2) + t(3) = O(n) + O(m) + O(1) = O(m+n)

O(m+n) O(n)

: n (n )

: m+1 (m )

: m+1

: n+m+1

[Figure: matrix X]

[Figure: matrix D]

K . . K m+1 n n+m+1 m+1 . DNA 45

[1] L. Z., B. J., and J. T. A software system for gene sequence database construction. Engineering in Medicine and Biology Society, 2005.
[2] L. V.I. Binary codes capable of correcting deletions, insertions and reversals. Sov. Phys. Dokl., 1996.10.
[3] G. Navarro and R. Baeza-Yates. A hybrid indexing method for approximate string matching. Journal of Discrete Algorithms, 1(1):21-49, 2000.
[4] Z. C and C. GL. Parallel algorithms for approximate string matching on PRAM and LARPBS. Journal of Software, 15:159-169, 2004.
[5] S. P. The theory and computation of evolutionary distance: pattern recognition. Journal of Algorithms, pages 359-373, 1980.1.
[6] G. Navarro. A guided tour to approximate string matching. ACM Computing Surveys, 33(1):31-88, 2000.
[7] B.-Y. Z. Y.S.Jayram and R. Krauthgamer. Approximating Edit Distance Efficiently. Computer Science, 2004.10.
[8] K. A. T. MIURA and I. SHIOYA. Approximate String Matching Using Markovian Distance. Algorithms and Programming, 2010.
[9] D. S. J. Zibert and N. Pavesic. An edit-distance model for the approximate matching of timed strings. Pattern Analysis and Machine Intelligence, 31(4):736-741, 2009.
[10] L. D. S. Wang and Z. Mei. Approximate Address Matching. International Conference on P2P, Parallel, Grid, Cloud and Internet Computing, 2010.10.
[11] H.-C. Lee and E. F. RMESH algorithms for parallel string matching. Los Alamitos: IEEE Computer Society Press, 1997.
[12] A. H. Wright and Y. Jiang. O(k) parallel algorithms for approximate string matching. Journal of Neural Parallel and Scientific Computation, 1993.1.
[13] S. Xiao and W. chun Feng. Inter-Block GPU communication via fast barrier synchronization. 24th IEEE International Parallel Distributed Processing Symposium, 2010.
[14] K. C. K. G. Margaritis. String Matching on a Multicore GPU using CUDA. 13th Panhellenic Conference on Informatics, 2009.

Thank you very much.
