String Matching in Lempel-Ziv Compressed Strings. Paper introduction. Algorithmica (1998) 20: 388-404. M. Farach and M. Thorup. Presented by Takuya Kida, second-year master's student, Takeda Laboratory. Preliminaries. Explanation of terms: prefix, substring, suffix. For a string w with w = xyz, x is a prefix, y is a substring, and z is a suffix of w.
String Matching in Lempel-Ziv Compressed Strings
M. Farach and M. Thorup, Algorithmica (1998) 20: 388-404
Preliminaries
prefix, substring, suffix
Example: w = nobinobita
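The three terms can be checked mechanically on the example string. The sketch below is only an illustration; the helper names `prefixes`, `suffixes`, and `substrings` are assumptions introduced here, not names from the slides or the paper.

```python
def prefixes(w):
    """All prefixes of w, from the empty string up to w itself."""
    return [w[:i] for i in range(len(w) + 1)]


def suffixes(w):
    """All suffixes of w, from w itself down to the empty string."""
    return [w[i:] for i in range(len(w) + 1)]


def substrings(w):
    """All distinct substrings of w, including the empty string."""
    n = len(w)
    return {w[i:j] for i in range(n + 1) for j in range(i, n + 1)}


w = "nobinobita"
print("nob" in prefixes(w), "nobin" in prefixes(w))
print("nobita" in suffixes(w))
print("inobi" in substrings(w))
```

Every prefix and every suffix is also a substring, so `substrings(w)` contains both of the other two collections.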
Pattern Matching
Data Compression
Goal of this paper
Idea
Previous studies
year  researcher                  compression method
1988  Eilam-Tsoreff and Vishkin   run-length
1992  Amir, Landau, and Vishkin   two-dimensional run-length
1992  Amir and Benson             two-dimensional run-length
1995  Farach and Thorup           LZ77
1996  Gasieniec, et al.           LZ77
1996  Amir, Benson and Farach     LZW
1997  Karpinski, et al.           straight-line programs
1997  Miyazaki, et al.            straight-line programs
1998  Kida, et al.                LZW
LZ77 Compression
contents
example Z = a b c
useful property
b a c a
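A minimal sketch of LZ77-style compression, for readers who want to see the phrase structure concretely. The exact LZ77 variant used on the slides is not recoverable from the text, so the token format here, `(offset, length, next_char)`, and the function names are assumptions made for illustration only.

```python
def lz77_compress(text):
    """Greedy LZ77-style parse into (offset, length, next_char) triples.

    offset is the distance back to the start of the copied match,
    length is the number of copied characters, and next_char is the
    literal character that ends the phrase. (Assumed token format.)
    """
    tokens = []
    i, n = 0, len(text)
    while i < n:
        best_len, best_off = 0, 0
        for start in range(i):
            length = 0
            # Stop one character early so a literal always remains.
            while i + length < n - 1 and text[start + length] == text[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - start
        tokens.append((best_off, best_len, text[i + best_len]))
        i += best_len + 1
    return tokens


def lz77_decompress(tokens):
    """Rebuild the text; copying one character at a time allows overlaps."""
    out = []
    for offset, length, next_char in tokens:
        start = len(out) - offset
        for k in range(length):
            out.append(out[start + k])
        out.append(next_char)
    return "".join(out)


T = "ababcababcbacababbcab"
Z = lz77_compress(T)
print(len(T), len(Z))
print(lz77_decompress(Z) == T)
```

The quadratic search for the longest match keeps the sketch short; it is not meant to reflect how a practical compressor finds matches.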
Main Algorithm
Basic Idea: existence problem
Basic Idea: at position i, check prefix / substring, suffix / substring, substring: Yes
Basic Idea: Pattern: b a c a, positions i and i+1
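The boundary test between positions i and i+1 can be illustrated on uncompressed text. This is only a sketch of the condition being checked, assuming the text is available in plain form; the point of the paper's algorithm is to answer this without decompressing, which the code below does not attempt. The function name `crosses_boundary` is hypothetical.

```python
def crosses_boundary(text, pattern, i):
    """True if an occurrence of pattern covers both text[i-1] and text[i].

    Equivalently: some proper split pattern = left + right exists with
    left a suffix of text[:i] and right a prefix of text[i:].
    """
    for k in range(1, len(pattern)):
        left, right = pattern[:k], pattern[k:]
        if text[:i].endswith(left) and text[i:].startswith(right):
            return True
    return False


T = "ababcababcbacababbcab"
P = "baca"
hits = [i for i in range(1, len(T)) if crosses_boundary(T, P, i)]
print(hits)
```

An occurrence of length |P| crosses exactly |P| - 1 consecutive boundaries, so a single occurrence shows up as a short run of positions in `hits`.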
Winding Phase
a b a b c a b a b c b a c a b a b b c a b
i = 3
Unwinding Phase
Pattern: b a c a
a b a b c a b a b c b a c a b a b b c a b
Final operation
Complexity of Algorithm
Winding Phase
L: O(log |Z|) (balanced tree)
L: O(log |Z|), O(1)
L: O(|Z| log |T|) (Segment-Merge)
Unwinding Phase
O(|Z|), O(log |P|)
O(|P| + |Z| (|Z| + log |P|))
final operation: O(log |P|) per step, O(|Z| log |P|) in total
Total
Conclusion
String Matching in Lempel-Ziv Compressed Strings
For a string W = XYZ: X is a prefix of W, Y is a substring of W, Z is a suffix of W (examples: nob, nobin)
existence problem: YES or NO; all-occurrences problem; KMP, BM
Farach and Thorup: LZ77
Pattern: baca; P-, P+; P-i, P+i+1; suffix / substring: P-; prefix / substring: P+; YES
Winding Phase: P-, P+; i = 3; YES
Unwinding Phase: substrings ba, ca; P+, P-; i, 5; YES
Winding Phase: Balanced tree O(log |Z|); Segment-Merge O(|Z| log |T|)
Unwinding Phase: substring, O(|P|), O(log |P|)
T/Z, log |P|, O(|P| + |Z|) (competitive and opportunistic)