J. Fischer, V.Heun: Improvements on the RMQ-Problem
Improvements on the Range-Minimum-Query Problem
Johannes Fischer, Volker Heun
Universität München, Institut für Informatik
Introduction
► Given: array A of size n
► Task: preprocess A such that
RMQ_A(l, r) = argmin_{l ≤ i ≤ r} A[i]
can be answered efficiently
1 2 3 4 5 6 7 8 9 10 11
A = 2 5 2 4 1 7 5 8 3 8 6
[Figure: query range l..r marked on A] min ⇒ return 5
► Break ties to the left
[Figure: another query range l..r marked on the same array A] min ⇒ return 1
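The definition above can be written down directly as a linear scan. This is only a baseline sketch in Python, not the preprocessing scheme of the talk; the name `rmq_naive` and the query ranges in the usage note are illustrative assumptions, since the l and r markers of the slide did not survive extraction.

```python
def rmq_naive(A, l, r):
    """Position (1-based) of the leftmost minimum of A[l..r], found by a scan."""
    # min() returns the first minimal candidate, which breaks ties to the left.
    return min(range(l, r + 1), key=lambda i: A[i - 1])


# Array from the slide, stored 0-based but queried with 1-based positions.
A = [2, 5, 2, 4, 1, 7, 5, 8, 3, 8, 6]
```

With this sketch, `rmq_naive(A, 1, 11)` yields position 5, and `rmq_naive(A, 1, 4)` yields position 1 because the tie between A[1] and A[3] is broken to the left.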
Applications
► Lowest Common Ancestors (LCA)
[Figure: tree on the nodes A to K with depth levels 0 to 3; A lists the nodes in tour order, H their depths]
A = I D B E A J F K C G C H
H = 3 2 1 2 0 3 2 3 1 2 1 2
Example: the minimum of H between J and G belongs to C, so LCA(J, G) = C
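The LCA example can be replayed in a few lines of Python. The arrays A and H are copied from the slide; the helper name `lca` and the linear scan for the minimum are illustrative assumptions (a real structure would answer the range minimum in O(1)).

```python
# Tour order of the nodes and their depths, as given on the slide.
A = "I D B E A J F K C G C H".split()
H = [3, 2, 1, 2, 0, 3, 2, 3, 1, 2, 1, 2]


def lca(u, v):
    """LCA of u and v: the node at the minimum depth between their tour positions."""
    i, j = sorted((A.index(u), A.index(v)))          # first occurrence of each node
    k = min(range(i, j + 1), key=lambda p: H[p])     # RMQ on the depth array
    return A[k]
```

For instance, `lca("J", "G")` scans the depths 3 2 3 1 2 and returns the node at depth 1, which is C.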
Applications
► Longest common extensions of strings (LCE)
[Figure: text t in which the substring abba starts at both positions i and j, followed by different characters x and z]
• RMQs on the LCP-table of suffix array
► Other applications
• Document Retrieval (Muthukrishnan SODA'02)
• Suffix links in ESA (Abouelhoda et al. WABI'02)
• Maximum-Sum Queries (Chen/Chao ISAAC'04)
• …
⇒ basic ingredient!
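To make the LCE reduction concrete, here is a small Python sketch under stated assumptions: the suffix array and LCP table are built naively, the range minimum is a plain scan, and the names `build_lcp` and `lce` as well as the test string are hypothetical. It only illustrates that the LCE of two suffixes is the minimum of the LCP table between their ranks.

```python
def build_lcp(t):
    """Naive suffix array ranks and LCP table of t (fine for short strings only)."""
    n = len(t)
    sa = sorted(range(n), key=lambda i: t[i:])
    rank = [0] * n
    for k, i in enumerate(sa):
        rank[i] = k

    def lcp(i, j):
        k = 0
        while i + k < n and j + k < n and t[i + k] == t[j + k]:
            k += 1
        return k

    # lcp_table[k] = common prefix length of the suffixes at ranks k-1 and k
    lcp_table = [0] + [lcp(sa[k - 1], sa[k]) for k in range(1, n)]
    return rank, lcp_table


def lce(t, rank, lcp_table, i, j):
    """Longest common extension of t[i:] and t[j:] via a range minimum on the LCP table."""
    if i == j:
        return len(t) - i
    a, b = sorted((rank[i], rank[j]))
    return min(lcp_table[a + 1 : b + 1])   # an O(1) RMQ structure would go here
```

On the hypothetical text `"abbaxabbaz"`, the suffixes at positions 0 and 5 share the prefix `abba`, so the sketch reports an LCE of 4.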
Previous Results for RMQ
► Berkman/Vishkin FOCS'89:
• Preprocessing O(n)
• Query time O(1)
► Rediscovered & simplified by Bender/Farach-Colton (LATIN'00)
► Reduction chain: RMQ ⇒ LCA ⇒ ±1RMQ
• RMQ ⇒ LCA: Cartesian Tree
• LCA ⇒ ±1RMQ: Euler Tour
• ±1RMQ: 4-Russians Trick
► cf. suffix array vs. suffix tree: text ⇒ suffix tree ⇒ suffix array
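Cartesian Tree
► Cartesian Tree for A[1,n]:
• Root: minimal element of A[1,n] at pos i
• Left Child: Cartesian Tree for A[1,i-1]
• Right Child: Cartesian Tree for A[i+1,n]
1 2 3 4 5 6 7 8 9 10 11
A = 2 5 2 4 1 7 5 8 3 8 6
[Figure: Cartesian Tree of A with nodes labelled by position: root 5 with children 1 and 9; 3 below 1 with children 2 and 4; 7 and 11 below 9; 6 and 8 below 7; 10 below 11]
Construction by this recursive definition: O(n^2)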
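The recursive definition translates almost literally into Python. The sketch below is the naive quadratic construction, not a linear-time one; the function name and the nested-tuple representation are illustrative assumptions.

```python
def cartesian_tree(A, lo, hi):
    """Cartesian Tree of A[lo..hi] (1-based) as a nested tuple (pos, left, right)."""
    if lo > hi:
        return None
    # Leftmost minimum of the range becomes the root (ties broken to the left).
    i = min(range(lo, hi + 1), key=lambda p: A[p - 1])
    return (i, cartesian_tree(A, lo, i - 1), cartesian_tree(A, i + 1, hi))


A = [2, 5, 2, 4, 1, 7, 5, 8, 3, 8, 6]
tree = cartesian_tree(A, 1, len(A))
```

For the array on the slide, the root of `tree` is position 5, its left subtree is rooted at position 1 and its right subtree at position 9.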
The New Algorithm
Overview
► Divide A into blocks B_1, …, B_{n/s} of size s = log(n)/4
► Answer queries separately
• Long queries that span several blocks (O(1))
• Short in-block-queries (O(1))
► Return position where overall minimum occurs (O(1))
[Figure: array A divided into blocks B_1, …, B_{n/s} of width s, with a query range l..r marked]
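One plausible way to split a query into the pieces named above is sketched here in Python. The slides do not spell out the exact decomposition, so the three-part split (in-block prefix, run of whole blocks, in-block suffix), the 0-based indexing and the name `decompose` are all assumptions.

```python
def decompose(l, r, s):
    """Split the 0-based query range [l, r] into in-block parts and a run of whole blocks."""
    bl, br = l // s, r // s                       # blocks containing l and r
    if bl == br:
        return [("in-block", l, r)]               # query lies inside one block
    parts = [("in-block", l, (bl + 1) * s - 1)]   # tail of the first block
    if bl + 1 <= br - 1:
        parts.append(("blocks", bl + 1, br - 1))  # whole blocks strictly in between
    parts.append(("in-block", br * s, r))         # head of the last block
    return parts
```

Each part would be answered in O(1) by the corresponding structure, and the overall answer is the position with the smallest value among at most three candidates.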
Answering Long Queries (B/F-C'00)
► Precompute all RMQs that span 2^k blocks
• M[i][k] = position of min in B_i, …, B_{i+2^k-1}
• Filled in optimal time with Dyn. Prog.
► Query: select 2 (possibly overlapping) precomputed ranges covering the interval
[Figure: array A with blocks B_1, …, B_i, …, B_{n/s}; the ranges of M[i][0], M[i][1], M[i][2], M[i][3] all start at B_i]
Size of M: n/s · log(n/s) = O(n/log n · log(n/log n)) = O(n)
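The table M and the two-range query can be sketched in Python as follows. Indices are 0-based, the block size s = 3 in the usage note is chosen only for illustration (the talk uses s = log(n)/4), and the names `build_M`, `query_blocks` and `better` are assumptions.

```python
def better(A, p, q):
    """Of two candidate positions, p from the range starting further left, keep the smaller value."""
    return q if A[q] < A[p] else p                # a tie keeps p, the left candidate


def build_M(A, s):
    """M[i][k] = position of the leftmost minimum in blocks i .. i + 2^k - 1."""
    nb = (len(A) + s - 1) // s
    M = [[min(range(i * s, min((i + 1) * s, len(A))), key=lambda p: A[p])]
         for i in range(nb)]
    k = 1
    while (1 << k) <= nb:
        for i in range(nb - (1 << k) + 1):        # dynamic programming over k
            M[i].append(better(A, M[i][k - 1], M[i + (1 << (k - 1))][k - 1]))
        k += 1
    return M


def query_blocks(A, M, i, j):
    """Leftmost minimum over blocks i..j using two overlapping power-of-two ranges."""
    k = (j - i + 1).bit_length() - 1
    return better(A, M[i][k], M[j - (1 << k) + 1][k])
```

On the array from the introduction with s = 3, the blocks have minima at 0-based positions 0, 4, 8 and 10, and a query over all four blocks returns position 4.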
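Answering In-block-queries
► Computing the in-block-queries for all n/s occurring blocks is too much: n/s · s^2 = O(n log n)
► Really necessary?
Example blocks: 3 4 3 2 4 3 4 and 8 11 10 -5 1 -4 0
[Figure: both blocks have the same Cartesian Tree on positions 1 to 7: root 4 with children 1 and 6; 3 below 1 with child 2; 5 and 7 below 6]
► Fact: B and B' have the same answers to all RMQs iff they have the same Cartesian Tree.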
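The fact can be checked by brute force on the two example blocks from the slide. The Python sketch below compares all RMQ answers and the tree shapes; the helper names are illustrative assumptions and positions are 0-based.

```python
def rmq(B, l, r):
    """0-based position of the leftmost minimum of B[l..r]."""
    return min(range(l, r + 1), key=lambda p: B[p])


def all_answers(B):
    """Answers to every query (l, r) with l <= r, in a fixed order."""
    n = len(B)
    return [rmq(B, l, r) for l in range(n) for r in range(l, n)]


def shape(B, lo=0, hi=None):
    """Cartesian Tree of B[lo..hi] as a nested tuple of positions."""
    if hi is None:
        hi = len(B) - 1
    if lo > hi:
        return None
    i = rmq(B, lo, hi)
    return (i, shape(B, lo, i - 1), shape(B, i + 1, hi))


B1 = [3, 4, 3, 2, 4, 3, 4]
B2 = [8, 11, 10, -5, 1, -4, 0]
```

Both `all_answers(B1) == all_answers(B2)` and `shape(B1) == shape(B2)` hold, although the two blocks contain completely different values.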
Answering In-block-queries
► Number of unlabelled bin. trees with n nodes: n'th Catalan number C_n
► C_n = O(4^n / n^{3/2})
► Theorem: We can store answers to all in-block-queries in space O(n)
► Proof: O(4^s / s^{3/2}) · s^2 = O(2^{2s} · s^{1/2}) = O(2^{log(n)/2} · log^{1/2} n) = O(n^{1/2} · log^{1/2} n)
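A quick numeric check shows how small the table of in-block answers is in practice. The Python sketch assumes logarithms to base 2 and rounds s down to an integer; both choices, and the function names, are assumptions made for illustration.

```python
from math import comb, log2


def catalan(s):
    """s'th Catalan number: the number of binary tree shapes with s nodes."""
    return comb(2 * s, s) // (s + 1)


def in_block_table_entries(n):
    """Entries needed to store an s x s answer table for every possible block type."""
    s = int(log2(n)) // 4          # assumption: base-2 logarithm, rounded down
    return catalan(s) * s * s
```

For n = 2^32 this gives s = 8, C_8 = 1430 block types and 1430 · 64 = 91520 table entries, far fewer than n.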
Answering In-block-queries
► One problem remains:
• For each block B_i we need to know its type in time O(s)
► Type: function t from arrays of size s to {0, …, C_s - 1} with t(B) = t(B') iff B and B' have the same Cartesian Tree (a bijection on Cartesian Trees)
► Build Cartesian Tree for each block B_j
► Give tree a number 0 ≤ t(B_j) < C_s
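O(n)-Algo for Cartesian Tree
► Let T_i be the Cartesian Tree for B[1,i]
► T_i obtained from T_{i-1} as follows:
[Figure: on the rightmost path of T_{i-1}, x is the lowest node with B[x] ≤ B[i] and y is the subtree below x whose elements are > B[i]; in T_i, node i becomes the right child of x and y becomes the left child of i]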
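The incremental step can be sketched in Python with the rightmost path kept on a stack. Positions are 0-based and the child arrays `left` and `right` are a representation chosen here for illustration, not something the slides prescribe.

```python
def cartesian_tree_linear(B):
    """Build the Cartesian Tree of B in O(n); returns (root, left, right) with 0-based positions."""
    n = len(B)
    left, right = [None] * n, [None] * n
    path = []                           # rightmost path, root at the bottom of the stack
    for i in range(n):
        last = None
        while path and B[path[-1]] > B[i]:
            last = path.pop()           # nodes with larger values leave the rightmost path
        left[i] = last                  # the removed part (y) becomes the left subtree of i
        if path:
            right[path[-1]] = i         # i becomes the right child of x
        path.append(i)
    return path[0], left, right
```

Every position is pushed once and popped at most once, which is where the linear running time comes from.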
Computing the block type
► Don't have to calculate tree!
• just keep "rightmost path" p on stack
• compute sequence of numbers l_1, …, l_s: l_i = # nodes deleted from p in step i
• l_1, …, l_s satisfies "prefix property" 0 ≤ ∑_{1≤k≤i} l_k < i
• … because one cannot delete more elements than have been inserted …
• … and each element is removed from p at most once!
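The sequence l_1, …, l_s is easy to compute without building the tree. The following Python sketch uses a strict comparison when popping, which is an assumption made to keep ties broken to the left; the function names are illustrative.

```python
def deletion_counts(B):
    """l_i = number of elements removed from the rightmost path when B[i] is inserted."""
    path, counts = [], []
    for x in B:
        deleted = 0
        while path and path[-1] > x:    # strict comparison: equal elements stay on the path
            path.pop()
            deleted += 1
        path.append(x)
        counts.append(deleted)
    return counts


def has_prefix_property(counts):
    """Check 0 <= l_1 + ... + l_i < i for every i (1-based)."""
    total = 0
    for i, l in enumerate(counts, start=1):
        total += l
        if not 0 <= total < i:
            return False
    return True
```

The two example blocks 3 4 3 2 4 3 4 and 8 11 10 -5 1 -4 0 both produce the sequence 0 0 1 2 0 1 0, which is consistent with them sharing a Cartesian Tree.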
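Computing the block type
► l_1, …, l_s with ∑_{1≤k≤i} l_k < i corresponds to path from (s, s) to (0, 0) in
[Figure: triangular grid of cells (p, q) with 0 ≤ p ≤ q ≤ 3: (0,0), (0,1), (0,2), (0,3), (1,1), (1,2), (1,3), (2,2), (2,3), (3,3)]
• In step i: go up l_i cells, go one to the left
► # paths from (p, q) to (0, 0) given by C_{p,q} = C_{p-1,q} + C_{p,q-1} ("ballot numbers")
• C_{n,n} = C_n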
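The ballot numbers can be tabulated with the recurrence above. The Python sketch assumes the boundary conditions C_{0,0} = 1 and C_{p,q} = 0 for p > q, which the slide does not state explicitly; the function name is also an assumption.

```python
def ballot_table(s):
    """C[p][q] for 0 <= p <= q <= s, with C[0][0] = 1 and C[p][q] = 0 whenever p > q."""
    C = [[0] * (s + 1) for _ in range(s + 1)]
    C[0][0] = 1
    for p in range(s + 1):
        for q in range(p, s + 1):
            if p == 0 and q == 0:
                continue
            above = C[p - 1][q] if p > 0 else 0       # C_{p-1,q}
            beside = C[p][q - 1] if q - 1 >= p else 0  # C_{p,q-1}, zero outside the triangle
            C[p][q] = above + beside
    return C
```

The diagonal of the table reproduces the Catalan numbers 1, 1, 2, 5, 14, …, matching C_{n,n} = C_n.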
Computing the block type
► Paths with greater numbers than path q: at some point above q
► ⇒ add # paths from current cell before going upwards
[Figure: a path q drawn in the grid]
Computing the block type
► Precompute ballot numbers up to s = log(n)/4. For all blocks B_j:
► let S be an empty stack, push(S, -∞)
► q ← s, N ← 0
► for i ← 1, …, s
• while top(S) > B_j[i]
  - pop(S)
  - N ← N + C_{(s-i),q}
  - q ← q - 1
• push(S, B_j[i])
► return N
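The pseudocode above can be turned into runnable Python almost line by line. In this sketch the ballot table is rebuilt locally with the assumed boundary conditions C_{0,0} = 1 and C_{p,q} = 0 for p > q, and the names `ballot_table` and `block_type` are illustrative.

```python
import math


def ballot_table(s):
    """Ballot numbers C[p][q] for 0 <= p <= q <= s (zero outside the triangle)."""
    C = [[0] * (s + 1) for _ in range(s + 1)]
    C[0][0] = 1
    for p in range(s + 1):
        for q in range(max(p, 1), s + 1):
            above = C[p - 1][q] if p > 0 else 0
            beside = C[p][q - 1] if q - 1 >= p else 0
            C[p][q] = above + beside
    return C


def block_type(B, C):
    """Number of the Cartesian Tree of block B, following the stack-based pseudocode."""
    s = len(B)
    stack = [-math.inf]                 # sentinel that is never popped
    q, N = s, 0
    for i in range(1, s + 1):           # i is 1-based as on the slide
        while stack[-1] > B[i - 1]:
            stack.pop()
            N += C[s - i][q]            # paths skipped by going up instead of left
            q -= 1
        stack.append(B[i - 1])
    return N
```

Blocks with the same Cartesian Tree receive the same number, so the two example blocks 3 4 3 2 4 3 4 and 8 11 10 -5 1 -4 0 map to one type; for s = 3 the six permutations of three distinct values hit exactly the C_3 = 5 types 0 to 4.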
Summary and Outlook
► Direct construction algorithm for RMQ
• no dynamic data structures
• never uses more space than in the end
► Not the first … see Alstrup et al. SPAA'02
► Our method can be augmented with techniques from Sadakane SODA'02 to give a succinct data structure (2n+o(n) bits) with direct construction algorithm
Any Questions?