Dual Bitmap Index: Space-Time Efficient Bitmap
Index for Equality and Membership Queries
Niwan Wattanakitrungroj and Sirirut Vanichayobon
Information Systems Technology and Applied Research Laboratory
Department of Computer Science, Prince of Songkla University
Information Systems Technology and Applied Research 2/24
Outline
Introduction
Variations of Bitmap Index
- Simple Bitmap Index
- Interval Bitmap Index
- Scatter Bitmap Index
- Encoded Bitmap Index
- Dual Bitmap Index
Performance Study
Conclusion
[Figure: Data Warehousing Architecture. Components shown: data sources (operational DBs, external sources); extract, transform, load, refresh; data warehouse and data marts; metadata repository; monitoring & administration; OLAP servers; analysis, query/reporting, and data mining tools.]
Introduction
- A data warehouse is a large repository of information accessed through OLAP applications.
- A majority of requests for information from a data warehouse involve dynamic ad hoc queries.
- The ability to answer these queries quickly is a critical issue in the data warehouse environment.
Introduction
To speed up query processing:
- Summary tables
- Indexes
- Parallel machines
Introduction: Bitmap Index
Characteristics:
- simple to represent
- uses less space
- more CPU-efficient
- low-cost Boolean operations
Introduction: Bitmap Index

Employee Table, with bitmap vectors on Gender (F, M) and Education (BS, MS, PhD)

RID  Name    Gender  Education | F  M | BS  MS  PhD
1    Suda    F       BS        | 1  0 | 1   0   0
2    Wichai  M       BS        | 0  1 | 1   0   0
3    John    M       MS        | 0  1 | 0   1   0
4    Marry   F       PhD       | 1  0 | 0   0   1
5    Somsak  M       BS        | 0  1 | 1   0   0
…    …       …       …         | …  … | …   …   …

Equality Query
Select Count(*)
From Employee
Where Gender = 'F';
Answer: 2

Select Name
From Employee
Where Gender = 'M' and Education = 'MS';
Answer: John

Membership Query
Select Name
From Employee
Where Education in ('MS', 'PhD');
Answer: John, Marry
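As a rough illustration of how the three queries above can be answered from the bitmap vectors alone, here is a minimal Python sketch. It stores each bitmap vector as a Python integer (bit i-1 stands for RID i). The helper names `build_bitmaps` and `rids` are hypothetical and exist only for this example.

```python
from collections import defaultdict

# The five Employee rows shown on the slide (RID is the 1-based position).
EMPLOYEES = [
    ("Suda", "F", "BS"),
    ("Wichai", "M", "BS"),
    ("John", "M", "MS"),
    ("Marry", "F", "PhD"),
    ("Somsak", "M", "BS"),
]

def build_bitmaps(rows, column):
    """Return {value: bitmap}; bit (rid - 1) is set when the row holds that value."""
    bitmaps = defaultdict(int)
    for position, row in enumerate(rows):
        bitmaps[row[column]] |= 1 << position
    return dict(bitmaps)

def rids(bitmap):
    """List the 1-based RIDs whose bit is set in the bitmap."""
    return [i + 1 for i in range(bitmap.bit_length()) if (bitmap >> i) & 1]

gender = build_bitmaps(EMPLOYEES, 1)
education = build_bitmaps(EMPLOYEES, 2)

# Equality query: Count(*) where Gender = 'F' is a population count of vector F.
count_f = bin(gender["F"]).count("1")

# Conjunctive equality query: Gender = 'M' and Education = 'MS' is M AND MS.
male_ms = [EMPLOYEES[r - 1][0] for r in rids(gender["M"] & education["MS"])]

# Membership query: Education in ('MS', 'PhD') is MS OR PhD.
ms_or_phd = [EMPLOYEES[r - 1][0] for r in rids(education["MS"] | education["PhD"])]

print(count_f, male_ms, ms_or_phd)  # 2 ['John'] ['John', 'Marry']
```

Only the bitmap vectors are read to decide which rows qualify; the table itself is touched only to fetch the names of the matching RIDs.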
Variations of Bitmap Index
Related Work: Simple Bitmap Index
Let C be the number of distinct values of the indexed attribute (its cardinality).
C = 15: 15 bitmap vectors
Bitmap vectors: $S_0, S_1, S_2, \ldots, S_{C-1}$
Query: "A = v" $= S_v$, e.g. "A = 2" $= S_2$
Variations of Bitmap Index
Related Work: Interval Bitmap Index
C = 15: 8 bitmap vectors
Bitmap vectors: $I_0, I_1, I_2, \ldots, I_{\lceil C/2 \rceil - 1}$, where $I_j$ marks the records whose value lies in the interval $[j, j + m]$ and $m = \lfloor C/2 \rfloor - 1$
Query: "A = v" is a piecewise expression over at most two of these vectors, with the case chosen by comparing $v$ against $0$, $m$, and $C - 1$; e.g. "A = 2" $= I_2 \wedge \lnot I_3$
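A minimal Python sketch of interval encoding follows. It assumes the commonly described formulation in which $I_j$ covers $[j, j + m]$ with $m = \lfloor C/2 \rfloor - 1$. The four-case equality rule in `query_equal` belongs to that formulation and is an assumption here, not a transcription of the slide. All function names are hypothetical, and C is assumed to be at least 4.

```python
def build_interval_index(values, C):
    """Return (I, m): I[j] is the bitmap for the interval [j, j + m]."""
    m = C // 2 - 1                # assumed interval width parameter
    num_vectors = (C + 1) // 2    # ceil(C / 2) bitmap vectors
    I = [[int(j <= v <= j + m) for v in values] for j in range(num_vectors)]
    return I, m

def bit_and(a, b):
    return [x & y for x, y in zip(a, b)]

def bit_or(a, b):
    return [x | y for x, y in zip(a, b)]

def bit_not(a):
    return [1 - x for x in a]

def query_equal(I, m, C, v):
    """Evaluate A = v using at most two interval bitmaps."""
    if v < m:
        return bit_and(I[v], bit_not(I[v + 1]))
    if v == m:
        return bit_and(I[m], I[0])
    if v < C - 1:
        return bit_and(I[v - m], bit_not(I[v - m - 1]))
    # v == C - 1 is covered by neither the first nor the last interval.
    return bit_not(bit_or(I[-1], I[0]))

values = list(range(15)) * 2
I, m = build_interval_index(values, 15)
print(len(I))  # 8 bitmap vectors for C = 15
print(query_equal(I, m, 15, 2) == bit_and(I[2], bit_not(I[3])))  # True
```

Under these assumptions, "A = 2" reduces to $I_2 \wedge \lnot I_3$, which agrees with the example on the slide.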
Variations of Bitmap Index
Related Work: Scatter Bitmap Index
C = 15, m = 5: 8 bitmap vectors
Bitmap vectors: two groups, $L_1, L_2, \ldots$ and $Z_0, Z_1, \ldots$
Query: "A = v" combines two vectors whose subscripts are computed from $v$ and $m$, one of them with $\bmod$; e.g. "A = 2" $= Z_1 \wedge L_2$
Variations of Bitmap Index
Related Work: Encoded Bitmap Index
C = 15: 4 bitmap vectors
Bitmap vectors: $E_0, E_1, E_2, \ldots, E_{\lceil \log_2 C \rceil - 1}$
Mapping: each attribute value is mapped to a code over all bitmap vectors
Query: "A = 2" is evaluated from the code that the mapping assigns to 2
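To make the idea of an encoded index concrete, here is a small Python sketch. It assumes the simplest possible mapping, in which the code of a value is the binary representation of its sequence number; the slide's actual mapping table is not reproduced here. The function names are hypothetical.

```python
from math import ceil, log2

def build_encoded_index(values, C):
    """E[k] holds bit k of each record's code; here the code is the value itself."""
    num_vectors = max(1, ceil(log2(C)))
    return [[(v >> k) & 1 for v in values] for k in range(num_vectors)]

def query_equal(E, v):
    """Evaluate A = v: AND over every vector, negating those where bit k of v is 0."""
    result = [1] * len(E[0])
    for k, vector in enumerate(E):
        wanted = (v >> k) & 1
        result = [r & int(bit == wanted) for r, bit in zip(result, vector)]
    return result

values = [2, 0, 14, 2, 7]
E = build_encoded_index(values, 15)
print(len(E))             # 4 bitmap vectors for C = 15
print(query_equal(E, 2))  # [1, 0, 0, 1, 0]
```

Note that every equality query in this sketch reads all of the vectors, which is the price paid for the small vector count.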
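Dual Bitmap Index
Variations of Bitmap Index
Encoding Scheme of five bitmap indices
- Simple Bitmap Index: needs $C$ bitmap vectors
- Interval Bitmap Index: needs $\lceil C/2 \rceil$ bitmap vectors
- Scatter Bitmap Index: needs $2\sqrt{C}$ bitmap vectors
- Encoded Bitmap Index: needs $\lceil \log_2 C \rceil$ bitmap vectors
- Dual Bitmap Index: needs $\lceil \sqrt{2C + 0.25} + 0.5 \rceil$ bitmap vectors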
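The vector counts above can be checked numerically. The sketch below is an illustration only: the rounding (a ceiling on every non-integer expression, including the Scatter count) is an assumption chosen to agree with the C = 15 figures quoted on the slides, and `vectors_needed` is a hypothetical helper name.

```python
from math import ceil, log2, sqrt

def vectors_needed(C):
    """Number of bitmap vectors each scheme needs for cardinality C (rounding assumed)."""
    return {
        "Simple": C,
        "Interval": ceil(C / 2),
        "Scatter": ceil(2 * sqrt(C)),
        "Encoded": ceil(log2(C)),
        "Dual": ceil(sqrt(2 * C + 0.25) + 0.5),
    }

print(vectors_needed(15))
# {'Simple': 15, 'Interval': 8, 'Scatter': 8, 'Encoded': 4, 'Dual': 6}
```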
Variations of Bitmap Index
Creation of Dual Bitmap Index
C = 15, A = {0, 1, 2, …, 14}
1. Assign an increasing sequence of numbers to each of the distinct values of A (i.e., 0, 1, …, C-1).
2. Calculate n, the total number of bitmap vectors created: $n = \lceil \sqrt{2C + 0.25} + 0.5 \rceil$. Here n = 6.
3. Calculate $C_{hi}$, the highest value of C that can be represented by n bitmap vectors: $C_{hi} = \binom{n}{2}$. Here $C_{hi}$ = 15.
4. For each value v on the record at a given position in A, the bit at that position in vector $D_i$ is 1 if $i = r$ or $i = s$, and 0 otherwise,
where $r$ is obtained from $C_{hi}$ and $v$ through a square-root expression of the same form as the one for n, $s$ is obtained from $v$, $r$, and $n$ using $\bmod$, and v is the value of an indexed attribute for any record.
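The expression for n can be motivated by requiring that the number of distinct pairs of bitmap vectors be at least C, since each attribute value is represented by two vectors. The short derivation below is offered as one reading of the slide rather than the authors' own argument; the ceiling is an assumption that agrees with n = 6 for C = 15.

```latex
\binom{n}{2} = \frac{n(n-1)}{2} \ge C
\;\iff\; n^2 - n - 2C \ge 0
\;\iff\; n \ge \tfrac{1}{2} + \sqrt{2C + \tfrac{1}{4}}
\;\Longrightarrow\; n = \left\lceil \sqrt{2C + 0.25} + 0.5 \right\rceil

C = 15:\quad n = \left\lceil \sqrt{30.25} + 0.5 \right\rceil
             = \left\lceil 5.5 + 0.5 \right\rceil = 6,
\qquad C_{hi} = \binom{6}{2} = 15
```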
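Variations of Bitmap Index: Proposed Bitmap Index
Equality and Membership Queries
1. Find the sequence number of the search value.
2. "A = v" $= D_r \wedge D_s$
where $r$ is obtained from $C_{hi}$ and $v$ through a square-root expression of the same form as the one for n, $s$ is obtained from $v$, $r$, and $n$ using $\bmod$, and v is the value of an indexed attribute for any record.
Example: "A = 2" $= D_5 \wedge D_2$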
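The two-vector idea can be sketched in Python as follows. This is an illustration under a stated assumption: pairs of vectors are handed out with `itertools.combinations` in lexicographic order instead of the slide's closed-form r and s, so the particular pair assigned to a value (for instance value 2) will generally differ from the slide's. All function names are hypothetical.

```python
from itertools import combinations
from math import ceil, sqrt

def build_dual_index(values, C):
    """Return (D, pair_of): D[i] is bitmap vector i, pair_of[v] the two vectors for v."""
    n = ceil(sqrt(2 * C + 0.25) + 0.5)  # number of bitmap vectors
    # Assumption: lexicographic pair assignment, not the slide's r/s formulas.
    pair_of = dict(zip(range(C), combinations(range(n), 2)))
    D = [[0] * len(values) for _ in range(n)]
    for position, v in enumerate(values):
        r, s = pair_of[v]
        D[r][position] = 1
        D[s][position] = 1
    return D, pair_of

def query_equal(D, pair_of, v):
    """Evaluate A = v with a single AND of the two vectors assigned to v."""
    r, s = pair_of[v]
    return [a & b for a, b in zip(D[r], D[s])]

def query_member(D, pair_of, wanted):
    """Evaluate A in wanted by OR-ing the equality result of each wanted value."""
    result = [0] * len(D[0])
    for v in wanted:
        result = [x | y for x, y in zip(result, query_equal(D, pair_of, v))]
    return result

values = [2, 0, 14, 2, 7]
D, pair_of = build_dual_index(values, 15)
print(len(D))                            # 6
print(query_equal(D, pair_of, 2))        # [1, 0, 0, 1, 0]
print(query_member(D, pair_of, {0, 7}))  # [0, 1, 0, 0, 1]
```

Because every record sets exactly the two bits of its own pair, and no two values share a pair, the AND of $D_r$ and $D_s$ is 1 only for records holding the value assigned to that pair.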
Performance study
[Figure: Number of bitmap vectors (space) used to represent an attribute with cardinality C, compared for the Simple, Interval, Scatter, Encoded, and Dual bitmap indexes.]
Performance study
[Figure: Space-Time Trade-off for five Bitmap Indices (Simple, Interval, Scatter, Encoded, Dual), with C = 50 and N = 1,000,000; the data sets are from the TPC-H Benchmark.]
Conclusion
- The Dual Bitmap Index uses less space while maintaining query processing time for equality and membership queries.
- It achieves this by representing each attribute value using only two bitmap vectors; only the low-cost Boolean AND operation is needed to answer an equality query.
- The Dual Bitmap Index has better space-time performance than the other bitmap indexing techniques.
- The Simple Bitmap Index requires the most space.
- The Encoded Bitmap Index's processing time is the worst.
Thank You
Question & answer