Upload
quentin-lott
View
56
Download
1
Tags:
Embed Size (px)
DESCRIPTION
Query execution hashing based Two pass algorithms. —Gaurang Patel— (205). Agenda. Terminology Hash Basics Partitioning relation by hashing Hash-based Grouping and Aggregation Union, Intersection and Difference Hash-Join Algorithm Conclusion. Terminology. Query optimization - PowerPoint PPT Presentation
Citation preview
Terminology Hash Basics Partitioning relation by hashing Hash-based Grouping and Aggregation Union, Intersection and Difference Hash-Join Algorithm Conclusion
Query optimization -- Logical Query plan -- Physical plan Query execution: -- Query processor- group of DBMS
components -- Converts user queries into database
operations Operation Relation- arguments of operation
Large data Hash functions to store large relations Memory buffers Gain factor of M in the size of relations
Tuples from same block foes to same bucket
Hash key depends on grouping attributes First pass: Process each bucket in turn. Second pass: Only one record per group.
Binary operation- same hash function for both arguments
Union: R U S First Pass -- 2(M-1) buckets -- Avoid duplicates Same for R ∩ S, R – S..
I/O operations needed: -- B(R) + B(S) -- 2 more for hashing -- Total: 3(B(R) + B(S)) For, two pass algorithm: -- min(B(R),B(S)) ≤ M2
R(X,Y) ►◄ S(Y,Z) Same as other binary operations Only difference in hash key, Y I/O operations: -- 3(B(R)+B(S)) -- Two pass require min(B(R),B(S)) ≤ M2
-- Further techniques to reduce number of I/O operations