Handling Data Skew in Parallel Joins in Shared-Nothing Systems
Yu Xu, Pekka Kostamaa, Xin Zhou (Teradata), Liang Chen (University of California)
SIGMOD’08
Presented by Kisung Kim
Introduction
Parallel processing continues to be important in large data warehouses
Shared-nothing architecture
– Multiple nodes communicate via a high-speed interconnect network
– Each node has its own private memory and disks
Parallel Unit (PU)
– Virtual processors doing the scans, joins, locking, transaction management, …
Relations are horizontally partitioned across all PUs
– Hash partitioning is commonly used
[Figure: PUs and their local data partitions]
2 / 28
Introduction
Partitioning column
– R: x
– S: y
Hash function
– h(i) = i mod 3 + 1
3 / 28
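The hash-partitioning scheme on this slide can be sketched in a few lines. This is an illustrative single-process simulation, not Teradata's implementation; the function and column names are assumptions.

```python
# A minimal sketch of hash partitioning across PUs, using the slide's
# example hash function h(i) = i mod 3 + 1 (3 PUs, numbered 1..3).
# Relation R is partitioned on column x; row layout is illustrative.

def h(i, n_pus=3):
    """Hash function from the slides: maps a key to a PU number 1..n_pus."""
    return i % n_pus + 1

def hash_partition(rows, key, n_pus=3):
    """Horizontally partition rows across PUs by hashing the partitioning column."""
    pus = {pu: [] for pu in range(1, n_pus + 1)}
    for row in rows:
        pus[h(row[key], n_pus)].append(row)
    return pus

R = [{"x": 1, "payload": "a"}, {"x": 2, "payload": "b"}, {"x": 3, "payload": "c"}]
parts = hash_partition(R, "x")
# The row with x = 3 lands on PU h(3) = 3 mod 3 + 1 = 1
```

S would be partitioned the same way on its column y, so equal join-key values generally land on different PUs unless both relations hash on the join attribute.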
Two Join Geographies
Redistribution plan
– Redistribute the tables based on join attributes if they are not partitioned by the join attributes
– Join is performed on each PU in parallel
4 / 28
Two Join Geographies
Duplication plan
– Duplicate the tuples of the smaller relation to all PUs
– Join is performed on each PU in parallel
5 / 28
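The two join geographies can be contrasted in a small simulation. This is a sketch under assumptions: `pu_of`, the relations, and the nested-loop join are illustrative stand-ins for the system's hash redistribution and local join operators.

```python
# Illustrative single-process simulation of the two join geographies.
from collections import defaultdict

N_PUS = 3
def pu_of(v):                      # hash-redistribution target for a join-key value
    return v % N_PUS

def redistribution_join(R, S):
    """Redistribute both relations on the join attribute, then join per PU."""
    r_parts, s_parts = defaultdict(list), defaultdict(list)
    for a, payload in R:
        r_parts[pu_of(a)].append((a, payload))
    for b, payload in S:
        s_parts[pu_of(b)].append((b, payload))
    out = []
    for pu in range(N_PUS):        # each PU joins its partitions in parallel
        out += [(a, pr, ps) for a, pr in r_parts[pu] for b, ps in s_parts[pu] if a == b]
    return out

def duplication_join(R, S):
    """Duplicate the smaller relation (S) to every PU; R stays where it is."""
    r_parts = defaultdict(list)
    for i, row in enumerate(R):    # R keeps its existing placement
        r_parts[i % N_PUS].append(row)
    out = []
    for pu in range(N_PUS):        # every PU sees a full copy of S
        out += [(a, pr, ps) for a, pr in r_parts[pu] for b, ps in S if a == b]
    return out

R = [(1, "r1"), (2, "r2"), (4, "r4")]
S = [(1, "s1"), (2, "s2")]
assert sorted(redistribution_join(R, S)) == sorted(duplication_join(R, S))
```

Both plans compute the same join; they differ in how much data moves (both relations partially vs. one relation fully to every PU).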
Redistribution Skew
Hot PU
– After redistribution, some PUs hold many more tuples than others
– Performance bottleneck for the whole system
– Caused by relations with many rows sharing the same value in the join attributes
Adding more nodes will not solve the skew problem
Examples
– In travel booking industry, a big customer often makes a large number of reservations on behalf of its end users
– In online e-commerce, a few professionals make millions of transactions a year
– …
6 / 28
Redistribution Skew
Relations in these applications are almost evenly partitioned
When the join attribute is a non-partitioning column, severe redistribution skew occurs
The duplication plan is a solution only when one join relation is fairly small
Our solution
– Partial Redistribution & Partial Duplication (PRPD) join
7 / 28
PRPD Join
Assumptions
– DBAs evenly partition their data for efficient parallel processing
– Skewed rows tend to be evenly partitioned across the PUs
– The system knows the set of skewed values
Intuition
– Deal with the skewed rows and the non-skewed rows of R differently
8 / 28
PRPD
L1: set of skewed values in R.a
L2: set of skewed values in S.b
Step 1
– Scan Ri and split its rows into three sets
  Ri^{2-loc}: all skewed rows of Ri (R.a value in L1) — kept locally
  Ri^{2-dup}: all rows of Ri whose R.a value matches a value in L2 — duplicated to all PUs
  Ri^{2-redis}: all other rows of Ri — hash-redistributed on R.a
– Three spools on each PUi
  Ri^{loc}: all rows from Ri^{2-loc}
  Ri^{dup}: all rows of R duplicated to PUi
  Ri^{redis}: all rows of R redistributed to PUi
– Similarly for S
9 / 28
PRPD: Example
L1 = {1}, L2 = {2}
10 / 28
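The Step 1 split on one PU can be sketched directly from its definition, using the example skew lists above. The function name, column names, and row layout are illustrative assumptions.

```python
# A minimal sketch of PRPD Step 1 on one PU, using the slides' example
# skew lists L1 = {1} (skewed values of R.a) and L2 = {2} (skewed values of S.b).

def prpd_split(Ri, own_skew, other_skew, key="a"):
    """Split the local fragment Ri into the three PRPD sets:
    loc   - skewed rows (key in own_skew), kept locally
    dup   - rows matching the other relation's skewed values, duplicated to all PUs
    redis - all other rows, hash-redistributed on the join attribute
    """
    loc, dup, redis = [], [], []
    for row in Ri:
        if row[key] in own_skew:
            loc.append(row)
        elif row[key] in other_skew:
            dup.append(row)
        else:
            redis.append(row)
    return loc, dup, redis

L1, L2 = {1}, {2}
R1 = [{"a": 1}, {"a": 2}, {"a": 3}]          # fragment of R on PU1
loc, dup, redis = prpd_split(R1, L1, L2)
# loc = [{"a": 1}], dup = [{"a": 2}], redis = [{"a": 3}]
```

For a fragment of S the skew lists swap roles: `prpd_split(Si, L2, L1, key="b")`.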
PRPD Step 1
[Figure: the Ri^{2-loc} sets on PU1–PU3 are stored locally in the Ri^{loc} spools]
Ri^{2-loc}: Store Locally
11 / 28
PRPD Step 1
[Figure: the Ri^{2-dup} sets are duplicated into the Ri^{dup} spools on all PUs]
Ri^{2-dup}: Duplicate
12 / 28
PRPD Step 1
[Figure: the Ri^{2-redis} sets are hash-redistributed on R.a into the Ri^{redis} spools]
Ri^{2-redis}: Redistribute
13 / 28
PRPD Step 1
14 / 28
PRPD Step 2
On each PUi, join the matching spool pairs and union the results:
– Ri^{loc} ⋈ Si^{dup}
– Ri^{dup} ⋈ Si^{loc}
– Ri^{redis} ⋈ Si^{redis}
15 / 28
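The per-PU Step 2 join can be sketched as follows. A simple nested-loop join stands in for the system's local join operator; spool contents continue the slides' example (L1 = {1}, L2 = {2}), and all names are illustrative.

```python
# A minimal sketch of PRPD Step 2 on one PU: join the matching spool pairs
# and union the results.

def local_join(R_rows, S_rows, rkey="a", skey="b"):
    return [(r, s) for r in R_rows for s in S_rows if r[rkey] == s[skey]]

def prpd_step2(R_loc, R_dup, R_redis, S_loc, S_dup, S_redis):
    """Per-PU join: skewed R rows meet duplicated S rows, skewed S rows meet
    duplicated R rows, and the non-skewed remainders meet via redistribution."""
    return (local_join(R_loc, S_dup)
            + local_join(R_dup, S_loc)
            + local_join(R_redis, S_redis))

# Spools on PU1 for the example:
R_loc   = [{"a": 1}]            # skewed R rows kept locally
R_dup   = [{"a": 2}]            # R rows matching S's skewed values, duplicated here
R_redis = [{"a": 3}]            # non-skewed R rows hashed to this PU
S_loc   = [{"b": 2}]
S_dup   = [{"b": 1}]
S_redis = [{"b": 3}]
result = prpd_step2(R_loc, R_dup, R_redis, S_loc, S_dup, S_redis)
# three matching pairs: (a=1, b=1), (a=2, b=2), (a=3, b=3)
```

Because every skewed R value meets a full copy of the matching S rows (and vice versa), no result tuple is lost or duplicated across the three spool pairs.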
PRPD
All sub-steps within each step can run in parallel
Overlapping skewed values
– Should an overlapping skewed value go to Ri^{2-loc} or Ri^{2-dup}?
– The system includes the overlapping skewed value in only one of L1 and L2
– It estimates the size of the affected rows and chooses the smaller side
16 / 28
Comparison with Redistribution Plan
Uses more total spool space than the redistribution plan
– PRPD duplicates some rows
Less networking cost
– Skewed rows are kept locally
– PRPD does not send all skewed rows to a single PU
Ri^{2-loc}: kept locally, less network cost
Ri^{2-dup}: duplicated, more spool space
Ri^{2-redis}: handled the same as in the redistribution plan
17 / 28
Comparison with Duplication Plan
Less spool space than the duplication plan
– Only partial duplication
More networking cost
– When data skew is not significant, the PRPD plan needs to redistribute a large relation
Less join cost
– The duplication plan always joins a complete copy of the duplicated relation
18 / 28
PRPD: Hybrid of Two Plans
L1 = Ø, L2 = Ø
– Same as the redistribution plan
L1 = Uniq(R.a) ⊃ Uniq(S.b)
– Same as duplication plan (duplicate S)
19 / 28
PRPD: Hybrid of Two Plans
n: the number of PUs
x: fraction of skewed rows in relation R
Number of rows of R per PU after redistribution in the redistribution plan
– Hot PU: x·|R| + (1 − x)·|R| / n
– Non-hot PU: (1 − x)·|R| / n
Number of rows of R per PU after redistribution in PRPD
– Hot PU: x·|R| / n + (1 − x)·|R| / n = |R| / n
Ratio of the number of rows on the hot PU in the redistribution plan over the number of rows per PU in PRPD: n·x + (1 − x)
20 / 28
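The hot-PU row counts the slide describes can be checked numerically. This is a sketch under the slides' assumptions (skewed rows all hash to one PU under redistribution, but stay evenly spread under PRPD); the example values of |R|, n, and x are illustrative, not the paper's settings.

```python
# Numeric check of the hot-PU row counts per plan.

def redistribution_hot_pu(R_size, n, x):
    # all skewed rows hash to one PU, plus that PU's share of non-skewed rows
    return x * R_size + (1 - x) * R_size / n

def prpd_hot_pu(R_size, n, x):
    # skewed rows stay evenly spread; non-skewed rows are redistributed evenly
    return x * R_size / n + (1 - x) * R_size / n   # = R_size / n

R_size, n, x = 1_000_000, 80, 0.05                 # 5% skew on 80 PUs
ratio = redistribution_hot_pu(R_size, n, x) / prpd_hot_pu(R_size, n, x)
# ratio simplifies to n*x + (1 - x) = 80 * 0.05 + 0.95 = 4.95
```

The ratio grows linearly in both the skew fraction and the number of PUs, which matches the experimental trend on the later slides.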
Experimental Evaluation
Compare PRPD with the redistribution plan
– Redistribution plan is more widely used than duplication plan
Schema & test query
21 / 28
Generating Skewed Data
Originally 25 unique nations in TPC-H
We increased the number of unique nations to 1000
5% skewness
22 / 28
Query Execution Time
10 nodes, 80 PUs
Node
– Pentium IV 3.6 GHz CPUs, 4 GB memory, 8 PUs
1 million rows for the Supplier relation
1 million rows for the Customer relation
The query result is around 1 billion rows
23 / 28
Query Execution Time
1 Hot PU
24 / 28
Query Execution Time
2 Hot PUs
25 / 28
Different Number of PUs
Speedup ratio of PRPD over the redistribution plan
As the skewness increases, the speedup ratio increases
The larger the system, the larger the speedup
26 / 28
Conclusions
Effectively handling data skew in joins
– An important challenge in parallel DBMSs
We propose the PRPD join
– A hybrid of the redistribution and duplication plans
– PRPD can also be used in multiple joins
27 / 28
Thank you
28 / 28