Spring 2020 – University of Virginia © Praphamontripong
SQL - Join
CS 4750 Database Systems
[A. Silberschatz, H. F. Korth, S. Sudarshan, Database System Concepts, Ch. 4.1] [https://www.dofactory.com/sql/join]

Recap 1: Aggregation
Instructor (refer to alldbs.sql)
ID     name        dept_name   salary
10101  Srinivasan  Comp. Sci.  65000.00
12121  Wu          Finance     90000.00
15151  Mozart      Music       40000.00
22222  Einstein    Physics     95000.00
32343  El Said     History     60000.00
33456  Gold        Physics     87000.00
45565  Kate        Comp. Sci.  75000.00
58583  Katz        History     62000.00
76543  Singh       Finance     80000.00
76766  Brandt      Biology     72000.00
83821  Kate        Comp. Sci.  92000.00
98345  Kim         Elec. Eng.  80000.00
Find the average salary for each department
SELECT dept_name, AVG(salary)
FROM instructor
GROUP BY dept_name;

Recap 2: Aggregation
Instructor (refer to alldbs.sql), same table as in Recap 1
Find the average salary of each department whose average salary is greater than 70000
SELECT dept_name, AVG(salary)
FROM instructor
GROUP BY dept_name
HAVING AVG(salary) > 70000;
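The two recap queries can be tried without a database server. The sketch below is only an illustration: it assumes Python's built-in sqlite3 module and an in-memory copy of the instructor table (the column types are a guess, since alldbs.sql is not shown here).

```python
import sqlite3

# In-memory copy of the instructor table from the recap slides.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE instructor (ID TEXT, name TEXT, dept_name TEXT, salary REAL)"
)
conn.executemany(
    "INSERT INTO instructor VALUES (?, ?, ?, ?)",
    [
        ("10101", "Srinivasan", "Comp. Sci.", 65000.00),
        ("12121", "Wu", "Finance", 90000.00),
        ("15151", "Mozart", "Music", 40000.00),
        ("22222", "Einstein", "Physics", 95000.00),
        ("32343", "El Said", "History", 60000.00),
        ("33456", "Gold", "Physics", 87000.00),
        ("45565", "Kate", "Comp. Sci.", 75000.00),
        ("58583", "Katz", "History", 62000.00),
        ("76543", "Singh", "Finance", 80000.00),
        ("76766", "Brandt", "Biology", 72000.00),
        ("83821", "Kate", "Comp. Sci.", 92000.00),
        ("98345", "Kim", "Elec. Eng.", 80000.00),
    ],
)

# GROUP BY forms one group per department; AVG is computed per group.
per_dept = conn.execute(
    "SELECT dept_name, AVG(salary) FROM instructor "
    "GROUP BY dept_name ORDER BY dept_name"
).fetchall()

# HAVING filters the groups after aggregation (WHERE filters rows before it).
above_70k = conn.execute(
    "SELECT dept_name, AVG(salary) FROM instructor "
    "GROUP BY dept_name HAVING AVG(salary) > 70000 ORDER BY dept_name"
).fetchall()

for dept, avg in above_70k:
    print(f"{dept:<12}{avg:>10.2f}")
```

ORDER BY is added only to make the printed order predictable; it is not part of the slide queries.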

Recap: Primary Keys, Foreign Keys
Hiring (TA_id, Name, Year, Two_week_hours)
TA_id  Name    Year  Two_week_hours
1234   Minnie  4     20
2345   Humpty  3     24
3456   Dumpty  4     30
3333   Minnie  3     12
5678   Mickey  2     16
Key = one or more attributes that uniquely identify a row
TA_Assignment (TA_id, Course)
TA_id  Course
1234   Database Systems
2345   Database Systems
3456   Cloud Computing
1234   Software Analysis
5678   Web Programming Lang.
Foreign key = one or more attributes that uniquely identify a row in another table
TA_Assignment.TA_id references Hiring.TA_id
Foreign key describes a relationship between tables
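As a rough illustration of what "references" means in practice, the sketch below declares the two tables in SQLite (via Python's sqlite3 module) and tries to insert an assignment for a TA who is not in Hiring. The composite key (TA_id, Course) on TA_Assignment and the TA id 9999 are assumptions made for the example; the slide does not state them.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Hiring (
    TA_id INTEGER PRIMARY KEY,
    Name TEXT, Year INTEGER, Two_week_hours INTEGER
);
CREATE TABLE TA_Assignment (
    TA_id INTEGER REFERENCES Hiring(TA_id),
    Course TEXT,
    PRIMARY KEY (TA_id, Course)  -- assumed key, not stated on the slide
);
INSERT INTO Hiring VALUES
    (1234, 'Minnie', 4, 20), (2345, 'Humpty', 3, 24), (3456, 'Dumpty', 4, 30),
    (3333, 'Minnie', 3, 12), (5678, 'Mickey', 2, 16);
INSERT INTO TA_Assignment VALUES
    (1234, 'Database Systems'), (2345, 'Database Systems'),
    (3456, 'Cloud Computing'), (1234, 'Software Analysis'),
    (5678, 'Web Programming Lang.');
""")

# TA 9999 (hypothetical) does not exist in Hiring, so this row has nothing to reference.
try:
    conn.execute("INSERT INTO TA_Assignment VALUES (9999, 'Cloud Computing')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print("dangling reference rejected:", rejected)
```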

JOIN
• Combine related tables or sets of data on key values
• All tables (or relations) must have a unique identifier (primary key)
• Any related tables must contain a copy of the unique identifier (foreign key) from the related table
[Figure: Venn diagrams of a left table and a right table, one per join type]
• FULL JOIN (preserve all rows, uncommon, thus skip)
• LEFT JOIN (preserve all rows in left table)
• RIGHT JOIN (preserve all rows in right table)
• (INNER) JOIN, NATURAL JOIN (most common)

NATURAL JOIN
one(p1, colA, colB)
p1  colA  colB
1   A     aa
2   B     bb
3   C     cc
4   D     dd
5   E     ee
7   G     gg
8   H     hh
two(p2, p1, colX, colY)
p2    p1  colX  colY
9001  1   55    ABC
9002  3   77    PQR
9003  3   95    DEF
9004  7   62    GHI
9005  5   50    KLM
9006  2   42    STU
9007  7   83    XYZ
SELECT * FROM one NATURAL JOIN two;
for each row1 in one:
    for each row2 in two:
        if (row1.p1 = row2.p1):
            output (row1.p1, row1.colA, row1.colB,
                    row2.p2, row2.colX, row2.colY)
Join on the commonly named attribute(s). If no match is found, move on.
The repeated column is not included in the result.
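The nested-loop pseudocode translates almost line for line into a runnable program. The following is a minimal sketch that keeps the two tables as Python lists of tuples; the function name natural_join is made up for this illustration, and it hard-codes p1 as the shared column rather than discovering common column names.

```python
# Rows of one as (p1, colA, colB) and rows of two as (p2, p1, colX, colY).
one = [(1, "A", "aa"), (2, "B", "bb"), (3, "C", "cc"), (4, "D", "dd"),
       (5, "E", "ee"), (7, "G", "gg"), (8, "H", "hh")]
two = [(9001, 1, 55, "ABC"), (9002, 3, 77, "PQR"), (9003, 3, 95, "DEF"),
       (9004, 7, 62, "GHI"), (9005, 5, 50, "KLM"), (9006, 2, 42, "STU"),
       (9007, 7, 83, "XYZ")]


def natural_join(left, right):
    """Nested-loop natural join of one and two on their shared column p1."""
    result = []
    for p1, col_a, col_b in left:
        for p2, right_p1, col_x, col_y in right:
            if p1 == right_p1:
                # The shared column p1 is output once, not twice.
                result.append((p1, col_a, col_b, p2, col_x, col_y))
    return result


joined = natural_join(one, two)
for row in joined:
    print(row)
```

Rows of one with p1 = 4 and p1 = 8 never satisfy the comparison, so they do not appear in the output.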

NATURAL JOIN
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one NATURAL JOIN two;
p1  colA  colB  p2    colX  colY
1   A     aa    9001  55    ABC
2   B     bb    9006  42    STU
3   C     cc    9002  77    PQR
3   C     cc    9003  95    DEF
5   E     ee    9005  50    KLM
7   G     gg    9004  62    GHI
7   G     gg    9007  83    XYZ
The repeated column is not included.
Note the order of the columns; order of rows does not matter

(INNER) JOIN
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one JOIN two ON one.p1 = two.p1;
for each row1 in one:
    for each row2 in two:
        if (row1.p1 = row2.p1):
            output (row1.p1, row1.colA, row1.colB,
                    row2.p2, row2.p1, row2.colX, row2.colY)
Join on the specified attribute(s). If no match is found, move on.
The repeated column is included in the result.
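To see the "repeated column" difference concretely, the sketch below loads one and two into an in-memory SQLite database (an assumption made for convenience; the course database may be a different system) and compares the number of result columns of the ON form and the NATURAL form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE one (p1 INTEGER PRIMARY KEY, colA TEXT, colB TEXT);
CREATE TABLE two (p2 INTEGER PRIMARY KEY, p1 INTEGER REFERENCES one(p1),
                  colX INTEGER, colY TEXT);
INSERT INTO one VALUES (1,'A','aa'), (2,'B','bb'), (3,'C','cc'), (4,'D','dd'),
                       (5,'E','ee'), (7,'G','gg'), (8,'H','hh');
INSERT INTO two VALUES (9001,1,55,'ABC'), (9002,3,77,'PQR'), (9003,3,95,'DEF'),
                       (9004,7,62,'GHI'), (9005,5,50,'KLM'), (9006,2,42,'STU'),
                       (9007,7,83,'XYZ');
""")

# Inner join with an explicit condition: both p1 columns are kept.
inner = conn.execute(
    "SELECT * FROM one JOIN two ON one.p1 = two.p1 ORDER BY one.p1, two.p2"
)
inner_columns = [col[0] for col in inner.description]
inner_rows = inner.fetchall()

# Natural join: the shared column p1 appears only once.
natural = conn.execute("SELECT * FROM one NATURAL JOIN two")
natural_columns = [col[0] for col in natural.description]

print(len(inner_columns), "columns with ON,", len(natural_columns), "with NATURAL")
print(inner_rows[0])
```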

(INNER) JOIN
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one JOIN two ON one.p1 = two.p1;
SELECT * FROM one INNER JOIN two ON one.p1 = two.p1;
p1  colA  colB  p2    p1  colX  colY
1   A     aa    9001  1   55    ABC
2   B     bb    9006  2   42    STU
3   C     cc    9002  3   77    PQR
3   C     cc    9003  3   95    DEF
5   E     ee    9005  5   50    KLM
7   G     gg    9004  7   62    GHI
7   G     gg    9007  7   83    XYZ
The repeated column is included.
Note the order of the columns; order of rows does not matter

Cartesian Product
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one JOIN two;
SELECT * FROM one, two;
(JOIN with no ON clause is accepted by MySQL; some other systems require CROSS JOIN for this.)
for each row1 in one:
    for each row2 in two:
        output (row1.p1, row1.colA, row1.colB,
                row2.p2, row2.p1, row2.colX, row2.colY)
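The loop without a condition can be sketched with itertools.product, and filtering its output shows how an inner join relates to the Cartesian product. This is an illustration on plain Python lists, not how a database engine evaluates the query.

```python
from itertools import product

# Rows of one as (p1, colA, colB) and rows of two as (p2, p1, colX, colY).
one = [(1, "A", "aa"), (2, "B", "bb"), (3, "C", "cc"), (4, "D", "dd"),
       (5, "E", "ee"), (7, "G", "gg"), (8, "H", "hh")]
two = [(9001, 1, 55, "ABC"), (9002, 3, 77, "PQR"), (9003, 3, 95, "DEF"),
       (9004, 7, 62, "GHI"), (9005, 5, 50, "KLM"), (9006, 2, 42, "STU"),
       (9007, 7, 83, "XYZ")]

# Cartesian product: every row of one paired with every row of two.
cartesian = [row1 + row2 for row1, row2 in product(one, two)]

# Keeping only the pairs whose p1 values agree gives the inner join.
# In each concatenated row, index 0 is one.p1 and index 4 is two.p1.
inner = [row for row in cartesian if row[0] == row[4]]

print(len(cartesian), "rows in the product,", len(inner), "rows after filtering")
```

The product has one row per pair, so its size is the product of the two table sizes; this is why a join condition matters on large tables.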

Cartesian Product
one(p1, colA, colB) two(p2, p1, colX, colY)
one × two
p1   colA  colB  p2    p1   colX  colY
1    A     aa    9001  1    55    ABC
1    A     aa    9005  5    50    KLM
1    A     aa    9004  7    62    GHI
1    A     aa    9003  3    95    DEF
1    A     aa    9007  7    83    XYZ
1    A     aa    9002  3    77    PQR
1    A     aa    9006  2    42    STU
2    B     bb    9002  3    77    PQR
2    B     bb    9006  2    42    STU
2    B     bb    9001  1    55    ABC
2    B     bb    9005  5    50    KLM
2    B     bb    9004  7    62    GHI
2    B     bb    9003  3    95    DEF
2    B     bb    9007  7    83    XYZ
3    C     cc    9003  3    95    DEF
3    C     cc    9007  7    83    XYZ
...  ...   ...   ...   ...  ...   ...
Note the order of the columns; order of rows does not matter

NATURAL LEFT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one NATURAL LEFT OUTER JOIN two;
Join on the commonly named attribute(s). If no match is found, pad with NULL values.
LEFT JOIN: preserve all rows in the left table
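A small sketch of the padding behaviour, again assuming an in-memory SQLite copy of one and two: rows of one with no partner in two survive the join, and Python shows their NULL values as None.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE one (p1 INTEGER PRIMARY KEY, colA TEXT, colB TEXT);
CREATE TABLE two (p2 INTEGER PRIMARY KEY, p1 INTEGER REFERENCES one(p1),
                  colX INTEGER, colY TEXT);
INSERT INTO one VALUES (1,'A','aa'), (2,'B','bb'), (3,'C','cc'), (4,'D','dd'),
                       (5,'E','ee'), (7,'G','gg'), (8,'H','hh');
INSERT INTO two VALUES (9001,1,55,'ABC'), (9002,3,77,'PQR'), (9003,3,95,'DEF'),
                       (9004,7,62,'GHI'), (9005,5,50,'KLM'), (9006,2,42,'STU'),
                       (9007,7,83,'XYZ');
""")

# Each result row is (p1, colA, colB, p2, colX, colY).
rows = conn.execute(
    "SELECT * FROM one NATURAL LEFT OUTER JOIN two ORDER BY one.p1, two.p2"
).fetchall()

# Dangling tuples: rows of one with no match, so p2 (index 3) is NULL.
dangling = [row for row in rows if row[3] is None]

print(len(rows), "rows in total")
print(dangling)
```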

NATURAL LEFT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
p1  colA  colB  p2    colX  colY
1   A     aa    9001  55    ABC
2   B     bb    9006  42    STU
3   C     cc    9002  77    PQR
3   C     cc    9003  95    DEF
4   D     dd    NULL  NULL  NULL
5   E     ee    9005  50    KLM
7   G     gg    9004  62    GHI
7   G     gg    9007  83    XYZ
8   H     hh    NULL  NULL  NULL
Note the order of the columns; order of rows does not matter

LEFT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one LEFT OUTER JOIN two ON one.p1 = two.p1;
Join on the specified attribute(s). If no match is found, pad with NULL values.
LEFT JOIN: preserve all rows in the left table
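With an explicit ON condition, SELECT * keeps every column of both tables, so a dangling row is padded in all four columns of two, including two.p1. The sketch below checks this on an in-memory SQLite copy of the tables (SQLite is an assumption made for the example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE one (p1 INTEGER PRIMARY KEY, colA TEXT, colB TEXT);
CREATE TABLE two (p2 INTEGER PRIMARY KEY, p1 INTEGER REFERENCES one(p1),
                  colX INTEGER, colY TEXT);
INSERT INTO one VALUES (1,'A','aa'), (2,'B','bb'), (3,'C','cc'), (4,'D','dd'),
                       (5,'E','ee'), (7,'G','gg'), (8,'H','hh');
INSERT INTO two VALUES (9001,1,55,'ABC'), (9002,3,77,'PQR'), (9003,3,95,'DEF'),
                       (9004,7,62,'GHI'), (9005,5,50,'KLM'), (9006,2,42,'STU'),
                       (9007,7,83,'XYZ');
""")

# Each result row is (one.p1, colA, colB, p2, two.p1, colX, colY).
rows = conn.execute(
    "SELECT * FROM one LEFT OUTER JOIN two ON one.p1 = two.p1 "
    "ORDER BY one.p1, two.p2"
).fetchall()

# The row for p1 = 4 has no match, so all columns coming from two are NULL.
row_for_4 = next(row for row in rows if row[0] == 4)
print(row_for_4)
```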

LEFT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one LEFT OUTER JOIN two ON one.p1 = two.p1;
p1  colA  colB  p2    p1    colX  colY
1   A     aa    9001  1     55    ABC
2   B     bb    9006  2     42    STU
3   C     cc    9002  3     77    PQR
3   C     cc    9003  3     95    DEF
4   D     dd    NULL  NULL  NULL  NULL
5   E     ee    9005  5     50    KLM
7   G     gg    9004  7     62    GHI
7   G     gg    9007  7     83    XYZ
8   H     hh    NULL  NULL  NULL  NULL
Because the join uses ON, the repeated column two.p1 is included (and is NULL in the padded rows).
Note the order of the columns; order of rows does not matter

NATURAL RIGHT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one NATURAL RIGHT OUTER JOIN two;
Join on the commonly named attribute(s). If no match is found, pad with NULL values.
RIGHT JOIN: preserve all rows in the right table
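A right outer join preserves the rows of the right table, which is the same set of rows a left outer join produces when the two operands are swapped. The sketch below uses that equivalence because older SQLite versions do not accept RIGHT JOIN; it is an illustration in SQLite, not the slide's exact statement, and the columns are listed explicitly instead of relying on the order SELECT * would give.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE one (p1 INTEGER PRIMARY KEY, colA TEXT, colB TEXT);
CREATE TABLE two (p2 INTEGER PRIMARY KEY, p1 INTEGER REFERENCES one(p1),
                  colX INTEGER, colY TEXT);
INSERT INTO one VALUES (1,'A','aa'), (2,'B','bb'), (3,'C','cc'), (4,'D','dd'),
                       (5,'E','ee'), (7,'G','gg'), (8,'H','hh');
INSERT INTO two VALUES (9001,1,55,'ABC'), (9002,3,77,'PQR'), (9003,3,95,'DEF'),
                       (9004,7,62,'GHI'), (9005,5,50,'KLM'), (9006,2,42,'STU'),
                       (9007,7,83,'XYZ');
""")

# "one NATURAL RIGHT OUTER JOIN two" keeps every row of two;
# "two NATURAL LEFT OUTER JOIN one" keeps the same rows with the operands swapped.
rows = conn.execute(
    "SELECT two.p1, p2, colX, colY, colA, colB "
    "FROM two NATURAL LEFT OUTER JOIN one "
    "ORDER BY two.p1, two.p2"
).fetchall()

# Every two.p1 value exists in one, so no row needs NULL padding.
padded = [row for row in rows if None in row]
print(len(rows), "rows,", len(padded), "padded with NULL")
```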

NATURAL RIGHT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
p1  p2    colX  colY  colA  colB
1   9001  55    ABC   A     aa
2   9006  42    STU   B     bb
3   9002  77    PQR   C     cc
3   9003  95    DEF   C     cc
5   9005  50    KLM   E     ee
7   9004  62    GHI   G     gg
7   9007  83    XYZ   G     gg
Note the order of the columns; order of rows does not matter
Since two.p1 is a foreign key referencing one.p1, a match is always found, so no row is padded with NULL values.

RIGHT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one RIGHT OUTER JOIN two ON one.p1 = two.p1;
Join on the specified attribute(s). If no match is found, pad with NULL values.
RIGHT JOIN: preserve all rows in the right table

RIGHT OUTER JOIN
Augment a join with the dangling tuples (rows that have no match)
one(p1, colA, colB) two(p2, p1, colX, colY)
SELECT * FROM one RIGHT OUTER JOIN two ON one.p1 = two.p1;
p1  colA  colB  p2    p1  colX  colY
1   A     aa    9001  1   55    ABC
2   B     bb    9006  2   42    STU
3   C     cc    9002  3   77    PQR
3   C     cc    9003  3   95    DEF
5   E     ee    9005  5   50    KLM
7   G     gg    9004  7   62    GHI
7   G     gg    9007  7   83    XYZ
Because the join uses ON, the repeated column two.p1 is included.
Note the order of the columns; order of rows does not matter
Since two.p1 is a foreign key referencing one.p1, a match is always found, so no row is padded with NULL values.

Self Joins
• Join a table with itself
• Useful when the table has a FOREIGN KEY which references its own PRIMARY KEY
• Can be viewed as joining two copies of the same table
• To do a self join, use aliases
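To make the second bullet concrete, here is a sketch with a hypothetical employee table whose manager_id column references its own primary key. The table, its columns, and the names in it are invented for this illustration and are not part of the course database.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical table: manager_id is a foreign key to the same table's primary key.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name TEXT,
    manager_id INTEGER REFERENCES employee(emp_id)
);
INSERT INTO employee VALUES (1, 'Ada', NULL), (2, 'Bob', 1), (3, 'Cy', 1), (4, 'Dee', 2);
""")

# Two aliases (e and m) act as two copies of the same table:
# e is read as "the employee", m as "that employee's manager".
pairs = conn.execute("""
    SELECT e.name, m.name
    FROM employee AS e JOIN employee AS m ON e.manager_id = m.emp_id
    ORDER BY e.emp_id
""").fetchall()

for employee_name, manager_name in pairs:
    print(employee_name, "reports to", manager_name)
```

Ada has no manager, so the inner join drops her row; a left outer join on e would keep it.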

(Challenging) Example: Self Joins
Find names of all TAs who are assigned to both Database Systems and Software Analysis
Hiring(TA_id, Name, Year, Two_week_hours)
TA_id  Name    Year  Two_week_hours
1234   Minnie  4     20
2345   Humpty  3     24
3456   Dumpty  4     30
3333   Minnie  3     12
5678   Mickey  2     16
TA_Assignment(TA_id, Course)
TA_id  Course
1234   Database Systems
2345   Database Systems
3456   Cloud Computing
1234   Software Analysis
5678   Web Programming Lang.
2345   Software Analysis
Discuss/work with your neighbor. How should we write the SQL?

(Challenging) Example: Self Joins
Find names of all TAs who are assigned to both Database Systems and Software Analysis
Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)
SELECT H.Name
FROM Hiring AS H, TA_Assignment AS T
WHERE H.TA_id = T.TA_id AND
      T.Course = 'Database Systems' AND
      T.Course = 'Software Analysis';
Will this work?
No. It returns an empty set: a single row of T cannot have two different Course values at the same time.

(Challenging) Example: Self Joins
Find names of all TAs who are assigned to both Database Systems and Software Analysis
Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)
SELECT H.Name
FROM Hiring AS H, TA_Assignment AS T1, TA_Assignment AS T2
WHERE H.TA_id = T1.TA_id AND
      H.TA_id = T2.TA_id AND
      T1.Course = 'Database Systems' AND
      T2.Course = 'Software Analysis';
Will this work?
Yes. It returns:
Name
Minnie
Humpty
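Both attempts can be run side by side. The sketch below assumes an in-memory SQLite copy of Hiring and TA_Assignment with the six assignment rows of this example; the ORDER BY on the second query is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Hiring (TA_id INTEGER, Name TEXT, Year INTEGER, Two_week_hours INTEGER);
CREATE TABLE TA_Assignment (TA_id INTEGER, Course TEXT);
INSERT INTO Hiring VALUES
    (1234, 'Minnie', 4, 20), (2345, 'Humpty', 3, 24), (3456, 'Dumpty', 4, 30),
    (3333, 'Minnie', 3, 12), (5678, 'Mickey', 2, 16);
INSERT INTO TA_Assignment VALUES
    (1234, 'Database Systems'), (2345, 'Database Systems'),
    (3456, 'Cloud Computing'), (1234, 'Software Analysis'),
    (5678, 'Web Programming Lang.'), (2345, 'Software Analysis');
""")

# One copy of TA_Assignment: the same row would need two Course values at once.
one_copy = conn.execute("""
    SELECT H.Name
    FROM Hiring AS H, TA_Assignment AS T
    WHERE H.TA_id = T.TA_id
      AND T.Course = 'Database Systems'
      AND T.Course = 'Software Analysis'
""").fetchall()

# Two copies (self join): T1 supplies one course, T2 the other, for the same TA.
two_copies = conn.execute("""
    SELECT H.Name
    FROM Hiring AS H, TA_Assignment AS T1, TA_Assignment AS T2
    WHERE H.TA_id = T1.TA_id
      AND H.TA_id = T2.TA_id
      AND T1.Course = 'Database Systems'
      AND T2.Course = 'Software Analysis'
    ORDER BY H.TA_id
""").fetchall()

print("one copy:  ", one_copy)
print("two copies:", two_copies)
```

The other Minnie (TA_id 3333) has no assignments, so she is not returned even though the name matches.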

(Challenging) Example: Self Joins
Find names of all TAs who are assigned to both Database Systems and Software Analysis
Hiring(TA_id, Name, Year, Two_week_hours) TA_Assignment(TA_id, Course)
Let’s consider each part of the query
SELECT H.Name
FROM Hiring AS H, TA_Assignment AS T1, TA_Assignment AS T2
WHERE H.TA_id = T1.TA_id AND
      H.TA_id = T2.TA_id AND              -- all pairs of courses a TA is assigned
      T1.Course = 'Database Systems' AND
      T2.Course = 'Software Analysis';    -- for each pair, check the courses

Wrap-Up
Join combines data across tables
• Nested-loop
• Natural join (most common)
• Inner join (filter Cartesian product)
• Outer joins (preserve non-matching tuples)
• Self join pattern
Different joining techniques can be used to achieve particular goals
What’s next?
• SQL – subqueries, triggers