Upload
others
View
8
Download
0
Embed Size (px)
Citation preview
CSC 261/461 – Database SystemsLecture 5
Fall 2017
MULTISET OPERATIONS IN SQL
2
UNION
3
SELECT R.AFROM R, SWHERE R.A=S.AUNIONSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2
𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}
Whyaren’tthereduplicates?
Whatifwewantduplicates?
UNION ALL
4
SELECT R.AFROM R, SWHERE R.A=S.AUNION ALLSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2
𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 ∪ 𝑟. 𝐴 𝑟. 𝐴 = 𝑡. 𝐴}
ALLindicatesMultisetoperations
EXCEPT
5
SELECT R.AFROM R, SWHERE R.A=S.AEXCEPTSELECT R.AFROM R, TWHERE R.A=T.A Q1 Q2
𝑟. 𝐴 𝑟. 𝐴 = 𝑠. 𝐴 \{𝑟. 𝐴|𝑟. 𝐴 = 𝑡. 𝐴}
6
Nested queries: Sub-queries Returning Relations
SELECT DISTINCT c.cityFROM Company cWHERE c.name IN (
SELECT pr.makerFROM Purchase p, Product prWHERE p.product = pr.name
AND p.buyer = ‘Joe Blow‘)
“CitieswhereonecanfindcompaniesthatmanufactureproductsboughtbyJoeBlow”
Company(name, city)Product(name, maker)Purchase(id, product, buyer)
Anotherexample:
7
Subqueries Returning Relations
SELECT nameFROM ProductWHERE price > ALL(
SELECT priceFROM ProductWHERE maker = ‘Gizmo-Works’)
Product(name, price, category, maker)
Youcanalsouseoperationsoftheform:• s>ALLR• s<ANYR• EXISTSR
Findproductsthataremoreexpensivethanallthoseproducedby“Gizmo-Works”
Ex:
ANYandALLnotsupportedbySQLite.
8
Subqueries Returning Relations
SELECT p1.nameFROM Product p1WHERE p1.maker = ‘Gizmo-Works’
AND EXISTS(SELECT p2.name
FROM Product p2WHERE p2.maker <> ‘Gizmo-Works’
AND p1.name = p2.name)
Product(name, price, category, maker)
Youcanalsouseoperationsoftheform:• s>ALLR• s<ANYR• EXISTSR
Find‘copycat’products,i.e.productsmadebycompetitorswiththesamenamesasproductsmadeby“Gizmo-Works”
Ex:
<>means!=
9
Nested queries as alternatives to INTERSECT and EXCEPT
(SELECT R.A, R.BFROM R)INTERSECT(SELECT S.A, S.BFROM S)
SELECT R.A, R.BFROM RWHERE EXISTS(
SELECT *FROM S
WHERE R.A=S.A AND R.B=S.B)
SELECT R.A, R.BFROM RWHERE NOT EXISTS(
SELECT *FROM SWHERE R.A=S.A AND R.B=S.B)
(SELECT R.A, R.BFROM R)EXCEPT(SELECT S.A, S.BFROM S)
10
Correlated Queries
SELECT DISTINCT titleFROM Movie AS mWHERE year <> ANY(
SELECT yearFROM MovieWHERE title = m.title)
Movie(title, year, director, length)Findmovieswhosetitleappearsmorethanonce.
Notethescopingofthevariables!
Basic SQL Summary
• SQL provides a high-level declarative language for manipulating data (DML)
• The workhorse is the SFW block
• Set operators are powerful but have some subtleties
• Powerful, nested queries also allowed.
11
2. AGGREGATION & GROUP BY
12
What you will learn about in this section
1. Aggregationoperators
2. GROUPBY
3. GROUPBY:withHAVING,semantics
13
14
Aggregation
SELECT COUNT(*)FROM ProductWHERE year > 1995
ExceptCOUNT,allaggregationsapplytoasingleattribute
SELECT AVG(price)FROM ProductWHERE maker = “Toyota”
• SQLsupportsseveralaggregation operations:• SUM,COUNT,MIN,MAX,AVG
15
• COUNTappliestoduplicates,unlessotherwisestated
SELECT COUNT(category) FROM ProductWHERE year > 1995
Note:SameasCOUNT(*).Why?
Weprobablywant:
SELECT COUNT(DISTINCT category)FROM ProductWHERE year > 1995
Aggregation: COUNT
16
Purchase(product, date, price, quantity)
More Examples
SELECT SUM(price * quantity)FROM Purchase
SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’
Whatdothesemean?
17
Simple Aggregations
PurchaseProduct Date Price Quantitybagel 10/21 1 20banana 10/3 0.5 10banana 10/10 1 10bagel 10/25 1.50 20
SELECT SUM(price * quantity)FROM PurchaseWHERE product = ‘bagel’
50(=1*20+1.50*20)
18
Grouping and Aggregation
SELECT product,SUM(price * quantity) AS TotalSales
FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product
Let’sseewhatthismeans…
Findtotalsalesafter10/1/2005perproduct.
Purchase(product, date, price, quantity)
19
Grouping and Aggregation
1.ComputetheFROM andWHERE clauses
2.GroupbytheattributesintheGROUPBY
3.ComputetheSELECT clause:groupedattributesandaggregates
Semanticsofthequery:
20
1. Compute the FROM and WHERE clauses
Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product
FROM
Product Date Price QuantityBagel 10/21 1 20Bagel 10/25 1.50 20Banana 10/3 0.5 10Banana 10/10 1 10
21
2. Group by the attributes in the GROUP BY
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product
GROUP BY Product Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
22
3. Compute the SELECT clause: grouped attributes and aggregates
SELECT product, SUM(price*quantity) AS TotalSalesFROM PurchaseWHERE date > ‘10/1/2005’GROUP BY product
Product TotalSales
Bagel 50
Banana 15
SELECTProduct Date Price Quantity
Bagel10/21 1 2010/25 1.50 20
Banana10/3 0.5 1010/10 1 10
23
HAVING Clause
Samequeryasbefore,exceptthatweconsideronlyproductsthathavemorethan100buyers
HAVINGclausescontainsconditionsonaggregates
SELECT product, SUM(price*quantity)FROM PurchaseWHERE date > ‘10/1/2005’GROUP BY productHAVING SUM(quantity) > 100
WhereasWHEREclausesconditiononindividualtuples…
24
General form of Grouping and Aggregation
• S = Can ONLY contain attributes a1,…,ak and/or aggregates over other attributes
• C1 = is any condition on the attributes in R1,…,Rn
• C2 = is any condition on the aggregate expressions
SELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2
Why?
25
General form of Grouping and Aggregation
SELECT SFROM R1,…,RnWHERE C1GROUP BY a1,…,akHAVING C2
Evaluationsteps:
1. EvaluateFROM-WHERE:applyconditionC1 ontheattributesinR1,…,Rn
2. GROUPBYtheattributesa1,…,ak3. ApplyconditionC2 toeachgroup(mayhaveaggregates)
4. ComputeaggregatesinSandreturntheresult
26
Group-by vs. Nested Query
• Find authors who wrote ³ 10 documents:
• Attempt 1: with nested queriesSELECT DISTINCT Author.nameFROM AuthorWHERE COUNT(
SELECT Wrote.urlFROM WroteWHERE Author.login = Wrote.login) > 10
Author(login, name)Wrote(login, url)
ThisisSQLbyanovice
27
Group-by vs. Nested Query
• Find all authors who wrote at least 10 documents:• Attempt 2: SQL style (with GROUP BY)
SELECT Author.nameFROM Author, WroteWHERE Author.login = Wrote.loginGROUP BY Author.nameHAVING COUNT(Wrote.url) > 10
NoneedforDISTINCT:automaticallyfromGROUPBY
ThisisSQLbyanexpert
Group-by vs. Nested Query
Which way is more efficient?
• Attempt #1- With nested: How many times do we do a SFW query over all of the Wrote relations?
• Attempt #2- With group-by: How about when written this way?
WithGROUPBYcanbemuchmoreefficient!
Topics Covered
1. NULLs
2. OuterJoins
3. WithandCase
4. Constraint
5. Schema Change Statements
29
Comparisons Involving NULLand Three-Valued Logic (cont’d.)
Slide 7- 6
31
NULLS in SQL
• Whenever we don’t have a value, we can put a NULL
• Can mean many things:– Value does not exists– Value exists but is unknown– Value not applicable– Etc.
• The schema specifies for each attribute if can be null (nullable attribute) or not
• Each individual NULL value considered to be different from every other NULL value
• SQL uses a three-valued logic:– TRUE, FALSE, and UNKNOWN (like Maybe)
• NULL = NULL comparison is avoided
• How does SQL cope with tables that have NULLs?
32
Null Values
Unexpected behavior:
SELECT *FROM PersonWHERE age < 25 OR age >= 25
SomePersonsarenotincluded!
33
Null Values
Can test for NULL explicitly:– x IS NULL– x IS NOT NULL
SELECT *FROM PersonWHERE age < 25 OR age >= 25
OR age IS NULL
NowitincludesallPersons!
34
RECAP: Inner Joins
By default, joins in SQL are “inner joins”:
SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName
SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName
Product(name, category)Purchase(prodName, store)
Bothequivalent:BothINNERJOINS!
35
Inner Joins + NULLS = Lost data?
By default, joins in SQL are “inner joins”:
However:Productsthatneversold(withnoPurchasetuple)willbelost!
SELECT Product.name, Purchase.storeFROM Product JOIN Purchase ON Product.name = Purchase.prodName
SELECT Product.name, Purchase.storeFROM Product, PurchaseWHERE Product.name = Purchase.prodName
Product(name, category)Purchase(prodName, store)
36
Outer Joins
• An outer join returns tuples from the joined relations that don’t have a corresponding tuple in the other relations– I.e. If we join relations A and B on a.X = b.X, and there is an entry in A
with X=5, but none in B with X=5…• A LEFT OUTER JOIN will return a tuple (a, NULL)!
• Left outer joins in SQL: SELECT Product.name, Purchase.storeFROM Product LEFT OUTER JOIN Purchase ON
Product.name = Purchase.prodName
Nowwe’llgetproductseveniftheydidn’tsell
37
name category
Gizmo gadget
Camera Photo
OneClick Photo
prodName store
Gizmo Wiz
Camera Ritz
Camera Wiz
name store
Gizmo Wiz
Camera Ritz
Camera Wiz
Product PurchaseINNER JOIN:
SELECT Product.name, Purchase.storeFROM Product INNER JOIN Purchase
ON Product.name = Purchase.prodName
Note:anotherequivalentwaytowriteanINNERJOIN!
38
name category
Gizmo gadget
Camera Photo
OneClick Photo
prodName store
Gizmo Wiz
Camera Ritz
Camera Wiz
name store
Gizmo Wiz
Camera Ritz
Camera Wiz
OneClick NULL
Product PurchaseLEFT OUTER JOIN:
SELECT Product.name, Purchase.storeFROM Product LEFT OUTER JOIN Purchase
ON Product.name = Purchase.prodName
39
Other Outer Joins
• Left outer join:– Include the left tuple even if there’s no match
• Right outer join:– Include the right tuple even if there’s no match
• Full outer join:– Include the both left and right tuples even if there’s no match
Acknowledgement
• Some of the slides in this presentation are taken from the slides provided by the authors.
• Many of these slides are taken from cs145 course offered byStanford University.
• Thanks to YouTube, especially to Dr. Daniel Soper for his useful videos.
CSC261,Spring2017,UR