IN3020/4020 – Database Systems
Spring 2020, Week 5.1 (cont. from week 4)

RELATIONAL ALGEBRA – Part 2
Bags and Derivations

Dr. M. Naci Akkøk, Chief Architect, Oracle Nordics
Based upon slides by E. Thorstensen from Spring 2019


You can derive division using projection, Cartesian product and difference

R(A,B) div S(B) = π_A(R) – π_A((π_A(R) × S) – R)

Note: (π_A(R) × S) – R contains the (A,B) combinations that are missing from R, so its A-values are those that fail the condition for at least one value in S. This means that π_A(R) – π_A((π_A(R) × S) – R) contains exactly the A-values that satisfy the condition for ALL values in S!

Typical steps for computation of R(A,B) div S(B):

o Find all possible combinations of R(A) with S(B) by computing R(A) × S(B), where R(A) abbreviates π_A(R). Call the result R1
o Subtract the actual R(A,B) from R1. Call the result R2
o The A-values in R2 are those that are not associated with every value in S(B); therefore R(A) – R2(A) gives us the A-values that are associated with all values in S
o Take a look at the examples here:
  o https://www.geeksforgeeks.org/sql-division/
  o https://www.studytonight.com/dbms/division-operator.php

DERIVATION:
R(A,B) div S(B) = π_A(R) – π_A((π_A(R) × S) – R)

INTERPRETATION:
R(A,B) div S(B) = R(A) – ( ( R(A) × S(B) ) – R(A,B) )
o R1 is R(A) × S(B)
o R2(A) is the A column of R1 – R(A,B)
o R(A) – R2(A) is the A that are associated with all values in S
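
The derivation can be tried out on a small example. The following is a minimal sketch using Python sets, assuming R is a set of (a, b) pairs and S is a plain set of b values; the names `project_a`, `divide`, `passed` and `required` are made up for illustration.

```python
def project_a(rel):
    """π_A: keep the first component of each (a, b) pair."""
    return {a for (a, _) in rel}


def divide(r, s):
    """R(A,B) div S(B) = π_A(R) – π_A((π_A(R) × S) – R), on Python sets."""
    r_a = project_a(r)
    r1 = {(a, b) for a in r_a for b in s}  # R1: all possible combinations
    r2 = r1 - r                            # R2: combinations missing from R
    return r_a - project_a(r2)             # A-values paired with every b in S


# Hypothetical data: which student (A) has passed which course (B).
passed = {
    ("ann", "db"), ("ann", "os"),
    ("bob", "db"),
    ("cat", "db"), ("cat", "os"), ("cat", "ai"),
}
required = {"db", "os"}

result = divide(passed, required)
print(sorted(result))
```

Here bob is removed because the combination (bob, os) ends up in R2, while ann and cat are paired with every required course.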

Other derivable operators

o R ∩ S = R – (R – S)
  o which means that we don't need ∩ (we can derive it)
o R ⋈θ S = σ_θ(R × S)
  o which means that we don't need ⋈θ
o R ⋈ S = π_L(σ_θ(R × S)), where L is the list of attributes in R followed by the attributes in S that don't occur in R, and θ is R.A1 = S.A1 AND ... AND R.Ak = S.Ak, where A1, ..., Ak are all the attributes that occur in both R and S
  o which means that we don't need ⋈
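
As an illustration, the first two derivations can be written directly with Python sets. This is only a sketch under set semantics; tuples are plain Python tuples, and `intersect` and `theta_join` are hypothetical helper names.

```python
from itertools import product


def intersect(r, s):
    """R ∩ S derived as R – (R – S)."""
    return r - (r - s)


def theta_join(r, s, theta):
    """R ⋈θ S derived as σ_θ(R × S); each result tuple is a concatenation."""
    return {t + u for t, u in product(r, s) if theta(t, u)}


r = {(1, "x"), (2, "y"), (3, "z")}
s = {(2, "y"), (3, "w")}

common = intersect(r, s)
# θ: the first attribute of the S tuple is larger than that of the R tuple.
joined = theta_join(r, s, lambda t, u: u[0] > t[0])
```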

The minimal set of operators

o The operators in the set {∪, –, π, σ, ×, ρ} cannot be expressed using the other operators in the set
o They are the minimal independent set of operators
o We still wish to keep the other operators because there are efficient algorithms for them and because it is often simpler to formulate queries using them

Bag

o Commercial DBMSs use Bag (multiset) and not Set as the basic type for realizing relations
o Set (D): Each element in D occurs at most once. The order of the elements does not matter
  {a, b, c} = {a, c, b} = {a, a, b, c} = {c, a, b, a}
o Bag (D): Each element in D can occur more than once. The order of the elements does not matter
  {a, b, c} = {a, c, b} ≠ {a, a, b, c} = {c, a, b, a}
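
The difference can be observed in Python, using `set` for sets and `collections.Counter` as a stand-in for bags (an assumption of this sketch, not something the slides prescribe):

```python
from collections import Counter

# The same elements written down with and without a duplicate 'a'.
set_1, set_2 = set("abc"), set("aabc")
bag_1, bag_2 = Counter("abc"), Counter("aabc")

sets_equal = set_1 == set_2    # duplicates vanish in a set
bags_equal = bag_1 == bag_2    # multiplicities differ in a bag
order_irrelevant = Counter("aabc") == Counter("caba")
```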

Why Bag and not Set?

o Bag provides more efficient union and projection calculations than Set (no duplicates have to be found and removed)
o In aggregation, we need Bag functionality
o But: Bag is more space consuming than Set

Relational operators on Bags

o The definitions become slightly different
o Not all algebraic laws that hold for Sets hold for Bags
  Example: (R ∪ S) – T = (R – T) ∪ (S – T) holds for Sets, but not in general for Bags
o When we later in the lectures mention «bag relation», we mean a relation schema and an instance that is a bag
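
A small counterexample shows how (R ∪ S) – T and (R – T) ∪ (S – T) can differ under bag semantics. The sketch assumes `collections.Counter` as the bag representation, where `+` adds multiplicities and `-` drops counts that reach zero.

```python
from collections import Counter

# Each bag holds a single occurrence of the tuple 'a'.
R, S, T = Counter("a"), Counter("a"), Counter("a")

lhs = (R + S) - T        # bag union first: {a, a} – {a} = {a}
rhs = (R - T) + (S - T)  # bag difference first: {} ∪ {} = {}

# The same expressions under set semantics agree.
set_lhs = ({"a"} | {"a"}) - {"a"}
set_rhs = ({"a"} - {"a"}) | ({"a"} - {"a"})
```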

Bag Union

o Let R and S be bag relations
  If t is a tuple that occurs n times in R and m times in S, then t occurs n + m times in the bag relation R ∪ S
o Example of typical Bag union:
  {a, a, b, c, c} ∪ {a, c, c, c, d} = {a, a, a, b, c, c, c, c, c, d}

Bag Intersection

o Let R and S be bag relations
  If t is a tuple that occurs n times in R and m times in S, then t occurs min(n, m) times in the bag relation R ∩ S
o Example of typical Bag intersection:
  {a, a, b, c, c} ∩ {a, c, c, c, d} = {a, c, c}

Bag Difference

o Let R and S be bag relations
  If t is a tuple that occurs n times in R and m times in S, then t occurs max(0, n – m) times in the bag relation R – S
o Example of typical Bag difference:
  {a, a, b, c, c} – {a, c, c, c, d} = {a, b}
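
The three bag operations map closely onto Python's `collections.Counter`, whose `+`, `&` and `-` operators use exactly the n + m, min(n, m) and max(0, n – m) rules. A sketch with the example bags {a, a, b, c, c} and {a, c, c, c, d}:

```python
from collections import Counter

R = Counter("aabcc")   # {a, a, b, c, c}
S = Counter("acccd")   # {a, c, c, c, d}

union = R + S          # n + m occurrences of each element
intersection = R & S   # min(n, m) occurrences
difference = R - S     # max(0, n - m) occurrences

print(sorted(union.elements()))
print(sorted(intersection.elements()))
print(sorted(difference.elements()))
```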

Bag Selection

o If R is a bag relation, then σ_C(R) is the bag relation obtained from R by applying C to each tuple individually and selecting the tuples in R that satisfy the condition C

Bag Projection

o If R is a bag relation and L is a (non-empty) list of attributes, then π_L(R) is the bag relation obtained from R by selecting the columns of the attributes in L
o π_L(R) has as many tuples as R
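
A possible sketch of bag selection and bag projection, assuming a bag relation is stored as a `Counter` that maps each tuple to its multiplicity. The relation R(name, city) and the helper names `select` and `project` are invented for the example.

```python
from collections import Counter

# Hypothetical bag relation R(name, city); ("ann", "oslo") occurs twice.
R = Counter({("ann", "oslo"): 2, ("bob", "oslo"): 1, ("cat", "bergen"): 1})


def select(bag, cond):
    """Bag selection: test each tuple individually, keep its multiplicity."""
    out = Counter()
    for t, n in bag.items():
        if cond(t):
            out[t] += n
    return out


def project(bag, idxs):
    """Bag projection: keep the listed columns, never merge duplicates."""
    out = Counter()
    for t, n in bag.items():
        out[tuple(t[i] for i in idxs)] += n
    return out


in_oslo = select(R, lambda t: t[1] == "oslo")
cities = project(R, [1])
```

The projection result has the same number of tuples as R; ("oslo",) simply occurs three times.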

Cartesian Product of Bags

o R × S is the bag relation obtained from the bag relations R and S by forming all possible concatenations of one tuple from R and one tuple from S
o If R has n tuples and S has m tuples, there will be nm tuples in R × S

Theta-Join of Bags

o If R and S are bag relations, the bag relation R ⋈θ S, where θ is a condition, is formed as follows:
  1. Calculate R × S (Cartesian bag product)
  2. Select the tuples that satisfy the condition θ
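
The two steps can be sketched as follows, again assuming a `Counter` from tuples to multiplicities as the bag representation (`bag_product` and `bag_theta_join` are hypothetical names):

```python
from collections import Counter


def bag_product(r, s):
    """R × S on bags: multiplicities multiply for each concatenated tuple."""
    out = Counter()
    for t, n in r.items():
        for u, m in s.items():
            out[t + u] += n * m
    return out


def bag_theta_join(r, s, theta):
    """Step 1: bag product. Step 2: keep the tuples that satisfy theta."""
    return Counter({t: n for t, n in bag_product(r, s).items() if theta(t)})


R = Counter({(1,): 2, (2,): 1})   # 3 tuples in total
S = Counter({(1,): 1, (3,): 1})   # 2 tuples in total

prod = bag_product(R, S)
joined = bag_theta_join(R, S, lambda t: t[0] == t[1])
```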

Natural Join of Bags

o If R and S are bag relations, then R ⋈ S is the bag relation obtained by merging matching tuples in R and S individually

Additional operators in relational algebra

o Duplicate elimination
o Aggregation operators
o Grouping
o Sorting
o Extended projection
o Outer Join

Duplicate elimination

o δ(R) removes multiple occurrences of tuples from the bag relation R
o The result is a set

Aggregation operations

o Used on bags of atomic values for an attribute A
o Used in combination with the grouping operator

Standard aggregation operations #1

o SUM(A):
  o Sums all values in the column of A
  o A's domain must be numeric values
o AVG(A):
  o Calculates the average of the values in the column of A (The column must have at least one value)
  o A's domain must be numeric values

Standard aggregation operations #2

o MIN(A), MAX(A):
  o Selects the smallest / largest value in the column of A (The column must have at least one value)
  o The domain of A must have an order relation
    o For numeric values this is <
    o Lexicographic ordering is used for strings
o COUNT(A):
  o Counts the number of tuples in the relation with values in the column of A
  o (so tuples where A is nil are not counted)

Grouping

o Used when we want to apply an aggregation operator to groups of values
o Form: γ_L(R), where L is a list of items with all the items in the list different. The elements are in one of the following two forms:
  o A
    o A is an attribute in R
    o A is called a grouping attribute
  o AGG(A) → AggRes
    o AGG is an aggregation operator
    o AggRes is an unused attribute name
    o A is called an aggregation attribute

The resulting relation after grouping

Given γ_L(R), the result relation is constructed as follows:
1. Partition R in groups, one group for each collection of tuples that are equal in all grouping attributes in L
2. For each group, produce a tuple consisting of
   i. The values of the grouping attributes in the group
   ii. For each aggregation attribute in L, the aggregation over all the tuples in the group

The result relation gets as many attributes as there are elements in L, and attribute names as specified by L. The result instance contains one tuple per group.
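
The construction can be sketched in Python, assuming tuples are dictionaries from attribute names to values. The function `group_by`, its parameters and the sample `orders` data are illustrative only.

```python
from collections import defaultdict


def group_by(rows, keys, aggs):
    """Grouping sketch: `keys` lists the grouping attributes and `aggs`
    maps each result name to an (aggregation function, attribute) pair."""
    groups = defaultdict(list)
    for row in rows:                          # step 1: partition R
        groups[tuple(row[k] for k in keys)].append(row)
    result = []
    for key, members in groups.items():       # step 2: one tuple per group
        out = dict(zip(keys, key))            # i. grouping attribute values
        for name, (fn, attr) in aggs.items():
            out[name] = fn([m[attr] for m in members])  # ii. aggregates
        result.append(out)
    return result


# Hypothetical orders relation.
orders = [
    {"agent_code": "A1", "amount": 100},
    {"agent_code": "A1", "amount": 250},
    {"agent_code": "A2", "amount": 40},
]
summary = group_by(
    orders,
    ["agent_code"],
    {"total": (sum, "amount"), "largest": (max, "amount")},
)
```

This corresponds to a list L with the items agent_code, SUM(amount) → total and MAX(amount) → largest.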

Grouping use & example

-- Inner query: count the rows per agent_code.
-- Outer query: take the largest of those counts.
SELECT MAX(mycount)
FROM (
    SELECT agent_code, COUNT(agent_code) mycount
    FROM orders
    GROUP BY agent_code
);
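
To see the query run, one option is an in-memory SQLite database through Python's standard `sqlite3` module. The table layout (an `ord_num` column next to `agent_code`) and the rows are made up for this sketch; only the query shape comes from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema and data for the orders table.
conn.execute("CREATE TABLE orders (ord_num INTEGER, agent_code TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "A1"), (2, "A1"), (3, "A2"), (4, "A1"), (5, "A3")],
)

# Group by agent, count per group, then take the maximum count.
(max_count,) = conn.execute(
    """
    SELECT MAX(mycount) FROM (
        SELECT agent_code, COUNT(agent_code) AS mycount
        FROM orders
        GROUP BY agent_code
    )
    """
).fetchone()
conn.close()
```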

Sorting

o τ_L(R), where R is a relation and L a list of attributes A1, A2, ..., Ak, results in a list of tuples sorted first by A1, then by A2 within each group of equal A1 values, etc.
o The attributes that are not included in the list do not affect the sorting: tuples that agree on all attributes in L are arranged in arbitrary order
o The result is a list, so the operation is meaningful only as the last, final operation on relations
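
In Python, sorting on a list of attributes can be sketched with `sorted` and `operator.itemgetter`; the relation and its attribute names below are invented for the example.

```python
from operator import itemgetter

# Hypothetical relation R(city, name, age) as a list of dictionaries.
rows = [
    {"city": "oslo", "name": "bob", "age": 30},
    {"city": "bergen", "name": "cat", "age": 25},
    {"city": "oslo", "name": "ann", "age": 41},
]

# Sort by city first, then by name within each group of equal cities.
ordered = sorted(rows, key=itemgetter("city", "name"))
names = [r["name"] for r in ordered]
```

The attribute age is not in the list and plays no role in the ordering.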

Extended projection

o π_L(R), classic: L is a list of attributes in R
o π_L(R), extended: L is a list where each item can be
  i. A simple attribute in R
  ii. An expression A → B, where A is an attribute in R and B is an unused attribute name. Renames A in R to B in the result relation
  iii. An expression E → B, where E is an expression built up of attributes in R, constants, arithmetic operators and string operators, and B is an unused attribute name

Extended projection – The result relation

The result relation π_L(R) is obtained from R as follows:
o Consider each tuple in R separately
o Substitute the tuple's values for the attribute names in L and calculate the expressions in L
o The result relation is a bag with as many attributes as items in L, and with names as given in L
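
A rough sketch of extended projection, assuming each item of L is modelled as a result name plus a function of one tuple. The helper `extended_project` and the sample relation are hypothetical.

```python
def extended_project(rows, items):
    """`items` maps each result attribute name to a function of one tuple."""
    return [{name: expr(row) for name, expr in items.items()} for row in rows]


# Hypothetical relation R(first, last, price, qty).
R = [
    {"first": "Ann", "last": "Berg", "price": 100, "qty": 2},
    {"first": "Bob", "last": "Dahl", "price": 100, "qty": 2},
]

result = extended_project(R, {
    "first": lambda t: t["first"],             # i.   simple attribute
    "surname": lambda t: t["last"],            # ii.  last → surname
    "total": lambda t: t["price"] * t["qty"],  # iii. price * qty → total
})
totals_only = extended_project(R, {"total": lambda t: t["price"] * t["qty"]})
```

Each tuple is handled separately, so `totals_only` keeps both occurrences of the same total: the result is a bag.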

Outer Join

o Outer join is used when you want to preserve dangling tuples from natural join
o R ⋈_O S, outer join:
  o Start with R ⋈ S
  o Add dangling tuples from R and S
  o Missing attribute values are filled in with ⊥ (nil)
o R ⋈_OL S, left outer join: Only dangling tuples from R are added
o R ⋈_OR S, right outer join: Only dangling tuples from S are added
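
The three steps of the full outer join can be sketched in Python. The sketch assumes tuples are dictionaries, that the attribute lists of both relations are passed in explicitly, and that `None` stands in for ⊥; the function name and sample data are made up.

```python
def full_outer_join(r, s, r_attrs, s_attrs):
    """Natural full outer join of two lists of dictionaries."""
    common = [a for a in r_attrs if a in s_attrs]
    all_attrs = r_attrs + [a for a in s_attrs if a not in r_attrs]

    def matches(t, u):
        return all(t[a] == u[a] for a in common)

    # Start with the natural join.
    joined = [{**t, **u} for t in r for u in s if matches(t, u)]
    # Find the dangling tuples on each side.
    dangling_r = [t for t in r if not any(matches(t, u) for u in s)]
    dangling_s = [u for u in s if not any(matches(t, u) for t in r)]
    # Missing attribute values are filled in with None (for nil).
    padded = [{a: t.get(a) for a in all_attrs} for t in dangling_r + dangling_s]
    return joined + padded


company = [
    {"company_id": 1, "company_name": "Alpha"},
    {"company_id": 2, "company_name": "Beta"},
]
foods = [
    {"company_id": 1, "item_name": "bread"},
    {"company_id": 3, "item_name": "rice"},
]
rows = full_outer_join(
    company, foods,
    ["company_id", "company_name"], ["company_id", "item_name"],
)
```

Leaving out `dangling_s` would give the left outer join, and leaving out `dangling_r` the right outer join.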

Outer Join Example

https://www.w3resource.com/sql/joins/perform-an-outer-join.php

-- Oracle's (+) marks foods as the optional side, so every company row is kept.
SELECT company.company_name, company.company_id, foods.company_id, foods.item_name, foods.item_unit
FROM company, foods
WHERE company.company_id = foods.company_id(+);

(Figure: the foods and company tables)

Relations and rules of integrity

o We can express referential integrity, functional dependencies and multi-valued dependencies, and also other classes of integrity rules, in relational algebra!

Examples of integrity rules in classical relational algebra

o If E is an expression in relational algebra, then E = ∅ is an integrity rule that says that E does not have any tuples
o If E1 and E2 are expressions in relational algebra, then E1 ⊆ E2 is an integrity rule that says that each tuple in E1 shall also be in E2
o Note that E1 ⊆ E2 and E1 – E2 = ∅ are equivalent. So are E = ∅ and E ⊆ ∅. Thus, only one of the forms above is sufficient
o Strictly speaking, ∅ is not a relational algebra expression. We could have (we should have) written R – R instead, for an arbitrary relation R with the same schema as E

Examples of integrity rules in classical relational algebra

o Referential integrity requires that a foreign key must have a matching primary key or it must be null:
  "A is foreign key for S", where A is an attribute in R and B is primary key in S: δ(π_A(R)) ⊆ π_B(S)
o Domain constraint refers to the constraint requiring a valid set of values for an attribute. The data type of a domain can be string, character, integer, time, date, currency, etc. The value of the attribute (say A) of R must be available in the corresponding domain (D) of A: σ_{A∉D}(R) = ∅
o Entity integrity specifies that the primary key of every instance of an entity must be present, must be unique and must have a value other than NULL
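
A referential integrity rule can be checked in either of the two equivalent forms, E1 ⊆ E2 or E1 – E2 = ∅. The following sketch uses Python sets for the projected columns; the relations `orders` and `customers` and their attributes are hypothetical, and null foreign keys are not modelled.

```python
# Hypothetical relations: orders.customer is a foreign key for customers.id.
orders = [{"order_id": 1, "customer": 10}, {"order_id": 2, "customer": 11}]
customers = [{"id": 10}, {"id": 12}]

fk_values = {o["customer"] for o in orders}  # projection on the foreign key
pk_values = {c["id"] for c in customers}     # projection on the primary key

subset_form = fk_values.issubset(pk_values)  # E1 ⊆ E2
violations = fk_values - pk_values           # E1 – E2, must be empty
empty_form = violations == set()
```

Both forms give the same verdict, and the difference additionally names the offending values.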

Examples of integrity rules in classical relational algebra

o Functional dependency is a relationship that exists when one attribute uniquely determines another attribute. If R is a relation with attributes X and Y, a functional dependency between the attributes is represented as X → Y, which specifies that Y is functionally dependent on X.
o Functional dependencies: "A1 A2 ... An → B1 B2 ... Bm" in R:
  σ_C(ρ_R1(R) × ρ_R2(R)) = ∅
  where C is the expression
  R1.A1 = R2.A1 AND ... AND R1.An = R2.An AND (R1.B1 ≠ R2.B1 OR ... OR R1.Bm ≠ R2.Bm)
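
The functional dependency rule can be sketched in Python by pairing every tuple of R with every tuple of R and selecting the pairs that agree on the left-hand side but differ on the right-hand side. The function `fd_violations` and the zip/city data are illustrative assumptions.

```python
from itertools import product


def fd_violations(rows, lhs, rhs):
    """Pairs of tuples that are equal on all `lhs` attributes but differ
    on at least one `rhs` attribute; the dependency holds if none exist."""
    return [
        (t1, t2)
        for t1, t2 in product(rows, rows)   # R paired with a copy of itself
        if all(t1[a] == t2[a] for a in lhs)
        and any(t1[b] != t2[b] for b in rhs)
    ]


# Hypothetical relation R(zip, city) with the dependency zip → city.
R = [
    {"zip": "0150", "city": "Oslo"},
    {"zip": "0150", "city": "Oslo"},
    {"zip": "5003", "city": "Bergen"},
]
holds = fd_violations(R, ["zip"], ["city"]) == []

R_bad = R + [{"zip": "5003", "city": "Oslo"}]
broken = fd_violations(R_bad, ["zip"], ["city"])
```

Each violating pair shows up twice, once in each order, because both copies of R range over all tuples.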