Using Relational Databases and SQL John Hurley Department of Computer Science California State University, Los Angeles Lecture 3: Joins Part I

Page 1: Using Relational Databases and SQL

Using Relational Databases and SQL

John Hurley

Department of Computer Science

California State University, Los Angeles

Lecture 3: Joins Part I

Page 2: Using Relational Databases and SQL

Multiple Tables

Why do we have so many different tables in such a simple database?

One-to-one relationships

One-to-many relationships

Many-to-many relationships

Need to record the data with as little redundancy as possible

That’s what the relational model is all about

Page 3: Using Relational Databases and SQL

Joins: The Need

Question: Display each artist name along with the name of each title that they have produced.

The artist name comes from the Artists table

The title comes from the Titles table

Two things we can do:

Run multiple queries (not a good idea)

Combine data from both tables in the same query (good idea)

Page 4: Using Relational Databases and SQL

The Solution

Use a join (choose one from several join types):

SELECT ArtistName, Title FROM Artists JOIN Titles USING(ArtistID);

SELECT ArtistName, Title FROM Artists A INNER JOIN Titles T ON A.ArtistID = T.ArtistID;

SELECT ArtistName, Title FROM Artists A, Titles T WHERE A.ArtistID = T.ArtistID;

SELECT ArtistName, Title FROM Artists NATURAL JOIN Titles;
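To check that the four forms above really return the same rows, here is a minimal sketch using Python's built-in sqlite3 module. It assumes SQLite rather than MySQL, and the table contents are made-up sample data, not the course database.

```python
import sqlite3

# Hypothetical miniature versions of the Artists and Titles tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE Titles  (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Title TEXT);
    INSERT INTO Artists VALUES (1, 'Artist One'), (2, 'Artist Two'), (3, 'Artist Three');
    INSERT INTO Titles  VALUES (10, 1, 'Title A'), (11, 2, 'Title B'), (12, 1, 'Title C');
""")

# The four join forms from the slide.
queries = [
    "SELECT ArtistName, Title FROM Artists JOIN Titles USING(ArtistID)",
    "SELECT ArtistName, Title FROM Artists A INNER JOIN Titles T ON A.ArtistID = T.ArtistID",
    "SELECT ArtistName, Title FROM Artists A, Titles T WHERE A.ArtistID = T.ArtistID",
    "SELECT ArtistName, Title FROM Artists NATURAL JOIN Titles",
]

# Sort each result so that row order does not affect the comparison.
results = [sorted(conn.execute(q).fetchall()) for q in queries]

for row in results[0]:
    print(row)
print("All four forms agree:", all(r == results[0] for r in results))
```

Artist Three has no titles in this sample data, so it does not appear in any of the four results.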

Page 5: Using Relational Databases and SQL

What is a Cartesian Product?

The Cartesian Product of tables A and B is the set of all possible concatenated rows whose first component comes from A and whose second component comes from B

If A has a rows and B has b rows, the total number of rows in A x B is a x b

Example:

A has 6 rows

B has 4 rows

A x B has 24 rows

Page 6: Using Relational Databases and SQL

Cartesian Product Example

Given these two tables, what is the Cartesian Product?

A = SELECT * FROM Artists;

B = SELECT * FROM Titles;

Use a CROSS JOIN, which is the simplest type of join in SQL, to get the Cartesian Product

A x B = SELECT * FROM Artists CROSS JOIN Titles;
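The row-count rule can be checked with a small sketch. It assumes SQLite via Python's sqlite3 module, and the tables A and B and their columns (a_val, b_val) are hypothetical stand-ins with 6 and 4 rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE A (a_val INTEGER)")
conn.execute("CREATE TABLE B (b_val TEXT)")

# A gets 6 rows, B gets 4 rows.
conn.executemany("INSERT INTO A VALUES (?)", [(i,) for i in range(1, 7)])
conn.executemany("INSERT INTO B VALUES (?)", [(c,) for c in "wxyz"])

# Every row of A is paired with every row of B.
rows = conn.execute("SELECT * FROM A CROSS JOIN B").fetchall()
print(len(rows))  # 6 x 4 = 24
```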

Page 7: Using Relational Databases and SQL

What is a Join?

A join (more precisely, an inner join) is a subset of the Cartesian Product of two tables

A join is a type of mathematical operator, similar to multiplication, but applied to sets

A join takes two records from two tables, one from table A and one from table B, and concatenates them horizontally if a condition, known as the join predicate or join condition, is true
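The definition above can be sketched in plain Python, with no database at all. The helper name `join` and the sample tuples are illustrative assumptions; a real database engine does not necessarily build the full Cartesian Product first.

```python
from itertools import product

# Hypothetical rows: (ArtistID, ArtistName) and (TitleID, ArtistID, Title).
artists = [(1, "Artist One"), (2, "Artist Two")]
titles = [(10, 1, "Title A"), (11, 2, "Title B"), (12, 1, "Title C")]

def join(left, right, predicate):
    """Concatenate each (left, right) pair of rows for which the predicate is true."""
    return [l + r for l, r in product(left, right) if predicate(l, r)]

# A predicate that is always true keeps the whole Cartesian Product.
cartesian = join(artists, titles, lambda a, t: True)

# The join condition Artists.ArtistID = Titles.ArtistID keeps only matching pairs.
matched = join(artists, titles, lambda a, t: a[0] == t[1])

print(len(cartesian), len(matched))
```

Every row in `matched` also appears in `cartesian`, which is what "subset of the Cartesian Product" means here.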

Page 8: Using Relational Databases and SQL

Cross Join

Cartesian Product

SELECT * FROM Artists CROSS JOIN Titles;

Lots of records, with way too much information

Only records where ArtistIDs match are useful

Cartesian Products can be quite large

Cartesian Products are rarely useful

Therefore, use CROSS JOIN sparingly (rarely)

The only reasons for using cross join I have ever come across are a) to explain what the Cartesian product is and b) to quickly generate lots of arbitrary data for software testing!

Page 9: Using Relational Databases and SQL

Other (More Useful) Join Types

SQL provides several join types other than the cross join

Inner and Outer Joins

Equi-Join

Named Column Join

Natural Join (not recommended!)

For each of these other join types, you can specify a boolean condition called a join condition, or join predicate, which is used to filter out the rows of the Cartesian Product that you don’t want

Page 10: Using Relational Databases and SQL

Join Conditions

Since many records in a Cartesian Product are not meaningful, we can eliminate them using a join condition

Most of the time, we want to keep only matching records (i.e., only those where the values of a common attribute in the two tables are equal)

Ex. Movies.MovieID = Companies.MovieID

How you specify a join condition depends on the type of join you are using

If you don't supply a join condition, you get a cross join!

Page 11: Using Relational Databases and SQL

The “Physical” Meaning of Joins

A join suggests that there is a relationship between two tables, which the join condition describes

A salesperson “represents” a member/a member “has a” salesperson

A title “has an” artist/an artist “has” titles

An artist “has” members/members “belong to” an artist

Page 12: Using Relational Databases and SQL

Named Column Joins

Also called JOIN USING syntax

Syntax:

SELECT attribute_list FROM A JOIN B USING(column_name);

SELECT attribute_list FROM A JOIN B USING(name1, name2, …);

SELECT artistName, title FROM artists JOIN titles USING(artistID);
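One visible effect of JOIN USING is that the shared column appears only once in SELECT * output, whereas an ON join keeps both copies. The sketch below shows this in SQLite through Python's sqlite3 module; the sample rows are made up, and it is an assumption that your own database reports columns the same way.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE Titles  (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Title TEXT);
    INSERT INTO Artists VALUES (1, 'Artist One');
    INSERT INTO Titles  VALUES (10, 1, 'Title A');
""")

# Named column join: the USING column is reported once.
using_cur = conn.execute("SELECT * FROM Artists JOIN Titles USING(ArtistID)")
using_cols = [d[0] for d in using_cur.description]

# ON join: both tables' ArtistID columns are reported.
on_cur = conn.execute(
    "SELECT * FROM Artists JOIN Titles ON Artists.ArtistID = Titles.ArtistID"
)
on_cols = [d[0] for d in on_cur.description]

print(using_cols)
print(on_cols)
```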

Page 13: Using Relational Databases and SQL

Qualified Table Names

You may join tables which include fields with the same names. In fact, this is *always* true when you are using Named Column Join.

If you SELECT such a field, it may be ambiguous which one you want

SELECT Lastname, MemberID, SalesID FROM Members JOIN Salespeople ON Members.SalesID = Salespeople.SalesID;

ERROR 1052 (23000): Column 'Lastname' in field list is ambiguous

Page 14: Using Relational Databases and SQL

Qualified Table Names

The first way to deal with this is to “qualify” the fieldnames by prepending the table names:

mysql> SELECT Members.Lastname, Members.MemberID, Salespeople.SalesID FROM Members JOIN Salespeople USING(SalesID);

In this example, it’s only strictly necessary to qualify Lastname, which exists in both tables (SalesID is the USING column, so it is not ambiguous here), but for the sake of clarity it’s better to qualify all the fields you select.

This way is very clear, but we will also learn a slightly easier way later in this lecture.
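The ambiguity error and its fix can be reproduced in a small sketch. It assumes SQLite via Python's sqlite3 module, so the error text differs from MySQL's ERROR 1052, and the two tables are cut down to hypothetical minimal columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Salespeople (SalesID INTEGER PRIMARY KEY, Lastname TEXT);
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Lastname TEXT, SalesID INTEGER);
    INSERT INTO Salespeople VALUES (1, 'Smith');
    INSERT INTO Members VALUES (100, 'Jones', 1);
""")

# Unqualified Lastname exists in both tables, so the query is rejected.
try:
    conn.execute("SELECT Lastname FROM Members JOIN Salespeople USING(SalesID)")
    error_message = None
except sqlite3.OperationalError as exc:
    error_message = str(exc)
print("Error:", error_message)

# Qualifying each field with its table name removes the ambiguity.
rows = conn.execute("""
    SELECT Members.Lastname, Members.MemberID, Salespeople.SalesID
    FROM Members JOIN Salespeople USING(SalesID)
""").fetchall()
print(rows)
```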

Page 15: Using Relational Databases and SQL

Named Column Join

Consider a DB with two tables:

1) Employees has fields ID, Lastname, and Worksite

2) Worksites has fields ID, Address, City, and State

Employees.Worksite is the same data as Worksites.ID, but the fields have different names in the two tables.

If we want to show the address of each employee’s worksite, we can’t use a named column join, because the two columns we need to match do not share a name.
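A sketch of this situation, assuming SQLite via Python's sqlite3 module and invented sample rows: an ON clause can match Employees.Worksite to Worksites.ID, while USING(ID) runs without complaint but matches the wrong columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Worksites (ID INTEGER PRIMARY KEY, Address TEXT, City TEXT, State TEXT);
    CREATE TABLE Employees (ID INTEGER PRIMARY KEY, Lastname TEXT, Worksite INTEGER);
    INSERT INTO Worksites VALUES (1, '1 Main St', 'Los Angeles', 'CA'),
                                 (2, '2 Oak Ave', 'Pasadena', 'CA');
    INSERT INTO Employees VALUES (1, 'Garcia', 2), (2, 'Lee', 1);
""")

# Correct: match the employee's Worksite value to the worksite's ID.
correct = conn.execute("""
    SELECT E.Lastname, W.Address
    FROM Employees E JOIN Worksites W ON E.Worksite = W.ID
    ORDER BY E.Lastname
""").fetchall()

# Misleading: USING(ID) matches employee IDs to worksite IDs instead.
wrong = conn.execute("""
    SELECT E.Lastname, W.Address
    FROM Employees E JOIN Worksites W USING(ID)
    ORDER BY E.Lastname
""").fetchall()

print(correct)
print(wrong)
```

In this sample data the USING(ID) version silently swaps the two addresses, which is the kind of mistake the next slide warns about.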

Page 16: Using Relational Databases and SQL

Named Column Joins

Be careful to make sure that the two identically named columns represent the same data.

In real life, the problem might not be so obvious

What does this return?

select m.memberID, s.salesID from members m join salespeople s using(lastname);

Many databases contain fields with names like “ID” in every table, or with fields like “Address” in many different tables. Joining on these would not usually be meaningful, and you certainly wouldn’t want to do it by accident!
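To see what the USING(lastname) query returns, here is a sketch with invented data, assuming SQLite via Python's sqlite3 module. It pairs each member with any salesperson who happens to share a last name, which has nothing to do with who the member's salesperson is.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Salespeople (SalesID INTEGER PRIMARY KEY, Lastname TEXT);
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Lastname TEXT, SalesID INTEGER);
    INSERT INTO Salespeople VALUES (1, 'Smith'), (2, 'Brown');
    INSERT INTO Members VALUES (1, 'Smith', 2), (2, 'Jones', 1);
""")

# Joins on a shared surname, not on the real relationship.
by_lastname = conn.execute("""
    SELECT m.MemberID, s.SalesID
    FROM Members m JOIN Salespeople s USING(Lastname)
""").fetchall()

# The intended relationship: each member's own SalesID.
by_salesid = conn.execute("""
    SELECT m.MemberID, s.SalesID
    FROM Members m JOIN Salespeople s ON m.SalesID = s.SalesID
    ORDER BY m.MemberID
""").fetchall()

print(by_lastname)
print(by_salesid)
```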

Page 17: Using Relational Databases and SQL

Equi-Joins

Uses a comma-separated list of tables in the FROM clause instead of the JOIN clause

Join condition is specified in a WHERE or ON clause

Syntax:

SELECT attribute_list FROM A, B WHERE join_condition;

SELECT attribute_list FROM A JOIN B ON join_condition;

Page 18: Using Relational Databases and SQL

Table Aliases

When joining tables with common attribute names, MySQL cannot tell which table's column you mean:

SELECT ArtistID FROM Artists, Titles WHERE ArtistID = ArtistID;

To solve this we can give each table an alias:

SELECT T.ArtistID FROM Artists A, Titles T WHERE A.ArtistID = T.ArtistID;

You may also explicitly use qualified table names instead of aliases

SELECT Artists.ArtistID FROM Artists, Titles WHERE Artists.ArtistID = Titles.ArtistID;

You may also use the AS keyword to specify a table alias

SELECT A.ArtistID, A.Artistname, T.Title FROM Artists AS A, Titles AS T WHERE A.ArtistID = T.ArtistID;

Page 19: Using Relational Databases and SQL

Table Aliases

You may also use the AS keyword to specify a table alias

SELECT A.ArtistID, T.Title FROM Artists AS A, Titles AS T WHERE A.ArtistID = T.ArtistID;

Page 20: Using Relational Databases and SQL

Equi-Join Examples

Examples:

SELECT * FROM Artists A, Titles T WHERE A.ArtistID = T.ArtistID;

SELECT M.Lastname, S.StudioName FROM Members M, Studios S WHERE M.SalesID = S.SalesID;

Page 21: Using Relational Databases and SQL

Inner Joins

Equivalent to equi-join, but with a different syntax

In an inner join, you explicitly write a full join condition expression in an ON clause

This is safer than Named Column Join, and it doesn’t require that the fields have the same name in both tables

Syntax:

SELECT attribute_list FROM A INNER JOIN B ON join_condition;

Page 22: Using Relational Databases and SQL

Inner Join Example

List the name of each track with the title on which it appears

Page 23: Using Relational Databases and SQL

Inner Join Example

SELECT tr.tracktitle, ti.title FROM tracks tr INNER JOIN titles ti ON (tr.titleID = ti.titleID);
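The query above can be tried on a toy dataset. This sketch assumes SQLite via Python's sqlite3 module; the Tracks and Titles columns are reduced to a hypothetical minimum, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, Title TEXT);
    CREATE TABLE Tracks (TitleID INTEGER, TrackNum INTEGER, TrackTitle TEXT);
    INSERT INTO Titles VALUES (1, 'Title A'), (2, 'Title B');
    INSERT INTO Tracks VALUES (1, 1, 'Song One'), (1, 2, 'Song Two'), (2, 1, 'Song Three');
""")

# Each track row is paired with the one title row that has the same TitleID.
rows = conn.execute("""
    SELECT tr.TrackTitle, ti.Title
    FROM Tracks tr INNER JOIN Titles ti ON (tr.TitleID = ti.TitleID)
    ORDER BY ti.Title, tr.TrackNum
""").fetchall()

for track_title, title in rows:
    print(track_title, "appears on", title)
```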

Page 24: Using Relational Databases and SQL

Natural Joins: SQL’s Problem Child

In a natural join, no join condition is specified

Join condition is determined automatically by name; matches on all fields that have the same names in the two tables

Syntax:

SELECT attribute_list FROM A NATURAL JOIN B;

Example:

SELECT * FROM Artists NATURAL JOIN Titles;

Page 25: Using Relational Databases and SQL

Problems with Natural Joins

Try the following:

SELECT * FROM Members NATURAL JOIN SalesPeople;

Does it produce the expected results?

It runs without error, but it does not use the join condition you wanted

Wanted (match members with their supervisors):

Members.SalesID = SalesPeople.SalesID

Natural join uses (crazy stuff):

Members.SalesID = SalesPeople.SalesID AND Members.FirstName = SalesPeople.FirstName AND Members.LastName = SalesPeople.LastName

Use natural joins rarely, if at all
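The problem is easy to reproduce. The sketch below assumes SQLite via Python's sqlite3 module and invented rows in which no member shares a full name with a salesperson, so the natural join returns nothing even though every member has a valid SalesID.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE SalesPeople (SalesID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT,
                          SalesID INTEGER);
    INSERT INTO SalesPeople VALUES (1, 'Sam', 'Ortiz'), (2, 'Pat', 'Nguyen');
    INSERT INTO Members VALUES (1, 'Ann', 'Lee', 1), (2, 'Bob', 'Kim', 2);
""")

# Matches on SalesID AND FirstName AND LastName, because all three names are shared.
natural_rows = conn.execute("SELECT * FROM Members NATURAL JOIN SalesPeople").fetchall()

# Matches on SalesID only, which is the relationship we actually meant.
intended_rows = conn.execute("""
    SELECT M.LastName, S.LastName
    FROM Members M JOIN SalesPeople S ON M.SalesID = S.SalesID
    ORDER BY M.MemberID
""").fetchall()

print(len(natural_rows), "rows from NATURAL JOIN")
print(intended_rows)
```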

Page 26: Using Relational Databases and SQL

Foreign Keys

A foreign key is a column whose values are primary key values of another table

For example, ArtistID in the Titles table contains values that come from the Artists table (the ArtistID column in the Artists table, for which it is the primary key)

Foreign keys are also used to protect our data from anomalies. For example, what if the Titles table contained an ArtistID of 100, but no artist in the Artists table had that ID? Which artist would it be?

Most, but not all, joins use foreign keys. Why is this a good practice?
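The "ArtistID 100" anomaly can be blocked by declaring the foreign key. This is a minimal sketch assuming SQLite via Python's sqlite3 module, where enforcement must be switched on with PRAGMA foreign_keys; the table definitions are hypothetical and other databases declare and enforce foreign keys in their own ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is on for the connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Artists (ArtistID INTEGER PRIMARY KEY, ArtistName TEXT);
    CREATE TABLE Titles (
        TitleID  INTEGER PRIMARY KEY,
        ArtistID INTEGER REFERENCES Artists(ArtistID),
        Title    TEXT
    );
    INSERT INTO Artists VALUES (1, 'Artist One');
""")

# ArtistID 1 exists in Artists, so this row is accepted.
conn.execute("INSERT INTO Titles VALUES (10, 1, 'Title A')")

# ArtistID 100 does not exist in Artists, so this row is rejected.
try:
    conn.execute("INSERT INTO Titles VALUES (11, 100, 'Orphan Title')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

title_count = conn.execute("SELECT COUNT(*) FROM Titles").fetchone()[0]
print("Rejected:", rejected, "| titles stored:", title_count)
```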

Page 27: Using Relational Databases and SQL

Cross-Referencing Tables

Note the XrefArtistsMembers table

This allows an individual listed in the Members table to be a member of any number of artists, while an artist can have any number of members

Many-to-many relationship

Page 28: Using Relational Databases and SQL

Joining More Than Two Tables

You may chain tables just like you chain multiplications…

-- natural join
SELECT * FROM A NATURAL JOIN B NATURAL JOIN C;

-- named column join
SELECT * FROM A JOIN B USING(a1) JOIN C USING(a2);

Page 29: Using Relational Databases and SQL

More Chaining…

Examples:

-- inner join
SELECT * FROM A INNER JOIN B ON A.n1 = B.n2 INNER JOIN C ON B.n3 = C.n4;

-- equi-join
SELECT * FROM A, B, C WHERE A.n1 = B.n2 AND B.n3 = C.n4;

Page 30: Using Relational Databases and SQL

How to Solve Join Problems

List the first and last names of all members along with the artistIDs of the artists of which they are members

Page 31: Using Relational Databases and SQL

How to Solve Join Problems

Answer:

SELECT M.Firstname, M.Lastname, X.ArtistID FROM Members M JOIN XrefArtistsMembers X USING(memberID);

Page 32: Using Relational Databases and SQL

How to Solve Join Problems

List the first and last names of all members along with the names of any titles that they have played on.

Page 33: Using Relational Databases and SQL

How to Solve Join Problems

Answer:

SELECT M.Firstname, M.Lastname, T.Title FROM Members M JOIN XrefArtistsMembers X ON M.MemberID = X.MemberID JOIN Titles T ON X.ArtistID = T.ArtistID;
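To see how the cross-reference table carries the many-to-many relationship through a three-table join, here is a sketch with invented rows, assuming SQLite via Python's sqlite3 module. The ORDER BY is added only so the printed order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Firstname TEXT, Lastname TEXT);
    CREATE TABLE XrefArtistsMembers (ArtistID INTEGER, MemberID INTEGER);
    CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, Title TEXT);
    INSERT INTO Members VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim');
    -- Ann belongs to artists 1 and 2; Bob belongs to artist 1 only.
    INSERT INTO XrefArtistsMembers VALUES (1, 1), (1, 2), (2, 1);
    INSERT INTO Titles VALUES (10, 1, 'Title A'), (11, 2, 'Title B');
""")

# Members -> XrefArtistsMembers -> Titles, one join condition per hop.
rows = conn.execute("""
    SELECT M.Firstname, M.Lastname, T.Title
    FROM Members M
    JOIN XrefArtistsMembers X ON M.MemberID = X.MemberID
    JOIN Titles T ON X.ArtistID = T.ArtistID
    ORDER BY M.Lastname, T.Title
""").fetchall()

for row in rows:
    print(row)
```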

Page 34: Using Relational Databases and SQL

How to Solve Join Problems

List the first and last names of all members who have recorded at the studio MakeTrax.

Page 35: Using Relational Databases and SQL

How to Solve Join Problems

Answer:

SELECT DISTINCT M.Firstname, M.Lastname FROM Members M JOIN XrefArtistsMembers X USING(MemberID) JOIN Titles USING(ArtistID) JOIN Studios S USING(StudioID) WHERE S.StudioName = "MakeTrax";
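The four-table chain and the role of DISTINCT can be explored with a sketch. It assumes SQLite via Python's sqlite3 module and invented rows; the studio name is written in single quotes with exact capitalization because SQLite string comparison is case-sensitive by default, unlike a typical MySQL setup.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Members (MemberID INTEGER PRIMARY KEY, Firstname TEXT, Lastname TEXT);
    CREATE TABLE XrefArtistsMembers (ArtistID INTEGER, MemberID INTEGER);
    CREATE TABLE Studios (StudioID INTEGER PRIMARY KEY, StudioName TEXT);
    CREATE TABLE Titles (TitleID INTEGER PRIMARY KEY, ArtistID INTEGER, StudioID INTEGER,
                         Title TEXT);
    INSERT INTO Members VALUES (1, 'Ann', 'Lee'), (2, 'Bob', 'Kim'), (3, 'Cy', 'Diaz');
    INSERT INTO XrefArtistsMembers VALUES (1, 1), (1, 2), (2, 3);
    INSERT INTO Studios VALUES (1, 'MakeTrax'), (2, 'Other Studio');
    -- Artist 1 recorded two titles at MakeTrax; artist 2 recorded elsewhere.
    INSERT INTO Titles VALUES (10, 1, 1, 'Title A'), (11, 1, 1, 'Title B'), (12, 2, 2, 'Title C');
""")

join_sql = """
    FROM Members M
    JOIN XrefArtistsMembers X USING(MemberID)
    JOIN Titles USING(ArtistID)
    JOIN Studios S USING(StudioID)
    WHERE S.StudioName = 'MakeTrax'
"""

# Without DISTINCT, a member appears once per matching title.
all_rows = conn.execute("SELECT M.Firstname, M.Lastname" + join_sql).fetchall()

# With DISTINCT, each member appears once.
distinct_rows = conn.execute(
    "SELECT DISTINCT M.Firstname, M.Lastname" + join_sql + " ORDER BY M.Lastname"
).fetchall()

print(len(all_rows), "rows without DISTINCT")
print(distinct_rows)
```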

Page 36: Using Relational Databases and SQL

How to Solve Join Problems

List the first and last names of all members with the names of the artists to which they belong

Page 37: Using Relational Databases and SQL

Join Expressions

You may also wrap joins within parentheses, just as you can with mathematical expressions such as (1 + 2 - 3) * (8 - 4 + 2)

SELECT * FROM (A JOIN B USING(n1)) JOIN (C JOIN D USING(n2)) USING(n3);