Page 1: Relational Databases

Relational Databases

COMP 416

Fall 2010

Lecture 21

Page 2: Relational Databases

What’s a database?

• A collection of data

• Examples of collections of data?
– Library
– Web
– Stacks of papers on your desk
– Set of baseball cards

• Are all of these things databases?

Page 3: Relational Databases

Web vs. Library

• What’s the big difference?
– Organization
– In what ways is the library organized?

• Databases have organization.

Page 4: Relational Databases

Library vs Baseball Cards

• What’s the primary object in these two collections?
– Libraries: books
– Baseball card collections: baseball cards

• How alike are books?
– Somewhat, but with large variations

• How alike are baseball cards?
– Very alike.

• Things in databases are highly structured.

Page 5: Relational Databases

One library vs another

• What’s the difference between the Brauer Library and the House Library?

• Different purposes lead to different priorities (in organization and content).

• Databases are built for a purpose.
– The more specific the purpose, the more specific its structure and organization.

Page 6: Relational Databases

So, what’s a database?

• A database is a collection of structured information organized for a specific purpose.

Page 7: Relational Databases

Relational Databases

• Relational databases are the most prevalent type of database used.

• Information is organized into related tables.

• Each table captures information about a different entity.
– Columns are different fields of information (attributes of the entity).
– Each row represents one instance (a specific example of the entity).
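The table/column/row vocabulary above can be sketched with Python's built-in `sqlite3` module, used here only as a convenient stand-in for a relational database. The `book` table, its column names, and the sample rows are made up for illustration.

```python
import sqlite3

# In-memory database so the sketch leaves nothing on disk.
conn = sqlite3.connect(":memory:")

# One table per entity; each column is an attribute of that entity.
conn.execute("CREATE TABLE book (title TEXT, publisher TEXT, genre TEXT, price REAL)")

# Each row is one instance of the entity (hypothetical sample data).
conn.execute("INSERT INTO book VALUES (?, ?, ?, ?)",
             ("Example Title", "Example Press", "Fiction", 9.99))
conn.execute("INSERT INTO book VALUES (?, ?, ?, ?)",
             ("Another Title", "Example Press", "History", 14.50))

cursor = conn.execute("SELECT * FROM book")
columns = [d[0] for d in cursor.description]  # attribute names
rows = cursor.fetchall()                      # one tuple per instance

print(columns)
for row in rows:
    print(row)
```

Reading `cursor.description` is just a quick way to see the column names next to the rows.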

Page 8: Relational Databases

Design Goals

• What kinds of information do we want to keep track of?

• What do we want to do with that information?

Page 9: Relational Databases

Entities

• First step in database design is to identify entities.
– Think of entities as “things” that you want to know information about.
– What do we care about for our bookstore?
  • Books (duh?)

Page 10: Relational Databases

Attributes

• Next step is to identify attributes of those entities.

• An attribute is a labeled piece of information (i.e., a name/value pair).

• In general, we expect every instance of a particular entity to have specific values for a set of common attributes.

Page 11: Relational Databases

Book Entity

Book: Author(s), Title, Publisher, Genre, Price

Page 12: Relational Databases

Normalization

• Not all database designs are equal.

• Experience and research have shown that certain structures and relationships are easier to maintain and process than others.

• Normalization: a process through which a database design is “cleaned up”

• Well-defined set of “normal forms” which are the incremental result of this process.

Page 13: Relational Databases

1NF

• First Normal Form
– All attributes are single-valued.
– All instances have a unique identifier.

Page 14: Relational Databases

Book Entity Revisited

• Is our book entity in 1NF?

Book: Author(s), Title, Publisher, Genre, Price

Page 15: Relational Databases

Bookstore Entities (1NF)

• Multi-valued attributes generally indicate the need for a new entity.

Book: Title, Publisher, Genre, Price

Author: First, Last, Birthday
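As a rough illustration of pulling a multi-valued attribute out into its own entity, the sketch below splits a hypothetical `authors` string into separate Author records. The record layout, the `;` separator, and the names are assumptions made for this example, not part of the bookstore design itself.

```python
# Hypothetical un-normalized records: "authors" holds several values at once.
raw_books = [
    {"title": "Example Title", "authors": "Ann Author; Bob Writer"},
    {"title": "Another Title", "authors": "Ann Author"},
]

# Book entity keeps only single-valued attributes.
books = [{"title": b["title"]} for b in raw_books]

# Author entity: one record per distinct author, every attribute single-valued.
authors = []
for b in raw_books:
    for name in b["authors"].split(";"):
        first, last = name.strip().split(" ", 1)
        record = {"first": first, "last": last}
        if record not in authors:
            authors.append(record)

print(books)
print(authors)
```

The split deliberately drops the link between a book and its authors; how that relationship is represented is a separate modeling question.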

Page 16: Relational Databases

Unique Identifiers

• What in our book and author entities can act as a unique identifier?

• Often (almost always) the best way to create a unique identifier is to create an artificial one.
– Book ID, Author ID.
– Assigned by the database itself.
– No inherent semantics.
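A minimal sketch of database-assigned identifiers, using Python's `sqlite3` as a stand-in. In SQLite, an `INTEGER PRIMARY KEY` column is filled in automatically when it is left out of the insert; other products use different syntax for the same idea. The table layout and the sample people are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# The id column is assigned by the database when the insert omits it.
conn.execute(
    "CREATE TABLE author (id INTEGER PRIMARY KEY, first TEXT, last TEXT, birthday TEXT)"
)

# Two different people who happen to share a name and a birthday (made-up data):
# no natural attribute tells them apart, but the artificial ID does.
insert = "INSERT INTO author (first, last, birthday) VALUES (?, ?, ?)"
first_id = conn.execute(insert, ("Ann", "Author", "1970-01-01")).lastrowid
second_id = conn.execute(insert, ("Ann", "Author", "1970-01-01")).lastrowid

print(first_id, second_id)
```

The IDs carry no meaning of their own, which is exactly why they stay valid when a name or birthday is later corrected.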

Page 17: Relational Databases

Book Entities (1NF) v2

Book: Title, Publisher, Genre, Price, ID

Author: First, Last, Birthday, ID

Page 18: Relational Databases

Modeling Relationships

• Two relationship types.
– One-to-Many
– Many-to-Many

• For now, we’ll just model this pictorially like this:

Book: Title, Publisher, Genre, Price, ID

Author: First, Last, Birthday, ID

Page 19: Relational Databases

2NF

• Second Normal Form
– Already in 1NF
– Non-identifying attributes are dependent on the entity’s unique identifier.

• Rule of thumb: if the same value appears multiple times for a particular attribute, think hard about whether what you really need is another entity.
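The rule of thumb can be checked mechanically. The sketch below, which assumes a hypothetical `book` table with made-up rows, uses `GROUP BY ... HAVING COUNT(*) > 1` to list values that repeat within an attribute. Repeats are only a hint that an attribute may deserve its own entity, not proof.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, publisher TEXT, genre TEXT)"
)
conn.executemany(
    "INSERT INTO book (title, publisher, genre) VALUES (?, ?, ?)",
    [
        ("Title A", "Example Press", "Fiction"),
        ("Title B", "Example Press", "History"),
        ("Title C", "Sample House", "Fiction"),
    ],
)

# Publisher values that appear on more than one book: a candidate entity.
repeated_publishers = conn.execute(
    "SELECT publisher, COUNT(*) FROM book GROUP BY publisher HAVING COUNT(*) > 1"
).fetchall()

# Titles do not repeat in this sample, so nothing suggests pulling them out.
repeated_titles = conn.execute(
    "SELECT title, COUNT(*) FROM book GROUP BY title HAVING COUNT(*) > 1"
).fetchall()

print(repeated_publishers)
print(repeated_titles)
```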

Page 20: Relational Databases

Bookstore Entities

• What might we pull out into an entity?

Book: Title, Publisher, Genre, Price, ID

Author: First, Last, Birthday, ID

Page 21: Relational Databases

Bookstore Entities (2NF)

Book: Title, Price, ID

Author: First, Last, Birthday, ID

Publisher: Name, Address, State, State Abbrev., ID

Genre: Genre Name, ID

Page 22: Relational Databases

3NF

• Third Normal Form
– In 2NF
– No attributes dependent on each other.

• What part of our data model violates this?

• To fix, generally want to pull the dependent attributes out into their own entity.
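One way to see why mutually dependent attributes cause trouble: when a state's name and its abbreviation are both stored on every publisher row, nothing stops them from disagreeing. The sketch below uses a hypothetical `publisher` table with deliberately inconsistent sample data and looks for states that have more than one abbreviation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT, state TEXT, state_abbrev TEXT)"
)
conn.executemany(
    "INSERT INTO publisher (name, state, state_abbrev) VALUES (?, ?, ?)",
    [
        ("Example Press", "North Carolina", "NC"),
        ("Sample House", "North Carolina", "N.C."),  # same state, different abbreviation
        ("Third Books", "Virginia", "VA"),
    ],
)

# The abbreviation depends on the state, not on the publisher, so it can drift.
inconsistent = conn.execute(
    "SELECT state, COUNT(DISTINCT state_abbrev) FROM publisher "
    "GROUP BY state HAVING COUNT(DISTINCT state_abbrev) > 1"
).fetchall()

print(inconsistent)
```

Storing each state once, in its own entity, removes the possibility of this kind of disagreement.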

Page 23: Relational Databases

Bookstore Entities (3NF)

Book: Title, Price, ID

Author: First, Last, Birthday, ID

Publisher: Name, Address, ID

Genre: Genre Name, ID

State: Long Name, Abbrev., ID

Page 24: Relational Databases

Logical vs Physical Design

• Result so far is “logical” database design.

• Still need to implement this design as a specific database.

• Relational databases:
– Each entity associated with a table.
– Attributes are columns of the table.
– Each attribute is given a data type.
– Unique identifiers are “primary keys”
– Relationships are embodied as “foreign keys”
  • An attribute whose value is the unique identifier in another table.
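The mapping from logical to physical design might look like the following sketch, again with `sqlite3` as a stand-in. The table and column names are assumptions, and the type names are SQLite's; other databases offer a different set. Note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn on foreign key checks

# Entity -> table, attribute -> typed column, unique identifier -> primary key.
conn.execute("CREATE TABLE genre (id INTEGER PRIMARY KEY, genre_name TEXT NOT NULL)")
conn.execute("""
    CREATE TABLE book (
        id       INTEGER PRIMARY KEY,
        title    TEXT NOT NULL,
        price    REAL,
        genre_id INTEGER REFERENCES genre(id)  -- foreign key: a genre's unique identifier
    )
""")

conn.execute("INSERT INTO genre (id, genre_name) VALUES (1, 'Fiction')")
conn.execute("INSERT INTO book (title, price, genre_id) VALUES ('Title A', 9.99, 1)")

# A book pointing at a genre that does not exist is refused by the database.
try:
    conn.execute("INSERT INTO book (title, price, genre_id) VALUES ('Title B', 5.00, 42)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)
```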

Page 25: Relational Databases

Implementing 1-to-many

• To implement a 1-to-many relationship, add an attribute on the “many” side which is the unique identifier of the “one” side.
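A small sketch of that rule for the publisher/book relationship: each book (the "many" side) carries the ID of its one publisher, and a join follows the relationship back. The `pub_id` column name and the sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE publisher (id INTEGER PRIMARY KEY, name TEXT)")
# The "many" side (book) stores the unique identifier of the "one" side (publisher).
conn.execute(
    "CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, "
    "pub_id INTEGER REFERENCES publisher(id))"
)

conn.executemany("INSERT INTO publisher (id, name) VALUES (?, ?)",
                 [(1, "Example Press"), (2, "Sample House")])
conn.executemany("INSERT INTO book (title, pub_id) VALUES (?, ?)",
                 [("Title A", 1), ("Title B", 1), ("Title C", 2)])

# One publisher row pairs with many book rows through pub_id.
books_by_publisher = conn.execute(
    "SELECT p.name, b.title FROM publisher p "
    "JOIN book b ON b.pub_id = p.id ORDER BY p.name, b.title"
).fetchall()

for name, title in books_by_publisher:
    print(name, "-", title)
```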

Page 26: Relational Databases

Implementing 1-to-many

Author: First, Last, Birthday, ID

Genre: Genre Name, ID

State: Long Name, Abbrev., ID

Publisher: Name, Address, ID, StateID

Book: Title, Price, ID, PubID, GenreID

Page 27: Relational Databases

Resolving M-to-M

• Many-to-many relationships are hard to implement in a database.

• Why is this?
– Foreign key attribute which is supposed to implement the relationship requires multiple values.
– This breaks 1NF structure.

• How might we fix it?

Page 28: Relational Databases

Junction Entities

• A junction entity is an abstract entity that provides a level of indirection for a many-to-many relationship.
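A junction entity for books and authors could be sketched as below. Each `book_author` row records one (book, author) pairing, so every foreign key column stays single-valued. Table names, column names, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT)")
conn.execute("CREATE TABLE author (id INTEGER PRIMARY KEY, last TEXT)")
# Junction table: two one-to-many relationships replace one many-to-many.
conn.execute("""
    CREATE TABLE book_author (
        id        INTEGER PRIMARY KEY,
        book_id   INTEGER REFERENCES book(id),
        author_id INTEGER REFERENCES author(id)
    )
""")

conn.executemany("INSERT INTO book (id, title) VALUES (?, ?)",
                 [(1, "Title A"), (2, "Title B")])
conn.executemany("INSERT INTO author (id, last) VALUES (?, ?)",
                 [(1, "Author"), (2, "Writer")])
# Book 1 has two authors; author 1 wrote two books.
conn.executemany("INSERT INTO book_author (book_id, author_id) VALUES (?, ?)",
                 [(1, 1), (1, 2), (2, 1)])

authors_of_book_1 = conn.execute(
    "SELECT a.last FROM author a JOIN book_author ba ON ba.author_id = a.id "
    "WHERE ba.book_id = 1 ORDER BY a.last"
).fetchall()

books_of_author_1 = conn.execute(
    "SELECT b.title FROM book b JOIN book_author ba ON ba.book_id = b.id "
    "WHERE ba.author_id = 1 ORDER BY b.title"
).fetchall()

print(authors_of_book_1)
print(books_of_author_1)
```

The same junction rows answer the question in both directions, which is the indirection the definition refers to.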

Page 29: Relational Databases

Adding BookAuthor Junction

Author: First, Last, Birthday, ID

Genre: Genre Name, ID

State: Long Name, Abbrev., ID

Publisher: Name, Address, ID, StateID

Book: Title, Price, ID, PubID, GenreID

BookAuthor: BookID, ID, AuthorID

Page 30: Relational Databases

SQL

• Structured Query Language (SQL)
– The language in which we express actions to be performed on a relational database.
– Standardized to allow portability across different products.
  • SQL92 (aka SQL2) is a widely supported baseline standard; newer revisions such as SQL:1999, SQL:2003, and SQL:2008 have since been published.
– Product-specific differences and extensions still exist, but much better than before.
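To give a feel for the "actions" SQL expresses, here is a brief sketch of the four basic statements (INSERT, UPDATE, DELETE, SELECT) run through Python's `sqlite3`. The `book` table and its rows are made up, and SQLite's dialect differs from MySQL's in places, though statements this simple are common to most products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT, price REAL)")

# INSERT: add new rows.
conn.execute("INSERT INTO book (title, price) VALUES ('Title A', 9.99)")
conn.execute("INSERT INTO book (title, price) VALUES ('Title B', 20.00)")

# UPDATE: change attribute values on rows that match a condition.
conn.execute("UPDATE book SET price = 7.50 WHERE title = 'Title A'")

# DELETE: remove rows that match a condition.
conn.execute("DELETE FROM book WHERE price > 15")

# SELECT: read back whatever is left.
remaining = conn.execute("SELECT title, price FROM book").fetchall()
print(remaining)
```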

Page 31: Relational Databases

MySQL

• MySQL
– Open-source
– Great for small to mid-sized organizations.
– Fast, efficient, cheap
– Doesn’t support full SQL but a good portion of it.

Page 32: Relational Databases

Web App Model

• Browser → Web Server: HTTP requests

• Web Server → Browser: HTTP responses

• JavaScript: programmable, dynamic interface to the document

• PHP: programmable, dynamic document construction

• SQL: between PHP and the database

• Database: structured, table-based information storage

• DB: on disk