25
CSE 544 Principles of Database Management Systems Magdalena Balazinska (magda) Spring 2006 Lecture 1 - Class Introduction

CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

  • Upload
    others

  • View
    2

  • Download
    0

Embed Size (px)

Citation preview

Page 1: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544Principles of DatabaseManagement Systems

Magdalena Balazinska (magda)Spring 2006

Lecture 1 - Class Introduction

Page 2: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Outline

• Introductions

• Class overview

• What is the point of a database?

Page 3: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Course Staff

• Instructor: Magda– [email protected]

– Office hours by appointment– Location: cse 550

• TA: YongChul Kwon– Graduate student in the database group– [email protected]

– Office hours: Wed 3:30pm-4:20pm or by appointment– Location: cse 220

Page 4: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Who is Magda?

• Assistant Professor since January 2006• PhD from MIT, February 2006

• Areas of interest: databases and systems• Current research focus

– Stream processing & real-time monitoring

Page 5: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Current Research Projects

• RFID Ecosystem– Tracking people and objects in the Paul Allen Center

• StreamClean– Probabilistic RFID data cleaning– Probabilistic event extraction

• Moirae– Integrating history into monitoring systems

• SharedViews– Sharing personal data on the Internet

Page 6: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Goals of the Class

• Study principles of data management– Data models, data independence, normalization– Data integrity, availability, consistency, etc.

• Study key database design issues– Storage, query execution and optimization, transactions– Replication, distribution, streaming & sensor data– Adaptive processing, information retrieval, etc.

• Ensure that– You are comfortable using a database– You can write (web-based) applications that use a

database as a back-end

Page 7: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Class Format

• Two lectures per week: MW @ 10:30am• Mix of lecture and discussion

– Mostly based on papers– Must read papers before lecture– Come prepared to discuss them

• One guest lecture: Surajit Chaudhuri– From Microsoft Research– Will talk about IR and RDBMS integration

Page 8: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Readings and Notes

• Most readings are papers– Mix of old seminal papers and new papers– Papers available online

• A few readings from the following book– Database Management Systems. Third Ed.

Ramakrishnan and Gehrke. McGraw-Hill.

Page 9: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Class Resources

• Website: lectures, assignments, projectshttp://www.cs.washington.edu/544List of all the deadlines

• Mailing list:[email protected] automatically (no need to register)

Page 10: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Evaluation

• Assignments 20%– HW1: Using a database (SQL, views, indexes, etc.)– HW2: Writing a Web app with a db back-end

• Project 35%– Small research project but you must start it now!

• One exam 30%– November 13th in class

• Class participation 15%– Paper readings and discussions

Page 11: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Assignments

• Goal: hands-on experience using a DBMS• HW1: already posted on the website• Due October 16th• Content:

– Setup a db from scratch– Practice writing SQL queries– Browse the system catalog– Get experience with integrity constraints & triggers– Play with indexes and views

Page 12: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Project Overview

• Choose from a list of mini-research topics• Or come up with your own• Can be related to your ongoing research• Must be related to databases• Must contain some element of research• Open ended

• Final deliverables– Short research paper (8 pages)– Conference-style presentation

Page 13: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Project Goals

• Apply database principles to a new problem– Understand and model the problem– Research and understand related work (2-3 papers)– Propose some new approach

• Creativity will be evaluated– Implement some parts– Evaluate your solution– Write-up and present your results

• Amount of work may vary widely between groups

Page 14: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Project Milestones

• Oct 4th: teams formed• Oct 18th: project proposal• Nov 8th: milestone report• Dec 4th: final report• Dec 6th: project presentations• More details on the website• We will meet with you regularly

Page 15: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Exam

• Location: in class• Time: November 13th

– After we cover fundamental topics– Before the project is due

• Will focus on the papers read in class• More information later

Page 16: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Class Participation

• An important part of your grade• Because

– We would like you to read and think aboutpapers throughout the quarter

– Important to learn to discuss papers• Expectations

– Ask questions, raise issues, think critically– Learn to express your opinion– Respect other people’s opinions

Page 17: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Let’s get started

• What is a database?

• Give examples of databases

Page 18: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Let’s get started

• What is a database?– A collection of files storing related data

• Give examples of databases– Accounts database; payroll database; UW’s

students database; Amazon’s productsdatabase; airline reservation database

Page 19: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Data Management

• Data is valuable but hard to manage• Example: Store database

– Entities: employees, positions (ceo, manager,cashier), stores, products, sells, customers.

– Relationships: employee positions, staff ofeach store, inventory of each store.

• What functionality do we need?

Page 20: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Required Functionality

1. Create & persistently store large datasets2. Efficient query & update

1. Must handle complex questions about data2. Must handle sophisticated updates3. Performance matters

3. Change structure (e.g., add attributes)4. Concurrency control5. Crash recovery6. Access control, security, integrity

Page 21: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Database Management System

• A DBMS is a software system designed toprovide data management services

• Examples of DBMS– Oracle, DB2 (IBM), SQL Server (Microsoft),– PostgreSQL, MySQL,…

Page 22: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Market Shares

• In 2004 (from www.computerworld.com)– IBM, 35% market with $2.5 billion in sales– Oracle, 33% market with $2.3 billion in sales– Microsoft, 19% market with $1.3 billion in sales

Page 23: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Typical System Architecture

Data files

Database server(someone else’s

C program) Applications

connection(ODBC, JDBC)

“Two tier system” or “client-server”

Page 24: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

Main DBMS Features

• Data independence– Data model– Data definition language– Data manipulation language

• Efficient data access• Data integrity and security• Data administration• Concurrency control• Crash recovery• Reduced application development time

Page 25: CSE 544 Principles of Database Management Systems€¦ · CSE 544 - Fall 2006 Goals of the Class •Study principles of data management –Data models, data independence, normalization

CSE 544 - Fall 2006

When not to use a DBMS?

• DBMS is optimized for a certain workload• Some applications may need

– A completely different data model– Completely different operations– A few time-critical operations

• Examples– Text processing– Scientific analysis