BBM371 Data Management Assoc. Prof. Dr. Ebru Akçapınar Sezer ebruakcapinarsezer@gmail.com ebru@hacettepe.edu.tr www.hacettepe.edu.tr/~ebru @Ebru176 903122977500

Page 1

BBM371 Data Management
Assoc. Prof. Dr. Ebru Akçapınar Sezer
ebruakcapinarsezer@gmail.com
ebru@hacettepe.edu.tr
www.hacettepe.edu.tr/~ebru
@Ebru176
903122977500

Page 2

Outline
• Storage devices
• Basic file and DBMS concepts
• Record/page formats
  – Layers of DBMS
  – Journey of a byte
• Indexing
  – Traditional/multilevel
  – Tree based
  – Hash based
• External sorting
• Spatial indexing

Page 3

Policies
• Class attendance is strongly recommended
• One midterm (25%), 3 quizzes, 2 attendance quizzes and, of course, one final exam (50%)
• No homework; trust comes first
• Keep in mind that this is not an «elective course»
• Presentation based, question-and-answer style
• Textbooks:
  – Raghu Ramakrishnan, Database Management Systems, McGraw Hill (textbook).
  – R. Elmasri, S.B. Navathe, Fundamentals of Database Systems, 4th edition, Addison-Wesley, 2004.
• Office hours: whenever you come to my office

Page 4

From Data Structures to File Structures
• Similar points:
  – Representation of data
  – Operations for accessing data
• Different perspectives:
  – Data structures: deal with data in main memory
  – File structures: deal with data in secondary storage

Page 5

First «Data»
• What is data?
  Data are any facts, numbers, or text that can be processed by a computer.
• What is management?
  A set of activities to accomplish desired goals and objectives using available resources efficiently and effectively.
• And so… data management is
  Choosing a data storage policy that satisfies the end user's specified requirements and minimizes the response time.

Page 6

Store data «where»?
• Up to now: in memory
  Arrays, linked lists, queues, stacks, heaps…
• After today: in memory and in files
• What are we doing?
  Dealing with huge data.
  Think about data that is larger than your available memory: can you use arrays?
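
One way to picture the answer to the question above is to keep the data in a file and bring only a small piece of it into memory at a time. The following is a minimal sketch, not a method prescribed by the course: the function name `sum_ints_in_file`, the buffer size `CHUNK_INTS`, and the assumption that the file holds raw `int` values written on the same machine are all made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

enum { CHUNK_INTS = 1024 };  /* assumed buffer size: 1024 ints per read */

/* Hypothetical helper: adds up every int stored in the binary file `path`.
 * Only CHUNK_INTS values are held in memory at any moment, so the file may
 * be far larger than the available main memory.
 * Returns 0 on success and -1 on an I/O error. */
int sum_ints_in_file(const char *path, long long *total)
{
    assert(path != NULL && total != NULL);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    int buffer[CHUNK_INTS];
    size_t got;
    long long sum = 0;

    /* fread reports how many whole ints it delivered; 0 means EOF or error. */
    while ((got = fread(buffer, sizeof buffer[0], CHUNK_INTS, in)) > 0) {
        for (size_t i = 0; i < got; i++)
            sum += buffer[i];
    }

    int failed = ferror(in);
    fclose(in);
    if (failed)
        return -1;

    *total = sum;
    return 0;
}
```

The array is still there, but it only ever holds one chunk; the file, not the array, is the home of the full data set.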

Page 7

File
• means «kütük» (Turkish: a register of records), not «dosya» (Turkish: an ordinary document file)!
• has a format and a size; the program that processes it depends on the format
• stores data
  – image, video, PDF, source code, etc.
• We are not interested in all types of data; we specialize in «data about entities»

Page 8

Let's place the file system (FS) in our domain
[Figure: layer diagram with the labels Hardware, Operating System, DBMS, File system, Application]

Page 9

MM versus SS
• Main Memory (MM): data is manipulated here (processing)
  – Semiconductors
  – Fast, expensive, volatile, small
• Secondary Storage (SS): data is stored here (storage)
  – Disks, tape
  – Slow, cheap, stable, large
• Data transfer takes place between MM and SS

Page 10

MM versus SS

Minimize the number of trips to the disk needed to get the desired information.

Group related information so that we are likely to get everything we need with only one trip to the disk.
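
The grouping idea above can be sketched with a fixed-length record that keeps all fields of one entity side by side, so that a single seek and a single read request fetch everything about it. This is only an illustration under stated assumptions: the `struct student` layout and the helpers `save_students` and `load_student` are hypothetical, and how many physical disk accesses one `fread` really causes depends on the operating system's buffering and the disk block size.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical record layout: everything known about one student is
 * stored together, so it can be fetched together. */
struct student {
    int    id;
    char   name[32];
    double gpa;
};

/* Writes `count` records back to back. Returns 0 on success, -1 on error. */
int save_students(const char *path, const struct student *s, size_t count)
{
    assert(path != NULL && s != NULL);

    FILE *out = fopen(path, "wb");
    if (out == NULL)
        return -1;

    size_t written = fwrite(s, sizeof *s, count, out);
    if (fclose(out) != 0 || written != count)
        return -1;
    return 0;
}

/* Fetches record number `index` (0-based) with one seek and one read.
 * Returns 0 on success, -1 if the record does not exist or I/O fails. */
int load_student(const char *path, long index, struct student *out)
{
    assert(path != NULL && out != NULL && index >= 0);

    FILE *in = fopen(path, "rb");
    if (in == NULL)
        return -1;

    /* Fixed-length records make the byte offset a simple multiplication. */
    int ok = fseek(in, index * (long)sizeof *out, SEEK_SET) == 0
          && fread(out, sizeof *out, 1, in) == 1;

    fclose(in);
    return ok ? 0 : -1;
}
```

Had the id, name, and GPA been kept in three separate files, the same lookup would need three separate trips.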

Page 11

How fast is main memory?

Typical time for getting info from:

Main memory: ~120 nanosec = 120 x 10^-9 sec

Magnetic disks: ~30 millisec = 30 x 10^-3 sec
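
To get a feel for the gap, the two typical figures above (120 x 10^-9 sec and 30 x 10^-3 sec) can be divided. This is rough arithmetic on the slide's round numbers, not a measurement:

```latex
\[
\frac{t_{\text{disk}}}{t_{\text{memory}}}
  = \frac{30 \times 10^{-3}\,\text{s}}{120 \times 10^{-9}\,\text{s}}
  = 0.25 \times 10^{6}
  = 250{,}000
\]
```

With these figures, one disk access costs about as much time as a quarter of a million main memory accesses.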

Page 12

Physical Files and Logical Files
• Physical file: a collection of bytes stored on a disk or tape
• Logical file: a channel connecting the program to a physical file

The physical file has a name, for instance myfile.txt.
The logical file has a logical name inside the program.
In C: FILE *outfile;
In C++: fstream outfile;

• The program (application) sends bytes to (or receives bytes from) a file through the logical file. The program knows nothing about where the bytes go (or come from).

• The operating system is responsible for associating a logical file in a program with a physical file on disk or tape. Writing to or reading from a file in a program is done through the operating system.
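
The distinction above can be sketched in C using the slide's own `FILE *outfile` declaration. The helper name `write_message` is hypothetical and only wraps the standard `fopen`, `fputs`, and `fclose` calls; treat it as an illustration rather than a required pattern.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: writes `text` to the physical file `physical_name`.
 * Returns 0 on success and -1 on failure. */
int write_message(const char *physical_name, const char *text)
{
    assert(physical_name != NULL && text != NULL);

    /* `outfile` is the logical file. fopen asks the operating system to
     * associate it with the physical file named by `physical_name`. */
    FILE *outfile = fopen(physical_name, "w");
    if (outfile == NULL)
        return -1;

    /* From here on the program only talks to the logical file; it does
     * not know where on the disk the bytes end up. */
    int ok = fputs(text, outfile) != EOF;

    /* fclose flushes any buffered bytes and ends the association. */
    if (fclose(outfile) != 0)
        ok = 0;

    return ok ? 0 : -1;
}
```

A call such as `write_message("myfile.txt", "hello\n");` uses the physical name exactly once, when the file is opened; every later operation goes through `outfile`.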