CS450/550 FileSystems.1 Adapted from MOS2E UC. Colorado Springs

CS450/550 Operating Systems

Lecture 6 File Systems

Palden Lama

Department of Computer Science

Review: Summary of Chapter 5

° OS responsibilities in I/O operations

• Protection and Scheduling

• CPU communicates with I/O devices

• I/O devices notify OS/CPU

° I/O software hierarchy

• Interrupt handlers

• Device drivers

• Buffering

° Storage Systems

• Disk head scheduling algorithms

° Power Management

° More reading: textbook 5.1 - 5.11

Chapter 6: File Systems

6.1 Files

6.2 Directories

6.3 File system implementation

6.4 Example file systems

Long-term Information Storage

Three essential requirements for long-term information storage

• Must store large amounts of data

• Information stored must survive the termination of the process using it

• Multiple processes must be able to access the information concurrently

What are users’ concerns about the file system?

What are implementors’ concerns about the file system?

File Naming

Typical file extensions.

° Files are an abstraction mechanism

• two-part file names

File Structures

° Three kinds of file structures

• Unstructured byte sequence (Unix and WinOS view)

• Record sequence (early machines’ view)

• Tree (mainframe view)

What do files look like from the programmer’s viewpoint?

File Types

° Regular files

• ASCII files or binary files

° Directories

° Character special files

° Block special files

File Access

° Sequential access

• read all bytes/records from the beginning

• cannot jump around, but can rewind or back up

• convenient when the medium was magnetic tape

° Random access

• bytes/records read in any order

• essential for database systems

• read can be …

- move file marker (seek), then read or …

- read and then move file marker
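
The two access styles can be illustrated with a minimal Python sketch. The file name, the record contents, and the fixed record size of 8 bytes are all assumptions made up for the example.

```python
import os
import tempfile

RECORD_SIZE = 8  # hypothetical fixed record length in bytes

# Build a small file of fixed-size records: b"rec00000", b"rec00001", ...
path = os.path.join(tempfile.mkdtemp(), "records.dat")
with open(path, "wb") as f:
    for i in range(10):
        f.write(f"rec{i:05d}".encode("ascii"))

def read_sequential(path):
    """Sequential access: read every record in order from the beginning."""
    records = []
    with open(path, "rb") as f:
        while True:
            rec = f.read(RECORD_SIZE)
            if not rec:
                break
            records.append(rec)
    return records

def read_random(path, index):
    """Random access: move the file marker (seek), then read one record."""
    with open(path, "rb") as f:
        f.seek(index * RECORD_SIZE, os.SEEK_SET)
        return f.read(RECORD_SIZE)

all_records = read_sequential(path)
seventh = read_random(path, 7)
```

With random access the cost of reaching record 7 does not depend on reading records 0 to 6 first.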

File Attributes

Possible file attributes

File Operations

1. Create

2. Delete

3. Open

4. Close

5. Read

6. Write

7. Append

8. Seek

9. Get attributes

10. Set attributes

11. Rename
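
As a rough illustration, the operations listed above map onto calls in Python's `os` module roughly as follows. This is a sketch only: the file names are hypothetical, and "append" is modelled here as a seek to the end followed by a write.

```python
import os
import tempfile

workdir = tempfile.mkdtemp()
old_name = os.path.join(workdir, "notes.txt")   # hypothetical file names
new_name = os.path.join(workdir, "notes.bak")

# Create + Open: O_CREAT creates the file, os.open returns a descriptor.
fd = os.open(old_name, os.O_CREAT | os.O_RDWR, 0o644)
os.write(fd, b"hello")                 # Write
os.lseek(fd, 0, os.SEEK_END)           # Seek to the end ...
os.write(fd, b", world")               # ... so this write acts as an Append
os.lseek(fd, 0, os.SEEK_SET)           # Seek back to the beginning
data = os.read(fd, 100)                # Read
size = os.fstat(fd).st_size            # Get attributes
os.close(fd)                           # Close

os.chmod(old_name, 0o600)              # Set attributes (permission bits)
os.rename(old_name, new_name)          # Rename
renamed_exists = os.path.exists(new_name)
os.remove(new_name)                    # Delete
deleted = not os.path.exists(new_name)
```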

An Example Program Using File System Calls

An Example Program Using File System Calls (cont.)

Memory-Mapped Files

(a) Segmented process before mapping files into its address space

(b) Process after mapping existing file abc into one segment and creating a new segment for xyz

° The OS provides a way to map files into the address space of a running process: map() and unmap()

• No read or write system calls are needed thereafter
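
A minimal sketch of the idea, using Python's `mmap` module as a stand-in for the map() and unmap() calls named above. The file name and contents are assumptions for the example.

```python
import mmap
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "abc")   # hypothetical file
with open(path, "wb") as f:
    f.write(b"hello file system")

with open(path, "r+b") as f:
    # Map the whole file (length 0 means "entire file") into our address space.
    with mmap.mmap(f.fileno(), 0) as mapped:
        first_word = bytes(mapped[0:5])    # read via memory access, no read() call
        mapped[0:5] = b"HELLO"             # write via memory access, no write() call
        mapped.flush()                     # push modified pages back to the file
    # Leaving the inner "with" block unmaps the file.

with open(path, "rb") as f:
    on_disk = f.read()
```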

Directories: Single-Level Directory Systems

° A single-level directory system is simple to implement

• contains 4 files

• owned by 3 different people, A, B, and C

What is the key problem with the single-level directory systems?

Different users may use the same names for their files

Two-level Directory Systems

Letters indicate owners of the directories and files

What additional operation is required, compared with single-level directory systems?

Login procedure

What if a user has many files and wants to group them in a logical way?

Hierarchical Directory Systems

A hierarchical directory system

Path Names

A UNIX directory tree

° Absolute path name

° Relative path name
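
A small sketch of how a relative path name is resolved against the working directory. The directory `/usr/ast` and the file names are hypothetical; `posixpath` is used so the result is the same on any platform.

```python
import posixpath

cwd = "/usr/ast"                      # hypothetical working directory
absolute = "/usr/ast/mailbox"         # absolute: starts at the root directory
relative = "mailbox"                  # relative: interpreted from the working directory

resolved = posixpath.normpath(posixpath.join(cwd, relative))
parent = posixpath.normpath(posixpath.join(cwd, "../lib/dict"))   # ".." means the parent
is_abs = posixpath.isabs(absolute)
```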

Directory Operations

1. Create

2. Delete

3. Opendir

4. Closedir

5. Readdir

6. Rename

7. Link

8. Unlink
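
The directory operations above correspond roughly to the following calls in Python's `os` module on a POSIX system. This is only a sketch; the directory and file names are made up.

```python
import os
import tempfile

root = tempfile.mkdtemp()
docs = os.path.join(root, "docs")                     # hypothetical names
os.mkdir(docs)                                        # Create

original = os.path.join(docs, "a.txt")
with open(original, "w") as f:
    f.write("data")

alias = os.path.join(docs, "b.txt")
os.link(original, alias)                              # Link: second name, same file
link_count = os.stat(original).st_nlink

with os.scandir(docs) as entries:                     # Opendir ... Closedir
    names = sorted(e.name for e in entries)           # Readdir

os.unlink(original)                                   # Unlink: remove one name
with open(alias) as f:
    still_there = f.read()                            # file survives under the other name

os.unlink(alias)
renamed = os.path.join(root, "archive")
os.rename(docs, renamed)                              # Rename
os.rmdir(renamed)                                     # Delete (directory must be empty)
```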

What are file system implementors’ concerns?

How are files & directories stored?

How is disk space managed?

How to make everything work efficiently and reliably?

File System Implementation

A possible file system layout

° File system layout

• Most disks can be divided into one or more partitions

• At boot, the BIOS reads in and executes the MBR (Master Boot Record)

How to keep track of which disk blocks go with which file?

Implementing Files (1) – Contiguous Allocation

(a) Contiguous block allocation of disk space for 7 files

(b) State of the disk after files D and E have been dynamically removed

° Pros: simple addressing and reading with only one seek

° Cons: disk fragmentation (like swapping); still a good fit for CD-ROMs

Implementing Files (2) – Linked List Allocation

Storing a file as a linked list of disk blocks

° Keep each file as a linked list of disk blocks

• Pros: no space is lost due to disk fragmentation

• Cons: how about random access?
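
To see why random access is the weak point, consider this small simulation. The "disk" is a hypothetical dictionary in which every block stores the number of its successor, so reaching the n-th block of a file means reading all the blocks before it.

```python
# Hypothetical disk: block number -> (next block number or None, payload)
disk = {
    4: (7, "A0"),
    7: (2, "A1"),
    2: (10, "A2"),
    10: (12, "A3"),
    12: (None, "A4"),
}

def read_nth_block(first_block, n):
    """Return (payload, disk_reads) for the n-th block (0-based) of a file."""
    block = first_block
    reads = 0
    for _ in range(n):
        next_block, _payload = disk[block]   # must read the block to learn its successor
        reads += 1
        if next_block is None:
            raise IndexError("file has fewer blocks than requested")
        block = next_block
    _next, payload = disk[block]
    reads += 1
    return payload, reads

first = read_nth_block(4, 0)
fourth = read_nth_block(4, 3)
```

In this sketch the number of disk reads grows linearly with the position of the block in the file.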

Implementing Files (3) – FAT (File Allocation Table)

Linked list allocation using a file allocation table in RAM

° FAT: a table in memory with the pointer word of each disk block

• High utilization + easy random access, but too “FAT” maybe?

Consider:

A 20 GB disk

1 KB block size

Each entry 3 B

How much space for a FAT?

How about paging it?
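
The arithmetic for this question can be worked through as below, assuming binary units (1 GB = 2^30 bytes, 1 KB = 2^10 bytes) and one table entry per disk block.

```python
GB = 2**30
KB = 2**10
MB = 2**20

disk_size = 20 * GB
block_size = 1 * KB
entry_size = 3            # bytes per FAT entry

num_blocks = disk_size // block_size      # one FAT entry per disk block
fat_bytes = num_blocks * entry_size
fat_mb = fat_bytes / MB
```

Under these assumptions the table has about 20 million entries and occupies 60 MB, all of which must be in memory at once unless the table is paged.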

Implementing Files (4) – I-nodes

An example i-node

° i-node: a data structure listing the attributes and disk addresses of the file’s blocks; in memory when the corresponding file is open

Why does the i-node scheme require much less space than FAT?
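
A minimal sketch of an i-node as a data structure. The field names, the count of 10 direct addresses, and the single list standing in for an indirect block are assumptions for illustration, not the layout of any particular system.

```python
from dataclasses import dataclass, field

NUM_DIRECT = 10  # assumed number of direct addresses in the i-node

@dataclass
class Inode:
    owner: str
    size: int
    direct: list = field(default_factory=list)      # disk addresses of the first blocks
    indirect: list = field(default_factory=list)    # stands in for a block of further addresses

    def block_address(self, logical_block):
        """Translate a file-relative block index into a disk address."""
        if logical_block < NUM_DIRECT:
            return self.direct[logical_block]
        return self.indirect[logical_block - NUM_DIRECT]

# Only the i-nodes of open files need to be in memory.
open_files = {}   # i-number -> Inode

open_files[17] = Inode(owner="ast", size=12 * 1024,
                       direct=list(range(100, 110)), indirect=[300, 301])
addr_direct = open_files[17].block_address(3)
addr_indirect = open_files[17].block_address(11)
```

The in-memory table grows with the number of open files, whereas a FAT grows with the size of the disk.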

Implementing Files (5) – Summary

° How to find the disk blocks of a file?

• Contiguous allocation: the disk address of the entire file

• Linked list & FAT: the number of the first block

• i-node: the number of the i-node

Who provides the information above?

The directory entry (based on the path name)

Implementing Directories (1)

(a) A simple directory (MS-DOS/Windows)

Fixed-size entries

File names, attributes, and disk addresses in directory entry

(b) Directory (UNIX); each entry just refers to an i-node, i-number returned

° The directory entry, based on the path name, provides the information to find the disk blocks

……

What to do about the few file names that are long and variable-length?

Implementing Directories (2)

° Two ways of handling long and variable-length file names in directory

(a) In-line: compaction and page fault. (b) In a heap: page fault

Shared Files

File system containing a shared file

° How to let multiple users share files?

What if directories contain the disk addresses?

Shared Files in UNIX

(a) Situation prior to linking; (b) After the link is created

(c) After the original owner removes the file

° UNIX utilizes the i-node data structure

• What if a file is removed by the owner?

Shared Files – Symbolic Linking

° A new file, created with type LINK, enters B’s directory

• The file contains just the path name of the linked file

• Con: extra overhead with each file access, parsing

• Pro 1: The file is destroyed only when the owner removes it

- Removing a symbolic link does not affect the file

• Pro 2: networked file systems
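
The difference between an i-node (hard) link and a symbolic link can be observed with a short Python sketch on a POSIX system. The file names are hypothetical.

```python
import os
import tempfile

root = tempfile.mkdtemp()
owner_file = os.path.join(root, "c_file")   # hypothetical names
hard = os.path.join(root, "b_hard")
soft = os.path.join(root, "b_soft")

with open(owner_file, "w") as f:
    f.write("shared")

os.link(owner_file, hard)       # hard link: another directory entry for the same i-node
os.symlink(owner_file, soft)    # symbolic link: a small file holding the path name

stored_path = os.readlink(soft)
os.remove(owner_file)           # the owner removes the file

hard_still_readable = os.path.exists(hard)   # i-node survives, link count is now 1
soft_dangles = os.path.islink(soft) and not os.path.exists(soft)
```

After the owner's entry is removed, the hard link still reaches the data, while the symbolic link remains but points at a path that no longer exists.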

Disk Space Management – Block Size

° Dark line (left hand scale) gives data rate of a disk

° Dotted line (right hand scale) gives disk space efficiency

° Assumes all files are 2 KB; the horizontal axis is block size

° All file systems chop files into fixed-size blocks that need not be adjacent

° Block size is a trade-off between space utilization and data rate

• Three-step disk access: seek, rotational delay, transfer
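
The trade-off can be reproduced numerically with a small sketch. Only the 2 KB file size comes from the slide; the seek time, rotation time, and track capacity below are assumed disk parameters chosen for illustration.

```python
SEEK_MS = 10.0            # assumed average seek time
ROTATION_MS = 8.33        # assumed time for one full rotation
TRACK_BYTES = 131072      # assumed bytes per track
FILE_BYTES = 2 * 1024     # the slide's assumption: all files are 2 KB

def access_time_ms(block_bytes):
    """Seek + average rotational delay (half a rotation) + transfer time."""
    return SEEK_MS + ROTATION_MS / 2 + (block_bytes / TRACK_BYTES) * ROTATION_MS

def data_rate_kb_per_s(block_bytes):
    return (block_bytes / 1024) / (access_time_ms(block_bytes) / 1000)

def space_efficiency(block_bytes):
    """Fraction of allocated space that holds file data."""
    blocks_needed = -(-FILE_BYTES // block_bytes)   # ceiling division
    return FILE_BYTES / (blocks_needed * block_bytes)

table = [(b, data_rate_kb_per_s(b), space_efficiency(b))
         for b in (512, 1024, 2048, 4096, 8192, 16384)]
```

Larger blocks amortize the seek and rotational delay over more bytes, so the data rate rises, but once the block is larger than the file the unused remainder of each block is wasted.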

Disk Space Management – Tracking Free Blocks

(a) Storing the free list on a linked list. (b) A bit map

° How to keep track of free blocks?

Example of Tracking Free Blocks

° Consider a 16-GB disk, 1-KB block size, 32-bit disk block number

If all blocks are empty, how many blocks are needed for the free list and for the bit map, respectively? Which one uses less space? But what if the disk is nearly full?
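
One way to work the numbers, assuming binary units and assuming that each free-list block gives up one of its slots to a pointer to the next list block:

```python
GB, KB = 2**30, 2**10

disk_size = 16 * GB
block_size = 1 * KB
block_number_bytes = 4                       # 32-bit disk block numbers

num_blocks = disk_size // block_size         # 2**24 blocks

# Free list: each list block holds 256 numbers; one is the pointer to the next list block.
numbers_per_block = block_size // block_number_bytes
free_per_list_block = numbers_per_block - 1
free_list_blocks = -(-num_blocks // free_per_list_block)   # ceiling division

# Bit map: one bit per disk block, regardless of how many are free.
bitmap_bytes = num_blocks // 8
bitmap_blocks = -(-bitmap_bytes // block_size)
```

Under these assumptions the bit map is much smaller when the disk is empty. The free list shrinks as blocks are allocated, so on a nearly full disk it can become the smaller of the two, while the bit map stays the same size.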

How much information should be stored in the memory for each scheme?

Disk Space Management – Disk Quotas

Quotas for keeping track of each user’s disk use

° An open file table in memory has attributes telling who the owner of an opened file is; and a per-user table contains the quota

File System Reliability

° Physical dumping: starts at block 0 and writes all the disk blocks onto the output tape in order

° Logical dumping: starts at one or more specified directories and recursively dumps all files and directories found there that have changed since some given base date (e.g., the last backup for an incremental dump, or system installation for a full dump)
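
The selection step of a logical dump can be sketched as a recursive walk that keeps files modified after the base date. This is a simplification: it only lists the files, the function name is made up, and it ignores the directories that a real dump would also have to record so the files can be restored in place.

```python
import os
import tempfile
import time

def files_changed_since(top, base_time):
    """Walk `top` recursively; return paths of files modified after base_time."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(top):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.stat(path).st_mtime > base_time:
                changed.append(path)
    return sorted(changed)

# Build a tiny hypothetical tree with one old and one recently changed file.
top = tempfile.mkdtemp()
os.mkdir(os.path.join(top, "sub"))
old_file = os.path.join(top, "old.txt")
new_file = os.path.join(top, "sub", "new.txt")
for p in (old_file, new_file):
    with open(p, "w") as f:
        f.write("x")

base = time.time()
os.utime(old_file, (base - 3600, base - 3600))   # pretend it predates the last dump
os.utime(new_file, (base + 60, base + 60))       # pretend it changed afterwards
to_dump = files_changed_since(top, base)
```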

File System Consistency

° File system states

(a) consistent

(b) missing block

(c) duplicate block in free list

(d) duplicate data block
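
A block-level consistency check can be sketched by counting, for every block, how often it appears in some file and how often it appears in the free list. The function name, the report format, and the sample data below are assumptions for illustration.

```python
from collections import Counter

def check_blocks(num_blocks, files, free_list):
    """Classify each block by how often it appears in files and in the free list."""
    in_use = Counter(b for blocks in files.values() for b in blocks)
    free = Counter(free_list)
    report = {"missing": [], "duplicate_free": [], "duplicate_data": []}
    for b in range(num_blocks):
        if in_use[b] == 0 and free[b] == 0:
            report["missing"].append(b)          # state (b): in neither table
        if free[b] > 1:
            report["duplicate_free"].append(b)   # state (c): twice in the free list
        if in_use[b] > 1:
            report["duplicate_data"].append(b)   # state (d): in two files at once
    return report

files = {"f1": [0, 1, 5], "f2": [5, 6]}   # hypothetical file -> block numbers
free_list = [3, 4, 4, 7]
report = check_blocks(8, files, free_list)
```

A consistent file system, state (a), is one where every block has a count of exactly 1 in one table and 0 in the other, which yields an empty report.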

File System Performance - Caching

The block cache data structures with a bi-directional usage list

° Cache: a collection of blocks that logically belong on the disk but are being kept in memory for performance reasons

• Hash the device and disk address and look up the result in a hash table with collision chains

• Cache references are relatively infrequent

Why is pure LRU undesirable when consistency after a system crash is an issue?
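
A minimal sketch of a block cache with hashed lookup and LRU replacement. Python's `OrderedDict` stands in for both the hash table and the usage list; the class name and the `read_block` callback are hypothetical, and writing modified blocks back to disk is left out.

```python
from collections import OrderedDict

class BlockCache:
    def __init__(self, capacity, read_block):
        self.capacity = capacity
        self.read_block = read_block          # function (device, block_no) -> data
        self.blocks = OrderedDict()           # hash lookup plus usage order in one structure
        self.disk_reads = 0

    def get(self, device, block_no):
        key = (device, block_no)
        if key in self.blocks:
            self.blocks.move_to_end(key)      # now the most recently used
            return self.blocks[key]
        data = self.read_block(device, block_no)
        self.disk_reads += 1
        self.blocks[key] = data
        if len(self.blocks) > self.capacity:
            self.blocks.popitem(last=False)   # evict the least recently used
        return data

cache = BlockCache(2, lambda dev, n: f"dev{dev}-block{n}")
cache.get(0, 1)
cache.get(0, 2)
cache.get(0, 1)      # hit: block 1 becomes most recently used
cache.get(0, 3)      # miss: evicts block 2
cached_keys = list(cache.blocks)
```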

File System Performance – Caching II

° Cache & Consistency

• UNIX: system call sync() every 30 s

• MS-DOS: write-through strategy

File System Performance – Block Read Ahead

° Block Read Ahead works well for files that are being read sequentially

• Spatial locality

The Windows 98 File System (1)

The extended MS-DOS directory entry used in Windows 98

Summary

° Files and directories

° File system implementation

• Contiguous files

• Linked lists

• FAT

• i-nodes

° Disk space management

° File system performance and consistency

° More reading: textbook 6.1 - 6.6