Upload
nguyenquynh
View
223
Download
1
Embed Size (px)
Citation preview
1
Chapter 11: File System Chapter 11: File System ImplementationImplementation
11.2 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
ObjectivesObjectives
To describe the details of implementing local file systems and directory structures
To describe the implementation of remote file systems
To discuss block allocation and free-block algorithms and trade-offs
2
11.3 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
File ConceptFile Concept
The file system consists of two distinct parts: collection of files and a directory structure.
The OS provides abstraction of physical properties of disks by defining a logical storage unit, the file.
A file is a collection of related information that is recorded on secondary storage.
Why files, why we don’t save info in the process itself?
Limited space
When the process terminates, the information is lost
Multiple processes must be able to access the information concurrently.
11.4 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
File AttributesFile Attributes
Name – only information kept in human-readable form, For the convenience of human users.
Identifier – unique tag (number) identifies file within file system
Type – needed for systems that support different types
Location – Pointer to a device and the file location on the device.
Size – current file size
Protection – controls who can do reading, writing, executing
Time, date, and user identification – data for protection, security, and usage monitoring
Information about files are kept in the directory structure, which is maintained on the disk
Usually, a directory entry consists of a filename and its ID. The ID locates the other attributes.
3
11.5 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
File OperationsFile Operations
A file is an abstract data type. System calls in the OS implement basic file operations:
Create: The file is created with no data (just to initialize attributes)
Open: Before using a file, a process must open it.
to allow the system to fetch the attributes and list of disk addresses into main memory for rapid access on later calls.
Close: to free up internal table space
Read: read from the current pointer position
Write: Data are written to the file usually at the current position
Reposition within file (file seek)
Delete
Truncate
Other possible operations: append, rename, get and set file attributes
Combining the above operations to perform other file operation (copying)
11.6 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Directory StructureDirectory StructureA collection of nodes containing information about all files in a file system (name. location, size)
A directory is a system file for maintaining the structure of the file system
F 1 F 2F 3
F 4
F n
Directory
Files
Both the directory structure and the files reside on disk (or a partition)A disk can contain multiple file systemsDirectories can be organized in several different ways.
4
11.7 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Operations Performed on DirectoryOperations Performed on Directory
Search for a file
Create a file
Delete a file
List a directory
Rename a file
11.8 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Issues for Directory Organization
Efficiency – locating a file quickly
Naming – convenient to users
two users can have same name for different files
the same file can have several different names
Grouping – logical grouping of files by properties, (e.g., all Java programs, all games, …)
5
11.9 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Directory StructureDirectory StructureSingleSingle--Level DirectoryLevel Directory
A single directory for all users, in which all files are contained.
Was common in early days when there was only one user.
Simple and easy to find files (one place to look in)
Naming problem
Grouping problem
11.10 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
TwoTwo--Level DirectoryLevel Directory
Separate directory for each user
Path name: defined by the username and the file name.Each OS has specific format for the path name.
System files are searched through search path rather than copied to each user directory
Different users can have files with the same name
Efficient searching
No grouping capability
6
11.11 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
TreeTree--Structured DirectoriesStructured Directories
11.12 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
TreeTree--Structured Directories (Cont)Structured Directories (Cont)
Efficient searching
Grouping Capability (subdirectories)
Current directory (working directory): should contain most of the files of interest to the process, else a user specify new directory
Special system calls (cd) are provided to change directories.
Absolute or relative path names.
Absolute – specify the name starting from the top (root) directory.
Example: /users/smith/project1/main.c
Relative – specify the name starting from the current directory.
Example: main.c
Search path for command names.
7
11.13 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
TreeTree--Structured Directories (Cont)Structured Directories (Cont)
Creating a new file is done in current directoryDeleting a directory – two options:
Require the directory to be empty before it can be deleted.Provide an option to delete the directory along with all of its
files and subdirectories.Delete a filerm <file-name>
Creating a new subdirectory is done in current directorymkdir <dir-name>
Example: if in current directory /mailmkdir count
prog copy prt exp count
Deleting “mail” ⇒ deleting the entire subtree rooted by “mail”
11.14 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
FileFile--System StructureSystem Structure
File system resides permanently on secondary storage (disks)
For efficiency, I/O transfer between memory and a disk is performed on the units of blocks
Each block has one or more sector, a sector is usually 512 bytes
OS imposes a file system to store and access data. 2 problems:
Define how the file system should look to the user. It involves defining a file, its attributes, operations on a file, directorystructure. USER VIEW
Creating algorithms and data structure to map the logical file system into the physical secondary-storage device. SYSTEM VIEW
8
11.15 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
FileFile--System Structure (System Structure (contdcontd))
A file system is generally organized into layers:
I/O control – device drivers and interrupt handlers to transfer info between main memory and the disk system
The device driver translates a high level commands ( such as get block 123) into hardware-specific instructions.
Basic file system – implements basic operations (commands).
File-organization module – knows about files and their logical and physical blocks (location).
Tracks also the free space
Logical file system – manages metadata, i.e. information about the files.
Manages the directory structure via a file control block (FCB), which contains info about the file including ownership, permissions, and location.
11.16 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Layered File SystemLayered File System
9
11.17 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
On-Disk File System Structures
File system structures that reside on disk generally include:
A boot control block (boot block, boot sector)
contains information needed to boot the OS. If no OS, this block is empty
Usually at the beginning of each partition (disk)
A volume control block contains volume (partition) information such as the number of blocks, size of blocks, free-block count and free-block pointers.
(UFS: superblock. NTFS: master file table.)
A directory structure per file system is used to organize files.
A file control block for each file contains details about the file, including permissions, ownership, size, and location of data blocks. (UFS: inode. NTFS: stored within the master file table.)
11.18 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
A Typical File Control BlockA Typical File Control Block
10
11.19 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
InIn--Memory File System StructuresMemory File System Structures
Information about files that is kept in memory by the operating system may include:
A mount table with information about each mounted volume.
A directory-structure cache.
The system-wide open-file table, with a copy of the FCB for each open file, along with other information.
The per-process open-file tables, which contain pointers to appropriate entries in the system-wide open-file table, and other information.
The following slide illustrates some of the in-memory structures kept by the operating system.
Figure (a) refers to opening a file.
Figure (b) refers to reading a file.
11.20 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
InIn--Memory File System StructuresMemory File System Structures
11
11.21 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Directory ImplementationDirectory Implementation
Linear list of file names with pointer to the data blocks.
simple to program
time-consuming to execute requires linear search
To create a new file, the directory must be searched first to make sure that no existing file has the same name.
Also the same for deleting a file.
Hash Table – linear list stores the directory entries but with hash data structure. (hash function takes filename as input and returns a pointer to the filename in the list)
decreases directory search time
collisions – situations where two file names hash to the same location
Another problem is fixed size of hash tables. If the directory size must increase, a new hash function is needed. (ex: mod 64 to mod 128)
11.22 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
File Implementation File Implementation (Allocation Methods)(Allocation Methods)
Disk blocks must be allocated to files. The main problem is how to allocate space to files so that disk space is utilized effectively and files can be accessed quickly.
keeping track of which disk blocks go with which file
There are three major methods of allocating disk space to files:
Contiguous allocation
Linked allocation
Indexed allocation
12
11.23 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Contiguous AllocationContiguous AllocationEach file occupies a set of contiguous blocks on the diskProvides efficient direct access.
Accessing block b+1 after block b requires no head movement, so the number of head seeks is minimal.
Simple addressing scheme: only starting location (block #) and length (number of blocks) are required
The directory entry for each file indicates the address of the starting block and the number of blocks required
Both sequential and direct access can be supportedA dynamic storage-allocation problem.
First fit and best fit are the most common used strategies.External fragmentation can be a problem.
Another problem is knowing how much space is needed for the file (file size) when it is createdFiles cannot grow, so if the allocated space is not enough any more:
Terminate the user program, with an appropriate error messageFind larger hole and copy the file into it.
Pre-allocation of space is generally required if the file size is known in advance, However, the file might reach its final size over long period (may be years) which can cause significant internal fragmentation.
11.24 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Contiguous Allocation of Disk SpaceContiguous Allocation of Disk Space
13
11.25 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Linked AllocationLinked Allocation
Each file is a linked list of fixed-size disk blocks.
Blocks may be scattered anywhere on the disk.
Allocate blocks as needed as the file grows, wherever they are available on the disk
The directory contains a pointer to the first and last blocks of the file
Each block contains a pointer to the next block (not accessible by users)
11.26 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Linked Allocation (Cont.)Linked Allocation (Cont.)
Simple – only need to keep the starting address of each file.
No external fragmentation. Any free block can be used to satisfy a request for more space.
Creating a file is easy, declared of size zero (null pointer to the first block) then grow easily as long as free blocks are available
No efficient direct access.
Effective for sequential access
Pointers saved in the blocks consume some space
Use of clusters (a set of blocks which the system deals with as a unit)
Decreases overhead of pointers (4 blocks = 1 cluster needs 1 ptr)
Increases throughput (fewer head seeks).
Increases internal fragmentation.
Reliability: if a pointer is lost, blocks (clusters) can’t be traversed
14
11.27 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Linked Allocation with FAT
Variation: File-Allocation Table (FAT)
A section of disk at the beginning is set a side to store the table.
FAT: One entry per block or cluster, and is indexed by block no. Link pointers only are kept in the FAT.
taking the pointer word from each disk block and putting it in the FAT table in memory
The directory entry contains the block no. of the first block
Special end-of-file is saved in the last pointer.
The FAT should be cached in memory to reduce head seeks.
Random access is much easier
Used by MS-DOS
11.28 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Indexed Allocation (iIndexed Allocation (i--node)node)
Linked allocation solves the external fragmentation and size-declaration problem, but it can’t support efficient direct access with the absence of FAT
Indexed allocation brings all pointers together into the index block.
Each file has its own index block, which is an array of disk-block addresses
The directory contains the address of the index block, from which the ith block can be accessed directly
Logical view.
Block 5571396
index block
Block7
Block 13
Block 9
Block 6
15
11.29 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Example of Indexed AllocationExample of Indexed Allocation
11.30 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Indexed Allocation (cont.)
More efficient direct access than linked allocation.
No external fragmentation. Any free block can be used to satisfy a request for more space.
Suffers from a wasted space due to the use of index block. (pointer overhead is larger)
How large the index block should be? What if a file is too largefor one index block?
Linked scheme: use one block. To allow larger files, link several index blocks
Multilevel index: use a first-level index block to point to a second-level index blocks
Combined scheme: used in UNIX.
16
11.31 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Indexed Allocation (Cont.) Indexed Allocation (Cont.) –– Multilevel IndexMultilevel Index
M
outer-index
index table file
11.32 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Combined Scheme: UNIX (4K bytes per block)Combined Scheme: UNIX (4K bytes per block)
17
11.33 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Efficiency and PerformanceEfficiency and Performance
Disks are the main bottleneck in a system performance, so they need to be improved in terms of efficiency and performance
Efficiency depends on:
disk allocation and directory algorithms
Clustering reduces head seeks and improve file transfer
types of data kept in file’s directory entry
Do we need to keep last access, last write dates for example. If yes we need to update these fields whenever a file is read or written
Pointer size.
Performance: it can be improved even after the algorithms are selected
A buffer (disk) cache can be used to keep frequently used file blocks in mem.
Page cache: cache file data as pages rather than as system blocks.
Memory mapped files can improve performance through page caching
Synchronous and asynchronous writes
free-behind and read-ahead – techniques to optimize sequential access
11.34 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
RecoveryRecoveryCare must be taken to ensure that system failure does not result in data loss or data inconsistency.
Consistency checking – compares data in directory structure kept in memory with data blocks on disk, and tries to fix inconsistencies
performing a backup on an active file system might lead to inconsistency. (If files and directories are being added, deleted, and modified during the dumping process, the resulting backup may beinconsistent )
Example: fsck on Unix or chkdsk in DOS
Use system programs to back up data from disk to another storage device (floppy disk, magnetic tape, other magnetic disk, optical)
Full backup: should the entire file system be backed up or only part of it?
it is usually desirable to back up only specific directories andeverything in them rather than the entire file system.
Incremental backup: back up files that have not changed since the last backup
Recover lost file or disk by restoring data from backup
18
11.35 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Log Structured File SystemsLog Structured File Systems
We have faster CPUs, faster and larger memories, larger but SLOW diskswrites are done in very small chunks very inefficient due to high seek time
Ex: creating a file requires the i-node for the directory, the directory block, the i-node for the file, and the file itself must all be written
The goal of LFS is to achieve the full bandwidth of the disk, even in the face of a workload consisting in large part of small random writes Log structured (or journaling) file systems record each small update to the file system as a transaction
Used in NTFS, optional in Solaris UNIXAll operations for a transactions are written to a log (the disk is viewed as a log)
A transaction is considered committed once all of its operations have been successfully written to the log. Then the system call can return to the userHowever, the file system may not yet be updated
The transactions in the log are asynchronously written to the file systemWhen the file system has been successfully updated, the transactions are
removed from the log.After a system crash, all remaining committed transactions in the log are performed. Uncommitted transactions are rolled back.
Summary: all writes are initially buffered in memory, and periodically all the buffered writes are written to the disk in a single segment, at the end of the log
11.36 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
RAID StructureRAID Structure
RAID – multiple disk drives provides reliability via redundancy.distributes data across several physical disks which look to the operating system and the user like a single logical disk
Idea: Use many disks in parallel to increase storage bandwidth, improve reliability
Files are striped across disks
Each stripe portion is read/written in parallel
Bandwidth increases with more disks, but more error prone
RAID is arranged into six different levels. RAID 0 distributes data across several disks in a way which gives improved speed and full capacity, but all data on all disks will be lost if any one disk fails.
RAID 1 (mirrored disks) uses two (possibly more) disks which each store the same data, so that data is not lost so long as one disk survives
RAID 5 combines three or more disks in a way that protects data against loss of any one disk
19
11.37 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
RAID (cont)RAID (cont)
Several improvements in disk-use techniques involve the use of multiple disks working cooperatively.
key concepts in RAID:
mirroring, the copying of data to more than one disk;
Used for reliability purposes
striping, the splitting of data across more than one disk;
Used for performance. More disks in the array means higher bandwidth, but greater risk of data loss
error correction, where redundant data is stored to allow problems to be detected and possibly fixed.
11.38 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
RAID LevelsRAID LevelsRAID 0: Striped set without parity:
Provides improved performance and additional storage but no fault tolerance. Any disk failure destroys the array, which becomes more likely with more disks in the array
RAID1: Mirrored set without parity.Provides fault tolerance from disk errors and single disk failure
RAID2: Redundancy through Hamming codeRAID3: Striped with interleaved parity
Dedicated disk for parity writing bottleneck
RAID4: striped with Block level parity.Dedicated disk for parity writing bottleneck
RAID5: Striped with distributed parityEach disk have some blocks that hold the corresponding blocks parityUpon drive failure, any subsequent reads can be calculated from the distributed parity such that the drive failure is masked from the end user
RAID6: Striped set with dual parityeach group of blocks have two distributed parity blocks that are distributed across the disks
Provides fault tolerance from two drive failures
20
11.39 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
RAID (0 + 1) and (1 + 0)RAID (0 + 1) and (1 + 0)
11.40 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Fast File System
The original Unix file system had a simple, straightforward implementation
Easy to implement and understand
But very poor utilization of disk bandwidth (lots of seeking)
BSD Unix folks did a redesign (mid 80s) that they called the Fast File System (FFS)
Improved disk utilization, decreased response time
Now the FS from which all other Unix FS’s have been compared
Good example of being device-aware for performance
21
11.41 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Data and Inode Placement
Original Unix FS had two placement problems:
1. Data blocks allocated randomly in aging file systems
Blocks for the same file allocated sequentially when FS is new
As FS “ages” and fills, need to allocate into blocks freed up when other files are deleted
Problem: Deleted files essentially randomly placed
So, blocks for new files become scattered across the disk
2. Inodes allocated far from blocks
All inodes at beginning of disk, far from data
Traversing file name paths, manipulating files, directories requires going back and forth from inodes to data blocks
Both of these problems generate many long seeks
11.42 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Cylinder Groups
BSD FFS addressed these problems using the notion of a cylinder group
Disk partitioned into groups of cylinders
Data blocks in same file allocated in same cylinder
Files in same directory allocated in same cylinder
Inodes for files allocated in same cylinder as file data blocks
Free space requirement
To be able to allocate according to cylinder groups, the disk must have free space scattered across cylinders
10% of the disk is reserved just for this purpose
22
11.43 Silberschatz, Galvin and Gagne ©2005Operating System Concepts – 7th Edition, Jan 1, 2005
Other Problems
Small blocks (1K) caused two problems:
Low bandwidth utilization
Small max file size (function of block size)
Fix using a larger block (4K)
Very large files, only need two levels of indirection for 2^32
Problem: internal fragmentation
Fix: Introduce “fragments” (1K pieces of a block)
Problem: Media failures
Replicate master block (superblock)
End of Chapter 11End of Chapter 11