File Implementation

File Implementation. File System Abstraction How to Organize Files on Disk Goals: –Maximize sequential performance –Easy random access to file –Easy



File System Abstraction


How to Organize Files on Disk
• Goals:
  – Maximize sequential performance
  – Easy random access to file
  – Easy management of file (growth, truncation, etc.)


Disk Formatting
• A new magnetic disk is just a platter of magnetic recording material
• Before a disk can store data, it must be divided into sectors that the disk controller can read and write. This process is called low-level formatting or physical formatting. It fills the disk with a special data structure for each sector, consisting of a header, a data area, and a trailer
  – Header and trailer contain information used by the disk controller (e.g., sector number, error-correcting code (ECC))


Disk Formatting
• To use a disk to hold files, the OS still needs to record its own data structures on the disk
  – The first step is to partition the disk into one or more groups of cylinders. The OS can treat each partition as though it were a separate disk
  – The second step is logical formatting (or creation of a file system). The OS stores the initial file system data structures onto the disk


Building a File System
• File System: layer of the OS that transforms the block interface of disks (or other block devices) into files, directories, etc.
• File System Components
  – Disk Management: collecting disk blocks into files
  – Naming: interface to find files by name, not by blocks
  – Protection: layers to keep data secure
  – Reliability/Durability: keeping files durable despite crashes, media failures, attacks, etc.


Disk Management Policies
• Basic entities on a disk
  – File: user-visible group of blocks arranged sequentially in logical space
  – Directory: user-visible index mapping names to files
• Need a way to structure files: File Header
  – Track which blocks belong at which offsets within the logical file structure
  – Optimize placement of files' disk blocks to match access and usage patterns


Designing the File System: Access Patterns
• How do users access files?
  – Need to know the type of access patterns the user is likely to throw at the system
• Sequential Access: bytes read in order (“give me the next X bytes, then give me the next, etc.”)
  – Almost all file accesses are of this flavor
• Random Access: read/write an element out of the middle of an array (“give me bytes i-j”)
  – Less frequent, but still important
  – Want this to be fast: don't want to have to read all bytes to get to the middle of the file


How to Organize Files on Disk
• Goals:
  – Maximize sequential performance
  – Easy random access to file
  – Easy management of file (growth, truncation, etc.)


How to Organize Files on Disk
• First Technique: Contiguous Allocation
  – Use a contiguous range of blocks
  – User says in advance how big the file will be (disadvantage)
  – Search for space using best fit/first fit
    • What if there is not enough contiguous space for a new file?
  – File header contains: first block in file, file size (# of blocks)
  – Pros?
    • Fast sequential access, easy random access
  – Cons?
    • External fragmentation/hard to grow files
    • Free holes get smaller and smaller
    • Could compact space, but that would be really expensive
• Contiguous Allocation used by IBM 360
  – Result of allocation and management cost: people would create a big file, put their file in the middle
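The contiguous scheme above can be sketched in a few lines. This is only an illustration under stated assumptions, not the code of any real file system: the free map is modeled as a Python list of booleans, and the names `FileHeader`, `first_fit`, and `block_of` are hypothetical. It shows the two properties the slide lists: random access is a single addition, and a request can fail even when enough free blocks exist in total (external fragmentation).

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class FileHeader:
    first_block: int   # first disk block of the file
    num_blocks: int    # file size in blocks, declared in advance


def first_fit(free: List[bool], num_blocks: int) -> Optional[FileHeader]:
    """Claim the first run of `num_blocks` free blocks, or return None."""
    if num_blocks <= 0:
        raise ValueError("num_blocks must be positive")
    run_start, run_len = 0, 0
    for i, is_free in enumerate(free):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == num_blocks:
                # Mark the whole run as used and hand back the header.
                for b in range(run_start, run_start + num_blocks):
                    free[b] = False
                return FileHeader(run_start, num_blocks)
        else:
            run_len = 0
    return None  # no hole is large enough


def block_of(header: FileHeader, logical_block: int) -> int:
    """Translate a logical block number to a disk block: one addition."""
    if not 0 <= logical_block < header.num_blocks:
        raise IndexError("logical block outside file")
    return header.first_block + logical_block
```

With a 10-block disk where block 2 is already in use, a 3-block request lands at block 3. A later 5-block request then fails, even though 6 blocks are still free, because the free space is split into holes of 2 and 4 blocks.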


Contiguous Allocation


How to Organize Files on Disk
• Second Technique: Linked List Approach
  – Each block holds a pointer to the next block on disk
  – Pros: can grow files dynamically, no external fragmentation
  – Cons?
    • Bad sequential access (seek between each block), unreliable (lose block, lose rest of file)
    • Serious con: bad random access!
[Figure: File Header pointing to a chain of linked blocks, the last of which points to Null]
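To make the random-access cost of the linked approach concrete, here is a small sketch. The "disk" is a hypothetical in-memory dictionary mapping a block number to a pair of payload and next-block pointer, and `read_logical_block` is an illustrative name, not part of any real system. Reaching logical block n requires reading every block before it.

```python
from typing import Dict, Optional, Tuple

# Hypothetical disk model: block number -> (payload, next block or None).
Disk = Dict[int, Tuple[bytes, Optional[int]]]


def read_logical_block(disk: Disk, first_block: int, n: int) -> Tuple[bytes, int]:
    """Return (payload, number of blocks read) for logical block n."""
    if n < 0:
        raise IndexError("negative logical block")
    current: Optional[int] = first_block
    reads = 0
    while current is not None:
        payload, next_block = disk[current]  # one disk read per hop
        reads += 1
        if reads - 1 == n:
            return payload, reads
        current = next_block
    raise IndexError("file has fewer blocks than requested")
```

For a three-block file stored in blocks 7, 2, and 9, fetching the last logical block costs three reads, and the cost grows linearly with the offset.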


Linked Allocation


Linked Allocation: File-Allocation Table (FAT)
• MS-DOS links blocks together to create a file
  – Links are not in the blocks, but in the File Allocation Table (FAT)
    • FAT contains an entry for each block on the disk
    • FAT entries corresponding to the blocks of a file are linked together
  – Access properties:
    • Random access time is improved because the disk head can find the location of any block by reading the FAT
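A minimal sketch of the idea, assuming the FAT is a plain list with one entry per disk block and that `END_OF_FILE = -1` marks the end of a chain (a marker chosen for this example; real FAT variants use their own reserved values). The chain is still followed link by link, but only table entries are touched, never the data blocks themselves.

```python
from typing import List

END_OF_FILE = -1  # hypothetical end-of-chain marker for this sketch


def fat_lookup(fat: List[int], first_block: int, n: int) -> int:
    """Return the disk block that holds logical block n of a file."""
    if n < 0:
        raise IndexError("negative logical block")
    current = first_block
    for _ in range(n):
        current = fat[current]  # follow one link inside the table
        if current == END_OF_FILE:
            raise IndexError("file has fewer blocks than requested")
    return current
```

For a file occupying blocks 7, 2, and 9, the table holds `fat[7] = 2`, `fat[2] = 9`, and `fat[9] = END_OF_FILE`, so logical block 2 resolves to disk block 9 after two table lookups.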


Linked Allocation: File-Allocation Table (FAT)


Linked Allocation
• Linked allocation solves the external fragmentation and size declaration problems of contiguous allocation
• However, in the absence of a FAT, linked allocation cannot support efficient direct access


Indexed Allocation
• Third technique: Indexed Files
  – System allocates a file header block to hold an array of pointers big enough to point to all blocks
    • User pre-declares max file size
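A rough sketch of an indexed file header, assuming the index is a fixed-size array sized from the pre-declared maximum. The class name `IndexedFile` and its methods are hypothetical and exist only for illustration. Data blocks can live anywhere on disk, and lookup is a single array access.

```python
from typing import List, Optional


class IndexedFile:
    """File header holding one pointer slot per possible block."""

    def __init__(self, max_blocks: int) -> None:
        # The index is sized up front from the pre-declared maximum.
        # None marks a slot with no disk block assigned yet.
        self.index: List[Optional[int]] = [None] * max_blocks

    def _check(self, logical_block: int) -> None:
        if not 0 <= logical_block < len(self.index):
            raise IndexError("beyond pre-declared maximum file size")

    def map_block(self, logical_block: int, disk_block: int) -> None:
        """Record which disk block backs a logical block."""
        self._check(logical_block)
        self.index[logical_block] = disk_block

    def block_of(self, logical_block: int) -> int:
        """Direct access: one array lookup, no chain to follow."""
        self._check(logical_block)
        disk_block = self.index[logical_block]
        if disk_block is None:
            raise KeyError("logical block not allocated")
        return disk_block
```

The fixed size is also the weakness: a file declared with 4 slots cannot grow to a fifth block without a new, larger index.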


Indexed Allocation
• Direct access without external fragmentation, but with the overhead of an index block
  – The pointer overhead of the index block is generally greater than the pointer overhead of linked allocation
  – Consider a common case of a file with only 1 or 2 blocks
    • With linked allocation, lose the space of only one pointer per block
    • With indexed allocation, an entire index block must be allocated, even if only 1 or 2 pointers are non-nil
• How large should the index block be?
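The overhead comparison can be checked with simple arithmetic. The numbers used below (4-byte pointers, 1024-byte blocks) are assumptions picked for the example and do not come from the slides; `pointer_overhead_bytes` is a hypothetical helper that assumes the file fits in a single index block.

```python
from typing import Tuple


def pointer_overhead_bytes(file_blocks: int, pointer_size: int,
                           block_size: int) -> Tuple[int, int]:
    """Return (linked overhead, indexed overhead) in bytes."""
    if file_blocks > block_size // pointer_size:
        raise ValueError("file does not fit in one index block")
    linked = file_blocks * pointer_size  # one pointer stored in each block
    indexed = block_size                 # one whole index block, always
    return linked, indexed
```

Under these assumptions a 2-block file costs 8 bytes of pointers with linked allocation but a full 1024-byte block with indexed allocation. The two schemes only break even when the file fills the index block, here at 256 blocks.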


Multilevel Indexed Files (UNIX 4.1 BSD)
• Multilevel Indexed Files: like multilevel address translation (from UNIX 4.1 BSD)
  – Key idea: efficient for small files, but still allow big files
  – File header contains 13 pointers
    • Fixed-size table, pointers not all equivalent
    • This header is called an inode in UNIX
  – File header format
    • First 10 pointers point to data blocks
    • Pointer 11 points to an indirect block containing 256 block pointers
    • Pointer 12 points to a doubly indirect block containing 256 indirect blocks, for a total of 64K blocks
    • Pointer 13 points to a triply indirect block (16M blocks)
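The layout above implies a simple rule for deciding which pointer covers a given logical block. The sketch below uses only the numbers on the slide (10 direct pointers, 256 pointers per indirect block); the function name `classify` and the constant names are hypothetical. It returns the level and how many index blocks must be read before the data block itself.

```python
from typing import Tuple

DIRECT = 10       # direct pointers in the file header
PER_BLOCK = 256   # pointers held by one indirect block

# Largest file the 13-pointer header can describe, in blocks.
MAX_BLOCKS = DIRECT + PER_BLOCK + PER_BLOCK ** 2 + PER_BLOCK ** 3


def classify(logical_block: int) -> Tuple[str, int]:
    """Return (level, index blocks to read) for a logical block number."""
    if logical_block < 0:
        raise ValueError("negative logical block")
    n = logical_block
    if n < DIRECT:
        return ("direct", 0)
    n -= DIRECT
    if n < PER_BLOCK:
        return ("single indirect", 1)
    n -= PER_BLOCK
    if n < PER_BLOCK ** 2:
        return ("double indirect", 2)
    n -= PER_BLOCK ** 2
    if n < PER_BLOCK ** 3:
        return ("triple indirect", 3)
    raise ValueError("beyond maximum file size")
```

Small files (up to 10 blocks) need no extra reads at all, which is the "efficient for small files" half of the key idea. The triply indirect pointer provides the "still allow big files" half, at the price of three extra reads per access.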


Information in a UNIX Disk-Resident Inode