Lab 9: File system – Of buffers, logs, and blocks Advanced Operating Systems Zubair Nabi [email protected] April 3, 2013


Lab 9: File system – Of buffers, logs, and blocks

Advanced Operating Systems

Zubair Nabi

[email protected]

April 3, 2013


Introduction

The purpose of a file system is to:

1 Organize and store data

2 Support sharing of data among users and applications

3 Ensure persistence of data after a reboot


Challenges

• Need on-disk data structures to:
    • Represent the tree of named directories and files
    • Record the identities of the blocks that hold each file’s content
    • Keep track of the areas of the disk which are free

• The file system needs to support crash recovery
    • A restart must not corrupt the file system or leave it in an inconsistent state

• The file system can be accessed by multiple processes at the same time, and this access needs to be synchronized

• Disk access is orders of magnitude slower than memory access, so the file system must maintain an in-memory cache of popular blocks


xv6 FS layers

Abstraction      Layer
System calls     File descriptors
Pathnames        Recursive lookup
Directories      Directory inodes
Files            Inodes and block allocator
Transactions     Logging
Blocks           Buffer cache


xv6 FS layers (2)

1 Buffer cache: Reads and writes blocks on the IDE disk via the buffer cache, which synchronizes access to disk blocks
    • Ensures that only one kernel process can edit any particular block at a time

2 Logging: Ensures atomicity by enabling higher layers to wrap updates to several blocks in a transaction

3 Inodes and block allocator: Provides unnamed files; each unnamed file is represented by an inode and a sequence of blocks holding the file content


xv6 FS layers (3)

4 Directory inodes: Implements directories as a special kind of inode
    • The content of this inode is a sequence of directory entries, each of which contains a name and a reference to the named file’s inode

5 Recursive lookup: Provides hierarchical path names, such as /foo/bar/baz.txt, via recursive lookup

6 File descriptors: Abstracts many Unix resources, such as pipes, devices, and files, using the file system interface
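The directory-inode layer can be illustrated with a small lookup routine. This is a minimal sketch, assuming a fixed-size entry with a 14-byte name and a 16-bit inode number, where inode number 0 marks a free entry; the function name dirlookup_sketch and the in-memory array of entries are hypothetical stand-ins for reading the directory's content blocks.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DIRSIZ 14            /* assumed maximum name length */

/* One directory entry: a name plus a reference to the file's inode. */
struct dirent {
  uint16_t inum;             /* inode number; 0 marks a free entry (assumption) */
  char name[DIRSIZ];         /* not NUL-terminated when exactly DIRSIZ long */
};

/* Return the inode number recorded for name, or 0 if it is not present. */
uint16_t dirlookup_sketch(const struct dirent *ents, size_t n, const char *name) {
  assert(ents != NULL && name != NULL);
  for (size_t i = 0; i < n; i++) {
    /* Compare at most DIRSIZ bytes so full-length names are handled. */
    if (ents[i].inum != 0 && strncmp(ents[i].name, name, DIRSIZ) == 0)
      return ents[i].inum;
  }
  return 0;
}
```

Recursive lookup of a path such as /foo/bar/baz.txt would then repeat this step once per path component, starting from the root directory.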


File system layout

• xv6 lays out inodes and content blocks on the disk by dividing the disk into several sections

| boot | super | inodes... | bitmap... | data... | log... |
  0      1       2 ...

• Block 0 holds the boot sector

• Block 1 (called the superblock) contains metadata about the file system
    • File system size in blocks, the number of data blocks, the number of inodes, and the number of blocks in the log

• Blocks starting at 2 hold inodes, with multiple inodes per block


• Inode blocks are followed by bitmap blocks, which keep track of the data blocks in use

• Bitmap blocks are followed by data blocks, which hold file and directory contents

• Finally, the blocks at the end of the disk hold the log, which is required by the transaction layer
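The layout translates directly into block arithmetic. The following is a minimal sketch modeled on the layout described here; the 512-byte block size, the 64-byte on-disk inode size, the reservation of ninodes/IPB + 1 inode blocks, and all function names are assumptions made for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define BSIZE 512                  /* assumed block size in bytes */
#define DINODE_SIZE 64             /* assumed size of one on-disk inode */
#define IPB (BSIZE / DINODE_SIZE)  /* inodes per block */
#define BPB (BSIZE * 8)            /* bitmap bits per block */

/* Metadata kept in block 1, as listed on the slide. */
struct superblock {
  uint32_t size;     /* file system size in blocks */
  uint32_t nblocks;  /* number of data blocks */
  uint32_t ninodes;  /* number of inodes */
  uint32_t nlog;     /* number of log blocks */
};

/* Block holding inode number inum: inode blocks start at block 2. */
uint32_t inode_block(uint32_t inum) {
  return inum / IPB + 2;
}

/* First bitmap block, assuming ninodes/IPB + 1 inode blocks are reserved. */
uint32_t bitmap_start(const struct superblock *sb) {
  return sb->ninodes / IPB + 3;
}

/* Bitmap block holding the in-use bit for block b. */
uint32_t bitmap_block(const struct superblock *sb, uint32_t b) {
  return b / BPB + bitmap_start(sb);
}

/* First log block: the log occupies the last nlog blocks of the disk. */
uint32_t log_start(const struct superblock *sb) {
  assert(sb->nlog <= sb->size);
  return sb->size - sb->nlog;
}
```

With 8 inodes per block, inodes 0 to 7 live in block 2 and inode 8 starts block 3.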


Buffer cache layer

• Has two main jobs:
    1 Synchronize access to disk blocks
    2 Cache popular blocks

• Main interface:
    1 bread: Obtains a buffer containing a copy of a block
    2 bwrite: Writes a modified buffer
    3 brelse: Releases a buffer (after a read or write)


Buffer cache layer (2)

• Synchronizes access to each block by allowing only a single kernel thread to have a reference to the block’s buffer

• If one thread is holding a reference to a buffer, other threads will sleep on it

• The buffer cache has a fixed number of buffers to host disk blocks

• If higher layers ask for a block that is not cached, the buffer cache recycles the least recently used buffer for this block


Buffer cache

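• The buffer cache is a doubly-linked list of struct buf, with NBUF buffers, accessed via bcache.head

• A buffer has three state bits
    1 B_VALID: the buffer contains a valid copy of the block
    2 B_DIRTY: the buffer content has been modified and must be written to the disk
    3 B_BUSY: some kernel thread holds a reference to the buffer and has not yet released it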
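The list structure can be sketched as follows. This is an illustration under assumptions: the value of NBUF, the flag bit values, the field layout, and the helper bcount are chosen for the example, and the spin lock that protects the real list is left out.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 10       /* number of buffers; value assumed for illustration */
#define B_BUSY  0x1   /* buffer is held by some kernel thread */
#define B_VALID 0x2   /* buffer contains the block's data */
#define B_DIRTY 0x4   /* buffer content must be written to disk */

struct buf {
  int flags;
  unsigned dev, sector;
  struct buf *prev, *next;   /* links in the recency-ordered list */
  unsigned char data[512];
};

struct {
  struct buf buf[NBUF];
  struct buf head;           /* dummy node; head.next is the most recently used */
} bcache;

/* Link all NBUF buffers into a circular doubly-linked list through head. */
void binit(void) {
  bcache.head.prev = &bcache.head;
  bcache.head.next = &bcache.head;
  for (struct buf *b = bcache.buf; b < bcache.buf + NBUF; b++) {
    b->flags = 0;
    b->next = bcache.head.next;   /* insert b right after head */
    b->prev = &bcache.head;
    bcache.head.next->prev = b;
    bcache.head.next = b;
  }
}

/* Count the buffers by walking forward from the head (hypothetical helper). */
int bcount(void) {
  int n = 0;
  for (struct buf *b = bcache.head.next; b != &bcache.head; b = b->next) {
    assert(b->next->prev == b);   /* the links must stay consistent */
    n++;
  }
  return n;
}
```

The dummy head node means insertion and removal never need a special case for an empty list or for the first and last elements.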


bread

• Makes a call to bget() to get a buffer for the given sector

• If the buffer is not B_VALID, it makes a call to iderw to read it into the buffer cache


Code: bread

    struct buf*
    bread(uint dev, uint sector)
    {
      struct buf *b;

      b = bget(dev, sector);      // buffer for this sector, marked B_BUSY
      if(!(b->flags & B_VALID))
        iderw(b);                 // not cached yet: read the sector from disk
      return b;
    }


bget

• Scans the buffer list for uint dev and uint sector
    1 If such a buffer is present and B_BUSY is not set, it sets B_BUSY and returns the buffer
    2 If B_BUSY is set, it goes to sleep on the buffer
        • Important: After bget wakes up, it cannot assume that the buffer is available now (it might have been reused for a different sector), so it starts all over
    3 If the buffer is not present, it reuses an existing buffer, edits its metadata to record the new uint dev and uint sector, sets B_BUSY, and clears B_VALID and B_DIRTY
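The three cases can be sketched in a single-threaded, user-space form. This is not the kernel code: where the kernel would sleep and restart the scan, this sketch returns NULL, and it has no lock. The names bget_sketch, release_sketch, cache_init, and NODEV are hypothetical, and skipping dirty buffers when recycling is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NBUF 3
#define B_BUSY  0x1
#define B_VALID 0x2
#define B_DIRTY 0x4
#define NODEV ((unsigned)-1)   /* marks a buffer that holds no block yet */

struct buf {
  int flags;
  unsigned dev, sector;
  struct buf *prev, *next;
};

static struct buf bufs[NBUF];
static struct buf head;        /* head.next: most recent, head.prev: least recent */

void cache_init(void) {
  head.prev = head.next = &head;
  for (int i = 0; i < NBUF; i++) {
    struct buf *b = &bufs[i];
    b->flags = 0;
    b->dev = NODEV;
    b->sector = 0;
    b->next = head.next;
    b->prev = &head;
    head.next->prev = b;
    head.next = b;
  }
}

/* Returns the buffer for (dev, sector) with B_BUSY set, or NULL where the
   kernel version would sleep (case 2) or panic (no buffer to recycle). */
struct buf *bget_sketch(unsigned dev, unsigned sector) {
  struct buf *b;
  /* Cases 1 and 2: is the sector already cached? */
  for (b = head.next; b != &head; b = b->next) {
    if (b->dev == dev && b->sector == sector) {
      if (b->flags & B_BUSY)
        return NULL;           /* kernel: sleep on b, then restart the scan */
      b->flags |= B_BUSY;
      return b;
    }
  }
  /* Case 3: not cached, so recycle starting from the least recently used end. */
  for (b = head.prev; b != &head; b = b->prev) {
    if ((b->flags & (B_BUSY | B_DIRTY)) == 0) {
      b->dev = dev;
      b->sector = sector;
      b->flags = B_BUSY;       /* this also clears B_VALID and B_DIRTY */
      return b;
    }
  }
  return NULL;
}

/* Stand-in for brelse: only clears B_BUSY, without reordering the list. */
void release_sketch(struct buf *b) {
  assert(b->flags & B_BUSY);
  b->flags &= ~B_BUSY;
}
```

Because a recycled buffer comes back with B_VALID cleared, the caller (bread) knows it still has to read the sector from the disk.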


bwrite

• Once bread returns a buffer, the caller has exclusive use of it

• If the caller writes to the buffer, it must call bwrite

• bwrite sets B_DIRTY and makes a call to iderw
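The typical calling pattern is read, modify, write, release. The sketch below shows it against an in-memory array standing in for the disk and a one-entry "cache"; every name ending in _sketch, the disk array, onebuf, and set_byte are hypothetical, and the assertion in bread_sketch replaces the sleep that the kernel would perform.

```c
#include <assert.h>
#include <string.h>

#define BSIZE 512
#define NSECTORS 16
#define B_BUSY  0x1
#define B_VALID 0x2
#define B_DIRTY 0x4

struct buf { int flags; unsigned sector; unsigned char data[BSIZE]; };

static unsigned char disk[NSECTORS][BSIZE];  /* stands in for the IDE disk */
static struct buf onebuf;                    /* a one-entry "cache" */

/* Stand-in for iderw: write the buffer if it is dirty, otherwise read it. */
static void iderw_sketch(struct buf *b) {
  if (b->flags & B_DIRTY) {
    memcpy(disk[b->sector], b->data, BSIZE);
    b->flags &= ~B_DIRTY;
  } else {
    memcpy(b->data, disk[b->sector], BSIZE);
  }
  b->flags |= B_VALID;
}

struct buf *bread_sketch(unsigned sector) {
  struct buf *b = &onebuf;
  assert(sector < NSECTORS);
  assert((b->flags & B_BUSY) == 0);  /* the kernel would sleep here instead */
  if (b->sector != sector)
    b->flags = 0;                    /* recycled: content is no longer valid */
  b->sector = sector;
  b->flags |= B_BUSY;
  if (!(b->flags & B_VALID))
    iderw_sketch(b);
  return b;
}

void bwrite_sketch(struct buf *b) {
  assert(b->flags & B_BUSY);         /* the caller must hold the buffer */
  b->flags |= B_DIRTY;
  iderw_sketch(b);
}

void brelse_sketch(struct buf *b) {
  assert(b->flags & B_BUSY);
  b->flags &= ~B_BUSY;
}

/* Typical caller: read a block, modify it, write it back, release it. */
void set_byte(unsigned sector, unsigned off, unsigned char v) {
  struct buf *b = bread_sketch(sector);
  b->data[off] = v;
  bwrite_sketch(b);
  brelse_sketch(b);
}
```

Forgetting the final release would leave the buffer busy, so any later reader of the same sector would block forever.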


Code: bwrite

    void
    bwrite(struct buf *b)
    {
      if((b->flags & B_BUSY) == 0)
        panic("bwrite");          // the caller must hold the buffer
      b->flags |= B_DIRTY;        // tells iderw to write rather than read
      iderw(b);
    }


brelse

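• Moves the buffer from its current position to the front of the buffer cache linked list, clears the B_BUSY bit, and wakes up any processes sleeping on that particular buffer

• Moving the buffer orders the list by how recently the buffers were used

• Why do we need to do this?
    • It makes the scans in bget efficient. Remember, it's a doubly linked list: the lookup for a cached block walks forward from the most recently used buffer, and the search for a buffer to recycle walks backward from the least recently used one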


Code: brelse

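    void
    brelse(struct buf *b)
    {
      if((b->flags & B_BUSY) == 0)
        panic("brelse");

      acquire(&bcache.lock);

      // Unlink b from its current position in the list.
      b->next->prev = b->prev;
      b->prev->next = b->next;
      // Reinsert b right after the head (most recently used).
      b->next = bcache.head.next;
      b->prev = &bcache.head;
      bcache.head.next->prev = b;
      bcache.head.next = b;

      b->flags &= ~B_BUSY;
      wakeup(b);                  // wake any threads sleeping on this buffer

      release(&bcache.lock);
    }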


Logging layer

• xv6 implements file system fault tolerance through a simple logging mechanism

• System calls do not directly write file system data structures

• Instead:
    1 A system call first writes a description of all the disk writes that it wishes to perform to a log on the disk
    2 It then writes a special commit record to the log to specify that it contains a complete operation
    3 Next it copies the required writes to the on-disk file system data structures
    4 Finally, it deletes the log
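The four steps can be sketched with in-memory arrays standing in for the disk. This is a toy model, not the xv6 implementation: the tiny block size, the array names, and all function names are assumptions, and the commit record is modeled as publishing a non-zero block count in a log header.

```c
#include <assert.h>
#include <string.h>

#define BSIZE 16      /* tiny blocks keep the example readable */
#define NBLOCKS 8     /* blocks in the toy file system area */
#define LOGSIZE 4     /* maximum blocks per transaction */

static unsigned char fsdisk[NBLOCKS][BSIZE];    /* on-disk FS structures */
static unsigned char logdata[LOGSIZE][BSIZE];   /* on-disk log data blocks */
static struct { int n; int sector[LOGSIZE]; } loghead;  /* on-disk log header */
static int pending;                             /* in-memory count before commit */
static int pending_sector[LOGSIZE];

/* Step 1: record an intended write in the log instead of the file system. */
void log_write_sketch(int sector, const unsigned char *data) {
  assert(pending < LOGSIZE && sector >= 0 && sector < NBLOCKS);
  memcpy(logdata[pending], data, BSIZE);
  pending_sector[pending++] = sector;
}

/* Step 2: the commit record. Publishing the count marks the log complete. */
void commit_record(void) {
  memcpy(loghead.sector, pending_sector, sizeof pending_sector);
  loghead.n = pending;
}

/* Step 3: copy the logged blocks to their home locations. */
void install(void) {
  for (int i = 0; i < loghead.n; i++)
    memcpy(fsdisk[loghead.sector[i]], logdata[i], BSIZE);
}

/* Step 4: delete the log by zeroing the count. */
void clear_log(void) {
  loghead.n = 0;
  pending = 0;
}

void commit_sketch(void) {
  commit_record();
  install();
  clear_log();
}
```

Until commit_record runs, nothing in fsdisk has changed, which is what makes the group of writes atomic.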


Recovery

• In case of a reboot, the file system performs recovery by looking at the log

• If the log contains the commit record, the recovery code copies the required writes to the on-disk data structures

• If the log does not contain a complete operation, it is ignored and deleted


Correctness of recovery mechanism

• If the crash occurs before the commit record, the log will be ignored, and the state of the disk will stay unmodified

• If the crash occurs after the commit record, then the recovery will replay all of the operation’s writes, even repeating them if the crash occurred during the write to the on-disk data structure

• In both cases, the correctness of the file system is preserved: Either all writes are reflected on the disk or none
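Both crash cases can be checked with a small model of the recovery step. This is a sketch under assumptions: struct disk, its field names, and recover_sketch are hypothetical, and a non-zero count n stands for the presence of the commit record.

```c
#include <assert.h>
#include <string.h>

#define BSIZE 16
#define NBLOCKS 8
#define LOGSIZE 4

/* Toy model of everything that survives a crash. */
struct disk {
  unsigned char fs[NBLOCKS][BSIZE];       /* file system blocks */
  unsigned char logdata[LOGSIZE][BSIZE];  /* logged copies of blocks */
  int n;                                  /* committed block count; 0 = no commit */
  int sector[LOGSIZE];                    /* home sector of each logged block */
};

/* Run at boot: replay a committed log, then delete it. With n == 0 the loop
   does nothing, so an uncommitted log is simply ignored. */
void recover_sketch(struct disk *d) {
  assert(d->n >= 0 && d->n <= LOGSIZE);
  for (int i = 0; i < d->n; i++)
    memcpy(d->fs[d->sector[i]], d->logdata[i], BSIZE);
  d->n = 0;
}
```

Replaying is safe even when some blocks were already copied before the crash, because writing the same logged block twice leaves the same result.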


Log design

• The log resides at a fixed location at the end of the disk

• It consists of a header block and a set of data blocks• The header block contains

1 An array of sector numbers, one for each of the logged data blocks2 Count of logged blocks

• The header block is written to after a commit• The count is set to zero once all logged blocks have been

reflected in the file system• The count will be zero in case of a crash before a commit• The count will be non-zero in case of a crash after a commit

Page 76: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log design

• The log resides at a fixed location at the end of the disk

• It consists of a header block and a set of data blocks• The header block contains

1 An array of sector numbers, one for each of the logged data blocks2 Count of logged blocks

• The header block is written to after a commit

• The count is set to zero once all logged blocks have beenreflected in the file system

• The count will be zero in case of a crash before a commit• The count will be non-zero in case of a crash after a commit

Page 77: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log design

• The log resides at a fixed location at the end of the disk

• It consists of a header block and a set of data blocks• The header block contains

1 An array of sector numbers, one for each of the logged data blocks2 Count of logged blocks

• The header block is written to after a commit• The count is set to zero once all logged blocks have been

reflected in the file system

• The count will be zero in case of a crash before a commit• The count will be non-zero in case of a crash after a commit

Page 78: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log design

• The log resides at a fixed location at the end of the disk

• It consists of a header block and a set of data blocks• The header block contains

1 An array of sector numbers, one for each of the logged data blocks2 Count of logged blocks

• The header block is written to after a commit• The count is set to zero once all logged blocks have been

reflected in the file system• The count will be zero in case of a crash before a commit

• The count will be non-zero in case of a crash after a commit

Page 79: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log design

• The log resides at a fixed location at the end of the disk

• It consists of a header block and a set of data blocks• The header block contains

1 An array of sector numbers, one for each of the logged data blocks2 Count of logged blocks

• The header block is written to after a commit• The count is set to zero once all logged blocks have been

reflected in the file system• The count will be zero in case of a crash before a commit• The count will be non-zero in case of a crash after a commit

Page 80: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log design (2)

• A transaction sequence is indicated by the start and end sequence of writes in the system call

• Only one system call can be in a transaction at any given time to ensure correctness

• The log holds at most one transaction at a time

• Only read system calls can execute concurrently with a transaction

• A fixed amount of space on the disk is dedicated to hold the log

• No system call can write more distinct blocks than the size of the log

• Large writes are broken into multiple smaller writes so that each write can fit in the log
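Breaking a large write into log-sized pieces can be sketched as follows. The limit `MAXBLOCKS` and both function names are assumptions made for illustration; `MAXBLOCKS` is taken to be the number of data blocks one transaction may touch after room has been set aside for metadata blocks.

```c
#include <stddef.h>

#define BSIZE 512    /* block size in bytes */
#define MAXBLOCKS 4  /* assumed: data blocks one transaction may log */

/* Size of the next piece of a write starting at byte offset off, chosen so
 * that the piece touches at most MAXBLOCKS distinct blocks. */
static size_t next_chunk(size_t off, size_t remaining) {
  size_t limit = (size_t)MAXBLOCKS * BSIZE - off % BSIZE;
  return remaining < limit ? remaining : limit;
}

/* Counts how many transactions a write of n bytes at offset off would need. */
static int transactions_needed(size_t off, size_t n) {
  int count = 0;
  while (n > 0) {
    size_t n1 = next_chunk(off, n);
    /* in the kernel: begin_trans(); write n1 bytes at off; commit_trans(); */
    off += n1;
    n -= n1;
    count++;
  }
  return count;
}
```

Subtracting `off % BSIZE` matters because an unaligned write of `MAXBLOCKS * BSIZE` bytes would straddle one extra block.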

Page 87: AOS Lab 9: File system -- Of buffers, logs, and blocks

Code: Typical system call usage of log

begin_trans();
...
bp = bread(...);
bp->data[...] = ...;
log_write(bp);
...
commit_trans();

Page 88: AOS Lab 9: File system -- Of buffers, logs, and blocks

Log functions

• begin_trans: Waits until it obtains exclusive use of the log

• log_write:
  • Appends the block’s new content to the log on the disk
  • Leaves the modified block in the buffer cache so that subsequent reads of the block during the transaction will yield the updated state
  • Records the block’s sector number in memory to find out when a block is written multiple times during a transaction and overwrite the block’s previous copy in the log

• commit_trans:
  1 Writes the log’s header block to disk, updating the count
  2 Calls install_trans to copy each block from the log to the relevant location on the disk
  3 Sets the count in the log header to zero
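The bookkeeping that lets log_write overwrite a block's earlier copy can be sketched in memory. Everything here is illustrative: `struct memlog` and `memlog_write` are hypothetical names, the array `data` stands in for the on-disk log blocks, and `LOGSIZE` is an assumed capacity.

```c
#include <string.h>

#define BSIZE 512   /* block size in bytes */
#define LOGSIZE 10  /* assumed capacity of the log in blocks */

/* Hypothetical in-memory stand-in for the log. */
struct memlog {
  int n;                               /* blocks logged so far in this transaction */
  int sector[LOGSIZE];                 /* sector number recorded for each slot */
  unsigned char data[LOGSIZE][BSIZE];  /* stands in for the on-disk log blocks */
};

/* Logs one block. If the sector was already written during this transaction,
 * its slot is reused; otherwise the block is appended.
 * Returns the slot used, or -1 if the log is full. */
static int memlog_write(struct memlog *lg, int sector, const unsigned char *block) {
  int i;
  for (i = 0; i < lg->n; i++)
    if (lg->sector[i] == sector)
      break;                           /* already logged: overwrite previous copy */
  if (i == lg->n) {
    if (lg->n >= LOGSIZE)
      return -1;                       /* transaction too big for the log */
    lg->sector[i] = sector;
    lg->n++;
  }
  memcpy(lg->data[i], block, BSIZE);
  return i;
}
```

Reusing the slot keeps the number of logged blocks equal to the number of distinct blocks written, which is the quantity bounded by the log size.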

Page 95: AOS Lab 9: File system -- Of buffers, logs, and blocks

Code snippet: filewrite

begin_trans();
ilock(f->ip);
if ((r = writei(f->ip, addr + i, f->off, n1)) > 0)
  f->off += r;
iunlock(f->ip);
commit_trans();

Page 96: AOS Lab 9: File system -- Of buffers, logs, and blocks

Recovery

• In case of a reboot, the file system performs recovery by looking at the log

• If the log contains the commit record, the recovery code copies the required writes to the on-disk data structures

• If the log does not contain a complete operation, it is ignored and deleted

Page 99: AOS Lab 9: File system -- Of buffers, logs, and blocks

Code snippet: recover_from_log

static void
recover_from_log(void)
{
  read_head();
  // if committed, copy from log to disk
  install_trans();
  log.lh.n = 0;
  write_head(); // clear the log
}

Page 100: AOS Lab 9: File system -- Of buffers, logs, and blocks

Code snippet: install_trans

static void install_trans(void) {
  int tail;
  for (tail = 0; tail < log.lh.n; tail++) {
    // read log block
    struct buf *lbuf = bread(log.dev, log.start+tail+1);
    // read dst
    struct buf *dbuf = bread(log.dev, log.lh.sector[tail]);
    // copy block to dst
    memmove(dbuf->data, lbuf->data, BSIZE);
    bwrite(dbuf); // write dst to disk
    brelse(lbuf);
    brelse(dbuf);
  }
}

Page 101: AOS Lab 9: File system -- Of buffers, logs, and blocks

Block allocator

• Maintains a free bitmap on disk; one bit per block

• A zero bit means that the block is free while a one indicates that the block is in use

• The bits for the boot sector, superblock, inode blocks, and bitmap blocks are always set

• Provides two functions to allocate (balloc()) and de-allocate (bfree()) a block
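With one bit per block, finding the bit for a given block number is simple arithmetic. The sketch below assumes a 512-byte block; `struct bitpos` and `locate_bit` are hypothetical names, and the bitmap block index is relative to the first bitmap block rather than an absolute disk address.

```c
#define BSIZE 512        /* block size in bytes */
#define BPB (BSIZE * 8)  /* bitmap bits held by one bitmap block */

/* Hypothetical description of where a block's bit lives. */
struct bitpos {
  int bmap_block;      /* which bitmap block, counted from the first one */
  int byte;            /* byte offset inside that bitmap block */
  unsigned char mask;  /* single-bit mask inside that byte */
};

/* Maps block number b to the position of its bit in the free bitmap. */
static struct bitpos locate_bit(int b) {
  struct bitpos p;
  int bi = b % BPB;    /* bit index within the bitmap block */
  p.bmap_block = b / BPB;
  p.byte = bi / 8;
  p.mask = (unsigned char)(1 << (bi % 8));
  return p;
}
```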

Page 105: AOS Lab 9: File system -- Of buffers, logs, and blocks

balloc

• Calls readsb to read the superblock to get metadata

• Uses this metadata to traverse the entire bitmap and look for a bitmap block in which a bit is zero

• If it finds a free block it updates the bitmap and returns the block
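The scan can be sketched over an in-memory bitmap. This is a simplification, not the kernel routine: `balloc_sketch` is a hypothetical name, the bitmap is passed in directly instead of being read through the superblock and buffer cache, and running out of blocks is reported with -1.

```c
/* Scans the bitmap for the first zero bit, sets it, and returns the block
 * number. Returns -1 when every block is in use.
 * bitmap holds one bit per block; a set bit means "in use". */
static int balloc_sketch(unsigned char *bitmap, int nblocks) {
  for (int b = 0; b < nblocks; b++) {
    int m = 1 << (b % 8);
    if ((bitmap[b / 8] & m) == 0) {  /* is the block free? */
      bitmap[b / 8] |= m;            /* mark the block in use */
      return b;
    }
  }
  return -1;
}
```

For example, with blocks 0 to 2 already marked in use, two successive calls hand out blocks 3 and 4.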

Page 108: AOS Lab 9: File system -- Of buffers, logs, and blocks

bfree

• Finds the corresponding bitmap block

• Clears its bitmap bit
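Freeing is the mirror image of allocation. The sketch below again works on an in-memory bitmap; `bfree_sketch` is a hypothetical name, and returning -1 for a block that is already free is a choice made here for illustration.

```c
/* Clears the bitmap bit of block b.
 * Returns 0 on success, or -1 if the block was already free. */
static int bfree_sketch(unsigned char *bitmap, int b) {
  int m = 1 << (b % 8);
  if ((bitmap[b / 8] & m) == 0)
    return -1;                               /* freeing a free block is a bug */
  bitmap[b / 8] = (unsigned char)(bitmap[b / 8] & ~m);
  return 0;
}
```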

Page 110: AOS Lab 9: File system -- Of buffers, logs, and blocks

Today’s task

• xv6 does not allow concurrent transactions to the log which means that if a system call performs a long write operation, all other write system calls will block

• Come up with a strategy to implement concurrent transactions to the log in terms of pseudo-code

Page 111: AOS Lab 9: File system -- Of buffers, logs, and blocks

Reading(s)

• Chapter 6, “File system”, up to section “Code: directory layer” from “xv6: a simple, Unix-like teaching operating system”