Multiple Device Driver and Flash FTL
Sarah Diesburg
COP 5641

Introduction
- The kernel uses logical remapping layers over storage to hide complexity and add functionality
- Two examples
  - Multiple device drivers
  - Flash Translation Layer (FTL)

The md driver
- Provides virtual devices
  - Created from one or more independent underlying devices
- The basic mechanism to support
  - RAIDs
  - Full-disk encryption (software)
  - LVM
  - Secure deletion (TrueErase)
- File systems are mounted on top of the device mapper virtual device
- The virtual device can
  - Abstract multiple devices
  - Perform encryption
  - Other things
[Figure: user/kernel boundary, with Applications in user space and the File System and DM (device mapper) layers in the kernel.]

Simple Device Mappers
- Linear
  - Maps a linear range of a device
- Delay
  - Delays reads and/or writes and maps them to different devices
- Zero
  - Provides a block device that always returns zeroed data on reads and silently drops writes
  - Similar behavior to /dev/zero, but as a block device instead of a character device
- Flakey
  - Used for testing only; simulates intermittent, catastrophic device failure
- http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper

Loading a device mapper

    #!/bin/sh
    # Create an identity mapping for a device
    echo "0 `blockdev --getsize $1` linear $1 0" \
        | dmsetup create identity

Reading the command piece by piece:
- 0 (first field): logical start sector
- `blockdev --getsize $1`: command to get the number of sectors of a device (like /dev/sda1)
- linear: type of device mapper device we want. Linear is a one-to-one logical-to-physical sector mapping.
- $1 (after linear): linear parameter, the base device (like /dev/sda1)
- 0 (last field): linear parameter, the starting offset within the device
- |: pipes the table line to dmsetup; this acts like the table_file parameter
- dmsetup: manages logical devices that use the device mapper driver. See man dmsetup for more information.
- create: we wish to create a new logical device mapper device
- identity: we name the new device identity
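
The field layout of the table line above can be sketched in C. This is only an illustration of the string format the slide describes; format_linear_table is a hypothetical helper name, not part of dmsetup or the kernel, and the sector count is assumed to have been obtained elsewhere (for example from blockdev).

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Format one table line for a linear target:
 *   <logical start sector> <number of sectors> linear <device> <offset>
 * The logical start sector is fixed at 0, as in the slide's script.
 * Returns the number of characters written, or -1 if buf is too small.
 * (Hypothetical helper for illustration only.)
 */
static int format_linear_table(char *buf, size_t len,
                               unsigned long long num_sectors,
                               const char *device,
                               unsigned long long offset)
{
    assert(buf != NULL && device != NULL);

    int n = snprintf(buf, len, "0 %llu linear %s %llu",
                     num_sectors, device, offset);
    if (n < 0 || (size_t)n >= len)
        return -1;              /* encoding error or truncated output */
    return n;
}
```

Calling it with 2048 sectors, the device /dev/sda1, and offset 0 fills the buffer with the same kind of line that the script pipes into dmsetup create.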

Loading a device mapper
- A file system can then be mounted directly on top of the virtual device

    #!/bin/bash
    mount /dev/mapper/identity /mnt

Unloading a device mapper

    #!/bin/bash
    umount /mnt                 # first unmount the file system
    dmsetup remove identity     # then use dmsetup to remove the device called identity

dm-linear.c
- Documentation: http://lxr.linux.no/#linux+v3.2/Documentation/device-mapper/linear.txt
- Code: http://lxr.linux.no/#linux+v3.2/drivers/md/dm-linear.c

dm-linear.c

    static struct target_type linear_target = {
        .name            = "linear",
        .version         = {1, 1, 0},
        .module          = THIS_MODULE,
        .ctr             = linear_ctr,
        .dtr             = linear_dtr,
        .map             = linear_map,
        .status          = linear_status,
        .ioctl           = linear_ioctl,
        .merge           = linear_merge,
        .iterate_devices = linear_iterate_devices,
    };

linear_map

    static int linear_map(struct dm_target *ti, struct bio *bio,
                          union map_info *map_context)
    {
        struct linear_c *lc = (struct linear_c *) ti->private;

        /* Redirect the bio to the underlying device. */
        bio->bi_bdev = lc->dev->bdev;
        /* Shift the sector by the target's start offset on that device. */
        bio->bi_sector = lc->start + (bio->bi_sector - ti->begin);

        return DM_MAPIO_REMAPPED;
    }

Note: this is a simpler function from an earlier kernel version. Version 3.2 does the same thing, but with a few more helper functions.
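
The sector arithmetic in linear_map can be tried outside the kernel. The sketch below is a user-space model, not kernel code: struct linear_target_model and linear_remap are hypothetical names, and the len field is an assumption added only so the model can check its input range.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t sector_t;

/* User-space stand-in for the fields linear_map() reads (hypothetical). */
struct linear_target_model {
    sector_t begin;  /* first logical sector covered by the target (ti->begin) */
    sector_t len;    /* number of sectors covered (assumed, for range checks) */
    sector_t start;  /* starting offset on the underlying device (lc->start) */
};

/* Same formula as the slide: start + (logical sector - begin). */
static sector_t linear_remap(const struct linear_target_model *t,
                             sector_t logical)
{
    assert(logical >= t->begin && logical - t->begin < t->len);
    return t->start + (logical - t->begin);
}
```

With begin = 0 and start = 0 this is the identity mapping created by the earlier script; a nonzero start simply shifts every sector by that offset on the base device.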

Memory Technology Device
- Different from a character or block device
- Exports a special character device with extra ioctls and operations to access flash storage
- For raw flash devices (not USB sticks)
  - Embedded chips
- http://www.linux-mtd.infradead.org/

NAND Flash Characteristics
- Flash has different constraints than hard drives or character devices
- Exports read, write, and erase operations
- Can only write to a freshly erased location
  - To write again to the same physical location, you must first erase the area
- Reads and writes operate on smaller flash pages
- Erasures are performed on flash blocks
  - Each flash block holds many flash pages
- Each storage location can be erased only 10K-1M times
- Writing is slower than reading
- Erasures can be 10x slower than writing
- Each NAND page has a small, non-addressable out-of-band (OOB) area to hold state and mapping information
  - Accessed by ioctls
- We need a way to avoid wearing out the flash while keeping good performance, using a minimum of writes and erases
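
The erase-before-write rule can be made concrete with a toy model. This is a minimal sketch under assumed geometry (4 pages per block, 8 bytes per page); struct nand_block, nand_erase, and nand_program are hypothetical names and do not correspond to the MTD API.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define PAGES_PER_BLOCK 4   /* assumed toy geometry */
#define NAND_PAGE_SIZE  8   /* assumed toy page size in bytes */

struct nand_block {
    unsigned char data[PAGES_PER_BLOCK][NAND_PAGE_SIZE];
    bool programmed[PAGES_PER_BLOCK];
    unsigned erase_count;   /* wear: how many times this block was erased */
};

/* Erase works on a whole block: every page returns to the erased state. */
static void nand_erase(struct nand_block *b)
{
    memset(b->data, 0xFF, sizeof b->data);
    memset(b->programmed, 0, sizeof b->programmed);
    b->erase_count++;
}

/* Program one page; refuse if it was written since the last erase. */
static bool nand_program(struct nand_block *b, unsigned page,
                         const unsigned char *src)
{
    assert(page < PAGES_PER_BLOCK);
    if (b->programmed[page])
        return false;       /* must erase the whole block first */
    memcpy(b->data[page], src, NAND_PAGE_SIZE);
    b->programmed[page] = true;
    return true;
}
```

A second nand_program call on the same page fails until nand_erase has run, and every erase raises erase_count, which is the quantity a wear-leveling scheme tries to keep balanced.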

Flash Translation Layer
- The solution is to stack a flash translation layer (FTL) on top of the raw flash device
  - Exports a block device
  - Takes care of the flash read, write, and erase operations
  - Spreads writes evenly across all flash locations (wear leveling)
  - Marks old pages as invalid until they can be erased later
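
One simple way to spread wear, shown here only as an assumed policy and not as what any particular FTL does, is to always hand out the free erase block with the fewest erasures. The names struct erase_block_info and pick_least_worn are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct erase_block_info {
    unsigned erase_count;   /* how many times this block has been erased */
    int is_free;            /* nonzero if freshly erased and available */
};

/* Return the index of the free block with the fewest erasures, or -1. */
static int pick_least_worn(const struct erase_block_info *blocks, size_t n)
{
    assert(blocks != NULL || n == 0);

    int best = -1;
    for (size_t i = 0; i < n; i++) {
        if (!blocks[i].is_free)
            continue;       /* block still holds data */
        if (best < 0 || blocks[i].erase_count < blocks[best].erase_count)
            best = (int)i;
    }
    return best;
}
```

Ties go to the lowest index; a real FTL would also have to move rarely changing data off lightly worn blocks, which this sketch does not attempt.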

Data Path
[Figure: storage data path. Labeled layers: Apps; Virtual file system (VFS); File system (Ext3, JFFS2); Multi-device drivers; FTL; Disk driver; MTD driver.]

Flash Translation Layer
- Rotates the usage of pages
- Overwrites go to a new page
[Figure: the OS issues "write random bits to 1" against flash pages 0-6. Before the write, the flash holds a page labeled "data"; afterward it holds both a page labeled "random" and the page labeled "data".]
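
The out-of-place overwrite can be sketched with a small page-mapping table. This is a toy model under stated assumptions (seven physical pages as in the figure, four logical pages, one byte of payload per page, no garbage collection); struct toy_ftl, ftl_init, and ftl_write are hypothetical names.

```c
#include <assert.h>

#define NUM_PHYS_PAGES 7    /* flash pages 0..6, as in the figure */
#define NUM_LOGICAL    4    /* assumed number of logical pages */
#define UNMAPPED      (-1)

enum page_state { PAGE_ERASED, PAGE_VALID, PAGE_INVALID };

struct toy_ftl {
    int map[NUM_LOGICAL];                   /* logical page -> physical page */
    enum page_state state[NUM_PHYS_PAGES];
    char data[NUM_PHYS_PAGES];              /* one byte of payload per page */
    int next_free;                          /* next erased physical page */
};

static void ftl_init(struct toy_ftl *f)
{
    for (int i = 0; i < NUM_LOGICAL; i++)
        f->map[i] = UNMAPPED;
    for (int i = 0; i < NUM_PHYS_PAGES; i++) {
        f->state[i] = PAGE_ERASED;
        f->data[i] = 0;
    }
    f->next_free = 0;
}

/* A write always lands on a fresh page; the old copy is only marked invalid. */
static int ftl_write(struct toy_ftl *f, int logical, char value)
{
    assert(logical >= 0 && logical < NUM_LOGICAL);
    if (f->next_free >= NUM_PHYS_PAGES)
        return -1;                          /* no erased page left */

    int phys = f->next_free++;
    if (f->map[logical] != UNMAPPED)
        f->state[f->map[logical]] = PAGE_INVALID;
    f->data[phys] = value;
    f->state[phys] = PAGE_VALID;
    f->map[logical] = phys;
    return phys;                            /* physical page that was used */
}
```

Writing logical page 1 twice uses two different physical pages, and the first page still holds its old byte after the overwrite, which is why overwriting alone does not securely delete data on flash.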

FTL Example
- INFTL: Inverse NAND Flash Translation Layer
- Open-source FTL in the Linux kernel for DiskOnChip flash
- Somewhat outdated

INFTL
- Broken into two files
  - inftlmount.c: load/unload functions
  - inftlcore.c: flash and wear-leveling operations
- http://lxr.linux.no/linux+*/drivers/mtd/inftlmount.c
- http://lxr.linux.no/linux+*/drivers/mtd/inftlcore.c
- Stack-based algorithm to provide the illusion of updates
- Each stack (or chain) corresponds to a virtual address with sequentially addressed pages

INFTL Chaining
[Figure: chains of erase blocks, panels (a), (b), and (c).]
- Chains can grow to any length
- Once there are no more freshly erased erase blocks, some old ones must be garbage-collected
  - The chain is folded so that all valid data is copied into the top erase block
  - Lower erase blocks in the chain are erased and put back into the pool
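
Folding can be modeled in a few lines. The sketch below is a simplified user-space model and is not taken from inftlcore.c: the chain is a small array whose last element is the top (newest) unit, each unit has six one-byte pages, and struct unit, struct chain, and fold_chain are hypothetical names.

```c
#include <assert.h>

#define PAGES_PER_UNIT 6
#define MAX_CHAIN      8

enum slot_state { SLOT_FREE, SLOT_USED, SLOT_DELETED };

struct unit {
    enum slot_state state[PAGES_PER_UNIT];
    char data[PAGES_PER_UNIT];
};

/* units[length - 1] is the top (newest) unit; units[0] is the oldest. */
struct chain {
    struct unit units[MAX_CHAIN];
    int length;
};

/*
 * Fold the chain: for every page offset the top unit has not written,
 * copy the newest valid copy from the lower units into the top unit.
 * Returns how many lower units can be erased and returned to the pool.
 */
static int fold_chain(struct chain *c)
{
    assert(c->length >= 1 && c->length <= MAX_CHAIN);
    struct unit *top = &c->units[c->length - 1];

    for (int ofs = 0; ofs < PAGES_PER_UNIT; ofs++) {
        if (top->state[ofs] != SLOT_FREE)
            continue;               /* top already holds the newest state */
        /* Walk down the chain; the first non-free slot is the newest copy. */
        for (int u = c->length - 2; u >= 0; u--) {
            const struct unit *old = &c->units[u];
            if (old->state[ofs] == SLOT_FREE)
                continue;
            if (old->state[ofs] == SLOT_USED) {
                top->data[ofs] = old->data[ofs];
                top->state[ofs] = SLOT_USED;
            }
            break;                  /* a deleted slot means no valid data */
        }
    }

    int freed = c->length - 1;      /* lower units would now be erased */
    c->units[0] = *top;             /* the folded unit becomes the whole chain */
    c->length = 1;
    return freed;
}
```

After the call the chain has length one, and the return value is the number of erase blocks that go back into the pool of freshly erased blocks.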

inftlcore.c

    static struct mtd_blktrans_ops inftl_tr = {
        .name       = "inftl",
        .major      = INFTL_MAJOR,
        .part_bits  = INFTL_PARTN_BITS,
        .blksize    = 512,
        .getgeo     = inftl_getgeo,
        .readsect   = inftl_readblock,
        .writesect  = inftl_writeblock,
        .add_mtd    = inftl_add_mtd,
        .remove_dev = inftl_remove_dev,
        .owner      = THIS_MODULE,
    };

inftl_writeblock

    static int inftl_writeblock(struct mtd_blktrans_dev *mbd,
                                unsigned long block, char *buffer)
    {
        struct INFTLrecord *inftl = (void *)mbd;
        unsigned int writeEUN;
        /* Byte offset of this sector within its erase unit. */
        unsigned long blockofs = (block * SECTORSIZE) & (inftl->EraseSize - 1);
        size_t retlen;
        struct inftl_oob oob;
        char *p, *pend;

        /* Is block all zero? */
        pend = buffer + SECTORSIZE;
        for (p = buffer; p < pend && !*p; p++)
            ;

        if (p < pend) {
            /* Non-zero data: find an erase unit with a free slot for it. */
            writeEUN = INFTL_findwriteunit(inftl, block);

            if (writeEUN == BLOCK_NIL) {
                printk(KERN_WARNING "inftl_writeblock():cannot find"
                       "block to write to\n");
                /*
                 * If we _still_ haven't got a block to use, we're screwed.
                 */
                return 1;
            }

            /* Mark the sector as used in the OOB area and write the data. */
            memset(&oob, 0xff, sizeof(struct inftl_oob));
            oob.b.Status = oob.b.Status1 = SECTOR_USED;

            inftl_write(inftl->mbd.mtd,
                        (writeEUN * inftl->EraseSize) + blockofs,
                        SECTORSIZE, &retlen, (char *)buffer, (char *)&oob);
        } else {
            /* An all-zero sector is recorded as a deletion, not written. */
            INFTL_deleteblock(inftl, block);
        }

        return 0;
    }
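
Two small pieces of inftl_writeblock can be exercised in user space: the all-zero scan and the offset mask. This is a sketch, not the kernel code; sector_is_all_zero and block_offset are hypothetical helper names, and the power-of-two check is an assumption the mask depends on.

```c
#include <assert.h>
#include <stdbool.h>

#define SECTORSIZE 512

/* True if every byte of the sector is zero (same scan loop as the slide). */
static bool sector_is_all_zero(const char *buffer)
{
    const char *p, *pend = buffer + SECTORSIZE;

    for (p = buffer; p < pend && !*p; p++)
        ;
    return p == pend;
}

/*
 * Byte offset of a sector inside its erase unit. Masking with
 * (erase_size - 1) equals a modulo only when erase_size is a power of two,
 * so this sketch asserts that.
 */
static unsigned long block_offset(unsigned long block, unsigned long erase_size)
{
    assert(erase_size != 0 && (erase_size & (erase_size - 1)) == 0);
    return (block * SECTORSIZE) & (erase_size - 1);
}
```

With an assumed 8192-byte erase unit (16 sectors), sector 17 lands 512 bytes into its unit, since 17 * 512 = 8704 and 8704 - 8192 = 512.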

Notes (dm-linear.c):
- .ctr constructs the linear mapping
- .dtr destructs the mapping
- .map remaps the bio to the new offset

Notes (INFTL Chaining): Consider a sample flash device with six pages per erase block. Suppose we write the letter A to sector 7. Counting sectors and virtual erase blocks from zero, sector 7 is the second sector of virtual erase block 1 (7 / 6 = 1, remainder 1), so A is written there (see (a)). Next, suppose we write the letter B to sector 7. Since the second sector of virtual erase block 1 has already been written, we must write to the second sector of a new, erased erase block, make the new block the primary block, and chain the old erase block underneath the primary block. The second sector of the old erase block is also marked as deleted in the OOB data, which the figure shows as an X over the sector (see (b)). Now suppose we write C to sector 9. This is the fourth sector of virtual erase block 1 (9 / 6 = 1, remainder 3), so we write C to the fourth sector of the primary block of virtual erase block 1 (see (c)).
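
The A, B, C walk-through can be replayed with a toy model of one chain. This is a simplified sketch that follows the notes rather than the real INFTL_findwriteunit logic; struct vblock_chain, virtual_block, block_slot, and chain_write are hypothetical names, and the last array element stands for the primary block.

```c
#include <assert.h>

#define PAGES_PER_BLOCK 6   /* six pages per erase block, as in the notes */
#define MAX_CHAIN       4   /* assumed limit before folding is required */

enum slot_state { SLOT_FREE, SLOT_USED, SLOT_DELETED };

struct unit {
    enum slot_state state[PAGES_PER_BLOCK];
    char data[PAGES_PER_BLOCK];
};

/* One chain per virtual erase block; units[length - 1] is the primary block. */
struct vblock_chain {
    struct unit units[MAX_CHAIN];
    int length;
};

static unsigned virtual_block(unsigned sector) { return sector / PAGES_PER_BLOCK; }
static unsigned block_slot(unsigned sector)    { return sector % PAGES_PER_BLOCK; }

/* Write one sector's payload into the chain; returns 0, or -1 if full. */
static int chain_write(struct vblock_chain *c, unsigned slot, char value)
{
    assert(slot < PAGES_PER_BLOCK);

    if (c->length == 0 || c->units[c->length - 1].state[slot] != SLOT_FREE) {
        if (c->length == MAX_CHAIN)
            return -1;                      /* chain full: folding needed */
        /* The old copy stays in place but is marked deleted. */
        if (c->length > 0 && c->units[c->length - 1].state[slot] == SLOT_USED)
            c->units[c->length - 1].state[slot] = SLOT_DELETED;
        /* Chain a freshly erased block on top as the new primary block. */
        struct unit *fresh = &c->units[c->length++];
        for (int i = 0; i < PAGES_PER_BLOCK; i++)
            fresh->state[i] = SLOT_FREE;
    }

    struct unit *primary = &c->units[c->length - 1];
    primary->data[slot] = value;
    primary->state[slot] = SLOT_USED;
    return 0;
}
```

Writing A and then B to sector 7 grows the chain to two blocks with the old slot marked deleted, while writing C to sector 9 reuses the free fourth slot of the primary block and leaves the chain length unchanged.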