Flash File Systems
Christian Egger | June 2010 | Distributed Systems
Flash File Systems | Rainbow-OS Architecture Seminar | June 2010
Intro - Flash
- Developed by Toshiba in 1985
- Replacement for EEPROMs
  - Read-only: BIOS, firmware, ...
  - Read-write: embedded devices (routers, controllers, ...)
- Moore's Law: faster, cheaper, bigger memories
- New application areas
  - Integration into microcontrollers
  - USB sticks (8 MB ... 2 GB ... 128 GB ...)
  - Memory cards: SD, MMC, xD, CF, Memory Stick
  - Embedded storage: MP3 players, mobile phones
  - As hard disk replacement: solid-state disks (SSDs)
- 2 types: NOR and NAND flash
Flash Technology - NOR
- Expensive
- Low capacity
- Good reliability
- Byte/word-wise access (via address/data lines)
- Direct CPU connection
- Fast random access
- Bitwise programmable
- Use: mostly program memory / embedded
- Very low erase and write performance
Flash Technology - NAND
- Command-driven interface
- SLC NAND flash (single-level cell)
  - 2 states, 1 bit per cell
  - Robust: 100K to 1M erase cycles
  - More reliable than MLC
  - Lower energy consumption, faster than MLC
  - More expensive than MLC
- MLC NAND flash (multi-level cell)
  - 4 states, 2 bits per cell
  - 10K to 100K erase cycles
  - Bad blocks allowed on delivery (like dead pixels on LCDs)
  - More storage per silicon area
  - Stricter constraints compared to SLC and NOR
Flash storage: Differences from hard disks
- Basic commands
  - read
  - erase
  - write
- Units
  - Page: 2 KiB
  - Block: 128 KiB
- Granularities
  - read/write: byte/word (NOR)
  - read/write: page (NAND)
  - erase: block
- Limited number of erase cycles
  - NOR: 100k to 1M
  - SLC NAND: 100k+
  - MLC NAND: 10k to 100k
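The three commands and their granularities can be sketched as a toy model. The class name `NandBlock` and the constant names are made up for illustration. The rule that programming can only flip bits from 1 to 0 is general flash background and is not stated on the slide.

```python
PAGE_SIZE = 2048          # 2 KiB page, as on the slide
PAGES_PER_BLOCK = 64      # 64 * 2 KiB = 128 KiB block


class NandBlock:
    """Toy model of one erase block (hypothetical, for illustration only)."""

    def __init__(self):
        # an erased page reads back as all ones
        self.pages = [bytearray(b"\xff" * PAGE_SIZE) for _ in range(PAGES_PER_BLOCK)]
        self.erase_count = 0

    def read(self, page):
        return bytes(self.pages[page])

    def program(self, page, data):
        """Write one whole page; programming can only flip bits from 1 to 0."""
        if len(data) != PAGE_SIZE:
            raise ValueError("must program a whole page")
        old = self.pages[page]
        if any((o & d) != d for o, d in zip(old, data)):
            raise ValueError("page needs an erase first (would need a 0 -> 1 flip)")
        self.pages[page] = bytearray(data)

    def erase(self):
        """Erase works on the whole block only and costs one erase cycle."""
        for p in self.pages:
            p[:] = b"\xff" * PAGE_SIZE
        self.erase_count += 1
```

Programming a page twice fails in this model unless the second write only clears further bits. Getting the page back to all ones requires erasing the full 128 KiB block, which is what consumes the limited erase cycles.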
Other quirks
- Writes
  - In-place: problematic
    - needs "read-modify-erase-write"
    - expensive (time, complexity)
    - not atomic, unsafe
    - high wear
  - Out-of-place
    - needs only "write" (assumption: pre-erased pages)
    - atomicity
- Overwrites
  - multiple writes to the same "page"
  - only on some flash types (NOR, SLC)
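A minimal sketch of the two write styles, assuming a block is modelled as a plain Python list of pages and `None` stands for an erased page. The function names and the returned cost dictionaries are illustrative assumptions.

```python
ERASED = None   # marker for an erased page in this toy model


def update_in_place(block, page_no, new_data):
    """Read-modify-erase-write: rewrite the whole block to change one page."""
    ram_copy = list(block)                 # read: whole block into RAM
    ram_copy[page_no] = new_data           # modify
    block[:] = [ERASED] * len(block)       # erase: a crash here loses every page
    block[:] = ram_copy                    # write: all used pages programmed again
    used = sum(p is not ERASED for p in ram_copy)
    return {"erases": 1, "page_writes": used}


def update_out_of_place(block, page_no, new_data):
    """Write the new version to the next erased page; the old page goes stale."""
    target = block.index(ERASED)           # assumes a pre-erased page exists
    block[target] = new_data
    return {"erases": 0, "page_writes": 1, "stale_page": page_no, "new_page": target}
```

The in-place variant pays one erase cycle plus a rewrite of every used page for a single-page change. The out-of-place variant programs one page and leaves a stale copy behind, which has to be cleaned up later.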
More quirks
- NAND: spare areas
  - extra storage
  - with/without overwriting
  - Applications
    - ECC
    - bad block flags
    - deletion marker
- NAND: writes are strictly linear
  - only within the same block
  - consequence of other optimizations (price)
Methods of using flash
- Flash Translation Layer (FTL)
  - flash as a block device
  - Handles
    - wear leveling
    - bad-block handling
    - error correction
    - mapping of different page sizes
- Flash file systems
  - since 1990+, FFS2 by Microsoft
  - Advantages over FTL
    - directly usable, no extra logic
    - more efficient
    - special applications possible (XIP)
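The core of an FTL is a mapping from logical block-device pages to physical flash pages. The following is a minimal sketch under simplifying assumptions (page-level mapping, no garbage collection, no bad blocks); `TinyFTL` and its attributes are hypothetical names, not taken from any real FTL.

```python
class TinyFTL:
    """Page-mapping FTL sketch: logical page number -> physical page number."""

    def __init__(self, physical_pages):
        self.flash = [None] * physical_pages   # None = erased page
        self.mapping = {}                      # logical -> physical
        self.obsolete = set()                  # physical pages holding stale data
        self.next_free = 0

    def write(self, logical, data):
        if self.next_free >= len(self.flash):
            raise RuntimeError("no erased page left; garbage collection needed")
        physical = self.next_free              # always write out-of-place
        self.next_free += 1
        self.flash[physical] = data
        if logical in self.mapping:
            self.obsolete.add(self.mapping[logical])
        self.mapping[logical] = physical

    def read(self, logical):
        return self.flash[self.mapping[logical]]
```

To the layer above, logical page 0 behaves like an overwritable disk sector, while on the flash every write lands on a fresh page.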
Concepts
- Nodes
- Log structure
- Garbage collection
- Wandering trees
- Write-back
- Mount scanning
- Checkpointing / snapshotting
- Compression
- Error correction
- Execute-in-place
Nodes
- Contiguous structure
  - Metadata
  - (not necessarily also) data
- Fewer write operations
Log Structure
- Out-of-place
- Forms
  - Ring buffer
  - Log structure within single blocks
  - Partitioning into areas
- Best performance: blocks pre-erased
Figure: Log with some operations (CREATE, APPEND, WRITE to OFFSET, TRUNCATE, DELETE, CREATE).
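How such a log determines the file system state can be sketched by replaying it from front to back. The record layout (tuples of operation, file name, arguments) and the zero padding for writes past the end of a file are assumptions made for this example.

```python
def replay(log):
    """Rebuild the current file contents by replaying the log front to back."""
    files = {}
    for op, name, *args in log:
        if op == "CREATE":
            files[name] = bytearray()
        elif op == "APPEND":
            files[name] += args[0]
        elif op == "WRITE":                      # WRITE to OFFSET
            offset, data = args
            buf = files[name]
            if len(buf) < offset:
                buf.extend(b"\0" * (offset - len(buf)))   # pad the gap (assumption)
            buf[offset:offset + len(data)] = data
        elif op == "TRUNCATE":
            del files[name][args[0]:]
        elif op == "DELETE":
            del files[name]
        else:
            raise ValueError(f"unknown log record {op!r}")
    return {name: bytes(buf) for name, buf in files.items()}


log = [
    ("CREATE", "a"),
    ("APPEND", "a", b"hello world"),
    ("WRITE", "a", 0, b"J"),
    ("TRUNCATE", "a", 5),
    ("DELETE", "a"),
    ("CREATE", "b"),
]
```

After the DELETE record, every earlier record for file "a" is garbage: it still occupies flash but no longer contributes to the state.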
Garbage Collection
- Log structure: garbage accumulates, individual blocks fragment
- Block status
  - empty / erased
  - full / all data valid (obsolete)
  - partially full / some invalid data
  - erasable / all data invalid
- Solution: GC!
  - make a redundant copy of the still-valid data
  - mark the old full blocks obsolete
  - reclaim free space
- Strategies
  - strict (like a ring buffer)
  - heuristics
Figure: GC with different block statuses (labels: fragmented, copy, erase).
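One GC step (copy, then erase) can be sketched as follows. The greedy victim choice is just one possible heuristic and is an assumption of this example, as is the data layout (a dict of block id to a list of `(data, is_valid)` pages).

```python
def collect(blocks, free_block):
    """Greedy GC step: clean the block with the most obsolete pages.

    blocks: dict block_id -> list of (data, is_valid) pages.
    free_block: id of an erased block (empty list) that receives the copies.
    Returns the id of the block that was erased.
    """
    candidates = {b: pages for b, pages in blocks.items() if b != free_block and pages}
    victim = max(candidates, key=lambda b: sum(not valid for _, valid in candidates[b]))
    # 1. copy: move the still-valid pages to the erased block
    blocks[free_block].extend(page for page in blocks[victim] if page[1])
    # 2. erase: the victim now holds only redundant or obsolete data
    blocks[victim] = []
    return victim
```

A strict ring-buffer strategy would instead always pick the oldest block, regardless of how much valid data it still holds.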
Wandering Trees
- Directory index
  - like the ext2 tree
  - but: out-of-place updates, floating structures
- Differences
  - Index still points to obsolete data (COW)
  - Update index recursively
  - Order: leaf .. root node (atomicity!)
  - Root node has a new place
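The recursive update can be sketched with an append-only list standing in for flash. The node layout (dicts with `children` or `data`) and all function names are assumptions for illustration.

```python
flash = []            # append-only "flash": index in the list = address


def write_node(node):
    flash.append(node)          # out-of-place: existing entries are never modified
    return len(flash) - 1


def update(root_addr, path, data):
    """Write new data at `path` and return the address of the new root."""
    if not path:
        return write_node({"data": data})
    old_children = flash[root_addr]["children"] if root_addr is not None else {}
    children = dict(old_children)             # copy-on-write of the index node
    key = path[0]
    # the recursion writes the leaf first, then each parent on the way back up
    children[key] = update(children.get(key), path[1:], data)
    return write_node({"children": children})


def lookup(root_addr, path):
    node = flash[root_addr]
    for key in path:
        node = flash[node["children"][key]]
    return node["data"]
```

Because the new root is written last, the old root remains a complete, consistent tree until that final write succeeds. The price is that every update rewrites one node per tree level, and the root "wanders" to a new address each time.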
Write-Back Strategy
- Caching of dirty pages
- Write bulks of data
- Pros/cons
  - Fewer writes
  - Agglomeration of data
  - Not safe
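A minimal write-back cache sketch, assuming a dict stands in for the flash device. `WriteBackCache` and its counters are hypothetical names.

```python
class WriteBackCache:
    """Collect dirty pages in RAM and write them to flash in one batch.

    Anything written but not yet flushed is lost on power failure.
    """

    def __init__(self, backend):
        self.backend = backend     # dict page_no -> data, stands in for flash
        self.dirty = {}
        self.flash_writes = 0

    def write(self, page_no, data):
        self.dirty[page_no] = data       # repeated writes to a page coalesce in RAM

    def read(self, page_no):
        return self.dirty.get(page_no, self.backend.get(page_no))

    def flush(self):
        for page_no, data in sorted(self.dirty.items()):
            self.backend[page_no] = data
            self.flash_writes += 1
        self.dirty.clear()
```

Two writes to the same page cost only one flash write after the flush, which illustrates both the benefit (fewer writes) and the risk (data sits only in RAM until flushed).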
Mount Scanning
- Index not on flash
  - construct on startup
  - full device scan needed
  - complexity: O(n) (start time + RAM vs. device size)
- Index on flash
  - locatable in O(1): root node
  - complexity: O(1) possible (RAM + startup time)
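The O(n) case can be sketched as a scan over all nodes on the device. The node layout `(inode, version, data)` with a per-inode version counter is an assumption for this example, chosen so that the newest node for each inode can be recognized.

```python
def mount_scan(flash_nodes):
    """Build the in-RAM index by visiting every node once: O(n) in device size.

    flash_nodes: list of (inode, version, data) tuples in on-flash order.
    Returns a dict inode -> address of the newest node for that inode.
    """
    newest = {}
    for address, (inode, version, _data) in enumerate(flash_nodes):
        current = newest.get(inode)
        if current is None or version > current[0]:
            newest[inode] = (version, address)
    return {inode: address for inode, (_version, address) in newest.items()}
```

Both the scan time and the size of the resulting dictionary grow with the amount of data on the device, which is the scalability problem the slide refers to.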
Checkpointing / Snapshots
- Checkpointing
  - for file systems without an index on flash
  - memory dump of the index saved to flash
  - low mount-scan complexity
  - fast startup
  - validity: as long as the state does not change
Compression
- Slow writes
- Compression: write less data, so faster?
- Algorithms
  - deflate/zlib (default)
  - LZO
  - LZMA
  - bzip2
- Application
  - compress data
  - compress metadata
- Most often only data
- Problem: calculation of free space?
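The free-space problem can be illustrated with Python's standard `zlib` module: how much flash a write consumes depends on how well the data compresses. The helper name `stored_size` and the sample inputs are assumptions for this sketch.

```python
import os
import zlib


def stored_size(data):
    """Bytes that would be written if `data` is stored deflate-compressed."""
    return len(zlib.compress(data))


text = b"flash " * 1000        # 6000 bytes, highly compressible
noise = os.urandom(6000)       # 6000 bytes, practically incompressible

print(len(text), "->", stored_size(text))
print(len(noise), "->", stored_size(noise))
```

Both inputs have the same logical size, yet they occupy very different amounts of flash. A file system that reports free space therefore cannot say in advance how many more bytes of user data will fit.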
Error Detection & Correction
- NAND: focus on low price
- Defective blocks/pages
  - delivery with bad blocks allowed
  - emerge during use
  - mark: flag in spare area
- Bit flips in neighbouring cells
- Software has to deal with that
  - CRCs
  - ECC (detect 2-bit, correct 1-bit errors)
  - FS data structure has to allow bad blocks everywhere
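The "correct 1-bit, detect 2-bit" property can be demonstrated with an extended Hamming code. The sketch below protects only 4 data bits with an 8-bit codeword to stay readable; ECC for real NAND pages works on much larger chunks, so treat this purely as an illustration of the principle. All names are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits into an 8-bit extended Hamming (SECDED) codeword."""
    code = [0] * 8
    for pos, bit in zip((3, 5, 6, 7), nibble):
        code[pos] = bit
    for p in (1, 2, 4):
        # parity bit p covers every position whose index has bit p set
        code[p] = sum(code[i] for i in range(1, 8) if i & p and i != p) % 2
    code[0] = sum(code[1:]) % 2      # overall parity over the 7 Hamming bits
    return code


def decode(code):
    """Return (status, nibble); status is 'ok', 'corrected' or 'uncorrectable'."""
    code = list(code)
    syndrome = 0
    for i in range(1, 8):
        if code[i]:
            syndrome ^= i            # XOR of the positions of all set bits
    overall_ok = sum(code) % 2 == 0
    if syndrome == 0 and overall_ok:
        status = "ok"
    elif not overall_ok:
        # odd number of flips: assume one; the syndrome names its position
        # (syndrome 0 means the overall parity bit itself flipped)
        code[syndrome] ^= 1
        status = "corrected"
    else:
        # nonzero syndrome with even overall parity: two bits flipped
        return "uncorrectable", None
    return status, [code[i] for i in (3, 5, 6, 7)]
```

Any single flipped bit in the codeword is repaired, and any two flipped bits are reported as uncorrectable instead of being silently mis-corrected. Three or more flips can go unnoticed, which is one reason CRCs are used in addition.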
Execute-in-Place
- No fetch into RAM
- Executable text area mapped directly into address space
- Only NOR
- Very invasive (FS code interacts with paging code)
- Used for "embedded" areas
- e.g. Linux phones (Maemo, FIC)
- 2 implementations in Linux
  - AXFS
  - CramFS + XIP patch
Flash File Systems in Linux
Year  Name    In kernel?
1999  JFFS    discontinued
2001  JFFS2   Linux 2.4.10+
2002  YAFFS   patch only
2005  YAFFS2  patch only
2007  LogFS   Linux 2.6.34+
2008  UBIFS   Linux 2.6.27+
JFFS
- Axis Communications AB
- First implementation
- Structure: nodes + strict log
- No compression
- Kernel 2.0 / 2.2
- No hard links
- Mount scan
- Index in RAM
JFFS2
- Redesign of JFFS by Red Hat
- Designed for NOR
- Improvements
  - Compression (zlib, rubin, rtime)
  - Relaxed log-structure approach
  - Hard link support
- Most widely used flash FS
- Problem: scalability
  - RAM: O(n) in the number of objects in JFFS2
  - Startup time: O(n) in the device size
YAFFS
- Designed for NAND
- Very portable; Linux: patch
- No index on flash: RAM and startup in O(n)
- But: checkpointing, fast startup
- No compression
- YAFFS1
  - 512 B page size NAND
  - Spare areas: deletion marker
  - Simple mount scan
- YAFFS2
  - 2 KiB page size NAND
  - Spare areas only for marking bad blocks
  - No overwriting
LogFS
- Block and MTD mode
- Requirement: scalability
  - RAM usage and startup in O(1)
- Index on flash: wandering tree
- 2 anchor areas: pointers to floating structures
- Block levels: blocks only used for nodes of the same level
  - root node blocks
  - ...
  - level n blocks
  - data blocks
UBIFS & UBI
- UBI layer
  - "unsorted block images"
  - LEBs / PEBs (logical / physical erase blocks)
  - wear leveling
  - error correction
  - scrubbing
  - start in O(n)
- UBIFS
  - very much like LogFS (except for UBI)
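The LEB/PEB indirection with wear leveling can be sketched as a small mapping table. This is not the UBI API; `TinyUBI`, its methods, and the "least-worn free block first" policy are simplifying assumptions for illustration.

```python
class TinyUBI:
    """LEB -> PEB mapping that always hands out the least-worn free PEB."""

    def __init__(self, peb_count):
        self.erase_count = [0] * peb_count
        self.free = set(range(peb_count))
        self.leb_to_peb = {}

    def map_leb(self, leb):
        # pick the free PEB with the fewest erases (ties: lowest number)
        peb = min(self.free, key=lambda p: (self.erase_count[p], p))
        self.free.remove(peb)
        self.leb_to_peb[leb] = peb
        return peb

    def unmap_leb(self, leb):
        peb = self.leb_to_peb.pop(leb)
        self.erase_count[peb] += 1      # erasing is what wears the block
        self.free.add(peb)
```

A file system on top only sees stable LEB numbers. Rewriting the same LEB over and over spreads the erases across all PEBs instead of wearing out one physical block.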
Read-Only File Systems in Linux
Year  Name      In kernel?
1997  RomFS     2.2+
1999  CramFS    2.4+
2002  SquashFS  2.6.29+
2006  AXFS      patch
CRAMFS
- Compression support
- Terse metadata
- Very mature
- Disadvantages
  - truncated IDs: 16-bit uid, 8-bit gid
  - no timestamps
  - 16 MiB file size limit
  - device size limit: < 256 MB (+16 MB)
- XIP support (MontaVista patch)
SquashFS
- Variable (compression) block size up to 1 MiB
- Result: good compression
  - zlib (default)
  - LZMA
  - bzip2
  - LZO
- Applications
  - Embedded
  - Live CDs (+UnionFS)
AXFS
- "Advanced eXecute-in-place File System"
- Only for NOR
- No MTD layer
- Not mainline, very invasive (messes with non-VFS code)
- Pages either XIP or compressed
  - runtime profiling support (XIP xor compression)
  - profile fed to mkfs.axfs
Linux: SSDs and ATA Trim()
- Currently supported
  - Btrfs, VFAT, ext4, GFS2, NILFS
- Btrfs
  - special block allocator modes
  - mode "ssd"
  - mode "ssd_spread"
- SSD mode off by default
  - buggy SSD FTLs
Windows
- ATA Trim()
  - Windows 7 only
  - FAT
  - NTFS
- exFAT
  - chosen as the future standard file system for SDXC
  - no 4 GiB file size limit
  - no 32 GiB / 2 TiB device size limit
  - patent-encumbered
  - proprietary
  - not (really) supported on Linux yet