Digital UNIX Internals II File Layer and Virtual File System 3 - 1 File Layer and Virtual File System Chapter Three

File Layer and Virtual File System


Page 1: File Layer and Virtual File System

Digital UNIX Internals II

File Layer and Virtual File System 3 - 1

File Layer and Virtual File System

Chapter Three

Page 2: File Layer and Virtual File System

Topics

  • File System Abstractions
  • File System Layers
  • The File Layer
  • The Virtual File System
  • Selected File Related Calls

Page 3: File Layer and Virtual File System

UNIX File Abstraction

  • The File
      • Stream of bytes; any record structure is imposed by the application
      • Sequential or random access
  • The Directory Structure
      • Tree-like directory hierarchy
      • File sharing
          • hard links: multiple names for the same disk file
          • soft (symbolic) links: stored path shortcut
  • Access control associated with the file

Page 4: File Layer and Virtual File System

File Related System Calls

  • open(), close()
  • creat(), unlink()
  • read(), write()
  • lseek()
  • getattr(), setattr()
  • mmap()
  • ioctl()
  • fsync()
  • dup(), dup2()

Page 5: File Layer and Virtual File System

File Descriptor

  • Application's name for an open file
      • small integer returned by open()
  • The first three file descriptors are
      • 0 -- standard input
      • 1 -- standard output
      • 2 -- standard error
  • These are usually associated with a terminal
  • Each has an associated offset or file position pointer
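The "small integer" behaviour can be observed from user space. The following is a minimal sketch, assuming a POSIX system where /dev/null exists; the helper name reopen_reuses_lowest_fd is hypothetical and not part of the course material. It relies on the POSIX rule that open() returns the lowest unused descriptor.

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if open() hands back the lowest unused
 * descriptor, 0 if it does not, and -1 if /dev/null cannot be opened.
 */
int reopen_reuses_lowest_fd(void)
{
    int first, second, again, ok;

    /* 0, 1 and 2 are conventionally stdin, stdout and stderr. */
    assert(STDIN_FILENO == 0 && STDOUT_FILENO == 1 && STDERR_FILENO == 2);

    first = open("/dev/null", O_RDONLY);
    if (first < 0)
        return -1;
    second = open("/dev/null", O_RDONLY);
    if (second < 0) {
        close(first);
        return -1;
    }

    close(first);                         /* frees the lower slot */
    again = open("/dev/null", O_RDONLY);  /* expected to land in that slot */
    ok = (again == first) && (second > first);

    if (again >= 0)
        close(again);
    close(second);
    return ok;
}
```

In a process started from a shell with the three standard descriptors open, the first open() above would typically return 3.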

Page 6: File Layer and Virtual File System

Types of Files

  • Regular
  • Directory
  • Block Special (Device) File
  • Character Special (Device) File
  • FIFO (Named Pipe)
  • Symbolic Link
  • Socket (in AF_UNIX Domain)

Page 7: File Layer and Virtual File System

UNIX Disk Abstraction

  • Partitions
      • Subsets of the disk that may be treated as logical disk drives.
  • Partitioning a Large Disk
      • Overcomes 32-bit UNIX limit problems
      • Isolates directories
      • Decreases fsck time
  • disklabel utility writes/edits disk label
  • Partition identified by a special file
      • block: /dev/disk/dsk[number][partition_letter]
      • character: /dev/rdisk/dsk[number][partition_letter]

Page 8: File Layer and Virtual File System

UNIX File System Abstraction

  • Two Senses
      • A mountable directory hierarchy administered in /etc/fstab.
      • A specific implementation of the UNIX file abstraction (UFS, NFS, AdvFS, CDFS, etc.).
  • One file system is the root file system
  • Other file systems are grafted onto the root by mounting.

Page 9: File Layer and Virtual File System

File System System Calls

  • mount(), umount()
  • sync()

Page 10: File Layer and Virtual File System

The Virtual File System: Transparent Access

[Figure: DOS drives A:, B:, and C: contrasted with UNIX disk partitions rz0a, rz0g, and rz3c.]

Page 11: File Layer and Virtual File System

The Virtual File System: Uniform Access

[Figure: an application process issues common system calls (open(), close(), read(), write(), lseek()) to the VFS, which passes each one to the specific file system type's implementation of the call.]

Page 12: File Layer and Virtual File System

File System Management Layers

    Layer                 Role
    -------------------   ----------------------------------------------------
    System Call           read(), write(), etc.
    File Layer            Manage file access state for a given process
    Virtual File System   Represent file systems and files generically
    File System(s)        Specific file system implementation: UFS, AdvFS, NFS, MFS, etc.
    Cache                 In-memory block storage for a file system. Could be
                          traditional buffer cache, unified buffer cache, or home grown.
    Device                Local block device, network interface, or a logical volume.

Page 13: File Layer and Virtual File System

Digital UNIX File Systems

  • “True” Data File Systems
      • UNIX File System (UFS)
      • Network File System (NFS)
      • Advanced File System (AdvFS)
      • Memory File System (MFS)
      • CD File System, ISO 9660:1988 (CDFS)
      • Universal Disk Format (UDF) -- DVDFS
  • Pseudo-File Systems or Layers
      • Proc File System (procfs)
      • File Descriptor File System (FDFS)
      • File-on-File File System

Page 14: File Layer and Virtual File System

CDFS

  • Compact Disk File System
  • Support for common extensions
      • ISO 9660 Standard with Rock Ridge Extensions
      • Joliet (Microsoft) extensions
      • Multi-session (Kodak) CD format
  • Can be exported by NFS

Page 15: File Layer and Virtual File System

CDFS: ISO 9660 Layout

  • ISO 9660 layout consists of
      • Primary and Secondary volume descriptors (a.k.a. super blocks)
      • Path Tables
      • Directory and File Data
  • Directory records contain
      • Location of file or directory
      • Size
      • Length of extended attribute record (XAR)
      • Interleave attributes
      • Flags
      • File Name

Page 16: File Layer and Virtual File System

CDFS: Interleaved and Noninterleaved Data Layout

[Figure: a non-interleaved file is drawn as an XAR followed by its data; an interleaved file is drawn as an XAR followed by data sections separated by gaps.]

  • XAR contents:
      • UID and GID
      • Access Permissions
      • Creation/Modification dates

Page 17: File Layer and Virtual File System

Memory File System (MFS)

  • Memory Only - No Permanent Storage
  • ufs format (in-memory)
  • Created with newfs
  • not wired - backed by swap
  • use: fast temporary directories
      • system /tmp
      • build areas
      • etc.

Page 18: File Layer and Virtual File System

MFS and swap

[Figure: an MFS file system held in physical memory and backed by swap.]

Page 19: File Layer and Virtual File System

The /proc File System

  • The /proc file system is useful for process tracing or debugging utilities, such as truss or dbx
  • Structures used by the /proc file system include:

        prstatus   Status of a traced task or thread
        prrun      Actions to be taken before a stopped task or thread is run
        prpsinfo   Information reported by ps

Page 20: File Layer and Virtual File System

File on File Mounting

  • FS layer allowing mounting, on a regular file, of:
      • regular files
      • character device files
      • block special device files
  • Provided for SVID conformance
  • FIFOs are given names as files
  • See fattach(3) and fdetach(3)

Page 21: File Layer and Virtual File System

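File Layer and VFS Structures

[Figure: relationships among the proc and session structures, utask and uthread, the descriptor table (uf_entry[ ][ ].ufe_ofile), and file structures. A file structure's f_data points either to a UNIX domain socket or to a vnode. The cdir, rdir, and ttyvp references also point to vnodes. On the VFS side are the vnode and mount structures; on the UFS side are the corresponding inode and ufs_mount structures.]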

Page 22: File Layer and Virtual File System

Per Process File Descriptor “table”

  • Relates a task to an open file
  • Referenced through the utask structure
      • two-level tree structures beginning with V5.0
  • Entries are allocated when
      • a file is opened or a pipe or socket is created
      • inherited in a fork()
      • a descriptor is copied via dup()
  • Entries are deallocated when
      • a file, pipe or socket is closed
      • a process terminates

Page 23: File Layer and Virtual File System

File structure

  • Records state of access to a file
      • Access mode (R/W)
      • Offset into file
  • Two tasks may share a single file structure
      • File structures inherited by child processes
  • Two uses of the file structure
      • for regular files
          • includes an ops vector for manipulating regular files
          • includes a pointer to a vnode
      • for sockets
          • includes an ops vector for manipulating sockets
          • includes a pointer to a socket
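Because the offset lives in the file structure rather than in the descriptor, descriptors that share a file structure share the offset. The following user-space sketch illustrates this under the assumption of a POSIX system with a writable /tmp; the helper name offsets_after_write is hypothetical. A dup()'d descriptor refers to the same open file, while a second open() of the same path gets its own.

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper. Writes 5 bytes through one descriptor, then reports
 * the offset seen through
 *   *dup_off  - a dup()'d descriptor (same open file, shared offset)
 *   *open_off - an independent open() of the same path (separate offset)
 * Returns 0 on success, -1 on any setup failure.
 */
int offsets_after_write(off_t *dup_off, off_t *open_off)
{
    char path[] = "/tmp/fileXXXXXX";
    int fd, fd_dup, fd_open, rc = -1;

    assert(dup_off != NULL && open_off != NULL);

    fd = mkstemp(path);                  /* creates and opens a temp file */
    if (fd < 0)
        return -1;
    fd_dup = dup(fd);
    fd_open = open(path, O_RDONLY);

    if (fd_dup >= 0 && fd_open >= 0 && write(fd, "hello", 5) == 5) {
        *dup_off = lseek(fd_dup, 0, SEEK_CUR);    /* moved by the write */
        *open_off = lseek(fd_open, 0, SEEK_CUR);  /* still at the start */
        rc = 0;
    }

    if (fd_dup >= 0)
        close(fd_dup);
    if (fd_open >= 0)
        close(fd_open);
    close(fd);
    unlink(path);
    return rc;
}
```

The same sharing applies to descriptors inherited across fork(), which is why parent and child advance a common offset.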

Page 24: File Layer and Virtual File System

File descriptor table within utask

  • Substructure of utask

        struct ufile_state uu_file_state;

  • Ultimately references file entry structures

        struct ufile_entry {
            struct file *ufe_ofile;
            struct socket_sel_queue *ufe_so_sel;
            int ufe_unused;
            int ufe_oflags;
            udecl_simple_lock_data(,ufe_ofile_lock)
        };

Page 25: File Layer and Virtual File System

ufile_state structure (1)

    struct ufile_state {
        udecl_simple_lock_data(,uf_ofile_lock)
        int utask_need_to_lock;
        int uf_first_available;  /* first available file descriptor */
        int uf_of_count;         /* number of overflow entries */
        int uf_flags;            /* marks pending changes in file descriptor table */
        int uf_references;       /* used to block table shrink */

Page 26: File Layer and Virtual File System

ufile_state structure (2)

        /* open file bit arrays -- indicate open files */
        u_long uf_open_bits_lvl1 ;
        u_long *uf_popen_bits_lvl0;
        u_long *uf_popen_bits_lvl1;
        u_long uf_open_bits_lvl0 ;
        /* pointers to the file entries */
        struct ufile_entry *uf_entry[U_FE_ARRAY_SIZE];
        struct ufile_entry **uf_of_entry ;
    };
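The slides do not say how the two bit-array levels are interpreted. One common design, shown here purely as an assumption, keeps one bit per descriptor at level 0 and one summary bit per level-0 word at level 1, so the first free descriptor is found with two short scans. The struct and function names (fdmap, fdmap_alloc, fdmap_free) are hypothetical and are not the Digital UNIX code.

```c
#include <assert.h>
#include <limits.h>
#include <string.h>

#define WORD_BITS ((int)(sizeof(unsigned long) * CHAR_BIT))
#define MAX_FDS   (WORD_BITS * WORD_BITS)

/* Hypothetical two-level open-descriptor bitmap. */
struct fdmap {
    unsigned long lvl1;                                   /* bit w set: lvl0[w] is full */
    unsigned long lvl0[sizeof(unsigned long) * CHAR_BIT]; /* one bit per descriptor */
};

/* Index of the lowest clear bit in word, or -1 if every bit is set. */
static int first_zero_bit(unsigned long word)
{
    int bit;

    for (bit = 0; bit < WORD_BITS; bit++)
        if ((word & (1UL << bit)) == 0)
            return bit;
    return -1;
}

void fdmap_init(struct fdmap *m)
{
    memset(m, 0, sizeof *m);
}

/* Allocate the lowest free descriptor, or return -1 if the map is full. */
int fdmap_alloc(struct fdmap *m)
{
    int w = first_zero_bit(m->lvl1);   /* first word with a free slot */
    int b;

    if (w < 0)
        return -1;
    b = first_zero_bit(m->lvl0[w]);
    assert(b >= 0);                    /* guaranteed by the summary bit */
    m->lvl0[w] |= 1UL << b;
    if (m->lvl0[w] == ~0UL)
        m->lvl1 |= 1UL << w;           /* word just became full */
    return w * WORD_BITS + b;
}

/* Release a descriptor; its word can no longer be full. */
void fdmap_free(struct fdmap *m, int fd)
{
    assert(fd >= 0 && fd < MAX_FDS);
    m->lvl0[fd / WORD_BITS] &= ~(1UL << (fd % WORD_BITS));
    m->lvl1 &= ~(1UL << (fd / WORD_BITS));
}
```

A cached hint such as uf_first_available could shortcut the level-1 scan; the sketch omits it to keep the two-level idea visible.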

Page 27: File Layer and Virtual File System

file structure (1)

    struct file {
        udecl_simple_lock_data(,f_incore_lock)
        int f_flag;
        uint_t f_count;         /* reference count */
        int f_type;             /* descriptor type */
        int f_msgcount;         /* references from message queue */
        struct ucred *f_cred;   /* descriptor's credentials */
        struct fileops *f_ops;  /* operations on f_data */
        caddr_t f_data;         /* vnode or socket */
        ....

Page 28: File Layer and Virtual File System

file structure (2)

        …
        union {                  /* offset or next free file struct */
            off_t fu_offset;
            struct file *fu_freef;
        } f_u;
        uint_t f_io_lock;        /* I/O lock (lower half of thread ptr) */
        int f_io_waiters;        /* number of waiters on i/o lock */
    };

Page 29: File Layer and Virtual File System

struct fileops

    struct fileops {
        int (*fo_read)();
        int (*fo_write)();
        int (*fo_ioctl)();
        int (*fo_select)();
        int (*fo_close)();
    };

Page 30: File Layer and Virtual File System

struct fileops Implementations

  • Regular Files: vfs/vfs_vnops.c

        struct fileops vnops =
            { vn_read, vn_write, vn_ioctl, vn_select, vn_close };

  • Sockets: bsd/sys_socket.c

        struct fileops socketops =
            { soo_read, soo_write, soo_ioctl, soo_select, soo_close };
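The two operation vectors let the file layer call the right routine without testing the descriptor type. The following user-space sketch shows the dispatch pattern only. The toy_ functions are hypothetical stand-ins for vn_read() and soo_read(), the structures are reduced to the fields needed here, and the argument lists and the shape of the FOP_READ macro are assumptions, since the slides show the members without prototypes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct file;

/* Reduced operations vector; argument lists are assumed for the sketch. */
struct fileops {
    int (*fo_read)(struct file *fp, char *buf, int len);
    int (*fo_close)(struct file *fp);
};

/* Reduced file structure: only the dispatch-related fields. */
struct file {
    struct fileops *f_ops;  /* operations on f_data */
    void *f_data;           /* vnode or socket in the real kernel */
};

/* Assumed macro shape: call through the file's own vector. */
#define FOP_READ(fp, buf, len) ((*(fp)->f_ops->fo_read)((fp), (buf), (len)))

/* Copy a tag into buf so the caller can see which routine ran. */
static int fill(char *buf, int len, const char *tag)
{
    assert(len > 0);
    strncpy(buf, tag, (size_t)len - 1);
    buf[len - 1] = '\0';
    return (int)strlen(buf);
}

/* Hypothetical stand-ins for vn_read() and soo_read(). */
static int toy_vn_read(struct file *fp, char *buf, int len)
{
    (void)fp;
    return fill(buf, len, "vnode");
}

static int toy_soo_read(struct file *fp, char *buf, int len)
{
    (void)fp;
    return fill(buf, len, "socket");
}

static int toy_close(struct file *fp)
{
    (void)fp;
    return 0;
}

struct fileops toy_vnops     = { toy_vn_read,  toy_close };
struct fileops toy_socketops = { toy_soo_read, toy_close };

/* Generic entry point: no switch on file type, just an indirect call. */
int generic_read(struct file *fp, char *buf, int len)
{
    assert(fp != NULL && fp->f_ops != NULL);
    return FOP_READ(fp, buf, len);
}
```

Adding a third kind of descriptor would mean supplying another vector, with no change to generic_read().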

Page 31: File Layer and Virtual File System

Virtual File System

  • Originally designed for UNIX by Sun Microsystems, Inc., to support the Network File System (NFS)
  • Object-oriented support of multiple file system types:
      • struct vnode is a generic representation of a file for all types of file system implementations
      • struct mount is a generic representation of a whole mountable file system for all implementations
      • a file system implements its own set of:
          • member functions for vnodes and mount structures
          • data structures to combine with generic vnode and mount structures
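The "member functions plus private data" idea can be sketched in user space as follows. Everything here is illustrative: the vn_getsize operation, the VOP_GETSIZE macro, and the two toy file systems are hypothetical, and the sketch uses a v_data pointer where the kernel's vnode ends in a v_data[ ] placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct vnode;

/* Reduced operations vector with one hypothetical member function. */
struct vnodeops {
    long (*vn_getsize)(struct vnode *vp);
};

/* Generic part, identical for every file system type. */
struct vnode {
    struct vnodeops *v_op;  /* member functions of the owning file system */
    void *v_data;           /* file system specific data (pointer in this sketch) */
};

/* Assumed macro shape: dispatch through the vnode's own vector. */
#define VOP_GETSIZE(vp) ((*(vp)->v_op->vn_getsize)(vp))

/* Toy disk-style file system: private data is an inode-like record. */
struct toy_inode {
    long i_size;
};

static long toy_ufs_getsize(struct vnode *vp)
{
    return ((struct toy_inode *)vp->v_data)->i_size;
}

/* Toy memory-style file system: private data is the text itself. */
struct toy_memnode {
    const char *text;
};

static long toy_mem_getsize(struct vnode *vp)
{
    return (long)strlen(((struct toy_memnode *)vp->v_data)->text);
}

struct vnodeops toy_ufs_vnodeops = { toy_ufs_getsize };
struct vnodeops toy_mem_vnodeops = { toy_mem_getsize };
```

Callers use only VOP_GETSIZE(vp); which private structure sits behind v_data is known only to the file system that installed v_op.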

Page 32: File Layer and Virtual File System

struct vnode (1)

    Field             Description                  Points to
    ---------------   --------------------------   ------------------
    <locks>           multiprocessor exclusion
    v_flag            vnode flags
    v_usecount        reference count of users
    v_holdcnt         page & buffer references
    lock counts       user-level lock counts
    v_lastr           last read (read-ahead)
    v_id              capability identifier
    v_type            vnode type
    v_tag             type of underlying data
    v_mount           ptr to vfs we are in         mount structure
    v_mountedhere     ptr to mounted vfs           mount structure
    v_op              vnode operations             vnodeops structure
    v_freef           vnode freelist forward       vnode structure
    v_freeb           vnode freelist back          vnode structure
    v_mountf          vnode mountlist forward      vnode structure
    v_mountb          vnode mountlist back         vnode structure
    v_buflists_lock   protect clean/dirty heads
    v_cleanblkhd      clean blocklist head         buf structure
    v_dirtyblkhd      dirty blocklist head         buf structure
    .....

Page 33: File Layer and Virtual File System

struct vnode (2)

    Field                 Description                        Points to
    -------------------   --------------------------------   -------------------
    ....
    v_ncache_time         last cache activity time
    v_free_time           time on vnode free_list
    v_output_lock         protect numoutput, outflag
    v_numoutput           num of writes in progress
    v_outflag             output flags
    v_cache_lookup_refs
    v_rdcnt               count of readers
    v_wrcnt               count of writers
    v_dirtyblkcnt         Snapshot count of dirty blocks
    v_dirtyblkpush        Snapshot count of pushed blocks
    v_un                  ptr to sock, dev specinfo, pipe
    v_object              VM object for vnode                vm_object structure
    v_secops              vnode security ops                 vnsecops structure
    v_data[ ]             placeholder, private data

Page 34: File Layer and Virtual File System

Types of Vnodes

    Type     Description
    ------   ------------------------------------
    VNON     Allocated, but as-yet untyped vnode
    VREG     Vnode representing a regular file
    VDIR     Directory vnode
    VBLK     Block device vnode
    VCHR     Character device vnode
    VLNK     Symbolic link vnode
    VSOCK    UNIX domain socket vnode
    VFIFO    FIFO special file vnode

Page 35: File Layer and Virtual File System

struct vnodeops (1)

    Operation     Function
    -----------   -------------------------------------
    vn_lookup     Looks up a file
    vn_create     Creates a regular file
    vn_mknod      Creates a fifo or device special file
    vn_open       Opens a file
    vn_close      Closes a file
    vn_access     Checks the access for a file
    vn_getattr    Gets file attributes
    vn_setattr    Sets file attributes
    vn_read       Reads a file
    vn_write      Writes to a file
    vn_ioctl      Controls a device
    vn_select     For synchronous I/O multiplexing

Page 36: File Layer and Virtual File System

struct vnodeops (2)

    Operation     Function
    -----------   -------------------------------------
    vn_mmap       Map memory of a character device
    vn_fsync      Synchronize file data and statistics
    vn_seek       Sets position on a file
    vn_remove     Removes a file
    vn_link       Creates a hard link to a file
    vn_rename     Renames a file
    vn_mkdir      Creates a directory
    vn_rmdir      Removes a directory
    vn_symlink    Creates a symbolic link to a file
    vn_readdir    Reads a directory
    vn_readlink   Reads contents of a symbolic link
    vn_abortop    Aborts operation

Page 37: File Layer and Virtual File System

struct vnodeops (3)

    Operation     Function
    -----------   -------------------------------------
    vn_inactive   Sets inactive
    vn_reclaim    Reclaims a vnode
    vn_bmap       Maps to file system block
    vn_strategy   Calls device strategy routine
    vn_print      Prints the contents of an inode
    vn_pgrd       Reads a page
    vn_pgwr       Writes a page
    vn_swap       Swaps handler
    vn_bread      Reads buffer
    vn_brelse     Releases buffer
    vn_lockctl    Provides file locking
    vn_syncdata   Synchronizes range in open file

Page 38: File Layer and Virtual File System

struct vnodeops (4)

    Operation        Function
    --------------   ----------------------------
    vn_lock          Locks an inode
    vn_unlock        Unlocks an inode
    vn_getproplist   Gets extended attributes
    vn_setproplist   Sets extended attributes
    vn_delproplist   Deletes extended attributes
    vn_pathconf      Checks path

Page 39: File Layer and Virtual File System

struct mount

    Field              Description                      Points to
    ----------------   ------------------------------   ----------------
    m_lock             Lock for synchronization (SMP)
    m_flag             Flags
    m_funnel           Flag for SMP
    m_next             Next in mount list               mount structure
    m_prev             Previous in mount list           mount structure
    m_op               Operations on file system        vfsops structure
    m_vnodecovered     Vnode we are mounted on          vnode structure
    m_mounth           List of vnodes this mount        vnode structure
    m_vlist_lock       Lock for vnode list
    m_exroot           Exported mapping for UID 0
    m_uid              UID of mounter
    m_stat             File system statistics
    m_data             Private data
    m_nfs_errmsginfo   NFS error information
    m_unmount_lock     Lock for synchronization

Page 40: File Layer and Virtual File System

struct vfsops

    Operation        Function
    --------------   ----------------------------------------------
    vfs_mount        Mounts the file system
    vfs_start        Starts the file system
    vfs_unmount      Unmounts the file system
    vfs_root         Returns vnode for the root of the file system
    vfs_quotactl     Performs operations associated with quotas
    vfs_statfs       Updates file system statistics
    vfs_sync         Synchronizes the file system
    vfs_fhtovp       Returns the vnode pointer, given a file handle
    vfs_vptofh       Returns a file handle, given a vnode pointer
    vfs_init         Initializes the file system
    vfs_mountroot    Mount root file system
    vfs_smoothsync   Gently sync the file system

Page 41: File Layer and Virtual File System

VFS Switch Table

  • Identifies file system types that have been implemented.
  • Contains an entry point for file system operations for each supported file system type.

        struct vfsops *vfssw[MOUNT_MAXTYPE];
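A switch table of this kind is indexed by file system type, and the selected entry becomes the mount structure's m_op. The sketch below is a user-space illustration under stated assumptions: the TOY_ type numbers, the m_fsname field, and the toy_ functions are hypothetical, and a NULL entry is taken to mean that the type is not configured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type numbers; the real MOUNT_* values are not given here. */
#define TOY_MOUNT_UFS     1
#define TOY_MOUNT_NFS     2
#define TOY_MOUNT_MAXTYPE 3

struct mount;

/* Reduced vfsops: only the mount entry point. */
struct vfsops {
    int (*vfs_mount)(struct mount *mp);
};

/* Reduced mount structure. */
struct mount {
    struct vfsops *m_op;    /* operations on this file system */
    const char *m_fsname;   /* hypothetical field, set by the toy mount routines */
};

static int toy_ufs_mount(struct mount *mp)
{
    mp->m_fsname = "ufs";
    return 0;
}

static int toy_nfs_mount(struct mount *mp)
{
    mp->m_fsname = "nfs";
    return 0;
}

static struct vfsops toy_ufs_vfsops = { toy_ufs_mount };
static struct vfsops toy_nfs_vfsops = { toy_nfs_mount };

/* Switch table: one slot per type, NULL where nothing is configured. */
static struct vfsops *toy_vfssw[TOY_MOUNT_MAXTYPE + 1] = {
    NULL, &toy_ufs_vfsops, &toy_nfs_vfsops, NULL
};

/* Select the operations for the requested type, then call through them. */
int toy_mount(struct mount *mp, int type)
{
    assert(mp != NULL);
    if (type < 0 || type > TOY_MOUNT_MAXTYPE || toy_vfssw[type] == NULL)
        return -1;                      /* unknown or unconfigured type */
    mp->m_op = toy_vfssw[type];
    return (*mp->m_op->vfs_mount)(mp);
}
```

After toy_mount() succeeds, every later operation on the file system can go through mp->m_op without consulting the table again.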

Page 42: File Layer and Virtual File System

Setting Up File System Operations

[Figure: the m_op field of a mount structure is set from an entry of vfssw, whose slots hold &ufs_vfsops, &nfs_vfsops, or NULLPTR. The vfsops structure reached through &ufs_vfsops holds ufs_mount, ufs_start, ufs_unmount, ufs_root, ufs_quotactl, and so on.]

Page 43: File Layer and Virtual File System

Mounted File System Structures

[Figure: rootfs heads the mount table, a list of mount structures linked through m_next and m_prev. Each mount structure's m_data points to file system specific information. Each vnode points to its mount structure through v_mount and to file system specific file information through v_data.]

Page 44: File Layer and Virtual File System

Recording Mount Points: How are they mounted? (1)

[Figure: file system trees A, B, and C, drawn separately and joined by mounting.]

Page 45: File Layer and Virtual File System

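Recording Mount Points: How are they mounted? (2)

[Figure: vnode structures (VDIR; VDIR with VROOT) and mount structures. A covered directory vnode's v_mountedhere points to the mount structure of the file system mounted on it, and that mount structure's m_vnodecovered points back to the covered vnode. Mount structures are chained from rootfs through their next pointers, and m_mounth heads each mount's list of vnodes.]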

Page 46: File Layer and Virtual File System

File System Operations

    Call             Function
    --------------   ------------------------
    namei()          Interprets a pathname
    mount()          Mounts a file system
    open()           Opens a file
    read()/write()   Reads or writes a file

Page 47: File Layer and Virtual File System

Namei (1)

  • VFS routine that maps pathnames to vnodes
  • Performs access checks on each component of that pathname.
  • Uses VOP_LOOKUP to move down the path
  • Special Cases
      • Symbolic Links
      • Mount Points
      • Process-Specific root (chroot())
  • Special Care - unmounts

Page 48: File Layer and Virtual File System

Namei (2)

  • An LRU Hash Table
      • <parent-vnode, component-name> to <target-vnode, capabilities>
  • Capabilities are "tags" assigned to vnodes
      • prevent cache entries from referring to out-of-date associations
  • Related data structures include:
      • namecache - namei cache
      • nchash - hash list for cache
      • nchsize - size of cache
      • nchsz - size of hash list
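The capability check can be sketched as follows: each cache entry records the v_id values its vnodes had when the entry was made, and a lookup only succeeds if those values still match. This is a simplified user-space illustration, not the kernel's namecache. It is direct-mapped with no LRU chain, it hashes on the name only, and the nc_ and vnode_recycle names are hypothetical. It also assumes that vnodes are recycled in place rather than freed, so a stale entry's pointer can still be dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NC_BUCKETS 8
#define NC_NAMELEN 32

/* Reduced vnode: only the capability identifier. */
struct vnode {
    unsigned long v_id;
};

/* One cache entry: <parent, name> maps to <target, capabilities>. */
struct ncentry {
    int used;
    struct vnode *dvp;      /* parent directory vnode */
    unsigned long dvp_id;   /* parent's v_id when the entry was made */
    char name[NC_NAMELEN];  /* component name */
    struct vnode *vp;       /* target vnode */
    unsigned long vp_id;    /* target's v_id when the entry was made */
};

static struct ncentry nc_table[NC_BUCKETS];

static unsigned nc_hash(const char *name)
{
    unsigned h = 0;

    while (*name != '\0')
        h = h * 31u + (unsigned char)*name++;
    return h % NC_BUCKETS;
}

/* Record an association, replacing whatever occupied the bucket. */
void nc_enter(struct vnode *dvp, const char *name, struct vnode *vp)
{
    struct ncentry *e = &nc_table[nc_hash(name)];

    assert(strlen(name) < NC_NAMELEN);
    e->dvp = dvp;
    e->dvp_id = dvp->v_id;
    strcpy(e->name, name);
    e->vp = vp;
    e->vp_id = vp->v_id;
    e->used = 1;
}

/* Return the cached target, or NULL on a miss or a stale entry. */
struct vnode *nc_lookup(struct vnode *dvp, const char *name)
{
    struct ncentry *e = &nc_table[nc_hash(name)];

    if (!e->used || e->dvp != dvp || strcmp(e->name, name) != 0)
        return NULL;
    if (e->dvp_id != dvp->v_id || e->vp_id != e->vp->v_id) {
        e->used = 0;        /* capability mismatch: drop the stale entry */
        return NULL;
    }
    return e->vp;
}

/* Reusing a vnode for another file changes its capability. */
void vnode_recycle(struct vnode *vp)
{
    vp->v_id++;
}
```

Bumping v_id invalidates every entry that mentions the vnode without having to search the cache for them.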

Page 49: File Layer and Virtual File System

namei() flow

[Flowchart; boxes and decisions as recovered from the slide:
  • Start
  • Copy name into local buffer
  • Copy next component to buffer
  • ".." ? Yes: find parent vnode
  • Call file system specific lookup routine VOP_LOOKUP()
  • Symbolic link? Yes: copy name to buffer
  • Mounted on? Yes: find root vnode of mounted file system: VFS_ROOT()
  • More components? Yes: continue with the next component. No: done]
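The component loop in the flowchart can be sketched in user space over a small in-memory tree. This is an illustration under stated assumptions, not the kernel routine: toy_lookup() stands in for VOP_LOOKUP(), following v_mountedhere to the mounted root stands in for VFS_ROOT(), and symbolic links, access checks, chroot(), and ".." across a mount point are all omitted. The structure layouts and names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAXKIDS 4
#define NAMELEN 16

struct mount;

/* Toy directory vnode. */
struct vnode {
    const char *name;
    struct vnode *parent;
    struct vnode *kids[MAXKIDS];
    struct mount *v_mountedhere;  /* file system mounted on this vnode */
};

/* Toy mount structure: just the root of the mounted tree. */
struct mount {
    struct vnode *root;
};

/* Stand-in for VOP_LOOKUP(): find name among dvp's children. */
static struct vnode *toy_lookup(struct vnode *dvp, const char *name)
{
    int i;

    for (i = 0; i < MAXKIDS; i++)
        if (dvp->kids[i] != NULL && strcmp(dvp->kids[i]->name, name) == 0)
            return dvp->kids[i];
    return NULL;
}

/* Walk path one component at a time; NULL if any component is missing. */
struct vnode *toy_namei(struct vnode *start, const char *path)
{
    char comp[NAMELEN];
    struct vnode *vp = start;

    assert(start != NULL && path != NULL);
    while (vp != NULL) {
        size_t n;

        while (*path == '/')            /* skip separators */
            path++;
        if (*path == '\0')              /* no more components: done */
            break;
        n = strcspn(path, "/");         /* copy next component to buffer */
        if (n >= NAMELEN)
            return NULL;
        memcpy(comp, path, n);
        comp[n] = '\0';
        path += n;

        if (strcmp(comp, ".") == 0)
            continue;
        if (strcmp(comp, "..") == 0) {  /* find parent vnode */
            if (vp->parent != NULL)
                vp = vp->parent;
            continue;
        }
        vp = toy_lookup(vp, comp);
        if (vp != NULL && vp->v_mountedhere != NULL)
            vp = vp->v_mountedhere->root;  /* cross into the mounted tree */
    }
    return vp;
}
```

Resolving a path whose component is covered by a mount returns the root vnode of the mounted tree rather than the covered directory, which is the behaviour the "Mounted on?" decision describes.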

Page 50: File Layer and Virtual File System

mount() flow

[Figure: mount() calls namei(), VFS_MOUNT (which reaches ufs_mount() for UFS), and vmountset(), which updates the mount table.]

Page 51: File Layer and Virtual File System

open() flow

[Figure: open() calls falloc(), which takes an entry in the file table, and vn_open(). vn_open() calls namei(), which leads to VOP_LOOKUP() or VOP_CREATE().]

Page 52: File Layer and Virtual File System

read() and write() flow

[Figure: read() and write() enter rwuio(), which calls getf() to go from the process descriptor table to the file table, then FOP_READ() or FOP_WRITE(). For vnode-backed files these reach vn_read(), then VOP_READ() or VOP_WRITE() through the vnode table, and finally ufs_read() or ufs_write() on the file.]

Page 53: File Layer and Virtual File System

Source Reference (1 of 2)

  • kernel/sys/users.h
      • open file table ufile_state in struct utask
  • kernel/sys/file.h
      • defines a struct file and fileops
  • kernel/vfs/vfs_vnops.c
      • implementation of fileops for vnode file structs
  • kernel/bsd/sys_socket.c
      • implementation of fileops for socket file structs
  • kernel/sys/vnode.h
      • definition of vnode and vnodeops

Page 54: File Layer and Virtual File System

Source Reference (2 of 2)

  • kernel/sys/mount.h
      • definition of struct mount and vfsops
  • kernel/vfs/vfs_syscalls.c
      • vfs_switch[]
  • kernel/vfs/vfs_lookup.c
      • implementation of namei()
  • kernel/vfs/vfs_syscalls.c
      • implementation of mount() and open() calls
  • kernel/bsd/sys_generic.c
      • implementation of read() and write() calls