Digital UNIX Internals II
File Layer and Virtual File System3 - 1
File Layer and Virtual File System
Chapter Three
Topics
  File System Abstractions
  File System Layers
  The File Layer
  The Virtual File System
  Selected File Related Calls
UNIX File Abstraction
  The File
    Stream of bytes; any record structure is imposed by the application
    Sequential or random access
  The Directory Structure
    Tree-like directory hierarchy
  File sharing
    hard links - multiple names for the same disk file
    soft (symbolic) links - stored path shortcut
  Access control associated with the file
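The difference between the two kinds of link can be observed from user space. The following is a minimal sketch, assuming a POSIX system with a writable /tmp that supports hard links; the function name links_demo and the scratch file names are made up for illustration.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <unistd.h>

/* Creates a scratch file plus a hard link and a symbolic link to it.
 * Returns 1 if the hard link shares the file's inode while the symbolic
 * link is a separate file of type "symlink", 0 if not, -1 on error. */
int links_demo(void)
{
    char dir[] = "/tmp/linkdemoXXXXXX";
    char file[64], hard[64], soft[64];
    struct stat sf, sh, ss;
    int result = -1;

    if (mkdtemp(dir) == NULL)
        return -1;
    snprintf(file, sizeof file, "%s/file", dir);
    snprintf(hard, sizeof hard, "%s/hard", dir);
    snprintf(soft, sizeof soft, "%s/soft", dir);

    FILE *fp = fopen(file, "w");
    if (fp != NULL) {
        fclose(fp);
        if (link(file, hard) == 0 &&      /* second name, same inode */
            symlink(file, soft) == 0 &&   /* new file that stores a path */
            stat(file, &sf) == 0 &&
            stat(hard, &sh) == 0 &&
            lstat(soft, &ss) == 0) {      /* lstat: do not follow the link */
            result = sf.st_ino == sh.st_ino &&
                     sf.st_nlink == 2 &&
                     S_ISLNK(ss.st_mode) &&
                     ss.st_ino != sf.st_ino;
        }
    }
    /* Clean up; failures here are ignored in this sketch. */
    unlink(soft);
    unlink(hard);
    unlink(file);
    rmdir(dir);
    return result;
}
```

Both names of a hard-linked file report the same inode number and a link count of 2, whereas the symbolic link has its own inode whose data is the stored path.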
File Related System Calls
  open(), close()
  creat(), unlink()
  read(), write()
  lseek()
  getattr(), setattr()
  mmap()
  ioctl()
  fsync()
  dup(), dup2()
File Descriptor
  Application's name for an open file
    small integer returned by open()
  The first three file descriptors are
    0 -- standard input
    1 -- standard output
    2 -- standard error
  These are usually associated with a terminal
  Each has an associated offset or file position pointer
Types of Files
  Regular
  Directory
  Block Special (Device) File
  Character Special (Device) File
  FIFO (Named Pipe)
  Symbolic Link
  Socket (In AF_UNIX Domain)
UNIX Disk Abstraction
  Partitions
    Subsets of the disk that may be treated as logical disk drives
  Partitioning a large disk
    Overcomes 32-bit UNIX limit problems
    Isolates directories
    Decreases fsck time
  disklabel utility writes/edits the disk label
  Partition identified by a special file
    block: /dev/disk/dsk[number][partition_letter]
    character: /dev/rdisk/dsk[number][partition_letter]
UNIX File System Abstraction
  Two senses
    A mountable directory hierarchy administered in /etc/fstab
    A specific implementation of the UNIX file abstraction (UFS, NFS, AdvFS, CDFS, etc.)
  One file system is the root file system
  Other file systems are grafted onto the root by mounting
File System System Calls
  mount(), unmount()
  sync()
The Virtual File System: Transparent Access
[Figure: DOS drive letters (A:, B:, C:) contrasted with UNIX disk partitions (rz0a, rz0g, rz3c)]
The Virtual File System: Uniform Access
[Figure: an application process issues common system calls (open(), close(), read(), write(), lseek()) to the VFS, which passes each one to the specific file system type's implementation of the call]
File System Management Layers
Layer                Role
-------------------  ----------------------------------------------------
System Call          read(), write() etc.
File Layer           Manage file access state for a given process
Virtual File System  Represent file system and files generically
File System(s)       Specific file system implementation: UFS, AdvFS, NFS, MFS, etc.
Cache                In-memory block storage for a file system. Could be traditional buffer cache, unified buffer cache or home grown.
Device               Local block device, network interface or a logical volume
Digital UNIX File Systems
  “True” data file systems
    UNIX File System (UFS)
    Network File System (NFS)
    Advanced File System (AdvFS)
    Memory File System (MFS)
    CD File System, ISO 9660:1988 (CDFS)
    Universal Disk Format (UDF) -- DVDFS
  Pseudo-file systems or layers
    Proc File System (procfs)
    File Descriptor File System (FDFS)
    File-on-File File System
CDFS
  Compact Disk File System
  Support for common extensions
    ISO 9660 standard with Rock Ridge extensions
    Joliet (Microsoft) extensions
    Multi-session (Kodak) CD format
  Can be exported by NFS
CDFS: ISO 9660 Layout
  ISO 9660 layout consists of
    Primary and secondary volume descriptors (a.k.a. super blocks)
    Path tables
    Directory and file data
  Directory records contain
    Location of file or directory
    Size
    Length of extended attribute record (XAR)
    Interleave attributes
    Flags
    File name
CDFS: Interleaved and Noninterleaved Data Layout
[Figure: a non-interleaved file (XAR and data) beside an interleaved file (XAR, data and gaps)]
  XAR contents:
    UID and GID
    Access permissions
    Creation/modification dates
Memory File System (MFS)
  Memory only - no permanent storage
  UFS format (in-memory)
  Created with newfs
  Not wired - backed by swap
  Use: fast temporary directories
    system /tmp
    build areas
    etc.
MFS and swap
[Figure: physical memory and swap]
The /proc File System
  The /proc file system is useful for process tracing or debugging utilities, such as truss or dbx
  Structures used by the /proc file system include:
    prstatus  Status of a traced task or thread
    prrun     Actions to be taken before a stopped task or thread is run
    prpsinfo  Information reported by ps
File on File Mounting
  FS layer allowing mounting on a regular file of:
    regular files
    character device files
    block special device files
  Provided for SVID conformance
  FIFOs are given names as files
    see fattach(3) and fdetach(3)
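File Layer and VFS Structures
[Figure: structure map. utask, uthread, proc and session structures; the utask file table entry uf_entry[ ][ ].ufe_ofile references a file structure; the file's f_data references a vnode (VFS) or a UNIX domain socket; vnodes reference inodes (UFS); cdir, rdir and ttyvp also reference vnodes; mount and ufs_mount structures are shown alongside]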
Per Process File Descriptor “table”
  Relates a task to an open file
  Referenced through the utask structure
    two-level tree structures, beginning with V5.0
  Entries are allocated when
    a file is opened or a pipe or socket is created
    inherited in a fork()
    a descriptor is copied via dup()
  Entries are deallocated when
    a file, pipe or socket is closed
    a process terminates
File structure
  Records state of access to a file
    Access mode (R/W)
    Offset into file
  Two tasks may share a single file structure
    File structures are inherited by child processes
  Two uses of the file structure
    for regular files
      includes an ops vector for manipulating regular files
      includes a pointer to a vnode
    for sockets
      includes an ops vector for manipulating sockets
      includes a pointer to a socket
File descriptor table within utask
  Substructure of utask
    struct ufile_state uu_file_state;
  Ultimately references file entry structures
    struct ufile_entry {
        struct file *ufe_ofile;
        struct socket_sel_queue *ufe_so_sel;
        int ufe_unused;
        int ufe_oflags;
        udecl_simple_lock_data(,ufe_ofile_lock)
    };
ufile_state structure (1)
    struct ufile_state {
        udecl_simple_lock_data(,uf_ofile_lock)
        int utask_need_to_lock;
        int uf_first_available;  /* First available file descriptor */
        int uf_of_count;         /* Number of overflow entries */
        int uf_flags;            /* Marks pending changes in file descriptor table */
        int uf_references;       /* Used to block table shrink */
ufile_state structure (2)
        /* Open file bit arrays -- indicate open files */
        u_long uf_open_bits_lvl1;
        u_long *uf_popen_bits_lvl0;
        u_long *uf_popen_bits_lvl1;
        u_long uf_open_bits_lvl0;
        /* Pointers to the file entries */
        struct ufile_entry *uf_entry[U_FE_ARRAY_SIZE];
        struct ufile_entry **uf_of_entry;
    };
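POSIX requires open() to return the lowest-numbered unused descriptor, and open-file bitmaps make that search cheap. The slides do not say how the kernel uses its two bitmap levels, so the following is only a sketch of one common two-level arrangement: a summary word marks which bitmap words are completely full. All names (toy_fdbits, toy_fd_alloc, toy_fd_free, TOY_WORDS) are hypothetical and not taken from the Digital UNIX source.

```c
#include <limits.h>
#include <stddef.h>

#define WORD_BITS (sizeof(unsigned long) * CHAR_BIT)
#define TOY_WORDS 4                    /* toy table: 4 words of open bits */

struct toy_fdbits {
    unsigned long full;                /* summary: bit w set if word w is full */
    unsigned long open[TOY_WORDS];     /* bit set means descriptor is in use */
};

/* Allocate the lowest free descriptor, or return -1 if none is free. */
int toy_fd_alloc(struct toy_fdbits *t)
{
    for (size_t w = 0; w < TOY_WORDS; w++) {
        if (t->full & (1UL << w))
            continue;                  /* summary bit lets us skip full words */
        for (size_t b = 0; b < WORD_BITS; b++) {
            if (!(t->open[w] & (1UL << b))) {
                t->open[w] |= 1UL << b;
                if (t->open[w] == ULONG_MAX)
                    t->full |= 1UL << w;
                return (int)(w * WORD_BITS + b);
            }
        }
    }
    return -1;
}

/* Release a descriptor; its word can no longer be full. */
void toy_fd_free(struct toy_fdbits *t, int fd)
{
    size_t w = (size_t)fd / WORD_BITS;
    size_t b = (size_t)fd % WORD_BITS;
    t->open[w] &= ~(1UL << b);
    t->full &= ~(1UL << w);
}
```

After descriptors 0, 1 and 2 are allocated and 1 is freed, the next allocation returns 1 again, which matches the lowest-available rule.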
file structure (1)
    struct file {
        udecl_simple_lock_data(,f_incore_lock)
        int f_flag;
        uint_t f_count;          /* reference count */
        int f_type;              /* descriptor type */
        int f_msgcount;          /* references from message queue */
        struct ucred *f_cred;    /* descriptor's credentials */
        struct fileops *f_ops;   /* operations on f_data */
        caddr_t f_data;          /* vnode or socket */
        ....
file structure (2)
        …
        union {                  /* offset or next free file struct */
            off_t fu_offset;
            struct file *fu_freef;
        } f_u;
        uint_t f_io_lock;        /* I/O lock (lower half of thread ptr) */
        int f_io_waiters;        /* number of waiters on i/o lock */
    };
struct fileops
struct fileops {
int (*fo_read)();
int (*fo_write)();
int (*fo_ioctl)();
int (*fo_select)();
int (*fo_close)();
};
struct fileops Implementations
  Regular files: vfs/vfs_vnops.c
    struct fileops vnops =
        { vn_read, vn_write, vn_ioctl, vn_select, vn_close };
  Sockets: bsd/sys_socket.c
    struct fileops socketops =
        { soo_read, soo_write, soo_ioctl, soo_select, soo_close };
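The point of the ops vector is that the file layer never has to test whether a descriptor refers to a vnode or a socket; it simply calls through f_ops. The sketch below models that dispatch in user space. It is a reduced toy, not kernel code: the vector has only two members, the argument lists are invented, and every toy_* name and the file_read helper are hypothetical.

```c
#include <stddef.h>
#include <string.h>

struct file;                           /* forward declaration */

/* Reduced ops vector: only read and close are modeled here. */
struct fileops {
    int (*fo_read)(struct file *fp, char *buf, size_t len);
    int (*fo_close)(struct file *fp);
};

struct file {
    int             f_count;           /* reference count */
    struct fileops *f_ops;             /* operations on f_data */
    void           *f_data;            /* toy stand-in for a vnode or socket */
};

/* Toy back ends: each "read" just copies a fixed string. */
static int toy_copy(const char *src, char *buf, size_t len)
{
    size_t n = strlen(src);
    if (n > len)
        n = len;
    memcpy(buf, src, n);
    return (int)n;
}

static int toy_vn_read(struct file *fp, char *buf, size_t len)
{
    (void)fp;
    return toy_copy("from vnode", buf, len);
}

static int toy_soo_read(struct file *fp, char *buf, size_t len)
{
    (void)fp;
    return toy_copy("from socket", buf, len);
}

static int toy_close(struct file *fp)
{
    fp->f_count--;
    return 0;
}

static struct fileops toy_vnops     = { toy_vn_read,  toy_close };
static struct fileops toy_socketops = { toy_soo_read, toy_close };

/* The caller never inspects the file type; it calls through the vector. */
int file_read(struct file *fp, char *buf, size_t len)
{
    return fp->f_ops->fo_read(fp, buf, len);
}
```

A file structure set up with toy_vnops and one set up with toy_socketops are read through the same file_read() call and reach different back ends.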
Virtual File System
  Originally designed for UNIX by Sun Microsystems, Inc., to support the Network File System (NFS)
  Object-oriented support of multiple file system types:
    struct vnode is a generic representation of a file for all types of file system implementations
    struct mount is a generic representation of a whole mountable file system for all implementations
    a file system implements its own set of:
      member functions for vnodes and mount structures
      data structures to combine with generic vnode and mount structures
struct vnode (1)
Field            Description
---------------  ----------------------------------------------------
<locks>          multiprocessor exclusion
v_flag           vnode flags
v_usecount       reference count of users
v_holdcnt        page & buffer references
lock counts      user-level lock counts
v_lastr          last read (read-ahead)
v_id             capability identifier
v_type           vnode type
v_tag            type of underlying data
v_mount          ptr to vfs we are in
v_mountedhere    ptr to mounted vfs
v_op             vnode operations
v_freef          vnode freelist forward
v_freeb          vnode freelist back
v_mountf         vnode mountlist forward
v_mountb         vnode mountlist back
v_buflists_lock  protect clean/dirty heads
v_cleanblkhd     clean blocklist head
v_dirtyblkhd     dirty blocklist head
.....
[Figure: v_mount and v_mountedhere point to mount structures, v_op to a vnodeops structure, the freelist and mountlist pointers to other vnode structures, and the blocklist heads to buf structures]
struct vnode (2)
Field                Description
-------------------  ----------------------------------------------------
....
v_ncache_time        last cache activity time
v_free_time          time on vnode free_list
v_output_lock        protect numoutput, outflag
v_numoutput          num of writes in progress
v_outflag            output flags
v_cache_lookup_refs
v_rdcnt              count of readers
v_wrcnt              count of writers
v_dirtyblkcnt        snapshot count of dirty blocks
v_dirtyblkpush       snapshot count of pushed blocks
v_un                 ptr to sock, dev specinfo, pipe
v_object             VM object for vnode
v_secops             vnode security ops
v_data[ ]            placeholder, private data
[Figure: v_object points to a vm_object structure and v_secops to a vnsecops structure]
Types of Vnodes
Type      Description
--------- ----------------------------------------------------
VNON      Allocated, but as-yet untyped vnode
VREG      Vnode representing a regular file
VDIR      Directory vnode
VBLK      Block device vnode
VCHR      Character device vnode
VLNK      Symbolic link vnode
VSOCK     UNIX domain socket vnode
VFIFO     FIFO special file vnode
struct vnodeops (1)
Operation      Function
-------------- ----------------------------------------------------
vn_lookup      Looks up a file
vn_create      Creates a regular file
vn_mknod       Creates a fifo or device special file
vn_open        Opens a file
vn_close       Closes a file
vn_access      Checks the access for a file
vn_getattr     Gets file attributes
vn_setattr     Sets file attributes
vn_read        Reads a file
vn_write       Writes to a file
vn_ioctl       Controls a device
vn_select      For synchronous I/O multiplexing
struct vnodeops (2)
Operation      Function
-------------- ----------------------------------------------------
vn_mmap        Map memory of a character device
vn_fsync       Synchronize file data and statistics
vn_seek        Sets position on a file
vn_remove      Removes a file
vn_link        Creates a hard link to a file
vn_rename      Renames a file
vn_mkdir       Creates a directory
vn_rmdir       Removes a directory
vn_symlink     Creates a symbolic link to a file
vn_readdir     Reads a directory
vn_readlink    Reads contents of a symbolic link
vn_abortop     Aborts operation
struct vnodeops (3)
Operation      Function
-------------- ----------------------------------------------------
vn_inactive    Sets inactive
vn_reclaim     Reclaims a vnode
vn_bmap        Maps to file system block
vn_strategy    Calls device strategy routine
vn_print       Prints the contents of an inode
vn_pgrd        Reads a page
vn_pgwr        Writes a page
vn_swap        Swap handler
vn_bread       Reads buffer
vn_brelse      Releases buffer
vn_lockctl     Provides file locking
vn_syncdata    Synchronizes range in open file
struct vnodeops (4)
Operation      Function
-------------- ----------------------------------------------------
vn_lock        Locks an inode
vn_unlock      Unlocks an inode
vn_getproplist Gets extended attributes
vn_setproplist Sets extended attributes
vn_delproplist Deletes extended attributes
vn_pathconf    Checks path
struct mount
Field             Description
----------------  ----------------------------------------------------
m_lock            Lock for synchronization (SMP)
m_flag            Flags
m_funnel          Flag for SMP
m_next            Next in mount list
m_prev            Previous in mount list
m_op              Operations on file system
m_vnodecovered    Vnode we are mounted on
m_mounth          List of vnodes this mount
m_vlist_lock      Lock for vnode list
m_exroot          Exported mapping for UID 0
m_uid             UID of mounter
m_stat            File system statistics
m_data            Private data
m_nfs_errmsginfo  NFS error information
m_unmount_lock    Lock for synchronization
[Figure: m_next and m_prev point to other mount structures, m_op to a vfsops structure, and m_vnodecovered and m_mounth to vnode structures]
struct vfsops
Operation      Function
-------------- ----------------------------------------------------
vfs_mount      Mounts the file system
vfs_start      Starts the file system
vfs_unmount    Unmounts the file system
vfs_root       Returns vnode for the root of the file system
vfs_quotactl   Performs operations associated with quotas
vfs_statfs     Updates file system statistics
vfs_sync       Synchronizes the file system
vfs_fhtovp     Returns the vnode pointer, given a file handle
vfs_vptofh     Returns a file handle, given a vnode pointer
vfs_init       Initializes the file system
vfs_mountroot  Mounts the root file system
vfs_smoothsync Gently syncs the file system
VFS Switch Table
  Identifies file system types that have been implemented
  Contains an entry point for file system operations for each supported file system type
    struct vfsops *vfssw[MOUNT_MAXTYPE];
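A switch table of this kind turns a file system type number into a set of operations. The following user-space sketch shows the idea: a mount request indexes the table, stores the vfsops pointer in the mount structure, and then calls through it. The structures are cut down to one member each, and the TOY_* constants, toy_* functions and the m_data string are hypothetical; the real kernel definitions are larger.

```c
#include <stddef.h>

#define TOY_MOUNT_MAXTYPE 4            /* hypothetical table size */
enum { TOY_MOUNT_NONE, TOY_MOUNT_UFS, TOY_MOUNT_NFS };  /* hypothetical type numbers */

struct mount;                          /* forward declaration */

struct vfsops {                        /* reduced to a single operation */
    int (*vfs_mount)(struct mount *mp, const char *path);
};

struct mount {
    struct vfsops *m_op;               /* operations on file system */
    const char    *m_data;             /* toy private data */
};

static int toy_ufs_mount(struct mount *mp, const char *path)
{
    (void)path;
    mp->m_data = "ufs";
    return 0;
}

static int toy_nfs_mount(struct mount *mp, const char *path)
{
    (void)path;
    mp->m_data = "nfs";
    return 0;
}

static struct vfsops toy_ufs_vfsops = { toy_ufs_mount };
static struct vfsops toy_nfs_vfsops = { toy_nfs_mount };

/* One slot per file system type; NULL marks a type that is not configured. */
static struct vfsops *toy_vfssw[TOY_MOUNT_MAXTYPE] = {
    NULL, &toy_ufs_vfsops, &toy_nfs_vfsops, NULL
};

/* Select the operations for the requested type, then dispatch. */
int toy_mount(struct mount *mp, int type, const char *path)
{
    if (type < 0 || type >= TOY_MOUNT_MAXTYPE || toy_vfssw[type] == NULL)
        return -1;
    mp->m_op = toy_vfssw[type];
    return mp->m_op->vfs_mount(mp, path);
}
```

Once m_op is set, later operations on that mount go through the same pointer without consulting the table again.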
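Setting Up File System Operations
[Figure: the mount structure's m_op pointer is taken from the vfssw table, whose entries (&ufs_vfsops, &nfs_vfsops, NULLPTR) point to vfsops structures; the UFS vfsops structure holds ufs_mount, ufs_start, ufs_unmount, ufs_root, ufs_quotactl]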
Mounted File System Structures
[Figure: the mount table is a list of mount structures starting at rootfs and linked by m_next and m_prev; m_data points to file system specific information; each vnode's v_mount points to its mount structure and its v_data holds file system specific file information]
Recording Mount Points: How are they mounted? (1)
[Figure: three file systems, A, B and C, and how they are mounted]
Recording Mount Points: How are they mounted? (2)
[Figure: vnode structs (VDIR; VDIR with VROOT) and mount structs starting at rootfs and linked by next pointers, connected through v_mountedhere, m_vnodecovered and m_mounth]
File System Operations
  namei()         Interprets a pathname
  mount()         Mounts a file system
  open()          Opens a file
  read()/write()  Reads or writes a file
Namei (1)
  VFS routine that maps pathnames to vnodes
    performs access checks on each component of that pathname
  Uses VOP_LOOKUP to move down the path
  Special cases
    Symbolic links
    Mount points
    Process-specific root (chroot())
  Special care - unmounts
Namei (2)
  An LRU hash table
    <parent-vnode, component-name> to <target-vnode, capabilities>
  Capabilities are "tags" assigned to vnodes
    prevent cache entries from referring to out-of-date associations
  Related data structures include:
    namecache - namei cache
    nchash - hash list for cache
    nchsize - size of cache
    nchsz - size of hash list
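The role of the capability tag can be shown with a small model. In the sketch below a cache entry remembers the target vnode's identifier at the time it was entered; if the vnode is later reused for another file, its identifier changes and the stale entry is rejected. This is a simplified illustration only: it omits the LRU list, keeps a single entry per hash bucket, and checks only the target's tag. All toy_* names and sizes are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define TOY_NCHSIZE 8                  /* hypothetical number of buckets */
#define TOY_NAMELEN 32                 /* longer names are not cached */

struct toy_vnode {
    unsigned long v_id;                /* capability identifier */
};

struct toy_ncentry {
    struct toy_vnode *dvp;             /* parent directory vnode */
    char              name[TOY_NAMELEN];
    struct toy_vnode *vp;              /* target vnode */
    unsigned long     vp_id;           /* target's v_id when entered */
    int               used;
};

static struct toy_ncentry toy_nchash[TOY_NCHSIZE];

static unsigned toy_hash(const struct toy_vnode *dvp, const char *name)
{
    unsigned h = (unsigned)((uintptr_t)dvp >> 4);
    while (*name != '\0')
        h = h * 31 + (unsigned char)*name++;
    return h % TOY_NCHSIZE;
}

/* Record <dvp, name> -> <vp, vp's current capability>. */
void toy_cache_enter(struct toy_vnode *dvp, const char *name,
                     struct toy_vnode *vp)
{
    if (strlen(name) >= TOY_NAMELEN)
        return;                        /* too long to cache */
    struct toy_ncentry *e = &toy_nchash[toy_hash(dvp, name)];
    e->dvp = dvp;
    strcpy(e->name, name);
    e->vp = vp;
    e->vp_id = vp->v_id;
    e->used = 1;
}

/* Return the cached target, or NULL on a miss or a stale entry. */
struct toy_vnode *toy_cache_lookup(struct toy_vnode *dvp, const char *name)
{
    struct toy_ncentry *e = &toy_nchash[toy_hash(dvp, name)];
    if (!e->used || e->dvp != dvp || strcmp(e->name, name) != 0)
        return NULL;
    if (e->vp_id != e->vp->v_id) {     /* vnode was reused since entry */
        e->used = 0;
        return NULL;
    }
    return e->vp;
}

/* Model of a vnode being reused for a different file. */
void toy_vnode_recycle(struct toy_vnode *vp)
{
    vp->v_id++;
}
```

Changing the identifier invalidates every cache entry that refers to the vnode without having to search the cache for them.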
namei() flow
[Flowchart]
  Steps: Start; copy name into local buffer; copy next component to buffer; call file system specific lookup routine VOP_LOOKUP(); Done
  Decisions:
    ".." ?            yes: find parent vnode
    Symbolic link?    yes: copy name to buffer
    Mounted on?       yes: find root vnode of mounted file system: VFS_ROOT()
    More components?  yes: continue with the next component; no: done
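The component-by-component walk, including the step down into a mounted file system, can be sketched over a toy in-memory tree. This is an illustration under simplifying assumptions, not the kernel algorithm: symbolic links, access checks, chroot and ".." out of a mounted root are all left out, toy_lookup stands in for VOP_LOOKUP(), and following m_root stands in for VFS_ROOT(). The toy_* names and the fixed child array are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define TOY_MAXCHILD 4

struct toy_mount;

struct toy_vnode {
    const char       *name;
    struct toy_vnode *parent;                  /* NULL for a root vnode */
    struct toy_vnode *children[TOY_MAXCHILD];  /* unused slots are NULL */
    struct toy_mount *v_mountedhere;           /* file system mounted here */
};

struct toy_mount {
    struct toy_vnode *m_root;                  /* root vnode of this file system */
    struct toy_vnode *m_vnodecovered;          /* vnode we are mounted on */
};

/* Stand-in for VOP_LOOKUP(): find one component in one directory. */
static struct toy_vnode *toy_lookup(struct toy_vnode *dvp,
                                    const char *name, size_t len)
{
    for (int i = 0; i < TOY_MAXCHILD && dvp->children[i] != NULL; i++) {
        const char *n = dvp->children[i]->name;
        if (strlen(n) == len && memcmp(n, name, len) == 0)
            return dvp->children[i];
    }
    return NULL;
}

/* Translate an absolute path to a vnode, or NULL if a component is missing. */
struct toy_vnode *toy_namei(struct toy_vnode *root, const char *path)
{
    struct toy_vnode *vp = root;

    while (vp != NULL) {
        while (*path == '/')
            path++;                            /* skip separators */
        if (*path == '\0')
            break;                             /* no more components */
        size_t len = strcspn(path, "/");       /* length of next component */

        if (len == 2 && memcmp(path, "..", 2) == 0)
            vp = vp->parent != NULL ? vp->parent : vp;
        else
            vp = toy_lookup(vp, path, len);

        /* Mounted on? Continue from the root of the mounted file system. */
        while (vp != NULL && vp->v_mountedhere != NULL)
            vp = vp->v_mountedhere->m_root;

        path += len;
    }
    return vp;
}
```

With a second file system mounted on /usr, looking up /usr returns the root vnode of the mounted file system rather than the covered directory, and /usr/bin is then resolved inside that file system.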
mount() flow
[Figure: call flow involving mount(), namei(), VFS_MOUNT (reaching ufs_mount() for UFS), vmountset() and the mount table]
open() flow
[Figure: call flow involving open(), falloc() and the file table, vn_open(), namei(), VOP_LOOKUP() and VOP_CREATE()]
read() and write() flow
[Figure: call flow read()/write() -> rwuio() -> getf() -> FOP_READ()/FOP_WRITE() -> vn_read() -> VOP_READ()/VOP_WRITE() -> ufs_read()/ufs_write(); tables shown: process descriptor table, file table, vnode table, file]
Source Reference (1 of 2)
  kernel/sys/users.h
    open file table ufile_state in struct utask
  kernel/sys/file.h
    defines a struct file and fileops
  kernel/vfs/vfs_vnops.c
    implementation of fileops for vnode file structs
  kernel/bsd/sys_socket.c
    implementation of fileops for socket file structs
  kernel/sys/vnode.h
    definition of vnode and vnodeops
Source Reference (2 of 2)
  kernel/sys/mount.h
    definition of struct mount and vfsops
  kernel/vfs/vfs_syscalls.c
    vfs_switch[]
  kernel/vfs/vfs_lookup.c
    implementation of namei()
  kernel/vfs/vfs_syscalls.c
    implementation of mount() and open() calls
  kernel/bsd/sys_generic.c
    implementation of read() and write() calls