University of Massachusetts, Amherst • Department of Computer Science
Emery Berger, University of Massachusetts Amherst
Operating Systems, CMPSCI 377
Lecture 21: Distributed File Systems
Distributed File Systems
Most common use of distributed systems
Idea: Given set of disks attached to different nodes, share as if all were attached to every node
Examples:
  Edlab: one server, workstations on LAN
  AppleShare: nodes are servers with disk & client
Distributed File Systems: Issues
Naming & transparency
Remote file access
Caching
Server with state or without
Replication
Naming & Transparency
Issues:
  How are files named?
  Do filenames reveal location?
  Do filenames change if file moves?
  Do filenames change if user moves?
Transparency
Location transparency: filename does not reveal physical storage location
Location independence: filename need not change if file’s storage location changes
In practice:
  Most naming schemes do not have location independence
  Many have location transparency
Naming Strategies: Absolute Names
<machine name, pathname>
Examples: AppleShare, Windows NT
Advantages:
  Easy to find fully specified filename
  Easy to add & delete new names
  No global state
  Scales easily
Disadvantages:
  User must know complete name – aware of which files are local & which are remote
  File is location dependent (cannot move)
  Makes sharing harder
  Not fault-tolerant
Naming Strategies: Mount Points
Mount points (NFS – Network File System)
  Each host has set of local names for remote locations
  Mount table (/etc/fstab): specifies <remote pathname @ machine name, local pathname>
  At boot: bind local name to remote
Users refer to local pathnames
  NFS manages mapping
Mount Points: Pros & Cons
Advantages:
  Location transparent
  Remote name can change across reboots
Disadvantages:
  Single unified strategy hard to maintain
  Same file can have different names
NFS Example
Partial contents of /etc/fstab for Edlab:
/usr1/[email protected]:/var/spool/mail
/users/users1@elsrv1:/users/users1
/courses/cs300@elsrv3:/courses/cs300
/rcf/common@elsrv1:/exp/rcf/common
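The mount-table idea can be sketched in a few lines of Python. This is only an illustration of the mapping, not how NFS or a real /etc/fstab parser works: it assumes entries written in the slide's `remote pathname@machine:local pathname` notation, and the names `Mount`, `parse_entry`, and `resolve` are hypothetical.

```python
from typing import List, NamedTuple, Optional, Tuple


class Mount(NamedTuple):
    remote_path: str   # pathname on the server
    machine: str       # server that holds the files
    local_path: str    # name users see on this host


def parse_entry(line: str) -> Mount:
    """Parse 'remote_path@machine:local_path' (the notation used on the slide)."""
    remote, rest = line.split("@", 1)
    machine, local = rest.split(":", 1)
    return Mount(remote, machine, local)


def resolve(table: List[Mount], local_name: str) -> Optional[Tuple[str, str]]:
    """Map a local pathname to (machine, remote pathname).

    Picks the longest mount point that is a prefix of the name.
    Returns None when no mount point matches, i.e. the file is local.
    """
    best = None
    for m in table:
        prefix = m.local_path.rstrip("/")
        if local_name == prefix or local_name.startswith(prefix + "/"):
            if best is None or len(m.local_path) > len(best.local_path):
                best = m
    if best is None:
        return None
    suffix = local_name[len(best.local_path.rstrip("/")):]
    return best.machine, best.remote_path.rstrip("/") + suffix


table = [
    parse_entry("/users/users1@elsrv1:/users/users1"),
    parse_entry("/courses/cs300@elsrv3:/courses/cs300"),
    parse_entry("/rcf/common@elsrv1:/exp/rcf/common"),
]
print(resolve(table, "/exp/rcf/common/bin/tool"))
```

The user only ever types the local name; the table decides which machine and remote pathname it stands for, which is why the remote side can change without users noticing.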
NFS Example
Naming Strategies: Global Name Space
Single name space:
  No matter which node you are on, filenames remain the same
Examples:
  AFS (CMU’s Andrew File System)
  Sprite (Berkeley)
Client: gets filename structure from server(s)
When users access files, server sends copies to workstation, where they are cached
Global Name Space: Pros & Cons
Advantages:
  Naming – consistent
  Ensures all files are same regardless of where you login
  Late binding of names => moving them is easier
Disadvantages:
  Difficult for OS to keep files consistent (caching)
  Global name space may limit flexibility
  Performance issues
Distributed File Systems: Issues
Naming & transparency
Remote file access
Caching
Server with state or without
Replication
Remote File Access & Caching
Can access files:
  Remotely: returns results using RPC = remote service
  Transfer part of file, perform local access = caching
Caching issues:
  Where & when are file blocks cached?
  When are modifications propagated back to remote file?
  What happens when multiple clients cache same file?
Remote File Caching
Local disk:
  + Reduces access time (compared to remote)
  + Safe if node fails
  – Difficult to keep local copy consistent with remote copy
  – Requires client to have disk!
Local memory:
  + Quick access time
  + Works without disks
  – Difficult to keep local copy consistent with remote copy
  – Smaller cache size
  – Not fault-tolerant
Cache Update Policies
Write-through: write to remote disk
  + Reliable
  – Low-performance = remote service for all writes
Write-back: write only to cache
  Write to disk on evictions, periodic synch
  + Quick
  + Reduces network traffic (repeated writes to same block)
  – User machine crashes => data loss
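The difference between the two policies shows up most clearly when the same block is written repeatedly. The following is a minimal sketch, assuming an in-memory stand-in for the remote disk; `RemoteDisk`, `WriteThroughCache`, and `WriteBackCache` are hypothetical names, and eviction and periodic sync are reduced to an explicit `sync()` call.

```python
class RemoteDisk:
    """Stand-in for the server's disk; counts how many remote writes occur."""

    def __init__(self):
        self.blocks = {}
        self.remote_writes = 0

    def write(self, block_no, data):
        self.blocks[block_no] = data
        self.remote_writes += 1


class WriteThroughCache:
    """Every write updates the cache and goes to the remote disk immediately."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.disk.write(block_no, data)


class WriteBackCache:
    """Writes stay in the cache; dirty blocks reach the disk only on sync()."""

    def __init__(self, disk):
        self.disk = disk
        self.cache = {}
        self.dirty = set()

    def write(self, block_no, data):
        self.cache[block_no] = data
        self.dirty.add(block_no)

    def sync(self):
        # Stands in for eviction or a periodic flush.
        for block_no in sorted(self.dirty):
            self.disk.write(block_no, self.cache[block_no])
        self.dirty.clear()


wt_disk, wb_disk = RemoteDisk(), RemoteDisk()
wt, wb = WriteThroughCache(wt_disk), WriteBackCache(wb_disk)
for version in (b"v1", b"v2", b"v3"):
    wt.write(7, version)
    wb.write(7, version)
wb.sync()
print(wt_disk.remote_writes, wb_disk.remote_writes)
```

If the client crashed before `wb.sync()`, the write-back disk would hold none of the three updates, which is the data-loss risk listed above.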
Cache Consistency
Client-initiated consistency: client contacts server and checks consistency
  Can check on every access, at given intervals, or only upon opening a file
Server-initiated consistency: server detects potential conflicts, invalidates caches
  Server needs to know:
    which clients have cached which parts of which files
    which clients are readers & which are writers
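Both approaches can be sketched side by side. This is a simplified illustration under several assumptions: whole files are cached rather than parts, the client-initiated check happens only on open and uses a version number, and the server-initiated variant invalidates every cached copy on any write. The class and method names are hypothetical.

```python
class Server:
    def __init__(self):
        self.data = {}      # filename -> contents
        self.version = {}   # filename -> version counter
        self.cachers = {}   # filename -> clients holding a copy (server state)

    def read(self, name, client=None):
        # Remember the client only when it asks for invalidation callbacks.
        if client is not None:
            self.cachers.setdefault(name, set()).add(client)
        return self.data[name], self.version[name]

    def get_version(self, name):
        return self.version[name]

    def write(self, name, contents):
        self.data[name] = contents
        self.version[name] = self.version.get(name, 0) + 1
        # Server-initiated: tell every client with a cached copy to drop it.
        for client in self.cachers.pop(name, set()):
            client.invalidate(name)


class PollingClient:
    """Client-initiated: revalidate the cached copy on every open."""

    def __init__(self, server):
        self.server = server
        self.cache = {}     # filename -> (contents, version)

    def open(self, name):
        cached = self.cache.get(name)
        if cached is None or cached[1] != self.server.get_version(name):
            self.cache[name] = self.server.read(name)
        return self.cache[name][0]


class CallbackClient:
    """Server-initiated: trust the cache until the server invalidates it."""

    def __init__(self, server):
        self.server = server
        self.cache = {}

    def open(self, name):
        if name not in self.cache:
            self.cache[name] = self.server.read(name, client=self)
        return self.cache[name][0]

    def invalidate(self, name):
        self.cache.pop(name, None)


server = Server()
server.write("notes.txt", "v1")
poller, listener = PollingClient(server), CallbackClient(server)
poller.open("notes.txt")
listener.open("notes.txt")
server.write("notes.txt", "v2")
print(poller.open("notes.txt"), listener.open("notes.txt"))
```

The polling client pays a server round trip on every open, while the callback client pays nothing until a write happens, at the cost of the server keeping the `cachers` table.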
Case Study: Sun’s Network File System
NFS: standard for distributed UNIX file access
Designed to run on LANs
Nodes are both servers & clients
Servers have no state
Uses mount protocol to make global name local
  /etc/exports: lists local names server willing to export
  /etc/fstab: lists global names that local nodes import
    Corresponding global name must be in /etc/exports on server
NFS Implementation
NFS defines set of RPC operations for remote file access:
  1. Directory search, reading directory entries
  2. Manipulating links & directories
  3. Accessing file attributes
  4. Reading/writing files
Does not rely on node homogeneity
  Heterogeneous nodes support NFS mount & remote access protocols using RPC
Users may need to know different names depending upon which node they log on
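To make "servers have no state" concrete, here is a rough sketch of read/write operations in which every request carries everything the server needs (a file handle, an offset, and a count or data), so the server keeps no table of open files or per-client positions. This is not the NFS wire protocol; `StatelessFileServer`, `read_all`, and the handle `"fh1"` are hypothetical, and direct method calls stand in for RPC.

```python
class StatelessFileServer:
    """Keeps file contents only; no open-file table, no per-client offsets."""

    def __init__(self, files):
        self.files = files  # handle -> bytes

    def read(self, handle, offset, count):
        # The request itself says where to read; nothing is remembered afterwards.
        return self.files[handle][offset:offset + count]

    def write(self, handle, offset, data):
        old = self.files.get(handle, b"")
        padded = old.ljust(offset, b"\0")
        self.files[handle] = padded[:offset] + data + padded[offset + len(data):]
        return len(data)


def read_all(server, handle, chunk=4):
    """Client-side loop: the client, not the server, tracks the current offset."""
    offset, out = 0, b""
    while True:
        piece = server.read(handle, offset, chunk)
        if not piece:
            return out
        out += piece
        offset += len(piece)


server = StatelessFileServer({"fh1": b"hello world"})
print(read_all(server, "fh1"))
server.write("fh1", 0, b"J")
server.write("fh1", 0, b"J")  # a retried request leaves the file unchanged
print(server.files["fh1"])
```

Because each request names an absolute offset, repeating a request after a lost reply gives the same result, which is one commonly cited reason for designing the server this way.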
NFS Implementation