Operating Systems COMP 4850/CISG 5550 File Systems Files Dr. James Money

File Systems

• We have three requirements for long-term storage of data:
  – We must be able to store a large amount of information
  – The information must survive the termination of the process
  – Multiple processes should be able to access the data concurrently

File Systems

• The solution involves storing the data on disks in units called files
• The information in files must be persistent, that is, not affected by the creation or destruction of processes
• These files must be managed by the OS
• The part of the OS that handles this is called the file system

Files

• There are several important aspects to the file system:
  – File Naming
  – File Structure
  – File Types
  – File Access
  – File Attributes
  – File Operations

File Naming

• The naming of files is an abstraction of the data on the disk
• We must shield the user from the details of how and where the data is stored
• In order to do this, we usually refer to files by name rather than by location or number

File Naming

• The exact rules vary from one file system to another
• All current OSes allow strings of one to eight letters as file names
• Digits are often permitted as well
• Sometimes even special characters are allowed
• Many file systems support names as long as 255 characters

File Naming

• Examples
  – cathy
  – bruce
  – 2
  – urgent!
  – Fig.2-14
  – supercalifragilicious

File Naming

• Some systems distinguish between uppercase and lowercase letters
  – UNIX, Linux
  – John, john, and JOHN are all different files
• Others do not
  – DOS, Windows 1, 2, 3, 95, 98, XP, Vista
  – John, john, and JOHN are all the same file
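Whether John, john, and JOHN name one file or three can be checked empirically. The following is a minimal sketch, not part of the original slides: the helper name `count_distinct_names` is made up for illustration, and the result depends on the file system that holds the temporary directory.

```python
import os
import tempfile

def count_distinct_names(names):
    """Create each name in a fresh temporary directory and count the entries."""
    with tempfile.TemporaryDirectory() as d:
        for name in names:
            # Append mode creates the file if it is missing and reuses it
            # if the file system treats the name as already present.
            with open(os.path.join(d, name), "a"):
                pass
        return len(os.listdir(d))

n = count_distinct_names(["John", "john", "JOHN"])
print(n)  # 3 on a case-sensitive file system, 1 on a case-insensitive one
```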

File Naming

• Many operating systems have two-part file names separated by a period
  – File name
  – File extension: usually indicates the program that created the file and the kind of data it contains
• UNIX does not enforce this convention, but names can include periods
• This also allows names such as file.tar.gz instead of just file.tar, file.gz, or file.tgz
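As an illustration of how a program might pick apart a multi-period name such as file.tar.gz, here is a small sketch using Python's standard library. The choice of Python and of these two helpers is an assumption of this example, not something the slides prescribe.

```python
import os.path
from pathlib import PurePosixPath

name = "file.tar.gz"

# splitext() splits at the last period only.
base, ext = os.path.splitext(name)
print(base, ext)  # file.tar .gz

# suffixes lists every period-separated part after the first component.
print(PurePosixPath(name).suffixes)  # ['.tar', '.gz']
```

On UNIX the extension is only a convention: the OS attaches no meaning to it, so it is up to each program to decide how to interpret the parts.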

File Structure

• Files can be structured in many ways:
  – Simple byte sequence
  – Set of records
  – Tree of data
• However, the file system usually sees a file only as a group of bytes with no structure

File Structure

• Treating files as just a byte sequence gives flexibility
• User programs can interpret the data in the file any way they want
• If the user wants to do unusual things, the OS does not get in the way
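To make the point about interpretation concrete, the sketch below reads the same eight bytes in two different ways. The byte values and format strings are invented for this example; the file system would store the bytes identically in either case.

```python
import struct

# Eight raw bytes, as the file system would store them.
raw = bytes([1, 0, 0, 0, 2, 0, 0, 0])

# Interpretation 1: two little-endian 32-bit unsigned integers.
as_ints = struct.unpack("<2I", raw)
print(as_ints)  # (1, 2)

# Interpretation 2: four little-endian 16-bit unsigned integers.
as_shorts = struct.unpack("<4H", raw)
print(as_shorts)  # (1, 0, 2, 0)
```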

File Structure

• The second choice is a record-based approach
• The records are fixed length with an internal structure
• Historically used when 80-column punched cards were common; files held 80-character records, or 132-character records for line printers
• No current general-purpose system works this way

File Structure

• The third type of file structure is a tree
• This is a tree of records, usually not all the same length
• Each record has a fixed-length key field at a fixed position
• The tree is sorted on this key field for rapid searching
• You refer to a record by this key field
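The idea of referring to a record by its key can be sketched as follows. This is only an approximation: it swaps in a sorted list with binary search where a real system would use an on-disk tree, and the record contents and the function name `find` are hypothetical.

```python
import bisect

# Hypothetical records: (key, data). The data fields differ in length.
records = sorted([
    ("cathy", b"third record, a bit longer"),
    ("bruce", b"second"),
    ("andrea", b"first record"),
])
keys = [key for key, _ in records]

def find(key):
    """Return the data stored under key, or None if the key is absent."""
    i = bisect.bisect_left(keys, key)  # binary search on the sorted keys
    if i < len(keys) and keys[i] == key:
        return records[i][1]
    return None

print(find("bruce"))  # b'second'
print(find("dave"))   # None
```

The caller never mentions a byte offset; keeping the records sorted on the key is what makes the search fast.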

File Types

• Most OSes have several types of files
  – Most of them support regular files and directories
  – Regular files contain user information
  – Directories are system files for maintaining the structure of the file system
  – Character special files are special I/O files for serial devices
  – Block special files are special I/O files for block devices
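On POSIX-style systems a program can ask which of these types a given path is. The sketch below is one possible way to do so with Python's `os.stat()` and the `stat` module; the function name `file_type` and the file name notes.txt are assumptions made for the example.

```python
import os
import stat
import tempfile

def file_type(path):
    """Classify path using the mode bits returned by os.stat()."""
    mode = os.stat(path).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character special file"
    if stat.S_ISBLK(mode):
        return "block special file"
    return "other"

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "notes.txt")
    with open(path, "w") as f:
        f.write("hello\n")
    print(file_type(path))  # regular file
    print(file_type(d))     # directory
```

On Linux, calling `file_type("/dev/null")` would typically report a character special file.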

File Types

• Regular files can contain either ASCII or binary data
• ASCII files consist of lines of text readable by humans
• Lines are terminated by a carriage return or a line feed
• DOS uses both (CR-LF)
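The difference between line terminators matters as soon as a program splits text into lines. A short sketch, with sample strings invented for illustration:

```python
unix_text = b"first line\nsecond line\n"
dos_text = b"first line\r\nsecond line\r\n"

# bytes.splitlines() recognizes LF, CR, and CR-LF as line boundaries.
unix_lines = unix_text.splitlines()
dos_lines = dos_text.splitlines()
print(unix_lines)  # [b'first line', b'second line']
print(dos_lines)   # [b'first line', b'second line']

# Splitting on LF alone leaves the carriage returns in DOS text.
naive = dos_text.split(b"\n")
print(naive)  # [b'first line\r', b'second line\r', b'']
```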

File Types

• ASCII files can be edited with any basic text editor
• It is easy to connect the output of one program to the input of another this way
• Interpretation of the data is easy in this format

File Types

• The other type of regular file is the binary file
• The data usually is not human readable
• Binary files have an internal structure
• This structure usually varies from one kind of file to another

File Types

[Figure: an executable file and an archive file]

File Access

• Early OSes had only one type of access: sequential access
• In sequential access, you could only read the bytes in a file in order
• There was no skipping around
• You could rewind to the beginning of the file
• This worked well with tapes

File Access

• When disks became more popular for storage, there was a need for out-of-order reading of data in files
• Files whose bytes can be read in any order are called random access files
• Random access is required by most applications

File Access

• There are two methods to specify where to read
  – Every read() call can specify a starting position
  – A special function called seek() can set the current position
• Older systems made you declare the type of file (sequential or random access) at creation
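The two ways of specifying where to read can be sketched with POSIX-style calls as exposed by Python's `os` module. This is an illustration under the assumption of a UNIX-like system (`os.pread()` is not available on Windows); the file name data.bin and its contents are made up.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "data.bin")
    with open(path, "wb") as f:
        f.write(b"0123456789")

    fd = os.open(path, os.O_RDONLY)
    try:
        # Method 1: the read call itself names the starting position.
        # os.pread() does not move the current position.
        first = os.pread(fd, 3, 5)

        # Method 2: seek sets the current position, then a plain read follows.
        os.lseek(fd, 2, os.SEEK_SET)
        second = os.read(fd, 4)
    finally:
        os.close(fd)

print(first)   # b'567'
print(second)  # b'2345'
```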

File Attributes

• In addition to its name and data, every file has other information associated with it
• For example, the creation and modification dates and the file's size
• These are called file attributes

File Attributes

[Figure: table of file attributes]

File Attributes

• The first four attributes relate to file protection and who may access the file
• The flags are bits, each controlling a particular property

File Attributes

• The record length, key position, and key length are only used by files whose records are referenced by keys
• The time attributes record when the file was created, last modified, and last accessed
• The current size tells how big the file is
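As a rough illustration of what such attributes look like to a program, the sketch below reads a few of them with Python's `os.stat()`. The fields shown are the ones POSIX-style systems expose, which only partly overlap with the attributes discussed above; the file name report.txt is hypothetical.

```python
import os
import tempfile
import time

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "report.txt")
    with open(path, "w") as f:
        f.write("twelve bytes")

    info = os.stat(path)
    size = info.st_size
    print(size)                        # 12
    print(oct(info.st_mode & 0o777))   # protection (permission) bits
    print(info.st_uid)                 # numeric id of the owner
    print(time.ctime(info.st_mtime))   # time of last modification
    print(time.ctime(info.st_atime))   # time of last access
```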

File Operations

• Recall that files exist to store information and retrieve it later
• The calls vary among the different OSes
• However, every OS provides functionality for the common operations listed on the following slides

File Operations

1. Create – The file is created with no data. This tells the OS that a file is coming and lets it set the file's attributes
2. Delete – Removes the file from the file system. The file has to be deleted to free up disk space
3. Open – Before the file can be read from or written to, you must open it. The open call may retrieve its attributes and disk locations

File Operations

4. Close – When access is done, the file should be closed. Sometimes this is forced by allowing a limited number of open files per process. Closing also forces the data to be written to disk if that has not already happened.
5. Read – Retrieves data from the file. The bytes usually come from the current position, and the number of bytes is specified.

File Operations

6. Write – Data is written to the file at the current position. If we are at the end of the file, the file size increases. If we are in the middle of the file, the data there is overwritten.
7. Append – This is a special form of write(). It adds data to the end of the file. This is the same as seeking to the end of the file and calling write()

File Operations

8. Seek – For random access files, repositions the current read/write position.
9. Get attributes – Retrieves the file attributes, for example the modification time. The result is typically a struct of values.

File Operations

10. Set attributes – Sets the attributes in a similar fashion to get attributes. Note that not all attributes are settable.
11. Rename – Allows you to rename a file after it has been created. Not strictly necessary, since we can copy the file and delete the old one.
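The eleven operations can be walked through in one short session. The sketch below uses the POSIX-style calls in Python's `os` module as a stand-in for "the OS's calls"; a UNIX-like system is assumed, and the file names draft.txt and final.txt are invented for the example.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as d:
    old = os.path.join(d, "draft.txt")
    new = os.path.join(d, "final.txt")

    # Create and open: O_CREAT makes an empty file, O_RDWR opens it.
    fd = os.open(old, os.O_CREAT | os.O_RDWR, 0o644)
    os.write(fd, b"hello world")     # write at the current position
    os.lseek(fd, 0, os.SEEK_SET)     # seek back to the start
    os.write(fd, b"HELLO")           # overwrite data inside the file
    os.lseek(fd, 0, os.SEEK_END)     # append: seek to the end, then write
    os.write(fd, b"!")
    os.lseek(fd, 0, os.SEEK_SET)
    data = os.read(fd, 100)          # read up to 100 bytes
    os.close(fd)                     # close

    size = os.stat(old).st_size      # get attributes
    os.chmod(old, 0o600)             # set attributes (protection bits)
    os.rename(old, new)              # rename
    renamed = os.path.exists(new) and not os.path.exists(old)
    os.remove(new)                   # delete
    deleted = not os.path.exists(new)

print(data)  # b'HELLO world!'
print(size)  # 12
```

Many systems also offer append as an open flag (`os.O_APPEND` here) instead of an explicit seek to the end.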

Example

• We consider an example program that copies one file to another.
• The program name is copyfile
• Its syntax is

copyfile file1 file2

which copies file1 to file2

Example

• If file2 exists, it is overwritten
• If file2 does not exist, it will be created
• file1 must exist
• There must be exactly two arguments
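The behavior described for copyfile can be sketched as follows. This is not the listing from the original slides, which did not survive extraction; it is one possible version, written in Python with the low-level `os` calls, and the buffer size and the function names `copyfile` and `main` are assumptions of this sketch.

```python
import os
import sys

BUF_SIZE = 4096  # bytes per read; an arbitrary choice for this sketch

def copyfile(src, dst):
    """Copy src to dst, creating dst or overwriting it if it exists."""
    in_fd = os.open(src, os.O_RDONLY)  # raises an error if src is missing
    try:
        out_fd = os.open(dst, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            while True:
                chunk = os.read(in_fd, BUF_SIZE)
                if not chunk:  # an empty result means end of file
                    break
                # os.write() may write fewer bytes than requested.
                while chunk:
                    written = os.write(out_fd, chunk)
                    chunk = chunk[written:]
        finally:
            os.close(out_fd)
    finally:
        os.close(in_fd)

def main(argv):
    """Return 0 on success, 1 if the argument count is wrong."""
    if len(argv) != 3:  # program name plus exactly two arguments
        print("usage: copyfile file1 file2", file=sys.stderr)
        return 1
    copyfile(argv[1], argv[2])
    return 0
```

Saved as a script, it could be driven by passing the result of `main(sys.argv)` to `sys.exit()`.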

Memory Mapped Files

• Many programmers feel that the prior way of working with files is cumbersome
• They would prefer to access a file the way they access memory
• One way to do this is to map a file to a virtual address range and unmap it when done
• map() is given a file and a starting virtual address
• unmap() usually needs just the address

Memory Mapped Files

• If a file of length 32 KB is mapped to virtual memory at address 512K, then any instruction that reads or writes an address between 512K and 544K refers to the file.
• Address 512K corresponds to byte 0 of the file, 513K to byte 1K of the file, and so on
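A mapping of this kind can be tried out with Python's `mmap` module. The sketch below is an illustration only: unlike the map() call described above, the caller does not pick the virtual address (the OS chooses it), so offsets into the mapping play the role of "address minus 512K". The file name mapped.bin is hypothetical.

```python
import mmap
import os
import tempfile

KB = 1024

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "mapped.bin")
    with open(path, "wb") as f:
        f.write(bytes(32 * KB))  # a 32 KB file of zero bytes

    with open(path, "r+b") as f:
        # Length 0 maps the whole file.
        with mmap.mmap(f.fileno(), 0) as m:
            mapped_len = len(m)
            m[1 * KB:1 * KB + 5] = b"hello"  # a memory write at offset 1K
            m.flush()

    # An ordinary read sees what was stored through the mapping.
    with open(path, "rb") as f:
        f.seek(1 * KB)
        seen = f.read(5)

print(mapped_len)  # 32768
print(seen)        # b'hello'
```

In the terms of the slide, the write at offset 1K corresponds to a store to address 513K.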