Kernel File Interface operating systems (or programming I/O in Unix)


Page 1: Kernel File Interface operating systems (or programming I/O in Unix)

Kernel File Interface

operating systems

(or programming I/O in Unix)

Page 2: Kernel File Interface operating systems (or programming I/O in Unix)

Input and Output

When programming in C on Unix, there are two very different I/O libraries you can use:

The C language libraries:
  o Buffered
  o Part of the C language
  o The basic unit is a FILE*

The Kernel I/O calls:
  o Unbuffered
  o System calls - not part of C
  o The basic unit is a file descriptor

Page 3: Kernel File Interface operating systems (or programming I/O in Unix)

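[Figure: three layers, top to bottom: Application Program, Standard I/O Library, Kernel. High level I/O (streams) connects the application to the Standard I/O Library; low level I/O (file descriptors) connects both the application and the Standard I/O Library to the Kernel.]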

Page 4: Kernel File Interface operating systems (or programming I/O in Unix)

Standard C I/O

Page 5: Kernel File Interface operating systems (or programming I/O in Unix)

As in C++, the fundamental notion used in doing I/O is the stream (but it is not an object as it is in C++ ... it is a data structure).

When a file is created or opened in C, the system associates a stream with the file.

When a stream is opened, the fopen( ) call returns a pointer to a FILE data structure. The FILE data structure contains all of the information necessary for the I/O library to manage the stream:
 * a file descriptor
 * a pointer to the I/O buffer
 * error flags
 * etc.

The original C I/O library was written around 1975 by Dennis Ritchie. Little has changed since then.

Page 6: Kernel File Interface operating systems (or programming I/O in Unix)

Standard Streams

Three streams are predefined and available to a process. These standard streams are referenced through the pre-defined FILE pointers stdin, stdout, and stderr. These pointers are defined in <stdio.h>.

Page 7: Kernel File Interface operating systems (or programming I/O in Unix)

Buffering I/O

One of the keys of the C I/O library is that I/O is normally buffered to minimize the number of read and write system calls (each one costs a switch into the kernel).

Fully Buffered: I/O takes place when a buffer is full. Disk files are normally fully buffered. The buffer is allocated by the I/O library itself by doing a malloc.

Line Buffered: I/O takes place when a new line character is encountered. Line buffering is used for terminal I/O. Note that I/O may take place before a new line character is encountered because of the size of the buffer.

Page 8: Kernel File Interface operating systems (or programming I/O in Unix)

Unbuffered: No buffering is done. Data is output immediately.

Page 9: Kernel File Interface operating systems (or programming I/O in Unix)

Most Unix systems default to the following:

 * Standard error is always unbuffered.
 * All streams referring to a terminal device are line buffered (stdin and stdout).
 * All other streams are fully buffered.

Page 10: Kernel File Interface operating systems (or programming I/O in Unix)

Flushing a Stream

You can force a stream to be flushed (all unwritten bytes are passed to the kernel):

#include <stdio.h>

int fflush (FILE *fp);

I've not seen an issue in Windows, but in Unix, you may not see output when you expect to if you don't flush the buffers.

Page 11: Kernel File Interface operating systems (or programming I/O in Unix)

fopen

#include <stdio.h>

FILE *fopen (const char *filename, const char *mode);

Returns: a pointer to the FILE structure holding the internal state information about the connection to the associated file. Returns a NULL pointer if the open fails.

filename: path to the file to be opened
mode: one of the mode strings below

  "r"    open text file for reading
  "rb"   open binary file for reading
  "w"    open text file for writing - truncate
  "wb"   open binary file for writing - truncate
  "a"    open text file for writing - append
  "ab"   open binary file for writing - append
  "r+"   open text file to read & write (file must exist)
  "rb+"  open binary file to read & write - ditto
  "w+"   open text file to read & write - truncate
  "wb+"  open binary file to read & write - truncate
  "a+"   open text file to read & write - append
  "ab+"  open binary file to read & write - append

When a file is opened for reading and writing:
 * input cannot immediately follow output without an intervening fflush, fseek, fsetpos, or rewind.
 * output cannot immediately follow input without an intervening fseek, fsetpos, or rewind.

Page 12: Kernel File Interface operating systems (or programming I/O in Unix)

Opening a Stream

Restriction                         r    w    a    r+   w+   a+
file must already exist             *              *
previous contents are discarded          *              *
stream can be read                  *              *    *    *
stream can be written                    *    *    *    *    *
stream can only be written at end             *              *

fopen gives you no way to set the permission bits when w or a creates a new file. On POSIX systems the file is created with read and write permission for everyone (0666), as modified by the process umask.

Page 13: Kernel File Interface operating systems (or programming I/O in Unix)

Example of using fopen

FILE *in;

if ((in = fopen("file1.txt", "r")) == NULL)
    perror("could not open file1.txt");

Page 14: Kernel File Interface operating systems (or programming I/O in Unix)

Related Calls

FILE *freopen (const char *pathname, const char *mode, FILE *fp);

Opens a specified file on a specified stream. Closes the stream first, if it is already open. Most typically used with stdin, stdout, and stderr to open a file as one of these streams.

FILE *fdopen (int filedes, const char *mode);

Takes a file descriptor as a parameter and associates an I/O stream with the descriptor. Used with pipes and network connections, because these use file descriptors.

Page 15: Kernel File Interface operating systems (or programming I/O in Unix)

fclose

#include <stdio.h>

int fclose (FILE *stream);

Returns zero if the close is successful. Otherwise it returns EOF.

All files are closed when the program terminates normally, but this allows no opportunity to do error recovery if termination is not normal. Therefore, it is recommended that all files be closed explicitly.

Page 16: Kernel File Interface operating systems (or programming I/O in Unix)

Binary I/O

Binary I/O is commonly used to read or write arrays or to read and write structures, because both deal with fixed size blocks of information.

Note: Binary files are not necessarily interchangeable across systems!
 * compilers change how data is packed
 * binary formats are different on different cpu architectures.

Page 17: Kernel File Interface operating systems (or programming I/O in Unix)

Unformatted I/O

There are three types of unformatted I/O:
 * Character at a time
 * Line at a time
 * Direct I/O (fread and fwrite for binary data)

Page 18: Kernel File Interface operating systems (or programming I/O in Unix)

Stream Positioning

#include <stdio.h>

long ftell (FILE *fp);
    returns the current byte offset, or -1L on error
    (a byte offset for binary files, and for text files on GNU systems)

int fseek (FILE *fp, long offset, int whence);
    whence is one of:
      SEEK_SET - from beginning of file
      SEEK_CUR - from the current position
      SEEK_END - from the end of the file
    returns 0 if successful, nonzero on error

void rewind (FILE *fp);

Page 19: Kernel File Interface operating systems (or programming I/O in Unix)

For portability across systems, including non-Unix ones, use:

int fgetpos (FILE *fp, fpos_t *pos);

int fsetpos (FILE *fp, const fpos_t *pos);

The position is passed in the pos parameter. Its type, fpos_t, is a data type defined by the ISO C standard. The position value in an fsetpos must have been obtained in a previous fgetpos call.

Both calls return 0 if successful.

Page 20: Kernel File Interface operating systems (or programming I/O in Unix)

fread

fread is used to read binary data and text in fixed sized blocks.

#include <stdio.h>

size_t fread (void *ptr, size_t size, size_t nblocks, FILE *stream);

Returns: the number of items read. It could be less than nblocks if there is an error or eof is reached.

ptr: address of where the first byte is to be stored
size: the size of each block or record
nblocks: the number of blocks to read
stream: the stream to read from

Page 21: Kernel File Interface operating systems (or programming I/O in Unix)

Interpreting Binary Data

If the data that you are reading has some record structure ...

struct record_fmt data_buf;
. . .

fread(&data_buf, sizeof(char), sizeof(data_buf), file_handle);

Page 22: Kernel File Interface operating systems (or programming I/O in Unix)

struct record_fmt
{
    int a;
    float b;
    char id[8];
    char pw[8];
};

[Figure: a block of raw bits from the file is copied into data_buf, so that each field of the struct takes its bytes from the file.]

cout << data_buf.id;

Page 23: Kernel File Interface operating systems (or programming I/O in Unix)

fwrite

#include <stdio.h>

size_t fwrite (const void *ptr, size_t size, size_t nblocks, FILE *stream);

Returns: the number of blocks written. If not the same as nblocks, some error has occurred.

ptr: address of the first byte to write
size: the size of each block or record
nblocks: the number of blocks to write
stream: the stream to write to

Page 24: Kernel File Interface operating systems (or programming I/O in Unix)

Character at a time Input

#include <stdio.h>

int fgetc (FILE *stream);

fgetc gets the next character in the stream as an unsigned char and returns it as an int. If an eof or an error is encountered, EOF is returned instead. This call is guaranteed to be written as a function.

Return value: an unsigned char that has been converted to an int. The constant EOF (usually -1) is returned if there is an error or if the end of the file is encountered.

Page 25: Kernel File Interface operating systems (or programming I/O in Unix)

Character at a time Input

int getc (FILE *stream);

Highly optimized; the best function for reading a single character. Usually implemented as a macro.

int getchar ( void );

Equivalent to getc(stdin).

Page 26: Kernel File Interface operating systems (or programming I/O in Unix)

In most implementations, each stream maintains

 * an error flag
 * an end-of-file flag

To distinguish between EOF and an error, call one of the following functions:

#include <stdio.h>

int ferror (FILE *fp);
    returns nonzero (true) if the error flag is set, otherwise returns 0

int feof (FILE *fp);
    returns nonzero (true) if the eof flag is set, otherwise returns 0

Clear the flags by calling

void clearerr (FILE *fp);

Page 27: Kernel File Interface operating systems (or programming I/O in Unix)

After reading a character from a stream, it can be pushed back into the stream.

#include <stdio.h>

int ungetc (int c, FILE *fp);

c: the character to push back. Note that it is not required that you push back the same character that you read.

You cannot push back EOF.

Implementations are not required to support more than a single character of pushback, so don't count on it.

Page 28: Kernel File Interface operating systems (or programming I/O in Unix)

Character Output

int fputc (int c, FILE *stream);

fputc converts c to an unsigned char and writes it to the stream. EOF is returned if there is an error.

int putc (int c, FILE *stream);

Optimized for single character output.

int putchar (int c);

Assumes stdout is the output stream.

Page 29: Kernel File Interface operating systems (or programming I/O in Unix)

Line at a Time Input

#include <stdio.h>

char *fgets (char *buf, int n, FILE *fp);

Reads up through and including the next newline character, but no more than n-1 characters. The buffer is terminated with a null byte. If the line is longer than n-1, a partial line is returned. The buffer is still null terminated. If the input contains a null, you can't tell.

Returns buf if successful and NULL on end of file or error.

char *gets (char *buf);

Warning: gets has been deprecated (and was removed from the language in C11) because it does not allow the size of the buffer to be specified. This allows buffer overflow!

Page 30: Kernel File Interface operating systems (or programming I/O in Unix)

String Output

#include <stdio.h>

int fputs (const char *str, FILE *fp);

Writes a null-terminated string to the stream. It does not write the null terminating character. It does not write a newline character. Returns EOF if the function fails.

int puts (const char *str);

Writes the null terminated string to standard-out, replacing the zero terminating character with a new-line character. If successful, the function returns a non-negative value. If the function fails, it returns EOF.

Page 31: Kernel File Interface operating systems (or programming I/O in Unix)

I/O Efficiency

Char at a time

#include <stdio.h>
#include <stdlib.h>     /* declares exit( ) */

int main (void)
{
    int c;

    while ((c = getc(stdin)) != EOF)
        if (putc(c, stdout) == EOF)
            perror("Error writing output");

    if (ferror(stdin))
        perror("Error reading input");

    exit(0);
}

At a terminal, EOF is signalled with ctrl-D.

Page 32: Kernel File Interface operating systems (or programming I/O in Unix)

Line at a time

#include <stdio.h>
#include <stdlib.h>     /* declares exit( ) */

#define MAXLINE 4096

int main (void)
{
    char buf[MAXLINE];

    while (fgets(buf, MAXLINE, stdin) != NULL)
        if (fputs(buf, stdout) == EOF)
            perror("Output Error");

    if (ferror(stdin))
        perror("Input Error");

    exit(0);
}

Page 33: Kernel File Interface operating systems (or programming I/O in Unix)

Timings for copying a file of 1.5M bytes in 30,000 lines:

Function        User CPU       Loop is executed
fgets, fputs    2.2 seconds    30,000 times
getc, putc      4.3 seconds    1.5M times
fgetc, fputc    4.6 seconds    1.5M times

Page 34: Kernel File Interface operating systems (or programming I/O in Unix)

Formatted Output

int printf (const char *format-spec, print-data ...);
    writes to stdout

int fprintf (FILE *fp, const char *format-spec, print-data ...);

int sprintf (char *s, const char *format-spec, print-data ...);
    writes to buffer and appends a null byte at the end.

A format-specification has the following format:

%[flags] [width] [.precision] type

%            introduces the format-spec

flags        -        left align, default is to right align
             +        prefix value with a sign
             0        pad output with zeros
             (blank)  prefix positive values with a blank

width        Minimum field width. If width is prefixed with 0, add zeros until minimum width is reached.

.precision   digits after decimal point. This can truncate data.

type         d   signed decimal integer
             i   signed decimal integer
             u   unsigned decimal integer
             o   unsigned octal integer
             x   unsigned hex integer
             f   double in fixed point notation
             e   double in exponent notation
             c   single character, an int
             s   a string

Page 35: Kernel File Interface operating systems (or programming I/O in Unix)

Example Format Specification

"%-10.8f"

 * %    introduces the format specification
 * -    left justify the output
 * 10   output field is 10 chars wide as a minimum. Padded if fewer characters in the output. Data is never truncated.
 * .8   print 8 digits after the decimal point

Page 36: Kernel File Interface operating systems (or programming I/O in Unix)

Example

int n = 3;
double cost_per_item = 3.25;

printf("Cost of %3d items at $%4.2f each = $%6.2f\n", n, cost_per_item, n*cost_per_item);

Output:

Cost of   3 items at $3.25 each = $  9.75

 * first field is 3 characters wide; data is right justified
 * second field is 4 characters wide with two characters after decimal point
 * third field is 6 characters wide with 2 characters after decimal point, right justified

Page 37: Kernel File Interface operating systems (or programming I/O in Unix)

Formatted Input

#include <stdio.h>

int scanf (const char *format-spec, data-fields ...);

int fscanf (FILE *fp, const char *format-spec, data-fields ...);

int sscanf (const char *buf, const char *format-spec, data-fields ...);

Page 38: Kernel File Interface operating systems (or programming I/O in Unix)

scanf reads formatted data from stdin into the data fields given in the argument list. Each argument must be a pointer to a variable that corresponds to a type specifier in the format specification.

The format specification can contain:

* white space characters. A white space character causes scanf to read in, but not store all consecutive white space characters in the input stream, up to the next non-white space character.

* non-white space characters, except % sign. Causes scanf to read but not store a matching non-white space character. If the character does not match, scanf terminates.

* format specification, introduced by %. Causes scanf to read in and convert characters in the input into values of the specified type.

The resulting value is assigned to the next data field in the arg list.

Page 39: Kernel File Interface operating systems (or programming I/O in Unix)

Temporary Files

#include <stdio.h>

FILE *tmpfile (void);

Creates a temporary file (type wb+) that is automatically deleted when the file is closed or the program terminates.

Page 40: Kernel File Interface operating systems (or programming I/O in Unix)

Sample Program

Write a simple version of the cat command.
 - It takes an optional parameter, a file name.
 - It copies the file to stdout.
 - If no file name is given, it copies stdin to stdout.

Page 41: Kernel File Interface operating systems (or programming I/O in Unix)

Preliminaries

#include <stdio.h>
#include <stdlib.h>

#define LINELEN 256

void send_to_stdout (FILE *);

 * <stdio.h>: header file for I/O. It also defines NULL.
 * <stdlib.h>: declares exit( ), which the program calls on errors.
 * C programmers use #define to define constants. It is a macro ... the value 256 gets inserted wherever the name LINELEN appears in the code. There is no type checking!
 * The last line is the function prototype.

Page 42: Kernel File Interface operating systems (or programming I/O in Unix)

Main declaration

int main (int argc, char *argv[ ])
{
    . . .
}

argc: the number of arguments on the command line
argv: array containing the command line arguments

Page 43: Kernel File Interface operating systems (or programming I/O in Unix)

Body of main

int main (int argc, char *argv[ ])
{
    FILE *fp;

    if (argc == 1)
        send_to_stdout (stdin);

Declare a FILE* to hold the file handle.

If there is just one command line argument, it is the command itself. Copy from stdin.

Page 44: Kernel File Interface operating systems (or programming I/O in Unix)

int main (int argc, char *argv[ ])
{
    FILE *fp;

    if (argc == 1)
        send_to_stdout (stdin);
    else if (argc == 2)
    {
        if ((fp = fopen(*++argv, "r")) != NULL)
        {
            send_to_stdout (fp);
            fclose (fp);
        }

If there are two command line arguments, the second one is the file name.

Page 45: Kernel File Interface operating systems (or programming I/O in Unix)

int main (int argc, char *argv[ ])
{
    FILE *fp;

    if (argc == 1)
        send_to_stdout (stdin);
    else if (argc == 2)
    {
        if ((fp = fopen(*++argv, "r")) != NULL)
        {
            send_to_stdout (fp);
            fclose (fp);
        }
        else
        {
            perror("could not open the file.");
            exit(1);
        }

Handle the error where the file won't open.

Page 46: Kernel File Interface operating systems (or programming I/O in Unix)

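        else
        {
            perror("could not open the file.");
            exit(1);
        }
    }
    else
    {
        fprintf(stderr, "Invalid command - too many arguments\n");
        exit(1);
    }
    return 0;
}

Handle the case where there are too many arguments on the command line. This message goes to stderr with fprintf rather than perror, because no library call has failed and errno holds nothing relevant.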

Page 47: Kernel File Interface operating systems (or programming I/O in Unix)

send_to_stdout function

void send_to_stdout (FILE *fp)
{
    char line[LINELEN];

    while (fgets (line, LINELEN, fp))
    {
        if (fputs (line, stdout) == EOF)
        {
            perror("Write to stdout failed");
            exit(1);
        }
    }
}