File I/O

Types of files
Command line arguments
File input and output functions
Binary files
Random access

Introduction


Data stored in main memory does not persist.
Most programs require that the user be able to store and retrieve data from previous sessions, in persistent storage such as a hard disk.
There are many forms of persistent storage, which suggests that the low-level processes for accessing them are different.
A file is a high-level representation that allows us to ignore low-level details.

Reading Files

Files have formats: a set of rules that determine the meaning of their contents.
To read a file:
▪ Know (or find) its name
▪ Open it for reading
▪ Read in the data in the file
▪ Close it
There are similar processes for writing to files.

Files

A file represents a section of storage.
Files are viewed as contiguous sequences of bytes which can be read individually.
In reality a file may not be stored sequentially and may not be read one byte at a time; such details are the responsibility of the OS.
C has two ways to view files: text and binary.

Text and Binary Files

There is a distinction between text files and binary files: text files store all data as text, whereas binary files store the underlying binary representation.
In addition, C allows for both text and binary views of files.
Usually the binary view is used with binary files.

Low Level I/O

In addition to types of files and views of files, C has a choice of I/O levels.
Low-level I/O uses the fundamental I/O services provided by the OS.
Standard high-level I/O uses a standard package of library functions.
ANSI C only supports standard I/O, since no single low-level model can represent the I/O of every OS.
We will only look at standard I/O.

Text Files

Standard Files

C automatically assigns input and output to standard files for some I/O functions, e.g. getchar(), gets(), scanf(), printf(), puts().
There are three standard devices for I/O:
▪ Standard input is set to the keyboard
▪ Standard output is set to the display
▪ Standard error is set to the display
Redirection causes other devices or files to be used for standard input or output.

Redirection

The input or output of a C program can be redirected to a file when the program is run at the command prompt.
▪ < file redirects standard input so that it is read from the named file
▪ > file redirects standard output so that it is written to the named file
To redirect input for a program called a3 to a file called a3test:
    ./a3 < a3test
▪ ... which is what we are doing when marking some of your assignments ...

Command Line Arguments

Command line arguments are additional input for programs.
For example, gcc takes a number of command line arguments, such as in gcc -o hello helloworld.c
Our C programs can also take command line arguments.
Another form of declaring the main function gives the main function two arguments:

    int main(int argc, char* argv[])

argc: the number of arguments; short for argument count. The count should be one more than the number of additional arguments, since the command name is counted too.
argv: an array of strings (the arguments). The first string is the name of the command, the remaining strings are the additional arguments.

Using Command Line Arguments

The value of argc is derived; it does not have to be entered by the user.
The first element of argv is the name of the executable (the program name), on most systems.
The second and subsequent elements of argv are the arguments, in the order in which they were entered.

Counting Characters

Write a program to count the number and types of characters in a file.
The file to be read will be given as a command line argument to the program.
The program will exit under two conditions:
▪ The wrong number of arguments is given to the program
▪ The file cannot be opened

Counting Characters – includes

    #include <stdio.h>
    #include <stdlib.h>   // required for exit
    #include <ctype.h>    // required for character comparisons

    // Forward Declarations
    void countCharacters(FILE* fp, char* fName);

Counting Characters – main 1

    int main(int argc, char* argv[]){
        FILE* fp;
        // Test the number of arguments
        if(argc != 2){
            printf("%s requires file name\n", argv[0]);
            exit(1);
        }

first test to make sure the user has entered the command correctly
argc is the size of argv; 2 means the command and one argument
argv[0] is printed rather than a fixed name because in Unix (or Linux) the program can be given different names

Counting Characters – main 2

    int main(int argc, char* argv[]){
        // ...
        // Attempt to open file
        if((fp = fopen(argv[1], "r")) == NULL){
            printf("Cannot open %s\n", argv[1]);
            exit(1);
        }

        countCharacters(fp, argv[1]);
        fclose(fp);
        return 0;
    }

then attempt to open the file for reading
argv[1] is the argument, which should be a file name
fopen returns NULL if the file cannot be opened
countCharacters processes the file
EXIT_FAILURE typically has the value 1; note that exit will exit the program from any function

Counting Characters Function – 1

    // Prints the count of characters in a file, by:
    //   alpha
    //   digits
    //   whitespace
    //   other
    // PRE: fp can be opened and read
    // PARAM: fp is a pointer to a file to be read
    // PARAM: fName is the name of the file, used in the output
    void countCharacters(FILE* fp, char* fName){
        int alpha = 0;
        int digits = 0;
        int white = 0;
        int other = 0;
        int total = 0;
        int ch;   // int rather than char so that EOF can be detected

documentation and variable declarations
note the pre-condition is documented

Counting Characters Function – 2

    void countCharacters(FILE* fp, char* fName){
        // ...
        // Read file one character at a time
        while((ch = getc(fp)) != EOF){
            if(isalpha(ch)){
                alpha++;
            }else if(isdigit(ch)){
                digits++;
            }else if(isspace(ch)){
                white++;
            }else{
                other++;
            }
        }
        total = alpha + digits + white + other;

go through the file one character at a time, processing each character until end-of-file
it’s an if ... else if ... else statement to minimize comparisons and to ensure that other is counted correctly

Counting Characters Function – 3

    void countCharacters(FILE* fp, char* fName){
        // ...
        // Print number of characters
        printf("%s contains %d characters\n", fName, total);
        printf("%d letters\n", alpha);
        printf("%d digits\n", digits);
        printf("%d whitespace\n", white);
        printf("%d other\n", other);
    }

and then print the number of characters, followed by the count of each type of character

Counting Characters Output

[Screenshot: a sample run of the program. Annotations on the screenshot: changes directory to the directory containing the .exe; it’s a Word document; no such file; no file name argument]

Discussion

There is no need to use command line arguments with the preceding program; it’s just an example of using them.
A different version of the program could allow the user to process multiple files, with a loop that ended the program when the user wanted.
In that case it would not make sense to have the file name as a command line argument.

Opening Files with fopen

The fopen function is used to open files.
It returns a pointer to a FILE structure.
The FILE structure is defined in stdio.h and contains data about the file.
If the file cannot be opened fopen returns the null pointer.
fopen takes two string arguments:
▪ The name of the file to be opened
▪ The mode in which the file is to be opened

File Modes

Mode           Meaning
"r"            opens text file for reading
"w"            opens text file for writing, overwrites existing files, creates new files
"a"            opens text file for writing, appends to the end of existing files
"r+"           opens text file for update (both reading and writing)
"w+"           opens text file for update (both reading and writing), overwrites existing files, creates new files
"a+"           opens text file for update (both reading and writing), the whole file can be read but writing only appends to the end of the file
"rb", "wb", ...  the same as the preceding modes except that it uses binary rather than text mode

Character I/O

The functions getc and putc can be used for character-based file I/O.
They are similar to getchar and putchar except that they require a file argument.
The getc function will return the EOF value if it has reached the end of a file.
To avoid trying to process empty files, check for EOF before processing the first character.

Closing Files

Files should be closed when finished with, using the fclose function, which takes a file pointer.
The fclose function flushes buffers as required, and allows the file to be correctly opened again.
The fclose function returns 0 if a file was closed successfully and EOF if it was not.
Files can be unsuccessfully closed if the disk is full or if their drive is removed.

File I/O Functions

There are file I/O functions similar to the I/O functions we’ve been using.
Each function takes a FILE pointer
▪ Which could be stdin or stdout if input is to be from the keyboard, or output to the display

fprintf, fscanf and rewind

The fprintf and fscanf functions work just like printf and scanf except with files.
The file pointer is an additional first argument
▪ The file pointer is the last argument for putc
The rewind function moves the file position back to the start of the file.

fgets and fputs

The fgets function is used for string input:
▪ The first argument is the address of a string
▪ The second is the maximum length of the string, including the terminating null character
▪ The third is the file from which input is read
fgets returns NULL when it encounters an EOF.
The fputs function is used for string output:
▪ It has arguments for a string and a file pointer
▪ It does not append a newline when it prints, unlike puts which does

Append Names to a File 1

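    #include <stdio.h>
    #include <string.h>

    const int FNAME_LEN = 20;
    const int NAME_MAX = 40;

    int main(){
        char fname[FNAME_LEN];
        char name[NAME_MAX];
        FILE* fp;

        printf("Enter the name of the file: ");
        // fgets is used rather than gets, which cannot limit the input length
        fgets(fname, FNAME_LEN, stdin);
        fname[strcspn(fname, "\n")] = '\0';   // remove the newline kept by fgets

        fp = fopen(fname, "a+");

FNAME_LEN and NAME_MAX are the maximum lengths of file names and names
a+ opens the file for appending and reading, and will create a new file if fname does not exist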

Append Names to a File 2

    int main(){
        // ...
        puts("Enter names to add to the file");

        // fgets keeps the newline, so an empty line is just "\n"
        while(fgets(name, NAME_MAX, stdin) != NULL && name[0] != '\n'){
            fprintf(fp, "%s", name);
        }

puts prints a newline
fprintf is similar to printf, and can be used to format numeric values
the loop adds words to the end of the file
the while loop continues until the user enters an empty line (presses enter twice in sequence)

Append Names to a File 3

int main(){

        // ...
        puts("File contents\n");
        rewind(fp);
        while(fgets(name, NAME_MAX, fp) != NULL){
            printf("%s", name);
        }

        fclose(fp);
        return 0;
    }

then print the entire contents of the file
rewind goes back to the start of the file
fgets is used instead of fscanf since names consist of two words, and fscanf with %s stops at whitespace

Append Names to a File 4

[Screenshot: a sample run of the program. Note that the new names have been appended to the existing file rather than overwriting the file.]

Binary Files

Storing Numeric Data

All of the examples have involved string and character storage.
Consider storing numeric data: storing integers is straightforward, but what about storing floating point values?
We could use fprintf for floating point values
▪ e.g. fprintf(fp, "%f", num);
But this entails making decisions about the format specifier.

Storing Bytes

If fprintf stores numeric values they are converted to characters and stored as text.
This may waste space if the number contains many digits (e.g. 1.0/3).
Or may lose precision if the format specifier is used to fix decimal places
▪ fprintf(fp, "%.2f", 1.0/3);
An alternative is to store the same pattern of bits used to represent the value.

Binary File

A binary file stores data using the same representation as a program: numeric data are not converted to strings.
The functions fread and fwrite are used for binary I/O.
They are a little more complex than text file functions, since they require information about the size of data to be stored.

Function Prototype for fwrite

    size_t fwrite(const void* ptr, size_t size, size_t nmemb, FILE* fp)

ptr: address of the first memory location to be written
size: the size of each variable, in bytes
nmemb: the number of variables
fp: file pointer
size_t is the unsigned integer type returned by sizeof; it is defined in terms of other C standard types.

Use of fwrite

The complex structure of fwrite allows it to store entire arrays in one function call

    double temperatures[365];
    fwrite(temperatures, sizeof(double), 365, fp);

The return value of fwrite is the number of items successfully written to the file.
This should equal the nmemb parameter.

fread

The fread function takes the same set of arguments as fwrite.
The ptr argument is the address in memory to read the data into.
fread should be used to read files that were written using fwrite

    double temperatures[365];
    fread(temperatures, sizeof(double), 365, fp);

Random Access

It may be useful to move to a particular location in a file without reading the preceding part of the file, like accessing an element of an array.
This is known as random access.
The fseek and ftell functions allow random access to files.
They are usually used with binary files.

fseek

The fseek function has three arguments:
▪ A file pointer to the file
▪ An offset indicating the distance to be moved from the starting point
▪ The mode which identifies the starting point
  ▪ SEEK_SET – the beginning of the file
  ▪ SEEK_CUR – the current position
  ▪ SEEK_END – the end
fseek returns 0 normally and a nonzero value (typically -1) for an error, such as an attempt to move to a position before the start of the file.

ftell

The ftell function returns the current position in a file, as a long: the number of bytes from the start of the file.
The behaviour of fseek and ftell may differ based on the OS.
Since the distance that fseek moves is measured in bytes they are normally used for binary files.
ANSI C introduced fgetpos and fsetpos for use with larger file sizes.

Binary File Example

This example creates an array of random values and writes them to a binary file.
The user is then asked for an index value.
The program finds and prints the value with that index in the file using fseek and fread.

Writing and Reading an Array 1

    #include <stdio.h>
    #include <stdlib.h>

    #define ARR_SIZE 100   // length of the array

    int main(){
        // declarations
        double numbers[ARR_SIZE];
        double value;
        int i;
        long pos;
        char* fname = "numbers.dat";
        FILE* fp;

Writing and Reading an Array 2

int main(){

        // ...
        // Create a set of double values
        for(i = 0; i < ARR_SIZE; ++i){
            numbers[i] = i + (double)rand() / RAND_MAX;
        }

create the array to be written to the file
rand and RAND_MAX are defined in stdlib.h
this is probably unnecessarily complicated but it produces an ordered array of doubles with digits to the right of the decimal point

Writing and Reading an Array 3

int main(){

        // ...
        // Open file for writing
        if((fp = fopen(fname, "wb")) == NULL){
            fprintf(stderr, "Could not open %s.\n", fname);
            exit(1);
        }

        // Write array in binary format
        fwrite(numbers, sizeof(double), ARR_SIZE, fp);
        fclose(fp);

"wb" opens the file for writing a binary file
fwrite writes the array to the file; its arguments are the array, the size of each value, the number of values, and the file pointer

Writing and Reading an Array 4

int main(){

        // ...
        // Open file for reading
        if((fp = fopen(fname, "rb")) == NULL){
            fprintf(stderr, "Could not open %s.\n", fname);
            exit(1);
        }

open file for reading
"rb" opens the file for reading a binary file

Writing and Reading an Array 5

int main(){

        // ...
        // Read array elements as requested
        printf("Enter index in range 0 to %d: ", ARR_SIZE-1);
        scanf("%d", &i);
        while(i >= 0 && i < ARR_SIZE){
            pos = (long) i * sizeof(double);    // position in file to be read
            fseek(fp, pos, SEEK_SET);           // move to position
            fread(&value, sizeof(double), 1, fp);   // binary read
            printf("value at index %d = %.2f\n", i, value);
            printf("Enter index (out of range to quit): ");
            scanf("%d", &i);                    // get next position
        }
        fclose(fp);
        return 0;
    }

read values from the file

Writing and Reading an Array 6

[Screenshot: a sample run of the program. Note that the binary file is not comprehensible by humans.]