COMPUTER SCIENCE 4500 OPERATING SYSTEMS Module 2: OS Concepts: Processes and File Systems © 2017 Stanley Wileman Last update: 1/24/2017


In This Module… 2

!  Operating System Concepts
!  Introduction to Processes
!  Files and Data Sets
!  Name Binding, Catalogs and Directories
!  System Calls for Processes and Tasks
!  Sample Programs Using Processes

Documentation on System Calls 3

!  On loki, use man name, or man 2 name where name is the system call name.

!  You can also find many UNIX system calls described in the document at the end of the class web page.

!  Windows system calls are described at http://msdn.microsoft.com/library -- look at the API index, specifically for System Services and Data Access.

Operating System Concepts 4

!  Recall that we will study operating systems from two points of view. These are the views of
   !  Application writers, using the various system calls to achieve interaction with the system and its resources.
   !  OS implementers, who must write the code for the operating system that implements the system calls.

!  Let’s consider two major system call categories, those for processes and for file systems.

Introduction to Processes (Tasks) 5

!  A process, sometimes called a task, is essentially a packaged “unit of work” for a computer system. In many cases when you enter a command or double click on an icon, a command interpreter process (a “shell”) will create a new process to run the program.

!  Processes will always have
   !  an address space in which they can execute, and
   !  a set of processor registers to hold data, next instruction address, and so forth.

Processes 6

!  In a multiprogramming system, multiple processes (at least one per user) are being “virtually executed.” On a single processor, however, only one process can actually be executed at a time.

!  Thus, the process currently being executed is occasionally “freeze dried” and another process is executed for a while. This is called a context switch, since the processor’s context changes.

!  When a process is not being executed, salient details (e.g. CPU register contents and file pointers) are stored away in a process table.

Process Example: The CPU executes some instructions from process “A”… 7

Process A        Process B
LOAD A           LOAD K
ADD B            DIV E
STORE C          MPY T
DIV X            STORE N
ADD Z            CMP L
STORE Y          JNE LOOP
…                …

…saves process A’s context and executes some instructions from process “B”… 8

[Figure: the Process A and Process B instruction listings shown on slide 7.]

…then saves process B’s context, restores A’s context and executes more instructions. 9

[Figure: the Process A and Process B instruction listings shown on slide 7.]

System Calls for Processes 10

!  Key process-related system calls include those in the following categories:
   !  create or terminate a process;
   !  cause a process to wait, if necessary, until it can use a particular resource;
   !  manage the memory used by a process; and
   !  control the handling of asynchronous signals sent to a process.

!  We will later examine some of these system calls in detail, and use them in a program.

Process Tables 11

!  A process table has one entry for every process in the system.

!  Each process table entry contains information about process ownership, relation to other processes, process state, and statistical information.

!  Process tables are frequently organized as static arrays.
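
As a rough illustration of the points above, a process table kept as a static array might be sketched in C as follows. The field names, the table size NPROC, and the helper proc_alloc are all hypothetical, chosen only for illustration; a real kernel's table entry holds far more.

```c
#include <stddef.h>

#define NPROC 8                       /* hypothetical table size */

enum proc_state { PROC_UNUSED = 0, PROC_READY, PROC_RUNNING, PROC_BLOCKED };

struct proc_entry {
    int pid;                          /* unique process ID */
    int uid;                          /* owner of the process */
    int parent_pid;                   /* relation to other processes */
    enum proc_state state;            /* process state */
    unsigned long saved_regs[4];      /* register contents saved at a context switch */
    unsigned long cpu_ticks;          /* statistical information */
};

/* Static storage is zero-initialized, so every slot starts as PROC_UNUSED. */
static struct proc_entry proc_table[NPROC];
static int next_pid = 1;

/* Claim the first unused slot; return NULL when the table is full. */
struct proc_entry *proc_alloc(int uid, int parent_pid)
{
    for (size_t i = 0; i < NPROC; i++) {
        struct proc_entry *p = &proc_table[i];
        if (p->state == PROC_UNUSED) {
            p->pid = next_pid++;
            p->uid = uid;
            p->parent_pid = parent_pid;
            p->state = PROC_READY;
            p->cpu_ticks = 0;
            return p;
        }
    }
    return NULL;
}
```

A fixed-size array makes lookup and allocation simple, at the cost of a hard limit on the number of processes (here, a ninth call to proc_alloc returns NULL).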

Process Identification and Ownership 12

!  Each process in a system must be uniquely identified. In UNIX systems, this identification is provided by an integer called the process ID.

!  Each process is also owned by a particular user, identified by a user ID (uid).

!  One uid, associated with the “super user,” imparts special privileges (e.g. shutting down the system and reading arbitrary files) to processes. (OS/390 UNIX allows multiple “super users.”)

Files and Data Sets 13

!  Another major operating system function is to provide convenient access to information organized in files, or data sets. (The term data set is frequently used for files on mainframes.)

!  Files are named collections of information, usually stored on disk or magnetic tape.

!  Files can also be stored on other devices, including RAM disks and removable memory cards for use with portable devices (e.g. cameras and handheld computers).

File Systems 14

!  A file system is a storage structure (like a data structure, but normally tailored for a secondary storage device like a disk).

!  There are many different types of file systems, even for the same operating system. Speed, reliability, media type, and other factors influence the design of a file system. For example, CD-ROM file systems differ considerably from those for hard drives.

System Calls for File Manipulation 15

!  The usual system calls for manipulating files include the following:
   !  open: prepare a file for input/output
   !  read: copy data from file to memory
   !  write: copy data from memory to file
   !  close: conclude I/O operations on a file

!  Other system calls may be provided for things like file locking, random positioning, concurrent I/O, tape positioning, communication functions, and so forth.

!  We will later take a detailed look at a number of these system calls and use them in a program.
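
The four calls listed above can be seen working together in a small file-copy routine. This is only a sketch for UNIX systems; the function name copy_file, the buffer size, and the 0644 creation mode are assumptions made for the example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Copy src to dst; return the number of bytes copied, or -1 on failure. */
long copy_file(const char *src, const char *dst)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    int in = open(src, O_RDONLY);                 /* prepare for input */
    if (in < 0)
        return -1;
    int out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (out < 0) {
        close(in);
        return -1;
    }
    while ((n = read(in, buf, sizeof buf)) > 0) { /* file -> memory */
        ssize_t off = 0;
        while (off < n) {                         /* write may transfer fewer bytes than asked */
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0) {
                close(in);
                close(out);
                return -1;
            }
            off += w;
        }
        total += n;
    }
    close(in);                                    /* conclude I/O */
    close(out);
    return n < 0 ? -1 : total;
}
```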

Organizing Files 16

!  Even systems designed for use by a single user may have thousands of files. How are these files organized?

!  Most systems use a hierarchical naming scheme for files. A particular file is accessed by giving first the name at the top level of the hierarchy, then the second level name, and so forth, finishing with the name of the desired file.

Hierarchical Naming Examples 17

!  A mainframe computer might use names like STANW.SAMPLE.JOBS and STANW.SAMPLE.CLIST for data sets containing different types of sample code.

!  If these were stored in a UNIX file system, they might be /u/stanw/sample/jobs and /u/stanw/sample/clist.

!  Note that the basic concepts are the same. It’s much like describing a path through a tree, from the root to a particular node.

Binding 18

!  The term binding is used to indicate the mechanism by which a name is associated with an object.

!  For file systems, binding refers to the mechanism used to associate a name with a file.

!  There are many different binding techniques used in different file systems.

Binding Examples 19

!  The IBM MVS system uses catalogs, each of which stores pairs of the form (name, location), where name might be STANW.SAMPLE.JOBS and location would identify the physical location of the file.

!  UNIX systems use multiple directories to store this information, each directory similar to an interior node of a tree. Thus each directory entry (called a link) contains one component of a hierarchical name and a reference to the named object, which may itself be another directory.

Catalogs 20

STANW.SAMPLE.JOBS

STANW.SAMPLE.CLIST

.

.

.

Directories 21

home
   stanw
      csci4500
         args.c
         cls.c

Catalogs and Directories 22

!  We often say a catalog or directory contains a group of files, but this is only a conceptual idea.

!  As we have seen, a directory or catalog does not contain files, but rather contains a list of file names and other information.

!  Directories and catalogs are usually themselves stored in files, but with limited access for writing. This is because their organization must be carefully maintained, as it is crucial for the proper operation of the file system.

Common Filesystems 23

!  Some common file systems include:
   !  FAT (used with MS-DOS)
   !  NTFS (used with Windows NT)
   !  CDFS (used on CD-ROMs)
   !  UFS and EXT4 (used with UNIX/Linux systems)
   !  HFS (used with OS/390 UNIX)
   !  MVS (used with mainframe systems)

!  Each of these has a unique way of naming and storing files, and mapping from a file name to its storage. The functions they provide, however, are basically the same.

Hierarchical File System Conventions 24

!  The top directory (node) in a hierarchical file system tree is called, naturally, the root node or root directory.

!  This concept does not apply to the MVS file system. MVS may, however, have multiple catalogs, the first of which is called the master catalog.

!  The first component of a data set’s name in MVS is called the high level qualifier (HLQ).

Relative and Absolute Path Names 25

!  In many systems (UNIX, Windows, etc.), a unique file can be identified by giving a path to it, starting from the root directory. This is called an absolute, or complete, path to the file.

!  A relative path to a file starts at a working directory, which is just a directory identified as such by the user. A default working directory, called the user’s home directory, is identified at login time.

!  Absolute pathnames always begin with ‘/’; a relative pathname is one that does not start with ‘/’ (at least in Linux systems).

Example Pathnames 26

!  Absolute or complete paths
      /u/stanw/samples/jobs
      /usr/man/man2/fork.2

!  Equivalent relative paths (where cwd means the current working directory)
      jobs (with cwd of /u/stanw/samples)
      man2/fork.2 (with cwd of /usr/man)
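
The relationship between the relative and absolute forms above can be sketched in C. The helper names is_absolute and to_absolute are hypothetical, and the sketch assumes UNIX-style paths; it simply joins the working directory and the relative path, without interpreting components such as "." or "..".

```c
#include <stdio.h>

/* Nonzero if path is absolute in the UNIX sense (it begins with '/'). */
int is_absolute(const char *path)
{
    return path[0] == '/';
}

/* Write the absolute form of path into out, using cwd as the working
   directory for relative paths. Returns 0 on success, -1 if out is too small. */
int to_absolute(const char *cwd, const char *path, char *out, size_t outsize)
{
    int n;

    if (is_absolute(path))
        n = snprintf(out, outsize, "%s", path);
    else
        n = snprintf(out, outsize, "%s/%s", cwd, path);
    return (n < 0 || (size_t)n >= outsize) ? -1 : 0;
}
```

With cwd set to /usr/man, the relative path man2/fork.2 becomes /usr/man/man2/fork.2, matching the example pathnames.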

System Calls for Directories 27

!  Windows
   !  CreateDirectory, RemoveDirectory
   !  GetCurrentDirectory, SetCurrentDirectory

!  UNIX/Linux
   !  mkdir, rmdir
   !  link, unlink, mount, unmount
   !  chdir, chroot

Non-Hierarchical File Systems 28

!  Some file systems do not support hierarchical file naming (in the traditional manner).

!  For example, the CP/M operating system (the predecessor of MS-DOS) only allows each file to have 11 characters in its file name, in the common “8.3” format (e.g. computer.txt). There is only one directory (per disk), and that contains an entry for every file in the CP/M file system on that disk.

File Permissions 29

!  Most systems provide some protection for files to restrict who can read, write, create, or delete particular files.

!  UNIX systems specify protection with a 9-bit code: 3 bits for the “user” (the file’s owner), 3 bits for the user’s group, and 3 bits for everyone else.

!  Each group of 3 bits is interpreted as r w x. If r is set, then reading is allowed; when w is 1, writing is allowed; and when x is 1, execution is allowed.

!  The output of the ls -l command illustrates basic permission information.

Typical Output of ls -l Command 29.1

$ ls -l
total 20
-rw-rw---- 1 stanw stanw   37 Nov 15 15:00 Makefile
-rwxrwx--- 1 stanw stanw 8427 Nov 15 15:00 stuff
-rw-rw---- 1 stanw stanw  138 Nov 15 14:59 stuff.c
drwxrwx--- 2 stanw stanw 4096 Nov 15 15:06 subdir
$

Fields, left to right:
!  First col: - means regular file, d means directory
!  rwx: read, write, execute for owner, group, others
!  # of links to the file
!  Owner
!  Group
!  File size (bytes)
!  Date/time of last modification
!  Filename

© 2008 Stanley A. Wileman, Jr. Operating Systems

RACF (pronounced “Rack F”) 31

!  OS/390 provides optional security systems (like RACF) that provide security external to the file system. Each system call that seeks to access an object (e.g. a file) causes the operating system to query RACF to see if the requested access is permitted.

!  This global approach to providing security is different from (and much more comprehensive than) the file-system integral approach used by UNIX.

!  Look at the file perm.c on Loki for an example.

Access Control Lists 32

!  Windows and RACF also provide ACLs, or Access Control Lists.

!  An ACL is a list of pairs, each pair identifying a user or group of users, and the type of access permitted to the file.

!  Each file system object may have an associated ACL. RACF extends ACLs to virtually every object in the system.

Special Files 33

!  UNIX allows access to physical devices (for suitably privileged processes) using the same I/O system calls used for “regular” files.

!  Character special files are used to model devices that transfer one character at a time. Examples include serial ports, mice, keyboards, printers, etc.

!  Block special files model devices that transfer a block of data at a time (disks and tapes).

File Descriptors / Handles 34

!  Opening a file requires specification of the file name, or the path to the file. The type of access desired (e.g. read, write) is also indicated.

!  After the file is open, other system calls referencing the file use a reference to a data structure containing information about the open file; this reference is returned by the open call.

!  In UNIX, this reference is called a file descriptor. In Windows, it is called a handle. And in MVS it is a pointer to a data control block.

Files Automatically Open in UNIX Processes 35

!  Three (or more) files are usually available (opened automatically) when a process begins execution:
   !  Standard input (usually the keyboard)
   !  Standard output (usually the CRT)
   !  Standard error (usually also the CRT)
   !  Standard print (in MS-DOS and Windows)

High Level Language I/O 36

!  In a high-level language (like C or C++), application programs rarely directly use the reference returned by the open system call. Instead, they use a reference to a higher-level data structure managed by standard libraries provided by the language.

!  For example, in C, to read a character from the keyboard and store the result in variable c, we could use
   !  read(0, &c, 1); (system call)
   !  c = getchar(); (standard library function)

Pipes 37

!  A pipe is essentially a queue of bytes treated as an I/O device.

!  When a pipe is created, two file descriptors are created.
   !  The first file descriptor is used to read from the pipe.
   !  The second file descriptor is used to write to the pipe.

!  Data written into the pipe is conceptually written at the tail of a queue; reading from the pipe removes data from the head of the queue.

!  Pipes are frequently used for communication between processes.
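
The queue behavior described above can be tried out within a single process. In this sketch the function name pipe_roundtrip is hypothetical, and the message is assumed to be short enough to fit in the pipe's internal buffer (otherwise the write would block, since nothing is reading yet). Between processes, one process would normally hold the write end and another the read end.

```c
#include <string.h>
#include <unistd.h>

/* Send msg through a pipe and read it back into out.
   Returns the number of bytes read, or -1 on failure. */
long pipe_roundtrip(const char *msg, char *out, size_t outsize)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg);

    if (pipe(fd) < 0)
        return -1;
    if (write(fd[1], msg, len) != (ssize_t)len) {   /* enqueue at the tail */
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                   /* no writers left: reader sees EOF after the data */
    ssize_t n = read(fd[0], out, outsize);          /* dequeue from the head */
    close(fd[0]);
    return n;
}
```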

Implementation of System Calls 38

Q. Since the operating system code usually executes in kernel mode, and applications run in user mode, how can an application make a system call which requires the application to enter the kernel mode?

A. Special instructions are used that cause traps - just as in a divide by 0, but these traps indicate to the OS that a system call was made. While in kernel mode, the application can only execute the specified system call’s code, thus ensuring security.

Making System Calls from C 39

!  Rather than have users struggle with writing special trap instructions, libraries are included with C implementations that define functions that execute the appropriate trap instructions.

!  Example (UNIX):
      char buffer[80];
      int count;
      count = read(0, buffer, 80);
   The read function handles parameter passing and arranges for the appropriate trap instruction to be executed.

System Call Results 40

!  Most system calls are designed to return a distinctive value indicating failure. In UNIX, this is usually -1.

!  When a system call fails, an additional mechanism is available to identify the particular reason for failure:
   !  errno (a global variable in UNIX)
   !  GetLastError (a function call in Windows NT)
   !  a processor register’s contents in MVS
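
On UNIX, the pattern of checking for -1 and then consulting errno might look like the following sketch. The function name try_open is hypothetical; open, errno and strerror are the standard facilities.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Try to open path for reading.
   Returns 0 on success, otherwise the errno value describing the failure. */
int try_open(const char *path)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1) {
        int err = errno;            /* save it: later calls may overwrite errno */
        fprintf(stderr, "open %s: %s\n", path, strerror(err));
        return err;
    }
    close(fd);
    return 0;
}
```

Note that errno is meaningful only immediately after a call has reported failure; a successful call does not reset it.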

Command Interpreters (Shells) 41

!  Character-oriented interfaces to operating systems (as opposed to GUIs) utilize one or more programs that interpret a “command language,” frequently resulting in requests to create processes and execute programs.

!  These interpreters are normally called shells (e.g. sh, csh, bash, tcsh and ksh in UNIX).

!  Windows NT uses a program called cmd.exe, and MS-DOS uses command.com.

Typical Features of Shells 42

!  A user-configurable prompt is issued to indicate the shell is ready for the next command.

!  prog arg1 arg2 would be typed to request creation of a process for execution of the executable program in file prog (a relative path) with arguments arg1 and arg2 (more on arguments later).

More Shell Features 43

!  prog < input executes prog with its standard input redirected to the file input.

!  prog > output redirects standard output to the named file.

!  prog < input > output redirects both standard input and standard output.

!  prog1 | prog2 runs prog1 and prog2, with the standard output of prog1 piped to prog2’s standard input.

!  prog & runs prog concurrently with the shell.
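
Output redirection such as prog > output can be understood in terms of file descriptors: descriptor 1 (standard output) is made to refer to the named file. The sketch below shows one way this might be done with open, dup and dup2; the function names redirect_stdout and restore_stdout are hypothetical. A shell would typically do the redirection in the child process before executing the program, and would have no need to restore anything.

```c
#include <fcntl.h>
#include <unistd.h>

/* Point descriptor 1 (standard output) at path.
   Returns a saved copy of the old standard output, or -1 on failure. */
int redirect_stdout(const char *path)
{
    int saved = dup(STDOUT_FILENO);
    if (saved < 0)
        return -1;
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0) {
        close(saved);
        return -1;
    }
    if (dup2(fd, STDOUT_FILENO) < 0) {
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                      /* descriptor 1 now refers to the file */
    return saved;
}

/* Undo redirect_stdout using the descriptor it returned. */
int restore_stdout(int saved)
{
    int r = dup2(saved, STDOUT_FILENO);
    close(saved);
    return r < 0 ? -1 : 0;
}
```

The program being run needs no changes: it keeps writing to descriptor 1 and never learns where the bytes go.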

Other Approaches 44

!  Note that the mechanism by which an executable program is submitted for execution, and how external objects (e.g. files and terminals) are associated with the process, is usually independent of the system calls made by the process.

!  For example, in MVS, program execution is requested by statements in a data set (file) containing JCL (Job Control Language). The program itself, while unchanged, could have its execution requested by a command line in UNIX.

MVS Job Control Language 45

!  JCL statements usually have three fields: a label, a statement type, and parameters.

!  The JOB statement is used to define a job consisting of one or more job steps:
      //DOIT JOB ACCT#,'USERNAME'

!  The EXEC statement requests the execution of a program:
      //STEP1 EXEC PGM=COMPILER

!  The DD statement associates a “ddname” with a data set (the “ddname” is referenced inside the program):
      //SYSIN DD DSN=MY.SOURCE.PROGRAM

A Typical MVS Job 46

//DOIT JOB 12345678,'USERNAME'

// EXEC COB2UCLG

//COB2.SYSIN DD *

The COBOL program goes here

/*

//GO.OUT DD SYSOUT=*

//GO.IN DD DSN=MY.INPUT.DATA,DISP=SHR

The UNIX fork System Call 47

!  The UNIX system call fork() duplicates the process that executes it: memory contents, open file handles, quotas, permissions, etc.

!  In the parent (creating) process, fork() returns the unique process id (an integer) of the newly created process.

!  In the child process, fork() returns 0.
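
The two return values of fork() can be seen in a short sketch. The function name fork_and_collect is hypothetical, and the sketch also uses waitpid (a standard UNIX call not covered above) so that the parent can collect the child's exit status.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that exits immediately with the given code.
   Returns the exit status seen by the parent, or -1 on failure. */
int fork_and_collect(int code)
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;                  /* fork failed: no child was created */
    if (pid == 0)
        _exit(code);                /* child: fork() returned 0 */

    /* parent: fork() returned the child's process ID */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes continue from the same point after fork(); only the return value tells them apart.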

The UNIX execve System Call 48

!  The execve() system call replaces the core image (memory contents) of the currently executing process with that of another program.

!  Shells begin the execution of a program (specified on the command line) by first executing fork, and then arranging for the child process to exec the appropriate program.

The Windows CreateProcess System Call 49

!  The CreateProcess system call combines the effects of the UNIX fork and exec system calls.

!  CreateProcess creates a new process which begins executing a named program with a specified set of arguments.

The MVS ATTACHX System Call 50

!  The ATTACHX system call is used in MVS to create a new subtask (process) and have it execute a named program.

!  MVS System calls are usually made from code generated by assembler macros.

!  The ATTACHX macro places the address of a parameter block in a register, then executes the instruction “SVC 42” which invokes the operating system by causing a trap.

!  The parameter block includes the name of the program to be executed by the new task.

Passing Arguments to UNIX Processes 51

!  When a UNIX process begins execution, it receives command line parameters in a variable length array of pointers to character strings.

!  In C, the main function uses two (or three) arguments:
   !  An integer giving the number of arguments
   !  A pointer to an array of character strings which contains the arguments
   !  The third argument points to the environment (which we will cover later).

Sample Program Using Arguments 52

!  The program “pargs.c” illustrates a C program that references the arguments provided on the command line.

!  The program just prints the arguments.

!  If invoked as “pargs one two three”, the program will print the following:

      Number of arguments = 4
       Argument 0: <pargs>
       Argument 1: <one>
       Argument 2: <two>
       Argument 3: <three>

UNIX program: pargs.c 53

/*----------------------------------*/
/* Print the command line arguments */
/*----------------------------------*/
#include <stdio.h>

int main(int argc, char *argv[])
{
    int i;

    printf("Number of arguments = %d\n", argc);
    for (i = 0; i < argc; i++)
        printf(" Argument %d: <%s>\n", i, argv[i]);
    return 0;
}

Sample Program Using Fork/Exec 54

!  The program “prun.c” illustrates a C program that creates a new process that then executes a named program with arguments.

!  If invoked as “prun pargs one two”, the program will create a new process (using fork()) and terminate. The new process will then execute the pargs program (using execve()) which will print the expected output.

UNIX program: prun.c (part 1) 55

/*-------------------------------------------*/
/* Create a new process to execute a program */
/*-------------------------------------------*/
#include <stdio.h>
#include <stdlib.h>     /* exit */
#include <unistd.h>     /* fork, execve */

extern char **environ;

int main(int argc, char *argv[])
{
    char *args[10];
    int i;

    if (argc < 2 || argc > 10) {  /* error: too few (or too many) arguments */
        printf("Usage: prun <prog> <arg> ...\n");
        exit(1);
    }

    /* Eliminate argv[0] prior to use by execve */
    for (i = 1; i < argc; i++)
        args[i-1] = argv[i];
    args[i-1] = NULL;

UNIX program: prun.c (part 2) 55

    switch (fork()) {
    case -1:                        /* fork failed */
        perror("fork");
        return 1;
    case 0:
        /* inside the new process, execute the prog */
        execve(argv[1], args, environ);
        perror("execve");           /* if execve had error */
        return 1;
    }
    printf("prun complete.\n");
    return 0;
}

Compiling, linking, and running… 55.1

$ cc -o pargs pargs.c
$ cc -o prun prun.c
$ ./prun ./pargs one two three
prun complete.
$ Number of arguments = 4
 Argument 0: <./pargs>
 Argument 1: <one>
 Argument 2: <two>
 Argument 3: <three>

!  Note how the shell prompt ($) is displayed before the output from the pargs program.

!  This is because the shell was waiting for prun to terminate before prompting for the next command.

!  However, prun itself did not wait for the process it started (pargs) to terminate. Look at the prunw.c program for more details.
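
One way a parent could wait for the process it started is sketched below. This is not the prunw.c program, whose contents are not shown here; it is only an illustration using the standard waitpid call. The function name run_and_wait and the use of exit code 127 for a failed execve are assumptions made for the example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Run the program at path with argument vector argv, and wait for it.
   Returns the program's exit status, or -1 on failure. */
int run_and_wait(const char *path, char *const argv[])
{
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        execve(path, argv, environ);
        _exit(127);                 /* reached only if execve failed */
    }

    /* parent: block until the child terminates */
    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Because the parent does not return until the child has terminated, the child's output would appear before the next shell prompt.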

Next… 56

!  In the next module we’ll examine the process model in more detail.

!  We will examine the states in which processes may exist, and the causes for transitions between these states.

!  Threads will be considered as alternatives to processes.

!  And interprocess communication (IPC) will be introduced.