Unix Fundamentals

Unix Shell ScriptingModule 2 General Purpose Utilities 11
Module 3 Working with Directories & Files 21
Module 4 The Shell 44
Module 5 vi Editor 52
Module 6 File permissions 70
Module 7 File Comparison 77
Module 8 The process 81
Module 9 Filters 91
*
*
*
*
What is UNIX?
The name was intended as a pun on an earlier system called ‘Multics’ (Multiplexed Information and Computing Service).
*
Worded commands & messages
*
*
Features
Multi-user :- More than one user can use the machine at a time supported via terminals (serial or network connection).
Multi-tasking :- More than one program can be run at a time.
Hierarchical directory structure :- To support the organization and maintenance of files.
Portability :- Only the kernel ( <10%) written in assembler tools for program development a wide range of support tools (debuggers, compilers).
*
Where : Bell Telephone Laboratories
*
*
Unix operating system originally developed in 1969 by a group of AT&T employees at Bell Labs including Ken Thomson and Dennis Ritchie.
*
AIX (Advanced Interactive eXecutive ) by IBM
*
*
Unix is available in market in various flavors. Different vendors provides different operating system.
Solaris is given by Sun Microsystems, AIX is given by IBM.
Most popular version of Unix is Linux.
*
Signing onto Unix
Every user of a UNIX system must log on to the computer using an existing account.
Login name is to be entered at login prompt.
login: user1
$ ls
All unix commands must be entered in lowercase.
Between command name and its options, there must always be a space.
$ ls -l
*
*
Overview
banner
cal
date
Who
Echo
bc
$ banner hello
Displays calendar of current month.
Displays calendar for specified month (i.e. month 7-July ) and year (i.e. year 2008).
Displays calendar of 1752.
*
%D Displays date in the format of mm/dd/yy.
%g Displays year in 2 digit.
%G Displays year in 4 digit.
*
$
*
Unix provides two calculators – a graphical object ( xcalc command) that looks like one, and bc command. The former is available in X-window System, and is quite easy to use. The other one is less friendly, extremely powerful. In fact, there are situations when working without bc becomes quite painful.
$ bc
Command 1 :- Clears the screen.
Command 2 :- Tells you filename of the terminal you are using (tty stands for teletype).
Command 3 :- know your machine name
uname –r: shows operating system’s version no.
Command 4 :- Prints user’s login name.
Command 5 :- sign off
Overview
Types of Files
Rules for filenames
Directory Handling Commands
Absolute path & Relative path
Unix treats everything it knows and understands as a file.
Unix File system resembles an upside down tree.
/ (root)
bin
lib
dev
usr
tmp
etc
mnt
user1
user2
bin
It has a hierarchical file structure
Files can grow dynamically
Files have access permissions
*
*
- Ordinary Files
- Directory files
- Device files
*
*
pwd
pwd
cat command: create new file
Creates file with the specified name and can add data into it.
cat > r_jan
<ctrl+d>
cat r_jan
$ cat –n [file_name]
$ cat –e [file_name]
ls [option] [directory/file]
Options to ls
-a displays hidden files also
-l long listing if files showing 7 attributes of a file
-i displays inode number
-s Print size of each file, in blocks
-t Sort by modification time
-F Marks executables with * and directories with /
-R Recursive listing of all files in sub-directories
-d List directory entries
$ ls -a
Absolute path
Relative path
/
Options to mv:
Examples:
ln -s [source_path] [destination_path]
Permissions (read, write etc)
File deletion time
Number of links (soft/hard)
*
*
On Unix, the collection of data that makes up the contents of a directory or a file isn't stored under a name; the data is stored as part of a data structure called an ‘inode’.
In the inode, Unix stores information about which disk blocks belong to the contents of the directory or file, as well as information about who owns the directory or file and what access permissions it has. Every directory or file on Unix has its own inode.
*
*
Shell acts as the mediator which interprets the command that user gives and then conveys them to the kernel which ultimately executes them.
*
*
Bourne shell: Original Unix system came with Bourne shell. Created by Steve Bourne
C Shell: faster than Bourne. Created by Bill Joy. Allows aliasing of commands. It also has command history feature.
*
/dev/tty - refers to your own terminal
*
$ who | wc –l $ ls | wc –l >fcount
*
*
*
*
Unix System variables
System variables Purpose
PATH The set of directories the the shell will search in the order given, to find the command
HOME Set of users home directories
MAIL The directory in which electronic mail is sent to you is places
PS1 The primary prompt string
PS2 The secondary prompt string
SHELL It sets the path name of the shell interpreter
TERM Identifies the kind of terminal you are using
LOGNAME Displays the username
*
Unix system is controlled by a number of shell variables that are separately set by the system, some during boot sequence, and some after logging in.
These variables are called system variables or environment variables. You must know significance of many of them because they determine the environment in which you work.
use $ set to display complete list of these variables.
*
Overview
Input Mode commands
Ex mode commands
Pattern search and replace
*
*
*
Introduction to vi
vi( short for visual editor) is an editor available with all versions of unix.
It allows user to view and edit the entire document at same time.
Written by Bill Joy
$ vi newfile
Input
Mode
i, I, o, O, r, R, s, S, a, A
<Esc> :
<Enter>
I Inserts text at the beginning of the line
Append text
A Appends text at the end of the line
Opening a new line
*
*
R Replaces text from cursor to right
s Replaces single character under cursor with any number of characters
S Replaces entire line
Moving between the lines
nG Goes to line number ‘n’
Scrolling page
ctrl+d Scrolls half page forward
ctrl+u Scrolls half page backward
*
*
Saving & Quitting
:wq Saves & quits from editing mode
:x Saves & quits from editing mode
:q! Quitting the editing mode without saving any changes
Writes selected lines into specified file_name
:w <file_name> Save file as file_name
:n1,n2w <file_name> Writes lines n1 to n2 to specified file_name
:.w <file_name> Writes current line into specified file
:$w <file_name> Writes last line into specified file name
*
*
dw Deletes a word ( or part of the word)
dd Deletes the entire line
db Deletes the word to previous start of the word
d0 Deletes current line from cursor to beginning of line
d$ Deletes current line from cursor till end of line
nx Deletes n characters
ndw Deletes n words
ndd Deletes n lines
Yanking/Copying text
yy Yanks current line
y0 Yanks current line from cursor to beginning of line
y$ Yanks current line from cursor till end of line
nyy Yank the specified number of lines
Paste
*
*
n Repeats the last search command
N Repeats search in opposite direction
*
*
^ Matches pattern towards beginning of line
$ Matches pattern towards end of line
\< Forces match to occur only at beginning of word
\> Forces match to occur at end of word
Examples:
^finance
finance$
*
If any of the special characters mentioned above themselves occur in the pattern to be searched then they must be preceeded by a backslash(\) if they are to be treated as ordinary characters.
If you want to make searching irrespective of their case,
<Esc>:set ignorecase or <Esc>: set ic
*
:m,n s/str1/str2 /g
Replaces all occurrence of str1 with str2 from lines m to n
:.,$ s/str1/str2/g
Replaces all occurrence of str1 with str2 from current line to end of file
Example:
:s/director/member/g
:$s/director/member/g Replace director with member in last line
:1,$s/director/member/i Replace director with member in whole file
Sometimes you may like to selectively replace a string. In that case, add the ‘c’ (confirmatory) parameter as the flag at the end.
Example:
:1,5s/director/member/c
- replace director with member in line 1 to 5(y/n/a/q/l/^E/^Y)?
y (yes)make this changes
n (no)skip the match
a (all)make this change and all remaining ones without further confirmation
q (quit)don't make any more changes
l (last)make this change & quit
^E Scoll the text one line up
^Y Scoll the text one line down
*
:set nonumber/set nonu Set no number
:set wrapmargin=20 Set wrap margin equal to 20
:set ignorecase Ignorecase while searching
:set ai Set auto indent
:set showmode Shows the working mode
:set autowrite Automatically writes when moves to another page
*
*
:abbr lists currently defined abbreviation
:una <abbr> Unabbreviates the abbreviation
Example:
*
*
Command Purpose
vi file1 file2 Loads both files in vi for editing.
:n Permits editing of next file
:n! Permits editing of next file without saving current file
:rew Permits editing of first file in buffer
:rew! Permits editing of first file without saving current file
:args Displays names of all files in buffer
:f Displays name of current file
Example:
*
*
The chmod command
*
*
Three types of permissions
Execute
*
*
For others remove read permissions
$ chmod o-r report.txt
$ ls -l report.txt
$ chmod g=rx report.txt
*
*
Original permissions
$ ls-l report.txt
Assigning permissions using octal notations
$ chmod 664 report.txt
After assigning permissions
$ ls -l report.txt
*
*
*
*
*
cmp [file1] [file2]
*
*
The cmp command does a character-by-character comparison and indicates whether or not two files are
identical.
To use cmp files should be sorted.
If the two files differ, cmp generates a message indicating the first point where the two files differ. E.g.,
file1 file2 differ: char 4, line 10
*
comm [file1] [file2]
$ comm file1 file2
*
*
The comm command outputs three columns: the first column identifies only those lines that are present in the first file, the second column identifies only those lines that are present in the second file, and the third column identifies those lines that are present in both files.
To use comm files should be sorted.
Comm can also produce selective output, using options -1, -2 or -3.
$ comm -12 file1 file2 # selects only common lines
*
diff [file1] [file2]
$ diff file1 file2
*
*
The problem with ‘cmp’ is that it is unable to distinguish whether the difference between two files is profound or superficial. A useful alternative to the ‘cmp’ command is the UNIX ‘diff’ command. The ‘diff’ command attempts to determine the minimum set of changes needed to convert one file to another file.
Maintaining several versions of a file:
To use diff command files should be sorted.
Take previous example for file1 and file2
*
*
*
*
-f : Display Process ancestry
-e : Display system processes
Process is an instance of a program in execution.
Each process is uniquely identified by a number called PID ( Process Identifier ) that is alloted by the kernel when it is born.
When you log in to a UNIX system, a process is immediately set up by the kernel: sh (for bourne shell), ksh(korn shell), bash (bourne again shell)
Any command that you type in at the prompt is actually the standard input to the shell process. This process remains alive until you logout
To know PID of current shell,
echo $$
user1 1275 1032 0 09:10:02 pts/1 00:00 ps –f
*
*
fork
exec
wait
Fork:
A process in Unix is created with fork system call, which creates a copy of the process that invokes it. For example, when you enter a command at the prompt, the shell first creates a copy of itself. The image is practically identical to the calling process, except for few parameters like PID. When a process is forked in this way, child gets a new PID. Forking mechanism is responsible for multiplication of processes in system.
Exec:
Parent then overwrites the image that it has just created with the copy of the program that has to be executed. This is done by exec system call, and parent is said to exec this process. No new process is created here; existing program is simply replaced with new program. This process has same PID as the child that was just forked.
Wait:
*
Executing jobs in background
In order to execute a process in background, just terminate the command line with ‘&’
$ sort –o emp.lst emp.lst &
550 # job’s PID
*
A multitasking system lets a user do more than one job at a time.
*
nohup : Log Out Safely
nohup ( no hangup) command, when prefixed to a command, permits execution of the process even after user has logged out.
$ nohup sort emp.last &
*
*
bg, fg, kill, jobs
[1]+ Stopped cat >test
*
If you are using Korn or bash shell, you can use its job control facility to manipulate jobs.
You can send a job to background, bring it back to foreground, suspend or even kill it.
For this purpose, following commands are used:
bg, fg, suspend, and kill
fg and bg optionally use a job identifier prefixed by a % symbol. If you specify fg %1, the first job is brought to foreground.
*
$ kill 1346
*
Kill, by default uses signal number 15 to terminate the process.
Signal 9 is also known as sure kill signal
You can also kill all processes in your own system except login shell, by using special argument 0
$ kill 0 # terminate all processes with signal 15
Kill $! - kills last background job
*
at command
at 15:15
-r (remove ) : Remove submitted job
*
*
Scheduling jobs using cron
cron lets you schedule jobs so that they can run repeatedly.
$ crontab -l
day of week (0 - 6) (Sunday=0)
month (1 - 12)
hour (0 - 23)
min (0 - 59)
*
cron lets you schedule jobs so that they can run repeatedly. It takes input from user’s crontab file.
Cron process is mostly dormant, but every minute it wakes up and looks in a control file in /usr/spool/cron/crontabs) for instructions to be performed at that instant.
*
11 | Sumit Singh | D.G.M | Marketing | 19/Apr/1943 | 60000
12 | Barun Sen | Director | Personnel | 11/May/1947 | 78000
23 | Bipin Das | Secretary | Personnel | 11/Jul/1947 | 40000
50 | N.k.Gupta | Chairman | Admin | 30/Aug/1956 | 64000
43 | Chanchal | Director | Sales | 03/Sep/1938 | 67000
*
*
-h Displays header as specified instead of file name.
Example:
pr [option] [file_name]
Options to head:
Example:
head [option] [file_name]
Options to tail:
Example:
Options to cut:
-c : cutting columns
-f : cutting fields
-d : specify delimeter
Example:
+k Starts sorting after skipping kth field
-k Stops sorting after kth field
-o File Write result to “File” instead of standard output
-t Specify field separator
Example:
$sort -t"|" -k 3,3 -k 2,2 emp.lst
*
*
Options:
-u : selects only non-repeated lines
Examples:
$ sort departments | uniq
*
*
*
grep scans a file for occurrence of a pattern, and can display the selected pattern, line numbers in which they were found, or filenames where pattern was found.
Grep can also select lines not containing the pattern
*
Deleting specified pattern lines
$grep -v "sales" emp.lst
character
[pqr] Matches a single character p, q or r
[a-r] Matches a single character within range a – r
[^pqr] Matches a single character which is not p, q or r
^pattern Matches pattern at beginning of line
pattern$ Matches pattern at end of line
\<pattern Matches pattern at beginning of word
pattern\> Matches pattern at end of word
*
Suppress the prefixing of filenames on output when multiple files are searched
$grep -h gupta *
grep : Searching for pattern (Contd…)
Searches for a pattern only at the beginning of a word and not anywhere on the line
$grep "\<man" emp.lst
Searches for a pattern only at the end of a word and not anywhere on the line
$grep "man\>" emp.lst
$grep ag[agr][ra]r*wal emp.lst
*
*
$egrep "sengupta|dasgupta|gupta" emp.lst
$egrep "(sen|das|)gupta" emp.lst
*
Egrep command, authored by Alfred Aho, extends grep’s matching capabilities.
It offers all options of grep, but its most useful feature is the facility to specify more than one pattern for search.
Each pattern is separated from the other by a | (pipe)
It has –f, which allows to fetch patterns from some file.
egrep -f
$cat -n pattern.lst
Overview
Built-in variables
Arrays
Functions
Unlike other filters, awk operates at field level
*
*
This command made a late entry into the UNIX system in 1977 to augment the tool kit with suitable report formatting capabilities. Named after its authors, Aho, Weinberger and Kernighnan, awk is unquestionable the most powerful utility for text manipulation.
*
awk –F"|" '/sales/ {print $2,$3,$4,$6}' emp.lst
awk –F “|” ‘NR==3, NR==6 {print NR, $2, $3, $6 }’ emp.lst
awk <options> ‘ address {action}’ <file(s)>
*
*
In the line number 1 line specifier section is represented by the context address /director/. This section selects the lines that are to be processed in the action section. Moreover the print is the default action of awk there is no need to specify it if you want to print the entire line.
For the purpose of the identification of fields, awk uses a contiguous string of space as the default field delimiter. This is done with –F option. Using $ prefixed variable viz. $2 $3 you can specify the field. e,g,
$ awk –F “|” ‘/sales/ { print $2,$3,$5 }’ emp.lst
The comma has used to delimit the field specification. These ensure that each field is separated from the other by a space. Example shown in line number2. awk also accept the line address to select lines. To select lines 3 to 6 from file filename we use following command
*
printf("%3d %-20s %-12s %d\n", NR, $2, $3, $6)
}' emp.lst
*
*
The output in the last slide is unformatted, with the width of each field determined solely by the space it occupies in the file awk is a ‘stream formatter’, and uses the C-like printf statement to beautify the output. This statement is used with suitable format specifier to control the width and justification each field should use, irrespective of the width it has in the input. %s format will be used for string data, and %d for numeric. You can now produce a list of all the agarwals:
As shown in line number 1, the name and designation have been printed in space 20 and 12 characters wide, respectively, using the ‘–’ symbol to left-justify the output.
You can use following characters for formatting.
Character Description
e Floating-point format ([-]d.precisione[+-]dd)
E Floating-point format ([-]d.precisione[+-]dd)
f Floating-point format ([-]ddd.precision)
g e or f conversion, whichever is shortest, with trailing zeroes removed
G E or f conversion, whichever is shortest, with trailing zeroes removed
o Unsigned octal values String
x Unsigned hexadecimal number. Uses a-f for 10 to 15
X Unsigned hexadecimal number. Uses A-F for 10 to 15
% Literal %
Logical And &&
Logical Or ||
awk -F "|" '$6 > 7500 { print }' emp.lst
*
*
Line number 1 shows, how do you print the three fields for the directors or chairman? You can always write each awk program separate line and if you want to print only those lines for persons who are neither director nor chairman, you should use the != and && operators instead.
awk offer the ~ and !~ operators to match and negate a match, respectively. You can use this feature to print out all the chowdhury’s (spell Choudhury and chowdury) and saxena’s (spelt as saxena and saksena)as shown in line number 2.
awk -F "|" '$2 ~ /[cC]ho[wu]dh?ury/ || $2 ~ /sa[xk]s?ena/ { print }' emp.lst
awk can also handle numbers, both integer and floating type, all the relational tests can be handled by awk with numbers, as is done in any other language. Using operators from the set shown in the following table, as shown in line number 3.
Operator Significance
< Less than
= = Equal to
*
}' emp.lst
kount++
}' emp.lst
*
*
Like the shell, awk can perform computations on numbers. It uses the arithmetic operators +, -, *, /, and %(modulus) same as used by C or other programming languages. It also overcomes one of the major limitations of the shell; the inability to handle decimal numbers, example shown in line number 1.
A user-defined variable used by awk has a special feature. No type declaration is needed, and it is initialized to zero or a null string, by default, depending on the type. awk has a mechanism of identifying the type of variable used from its context.
*
awk -F"|" -f empawk.awk emp.lst
*
*
The programs are getting larger, you an store into file. So if the previous program was stored in the file empawk.awk, then you need to specify the program filename with the –f option :
To run the empaawk.awk file type the following command at the shell prompt.
$ awk –F”|” –f empawk.awk emp.lst
*
cat -n empawk2.awk
3 }
5 kount++; tot += $6
6 printf "%3d %-20s %-12s %d\n", kount, $2, $3, $6
7 }
8 END {
9 printf "\n\t\tThe average basic pay is %6d\n", tot/kount
10 }
*
*
awk statement are usually applied to all lines selected by the line specifier or address, and there are no addresses, then they are applied to every line of a file. If you are to print a heading, then the BEGIN section can be used quite gainfully. Similarly, if you want to print some totals after the processing is over then you should do it in the END section.
The BEGIN and END sections are optional, and take the form
BEGIN { action }
END { action }
These two sections, when present, are delimited by the body of the awk program. They also use a pair of curly braces to enclose the program. You can use these two sections to print suitable heading at the beginning, and the average salary at the end. Store the awk program in a separate file empawk2.awk, and then use the –f option.
You can execute this program using command given below
$ awk –F”|” –f empawk2.awk emp.lst
*
$6 > 7500
(change empawk2.awk line number 4 as $5>mpay )
*
*
The previous program can take a more generalized form if the number 7500 is replaced by a variable. To do that, the entire awk command (not just the program) should be stored in shell script, and the parameter supplied as an argument to the script. The parameter is compared with the variable. These variables are properly known positional parameters, and are identified by the shell as $1, $2, $3, etc. in the order they are presented in the line.
awk also accepts these parameters without conflicting with the identifiers which also use the same notation. The only difference is that the positional parameters used by awk should be enclosed within single quotes. This will enable awk to distinguish between a positional parameter and a field identifier. So, instead of having line number 1 as the line specifier, you should generalize it as line number 2.
There is another method of generalizing the previous program, without having to store the entire command sequence in a separate script file. Since both the shell and awk use variables, you can expect awk to accept values passed by the shell. If you use a variable mpay in the shell command line, then you need to make the following change in the script empawk.awk
$6 > mpay {
*
awk 'BEGIN {FS="|" }
awk '$6>6000
awk has several built-in variables. All these variables are assigned automatically by awk, though it is also possible for a user to reassign them.
The major variables you can use profitably with awk are shown below.
Variable Function
FS The input field separator
OFS The output field separator
NF Number of fields in current record
FILENAME The current input file
FS defines the input field separator, which in this case happens to be the “|”. This is alternative to the –F option of the command, which does the same thing. When used at all it must occur in the BEGIN section so that the body of the program knows its value before it starts processing:
BEGIN { FS=”|” }
When neither the –F option is specified, nor is this variable defined in the BEGIN section, awk assumes the space and the tab characters as the default input field separator.
Likewise, when you used the print statement with comma-separated arguments, each argument was separated from the other by a space. This is awk default output field separator, and can be reassigned using the variable OFS in the BEGIN section:
BEGIN { OFS=“~” }
see Line number 1.
*
cat -n empawk3.awk
3 getline var < "/dev/tty"
5 }
6 $6 > var {
7 printf( "%3d %-20s %-12s %d \n", ++kount, $2, $3, $6)
8 }
*
*
getline reads from the standard input, it is possible to take a line of information from the keyboard. Using the device name /dev/tty to represent the terminal, you can replace the previous sequence, by placing the getline statement in the BEGIN section.
When you execute this script, it will ask you for the value which has to be used as the cut-off point :
$ awk –F”|” –f empawk4.awk emp.lst
*
3 }
6 gp = $6+hra+da;
7 tot[1] +=$6 ; tot[2] += da;
8 tot[3] +=hra; tot[4] += gp
9 kount++
12 printf "\n\t Average %5d %5d %5d %5d\n", \
13 tot[1]/kount, tot[2]/kount, tot[3]/kount, tot[4]/kount
14 }
*
*
awk handles single-dimensional arrays. The index for any array can be virtually anything; it can even be a string. No array declarations are necessary; an array is considered to be declared the moment it is used, and is automatically initialized to zero, unless initialized explicitly.
You can use array to store the totals of the basic pay, da, house rent and gross pay of the sales and marketing people. Assume that the da is 25%, and the house rent 50% of the basic pay. Use the array to store the totals of each element of pay and also the gross pay.
When you run the program, it outputs the average of the elements of the pay :
$ awk –f empawk.awk emp.lst
*
awk has several built-in functions, performing both arithmetic and string operations. A list of the major functions is presented in the table below:
You may like to clear the screen before displaying an awk output on the terminal or you may want to print the system date at the beginning of the report. You can achieve all this, with the system() function.
It accepts any UNIX command (or even a pipeline) as an argument, but the entire command sequence has to be enclosed within double quotes. Simply place the following statements in the BEGIN section of a awk program :
BEGIN {
int(x) Return the integer value of x
sqrt(x) Returns the square root of x
index(s1,s2) Returns the position of the string s2 in the string s1
length( ) Returns the length of the argument (the complete record in case of none)
substr(s1, s2, s3) Returns portion of string of length s3, starting from position s2 in string s1
split(str,arr) Splits string str into array arr
*
}
*
*
awk has conditional structures with the if statement, and loops using while and for. They all execute a body of statements depending on the success or failure of the control command.
The if statement can be used when the && and || are found to be inadequate for certain tasks. It tests for the condition that is specified on its ‘command line’ (i.e. the line containing the if statement), and then performs the actions that follow it. Optionally, it can also perform a different set of actions if the condition is not found to be true. This statement takes the form
if (condition is true)
<statement>
else
<statements>
With if you can use any relational operator, as well as the special symbols ~ and !~, to match a regular expression. They all work very well with it, and when combined with the logical operators || and &&, make awk programming quite easy and powerful.
*
*
*
awk suppors two loops, for and while and they both execute the loop body as long as the control command returns a true value. for has two forms. The easier one resembles its C counterpart. A simple example illustrates the first form :
for (K=1; K<=9;K+=2)
The second form of the for loop doesn’t have a parallel in any known programming language. It has an unusual syntax :
for ( k in array )
<commands>
*
}
*
*
In the above program, user defined function newline() is defined, which takes parameters as character and number. This function can be used to print new line on the screen.

Documents

Unix Fundamentals