CSC 352– Unix Programming, Spring, 2011 April 6, 2011, Week 11, a useful subset of regular expressions, grep and sed, parts of Chapter 11


Page 1: CSC 352– Unix Programming, Spring, 2011

CSC 352– Unix Programming, Spring, 2011

April 6, 2011, Week 11, a useful subset of regular expressions, grep and sed, parts of Chapter 11

Page 2

Motivation

• In assignment 4 you will write a shell script to inspect all regular files in and below a directory (including subdirectories) for a string pattern.

• To get a list of files containing a pattern, you can use the find command to find all regular files, and grep to search for the pattern.

• Your script will then iterate through this list of files and, one at a time, use sed to replace all occurrences of the pattern with a new string.

Page 3

Finding files with pattern

• Start out with a manual find command.
    • find JavaLang -type f -print # looks for all regular files
    • find JavaLang -type f -name "*.java" -print # use name

• Use grep with above in back ticks for file list.
    • grep interface `find JavaLang -type f -print`
    • grep -l interface `find JavaLang -type f -print`
    • grep -l interface `find JavaLang -type f -print 2>/dev/null`

• Iterate through files in a for loop.
    • all=`find JavaLang -type f -print 2>/dev/null`
    • matches=`grep -l interface $all 2>/dev/null`
    • for file in $matches; do echo "FOUND FILE $file"; done

Page 4

File names with spaces

• These create problems for the above approach.

• It is necessary to use the “-print0” option of find instead of “-print” and to pipe the stdout of find to xargs -0 in order to package space-containing file names up for downstream commands such as grep.

• We will not use directories and file names that contain spaces in project 4.

• Avoid spaces in directory and file names when setting up Unix source code and similar repositories.
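
The -print0 and xargs -0 combination described above can be sketched as follows. This is a minimal illustration, not part of the assignment; the scratch directory and the file names "My Sources" and "Shape One.java" are made up for the example.

```shell
# Scratch directory with one space-containing file name (hypothetical names).
demo=$(mktemp -d)
mkdir "$demo/My Sources"
printf 'public interface Shape {}\n' > "$demo/My Sources/Shape One.java"
printf 'public class Circle {}\n' > "$demo/My Sources/Circle.java"

# -print0 ends each file name with a NUL byte; xargs -0 splits on NUL only,
# so the spaces inside the path stay part of a single argument to grep.
found=$(find "$demo" -type f -print0 | xargs -0 grep -l interface)
echo "$found"

rm -rf "$demo"
```

With plain -print and back ticks, the shell would split the same path at each space and grep would look for files that do not exist.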

Page 5

Using sed for substitution

• Replace the for loop in the previous example with this.
    • for file in $matches
    • do
        – sed -e "s/interface/thingy/g" $file > junk.tmp.txt
        – mv junk.tmp.txt $file
    • done

• Sed can substitute a string (“thingy” in this example) for a regular expression pattern.

• The “/g” portion of sed’s substitute command “s/” says, “Do it globally, throughout each line.”
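
The find, grep and sed steps can be tried end to end on a throwaway tree. The sketch below assumes a scratch directory from mktemp whose path contains no spaces; the files A.java and C.java and their contents are invented for the illustration.

```shell
# Scratch tree; directory and file names are hypothetical and contain no spaces.
demo=$(mktemp -d)
mkdir "$demo/JavaLang"
printf 'interface A {}\ninterface B extends interface_base {}\n' > "$demo/JavaLang/A.java"
printf 'class C {}\n' > "$demo/JavaLang/C.java"

all=`find "$demo/JavaLang" -type f -print 2>/dev/null`
matches=`grep -l interface $all 2>/dev/null`
for file in $matches
do
    # Write the edited text to a temporary file, then move it over the original.
    sed -e "s/interface/thingy/g" "$file" > "$demo/junk.tmp.txt"
    mv "$demo/junk.tmp.txt" "$file"
done

edited=$(cat "$demo/JavaLang/A.java")
untouched=$(cat "$demo/JavaLang/C.java")
echo "$edited"

rm -rf "$demo"
```

The second line of A.java holds two occurrences of the pattern, so it shows the effect of "/g": both are replaced. C.java never appears in $matches and is left alone.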

Page 6

Grep command line (p. 295)

• Grep searches for a regular expression in stdin or in a list of files given on the command line.

• A regular expression is a string expression that describes a set of strings.

• Grep options include the following.
    • -i is case insensitive.
    • -v shows only non-matching lines (filters out matches).
    • -l (ell) gives only the distinct names of files that contain a match.
    • -n displays line numbers along with lines.
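
A small sketch of the four options on a three-line sample file. The file contents are invented for the illustration, and the results are captured in variables only so they are easy to inspect.

```shell
# Three-line sample file (contents are hypothetical).
sample=$(mktemp)
printf 'Interface Shape\nclass Circle\ninterface Drawable\n' > "$sample"

insensitive=$(grep -i interface "$sample")   # lines 1 and 3, either capitalization
inverted=$(grep -v interface "$sample")      # lines without lowercase "interface"
numbered=$(grep -n interface "$sample")      # line number, a colon, then the line
listed=$(grep -l interface "$sample")        # only the name of the matching file

echo "$numbered"
rm -f "$sample"
```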

Page 7

A few important patterns (p. 299)

pattern     meaning
*           0 or more occurrences of the previous character
.           any single character
[pqr]       any single character from the set p, q or r
[a-zA-Z]    any single character from the range a-z or A-Z
[^pqr]      any single character *not* from the set p, q or r
^           the start of the string being searched for a match
$           the end of the string being searched for a match
\           escapes the next character so it is treated as a regular character

These are the most useful regular expression patterns available to grep, sed, and for pattern-based searching in emacs and vi.

The shell uses so-called “glob-style matching” for strings (*.java), which differs from regular expressions (.*\.java) used by grep, sed, emacs and vi.
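
The patterns can be tried with grep against a short word list. This is a sketch only; the words and the names Main.java and MainXjava are made up to show which lines each pattern selects.

```shell
# One candidate string per line (all hypothetical).
words=$(mktemp)
printf 'cat\ncot\ncut\nct\ncaat\nscat\nMain.java\nMainXjava\n' > "$words"

dot=$(grep 'c.t' "$words")          # exactly one character between c and t
star=$(grep 'ca*t' "$words")        # zero or more a's between c and t
set=$(grep 'c[ao]t' "$words")       # a or o between c and t
negset=$(grep 'c[^ao]t' "$words")   # anything except a or o between c and t
anchored=$(grep '^cat$' "$words")   # the whole line must be "cat"
escaped=$(grep '\.java' "$words")   # a literal dot before "java"
unescaped=$(grep '.java' "$words")  # any character before "java"

# Glob-style matching in the shell: * alone means "any string".
case Main.java in
    *.java) glob=match ;;
    *)      glob=nomatch ;;
esac

echo "$unescaped"
rm -f "$words"
```

Without anchors grep matches anywhere in the line, which is why "scat" is selected by 'c.t' but not by '^cat$'.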

Page 8

Intro to sed

• Sed is based on early command-line editors in Unix (ed editor), applied to a stream of lines as a filter.

• All we will cover this semester is using sed to replace occurrences of a pattern in a file with a string.

• sed -e 's/this/that/' # substitutes string “that” for pattern “this”, only once per line, where “this” may contain patterns from the previous slide. The closing “/” is required. Use single quotes in the sed command line if you don’t want the Shell to expand Shell meta-characters such as $.

• sed -e 's/this/that/g' # global substitute of all occurrences of pattern “this” in each line.

• Sed reads stdin or a file and writes to stdout.
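
A short sketch of sed as a filter on stdin, comparing a once-per-line substitute with a global one and single quotes with double quotes. The input line and the variable name "new" are invented for the illustration.

```shell
# Sample input line (hypothetical); single quotes keep the $ literal.
line='this and this cost $5'

# Without /g only the first match on the line is replaced.
once=$(printf '%s\n' "$line" | sed -e 's/this/that/')

# With /g every match on the line is replaced.
every=$(printf '%s\n' "$line" | sed -e 's/this/that/g')

# Inside single quotes the $ reaches sed unchanged, where it anchors the end of the line.
anchored=$(printf '%s\n' "$line" | sed -e 's/5$/6/')

# Inside double quotes the shell expands $new before sed runs.
new=thingy
expanded=$(printf '%s\n' "$line" | sed -e "s/this/$new/g")

echo "$once"
echo "$every"
```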