Regular Expressions

• What is this line all about?
    while (!($search =~ /^\s*$/)) {
• It’s a string search just like before, but with a huge twist – regular expression search
• ^\s*$ is a regular expression that says “match a line with nothing but white space (or nothing at all)”, so the loop keeps running until $search is blank
  – Whitespace: space ( ), tab (\t), formfeed (\f), newline (\n), carriage return (\r)
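The blank-line test can be tried on its own. This is only a sketch: the helper name is_blank and the sample strings are made up for illustration and are not part of the original program.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the string is empty or holds only whitespace.
sub is_blank {
    my ($text) = @_;
    return $text =~ /^\s*$/ ? 1 : 0;
}

# Sample inputs (assumed for illustration): empty, spaces, mixed whitespace, text.
for my $sample ("", "   ", "\t \r\n", "perl", "  x  ") {
    my $verdict = is_blank($sample) ? "blank" : "not blank";
    print length($sample), " character(s): $verdict\n";
}
```

Wrapping the test in !( … ), as the while loop does, reverses the answer: the loop body runs only for strings that are not blank.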

Regular Expressions

• A “convenient” way to describe patterns of characters
  – Characters include “printable” and “meta” characters
• Three primary concepts:
  – Concatenation – adjacent characters in the search string must be adjacent in the data string
  – Alternation – specify a choice of characters that match in a specified position
  – Repetition – specify how many of a given character must match

Concatenation

if ($data =~ /abcdef/) {
    …
}
• The pattern “abcdef” must appear in that order, as one unbroken run, somewhere within the variable $data
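A small sketch of concatenation in action. The helper name has_abcdef and the test strings are assumptions chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: true when "abcdef" occurs as one unbroken run in $data.
sub has_abcdef {
    my ($data) = @_;
    return $data =~ /abcdef/ ? 1 : 0;
}

print has_abcdef("xxabcdefxx") ? "match\n" : "no match\n";   # run is present inside the string
print has_abcdef("abc def")    ? "match\n" : "no match\n";   # a space breaks the run
print has_abcdef("abcdfe")     ? "match\n" : "no match\n";   # last two letters are swapped
```

Only the first string matches, because the pattern is not anchored and may start anywhere, but its six characters must sit side by side.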

Alternation

if ($data =~ /a(b|c|d|e)f/) {
    …
}
• The pattern “a(b|c|d|e)f” matches an ‘a’ followed by one of ‘b’, ‘c’, ‘d’, or ‘e’, followed by an ‘f’, anywhere within the variable $data
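To see which alternative was chosen, the parentheses can double as a capture group. A sketch, with the helper name which_choice and the inputs invented for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the letter picked from (b|c|d|e), or undef
# when the pattern does not match at all.
sub which_choice {
    my ($data) = @_;
    return $data =~ /a(b|c|d|e)f/ ? $1 : undef;
}

for my $data ("acf", "xxaefxx", "af", "abcf", "agf") {
    my $choice = which_choice($data);
    print "$data: ", (defined $choice ? "matched with '$choice'" : "no match"), "\n";
}
```

Exactly one of the four letters must sit between the ‘a’ and the ‘f’, so “af” (none) and “abcf” (two) both fail.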

Repetition

if ($data =~ /ab*f/) {
    …
}
• The pattern “ab*f” matches an ‘a’ followed by zero or more ‘b’s, followed by an ‘f’, anywhere within the variable $data
• * – zero or more instances of the previous character
• + – one or more instances of the previous character
• {n} – exactly n instances of the previous character
• {m,n} – at least m and at most n instances of the previous character
• {n,} – n or more instances of the previous character
• ? – zero or one instance of the previous character
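The quantifiers can be compared side by side by varying the number of ‘b’s. This is a sketch under assumptions: the ^ and $ anchors are added here so the count is checked against the whole string, and the helper name matching_quantifiers is made up.

```perl
use strict;
use warnings;

# Each quantifier from the list, applied to 'b' between an 'a' and an 'f'.
# The ^ and $ anchors are an addition so the whole string must fit the pattern.
my @quantifiers = (
    [ 'b*',     qr/^ab*f$/     ],
    [ 'b+',     qr/^ab+f$/     ],
    [ 'b{2}',   qr/^ab{2}f$/   ],
    [ 'b{2,3}', qr/^ab{2,3}f$/ ],
    [ 'b{2,}',  qr/^ab{2,}f$/  ],
    [ 'b?',     qr/^ab?f$/     ],
);

# Hypothetical helper: labels of the quantifiers whose pattern matches $text.
sub matching_quantifiers {
    my ($text) = @_;
    return map { $_->[0] } grep { $text =~ $_->[1] } @quantifiers;
}

# Try strings with zero through four 'b' characters.
for my $count (0 .. 4) {
    my $text = 'a' . ('b' x $count) . 'f';
    print "$text: ", join(' ', matching_quantifiers($text)), "\n";
}
```

In Perl a quantifier applies to the item just before it, which may be a single character as here, a character class, or a parenthesized group.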

Meta-characters

• Anything following a \
• Alternation (choice) |
• Grouping within ( and )
• Character classes within [ and ]
  – e.g. [A-Za-z] all upper and lower case letters
  – e.g. [abc] a or b or c – same as (a|b|c)
  – e.g. [^0-9] anything that is not a digit 0 thru 9
• Match any
  – . (the dot) matches any single character except a newline, e.g. .* zero or more of any character
  – Written inside brackets, [.*] instead matches one literal ‘.’ or ‘*’
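A few quick checks of these meta-characters, written as a sketch. The helper name check, the sample strings, and the ^ … $ anchors are assumptions added so that each test looks at the whole string.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 when $text matches $regex, otherwise 0.
sub check {
    my ($text, $regex) = @_;
    return $text =~ $regex ? 1 : 0;
}

my @cases = (
    [ 'b',     qr/^[abc]$/,   'class [abc] accepts b'            ],
    [ 'b',     qr/^(a|b|c)$/, 'alternation (a|b|c) accepts b'    ],
    [ 'x',     qr/^[^0-9]$/,  'negated class accepts a letter'   ],
    [ '7',     qr/^[^0-9]$/,  'negated class accepts a digit'    ],
    [ 'a--f',  qr/^a.*f$/,    'dot-star spans any characters'    ],
    [ "a\nf",  qr/^a.f$/,     'dot matches a newline by default' ],
    [ '*',     qr/^[.*]$/,    'bracketed [.*] accepts a star'    ],
    [ 'x',     qr/^[.*]$/,    'bracketed [.*] accepts a letter'  ],
);

for my $case (@cases) {
    my ($text, $regex, $claim) = @$case;
    print "$claim: ", (check($text, $regex) ? "yes" : "no"), "\n";
}
```

The “no” answers are the instructive ones: a digit fails [^0-9], the dot skips over a newline unless the /s modifier is used, and inside brackets the dot and star lose their special meaning.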

Meta-characters

• Beginning and end of a string
  – ^ what follows must start the string
  – $ what precedes must end the string
  – \^ matches a literal ^
  – \$ matches a literal $
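A sketch contrasting the anchors with their escaped, literal forms. All helper names and sample strings here are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helpers; each returns 1 on a match and 0 otherwise.
sub starts_with_perl { my ($text) = @_; return $text =~ /^perl/  ? 1 : 0; }
sub ends_with_perl   { my ($text) = @_; return $text =~ /perl$/  ? 1 : 0; }
sub mentions_price   { my ($text) = @_; return $text =~ /\$5/    ? 1 : 0; }  # \$ is a literal dollar sign
sub mentions_power   { my ($text) = @_; return $text =~ /2\^10/  ? 1 : 0; }  # \^ is a literal caret

# Single-quoted strings keep $ and ^ from being interpreted by Perl itself.
print "starts: ", starts_with_perl('perl rocks'),  " ", starts_with_perl('I like perl'), "\n";
print "ends:   ", ends_with_perl('I like perl'),   " ", ends_with_perl('perl rocks'),    "\n";
print "price:  ", mentions_price('cost: $5'),      " ", mentions_price('cost: 5'),       "\n";
print "power:  ", mentions_power('2^10 is 1024'),  " ", mentions_power('210'),           "\n";
```

Each line prints a 1 followed by a 0: the first string of every pair satisfies the pattern and the second does not.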

Character Classes

• Use square brackets to denote classes (sets) of characters to be matched
    [A-Z] match any single uppercase letter
    [a-z] match any single lowercase letter
    [0-9] match any single digit
    [A-Za-z0-9] match any single letter or digit
    [^0-9] match any single character that is NOT a digit
• Note that there are no spaces in the classes (unless you want to match a space)
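The classes can be used to sort single characters into groups. A sketch, with the helper name classify_char and the sample characters chosen for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: names the first class that a one-character string fits.
sub classify_char {
    my ($char) = @_;
    return "uppercase letter" if $char =~ /^[A-Z]$/;
    return "lowercase letter" if $char =~ /^[a-z]$/;
    return "digit"            if $char =~ /^[0-9]$/;
    return "space"            if $char =~ /^[ ]$/;   # a space inside the brackets matches a space
    return "other";
}

for my $char ('Q', 'q', '7', ' ', '#') {
    print "'$char' is: ", classify_char($char), "\n";
}

# A space is accepted only by a class that lists one.
print "without space in class: ", (' ' =~ /^[A-Za-z0-9]$/  ? "match" : "no match"), "\n";
print "with space in class:    ", (' ' =~ /^[A-Za-z0-9 ]$/ ? "match" : "no match"), "\n";
```

The last two lines illustrate the note about spaces: the same class matches a space only once a space is typed between the brackets.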

Matching

• String matching assumes the longest possible string to formulate the match
    e.g. “hear ye hear ye” =~ /hear.*ye/ matches the entire string
• If you want the minimal string, add a ? after the quantifier
    e.g. “hear ye hear ye” =~ /hear.*?ye/ matches only the first “hear ye”
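The difference is easiest to see by capturing what each pattern actually matched. A sketch: the capture parentheses and the helper names greedy_match and minimal_match are additions for illustration.

```perl
use strict;
use warnings;

# Hypothetical helpers: return the text matched by each form of the pattern.
sub greedy_match {
    my ($text) = @_;
    my ($found) = $text =~ /(hear.*ye)/;    # .* takes as much as it can
    return $found;
}

sub minimal_match {
    my ($text) = @_;
    my ($found) = $text =~ /(hear.*?ye)/;   # .*? takes as little as it can
    return $found;
}

my $text = "hear ye hear ye";
print "longest: '", greedy_match($text),  "'\n";   # prints: longest: 'hear ye hear ye'
print "minimal: '", minimal_match($text), "'\n";   # prints: minimal: 'hear ye'
```

The same trailing ? works after the other quantifiers as well, for example +? and {m,n}?.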