Regular expressions {week 06}. The College of Saint Rose, CIS 433 – Programming Languages, David Goldschmidt, Ph.D. From Concepts of Programming Languages, 9th edition by Robert W. Sebesta, Addison-Wesley, 2010, ISBN 0-13-607347-6


Page 1: Regular expressions {week  06}

Regular expressions {week 06}

The College of Saint Rose
CIS 433 – Programming Languages
David Goldschmidt, Ph.D.

from Concepts of Programming Languages, 9th edition by Robert W. Sebesta, Addison-Wesley, 2010, ISBN 0-13-607347-6

Page 2: Regular expressions {week  06}

Regular expressions (i)

A regular expression is an expression in a “mini language” designed specifically for textual pattern matching

Support for regular expressions is available in many languages, including Java, JavaScript, C, C++, PHP, etc.

Page 3: Regular expressions {week  06}

Regular expressions (ii)

A pattern contains numerous character groupings and is specified as a string

Patterns to match a phone number include:

[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]
[0-9]{3}-[0-9]{3}-[0-9]{4}
\d\d\d-\d\d\d-\d\d\d\d
\d{3}-\d{3}-\d{4}
\(\d\d\d\) \d\d\d-\d\d\d\d   (parentheses are escaped so that they match literally)
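The equivalence of the dash-separated patterns can be checked with a short Java sketch. The class name PhonePatternDemo and the sample number 518-555-1234 are made up for illustration. The parenthesized form is written with escaped parentheses here, since unescaped parentheses would form a group rather than match the characters ( and ).

```java
class PhonePatternDemo {
    // The dash-separated patterns, written as Java string literals
    // (each regex backslash is doubled inside the literal).
    static final String[] DASHED = {
        "[0-9][0-9][0-9]-[0-9][0-9][0-9]-[0-9][0-9][0-9][0-9]",
        "[0-9]{3}-[0-9]{3}-[0-9]{4}",
        "\\d\\d\\d-\\d\\d\\d-\\d\\d\\d\\d",
        "\\d{3}-\\d{3}-\\d{4}"
    };

    // Area code in parentheses; the parentheses are escaped so they match
    // literally instead of forming a group.
    static final String PARENTHESIZED = "\\(\\d\\d\\d\\) \\d\\d\\d-\\d\\d\\d\\d";

    // True only if every dash-separated pattern accepts the whole input.
    static boolean allDashedMatch(String input) {
        for (String pattern : DASHED) {
            if (!input.matches(pattern)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allDashedMatch("518-555-1234"));          // true
        System.out.println(allDashedMatch("518-555-123"));           // false: last group has 3 digits
        System.out.println("(518) 555-1234".matches(PARENTHESIZED)); // true
    }
}
```

The shorter forms using \d and {n} accept the same strings as the longhand form; they only differ in how compactly the pattern is written.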

Page 4: Regular expressions {week  06}

Regular expressions (iii)

regular expression   matches                                          example
xyz                  the specified characters xyz                     Java matches Java
.                    any single character                             Java matches J..a
[xyz]                single character x, y, or z                      Java matches Ja[uvwx]a
[^xyz]               any character except x, y, or z                  Java matches Ja[^abcd]a
[a-z]                any character a through z                        Java matches [A-M]a[t-z]a
[^a-z]               any character except a through z                 Java matches Jav[^b-x]
[A-Za-z]             any letter                                       Java matches Jav[a-fp-z]
\d                   any digit character [0-9]                        1234 matches \d\d\d\d
\D                   any non-digit character [^0-9]                   Java matches \D\D\D\D
\w                   any “word” character [A-Za-z0-9_]                Java matches \wava
\W                   any non-word character [^A-Za-z0-9_]             2+3 matches \d\W\d
\s                   any whitespace character                         D G matches \w\s\w
\S                   any non-whitespace character                     D + matches \S\s\S
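The table's examples can be run directly in Java. The following is a minimal sketch, assuming String.matches() as the test (so each pattern must cover the whole input); the class name CharClassDemo and the extra inputs "x_1" and "Ja[abcd]a" are illustrative additions, not part of the slides.

```java
class CharClassDemo {
    // {input, pattern} pairs taken from the table; each pair should match.
    static final String[][] EXAMPLES = {
        {"Java", "J..a"},
        {"Java", "Ja[uvwx]a"},
        {"Java", "Ja[^abcd]a"},
        {"Java", "[A-M]a[t-z]a"},
        {"1234", "\\d\\d\\d\\d"},
        {"Java", "\\D\\D\\D\\D"},
        {"2+3",  "\\d\\W\\d"},
        {"D G",  "\\w\\s\\w"},
        {"D +",  "\\S\\s\\S"}
    };

    // Counts how many of the table examples match their pattern.
    static int countMatches() {
        int count = 0;
        for (String[] example : EXAMPLES) {
            if (example[0].matches(example[1])) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(countMatches() + " of " + EXAMPLES.length + " examples match");
        // In Java, \w also accepts digits and the underscore.
        System.out.println("x_1".matches("\\w\\w\\w"));   // true
        // v is not in [abcd], so the positive class fails here.
        System.out.println("Java".matches("Ja[abcd]a"));  // false
    }
}
```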

Page 5: Regular expressions {week  06}

Regular expressions (iv)

regular expression   matches                                               example
^                    the beginning of the string                           Java matches ^Java
$                    the end of the string                                 Java matches Java$
pattern*             zero or more occurrences of pattern                   JAVA matches [A-Z]*
pattern+             one or more occurrences of pattern                    Java matches J[a-z]+
pattern?             zero or one occurrence of pattern                     -50 matches -?\d+
pattern{n}           exactly n occurrences of pattern                      Java matches \w{4}
pattern{n,m}         between n and m (inclusive) occurrences of pattern    Java matches \w{3,8}
pattern{n,}          at least n occurrences of pattern                     Java matches \w{3,}
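A short Java sketch of the quantifiers and anchors above. The class name QuantifierDemo, the helper names whole() and found(), and the input "Java rocks" are assumptions made for illustration. Because String.matches() always tests the whole string, the anchors are demonstrated with Matcher.find(), which searches for a match anywhere in the input.

```java
import java.util.regex.Pattern;

class QuantifierDemo {
    // Whole-string test, as performed by String.matches().
    static boolean whole(String input, String regex) {
        return input.matches(regex);
    }

    // Substring search, so that ^ and $ make a visible difference.
    static boolean found(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        System.out.println(whole("JAVA", "[A-Z]*"));    // true
        System.out.println(whole("", "[A-Z]*"));        // true: zero occurrences allowed
        System.out.println(whole("Java", "J[a-z]+"));   // true
        System.out.println(whole("J", "J[a-z]+"));      // false: + needs at least one
        System.out.println(whole("-50", "-?\\d+"));     // true
        System.out.println(whole("50", "-?\\d+"));      // true: the sign is optional
        System.out.println(whole("Java", "\\w{4}"));    // true
        System.out.println(whole("Java", "\\w{3,8}"));  // true
        System.out.println(whole("Java", "\\w{3,}"));   // true
        System.out.println(whole("Java", "\\w{3}"));    // false: exactly 3, input has 4
        System.out.println(found("Java rocks", "^Java")); // true: Java is at the start
        System.out.println(found("Java rocks", "Java$")); // false: Java is not at the end
    }
}
```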

Page 6: Regular expressions {week  06}

Regular expressions in Java (i)

The String class in Java provides a pattern matching method called matches():

Unlike the matching functions in many other languages, matches() requires the pattern to match the entire string, not just part of it

String s = "Pattern matching in Java!";
String p = "\\w+\\s\\w+\\s\\w{2}\\s\\w+!";
if ( s.matches( p ) )
{
  System.out.println( "MATCH!" );
}
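To make the whole-string requirement concrete, the sketch below contrasts String.matches() with Matcher.find() from java.util.regex, which looks for the pattern anywhere in the input. The class name WholeStringDemo and the helper names are hypothetical; the input string is the one used above.

```java
import java.util.regex.Pattern;

class WholeStringDemo {
    // True only if the pattern accounts for every character of the input.
    static boolean wholeMatch(String input, String regex) {
        return input.matches(regex);
    }

    // True if the pattern matches any substring of the input.
    static boolean partialMatch(String input, String regex) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String s = "Pattern matching in Java!";

        // "Java" occurs inside s but is not the whole of s.
        System.out.println(wholeMatch(s, "Java"));    // false
        System.out.println(partialMatch(s, "Java"));  // true

        // A pattern covering the entire string succeeds with matches().
        System.out.println(wholeMatch(s, "\\w+\\s\\w+\\s\\w{2}\\s\\w+!")); // true
    }
}
```

A common alternative when only matches() is available is to pad the pattern with .* on both sides, for example ".*Java.*".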

Page 7: Regular expressions {week  06}

Regular expressions in Java (ii)

Additional pattern-matching methods:

Use the replaceFirst() and replaceAll() methods to replace a pattern with a string:

String s = "<title>Cool Web Site</title>";
String p = "</?\\w+>";
String result = s.replaceAll( p, "" );   // result is "Cool Web Site"
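The difference between the two methods shows up on the same input: replaceAll() replaces every match, while replaceFirst() replaces only the leftmost one. A minimal sketch follows; the class name ReplaceDemo and its helper names are made up for illustration. Note that inside a Java string literal the regex \w is written as \\w.

```java
class ReplaceDemo {
    // Matches an opening or closing tag such as <title> or </title>.
    static final String TAG = "</?\\w+>";

    // Removes every tag from the input.
    static String stripAllTags(String s) {
        return s.replaceAll(TAG, "");
    }

    // Removes only the first (leftmost) tag from the input.
    static String stripFirstTag(String s) {
        return s.replaceFirst(TAG, "");
    }

    public static void main(String[] args) {
        String s = "<title>Cool Web Site</title>";
        System.out.println(stripAllTags(s));   // Cool Web Site
        System.out.println(stripFirstTag(s));  // Cool Web Site</title>
    }
}
```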