Todd Benson RegEx 101

Regex 101


DESCRIPTION

Basic introduction to regular expressions and some ways they may be used


Page 1: Regex 101

Todd Benson

RegEx 101

Page 2: Regex 101

Overview

• What is RegEx
• RegEx Basics
• Uses for RegEx
• Useful RegExpressions

Page 3: Regex 101

What is RegEx?

“In computing, a regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. "find and replace"-like operations.” - Wikipedia

Page 4: Regex 101

• “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.” - Jamie Zawinski

Page 5: Regex 101

Why RegEx?

• Tools use it: Nessus, Burp, W3AF
• Most programming languages support it
• Excellent tool to have in the toolbox

Page 6: Regex 101

RegEx Basics: Literal Matches

Literal Matches
• ‘bat’ matches ‘bat’

12 special characters: \ ^ $ . | ? * + ( ) [ ]
• These must be escaped with a backslash to match literally: ‘\\’ ‘\$’

. (dot)
• Matches any single character: ‘.at’ matches ‘bat’, ‘cat’, and ‘hat’
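The literal, escaping, and dot rules above can be tried out in a few lines. This is a minimal sketch using Python's standard `re` module; the sample strings are illustrative only.

```python
import re

# A literal pattern matches exactly the same text.
literal_ok = re.fullmatch(r"bat", "bat") is not None

# An unescaped '$' is an anchor, so it is escaped to match a dollar sign.
price_ok = re.search(r"\$5", "cost: $5") is not None

# re.escape() adds the backslashes when the literal text comes from elsewhere.
escaped = re.escape("a.b")

# '.' matches any single character (except a newline by default).
dot_matches = [w for w in ["bat", "cat", "hat", "at"] if re.fullmatch(r".at", w)]

print(literal_ok, price_ok, escaped, dot_matches)
```

Note that ‘at’ on its own is rejected: the dot must consume exactly one character.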

Page 7: Regex 101

RegEx Basics: Character Classes

Character Classes
• -- [ ]
  ‘[bc]at’ will match ‘bat’ or ‘cat’
• -- [^ ]
  ‘[^A-Z]’ will match any character that is not a capital letter
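As a quick illustration of a character class and its negated form, here is a short sketch in Python (standard `re` module; the sample text is made up for the example).

```python
import re

# [bc] accepts exactly one character from the set {b, c}.
words = ["bat", "cat", "hat"]
bc_words = [w for w in words if re.fullmatch(r"[bc]at", w)]

# [^A-Z] matches every character that is NOT a capital letter, so deleting
# those matches leaves only the capitals behind.
only_caps = re.sub(r"[^A-Z]", "", "Regular Expressions 101")

print(bc_words, only_caps)
```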

Page 8: Regex 101

RegEx Basics: Shorthand Character Classes

Shorthand Character Classes
• \d  Same as [0-9]
• \D  Same as [^0-9]
• \w  Same as [0-9A-Za-z_]
• \W  Same as [^0-9A-Za-z_]
• \s  Tab, line feed, form feed, carriage return, and space
• \S  Anything other than tab, line feed, etc.
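The shorthand classes can be compared against their long forms directly. A small sketch in Python follows; one assumption to flag is the `re.ASCII` flag, which is needed in Python 3 because `\d` and `\w` otherwise also match non-ASCII digits and letters, so the equivalences on the slide hold only for ASCII text.

```python
import re

sample = "user_42 logged in\tat 10:15\n"

# \d with re.ASCII behaves like [0-9].
digits = re.findall(r"\d", sample, flags=re.ASCII)
digits_long = re.findall(r"[0-9]", sample)

# \w with re.ASCII behaves like [0-9A-Za-z_].
words = re.findall(r"\w+", sample, flags=re.ASCII)

# \s picks up the spaces, the tab, and the trailing newline.
whitespace = re.findall(r"\s", sample)

print(digits, words, whitespace)
```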

Page 9: Regex 101

RegEx Basics: Anchors

Anchors
• ^  Beginning of line
  ‘rpm -qa | grep ^ao’ would list all packages that start with ‘ao’
• $  End of line
  ‘[0-9][0-9][0-9]$’ would find every line that ends with 3 consecutive digits
• \b  Word boundary
  ‘\bW.n\w*\b’ looks for words that begin with ‘W’, followed by any character, followed by ‘n’, followed by zero or more word characters
  ‘Win’ ‘Windows’ ‘Won’ ‘Wonton’ ‘Winter’ ‘Wonderland’ ‘Wonder’ all match
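The three anchors can be checked with a short Python sketch. The package names and log lines are hypothetical sample data, and the word-boundary pattern is written as `\bW.n\w*\b` (W, any character, n, then zero or more word characters) so that longer words such as ‘Windows’ are matched in full.

```python
import re

# ^ anchors the match to the start of the string (or line).
packages = ["aom-libs", "alsa-lib", "bash", "aopalliance"]  # hypothetical names
starts_with_ao = [p for p in packages if re.search(r"^ao", p)]

# $ anchors the match to the end of the string (or line).
lines = ["error code 404", "port 8080 open", "id 12"]
ends_with_3_digits = [l for l in lines if re.search(r"[0-9][0-9][0-9]$", l)]

# \b marks the edge between a word character and a non-word character.
text = "Win Windows Won Wonton Winter Wonderland Wonder Water"
w_words = re.findall(r"\bW.n\w*\b", text)

print(starts_with_ao, ends_with_3_digits, w_words)
```

‘Water’ is skipped because its third letter is not ‘n’.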

Page 10: Regex 101

RegEx Basics: Non-Printable

Non-printable
• -- \n  New line
• -- \r  Carriage return

Page 11: Regex 101

RegEx Basics: Groups

Groups
• -- ( )  Defines the scope and precedence of operators
  ‘Write(ln)?’ matches ‘Write’ and ‘Writeln’
• -- |  OR
  ‘Gr(a|e)y’ matches ‘Gray’ and ‘Grey’
  ‘(ITSO|OITS)’ matches ‘ITSO’ or ‘OITS’
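Grouping and alternation can be sketched in Python as follows. One detail that is not on the slide: the `(?:...)` form is a non-capturing group, used here only so that `re.findall` returns whole matches instead of the captured letter.

```python
import re

# The group makes '?' apply to 'ln' as a unit rather than to the last letter.
write_pat = re.compile(r"Write(ln)?")
write_hits = [w for w in ["Write", "Writeln", "Writ"] if write_pat.fullmatch(w)]

# A capturing group also remembers what it matched.
captured = write_pat.fullmatch("Writeln").group(1)

# '|' chooses between alternatives inside the group.
grey_hits = re.findall(r"Gr(?:a|e)y", "Gray or Grey, not Groy")

print(write_hits, captured, grey_hits)
```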

Page 12: Regex 101

RegEx Basics: Quantification

Quantification
Shows how often a token or group is allowed to occur
• ?  Zero or one
  ‘a?’ will match ‘’ and ‘a’
• *  Zero or more
  ‘a*’ will match ‘’ and ‘a’ and ‘aaaaaaaaa’

Page 13: Regex 101

RegEx Basics: Quantification (Cont.)

Quantification
Shows how often a token or group is allowed to occur
• +  One or more
  ‘a+’ will match ‘a’ and ‘aaaaaaaaaaaa’
• { , }  Minimum and maximum
  ‘a{3,7}’ will match between 3 and 7 consecutive ‘a’ characters
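The four quantifiers (?, *, +, and {min,max}) can be compared side by side. This is a minimal Python sketch; the helper name `matches` is made up for the example, and `re.fullmatch` is used so that the whole string has to fit the pattern.

```python
import re

def matches(pattern: str, text: str) -> bool:
    """Return True when the entire text fits the pattern."""
    return re.fullmatch(pattern, text) is not None

optional = [matches(r"a?", s) for s in ["", "a", "aa"]]
star = [matches(r"a*", s) for s in ["", "a", "aaaaaaaaa"]]
plus = [matches(r"a+", s) for s in ["", "a", "aaaaaaaaaaaa"]]
bounded = [matches(r"a{3,7}", "a" * n) for n in (2, 3, 7, 8)]

print(optional, star, plus, bounded)
```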

Page 14: Regex 101

Uses: Searches

• Errors
  (error|exception|illegal|invalid|fail|stack|access|directory|file|not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)
• Redirects
  (document|window)\.
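One way to put the error pattern to work is to scan response text line by line. The sketch below is in Python; the response lines are invented sample data, and the case-insensitive flag is an assumption (the slide does not say which flags the tools use).

```python
import re

ERROR_PATTERN = re.compile(
    r"(error|exception|illegal|invalid|fail|stack|access|directory|file|"
    r"not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)",
    re.IGNORECASE,  # assumption: match regardless of case
)

# Hypothetical lines from an HTTP response body.
response_lines = [
    "<h1>Welcome back</h1>",
    "Unclosed quotation mark after the character string",
    "You have an error in your SQL syntax",
]

# Record the line number and the first keyword found on each suspicious line.
hits = []
for number, line in enumerate(response_lines, start=1):
    match = ERROR_PATTERN.search(line)
    if match:
        hits.append((number, match.group(0)))

print(hits)
```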

Page 15: Regex 101

Uses: Searches (Cont.)

• DOM XSS (sinks)
  ((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()

• DOM XSS (sources)
  (location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)
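A rough sketch of how such a pattern might be run over JavaScript source, written in Python. It uses the first DOM XSS pattern above (the one containing eval and setTimeout); the JavaScript lines are hypothetical, and a hit only marks a line for manual review, not a confirmed vulnerability.

```python
import re

# The pattern is split across raw triple-quoted strings because it contains
# both quote characters; Python joins adjacent string literals.
SINK_PATTERN = re.compile(
    r"""((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|"""
    r"""((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|"""
    r"""eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()"""
)

# Hypothetical JavaScript lines to review.
js_lines = [
    "var total = 1 + 2;",
    "eval(userInput);",
    "document.location.href = target;",
]

flagged = []
for number, line in enumerate(js_lines, start=1):
    match = SINK_PATTERN.search(line)
    if match:
        flagged.append((number, match.group(0)))

print(flagged)
```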

Page 16: Regex 101

Uses: Searching Logs

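• Exclude 156.132.142.11 through .19 and the 156.132.103.x subnet ([11-19] would be a character class matching only ‘1’ or ‘9’, so the range is written 1[1-9]):
  grep -v '156\.132\.142\.1[1-9]\b' /var/log/apache2/other_vhosts_access.log | grep -v '156\.132\.103\.'

• List the unique IP addresses in the log:
  grep -o '\s[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\s' /var/log/apache2/other_vhosts_access.log | sort -t . -k 3,3n -k 4,4n | uniq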
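The same unique-address listing can be sketched in Python. The log lines are invented sample data. Two deliberate differences from the shell pipeline: the pattern uses `\b` word boundaries instead of surrounding whitespace, and the sort compares all four octets numerically rather than only the third and fourth.

```python
import re

IP_PATTERN = re.compile(r"\b[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\.[0-9]{1,3}\b")

# Hypothetical access-log lines.
log_lines = [
    'example.org:80 156.132.142.12 - - "GET / HTTP/1.1" 200',
    'example.org:80 10.0.0.7 - - "GET /a HTTP/1.1" 200',
    'example.org:80 156.132.142.12 - - "GET /b HTTP/1.1" 404',
    'example.org:80 10.0.0.25 - - "GET /c HTTP/1.1" 200',
]

# A set removes duplicates, like `uniq` after `sort`.
unique_ips = {ip for line in log_lines for ip in IP_PATTERN.findall(line)}

# Sort numerically by octet so that .7 comes before .25.
sorted_ips = sorted(unique_ips, key=lambda ip: tuple(int(part) for part in ip.split(".")))

print(sorted_ips)
```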

Page 17: Regex 101

Uses: VI Search and Replace

• In Vim's default (magic) mode, the quantifiers are written \{n,m} and \+

• SS#
  :%s/\d\{3}-\d\{2}-\d\{4}/123-45-6789/g

• email
  :%s/[0-9A-Za-z._%+-]\+@[0-9A-Za-z._%+-]\+\.[A-Za-z]\{2,4}/[email protected]/g

Page 18: Regex 101

Uses: Command Line

openssl ciphers | sed 's/:/\n/g' | sort

Page 19: Regex 101

Uses: Output Mangling

while read -r line; do host "$line"; done < ips.txt | sed 's/ has address / \/ /g' > foo.txt
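For comparison, the substitution performed by sed above looks like this in Python. It is a sketch only: the lines are hypothetical `host` output, using reserved documentation addresses.

```python
import re

# Hypothetical output lines from `host`.
host_output = [
    "example.com has address 192.0.2.10",
    "example.org has address 198.51.100.20",
]

# Equivalent of sed 's/ has address / \/ /g': replace the phrase with " / ".
mangled = [re.sub(r" has address ", " / ", line) for line in host_output]

print(mangled)
```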

Page 20: Regex 101

Uses: Programming

• Sanitizing input
  $name = preg_replace("/\<\s*?\/?script\s*?>/i", "&lt;script&gt;", $name);
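A rough Python equivalent of the preg_replace call is sketched below; the function name `neutralize_script_tags` is made up for the example. It also shows a limitation worth knowing: the pattern only matches bare script tags, so a tag with attributes passes through untouched. Escaping all markup with the standard library's `html.escape` is shown as one possible alternative, not as the approach taken in the slide.

```python
import html
import re

# Same pattern as the PHP example; re.IGNORECASE plays the role of the /i flag.
SCRIPT_TAG = re.compile(r"<\s*?/?script\s*?>", re.IGNORECASE)

def neutralize_script_tags(name: str) -> str:
    """Replace bare <script> and </script> tags with an escaped form."""
    return SCRIPT_TAG.sub("&lt;script&gt;", name)

cleaned = neutralize_script_tags("Bob<SCRIPT>alert(1)</script>")

# A tag with attributes is not matched by this pattern.
missed = SCRIPT_TAG.search("<script src=x>") is None

# Alternative: escape every '<', '>', and '&' instead of hunting for tags.
escaped = html.escape("Bob<script src=x>")

print(cleaned, missed, escaped)
```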

Page 21: Regex 101

Useful RegExes

• SS#
  \d{3}-\d{2}-\d{4}

• Phone#
  (\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}

• IP Addresses
  \b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

• email (use with a case-insensitive flag)
  [0-9A-Z._%+-]+@[0-9A-Z._%+-]+\.[A-Z]{2,4}

• Find Base64
  (?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?

• Credit Card# - HTML Tags - Dates
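The patterns on this slide can be exercised with a few sample values. This is a minimal Python sketch with made-up test data. Two choices to note: the phone pattern writes its separator class as `[ .-]` so the hyphen is literal (written as `[ -.]` it is a range from space to dot and also accepts characters such as ‘+’), and the email pattern is compiled case-insensitively because it lists only capital letters.

```python
import base64
import re

SSN = re.compile(r"\d{3}-\d{2}-\d{4}")
PHONE = re.compile(r"(\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}")
IPV4 = re.compile(
    r"\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    r"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)
EMAIL = re.compile(r"[0-9A-Z._%+-]+@[0-9A-Z._%+-]+\.[A-Z]{2,4}", re.IGNORECASE)
BASE64 = re.compile(r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?")

def ok(pattern: re.Pattern, text: str) -> bool:
    """Return True when the entire text fits the pattern."""
    return pattern.fullmatch(text) is not None

results = {
    "ssn": ok(SSN, "123-45-6789"),
    "phone_full": ok(PHONE, "(555) 867-5309"),
    "phone_short": ok(PHONE, "867-5309"),
    "phone_plus": ok(PHONE, "867+5309"),       # rejected by [ .-]
    "range_class": re.fullmatch(r"\d{3}[ -.]\d{4}", "867+5309") is not None,
    "ip_valid": ok(IPV4, "192.168.1.254"),
    "ip_invalid": ok(IPV4, "256.1.1.1"),       # 256 is out of range
    "email": ok(EMAIL, "user@example.com"),
    "base64": ok(BASE64, base64.b64encode(b"hello").decode()),
}

print(results)
```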

Page 22: Regex 101

Questions?

Page 23: Regex 101

Go forth and RegEx…

Page 24: Regex 101

References

• Web Application Hacker's Handbook
• http://regex.info/blog/2006-09-15/247#comment-3085
• http://en.wikipedia.org/wiki/Regular_expression
• https://isc.sans.edu/regex.html
• http://www.regular-expressions.info/examples.html
• http://blog.spiderlabs.com/2013/02/easy-dom-based-xss-detection-via-regexes.html
• www.xkcd.com