A basic introduction to regular expressions and some of the ways they can be used
Todd Benson
RegEx 101
Overview
• What is RegEx
• RegEx Basics
• Uses for RegEx
• Useful RegExpressions
What is RegEx?
• “In computing, a regular expression (abbreviated regex or regexp) is a sequence of characters that forms a search pattern, mainly for use in pattern matching with strings, or string matching, i.e. ‘find and replace’-like operations.” - Wikipedia
• “Some people, when confronted with a problem, think ‘I know, I'll use regular expressions.’ Now they have two problems.” - Jamie Zawinski
Why RegEx?
• Tools use it: Nessus, Burp, W3AF
• Most programming languages support it
• Excellent tool to have in the toolbox
RegEx Basics: Literal Matches
Literal Matches
‘bat’ matches ‘bat’
12 special characters: \ ^ $ . | ? * + ( ) [ ]
These must be escaped with a backslash to match literally, e.g. ‘\\’ or ‘\$’
• .
Matches any single character: ‘.at’ matches ‘bat’, ‘cat’, and ‘hat’
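The literal-match and escaping rules above can be tried out with a short sketch. Python's `re` module is used here only for illustration (the slides do not name a language), and the sample strings are made up.

```python
import re

# A literal pattern matches the same text wherever it occurs.
literal_hit = re.search(r"bat", "wombat") is not None

# A metacharacter such as '$' must be escaped to match literally.
price = re.search(r"\$5", "costs $5 today")

# re.escape() adds the backslashes for you.
escaped = re.escape("a.b")

# '.' matches any single character (except a newline by default).
dot_hits = re.findall(r".at", "bat cat hat")
print(literal_hit, price.group(), escaped, dot_hits)
```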
RegEx Basics: Character Classes
Character Classes
• [ ]
‘[bc]at’ will match ‘bat’ or ‘cat’
• [^ ]
‘[^A-Z]’ will match any character that is not a capital letter
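A minimal sketch of both kinds of character class, again in Python as an assumed language with made-up sample text:

```python
import re

# [bc] accepts exactly one character from the set: 'b' or 'c'.
bc_at = re.findall(r"[bc]at", "bat cat hat")

# [^A-Z] accepts any one character that is NOT a capital letter,
# so deleting every such character leaves only the capitals.
capitals_only = re.sub(r"[^A-Z]", "", "RegEx 101")
print(bc_at, capitals_only)
```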
RegEx Basics: Shorthand Character Classes
Shorthand Character Classes
• \d
Same as [0-9]
• \D
Same as [^0-9]
• \w
Same as [0-9A-Za-z_]
• \W
Same as [^0-9A-Za-z_]
• \s
Tab, line feed, form feed, carriage return, and space
• \S
Anything other than tab, line feed, form feed, carriage return, and space
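The shorthand classes can be sketched as follows. One caveat specific to Python 3: for `str` patterns these classes are Unicode-aware by default, so the `re.ASCII` flag is passed here to restrict them to the ASCII sets listed above. The sample text is made up.

```python
import re

text = "Room 101,\tfloor_3"

# \d+ : runs of digits
digits = re.findall(r"\d+", text, flags=re.ASCII)

# \w+ : runs of letters, digits, and underscore
words = re.findall(r"\w+", text, flags=re.ASCII)

# \s+ : runs of whitespace, used here as a separator
pieces = re.split(r"\s+", text)
print(digits, words, pieces)
```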
RegEx Basics: Anchors
Anchors
• ^
Beginning of line
‘rpm -qa|grep ^ao’ would list all packages that start with ‘ao’
• $
End of line
‘[0-9][0-9][0-9]$’ would find every line that ends with 3 consecutive digits
• \b
Word boundary
‘\bW.n\w*\b’ looks for words that begin with ‘W’, followed by any character, followed by ‘n’, followed by zero or more word characters
‘Win’ ‘Windows’ ‘Won’ ‘Wonton’ ‘Winter’ ‘Wonderland’ ‘Wonder’ all match
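A small sketch of the three anchors, with hypothetical package names and log lines standing in for real `rpm` output. Note that a quantifier applies only to the token right before it, so `n*` means "zero or more `n`"; matching the rest of a word takes something like `\w*`, which is what the sketch uses.

```python
import re

# Hypothetical package names standing in for 'rpm -qa' output.
packages = ["aopalliance-1.0", "bash-5.1", "yum-aoe-2"]
starts_with_ao = [p for p in packages if re.search(r"^ao", p)]

# '$' anchors the match to the end of the line.
lines = ["build 123", "build 12a", "7890"]
ends_in_3_digits = [s for s in lines if re.search(r"[0-9][0-9][0-9]$", s)]

# '\b' anchors at a word boundary; '\w*' consumes the rest of the word.
sample = "Win Windows Won Wonton Winter Wonderland Wonder Wasp"
w_words = re.findall(r"\bW.n\w*\b", sample)

# With 'n*' instead, only the three-letter words fit between the boundaries.
n_star = re.findall(r"\bW.n*\b", sample)
print(starts_with_ao, ends_in_3_digits, w_words, n_star)
```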
RegEx Basics: Non-Printable
Non-printable
• \n
New line
• \r
Carriage return
RegEx Basics: Groups
Groups
• ( )
Defines the scope and precedence of operators
‘Write(ln)?’ matches ‘Write’ and ‘Writeln’
• |
OR
‘Gr(a|e)y’ matches ‘Gray’ and ‘Grey’
‘(ITSO|OITS)’ matches ‘ITSO’ or ‘OITS’
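Grouping and alternation can be sketched like this. One Python-specific detail: `re.findall` returns the captured group rather than the whole match when the pattern contains a capturing group, so the sketch uses `finditer` for one case and a non-capturing group `(?:...)` for the other.

```python
import re

# (ln)? makes the whole group 'ln' optional.
write_hits = [m.group(0) for m in re.finditer(r"Write(ln)?", "Write Writeln")]

# (?:a|e) groups the alternatives without capturing them.
grey_hits = re.findall(r"Gr(?:a|e)y", "Gray Grey Groy")

# Alternation between whole words.
is_office = re.fullmatch(r"(ITSO|OITS)", "OITS") is not None
print(write_hits, grey_hits, is_office)
```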
RegEx Basics: Quantification
Quantification
Shows how often a token or group is allowed to occur
• ?
Zero or one
‘a?’ will match ‘’ and ‘a’
• *
Zero or more
‘a*’ will match ‘’ and ‘a’ and ‘aaaaaaaaa’
RegEx Basics: Quantification (Cont.)
• +
One or more
‘a+’ will match ‘a’ and ‘aaaaaaaaaaaa’
• { , }
Minimum and maximum
‘a{3,7}’ will match between 3 and 7 consecutive ‘a’ characters
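The four quantifiers can be compared side by side with a small sketch. The helper name `fully_matches` is made up for this example; it uses `re.fullmatch` so that the whole string must be consumed by the pattern.

```python
import re

def fully_matches(pattern, text):
    """Return True when the entire text is matched by the pattern."""
    return re.fullmatch(pattern, text) is not None

# '?' : zero or one
optional = [fully_matches(r"a?", s) for s in ("", "a", "aa")]

# '*' : zero or more
star = [fully_matches(r"a*", s) for s in ("", "a", "aaaaaaaaa")]

# '+' : one or more
plus = [fully_matches(r"a+", s) for s in ("", "a", "aaaaaaaaaaaa")]

# '{3,7}' : at least 3 and at most 7
bounded = [fully_matches(r"a{3,7}", "a" * n) for n in (2, 3, 7, 8)]
print(optional, star, plus, bounded)
```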
Uses: Searches
• Errors
(error|exception|illegal|invalid|fail|stack|access|directory|file|not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)
• Redirects
(document|window)\.
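As a rough sketch of how the error pattern might be applied to a response body: the function name `find_error_hints`, the sample text, and the choice of case-insensitive matching are all assumptions made for this example, not something the slides specify.

```python
import re

# The error keyword pattern from the slide, compiled case-insensitively
# (the case-insensitivity is an assumption).
ERRORS = re.compile(
    r"(error|exception|illegal|invalid|fail|stack|access|directory|file|"
    r"not found|unknown|uid=|varchar|SQL|quotation mark|syntax|password)",
    re.IGNORECASE,
)

def find_error_hints(body):
    """Return the distinct keywords found in body, lower-cased and sorted."""
    return sorted({m.group(0).lower() for m in ERRORS.finditer(body)})

# Made-up response text for illustration.
sample = "Warning: SQL syntax error near 'x'; file not found"
hints = find_error_hints(sample)
print(hints)
```

A keyword list this broad will produce false positives (for example, any page that mentions a "file"), so the hits are best treated as leads to review by hand.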
Uses: Searches (Cont.)
• DOM XSS (sinks)
((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()
• DOM XSS (sources)
(location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|referrer|name|opener|parent|top|content|self|frames)\W)|(localStorage|sessionStorage|Database)
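The two DOM XSS patterns can be run over JavaScript source one line at a time. In the sketch below, the pattern containing `eval` and `setTimeout` is read as a list of dangerous sinks and the one containing `location` and `cookie` as a list of attacker-controlled sources; that labelling, the function name `classify`, and the sample lines are assumptions.

```python
import re

# Patterns copied from the slides; raw triple-quoted strings avoid
# having to escape the quote characters inside the character classes.
SINKS = re.compile(
    r"""((src|href|data|location|code|value|action)\s*["'\]]*\s*\+?\s*=)|"""
    r"""((replace|assign|navigate|getResponseHeader|open(Dialog)?|showModalDialog|"""
    r"""eval|evaluate|execCommand|execScript|setTimeout|setInterval)\s*["'\]]*\s*\()"""
)
SOURCES = re.compile(
    r"""(location\s*[\[.])|([.\[]\s*["']?\s*(arguments|dialogArguments|innerHTML|"""
    r"""write(ln)?|open(Dialog)?|showModalDialog|cookie|URL|documentURI|baseURI|"""
    r"""referrer|name|opener|parent|top|content|self|frames)\W)|"""
    r"""(localStorage|sessionStorage|Database)"""
)

def classify(line):
    """Return (has_source, has_sink) for one line of JavaScript."""
    return (SOURCES.search(line) is not None, SINKS.search(line) is not None)

# Made-up JavaScript lines for illustration.
results = [classify(s) for s in (
    "var q = document.location.hash;",
    "eval(q);",
    "var n = 1 + 2;",
)]
print(results)
```

Hits only flag lines worth reading; whether data actually flows from a source to a sink still has to be checked manually.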
Uses: Searching Logs
• grep -v '156\.132\.142\.1[1-9]\b' /var/log/apache2/other_vhosts_access.log | grep -v '156\.132\.103\.'
• cat /var/log/apache2/other_vhosts_access.log|grep -o '\s[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\.[0-9]\{1,3\}\s' | sort -t . -k 3,3n -k 4,4n|uniq
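The extract, sort, and de-duplicate pipeline above could be sketched in Python as follows. The function name `unique_ips` and the sample log lines are hypothetical, and the sort here orders by all four octets rather than only the third and fourth.

```python
import re

# Four groups of 1-3 digits separated by literal dots.
IP = re.compile(r"\b(?:[0-9]{1,3}\.){3}[0-9]{1,3}\b")

def unique_ips(lines):
    """Collect distinct IP-like strings and sort them numerically by octet."""
    found = {ip for line in lines for ip in IP.findall(line)}
    return sorted(found, key=lambda ip: tuple(int(part) for part in ip.split(".")))

# Made-up access-log lines for illustration.
sample = [
    'site:80 10.0.0.12 - - "GET /"',
    'site:80 10.0.0.2 - - "GET /a"',
    'site:80 10.0.0.12 - - "GET /b"',
]
ips = unique_ips(sample)
print(ips)
```

To run it against a real log, the lines could come from `open(path)` instead of the sample list.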
Uses: VI Search and Replace
• SS#
:%s/\d\{3}-\d\{2}-\d\{4}/123-45-6789/g
• email
:%s/[0-9A-Za-z._%+-]\+@[0-9A-Za-z._%+-]\+\.[A-Za-z]\{2,4}/[email protected]/g
(In Vim's default pattern mode, the quantifiers are written \+ and \{n,m}.)
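The same two substitutions can be sketched outside the editor with `re.sub`. The function name `redact`, the sample text, and the replacement address `user@example.com` are placeholders chosen for this example.

```python
import re

SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
EMAIL = re.compile(r"[0-9A-Za-z._%+-]+@[0-9A-Za-z._%+-]+\.[A-Za-z]{2,4}")

def redact(text):
    """Replace SSN-like and email-like strings with fixed dummy values."""
    text = SSN.sub("123-45-6789", text)
    # 'user@example.com' is a placeholder, not the address from the slide.
    return EMAIL.sub("user@example.com", text)

cleaned = redact("SSN 078-05-1120, mail jdoe@corp.example.org")
print(cleaned)
```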
Uses: Command Line
openssl ciphers|sed 's/:/\n/g'|sort
Uses: Output Mangling
while read line; do host $line; done < ips.txt | sed 's/ has address / \/ /g' > foo.txt
Uses: Programming
• Sanitizing input
$name = preg_replace("/\<\s*?\/?script\s*?>/i", "&lt;script&gt;", $name);
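A Python sketch of the same idea, assuming the intent is to swap each script tag for its HTML-escaped form. The function name `neutralize_script_tags` and the sample input are made up. The standard library's `html.escape` is shown next to it because escaping every angle bracket is generally more robust than targeting one tag name.

```python
import html
import re

# Opening or closing script tag, allowing stray whitespace, any letter case.
SCRIPT_TAG = re.compile(r"<\s*/?script\s*>", re.IGNORECASE)

def neutralize_script_tags(name):
    """Replace each script tag with an escaped, inert version."""
    return SCRIPT_TAG.sub("&lt;script&gt;", name)

payload = "<script>alert(1)</SCRIPT>"
by_regex = neutralize_script_tags(payload)

# html.escape() converts every '<', '>' and '&' (and quotes by default).
by_escape = html.escape("<script>alert(1)</script>")
print(by_regex, by_escape)
```

The regex approach misses tags with attributes (such as `<script src=...>`) and other injection vectors, so it should be read as a demonstration of substitution rather than a complete defense.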
Useful RegExes
• SS#
\d{3}-\d{2}-\d{4}
• Phone#
(\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}
• IP Addresses
\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b
• email (use with case-insensitive matching)
[0-9A-Z._%+-]+@[0-9A-Z._%+-]+\.[A-Z]{2,4}
• Find Base64
(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?
• Others: Credit Card#, HTML tags, dates
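A few of these patterns can be exercised with a short sketch. The phone pattern is written with the hyphen last in the character class (`[ .-]`) so that it is a literal hyphen and not a range, and the sample values are made up. Because the Base64 pattern can also match an empty string, the checks use `re.fullmatch` to require that the whole value fits.

```python
import re

OCTET = r"(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)"
IPV4 = re.compile(rf"\b(?:{OCTET}\.){{3}}{OCTET}\b")
PHONE = re.compile(r"(\(?\d{3}\)?[ .-])?\d{3}[ .-]\d{4}")
BASE64 = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)

def is_valid(pattern, value):
    """Return True when the whole value matches the compiled pattern."""
    return pattern.fullmatch(value) is not None

ip_checks = [is_valid(IPV4, v) for v in ("192.168.1.254", "256.1.1.1")]
phone_checks = [is_valid(PHONE, v) for v in ("(555) 867-5309", "867.5309", "8675309")]
b64_checks = [is_valid(BASE64, v) for v in ("aGVsbG8=", "aGVsbG8")]
print(ip_checks, phone_checks, b64_checks)
```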
Questions?
Go forth and RegEx…
References
• Web Application Hacker's Handbook
• http://regex.info/blog/2006-09-15/247#comment-3085
• http://en.wikipedia.org/wiki/Regular_expression
• https://isc.sans.edu/regex.html
• http://www.regular-expressions.info/examples.html
• http://blog.spiderlabs.com/2013/02/easy-dom-based-xss-detection-via-regexes.html
• www.xkcd.com