Welcome to Regular Expressions. We will begin at 4:00.
While you’re waiting, please
● download this presentation at http://brown.edu/go/regex
● download and (if you can) print off the RegEx cheatsheet at https://tinyurl.com/ovz8pao
Regular Expressions
Search and Replace with Advanced Pattern Matching
Patrick Rashleigh
First things first
● Download this presentation at http://brown.edu/go/regex
● Download this RegEx cheatsheet: http://web.mit.edu/hackl/www/lab/turkshop/slides/regex-cheatsheet.pdf
● Open up chat in Zoom
● Mute the mic in Zoom
Introductions
Briefly introduce yourself in the chat (any of this is optional)
● name
● area of study
● what brings you to a Regular Expressions workshop?
Workshop format
● I talk a bit, demo via screen sharing; you post questions to chat
● You do a challenge for a few minutes while I check the chat for questions
● Post your solutions to chat; discuss
● Repeat!
What is a Regular Expression?
A text pattern matching language that is used in many tools
b.*t matches beet and bot and bat (. is any character; * repeats it 0 or more times)
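As a rough illustration in Python (an assumption; the workshop itself uses RegEx101.com), this kind of pattern can be tried with the standard `re` module. In regex syntax, "a b, then anything, then a t" is written `b.*t`, where `.` stands for any character and `*` repeats it. The word list below is made up for this sketch.

```python
import re

# Hypothetical word list for the sketch
words = ["beet", "bot", "bat", "cat", "tab"]

# b, then any characters (.*), then t
pattern = re.compile(r"b.*t")

# fullmatch requires the whole word to fit the pattern
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)
```

Only the words that start with b and end with t are kept.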
Example of Data Mining Free Text
“Show me all the names of characters in Pride and Prejudice mentioned in other characters’ speech”
http://regex101.com/r/sS3vM6/2
Behold: the Regular Expression language!
"[^"]*M(s|rs?)\.(\s+[A-Z]\w+)+[^"]*"
Example of data cleaning structured text
http://regex101.com/r/nN4fE3/2
Text editors
(in this case, Sublime Text)
Data cleaners
(in this case, Open Refine)
Programming languages
Python, JavaScript, SQL, R, and pretty much any language
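For instance, a search and replace in Python might look like the sketch below. It assumes the standard `re` module; the sample string and the clean-up task are invented for illustration, not taken from the workshop demos.

```python
import re

# Hypothetical messy input: irregular spacing and a stray tab
text = "Providence,   Rhode \t Island"

# \s+ means "one or more whitespace characters";
# each such run is replaced with a single space
cleaned = re.sub(r"\s+", " ", text)
print(cleaned)
```

The same pattern would work largely unchanged in the other tools listed; only the surrounding function call differs.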
Our tool today: RegEx101.com
Take a deep breath and let’s go step by step
Step 1: Basic matching
Literals
Demo: http://regex101.com/r/iI9dA0/2
Symbols
\d \w \s \n
Demo: http://regex101.com/r/pR6xE0/2
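A small Python sketch of the four symbols, assuming the standard `re` module and a made-up two-line string: `\d` is a digit, `\w` a word character (letter, digit, or underscore), `\s` any whitespace, and `\n` a newline.

```python
import re

# Hypothetical two-line sample text
text = "Lab 7 opens at 9\nBring 2 pens"

digits = re.findall(r"\d", text)       # every single digit
word_chars = re.findall(r"\w", text)   # letters, digits, underscore
whitespace = re.findall(r"\s", text)   # spaces and the newline
newlines = re.findall(r"\n", text)     # the newline only

# Symbols can be chained like literals:
# three word characters, one whitespace, one digit
combined = re.findall(r"\w\w\w\s\d", text)

print(digits, len(word_chars), len(whitespace), len(newlines), combined)
```

Each symbol matches exactly one character, so longer matches are built by writing several symbols in a row.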
CHALLENGE
Find all Rhode Island telephone numbers (401 area code)
Post your solution to the chat
Link: http://regex101.com/r/pR6xE0/2
Step 2: Character classes
List: [...]
Not-list: [^...]
Ranges: [A-Z]
Disjunction: ( | )
Demo: http://regex101.com/r/iI9dA0/2
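The four constructs can be tried side by side in Python. This is only a sketch with an invented sentence, assuming the standard `re` module.

```python
import re

# Hypothetical sample sentence
text = "The cat sat on a mat; the Bat hid in a hat."

# List: c or m, followed by "at"
in_list = re.findall(r"[cm]at", text)

# Not-list: any character except c or m, followed by "at"
not_in_list = re.findall(r"[^cm]at", text)

# Range: any single capital letter
capitals = re.findall(r"[A-Z]", text)

# Disjunction: c or h, followed by "at".
# finditer + group() returns the whole match rather than just the group.
either = [m.group() for m in re.finditer(r"(c|h)at", text)]

print(in_list, not_in_list, capitals, either)
```

A list or range always matches one character; a disjunction can choose between alternatives of any length.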
Boundary: \b
Matching a condition, not a character
Demo: http://regex101.com/r/iI9dA0/2
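To see what "a condition, not a character" means, compare a pattern with and without `\b`. The sketch below assumes Python's standard `re` module and a made-up string.

```python
import re

# Hypothetical sample text
text = "cat concat category cat."

# Without boundaries, "cat" is found inside longer words too
anywhere = re.findall(r"cat", text)

# With \b on both sides, only the standalone word matches
whole_words = re.findall(r"\bcat\b", text)

# The boundary itself consumes nothing: the match is still just 3 characters
first = re.search(r"\bcat\b", text)

print(len(anywhere), len(whole_words), first.group())
```

The period after the last "cat" is not a word character, so a boundary exists there as well.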
CHALLENGE
Match all 4-letter words beginning with f (any case). Don't use the i flag
Post your solution to the chat
Link: http://regex101.com/r/iI9dA0/2
Step 3: Quantifiers
X*       0 or more Xs
X+       1 or more Xs
X?       X may exist
X{m,n}   Between m and n Xs
Only affects the immediately previous entity (character or group)
Demo: http://regex101.com/r/iI9dA0/2
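A quick way to get a feel for the quantifiers is to run each one against the same set of strings. This sketch assumes Python's standard `re` module; the sample strings and the helper `full_matches` are invented for illustration.

```python
import re

# Hypothetical samples: "c", some number of "a"s, then "t"
samples = ["ct", "cat", "caat", "caaat", "caaaat"]

def full_matches(pattern):
    """Return the samples that the pattern matches in full."""
    return [s for s in samples if re.fullmatch(pattern, s)]

star = full_matches(r"ca*t")         # 0 or more a's
plus = full_matches(r"ca+t")         # 1 or more a's
optional = full_matches(r"ca?t")     # 0 or 1 a
between = full_matches(r"ca{2,3}t")  # between 2 and 3 a's

# A quantifier applies only to the entity just before it:
# in ha+ that is the single character a; in (ha)+ it is the whole group
char_repeated = re.fullmatch(r"ha+", "haaa") is not None
char_not_group = re.fullmatch(r"ha+", "haha") is not None
group_repeated = re.fullmatch(r"(ha)+", "haha") is not None

print(star, plus, optional, between)
print(char_repeated, char_not_group, group_repeated)
```

In every pattern the leading c and trailing t are unaffected, because the quantifier sits after the a only.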
CHALLENGE
Match all words of any length that start with a capital letter
Post your solution to the chat
Link: http://regex101.com/r/iI9dA0/2
CHALLENGE 2
Get the text of all quotes (i.e. falling between quotation marks)
Post your solution to the chat
Link: https://regex101.com/r/sS3vM6/1
You now have all the knowledge to read this!
"[^"]*M(s|rs?)\.(\s+[A-Z]\w+)+[^"]*"
(and if it’s still gobbledygook, you know where to find me to follow up)
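One way to read the expression is piece by piece. The sketch below rewrites it in Python's verbose mode (`re.VERBOSE`), which ignores whitespace in the pattern and allows comments. The sample text is invented for this sketch and is not a quotation from the novel.

```python
import re

pattern = re.compile(r'''
    "               # opening quotation mark
    [^"]*           # any run of characters that are not a quotation mark
    M(s|rs?)        # M followed by s, r, or rs: Ms, Mr, Mrs
    \.              # a literal period
    (\s+[A-Z]\w+)+  # one or more capitalized words, each preceded by whitespace
    [^"]*           # the rest of the speech
    "               # closing quotation mark
''', re.VERBOSE)

# Hypothetical sample text
text = ('She said, "I hear Mr. Bingley has arrived." '
        'He replied, "Indeed." Later: "Ask Mrs. Long about it."')

# group() with no argument returns the whole matched speech
matches = [m.group() for m in pattern.finditer(text)]
for speech in matches:
    print(speech)
```

The speech "Indeed." is skipped because it contains no Mr., Mrs., or Ms. followed by a capitalized name.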
Thank you!
[email protected]
Feedback form (enter “Regex” for the workshop title):
http://brown.edu/go/workshop-evaluation