Welcome to Regular Expressions. We will begin at 4:00. While you’re waiting, please download this presentation at http://brown.edu/go/regex, and download and (if you can) print off the RegEx cheatsheet at https://tinyurl.com/ovz8pao

cds.library.brown.edu/projects/dsl/workshops/regex/...2020/04/13

  • http://brown.edu/go/regex

    Welcome to Regular Expressions. We will begin at 4:00.

    While you’re waiting, please
    ● download this presentation at http://brown.edu/go/regex
    ● download and (if you can) print off the RegEx cheatsheet at https://tinyurl.com/ovz8pao

  • http://brown.edu/go/regex

    Regular Expressions
    Search and Replace with Advanced Pattern Matching
    Patrick Rashleigh

  • http://brown.edu/go/regex

    First things first
    ● Download this presentation at http://brown.edu/go/regex
    ● Download this RegEx cheatsheet: http://web.mit.edu/hackl/www/lab/turkshop/slides/regex-cheatsheet.pdf
    ● Open up chat in Zoom
    ● Mute the mic in Zoom

  • http://brown.edu/go/regex

    Introductions
    Briefly introduce yourself in the chat (any of this is optional)
    ● name
    ● area of study
    ● what brings you to a Regular Expressions workshop?

  • http://brown.edu/go/regex

    Workshop format
    ● I talk a bit, demo via screen sharing; you post questions to chat
    ● You do a challenge for a few minutes while I check the chat for questions
    ● Post your solutions to chat; discuss
    ● Repeat!

  • http://brown.edu/go/regex

    What is a Regular Expression?

  • http://brown.edu/go/regex

    A text pattern matching language that is used in many tools

  • http://brown.edu/go/regex

    b.*t matches beet and bot and bat
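
A minimal sketch of this kind of match in Python, assuming the pattern is written as `b.*t`: in regular expression syntax `.` stands for any single character and `*` repeats the item before it zero or more times. The word list is a hypothetical sample.

```python
import re

# "b", then any run of characters, then "t".
pattern = re.compile(r"b.*t")

# Hypothetical sample words; "tab" is included as a non-match.
words = ["beet", "bot", "bat", "bit", "tab"]

# fullmatch requires the whole word to fit the pattern.
matches = [w for w in words if pattern.fullmatch(w)]
print(matches)  # ['beet', 'bot', 'bat', 'bit']
```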

  • http://brown.edu/go/regex

    “Show me all the names of characters in Pride and Prejudice mentioned in other characters’ speech”

  • http://brown.edu/go/regex

    Example of Data Mining Free Text

    http://regex101.com/r/sS3vM6/2

  • http://brown.edu/go/regex

    Behold: the Regular Expression language!

    "[^"]*M(s|rs?)\.(\s+[A-Z]\w+)+[^"]*"

  • http://brown.edu/go/regex

    Example of data cleaning structured text

    http://regex101.com/r/nN4fE3/2

  • http://brown.edu/go/regex

    Text editors

    (in this case, Sublime Text)

  • http://brown.edu/go/regex

    Data cleaners

    (in this case, Open Refine)

  • http://brown.edu/go/regex

    Programming languages
    Python, JavaScript, SQL, R, and pretty much any language
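
As one illustration of regular expressions inside a programming language, here is a short sketch using Python's standard `re` module. The sample text and the pattern are hypothetical and chosen only to show a search followed by a replace.

```python
import re

# Hypothetical sample text for illustration.
text = "The colour of the color wheel"

# "u?" makes the "u" optional, so both spellings are found.
found = re.findall(r"colou?r", text)

# Search and replace: normalize every variant in one pass.
normalized = re.sub(r"colou?r", "color", text)

print(found)       # ['colour', 'color']
print(normalized)  # The color of the color wheel
```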

  • http://brown.edu/go/regex

    Our tool today: RegEx101.com

  • http://brown.edu/go/regex

    Take a deep breath and let’s go step by step
    Step 1: Basic matching

  • http://brown.edu/go/regex

    Literals
    Demo: http://regex101.com/r/iI9dA0/2

  • http://brown.edu/go/regex

    Symbols: \d \w \s \n
    Demo: http://regex101.com/r/pR6xE0/2
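
The four symbols can be tried out in Python as a rough sketch: `\d` matches a digit, `\w` a word character (letter, digit, or underscore), `\s` a whitespace character, and `\n` a newline. The sample text below is hypothetical and is not the text used in the linked demo.

```python
import re

# Hypothetical sample text (not the workshop's demo text).
text = "Call 401-555-0100\nor write to box 12"

digits = re.findall(r"\d", text)            # every single digit
four_chars = re.findall(r"\w\w\w\w", text)  # four word characters in a row
whitespace = re.findall(r"\s", text)        # spaces and the newline
lines = re.split(r"\n", text)               # split on the newline itself

print(len(digits))      # 12
print(four_chars)       # ['Call', '0100', 'writ']
print(len(whitespace))  # 6
print(lines)            # ['Call 401-555-0100', 'or write to box 12']
```

Each symbol matches exactly one character, which is why `\w\w\w\w` stops at "writ" and leaves the final "e" of "write" unmatched.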

  • http://brown.edu/go/regex

    CHALLENGE
    Find all Rhode Island telephone numbers (401 area code)
    Post your solution to the chat
    Link: http://regex101.com/r/pR6xE0/2

  • http://brown.edu/go/regex

    Step 2: Character classes

  • http://brown.edu/go/regex

    List: [...]
    Not-list: [^...]
    Ranges: [A-Z]
    Disjunction: ( | )
    Demo: http://regex101.com/r/iI9dA0/2
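
The four constructs on this slide can be sketched in Python as follows. The sample sentence is hypothetical, chosen so that each pattern has an obvious result.

```python
import re

# Hypothetical sample sentence for illustration.
text = "The cat sat on a mat; the dog dug."

in_list = re.findall(r"[cm]at", text)       # list: "c" or "m", then "at"
not_in_list = re.findall(r"[^cm]at", text)  # not-list: anything but "c" or "m", then "at"
capitals = re.findall(r"[A-Z]", text)       # range: any uppercase letter A to Z
either = re.findall(r"(cat|dog)", text)     # disjunction: "cat" or "dog"

print(in_list)      # ['cat', 'mat']
print(not_in_list)  # ['sat']
print(capitals)     # ['T']
print(either)       # ['cat', 'dog']
```

A list or range matches one character from the set, while the disjunction chooses between whole alternatives.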

  • http://brown.edu/go/regex

    Boundary: \b
    Matching a condition, not a character
    Demo: http://regex101.com/r/iI9dA0/2
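
To illustrate "a condition, not a character", this sketch compares a plain literal with the same literal wrapped in `\b` word boundaries. The sample text is hypothetical.

```python
import re

# Hypothetical sample text: "cat" appears alone and inside longer words.
text = "cat concat catalog cat."

# Without boundaries, "cat" is found inside "concat" and "catalog" too.
anywhere = re.findall(r"cat", text)

# \b consumes no characters; it only requires a word edge at that position.
whole_words = [m.start() for m in re.finditer(r"\bcat\b", text)]

print(len(anywhere))  # 4
print(whole_words)    # [0, 19]
```

The matched text is still just "cat" in both cases; the boundary only restricts where a match may begin and end.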

  • http://brown.edu/go/regex

    CHALLENGE
    Match all 4-letter words beginning with f (any case). Don't use the i flag
    Post your solution to the chat
    Link: http://regex101.com/r/iI9dA0/2

  • http://brown.edu/go/regex

    Step 3: Quantifiers

  • http://brown.edu/go/regex

    X*        0 or more Xs
    X+        1 or more Xs
    X?        X may exist
    X{m,n}    Between m and n Xs

  • http://brown.edu/go/regex

    Only affects the immediately previous entity (character or group)
    Demo: http://regex101.com/r/iI9dA0/2
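
A small sketch of the four quantifiers in Python, followed by the "immediately previous entity" rule. The word list and the helper `full_matches` are hypothetical names introduced for this example.

```python
import re

# Hypothetical test words: "c", zero to four "a"s, then "t".
words = ["ct", "cat", "caat", "caaat", "caaaat"]

def full_matches(pattern):
    """Return the words that the pattern matches from start to end."""
    return [w for w in words if re.fullmatch(pattern, w)]

print(full_matches(r"ca*t"))      # ['ct', 'cat', 'caat', 'caaat', 'caaaat']
print(full_matches(r"ca+t"))      # ['cat', 'caat', 'caaat', 'caaaat']
print(full_matches(r"ca?t"))      # ['ct', 'cat']
print(full_matches(r"ca{2,3}t"))  # ['caat', 'caaat']

# A quantifier applies only to the entity just before it.
only_b = re.fullmatch(r"ab+", "ababab")      # "+" repeats "b" only: no match
whole_ab = re.fullmatch(r"(ab)+", "ababab")  # "+" repeats the group "ab": match
print(only_b is None, whole_ab is not None)  # True True
```

Wrapping characters in parentheses turns them into a single entity, which is how a quantifier can be made to repeat more than one character.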

  • http://brown.edu/go/regex

    CHALLENGE
    Match all words of any length that start with a capital letter
    Post your solution to the chat
    Link: http://regex101.com/r/iI9dA0/2

  • http://brown.edu/go/regex

    CHALLENGE 2
    Get the text of all quotes (i.e. falling between quotation marks)
    Post your solution to the chat
    Link: https://regex101.com/r/sS3vM6/1

  • http://brown.edu/go/regex

    You now have all the knowledge to read this!

    "[^"]*M(s|rs?)\.(\s+[A-Z]\w+)+[^"]*"

    (and if it’s still gobbledygook, you know where to find me to follow up)
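
For anyone who wants to follow up, here is one way to read the expression piece by piece, sketched with Python's `re.VERBOSE` flag, which allows whitespace and comments inside a pattern. The sample text is an assumption written for this example, and the name `quote_with_name` is hypothetical.

```python
import re

# The same expression as on the slide, spread over several lines.
quote_with_name = re.compile(r'''
    "                # an opening quotation mark
    [^"]*            # any run of characters that are not quotation marks
    M(s|rs?)         # "M" followed by "s", "r", or "rs": Ms, Mr, Mrs
    \.               # a literal period
    (\s+[A-Z]\w+)+   # one or more capitalized words, each after whitespace
    [^"]*            # the rest of the quoted speech
    "                # the closing quotation mark
''', re.VERBOSE)

# Hypothetical sample text: one name inside speech, one outside it.
text = '"My dear Mr. Bennet," said his lady, "have you heard the news?" Mr. Darcy said nothing.'

quotes = [m.group(0) for m in quote_with_name.finditer(text)]
print(quotes)  # ['"My dear Mr. Bennet,"']
```

In this sample only the first quotation is returned: the second contains no title, and "Mr. Darcy" is not followed by a closing quotation mark.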

  • http://brown.edu/go/regex

    Thank you!
    [email protected]

    Feedback form: enter “Regex” for the workshop title
    http://brown.edu/go/workshop-evaluation