04 Syntax Analysis S16

  • 7/24/2019 04 Syntax Analysis S16

    Syntax Analysis

    CSE 340 Principles of Programming Languages

    Spring 2016

    Adam Doupé

    Arizona State University

    http://adamdoupe.com

    Adam Doupé, Principles of Programming Languages

    Syntax Analysis

    • The goal of syntax analysis is to transform the sequence of tokens from the lexer into something useful

    • However, we need a way to specify and check if the sequence of tokens is valid

        NUM PLUS NUM
        DECIMAL DOT NUM ID DOT ID
        DOT DOT DOT NUM ID DOT ID


    Using Regular Expressions

        PROGRAM = STATEMENT*
        STATEMENT = EXPRESSION | IF_STMT | …
        OP = + | - | * | /
        EXPRESSION = (NUM | ID | DECIMAL) OP (NUM | ID | DECIMAL)

    5 + 10

    foo - bar

    1 + 2 + 3
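    The EXPRESSION pattern above can be tried directly. The following is a minimal Python sketch; the concrete shapes of NUM, DECIMAL, and ID are assumptions, since the slide does not define them.

```python
import re

# Assumed token shapes; the slides do not spell these out.
NUM = r"\d+"
DECIMAL = r"\d+\.\d+"
ID = r"[A-Za-z_]\w*"
OPERAND = rf"(?:{DECIMAL}|{NUM}|{ID})"
OP = r"[+\-*/]"

# EXPRESSION = (NUM | ID | DECIMAL) OP (NUM | ID | DECIMAL)
EXPRESSION = re.compile(rf"\s*{OPERAND}\s*{OP}\s*{OPERAND}\s*")

def is_expression(text: str) -> bool:
    """True when the whole string is exactly one 'operand OP operand'."""
    return EXPRESSION.fullmatch(text) is not None

for sample in ["5 + 10", "foo - bar", "1 + 2 + 3"]:
    print(sample, "->", is_expression(sample))
```

    The first two samples match and the third does not, because the pattern allows exactly one operator. Repetition could cover longer chains, but arbitrarily nested structure such as balanced parentheses is beyond what a regular expression can describe.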


    Context-Free Grammars

    • Syntax for context-free grammars
        - Each row is called a production
            • Non-terminals on the left
            • Right arrow
            • Non-terminals and terminals on the right
        - Non-terminals will start with an upper case letter in our examples; terminals will be lowercase and are tokens
        - S will typically be the starting non-terminal

    • Example for matching parentheses

        S → ε
        S → ( S )

      Can also write more succinctly by combining production rules with the same starting non-terminal

        S → ( S ) | ε
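    As a rough illustration of how the two productions for S translate into code, here is a small Python recognizer. It works on characters rather than lexer tokens, and the helper name `matches` is hypothetical.

```python
def parse_S(text: str, pos: int = 0) -> int:
    """Apply S -> ( S ) | epsilon starting at pos; return the position after S."""
    if pos < len(text) and text[pos] == "(":
        pos = parse_S(text, pos + 1)            # S -> ( S )
        if pos >= len(text) or text[pos] != ")":
            raise SyntaxError(f"expected ')' at position {pos}")
        return pos + 1
    return pos                                  # S -> epsilon

def matches(text: str) -> bool:
    """True when the whole string can be derived from S."""
    try:
        return parse_S(text) == len(text)
    except SyntaxError:
        return False

for sample in ["", "()", "(())", "(()", "()()"]:
    print(repr(sample), matches(sample))
```

    Note that this grammar only generates nested pairs, so "()()" is rejected even though its parentheses are balanced.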


    CFG Example

        S → ( S ) | ε

    Derivations of the CFG

        S ⇒ ( S ) ⇒ ( ε ) ⇒ ( )
        S ⇒ ( S ) ⇒ ( ( S ) ) ⇒ ( ( ε ) ) ⇒ ( ( ) )


    CFG Example

        Exp → Exp + Exp
        Exp → Exp * Exp
        Exp → NUM

        Exp ⇒ Exp * Exp ⇒ Exp * 3 ⇒ Exp + Exp * 3 ⇒ Exp + 2 * 3 ⇒ 1 + 2 * 3


    Leftmost Derivation

    • Always expand the leftmost nonterminal

        Exp → Exp + Exp
        Exp → Exp * Exp
        Exp → NUM

    • Is this a leftmost derivation?

        Exp ⇒ Exp * Exp ⇒ Exp * 3 ⇒ Exp + Exp * 3 ⇒ Exp + 2 * 3 ⇒ 1 + 2 * 3

      No: its second step expands the rightmost Exp. A leftmost derivation of the same string:

        Exp ⇒ Exp * Exp ⇒ Exp + Exp * Exp ⇒ 1 + Exp * Exp ⇒ 1 + 2 * Exp ⇒ 1 + 2 * 3
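    The "always expand the leftmost nonterminal" rule can be mechanised. The Python sketch below replays a leftmost derivation from a list of chosen right-hand sides; the function name and the list-of-symbols representation are assumptions for illustration, and the code does not check that each choice is a real production of the grammar.

```python
def leftmost_derivation(start, choices, nonterminals):
    """Expand the leftmost non-terminal once per chosen right-hand side."""
    form = [start]
    steps = [" ".join(form)]
    for rhs in choices:
        # Index of the leftmost non-terminal in the current sentential form.
        i = next(k for k, sym in enumerate(form) if sym in nonterminals)
        form[i:i + 1] = rhs
        steps.append(" ".join(form))
    return steps

steps = leftmost_derivation(
    "Exp",
    [["Exp", "*", "Exp"], ["Exp", "+", "Exp"], ["1"], ["2"], ["3"]],
    {"Exp"},
)
print(" => ".join(steps))
```

    With these choices the steps are Exp, Exp * Exp, Exp + Exp * Exp, 1 + Exp * Exp, 1 + 2 * Exp, and 1 + 2 * 3.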


    Rightmost Derivation

    • Always expand the rightmost nonterminal

        Exp → Exp + Exp
        Exp → Exp * Exp
        Exp → NUM

        Exp ⇒ Exp * Exp ⇒ Exp * 3 ⇒ Exp + Exp * 3 ⇒ Exp + 2 * 3 ⇒ 1 + 2 * 3


    Parse Tree

        Exp ⇒ Exp * Exp ⇒ Exp * 3 ⇒ Exp + Exp * 3 ⇒ Exp + 2 * 3 ⇒ 1 + 2 * 3

                  Exp
                /  |  \
             Exp   *   Exp
            / | \       |
         Exp  +  Exp    3
          |       |
          1       2


    Ambiguous Grammars

        Exp → Exp + Exp
        Exp → Exp * Exp
        Exp → NUM

    How to parse 1 + 2 * 3?

        Exp ⇒ Exp * Exp ⇒ Exp + Exp * Exp ⇒ 1 + Exp * Exp ⇒ 1 + 2 * Exp ⇒ 1 + 2 * 3

        Exp ⇒ Exp + Exp ⇒ 1 + Exp ⇒ 1 + Exp * Exp ⇒ 1 + 2 * Exp ⇒ 1 + 2 * 3


    Ambiguous Grammars

    1 + 2 * 3

                  Exp                          Exp
                /  |  \                      /  |  \
             Exp   *   Exp                Exp   +   Exp
            / | \       |                  |       / | \
         Exp  +  Exp    3                  1    Exp  *  Exp
          |       |                              |       |
          1       2                              2       3
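    To make the two parse trees concrete, the Python sketch below enumerates every tree the ambiguous grammar allows for a token list and evaluates each one. It is illustrative only: it assumes the tokens strictly alternate number, operator, number, and it represents a tree as a nested tuple (left, operator, right).

```python
def parse_trees(tokens):
    """Yield every parse tree of tokens under Exp -> Exp + Exp | Exp * Exp | NUM."""
    if len(tokens) == 1:
        yield tokens[0]                          # Exp -> NUM
        return
    # Try each operator position as the root of the tree.
    for i in range(1, len(tokens) - 1, 2):
        for left in parse_trees(tokens[:i]):
            for right in parse_trees(tokens[i + 1:]):
                yield (left, tokens[i], right)

def evaluate(tree):
    """Compute the value a given parse tree implies."""
    if isinstance(tree, int):
        return tree
    left, op, right = tree
    a, b = evaluate(left), evaluate(right)
    return a + b if op == "+" else a * b

trees = list(parse_trees([1, "+", 2, "*", 3]))
for t in trees:
    print(t, "=", evaluate(t))
```

    The grammar yields two trees for 1 + 2 * 3, and they evaluate to different results (7 and 9), which is why ambiguity matters for a compiler.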


    Ambiguous Grammars

    • A grammar is ambiguous if there exist two different leftmost derivations, or two different rightmost derivations, or two different parse trees for some string in the language
    • Is English ambiguous?
        - I saw a man on a hill with a telescope.
    • Ambiguity is not desirable in a programming language
        - Unlike in English, we don't want the compiler to read your mind and try to infer what you meant


    Parsing Approaches

    • Various ways to turn strings into parse trees
        - Bottom-up parsing, where you start from the terminals and work your way up
        - Top-down parsing, where you start from the starting non-terminal and work your way down
    • In this class, we will focus exclusively on top-down parsing


    Top-Down Parsing

        S → A | B | C
        A → a
        B → Bb | b
        C → Cc | ε

        parse_S() {
            t_type = getToken()
            if (t_type == a) {
                ungetToken()
                parse_A()
                check_eof()
            }
            else if (t_type == b) {
                ungetToken()
                parse_B()
                check_eof()
            }
            else if (t_type == c) {
                ungetToken()
                parse_C()
                check_eof()
            }
            else if (t_type == eof) {
                // do EOF stuff
            }
            else {
                syntax_error()
            }
        }
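    A runnable version of this idea needs a token stream that supports getToken and ungetToken. The Python sketch below is one possible shape, not the course's implementation: the `Tokens` class, `expect`, and `accepts` are hypothetical helpers, tokens are single characters, and the left-recursive rules B → Bb | b and C → Cc | ε are handled with loops (one or more b, zero or more c), because a function that called itself first would never terminate.

```python
class Tokens:
    """Tiny token stream with getToken/ungetToken behaviour."""
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def get_token(self):
        tok = self.tokens[self.pos] if self.pos < len(self.tokens) else "eof"
        self.pos += 1
        return tok

    def unget_token(self):
        self.pos -= 1

def expect(ts, wanted):
    if ts.get_token() != wanted:
        raise SyntaxError(f"expected {wanted}")

def parse_A(ts):                 # A -> a
    expect(ts, "a")

def parse_B(ts):                 # B -> Bb | b, i.e. one or more b
    expect(ts, "b")
    while ts.get_token() == "b":
        pass
    ts.unget_token()             # put back the token that was not a b

def parse_C(ts):                 # C -> Cc | epsilon, i.e. zero or more c
    while ts.get_token() == "c":
        pass
    ts.unget_token()

def check_eof(ts):
    expect(ts, "eof")

def parse_S(ts):                 # S -> A | B | C, chosen by one token of lookahead
    t_type = ts.get_token()
    branches = {"a": parse_A, "b": parse_B, "c": parse_C}
    if t_type in branches:
        ts.unget_token()
        branches[t_type](ts)
        check_eof(ts)
    elif t_type != "eof":        # eof alone is S -> C -> epsilon
        raise SyntaxError(f"unexpected {t_type}")

def accepts(tokens):
    try:
        parse_S(Tokens(tokens))
        return True
    except SyntaxError:
        return False

print(accepts("bbb"), accepts(""), accepts("ab"))
```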


    Predictive Recursive Descent Parsers

    • Predictive recursive descent parsers are efficient top-down parsers
        - Efficient because they only look at the next token, with no backtracking or guessing
    • To determine if a language allows a predictive recursive descent parser, we need to define the following functions
    • FIRST(α), where α is a sequence of grammar symbols (non-terminals, terminals, and ε)
        - FIRST(α) returns the set of terminals and ε that begin strings derived from α
    • FOLLOW(A), where A is a non-terminal


    FIRST() Example

        S → A | B | C
        A → a
        B → Bb | b
        C → Cc | ε

        FIRST(S) = { a, b, c, ε }
        FIRST(A) = { a }
        FIRST(B) = { b }
        FIRST(C) = { ε, c }


    Calculating FIRST(α)

    First, start out with empty FIRST() sets for all non-terminals in the grammar

    Then, apply the following rules until the FIRST() sets do not change:

    1. FIRST(x) = { x } if x is a terminal
    2. FIRST(ε) = { ε }
    3. If A → Bα is a production rule, then add FIRST(B) - { ε } to FIRST(A)
    4. If A → B_0 B_1 B_2 … B_i B_{i+1} … B_k and ε ∈ FIRST(B_0) and ε ∈ FIRST(B_1) and ε ∈ FIRST(B_2) and … and ε ∈ FIRST(B_i), then add FIRST(B_{i+1}) - { ε } to FIRST(A)
    5. If A → B_0 B_1 B_2 … B_k and ε ∈ FIRST(B_0) and ε ∈ FIRST(B_1) and ε ∈ FIRST(B_2) and … and ε ∈ FIRST(B_k), then add ε to FIRST(A)
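    The five rules amount to a fixed-point computation. The Python sketch below is one way to write it; the dictionary-of-lists grammar encoding, the use of an empty list for an ε alternative, and the function names are assumptions made for this example.

```python
EPS = "ε"

def first_of_sequence(symbols, first, nonterminals):
    """FIRST of a sequence of grammar symbols, given the current FIRST sets."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in nonterminals else {sym}   # rule 1
        result |= sym_first - {EPS}                                 # rules 3 and 4
        if EPS not in sym_first:
            return result
    result.add(EPS)                                                 # rules 2 and 5
    return result

def compute_first(grammar):
    """grammar maps each non-terminal to a list of right-hand sides (lists of symbols)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:                      # repeat until no FIRST set changes
        changed = False
        for nt, alternatives in grammar.items():
            for rhs in alternatives:
                new = first_of_sequence(rhs, first, grammar.keys())
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

grammar = {
    "S": [["A"], ["B"], ["C"]],
    "A": [["a"]],
    "B": [["B", "b"], ["b"]],
    "C": [["C", "c"], []],              # [] stands for the ε alternative
}
for nt, symbols in compute_first(grammar).items():
    print(f"FIRST({nt}) =", sorted(symbols))
```

    For the S → A | B | C grammar this gives FIRST(S) = { a, b, c, ε }, FIRST(A) = { a }, FIRST(B) = { b }, and FIRST(C) = { c, ε }.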


    Calculating FIRST Sets

        S → ABCD
        A → CD | aA
        B → b
        C → cC | ε
        D → dD | ε

    Each column after INITIAL shows the sets after one more pass over the rules.

                   INITIAL
        FIRST(S)   { }       { }        { a }            { a, c, d, b }   { a, c, d, b }
        FIRST(A)   { }       { a }      { a, c, d, ε }   { a, c, d, ε }   { a, c, d, ε }
        FIRST(B)   { }       { b }      { b }            { b }            { b }
        FIRST(C)   { }       { c, ε }   { c, ε }         { c, ε }         { c, ε }
        FIRST(D)   { }       { d, ε }   { d, ε }         { d, ε }         { d, ε }


    FOLLOW

    Calculating FOLLOW


    Calculating FOLLOW Sets

        S → ABCD
        A → CD | aA
        B → b
        C → cC | ε
        D → dD | ε

        FIRST(S) = { a, c, d, b }
        FIRST(A) = { a, c, d, ε }
        FIRST(B) = { b }
        FIRST(C) = { c, ε }
        FIRST(D) = { d, ε }

        FOLLOW sets (table starting from an INITIAL column)
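    For readers who want to compute FOLLOW sets for this grammar, here is a Python sketch. It uses the standard textbook rules, which is an assumption about what the slides intend: add EOF to FOLLOW of the start symbol; for A → αBβ add FIRST(β) - { ε } to FOLLOW(B); and if β is empty or can derive ε, also add FOLLOW(A) to FOLLOW(B). The FIRST sets are the ones listed above; the grammar encoding and function names are hypothetical.

```python
EPS, EOF = "ε", "EOF"

def first_of_sequence(symbols, first):
    """FIRST of a symbol sequence; symbols missing from `first` are terminals."""
    result = set()
    for sym in symbols:
        sym_first = first.get(sym, {sym})
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)                      # every symbol (or none) could vanish
    return result

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(EOF)
    changed = True
    while changed:                       # repeat until no FOLLOW set changes
        changed = False
        for lhs, alternatives in grammar.items():
            for rhs in alternatives:
                for i, sym in enumerate(rhs):
                    if sym not in grammar:
                        continue         # FOLLOW is only defined for non-terminals
                    rest = first_of_sequence(rhs[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:      # what follows lhs can also follow sym
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

grammar = {
    "S": [["A", "B", "C", "D"]],
    "A": [["C", "D"], ["a", "A"]],
    "B": [["b"]],
    "C": [["c", "C"], []],               # [] stands for the ε alternative
    "D": [["d", "D"], []],
}
first = {
    "S": {"a", "c", "d", "b"},
    "A": {"a", "c", "d", EPS},
    "B": {"b"},
    "C": {"c", EPS},
    "D": {"d", EPS},
}
for nt, symbols in compute_follow(grammar, first, "S").items():
    print(f"FOLLOW({nt}) =", sorted(symbols))
```

    Under these rules the sketch computes FOLLOW(S) = { EOF }, FOLLOW(A) = { b }, FOLLOW(B) = { c, d, EOF }, FOLLOW(C) = { b, d, EOF }, and FOLLOW(D) = { b, EOF }. These values come from the sketch, not from the slide's table.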


    Predictive Recursive Descent Parsers

    • At each parsing step, there is only one grammar rule that can be chosen, and there is no need for backtracking
    • The conditions for a predictive parser are both of the following
        - If A → α and A → β, then FIRST(α) ∩ FIRST(β) = ∅
        - If ε ∈ FIRST(A), then FIRST(A) ∩ FOLLOW(A) = ∅
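    Both conditions are simple set checks once the FIRST and FOLLOW sets are known. The Python sketch below takes, for each non-terminal, the FIRST set of each alternative plus the FOLLOW sets that are needed. The input format and function name are assumptions; the sample values are the FIRST sets of the simplified email grammar in these slides, with FOLLOW(Display-name-list) = { < } read off the parser code for that rule.

```python
from itertools import combinations

EPS = "ε"

def allows_predictive_parser(alt_first, follow):
    """alt_first[A] lists FIRST(α) for each alternative A -> α; follow[A] is FOLLOW(A)."""
    for nt, firsts in alt_first.items():
        # Condition 1: FIRST sets of any two alternatives must not overlap.
        for x, y in combinations(firsts, 2):
            if x & y:
                return False
        # Condition 2: if ε ∈ FIRST(A), FIRST(A) and FOLLOW(A) must not overlap.
        first_nt = set().union(*firsts)
        if EPS in first_nt and first_nt & follow.get(nt, set()):
            return False
    return True

email = {
    "Address": [{"<", "atom", "q-s"}, {"d-a-a", "q-s-a"}],
    "Name-addr": [{"atom", "q-s"}, {"<"}],
    "Display-name": [{"atom", "q-s"}],
    "Display-name-list": [{"atom", "q-s"}, {EPS}],
    "Angle-addr": [{"<"}],
    "Addr-spec": [{"d-a-a"}, {"q-s-a"}],
    "Domain": [{"d-a"}],
    "Word": [{"atom"}, {"q-s"}],
}
email_follow = {"Display-name-list": {"<"}}
print(allows_predictive_parser(email, email_follow))

left_recursive = {"B": [{"b"}, {"b"}]}    # B -> Bb | b: both alternatives start with b
print(allows_predictive_parser(left_recursive, {}))
```

    The email grammar passes both checks, while B → Bb | b fails the first condition because both alternatives begin with b.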


    Creating a Predictive Recursive Descent Parser

    • Create a CFG
    • Calculate FIRST and FOLLOW sets
    • Prove that the CFG allows a Predictive Recursive Descent Parser
    • Write the predictive recursive descent parser


    Email Addresses

    • How to parse/validate email addresses?
        - name @ domain.tld
    • Turns out, it is not so simple
        - "cse 340"@example.com
        - customer/departmentshipping@example.com
        - "Abc@def"@example.com
        - "Abc\@def"@example.com
        - "Abc\"@example.com"@example.com
        - test "example @hello" <test@example.com>
    • In fact, a company called Mailgun, which provides email services as an API, released an open-source tool to validate email addresses, based on their experience with real-world email
        - How did they implement their parser?
        - A recursive descent parser
        - https://github.com/mailgun/flanker


    Email Address CFG

        quoted-string
        atom
        dot-atom
        whitespace

        Address → Name-addr-rfc | Name-addr-lax | Addr-spec
        Name-addr-rfc → Display-name-rfc Angle-addr-rfc | Angle-addr-rfc
        Display-name-rfc → …


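    Simplified Email Address CFG

        quoted-string (q-s)
        atom
        dot-atom (d-a)
        quoted-string-at (q-s-a)
        dot-atom-at (d-a-a)

        Address → Name-addr | Addr-spec
        Name-addr → Display-name Angle-addr | Angle-addr
        Display-name → Word Display-name-list
        Display-name-list → Word Display-name-list | ε
        Angle-addr → < Addr-spec >
        Addr-spec → d-a-a Domain | q-s-a Domain
        Domain → d-a
        Word → atom | q-s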


    parse_Address() {
        t_type = getToken();
        // Check FIRST(Name-addr)
        if (t_type == < || t_type == atom || t_type == q-s) {
            ungetToken();
            parse_Name-addr();
            printf("Address -> Name-addr");
        }
        // Check FIRST(Addr-spec)
        else if (t_type == d-a-a || t_type == q-s-a) {
            ungetToken();
            parse_Addr-spec();
            printf("Address -> Addr-spec");
        }
        else {
            syntax_error();
        }
    }

    Address → Name-addr | Addr-spec

    FIRST(Address) = { d-a-a, q-s-a, <, atom, q-s }
    FIRST(Name-addr) = { <, atom, q-s }
    FIRST(Addr-spec) = { d-a-a, q-s-a }


    parse_Name-addr() {
        t_type = getToken();
        // Check FIRST(Display-name Angle-addr)
        if (t_type == atom || t_type == q-s) {
            ungetToken();
            parse_Display-name();
            parse_Angle-addr();
            printf("Name-addr -> Display-name Angle-addr");
        }
        // Check FIRST(Angle-addr)
        else if (t_type == <) {
            ungetToken();
            parse_Angle-addr();
            printf("Name-addr -> Angle-addr");
        }
        else {
            syntax_error();
        }
    }

    Name-addr → Display-name Angle-addr | Angle-addr

    FIRST(Name-addr) = { <, atom, q-s }
    FIRST(Display-name) = { atom, q-s }
    FIRST(Angle-addr) = { < }


    parse_Display-name() {
        t_type = getToken();
        // Check FIRST(Word Display-name-list)
        if (t_type == atom || t_type == q-s) {
            ungetToken();
            parse_Word();
            parse_Display-name-list();
            printf("Display-name -> Word Display-name-list");
        }
        else {
            syntax_error();
        }
    }

    Display-name → Word Display-name-list


    parse_Display-name-list() {
        t_type = getToken();
        // Check FIRST(Word Display-name-list)
        if (t_type == atom || t_type == q-s) {
            ungetToken();
            parse_Word();
            parse_Display-name-list();
            printf("Display-name-list -> Word Display-name-list");
        }
        // Check FOLLOW(Display-name-list)
        else if (t_type == <) {
            ungetToken();
            printf("Display-name-list -> ε");
        }
        else { syntax_error(); }
    }

    Display-name-list → Word Display-name-list | ε


    parse_Angle-addr() {
        t_type = getToken();
        // Check FIRST(< Addr-spec >)
        if (t_type == <) {
            // ungetToken();
            parse_Addr-spec();
            t_type = getToken();
            if (t_type != >) {
                syntax_error();
            }
            printf("Angle-addr -> < Addr-spec >");
        }
        else {
            syntax_error();
        }
    }

    Angle-addr → < Addr-spec >

    FIRST(Angle-addr) = { < }
    FIRST(Addr-spec) = { d-a-a, q-s-a }


    parse_Addr-spec() {
        t_type = getToken();
        // Check FIRST(d-a-a Domain)
        if (t_type == d-a-a) {
            // ungetToken();
            parse_Domain();
            printf("Addr-spec -> d-a-a Domain");
        }
        // Check FIRST(q-s-a Domain)
        else if (t_type == q-s-a) {
            parse_Domain();
            printf("Addr-spec -> q-s-a Domain");
        }
        else { syntax_error(); }
    }

    Addr-spec → d-a-a Domain | q-s-a Domain

    FIRST(Addr-spec) = { d-a-a, q-s-a }
    FIRST(Domain) = { d-a }


    parse_Domain() {
        t_type = getToken();
        // Check FIRST(d-a)
        if (t_type == d-a) {
            printf("Domain -> d-a");
        }
        else {
            syntax_error();
        }
    }

    Domain → d-a

    FIRST(Domain) = { d-a }


    parse_Word() {
        t_type = getToken();
        // Check FIRST(atom)
        if (t_type == atom) {
            printf("Word -> atom");
        }
        // Check FIRST(q-s)
        else if (t_type == q-s) {
            printf("Word -> q-s");
        }
        else {
            syntax_error();
        }
    }


    Predictive Recursive Descent Parsers

    • For every non-terminal A in the grammar, create a function called parse_A
    • For each production rule A → α (where α is a sequence of terminals and non-terminals), if getToken() ∈ FIRST(α) then choose the production rule A → α
        - For every terminal and non-terminal a in α, if a is a non-terminal call parse_a, if a is a terminal check that getToken() == a
        - If ε ∈ FIRST(α), then check that getToken() ∈ FOLLOW(A)
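    Putting the recipe together, the following Python sketch is a predictive recursive descent parser for the simplified email grammar. It is an illustration under several assumptions: it consumes a list of token-type strings (the lexer is not shown in the slides), it uses a `peek` method in place of getToken/ungetToken, it records the productions applied in a list where the slides call printf, and the final EOF check is an addition. The `Parser` class, `parse`, and `is_valid` are hypothetical names.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0
        self.rules = []              # productions applied, in the order they complete

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def expect(self, wanted):
        if self.peek() != wanted:
            raise SyntaxError(f"expected {wanted}, got {self.peek()}")
        self.pos += 1

    def parse_address(self):
        if self.peek() in ("<", "atom", "q-s"):       # FIRST(Name-addr)
            self.parse_name_addr()
            self.rules.append("Address -> Name-addr")
        elif self.peek() in ("d-a-a", "q-s-a"):       # FIRST(Addr-spec)
            self.parse_addr_spec()
            self.rules.append("Address -> Addr-spec")
        else:
            raise SyntaxError("bad Address")

    def parse_name_addr(self):
        if self.peek() in ("atom", "q-s"):            # FIRST(Display-name Angle-addr)
            self.parse_display_name()
            self.parse_angle_addr()
            self.rules.append("Name-addr -> Display-name Angle-addr")
        else:                                         # parse_angle_addr checks for "<"
            self.parse_angle_addr()
            self.rules.append("Name-addr -> Angle-addr")

    def parse_display_name(self):
        self.parse_word()
        self.parse_display_name_list()
        self.rules.append("Display-name -> Word Display-name-list")

    def parse_display_name_list(self):
        if self.peek() in ("atom", "q-s"):            # FIRST(Word Display-name-list)
            self.parse_word()
            self.parse_display_name_list()
            self.rules.append("Display-name-list -> Word Display-name-list")
        elif self.peek() == "<":                      # FOLLOW(Display-name-list)
            self.rules.append("Display-name-list -> ε")
        else:
            raise SyntaxError("bad Display-name-list")

    def parse_angle_addr(self):
        self.expect("<")
        self.parse_addr_spec()
        self.expect(">")
        self.rules.append("Angle-addr -> < Addr-spec >")

    def parse_addr_spec(self):
        local = self.peek()
        if local not in ("d-a-a", "q-s-a"):
            raise SyntaxError("bad Addr-spec")
        self.pos += 1
        self.parse_domain()
        self.rules.append(f"Addr-spec -> {local} Domain")

    def parse_domain(self):
        self.expect("d-a")
        self.rules.append("Domain -> d-a")

    def parse_word(self):
        if self.peek() not in ("atom", "q-s"):
            raise SyntaxError("bad Word")
        self.rules.append(f"Word -> {self.peek()}")
        self.pos += 1

def parse(tokens):
    p = Parser(tokens)
    p.parse_address()
    p.expect("EOF")
    return p.rules

def is_valid(tokens):
    try:
        parse(tokens)
        return True
    except SyntaxError:
        return False

for rule in parse(["atom", "q-s", "<", "d-a-a", "d-a", ">"]):
    print(rule)
```

    Each method looks at one token and picks the only production whose FIRST set (or FOLLOW set, for the ε alternative) contains it, so the parser never backtracks.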