
Syntax Analysis – Bottom Up Parsing 1 CMPSC 470 Lecture 07 Topics:

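A. Bottom-up parsing concept

A bottom-up parser constructs a parse tree for an input string beginning at the leaves (the bottom) and working up toward the root (the top).

Given an input string 𝑤, a bottom-up parser reduces 𝑤 to the start symbol of the grammar: at each step, a substring that matches the body of a production is replaced by the nonterminal at the head of that production.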

It uses shift-reduce parsing.

Example)

a) Reduction

A reduction replaces a substring that matches the body of a production with the nonterminal at the head of that production.

A reduction is the reverse of a step in a derivation, so the goal of bottom-up parsing is to construct a derivation in reverse.


b) Handle & handle pruning

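During a left-to-right scan of the input, a bottom-up parser constructs a rightmost derivation in reverse by repeatedly finding and reducing a handle.

A handle is a substring that matches the body of a production and whose reduction represents one step along the reverse of a rightmost derivation.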

Example)

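(Grammar: 𝐸 → 𝐸 + 𝑇 | 𝑇,  𝑇 → 𝑇 ∗ 𝐹 | 𝐹,  𝐹 → 𝐢𝐝)

Right sentential form    Handle       Reducing production
(1) 𝐢𝐝₁ ∗ 𝐢𝐝₂             𝐢𝐝₁          𝐹 → 𝐢𝐝
(2) 𝐹 ∗ 𝐢𝐝₂               𝐹            𝑇 → 𝐹
(3) 𝑇 ∗ 𝐢𝐝₂               𝐢𝐝₂          𝐹 → 𝐢𝐝
(4) 𝑇 ∗ 𝐹                 𝑇 ∗ 𝐹        𝑇 → 𝑇 ∗ 𝐹
(5) 𝑇                     𝑇            𝐸 → 𝑇
(6) 𝐸                     (start symbol; nothing left to reduce)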

In (1), formally, the production “𝐹 → 𝐢𝐝” (at the position of “𝐢𝐝₁”) is a handle of “𝐢𝐝₁ ∗ 𝐢𝐝₂,” but for convenience we refer to “𝐢𝐝₁” as the handle rather than “𝐹 → 𝐢𝐝.”

A rightmost derivation in reverse can be obtained by “handle pruning”: repeatedly locate the handle in the current right-sentential form and replace it by the head of the corresponding production.

Example)
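A minimal Python sketch of handle pruning for 𝐢𝐝 ∗ 𝐢𝐝, assuming the expression grammar 𝐸 → 𝐸 + 𝑇 | 𝑇, 𝑇 → 𝑇 ∗ 𝐹 | 𝐹, 𝐹 → 𝐢𝐝. The handle positions are supplied by hand here (a real parser finds them with its tables), and the name `prune` is hypothetical.

```python
def prune(form, steps):
    """Apply (position, body, head) reductions in order.

    Returns every sentential form seen, starting with the input.
    """
    form = list(form)
    forms = [form]
    for pos, body, head in steps:
        # The claimed handle must literally occur at the given position.
        assert form[pos:pos + len(body)] == body, "not a handle here"
        form = form[:pos] + [head] + form[pos + len(body):]
        forms.append(form)
    return forms


# Hand-picked handles for id * id, in the order a bottom-up parser reduces them.
steps = [
    (0, ["id"], "F"),            # F -> id
    (0, ["F"], "T"),             # T -> F
    (2, ["id"], "F"),            # F -> id
    (0, ["T", "*", "F"], "T"),   # T -> T * F
    (0, ["T"], "E"),             # E -> T
]
forms = prune(["id", "*", "id"], steps)
for f in forms:
    print(" ".join(f))
```

Read from bottom to top, the printed forms are a rightmost derivation of the input from 𝐸.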


c) Shift-reduce parsing

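Shift-reduce parsing is a form of bottom-up parsing that uses a stack to hold grammar symbols and an input buffer to hold the rest of the string to be parsed.

Procedure.

1. Initially, the stack is empty (except for its bottom marker $), and the input buffer holds the input string 𝑤 followed by the end marker $.

2. During a left-to-right scan of the input string, the parser
   a. shifts an input symbol onto the top of the stack,
   b. reduces a string on top of the stack whenever it is a handle, and
   c. repeats.

3. The parser halts when the stack contains only the start symbol and the input is empty (accept), or when an error is detected.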

Example) Given the following grammar:

𝐸 → 𝐸 + 𝑇 | 𝑇,  𝑇 → 𝑇 ∗ 𝐹 | 𝐹,  𝐹 → 𝐢𝐝

And input:

Stack Input Action


Because a stack is used, the handle to be reduced always appears on top of the stack, never inside it.

Shift-reduce parsing has four operations: shift, reduce, accept, and error.
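The four operations (shift, reduce, accept, error) can be seen in a small hand-coded Python sketch for the grammar 𝐸 → 𝐸 + 𝑇 | 𝑇, 𝑇 → 𝑇 ∗ 𝐹 | 𝐹, 𝐹 → 𝐢𝐝. This is not a general algorithm: the shift/reduce decisions are written out by hand for this one grammar, and the function name `shift_reduce` is an assumption made for illustration.

```python
def shift_reduce(tokens):
    """Return the list of actions taken, ending in 'accept' or 'error'."""
    stack = ["$"]
    rest = list(tokens) + ["$"]
    actions = []
    while True:
        la = rest[0]  # next input symbol (lookahead)
        if stack[-1] == "id":
            stack[-1:] = ["F"]
            actions.append("reduce F -> id")
        elif stack[-3:] == ["T", "*", "F"]:
            stack[-3:] = ["T"]
            actions.append("reduce T -> T * F")
        elif stack[-1] == "F":
            stack[-1:] = ["T"]
            actions.append("reduce T -> F")
        elif stack[-1] == "T" and la != "*":
            # With '*' as lookahead we must shift instead of reducing T.
            if stack[-3:-1] == ["E", "+"]:
                stack[-3:] = ["E"]
                actions.append("reduce E -> E + T")
            else:
                stack[-1:] = ["E"]
                actions.append("reduce E -> T")
        elif stack == ["$", "E"] and la == "$":
            actions.append("accept")
            return actions
        elif la != "$":
            stack.append(rest.pop(0))
            actions.append("shift " + la)
        else:
            actions.append("error")
            return actions


for step in shift_reduce(["id", "*", "id"]):
    print(step)
```

In every reduce branch the handle sits on top of the stack, which is why a stack is enough. The test on the lookahead `la` is the hand-written stand-in for what an LR parsing table decides systematically.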

d) Conflict during shift-reduce parsing

There are context-free grammars for which shift-reduce parsing cannot be used: non-LR grammars.

⊛ Ambiguous grammar

Example) “dangling-else” grammar

𝑆 → 𝐢𝐟 𝐸 𝐭𝐡𝐞𝐧 𝑆 | 𝐢𝐟 𝐸 𝐭𝐡𝐞𝐧 𝑆 𝐞𝐥𝐬𝐞 𝑆 | 𝑜𝑡ℎ𝑒𝑟

If we have the following configuration:


⊛ Reduce/reduce conflict: even knowing the handle on the stack and the next input symbols, the parser cannot decide which production to use for the reduction

Example) Consider the following grammar

𝑠𝑡𝑚𝑡 → 𝐢𝐝 ( 𝑝𝑎𝑟𝑎𝑚_𝑙𝑖𝑠𝑡 ) | 𝑒𝑥𝑝𝑟 ∶= 𝑒𝑥𝑝𝑟
𝑝𝑎𝑟𝑎𝑚_𝑙𝑖𝑠𝑡 → 𝑝𝑎𝑟𝑎𝑚_𝑙𝑖𝑠𝑡 , 𝑝𝑎𝑟𝑎𝑚 | 𝑝𝑎𝑟𝑎𝑚
𝑝𝑎𝑟𝑎𝑚 → 𝐢𝐝
𝑒𝑥𝑝𝑟 → 𝐢𝐝 ( 𝑒𝑥𝑝𝑟_𝑙𝑖𝑠𝑡 ) | 𝐢𝐝
𝑒𝑥𝑝𝑟_𝑙𝑖𝑠𝑡 → 𝑒𝑥𝑝𝑟_𝑙𝑖𝑠𝑡 , 𝑒𝑥𝑝𝑟 | 𝑒𝑥𝑝𝑟

If we have the following configuration:


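B. Simple LR

a) Introduction to LR parsing

LR(𝑘): “L” stands for left-to-right scanning of the input, “R” for constructing a rightmost derivation in reverse, and 𝑘 for the number of lookahead input symbols used in making parsing decisions.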

Properties:

• “Simple LR” (SLR) is the simplest LR method; its states are built from LR(0) items.

• An LR parser is table-driven, like a non-recursive LL parser.

• A grammar is an LR grammar if an LR parsing table can be constructed for it.

• For an LR grammar, a left-to-right shift-reduce parser can recognize handles when they appear on top of the stack.

Advantage of LR parsing:

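• LR parsers can be constructed to recognize virtually all programming-language constructs for which a context-free grammar can be written.

• The LR parsing method is the most general non-backtracking shift-reduce parsing method known, and it can be implemented efficiently.

• An LR parser detects a syntactic error as soon as it is possible to do so on a left-to-right scan of the input.

• The class of grammars that can be parsed with LR methods is a proper superset of the class that can be parsed with predictive or LL methods.

Drawback of LR method:

• It is too much work to construct an LR parser (parsing table) by hand for a typical programming-language grammar.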


Given the following grammar:

𝐸 → 𝐸 + 𝑇 | 𝑇,  𝑇 → 𝑇 ∗ 𝐹 | 𝐹,  𝐹 → 𝐢𝐝

and the following configuration (current stack and input):

Stack    Input
$𝑇       ∗ 𝐢𝐝 $

How can the parser choose “shift” (moving ∗ onto the stack) instead of “reduce” (𝐸 → 𝑇)?

An LR parser makes this decision with the help of items, the canonical collection, and the CLOSURE and GOTO functions, from which a parsing automaton is built for the CFG.

b) Items

An LR(0) item is a production of 𝐺 with a “dot” at some position of the body.
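As a small illustration, the Python sketch below enumerates the LR(0) items of one production. Representing an item as a `(head, body, dot)` triple is an assumption made for this sketch, as are the names `lr0_items` and `show`.

```python
def lr0_items(head, body):
    """All LR(0) items of head -> body, as (head, body, dot) triples."""
    body = tuple(body)
    return [(head, body, dot) for dot in range(len(body) + 1)]


def show(item):
    """Render an item with '.' marking the dot position."""
    head, body, dot = item
    return f"{head} -> {' '.join(body[:dot] + ('.',) + body[dot:])}"


items = lr0_items("E", ["E", "+", "T"])
for item in items:
    print(show(item))
```

A production whose body has 𝑛 symbols therefore has 𝑛 + 1 items, one for each possible dot position.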


c) Canonical LR(0) collection

It is a collection of sets of LR(0) items that provides the basis for constructing a deterministic finite automaton, called the LR(0) automaton.

This collection is the basis for building SLR and LALR parsers.

Building the canonical collection uses two functions: CLOSURE and GOTO.

The idea is similar to the subset construction for converting an NFA to a DFA, discussed in class.

The following shows the canonical LR(0) collection of the following grammar, augmented with 𝐸′ → 𝐸:

𝐸 → 𝐸 + 𝑇 | 𝑇   𝑇 → 𝑇 ∗ 𝐹 | 𝐹   𝐹 → (𝐸) | 𝐢𝐝

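I0:   E' → •E    E → •E+T    E → •T    T → •T*F    T → •F    F → •(E)    F → •id

I1:   E' → E•    E → E•+T

I2:   E → T•    T → T•*F

I3:   T → F•

I4:   F → (•E)    E → •E+T    E → •T    T → •T*F    T → •F    F → •(E)    F → •id

I5:   F → id•

I6:   E → E+•T    T → •T*F    T → •F    F → •(E)    F → •id

I7:   T → T*•F    F → •(E)    F → •id

I8:   E → E•+T    F → (E•)

I9:   E → E+T•    T → T•*F

I10:  T → T*F•

I11:  F → (E)•

Transitions of the LR(0) automaton:

GOTO(I0, E) = I1    GOTO(I0, T) = I2    GOTO(I0, F) = I3    GOTO(I0, "(") = I4    GOTO(I0, id) = I5

GOTO(I1, +) = I6    I1 on $: accept

GOTO(I2, *) = I7

GOTO(I4, E) = I8    GOTO(I4, T) = I2    GOTO(I4, F) = I3    GOTO(I4, "(") = I4    GOTO(I4, id) = I5

GOTO(I6, T) = I9    GOTO(I6, F) = I3    GOTO(I6, "(") = I4    GOTO(I6, id) = I5

GOTO(I7, F) = I10   GOTO(I7, "(") = I4   GOTO(I7, id) = I5

GOTO(I8, ")") = I11   GOTO(I8, +) = I6

GOTO(I9, *) = I7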


d) Closure of item sets

If 𝐼 is a set of items for a grammar 𝐺, then CLOSURE(𝐼) is the set of items obtained from 𝐼 by recursively expanding the nonterminals that appear immediately to the right of a dot; CLOSURE(𝐼) includes 𝐼 itself.

Example 1) Given the following grammar:

𝐴 → 𝐚 𝐵 𝐛   𝐵 → 𝐜 | 𝐝 𝐞

The item “𝐴 → 𝐚 ⋅ 𝐵 𝐛” indicates that 𝐚 has been seen and that the remaining input is expected to begin with a string derivable from “𝐵 𝐛”. CLOSURE({𝐴 → 𝐚 ⋅ 𝐵 𝐛}) is the set of items that can apply at this point: the item itself, plus 𝐵 → ⋅ 𝐜 and 𝐵 → ⋅ 𝐝 𝐞.

Example 2) Given grammar:

𝐸 → 𝐸 + 𝑇 | 𝑇   𝑇 → 𝑇 ∗ 𝐹 | 𝐹   𝐹 → 𝐢𝐝


Determining CLOSURE(I)

SetOfItems CLOSURE(𝐼)
    𝐽 = 𝐼
    repeat
        for ( each item 𝐴 → 𝛼 ⋅ 𝐵𝛽 in 𝐽 )
            for ( each production 𝐵 → 𝛾 of 𝐺 )
                if ( 𝐵 → ⋅ 𝛾 is not in 𝐽 )
                    add 𝐵 → ⋅ 𝛾 to 𝐽
    until no more items are added to 𝐽 in one round
    return 𝐽

or

1. Initially, add every item in 𝐼 to CLOSURE(𝐼).

2. If [𝐴 → 𝛼 ⋅ 𝐵𝛽] ∈ CLOSURE(𝐼) and 𝐵 → 𝛾 is a production, then add [𝐵 → ⋅ 𝛾] to CLOSURE(𝐼).

3. Repeat step 2 until no new items are added to CLOSURE(𝐼).
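The CLOSURE computation can be sketched in Python as follows. Assumptions of this sketch: the grammar is a dictionary from nonterminal to a list of bodies, an item is a `(head, body, dot)` triple, and the grammar used is the augmented expression grammar with 𝐹 → (𝐸) | 𝐢𝐝. A worklist replaces the repeat-until loop; the resulting set is the same.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar=GRAMMAR):
    """Return CLOSURE(items) as a frozenset of (head, body, dot) triples."""
    result = set(items)
    work = list(items)
    while work:
        head, body, dot = work.pop()
        # Only a nonterminal right after the dot contributes new items.
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new_item = (body[dot], prod, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


i0 = closure({("E'", ("E",), 0)})
print(len(i0))
```

For this grammar, the closure of the single item 𝐸′ → ⋅ 𝐸 contains seven items: the initial item plus one item with the dot at the left end for each of the six productions.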

We divide the items into two classes:

1. Kernel items: the initial item (𝑆′ → ⋅ 𝑆), and all items whose dots are not at the left end.

2. Non-kernel items: all items whose dots are at the left end, except for 𝑆′ → ⋅ 𝑆. Non-kernel items can be regenerated from the kernel items using CLOSURE. (This will be shown again with examples.)


e) The Function GOTO

GOTO defines the transitions of the LR(0) automaton.

GOTO(𝐼, 𝑋) is defined to be the closure of the set of all items [𝐴 → 𝛼𝑋 ⋅ 𝛽] such that [𝐴 → 𝛼 ⋅ 𝑋𝛽] is in 𝐼: GOTO(𝐼, 𝑋) = CLOSURE({[𝐴 → 𝛼𝑋 ⋅ 𝛽] | [𝐴 → 𝛼 ⋅ 𝑋𝛽] ∈ 𝐼}).

For a given state 𝐼, GOTO(𝐼, 𝑋) specifies the transition from the state for 𝐼 on grammar symbol 𝑋.
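A possible Python sketch of GOTO, under the same assumptions as before (items as `(head, body, dot)` triples, grammar as a dictionary). The grammar here is the augmented expression grammar with 𝐹 → 𝐢𝐝 as its only 𝐹-production, and the set `I` is chosen for illustration.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("id",)],
}


def closure(items, grammar):
    """CLOSURE of a set of (head, body, dot) items, computed with a worklist."""
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new_item = (body[dot], prod, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    """GOTO(I, X): move the dot over X where possible, then take the closure."""
    moved = {
        (head, body, dot + 1)
        for head, body, dot in items
        if dot < len(body) and body[dot] == symbol
    }
    return closure(moved, grammar)


I = {("E'", ("E",), 1), ("E", ("E", "+", "T"), 1)}
J = goto(I, "+", GRAMMAR)
for item in sorted(J):
    print(item)
```

Only the item 𝐸 → 𝐸 ⋅ + 𝑇 has + after its dot, so the kernel of the result is 𝐸 → 𝐸 + ⋅ 𝑇; the closure then adds the items for 𝑇 and 𝐹. A symbol with no matching item gives the empty set.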

Example) Given grammar:

𝐸′ → 𝐸   𝐸 → 𝐸 + 𝑇 | 𝑇   𝑇 → 𝑇 ∗ 𝐹 | 𝐹   𝐹 → 𝐢𝐝

If 𝐼 = {[𝐸′ → 𝐸 ⋅], [𝐸 → 𝐸 ⋅ + 𝑇]}, then


f) Constructing the canonical collection of sets of LR(0) items

Construct the canonical LR(0) collection using the following procedure:

1. Write the augmented grammar 𝐺′ for 𝐺. If 𝐺 is the original grammar with start symbol 𝑆, then 𝐺′ is 𝐺 with a new start symbol 𝑆′ and the production 𝑆′ → 𝑆.

2. Use the following algorithm, which uses the CLOSURE and GOTO functions, to determine the canonical collection 𝐶.

void Items(𝐺′)
    𝐶 = {CLOSURE({[𝑆′ → ⋅ 𝑆]})}
    repeat
        for ( each set of items 𝐼 ∈ 𝐶 )
            for ( each grammar symbol 𝑋 )
                𝐽 = GOTO(𝐼, 𝑋)
                if ( 𝐽 ≠ ∅ and 𝐽 ∉ 𝐶 )
                    add 𝐽 to 𝐶
    until 𝐶 does not change (no new states)

or

1. Add CLOSURE({[𝑆′ → ⋅ 𝑆]}) to 𝐶.
2. Select a set of items 𝐼 ∈ 𝐶.
3. Select a grammar symbol 𝑋 (terminal or nonterminal).
4. If GOTO(𝐼, 𝑋) ≠ ∅ and GOTO(𝐼, 𝑋) ∉ 𝐶, add it to 𝐶.
5. Repeat steps 2-4 until 𝐶 does not change.
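The procedure above can be sketched in Python as shown below, assuming the augmented expression grammar with 𝐹 → (𝐸) | 𝐢𝐝 and the same item representation as before (`(head, body, dot)` triples). The order in which states are discovered depends on the order in which symbols are tried, so the state numbers need not match a hand-drawn automaton.

```python
GRAMMAR = {
    "E'": [("E",)],
    "E": [("E", "+", "T"), ("T",)],
    "T": [("T", "*", "F"), ("F",)],
    "F": [("(", "E", ")"), ("id",)],
}


def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        head, body, dot = work.pop()
        if dot < len(body) and body[dot] in grammar:
            for prod in grammar[body[dot]]:
                new_item = (body[dot], prod, 0)
                if new_item not in result:
                    result.add(new_item)
                    work.append(new_item)
    return frozenset(result)


def goto(items, symbol, grammar):
    moved = {(h, b, d + 1) for h, b, d in items if d < len(b) and b[d] == symbol}
    return closure(moved, grammar)


def canonical_collection(grammar, start="E'"):
    """Return the list of LR(0) item sets; index 0 is the start state."""
    symbols = sorted(
        set(grammar) | {s for bodies in grammar.values() for b in bodies for s in b}
    )
    (start_body,) = grammar[start]  # the augmented start has one production
    collection = [closure({(start, start_body, 0)}, grammar)]
    for state in collection:  # the list grows while we iterate over it
        for x in symbols:
            target = goto(state, x, grammar)
            if target and target not in collection:
                collection.append(target)
    return collection


C = canonical_collection(GRAMMAR)
print(len(C))
```

For this grammar the loop stops after finding twelve sets of items, which correspond to the states I0 through I11 of the LR(0) automaton.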

Example)

This will take about 2-3 pages.


g) Use of the LR(0) automaton

The central idea of “simple LR,” or SLR, parsing is the construction of the LR(0) automaton from the grammar. The states of this automaton are the sets of items from the canonical LR(0) collection, and the transitions are given by the GOTO function.

The start state of the LR(0) automaton is CLOSURE({[𝑆′ → ⋅ 𝑆]}), where 𝑆′ is the start symbol of the augmented grammar.

In state 𝑗 (node 𝐼ⱼ), the parser works as follows:

• If there is a transition on the next input symbol 𝐚, the parser shifts 𝐚 and follows the transition (changes state);

• If there is no transition on 𝐚, the parser reduces; the items in 𝐼ⱼ tell which production to use.

Example)

Stack Symbols Input Action

[Figure: the part of the LR(0) automaton used in this example, with states I0, I1, I2, I3, I5, I7, and I10, transitions labeled E, T, F, id, and *, and "accept" on $ from I1.]


h) LR parsing algorithm

⊛ Structure of LR parsing program

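[Figure: model of an LR parser. The input buffer holds a1 ... ai ... an $; the stack holds states ... s_(m-1) s_m above the bottom marker $; the LR parsing program reads both, consults the ACTION and GOTO parts of the parsing table, and produces the output.]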

⊛ Structure of LR parsing table

1. Given state 𝑖 and terminal 𝐚 (or the end marker $), ACTION[𝑖, 𝐚] has one of four forms:

a. Shift 𝑗: the parser shifts 𝐚 onto the stack, but uses state 𝑗 to represent 𝐚. This works because 𝐚 can be identified from 𝑗.

b. Reduce 𝐴 → 𝛽: the parser reduces 𝛽 on the top of the stack to the head 𝐴.

c. Accept: the parser accepts the input and finishes parsing.

d. Error: the parser discovers an error in the input and takes some corrective action.

2. GOTO maps a state 𝑖 and a nonterminal 𝐴 to a state: GOTO[𝑖, 𝐴] = 𝑗.

⊛ LR parsing configuration

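A configuration of an LR parser (the complete state of the parser) is a pair (𝑠₀ 𝑠₁ ⋯ 𝑠ₘ, 𝐚ᵢ 𝐚ᵢ₊₁ ⋯ 𝐚ₙ $), where the first component is the stack contents (top on the right) and the second component is the remaining input.

It represents the right-sentential form 𝑋₁ 𝑋₂ ⋯ 𝑋ₘ 𝐚ᵢ 𝐚ᵢ₊₁ ⋯ 𝐚ₙ, where 𝑋ₖ is the grammar symbol represented by state 𝑠ₖ.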


i) Constructing SLR parsing table

Initially, given a grammar 𝐺:

1. Add a new start symbol 𝑆′ and the production 𝑆′ → 𝑆. The result is the augmented grammar 𝐺′.

2. From 𝐺′, construct the canonical collection 𝐶 = {𝐼₀, 𝐼₁, …, 𝐼ₙ}, the collection of sets of LR(0) items for 𝐺′, together with the GOTO function.

3. Determine FOLLOW(𝐴) for each nonterminal 𝐴.

The ACTION and GOTO entries in parsing table are constructed using the following algorithm.

1. Construct the canonical collection 𝐶 = {𝐼₀, 𝐼₁, …, 𝐼ₙ}.

2. State 𝑖 is constructed from 𝐼ᵢ. The parsing actions for state 𝑖 are determined as follows:

(a) If [𝐴 → 𝛼 ⋅ 𝐚𝛽] ∈ 𝐼ᵢ and GOTO(𝐼ᵢ, 𝐚) = 𝐼ⱼ, then ACTION[𝑖, 𝐚] = "shift 𝑗", where 𝐚 must be a terminal.

(b) If [𝐴 → 𝛼 ⋅] ∈ 𝐼ᵢ, then ACTION[𝑖, 𝐚] = "reduce 𝐴 → 𝛼" for all 𝐚 ∈ FOLLOW(𝐴). Here, 𝐴 ≠ 𝑆′.

(c) If [𝑆′ → 𝑆 ⋅] ∈ 𝐼ᵢ, then ACTION[𝑖, $] = "accept".

If any conflicting actions result from the above rules, then the grammar is not SLR(1), and the algorithm fails to produce a parser.

3. If GOTO(𝐼ᵢ, 𝐴) = 𝐼ⱼ for nonterminal 𝐴, then GOTO[𝑖, 𝐴] = 𝑗.

4. All entries not defined by rules 2 and 3 are “error”.

5. The initial state of the parser is the state constructed from the set of items containing [𝑆′ → ⋅ 𝑆].

The parsing table constructed using the above algorithm is called the SLR(1) table for 𝐺. An LR parser using the SLR(1) table is called an SLR(1) parser. A grammar that has an SLR(1) parsing table is said to be SLR(1).

We usually omit “(1).”
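A compact Python sketch of the table construction for the expression grammar with 𝐹 → (𝐸) | 𝐢𝐝 follows. Several things here are assumptions made to keep it short: items are `(production index, dot)` pairs, the FOLLOW sets are written in by hand rather than computed, and states are numbered in discovery order, so the numbers differ from I0 through I11 in the notes.

```python
PRODS = [
    ("E'", ("E",)),            # 0: augmented start production
    ("E", ("E", "+", "T")),    # 1
    ("E", ("T",)),             # 2
    ("T", ("T", "*", "F")),    # 3
    ("T", ("F",)),             # 4
    ("F", ("(", "E", ")")),    # 5
    ("F", ("id",)),            # 6
]
NONTERMS = {head for head, _ in PRODS}
# FOLLOW sets for this grammar, supplied by hand (assumption of the sketch).
FOLLOW = {"E": {"+", ")", "$"}, "T": {"+", "*", ")", "$"}, "F": {"+", "*", ")", "$"}}


def closure(items):
    result, work = set(items), list(items)
    while work:
        p, dot = work.pop()
        body = PRODS[p][1]
        if dot < len(body):
            for q, (head, _) in enumerate(PRODS):
                if head == body[dot] and (q, 0) not in result:
                    result.add((q, 0))
                    work.append((q, 0))
    return frozenset(result)


def goto(state, x):
    moved = {(p, d + 1) for p, d in state
             if d < len(PRODS[p][1]) and PRODS[p][1][d] == x}
    return closure(moved)


def build_slr():
    symbols = sorted({s for _, body in PRODS for s in body})
    states = [closure({(0, 0)})]
    action, goto_table = {}, {}

    def set_action(i, a, act):
        # A second, different entry for the same cell is a conflict.
        if action.setdefault((i, a), act) != act:
            raise ValueError(f"conflict in state {i} on {a!r}: not SLR(1)")

    i = 0
    while i < len(states):
        for x in symbols:
            target = goto(states[i], x)
            if not target:
                continue
            if target not in states:
                states.append(target)
            j = states.index(target)
            if x in NONTERMS:
                goto_table[i, x] = j                 # rule 3
            else:
                set_action(i, x, ("shift", j))       # rule 2(a)
        for p, dot in states[i]:
            head, body = PRODS[p]
            if dot == len(body):
                if p == 0:
                    set_action(i, "$", ("accept",))  # rule 2(c)
                else:
                    for a in FOLLOW[head]:
                        set_action(i, a, ("reduce", p))  # rule 2(b)
        i += 1
    return states, action, goto_table


states, action, goto_table = build_slr()
print(len(states), len(action), len(goto_table))
```

Missing keys in `action` play the role of the “error” entries of rule 4. Because no `ValueError` is raised, the sketch agrees with the statement that this grammar is SLR(1).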


Example) Given grammar: (This will take about 2 pages.)

(1) 𝐸 → 𝐸 + 𝑇
(2) 𝐸 → 𝑇
(3) 𝑇 → 𝑇 ∗ 𝐹
(4) 𝑇 → 𝐹
(5) 𝐹 → (𝐸)
(6) 𝐹 → 𝐢𝐝


As a result, the parsing table of the grammar has the following form:

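State |            ACTION               |    GOTO
      |  id    +    *    (    )    $    |  E    T    F
  0   |  s5              s4             |  1    2    3
  1   |        s6                  acc  |
  2   |        r2   s7        r2   r2   |
  3   |        r4   r4        r4   r4   |
  4   |  s5              s4             |  8    2    3
  5   |        r6   r6        r6   r6   |
  6   |  s5              s4             |       9    3
  7   |  s5              s4             |            10
  8   |        s6             s11       |
  9   |        r1   s7        r1   r1   |
 10   |        r3   r3        r3   r3   |
 11   |        r5   r5        r5   r5   |

Here s𝑗 means “shift 𝑗”, r𝑘 means “reduce by production (𝑘)”, acc means “accept”, and a blank entry means “error”. State 𝑖 corresponds to the item set 𝐼ᵢ of the canonical collection.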


j) LR parsing algorithm

⊛ Behavior of LR parser (Tasks of parsing program)

The parser determines its next move from the configuration (𝑠₀ 𝑠₁ ⋯ 𝑠ₘ, 𝐚ᵢ 𝐚ᵢ₊₁ ⋯ 𝐚ₙ $) by reading the current input symbol 𝐚ᵢ and the state 𝑠ₘ on top of the stack, and then consulting ACTION[𝑠ₘ, 𝐚ᵢ]:

1. If ACTION[𝑠ₘ, 𝐚ᵢ] = "shift 𝑠", then the parser
   a. shifts the next state 𝑠 onto the stack, and
   b. removes 𝐚ᵢ from the input.
   c. The configuration becomes (𝑠₀ 𝑠₁ ⋯ 𝑠ₘ 𝑠, 𝐚ᵢ₊₁ ⋯ 𝐚ₙ $).

2. If ACTION[𝑠ₘ, 𝐚ᵢ] = "reduce 𝐴 → 𝛽", then the parser
   a. pops 𝑟 = |𝛽| states, where |𝛽| is the length of 𝛽,
   b. pushes 𝑠 = GOTO[𝑠ₘ₋ᵣ, 𝐴] onto the stack,
   c. leaves the current input unchanged (in a reduce move),
   d. so that the configuration becomes (𝑠₀ 𝑠₁ ⋯ 𝑠ₘ₋ᵣ 𝑠, 𝐚ᵢ 𝐚ᵢ₊₁ ⋯ 𝐚ₙ $), and
   e. generates the output of the LR parser (for example, the production used).

3. If ACTION[𝑠ₘ, 𝐚ᵢ] = "accept", parsing is completed.

4. If ACTION[𝑠ₘ, 𝐚ᵢ] = "error", the parser has discovered an error and calls an error recovery routine.
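The four cases can be sketched as a short Python driver. The ACTION and GOTO tables below are written out by hand for the expression grammar with productions (1) 𝐸 → 𝐸 + 𝑇, (2) 𝐸 → 𝑇, (3) 𝑇 → 𝑇 ∗ 𝐹, (4) 𝑇 → 𝐹, (5) 𝐹 → (𝐸), (6) 𝐹 → 𝐢𝐝, using the state numbering I0 through I11; the string encoding ("s5", "r6", "acc") and the name `lr_parse` are assumptions of this sketch. Error recovery is reduced to raising an exception.

```python
# Production number -> (head, length of body).
PRODS = {1: ("E", 3), 2: ("E", 1), 3: ("T", 3), 4: ("T", 1), 5: ("F", 3), 6: ("F", 1)}

ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"+": "r2", "*": "s7", ")": "r2", "$": "r2"},
    3: {"+": "r4", "*": "r4", ")": "r4", "$": "r4"},
    4: {"id": "s5", "(": "s4"},
    5: {"+": "r6", "*": "r6", ")": "r6", "$": "r6"},
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"+": "r1", "*": "s7", ")": "r1", "$": "r1"},
    10: {"+": "r3", "*": "r3", ")": "r3", "$": "r3"},
    11: {"+": "r5", "*": "r5", ")": "r5", "$": "r5"},
}
GOTO = {
    (0, "E"): 1, (0, "T"): 2, (0, "F"): 3,
    (4, "E"): 8, (4, "T"): 2, (4, "F"): 3,
    (6, "T"): 9, (6, "F"): 3,
    (7, "F"): 10,
}


def lr_parse(tokens):
    """Return the production numbers used, in the order they are reduced."""
    stack = [0]                       # stack of states; state 0 is initial
    rest = list(tokens) + ["$"]
    output = []
    while True:
        act = ACTION[stack[-1]].get(rest[0], "error")
        if act == "acc":
            return output
        if act == "error":
            raise SyntaxError(f"unexpected {rest[0]!r} in state {stack[-1]}")
        if act[0] == "s":             # shift: push the state, consume input
            stack.append(int(act[1:]))
            rest.pop(0)
        else:                         # reduce by production number k
            k = int(act[1:])
            head, r = PRODS[k]
            del stack[len(stack) - r:]               # pop r = |body| states
            stack.append(GOTO[stack[-1], head])      # GOTO on the head A
            output.append(k)


print(lr_parse(["id", "*", "id", "+", "id"]))
```

The returned list of production numbers, read in reverse, is the rightmost derivation of the input. Note that the reduce step indexes GOTO with the head of the production, not with the current input symbol.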

Example)

Given input “𝐢𝐝 ∗ 𝐢𝐝 + 𝐢𝐝”


Stack Symbols Input Action