Lesson 6
CDT301 – Compiler Theory, Spring 2011
Teacher: Linus Källberg
Outline
• Code generation using syntax-directed translation
• Lexical analysis
CODE GENERATION USING SYNTAX-DIRECTED TRANSLATION
Syntax-directed translation
• Add attributes to the grammar symbols
• Add semantic actions to the grammar
  – Syntax-directed translation scheme
• “Inject” code into the parser
SDT example (Section 2.3 in the book)
• Expression grammar:
    expr → expr + num
         | expr – num
         | num
• Infix to postfix notation
Infix expression     Postfix expression
1 + 2                1 2 +
(1 + 2) – 3          1 2 + 3 –
1 + (2 – 3)          1 2 3 – +
• Formal definition:
  – POSTFIX(num) = num
  – POSTFIX( (E) ) = POSTFIX(E)
  – POSTFIX(E1 op E2) = POSTFIX(E1) POSTFIX(E2) op
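The recursive definition of POSTFIX can be sketched as a function over an expression tree. This is only an illustration: the `Expr` type and the `postfix` function are hypothetical names, not something from the book or the slides. Parentheses do not appear in the tree because the tree shape already encodes the grouping, which is why the rule POSTFIX( (E) ) = POSTFIX(E) needs no code of its own.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical expression tree: a leaf holds a number, an inner node an operator. */
typedef struct Expr {
    char op;                  /* '+', '-', '*', '/', or 0 for a number leaf */
    int value;                /* used only when op == 0 */
    const struct Expr *left;  /* E1, used only when op != 0 */
    const struct Expr *right; /* E2, used only when op != 0 */
} Expr;

/* Appends POSTFIX(e) to out, separating items with single spaces.
   out must already hold a NUL-terminated string and have room for outsize bytes. */
void postfix(const Expr *e, char *out, size_t outsize) {
    size_t len = strlen(out);
    if (e->op == 0) {
        /* POSTFIX(num) = num */
        snprintf(out + len, outsize - len, "%s%d", len ? " " : "", e->value);
    } else {
        /* POSTFIX(E1 op E2) = POSTFIX(E1) POSTFIX(E2) op */
        postfix(e->left, out, outsize);
        postfix(e->right, out, outsize);
        len = strlen(out);
        snprintf(out + len, outsize - len, " %c", e->op);
    }
}
```

Called on the tree for (1 + 2) – 3 with an empty buffer, the function fills the buffer with the operands in their original order and each operator after its two operands.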
Exercise (1)
Translate the following infix expressions into postfix notation:
a) 78
b) 3 – 2 – 1
c) (8 + 19 * 3)
d) 3 * (17) / (92 + 8)
Assume conventional operator precedence and associativity.
SDT example (Section 2.3 in the book)
• Translation scheme:
    expr → expr + num { print(num.value); print('+') }
         | expr – num { print(num.value); print('-') }
         | num        { print(num.value) }
• Extended parse tree for 1 + 2 – 3:

expr
    expr
        expr
            num (1)
            { print(num.value) }
        +
        num (2)
        { print(num.value); print('+') }
    –
    num (3)
    { print(num.value); print('-') }
Exercise (2)
Traverse the following extended parse tree in a depth-first, left-to-right order and execute the semantic actions:

expr
    expr
        expr
            num (1)
            { print(num.value) }
        +
        num (2)
        { print(num.value); print('+') }
    –
    num (3)
    { print(num.value); print('-') }
Left recursion elimination
    expr → num { print(num.value) } rest
    rest → + num { print(num.value); print('+') } rest
    rest → - num { print(num.value); print('-') } rest
    rest → ε
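With the left recursion removed, each nonterminal can become one function of a recursive-descent translator, with the semantic actions executed at the point where they appear in the production. The following is a minimal sketch under some assumptions that are not in the slides: the input is a string, a num is a run of decimal digits, spaces are skipped, and the output is collected in a buffer rather than printed. The names `translate`, `emit`, and `skip_spaces` are hypothetical.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical translator state: input cursor and collected output. */
static const char *input;
static char output[256];

/* Plays the role of print(): appends text to the output, space-separated. */
static void emit(const char *text) {
    size_t len = strlen(output);
    snprintf(output + len, sizeof output - len, "%s%s", len ? " " : "", text);
}

static void skip_spaces(void) {
    while (*input == ' ') input++;
}

/* Matches num and runs the action { print(num.value) }. Returns 0 on error. */
static int num(void) {
    char text[32];
    size_t n = 0;
    skip_spaces();
    if (!isdigit((unsigned char)*input)) return 0;
    while (isdigit((unsigned char)*input) && n < sizeof text - 1)
        text[n++] = *input++;
    text[n] = '\0';
    emit(text);
    return 1;
}

/* rest -> + num {...; print('+')} rest | - num {...; print('-')} rest | ε */
static int rest(void) {
    skip_spaces();
    if (*input == '+' || *input == '-') {
        char op[2] = { *input++, '\0' };
        if (!num()) return 0;   /* print(num.value) happens inside num() */
        emit(op);               /* print('+') or print('-') */
        return rest();
    }
    return 1;                   /* rest -> ε: nothing to match, nothing to print */
}

/* expr -> num rest. Returns the postfix text, or NULL on a syntax error. */
const char *translate(const char *text) {
    input = text;
    output[0] = '\0';
    if (num() && rest() && *input == '\0')
        return output;
    return NULL;
}
```

For the input `1 + 2 - 3` the actions run in the same order as in a depth-first, left-to-right traversal of the parse tree for the new grammar.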
Exercise (3)
Draw the parse tree for 1 + 2 – 3 (i.e. num + num – num) with the new grammar. Include the semantic actions as leaf nodes. Then traverse it and execute the semantic actions.
Syntax-directed definitions
• Similar to translation schemes
• More “abstract” or “declarative”

Production              Semantic rules
expr → expr1 + num      expr.t = expr1.t || num.value || '+'
     | expr1 – num      expr.t = expr1.t || num.value || '-'
     | num              expr.t = num.value
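In contrast to the translation scheme, the semantic rules above do not print anything; they define the attribute t of each expr node in terms of the attributes of its children. One way to evaluate them is a bottom-up pass over the parse tree, sketched here. The `Node` type and the `evaluate` function are hypothetical, and the sketch inserts a space at each `||` so that adjacent numbers stay readable.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parse-tree node for the productions
   expr -> expr1 + num | expr1 - num | num. */
typedef struct Node {
    struct Node *expr1;  /* left child, or NULL for expr -> num */
    char op;             /* '+' or '-', unused for expr -> num */
    int num_value;       /* num.value */
    char t[64];          /* synthesized attribute expr.t */
} Node;

/* Computes the attribute t of every node in the subtree, children first. */
void evaluate(Node *expr) {
    if (expr->expr1 == NULL) {
        /* expr.t = num.value */
        snprintf(expr->t, sizeof expr->t, "%d", expr->num_value);
    } else {
        size_t len;
        evaluate(expr->expr1);
        /* expr.t = expr1.t || num.value || op */
        snprintf(expr->t, sizeof expr->t, "%s", expr->expr1->t);
        len = strlen(expr->t);
        snprintf(expr->t + len, sizeof expr->t - len, " %d %c",
                 expr->num_value, expr->op);
    }
}
```

After `evaluate` has run on the root, every node holds the postfix form of its own subexpression, so the result is available as data instead of as a side effect.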
LEXICAL ANALYSIS
Lexical analysis
• “Lexical analyzer”/“scanner”/“tokenizer”
• Simplifies the parser:
  – Removes whitespace
  – Removes comments
  – Identifies lexemes and returns tokens
Tokens
• Name + attribute
• Attributes:
  – Line and column number
  – Identifier name/symbol table index
  – Numerical value
  – …
• Lexemes
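A token as described above could be represented roughly as follows. The type and field names are made up for this sketch; a real scanner would have one token name per terminal of its grammar and might store the attributes differently.

```c
/* Hypothetical token names. */
typedef enum { TOK_NUM, TOK_ID, TOK_PLUS, TOK_MINUS, TOK_EOF } TokenName;

typedef struct {
    TokenName name;        /* what kind of token the parser sees */
    int line, column;      /* where the lexeme starts in the source */
    union {
        int value;         /* numerical value, for TOK_NUM */
        int symbol_index;  /* symbol table index, for TOK_ID */
    } attr;
} Token;

/* Builds the token for a numeric lexeme such as "42". */
Token make_num(int value, int line, int column) {
    Token token;
    token.name = TOK_NUM;
    token.line = line;
    token.column = column;
    token.attr.value = value;
    return token;
}
```

The lexeme is the character sequence in the source (for example `42`), while the token is the name `TOK_NUM` together with the attribute 42.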
Differing requirements
• Allow spaces in identifiers?
  – Example: Fortran 90
• Allow keywords as identifiers?
  – Example: PL/1
• Language support for configuring the lexical analysis?
  – Example: TeX
Implementing lexical analysis
• Finite state machine?
• Hard-coded?
• Use a generator tool?
Input buffering
int lineno = 1, attribute = NONE;

int GetNextToken(void) {
    char t;
    for (t = ReadChar(); t != 0; t = ReadChar()) {
        if (t == ' ' || t == '\t')
            ; /* Skip white space */
        else if (t == '\n')
            lineno++;
        else if ('0' <= t && t <= '9') {
            attribute = GetNum(t);
            return NUM;
        }
        else { /* Error handling */
            attribute = NONE;
            return UNKNOWN_TOKEN;
        }
    }
    return EOF; /* End of file token */
}
int GetNum(char t) {
    int num = 0;
    for (; '0' <= t && t <= '9'; t = ReadChar()) {
        num *= 10;
        num += t - '0';
    }
    // Put back the char that caused the loop to exit
    PutBack(t);
    return num;
}
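GetNextToken and GetNum rely on ReadChar and PutBack, which the slides do not show. A minimal sketch of the two helpers follows, assuming the whole source text is held in memory as a NUL-terminated string and that ReadChar returns 0 at the end of the input, as the loop in GetNextToken expects. The `SetInput` function and the two static variables are hypothetical additions.

```c
#include <stddef.h>

/* Hypothetical input state: the source text and the read position in it. */
static const char *source = "";
static size_t position = 0;

/* Selects the text to scan and rewinds to its beginning. */
void SetInput(const char *text) {
    source = text;
    position = 0;
}

/* Returns the next character, or 0 at the end of the input. */
char ReadChar(void) {
    char c = source[position];
    if (c != 0)
        position++;
    return c;
}

/* Makes the most recently read character available again.
   Putting back the end-of-input marker is a no-op, since it was never consumed. */
void PutBack(char c) {
    if (c != 0 && position > 0)
        position--;
}
```

With this design only one character of lookahead can be undone reliably, which is all that GetNum needs.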
DFA-based scanner
Symbol    State
          0   1   2   3   4   5   6   7   8
<         1   4                   8
>         6   3                   8
=         5   2                   7
other         4                   8
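A table-driven scanner keeps the transitions in an array and runs one generic loop over it. The sketch below makes several assumptions that the slide itself does not state: the table entries belong to states 0, 1, and 6; the automaton recognizes the relational operators <, <=, <>, =, >, and >= as in the textbook's transition diagram; states 2, 3, 4, 5, 7, and 8 are accepting; and states 4 and 8 have read one character too many and must give it back. All identifiers are hypothetical.

```c
#include <stddef.h>

enum { SYM_LT, SYM_GT, SYM_EQ, SYM_OTHER, NUM_SYMBOLS };
enum { NUM_STATES = 9, NO_MOVE = -1 };

/* next_state[symbol][state]; NO_MOVE marks an empty table cell. */
static const int next_state[NUM_SYMBOLS][NUM_STATES] = {
    /* state:     0   1   2   3   4   5   6   7   8 */
    /* <     */ { 1,  4, -1, -1, -1, -1,  8, -1, -1 },
    /* >     */ { 6,  3, -1, -1, -1, -1,  8, -1, -1 },
    /* =     */ { 5,  2, -1, -1, -1, -1,  7, -1, -1 },
    /* other */ {-1,  4, -1, -1, -1, -1,  8, -1, -1 },
};

/* Assumed accepting states: 2 (<=), 3 (<>), 4 (<), 5 (=), 7 (>=), 8 (>). */
static const int is_final[NUM_STATES] = { 0, 0, 1, 1, 1, 1, 0, 1, 1 };

static int classify(char c) {
    switch (c) {
    case '<': return SYM_LT;
    case '>': return SYM_GT;
    case '=': return SYM_EQ;
    default:  return SYM_OTHER;   /* includes the terminating NUL */
    }
}

/* Runs the DFA from state 0 on text. Returns the accepting state reached and
   stores the lexeme length in *length, or returns -1 if text does not start
   with a relational operator. */
int scan_relop(const char *text, size_t *length) {
    int state = 0;
    size_t i = 0;
    while (!is_final[state]) {
        int next = next_state[classify(text[i])][state];
        if (next == NO_MOVE)
            return -1;            /* only possible in state 0 */
        state = next;
        i++;
    }
    if (state == 4 || state == 8)
        i--;                      /* retract: the last character is not part of the lexeme */
    *length = i;
    return state;
}
```

The loop never looks at the table contents directly, so supporting another set of tokens means changing only the arrays and `classify`.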
Differentiating between keywords and identifiers
• Two strategies:
  – Keyword table
  – Test for keywords before identifiers
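The keyword-table strategy can be sketched as follows: the scanner first recognizes any word with the identifier rule and then looks the lexeme up in a table. The token names, the table contents, and the function name `ClassifyWord` are invented for this example.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical token names. */
enum { TOKEN_ID = 256, TOKEN_IF, TOKEN_ELSE, TOKEN_WHILE };

/* Hypothetical keyword table: one entry per reserved word. */
static const struct {
    const char *lexeme;
    int token;
} keywords[] = {
    { "if", TOKEN_IF },
    { "else", TOKEN_ELSE },
    { "while", TOKEN_WHILE },
};

/* Returns the keyword's token if lexeme is in the table, otherwise TOKEN_ID. */
int ClassifyWord(const char *lexeme) {
    size_t i;
    for (i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i].lexeme) == 0)
            return keywords[i].token;
    return TOKEN_ID;
}
```

A linear search is enough for a handful of keywords; a larger table might be sorted or hashed instead.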
Error recovery
• Often hard to detect
  – Misspelled keywords = valid identifiers
  – Misspelled identifiers hard to detect
• Recovery strategies:
  – Panic mode
  – Try to “fix” the input
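Panic mode in a scanner usually means discarding characters until one is found that can begin a valid token, and then resuming normal scanning. A minimal sketch, assuming a small hypothetical language whose tokens start with a letter, a digit, `+`, `-`, or white space; both function names are made up for illustration.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical rule for which characters may begin a token. */
static int can_start_token(char c) {
    return isalnum((unsigned char)c) || c == '+' || c == '-' ||
           c == ' ' || c == '\t' || c == '\n';
}

/* Panic mode: returns how many characters at the start of text must be
   discarded before scanning can resume (0 if none). */
size_t panic_skip(const char *text) {
    size_t skipped = 0;
    while (text[skipped] != '\0' && !can_start_token(text[skipped]))
        skipped++;
    return skipped;
}
```

The scanner would typically report one error for the whole skipped run rather than one per character.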
Conclusion
• Code generation using syntax-directed translation
• Lexical analysis
Next time
• Stack machine code
• Generating stack machine code using SDT