8/7/2019 SAPRAMCLEEppt
• A token is a string of characters, categorized according to the rules as a symbol (e.g. IDENTIFIER, NUMBER, COMMA, etc.).
• There is a set of strings in the input for which the same token is produced as output. This set of strings is described by a rule called a pattern associated with the token.
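A minimal sketch of the token/pattern relationship, assuming each pattern is written as a regular expression. The pattern table and the `classify` helper are hypothetical names chosen for illustration.

```python
import re

# Hypothetical patterns; each one describes the set of strings
# that produce the same token.
PATTERNS = [
    ("NUMBER", r"[0-9]+"),
    ("IDENTIFIER", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("COMMA", r","),
]

def classify(lexeme):
    """Return the first token whose pattern matches the whole string, else None."""
    for token, pattern in PATTERNS:
        if re.fullmatch(pattern, lexeme):
            return token
    return None

# Different strings, same token:
print(classify("x"), classify("y11"))    # IDENTIFIER IDENTIFIER
print(classify("1000"), classify(","))   # NUMBER COMMA
```

Many different input strings map to IDENTIFIER here because they all belong to the set described by that one pattern.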
• A lexical analyzer generally does nothing with combinations of tokens, a task left for a parser.
• For example, a typical lexical analyzer recognizes parentheses as tokens, but does nothing to ensure that each '(' is matched with a ')'.
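The split of responsibilities can be sketched as follows. Both functions are hypothetical: `tokenize` stands in for the lexical analyzer and `balanced` for the kind of structural check a parser would make. This toy `tokenize` simply skips characters it does not recognize.

```python
import re

def tokenize(text):
    """Lexer side: pick out parenthesis and identifier tokens."""
    return re.findall(r"[()]|[A-Za-z_]\w*", text)

def balanced(tokens):
    """Parser side: every '(' must be closed by a later ')'."""
    depth = 0
    for tok in tokens:
        if tok == "(":
            depth += 1
        elif tok == ")":
            depth -= 1
            if depth < 0:
                return False
    return depth == 0

tokens = tokenize("f ( ( x )")
print(tokens)            # the lexer produces tokens without complaint
print(balanced(tokens))  # only the structural check rejects the input
```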
• The lexical analyzer (either generated automatically by a tool like lex, or hand-crafted) reads in a stream of characters, identifies the lexemes in the stream, and categorizes them into tokens.
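A hand-crafted analyzer of this kind might look like the sketch below. The `scan` function and the `SYMBOL` token name are assumptions made for illustration. Only identifiers, unsigned integers and one-character symbols are handled.

```python
def scan(text):
    """Hand-written scanner: returns a list of (token, lexeme) pairs."""
    tokens = []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch.isspace():                     # whitespace separates lexemes
            i += 1
        elif ch.isalpha() or ch == "_":      # identifier: letter or _, then letters, digits, _
            start = i
            while i < len(text) and (text[i].isalnum() or text[i] == "_"):
                i += 1
            tokens.append(("IDENT", text[start:i]))
        elif ch.isdigit():                   # integer constant: one or more digits
            start = i
            while i < len(text) and text[i].isdigit():
                i += 1
            tokens.append(("INTCONSTANT", text[start:i]))
        else:                                # anything else: one-character symbol
            tokens.append(("SYMBOL", ch))
            i += 1
    return tokens

print(scan("count1 + 42"))
```

Each branch reads characters for as long as they can extend the current lexeme, then records the lexeme together with its category.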
• Identifiers: x, y11, elsex_i00
• Keywords: if, else, while
• Integers: 2, 1000, -500, +6663554
• Floating point: 2.0, 0.00020, .02
• Symbols: +, *, -, <, [, ], >, =, .., /
• Comments: { don't change this }
Some token types have values associated with them:

TYPE          VALUE
IDENT         sqrt
INTCONSTANT   1
RELOP         >
ADDOP         -
• Numeric literals in Pascal
Definition of the token unsigned_number:
digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
unsigned_integer → digit* digit
unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε ) ( ( e ( + | - | ε ) unsigned_integer ) | ε )
• Recursion is not allowed!
• Notice the use of parentheses to avoid ambiguity
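Because none of the definitions is recursive, each name can be expanded in place, so the whole definition collapses into a single regular expression. The sketch below does that with Python's `re` module. It assumes the optional parts are "fraction or nothing" and "exponent with `+`, `-`, or no sign", and the constant names are hypothetical.

```python
import re

# Each definition only uses names defined before it (no recursion),
# so plain string substitution is enough to build the final pattern.
DIGIT = r"[0-9]"
UNSIGNED_INTEGER = rf"{DIGIT}*{DIGIT}"
UNSIGNED_NUMBER = (
    rf"{UNSIGNED_INTEGER}"
    rf"(\.{UNSIGNED_INTEGER})?"        # optional fraction
    rf"(e[+-]?{UNSIGNED_INTEGER})?"    # optional exponent with optional sign
)

def is_unsigned_number(text):
    """True if the whole string matches the unsigned_number pattern."""
    return re.fullmatch(UNSIGNED_NUMBER, text) is not None

print(is_unsigned_number("6.02e+23"))  # True
print(is_unsigned_number(".02"))       # False: an integer part is required
```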
input: x = x * (acc+123)

token         value
identifier    x
equal         =
identifier    x
star          *
left-paren    (
identifier    acc
plus          +
integer       123
right-paren   )

• Tokens are typically represented by numbers
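One way to represent tokens as numbers is an integer enumeration, sketched below. The `Token` class and its numeric codes are arbitrary, hypothetical choices. Only the token names and values come from the example input x = x * (acc+123).

```python
from enum import IntEnum

class Token(IntEnum):
    # The numeric codes are arbitrary; any distinct integers would do.
    IDENTIFIER = 1
    EQUAL = 2
    STAR = 3
    LEFT_PAREN = 4
    PLUS = 5
    INTEGER = 6
    RIGHT_PAREN = 7

# Token stream for: x = x * (acc+123)
stream = [
    (Token.IDENTIFIER, "x"),
    (Token.EQUAL, "="),
    (Token.IDENTIFIER, "x"),
    (Token.STAR, "*"),
    (Token.LEFT_PAREN, "("),
    (Token.IDENTIFIER, "acc"),
    (Token.PLUS, "+"),
    (Token.INTEGER, "123"),
    (Token.RIGHT_PAREN, ")"),
]

codes = [int(tok) for tok, _ in stream]
print(codes)  # [1, 2, 1, 3, 4, 1, 5, 6, 7]
```

A parser can then compare small integers instead of strings, while the value travels alongside the code for tokens that need one.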
• Lexeme: the character sequence matched by an instance of the token.
Example: sqrt
Consider this expression in the C programming language:
sum=3+2;
Tokenized in the following table:

LEXEME   TOKEN TYPE
sum      Identifier
=        Assignment operator
3        Integer
+        Addition operator
2        Integer
;        Semi
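The tokenization of sum=3+2; can be reproduced with a small rule table, loosely in the spirit of a lex specification. This is a sketch under stated assumptions: the `RULES` table and `tokenize` function are hypothetical, and the first matching rule wins, which is simpler than the longest-match policy real generators usually apply.

```python
import re

# Hypothetical rule table: (token type, compiled pattern), tried in order.
RULES = [
    ("Identifier", re.compile(r"[A-Za-z_][A-Za-z0-9_]*")),
    ("Integer", re.compile(r"[0-9]+")),
    ("Assignment operator", re.compile(r"=")),
    ("Addition operator", re.compile(r"\+")),
    ("Semi", re.compile(r";")),
]

def tokenize(text):
    """Return (lexeme, token type) pairs; raise on an unrecognized character."""
    pos, result = 0, []
    while pos < len(text):
        if text[pos].isspace():
            pos += 1
            continue
        for token_type, regex in RULES:
            m = regex.match(text, pos)   # match anchored at the current position
            if m:
                result.append((m.group(), token_type))
                pos = m.end()
                break
        else:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
    return result

for lexeme, token_type in tokenize("sum=3+2;"):
    print(f"{lexeme:<8}{token_type}")
```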
Source code (character stream)
  → Lexical analysis
Token stream
  → Parsing
Abstract syntax tree
  → Intermediate Code Generation
Intermediate code
  → Code Generation
Assembly code
CS331 Lexical Analysis
Source code (character stream): if (b == 0) a = hi;
  → Lexical analysis
Token stream: if ( b == 0 ) a = hi ;
  → Parsing
  → Semantic Analysis
• Lexical analysis converts a character stream to a token stream of (token type, value) pairs

if (x1 * x2 < 1.0) {
    y = x1;
}
i f ( x 1 * x 2 < 1 . 0 ) { \n
KEY:if LPAREN ID:x1 OP:* ID:x2 RELOP: