SAPRAMCLEEppt

  • 8/7/2019 SAPRAMCLEEppt

    • A token is a string of characters, categorized according to the rules as a symbol (e.g. IDENTIFIER, NUMBER, COMMA, etc.).

    • There is a set of strings in the input for which the same token is produced as output. This set of strings is described by a rule, called a pattern, associated with the token.
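The relationship between a pattern and a token can be sketched with regular expressions. The snippet below is only an illustration and is not taken from the slides: the `PATTERNS` table and the `token_for` helper are hypothetical names, and the patterns themselves are assumptions chosen to fit the example token names above.

```python
import re

# Hypothetical patterns: each token name is paired with a rule (a regex)
# describing the set of strings that produce that token.
PATTERNS = {
    "IDENTIFIER": r"[A-Za-z_][A-Za-z0-9_]*",
    "NUMBER": r"[0-9]+",
    "COMMA": r",",
}


def token_for(text):
    """Return the name of the first token whose pattern matches all of text."""
    for name, pattern in PATTERNS.items():
        if re.fullmatch(pattern, text):
            return name
    return None


# Different input strings, same token: "x" and "total" both match IDENTIFIER.
print(token_for("x"), token_for("total"), token_for("42"), token_for(","))
```

Here the set of strings described by the IDENTIFIER pattern is infinite, yet every member of it yields the same token.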

    • A lexical analyzer generally does nothing with combinations of tokens, a task left for a parser.

    • For example, a typical lexical analyzer recognizes parentheses as tokens, but does nothing to ensure that each '(' is matched with a ')'.
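A small sketch of this division of labour, under the assumption that parentheses are the only tokens of interest: `lex_parens` and `parens_balanced` are hypothetical helper names, and the token names LPAREN and RPAREN are illustrative.

```python
import re


def lex_parens(text):
    """Return every '(' and ')' in text as a list of token names."""
    names = {"(": "LPAREN", ")": "RPAREN"}
    return [names[ch] for ch in re.findall(r"[()]", text)]


def parens_balanced(tokens):
    """Parser-level check: every LPAREN must be closed by a later RPAREN."""
    depth = 0
    for tok in tokens:
        depth += 1 if tok == "LPAREN" else -1
        if depth < 0:  # a ')' appeared before its '('
            return False
    return depth == 0


tokens = lex_parens("f(a))(")
print(tokens)                   # lexing succeeds even on unbalanced input
print(parens_balanced(tokens))  # the structural error is only caught here
```

The lexer happily produces tokens for `f(a))(`; deciding that the sequence is ill-formed needs a check over combinations of tokens, which is the parser's job.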

    • The lexical analyzer (either generated automatically by a tool like lex, or hand-crafted) reads in a stream of characters, identifies the lexemes in the stream, and categorizes them into tokens.
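A hand-crafted analyzer of this kind might look roughly like the sketch below. It is a minimal illustration, not the slides' implementation: the function name `scan` and the catch-all token name SYMBOL are assumptions, and only identifiers, unsigned integers, and one-character symbols are handled.

```python
def scan(chars):
    """Yield (token, lexeme) pairs from a string of characters."""
    i = 0
    while i < len(chars):
        ch = chars[i]
        if ch.isspace():                 # whitespace only separates lexemes
            i += 1
        elif ch.isalpha() or ch == "_":  # identifier: letter, then letters/digits
            start = i
            while i < len(chars) and (chars[i].isalnum() or chars[i] == "_"):
                i += 1
            yield ("IDENT", chars[start:i])
        elif ch.isdigit():               # integer constant: a run of digits
            start = i
            while i < len(chars) and chars[i].isdigit():
                i += 1
            yield ("INTCONSTANT", chars[start:i])
        else:                            # anything else: one-character symbol
            yield ("SYMBOL", ch)
            i += 1


print(list(scan("count1 + 42")))
```

Each branch first identifies where a lexeme starts and ends, then categorizes it by attaching a token name.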

    • Identifiers: x, y11, elsex, _i00

    • Keywords: if, else, while

    • Integers: 2, 1000, -500, +6663554

    • Floating point: 2.0, 0.00020, .02

    • Symbols: +, *, -, <, [, ], >, =, .., /

    • Comments: { don't change this }
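One way to tell these classes apart is a single regular expression with one named alternative per class. The sketch below is an assumption-laden illustration: `TOKEN_RE`, `KEYWORDS`, and `classify` are hypothetical names, the patterns are guesses that fit the examples listed, and a leading sign is treated as a separate symbol rather than as part of the integer.

```python
import re

KEYWORDS = {"if", "else", "while"}

# Order matters: FLOAT is tried before INTEGER so that "2.0" is not split.
TOKEN_RE = re.compile(r"""
    (?P<COMMENT>\{[^}]*\})
  | (?P<FLOAT>[0-9]*\.[0-9]+)
  | (?P<INTEGER>[0-9]+)
  | (?P<WORD>[A-Za-z_][A-Za-z0-9_]*)
  | (?P<SYMBOL>\.\.|[-+*/<>=\[\]])
  | (?P<SKIP>\s+)
""", re.VERBOSE)


def classify(text):
    """Return (class, lexeme) pairs; a word is a keyword only on exact match."""
    out = []
    pos = 0
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected character {text[pos]!r}")
        kind, lexeme = m.lastgroup, m.group()
        pos = m.end()
        if kind == "SKIP":
            continue
        if kind == "WORD":
            kind = "KEYWORD" if lexeme in KEYWORDS else "IDENTIFIER"
        out.append((kind, lexeme))
    return out


print(classify("if x < 2.0 else elsex"))
```

Because a whole word is matched before it is compared with the keyword set, `elsex` comes out as an identifier even though it begins with the keyword `else`.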

    Some token types have values associated with them:

    TYPE          VALUE
    IDENT         sqrt
    INTCONSTANT   1
    RELOP         >
    ADDOP         -

    • Numeric literals in Pascal

    Definition of the token unsigned_number:

    digit → 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
    unsigned_integer → digit* digit
    unsigned_number → unsigned_integer ( ( . unsigned_integer ) | ε )
                      ( ( e ( + | - | ε ) unsigned_integer ) | ε )

    • Recursion is not allowed!

    • Notice the use of parentheses to avoid ambiguity.
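Because no name is defined in terms of itself, a definition like this can be expanded into one regular expression by plain substitution. The sketch below assumes the definition is read as: an unsigned integer, then an optional fraction, then an optional exponent with an optional sign. The constant and function names are hypothetical.

```python
import re

# Each name is defined only in terms of names defined earlier (no recursion),
# so the definitions can be expanded by string substitution.
DIGIT = r"[0-9]"
# Non-capturing parentheses keep each sub-definition a single unit.
UNSIGNED_INTEGER = rf"(?:{DIGIT}*{DIGIT})"
# "(?:...)?" plays the role of "( ... | empty )" in the definition.
UNSIGNED_NUMBER = (
    rf"{UNSIGNED_INTEGER}"
    rf"(?:\.{UNSIGNED_INTEGER})?"
    rf"(?:e[+-]?{UNSIGNED_INTEGER})?"
)


def is_unsigned_number(text):
    """Return True if the whole of text matches the expanded definition."""
    return re.fullmatch(UNSIGNED_NUMBER, text) is not None


for sample in ["123", "3.14", "6e-3", "2.5e+10", ".5", "1.", "-3"]:
    print(sample, is_unsigned_number(sample))
```

Under this reading, `.5` and `1.` are rejected because both the integer part and the fraction digits are mandatory, and `-3` is rejected because the token is unsigned.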

    Input: x = x * (acc+123)

    TOKEN         VALUE
    identifier    x
    equal         =
    identifier    x
    star          *
    left-paren    (
    identifier    acc
    plus          +
    integer       123
    right-paren   )

    • Tokens are typically represented by numbers.
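Representing tokens by numbers could look like the following sketch. The particular numbering, the `Tok` enumeration, and the `tokenize` helper are all assumptions made for illustration; the slides do not specify any of them.

```python
import re
from enum import IntEnum


class Tok(IntEnum):
    """Hypothetical token numbering; any distinct integers would do."""
    IDENTIFIER = 1
    EQUAL = 2
    STAR = 3
    LEFT_PAREN = 4
    PLUS = 5
    INTEGER = 6
    RIGHT_PAREN = 7


SYMBOLS = {"=": Tok.EQUAL, "*": Tok.STAR, "(": Tok.LEFT_PAREN,
           "+": Tok.PLUS, ")": Tok.RIGHT_PAREN}


def tokenize(text):
    """Return (token number, value) pairs for a small expression language."""
    pairs = []
    # findall silently skips characters outside the three alternatives,
    # which is acceptable for a sketch but not for a real lexer.
    for lexeme in re.findall(r"[A-Za-z_]\w*|\d+|[=*()+]", text):
        if lexeme in SYMBOLS:
            tok = SYMBOLS[lexeme]
        elif lexeme.isdigit():
            tok = Tok.INTEGER
        else:
            tok = Tok.IDENTIFIER
        pairs.append((int(tok), lexeme))
    return pairs


print(tokenize("x = x * (acc+123)"))
```

Later phases can then compare small integers instead of strings, while the value keeps the original lexeme for tokens such as identifiers and integers.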

    • Lexeme: the character sequence matched by an instance of the token.

    Example: sqrt

    Consider this expression in the C programming language:

    sum=3+2;

    Tokenized in the following table:

    LEXEME   TOKEN TYPE
    sum      Identifier
    =        Assignment operator
    3        Integer
    +        Addition operator
    2        Integer
    ;        Semicolon

    Source code (character stream)
        ↓ Lexical analysis
    Token stream
        ↓ Parsing
    Abstract syntax tree
        ↓ Intermediate Code Generation
    Intermediate code
        ↓ Code Generation
    Assembly code

    CS331 Lexical Analysis

    Source code (character stream):  if (b == 0) a = hi;
        ↓ Lexical analysis
    Token stream:  if  (  b  ==  0  )  a  =  hi  ;
        ↓ Parsing
        ↓ Semantic Analysis

    • Lexical analysis converts a character stream to a token stream of pairs

    if (x1 * x2 < 1.0) {
        y = x1;
    }

    i f ( x 1 * x 2 < 1 . 0 ) { \n

    KEY:if LPAREN ID:x1 OP:* ID:x2 RELOP: