COMPILER CONSTRUCTION LAB Submitted By : Amit Garg

23123067 Compiler Lab Manual




Index

• Introduction

• Phases Of Compiler

Program-1: Design the Lexical Analyzer to split the file into tokens using a C compiler.

Program-2: Design the Lexical Analyzer to identify the keywords in the file using a C compiler.

Program-3: Count the number of While loops and For loops in a program using the Lexical Analyzer.

Program-4: Count the number of IF conditions in a program using the Lexical Analyzer.

Program-5: Count the number of variables present in a file, with their data types, using the Lexical Analyzer.


Introduction

What is a Compiler

A compiler is a program that reads a program written in one language (the source language) and translates it into an equivalent program in another language (the target language). As an important part of this translation process, the compiler reports to its user the presence of errors in the source program.

[Figure: Source Program -> Compiler -> Target Program; the compiler also reports Error Messages.]

The Analysis-Synthesis Model of Compilation

There are two parts to compilation: analysis and synthesis. The analysis part breaks up the source program into its constituent pieces and creates an intermediate representation of the source program. The synthesis part constructs the desired target program from the intermediate representation. Of the two parts, synthesis requires the most specialized techniques.

The Phases Of A Compiler

A compiler operates in six phases, each of which transforms the source program from one representation to another: lexical analysis, syntax analysis, semantic analysis, intermediate code generation, code optimization and code generation. The first three phases form the bulk of the analysis portion of a compiler. Two other activities, symbol table management and error handling, interact with all six phases.

Phases of Compiler

[Figure: Source program -> Lexical Analyzer -> Syntax Analyzer -> Semantic Analyzer -> Intermediate Code Generator -> Code Optimizer -> Code Generator -> Target program. The Symbol Table Manager and the Error Handler interact with every phase.]


Lexical analysis

In a compiler, lexical analysis is also called linear analysis or scanning. The stream of characters making up the source program is read from left to right and grouped into tokens, which are sequences of characters having a collective meaning. For example, the statement count = count + 1; is grouped into the tokens count, =, count, +, 1 and ;.
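The grouping of characters into tokens can be sketched as a small scanning function. This is only an illustration: the name next_token, the TokenKind classes and the choice of three token kinds are assumptions made for this sketch, not part of the manual's programs.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes used only in this sketch. */
typedef enum { TOK_END, TOK_IDENT, TOK_NUMBER, TOK_PUNCT } TokenKind;

/*
 * Reads one token starting at *src, copies its text into buf (capacity cap),
 * advances *src past the token and returns the token's kind.
 * Text that does not fit into buf is silently truncated.
 */
TokenKind next_token(const char **src, char *buf, size_t cap)
{
    const char *p = *src;
    size_t n = 0;
    TokenKind kind;

    assert(cap > 0);                              /* room for the '\0' */
    while (isspace((unsigned char)*p))            /* skip white space */
        p++;

    if (*p == '\0') {
        kind = TOK_END;
    } else if (isalpha((unsigned char)*p) || *p == '_') {
        kind = TOK_IDENT;                         /* letter or _ starts a name */
        while (isalnum((unsigned char)*p) || *p == '_') {
            if (n + 1 < cap) buf[n++] = *p;
            p++;
        }
    } else if (isdigit((unsigned char)*p)) {
        kind = TOK_NUMBER;                        /* run of decimal digits */
        while (isdigit((unsigned char)*p)) {
            if (n + 1 < cap) buf[n++] = *p;
            p++;
        }
    } else {
        kind = TOK_PUNCT;                         /* any other single character */
        if (n + 1 < cap) buf[n++] = *p;
        p++;
    }
    buf[n] = '\0';
    *src = p;
    return kind;
}
```

Calling next_token repeatedly on the same pointer walks through the input one token at a time until TOK_END is returned.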

Syntax analysis

It is also called hierarchical analysis or parsing. It involves grouping the tokens of the source program into grammatical phrases that are used by the compiler to synthesize output. Usually, the grammatical phrases of the source program are represented by a parse tree.
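As a rough illustration of hierarchical grouping, the sketch below parses arithmetic expressions by recursive descent. The grammar (expr, term, factor over digits, +, -, * and parentheses) and all function names are assumptions chosen for this example. Instead of building an explicit parse tree it evaluates the expression, so the nesting of the calls plays the role of the tree.

```c
#include <ctype.h>

static const char *cur;   /* next unread character of the input */
static int ok;            /* cleared when a syntax error is found */

static long parse_expr(void);

static void skip_spaces(void)
{
    while (*cur == ' ')
        cur++;
}

/* factor -> digits | '(' expr ')' */
static long parse_factor(void)
{
    long v = 0;
    skip_spaces();
    if (*cur == '(') {
        cur++;
        v = parse_expr();
        skip_spaces();
        if (*cur == ')') cur++; else ok = 0;      /* missing ')' */
    } else if (isdigit((unsigned char)*cur)) {
        while (isdigit((unsigned char)*cur))
            v = v * 10 + (*cur++ - '0');
    } else {
        ok = 0;                                   /* operand expected */
    }
    return v;
}

/* term -> factor { '*' factor } */
static long parse_term(void)
{
    long v = parse_factor();
    for (;;) {
        skip_spaces();
        if (*cur != '*')
            return v;
        cur++;
        v *= parse_factor();
    }
}

/* expr -> term { ('+' | '-') term } */
static long parse_expr(void)
{
    long v = parse_term();
    for (;;) {
        skip_spaces();
        if (*cur == '+')      { cur++; v += parse_term(); }
        else if (*cur == '-') { cur++; v -= parse_term(); }
        else return v;
    }
}

/* Returns 1 and stores the value if text is a well-formed expression. */
int parse(const char *text, long *value)
{
    cur = text;
    ok = 1;
    *value = parse_expr();
    skip_spaces();
    return ok && *cur == '\0';
}
```

Because a term is parsed inside an expr, the input 2 + 3 * 4 is grouped as 2 + (3 * 4), which is the same grouping a parse tree for this grammar would show.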

Semantic Analysis

The semantic analysis phase checks the source program for semantic errors and gathers type

information for the subsequent code generation phase. It uses the hierarchical structure

determined by the syntax-analysis phase to identify the operators and operands of expressions

and statements.

An important component of semantic analysis is type checking. Here the compiler checks

that each operator has operands that are permitted by the source language specification.
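A type checker for binary operators might look like the following sketch. The Type enumeration, the function name check_binary and the rules themselves are assumptions for illustration; they are loosely modelled on C, where % accepts only integer operands and char operands are promoted to int.

```c
/* Hypothetical set of types for this sketch. */
typedef enum { TYPE_INT, TYPE_FLOAT, TYPE_CHAR, TYPE_ERROR } Type;

/*
 * Returns the result type of "left op right", or TYPE_ERROR when the
 * operands are not permitted for the operator.
 */
Type check_binary(char op, Type left, Type right)
{
    if (left == TYPE_ERROR || right == TYPE_ERROR)
        return TYPE_ERROR;                 /* propagate earlier errors */

    switch (op) {
    case '+': case '-': case '*': case '/':
        /* Arithmetic: the result is float if either operand is float. */
        if (left == TYPE_FLOAT || right == TYPE_FLOAT)
            return TYPE_FLOAT;
        return TYPE_INT;                   /* char operands promote to int */
    case '%':
        /* Remainder is only permitted on integer operands. */
        if (left == TYPE_FLOAT || right == TYPE_FLOAT)
            return TYPE_ERROR;
        return TYPE_INT;
    default:
        return TYPE_ERROR;                 /* operator unknown to this sketch */
    }
}
```

The semantic analyzer would call such a function at every operator node of the parse tree, using the types already computed for the node's children.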

Symbol table management

A symbol table is a data structure containing a record for each identifier, with fields for the attributes of the identifier. The data structure allows us to find the record for each identifier quickly and to store or retrieve data from that record quickly. When the lexical analyzer detects an identifier in the source program, the identifier is entered into the symbol table. The remaining phases enter information about identifiers into the symbol table.
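A minimal symbol table can be sketched as a fixed-size array of records. The names SymbolTable, symtab_insert, symtab_lookup and symtab_set_type, the capacity limits and the single type attribute are assumptions for this example. The linear search is used only to keep the sketch short; a production compiler would normally use a hash table to obtain the fast lookup described above.

```c
#include <string.h>

#define MAX_SYMBOLS 64
#define MAX_NAME    32

typedef struct {
    char name[MAX_NAME];   /* identifier as found by the lexical analyzer */
    char type[MAX_NAME];   /* attribute filled in by a later phase */
} Symbol;

typedef struct {
    Symbol entries[MAX_SYMBOLS];
    int count;             /* number of records in use; start at 0 */
} SymbolTable;

/* Returns the index of name, or -1 if it is not in the table. */
int symtab_lookup(const SymbolTable *t, const char *name)
{
    int i;
    for (i = 0; i < t->count; i++)
        if (strcmp(t->entries[i].name, name) == 0)
            return i;
    return -1;
}

/* Enters name if it is new; returns its index, or -1 if it cannot be stored. */
int symtab_insert(SymbolTable *t, const char *name)
{
    int i = symtab_lookup(t, name);
    if (i >= 0)
        return i;                                  /* already present */
    if (t->count == MAX_SYMBOLS || strlen(name) >= MAX_NAME)
        return -1;                                 /* table full or name too long */
    i = t->count++;
    strcpy(t->entries[i].name, name);
    t->entries[i].type[0] = '\0';                  /* type not known yet */
    return i;
}

/* Records the type of an existing identifier; returns 1 on success. */
int symtab_set_type(SymbolTable *t, const char *name, const char *type)
{
    int i = symtab_lookup(t, name);
    if (i < 0 || strlen(type) >= MAX_NAME)
        return 0;
    strcpy(t->entries[i].type, type);
    return 1;
}
```

In this sketch the lexical analyzer would call symtab_insert for each identifier it finds, and a later phase would call symtab_set_type once the declaration has been analysed.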


Error detection

Each phase can encounter errors. The syntax and semantic analysis phases usually handle a large fraction of the errors detectable by the compiler. The lexical phase can detect errors where the characters remaining in the input do not form any token of the language. Errors where the token stream violates the structure rules of the language are detected by the syntax analysis phase. During semantic analysis the compiler tries to detect constructs that have the right syntactic structure but no meaning for the operation involved.

Intermediate code generation

After syntax and semantic analysis, some compilers generate an explicit intermediate representation of the source program. This intermediate representation should have two important properties: it should be easy to produce and easy to translate into the target program.
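One common intermediate representation is three-address code, in which every instruction has at most one operator. The manual does not name a particular representation, so the choice of three-address code, the function name emit_tac and the postfix input format are all assumptions made for this sketch.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define MAX_STACK 32

/*
 * Translates a postfix expression over single-letter operands, such as
 * "ab+c*", into three-address code written to out (capacity cap), one
 * instruction per line. Returns the number of instructions emitted,
 * or -1 if the input is malformed or a buffer would overflow.
 */
int emit_tac(const char *postfix, char *out, size_t cap)
{
    char stack[MAX_STACK][16];     /* names of pending operands */
    int top = 0, temps = 0;
    size_t used = 0;

    if (cap == 0)
        return -1;
    out[0] = '\0';

    for (; *postfix != '\0'; postfix++) {
        char ch = *postfix;
        if (isalpha((unsigned char)ch)) {
            if (top == MAX_STACK)
                return -1;
            stack[top][0] = ch;                    /* push the operand name */
            stack[top][1] = '\0';
            top++;
        } else if (strchr("+-*/", ch) != NULL) {
            char line[64];
            int len;
            if (top < 2)
                return -1;                         /* operator lacks operands */
            len = snprintf(line, sizeof line, "t%d = %s %c %s\n",
                           temps + 1, stack[top - 2], ch, stack[top - 1]);
            if (len < 0 || used + (size_t)len >= cap)
                return -1;
            memcpy(out + used, line, (size_t)len + 1);
            used += (size_t)len;
            temps++;
            top--;                                 /* pop two, push the temporary */
            snprintf(stack[top - 1], sizeof stack[top - 1], "t%d", temps);
        } else {
            return -1;                             /* unexpected character */
        }
    }
    return top == 1 ? temps : -1;
}
```

For the postfix input ab+c* the sketch emits t1 = a + b followed by t2 = t1 * c. Each line is easy to produce from the expression and maps almost directly onto a machine instruction, which are the two properties asked of an intermediate representation.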

Code optimization

The code optimization phase attempts to improve the intermediate code so that faster-running machine code will result. There are simple optimizations that significantly improve the running time of the target program without slowing down compilation too much.
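Constant folding is one example of such a simple optimization: an operation whose operands are both known at compile time is replaced by its result. The manual does not list specific optimizations, so the choice of constant folding, the Instr record and the function name fold_constants are assumptions made for this sketch.

```c
/* Hypothetical intermediate instruction: result = lhs op rhs. */
typedef struct {
    char op;             /* '+', '-' or '*' */
    int  lhs_is_const;   /* nonzero when lhs holds a literal value */
    int  rhs_is_const;   /* nonzero when rhs holds a literal value */
    int  lhs;
    int  rhs;
} Instr;

/*
 * If both operands are literals, computes the value at compile time,
 * stores it in *result and returns 1. Otherwise returns 0 and leaves
 * *result unchanged. Arithmetic overflow is not checked in this sketch.
 */
int fold_constants(const Instr *in, int *result)
{
    if (!in->lhs_is_const || !in->rhs_is_const)
        return 0;                      /* value only known at run time */

    switch (in->op) {
    case '+': *result = in->lhs + in->rhs; return 1;
    case '-': *result = in->lhs - in->rhs; return 1;
    case '*': *result = in->lhs * in->rhs; return 1;
    default:  return 0;                /* operator not handled here */
    }
}
```

An optimizer using this sketch would replace an instruction such as t1 = 6 * 7 by the constant 42, so that no multiplication is executed when the target program runs.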

Code generation

The final phase of compilation is the generation of target code, consisting normally of

relocatable machine code or assembly code.


Program-1

Objective: Design the Lexical Analyzer to split the file into tokens using a C compiler.

Source Code of the given Objective:

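#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 512   /* capacity of the token list */
#define MAX_LEN    33    /* longest token, including the terminating '\0' */

int main(void)
{
    int i = 0, k = 0;
    int c;               /* int, so that EOF can be told apart from a character */
    char Token[MAX_LEN], TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp = fopen("b.b", "r");
    if (fp == NULL)
    {
        printf("Cannot open input file b.b\n");
        return 1;
    }

    do
    {
        c = getc(fp);
        if (c != EOF && (isalnum(c) || c == '_'))
        {
            /* Letter, digit or underscore: extend the current word. */
            if (i < MAX_LEN - 1)
                Token[i++] = (char)c;
        }
        else
        {
            /* A delimiter or the end of the file ends the current word. */
            if (i > 0 && k < MAX_TOKENS)
            {
                Token[i] = '\0';
                strcpy(TokenList[k], Token);
                k = k + 1;
            }
            i = 0;
            /* Every delimiter other than white space is a token of its own. */
            if (c != EOF && !isspace(c) && k < MAX_TOKENS)
            {
                TokenList[k][0] = (char)c;
                TokenList[k][1] = '\0';
                k = k + 1;
            }
        }
    } while (c != EOF);
    fclose(fp);

    printf("TOKENS ARE=-\n");
    printf("----------\n");
    for (i = 0; i < k; i++)
    {
        printf("%s\n", TokenList[i]);
    }
    printf("\n Total Tokens in the File = %d\n", k);
    return 0;
}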


Program-2

Objective: Design the Lexical Analyzer to identify the keywords in the file using a C compiler.

Source Code of the given Objective:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS   512  /* capacity of the token list */
#define MAX_LEN      33   /* longest token, including the terminating '\0' */
#define NUM_KEYWORDS 5

int main(void)
{
    int i = 0, j, k = 0, st, k1 = 0;
    int c;                /* int, so that EOF can be told apart from a character */
    char Token[MAX_LEN], TokenList[MAX_TOKENS][MAX_LEN];
    /* printf and scanf are library functions rather than C keywords;
       they are kept because the exercise looks for them. */
    char KeyWords[NUM_KEYWORDS][10] = {"int", "float", "char", "printf", "scanf"};
    char temp[NUM_KEYWORDS][10];   /* distinct keywords found so far */
    FILE *fp;

    fp = fopen("c.c", "r");
    if (fp == NULL)
    {
        printf("Cannot open input file c.c\n");
        return 1;
    }

    /* Split the file into tokens, exactly as in Program-1. */
    do
    {
        c = getc(fp);
        if (c != EOF && (isalnum(c) || c == '_'))
        {
            if (i < MAX_LEN - 1)
                Token[i++] = (char)c;
        }
        else
        {
            if (i > 0 && k < MAX_TOKENS)
            {
                Token[i] = '\0';
                strcpy(TokenList[k], Token);
                k = k + 1;
            }
            i = 0;
            if (c != EOF && !isspace(c) && k < MAX_TOKENS)
            {
                TokenList[k][0] = (char)c;
                TokenList[k][1] = '\0';
                k = k + 1;
            }
        }
    } while (c != EOF);
    fclose(fp);

    /* Compare every token with the keyword list. */
    for (st = 0; st < k; st++)
    {
        for (i = 0; i < NUM_KEYWORDS; i++)
        {
            if (strcmp(TokenList[st], KeyWords[i]) == 0)
            {
                /* Record each keyword only once. */
                for (j = 0; j < k1; j++)
                {
                    if (strcmp(TokenList[st], temp[j]) == 0) break;
                }
                if (j == k1)
                {
                    strcpy(temp[k1], TokenList[st]);
                    k1 = k1 + 1;
                }
                break;
            }
        }
    }

    printf("Keywords Used In The Program are =-\n");
    printf("--------------------------------\n");
    for (j = 0; j < k1; j++)
        printf("\n%s", temp[j]);
    printf("\nTotal Keywords In The Program = %d\n", k1);
    return 0;
}


Program-3

Objective: Count the number of While loops and the number of For loops in a program using the Lexical Analyzer.

Source Code of the given Objective:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 512   /* capacity of the token list */
#define MAX_LEN    33    /* longest token, including the terminating '\0' */

int main(void)
{
    int i = 0, k = 0, st;
    int forCount = 0, whileCount = 0;
    int c;               /* int, so that EOF can be told apart from a character */
    char Token[MAX_LEN], TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp = fopen("c.c", "r");
    if (fp == NULL)
    {
        printf("Cannot open input file c.c\n");
        return 1;
    }

    /* Split the file into tokens, exactly as in Program-1. */
    do
    {
        c = getc(fp);
        if (c != EOF && (isalnum(c) || c == '_'))
        {
            if (i < MAX_LEN - 1)
                Token[i++] = (char)c;
        }
        else
        {
            if (i > 0 && k < MAX_TOKENS)
            {
                Token[i] = '\0';
                strcpy(TokenList[k], Token);
                k = k + 1;
            }
            i = 0;
            if (c != EOF && !isspace(c) && k < MAX_TOKENS)
            {
                TokenList[k][0] = (char)c;
                TokenList[k][1] = '\0';
                k = k + 1;
            }
        }
    } while (c != EOF);
    fclose(fp);

    /* Count the tokens that spell "for" and "while".
       The while that closes a do-while loop is counted as well. */
    for (st = 0; st < k; st++)
    {
        if (strcmp(TokenList[st], "for") == 0)
            forCount = forCount + 1;
        else if (strcmp(TokenList[st], "while") == 0)
            whileCount = whileCount + 1;
    }

    printf("Total No. of For Loop In The Program = %d\n", forCount);
    printf("Total No. of While Loop In The Program = %d\n", whileCount);
    return 0;
}


Program-4

Objective: Count the number of IF conditions in a program using the Lexical Analyzer.

Source Code of the given Objective:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS 512   /* capacity of the token list */
#define MAX_LEN    33    /* longest token, including the terminating '\0' */

int main(void)
{
    int i = 0, k = 0, st, count = 0;
    int c;               /* int, so that EOF can be told apart from a character */
    char Token[MAX_LEN], TokenList[MAX_TOKENS][MAX_LEN];
    FILE *fp;

    fp = fopen("a.a", "r");
    if (fp == NULL)
    {
        printf("Cannot open input file a.a\n");
        return 1;
    }

    /* Split the file into tokens, exactly as in Program-1. */
    do
    {
        c = getc(fp);
        if (c != EOF && (isalnum(c) || c == '_'))
        {
            if (i < MAX_LEN - 1)
                Token[i++] = (char)c;
        }
        else
        {
            if (i > 0 && k < MAX_TOKENS)
            {
                Token[i] = '\0';
                strcpy(TokenList[k], Token);
                k = k + 1;
            }
            i = 0;
            if (c != EOF && !isspace(c) && k < MAX_TOKENS)
            {
                TokenList[k][0] = (char)c;
                TokenList[k][1] = '\0';
                k = k + 1;
            }
        }
    } while (c != EOF);
    fclose(fp);

    /* Count the tokens that spell "if". */
    for (st = 0; st < k; st++)
    {
        if (strcmp(TokenList[st], "if") == 0)
            count = count + 1;
    }

    printf("Total No. of IF Condition In The Program = %d\n", count);
    return 0;
}


Program-5

Objective: Count the number of variables present in a file, with their data types, using the Lexical Analyzer.

Source Code of the given Objective:

#include <stdio.h>
#include <string.h>
#include <ctype.h>

#define MAX_TOKENS   512  /* capacity of the token list */
#define MAX_LEN      33   /* longest token, including the terminating '\0' */
#define NUM_KEYWORDS 5

void dataType(void);

int st;                   /* index of the token being examined */
int k = 0;                /* number of tokens read from the file */
char TokenList[MAX_TOKENS][MAX_LEN];

int main(void)
{
    int i = 0, callmodule;
    int c;                /* int, so that EOF can be told apart from a character */
    char Token[MAX_LEN];
    char KeyWords[NUM_KEYWORDS][10] = {"int", "float", "char", "printf", "scanf"};
    FILE *fp;

    fp = fopen("c.c", "r");
    if (fp == NULL)
    {
        printf("Cannot open input file c.c\n");
        return 1;
    }

    /* Split the file into tokens, exactly as in Program-1. */
    do
    {
        c = getc(fp);
        if (c != EOF && (isalnum(c) || c == '_'))
        {
            if (i < MAX_LEN - 1)
                Token[i++] = (char)c;
        }
        else
        {
            if (i > 0 && k < MAX_TOKENS)
            {
                Token[i] = '\0';
                strcpy(TokenList[k], Token);
                k = k + 1;
            }
            i = 0;
            if (c != EOF && !isspace(c) && k < MAX_TOKENS)
            {
                TokenList[k][0] = (char)c;
                TokenList[k][1] = '\0';
                k = k + 1;
            }
        }
    } while (c != EOF);
    fclose(fp);

    printf(" VARIABLES USED IN PROGRAM ARE =-\n");
    printf(" ----------------------------- \n");

    for (st = 0; st < k; st++)
    {
        /* Find out whether the token is a keyword, and which one. */
        callmodule = -1;
        for (i = 0; i < NUM_KEYWORDS; i++)
        {
            if (strcmp(TokenList[st], KeyWords[i]) == 0)
            {
                callmodule = i;
                break;
            }
        }
        /* Only the three type keywords introduce a declaration. */
        switch (callmodule)
        {
        case 0:
            dataType();
            printf(" Are Integer Type Variables.");
            break;
        case 1:
            dataType();
            printf(" Are Float Type Variables.");
            break;
        case 2:
            dataType();
            printf(" Are Character Type Variables.");
            break;
        }
    }
    printf("\n\n");
    return 0;
}

/* Prints the identifiers between the current type keyword and the next ';'.
   A function header such as "int main()" is treated as a declaration too,
   because this purely lexical approach cannot tell the two apart. */
void dataType(void)
{
    st = st + 1;
    printf("\n");
    while (st < k && strcmp(TokenList[st], ";") != 0)
    {
        /* Names start with a letter or underscore; numbers are skipped. */
        if (isalpha((unsigned char)TokenList[st][0]) || TokenList[st][0] == '_')
        {
            printf("%s ", TokenList[st]);
        }
        st++;
    }
    st = st - 1;
}