8/10/2019 AUTOMATEDSCORINGSYSTEMFORESSAYS.docx (1)
AUTOMATED SCORING SYSTEM FOR ESSAYS

By

ARUNA P 2009103010
DHIVYA PRIYA R 2009103528
DIVYA HARSHINI R 2009103530
A project report submitted to the
FACULTY OF INFORMATION AND
COMMUNICATION ENGINEERING
in partial fulfillment of the requirements
for the award of the degree of
BACHELOR OF ENGINEERING
in
COMPUTER SCIENCE AND ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
ANNA UNIVERSITY
CHENNAI - 600025
April 2012
CERTIFICATE
Certified that this project report titled "Automated Essay Scoring System" is the bonafide work of Aruna P (2009103010), Dhivya Priya R (2009103528) and Divya Harshini R (2009103530), who carried out the project work under my supervision, for the partial fulfillment of the requirements for the award of the degree of Bachelor of Engineering in Computer Science and Engineering. Certified further that, to the best of my knowledge, the work reported herein does not form part of any other thesis or dissertation on the basis of which a degree or an award was conferred on an earlier occasion on this or any other candidate.
Place: Chennai                                Prof. Dr. K. S. Easwarakumar
Date:                                         Professor and Head,
                                              Department of Computer Science and Engineering,
                                              Anna University, Chennai - 600025
COUNTERSIGNED
Head of the Department
Department of Computer Science and Engineering
Anna University
Chennai - 600025
ACKNOWLEDGEMENTS
We express our deep gratitude to our guide, Prof. Dr. K. S. Easwarakumar, for guiding us through every phase of the project. We appreciate his thoroughness, tolerance and ability to share his knowledge with us. We thank him for being easily approachable and quite thoughtful. Apart from adding his own input, he has encouraged us to think on our own and give form to our thoughts. We owe him for harnessing our potential and bringing out the best in us. Without his immense support through every step of the way, we could never have done it to this extent.
We are extremely grateful to Prof. Dr. K. S. Easwarakumar, Professor of Computer Science and Engineering, Anna University, Chennai - 600025, for extending the facilities of the Department towards our project and for his unstinting support.
We express our thanks to the panel of reviewers, Dr. Arul Siromoney, Dr. Madhan Karky, Dr. A. P. Shanthi and Ms. Suganya, for their valuable suggestions and critical reviews throughout the course of our project.
We thank our parents, family, and friends for bearing with us throughout the course of our
project and for the opportunity they provided us in undergoing this course in such a prestigious
institution.
Aruna P Dhivya Priya R Divya Harshini R
ABSTRACT
The objective of an automated essay scoring system is to assign scores to essays written in an educational setting. It is a method of educational assessment and an application of Natural Language Processing, using a word-based document vector construction method and adopting Content Vector Analysis (CVA). CVA can be used in this case because the distribution of words in corpus datasets can be expected to be random in nature. Our system uses a model-based approach in order to overcome the large space, storage and training requirements of memory-based methods. Here, we calculate the deviation of the student's essay with respect to the ideally scored essays.

The system evaluates the essays on established rubrics, viz. surface features, spelling errors, grammar mistakes and correlation with the topic. The individual raw scores so determined are then assigned weights depending on their salience. The weighted scores are subjected to regression techniques, using which the final score is calculated. In this manner, we ensure that the essays are graded uniformly, with equity and less fatigue.
Contents

Certificate
Acknowledgements
Abstract (English)
Abstract (Tamil)
List of Figures
List of Tables

1 INTRODUCTION
1.1 Basic Cryptography
1.1.1 Symmetric Key Cryptography
1.1.2 Public Key Cryptography
1.1.2.1 Encryption
1.2 Other Flavours of Cryptography
1.2.1 Identity-Based Cryptography
1.3 Provable Security

2 PRELIMINARIES
2.1 Definitions
2.1.1 Bilinear Pairing
2.1.2 Hardness Assumptions
2.1.2.1 Discrete Logarithm Problem
2.1.2.2 Computational Diffie-Hellman Problem
2.1.2.3 Decisional Diffie-Hellman Problem
2.1.2.4 Bilinear Diffie-Hellman Problem
2.1.2.5 Decisional Bilinear Diffie-Hellman Problem

3 RELATED WORK
3.1 Encryption
3.1.1 Formal Model of Encryption
3.1.2 Security of Encryption Schemes
3.1.2.1 IND-CPA Game
3.1.2.2 IND-CCA Game
3.1.2.3 IND-CCA2 Game
3.2 Identity-Based Encryption
3.2.1 Boneh-Franklin IBE
3.2.2 Other ID-Based Schemes
3.3 Proxy Re-Encryption
3.4 Digital Signatures

4 REQUIREMENT ANALYSIS
4.1 Product Perspective
4.2 Product Functionality
4.3 User Characteristics
4.4 Class Diagram
4.5 Sequence Diagram

5 SYSTEM DESIGN
5.1 System Architecture
5.2 Encryption Scheme - A Variant of the Twin Boneh-Franklin Scheme
5.3 Signature Algorithm - Hess Signature Scheme
5.4 Proxy Re-encryption Scheme

6 SYSTEM DEVELOPMENT
6.1 Scheme Implementation
6.1.1 Tools Used for Implementation
6.2 Testing

7 RESULTS AND DISCUSSIONS
7.1 Screenshots

8 CONCLUSIONS
8.1 Contributions

References
List of Figures

4.1 Use Case Diagram
4.2 Class Diagram
4.3 Registration Process
4.4 Encrypting Questionnaire
4.5 Conducting Exam
4.6 Encrypting Answerscript
4.7 Re-encryption and Evaluation Phase
5.1 System Architecture
6.1 Accessing a Different Center's Question
6.2 Tampering Marks
6.3 Unsuccessful Decryption
7.1 User Interface
7.2 Login Page
7.3 Student Interface
7.4 Questions Display
7.5 Answer Script with Dummy ID
7.6 Faculty Interface
7.7 View Result
List of Tables

3.1 Encryption Schemes
3.2 Signature Schemes
6.1 Test Cases
CHAPTER 1
INTRODUCTION
Writing tests are increasingly being included in large-scale assessment programs and high-stakes decisions. However, Automated Essay Scoring (AES) systems, developed to overcome issues of marker inconsistency, volume, speed, cost and so on, also raise issues of score validity. In order to fill a crucial gap identified in the current approaches used to evaluate AES systems, we propose a framework, drawing upon current validation theory, for assessing the validity of scores produced by AES systems in a systematic and comprehensive manner.
MOTIVATION:

With the advent of online examinations like the GRE, GMAT and CET4, there has been an increasing call for automation in the scoring process. Scoring of objective questions is comparatively simple and has existed for years, but essay evaluation has, in most cases, been carried out only manually. This is because of the high complexity involved in programming a system that performs as well as a human in its cognition. With the evolution of advanced text database practices and Natural Language Processing (NLP) techniques, this has of late become possible.
Any automated essay grading system should offer several salient features, most
importantly, the following:
1. Speed: Score generated in a matter of seconds as against the time-consuming
manual correction.
2. Ease/Less fatigue: Process made easy by automation as against the laborious
manual task.
3. Equity: No place for any unjust favoring or unfair preference; all scores are generated without bias.
4. Uniformity: Overcomes the problem of the different mindsets or attitudes of different evaluators; ensures all essays are graded with a uniform outlook.
SCOPE:
As automation has become the order of the day, with most jobs being automated, evaluating essays has long remained an open issue. Systems for evaluating essays automatically have been developed since the late 1960s. To address the shortcomings of the earlier systems, we adopt a new approach and propose a versatile system. All phases of the system, namely content discovery, analysis and grading of content, operate in an unsupervised fashion with no need for manual assessment.
LITERATURE REVIEW:

Lin Bin, Lu Jun, Yao Jian-Min and Zhu Qiao-Ming [1], in AUTOMATED ESSAY SCORING USING KNN ALGORITHM, propose a methodology that transforms essays into vectors: the training set of essays is converted into vectors of word frequencies, which are then transformed into word weights, and these weight vectors occupy the training space. To score a test essay, it is likewise converted into a weight vector, and a search is conducted for the training vectors most similar to it, as measured by the cosine between the test and training vectors. The closest matches among the training set are used to assign a score to the test essay, along the lines of Burstein.
Feature Selection for KNN:

After eliminating the stop words, the features of the essays, viz. words, phrases and arguments, are chosen. The value of each vector element is expressed by the term frequency-inverse document frequency (TF-IDF) weight. Similarity between essays is calculated with the cosine measure in the KNN algorithm. Term frequency (TF) is used to select features by predetermined thresholds. To find the highest-information features, we calculate the information gain for each word. Information gain (IG) for classification is a measure of how common a feature is in a particular class compared to how common it is in all other classes.

K-Nearest Neighbor Algorithm for Text Categorization:

We determine the k most similar features as nearest neighbors to a given feature, and assign individual scores according to the distances of the neighbors, calculated with suitable measures such as the Euclidean distance or the cosine relation. The final score is the weighted sum.
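As a concrete illustration of the KNN scheme described above, the sketch below builds TF-IDF weight vectors, ranks training essays by cosine similarity to the test essay, and returns a similarity-weighted score. The tokenization and exact weighting formula are our assumptions, not details taken from [1].

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """TF-IDF weight vector (a dict) for each tokenized document."""
    n = len(docs)
    df = Counter()
    for doc in docs:
        df.update(set(doc))                      # document frequency per word
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({w: (f / len(doc)) * math.log(n / df[w])
                        for w, f in tf.items()})
    return vectors

def cosine(u, v):
    dot = sum(wt * v.get(w, 0.0) for w, wt in u.items())
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_score(test_doc, train_docs, train_scores, k=3):
    """Score the test essay as the similarity-weighted mean of the
    scores of its k most similar training essays."""
    vecs = tfidf_vectors(train_docs + [test_doc])
    train_vecs, test_vec = vecs[:-1], vecs[-1]
    nearest = sorted(((cosine(test_vec, tv), s)
                      for tv, s in zip(train_vecs, train_scores)),
                     reverse=True)[:k]
    total = sum(sim for sim, _ in nearest)
    return sum(sim * s for sim, s in nearest) / total if total else 0.0
```

The memory cost of keeping every training vector around at scoring time is exactly the limitation discussed next.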
The major issues arise because of the following limitations of the KNN algorithm:

1. Memory-based: large space is required to store the entire dataset.
2. Unreliable neighborhood (lack of overlapping results): since the dataset usually gives a sparse matrix, there are few overlapping values, but similarity measures require high overlap for reliability.
3. Unsuitable for corpus datasets, due to sparseness: by the above argument, since corpus datasets are usually sparse, KNN is less suitable for them.
In [2], AUTOMATED ESSAY SCORING SYSTEM FOR CET4, Yali Li and Yonghong Yan present a methodology involving the following score-determining components. The surface features comprise the number of characters in the document (Chars), the number of words in the document (Words), the number of different words (Diffwds), the fourth root of the number of words in the document, as suggested by Page (Rootwds), the number of sentences in the document (Sents), the average word length (Wordlen = Chars/Words), the average sentence length (Sentlen = Words/Sents), and the number of words longer than five characters (BW5). Grammar checking uses ALEK (Assessment of Lexical Knowledge), a tool in which bigrams and trigrams of part-of-speech tag sequences are used. For sentence error detection, part-of-speech tag analysis is used. To determine the relation to the topic, two approaches are used, viz. simple comparison of keywords and Content Vector Analysis. The final score is computed by linear regression as the linear weighted sum of the several components.
The major limitations are due to linear regression and are as follows:

Incomplete description of the relationships among variables:
1. Extremes are ignored.
2. Only the mean is considered.

Sensitive to outliers.

The precision attained by this methodology is 70.125%.
In [3], AUTOMATED ESSAY SCORING USING GENERALIZED LATENT SEMANTIC ANALYSIS by Md. Monjurul Islam and A. S. M. Latiful Hoque, information retrieval by Latent Semantic Analysis using Singular Value Decomposition is achieved by the process depicted in the block diagram. It uses an n-gram-by-document matrix.
The performance issues that occur due to SVD are:

1. Very high algorithmic complexity: O(n^2 k^3).
2. A normal distribution of terms is required: words must be normally distributed across the documents, but in corpus datasets the distribution is sparse.
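For reference, the truncated-SVD step that LSA/GLSA relies on can be sketched with NumPy on a toy term-by-document matrix; the counts below are illustrative, not taken from [3].

```python
import numpy as np

# Toy term-by-document matrix (rows = terms, columns = documents).
A = np.array([[2.0, 0.0, 1.0],
              [1.0, 1.0, 0.0],
              [0.0, 2.0, 1.0]])

U, s, Vt = np.linalg.svd(A, full_matrices=False)

k = 2                                         # keep the k largest singular values
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]   # best rank-k approximation of A

# Documents are then compared in the reduced k-dimensional "semantic" space:
doc_coords = np.diag(s[:k]) @ Vt[:k, :]       # one column per document
```

The O(n^2 k^3) cost cited above is incurred in computing U, s and Vt for large matrices.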
Yali Li and Yonghong Yan's hypothesis in [4], AN EFFECTIVE AUTOMATED ESSAY SCORING SYSTEM USING SUPPORT VECTOR REGRESSION, follows dataset construction using character n-grams over words. The key idea is to use Content Vector Analysis (CVA) rather than Latent Semantic Analysis (LSA). It uses a Support Vector Machine (SVM), which is model-based and popular in text classification problems, where very high-dimensional spaces are the norm. Support Vector Regression is used for the final score calculation. Evaluation of rhetorical arguments is also possible by treating each argument as a mini-document. The process involves vector construction for each document by extraction of words and subsequent morphological analysis, followed by frequency vector construction, and finally weight assignment based on salience (relative frequency and inverse relative frequency). In CVA, the cosine relation between the test vector and each document/class vector is computed, and the class with the highest correlation is selected.
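The CVA classification step just described, cosine correlation between the test vector and per-class vectors, can be sketched as follows. Pooling raw word frequencies into one class vector per score level is our simplifying assumption.

```python
import math
from collections import Counter

def cos_sim(a, b):
    dot = sum(wt * b.get(w, 0) for w, wt in a.items())
    na = math.sqrt(sum(x * x for x in a.values()))
    nb = math.sqrt(sum(x * x for x in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cva_classify(test_tokens, class_essays):
    """Return the score class whose pooled word vector best matches the
    test essay. class_essays maps a score label to the tokenized essays
    graded at that level."""
    test_vec = Counter(test_tokens)
    best_label, best_sim = None, -1.0
    for label, essays in class_essays.items():
        class_vec = Counter()
        for essay in essays:
            class_vec.update(essay)        # pooled class frequency vector
        sim = cos_sim(test_vec, class_vec)
        if sim > best_sim:
            best_label, best_sim = label, sim
    return best_label, best_sim
```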
PROBLEM DEFINITION:

From the survey of the related literature, it is apparent that the development of a simple system that grades user essays with an accuracy similar to manual correction remains a great challenge in the arena of educational data mining. Manual evaluation has its own drawbacks: it is time-consuming and requires the arduous task of reading and evaluating when the corpus is very large. There is also the possibility of unduly favoring a preferred candidate. When essays are graded by different evaluators, the problem of differing mindsets is likely to arise, so essays are not graded with a uniform outlook. The existing automated systems are more complex and require a large amount of training before deployment. Hence, evaluation systems must be improved to support automated grading in a faster, simpler, more scalable and more equitable manner.
CONTRIBUTIONS:

Using a model-based approach:

Our system uses a model-based approach because the memory-based approach requires a large training dataset. The model-based approach is also popular in text classification problems, where very high-dimensional spaces are the norm. In comparison to the memory-based approach, with its huge space, storage and training requirements, our model-based approach of calculating the deviation of the examined essay from the ideally scored essays is preferable.
Salience-based correlation method:

To assess the consistency of essays with the topic, we use Content Vector Analysis (CVA) in preference to Latent Semantic Analysis (LSA). This is because LSA involves the higher algorithmic complexity of O(n^2 k^3) for SVD, and words are required to exhibit a normal distribution for good performance. CVA can be used in this case, as the distribution of words in corpus datasets can be expected to be random in nature. Salience is determined by the relative frequency of a word in the document and its inverse relative frequency over the other documents. For example, the word 'the' may appear very frequently in a given document, but its salience is very low because it appears in all the documents. If the word 'metamorphosis' occurs even a few times, it will have a high salience, because relatively few documents contain this word.
Using Ridge Regression for final score consolidation:

A complete relationship among the different variables (here, the individual scores resulting from the various predefined rubrics) is established by means of ridge regression. This method ensures that the mean is taken into consideration along with the extremes. Ridge regression is L2-regularized linear regression, in which the final prediction is the result of a wider variety of inputs, as against a single input in the case of plain linear regression. This tends to make the system more robust for generalization.
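A minimal sketch of this consolidation step uses the closed-form ridge solution w = (X'X + lambda*I)^(-1) X'y. The component scores, target scores and regularization strength below are hypothetical, chosen only to illustrate the mechanics.

```python
import numpy as np

def ridge_fit(X, y, lam=1.0):
    """Closed-form ridge regression: w = (X'X + lam*I)^(-1) X'y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(n_features), X.T @ y)

# Hypothetical per-essay component scores
# (columns: surface features, spelling, grammar, content correlation)
# with human-assigned final scores as the regression target.
X = np.array([[0.9, 0.8, 0.7, 0.9],
              [0.5, 0.6, 0.4, 0.5],
              [0.2, 0.3, 0.3, 0.1],
              [0.7, 0.7, 0.6, 0.8]])
y = np.array([9.0, 5.0, 2.0, 7.5])

w = ridge_fit(X, y, lam=0.1)                        # learned component weights
final_score = np.array([0.8, 0.7, 0.6, 0.7]) @ w    # score a new essay
```

Because the L2 penalty shrinks all four weights together, no single rubric can dominate the final score, which is the robustness property claimed above.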
ORGANIZATION OF THIS THESIS

The remainder of this thesis is structured as follows. Chapter 2 gives a requirement analysis of the proposed system, covering all functional and non-functional requirements, after which the system use cases are presented. Chapter 3 gives an overview of the system design and its architecture. In Chapter 4, the algorithms and techniques employed are discussed. Chapter 5 discusses the performance evaluation of the proposed system in comparison with baseline methods. The thesis ends by summarizing the conclusions obtained, along with pointers to future research, in Chapter 6. The references for this research are presented in Chapter 7, followed by the snapshots in Appendix A.
REQUIREMENT ANALYSIS:

FUNCTIONAL REQUIREMENTS:
The proposed system is designed to have the following features. It evaluates the essay on four dimensions, viz. grammar, spelling, correlation with the topic and surface features.

For grammar checking, we use jlinkgrammar, a grammatical system that classifies natural languages by designating links between sequences of words. Instead of using rule-based part-of-speech tags to parse sentences, it uses links to create a syntactic structure for the language.

Spelling mistakes are identified, and thereby automatically corrected, using Peter Norvig's spell correction method, which uses probabilistic and Bayesian reasoning in its implementation.
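Norvig's corrector is compact enough to sketch. The toy word-frequency model below is a stand-in for the large text corpus his published version trains on; everything else follows his candidate-generation scheme.

```python
from collections import Counter

# Toy language model: in Norvig's original, WORDS is built from a large corpus.
WORDS = Counter("the essay scoring system checks the spelling of the essay".split())

def edits1(word):
    """All strings one delete/transpose/replace/insert away from `word`."""
    letters = "abcdefghijklmnopqrstuvwxyz"
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [L + R[1:] for L, R in splits if R]
    transposes = [L + R[1] + R[0] + R[2:] for L, R in splits if len(R) > 1]
    replaces = [L + c + R[1:] for L, R in splits if R for c in letters]
    inserts = [L + c + R for L, R in splits for c in letters]
    return set(deletes + transposes + replaces + inserts)

def known(words):
    return {w for w in words if w in WORDS}

def correct(word):
    """Most frequent known candidate, preferring the smallest edit distance."""
    candidates = (known([word]) or known(edits1(word))
                  or known(e2 for e1 in edits1(word) for e2 in edits1(e1))
                  or [word])
    return max(candidates, key=WORDS.get)
```

The Bayesian reading: candidates at smaller edit distance are assumed more probable error models, and among equally distant candidates the corpus frequency acts as the language-model prior.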
NONFUNCTIONAL REQUIREMENTS:

User Interface Design:

For information retrieval, stemming and stop-word removal do not require any specialized interfaces. The system operates as a stand-alone application. The system provides interfaces for questions and answers to assess the learner's competency level. Interfaces are also provided for presenting the details of the score assignment process to the learner in the form of JFrames. In addition, the interface includes provision for specifying the target corpus.
2.2.2 Documentation

The system is properly documented. All requirements (functional and non-functional), use case diagrams and their descriptions, the various packages used and the relevant tools employed form part of the system documentation. The source code listing has also been documented to serve as a reference for future developers and contributors.
2.2.3 Hardware Considerations

The following hardware considerations are identified:

Operating System: Ubuntu
Processor: Pentium 2.0 GHz or higher
RAM: 256 MB or more
Hard Drive Space: 10 GB or more
2.2.4 Performance Requirements

The performance of the system is evaluated against the baseline method of manual essay grading with respect to four parameters, viz. grammar, spelling, correlation with the topic and surface features.
Error Handling

The following errors are possible in each of the modules:

Null entry in the text area: when the user enters nothing in the essay text area, a message box appears the first time, prompting the user to key in the essay. This ensures that the user does not submit an essay unknowingly by clicking the Submit button. If the action is repeated, the null essay is accepted and assigned a score of zero.
Content Vector Analysis:
In case of a total absence of relation between the essay question and the candidate's answer, a score of zero is assigned, notwithstanding the performance on the grammar, spelling and surface feature aspects.
CONSTRAINTS AND ASSUMPTIONS:

Language constraint: the language under consideration is English.

Assumption: a manually pre-scored corpus of reference essays is assumed to be present before the candidate enters the essay for any particular question, so that a prompt score can be produced. The manually corrected essays are assumed to have been evaluated, error-free, on the basis of the four holistic rubrics mentioned above.
SYSTEM MODELS:

Use Case Model and Scenarios

The various use cases in Figure x.x are elaborated in this section.
Use case: Create Essay Question
ID: 001
TITLE: Create essay question
DESCRIPTION: The question which the candidate is required to answer is created by the administrator.
ACTORS: Admin
PRE-CONDITIONS: The admin should have logged in.
POST-CONDITIONS: The question flashes on the user interface.
Use case: Input Test Essay
ID: 002
TITLE: Input Test essay
DESCRIPTION: The student enters his response to the required question in the text area.
ACTORS: Student
PRE-CONDITIONS: The student should have logged in and the question should have been prompted on the screen.
POST-CONDITIONS: The essay, upon submission, gets stored in the required file.
Use case: Store Essay
ID: 003
TITLE: Store Essay
DESCRIPTION: The essay that the student enters gets stored at the required file location.
ACTORS: Test essay db
PRE-CONDITIONS: The student should have entered the essay and clicked the Submit button.
POST-CONDITIONS: The essay contents get copied to the file in the desired location.
Use case: Input Reference Essay
ID: 004
TITLE: Input Reference Essay
DESCRIPTION: The admin enters the name of the folder containing the pre-scored, manually graded essays.
ACTORS: Admin
PRE-CONDITIONS: The reference essays must be available in the folder after having been graded manually.
POST-CONDITIONS: The reference essays will be ready for comparison with the test essay.
Use case: Text Processing
ID: 005
TITLE: Text Processing
DESCRIPTION: Stemming and stop-word removal of the reference and test essays are done.
ACTORS: Reference essay db, Test essay db
PRE-CONDITIONS: The simple feature extraction, spell check and auto-correction, and grammar check should have been performed.
POST-CONDITIONS: The keywords of the reference and test essays are displayed.
Use case: Generate Individual Score
ID: 006
TITLE: Generate Individual Score
DESCRIPTION: Depending on the rubrics, the essays are evaluated and suitable marks are awarded for each.
ACTORS: Reference essay db, Test essay db
PRE-CONDITIONS: The text from the test and reference essays must have been processed.
POST-CONDITIONS: The individual score components are recorded.
Use case: Display Grade
ID: 007
TITLE: Display Grade
DESCRIPTION: The individual scores are combined and the overall score range is estimated.
ACTORS: Score db
PRE-CONDITIONS: The individual scores must have been generated.
POST-CONDITIONS: The final score range is displayed on the screen.
CHAPTER 3
DESIGN
This chapter gives the detailed design description of modules in the system.
3.1 Data Flow Diagram:

The Data Flow Diagram in Figure x.x lists the various stages involved in the implementation of the system. The relationships between them are also shown in Figure 3.1.

FIGURE x.x.1: DFD Level-0
FIGURE x.x.2: DFD Level-1

FIGURE x.x.3: DFD Level-2

3.2 USER INTERFACE DESIGN
The system uses JFrame for essay input. It also uses JFrames for displaying the different stages of the evaluation and for displaying the score. Text areas, text boxes, message boxes and a tabbed pane are used for user interactivity.
3.3 OVERALL SYSTEM ARCHITECTURE
The proposed system shown in Figure x.x is composed of the following modules:
FIGURE 3.2: System Architecture
3.4 MODULE DESCRIPTIONS
3.4.1 SIMPLE FEATURE EXTRACTION:

INPUT: Test essay
OUTPUT: Text complexity feature score (Component 1)

The system first evaluates text complexity features, such as the number of characters in the document (Chars), the number of words in the document (Words), the number of different words (Diffwds), the fourth root of the number of words in the document, as suggested by Page (Rootwds), the number of sentences in the document (Sents), the average word length (Wordlen = Chars/Words), the average sentence length (Sentlen = Words/Sents) and the number of words longer than five characters (BW5). Each feature has its own use. For example, the number of words represents the length of the essay, given a length requirement of, say, 250-300 words. This feature can detect an empty essay, or one so ridiculously short that it cannot be processed, and reject it immediately; otherwise a score can be assigned accordingly.
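These features can be computed directly; in the sketch below, the regex tokenization, the sentence split on .!? and the 250-word threshold are simplifying assumptions rather than the report's exact rules.

```python
import re

def surface_features(text, min_words=250):
    """Compute the surface features listed above for one essay."""
    words = re.findall(r"[A-Za-z']+", text)
    sents = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    chars = sum(len(w) for w in words)
    return {
        "Chars": chars,
        "Words": len(words),
        "Diffwds": len({w.lower() for w in words}),
        "Rootwds": len(words) ** 0.25,                   # Page's fourth root
        "Sents": len(sents),
        "Wordlen": chars / len(words) if words else 0.0,
        "Sentlen": len(words) / len(sents) if sents else 0.0,
        "BW5": sum(1 for w in words if len(w) > 5),
        "too_short": len(words) < min_words,             # flag for rejection
    }
```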
3.4.2 GRAMMAR/SPELL CHECK:

INPUT: Test essay
OUTPUT: Spell check score (Component 2)

Once the essay passes the feature extraction process, the next step is to check it for spelling mistakes. The number of spelling mistakes is recorded and the errors are auto-corrected.

INPUT: Auto-corrected test essay
OUTPUT: Grammar check score (Component 3)

The essay is then checked for grammatical mistakes using jlinkgrammar, which works on the basic principle of linking. It uses probabilistic parsing and deduces the number of linkage errors in the passage sent through the batch file, from which the potential grammar errors are identified; based on this, a score component is assigned.
3.4.4 FINAL SCORE USING REGRESSION:

The individual raw scores, namely from the feature extraction process, the grammar/spell check process and the content vector analysis process, are taken, and weights are assigned to each component. The scores are then subjected to ridge regression, using which the final score is calculated.
IMPLEMENTATION
This chapter explains the details of implementation of all modules of the proposed
system.
4.1 IMPLEMENTATION DETAILS
The system is implemented in Java.
Java JFrames were used for the user interface design of the system. Netbeans was the IDE of choice for Java. For grammar checking, the system was coded using the features of the tool jlinkgrammar. The Stanford Tagger is used to find the number of verbs for surface feature analysis.

Tools used in the implementation:

Packages: Stanford Parts-of-Speech Tagger
Dictionary: Peter Norvig's essay (spelling correction)
IDE for Java: Netbeans
Grammar check: jlinkgrammar
TEXT PROCESSING DETAILS:

3.1 Stop Word Removal

Many of the most frequently used words in English are useless in Information Retrieval (IR) and text mining. These words are called 'stop words'. Stop words, which are language-specific functional words, are frequent words that carry little information (e.g., pronouns, prepositions, conjunctions). In English there are about 400-500 stop words; examples include 'the', 'of', 'and' and 'to'. The first step during preprocessing is to remove these stop words, which has proven very important. The present work uses a stop word list customized by us.
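A minimal sketch of this preprocessing step follows. The stop-word list is a tiny illustrative subset, not the customized 400-500-word list used in the present work, and the suffix stripping is a crude stand-in for the stemmer described in the next section.

```python
# Illustrative stand-ins only; not the report's actual list or stemmer.
STOP_WORDS = {"the", "of", "and", "to", "a", "in", "is", "it", "are", "for", "on"}

def strip_suffix(word):
    """Crude suffix stripping (not a full Porter stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def preprocess(text):
    """Lowercase, drop stop words, then reduce each token to a crude stem."""
    tokens = [w.lower() for w in text.split()]
    return [strip_suffix(w) for w in tokens if w not in STOP_WORDS]
```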
3.2 Stemming

Stemming techniques are used to find the root/stem of a word. Stemming converts words to their stems, which incorporates a great deal of language-dependent linguistic knowledge. The hypothesis behind stemming is that words with the same stem or word root mostly describe the same or closely related concepts in text, so such words can be conflated by using their stems. For example, the words 'user', 'users', 'used' and 'using' can all be stemmed to the word 'use'. In the present work, the stemmer algorithm is defined and used.

3.3 Document Indexing
The main objective of document indexing is to increase efficiency by extracting from each document a selected set of terms to be used for indexing it. Document indexing consists of choosing an appropriate set of keywords based on the whole corpus of documents, and assigning weights to those keywords for each particular document, thus transforming each document into a vector of keyword weights. The weight is normally related to the frequency of occurrence of the term in the document and to the number of documents that use the term.

3.3.1 Term Weighting

In the vector space model, documents are represented as vectors. Term weighting is an important concept that determines the success or failure of a classification system. Since different terms have different levels of importance in a text, a term weight is associated with every term as an importance indicator.

The main components that affect the importance of a term in a document are the term frequency (TF) factor and the inverse document frequency (IDF) factor. The term frequency of a word in a document (TF) is a weight that depends on the distribution of the word within documents; it expresses the importance of the word in the document. The inverse document frequency of a word in the document database (IDF) is a weight that depends on the distribution of the word across the document database; it expresses the importance of the word in the document database. TF/IDF is a technique that uses both TF and IDF to determine the weight of a term. The TF/IDF scheme is very popular in the text classification field, and almost all other weighting schemes are variants of it.
Given a document collection 'D', a word 'w', and an individual documentd D, the weight w is
calculated using Equation x.
The result of TF/IDF is a vector of the terms together with their weights. The pseudocode for the TF/IDF calculation is shown in Fig. 2.
for each word in the WordList
    determine TF, calculate its corresponding weight and store it in the weight matrix (WM)
    determine IDF
    if IDF == zero then
        remove the word from the WordList
        remove the corresponding TF from the WM
    else
        calculate TF/IDF and store the normalized TF/IDF in the corresponding element of the WM

Fig. 2 Algorithm TF/IDF
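The procedure of Fig. 2 can be sketched in Python as follows. The function assumes documents arrive already tokenized (stop words removed, stems applied); normalization of the final weights is omitted for brevity, and the sample documents are illustrative:

```python
import math

def tf_idf(docs):
    """TF/IDF weights for a list of tokenized documents.

    Mirrors Fig. 2: terms with IDF == 0 (i.e. terms present in every
    document) are removed from consideration.
    """
    vocab = sorted({t for doc in docs for t in doc})
    n = len(docs)
    weights = {}
    for term in vocab:
        df = sum(1 for doc in docs if term in doc)  # document frequency
        idf = math.log(n / df)
        if idf == 0:  # term occurs in all documents: drop it
            continue
        # One weight per document: raw term frequency times IDF.
        weights[term] = [doc.count(term) * idf for doc in docs]
    return weights

docs = [["score", "essay", "essay"], ["score", "grammar"], ["score", "style"]]
wm = tf_idf(docs)  # "score" is dropped because its IDF is zero
```

Each remaining vocabulary term maps to a column of per-document weights, which together form the weight matrix used for the later similarity computation.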
TEST RESULTS AND ANALYSIS
Test case Id: AES1
Module being tested: Essay_entry UI
Test case Description:
This test case verifies if the essay is keyed in or not.
Flow of Events:
1. A question pertaining to a topic appears in the interface.
2. If the user's answer is null, the user is prompted again to make an entry.
Expected Results:
The essay, if null, gets invalidated.
Exceptions:
If the user submits a null entry a second time, it is accepted and given a score of 0.
Test Result: PASS
Comments and Bugs(if any) identified: NIL
Test case Id: AES2
Module being tested: Response Recording
Test case Description:
This test case verifies if the essay entered by the user gets stored in a dedicated text file.
Pre-condition:
The user provides an answer to the essay question.
Flow of Events:
1. The user submits the essay after entering it in the UI.
2. If it is non-null, the essay is recorded in a text file.
Expected Results:
The contents of the essay entered in the UI are copied to a text file at the desired location.
Exceptions:
If the essay is null, the process terminates after a score of zero is assigned.
Test Result: PASS
Comments and Bugs(if any) identified: NIL
Test case Id: AES4
Module being tested: Grammar and spell check
Test case Description:
This test case verifies if the essay entered by the user is evaluated according to the number of spelling and grammar errors.
Pre-condition:
The user provides an answer to the essay question.
Flow of Events:
1. The essay is checked for spelling errors and auto-corrected using Peter Norvig's method.
2. The number of spelling errors is recorded and contributes a proportional negative score component.
3. The test essay is then checked for grammar errors using Link Grammar.
4. The number of grammar mistakes is recorded and a corresponding negative score is allotted.
Expected Results:
The score components for Grammar and spelling are recorded.
Exceptions:
NIL
Test Result: PASS
Comments and Bugs(if any) identified: NIL
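The candidate-generation step at the heart of Norvig's spelling-correction method can be sketched as below. KNOWN is an illustrative stand-in for the real dictionary, and the word-frequency ranking of the full method is omitted:

```python
# Minimal sketch of Peter Norvig's spelling corrector: enumerate every
# string at edit distance 1 from the input (deletes, transposes,
# replaces, inserts), then keep only known words. KNOWN is a hypothetical
# stand-in dictionary; the frequency-based ranking is omitted.
LETTERS = "abcdefghijklmnopqrstuvwxyz"

def edits1(word):
    """All strings one edit away from `word`."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = [a + b[1:] for a, b in splits if b]
    transposes = [a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1]
    replaces = [a + c + b[1:] for a, b in splits if b for c in LETTERS]
    inserts = [a + c + b for a, b in splits for c in LETTERS]
    return set(deletes + transposes + replaces + inserts)

KNOWN = {"essay", "score", "grammar"}

def correct(word):
    """Return the word unchanged if known, else a known edit-1 candidate."""
    if word in KNOWN:
        return word
    candidates = edits1(word) & KNOWN
    return min(candidates) if candidates else word

print(correct("esay"))  # -> essay
```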
Test case Id: AES5
Module being tested: CVA
Test case Description:
This test case verifies if the most similar essay is determined using CVA.
Pre-condition:
The reference corpus contains the prescored essays.
Flow of Events:
1. The documents are indexed along with the test essay document.
2. Keywords are extracted after stemming and stop-word removal.
3. The raw frequencies of the keywords in all the documents are determined.
4. The relative frequencies are found using TF-IDF computation, and the weighted term-document matrix (TDM) is built.
5. The similarity between the test essay and each reference essay is computed as the cosine of the corresponding vectors, where each column of the weighted TDM is treated as a vector.
6. The most similar document is identified and the corresponding score is allotted.
Expected Results:
A specific component of the score is allotted based on the relation with the reference documents.
Exceptions:
If the user's response is totally off-topic, it is given a score of 0.
Test Result: PASS
Comments and Bugs(if any) identified: NIL
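The cosine comparison in step 5 of the flow above can be sketched as follows; each essay is assumed to be already reduced to a weighted term vector (a column of the weighted TDM), and the sample vectors are illustrative:

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def most_similar(test_vec, reference_vecs):
    """Index of the reference essay whose vector is closest to the test essay."""
    return max(range(len(reference_vecs)),
               key=lambda i: cosine(test_vec, reference_vecs[i]))

# Illustrative TF-IDF column vectors taken from a weighted TDM.
refs = [[1.0, 0.0, 2.0], [0.0, 3.0, 0.0]]
print(most_similar([2.0, 0.0, 1.0], refs))  # -> 0
```

The index returned selects the pre-scored reference essay whose score is then allotted to the test essay.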
CHAPTER 6
CONCLUSION

6.1 OVERALL CONCLUSION

In this work, we have presented a novel framework for the automatic evaluation of essays. For the grammar checking component, we use linkages to detect errors, which is more reliable than the traditional method of defining explicit grammar rules: instead of parsing sentences with rule-based part-of-speech tags, the link approach builds a syntactic structure for the language from links between words. For the topic detection component, we use the CVA model and find that it can effectively detect whether an essay is off-topic, especially over a large number of essays. The final score, computed using ridge regression, is influenced by a number of factors rather than being dominated by any single one.
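As an illustration of combining the component scores with ridge regression, the closed-form solution w = (X^T X + lambda*I)^-1 X^T y can be written out by hand for two features. All numbers below are made-up examples, not values from the actual system:

```python
# Illustrative sketch: combining two component scores with ridge
# regression via the closed-form solution, with the 2x2 matrix inverse
# expanded by hand. The data are hypothetical examples.
def ridge_fit_2d(xs, ys, lam=1.0):
    """Fit ridge weights for two features given (x1, x2) rows and targets."""
    a = sum(x1 * x1 for x1, _ in xs) + lam          # (X^T X + lam*I)[0][0]
    b = sum(x1 * x2 for x1, x2 in xs)               # off-diagonal term
    d = sum(x2 * x2 for _, x2 in xs) + lam          # (X^T X + lam*I)[1][1]
    c1 = sum(x1 * y for (x1, _), y in zip(xs, ys))  # (X^T y)[0]
    c2 = sum(x2 * y for (_, x2), y in zip(xs, ys))  # (X^T y)[1]
    det = a * d - b * b
    return (d * c1 - b * c2) / det, (a * c2 - b * c1) / det

# Features: (grammar score, topic-similarity score); targets: human scores.
xs = [(0.9, 0.8), (0.3, 0.4), (0.7, 0.9)]
ys = [8.0, 3.5, 7.5]
w1, w2 = ridge_fit_2d(xs, ys, lam=0.1)
combined = w1 * 0.8 + w2 * 0.7  # predicted score for a new essay
```

The regularization term lambda shrinks the weights so that no single component dominates the combined score.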
6.2 FUTURE WORK

In our work we have restricted ourselves to the English language, but the system can be further extended to cater to other languages using suitable dictionaries. Also, for descriptive science essays,
provision to include and evaluate equations and formulae can be made as an additional enhancement.

Also, in addition to checking the correlation between documents using exact words, similar words can be derived with the aid of a thesaurus, and the checking performed on them for greater accuracy.
REFERENCES

[1] Lin Bin, Lu Jun, Yao Jian-Min and Zhu Qiao-Ming, "Automated Essay Scoring Using the KNN Algorithm", International Conference on Computer Science and Software Engineering, 2008.

[2] Yali Li and Yonghong Yan, "Automated Essay Scoring System for CET4", Second International Conference on Education Technology and Computer Science, 2010.

[3] Md. Monjurul Islam and A. S. M. Latiful Hoque, "Automated Essay Scoring Using Generalized Latent Semantic Analysis", 13th International Conference on Computer and Information Technology, 2010.

[4] Yali Li and Yonghong Yan, "An Effective Automated Essay Scoring System Using Support Vector Regression", Fifth International Conference on Intelligent Computation Technology and Automation, 2012.

[5] S. Dikli, "An Overview of Automated Scoring of Essays", Journal of Technology, Learning, and Assessment, 5(1), retrieved from http://www.jtla.org, 2006.

[6] J. Burstein, K. Kukich, S. Wolff, C. Lu, M. Chodorow, L. Braden-Harder and M. Dee Harris, "Automated Scoring Using a Hybrid Feature Identification Technique", in Proceedings of the Annual Meeting of the Association for Computational Linguistics, 1998.
SNAPSHOTS: