@IJMTER-2015, All rights Reserved 414
An Image Processing Oriented Optical Mark Reader Based on Modify
Multi-Connect Architecture MMCA
Rusul Hussein Hasan1, Emad I Abdul Kareem2
1, 2 College of Education in Computer Science, AL-Mustansiriya University, Baghdad, Iraq.
Abstract: Optical Mark Recognition (OMR) is the technology of electronically extracting intended
data from marked fields, such as square and bubble fields, on printed forms. OMR technology is
particularly useful for applications in which large numbers of hand-filled forms need to be processed
quickly and with a high degree of accuracy. The technique is particularly popular with schools and
universities for reading multiple-choice exam papers. This paper proposes an OMR based on the
Modify Multi-Connect Architecture (MMCA) associative memory. It works in two phases: a training
phase and a recognition phase. The proposed method is also able to detect questions with more than
one selected choice or with none. On 800 test samples covering 8 types of grid answer sheets and
58000 questions in total, the system achieves an accuracy of 99.96% in recognizing marks, making it
suitable for real-world applications.
Key Words: Associative memory, Image processing, Optical mark reader and multiple choice exam.
I. INTRODUCTION
OMR technology has changed considerably in recent years. Nowadays it is used in schools, colleges,
and classes. Exams are conducted using OMR answer-sheet checking systems because this technology
makes conducting exams easier, more powerful, and cheaper [11].
Optical Mark Reader (OMR), also called “mark sensing”, is a method of entering data such as
assessment/multiple-choice exams, course evaluation sheets, enrollment forms, and surveys into a
computer system using an optical mark reader. Pencil or pen marks made in predefined positions on
paper forms as responses to questions or tick-list prompts can be read by the reader. These marks are
digitally entered into a computer for further analysis. OMR is very useful when data is to be
collected from a large number of sources simultaneously and a large volume of data must be
collected and processed in a short period of time. The university environment is the perfect
application area for OMR systems, especially when dealing with large class sizes. The government
and the medical industry are also major application areas for OMR systems [15].
The idea of such a system was proposed by R. B. Johnson, a high school science teacher in
Michigan, who devised a machine for recording students' test answers and comparing them to an
answer key. IBM bought the rights to his invention and launched the machine on the market under the
name IBM 805 Test Scoring Machine [1].
The proposed approach uses the Modify Multi-Connect Architecture associative memory, which was
developed from the MCA associative memory [3]. It learns an initial mask, which is then used to
detect the mark for each question during the recognition phase.
II. RELATED WORKS
This section surveys the literature related to Optical Mark Readers and the work that has been done
to improve them.
2.1 Optical Mark Reader Based Image Processing
According to the survey of related works, two techniques have been used, as follows:
2.1.1 OMR Based Image Segmentation and Thresholding Technique
Chinnasarn et al. presented a system based on a personal-computer-type microcontroller
and an image scanner. The system operation can be divided into two modes: learning mode and
International Journal of Modern Trends in Engineering and Research (IJMTER) Volume 02, Issue 07, [July– 2015] ISSN (Online):2349–9745 ; ISSN (Print):2393-8161
operation mode. The data extraction from each area can be performed based on the horizontal and
vertical projections of the histogram. For the answer checking purpose, the number of black pixels in
each answer block is counted, and the difference of those numbers between the input and its
corresponding model is used as a decision criterion. This is the first image-based OMR technique
[4].
Andrea Spadaccini described JECT-OMR, a system that analyses digital images representing scans of
multiple-choice tests compiled by students. The system performs a structural analysis of the
document in order to get the chosen answer for each question, and it also contains a bar-code
decoder, used to identify additional information encoded in the document. JECT-OMR
was implemented in the Python programming language and leverages the Gamera
framework to accomplish its task [6].
Tien Dzung Nguyen et al. proposed a reliable and efficient camera-based system for grading
multiple-choice tests. The bounds of the answer sheet image captured by the camera are first
located using the Hough transform and then skew-corrected into the proper orientation, followed by
normalization to a given size. The tick mark corresponding to the answer for each question is
recognized by locating the mask that wraps the answer area [7].
Nutchanat Sattayakawee proposed a test-scoring algorithm for grid answer sheets. The method
is based on projection profile and thresholding techniques [8].
Rakesh S et al. proposed a system consisting of an ordinary printer, a scanner, and a computer to
perform the computation, assisted by a graphical user interface. Users can design forms of their
choice and use them for surveys or other related activities. The filled forms are scanned, and the
scanned images are given as input to a computer, which does the computation and stores the result
in a user-readable spreadsheet. The system is independent of hardware and system platform [9].
A. AL-Marakeby presented a low-cost and fast solution for an optical mark recognition system running
on a multi-core processor system. The answer sheet is captured using a digital camera and the image is
processed. Initially the borders of the sheet are located, then the bubbles are detected. Fast techniques
are used to detect the bubbles without a rotation correction. Adaptive binarization is used
to overcome the lighting effects in camera-based images [10].
2.1.2 OMR Based Image Segmentation and Template Matching Technique
Francisco de Assis Zampirolli et al. presented a simple and innovative method to transform captured
images of answer sheets into reduced binary matrices containing answers to the questions plus some
control elements, using simple morphological operations for segmentation [5].
Azman Talib et al. proposed a shape-based vision algorithm, a hierarchical template-matching approach
implemented in their system to verify the image and inspect the correct answers on the
Optical Mark Recognition (OMR) sheet form. An OMR answer sheet with all correct
answers marked on the paper is used as a template for object recognition during the
matching process. A region of interest (ROI) is selected and filtered into grey level to extract the
contour of the object. The image is then pre-processed and trained using image processing techniques.
A low-cost 1.3 MP web camera is used to acquire the marked OMR image for all questions together
with the sequence number; this ensures that the system can distinguish between different
questions having the same answer [12].
Ms. Sumitra B. Gaikwad aimed to develop an image-processing-based Optical Mark Recognition sheet
scanning system. Many competitive exams are conducted as entrance exams, and these
exams consist of MCQs. The students have to fill in the box or circle for the appropriate answer to
each question. During the examining phase, a stencil is normally provided to
the examiner to determine the right answers. This is a manual process in which many
errors can occur, such as counting mistakes. To avoid these
mistakes, an OMR system is used. In this system the OMR answer sheet is scanned, and the scanned
image of the answer sheet is given as input to the software system, which processes it using image
processing [11].
III. MODIFY MULTI-CONNECT ARCHITECTURE ASSOCIATIVE MEMORY
An associative memory is a memory in which data are collectively stored in the form of a memory or
weight matrix, which is used to generate the output that corresponds to a given input. It can be
either auto-associative or hetero-associative [2]. Although the Hopfield neural network is one of
the most commonly used neural network models for auto-association and optimization tasks, it has
several limitations. For example, it is well known that Hopfield neural networks suffer from a
limited number of stored patterns, local minimum problems, a limited noise ratio, retrieval of the
reverse value of a pattern, and shifting and scaling problems [3].
Although MCA has overcome these limitations [3], its efficiency can be improved further by
decreasing the network size and weight size, increasing its robustness to noise, and speeding up
its learning and convergence processes. To this end, a modified associative memory based on the
MCA, namely the Modify Multi-Connect Architecture (MMCA), is proposed. The modifications
cover the network architecture as well as the learning and convergence processes.
This improvement was achieved through the proposed algorithms for the learning and convergence
processes. In both, the pattern (a sequence of 1's and -1's) is divided into a number of
parts of size two, each of which is treated as a vector v (every two elements of the pattern form
one vector). Each of these vectors needs to create its learning weight matrix during the learning
process, or to find its convergence pattern during the convergence process; see Figure 1.
Figure 1. The data (pattern) divided into a number of vectors of size two, each of which needs
to create its learning weight matrix.
Because of this process, MMCA can deal with any pattern size, the associative memory capacity
becomes unlimited, and it can remember even correlated patterns. Additionally, because the
size of the vectors is two, there are no more than four possible vectors (see Table 1). This means
that no more than two weight matrices W are built during the learning process, since
each pair of orthogonal vectors has the same weight. These matrices are symmetric, have a nonzero
diagonal, and are of size 2×2.
Table 1. The four possibilities of the bipolar vector with length two.
The architecture of MMCA is illustrated in Figure 2. Each path represents one learning
weight matrix (1 ≤ m ≤ 2); thus, every vector in the pattern is replaced with a number
that represents the number of its path in the net. By this number, the path can be called again.
Figure 2. The architecture of MMCA associative memory.
Table 1 entries: (-1, -1), (-1, 1), (1, -1), (1, 1).
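The learning step described in this section can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: it assumes that each weight matrix is the outer product of a size-two vector with itself (a Hebbian-style rule with the diagonal kept), and the names split_into_pairs, weight_matrix, and learn are hypothetical. Under this assumption a vector and its negation produce the same matrix, so the four vectors of Table 1 give only two distinct matrices.

```python
from typing import List, Tuple

Vector = Tuple[int, int]
Matrix = Tuple[Tuple[int, int], Tuple[int, int]]


def split_into_pairs(pattern: List[int]) -> List[Vector]:
    """Divide a bipolar pattern into consecutive vectors of size two."""
    if len(pattern) % 2 != 0:
        raise ValueError("pattern length must be even")
    if any(value not in (-1, 1) for value in pattern):
        raise ValueError("pattern must contain only 1 and -1")
    return [(pattern[i], pattern[i + 1]) for i in range(0, len(pattern), 2)]


def weight_matrix(v: Vector) -> Matrix:
    """Outer product v * v^T (assumed learning rule, diagonal kept)."""
    return ((v[0] * v[0], v[0] * v[1]), (v[1] * v[0], v[1] * v[1]))


def learn(pattern: List[int]) -> Tuple[List[Matrix], List[int]]:
    """Return the distinct weight matrices and the path number of each vector."""
    paths: List[Matrix] = []
    path_numbers: List[int] = []
    for v in split_into_pairs(pattern):
        w = weight_matrix(v)
        if w not in paths:
            paths.append(w)
        path_numbers.append(paths.index(w) + 1)  # paths are numbered from 1
    return paths, path_numbers


pattern = [1, 1, -1, 1, -1, -1, 1, -1]
paths, path_numbers = learn(pattern)
print(len(paths))     # 2
print(path_numbers)   # [1, 2, 1, 2]
```

The example pattern contains all four vectors of Table 1, yet only two paths are created, which matches the bound 1 ≤ m ≤ 2 stated above.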
IV. THE PROPOSED OMR MODULE
The general flowchart of the OMR module is shown in Figure 3. The answer sheet captured by a
scanner is processed by the system, and the assessment results are stored in an Excel file. The
details of each phase are discussed below.
4.1 Image Acquisition
The role of the scanner is just to scan the filled sheet, so any flatbed or ADF (Automatic Document
Feeder) scanner can be used. In our study, a high-speed scanner is used to acquire the
images of attendance sheets. The average processing speed is more than 60 pages per minute. Image
data are transferred from the scanner to the computer and stored in the computer's memory in JPG
image format.
Figure 3: Flowchart of OMR
4.2 Image Pre-Processing
The pre-processing phase consists of a set of operations that make the scanned image more suitable
for the later phases. The first operation performed on the image is conversion to gray scale; the
image is then converted into black-and-white format using the thresholding method.
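The two operations above can be illustrated with a small pure-Python sketch. It is only an example under stated assumptions: the paper does not say which gray-scale weights or which threshold it uses, so the BT.601 luma weights and the fixed level of 128 are choices made for this illustration, and the function names to_gray and threshold are hypothetical. Ink pixels are mapped to 1, which corresponds to the inverted binary image used later for the projections.

```python
from typing import List, Tuple

RGB = Tuple[int, int, int]


def to_gray(image: List[List[RGB]]) -> List[List[int]]:
    """Convert RGB pixels to gray levels with the common BT.601 luma weights."""
    return [[round(0.299 * r + 0.587 * g + 0.114 * b) for r, g, b in row]
            for row in image]


def threshold(gray: List[List[int]], level: int = 128) -> List[List[int]]:
    """Binarize: 1 for dark (ink) pixels below `level`, 0 for paper."""
    return [[1 if value < level else 0 for value in row] for row in gray]


# A tiny 2x2 "scan": white paper, a dark pencil mark, light gray, dark brown.
scan = [
    [(255, 255, 255), (20, 20, 20)],
    [(200, 200, 200), (90, 60, 30)],
]
binary = threshold(to_gray(scan))
print(binary)  # [[0, 1], [0, 1]]
```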
Next, the system compensates for rotation effects induced by the scanning operation. The
goal of this step is to rotate the answer sheet image by a calculated angle to restore it to its
normal rectangular orientation. To do that, the correct angle is first calculated using the Hough
transform method, and bilinear interpolation is then applied with this angle to rotate all pixels
of the answer sheet image to their normal locations. Figure 4 shows the answer sheet image before
and after the rotation operation.
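The rotation step can be sketched as follows, assuming the skew angle has already been estimated (the Hough transform itself is not shown). The function rotate_bilinear is a hypothetical name; rotating about the image centre, the sign convention of the angle, and the white fill value for pixels that fall outside the source image are all assumptions of this sketch rather than details taken from the paper.

```python
import math
from typing import List


def rotate_bilinear(img: List[List[float]], angle_deg: float,
                    fill: float = 255.0) -> List[List[float]]:
    """Rotate a gray image about its centre using bilinear interpolation."""
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: find the source position of each output pixel.
            dx, dy = x - cx, y - cy
            sx = cos_a * dx + sin_a * dy + cx
            sy = -sin_a * dx + cos_a * dy + cy
            x0, y0 = math.floor(sx), math.floor(sy)
            if x0 < 0 or y0 < 0 or x0 > w - 1 or y0 > h - 1:
                continue  # source falls outside the sheet: keep the fill value
            x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
            fx, fy = sx - x0, sy - y0
            # Blend the four neighbouring source pixels.
            top = img[y0][x0] * (1 - fx) + img[y0][x1] * fx
            bottom = img[y1][x0] * (1 - fx) + img[y1][x1] * fx
            out[y][x] = top * (1 - fy) + bottom * fy
    return out


sheet = [
    [255, 255, 255, 255],
    [255, 0, 0, 255],
    [255, 255, 255, 255],
]
unchanged = rotate_bilinear(sheet, 0.0)
deskewed = rotate_bilinear(sheet, 10.0)
```

A rotation by zero degrees returns the input unchanged, which is a convenient sanity check before applying a measured skew angle.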
Figure 4: The rotation of the answer sheet image: (a) rotated answer sheet image; (b) normal answer sheet image.
4.3 Answer Area Allocation
In this step, the answer sheet image is projected horizontally and vertically to locate the answer
area. To achieve answer area allocation, the following steps must be applied sequentially:
1. Invert the binary answer sheet image.
2. Compute the vertical projection of the inverted answer sheet image by counting the white values
in each column. In Figure 5 (a), starting from the vertical middle of the answer sheet image, the red
line represents the direction of the computing process and the blue arrow represents the column that
contains the maximum number of white values. Finally, store the result of this process in a lookup
table: the size of this array is the width of the answer sheet image divided by two, the index of
the array represents the column number, and the data of the array represent the count of white
values in each column.
3. Compute the horizontal projection of the inverted answer sheet image by counting the white values
in each row. In Figure 5 (a), starting from the horizontal middle of the answer sheet image, the
violet line represents the direction of the computing process and the green arrow represents the row
that contains the maximum number of white values. Finally, store the result of this process in a
lookup table: the size of this array is the height of the answer sheet image divided by two, the
index of the array represents the row number, and the data of the array represent the number of
white values in each row.
Figure 5 (b) shows the answer area after cropping with the horizontal and vertical projections.
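The projection steps above can be sketched as follows. This is a simplified illustration, not the authors' code: the function names are hypothetical, white pixels of the inverted image are represented by 1, and searching only the first half of each projection is an assumption made to mirror the half-size lookup tables described in the steps.

```python
from typing import List


def column_projection(inverted: List[List[int]]) -> List[int]:
    """Count white (1) pixels in every column of an inverted binary image."""
    return [sum(column) for column in zip(*inverted)]


def row_projection(inverted: List[List[int]]) -> List[int]:
    """Count white (1) pixels in every row of an inverted binary image."""
    return [sum(row) for row in inverted]


def peak_in_first_half(projection: List[int]) -> int:
    """Index of the largest count in the first half of a projection.

    Restricting the search to one half is an assumption of this sketch.
    """
    half = projection[: max(1, len(projection) // 2)]
    return half.index(max(half))


# A 4x6 inverted image whose frame (the white 1s) surrounds the answer area.
inverted = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 1, 0],
    [0, 1, 0, 0, 1, 0],
    [0, 1, 1, 1, 1, 0],
]
columns = column_projection(inverted)
rows = row_projection(inverted)
left_edge = peak_in_first_half(columns)
top_edge = peak_in_first_half(rows)
print(columns, rows, left_edge, top_edge)
```

In this toy image the strongest column and row in the first halves are both at index 1, which is where the frame of the answer area begins.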
Figure 5: The cropping of vertical and horizontal projection: (a) original answer sheet image;
(b) after cropping of vertical and horizontal projection.
Then each question zone must be determined in order to score the response to that question.
The question number is excluded from the question segment. Each question zone is described by 4
positions (X up, X low, Y up, Y low), which are the lower and upper bounds of the question answer
zone, as shown in Figure 6.
Figure 6: Question zone represented by the quadruple (X up, X low, Y up, Y low).
4.4 Answer recognition using MMCA
First, the initial mask size is computed using Equation 2. The proposed OMR approach uses this
mask as a training image during the learning phase, which is implemented only once; the result is
saved in a lookup table in the MMCA associative memory to be remembered during the recognition
process.
Mask size = (X up - X low) / N choice    (2)
The recognition phase then implements, for each question, the convergence phase of the MMCA using
the lookup table built during the training phase. Figure 7 illustrates the steps of the training
and recognition phases of the MMCA method.
Figure 7: The training and recognition phase steps of the MMCA method.
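Equation 2 and the per-question decision can be illustrated with the sketch below. It is not the MMCA convergence process: as a plainly named stand-in, each choice cell is judged by the fraction of ink pixels it contains. The names QuestionZone, mask_size, and selected_choices, the integer division, and the fill_ratio of 0.5 are all assumptions made for this example. The returned list makes it easy to tell apart a blank question, a single answer, and multiple selected choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class QuestionZone:
    x_up: int   # upper x bound of the answer zone (exclusive)
    x_low: int  # lower x bound of the answer zone
    y_up: int   # upper y bound of the answer zone (exclusive)
    y_low: int  # lower y bound of the answer zone


def mask_size(zone: QuestionZone, n_choice: int) -> int:
    """Equation 2: width of one choice cell inside the question zone."""
    return (zone.x_up - zone.x_low) // n_choice


def selected_choices(binary: List[List[int]], zone: QuestionZone,
                     n_choice: int, fill_ratio: float = 0.5) -> List[int]:
    """Return the indices of the choices whose cell is mostly ink (1s).

    Pixel counting replaces the MMCA recognition step in this sketch.
    """
    size = mask_size(zone, n_choice)
    height = zone.y_up - zone.y_low
    selected = []
    for choice in range(n_choice):
        x0 = zone.x_low + choice * size
        ink = sum(binary[y][x]
                  for y in range(zone.y_low, zone.y_up)
                  for x in range(x0, x0 + size))
        if ink >= fill_ratio * size * height:
            selected.append(choice)
    return selected


zone = QuestionZone(x_up=8, x_low=0, y_up=2, y_low=0)
one_mark = [[0, 0, 1, 1, 0, 0, 0, 0],
            [0, 0, 1, 1, 0, 0, 0, 0]]
two_marks = [[0, 0, 1, 1, 0, 0, 1, 1],
             [0, 0, 1, 1, 0, 0, 1, 1]]
blank = [[0] * 8, [0] * 8]
print(mask_size(zone, 4))                    # 2
print(selected_choices(one_mark, zone, 4))   # [1]
print(selected_choices(two_marks, zone, 4))  # [1, 3]
print(selected_choices(blank, zone, 4))      # []
```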
V. RESULTS DISCUSSION AND ANALYSIS
In the experiments, the percentage of correctness was measured. Two types of answer sheets
were used, each type divided into four classes, as shown in Appendix A. Each class was represented
by 100 samples. The accuracy result and process time for each test are shown in Figure 8 and
Figure 9, respectively.
Figure 8: Accuracy result of each test. [Chart "Accuracy result of MMCA": accuracy on an axis from 99.5 to 100.1 for Square classes I to IV and Bubble classes I to IV, with 50, 60, 80, and 100 questions respectively.]
Figure 9: Process time of each test.
The process time in these experiments depends on the form type: as shown in Figure 9, the process
time increases as the number of questions increases. The performance evaluation of the proposed
OMR is shown in Appendix B. The total number of questions was 58000, tested on two
types of answer sheets. The experiments show that just 26 answers were unrecognized
out of the total of 58000 questions. Accordingly, the average accuracy
was 99.96%.
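The reported figure can be checked directly from the counts in Appendix B. The short script below only restates those counts and recomputes the totals; the variable names are illustrative.

```python
# (answer sheet type, class) -> (total answers, unrecognized), from Appendix B
results = {
    ("Square", "I"): (5000, 0),
    ("Square", "II"): (6000, 0),
    ("Square", "III"): (8000, 0),
    ("Square", "IV"): (10000, 2),
    ("Bubble", "I"): (5000, 0),
    ("Bubble", "II"): (6000, 16),
    ("Bubble", "III"): (8000, 3),
    ("Bubble", "IV"): (10000, 5),
}

total = sum(answers for answers, _ in results.values())
unrecognized = sum(missed for _, missed in results.values())
accuracy = 100.0 * (total - unrecognized) / total

print(total, unrecognized, round(accuracy, 2))  # 58000 26 99.96
```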
VI. COMPARISON WITH OTHER WORKS
The review presented in the related work shows that nine papers have a goal similar to that of
the present research. These papers used thresholding and template matching techniques to
detect marks. Therefore, it is useful to compare the OMR module proposed in this research with the
methods in those papers, and a comparison study has been conducted by evaluating their work.
This comparison focuses only on the accuracy over all questions in each paper.
Figure 10 shows the average accuracy of all the compared papers.
Figure 10: Average accuracy comparison.
[Figure 9 chart "Average Speed Of MMCA": process time in seconds on an axis from 0 to 160 for Square classes I to IV and Bubble classes I to IV, with 50, 60, 80, and 100 questions respectively.]
[Figure 10 chart: accuracy axis from 90.00% to 101.00%.]
VII. CONCLUSION
The results discussion and analysis show that the average accuracy of the proposed OMR was 99.96%.
This work focused on the development of an OMR for multiple-choice tests using a new technique
(i.e. associative memory with the modify multi-connect architecture), paving the way for future
work to develop OMR systems that are more efficient in speed and accuracy using associative memory
(possibly after modifying it). The input forms used in the experiments were printed on A4 sheets.
There is no need for a dedicated OMR scanner; a normal scanner is used to scan the filled forms.
The scanned copies are then used as input to the proposed OMR. The actual errors were caused by
some pale marks in the input forms. The proposed method is also able to detect questions with more
than one selected choice or with none.
REFERENCES
1. IBM Archives, Exhibits, IBM Special Products (Vol. 1).
http://www-03.ibm.com/ibm/history/exhibits/specialprod1/specialprod1_9.html
2. Kishan Mehrotra, C. K. Mohan, and S. Ranka, Elements of Artificial Neural Networks. Bradford Books,
Complex Adaptive Systems, New York, USA, 1996.
3. Emad I Abdul Kareem, et al., "Multi Connect Architecture (MCA) Associative Memory: A Modified Hopfield
Neural Network", Intelligent Automation and Soft Computing, Vol. 18, No. 3, pp. 291-308, 2012.
4. K. Chinnasarn, Y. Rangsanseri, "An image-processing oriented optical mark reader", Applications of
Digital Image Processing XXII, Denver, CO, 1999.
5. Francisco de Assis Zampirolli, José Artur Quilici Gonzalez, Rogério Perino de Oliveira Neves, Centro de
Matemática, Computação e Cognição, "Automatic Correction of Multiple-Choice Tests using Digital Cameras
and Image Processing".
6. Andrea Spadaccini, "A Multiple-Choice Test Recognition System based on the Gamera Framework", 2011,
arXiv:1105.3834v1 [cs.CV], 19 May 2011.
7. Tien Dzung Nguyen, Quyet Hoang Manh, "Efficient and reliable camera based multiple-choice test grading
system", 2011 International Conference on Advanced Technologies for Communications (ATC 2011).
8. Nutchanat Sattayakawee, "Test Scoring for Non-Optical Grid Answer Sheet Based on Projection Profile Method",
International Journal of Information and Education Technology, Vol. 3, No. 2, April 2013.
9. Rakesh S, Kailash Atal, Ashish Arora, "Cost Effective Optical Mark Reader", International Journal of
Computer Science and Artificial Intelligence, Vol. 3, Iss. 2, pp. 44-49, June 2013.
10. A. AL-Marakeby, "Multi-Core Processors for Camera based OMR", International Journal of Computer
Applications (0975-8887), Volume 68, No. 13, April 2013.
11. Ms. Sumitra B. Gaikwad, "Image Processing Based OMR Sheet Scanning", International Journal of Advanced
Research in Electronics and Communication Engineering (IJARECE), Volume 4, Issue 3, March 2015.
12. Azman Talib, Norazlina Ahmad, Woldy Tahar, "OMR Form Inspection by Web Camera Using Shape-Based
Matching Approach", International Journal of Research in Engineering and Science (IJRES), Volume 3,
Issue 4, April 2015, pp. 29-35.
13. Werner Kinnebrock, Neural Network, Fundamentals, Applications, Examples, Galotia publications,
1995.
14. Kussay Nugamesh Mutter, Imad I. Abdul Kaream, Hussein A. Moussa, Connect Architecture. In Third
International Conference on Computer Graphics, Imaging and Visualization (CGIV 2006), 26-28 July 2006,
Sydney, Australia, pp. 236-242, IEEE Computer Society, 2006.
15. Stephen Hussmann, Leona Chan, C. Fung, M. Albrecht, "Low Cost and high speed Optical mark reader based on
Intelligent line Camera", Proceedings of the SPIE AeroSense 2003, Optical Pattern Recognition XIV, Orlando,
Florida, USA, vol. 5106, 2003.
Appendix A
Type of Answer Sheet   Class   No. of Questions
Square                 I       50
Square                 II      60
Square                 III     80
Square                 IV      100
Bubble                 I       50
Bubble                 II      60
Bubble                 III     80
Bubble                 IV      100
Appendix B
Type of Answer Sheet   Class   No. of Questions   Total No. of Answers   Recognized   Unrecognized
Square                 I       50                 5000                   5000         0
Square                 II      60                 6000                   6000         0
Square                 III     80                 8000                   8000         0
Square                 IV      100                10000                  9998         2
Bubble                 I       50                 5000                   5000         0
Bubble                 II      60                 6000                   5984         16
Bubble                 III     80                 8000                   7997         3
Bubble                 IV      100                10000                  9995         5