Test and Item Analysis of Multiple Choice Test
Questions Report
by
C. R. Adjah
July 2007
Table of Contents
List of Tables
List of Figures
Section
1 Introduction
1.1 Statement of purpose
1.2 Methodology
1.3 Report structure
2 Test analysis
2.1 Distribution of students with correct option per question
2.2 Distribution of percentage scores
2.3 Descriptive statistics
3 Item analysis
3.1 Difficulty index
3.2 Discrimination index
3.3 Item reliability
4 Conclusion
Bibliography
Addendum A
Addendum B
List of Tables
Table  Table Name
1      Mean, Standard deviation, Skewness and Kurtosis per question
2      The mean, median, mode and standard deviation of percentages
3      Difficulty index
4      Discrimination index
5      Cronbach’s alpha
6      Cronbach’s alpha on deleting an item
List of Figures
Figure  Figure name
1       The approach
2       Report structure
3       Histogram of students with correct options
4       Frequency histogram of percentage scores
1 INTRODUCTION
This is a report on the test and item analysis of a test of 20 multiple-choice
questions taken by 25 students.
1.1 STATEMENT OF PURPOSE
The purpose of this report is to provide descriptive statistics and an item
analysis of a test of 20 multiple-choice questions taken by 25 students.
1.2 METHODOLOGY
The approach followed is shown in Figure 1.
Figure 1: The approach
Steps: Description
Data tabulation: The data collected from the students' answer scripts were captured in an Excel spreadsheet.
Recoding of data: For each student, the chosen options, captured as A, B, C and D, were recoded in the Excel spreadsheet as 1 for a correct option and 0 for an incorrect option.
Calculation of student score: The score per student was calculated, and the students were sorted in descending order of percentage score.
Grouping of students: The top 13 students were placed in an upper group and the remaining 12 in a lower group.
Analysis of data: The data were analysed using SPSS (Statistical Package for the Social Sciences) to determine the mean, standard deviation, mode, median, difficulty index, discrimination index and Cronbach’s alpha.
Histogram: A histogram of the number of students with correct options per question was drawn.
1.3 REPORT STRUCTURE
The report is made up of four main sections:
Introduction
Test analysis
Item analysis
Conclusion
These sections, as illustrated in Figure 2, are divided into subsections by
their headings.
Figure 2: Report structure
2 TEST ANALYSIS
2.1 Distribution of students with correct option per question
The number of students who chose the correct option for each question was
determined and a histogram drawn. This is illustrated in Figure 3.
Figure 3: Histogram of students with correct options
[Histogram of number of students with correct options per question. x-axis: question (Q1 to Q20); y-axis: frequency (0 to 25).]
The histogram shows that between 20 and 23 students (80% to 92%) chose the
correct options for questions 1, 2, 5, 11, 14, 15 and 16. Between 8 and 13
students (32% to 52%) answered questions 4, 7, 8, 9, 10, 18 and 19
correctly.
2.2 Distribution of percentage scores
The number of students who fall within each percentage score range is
represented by a histogram, as illustrated in Figure 4.
Figure 4: Frequency histogram of percentage scores
[Histogram. x-axis: percentage scores in classes 20-30, 30-40, 40-50, 50-60, 60-70, 70-80, 80-90 and 90-100; y-axis: frequency (0 to 6).]
The scores listed in Addendum A show that 11 students (44%) obtained scores
above the mean, while 14 students (56%) obtained scores below the mean.
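The frequency classes behind Figure 4 can be reproduced with a short sketch. The percentages are taken from the % column of Addendum A; the binning rule (ten-point classes with an inclusive lower bound, and a score of exactly 100 placed in the top class) is an assumption that happens to reproduce the frequencies listed in Addendum B.

```python
# Percentage scores per student, copied from the % column of Addendum A.
percentages = [
    100.00, 100.00, 90.00, 90.00, 90.00, 85.00, 85.00, 80.00, 75.00,
    70.00, 70.00, 65.00, 65.00, 65.00, 65.00, 60.00, 55.00, 55.00,
    52.94, 50.00, 50.00, 40.00, 31.58, 31.58, 20.00,
]

# Assumed classes: lower bound inclusive, upper bound exclusive,
# except that a score of 100 is counted in the 90-100 class.
labels = ["20-30", "30-40", "40-50", "50-60",
          "60-70", "70-80", "80-90", "90-100"]

freq = {label: 0 for label in labels}
for p in percentages:
    index = min(int((p - 20) // 10), len(labels) - 1)
    freq[labels[index]] += 1

for label in labels:
    print(f"{label:>6}  {freq[label]}")
```

A different convention for scores that fall exactly on a class boundary would shift some counts between neighbouring classes.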
2.3 Descriptive statistics
The mean, standard deviation, skewness and kurtosis per item are shown in Table 1.
Table 1: Mean, Standard deviation, Skewness and Kurtosis per question
QUESTION            N   Sum    Mean   Std. Deviation  Skewness  Kurtosis
Q1                  25  21.00  .8400  .37417          -1.975    2.061
Q2                  25  22.00  .8800  .33166          -2.491    4.563
Q3                  25  17.00  .6800  .47610          -.822     -1.447
Q4                  25  12.00  .4800  .50990          .085      -2.174
Q5                  25  21.00  .8400  .37417          -1.975    2.061
Q6                  25  17.00  .6800  .47610          -.822     -1.447
Q7                  25  11.00  .4400  .50662          .257      -2.110
Q8                  23  12.00  .5217  .51075          -.093     -2.190
Q9                  25  13.00  .5200  .50990          -.085     -2.174
Q10                 24  8.00   .3333  .48154          .755      -1.568
Q11                 25  23.00  .9200  .27689          -3.298    9.641
Q12                 25  19.00  .7600  .43589          -1.297    -.354
Q13                 25  15.00  .6000  .50000          -.435     -1.976
Q14                 25  21.00  .8400  .37417          -1.975    2.061
Q15                 25  20.00  .8000  .40825          -1.597    .593
Q16                 24  22.00  .9167  .28233          -3.220    9.124
Q17                 24  15.00  .6250  .49454          -.551     -1.859
Q18                 25  8.00   .3200  .47610          .822      -1.447
Q19                 25  13.00  .5200  .50990          -.085     -2.174
Q20                 25  16.00  .6400  .48990          -.621     -1.762
Valid N (listwise)  22
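For a recoded item (1 for correct, 0 for incorrect), the per-item statistics in Table 1 can be checked by hand. The sketch below rebuilds the Q1 responses from its counts (21 correct out of 25) and assumes that the reported skewness is the adjusted Fisher-Pearson coefficient based on the sample standard deviation; that formula is an assumption, chosen because it reproduces the Q1 row. The function name `describe_item` is illustrative only.

```python
import statistics


def describe_item(responses):
    """Return (mean, sample SD, adjusted skewness) for a list of 0/1 scores."""
    n = len(responses)
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)  # sample standard deviation (n - 1)
    # Adjusted Fisher-Pearson skewness: n / ((n-1)(n-2)) * sum(z^3)
    skew = n / ((n - 1) * (n - 2)) * sum(((x - mean) / sd) ** 3 for x in responses)
    return mean, sd, skew


# Q1: 21 students correct, 4 incorrect (N = 25, Sum = 21 in Table 1).
q1 = [1] * 21 + [0] * 4
mean, sd, skew = describe_item(q1)
print(round(mean, 4), round(sd, 5), round(skew, 3))
```

Because the item is binary, its mean is the same quantity as the difficulty index p reported in Section 3.1.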
The mean percentage score is shown in Table 2, together with the median,
mode and standard deviation of the percentage scores.
Table 2: The mean, median, mode and standard deviation of percentages
Mean                65.64
Median              65.00
Mode                65.00
Standard deviation  21.60
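The summary statistics of the percentage scores can be recomputed from the #Corr and #Ans columns of Addendum A. This is a minimal sketch using Python's standard library rather than SPSS; it assumes the sample (n - 1) standard deviation, which matches the mean of 65.64 and standard deviation of 21.60 listed in Addendum B.

```python
import statistics

# (#Corr, #Ans) per student, in the order listed in Addendum A.
corr_ans = [
    (20, 20), (20, 20), (18, 20), (18, 20), (18, 20), (17, 20), (17, 20),
    (16, 20), (15, 20), (14, 20), (14, 20), (13, 20), (13, 20), (13, 20),
    (13, 20), (12, 20), (11, 20), (11, 20), (9, 17), (10, 20), (10, 20),
    (8, 20), (6, 19), (6, 19), (4, 20),
]

# Each percentage is based on the questions the student actually answered.
percentages = [100 * correct / answered for correct, answered in corr_ans]

summary = {
    "mean": statistics.mean(percentages),
    "median": statistics.median(percentages),
    "mode": statistics.mode(percentages),
    "stdev": statistics.stdev(percentages),  # sample standard deviation
}

for name, value in summary.items():
    print(f"{name:<7}{value:.2f}")
```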
3 ITEM ANALYSIS
3.1 Difficulty index
Illustrated in Table 3 are the p-values of each test item. The p-values indicate
the proportion of students who got the test items correct.
Table 3: Difficulty index
QUE  #Correct  #Answered  p     REMARKS
Q1   21        25         0.84  Unacceptable item
Q2   22        25         0.88  Unacceptable item
Q3   17        25         0.68  Acceptable item
Q4   12        25         0.48  Acceptable item
Q5   21        25         0.84  Unacceptable item
Q6   17        25         0.68  Acceptable item
Q7   11        25         0.44  Acceptable item
Q8   12        23         0.52  Acceptable item
Q9   13        25         0.52  Acceptable item
Q10  8         24         0.33  Acceptable item
Q11  23        25         0.92  Unacceptable item
Q12  19        25         0.76  Acceptable item
Q13  15        25         0.60  Acceptable item
Q14  21        25         0.84  Unacceptable item
Q15  20        25         0.80  Acceptable item
Q16  22        24         0.92  Unacceptable item
Q17  15        24         0.63  Acceptable item
Q18  8         25         0.32  Acceptable item
Q19  13        25         0.52  Acceptable item
Q20  16        25         0.64  Acceptable item
From the table, the p-values of Q1, Q2, Q5 and Q14 are greater than 0.80,
so these can be termed unacceptable test items. Q11 and Q16, with p-values
above 0.90, are very easy items and should not be reused in following
tests. All other test items are acceptable as their p-values fall between
0.20 and 0.80.
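The difficulty index is the number of correct responses divided by the number of responses given. The sketch below applies that ratio to the counts in Table 3 and labels each item with the 0.20 to 0.80 acceptance range used in this section. The function names are illustrative, not part of the SPSS analysis.

```python
# (#Correct, #Answered) per question, copied from Table 3.
counts = {
    "Q1": (21, 25), "Q2": (22, 25), "Q3": (17, 25), "Q4": (12, 25),
    "Q5": (21, 25), "Q6": (17, 25), "Q7": (11, 25), "Q8": (12, 23),
    "Q9": (13, 25), "Q10": (8, 24), "Q11": (23, 25), "Q12": (19, 25),
    "Q13": (15, 25), "Q14": (21, 25), "Q15": (20, 25), "Q16": (22, 24),
    "Q17": (15, 24), "Q18": (8, 25), "Q19": (13, 25), "Q20": (16, 25),
}


def difficulty_index(correct, answered):
    """Proportion of students answering the item who got it correct."""
    return correct / answered


def remark(p, low=0.20, high=0.80):
    """Label an item using the acceptance range applied in this report."""
    return "Acceptable item" if low <= p <= high else "Unacceptable item"


p_values = {q: difficulty_index(c, a) for q, (c, a) in counts.items()}
unacceptable = [q for q, p in p_values.items()
                if remark(p) == "Unacceptable item"]

for question, p in p_values.items():
    print(f"{question:<4} {p:.3f}  {remark(p)}")
```

Note that an item with p exactly equal to 0.80, such as Q15, falls inside the acceptable range under this rule.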
3.2 Discrimination index
The discrimination index measures the extent to which a test item
differentiates students who did well on the overall test from students who
did not. The discrimination indices determined are shown in Table 4.
Table 4: Discrimination index
QUE  #U (UPPER)  #L (LOWER)  D     REMARKS
Q1   12          9           0.23  Acceptable item
Q2   13          9           0.31  Acceptable item
Q3   13          4           0.69  Acceptable item
Q4   7           5           0.15  Unacceptable item
Q5   13          8           0.38  Acceptable item
Q6   11          6           0.38  Acceptable item
Q7   8           3           0.38  Acceptable item
Q8   10          2           0.62  Acceptable item
Q9   10          3           0.54  Acceptable item
Q10  8           0           0.62  Acceptable item
Q11  12          11          0.08  Unacceptable item
Q12  12          7           0.38  Acceptable item
Q13  11          4           0.54  Acceptable item
Q14  13          8           0.38  Acceptable item
Q15  12          8           0.31  Acceptable item
Q16  13          9           0.31  Acceptable item
Q17  10          5           0.38  Acceptable item
Q18  5           3           0.15  Unacceptable item
Q19  10          3           0.54  Acceptable item
Q20  10          6           0.31  Acceptable item
Although the discrimination indices of all the test items are positive,
which is desirable, Q4, Q11 and Q18 have discrimination indices below 0.20.
This indicates that these test items are poorly constructed and
unacceptable (Measurement and Evaluation Center, 2003).
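The report does not state the formula used for D. The values in Table 4 are, however, consistent with D = (U - L) / 13, where U and L are the numbers of correct responses in the upper and lower groups and 13 is the size of the upper group. The sketch below applies that inferred formula; the divisor and the names used are assumptions.

```python
UPPER_GROUP_SIZE = 13  # assumption inferred from the D values in Table 4

# (#U, #L) per question, copied from Table 4.
upper_lower = {
    "Q1": (12, 9), "Q2": (13, 9), "Q3": (13, 4), "Q4": (7, 5),
    "Q5": (13, 8), "Q6": (11, 6), "Q7": (8, 3), "Q8": (10, 2),
    "Q9": (10, 3), "Q10": (8, 0), "Q11": (12, 11), "Q12": (12, 7),
    "Q13": (11, 4), "Q14": (13, 8), "Q15": (12, 8), "Q16": (13, 9),
    "Q17": (10, 5), "Q18": (5, 3), "Q19": (10, 3), "Q20": (10, 6),
}


def discrimination_index(upper_correct, lower_correct,
                         group_size=UPPER_GROUP_SIZE):
    """Difference in correct counts between groups, scaled by group size."""
    return (upper_correct - lower_correct) / group_size


d_values = {q: discrimination_index(u, l) for q, (u, l) in upper_lower.items()}

# Items below the 0.20 threshold cited from the Measurement and
# Evaluation Center (2003) handout.
poor_items = [q for q, d in d_values.items() if d < 0.20]

for question, d in d_values.items():
    print(f"{question:<4} {d:.2f}")
print("Below 0.20:", poor_items)
```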
3.3 Item reliability
Cronbach’s alpha, which is the indicator of the overall test reliability, is shown in Table 5.
Table 5: Cronbach’s alpha
Cronbach's Alpha  Cronbach's Alpha Based on Standardized Items  N of Items
.804              .812                                          20
The high Cronbach’s alpha value of 0.804 (0.812 based on standardized
items) indicates that the overall test is reliable. Deleting a test item can
increase or decrease the Cronbach’s alpha. These changes are reflected in
Table 6.
Table 6: Cronbach’s alpha on deleting an item
QUEST  Scale Mean if Item Deleted  Scale Variance if Item Deleted  Cronbach's Alpha if Item Deleted  Comments
Q1     13.0455                     15.093                          .802                              Acceptable
Q2     13.0000                     14.571                          .791                              Acceptable
Q3     13.0909                     14.372                          .791                              Acceptable
Q4     13.3182                     15.846                          .821                              Unacceptable
Q5     12.9545                     15.474                          .804                              Acceptable
Q6     13.0909                     15.420                          .809                              Unacceptable
Q7     13.4091                     14.253                          .795                              Acceptable
Q8     13.3182                     14.513                          .799                              Acceptable
Q9     13.3182                     13.656                          .783                              Acceptable
Q10    13.5000                     13.405                          .777                              Acceptable
Q11    12.9545                     15.474                          .804                              Acceptable
Q12    13.0000                     15.333                          .804                              Acceptable
Q13    13.2273                     13.898                          .787                              Acceptable
Q14    12.9545                     14.617                          .789                              Acceptable
Q15    13.0909                     14.468                          .793                              Acceptable
Q16    12.9545                     14.617                          .789                              Acceptable
Q17    13.2727                     14.113                          .792                              Acceptable
Q18    13.5000                     14.833                          .804                              Acceptable
Q19    13.2727                     13.827                          .786                              Acceptable
Q20    13.1364                     14.504                          .795                              Acceptable
Q4 and Q6 showed an increase in the Cronbach’s alpha value if deleted. This
indicates that these questions need modification or deletion as test items
in order to improve the reliability of the test.
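Cronbach's alpha and the "alpha if item deleted" column can be sketched with the usual variance-based formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The code below is an illustration only: it uses a small hypothetical data set, not the test data, and it assumes every student has a response for every item. Students with missing responses would have to be excluded first, as in the listwise N of 22 reported in Table 1.

```python
import statistics


def cronbach_alpha(rows):
    """rows: one list of 0/1 item scores per student, all of equal length."""
    k = len(rows[0])
    item_variances = [statistics.variance(column) for column in zip(*rows)]
    total_variance = statistics.variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_variances) / total_variance)


def alpha_if_item_deleted(rows):
    """Alpha recomputed with each item left out in turn."""
    k = len(rows[0])
    return [
        cronbach_alpha([row[:i] + row[i + 1:] for row in rows])
        for i in range(k)
    ]


# Hypothetical data: 4 students, 3 items (for illustration only).
toy = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
]

alpha = cronbach_alpha(toy)
deleted = alpha_if_item_deleted(toy)
print(round(alpha, 3), [round(a, 3) for a in deleted])
```

An item whose deletion raises alpha above the full-test value is the kind of item flagged as unacceptable in Table 6.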
4 CONCLUSION
All test items discriminate well except for Q4, Q11 and Q18. Q1, Q2, Q5 and
Q14, with difficulty indices above 0.80, are quite easy test items and may
need a review. Q11 and Q16, with difficulty indices above 0.90, are very
easy items and should not be reused in subsequent testing. However, based
upon the Cronbach’s alpha values, all the test items can be considered to be
reliable and acceptable except for Q4 and Q6, which need modification or
deletion in order to increase the reliability of the test.
BIBLIOGRAPHY
Knoetze, J. (2007). Test data. Retrieved July 16, 2007 from
<http://www.jknoetze.co.za_2007/testdata.xls>
Measurement and Evaluation Center. (2003). Test Item Analysis & Decision
Making. The University of Texas at Austin. Retrieved July 16, 2007 from
<http://www.utexas.edu/academic/mec/research/pdf/itemanalysishandout.pdf>
Varma, S. (n.d.). Preliminary Item Statistics Using Point-Biserial Correlation
and P-Values. Educational Data Systems Inc., Morgan Hill, CA. Retrieved
July 16, 2007 from
<http://www.eddata.com/resources/publications/EDS_Point_Biserial.pdf>
ADDENDUM A
Coding and grouping of students
Key C B D D B C D A C B A C B D A A C D B C
St No Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q15 Q16 Q17 Q18 Q19 Q20 #Corr #Ans % Grp
11 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 20 20 100.00 U
16 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 20 20 100.00 U
2 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 18 20 90.00 U
3 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 18 20 90.00 U
25 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 18 20 90.00 U
13 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 17 20 85.00 U
20 1 1 1 1 1 1 1 1 1 0 0 0 1 1 1 1 1 1 1 1 17 20 85.00 U
14 0 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 0 0 1 1 16 20 80.00 U
5 1 1 1 0 1 1 0 1 1 0 1 1 1 1 1 1 0 0 1 1 15 20 75.00 U
4 1 1 1 0 1 1 0 1 1 1 1 1 0 1 0 1 1 0 0 1 14 20 70.00 U
12 1 1 1 1 1 1 1 0 0 0 1 1 0 1 1 1 1 0 1 0 14 20 70.00 U
8 1 1 1 0 1 1 0 0 0 0 1 1 1 1 1 1 1 0 1 0 13 20 65.00 U
9 1 1 1 0 1 1 1 0 0 0 1 1 1 1 1 1 1 0 0 0 13 20 65.00 U
18 1 1 0 1 1 0 1 0 0 0 1 1 0 1 1 1 1 0 1 1 13 20 65.00 L
23 1 1 1 0 1 1 0 0 0 0 1 1 1 1 1 1 1 0 1 0 13 20 65.00 L
10 1 1 0 0 1 1 1 0 0 0 1 0 0 1 0 1 1 1 1 1 12 20 60.00 L
21 1 0 1 1 0 1 0 0 1 0 1 1 0 1 1 1 0 0 0 1 11 20 55.00 L
22 1 1 0 0 1 1 0 0 0 0 1 1 1 1 0 1 0 1 0 1 11 20 55.00 L
17 1 1 0 0 1 0 1 0 1 1 0 1 1 1 0 0 0 9 17 52.94 L
6 0 0 1 1 0 1 0 0 1 0 1 1 0 1 1 1 0 0 0 1 10 20 50.00 L
7 0 1 0 0 1 1 0 0 0 0 1 1 1 1 0 1 0 1 0 1 10 20 50.00 L
15 0 1 1 1 1 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 8 20 40.00 L
1 1 1 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 6 19 31.58 L
24 1 1 0 0 0 0 0 0 0 1 0 0 0 1 1 1 0 0 0 6 19 31.58 L
19 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 4 20 20.00 L
Mean % 65.64, standard deviation 21.60
Upper 13, Lower 12
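The recoding, scoring and grouping summarised in this addendum can be sketched as follows. The answer key is the one listed above; the raw answer sheets are hypothetical, since the report lists only the recoded 1/0 values, and the function names are illustrative. The percentage is based on the questions a student actually answered, as in the #Ans column.

```python
KEY = list("CBDDBCDACBACBDAACDBC")  # answer key from Addendum A


def recode(answers, key=KEY):
    """Return 1 (correct) or 0 (incorrect) per question; None if left blank."""
    return [None if a is None else int(a == k) for a, k in zip(answers, key)]


def percentage(recoded):
    """Percentage correct out of the questions that were answered."""
    answered = [r for r in recoded if r is not None]
    return 100 * sum(answered) / len(answered)


def group_students(scores, upper_size):
    """Sort by score, descending; the top `upper_size` form group U, the rest L."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return {s: ("U" if i < upper_size else "L") for i, s in enumerate(ranked)}


# Hypothetical raw answer sheets (not taken from the test data).
sheets = {
    "S1": KEY,                                # every option correct
    "S2": ["A"] * 20,                         # option A throughout
    "S3": KEY[:10] + [None] * 2 + ["A"] * 8,  # two questions left blank
}

scores = {student: percentage(recode(answers))
          for student, answers in sheets.items()}
groups = group_students(scores, upper_size=2)
print(scores)
print(groups)
```

With the 25 students of this report, `upper_size=13` would give the 13/12 split shown above. Students with tied scores keep their input order, so a tie at the group boundary (as with the students on 65%) is split according to how the data were listed.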
ADDENDUM B
Difficulty index (#Corr, #Ans, p) and discrimination index (#U, #L, D)
QUE  #Corr  #Ans  p     #U  #L  D
Q1   21     25    0.84  12  9   0.23
Q2   22     25    0.88  13  9   0.31
Q3   17     25    0.68  13  4   0.69
Q4   12     25    0.48  7   5   0.15
Q5   21     25    0.84  13  8   0.38
Q6   17     25    0.68  11  6   0.38
Q7   11     25    0.44  8   3   0.38
Q8   12     23    0.52  10  2   0.62
Q9   13     25    0.52  10  3   0.54
Q10  8      24    0.33  8   0   0.62
Q11  23     25    0.92  12  11  0.08
Q12  19     25    0.76  12  7   0.38
Q13  15     25    0.60  11  4   0.54
Q14  21     25    0.84  13  8   0.38
Q15  20     25    0.80  12  8   0.31
Q16  22     24    0.92  13  9   0.31
Q17  15     24    0.63  10  5   0.38
Q18  8      25    0.32  5   3   0.15
Q19  13     25    0.52  10  3   0.54
Q20  16     25    0.64  10  6   0.31
Summary of percentage scores
M     65.64
MDN   65.00
MODE  65.00
STD   21.60
Frequency of percentage scores
%       FREQ
20-30   1
30-40   2
40-50   1
50-60   5
60-70   5
70-80   3
80-90   3
90-100  5