
Assessment & Evaluation in Higher Education
Vol. 34, No. 3, June 2009, 321–333

ISSN 0260-2938 print/ISSN 1469-297X online
© 2009 Taylor & Francis
DOI: 10.1080/02602930802071072
http://www.informaworld.com

Review and reward within the computerised peer-assessment of essays

Phil Davies*

University of Glamorgan, South Wales, UK

This article details the implementation and use of a ‘Review Stage’ within the CAP (computerised assessment by peers) tool as part of the assessment process for a postgraduate module in e-learning. It reports upon the effect of giving students a ‘second chance’ to mark and comment on their peers’ essays after viewing the peer-comments of other markers. It also discusses how a mark for performing the peer-marking process can be generated automatically to reflect the quality of the student’s marking and commenting of their peers’ work. Student feedback is presented to illustrate the effect that this additional stage of computerised peer-assessment has had upon the students’ learning, development and assessment.

Keywords: computer-aided assessment; peer-assessment; assessment; self-assessment

Background

As noted in Brown, Race, and Bull (1999, 1), ‘assessment is often regarded as the bane of academics’. Further to this, Bull and McKenna (2004, 16) note that ‘assessment is arguably the most critical activity in which students take part … yet often the one in which they have least control’. Peer-assessment can be introduced as a way of addressing both points, in that it has often been seen as a way of involving students more in the development of assessment, with the by-product of reducing tutor marking. The use of peer-assessment is not new to higher education (Boud, Cohen, and Sampson 1999; Dochy, Segers, and Sluijsmans 1999; Falchikov and Goldfinch 2000) and its merits and demerits have been argued over for many years (Topping 1998; Falchikov 2005, 151–67). The use of ICT to ‘support the process’, however, is still in its relative infancy, with the first systems appearing from the late 1990s onwards (Davies 2000; Bhalerao and Ward 2000; Bostock 2001; Parsons 2003; De Volder et al. 2007). The tools developed in this period have tended to replicate the ‘traditional’ peer-marking process rather than attempt to embed ICT into the process and develop the possible functionality.

During this period the CAP (computerised assessment by peers) system (Davies 2000) has been developed as a tool to support the peer-assessment of both essays and multimedia presentations. It has evolved from a basic marking tool that replicates traditional peer-assessment (Davies 2000) to include anonymous communication between marker and marked (Davies 2003), menu-driven comments and weightings that take into account the subjectivity of the marker, and the automatic creation of a mark for marking (Davies 2005). Throughout the various development stages of this system, the importance of feedback and the quality of comments (Davies 2004, 2006) have been emphasised as being of great value to the owner of the essay.

*Email: [email protected]


In order that a fair mark is produced for the essays, a unique compensation process has been developed that automatically adjusts each marker’s marks in order to produce a compensated peer-mark that acts as the final grade for a particular essay (described in greater detail later in this article).

From past student questionnaires collected after use of the CAP system, two major concerns have often been reported, namely:

(1) whether they have been able to maintain consistency throughout the peer-marking process; and

(2) whether they have performed the ‘task’ well compared with other students in the group.

This article reports upon the development of the CAP system to include a ‘Review Stage’ that permits students to amend their marks and/or comments for a particular essay after reviewing their previous marking of it (addressing (1) above). During this process they were also permitted to view the comments of the peers who had also marked that essay (addressing (2) above).

Assessment process

As part of their coursework assessment within the postgraduate e-learning module, a cohort of students (13 students were initially enrolled on the module) were asked to produce an individual essay in the form of a fully referenced document that explained how to develop ‘a distance learning PowerPoint presentation to teach 10 year olds something of a technical nature’. This report was to be addressed at the level of their peers, and it was suggested that it be a maximum of three pages plus references. It was also requested that the main source of referencing be the Web (although some books and journals were to be expected). The reason for this was that, in the peer-marking timescale permitted, it would be difficult for a marker to locate book and journal references, whereas the CAP system supports an embedded Web browser, making it easy to judge the relevant research undertaken by the essay’s author. The students, having been fully briefed in class concerning the various stages of the assessment process, were given two weeks to research, develop and submit the essay. Having submitted the essay, the students then moved on to the ‘marking’ aspect of the assessment.

Setting the weighted comments bank

Prior to the self- and peer-marking stages of the assignment, the students were asked to develop an appropriate bank of comments that they could use within the 10 categories present within the CAP menu-driven marking system, namely: readability, aimed at correct level, personal conclusions, referencing, research and use of Web, content and explanations, examples and case studies, overall report quality, introduction and definitions, and report presentation and structure. Before the assessment was undertaken, the students were offered the opportunity to replace some of these categories and to suggest suitable marking criteria for this particular assignment via a discussion board set up for the purpose. Through this discussion board it was decided to leave the commenting categories as in the past. The weighting of marks was also discussed, and this resulted in the marking criteria categories being: research shown 40%, explanations 30%, readability and structure 20% and aimed at correct level 10%.


It should be noted that the importance of certain comments would have a different ‘value’ to the students marking their peers’ essays. This is of course a similar scenario to tutors, where a certain feedback comment would have greater importance to one tutor than to another. Therefore, to include some form of subjectivity within the comments being presented, each of the menu comments (both positive and negative) within the previously named categories was assigned a weighting between 1 (low importance) and 5 (high importance). To do this the students made use of the Comments and Weightings setting application (Figure 1) to set comments that they felt suitable for their marking, including weightings per comment to reflect the personal importance of their commenting. This is described in more detail in Davies (2005).
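
As an illustration only (the article does not describe how the CAP system stores its comment bank internally), a marker’s weighted comments could be sketched as a mapping from category to positive and negative menu comments, each carrying a 1–5 importance weighting. The category names below come from the article; the comment texts are invented examples:

```python
# Hypothetical sketch of one marker's comment bank: each menu comment carries a
# polarity ("+" or "-") and a personal importance weighting from 1 (low) to 5 (high).
# The comment texts are invented for illustration; only the categories are from the article.
comment_bank = {
    "Referencing": [
        ("Sources are cited consistently throughout", "+", 4),
        ("Several Web sources are not referenced", "-", 5),
    ],
    "Readability": [
        ("Clear, well-structured paragraphs", "+", 3),
        ("Frequent typographical errors", "-", 2),
    ],
    "Aimed at correct level": [
        ("Pitched appropriately at peers", "+", 3),
        ("Assumes too much technical knowledge", "-", 4),
    ],
}
```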

Prior to undertaking the peer-assessment aspect of this assignment, the students were instructed to use the CAP system to self-assess their own work (Figure 2). This is an aspect of assessment that students in the past have found extremely difficult. The mark generated by this self-assessment process is not necessarily of great importance with regard to the outcomes of this assignment; however, performing this aspect of assessment has been reported to provide a means of the students:

● getting used to the menu-driven computerised assessment system (CAP);
● having a way of creating a standard for themselves that they can use throughout the peer-marking process; and
● identifying any small errors (such as a mistyped reference) that they could ask the tutor to amend prior to the peer-assessment taking place.

Figure 2. CAP self- and peer-marking application.

The students were then given a week to perform the peer-marking process using the CAP marking system (Figure 2). During this period they were expected to mark at least six of their peers’ essays, with these essays being randomly allocated to the marking student via the server aspect of the CAP system.

Figure 1. Application for the setting of comments and weightings.


Having completed the peer-marking process, the students were then given a week to make use of the new review functionality added to the CAP system (Figure 3), which permitted them to view the comments of their peers concerning the essays that they had previously marked themselves. Clicking button B displays an essay marked by this marker, together with the original menu-driven comments, free-text comments and the marks in each category. Clicking button A displays the comments (not the marks) of another marker; this step may be repeated to view all of the comments for this particular essay. If markers wish to amend their original marks or comments (both menu-driven and free-text) then they may do so. Having made any modifications, button C is pressed to save the amended marking for this essay. As this is a dynamic system, markers are able to return to this aspect of the marking at any time during the week to view or amend their own comments, whilst also viewing the amended comments of other markers.

Having completed the Review Stage, the students were initially permitted, using another function of the CAP tool, to view the comments of their peers with regard to their own submitted essays. They were then requested to submit a reflective self-assessment grade for their work. Having completed this task, they were allowed to view both the peer comments and the marks that they had been awarded for their essays. They were shown the median-derived peer-mark for their essay, not the compensated peer-mark that would represent the final grade awarded for it. In order to generate this compensated average peer-mark for an essay, the possibility that a marking student is a ‘hard’ or ‘easy’ marker (often mapping to personal expectations) has to be taken into account. It would be unfair (unfortunate) from a student’s perspective to be peer-marked by six hard markers while another student was marked by six easy markers. In order to provide some form of compensation, each marker has to be judged with regard to their average over- or under-marking. Each essay therefore needs a provisional average grade produced for it (the median is deemed to be a fairer reflection than the mean).


Having created this provisional grade, each marker’s marks are compared against the average marks for the essays they marked, and an over- or under-average ‘mark difference’ is created for that marker. The essays marked by this student are then adjusted by this mark difference, and a compensated peer-mark is generated for each essay. The final peer-mark produced for an essay therefore compensates for the ‘bias’ shown by each marker. In addition to this grade for their essay, students were allocated a mark for the consistency shown in the peer-marking process that they had performed. On completion of the assignment they were given a questionnaire asking them to comment on how they had found the overall assessment process.
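
As a rough sketch only (the article describes the compensation process in prose rather than as an algorithm), the calculation can be read as: take the median of the raw peer-marks as each essay’s provisional grade, measure each marker’s average deviation from those provisional grades, subtract that deviation from the marker’s marks, and re-aggregate per essay. The function below assumes the adjusted marks are re-aggregated with the median, which the article does not state explicitly:

```python
from statistics import median

def compensated_peer_marks(markings):
    """markings: dict mapping (marker, essay) -> raw mark awarded."""
    essays = {e for (_, e) in markings}
    markers = {m for (m, _) in markings}

    # Provisional grade per essay: the median of its raw peer-marks.
    provisional = {e: median(mark for (_, e2), mark in markings.items() if e2 == e)
                   for e in essays}

    # Each marker's habitual over-/under-marking relative to the provisional grades.
    bias = {}
    for m in markers:
        diffs = [mark - provisional[e] for (m2, e), mark in markings.items() if m2 == m]
        bias[m] = sum(diffs) / len(diffs)

    # Remove each marker's bias, then re-aggregate per essay (median assumed here).
    adjusted = {(m, e): mark - bias[m] for (m, e), mark in markings.items()}
    return {e: median(mark for (_, e2), mark in adjusted.items() if e2 == e)
            for e in essays}
```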

Results

Of the 13 students who initially undertook this assessment process, two did not fully complete the peer-marking process as requested. The results of these students have nevertheless been included, as their essays were peer-marked. In total there were 76 markings, with an average time per marking of 42 minutes (ranging from three to 72 minutes). This is not an exact record of the ‘active’ marking time but the overall time from the downloading of an essay to the actual submission of the mark and comments. With regard to the menu comments, on average 15.7 were produced for each essay marking. The average compensated peer-mark produced for the essays prior to the Review Stage was 60.15%, and post the Review Stage it was 59.69% (indicating the overall minor effect that the Review Stage had on the marks produced for the essays).

Figure 3. Review application permitting viewing of peer-commenting.


In past uses of the CAP system, this compensation process has not had a major influence upon the final grades produced, but it certainly does allay the fears of students with regard to being ‘fairly’ graded for their essays and not being disadvantaged by particular markers. In past uses of the CAP system particular emphasis has also been placed on the quality of the comments relative to the actual marks awarded. Robinson (2002) suggests that in her study the majority of peer-feedback was of highly variable quality.

Table 1 shows the correlation between the compensated peer-marks (i.e. the final average compensated mark produced for an essay, taking into account high and low markers) and the average feedback indexes for these essays (i.e. the average number of positive and negative menu-driven comments produced per essay by the peer-markers). As in past uses of CAP, there is a significant positive correlation between the comments and the marks provided (i.e. an essay with a good average mark will have a concomitant number of positive comments and, likewise, an essay with a poor mark will have numerous negative comments). This reflects well upon the students providing comments that reflect the quality of the essays. Because of this, a student should be able to form a very good idea of the quality of his/her own work merely by viewing the peer-comments, without relying upon a mark.
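
Reading the description above, an essay’s feedback index appears to be the net count of positive over negative menu comments, averaged over the markers of that essay; the sketch below works on that assumption, as the article does not give the exact formula:

```python
def feedback_index(comment_counts):
    """comment_counts: list of (positives, negatives) menu-comment counts,
    one pair per peer-marking of a single essay."""
    # Net positive comments per marking, averaged over the essay's markers
    # (assumed interpretation of the 'feedback index' reported in Table 1).
    return sum(pos - neg for pos, neg in comment_counts) / len(comment_counts)

# Example: three markers leave (5, 1), (6, 2) and (4, 3) positive/negative comments.
print(feedback_index([(5, 1), (6, 2), (4, 3)]))  # 3.0
```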

As mentioned previously, the students were requested initially to self-assess their work (Table 4, Column B) and then, at the end of the assessment process, having seen only the comments of their peers, to email the tutor a reflective self-assessment mark (Table 4, Column C). The average self-assessment mark produced was 68.33%. Having performed the peer-marking process and having been given the opportunity to view the comments of their peers concerning their own essays, the average reflective self-assessment mark produced was 64.63%. Of the 13 students undertaking the assessment process, eight reduced their self-assessment, three remained the same and two did not provide the self-assessment marks as requested. The ‘raw’ median mark was produced for each essay (Table 4, Column D), the average mark for the essays being 60.83%. As mentioned previously, there is a need to develop a ‘fair’ mark that takes into account high and low markers.

Table 2. Mark changes performed at Review Stage.

Size of mark change:   −8  −7  −6  −5  −4  −3  −2  −1    +1  +2  +3  +4  +5  +6  +7  +8  +9
Frequency of change:    1   3   2   3   2   2   4   2     1   1   0   0   0   2   1   1   1

Largest individual changes:
  −8: 67→59                      +9: 30→39
  −7: 73→67, 67→60, 60→53        +8: 58→66
  −6: 73→67, 52→46               +7: 79→86;  +6: 14→20, 40→46

Table 1. Peer-generated mark and feedback index mappings.

Feedback index            +7  +6  +5  +4  +3  +2  +1  +0  −0  −1  −2  −3  −4
Compensated essay grades  81  68  62  72  53  61  52  43  43  73  66  51  51


Having compensated their marks by this average difference, a compensated average peer-mark was produced (Table 4, Column E), the overall average being 60.15%.

Having completed the peer-marking process, the students were permitted to ‘review’ their marks and comments whilst viewing the comments provided by the peers who had also marked the same essays (as previously shown in Figure 3). This offered the students the opportunity to modify their initial markings, having reflected upon others’ comments and having marked a range of essays themselves. Of the 76 initial markings that took place, 41 (54%) were ‘replaced’, with 26 (34%) changing the original mark given. On checking these results in more detail, it was noted that only 33 of the 41 ‘replaced’ markings actually contained any changes. This appears to indicate that some students, on finishing viewing their previous markings, clicked the amend button on the review system even though they had not actually made any amendments. Two of the students who undertook the marking ‘replaced’ all of their markings in some manner. Table 2 shows the frequency of the mark changes that occurred during the Review Stage. Of the 26 mark changes made, 19 reduced the original mark whilst seven increased it. The most significant mark amendments are also presented, indicating the ranges of these changes.

Following the Review Stage, the raw average mark for the essays was 59.69% and the compensated average mark was also 59.69% (Table 4, Columns F and G). This indicates that, after the review process, the average median mark shown to a student before compensation is the same as after compensation; individual differences remain, but they average out over all of the results. Of the final changes resulting from the Review Stage, five students improved their average mark, seven were reduced and one remained the same. The Compensation Stage therefore does not have a significant effect upon a student’s grade, but it does address the concern over high and low markers. As noted previously in this article, there is also a need to produce a mark that reflects the quality (consistency) shown by the marker in producing not just marks but comments that reflect the quality of an essay. In judging the consistency shown by a marker, it is the deviation from the marker’s own average over- or under-marking that is measured. If a student is consistently accurate, or consistently over- or under-marks relative to the average, this shows good evaluative skills and consistency; hence a low consistency index reflects a good evaluative marker. For example, if a marker on average over-marks by 10%, then consistently over-marking by that 10% shows good evaluative judgement. If the mark produced for an essay by the marker is 75% and this essay was actually worth 60%, then an over-marking of 15% has occurred; since the marker is expected to over-mark by 10% on average, a consistency difference of five is produced. If, for example, the student had provided a mark of 50% for this particular essay, then the consistency difference would have been 25 (taking into account the expected average over-marking). By summing and averaging these consistency differences, a numeric value can be calculated that represents the marker’s consistency. Having produced this ‘value’ for marking consistency, it must be mapped to an actual percentage grade (discussed later). Table 3 presents the average mark differences and consistencies produced.
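
As a minimal sketch of the consistency calculation described above (assuming the ‘correct mark’ is the mark finally produced for each essay, and that deviations are taken as absolute values before averaging, neither of which the article states outright):

```python
def marking_consistency(marker_marks, essay_marks):
    """marker_marks[i]: mark this marker gave essay i.
    essay_marks[i]: the mark finally produced for essay i."""
    diffs = [m - e for m, e in zip(marker_marks, essay_marks)]
    avg_diff = sum(diffs) / len(diffs)            # habitual over-/under-marking
    # Average absolute deviation from the marker's own habitual difference:
    # low values indicate a consistent (good) evaluative marker.
    consistency = sum(abs(d - avg_diff) for d in diffs) / len(diffs)
    return avg_diff, consistency

# Worked example from the text: a marker who on average over-marks by 10
# and gives 75 to an essay worth 60 contributes a consistency difference of 5.
```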

In developing a mapped percentage for the marking consistency, it was decided, in order to compare ‘like with like’ for this particular cohort, to map the consistency values against the actual final compensated peer-generated marks produced for the essays (Table 4, Column G). In this way the ‘range of abilities’ of the students was used as a boundary for the percentage grade awarded. For this group the average essay grade produced was 60%, with a range of 81% to 42%, giving 21 percentage points above and 18 below the average.


With regard to the mark consistencies produced, the average was 4.87, with a range of 2.31 to 10.78 (bearing in mind that a low score is good and a high score is poor with regard to marking consistency). The resulting range for a ‘good’ student below the average was 2.56 points, and for a ‘poor’ student above the average, 5.91 points. Mapping a good student’s marking consistency to a good essay therefore gives 21 (i.e. 81 − 60) / 2.56 (i.e. 4.87 − 2.31) = 8.2% to be added to the essay average mark of 60% for every consistency point below the average. Similarly, mapping a poor student’s marking consistency to a poor essay gives 18 (i.e. 60 − 42) / 5.91 (i.e. 10.78 − 4.87) = 3.05% to be subtracted from the average essay mark of 60% for every consistency point above the average.

Suppose a student has a mark consistency value of 5.9. This is above the average mark consistency of 4.87 and therefore indicates a below-average marking performance. The percentage grade for this marking is the essay average (60%) minus the difference between the student’s mark consistency (5.9) and the average mark consistency (4.87), i.e. 1.03, multiplied by the weighting for a poor result (3.05%). The mark awarded to this student is therefore 60% − (1.03 × 3.05)% = 60 − 3.14 = 56.86%. This method is clearly ‘raw’, and it illustrates the difficulty of mapping an actual percentage grade that ‘rewards’ the marking process in a qualitative manner.
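
The scaling just described can be written as a small function; the boundary values below (average essay grade, best and worst essay, best and worst consistency) are those reported for this cohort, and the worked example reproduces the 56.86% figure from the text:

```python
def grade_for_marking(consistency, avg_consistency=4.87, avg_essay=60.0,
                      best_essay=81.0, worst_essay=42.0,
                      best_consistency=2.31, worst_consistency=10.78):
    # Percentage points of essay-grade range per consistency point,
    # above and below the cohort average.
    reward = (best_essay - avg_essay) / (avg_consistency - best_consistency)    # ~8.2
    penalty = (avg_essay - worst_essay) / (worst_consistency - avg_consistency)  # ~3.05
    diff = consistency - avg_consistency
    if diff <= 0:
        # Better (lower) consistency than average: add to the average essay mark.
        return avg_essay - diff * reward
    # Poorer (higher) consistency than average: subtract from the average essay mark.
    return avg_essay - diff * penalty

# Worked example from the text: a consistency of 5.9 maps to about 56.86%.
print(round(grade_for_marking(5.9), 2))
```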

If we consider what makes a traditional tutor a ‘good’ assessor of work, we would expect the marks and comments he/she produces to be fair and to reflect consistently the quality of the work being assessed. Similarly, we would expect a good peer-assessor to show similar skills. Therefore, applying the scales relating to consistency (explained previously), we are able to produce a grade that reflects the quality shown by the student in performing the marking and commenting processes in a consistent manner. Table 4, Column H shows the marking consistency indexes for the students. These are compared with the expected average index of 5.81 to produce the differences of mark consistencies from the average (Table 4, Column I). The percentage grade for marking is then calculated using the scales explained above, as shown in Table 4, Column J.

Table 3. Results of student marking consistencies both pre- and post-Review Stage.

                     Pre-Review Stage                 Post-Review Stage
Student number   Average mark    Mark             Average mark    Mark
                 difference      consistency      difference      consistency
 1                  −5.17          6.85              −6.00          5.59
 2                  −1.14          5.49               0.14          4.55
 3                   0.83          2.55               0.50          2.31
 4                  12.00          5.54              11.14          4.17
 5                   –             –                  –             –
 6                   6.67          7.46               7.00          7.59
 7                 −19.25         11.19             −18.25         10.78
 8                 −11.67          8.58              −9.83          4.54
 9                   –             –                  –             –
10                   7.83          6.14               8.00          5.99
11                   4.67          4.16               2.17          3.31
12                  −4.83          8.41              −3.67          6.67
13                  −1.00          2.95               0.83          3.73


Table 4. Student results for all stages of the assessment process.

Column key: A Student #; B Self-assessment; C Reflective self-assessment; D Raw peer-mark; E Compensated peer-mark; F Post-review raw peer-mark; G Post-review compensated peer-mark; H Mark consistency; I Consistency difference from average; J % grade for marking; K Average feedback difference; L Feedback consistency; M Consistency difference from average; N % grade for commenting; O % overall consistency; P Final grade (60/15/15/10).

 A    B   C   D   E   F   G      H      I      J     K      L      M      N     O     P
 1   83  83  70  73  67  72%    5.59  +0.22   59%  −2.10   3.05  +0.25   59%   80%   69%
 2   60  60  69  69  69  68%    4.55  −0.82   65%   0.75   2.53  −0.27   63%   72%   67%
 3   63  62  54  53  54  53%    2.31  −3.06   81%   4.43   2.94  +0.14   59%   43%   57%
 4   63  62  47  51  47  51%    4.17  −1.20   68%  −1.30   2.21  −0.59   65%   68%   57%
 5    –   –  45  43  45  42%     –      –      –     –      –      –      –     –     –
 6   70  68  49  55  46  52%    7.59  +2.22   53%   4.38   1.59  −1.21   75%   43%   55%
 7   57  57  64  64  61  62%   10.78  +5.41   43%  −6.00   7.23  +4.43   43%   81%   58%
 8   71  60  86  81  86  81%    4.54  −0.83   65%  −2.29   2.61  −0.19   62%   68%   75%
 9   70   –  56  51  56  51%     –      –      –     –      –      –      –     –     –
10   69  65  61  62  61  61%    5.99  +0.62   58%   1.89   3.47  +0.67   57%   79%   62%
11   79  74  66  71  68  73%    3.31  −2.06   74%  −0.62   1.07  −1.73   81%   58%   73%
12   66  55  48  44  48  43%    6.67  +1.30   56%   0.44   2.41  −0.39   64%   56%   49%
13   69  65  70  65  68  66%    3.73  −1.64   71%  −2.05   1.72  −1.08   73%   74%   69%


It should also be remembered that the marker is not being rewarded just for showing consistency in marking but also for consistency in commenting. A similar calculation to that for the marks is performed to produce the feedback consistency indexes (Table 4, Column L), their differences from the average (Table 4, Column M) and then the percentage grades for commenting consistency (Table 4, Column N). These two grades represent the quality of the student’s marking and commenting consistency. However, it is also appropriate that a student be rewarded for doing ‘well’ at both marking and commenting, so a third mark is produced that reflects the consistency shown in both aspects of assessing. Again this uses the essay grade ranges (Table 4, Column G) as boundaries for the allocation of a percentage grade (Table 4, Column O).

Having produced the various grades for the students’ work, a final grade for the overall assignment needs to be produced. It was decided to create the final grade using the following ratios: essay (60), mark for mark consistency (15), mark for comment consistency (15) and mark for showing consistency in producing both marks and comments (10) (Table 4, Column P).
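
The 60/15/15/10 ratio is a straightforward weighted sum; the sketch below reproduces student 1’s final grade from Table 4 (72, 59, 59 and 80 combining to roughly 69%):

```python
def final_grade(essay, marking, commenting, overall_consistency):
    # Weighted combination used for the overall assignment grade (60/15/15/10).
    return (0.60 * essay + 0.15 * marking +
            0.15 * commenting + 0.10 * overall_consistency)

# Student 1 in Table 4: 0.6*72 + 0.15*59 + 0.15*59 + 0.1*80 = 68.9, reported as 69%.
print(final_grade(72, 59, 59, 80))
```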

Comparing the various grades produced for the students, no real trends can be identified between a student’s ability in marking and in producing an essay. The correlation between the marking and commenting consistency grades awarded was 0.49. The correlations between the grades awarded for marking and commenting consistency and the essay marks were 0.17 and 0.05 respectively (low correlations: just because a student writes a good essay, it does not necessarily follow that he/she is a good marker). An average mark for all three aspects of the marking was then mapped against the final essay grade achieved, producing a correlation of 0.56. Finally, comparing the final grade produced for the assignment (60/15/15/10) with the essay grade on its own produced a high positive correlation of 0.85 (a significant positive correlation between the overall assessment outcome and the actual essay grade).

Student comments

On completion of the assessment process the students were asked to fill in a free-text response form giving their feedback on the use of the Review Stage of the CAP system and their overall feelings concerning the use of peer-assessment within their course. None of the students had used peer-assessment as a method of assessment in their previous education. With regard to self-assessment, they on the whole found it a very difficult thing to do; however, a number reported that it had helped them considerably in promoting their critical awareness and in thinking about how to assess others. The creation of the weighted comments database was completely new to all of the students, and they generally found it hard to know what comments to include before having marked any essays. However, a number noted how beneficial including a weighting for the various comments had been in clearly defining what they were going to look for in the essays. A couple of students suggested that a dummy marking using their own comments might have helped. With regard to the use of peer-assessment, the students found it a very time-consuming process; it is therefore important that, in developing the class assessment schedule, the time to mark and review is estimated and included within the timescales. This reflects past comments from students who have used this tool and clearly supports the need to provide a qualitative mark that reflects the skills of the students in performing the marking process. On the whole the students found the experience of using peer-assessment very positive and interesting and felt it had helped their development in the subject area. A number of students expressed their relief that the process was anonymous.


They felt that, had this not been the case, they would have found great difficulty in fairly assessing their peers. The introduction of the Review Stage was felt to be a good enhancement to the system, with a number of students liking the opportunity to reconsider their marks and comments and to see what others thought of the essays they had marked. The aspect of the assessment process that triggered the most comments was the provision of a mark for marking. Initially the students had found it difficult to understand how they would be judged. However, they noted that as they progressed through the marking process they became more aware of the need to remain consistent in their judgements, as this would provide them with a better mark. All of the students were supportive of the need to provide a fair mark that reflected their marking. One student suggested that part of the assessment process could include a stage where the owners of the essays themselves provided a mark for the marking they received. This will be looked at in the future.

Conclusions

At this early stage in the development of this additional Review Stage to the computerised peer-assessment process (and with the limited sample size), it would be inappropriate to draw any major conclusions concerning the effect that it could have upon peer-marking in general. Initially the results appear to indicate that the Review Stage does not have a major effect upon the peer-marks produced (i.e. viewing the comments of peers who have also marked a particular essay has not resulted in a closer correlation of the marks produced); thus the need for the compensation process remains in generating a peer-mark that reflects the quality of an essay. At the outset of this study the author had mixed feelings concerning the possible outcomes of introducing this Review Stage. In past uses of the CAP marking system, students have requested the opportunity to re-assess their original markings. The inclusion of this extra stage had been avoided because it was felt that it would result in students not setting their criteria for peer-marking clearly before marking, knowing they would have a ‘second chance’. The preliminary results from this study appear to indicate that the students, even though they knew that this second chance would be available, took every care in their original marking (mainly because they knew they would be allocated a grade for performing this marking in a qualitative manner). The mark changes were relatively minor and appear to have had little bearing on the overall results produced.

It was noted earlier in this article that students who had performed peer-assessment in the past had been concerned that they did not know whether they had been consistent in performing the peer-marking and peer-commenting, particularly compared with their peers. The inclusion of the review functionality in the CAP system has met with the general approval of the students with regard to these concerns, in that it has provided them with an opportunity to get a realistic appraisal of how their peer-assessment of the essays compared with others within their group. Again it must be noted that this addition to the peer-marking process has resulted in an increase in the assessment timescales, and as such great care has to be taken in mapping an appropriate reward for the additional effort expected from the students. This, as in past uses of the CAP system, has to be mapped to the quality of the process, not just to the time taken.

Probably the main outcome of this peer-assessment procedure is the necessity for it to be supported by ICT. This assessment procedure has become one of computerised assessment (CA) rather than computer-aided assessment (CAA). The subjectivity of the tutor has been removed from the marking process and replaced by that of the students themselves.


The mark for marking is raw; however, the students report that they like the idea of being rewarded appropriately for the quality (consistency) of their marking. All of the mathematical calculations are easily achieved via software enhancements to the server aspect of the CAP tool.

In providing a mark for marking in the future there may be a need to look at the constructiveness of the actual menu comments and free-text comments provided; however, this would require tutor intervention. The significant positive correlation between the amalgamated grade produced by the various stages of the assignment and the grade for the essay itself could be taken to question the need for all of the effort required in peer-marking. However, the assessment is about much more than merely producing an essay; it should be recognised as a method of developing student reflection and critical skills. In using the CAP system the students not only ‘develop’ their understanding and knowledge in a particular subject area by peer-marking, but also have the incentive of being rewarded in a qualitative manner for displaying evaluative higher-order skills.

Notes on contributor

Phil Davies has been a lecturer at the University of Glamorgan since 1985. Over the past eight years he has been actively researching, developing and presenting in the area of CAA. He has to date developed three client–server network-based systems as tools to aid with the assessment process: one for multiple-choice questions, one for confidence testing and a third for peer-assessment. He has been an invited keynote speaker at a number of events in the area of CAA and in 2007 was awarded his PhD by Publication in the area for the inclusion of subjectivity within CAA development.

References

Bhalerao, A., and A. Ward. 2001. Towards electronically assisted peer assessment: A case study. Association of Learning Technology Journal 9, no. 1: 26–37.
Bostock, S. 2000. Student peer assessment. http://www.keele.ac.uk/depts/aa/landt/lt/docs/bostock_peer_assessment.htm.
Boud, D., R. Cohen, and J. Sampson. 1999. Peer learning and assessment. Assessment & Evaluation in Higher Education 24, no. 4: 413–26.
Brown, S., P. Race, and J. Bull. 1999. Computer-assisted assessment in higher education. London: SEDA.
Bull, J., and C. McKenna. 2004. Blueprint for computer-assisted assessment. London: RoutledgeFalmer.
Davies, P. 2000. Computerized peer-assessment. Innovations in Education & Teaching International 37, no. 4: 346–55.
Davies, P. 2003. Closing the communications loop on the computerized peer assessment of essays. Association of Learning Technology Journal 11, no. 1: 41–54.
Davies, P. 2004. Don’t write just mark: The validity of assessing student ability via their computerized peer-marking of an essay rather than their creation of an essay. Association of Learning Technology Journal 12, no. 3: 263–79.
Davies, P. 2005. Weighting for computerized peer-assessment to be accepted. In Proceedings of the 9th Annual International CAA Conference, ed. M. Danson, 179–92. Leicestershire: Loughborough University.
Davies, P. 2006. Peer-assessment: Judging the quality of student work by the comments not the marks? Innovations in Education & Teaching International 43, no. 1: 69–82.
De Volder, M., M. Rutjens, A. Slootmaker, H. Kurvers, M. Bitter, R. Kappe, H. Roossink, J. de Goeijen, and H. Reitzema. 2007. Espace: A new web-tool for peer assessment with in-built feedback quality system. http://www.cluteinstitute.com/Programs/Hawaii-2007/Article%20172.pdf.
Dochy, F., M. Segers, and D. Sluijsmans. 1999. The use of self-, peer- and co-assessment in higher education: A review. Studies in Higher Education 24, no. 3: 331–50.


Falchikov, N. 2005. Improving assessment through student involvement: Practical solutions for aiding learning in higher and further education. London: RoutledgeFalmer.
Falchikov, N., and J. Goldfinch. 2000. Student peer-assessment in higher education: A meta-analysis comparing peer and teacher marks. Review of Educational Research 70, no. 3: 287–322.
Parsons, R. 2003. Self, peer and tutor assessment of text online: Design, delivery and analysis. In Proceedings of the 7th International CAA Conference (July), ed. J. Christie, 315–26. Leicestershire: Loughborough University.
Robinson, J. 2002. In search of fairness: An application of multi-reviewer anonymous peer-review in a large class. Journal of Further and Higher Education 26, no. 2: 183–92.
Topping, K. 1998. Peer assessment between students in colleges and universities. Review of Educational Research 68, no. 3: 249–76.
