
Peer Appraisals: Differentiation of Individual Performance on Group Tasks


Human Resource Management, Winter 2001, Vol. 40, No. 4, Pp. 333–345 © 2001 John Wiley & Sons, Inc.

John A. Drexler, Jr., Terry A. Beehr, and Thomas A. Stetz

The use of peer appraisals has been widely acclaimed, but how useful are they really? Student groups made non-anonymous ratings of peer performance on two group tasks, and the ratings contributed to the students’ course grades. Groups differentiated very little among peers in their performance ratings. Individuals in non-differentiating groups reported more positive distributive and procedural justice, satisfaction, and turnover intent than did individuals in differentiating groups. In differentiating groups, no differences in attitudes were found between individuals who were differentially rewarded or penalized for their performance. Implications for peer appraisal practice and future research are discussed. © 2001 John Wiley & Sons, Inc.

Introduction

In the 1990s, many articles called for expanding input for employee performance appraisals from traditional supervisor ratings to multiple ratings from a variety of sources. This process is variously called 360-degree appraisal, multi-source appraisal, and multi-rater appraisal. All of these methods involve collecting information from several sources knowledgeable about the performance of the individual being rated. For a manager, the additional sources of observations may include self-ratings, peer and subordinate ratings, and internal or external customer ratings (London & Smither, 1995; Tornow, 1993). Thus, these appraisal techniques attempt to collect observations from “key constituencies representing the full circle of relevant viewpoints” (London & Smither, 1995, p. 803).

The present study investigated peer appraisals, probably the most commonly used source of performance ratings other than those of supervisors. It explores the extent to which peers differentiate in ratings of their co-workers on project teams and some attitudinal correlates of such differentiation or non-differentiation. These attitudes indicate the degree of acceptance of peer evaluation or appraisals, and such acceptance is considered important for successful inclusion of evaluations by people other than supervisors (e.g., Waldman & Bowen, 1998). It also compares the attitudinal responses of individuals differentiated and identified by their peers as strong or weak contributors to project success. These issues are important for human resource managers who need to understand the links between their practices and their firm’s financial performance (Becker & Huselid, 1998). Performance appraisals can guide employees’ behaviors and ultimately influence organizational productivity. For peer appraisals, a prerequisite for their impact on the bottom line is their accuracy or validity. If peers are reluctant to differentiate among their co-workers in evaluating their performance, the ratings will be less valid and accurate, and they will not be effective in enhancing productivity.

Peer Appraisals

As with other human resource management processes, in practice peer appraisals take on many different forms: the measurement forms vary (nominations, rankings, ratings); raters may be identified or not; and the accumulated data may be used for feedback to serve developmental purposes, for making administrative decisions such as salary increases or promotions, or for both.

Regardless of the structure, peer appraisals, in principle, are a tool designed to provide job incumbents with valid information to allow them to maintain or improve performance or to provide the basis for administrative decisions. Dunnette (1993) and London and Smither (1995) argued that 360-degree appraisal structures could enhance performance appraisal effectiveness because they include observations from a number of observers and thus can provide a more complete picture of an individual’s performance. In principle, performance appraisals from sources in addition to the supervisor make good sense. They provide a perspective and information that the supervisor alone often cannot obtain. Further, these appraisal processes, if well managed and structured, can help an organization better direct the capacity of its human capital toward the achievement of its strategic mission and goals (Becker & Huselid, 1998). It is important, however, to understand processes that can affect or are affected by non-supervisor ratings and thus to understand their ability to affect an organization’s strategic performance.

Conceptually, the process involves observers who are peers accurately assessing performance and truthfully reporting their observations. Several factors might threaten whether or not the assessments are accurate and truthful. These factors might include concerns about treating others equally or equitably and the consequences of either treatment to themselves. These consequences might include, for example, the disruption of peer harmony or better or worse rewards for themselves. We assume that performance differences do occur and that they are detectable to peers but that the peers’ motivation to report differences in their co-workers’ performance varies.

There has been substantial research on peer appraisals. In an early review, Kane and Lawler (1978) investigated the psychometric properties of three forms of peer assessment (peer nominations, peer rankings, and peer ratings) and found respectable reliability and validity coefficients. In a meta-analysis of multi-rater studies, Harris and Schaubroeck (1988) found high correlations between peer and supervisor ratings of an incumbent’s performance but only moderate correlations between self ratings and peer ratings of performance. This finding has been confirmed in subsequent research (Fox, Caspy, & Reisler, 1994; Furnham & Stringfield, 1998). Saavedra and Kwun (1993) found that outstanding contributors were more discriminating than were average or below average contributors in their peer evaluations, suggesting that one’s own performance might influence the willingness to evaluate others. The peer evaluation literature also includes a large number of applications-oriented articles that provide anecdotal information about peer evaluation systems. Norman and Zawacki (1991), for example, described the processes used for conducting peer appraisals in self-managed work groups at Digital Equipment Corporation.

Job incumbent reactions and attitudes toward peer evaluations appear to be mixed. For example, Love (1981) found that his sample of police officers were negatively disposed toward peer evaluations. McEvoy and Buller (1987) reviewed user acceptance of peer appraisals and found in their empirical analysis that hourly employees of a food-processing plant were more likely than subjects in other studies to accept an anonymous peer appraisal process. They also found that the relationships between acceptance and developmental uses of appraisals were stronger than those between acceptance and administrative uses that might affect individual outcomes.

Little previous research has investigated the relationship of intragroup processes to peer evaluations. One exception is Bettenhausen and Fedor’s (1997) study of MBA student reports of likely positive and negative outcomes that would result from peer and upward appraisals conducted under various conditions. They found that respondents believed that negative outcomes such as jeopardizing work relationships and making employees vulnerable to retribution were likely to result from peer appraisals used for administrative purposes. The opposite was true when peer appraisals were used for developmental purposes. Moreover, respondents also reported that positive outcomes such as helping employees do their job better and increasing employees’ feelings of self-worth were more likely outcomes when they had good co-worker relations than when they had poor co-worker relations.

An experimental study by DeNisi, Randolph, and Blencoe (1983) manipulated positive and negative performance evaluations to ascertain whether a subject’s own evaluation on a first task affected his/her evaluations of others on a subsequent task. These subjects, however, only worked together in one experimental session, and the evaluations did not reflect actual performance. A study by Imada (1982) used a sample of managers in a training program to show how interaction, observation, and stereotyping affect the cognitive organization of information when completing peer evaluation forms. Finally, Saavedra and Kwun (1993) found that outstanding contributors were more likely than average or below-average contributors to report that their personal abilities were used on a group task and that the peer evaluation process was fair. The ratings were anonymous, however, and had no effect on such outcomes as compensation or promotions.

As with 360-degree feedback, researchers have suggested that there is much potential value to peer evaluations (e.g., Dunnette, 1993; Kane & Lawler, 1978; Mohrman, Resnick-West, & Lawler, 1989). The claim is that peers are uniquely good observers of a job incumbent’s performance. In many jobs, peers are often the most frequent observers of the incumbent’s performance and the most affected by it. In fact, Tornow (1993) argued that the lower correlations between peer ratings and ratings from other observers should serve as a reason for using them: they reflect different perspectives.

Of course, peer ratings must be valid in order to be useful either for developmental or reward/administration purposes. One threat to their validity is a reluctance of some peers to give negative ratings to each other. This phenomenon can result in rating everyone positively and equally or making small distinctions in performance regardless of whether these ratings reflect actual contributions. This lack of valid differentiation among ratees would render the ratings useless for almost any purpose.

Differentiation in Peer Appraisals of Performance in Group Projects

Characteristics of group members are common input variables in models of group performance (e.g., Guzzo & Shea, 1992; Hackman, 1987; Tannenbaum, Beard, & Salas, 1992). Even in a group project, there are likely to be individual differences in ability and motivation, which manifest themselves in differential input and differential individual contributions to the group’s performance. One of the group member reactions in the present study is the perception of justice, which is closely linked to equity theory. Inequity, as classically defined (Adams, 1965), exists whenever a person perceives that the ratio of his/her outcomes to inputs is unequal to the same ratio for a reference person. Reference people are readily available in group work because of the presence of other group members working on the same project. Given the assumption of individual differences in contribution to the group project, equity theory suggests that group members will desire to reward each other differentially in order to achieve equity. Little is known, however, about differentiation in rewards assigned by groups themselves. We examine differentiation in peer performance ratings on long-term projects in real work groups in which members have important outcomes contingent on the evaluations. Three specific research questions are addressed.
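The equity condition described in the preceding paragraph can be written compactly. The symbols below are notation assumed here for illustration only (O for perceived outcomes, I for perceived inputs, subscript p for the person, subscript r for the reference person); they are not taken from the cited sources.

```latex
\[
\text{Equity: } \frac{O_p}{I_p} = \frac{O_r}{I_r},
\qquad
\text{Inequity: } \frac{O_p}{I_p} \neq \frac{O_r}{I_r}
\]
```

Read this way, if group members are perceived to contribute unequal inputs but all receive the same outcome, the two ratios differ and inequity is predicted.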

Participants in the present study were required to conduct non-anonymous peer evaluations in work groups, and these ratings affected group members’ rewards. Much of the previous research on peer appraisals employed anonymous ratings. It is important to examine non-anonymous peer ratings, however, because such ratings may need to be non-anonymous to combat potential legal challenges if they are to be used in personnel decisions (Waldman, Atwater, & Antonioni, 1998). Work groups in the present study could choose to differentiate or not in rating member performance. The first question addresses what proportion of the groups decided to differentiate and how much they differentiated. Range restriction due to lack of differentiation has been identified as a problem in some peer appraisal research (McEvoy & Buller, 1987), and lack of differentiation in peer appraisals represents such range restriction.

Research Question #1: Given the choice to differentiate or not, how often do groups differentiate, and of those groups that do differentiate, how much differentiation occurs?

Acceptance of said evaluation is important for successful use of evaluations by people other than the supervisor (e.g., Waldman & Bowen, 1998). Little is known, however, about the effects of differentiation in group performance ratings on acceptance, especially how differentiation might be associated with attitudinal variables such as satisfaction, perceived equity, and turnover. If the assumption of individual differences in input is correct, equity theory suggests that output also must vary in order to achieve equity and result in perceptions of justice. If groups do not differentiate, one would expect more negative attitudes, especially including justice feelings. The general justice reactions may carry over to other attitudinal variables, such as satisfaction and turnover intent (Moorman, 1991).

Research Question #2: Are there differences in how individuals respond to being evaluated by their peers depending on whether differentiation occurs?

Similarly, in groups that do differentiate in their appraisals of each member’s performance, individual perceptions of fairness or justice, satisfaction, and willingness to remain in a group might be different for individuals rated higher than they are for those rated lower than others. If ratees who receive favorable ratings have favorable reactions and others have negative reactions to peer appraisals, the organization will need to decide if this is good or bad, but at present we don’t know the answer.

Research Question #3: In groups that do differentiate, do individuals who receive higher ratings have attitudes different from individuals who receive lower ratings?

Human resource functions need to be proactive in evaluating the usefulness of their programs and practices, and an overall link between human resource practices and a firm’s financial success has already been demonstrated (Becker & Huselid, 1998). Brockbank (1999) noted that the measurement and rewarding of individuals’ effectiveness is one of the factors linked to a firm’s ability to compete globally. In order for any human resource system to help sustain a company’s competitive advantage, it first must perform as intended. This project therefore examined the working of peer appraisals in work teams, one widely adopted form of measuring and rewarding employee performance.

Method

Participants

The participants were students in nine sections of an upper-division organizational behavior course taught at a large public university over a three-year period. All business majors are required to take this course, as are students from many other majors. The sample of students who agreed to participate by completing a questionnaire consisted of 290 individuals, representing an 85% response rate. Of the participants, 56% were male and 44% were female; 80% were White while 14% were Asian. The remaining 6% were African-American (1%), Hispanic (2%), Native American (2%), or other. Their mean age was 22.2 years (SD = 3.1).


Procedure

Over the 10-week term, participants worked together in the same groups on a variety of graded and ungraded tasks and problems. Two of the tasks were graded projects that comprised 40% of a participant’s grade and that required a large amount of time and effort. Participants reported working an average of 3.8 hours per week (SD = 2.53) on the two projects throughout the term with four weeks between the first and second projects’ due dates.

A total of 56 groups participated, with an average size of six members (range was four to seven). In assigning individuals to groups, an attempt was made to evenly distribute (1) self-reported writing and information-search skills, (2) men and women, and (3) domestic and international students.

The first graded project required groups to develop a protocol for preparing senior managers to travel to another country for business purposes. This cross-cultural assignment required information search as well as computer, writing, and oral presentation skills. The second graded project required groups to develop a process report that analyzed how well members worked together.

Because research (Day & Sulsky, 1995) has shown rater training to enhance the accuracy of performance appraisals, participants received training and practice at several points in the course. Prior to the first project peer appraisals, participants learned about giving specific and descriptive feedback and about how to reduce the kinds of measurement errors often associated with performance appraisals. Moreover, participants had to practice assessing performance on two other tasks prior to making the peer appraisals. Finally, participants were instructed to use detailed work plans that assigned tasks and due dates to work group members as the basis for the peer appraisals. Participants were reminded of these concepts prior to the second peer appraisal.

After submitting each assignment, each group conducted a peer evaluation of each of its members’ contributions to the development of the submitted project. The work plan each group developed before starting project work served as the basis for the evaluation. Participants could rate each other in a range from 80% to 120%, with the constraint that the average rating for the group must equal 100%. This is similar to a budget-driven compensation system where a fixed amount of money is available for pay raises, merit pay, or bonuses: The available money can be distributed equally or differentiated narrowly or widely among employees to reflect actual performance. Individuals not earning at least 80% were to be assigned a grade of zero, and in this case the group could not redistribute the remaining points among other members. An individual’s recorded grade on each project was the peer rating times the grade the instructor gave the group. The peer evaluation was conducted and submitted before the groups knew the grade the instructor gave to the group projects.
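The grading rule just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure's actual implementation: the function name `individual_grades`, the use of a rating of 0 to mark a member who did not earn at least 80%, and the choice to check the 100% average over the remaining members only are all assumptions made here.

```python
def individual_grades(group_grade, peer_ratings):
    """Return each member's recorded grade: peer rating times the group grade.

    peer_ratings maps a member to a rating in percent (100 means 100%).
    Assumption: a rating of 0 marks a member who did not earn at least 80%.
    """
    # Members who keep a rating; zeroed members are excluded from the checks.
    rated = {m: r for m, r in peer_ratings.items() if r != 0}
    for member, rating in rated.items():
        if not 80 <= rating <= 120:
            raise ValueError(f"{member}: rating {rating}% is outside 80% to 120%")
    # Assumption: the 100% average applies to the members who were not zeroed,
    # so a zeroed member's points are not redistributed to the others.
    if rated and abs(sum(rated.values()) / len(rated) - 100) > 1e-9:
        raise ValueError("ratings must average 100%")
    return {m: group_grade * r / 100 for m, r in peer_ratings.items()}


# Hypothetical group of three with an instructor grade of 90 points.
grades = individual_grades(90, {"A": 105, "B": 100, "C": 95})
```

With these hypothetical inputs, member A's recorded grade is 105% of 90 and member C's is 95% of 90, which mirrors the fixed-budget analogy in the text: what one member gains, another gives up.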

The final ratings reflected a group’s consensus about an individual’s performance rather than simply averaging individual ratings of peers; therefore, there was only one rating per ratee. Essentially, groups had to discuss the contributions of each member in the context of the project plan and arrive at a decision about each member’s contribution. The member was present during this discussion. The procedure was quite similar to that used by Saavedra and Kwun (1993). Specifically, these were self-managing groups that had graded tasks requiring interdependence, cooperation, and coordination over a ten-week academic term. It was different from Saavedra and Kwun, however, in that the peer rating process was not anonymous and the results affected people through important, formal outcomes: grades.

During the last class session, instructors administered a survey to the participants at the same time as the university’s course evaluation. Instructors read a human subjects protocol and assured participants that instructors would not have access to the data until after grades were submitted, thus eliminating any possibility that survey responses would affect grades.

Measures

Differentiation was measured by inspecting the data generated by each group of students. No differentiation occurred when each group member received 100% of the credit for an assignment. Differentiation occurred when the group awarded some members more and others less than 100% credit.
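As a minimal sketch of this measure, assuming each group's ratings are available as a list of percentages, the classification and the size of the spread could be computed as follows. The function names `differentiated` and `rating_spread` are hypothetical and introduced here only for illustration.

```python
def differentiated(ratings_pct):
    """Return True if any member received something other than 100% credit."""
    return any(r != 100 for r in ratings_pct)


def rating_spread(ratings_pct):
    """Return the gap, in percentage points, between the highest and lowest rating."""
    return max(ratings_pct) - min(ratings_pct)


# Hypothetical groups: one that rated everyone equally, one that did not.
equal_group = [100, 100, 100, 100]
unequal_group = [105, 100, 100, 95]
```

Because the ratings must average 100%, any rating above 100% in a group implies at least one rating below it, so checking for any value other than 100 is enough to detect differentiation.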

Peer Ratings. Students were asked to identify themselves on the survey and most did so. Course records were then used as the source of peer ratings.

Attitudinal Variables. The study examined four attitudinal variables measured in the survey. Two of these, procedural and distributive justice, are directly related to perceived equity of the peer appraisals and the rewards associated with them. The other two, satisfaction and turnover intentions, are likely to be generally associated with appraisals, although there are many other factors (e.g., supervision, job design, and labor market conditions) that can also have a strong effect on them in the employment situation. In the present study, however, many of these other influences on satisfaction and turnover intention are relatively constant across people. That is, all participants have the same job, supervision (professor), and labor market (not a lot of other jobs to which they can move); therefore the influence of peer appraisals on these two attitudes should be more apparent in this setting.

General satisfaction was measured with two items: “All in all, I am satisfied with my group” and “I liked working with my group”. Turnover intention was measured with three items: “If given the opportunity, I would have left my class group and worked on my own”; “I would have liked to quit my class group”; and “If given the opportunity, I would choose to work with the same class group again”. These scales all used a five-point Likert-type agree-disagree scale.

Procedural justice, also using a 5-point Likert-type agree-disagree scale, was measured with seven items adapted from Shapiro and Brett (1993). A sample item was, “I have been taken advantage of by the grading procedure my class group used for this class”. Distributive justice was measured with a modified version of Price and Mueller’s (1986) distributive justice index and consisted of six questions that asked participants to rate whether rewards were fairly distributed taking into account such things as “the amount of effort I have put forth”. Each item used a 5-point “rewards are fairly distributed” to “rewards are not fairly distributed at all” scale.

Table I presents descriptive statistics, intercorrelations, and Cronbach coefficient alphas for the attitudinal variables used in this study. The measures were moderately to strongly intercorrelated, ranging from r = .28 to .85. The Cronbach alpha reliability coefficients were acceptably high, ranging from .89 to .96.
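For readers unfamiliar with the statistic, the sketch below computes Cronbach's coefficient alpha from item-level responses using the standard formula, alpha = k / (k - 1) * (1 - sum of item variances / variance of total scores). The function name and the sample data are hypothetical; the study's item-level data are not reproduced here.

```python
from statistics import variance


def cronbach_alpha(item_scores):
    """Compute Cronbach's alpha.

    item_scores is a list with one inner list per item; each inner list
    holds that item's score for every respondent, in the same order.
    """
    k = len(item_scores)
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Total scale score for each respondent (sum across items).
    totals = [sum(responses) for responses in zip(*item_scores)]
    item_variance_sum = sum(variance(item) for item in item_scores)
    return k / (k - 1) * (1 - item_variance_sum / variance(totals))


# Hypothetical two-item scale answered by five respondents.
identical_items = [[1, 2, 3, 4, 5], [1, 2, 3, 4, 5]]
alpha = cronbach_alpha(identical_items)
```

Two items that every respondent answers identically are perfectly consistent, so this hypothetical example sits at the upper bound of the coefficient; real scales, such as those in Table I, fall below it.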

Results

The first research question asks how often groups, given the opportunity, differentiate in ratings of group members, and of the groups that do differentiate, how much differentiation occurs. Of the 56 groups, 19 (34%) differentiated on the first project and 19 (34%) differentiated on the second project. Only six (11%) groups differentiated on both projects, and 24 (43%) did not differentiate on either project.
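The four counts in this paragraph are mutually consistent, which a short inclusion-exclusion check makes explicit. The variable names are chosen here for illustration; the numbers are those reported in the text.

```python
total_groups = 56
diff_project_1 = 19
diff_project_2 = 19
diff_both = 6

# Groups that differentiated on at least one project (inclusion-exclusion).
diff_either = diff_project_1 + diff_project_2 - diff_both
# Groups that never differentiated.
diff_neither = total_groups - diff_either


def pct(count):
    """Share of all groups, rounded to a whole percent."""
    return round(100 * count / total_groups)
```

The check recovers the 24 groups that did not differentiate on either project, and the rounded shares match the percentages quoted above.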

Table II presents the distribution of ratings for each of the two projects and shows that groups that actually did differentiate did so narrowly. It shows that 54.1% of group members were rated in a range from 98% to 102% on the first project and 60.8% were rated in the same range on the second project. Most of the differentiation that did occur was restricted to 5 of 40 possible percentage points.
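The two shares quoted in this paragraph follow from summing the Table II percentages for the three rating bins between .98 and 1.02. The sketch below repeats that arithmetic; the dictionary layout and the function name are assumptions made for illustration, while the percentages are copied from Table II.

```python
# Percentage of ratees in each bin from .98 to 1.02, as reported in Table II.
project_1 = {".98-.99": 36.5, "1.00": 14.9, "1.01-1.02": 2.7}
project_2 = {".98-.99": 24.3, "1.00": 10.8, "1.01-1.02": 25.7}


def share_near_100(bins):
    """Sum the bin percentages, rounded to one decimal place."""
    return round(sum(bins.values()), 1)
```

Summing the three bins gives the 54.1% and 60.8% figures for the first and second projects respectively.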

TABLE I Means, Standard Deviations, Coefficient Alphas, and Correlations.

Variable                   Mean   SD     Alpha   1.      2.      3.
1. General satisfaction    4.05   0.94   0.90
2. Distributive justice    3.60   1.04   0.96    0.28
3. Procedural justice      4.00   0.93   0.94    0.55    0.35
4. Turnover intent         2.03   1.06   0.89    –0.85   –0.32   –0.59

*All coefficients significant at p < .001; smallest N = 289.


Analyses of variance were used to test whether being in a group that differentiated among members in the peer evaluations was associated with scores on the four attitudinal variables. Regarding the first project, statistically significant differences were found for general satisfaction (F = 13.01, p < .001), procedural justice (F = 10.69, p < .001), and turnover intent (F = 10.67, p < .001). In each case, members of groups that differentiated on performance reported more negative attitudes than did members of groups where no performance differentiation occurred. Only the distributive justice measure did not yield statistically significant differences among the groups. Table III presents the means and standard deviations from these analyses.

For the second group project, all the attitudinal variables differed between groups that did and those that did not differentiate: general satisfaction (F = 12.00, p < .001), procedural justice (F = 14.96, p < .001), distributive justice (F = 4.01, p < .05), and turnover intent (F = 11.02, p < .001). Consistent with the first project, members of groups that did not differentiate among members reported more positive attitudes. Table III also presents the means and standard deviations from the second project analyses.
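To make the kind of test reported here concrete, the following sketch computes a one-way analysis-of-variance F statistic from raw scores. It is a textbook calculation, not the authors' analysis code; the function name and the attitude scores are hypothetical, since the study's individual-level data are not available in the text.

```python
def one_way_f(groups):
    """Return the one-way ANOVA F statistic for a list of score lists."""
    all_scores = [x for g in groups for x in g]
    grand_mean = sum(all_scores) / len(all_scores)
    means = [sum(g) / len(g) for g in groups]
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    # Variation of individual scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_between = len(groups) - 1
    df_within = len(all_scores) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical satisfaction scores on a five-point scale.
differentiating = [3, 4, 4, 3, 5]
non_differentiating = [4, 5, 5, 4, 5]
f_value = one_way_f([differentiating, non_differentiating])
```

A larger F indicates that the difference between the group means is large relative to the spread of scores within each group, which is the comparison summarized by the F values in the two paragraphs above.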

The last test was a comparison of the attitudes of people in differentiating groups according to their own status as ratees. None of the eight tests using attitudinal variables in each of two projects showed a statistically significant difference, although turnover intent was marginally significant on the first project (F = 3.04, p < .10). People who earned more than 100% of the points on the first project expressed more turnover intention than people rated lower. Given the marginality of these results, it is probably safest to conclude that the receipt of higher or lower ratings is unrelated to ratees’ attitudes.

Discussion

This study examined undergraduate students working together in task groups over 10 weeks as subjects in two peer evaluation experiences. While using students as subjects affects the study’s external validity for generalizing results to business and industry, the fact that every group had identical tasks enhances its internal validity. The rating process was non-anonymous, and because the ratings affected the participants’ grades for the course, it served administrative purposes. External validity is probably enhanced because the task was part of the participants’ real lives (as students) over an extended period of time, and these peer evaluations affected important outcomes, grades. The method used had the advantage of allowing close control and better measurement of performance than could have been attained in a business setting, and the reality of the situation in the students’ lives provided better confidence in generalizability than could be had in a laboratory setting. Classroom studies of human resource principles have been used successfully in previous research as well (e.g., Tenbrunsel, 1998; Watson, Michaelsen, & Sharp, 1991), and the measurement properties of and attitudinal responses to classroom use of peer evaluations have also been reported elsewhere (Morahan-Martin, 1996; Topping, 1998). A possible threat to external validity, however, is the relatively homogeneous age of participants: Most are in their early 20s. The sample, therefore, was different from most business settings in which there is a wide range of ages. Furthermore, people at this age might be more concerned than would older participants about peer acceptance and thus might be less likely to make decisions that would result in others liking them less.

TABLE II Peer Ratings in Groups that Differentiated.

Project 1 Project 2
Rating Frequency % Frequency %
0.0  1 1.4
.80–.81  1 1.4
.82–.83
.84–.85  1 1.4
.86–.87
.88–.89
.90–.91  2 2.7  1 1.4
.92–.93  1 1.4  3 4.1
.94–.95  1 1.4  6 8.1
.96–.97  2 2.7  1 1.4
.98–.99  27 36.5  18 24.3
1.00  11 14.9  8 10.8
1.01–1.02  2 2.7  19 25.7
1.03–1.04  14 19.0  6 8.1
1.05–1.06  9 12.2  7 9.5
1.07–1.08  1 1.4
1.09–1.10  3 4.1  2 2.7
1.11–1.12  1 1.4
1.13–1.14
1.15–1.16
1.17–1.18
1.19–1.20

Overall, the groups in the present study only differentiated their ratings of peers to a small extent. This study assumed that performance varies within work groups but found that 43% of the groups chose not to differentiate in peer performance ratings on either project. Moreover, this study found most groups that did differentiate used a somewhat narrow range, largely limited to 5 rating points on a 40-point scale. In addition, members of non-differentiating groups had more positive attitudes toward the group and the process of peer evaluations than did the differentiating groups. Taken together, these results imply that members of project groups prefer to assign rewards equally and are more comfortable in groups that do so. This could be due to the belief that people receiving lower ratings would feel bad, and this could cause disruption in group harmony. This does not appear to be the case, however, because there was little or no difference in attitudes between people rated high versus low in the differentiating groups. This pattern of results suggests that employers have less to worry about in terms of offending specific individuals by using peer appraisals than of adversely affecting the attitudes of entire groups. Future research should aim to determine the extent to which these results generalize widely and to find individual or situational factors that influence a group’s willingness to differentiate in their peer appraisals.

Does the narrow range used by groups that did differentiate reflect small differences in actual contributions or rather small adjustments in ratings for big differences in actual contributions? This could not be tested directly because there was no measure of individuals’ actual contributions, but our assumption and experience led us to believe that bigger differences in contributions existed than the differences in ratings reflect. After all, the instructor’s assessment of projects varied (first project mean grade = 172.87, SD = 13.79, range = 143–198; second project mean grade = 174.43, SD = 12.42, range = 144–200). Further, participants’ performance on individually graded assignments and exams varied as reflected in the final grade distribution for the course (mean final grade = 2.77 on a 4.0 scale, SD = .60, range = F–A). These data support the assumption that there was variability in individual performance on other tasks in this same setting.

                              Differentiate on Project One    Differentiate on Project Two
                              Yes        No                   Yes        No
Distributive justice    Mean  3.55       3.61                 3.42       3.68
                        SD    1.03       1.05                 1.07       1.01
General satisfaction    Mean  3.78       4.21                 3.79       4.20
                        SD    1.19       0.74                 1.17       0.77
Procedural justice      Mean  3.77       4.15                 3.72       4.17
                        SD    1.03       0.84                 1.01       0.85
Turnover intent         Mean  2.30       1.87                 2.31       1.87
                        SD    1.21       0.93                 1.22       0.93
N                             97         177                  96         178

TABLE III Descriptive Statistics: Attitudes by Differentiation (All Respondents).

Even if that were not accepted, however, the discrepancy between our assumption and the results may have occurred for a number of alternative reasons. Possibly in some groups those who did more than their fair share of the work required only a small equity adjustment without fully requiring an adjustment that reflected actual contributions. Another explanation is that group members paid attention to more visible behaviors when evaluating their peers. Simply attending a meeting and speaking up, for example, were verifiable public behaviors. Doing a lot of outside research and doing a good job of synthesizing it were less public and less verifiable. Alternatively, individuals who contributed more may also have received informal rewards, such as prestige and status. If they received desired informal rewards, they may have been less concerned about proportionate formal point adjustments.

Another explanation might be that groups that did not differentiate had the characteristics of high-performance teams (Katzenbach & Smith, 1993). In such a case, all group members perform well and contribute to outstanding overall group performance. In the present study, this would be reflected in groups not differentiating in ratings because the group functioned as a high-performing team. Additional inspection of the data, however, did not support such an explanation. There were no statistically significant differences on the first project grades between differentiating and non-differentiating groups. On the second project, the project grades differed significantly (F = 11.08, p < .001), but the groups that did differentiate earned significantly better grades than groups that did not differentiate.

This study was interested in the possibility that differentiation could lead to outcomes such as dissatisfaction and subsequent turnover intention. Because of the non-experimental research design, however, strong inferences about causality could not be made. The results are consistent with the explanation that differentiation does lead to these outcomes, as has been argued here. Reverse causation is a plausible alternative explanation, however. Based on an avoidance-of-group-disharmony hypothesis, satisfied (harmonious) groups may tend to avoid anything that might disrupt their harmony such as differentiation in peer ratings.

There are at least three explanations as to why groups that did not differentiate had more positive attitudes than those that did differentiate. One is that our assumption is incorrect and that each member contributed equally, which contributed to group harmony. The second explanation is that no one in the group was willing to disrupt group harmony by exacting equity when one or more members did not contribute as expected. Furthermore, those who contributed less might have been happy not to have to address their lower contributions or accept the negative consequences. A third explanation is that these individuals value equality rather than equity. That is, they essentially see treating everyone equally as the proper thing to do, aside from any consequences such as group harmony. It is not necessarily equality or equity that characterizes task groups, however; both equity and equality principles can operate simultaneously. Kabanoff (1991) noted that organizations are distributively complex systems that tend to use both equity and equality distribution principles and that this is partly because of the twin goals of task effectiveness and organization maintenance. These possibilities await further research.

The findings that few groups differentiated on both projects, and that those that did differentiate used a somewhat narrow range, raised a question about the validity of the ratings. While peers may provide an additional perspective in performance ratings because they are frequent observers of an incumbent’s performance, we need to discover how to use them in ways that encourage peers to rate each other accurately. This is important, given the time, labor costs, and possible disruption in group harmony required to produce them. Peers can, of course, be required to assign ratings that will form a specified distribution, disallowing equal ratings for everyone. Along these lines, rankings could be required instead of ratings. Short of actually requiring peers to differentiate, however, at least three other factors might motivate peers to differentiate in their appraisals of each other: leadership, group dynamics, and knowledge of the specific consequences tied to the ratings. Group norms almost certainly influence group ratings, and the formal leader or supervisor can attempt to influence these norms by directions, encouragement, and modeling. The dynamics of the group interaction also influence normative behavior. The development of a group climate in which equity is considered as important as or more important than equality would aid in getting accurate, differentiated ratings. In addition, if the peers know quite specifically what and how large the rewards associated with their ratings will be, they can at least make more informed ratings. Often, as in these projects, peers do ratings before they know how their work has been evaluated by others.

The lack of statistically significant differences in most of the attitudes between those who earned more and less than 100% of the points deserves some discussion. Saavedra and Kwun (1993) found more positive attitudes among those rated as outstanding performers and acknowledged that their study used single-item measures of attitudes. They recommended that future research use reliable multi-item indices, which this study does. Furthermore, in this study, the most positive attitudes were found in groups where no differentiation occurred, and this includes the two equity measures, distributive and procedural justice. Taken together, this all suggests that more is operating here than equity predictions can explain.

Two additional cautions are important for interpreting these findings. One is that requiring consensus could create a leveling effect in the ratings. This seems especially true given that the target ratee was present during the appraisal discussion. A second caution is that while the participants in the present study worked together over an extended period of time, it is unlikely that they, as an intact group, would have to work together after the course is over. Thus, there might be no long-term incentive to invest energy in providing valid ratings. If employees are committed to their teams, they have an incentive to try to improve team performance. In the present study, however, this possible influence on ratings is absent. Instead, these teams were like project teams in businesses.

Overall, the data did not support an expectation that equity considerations would cause attitudes to be linked to greater differentiation. People in differentiating groups had more negative rather than more positive attitudes toward their groups. A straightforward self-interest hypothesis also would not find support in the data because individuals who received higher ratings were no more likely to have positive attitudes than those receiving lower ratings.

Future Research

This study provides enough evidence to suggest that peer appraisal research needs to explore more thoroughly how intragroup dynamics affect the willingness of peers to rate each other accurately and thus affect the ratings’ validity. Regarding traditional supervisor–subordinate evaluations, Longenecker, Sims, and Gioia (1987) described conditions under which managers intentionally inflated or deflated subordinate ratings, depending on the consequences to the manager. The same issues should be investigated regarding peer appraisals. Creating disharmony in a group could be an important consequence of evaluating someone negatively, as could having to deal with an angry peer who was rated negatively. Might peers be motivated to inflate or deflate ratings because such consequences are perceived to be important and likely outcomes of being honest in peer evaluations? Additionally, a social norm of equality rather than equity may be linked to a value for group harmony. Elsewhere, justice researchers have noted such normative differences (Cropanzano & Greenberg, 1997). A norm of equality could be operating in the student subculture, but it may also be more widespread than simply being present among students. Regardless of the source of such a norm, it remains to be discovered whether similar beliefs and/or actions occur widely in the workplace.

Participants in this study did not know how the instructor evaluated their work until after the peer ratings were completed. Research needs to investigate whether differences occur depending on whether a group has external evaluative information about how well it performed overall before conducting the peer evaluation. It might be, for example, that high contributors will want to differentiate more forcefully if they know the group’s work was evaluated by the instructor as poor.

The appraisal process used in this study did not provide anonymous data and had an outcome linked to the ratings. Much of the published research on peer evaluations describes structures in which the ratings are anonymous, but this practice has legal and behavioral implications in the workplace. Some legal challenges rest on whether personnel actions are based on anonymous performance ratings. Anonymous ratings are questioned because they cannot be linked to specific sources of performance data (Waldman, Atwater, & Antonioni, 1998). In addition, the potential for discussion between employees and raters is a necessary component of a judicially acceptable appraisal system (Sovereign, 1999), and this makes anonymity impractical. Many organizations do use peer appraisal information and information from other non-supervisory observers of performance for administrative purposes. Negative applicant reactions to selection procedures can lead to complaints and court challenges (e.g., Smither, Reilly, Millsap, & Stoffey, 1993), and it is likely that negative employee reactions to performance appraisals can have the same result. Even if the information were not formally considered a part of administrative decisions, we would expect that mere knowledge of the appraisal results would at least have some unintentional bearing on administrative decisions. For developmental purposes, to be effective in changing behavior, people need to be able to ask questions about and respond to the evaluative information they receive (Nadler, 1977). Anonymous feedback would preclude this from happening. Overall, how group dynamics affect anonymous and non-anonymous peer ratings needs to be investigated.

Meyer, Kay, and French (1965) presented a classic explanation of the contrast between evaluative (administrative) and developmental uses of performance ratings. Regarding the former, to reward people appropriately, the performance appraisals, of which the peer ratings in the present study are a part, need to be accurate. Basing rewards on inaccurate data is not fair or equitable and, in any event, subverts the strategic intent of most reward systems. Regarding the developmental purpose of appraisals, people will not pay attention to the evaluations if they do not affect important consequences such as salary adjustments, promotions, or grades. Ratings that are not linked to such valued outcomes will not get the attention of job incumbents and will not, therefore, lead to improved performance (Kerr, 1995).

Peer evaluations are being widely used, and their potential value has received much praise and attention. More effort, however, needs to go into understanding exactly what affects them and what they affect. Group members are reluctant to differentiate among each other in peer appraisals, and the potential causes and consequences of such differentiation are not yet well understood.


REFERENCES

Adams, J.S. (1965). Inequity in social exchange. In L. Berkowitz (Ed.), Advances in experimental social psychology. New York: Academic Press.

Becker, E., & Huselid, M. (1998). High performance work systems and firm performance: A synthesis of research and managerial implications. Research in Personnel and Human Resource Management, 16, 53–101.

Bettenhausen, K.L., & Fedor, D.B. (1997). Peer and upward appraisals: A comparison of their benefits and problems. Group and Organization Management, 22(2), 236–263.

Brockbank, W. (1999). If HR were really strategically proactive: Present and future directions in HR’s contribution to competitive advantage. Human Resource Management, 38, 337–352.

Cropanzano, R., & Greenberg, J. (1997). Progress in organizational justice: Tunneling through the maze. In C.L. Cooper & I.T. Robertson (Eds.), International review of industrial and organizational psychology (pp. 317–372). Chichester, England: Wiley.

Day, D.V., & Sulsky, L.M. (1995). Effects of frame-of-reference training and information configuration on memory organization and rating accuracy. Journal of Applied Psychology, 80, 158–167.

DeNisi, A.S., Randolph, W.A., & Blencoe, A.G. (1983). Potential problems with peer ratings. Academy of Management Journal, 26(3), 457–464.

Dunnette, M.D. (1993). My hammer or your hammer? Human Resource Management, 32(2 & 3), 373–384.

JOHN A. DREXLER, JR. is Associate Professor of Management in the College of Business at Oregon State University. His recent articles have been published in the Journal of Organizational Behavior, Project Management Journal, the Journal of Construction Engineering and Management, and the Journal of Management Education. His research, teaching, and consulting cover a wide range of organizational change issues.

TERRY A. BEEHR is Professor and Director of the Ph.D. Program in Industrial/Organizational Psychology at Central Michigan University. He has published books and articles on a variety of topics in organizational behavior and human resource management, including occupational stress and employee health, aging workers and retirement, leadership, job design, and careers. In addition to these topics, he is currently studying 360-degree feedback, safety, supervisors’ attitudes toward their subordinates, and human resource practices among police officers.

THOMAS A. STETZ is Lead Research Psychologist for the WF21 Development Team at the National Imagery and Mapping Agency (NIMA). He is responsible for developing assessment processes and workforce skill analyses, and he consults on various human resources and organizational topics. Prior to working for NIMA he was at the U.S. Office of Personnel Management developing selection and promotion assessments for various government agencies.

Fox, S., Caspy, T., & Reisler, A. (1994). Variables affecting leniency, halo, and validity of self-appraisal. Journal of Occupational and Organizational Psychology, 67, 45–56.

Furnham, A., & Stringfield, P. (1998). Congruence in job-performance ratings: A study of 360-degree feedback examining self, manager, peers, and consultant ratings. Human Relations, 51, 517–530.

Guzzo, R.A., & Shea, G.P. (1992). Group performance and intergroup relations in organizations. In M.D. Dunnette & L.M. Hough (Eds.), Handbook of industrial and organizational psychology (Volume 3, pp. 269–313). Palo Alto, CA: Consulting Psychologists Press.

Hackman, J.R. (1987). The design of work teams. In J.W. Lorsch (Ed.), Handbook of organizational behavior (pp. 215–342). Englewood Cliffs, NJ: Prentice-Hall.

Harris, M.M., & Schaubroeck, J. (1988). A meta-analysis of self-supervisor, self-peer, and peer-supervisor ratings. Personnel Psychology, 41, 43–62.

Imada, A.S. (1982). Social interaction, observation, and stereotypes as determinants of differentiation in peer ratings. Organizational Behavior and Human Performance, 29, 397–415.

Kabanoff, B. (1991). Equity, equality, power, and conflict. Academy of Management Review, 16, 416–441.

Kane, J.S., & Lawler, E.E., III (1978). Methods of peer assessment. Psychological Bulletin, 85(3), 555–586.

Katzenbach, J.R., & Smith, D.K. (1993). The wisdom of teams: Creating the high-performance organization. Boston: Harvard Business School Press.

Kerr, S. (1995). On the folly of rewarding A, while hoping for B. Academy of Management Executive, 9(1), 7–14.

London, M., & Smither, J.W. (1995). Can multi-source feedback change perceptions of goal accomplishment, self-evaluations, and performance-related outcomes? Theory-based applications and directions for research. Personnel Psychology, 48, 803–839.

Longenecker, C.O., Sims, H.P., Jr., & Gioia, D.A. (1987). Behind the mask: The politics of employee appraisal. Academy of Management Executive, 1(3), 183–193.

Love, K.G. (1981). Comparison of peer assessment methods: Reliability, validity, friendship bias, and user reactions. Journal of Applied Psychology, 66(4), 451–457.

McEvoy, G.M., & Buller, P.F. (1987). User acceptance of peer appraisals in an industrial setting. Personnel Psychology, 40(4), 785–797.

Meyer, H.H., Kay, E., & French, J.R.P., Jr. (1965). Split roles in performance appraisal. Harvard Business Review, 43(January–February), 123–129.

Mohrman, A.M., Jr., Resnick-West, S.M., & Lawler, E.E., III (1989). Designing performance appraisal systems. San Francisco: Jossey-Bass.

Moorman, R.H. (1991). Relationship between organizational justice and organizational citizenship behaviors. Do fairness perceptions influence employee citizenship? Journal of Applied Psychology, 76, 845–855.

Morahan-Martin, J. (1996). Should peers’ evaluations be used in class projects? Questions regarding reliability, leniency, and acceptance. Psychological Reports, 78, 1243–1250.

Nadler, D.A. (1977). Feedback and organization development: Using data-based methods. Reading, MA: Addison-Wesley.

Norman, C.A., & Zawacki, R.A. (1991). Team appraisals–Team approach. Personnel Journal, 70(9), 101–104.

Price, J.L., & Mueller, C.W. (1986). Handbook of organizational measurement. Marshfield, MA: Pitman.

Saavedra, R., & Kwun, S.K. (1993). Peer evaluations in self-managing work groups. Journal of Applied Psychology, 78(3), 450–462.

Shapiro, D.L., & Brett, J.M. (1993). Comparing three processes underlying judgments of procedural justice: A field study of mediation and arbitration. Journal of Personality and Social Psychology, 65, 1167–1177.

Smither, J.W., Reilly, R.R., Millsap, R.R., & Stoffey, R.W. (1993). Applicant reactions to selection procedures. Personnel Psychology, 46, 49–76.

Sovereign, K.L. (1999). Personnel Law (4th Edition). Upper Saddle River, NJ: Prentice Hall.

Tannenbaum, S.I., Beard, R.L., & Salas, E. (1992). Team building and its influences on team effectiveness: An examination of conceptual and empirical developments. In K. Kelly (Ed.), Issues, theory, and research in industrial/organizational psychology. Amsterdam: Elsevier.

Tenbrunsel, A.E. (1998). Misrepresentation and expectations of misrepresentation in an ethical dilemma: The role of incentives and temptation. Academy of Management Journal, 41, 330–339.

Topping, K. (1998). Peer assessment between students in colleges and universities. Review of Educational Research, 68(3), 249–276.

Tornow, W.W. (1993). Perceptions or reality: Is multi-perspective measurement a means or an end? Human Resource Management, 32(2 & 3), 221–229.

Waldman, D.A., Atwater, L.E., & Antonioni, D. (1998). Has 360-degree feedback gone amok? Academy of Management Executive, 12(2), 86–94.

Waldman, D.A., & Bowen, D.E. (1998). The acceptability of 360-degree appraisals: A customer-supplier relationship perspective. Human Resource Management, 37, 117–129.

Watson, W., Michaelsen, L.K., & Sharp, W. (1991). Member competence, group interaction, and group decision making: A longitudinal study. Journal of Applied Psychology, 76, 803–809.

ENDNOTES

The authors acknowledge the College of Business at Oregon State University for its support of this project’s data entry and travel costs. Sabbatical leave and a grant to the second author from the Faculty Research and Creative Endeavor Committee at Central Michigan University also contributed to the project. An earlier version of this paper was presented at the Western Academy of Management Meetings, Portland, Oregon, 1998. The authors also acknowledge the insightful suggestions of the editor and two anonymous reviewers.