
Innovative Assessment in Large Classes
Author(s): Richard W. Buchanan and Martha Rogers
Source: College Teaching, Vol. 38, No. 2 (Spring, 1990), pp. 69-73
Published by: Taylor & Francis, Ltd.
Stable URL: http://www.jstor.org/stable/27558399



Innovative Assessment in Large Classes

Richard W. Buchanan and Martha Rogers

Richard W. Buchanan is senior lecturer in marketing at the Massey University in New Zealand. Martha Rogers is an assistant professor of marketing at Bowling Green State University in Bowling Green, Ohio.

We would like to offer some useful suggestions to solve some of the assessment problems frequently encountered with large classes. "Large classes" will be defined here as those with eighty students or more. Although this definition is somewhat arbitrary, it has been our experience that eighty students is the breaking point where traditional teaching techniques are no longer workable and new ones must be tried. This breaking point is particularly noticeable in the area of assessment. We've watched many of our colleagues struggle along with traditional approaches, such as essay examinations, up to points where class enrollments exceed eighty. Then they normally collapse from overwork, delegate assessment to lower-level assistants, or start looking for new approaches.

This paper will show some solutions to three problems:

1. How to offer students in large classes an opportunity to be assessed in an essay format without straining the resources available for grading

2. How to deal with students who miss a required examination

3. How to generate large numbers of new, relevant examination questions on a regular basis

It is useful to begin by stressing that this paper is not, and was never intended to be, an elegant scientific examination of all the factors within its focus. It is our intention to share techniques that have worked for us in sections numbering between 50 and 350 students. One author typically teaches between two and three thousand students per year.

We have had only one graduate assistant assigned to each of us for a period of five to ten hours per week, and thus finding a means of dealing with mass numbers became a matter of survival. Virtually all of the solutions suggested by this article were the result of trial and error. As such, this paper cannot lay claim to having tested all possible solutions. In addition, although we have kept reasonably accurate records to test the effectiveness of various solutions, we have made no attempt to present them as anything other than approximations.

Our three assessment solutions will be presented and should be used simultaneously, as a total system. This is in keeping with our experience that it is best to treat instructional design as a system rather than to treat individual parts in isolation. To do otherwise often causes the solution to one problem to exacerbate another. Therefore, this paper will not only relate those parts of the system designed to deal with selected problems but will also mention some solutions for problems created by the new system itself.

Objective Tests: Imperfect but Unavoidable

Although people teaching large classes often try to avoid multiple-choice/true-false tests, we have found that such efforts seem to be appreciated by almost no one. Although colleagues may criticize the limitations of anything other than essay tests, they usually are willing to accept an alternative if more than fifty students are involved. Administrators may make noises about the desirability of essay examinations, but, in our experience, they are rarely willing to trade the time it takes to grade them for a lack of participation in either matters of administration or research/publication. Finally, students are not nearly so fond of them as their comments to the contrary might suggest.

For all these reasons we are assuming that the basis for assessment will primarily be objective questions. This assumption normally unleashes a storm of student complaints to the effect that "I just don't do well on objective tests." Although this may be the case for some, we have found that, generally, the belief just doesn't hold true.

Through the years we have often made it a point to offer both essay and objective final examinations to students who have been tested up to that time in an objective format. Those who have taken the essay options have been graded on the basis of their examination without our first checking to see what their performance had been on objective test items. Only rarely has their letter grade on the essay final examination been different from the letter grade on previous objective tests. This observation concurs with the findings of Cowles and Hubbard (1952), Thompson (1965), and Bracht and Hopkins (1970). A study by Warren (1979) indicates that it may actually be easier for students to get high marks with multiple-choice than with essay tests (also see Hogan [1981]).

This rule of thumb, however, is not true for all students. And, even if it were true, it will not be useful for quieting students' objections if they think it is not true for them. For this reason, we've found it necessary to provide some way for students to be assessed in an essay format while still protecting ourselves against the enormous time investment required to evaluate all students in this manner.

Some idea of how great a time investment may be involved can be determined by considering a hypothetical example. Suppose that a more or less standard ten-question, short-answer test intended to be taken in fifty minutes were to be given. Assuming that it takes a minimum of two to three minutes to grade each question means that assessing each paper in the most minimal fashion requires a total of from twenty to thirty minutes. Multiplying this figure by a not uncommon student load of six hundred students produces a figure of from two hundred to three hundred hours. Even if instructors were to spend all of their time grading papers on a forty-hour week basis, each exam would take from five to seven and a half weeks to process.
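The arithmetic in this hypothetical example can be checked with a short Python sketch. The function name `grading_load` and its parameter names are illustrative assumptions, not anything taken from the authors' own records; the default values are the figures quoted in the paragraph above.

```python
def grading_load(students, questions=10, minutes_per_question=(2, 3),
                 hours_per_week=40):
    """Estimate essay-grading time for one exam.

    Returns a list of (total_hours, weeks) pairs, one for each
    per-question grading time in `minutes_per_question`.
    All names and defaults here are illustrative assumptions.
    """
    estimates = []
    for minutes in minutes_per_question:
        minutes_per_paper = questions * minutes          # 20 or 30 minutes
        total_hours = students * minutes_per_paper / 60  # whole class
        estimates.append((total_hours, total_hours / hours_per_week))
    return estimates


if __name__ == "__main__":
    for hours, weeks in grading_load(600):
        print(f"{hours:.0f} hours, {weeks:.1f} forty-hour weeks")
```

With six hundred students the sketch gives 200 to 300 hours, that is, 5 to 7.5 full working weeks for a single exam.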

Some might argue that this situation could be alleviated by the use of graders, but this technique has problems of its own. Among them are coordination/management of the graders, variability among graders, and the fact that students don't normally like to have their work assessed by someone other than the instructor.

All of these factors argue for a solution that offers students a chance to be assessed in an essay format but that will limit the number of students so assessed to reasonable numbers.

Self-Selective Essay Exams

We found that the only system that would fit into the preceding constraints had to be based on what many would term a "cafeteria" approach. The philosophical basis of this approach (which is frequently used in structuring employee benefit plans) is to offer "consumers" a number of options from which they can select the combination of items they prefer.

Students are, therefore, offered the following three options: (1) four objective concept tests only, (2) four objective concept tests and an optional final, or (3) three objective concept tests and an optional final. In options one and three, each test is worth 25 percent of their course grade; in option two, each test is worth 20 percent.

Those students electing to take the optional final are told

1. their current grade prior to the final (i.e., Should they quit while they're ahead?);

2. that the final examination can hurt them as well as help them (i.e., a concept test can, under some circumstances, be dropped, but a final cannot be dropped if attempted);

3. the approximate percentage of students taking the final examination and the fraction of these improving their grades over the years;

4. that the final examination will consist of either a fifty-question objective test or a ten-question short-answer essay, both covering the entire course;

5. that students will have to decide prior to taking the final which version they will attempt (i.e., they could not look at both and decide which version was easier); and

6. that most students in the past have preferred the objective version because it loads their risk into small (two points each) components rather than large (ten-point) "hunks."

When the options are presented to them in this manner, only 10 to 15 percent of the students enrolled in large courses have elected to attempt the final. Of those taking the final, no more than 20 percent chose the essay version (typically, only six or seven in a class of three hundred students).
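To see how these percentages translate into a grading load, the following Python sketch applies them to a class of a given size. The function name and the use of whole-number percentages are assumptions made for illustration; the rates themselves are the ones reported above, and the 20 percent essay share is an upper bound rather than a typical value.

```python
def essay_final_estimate(enrolled, final_pct=(10, 15), essay_pct=20):
    """Estimate how many students sit the optional final and how many
    of those choose the essay version.

    Percentages are whole numbers (10 means 10 percent).
    Returns ((low_final, high_final), (low_essay, high_essay)).
    Names and defaults are illustrative assumptions.
    """
    low_final = enrolled * final_pct[0] / 100
    high_final = enrolled * final_pct[1] / 100
    # essay_pct is described as a ceiling, so these are upper estimates.
    low_essay = low_final * essay_pct / 100
    high_essay = high_final * essay_pct / 100
    return (low_final, high_final), (low_essay, high_essay)


if __name__ == "__main__":
    finals, essays = essay_final_estimate(300)
    print("final takers:", finals)
    print("essay papers (at most):", essays)
```

For three hundred students the ceiling works out to between six and nine essay papers, which is consistent with the six or seven typically observed.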

These numbers, though manageable enough, have been distilled even further by a refinement of the system that was produced to meet what proved to be a product of the authors' teaching styles. When teaching large classes, we've found it useful to make sure that the lectures contain enough material not covered in the supporting text to make it worthwhile for students to attend lectures. We tell the students that this material will be both presented and the subject of examination questions (i.e., at least 30 percent of a test's items will not be found in the book).

Because it is generally impossible to videotape the lectures, those students who miss many classes have a very real problem, although they could miss at least one concept test without penalty. However, if the final examination covers both the text and the lecture, they are still at risk for those topics covered during their absences. For this reason we decided to make the objective version of the final examination cover the text only while the essay version is drawn from both the text and lecture. Generally, the lectures are more oriented to applications of knowledge than to definitions or facts, and we believe that these applications are better tested in an essay format.

Once this refinement was made, the percentage of students taking the final exam remained about the same, but the number electing the essay version has dropped to a fraction of 1 percent. Still, it has always been there if anyone wanted to complain about not doing well on objective tests. To the best of our knowledge, no complaints about the unavailability of essay tests have ever been made about our large classes.

It may also be useful to know that the percentage of students attempting the final usually falls over time, possibly because the grapevine eventually spreads the word that the final is not a particularly soft option. At any rate, the ceiling on the people attempting it seems to be about 10 to 15 percent of those enrolled.

How many concept tests should be administered? Students complain if there are fewer than four concept tests, because administering three or fewer exams causes the amount of material to be covered on each one to be unmanageable. Having more than four seems impractical because it multiplies the resources needed beyond a point of diminishing returns.

The basis of this system is in direct contrast to what seems to be an academic tradition of placing relatively greater emphasis on the final examination than on others such as the concept tests. However, it is not our intention to load most of a student's evaluation into his or her performance on only one day of the term.

Abolishing Makeup Exams

Once the "cafeteria" style is adopted, it then becomes possible to use it to solve other problems such as makeup exams.

Having students absent from a required exam is never a comfortable situation. Professors dread the inconvenience of constructing a makeup exam and find distasteful the thought of serving as judge, jury, and executioner in determining whether excuses are acceptable. At the same time, students don't like having their integrity questioned by an unpredictable, often insensitive system that they frequently suspect of being punitive. These more or less standard complaints explode in their intensity when multiplied by the enrollments of a large class.

Before tackling the problem of makeup tests, we realized that 15 to 25 percent of students might be absent from any given examination. When applied to a class enrollment of 80 to 350, and multiplied by several sections, the total number of students likely to be involved is beyond the scope of traditional methods for handling them.

The first problem is processing the flood of individuals who show up at an instructor's door either prior to or shortly after an examination with their excuses for being absent. If only five minutes is spent with each person, the total time invested would leave little time for doing anything else. Beyond this, we have felt totally helpless to determine which excuses are truthful, justified, or both.
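A rough sense of the office time implied by these figures can be had from the Python sketch below. It simply multiplies the absence rates and the five-minute interview quoted above by an enrollment; the function name and parameters are illustrative assumptions, and the result covers one section and one examination only.

```python
def excuse_load(enrolled, absent_pct=(15, 25), minutes_each=5):
    """Estimate absentees and interview hours for one exam in one section.

    Percentages are whole numbers (15 means 15 percent).
    Returns a list of (absentees, hours) pairs, one per absence rate.
    Names and defaults are illustrative assumptions.
    """
    load = []
    for pct in absent_pct:
        absentees = enrolled * pct / 100
        hours = absentees * minutes_each / 60
        load.append((absentees, hours))
    return load


if __name__ == "__main__":
    for absentees, hours in excuse_load(350):
        print(f"{absentees:.0f} absentees, {hours:.1f} hours of interviews")
```

For a section of 350 this comes to roughly four to seven hours of interviews per examination, before multiplying by the number of examinations and sections.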

Even if all the absentees could be accommodated, their sheer numbers make it impossible to arrange a time and place for a makeup exam that they can all attend. Finally, if a makeup test is allowed, there is no way to make it fair for all concerned. If anyone is allowed to take the test prior to the regular class, then someone is bound to feel that those taking the makeup will pass questions on to their friends. And, if a totally different test is given as a makeup, someone will argue that it is harder (easier) than the regular test.

At this point it would have been tempting to surrender the entire matter and decide to accept absolutely no excuses except those that conform to university policy and are supported by approved documentation (i.e., student health center doctor's excuse, etc.). But common sense suggests that this limitation would overlook some perfectly valid situations, and this would lead to further conflict. Although such conflict may be permissible in smaller class settings, it definitely is not for large ones.

One thing that large classes teach their instructors is never to tolerate any situation that strikes a large number of students as unfair. Reasonable university administrators are used to discarding the opinions of what they may perceive as a handful of disgruntled students. They are much more likely to take action if fifty or a hundred gather outside their door.

After considering all the problems associated with makeup exams, we decided to offer the students the option (previously discussed) of being assessed on the basis of three concept tests and an optional final examination that becomes mandatory if a student misses one of the concept tests. At the time the students are informed of this option they are also told that

1. they do not need to inform the instructor or get permission to miss a test;

2. by taking this option, they are also giving up the ability to drop a low test (i.e., what really is happening is that students are given the ability to drop a low test score: either a bad performance on a test taken or no performance on one they missed);

3. if they miss another test or the final they will fail the course; and, most important,

4. no makeups will be given for any reason to anyone.

Besides all this they are also told the specifics of the final examination, which have already been introduced in a preceding section.
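The grading rules described so far (three options, equal weights within each option, and failure after a second missed exam) can be summarized in a minimal Python sketch. The function name, the use of `None` to mark a missed test, and the choice to return `None` for a failing outcome are all assumptions made for illustration; the article does not describe how the authors actually recorded grades.

```python
def course_grade(concept_scores, final_score=None):
    """Combine exam scores (percent, 0-100) under the three options.

    concept_scores: four entries, one per concept test; None = missed.
    final_score: score on the optional final, or None if not attempted.
    Returns the course percentage, or None when the stated rules
    mean the student fails. (Representation is an assumption.)
    """
    if len(concept_scores) != 4:
        raise ValueError("expected one entry per concept test (four in all)")
    taken = [s for s in concept_scores if s is not None]

    if len(taken) == 4 and final_score is None:
        return sum(taken) / 4                  # option 1: four tests, 25% each
    if len(taken) == 4:
        return (sum(taken) + final_score) / 5  # option 2: five exams, 20% each
    if len(taken) == 3 and final_score is not None:
        return (sum(taken) + final_score) / 4  # option 3: three tests + final
    return None  # two tests missed, or one test missed and no final taken


if __name__ == "__main__":
    print(course_grade([80, 90, 70, 60]))        # option 1
    print(course_grade([80, 90, 70, 60], 100))   # option 2
    print(course_grade([80, None, 70, 60], 90))  # option 3
```

Whether a student who sat all four tests may use the final to drop the lowest one is left to the caller in this sketch: recording that test as `None` applies option three.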

This system has had remarkable results. Only a handful of students come to the office door each year to ask about the possibility of a makeup. Beyond this, the percentage of students electing to miss any given concept test has averaged around 5 percent of those enrolled. And we are relatively certain that any who do miss a test under these circumstances have reasons that they think are justifiable.

Limitations of this part of the system should be mentioned. Most important, when a student misses an exam, that student has not been assessed on a significant percentage of course material. Although we have not yet tried it, one solution to this drawback would be to give more weight on the final exam to those items assessing the material on the missed exam. This weighting would be procedurally simple. Each student taking the final exam will do so either voluntarily as a fifth exam or to make up for a missed exam. The student's record will reveal which is the case, and, if the latter, which exam was missed. It is then a relatively simple matter to weight the items from the missed exam more heavily.
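Since this weighting has not been tried, the following Python sketch is only one possible way it might be carried out. It assumes each item on the final is tagged with the concept-test unit it covers and that items from the missed unit count a fixed multiple of the others; the function name, the tagging scheme, and the default multiplier of 2 are all hypothetical.

```python
def weighted_final_score(items, missed_unit=None, extra_weight=2.0):
    """Score a final exam, counting items from a missed unit more heavily.

    items: list of (unit, correct) pairs, one per final-exam item,
           where unit identifies the concept test covering that material
           and correct is True or False.
    missed_unit: the unit whose concept test the student missed, or None
                 for a student taking the final voluntarily.
    extra_weight: hypothetical multiplier for items from the missed unit.
    Returns the weighted percentage of items answered correctly.
    """
    if not items:
        raise ValueError("the final must contain at least one item")
    total = 0.0
    earned = 0.0
    for unit, correct in items:
        weight = extra_weight if unit == missed_unit else 1.0
        total += weight
        if correct:
            earned += weight
    return 100.0 * earned / total


if __name__ == "__main__":
    answers = [(1, True), (1, False), (2, True), (2, True)]
    print(weighted_final_score(answers))                 # voluntary final
    print(weighted_final_score(answers, missed_unit=1))  # made up for unit 1
```

A student who answers the missed unit's items poorly is penalized more under this scheme, and one who answers them well is rewarded more, which is the intended effect of the proposed weighting.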

Additionally, in a few rare cases, a student has tried to test the system either by challenging it or by missing two examinations. In the first category, an entire hockey team had their coach call, first, a department chair, and then the dean, trying to get an excused absence. These matters were easily dealt with as soon as the specifics, the rationale behind the system, and the clarity of the presentation to students at the outset of the course were explained to the administration.

When students miss two examinations, we've found it easy to deal with them on a case-by-case basis. Frequently those who miss two exams never even bother to come in and simply accept their failing grades.

Generating Test Items

It is certainly true that generating test questions is not particularly easy for any course. However, large class sizes produce unique pressures. One problem is introduced by the sheer volume of the class. The students are likely to fill the largest auditorium available, or at least a large amphitheater classroom. Thus, when tests are given there is no way to spread students out with a seat between each of them. There must be at least two (and possibly more) versions of each test given for each examination. We find that reordering the items accomplishes this.
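Reordering items to produce alternate versions is easy to mechanize. The Python sketch below is one way it might be done today; the function name, the seeded shuffle, and the representation of an item as a (question, answer) pair are assumptions for illustration and are not the authors' procedure, which was carried out on paper.

```python
import random


def make_version(items, seed):
    """Produce one reordered version of a test plus its answer key.

    items: list of (question_text, correct_answer) pairs.
    seed:  any value; the same seed always yields the same ordering,
           so a version can be regenerated when grading.
    Returns (version, key, order) where
      version is the reordered list of items,
      key maps the new item number (1-based) to the correct answer,
      order lists the original index of each item in the new version.
    """
    rng = random.Random(seed)
    order = list(range(len(items)))
    rng.shuffle(order)
    version = [items[i] for i in order]
    key = {number + 1: answer for number, (_, answer) in enumerate(version)}
    return version, key, order


if __name__ == "__main__":
    bank = [("Question %d" % n, "ABCD"[n % 4]) for n in range(1, 11)]
    for label, seed in (("A", 1), ("B", 2)):
        version, key, _ = make_version(bank, seed)
        print("Version", label, "key:", key)
```

Reordering only discourages copying between neighbors during a single sitting; it does not protect against questions leaking to a later sitting, which needs a different examination.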

If two or more sections are taking the examination at different times, each "sitting" will probably need entirely different examinations as well. Otherwise, information about the exam will flow from the earlier class to the later one. The most obvious way this can happen is if copies of the test are pilfered and removed from the examination room. However, even with stringent security, this is not the only way for exams to "get out." We once learned of a student in an early section who had taken the exam with a tape recorder in his pocket. He apparently sat at the back of the room and mouthed the questions into the recorder; then, he left the room, looked up the answers, and gave copies of the tests to his friends. We've also heard that social organizations have directed individual members to carry questions from the test in memory (i.e., "you do one to five and she will do six through ten," etc.).

Finally, large class sizes usually demand that entirely new tests be constructed for each examination each year. Large classes, which tend to be entry-level courses, are tempting targets for the development of files that can be passed on from year to year once students learn that exams may be repeated.

The thing that makes all of these seemingly paranoid fears more real is that a large class size escalates the value of misappropriated information or copies of exams. A graduate assistant, caught in a campus security forces raid, had apparently been selling copies of exams for $100 each.

All of these concerns mandate that a large number of test items be developed continually. The only problem is that the instructor of a large course may tend to specialize in it after a while. Since the same textbooks and relatively similar lectures are used year after year, the instructor may find a diminishing ability to generate new objective test items.

A popular solution to this problem, test banks supplied by textbook companies, may fail on two counts. One difficulty is that the test bank has questions that apply only to the text. As already stated, we find it desirable that lecture content and textual material be different. If this is the case, there may be a large body of information that will not be tested if questions come from test banks only. Once students figure this out (and they will), lecture attendance will fall.

Furthermore, our experience with textbook test banks has been mixed. Some of the questions seem poorly worded, ambiguous, or irrelevant.

Ironically enough, the resource that can solve this problem is the same as the one that causes most of the other problems: the large size of the class.

Student-Generated Test Items

The sheer volume of students in a large class represents a source of aid seldom recognized by teachers. Channeled in the right direction, the sum total of talents within a large class is usually more than equal to its challenges. Where a small class may have only four or five outstanding students, big ones may have fifty or more.

We have designed a system that enables this resource to be put to work. In a handout issued before the first exam, students are told that they can submit potential examination questions in a specified format. The motivations for students' writing test items are that (1) they can have the satisfaction of seeing their own questions used with their names attached (i.e., the instructor will identify the author of the question on the exam if the student wishes it); (2) if they submit the question they presumably will get it correct on the exam; and (3) the teacher agrees to "pay" them two points additional credit for each question chosen (the same as each question is worth on an examination/grading system based upon total points).

Telling students the correct format for submitting questions has proved to be crucial, as otherwise the instructor can be deluged with pieces of paper that are very difficult to process. For this reason we insist that students can submit up to ten questions per exam, that all questions must be on a standard 5"-x-7" card, that each question must be either typed or legibly printed on a separate card, and that information giving the correct answer, the source (i.e., page of the text, date of lecture, etc.), and the identity of the author must be provided for each question.

Over the years students operating under these constraints have provided many of our test items. We have been happily surprised by the quality of the questions. Although as many as ten questions may be sifted to get one good one (and even this one usually requires rewriting), we believe that those selected have been of a caliber at least as good as many of those in test banks and are often less trivial and more conceptual.

The students' reaction is always difficult to assess. We've tried to be particularly attentive to any dissatisfaction. Although we've occasionally had the complaint that "the exam (grade) doesn't reflect how much I know," the comments from mandatory student evaluations of course and instructor have been fairly repetitive of those we've received about questions generated traditionally. The only complaint unique to this system is that "the instructor shouldn't be so lazy as to let others write his exams," and these seem to be rare and to lack passion.




Reactions from colleagues and university administrators have been difficult to assess because most of them seem oblivious to the system. Although we've been careful to get administrative approval of this approach, permission to use it has proved easy to get. Once permission has been granted, we have yet to hear much about the whole concept from anyone not connected to the course, presumably because there have been no complaints.

At any rate, the system does seem to generate questions that are good to excellent once they have been filtered and rewritten. We believe it's important to make every effort not to include simplistic questions that measure merely rote memorization.

Our experience has shown that on the average there are normally from one to one-and-one-half questions submitted per test per student enrolled. Thus if three hundred students are enrolled in the class, the instructor may expect from 300 to 450 questions to be submitted per test, and more for multiple sections. Usually the total number of questions climbs with each successive test, as some students discover that they can increase their grades and others realize that they are now in academic trouble and need the bonus points. Furthermore, students repeatedly tell us that writing questions is an effective way to review for exams.

The total number of questions submitted may frighten some teachers, but it shouldn't because a number of techniques can make the job of processing them easier. First, their sheer numbers mandate that having some remote location for their deposit, like a faculty room mailbox, is probably a good idea.

Once the questions are all in one large stack, we suggest making an outline of topics to be covered on the exam. Then the items that "fit" can be used until the exam covers all the necessary topics. If an item looks promising it is kept; if not, it is discarded. In order to reward as many individual students as possible, we accept no more than two questions per student. This is fairly easy to keep track of as students submit the questions in batches that are paper-clipped or banded together.

It should be noted that it is not necessary to read all the questions submitted. In fact, we find that it's best to be honest about telling the students that we will simply reach into the stack and draw out questions until we have the right mix of good ones to create the exam desired. They seem to accept this lottery approach without much complaint.
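The lottery procedure just described (draw at random, keep only promising items that fit the outline, accept no more than two per student, and stop once every topic is covered) can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' actual workflow: the card fields, the `is_promising` callback standing in for the instructor's judgment, and the outline expressed as a count of items per topic are all hypothetical.

```python
import random


def draw_exam(cards, outline, is_promising, max_per_student=2, seed=None):
    """Draw submitted questions at random until the outline is covered.

    cards: list of dicts with 'author', 'topic', and 'text' keys
           (a hypothetical stand-in for the submitted cards).
    outline: dict mapping topic -> number of items wanted on the exam.
    is_promising: function(card) -> bool, standing in for the
                  instructor's judgment of each card drawn.
    Returns (exam, still_needed): the accepted cards and, per topic,
    how many items remain unfilled (all zero when the outline is met).
    """
    rng = random.Random(seed)
    stack = list(cards)
    rng.shuffle(stack)               # "reach into the stack" at random
    still_needed = dict(outline)
    accepted_by = {}                 # author -> number of accepted items
    exam = []
    for card in stack:
        if not any(still_needed.values()):
            break                    # outline covered; the rest go unread
        if still_needed.get(card["topic"], 0) == 0:
            continue                 # topic not wanted or already filled
        if accepted_by.get(card["author"], 0) >= max_per_student:
            continue                 # spread the reward across students
        if not is_promising(card):
            continue                 # discard weak items
        exam.append(card)
        still_needed[card["topic"]] -= 1
        accepted_by[card["author"]] = accepted_by.get(card["author"], 0) + 1
    return exam, still_needed


if __name__ == "__main__":
    submitted = [
        {"author": "student%d" % (n % 5), "topic": topic, "text": "..."}
        for n, topic in enumerate(["pricing", "promotion"] * 10)
    ]
    exam, missing = draw_exam(submitted, {"pricing": 3, "promotion": 3},
                              is_promising=lambda card: True, seed=1)
    print(len(exam), "items chosen; unfilled topics:", missing)
```

Because the loop stops as soon as the outline is filled, most of the stack is never examined, which mirrors the point that not every submitted question needs to be read.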

Using this procedure, we've found that a standard fifty-item test can be constructed in from two to three hours per test. This certainly compares favorably with the time necessary to create questions oneself. And since the questions selected are all typed on standard-sized cards with correct answers attached, this system is also usually popular with the word-processing department; it is an easy matter for them to turn a standard title page and a rubber-banded stack of fifty questions into a finished examination. Once their work is complete, they can then pass the stack of questions on to a person creating an answer key for grading purposes. Finally, copies of this key with the page number where test items are located can be passed back to students with their answer sheets so that they can check to see that an answer they missed really does exist.

For those faced with the responsibilities of teaching large classes, this article was intended to resolve some practical aspects of the problems of assessment. It is sometimes hard to depart from traditional methods without much soul-searching about whether the innovations are somehow a dilution of the quality of the original. The fact is, we have little empirical evidence to guide us, and we must accept the changes that the resources at hand dictate in a way that seems best for everyone.

REFERENCES

Bracht, G. H., and K. D. Hopkins. 1970. Communality of essay and objective tests of academic achievement. Educational and Psychological Measurement 30 (Summer): 359-64.

Cowles, J. T., and J. P. Hubbard. 1952. A comparative study of essay and objective examinations for medical students. Journal of Medical Education, Part 2. 27:14-17.

Hogan, T. P. 1981. Relationship between free response and choice-type tests of achievement: A review of the literature. Washington, D.C.: National Institute of Education. (ERIC Document Reproduction No. ED 224 811)

Thompson, R. E. 1965. A study of the comparative predictive validities of the essay and objective sections of the college entrance examination board advanced placement examination in physics. Princeton: Educational Testing Service. Test Development Report 65-4.

Warren, G. 1979. Essay versus multiple choice tests. Journal of Research in Science Teaching 16 (November): 563-67.



