SETTING CUTSCORES FOR CERTIFICATION EXAMS
Nathan A. Thompson, Ph.D.
Vice President, ASC
Adjunct Faculty, University of Cincinnati

Why are cutscores necessary?
As Glaser (1963) pointed out, the reason for the existence of many tests is to make decisions about people:
- Mastery: Pass/Fail educational content
- Credentialing: Award/not professional credential (certification, certificate, license)
- Pre-employment: Hire/not for job (or eligible as candidate)
- University selection: Admission/not to university or program
From Livingston (1980), discussing the rationale for cutscores:

Why are cutscores necessary?
What does that mean?
- Most of what we want to measure lies on a continuum (knowledge, intelligence) and does not fall naturally into “states” (e.g., male/female)
- So we need to set a cutscore (or cutscores) on the continuum to sort examinees into groups that reflect interpretations and meanings that are useful to us
- Pass is “qualified” and Fail is “unqualified”

How do we set a cutscore?
- As the Livingston excerpt notes, all cutscores involve a level of subjectivity or arbitrariness
- The higher the stakes of the exam, the more we need to reduce the arbitrariness
- Standard setting methods differ in their level of objectivity
- A more objective method provides an anchor to validity and defensibility

How do we set a cutscore?

Approach                 Example                                     Arbitrariness
Arbitrary round number   70% of items                                MOST
Quota                    Whatever passes 85% of people (z = -1.0)    MOST
Examinee-based           Borderline, Contrasting Groups              LEAST
Content-based            Angoff, Bookmark                            LEAST
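
The quota row can be made concrete with a small sketch. The function name `quota_cutscore` and the score list are hypothetical and only for illustration; the rule assumed here is that an examinee passes with a score at or above the cutscore. The sketch also checks the normal-curve figure quoted in the table: under a normal distribution, a cutscore at z = -1.0 passes roughly 84% of examinees, which is close to the 85% quoted.

```python
import math
from statistics import NormalDist


def quota_cutscore(scores, pass_rate):
    """Return the observed score at which `pass_rate` of examinees pass.

    Assumes "pass" means score >= cutscore; tied scores at the cutscore
    can push the realized pass rate slightly above the target.
    """
    ranked = sorted(scores, reverse=True)
    n_pass = math.ceil(pass_rate * len(ranked))
    return ranked[n_pass - 1]


# Hypothetical raw scores for 20 examinees.
scores = [52, 55, 58, 60, 61, 63, 64, 66, 67, 68,
          70, 71, 72, 74, 75, 77, 79, 81, 84, 90]

cut = quota_cutscore(scores, 0.85)
n_passing = sum(s >= cut for s in scores)

# Proportion of a normal distribution lying above z = -1.0.
proportion_above = 1 - NormalDist().cdf(-1.0)

print(cut, n_passing, round(proportion_above, 4))
```

Nothing about the content of the exam enters this calculation, which is why the table rates the approach as highly arbitrary.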

Examinee-based methods
Borderline Method
- Experts familiar with both the content AND all examinees identify those examinees they consider “borderline”
- The mean or median score for those examinees is the cutscore
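
The borderline calculation is a simple summary statistic. A minimal sketch follows, assuming the experts have already flagged the borderline examinees; the scores are made up for illustration.

```python
from statistics import mean, median

# Hypothetical test scores of the examinees that experts flagged as borderline.
borderline_scores = [61, 63, 64, 66, 70]

cutscore_mean = mean(borderline_scores)
cutscore_median = median(borderline_scores)

print(cutscore_mean, cutscore_median)
```

The median is less sensitive to a single unusually high or low borderline examinee, which can matter when the borderline group is small.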
Contrasting Groups Method
- Experts familiar with both the content AND all examinees sort examinees into Pass and Fail groups (or an external criterion is used)
- The point where the two score distributions cross is the cutscore
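
One way to locate the crossing point is sketched below. It rests on assumptions that are not part of the slides: each group's scores are approximated by a normal distribution, the two densities are compared without weighting by group size, and the crossing is found by bisection between the two group means. The function name and the data are hypothetical.

```python
from statistics import NormalDist


def contrasting_groups_cutscore(fail_scores, pass_scores, tol=1e-6):
    """Find where normal curves fitted to the Fail and Pass groups cross."""
    fail = NormalDist.from_samples(fail_scores)
    passed = NormalDist.from_samples(pass_scores)
    lo, hi = fail.mean, passed.mean
    if lo >= hi:
        raise ValueError("Pass group mean must exceed Fail group mean")

    def diff(x):
        # Negative where the Fail density dominates, positive where Pass does.
        return passed.pdf(x) - fail.pdf(x)

    if diff(lo) >= 0 or diff(hi) <= 0:
        raise ValueError("no clean crossing between the group means")

    # Bisection: shrink [lo, hi] around the sign change of diff.
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if diff(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical scores for examinees that experts sorted into each group.
fail_scores = [50, 55, 60, 65, 70]
pass_scores = [70, 75, 80, 85, 90]

cut = contrasting_groups_cutscore(fail_scores, pass_scores)
print(round(cut, 2))
```

With real data the groups usually differ in size and spread, so the crossing will not sit at the midpoint between the means as it does in this symmetric example.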

Examinee-based methods
Are conceptually appealing but have two large disadvantages:
- Require examinees to take the test first, so pass/fail decisions cannot be reported as soon as they finish the test
- Require a way to assign examinees into groups WITHOUT test scores: either experts who are familiar with all examinees or some sort of “gold standard”
- Example: For a practice test, results on the real test can be used as a gold standard to set the cutscore

Content-based methods
The Angoff and Bookmark methods require experts to look at items rather than candidates
- Bookmark: pilot all items, analyze difficulty statistics, order the items by difficulty in a booklet, and ask experts to place a bookmark where they judge the cutscore should fall
- Angoff: all experts provide a rating from 0 to 100 for each item (the percentage of minimally qualified candidates expected to answer it correctly); the average rating serves as the cutscore
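
The Angoff arithmetic can be sketched as follows. The ratings are invented, and the sketch assumes each rating is the percentage of minimally qualified candidates an expert expects to answer that item correctly, which is the usual reading of the 0 to 100 scale. Variable names are hypothetical.

```python
from statistics import mean

# Hypothetical ratings: one row per expert, one column per item (0 to 100).
ratings = [
    [70, 60, 80, 90],  # expert 1
    [60, 50, 80, 90],  # expert 2
    [80, 70, 80, 90],  # expert 3
]

# Average across experts for each item.
item_means = [mean(column) for column in zip(*ratings)]

# Cutscore as a percentage of items, and as a raw number-correct score.
percent_cutscore = mean(item_means)
raw_cutscore = sum(item_means) / 100

# Per-expert averages help spot raters who are much stricter or more lenient.
expert_means = [mean(row) for row in ratings]

print(percent_cutscore, raw_cutscore, expert_means)
```

Here the panel's recommendation works out to 75% of items, or 3 of the 4 items correct.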

Content-based methods
- The Angoff method is the most commonly used approach in certification testing and therefore quite legally defensible
- Biggest advantage: it does not require the test to be administered for data
- Can use data too, with the Beuk Compromise, to incorporate examinee-based aspects
- The drawback is that it requires a group of subject matter experts to rate all items, which can take time

Content-based methods
- The Bookmark method has the advantage that a rating is not required for every item from every expert (which takes a lot of time)
- The drawback is that it requires all items to be delivered to a decent-sized sample in order to obtain item difficulty statistics (might not be feasible)