Standard Setting for Professional Certification Brian D. Bontempo Mountain Measurement, Inc. brian@mountainmeasurement.com (503) 284-1288 ext 129

Page 1: Standard Setting for Professional Certification

Standard Setting for Professional Certification

Brian D. Bontempo
Mountain Measurement, Inc.

brian@mountainmeasurement.com

(503) 284-1288 ext 129

Page 2: Standard Setting for Professional Certification

Overview
• Definition of Standard Setting
• Management Issues relating to Standard Setting
• Standard Setting Process
• Methods of Standard Setting
• Using multiple methods of Standard Setting

Page 3: Standard Setting for Professional Certification

Definition of Standard Setting
• Standard setting is a process whereby decision makers render judgments about the performance level required of minimally competent examinees

Page 4: Standard Setting for Professional Certification

Types of Standards
• Relative Standard (Normative Standards)
  – Top 70% of scores pass
  – 20 points above average
• Criterion-Referenced Standard (Absolute Standards)
  – 70% of the items correct
  – 600 out of 800 scaled score
  – .05 logits
  – 20 items correct

Page 5: Standard Setting for Professional Certification

Why do we conduct Standard Setting?
• To objectively involve stakeholders in the test decision making process
• To connect the expectations of employers to the test decision making process
• To connect the reality of training to the test decision making process
• To ensure psychometric soundness & legal defensibility

Page 6: Standard Setting for Professional Certification

When to (re)set a passing standard
• For a new exam, after Beta Test data have been analyzed, typically after “Live” Test Forms have been constructed
• For exam revisions, when the expectations of a job role have changed
  – Practice has changed
  – Content domain has changed
  – It is not appropriate to change the passing standard whenever a test or training has been revised.
  – It is not appropriate to change the passing standard because of supply and demand issues (too many/few certified professionals)

Page 7: Standard Setting for Professional Certification

Who should lead a standard setting panel?
• An experienced Psychometrician
  – Insider perspective, familiar with your certification and exam development
  – Outsider perspective, not familiar with your certification and exam development

Page 8: Standard Setting for Professional Certification

How rigid should you be in your direction to the Psychometrician?
• I recommend a conversation between the Psychometrician and the Test Sponsor to figure out what works best. Typically a test sponsor will specify a framework (e.g., Angoff) and let the Psychometrician dictate the specifics.

Page 9: Standard Setting for Professional Certification

Outcomes of Standard Setting
• A conceptual (qualitative) definition of minimal competency
• A proposed numeric (quantitative) passing standard
• A set of alternate passing standards based on errors in the process
• Expected passing rate(s) from each standard
• A report documenting the process and the psychometric quality of the process

Page 10: Standard Setting for Professional Certification

Standard Setting Process

Page 11: Standard Setting for Professional Certification

Standard Setting Process
• Gather test data
• Assemble a group of judges
  – Define minimal competency
  – Train judges on the method
  – Render judgments on the performance of borderline examinees
• Calculate the passing standard by aggregating the judgments
• Evaluate the outcome by calculating the expected passing rate
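
The final step above (evaluating a proposed standard by its expected passing rate) can be sketched as follows. This is a minimal illustration, assuming raw scores from a beta test are available and that an examinee passes when scoring at or above the standard; the function name and the sample scores are hypothetical.

```python
def expected_passing_rate(scores, passing_standard):
    """Return the fraction of examinees scoring at or above the passing standard."""
    if not scores:
        raise ValueError("scores must not be empty")
    passed = sum(1 for score in scores if score >= passing_standard)
    return passed / len(scores)


# Hypothetical raw scores from a beta test (not real data)
beta_scores = [52, 61, 64, 68, 70, 71, 75, 78, 83, 90]
rate = expected_passing_rate(beta_scores, 70)
print(f"Expected passing rate: {rate:.0%}")
```

Running the same function against each alternate passing standard gives the range of passing rates that policymakers can compare.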

Page 12: Standard Setting for Professional Certification

Selecting your judges
• Representative Sample
  – Hiring Managers
  – Trainers
  – Entry-Level Practitioners
• How many judges is enough?
  – For a low stakes exam
    • at least 8 judges
  – For a medium stakes exam
    • at least 12 judges
  – For a high stakes exam
    • at least 16 judges

Page 13: Standard Setting for Professional Certification

Developing a Definition of Minimal Competency
• Identify 3 common tasks within each domain of the test blueprint (an easy, a hard, and a “Borderline” task)
• Characterize the performance of minimally competent examinees on each of the major tasks
• Write text that summarizes these discussions

Page 14: Standard Setting for Professional Certification

Training Judges
• Instruct them on their task
• Practice rating items
  – Two sets of practice items
• Practice discussing items
• Explain the stats that you will be providing them
• Set the tone and boundaries for good ‘group psychology’

Page 15: Standard Setting for Professional Certification

Standard Setting Methods

Page 16: Standard Setting for Professional Certification

Types of Standard Setting Methods
• Examinee-Centered Methods
  – Judges use external criteria, such as on the job performance, to evaluate the competency of real examinees
• Test-Centered Methods
  – Judges evaluate the performance of imaginary examinees on real test items
• Adjustments
  – In order to account for inaccuracy in the standard setting process, Psychometricians use real test data to provide a range of probable values for the passing standard

Page 17: Standard Setting for Professional Certification

Examinee-Centered Methods
• Borderline group
  – Using external criteria (such as performance on the job), judges identify a group of examinees that they think are borderline examinees. The average score of this group is the passing standard
• Contrasting groups
  – Using external criteria, judges classify examinees as passers or failers. The passing standard is established by determining the point which discriminates the best between the scores of both groups
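
A minimal sketch of both examinee-centered calculations follows. The borderline-group function simply averages the scores of the examinees judged to be borderline. For contrasting groups, "discriminates the best" is interpreted here as the cut score that misclassifies the fewest examinees; that interpretation, the tie-breaking rule (lowest such cut), the function names, and the sample scores are all assumptions made for illustration.

```python
from statistics import mean


def borderline_group_standard(borderline_scores):
    """Passing standard = average test score of the judged-borderline examinees."""
    return mean(borderline_scores)


def contrasting_groups_standard(passer_scores, failer_scores):
    """Return the cut score that misclassifies the fewest examinees.

    An examinee at or above the cut is predicted to pass. Ties are broken
    in favor of the lowest cut score (an arbitrary choice for this sketch).
    """
    candidates = sorted(set(passer_scores) | set(failer_scores))
    best_cut, best_errors = None, None
    for cut in candidates:
        false_fails = sum(1 for s in passer_scores if s < cut)
        false_passes = sum(1 for s in failer_scores if s >= cut)
        errors = false_fails + false_passes
        if best_errors is None or errors < best_errors:
            best_cut, best_errors = cut, errors
    return best_cut


# Hypothetical scores (not real data)
borderline = [66, 68, 70, 72]
judged_passers = [68, 72, 75, 80, 85]
judged_failers = [50, 58, 62, 66, 70]

print(borderline_group_standard(borderline))
print(contrasting_groups_standard(judged_passers, judged_failers))
```

In practice the two score distributions are often smoothed before the crossover point is located; the raw count used here is only the simplest version of the idea.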

Page 18: Standard Setting for Professional Certification

Test-Centered
• Modified-Angoff
  – Angoff, W.H. (1971). Scales, norms, and equivalent scores. In R.L. Thorndike (Ed.), Educational Measurement (2nd ed.). Washington, DC: American Council on Education.
• Bookmark
  – Mitzel, H.C., Lewis, D.M., Patz, R.J., & Green, D.R. (2001). The Bookmark Procedure: Psychological perspectives. In G.J. Cizek (Ed.), Setting Performance Standards. Mahwah, NJ: Lawrence Erlbaum Associates.

Page 19: Standard Setting for Professional Certification

Basic Angoff Process
• Judges evaluate each item
  – What percentage of minimally competent (MC) examinees would get the item correct?
• Feedback/Discussion
• Judges make adjustments to their ratings
• Average of all items is the judge’s passing standard
• Average of all judges’ standards is the passing standard
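
The two averaging steps above can be sketched as follows. This is a minimal illustration, assuming each judge's final ratings are stored as proportions between 0 and 1 with one rating per item; the function name and the ratings shown are hypothetical.

```python
from statistics import mean


def angoff_passing_standard(ratings):
    """Compute per-judge standards and the panel's passing standard.

    ratings[j][i] is judge j's estimate of the proportion of minimally
    competent examinees who would answer item i correctly.
    """
    judge_standards = [mean(judge_ratings) for judge_ratings in ratings]
    panel_standard = mean(judge_standards)
    return judge_standards, panel_standard


# Hypothetical final ratings: 3 judges x 4 items (not real data)
final_ratings = [
    [0.60, 0.70, 0.80, 0.50],
    [0.70, 0.70, 0.90, 0.50],
    [0.50, 0.80, 0.70, 0.60],
]
judge_standards, panel_standard = angoff_passing_standard(final_ratings)
print(judge_standards)
print(f"Passing standard: {panel_standard:.1%} of items correct")
```

Multiplying the resulting proportion by the number of items on the form converts it to a raw-score standard.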

Page 20: Standard Setting for Professional Certification

Common Angoff Issues
• Wording of the rating question: “What percentage of MC candidates would get the item correct?”
  – MCs vs. all candidates
    • MCs is correct
  – “would” vs. “should”
    • “would” is correct

Page 21: Standard Setting for Professional Certification

Common Angoff Issues
• What type of ratings should judges make?
  – 1/0 (Yes/No)
  – Percentage of Borderline examinees
    • Round to 1 decimal (.9)
    • Round to 2 decimals (.92)
  – NEVER use percentage of all examinees

Page 22: Standard Setting for Professional Certification

Common Angoff Issues
• Types of Feedback to provide
  – Group Discussion
    • Relate to conceptual definition of minimal competency
      – Typical or atypical content
      – Relevancy
    • Relate to item nuances
      – Item Stem
      – Item Distractors
    • “I expect a lot of the MC because this is core content and the item is straightforward.”
    • “I would like to cut the MC some slack because this is not covered well in training and the scenario is a little abstract.”

Page 23: Standard Setting for Professional Certification

Common Angoff Issues
• Types of Feedback to provide
  – Empirical Data
    • Answer Key – Yes!
    • Percentage of Borderline examinees answering the item correctly – If possible yes
    • P-Value (Percentage of examinees answering the item correctly) – Only if the percentage of Borderline examinees is not available

Page 24: Standard Setting for Professional Certification

Common Angoff Issues
• When to provide feedback?
  – Initial Rating
  – Discuss items
  – Secondary Rating
  – Provide Empirical Data
  – Tertiary Rating

Page 25: Standard Setting for Professional Certification

Bookmark
• Test is divided up into sub tests
  – By domain OR
  – Equal variance of difficulty across sub tests
• Items are sorted from easiest to hardest
  – By judges OR
  – By actual value
• Judges bookmark the subtest at the point where the MC examinee would stop getting items correct and start getting them incorrect
  – The lowest possible standard
  – The expected standard
  – The highest possible standard
• Judges discuss ratings & make adjustments
• Passing standard is average # of items answered correct

Page 26: Standard Setting for Professional Certification

Common Bookmark Issues
• How many Ordered Item Booklets (OIB)?
  – One for each content domain
  – An equivalent number that meet the test plan

Page 27: Standard Setting for Professional Certification

Common Bookmark Issues
• How should I select Items for the OIB?
  – Minimize the distance in difficulty between any two adjacent items.
    • Ensure that there are enough items at all difficulty levels for each OIB
    • Ensure that the variance in item difficulty is the same for each OIB

Page 28: Standard Setting for Professional Certification

Common Bookmark Issues
• How should I sort the item booklets?
  – Easiest to Hardest
  – Hardest to Easiest

Page 29: Standard Setting for Professional Certification

Common Bookmark Issues
• How do I know when the MC would stop getting items correct and start getting them incorrect? (What is the appropriate RP value?)
  – .5
  – .67* Most Common
  – .75
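
To illustrate what the choice of RP (response probability) value does, the sketch below assumes the items are calibrated with a Rasch model, which the slides do not state. Under that assumption, the ability at which an examinee answers an item of difficulty b correctly with probability RP is b + ln(RP / (1 - RP)), so a higher RP pushes the implied standard upward. The function name is hypothetical.

```python
import math


def bookmark_theta(item_difficulty, rp=0.67):
    """Ability (in logits) at which a Rasch item is answered correctly with probability rp.

    Assumes the Rasch model: P(correct) = 1 / (1 + exp(-(theta - b))).
    """
    if not 0 < rp < 1:
        raise ValueError("rp must be strictly between 0 and 1")
    return item_difficulty + math.log(rp / (1 - rp))


# Same bookmarked item (difficulty 0.0 logits), three RP conventions
for rp in (0.5, 0.67, 0.75):
    print(f"RP = {rp}: theta = {bookmark_theta(0.0, rp):.3f}")
```

With RP = .5 the standard equals the item difficulty itself; the .67 and .75 conventions add a constant offset to it.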

Page 30: Standard Setting for Professional Certification

Common Bookmark Issues
• How do I convert the bookmark to a passing standard?
  – Previous Item (PI) – Take the difficulty of the easier of the two items on either side of the bookmark
  – Between Item (BI) – Take the average of difficulty of the two items
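
The two conversion rules can be sketched as follows. This is a minimal illustration, assuming the booklet's item difficulties are already sorted from easiest to hardest and that the bookmark is recorded as the number of items that fall before it; the function name, the indexing convention, and the difficulties shown are hypothetical.

```python
def bookmark_to_standard(ordered_difficulties, bookmark_index, rule="PI"):
    """Convert a bookmark placement to a passing standard on the difficulty scale.

    ordered_difficulties: item difficulties sorted easiest to hardest.
    bookmark_index: number of items before the bookmark, so the bookmark sits
        between positions bookmark_index - 1 and bookmark_index.
    rule: "PI" (Previous Item) or "BI" (Between Item).
    """
    if not 0 < bookmark_index < len(ordered_difficulties):
        raise ValueError("bookmark must fall between two items")
    easier = ordered_difficulties[bookmark_index - 1]
    harder = ordered_difficulties[bookmark_index]
    if rule == "PI":
        return easier
    if rule == "BI":
        return (easier + harder) / 2
    raise ValueError("rule must be 'PI' or 'BI'")


# Hypothetical ordered item booklet (difficulties in logits)
booklet = [-1.5, -0.5, 0.0, 0.5, 1.0]
print(bookmark_to_standard(booklet, 3, rule="PI"))
print(bookmark_to_standard(booklet, 3, rule="BI"))
```

The smaller the gap in difficulty between adjacent items, the less the choice between PI and BI matters.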

Page 31: Standard Setting for Professional Certification

Compare Angoff and Bookmark
• Angoff requires less preparation
  – Select a real test form as opposed to building the OIBs
• Judges understand Bookmark better
  – Rating the difficulty of an item is a difficult task
• Bookmark requires more test items
  – I’d recommend an item pool of at least 40 solid test items per content domain

Page 32: Standard Setting for Professional Certification

Other Test Centered Methods
• Ebel
• Nedelsky
• Jaeger
• Rasch Item Mapping

Page 33: Standard Setting for Professional Certification

Ebel
• Judges sort each item into piles
  – How difficult is this item for the MC examinee?
    • Easy, moderate, or hard
  – How relevant is this content for practice?
    • Critical, Moderately important, Not relevant
• Judges then estimate the percentage of items in each pile that MC examinees would get correct
• The passing standard is then determined by multiplying the number of items in each cell by the percentage and summing all values
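
The cell-by-cell arithmetic can be sketched as follows. This is a minimal illustration, assuming each cell is identified by a (relevance, difficulty) pair and that the judges' estimates have already been agreed per cell; the function name, the cell labels, and all counts and proportions are hypothetical.

```python
def ebel_passing_standard(item_counts, expected_proportions):
    """Sum, over all cells, of (number of items) x (proportion MC examinees get correct)."""
    total = 0.0
    for cell, n_items in item_counts.items():
        total += n_items * expected_proportions[cell]
    return total


# Hypothetical cells keyed by (relevance, difficulty); not real data
item_counts = {
    ("critical", "easy"): 10,
    ("critical", "hard"): 5,
    ("moderately important", "easy"): 8,
    ("moderately important", "hard"): 2,
}
expected_proportions = {
    ("critical", "easy"): 0.90,
    ("critical", "hard"): 0.60,
    ("moderately important", "easy"): 0.75,
    ("moderately important", "hard"): 0.50,
}
standard = ebel_passing_standard(item_counts, expected_proportions)
print(f"Passing standard: {standard:.1f} of {sum(item_counts.values())} items")
```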

Page 34: Standard Setting for Professional Certification

Nedelsky
• Judges determine which response options are unrealistic for each item
• The probability of a guessed correct response is calculated
• The sum of the probabilities is the passing standard
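
A minimal sketch of the Nedelsky calculation for a single judge follows. It assumes the MC examinee guesses at random among the options the judge did not eliminate, so the probability of a correct guess on an item is 1 divided by the number of remaining options. The function name and sample data are hypothetical, and how ratings from several judges are combined is not covered here.

```python
def nedelsky_passing_standard(options_per_item, eliminated_per_item):
    """Sum of per-item probabilities of guessing correctly among the remaining options."""
    if len(options_per_item) != len(eliminated_per_item):
        raise ValueError("both lists must have one entry per item")
    total = 0.0
    for n_options, n_eliminated in zip(options_per_item, eliminated_per_item):
        remaining = n_options - n_eliminated
        if remaining < 1:
            raise ValueError("the correct option can never be eliminated")
        total += 1 / remaining
    return total


# Hypothetical 4-item test with 4 options per item (one judge's eliminations)
options = [4, 4, 4, 4]
eliminated = [2, 1, 3, 0]
print(round(nedelsky_passing_standard(options, eliminated), 2))
```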

Page 35: Standard Setting for Professional Certification

Jaeger
• Judges evaluate each item
  – Yes/No - “Should every entry-level practitioner answer this item correctly?”
• Judges discuss ratings & make adjustments
• Judges are provided passing rate based on standard & make adjustments
• Passing standard is calculated by summing the number of “Yes” responses

Page 36: Standard Setting for Professional Certification

Test-Centered Options
• What the ratings are based on
  – Should or would MC get this right
• How ratings are made
  – Yes/No, Percentage
• Relevance adjustments
• Guessing adjustments
• What kind of feedback is provided
  – Passing rate
  – Other judges’ ratings
  – Actual item difficulty

Page 37: Standard Setting for Professional Certification

Using Multiple Methods of Standard Setting

Page 38: Standard Setting for Professional Certification

Why use Multiple Methods?
• There is error in every standard setting
• Allows policymakers to “decide” on the standard rather than science simply documenting the outcomes of a panel
• Allows for the recovery of standard setting sessions that go awry
• Involves more stakeholders

Page 39: Standard Setting for Professional Certification

Adjustments
• Simple Stats – Calculate the confidence interval around the estimate
• Beuk – Judges provide an expected passing score and an expected passing rate. Calculations are made that are based on the variability in these two estimates
• De Gruijter – Similar to Beuk, but judges also provide an estimate of the uncertainty of their judgments.
• Hofstee – Judges indicate the highest and lowest passing score and passing rate. These values are plotted along with the cumulative frequency distribution and the point of intersection is the passing standard
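
The "Simple Stats" adjustment can be sketched as follows. This is a minimal illustration, assuming the estimate is the mean of the individual judges' standards and that a normal-approximation interval (mean plus or minus z standard errors, with z = 1.96 for roughly 95% coverage) is acceptable; the slides do not specify the interval formula, so both choices are assumptions, and the function name and data are hypothetical.

```python
import math
from statistics import mean, stdev


def passing_standard_interval(judge_standards, z=1.96):
    """Return (low, center, high) for the panel's passing standard.

    The standard error is the sample standard deviation of the judges'
    standards divided by the square root of the number of judges.
    """
    n = len(judge_standards)
    if n < 2:
        raise ValueError("need at least two judges to estimate variability")
    center = mean(judge_standards)
    standard_error = stdev(judge_standards) / math.sqrt(n)
    return center - z * standard_error, center, center + z * standard_error


# Hypothetical per-judge standards as proportions correct (not real data)
standards = [0.62, 0.66, 0.70, 0.64, 0.68]
low, center, high = passing_standard_interval(standards)
print(f"{low:.3f} to {high:.3f} around {center:.3f}")
```

The low and high ends of the interval are natural candidates for the alternate passing standards presented to policymakers. With a small panel, a t-based multiplier would give a somewhat wider interval than 1.96.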

Page 40: Standard Setting for Professional Certification

Survey of Hiring Managers
• Ask hiring managers about the workforce
  – What percentage of certified persons do you believe to be minimally competent?
  – Are your certified persons more competent than your uncertified persons?
• Expands the reach of your exam

Page 41: Standard Setting for Professional Certification

Triangulating results
• Psychometrician should present the outcome of each method and the passing rate associated with each outcome
  – A range of possible values
• Policymakers can use this information and “their professional experience” to set the actual passing standard

Page 42: Standard Setting for Professional Certification

Wrap-Up

Page 43: Standard Setting for Professional Certification

3 Vital Recommendations
• Have more judges at standard setting
• Spend more time training your judges
• With each standard setting ensure that you take the time to define minimal competency conceptually and don’t forget to document this definition.

Page 44: Standard Setting for Professional Certification

Concluding Remarks
• Many people like to think of test makers as big, bad people, which is obviously not true. Standard setting is one example of how inclusive the scientific process of test development can be. I encourage folks to make this process light and fun.

Page 45: Standard Setting for Professional Certification

Thank you for paying attention!

Questions & Comments: brian@mountainmeasurement.com