nathan-thompson
Workshop: Item Writing
Objectives for Today
Item writing and development:
  Terminology
  General Guidelines
  Writing CR items
Next week: Examples and reviewing
Objectives of Item Development
1. Build exams that are high quality, legally defensible, and produce reliable and valid score interpretations
2. Create items that directly assess the knowledge and skills in question
3. Minimize (ultimately eliminate) distractions and undue influence
Terminology
Item: a test ‘question’ or prompt designed to assess knowledge, skills or abilities
Stem: the part of the item that presents the content
Asset/Stimulus: the part of the item that presents or exhibits information, might be a reading passage, image, or audio
Options: the available answers
Key: the correct answer
Distractor: an incorrect option
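As a rough illustration of how these terms fit together, here is a minimal Python sketch. The `Item` class and its field names are assumptions made for this example, not part of any particular item banking tool; the sample content is the Norway item used elsewhere in this workshop.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Item:
    """A selected-response item, using the workshop's terminology (hypothetical class)."""
    stem: str                    # the part of the item that presents the content
    options: List[str]           # all available answers
    key: str                     # the correct answer; must be one of the options
    asset: Optional[str] = None  # optional stimulus, e.g. a reading passage or image path

    def __post_init__(self) -> None:
        if self.key not in self.options:
            raise ValueError("the key must be one of the options")

    @property
    def distractors(self) -> List[str]:
        """The distractors are the options that are not the key."""
        return [opt for opt in self.options if opt != self.key]


item = Item(
    stem="What is the capital of Norway?",
    options=["Oslo", "Bergen", "Stavanger", "Stockholm"],
    key="Oslo",
)
print(item.distractors)  # ['Bergen', 'Stavanger', 'Stockholm']
```

Deriving the distractors from the options and the key, rather than storing them separately, keeps the three from drifting out of sync.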
[Figure: a sample item with its stem, asset, key, distractors, and options labeled]
Terminology
Response: the recorded input of an individual examinee
Rubric: a clearly defined set of criteria (rules) for scoring a free response item
  Convert open responses to a scale of numbers
  Can have multiple rubrics on one item
Item Types
There are two major item type categories:
1. Selected Response
  Multiple choice
  Multiple response
  Drag and drop or matching
2. Constructed Response (aka free or open)
  Essay or similar (e.g., math problem)
  Short answer or Fill-In-The-Blank (can still be automatically scored)
  Performance tests
Example: Multiple Choice
What is the capital of Norway?
a. Oslo*
b. Bergen
c. Stavanger
d. Stockholm
Example: Multiple Response
Which of the following are cities in Norway?
a. Oslo*
b. Copenhagen
c. Stavanger*
d. Stockholm
(Drag-and-drop version: sort each city into one of two bins, "City in Norway" or "Not a City in Norway")
Drag and Drop is often the same as Multiple Response!
Example: Scored short answer
James purchased a music album for $8. It was discounted by 20%. What was the regular price?
(Student would type response in the box; acceptable answers might be: $10, ten dollars, 10 dollars)
More difficult and higher fidelity than MC
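A minimal sketch of how a short-answer item like this might be scored automatically. The accepted-answer set and the normalization rules are assumptions for illustration; a real answer key would be reviewed against actual student responses.

```python
import re

# Hypothetical answer key for the album item.
# The regular price is 8 / (1 - 0.20) = 10 dollars.
ACCEPTED = {"10", "10.00", "ten"}


def normalize(response: str) -> str:
    """Lowercase the response, drop the currency symbol or word, and trim whitespace."""
    text = response.strip().lower()
    text = text.replace("$", "")
    text = re.sub(r"\bdollars?\b", "", text)
    return text.strip()


def score_short_answer(response: str) -> int:
    """Return 1 if the normalized response matches an accepted answer, else 0."""
    return 1 if normalize(response) in ACCEPTED else 0


for response in ["$10", "ten dollars", "10 dollars", "$8"]:
    print(response, "->", score_short_answer(response))
```

Normalizing before matching keeps the answer key short: three accepted forms cover "$10", "ten dollars", "10 dollars", and similar variants.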
Example: Fill in the blank
_________ is the capital of Norway.
(Student would type response in the box)
Example: Essay
Write a detailed account of how Oslo became the capital of Norway. Why was it a good choice? Provide three reasons to support your position.
This lends itself to rubrics:
  Position? 0/1 points
  Reasons? 0/1/2/3 points
  Historical accuracy? 0/1/2 points
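The three rubrics for the essay example can be sketched as a small lookup of maximum points plus a function that validates and sums a rater's scores. The names `RUBRIC_MAX` and `score_essay` are hypothetical, chosen for this example only.

```python
# Maximum points per rubric, taken from the essay example above.
RUBRIC_MAX = {"position": 1, "reasons": 3, "historical_accuracy": 2}


def score_essay(ratings: dict) -> int:
    """Sum a rater's points after checking each one against its rubric's range."""
    if set(ratings) != set(RUBRIC_MAX):
        raise ValueError("ratings must cover every rubric exactly once")
    for name, points in ratings.items():
        if not 0 <= points <= RUBRIC_MAX[name]:
            raise ValueError(f"{name}: {points} is outside 0..{RUBRIC_MAX[name]}")
    return sum(ratings.values())


# A response with a clear position, two valid reasons, and accurate history.
total = score_essay({"position": 1, "reasons": 2, "historical_accuracy": 2})
print(total, "out of", sum(RUBRIC_MAX.values()))  # 5 out of 6
```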
Construct-Irrelevant Variance
The enemy of all tests is construct-irrelevant variance.
Scores should reflect an examinee’s knowledge, skills, or abilities as they relate to the construct of interest (in this case, course competency).
Measurement of anything else is irrelevant and unhelpful.
Remember that reliability is ~unidimensionality
Construct-Irrelevant Variance
Our Goal: Reduce construct-irrelevant variance
Construct-Irrelevant Variance
It’s a scientific experiment; we want to hold all variables constant except the variable of interest.
Guidelines for Item Writing
The following are some generally applicable guidelines, regardless of test purpose or item type
Validity: Remember original purpose
What is the goal of the test?
  Show minimal subject mastery?
  Show mastery at a range of levels?
  Differentiate the top students?
  Identify students that need remediation?
Clear Information
Be clear and concise in the item’s content:
  Provide all information that is needed
  Do not provide extraneous or superfluous information unless it serves as a distractor
Make sure formatting is as clear as possible
Utilize Blueprints!
Make sure that the content of items maps to the blueprint as directly as possible
Record rationale/source/etc.
Essential link in the validity chain of evidence
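One way to check the item-to-blueprint mapping is a simple coverage count. This is only a sketch: the blueprint areas, item IDs, and record fields below are invented for illustration.

```python
from collections import Counter

# Hypothetical blueprint: content area -> number of items the exam should contain.
blueprint = {"Terminology": 3, "Item types": 2, "Scoring": 2}

# Hypothetical item bank: each item records its blueprint area and its source.
items = [
    {"id": "ITM-001", "area": "Terminology", "source": "course text, ch. 1"},
    {"id": "ITM-002", "area": "Terminology", "source": "course text, ch. 1"},
    {"id": "ITM-003", "area": "Item types", "source": "course text, ch. 2"},
    {"id": "ITM-004", "area": "Item types", "source": "course text, ch. 2"},
    {"id": "ITM-005", "area": "History", "source": "lecture notes"},
]


def blueprint_gaps(blueprint, items):
    """Return ({area: shortfall}, [ids of items whose area is not in the blueprint])."""
    counts = Counter(item["area"] for item in items)
    shortfalls = {
        area: needed - counts[area]
        for area, needed in blueprint.items()
        if counts[area] < needed
    }
    unmapped = [item["id"] for item in items if item["area"] not in blueprint]
    return shortfalls, unmapped


shortfalls, unmapped = blueprint_gaps(blueprint, items)
print(shortfalls)  # {'Terminology': 1, 'Scoring': 2}
print(unmapped)    # ['ITM-005']
```

Storing the source alongside each item means the rationale travels with the item through review.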
Think like an examinee
While writing an item, it’s important to think like an examinee.
Very important but often overlooked: Quality distractors
What is Appropriate Difficulty?
Write items of appropriate difficulty…
While a very difficult item might be correct and actually quite “good,” it might not serve the purposes of a test of minimal competence
Some tests call for a narrow range of difficulty
Some situations call for a wide range
  A wide range enhances reliability because it produces more score variance
Rationale and Source
Whenever possible, record the rationale and source or reference for finding the correct answer
For example: “Answer B is correct because the stem says _____; C and D would not have an effect and A would actually counteract because _____.”
“Found on Page 125 of Jackson 2013 Text”
Maintain Grammar
If a question mark completes the stem, options should be formatted as stand-alone phrases. No need for punctuation.
What is the capital of Norway?
a. Oslo*
b. Bergen
c. Stavanger
d. Stockholm
Maintain Grammar
If the stem does not end with punctuation, the options should complete the stem’s sentence.
The capital of Norway is
a. Oslo.*
b. Bergen.
c. Stavanger.
d. Stockholm.
Maintain Grammar
Capitalize appropriately – proper nouns require capitalization, but otherwise it is generally unnecessary.
Washington, D.C. is the _____ of the United States.
a. capital*
b. largest city
c. primary port
d. southernmost city
How to Write Items
1. Identify a relevant situation or a piece of necessary knowledge that you’d like to evaluate. Consult your Blueprint.
2. Browse textbooks, references and sources that are relevant to the exam – generate ideas!
3. Determine how to structure the item
  Correct answer
  Distractors!
How to Write Items
Best Practices in quality control for Multiple Choice questions:
Ensure that the key is truly correct
Check that distractors are fully incorrect, but plausible
Review the stem to make sure all necessary information is presented
Make sure that the “question” part of the stem is clear and indicates the type of response necessary
How to Write Items
Examples and counterexamples of these specific guidelines will be covered in the next workshop (Item Review)
Constructed Response Items
Goal:
Forming a connection from complex responses and real-life situations to reliable scores
How to Write Items: CR
Examples of constructed response items:
Solving a practical problem (high fidelity)
Proposing solutions with explanations
Creating a solution within certain parameters
Essays (argumentative or creative)
Synthesizing information
How to Write Items: CR
Guidelines for constructed response items:
Determine the topic for the item
Establish the scenario
Determine all necessary information
Reduce/eliminate unnecessary information (unless a distractor!)
Think of the steps, write the item, answer it yourself as a student
Scoring CR Items
Scoring of CR items can be difficult due to complexity
Remember that the most interesting item in the world does not do any good if there is no way to score it accurately!
If possible, link scoring to the algorithm of problem solving
Weight by difficulty or criticality
  Forgetting to round as last step, or provide units
  …vs…
  Utilizing incorrect information from scenario
Scoring CR Items
Approaches to CR scoring (keep in mind while writing)
  Score on process
  Score on results
  Score on both
Did the student complete each step? Did they reach the correct answer?
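The "score on both" approach can be sketched as follows. The point values (1 per completed step, 2 for the correct answer) are assumptions for illustration; in practice they would be weighted by difficulty or criticality.

```python
def score_cr(steps_completed, answer_correct, points_per_step=1, points_for_answer=2):
    """Score a constructed-response item on both process and result.

    steps_completed: list of booleans, one per expected step in the solution.
    answer_correct:  whether the final answer is correct.
    """
    process = points_per_step * sum(steps_completed)
    result = points_for_answer if answer_correct else 0
    return {"process": process, "result": result, "total": process + result}


# A student who completed two of three steps but still reached the correct answer.
print(score_cr([True, True, False], answer_correct=True))
# {'process': 2, 'result': 2, 'total': 4}
```

Setting `points_per_step=0` gives results-only scoring, and `points_for_answer=0` gives process-only scoring, so one function covers all three approaches.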
Scoring CR Items
Ways to convert a CR item to points
  Rubrics
  Points for errors/completions
  Points for answers or multiple answers
These make your life easier and standardize the scoring, making it more reliable
Rubrics
Rubrics are very helpful
  A set of rules to convert open responses to score points
Rubric/Criteria: What you are rating
Rating scale: Axis, with point levels
Descriptors: Examples of what each level means
Rubrics
Identify axes (often driven by curriculum)
Establish relevant point levels (can differ)
Establish descriptors
Revisit point levels
Observable or isolatable
Rubrics
Some examples of rubrics with dos/don’ts
Mention that they have their own set of statistics – inter-rater reliability, agreement, read-behinds, etc.
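Two of the rater statistics mentioned here, exact agreement and a chance-corrected index (Cohen's kappa), can be computed in a few lines. This is a minimal sketch with made-up ratings from two hypothetical raters on a 0-3 rubric; it does not handle the degenerate case where expected agreement equals 1.

```python
from collections import Counter


def exact_agreement(rater_a, rater_b):
    """Proportion of responses on which both raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of responses")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa: (observed - expected agreement) / (1 - expected agreement)."""
    n = len(rater_a)
    p_observed = exact_agreement(rater_a, rater_b)
    count_a, count_b = Counter(rater_a), Counter(rater_b)
    # Expected chance agreement from each rater's marginal score distribution.
    p_expected = sum(count_a[c] * count_b[c] for c in count_a) / (n * n)
    return (p_observed - p_expected) / (1 - p_expected)


# Made-up scores for eight essays.
rater_a = [0, 1, 2, 2, 3, 1, 0, 2]
rater_b = [0, 1, 2, 1, 3, 1, 0, 3]
print(exact_agreement(rater_a, rater_b))         # 0.75
print(round(cohens_kappa(rater_a, rater_b), 3))  # 0.673
```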
Using multiple answers
Provide a complex scenario, ask the student to list every piece of information they would need to solve it (e.g., there are 5)
  +3 points for each correct
  -3 for each missing
  -3 for each supplied that is not correct
Note: You could earn -15!
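The scoring rule above can be sketched directly with sets. The five required pieces of information are placeholders (`"A"` through `"E"`), and the function name is hypothetical.

```python
def score_list_item(required, supplied, points=3):
    """+points per correct entry, -points per missing entry, -points per incorrect entry."""
    required, supplied = set(required), set(supplied)
    correct = len(required & supplied)   # required pieces the student listed
    missing = len(required - supplied)   # required pieces the student left out
    wrong = len(supplied - required)     # listed pieces that are not required
    return points * (correct - missing - wrong)


REQUIRED = {"A", "B", "C", "D", "E"}  # placeholder names for the 5 needed pieces

print(score_list_item(REQUIRED, REQUIRED))              # 15
print(score_list_item(REQUIRED, set()))                 # -15
print(score_list_item(REQUIRED, {"A", "B", "C", "X"}))  # 0
```

A blank response earns -15. Under the rule as written, incorrect extras would push the score lower still, so a floor may be worth considering.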
Performance Testing
Deductions or additions due to criticality
Example: 100 possible points
  Cutscore = 80
  -7 for minor error
  -14 for moderate error
  -21 for critical error (e.g., safety)
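A minimal sketch of this deduction scheme. It assumes a candidate passes when the score is at or above the cutscore, which the slide does not state explicitly, and it applies no floor at zero.

```python
MAX_POINTS = 100
CUTSCORE = 80
DEDUCTIONS = {"minor": 7, "moderate": 14, "critical": 21}


def score_performance(errors):
    """Return (score, passed) for a list of observed error severities.

    Assumes passing means score >= CUTSCORE (an assumption, not from the slide).
    """
    score = MAX_POINTS - sum(DEDUCTIONS[severity] for severity in errors)
    return score, score >= CUTSCORE


print(score_performance(["minor", "minor"]))     # (86, True)
print(score_performance(["critical"]))           # (79, False)
print(score_performance(["minor", "moderate"]))  # (79, False)
```

With these values a single critical error is enough to fail, while up to two minor errors still pass, which is the point of weighting deductions by criticality.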
Performance Testing
Interestingly, Performance Testing still lacks a true psychometric theory
Readings
Haladyna, T. M., Rodriguez, M. C., & Downing, S. M. (2013). Developing and validating test items. NY: Routledge.
Downing, S. M., & Haladyna, T. M. (Eds.). (2006). Handbook of test development.
Lots of free resources on the internet (ASC has an item writing guide…)
Question and Answer