Michigan Assessment Consortium
Common Assessment Development Series
Rubrics and Scoring Guides
Developed by…
Bruce R. Fay, PhD
Assessment Consultant
Wayne RESA
Support
The Michigan Assessment Consortium professional development series in common assessment development is funded in part by the Michigan Association of Intermediate School Administrators in cooperation with the Michigan Department of Education, Michigan State University, Ingham and Ionia ISDs, Oakland Schools, and Wayne RESA.
What you will learn
Why and when you need rubrics
Different kinds of rubrics
How to develop a rubric
How to use a rubric
What scoring guides are
How to use scoring guides
Subjectivity in Scoring
No such thing
If it’s truly subjective, it’s just someone’s opinion, and is of little or no value to the person being assessed
The problem is Bias
There are sources of bias in all assessment methods
Some are common to all methods
Others are unique to each method
All of them must be minimized in order for assessments with scored items to be fair
So, what is a rubric?
“…guidelines, rules, or principles by which student responses, products, or performances are judged. They describe what to look for in student performances or products to judge quality.” (p. 4)
Scoring Rubrics in the Classroom
Judith Arter and Jay McTighe
Corwin Press
Assessment methods that require a rubric / scoring guide
Written response
Performance/observation
Interactive/conversation
Portfolio
Where Do Rubrics Fit?
Classroom or large-scale assessment
Free-response written methods
  Short- and extended-response items
Performance observation (somewhat)
Across assessment targets, especially complex, hard to define ones, such as problem-solving, writing, and group processes
Across content areas and grade levels
Assessment Myth #1
True or False? Selected-response tests are more objective than free-response tests.
Answer? False! The only thing truly “objective” about selected-response tests is the scoring.
Assessment Myth #2
True or False? Scoring (or grading) of free-response items is inherently subjective.
Answer? False! On two counts:
1) Scoring and grading are not the same thing
2) “Subjective scoring” isn’t scoring at all; it’s just your opinion
Assessment of Open-ended Work
Checklists
Performance lists
Scoring rubrics
Scoring guides
Checklists
Simple criteria for simple tasks
Checklist format
Assess presence or absence only
No judgment of quality
Performance Lists
More sophisticated than checklists
Criterion-based
Product, task, or performance broken down into relatively simple, discrete pieces
Each piece scored on a scale
Scale for each piece can be different
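The idea that each piece has its own scale can be made concrete with a short sketch. The Python below is only an illustration: the `Piece` class, the `score_performance_list` helper, and the lab-report pieces and point values are all hypothetical and do not come from the module.

```python
from dataclasses import dataclass


@dataclass
class Piece:
    """One discrete piece of a task, with its own scale (0..max_points)."""
    name: str
    max_points: int


def score_performance_list(pieces, awarded):
    """Return (total, maximum) after checking each score against its piece's scale."""
    total = 0
    for piece in pieces:
        points = awarded[piece.name]
        if not 0 <= points <= piece.max_points:
            raise ValueError(
                f"{piece.name}: {points} is outside 0..{piece.max_points}"
            )
        total += points
    return total, sum(p.max_points for p in pieces)


# Hypothetical performance list for a lab report; each piece uses a different scale.
lab_report = [
    Piece("Hypothesis stated", 2),
    Piece("Data table complete", 4),
    Piece("Conclusion supported by data", 6),
]

awarded = {
    "Hypothesis stated": 2,
    "Data table complete": 3,
    "Conclusion supported by data": 4,
}

total, maximum = score_performance_list(lab_report, awarded)
print(f"{total}/{maximum}")  # 9/12
```

A checklist would be the special case in which every piece has a maximum of 1, that is, present or absent only.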
Scoring Rubrics & Guides
Judgments regarding complex tasks
Written criteria
Score points defined/described
Represents the essence of quality work
Reflects the best thinking in the field
Scoring guides often include exemplars in the form of annotated anchor papers
Benefits of Rubrics for Teaching, Learning, and Assessing
Focuses instruction/assessment
Clarifies instructional goals
Clarifies assessment targets
Defines quality work
Integrates assessment and instruction
Develops shared/consistent vocabulary and understanding
Provides consistency in scoring
  Across students by a scorer
  Across multiple scorers
  Across time
Rubric Basics
Types
Uses
Scoring ranges
Types of Rubrics
                Holistic    Trait-Analytic
Generic         A           B
Task Specific   C           D
USES OF RUBRICS
Holistic Rubrics
Strengths
  Provide a quick, overall rating of quality
  Judge the “impact” of a product or performance
  Use for summative or large-scale assessment
Limitations
  May lack the diagnostic detail needed to
    Plan instruction
    Allow students to see how to improve
  Students may get the same score for vastly different reasons
Trait-Analytic Rubrics
Strengths
  Judge aspects of complex work independently
  Provide detailed/diagnostic data by trait that can better inform instruction and learning
Limitations
  More time consuming to learn and apply
  May result in lower inter-rater agreement when multiple scorers are used (without appropriate procedures)
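One simple way to check inter-rater agreement on a trait is to have two scorers rate the same set of papers and count how often they match. The sketch below is an assumption about how that check could be done, not a procedure taken from the module; the function name `agreement_rates` and the sample scores are hypothetical. It reports exact agreement (identical scores) and adjacent agreement (scores within one point of each other).

```python
def agreement_rates(rater_a, rater_b):
    """Return (exact, adjacent) agreement rates for two raters' scores.

    exact    = share of papers where both raters gave the same score
    adjacent = share of papers where the scores differ by at most one point
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("Both raters must score the same, non-empty set of papers")
    n = len(rater_a)
    exact = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    adjacent = sum(abs(a - b) <= 1 for a, b in zip(rater_a, rater_b)) / n
    return exact, adjacent


# Hypothetical scores from two raters on one trait of a 4-point rubric, eight papers.
scores_a = [4, 3, 2, 4, 1, 3, 2, 4]
scores_b = [4, 2, 2, 3, 1, 3, 4, 4]

exact, adjacent = agreement_rates(scores_a, scores_b)
print(f"exact: {exact:.3f}, adjacent: {adjacent:.3f}")  # exact: 0.625, adjacent: 0.875
```

Papers on which the raters differ by more than one point (the seventh paper here) are natural candidates for discussion or a third reading.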
Generic Rubric Strengths
Complex skills that generalize across tasks, grades, or content areas
Situations where students are doing a similar but not identical task
Help students see “the big picture”, generalize thinking
Promote/require thinking by the student
Allow for creative or unanticipated responses
Can’t give away the answer ahead of time
More consistency with multiple raters (only one rubric to learn, so you can learn it well)
Generic Rubrics Limitations
Difficult to develop and validate
Takes time and practice to learn, internalize, and apply consistently
Takes time to apply
Takes discipline to apply correctly
Requires a scoring procedure to ensure consistent scores when multiple raters are involved
Task-specific Rubric Strengths
Specialized tasks
Highly structured assignments
Specific/detailed assessment goals
Provide detailed feedback to student on work
Situations requiring quick but highly consistent scoring from multiple scorers with less training and/or inter-rater control procedures
Task-specific Rubric Limitations
Can’t show to students ahead of time as they give away the answer
Does not allow the student to see what quality looks like ahead of time
Need a new rubric for each task
Rater on autopilot may miss correct answers not explicitly shown in the rubric
Scoring Ranges
Minimum of 3 levels
Maximum of 3 to 7 levels (typically, beyond 8 it’s hard to apply and understand)
Even vs. odd: odd-point scales (3, 5, 7, etc.) allow a middle ground that is psychologically attractive for the rater (which you may want to avoid)
5-point scales tend to look like an A-F grading scheme (which you may also want to avoid)
Distinguish Quality
4 or more points typically needed to distinguish levels of quality
4 to 7 points is typical
Depends on being able to distinguish levels of quality
The more open-ended/complex the task, the broader the range of points needed
A Meta-rubric
A rubric for evaluating rubrics
Trait 1: content/coverage
Trait 2: clarity/detail
Trait 3: usability
Trait 4: technical quality
MRT 1: Content/coverage
Aligned to curriculum and instruction
Includes everything that is quality
Does not include trivial things
Reasonable explanations for what is included and excluded
Reflects best thinking and practice
Rarely find work that can’t be scored
MRT 2: Clarity/detail
Different users likely to interpret the rubric in the same way: language is not ambiguous, vague, or contradictory
Use of rubric supports consistent scoring across students, teachers, and time
Examples of student work illustrate each level of quality on each trait
MRT 3: Usability/Practicality
Can be applied in a reasonable amount of time when scoring
Can easily explain/justify why a particular score was assigned
Student can see what to do differently next time to earn a better score
Teacher can see how to alter instruction for greater student achievement
MRT 4: Technical Quality
Evidence of reliability (consistency): across students, teachers, and time
Evidence for validity (appropriateness): students and teachers agree that it supports teaching and learning when used as intended
Evidence of fairness and lack of bias: does not place any group at a disadvantage because of the way the rubric is worded or applied
Develop Your Own Rubrics
Form a learning team
Locate/acquire additional resources
Modify existing rubrics
Understand the development process
Help each other out
When you are comfortable with the process, introduce it to your students
The Meta-rubric outline
          WOW    Most    Some    None
Trait 1
Trait 2
Trait 3
Trait 4
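As a rough illustration of how the outline might be used, the sketch below records one rating per meta-rubric trait and lists the traits that fall short of a chosen level. The ordering of the levels (None lowest, WOW highest), the "Most" cutoff, the `traits_to_revise` helper, and the sample ratings are all assumptions made for the example, not part of the module.

```python
# Levels from the outline, ordered lowest to highest (the ordering is an assumption).
LEVELS = ["None", "Some", "Most", "WOW"]

# The four meta-rubric traits.
TRAITS = ["Content/coverage", "Clarity/detail", "Usability", "Technical quality"]


def traits_to_revise(ratings, threshold="Most"):
    """Return the traits rated below the threshold level, in trait order."""
    cutoff = LEVELS.index(threshold)
    return [t for t in TRAITS if LEVELS.index(ratings[t]) < cutoff]


# Hypothetical ratings for a draft rubric.
ratings = {
    "Content/coverage": "WOW",
    "Clarity/detail": "Some",
    "Usability": "Most",
    "Technical quality": "None",
}

print(traits_to_revise(ratings))  # ['Clarity/detail', 'Technical quality']
```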
Possible Development Process
Gather samples of student work
Sort student work into groups and write down the reasons for how it is sorted
Cluster the reasons into traits
Write a value-neutral definition of each trait
Find samples of student work that illustrate a possible range of score points on each trait
Write value-neutral descriptions of each score level for each trait, if appropriate
Evaluate your rubric using the Meta-rubric
Test it out and revise it as needed
Acknowledgments
This module is based on material adapted from:
Scoring Rubrics in the Classroom
By Judith Arter and Jay McTighe
Experts in Assessment Series
Corwin Press, Thousand Oaks, CA
and
Material provided by Edward Roeber of Michigan State University, East Lansing, Michigan