Classroom Assessment
A Practical Guide for Educators
by Craig A. Mertler
Chapter 3
Characteristics of Assessments
Introduction
The quality of educational decisions is only as
good as the information that leads to them.
If inappropriate information is collected, or if it is
collected without precision, the decisions that
follow will logically be inaccurate.
Two key characteristics of all assessments are
validity and reliability.
What Is Validity?
Validity: the degree to which evidence and theory support the interpretations of test scores entailed by proposed uses of tests.
• does not emphasize the results themselves, but rather how the results are used (validity deals with the decisions that follow the interpretation of test results, i.e., appropriate and inappropriate uses of assessment results)
• the most fundamental consideration when developing and evaluating tests and other assessments
• example: the SMART
What Is Validity?
Validity (continued)
• Three important points about validity:
the concept of validity applies to the ways teachers interpret and use assessment results
assessment results have different degrees of validity, depending on purposes
judgments about validity should be made only after examination of several types of validity evidence
What Is Validity?
Sources of Validity Evidence
• validity is an abstract concept; it cannot be directly observed, so evidence must be gathered in support of it
• content evidence of validity
focuses on the extent to which the content addressed by assessment items adequately samples the larger domain of performance
most important type of evidence for teachers
relevance: do items emphasize what has been taught?
representativeness: how well do the items represent the total content area?
two-way tables (“content” x “taxonomic level”) can assist teachers in gathering this evidence
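A two-way table of this kind can be tallied with a few lines of code. The sketch below is only an illustration: the content areas, taxonomic levels, and item list are hypothetical, and the function name `two_way_table` is an assumption, not something taken from the slides.

```python
from collections import Counter

# Hypothetical blueprint: one (content area, taxonomic level) pair per test item.
items = [
    ("fractions", "knowledge"),
    ("fractions", "application"),
    ("decimals", "knowledge"),
    ("decimals", "knowledge"),
    ("percents", "application"),
]

def two_way_table(items):
    """Count items in each content-by-level cell, including empty cells."""
    counts = Counter(items)
    contents = sorted({content for content, _ in items})
    levels = sorted({level for _, level in items})
    return {c: {lv: counts[(c, lv)] for lv in levels} for c in contents}

table = two_way_table(items)
for content, row in table.items():
    print(content, row)
```

Empty cells (a count of 0) point to content or levels that were taught but are not sampled by the assessment.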
What Is Validity?
Sources of Validity Evidence (continued)
• criterion evidence of validity
focuses on the extent to which scores resulting from an assessment are related to another similar, well-established assessment (the criterion)
predominantly a concern for standardized tests
Predictive evidence of validity: form of criterion-related evidence where the criterion is measured sometime in the future.
Concurrent evidence of validity: form of criterion-related evidence where the criterion is measured at the same time or consists of some measure available at the same time.
What Is Validity?
Sources of Validity Evidence (continued)
• construct evidence of validity
focuses on the degree to which there is a fit between the hypothetical construct (unobservable human trait) being measured and the responses actually supplied by students
typically a concern for standardized tests
sometimes viewed as an “umbrella” for all sources of validity evidence
What Is Validity?
Sources of Validity Evidence (continued)
• face evidence of validity
not considered a formal source of evidence
informal measure of extent to which the users or takers of tests believe that the test results are valid
often plays an important role in terms of student (and teacher) motivation
What Is Validity?
Establishing Validity of Quantitative Assessments
• for classroom assessments: five guiding questions
content evidence:
Does my assessment procedure emphasize what I have taught?
Do my assessment tasks accurately represent outcomes specified in my school’s, district’s, or state’s curriculum guide?
Is the content in my assessment procedure important and worth learning?
face evidence:
Do students perceive that the problems or tasks on my assessment emphasize the concepts and other material that I have taught?
Do students generally believe that the assessment measures the appropriate behaviors, skills, or characteristics as they were taught?
What Is Validity?
Establishing Validity of Quantitative Assessments
(continued)
• for standardized assessments—evidence of
validity is based on statistical analysis (especially
for criterion-related evidence)
• Correlation coefficient (r): statistical measure that
indicates the extent to which scores on one
assessment agree with scores on another (the criterion);
ranges from -1.00 to +1.00; when used as validity
evidence, known as a validity coefficient.
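As a rough illustration of how such a coefficient is obtained, the sketch below computes a Pearson correlation between scores on a new assessment and scores on a criterion assessment. The slides do not say which correlation formula is used, so Pearson's r is an assumption here, and the score lists are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("scores must vary for r to be defined")
    return cov / sqrt(ss_x * ss_y)

# Hypothetical scores for the same five students on two assessments.
new_test = [52, 61, 70, 75, 88]
criterion = [50, 65, 68, 80, 90]

validity_coefficient = pearson_r(new_test, criterion)
print(round(validity_coefficient, 2))
```

A value near +1.00 means students who score high on one assessment also tend to score high on the criterion; a value near 0 means the two sets of scores are unrelated.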
What Is Validity?
Establishing Validity of Qualitative Assessments
• for informal classroom assessments: five questions
representativeness of observations:
Have I limited my observations to concrete behaviors, as opposed to more global impressions of students?
Have I observed/noted the specific behavior a sufficient number of times in order to draw definitive conclusions?
Have I observed/noted the behavior in different settings or situations?
nature of inferences:
Have I based my conclusions only on the information that I have gathered?
Are there plausible, alternative explanations for the given behavior?
What Is Reliability?
Reliability: the consistency of measures when the testing procedure is repeated on a population of individuals or groups.
• validity = accuracy; reliability = consistency
• also speaks to scores and their interpretation and use, not to the assessment itself
• scores (and their consistency) are affected by error
• error can result from student illness, content assessed but not taught, etc.
• random errors affect consistency; systematic errors affect validity
What Is Reliability?
Establishing Reliability of Quantitative
Assessments
• established by correlating test results with
themselves or with other forms of the test
(anticipate that high scores on one form of the
assessment are associated with high scores on
the other)
• Reliability coefficient (r): a correlation
coefficient representing measures of reliability.
What Is Reliability?
Establishing Reliability of Quantitative Assessments (continued)
• Test-retest method: estimates reliability over time; results in a coefficient of stability.
procedure is not realistic for classroom use
• Alternate-forms and equivalent-forms methods: administration of two forms of a test, either with different items or with the same items rearranged; results in an alternate-forms coefficient.
again, procedure not realistic for classroom use
What Is Reliability?
Establishing Reliability of Quantitative Assessments (continued)
• Internal consistency methods: estimate of reliability with only one administration; determines how well items correlate with one another.
Split-half method: divides test into two comparable halves.
KR-21 method: approximates the average over all possible split-half combinations; for items scored right/wrong.
Cronbach’s alpha method: similar to KR-21, but for items with different point values.
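The slides name these three methods without giving formulas. The sketch below uses the standard textbook formulas as an assumption: an odd/even split with the Spearman-Brown step-up for split-half (the step-up is not mentioned in the slides), the usual KR-21 formula based on the mean and variance of total scores, and Cronbach's alpha based on item variances. The score matrix is hypothetical.

```python
from math import sqrt
from statistics import mean, pvariance

def pearson_r(x, y):
    """Pearson correlation; assumes both lists vary."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return cov / sqrt(sum((a - mx) ** 2 for a in x) * sum((b - my) ** 2 for b in y))

def split_half(items):
    """Correlate odd-item and even-item half scores, then apply Spearman-Brown."""
    odd = [sum(row[0::2]) for row in items]
    even = [sum(row[1::2]) for row in items]
    r_half = pearson_r(odd, even)
    return 2 * r_half / (1 + r_half)

def kr21(items):
    """KR-21 for right/wrong (0/1) items; assumes items are of similar difficulty."""
    k = len(items[0])
    totals = [sum(row) for row in items]
    m, var = mean(totals), pvariance(totals)
    return (k / (k - 1)) * (1 - m * (k - m) / (k * var))

def cronbach_alpha(items):
    """Cronbach's alpha; items may carry different point values."""
    k = len(items[0])
    totals = [sum(row) for row in items]
    item_vars = [pvariance([row[j] for row in items]) for j in range(k)]
    return (k / (k - 1)) * (1 - sum(item_vars) / pvariance(totals))

# Hypothetical data: one row per student, one column per item (1 = correct).
scores = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 0],
    [1, 1, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],
]

print(split_half(scores), kr21(scores), cronbach_alpha(scores))
```

All three estimates come from a single administration, which is what makes internal consistency methods more practical for classroom use than test-retest or alternate-forms procedures.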
What Is Reliability?
Establishing Reliability of Qualitative
Assessments
• Interrater consistency: calculation of percent
agreement between two or more raters of
student performance.
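Percent agreement is simple enough to compute by hand, but a short sketch makes the arithmetic explicit. It covers only the two-rater case, and the rubric scores shown are hypothetical.

```python
def percent_agreement(rater_a, rater_b):
    """Percentage of performances on which two raters gave the same score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("need two non-empty, equal-length lists of ratings")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return 100.0 * matches / len(rater_a)

# Hypothetical rubric scores (1-4) from two raters for the same eight students.
rater_a = [4, 3, 2, 4, 1, 3, 2, 4]
rater_b = [4, 3, 3, 4, 1, 2, 2, 4]

agreement = percent_agreement(rater_a, rater_b)
print(agreement)  # 6 of 8 ratings match -> 75.0
```

With more than two raters, one option is to compute this figure for each pair of raters and report the pairwise values.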
The Relationship Between Validity and Reliability
Validity is the more important feature.
Reliability is a prerequisite to validity (in other
words, if items accurately assess a domain of
content, the scores will also be consistent).
Assessment results may be reliable (i.e.,
consistent) but not valid (i.e., accurate).
The Relationship Between Validity and Reliability
Valid test results are also reliable, but reliable test results are not necessarily valid.
[Figure: four dot-pattern panels, (a) through (d)]
(a) lacks validity and reliability
(b) fair validity and fair reliability
(c) good reliability but lacks validity
(d) good validity and good reliability
Teacher Responsibilities Related to Validity and Reliability
Ensuring the validity and reliability of classroom
assessments is a primary responsibility of
teachers.
Refer to both The Standards for Teacher
Competence in the Educational Assessment of
Students and The Code of Fair Testing Practices
in Education.