INTERNAL WORKING DOCUMENT – NOT FOR CIRCULATION OR CITATION
REVISED 5/18/2023
Improving Assessments of Reading Comprehension among Early Readers
Working Group on Research in Reading Comprehension Measures for Young Children
Patricia Mathes (Chair), Southern Methodist University
Marcia Davis, University of Maryland
Jenny DeMonte, University of Michigan
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston
David Francis, University of Houston
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Keith Millis, Northern Illinois University
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado
Table of Contents
The Working Group on Research in Reading Comprehension Measures for Young Children: History and Charge
Challenges of Assessing Reading Comprehension
Unpacking the Construct of Reading Comprehension
Limitations of Current Assessment Tools
Enhancing Assessments of Reading Comprehension
Conclusion
Appendix A: Members of the Working Group on Research in Reading Comprehension Measures for Young Children
Appendix B: Participants in the November 12, 2004 meeting
Appendix C: Agenda for the November 12, 2004 meeting
Appendix D: References and Additional Resources
The Working Group on Research in Reading Comprehension Measures for Young Children: History and Charge
On November 12, 2004, a one-day meeting on Assessing Reading Comprehension with
Young Children was held in Arlington, Virginia. Funded by the National Science Foundation
(NSF) as part of the Interagency Education Research Initiative (IERI, a collaboration of NSF, the
U.S. Department of Education, and the National Institute of Child Health and Human
Development, NICHD), the meeting was designed to address the challenges of measuring reading
comprehension of young children and struggling readers in classroom settings. This specific issue
and the more general challenge of identifying, developing, and administering assessments
sufficiently sensitive to detect responses to interventions with minimal disruption to normal
classroom activities are critical to the IERI program, and to other federal initiatives and education
research activities. There are concerns in the research community that assessments capable of
detecting small yet significant improvements in student learning outcomes may be too resource
intensive to implement in the large-scale longitudinal studies required to obtain rigorous evidence of
the impact of scaling-up promising interventions. Such concerns raise questions regarding the
adequacy and efficacy of current assessments that need to be investigated from multiple viewpoints.
With respect to the assessment of reading comprehension, key questions include: how well do
existing assessments distinguish fluency, decoding, listening comprehension and reading
comprehension; what are the implications of relying on popular methods of assessment for
developing and evaluating responses to interventions; and what practical and conceptual challenges
need to be overcome to enhance assessments of reading comprehension in young children and
struggling readers?
The Working Group on Research in Reading Comprehension Measures for Young
Children was formed to address these issues. The specific catalyst for the group’s formation was a
request from an IERI investigator for assistance from the Data Research and Development Center
(DRDC, a technical assistance and research center established by NSF to support the IERI program;
further information about DRDC & IERI is available online at www.drdc.uchicago.edu). The
investigator raised concerns which DRDC found were shared by several IERI researchers regarding
the limitations of many commonly employed instruments and measures for assessing
comprehension among early readers. DRDC’s efforts to assist the investigator led to a wider
conversation among a small group of IERI researchers, other reading researchers not supported by
the IERI program, and IERI program officers and other senior staff at NSF, NICHD, and the U.S.
Department of Education’s Institute of Education Sciences (IES) regarding the challenges of
employing and the desirability of improving currently available assessment instruments. There was
a general consensus that the challenges the IERI researcher posed have important implications for
many members of the education research and policymaking communities, and should be pursued.
The Working Group comprises fourteen education researchers (including four
conducting IERI-supported research). The group is chaired by Patricia Mathes, Texas Instruments
Chair of Reading, Director of the Institute for Reading Research, and Professor of Literacy and
Language Acquisition in the School of Education at Southern Methodist University (principal investigator on
the IERI-supported project “Scaling-Up Effective Intervention for Preventing Reading Difficulties
in Young Children”); a full list of the members of the working group is included in Appendix A.
Eleven of these researchers or their designated spokespersons were able to attend the November
2004 meeting; the other three members joined a portion of the meeting via teleconference. Joining
the group were seven representatives of the three IERI agencies and DRDC; the full list of attendees
is included in Appendix B.
Prior to the meeting, the members of the Working Group received a full briefing on the
objectives of the meeting and invitations to discuss salient aspects of their research and their expert
views on techniques for measuring reading comprehension with children reading at primary grade
levels. Members were specifically encouraged to come prepared to share the measurements and
instruments they use to assess reading comprehension, particularly for lower achievement level
students, and to address the two framework questions for discussion:
• What are the problems associated with measuring reading comprehension in early readers?
• Which measures provide the best insights into what interventions accomplish with respect to
  improving reading comprehension? What is the technical adequacy of data obtained from
  existing measures, and what are the implications for the development of additional and/or
  enhanced instruments?
This format allowed participants to explore the difficulties they encounter when measuring
comprehension in early readers, and to present their current instruments and materials to a group of
experts able to draw parallels to their own research projects. The complete agenda for the meeting
is included in Appendix C. A transcript of the meeting and copies of presentation materials are
available online to those who attended the meeting; instructions and credentials for accessing these
materials are available from Michelle Llosa (773 256 6189) at DRDC.
This White Paper has been prepared from transcripts of the meeting and presentation
materials provided by the members of the Working Group (a compendium of sources of additional
information on the assessments and research discussed at the meeting and referenced here is
included in Appendix D). It is intended to provide a public record of the issues addressed by the
Working Group on Research in Reading Comprehension Measures for Young Children. It is also
an invitation to other researchers interested in forming a network for sharing measures and
experiences using them. The Working Group welcomes opportunities to work with other
researchers committed to resolving the challenges of measuring comprehension growth among early
readers, thus ultimately strengthening the quality of the evidence base regarding the efficacy of
interventions designed to improve reading comprehension.
Data Research and Development Center
University of Chicago and NORC
February 2005
Challenges of Assessing Reading Comprehension
Taking educational innovations that have been shown to be effective in one or more
contexts and replicating them in other settings has been a consistent goal of educational reform (see,
e.g., Elmore, 1996). What distinguishes recent efforts to bring educational interventions to scale is
a focus on providing compelling evidence (e.g., from scientifically-based research) of the impact of
such interventions across a wide range of educational contexts. The cornerstones of such efforts are
robust assessments capable of detecting changes in the student learning outcomes of interest.
Ideally these assessment instruments should yield reliable and valid results with minimal disruption
to normal classroom practice and limited if any commitment of additional instructor or other school
resources. Importantly, they should also be widely available to and used by both large numbers of
education researchers and practitioners, particularly in the context of larger accountability
initiatives. The development of tailored assessments may be an option – even a requirement – when
innovative interventions are in a developmental or pilot phase, and finely-grained measures of
responses to interventions are required to judge the merits of continuing R&D work in the area.
There are considerable disadvantages, however, to encouraging researchers to develop and
administer customized assessments in larger-scale evaluations. In particular, the limited
dissemination and administration of robust assessment instruments may constrain comparability
across studies (inhibiting knowledge accumulation) and decrease the prospects of supplementing
individual study findings with information from other large-scale (e.g., national) data collection
efforts.
Recommendations for a research agenda that would translate these general concerns with
the development and diffusion of robust assessments into improved assessments of reading
comprehension have already been proposed (e.g., by the U.S. Department of Education’s Office of
Educational Research and Improvement-sponsored RAND Reading Study Group, RRSG; see Snow,
2002). [1] Of particular concern here are the challenges of developing valid, reliable, sensitive
assessments of growth in reading comprehension among early (including struggling) readers. In the
absence of such assessments it is not possible to tease out why, for example, an intervention that
made an impact on word attack and word identification scores of the Woodcock-Johnson [2]
assessment did not (counterintuitively) demonstrate much impact on the passage comprehension
score. One possible interpretation of the findings (with important implications for public support
for the intervention) would be that the intervention did not, in fact, have the anticipated outcome.
Another plausible interpretation (with important implications for both educational research and
accountability programs) is that the measures originally employed to test responses to the
intervention were not appropriate or adequate to distinguish comprehension growth. The specific
issues raised by this example could, of course, be addressed through repeated assessments using
multiple (potentially customized) measures. However, the questions this one set of
counterintuitive findings raises are not isolated to a single intervention or research team. They are
illustrative of a broader set of questions that need to be resolved to improve the confidence with
which decisions can be made regarding the support that should be given to interventions designed to
improve reading comprehension among early readers:
[1] The RRSG’s 2002 report, Reading for Understanding: Toward an R&D Program in Reading Comprehension, includes a summary of the “persistent complaints” regarding “currently available assessments in the field of reading comprehension,” suggests ten minimum requirements for an “adequate system of instrumentation for assessing reading comprehension,” and identifies ten key issues the RRSG argues “a research agenda on reading assessment needs to address” (Snow, 2002: 52-59).
[2] Woodcock, R.W. & Johnson, M.B. (1989). Woodcock-Johnson Psycho-Educational Battery-Revised. Chicago: Riverside Publishing Company.
• What are the problems associated with measuring reading comprehension in early readers?
• Which measures provide the best insights into what interventions accomplish with respect to
  improving reading comprehension? What is the technical adequacy of data obtained from
  existing measures, and what are the implications for the development of additional and/or
  enhanced instruments?
This White Paper provides a record of an initial discussion of these issues by the fourteen-member
Working Group on Research in Reading Comprehension Measures for Young Children. Their
comments are organized in three sections. The first section looks at the conceptual and theoretical
questions regarding the key underlying dimensions that need to be represented in a measure of
reading comprehension. Of particular concern is the interface assessment instruments maintain
between educators’ goals as represented in standards, and research on the development and growth
of reading comprehension (e.g., research on inference, text processing, and memory from cognitive,
developmental, and educational psychology). The second section summarizes the related discussion
of the limitations of current assessment tools, and the improvements required to allow researchers to
subject their core predictions regarding interventions’ impacts to empirical tests. The third section
provides descriptions of and insights from a selection of current research projects that attempt to
measure comprehension in early readers using innovative research designs while proposing several
requirements for the ideal measurement tool.
Unpacking the Construct of Reading Comprehension
In order to measure a child’s ability to comprehend what is being read, we need to be able to
deconstruct the components that infuse our measured results. The first step in developing sound
measures of comprehension is to disentangle these components, i.e., to fully articulate the concept
of reading comprehension.
For the purposes of this meeting it was suggested that reading be considered “as any reader
interaction with text,” an approach which subsumes comprehension as one aspect of reading and is
consistent with other definitions of reading comprehension (e.g., that articulated in the RAND
Reading Study Group’s 2002 report) which emphasize
• The interaction of reader, the purpose for reading, text, and social context
• The cognitive capabilities (e.g., attention, memory, critical analytic ability, inferencing,
  visualization); motivation (e.g., purpose for reading, self-interest in content); knowledge of
  vocabulary and comprehension strategies; and experiences the reader brings to the act of
  reading
• The different representations readers construct from text, important to its greater
  understanding
• The value of considering comprehension as a constructive process.
This latter perspective is particularly important in considering comprehension from the perspective
of young children and other early readers who may have little or no previous experience with
which to infer meaning from the letters, words, and sentences that make up a text passage.
Limitations of Current Assessment Tools
A major related concern of the Working Group was the adequacy and efficacy of
current assessment tools. The meeting chairperson described an intervention which researchers
believe should have had an impact on comprehension. Studies did find that it had an impact on
word attack and word identification but not on passage comprehension. The question, then, is
whether the intervention really did not have the expected outcome or whether the measure was not
able to tap into whatever effect the intervention may have had. One participant suggested that “we
need more measures of the elements of comprehension” in order to determine whether an
intervention may be improving a component or key process of comprehension without improving
comprehension itself. On the other hand, another member of the Working Group noted that popular
assessment instruments may be too general to measure reading comprehension alone. Thus the
challenge is both to unpack current measures so that foundational or related skills can be excluded
and to create new measures that can isolate the key components of reading comprehension (see the
next section).
Constructs that affect comprehension measurement
While it is apparent that the outcome measures included in current instruments do not fully
tap (e.g., are not sufficiently sensitive to) the changes interventions bring about, it is difficult to
define what is getting in the way. The discussion at the meeting revolved around two main
possibilities: decoding and listening comprehension. Creating a more accurate measure of reading
comprehension hinges in part upon the ability to tease out these foundational skills.
Decoding
Decoding is the ability to decipher printed words by recovering the spoken word that a
printed word represents. Several participants observed that few if any measures adequately capture
whether children who are still beginning readers are growing in their ability to comprehend text,
beyond their ability to decode that text. This is due in part to the foundational importance of
decoding and fluency to beginning readers, but it may also result from the practical limits, in terms
of time and cost, on administering a battery of tests that can go beyond measuring these skills.
Whatever the reason, one participant observed that decoding and fluency seem to account for a
great deal of the variation in what comprehension measures are getting at.
Research has shown that decoding ability is one of the strongest predictors of the ability to
comprehend text. The Connecticut Longitudinal Study (Shaywitz et al., 1990), for example, found a
correlation (on the Woodcock Reading Mastery) between decoding and comprehension of 0.89 in
Grade 1 and 0.63 in Grade 9. Part of the reason, according to one Working Group member, is the
simple fact that text that is more difficult to understand also uses words that are more difficult to
decode. “So when one is trying to understand what kids are able to do with text, this understanding
can be confounded by their ability to ‘get the information in’—decoding.” Several members of the
Working Group discussed their current efforts (described more fully below) to account for
differences in decoding ability among early readers.
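
One way to read the two correlations cited above is in terms of shared variance, i.e., the squared correlation. The short sketch below applies that standard statistical convention to the reported values (0.89 and 0.63); the calculation is an illustration added here and is not an analysis reported by the Connecticut Longitudinal Study.

```python
# Correlations between decoding and comprehension as cited in the text.
correlations = {"Grade 1": 0.89, "Grade 9": 0.63}


def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2


for grade, r in correlations.items():
    # Format the squared correlation as a whole-number percentage.
    print(f"{grade}: r = {r:.2f}, shared variance = {shared_variance(r):.0%}")
```

Read this way, roughly four-fifths of the variation in Grade 1 comprehension scores overlaps with decoding, falling to about two-fifths by Grade 9.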
“Reading” Comprehension versus “Listening” Comprehension
Another issue that arose during the opening comments was the difference between
comprehending text and comprehending speech. Is a study that uses a computer to tell a story to
children, for example, actually measuring “reading” comprehension or “listening” comprehension?
One participant observed that this distinction is critical for beginning readers since stories that are
(appropriately) simplified to meet the reading and vocabulary level of the child convey minimal
content. Many assessors respond to this constraint by having the child respond to text that is read to
them. The question then is whether “listening comprehension is an alternative” to reading
comprehension.
This subject was broached again in a later presentation. The Working Group member
observed that “[w]e keep talking about how reading comprehension and decoding interfere with
each other.” But if we remove decoding by reading to the child, we must ask ourselves “what is
unique to visual text processing” since this will be left out of assessments of comprehension too.
“Even beyond the decoding-fluency level,” what are other aspects in the way one processes print
that distinguish reading comprehension from listening comprehension? Brains learn how to
comprehend text by reading, so children with deficiencies in decoding and fluency, i.e. “the bottom
half of the distribution,” may continue to fall farther behind no matter how much they are read to.
Listening comprehension could suffer too since poor readers “won’t be able to handle . . . all the
sentence structures they might have to listen to and process or reason about.”
Decoding and listening abilities do not exhaust the list of cognitive skills that may be
foundational to or confused with reading comprehension. Other possible confounding factors
discussed at the meeting include working memory, background knowledge, vocabulary, sentence
processing, and verbal reasoning. Getting a “clean” measure of reading comprehension also means
controlling for individual differences in this set of skills. The next section provides an overview of
the different research projects presented at the meeting that have begun to tackle some of these
difficult issues in measuring comprehension in early readers.
Enhancing Assessments of Reading Comprehension
The meeting included several presentations of new or ongoing research projects that are
confronting the challenges posed by current methods of assessing reading comprehension. These
projects roughly divide into those that are seeking more fine-grained or detailed measures of
comprehension and those that are attempting to develop assessment techniques that are practical for
use in the classroom. An area of common interest is the need to ground assessment in models of
human cognition and to incorporate technology-supported learning environments.
Detailed Measures of Comprehension
David Francis talked via conference call about his work on isolating the specific nature of
reading problems by disentangling decoding abilities from the ability to comprehend. This work
builds on research by Potts and Peterson (1985) and Hannon and Daneman (2001) and targets “four
different aspects of comprehension”:
1. Text memory
2. Background knowledge
3. Inferencing based on the text
4. Integration with background knowledge
A student having reading problems may be having trouble with vocabulary or with extracting
information from the text. Accounting for this difference means controlling for individual
differences in background knowledge. “The idea,” according to Francis, “is to build texts that are
simple to read in terms of decoding but complex to understand.” Although only a few stories have
been developed, the result, at least from the 2nd grade on, is a relatively “clean” measure of the
reasoning processes involved in reading comprehension.
Another project looking to trace comprehension problems back to their source was presented
by Richard Wagner. Building on the recommendations of an NRC report titled “Knowing What
Students Know,” this series of five studies of 2nd and 4th graders incorporates both IT and
assessments grounded in models of cognition, learning, and development (Snow 2002). The
resulting constructs are to be measured with multiple indicators and include working memory
(reading span, listening span, visuo-spatial span), morphological awareness (decomposition,
derivation), decoding (accuracy, fluency), and vocabulary. Assessments themselves involve a
combination of traditional and experimental measures. The goal of the project is to be able to
identify the relative importance of the measured constructs for improving reading comprehension.
The final phase also will look at the added value of passage-specific over generic assessment
information for predicting outcomes on three comprehension performance tests.
Joseph Torgesen talked about a similar effort to determine the kinds of skills that are lacking
in students who struggle with reading comprehension. This project builds on the Florida
Comprehensive Assessment Test (FCAT), which “was specifically created to place high demands
on vocabulary and reasoning/inferential skill.” The result is that performance hinges upon a
combination of reading, language, and cognitive abilities. Torgesen and his team gave a two-hour
battery of tests that measure a variety of language, reading, reasoning, and memory skills to about
200 students in 3rd, 7th, and 10th grades (Buck & Torgesen, 2003). A factor analysis showed that
differences in decoding skills and fluency accounted for much of the variation in reading
performance on the FCAT at the 3rd and 7th grades. Fluency decreased in importance at the 10th
grade, while non-verbal reasoning ability became somewhat more important. However, Torgesen
emphasized that different tests of reading comprehension might give different results.
Finding better ways of decomposing a rich data set was the subject of Jenny DeMonte’s
presentation. She discussed her work with Deborah Ball, David Cohen, and Brian Rowan on
analyzing reading comprehension data collected from teacher logs on two cohorts of students (K-2
and 3-5) in 120 schools undergoing comprehensive school reform (Rowan et al., 2004). Detailed
case studies of twelve schools focused on teachers who had taught one of three “gateway” items,
including comprehension, writing, and word analysis. Instruction was then matched to outcomes by
using Terra Nova (levels 10 through 16) to measure student progress at six points over three years.
The results show “substantial gains in word analysis and not much of a gain in comprehension from
K through five.” The question is whether Terra Nova actually measures comprehension or allows a
more fine-grained analysis of comprehension skills. Some data are even being lost because Terra
Nova does not always map onto the teacher logs.
John Sabatini described a new research project at Educational Testing Service (ETS) “to
identify and empirically examine which subsets of comprehension skills are linked to difficulties in
understanding printed materials for individual struggling readers” (Sabatini, 2002). This project
builds on the current Development Reading Comprehension Targeting Struggling Readers project
but focuses on younger readers (Sabatini et al., 2000). Among this age group it is important to go
“beyond [even] the decoding-fluency level” and consider “other aspects in the way one processes
print that distinguishes reading comprehension from listening comprehension.” The goal is to
develop assessments that can be used in education delivery settings to identify sources of reading
comprehension difficulty, which means being able to identify problems “at any or every level of
text processing,” including decoding, vocabulary, fluency, and sentence processing.
Practical Measures of Comprehension
The timely delivery of assessments to teachers is central to the Independent Comprehensive
Adaptive Reading Evaluation (ICARE) project. Barbara Wise discussed how the ICARE project
tackles the practical need to be able to diagnose reading problems quickly and with limited
personnel (Allinder et al., 2004). The goal of this effort in “instructional profiling” is to be able to
identify “good” readers in less than 15 minutes in order to guide instruction and evaluate
interventions. ICARE uses the “simple model of reading” (Gough et al., 1996) to build profiles of
good readers along the dimensions of word reading ability and listening comprehension. The
screening test also is delivered using speech recognition, which displays the words on a computer
(with 91% accuracy) as the child reads. Students who exhibit difficulties with time-limited word
reading are screened for more specific reading deficits using a variety of comprehension measures.
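
The two-dimensional profiling described above can be sketched in a few lines. The simple view of reading is commonly summarized as reading comprehension being the product of word reading (decoding) and listening comprehension. In the sketch below, the 0.0 to 1.0 score scale, the 0.5 cutoff, and the profile labels are all hypothetical choices made for illustration; they are not ICARE's actual scales or categories.

```python
from dataclasses import dataclass


@dataclass
class ReaderScores:
    # Both scores use a hypothetical 0.0-1.0 scale (an assumption of this sketch).
    word_reading: float
    listening_comprehension: float


def predicted_reading_comprehension(scores: ReaderScores) -> float:
    """Product of the two components, as the simple view is commonly summarized."""
    return scores.word_reading * scores.listening_comprehension


def profile(scores: ReaderScores, cutoff: float = 0.5) -> str:
    """Place a reader in one of four illustrative profiles (cutoff is hypothetical)."""
    weak_words = scores.word_reading < cutoff
    weak_listening = scores.listening_comprehension < cutoff
    if weak_words and weak_listening:
        return "weak on both dimensions"
    if weak_words:
        return "weak word reading"
    if weak_listening:
        return "weak listening comprehension"
    return "adequate on both dimensions"


# A child who understands spoken language well but struggles to read words.
child = ReaderScores(word_reading=0.3, listening_comprehension=0.9)
print(profile(child), round(predicted_reading_comprehension(child), 2))
```

Because the components multiply, a low score on either dimension pulls the predicted comprehension down, which is why a profile along both dimensions says more than a single comprehension score.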
A series of research projects described by Peter Foltz is attempting to “develop and evaluate
novel techniques for improving vocabulary assessment and reading comprehension.” The approach
used builds upon the traditional Cloze test (Mayer et al., 1999) and draws on computational models
of language simulating human understanding of words and texts. This “Open-Cloze” test works by
removing words from a sentence and giving students the unprompted (i.e. no multiple choice) task
of filling in the blank. The “appropriateness” of the word chosen is determined by having humans
(or a computer) rate its effect on the “true” meaning of the sentence. The goal is to have a calibrated
model of each potential word’s contribution to the meaning of a sentence so that deviations from
“meaning” can be used to measure reading comprehension and improve literacy skills via human or
computer feedback. One benefit of this approach is that it can be used to assess the literacy skills of
very young children.
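
The mechanics of an open-cloze item, as described above, can be illustrated with a minimal sketch. The function names, the example sentence, and the appropriateness ratings are hypothetical; in the project described by Foltz the ratings would come from human raters or a calibrated computational model, neither of which is reproduced here.

```python
def make_open_cloze_item(sentence: str, blank_index: int) -> tuple[str, str]:
    """Remove the word at blank_index and return (prompt, removed word)."""
    words = sentence.split()
    target = words[blank_index]
    words[blank_index] = "____"
    return " ".join(words), target


def score_response(response: str, ratings: dict[str, float]) -> float:
    """Look up how well a free response preserves the sentence's meaning.

    Responses that are not in the rating table score 0.0.
    """
    return ratings.get(response.strip().lower(), 0.0)


prompt, target = make_open_cloze_item("The dog chased the ball", 1)

# Hypothetical appropriateness ratings for this one blank (1.0 = original meaning).
ratings = {"dog": 1.0, "puppy": 0.8, "cat": 0.5}

print(prompt)                            # the sentence with the second word blanked
print(score_response("Puppy", ratings))  # graded credit for a near-synonym
```

Unlike a multiple-choice cloze item, the student sees no candidate words, so the score reflects graded closeness to the intended meaning instead of a right-or-wrong match.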
Developing measures of comprehension skills in early readers is the focus of the research
described by Barbara Foorman and Kristi Santi. The Texas Primary Reading Inventory (TPRI) uses
“authentic passages” to measure reading comprehension, but teachers found that the stories were
too hard for 1st graders (Foorman et al., 2004). Foorman and her team took on the challenge of
developing their own reading passages which controlled for decoding abilities. The task then
became one of matching students to the given instructional level of a text (Schatschneider et al.,
2004). This was done by having students read through word lists of increasing difficulty in terms of
IRT value. Matching vocabulary IRT value to passage Lexile score was successful for placing
students at their appropriate instructional level from the 3rd grade on. However, placing younger
students proved more difficult given the sub-lexical complexity of words.
Keith Millis described an effort “to measure comprehension as it’s happening.” The
approach, called Latent Semantic Analysis (LSA), creates “semantic benchmarks” in a text that
represent different reading comprehension strategies such as paraphrasing (sentence-focused),
elaborating in terms of the prior sentence (local), or using prior text to help develop a theme
(global). The assessment is done by having students create a “verbal protocol” via a Web-based
interactive program. The similarity between a verbal protocol and the semantic benchmarks
provides a metric (expressed as a cosine) of reading comprehension (Magliano & Millis, 2003). The
key, of course, is selecting good benchmarks, which can be drawn from theory but also must rest on
some pragmatic concerns. Early results suggest that LSA can identify the information source in
protocols and may also be predictive of narrative and expository comprehension.
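
The cosine metric mentioned above can be sketched as follows. This is a simplified stand-in and not LSA itself: LSA derives its vectors from a dimension-reduced term-by-document matrix built on a large corpus, whereas the sketch below uses raw word counts so that it stays self-contained. The benchmark texts, the protocol, and the function names are hypothetical.

```python
import math
from collections import Counter


def vectorize(text: str) -> Counter:
    """Bag-of-words vector (raw word counts); a stand-in for an LSA vector."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def closest_benchmark(protocol: str, benchmarks: dict[str, str]) -> tuple[str, dict[str, float]]:
    """Score a verbal protocol against each benchmark and return the best match."""
    scores = {name: cosine(vectorize(protocol), vectorize(text))
              for name, text in benchmarks.items()}
    return max(scores, key=scores.get), scores


# Hypothetical benchmarks for one target sentence, one per strategy.
benchmarks = {
    "paraphrase": "the fox ran into the forest",
    "local": "the fox ran because the dog chased it",
    "global": "animals flee from danger to survive",
}

best, scores = closest_benchmark("fox ran into forest", benchmarks)
print(best, {name: round(score, 2) for name, score in scores.items()})
```

In this toy case the protocol lies closest to the paraphrase benchmark, which would suggest a sentence-focused strategy. With genuine LSA vectors, protocols that use different words but similar meanings could also score as similar, which word counts alone cannot capture.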
Creating benchmarks or “strategy predictors” is central to another approach to measuring
reading comprehension described by John Guthrie (Guthrie and Scaffidi, 2004). This method of
assessment is designed to have an impact on standardized test scores and begins by coding a
student’s written summary of a reading assignment in terms of background knowledge, questioning,
and searching in multiple texts. The results are compared with a predefined metric of reading
comprehension that assigns discrete “levels” based on the three strategies used. A quicker,
automated variation of this approach has students “pair” computer-generated words related to a
reading task. The “conceptual maps” that result then can be used to generate path correlation and
coherence data by comparing them with experts’ maps (Ozgungor and Guthrie, 2004). The results
of both methods have been shown to correlate with multiple-text comprehension in grades 3-5.
Although the use of technology can help solve many of the conceptual and practical
problems associated with assessment, several members of the Working Group noted that one
challenge yet to be fully addressed is finding measures of reading comprehension that apply to very
young children. Most assessment techniques presume some basic decoding and memory skills that
the beginning (or struggling) reader simply may not possess. As the above projects also
demonstrate, measures need to be fine-grained enough to identify specific deficiencies in
reading comprehension in order to guide classroom instruction. As one participant observed, it is
imperative to know not only that an individual is poor at reading comprehension but also why an
individual performs poorly and what might be done to remedy poor performance.
Conclusion
The search for a valid and reliable assessment instrument that provides a comprehensive
picture of a child’s early reading development is critical to many educational research initiatives.
The Working Group on Research in Reading Comprehension Measures for Young Children echoes
the call for action to improve the system of comprehension assessment measures available to
reading researchers and teachers. In particular, it endorses the benefits of sharing measures,
instruments, and data from their administration that could be reanalyzed to obtain a better
understanding of what the raw data are missing – i.e., the dimensions extant measures do not fully tap
but that are necessary to accurately reflect the dynamic, developmental nature of comprehension
growth in early readers.
Appendix A:
Members of the Working Group on Research in Reading
Comprehension Measures for Young Children
Patricia Mathes (Chair), Southern Methodist University
Marcia Davis, University of Maryland
David Cohen, University of Michigan (unable to attend; sent Jenny DeMonte, also from the University of Michigan)
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston (via conference call)
David Francis, University of Houston (via conference call)
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Keith Millis, Northern Illinois University
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston (via conference call)
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado
Appendix B
Participants in the November 12, 2004 Meeting
Elizabeth Albro, Institute of Education Sciences, U.S. Department of Education
Marcia Davis, University of Maryland
Jenny DeMonte, University of Michigan
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston (via conference call)
David Francis, University of Houston (via conference call)
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Michelle Llosa, Data Research and Development Center, NORC at the University of Chicago
Patricia Mathes (Chair), Southern Methodist University
Peggy McCardle, National Institute of Child Health and Human Development
Sarah-Kathryn McDonald, Data Research and Development Center, NORC at the University of Chicago
Keith Millis, Northern Illinois University
Barbara Olds, National Science Foundation
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston (via conference call)
Barbara Schneider, Data Research and Development Center, NORC at the University of Chicago
Finbarr Sloane, National Science Foundation
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado
Appendix C
Agenda for the November 12, 2004 Meeting
11:00 am - 11:15 am  Participant registration and introductions

11:15 am - 11:35 am  Agency Introductions
    Finbarr Sloane
    Program Director, National Science Foundation

    Peggy McCardle
    Associate Chief, Child Development & Behavior Branch,
    National Institute of Child Health and Human Development

    Elizabeth Albro
    Research Associate, Teaching and Learning Division,
    National Center for Education Research, Institute of Education Sciences

11:40 am - 12:10 pm  Patricia Mathes
    Meeting Chair, Southern Methodist University
    The Problem: Measuring Comprehension Growth in Young Children

12:10 pm - 12:40 pm  David Francis (conference call)
    Director, Texas Institute for Measurement, Evaluation & Statistics
    Professor of Quantitative Methods, Department of Psychology at the University of Houston

12:40 pm - 1:00 pm  Lunch

1:00 pm - 1:20 pm  Barbara Wise
    Center for Spoken Language Research, University of Colorado at Boulder
    ICARE: Independent Comprehensive Adaptive Reading Evaluation: Why and How?

1:25 pm - 1:45 pm  Richard Wagner
    Associate Director, Florida Center for Reading Research
    Origins of Developmental and Individual Differences in Reading Comprehension
1:50 pm - 2:15 pm  Peter Foltz
    Associate Professor in the Department of Psychology, New Mexico State University
    Pearson Knowledge Technologies
    “Research on new constructed response test-item types for vocabulary, reading and writing”

2:20 pm - 2:50 pm  Barbara Foorman (conference call)
    Professor and Director, Center for Academic and Reading Skills,
    University of Texas-Houston Health Science Center
    The development of comprehension items in the TPRI and the reliability and validity of these items

2:50 pm - 3:15 pm  Joseph Torgesen
    Director, Florida Center for Reading Research
    Variability in the Skills Measured by Tests of “Reading” Comprehension across Tests and Across Grade Levels

3:15 pm - 3:30 pm  Break

3:30 pm - 3:55 pm  Jenny DeMonte
    University of Michigan
    Reading Comprehension and the Study of Instructional Improvement

3:55 pm - 4:10 pm  John Sabatini
    Research Scientist, Educational Testing Service
    Developing Reading Comprehension Assessments Targeting Struggling Readers

4:15 pm - 4:35 pm  Keith Millis
    Northern Illinois University
    “Measuring Comprehension with Latent Semantic Analysis and Verbal Protocols”

4:35 pm - 5:05 pm  John Guthrie and Marcia Davis
    Department of Human Development, University of Maryland
    Measuring Knowledge Acquired from Information Text

5:05 pm - 5:30 pm  Discussion and Next Steps
5:30 pm Adjourn
Appendix D
References and Additional Resources
Allinder, R.M., Fuchs, L.S., & Fuchs, D. (2004). Issues in curriculum-based assessment.
In A. M. Sorrells, H. Rieth, & P. Sindelar (Eds.), Critical issues in special education:
Access, diversity, and accountability (pp. 106-124). Boston: Allyn & Bacon.
Buck, J. & Torgesen, J.K. (2003). The relationship between performance on a measure of
oral reading fluency and performance on the Florida Comprehensive Assessment Test.
Technical Report #1. Tallahassee, FL: Florida Center for Reading Research.
Duke, N. K., & Pearson, P. D. (2002). Effective practices for developing reading
comprehension. CIERA (University of Michigan).
http://www.scholastic.com/dodea/Module_1/resources/dodea_m1_pa_duke.pdf
Fuchs, D., & Fuchs, L.S. (2001). One blueprint for bridging the gap: Project Promise
(Practitioners and researchers orchestrating model innovations to strengthen education).
Teacher Education and Special Education, 24, 304-314.
Foorman, B. R., Francis, D. J., Davidson, K. C., Harm, M. W., & Griffin, J. (2004).
Variability in text features in six grade 1 basal reading programs. Scientific Studies of
Reading, 8, 167-197.
Gough, P.B., Hoover, W.A., & Peterson, C. (1996). Some observations on the simple
view of reading. In C. Cornoldi & J. Oakhill (Eds.), Reading comprehension difficulties.
Hillsdale, NJ: Erlbaum.
Guthrie, J.T., & Wigfield, A. (2000). Engagement and motivation in reading. In M. L. Kamil,
P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading research
(pp. 403-422).
Guthrie, J. T., & Scafiddi, N. T. (2004). Reading comprehension for information text:
Theoretical meanings, developmental patterns, and benchmarks for instruction. In J. T.
Guthrie, A. Wigfield, & K. C. Perencevich (Eds.), Motivating reading comprehension:
Concept-Oriented Reading Instruction (pp. 225–248). Mahwah, NJ: Erlbaum.
Hannon, B., & Daneman, M. (2001). A new tool for measuring and understanding
individual differences in the component processes of reading comprehension. Journal of
Educational Psychology, 93, 103-128.
Mayer, R.E., Schustack, M., & Blanton, W. (1999, March-April). What do children learn
from using computers in an informal collaborative environment? Educational
Technology, 39, 215-227.
Neale, M. D. (1999). Neale Analysis of Reading Ability. Camberwell: ACER Press.
Ozgungor, S., & Guthrie, J. T. (2004). Interactions among elaborative interrogation,
knowledge, and interest in the process of constructing knowledge from text. Journal of
Educational Psychology, 96, 437-444.
Potts, G. R., & Peterson, S. B. (1985). Incorporation versus compartmentalization in
memory for discourse. Journal of Memory and Language, 24, 107-118.
RAND® Reading Study Group (2002). Reading for understanding: Toward an R&D
program in reading comprehension. Santa Monica, CA: RAND®.
Rowan, B., Camburn, E., & Correnti, R. (2004). Using teacher logs to measure the
enacted curriculum in large-scale surveys: A study of literacy teaching in 3rd grade
classrooms. Elementary School Journal, 105, 75-102.
Sabatini, J. P. (2002). Efficiency in word reading of adults: Ability group comparisons.
Scientific Studies of Reading, 6, 267-298.
Sabatini, J. P., Venezky, R. L., Jain, R., & Kharik, P. (2000). Cognitive reading
assessments for low literate adults: An analytic review and new framework (TR00-01).
University of Pennsylvania, National Center on Adult Literacy.
Schatschneider, C., Fletcher, J. M., Francis, D. J., Carlson, C. D., & Foorman, B. R.
(2004). Kindergarten prediction of reading skills: A longitudinal comparative analysis.
Journal of Educational Psychology, 96, 265-282.
Shaywitz, S.E., Shaywitz, B.A., Fletcher, J.M., & Escobar, M.D. (1990). Prevalence of
reading disability in boys and girls: Results of the Connecticut Longitudinal Study.
Journal of the American Medical Association, 264, 998-1002.
Snow, C. (2002). Reading for understanding: Toward an R&D program in reading
comprehension. Santa Monica, CA: RAND.
Wiederholt, J.L., & Bryant, B.R. (2001). Gray Oral Reading Tests, Fourth Edition (GORT).
Austin: PRO-ED.
Woodcock, R.W., & Johnson, M.B. (1989). Woodcock-Johnson Psycho-Educational
Battery-Revised. Chicago: Riverside Publishing Company.