
INTERNAL WORKING DOCUMENT – NOT FOR CIRCULATION OR CITATION

REVISED 5/18/2023

Improving Assessments of Reading Comprehension among Early Readers

Working Group on Research in Reading Comprehension Measures for Young Children

Patricia Mathes (Chair), Southern Methodist University

Marcia Davis, University of Maryland
Jenny DeMonte, University of Michigan
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston
David Francis, University of Houston
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Keith Millis, Northern Illinois University
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado


Table of Contents

The Working Group on Research in Reading Comprehension Measures for Young Children: History and Charge

Challenges of Assessing Reading Comprehension

Unpacking the Construct of Reading Comprehension

Limitations of Current Assessment Tools

Enhancing Assessments of Reading Comprehension

Conclusion

Appendix A: Members of the Working Group on Research in Reading Comprehension Measures for Young Children

Appendix B: Participants in the November 12, 2004 meeting

Appendix C: Agenda for the November 12, 2004 meeting

Appendix D: References and Additional Resources



The Working Group on Research in Reading Comprehension Measures for Young Children: History and Charge

On November 12, 2004, a one-day meeting on Assessing Reading Comprehension with

Young Children was held in Arlington, Virginia. Funded by the National Science Foundation

(NSF) as part of the Interagency Education Research Initiative (IERI, a collaboration of NSF, the

U.S. Department of Education, and the National Institute of Child Health and Human

Development, NICHD), the meeting was designed to address the challenges of measuring reading

comprehension of young children and struggling readers in classroom settings. This specific issue

and the more general challenge of identifying, developing, and administering assessments

sufficiently sensitive to detect responses to interventions with minimal disruption to normal

classroom activities are critical to the IERI program, and to other federal initiatives and education

research activities. There are concerns in the research community that assessments capable of

detecting small yet significant improvements in student learning outcomes may be too resource

intensive to implement in the large-scale longitudinal studies required to obtain rigorous evidence of

the impact of scaling-up promising interventions. Such concerns raise questions regarding the

adequacy and efficacy of current assessments that need to be investigated from multiple viewpoints.

With respect to the assessment of reading comprehension, key questions include: how well do

existing assessments distinguish fluency, decoding, listening comprehension and reading

comprehension; what are the implications of relying on popular methods of assessment for

developing and evaluating responses to interventions; and what practical and conceptual challenges

need to be overcome to enhance assessments of reading comprehension in young children and

struggling readers?



The Working Group on Research in Reading Comprehension Measures for Young

Children was formed to address these issues. The specific catalyst for the group’s formation was a

request from an IERI investigator for assistance from the Data Research and Development Center

(DRDC, a technical assistance and research center established by NSF to support the IERI program;

further information about DRDC & IERI is available online at www.drdc.uchicago.edu). The

investigator raised concerns which DRDC found were shared by several IERI researchers regarding

the limitations of many commonly employed instruments and measures for assessing

comprehension among early readers. DRDC’s efforts to assist the investigator led to a wider

conversation among a small group of IERI researchers, other reading researchers not supported by

the IERI program, and IERI program officers and other senior staff at NSF, NICHD, and the U.S.

Department of Education’s Institute of Education Sciences (IES) regarding the challenges of

employing and the desirability of improving currently available assessment instruments. There was

a general consensus that the challenges the IERI researcher posed have important implications for

many members of the education research and policymaking communities, and should be pursued.

The Working Group comprises fourteen education researchers (including four

conducting IERI-supported research). The group is chaired by Patricia Mathes, Texas Instruments

Chair of Reading, Director of the Institute for Reading Research and Professor of Literacy and

Language Acquisition in the School of Education at Southern Methodist University (principal investigator on

the IERI-supported project “Scaling-Up Effective Intervention for Preventing Reading Difficulties

in Young Children”); a full list of the members of the working group is included in Appendix A.

Eleven of these researchers or their designated spokespersons were able to attend the November

2004 meeting; the other three members joined a portion of the meeting via teleconference. Joining



the group were seven representatives of the three IERI agencies and DRDC; the full list of attendees

is included in Appendix B.

Prior to the meeting, the members of the Working Group received a full briefing on the

objectives of the meeting and invitations to discuss salient aspects of their research and their expert

views on techniques for measuring reading comprehension with children reading at primary grade

levels. Members were specifically encouraged to come prepared to share the measurements and

instruments they use to assess reading comprehension, particularly for lower achievement level

students, and to address the two framework questions for discussion:

What are the problems associated with measuring reading comprehension in early readers?

Which measures provide the best insights into what interventions accomplish with respect to

improving reading comprehension? What is the technical adequacy of data obtained from

existing measures, and what are the implications for the development of additional and/or

enhanced instruments?

This format allowed participants to explore the difficulties they encounter when measuring

comprehension in early readers, and to present their current instruments and materials to a group of

experts able to draw parallels to their own research projects. The complete agenda for the meeting

is included in Appendix C. A transcript of the meeting and copies of presentation materials are

available online to those who attended the meeting; instructions and credentials for accessing these

materials are available from Michelle Llosa (773 256 6189) at DRDC.

This White Paper has been prepared from transcripts of the meeting and presentation

materials provided by the members of the Working Group (a compendium of sources of additional



information on the assessments and research discussed at the meeting and referenced here is

included in Appendix D). It is intended to provide a public record of the issues addressed by the

Working Group on Research in Reading Comprehension Measures for Young Children. It is also

an invitation to other researchers interested in forming a network for sharing measures and

experiences using them. The Working Group welcomes opportunities to work with other

researchers committed to resolving the challenges of measuring comprehension growth among early

readers, thus ultimately strengthening the quality of the evidence base regarding the efficacy of

interventions designed to improve reading comprehension.

Data Research and Development Center

University of Chicago and NORC

February 2005



Challenges of Assessing Reading Comprehension

Taking educational innovations that have been shown to be effective in one or more

contexts and replicating them in other settings has been a consistent goal of educational reform (see,

e.g., Elmore, 1996). What distinguishes recent efforts to bring educational interventions to scale is

a focus on providing compelling evidence (e.g., from scientifically based research) of the impact of

such interventions across a wide range of educational contexts. The cornerstones of such efforts are

robust assessments capable of detecting changes in the student learning outcomes of interest.

Ideally these assessment instruments should yield reliable and valid results with minimal disruption

to normal classroom practice and limited, if any, commitment of additional instructor or other school

resources. Importantly, they should also be widely available to and used by large numbers of both

education researchers and practitioners, particularly in the context of larger accountability

initiatives. The development of tailored assessments may be an option – even a requirement – when

innovative interventions are in a developmental or pilot phase, and fine-grained measures of

responses to interventions are required to judge the merits of continuing R&D work in the area.

There are considerable disadvantages, however, to encouraging researchers to develop and

administer customized assessments in larger-scale evaluations. In particular, the limited

dissemination and administration of robust assessment instruments may constrain comparability

across studies (inhibiting knowledge accumulation) and decrease the prospects of supplementing

individual study findings with information from other large-scale (e.g., national) data collection

efforts.



Recommendations for a research agenda that would translate these general concerns with

the development and diffusion of robust assessments into improved assessments of reading

comprehension have already been proposed (e.g., by the U.S. Department of Education’s Office of

Educational Research and Improvement-sponsored RAND Reading Study Group, RRSG; see Snow,

2002).¹ Of particular concern here are the challenges of developing valid, reliable, and sensitive

assessments of growth in reading comprehension among early (including struggling) readers. In the

absence of such assessments it is not possible to tease out why, for example, an intervention that

made an impact on word attack and word identification scores of the Woodcock-Johnson²

assessment did not (counterintuitively) demonstrate much impact on the passage comprehension

score. One possible interpretation of the findings (with important implications for public support

for the intervention) would be that the intervention did not, in fact, have the anticipated outcome.

Another plausible interpretation (with important implications for both educational research and

accountability programs) is that the measures originally employed to test responses to the

intervention were not appropriate or adequate to distinguish comprehension growth. The specific

issues raised by this example could, of course, be addressed through repeated assessments using

multiple (potentially customized) measures. However, the questions this one set of

counterintuitive findings raises are not isolated to a single intervention or research team. They are

illustrative of a broader set of questions that need to be resolved to improve the confidence with

which decisions can be made regarding the support that should be given to interventions designed to

improve reading comprehension among early readers:

¹ The RRSG’s 2002 report, Reading for Understanding: Toward an R&D Program in Reading Comprehension, includes a summary of the “persistent complaints” regarding “currently available assessments in the field of reading comprehension,” suggests ten minimum requirements for an “adequate system of instrumentation for assessing reading comprehension,” and identifies ten key issues the RRSG argues “a research agenda on reading assessment needs to address” (Snow, 2002: 52-59).

² Woodcock, R.W. & Johnson, M.B. (1989). Woodcock-Johnson Psycho-Educational Battery-Revised. Chicago: Riverside Publishing Company.



What are the problems associated with measuring reading comprehension in early readers?

Which measures provide the best insights into what interventions accomplish with respect to

improving reading comprehension? What is the technical adequacy of data obtained from

existing measures, and what are the implications for the development of additional and/or

enhanced instruments?

This White Paper provides a record of an initial discussion of these issues by the fourteen-member

Working Group on Research in Reading Comprehension Measures for Young Children. Their

comments are organized in three sections. The first section looks at the conceptual and theoretical

questions regarding the key underlying dimensions that need to be represented in a measure of

reading comprehension. Of particular concern is the interface that assessment instruments maintain

between educators’ goals as represented in standards, and research on the development and growth

of reading comprehension (e.g., research on inference, text processing, and memory from cognitive,

developmental, and educational psychology). The second section summarizes the related discussion

of the limitations of current assessment tools, and the improvements required to allow researchers to

subject their core predictions regarding interventions’ impacts to empirical tests. The third section

provides descriptions of and insights from a selection of current research projects that attempt to

measure comprehension in early readers using innovative research designs while proposing several

requirements for the ideal measurement tool.



Unpacking the Construct of Reading Comprehension

In order to measure a child’s ability to comprehend what is being read, we need to be able to

deconstruct the components that infuse our measured results. The first step in developing sound

measures of comprehension is to disentangle these components, i.e., to fully articulate the concept

of reading comprehension.

For the purposes of this meeting, it was suggested that reading be considered “as any reader

interaction with text,” an approach which subsumes comprehension as one aspect of reading and is

consistent with other definitions of reading comprehension (e.g., that articulated in the RAND

Reading Study Group’s 2002 report), which emphasize:

- The interaction of reader, the purpose for reading, text, and social context

- The cognitive capabilities (e.g., attention, memory, critical analytic ability, inferencing, visualization); motivation (e.g., purpose for reading, self-interest in content); knowledge of vocabulary and comprehension strategies; and experiences the reader brings to the act of reading

- The different representations readers construct from text, important to its greater understanding

- The value of considering comprehension as a constructive process.

This last perspective is particularly important in considering comprehension from the perspective

of young children and other early readers who may have little or no previous experience with

which to infer meaning from the letters, words, and sentences that make up a text passage.



Limitations of Current Assessment Tools

A major related concern of the Working Group regarded the adequacy and efficacy of

current assessment tools. The meeting chairperson described an intervention which researchers

believe should have had an impact on comprehension. Studies did find that it had an impact on

word attack and word identification but not on passage comprehension. The question, then, is

whether the intervention really did not have the expected outcome or whether the measure was not

able to tap into whatever effect the intervention may have had. One participant suggested that “we

need more measures of the elements of comprehension” in order to determine whether an

intervention may be improving a component or key process of comprehension without improving

comprehension itself. On the other hand, another member of the Working Group noted that popular

assessment instruments may be too general to measure reading comprehension alone. Thus the

challenge is both to unpack current measures so that foundational or related skills can be excluded

and to create new measures that can isolate the key components of reading comprehension (see the

next section).

Constructs that affect comprehension measurement

While it is apparent that the outcome measures included in current instruments do not fully

tap (e.g., are not sufficiently sensitive to) the changes interventions bring about, it is difficult to

define what is getting in the way. The discussion at the meeting revolved around two main

possibilities: decoding and listening comprehension. Creating a more accurate measure of reading

comprehension hinges in part upon the ability to tease out these foundational skills.

Decoding



Decoding is the ability to decipher printed words by recovering the spoken word that a

printed word represents. Several participants observed that few, if any, measures adequately capture

whether children who are still beginning readers are growing in their ability to comprehend text,

beyond their ability to decode that text. This is due in part to the foundational importance of

decoding and fluency to beginning readers, but it may also result from the practical limits, in terms

of time and cost, of administering a battery of tests that can go beyond measuring these skills.

Whatever the reason, one participant observed that decoding and fluency seem to account for a

great deal of the variation in what comprehension measures are getting at.

Research has shown that decoding ability is one of the strongest predictors of the ability to

comprehend text. The Connecticut Longitudinal Study (Shaywitz et al., 1990), for example, found a

correlation (on the Woodcock Reading Mastery) between decoding and comprehension of 0.89 in

Grade 1 and 0.63 in Grade 9. Part of the reason, according to one Working Group member, is the

simple fact that text that is more difficult to understand also uses words that are more difficult to

decode. “So when one is trying to understand what kids are able to do with text, this understanding

can be confounded by their ability to ‘get the information in’—decoding.” Several members of the

Working Group discussed their current efforts (described more fully below) to account for

differences in decoding ability among early readers.
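One way to read the correlations cited above is to square them, which gives the proportion of variance the two measures share. The short sketch below does only that arithmetic on the two reported values; the function name is illustrative and not part of any study's analysis. On this reading, decoding and comprehension scores share roughly 79% of their variance in Grade 1 and roughly 40% in Grade 9.

```python
# Correlations between decoding and comprehension reported in the text.
correlations = {"Grade 1": 0.89, "Grade 9": 0.63}


def shared_variance(r: float) -> float:
    """Proportion of variance two measures share, given their correlation r."""
    return r ** 2


for grade, r in correlations.items():
    # Print the correlation alongside the squared value as a percentage.
    print(f"{grade}: r = {r:.2f}, shared variance = {shared_variance(r):.0%}")
```

The squared values describe overlap between test scores only; they do not by themselves show that decoding causes comprehension differences.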

“Reading” Comprehension versus “Listening” Comprehension

Another issue that arose during the opening comments was the difference between

comprehending text and comprehending speech. Is a study that uses a computer to tell a story to

children, for example, actually measuring “reading” comprehension or “listening” comprehension?

One participant observed that this distinction is critical for beginning readers since stories that are



(appropriately) simplified to meet the reading and vocabulary level of the child convey minimal

content. Many assessors respond to this constraint by having the child respond to text that is read to

them. The question then is whether “listening comprehension is an alternative” to reading

comprehension?

This subject was broached again in a later presentation. The Working Group member

observed that “[w]e keep talking about how reading comprehension and decoding interfere with

each other.” But if we remove decoding by reading to the child, we must ask ourselves “what is

unique to visual text processing” since this will be left out of assessments of comprehension too.

“Even beyond the decoding-fluency level,” what are other aspects in the way one processes print

that distinguish reading comprehension from listening comprehension? Brains learn how to

comprehend text by reading, so children with deficiencies in decoding and fluency, i.e., “the bottom

half of the distribution,” may continue to fall farther behind no matter how much they are read to.

Listening comprehension could suffer too since poor readers “won’t be able to handle . . . all the

sentence structures they might have to listen to and process or reason about.”

Decoding and listening abilities do not exhaust the list of cognitive skills that may be

foundational to or confused with reading comprehension. Other possible confounding factors

discussed at the meeting include working memory, background knowledge, vocabulary, sentence

processing, and verbal reasoning. Getting a “clean” measure of reading comprehension also means

controlling for individual differences in this set of skills. The next section provides an overview of

the different research projects presented at the meeting that have begun to tackle some of these

difficult issues in measuring comprehension in early readers.



Enhancing Assessments of Reading Comprehension

The meeting included several presentations of new or ongoing research projects that are

confronting the challenges posed by current methods of assessing reading comprehension. These

projects roughly divide into those that are seeking more fine-grained or detailed measures of

comprehension and those that are attempting to develop assessment techniques that are practical for

use in the classroom. An area of common interest is the need to ground assessment in models of

human cognition and to incorporate technology-supported learning environments.

Detailed Measures of Comprehension

David Francis talked via conference call about his work on isolating the specific nature of

reading problems by disentangling decoding abilities from the ability to comprehend. This work

builds on research by Potts and Peterson (1985) and Hannon and Daneman (2001) and targets “four

different aspects of comprehension”:

1. Text memory
2. Background knowledge
3. Inferencing based on the text
4. Integration with background knowledge

A student having reading problems may be having trouble with vocabulary or with extracting

information from the text. Accounting for this difference means controlling for individual

differences in background knowledge. “The idea,” according to Francis, “is to build texts that are

simple to read in terms of decoding but complex to understand.” Although only a few stories have

been developed, the result, at least from the 2nd grade on, is a relatively “clean” measure of the

reasoning processes involved in reading comprehension.

Another project looking to trace comprehension problems back to their source was presented

by Richard Wagner. Building on the recommendations of an NRC report titled “Knowing What



Students Know,” this series of five studies of 2nd and 4th graders incorporates both IT and

assessments grounded in models of cognition, learning, and development (Snow, 2002). The

resulting constructs are to be measured with multiple indicators and include working memory

(reading span, listening span, visuo-spatial span), morphological awareness (decomposition,

derivation), decoding (accuracy, fluency), and vocabulary. Assessments themselves involve a

combination of traditional and experimental measures. The goal of the project is to be able to

identify the relative importance of the measured constructs for improving reading comprehension.

The final phase also will look at the added value of passage-specific over generic assessment

information for predicting outcomes on three comprehension performance tests.

Joseph Torgesen talked about a similar effort to determine the kinds of skills that are lacking

in students who struggle with reading comprehension. This project builds on the Florida

Comprehensive Assessment Test (FCAT), which “was specifically created to place high demands

on vocabulary and reasoning/inferential skill.” The result is that performance hinges upon a

combination of reading, language, and cognitive abilities. Torgesen and his team gave a two-hour

battery of tests that measure a variety of language, reading, reasoning, and memory skills to about

200 students in 3rd, 7th, and 10th grades (Buck & Torgesen, 2003). A factor analysis showed that

differences in decoding skills and fluency accounted for much of the variation in reading

performance on the FCAT at the 3rd and 7th grades. Fluency decreased in importance at the 10th

grade, while non-verbal reasoning ability became somewhat more important. However, Torgesen

emphasized that different tests of reading comprehension might give different results.

Finding better ways of decomposing a rich data set was the subject of Jenny DeMonte’s

presentation. She discussed her work with Deborah Ball, David Cohen, and Brian Rowan on

analyzing reading comprehension data collected from teacher logs on two cohorts of students (K-2



and 3-5) in 120 schools undergoing comprehensive school reform (Rowan et al., 2004). Detailed

case studies of twelve schools focused on teachers who had taught one of three “gateway” items:

comprehension, writing, and word analysis. Instruction was then matched to outcomes by

using Terra Nova (levels 10 through 16) to measure student progress at six points over three years.

The results show “substantial gains in word analysis and not much of a gain in comprehension from

K through five.” The question is whether Terra Nova actually measures comprehension or allows a

more fine-grained analysis of comprehension skills. Some data are even being lost because Terra

Nova does not always map onto the teacher logs.

John Sabatini described a new research project at Educational Testing Service (ETS) “to

identify and empirically examine which subsets of comprehension skills are linked to difficulties in

understanding printed materials for individual struggling readers” (Sabatini, 2002). This project

builds on the current Development Reading Comprehension Targeting Struggling Readers project

but focuses on younger readers (Sabatini et al., 2000). Among this age group it is important to go

“beyond [even] the decoding-fluency level” and consider “other aspects in the way one processes

print that distinguishes reading comprehension from listening comprehension.” The goal is to

develop assessments that can be used in education delivery settings to identify sources of reading

comprehension difficulty, which means being able to identify problems “at any or every level of

text processing,” including decoding, vocabulary, fluency, and sentence processing.

Practical Measures of Comprehension

The timely delivery of assessments to teachers is central to the Independent Comprehensive

Adaptive Reading Evaluation (ICARE) project. Barbara Wise discussed how the ICARE project

tackles the practical need to be able to diagnose reading problems quickly and with limited

personnel (Allinder et al., 2004). The goal of this effort in “instructional profiling” is to be able to



identify “good” readers in less than 15 minutes in order to guide instruction and evaluate

interventions. ICARE uses the “simple model of reading” (Gough et al., 1996) to build profiles of

good readers along the dimensions of word reading ability and listening comprehension. The

screening test also is delivered using speech recognition, which displays the words on a computer

(with 91% accuracy) as the child reads. Students who exhibit difficulties with time-limited word

reading are screened for more specific reading deficits using a variety of comprehension measures.
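The two-dimensional profiling described above can be illustrated with a minimal sketch. It assumes the multiplicative form in which the simple view of reading is commonly written (reading comprehension as the product of word reading and listening comprehension); whether ICARE applies that exact form is not stated here. The 0-1 score scale, the cutoff, and the four profile labels are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Hypothetical cutoff on a 0-1 scale. The text does not describe ICARE's
# actual scoring, so this value is an assumption for illustration only.
CUTOFF = 0.5


@dataclass
class ReaderScores:
    word_reading: float             # hypothetical 0-1 score
    listening_comprehension: float  # hypothetical 0-1 score


def predicted_reading_comprehension(scores: ReaderScores) -> float:
    """Product of the two dimensions (assumed multiplicative form)."""
    return scores.word_reading * scores.listening_comprehension


def profile(scores: ReaderScores) -> str:
    """Place a reader in one of four illustrative (hypothetical) profiles."""
    strong_words = scores.word_reading >= CUTOFF
    strong_listening = scores.listening_comprehension >= CUTOFF
    if strong_words and strong_listening:
        return "adequate on both dimensions"
    if strong_listening:
        return "word reading difficulty"
    if strong_words:
        return "listening comprehension difficulty"
    return "difficulty on both dimensions"


example = ReaderScores(word_reading=0.3, listening_comprehension=0.8)
print(profile(example), round(predicted_reading_comprehension(example), 2))
```

Under the product form, a low score on either dimension pulls the predicted comprehension score down, which is why a quick screen on word reading can flag students for closer follow-up.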

A series of research projects described by Peter Foltz is attempting to “develop and evaluate

novel techniques for improving vocabulary assessment and reading comprehension.” The approach

used builds upon the traditional Cloze test (Mayer et al., 1999) and draws on computational models

of language simulating human understanding of words and texts. This “Open-Cloze” test works by

removing words from a sentence and giving students the unprompted (i.e., no multiple choice) task

of filling in the blank. The “appropriateness” of the word chosen is determined by having humans

(or a computer) rate its effect on the “true” meaning of the sentence. The goal is to have a calibrated

model of each potential word’s contribution to the meaning of a sentence so that deviations from

“meaning” can be used to measure reading comprehension and improve literacy skills via human or

computer feedback. One benefit of this approach is that it can be used to assess the literacy skills of

very young children.
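The mechanics of an open-ended cloze item can be sketched in a few lines. The example below is a simplified illustration, not the project's implementation: the sentence, the function names, and the appropriateness ratings are all hypothetical, and the ratings table stands in for the human or computer judgments described above.

```python
def make_open_cloze(sentence: str, blank_index: int) -> tuple[str, str]:
    """Blank out one word and return (item text, removed word)."""
    words = sentence.split()
    target = words[blank_index]
    words[blank_index] = "_____"
    return " ".join(words), target


def score_response(response: str, ratings: dict[str, float]) -> float:
    """Look up a response's appropriateness rating; unrated words score 0."""
    return ratings.get(response.strip().lower(), 0.0)


# Hypothetical sentence and hypothetical ratings (1.0 = preserves the meaning).
item, target = make_open_cloze("The dog chased the ball across the yard", 2)
ratings = {"chased": 1.0, "followed": 0.7, "ate": 0.2}

print(item)
print(score_response("followed", ratings))
```

Because responses are scored by degree rather than as right or wrong, a plausible substitute such as "followed" earns partial credit, which is the property that distinguishes this format from a multiple-choice cloze.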

Developing measures of comprehension skills in early readers is the focus of the research

described by Barbara Foorman and Kristi Santi. The Texas Primary Reading Inventory (TPRI) uses

“authentic passages” to measure reading comprehension, but teachers found that the stories were

too hard for 1st graders (Foorman et al., 2004). Foorman and her team took on the challenge of

developing their own reading passages which controlled for decoding abilities. The task then

became one of matching students to the given instructional level of a text (Schatschneider et al.,



2004). This was done by having students read through word lists of increasing difficulty in terms of

IRT value. Matching vocabulary IRT value to passage Lexile score was successful for placing

students at their appropriate instructional level from the 3rd grade on. However, placing younger

students proved more difficult given the sub-lexical complexity of words.

Keith Millis described an effort “to measure comprehension as it’s happening.” The

approach, which draws on Latent Semantic Analysis (LSA), creates “semantic benchmarks” in a text that

represent different reading comprehension strategies such as paraphrasing (sentence-focused),

elaborating in terms of the prior sentence (local), or using prior text to help develop a theme

(global). The assessment is done by having students create a “verbal protocol” via a Web-based

interactive program. The similarity between a verbal protocol and the semantic benchmarks

provides a metric (expressed as a cosine) of reading comprehension (Magliano & Millis, 2003). The

key, of course, is selecting good benchmarks, which can be drawn from theory but also must rest on

some pragmatic concerns. Early results suggest that LSA can identify the information source in

protocols and may also be predictive of narrative and expository comprehension.
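The cosine metric mentioned above can be illustrated with a small sketch. Note the substitution: real LSA derives vectors from a reduced semantic space built over a large corpus, whereas this example uses plain word-count vectors so that it stays self-contained. The benchmark sentences, the sample protocol, and the function names are hypothetical.

```python
import math
from collections import Counter


def to_vector(text: str) -> Counter:
    """Word-count vector; a stand-in for an LSA vector (assumption)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine of the angle between two sparse count vectors."""
    dot = sum(a[term] * b[term] for term in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical benchmarks, one per strategy named in the text.
benchmarks = {
    "paraphrase": "the fox ran into the woods",
    "local": "the fox ran because the farmer chased it",
    "global": "the fox always escapes danger in this story",
}

protocol = "the fox ran away into the woods"  # hypothetical student protocol
scores = {
    name: cosine(to_vector(protocol), to_vector(text))
    for name, text in benchmarks.items()
}
best = max(scores, key=scores.get)
print(best, {name: round(value, 2) for name, value in scores.items()})
```

The benchmark with the highest cosine suggests which strategy the protocol most resembles; with count vectors that reflects shared wording only, whereas LSA vectors are intended to capture similarity in meaning.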

Creating benchmarks or “strategy predictors” is central to another approach to measuring

reading comprehension described by John Guthrie (Guthrie and Scaffidi, 2004). This method of

assessment is designed to have an impact on standardized test scores and begins by coding a

student’s written summary of a reading assignment in terms of background knowledge, questioning,

and searching in multiple texts. The results are compared with a predefined metric of reading

comprehension that assigns discrete “levels” based on the three strategies used. A quicker,

automated variation of this approach has students “pair” computer-generated words related to a

reading task. The “conceptual maps” that result then can be used to generate path correlation and



coherence data by comparing them with experts’ maps (Ozgungor and Guthrie, 2004). The results

of both methods have been shown to correlate with multiple text comprehension in grades 3-5.

Although the use of technology can help solve many of the conceptual and practical

problems associated with assessment, several members of the Working Group noted that one

challenge yet to be fully addressed is finding measures of reading comprehension that apply to very

young children. Most assessment techniques presume some basic decoding and memory skills that

the beginning (or struggling) reader simply may not possess. As the above projects also

demonstrate, measures need to be fine-grained enough to determine specific deficiencies in

reading comprehension in order to guide classroom instruction. As one participant observed, it is

imperative to know not only that an individual is poor at reading comprehension but also why an

individual performs poorly and what might be done to remedy poor performance.

Conclusion

The search for a valid and reliable assessment instrument that provides a comprehensive

picture of a child’s early reading development is critical to many educational research initiatives.

The Working Group on Research in Reading Comprehension Measures for Young Children echoes

the call for action to improve the system of comprehension assessment measures available to

reading researchers and teachers. In particular, it endorses the benefits of sharing measures and

instruments and data from their administration that could be reanalyzed to obtain a better

understanding of what the raw data is missing – i.e., the dimensions extant measures do not fully tap

but that are necessary to accurately reflect the dynamic, developmental nature of comprehension

growth in early readers.


Appendix A:

Members of the Working Group on Research in Reading

Comprehension Measures for Young Children

Patricia Mathes (Chair), Southern Methodist University
Marcia Davis, University of Maryland
David Cohen, University of Michigan (unable to attend; sent Jenny DeMonte, also from the University of Michigan)
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston (via conference call)
David Francis, University of Houston (via conference call)
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Keith Millis, Northern Illinois University
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston (via conference call)
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado


Appendix B

Participants in the November 12, 2004 Meeting

Elizabeth Albro, Institute of Education Sciences, U.S. Department of Education
Marcia Davis, University of Maryland
Jenny DeMonte, University of Michigan
Peter Foltz, New Mexico State University
Barbara Foorman, University of Texas at Houston (via conference call)
David Francis, University of Houston (via conference call)
John Guthrie, University of Maryland
Tom Landauer, University of Colorado
Michelle Llosa, Data Research and Development Center, NORC at the University of Chicago
Patricia Mathes (Chair), Southern Methodist University
Peggy McCardle, National Institute of Child Health and Human Development
Sarah-Kathryn McDonald, Data Research and Development Center, NORC at the University of Chicago
Keith Millis, Northern Illinois University
Barbara Olds, National Science Foundation
John Sabatini, Educational Testing Service
Kristi Santi, University of Texas at Houston (via conference call)
Barbara Schneider, Data Research and Development Center, NORC at the University of Chicago
Finbarr Sloane, National Science Foundation
Joe Torgesen, Florida State University
Richard Wagner, Florida State University
Barbara Wise, University of Colorado


Appendix C

Agenda for the November 12, 2004 Meeting

11:00 am - 11:15 am Participant registration and introductions

11:15 am - 11:35 am Agency Introductions

Finbarr Sloane
Program Director, National Science Foundation

Peggy McCardle
Associate Chief, Child Development & Behavior Branch
National Institute of Child Health and Human Development

Elizabeth Albro
Research Associate, Teaching and Learning Division
National Center for Education Research, Institute of Education Sciences

11:40 am - 12:10 pm Patricia Mathes
Meeting Chair
Southern Methodist University
The Problem: Measuring Comprehension Growth in Young Children

12:10 pm - 12:40 pm David Francis (conference call)
Director, Texas Institute for Measurement, Evaluation & Statistics
Professor of Quantitative Methods, Department of Psychology, University of Houston

12:40 pm - 1:00 pm Lunch

1:00 pm - 1:20 pm Barbara Wise
Center for Spoken Language Research
University of Colorado at Boulder
ICARE: Independent Comprehensive Adaptive Reading Evaluation: Why and How?

1:25 pm - 1:45 pm Richard Wagner
Associate Director, Florida Center for Reading Research
Origins of Developmental and Individual Differences in Reading Comprehension


1:50 pm - 2:15 pm Peter Foltz
Associate Professor, Department of Psychology, New Mexico State University
Pearson Knowledge Technologies
“Research on new constructed response test-item types for vocabulary, reading and writing”

2:20 pm - 2:50 pm Barbara Foorman (conference call)
Professor and Director
University of Texas-Houston Health Science Center
Center for Academic and Reading Skills
The development of comprehension items in the TPRI and the reliability and validity of these items

2:50 pm - 3:15 pm Joseph Torgesen
Director, Florida Center for Reading Research
Variability in the Skills Measured by Tests of “Reading” Comprehension across Tests and across Grade Levels

3:15 pm - 3:30 pm Break

3:30 pm - 3:55 pm Jenny DeMonte
University of Michigan
Reading Comprehension and the Study of Instructional Improvement

3:55 pm - 4:10 pm John Sabatini
Research Scientist, Educational Testing Service
Developing Reading Comprehension Assessments Targeting Struggling Readers

4:15 pm - 4:35 pm Keith Millis
Northern Illinois University
“Measuring Comprehension with Latent Semantic Analysis and Verbal Protocols”

4:35 pm - 5:05 pm John Guthrie and Marcia Davis
Department of Human Development, University of Maryland
Measuring Knowledge Acquired from Information Text

5:05 pm - 5:30 pm Discussion
Next Steps

5:30 pm Adjourn


Appendix D

References and Additional Resources

Allinder, R.M., Fuchs, L.S., & Fuchs, D. (2004). Issues in curriculum-based assessment.

In A. M. Sorrells, H. Rieth, & P. Sindelar (Eds.), Critical issues in special education:

Access, diversity, and accountability (pp. 106-124). Boston: Allyn & Bacon.

Buck, J. & Torgesen, J.K. (2003). The relationship between performance on a measure of

oral reading fluency and performance on the Florida Comprehensive Assessment Test.

Technical Report #1. Tallahassee, FL: Florida Center for Reading Research.

Duke, N.K., & Pearson, P.D. (2002). Effective practices for developing reading

comprehension. CIERA (University of Michigan).

http://www.scholastic.com/dodea/Module_1/resources/dodea_m1_pa_duke.pdf

Fuchs, D., & Fuchs, L.S. (2001). One blueprint for bridging the gap: Project Promise

(Practitioners and researchers orchestrating model innovations to strengthen education).

Teacher Education and Special Education, 24, 304-314.

Foorman, B. R., Francis, D. J., Davidson, K. C., Harm, M. W., & Griffin, J. (2004).

Variability in text features in six grade 1 basal reading programs. Scientific Studies of

Reading, 8, 167-197.


Gough, P.B., Hoover, W.A., & Peterson, C. (1996). Some observations on the simple

view of reading. In C. Cornoldi & J. Oakhill (Eds.), Reading comprehension difficulties.

Hillsdale, NJ: Erlbaum.

Guthrie, J.T., & Wigfield, A. (2000). Engagement and motivation in reading. In

M. L. Kamil, P. B. Mosenthal, P. D. Pearson, & R. Barr (Eds.), Handbook of reading

research (pp. 403-422).

Guthrie, J. T., & Scafiddi, N. T. (2004). Reading comprehension for information text:

Theoretical meanings, developmental patterns, and benchmarks for instruction. In J. T.

Guthrie, A. Wigfield, & K. C. Perencevich (Eds.), Motivating reading comprehension:

Concept-Oriented Reading Instruction (pp. 225–248). Mahwah, NJ: Erlbaum.

Hannon, B., & Daneman, M. (2001). A new tool for measuring and understanding

individual differences in the component processes of reading comprehension. Journal of

Educational Psychology, 93, 103-128.

Mayer, R.E., Schustack, M., & Blanton, W. (1999, March-April). What do children learn

from using computers in an informal collaborative environment? Educational

Technology, 39, 215-227.

Neale, M.D. (1999). Neale Analysis of Reading Ability. Camberwell: ACER Press.


Ozgungor, S., & Guthrie, J. T. (2004). Interactions among elaborative interrogation,

knowledge, and interest in the process of constructing knowledge from text. Journal of

Educational Psychology, 96, 437-444.

Potts, G. R., & Peterson, S. B. (1985). Incorporation versus compartmentalization in

memory for discourse. Journal of Memory and Language, 24, 107-118.

RAND Reading Study Group (2002). Reading for understanding: Toward an R&D

program in reading comprehension. Santa Monica, CA: RAND.

Rowan, B., Camburn, E., & Correnti, R. (2004). Using teacher logs to measure the

enacted curriculum in large-scale surveys: A study of literacy teaching in 3rd grade

classrooms. Elementary School Journal, 105, 75-102.

Sabatini, J. P. (2002). Efficiency in word reading of adults: Ability group comparisons.

Scientific Studies of Reading, 6, 267-298.

Sabatini, J. P., Venezky, R. L., Jain, R., & Kharik, P. (2000). Cognitive reading

assessments for low literate adults: An analytic review and new framework (TR00-01).

University of Pennsylvania, National Center on Adult Literacy.


Schatschneider, C., Fletcher, J. M., Francis, D. J., Carlson, C. D., & Foorman, B. R.

(2004). Kindergarten prediction of reading skills: A longitudinal comparative analysis.

Journal of Educational Psychology, 96, 265-282.

Shaywitz, S.E., Shaywitz, B.A., Fletcher, J.M., & Escobar, M.D. (1990). Prevalence of

reading disability in boys and girls: results of the Connecticut Longitudinal Study.

Journal of the American Medical Association, 264, 998-1002.

Snow, C. (2002). Reading for understanding: Toward an R&D program in reading

comprehension. Santa Monica, CA: RAND.

Wiederholt, J.L., & Bryant, B.R. (2001). Gray Oral Reading Tests, Fourth Edition (GORT).

Austin: PRO-ED.

Woodcock, R.W., & Johnson, M.B. (1989). Woodcock-Johnson Psycho-Educational

Battery-Revised. Chicago: Riverside Publishing Company.
