
Designing and Assessing Summer Reading Programs

Scott Paris

University of Michigan/CIERA

Research Team

♦ University of Michigan
  – Scott Paris, Rob Carpenter, Alison Paris, Melissa Mercer

♦ Michigan State University
  – David Pearson, Gina Cervetti, Stephanie Davis, Joseph Martineau

♦ Ingham Intermediate School District
  – Jonathan Flukes, Tamara Bashore-Berg, Kathy Humphrey

♦ Michigan Department of Education
  – Sheila Potter, Bonnie Rockafellow

Special Thanks to the Teachers and Staffs from Schools in Ann Arbor, Willow Run, Romulus, Milan, Grand Rapids, Southfield, Saginaw, Macomb County, Flint, Leslie, Kalamazoo, Portage, Gaylord, St. Ignace, Traverse City, Cesar Chavez Academy, Waverly, and the Early Literacy Committee

Closing the Gap

♦ Summer reading programs provide supplementary reading interventions

♦ Summer reading programs provide accountability, perhaps a retention gate

♦ Summer reading programs require money, planning, and training

♦ Lack of sustained vision, funding, and training undermines summer programs

Closing the Gap

Children in summer reading programs:

♦ Spend more time reading and writing

♦ Work in guided activities

♦ Receive focused remedial instruction

♦ Experience success and enrichment

♦ Promote family involvement

♦ Do not experience summer reading loss

Goals 2000: Research Goals

♦ To evaluate the effectiveness of K-3 summer reading programs in sample sites in Michigan

♦ To develop assessment tools for K-3 literacy as part of the MLPP

♦ To provide suggestions to improve K-3 summer reading programs

Year 1 Study

♦ 6 districts throughout Michigan

♦ K-3 students in summer programs

♦ Pre-test and post-test on QRI

♦ No control group

♦ Observations in schools

Conclusions From Year 1

♦ Good news:
  – Children read same passages better after summer school
  – Observations and teacher logs revealed features of effective programs

♦ Worries:
  – No control for practice or maturation
  – No control group without summer school

Recommendations for Evaluation Criteria for 1999 (Year 2) Programs

➔ High Standards

➔ Intensive support, feedback

➔ Daily opportunities to:

• read easy materials

• read challenging materials

• write authentic texts

➔ Word-reading, comprehension, writing, and monitoring skills

➔ Student motivation

➔ Guided reading and writing

Additional Desirable Characteristics of 1999 Summer Programs

Ø Manageable class sizes

Ø Knowledgeable and experienced staff

Ø P.D. opportunities for teachers

Ø Quality instructional materials

Ø Effective use of libraries/media resources

Ø Assessment criteria to determine student selection, student progress, and program effectiveness

Ø Minimum of 60 hours of prime instructional time

Ø Parent involvement

Ø Leadership and accountability

Year 2: Design

♦ K-3 summer programs in 12 Michigan districts

♦ Tested >1000 children who were eligible or recommended for summer school

♦ Pretest: Spring 1999

♦ Posttest: Fall 1999

♦ Delayed posttest: Spring 2000

♦ Compare Experimental & Control students

Measures for Year 2

♦ Gates-MacGinitie Reading Tests

♦ Johns Basic Reading Inventory (BRI)

♦ Literacy Habits

♦ Student Opinions About Reading (SOAR)

♦ MLPP tasks for pre-readers
  – Concepts of Print
  – Phonemic Awareness

The Gates-MacGinitie Reading Tests

♦ Level PRE

– Literacy Concepts, Reading Instruction & Relational Concepts, Oral Language Concepts, Letter-Sound Correspondences

♦ Level R

– Beginning Consonants, Final Consonants, Vowels, Use of Context

♦ Levels 1, 2, 3

– Vocabulary, Comprehension, Total

Pros & Cons of a Standardized Test

Benefits

♦ Group administered in about an hour

♦ Multiple forms for pre & post testing

♦ Subscores and scaled scores

♦ Administrators want/expect it

Liabilities

♦ Young children unfamiliar with format

♦ Children distressed

♦ Not aligned with curricula & instruction

♦ May measure ability, not achievement

Level PRE Results (Raw Score)

               Pretest   Posttest
Experimental     67.2      72.0
Control          67.3      68.8

Level R Results (Scaled Score)

               Pretest   Posttest
Experimental    299.0     309.1
Control         321.7     318.9

Level 1 Results (Scaled Score)

               Pretest   Posttest
Experimental    372.0     380.9
Control         380.7     382.5

Levels 2 & 3 Results (Scaled Score)

                     Pretest   Posttest
Experimental - 2      406.1     413.5
Control - 2           405.8     409.8
Experimental - 3      433.7     430.7
Control - 3           440.3     445.7

Year 2: Conclusions for GMRT, Spring-Fall 99

♦ No gender differences at any level

♦ Greatest benefits for beginning or struggling readers, usually the youngest children

♦ No gains in standardized test scores for better/older readers

HLM Analyses: Value Added

♦ Some sites showed little summer gain

♦ Some sites showed large summer gains

♦ Value of summer program increased if:
  – More hours devoted to reading instruction
  – Used structured programs such as Accelerated Reader or the Richard Owen Literacy Network

(A sketch of this kind of value-added model follows the chart below.)

[Chart: scaled-score growth from pretest to posttest to extended posttest for four profiles: the average treatment student at a maximum Accelerated Reader site, the average treatment student at a site with no Accelerated Reader, a control student at a maximum Accelerated Reader site, and a control student at a site with no Accelerated Reader]
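As a rough illustration of the value-added analysis described above, the sketch below fits a two-level model with students nested in program sites and site-level features predicting posttest scores. The file name and every column name (pretest, posttest, treatment, hours_reading, structured_program, site) are assumptions made for illustration; this is not the study's actual model or data.

```python
# Sketch of an HLM-style value-added model: students nested within summer
# program sites, with site-level features predicting posttest performance.
# All names below are illustrative placeholders.
import pandas as pd
import statsmodels.formula.api as smf

data = pd.read_csv("summer_scores.csv")  # hypothetical file: one row per student

model = smf.mixedlm(
    "posttest ~ pretest + treatment + hours_reading + structured_program",
    data,
    groups=data["site"],  # random intercept for each summer-program site
)
result = model.fit()
print(result.summary())
```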

Oral Reading Measures

Benefits
♦ Aligned with daily instruction
♦ Multiple measures of fluency & comprehension collected simultaneously
♦ Diagnostic immediately

Liabilities
♦ Requires expertise to administer & interpret
♦ Requires multiple passages and levels
♦ Accuracy (i.e., running records & miscues) or rate are insufficient by themselves
♦ Teachers may “teach” commercial materials

Oral Reading Data: BRI

♦ Graded Word Lists

♦ Words/Minute Read Correctly on Grade-Level Passage (see the calculation sketch below)

♦ Fluency = Rate, Accuracy, and Prosody

♦ Miscues & Self-Corrections

♦ Propositions & Key Ideas Recalled

♦ Comprehension Questions
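A minimal sketch of the rate and accuracy calculations behind such an oral reading record; the function names and example numbers are invented for illustration and are not taken from the BRI itself.

```python
# Illustrative calculations for an oral reading record: words read correctly
# per minute (rate) and percent accuracy on a timed, grade-level passage.

def words_correct_per_minute(total_words: int, miscues: int, seconds: float) -> float:
    """Rate: words read correctly per minute on a timed passage."""
    return (total_words - miscues) / (seconds / 60.0)

def percent_accuracy(total_words: int, miscues: int) -> float:
    """Accuracy: percent of words read correctly (self-corrections not counted as miscues)."""
    return 100.0 * (total_words - miscues) / total_words

# Example: a 120-word passage read in 90 seconds with 6 uncorrected miscues
print(words_correct_per_minute(120, 6, 90))  # 76.0 words correct per minute
print(percent_accuracy(120, 6))              # 95.0 percent accuracy
```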

BRI Results

♦ Analyzed % Accuracy, % Questions Correct, % Propositions Recalled by passage levels

♦ No Exp-Control differences on oral reading measures

BUT
♦ Difficulty of different levels confounded
♦ BRI differences between Forms A & B
♦ Groups not equivalent at pre-test

Solutions?

♦ Use same forms or same passages for pre-test and post-test

♦ Use ANCOVA or pre-post difference scores (gains); see the sketch below

♦ Use IRT analyses
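To make the second option concrete, here is a minimal sketch contrasting simple gain scores with an ANCOVA that adjusts posttest scores for pretest differences between groups. The small data frame and its column names are invented for illustration and are not drawn from the study.

```python
# Sketch of the two adjustment options named above, using illustrative
# column names (pre, post, group) rather than the study's actual data.
import pandas as pd
import statsmodels.formula.api as smf

scores = pd.DataFrame({
    "pre":   [64, 66, 70, 68, 65, 71, 69, 67],
    "post":  [70, 71, 72, 69, 66, 72, 70, 68],
    "group": ["summer", "summer", "summer", "summer",
              "control", "control", "control", "control"],
})

# Option 1: simple pre-post gain scores compared across groups
scores["gain"] = scores["post"] - scores["pre"]
print(scores.groupby("group")["gain"].mean())

# Option 2: ANCOVA -- posttest regressed on group with pretest as covariate,
# which adjusts for groups that were not equivalent at pretest
model = smf.ols("post ~ pre + C(group)", data=scores).fit()
print(model.summary())
```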

Item Response Theory (IRT)

♦ Makes one scale for different passages so different grades/levels can be compared on the same scale (a Rasch-style sketch follows this list)

♦ Based on local norms & data

♦ Shows growth & progress

♦ Like MEAP, NAEP, SAT, GRE tests

♦ Easy to report

♦ Complex to understand how scores are calculated and interpreted
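The sketch below illustrates the core idea with a simple Rasch model: once every passage has a difficulty on a common logit scale, children who read different passages can still be compared on that same scale. The passage difficulties and the child's ability value are invented for illustration; this is not the scaling actually used in the study.

```python
# Minimal Rasch-model sketch of the IRT idea: passages of different difficulty
# are placed on one logit scale, so readers of different passages are comparable.
# The difficulty and ability values below are invented for illustration only.
import math

def p_correct(ability: float, difficulty: float) -> float:
    """Rasch model: probability of success given person ability and item difficulty (logits)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

passage_difficulty = {"primer": -1.5, "grade 1": -0.5, "grade 2": 0.5, "grade 3": 1.5}

child_ability = 0.2  # one child's estimated ability on the common logit scale
for passage, b in passage_difficulty.items():
    print(f"{passage:8s} P(success) = {p_correct(child_ability, b):.2f}")
```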

Mean RATE Values

[Chart: mean IRT rate scores for Experimental and Control groups at pretest, posttest, and delayed posttest]

Mean FLUENCY Values

[Chart: mean IRT fluency scores for Experimental and Control groups at pretest, posttest, and delayed posttest]

Mean COMPREHENSION Values

[Chart: mean IRT comprehension scores for Experimental and Control groups at pretest, posttest, and delayed posttest]

Conclusions About IRT Analyses

♦ Same scale resolves differences between passages and reveals Exp-Control differences

♦ IRT scores easy to compare for growth

♦ IRT scores may be a solution for IRI analyses and reports nationwide

Literacy Habits Items

♦ How often do you…
  – visit the library or Bookmobile?
  – write stories or poems at home just for fun?
  – read at home for fun?
  – read books or stories at bedtime?
  – have your parents help you read or write at home?

♦ Response options: Hardly ever / About once a week / Almost every day

Literacy Habits

[Chart: mean Literacy Habits scores (scale 0-30) for Control and Experimental groups at pre- and posttest; plotted values of 18.4, 18.1, 18.8, and 18.4 show essentially no group differences]

Grade 2-3 SOAR Items

♦ Opinions (16 items)
  – I can read out loud in class without making many mistakes.
  – I pay attention when I read in class.
  – I choose to read things that really make me think.
  – I think reading is fun.

♦ Response options: Not like me / A little like me / A lot like me

Grades 2-3 SOAR

[Chart: mean SOAR scores (scale 0-48) for Control and Experimental groups at pre- and posttest; plotted values of 39.9, 40.7, 40.0, and 40.5 again show no group differences]

Parents’ Survey

♦ 25-item survey mailed to parents

♦ Includes 15 items about reading habits with 4 response options about frequency of activities

♦ Includes 10 items about attitudes with 4 options for agree/disagree

♦ Summer School N = 319; Control N = 120

Parents’ Reports of Attitudes

♦ Parents report MORE positive attitudes among younger children

♦ Parents report MORE positive attitudes among girls than boys

♦ No differences between Summer School and Control groups

Conclusions

♦ Children’s responses to Literacy Habits & SOAR did not differ by group

♦ Girls and younger children may have better habits and attitudes

♦ Summer school too brief?

♦ Measures too complicated?

Did Summer School Help Children Read Better?

Yes

No

Hard to tell

Evidence for No Advantage

♦ Literacy Habits

♦ SOAR (Attitudes)

Evidence for Yes Advantage

♦ GMRT for youngest children

♦ BRI: Summer School > Control

♦ Case studies of teachers who elicited high versus low gains from students on the BRI

Hard To Tell Because of Design Factors

Control Groups
  – Not random; teacher nominated
  – Why did they not attend summer school?
  – What did they do during summer?
  – Higher Gates pretests than Exp Ss, so not equivalent groups at pretest
  – Need to equate groups or match Ss (feasible? possible?)

Hard To Tell Because of Design Factors

Treatment Groups
  – Not random; recruited/enlisted Ss
  – Hawthorne effects, positive & negative
  – Diverse etiology of reading problems
  – Treatments/curricula vary by school & teacher
  – Large teacher differences and teacher x treatment interactions
  – Cost factors > curricula factors
  – Assessments do not match curricula & instruction at each site

Hard To Tell Because of Design Factors

Assessment Factors
  – Were tasks mismatched with curricula?
  – How do you reconcile changes on some measures and not others?

Conclusions About Designs

♦ Traditional Pre-Post x Exp-Control design is impractical

♦ Need to examine the fit between assessment tasks and curriculum in each program

♦ Need to develop alternative evaluation designs

Recommendations

We recommend the following criteria for constructing good summer school reading programs…

Recommendations

Choose good assessments:
  – Multiple assessments
  – Connected to instruction
  – Immediately useful
  – Same assessments from pre-test to post-test

Recommendations

Good assessments include:
♦ Oral reading fluency
♦ Comprehension
♦ Metacognitive strategy measures
♦ Curriculum-based assessments
♦ Specific skill assessments

Recommendations

Useful questions to consider:

♦ Was summer loss prevented?
♦ Did students improve on same tasks?
♦ Did students maintain gains?
♦ Did students meet explicit standards or benchmarks?

Recommendations

Use surveys to obtain the views of all stakeholders:
  – Students
  – Parents
  – Teachers

• Summer program teachers

• Receiving teachers

Recommendations

States/districts need to provide…

♦ Models of useful evaluations

♦ Adequate staff to collect and analyze data

♦ Adequate time to administer, analyze, and report assessment results

♦ Resources to build local capacity

Recommendations

Use the very best personnel.

The best results come from the best teachers:

• Active recruitment

• Experienced teachers

• Higher salaries

Recommendations

Give students the time and attention they need.
  – Low student-to-teacher ratios
  – Daily one-on-one instruction
  – Instruction targeted to individual students’ needs

Recommendations

Maximize coordination with the regular school-year programs and teachers.
  – Summer teacher → regular-year teacher
  – Regular-year teacher → summer teacher

• Diagnostic information

• Records of progress

Recommendations

Increase time on literacy instruction.

  – Require at least 60 hours of instructional time over 3+ weeks.
  – Require attendance for the entire program.
  – Require literacy activities at home.

Recommendations

Intensify efforts to involve parents.

  – Parent contracts
  – Parent nights
  – Parent journals
  – Shared homework/activities

Recommendations

Ensure that programs are funded well.
  – More instruction and assessment requires more support:
    • Secretaries
    • Evaluation consultants
    • Tutors/Coaches

Year 3: Building Capacity

Our goal was to:

♦ Create materials and resources to enable local schools to design, implement, document, and assess their own summer reading programs.

Models & Resources

Collaborations with 4 exemplary schools produced:

♦ Assessment procedures & records

♦ MLPP assistance

♦ Videos for staff development

♦ Website with downloadable forms, materials, and videos

Exemplary Sites

Four sites representing:

♦ urban and rural populations

♦ small and large school districts

Gaylord, Leslie, Milan, and Southfield

Exemplary Sites

Each site had:

♦ Assessment Coach

♦ Coach’s Notebook

♦ Support for Assessment

Coach’s Notebook

♦ Goals, history, and description of summer program

♦ Curriculum materials

♦ Instructional activities

♦ Assessment tasks and procedures

♦ Coach’s and principal’s roles

♦ Support staff

♦ Parent involvement

♦ Staff development

♦ Photos and sample artifacts

Effective Practices Included

Ø Small ratio of children to adults (about 5:1)

Ø Pre-service student teachers as interns

Ø Thematic instruction across the entire school

Ø Grade-level teaching teams

Ø Parents’ contracts, Parents’ nights at school, Parents’ journal writing

Ø Focused daily program with 60-minute instructional blocks

Goals 2000 Year 3 Website
http://isd.ingham.k12.mi.us/~rdggrant/

♦ Links to each model site’s web-based coach’s notebook

♦ Downloadable forms for assessments, lesson plans

♦ Staff development and parent involvement

♦ Assessment tools and instructions

Year 3: Conclusions

♦ Local capacity established for documenting features and assessing success of summer programs through:
  – Website: downloadable assessment forms, curriculum plans, exemplary models
  – CD and video: info, forms, videos
  – Links with MLPP assessment and training

Year 4: MLPP Validation Study

♦ Reliability of teachers’ assessments

♦ Validity of MLPP
  – Concurrent (TPRI); see the sketch below
  – Predictive (GMRT, MEAP)
  – Consequential (teachers’ views & practices)
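As a small illustration of the concurrent-validity check named above, the sketch below correlates one set of MLPP task scores with an external measure given at about the same time. The scores shown are invented for illustration and are not data from the study.

```python
# Illustrative concurrent-validity check: correlate teachers' MLPP task scores
# with an external measure (e.g., TPRI) collected at about the same time.
# The score lists below are hypothetical.
from scipy.stats import pearsonr

mlpp_scores = [12, 15, 9, 18, 14, 11, 16, 13]   # hypothetical MLPP task totals
tpri_scores = [34, 40, 28, 45, 38, 30, 42, 35]  # hypothetical TPRI scores, same children

r, p = pearsonr(mlpp_scores, tpri_scores)
print(f"concurrent validity r = {r:.2f} (p = {p:.3f})")
```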

Comments on MLPP

♦ Teacher-controlled assessments

♦ Used selectively & diagnostically

♦ Information is clear and immediate

♦ Depends on teachers’ knowledge

♦ Connected to instruction

♦ Tied to professional development

MLPP Tasks

♦ IRIs were not intended for accountability

♦ Fluent reading is not sufficient

♦ Comprehension is difficult to assess

♦ Rubrics for retellings and writing are not always reliable and do not show growth

♦ Easy-to-measure skills predominate

♦ Developmental sensitivity varies by task

Yes, there are problems

♦ Not enough time

♦ Need staff/support

♦ Need models, resources, materials

♦ Interpreting data can be difficult

♦ Impact on stakeholders must be assessed

Different Strokes

Not everyone:

♦ Wants to use the MLPP

♦ Likes Book Clubs

♦ Enjoys self-assessment

♦ Likes to read aloud

♦ Wants to write in journals

♦ Dislikes multiple-choice tests

♦ Wants to share their portfolio

So, assessment must:

♦ Be selective for each child

♦ Be diagnostic

♦ Link instruction with assessment results

♦ Be child-centered

♦ Be teacher-friendly

♦ Be parent-useful

Close the gap?

Maybe, but the important gap is not between test scores. It is the gap between a child’s own potential and actual achievement.

If we help all children to try their best, to succeed often, and to read and write every day, and if we challenge them to meet high standards, then every child and every teacher is successful.