Formative Information Using Student Growth Percentiles for the Quantification of English Language Learners’ Progress in Language Acquisition

Applied Measurement in Education, 27: 196–213, 2014
Copyright © Taylor & Francis Group, LLC
ISSN: 0895-7347 print/1532-4818 online
DOI: 10.1080/08957347.2014.905784


Husein Taherbhai, Daeryong Seo, and Kimberly O’Malley
Pearson

English language learners (ELLs) are the fastest growing subgroup in American schools. These students, by a provision in the reauthorization of the Elementary and Secondary Education Act, are to be supported in their quest for language proficiency through the creation of systems that more effectively measure ELLs’ progress across years. In the past, ELLs’ progress has been based on students’ prior scores measuring the same construct. To disentangle effectiveness from achievement, the reporting has generally targeted mean-group activity. In contrast, student growth percentiles (SGPs) provide a comparison of students’ growth with others who have the same achievement score history. By examining the construct measured by an English language proficiency test as manifested in student scores in Speaking, Listening, Reading, and Writing, this article outlines the use of SGPs in providing information on how much each student needs to grow, which will allow educators to more effectively apply differential formative instructional strategies.

In recent years, the study of English in K–12 settings has taken center stage in the United States because of the growing numbers of English language learners (ELLs) enrolled in schools across the nation (Meyer, Madden, & McGrath, 2004; U.S. Government Accountability Office [GAO], 2006; Van Roekel, 2008). However, many of these students’ academic performances fall well below those of their non-ELL peers, not because of the lack of academic achievement but because of the inadequacy of their English language skills (Abedi, 2008).

Under the No Child Left Behind Act (NCLB) of 2001, each state is required to assess language proficiency via the four recognized English Language Proficiency (ELP) modalities (i.e., Speaking, Listening, Reading, and Writing). As McCarthy (1999) points out, the use of modality results is necessary because it helps educators demarcate the underlying reasons of student differential performances, look for parallels between the processes in the learning of each modality, and use the information constructively in a classroom setting.

According to Abedi (2008), most states use a compensatory model for assessing students’ language acquisition. Unlike conjunctive models where students must achieve “targets” in each of the modalities of ELP to be considered proficient, students assessed by the compensatory model may not be proficient in an aspect of language acquisition that is important when ELLs are mainstreamed into a non-ELL classroom (e.g., Reading), and yet be considered proficient in total language acquisition.

Correspondence should be addressed to Daeryong Seo, Psychometric and Research Services, Pearson, 19500 Bulverde Road, San Antonio, TX 78259. E-mail: [email protected]; or to Husein Taherbhai, 1265 Earlford Drive, Pittsburgh, PA 15227. E-mail: [email protected]

Color versions of one or more of the figures in the article can be found online at www.tandfonline.com/hame.

In recent years, a paradigm shift in educational culture has emphasized the importance of assessment that provides information for formative purposes (Rushton, 2005), where relevant feedback can be used to minimize the existing gap between the actual and desired levels of performance (Nichols, Meyers, & Burling, 2009; Perie, Marion, & Gong, 2009).

While simple descriptive scores do provide a diagnostic aspect to the differential performance of student achievement, they do not provide meaningful information that can be integrated into effective classroom learning and teaching (Ferrara & DeMauro, 2006; Goodman & Hambleton, 2004; Roberts & Gierl, 2010). In its simplest form, the difference in two student scores on a vertical scale can give an indication of growth. However, as Lissitz and Doran (2009) point out, additional information is required to meaningfully interpret that growth. Even when well-designed scales exist, Betebenner (2009) contends that vertical scales are, at their best, quasi-interval because the same change in student score points can lead to different amounts of learning, depending on where the student is on the scale.

In the past, the primary effort in the analysis of growth has been to use prior student achievement to disentangle the effectiveness (e.g., of teachers, of schools) from the aggregate level of achievement (Ballou, Sanders, & Wright, 2004; Betebenner, 2007; Braun, 2005). The use of students’ prior scores (one or more, at different points in time) as indicators of students’ growth is based on empirical evidence in the establishment of the relationship between students’ pretests and outcome variables (e.g., Ho, 2011; Sanders, 2006). Generally speaking, two performance tasks (including one prior score) are necessary, although a larger number of prior scores can provide more accurate results (Sanders, 2006).

Using prior student achievement to quantify teacher and school effectiveness has been brought into the limelight, primarily through value-added models (VAM) (Sanders, Saxton, & Horn, 1997) of which Sanders’ (2006) Tennessee Value-Added Assessment System (TVAAS) and the Educational Value-Added Assessment System (EVAAS) have been very prominent. However, as Betebenner (2007) points out, models suitable for quantifying teacher and school effectiveness through students’ longitudinal data “are generally not well suited for making individual determinations concerning student progress” (p. 3).

Furthermore, it should be noted that most current multilevel approaches to measuring growth consider measurement occasions as nested within students. These approaches use fit lines for the vertical scale with distinct slopes and intercepts for each student. However, the slopes represent an “average” rate of increase for the students across testing occasions, and these rates show statistical artifacts that make lower achieving students increase at rates exceeding those of their higher achieving counterparts (Marsh & Hau, 2002).

While most formative methods provide the same type of intervention for ELLs with the same score in the test, they fail to recognize differential requirements of students within these similar scoring groups. Recognizing that students achieve differently and knowing what these differences are can instill realistic goals for ELLs. One way of providing fair differential growth expectancy for students is through normative information whereby student scores are compared not to an average or aggregate achievement of students who have different achievement trends, but to other students who have an identical pattern of achievement across tests that measure the same construct.


Betebenner’s (2007) student growth percentile (SGP) model, which uses quantile regression, can be used as a formative assessment tool to compare students with identical prior history. The SGP model allows students’ estimated entry scores at each of the predetermined percentiles to be calculated based on their prior scores. Thus, the percentile necessary for growth to attain a predetermined target score can be assessed for each student. The differential propensity of each student for achieving a predetermined target score allows the application of instructional resources in a way that does not expect too much or too little from each student. Creating realistic expectations based on the potential of the student would likely allow for the allocation of scarce instructional resources in the most productive manner and avoid setting some students up for failure.

PURPOSE OF THE ARTICLE

The purpose of this article is to utilize Betebenner’s (2007) SGP model to provide formative information for ELLs’ ongoing progress in English language proficiency (ELP). The method uses the quantile regression model based on students’ prior total ELP scores and the ELP modality scores to estimate each student’s growth percentile score for the total and the ELP modalities.

While the normative comparison of students with their academic peers (i.e., students who have identical prior scores) allows the quantification of students’ potential, there is also a need for examining how much the student needs to achieve to obtain the criterion-referenced target of proficiency. Therefore, aside from comparing SGPs in a normative manner with academic peers, each student’s percentile growth will be examined to evaluate the percentile growth required in attaining a predetermined criterion score, i.e., the target score.

The purpose of this article then can be stated as:

1. quantifying growth percentile score for each ELL based on
   a. total ELP scores conditioned on previous years’ ELP scores,
   b. modality ELP scores conditioned on previous years’ ELP modality scores; and

2. determining the percentile entry score for each ELL in achieving the target score (i.e., the percentile ranking a student needs to achieve proficiency in total and in each of the four modalities of the ELP examination).

THE STUDENT GROWTH PERCENTILE (SGP) MODEL

The conditional distribution of students using the SGP procedure provides the context within which the students’ current achievement is understood normatively. In other words, students are examined on their current performance with their academic peers by a classification of their achievement in terms of the quantiles of interest. The comparative aspect of the SGP model is established through an examination of the percentiles (as in, say, achievement percentiles), which basically is a normative process that compares students based on the percentile ranking they obtain.

The percentile of a student’s current score within his/her corresponding conditional distribution can be translated to a probability statement of a student achieving the current percentile score given his/her prior achievement scores. In this context, current scores are the scores for which percentile estimates are needed based on the trend obtained from a set of prior year scores measuring the same construct, or a set of scores that have substantive meaning in their use as predictors. Mathematically,

SGP ≡ Pr(CurrentAchievement|PastAchievement) × 100

As can be seen from the above equation, unconditional normative percentiles normatively quantify achievement, while conditional percentiles normatively quantify growth (Betebenner, 2007). In other words, when the conditional aspect in the equation is removed, the SGP would simply be the probability of obtaining percentile ranks from the current administration, which provides, through the students’ percentile rankings, a comparison among students who took the current test. On the other hand, when the percentile probability is conditioned on prior performance, it projects student status that reflects the current ranking vis-à-vis the scores of the students in the previous years (i.e., it reflects the trend [growth] that provides percentiles based on the performance in prior administrations).

Calculation of a student’s growth percentile is associated with the conditional density of the student’s score at time t using the student’s prior scores at times 1, 2, ..., t − 1 as the conditioning variable. By conditioning on a covariate x (i.e., the prior score), the rth conditional quantile function, $Q_y(r \mid x)$, is given by (Betebenner, 2009):

$$Q_y(r \mid x) = \arg\min_{\beta \in \mathbb{R}^p} \sum_{i=1}^{n} \rho_r\left(y_i - x_i'\beta\right)$$

As can be seen from the above equation, when r = 0.5, then the estimated conditional quantile line is the median regression line.
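The role of the loss term in the equation above can be illustrated with a small sketch. It assumes the standard check (pinball) function, rho_r(u) = u(r − 1[u < 0]), which the article does not spell out, and it uses hypothetical scores. With no predictors, minimizing the summed check loss over a single constant recovers the sample quantile, so r = 0.5 returns the median.

```python
import numpy as np

def check_loss(u, r):
    """Check (pinball) loss rho_r(u) = u * (r - 1[u < 0]); assumed standard form."""
    u = np.asarray(u, dtype=float)
    return u * (r - (u < 0))

def best_constant(y, r):
    """Return the observed value c that minimizes sum_i rho_r(y_i - c)."""
    y = np.asarray(y, dtype=float)
    losses = [check_loss(y - c, r).sum() for c in y]
    return y[int(np.argmin(losses))]

# Hypothetical scale scores, not taken from the study data.
scores = np.array([610.0, 640.0, 655.0, 670.0, 700.0])
median_fit = best_constant(scores, 0.5)   # the sample median
upper_fit = best_constant(scores, 0.9)    # a high quantile of the sample
```

Replacing the constant with a linear combination of prior scores gives the regression form of the same minimization; that step needs a linear-programming or similar solver and is not shown here.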

SGPs provide a number of attractive features from both theoretical and practical perspectives. In a practical sense, aside from the fact that SGPs are familiar (e.g., in the field of Pediatrics) and easily communicable to the layperson, the probabilistic approach allows the stakeholders to establish what is deemed adequate in terms of growth. However, as Betebenner (2007) points out, the classification of SGPs as being “adequate,” “good,” or “enough” is a standard setting procedure, which will differ from one assessment to the other.

From a theoretical perspective, it should be noted that aside from the model being robust to outliers (Betebenner, 2007), SGPs are uncorrelated with prior achievement, which is analogous to least squares-based residuals being uncorrelated with independent variables. Hence, there is no foundation for applying to the SGP model the common complaint about regression creating artifacts that generally provide a faster rate of increase for lower achieving students compared to higher achieving students (Marsh & Hau, 2002).

As with regression analysis, the quantile regression method models a relationship between a set of predictor variables and specific percentiles (or quantiles) of the response variable. It specifies a change in a specified quantile of the response variable produced by one unit change in the predictor variables. Thus, the relative effect on student achievement is reflected by the change in the size of the regression coefficients.


Through the use of the SGP method as provided by Betebenner (2009), the regression coefficients calculated for each specified percentile (see Table 3) can be used to predict score estimates at the predetermined percentiles with respect to each student’s prior history. Therefore, as shown in Table 4, the score estimates can serve as an entry point for students to perform in a particular percentile. When the focus is on the predetermined cut-score for achieving proficiency, this score can then be used to assess the percentile the student would have to grow in order to attain the target criterion-referenced cut.

METHOD

Participants

The sample of data analyzed for the study consists of 7,195 ELLs who had been administered an ELP assessment since 2007. The large-scale ELP assessment was originally developed to test five grade spans (i.e., K, 1–2, 3–5, 6–8, and 9–12) and annually administered to these ELLs since 2007. For our analyses, first graders who had been in the ELP program for five years since 2007 were selected. These students’ fifth year of administration in 2011 was considered to be their most current administration, while the other four years from 2007 to 2010 were the four prior tests as indicators of the current year score.

Instrument

The ELP test, comprising four modalities (i.e., Speaking, Listening, Reading, and Writing), is intended to measure English language progress of the students who have a primary home language other than English. Five ELP test scores of each student for five consecutive years (2007 to 2011) were used in this study.

The Scale and Proficiency Cut Scores

Even though most scores obtained in educational assessment fall under the equivariance of monotone scale transformation (Koenker, 2005), percentile rankings do not change in spite of their estimates being a function of varying scales (Betebenner, 2007). Therefore, a vertical scale is not a requirement for SGP analyses, even though in this study, the scores of the ELP test were on a vertical scale.

As per Wei and He (2006) and Betebenner (2007), B-splines were employed to accommodate heteroscedasticity and skewness of the conditional densities associated with the values of the independent variables (i.e., student scores from 2007 to 2010). The B-spline parameterization was used because, according to Harrell (2001), it provides excellent fit and seldom leads to “estimation problems” (p. 20).

Generally speaking, proficiency cut scores are set as criterion-based achievement for “passing” the underlying construct that is being measured. Achieving the target cuts could imply various things in different assessments where target cuts are set as a proficient score, above proficient score, and so on. In this article, we use the total proficiency cut as the score that indicates that the student does not need to attend ELL classes because he/she has enough language acquisition to function effectively in academic classes. For the modalities, each target cut indicates proficiency in the particular modality. In practice, the target scores are generally set through a standard setting procedure. The proficiency scale score cuts for the ELP examination used in this article are 674 for the Total, 686 for Reading, 678 for Writing, 657 for Listening, and 652 for Speaking.

Summary Statistics

Summary statistics that included the median and the median absolute deviation (MAD), which are robust measures of univariate location and scale, respectively, were checked for marked deviation.
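As a minimal sketch of these two robust statistics, the function below computes the median and the unscaled MAD (the median of absolute deviations from the median). Whether the study applied a consistency constant such as 1.4826 to the MAD is not stated, so the unscaled form is an assumption, and the scores are hypothetical.

```python
import numpy as np

def median_and_mad(x):
    """Return (median, unscaled MAD); the unscaled form is an assumption."""
    x = np.asarray(x, dtype=float)
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return med, mad

# Hypothetical scale scores with one extreme value.
scores = [560, 575, 581, 595, 600, 604, 900]
med, mad = median_and_mad(scores)
mean, sd = np.mean(scores), np.std(scores, ddof=1)
```

Comparing `mean` with `med` and `sd` with `mad` mirrors the check described above: a single extreme score pulls the mean and SD much more than the median and MAD.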

Model Fit Analyses

Histograms of the standardized residuals and two fitted density curves (normal and kernel) were used to examine regression fit to the data at the predetermined percentiles. Besides the visual inspection, a goodness-of-fit analysis was conducted by comparing the estimated conditional density of the B-spline parameterization with the theoretical density for each predetermined quantile. The expectation from this analysis was to have the percentage of ELL students under each of the predetermined percentiles reflect the scenario of a data set with perfect model fit (i.e., 10% of students under or at the 10th percentile, 20% under or at the 20th percentile, etc.).
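The coverage check described above can be sketched as follows. The function name, the array layout, and the data are hypothetical; the sketch only assumes that each student has an observed current score and a predicted score at each predetermined percentile.

```python
import numpy as np

def percent_at_or_below(actual, predicted):
    """Percent of students whose actual score is at or below each predicted quantile.

    actual:    shape (n,), observed current-year scores
    predicted: shape (n, k), predicted scores at k percentiles per student
    """
    actual = np.asarray(actual, dtype=float)[:, None]
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 * (actual <= predicted).mean(axis=0)

# Hypothetical example: four students, predictions at two percentiles.
actual = [600, 620, 640, 660]
predicted = [[590, 610],
             [625, 640],
             [630, 650],
             [655, 670]]
coverage = percent_at_or_below(actual, predicted)
```

Under good model fit, the value in each column would be close to the nominal percentile; the same computation can be repeated within each decile group of prior scores.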

Student Growth Percentile (SGP) Analyses

Two aspects of growth percentiles were needed to evaluate ELLs’ progress:

1. Norm-referenced inferences: The examination of each student’s SGP for the total score and the modality scores of the ELP examination to see how well they grew vis-à-vis their academic peers (i.e., those with identical prior scores).

2. Criterion-referenced inferences: The evaluation of the SGP needed for non-proficient ELLs to achieve proficiency and for proficient students to maintain their SGP ranking for the total score and across each individual modality ELP score.

In the first instance, the number of students who were proficient based on their current total ELP scores (i.e., their Year 5 scores) was identified. Then narrow bands of quantiles were pre-specified for total and modality ELP score estimations of these students, that is, from the 10th percentile all the way to the 90th percentile in increments of 10 percentile points (i.e., 0.10, 0.20, 0.30, etc.). The number of quantiles is generally based on the requirements of the researcher.

In this article, based on the total n-count of 7,195 students’ scale scores, the authors believed that groupings from the 10th to the 90th percentile would provide enough discrimination in analyzing each student’s unique percentile trajectory for formative purposes. The predicted values of the scores at the specified percentiles for each of these students were calculated using Proc QuantReg in SAS 9.2.

Students’ Year 5 scores were then examined vis-à-vis the predicted total ELP and modality scores needed for entry into each percentile. The percentiles that harbored each fifth year modality score and the total ELP score became the students’ growth percentile ranks for the modalities and the total score, respectively. The established proficiency cuts for the total ELP examination and each of the four modalities were then examined for their location (i.e., the percentile in which they resided), to determine the SGP required for students to maintain proficiency or to achieve proficiency with the assumption of the same growth pattern over years (i.e., holding the prior performance constant).
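The two lookups described above can be sketched in a few lines. The function names are hypothetical, and the banding convention (a score belongs to the highest percentile whose entry score it reaches) is an assumption inferred from the worked examples; the entry scores are the Total rows for Student IDs #5 and #7 in Table 4, and 674 is the Total proficiency cut.

```python
PERCENTILES = [10, 20, 30, 40, 50, 60, 70, 80, 90]

def growth_percentile(score, entry_scores):
    """Highest percentile whose entry score the student's score reaches.

    Returns None when the score is below the 10th percentile entry score.
    """
    reached = [p for p, entry in zip(PERCENTILES, entry_scores) if score >= entry]
    return max(reached) if reached else None

def percentile_needed(cut, entry_scores):
    """Lowest percentile whose entry score meets or exceeds the cut score.

    Returns None when even the 90th percentile entry score is below the cut.
    """
    for p, entry in zip(PERCENTILES, entry_scores):
        if entry >= cut:
            return p
    return None

TOTAL_CUT = 674
student5 = {"year5": 670, "entry": [631, 640, 646, 652, 657, 662, 667, 674, 684]}
student7 = {"year5": 650, "entry": [641, 650, 656, 661, 666, 671, 677, 683, 694]}

sgp5 = growth_percentile(student5["year5"], student5["entry"])
need5 = percentile_needed(TOTAL_CUT, student5["entry"])
sgp7 = growth_percentile(student7["year5"], student7["entry"])
need7 = percentile_needed(TOTAL_CUT, student7["entry"])
```

With these inputs the sketch places Student ID #5 at the 70th percentile with the 80th needed for proficiency, and Student ID #7 at the 20th percentile with the 70th needed, which agrees with the discussion of these students in the Results.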

This kind of analysis could be a very useful tool for teachers in the context of formative assessment for it would give them an indication of what growth is expected for each student to maintain or attain proficiency, which in turn would help them formulate their teaching strategies in terms of how much effort would be required for each of these students.

RESULTS

Summary Statistics

Summary statistics that included the median and the MAD were produced (see Table 1). In general, the mean of students’ total and modality ELP scores increases over time. The differences between the mean and median as well as the differences between the standard deviation (SD) and the MAD indicate the presence of outliers. For Year 1, the mean and the median as well as the SD and the MAD are a bit different; for Year 4 and Year 5, only the Speaking modality shows such differences; otherwise, for all other years and modalities, the differences in the mean and median and the SD and MAD are very small.

Model Fit Analyses

Histograms overlaid with two fitted density curves (i.e., the Normal and the Kernel Density Curves) of the standardized residuals were examined at each of the predetermined quantiles. Figure 1 displays one such histogram at the 50th percentile. As seen in the figure, the model seems to fit the data well for the total scores. The model also fits the data well with varying degrees of “absolute” fit for the modalities.

To better discriminate between low and high achieving students’ model fit to the data, Year 4 students’ B-spline scores (2010) were grouped by deciles based on low-to-high performing students. Percentages of the student growth using the current year scores (i.e., the fifth year, 2011 scores) at the predetermined 10th to the 90th percentiles were calculated for each decile (see Table 2).

Overall, the percentages of students under each percentile were not too far removed from expectations, with only a few more than three percentage points higher than expectation.

Diagnostics

As shown in Table 3, the Year 4 score had the highest regression coefficient, relative to the Year 1, Year 2, and Year 3 scores, which indicates that this predictor was the most influential of the four predictors in each of the growth percentiles across the total scores and the modalities. Similar results were found for the modality-based analyses. Here, too, the Year 4 score had the highest regression coefficient, relative to Year 1, Year 2, and Year 3 scores.

It should also be noted that the Year 1 ELP score has no influence in the prediction of the Year 5 ELP score (i.e., its coefficient is close to zero). This particular phenomenon is understandable

TABLE 1
Summary Statistics of Five Years’ ELP Assessment: Total and Modality Score

                Total                  Listening              Speaking               Reading                Writing
Variable        Mean  Median  SD  MAD  Mean  Median  SD  MAD  Mean  Median  SD  MAD  Mean  Median  SD  MAD  Mean  Median  SD  MAD
Year 1 Score    581   595     37  15   591   583     49  38   594   584     62  59   569   560     42  21   577   566     40  23
Year 2 Score    619   618     24  23   622   645     41  32   634   643     50  53   621   614     45  53   619   622     29  32
Year 3 Score    647   649     26  26   640   638     36  36   681   675     53  52   640   639     39  41   645   651     35  35
Year 4 Score    665   670     31  28   663   665     38  43   685   676     53  79   666   668     45  44   671   681     49  44
Year 5 Score    690   692     33  33   686   675     41  28   705   697     46  74   692   693     46  44   701   705     53  44


FIGURE 1 Histogram for standardized residuals at the 50th percentile: Total score.

TABLE 2
Goodness-of-Fit Analysis: Estimated Percent of Students at or Below Each Percentile by Deciles Based on the Students’ Total Scores

                  Percentile
Group (Decile)    10th  20th  30th  40th  50th  60th  70th  80th  90th
1                 15    24    31    42    50    58    68    76    87
2                 13    21    31    42    50    62    66    77    88
3                 13    21    34    39    46    59    72    78    89
4                 11    21    29    41    49    56    66    81    87
5                 10    20    32    38    54    63    71    80    93
6                 11    23    31    39    51    58    67    77    91
7                 12    22    32    41    50    60    70    78    93
8                 12    23    32    41    50    60    69    81    90
9                 11    21    29    38    48    61    75    75    93
10                10    23    33    44    58    58    70    83    92
Expected Values   10    20    30    40    50    60    70    80    90

since this is the first year of the five-year period during which the students take the examination. At this stage, students are new to ELP classrooms and the effects of learning may not have been assimilated. Furthermore, as can be seen from the table, the influence of the prior tests decreases as the time between the test administrations increases (i.e., the coefficients are much lower for the


TABLE 3
Regression Coefficient Estimates of SGP Model: Total Score

Quantile Regression Coefficients

                  Intercept        Year 1 Score     Year 2 Score     Year 3 Score     Year 4 Score
Percentiles       Estimate  S.E.   Coeff.   S.E.    Coeff.   S.E.    Coeff.   S.E.    Coeff.   S.E.
10th Percentile   30.85     12.37  −0.02    0.01    0.14     0.02    0.31     0.02    0.55     0.02
20th Percentile   59.00     12.58  −0.02    0.01    0.12     0.02    0.30     0.02    0.54     0.01
30th Percentile   75.61     11.63  −0.03    0.01    0.12     0.02    0.29     0.02    0.54     0.01
40th Percentile   87.05     11.54  −0.04    0.01    0.12     0.02    0.29     0.02    0.54     0.01
50th Percentile   94.16     11.66  −0.05    0.01    0.12     0.02    0.30     0.02    0.53     0.01
60th Percentile   108.65    11.27  −0.06    0.01    0.12     0.02    0.30     0.02    0.53     0.01
70th Percentile   103.99    11.62  −0.05    0.01    0.12     0.02    0.31     0.02    0.53     0.02
80th Percentile   121.91    13.28  −0.07    0.01    0.12     0.02    0.30     0.02    0.54     0.02
90th Percentile   112.27    20.16  −0.07    0.02    0.13     0.03    0.32     0.03    0.54     0.03

first few years compared to the most recent year, Year 4). This is intuitively understandable as students’ current behavior can be best estimated by their most immediate prior behavior. However, it is important to note that even though coefficients for the first few years are not large, they provide a monotonically increasing trend for estimating growth in the current year (i.e., the students’ Year 5 ELP scores).

The coefficients from Table 3 were used to estimate each student’s growth percentile score at the selected percentiles. Examples of some students’ total scores, necessary to attain membership in a percentile, are provided in Table 4. The table also provides the same information for the students based on their modality performance.
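As a rough sketch of this step, the snippet below applies the Table 3 coefficients as a plain linear combination of the four prior scores. This is a simplification and an assumption: the study used a B-spline parameterization, and the published coefficients are rounded to two decimals, so the results are only approximate. The prior scores used here are the Year 1 to Year 4 Total means from Table 1, standing in for a hypothetical student.

```python
# Table 3 rows: percentile -> (intercept, Year 1, Year 2, Year 3, Year 4 coefficients)
TABLE3 = {
    10: (30.85, -0.02, 0.14, 0.31, 0.55),
    20: (59.00, -0.02, 0.12, 0.30, 0.54),
    30: (75.61, -0.03, 0.12, 0.29, 0.54),
    40: (87.05, -0.04, 0.12, 0.29, 0.54),
    50: (94.16, -0.05, 0.12, 0.30, 0.53),
    60: (108.65, -0.06, 0.12, 0.30, 0.53),
    70: (103.99, -0.05, 0.12, 0.31, 0.53),
    80: (121.91, -0.07, 0.12, 0.30, 0.54),
    90: (112.27, -0.07, 0.13, 0.32, 0.54),
}

def entry_scores(prior_scores):
    """Predicted Year 5 entry score at each percentile (linear approximation)."""
    out = {}
    for pct, (intercept, *slopes) in TABLE3.items():
        out[pct] = intercept + sum(b * x for b, x in zip(slopes, prior_scores))
    return out

# Hypothetical student whose prior Total scores equal the Table 1 means.
priors = (581, 619, 647, 665)
predicted = entry_scores(priors)
```

The resulting dictionary plays the role of one row of Table 4: comparing a student’s observed Year 5 score and the proficiency cut against these entry scores gives the SGP and the percentile needed for proficiency.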

Growth Needed to Achieve Proficiency With Respect to Students’ Total Scores

As can be seen from Table 4, Student ID #1’s Year 5 score of 697 was above the established proficiency cut = 674. This student needs to grow in the same 50th percentile to maintain his/her proficiency status, holding his/her prior performance trend constant.

Similarly, Student ID #2’s growth is at the 50th percentile and this student has also reached proficiency.

By the same token, Student ID #3 has achieved proficiency but fails to meet the adequate growth category since his/her growth percentile score is at the 20th percentile (just below the 30th percentile).

Student ID #4 has missed proficiency by one scale point and his/her SGP is just below the 50th percentile (i.e., at the 40th percentile).

Student ID #5 has a very high growth percentile (70th percentile) but his/her Year 5 score of 670 does not cross over the threshold to proficiency, which requires her/his already high growth level to increase slightly to the 80th percentile.

TABLE 4
Eight Examples of Student’s Total and Modality Predicted Scores Across the Specified Percentiles

Type of             Year 5   Percentile
ELP Score  ID No.   Score    10th  20th  30th  40th  50th  60th  70th  80th  90th
Total      1        697*     671   679   685   690   695   700   706   712   724
           2        697*     669   678   683   689   694   699   704   711   723
           3        688*     676   684   689   695   699   704   710   717   728
           4        673      649   657   663   668   674   678   684   691   702
           5        670      631   640   646   652   657   662   667   674   684
           6        670      689   696   702   707   712   717   723   730   741
           7        650      641   650   656   661   666   671   677   683   694
           8        586      589   599   606   611   616   621   626   633   642
Listening  1        723*     641   652   660   668   675   683   692   704   724
           2        723*     645   658   667   674   683   691   699   712   736
           3        675*     681   693   702   710   721   730   740   755   789
           4        694*     624   636   644   651   658   665   673   684   699
           5        723*     621   632   640   647   653   660   669   679   693
           6        660*     651   662   670   678   686   694   702   714   738
           7        648      624   635   644   652   658   665   674   685   699
           8        648      614   628   636   643   650   657   664   673   690
Speaking   1        671*     664   687   703   732   742   747   747   747   747
           2        655*     619   636   649   660   678   704   747   747   747
           3        747*     653   675   689   713   734   747   747   747   747
           4        747*     654   674   689   707   722   731   747   747   747
           5        747*     617   642   652   678   712   747   747   747   747
           6        619      659   682   697   728   739   747   747   747   747
           7        671      652   670   687   708   712   721   747   747   747
           8        671      629   647   658   673   694   714   747   747   747
Reading    1        682      660   672   681   688   695   703   712   724   746
           2        750*     670   681   690   697   705   712   721   733   752
           3        706*     658   671   680   689   696   703   712   723   741
           4        663      640   654   663   672   679   687   697   709   729
           5        646      635   649   658   667   675   684   693   705   718
           6        723*     673   684   692   700   707   714   723   734   756
           7        619      611   628   638   648   656   664   674   686   700
           8        437      497   520   531   543   552   565   578   594   606
Writing    1        730*     668   678   687   695   702   712   723   738   764
           2        675      693   704   713   722   730   742   755   771   799
           3        664      655   667   677   685   692   702   712   726   745
           4        654      631   643   652   659   666   673   683   695   716
           5        654      627   637   646   654   660   667   675   687   709
           6        688*     710   722   731   740   749   762   776   793   819
           7        675      635   648   656   664   671   678   689   701   722
           8        414      517   526   533   537   540   538   542   545   568

* indicates individual who passes performance cut score.


Student ID #6 has the same Year 5 score as Student ID #5. However, in comparison to his/her academic peers, Student ID #6’s progress is rather low, placing him/her below the 10th percentile. In terms of attaining proficiency, however, this student can increase his/her score slightly to achieve proficiency at the 10th percentile growth level.

Student ID #7 is not only progressing at a low 20th percentile but he/she needs to grow in the 70th percentile to meet proficiency.

Student ID #8 is very low in achieving proficiency, and his/her growth percentile is below the 10th percentile. As with Student ID #7, this student would have to increase his/her growth percentile, in this case to beyond the 90th percentile in comparison with his/her academic peers, to achieve proficiency.

Examination of Growth Needed to Achieve Proficiency With Respect to Both Students’ Total and Modality Scores

Analyzing modality scores in the same manner as we did the total scores in the previous section can help teachers allocate resources in an appropriate manner as befitting those who are at varying need for intervention with respect to the particular modalities.

In examining the modality scores, there are several percentiles at the higher end for the Speaking modality that have the same entry score (i.e., 747; see Table 4). This can happen with the ELL population because proficiency in Speaking is most easily achievable among the different ELP modalities (Menken & Kleyn, 2009). In the data set used in this article, there were a large number of students who had achieved high scores in Speaking (i.e., approximately 48% of the students had scores at or above 747). For the eight students shown as examples in Table 4, all had reached the 747 mark at the 60th or the 70th percentile (see Table 4), which provided the same entry score for the remaining percentiles at the higher end.

The crux of the analyses, however, lies in teasing out the effects of compensatory scoring, under which a student who is proficient or nearly proficient with respect to the total score may not perform well in mainstream classrooms because of his/her lack of proficiency in one or more of the modalities.

Deciphering who is proficient can be easily accomplished by examining the scale scores. However, this type of information would not include how much effort is needed to achieve proficiency. Additional scrutiny of the entry score that is needed for proficiency at each modality percentile can inform us of the percentile growth needed in the modalities not only to achieve overall English language proficiency but also to perform adequately in academic classes.

For example, as shown in Table 5, while Student ID #2 has reached total ELP proficiency (Year 5 score = 697), he/she is lacking in Writing, which is one of the key components of academic success (Robertson, 2009). Granted, this student has not missed proficiency in Writing by much (675, which is very close to the proficient cut score of 678). Nevertheless, the fact that this student grew only to a level below the 10th percentile in comparison to his/her academic peers in Writing is troubling because it shows that the student has not improved his/her writing much in recent years. Overall, the message from this model is that, with a little attention to Writing by his/her teacher, this student could obtain the growth in Writing needed (in this case, to the 10th percentile) to achieve proficiency in that modality.

On the other hand, Student ID #8 has a very poor Total score (i.e., 586), with an SGP set at below the 10th percentile and the proficiency cut that lies beyond the 90th percentile. While there may be indications that this student needs help across all modalities, he/she is a classic case of the type of student who is proficient in Speaking (the student is proficient in Speaking with SGP in the 30th growth percentile; see Table 5) but who does poorly in academic achievement. See Menken and Kleyn (2009) for a discussion of fluency in speaking being construed as an indication of language proficiency.

TABLE 5
Examples of Teacher Level Information Depicting Individual Student’s Growth and Their Predicted Growth Percentiles to Achieve or Maintain Proficiency: Total and Modality Scale Scores (SS)

Std ID  Score      Prof. at SS =  Year 5  SGP                    Estimate of Growth for Prof.
2       Total      674            697     50th Percentile        Achieved Prof
2       Reading    686            750     80th Percentile        Achieved Prof
2       Writing    678            675     Below 10th Percentile  10th Percentile
2       Speaking   652            655     30th Percentile        Achieved Prof
2       Listening  657            723     Above 90th Percentile  Achieved Prof
4       Total      674            673     40th Percentile        50th Percentile
4       Reading    686            663     30th Percentile        70th Percentile
4       Writing    678            654     30th Percentile        70th Percentile
4       Speaking   652            747     70th Percentile        Achieved Prof
4       Listening  657            694     80th Percentile        Achieved Prof
8       Total      674            586     Below 10th Percentile  Beyond 90th Percentile
8       Reading    686            437     Below 10th Percentile  Beyond 90th Percentile
8       Writing    678            414     Below 10th Percentile  Beyond 90th Percentile
8       Speaking   652            671     40th Percentile        Achieved Prof
8       Listening  657            648     50th Percentile        60th Percentile

Comments
Std ID 2: Total Prof but need some help in Writing where he/she has not achieved Prof. Also need to do better in Writing and Speaking compared to his academic peers.
Std ID 4: Just one score point less to achieve Total score proficiency but needs extensive remedial in Reading and Writing.
Std ID 8: The student has a long climb in Total growth to achieve proficiency. Needs extensive remedial in Reading and Writing. Also needs some help in Listening to overcome the 10 points percentile gap.

Note. Prof = Proficiency; Std = Student.


By the same token, Student ID #4 has missed the total ELP proficiency score by just one scale point (673 instead of 674). However, an examination of the modality scores shows the student is significantly lacking in both Reading and Writing skills (i.e., he/she has achieved 30th percentile growth for both Reading and Writing) even though the student has 80th percentile growth for Speaking and Listening. To obtain a label of “true” proficiency that will serve the student well in English-laden academic subjects, the teacher needs to concentrate her/his efforts on the student’s Reading and Writing in the ELP classroom. In other words, this student has to grow in the 60th percentile for Reading and in the 70th percentile for Writing to perform well in academic classes.

Thus, each student’s modality performance should be scrutinized in the same manner as was shown in our discussion on Total scores: even students who perform at proficiency in overall ELP scores must also be proficient in their modality scores to be able to perform well in academic classrooms laden with the language component.
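As a rough illustration of this check, the sketch below flags students whose Total score meets the cut while at least one modality falls short. The scale scores and cut scores are those read from Table 5; the function names `modalities_below_cut` and `masked_by_total` are hypothetical and introduced only for this example.

```python
# Proficiency cut scores and Year 5 scale scores as read from Table 5.
CUTS = {"Total": 674, "Reading": 686, "Writing": 678,
        "Speaking": 652, "Listening": 657}

SCORES = {
    2: {"Total": 697, "Reading": 750, "Writing": 675, "Speaking": 655, "Listening": 723},
    4: {"Total": 673, "Reading": 663, "Writing": 654, "Speaking": 747, "Listening": 694},
    8: {"Total": 586, "Reading": 437, "Writing": 414, "Speaking": 671, "Listening": 648},
}


def modalities_below_cut(scores, cuts=CUTS):
    """List the modalities in which the student has not reached the cut."""
    return [m for m in cuts if m != "Total" and scores[m] < cuts[m]]


def masked_by_total(scores, cuts=CUTS):
    """True when a proficient Total score hides a non-proficient modality."""
    return scores["Total"] >= cuts["Total"] and bool(modalities_below_cut(scores, cuts))


for student_id, scores in SCORES.items():
    print(student_id, modalities_below_cut(scores), masked_by_total(scores))
# 2 ['Writing'] True
# 4 ['Reading', 'Writing'] False
# 8 ['Reading', 'Writing', 'Listening'] False
```

Only Student ID #2 is flagged as masked, because Students #4 and #8 fall short on the Total score as well. The check says nothing about how much growth is needed, which is the information the SGP estimates add.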

COMMENTS AND DISCUSSION

As outlined by the United States Department of Education’s Blueprint for Reform (2010), more effective measures of students’ growth are expected for all students. This expectation is all the more important for ELL students because fluency in the language could also affect their performance in academic content areas, particularly where academic learning is associated with the knowledge of the English language.

While ELLs’ overall progress in ELP can be assessed simply by observing the relative cut employed for passing the ELP examination, an evaluation of ELLs’ performance on the four modalities becomes paramount when assessing areas of weakness and strength within the ELP construct for each student. Because of the use of compensatory scoring by many ELP assessments, students’ proficiency in certain modalities could be so low that it could result in poor academic performance (Menken & Kleyn, 2009) even though the student could have achieved a total score of proficiency on an ELP examination.

As stated earlier, the modality in which a student does not perform adequately can easily be obtained from the actual scores and the proficiency cut-offs established for each modality. However, unlike the SGP model, the simple cut-off calculation in each of the four modalities does not give us an indication of how much effort is required to help the student achieve proficiency.

As Betebenner (2007) points out, SGPs focus on normatively quantifying changes in achievement instead of focusing on the magnitude of learning. In analyzing the information provided in Tables 4 and 5, it becomes evident from the use of quantile regression methods that achieving a given level of proficiency can require differential effort from two students with the same current score. When knowledge of comparative growth is lacking, instructional efforts are applied without a fair reckoning of students’ growth potential. As such, students are often instructed based on an average performance criterion that lumps them together in an undifferentiated manner, instead of receiving remedial help that varies for each student based on different levels of achievement across years.
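A toy example may help show why the same current score can represent very different growth. The sketch below is a deliberately crude stand-in for the quantile regression approach used in the article: it conditions on a single prior score, treats only students with exactly the same prior score as academic peers, and uses invented data. The function name `empirical_growth_percentile` and the cohort values are assumptions made for illustration only.

```python
from bisect import bisect_left


def empirical_growth_percentile(prior, current, cohort):
    """Percent of academic peers (same prior score) scoring below `current`.

    `cohort` is a list of (prior_score, current_score) pairs.
    """
    peers = sorted(c for p, c in cohort if p == prior)
    if not peers:
        raise ValueError("no academic peers with this prior score")
    below = bisect_left(peers, current)  # number of peers strictly below
    return 100.0 * below / len(peers)


# Invented cohort: five peers who started at 600 and five who started at 640.
cohort = [(600, 610), (600, 620), (600, 630), (600, 640), (600, 660),
          (640, 645), (640, 655), (640, 660), (640, 670), (640, 680)]

# Two students with the same current score (650) but different histories.
print(empirical_growth_percentile(600, 650, cohort))  # 80.0
print(empirical_growth_percentile(640, 650, cohort))  # 20.0
```

In this invented cohort, a current score of 650 reflects strong growth for the student who started at 600 and weak growth for the student who started at 640, which is the kind of distinction an average performance criterion would miss.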

Thus, when the SGP model is applied as a formative assessment tool, students can benefit from individualized, tailor-made remedial activities. Furthermore, providing teachers with their students’ differential propensity for achieving proficiency may assist in mollifying those teachers who claim that it is unfair to be held accountable based on a single measure of progress for all students.

As in many educational assessments (e.g., students’ ability estimations), the standard errors for the regression coefficients at the extremes of the percentile groupings seem somewhat larger than those in the middle of the predetermined percentile scale. However, this relative difference depends on factors such as low student counts at the scale extremes and measurement error. The importance of accuracy and precision, therefore, depends on the type of inferences one wishes to draw from the growth percentiles, with the understanding that precise comparison cannot be sustained (Betebenner, 2007).

Much like other growth models, students with missing prior scores would have to be eliminated from the SGP analyses. Other SGP models could be created for students with a different consecutive number of prior scores, provided enough n-counts are available for such analyses (see Grady, Lewis, & Gao, 2010, for minimum sample size requirements). It should be kept in mind that while more than one prior score is desirable for more accurate SGP estimates, the net effect on formative information is that it serves as a guideline for teachers’ remedial actions. Any help in that direction is always useful as a starting point for intervention techniques. While estimation of missing scores to account for dwindling n-counts is a possibility (Sanders, 2006), imputing missing values carries the risk of larger sampling errors.

Projecting how much students need to grow is based on holding the growth pattern constant. But like any dynamic situation, students’ growth trajectories are likely to change, and therefore, it behooves educators to monitor student progress across years as is done by the Colorado Department of Education (see http://ww2.ed.gov/adminis/lead/account/growthmodel/co/coattachment1tutorial.pdf). Appropriate changes can then be made in remedial efforts for each student by an examination of his/her yearly growth toward proficiency.

Finally, using the SGP method for comparison can seem a daunting task for some educators, particularly for those teachers whose ELLs’ growth requirements for achieving proficiency may seem unattainable. Nevertheless, this concern does not undercut the useful information the method provides (viz., the desirable property of knowing how much growth is required). On the contrary, the SGP method for comparison can facilitate the better allocation of finite resources for those students who are having problems, allowing educators to revisit these problems with creative teaching methods, one-on-one tutorials, motivational strategies, and parental involvement that provides the right type of intervention on a differential basis for each student (Seo & Taherbhai, 2009).

ACKNOWLEDGMENTS

The authors thank the editor, Dr. Kurt F. Geisinger, and the reviewers for their valuable advice and suggestions, which greatly improved the article. Thanks are also due to Dr. Mark G. Robeck for his constructive editorial suggestions. All errors remain the responsibility of the authors.



REFERENCES

Abedi, J. (2008). Classification system for English language learners: Issues and recommendations. Educational Measurement: Issues and Practice, 27, 17–31.

Ballou, D., Sanders, W., & Wright, P. (2004). Controlling for student background in value-added assessment of teachers. Journal of Educational and Behavioral Statistics, 29, 37–65.

Betebenner, D. W. (2007). Estimation of student growth percentiles for the Colorado student assessment program. Retrieved from http://www.cde.state.co.us/cdedocs/Research/PDF/technicalsgppaper_betebenner.pdf

Betebenner, D. W. (2009). Norm- and criterion-referenced student growth. Educational Measurement: Issues and Practice, 28, 42–51.

Braun, H. I. (2005). Using student progress to evaluate teachers: A primer on value-added models. Retrieved from http://www.ets.org/Media/Research/pdf/PICVAM.pdf

Ferrara, S., & DeMauro, G. E. (2006). Standardized assessment of individual achievement in K–12. In R. L. Brennan (Ed.), Educational measurement (4th ed., pp. 579–621). Westport, CT: American Council on Education/Praeger.

Goodman, D. P., & Hambleton, R. K. (2004). Student test score reports and interpretive guides: Review of current practices and suggestions for future research. Applied Measurement in Education, 17, 145–220.

Grady, M., Lewis, D., & Gao, F. (2010). The effect of sample size on student growth percentiles. Retrieved from http://www.ctb.com/ctb.com/control/getAssetListByFilterTypeViewAction?param=393&title=topic&p=library

Harrell, F. E. (2001). Regression modeling strategies: With applications to linear models, logistic regression, and survival analysis. New York, NY: Springer-Verlag.

Ho, A. (2011). Growth and consequences: Will NCLB give way to growth models? Retrieved from http://www.gse.harvard.edu/blog/news_features_releases/2011/01/growth-and-consequences-will-nclb-give-way-to-growth-models.html

Koenker, R. (2005). Quantile regression: Econometrics society monographs. New York, NY: Cambridge University Press.

Lissitz, B., & Doran, H. (2009). Modeling growth for accountability and program evaluation: An introduction for Wisconsin educators. Retrieved from http://marces.org/completed/Lissitz%20(2009)%20Modeling%20Growth%20for%20Accountability.pdf

Marsh, H. W., & Hau, K. T. (2002). Multilevel modeling of longitudinal growth and change: Substantive effects or regression toward the mean artifacts? Multivariate Behavioral Research, 37, 245–282.

McCarthy, C. P. (1999). Reading theory as a microcosm of the four skills. Retrieved from http://iteslj.org/Articles/McCarthy-Reading.html

Menken, K., & Kleyn, T. (2009). The difficult road for long-term English learners. Retrieved from http://www.ascd.org/publications/educational_leadership/apr09/vol66/num07/The_Difficult_Road_for_Long-Term_English_Learners.aspx

Meyer, D., Madden, D., & McGrath, D. (2004). English language learner students in U.S. public schools: 1994 and 2000 (Issue Brief No. 2004-035). Jessup, MD: National Center for Education Statistics.

Nichols, P. D., Meyers, J. L., & Burling, K. S. (2009). A framework for evaluating and planning assessments intended to improve student achievement. Educational Measurement: Issues and Practice, 28, 14–23.

Perie, M., Marion, S., & Gong, B. (2009). Moving toward a comprehensive assessment system: A framework for considering interim assessments. Educational Measurement: Issues and Practice, 28, 5–13.

Roberts, M. R., & Gierl, M. J. (2010). Developing score reports for cognitive diagnostic assessments. Educational Measurement: Issues and Practice, 29, 25–38.

Robertson, K. (2009). Math instruction for English language learners. Retrieved from http://www.readingrockets.org/article/30570/

Rushton, A. (2005). Formative assessment: A key to deep learning? Medical Teacher, 27, 509–513.

Sanders, W. L. (2006). Comparisons among various educational assessment value-added models. Retrieved from http://www.sas.com/govedu/edu/services/vaconferencepaper.pdf

Sanders, W. L., Saxton, A. M., & Horn, S. P. (1997). The Tennessee value-added assessment system: A quantitative outcomes-based approach to educational assessment. In J. Millman (Ed.), Grading teachers, grading schools: Is student achievement a valid measure? (pp. 137–162). Thousand Oaks, CA: Corwin Press.


Seo, D., & Taherbhai, H. (2009). Motivational beliefs and cognitive processes in mathematics achievement, analyzed in the context of cultural differences: A Korean elementary school example. Asia Pacific Education Review, 10, 193–203.

U.S. Department of Education (2010). A blueprint for reform: The reauthorization of the Elementary and Secondary Education Act. Retrieved from http://www2.ed.gov/policy/elsec/leg/blueprint/blueprint.pdf

U.S. Government Accountability Office (2006). No Child Left Behind Act: Assistance from education could help states better measure progress of students with limited English proficiency. Retrieved from http://www.gao.gov/highlights/d06815high.pdf

Van Roekel, D. (2008). English language learners face unique challenges. Retrieved from http://www.nea.org/assets/docs/mf_PB05_ELL.pdf

Wei, Y., & He, X. (2006). Conditional growth charts. Annals of Statistics, 34, 2069–2097.


