
Ch. 13 & 14, Cizek & Bunch

Brikena Haxhiraj

Ch. 13: Scheduling Standard Setting Activities

Chapter Goals:

The authors suggest ways and methods for scheduling standard setting in two types of assessment programs, drawing primarily on their experiences with large-scale credentialing programs and educational assessments, and provide examples of each standard setting activity.

1. Scheduling standard setting for educational assessments

2. Scheduling standard setting for credentialing programs

Scheduling standard setting for educational assessments

Table 13-1 (pp. 219-221) provides an overview of the main activities to be completed, along with a timetable for their completion.

A generic version of the table can also be found at www.sagepub.com/cizek/schedule

This table shows the planning for standard setting beginning two years before the actual standard setting session.

1. Overall Plan

Establish performance level labels (PLLs) and performance level descriptions (PLDs)

Drafting a standard setting plan before item writing begins is one way to make sure the test supports the standard setting activity that is eventually carried out.

Table 13-1 shows a field test exactly one year prior to the first operational administration of the test. During the first year, a regular testing window would be reserved for field testing.

The plan should specify: (a) a method, (b) an agenda, (c) training procedures, and (d) analysis procedures.

Technical advisory committee (TAC).

Stakeholder review

2. Participants

Identify and recruit the individuals who will participate in the standard setting activity (i.e., the panelists).

For statewide assessments, it is preferable that the panelists be as representative of the state as possible.

Table 13-1 shows the process of identifying these individuals about nine months before standard setting begins.

Creation of the standard-setting panels is a three-step process:

1. Local superintendents or their designees identify potential panelists in accordance with specifications provided by the state education agency.

2. Candidates are notified, via an initial letter, before their names are submitted.

3. State agency staff sort the nominations to create the required number of panels with the approved number of panelists.

3. Materials

Training materials, forms, and data analysis programs

The timing of preparing these materials is crucial.

Some can be prepared in advance and some cannot (see Tables 13-2 and 13-3).

Final preparations: everyone involved needs to be thoroughly prepared; all presentations should be scripted and rehearsed, all rating forms should be double-checked, and all participant materials should be produced, duplicated, collated, and assembled into sets.

As a final part of the preparation, the entire standard-setting staff should conduct a dress rehearsal, making sure that the timing of presentations is consistent with the agenda, that all forms are correct and usable, and that the flow of events is logical.


4. At the standard setting site and following up

The lead facilitator attends to matters related to the conduct of the sessions

The logistics coordinator attends to everything else

Once panelists complete their tasks and turn in their materials, data entry staff take over; the next morning, the data analysis staff continues the process.

All data entry should be verified by a second person before data analysis begins.
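A minimal sketch of such a double-entry check, assuming hypothetical record keys and rating values (not a program from the book):

```python
# Hypothetical double-entry verification: two staff members key the same
# rating forms independently; any mismatching field is flagged for
# resolution before analysis. Keys and values are illustrative only.
entry_one = {"panelist_07_item_12": 3, "panelist_07_item_13": 2}
entry_two = {"panelist_07_item_12": 3, "panelist_07_item_13": 4}

discrepancies = {
    key: (entry_one[key], entry_two[key])
    for key in entry_one
    if entry_one[key] != entry_two[key]
}
print(discrepancies)  # {'panelist_07_item_13': (2, 4)}
```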

The state education agency responsible for the standard setting should arrange time on the agenda of the state board of education as soon as possible after standard setting so that the cut scores can be approved.

Once cut scores are adopted by the board, it is possible to include them in the score reporting programs and produce score reports.


Scheduling standard setting for credentialing programs

Scheduling standard setting for credentialing programs differs from scheduling it for educational assessment programs. Educational assessment programs are bound to specific times of the academic year; tests are typically given in the spring or fall.

Credentialing programs are not bound by these constraints and have more flexibility; for example, computer adaptive testing (CAT) or computer-based testing (CBT) may permit test administration on any day of the year.

Table 13-4 provides an overview of the major tasks for a credentialing testing program.


Small group activity

In groups of three, review pages 237-245 and post the key components of scheduling standard setting for credentialing programs, focusing on how it differs from scheduling standard setting for educational assessments.

Use this website to post your thoughts

http://padlet.com/wall/4qxyguqgnd

Recommendations

Planning for standard setting needs to be made an integral part of planning for test development.

Plans of the standard setting facilitators should be reviewed by test development staff, and vice versa.

One person with authority over both item developers and standard setters should have informed oversight over both activities.

Pay particular attention to scoring, especially with open-ended or constructed-response items.

Finally, test planning, test development, and standard setting are interlinked parts of a single enterprise.

Ch. 14: Vertically Moderated Standard Setting

Chapter Goals:

Describe:

(1) the general concept of VMSS

(2) specific approaches to conduct VMSS

(3) a specific application of VMSS

Provide:

(1) suggestions for current assessment systems and needs for additional research

Linking Test Scores Across Grades Within the Norm-Referenced Testing (NRT) Context

Review from Ch-6 (Ryan & Shepard)

Linking: several types of statistical methods that establish a relationship between the score scales of two tests so that results from the two tests are comparable.

Test score equating: used to measure year-to-year changes over time for different students in the same grade.

Vertical equating: linking test scores across grade levels and schooling levels; the tests to be linked need to measure the same construct.
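As an illustration only (the chapter does not prescribe a particular method), here is a minimal sketch of one simple linking procedure, linear (mean-sigma) linking, which places scores from one form onto the scale of another; all score values are hypothetical:

```python
# Minimal sketch of linear (mean-sigma) linking: map Form X scores onto
# the Form Y scale by matching means and standard deviations.
# All score values below are hypothetical.
import statistics

form_x_scores = [12, 15, 18, 20, 22, 25]
form_y_scores = [14, 17, 21, 23, 26, 29]

mu_x, sd_x = statistics.mean(form_x_scores), statistics.stdev(form_x_scores)
mu_y, sd_y = statistics.mean(form_y_scores), statistics.stdev(form_y_scores)

def link_x_to_y(x: float) -> float:
    """Return the Form Y scale equivalent of a Form X score."""
    return mu_y + sd_y * (x - mu_x) / sd_x

print(round(link_x_to_y(18.0), 1))  # a Form X score of 18, on the Y scale
```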


Interrelated Challenges within the Standards-Referenced Testing (SRT) context

NCLB requirements for tracking cohort growth & achievement gaps

These newer assessments apply standards-referenced testing (SRT)

Linking test performance standards from two or more grade levels (adjacent and not adjacent)

The construct measured may be different

Sheer number of performance levels that NCLB requires

The wide test span and developmental range

The panels of educators who participate in standard setting


A New Method that Links Standards Across Tests

To address these challenges, there is a need to develop and implement standard setting methods that set performance levels across all affected grade levels, with some method for smoothing out differences between grades.

Suggested approach—VMSS—Vertically Moderated Standard Setting

History of VMSS

Introduced by Lissitz & Huynh (2003b)

AYP is based on the percentage of students who reach Proficient, a percentage that is expected to increase over time.

The purpose of VMSS: to derive a set of cross-grade standards that realistically tracks student growth over time and provides a reasonable expectation of growth from one grade to the next.

The critical issue is defining reasonable expectations; vertical scaling would not produce a satisfactory set of expectations for grade-to-grade growth.

As an alternative to vertical scaling or equating, Lissitz and Huynh (2003b) suggested VMSS.


What is VMSS?

A process of vertical articulation of standards: aligning scores, scales or proficiency levels.

A procedure, or set of procedures, typically carried out after individual standards have been set, that seeks to smooth out the bumps that inevitably occur across grades.

Reasonable expectations are stated in terms of percentages of students at or above a consequential performance level, such as Proficient.

Let's discuss the hypothetical scenario using the table on the next slide (p. 255 in your book).


What is VMSS?

Grade    % At or Above Proficient    Difference from Prior Grade
3        37                          -
4        41                          +4%
5        34                          -7%
6        43                          +9%
7        29                          -14%
8        42                          +13%
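The differences in the table are simple grade-to-grade subtractions; a minimal sketch using the hypothetical percentages above:

```python
# Hypothetical percentages of students at or above Proficient, by grade,
# from the scenario above; print the grade-to-grade differences.
pct_proficient = {3: 37, 4: 41, 5: 34, 6: 43, 7: 29, 8: 42}

grades = sorted(pct_proficient)
for prev_grade, grade in zip(grades, grades[1:]):
    diff = pct_proficient[grade] - pct_proficient[prev_grade]
    print(f"Grade {prev_grade} -> {grade}: {diff:+d}%")
```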

Approaches to VMSS

Focuses on percentages of students at various proficiency levels

Is based on assumptions about growth in achievement over time

Problem: different percentages of students reach a given performance level, such as Proficient, at different grades.

Solutions:

1. Set all standards, by fiat, at score points such that equal percentages of students would be classified as Proficient at each grade level.

2. Set standards only for the lowest and highest grades and then align the percentages of Proficient students in the intermediate grades accordingly.

Approaches to VMSS

Grade    % At or Above Proficient
3        37
4        38
5        39
6        40
7        41
8        42

[Chart: percentage of students at or above Proficient by grade, rising smoothly from 37% at grade 3 to 42% at grade 8]
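The smoothed values above are what the second approach yields if the intermediate grades are aligned by linear interpolation between the endpoint grades; a minimal sketch, assuming those hypothetical endpoint values:

```python
# Minimal sketch of the second approach: set standards only for the lowest
# and highest grades, then align intermediate grades by linear interpolation
# of the percentage at or above Proficient. Endpoint values are hypothetical.
low_grade, high_grade = 3, 8
low_pct, high_pct = 37.0, 42.0

step = (high_pct - low_pct) / (high_grade - low_grade)
smoothed = {
    grade: low_pct + step * (grade - low_grade)
    for grade in range(low_grade, high_grade + 1)
}
print(smoothed)  # {3: 37.0, 4: 38.0, 5: 39.0, 6: 40.0, 7: 41.0, 8: 42.0}
```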

Assumptions re: growth over time

Lewis & Haug (2005)

The percentage of students classified as at or above Proficient would be expected to be:

1. Equal across grades or subjects

2. Approximately equal

3. Smoothly decreasing

4. Smoothly increasing
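As an illustration only (not a procedure from the chapter), a cross-grade trajectory of percentages can be checked against these four expectations; a minimal sketch with hypothetical data and an arbitrary tolerance:

```python
# Hypothetical check of a cross-grade trajectory of percentages at or above
# Proficient against the four Lewis & Haug (2005) expectations. The tolerance
# for "approximately equal" is an arbitrary illustrative choice.
def classify_trajectory(pcts, tol=0.5):
    diffs = [b - a for a, b in zip(pcts, pcts[1:])]
    if all(d == 0 for d in diffs):
        return "equal across grades"
    if all(abs(d) <= tol for d in diffs):
        return "approximately equal"
    if all(d < 0 for d in diffs):
        return "smoothly decreasing"
    if all(d > 0 for d in diffs):
        return "smoothly increasing"
    return "no simple pattern (a candidate for moderation)"

print(classify_trajectory([37, 41, 34, 43, 29, 42]))  # no simple pattern
print(classify_trajectory([37, 38, 39, 40, 41, 42]))  # smoothly increasing
```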

Ferrara, Johnson & Chen (2005)

Assumptions for standard setting are based on the intersection of three growth models:

1. Linear Growth

2. Remediation

3. Acceleration


Alternative procedures

Because VMSS is a relatively new procedure, it is difficult to pinpoint its limitations and alternative procedures

There have been few thoroughly documented applications of VMSS

Each application has been slightly different from the others

The authors have suggested a common core of elements for VMSS

However, no fixed set of steps has emerged in applications of VMSS so far

Every aspect of any application might be thought of as an alternative procedure


Core components of future VMSS applications

1. Grounding in historical data (Lewis & Haug, 2005; Buckendahl et al., 2005).

2. Establishment of performance models

3. Consideration of historical data

4. Cross-grade examination of test content and student performance

5. Polling of participants

6. Follow up review and adjustment


Limitations

If the focus of VMSS is the percentages of students at or above a particular proficiency level, a lack of historical perspective or context would be not only limiting but debilitating.

Any application of VMSS is hampered if it is not supported by a theoretically or empirically sound model of achievement growth.

Maintaining the meaning of cut scores and fidelity to PLDs is one of the most fundamental issues for future research.

Research and development is a growth industry
