Measuring the Growth of Students Participating in the Alternate
Assessment
Thursday, August 9, 2012
Time: 1–2:30 p.m. Eastern Time
Webinar 3
A webinar series designed to address challenges in measuring the growth of students with disabilities for
use in educator evaluation, sponsored by the National Comprehensive Center for Teacher Quality (TQ
Center) in partnership with the Office of Special Education Programs (OSEP):
http://www.tqsource.org/webcasts/osep2012/
2
A Forum of State Special Education and Teacher Effectiveness Experts and Researchers
Webinar 1: State Approaches to
Measuring Student Growth For the
Purpose of Teacher Evaluation
Webinar 2: Challenges and Considerations
in Measuring Growth of Students With
Disabilities
Webinar 3: Measuring the Growth of
Students Participating in the Alternate
Assessment
Recorded webinars can be located here:
http://www.tqsource.org/webcasts/osep2012/
http://www.tqsource.org/pdfs/TQ_Forum_SummaryUsing_Student_Growth.pdf
3
Learning Targets
• Increase awareness of the challenges in using growth of students
with disabilities participating in the alternate assessments in
teacher and leader evaluations
• Increase understanding of lessons learned from early efforts to
measure student growth using alternate assessment results and
the potential to use newly designed and aligned assessments
Sandra Hopfengardner Warren, Adviser, ASES SCASS, Council of Chief State School Officers
Neal Kingston, Director, Center for Educational Testing and Evaluation–Dynamic Learning Maps
Alternate Assessment System
Kamarrie Coleman, Coordinator, National Center on Educational Outcomes
Jacqui Kearns, Principal Investigator, National Alternate Assessment Center and Consortium
Gary Phillips, Vice President, Assessment, American Institutes for Research
Brian Touchetee, Education Associate, Delaware Department of Education
Council of Chief State School Officers
State Collaborative on Assessment and Student Standards
Assessing Special Education Students
Sandra Warren, ASES SCASS Facilitator
4
Forums on Evaluating Educator Effectiveness (ASES in collaboration with….)
National Comprehensive Center for Teacher Quality (September 2011)
National Center on Educational Outcomes (June 2012)
Issues
• How should assessment results of students with disabilities be used in evaluation models of educator effectiveness?
• What are the benefits and concerns of using IEP Goals or Student Learning Objectives (SLO)?
• If multiple measures are used, what would a balanced model look like?
5
Balanced Systems
Classroom
Instruction
6
Challenges Specific to Students With Significant Cognitive Disabilities
• State alternate assessments are often portfolio based; therefore, comparability between measures is a significant challenge.
• State alternate assessments may vary in their technical quality; therefore, using alternate assessment results for the purpose of measuring student growth may not be a viable option.
• Subjectivity may be more prevalent in portfolio reviews.
• The heterogeneity of students with significant cognitive disabilities makes it difficult to identify and/or develop a standardized measure that takes into account the variance in learning trajectories.
7
State and School District Considerations Regarding AA-AAS
• Guard against diminishing expectations for the work of students
with significant cognitive disabilities. Regardless of the assessment format, this should be at the forefront of the review.
• Ensure students are provided a range of opportunities for accessing the assessment and providing responses.
• Ensure students have equal access to the curriculum, instruction, and opportunities to learn.
• Take into consideration that, for some students with significant cognitive disabilities, maintaining current performance (static scores) may itself be considered growth. This is particularly true for students with degenerative conditions.
8
State and School District Considerations
Regarding AA-AAS (cont.)
• Consider whether or how student scores can be attributed to educators other than the special educator (e.g., general educators, other licensed educators, and related service providers).
• Consider methods (e.g., discrete responses, chained responses, and permanent products) used in research to capture student learning for students with significant cognitive disabilities.
• Consider using the content-plus model and/or performance-based assessments (e.g., academic content plus student progress on life skills or therapy goals).
• Recognize the heterogeneous nature of this group of students and understand that the expected learning trajectory will vary from student to student.
9
Transforming the Education System
10
Dynamic Learning Maps
Alternate Assessment Consortium
Center for Educational Testing and Evaluation
University of Kansas
The present publication was developed under grant 84.373X100001 from the
U.S. Department of Education, Office of Special Education Programs. The views
expressed herein are solely those of the author(s), and no official endorsement
by the U.S. Department should be inferred.
State Participants
Goal: Support student learning and
avoid unintended consequences
• When an assessment system is
embedded in an accountability
system there will be consequences
– Many teachers will narrowly teach to the
test
– Sometimes teachers and administrators
will act counter to their professional
responsibilities
• Teachers need more information
about student learning
– Timely
– Actionable
Goal: Support student learning and
avoid unintended consequences
How will DLM meet these goals?
• Common Core Essential Elements
• Instructionally-embedded (and
summative) assessments
• Instructionally-relevant tasks
• Learning maps
• Dynamic assessment
• Professional development
• Technology platform to tie it all together
• Reporting: status and growth
Potential problems with growth
models in AA population
• Unreliability (short tests with low
item information)
• Extra error variance (good days/bad
days) and missing data
• Students with ongoing mental and
physical deterioration
DLM potential advantages for
measuring growth
• Many measurements (helps with good
day/bad day and low reliability)
• Dynamic testing (helps with low
reliability)
• High granularity (opportunity to show
slow growth)
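The point about many measurements can be illustrated with a small sketch. It assumes, purely for illustration, that each measurement carries independent error with the same standard deviation; this is a textbook simplification, not a description of the DLM scoring model, and the function name is hypothetical.

```python
import math


def standard_error_of_mean(single_measurement_sd: float, n_measurements: int) -> float:
    """Standard error of the average of n independent, equally noisy measurements.

    Assumption (not from the slides): errors are independent and share one
    standard deviation, so the error of the average shrinks with sqrt(n).
    """
    if n_measurements < 1:
        raise ValueError("n_measurements must be at least 1")
    return single_measurement_sd / math.sqrt(n_measurements)


if __name__ == "__main__":
    # Hypothetical error SD of 10 scale points for a single short test.
    for n in (1, 4, 16):
        print(n, standard_error_of_mean(10.0, n))
```

Under these assumptions, averaging 4 measurements halves the error of a single one and averaging 16 quarters it, which is why frequent, instructionally embedded measurement can offset short tests and good day/bad day variation.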
Presenter: Jacqui Kearns, NCSC
Professional Development Workgroup Lead
Kamarrie Coleman, NCSC
Teacher Evaluation and Use of AA-AAS
Data in Teacher Evaluation Systems
AA-AAS School Accountability Models
With Growth Measures
• Current statistical models make assumptions about underlying AA-AAS content progression that have limited evidence; measurement experts are not in agreement
• Current performance level models are only as good as the content progression and increasing expectations evident in test and standards from year to year
• Opportunities for the future in learning progressions/maps work; focus on student profiles and patterns; ensure accountability doesn’t reward “topping out”
24
Growth at the Student Level
• Standards-based progress monitoring data within year, on high priority academic content
• Increased depth, breadth, and complexity of academic profile from AA-AAS and from within-year progress monitoring, increasingly near links to grade-level academic standards
• Communicative competence
• UDL grade-level instructional opportunities that encourage full participation – social and academic – in a community of learners, with typical peers
25
Purpose and Use:
Data Must Be Appropriate for Each
Purpose/Use
• System accountability
• Teacher evaluation
• Professional development targets
• Student instructional planning
26
NCSC Project Components
• Summative assessment
• Curriculum development resource materials
– Universally Designed Units (UDL)
– Curriculum guides
– Model lessons that scaffold instruction on difficult to teach content
– Formative assessment tools
• Communities of Practice in each partner state
– Webinars
– LCI state profiles/Communication Triage Summit
– Orientation to CCSS
– Overview & implementation of project C&I materials
– Training on test administration
27
Theory of Action - NCSC
28
Teacher Evaluation and Multiple Measures:
What Do We Care About?
• New, richer patterns of academic skills and knowledge from AA-AAS data from year to year (student profile approach)
• Performance level growth: maintaining proficient or above, or improving from below proficient. Use only where performance levels show evidence of quality for this purpose; even then, data are more stable at the school level than at the classroom level
• Communicative competence: Data to show interventions, growth, actual performance
• Standard-based classroom progress monitoring tools that show within year growth on priority academic skills and knowledge
• Integrated supports for self-determination, independence, and real world application of skills
• Evidence-based classroom practices
29
What Will Ensure:
• Maximized communicative competence
• Full access to the academic content for
life-long learning
• Development of appropriate social skills
• Development of independent work behaviors
• Development of support access skills
(NCSC discussion based on Kearns, Kleinert, Harrison,
Shepard-Jones, Hall, & Jones, 2011)
30
Copyright © 2012
American Institutes
for Research.
All rights reserved.
Alternate Assessments
That Measure Student
Growth
Gary W. Phillips
Louis Danielson
Lynnett Wright
American Institutes for Research
32
Three State Consortiums
• National Center and State Collaborative
24 states
• Dynamic Learning Maps
13 states
• Multistate Adaptive Alternate Assessment Consortium*
6 states
*The consortium described here is in the initial stages of development. The
name, composition, and organizational structure of the consortium will be
worked out by the states over the next year. Minnesota is part of the
consortium and is in the process of deciding if the state will use the AIR
adaptive design.
33
Typical Types of State Alternate
Assessments
• Checklist
List of skills, reviewed by teacher with a student. Teacher observes
or recalls whether students are able to perform the skills listed, and
to what level of proficiency.
• Portfolio
Collection of student work gathered by teacher demonstrating
student performance on specific skills and knowledge, generally
linked to state content standards. May include student work,
observations recorded by multiple persons, test results, or video or
audio records of student performance.
• IEP-linked body of evidence
Collection of student work gathered by teacher demonstrating
student achievement on standards-based IEP goals and objectives,
measured against pre-determined scoring criteria.
34
What Is Wrong With the Current Typical
State Alternate Assessment?
• Time consuming, burdensome, and expensive
• Do not reliably measure what the student knows and can do (they are confounded by what the teacher knows and can do)
• Cannot measure student growth from grade to grade or from fall to spring
35
What Is Wrong With the Current Typical
State Alternate Assessment? (continued)
• Overinflate proficiency (almost all students are proficient)
• Do not meet the same technical requirements as assessments for the general population of students
• Inherently unfair to students with disabilities because the typical alternate assessment does not reliably measure what students know and can do and is not capable of measuring progress
36
Advantages of the AIR Test Design Used by the
Multistate Adaptive Alternate Assessment
Consortium
• Task-based (standardized administration, which
allows scores from the test to be comparable)
• Test difficulty adapted to student ability
• Administered and scored by teachers
• Vertical scale using Item-Response Theory models
• High reliability and validity of the scores
• Aligned to extensions of Common Core State
Standards (CCSS)
Delaware is completely aligned with the CCSS
Remaining states are transitioning to the CCSS
37
Advantages of the AIR Test Design Used by the
Multistate Adaptive Alternate Assessment
Consortium
• Less costly than traditional alternate assessments
• Requires less administration time (about one hour)
• Meets the same technical requirements as
assessments of the general population
• NCLB-approved (New Mexico and South Carolina)
• The same growth models that apply to the general
assessment can be applied to the alternate
assessment
• School and teacher effectiveness indices can be
calculated, and value-added models can be used
38
Advantages of the AIR Test Design Used by the
Multistate Adaptive Alternate Assessment
Consortium
• Each student receives a set of items that meets the state test blueprint
• Measures growth from year to year and/or from fall to spring
• Scoring is contemporaneous with test administration
• Scores are immediately entered in the computer
• Score reports are immediately available
• Score reports for alternate assessment look exactly like the score reports for general education
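Because scores on a vertical scale are meant to be comparable across occasions, growth from fall to spring (or year to year) can be sketched as a simple difference. The student IDs and scores below are hypothetical values made up for illustration, and the helper name is an assumption; this is not AIR's growth model.

```python
from statistics import mean


def scale_score_gains(earlier: dict, later: dict) -> dict:
    """Per-student gain for students who have a score on both occasions.

    Students missing either score are skipped rather than imputed.
    """
    tested_both_times = earlier.keys() & later.keys()
    return {sid: later[sid] - earlier[sid] for sid in tested_both_times}


# Hypothetical fall and spring scale scores keyed by a made-up student ID.
fall = {"s1": 470, "s2": 455, "s3": 490}
spring = {"s1": 478, "s2": 455, "s4": 500}

gains = scale_score_gains(fall, spring)
mean_gain = mean(gains.values())
```

Only students tested on both occasions contribute to the mean gain, so the number of matched students should be reported alongside it.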
39
Growth on the Vertical Scale:
New Mexico Language Arts
[Figure: New Mexico Alternative Assessment, Language Arts Longitudinal Growth, by Grade. Mean scale score (axis 440 to 510) by year of administration (2007, 2008, 2009), with one series per cohort: Grade 3 in 2007, Grade 4 in 2007, Grade 5 in 2007, Grade 6 in 2007.]
40
Growth on the Vertical Scale:
New Mexico Mathematics
[Figure: New Mexico Alternative Assessment, Mathematics Longitudinal Growth, by Grade. Mean scale score (axis 440 to 510) by year of administration (2007, 2008, 2009), with one series per cohort: Grade 3 in 2007, Grade 4 in 2007, Grade 5 in 2007, Grade 6 in 2007.]
41
Growth on the Vertical Scale:
South Carolina English Language Arts
[Figure: South Carolina Alternate Assessment, ELA Longitudinal Growth, by Base Grade. Mean scaled score (axis 465 to 505) by year of administration (2007, 2008, 2009), with one series per cohort: Grade 3 in 2007, Grade 4 in 2007, Grade 5 in 2007, Grade 6 in 2007.]
42
Growth on the Vertical Scale:
South Carolina Mathematics
[Figure: South Carolina Alternate Assessment, Mathematics Longitudinal Growth, by Base Grade. Mean scaled score (axis 465 to 505) by year of administration (2007, 2008, 2009), with one series per cohort: Grade 3 in 2007, Grade 4 in 2007, Grade 5 in 2007, Grade 6 in 2007.]
Overview
We have had an educator evaluation system for over a decade
12-minute high-level overview
2-year process to build the current system for student improvement
Training began for school teams on August 6, 2012
43
For more updated information: www.doe.k12.de.us
Purpose/Philosophy
• Focus on building a school climate where everyone is focused on the improvement of student achievement.
• Bringing back a focus on conferences and conversation between educators and administrators.
44
Challenge #1
A balanced measurement system
45
Overall Components
            | Teachers | Specialists | Administrators
Component 1 | Planning & Preparation | Planning & Preparation | Vision & Goals
Component 2 | Classroom Environment | Professional Practice & Delivery of Services | Culture of Learning
Component 3 | Instruction | Professional Collaboration & Consultation | Management
Component 4 | Professional Responsibilities | Professional Responsibilities | Professional Responsibilities
Component 5 | Student Improvement | Student Improvement | Student Improvement
46
Challenge #2
Who are the educator groups?
47
3 Types of Educators
Generalizations about the category
Group I
Are you the reading and/or math Teacher of Record, and do you give grades to at least 10 students in a DCAS-tested grade (3-10)?
Group II
Are you the Teacher of Record, and do you give grades to at least 10 students in any grade or subject other than DCAS reading and/or math?
Group III
Any educator who does not meet the criteria for Group I or Group II defaults to Group III.
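The three group definitions above read as a decision rule, which can be sketched as follows. The `Assignment` record and its field names are hypothetical, introduced only for illustration; the 10-student minimum comes from the slide, and the actual DDOE business rules may include conditions not shown here.

```python
from dataclasses import dataclass

MIN_GRADED_STUDENTS = 10  # minimum roster size stated on the slide


@dataclass
class Assignment:
    """Hypothetical summary of one educator's teaching assignment."""
    is_teacher_of_record: bool
    # Students graded in DCAS-tested reading and/or math, grades 3-10.
    dcas_reading_math_students: int
    # Students graded in any other grade or subject.
    other_students: int


def educator_group(a: Assignment) -> str:
    """Apply the Group I / II / III questions in order."""
    if a.is_teacher_of_record:
        if a.dcas_reading_math_students >= MIN_GRADED_STUDENTS:
            return "Group I"
        if a.other_students >= MIN_GRADED_STUDENTS:
            return "Group II"
    # Anyone who meets neither set of criteria defaults to Group III.
    return "Group III"
```

Checking Group I before Group II reflects the order in which the slide presents the questions; whether an educator who qualifies for both is always placed in Group I is an assumption of this sketch.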
48
Challenge #3
What measures will we use to show improvement?
49
3 Types of Measures
Type of Measure
A: State Assessment
B: External and Internal Assessments
C: Growth Measures
50
Ideal cohort size 25+ students
Minimum of 10 students
For DCAS-Alt1
Fall and Spring test for DCAS-Alt1
Growth shown across large group of students
First year of implementation
Growth may be widely variable within a classroom
Time frame for completing spring assessment
51
External Assessment
DDOE approved, standardized assessments that can be used at the discretion of the district
DIBELS, STAR Math, STAR Reading
Internal Assessment
DDOE approved, educator developed assessments specific to subject and grade level
Pre/post student assessments
52
• DDOE approved, educator developed goals
• Specific to content area and/or job assignment
• Includes a mix of: student growth and professional outcomes
• Direct vs. indirect services (student growth vs. professional outcomes)
• Standardized by:
– Cohort sizes established
– Baseline and data method
– Min and max time period
– Expected goal attainment or minimum expected growth
53
Indicator ID | Standard | Goal Statement
1 | 8d.1 – ELA Assessment of Independence | Given the average of scores attained during the baseline period, the identified group of student(s) with academic targets (ELA) will decrease the number of prompts or show improvement on the prompt hierarchy to meet the target(s) of ___* by the conclusion of the timelines. (*target must be at least one prompt lower than baseline or improvement on the prompt hierarchy by one level toward the target)
2 | 8d.2 – ELA Assessment of Accuracy | Given the average of scores attained during the baseline period, the identified group of students with academic targets (ELA) will increase the percentage of target by ____* at the conclusion of the timelines. (*target must be 5% higher or attain/maintain 90% or higher)
54
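The two target rules in the footnotes above can be sketched as simple checks against the baseline average. This sketch makes two assumptions that the slide does not settle: "5% higher" is read as 5 percentage points above the baseline average, and the independence goal is checked only by prompt count (the alternative of moving one level on the prompt hierarchy is not modeled). The function names are hypothetical.

```python
from statistics import mean


def prompt_target_is_valid(baseline_prompt_counts, target_prompts) -> bool:
    """Independence goal: target must be at least one prompt below baseline."""
    baseline = mean(baseline_prompt_counts)
    return target_prompts <= baseline - 1


def accuracy_target_is_valid(baseline_percentages, target_pct) -> bool:
    """Accuracy goal: target is 5 points above baseline, or at least 90%.

    Assumption: "5% higher" means 5 percentage points, not a relative 5%.
    """
    baseline = mean(baseline_percentages)
    return target_pct >= baseline + 5 or target_pct >= 90
```

For example, with baseline accuracy scores of 60 and 70 (average 65), a target of 70 passes and a target of 68 does not; a student already averaging 93 passes with any target of 90 or higher.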
Growth Goals Summary
16 Growth Goals available
Educator chooses 4 for the students they teach
Educator sets targets (for Satisfactory and Exceeds)
Administrator approves based on professional conversation with the educator
Topics of Growth Goals include:
• ELA
• Mathematics
• Science
• Social Studies
• Social Skills
• Daily Living
• Career Readiness
• Generalization of skills
• Communication
55
Challenge #4
So how does all of this come together?
56
Measures by Educator Type
A B C
I 50% 50%
II 50% 50%
III 100%
57
How Will Measure C Be Rated?
Exceeds: The agreed upon “exceeds” target is met or surpassed
Satisfactory: The agreed upon “satisfactory” target is met or surpassed, but the “exceeds” target is not met
Unsatisfactory: The agreed upon “satisfactory” target is not met
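The rating rule above amounts to comparing a result against two agreed targets. A minimal sketch follows, assuming results and targets are on one numeric scale where higher is better (a goal such as reducing prompts would need to be re-expressed that way first); the function name and signature are hypothetical.

```python
def rate_measure_c(result: float, satisfactory_target: float, exceeds_target: float) -> str:
    """Rate a Measure C result against the two agreed upon targets.

    Assumption: higher values are better and the "exceeds" target is not
    lower than the "satisfactory" target.
    """
    if exceeds_target < satisfactory_target:
        raise ValueError("exceeds_target must be >= satisfactory_target")
    if result >= exceeds_target:
        return "Exceeds"
    if result >= satisfactory_target:
        return "Satisfactory"
    return "Unsatisfactory"
```

Meeting a target exactly counts as meeting it, matching the "met or surpassed" wording.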
58
Summative Ratings
Total # of Satisfactory ratings in Components I-IV | Component Five | Summative Rating
4/4 | Exceeds | Highly Effective
4/4 | Satisfactory | Effective
4/4 | Unsatisfactory | Needs Improvement
3/4 | Exceeds | Highly Effective
3/4 | Satisfactory | Effective
3/4 | Unsatisfactory | Needs Improvement
2/4 | Exceeds | Effective
2/4 | Satisfactory | Effective
2/4 | Unsatisfactory | Ineffective
1/4 | Exceeds | Needs Improvement
1/4 | Satisfactory | Needs Improvement
1/4 | Unsatisfactory | Ineffective
0/4 | Exceeds | Needs Improvement
0/4 | Satisfactory | Needs Improvement
0/4 | Unsatisfactory | Ineffective
59
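The summative rating table above is a direct lookup on two inputs, which can be sketched as a dictionary keyed by the number of Satisfactory ratings in Components I-IV and the Component Five rating. The entries are transcribed from the slide; the names `SUMMATIVE_RATINGS` and `summative_rating` are hypothetical.

```python
# Keys: (Satisfactory ratings in Components I-IV, Component Five rating).
SUMMATIVE_RATINGS = {
    (4, "Exceeds"): "Highly Effective",
    (4, "Satisfactory"): "Effective",
    (4, "Unsatisfactory"): "Needs Improvement",
    (3, "Exceeds"): "Highly Effective",
    (3, "Satisfactory"): "Effective",
    (3, "Unsatisfactory"): "Needs Improvement",
    (2, "Exceeds"): "Effective",
    (2, "Satisfactory"): "Effective",
    (2, "Unsatisfactory"): "Ineffective",
    (1, "Exceeds"): "Needs Improvement",
    (1, "Satisfactory"): "Needs Improvement",
    (1, "Unsatisfactory"): "Ineffective",
    (0, "Exceeds"): "Needs Improvement",
    (0, "Satisfactory"): "Needs Improvement",
    (0, "Unsatisfactory"): "Ineffective",
}


def summative_rating(satisfactory_count: int, component_five: str) -> str:
    """Look up the summative rating for one educator."""
    key = (satisfactory_count, component_five)
    if key not in SUMMATIVE_RATINGS:
        raise ValueError(f"no summative rating defined for {key!r}")
    return SUMMATIVE_RATINGS[key]
```

Keeping the table as data rather than nested conditions makes it easy to check each row against the slide.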
60
TQ Center Resources (http://www.tqsource.org/)
• STEP Database http://resource.tqsource.org/stateevaldb/
• Guide to Evaluation Products http://resource.tqsource.org/GEP/
• Online Practical Guide to Designing Comprehensive Teacher Evaluation Systems http://www.tqsource.org/practicalGuide/
• Aligning Teacher Evaluation with Professional Learning http://www.tqsource.org/alignEvalProfLearning.php