John Cronin's presentation on teacher evaluation at NWEA's Texas Assessment Summit
Using Tests and Value-Added Modeling for Teacher and Administrator Evaluation: A Consumer’s Guide
John Cronin, Ph.D. – Senior Director of Education Research, Northwest Evaluation Association
What NWEA supports
• The evaluation process should focus on helping teachers improve.
• The principal or designated evaluator should control the evaluation.
• Tests should inform principal decision-making and not be the deciding factor in an evaluation.
• Multiple measures should be used.
Distinguishing teacher effectiveness from teacher evaluation
• Teacher effectiveness – The judgment of a teacher’s ability to positively impact learning in the classroom.
• Teacher evaluation – The judgment of a teacher’s overall performance including:
– Teacher effectiveness
– Common standards of job performance
– Participation in the school community
– Adherence to professional standards
Effective teaching and professional job performance
• Evidence of professional practice – The evaluation of teaching by classroom observation and use of artifacts.
• Evidence of professional responsibilities – The evaluation of the teacher’s effectiveness in making progress toward their goals and fulfilling the responsibilities of a professional educator.
• Evidence of student learning – The evaluation of a teacher’s contribution to student learning and growth.
A simple framework for teacher evaluation
Effective teaching and professional job performance
• Evidence of professional practice – Domains 1 – 4: Instructional planning and delivery; Knowledge of students and learning; Content knowledge and expertise; Learning environment.
• Evidence of professional responsibilities – Domain 6: Professional practices and responsibilities.
• Evidence of student learning – Domain 5: Data-driven practice; 20% Measure of student growth.
A simple framework for teacher evaluation – Texas style
Purposes of summative evaluation
• Make an accurate and defensible judgment of an educator’s job performance.
• Produce performance ratings that meaningfully differentiate among educators.
• Help educators focus on their students and their practice.
• Retain your top educators.
• Dismiss ineffective educators.
The greatest tragedy of this century in education so far was the number of young, talented teachers who lost their positions in the last recession.
Employment of Elementary Teachers, 2007-2012
Year  Number of teachers
2007  1,538,000
2008  1,544,270
2009  1,544,300
2010  1,485,600
2011  1,415,000
2012  1,360,380
Source: Bureau of Labor Statistics, Occupational Employment Statistics (2012, May). Numbers exclude special education and kindergarten teachers.
The elementary school teacher workforce shrank by 178,000 teachers (11%) between May 2007 and May 2012.
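The size of the decline can be checked directly from the chart's endpoints. The short sketch below assumes the first and last values in the chart correspond to May 2007 and May 2012; the variable names are illustrative only.

```python
# Elementary teacher employment at the two endpoints of the chart
# (BLS Occupational Employment Statistics; assumed to be May 2007 and May 2012).
employment = {2007: 1_538_000, 2012: 1_360_380}

# Absolute and relative change over the period.
drop = employment[2007] - employment[2012]
pct_drop = 100 * drop / employment[2007]

print(f"Teachers lost: {drop:,}")
print(f"Percent decline: {pct_drop:.1f}%")
```

The result lands at roughly 178,000 teachers, between 11 and 12 percent of the 2007 workforce.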
The impact of seniority based layoffs on school quality
Source: Boyd, D., Lankford, H., Loeb, S., and Wyckoff, J. (2011). Center for Education Policy. Stanford University.
In a simulation of a 5% teacher layoff using New York City data, relying on seniority-based layoffs would:
• Result in 25% more teachers being laid off.
• Lay off teachers who are .31 standard deviations more effective (using a value-added criterion) than those who would be lost under an effectiveness criterion.
• Retain 84% of teachers with unsatisfactory ratings.
What teacher effectiveness implies
• Evidence of Learning – A claim that the improvement in learning (or lack of it) reflected on one or more tests is caused by the teacher.
• Evidence of good practice – A claim that the observer’s ratings or conclusions are reliable and associated with behaviors that cause improved learning in the classroom.
The evolving evaluation landscape – principal observation
Teacher observation as a part of teacher evaluation
Systematic observation of teacher performance is a central part of every state’s teacher evaluation plan.
If performance ratings aren’t consistent with teacher growth, the media and public will demand to know why.
“The (Race to the Top teacher evaluation) changes, already under way in some cities and states, are intended to provide meaningful feedback and, critically, to weed out weak performers. And here are some of the early results:
In Florida, 97 percent of teachers were deemed effective or highly effective in the most recent evaluations. In Tennessee, 98 percent of teachers were judged to be “at expectations.” In Michigan, 98 percent of teachers were rated effective or better.”
Source: New York Times (2013, March 30). Curious Grade for Teachers: Nearly all Pass. Retrieved from: http://www.nytimes.com/2013/03/31/education/curious-grade-for-teachers-nearly-all-pass.html?pagewanted=all&_r=0
Learn from the experience of others
Results of Georgia Teacher Evaluation Pilot (evaluator rating)
• Ineffective: 1%
• Minimally Effective: 2%
• Effective: 75%
• Highly Effective: 23%
Florida District  Highly Effective  Effective  Needs Improvement  Developing  Unsatisfactory  VA Score  Florida Ranking  Ranking
1                 44.4%             55.6%      0.0%               0.0%        0.0%            0.39      109              1
2                 25.0%             75.0%      0.0%               0.0%        0.0%            0.37      121              2
3                 90.9%             9.1%       0.0%               0.0%        0.0%            -0.14     2802             9
4                 60.7%             39.3%      0.0%               0.0%        0.0%            -0.14     2797             8
5                 81.2%             18.8%      0.0%               0.0%        0.0%            -0.16     2831             10
6                 37.3%             54.2%      1.7%               0.0%        6.8%            0.12      880              5
7                 81.3%             18.8%      0.0%               0.0%        0.0%            0.22      402              3
8                 41.7%             55.6%      1.4%               1.4%        0.0%            -0.34     3274             11
9                 52.2%             47.8%      0.0%               0.0%        0.0%            0.16      664              4
10                27.0%             66.2%      1.4%               0.0%        5.4%            0         1764             6
11                7.1%              72.6%      9.5%               10.7%       0.0%            -0.08     2445             7
Teacher Evaluation Ratings in Eleven Florida Schools - 2013
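One way to ask whether evaluator ratings track value-added results is a rank correlation between the share of teachers rated Highly Effective and the value-added ranking in the last column. The sketch below is an illustration, not part of the original presentation: it uses Spearman's formula for untied ranks, and the function names are hypothetical.

```python
# Percent of teachers rated Highly Effective, rows 1-11 of the table above.
highly_effective = [44.4, 25.0, 90.9, 60.7, 81.2, 37.3, 81.3, 41.7, 52.2, 27.0, 7.1]
# "Ranking" column: value-added rank among the eleven rows (1 = highest VA score).
va_rank = [1, 2, 9, 8, 10, 5, 3, 11, 4, 6, 7]


def rank_descending(values):
    """Return ranks where 1 marks the largest value (assumes no ties)."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    ranks = [0] * len(values)
    for position, index in enumerate(order, start=1):
        ranks[index] = position
    return ranks


def spearman_rho(rank_a, rank_b):
    """Spearman rank correlation for two untied rankings of equal length."""
    n = len(rank_a)
    d_squared = sum((a - b) ** 2 for a, b in zip(rank_a, rank_b))
    return 1 - 6 * d_squared / (n * (n**2 - 1))


rating_rank = rank_descending(highly_effective)
rho = spearman_rho(rating_rank, va_rank)
print(f"Spearman rho = {rho:.3f}")
```

With these eleven rows the coefficient comes out slightly negative, which is consistent with the point that generous ratings do not line up with value-added results. Eleven rows is a very small sample, so this is a descriptive check only.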
The actual proportion of teachers for whom student growth can be measured through the state assessment: 25%
Column groups: Ineffective (Growth Measures), Developing (Growth Measures), Effective (Growth Measures), Highly Effective (Growth Measures)
0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40
Ineffective (Observational)
0 1 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
1 2 3 4 4 4 4 5 5 5 5 5 5 5 5 5 5 5 5 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6 6
2 2 4 5 6 6 6 7 7 7 7 7 8 8 8 8 8 8 8 8 8 8 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9 9
3 2 5 6 7 7 8 8 9 9 9 10 10 10 10 10 10 11 11 11 11 11 11 11 11 11 11 12 12 12 12 12 12 12 12 12 12 12 12 12 12 12
4 3 5 7 8 9 9 10 10 11 11 11 12 12 12 12 13 13 13 13 13 13 14 14 14 14 14 14 14 14 14 15 15 15 15 15 15 15 15 15 15 15
5 3 6 8 9 10 11 11 12 12 13 13 14 14 14 14 15 15 15 15 16 16 16 16 16 16 16 17 17 17 17 17 17 17 17 17 18 18 18 18 18 18
6 3 6 8 10 11 12 13 13 14 14 15 15 16 16 16 17 17 17 17 18 18 18 18 18 19 19 19 19 19 19 19 20 20 20 20 20 20 20 20 20 21
7 3 7 9 11 12 13 14 15 15 16 16 17 17 18 18 18 19 19 19 20 20 20 20 20 21 21 21 21 21 22 22 22 22 22 22 22 23 23 23 23 23
8 3 7 10 11 13 14 15 16 17 17 18 18 19 19 20 20 20 21 21 21 22 22 22 23 23 23 23 23 24 24 24 24 24 24 25 25 25 25 25 25 25
9 3 8 10 12 14 15 16 17 18 18 19 20 20 21 21 22 22 23 23 23 24 24 24 24 25 25 25 25 26 26 26 26 26 27 27 27 27 27 27 28 28
10 3 8 11 13 14 16 17 18 19 20 20 21 22 22 23 23 24 24 25 25 25 26 26 26 27 27 27 27 28 28 28 28 29 29 29 29 29 29 30 30 30
11 3 8 11 13 15 17 18 19 20 21 22 22 23 24 24 25 25 26 26 27 27 27 28 28 28 29 29 29 30 30 30 30 31 31 31 31 31 32 32 32 32
12 4 8 12 14 16 17 19 20 21 22 23 24 24 25 26 26 27 27 28 28 29 29 29 30 30 30 31 31 31 32 32 32 33 33 33 33 33 34 34 34 34
13 4 9 12 14 16 18 20 21 22 23 24 25 26 26 27 28 28 29 29 30 30 31 31 31 32 32 33 33 33 34 34 34 34 35 35 35 35 36 36 36 36
14 4 9 12 15 17 19 20 22 23 24 25 26 27 27 28 29 30 30 31 31 32 32 33 33 33 34 34 35 35 35 36 36 36 37 37 37 37 38 38 38 38
15 4 9 13 15 18 19 21 23 24 25 26 27 28 29 29 30 31 31 32 33 33 34 34 35 35 35 36 36 37 37 37 38 38 38 39 39 39 40 40 40 40
Developing (Observational)
16 4 9 13 16 18 20 22 23 25 26 27 28 29 30 31 31 32 33 33 34 35 35 36 36 37 37 37 38 38 39 39 39 40 40 40 41 41 41 42 42 42
17 4 9 13 16 19 21 23 24 25 27 28 29 30 31 32 33 33 34 35 35 36 37 37 38 38 39 39 39 40 40 41 41 42 42 42 43 43 43 44 44 44
18 4 10 14 17 19 21 23 25 26 28 29 30 31 32 33 34 35 35 36 37 37 38 38 39 40 40 41 41 41 42 42 43 43 44 44 44 45 45 45 46 46
19 4 10 14 17 20 22 24 26 27 28 30 31 32 33 34 35 36 36 37 38 39 39 40 40 41 42 42 43 43 43 44 44 45 45 46 46 46 47 47 47 48
20 4 10 14 17 20 22 24 26 28 29 31 32 33 34 35 36 37 38 38 39 40 41 41 42 42 43 43 44 45 45 45 46 46 47 47 48 48 48 49 49 49
21 4 10 14 18 21 23 25 27 29 30 31 33 34 35 36 37 38 39 40 40 41 42 42 43 44 44 45 45 46 46 47 47 48 48 49 49 50 50 50 51 51
22 4 10 15 18 21 23 26 27 29 31 32 34 35 36 37 38 39 40 41 42 42 43 44 44 45 46 46 47 47 48 48 49 49 50 50 51 51 52 52 52 53
23 4 10 15 18 21 24 26 28 30 31 33 34 36 37 38 39 40 41 42 43 43 44 45 46 46 47 48 48 49 49 50 50 51 51 52 52 53 53 54 54 54
24 4 11 15 19 22 24 27 29 31 32 34 35 36 38 39 40 41 42 43 44 45 45 46 47 48 48 49 50 50 51 51 52 52 53 53 54 54 55 55 56 56
25 4 11 15 19 22 25 27 29 31 33 34 36 37 39 40 41 42 43 44 45 46 47 47 48 49 50 50 51 52 52 53 53 54 54 55 55 56 56 57 57 58
26 4 11 16 19 23 25 28 30 32 34 35 37 38 39 41 42 43 44 45 46 47 48 49 49 50 51 51 52 53 53 54 55 55 56 56 57 57 58 58 59 59
27 4 11 16 20 23 26 28 30 32 34 36 37 39 40 42 43 44 45 46 47 48 49 50 50 51 52 53 53 54 55 55 56 57 57 58 58 59 59 60 60 61
28 4 11 16 20 23 26 29 31 33 35 37 38 40 41 42 44 45 46 47 48 49 50 51 52 52 53 54 55 55 56 57 57 58 59 59 60 60 61 61 62 62
29 4 11 16 20 24 26 29 31 34 35 37 39 40 42 43 45 46 47 48 49 50 51 52 53 54 54 55 56 57 57 58 59 59 60 61 61 62 62 63 63 64
30 4 11 16 20 24 27 30 32 34 36 38 40 41 43 44 45 47 48 49 50 51 52 53 54 55 56 56 57 58 59 59 60 61 61 62 62 63 64 64 65 65
Effective (Observational)
31 4 11 17 21 24 27 30 32 35 37 39 40 42 43 45 46 47 49 50 51 52 53 54 55 56 57 57 58 59 60 61 61 62 63 63 64 64 65 66 66 67
32 4 11 17 21 25 28 30 33 35 37 39 41 43 44 46 47 48 50 51 52 53 54 55 56 57 58 59 59 60 61 62 62 63 64 64 65 66 66 67 68 68
33 4 12 17 21 25 28 31 33 36 38 40 42 43 45 46 48 49 50 52 53 54 55 56 57 58 59 60 61 61 62 63 64 64 65 66 66 67 68 68 69 69
34 4 12 17 21 25 28 31 34 36 38 40 42 44 46 47 49 50 51 53 54 55 56 57 58 59 60 61 62 63 63 64 65 66 66 67 68 68 69 70 70 71
35 4 12 17 22 25 29 32 34 37 39 41 43 45 46 48 49 51 52 53 55 56 57 58 59 60 61 62 63 64 64 65 66 67 68 68 69 70 70 71 72 72
36 4 12 17 22 26 29 32 35 37 39 41 43 45 47 49 50 52 53 54 55 57 58 59 60 61 62 63 64 65 66 66 67 68 69 69 70 71 72 72 73 74
37 4 12 17 22 26 29 32 35 38 40 42 44 46 48 49 51 52 54 55 56 58 59 60 61 62 63 64 65 66 67 68 68 69 70 71 71 72 73 74 74 75
38 4 12 18 22 26 30 33 36 38 40 43 45 46 48 50 52 53 55 56 57 58 60 61 62 63 64 65 66 67 68 69 69 70 71 72 73 73 74 75 75 76
39 4 12 18 22 26 30 33 36 39 41 43 45 47 49 51 52 54 55 57 58 59 61 62 63 64 65 66 67 68 69 70 71 71 72 73 74 75 75 76 77 77
40 4 12 18 23 27 30 33 36 39 41 44 46 48 50 51 53 55 56 57 59 60 61 63 64 65 66 67 68 69 70 71 72 73 73 74 75 76 77 77 78 79
41 4 12 18 23 27 31 34 37 39 42 44 46 48 50 52 54 55 57 58 60 61 62 63 65 66 67 68 69 70 71 72 73 74 75 75 76 77 78 78 79 80
42 5 12 18 23 27 31 34 37 40 42 45 47 49 51 53 54 56 58 59 60 62 63 64 66 67 68 69 70 71 72 73 74 75 76 76 77 78 79 80 80 81
43 5 12 18 23 27 31 34 37 40 43 45 47 49 51 53 55 57 58 60 61 63 64 65 66 68 69 70 71 72 73 74 75 76 77 78 78 79 80 81 82 82
44 5 12 18 23 28 31 35 38 41 43 46 48 50 52 54 56 57 59 60 62 63 65 66 67 69 70 71 72 73 74 75 76 77 78 79 80 80 81 82 83 84
45 5 13 19 24 28 32 35 38 41 44 46 48 51 53 54 56 58 60 61 63 64 66 67 68 69 71 72 73 74 75 76 77 78 79 80 81 82 82 83 84 85
Highly Effective (Observational)
46 5 13 19 24 28 32 35 39 41 44 47 49 51 53 55 57 59 60 62 63 65 66 68 69 70 71 73 74 75 76 77 78 79 80 81 82 83 83 84 85 86
47 5 13 19 24 28 32 36 39 42 45 47 49 52 54 56 58 59 61 63 64 66 67 69 70 71 72 74 75 76 77 78 79 80 81 82 83 84 85 85 86 87
48 5 13 19 24 29 32 36 39 42 45 47 50 52 54 56 58 60 62 63 65 66 68 69 71 72 73 74 76 77 78 79 80 81 82 83 84 85 86 87 87 88
49 5 13 19 24 29 33 36 40 43 45 48 50 53 55 57 59 61 62 64 66 67 69 70 71 73 74 75 77 78 79 80 81 82 83 84 85 86 87 88 89 89
50 5 13 19 24 29 33 37 40 43 46 48 51 53 55 57 59 61 63 65 66 68 69 71 72 74 75 76 77 79 80 81 82 83 84 85 86 87 88 89 90 90
51 5 13 19 25 29 33 37 40 43 46 49 51 54 56 58 60 62 64 65 67 69 70 72 73 74 76 77 78 79 81 82 83 84 85 86 87 88 89 90 91 92
52 5 13 19 25 29 33 37 41 44 47 49 52 54 56 58 61 62 64 66 68 69 71 72 74 75 77 78 79 80 82 83 84 85 86 87 88 89 90 91 92 93
53 5 13 19 25 30 34 37 41 44 47 50 52 55 57 59 61 63 65 67 68 70 72 73 75 76 77 79 80 81 82 84 85 86 87 88 89 90 91 92 93 94
54 5 13 20 25 30 34 38 41 44 47 50 53 55 57 60 62 64 66 67 69 71 72 74 75 77 78 80 81 82 83 85 86 87 88 89 90 91 92 93 94 95
55 5 13 20 25 30 34 38 41 45 48 50 53 56 58 60 62 64 66 68 70 71 73 75 76 78 79 80 82 83 84 85 87 88 89 90 91 92 93 94 95 96
56 5 13 20 25 30 34 38 42 45 48 51 54 56 58 61 63 65 67 69 70 72 74 75 77 78 80 81 82 84 85 86 87 89 90 91 92 93 94 95 96 97
57 5 13 20 25 30 35 38 42 45 48 51 54 56 59 61 63 65 67 69 71 73 74 76 78 79 81 82 83 85 86 87 88 90 91 92 93 94 95 96 97 98
58 5 13 20 26 30 35 39 42 46 49 52 54 57 59 62 64 66 68 70 72 73 75 77 78 80 81 83 84 85 87 88 89 90 92 93 94 95 96 97 98 99
59 5 13 20 26 31 35 39 43 46 49 52 55 57 60 62 64 66 68 70 72 74 76 77 79 81 82 83 85 86 88 89 90 91 92 94 95 96 97 98 99 100
60 5 13 20 26 31 35 39 43 46 49 52 55 58 60 63 65 67 69 71 73 75 76 78 80 81 83 84 86 87 88 90 91 92 93 95 96 97 98 99 100 101
The New York Evaluation Matrix
[Figure: principal rating and value-added rating; one axis runs from 0 to 100, the other from 60 to 100]
Why differentiating ratings is important
New York Teacher Ratings by Component
Rating            Value-Added  Local Assessment  Principal Observation
Ineffective       3.65%        1.43%             0.33%
Developing        7.22%        4.59%             1.94%
Effective         44%          38.71%            45.33%
Highly Effective  44.23%       55.27%            52.40%
New York Teacher Ratings
Rating            Value-Added  Local Assessment  Principal Observation
Ineffective       4,216        1,347             306
Developing        8,337        4,334             1,793
Effective         51,660       36,508            41,953
Highly Effective  51,080       52,132            48,503
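The counts above can be turned into the share of teachers in each rating category for each component. This is a small sketch with illustrative names. Shares computed this way may differ slightly from the labels on the chart, for example because of rounding or a slightly different base.

```python
ratings = ["Ineffective", "Developing", "Effective", "Highly Effective"]

# Teacher counts by component, in the order of `ratings`, from the table above.
counts = {
    "Value-Added": [4216, 8337, 51660, 51080],
    "Local Assessment": [1347, 4334, 36508, 52132],
    "Principal Observation": [306, 1793, 41953, 48503],
}

# Convert each component's counts to percentages of that component's total.
shares = {}
for component, values in counts.items():
    total = sum(values)
    shares[component] = [100 * v / total for v in values]

for component, pcts in shares.items():
    cells = ", ".join(f"{r}: {p:.2f}%" for r, p in zip(ratings, pcts))
    print(f"{component} -> {cells}")
```

Read across components, the shares show how much less often principals use the two lowest categories than the value-added component does.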
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Reliability of a variety of teacher observation implementations
Observation by: reliability coefficient (relative to state test value-added gain); proportion of test variance explained
• Principal – 1: .51; 26.0%
• Principal – 2: .58; 33.6%
• Principal and other administrator: .67; 44.9%
• Principal and three short observations by peer observers: .67; 44.9%
• Two principal observations and two peer observations: .66; 43.6%
• Two principal observations and two different peer observers: .69; 47.6%
• Two principal observations, one peer observation, and three short observations by peers: .72; 51.8%
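The "proportion of test variance explained" figures match the square of the reliability coefficient in each row. The sketch below reproduces them under that reading; the list simply holds the coefficients in row order.

```python
# Reliability coefficients from the table above, in row order.
reliability = [0.51, 0.58, 0.67, 0.67, 0.66, 0.69, 0.72]

# Variance explained, as a percentage, taken as the squared coefficient.
variance_explained = [round(100 * r**2, 1) for r in reliability]

for r, pct in zip(reliability, variance_explained):
    print(f"r = {r:.2f} -> {pct:.1f}% of variance explained")
```

Squaring makes the gains from extra observers look smaller than the coefficients suggest: moving from .51 to .72 roughly doubles the variance explained, but still leaves about half unexplained.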
Non-cognitive factors
Jackson (2012) argues that teachers may have more impact on non-cognitive factors that are essential to student success like attendance, grades, and suspensions.
In education, value-added measurement has focused policy-makers on the teacher’s contribution to academic success, as reflected in test scores.
These are not the only measures that matter, however.
Employing value-added methodologies, Jackson found that teachers had a substantive effect on non-cognitive outcomes that was independent of their effect on test scores:
• Lowered the average student absenteeism by 7.4 days.
• Improved the probability that students would enroll in the next grade by 5 percentage points.
• Reduced the likelihood of suspension by 2.8%
• Improved the average GPA by .09 (Algebra) or .05 (English)
Source: Jackson, K. (2013). Non-Cognitive Ability, Test Scores and Teacher Quality: Evidence from 9th Grade Teachers in North Carolina. Northwestern University and NBER
The evolving evaluation landscape – testing and growth measurement
Two ways tests are used in evaluation and their claims
Value-Added measures
• Produce rankings of teachers relative to each other based on assessment results.
• Introduce controls to account for factors that may influence growth that are outside the teacher’s influence.
• Advance a claim of causation – that the teacher’s ranking is based on learning the teacher caused.
• Can be applied to as few as 20% of the teachers in a school system (Whitehurst, 2013).
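A deliberately simplified sketch of the value-added idea follows: control for prior achievement with a regression, then rank teachers by how far their students land above or below the prediction on average. The records, teacher labels, and function names are hypothetical, and the single control (prior-year score) is an assumption made for brevity. Operational models use many more controls and statistical adjustments, so this is not any state's or vendor's actual method.

```python
from collections import defaultdict

# Hypothetical records: (teacher, prior-year score, current-year score).
records = [
    ("A", 200, 212), ("A", 210, 222), ("A", 220, 232),
    ("B", 200, 206), ("B", 210, 216), ("B", 220, 226),
    ("C", 200, 209), ("C", 210, 219), ("C", 220, 229),
]


def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def value_added(rows):
    """Average residual per teacher after controlling for prior score."""
    priors = [prior for _, prior, _ in rows]
    currents = [current for _, _, current in rows]
    slope, intercept = fit_line(priors, currents)
    residuals = defaultdict(list)
    for teacher, prior, current in rows:
        predicted = intercept + slope * prior
        residuals[teacher].append(current - predicted)
    return {t: sum(r) / len(r) for t, r in residuals.items()}


scores = value_added(records)
# Rank teachers from highest to lowest estimated value-added.
ranking = sorted(scores, key=scores.get, reverse=True)
print(scores, ranking)
```

The output is relative by construction: the estimates average to zero across students, so a teacher's score only says how that teacher's students did compared with similar students of other teachers.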
Whitehurst, G. J. (2013). Teacher value-added: Do we want a ten percent solution? The Brown Center Chalkboard, April 24. Washington, DC: Brookings Institution. Retrieved October 2, 2014, from www.brookings.edu/blogs/brown-center-chalkboard/posts/2013/04/24-merit-pay-whitehurst
Student Learning Objectives
• Are a contract negotiated between the principal and teacher around student results.
• Do not produce rankings that compare teacher results across settings.
• Do not introduce controls to account for factors that may influence growth that are outside the teacher’s influence.
• Do not advance a claim of causation – teacher competence is demonstrated by fulfillment of the contract.
Percent of students who say they do not receive their state accountability test results.
37%
Make Assessment Matter: Students and Educators Want Tests that Support Learning (2014). Portland, OR: NWEA and Grunwald Associates LLC.
Issues in the use of growth and value-added measures
Differences among value-added models
Los Angeles Times Study
Los Angeles Times Study #2
Issues in the use of value-added measures
Control for statistical error
All models attempt to address this issue. Nevertheless, many teachers’ value-added scores will fall within the range of statistical error.
What Makes Schools Work Study – Mathematics
[Figure: scatter plot of the value-added index by teacher, Year 1 (horizontal axis) against Year 2 (vertical axis); both axes run from -10.0 to 15.0]
Data used represents a portion of the teachers who participated in Vanderbilt University’s What Makes Schools Work Project, funded by the federal Institute of Education Sciences
Issues in the use of value-added measures
The choice of value-added model
The choice of model has an important impact on teacher ratings.
Issues in the use of growth measures
The choice of test
Many assessments are not designed to measure growth. Others do not measure growth equally well for all students.
Tests are not equally accurate for all students
[Figure: California STAR compared with NWEA MAP]
Mathematics Growth Index Distribution by Teacher - Validity Filtered
[Figure: average growth index score and range for each teacher, on a vertical axis from -12.00 to 12.00, with teachers grouped into quintiles Q1 through Q5]
Each line in this display represents a single teacher. The graphic shows the average growth index score for each teacher (green line), plus or minus the standard error of the growth index estimate (black line). We removed students who had tests of questionable validity and teachers with fewer than 20 students.
Range of teacher value-added estimates
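The display described above can be approximated with a few lines of code: compute each teacher's mean growth index, attach a band of plus or minus one standard error, and drop teachers with fewer than 20 students. The sketch below uses made-up growth index values and hypothetical teacher labels, and it takes the standard error of the mean as the sample standard deviation divided by the square root of the number of students, which may differ from how the original estimates were produced.

```python
import math
import statistics

MIN_STUDENTS = 20  # class-size cutoff stated in the description above


def summarize(growth_by_teacher, min_students=MIN_STUDENTS):
    """Return {teacher: (mean - SE, mean, mean + SE)} for large-enough classes."""
    summary = {}
    for teacher, scores in growth_by_teacher.items():
        n = len(scores)
        if n < min_students:
            continue  # too few students for a usable estimate
        mean = statistics.mean(scores)
        se = statistics.stdev(scores) / math.sqrt(n)
        summary[teacher] = (mean - se, mean, mean + se)
    return summary


# Hypothetical growth index values (observed growth minus expected growth).
growth_by_teacher = {
    "T1": [3, -1] * 10,       # 20 students, mean 1.0
    "T2": [0.5, -0.5] * 12,   # 24 students, mean 0.0
    "T3": [4, 5, 6, 5, 4],    # 5 students, excluded by the cutoff
}

summary = summarize(growth_by_teacher)
for teacher, (low, mean, high) in summary.items():
    within_error = low <= 0 <= high
    print(f"{teacher}: {mean:.2f} [{low:.2f}, {high:.2f}] band includes zero: {within_error}")
```

A teacher whose band includes zero cannot be distinguished from average on this evidence, which is the situation the presentation describes for many teachers.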
Issues in the use of growth and value-added measures
“Among those who ranked in the top category on the TAKS reading test, more than 17% ranked among the lowest two categories on the Stanford. Similarly more than 15% of the lowest value-added teachers on the TAKS were in the highest two categories on the Stanford.”
Corcoran, S., Jennings, J., & Beveridge, A. (2010). Teacher Effectiveness on High and Low Stakes Tests. Paper presented at the Institute for Research on Poverty summer workshop, Madison, WI.
Three ways tests are used in evaluation and their issues
Student Learning Objectives
• Do not provide evidence of teacher effectiveness.
• Teachers using SLOs may be evaluated against less rigorous criteria than teachers evaluated by value-added methods.
• Goals are not consistent in difficulty.
• Goals are not consistent across teachers.
Ultimately – the principal should decide
• Evaluation inherently involves judgment – not a bad thing.
• Evidence should inform and not direct their judgment.
• The implemented system should differentiate performance.
• Courts respect the judgment of school administrators relative to personnel decisions.
If evaluators do not differentiate their ratings, then all differentiation comes from the test.
The importance of non-cognitive factors in teacher evaluation
Solving one problem can sometimes create another.
Suggested reading
Baker, B., Oluwole, J., & Green, P. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race to the Top era. Education Policy Analysis Archives, 21(5).