NWEA's Connecticut presentation on teacher evaluation and goal setting.
John Cronin, Ph.D., Director
The Kingsbury Center @ NWEA
Implementing the Connecticut teacher evaluation system
Presenter - John Cronin, Ph.D.
Contact us: Rebecca Moore, 503-548-5129; E-mail: rebecca.moore@nwea.org
Helping teachers set reasonable and rigorous goals
What you’ll learn
• The purposes of teacher evaluation. There can be different purposes for different educators.
• The value of differentiating educator evaluations and the risks associated with not differentiating.
• The value of multiple measurements, and the importance of defining a purpose for each measurement.
• The information needed to determine whether a goal is attainable.
• The difference between “aspirational” and “evaluative” goals and the value of each.
• Thoughts on strategies for addressing the difficult to measure.
What’s the purpose?
• The Connecticut system requires a collaborative process.
• For most educators the purpose of evaluation is formative.
• For a small minority of educators the purpose is summative, and goals may involve demonstrating basic competence.
• Leaders should be transparent about the purpose of the process for each educator.
• Perfect consistency isn’t necessarily a requirement.
Differences between principal and teacher evaluation
Principals
• Inherit a pre-existing staff.
• Have limited control over staffing conditions.
• Work with this intact group from year to year.
Thus principals should be evaluated on their ability to improve growth, or to maintain high levels of growth over time, rather than on their students’ growth within a single school year.
Differences between principal and teacher evaluation
Teachers
• Have new groups of students each year.
• Generally work with those students for one school year only.
• New teachers generally become more effective in their first three years.
The difference between formative and summative evaluation
• Formative evaluation is intended to give educators useful feedback to help them improve their job performance. For most educators formative evaluation should be the focus of the process.
• Summative evaluation is a judgment of educator performance that informs future decisions about employment, including the granting of tenure, performance pay, and protection from layoff.
Purposes of summative evaluation
• An accurate and defensible judgment of an educator’s job performance.
• Ratings of performance that provide meaningful differentiation across educators.
• Goals of evaluation
– Support professional improvement
– Retain your top educators
– Dismiss ineffective educators
If evaluators do not differentiate their ratings, then all differentiation comes from the test.
If performance ratings aren’t consistent with school growth, that will probably be public information.
Results of Tennessee Teacher Evaluation Pilot
[Chart: percent of teachers receiving each rating (1–5), observation results versus value-added results; y-axis 0%–60%]
Results of Georgia Teacher Evaluation Pilot
[Chart: distribution of evaluator ratings – Ineffective, Minimally Effective, Effective, Highly Effective]
Connecticut expectations around teacher observation
• First and second year teachers
– Required: 3 formal observations
– Recommended: 3 formal observations and 3 informal observations
• Below standard and developing
– Required: 3 formal evaluations
– Recommended: 3 formal evaluations and 5 informal evaluations
• Proficient and exemplary
– Required: 3 formal evaluations, 1 in class
– Recommended: 3 formal evaluations, 1 in class
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Model weights | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained
• Model 1 – State test 81%, student surveys 17%, classroom observations 2% | .51 | 26.0%
• Model 2 – State test 50%, student surveys 25%, classroom observations 25% | .66 | 43.5%
• Model 3 – State test 33%, student surveys 33%, classroom observations 33% | .76 | 57.7%
• Model 4 – Classroom observations 50%, state test 25%, student surveys 25% | .75 | 56.2%
Reliability of evaluation weights in predicted stability of student growth gains year to year
Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project’s Three-Year Study.
Observation approach | Reliability coefficient (relative to state test value-added gain) | Proportion of test variance explained
• Principal, 1 observation | .51 | 26.0%
• Principal, 2 observations | .58 | 33.6%
• Principal and other administrator | .67 | 44.9%
• Principal and three short observations by peer observers | .67 | 44.9%
• Two principal observations and two peer observations | .66 | 43.6%
• Two principal observations and two different peer observers | .69 | 47.6%
• Two principal observations, one peer observation, and three short observations by peers | .72 | 51.8%
Reliability of a variety of teacher observation implementations
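In both tables, the “proportion of test variance explained” is simply the square of the reliability coefficient (r²). A minimal Python check of that relationship:

```python
# Variance explained is the square of the reliability (correlation)
# coefficient, expressed as a percentage.
def variance_explained(r: float) -> float:
    """Return r squared as a percentage, rounded to one decimal place."""
    return round(r * r * 100, 1)

for r in (0.51, 0.58, 0.66, 0.67, 0.69, 0.72):
    print(f"r = {r:.2f} -> {variance_explained(r)}% of variance explained")
```

This reproduces the tabled values (e.g. .51 → 26.0%, .72 → 51.8%); tiny discrepancies such as .76 giving 57.8% where the slide shows 57.7% are just rounding.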
Why should we care about goal setting in education?
Because we want students to learn more!
• Research view: setting goals improves performance.
The testing to teacher evaluation process:
Testing → Metric (Growth Score) → Analysis (Value-Added) → Evaluation (Rating)
The difference between growth and improvement
[Chart: Mathematics – number of students by fall RIT score, grouped Up / Down / No Change]
One district’s change in 5th grade math performance relative to Kentucky cut scores
[Chart: Mathematics – number of students by fall score, grouped Met growth target / Failed growth target]
Number of students who achieved the normal mathematics growth in that district
Issues in the use of growth measures
Measurement design of the instrument
Many assessments are not designed to measure growth. Others do not measure growth equally well for all students.
Tests are not equally accurate for all students
[Charts: measurement accuracy across the score range – California STAR versus NWEA MAP]
Issues with rubric-based instruments
• Rubrics should be granular enough to show growth. Four-point rubrics may not be adequate.
• High and low scores should be written in a manner that sets a reasonable floor and ceiling.
Expect consistent inconsistency!
Inconsistency occurs because of:
• Differences in test design.
• Differences in testing conditions.
• Differences in the models applied to evaluate growth.
The reliability problem – Inconsistency in testing conditions
[Diagram: test–retest design – Test 1 and Test 2, each administered at Time 1 and Time 2]
The problem with spring-spring testing
[Timeline: 3/11 through 3/12 – Teacher 1 (spring 2011), summer, Teacher 2 (through spring 2012)]
The testing to teacher evaluation process:
Testing → Metric (Growth Score) → Analysis (Value-Added) → Evaluation (Rating)
Issues in the use of growth and value-added measures
Differences among value-added models
Los Angeles Times Study
Los Angeles Times Study #2
Issues in the use of value-added measures
Control for statistical error
All models attempt to address this issue. Nevertheless, many teachers’ value-added scores will fall within the range of statistical error.
Issues in the use of growth and value-added measures
Control for statistical error
New York City
New York City #2
Mathematics Growth Index Distribution by Teacher – Validity Filtered
[Chart: average growth index score and range for each teacher, grouped into quintiles Q1–Q5; vertical axis from -12.00 to 12.00]
Each line in this display represents a single teacher. The graphic shows the average growth index score for each teacher (green line), plus or minus the standard error of the growth index estimate (black line). We removed students who had tests of questionable validity and teachers with fewer than 20 students.
Range of teacher value-added estimates
What’s a SMART goal?
• Specific
• Measurable
• Attainable
• Relevant
• Time-Bound
Specific
• What: What do I want to accomplish?
• Why: What are the reasons or purposes for pursuing the goal?
• Which: What are the requirements and constraints for achieving the goal?
SMART goal resources
• National Staff Development Council. Provides a nice process for developing SMART goals.
• Arlington Public Schools. Excellent and detailed examples of SMART goals across subject disciplines, including art and music.
• The Handbook for SMART School Teams, by Anne Conzemius and Jan O’Neill.
Issues with local tests and goal setting
• Validity and reliability of assessments.
• Teachers and administrators are unlikely to set goals that are inconsistent with their current performance.
• It is difficult to set goals without prior evidence or context.
The goal should ALWAYS be improvement in a domain (subject)!
Specific
There should ALWAYS be multiple data sources and metrics.
Specific
Data should be triangulated
• Classroom assessment data to standardized test data.
• Domain data (mathematics) to sub-domain data (fractions and decimals) to granular data (division with fractions).
All students should be “in play” relative to the goal.
Specific
Measurable
Types of Goals
• Performance – 75% of the students in my 7th grade mathematics class will achieve the qualifying score needed for placement in 8th grade Algebra.
• Growth – 65% of my students will show growth on the OAKS mathematics test that is greater than the state reported norm.
• Improvement – Last year 40% of my students showed growth on the OAKS mathematics test that was greater than the norm. This year 50% of my students will show greater than normal growth.
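The three goal types can be expressed as simple threshold checks on class-level data. A hedged sketch with made-up numbers (the thresholds mirror the example goals; the class percentages are hypothetical):

```python
# Hypothetical class data (illustrative only).
pct_qualifying = 0.78            # share reaching the Algebra qualifying score
pct_above_norm = 0.52            # share growing faster than the state norm
pct_above_norm_last_year = 0.40  # last year's baseline for the same measure

# Performance goal: 75% of students reach the qualifying score.
performance_goal_met = pct_qualifying >= 0.75
# Growth goal: 65% of students exceed the state growth norm.
growth_goal_met = pct_above_norm >= 0.65
# Improvement goal: raise last year's 40% above-norm rate to 50%.
improvement_goal_met = pct_above_norm >= 0.50

print(performance_goal_met, growth_goal_met, improvement_goal_met)  # True False True
```

Note that the same class can meet an improvement goal while missing a growth goal: the bar an improvement goal sets depends on the prior year’s baseline, not on an absolute standard.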
Attainable
The goals set should be reasonable and rigorous. At minimum, they should represent a level of performance that a competent educator could be expected to achieve.
An analogy to baseball
Center Fielders – WAR (Wins Above Replacement)
• Superstar – Mike Trout, Los Angeles Angels
• Median major leaguer – Gregor Blanco, San Francisco Giants
• Marginal major leaguer – Chris Young, Oakland A’s
[Chart values: 6.4, 6.1, 1.7, 0.0, -1.3]
The difference between aspirational and evaluative goals
Aspirational – I will meet my target weight by losing 50 pounds during the next year and sustain that weight for one year.
Proficient – I intend to lose 15 pounds in the next six months, which will move me from the “obese” to the “overweight” category, and sustain that weight for one year.
Marginal – I will lose weight in the next six months.
Ways to evaluate the attainability of a goal
• Prior performance
• Performance of peers within the system
• Performance of a norming group
One approach to evaluating the attainment of goals.
Students in La Brea Elementary School show mathematics growth equivalent to only 2/3 of the average for students in their grade.
Level 4 – (Aspirational) Students in La Brea Elementary School will improve their mathematics growth to 1.5 times the average for their grade.
Level 3 – (Proficient) Students in La Brea Elementary School will improve their mathematics growth to be equivalent to the average for their grade.
Level 2 – (Marginal) Students in La Brea Elementary School will improve their mathematics growth relative to last year.
Level 1 – (Unacceptable) Students in La Brea Elementary School do not improve their mathematics growth relative to last year.
Is this goal attainable?
62% of students at John Glenn Elementary met or exceeded proficiency in Reading/Literature last year. Their goal is to improve their rate to 82% this year. Is the goal attainable?
[Chart: number of Oregon schools by change in proficiency – growth > -30%: 362; > -20%: 351; > -10%: 291; > 0%: 173; > 10%: 73; > 20%: 14; > 30%: 3]
Oregon schools – change in Reading/Literature proficiency 2009-10 to 2010-11 among schools that started with 60% proficiency rates
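Reading the bars cumulatively (each count being the schools whose proficiency change exceeded that threshold), the rarity of a 20-point gain can be checked with simple arithmetic. A back-of-the-envelope sketch, assuming that cumulative reading:

```python
# Counts from the Oregon chart, read as cumulative:
# schools whose proficiency change exceeded each threshold.
schools_above_minus_30 = 362  # roughly all comparable schools
schools_above_plus_20 = 14    # schools that gained more than 20 points

share = schools_above_plus_20 / schools_above_minus_30
print(f"{share:.1%} of comparable schools gained more than 20 points")  # 3.9%
```

Under that reading, moving from 62% to 82% proficiency in one year was achieved by only about 4% of comparable Oregon schools.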
Is this goal attainable and rigorous?
45% of the students at La Brea elementary showed average growth or better last year. Their goal is to improve that rate to 50% this year. Is their goal reasonable?
[Chart: students with average or better annual growth in the Repus school district, 0%–100% – LaBrea versus the district average]
The selection of metrics matters
Students at LaBrea Elementary School will show growth equivalent to 150% of grade level.
Students at Etsaw Middle School will show growth equivalent to 150% of grade level.
[Chart: scale score growth (growth index) relative to NWEA’s growth norm in mathematics, grades 2–9]
[Chart: percent of a year’s growth in mathematics, grades 2–9; y-axis 0%–200%]
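Because typical annual scale-score growth shrinks in the upper grades, a goal stated as a percent of a year’s growth implies very different absolute gains at different grade levels, which is why the same target can be easy for a middle school and demanding for an elementary school. A sketch with hypothetical norm values (not NWEA’s actual figures):

```python
# Hypothetical norm growth in scale-score points per year; NWEA's
# published norms differ. Typical annual growth declines by grade.
norm_growth = {3: 10.0, 5: 8.0, 8: 4.0}  # grade -> typical annual gain

def percent_of_year(observed_gain: float, grade: int) -> float:
    """Express observed growth as a percent of a normal year's growth."""
    return observed_gain / norm_growth[grade] * 100

# The same 6-point gain is 60% of a year's growth in grade 3
# but 150% of a year's growth in grade 8.
print(percent_of_year(6.0, 3))  # 60.0
print(percent_of_year(6.0, 8))  # 150.0
```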
Assessing the difficult to measure
• Encourage use of performance assessment and rubrics.
• Encourage outside scoring
– Use peers in other buildings, professionals in the field, contest judges
• Make use of resources
– Music educator, art educator, and vocational professional associations
– Available models – the AP art portfolio
– Use your intermediate agency
– Work across buildings
• Make use of classroom observation.
Success can’t be replicated if you don’t know why you succeeded.
Failure can’t be reversed if you don’t know why you failed.
The outcome is important, but why the outcome occurred is equally important!
• Establish checkpoints and check in beyond what’s required when possible.
• Collect implementation information
– Classroom observations and visits
– Teacher journal
– Student work and artifacts
These processes can be done by teacher peers!
Presenter - John Cronin, Ph.D.
Contact us: NWEA Main Number, 503-624-1951; E-mail: rebecca.moore@nwea.org
Thank you for attending