Journal of Personnel Evaluation in Education 12:3 257-267, 1998
© 1998 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands
Student Achievement and School and Teacher Accountability
ROBERT L. MENDRO
Dallas Public Schools
Abstract
This article is part of a set of papers generated from a keynote presentation by Dr. Jack Frymier at the 1997
CREATE annual meeting. Dr. Frymier dealt with several reasons that, as he saw it, invalidate the use of student
achievement data in teacher accountability systems. This article first notes problems with Dr. Frymier's
conception of accountability. Next, it summarizes some of the recent evidence showing the strong connection
between school and teacher effectiveness measures and student achievement. It then notes some of the benefits of
school and teacher effectiveness measures external to their function as measures of performance. Next, policy
issues arising from the use of student data and the associated research are considered. Finally, it concludes with
some cautions about using effectiveness measures in teacher accountability systems.
Dr. Frymier's Conception of Accountability
Dr. Jack Frymier's paper on "Accountability and Student Learning" presents an
interesting but untenable position. He draws what seem to be preordained conclusions
regarding accountability that just do not hold in the light of evidence (Frymier, 1997). This
article first argues for the link between accountability and student performance in an
overall response to Dr. Frymier. It then moves to a discussion of the existing data
regarding tying school and teacher performance to student performance in a value-added
context. Finally, it concludes with a discussion of issues that come to the forefront in light
of this research, including some of the many benefits that can be derived from value-added
measures and some cautions with regard to their use in teacher accountability systems.
Dr. Frymier basically holds that teachers can be held accountable only for their own
performance and not for the performance of their students. Regardless of the many ways
he restates his position, it is always a reiteration of this basic theme. There are three
arguments presented in support of this theme. The first is the argument of tradition.
According to his illustrations, the British tried payment according to student outcome in
the 1800s and rejected it, and, therefore, so should we. Also, the ancient Greeks believed
individuals were responsible only for their own behavior and this, perforce, excludes
teachers from being held accountable for any part of their students' behavior. Next, the
idea of teacher accountability through student achievement is rejected on a supposed
violation of legal principles. Legally, he claims, even parents cannot be held responsible
for the behavior of their own children, so teachers cannot be held responsible for their
students' behavior. Finally, he produces an unusual twist of the concept of locus of control
for an argument based on psychology. Holding teachers responsible for student outcomes
will give students an external locus of control, which will turn them into mindless
automatons. As he has applied them, all three positions are false. Let us examine
each in a little more detail.
The first argument, that of tradition, falls apart on analysis. Regardless of who has tried
it and failed (the British, the ancient Greeks, or anyone else), new information based on
solid research can invalidate tradition or, at a minimum, indicate that tradition needs to be
challenged. As is demonstrated below, such information now exists. Regardless of what
the Greeks held personal responsibility to be, we are dealing with organizations and
individuals given the specific task of influencing the learning of students. Their
responsibility for doing so and their accountability for the task are based on traditions,
laws, and information available since the time of the ancient Greeks, thus rendering the
opinion of the Greeks irrelevant. Of course, there are other problems with these analogies.
To consider just a few with the British example, their school system was entirely different
from a modern system, and the status of employees was legally different. The curriculum
was much less well defined, and there were entirely different ways of measuring the
outcomes of schooling.
We also consider the Greeks in connection with the legal argument for parental responsibility.
Here, Dr. Frymier has ignored a number of precedents where parents are being held
responsible for their children's truancy and a number of other individual behaviors.
Further, responsibility for the entire range of behavior of an individual is confused with the
limited responsibility of a teacher for, among other things, the learning, social behavior in
school, and academic performance of students. For example, from a legal standpoint, the
Texas legislature (and, no doubt, others) certainly holds the legal position that teachers are
responsible for their students' academic behavior, since, in 1995, it passed a law that
mandated the inclusion of the academic performance of a teacher's students as a part of all
teacher evaluation systems in the state (Senate Bill 95-1, 1995). The ancient Greeks may
or may not have disagreed with this stance, but since their time, new data regarding
behavior have come to light. Certainly, knowledge of the ability to manipulate behavior
through positive and negative reinforcement (Skinner, 1953) and the in loco parentis
function of teachers and the schools relative to their students' behavior and well-being
affect these conclusions. When these are combined with the teacher's obligation to spend
a minimum amount of time manipulating academic behavior, the conception of the Greeks
delivering schools and teachers from responsibility for student achievement collapses.
Locus of control has been stood on its head in order to absolve teachers of responsibility
for their students' achievement. Locus of control is a part of attribution theory, which
posits that individuals perceive (general) achievement-related situations as falling along a
continuum from uncontrollable to totally controllable. Recent research indicates that the
teacher can be directly responsible for instilling an internal locus of control in a student
(U.S. Department of Education, 1992; Freeman & Sokoloff, 1995). Thus, teachers are
shown to have responsibility for developing locus of control in their students as part of
enhancing student achievement. Regardless of the distortion of locus of control research, a
false dichotomy is offered with the concept. Locus of control is situation specific, and it is
possible for a student to perceive a different locus of control with each task, assignment, or
goal. Thus, locus of control is not the either/or situation argued. Finally, one of the
demonstrated tenets of the standards-based education movement is that students feel quite
comfortable with rational, achievable standards as long as they are allowed to have a
degree of control over how they reach the standard (Cross & Joftus, 1997; Mendro, 1997).
Thus, it is possible for students to allow a different locus of control for different parts of a
larger complex task as opposed to the all or nothing alternative presented to the reader.
Data in Support of Using Student Achievement in School and Teacher Accountability
The real issue is to determine what data are available that bear on the issue of school and
teacher accountability. The use of student achievement data as a part of school and teacher
accountability has been debated for some time, but there have been few practical
applications free from known biases until the last decade (Millman, 1997). The debate has
centered around the need for including multiple outcome variables and for controlling
exogenous influences. The use of multiple-outcome variables is a two-part problem. The
first part centers on the need for multiple-outcome variables that are related to important
educational goals (see, for example, Murnane, 1987; David, 1987). Here goals must be
decided with input from stakeholders. The second part of the multiple-outcome question
is mainly a question of resources on the part of the district involved (Webster &
Schuhmacher, 1973). Resources are needed to maintain the extensive databases and to take
the multiple measures required.
The variables used in the school effectiveness model used in the Dallas Public Schools
are chosen by a group known as the Accountability Task Force. This group consists
of principals, teachers, parents, community members, business representatives, and
administrators. The Task Force makes the final decisions on all variables to be included in
the School Effectiveness Indices. In addition, the District has performance awards for
those schools that excel as measured by the School Effectiveness Indices. The Task Force
sets the rules under which a school will receive an award (for example, the school must test
at least 95 per cent of its non-special education students) and serves as the final arbiter
for all appeals regarding the indices and the awards. The awards serve as one type of
employee bonus. Each professional employee in an awarded school receives a $1,000
bonus. Each support employee receives $500, and the school receives $2,000 for its
activity fund. The variables currently designated for inclusion in the indices are listed in
Table 1, together with the weights assigned to them. When the weights are
considered, the variables are primarily test data. For example, in an elementary
school with grades 1 through 6, test scores have weights totaling 89, attendance totals 6,
and promotion rate 1. The Accountability Task Force intended these weights to serve as the
relative value of the outcome measures for these schools. Indeed, where parents and
community members discuss goals, our experience has been that more emphasis is placed
on achievement as the primary legitimate outcome of schooling.
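As an illustrative check of the totals quoted above, the elementary weights from Table 1 can be tallied for a school with grades 1 through 6. This is only a sketch: it assumes that the TAAS reading and mathematics tests are given in grades 3 through 6, an assumption made here because it reproduces the figures in the text, and all variable names are hypothetical.

```python
# Elementary weights as listed in Table 1. "Per grade" weights count once
# for every grade that takes the measure.
GRADES = range(1, 7)       # grades 1 through 6, as in the example in the text
TAAS_GRADES = range(3, 7)  # ASSUMPTION: TAAS reading/math given in grades 3-6

test_weights = {
    "ITBS Reading": 4 * len(GRADES),
    "ITBS Mathematics": 4 * len(GRADES),
    "TAAS Reading": 5 * len(TAAS_GRADES),
    "TAAS Mathematics": 4 * len(TAAS_GRADES),
    "TAAS Writing (grade 4 only)": 5,
}
attendance_weight = 1 * len(GRADES)  # 1 per grade
promotion_weight = 1                 # 1 per school

test_total = sum(test_weights.values())
grand_total = test_total + attendance_weight + promotion_weight
test_share = test_total / grand_total

print("test scores:", test_total)
print("attendance:", attendance_weight)
print("promotion rate:", promotion_weight)
print(f"share of total weight carried by tests: {test_share:.1%}")
```

Under that assumption, test scores carry 89 of the 96 weight points, or about 93 per cent of the total.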
Table 1. Weighting of Criterion Measures by the Accountability Task Force of the Dallas Public Schools.

Elementary Criteria
  Criterion                               Weight
  ITBS: Reading                           4/Grade
  ITBS: Mathematics                       4/Grade
  TAAS: Reading                           5/Grade
  TAAS: Mathematics                       4/Grade
  TAAS: Writing, Grade 4                  5
  Attendance                              1/Grade
  Promotion Rate                          1/School

Middle School Criteria
  Criterion                               Weight
  ITBS: Reading                           4/Grade
  ITBS: Mathematics                       4/Grade
  TAAS: Reading                           5/Grade
  TAAS: Mathematics                       4/Grade
  TAAS: Writing, Grade 8                  5
  TAAS: Science, Grade 8                  1
  TAAS: Social Studies, Grade 8           1
  Attendance                              1/Grade
  Promotion rate                          1/School
  Dropout rate                            1/School
  Percentage in honors courses            2/School

High School Criteria
  Criterion                               Weight
  TAP: Reading, Grade 9                   6
  TAP: Mathematics, Grade 9               6
  TAAS: Reading, Grade 10                 12
  TAAS: Mathematics, Grade 10             12
  TAAS: Writing, Grade 10                 12
  ACP: Language                           8/School
  ACP: Mathematics                        8/School
  ACP: Science                            8/School
  ACP: Social Studies                     8/School
  ACP: Reading imp, Grade 9               2
  ACP: Honors/Advanced                    3/School
  ACP: Foreign Language                   2/School
  SAT/ACT, Grade 12                       4
  PSAT: Verbal                            1/School
  PSAT: Mathematics                       1/School
  Graduation rate                         5/School
  Percentage taking SAT/ACT               5/School
  Percentage taking PSAT                  3/School
  Percentage in honors courses            5/School
  Percentage in AP courses                4/School
  Percentage taking AP and passing        1/School

The stumbling block to the analysis of these data has been the known influence of
variables outside the control of the school or teacher (Webster & Mendro, 1997). For
example, the Dallas Public Schools system controls for the effects of a combined ethnicity
and language proficiency variable, student gender, four variables measuring
socioeconomic status (free or reduced lunch participation, census family income, census
poverty level, and census college participation), and the complete interactions of ethnicity
or language, gender, and free lunch.
Although regression analysis has been proposed for years as a remedy for controlling
these influences (see, for example, Felter & Carlson, 1985; Kirst, 1986), standard multiple
linear regression has shown problems controlling known influences in the higher levels of
multilevel data (Webster & Olson, 1988; Webster, Mendro, Orsak & Weerasinghe, 1998).
Only with the widespread availability of software and techniques for the analysis of
sophisticated multilevel models such as hierarchical linear modeling (HLM) (Raudenbush
& Bryk, 1989) or with the statistical developmental efforts commissioned by at least one
state Department of Education (Sanders & Horn, 1993) have school personnel had the
statistical tools available to control influences on multiple levels of variables and their
associated bias. For example, at the school level, the Dallas Public Schools model
simultaneously controls the effects of mobility, crowding, average family income, average
family educational level, average family poverty level, per cent on free lunch, per cent
SOL students, per cent African American students, per cent Hispanic students, per cent
minority students, and per cent of teacher days for vacant positions, while controlling the
student variables listed above.
With the initiation of sophisticated models for multilevel data analysis, several carefully
implemented systems for determining school and teacher effectiveness have been
developed and implemented (Sanders, Saxton & Horn, 1997; Webster & Mendro, 1997).
Further, Webster et al. (1998) have laid out clear criteria for the evaluation of the level of
bias in these models. They note that the final criterion must be the degree to which these
models control the correlation of variables known to be related to the outcome variables
but not under control of the school or teacher (SES, for example) and the effectiveness
outcomes derived from the models. The degree to which these correlations are nonzero is
the degree to which a model is biased. Furthermore, these correlations must be controlled
at both the student and school or classroom level.
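The bias criterion just described can be illustrated with a small sketch. The data below are hypothetical, and the adjustment is a one-predictor least-squares fit standing in for the multilevel models discussed in the text; it is not the Dallas or Tennessee procedure. The function names pearson and ols_residuals are illustrative only.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

def ols_residuals(xs, ys):
    """Residuals of y after a one-predictor least-squares fit on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]

# HYPOTHETICAL data: one SES value and one mean test score per school.
ses = [float(i) for i in range(12)]
school_effect = [3, -2, 1, -4, 2, 0, -1, 4, -3, 2, -2, 0]
raw_score = [50 + 2.0 * s + e for s, e in zip(ses, school_effect)]

# The "effectiveness index" here is simply the residual after adjusting for SES.
adjusted_index = ols_residuals(ses, raw_score)

# Bias check: correlation between SES and each candidate outcome measure.
bias_raw = pearson(ses, raw_score)
bias_adjusted = pearson(ses, adjusted_index)

print("correlation of SES with raw scores:", round(bias_raw, 3))
print("correlation of SES with adjusted index:", round(bias_adjusted, 3))
```

In this toy case the unadjusted school means correlate strongly with SES, while the residual-based index does not. With a single-level fit the in-sample residual correlation is zero by construction, so the check is most informative at a level the model did not fit directly, for example when student-level residuals are averaged by school.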
Research by Sanders and Rivers (1996), Jordan, Mendro, and Weerasinghe (1997), and
Bembry, Jordan, Gomez, Anderson, and Mendro (1998) has demonstrated the effects of
teachers on student achievement. They show that there are large additive components in
the longitudinal effects of teachers, that these effects are much larger than expected, and
that the least effective teachers have a long-term influence on student achievement that is
not fully remediated for up to three years later. Finally, both Dallas studies show a
selection bias where lower-achieving students are more likely to be put with lower
effectiveness teachers and vice-versa. Thus, the negative effects of less effective teachers
are being visited on students who probably need the most help.
Further, the research done in Tennessee by Sanders and Rivers and that done in Dallas
by Jordan et al. and Bembry et al. reveal near identical results across three different
populations, different methods of computing teacher effectiveness, different analysis
methods, and different assessment measures (Bembry et al., 1998). Clearly, these data
show that groups of ineffective teachers can be readily identified as the first step in
remediating their poor performance in influencing student achievement and that the effects
of ineffective and effective teachers cross populations of students and the measures used to
obtain effectiveness indices.
These points are important because they expose myths about the influence of teachers
on student achievement. The result that showed losses relative to ineffective teachers
to be large contradicts the first myth, which is that if a teacher doesn't get very good
achievement gains in one year, it is of little importance because the losses are small. The
research in Dallas shows that with students of average prior achievement levels, groups of
students can lose as much as twenty norm-referenced percentile points in a year. Across
three or four years, students with ineffective teachers each year can perform at a level fifty
percentile points lower than students with effective teachers. The data on longitudinal
effects of an ineffective teacher void a second myth about teachers and achievement,
which is that if teacher A doesn't do a good job this year, teacher B will make up for it next
year. The Dallas data strongly suggest that negative effects of a teacher in the bottom third
of effectiveness last through three years of teachers in the top third of effectiveness. These
data have immediate policy issues regarding teacher evaluation and student achievement,
which are discussed below.
These effectiveness systems are not without their detractors (Darling-Hammond, 1997;
Glass, 1990; Sykes, 1997; Thum & Bryk, 1997). However, their objections do not explain
away the data and the remarkable consistency across sites. They focus, instead, on
philosophical objections, on hypothetical situations unrelated to the real data, or on ideal
alternative models that have not been tested. Typically, they make the assumption that the
data will be used inappropriately in a scenario of their devising and condemn the models
from that standpoint. Without a substantive objection to the data or the analyses presented
in the research, these arguments are of little value in advancing the discussion on the use of
teacher and school effectiveness models.
An editorial comment on an earlier version of this article focused on a major bone of
contention regarding the most appropriate tests for use in identifying effective and
ineffective schools and teachers in relation to student achievement. As noted, the Dallas
system uses a number of variables in its system for determining effective schools. Norm-
referenced, criterion-referenced, and curriculum-referenced tests, student attendance,
dropout rates, participation in honors courses, and other variables are used in determining
the effectiveness of schools. For teachers, only test data are used. At some grades and
in some subjects, as many as four different sets of test results are used. At some grades,
only norm-referenced tests are used. Some would prefer that the assessment use only
"authentic" measures (Darling-Hammond, 1997). We find that, for our purposes, any
reliable measure related to the general or specific learning outcomes, including
standardized norm-referenced tests, provides good information for beginning an
improvement process. To rephrase this, because it is an important point, we find
norm-referenced tests to be sufficient for the initial identification of potentially outstanding
teachers and schools and potentially ineffective teachers and schools.
We believe, with Haney and Madaus (1989), that the critical element is not the type of
test (provided the test is reliable and a measure of the subject taught by the teacher) but the
use to which the test is put. The Dallas Classroom Effectiveness Indices are intended to be
used as a starting point for identifying groups of teachers who are effective or ineffective
relative to their students' measured achievement, not as a final outcome measure of teacher
effectiveness.
After the results are verified, they can be used to learn about effective teachers (as
described in the next section) or to investigate the reasons for ineffective student results as
part of assisting a teacher to improve. Only when numerous sources, including Classroom
Effectiveness Indices, confirm ineffective teaching should the results be used in making
decisions about changing ineffective behaviors. Only when numerous sources confirm
effectiveness should we leave good teachers alone or attempt to learn from them. In some
cases, a pattern of effective or ineffective results across several years will give more
credence to the test information, but a single result has to be verified. We realize that this is
no more than the good advice given for using the results of any assessment, but critics of
the use of test scores assume, sometimes correctly, that test results will be used
inappropriately. Thus, even this elemental caution is in order. We strongly discourage
using any test result or effectiveness measure in isolation.
Results and Uses Associated with the Dallas Effectiveness Indicators
The Dallas Public Schools (DPS) initiated a value-added system of identifying effective
schools in 1991 to 1992. In 1994 to 1995, the DPS started computing effectiveness at the
classroom level and formally adopted Classroom Effectiveness Indices as part of a revised
teacher evaluation system in 1996 to 1997 after extensive input from all levels of district
personnel and community members. The system is described in Webster, Mendro,
Bearden, Bembry, and Jordan (1997). Over the six years the school portion of the system
has been in effect, a primary question was the impact that the system has had on variables
measured by the system. The primary variables in the system where comparable data are
available were achievement on the norm-referenced ITBS, on the state-mandated
criterion-referenced Texas Assessment of Academic Skills, and on the SAT; dropout rate;
graduation rate; and the percentage of students in honors courses in grades 7 to 8 and 9 to
12. (Attendance was included but is not considered here since a separate attendance
initiative with financial rewards was in effect during the same time period.) All variables
showed a general increasing trend (or decreasing trend for dropout rate) except the mean
SAT scores, which decreased, and the grades 7 to 8 honors course enrollment, which
decreased significantly from 1991 to 1993 and then has shown a steady rise (Webster et al.,
1997). The mean SAT scores were impacted by increased participation on the SAT test. A
separate analysis of these scores by quintile of the graduating class has shown these scores
to be stable within quintile. These are obviously correlational and not causal data, but they
do not contradict the possibility of a positive trend related to the introduction of indices.
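The way increased participation can lower a mean while scores within each quintile stay stable can be shown with a short worked example. All numbers below are hypothetical and chosen only to illustrate the mechanism; they are not Dallas data, and the names are illustrative.

```python
# HYPOTHETICAL mean SAT score for each class-rank quintile (1 = top fifth).
# The same means are used for both years, i.e. "stable within quintile".
quintile_mean = {1: 1150, 2: 1000, 3: 900, 4: 820, 5: 750}

# HYPOTHETICAL number of test takers per quintile in an early and a later year.
# Participation grows in every quintile, but most of all in the lower ones.
takers_early = {1: 200, 2: 150, 3: 80, 4: 30, 5: 10}
takers_later = {1: 210, 2: 180, 3: 140, 4: 90, 5: 50}

def overall_mean(takers):
    """Mean over all test takers, weighting each quintile by its participants."""
    total = sum(takers.values())
    weighted = sum(quintile_mean[q] * n for q, n in takers.items())
    return weighted / total

mean_early = overall_mean(takers_early)
mean_later = overall_mean(takers_later)

print("overall mean, early year:", round(mean_early, 1))
print("overall mean, later year:", round(mean_later, 1))
```

The overall mean falls even though every quintile's mean is unchanged, because a larger share of the later test takers comes from the lower quintiles.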
More important have been the uses of the indices data. One of the primary uses is to
reward effective schools through the use of performance awards based on the school
indices. The administration uses school indices as one of the indicators in determining
the retention of principals at ineffective schools. The experience of the DPS is that the
quickest way to change the effectiveness of a school, for better or worse, is to change the
principal. Preliminary research that DPS is in the process of conducting has so far strongly
supported this conclusion.
The indices for teachers and schools serve as the basis of many evaluations, particularly
where there are no specific control groups. They have been used to identify characteristics
of effective and ineffective schools (Webster et al., 1997). In this study, effective and
ineffective schools were sampled, and evaluators identified three consistent characteristics
of effective schools. First, effective schools had achievement as a major focus. Second, the
staff in effective schools expected students to achieve. (This is different from believing all
students can learn but not expecting it.) Finally, the principals of effective schools did not
tolerate ineffective teachers. Ineffective teachers were expected to change, or they were
removed. Outside of these principles and a few others noted that were less prominent, the
effective schools all differed in terms of atmosphere, management styles, and other
dimensions.
The classroom indices have also been used in the evaluation of the District's
mathematics program (Bearden, 1997). Research conducted in effective teachers' classes
found that effective teachers knew subject matter, taught the entire range of the curriculum
(including higher-order thinking skills) with equal emphasis, and assessed students
frequently through formal and informal methods. Beyond this, teacher styles varied
widely. Thus, the payoff of effectiveness indices may extend beyond the effect on the
variables in the accountability system and provide additional knowledge and information
about the ingredients that bring effectiveness about.
Policy Issues Associated with Teacher Effectiveness
Bembry et al. (1998) discuss the policy issues that are raised by current teacher
effectiveness research. Clearly, equity in student access to a quality education is the most
important issue that emerges. The devastating student outcomes obtained by ineffective
teachers demand that a district remedy the effects of these teachers on students or, if
that is not possible, remove those teachers from the district. In the past, less extensive
analyses or the lack of analysis of student achievement data allowed principals and
administrators to believe that poor teachers were doing little harm or little long-lasting
harm. The import of the effectiveness research is that these teachers are doing students
long-term harm and it is possible to identify potential groups of teachers having this effect.
This provides a moral imperative to investigate these teachers and to attempt to correct
their behaviors if they consistently have these effects on students. Schools and districts can
no longer hide behind misinformation or no information.
A second issue involving equity is the type of help to provide students who have had an
ineffective teacher in the past. We now know, as a result of our research, that students
who are placed with an ineffective teacher suffer long-term negative effects. Carrying
ineffective teachers also limits the number of effective teachers that can be hired. The
policy issue is how to allocate resources for better teaching. This is particularly a problem
because of the tendency to assign lower-achieving students to less effective teachers. Do
we assign a somewhat less effective teacher to a higher-achieving student to make room
for the lower-achieving student who was put with a very ineffective teacher? Should we
investigate models where effective teachers team teach with less effective teachers and
have differentiated responsibilities?
Another important policy issue arises when the situation is considered from the
perspective of staff development. Most large districts are having trouble filling vacancies
with competent teachers, and many have few candidates to choose from. Clearly, investing
in making an ineffective teacher more effective is a necessary response, simply because
other options may not be available. A differentiated staff development policy is highly
likely to be needed, which allows the effective teacher more freedom to pursue individual
interests and requires the ineffective teacher to target particularly ineffective practices.
The problem of retaining effective teachers has to be reconsidered. These teachers have
such a high payoff in the classroom that it is more important than ever to keep them from
leaving the profession or going to other districts, if possible. Differentiated rewards must
be considered. These are the most apparent policy problems. There are
more. However, it is clear that the consideration of teacher effectiveness offers up an
entirely new set of concerns.
Cautions Relative to Effectiveness Indices
The largest problem in the use of effectiveness indices, as noted earlier, is their potential
for misinterpretation. This is particularly the case with teacher indices. The single biggest
problem is that consumers are tempted to reduce all of their thinking to a number and
to abandon the use of other data. Rarely is this justified. Several questions must be
considered. What about a teacher who has high indices in one area (reading, for example)
and low in another (say, mathematics)? What about a teacher who has a pattern of
acceptable indices and then has a bad year? What about a less than effective teacher when
there is a teacher shortage in that person's content area and no replacements are available?
These and many other questions should rapidly convince the user that other information
must be used in conjunction with the indices.
Leaving out problems that can arise from direct cheating and its effects on indices, there
are several other concerns. Indices present a quantification of worth along one dimension
relative to the person receiving them. Since any system using indices is fraught with the
psychological consequences of this perception, there is the danger that the teacher will
adopt the assessment measures as the curriculum and will narrow instruction to the
perceived focus of the assessments. Worse, if the teacher misunderstands the indices, the
teacher might teach only to one group of students in the mistaken belief that the group will
make the largest difference.
The use of indices requires a careful education program to help school leaders and
teachers understand the concepts underlying them and the limits to the value of the final
product. Measures of teacher and school effectiveness are potent and meaningful tools.
But their use in attaching responsibility for achievement to schools and teachers must be
viewed in light of their limitations as well as their considerable strengths.
References
Bearden, D. (1997). An overview of the elementary mathematics program 1996-97. Research report REIS97-116-3. Dallas: Dallas Public Schools.
Bembry, K., Jordan, H., Gomez, E., Anderson, M., & Mendro, R. (1998). Policy implications of long-term teacher effects on student achievement. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA, April 13-17.
Cross, C., & Joftus, S. (1997). Are academic standards a threat or an opportunity? Bulletin of the National Association of Secondary School Principals, 81(590), 12-20.
Darling-Hammond, L. (1997). Toward what end? The evaluation of student learning for the improvement of teaching. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage.
David, J. (1987). Improving education with locally developed indicators. New Brunswick, NJ: Center for Policy Research in Education, Eagleton Institute of Politics, Rutgers, the State University of New Jersey.
Felter, M., & Carlson, D. (1985). Identification of exemplary schools on a large scale. In Austin & Gerber (eds.), Research on Exemplary Schools. New York: Academic Press.
Freeman, C., & Sokoloff, H. (1995). Toward a theory of thematic curricula: Constructing new learning environments for teachers and learners. Education Policy Analysis Archives, 3(14), 17. (Internet Journal).
Frymier, J. (1997). Accountability and student learning. A keynote paper presented at the Sixth Annual National Evaluation Institute sponsored by CREATE, Indianapolis, IN, July 1997.
Glass, G. (1990). Using student test scores to evaluate teachers. In J. Millman & L. Darling-Hammond (eds.), The new handbook of teacher evaluation. Newbury Park, CA: Sage.
Haney, W., & Madaus, G. (1989). Searching for alternatives to standardized tests: Whys, whats, and whithers. Kappan, 70(9), 683-687.
Jordan, H., Mendro, R., & Weerasinghe, D. (1997). Teacher effects on longitudinal student achievement. A paper presented at the Sixth Annual National Evaluation Institute sponsored by CREATE, Indianapolis, IN, July 1997.
Kirst, M. (1986). New directions for state education data systems. Education and Urban Society, 18(2), 343-357.
Mendro, R. (1997). Summary of common features of standards-based schools. Dallas, TX: Dallas Public Schools.
Millman, J. (ed.) (1997). Grading teachers, grading schools. Newbury Park, CA: Sage Publications.
Murnane, R. (1987). Improving education indicators and economic indicators: The same problems? Educational Evaluation and Policy Analysis, 9, 101-116.
Raudenbush, S., & Bryk, A. (1989). Quantitative models for estimating teacher and school effectiveness. In R.D. Bock (ed.), Multilevel Analysis of Educational Data. San Diego, CA: Academic Press.
Sanders, W., & Horn, S. (1993). The Tennessee Value-Added Assessment System (TVAAS): Mixed Model Methodology in Educational Assessment. Knoxville, TN: University of Tennessee.
Sanders, W., & Rivers, J. (1996). Cumulative and residual effects of teachers on future student academic achievement. Knoxville, TN: University of Tennessee.
Sanders, W., Saxton, A., & Horn, S. (1997). The Tennessee Value-Added Assessment System: A quantitative outcomes-based approach to educational assessment. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.
Senate Bill 95-1. (1995). A bill enacted by the 95th session of the Legislature of the State of Texas, Austin, TX.
Skinner, B. (1953). Science and Human Behavior. New York: Macmillan.
Sykes, G. (1997). On trial, the Dallas Value-Added Accountability System. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.
Thum, Y., & Bryk, A. (1997). Value-Added Productivity Indicators. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.
U.S. Department of Education. (1992). Hard work and high expectations: motivating students to learn. Washington, D.C.: USDOE Office of Educational Research and Improvement.
Webster, W., & Mendro, R. (1997). The Dallas Value-Added Accountability System. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.
Webster, W., Mendro, R., Bearden, D., Bembry, K., & Jordan, H. (1997). Rewarding Effective Schools: Theory and Practice in an Outstanding Schools Awards Program. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL, March 24-28.
Webster, W., Mendro, R., Orsak, T., & Weerasinghe, D. (1998). An application of hierarchical linear modeling to the estimation of school and teacher effect. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA, April 13-17.
Webster, W., & Olson, G. (1988). A quantitative procedure for the identification of effective schools. Journal of Experimental Education, 56, 213-219.
Webster, W., & Schuhmacher, C. (1973). A unified strategy for systemwide research and evaluation. Educational Technology, 13(5), 68-72.