Student Achievement and School and Teacher Accountability

Journal of Personnel Evaluation in Education 12:3 257–267, 1998

© 1998 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands

ROBERT L. MENDRO

Dallas Public Schools

Abstract

This article is part of a set of papers generated from a keynote presentation by Dr. Jack Frymier at the 1997 CREATE annual meeting. Dr. Frymier dealt with several reasons that, as he saw it, invalidate the use of student achievement data in teacher accountability systems. This article first notes problems with Dr. Frymier's conception of accountability. Next, it summarizes some of the recent evidence showing the strong connection between school and teacher effectiveness measures and student achievement. It then notes some of the benefits of school and teacher effectiveness measures external to their function as measures of performance. Next, policy issues arising from the use of student data and the associated research are considered. Finally, it concludes with some cautions about using effectiveness measures in teacher accountability systems.

Dr. Frymier's Conception of Accountability

Dr. Jack Frymier's paper on "Accountability and Student Learning" presents an interesting but untenable position. He draws what seem to be preordained conclusions regarding accountability that just do not hold in the light of evidence (Frymier, 1997). This article first argues for the link between accountability and student performance in an overall response to Dr. Frymier. It then moves to a discussion of the existing data on tying school and teacher performance to student performance in a value-added context. Finally, it concludes with a discussion of issues that come to the forefront in light of this research, including some of the many benefits that can be derived from value-added measures and some cautions with regard to their use in teacher accountability systems.

Dr. Frymier basically holds that teachers can be held accountable only for their own performance and not for the performance of their students. Regardless of the many ways he restates his position, it is always a reiteration of this basic theme. There are three arguments presented in support of this theme. The first is the argument of tradition. According to his illustrations, the British tried payment according to student outcome in the 1800s and rejected it, and, therefore, so should we. Also, the ancient Greeks believed individuals were responsible only for their own behavior and this, perforce, excludes teachers from being held accountable for any part of their students' behavior. Next, the idea of teacher accountability through student achievement is rejected on a supposed violation of legal principles. Legally, he claims, even parents cannot be held responsible for the behavior of their own children, so teachers cannot be held responsible for their students' behavior. Finally, he produces an unusual twist of the concept of locus of control for an argument based on psychology. In his view, holding teachers responsible for student outcomes will give students an external locus of control, which will turn them into mindless automatons. As he has applied them, all three are false positions. Let us examine each of them in a little more detail.

The first argument, that of tradition, falls apart on analysis. Regardless of who has tried it and failed (the British, the ancient Greeks, or anyone else), new information based on solid research can invalidate tradition or, at a minimum, indicate that tradition needs to be challenged. As is demonstrated below, such information now exists. Regardless of what the Greeks held personal responsibility to be, we are dealing with organizations and individuals given the specific task of influencing the learning of students. Their responsibility for doing so and their accountability for the task are based on traditions, laws, and information available since the time of the ancient Greeks, thus rendering the opinion of the Greeks irrelevant. Of course, there are other problems with these analogies. To consider just a few with the British example, their school system was entirely different from a modern system, and the status of employees was legally different. The curriculum was much less well defined, and there were entirely different ways of measuring the outcomes of schooling.

We also consider the Greeks in connection with the legal argument regarding parental responsibility. Here, Dr. Frymier has ignored a number of precedents where parents are being held responsible for their children's truancy and a number of other individual behaviors. Further, responsibility for the entire range of behavior of an individual is confused with the limited responsibility of a teacher for, among other things, the learning, social behavior in school, and academic performance of students. For example, from a legal standpoint, the Texas legislature (and, no doubt, others) certainly holds the legal position that teachers are responsible for their students' academic behavior, since, in 1995, it passed a law that mandated the inclusion of the academic performance of a teacher's students as a part of all teacher evaluation systems in the state (Senate Bill 95-1, 1995). The ancient Greeks may or may not have disagreed with this stance, but since their time, new data regarding behavior have come to light. Certainly, knowledge of the ability to manipulate behavior through positive and negative reinforcement (Skinner, 1953) and the in loco parentis function of teachers and the schools relative to their students' behavior and well-being affect these conclusions. When these are combined with the teacher's obligation to spend a minimum amount of time manipulating academic behavior, the conception of the Greeks delivering schools and teachers from responsibility for student achievement collapses.

Locus of control has been stood on its head in order to absolve teachers of responsibility for their students' achievement. Locus of control is a part of attribution theory, which posits that individuals perceive (general) achievement-related situations as falling along a continuum from uncontrollable to totally controllable. Recent research indicates that the teacher can be directly responsible for instilling an internal locus of control in a student (U.S. Department of Education, 1992; Freeman & Sokoloff, 1995). Thus, teachers are shown to have responsibility for developing locus of control in their students as part of enhancing student achievement. Regardless of the distortion of locus of control research, a false dichotomy is offered with the concept. Locus of control is situation specific, and it is possible for a student to perceive a different locus of control with each task, assignment, or goal. Thus, locus of control is not the either/or situation argued. Finally, one of the demonstrated tenets of the standards-based education movement is that students feel quite comfortable with rational, achievable standards as long as they are allowed to have a degree of control over how they reach the standard (Cross & Joftus, 1997; Mendro, 1997). Thus, it is possible for students to allow a different locus of control for different parts of a larger complex task as opposed to the all or nothing alternative presented to the reader.

Data in Support of Using Student Achievement in School and Teacher Accountability

The real issue is to determine what data are available that bear on the issue of school and teacher accountability. The use of student achievement data as a part of school and teacher accountability has been debated for some time, but there have been few practical applications free from known biases until the last decade (Millman, 1997). The debate has centered on the need for including multiple outcome variables and for controlling exogenous influences. The use of multiple-outcome variables is a two-part problem. The first part centers on the need for multiple-outcome variables that are related to important educational goals (see, for example, Murnane, 1987; David, 1987). Here goals must be decided with input from stakeholders. The second part of the multiple-outcome question is mainly a question of resources on the part of the district involved (Webster & Schuhmacher, 1973). Resources are needed to maintain the extensive databases and to take the multiple measures required.

The variables used in the school effectiveness model used in the Dallas Public Schools are chosen by a group known as the Accountability Task Force. This group consists of principals, teachers, parents, community members, business representatives, and administrators. The Task Force makes the final decisions on all variables to be included in the School Effectiveness Indices. In addition, the District has performance awards for those schools that excel as measured by the School Effectiveness Indices. The Task Force sets the rules under which a school will receive an award (for example, the school must test at least 95 per cent of its non-special education students) and serves as the final arbiter for all appeals regarding the indices and the awards. The awards serve as one type of employee bonus. Each professional employee in an awarded school receives a $1,000 bonus. Each support employee receives $500, and the school receives $2,000 for its activity fund. The variables currently included in the indices are listed in Table 1, along with the weights assigned to them. When the weights are considered, the variables consist primarily of test data. For example, in an elementary school with grades 1 through 6, test scores have weights totaling 89, attendance totals 6, and promotion rate 1. The Accountability Task Force intended these weights to serve as the relative value of the outcome measures for these schools. Indeed, where parents and community members discuss goals, our experience has been that more emphasis is placed on achievement as the primary legitimate outcome of schooling.
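The elementary totals quoted above can be checked with a short worked computation. The sketch below is illustrative only: the per-grade and per-school weights are those read from the elementary column of Table 1, while the set of grades in which TAAS reading and mathematics are tested (taken here as grades 3 through 6) is an assumption, chosen because it reproduces the stated total of 89. All variable names are hypothetical.

```python
# Worked check of the elementary weight totals (grades 1 through 6).
GRADES = range(1, 7)        # grades 1-6, as in the example in the text
TAAS_GRADES = range(3, 7)   # ASSUMPTION: TAAS reading/mathematics tested in grades 3-6

test_weight = (
    4 * len(GRADES)          # ITBS Reading, weight 4 per grade
    + 4 * len(GRADES)        # ITBS Mathematics, weight 4 per grade
    + 5 * len(TAAS_GRADES)   # TAAS Reading, weight 5 per tested grade
    + 4 * len(TAAS_GRADES)   # TAAS Mathematics, weight 4 per tested grade
    + 5                      # TAAS Writing, grade 4 only, weight 5
)
attendance_weight = 1 * len(GRADES)  # Attendance, weight 1 per grade
promotion_weight = 1                 # Promotion rate, weight 1 per school

total_weight = test_weight + attendance_weight + promotion_weight

# Relative share of each outcome in the school's index, as fractions of the total.
shares = {
    "tests": test_weight / total_weight,
    "attendance": attendance_weight / total_weight,
    "promotion": promotion_weight / total_weight,
}

for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under these assumptions, test data account for the great majority of the total weight, which is the sense in which the indices consist primarily of test data.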

Table 1. Weighting of Criterion Measures by the Accountability Task Force of the Dallas Public Schools.

Elementary Criteria
Criterion                              Weight
ITBS - Reading                         4/Grade
ITBS - Mathematics                     4/Grade
TAAS - Reading                         5/Grade
TAAS - Mathematics                     4/Grade
TAAS - Writing, Grade 4                5
Attendance                             1/Grade
Promotion Rate                         1/School

Middle School Criteria
Criterion                              Weight
ITBS - Reading                         4/Grade
ITBS - Mathematics                     4/Grade
TAAS - Reading                         5/Grade
TAAS - Mathematics                     4/Grade
TAAS - Writing, Grade 8                5
TAAS - Science, Grade 8                1
TAAS - Social Studies, Grade 8         1
Attendance                             1/Grade
Promotion rate                         1/School
Dropout rate                           1/School
Percentage in honors courses           2/School

High School Criteria
Criterion                              Weight
TAP - Reading, Grade 9                 6
TAP - Mathematics, Grade 9             6
TAAS - Reading, Grade 10               12
TAAS - Mathematics, Grade 10           12
TAAS - Writing, Grade 10               12
ACP - Language                         8/School
ACP - Mathematics                      8/School
ACP - Science                          8/School
ACP - Social Studies                   8/School
ACP - Reading imp, Grade 9             2
ACP - Honors/Advanced                  3/School
ACP - Foreign Language                 2/School
SAT/ACT, Grade 12                      4
PSAT - Verbal                          1/School
PSAT - Mathematics                     1/School
Graduation rate                        5/School
Percentage taking SAT/ACT              5/School
Percentage taking PSAT                 3/School
Percentage in honors courses           5/School
Percentage in AP courses               4/School
Percentage taking AP and passing       1/School

The stumbling block to the analysis of these data has been the known influence of variables outside the control of the school or teacher (Webster & Mendro, 1997). For example, the Dallas Public Schools system controls for the effects of a combined ethnicity and language proficiency variable, student gender, four variables measuring socio-economic status (free or reduced lunch participation, census family income, census poverty level, and census college participation), and the complete interactions of ethnicity or language, gender, and free lunch.

Although regression analysis has been proposed for years as a remedy for controlling these influences (see, for example, Felter & Carlson, 1985; Kirst, 1986), standard multiple linear regression has shown problems controlling known influences in the higher levels of multilevel data (Webster & Olson, 1988; Webster, Mendro, Orsak & Weerasinghe, 1998). Only with the widespread availability of software and techniques for the analysis of sophisticated multilevel models such as hierarchical linear modeling (HLM) (Raudenbush & Bryk, 1989) or with the statistical developmental efforts commissioned by at least one state Department of Education (Sanders & Horn, 1993) have school personnel had the statistical tools available to control influences on multiple levels of variables and their associated bias. For example, at the school level, the Dallas Public Schools model simultaneously controls the effects of mobility, crowding, average family income, average family educational level, average family poverty level, per cent on free lunch, per cent SOL students, per cent African American students, per cent Hispanic students, per cent minority students, and per cent of teacher days for vacant positions, while controlling the student variables listed above.

With the initiation of sophisticated models for multilevel data analysis, several carefully designed systems for determining school and teacher effectiveness have been developed and implemented (Sanders, Saxton & Horn, 1997; Webster & Mendro, 1997). Further, Webster et al. (1998) have laid out clear criteria for the evaluation of the level of bias in these models. They note that the final criterion must be the degree to which these models control the correlation between variables known to be related to the outcome variables but not under control of the school or teacher (SES, for example) and the effectiveness outcomes derived from the models. The degree to which these correlations are nonzero is the degree to which a model is biased. Furthermore, these correlations must be controlled at both the student and school or classroom level.
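The bias criterion just described can be sketched in a few lines of code. The example below is a minimal illustration, not the Dallas model: it uses a one-predictor least-squares fit as a stand-in for the multilevel models discussed in the text, treats the residuals as a crude effectiveness measure, and then computes the correlation between that measure and an SES variable at the student level and at the classroom level. The data, the classroom labels, and all function names are hypothetical.

```python
from collections import defaultdict
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def ols_residuals(xs, ys):
    """Residuals from a one-predictor least-squares fit of ys on xs."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
             / sum((x - mx) ** 2 for x in xs))
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(xs, ys)]


def group_means(values, groups):
    """Mean of values within each group label."""
    buckets = defaultdict(list)
    for value, group in zip(values, groups):
        buckets[group].append(value)
    return {g: sum(vs) / len(vs) for g, vs in buckets.items()}


# Hypothetical records: (classroom, SES measure, test score).
students = [
    ("A", 1.0, 48.0), ("A", 2.0, 55.0), ("A", 3.0, 59.0),
    ("B", 2.0, 50.0), ("B", 3.0, 52.0), ("B", 4.0, 61.0),
    ("C", 4.0, 66.0), ("C", 5.0, 70.0), ("C", 6.0, 71.0),
]
rooms = [s[0] for s in students]
ses = [s[1] for s in students]
scores = [s[2] for s in students]

# Student-level "effectiveness outcome": the score with SES regressed out.
residuals = ols_residuals(ses, scores)
student_level_bias = pearson(ses, residuals)

# Classroom-level index: mean residual per classroom, compared with mean SES.
room_ses = group_means(ses, rooms)
room_index = group_means(residuals, rooms)
labels = sorted(room_index)
classroom_level_bias = pearson([room_ses[r] for r in labels],
                               [room_index[r] for r in labels])

print("student-level correlation:", round(student_level_bias, 6))
print("classroom-level correlation:", round(classroom_level_bias, 6))
```

In this sketch the student-level correlation is zero by construction, because least-squares residuals are uncorrelated with the predictor. The classroom-level correlation need not be zero, which is one way to see why the text requires the check at both levels and why single-level regression is described as inadequate for multilevel data.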

Research by Sanders and Rivers (1996), Jordan, Mendro, and Weerasinghe (1997), and Bembry, Jordan, Gomez, Anderson, and Mendro (1998) has demonstrated the effects of teachers on student achievement. They show that there are large additive components in the longitudinal effects of teachers, that these effects are much larger than expected, and that the least effective teachers have a long-term influence on student achievement that is not fully remediated for up to three years later. Finally, both Dallas studies show a selection bias where lower-achieving students are more likely to be put with lower effectiveness teachers and vice versa. Thus, the negative effects of less effective teachers are being visited on students who probably need the most help.

Further, the research done in Tennessee by Sanders and Rivers and that done in Dallas by Jordan et al. and Bembry et al. reveal nearly identical results across three different populations, different methods of computing teacher effectiveness, different analysis methods, and different assessment measures (Bembry et al., 1998). Clearly, these data show that groups of ineffective teachers can be readily identified as the first step in remediating their poor performance in influencing student achievement and that the effects of ineffective and effective teachers hold across populations of students and the measures used to obtain effectiveness indices.

These points are important because they expose myths about the influence of teachers on student achievement. The finding that losses associated with ineffective teachers are large contradicts the first myth, which is that if a teacher doesn't get very good achievement gains in one year, it is of little importance because the losses are small. The research in Dallas shows that with students of average prior achievement levels, groups of students can lose as much as twenty norm-referenced percentile points in a year. Across three or four years, students with ineffective teachers each year can perform at a level fifty percentile points lower than students with effective teachers. The data on longitudinal effects of an ineffective teacher void a second myth about teachers and achievement, which is that if teacher A doesn't do a good job this year, teacher B will make up for it next year. The Dallas data strongly suggest that negative effects of a teacher in the bottom third of effectiveness last through three years of teachers in the top third of effectiveness. These data raise immediate policy issues regarding teacher evaluation and student achievement, which are discussed below.

These effectiveness systems are not without their detractors (Darling-Hammond, 1997; Glass, 1990; Sykes, 1997; Thum & Bryk, 1997). However, their objections do not explain away the data and the remarkable consistency across sites. They focus, instead, on philosophical objections, on hypothetical situations unrelated to the real data, or on ideal alternative models that have not been tested. Typically, they make the assumption that the data will be used inappropriately in a scenario of their devising and condemn the models from that standpoint. Without a substantive objection to the data or the analyses presented in the research, these arguments are of little value in advancing the discussion on the use of teacher and school effectiveness models.

An editorial comment on an earlier version of this article focused on a major bone of contention regarding the most appropriate tests for use in identifying effective and ineffective schools and teachers in relation to student achievement. As noted, the Dallas system uses a number of variables in its system for determining effective schools. Norm-referenced, criterion-referenced, and curriculum-referenced tests, student attendance, dropout rates, participation in honors courses, and other variables are used in determining the effectiveness of schools. For teachers, only test data are used. At some grades and in some subjects, as many as four different sets of test results are used. At some grades, only norm-referenced tests are used. Some would prefer that the assessment use only "authentic" measures (Darling-Hammond, 1997). We find that, for our purposes, any reliable measure related to the general or specific learning outcomes, including standardized norm-referenced tests, provides good information for beginning an improvement process. To rephrase this, because it is an important point, we find norm-referenced tests to be sufficient for the initial identification of potentially outstanding teachers and schools and potentially ineffective teachers and schools.

We believe, with Haney and Madaus (1989), that the critical element is not the type of test (provided the test is reliable and a measure of the subject taught by the teacher) but the use to which the test is put. The Dallas Classroom Effectiveness Indices are intended to be used as a starting point for identifying groups of teachers who are effective or ineffective relative to their students' measured achievement, not as a final outcome measure of teacher effectiveness.

After the results are verified, they can be used to learn about effective teachers (as described in the next section) or to investigate the reasons for ineffective student results as part of assisting a teacher to improve. Only when numerous sources, including Classroom Effectiveness Indices, confirm ineffective teaching should the results be used in making decisions about changing ineffective behaviors. Only when numerous sources confirm effectiveness should we leave good teachers alone or attempt to learn from them. In some cases, a pattern of effective or ineffective results across several years will give more credence to the test information, but a single result has to be verified. We realize that this is no more than the good advice given for using the results of any assessment, but critics of the use of test scores assume, sometimes correctly, that test results will be used inappropriately. Thus, even this elemental caution is in order. We strongly discourage using any test result or effectiveness measure in isolation.

Results and Uses Associated with the Dallas Effectiveness Indicators

The Dallas Public Schools (DPS) initiated a value-added system of identifying effective schools in 1991 to 1992. In 1994 to 1995, the DPS started computing effectiveness at the classroom level and formally adopted Classroom Effectiveness Indices as part of a revised teacher evaluation system in 1996 to 1997 after extensive input from all levels of district personnel and community members. The system is described in Webster, Mendro, Bearden, Bembry, and Jordan (1997). Over the six years the school portion of the system has been in effect, a primary question has been the impact that the system has had on variables measured by the system. The primary variables in the system where comparable data are available were achievement on the norm-referenced ITBS, on the state-mandated criterion-referenced Texas Assessment of Academic Skills, and on the SAT; dropout rate; graduation rate; and the percentage of students in honors courses in grades 7 to 8 and 9 to 12. (Attendance was included but is not considered here since a separate attendance initiative with financial rewards was in effect during the same time period.) All variables showed a general increasing trend (or decreasing trend for dropout rate) except the mean SAT scores, which decreased, and the grades 7 to 8 honors course enrollment, which decreased significantly from 1991 to 1993 and then has shown a steady rise (Webster et al., 1997). The mean SAT scores were affected by increased participation in the SAT. A separate analysis of these scores by quintile of the graduating class has shown these scores to be stable within quintile. These are obviously correlational and not causal data, but they do not contradict the possibility of a positive trend related to the introduction of indices.

More important have been the uses of the indices data. One of the primary uses is to reward effective schools through the use of performance awards based on the school indices. The administration uses school indices as one of the indicators in determining the retention of principals at ineffective schools. The experience of the DPS is that the quickest way to change the effectiveness of a school, for better or worse, is to change the principal. Preliminary research that DPS is in the process of conducting has so far strongly supported this conclusion.

The indices for teachers and schools serve as the basis of many evaluations, particularly where there are no specific control groups. They have been used to identify characteristics of effective and ineffective schools (Webster et al., 1997). In this study, effective and ineffective schools were sampled, and evaluators identified three consistent characteristics of effective schools. First, effective schools had achievement as a major focus. Second, the staff in effective schools expected students to achieve. (This is different from believing all students can learn but not expecting it.) Finally, the principals of effective schools did not tolerate ineffective teachers. Ineffective teachers were expected to change, or they were removed. Apart from these characteristics and a few less prominent ones, the effective schools all differed in terms of atmosphere, management styles, and other dimensions.

The classroom indices have also been used in the evaluation of the District's mathematics program (Bearden, 1997). Research conducted in effective teachers' classes found that effective teachers knew subject matter, taught the entire range of the curriculum (including higher-order thinking skills) with equal emphasis, and assessed students frequently through formal and informal methods. Beyond this, teacher styles varied widely. Thus, the payoff of effectiveness indices may extend beyond the effect on the variables in the accountability system and provide additional knowledge and information about the ingredients that bring effectiveness about.

Policy Issues Associated with Teacher Effectiveness

Bembry et al. (1998) discuss the policy issues that are implied by current teacher effectiveness research. Clearly, equity in student access to a quality education is the most important issue that emerges. The devastating student outcomes obtained by ineffective teachers demand that a district remedy the effects of these teachers on students or, if that is not possible, remove those teachers from the district. In the past, less extensive analyses or the lack of analysis of student achievement data allowed principals and administrators to believe that poor teachers were doing little harm or little long-lasting harm. The import of the effectiveness research is that these teachers are doing students long-term harm and that it is possible to identify potential groups of teachers having this effect. This provides a moral imperative to investigate these teachers and to attempt to correct their behaviors if they consistently have these effects on students. Schools and districts can no longer hide behind misinformation or no information.

A second issue involving equity is the type of help to provide students who have had an ineffective teacher in the past. We now know, as a result of our research, that students who are placed with an ineffective teacher suffer long-term negative effects. Carrying ineffective teachers also limits the number of effective teachers that can be hired. The policy issue is how to allocate resources for better teaching. This is particularly a problem because of the tendency to assign lower-achieving students to less effective teachers. Do we assign a somewhat less effective teacher to a higher-achieving student to make room for the lower-achieving student who was put with a very ineffective teacher? Should we investigate models where effective teachers team teach with less effective teachers and have differentiated responsibilities?

Another important policy issue arises when the situation is considered from the perspective of staff development. Most large districts are having trouble filling vacancies with competent teachers, and many have few candidates to choose from. Clearly, investing in making an ineffective teacher more effective is a necessary response, simply because other options may not be available. A differentiated staff development policy is highly likely to be needed, which allows the effective teacher more freedom to pursue individual interests and requires the ineffective teacher to target particularly ineffective practices.

The problem of retaining effective teachers has to be reconsidered. These teachers have such a high payoff in the classroom that it is more important than ever to keep them from leaving the profession or going to other districts, if possible. Differentiated rewards must be considered. These are only the most apparent policy problems. There are more. However, it is clear that the consideration of teacher effectiveness offers up an entirely new set of concerns.

Cautions Relative to Effectiveness Indices

The largest problem in the use of effectiveness indices, as noted earlier, is their potential for misinterpretation. This is particularly the case with teacher indices. The single biggest problem is that consumers are tempted to reduce all of their thinking to a number and to abandon the use of other data. Rarely is this justified. Several questions must be considered. What about a teacher who has high indices in one area (reading, for example) and low in another (say, mathematics)? What about a teacher who has a pattern of acceptable indices and then has a bad year? What about a less than effective teacher when there is a teacher shortage in that person's content area and no replacements are available? These and many other questions should rapidly convince the user that other information must be used in conjunction with the indices.

Leaving out problems that can arise from direct cheating and its effects on indices, there are several other concerns. Indices present a quantification of worth along one dimension relative to the person receiving them. Since any system using indices is fraught with the psychological consequences of this perception, there is the danger that the teacher will adopt the assessment measures as the curriculum and will narrow instruction to the perceived focus of the assessments. Worse, if the teacher misunderstands the indices, the teacher might teach only to one group of students in the mistaken belief that the group will make the largest difference.

The use of indices requires a careful education program to help school leaders and teachers understand the concepts underlying them and the limits to the value of the final product. Measures of teacher and school effectiveness are potent and meaningful tools. But their use in attaching responsibility for achievement to schools and teachers must be viewed in light of their limitations as well as their considerable strengths.


References

Bearden, D. (1997). An overview of the elementary mathematics program 1996–97. Research report REIS97-116-3. Dallas: Dallas Public Schools.

Bembry, K., Jordan, H., Gomez, E., Anderson, M., & Mendro, R. (1998). Policy implications of long-term teacher effects on student achievement. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA, April 13–17.

Cross, C., & Joftus, S. (1997). Are academic standards a threat or an opportunity? Bulletin of the National Association of Secondary School Principals, 81(590), 12–20.

Darling-Hammond, L. (1997). Toward what end? The evaluation of student learning for the improvement of teaching. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage.

David, J. (1987). Improving education with locally developed indicators. New Brunswick, NJ: Center for Policy Research in Education, Eagleton Institute of Politics, Rutgers, the State University of New Jersey.

Felter, M., & Carlson, D. (1985). Identification of exemplary schools on a large scale. In Austin & Gerber (eds.), Research on exemplary schools. New York: Academic Press.

Freeman, C., & Sokoloff, H. (1995). Toward a theory of thematic curricula: Constructing new learning environments for teachers and learners. Education Policy Analysis Archives, 3(14), 17. (Internet Journal).

Frymier, J. (1997). Accountability and student learning. A keynote paper presented at the Sixth Annual National Evaluation Institute sponsored by CREATE, Indianapolis, IN, July 1997.

Glass, G. (1990). Using student test scores to evaluate teachers. In J. Millman & L. Darling-Hammond (eds.), The new handbook of teacher evaluation. Newbury Park, CA: Sage.

Haney, W., & Madaus, G. (1989). Searching for alternatives to standardized tests: Whys, whats, and whithers. Kappan, 70(9), 683–687.

Jordan, H., Mendro, R., & Weerasinghe, D. (1997). Teacher effects on longitudinal student achievement. A paper presented at the Sixth Annual National Evaluation Institute sponsored by CREATE, Indianapolis, IN, July 1997.

Kirst, M. (1986). New directions for state education data systems. Education and Urban Society, 18(2), 343–357.

Mendro, R. (1997). Summary of common features of standards-based schools. Dallas, TX: Dallas Public Schools.

Millman, J. (ed.) (1997). Grading teachers, grading schools. Newbury Park, CA: Sage Publications.

Murnane, R. (1987). Improving education indicators and economic indicators: The same problems? Educational Evaluation and Policy Analysis, 9, 101–116.

Raudenbush, S., & Bryk, A. (1989). Quantitative models for estimating teacher and school effectiveness. In R.D. Bock (ed.), Multilevel analysis of educational data. San Diego, CA: Academic Press.

Sanders, W., & Horn, S. (1993). The Tennessee Value-Added Assessment System (TVAAS): Mixed model methodology in educational assessment. Knoxville, TN: University of Tennessee.

Sanders, W., & Rivers, J. (1996). Cumulative and residual effects of teachers on future student academic achievement. Knoxville, TN: University of Tennessee.

Sanders, W., Saxton, A., & Horn, S. (1997). The Tennessee Value-Added Assessment System: A quantitative outcomes-based approach to educational assessment. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.

Senate Bill 95-1. (1995). A bill enacted by the 95th session of the Legislature of the State of Texas, Austin, TX.

Skinner, B. (1953). Science and human behavior. New York: Macmillan.

Sykes, G. (1997). On trial, the Dallas Value-Added Accountability System. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.

Thum, Y., & Bryk, A. (1997). Value-added productivity indicators. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.

U.S. Department of Education. (1992). Hard work and high expectations: Motivating students to learn. Washington, D.C.: USDOE Office of Educational Research and Improvement.

Webster, W., & Mendro, R. (1997). The Dallas Value-Added Accountability System. In J. Millman (ed.), Grading teachers, grading schools. Newbury Park, CA: Sage Publications.

Webster, W., Mendro, R., Bearden, D., Bembry, K., & Jordan, H. (1997). Rewarding effective schools: Theory and practice in an Outstanding Schools Awards Program. Paper presented at the Annual Meeting of the American Educational Research Association, Chicago, IL, March 24–28.

Webster, W., Mendro, R., Orsak, T., & Weerasinghe, D. (1998). An application of hierarchical linear modeling to the estimation of school and teacher effect. Paper presented at the Annual Meeting of the American Educational Research Association, San Diego, CA, April 13–17.

Webster, W., & Olson, G. (1988). A quantitative procedure for the identification of effective schools. Journal of Experimental Education, 56, 213–219.

Webster, W., & Schuhmacher, C. (1973). A unified strategy for systemwide research and evaluation. Educational Technology, 13(5), 68–72.

