
Accountability, Testing, and Schools: Toward Local Responsibility and Away from Change by Mandate

Grant Wiggins

“Quality is the customer’s perception of excellence. Quality is what the customer says he needs, not what our tests indicate is satisfactory.” (quoted in Peters 1987)

We assume that school accountability depends on imposing standardized testing. That assumption is false, however, as reflection on the state of accountability in the private sector reveals. Testing does not yield accountability; the opportunity of clients to influence service providers by criticizing or leaving the service provider for another does yield accountability. If the client is unable to influence the decisions of teachers and building principals, tests--whether “authentic” or “traditional”--will not change what happens in our schools.

Accountability depends on the freedom to succeed or fail based on client satisfaction. Mandating standards and measures will provide less freedom, not more; it will make it less likely that schools and programs develop in response to local interests, and less likely, too, that entrepreneurial behavior will occur in education.

Let me put this matter a bit cynically: those in the business, professional, and policy communities calling for accountability through uniform testing would never tolerate such an intrusion into their own affairs. Some of the current clamor for national testing is disingenuous or based on naive ignorance. A “command” system failed in all of Eastern Europe in economics; why then is such a system likely to work in education?

Accountability can only come through revolutionary changes in the thinking of everyone connected with education.

Better incentives are needed for teachers and schools to improve on their own, one school at a time--just as in business. Procedures for giving clients more opportunities to influence service, and to “reward” the better service providers with their “business,” are needed. (“Drive out fear” is a modern postulate of total quality management; current thinking on school accountability is rooted in the sweatshops of the late nineteenth century.)

Accountability exists when the service provider is obligated to respond to criticism from those whom the provider serves. The ability to hold the service provider responsible depends on a moral-legal-economic framework in which the client has formal power to demand a response from the provider, influence the provision, and change service providers if the responses are deemed unacceptable by the client. Any required tests should make it easier for clients to exercise their rights; they should merely help them better understand the quality of the service with which they have been provided.

Tests may furnish more information to those in such a relationship, but they do not improve the “accountability,” the responsive quality of the relationship. For example, requiring yearly physical exams of all fathers will not increase their accountability to their families. As a father, I had that obligation before I went to the doctor. The only sense in which the test results might affect my conduct is if the results tell me things I could not have known otherwise--my blood pressure or cholesterol level, for instance. I am “accountable” for these new test results only in the sense that: (1) I must be capable of changing the results, and (2) those to whom I am obligated--my family--know, understand, and can use the results, and can use moral suasion to induce me to change my unhealthy habits.

A simple example from client/institutional relationships illustrates how accountability depends on a relationship of responsibility--a relationship of moral equals. In what way is the powerful Ford Motor Company “accountable” to me for the performance of my Taurus? There are different answers: I have an explicit contract, my warranty; failure to honor the contract can lead to my seeking redress. In the long run, there is a more informal yet powerful form of accountability: I can buy a car from a different manufacturer. We certainly make a great mistake if we believe that accountability derives from any oversight provided by the Federal Trade Commission or the Department of Transportation and Highway Safety.

Now consider the role of comparative testing in this relationship. Do I need Consumer Reports to hold the Ford Motor Company accountable for my Taurus? Think carefully: I am not asking whether the magazine’s tests are useful and informative; they surely are. The question is whether “accountability” depends on those tests and rankings. The answer, I think, is mostly no. In one sense, yes: the average car buyer cannot conduct all those tests, and the tests unearth hidden but vital aspects of buyer satisfaction. The more people read the comparisons among various makes of vehicles, and the more the comparisons are judged to be credible, and the more that cars receive low ratings because of hard-to-see defects, the more the manufacturer will pay the price in lower sales. The outcome is that I vote with my pocketbook. If I am unhappy with my Taurus, I will return it, sell it, or junk it. We don’t need any test other than our own satisfaction or dissatisfaction to keep Ford accountable; Ford ultimately is dependent on my continued satisfaction.

Organizational dependency on my satisfaction has never held true for public schools until quite recently. In the monopoly that public schools have enjoyed, they have not needed to worry about the dissatisfaction of the student, the parents, or the receiving institution (the “institutional customers” of the school). This becomes much clearer if we reverse the Ford example and ask this question: What accountability existed in the automobile industry of the Soviet Union when there was only one car to buy, the Trebia?

Let’s look at a more complex system of performers, where the clients are in the same removed position from the performers as they are in schooling: baseball. The fans are the clients; they pay the bills. Yet they do not manage the team and should not do so, just as students and parents should not manage teaching. How are the teams “accountable” to their fans? More indirectly through the pattern of actions of management and players--just as in schools. To whom is the team responsible? To its fans.

By what mechanisms is the team accountable to its fans? Here you might say that surely we hold the team accountable using the most basic of tests--its win-loss record. However, the win-loss record is not always the best indicator of optimal performance. Teams, like schools, can do modestly well with limited talent. A team’s record does not always correlate with satisfaction: look at the perpetually dismal record of the Chicago Cubs and the high attendance at their games.

The biggest problem with envisioning a school accountability system grounded in testing is that school testing does not work like a season’s worth of baseball games. A test is a one-shot affair, not a season; it is an indirect and generic measure, not the criterion performance; the test is imposed, not a natural part of the environment and mission of the schools.

The parallel in baseball would be if an impatient group of baseball overseers decided that the health of a team should be judged by a visit to one game per year. Each team would be “tested” and compared with other teams using a complex and little-known set of statistics and equating formulae so arcane that only the measurement experts would understand it. How would that improve “accountability”? If the measurement is not of self-evident meaning and value, how can it hold anyone responsible for anything? Wouldn’t this suggest a conclusion opposite to those proposed by policymakers: that we would be better off assessing the student’s performance over 162 days in school to tell us how schools are doing?

In baseball, even if we have credible and multifaceted statistics for each player, we note that there is no single aggregate statistic representing each player’s achievement. There is no single “test score,” in other words. Batting average, runs batted in, runs scored, home runs, stolen bases, and many other statistics are part of a complex assessment of a player’s worth based on incommensurable variables. The difficulty of making judgments in high-stakes situations becomes clearer when we consider that even with reams of objective data, summary conclusions--summary judgments based on placing value on specific traits--are hard to determine.

Business Horizons / September-October 1993

Consider recent salary arbitration cases, in which players’ agents hire statisticians to provide new (and often arcane) measures of their clients’ true value. Yes, I hit few home runs, but my total run production is high; yes, I strike out a lot, but I have more hits in clutch situations (two outs, late innings, when we are behind) than other players. In baseball (as in schoolkeeping), many players’ statistics relate to and depend on other people’s statistics: I can’t have many RBIs if my teammates don’t get on base; I am likely to hit for less average and draw more walks if I am the only good hitter on the team (because pitchers will pitch around me and I will get fewer good pitches to hit).

We should not be looking at only one baseball season, if the analogy is to remain valid. In any given year, some teams are weak and some are strong. If strengths and weaknesses played themselves out over time, the rich would get richer and the poor poorer. How, then, can there be accountability if there is little possibility of fundamental reform and injection of better talent? The game is designed, however, so that reasonable balance--equity--is built into it over time. A new player draft enables poor teams to get first call on the best young talent. A free-agent system and a system for making trades enable any team to improve itself quickly if it acts wisely. Even in something as “Darwinian” as professional sports, we can have accountability because there is a built-in possibility of dramatic change and possible improvement as well as the freedom to exit.

In schools the absence of credible tests makes summary judgments about school performance (never mind teacher performance) almost impossible. Whatever tests we use, if union contracts and custom dictate that teachers cannot be “traded” or act as “free agents,” we make it far less likely that schools can be changed or held accountable.

Where, then, is the “accountability” in baseball or in schools? Surely it derives in part from a possibility that fans, advertisers, and players--parents, students, and teachers in the case of schools--desert bad organizations. The “test” does not provide accountability; the possibility of a direct and forceful response to the results of the test by the many “clients” provides accountability--the “exit option,” as Hirschman (1970) has described it. Even though baseball teams are immune to antitrust legislation, every one of them has a direct interest in improving itself if it seeks to stay in business. Most schools do not now have such an interest.

It makes no sense to think of “accountability” as a system devised by the state or other agencies to exert its influence based on test scores. “Accountability” depends on the client’s freedom and power to exert influence--irrespective of the source of the information for the decision--and is dependent, therefore, on a system that confronts teachers more directly with their successes and failures. The Ford and baseball analogies suggest that we should be more cautious in assuming that accountability depends on state or national testing. If “accountability” is a form of responsibility--and hence of responsiveness--accountability may work better if we help clients exert influence, in a context in which an institution must face direct consequences as a result of responsiveness, or the lack of it, to clients.

SCHOOLING AND CLIENT SATISFACTION

The unresponsiveness of schools has much to do with the fact that union rules, bureaucratic traditions, and custom make it almost impossible to shape schooling around the learner-client’s perceived needs. Students have little say in the teachers to whom they are assigned. Classes are designed and organized to suit the temperament, style, and (especially) pace of the teacher and the aggregate class. If one is assigned a poor teacher or has a relational problem with a teacher, too bad: few schools allow students to change classrooms or teachers based on the client’s unhappiness with the service. (Principals and board members hate parents who persist in such matters.) As Albert Shanker, president of the American Federation of Teachers, often says, if things don’t work out for me with a doctor’s prescription and advice, the doctor is obligated to try an alternative approach; in school they yell at you if you do not respond to their medicine.

As odd as it may sound, I believe that individual teachers historically have been immune to accountability. It is rare for teachers to have their methods rewarded or challenged, their grades overturned, or their classroom duties altered because of a failure to serve clients well, as judged by the clients (or their guardians). Only dramatic and extreme malfeasance leads to such actions.

The issue of school choice is misleading because responsiveness to customers is not greatly improved by such choice; parents still may be unable to influence a school once they commit themselves (and their child) to it. The primary “unit” of accountability is not the district officials or the “school” (both of which are too distant from the client’s interests), but the particular set of teachers and administrators who are directly responsible for each child’s experience and achievement. This is a crucial distinction for an effective (and responsive) system of accountability. I want to be able to be heard by my teacher if I am a student. My most important “choice” should be about the quality of my relationship with that teacher. If I cannot influence the relationship for the better, then I should be able to change teachers.

Consider the medical equivalent: I want to be able to influence the care provided by my doctor. I want to be heard when I am poorly served, and I want to change doctors if I am not heard and am dissatisfied with the service. I certainly do not want to be able merely to change hospitals in a system where hospitals are bound to be indistinguishable and not really interested in my particular case when I get there.

Schools would be instantly more accountable if we worried less about arcane psychometric proxy tests and more about making the teacher’s daily work public and giving the student performer a more powerful voice. What if teachers were obligated to consider learning styles, not because of some mandate but because the student could transfer into another class on demand? What if each teacher had to display each month the best work from each student in a public meeting with parents? What if academic teachers, like many vocational teachers, had to have a “consultant committee” of professionals from their field with whom they had to meet and review their work annually? What if performance appraisals were centered on teacher self-assessment concerning a range of student work from a major assignment? These kinds of mechanisms would improve accountability immediately and forcefully.

If we argue that the student performer is the primary “customer” for assessment information, then students ought to give teachers regular feedback--not about the abstraction of an entire course or program, but about the quality and usefulness of the instructional help and assessment information they receive. As a simple example, one of my former school colleagues would distribute index cards each Friday and ask students to put on one side “what worked for you” this week, and on the other side “what didn’t work.” This feedback is far more effective and responsive than formal course evaluations, which ultimately shield teachers from reacting as the dissatisfaction occurs, though even these measures are better than nothing. (Notice that such questionnaires do not ask questions about which the student is not an expert. Students are not only the best judges but the only judges of the quality of the assessment feedback they received and whether the lessons “worked” for them.)

This process is no different from the kinds of questionnaires I am asked to fill out at the hotels I visit. In client-centered service businesses, a critical piece of the formal accountability is always provided by the client’s formal evaluation. Also valuable, therefore, are surveys of graduating students or alumni, used in many high schools, that assess the programs and services they found of most and least value. Here again, what matters is not the “truth” of the students’ claims but the importance of taking their perceptions seriously.

I am not saying that the only or most important way to improve accountability is to use survey data from students. This is a small piece of a much larger assessment puzzle, in which we “triangulate” similar kinds of information gained from our many clients and institutional customers--parents, teachers at the next level, former students, and the test and admissions data we find most credible. I am saying that we will never understand or achieve real accountability until we see that “tests” themselves do not provide it, but mechanisms that increase responsiveness to clients do.

I must also emphasize that the results of credible and learner-useful tests are nonetheless an important part of any system of accountability. The key words here are “credible” and “useful” to the school faculty. Meeting both criteria depends on more than just a move from indirect to direct tests. If the testing we do in the name of accountability is still one-event, year-end testing, we will never obtain valid, fair information. We need to ensure that any accountability-related testing reveals the “value added” by the school if we want to know whether or not a school is “effective.”

My earlier analogy of automobile purchasing was limited by an important fact: the automobile industry can be properly viewed as an almost perfect “outcome-based” system. We believe it to be fair and proper to assume that the vagaries of the “inputs” (the quality of the steel and the production process, for example, or the quality of the workers) can be almost completely minimized through what we rightly call “quality control.”

Schooling is not like that. The quality of the “inputs” to the system of schooling has much to do with the quality of the result. Who our students are when they enter makes a profound difference if we are going to attempt to compare schools or districts. Then add student mobility to this complicated mix: in some urban districts, students enroll and leave the same school during the year, sometimes repeatedly. Some urban schools have a mobility rate of 210 percent per year! What are we then measuring if we use one-event testing to provide accountability? How can one-event testing ever provide leverage for accountability?

Any sensible plan for school and district accountability should examine both “input” and “output”; it should ask about the “value added” by the faculty and the school. Whether we consider a simple pre- and post-test system or a more sophisticated approach to longitudinal assessment, thoughtful observers recognize the need to look for the changes effected by the school, not just isolated results. As a long-time researcher in this area of so-called “value-added” measurement stated, “The output of an institution does not really tell us much about its educational impact or educational effectiveness in developing talent. Rather, outputs must always be evaluated in terms of inputs” (Astin 1991).
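The difference between judging a school by output alone and judging it by value added can be made concrete with a minimal sketch of the simple pre- and post-test system mentioned above. The school names and scores are invented for illustration, and the plain mean gain used here is an assumption chosen for simplicity, not a method the article or Astin prescribes.

```python
from statistics import mean

# Hypothetical pre- and post-test scores for the same students (invented data).
schools = {
    "School A": {"pre": [80, 85, 90], "post": [82, 86, 92]},
    "School B": {"pre": [50, 55, 60], "post": [62, 66, 70]},
}

def value_added(pre, post):
    """Mean gain per student: post-test score minus pre-test score."""
    return mean(after - before for before, after in zip(pre, post))

for name, scores in schools.items():
    output_only = mean(scores["post"])  # what a one-event test would report
    gain = value_added(scores["pre"], scores["post"])  # change effected over time
    print(f"{name}: mean post-test {output_only:.1f}, value added {gain:.1f}")
```

On output alone, the hypothetical School A looks stronger; on gain, School B has added more. A real system would also have to handle students who enter or leave between the two tests, which this sketch ignores.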

We can perhaps best appreciate the idea of “value added” (a term borrowed from economics) by casting the matter in moral language. Fairness and equity must be a part of any system of accountability. It is more fair to use a value-added model than an outcome-based model because good schools may be good only because of what already bright and capable students bring to the school. (That’s why the socioeconomic status of student bodies is often such a highly reliable indicator of achievement.)

If we really feel the need for comparisons, then we should at least compare apples to apples, not apples to oranges. What kind of accountability is it that compares economically depressed New York City schools with a 70 percent bilingual population and a 90 percent mobility rate to the best suburban schools in Scarsdale? We don’t even do that in athletics: that’s why we have different divisions and conferences in high school and college sports. That’s why South Carolina has experimented with placing schools in different “bands” based on the socioeconomic status of its population.

The new Kentucky system takes the concept to its logical extreme. Under the Kentucky Education Reform Act, traditional comparability is dropped altogether. Instead of being compared with each other, schools are compared to themselves, over time, against standards. Two factors improve accountability: (1) we make it possible for local communities to understand how their schools are doing, and (2) we make it far more likely that local faculties will consider it in their self-interest to improve the ongoing assessment and level of their performance, no matter how “bad” or how “good” the current results.

Such a plan can easily be implemented at the district level. Edmonton, Alberta, for instance, has a site-based decision-making system that requires each site team to set yearly performance goals based on identified local priority weaknesses. The Fairport schools in western New York have just begun an interesting experiment in which teams of teachers across grade levels will compete against other teams of teachers for modest bonuses. Competition is based on aggregate student performance using an array of assessments they are designing.

The main reason for thinking of accountability in such new ways is that oversight agencies cannot exert pressure on classroom-level conduct. The state neither can nor should demand improvements from each classroom; that is a local matter. Single test scores provide no useful leverage because they cannot be used in a timely, appropriate fashion. Most important is that conventional standardized test results cannot be used by the clients to influence schools: many psychometric tricks are at work in the design and scoring of the tests, so their meaning is never self-evident to anyone but the designers.

TOWARD A NATURAL MERITOCRACY

Most intelligent observers now see that free economic trade is superior to command economies with their complex quotas, tariffs, and price supports. School reform is still being viewed as a brand of Eastern European communism: we are mandating outputs (national goals, national tests, national standards) in rooms and buildings that remain isolated from one another, where there is little or no incentive to break one’s habits. We need incentives that unleash and reward entrepreneurial behavior in every school.

No doubt “school choice” is important. Perhaps those who need it most to effect change are teachers. If accountability requires responsiveness, then teachers must be allowed to band together to form programs and schools that adhere to their pedagogical aims and values, regardless of what the union or the faculty would choose to do. We must allow our best teachers to exercise greater freedom of alliance, program development, and authority. There can be no accountability in a school world where unions and management allow the perpetuation of the myth that all teachers are equally competent and that “we are all responsible” for children’s welfare. If this were to happen, no one would accept responsibility.

True accountability for schools, then, must reside in freedom and opportunity for entrepreneurial teachers to take reform ideas beyond their own classrooms. Even a “charter school” is not enough if teachers cannot gain increasing access to the system’s resources when they have a good idea that works. There must be opportunities and incentives in the system for non-risk-taking teachers to adopt such tactics and solve client problems because the job demands it, not because it’s a good idea or because “thoughtful professionals” do such things.

We need to build a performance-based meritocracy. Far too few opportunities exist for outstanding educators to be noticed and sought out (never mind properly rewarded with power and authority). We need an accountability system in which teachers feel obliged to know what the best teachers, programs, and schools do--a system in which “benchmarking” (to use an industry term) is a job requirement and in one’s enlightened self-interest. We need to create a new kind of capital whereby successful educators receive “investable” educational resources or the power to expand their efforts. The National Board for Professional Teaching Standards will help; we need to consider which similar market forces would increase leverage on individual schools to do what works rather than what suits faculty habits, union contracts, and school boards.

By contrast, consider the current paucity of promotion possibilities in public schools as a sign of the problem. Unlike independent schools and private colleges, where the jobs of academic dean and department head are significant and are used to reward good teachers, public schools either have no such jobs or they make them excessively bureaucratic. Everyone is a “professional.” No one feels the need to change habits fundamentally when protected by such fictions.

A prototypical case of the anti-merit nature of the world of K-12 schooling is the School Without Walls in Rochester, New York. The school has been lauded by the state as exemplary, it is a long-time member of the Coalition of Essential Schools, and it was publicly praised routinely by former superintendent Peter McWalters and Adam Urbanski (head of the local teacher union). Nevertheless, the school has no credibility with rank-and-file teachers in other Rochester high schools. Moreover, no local school leaders have felt obliged or been obliged to inquire into its success. It is simply dismissed as an “alternative” school, struggling to acquire funding for its increased student load. Such systemically tolerated cynicism makes accountability impossible.

Being “interested” in successful ideas is not the same as perceiving that they are in one’s interest. Why should a teacher or faculty persist with a process as cumbersome, complex, and fractious as school change without sustained incentives? In the current system, one’s “success” as an experimenter often earns the enmity of one’s peers and much more work--and often, more students with no concomitant increase in resources.

An accountability system requires incentives, not just mandates and threats. We must finally learn to escape the Puritanism that still haunts American schooling whereby we assume that people should want to do the right thing. We need incentives that cause the cynical, the lazy, or the diffident to realize that experimentation is more rewarding than inertia. (An incentive is what causes you to do something you probably would not do without the incentive, and persist with it, but in a way that does not make you perceive coercion.) Money can be used as a disincentive as well as an incentive, depending on how it is used to foster a climate of client-centered experimentation.

Here are some examples of school systems setting some incentives correctly, for students as well as adults, in ways that lead to better accountability:

• Cherry Creek, Colorado, sharply reduced the number of district-level subject heads and used the saved funds to support the most promising ideas in each subject with capital resources, a mini-sabbatical, and a secretary for the teachers whose proposals are accepted.

• In Edmonton, Alberta, the central office people are called “consultants” and control little of the school budget. The consultants are “hired” by each school and are rehired based only on performance. The evaluation process reverses the chain of command: the superintendent is evaluated by the cabinet, administrators are evaluated by teaching staff, and teachers by students.

Business Horizons / September-October 1993


• In Upper Arlington, Ohio, students in language arts are required to produce quality work to be promoted. After they place their best work in the district portfolio, it must be deemed good or excellent for promotion to occur. Quality is not an option, but a requirement for all.

• In all New York vocational high schools, each teacher has a “consultant committee” made up of people from the teacher’s field. These committees assist the teacher in evaluating the quality of the program, participate in the teaching and assessing process, and become involved in standard setting.

Why can’t policymakers and state departments of education be helped to see their role as establishing an incentive-driven rather than a mandate-driven system of education reforms?

TOWARD A REAL SYSTEM

To speak of “schools” and “districts” as “systems” is to voice a contradiction. Districts and schools are webs of profound and long-standing isolation: of teachers from teachers, teachers from their subjects, grade levels from other grade levels, schools from other schools, and the school district from its institutional clients (colleges and businesses). How can policymakers hope to “leverage” change by tests or any other means if the “mass” being moved crumbles into many isolated bits upon contact? Until we rethink these webs of habit-bound isolation, we cannot have a school “system” at all. Real accountability, then, would be impossible to conceive.

Testing will not reverse teachers’ immunity to formal responsibility for their later, long-lasting effects on children. Alienation from one’s effects means that school is whatever the adults who inhabit it say it is, and success is wherever they find it. This is the exact opposite of any robust meaning of “accountability.” Imposing one-shot, indirect tests will have no impact on these blind spots because the tests will lack credibility and diagnostic insight into the problem, be given when it is too late to do anything about them, and provide no guidance in the much harder work of establishing more intelligent lines of command and job descriptions.

In our wish to respect the dignity of school personnel and the nobility of their intentions, we resist acknowledging a politically incorrect but important truth. Many educators still do not understand their jobs, given the absence of an accountability system that would make them more concerned about the effects of their teaching than about their intent. Many teachers wrongly believe that their purpose is to “teach” what they know and like on a relatively fixed schedule, irrespective of whether their students learn anything. Far too many persist in resolutely moving through their syllabi even when the schedule clearly is not working or important ideas are not understood. They will tell you that they cannot “slow down” and that they “must” cover the content, yet it is not quite clear why; no one can usually cite a specific mandate, job description, or penalty that makes this more than myth. This argument doesn’t even make sense if the aggregate final achievement of all students is what we measure. Teachers are not now obligated, either by job description or direct pressures on the institution from other institutions, to really know how they are doing and to correct it when things go badly.

What John Dewey said seventy years ago on this subject remains true today. Too many teachers and administrators are willing to accept praise for their students’ success while offering numerous excuses for why failure is not their fault (family or psychological problems, outside job, or television):

We make a religion out of education, we profess unbounded faith in its possibilities. . . . But on the other hand, we assume in practice that no one is specifically responsible when bad outcomes show themselves. . . . [W]hen results are undesirable we shrug our shoulders and place the responsibility upon some intrinsic defect or outer chance. (Dewey 1922)

We cannot have it both ways: school is either having an effect or it isn’t. We must therefore measure that effect and assign responsibility better.

By contrast, professionals and coherent institutions are always adaptive and responsive; they are clear in their purpose and necessarily focused on their clients to be systematically and quickly attentive to the gaps between intent and effect. The members and the “structures” all bend to a common goal. Stanford football, in a real sense, is the actions and adjustments of Bill Walsh and his players, not Bill Walsh’s values or strategic philosophy. The Metropolitan Museum of Art, in the same way, is its displays and the revenue generated by those displays, not its vast collection and the tastes of its directors. And Apple


Computer is its computers, the niche it chooses, and the market share it earns, not the technical and design values of its executives.

School should be viewed as is any other learning organization. It should be defined by the quality of the products and performances produced by its “workers” (namely, the work of students) and by the ability of teacher-managers to adjust their approach, schedule, or use of resources fundamentally if performance does not meet the standard.

Though hard pressed on all sides by children and chores, teachers are isolated from obliging systemic feedback. The complaints of students, parents, the next grade level, and the next level of schooling can easily go unheard and unsolicited; responsiveness is not required. Where, for example, are teachers formally obligated to alter their lesson plans based on early-year failures by some of their students? How many faculties are required to find out what the alumni believe are the strengths and weaknesses of their former education, and to change in response to the complaints? How many schools meet regularly with schools and professions at the “next” level to determine the fit between their objectives and programs? Being formally bound by contract and job description to seek and respond to such feedback would compel everyone to push performance beyond the current overwhelming power of self-fulfilling prophecies based on habitual and shut-in expectations.

Standardized tests and “scales” were first imposed on teachers in the early twentieth century for reasons identical to those being offered today. Policymakers, then as now, mistrusted the local report card and transcript, often for sound reasons. (Because teachers are inclined, for noble reasons or self-interested reasons, to see the child’s performance in the best possible light, and because teachers grade in isolation from one another and the wider world’s standards, it becomes necessary to demand an “accounting” of the teachers’ accounting.)

STANDARDS

One judges the standards of an enterprise by the quality found in its daily work. We do not judge Xerox, the Boston Symphony, the Oakland Athletics, or Dom Perignon vineyards on the basis of indirect “indicators” or of single scores fixed by others. We judge quality on two essential, independent criteria: Is the daily work that we produce “standard-setting,” or at least “standard-upholding”? Is the customer always satisfied, and do we have the customers we want? The second criterion is connected with “standards” regarding quality control or consistency across individual performances: Is there minimal variance between the “best” and “worst” work by students and teachers?

We must be honest about this cry for test-driven accountability: schools are being treated hypocritically. Those who propose high-stakes, uniform audit-tests as the sole measure of school performance would never tolerate such reductionism or interference in their own affairs. Our best colleges are not subject to a similar call; we certainly did not hear state legislatures or Congress calling for standardized tests when the savings-and-loan scandal surfaced. School critics will not improve their workplace or define “quality” in their domains of policy making or politics by using standardized tests or imposed, generic indicators. Why should schools be different?

If we could be certain of adequate collaborative, disinterested scoring of student work at the local level, an oversight system to ensure that the standards and criteria being used were both apt and rigorously employed, and an appeal system that could be used by students, parents, and teachers at other levels to contest a grade, then it is very unlikely that more “testing” would provide any usefulness in an accountability system at all. As the CEO of a major service business said to Phillip Schlechty, president of the Center for Leadership in School Reform, we have made the terrible mistake of “handing over quality control to the accountants.”

One great mystery of the current debate about standards in education is the constantly heard claim by policymakers that national standards are lacking in education. What an odd and erroneous claim! We have the Advanced Placement program, achievement tests in every secondary subject linked to college demands, programs such as the International Baccalaureate, and such criterion-referenced tests as the National Assessment of Educational Progress. If academic reform is truly what we seek, what we lack is the will to apply this multitude of existing standards to all students. We lack policy and incentives, not standards, that would require mediocre schools and teachers to take notice of what good schools and teachers are accomplishing.

Standards are never the result of imposed standardization, as I have said before (Wiggins 1991). Standards, like good assessment, are contextual. The standards at Harvard have little to do with those at St. John’s College or the Juilliard School of Music; the standards at all our best private independent schools and colleges are determined by each faculty, not by mandate of policymakers. Standards relate to jobs done well by individuals as judged within a context of particular purpose and effect.

Standards are not fixed or generic: they vary with a performer’s aspirations and purposes. No one standard exists for secondary education because there are as many standards as there are colleges, professions, venues for performance, and aspirations. When good guidance is given to students, and schools are responsive to the demands of the wider world, there can be as many “standards” (with accountability) as there are students, programs, and career options.

To “raise” performance standards requires not standardization of expectation but heightened demands for quality work from each student in each course. Establishing reasonable criteria for graduation, from which more personalized judgments would be made about each student’s work (just as we do in graduate school), should not be compromised by some perceived need for a consensus on what “our standards” ought to be.

We do not need a single set of mandated high academic standards and the accompanying pressures to meet them. Recently I watched 16 students build a beautiful $250,000 house to contract and code at an area vocational high school. Nevertheless, all the students were academic “failures” in New York’s system and in terms of the national standards debate laid out by the governors and the President a few years ago. This makes little sense and causes much harm. We invert the proper relationship between accountability and education if we assume that everyone must have the same education, if what we are really doing is designing a system that makes it easier for us to hold schools and students accountable on the basis of comparability.

It is true that we use the word “standard” as if only a single excellence existed. That masks the fact that different criteria and contexts lead to different single excellences. The musicianship of Yo-Yo Ma and Wynton Marsalis sets a standard for other musicians; the fiction of Tom Wolfe and Mark Twain sets a standard for American writers. These different standards are educative and enticing. There is no one model of excellence; there are always a variety of exemplars to emulate. Someone who is excellent in one category, genre, or performance can be mediocre in another.

There is no possible generic test of whether student work is “up to standard.” The aptness of standards and criteria depends on the context and purpose of the assessment: the artist who brings work for a job interview, selects works to hang in a juried exhibition, or selects hangings for his or her own home uses different standards and criteria each time. The portfolio changes as the demands of each context change. Excellence is not a uniform correctness but the ability to unite personal style with mastery of a subject in a product or performance of one’s own design.

This crucial distinction becomes clearer when we use the word “standard” in the plural to describe character. A person with “standards” exhibits a passion for excellence and habitual attention to detail in all work done, even when a teacher’s or the state’s guidelines leave room for less. High standards, whether in people or institutions, are revealed through reliability, integrity, self-discipline, and craftsmanship as character.

Raising test scores will not raise standards. Raising standards locally (through changes in what kinds of work and conduct are tolerated) will raise test scores. This will require faculties to agree on local standards and benchmark their work against appropriate global standards linked to their many institutional customers. This can happen only when “teaching” is redefined to include “assessing” and when structures are created to compel teachers to talk to one another and agree about performance outcomes and how to assess them.

Let’s take the idea of personally upheld standards back to the classroom. Accountability begins with teachers not accepting work that is shoddy: it isn’t done until it’s done right. Consider the English teacher who instructs student peer editors to mark the place in a student paper where they lost interest or found it haphazard, and to return it for revision at that point; the paper isn’t done until all peer editors reach the end. Consider the school system that requires a language arts portfolio from each student in which the sample of work must be of pieces that earn a grade B or better. Consider the school system that uses an A/B/Incomplete grading system. Consider the new British system of curriculum and assessment in which students must earn at least a 6 (on a scale of 10) in every course to graduate, no matter how long it takes. Accountability reform can be as simple as requiring every faculty, team, or academic department to formulate similar policies to ensure that quality work is not an option.


STANDARDS VERSUS EXPECTATIONS

Many faculties have a difficult time accepting that they should “set standards” that some students cannot reach. This shows how far we are from understanding standards and accountability. Standards are always out of reach; that’s the point. The standards of performance and the standards of self-discipline in one’s work are always “ideals” for all but the world’s best performers in every field. I do not “expect” most people to meet the standards set by the best. My “expectation” is that everyone will strive to improve his or her work by studying what is best and working continuously to narrow the gap between the current level of performance and the ideal level of performance.

In industry, for example, the specification “zero defects per million” is commonly used. Such a lofty goal will never be met, but it serves as the appropriate benchmark for all work. Similarly, in writing, we should anchor our scoring systems not merely by the best work we happen to have in our possession but by the best work that is possible, even professional samples for some of our assigned work. (We do this in the performing arts and athletics without hesitation.) Otherwise, we have unwittingly reinvented norm-referenced testing.

“What about the student who can never come close to the standard?” This question summarizes the belief that continues to make education less than a profession. The only possible response to this pervasive question is: How can you know that after only six months with a student? What chance will the student ever have of approaching the standard if you don’t assess his or her work in terms of the standard? This pervasive and unsubstantiated fatalism about student performance should be the focus of all accountability efforts.

Assessment in terms of standards is not to be confused with grading in terms of expectations. Johnny’s work may be a 1 on a scale of 10, but that may be what we expect for his age and experience. The standard-referenced score should not be translated mechanically into a grade: think of the difference between beginning divers and world-class divers, who both receive a 4.5 on a dive. Our “grade” is likely to be quite different based on our expectations, and justly so (as long as we communicate effectively to student and parent the difference between standard-referenced and norm-referenced scores and grades).

We make a similar mistake when we confuse growth with progress, especially at the elementary level. Growth is measured in terms of change in the individual: how far has the student come? Progress, however, is measured “backward” from the standard: how much closer is Janie to the goal? It is possible to achieve much personal growth but little progress. Teachers do not serve students and parents well when they report results as growth and not progress; they invariably make it seem as if the student is close to meeting a valid standard, whereas the comment about growth implies only some progress, not the level reached.

It makes no sense, therefore, to set standards only for each grade level. That’s why the outcome-based movement is so sensible. A “standard” offers an objective ideal, serving as a worthy and tangible goal for everyone, even if, at this moment, for whatever reason, some cannot yet reach it.

Earlier in this article, I referred to John Dewey’s suggestion that educators are often insincere in their views about the impact of education: we testify to its power and importance, but we often take no responsibility for its failures. In that same article, Dewey wryly reported a comment made by a friend that the real measure of progress in education would be the possibility of “bringing suit at law to compel payment of damages by educators to educatees for malpractice.” We are not there yet, but the time is close.

At the very least, we should be able to do a better job of acting on the question that was posed to his teachers and administrators by the superintendent of the Jefferson County (Kentucky) Public Schools: What are you willing to guarantee? Until we are willing to ask and answer such questions, there can be no accountability.

One superintendent in New York State has tried to do just that. She argued that because we profess that “all children can learn,” it makes sense to expect that all students in her district should pass all the New York Regents Exams in every course. This assertion was greeted with howls of protest by the high school faculty, who pronounced it impossible. She then turned the question: What, then, was the faculty willing to set as a target percentage for the next year? After some harrumphing, discussion, and inquiry, the faculty set themselves the goal of a passing rate approximately 14 percent higher for the year than had been set in preceding years, and proceeded to meet the target.

Whether we call it “quality control” or “accountability,” the process involves a faculty’s public obligation to a set of standards, criteria, and performance targets that seem out of reach but are in fact reachable. What matters, then, is what the total quality movement calls “continuous progress” and the tangible and public commitment to be consistently better in reference to the most worthy indicators. If a faculty sets clear, public targets, if parents have a clear and present voice, and if students, former students, and institutional customers have a voice, then we will realize accountability that has the power to improve our schools.


References

Alexander W. Astin, Assessment for Excellence: The Philosophy and Practice of Assessment and Evaluation in Higher Education (New York: American Council on Education, Macmillan, 1991).

John Dewey, “Education as Religion,” in Jo Ann Boydston, ed., The Middle Works, 1899-1924, Vol. 13 (Carbondale, Ill.: Southern Illinois University Press, 1922).

A. Hirschman, Exit, Voice, and Loyalty: Responses to Decline in Firms, Organizations, and States (Cambridge, Mass.: Harvard University Press, 1970).

Tom Peters, Thriving on Chaos (New York: Harper and Row, 1987).

Grant Wiggins, “Standards, Not Standardization: Evoking Quality Student Work,” Educational Leadership, February 1991, pp. 18-25.

Grant Wiggins is the president of the Center on Learning, Assessment, and School Structure (CLASS), whose headquarters are in Geneseo, New York. This article has been adapted from Chapter 8 of Assessing Student Performance: Exploring the Purpose and Limits of Testing, to be published this fall by Jossey-Bass.