
Consensus and dissensus in mentor teachers' judgments of readiness to teach


Contents lists available at ScienceDirect

Teaching and Teacher Education 40 (2014) 10–21

journal homepage: www.elsevier.com/locate/tate

Mavis Haigh*, Fiona Ell
The University of Auckland, Auckland, New Zealand

Highlights

• Study focuses on judgment of teacher candidates’ readiness to teach.
• Judgments made against vignettes of teacher candidates’ practicum performance.
• Some consistency but also significant dissensus displayed by the mentor teachers.
• The variability predicted by SJT resides in both the judges and the context.
• Dissensus needs to be used productively to help make more reliable judgments.

Article info

Article history:
Received 30 July 2013
Received in revised form 19 November 2013
Accepted 6 January 2014

Keywords:
Teacher education practicum
Professional judgment

* Corresponding author. Tel.: +64 9 6238964.
E-mail addresses: [email protected], haight@ihug.co.nz (M. Haigh).

0742-051X/$ – see front matter © 2014 Elsevier Ltd. All rights reserved.
http://dx.doi.org/10.1016/j.tate.2014.01.001

Abstract

Deciding whether a teacher candidate is ready to teach is a significant judgment about which little is known. In this study, Social Judgment Theory’s lens model is used to analyse grade decisions made by 18 primary school mentor teachers who were provided with four vignettes of fictional teacher candidates’ practicum performance. Mentor teachers’ grade decisions, and their reasoning, showed evidence of some consistency but also significant dissensus. We argue that such dissensus is inevitable in complex social decision-making and therefore needs to be used productively to help make more reliable judgments.

© 2014 Elsevier Ltd. All rights reserved.

1. Introduction

The practicum plays a significant (Ziechner, 2010), if somewhat problematic (Grudnoff, 2011), role in initial teacher education (ITE) programmes, providing authentic opportunities for teacher candidates (TCs) to gain understandings of the professional practice of teaching in today’s diverse classrooms (Darling-Hammond, 2010). Practicum is also a key site for determining TC suitability for entry into the profession, with summary assessments of TC practice being a particularly important source of information for future employers (Cameron-Jones & O’Hara, 1994, p. 213).

However, while many teachers and policy makers paint glowing pictures of the value of practicum (Smith & Lev-Ari, 2005), assessment of TC learning/teaching during practicum is frequently positioned as being highly problematic (Darling-Hammond & Snyder, 2000). International researchers investigating the challenges of practicum-related assessment of TCs include Coll, Taylor, & Granger, 2002 (UK, NZ); Ciuffetelli-Parker & Volante, 2009 (Canada); Doerger & Dallmer, 2008 (USA); Goh, Wong, Choy, and Tan, 2009 (Asia); Ortlipp, 2006, 2009 (Australia); Rorrison, 2010 (Australia); Smith, 2010 (Norway); Ssentamu-Mamubiru, 2010 (Africa); and Tillema, Smith, & Lesham, 2011 (Israel). These researchers have taken up different aspects of practicum assessment, but none have dealt specifically with how decisions about readiness to teach are made.

The dual purposes of assessment during practicum – professional learning and professional accountability – have a confounding influence on how it is done. These dual purposes can engender tensions (Porter, Youngs, & Odden, 2001) for teacher candidates as they experiment with practice while meeting practicum requirements.

A key question is: what is being assessed on practicum and how? Criteria are often not transparent and assessment procedures may not be clearly articulated (Brooker, Muller, Mylonas, & Hansford, 1998; Haigh & Tuck, 1999). Any form of assessment involves judgments being made against some criterion or normative standard. The judgment must involve some implicit or explicit understanding of what constitutes quality teaching (Porter et al., 2001), which in itself is a contested construct (Cohen, 2010). In line with authorities in many countries (Moss, 2010), the New Zealand Teachers Council has promulgated Graduating Teacher Standards (NZTCGTS) to try to provide a normative standard. The standards, however, are couched in broad aspirational terms and do not lend themselves easily to direct, objective judgment, instead requiring a determined effort from all involved to reach shared understandings (Grudnoff, Tuck, & Hawe, 2005; Smith, 2010). Furthermore, what we label as standards are socially constructed in the larger narrative of economic, social and political issues in teacher education (Cochran-Smith, 2003) and attempts to understand assessment of TCs need to consider this broader social and political context, with its inherent power relationships and sources of potential inequality.

Fig. 1. The lens model employed in SJT in the context of TC practicum.

It follows that making judgments about complex performances such as teaching will be challenging. Approaches to assessment of teacher performance have been empirically investigated internationally (e.g., Hegender, 2010 (Sweden); Hill & Grossman, 2013 (United States); Johnson, 2013 (United Kingdom); Meeus, Van Petegem, & Engels, 2009 (Belgium); and Sedumedi & Mundalamo, 2012 (South Africa)). The difficulties found with assessing teacher performance extend to the assessment of TCs’ performance.

Determining whether a prospective teacher is capable of independent teaching involves both conscious and unconscious processes. Teachers make thousands of judgments each day in their everyday practice. Not all these judgments can be conscious and deliberate, which leads over time to the development of an ‘instinct’ about things that happen in a classroom. It is to be expected that mentor teachers’ assessment of TCs will also possess instinctive elements. But acting on these instinctive judgments can contribute to confusion for TCs if they do not understand the bases for their mentors’ decisions.

Assessment practices in New Zealand teacher education are framed within a high trust environment where ITE providers are approved by the New Zealand Teachers Council to graduate provisionally registered teachers who have met the NZTC Graduating Teacher Standards (NZTC, 2011). Reflecting this high trust environment and cognizant of the contingent and contextual nature of teaching practice (Hammerness, Darling-Hammond, & Bransford, 2005) there is no set system of assessment that providers must follow to reach this decision. Practicum outcomes and the NZTCGTS have provided statements to judge teaching against, but the reliability of such judgments across context and judges is sometimes questioned.

In NZ, early attempts to improve fairness for TCs provoked the movement to triadic assessment of the practicum (Turnbull, 1997). Such approaches were designed to be empowering for all, with everyone in the triad of TC, mentor teacher (MT) and university supervisor (US) being constructed as having agency (Giddons, 1984). TCs were encouraged to articulate their practice in a supportive professional context; MTs brought their understanding of the context of the judgment to the discussion and USs their wider ITE contextual understanding. However, the assumption that triadic assessment provides opportunity for the equal sharing of voices in the practicum assessment process is questionable, and many now believe that triadic assessment is problematic (e.g. Ortlipp, 2003).

This article reports on part of a larger study that investigated how teachers and faculty make judgments of readiness to teach and documented processes and strategies that enhance the authenticity and reliability of such judgments. This research was a partnership between our university and four primary schools. In an earlier paper we reported how we identified six dimensions of teaching that are commonly attended to by New Zealand mentor teachers and faculty academics when they are evaluating TC practice, and the evidence they seek around these dimensions (Haigh, Ell, & Mackisack, 2013). These six dimensions were used as the basis for the fictional vignettes in this study, and are described in Section 3.3.1. In the previous study we found general agreement about the dimensions, but differing views of their relative importance. Those making the judgments prioritised the different dimensions in idiosyncratic ways. In this paper we report on assessment decisions made by 18 primary school mentor teachers provided with four vignettes of TC practicum performance. We focus on how the mentors made their decisions in order to understand the consensus and dissensus that arose between them.

2. Theoretical framing

2.1. Methodological framework

Our study was framed by a critical realist ontology (Bhaskar, 1989) and a social constructivist epistemology (Vygotsky, 1978). Critical realism holds that it is possible to acquire knowledge about the human world through critical reflection about that which we perceive. Social constructivism is an integrated theory that brings together the social and the psychological aspects of knowledge construction. We believe that judging readiness to teach is a phenomenon that is socially constructed and fundamentally situated. Therefore, judgments must be understood in context. If we better understand the judgment process then we can improve how judgments are made.

2.2. Conceptual framework

2.2.1. Social Judgment Theory

The Lens Model inherent in Social Judgment Theory (SJT) (Hammond, Rohrbaugh, Mumpower, & Adelman, 1977) informs the design of this project. SJT recognises that “professional judgment is both an individual cognitive act and a socially situated practice” (Allal, 2013, p. 31) (Fig. 1).

SJT emphasises careful identification and analysis of the context of the judgment and the cues and policies (underlying constructs) used by judges. SJT suggests that there are stages within any investigation of human judgment. These are: conceptualise the judgment problem, understand the ecology (context), identify the cues and dimensions for judgment, determine a sample of cue profiles, sample participating judges, obtain judgments, capture individuals’ judgment policies, and compare these policies (Cooksey, 1996). In the first part of our study, reported in Haigh et al. (2013), we focused on the first three of these stages. In this paper we follow the final five of these stages to capture, interrogate and compare judgment policies used by the mentor teachers as they make judgments of TCs’ readiness to teach.


2.2.2. Consensus or dissensus?

When mentor teachers make judgments about a TC’s readiness to teach they are interpreting evidence and making decisions that can have significant implications for the TC’s career. Issues around the validity of their judgments arise as the MTs are making these assessments as “professionals working in complex, dynamic, and always partially unique educational environments” (Moss, Girard, & Haniford, 2006, p. 109). Yet the debate over whether we should strive to reach consensus (Habermas, 1996) around these decisions or value dissensus (Gadamer, 1994) continues.

Moss and Schutz (2001) ask: what level of agreement is it reasonable to expect? They critique Habermas’ notion that all actions are regulated by rational dialogue. In response, they “draw on hermeneutic philosophy to offer a more pluralistic approach that allows dissensus to be represented and taken into account in the assessment process” (p. 37). They question whether we can “legitimately cut standards free from the dialogic contexts in which they were created and use them to direct the action of relatively isolated judges” (p. 52). Apparently simple consensus reflected in a generalised standards statement cannot reflect the complexity of the judgment context. Indeed, Wyatt-Smith and Klenowski (2013) have challenged the assumption that simply promulgating official criteria and standards will lead to improved accountability and clarity of meaning/intent. Shared understanding is not easy to reach, they argue, nor perhaps should we strive for it. Moss and Schutz suggest that we have to shift the emphasis from consensus to understanding and learning from differences, arguing that if we aim single-mindedly for consensus we may mask diversity.

As an alternative to consensus-seeking discourse, Moss and Schutz (2001) suggest hermeneutic conversation (Gadamer, 1994). In a hermeneutic framework the gap between people as they strive for understanding is seen as productive. In the space created by the hermeneutic circle of reciprocal attempts at understanding, new knowledge and deeper understanding can emerge. Rather than seeing disagreement or misunderstanding as a failure, hermeneutics sees dissensus (distanciation) as a source of learning (Gallagher, 1992). Agreement is thus only one possible outcome of interaction (Moss & Schutz, 2001, p. 57). Valuing difference requires participants in the conversation to be confident their understanding is defensible even if others don’t agree with them, and able to recognise diversity and take opportunities to learn (Hoy, 1994). Moss and Schutz suggest that such learning could be supported by detailed contextualised examples of situations where the participants explore their potentially conflicting standards. Such hermeneutic dialogues are seen as potentially transformative and to be expected (Hoy, 1994; Warnke, 1994). Articulation of differences might help judges decide when a TC’s performance is problematic and should be discussed more widely. Moss and Schutz also argue that “by respectfully considering alternative perspectives we come to a more textured understanding our current perspective” (p. 63).

In summary, rather than striving for consensus it could be important to nurture dissensus in judgment-making as this can lead to critical evaluation of our taken-for-granted beliefs and practices. Embracing dissensus might enhance our opportunity to learn and improve our judgment making. We need to take care, however, that TCs are not caught in the middle of this complexity. The personal and professional stakes are high when judging readiness to teach. Can we accept, and use, dissensus as part of this complex process?

3. Research methods

Social Judgment Theory emphasises that decisions made in social contexts involve the coordination of multiple cues, many of which are interdependent and may be redundant. The ways in which judges use the available cues to make decisions are termed ‘cue utilisation validities’ in SJT (Cooksey, 1996). The cue utilisation validities are the judges’ attempts to understand the TCs’ performance. Dissensus amongst judges arises when cue utilisation validities differ.

Some of the variation that arises between judges’ decisions is due to factors in the judges (their experience of teaching, for example) and some of the variation is due to contextual factors (the class level they teach, for example). This combination of contextual and individual differences means that the same decision will not always be made by different people about the same TC.

Viewing the way in which available cues are perceived and balanced through an SJT framework shaped the data collection and analysis process. SJT suggests collecting rich information about judgment practice, which led us to use hour-long interviews rather than pencil and paper methods. Rather than asking about mentor teachers’ individual experiences of judging, we chose to present them with standard vignettes. This was in order to gather comparable information about judges’ cue utilisation validities and to help us pinpoint sources of variability in decision making which would otherwise be hard to see. To be consistent with SJT, however, the task needed to be authentic and close to real judging practice. Therefore we piloted the vignettes with colleagues and teachers not in the study to ensure that they seemed genuine and contained enough information.

Transcripts of the hour-long interviews form the data set for this study, and were analysed with the aim of understanding how the judges used the available cues to arrive at their judgments. We were interested in the dissensus that emerged between judges. The two sources of variability proposed by SJT were present in the decisions made about the vignettes – the mentor teachers brought their individual differences to the task, and the four vignettes contained factors that arose from the teaching context. Thus we used two analyses to consider the judgments: an analysis that focused on the decisions made by each person, looking at the individual differences that might impact decision making, and an analysis that focused on all the mentor teachers’ responses to each vignette. By considering the interview transcripts in two ways we hoped to clarify the mentor teachers’ cue utilisation validities, with the aim of better understanding consensus and dissensus in the process of judging readiness to teach.

3.1. Research context

3.1.1. Research sites

The research took place at four elementary schools in a city of one million people. The schools all served ethnically diverse communities, two in moderately high socio-economic areas and two in low socio-economic areas. One was a large school of 625 pupils, one had 240 pupils and the other two had approximately 350 pupils. The interviewer travelled to the schools to conduct the interviews.

3.1.2. Ethical considerations

Ethics approval for the study was obtained from our institution’s human participants’ ethics committee. All participants gave informed, written consent to participate.

3.2. Participants

In New Zealand, teacher candidates must undertake at least 14 weeks of practicum in a classroom setting. This is generally broken into blocks throughout their teacher preparation, with an extended period of full responsibility towards the end of their studies. Teacher candidates are assigned to an MT by the teacher education program administrators. The MTs are recommended by their school principals. MTs in New Zealand are responsible for guiding the teacher candidates, providing them with written and oral feedback and, on final practica, gradually allowing them to teach the class full time for three weeks. At the end of this time MTs write a full report on the TCs, and recommend that they pass or fail the practicum. They usually participate in a three-way conference with the TC and the university supervisor to discuss their view.

Fig. 2. The composition of the vignettes.

MTs from the four schools described in Section 3.1.1, who had mentored final year TCs on their last practicum in the two months before the study took place, were approached to participate in this study. All eligible mentor teachers in the four schools agreed to participate. Two were men, and sixteen were women. Experience in teaching was evenly distributed between five years and more than twenty years. Experience with mentoring had the same range, but ten participants had fewer than ten years’ mentoring experience. All of the MTs taught elementary school classes, ranging from Grade 1 to Grade 6. The MTs taught general education classes, containing children from diverse cultural and linguistic backgrounds.

3.3. Data collection strategies

3.3.1. Vignette task

SJT emphasises the importance of the ‘principle of representative design’ (Cooksey, 1996, p. 141) which calls for tasks to be authentic and embedded in real contexts when information about judgments is being gathered. These considerations informed the development of the vignette task in this study.

Four vignettes were developed for this study. Each was written as the type of summary statement that might be written on the formal evaluation of a TC’s practicum. The six dimensions and two domains that form the skeleton of the vignettes were derived from a previous empirical study (Haigh et al., 2013). The two domains were ‘professional practice’ and ‘personal attributes’. Within each domain were three dimensions that arose from participants’ contributions. In ‘professional practice’ there were knowledge and planning; enacting teaching and managing learning; and assessment and use of evidence. In ‘personal attributes’ there were relationships; learning as a teacher; and personal qualities. The three dimensions in professional practice described three iterative phases of teaching: preparing to teach (which included understanding the material to be taught), enacting teaching with students (including managing a safe classroom environment) and being able to use evidence to consider the effectiveness of teaching and where to take the students next. These dimensions were essentially the ones that described what the TC could do in the classroom. The three dimensions in personal attributes described aspects of a TC’s potential as a teacher. They were core elements of who the TCs were that could contribute to effective teaching. Relationships with children, staff and families, the TCs’ personal qualities such as enthusiasm, flexibility, resilience and warmth and TCs’ ability to learn themselves (for example, receiving feedback, reflecting on their practice and engaging with their practicum tasks) constituted the personal attributes domain. Fig. 2 shows how the vignettes were constructed using the skeleton of the dimensions and domains. Each dimension was allocated a level of achievement: high, moderate or low. These were combined to make a profile which was overall high in one domain (two high scores and a moderate score) and overall low in the other (two low scores and one moderate score).
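The construction rule for the profiles can be made concrete with a short sketch. The Python snippet below encodes the four profiles as they are listed in Table 1 and checks that each one is overall high in one domain (two high scores and one moderate) and overall low in the other (two low scores and one moderate). The dimension codes follow the abbreviations in Table 1; their assignment to domains is inferred from the description above, and the names `DOMAINS`, `PROFILES`, `domain_rating` and `summarise` are illustrative assumptions rather than anything used in the study.

```python
from collections import Counter

# Dimension codes as abbreviated in Table 1; the grouping into domains is
# inferred from the text (an assumption of this sketch).
DOMAINS = {
    "professional practice": ("KP", "EM", "AE"),
    "personal attributes": ("R", "L", "PQ"),
}

# Level allocated to each dimension in Vignettes A-D, transcribed from Table 1.
PROFILES = {
    "A": {"PQ": "high", "L": "high", "R": "med", "AE": "med", "KP": "low", "EM": "low"},
    "B": {"PQ": "high", "R": "high", "L": "med", "KP": "med", "EM": "low", "AE": "low"},
    "C": {"KP": "high", "AE": "high", "EM": "med", "L": "med", "PQ": "low", "R": "low"},
    "D": {"EM": "high", "AE": "high", "PQ": "med", "KP": "med", "L": "low", "R": "low"},
}


def domain_rating(profile, dimensions):
    """Return 'high' for two highs and one moderate, 'low' for two lows and one moderate."""
    counts = Counter(profile[d] for d in dimensions)
    if counts == Counter({"high": 2, "med": 1}):
        return "high"
    if counts == Counter({"low": 2, "med": 1}):
        return "low"
    raise ValueError(f"profile does not follow the construction rule: {dict(counts)}")


def summarise(profile):
    """Map each domain name to its overall rating for one vignette profile."""
    return {name: domain_rating(profile, dims) for name, dims in DOMAINS.items()}


for label, profile in PROFILES.items():
    print(label, summarise(profile))
```

Under these assumptions, Vignettes A and B come out high in personal attributes and low in professional practice, and Vignettes C and D the reverse.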

To construct the vignettes two specific aspects were commented on within each of the six dimensions. The aspects were addressed in the same order in each vignette. The aspects commented on in the vignettes are shown in Fig. 3. If the TC in the vignette scored highly on a dimension, both aspects of that dimension were described in very positive terms. If they received a moderate score, both were equivocal descriptions, and if they got a low score both aspects were described as problematic. Examples are given in Fig. 3.

An example of a complete vignette is given in Fig. 4 below.

Further evidence that the vignettes were credible came from the responses of the mentor teachers who made comments using a personal pronoun such as ‘at the end it says she was able to discuss with me’, or expressed feelings about the decisions such as ‘you feel really cruel’.

3.3.2. Task administration

The task was administered in a one-on-one interview by a trained interviewer who was not otherwise part of the research team. The hour-long interviews took place at the participants’ schools. The interviews were audio-recorded, and the recordings were transcribed verbatim by a professional transcriber. These eighteen transcriptions form the data set for this paper.

The mentor teachers received the vignettes several days prior to the interviews. In all but two cases the MTs came prepared for the discussion. The interviewer went through the vignettes, asking for a judgment about the TC: should they receive a fail, low pass, pass or high pass for their practicum, if this was the summary comment? The MTs were then asked to explain their judgment, and to discuss how they arrived at it.

3.4. Data analysis strategies

3.4.1. Determining the mentor teachers’ judgments

The transcripts were read by a research assistant to determine the mentor teachers’ judgments for each vignette. Altogether the mentor teachers made 72 separate judgments. In two cases the mentor teacher changed their mind during the discussion, and it was necessary to trace their process through to the end of the transcript to determine the final decision.

Fig. 3. The aspects used to construct the vignettes, with examples.

Fig. 4. A complete vignette.

Table 1
Range of grades assigned to candidates represented in the vignettes.

Vignette          A            B            C            D
High in           PQ and L     PQ and R     KP and AE    EM and AE
Med in            R and AE     L and KP     EM and L     PQ and KP
Low in            KP and EM    EM and AE    PQ and R     L and R
High pass (HP)    1            1            4            1
Pass (P)          8            4            8            9
Low pass (LP)     5            5            6            4
Fail (F)          4            8            0            4

Note. PQ = personal qualities; L = learning as a teacher; R = relationships; KP = knowledge and planning; EM = enacting teaching and managing learning; AE = assessment and use of evidence.

The mentor teachers’ descriptions of the cues they used and the policies they employed were carefully examined to look for themes that played out across the four decisions that they made (Braun & Clarke, 2006).

3.4.2. Looking across the decisions made about a vignette

The first analysis considered the eighteen decisions made about each of two vignettes. This analysis aimed to answer the question: How do mentor teachers’ cue utilisation validities differ in response to particular vignettes? Two vignettes with the full range of scores were chosen for this analysis.

3.4.3. Looking across a mentor teacher’s decisions

The second analysis looked at the data another way, by considering how individual mentors arrived at their decisions. Two mentor teachers were chosen for this analysis because they used high pass grades for some vignettes (which was unusual), and had opposite responses to the vignettes which described strong personal attributes or strong professional practice.

3.4.4. Establishing the reliability and validity of the analysis procedures

The analyses presented in this paper are essentially descriptive. The SJT framework calls for an in-depth understanding of the judgment ecology and judges’ processes in order to understand variability in the decisions people make. The purpose of the analysis is to provide a rich description of the cue use and consistencies/inconsistencies evident in the mentor teachers’ judgments. The descriptions provided in this paper must therefore be reliable in the sense that another reader of the transcripts would extract a similar description from the data, and valid in the sense that the description must represent the real-world experience of the mentor teachers.

Two researchers and a research assistant worked on the data set. An iterative coding process was employed, and the data was cut several ways in initial analyses (by dimension, by domain, by participant and by vignette). This ensured familiarity with the data and allowed the key ideas to emerge through discussion and justification. The analyses presented here were each developed by one of the researchers after this process, and then checked by the other researcher. We have included as many quotes from the participants as possible to illustrate the validity of our descriptions.

3.5. Limitations of the study

This study looks closely at the process of individual judgment making in the context of final practicum. It does not consider the issues of morality or power relationships involved in making these decisions, and does not place them in a wider social or political context. While these broader aspects are critical to a complete understanding of judging readiness to teach, this study gathered data about the details of forming judgments within an SJT framework rather than aiming to provide a full picture of judgment making in context.

The study is small, with only eighteen participants from four schools. With a larger participant group stronger patterns may have emerged and the variability present in these data may have proved artefactual. Although patterns may become clearer in a larger study, it remains true that at an individual level the judges rate the same TC in very different ways. TCs on practicum are judged by individual mentor teachers in the majority of cases, so individual variation is of significant interest, alongside patterns across judges.

The vignettes were carefully constructed and were regarded as an authentic task by the participants, but there is a distinction between deciding based on a brief description and deciding based on working with a TC in a classroom. The extent to which a paper-based discussion task can represent part of the real judgment process must be borne in mind when interpreting the results of this study.

The SJT-informed descriptive analyses in Section 4 are interpretive. The limitations of the framework we have used and our reading of the transcripts are present in the analyses. While we have agreement amongst the research team that these findings represent the data, other stories and interpretations are possible, particularly if different theoretical frames were employed.

4. Findings

The mentor teachers used various strategies when making judgment decisions. All identified the intended aspects of the vignettes that were strengths and weaknesses but they took different approaches to reach the final decision. Some started from the premise that the TC had failed the practicum then searched the vignette for instances that would challenge that decision. Others looked for the strengths first and then weighed these against identified weaknesses, reflecting on whether the positives were more important than the negatives. Still others considered the vignettes against the learning outcomes for the final practicum of the teacher education programme or against the NZTCGTS.

4.1. General findings

The range of grades given by the 18 mentor teachers is shown in Table 1:

Overall, 7 High Pass, 29 Pass, 20 Low Pass and 16 Fail judgments were made. These judgments do not yield an immediate pattern. For example, Vignettes B and C have contrasting profiles. The story within Vignette B indicated high ratings for the personal attributes of ‘Personal qualities’ and ‘Relationships’ but low for the professional practice dimensions of ‘Enacting teaching and management’ and ‘Assessment and use of evidence’. Vignette C was rated highly for the professional practice dimensions ‘Knowledge and planning’ and ‘Assessment and use of evidence’ but low for the personal attributes of ‘Personal qualities’ and ‘Relationships’. Nevertheless, both vignettes received high passes from some associates.

The range of grades used by the mentor teachers across the four vignettes was also variable. Two mentors graded across the range from High Pass to Fail when making the judgments, ten ranged across three possible positions, four across two possible positions and two teachers indicated a Pass judgment across all four vignettes.
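As a quick arithmetic check on the figures reported above, the following sketch (added for illustration; it is not part of the study's analysis) confirms that the grade tallies account for 18 mentors each judging four vignettes, and that the breakdown of grade ranges accounts for all 18 mentors. The variable names are illustrative assumptions; the counts are those stated in the text.

```python
# Consistency check on the tallies reported in Section 4.1.
# Variable names are illustrative; the counts are those stated in the text.
N_MENTORS = 18
N_VIGNETTES = 4

grade_counts = {"High Pass": 7, "Pass": 29, "Low Pass": 20, "Fail": 16}

# Number of mentors whose grades ranged across 4, 3, 2 or 1 positions.
mentors_by_range = {4: 2, 3: 10, 2: 4, 1: 2}

total_judgments = sum(grade_counts.values())
total_mentors = sum(mentors_by_range.values())

# Every mentor graded every vignette, so both totals should reconcile.
assert total_judgments == N_MENTORS * N_VIGNETTES
assert total_mentors == N_MENTORS

# Share of all judgments falling in each grade, as a percentage.
shares = {g: round(100 * n / total_judgments, 1) for g, n in grade_counts.items()}
print(shares)
```

On these figures, Pass and Low Pass together make up roughly two thirds of all judgments, with Fail given more than twice as often as High Pass.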

M. Haigh, F. Ell / Teaching and Teacher Education 40 (2014) 10–21

It was difficult to ascertain the influences underpinning the grade decisions of six of the eighteen mentor teachers who completed the task. Of the remaining 12 mentors, professional practice dimensions appear to be more influential for more mentors than the personal attribute dimensions. Eight mentors appeared to give significance to professional practice indicators (six strongly and two weakly) and four appeared to be most influenced by the personal attribute dimensions in the vignette (one strongly and three weakly). Those influenced by the personal attribute dimensions were largely focussing on the professional learning/reflection dimension. These findings are in contrast to the findings from an earlier stage of the larger study where Knowledge and Planning (a PP dimension) and Relationships (a PA dimension) appeared to be of equal potency (Haigh et al., 2013).

In order to better understand the policies and cues that the mentor teachers were using as they made these grading decisions we have interrogated the data more deeply by taking the two approaches to analysis described in Sections 3.4.3 and 3.4.4. To examine the apparent dissensus described above we looked at two contrasting vignettes, Vignettes A and D, and explored how all the mentor teachers responded to these TC descriptions, seeking to understand if there were particular elements of the vignettes that influenced mentor teachers’ judgments.

In seeking further explanation as to the source(s) of this dissensus we then considered the decision-making of two contrasting mentor teachers to see how the differences between the two arise, and what the consequences of these differences are for deciding if a TC is ready to teach. Of the two mentors selected for this closer consideration one appeared to be most strongly influenced by the personal attribute dimensions as described in the vignettes. The other appeared to be strongly influenced by the descriptions of professional practice.

4.2. Looking across the decisions made about a vignette

Table 1 shows the range of scores assigned to each vignette. Vignettes A, B and D received the full range of possible scores, from High Pass to Fail. In a real-world situation, the TCs in these three vignettes could have been failed or passed well depending on the mentor teacher with whom they were placed. Responses to Vignette C were slightly more consistent: all the mentor teachers passed the TC in Vignette C, though with different grades ranging from Low Pass to High Pass. There was considerable variability in the mentor teachers’ responses to all four vignettes. Vignettes A and D were chosen for this analysis because they were the inverse of each other: one higher in professional practice and one higher in personal attributes. Furthermore, Vignettes A and D provided a contrast in the TCs’ ability to reflect on their practice and receive feedback, which was a significant theme in the data. Table 2 shows how the mentors rated the two vignettes in comparison with each other.

A third of the mentor teachers rated both descriptions the same, despite their opposite profiles. Of the mentors who had two-score differences, all but one favoured Vignette A over Vignette D (PA over PP). Of mentors with one-score differences, five of the six favoured D over A (PP over PA). Thus there appeared to be considerable dissensus amongst these judges.

Table 2
Comparison of the scores given to Vignettes A and D.

                                          Number of associates    Score given
Same score for A and D                    6                       1 F, 1 LP, 4 P
One score difference between A and D      6                       4 LP-P, 1 P-F, 1 HP-LP
Two scores difference between A and D     6                       5 P-F, 1 HP-P

By comparing the mentor teachers’ responses we sought to answer the question: How do mentor teachers’ cue utilisation validities differ in response to particular vignettes?

4.2.1. Vignette D

Vignette D received 4 fails, 4 low passes, 9 passes and 1 high pass. Much of the discussion focussed on reflection and being open to advice. The centrality of reflection to being a professional teacher was emphasised by all the mentors, whether they passed the TC or failed them.

If she’s not going to reflect on her own teaching, that is a pretty big thing, especially if she’s not really wanting to discuss the lessons and talk about the lessons. That’s a huge part of being a teacher … if you are not going to do that, you are not really ever going to grow your own teacher practice (MT 8, fail)

You need to be able to talk about it and reflect on it and also accept the fact that you will be watched and your teaching will be discussed and that is part of accountability. (MT 13, pass)

All but one of the ten mentors who gave a pass or high pass to Vignette D invoked ‘the mentor’ as a reason why the TC had struggled to reflect or receive feedback. Vignette D’s strengths, particularly in relation to student learning, were seen as very positive and these mentors looked for reasons outside of the vignette for why reflection and receiving feedback might be problematic:

I think she might have low self esteem and finds difficulty to accept constructive feedback. I don’t know what the mentor might be like. Could be a mature student with a young mentor because I’ve encountered that as well. (MT 3)

It says that the mentor teacher hasn’t seen any personal or written reflections … I would have thought that the mentor teacher could have dealt with that early on in the practicum. (MT 18)

There was also evidence of mentors drawing on their own experience to decide that the TC in Vignette D would be successful. This was a factor in assigning the high pass score.

I would hope that when she starts teaching in her class which I feel she could manage that she will become more reflective … I was a bit like that 10 years ago. (MT 16)

She reminds me of someone else I have seen on practicum before … I think that is something that could be taught. (MT 1)

4.2.2. Vignette A

Vignette A received 4 fails, 5 low passes, 8 passes and 1 high pass. Again, discussion focused on reflection. The four mentors who failed this TC felt that their classroom management and content knowledge issues were indicators that the TC was not focused on children’s learning and this was too significant for reflection to overcome:

And one of the biggest things was that she needed to focus on the learning with the students which of course is our core business and what we’re about so if you can’t do that I feel that she was a fail. (MT 11)

The five mentors who assigned a low pass had similar concerns to mentors who failed the TC, but felt that there were some mitigating circumstances. For three of the five, this was the TC’s ability to reflect; for two there were queries as to whether this was the TC’s ‘preferred level’. Their comments implied that if it were not then deficits in content knowledge and management were not a problem:

And probably at this particular level that she was placed in is not (ideal) … we are either strong in the juniors or the seniors, and I think that’s a reality we need to face. Because me personally, I am a strong junior class teacher. I’m not in the seniors. (MT 4)

Is this A’s preferred teaching level because if A actually wants to teach new entrants, the content level for new entrants as opposed to Year 7 is worlds apart, so I know in theory we have to be able to teach all levels but I know that I don’t want to teach Year 7 and 8 and I know maybe A is perhaps not in her perfect place. (MT 18)

The remaining nine mentors gave a pass or high pass score to Vignette A. In all cases, the TC’s willingness to reflect overcame concerns about aspects of their practice:

She had a little bit of a problem with content knowledge. And the thing I sort of weighed up here is content knowledge can be learnt … but I think the things like being reflective and taking that feedback and using the data to drive the planning and seeing those next steps are not always the easy things to teach, and I think if you’ve got those things there, the content knowledge can be taught or you can find it out. (MT 8)

I think I’ll give her a pass because as a TC she’s reflective, she’s good on feedback. The main essence of a TC and a teacher is being reflective, acting on suggestions and all those kind of things and she’s got those. So one area she has to work on is the managing relationship kind of thing but she has got the essence of a teacher. (MT 6)

As with Vignette D, there was evidence that some mentors thought of their own experience and of themselves when assigning scores. This was a factor in the high pass decision, as it also was with Vignette D.

I’m just thinking about myself and when I was a TC and my mentor said to me that fact that I was reflective – much like A, like I put myself in her shoes – went a long way. The fact that she’s reflective. (MT 5)

4.2.3. Consensus and dissensus across vignettes

The mentor teachers essentially agreed that reflection was critical, but disagreed about whether it was learnable, or could redeem or be redeemed by, other aspects of practice. As SJT predicts, some of the variability in judgment was due to contextual factors, with mentor teachers adding information to the vignette and creating a context to account for the comments. Some of the variability also came from within the judges, with examples from their own experience as both teachers and mentors cited as reasons for making certain decisions.

Only one mentor teacher failed both TCs. He/she used the Graduating Teacher Standards (NZTC, 2007) to evaluate the vignettes, itemising which standards each TC did not meet. Other mentor teachers who failed these TCs stayed within the information provided in the vignette. Mentor teachers who passed these TCs seemed to add information to what was there, considering unknown factors such as the TC’s preferred level, or the actions of the fictional mentor. In both high pass decisions the mentor teachers used personal reference points to decide that the deficits could be overcome. The two core problems – not focussing on learning and not reflecting – were reasons to fail for some, or things that could be redeemed for others.

Fig. 5. The contrasting decisions of MT5 and MT15.

4.3. Looking across a mentor teacher’s decisions

The contrasting decisions made by MT5 and MT15 are shown in Fig. 5. MT 5 and MT 15 have been selected for this comparison because they highlight the contrasts that underpin differences found throughout the data set. MT 5 is an outlier in her strong emphasis on personal attributes, but she serves as a clear example of themes that emerge in those for whom personal attributes are important. Several mentor teachers could have served as a comparison to the personal attributes emphasis; MT15 used all four available grades, in contrast with others in this category, and thus was selected as potentially allowing us to identify nuances in the decision-making.

MT 5 and MT 15 both taught in schools within high socio-economic communities. One of these schools is a large, Year 1–6 primary school with a roll of over 600 students (MT 15). The other is a smaller Year 1–6 primary school with a roll of less than 300 and with a high proportion of English second-language learners (MT 5). Both schools had received positive Education Review Office (ERO) reports within two years of the study, with assurances that both schools had a strong focus on student learning. MT 5 taught a general class of 9–10 year-old children. MT 15 taught a general class of 8–9 year-old children.

MT 5 and MT 15 did not give the same grades to any of the candidates. While the final overall Pass/Fail would have been different for only one of these fictional candidates (B), interview comments made by the mentor teachers indicate that in a real-world situation the written reports for these candidates would have presented very different pictures of the candidates for future employers to consider.

4.3.1. Mentor teacher MT5

MT5 used three grades (HP, P, LP). She assigned HPs to Vignettes A and B where statements relating to personal attribute dimensions were more positive than those for professional practice. She assigned P and LP to Vignettes C and D where statements relating to professional practice dimensions were more positive than those for the personal attributes.

MT5’s comments for Vignette A indicate that she was strongly influenced by comments referring to the TC as a learner, having positive personal qualities, and being able to form strong relationships:

The fact that she is receptive to receiving feedback is a positive thing … I guess she acts upon feedback, which is a big plus … She’s flexible when in full responsibility for the class, which is another thing … Another thing, which is probably number one priority, is that she has really good relationships with the children in her class.

She noticed negative statements relating to A’s professional knowledge and planning and limited data collection but excused these as potentially learnable in the future, as she herself had done. Her overall decision appears to be based on the candidate’s enthusiasm for teaching, her good relationships with children and that she was receptive to feedback.

Comments for Vignette B again emphasised the candidate’s enthusiasm and excellent relationships with children. MT5 was concerned that the candidate was reflective only when encouraged, needed support for planning, had management difficulties, and struggled to engage children in ‘real’ work. However, she believed that the candidate’s ability to develop strong relationships with children outweighed these concerns, which could be addressed later. Although indicating she felt the candidate was weaker than A she again assigned a HP grade.

For Vignette C the negative of being distant was moderated by the strengths described for knowledge and planning and MT5 assigned a P. She also noticed that although the candidate was nervous about receiving feedback, nervousness probably indicated that she was receiving feedback seriously and was thus a positive feature. MT5 empathised with C here, saying:

I can relate to her [being nervous when] receiving feedback. I become very weak in the knees … when I am going to receive feedback.

Again she indicated that the candidate had potential, in this case suggesting that the candidate will be able to learn how to develop better professional relationships with children and colleagues.

For Vignette D, MT5 focussed on the perceived negatives of poor reflective skills and poor relationships, believing that these outweighed the positive statements about planning and children’s engagement. MT5 indicated a “reluctance” to pass this candidate but the positive statements in the professional practice area of enactment and management had led her to assign a LP grade instead of an F.

Overall this mentor teacher considered candidates’ potential as she made her grading decisions, a standpoint that she may use in her practice as a mentor since she also indicated that generally she has a reluctance to fail candidates she has been mentoring on practicum. Policies underpinning MT5’s decision-making appear to be a strong belief in the significance of reflective practice and other teacher-linked dispositions. She seems to regard these as capable of overcoming identified shortcomings in the area of professional practice. She often links TC potential to her own professional journey.

4.3.2. Mentor teacher MT15

Mentor teacher MT15 used all four grades. She assigned LP and F to Vignettes A and B respectively where the statements relating to personal attribute dimensions were more positive than those for professional practice dimensions. She assigned HP and P grades to Vignettes C and D, which are the inverse.

MT15’s comments for Vignette A indicate she was strongly influenced by statements that A had difficulty with content knowledge, planned poorly, set work that was too easy for her students, did not engage all children, and had difficulty managing rotations, indicating these were “quite important”. However, she did not fail A because she saw potential for growth by the candidate:

Originally I put fail, but when I revisited it, I put a low pass and this is why I changed my mind. She is reflective and good at receiving feedback and she acts on suggestions. So that leaves room for improvement. She is enthusiastic about teaching, which will motivate her to gain that content knowledge and overcome these difficulties.

MT15 considered fictional candidate B generally “incapable”. She noted B needed encouragement to reflect and act on suggestions, needed support with planning, struggled to maintain engagement and was not aware of whole class behaviour. Not collecting and using assessment data was also considered to be very negative. MT15 did identify that B was friendly, flexible and enthusiastic and had developed good relationships with children. However, these positive aspects were not enough to overcome failings within the professional practice domain. Overall, MT15 believed that B did not appear to be focussed on learning and she assigned a Fail grade.

In contrast MT15 assigned a HP grade to Vignette C, in response to highly or moderately positive statements across all the professional practice dimensions and moderate comments for the Learning as a Teacher dimension. The HP grade was assigned because C was reflective, focussed on children’s learning, had good content knowledge and developed appropriate and detailed planning, using data from careful assessment. Behaviour management and student engagement were noted as high. The mentor did notice some negatives regarding personality but she excused this by suggesting that C might be shy and just “give the impression of being unapproachable … [a concern because people] might form opinions that they are not good at their job because they are lacking that quality of showing emotion towards children”. MT15 felt that C must be quietly enthusiastic for teaching, given all the professional tasks she undertook. However, MT15 did comment that she was thinking of C as teaching at a higher primary age group and that maybe PQ and R would be more important with younger children.

Vignette D was assigned a P grade because of her flexibility and enthusiasm (“even if this is only for her subject strengths”) in concert with her good content knowledge, interesting management strategies and good assessment records. Overall, MT15 considered D was focussed on children’s learning. But she also noted D had “things to work on”, suggesting:

… where she needs to work on would be that she is over-confident and that stops her from listening to her mentor teacher’s feedback. I just thought the mentor teacher should start these discussions from the student achievement point of view, because she is willing to discuss student achievement and their needs, but she doesn’t like listening to what went wrong in the lesson.

In response to these concerns she dropped the assigned grade from HP to P, noting, “personality is important … and [D] might be a challenge to work with in a team”.


Overall, the policies underpinning MT15’s decisions appear to be a belief around the importance of TCs demonstrating strong professional practice rather than a focus on their personal attribute dimensions. However, she did use the indicators about personal attribute dimensions to moderate the grade decisions she made. She was also strongly influenced by concerns about what was not working well for the candidate when making her judgments.

4.3.3. Sources of dissensus between mentors

MT5 and MT15 provide a strong contrast in approach to judging readiness to teach. Dissenting grade allocation appears to arise primarily from three sources: different views about what are the most important dimensions of teaching; whether mentors focus on what they can see in TCs’ current practice or what they think might be possible in the future; and whether they believe key aspects of teaching to be ‘learnable’ or not. MT5 prioritised the personal attribute dimensions of teaching; MT15 prioritised the professional practice dimensions. Both, however, did acknowledge the candidates’ achievements in the other domain and used these to moderate their decisions. MT15 was somewhat less focussed on potential than MT5, tending to consider candidates’ abilities as demonstrated at a point in time, rather than what they might be able to do with mentor support in the future. Teaching level was a contextual consideration for MT15, but not MT5. MT5 referred to her own professional journey when making judgments; MT15 did not. MT 5 saw knowledge, planning and use of evidence as learnable, whereas MT 15 saw relationships and personal qualities as learnable.
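The contrast between the two mentors can also be expressed numerically. The sketch below is an illustration added here, not part of the study's method: it takes the grades reported for MT5 and MT15 in Sections 4.3.1 and 4.3.2 and places them on an ordinal scale. The numeric coding (F = 0 to HP = 3) and the function names are assumptions made for this example only; the study reports grades as categories.

```python
# Ordinal coding of the four grades (an assumption made for illustration;
# the study itself reports grades as categories, not numbers).
SCALE = {"F": 0, "LP": 1, "P": 2, "HP": 3}

# Grades as reported in Sections 4.3.1 and 4.3.2.
mt5 = {"A": "HP", "B": "HP", "C": "P", "D": "LP"}
mt15 = {"A": "LP", "B": "F", "C": "HP", "D": "P"}

def grade_gap(g1, g2):
    """Signed number of scale positions between two grades (g1 minus g2)."""
    return SCALE[g1] - SCALE[g2]

def passed(grade):
    """Any grade other than Fail counts as an overall pass."""
    return grade != "F"

# Positive gap: MT5 graded the vignette higher than MT15 did.
gaps = {v: grade_gap(mt5[v], mt15[v]) for v in mt5}
same_grade = [v for v, gap in gaps.items() if gap == 0]
pass_fail_differs = [v for v in mt5 if passed(mt5[v]) != passed(mt15[v])]
grades_used = {"MT5": len(set(mt5.values())), "MT15": len(set(mt15.values()))}

print(gaps)
print(same_grade, pass_fail_differs, grades_used)
```

Under this coding MT5 grades Vignettes A and B (stronger on personal attributes) higher than MT15 does, and Vignettes C and D (stronger on professional practice) lower, while the overall pass/fail outcome differs only for Vignette B, in line with the description in Section 4.3.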

5. Discussion

The judges’ explanations for their decisions gave us access to their ‘cue utilisation validities’ – how they used the cues they identified. The judges’ cue utilisation varied widely, as evidenced by the different approaches to considering ‘reflection’ in Section 4.2. This exemplifies Mumpower & Stewart’s (1996) assertion that the way judges weight different cues when arriving at their decisions is a major source of disagreement. Within the participant group some appeared to emphasise personal attribute dimensions, others professional practice dimensions; for others it was difficult to determine a preference. This results in differing cue utilisation validities, as exemplified in Section 4.3. There was, however, evidence that the mentors did not emphasise one thing to the exclusion of the other cues, with weaker cues being used to moderate their decisions.

Mentors differed in what they believed could be learnt when the TCs became beginning teachers. A significant number of the mentor teachers believed that content knowledge, management strategies and aspects of the use of evidence for planning can be learnt at a later date. Others, however, were sufficiently concerned to fail a candidate if these aspects of practice were not well demonstrated during the practicum. Still others were prepared to pass a candidate even if these aspects were somewhat lacking if the TC was described as good at receiving and acting upon feedback. Many mentors drew on their own prior experience, especially about their own practice as beginning teachers, and this became a source of variability between judges.

There were hints throughout the interviews that in general mentor teachers are reluctant to fail even imaginary TCs, describing themselves as ‘cruel’ or ‘horrible’ for suggesting fail grades. They suggest ways that deficits could be made up and they describe how they might help someone learn the things they do not know/do. The mentors were seeing opportunities for TC learning and backed themselves to be able to teach the TCs to be effective. These beliefs could be helpful to TCs. Mentors who see the TCs as capable of learning, and the dimensions as able to be taught, may be more likely to persist in helping the TCs improve their practice.

Overall, the judgment-making in this study was considered, careful and reasoned – and widely variable. These results suggest that, although there may be broad agreement about key areas for judging candidate teachers (Haigh et al., 2013), individual judges vary in their views about what is ‘essential’ and what can be ‘excused’ or ‘fixed later’. There was also some evidence of internal dissensus for individual mentors, leading to confusion around assessment of TC practice.

Moss and Schutz (2001) asked the question: what level of agreement is it reasonable to expect? Our findings have shown that an apparently simple consensus reflected in generalised graduating teacher standards statements does not reflect the complexity of the judgment context. We do need to shift the emphasis from ‘mandated consensus’, which masks difference, to understanding and learning from this difference. Agreement is only one possible outcome of an interaction when people engage in discussion with the aim of understanding others’ perspectives on, in this case, a TC’s readiness to teach. However, the discussants should be able to confidently defend their position even if others do not agree with them. They should also recognise and welcome diversity and be open to learn. Moss and Schutz suggested that such learning could be supported by detailed contextualised examples of situations where people hold potentially conflicting standards, for example, the stories of the decisions made by MT5 and MT15. Discussants could explore differences, consider others’ opinions, and see these as positive rather than negative. Articulation of differences might help judges decide when a TC performance is problematic and should be discussed more widely.

6. Implications for practice

Making overall judgments of TCs’ readiness to teach is a very complex and integrated process. As such it reflects the complexity of teaching and the challenges posed by all teacher evaluation systems (Hill & Grossman, 2013). With much rich data available to judges the challenges involved in ensuring acceptable reliability among judges also increase, a situation also encountered when portfolios are used to assess teaching (Schutz & Moss, 2004). The vignette study we report in this paper has demonstrated the idiosyncratic approach that mentors take to reaching decisions around TCs’ readiness to teach. We have shown that even where judges hold a shared vision of quality teaching they can ‘develop significantly different “stories”’ (Schutz & Moss, 2004, abstract) depending on their views of ‘essential’ or ‘not essential’ at this stage of a TC’s career and what is learnable or not.

There are implications from this variability for TCs and the profession at large. However, we argue that dissension between those charged with assessing TCs’ readiness to teach is not necessarily negative, and can be framed as potentially opening opportunities for professional growth if collaborative approaches to evaluation are taken, such as those suggested by Moss, Schutz, and Collins (1998). If we move to a situation where more than one person is involved in the assessment of a TC’s readiness to teach and if that decision is made over an extended period of time then there is likely to be a fairer decision for the TC and also rich opportunities for professional development for all concerned around notions of what makes an effective teacher. A further perspective would be offered by folding in the TC’s voice and evaluation of their teaching.

Historical triadic assessment processes were also predicated on the assumption that consensus would be reached by the triad. Our study indicates that this collaborative decision-making approach could be strengthened by including additional professionals such as the principal and other teaching team members and by building in opportunities for rich productive discussion that acknowledge and explore dissensus among the judges. Such discussions will explore the reasons for the different decisions and address sources of variability within judges, for example mentors’ prior experiences. Additionally, the assessment needs to take place over an extended period of time, allowing for early formative feedback and final summative decisions.

There are potential risks. Time and cost for assessment might both be extended. Lengthy documentation outlining agreement and dissensus may be required. The discussions might generate unresolvable controversy and challenge professional autonomy and personal and professional identity. The use of professional standards to develop a widely-shared view of good practice is often seen as a way to improve the reliability of decision making, and to shape educational reform. Our results suggest that providing standards with the aim of trying to eliminate dissensus in complex, socially-situated decisions may be less successful than approaches that embrace the variability of the judges. Continuing to aim for consensus-reached standards may risk losing creative, ‘different’, innovative teachers. As Moss and Schutz (2001) argue, we need to nurture dissensus as it can “guard against taken-for-granted beliefs and practices that might dominate our thinking” (p. 65). Acknowledgement of the value of dissensus within the ITE community can enhance our decision making, and the development of a diverse educational culture.

Acknowledgements

We acknowledge the contribution of all members of the TLRI research team to this vignette study, especially Lexie Grudnoff, Vivienne Mackisack and Helen Villers who joined us for early discussions of the findings.

References

Allal, L. (2013). Teachers’ professional judgement in assessment: a cognitive act anda socially situated practice. Assessment in Education: Principles, Policy & Practice,20(1), 20e34.

Bhaskar, R. (1989). Reclaiming reality: A critical introduction to contemporary phi-losophy. London: Verso.

Brooker, R., Muller, R., Mylonas, A., & Hansford, B. (1998). Improving the assessmentof practice teaching: a criteria and standards framework. Assessment andEvaluation in Higher Education, 23(1), 5e20.

Cameron-Jones, M., & O’Hara, P. (1994). What employers want to read about newteachers. Journal of Education for Teaching, 20(2), 203e214.

Ciuffetelli-Parker, D., & Volante, L. (2009). Responding to the challenges posed by summative teacher candidate evaluation: a collaborative self study of practicum supervision by faculty. Studying Teacher Education, 5(1), 33–44.

Cochran-Smith, M. (2003). The unforgiving complexity of teaching: avoiding simplicity in the age of accountability. Journal of Teacher Education, 54(1), 3–5.

Cohen, D. (2010). Teacher quality: an American educational dilemma. In M. Kennedy (Ed.), Teacher assessment and teacher quality (pp. 375–401). San Francisco: Jossey-Bass.

Coll, R., Taylor, N., & Granger, S. (2002). Assessment of work-based learning: some lessons from the teaching profession. Asia-Pacific Journal of Cooperative Education, 3(1), 5–12.

Cooksey, R. (1996). The methodology of social judgment theory. Thinking and Reasoning, 2(2), 141–174.

Darling-Hammond, L. (2010). Teacher education and the American future. Journal of Teacher Education, 61(1–2), 35–47.

Darling-Hammond, L., & Snyder, J. (2000). Authentic assessment of teaching in context. Teaching and Teacher Education, 16, 523–545.

Doerger, D., & Dallmer, D. (2008). Maintaining high standards in a pass/fail practicum. International Journal of Learning, 15(5), 173–177.

Gadamer, G. (1994). Truth and method (G. Barden & J. Cumming, Trans.). New York: Seabury (Original work published 1975).

Gallagher, S. (1992). Hermeneutics and education. New York: State University of New York Press.

Giddons, A. (1984). The constitution of society: Outline of the theory of structuration. California: University of California Press.

Goh, K., Wong, A., Choy, D., & Inn, J. (2009). Confidence levels after practicum experiences of student teachers in Singapore: an exploratory study. Journal of Educational Policy, 6(2), 121–140.

Grudnoff, L. (2011). Rethinking the practicum: limitations and possibilities. Asia-Pacific Journal of Teacher Education, 39(3), 223–234.

Grudnoff, L., Hawe, E., & Tuck, B. (2005). Effective teaching and standards for teaching: a loose coupling. New Zealand Annual Review of Education, 14, 95–109.

Habermas, J. (1996). Between facts and norms: Contributions to a discourse theory of law and democracy (W. Rehg, Trans.). Cambridge: MIT Press (Original work published in 1992).

Haigh, M., Ell, F., & Mackisack, V. (2013). Judging teacher candidates’ readiness to teach. Teaching and Teacher Education, 34, 1–11.

Haigh, M., & Tuck, B. (December, 1999). Assessing teacher candidates’ performance in the practicum. Melbourne, Australia: Paper presented at the Joint Conference of the Australian Association of Research in Education and the New Zealand Association of Research in Education.

Hammerness, K., Darling-Hammond, L., & Bransford, J. (2005). How teachers learn and develop. In L. Darling-Hammond, & J. Bransford (Eds.), Preparing teachers for a changing world: What teachers should learn and be able to do (pp. 88–125). San Francisco: Jossey-Bass.

Hammond, K., Rohrbaugh, J., Mumpower, J., & Adelman, L. (1977). Social judgment theory: applications in policy formation. In M. Kaplan, & S. Schwartz (Eds.), Human judgment and decision processes in applied settings (pp. 1–29). New York: Academic Press.

Hegender, H. (2010). The assessment of teacher candidates’ academic and professional knowledge in school-based teacher education. Scandinavian Journal of Educational Research, 54(2), 151–171.

Hill, H., & Grossman, P. (2013). Learning from teacher observations: challenges and opportunities posed by new teacher evaluation systems. Harvard Educational Review, 83(2), 371–384.

Hoy, D. (1994). Critical theory and critical history. In D. Hoy, & T. McCarthy (Eds.), Critical theory (pp. 101–214). Oxford: Blackwell.

Johnson, S. (2013). On the reliability of high stakes teacher assessment. Research Papers in Education, 28(1), 91–105.

Meeus, W., Van Petegem, P., & Engels, N. (2009). Validity and reliability of portfolio assessment in pre-service teacher education. Assessment and Evaluation in Higher Education, 34(4), 401–413.

Moss, P. (2010). Thinking systematically about assessment practice. In M. Kennedy (Ed.), Teacher assessment and teacher quality (pp. 355–374). San Francisco: Jossey-Bass.

Moss, P., Girard, B., & Haniford, L. (2006). Validity in educational assessment. In Special issue on rethinking learning: What counts as learning and what learning counts (2006): Vol. 30. Review of research in education (pp. 109–162).

Moss, P., & Schutz, A. (2001). Educational standards, assessment, and the search for consensus. American Educational Research Journal, 38(1), 37–70.

Moss, P., Schutz, A., & Collins, K. (1998). An integrative approach to portfolio evaluation for teacher licensure. Journal of Personnel Evaluation in Education, 12(2), 139–161.

Mumpower, J., & Stewart, T. (1996). Expert judgment and expert disagreement. Thinking and Reasoning, 2(2/3), 191–211.

New Zealand Teachers Council (NZTC). (2007). Graduating teacher standards: Aotearoa New Zealand. Retrieved from http://www.teacherscouncil.govt.nz.

New Zealand Teachers Council (NZTC). (2011). Approval, review and monitoring processes and requirements for initial teacher education programmes. Retrieved from http://www.teacherscouncil.govt.nz/content/initial-teacher-education-providers.

Ortlipp, M. (2003). The risk of voice in practicum assessment. Asia-Pacific Journal of Teacher Education, 31(3), 225–237.

Ortlipp, M. (2006). Equity issues in practicum assessment. Australian Journal of Early Childhood, 31(4), 40–48.

Ortlipp, M. (2009). Shaping conduct and bridling passions: governing practicum supervisors’ practice of assessment. Contemporary Issues in Early Childhood, 10(2), 156–167.

Porter, A., Youngs, P., & Odden, A. (2001). Advances in teacher assessments and their uses. In V. Richardson (Ed.), Handbook of research on teaching (4th ed.). Washington: American Educational Research Association.

Rorrison, D. (2010). Assessment of the practicum in teacher education: advocating for the student teacher and questioning the gatekeepers. Educational Studies, 36(5), 505–519.

Schutz, A., & Moss, P. (2004). “Reasonable” decisions in portfolio assessment: evaluating complex evidence of teaching. Educational Policy Analysis Archives, 12(33). Retrieved October 21, 2013 from http://epaa.asu.edu/epaa/v12n33.

Sedumedi, T., & Mundalamo, F. (2012). Understanding field assessment of pre-service teachers on school practicum. Africa Education Review, 9(S1), S73–S90.

Smith, K. (2010). Assessing the practicum in teacher education – do we want candidates and mentors to agree? Studies in Educational Evaluation, 36, 36–41.

Smith, K., & Lev-Ari, L. (2005). The place of the practicum in pre-service teacher education: the voice of the students. Asia Pacific Journal of Teacher Education, 33(3), 289–302.

Ssentamu-Mamubiru, P. (2010). Teaching practicum supervisors’ identity and student assessment on the practicum: an assorted mind-set? Africa Education Review, 2, 305–322.

Tillema, H., Smith, K., & Lesham, S. (2011). Dual roles – conflicting purposes: a comparative study on perceptions on assessment in mentoring relations during practicum. European Journal of Teacher Education, 34(2), 139–159.

Turnbull, M. (1997). Assessment in the early childhood practicum: A triadic process. ACE papers. Auckland, New Zealand: Auckland College of Education.

Vygotsky, L. (1978). Mind in society: The development of higher psychological processes. Cambridge, Mass.: Harvard University Press.

Warnke, G. (1994). Justice and interpretation. Cambridge: MIT Press.

Wyatt-Smith, C., & Klenowski, V. (2013). Explicit, latent and meta-criteria: types of criteria at play in professional judgement practice. Assessment in Education: Principles, Policy & Practice, 20(1), 35–52.

Ziechner, K. (2010). Rethinking the connections between campus course and field experiences in college- and university-based teacher education. Journal of Teacher Education, 22, 22–31.