EAQUALS CEFR Standardisation Pack Brian North Eurocentres Foundation June 2006


EAQUALS CEFR Standardisation Pack

Brian North Eurocentres Foundation

June 2006


EAQUALS CEFR Standardisation Pack © EAQUALS 2006 Page 2 of 15

STANDARDISATION ACTIVITIES – AN INTRODUCTION

This pack covers three standardisation sessions. Each session should be seen as a two- or three-hour activity, run in two periods of 60-90 minutes each:

Session 1 Standardisation training with calibrated video samples (1)
Session 2 Standardisation training with calibrated video samples (2)
Session 3 Standardisation training with written samples

Once key members of staff have been trained in this way, simplified training in shorter sessions can be undertaken.

Standardisation training with videos is the most effective and most enjoyable way to achieve a common interpretation of the CEFR levels. People can interpret the written word (the descriptors) in different ways; some people are just stricter than others. However, discussing concrete examples of performances in relation to common criteria, supported by detailed documentation that explains why a performance is at one particular level, is a very effective way of counteracting this. It is logical to do standardisation training with videos before using scripts because everyone can watch and then discuss the same video performance.

The Standardisation Samples

Videos illustrating the CEFR levels are currently available for English and for French on DVD, with a DVD for German expected early 2006. DVDs are also planned for Italian and Spanish. A CD of written scripts calibrated to the CEFR levels is expected to become available in 2006.

The illustrative videos are being published to accompany the preliminary pilot version of the Manual for examinations providers.1 The English and French DVDs show pairs of learners filmed with a static camera and without a teacher/examiner interlocutor in order to give the learners a platform on which to show their best.2 This approach reflects the learning-centred philosophy of the European Language Portfolio.

The use of interaction between learners was introduced to communicative language testing by the examination originally developed by Keith Morrow and Robin Davis, which is now administered as the Cambridge English Language Skills: CELS (Cambridge ESOL 2003). The selection of topics by the candidates for longer pieces of spoken production is also reflected in the recent Irish Test of Interactive English.3

The performances on the DVDs for English and French have been calibrated to the CEFR both through consensus discussion and through mathematical scaling that takes account of differences of severity between examiners when they make independent judgements. Relevant reports are available from www.coe.int/lang.4 The performances on the cassette for English from Cambridge ESOL show candidates from standardisation videos for the Cambridge main suite of exams. As described in Section 1, these exams have been aligned to the CEFR levels.5

Standardisation Papers

The CEFR criteria, and the worksheets that are used in these standardisation workshops, are listed below. These are provided separately for English, French and German. The “criteria” papers are for permanent reference; the “worksheet” materials are versions of these for use only in training sessions.

1 Council of Europe 2003: Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEF). Preliminary Pilot Version of a Proposed Manual. Strasbourg, Council of Europe, Languages Policy Division, DGIV/EDU/LANG (2003) 5. September 2003. Figueras, N., North, B., Takala, S., Verhelst, N. and Van Avermaet, P. (2005): Relating Examinations to the Common European Framework: a Manual. Language Testing, 22, 3, 1-19.

2 There are arguments for and against the use of an examiner/interlocutor in oral assessment. Most learners react in a very motivated way when given a platform to speak without being dominated by a native-speaker examiner. During the recordings for the French DVD, the candidates were also recorded doing the DELF/DALF tasks for the new version of the exams to be released for 2006. These activities are conducted with a native-speaker examiner and almost invariably involve the candidate explaining the information from a “texte déclencheur” to the examiner, who then probes with follow-up questions. In one case a (Serbian, male) candidate performed noticeably better on a DALF task of this type than in the more relaxed atmosphere with a colleague as interlocutor. On the DALF sample he was (after statistical analysis) placed at the top of the C1 range, and on the production/interaction sample he was placed at the bottom of the same C1 range. In another case a (Chinese, female) candidate performed far better on the informal task, in which she managed the interaction in a very sophisticated manner, showed some C1 features and was calibrated at B2+, whereas she was only B2 on the formal DALF task, in which she felt at a disadvantage to the interviewer. (The interviewer was Sylvie Lepage, to whom she then chatted happily over lunch, and there was no doubt that her better performance on the production/interaction task was more representative than that on the DALF task.) With all the other 21 candidates, the differences in performance between the formal task (with interviewer) and the informal production/interaction task (with colleague) were less marked.

3 TIE: Test of Interactive English (The Advisory Council for English Language Schools 2003).

4 Lepage, S. and North, B. 2005: Seminar to calibrate examples of spoken performances in line with the scales of the Common European Framework of Reference for Languages, CIEP, Sèvres, 2 - 4 December 2004, Strasbourg, Council of Europe, Languages Policy Division, DGIV/EDU/LANG (2005) 1.

Global CEFR Scales

o Global scale – expanded with “salient features (spoken language)”
  EAQUALS CEFR Reference Sheets 1 & 2
  CEFR Exam Manual Table 2.1
  Table 1 glossed with summary of CEFR Section 3.6

Oral Criteria (Appendix 1)

o Global oral assessment scale
  CEFR Exam Manual Table 5.4. A radically simplified version of the CEFR Interaction scale and the CEFR Table 3 grid.

o Oral assessment criteria grid
  CEFR Table 3, CEFR Exam Manual Table 5.5. Defines “Range,” “Accuracy,” “Fluency,” “Interaction,” “Coherence” at the 6 CEFR levels.

o Supplementary criteria grid: Plus Levels
  Defines Table 3 categories at A2.2, B1.2, B2.2

o Supplementary CEFR scales
  CEFR Chapter 4 scales for “Overall Spoken Interaction” and “Sustained Monologue: describing experience”, and Chapter 5 scale: “Phonological Control”

Oral Worksheets (Appendix 2)

o Global scale – expanded with “salient features (spoken language)”
  Large print version ready to be cut up

o Oral assessment criteria grid
  A3 version with 6 missing cells

o Oral assessment criteria grid
  The 6 missing cells

o Rating form
  CEFR Manual Form B2: Analytic Rating Form

o Oral assessment criteria grid
  CEFR Table 3 presented as separate scales, large print, ready to be cut up

Written Criteria (Appendix 3)

o Written assessment criteria grid
  CEFR Exam Manual Table 5.8. “Range,” “Accuracy” and “Coherence” from CEFR Table 3, plus “Overall Written Production,” “Description” and “Argument”

Lepage, S. and North, B. 2005: Séminaire pour le calibrage des productions orales par rapport aux échelles du Cadre européen commun de référence pour les langues. CIEP, Sèvres, 2 - 4 décembre 2004. Strasbourg, Conseil de l’Europe, Division des Politiques Linguistiques, DGIV/EDU/LANG (2005) 1.

5 Cambridge ESOL examinations tend to have a “pass level” for the oral that is slightly below the general pass level of the exam: i.e. they are a little lenient in oral exams in relation to the overall standard – and the CEFR level concerned. On the Cambridge ESOL video of illustrative samples, there is a very borderline FCE candidate. Three Cambridge examiners, rating B2 samples for the “Swiss” CEFR illustrative samples at a benchmarking event held in Zurich in 2003, agreed that this candidate may not quite meet the CEFR standard as defined in CEFR Table 3. This does not mean FCE is not B2. One has to expect “borderline problems” in aligning exams with well-established standards to the CEFR. This fact about pass standards in Cambridge oral exams first emerged in the English-speaking Framework project in the late 1980s. In the ESU project, two juries, each of 15-20 people, rated samples from different exams and mean ratings were used to situate the exams concerned on a scale. With Cambridge exams, the effect of statistically moderating results from subjectively marked tests makes the standard for the exam as a whole slightly higher than the mid grade for the speaking and writing components (which the ESU team had interpreted as a pass in the exam). The effect was therefore to underestimate the level of the Cambridge exams in the published results, so Cambridge withdrew from the project.


Organising Standardisation Training

Session 2 is intended as a follow-up a couple of weeks after Session 1. It uses different materials for the Familiarisation Phase. In Session 1 you may wish to work only with the six main CEFR levels, leaving the introduction of the so-called “Plus Levels” until Session 2. However, it might be a good idea to have copies of the criteria grid for “Plus Levels” photocopied for Session 1 as well, in case their introduction becomes unavoidable.

General recommendations

1. It is vital that the training takes place in a logical order:

o Familiarisation with the levels and the criteria (a reminder for those familiar with them)
o Illustration, with one or two samples shown and discussed in plenary in relation to the documentation provided
o Practice in individual rating:
  - individual rating
  - pair / small group rating
  - whole group discussion

The focus in the first activity (Session 1) is on Familiarisation and Illustration. One may well not reach the third phase – individual rating – at all. The focus in the second activity (Session 2) is on practice at rating (individual, pair/group, plenary discussion). In Session 2, the Familiarisation phase and training serve to establish a link back to the consensus built in Session 1 and should take only between one third and one half of the session.

2. It is vital that participants build a common interpretation of the CEFR levels from general to detailed:

o Holistic understanding of the 6 CEFR Levels (A1, A2, B1, B2, C1, C2), and the kinds of things learners can do at each level
o A more detailed understanding of the 6 CEFR Levels, discussing key criteria for different categories of qualitative language use
o Introduction of the “Plus Levels”

3. Standardisation training is not democratic. Its aim is to help participants to discover how their interpretation of the CEFR levels matches or differs from the common interpretation and, if necessary, to move closer to that common framework. This is not as difficult as it sounds. People’s views are only half formed at the beginning. The aim is to give them enjoyable training and an experience of success in arriving with colleagues at a consensus that matches the general consensus. Therefore it is very important that the trainer avoids exposing individuals at the beginning (e.g. by saying: “Everyone who thinks she is A2 put your hands up.”). Either allow participants to be anonymous or let people submerge their personal opinion into that of their group until they feel confident that they are getting it right.

4. It should be remembered that asking participants to identify the level of an already standardised sample is an “exercise with a right answer.” This answer is not initially given; it is only released at a later stage by the trainer. The group is not being invited to form their own consensus on the level of the sample irrespective of previous evidence. Rather, they are invited to arrive at the correct answer by applying the criteria. The trainer needs skill and sensitivity to steer the group towards the right answer, and also to avoid publicly exposing those participants who are too strict or too lenient in their interpretation before they have had a chance to tune in with the training. There are two different approaches:

a. Start with general discussion and then afterwards, when rating individually, tell participants to do so anonymously under a pseudonym such as “Mickey Mouse,” “Micro J” etc. The rating forms are then passed to the trainer after each individual “vote” without comment. The trainer then swiftly collates the results onto an overhead projector or flip chart. This shows participants whether their interpretation differs from that of the others, but it does not identify them in public, so they avoid losing face.

b. Use interactive discussion in groups. Here the consensus is more conscious, as a result of argument. The disadvantage is that the result may be swayed by an articulate speaker. Working in pairs or small groups and justifying their position with the criteria is something participants find very enjoyable. The trainer can circulate, listen in to the discussion and where necessary steer a group in the right direction. At a certain point he/she can ask for a report back from a member of each group. The main advantage of group or pair work is that it naturally forces the participants to use the defined criteria to justify their judgements.

In b, something to beware of is the possibility of participants “cherry-picking” words and phrases out of the criteria in order to defend (rationalise) a position they have already taken up independently. Trainers need to be aware that some teachers react in this way. The descriptor in each cell of the criteria grid (e.g. “Range” at B1) should be used holistically: when doing the rating, participants should ask themselves: “Does this description of ‘range’ best fit the performance I am seeing?” If this kind of problem occurs, the trainer should intervene to support those participants in the discussion who are using the criteria holistically to help form their judgements, and who are not misusing them selectively to rationalise an extreme position.
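
The collation step in approach a can be sketched in code. This is only an illustration, assuming a trainer who types the anonymous forms into a small script instead of tallying by hand; the function names and the sample ratings are hypothetical and are not part of the pack. Only the six main levels are accepted here, so “Plus Levels” would need to be added to the list if they are in use.

```python
from collections import Counter

# The six main CEFR levels, in ascending order.
CEFR_LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def collate_ratings(forms):
    """Tally anonymous rating forms into a count per level.

    `forms` maps a pseudonym to the level written on that form.
    Only counts are returned, so no participant is identified.
    """
    unknown = [level for level in forms.values() if level not in CEFR_LEVELS]
    if unknown:
        raise ValueError(f"unrecognised level(s): {unknown}")
    counts = Counter(forms.values())
    return {level: counts.get(level, 0) for level in CEFR_LEVELS}


def format_tally(tally):
    """Render the tally as one line per level, ready for a flip chart."""
    return "\n".join(f"{level}: {'x' * n} ({n})" for level, n in tally.items())


# Hypothetical forms collected after one individual "vote".
forms = {"Mickey Mouse": "B1", "Micro J": "B1", "Rater 3": "B2", "Rater 4": "A2"}
tally = collate_ratings(forms)
print(format_tally(tally))
```

Because the display shows only how many forms fell on each level, a participant can see whether their own rating is an outlier without anyone else knowing whose form it was.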


SESSION 1

Activity: Standardisation training with calibrated video samples (1)

Materials:
o CEFR DVD/videos (English, French, German, Italian)
o Documentation for DVD/video samples
o Cut-up version of “Salient Characteristics”
o Gap-fill version of CEFR Table 3 criteria grid
o CEFR Table 3 Criteria grid
o CEFR scales for Interaction & Production
o Rating form

Steps:
1. Familiarisation tasks
- Sort “salient characteristics” into order
- Complete Gap-fill version of CEFR Table 3 Criteria grid
- Discuss criteria

2. Training
- Show a sample, plenary discussion of level in relation to criteria, distribute documentation for that sample
- Show a sample, pair/group discussion of level in relation to criteria, plenary discussion, distribute documentation for that sample

Participants should be given CEFR Table 1 (Global scale), Table 2 (Portfolio grid) and Table 3 (Oral assessment grid) in advance and asked to refresh their memories. They should be told that Table 3 provides the criteria for the session.

The activity will take 2-3 hours. It is probably best run in two periods of 60-90 minutes, with a break in the middle after the Familiarisation Phase and the viewing of one or two video extracts.

Explicit guidance is given for trainers below, as it is intended to support a director of studies undertaking this type of activity for the first time. If you already have experience of running examiner standardisation sessions, you may wish to follow a slightly different sequence.

Step 1: Familiarisation: sort “salient characteristics” into order

A. Start by ensuring that the participants have a good overview of the CEFR levels. Many people think they have such a familiarity, but they have probably never studied the descriptions of levels. The best overview is the description of the salient characteristics of the different levels that is given in CEFR Section 3.6 (EAQUALS Reference Sheet 2). This is summarised in the version of CEFR Table 1 (Global Scale) that is given in the Appendices. A version prepared for this activity is the first piece of material in the “Oral Workshop Worksheets.”

Material: CEFR Table 1 “Global Scale” expanded with principal characteristics from CEFR Section 3.6 (CEFR Manual Table 2.2) (EAQUALS Reference Sheet 2).
Function: Familiarisation
Adaptation:
- levels not mentioned in text
- enlarged print
- plus levels separated out on last page
- ready to be cut up

B. Give pairs or small groups of participants chopped up versions of this description. Ask them to sort the pieces of paper into the correct rank order so as to clearly identify the 6 CEFR levels A1-C2. It usually takes about 20 minutes.

Preparation
- Photocopy CEFR Table 1 for groups of 3-4
- Separate the plus level page
- Cut up the 6 main levels and fix sets together with a paper clip
- Cut up the “plus levels” and fix together with a staple


Use
- Explain this is the official description of the principal characteristics of oral performance at the CEFR levels, taken from CEFR Section 3.6.
- Give out the sets of main levels; groups sort into order and discuss
- Give out plus levels; groups study them and insert them in the right places
- General discussion

Step 2: Familiarisation: complete a gap-fill version of CEFR Table 3 criteria grid

Participants have now tuned in to the CEFR levels, and to the idea that the levels are defined by descriptors that act as criteria to guide judgements. Next they need to be familiarised with the categories and descriptors in the assessment grid (CEFR Table 3), which is recommended for the standardisation training6. Table 3 is a grid of five categories defined at 6 levels, so there are 30 descriptor cells, each of which acts as a criterion. 30 cells is not an excessive number (the Eurocentres grid used regularly for 15 years has 4 categories at 10 levels = 40 cells). However, a document made up of 30 cells is difficult to read in a linear way. Participants need a way in, which this activity provides. The activity itself is very easy and takes 10-20 minutes. What is interesting is the discussion it generates about the criteria.

Material: CEFR Table 3 “Oral Assessment Criteria” (appendix 1)
Function: Familiarisation
Adaptation:
- some cells emptied
- missing descriptors supplied on second page

Preparation
- Photocopy the worksheets for groups of 3-4 onto A3
- Cut out the “missing descriptors” on the second page (again photocopied onto A3); fix sets together with paper clips.

Use
- Explain that these are the official oral assessment criteria for the CEFR levels, CEFR Table 3; explain that they are put together from descriptors in Chapter 5 of the CEFR, sometimes by merging scales there. The grid has 6 empty cells.
- Tell groups they must put the six missing criteria (squares of paper in the paperclip) in the right places. They have to decide what category and level each of the 6 descriptor squares describes.
- General discussion

Step 3: Familiarisation: Discuss criteria

The above discussion turns naturally into a discussion about the criteria. At this point give out the criteria to be used whilst watching the videos:

6 CEFR Table 3 is constructed with descriptors in CEFR Chapter 5. Some columns on it (Accuracy, Fluency, Coherence) are copied more or less verbatim from Chapter 5 scales. The “Range” column includes some descriptor elements from the CEFR “Flexibility” scale. The “Interaction” column combines the scales for Interaction Strategies. Pronunciation was excluded because the grid was designed for use in contexts in which assessors might assess candidates with mother tongues they are not used to. In such contexts, assessment of pronunciation can be inconsistent, and negative impressions of pronunciation can over-influence the end result. A descriptor scale for phonological control is in fact provided in the CEFR and is available as a supplementary scale for these standardisation sessions.


Materials to give out at this point (see appendix 1)

o Global scale – expanded with “salient features (spoken language)” (EAQUALS Reference Sheet 2)
  CEFR Exam Manual Table 2.1: version on 2 pages for participants

o Oral assessment criteria grid (see appendix 1)
  CEFR Table 3, CEFR Exam Manual Table 5.5. Defines “Range,” “Accuracy,” “Fluency,” “Interaction,” “Coherence” at the 6 CEFR levels.

o Supplementary CEFR scales
  CEFR Chapter 4 scales for “Overall Spoken Interaction” and “Sustained Monologue: describing experience”, and CEFR Chapter 5 scale: “Phonological Control”

Notes

a. Some of the criterion descriptors may contradict certain preconceptions held dear by some participants (personal “rules of thumb”). If this happens, it should engender a useful discussion about the nature of a Common European Framework. The descriptors were not just written. They were scientifically calibrated to levels on the basis of the way in which some 250 teachers used them to rate performances of some 2500 learners. They have now been widely accepted. The documentation for the DVDs demonstrates how they relate to real performances. They are not perfect, but they are good – and they represent our common framework. They take precedence over people’s personal opinions about what, for example, being B2 should mean.

b. A question that is sure to come up is how to apply the criteria. What happens if someone is B2 for three criteria and B1 for the other two? Are they B2 or B1? An animated discussion here may force an introduction of the “Plus Levels” in this first session. However, the answer depends on the assessment approach in a given context. One approach is to say that a learner matching four criteria at B1 and one criterion at B2 is level B1 – and strong in one aspect. A learner matching four criteria at B2, with one at B1, is B2 – and weak in one aspect. When there is a 3/2 split, grading is more complex, and the solution is to consult the more global criteria – the supplementary scales for “Overall Spoken Interaction” and “Sustained Monologue: describing experience.” In the end a considered judgement has to be made: on balance, is this person at level B1 or B2? Have they fulfilled the requirements for B2? If not, they are still B1. This becomes clearer when following the recommended assessment procedure in practice.

c. Another question may concern phonology, which is not included in the main assessment grid. With DVD samples of mixed mother tongues, if participants are only used to hearing learners sharing one first language, phonological aspects can be a distraction, and it may be best not to try to rate them at this stage. When the approach is implemented in the school, one option is to replace the “Coherence” column on the oral assessment grid with the scale for Phonological Control in CEFR Section 5.2.1.4. This is also supplied with the Supplementary Scales.
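
The rule of thumb in note b for reading a profile across the five categories can be written down as a short sketch. This is a hedged illustration of the “one approach” mentioned there, not an official scoring rule: the function name and return format are hypothetical, and anything other than a flat or 4/1 profile is deliberately left to the rater, since the pack says such cases need the global scales and a considered judgement.

```python
from collections import Counter

# The six main CEFR levels, in ascending order.
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def provisional_level(profile):
    """Read a five-category profile using the 4/1 rule of thumb.

    `profile` maps each CEFR Table 3 category to the level awarded.
    Returns (level, note). `level` is None whenever the profile is
    more mixed than 4/1, because that case is left to the rater.
    """
    counts = Counter(profile.values())
    level, n = counts.most_common(1)[0]
    if n == len(profile):
        return level, "flat profile"
    if n == len(profile) - 1:
        # Exactly one category sits at a different level.
        category, other = next(
            (c, lvl) for c, lvl in profile.items() if lvl != level
        )
        stronger = LEVELS.index(other) > LEVELS.index(level)
        return level, f"{'strong' if stronger else 'weak'} in {category}"
    return None, "consult the global scales and make a considered judgement"


profile = {
    "Range": "B1",
    "Accuracy": "B1",
    "Fluency": "B1",
    "Interaction": "B2",
    "Coherence": "B1",
}
print(provisional_level(profile))
```

The sketch stops short of resolving a 3/2 split on purpose: in that case the question “Have they fulfilled the requirements for B2?” has to be answered by a person reading the supplementary scales.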

Step 4: Training: Illustration 1

Select a sample:
- that is quite short;
- that shows main CEFR levels rather than “plus levels;”
- in which the learners shown have relatively “flat” profiles across the criteria (i.e. avoid a sample in which one candidate has one area that is very weak and at a different level);
- that is near the middle of the CEFR levels.

Ideally one wants a speaker who is, for example, B1 (or A2 or B2) on all the categories: Range, Accuracy, Fluency, Coherence, Interaction.

Additional materials needed
Documentation for sample chosen
  Explains level of sample with reference to oral assessment grid


A. Give participants the following guidelines:
- Tell them that they will first view a sample and discuss it, initially in groups and then all together.
- Tell them you have documentation explaining the level of the learners and that you will give them this after the discussion.
- Tell them that you will play the “Production” phases for both candidates and then stop so that they can discuss in groups, and that you will then repeat the “Production Phase” and let the recording run on to the “Interaction Phase”.
- Tell them to have ready:
  o the oral assessment criteria grid
  o the supplementary scales: “Overall Spoken Interaction” and “Sustained Monologue: describing experience”
- Tell them to glance at these during the recording.

B. Play the two Production Phases.

C. Invite participants to discuss the performance with neighbours or in small groups. Circulate while they do this so as to identify which groups are getting it right and which people in each group are taking a lead and interpreting the criteria correctly.

D. After at most 10 minutes, bring everyone together, and elicit from one of the groups getting it right:
- the level of the candidates;
- the way in which the performance illustrates the level described on the CEFR Table 3 (Manual Table 5.5) grid.

E. Support this opinion with comments of your own and invite other people to give feedback. Let this discussion go on for 5-10 minutes.

F. Now replay the whole extract.

G. Invite the groups to discuss again. After 5-10 minutes, whilst they are still talking, distribute the documentation for the sample.

H. In the general discussion that follows, point out to the participants the description for the level concerned on both:
- the Global Scale – expanded with “salient features (spoken language);”
- the Supplementary Scales for “Overall Spoken Interaction” and “Sustained Monologue: describing experience.”

I. Remind the participants that the Oral Assessment Grid is a summary of relevant CEFR scales. The more global criteria (above) are equally valid: they should not lose sight of the wood for the trees.

Step 5: Training: Illustrations 2 & 3

This step introduces a rating procedure that helps participants make considered, holistic judgements. At the same time it will give participants points of reference at the top and bottom of the scale of levels, to add to the point near the middle of the scale that has now been established.

Additional materials needed
Rating form (Appendix 2, Worksheet 3)
  For recording initial impression, analysis with criteria, final judgement
Documentation for samples chosen
  Explains level of sample with reference to oral assessment grid
Supplementary criteria grid: Plus Levels (in reserve)
  Supplementary rating criteria

A. Explain that you are now going to view another extract and follow the full procedure this time.

B. Give out the Rating Form and explain the Rating Procedure.


Note

One problem with oral assessment is raters who assess in relation to personal criteria that they have developed independently. It therefore helps to have a phase of the assessment procedure in which the assessor consciously reads the criteria to check whether their view of the performance is in fact justified by the criteria. The procedure recommended in Table 14 has been used for over a decade in Eurocentres7 and is recommended in the Manual for relating examinations to the CEFR8:

Assessment Procedure, with Instructions during this Training Activity

1. Impression
   Procedure: Write down the overall impression of the global level of the candidate that you have after about 5 minutes.
   Instructions: While viewing, after 4-5 minutes, write a single level – your overall, initial impression – in the space at the top of the rating form.

2. Analysis
   Procedure: Consciously read the descriptors for that level across the assessment grid. If you confirm that the candidate does meet the criterion description for a category at that level, look at the level above in that same category to see if they are even better than that. Write a result for each assessment category (Range, Accuracy, Fluency, Interaction, Coherence if using CEFR Table 3).
   Instructions: While viewing, after marking that initial judgement, consciously read the descriptors across the assessment grid for that level, the level above and the level below. After viewing, read the criteria closely and mark your decision for each category on the form in the space provided.

3. Judgement
   Procedure: Compare your analysis result to your original impression and make a considered judgement.
   Instructions: Consult the CEFR scales for “Overall Spoken Interaction” and “Overall Spoken Production.” Write your final decision at the bottom of the form in the space provided.

C. Now show a whole sample for A1 or A2.

D. Invite participants to discuss the performance with neighbours or in small groups. Circulate while they do this so as to identify which groups are getting it right and which people in each group are taking a lead and interpreting the criteria correctly.

E. After at most 10 minutes, bring everyone together, and elicit from one of the groups getting it right:
- the level of the candidates;
- the way in which the performance illustrates the level described on the CEFR Table 3 grid.
Support this opinion with comments of your own and invite other people to give feedback. Let this discussion go on for 5-10 minutes.

F. Repeat the process with a sample for C2 or C1.

Conclude by inviting feedback on the session.
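
The three stages of the assessment procedure (impression, analysis, judgement) can be pictured as a simple record that a rater fills in stage by stage. The sketch below is an illustration only: the class and method names are hypothetical, it assumes the five CEFR Table 3 categories, and it is not a reproduction of the Manual's rating form. It does not compute the final judgement; it only shows where the analysis departs from the first impression, which is the comparison the rater is asked to make.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# The five categories of the CEFR Table 3 oral assessment grid.
CATEGORIES = ("Range", "Accuracy", "Fluency", "Interaction", "Coherence")


@dataclass
class RatingForm:
    impression: str                                           # stage 1
    analysis: Dict[str, str] = field(default_factory=dict)    # stage 2
    judgement: Optional[str] = None                           # stage 3

    def record_analysis(self, category: str, level: str) -> None:
        """Stage 2: note the level decided for one category."""
        if category not in CATEGORIES:
            raise ValueError(f"unknown category: {category}")
        self.analysis[category] = level

    def missing_categories(self) -> List[str]:
        """Categories that have not yet been analysed."""
        return [c for c in CATEGORIES if c not in self.analysis]

    def disagreements(self) -> List[str]:
        """Categories where the analysis differs from the impression."""
        return [c for c in CATEGORIES
                if c in self.analysis and self.analysis[c] != self.impression]


# Hypothetical use: impression B1, then a category-by-category analysis.
form = RatingForm(impression="B1")
for category, level in [("Range", "B1"), ("Accuracy", "B1"), ("Fluency", "B2"),
                        ("Interaction", "B1"), ("Coherence", "B1")]:
    form.record_analysis(category, level)
print(form.missing_categories(), form.disagreements())
form.judgement = "B1"  # stage 3 remains the rater's own considered decision
```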

7 North, B. (1991): Standardisation of continuous assessment grades. In Alderson, J. C. and North, B. (eds.): Language Testing in the 1990s: Modern English Publications/British Council, London, Macmillan: 167-177. North, B. 1993: L'évaluation collective dans les Eurocentres. In Evaluations et Certifications en Langue Etrangère, numéro spécial, Le Français dans le Monde - Recherches et Applications, août-septembre 1993: 69-81.

8 Council of Europe 2003: Relating Language Examinations to the Common European Framework of Reference for Languages: Learning, Teaching, Assessment (CEF). Preliminary Pilot Version of a Proposed Manual. Strasbourg, Council of Europe, Languages Policy Division, DGIV/EDU/LANG (2003) 5. September 2003.


SESSION 2

Activity: Standardisation training with calibrated video samples (2)

Materials:
o CEFR DVD/videos (English, French, German, Italian)
o Documentation for DVD/video samples
o CEFR Table 3 Criteria grid cut up for sorting task (Appendix 2, worksheet 1a)
o CEFR Global Oral Assessment Scale
o CEFR Table 3 Criteria grid (Appendix 2, worksheet 2a)
o CEFR Supplementary grid with “Plus Levels” (worksheet 1b)
o CEFR scales for Interaction & Production
o Rating form (worksheet 4)

Steps:
1. Familiarisation
- Sort a scale from CEFR Table 3 into order
- Discuss criteria
2. Training
- Show a sample used last time, plenary discussion of level in relation to criteria
- Show a new sample, individual rating of level in relation to criteria, pair/group discussion, plenary discussion, distribute documentation for that sample
- Repeat.

In this Activity the three familiarisation activities should be undertaken quickly (in 30-45 minutes) in order to allow more time for the main training.

Step 1: Familiarisation: Sort a scale from CEFR Table 3 Oral Assessment Grid

This step is intended to remind participants that they really do have to read the criteria. It should take about 10 minutes. Give small groups scales to sort into the correct order. There are two ways to do this:
a. Give each group a different scale
b. Give each group two contrasting scales
- Range and Accuracy
- Accuracy and Fluency
- Accuracy and Interaction

The latter option is more challenging than it sounds.

Material: CEFR Table 3 “Oral Assessment Criteria” (Appendix 1 worksheet 3a)
Function: Familiarisation for a second training session using Table 3
Adaptation: Scales presented separately

Step 2: Familiarisation: Introduce the “Plus Levels”

A. Hand out the Supplementary Grid with the “plus levels” (worksheet 2) and allow participants to read it and compare it to the main grid. These descriptors come from the same project that developed the CEFR descriptors and the CEFR Tables 1, 2 and 3 (see footnote 9).

9 North, B. (2000): The Development of a common framework scale of language proficiency. New York, Peter Lang. North, B. and Schneider, G. (1998): Scaling Descriptors for Language Proficiency Scales. In: Language Testing 15/2: 217–262.


B. Point out that this time you are going to use the “plus levels” when rating. A “plus level” shows a performance that, while it has not yet reached the next CEFR level, is significantly better than a performance at the criterion level. It often shows features of the next level in the process of acquisition.

C. Remind participants that the main features of the “plus levels” were defined in the “salient features” on the table they were given last time. For language schools, the “Plus levels” offer criteria for finer discrimination in the assessment and certification of progress, which is valuable and motivating to learners. The plus levels can be characterised as follows:

Level A2+ (A2.2) is noticeable for:
• active participation in conversation, given some assistance and certain limitations: understand enough to manage simple, routine exchanges without undue effort; make him/herself understood and exchange ideas and information on familiar topics in predictable everyday situations, provided the other person helps if necessary AND
• significantly more ability to sustain monologues: give an extended description of everyday aspects of his environment e.g. people, places, a job or study experience; describe past activities and personal experiences; explain what he/she likes or dislikes about something.

Level B1+ (B1.2) has the same two main features that were noticeable at B1 (maintain interaction; cope flexibly with problems) plus a focus on:
• exchange of quantities of information: provide concrete information required in an interview/consultation (e.g. describe symptoms to a doctor) but does so with limited precision; summarise and give his or her opinion about a short story, article, talk, discussion, interview, or documentary and answer further questions of detail; exchange accumulated factual information on familiar routine and non-routine matters within his field with some confidence.

Level B2+ (B2.2) has a continuation of the focus on (a) argument, (b) effective social discourse and (c) on language awareness which appears at B2. However, the focus on argument and social discourse can now also be interpreted as a new focus on discourse skills:
• in conversational management (co-operating strategies): give feedback on and follow up statements and inferences by other speakers and so help the development of the discussion; relate own contribution skilfully to those of other speakers.
• in relation to coherence/cohesion in production: use a variety of linking words efficiently to mark clearly the relationships between ideas; develop an argument systematically with appropriate highlighting of significant points, and relevant supporting detail.

• at this band there is a concentration of descriptors on negotiating.

Step 3: Familiarisation: Introduce the “Global Oral Assessment Scale”

If as a school you wish to proceed to using CEFR levels for placement interviews, then this is a good opportunity to introduce the global assessment scale used for that purpose. It is in any case a good idea to use a “global scale” to guide the first phase of the assessment procedure (“Initial Impression”).

Step 4: Familiarisation

Replay the Interaction Phase (that shows both candidates) of one of the extracts used last time. Use the documentation to remind participants of the reasons why these learners were at those levels. Replay just the beginning of the Interaction Phases of the other samples used last time, and quickly remind the participants what levels those learners were. Point out to the participants that these samples they have viewed – from the top, middle and bottom of the scale of levels – give them reference points for the discussion in this activity.

Schneider, G. and North, B. (2000): Fremdsprachen können - was heisst das? Skalen zur Beschreibung, Beurteilung und Selbsteinschätzung der fremdsprachlichen Kommunikationsfähigkeit. Chur/Zürich.


Step 5: Training: Standardisation Practice

This session should be planned to give participants the feeling that they are being given plenty of time for discussion. Plan 45 minutes for each sample you intend to use – and have another couple in reserve in case things go more quickly. Try to ensure that between the last session and this session you have covered all 6 CEFR levels.

In this session it is a good idea to deliberately choose some samples that show a more uneven profile, e.g. when the speaker is B1 for some categories but B2 or at least B1+ for others. Unless the issue of “uneven profiles” is discussed in the training, it may become a complication later on.

During this phase, the aim is to set up a routine in which participants follow a standard procedure:

o Practice at individual rating:
- individual rating
- pair / small group rating
- whole group discussion

o Making a considered, balanced judgement:
- initial impression (with global oral assessment scale if used)
- analysis with detailed criteria
- final decision, with reference to global criteria (Overall Interaction, Describing Experience)

A. It is important that participants make and write down an individual decision before they discuss with neighbours in a group. The simplest way to collate results at this point is for the participants to pass pieces of paper to the trainer.

B. As the participants start their small group discussion, the trainer collates the independent judgements made before discussion onto the whiteboard or an overhead projector transparency.

                     A1   A2   B1   B2   C1   C2
Before Discussion
After Discussion

C. During the discussion, the trainer should circulate, monitoring the discussion, helping participants to see whether they are still tending to be too strict or too lenient, without embarrassing individuals.

D. After 15-20 minutes at most, the trainer asks for everyone to “vote” again and collates the results of the decisions after discussion. The results should now be closer together.

E. The trainer then guides the formation of a consensus in plenary discussion, with reference to the criteria and to the documentation for the sample concerned.

F. Finally, the trainer gives out the documentation for each sample after consensus is reached.
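
Trainers who prefer to collate the votes on a laptop rather than on paper could use something like the following minimal Python sketch. It is only an illustration and is not part of the pack: the function names (tally, spread, grid) and the votes are hypothetical, and it assumes each participant gives a single vote on the six main levels, leaving the “plus levels” aside. It counts the votes per level before and after discussion and reports how many levels separate the lowest and highest vote, so the trainer can see whether the results have moved closer together.

```python
from collections import Counter

# The six main CEFR levels, in ascending order (plus levels are ignored here).
LEVELS = ["A1", "A2", "B1", "B2", "C1", "C2"]


def tally(votes):
    """Count how many participants chose each CEFR level."""
    counts = Counter(votes)
    unknown = set(counts) - set(LEVELS)
    if unknown:
        raise ValueError(f"Not a CEFR level: {sorted(unknown)}")
    return [counts.get(level, 0) for level in LEVELS]


def spread(votes):
    """Number of level steps between the lowest and the highest vote."""
    positions = [LEVELS.index(vote) for vote in votes]
    return max(positions) - min(positions)


def grid(before, after):
    """Lay out both tallies as rows, like the whiteboard table."""
    rows = [
        ("", LEVELS),
        ("Before Discussion", tally(before)),
        ("After Discussion", tally(after)),
    ]
    lines = []
    for label, cells in rows:
        lines.append(f"{label:<18}" + "".join(f"{str(cell):>4}" for cell in cells))
    return "\n".join(lines)


# Hypothetical votes from eight participants for one sample.
before = ["A2", "B1", "B1", "B1", "B2", "B2", "B1", "A2"]
after = ["B1", "B1", "B1", "B1", "B1", "B2", "B1", "B1"]

print(grid(before, after))
print("Spread before:", spread(before), "after:", spread(after))
```

A smaller spread after discussion suggests the group is converging; the plenary discussion with reference to the criteria and the documentation remains the step that settles the level.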


SESSION 3

Activity: Standardisation training with calibrated written samples

Materials:
o CEFR illustrative writing samples for English, French, German (Italian)
o CEFR Writing Assessment Grid (appendix 3)

Steps:
1. Familiarisation
- Distribute CEFR Written Assessment grid
- Point out that it shares “Range,” “Accuracy” and “Coherence” with CEFR Table 3 that they are familiar with.
- Discuss criteria
2. Training
- Give out sample, individual rating of level in relation to criteria, pair/group discussion, plenary discussion, distribute documentation
- Repeat.

In this session, the aim is to transfer the routine established in viewing and rating video samples to written samples. Provided standardisation activities with the DVDs have been carried out, participants will by this stage have a good feel for the CEFR levels. Therefore one could commence with rating practice, without the illustration phase used with the videos. It is in any case more difficult to conduct an illustration phase with writing samples unless a computer beamer is available to project them.

One must present samples in a random order and, as in the first session, ensure that participants get reference points with samples in the middle, at the top and at the bottom of the scale of levels.

Again, the emphasis is on:

o Practice at individual rating:
- individual rating
- pair / small group rating
- whole group discussion

o Making a considered, balanced judgement:
- initial impression (with global Writing scale at the left of the Written Assessment Grid)
- analysis with detailed criteria (Range, Accuracy, Coherence)
- final decision, with reference to global criteria (Description or Argument – depending on the task – provided on the right of the Written Assessment Grid)

A. Individual Rating: participants should read the selected script and write down their independent judgement, following the three steps (impression, analysis, decision). Again the trainer collates the results.

B. Pair/small Group rating: participants discuss the text together in relation to the criteria. At the end of this discussion the trainer collates those results (called out from the groups).

C. Whole Group Discussion: the trainer guides the formation of a consensus with reference to the criteria and to the documentation and gives out the documentation for the sample.

The sessions should cover samples across the full range of CEFR levels.


Appendix: List of Criteria Scales and Worksheets for CEFR Standardisation Training

See separate MS Word files for English, French and German available at www.eaquals.org

• Global CEFR Scales
- Global scale
- Self-assessment grid
- Global scale – expanded with “salient features (spoken language)”

• Oral Criteria
- Global oral assessment scale
- Oral assessment criteria grid
- Supplementary criteria grid: Plus Levels
- Supplementary CEFR scales for “Overall Spoken Interaction,” “Sustained Monologue: describing experience” and “Phonological Control”

• Oral Worksheets
- Global scale – expanded with “salient features (spoken language)” – for sorting task
- Oral assessment criteria grid – for gap-fill task
- Oral assessment criteria grid
- Rating form
- Oral assessment criteria grid – for sorting task

• Written Criteria
- Written assessment criteria grid