Speaking test formats and task types
Anthea Wilson, Head of Test Production, Trinity College London
Belinda Steinhuber, Head of Language Education Department, CEBS, Austria
EAQUALS members’ meeting, Florence 2016
©Eaquals 06/08/2014

Thom Kiddle & Eaquals members, Assessing oral Proficiency


Page 1: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Speaking test formats and task types
Anthea Wilson, Head of Test Production, Trinity College London
Belinda Steinhuber, Head of Language Education Department, CEBS, Austria

EAQUALS members’ meeting, Florence 2016

©Eaquals 06/08/2014 1

Page 2: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Agenda

I. Speaking test formats and task types
II. Construction and validation of criteria
III. Standardisation and monitoring practices

©Eaquals 06/08/2014 2

Page 3: Thom Kiddle & Eaquals members, Assessing oral Proficiency

1. Speaking test formats and task types

Beyond the examiner-led interview:
• What formats can we use to assess speaking?
• What demands do different task types place on candidates?
• What are the implications for reliable assessment?

©Eaquals 06/08/2014 3

Page 4: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Why use other formats?

• focus on communicative competence
• make use of more authentic tasks and situations
• include a greater variety of communicative functions
• widen the scope of task types
• action-oriented approach

©Eaquals 06/08/2014 4

Page 5: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Activity

Watch the video and complete the table for the three task types:

• Trinity ISE II Collaborative Task (B2)
• CEBS Plurilingual task (Engl. B2/French B1)
• Group discussion task (B1)

©Eaquals 06/08/2014 5

Page 6: Thom Kiddle & Eaquals members, Assessing oral Proficiency

ISE II Collaborative task

For the next part, I’ll tell you something. Then, you have to ask me questions to find out more information and make comments. You need to keep the conversation going. After four minutes, I’ll end the conversation. Are you ready?

My nephew’s school has just announced that all the students might have to learn three foreign languages. I’m not sure this is a good idea.

©Eaquals 06/08/2014 6

Page 7: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Plurilingual Task English and French

PARTICIPANTS

1 Examiner for the second foreign language (e.g. French)
    Interaction
1 Candidate
    Interaction
1 Examiner for English

Page 8: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Plurilingual Task English and French

TIME FRAME

Preparation: minimum 30 min.
Exam: 12-15 min.
  • Individual Long Turn: 4-5 min.
  • Interaction: 8-10 min.

Page 9: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Plurilingual Task English and French

©Eaquals 06/08/2014 9

Rubric in German
Input mostly in German
Situation
Task Long Turn
Task Interaction

Page 10: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Plurilingual Task English and French

TOPIC: Health and Nutrition

Situation

Your school is particularly involved in various activities encouraging a healthy lifestyle. Your class has organized a meeting with students and teachers from other countries who are also interested in implementing projects in this field.

©Eaquals 06/08/2014 10

Page 11: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Plurilingual Task English and French

Interaction
Following the presentation you carry on a conversation with the visiting teachers in which you discuss the possibility of working together on interscholastic projects.
• Present examples of activities or projects at your school which promote a healthy and active lifestyle (input 2).
• Inquire about similar activities at the schools of your foreign visitors.
• Discuss the possibilities of a joint project.

©Eaquals 06/08/2014 11

Page 12: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Development of marking criteria
Tim Goodier, Head of Academic Development, Eurocentres

www.eaquals.org

Page 13: Thom Kiddle & Eaquals members, Assessing oral Proficiency

• Introduction to Eurocentres ‘RADIO’ task oriented assessment

• Interpreting CEFR Table 3 and other relevant sources to form profile categories and maximise pragmatic validity

• Practical considerations for scaled criteria and issues informing update for EAP

• A sample from Eurocentres standardisation materials & criteria for spoken assessment

www.eaquals.org

Page 14: Thom Kiddle & Eaquals members, Assessing oral Proficiency

‘RADIO’ Task orientation

www.eaquals.org

How RADIO fits

Teacher-centred: Focus on forms; Present, Practice, Produce (PPP)

Fluency-centred: Planned focus on form; ‘Free practice’; Role plays; Communicative drills; Grammar games

Meaning-centred: Focus on task; Incidental focus on form; Case studies; Decision tasks; Consensus tasks; Simulations

Natural
= Task-oriented approach for fluency & assessment

A continuum, not categories with fixed boundaries

R.A.D.I.O. = R: Range, A: Accuracy, D: Delivery, I: Interaction, O: Organisation & interaction

Page 15: Thom Kiddle & Eaquals members, Assessing oral Proficiency

R.A.D.I.O. – group task rationale

R.A.D.I.O. group tasks follow three distinct stages:

Phase 1: Collaboration. Students work in small groups (2-4) to organise the task, reach a consensus/conclusion and prepare their report. (planning)

Phase 2: Exchange. Groups are remixed in order to report their findings / conclusions (report)

Phase 3: Discussion. Groups discuss either (a) the best solution or (b) discussion questions related to the task topic (discussion)

• Impression (holistic/global)

• Analysis (R,A,D & I)

• Considered judgement

Page 16: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Distilling a workable profiling scheme (R,A,D,I + O)

www.eaquals.org

TABLE 3 OF THE CEFR Phonology scale

Range Accuracy Fluency Interaction Coherence Pron.

Range Accuracy Delivery Interaction Overall

R+A+D: Overall Spoken Production

R+I: Overall Spoken Interaction

Certificate Profile (SP & SI)

Page 17: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Assessor descriptors at 10 levels (including CEFR plus levels)

www.eaquals.org

B1+

B1(CEFRtable 3)

A2+

Page 18: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Key considerations for ongoing update and revision

www.eaquals.org

1. Draw on validated sources, and colour code ‘master’ for future reference
2. Use bulleted clusters rather than boxed paragraphs

e.g.

Blue = CEFR, purple = IELTS public descriptors, black = original RADIO, green = EAQUALS, bold = paraphrased from the source.

(Accuracy)
• Maintains a high degree of grammatical accuracy. Error-free sentences are frequent.
• Some inappropriate word choice and occasional minor slips but few significant errors.
• Uses paraphrase effectively.

(Delivery)
• Speaks confidently and spontaneously in clear, smoothly-flowing speech. Descriptions and arguments are easy to follow.
• Can vary intonation and place sentence stress appropriately. Speech is clear and intelligible throughout.

Page 19: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Adaptation to include presentation task types for EAP

www.eaquals.org

Range Accuracy Delivery Interaction Organisation

R+A+D: Overall Spoken Production

R+I or R+O: Overall Spoken Interaction

Certificate Profile (SP & SI)

The ‘Interaction’ and ‘Organisation’ columns both contain the SAME descriptors for argumentation (B1+ to C2).

Structuring planned speaking to achieve a communicative objective with an audience

Page 20: Thom Kiddle & Eaquals members, Assessing oral Proficiency

RADIO Grades
• Based on CEFR table 2, distinguishing between spoken interaction and spoken production
• In R.A.D.I.O.:
  • Spoken Interaction = an average of range and interaction
  • Spoken Production = an average of range, accuracy and delivery
• Half grades possible, but only full grades on certificate profile
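
The averaging rule on this slide can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function names and the sample scores are hypothetical, and so is the rounding convention (halves round up). The slide says only that half grades are possible and that full grades appear on the certificate profile; it does not say how the rounding is done.

```python
import math
from statistics import mean


def to_half_grade(value: float) -> float:
    """Round a raw average to the nearest half grade (assumed convention)."""
    return math.floor(value * 2 + 0.5) / 2


def to_full_grade(value: float) -> int:
    """Round a raw average to a full grade; halves round up (assumption)."""
    return math.floor(value + 0.5)


def radio_profile(range_: float, accuracy: float, delivery: float,
                  interaction: float) -> dict:
    """Combine the R, A, D and I scores as described on the slide."""
    production = mean([range_, accuracy, delivery])    # R + A + D
    spoken_interaction = mean([range_, interaction])   # R + I
    return {
        "spoken_production": to_half_grade(production),
        "spoken_interaction": to_half_grade(spoken_interaction),
        "certificate_production": to_full_grade(production),
        "certificate_interaction": to_full_grade(spoken_interaction),
    }


# Hypothetical scores for one candidate on the numeric RADIO scale.
profile = radio_profile(range_=6, accuracy=5, delivery=6, interaction=7)
print(profile)
```

With these sample scores, Spoken Production averages 5.67 (reported as the half grade 5.5, certificate grade 6) and Spoken Interaction averages 6.5 (certificate grade 7 under the assumed round-up rule).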

Page 21: Thom Kiddle & Eaquals members, Assessing oral Proficiency

R.A.D.I.O. – Grading a spoken sample

We will now listen to a speaking sample.

Then, look at the mid-high level descriptors (5-9) and think about what score you might give each of them.

Rainer (left), Marco (centre) and Andreas (right) will talk about whether sport is bad for relationships and marriage

First think about who you think is lower/higher in level.

Page 22: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Rainer

A relaxed communicator.

Can initiate discourse and take his turn when appropriate

Can link his utterances into a coherent contribution.

He has a sufficient range of language to express viewpoints without much searching for words, even though many of his utterances have a strong influence from German in both formulation and pronunciation.

He cannot be said to show a relatively high degree of grammatical or lexical control.

Communicates with reasonable accuracy in familiar contexts; generally good control though with noticeable mother tongue influence. Errors occur, but it is clear what he is trying to express.

Page 23: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Eaquals International Conference, 21 – 23 April 2016

Marco

Good interaction skills, and able to produce stretches of language with a fairly even tempo – although can be hesitant. Generally coherent speaker with some impressive turns of phrase for the narrowness of his linguistic base. Weak on accuracy with many past tense and word order mistakes, tends not to elaborate his contribution. Appeared to improve in the course of the activity.

Page 24: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Eaquals International Conference, 21 – 23 April 2016

Andreas

Clearly meets all the B2 criteria on Range, Accuracy, Fluency, Interaction and Coherence. A very controlled, conscious performance showing considerable language awareness for this level. He always gets his point across effectively, though the performance is very self-conscious and a little laboured at times.

Meets the level of accuracy described for B2+ but does not consistently maintain the high degree of accuracy seen at C1, and the hesitancy he showed launching himself into both description and discussion indicates he does not meet the C1 criterion in the area of Delivery.

Page 25: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Alternatives to theory-driven oral assessment criteria grids
Thom Kiddle
Director, NILE (Norwich Institute for Language Education)

Eaquals Members Meeting, Florence, November 2016

©Eaquals 06/08/2014 25

Page 26: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Alternatives to theory-driven marking criteria

©Eaquals 06/08/2014 26

“[Theory-driven] approaches generate impoverished descriptions of communication, while performance data-driven approaches have the potential to provide richer descriptions that offer sounder inferences from score meaning to performance in specified domains.”

Fulcher et al (2011)

Page 27: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Potential problems with theory-driven assessment criteria

©Eaquals 06/08/2014 27

• “Reification of ordered scale descriptors” (Fulcher et al, 2011)
• Standardisation with abstract concepts
• May not relate to specific task demands
• Encourages the ‘halo effect’

Page 28: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Halo effect

©Eaquals 06/08/2014 28

Try this experiment from Nobel Prize winner Daniel Kahneman:

On the next page, you will see descriptions of two people. Read the descriptions and decide which person you view more favourably…

Page 29: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Halo effect

©Eaquals 06/08/2014 29

Alan is: intelligent – industrious – impulsive – critical – stubborn – envious

Ben is: envious – stubborn – critical – impulsive – industrious – intelligent

Page 30: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Implications

©Eaquals 06/08/2014 30

What implications might this have for traditional criteria grid models?

Fulcher et al (2011) propose Performance Decision Trees to incorporate specific reference to data obtained from successful performance on a task (and as a way to include ‘indigenous’ criteria).


Page 32: Thom Kiddle & Eaquals members, Assessing oral Proficiency

©Eaquals 06/08/2014 32

You bought the product and had the problems shown in the video. Record a voicemail message for the manager of the shop, stating:
- What you bought
- What the problems were
- What you would like them to do about it
You should speak for at least one minute.

Page 33: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Lexical resource (theory-driven)

©Eaquals 06/08/2014 33

Manages to talk about familiar and unfamiliar topics but uses vocabulary with limited flexibility; attempts to use paraphrase but with mixed success.

Has enough language to get by, with sufficient vocabulary to express him/herself with some hesitation and circumlocution on topics such as family, hobbies and interests, work, travel, and current events.

Page 34: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Lexical resource (data-driven)

©Eaquals 06/08/2014 34

Is able to describe the sequence of events using time/sequence markers.
Has sufficient resource to describe two specific problems, either with individual accurate lexis or ‘placeholder names’ (‘thing’, ‘stuff’, ‘kind of’).
Has specific lexis to refer to future action and desired outcome / response.

Can sequence events using, for example, earlier today / this morning / when I got home / after washing.
Can identify concrete nouns and problems using, for example, jeans / washing machine / shrunk / ripped / a hole.
Can make demands using, for example, money back, refund, replacement, return.

Page 35: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Challenges with data-driven approach

©Eaquals 06/08/2014 35

• Need for different descriptors for different tasks?
• Need for piloting with ‘known masters’ to obtain data?
• Need for detailed task familiarity among raters?
• Need to establish parallels between task demands?
• Need to relate to external frameworks?

Page 36: Thom Kiddle & Eaquals members, Assessing oral Proficiency

www.eaquals.org

TestDaF: The development of standardisation and monitoring practice for raters
Claudia Pop, TestDaF-Institut, g.a.s.t. e.V., Germany

Page 37: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Content

1. Why standardise?
2. The TestDaF (Test of German as a Foreign Language)
3. Rater trainings
4. Conclusion

www.eaquals.org

Page 38: Thom Kiddle & Eaquals members, Assessing oral Proficiency

1. Why standardise?

www.eaquals.org

Page 39: Thom Kiddle & Eaquals members, Assessing oral Proficiency

1. Why standardise?

www.eaquals.org

Page 40: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The Test of German as a Foreign Language (TestDaF)

Designed for international students applying for entry to an institution of higher education in Germany

Measures German language proficiency at an intermediate to high level (B2.1 to C1.2)

Developed, scored and evaluated at the TestDaF Institute in Germany

Can be taken in the applicant’s home country

Administered worldwide since 2001

High stakes setting

www.eaquals.org

Page 41: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

37,881 participants in 2015, an increase of 18.8 percent from 2014 to 2015; more than 257,000 participants since 2001

Participants per year:
2001    1,190
2002    3,582
2003    7,498
2004    8,982
2005   11,052
2006   13,554
2007   15,389
2008   16,882
2009   18,059
2010   18,528
2011   21,374
2012   24,261
2013   27,166
2014   31,898
2015   37,881
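
The two headline figures on this slide can be checked against the yearly totals in the chart. The short Python sketch below is an illustration added for that purpose (the variable names are ours, not the presenter's); it recomputes the 2014 to 2015 increase and the cumulative total since 2001 from the chart values.

```python
# Yearly TestDaF participant numbers as read from the chart (2001-2015).
participants = {
    2001: 1190, 2002: 3582, 2003: 7498, 2004: 8982, 2005: 11052,
    2006: 13554, 2007: 15389, 2008: 16882, 2009: 18059, 2010: 18528,
    2011: 21374, 2012: 24261, 2013: 27166, 2014: 31898, 2015: 37881,
}

# Relative increase from 2014 to 2015, in percent.
growth = (participants[2015] - participants[2014]) / participants[2014] * 100

# Cumulative number of participants since the test was introduced.
total = sum(participants.values())

print(f"Increase 2014 to 2015: {growth:.1f} %")   # Increase 2014 to 2015: 18.8 %
print(f"Participants since 2001: {total:,}")      # Participants since 2001: 257,296
```

Both results agree with the slide: an 18.8 percent increase and a cumulative total of just over 257,000.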

www.eaquals.org

Page 42: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Our Test Centres worldwide

Europe / Russian Federation /Turkey

356 230 460

59 45 57

Asia

5 2 4

Australia / New Zealand / Oceania

Germany

218 149 169 TestDaF-test centres TestAS-test centres onDaF/onSET-test centres

19 14 21

Africa

41 24 66

America

Page 43: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

www.eaquals.org

Development
Administration
Scoring
Statistical analysis
Customer service / transparency of information

Page 44: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

www.eaquals.org

Development
Administration
Scoring
Statistical analysis
Customer service / transparency of information

Development:
• Standardized format
• Training and guidelines for item writers
• Extensive trialling procedures for each test version

Page 45: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF : Development

www.eaquals.org

Test okay?

Piloting

Ready to go

No

Yes

Revision Trialling

Item and task development

Page 46: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

www.eaquals.org

Development
Administration
Scoring
Statistical analysis
Customer service / transparency of information

Administration:
• Administration in licensed test centres
• Training and monitoring for test administrators
• Detailed security instructions and procedures
• Inspections

Page 47: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

www.eaquals.org

Development
Administration
Scoring
Statistical analysis
Customer service / transparency of information

Scoring:
• Training of raters
• Monitoring
• Calibration materials
• Regular evaluation of rater behaviour

Page 48: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

[Chart: TestDaF participants per year, 2001-2015]

2 test dates
Calibration / training session before each test date

www.eaquals.org

Page 49: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

[Chart: TestDaF participants per year, 2001-2015]

6 test dates (4+2)
Separation: calibration materials ≠ rater trainings

www.eaquals.org

Page 50: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

[Chart: TestDaF participants per year, 2001-2015]

From now on: 9 test dates per year (6+3)

www.eaquals.org

Page 51: Thom Kiddle & Eaquals members, Assessing oral Proficiency

2. The TestDaF

[Chart: TestDaF participants per year, 2001-2015]

Modifications in the standardisation process for raters

www.eaquals.org

Page 52: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings

Trained raters 10/2016

[Chart: trained raters, 2010 to 2016; vertical axis from 0 to 350]

www.eaquals.org

Page 53: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings

[Chart: initial trainings and re-trainings, 2010 to 2016; vertical axis from 0 to 10]

www.eaquals.org

Page 54: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings
Initial trainings, goals:

• Explaining construct, format
• Introducing the TestDaF-criteria and the rating procedure
• Operationalizing the process and criteria: rating of performances and group discussion
• Raising awareness of rater effects
• Explaining the statistical procedures of quality assurance (MFR Analysis)

www.eaquals.org

Page 55: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings
Initial trainings, modifications:

Since 2008: e-learning unit to be completed before the actual 2-day training session

Since 2009: Presentation slot on practical and logistical procedures

Since 2013: successful individual rating as a condition to be contracted

www.eaquals.org

Page 56: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings
Re-trainings, goals:

• Recollecting the goal (construct)
• Individual rating of performances and group discussion
• Discussing external effects
• Giving updates about TestDaF-Institut
• Further training about chosen topics
• Giving the opportunity to meet “the others” – “rating is a lonely job”

www.eaquals.org

Page 57: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings

Re-trainings, modifications:

Since 2013:
• Re-trainings are led by specially trained senior raters
• Re-trainings are taking place across Germany
• Preparation weekend in January of each year

www.eaquals.org

Page 58: Thom Kiddle & Eaquals members, Assessing oral Proficiency

3. The TestDaF: Rater Trainings
Follow-up problem: Raters feel they are losing contact with the TestDaF-staff

Since 2016: Online-consultation hours (Vitero team room)
• In each assessment phase
• Separately for Writing and Speaking

www.eaquals.org

Page 59: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Summing up

www.eaquals.org

Calibration session

Page 60: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Summing up

www.eaquals.org

Calibration material

Rater trainings

Page 61: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Summing up

www.eaquals.org

Rater trainings

Initial rater trainings Re-trainings

Online-consultation

hours

Calibration material

Page 62: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Conclusion

www.eaquals.org

Page 63: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Conclusion

www.eaquals.org

Calibration material

Initial rater trainings

Consultation hours

Re-trainings

Online training

Page 64: Thom Kiddle & Eaquals members, Assessing oral Proficiency

www.eaquals.org

Conclusion

Page 65: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Standardisation – a practical example in a lowish-stakes context
Emma Heyderman
Director of Education
Lacunza - IH

www.eaquals.org

Page 66: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Our journey
• about us
• the now
• and the future

www.eaquals.org


Page 68: Thom Kiddle & Eaquals members, Assessing oral Proficiency

www.eaquals.org

English & French:
• 5
• 11
• 5,500 (70:30)
• 3 hrs / wk
• 110
• 30

Page 69: Thom Kiddle & Eaquals members, Assessing oral Proficiency

www.eaquals.org

• That they get off to a good start in English, becoming familiar with the language in an enjoyable environment and acquiring the study habits they will use in the future.

• If your child follows the Lacunza pathway, by the end of secondary school their level will be C1, a proficient command of the language.


Page 71: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Continuous assessment of:

ATTITUDE | ATTENDANCE | PUNCTUALITY

Speaking, Listening, Structure, Vocabulary, Writing.

• A-B Performance above expected level
• C ‘On track’
• D-E Needs improvement

www.eaquals.org

Page 72: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Speaking
• Students are placed in level in September
• Their speaking performance is assessed:
  • informally through activities in class
  • formally through at least three assessed speaking tasks per year
• Teachers use our own Speaking & Writing Assessment Handbook

www.eaquals.org


Page 79: Thom Kiddle & Eaquals members, Assessing oral Proficiency

And next?
• complete the training course
• but consider the implications for
  • teaching and learning
    (How do these clips inform our reflections on our teaching and our students’ learning and/or performance?)
  • evaluation and assessment
    (How do these clips inform the decisions we make about evaluation and assessment?)

www.eaquals.org

Page 80: Thom Kiddle & Eaquals members, Assessing oral Proficiency

Thank you!

www.eaquals.org
