www.britishcouncil.org 1
Jamie Dunlea, British Council
Investigating the relationship
between empirical task difficulty,
textual features and CEFR levels
EALTA 2014
29 May - 1 June
University of Warwick
Language Assessment
Research
First
HEALTH WARNINGS
CAUTIONS
CAVEATS
SOME APOLOGIES
Some interesting information to share
What we will do
Look at task specifications for reading to specify criterial features of input texts for different CEFR levels
Focus on vocabulary profiles and readability measures which are included in item writing specifications
Discuss an exploratory analysis of the textual features of texts built to spec and the relationship to empirical difficulty
Look at the relationship between Rasch difficulty estimates of reading tasks from the item bank of an operational test designed around the CEFR and selected linguistic indices which we use for item specification (and some additional measures).
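For readers less familiar with Rasch difficulty estimates, the sketch below shows the standard dichotomous Rasch model, in which the probability of a correct response depends only on the difference between person ability and item difficulty (both in logits). The function name is illustrative and is not taken from the talk; the operational calibration behind the item bank may use a different Rasch variant.

```python
import math


def rasch_probability(ability: float, difficulty: float) -> float:
    """Probability of a correct response under the dichotomous Rasch model.

    Both arguments are on the logit scale. A higher difficulty estimate
    lowers the probability of success for a person of fixed ability.
    """
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))


# When ability equals difficulty, the chance of success is 50%.
print(rasch_probability(0.0, 0.0))
# A harder item (higher difficulty) is less likely to be answered correctly.
print(rasch_probability(0.0, 1.0) < rasch_probability(0.0, -1.0))
```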
Davidson & Fulcher (2007) encourage test developers to see the framework as a series of guidelines from which tests (and teaching materials) can be built to suit local contextualized needs.
The CEFR can be a
springboard to task and test
development
Task specs: Where to start?
Test specs from the CEFR
CEFR: Vocabulary Range
B2
Has a good range of vocabulary for matters connected to his
field and most general topics. Can vary formulation to avoid
frequent repetition, but lexical gaps can still cause hesitation and
circumlocution.
B1
Has a sufficient vocabulary to express him/herself with some
circumlocutions on most topics pertinent to his everyday life such
as family, hobbies and interests, work, travel, and current events.
A2
Has sufficient vocabulary to conduct routine, everyday
transactions involving familiar situations and topics.
Has a sufficient vocabulary for the expression of basic
communicative needs.
Has a sufficient vocabulary for coping with simple survival
needs.
Task specs: Where to start?
Descriptors need to remain holistic in order to give
an overview; detailed lists of microfunctions,
grammatical forms and vocabulary are presented
in language specifications for particular languages
(e.g. Threshold Level 1990).
An analysis of the functions, notions, grammar
and vocabulary necessary to perform the
communicative tasks described on the scales
could be part of the process of developing new
sets of language specifications.
(Council of Europe, 2001, p. 30)
CEFR Grid for Reading Tests
Characteristic
Text source
Authenticity
Discourse type
Domain
Topic
Nature of content
Text length
Vocabulary
Grammar
Vocabulary options: Only frequent vocabulary / Mostly frequent vocabulary / Rather extended / Extended
Manual (Council of Europe, 2009)
Alderson et al. (2006)
Some criteria when considering categories
Consistency, transparency, accountability, ease of use for item writers
Specs have different audiences, and different levels of specificity according to the needs of the audience
No spec is exhaustive: all specs will contain some of a possible range of categories and measures
No spec is final: specs need to be reviewed and revised
Some important principles
Test: Aptis General
Component: Reading
Task: Matching headings to text
Features of the Task
Skill focus: Expeditious global reading of longer text, integrating propositions across a longer text into a discourse-level representation.
Task level: A1 / A2 / B1 / B2 / C1 / C2
Task description: Matching headings to paragraphs within a longer text. Candidates read through a longer text consisting of 7 paragraphs, identifying the best heading for each paragraph from a bank of 8 options.
Cognitive processing (goal setting):
Expeditious reading: local (scan/search for specifics)
Careful reading: local (understanding sentence)
Expeditious reading: global (skim for gist/search for key ideas/detail)
Careful reading: global (comprehend main idea(s)/overall text(s))
Cognitive processing (levels of reading):
Word recognition
Lexical access
Syntactic parsing
Establishing propositional meaning (clause/sentence level)
Inferencing
Building a mental model
Creating a text-level representation (discourse structure)
Creating an intertextual representation (multi-text)
Task specs: an example
Features of the Input Text
Words: 700-750 words
Domain: Public / Occupational / Educational / Personal
Discourse mode: Descriptive / Narrative / Expository / Argumentative / Instructive
Content knowledge: General / Specific
Cultural specificity: Neutral / Specific
Nature of information: Only concrete / Mostly concrete / Fairly abstract / Mainly abstract
Lexical level: K1 / K2 / K3 / K4 / K5 / K6 / K7 / K8 / K9 / K10. The cumulative coverage should reach 95% at the K5 level. No more than 5% of words should be beyond the K5 level.
Readability: Flesch-Kincaid Grade Level 9-12
Grammar: A1-B2 exponents
Average sentence length: 18-20 words
Text genre: Magazines, newspapers, instructional materials (such as extracts from undergraduate textbooks describing important events and ideas, etc.).
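The lexical-level rule in the specification (cumulative coverage of 95% at K5, no more than 5% of words beyond K5) can be checked mechanically once a vocabulary profiler has counted the tokens in each 1,000-word band. The sketch below is only an illustration: the function names and the token counts are invented, and how off-list items such as proper nouns are treated is an assumption (here they would simply be assigned to whatever band key the caller chooses).

```python
from typing import Dict


def cumulative_coverage(band_counts: Dict[int, int]) -> Dict[int, float]:
    """Return, for each band k, the percentage of tokens at or below band k."""
    total = sum(band_counts.values())
    if total == 0:
        raise ValueError("band_counts contains no tokens")
    running = 0
    coverage = {}
    for band in sorted(band_counts):
        running += band_counts[band]
        coverage[band] = 100.0 * running / total
    return coverage


def meets_lexical_spec(band_counts: Dict[int, int],
                       max_band: int = 5,
                       threshold: float = 95.0) -> bool:
    """True if tokens at or below max_band reach the coverage threshold."""
    total = sum(band_counts.values())
    if total == 0:
        raise ValueError("band_counts contains no tokens")
    covered = sum(n for band, n in band_counts.items() if band <= max_band)
    return 100.0 * covered / total >= threshold


# Hypothetical profile of a 700-word text (token counts per K band).
profile = {1: 560, 2: 70, 3: 35, 4: 14, 5: 7, 6: 7, 8: 7}
print(cumulative_coverage(profile)[5])  # coverage reached at K5, in percent
print(meets_lexical_spec(profile))
```

In this invented profile 686 of the 700 tokens fall within K1-K5, so the text would satisfy the rule.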
Task specs: an example
Features of the Response
Targets: Length: up to 10 words. Lexical: K1-K5. Grammatical: A1-B2.
Distractors: Length: up to 10 words. Lexical: K1-K5. Grammatical: B1-B2.
Key information: Within sentence / Across sentences / Across paragraphs
Extra criteria:
Presentation: Written / Aural / Illustrations/Graphs
Lexical Level K1 K2 K3 K4 K5 K6 K7 K8 K9 K10
Readability Flesch-Kincaid Grade Level 9-12
Using automated tools
Lexical profiles: BNC-20 lists
Derived from British National Corpus spoken corpora by
Paul Nation (2006) and adapted by Tom Cobb
20 levels of 1,000 words each; word = word family
Tools for analysis:
http://www.lextutor.ca/vp/
http://www.victoria.ac.nz/lals/about/staff/paul-nation
Alternative frequency lists:
General Service List (2,000 word families)
Academic Word List
BNC-COCA 25
Using automated tools
Readability: Flesch-Kincaid Grade Level
Based on syllables per word and words per sentence.
These act as proxies for lexical level (longer words tend to be less frequent) and
syntactic complexity (longer sentences tend to contain more
compounding and embedded clauses)
Scaled to US grade levels (higher number, harder text)
Tools for analysis:
https://readability-score.com/
http://cohmetrix.memphis.edu/cohmetrixpr/index.html
Readability measures available in Word
Some alternative readability measures:
Flesch Reading Ease (basis for Flesch-Kincaid)
Coh-Metrix indices
Lexile measures
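As a rough illustration of how the Flesch-Kincaid Grade Level combines the two ratios, the sketch below applies the published formula (0.39 x words per sentence + 11.8 x syllables per word - 15.59). The syllable counter is a crude vowel-group heuristic assumed here for brevity; the tools listed above use their own counting rules, so their scores will differ somewhat.

```python
import re


def count_syllables(word: str) -> int:
    """Crude estimate: one syllable per group of consecutive vowels (min. 1)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def flesch_kincaid_grade(text: str) -> float:
    """Flesch-Kincaid Grade Level from words/sentence and syllables/word."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    words_per_sentence = len(words) / len(sentences)
    syllables_per_word = syllables / len(words)
    return 0.39 * words_per_sentence + 11.8 * syllables_per_word - 15.59


simple = "The cat sat. The dog ran."
dense = "The committee deliberated extensively regarding international regulations."
# Longer words and longer sentences push the grade level up.
print(flesch_kincaid_grade(simple) < flesch_kincaid_grade(dense))
```

Very short, monosyllabic texts can produce grade levels below zero, which is one reason the measure is usually applied to passages of reasonable length.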
How much of a text do learners need to be
able to comprehend?
A threshold level of 95% coverage has been suggested for reasonable comprehension and guessing words from context (Laufer, 1989; Hirsh & Nation, 1992; Chujo & Oghigian, 2009)
A higher threshold of 98% has been suggested for reading with ease (Hirsh & Nation, 1992; Hu & Nation, 2000; Nation, 2006)
Van Zeeland & Schmitt (2012) suggest the different criteria could be suitable for different purposes, with 95% suitable for adequate comprehension
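To make the two thresholds concrete, the short sketch below converts a coverage percentage into the number of running words a reader would not know in a text of a given length. The function name is illustrative, and the 700-word length is simply the lower bound of the B2 input text range in the task specification.

```python
def unknown_words(text_length: int, coverage_percent: int) -> float:
    """Number of running words left uncovered at a given coverage percentage."""
    return text_length * (100 - coverage_percent) / 100


print(unknown_words(700, 95))  # 35.0, about one unknown word in every 20
print(unknown_words(700, 98))  # 14.0, about one unknown word in every 50
```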
Aptis Reading Test Tasks
Lvl | Items/Task | Word length | Task focus | Response format
A1 | 5 | 50-60 | Sentence level meaning (Careful, local reading) | 3-option multiple choice for each gap.
A2 | 6 | 90-100 | Inter-sentence cohesion (Careful global reading) | Reorder 6 jumbled sentences. All sentences must be used to complete the story.
B1 | 7 | 125-135 | Text-level comprehension of short texts (Careful global reading) | 7 gaps in a short text. Select the best word to fill each gap from a bank of 9 options.
B2 | 7 | 700-750 | Text-level comprehension of longer text (Global reading, both careful and expeditious) | 7 paragraphs forming a long text. Select the most appropriate heading for each paragraph from a bank of 8 options.
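The word-length ranges in the table lend themselves to an automated check at the item-writing stage. The sketch below encodes only the ranges shown above; the dictionary and function names are hypothetical and do not describe any actual British Council tooling.

```python
# Word-length ranges per CEFR level, copied from the task table above.
WORD_LENGTH_SPEC = {
    "A1": (50, 60),
    "A2": (90, 100),
    "B1": (125, 135),
    "B2": (700, 750),
}


def within_length_spec(level: str, word_count: int) -> bool:
    """True if a draft text's word count falls inside the range for its level."""
    low, high = WORD_LENGTH_SPEC[level]
    return low <= word_count <= high


print(within_length_spec("B2", 720))  # True
print(within_length_spec("B1", 150))  # False
```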
Lvl Word
le