Using Corpora to Teach Vocabulary


•Helping Students Help Themselves


What are Corpora?

Large free computerized databases of natural language

• Corpus of Contemporary American English (COCA)
• MICASE (Michigan Corpus of Academic Spoken English)
• MICUSP (Michigan Corpus of Upper-Level Student Papers)
• British National Corpus


Corpus Linguistics = Methodology

Bennett (2010)
– Corpus-influenced materials
   • Textbooks, materials based on frequency & patterns
– Corpus-cited texts
   • Dictionaries (Collins COBUILD)
   • Grammar books (Real Grammar: A Corpus-Based Approach to English)
– Corpus-designed materials
   • Learner or teacher-created using a corpus

Corpus Learning 101

Pre-made Materials

Vocabulary Based on Corpus Studies

Frequency Lists
• West’s General Service List (first ~2000 most frequent words)
• Academic Word List (570 word families; 3000 words)

LexTutor’s VocabProfiler
• Insert your own texts to assess vocabulary level

West’s General Service List

1 the, 2 be, 3 of, 4 and, 5 a, 6 to, 7 in, 8 he,
9 have, 10 it, 11 that, 12 for, 13 they, 14 I, 15 with, 16 as,
17 not, 18 on, 19 she, 20 at, 21 by, 22 this, 23 we, 24 you,
25 do, 26 but, 27 from, 28 or, 29 which, 30 one, 31 would

AWL

abandon, abstract, academy, access, accommodate, accompany,
accumulate, accurate, achieve, acknowledge, acquire, adapt

AWL

Analyse (head word)

analysers, analyses, analysing, analysis (most common), analyst, analysts,
analytic, analytical, analytically, analyze, analyzed, analyzes, analyzing

General English

VocabProfiler

Why?

• Materials development
• Check vocabulary levels of webpages
• Decide on vocabulary to focus on

How?

• Create a .txt document
• In Word (save as, then select .txt)
• Copy the text
• Paste the text into the VocabProfile site
• Double click on proper nouns to exclude
• Click Submit
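
The profiling idea behind a tool like VocabProfiler can be sketched in a few lines of Python. This is only an illustrative sketch, not LexTutor's actual implementation: the two word sets are tiny hypothetical stand-ins for the General Service List and the Academic Word List, and the function name `profile_text` is made up for this example. Real profilers also match whole word families, which this sketch does not attempt.

```python
import re
from collections import Counter

# Hypothetical mini word lists (assumption): stand-ins for the full GSL and AWL.
GSL_SAMPLE = {"the", "be", "of", "and", "a", "to", "in", "have", "it",
              "that", "for", "with", "not", "on", "we"}
AWL_SAMPLE = {"abandon", "abstract", "access", "achieve", "acquire",
              "adapt", "analysis"}


def profile_text(text):
    """Return the percentage of tokens falling in each band."""
    tokens = re.findall(r"[a-z]+", text.lower())
    counts = Counter()
    for token in tokens:
        if token in GSL_SAMPLE:
            counts["gsl"] += 1
        elif token in AWL_SAMPLE:
            counts["awl"] += 1
        else:
            counts["off_list"] += 1
    total = len(tokens) or 1  # avoid dividing by zero on empty input
    return {band: round(100 * counts[band] / total, 1)
            for band in ("gsl", "awl", "off_list")}


print(profile_text("We have to adapt the analysis for students"))
```

A high off-list percentage suggests a text may be too hard for the class, which is the kind of decision the "Why?" list describes.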

MS Office Shortcuts

Ctrl + A select all

Ctrl + C copy

Ctrl + V paste

Ctrl + X cut

Ctrl + Z undo

VocabProfiler

Using a Corpus to Teach Vocabulary

Data-Driven Learning

Knowing a Word (Nation, 2001)

Metalinguistic awareness = dictionary definition +
• spelling
• morphology
• part of speech
• pronunciation
• variant meanings
• collocations
• specific uses
• register

Data-Driven Learning (Johns, 1991)

• Learners become “language detectives” (Johns, 1991)
• Uses authentic examples and encourages “noticing” or “awareness-raising” (Römer, 2008)

Using a Corpus

Pros

Natural Language

Practice analytical skills/verify choices

Creates self-sufficient learners

Contexts rich, varied

Focus on accuracy

Cons

Significant teacher training needed

Few ready-made exercises and challenging to design

Lexical information vast/confusing

Contexts incomplete

No focus on fluency


Data-Driven Learning: The Corpus of Contemporary American English

COCA
• 450 million words
• 20 million words added yearly (1990-2012)
• 90 million spoken words
• Academic and general
• Spoken
• Fiction
• Magazines
• Newspapers
• Academic


Academic Genres

• Education
• Geography/Social Science
• Law/Philosophy
• Humanities
• Philosophy/Religion
• Science/Technology
• Medicine
• Miscellaneous


Training Yourself to Use the COCA

Brief Five-Minute Tour

Class Use

Sign up for group access at least 2 days prior to use
– http://corpus.byu.edu/groupAccess.asp

Notice the group limits
– One active request at a time
– Four hour limit
– Teacher must be a registered user

COCA Search Screen

COCA Corpus Search

Parts of Speech with KWIC (Key Words in Context)

They certainly will not grow as learners without opportunity to analyze their strengths and weaknesses.

Language Development

• KWIC search
   – Parts of speech color coded
• Students code nearby words
• Students code a 100-word sample
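
For teachers who want to prepare a KWIC handout offline, the basic idea of a keyword-in-context display can be sketched in Python. This is a minimal sketch under stated assumptions, not how COCA itself works: the function name `kwic` and the `width` parameter are invented for illustration, and the sketch does no part-of-speech tagging or color coding.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit."""
    tokens = re.findall(r"[A-Za-z']+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            # Take up to `width` words on each side of the keyword.
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


sentence = ("They certainly will not grow as learners without opportunity "
            "to analyze their strengths and weaknesses.")

for left, word, right in kwic(sentence, "analyze"):
    # Right-align the left context so the keywords line up in a column.
    print(f"{left:>30} [{word}] {right}")
```

Students can then label the part of speech of each word in the printed context by hand.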

Language Development

Frequency searches (easiest)
• Reading fluency
   – Should you memorize dawdle, meander, or drift?

Phrasal Verb Frequencies

Intermediate Class
– Explain what phrasal verbs are with examples (mess around, use up, call on, wrap up)
– Use COCA to find sample sentences

High beginning writing class
– Check spelling and non-English words on 30-minute timed writing
– Students look for words that might be misspelled
   • Use COCA
   • If frequency below 10, circle the word (e.g., speciel)
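
The "frequency below 10" rule can be expressed as a short Python sketch. The frequency figures below are invented for illustration and are not real COCA counts; in class, students would look up each word in COCA by hand. The function name `flag_possible_misspellings` is likewise hypothetical.

```python
# Hypothetical frequency counts (assumption): NOT real COCA figures.
SAMPLE_FREQUENCIES = {
    "my": 900000,
    "friend": 90000,
    "is": 4000000,
    "special": 52000,
    "speciel": 2,
}


def flag_possible_misspellings(words, frequencies, threshold=10):
    """Return the words whose corpus frequency is below the threshold.

    Words missing from the frequency table count as frequency 0.
    """
    return [w for w in words if frequencies.get(w.lower(), 0) < threshold]


essay_words = ["My", "friend", "is", "speciel"]
print(flag_possible_misspellings(essay_words, SAMPLE_FREQUENCIES))
```

A flagged word is only a candidate: rare but correct words and proper nouns can also fall below the threshold, so students still need to check each circled word.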

COCA for Morphology
• Transport
   – transportation
   – transported
   – transports

Wildcard* Searches

Circle the word not related in meaning

clar*       *note
clarify     connote
clarinet    denote
clarity     keynote
clark
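
How a wildcard pattern selects words can be illustrated with Python's standard `fnmatch` module. This is a sketch of the matching idea only, run against the small word list from the exercise above; it does not query COCA, and the function name `wildcard_search` is made up for the example.

```python
from fnmatch import fnmatchcase

# Word list taken from the exercise; a real search would run over a corpus.
WORDS = ["clarify", "clarinet", "clarity", "clark",
         "connote", "denote", "keynote"]


def wildcard_search(pattern, words):
    """Return the words matching a pattern in which * stands for any letters."""
    return [w for w in words if fnmatchcase(w, pattern)]


print(wildcard_search("clar*", WORDS))  # words beginning with "clar"
print(wildcard_search("*note", WORDS))  # words ending in "note"
```

The search matches on spelling alone, which is why a word unrelated in meaning can appear in the results.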

What are Concordancers?

• Computer programs used to analyze text
• LexTutor
• VocabProfiler
• AntConc
• Create specialized corpora for ESP classes

Websites of Interest

ELT Resource Training Wiki (with Amber Warren)
• http://eltresourcetraining.pbworks.com

AWL
• http://englishvocabularyexercises.com

VocabProfiler
• http://www.lextutor.ca/vp/

Grimm’s Fairy Tales in .txt
• http://www.cs.cmu.edu/~spok/grimmtmp/

Contact Information

Debra S. Lee
Vanderbilt University English Language Center
dleetn@gmail.com

Twitter: dleetn
Google+: dleetn
