8/18/2016 Academic Vocabulary Lists (Corpus-based; 120 million words)

http://www.academicvocabulary.info/compare.asp

Academic Vocabulary Lists
Corpus of Contemporary American English


The Academic Word List [AWL] (Coxhead, 2000) has been very useful for teachers and learners since it was released in 2000. Nevertheless, we believe that the new academic vocabulary lists available here are more accurate and useful in at least four important ways. The following is a short summary of the discussion from our August 2013 article in Applied Linguistics.

First, the AWL is based on an older, much smaller corpus: just 3.5 million words of academic texts from the 1990s. Ours is based on more than 120 million words of academic texts in the Corpus of Contemporary American English (COCA), which contains texts as recent as 2015. Our academic corpus is composed of all 86 million words of academic journals in COCA, as well as 26 million words from academically-oriented magazine articles. The following table shows the size of the different sub-genres, with the number of words (in millions) shown in parentheses:

History (14.3 million words)    Education (8.5)             Law and political science (12.5)
Social Science (16.7)           Humanities (11.1)           Philosophy, religion, psychology (12.5)
Science and technology (22.8)   Medicine and health (9.7)   Business and finance (12.8)
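
As a quick check on the "more than 120 million words" figure, the nine sub-genre sizes listed above can be added up. The snippet below is only an arithmetic sketch; the dictionary name is our own and the values are copied from the table.

```python
# Sub-genre sizes in millions of words, copied from the table above.
subgenre_sizes = {
    "History": 14.3,
    "Education": 8.5,
    "Law and political science": 12.5,
    "Social Science": 16.7,
    "Humanities": 11.1,
    "Philosophy, religion, psychology": 12.5,
    "Science and technology": 22.8,
    "Medicine and health": 9.7,
    "Business and finance": 12.8,
}

# Round to one decimal place to avoid floating-point noise in the sum.
total_millions = round(sum(subgenre_sizes.values()), 1)
print(total_millions)  # 120.9
```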

Second, our word lists provide better coverage of academic English. The 570 "word families" in the AWL cover 7.2% of the words in the COCA academic texts, but the top 570 word families in our list cover 14.0%, nearly twice as much. In a "neutral" corpus (the 32 million words of academic and semi-academic texts in the British National Corpus), the AWL covers 7.1% and our list covers 14.0%, again nearly twice as much. Part of this difference is due to the fact that the AWL "sits on top of" the General Service List (GSL), which already has many high-frequency words, but there are other factors at play as well.
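
"Coverage" here means the share of running words in a corpus that belong to a word list. The following is a minimal sketch of that idea, not the procedure used to produce the figures above: the function name, the toy sentence, and the toy list are all hypothetical, and it matches exact word forms only, whereas the published figures are based on word families.

```python
def coverage(tokens, word_list):
    """Return the fraction of tokens (running words) found in word_list."""
    vocab = {w.lower() for w in word_list}
    if not tokens:
        return 0.0
    hits = sum(1 for t in tokens if t.lower() in vocab)
    return hits / len(tokens)


# Toy data, invented for illustration only.
toy_tokens = "the data suggest that the method requires further analysis".split()
toy_list = ["data", "method", "analysis", "require"]

# "requires" is not counted because this sketch does no lemmatization.
print(f"{coverage(toy_tokens, toy_list):.1%}")  # 33.3%
```

Matching on lemmas or word families rather than exact forms would count "requires" as well, which is one reason coverage figures depend on how the list is organized.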

Our academic list is also very much oriented towards academic texts in particular, compared to other genres. For example, it covers 14.0% of academic texts in COCA, 7.3% of the 85 million words of newspapers in COCA, and just 3.4% of the 86 million words of fiction texts in COCA. That's exactly what you'd want: a list that is oriented mainly towards academic English, rather than a general word list for all types of English.
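
One way to read these percentages is as ratios of academic coverage to coverage in each other genre. The sketch below only restates the figures quoted above; the helper name is hypothetical.

```python
# Coverage percentages quoted in the paragraph above.
coverage_pct = {"academic": 14.0, "newspaper": 7.3, "fiction": 3.4}


def academic_ratio(genre):
    """Ratio of academic coverage to coverage in the given genre (hypothetical helper)."""
    return round(coverage_pct["academic"] / coverage_pct[genre], 1)


print(academic_ratio("newspaper"))  # 1.9
print(academic_ratio("fiction"))    # 4.1
```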

Third, we believe that our lists are more usable: they provide the data in a number of different formats (not just "one size fits all" word families), which are oriented towards different needs. You can download the data for the 3,000 "general academic" words (sample), the words grouped into AWL-like "word families" (sample), and the top 20,000 words in COCA Academic overall (sample).

We also provide a wealth of information in the word families that is not available in the standard AWL families (see a sample of our lists). The words are grouped by lemma (e.g. [decide] = {decide, decided, deciding}, etc.), which eliminates clutter. There are different entries by part of speech, so you know, for example, whether abstract is used more as a noun, verb, or adjective. The words are also color-coded to let you know whether the word is a "general" academic word, or whether it is a more "technical" one that occurs in just a few sub-genres. And most importantly, the entries are listed in order of frequency, to help you focus more on words that you will actually see in the real world, rather than just having a mass of unorganized words, as with the traditional AWL word families.
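
The organization described above (forms grouped under a lemma, separate entries per part of speech, entries ordered by frequency) can be sketched as follows. This is an illustration under stated assumptions, not the format of the downloadable lists: the row layout and function name are hypothetical, and the frequency counts are invented.

```python
from collections import defaultdict

# Hypothetical (form, lemma, part of speech, frequency) rows.
# The counts are invented for illustration only.
rows = [
    ("decide", "decide", "verb", 50),
    ("decided", "decide", "verb", 80),
    ("deciding", "decide", "verb", 20),
    ("abstract", "abstract", "adjective", 40),
    ("abstract", "abstract", "noun", 25),
    ("abstracts", "abstract", "noun", 10),
]


def build_entries(rows):
    """Group forms by (lemma, part of speech) and sort entries by total frequency."""
    grouped = defaultdict(lambda: {"forms": [], "freq": 0})
    for form, lemma, pos, freq in rows:
        entry = grouped[(lemma, pos)]
        entry["forms"].append(form)
        entry["freq"] += freq
    return sorted(
        ((lemma, pos, e["forms"], e["freq"]) for (lemma, pos), e in grouped.items()),
        key=lambda item: item[3],
        reverse=True,
    )


entries = build_entries(rows)
for lemma, pos, forms, freq in entries:
    print(lemma, pos, forms, freq)
```

With the toy counts, the verb entry for decide comes first, followed by abstract as an adjective and then abstract as a noun, so a reader can see at a glance which part of speech dominates.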

Fourth, our word lists are integrally tied into the WordAndPhrase interface, so that you can see a great deal of information about the meaning and usage of each word: its definition, the frequency in each of the nine academic sub-genres (e.g. Medicine, Science, or Business), the collocates (nearby words, which provide great insight into meaning and usage), and many re-sortable concordance lines for each word, which show the patterns in which the word occurs. The regular AWL lists are integrated into the Compleat LexTutor site, but not to the same degree that ours are with WordAndPhrase. You can also analyze your own texts, based on the AVL data.