37
Integrating Corpus-based Resources and Natural Language Processing Tools into CALL PASCLIAL CANTOS-GOMEZ* Unii~ersidcrd de Murcia ABSTRACI' This paper ainis at presenting a survey of computational linguistic tools presently available but whose potcntial has been ncither fully considered nor exploited to its full in niodern CALL. It starts with a discussion on the rationale ofDDL to language learning, presenting typical DDL-activities. DDL-software and potential extensions of non-typical DDL-software (electronic dictioiiaries and electronic dictionary facilities) to DDL . An cxtcndcd scction is devoted to describe NLP-technology and how it can be integrated into CALL, wiiliin alrcady cxisting software or as stand alone resources. A range of NLP-tools is presentcd (MT progranis, taggers, lemn~atizers, parsers and speech technologies) with special ernpliasis on tagged concordancing. Tlie paper finishcs with a number of reflections and ideas on how language technologies can be used cfficiently within the language learning context and how extensive exploration and intcgration of tliese tcchnologies might change and extend both modern CAI,I, and the present language learning paradigiii.. KEY WORDS: Concordancing. Corpus linguistics, data-driven learning, electronic dictionaries, Iiunian language tcchnologies, linguistic corpora, inachinc translation, morphological generation, NLP-tools, parsing, POS-tagging, speecli technology Arlrlress for corres~~oi~rlerrce: Pascual Cantos Cóinez, Departanlento de Filología Inglesa. Universidad de Murcia. Cl. Saiito Cristo 1. ;O07 1 Murcia, Spain, e-inail: [email protected]. Q Scrvicio dc I'i~blicacioncs. Universidad dc Miircia. All rights reserved. IJES. vol. 2 ( 1 ), 2002, pp. 129-1 65

Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Embed Size (px)

DESCRIPTION

Journal of Applied Linguistics

Citation preview

Page 1: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Integrating Corpus-based Resources and Natural Language Processing Tools into CALL

PASCLIAL CANTOS-GOMEZ*

Unii~ersidcrd de Murcia

ABSTRACI'

This paper ainis at presenting a survey of computational linguistic tools presently available but whose potcntial has been ncither fully considered nor exploited to its full in niodern CALL.

It starts with a discussion on the rationale ofDDL to language learning, presenting typical DDL-activities. DDL-software and potential extensions of non-typical DDL-software (electronic dictioiiaries and electronic dictionary facilities) to DDL .

An cxtcndcd scction is devoted to describe NLP-technology and how it can be integrated into CALL, wiiliin alrcady cxisting software or as stand alone resources. A range of NLP-tools is presentcd (MT progranis, taggers, lemn~atizers, parsers and speech technologies) with special ernpliasis on tagged concordancing.

Tlie paper finishcs with a number of reflections and ideas on how language technologies can be used cfficiently within the language learning context and how extensive exploration and intcgration of tliese tcchnologies might change and extend both modern CAI,I, and the present language learning paradigiii..

KEY WORDS: Concordancing. Corpus linguistics, data-driven learning, electronic dictionaries, Iiunian language tcchnologies, linguistic corpora, inachinc translation, morphological generation, NLP-tools, parsing, POS-tagging, speecli technology

Arlrlress for corres~~oi~rlerrce: Pascual Cantos Cóinez, Departanlento de Filología Inglesa. Universidad de Murcia. Cl. Saiito Cristo 1. ;O07 1 Murcia, Spain, e-inail: [email protected].

Q Scrvicio dc I'i~blicacioncs. Universidad dc Miircia. A l l rights reserved. IJES. vol. 2 ( 1 ), 2002, pp. 129-1 65

Page 2: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487
Page 3: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487
Page 4: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

A Iui-ilicr rclatcd issuc with DDL is authenticity. Widdowson (1983:30) considers that

Ai i autlieiitic stiniulus in tlie forrn o f attested iiistances o f language does riot guarantee an authentic respoiise iii tlic forin ofappropriate language activity [...] we sliould retain the term 'authenticity' to refer

to activity (¡.c. proccss) and use tlie terin 'geiiuine' to refer to attested instances o f language.

In this scnsc a corpus niay contain millions of "attested instances of language". but there

is notliing to guarantee that you can use data from that corpus as a stimulus for "appropriate

languagc activity" ('l'ribblc 1997). That is, it is likely that forcign languagc students are not

necessarily niotivated by a language learning activity if the instances of language use that they

arc studying are cxtractcd froni contexts that have little or no connection with their interests and

conceins. Gcnuine exaniplcs of language in use will not necessarily lead to autlientic language

use or efiective language learning activities.

So tlic question is: which is tlie best corpus for language learners? Flowerdew (1993:309) thinks tliat

Maiiy iiative spcakcrs inake use o f otliers' writii ig or speech to inodel their owii work iii their native laiiguagc wlieie tlie geiirc is uiifaiiiiliar. I t is tiiiie tliat tli isskill was brouglit out oftlie closet, aiid exploited as aii aid for Icarniiig.

Siniilarly. Bazernian (1994: 13 1) considers that the most useful corpus for learners of English is

tlie onc wliicli oSfcrs a collcction oiexpcrt perforniances in genres which have relevante to the

needs and intcrests of the learncrs. These texts might exeniplify the results and models of the

dcsircd í'ornis oflanguage behaviour that language learners want to achieve and might, therefore.

bc motivating startiiig points for language learning and language using activities.

Clcarly, tliis, sonieliow, relegatcs standard, balanced and representative corpora, such as

llic Brown corpiis of American Englisli (Kutera and Francis 1967), the 1,ancaster-Oslo-Bergen

(LOB) corpus OS Britisli tcxts (Johansson 1980). or other niajor corpora such as the British

National Corpus (BNC) (Burnard 1995). a 100 million word represcntative corpus of

contcniporary Brilish written and spoken texts, or tlie Bank of English at Birniingham University

(Sinclair 199l), for language learning purposes. Tribble (1997) points towards non-standard

corpora for DDL and draws his attention to multimcdia encyclopaedia, such as Microsoft

EncartaO. aiiioiig others. The latest version of Microsoft EncartaB contains more than 30,000

aiticlcs, betwccn 200 and 5000 words, which count for a total of roughly 30 niillion words,

covcring diflCrent doniains aiid topics, such as art, geography, history, language, life science.

literature. pliilosopliy. pliysical science, religion, social science, sports, etc. The data provided

by this niultiiiicdia encyclopaedia virtually contain enough tcxts which niost studcnts in niost

languagc classcs will find interesting and informative. ln addition, with this coniprehcnsive range

of topics atid tcxts. it is not difficult to sclect od hoc texts. focussiiig on students particular needs

and niotivations.

O Servicio de Piiblicaciones. Uiiivcrsidad de Miircia. Al l rights reserved. IJES', vol. 2 ( 1 ), 2002, pp. 129- 165

Page 5: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

I i i~cgi.tr/ii ig (oipiis-huscd l\'csorrir.es o17rl h'u~irizll Lungir~rge Pi~ocessing h o l s inro C , ILL 133

Ol'coursc, our aini here is not to advertise any particular multimedia encyclopaedia but

iiiucli iiiore to cncouragc language students and teachers to use the vast range of language texts,

corpora or data, in gcncral, wliich is available in electronic forni, in CD-ROMs andlor Internet,

ratlicr tliaii urging tliciii to construct their our comprchensive and representative corpora.

11. A WOKD ON EXISTING DDL-SOFTWARE

111 ~l i is scctioii, wc sliall rcview tlie niain soi'twarc applications used among DDL practitioners:

( 1 ) coiiiiiicrcially availablc concordance progranis and (2) Tini Sohns' Confext. In addition. we

sliall also prcsciit a Spaiiisli vocabulary learning niultiniedia application. PYLICI~CLI tu V o ~ ~ ~ h u l ~ ~ r i o

(Sánclicz aiid Cantos 2000), wliich is bascd on tlie clectronic dictionary nietaphor and DDL-like

lcarning/acquisition stratcgics.

Coiicoidaiiccrs arc text proccssing tools for looking at how words behave in texts. These tools

allow yo~ i to lind out liow words are uscd in texts. An~ong tlie facilities. al1 concordancers allow

you at Icast':

To lis1 al1 ~ h c words or word-clustcrs in a text. set out in alphabetical or frequcncy order.

'1'0 scc any word or phrase in context (concordances), so that you can see what sort of

coiiipniiy it kccps.

. . 1 Iiis tcxt proccssing tool is gencrally uscd for lexicographic work, for preparing dictionaries. and

by rescarclicrs invcstigatiiig language pattcrns. -. I iiii .lolins has compilcd nunierous cxercise cxaniples on his web page' using standard

concordancing tools. Tlic classroon~ materials that follow are extracts from his website and are

a collcctioii soiiic participants' work of tlie U.stí n ~ i d Lohem DDL Workshop (21st - 25th March

2000):

O Servicio <Ic P~iblicacioncs. Universidad dc blurcia. All righis reserved. /JES, vol. 2 ( l ) , 2002. pp. 129-165

Page 6: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

About and On

How do wc usc tlic prcpositioiis ABOUT aiid ON? Wliich oric is uscd iiiore ofien? Wliicli oiic tcnds to be iised

iii acadciiiic tcsts? Ii i wliicli cases is tlierc tlic occurciice of orily onc oftliem?

Book t !,; s , , r i~ n H r l t l s n Mi><llcoI A S S O C I ~ ~ L C J ~ hook d b ~ u t a p u t e n r i o i r i s k r u nurnon n s a i t t i p i

1 +LO. i s n r k i , : . 'Wa ~ ~ u t i l i s t i r < l o c o f f e e - t a b l r boak o h o u t a n t b e h a v i u u r c a l i e d The Ancc , wnich

u . 1 1 5 P u h l i c i s e < l o s a tmok ñhout t h e t e r r i b l e f a t e owaitincr humani ty F 1 i n l i f r . 7tiis is y z t a n o t h e r 'qee-wt i izz" houk a t m u t f a r r n s l c s c i e n c e , t h i s t ime basecl un

í t i t l r in t r ic iuvr i iiir: ac l a s t , 1 r n o u g h t . a tmak ahciut t h e I>ercurial r e l a t i o n s h l p c t h a t c c i e n

2 1 eari rverf ic t , ian . 1 reiiiernber huylnii my F i r c t hook on p l ñ n e t c (by P a t r i c k Moorel hack i n t h e 1

2 2 <]s . How t h i n g s have rhñni ied . A good buok on t h e p l a n e t s h a s a lwñyc needed t o he u p -

2 i f g i i a : i i s r r ~ n . s o hiiw did he ci,iiia LU w r i t e a tiook iin Murduck? Was h i s c h i i i r e < l i c t a t e d by Lhe

2 4 a o s I ' m Kerriing t h a t fcir m y s e l f . I t ' s a hook iin k i l i m c - geome t r l ca l l y -pa t t e rne r l rugs i r 2 5 r i i i n r y c r r u l l i n c . 19Y!JI, and i s w r i t i n g a hook iin t h e k u t u r e a f U S n a c i o n a l c e c u r i t y r > o l i c

(by Kvita Rychtárová)

'Great', 'Uig', 'Large', 'Huge' and their Collocations

Task 1 - 'Grcat' aiid its collocations

Tiy ro spot wliat tlic typical cascs of 'grcat' and its collocations arc on tlic basis of tlic followiiig cxaiiiples.

1 .

Whrth+r r r r e n t c l i s r r > v r r l r s in ~ h r G r e ñ t Pyramid of C h e u ~ ~ s have a n y t h i n g t o d o

irnt.i;d c i ~ r r s r u r l r i q Al-Ari<lalus dnd i t s G r e a t Mosclue t o t h e i r farmer f ñ i t h

kiiiri rtiei. isrlve? r r , inf ,>r ta i~le . And t h e y ~ e a t London r l u b s , wi t t , t h e l r rc iar ing c o a 1 F i r r s

2 . n o r r ñ t i v r i s ~f Lec M i s e r a i i l e s and G r e a t E x p e c t a t i o n s . Whñt r o u l d be more carnlca l

che a(irii.':l t u m a k e d Film i n t h e G r e a t J u u r n e y s e r i e s , t h e thenie o £ which was

Task 3 - 'Grent' aiid inissiiig iiouiis Try to prcdict tlic words wliicli arc iriissiiig.

1 . F a l s t ñ i F wlio h a s heen l u r e d i n t o Winiisor G r e a t w i t h a n t l e r s on h i c h r a d

2 Betiiazzi is riiie o £ Eur r , [> r ' s g reaL , l l k e Rudtier a n a l l - l o u n d

-!. whñt iia<i 1itci;iiir known a s G r r a t The Times , t h e D a i l y Mail and 4 . 111 1670 F r e r i e r i c k 11 t h a G l r a t , o £ P r u s s i a

5 . r h r wi>rl<l wñc d',iriinatrd l>y t h a g r r n t Europea" and. c incr t h e 1 8 5 0 s .

I P a l k , p l a y e ~ s . lK.?<.e, King, Empirecl

(by Sarka Canova and Jarka Ivanova)

O Scrvicio (Ic I'iiblic;icioncs. Univcrsid;id dc Miircia. All rights reservcd. IJES, vol. 2 ( 1 ) . 2003, pp. 139-165

Page 7: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Adjectives ending in -ic, -¡cal

l. Look carctully at tlic followiiigcitations. Wliat differcnce in tlie iiieaning oftlie adjectives classiciclassical can

yo11 spot?

And wlieri Siriatra was making his classic albums for Capitol in rhe 1950s Songs For Swinging . . . Gramophone. the classical music magazine. has not written about Ser single or her album.

11. how till i i i tlic blaiiks witli nppropriatc adjectives.

1. 1 remernber listening to al1 the Motown music and the Philadelphia soul stuff. 2. Once it became clear that she could not continue through the next two acts, Deane decided to replace hoth dancers. as in ballet one partner may not be physically suited to perform with a stand-in. 3. . . . we always dine at CaEe Des Arts. A bistro, run by a consortium of charming ladies. stylish. innovative food. 4. The area where the dirt collects is transparent. al1 our detritus is paraded on the outside, turning the design incide out. Why do we need to see it? 5. . . . arctiitect Kichard Norman Sfiaw, capable o£ turning out gathic, Queen Anne and strict

designs, made a valuable contribution to the Arts and Crafts movement.

(by Zuzana ~a f fková and Vladislav Sniolka)

Probably oiie of tlie hest known and niost used DDL-software is Context'. Tliis program

encouragcs language learner to invcstigate how words are used in context in English. and is

designcd to supplenient classes. It is based on short contexts (extracted from thc database by

iiicans of tlie coniputer prograni MiuoConcoi-6) illustrating the use of important key items from

a databasc oi'ovcr 3 nlillion words of tcxt in English.

Tlic program starts offering tlie user a list of lieadings: Top Menu (parts-of-speech and

topics; sce Figure 1). In addition, it is also possible to view a more detailed index of al1 the

keywords available to the prograni, together with the names of the files in which each key word

is storcd and to selcct a iilc of contexts (by kcywords defined by parts-of-speech, keywords

dciiiied by topic or niorplienies-pretixes or suffixes; Figure 2). Once the user has selected the

tilc ol' context. the prograni displays the list of key items in the bottom of the screen (among

otlier Iacilitics) in ordcr to invcstigate the set of contexts for any particular key item (Figure 3). .fhc Qzlrr Soeeii cliallengcs students to guess what the niissing keyword is (Figures 4 and 5). Al'tci. studcnts Iiavc íinislied tlic Quiz, they can see an analysis of their performance.

O Scrvicio de I'iib1ic;iciones. Univcrsid:id dc Miircia. All rights resewed. IJES, vol. 2 ( l ) , 2002, pp. 129-165

Page 8: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Paicual Canlos wpd D c c u n a d 1 4 ~ 6 lnsmla Pág310Ln166&mPor25~

2' :J 3 " . ? R w i o ~ 1 r j ~ a @ e i l e 1 &Explaand IldP,C(INTE 33D4 &'afiQ@ofij4- 1345

Figure 1. Top Menu

_I paicud Cantos wpd lrm& 1-1- ~ a ~ 3 1 O ~ n l S 4 ( r m P o r 2 Y

& 'J 4 " 9 R e p i o b n 1 C W a B e i f e 1 2 J E x p b a d (m 30&:< &jF92Qfi&PB4> 1723

Figure 2. Iiidexcd-data Wiridow

O Servicio (le I'ublicaciones. Uiiivcrsidatl de Murcia. A l l rights reserved. IJES, vol. 2 ( 1 ), 2002, pp. 129-1 65

Page 9: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Porcual Canloi wpd D m r r l o 2 I b A E & lnruia

ain8=4 2 :i ,> M J C w a a s r r e jlPCo)iTE ~aDdj4 a I ~ O b ~ ~ ~ h t : 1728 -

Figure 3. Concordance-data Wiiidow

- Parcual Cantos wpd DucurienloZ [bnl3@ inwia P& 312 Ln 3 4kmPor 254'

gIniciol ;* ;1 4 6~ M . ~ R ~ P , O ~ U C I 1 E w o < ~ c i ~ e 1 A J E W ~ ~ J 1- zUe:-i amfim58.: 1735

Figurc J. Quiz Window ( 1 )

O Servicio de I'~iblicacioncs. Uiiivcrsid;id dc Murcia. All rights rescrved. IJES, vol. 2 ( 1 ) . 2002, pp. 129-165

Page 10: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487
Page 11: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

.. . ,: ,>.,i , . : :I 2 l;i.i:Iilii.~ rr.r iiii:iizii i~i?l:~ir:i1i:)ri 1) t . r!r ,~b.: ( i . 1 ' ~ ~ a:iirriaI) i:rl i ii'i, f??qiisi'iiiig. or iliir:iii-g tir.~i:ri

(~jd!,.<~-~~!,:, ;t,j:li<f&.)

:< 4 k II:~, lir@ i.:x 2, ;,r,+(:::G pl~l:p>:e ~.C>F~?~-?,O;C;P, ~ I I , ~ J ? ~ , P ? - ~ G X . : )

1, :? i ,~,~:l, j i~e fc,: p-?;i:x rl:rrri-il: cr S,=,:!: (&.eii.)fbjcr$) .1 ;i i! i'eI:g:.,~i: c.,ri:::ri;iiiii). b 11:. biiii:?ii~, ,>rm:!i(.,ie.l t.y I!

, . .A c,,:, :y ofl:~g~~.xl~; ~IWII NI ti:e 5~rrie L,111;,.hrk: a a $~~a, : I i r~ - . :~ t i~ . , ,: t~ :::~.icti a L,~iiI.Jir~ c u ,:IIVISIC.~I ,.,:a

~ : A : . ~ - ~ . : ~ ~ . ; O ~ i.x c;t:r~e:. .:,.,:I:~~I:I~IcI.Ls e [ , . 1: :., .o : I?~:~ : i :1..:ru1 c'.:L:: o (111~ H,W;C;~ <.:!.II:::I (.:::LI::V;. ~;,:r~rd ! ., ?;q:::>~;y, 4:;;; ., :.;!:,,i l~:a.i~.~:ly, ,:i?:l:~51~: (S,yj.:,~::fpe~ :$ :$ :I Cirr~ ,:,u I~~,II[I.II.>~I 1) 11::: kl:!,:e (aib1!:ir1~~ c :IIK Huais<~ :ot,k,q il:? 5~~~c.xE:~:~: l~:u~:~ % :l :l Iegl;!:i~;:~? .jr .<eilk ?r:tl!xv? :fi:e~nt.i? b [!:e t!:;llilq %J>he[? 11 rrleelz c (tllc 1 Iou5e; C!r; ;t$? 1.3;; Ti'(%

Hot:ie ,>fCorrir:::,:r c.:' I..:r-l:., fii-i r!-:? i!Ti i.ie Ec.11;~ ~I?.?~;.e;ni iar ice~ i il :I ai ?i.icliei?:e 15 z :tt?;<!:.c, í!n?n.~:I, P-c 1) 2 :í.rP.dr!~mc(: ir 2 theair? ,?l: rir.?rnz (s?:>~iri ~ I O E C ~ ( . siiím rl: 9

?'<;d<>\, 3 [),;:z:[* . , > . .,:;:;y:, ,., :,.,,;<:):?r, l~,,J ,>f.;?,? l'l<:;>,tc::> : I. < .,:i.?:t#,, kc3lq: 11, ,i ll.>,:~>:[:d <:: :4 ::l',li,tc!: :,?,:[;*€' 0. ,,jL?< <>f,<jL.<r, j?.>!*< r?;~;<;:Lv:, ;E?,>< ::I,;>L'~,%) . .: . - a A ;>l&r:: ;f:1.121:,: I .:~~.I.II~J:~IC, 2 :':>~aur&~ii o[ 1:-x! !c< , '~ce . . / :~ j : i~e , p:i~?~: t???cre) b [.,!:~nb.) íc?v>!i'ie)

;clc;:t ...,.l 2;; tl:,: ~r~;n,:i>~iierii ,.,~:?~~,:L:ILIT:u,: j;,jt':l, $5. 1~ bi i i f i ~ . 3 : :,iri~.,:id : " /.:x .i L,r<<,p.'?:

.., . . L,, ;!W?!II:~;IP::X P:. .;IIC ,,i.:;:vcr L 1:) ;I~:~J.!LIII~ , :, , H*.::,<i7 ;..:: .:..;

i~ in ic ioJ ;i; =;i /ll R m m' EWOI.. L J E X ~ . . . ~ qusd.. . lwm s~k/;.i ~ ~ ~ ~ m g q ~ b - i - - F i g i i r e 6. Exa i i ip le o f a i i io i io l ingual e lectronic d ict ior iary

IBd 3 ;J 4 R 9Rm0 1 ~ w m e 1 ~ E x P ~ 1 WUS&L 1- 3l;lei-i W Z Q ~ ~ W ~ Q ~ J 1757 ( F i g i i r e 7. Euai i ip le o f a b i l i ng i ia l d ict ionary

O Scrvicio (Ic Pirblicacioiies. Ui1iversid;id dc Murcia. A l l r ighis reserved. UES, vol. 2 ( l ) , 2002, pp. 129-165

Page 12: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Based 011 the electronic dictionary n~etaphor, Sánchezand Cantos4 designed a DDL-like software:

Prncticri tii I'occihzilriiio (Pi'V). PTVis a Spanish lexicon learning software, containing the 4500

most frequent types occurring in the CUMBRE Coipus -a linguistic Corpus of contenlporary

Spaiiisli (Sancliez et al. 1995). All 4500 itenis:

Arc translatcd into English, I:rencli. German, Portuguese and Italian. By just clicking on

thc dcsircd Flag. students will get thc words translaied in that language. However.

students niiglit cliaiige translation language any tinie at will (Figure 8).

Can bc ricccssed. using standard clectronic search facilities: term search, window scroll

or tliumb indcx (Figure 9). Are illustrated witli a real exan-iple -fuIl concordance sentence, extracted from the

CUMBRE Coipus (Figiiie 10).

Arc rccorded and can bc lieard by the students.

I 1

Figiirc 8. Laiiguage Selection Window

Q Servicio cle Piiblicaciones. Universiclad de Murcia. All rigliis reserved. IJES: vol. 2 ( l ) , 2002, pp. 129-165

Page 13: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

I l

Figure 9. Searcli Facilities

Figiire 10. Visualisiiig Concordance-scntcncc and Translatcd Tcrin

O Servicio cIc I'iiblic~cioncs. Uriivcrsi<lad dc Murcia. All rights reserved. IJES, vol. ? (1 ): 2002, pp. 129-16.5

Page 14: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

The option Ejercicios offers three types of exercises:

Lislcn, repecit, rccordrindcheckyolil-pl-onlrncici~ioti. On selecting this exercise. students

will clioose tlie nuniber of words they wish to work with by clicking on the button with

1lie nuiiiber of exaniples which will, at random, be the basis of the exercise. Next, the

randoni-sclected words will appear. and by clicking on the loudspeaker icon, tlie blue-

Iiigliliglited word can be heard. Finally, students click on the microphone icon and record

tlie liigliliglitcd word. A click on tlie right loudspeaker reproduces the rnodel recording

followed by tlie studcnt's recording and the student can contrast both outputs (Figure 11 ) .

Lislcn ~ i n d ivrite. Tliis is a word dictation practice; students will hear randomly chosen

words aiid will liavc to write tlieni correctly. Tlie prograni allows three guesses before

displaying thc correct spelling. To facilitate the writing of Spanish diacritics, PTV provides them on a sniall table below tlie text-entry window (Figi i re 12) .

Rerid in j30i l r I~ lngz l~ lge ond /~.onslate into Sprinish. I4ere tlie program displays randomly

sclected words in thc target language cliosen and students have to write the translation for

cacli word into Spanisli (Figzrlx 13) .

3. Lee en lu idioma y traduce al espariol

I I

F ig i i r c I l . Excrcise Type 1 : Lisleir. Repeor, Recoidriild Cllcck your Pi.oiiiiiiciuiioi?

O Servicio de I'iiblicacioiics. Uiiiversidad dc Murcia. All rights reservcd. IJES, vol. 2 ( l ) , 2002, pp. 129-165

Page 15: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

117reg~~11i~ig Coiprrs-htcvcd Reso~ci~ces u17d Str/i~izrl Lungi~uge P~.ocessing Tools inro CALL 143

,cl,, , 1 ,,f (1, .-,S , m , S , )

pmnunciación

2 Ezcucha y A e s i t ~ h e Siguiente >>

3 Lee en tu idioma y lraduce al español

\;rsl,sr8 3 ir r il rriii.,i r > 111 li'l'lil 1 l

' << Salir

I I Figure 12. Exercise Type 2: Li~le17 und Wiile

1. Escucha. repite. graba y comprueba tu pronunciación

3 LRO ori t i i ~di<inie y traduce al espaiiol

L 1

Fig i i re 13. Excrcisc Type 3: Recid iii yoiir Lai7giiuge and Ti.unslare inlo Spur7isl1

0 Servicio de Piiblicncioncs. Uriivcrsidad de Murcia. All rights reserved. IJES, vol. 2 ( l ) , 2002, pp. 129-1 65

Page 16: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

144 Posc,iol Conlos-Gónlez

On conipletion oSeach cxcrcise or at the end osa working session, students may consult hislher

Iiits or failurcs by clicking on the Re.szrl/c~dos tab on tlie right of the agenda.

111. INTEGKATING UUL ANU LANGUAGE TECHNOLOGZES

In rcccnt ycars, a ncw tcrm has becn coined by the CALL community: Humc~n Llrngzrcige

Techno1ogie.s (1-ILT). This tcrni embraccs a wide range of research and developmcnt areas within

thc arca OS 1,anguagc Engiiieering or Language Teclinologics.

l'lic ticld o f Iiuinan laiiguagc tcclinology covers a broad range o f activities with thc cventual goal ot ciiabliiig pcoplc to coinrnunicate witli inacliines ~ is ing natural coininunication skills. Rescarch and dcvclopiiiciit activities includc tlie coding. recognition. interprctation, translation. and geiieration o f laiiguagc. ... Advaiiccs iii I i~i i i iai i lariguagc tcclinology offertlie prorniseofnearly universal access toon-liiie iiiforiiiatioii aiid scrviccs. Siiicc aliiiost evcryonc speaks and iindcrstaiids a laiiguage, tlie dcveloprnerit o f spokcii lairgiiage systciris wi l l allow tlie average pcrson to intcract with coniputers witliout special skills or traiiiiiig. usiiig coininoii deviccs sucli as tlie tcleplrone. Tliese systerns wil l coinbine spoken language uiidcistaiidiiigaiid gciicration to allow pcople to intcract witli cornputers using speech toobtain inforniation oii virtually any topic. to coiiduct business aiid to coininuiiicatc with each otlier niorc effkctivcly. (Cole 1996)

111.1. Sonie EILT tools5

. . lherc arc many 1-ILT tools that have beconie cornn~ercial systems. Among those systems,

probably tlie two arcas that have focused niost commercial and scientific motivation are Macliine

Translatioii (MI') and Specch Recognition (SR). Particularly interesting here is the possible

application domain of M'T and SR to CALL and more generally, language teaching and learning,

and othcr HL,'f tools. Intcrcsting in tliis respect are part-of-speech (POS) taggers and syntactic

parsers. These two Natural Language Processing (NLP) tools might help teachers and learners

to prcprocess tcxts and highligl-it ccrtain gran~n-iatical phenomena or patterns witliout the trouble

of having to manually annotate a text.

In the Sollowing scctions, we shall introduce sonie 1-ILT applications and try to highlight

thcir interest Sor language tcacliers and learners, in general, and also for non-HLT initiated CALL

practitioncrs.

O Servicio (Ic I'iiblicacioncs. Uiiivcrsidnd dc Miircia. Al l rights reserved. IJES, vol. 2 ( l ) , 2002. pp. 129-165

Page 17: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Froiii tlic earliest days, M1' Iias been bedevilled by grandiose claims and exaggerated

expectatioiis. Iii prcsent day. Iiowever. tlie terni MT is generally the standard for computerised

systcnis rcspoiisible Sor tlic production of translations froni one natural language into another,

witli or witliout Iiuiiian assistance.

Altliougli tlie ideal niay be to produce high-quality translations, in practice the output of

niost MT systcnis is revised and edited. In this respect, MT output does not differ much from the

output OS niost liunian translators wliich is normally revised by other translators before

disseniinatioii. MT output niay also serve as rough or raw translations.

Wliilc iiiany of thc conin-iercially available MT packages may be useful for extracting the

gist of a text tlicy sliould not be secn as a serious replacement for the human translator. Most

iiiacliiiic translations are not tliat bad, tlicy are half-intelligible, letting you know whether a text is wortli Iiaviiig traiislatcd properly aiid tlicrc are niany situations where the ability of MT systems

to produce rcliable, if less than perfect, translations at liigh speed are valuable. Even where the

quality is lowcr, it is oficn easier and cheaper to revise 'draft quality' MT output thaii translate

it entircly by Iiaiid. Thc translation quality of MT systems depends mainly on restrictions of the

translation doniain, linguistic arcliitecture and coniponents.

Iiiiposing rcstrictioiis on the input sucli as (a) limiting the texts to particular sublanguages

of docui-i-ieiii type aiid sub.ject Gcld aiidlor (b) controlling the language (reducing anibiguities,

collocluial cxprcssions. ctc.), niay iniprove translation quality.

liegarding M'i' arcliitecture, tlie first MT systenis are generally referred to as Iiaving a

dircct traiislation approach. 'flie niain idea beliind this architecture is that source language

sciitcriccs caii bc transtoriiicd into target language sentences by shallow analysing the source text,

replacing source words with tlieir target language equivalents as specified in a bilingual

dictionary. aiid tlien rouglily re-arranging tlieir order to suit the rules of the target language. The

sccond basic type is tlic intcrlingua approach. T1iis type assunies the possibility of converting

texts to aiid Sroiii rl~ecrnitig representations coninion to more than one language. Translations

consist o i two stages or pliases: (1) fioni the source language to the interlingua and (2) from the interlingua to tlie target laiiguage. 'Tlie third type OS MT systcnis, the transfer approach, involves

three stagcs: (1) converting source tcxts into intcrniediate representations in which ambiguities

Iiave beeii resolvcd irrespectively OS any otlier language, (2) converting tliese into equivalent

rcpiesentations of the targct language. and (3) generating the target texts (translations).

Soiiic otlicr MI' systcnis rely less on the approaches mentioned above. Example-based

iiiacliiiic traiislation. Sor instancc. does not cinploy niapping between languages but instead

matchcs stoi-ed translation cxaniples against eacli other using a bilingual Corpus of translation

pairs (Nagao 1984). An evcn iiiorc radical approacli to MT is the statistical approach (Brown et

al. 1993) wliicli rcquircs tlie use of large bilingual corpora which serve as input for a statistical

O Scrviciu tlc I'iiblicacioiies. Llriivcraid;~<l dc Murcia. ,411 righis rescrved. IJES, vol. 2 ( 1), 2002, pp. 129-165

Page 18: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

translation modcl.

Regarding language teacliitig, MT systems can be easily and efficiently integrated into the

learning process. Some potential applications are7:

Translating full texts or paragraplis. Student can tratislate and then read the texts in their

own language. extracting the gist without teaclier intervention (Figure 14). In most MT

systenis, users can translate sentences automatically or interactively. Automatic

translatioii proceeds autonomously. without the intervention of tlie user. whereas in

interactive translation. the user can intervene in the translation process and choose the

best word whencver more tlian one translation is posible.

Translating sentence-by-sentencc and print the source and target texts in a line-by-line

formal. pl'liis layout can be useful for comparing the original and translated text. This

allows studetits to explore for equivalents between the source and target language, look

to crroncous translations/falsc friends and assist in their own translations. (Tuhle 1).

Studying or writing in a foreign languagc.

Looking up words (dicíionary) and their inflections (Spanish grammar) (Figure 15) .

1 Eie E& T a i t 5cwch Famal 11mrlarc Qcsimr Y& Hcb 1

s t a n d weii before computers existed Tribbie and Jones (1990) trace the history o¡ ronsordancing from the 13th century. when Hugo de San Charo enlisted 500 rnonhs in prriducing a complete concordanre of the Latin Bible That is a kind of ie t~ ienre work desigiied to assist iii tlie exegesis of tlie Bible consisting of al1 (niiurrenres of terms nanies etc that were felt to he significant and presented thesa terins in a way that would help the researrher

1 El uso de [concordancin~~ en hteraiura y anaihis Iinguist~co son nada nuevo corrienzo bien computadtras anteriores existieron Tribble y Jones (1990) rastro la historia de [roricoidanring] del 13th siglo. cuando Hugo que [de] San Charo, alisto 500 rrioriles en producir una [coricordance] completo de la Biblia latina Ese esta. un QbnerO da ratarenria irabaja diseño asistir en el exegesis de la Biblia, corista de todo ocurrencias de terminos, nombres, etc , se erifieltro estar sigriificante ese. y preserito estos terminos en cieno modo ese ayudaria el investigador

Figiire 1.1. Autoinnt ic T e x t Translat ion (Englisli-Spanisli)

Q Servicio de I'iiblicacioiics. Uii iversidad de Murcia. All rights reserved. IJES, vol. 2 ( l ) , 2002: pp. 129-165

Page 19: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

l . Tlic use o f coiicordanciiig iii literature and linguistic analysis is nothing riew.

El uso de [coiicordancing] eii litcratura y análisis lingüistico soii nada nuevo.

7. I t staitcd well bcfore coinputers cxisted.

Coiiieiizó bien cornputadoras antcriores existieron.

3 Tribble aiid Joiies (1990) trace tlie Iiistory ofconcordancing froni thc 13th century. wlieii Hugo dc San Charo enlistcd 500 iiionks in producing a complete coricordance o f tlic Latiii Biblc.

Tribble y Joiies (1990) rastro la liistoria de [concordancing] del 13th siglo, cuando Hugo que [dcl Saii Cliaro alistó 500 inonjcs en producir una [concordance] completo dc la Biblia latina.

4. Tliat is. a kiiid o f refcrcnce work designed to assist iii tlic exegesis o f tlie Biblc, coiisistiiig o f al1 occurreiiccs oftcriiis. naiiies. ctc., tliat were felt to be signiticant. and presciitcd tlicse teriiis ir1 a way tliat would Iielp tlie researclicr.

Ésc está. un género de referencia trabaja diseiló asistir en el exégesis de la Biblia. coiista dc todo ocurrciicias de térrriirios, noiribres, etc.. se entieltró estar significante Csc, y presciitó estos términos en cierto iriodo ése ayudaría el irivestigador.

Table l . Line-by-line Printed Translation

resenl Pedect re ler i le Perfecl

nusntrns qucircrnos vusniius guciréls

resenl Ped. Subj.

Figure 15. lntlection Look-up

B Servicio de I'iiblicacioiies. Uiiiversidad de blurcia. Al l rights reserved. IJES, vol. 2 ( l ) , 2002, pp. 129-165

Page 20: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487
Page 21: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Tablc 2. Tag-set ofeTiKeT(i)

T a b l c 3a. Iiifoririatioii oii Sessioii Perforiiiance aiid POS-Disaiiibiguation Context selected

Tablc 3b. liiforiiiatioii oii Tags , Date, Sessioii. Text or Single Word Tagging aiid Languagd'

O Servicio cIc I'iiblicaciones. Universidad dc Murcia. All rights rcserved. IJES, vol. 2 ( 1 ), 2002, pp. 129-1 65

Page 22: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Table -1. Data Basc Extract (Types. Tag. Frequency. Session. First-Tiine Guessing of the Type and Language)

Puniuaciion(comma) Delermlner ~ e v calegory 7 3 Adlectlve

Change

Verb (aux) Adverb Verb (lexical) Particle Verb (aux) veib (iexical) Preposllion Delermlner Noun Preposlllon Determlner NOun Palllcle Vrrb (a",! Verb (lerical) Delerminei Adletllve r4oun

Figure 16. t.TiKeT/I (Data Basc Layout)

O Scrvicio dc Publicacioiies. Uiiivcrsidad de Murcia. All rights reserved. IJES, vol. 3 ( l ) , 3002. pp. 129-165

Page 23: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

[Pun(Calon)] ...............

Ualc[Nl whisk~lfll , IPunlcamI 1 Graln[Nl vhlsky[N] and[Conl BlendedlddJl uhi3ky[N] . [Pun(Full Stor ---..----..----.

Therelldvl areIlii*l chree[Ietl cypes[Nl of[Pie] Scocch[ldj] uh~sky[N] [Pun(Cr>Ior,)] ............... Ralcífdl uliiskv[Nl . iPunlconine11 CrairilNI uhisky[Nl and[Con] Blended[ldJl vhisky[N] [Pun(Full Stor --..-----.-----.

nelt[Nl uhiskYlN1 ls[luxI produced(V1 orily[ldvl froin[Prel 100[Och] mslted[bd]] barley[Nl [Pun(Ful ...............

GralnlNl uhlsky[NI ~siluxl producedlV1 frorn[Prel a[Derl variecy(N1 of[Pre] cereals[N] vh~ch[Prol n ---------------- Blended[ld~l uhlsky[Nl ls[b.ix] alDec1 camblnatlon[Nl ot[Pre] nalc[ll] uhis*y[N] and[Con] Graln[N] r ..--.--..--.-..-

slngle[Ad~l nalis[N]

................ iiost[Advl otlPie1 chelDec) ubiskles[N] covered[VI ~n[Pre] rhxs[Dec] cexc[N] src[Aux] 51nglc[Ad)] Y ................ TIieselDecl are[Nl tlie[Det] praducts[Nl oíIPce1 ind1vldual[Ad3] malc[N] whziky[N] d~scillerlei[NI ............... FoiIPrel rxariiple[Nl .LPun(coma)l Aberlour(N1 , [Pun(comtia)] EdrsdourINl ,[Pun(co-)] Lapbroalg[N] ................ Houever(Cun1 , [Fun(cormall che[Derl actual[ld]l diic~llery[N] name[Nl daes[Aux] not [Adv] have[V] t ................

I d Paa &mi A* p.%- Fl NUM

Figure 17. eTiKeT@ (ASCII Layout)

3 1 0 0 6 3 0 -2 -2 4 2 8 2 -2 -2 O 2 8 3 0 - 1 O O 1 8 3 - 1 0 0 O 1 8 3 0 8 4 O 1 8 3 8 4 0 O 1 8 3 4 0 - 2 8 1 8 3 o -2 -2 4 1 e 2 -2.2 O 1 8 3 0 2 1 O 2 8 3 2 1 5 O 1 8

3 . 2 0 0 3 1 5 6 2 1 8 3 5 6 1 3 1 1 8 3 6 13 4 5 1 8 3 1 3 4 0 6 1 8 3 4 0 . 2 13 1 e 3 0 -2.2 4 1 8

3 6 13 4 2 .2 -2 O 1 8 3 2 1 6 O 1 8

Figiirc 18. Inforination on Iiiferred Patteriis and Statistics

& h v o L d c h 1st preitar mrmata &pirbor Mrrarrietls VeMana 2

E . &h?' $ i f l Y j B # #a'.% m-a- E).

O Servicio de Publicaciones. Uriivcrsi<lad de Murcia. Al l rights reserved. IJES. vol. 2 (l), 2002, pp. 129-165

b - -

Patroi,Code 1 Nuinante 1 Anteriores 1 Numpost 1 Porteiiores 1 Cateyoria 1 Apariciones 1 SesionCoda 1 - 1 1 -2 3 2 1 0 0 1 8 b 2 2 -2 5 3 1 0 0 6 2 1 8 3 3 - 2 5 2 3 0 6 4 10 1 8

Page 24: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

59 1-621 'dd 'ZOOZ '( I) z FA '~3~1 .pa~~asa~ slq~!~ 11~ 'e!-'.inn ap pep!s~a~!un .sauo!s~s!lqiid al> o!s!~~s~ O

aInJ xge aqljo lred se readde lou saop ruroj ~JOM ej! 'L11eu!d WJOJ aseq aill Bu!ilndlno 's.tnn

WJOJ PJOM aqljjo uayel S! s- xwns 1ernld aqi 'a~diuexa JOJ :pa!llde are saInr Su!dd!~ls xue jo

Jaqiunu e uaql '(poqlaur aseo-Lq-aseo) plompeaq Lue rapun palsq lou S! LUJOJ PJOM ej! :poqlaru

8u!dd!rls xge (z) fpooX plompeaq aql Japun palunoo pue palsq ale lsaq pue rallaq 'a~druexa JOJ

:salnr~o sueaur Lq 'sa!l!re~nBarr! ql!~ leap 01 poqlaur aseo-Lq-ase3 (1) :sassa~ord pau!qiuo3 lnq

luaraj~!p OM] Lo~dura L11eo!dLl sJaspeiuiua[ 'sa!]!~e[n8an! Bu!pn~ou! '~~o~oi~d~oii~~o sa!l!xa1diuo3

aql aIpueq 01 JapJo u1 .L~ma alSu!s e Japun pals!l ale ~JOM e JO sii1103 paiipap pue pal3a~ju!

[eo!So~oydiour sno!reh aq] a~aqm ñ~euo!]~!p e u! se lsnr luo!les!leruiua1 :proMpear[ uoruruo3

e Japun PJOM E JO SLUJOJ pale[aJ JO [eJ!]uap! aql [[e 1aqlaSol iCJ!sse[3 01 S! 'srual! 3!is!nBu!l

JO Su!lunoo ay] ~oajje A~sno!ras ueo qo!qm '(uo!l.7as-,y~)d aas) srualqord le!luaiod Jaipo pue s!qi

qi!~ Su!leapjo Lem v .(sadL]) surroj prom luarajj!p se (euriual) LUJOJ aseq aiues arIlJo (suayol)

sur~o~ paloauu! ssa3o~d plnom s~a~uep~oouoo prepuels 'ahoqe pauo!luaru sv ~,07lclLlarueu 'erurual

auo pue (sadL1) sui~oj ~JOM Jnoj '(suayo]) sp~om au!u aheq aM aJaqm 'piCt7ldpuepaiC17ld ;Yu!iC17ld

;Cqd 'siCnld 'knld 'paiCi7ld 'Xu!iCi?ld 'siCnld :aouanbas pro^ BU!MOIIOJ ay1 Jap!suo3 y.mpodur!

S! (seruural) SLUJOJ aseq pue (sadL1) SLUJOJ ~JOM aql '(suayoi) sp~om uaa~laq uo!i3u!is!p ay1

Page 25: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487
Page 26: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

A much more interesting and challenging application is Togcorder". Tugcorder has been

implemcnted to allow conlplex searches within the CUMBRE Corpus (Sánchez et al. 1995) and

takcs fiill advantagc of tagged and lemnlatised data. Users can invariably look for terminal nodes

(types; Figtn-e 22). non-terminal nodes (POS-tags; Figure 23) with any additional tagged

grammatical information (ntrmher, person, mood, tense, etc.), base forms (lemmas; Figure 24)

and/or any conlbinations oftypes. POS-tags and lemmas; Figures 25 and 26). The program itself

is very interactive and flcxiblc in its search procedure and extremely fast as it works with pre-

indexed text.

Señar direciar Un articulista esciibia en,',EL'que en la guerra de Espoíio solamen srnbtguedades Djindlicdeclaraba ayer a ,\bi. que su partido no se aliara en ningun

Las luentes consulladas por LlJCdescartaron ayer que lo sucedido el l

Figure 22. Type Searcli "ABC"

Q Servicio (Ic I'iiblicaciorics. Uriivcrsidad de Murcia. All rights reserved. IJES, vol. 2 ( 1 ). 2002, pp. 129- 165

Page 27: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

I r t !c~t~t t / i i~g Corpits-b~tserl Kcsortrccs UIILI Xu'olitrul Loi~gitugc Proccssing Tools inro CA LL 155

pdabiai EEEEIRllCü

El director de la Real Academia Er~ariola ha sido nombrado doctor 1 1112 1 'honons causa' por la Universidad de Euenor Ares. a propuesta de la

1 Facultad de Fdosofia y Lctras, cn rcconocurvento a su destacada - 1 trayectma en los CSNYDS de lengua y Literatura espariola

ABIERTA 1 ABIERTO ABOCAR ABOGA ABOGADO 48UNO ABORTOS ABRE

El director de la Fi-al AcademiaEspañola ha sido nombra El diredor de la Real Academia Es!iaiir~la ha sido nombrado doctor 'hono

E x r r e s h le Filosofiay Letras. en reconocimiento a su deítocr.dn trayectoria en los estudios de I IC5DJETIVOI ?cloria en los estudios de lengua y literatura czpa6oio.

La niinistra de Asuntos %o.:iaie,: "resentará las Iineas de acluacic heas de actuación del Gobierno en politica 1airi:i:hr durante la Conlerencia Inleinecioni 10 en politicafamiliar durante la Conlerencia I~i?~rno¿io:inl que mafiana empieza en Pai que ha sido organizada con motivo del Año Ititer~isr!onal de lo Familia que se celebre i Jesucristo no está entre nosotios de lorrna~i i ib ie. el Espirilu Santo. a Iraves del Minis

r iaiodo el obispo de Málagaen relación a IaiJl'iriio encíclica Verilatis splendor" e produce algunatregua "sera por decisióniirilnteral' de los etarras.

Pdse F1 paa &mi ay& 1 1

465.902 535 Fiases A=

1 da53

& & $J 'a " 9 f l e p . I CWOI ...] &Erpl..l $~IcI.. {r %Q@.?d OB&Q@196a42 12:lU

Figure 23. POS-Search "ADJECTIVE"

EEEEO30103

El director de la Real Academia Española ha sido nombrado doctor

traycctona cn los estudos de lcwua y Literatura española

ABIERTO ABOCAR ABOGA ABOGADO ABONO ABORTOS

I diredor de la Real Academia Española ha :-irlii nombrado dodor 'honoris causa" poi mal que mañana empieza en Paris y que ha F [ ~ O organizada con motivo del Año Iniern i algún ~iiomento se produce alguna tregua 'sriio por decisión unilsleral" de los etarras Cámara Alta que preside Juan Laborda ha :,idri consensuado Iras cual años de iraba

Esto cifro de pasaleros ea l a mas alia registrado en un día desde nto. seqún dalos de Renfe. cuya presidenta c.- Merce Sala.

Probablemente no se equivoca Argol iaei un buen ejemplo de como entienden m 'Yeltsirino r:; un Jefferson sino quizá un Piiiochel' S<

entemente eii el'l4erYorkTinies'LaIiase c: ingeniosa pero injusta Yeltsm es bastt ." La frase es ingeniosa pero injusta Yeltsin es bastante peorque Pinochei npuso. los horreridos crimenes que comelió l u e i x el precio que Chile pagó para recol

Figiirc 24. Lernina Search "SER"

O Servicio dc Publicacioiies. Uiiivcrsidad de Murcia. All riglils reserved. IJES, vol. 2 ( l ) , 2002, pp. 129-1 65

Page 28: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Se enfrenta a un IPailorren.o elegido por una ambigua legi Es un cur\ i . i j r~j-~. I~ que se oludode supropiaC

con una politica caienle de prinopios un ci Jii de huniillauones seguido por otro de

Figure 25. Coiriplex Searcli: "UN" + N O U N (Coiiiiiioii Countable + Masculine + Singular)

Honeshdad que se ha puerto y se va a poner a prueba ante el proceso

ABATIMIENTO

MIERTA ABIERTO 2 PBoClR 1 ABOGA 1 ABOGADO 4 ABONO 1 ABORTOS 1 MRE 1 I

Qwach

Y A ante el pioceso sedudoi de Milosevic para arrancarselo a la oposición y traerlo a su c ' i ki _I

Rls Fl paa asa .+u& 465 9(12 1 F i m ti-' l&l

~ R O W 1 K w a 1 &E& 1 ~ ~ I C T ](SLa M ZJ 3 R m %#@ii EI18 Q O@*Ba-i 12~3

Figure 26. Cotnplex Search: VERB (Main + Two Clitics + Intinitive)

O Servicio dc Publicaciones. Universidad de Murcia. All rights reserved. IJES, vol. 2 ( l ) , 2002, pp. 129- 165

Page 29: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Iiiregi.ri1ii7g Giptis-husctl Kcsoiirces untl hu~rii~ul Lungi~uge Pvocessing Tools inro C 4 L L 157

Parsing involvcs tlie procedure of bringing basic morphosyntactic categories into high-leve1

syntactic relationships with onc anotlicr. This is probably the most commonly encountered form

of Corpus annotation aftcr 1'0s tagging. Parsed corpora are sometin~cs known as ireehrrnks. Tlicrc arc rules governing tlie way in wliich words can be put together to form

syntactically well-formed or graniniatical sentcnces: the study of syntax aims to discover them

and to describc and analyse language in ternis of these rulcs. Consider the sentence A dog chrised

/ha/ gil./, wliere we Iind the samc pattcrn of constituents before and after the verb. that is

rie1ei.riiit7er + noz~n. 'These two words also appear to be belong togcther more closely than say the

noun dog and tlie vcrb chrr.vct/. Anotlier way of illustrating that these words belong together is

to givc tlie girl and tlic doga nanie -names of specific itenis sucli as individual people, animals,

places and so on callcd proper noz1n.s-. and we get, for exaniple Henry ch~ised Crirol.

It scciiis clcar tliat natural languages or hunian languagcs have a role of cons/i/uenr

. ~ / I . L I c / z I ~ . ~ . A scntcncc is notjust a nierc string of words. The words are groupcd into phrases, each

of whicli consists of a short phrasc. Many of tlie importan1 properties of languages are organised

around constitucnt structurc. Constituent structures (a) group words into constitucnts such as rhe

riog and iti/o /he grrt.rien; (b) givc nanies to the constituents, such as noun phrcrse and

/>1.c/~o.vi/ionall7kra.ve. In turn. constitucnt structures are sanctioned or gcncrated by rulcs, known

as ~ ~ l r t ~ r r . v e - . ~ / r ~ ~ c / z ~ r e r~11e.v 01' this typc:

S +NI' Vi' NI-> -, h'

VI' -, l/.

wherc S stands 1or scntcncc. NI' for noun phrasc, VP for verb phrase. N for noun and Vfor verb.

So tlie 1%-rulcs abovc states that a scntencc consists of a noun phrase followed by a verb phrase.

In turn, tlic NI' o f a n N and the VP ol'a singlc V. The tree structure derived or generated by that

rulc would bc

Parsing algorithiiis can proceed top-down or bottoni-up. In some cases. top-down and

bottoiii-up algorithiiis can bc combiiied".

(0 Servicio dc I'iiblicacioiies. Uiiiversidad de Murcia. All righls reserved. IJES, vol. 2 ( 1 ) . 2002, pp. 129-165

Page 30: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Tlic Visual Intei-active Syntax Learning (VISL) websitei"s particularly interesting and

useful for languagc learners. It contains an on-line parser and a variety of other tools concerned

witli English graniniar. including ganies and quizzes. The parser itself is an excellent and very

transparent application tliat allows learners to analyse and experinient on sentences and study

tlieir structure (Figure 27).

lntcrcsting in this respect is also tlie parsing of students erroncous input. Integrated

parscrs into CALL soliwarc can be prepared to dcal with linguistic errors in Ihe input. So the

grammar tliat copes with correct sentcnces is coniplen~ented with a grammar of incorrect

senlcnces. 'l'he advantage of this error granimar approach is that tlie feedback to students' ouput

can be vcry spccific and is nornially fairly reliable as it can be attached to a very specific rule.

However, tlie niajor drawback of this approach is tliat individual learner errors have to be

anticipatcd in tlic sense tliat eacli error nceds to be covered by an adequate rule.

Seiilrnre The giri saw the hay with a telescope

Funrtioi, _o] S S I P Od 0 i 0 1 C s Co A SUB CO CJT 0 H U i i STA QUE COH EXC >

F ~ r m n u adj a& ari piiin prp con) num infm intl nll ti par 0 x O I Collapse Tiee

STA c I

S P Od A

ait n art n p r p

l I The gir l t h e boy mlh D H

art n I I

a t e l e s c o p e

Figure 27. Visual Iritcractive 011-Linc Parser

CALL soliwai-c Iias normally bccn rcstricted to written text. I3owever. recent advances in

niultiiiiedia llave resulted into powcrliil hardware and software applications. allowing users to

O Scrvicio tic I'iiblicaciones. Universidad dc Murcia. All rights rcscrved. IJES, vol. 2 ( 1 ), 2002, pp. 129- 165

Page 31: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

aitacli a n-iicropl-ione and loudspcakcrs to soundcards and to record hislher own voice.

Furtl-ierniorc, storing tl-iese sound tiles is no more a problem due to the imniensely increased

capacity and cost reduction of I-iard disks, otl-ier storage devices (CD-ROM) and improved

coniprcssion algoritl-in-is í'or tl-iis kind of data (i.e. MP3 files).

I'rescntly, tl-ierc is a widc iaiige oí' speech software available. This includes (a) spoken

input processi~ig '~ or spccch analysis, wliere speech input is analysed and represented graphically

or nun-icrically; (2) spcccl-i rccognition: the transiorn-iation of spoken input into written output;

and (3) spccch syntlicsis, that is ilie conversion oftext to speech"; this includes no just matching

cl-iaractcrs to sounds, bu1 also intonaiion and the rl-iytl-in-i particular utterances I-iave. Advances in

spcccl-i syiitl-icsis tecl-inology I-iave reached a I-iigli leve1 of performancc and robusiness and some

CALL applicritions havc staricd considcring its integration"'.

111 contrast, specch recognition is í'ar more con-iplex than speech synthcsis. Speech

rccogiiiiion needs an cxtensivc analysis of spccch by nleans of a number of parameters, which

arc vcry dilíiculi to cstablisli as tliey can be casily aíTectcd by background noise, speech speed

(conncctcd spcccl-i). pariicular acccnts or idiosyncraiic individual's speecl-i. All tl-iis leads to

coiiiplicatc aiid intcrlerc in tl-ie lixiiig and interprctation of the establisl-ied paranictcrs. . . I hcre arc son-ic eomn-iercial applications able to "understand" natural speech and can

providc Ilinguagc studenis with realisiic, highly effective, and motivating spcech practice. One

of ihis application is IBM.i<> Vi(~b'oice:k). This progran-i runs on normal PCs and includes speaker

indcpcndcni coiitii-iuous specch rccognition cngines and is able to deal with con-iplete sentences

spokeii rii a natural pace, not just isolated words. though it requircs a minor training period. ?'o

run ~ l i c progran-i, tlic uscr just needs to associate it with any wordprocessor, wl-iere ihe user's

uttcranccs will to bc iranscribed in (Figitr-e 28) or run Speech Pud, a sin-iplified standard

wordprocessor that includes Victb'oice.

Many niultin-icdia CALL courses already I-iave and still include son-ie naive direct

proi-iunciaiion prücticc. That is, cxerciscs wl-iicl-i focus on pronunciation, fluency and word order,

and witli iiativc spcakcr i-i-iodels wliicl-i are I-ieard in-iniediately after a student's perforn-iance. These

applications leave tl-ic learncr-n-iodcl con-iparison to student's criteria or visualise graphically both

performanccs, ii-idicating tlie success ratc in %. The negative side of these exercises is that in

soi-i-ic instarices ii is cven for naiivc speakcrs of the language very difficult, if not impossible, to

achievc saiislactory success rates. Somc of tl-iese applications are neitl-ier very flcxible nor

acciiraic and soi-i-ietin-ies studcnts would need to repeat their utterances in several occasions before

tl-ie progran1 "iiiiderstands" thcn-i corrcctly. In iurn, this inighi lead to some sn-iall frustration.

O Servicio tlc I'iiblicacioiics. Uiiivcrsi<Ia(l dc Murcia. A l l rights reservetl. IJES, vol. 2 ( l ) , 2002, pp. 129-165

Page 32: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

V. CONCLUSION

CALI, Iias 1br long bccn doniinatcd by the drills and practice associated with behaviourisni (cloze

and gap-iilling excrcises, ni~iltiple-choicc tests. etc.) and, eventually, the use ofsome basic word-

proccssing tools. Soon. sonie languagc tcachers and CALL practitioners reacted negatively and

iiotcd a laek ofpiogrcss iii CALL (Kaliski 1993, Last 1992, ctc.), partially due to the:

Idiniiicd CALL soliware available

Rcduccd iiuiiibcr OS coniputa~ional cxcrciscs

IncoiiipaLibili~y bciwccn einployed CALL techniques and current language teaching

pedagogy (paiiiciilarly influential in tliis respect has been tlic eniergence coniniunicative

syllabus)

Coiiscqiicncc of iiew technology being unablc to fulíi11 teachcrs' expectations

I~or t~ ina~e ly . iliings Iiavc clianged for CALL in the 90s, partly because of the wider

availability 01' PCs and tlie iniegraiion of linguistic corpora and NLP-technology. The use of

linguis~ic corpora and NLP-applica~ions arc liighly valuable tools for language dcscription with

iniporiaiit iniplications Sor laiiguage tcacliing. as they can:

Assist laiiguage tcacher in identifying relevant content of instruction (vocabulary, grammar,

coiitexts. ctc.), and

Iiclp in developiiig new pedagogical and niethodological approaclies to instruction (i.e.

sliiliiiig tlic pcdagogical tcacliing/learning paradigm from computer as niagister to computer

as pcdagoguc).

Moderii CALL has changcd and instead of adapting it to what software can offer, an

aitenipt is niade to get it LO take account of tlie neccssary conditions of successlul language

leariiiiig. Lcai-iiers arc givcii mucli niorc control over what they learn: autononious language

Icarniiig iii scll-paccd, niorc inieractivc, nieaning doniinated. task-orientcd activities (Kennedy

O Servicio <Ic I'iiblicaciones. Llnivcrsidad de Murcia. Al1 righis reserved. IJES, vol. 2 ( 1 ), 2002, pp. 129-1 65

Page 33: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

1/71c~gi.~i/ii7g Coi~~ii.s-hosetl K~~surii~ccs iinil,'L(iliii-rrl Lotigiicige P I . O L . L ' S S ~ I ~ ~ TUOIS i1710 C,1 LL 161

1998: 393).

I'articularly inlercsling in tliis respect is the use of real and relevant text sample for

sludcnts and tcacliers as tlie central pcdagogical teachingllearning cornerstone. Real time

nianipulatioii of texts by sludcnts using integrated user-friendly interfaces, including word-

proccssing lools and N1,P-applicalions could conforni an exlremely valuable pedagogical

paradigm williiii thc foreign language Icarninglteaching context. Teachers would be able:

. . l o cxtracl, nianipula~c and adap1 tcxts to students needs and language level

L'o cnrich plain tcxis with POS-tags and syntactic annotations for class work

'fo cxtracl vocabulary lists, phasc lisís, coiicordance lists, etc. (sublanguage specific, adapted

to a spccilic Icvcl. or doniain, etc.) . . I o gcncrale autoniatically c/d hoc cxerciscs, depending on students' particular needs.

Similarly. sl~idenis could also take advaniagc of this integrativc CALL application

To cxplorc [he targcl language by nicans of concordancers with integrated taggers and

oarscrs andlor laggcd tcxls and treebanks

fo cxlrnct Llie gist oSn1oi.e difficult texts, using MT-software . . l o check 1lic mcaning of words and plirases (electronic dictionaries) . . Lo gcncralc autonia~ically rrd hoc exercise gencration, depending on one's own needs

To Iicar ilic tcxt. sclccted scntenccs or words, using spcech synthesizers

To aiiswer orally LO sonic rcsponses, dictating the solutions to the coinputer (speech

rccognilioii lools)

Aclually, wliat wc propose is a sophisticatcd CALI, language processing tool" that

Takcs f~i l l advanlage oScurrent coniputational advances in an integrated and unitary way: I'leclronic dictionaries (nionolingual, bilingual or niultilingual ones)

M'I' systcnis

1'0s-~aggcrs

Synlaclic parscrs

Coiicordanccrs

Spccch ~~roduc~ion/rccogniiion Word-proccssors (this includcs spcll-clieckers and gramniar and style chcckers)

Gocs bcyond wriltcn text. as it also accounts for oral production and oral recognition

Assists bol11 ~caclicrs and studcnts in tlieir respcctive tasks and that could contribute to new

and cliallenging pcdagogical and iiiethodological paradignis in the area of foreign language

Icarning.

(0 Scrvicio [Ic I ' i iblicacioiie~. Uiiivcrsi<l;id de Murcia. A l l righis reserved. IJES. vol. 2 ( 1 ) . 2002, pp. 129-165

Page 34: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Aiid siiice we Iiavc al1 tliese computational tools are our disposal, it niakcs no sense to

renounce thcir application in sucli an iniportant area as language pedagogy. We cannot dismiss

tlicni. wc musl use theni ...

NOTES

l . Sec aiiioiig otliers. tlic Osford Concordancc Prograin (OUP 1988), Longinan Min i Concordancer (Chandler and Tribblc 1089). MicroCoiicord (Scott and Joliiis 1993). MoiioConc (Barlow 1996). T A C T (Bradley and Presutti 1989). or WordSiiiitli (Scott 1999). 3, l l i l l > : ~ ~ ' \ ~ c l ~ . l > l l ~ i l l l . ~ i c . i i ~ ~ ~ i ~ l l l l ~ l 1 : ~ l i i l l c ~ ~ i l c . l 1 t l l l 3 . C'oii/e\.l caii bc dowiiloadcd frcc o f cliarge for noii-coiiiincrcial purposes froiri T i i i i Johns DDL-web page: Iiitp: i ~ c b . t ~ l i í i i i i . a c . i ~ k ~ ~ ~ ~ I i ~ ~ s t l ~ ~ r i i n c ~ ~ i i c . l i i i n 4. Saiiclicz. A . aiid P. Caiitos (3000) Piricliccr /rr i~ucubi/laiio. Madrid: SCEL. 5. For a iiiorc coiiiprcliciisive survcy. visit "Module 3 . 5 . Hurnan Langiiage Tecliriologies (HLT)" o f theli?fornin/iuri titic/ C ' o i i t ~ ~ t i i i i i ~ . ~ i / i o ~ ~ . s T e ~ h i t o I o ~ ~ j o r L~iiigiirige Terrclier(ICT4LT) wcb page: Iitti>::~\+\r-\c.ict4lt.or$ciilcii iii»d.;- 5.litiri 6. Hutcliiiis aiid Soiiicrs ( 1992) aiid Ariiold ct al. ( 1994) provide excellent introductions to MT. 7. Tlic M T systeiii used Iicrc is Spciiiish A.s.sisrcriil (MicroTac Software). Otlier coininercial M T software: Syslruii ( l ~ ~ l ~ ~ : ~ ~ l ~ ¿ ~ l ~ ~ l f i ~ l ~ . : t l ~ ~ ~ i ~ ~ ; ~ . ~ l ~ ~ i i ~ i l . c c ~ ~ i i : ) or Poi~>er Tr(iiis/n/or ( l i t l p : ~ ~ ~ v ~ v ~ v . l l ~ ~ l . c ~ ~ ~ i i ~ p o ~ : c r ~ ~ ~ r ~ s l ~ ~ t o r ~ ) . 8. Sce Greeiie aiid Rubin ( 197 1) for a detailcd dcscription o f tliis tagger. 9. Describcd iii detailcd by Garside (1987) aiid Marsliall (1987). 10. To get a frec copy. for acadeiiiic purposcs only. e-inail Rafael Valencia(rat;lvaleiicia(ii~~>~io.es). Rodrigo Martincz ( r a ~ I i ~ i c o ~ ~ i ~ < l i l ~ ~ ~ i i i . ~ ~ ) or Pascual Caiitos (pcaiitos:Ífi~iiii.~s). 1 1. Wlicre 0 = Eiiglisli aiid 1 = Spaiiisli. 13. TI.-Tcig(Tecliiiol~iiigua) is pait o f tlicC'UA,IBRE Curpris Project aiid is not yet coinmercially available. For tliose interesied iii i t coiitoct aiiy o f tlic pcoplc iiivolved iii the Project: Enrique Pérez de L,eina (tlcIciiia~rjia~zli.ce.coiri). Josc Siiiióii (j~i3X746<Z~telcli1ic.cs). Aquilirio Sáiicliez (asai ic l ic~~~?~ir i i .cs) or Pascual Cantos ~>caiitos(Z~uiii.cs). 13. Diez. P. L. aiid J. lborra (1909) Cérbos Espriñoles Coitjiigridos. Madrid: SGEL. 14. Scc ciidiiotc 12. 15. Alleii (1995). Coiivigtori (1004) aiid Gazdar aiid Mellisli (1989) include excellent iiitiodiictory sections on parsiiig algoritliiiis. 16. I i i i1 i : l~v i~ l . l i i i i i i .~111.~1k~ 17. Ari cxcellciit sitc is liitegrating Spcccli Technology in (Langliage) Learning: Iitt~>:i~\v\v\v.iristi l.ora. 18. Aiiioiig tlie iiiaiiy iiitcrcstiiig web sitcs. check: I i t ~ r> :~ ' ; i ~o r : i l : i ~~~ .c~ in i~s i~ !~~ :~ Ivze . l i t ~n l . 19. A good csaiiiple is Wirispeccli: Iitip:!i~r-\v\v.pc\v\v.caiii. 20. Tlie Polytecliiiic Uiiivcrsity o f Hoiig Koiig cite iiicludcs a iiuinber o f text - to-s~ecl i tools: ~ l t l ~ > : ~ : ~ ~ ~ . ~ l O ~ ~ l i . ~ ~ ~ i l , ~ l k ~ ' ~ ~ ~ t ~ ~ ~ ~ ~ ~ [ > ~ ~ ~ ~ l

31. We Iiave dclibcrately iiot coiisidered tlie integration o f /ifirriici/iori Technologies Iiere. Tliis would Iiave incvitably expaiidcd tlie potciitial o f tlie "tool" proposed Iierc.

O Servicio de I'iiblicaciones. Universidad de Miircia. Al l righis reserved. IJES. vol. 2 ( 1 ), 2002, pp. 129-1 65

Page 35: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

IIEFEIIENCES

Alleii. J. (1095). NLIIII~LII L L I I I ~ L I L I ~ ~ U I I L ~ ~ ~ J I L I I I L ~ ~ ~ ~ ~ . Redwood City, CA: Tlie Beiijamiii/Cuininiiig.

Ariiold. D. et al. ( 1994). hlcrclririe Truri.sl~rtiori. Oslord: NCC Blackwell.

Astoii. G. (1095). Corpor:i iii IA:iiigiiage Pedagogy: Matcliiiig Tlieory aiid Practice. Iii G. Cook aiid B. Sei(llliolkr (Eds.), Prirrciple LIIILI Pr~rclice ir1 Applied Li~igl i is~ics: S~udies ir1 Horrolrr of' H.G. FYi(ltloii~.sori. Oslord: Ostord Uiiiversity Press, 257-270.

Astoii. G. (1996). Tlie Britisli Natioiial Corpus as a Laiiguage Leariier Resource. Iii S. Botley, J. Glass. T. McEiiery ~ i i i d A. W i lsoii (Eds.). Procce~lirrgs of' Teucliir~g ~rtrd L~~trgz~uge C'orporu 1996. L¿iiic¿isier: 1-:iiicaster Uiiiversity Press, 178- 19 1 .

Astoii. G. ( 1997). Siii:iI I aiid 1-auge Corpora iii Laiig~iage Leariiiiig. Iii B. Lewaiidowska-Toinaszczyk aiid J. Melie (Eds.), h~occe~ l i t rg~ qf'llre Firsl I r i~ert~~rt ior lul Cur7fi.rence o11 Prtrc~icul App1icu1io)rs in L~rrigii~rge C'orpor~r. Lodz: Lodz Uiiiversity Press, 5 1-62,

B;iII. C. (1095). IHwxt! A Corpiis-based Hyperiiiedia Resource Tor Old Eiiglisli Lexical Acquisiiioii. Liirp~i~.s/ic a S u c i c ~ ~ of'Airreric(r A~rrr~rcrl Meetirrg, New Orleaiis. Jaiiuary 7, 1995 (Soltware poster sessioii ).

B:irlo\v, M . (1997). Usiiig Coiic«rd;iiice Sottware iii I~.aiigiiage Teacliiiig aiid Researcli. Secorrd Iri/crrratiorr~~l COr~fi.rerrce ori Foreigrr L U I ~ ~ L I U ~ C E ~ L I C L I ~ ~ O I I LI~ILI T e e J ~ t ~ o I o ~ . Kasugai, Japaii (C«iilkreiice).

Barlo\\. M. ( 1095). Corpora 10i- Tlieory ;iiid Practice. Iir~errr~r/ioirul Jourt~ul uf'C'orp~n Lirigliis~ics, 1(1 ). 1-38,

B¿i~eriiiaii. C. ( 1094). ( O I ~ \ W ~ I ~ I I I I ~ EXI>L'I.I~~ICL' Carboiidale: Soutlierii Illiiiois Uiiiversity Press.

Bro\vii P.F. el. :i1.(1993). Tlie Mntlieiiiatics ot'St;itistical Macliiiie Traiislatioii: Paraineter Estiinatioii. C'u r r i~~~r t~~ / io r r~ i / Lrrrgirr~/ic.>. l9(2), 263-3 l l.

Bu riiard, L. ( 1 095 ). Tlre Br.i/i.sIr NLII~OIILII C . ' O ~ ~ L I S Users Rqferctrce Gui~le. Ox ford: Oxlord Uiiiversity Coiiipiitiiig Services.

Biiriiard. L. aiid T. McEiiery (Eds.)( 1999). Retllirrkirig L~rrrgu~rge P e ~ l ~ r g o ~ ~ f i o r r ~ u (.'orp~rs Pcrspectiilc. f~laiiibiirg: Peter l.¿iiig.

Celce-Miirci¿i. M . (1990). D;ita-based 1-aiiguage Aiialysis aiid TESL. Iii J. E. Alatis (Ed.), Georgetoii~rr Uriii~cr..si!,? R~ I I I IL I T~rhIe otr L L I I I ~ I I L I ~ C (~rjdI,ir1g~ri.s1ics 1990. Georgetowii: Georgetown Uiiiversity Press.

Cobb. T. (1997). Is tliere aiiy Measui-able Leariiiiig Sroiii Haiids-oii Coiicordaiiciiig'?. Sys~crri, 25(3), 301-315.

Cole, R. ( 1906). Foreword. Iii R. Cole (Ed.), Slrri,ej.li uf'flle Stcrfe ufl l ic Ar l ir1 H~rrrrutr I . U I I ~ L I L I ~ ~ Tcc////o/<~,q)~. 111 Iit11>:."c~lii.csr.opi.r.ti~iii Il,"l'oiir~cy/l 11 .'i'sLirvey.illinl.

O Servicio tlc I'irblic;icioiio. Universi<lad de Miircia. All rights reserved. IJES, vol. 2 (1) . 2002, pp. 129-1 65

Page 36: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Colliiis. 1 l. (1999). Materiiils Desigii aiid Laiigii;ige Corpora: a Report iii tlie Coiitext o f Distaiice Educalioii. Iii L . Burii;ird aiid T. McEiiery (Eds.), Re/l7illkillg Lurzguuge P e d u g o ~ , f i o ~ ~ i LI C:orpzr.s Perspeclive. 1-1;iiiibiirg: Peter Laiig, 5 1-63.

Coviiiyton. M . ( 1994). N L I I I I ~ ( I / L ~ ~ ~ ~ g z ~ ~ ~ g e Proces~ir~g fOr Prolog Progr~rliiriiers. Eiiglewood Cliffs, NJ: Prciilice 1-121 ll.

Flo\rerde\v, J. (1993). Coiicordaiicing as a Tool in Course Desigii. ,?vs/ew 21(3), 23 1-243

Flowerdew. J. ( 1996). Coiicordaiiciiig iii 1.aiiguage Learning. In M . Peniiiiigton (Ed.)ThePoii~er qfC'ALL. I lo~isloii: Atlielst:iii, 97- 1 13.

Gai-sicle. R. (1 987). Tlie CLAWS Word-taggiiig Systeiii. In R. G~irside et al. (Eds.), The C'ol11pzr/~r/io11u/ AIILI / I~.Y~.S (~/'E~~gli.sli. A C'ovpz~.s-B(~setI Approtrcl7. ILondon: 120ngmaii, 30-4 1 .

Gavioli. L . ( 1 997). Exploriiig Tests tlirougli tlie Concordaiicer: Guidiiig tlie Leariier. Iii A. Wicliinanii, S. Fligelsloiie. T. McEiiery aiid G. Kiiowles (Eds.), Tetrcllillg tr11t1 Ltrng~luge C'orporu. Loiidoii: 1 .oiigiiiirii, 83-99.

G;izdiir. G. aiid C. Mellisli (1989). Mr/~rtrl Ltnigiicrge Proces.siiig ill PROLOG. AII hi/rothrc/io~~ /o C'o~irpir/tr/io~l<rI Lii~giii.s/ic~. Wokiiigli;im: Addison-Wesley.

Greeiie, B. aiid G. M. Rubiii (197 1 ). Azrlolii~rled G ~ ~ I I I I I I I ~ I / ~ ~ L I / Tugg i~~g ufE11glis1~. Providelice Rliode lalaiid: Rrowii Uiiiversity Press.

1-1 iggiiis. J. ( 1 988). Ltr~~gzr(rge, Le~lr11er.s LI I IL I C ' O I I I ~ I I / C ~ S . London: Longinan.

1 ligyiiis. J. (199 la). I.ooking Vor Patterns. Iii T. Joliiis and P. Kii ig (Eds.), C'lussroo~~~ C'oiicorduilcing. Hiriiiingliaiii: Biriiiiiigliiiiii Uiiiversity Press, 63-70.

I4iggiiis. J. (1991 b). Wliicli Coiicord;incer? A Coiiiparative Review o f MS-DOS Software. S ~ ~ / ~ I I I . 19(112).91-100.

I i~ i t c l i i i i ~ , J. iiiid I l. Soiiiers ( 1992). AII hi/ro~luc/ioi~ / o Muclli~ie Trui~slulioi~. Loiidoii: Harcourt Brace Jov;iiiivicli Piiblisliers.

Joh;iiissoii, S. (1980). Tlie LOB Corpus o f Britisli Eiiglisli Texts: Presentatioii aiid Comiiients. ALLC' .Joztr11~11. 1 . 25-36.

Joliiis. T . ( 1986). M icrocoiicord: a I .aiigu¿ige-leariier's Researcli Tool. System, 14(2), 15 1 - 162

Joliiis, T. (1988). Wlieiice aiid Wliitlier Classrooiii Concordaiiciiig?, In T. Bongaerls. P. de Haan. S. Iabbe. aiid 1 l. Wekker(Eds.).C'o~~~pz~/erAppIicu/io~s i i ~ L(n1gzrugs Leur i~ i~~g . Dordreclit: Foris, 9-27.

Joliiis, T. (199 1). Slio~ild y o ~ i be Pers~i;ided -T\\o Saiiiples of Data-driveii Leariiiiig Malerials. In T. Joliiis aiid P. Ki i ig (Eds.), C'l~nsroo~i C'o~icor~l tr~~ci~~g. Birii1iiigli;im: Birmingliaiii Uiiiversity Press, 1-13.

0 Servicio de I'~iblicacioncs. Uiiivcrsicl;id dc Murcia. All righis rescrvecl. IJES, vol. 2 (1). 2002, pp. 129-165

Page 37: Dialnet-IntegratingCorpusbasedResourcesAndNaturalLanguageP-272487

Joliiis. T . ( 1993). Dkita-driveii Leariiiiig: an Update. TELL&C'AI.L, 199312,. 4-1 0.

Jolins. T and P. K i i ig (Eds.)(1991). C'lcrs.~roo~~~ C ' o ~ ~ c o r d u ~ ~ ~ i l l g . Biriiiiiigliain: Biri i i ingliai i i Uiiiversity Press.

K¿icowsky. W. (1987). Rc//cl. E~igli.sll. Sal~burg: Jugeiid-Verlag.

K;iIiski. T. ( 1992). Coiiil~~iter-assisted 12aiigu;ige Leariiiiig. I n P. Roach (Ed),C.'o1i1p11/iirgi11 Lirlg~ris/ics urld 1'11o11c1ic.c: b~/rot luc/o~:)~ Rctrcli~~gs. Loiidoii: Academic Press, 97- 106.

Keiiiiedy. G. ( 1998). AII I I I ~ . ~ L ~ I I ~ / ~ O I I /o C.'orp~~s Lit~x~lis/ics. Loiidoii: L.oiiginaii.

Keiieiii;iiiii, B . ( 1995). 011 tlie Use o f Coiicordaiiciiig iii ELT. TELL&C.'ALL. 199514, 4- 15

Kiieer;i, 1-1. aiid N . Fr;iiicis ( 1967). C . ' o ~ ~ ~ p ~ ~ / c i / i o ~ ~ c ~ l A ~ ~ u / y s i s qf'Prese11l Doy ,4111ericu11 E~igli.sll. Provideiice Rliode Islaiid: Browi i Uiiiversity Press.

12;ist. R. ( 1<)93). Co~i i~~ i iLers ;~i id I.aiiguage Leariiiiig: Past. Pieseiit - aiid tlie Future?. Iii C. Butler (Ed.). C'onlp~r~clit r111t1 FVri//ell Tc.r/.s. Oslord: Bl;icLwells, 227-245.

Leecli. G aiid C. N . Caiidliii (Eds.)(1986). C'ol1rpll/ei..s i11 E11glis11 L L I ~ I ~ I I U ~ C Tee~clli~lg L I I I ( / Reseurch. 120iidoii: 1-iiiigiiiaii.

M;ir>li¿ill. l . (1987). T;ig Selectioii Usii ig Probabilistic Metliods. Iii R. Garside et al. (Eds.), The C ' O I I I ~ ~ L I / ~ I ~ ~ ~ I I ~ ~ I A I I L I ~ ~ ~ ~ ~ S (1f'611lglis/1. A C:orpzls-Buserl Approucll. Loiidoii: Loiigiiiaii, 42-56.

N;ig¿io. M. (1084). A Fraiiiework oi'a Mecli;iiiical Tuanslatioii between Japaiiese aiid Eiiglish by Aiialogy I'rinciple. 111 A. El iiliorii ;iiid R. B;iiiei:ji (Eds.). Arlificic~l L I I I L ~ HLIIIIUII Ir~IeIIige~~cc. Ainsterdaiii: Elsevier Scieiice Piiblislier. 173- 180.

Sáiicliez, A . et 211. ( 1 995). C'UI\~RRE c o r p ~ u lil~giiislico clel espclfiol corl/c~~lporÚ~~eo. F~ll~dunlcn/os. ~~~eioclologiri 1, ( i~~ l i cc~c io~~es . Madrid: SGEI-.

Siiicl;iir. J. ( 199 1). C'OI~ILI .~ . C'o~~corclc~~ice riiltl C'oIIo~c~/ior~. Osford: Oxtord Uiiiversity Press

S1eveiis. V . ( 1005). Coiicoi-tl;iiiciiig wi i l i Laiigiiage Leariiers: Wliy? Wlieii? WIiat?.(;JELLJoz~rnd, 6(2), 2- 1 0.

Tribble. C. (1007). Iiiiprovisiiig Corpora for ELT: Quick aiid Dir ty Ways o f Developiiig Corpora for 12¿iiigii;ige Teacli iiig. I i i B. Lew;iiidowska-Toiii:~szczy k aiid J. Melia (Eds.), Procceclirlgs :~.?f'//ie F i r . ~ I I I I ~ ~ I I ( I / ~ O I I ~ ~ ~ C'ollfkrell~.c 0 1 1 Pre~clicc~I A / I / ) ~ ~ C U I ~ O I I S ~ I I L U ~ I ~ L I L I ~ C C:~rl>orc~. Lodz: Lodz Uiii\:crsity Press. 106-1 17.

Tribble. C. aiid G. Joiies ( 1090). C O ~ ~ c o ~ z l c ~ ~ i c c . ~ i11 //le ( % I \ S I ~ ~ ~ I I I . London: Loiigiiiaii.

Widdowsoii. I I.R. (1983). New Stkirts eiid Di I l t re i i t Kii ids o f Failure. In D . Larseii-Freeiiiaii et al. (Eds.), Lerll.ilirlg /o Ff'rile: Fir.\/ L(l~ig~~r~ge/Sec.olrtl I2ell~g~lclge, New York, I,oiigman, 34-47.

8 Servicio <le I'iiblic~cioiics. Uiiiversidad de Miircia. All rights rescrvcd. IJES, vol. 2 ( l), 2002, pp. 129-165