Using corpora in contrastive and translation studies Corpus Linguistics Richard Xiao

Slide 1

Using corpora in contrastive and translation studies
Corpus Linguistics
Richard Xiao

Slide 2
Aims of this session
- Lecture
  - Corpora in contrastive and translation studies
  - Use of comparable and parallel corpora
  - Case study: Translation universals, do they really exist?
- Lab session
  - CUC ParaConc and the Babel parallel corpus
- Closing
  - Shedding of valedictory tears

Slide 3
Types of corpora: Some distinctions
- Monolingual versus multilingual corpora
- Parallel versus comparable corpora
- Comparable versus comparative corpora

Slide 4
Monolingual versus multilingual corpora
- Monolingual corpora
  - A corpus that involves only one language
- Multilingual corpora
  - A corpus that involves texts in more than one language
  - A corpus covering two languages is conventionally known as bilingual
  - Multilingual corpora, in a narrow sense, must involve more than two languages
  - Multilingual and bilingual are often used interchangeably
  - Parallel and comparable corpora

Slide 5
Parallel versus comparable corpora
- Terminological confusion centres around the terms
- For some scholars (e.g. Aijmer and Altenberg 1996; Granger 1996: 38)
  - Corpora composed of source texts in one language and their translations in another language (or other languages) are translation corpora, while those comprising different components sampled from different native languages using comparable sampling techniques are called parallel corpora
- For many others (e.g. Baker 1993: 248, 1995, 1999; Barlow 1995, 2000: 110; Hunston 2002: 15; McEnery and Wilson 1996: 57; McEnery, Xiao and Tono 2006)
  - Corpora of the first type are labelled parallel corpora, while those of the latter type are comparable corpora

Slide 6
Parallel versus comparable corpora
- In classifying corpora, the criteria used must be consistent and logical
- We can say a corpus is a translation or a non-translation corpus if the criterion of corpus content is used
- But if we choose to define corpus types by the criterion of corpus form, we must use that criterion consistently
- We can say a corpus is parallel if it contains source texts and translations in parallel, or comparable if its components or subcorpora are made comparable by applying the same sampling techniques and similar balance and coverage
- It is simply inconsistent and illogical to refer to corpora of the first type as translation corpora by the criterion of content while referring to corpora of the latter type as parallel corpora by the criterion of form!

Slide 7
Multilingual vs. monolingual comparable corpora
- A common practice in TS is to compare a corpus of translated texts (a translational corpus) with a corpus of comparably sampled non-translated texts in the same language
- The two sub-corpora form a monolingual comparable corpus for translation research, as opposed to a multilingual comparable corpus composed of comparable texts in different languages for cross-linguistic contrast

Slide 8
Comparative corpora
- Corpora containing different regional varieties of the same language are not comparable corpora
  - E.g. the International Corpus of English (ICE), the Brown family of corpora
- All corpora, as a resource for linguistic research, have always been pre-eminently suited for comparative studies (Aarts 1998: ix), either intralingually or interlingually
- Corpora of this kind are comparative corpora

Slide 9
Use of parallel & comparable corpora
Parallel and comparable corpora offer specific uses and possibilities for contrastive and translation studies (Aijmer & Altenberg 1996: 12)
- they give new insights into the languages compared, insights that are not likely to be gained from the study of monolingual corpora;
- they can be used for a range of comparative purposes and increase our knowledge of language-specific, typological and cultural differences, as well as of universal features;
- they illuminate differences between source texts and translations, and between native and non-native texts;
- they can be used for a number of practical applications, e.g. in lexicography, language teaching and translation.

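As a rough illustration of how a parallel corpus (source texts aligned with their translations) can be queried, the sketch below searches the source side for a keyword and returns each hit together with its aligned translation. It is a minimal sketch under stated assumptions, not how any particular concordancer works: the toy sentence pairs, the name `ALIGNED_PAIRS` and the function `parallel_concordance` are all hypothetical, and sentence-level alignment is assumed to have been done already.

```python
import re
from typing import List, Tuple

# Hypothetical toy data: each pair is (source sentence, its translation).
# A real parallel corpus would be sentence-aligned beforehand.
ALIGNED_PAIRS: List[Tuple[str, str]] = [
    ("The corpus contains source texts.", "Le corpus contient des textes sources."),
    ("The translator reads the text.", "Le traducteur lit le texte."),
    ("This corpus is bilingual.", "Ce corpus est bilingue."),
]


def parallel_concordance(
    pairs: List[Tuple[str, str]], keyword: str
) -> List[Tuple[str, str]]:
    """Return every aligned pair whose source side contains `keyword`
    as a whole word (case-insensitive)."""
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    return [(src, tgt) for src, tgt in pairs if pattern.search(src)]


if __name__ == "__main__":
    # Print each matching source sentence next to its translation.
    for src, tgt in parallel_concordance(ALIGNED_PAIRS, "corpus"):
        print(src, "|||", tgt)
```

Searching only the source side keeps the sketch short; a fuller tool would presumably also allow searches on the target side and show the keyword in context.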
Slide 10
Use of parallel & comparable corpora
- Used primarily for translation and contrastive studies
- The two types of corpora have their own characteristics and serve different purposes
- Parallel corpora are useful in translation studies, but on their own they are a poor basis for cross-linguistic contrast, because translations cannot avoid the effect of translationese
- Comparable corpora are well suited to contrastive research, but are less useful in translation studies

Slide 11
Using corpora in translation studies
- Parallel corpora
  - Useful in exploring how an idea in one language is conveyed in another language, thus providing indirect evidence for the study of translation processes
  - Indispensable for building statistical or example-based machine translation (EBMT) systems, and for the development of bilingual lexicons and translation memories
  - Parallel concordancing is a useful tool for translators
- Comparable corpora
  - Useful in improving the translator's understanding of the subject field and the quality of translation in terms of fluency, correct term choice and idiomatic expressions in the chosen subject field
  - Can also be used to build terminology banks

Slide 12
Using corpora in translation studies
- Translational corpora
  - Provide primary evidence in product-oriented Translation Studies and in studies of translation universals
  - If corpora of this kind are encoded with sociolinguistic and cultural parameters, they can also be used to study the sociocultural environment of translations (e.g. functions of translation in DTS)
- Monolingual corpora (source / target language)
  - Raise the translator's linguistic and cultural awareness in general
  - Provide a useful and effective reference tool for translators
  - Can be combined with a parallel corpus to form a so-called translation evaluation corpus that helps translator trainers or critics evaluate translations more effectively and objectively

Slide 13
Corpus-based translation studies
- Laviosa (1998a): the corpus-based approach is evolving, through theoretical elaboration and empirical realisation, into a coherent, composite and rich paradigm that addresses a variety of issues pertaining to theory, description, and the practice of translation.
  - Hypotheses about translation universals can be tested with corpus data (Baker 1993, 1995)
  - Rapid development of corpus linguistics, esp. multilingual corpus research, in the early 1990s
  - Increasing interest in Descriptive Translation Studies (Toury 1995)
- Tymoczko (1998): Corpus Translation Studies is central to the way that Translation Studies as a discipline will remain vital and move forward.
- Meta 43/4 (1998); Kenny (2001); Laviosa (2002); Granger et al. (eds.) (2003); Olohan (2004); Mauranen et al. (eds.) (2004); Kruger & Munday (eds.) (2011); Hu (2011); Wang (2011); Xiao (2012)

Slide 14
The Holmes-Toury map
- Applied Translation Studies
- Descriptive Translation Studies
- Theoretical Translation Studies

Slide 15
Applied Translation Studies
Three major contributions of corpora
- Corpus-assisted translating
  - Bowker (1998: 631): corpus-assisted translations are of a higher quality with respect to subject field understanding, correct term choice and idiomatic expressions.
- Corpus-aided translation teaching and training
  - Bernardini (1997): large corpora concordancing (LCC) can help students to develop awareness, reflectiveness and resourcefulness, which are said to be the skills that distinguish a translator from unskilled amateurs
- Development of translation tools
  - Corpora, and especially aligned parallel corpora, are essential for the development of translation technology such as machine translation (MT) systems and computer-aided translation (CAT) tools

Slide 16
Descriptive Translation Studies
- Characterized by its emphasis on the study of translation per se
- Seeks to answer why a translator translates in a particular way, rather than how to translate
- Baker (1993) predicted that the availability of large corpora of both source and translated texts, together with the development of the corpus-based approach, would enable translation scholars to uncover the nature of translation as a mediated communicative event

Slide 17
Descriptive Translation Studies
Three focuses (Holmes 1972/1988)
- Translation as a product
  - Concerned with describing translation as a product by comparing corpora of translated and non-translated native texts in the target language
  - Attempts to uncover evidence to support or reject the so-called translation universal hypotheses
- Translation as a process
  - Aims at revealing the thought processes that take place in the mind of the translator while she or he is translating
  - One possible approach for corpus-based DTS is to investigate, off-line, written transcripts of recordings in which translators verbalise their thoughts while translating, known as Think-Aloud Protocols (TAPs)
  - Translation as product provides indirect evidence for translation as process
- The function of translation
  - The study of contexts rather than texts: the function or impact of a translation
  - Relatively few function-oriented studies are corpus-based

Slide 18
Theoretical Translation Studies
- Aims to establish general principles by means of which these phenomena can be explained and predicted (Holmes 1988: 71)
- Closely related to, and often reliant on, the empirical findings produced by Descriptive Translation Studies
- One good battleground for using DTS findings to pursue a general theory of translation is the hypothesis of so-called translation universals (TUs) and its related sub-hypotheses
  - Sometimes referred to as the inherent features of translational language, or translationese

Slide 19
TU: A focus of CBTS
- An important area of corpus-based TS over the past decade
- Baker (1993, 1996); Chesterman (2004); Kenny (1998, 1999, 2000, 2001); Laviosa (1998b); Mauranen & Kujamäki (2004); McEnery &