In pursuit of the ‘third code’: Using the ZJU Corpus of Translational Chinese (ZCTC) in Translation Studies. Richard Xiao, Lianzhen He, Ming Yue

  • 07/09/2015, UCCTS 2008 - Hangzhou. CBTS: A new paradigm. Laviosa (1998a): the corpus-based approach is evolving, through theoretical elaboration and empirical realisation, into a coherent, composite and rich paradigm that addresses a variety of issues pertaining to theory, description, and the practice of translation. Hypotheses about translation universals can be tested by corpus data (Baker 1993, 1995); rapid development of corpus linguistics, especially multilingual corpus research, in the early 1990s; increasing interest in Descriptive Translation Studies (Toury 1995). Tymoczko (1998): Corpus Translation Studies is central to the way that Translation Studies as a discipline will remain vital and move forward. See Meta 43/4 (1998); Kenny (2001); Laviosa (2002); Granger et al. (eds.) (2003); Olohan (2004); Mauranen et al. (eds.) (2004); Kruger (ed.) (forthcoming).
  • TU: A focus of CBTS. An important area of corpus-based TS over the past decade: Baker (1993, 1996); Chesterman (2004); Kenny (1998, 1999, 2000, 2001); Laviosa (1998b); Mauranen & Kujamäki (2004); McEnery & Xiao (2002, 2007); Olohan (2004); Olohan & Baker (2000); Øverås (1998); Pym (2005); Xiao & Yue (2008). The Translational English Corpus (TEC): manual, software.
  • Features of translated English. Laviosa (1998b): four core patterns of lexical use. (1) A relatively low proportion of lexical words over function words; (2) a relatively high proportion of high-frequency words over low-frequency words; (3) a relatively great repetition of the most frequent words; (4) less variety in the most frequently used words.
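The first three of these patterns can be approximated from a plain token list. The following is a minimal sketch, not Laviosa's actual procedure: the function name `lexical_profile`, the toy function-word set, and the size of the "list head" (`top_n`) are all assumptions made for illustration. The fourth pattern needs lemma information and is left out.

```python
from collections import Counter


def lexical_profile(tokens, function_words, top_n=3):
    """Rough indicators for three of the four core patterns.

    tokens         -- list of word tokens (already segmented)
    function_words -- set of words treated as function words (an assumption)
    top_n          -- size of the high-frequency "list head" (an assumption)
    """
    if not tokens:
        raise ValueError("tokens must not be empty")
    counts = Counter(tokens)
    total = len(tokens)
    n_function = sum(1 for t in tokens if t in function_words)
    n_lexical = total - n_function
    head = counts.most_common(top_n)
    head_tokens = sum(freq for _, freq in head)
    return {
        # Pattern 1: lexical words relative to function words.
        "lexical_to_function": n_lexical / n_function if n_function else float("inf"),
        # Pattern 2: share of running words covered by the most frequent types.
        "head_coverage": head_tokens / total,
        # Pattern 3: mean number of occurrences of each list-head type.
        "head_repetition": head_tokens / len(head),
    }


# Toy English example; a real study would use a full corpus and word list.
tokens = "the cat sat on the mat and the dog sat on the rug".split()
FUNCTION_WORDS = {"the", "on", "and"}
profile = lexical_profile(tokens, FUNCTION_WORDS, top_n=3)
print(profile)
```

Comparing such figures between a translational corpus and a comparable native corpus is what makes them meaningful; a single value on its own says little.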
  • Features of translated English, beyond the lexical level. Simplification (simpler than native language lexically, syntactically and stylistically): the tendency to simplify the language used in translation (Baker 1996: 181-182). Normalization (more normal than the target native language): the tendency to exaggerate features of the target language and to conform to its typical patterns (Baker 1996: 183). Explicitation (more frequent use of conjunctions, increased cohesion in translated text): the tendency in translations to spell things out rather than leave them implicit (Baker 1996: 180). Sanitization (reduced connotational meaning): translated texts are somewhat sanitized versions of the original (Kenny 1998: 515).
  • TU: A target of debate. Is translational language different from target native language? Translational language is at best an unrepresentative special variant of the target language, because translations cannot possibly avoid the effect of translationese (e.g. Baker 1993; Gellerstam 1996; Hartmann 1985; Laviosa 1997; McEnery & Wilson 2001; McEnery & Xiao 2002, 2007; Teubert 1996).
  • TU: A target of debate. Are the features uncovered on the basis of translational English generalizable to other translated languages? Existing evidence has largely come from translational English and related European languages. If such features are to be generalized as translation universals, the language pairs involved must not be restricted to English and closely related languages. Cheong's (2006) study of English-Korean translation contradicts even the least controversial explicitation hypothesis. Evidence from genetically distinct language pairs such as English and Chinese is undoubtedly more convincing, if not indispensable.
  • The ZCTC corpus. Created with the explicit aim of studying the features of translated Chinese. A translational counterpart of the Lancaster Corpus of Mandarin Chinese (LCMC), a one-million-word balanced corpus of native Chinese (McEnery & Xiao 2004). Five hundred 2,000-word text samples, taken proportionally from fifteen written text categories published in China in the 1990s.
  • LCMC / ZCTC corpus design
  • ZCTC vs. LCMC
  • Corpus markup and annotation. CES-compliant XML (CES: Corpus Encoding Standard). Tokenization and POS tagging with ICTCLAS2008: a precision rate of 98.54% for tokenization. Units marked up: paragraph, sentence, word token. Encoded in Unicode (UTF-8).
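As an illustration of how paragraph, sentence and word-token markup of this kind might be read, here is a minimal sketch using Python's standard XML parser. The element and attribute names (`text`, `p`, `s`, `w`, `POS`) and the sample content are assumptions for the example, not a statement of the actual ZCTC schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: one paragraph, two sentences, POS-tagged word tokens.
SAMPLE = (
    '<text ID="A"><p>'
    '<s n="1"><w POS="n">语料库</w><w POS="v">研究</w><w POS="n">翻译</w></s>'
    '<s n="2"><w POS="r">我们</w><w POS="v">比较</w><w POS="n">文本</w>'
    '<w POS="n">特征</w></s>'
    '</p></text>'
)


def sentence_lengths(xml_string):
    """Return the number of word tokens in each <s> element."""
    root = ET.fromstring(xml_string)
    return [len(s.findall("w")) for s in root.iter("s")]


def pos_tagged_tokens(xml_string):
    """Return (word, POS tag) pairs in document order."""
    root = ET.fromstring(xml_string)
    return [(w.text, w.get("POS")) for w in root.iter("w")]


lengths = sentence_lengths(SAMPLE)
mean_sentence_length = sum(lengths) / len(lengths)
tagged = pos_tagged_tokens(SAMPLE)
print(lengths, mean_sentence_length, tagged[0])
```

With the tokens and tags extracted this way, measures such as mean sentence length fall out directly from the markup rather than from re-segmenting raw text.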
  • Core patterns of lexical use. Do the core patterns of lexical use that Laviosa (1998b) observes in translational English also apply in translated Chinese? Same criteria and parameters as in Laviosa (1998b): lexical density; frequency profiles; mean sentence length.
  • Lexical density. The Stubbs-style lexical density: the ratio between the number of lexical words (i.e. content words) and the total number of words (Stubbs 1986: 33; 1996: 172); a measure of informational load; adopted in Laviosa (1998b). Lexical density measured by TTR or Standardized TTR (Scott 2004): a measure of lexical variability; commonly used in Corpus Linguistics.
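The two measures can be sketched as follows. This is an illustrative approximation under stated assumptions: the set of POS tags counted as lexical (`LEXICAL_TAGS`) is hypothetical, and the standardized TTR here averages the type-token ratio over consecutive fixed-size chunks and ignores any incomplete final chunk, which is how the measure is commonly described; the chunk size is a parameter, not a value taken from the slides.

```python
def lexical_density(tagged_tokens, lexical_tags):
    """Stubbs-style lexical density: lexical words / all words, as a percentage."""
    if not tagged_tokens:
        raise ValueError("tagged_tokens must not be empty")
    n_lexical = sum(1 for _, tag in tagged_tokens if tag in lexical_tags)
    return 100.0 * n_lexical / len(tagged_tokens)


def standardized_ttr(tokens, chunk_size=1000):
    """Mean type-token ratio (percentage) over consecutive full chunks."""
    ratios = []
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(100.0 * len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


# Hypothetical tag set: nouns, verbs, adjectives and adverbs count as lexical.
LEXICAL_TAGS = {"n", "v", "a", "d"}
sample = [("we", "r"), ("compare", "v"), ("the", "u"), ("texts", "n")]
ld = lexical_density(sample, LEXICAL_TAGS)

# Tiny chunk size so the toy example has two full chunks; "e" is left over.
sttr = standardized_ttr("a b c d a a b b e".split(), chunk_size=4)
print(ld, sttr)
```

Lexical density depends on which word classes are treated as content words, so the tag set should be fixed before comparing corpora; standardized TTR, unlike plain TTR, stays comparable across texts of different lengths.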
  • Stubbs-style lexical density. Mean LD: LCMC (66.93%) vs. ZCTC (61.59%); the mean difference is statistically significant (t = -4.94, p