In pursuit of the ‘third code’ Using the ZJU Corpus of Translational Chinese (ZCTC) in Translation Studies Richard Xiao Lianzhen He Ming Yue


In pursuit of the ‘third code’

Using the ZJU Corpus of Translational Chinese (ZCTC)

in Translation Studies

Richard Xiao

Lianzhen He

Ming Yue


04/19/23 UCCTS 2008 - Hangzhou 2

CBTS: A new paradigm

• Laviosa (1998a)
  – “the corpus-based approach is evolving, through theoretical elaboration and empirical realisation, into a coherent, composite and rich paradigm that addresses a variety of issues pertaining to theory, description, and the practice of translation.”

• Hypotheses that translation universals can be tested by corpus data (Baker 1993, 1995)

• Rapid development of corpus linguistics, esp. multilingual corpus research in the early 1990s

• Increasing interest in Descriptive Translation Studies (Toury 1995)

• Tymoczko (1998)
  – “Corpus Translation Studies is central to the way that Translation Studies as a discipline will remain vital and move forward.”

• Meta 43/4 (1998); Kenny (2001); Laviosa (2002); Granger et al. (eds.) (2003); Olohan (2004); Mauranen et al. (eds.) (2004); Kruger (ed.) (forthcoming)


TU: A focus of CBTS

• An important area of corpus-based TS over the past decade
  – Baker (1993, 1996); Chesterman (2004); Kenny (1998, 1999, 2000, 2001); Laviosa (1998b); Mauranen & Kujamäki (2004); McEnery & Xiao (2002, 2007); Olohan (2004); Olohan & Baker (2000); Øverås (1998); Pym (2005); Xiao and Yue (2008)

• The Translational English Corpus (TEC)
  – Manual: http://www.llc.manchester.ac.uk/ctis/research/english-corpus/
  – Software: http://ronaldo.cs.tcd.ie/tec/jnlp/


Features of translated English

• Laviosa (1998b): Four core patterns of lexical use
  – A relatively low proportion of lexical words over function words
  – A relatively high proportion of high-frequency words over low-frequency words
  – A relatively great repetition of the most frequent words
  – Less variety in the most frequently used words


Features of translated English

• Beyond the lexical level
  – Simplification: simpler than native language lexically / syntactically / stylistically
    • “the tendency to simplify the language used in translation” (Baker 1996: 181-182)
  – Normalization: more “normal” than the target native language
    • the “tendency to exaggerate features of the target language and to conform to its typical patterns” (Baker 1996: 183)
  – Explicitation: more frequent use of conjunctions, increased cohesion in translated text
    • the tendency in translations to “spell things out rather than leave them implicit” (Baker 1996: 180)
  – Sanitization: reduced connotational meaning
    • translated texts are “somewhat ‘sanitized’ versions of the original” (Kenny 1998: 515)


TU: A target of debate

• Is translational language different from target native language?
  – Translational language is at best an unrepresentative special variant of the target language because translations cannot possibly avoid the effect of translationese
    • e.g. Baker 1993; Gellerstam 1996; Hartmann 1985; Laviosa 1997; McEnery & Wilson 2001; McEnery & Xiao 2002, 2007; Teubert 1996


TU: A target of debate

• Are the features uncovered on the basis of translational English generalizable to other translated languages?
  – Existing evidence has largely come from translational English and related European languages
  – If such features are to be generalized as “translation universals”, the language pairs involved must not be restricted to English and closely related languages
    • Cheong’s (2006) study of English-Korean translation contradicts even the least controversial explicitation hypothesis
  – Evidence from “genetically” distinct language pairs such as English and Chinese is undoubtedly more convincing, if not indispensable


The ZCTC corpus

• Created with the explicit aim of studying the features of translated Chinese

• A translational counterpart of the Lancaster Corpus of Mandarin Chinese (LCMC), a one-million-word balanced corpus of native Chinese (McEnery & Xiao 2004)
  – www.ling.lancs.ac.uk/corplang/lcmc/

• Five hundred 2,000-word text samples taken proportionally from fifteen written text categories published in China in the 1990s
  – www.ling.lancs.ac.uk/corplang/ZCTC/


LCMC / ZCTC corpus design


ZCTC vs. LCMC


Corpus markup and annotation

• CES-compliant XML
  – CES: www.cs.vassar.edu/CES/

• Tokenization and POS tagging
  – ICTCLAS2008: www.ictclas.org
    • A precision rate of 98.54% for tokenization

• Paragraph, sentence, word token

• Encoded in Unicode (UTF-8)
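A minimal sketch of how word tokens and POS tags might be read from markup of this kind. The element and attribute names (`<p>`, `<s>`, `<w>`, `<c>`, `POS`) and the sample text are assumptions for illustration only; the slides do not show the actual ZCTC schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the element and attribute names below are assumed
# for illustration and are not taken from the ZCTC documentation.
SAMPLE = """<text ID="A"><file ID="A01">
<p><s n="0001"><w POS="r">我们</w><w POS="v">研究</w><w POS="n">翻译</w><c POS="w">。</c></s></p>
</file></text>"""


def read_tokens(xml_string):
    """Return one list of (word, pos) pairs per <s> element.

    Only <w> elements are collected, so punctuation stored in other
    elements (here the assumed <c>) is skipped.
    """
    root = ET.fromstring(xml_string)
    sentences = []
    for s in root.iter("s"):
        sentences.append([(w.text, w.get("POS")) for w in s.iter("w")])
    return sentences


sentences = read_tokens(SAMPLE)
for sentence in sentences:
    print(sentence)
```

Keeping sentences as lists of (word, tag) pairs makes the later measures (lexical density, sentence length, conjunction counts) simple list operations.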


Core patterns of lexical use

• Do the core patterns of lexical use Laviosa (1998b) observes in translational English also apply in translated Chinese?

• Same criteria and parameters as in Laviosa (1998b)
  – Lexical density
  – Frequency profiles
  – Mean sentence length


Lexical density

• The Stubbs-style lexical density: the ratio between the number of lexical words (i.e. content words) and the total number of words (Stubbs 1986: 33; 1996: 172)
  – Measure of informational load
  – Adopted in Laviosa (1998b)

• Lexical density measured by TTR or Standardized TTR (Scott 2004)
  – Measure of lexical variability
  – Commonly used in Corpus Linguistics
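The two measures above can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the set of POS-tag prefixes counted as lexical words is a guess, and the 1,000-word chunk size follows the usual WordSmith default for standardized TTR.

```python
# Assumed POS-tag prefixes for lexical (content) words: nouns, verbs,
# adjectives, adverbs. The slides do not list the tags actually used.
LEXICAL_PREFIXES = ("n", "v", "a", "d")


def lexical_density(tagged_tokens):
    """Stubbs-style density: lexical words / all words, as a percentage."""
    lexical = sum(1 for _, pos in tagged_tokens if pos.startswith(LEXICAL_PREFIXES))
    return 100.0 * lexical / len(tagged_tokens)


def standardized_ttr(words, chunk_size=1000):
    """Mean type-token ratio (%) over consecutive full chunks of chunk_size words.

    A trailing partial chunk is ignored, so text length does not distort
    the score the way it does for a plain TTR.
    """
    ratios = []
    for start in range(0, len(words) - chunk_size + 1, chunk_size):
        chunk = words[start:start + chunk_size]
        ratios.append(100.0 * len(set(chunk)) / chunk_size)
    if not ratios:
        raise ValueError("text is shorter than one chunk")
    return sum(ratios) / len(ratios)


# Toy example: three of the five tokens carry an (assumed) lexical tag.
tagged = [("我们", "r"), ("研究", "v"), ("翻译", "n"), ("的", "u"), ("语言", "n")]
print(lexical_density(tagged))
```

The first function measures informational load and the second lexical variability, which is why the two can point in different directions for the same corpus.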


Stubbs-style lexical density

• Mean LD: LCMC (66.93%) vs. ZCTC (61.59%) – the mean difference is statistically significant (t = -4.94, p<0.001)

• All 15 genres have a greater lexical density in native than translated Chinese – significant for all genres barring M

[Figure: mean lexical density (%) by genre (A to R), LCMC vs. ZCTC]


Standardized TTR

• LCMC as a whole has a slightly higher STTR than ZCTC (46.58 vs. 45.73) – not significant

• The differences in most genres are marginal

• Neither corpus scores consistently higher: greater STTR scores are found in native Chinese for some genres and in translated Chinese for others

[Figure: mean standardized TTR by genre (A to R), ZCTC vs. LCMC]


Lexical-function word ratio

• The mean ratio between lexical and function words is significantly greater in native Chinese (2.08) than translated Chinese (1.64) (t = -4.88, p<0.001)

• Native Chinese has a greater ratio in all genres, and the differences are statistically significant for all genres barring M (science fiction)

• In line with Laviosa’s (1998b) initial hypothesis that translational language has a relatively low proportion of lexical words over function words
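As a rough cross-check, the lexical-function word ratio can be related to the Stubbs-style lexical density. The derivation below assumes that every token is counted as either a lexical word or a function word, which the slides do not state.

```latex
% Assumption: every token is either a lexical word (L) or a function word (F).
\[
  \mathrm{LD} = \frac{L}{L+F}, \qquad
  R = \frac{L}{F} = \frac{\mathrm{LD}}{1-\mathrm{LD}}
\]
% Plugging in the mean lexical densities reported for the two corpora:
\[
  R_{\mathrm{LCMC}} \approx \frac{0.6693}{1-0.6693} \approx 2.02, \qquad
  R_{\mathrm{ZCTC}} \approx \frac{0.6159}{1-0.6159} \approx 1.60
\]
```

These values fall slightly below the reported means of 2.08 and 1.64. One possible reason is that a mean of per-text ratios is not the same as a ratio computed from a mean density; this is a guess, since the slides do not say how the means were aggregated.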

[Figure: mean lexical-function word ratio by genre (A to R), ZCTC vs. LCMC]


Frequency profiles

• Laviosa’s (1998b) ‘list head’ or ‘high frequency words’
  – Wordlist items which individually account for at least 0.10% of the total tokens in a corpus

• The same criterion for high frequency words is used in the present study to ensure comparability
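The 0.10% criterion can be sketched as a small frequency-profile function. The exact definitions of "repetition rate" (tokens per high-frequency type) and of the high/low ratio (token counts) are assumptions here; the slides name the measures but do not give formulas.

```python
from collections import Counter


def frequency_profile(words, threshold=0.001):
    """Summarise the 'list head': types that each cover >= threshold of all tokens.

    threshold=0.001 corresponds to the 0.10% criterion.
    """
    counts = Counter(words)
    total = len(words)
    head = {w: c for w, c in counts.items() if c / total >= threshold}
    head_tokens = sum(head.values())
    tail_tokens = total - head_tokens
    return {
        # number of high-frequency word types
        "head_types": len(head),
        # percentage of all tokens covered by high-frequency words
        "head_share": 100.0 * head_tokens / total,
        # assumed definition: tokens per high-frequency type
        "repetition_rate": head_tokens / len(head) if head else 0.0,
        # assumed definition: high-frequency tokens / low-frequency tokens
        "high_low_ratio": head_tokens / tail_tokens if tail_tokens else float("inf"),
    }


# Toy example with an artificially high threshold so the list head is visible.
profile = frequency_profile(["的"] * 6 + ["是"] * 3 + ["翻译"], threshold=0.2)
print(profile)
```

Running the same function over two corpora of similar size gives directly comparable figures for the four rows of a frequency-profile table.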


Frequency profiles

• The numbers of high frequency words are very similar in the two corpora
• High frequency words account for a considerably greater proportion of tokens in the translational corpus
• High frequency words display a much greater repetition rate in translational Chinese
• The ratio between high- and low-frequency words is also greater in the translational corpus


Mean sentence length vs. simplification

• Conflicting observations of mean sentence length as an indicator of simplification (e.g. Laviosa 1998b vs. Malmkjaer 1997)

• In our corpora, native Chinese shows a slightly greater mean sentence length, but the difference is not statistically significant (t = -1.41, p = 0.17)

• Mean sentence length appears to be sensitive to genre rather than a reliable indicator of native versus translational language
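A per-genre breakdown of mean sentence length can be sketched as below. The input format (genre code paired with a list of tokenized sentences) is an assumption for illustration, and sentence length is counted in word tokens.

```python
def mean_sentence_length(sentences):
    """Average number of word tokens per sentence.

    `sentences` is a list of token lists, one per sentence.
    """
    if not sentences:
        raise ValueError("no sentences")
    return sum(len(s) for s in sentences) / len(sentences)


def by_genre(samples):
    """Map genre code -> mean sentence length over all sentences in that genre.

    `samples` is an iterable of (genre_code, sentences) pairs, one per text sample.
    """
    grouped = {}
    for genre, sentences in samples:
        grouped.setdefault(genre, []).extend(sentences)
    return {genre: mean_sentence_length(s) for genre, s in grouped.items()}


# Toy data: two samples in genre A, one in genre K.
samples = [
    ("A", [["w"] * 10, ["w"] * 20]),
    ("K", [["w"] * 5]),
    ("A", [["w"] * 30]),
]
print(by_genre(samples))
```

Comparing the spread across genres with the gap between corpora is one way to see whether genre or translation status accounts for more of the variation.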

[Figure: mean sentence length by genre (A to R), ZCTC vs. LCMC]


Lexical use in translational Chinese

• Summary
  – The core lexical features proposed by Laviosa (1998b) for translational English are essentially also applicable in translated Chinese
  – But mean sentence length is less reliable as an indicator of simplification in translational Chinese


Connectives: Device for explicitation?

• Perhaps the most studied topic in TU research and the least controversial hypothesis

• Chen (2006)
  – Connectives are a device for explicitation in English-Chinese translation of popular science books
• Xiao and Yue (2008)
  – Connectives are significantly more frequent in translated than native Chinese fiction
• Question
  – Can we generalize this finding from specific genres to Mandarin Chinese in general?


Conjunctions in ZCTC and LCMC

• Mean frequency of conjunctions is significantly greater in ZCTC (306.42 instances per 10,000 tokens) than in LCMC (243.23) (LL=723.12 for 1 d.f., p<0.001)

• Genres of imaginative writing (K-P, R) generally demonstrate a significantly more frequent use of conjunctions in translational Chinese

• Of expository writing, while conjunctions are considerably more frequent in most genres in translated Chinese (e.g. A, H), there are also genres in which conjunctions are more common in native Chinese (e.g. F, J)
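The normalized frequencies and the log-likelihood (LL) test can be sketched as follows. The formula is the two-corpus log-likelihood commonly used in corpus linguistics; whether the study used exactly this variant is an assumption, and the counts in the example are hypothetical, not the ZCTC or LCMC figures. For 1 d.f., an LL above 10.83 corresponds to p < 0.001.

```python
import math


def per_10k(count, corpus_size):
    """Normalized frequency: instances per 10,000 tokens."""
    return 10000.0 * count / corpus_size


def log_likelihood(count_a, size_a, count_b, size_b):
    """Two-corpus log-likelihood for one item (1 degree of freedom).

    Expected counts come from the pooled relative frequency of the item.
    """
    pooled = (count_a + count_b) / (size_a + size_b)
    expected_a = size_a * pooled
    expected_b = size_b * pooled
    ll = 0.0
    for observed, expected in ((count_a, expected_a), (count_b, expected_b)):
        if observed > 0:  # a zero count contributes nothing to the sum
            ll += observed * math.log(observed / expected)
    return 2.0 * ll


# Hypothetical counts for illustration only.
count_trans, size_trans = 3064, 100000
count_native, size_native = 2432, 100000
print(per_10k(count_trans, size_trans), per_10k(count_native, size_native))
print(log_likelihood(count_trans, size_trans, count_native, size_native) > 10.83)
```

The test works on raw counts and corpus sizes rather than on the normalized figures, so both need to be kept when extracting conjunctions from the tagged corpora.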

[Figure: mean frequency of conjunctions per 10,000 words by genre (A to R), ZCTC vs. LCMC]


Conjunctions of different frequency bands

• More types of conjunctions of high frequency bands (0.10%, 0.05%, and 0.01%) are used in the translational corpus

• There are an equal number of conjunctions (56 types) with a proportion greater than 0.005% in the translational and native corpora

• After this balance point, the native corpus displays a greater number of less frequent conjunctions (a usage band of 0.001% and below)

• The tendency to use conjunctions more frequently can be taken as a sign of explicitation
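Counting conjunction types per usage band can be sketched as below. The bands are read as cumulative ("0.1% and above", "0.05% and above", and so on), which is an interpretation of the chart labels; the conjunctions and counts in the example are hypothetical.

```python
# Usage bands as percentages of all tokens in the corpus.
BANDS = [0.1, 0.05, 0.01, 0.005, 0.001, 0.0005, 0.0001]


def types_per_band(conjunction_counts, corpus_size, bands=BANDS):
    """For each band, count conjunction types whose share of tokens is >= the band.

    `conjunction_counts` maps each conjunction to its raw frequency.
    """
    result = {}
    for band in bands:
        result[band] = sum(
            1
            for count in conjunction_counts.values()
            if 100.0 * count / corpus_size >= band
        )
    return result


# Hypothetical counts in a one-million-token corpus.
counts = {"和": 1500, "但是": 600, "然而": 40, "倘若": 2}
print(types_per_band(counts, 1_000_000))
```

Plotting the two corpora's band counts side by side shows where one overtakes the other, which is the "balance point" described above.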

[Figure: frequency of conjunction types by usage band (0.1%+, 0.05%+, 0.01%+, 0.005%+, 0.001%+, 0.0005%+, 0.0001%+, All), ZCTC vs. LCMC]


Conjunctions of different styles

• A closer comparison of the lists of frequent conjunctions (a proportion of at least 0.001%) in their respective corpora also sheds some new light on the simplification hypothesis
  – There are 91 and 99 types of frequent conjunctions in the two corpora; 86 items overlap in the two lists, leaving 5 and 13 items unique to the respective lists
  – Conjunctions on the ZCTC but not LCMC list are all informal, colloquial, and simple, and usually have more formal alternatives
  – Conjunctions on the LCMC but not ZCTC list are typically formal and archaic

• Evidence for the simplification hypothesis but against the normalization hypothesis


Conclusions

• Laviosa’s (1998b) observations of the core patterns of lexical features of translational English are also applicable in translated Chinese

• Beyond the lexical level
  – Mean sentence length is sensitive to genres and may not be a reliable indicator of simplification
  – A comparison of frequent connectives in native and translational Chinese appears to suggest that simpler forms tend to be used in translations
  – In spite of some genre-based subtleties, translational Chinese uses conjunctions more frequently than native Chinese, which provides evidence in favour of the explicitation hypothesis


Conclusions

• We believe that the newly created ZCTC will play a leading role in the study of translational Chinese by producing more empirical evidence

• It is our hope that the study of translational Chinese will help to address limitations of imbalance in the current state of translation universal research


Thank you!

[email protected]@lancaster.ac.uk