Human Translation & Translation Workflow Prof. Gloria Corpas Pastor Dr. Jorge Leiva Rojo Dr. Míriam Seghiri Domínguez Universidad de Málaga Birmingham, 13th November 2013


Page 1: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Human Translation & Translation Workflow

Prof. Gloria Corpas Pastor Dr. Jorge Leiva Rojo

Dr. Míriam Seghiri Domínguez

Universidad de Málaga Birmingham, 13th November 2013

Page 2: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Human Translation Workflow I: Overview

Page 3: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• HTW I: Overview (Prof. Corpas)
• HTW II: Professional Translation (Dr. Leiva)
• HTW III: Corpus-based translation (Dr. Seghiri)

Page 4: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow
Page 5: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

MAIN TRAINING EVENTS AND CONFERENCES (WP7)
• Scientific and technological training
• Complementary skills training
• Scientific and technological workshop
• Business showcases

Page 6: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

TUTORIAL ON HUMAN TRANSLATION AND TRANSLATION WORKFLOW
• Relevant to all research sub-programmes (* WP1 & WP5)
• Introduce the most common translation workflow to researchers
  • to learn how translators currently work
  • to design new translation technologies
  • to cover confidence and quality estimation in HTWs

Page 7: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

LIST OF CONTENTS
• Market studies (e.g. industry, quality, technology, language service providers)
• The translation workflow (e.g. certification, project management, agents, emerging trends)
• Training translators using corpora (compilation protocol, analysis, translation strategies, etc.)

Page 8: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Human Translation Workflow II: Professional Translation

Page 9: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Table of contents

1. Introduction. Market studies
2. The translation workflow
3. Emerging trends

Page 10: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

(Trusted Translations)

1. Introduction. Market studies

Page 11: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

• 25,000 companies in the world (Translation Bureau, 2012).
• 1,500 translation companies in Europe; average turnover of 300,000 € in 2005 (EUATC, 2005).
• Assumed value of the translation and interpreting sector (plus software and website localization): 5.7 billion € in 2008; an estimated 9.1 billion € in 2013 (European Commission, 2009).
• Highest growth rate of all industries in Europe.
• World-wide annual growth: 5.13% (DePalma et al., 2013).
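As a rough cross-check of the sector figures above, the yearly growth rate implied by the 2008 value and the 2013 estimate can be worked out. This is only a sketch: the function name is hypothetical, and it assumes smooth compound growth between the two years, which the source does not claim.

```python
def compound_annual_growth_rate(start_value: float, end_value: float, years: int) -> float:
    """Return the constant yearly rate that turns start_value into end_value."""
    if start_value <= 0 or years <= 0:
        raise ValueError("start_value and years must be positive")
    return (end_value / start_value) ** (1 / years) - 1


# Figures from the slide (European Commission, 2009), in billion euros.
rate = compound_annual_growth_rate(5.7, 9.1, years=2013 - 2008)
print(f"Implied growth 2008-2013: {rate:.1%} per year")
```

The result is not directly comparable with the 5.13% world-wide figure, since the two numbers come from different sources and may cover different markets and periods.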

Page 12: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

• 700 participants (LSPs) (European Commission, 2009): 43% freelancers or sole proprietors; 36% with 1-10 employees; 21% with 10+ employees.
• Big companies grow more quickly than the rest of the language market (Boucau, 2009).
• Supply exceeds demand overall, yet the number of well-qualified linguists is too small to cover the growing demand.

Page 13: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

• Six hyper-languages of the web (English, French, Italian, German, Spanish, Japanese) and Chinese are to undergo major growth (cf. Common Sense Advisory, 2011).
• Prices depend on exchange rates and are not influenced by inflation (cf. Goddard, 2013) → prices relatively stable 2004-2008.
• The market is very competitive: 80% of providers charge less than 0.15 $ per word (Translation Bureau, 2012).

Page 14: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

Average per-word rate for the 30 most commonly used languages on the web fell 34.71%: 0.205 US$ (2010) → 0.134 US$ (2012).

Global supply, advances in technology, economic issues and more aggressive buyers have conspired to drive prices down since 2008; the situation remains unchanged.
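The size of the price fall can be recomputed from the two per-word rates quoted above. The sketch below uses a hypothetical helper name; with the rounded rates shown on the slide it gives roughly 34.6%, slightly below the reported 34.71%, presumably because the original calculation used unrounded rates (an assumption, not something the source states).

```python
def percentage_drop(old_rate: float, new_rate: float) -> float:
    """Return the fall from old_rate to new_rate as a percentage of old_rate."""
    if old_rate <= 0:
        raise ValueError("old_rate must be positive")
    return (old_rate - new_rate) / old_rate * 100


# Rounded per-word rates from the slide, in US dollars.
drop = percentage_drop(0.205, 0.134)
print(f"Fall in average per-word rate: {drop:.2f}%")
```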

Page 15: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

• Domain and technological skills should be better addressed (European Commission, 2009).
• “Use of technology by LSPs is sporadic” (Translation Bureau, 2012).
• Technology requires investment to build and maintain infrastructure, plus a significant repository of data, before a tool becomes effective; this is difficult for small enterprises, which make up the bulk of businesses within the industry.

Page 16: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1. Introduction

• Decrease in resistance to MT (Systran #1; Google #2) (European Commission, 2009).
• MT does not produce a sufficient level of quality, so its output has to be reviewed by qualified translators → MT is not widely adopted (large volume translations) (Translation Bureau, 2012).
• HAMT is growing in usage: a 2009 study indicated that HAMT doubled translation output and was 45% cheaper (Translation Bureau, 2012).

Page 17: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

2. The translation workflow (Project Management Watch)

Page 18: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

“[A translation brief is a] definition of the communicative purpose for which the translation is needed. The ideal brief provides explicit or implicit information about the intended text function(s), the target-text addressee(s), the medium over which it will be transmitted, the prospective place and time and, if necessary, motive of production or reception of the text” (Nord, 1997).

Page 19: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

(Translate Media, 2013)

Page 20: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

2. The translation workflow

- ISO 9001:2008, ISO 17100 (mid 2014 [Rosam, 2013])
- EN 15038:2006 (EU)
- ASTM (USA)
- GB/T 19363 (China)
- CAN/CGSB-131.10 (Canada)

- To define translation’s basic terms and concepts.
- To establish the basics for the client-translation service provider relationship to meet market needs.
- To determine the implementation of the translation process.

Page 21: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

2. The translation workflow

EN 15038 needs amendments. “The standard – although well intended – does neither indicate nor reflect the quality of the output of an LSP. Due to downward pressures and trends in pricing, many translation agencies need to operate with limited budgets in order to stay competitive. As a result, if low cost and low quality translation work is performed, the mere fact that such work is revised does not guarantee high quality” (European Commission, 2009).

Page 22: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

2. The translation workflow

(DePalma et al., 2013)

Page 23: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

3. Emerging trends

(fiverr)

Page 24: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

3. Emerging trends

(Lynch, 2012)

Page 25: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

3. Emerging trends

CAT tool suppliers to deal with newer media and new crowd-based supply chains (DePalma et al., 2013).

Users want different forms of content translated: emails, blogs, tweets.

Slight decrease in turnover due to the economic downturn, small enterprises with turnovers below 50,000 € (European Commission, 2009).

Page 26: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

“Post-Editing is the process by which language professionals edit machine translation outputs to create human-quality translations” (Marcu, 2013).

Page 27: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

3. Emerging trends

Crowd-sourced translations

(Muntés Molero et al., 2012)

Page 28: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Human Translation Workflow III: Corpus-based translation

Page 29: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Table of contents

1. Introduction
2. Corpora in Translation Training
3. Guidelines for Corpus Creation
   3.1. Design Criteria
   3.2. Compilation Protocol
4. Using Corpora to Translate
5. Using the corpus to translate
6. Corollary

Page 30: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Introduction

Page 31: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• The inclusion of documentation as a core subject in the curriculum of Translation and Interpretation degrees clearly underlines its importance to translators.

• Training in this discipline is considered essential for a translator, given that only sufficient and conscientious work on documentation will allow an adequate translation of a specialised text.

Page 32: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• The sources of information that may be utilised by the translator are extremely varied, ranging from an oral consultation with an expert to a search using specialised glossaries and dictionaries.

• However, in the field of translation perhaps the most relevant documentation activity today involves the use of the Internet and, closely related to this, the compilation and management of virtual corpora.

Page 33: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• Here, we shall present a systematic methodology for corpus compilation based on electronic resources available on the Internet.

• The methodology will be illustrated through the example of the creation of a virtual corpus of Telecommunications made up of:
  • 1 subcorpus in English
  • 1 subcorpus in Spanish

Page 34: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

CORPUS OF TELECOMMUNICATIONS
• English subcorpus
• Spanish subcorpus

Page 35: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Telecommunications, why?

“Telecommunication is now the world’s largest industry [and] the world’s fastest-changing industry from any measure of change you can name: technology, players, applications and users. In one decade, this industry is going from totally-closed, government-controlled, highly regulated, monopolistic, bureaucratic, plodding thing to an exploding free-for-all” (Newton, 1994: 1).

Page 36: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Corpora in Translation Training

Page 37: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

What is a corpus?

• corpus, pl. corpora, from the Latin word corpus, i.e. “body”

A collection of texts assumed to be representative of a given language, dialect, or other subset of a language, to be used for linguistic analysis (Francis, 1982).

Page 38: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Characteristics of corpora

• collections of text
• naturally-occurring / authentic text
• representative of a given language
• collected according to specific criteria
• stored in machine-readable format
• used for linguistic analysis

Page 39: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Different types of corpora

According to which criteria can corpora be distinguished/classified?
• language
• size
• purpose
•

Page 40: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• The advantages of using corpora in translation have been shown by various studies (cf. Laviosa, 1998; Bowker, 2002; Bowker and Pearson, 2002; Zanettin et al., 2003).

• Advantages: their objectivity, their reusability and multiple usage. They are user-friendly and allow access to and management of huge quantities of information in almost no time.

Page 41: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• Translators turn to the Internet in search of solutions to information and documentation problems because they are not only translating between languages but also between discourse communities and cultures.

• The compilation of corpora and the Internet appear to be two of the most important documentation resources in the practice and research of specialised translation.

• Corpora for a particular speciality are not available for consultation on the Internet.

• Translators have no alternative other than to compile their own virtual corpora for the specific translation that has been commissioned in each case.

Page 42: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

In order for a collection of texts to be considered a corpus in the strict sense of the term, it must meet:

• a set of clear design criteria and
• a specific compilation protocol

so that the collection may be deemed representative of the field of specialisation or the particular type of document that is being translated.

Page 43: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Guidelines for Corpus Creation

Page 44: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Professional Competences
- Translating
- Linguistic and textual
- Research, information acquisition & processing
- Cultural
- Technical

The knowledge of how to compile and use corpora is an essential part of modern translational competence (Varantola, 2003).

Page 45: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1) Design Criteria

Page 46: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

The extract comes from a brochure from the company DVEO:

<http://www.dveo.com/broadcast-systems/TDMB-and-DAB-modulator.shtml>.

Page 47: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• The objective is to create a specialized corpus on Telecommunications in English and Spanish compiled exclusively from resources available on the Internet.
• It is restricted to texts that have been drawn up in the UK and Spain.
• It will include original documents (comparable corpus) that are complete and documented.

Page 48: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

CORPUS DESIGN

• Text type: brochures, research articles,
• Language/s: English (subcorpus 1) & Spanish (subcorpus 2)
• Diatopic restrictions: United Kingdom & Spain
• Original or translations: Comparable (original)
• Complete text or partial: complete
• Documented: yes
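The design criteria above can be read as a filter that every candidate text has to pass before it enters the corpus. The following is a minimal sketch of that idea, not the authors' tooling: the class, field names and sample records are all hypothetical, and the set of text types may be incomplete because the list on the slide breaks off.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TextRecord:
    """Metadata for one candidate text (all field names are illustrative)."""
    title: str
    text_type: str     # e.g. "brochure" or "research article"
    language: str      # "en" or "es"
    country: str       # where the text was drawn up
    is_original: bool  # False for translations
    is_complete: bool  # False for extracts or partial texts
    source_url: str    # empty string when the source is not documented


ALLOWED_TEXT_TYPES = {"brochure", "research article"}
ALLOWED_VARIETIES = {("en", "UK"), ("es", "Spain")}


def meets_design_criteria(record: TextRecord) -> bool:
    """Check one record against the design criteria listed on the slide."""
    return (
        record.text_type in ALLOWED_TEXT_TYPES
        and (record.language, record.country) in ALLOWED_VARIETIES
        and record.is_original
        and record.is_complete
        and bool(record.source_url)
    )


# Made-up candidates: only the first one satisfies every criterion.
candidates = [
    TextRecord("Modulator brochure", "brochure", "en", "UK", True, True, "http://example.org/a"),
    TextRecord("Folleto traducido", "brochure", "es", "Spain", False, True, "http://example.org/b"),
    TextRecord("US white paper", "research article", "en", "USA", True, True, "http://example.org/c"),
]
accepted = [record.title for record in candidates if meets_design_criteria(record)]
print(accepted)
```

Recording the criteria as explicit metadata also makes it easy to report later why a given text was excluded.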

Page 49: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

2) Compilation Protocol

Page 50: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

The compilation protocol consists of four steps (Seghiri, 2011):

I. Locating and accessing resources
II. Downloading data
III. Text formatting
IV. Data storage

Page 51: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Step I: Locating and accessing resources

Page 52: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

The main sources of information to compile our corpus have been:

• institutional searches, carried out on the web sites of international organisations and institutions (International Telecommunication Union, Telefonica, etc.)
• key word searches using a search engine (www.google.com, www.yahoo.co.uk, etc.)
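Key word searches of this kind can be prepared systematically. The sketch below only builds query strings; it does not call any search engine. The seed terms are hypothetical, and using a country-code domain as a stand-in for the diatopic restriction is an assumption of this example, since a `.uk` or `.es` address does not guarantee where a text was written. The `site:` and `filetype:` operators are the ones commonly accepted by web search engines such as Google.

```python
# Hypothetical seed terms; a real project would take them from the source text.
SEED_TERMS = {
    "en": ["telecommunications", "modulator"],
    "es": ["telecomunicaciones", "modulador"],
}

# Assumption: the country-code domain approximates the diatopic restriction.
DOMAIN_FOR_LANGUAGE = {"en": ".uk", "es": ".es"}


def build_queries(language: str, file_type: str = "pdf") -> list[str]:
    """Return one search-engine query string per seed term for the given language."""
    domain = DOMAIN_FOR_LANGUAGE[language]
    return [
        f'"{term}" site:{domain} filetype:{file_type}'
        for term in SEED_TERMS[language]
    ]


for query in build_queries("en") + build_queries("es"):
    print(query)
```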

Page 53: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow
Page 54: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

key word searches

Page 55: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Step II: Downloading Data

Page 56: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• This step can be performed manually (Ctrl+S).
• We can download a group of pages with programs such as GNU Wget: http://www.gnu.org/software/wget
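For readers who prefer scripting to a dedicated download tool, batch downloading can be sketched with the Python standard library. This is an illustration under stated assumptions rather than a replacement for Wget: the function names and the numbered file-naming scheme are hypothetical, and the `fetch` parameter exists only so that the function can be tried without network access.

```python
import urllib.request
from pathlib import Path, PurePosixPath
from urllib.parse import urlparse


def local_name(url: str, index: int) -> str:
    """Build a predictable file name from a URL, numbered to avoid clashes."""
    stem = PurePosixPath(urlparse(url).path).name or "index.html"
    return f"{index:03d}_{stem}"


def download_all(urls, target_dir, fetch=None):
    """Save every URL into target_dir and return the list of paths written.

    `fetch` maps a URL to bytes; by default it performs a plain HTTP request.
    """
    if fetch is None:
        def fetch(url):
            with urllib.request.urlopen(url, timeout=30) as response:
                return response.read()

    target = Path(target_dir)
    target.mkdir(parents=True, exist_ok=True)
    saved = []
    for index, url in enumerate(urls, start=1):
        path = target / local_name(url, index)
        path.write_bytes(fetch(url))
        saved.append(path)
    return saved
```

A call such as `download_all(list_of_urls, "downloads/en")` would then save one numbered file per address; the folder name is, again, only an example.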

Page 57: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Step III: Text formatting

Page 58: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

III. Text formatting

• Clear preference for .HTML and .PDF
• Format conversion: ASCII or plain text format (cf. clean-text policy, Sinclair 1991: 21).
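For the HTML part of this conversion, a minimal sketch using only the Python standard library is shown here. It is one possible approach, not the procedure used by the authors: the class and function names are hypothetical, and PDF files are not covered because they need an external converter.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of <script> and <style>."""

    SKIPPED = {"script", "style"}

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self._chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._chunks.append(data)

    def text(self):
        # Collapse the runs of whitespace left behind by the removed markup.
        return " ".join(" ".join(self._chunks).split())


def html_to_plain_text(html: str) -> str:
    """Strip markup from an HTML string and return plain text."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head>"
    "<body><h1>DAB modulator</h1><p>Supports T-DMB &amp; DAB.</p></body></html>"
)
print(html_to_plain_text(sample))
```

The output keeps only the running text, which is what a concordancer needs; layout information such as headings is deliberately lost.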

Page 59: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

http://www.pdf-to-html-word.com/pdf-to-text

Page 60: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Step IV: Data Storage

Page 61: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

IV. Data storage

Page 62: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• bilingual (EN-ES)
• documented
• comparable
• virtual
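The slides do not spell out how the files are stored, so the following is only one possible layout: one folder per language subcorpus, plus a small index that records the source of each text so that the corpus stays documented. The function name, folder names and the `sources.json` file are all assumptions of this sketch.

```python
import json
from pathlib import Path


def store_text(corpus_root, language, file_name, text, source_url):
    """Write one plain-text file into its language subcorpus and log its source."""
    subcorpus = Path(corpus_root) / language
    subcorpus.mkdir(parents=True, exist_ok=True)
    text_path = subcorpus / file_name
    text_path.write_text(text, encoding="utf-8")

    # Keeping the source URL next to the text is what makes the corpus documented.
    index_path = Path(corpus_root) / "sources.json"
    if index_path.exists():
        index = json.loads(index_path.read_text(encoding="utf-8"))
    else:
        index = {}
    index[f"{language}/{file_name}"] = source_url
    index_path.write_text(
        json.dumps(index, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return text_path
```

With this layout, each subcorpus folder can be loaded into a concordancer on its own, while the index answers the question of where any given text came from.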

Page 63: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Using the Corpus to Translate

Page 64: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

1) Concordancers

Page 65: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

COMPARABLE CONCORDANCERS:

• AntConc 3.2 is a non-commercial, freely downloadable concordancer for Windows, Mac and Linux. This versatile software features several tools, which display lists of words and keywords (Word List, Keyword List); list, sort and search for lexical bundles (Collocates); generate lines in KWIC format (Concordance); indicate the position of the keyword within a given corpus (Concordance Plot); and allow the user to access the whole source file or corpus (File View).

http://www.antlab.sci.waseda.ac.jp/antconc_index.html
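To make two of these functions concrete, the sketch below builds a frequency-ranked word list and KWIC (key word in context) lines for a short sample text. It illustrates the general technique only and does not reproduce how AntConc works internally; the tokenisation rule, function names and sample sentences are assumptions.

```python
import re
from collections import Counter

# Simple tokeniser: words, optionally joined by hyphens or apostrophes.
TOKEN = re.compile(r"\w+(?:[-']\w+)*")


def tokenize(text):
    """Split text into lower-cased word tokens."""
    return [match.group(0).lower() for match in TOKEN.finditer(text)]


def word_list(text, top=10):
    """Return the `top` most frequent tokens with their counts."""
    return Counter(tokenize(text)).most_common(top)


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) per hit; width is in tokens."""
    tokens = tokenize(text)
    hits = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            hits.append((left, token, right))
    return hits


sample = (
    "The modulator accepts a transport stream. "
    "Each modulator supports DAB and T-DMB output."
)
for left, key, right in kwic(sample, "modulator"):
    print(f"{left:>25} | {key} | {right}")
print(word_list(sample, top=3))
```

Aligning the keyword in a central column, as the print statement does, is what lets recurring patterns to its left and right stand out.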

Page 66: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

AntConc [Monolingual Freeware Multiplatform Concordancer]

Page 67: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• Another monolingual concordancer, for Windows only, is the Multilingual Corpus Toolkit, which supports many European and Asian languages.

http://personalpages.manchester.ac.uk/staff/scott.piao/research/DownLoad/download.htm

• Freeware concordancers for Mac are Conc 1.7/1.8 and Concorder 1.0.

Conc: http://www.sil.org/computing/conc/conc.html
Concorder: http://mac.softpedia.com/get/Word-Processing/Concorder.shtml

Page 68: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

PARALLEL CONCORDANCERS:

• A bilingual or multilingual concordancer is a program for parallel corpora, i.e. corpora of source texts and their translations into other languages. As a rule, this kind of software requires input aligned at sentence level. Most bi-/multilingual concordancers are commercial. A well-known example is ParaConc 0.9, the multilingual version of MonoConc Pro. It can analyse up to four languages in parallel (one source text corpus and up to three target corpora).
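The requirement for sentence-level alignment becomes clear once a parallel search is written down: each hit in the source text has to bring its translation along with it. The sketch below shows the idea on a handful of hypothetical, hand-aligned EN-ES sentence pairs; it is not how ParaConc is implemented, and the function name is illustrative.

```python
def parallel_concordance(aligned_pairs, query):
    """Return every (source, target) pair whose source sentence contains the query."""
    needle = query.lower()
    return [(src, tgt) for src, tgt in aligned_pairs if needle in src.lower()]


# Hypothetical, hand-aligned EN-ES sentence pairs.
aligned = [
    ("The modulator accepts a transport stream.", "El modulador acepta un flujo de transporte."),
    ("Power consumption is low.", "El consumo de energía es bajo."),
    ("Each modulator has two outputs.", "Cada modulador tiene dos salidas."),
]

for src, tgt in parallel_concordance(aligned, "modulator"):
    print(f"EN: {src}\nES: {tgt}\n")
```

Without the alignment, the program would have no way of knowing which target sentence corresponds to a given source hit.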

Page 69: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

ParaConc [Bilingual Commercial Suite for Windows. Alignment]

Page 70: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

ParaConc [Concordancing]

Page 71: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

Corollary

Page 72: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• Comparable corpora are particularly useful for meeting translators’ information needs.

• Representative corpora: finding information on terminology, phraseology, concepts, cultural issues and text discourse for direct and inverse translation.

Page 73: 12. Gloria Corpas, Jorge Leiva, Miriam Seghiri (UMA) Human Translation & Translation Workflow

• Corpora:
  • instant access to authentic language and real usage
  • syntagmatic patterns and translation equivalents unavailable in other resources or technologies
  • guidance on style, text-structuring devices and conventions in both SL and TL
  • useful for the translation of any kind of text type, language/s and in any direction
