Sketch Engine CAT tool enhancement catalogue


learn how language works
sketchengine.co.uk

Enhancements for CAT tools

developed by Lexical Computing CZ, s.r.o.

Sketch Engine is a corpus management and corpus query tool designed specifically for fast processing of large, multi-billion-word corpora. Sketch Engine holds about 150 TB of data in 80 languages and is used by translators, terminologists, lexicographers and linguists to learn about language as produced by real users.

Many Sketch Engine features are already used successfully by translators and terminologists around the world and are ready to be included in CAT tools. Combining Sketch Engine's natural language processing know-how and vast amount of language data with state-of-the-art CAT software will help software developers gain an advantage over their competitors.

A catalogue of proposed CAT tool enhancements

Terminology extraction

A tried and tested algorithm for terminology extraction delivers high-quality results even without a list of stop words. The user can decide to what extent high-frequency, and therefore less specialized, words should be included in the result.

Compatible languages
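
The trade-off between rare and frequent candidates can be illustrated with a smoothed frequency ratio. The sketch below is not Sketch Engine's implementation. It assumes a score of the form (focus frequency per million + n) / (reference frequency per million + n), where the hypothetical parameter `n` stands in for the user's control over high-frequency words. All function names and toy counts are invented for illustration.

```python
from collections import Counter


def per_million(count: int, corpus_size: int) -> float:
    """Normalise a raw count to a frequency per million tokens."""
    return count * 1_000_000 / corpus_size


def keyness(focus: Counter, reference: Counter, n: float = 1.0) -> list[tuple[str, float]]:
    """Rank focus-corpus words by a smoothed frequency ratio.

    n is a hypothetical smoothing parameter: small values favour rare,
    specialised words; large values let more frequent words rise in the ranking.
    """
    focus_size = sum(focus.values())
    ref_size = sum(reference.values())
    scores = {
        word: (per_million(count, focus_size) + n)
        / (per_million(reference[word], ref_size) + n)
        for word, count in focus.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy counts, invented for illustration only.
focus_counts = Counter({"the": 50, "valve": 8, "gasket": 2, "pressure": 5})
reference_counts = Counter(
    {"the": 60000, "valve": 3, "gasket": 1, "pressure": 40, "house": 500}
)

ranking = keyness(focus_counts, reference_counts, n=1.0)
top_terms = [word for word, _ in ranking[:2]]
```

With `n=1.0` the function word "the" ends up at the bottom of the ranking although no stop-word list is used. Raising `n` to a very large value (for example `1_000_000`) moves the more frequent "pressure" above the rarer "gasket".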

Term base lemmatisation

Especially with morphologically rich languages, lemmatisation of term bases increases the chances of a term being correctly identified, which results in both reduced translation time and better QA scores.

Compatible languages: Bosnian, Bulgarian, Croatian, Czech, Dutch, English, Estonian, Finnish, French, German, Greek, Irish, Italian, Japanese, Korean, Latvian, Lithuanian, Polish, Portuguese, Russian, Serbian, Slovak, Spanish, Swahili, Swedish
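
To illustrate why lemmatisation helps, the sketch below looks up a term in an inflected Czech sentence. The small `LEMMAS` table is a hypothetical stand-in for a real lemmatiser, and the function names are assumptions made for this example only.

```python
# Hypothetical stand-in for a real lemmatiser: maps word forms to lemmas.
LEMMAS = {
    "smlouvy": "smlouva",
    "smlouvu": "smlouva",
    "smlouvou": "smlouva",
}


def lemmatise(text: str) -> list[str]:
    """Lowercase, split on whitespace, strip punctuation, map forms to lemmas."""
    tokens = [token.strip(".,;:!?") for token in text.lower().split()]
    return [LEMMAS.get(token, token) for token in tokens if token]


def find_term(term: str, segment: str) -> bool:
    """Return True if the lemmatised term occurs as a contiguous run in the segment."""
    term_lemmas = lemmatise(term)
    segment_lemmas = lemmatise(segment)
    width = len(term_lemmas)
    return any(
        segment_lemmas[i:i + width] == term_lemmas
        for i in range(len(segment_lemmas) - width + 1)
    )


segment = "Podpis kupní smlouvy proběhl včera."
exact_hit = "kupní smlouva" in segment.lower()   # plain string match on the base form
lemma_hit = find_term("kupní smlouva", segment)  # match on lemmas instead of forms
```

The plain string match misses the term because the segment contains the genitive form "smlouvy", while the lemma-based lookup finds it.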

Language detection

Sketch Engine can significantly improve the accuracy of language detection when a new document is uploaded to or opened in your CAT tool.

Compatible languages: over 200 of the most frequent languages

Multi-language document detection

Sketch Engine can analyse a document for sections written in different languages and identify those languages, so that the right sections can be assigned to the right translators from the start, avoiding re-assignment later.

Compatible languages: over 200 of the most frequent languages
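
The idea of splitting a document into per-language sections can be sketched as follows. The detector here is a deliberately naive stop-word overlap with tiny, hypothetical profiles; it only illustrates the workflow and says nothing about how Sketch Engine detects languages.

```python
from itertools import groupby

# Tiny, hypothetical stop-word profiles; a real detector would use far richer models.
PROFILES = {
    "en": {"the", "and", "is", "of", "to"},
    "de": {"der", "die", "und", "ist", "das"},
    "fr": {"le", "la", "et", "est", "les"},
}


def detect(paragraph: str) -> str:
    """Return the language whose profile shares the most words with the paragraph."""
    words = set(paragraph.lower().split())
    return max(PROFILES, key=lambda lang: len(words & PROFILES[lang]))


def split_by_language(paragraphs: list[str]) -> list[tuple[str, list[str]]]:
    """Group consecutive paragraphs that were detected as the same language."""
    return [(lang, list(group)) for lang, group in groupby(paragraphs, key=detect)]


document = [
    "The contract is valid to the end of the year.",
    "The price of the goods is fixed.",
    "Der Vertrag ist gültig und das Angebot ist verbindlich.",
    "Le contrat est signé et les prix sont fixes.",
]
sections = split_by_language(document)
languages = [lang for lang, _ in sections]
```

Each entry in `sections` pairs a language code with the run of paragraphs in that language, which is the unit a project manager would hand to one translator.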

Document alignment

Sketch Engine features an alignment algorithm that can align the previous and current versions of the same document at sentence level, even if passages have been added or deleted or sentences have been amended.

Compatible languages: all languages (language-independent feature)
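
What a sentence-level alignment of two document versions produces can be shown with Python's standard `difflib.SequenceMatcher`. This is a generic sequence-diff used as a stand-in, not Sketch Engine's alignment algorithm, and the sample sentences are invented.

```python
from difflib import SequenceMatcher


def align(old: list[str], new: list[str]) -> list[tuple[str, list[str], list[str]]]:
    """Align two sentence lists; each row is (operation, old sentences, new sentences)."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    return [
        (tag, old[i1:i2], new[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
    ]


old_version = [
    "Switch off the device.",
    "Remove the cover.",
    "Clean the filter.",
]
new_version = [
    "Switch off the device.",
    "Unplug the power cable.",
    "Remove the cover.",
    "Clean the filter carefully.",
]
alignment = align(old_version, new_version)
operations = [tag for tag, _, _ in alignment]
```

Rows tagged `equal` can reuse the existing translation, `insert` rows are new sentences, and `replace` rows pair an amended sentence with its earlier version.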

Terminology checking

The terminology checking feature checks whether a term found in the source has its term base equivalent in the target, even if the word form in the text differs from the one in the term base.

[Figure: a mockup of a QA screen]

Compatible languages
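
A form-independent terminology check could look roughly like the sketch below. The lemma tables and the one-entry term base are hypothetical, only single-word terms are handled, and the English to Czech example is chosen purely for illustration.

```python
# Hypothetical lemma tables standing in for real English and Czech lemmatisers.
LEMMAS = {
    "en": {"contracts": "contract"},
    "cs": {"smlouvy": "smlouva", "smlouvu": "smlouva", "dohody": "dohoda"},
}

# Hypothetical term base: source lemma -> required target lemma.
TERM_BASE = {"contract": "smlouva"}


def lemma_set(text: str, lang: str) -> set[str]:
    """Return the set of lemmas found in the text (single-word terms only)."""
    tokens = (token.strip(".,;:!?") for token in text.lower().split())
    return {LEMMAS[lang].get(token, token) for token in tokens if token}


def check_terms(source: str, target: str) -> list[str]:
    """List source terms whose required target term is missing from the target."""
    source_lemmas = lemma_set(source, "en")
    target_lemmas = lemma_set(target, "cs")
    return [
        term
        for term, translation in TERM_BASE.items()
        if term in source_lemmas and translation not in target_lemmas
    ]


ok = check_terms("Both contracts were signed.", "Obě smlouvy byly podepsány.")
flagged = check_terms("Both contracts were signed.", "Obě dohody byly podepsány.")
```

The first segment pair passes even though neither side contains the base form stored in the term base; the second is flagged because the target uses a different word.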

Concordance

Generating a concordance from a TM alone will often produce only a few examples. Using Sketch Engine's multi-billion-word corpora, a translator can view hundreds or thousands of examples. The time-saving Word Sketch feature goes further: it processes those examples and displays a table summarising the word's behaviour.

Compatible languages: all 80+ languages in Sketch Engine. The quality of results improves with the size of the corpus. Languages with a corpus bigger than 50 million words: Arabic, Azerbaijani, Basque, Bosnian, Bulgarian, Catalan, Chinese Simplified, Chinese Traditional, Croatian, Czech, Danish, Dutch, English, Estonian, Finnish, French, German, Greek, Hebrew, Hungarian, Indonesian, Italian, Japanese, Kazakh, Korean, Latvian, Lithuanian, Malay, Maltese, Norwegian, Persian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swedish, Tajik, Thai, Turkish, Ukrainian, Vietnamese
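
For readers unfamiliar with concordances, the following minimal keyword-in-context (KWIC) sketch shows the kind of lines a concordance contains. It runs over a single invented sentence; the function name and the bracket notation are assumptions of this example, not Sketch Engine's interface.

```python
def kwic(tokens: list[str], keyword: str, window: int = 3) -> list[str]:
    """Return keyword-in-context lines with up to `window` tokens on each side."""
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


text = "the court upheld the contract and the contract was later renewed by both parties"
lines = kwic(text.split(), "contract", window=2)
```

The more text the search runs over, the more such lines there are to inspect, which is why corpus size matters for this feature.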

Writing suggestions

Sketch Engine can provide writing suggestions based on context when the translator is lost for words or when a suitable collocation is difficult to think of.

Clicking a word, or typing a question mark and clicking it, brings up a list of options.

Compatible languages

More information

Sketch Engine is developed by

Lexical Computing CZ, s.r.o.

Brno, Czech Republic

For more information, please contact:

Ondřej Matuška, Sales and Marketing Manager

[email protected]

M +420 603 770 341, Skype: omatuska