Predictive Analysis in Machine Translation is Business Intelligence

Tony O’Dowd
Founder & Chief Architect

tonyod@kantanmt.com

What we aim to cover today

What is KantanMT.com?

Types of Quality Estimation
  Comparative Quality Estimations
  Predictive Quality Estimations

Benefits to Industry
  Product Scope Determination
  Tiered Pricing Capabilities

Conclusions

What is KantanMT.com?

Statistical MT Platform
  Cloud-based
  Highly scalable
  Inexpensive to operate
  Fusion of TM & MT & rules
  High speed, high quality translations

Our Vision

To put Machine Translation Customisation, Improvement and Deployment into your hands

Active KantanMT Engines: 7,783
Training Words Uploaded: 143,078,042,293
Member Words Translated: 4,259,399,846

www.kantanMT.com

Types of MT Quality Estimation

Comparative MT Quality Estimation
Uses a reference translation to calculate:
  Word recall & precision
  Text similarities
  Word order correlations
  Linguistic similarities

F-Measure Score
  Recall & precision calculation
  Closely linked to the relevancy of word selection for MT systems
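As a rough illustration of the recall and precision calculation behind an F-Measure score, the sketch below compares an MT output against one reference translation at word level. The function name `word_f_measure`, the whitespace tokenisation and the lowercasing are assumptions made for this example; they are not taken from KantanBuildAnalytics™.

```python
from collections import Counter


def word_f_measure(candidate, reference):
    """Return (precision, recall, F1) for word overlap with one reference.

    Assumption: words are split on whitespace and compared case-insensitively.
    """
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0, 0.0, 0.0

    # Clipped overlap: a reference word can only be matched as many
    # times as it actually occurs in the reference.
    overlap = sum((Counter(cand) & Counter(ref)).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0

    precision = overlap / len(cand)  # share of MT words found in the reference
    recall = overlap / len(ref)      # share of reference words the MT produced
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


precision, recall, f1 = word_f_measure(
    "the cat sat on mat",      # MT output
    "the cat sat on the mat",  # reference translation
)
```

Because the score only counts matching words, it says nothing about whether those words appear in the right order.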

KantanBuildAnalytics™

BLEU Score
  Improvement upon F-Measure
  Takes word order into consideration
  Linked to a sense of translation ‘fluency’
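To show how a BLEU-style score takes word order into account, here is a minimal sentence-level sketch with a single reference and no smoothing. It is a simplification for illustration only: the name `simple_bleu`, the whitespace tokenisation and the `max_n` default are assumptions, and production tools normally compute BLEU over a whole test set.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(candidate, reference, max_n=4):
    """Unsmoothed sentence-level BLEU against a single reference."""
    cand = candidate.lower().split()
    ref = reference.lower().split()
    if not cand or not ref:
        return 0.0

    log_precisions = []
    for n in range(1, max_n + 1):
        cand_ngrams = ngrams(cand, n)
        ref_ngrams = ngrams(ref, n)
        total = sum(cand_ngrams.values())
        # Clipped matches: each reference n-gram can be used only once per occurrence.
        matched = sum((cand_ngrams & ref_ngrams).values())
        if total == 0 or matched == 0:
            return 0.0  # without smoothing, one empty n-gram level zeroes the score
        log_precisions.append(math.log(matched / total))

    # Brevity penalty discourages outputs shorter than the reference.
    if len(cand) > len(ref):
        brevity_penalty = 1.0
    else:
        brevity_penalty = math.exp(1 - len(ref) / len(cand))

    # Geometric mean of the n-gram precisions, scaled by the brevity penalty.
    return brevity_penalty * math.exp(sum(log_precisions) / max_n)


reference = "the cat sat on the mat"
same_order = simple_bleu("the cat sat on the mat", reference, max_n=2)
reordered = simple_bleu("on the mat the cat sat", reference, max_n=2)
```

Both candidates contain exactly the reference words, so a word-level F-Measure would rate them equally; the reordered one scores lower here because one of its five bigrams does not occur in the reference.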

TER Score
  A method to help predict post-editing effort
  TER is quick to use and correlates highly with actual post-editing effort
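The idea behind an edit-rate score can be sketched as the number of word edits needed to turn the MT output into the reference, divided by the reference length. The version below is a deliberate simplification: it counts only insertions, deletions and substitutions, whereas full TER also allows block shifts. The names `word_edit_distance` and `approx_ter` are hypothetical.

```python
def word_edit_distance(hyp, ref):
    """Word-level Levenshtein distance between two token lists."""
    prev = list(range(len(ref) + 1))
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(
                prev[j] + 1,         # delete a word from the MT output
                curr[j - 1] + 1,     # insert a missing reference word
                prev[j - 1] + cost,  # keep or substitute a word
            ))
        prev = curr
    return prev[-1]


def approx_ter(hypothesis, reference):
    """Edits per reference word (no shift operation, unlike full TER)."""
    hyp = hypothesis.lower().split()
    ref = reference.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    return word_edit_distance(hyp, ref) / len(ref)


# One missing word against a six-word reference: one edit in six.
score = approx_ter("the cat sat on mat", "the cat sat on the mat")
```

A lower value means fewer edits, which is why this family of scores is read as a proxy for post-editing effort.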

Useful for Engine Development
  Baseline measurements
  Determination of ‘possible’ engine quality and relevancy
  Reference set of comparative translations required
  Does not work on unseen translations
  Of limited use in determining:
    PE effort
    Resources
    Costs

Kantan TotalRecall – Advanced TM: % of TM hits in this job
KantanMT – automated translations: % of automated translations for this job
Range of QE Scores: QE range defined to match the existing fuzzy match ranges used by the L10N industry
Quality Estimation Scores: segment-level QE scores, akin to fuzzy match scores
Word Counts – Project Stats: can be used to develop a project timeline and tiered pricing model for post-editing projects
Placeholder & Tag Counts: used by the PM for complexity surcharges
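One way to picture how segment-level QE scores and word counts could feed a tiered pricing model is sketched below. Every number in it is hypothetical: the band boundaries, the rate fractions and the per-word rate are placeholders chosen for the example, not values published by KantanMT, and `price_job` is an illustrative name.

```python
# Hypothetical bands: (minimum QE score, band label, fraction of the full per-word rate).
# Listed from highest to lowest so the first match wins.
BANDS = [
    (95, "95-100%", 0.30),
    (85, "85-94%", 0.50),
    (75, "75-84%", 0.70),
    (0, "below 75%", 1.00),
]


def band_for(score):
    """Return (label, rate fraction) for a segment's QE score."""
    for minimum, label, rate in BANDS:
        if score >= minimum:
            return label, rate
    raise ValueError("score must be non-negative")


def price_job(segments, full_rate_per_word):
    """Sum word counts per band and the total post-editing cost.

    `segments` is a list of (word_count, qe_score) pairs.
    """
    words_per_band = {label: 0 for _, label, _ in BANDS}
    total_cost = 0.0
    for word_count, qe_score in segments:
        label, rate = band_for(qe_score)
        words_per_band[label] += word_count
        total_cost += word_count * rate * full_rate_per_word
    return words_per_band, total_cost


# (word_count, qe_score) for four segments of an imaginary job.
segments = [(12, 97), (20, 88), (8, 60), (10, 78)]
words, cost = price_job(segments, full_rate_per_word=0.10)
```

The per-band word counts are the kind of figure a project manager could also use for scheduling, since lower bands imply more post-editing time per word.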

KantanAnalytics™
  No reference set required
  Predictive, not comparative

Benefits
  Tiered Pricing Model
  Prioritise PE activity
  Schedule
  Resources
  Cost
  Seamlessly integrated into all CAT tools

KantanAnalytics™: a predictive quality estimation technology

Questions?
