ENGLISH TO HINDI STATISTICAL MACHINE TRANSLATION SYSTEM
Presented By: Rajendra Verma
M.E. (CSE), 2nd Year
NATURAL LANGUAGE PROCESSING
Natural language processing (NLP) is a field of computer science, artificial intelligence, and computational linguistics concerned with the interactions between computers and human (natural) languages.
NLP APPLICATIONS
• Question answering
• Text categorization/routing
• Text mining
• Machine translation
• Spelling correction
WHAT IS MACHINE TRANSLATION?
A piece of text that has been rewritten automatically, by a machine, from one language into another.
NEED FOR MACHINE TRANSLATION
Research by the Common Sense Advisory firm:
• 72.1 percent of consumers spend most or all of their time on sites in their own language.
• 72.4 percent say they would be more likely to buy a product with information in their own language.
• 56.2 percent say that the ability to obtain information in their own language is more important than price.
AREAS IN WHICH MACHINE TRANSLATION IS USEFUL
• Real-time communications where it would not be practical for a human to translate (e.g. chat and email).
TYPES OF MACHINE TRANSLATION
• Rule based machine translation (RBMT)
• Statistical machine translation (SMT)
RULE BASED MACHINE TRANSLATION
• Rule-based systems use a combination of language and grammar rules plus dictionaries for common words. Specialist dictionaries are created to focus on certain industries or disciplines.
RULE BASED APPROACH
Components: grammar rules, lexicon, software program.
Example: "Sita slept in garden"
• ANALYSIS: the English source sentence is analysed (SVO order).
• TRANSFER: the structure is reordered from SVO to SOV.
• GENERATION: the Hindi sentence is produced: सीता बाग में सोयी
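The analysis, transfer and generation stages can be illustrated with a very small sketch. This is not the system from the slides: the four-word lexicon, the fixed "subject verb preposition noun" pattern and the function names are all assumptions made for this one example sentence.

```python
# Toy rule-based transfer: English SVO -> Hindi SOV.
# LEXICON and the single reordering rule are hypothetical, sized for one example.
LEXICON = {"sita": "सीता", "slept": "सोयी", "in": "में", "garden": "बाग"}


def analyse(sentence):
    """Label the parts of a 'subject verb preposition noun' sentence."""
    subject, verb, prep, noun = sentence.lower().split()
    return {"subject": subject, "verb": verb, "prep": prep, "noun": noun}


def transfer(parts):
    """Move the verb to the end and put the preposition after its noun."""
    return [parts["subject"], parts["noun"], parts["prep"], parts["verb"]]


def generate(words):
    """Look each word up in the lexicon and join the Hindi words."""
    return " ".join(LEXICON[w] for w in words)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("Sita slept in garden"))
```

A real rule-based system needs a parser and far larger grammars and dictionaries; the sketch only shows why the transfer step exists when source and target word orders differ.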
STATISTICAL MACHINE TRANSLATION
• Statistical machine translation (SMT) learns how to translate by analyzing existing human translations (known as bilingual text corpora).
• The translator can use a database of such translations as the source of all the information it needs for translating.
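One way to picture "learning from a bilingual corpus" is to estimate word translation probabilities from sentence pairs. The sketch below uses the expectation-maximization procedure of IBM Model 1, which the slides do not name; it is chosen here only as a well-known illustration. The three-sentence corpus (with romanized Hindi) and the function names are hypothetical, and the NULL word of the full model is left out for brevity.

```python
from collections import defaultdict

# Hypothetical toy parallel corpus: (English, romanized Hindi).
CORPUS = [
    ("sita slept", "sita soyi"),
    ("sita ate", "sita khayi"),
    ("gita slept", "gita soyi"),
]


def train(corpus, iterations=10):
    """Estimate t[(h, e)], the probability that English e translates to Hindi h."""
    pairs = [(e.split(), h.split()) for e, h in corpus]
    hindi_vocab = {h for _, hs in pairs for h in hs}
    # Start from a uniform distribution over the Hindi vocabulary.
    t = defaultdict(lambda: 1.0 / len(hindi_vocab))
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each Hindi word among the English words of its sentence.
        for es, hs in pairs:
            for h in hs:
                norm = sum(t[(h, e)] for e in es)
                for e in es:
                    frac = t[(h, e)] / norm
                    count[(h, e)] += frac
                    total[e] += frac
        # M-step: renormalize the expected counts per English word.
        t = defaultdict(float, {(h, e): c / total[e] for (h, e), c in count.items()})
    return t


def best_translation(t, english_word):
    """Return the Hindi word with the highest probability for english_word."""
    candidates = {h: p for (h, e), p in t.items() if e == english_word}
    return max(candidates, key=candidates.get)


table = train(CORPUS)
print(best_translation(table, "slept"))
```

No dictionary is supplied anywhere: the pairing of "slept" with "soyi" emerges only from which words co-occur across the sentence pairs, which is the core idea behind the corpus-driven approach.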
ISSUES IN MACHINE TRANSLATION
• Word order
Word order differs between languages. Languages can be classified by the typical order of subject (S), verb (V) and object (O) in a sentence. When the source and target languages have different word orders, word-for-word translation is difficult. For example, English has an SVO sentence structure and Hindi has an SOV sentence structure.
• Ambiguity
A given word or sentence can have more than one meaning. For example, the word "party" could mean a political party or a social event, and choosing the suitable one in a particular case is crucial to getting the right analysis and therefore the right translation.
• Common sense and world knowledge
When humans use natural language, they draw on an enormous amount of common sense and knowledge about the world, which helps to resolve ambiguity. For example, in "He went to the bank, but it was closed for lunch", we can infer that "bank" refers to a financial institution and not a river bank, because we know from our knowledge of the world that only the former type of bank can be closed for lunch.
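The "bank" example can be mimicked, very crudely, by counting context words that point to each sense. The sketch below is a simplified overlap heuristic in the spirit of the Lesk algorithm, not a method from the slides; the clue-word sets are hand-written assumptions that stand in for the world knowledge a human brings.

```python
import re

# Hypothetical, hand-written clue words for two senses of "bank".
SENSES = {
    "financial institution": {"money", "account", "loan", "closed", "lunch", "open"},
    "river bank": {"river", "water", "shore", "fishing", "boat"},
}


def disambiguate(sentence, senses=SENSES):
    """Pick the sense whose clue words overlap most with the sentence."""
    words = set(re.findall(r"[a-z]+", sentence.lower()))
    return max(senses, key=lambda sense: len(senses[sense] & words))


print(disambiguate("He went to the bank, but it was closed for lunch"))
```

The heuristic only works as far as its clue lists reach, which is exactly the difficulty the slide describes: the knowledge that banks close for lunch has to come from somewhere.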
EXISTING MACHINE TRANSLATION SYSTEMS
SYSTRAN TRANSLATOR
• Rule based machine translation system.
• Supports 45 languages.
BING TRANSLATOR
• Statistical machine translation.
• Supports 47 languages.
GOOGLE TRANSLATOR
• Statistical machine translation.
• Supports 80 languages.
THANK YOU