Hindi-Tamil Text Translation

1. Hindi-Tamil Text Translation
Submitted to: Mr. Vimal Kumar K.
Submitted by: Vaibhav Agarwal (10103546), Akash Singh (10103549)

2. Introduction
Natural Language Processing (NLP) is a field of computer science, artificial intelligence, and linguistics concerned with the interactions between computers and human (natural) languages. Traditionally, human translators with extensive knowledge of both the source and target languages converted text from one language to the other manually. Machine Translation is the area of computational linguistics concerned with automatically converting text in one natural language into another, preserving the meaning of the input and producing fluent text in the output language.

3. Aim of Research
Hindi and Tamil are among the five most widely spoken languages in India, with shares of about 41% and 5% of speakers respectively. Translation services such as Google Translate still translate between such language pairs through an intermediate language such as English. In this project, we have tried to convert Hindi text directly into Tamil text, without using any intermediate language.

4. Part-of-Speech Tagging
Part-of-speech tagging is the process of marking each word in a text (corpus) with its part of speech, based on both its definition and its context. A simplified form of this is the identification of words as nouns, verbs, adjectives, adverbs, etc.
Example: [not preserved in this transcript]

5. Statistical Machine Translation - An Approach
Statistical Machine Translation (SMT) is a translation approach in which translations are generated on the basis of statistical models. The parameters of these models are derived from the analysis of bilingual text corpora. It is based on the view that every sentence in one language has a possible translation in another language, and that a sentence can be translated in many possible ways; the statistical model is used to choose the most probable one.

6. Word Sense Disambiguation (WSD) - A Challenge
WSD is the process of identifying which sense (i.e. meaning) of a word is used in a sentence when the word has multiple meanings.
Example: [not preserved in this transcript]

7. Lesk Algorithm - An Approach to WSD
The Lesk algorithm selects a meaning for a target word by comparing the dictionary definitions of its possible senses with those of the other content words in the surrounding window of context. It counts the words that overlap between the definition of each sense of the target word and the definitions of the other words in the sentence, and chooses the sense with the largest overlap.

8. Architecture of Our Project
Hindi Input -> Part-of-Speech Tagging -> Apply WSD -> Morphological Analysis -> Local Word Grouping -> Perform Translation & Produce Tamil Text

9. Resource References
Indian Language Technology Proliferation and Deployment Centre: http://tdil-dc.in/
Centre for Indian Language Technology, IIT Bombay: http://www.cfilt.iitb.ac.in/

10. References
[1] Tripathi Sneha, Sarkhel Juran Krishna, Approaches to Machine Translation, Annals of Library and Information Studies, Vol. 57, December 2010.
[2] Antony P.J., Machine Translation Approaches and Survey for Indian Languages, Computational Linguistics and Chinese Language Processing, Vol. 18, No. 1, March 2013.
[3] Gupta Deepa, Chatterjee Niladri, A Morpho Syntax Based Adaptation and Retrieval Scheme for English to Hindi EBMT, Department of Mathematics, IIT Delhi.
[4] Sobha Lalitha Devi, Pravin Pralayankar, Menaka S, Bakiyavathi T, Vijay Sundar Ram R, Kavitha V, Verb Transfer in a Tamil to Hindi Machine Translation System, 2010 International Conference on Asian Language Processing.

11. References (continued)
[5] Aswani Niraj, Gaizauskas Robert, Developing Morphological Analysers for South Asian Languages: Experimenting with the Hindi and Gujarati Languages, Department of Computer Science, University of Sheffield.
[6] Raghavendra Udupa U., Tanveer A. Faruquie, An English-Hindi Statistical Machine Translation System, IBM India Research Lab, New Delhi.
[7] Amba Kulkarni, Soma Paul, Malhar Kulkarni, Anil Kumar, Nitesh Surtani, Semantic Processing of Compounds in Indian Languages.
[8] Pankaj Kumar, Atul Vishwakarma, Ashwini Kr. Sharma, Approaches for Disambiguation in Hindi Language.
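
Illustrative sketch for slide 7 (Lesk Algorithm). The overlap counting described there can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the project's implementation: the sense inventory GLOSSES, the sense labels, the stopword list, and the function names are all hypothetical, and English glosses are used purely for readability. A Hindi system would presumably draw its glosses from a lexical resource such as a Hindi WordNet.

```python
# Toy sense inventory: word -> {sense label: gloss}.
# All glosses are made up for illustration (assumption, not a real dictionary).
GLOSSES = {
    "bank": {
        "bank_finance": "an institution that accepts money deposits and gives loans",
        "bank_river": "the sloping land beside a river or stream of water",
    },
    "deposit": {
        "deposit_money": "money placed in an institution account for safekeeping",
    },
    "river": {
        "river_water": "a large natural stream of water flowing across land",
    },
}

# Small hypothetical stopword list so function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "or", "and", "in", "for", "that",
             "beside", "across"}


def content_words(text):
    """Lowercase the text, split on whitespace, and drop stopwords."""
    return {w for w in text.lower().split() if w not in STOPWORDS}


def lesk(target, context, glosses=GLOSSES):
    """Return (best_sense, overlap) for `target` given the context words."""
    # Pool the gloss words of every sense of every other context word.
    context_bag = set()
    for word in context:
        if word == target:
            continue
        for gloss in glosses.get(word, {}).values():
            context_bag |= content_words(gloss)

    # Score each sense of the target by its gloss overlap with that pool.
    best_sense, best_overlap = None, -1
    for sense, gloss in glosses[target].items():
        overlap = len(content_words(gloss) & context_bag)
        if overlap > best_overlap:  # on a tie, the first sense listed wins
            best_sense, best_overlap = sense, overlap
    return best_sense, best_overlap


print(lesk("bank", ["deposit", "bank"]))  # context about money
print(lesk("bank", ["river", "bank"]))    # context about water
```

With this toy inventory, the first call selects the financial sense and the second selects the riverside sense, because those glosses share the most words with the glosses of the neighbouring word. Context words missing from the inventory are simply ignored, which is one reason gloss coverage matters in practice.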