  • Study and Development of Machine Translation System from Hindi language to Dogri language; an important tool to bridge the digital divide.

    Preeti Dubey Ph.D Thesis Page 52

    Chapter 4

    Grammatical & Inflectional Analysis of Hindi and Dogri

    India is a linguistically rich country with twenty-two constitutional languages, written in twelve different scripts. South Asian languages are broadly classified into six groups: Indo-Aryan, Dravidian, Iranian, Nuristani, Austro-Asiatic and Tibeto-Burman. The Indian languages belong to several of these families. Seventy-five percent of Indians speak languages of the Indo-Aryan family, among them Hindi, Bengali, Assamese, Punjabi, Marathi, Oriya, Dogri, Kashmiri, Konkani and Gujarati; twenty-two percent speak languages of the Dravidian family (Tamil, Telugu, Kannada and Malayalam); and the remaining languages, i.e. Bodo and Manipuri, belong to the Tibeto-Burman family.

    Indian languages are inflectional, with rich morphology, relatively free word order and SOV (Subject Object Verb) as the default sentence structure. Many of them are structurally similar and are called sibling languages. The literature survey on machine translation shows that Machine Translation systems between sibling languages can be developed with less effort by using the direct approach. This chapter therefore presents a detailed comparative study of the language pair of our Machine Translation system, Hindi and Dogri. The aim of the comparison is to identify the similarities and differences between Hindi and Dogri from the Machine Translation point of view. The chapter begins with a brief introduction to the Devanāgarī script, followed by some peculiarities of the Dogri language and then Hindi and Dogri numerals. The grammatical and inflectional comparison of the language pair is given in detail, and some important features of the study with regard to machine translation are then discussed.

    The International Alphabet of Sanskrit Transliteration (IAST) scheme has been used for transliteration of Hindi and Dogri text in this chapter. IAST is the de facto standard used in printed publications, like books and magazines, and with the wider availability of Unicode fonts it is also increasingly used for electronic texts.

    4.1 Introduction to the script used for Hindi and Dogri

    Devanāgarī is part of the Brahmic family of scripts. Originally used to write Sanskrit, it has evolved over a period of more than two thousand years and is now used to write many languages, such as Nepali, Marathi and Dogri. Hindi and Dogri are closely related languages and both use the Devanāgarī script, which is written from left to right. Devanāgarī has 52 letters in all: fourteen vowels, thirty-three consonants and five consonant conjuncts. Some additional consonants are formed with a dot diacritic: a dot below is used to supplement the alphabet to express additional sounds. Devanāgarī has two forms for each vowel, the full form and the short form.

    4.1.1 Full form: In Devanāgarī, the full form is used for a vowel that does not immediately follow a consonant or consonant cluster, i.e. in word-initial position or as the second of a sequence of vowels.

    4.1.2 Short form (or mātrā): The short form is used when the vowel immediately follows a consonant or consonant cluster. Short forms consist of lines, hooks or a combination of both, and are written around the consonant sign (below, above, to the right or to the left). The following table shows both the full form and the short form of the vowels in Devanāgarī.

    Table 4.1: Devanāgarī vowels (in use) in their full and short forms

    Full Form    Short Form
    अ (a)        no sign
    आ (ā)        ◌ा
    इ (i)        ◌ि
    ई (ī)        ◌ी
    उ (u)        ◌ु
    ऊ (ū)        ◌ू
    ए (e)        ◌े
    ऐ (ai)       ◌ै
    ओ (o)        ◌ो
    औ (au)       ◌ौ
    ऋ (ṛ)        ◌ृ
    अं (aṅ)      ◌ं
    अः (aḥ)      ◌ः
    अँ (aṃ)      ◌ँ
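    The choice between full form and short form described in 4.1.1 and 4.1.2 can be sketched in a few lines of Python. This is only an illustration, not part of the translation system described in this thesis: the name `write_syllable` and the table `FULL_TO_MATRA` are hypothetical, and the mapping covers only the eleven plain vowels of Table 4.1 (the nasalization and visarga signs are left out).

```python
# Full (independent) vowel -> short form (matra), as Unicode code points.
# An empty string means "no sign": the inherent vowel a is not written
# after a consonant. Names here are illustrative assumptions.
FULL_TO_MATRA = {
    "\u0905": "",        # a  (no sign)
    "\u0906": "\u093e",  # aa
    "\u0907": "\u093f",  # i
    "\u0908": "\u0940",  # ii
    "\u0909": "\u0941",  # u
    "\u090a": "\u0942",  # uu
    "\u090f": "\u0947",  # e
    "\u0910": "\u0948",  # ai
    "\u0913": "\u094b",  # o
    "\u0914": "\u094c",  # au
    "\u090b": "\u0943",  # vocalic r
}


def write_syllable(consonant: str, vowel: str) -> str:
    """Return the written form of a vowel, given the consonant before it.

    With no preceding consonant the full form is used; otherwise the
    consonant is followed by the short form (matra) of the vowel.
    """
    if vowel not in FULL_TO_MATRA:
        raise ValueError(f"not a full-form vowel covered here: {vowel!r}")
    if not consonant:
        return vowel
    return consonant + FULL_TO_MATRA[vowel]


# mor (peacock): ma + o, then ra + inherent a
word = write_syllable("\u092e", "\u0913") + write_syllable("\u0930", "\u0905")
```

    Keeping the mapping as data rather than as branching code makes it easy to extend with further signs if they are needed.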

    4.1.3 Other Devanāgarī Symbols

    Table 4-2: Other symbols used in Devanāgarī

    Symbol and transliteration    Description
    ऑ (ô)                         Devanāgarī letter candra O
    ऽ ( ̕)                         Devanāgarī sign avagraha

    4.1.4 Consonants: The consonants in Devanāgarī are presented in the table below:

    Table 4.3: Devanāgarī consonants with their transliteration in IAST

    क k     ख kh    ग g     घ gh    ङ ṅ
    च c     छ ch    ज j     झ jh    ञ ñ
    ट ṭ     ठ ṭh    ड ḍ     ढ ḍh    ण ṇ
    त t     थ th    द d     ध dh    न n
    प p     फ ph    ब b     भ bh    म m
    य y     र r     ल l     व v
    श ś     ष ṣ     स s     ह h

    4.1.5 Additional Consonants: A dot below is used to supplement the alphabet to express additional sounds, i.e. these consonants are formed with a dot diacritic.

    Table 4-4: Consonants with dot diacritic

    क़ qa    ख़ ḵẖa    ग़ ġa    ज़ za    फ़ fa    ड़ ṛa    ढ़ ṛha

    4.1.6 Consonant Conjuncts: In Devanāgarī, the following consonant conjuncts exist:

    i. क् + ष = क्ष

    ii. त् + र = त्र

    iii. ज् + ञ = ज्ञ

    iv. श् + र = श्र

    v. H + य = I

    4.1.7 The full alphabetic order of Devanāgarī as used for Hindi is as follows:

    � � � � � � � � � ; (�) (�) (�); � � � घ � ;� ! " # ; $ % & ' ( ) *; + , -

    . / ;0 1 2 3 4; 5 � � 6 7 8 9 ह; : ; < = >

    4.2 Peculiarities of the Dogri Language:

    Dogri is written using Devanāgarī script and has thirty eight segmental and five supra segmental

    phonemes. Segmental phonemes have been divided into two broad groups i.e. vowels and consonants.

    It has ten vowel phonemes and twenty eight consonant phonemes. Some peculiarities of the language

    are:

    i. Dogri has the same basic consonants as Devanāgarī, but घ (gh), झ (jh), ढ (ḍh), ध (dh) and भ (bh) exist only in Dogri orthography. Phonetically, they are used for tonal क (k), च (c), ट (ṭ), त (t) and प (p) in word-initial position, e.g. घर (ghar/house), झड़ना (jhaṛanā/to shed), ढक्कन (ḍhakkan/lid), धमकी (dhamakī/threat), भखना (bhakhanā/to get heated); they are also used to represent the tone of ग (g), ज (j), ड (ḍ), द (d) and ब (b) in the middle and at the end of a word, e.g. मघोर (maghor/hole), समझ (samajh/understanding), बड्ढना (baḍḍhnā/to cut), बधाना (badhānā/to increase), चोभना (cobhanā/to pierce).

    ii. In Dogri the phonemes ङ and ञ are also used in word-initial position, e.g. ङूर (ṅūr/sprout), ङूठा (ṅūṭhā/thumb), ञ्याणा (ñyāṇā/infant), ञारां (ñārāṃ/eleven), ड़ेका (ṛekā/obstacle); this is not the case in Hindi.

    iii. The use of chandrabindu ◌ँ and visarga ◌ः is not prevalent in Dogri.

    iv. �, :,

    b) Its second purpose is to indicate syncopated forms. For example, सʼञां’लै is a combination of सʼञां + बेल्लै, meaning ‘in the evening’. The first apostrophe (between the shiro rekha/line) indicates the high falling tone and the other (above the rekha/line) shows the syncopated form.

    vi. Halant ह् /h/ is also used to represent tone, e.g. ओह् (oh/that), पीह्ना (pīhnā/to grind).

    vii. In Dogri, the avagraha sign ऽ is used to indicate extra-long vowels, e.g.

    तला (talā/sole)      तलाऽ (talā ̕/pond)
    लगा (lagā/began)     लगाऽ (lagā ̕/affection)

    viii. Nasalization is also phonemic in Dogri; ‘◌ं’ is used to represent it. The following examples show the difference in meaning between words with and without nasalization:

    With Nasalization                Without Nasalization
    हां (hāṃ/yes)                    हा (hā/was)
    बांग (bāṃg/crow of a cock)       बाग (bāg/garden)

    ix. The full alphabetic order of Devanāgarī as used for Dogri is as follows:

    � � � � � � � � �; � � � घ �; � ! " #; $ % & ' ( ) *; + , - . /; 0 1 2 3

    4; 5 � � 6 7 9 ह : ;

    4.4 Other symbols used in Dogri

    Some special symbols for tone and syncopation used in Dogri are listed in the table below:

    Table 4-5: Other symbols used in Dogri

    Symbol                                  Description
    ऽ                                       Avagraha
    (apostrophe between shiro rekha)        Sur Chinha
    ʼ (apostrophe above shiro rekha)        Supra segmental Phoneme

    4.5 Grammatical and Inflectional Analysis

    There are eight parts of speech in both Hindi and Dogri: Noun, Pronoun, Adjective, Verb, Adverb, Postposition, Conjunction and Interjection. The first four, i.e. Noun, Pronoun, Adjective and Verb, are declinable. Nouns and pronouns take suffixes to show gender, number and case distinctions. Adjectives consist of bases which inflect for gender and number. Verbs consist of those morphemes that take suffixes showing a three-fold distinction of gender, number and person, together with tense or mood.

    4.5.1 Nouns

    In a Dogri sentence, nouns function as subject, direct object, indirect object and object of a postposition. Being declinable, nouns inflect for gender, number and case. Both Hindi and Dogri have two genders (masculine and feminine), two numbers (singular and plural) and eight cases: Nominative, Accusative, Instrumental, Dative, Ablative, Genitive, Locative and Vocative. Each case has its own case markers, which are used as postpositions to denote the relation of the noun to other words in the sentence. Nouns also change their form when used in a sentence, and at this level three cases can be distinguished: Direct, Oblique and Vocative. Postpositions are used with Oblique forms; Direct and Vocative forms need no postpositions. Prepositions are used in the vocative. The two languages can be distinguished on the basis of inflection.

    4.5.1.1 Inflection of Nouns by Gender

    There are only two genders: masculine and feminine. Names of male living beings are masculine and names of female beings are feminine. For inanimate things and for abstract, collective and material nouns, gender is determined partly by form but mainly by usage. Bigger and harder things are mostly categorized as masculine, and small and tender objects as feminine. A masculine noun can be changed to feminine by adding a suffix in final position. Some rules are listed below:

    i. Dogri consonant-ending masculine noun stems can be changed to the feminine form by employing the suffixes अनी (anī), आनी (ānī), ई (ī) and एआनी (eānī) or ऐनी (ainī):

    Masculine                                   Feminine
    मोर (mor/peacock)                           मोरनी (moranī)
    देर (der/husband's younger brother)         दरानी (darānī)
    कौल (kaul/big bowl)                         कौली (kaulī)
    पन्त (pant/pundit)                          पन्तेआनी (panteānī) / पन्तैनी (pantainī)

    • आनी (ānī) and ई (ī) are common to both languages.

    • Examples of words using the suffix आनी (ānī) in both languages:

    Masculine                                     Feminine (Hindi)         Feminine (Dogri)
    देवर (devar/husband's younger brother)        देवरानी (devarānī)       दरानी (darānī)
    सेठ (seṭh)                                    सेठानी (seṭhānī)         सठानी (saṭhānī)

    Although the suffix is the same in both languages, in Dogri the addition of आनी (ānī) causes a change in the root word (i.e. removal of the mātrā ◌े).

    • Examples of words using the suffix ई (ī):

    Masculine                     Feminine (Hindi)          Feminine (Dogri)
    बकरा (bakarā/he-goat)         बकरी (bakarī)             बकरी (bakarī)
    घोड़ा (ghoṛā/horse)            घोड़ी (ghoṛī/mare)         घोड़ी (ghoṛī)

    ii. The final -आ (ā) of -आ ending masculine noun stems, in Dogri as well as Hindi, changes to -ई (ī) to form the feminine, e.g.

    Masculine Word                Feminine Word
    दादा (dādā/grandfather)       दादी (dādī/grandmother)
    बकरा (bakarā/he-goat)         बकरी (bakarī/she-goat)

    iii. The final -ई (ī) of -ई ending masculine noun stems changes to -अन (an) to form the feminine, e.g.

    Masculine                         Dogri Feminine
    माली (mālī/gardener)              मालन (mālan/female gardener)
    धर्मी (dharmī/religious man)      धर्मन (dharman/religious woman)

    iv. The final -आई (āī) of -आई ending masculine noun stems changes to -ऐन (ain) to form the feminine, e.g.

    Masculine                     Dogri Feminine
    भाई (bhāī/brother)            भैन (bhain/sister)
    शदाई (śadāī/crazy man)        शदैन (śadain/crazy woman)
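    Rules ii to iv above depend only on the final vowel of the masculine stem, so they can be sketched as a small function over IAST strings. The sketch below is an assumption-laden illustration rather than the rule engine of the system described in this thesis: the name `dogri_feminine` is hypothetical, and consonant-final stems (rule i) return `None` because the choice among the suffixes anī, ānī, ī, eānī and ainī is lexical and would need a dictionary.

```python
import unicodedata
from typing import Optional

A_LONG = "\u0101"  # IAST long a
I_LONG = "\u012b"  # IAST long i


def dogri_feminine(masculine: str) -> Optional[str]:
    """Derive a Dogri feminine noun from a masculine IAST stem.

    Implements only the vowel-final rules ii-iv; returns None for
    consonant-final stems, whose suffix cannot be predicted by rule.
    """
    stem = unicodedata.normalize("NFC", masculine)
    if stem.endswith(A_LONG + I_LONG):   # rule iv: -aaii -> -ain
        return stem[:-2] + "ain"
    if stem.endswith(I_LONG):            # rule iii: -ii -> -an
        return stem[:-1] + "an"
    if stem.endswith(A_LONG):            # rule ii: -aa -> -ii
        return stem[:-1] + I_LONG
    return None                          # rule i: needs a lexicon
```

    The -āī test has to come before the -ī test, since every -āī stem also ends in -ī.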

    • Suffixes used in Hindi but not in Dogri

    Certain suffixes are used in Hindi but not in Dogri. These suffixes, with examples, are:

    i. इया (iyā)

    Masculine Word                Hindi Feminine Form               Dogri Feminine Form
    बेटा (beṭā/son)               बिटिया (biṭiyā/daughter)          Word not in Dogri vocabulary
    गुड्डा (guḍḍā/doll)           गुड़िया (guṛiyā)                   गुड्डी (guḍḍī)
    बूढ़ा (būṛhā/old man)          बुढ़िया (buṛhiyā)                  बुड्ढी (buḍḍhī)
    बन्दर (bandar/monkey)         बन्दरिया (bandariyā)              बांदरी (bāṃdarī)

    ii. इन (in)

    Masculine Word                Hindi Feminine Form               Dogri Feminine Form
    माली (mālī/gardener)          मालिन (mālin)                     मालन (mālan)
    सुनार (sunār/goldsmith)       सुनारिन (sunārin)                 सनैरी (sanairī)
    धोबी (dhobī/laundryman)       धोबिन (dhobin)                    धोबन (dhoban)

    iii. आइन (āin)

    Masculine Word                Feminine Form in Hindi            Feminine Form in Dogri
    पण्डित (paṇḍit/priest)        पण्डिताइन (paṇḍitāin)             पन्तैनी (pantainī)

    4.5.1.2 Inflection of Nouns for Number

    Hindi and Dogri both have two numbers: singular and plural. In Dogri, only -ā (आ) ending masculine stems change in the plural, with a few exceptions, e.g. नेता (netā/minister), मुखिया (mukhiyā/head).

    • Number inflection for masculine words

    i. The final -ā (आ) changes to -e (ए) in the plural forms of such masculine nouns, e.g.

    Singular Form                 Plural Form
    चाचा (cācā/uncle)             चाचे (cāce)
    मामा (māmā/uncle)             मामे (māme)

    The rest remain the same, for example मोर (mor/peacock), भाई (bhāī/brother), माली (mālī/gardener). The same rule applies to such Hindi masculine words.
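    The masculine plural rule is simple enough to state as code. The following is a minimal sketch over IAST strings, assuming a hand-maintained exception list; the names `masculine_plural` and `UNCHANGED_A_STEMS` are hypothetical, and the list contains only the two exceptions named above.

```python
import unicodedata

A_LONG = "\u0101"  # IAST long a

# -aa ending masculine nouns that do not change in the plural.
# Only the two exceptions mentioned in the text are listed here.
UNCHANGED_A_STEMS = {"net" + A_LONG, "mukhiy" + A_LONG}


def masculine_plural(noun: str) -> str:
    """Return the direct plural of a masculine noun given in IAST."""
    noun = unicodedata.normalize("NFC", noun)
    if noun.endswith(A_LONG) and noun not in UNCHANGED_A_STEMS:
        return noun[:-1] + "e"   # final -aa -> -e
    return noun                  # all other stems stay the same
```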

    • Number inflection for feminine words

    Hindi uses different suffixes to mark number on feminine words, whereas in Dogri the morpheme /āṃ/ (आं) is mostly added to the singular feminine noun stem to form the plural; the phonological changes in the plural form depend on the final sound of the noun, e.g.

    Singular Form                     Plural Form
    बू (bū/father's sister)           बुआं (buāṃ)
    छां (chāṃ/shade)                  छामां (chāmāṃ)

    • Rules for number inflection in Hindi feminine words are:

    i. Words ending in अ (a) or आ (ā) form the plural by adding एँ (eṃ), e.g.

    Singular Form                     Hindi Plural Form
    बहन (bahan/sister)                बहनें (bahaneṃ)
    सेना (senā/army)                  सेनाएँ (senāeṃ/armies)

    ii. Words ending in या (yā) change the final आ (ā) to आँ (āṃ), e.g.

    Singular Form                     Hindi Plural Form
    गुड़िया (guṛiyā/doll)              गुड़ियाँ (guṛiyāṃ/dolls)
    बुढ़िया (buṛhiyā/old lady)         बुढ़ियाँ (buṛhiyāṃ/old ladies)

    iii. एँ (eṃ) is added to words ending in उ (u), ऊ (ū) and औ (au), and long ऊ (ū) is shortened to उ (u), e.g.

    Singular Form (ekvacan)           Plural Form (bahuvacan)
    वधू (vadhū/bride)                 वधुएँ (vadhueṃ/brides)
    वस्तु (vastu/thing)               वस्तुएँ (vastueṃ/things)
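    The three Hindi rules above can be combined into one function, provided the more specific ending is tested first. This is a sketch under stated assumptions, not the thesis implementation: the name `hindi_feminine_plural` is hypothetical, input is IAST with the final inherent a not written (bahan rather than bahana), nasalization is written uniformly as ṃ, and endings the rules do not cover (for example -ī) raise an error instead of being guessed.

```python
import unicodedata

A_LONG = "\u0101"   # IAST long a
U_LONG = "\u016b"   # IAST long u
NASAL = "\u1e43"    # IAST m with dot below, used here for nasalization
VOWELS = set("aeiou" + A_LONG + "\u012b" + U_LONG)


def hindi_feminine_plural(noun: str) -> str:
    """Form the direct plural of a Hindi feminine noun given in IAST."""
    noun = unicodedata.normalize("NFC", noun)
    if noun.endswith("iy" + A_LONG):          # rule ii: -iyaa -> -iyaam
        return noun + NASAL
    if noun.endswith(U_LONG):                 # rule iii: long u shortens
        return noun[:-1] + "u" + "e" + NASAL
    if (noun.endswith(A_LONG) or noun.endswith("u")
            or noun[-1] not in VOWELS):       # rules i and iii: add -em
        return noun + "e" + NASAL
    raise ValueError(f"ending not covered by rules i-iii: {noun!r}")
```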

    4.5.1.3 Inflection of Nouns for Case

    There are eight cases in both Hindi and Dogri: Nominative, Accusative, Instrumental, Dative, Ablative, Genitive, Locative and Vocative. Each case has its own case markers, which are used as postpositions to denote the relation of the noun to other words in the sentence. Nouns also change their form when used in a sentence, and at this level three cases can be distinguished: Direct, Oblique and Vocative. Postpositions are used with Oblique forms; Direct and Vocative forms need no postpositions. Prepositions are used in the vocative. Hindi has no proper noun inflection, but Dogri does. The following table shows the case markers used in Hindi and Dogri:

    Table 4-6: Case markers used in Hindi and Dogri

    Case (कारक/kārak)                       Hindi Postposition          Dogri Postposition
    Nominative (कर्ता/kartā)                ने                          नै
    Accusative/Objective (कर्म/karma)       को                          गी, की, ई
    Instrumental (करण/karaṇ)                से, के साथ, के द्वारा       कन्नै/नै, कशा/शा, कोला, थमां, द्वारा
    Dative (सम्प्रदान/sampradān)            के लिए, को                  आस्तै, गित्तै/तै, लेई, ताईं, जोह्गा
    Ablative (अपादान/apādān)                से                          उप्परा/परा/रा, बिच्चा/इच्चा/चा, कशा/शा, कोला
    Genitive (सम्बन्ध/sambandha)            का, के, की, रा, रे, री      दा, दी, दे, दियां / रा, री, रे, रियां, ड़ा, ड़ी, ड़े, ड़ियां, ना, नी, ने, नियां
    Locative (अधिकरण/adhikaraṇ)             में, पर                     उप्पर/पर/र, बिच्च/इच्च/च
    Vocative (सम्बोधन/sambodhan)            हे, अरे                     ओ, ए

    Although both languages have almost the same pattern of case inflection, semantically and structurally, Dogri has its own peculiarity here too. Proper nouns such as names of planets, cities, rivers and last names are inflected for the oblique case in Dogri but not in Hindi, e.g.

    Hindi sentence: सूरज को जल चढ़ाओ । (sūraj ko jal caṛhāo)
    Dogri sentence: सूरजै गी जल चाढ़ो । (sūrajai gī jal cāṛho)

    Here sūraj (sun) is a proper noun. It is inflected as sūrajai in the Dogri sentence but remains uninflected in the Hindi one, which shows that proper noun inflection exists in Dogri but not in Hindi. [73-78]

    4.5.1.3.1 Inflectional Rules for Word Formation

    i. In Dogri, ऐ (ai) is appended to all singular nouns except आ (ā) ending nouns for the oblique case, e.g.

    जागत (jāgat/boy)          जागतै (jāgatai)         जागतै कन्नै (jāgatai kannai)
    बापू (bāpū/father)        बापुऐ (bāpuai)          बापुऐ कन्नै (bāpuai kannai)
    कुड़ी (kuṛī/girl)          कुड़ियै (kuṛiyai)        कुड़ियै कन्नै (kuṛiyai kannai)

    ii. With consonant-ending feminine nouns, ‘ई’ (ī) or ‘उ’ (u) are used in free variation, e.g.

    रात (rāt/night): राती च (rātī c), राती गित्तै (rātī gittai)
    सस्स (sassa/mother-in-law): सस्सु थमां (sassu thamāṃ)

    iii. एं (eṃ) is used with plural nouns, e.g. जागतें दा (jāgateṃ dā), साधुएं गी (sādhueṃ gī), मालनें कशा (mālaneṃ kaśā), कापियें च (kāpiyeṃ c).

    iv. For the vocative case, आ (ā) and ए (e) are added to singular masculine and feminine nouns respectively, whereas ओ (o) is appended to the plurals of both masculine and feminine nouns, in both Hindi and Dogri, e.g.

             Masculine Singular        Masculine Plural
    Dogri    जागता (jāgatā)            जागतो (jāgato)
    Hindi    लड़के (laṛake)             लड़को (laṛako)

             Feminine Singular         Feminine Plural
    Dogri    कुड़िये (kuṛiye)           कुड़ियो (kuṛiyo)
    Hindi    लड़की (laṛakī)             लड़कियो (laṛakiyo) [73-78]
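    Rules i and iii, together with the proper noun example (sūraj ko → sūrajai gī), suggest how a direct-approach system might build a Dogri noun phrase: put the noun in its oblique form, then append the Dogri postposition. The sketch below works on IAST strings and is an illustration only. The function names are hypothetical, and the stem changes before the suffix (ī → iy, ū → u) are inferred from the examples kuṛiyai, bāpuai, sādhueṃ and kāpiyeṃ rather than stated as a rule in the text. Rule ii (free variation of ī and u) and the plural of ā ending nouns are not covered.

```python
import unicodedata

A_LONG = "\u0101"   # IAST long a
I_LONG = "\u012b"   # IAST long i
U_LONG = "\u016b"   # IAST long u
NASAL = "\u1e43"    # IAST m with dot below


def _oblique_stem(noun: str) -> str:
    """Stem change before the oblique suffix (inferred from the examples)."""
    if noun.endswith(I_LONG):
        return noun[:-1] + "iy"
    if noun.endswith(U_LONG):
        return noun[:-1] + "u"
    return noun


def dogri_oblique(noun: str, plural: bool = False) -> str:
    """Return the Dogri oblique form of a noun given in IAST."""
    noun = unicodedata.normalize("NFC", noun)
    if noun.endswith(A_LONG):
        if plural:
            raise ValueError("plural of -aa nouns is not covered by the rules")
        return noun                               # rule i: no -ai after -aa
    suffix = "e" + NASAL if plural else "ai"      # rule iii / rule i
    return _oblique_stem(noun) + suffix


def dogri_phrase(noun: str, postposition: str, plural: bool = False) -> str:
    """Oblique noun followed by its postposition."""
    return dogri_oblique(noun, plural) + " " + postposition
```

    For instance, `dogri_phrase("sūraj", "gī")` reproduces the inflected proper noun of the Dogri example sentence, where Hindi keeps sūraj unchanged before ko.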

    4.5.2 Pronouns

    A pronoun is a word used in place of a noun. Pronouns are affected by grammatical categories such as number and case but not by gender; their gender is indicated by the verbal forms. Plural forms are also used for the singular to express honor, respect and politeness, e.g.

    Hindi: मालिक ने नौकर से कहा - हम मीटिंग में जा रहे हैं। (mālik ne naukar se kahā - ham mīṭiṃg meṃ jā rahe haiṃ)
    Dogri: मालकै नै नौकरै गी आखेआ - अस मीटिङा च जा’रने आं। (mālakai nai naukarai gī ākheā - as mīṭiṅā c jā’rane āṃ)

    (The boss said to the servant, "We are going to the meeting.") Here the plural form of the first person singular pronoun ‘maiṃ’ is used to show respect to oneself.

    Pronouns inflect for number and case. Vocative case inflection is not found. There are four case forms: Direct, Indirect (Nominative, Accusative), Instrumental and Genitive. The Genitive forms of pronouns also function as adjectives. In Dogri, pronouns are highly case-inflected and each pronoun can take various forms. The following table shows case inflection for the word आʼऊं (ā'ūṃ) / में (meṃ).

    Table 4-7: Forms of the word आʼऊं (ā'ūṃ) / में (meṃ)

    Case                                    Singular                                    Plural
    Nominative (कर्ता/kartā)                आʼऊं (ā'ūṃ) / में (meṃ)                     अस (as)
    Nominative (कर्ता/kartā)                में (meṃ)                                   असें (aseṃ)
    Accusative/Objective (कर्म/karma)       मिगी (migī), मी (mī)                        असें गी (aseṃ gī)
    Genitive (सम्बन्ध/sambandha)            मेरा (merā), मेरे (mere), मेरी (merī),      स्हाड़ा (shāṛā), स्हाड़ी (shāṛī),
                                            मेरियां (meriyāṃ), म्हाड़ा (mhāṛā),          स्हाड़ियां (shāṛiyāṃ), स्हाड़े (shāṛe)
                                            म्हाड़ी (mhāṛī), म्हाड़ियां (mhāṛiyāṃ),
                                            म्हाड़े (mhāṛe)

    In both Hindi and Dogri, personal pronouns distinguish three persons (first, second and third), two numbers (singular and plural) and two cases (direct and oblique). The following table shows Hindi and Dogri pronouns:

    Table 4-8: Hindi and Dogri pronouns

    Language (Singular / Plural)    First Person                            Second Person                                        Third Person
    Hindi                           मैं (maiṃ) / हम (ham)                   तू (tū), तुम (tum) / आप (āp)                         यह (yah), ये (ye), वह (vah), वे (ve)
    Dogri                           आʼऊं (ā'ūṃ), में (meṃ) / अस (as)        तूं (tūṃ), तुद्द (tudda), तोह् (toh) / तुस (tus)      एह् (eh), ओह् (oh)

    4.5.3 Adjectives

    In both Hindi and Dogri, adjectives are of two basic kinds: declinable (inflected) and indeclinable (uninflected). Among masculine adjectives, only those ending in आ (ā) are declinable; they change for gender, number and case. Among feminine adjectives, only those ending in ई (ī) inflect in Dogri, for number and case; in Hindi they do not. All other adjectives remain unchanged. The inflectional rules are as follows:

    i. Gender inflection: the final आ (ā) of a masculine adjective changes to ई (ī) to form the feminine, e.g.

             Singular Masculine                           Singular Feminine
    Dogri    चंगा जागत (caṃgā jāgat/good boy)             चंगी कुड़ी (caṃgī kuṛī/good girl)
             काला घोड़ा (kālā ghoṛā/black horse)           काली घोड़ी (kālī ghoṛī/black mare)
    Hindi    अच्छा लड़का (acchā laṛakā/good boy)           अच्छी लड़की (acchī laṛakī/good girl)
             काला घोड़ा (kālā ghoṛā/black horse)           काली घोड़ी (kālī ghoṛī/black mare)

    ii. Number inflection: to form the plural of आ (ā) ending masculine adjectives, the final आ (ā) changes to ए (e), e.g.

             Singular Masculine                           Plural Masculine
    Dogri    चंगा जागत (caṃgā jāgat/good boy)             चंगे जागत (caṃge jāgat/good boys)
             काला घोड़ा (kālā ghoṛā/black horse)           काले घोड़े (kāle ghoṛe/black horses)
    Hindi    अच्छा लड़का (acchā laṛakā/good boy)           अच्छे लड़के (acche laṛake/good boys)
             काला घोड़ा (kālā ghoṛā/black horse)           काले घोड़े (kāle ghoṛe/black horses)

    iii. In Dogri, ई (ī) ending singular feminine adjectives form their plural by appending यां (yāṃ), e.g.

    Singular Feminine                            Plural Feminine
    चंगी कुड़ी (caṃgī kuṛī/good girl)             चंगियां कुड़ियां (caṃgiyāṃ kuṛiyāṃ/good girls)
    काली घोड़ी (kālī ghoṛī/black mare)            कालियां घोड़ियां (kāliyāṃ ghoṛiyāṃ/black mares)

    The ई (ī) ending Hindi singular feminine adjectives do not inflect, e.g.

    Singular Hindi Feminine                      Plural Feminine
    अच्छी लड़की (acchī laṛakī/good girl)          अच्छी लड़कियां (acchī laṛakiyāṃ/good girls)
    काली घोड़ी (kālī ghoṛī/black mare)            काली घोड़ियां (kālī ghoṛiyāṃ/black mares)
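    Rules i to iii isolate the one point where Hindi and Dogri adjective agreement diverges: the feminine plural. A translation component could therefore share most of its agreement logic between the two languages. The following sketch works on IAST strings and covers direct-case forms only; the name `inflect_adjective` and its argument values ("m"/"f", "sg"/"pl", "hindi"/"dogri") are hypothetical choices made for this illustration.

```python
import unicodedata

A_LONG = "\u0101"   # IAST long a
I_LONG = "\u012b"   # IAST long i
NASAL = "\u1e43"    # IAST m with dot below


def inflect_adjective(adjective: str, gender: str, number: str,
                      language: str) -> str:
    """Inflect a masculine-singular adjective (IAST) for gender and number.

    Only -aa ending adjectives are declinable; all others are returned
    unchanged. Case inflection is not handled.
    """
    if gender not in ("m", "f") or number not in ("sg", "pl"):
        raise ValueError("gender must be 'm' or 'f', number 'sg' or 'pl'")
    if language not in ("hindi", "dogri"):
        raise ValueError("language must be 'hindi' or 'dogri'")
    adjective = unicodedata.normalize("NFC", adjective)
    if not adjective.endswith(A_LONG):
        return adjective                      # indeclinable
    stem = adjective[:-1]
    if gender == "m":
        return stem + (A_LONG if number == "sg" else "e")   # rule ii
    if number == "sg" or language == "hindi":
        return stem + I_LONG                  # rule i; Hindi keeps -ii in plural
    return stem + "iy" + A_LONG + NASAL       # rule iii: Dogri feminine plural
```

    Under these assumptions, a Hindi feminine plural such as acchī maps to a Dogri form in -iyāṃ, which is the agreement difference a Hindi to Dogri system has to generate explicitly.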

    iv. In Dogri, adjectives inflect for the Direct, Oblique and Vocative cases, but the masculine plural form does not inflect for case. Some examples of declinable and indeclinable adjectives are:

    Table 4-9: Some declinable and indeclinable adjectives

    Declinable:
    लम्मे बूह्टे (लम्मा)        lamme būhṭe/tall tree
    ठंडिया धारा (ठंडी)         ṭhaṃḍiyā dhārā/cold mountain range

    Indeclinable:
    शैल टल्ले                  śaila ṭalle/beautiful clothes
    प्याजी साफे                pyājī sāphe/pink turbans
    डरू जागतै                  ḍarū jāgatai/coward boy

    4.5.4 Postpositions

    Words which are used after a noun or a pronoun and which show its relation to some other word in the sentence are called postpositions. In Hindi they are not inflected for number, gender or case. Postpositions in Dogri are employed after words in the oblique case to denote case relationships. No postposition is used with nouns or pronouns in the direct case. Postpositions are also not employed with the first and second person in the genitive case. With the exception of the genitive दा (dā), all the postpositions used in Dogri are indeclinable; दा (dā) declines to agree with the number, gender and case of the preceding noun. Some of the important postpositions of Dogri are:

    खʼल्ल (kha'll/below, under), उप्पर (uppar/above), मझाटै (majhāṭai/in between), राहें (rāheṃ/through), साथें (sātheṃ/along with), कन्नै (kannai/with), तगर (tagar/till), बारै (bārai) etc.

    Some Hindi postpositions are:

    पहले (pahale/before), बाद (bād/after), आगे (āge/forward, afterward), पीछे (pīche/backward), बाहर (bāhar/outside), बीच (bīc/in the middle), ऊपर (ūpar/upside), etc. [73]

    Some of these postpositions are also used as adverbs. The difference is that postpositions are used with nouns or pronouns, whereas adverbs come with the verb and describe it. Other Hindi and Dogri postpositions that are used as case markers are shown in Table 4-6.

    4.5.5 Verbs

    A verb is a word used as a predicate denoting action. Verbs are mainly categorized as finite and non-finite. The verb structure can be simple, compound or causative. When a sentence has only one verb, it is a simple verb; if a sentence has two or more verbal bases, it is a compound verb. Compound verbs are a highly visible feature of Hindi and Dogri grammar. Verbs in Hindi as well as Dogri inflect for the categories listed below; the difference lies in the way they inflect.

    4.5.5.1 Inflection of Verb

    A finite verb inflects for gender, number, voice, tense, mood etc. In both languages the forms of the verb are made by adding suffixes for these categories to the root.

    4.5.5.1.1 Inflection of Verb for Gender

    आ(ā) and ई( ī)/ are the suffixes added to verb stem along with their suffixes to denote the masculine

    and feminine gender of the subject or object respectively.

    Masculine लड़का स्कूल जाता है। laṛakā skūl jātā hai. (The boy goes to school.)

    जागत स्कूल जंदा ऐ। jāgat skūl jaṃdā ai. (In Dogri)


    Feminine लड़की स्कूल जाती है। laṛakī skūl jātī hai. (The girl goes to school.)

    कुड़ी स्कूल जंदी ऐ। kuṛī skūl jaṃdī ai. (In Dogri)

    4.5.5.1.2 Inflection of Verb for Number

    Number is distinguished by inflection in all tenses, moods, aspects, etc. ए / e / is the plural suffix

    for the masculine and इआं / iāṃ / for the feminine; दा / dā / marks the singular and दे / de / the plural.

    E.g.

    Singular मोर नाच रहा है। mor nāc rahā hai. (The peacock is dancing.)

    मोर नच्चा दा ऐ। mor naccā dā ai. (In Dogri)

    Plural मोर नाच रहे हैं। mor nāc rahe haiṃ. (Peacocks are dancing.)

    मोर नच्चा दे न। mor naccā de na. (In Dogri)

    Here, rahā hai / dā ai is singular and rahe haiṃ / de na is plural.

    Plural forms are sometimes used instead of singular forms for elderly or respected people. E.g.

    Hindi: भीष्म पितामह तो ब्रह्मचारी थे। bhīṣm pitāmah to brahmacārī the.

    Dogri: भीष्म पितामह ते ब्रह्मचारी हे। bhīṣm pitāmah te brahmacārī he.

    Here, थे (the) / हे (he) is the third-person plural verb, used with a singular subject. [73-78]

    4.5.5.1.3 Voice / वाच्य / vācya

    Hindi and Dogri both have three voices: Active / कर्तृवाच्य / kartṛvācya, Passive / कर्मवाच्य / karmavācya

    and Impersonal / भाववाच्य / bhāvavācya. Dogri, however, is categorized as a passive-voiced language.

    Syncopation, a peculiarity of Dogri, is very prominent in the formation of the passive and impersonal

    voices. For example:


    Passive voice: लिखेआ गेआ / likheā geā / written (लखोआ / lakhoā), लिखेआ जंदा / likheā jaṃdā / being

    written (लखोंदा / lakhoṃdā), लिखेआ जाग / likheā jāg / shall be written (लखोग / lakhog).

    Impersonal voice: जान होआ / jān hoā (जनोआ / janoā), जान होंदा / jān hoṃdā (जनोंदा / janoṃdā), जान

    होग / jān hog (जनोग / janog). [73-78]

    4.5.5.1.4 Tense / काल / kāl

    The forms of the verb that indicate the time of action are called tenses. Like other languages, both Hindi

    and Dogri have three tenses: present, past and future.

    4.5.5.1.5 Mood

    In Dogri there are generally three moods: indicative, imperative and subjunctive. These moods are

    mostly expressed by combining the root with present and past participles. There is no one-to-one

    correspondence between the inflectional suffixes and the categories denoted by verbs; for person

    and mood in particular, the suffixes cannot be separated. [73-78]

    4.5.5.1.6 Aspect

    Aspect refers to the kind of action of a verb. There are four aspects in both the languages under study:

    imperfect, perfect, progressive and habitual. Various tense markers are used to express aspect.

    i. The imperfect aspect is expressed by using a tense marker with the present participle, e.g. in Dogri चलदा

    ऐ / caladā ai / goes and in Hindi चलता है (calatā hai).

    ii. To express the perfect aspect, the tense marker ऐ (ai) ‘is’, modified for number, gender,

    person, etc., is combined with the past participle, as in Dogri गेआ ऐ (geā ai) ‘has gone’

    and in Hindi गया है (gayā hai).


    iii. The progressive aspect represents the action as in progress and not yet ended. The progressive

    forms are made with the help of the root कर (kar). The present participle of this root is combined with

    the main verb. The main verb form is made by attaching आ (ā) / ऐ (ai) to the root. The tense marker

    ऐ / ai / or हा / hā / ‘is / was’ expresses the tense, as in जा करदा ऐ। (jā karadā ai) ‘is going’

    and in Hindi जा रहा है। (jā rahā hai).

    iv. To express the habitual aspect in Dogri, the present participle of the root हो / ho / is combined with the

    present participle of the main verb before the tense marker, e.g. करदा होंदा ऐ। (karadā hoṃdā

    ai); in Hindi, किया करता है। (kiyā karatā hai). [73-78]

    4.5.5.2 Compound Verbs

    Compound verbs consist of a verbal stem plus an auxiliary verb. The auxiliary (variously called

    "subsidiary", "explicator verb" and "vector") loses its own independent meaning and instead "lends a

    certain shade of meaning" to the main (stem) verb, which "comprises the lexical core of the compound",

    e.g.

    Hindi | Dogri

    मार दिया (māra diyā / killed) | मारी ओड़ेआ (mārī oṛeā)

    जाना पड़ा (jānā paṛā / had to go) | जाना पेआ (jānā peā)

    आ गया (ā gayā / has come) | आई गेआ (āī geā)

    Of the auxiliary verbs दिया (ओड़ेआ), पड़ा (पेआ) and गया (गेआ), पड़ा (पेआ) means kept and गया (गेआ) means gone. They lose their

    independent meanings in the above phrases and denote completion of the action,

    compulsion for the doer and completion of the action respectively. [73-78]


    4.5.5.2.1 Formation of the main verb in a compound verb in Dogri:

    i. Conjunctive participle: the conjunctive participle is formed in two ways:

    a) by adding इयै to the stem of the main verb, e.g. कर + इयै = करियै

    b) by adding ई to the stem of the main verb, e.g. कर + ी (ई) = करी

    Form (b) is used as the main verb in compound verb formation, e.g.

    Dogri जागत आई गेआ ऐ (jāgat āī geā ai / the boy has come)

    जागत उट्ठी बैठा ऐ (jāgat uṭṭhī baiṭhā ai / the boy woke up)

    In Hindi, the root is used as the main verb, e.g.

    Hindi लड़का आ गया (laṛakā ā gayā)

    लड़का जाग गया (laṛakā jāg gayā)

    ii. Infinitive verbs are formed by adding ना (nā) to the root of the main verb.

    Auxiliary verbs such as पौना (paunā), चाह्ना (cāhnā) or लोड़दा (loṛadā) are used to denote

    compulsion for the doer to do that particular act, e.g.

    करना पौना (karanā paunā); in Hindi, करना पड़ेगा (karanā paṛegā / have to do).

    iii. Imperfect participle: the present participle is formed in two ways:

    a) The suffix द (d), along with the inflectional suffixes, is added to a consonant-ending root, e.g.

    करदा (karadā), करदी (karadī), करदे (karade), करदियां (karadiyāṃ), etc.

    b) The suffix न्द (nd), along with the inflectional suffixes, is added to a vowel-ending root, for

    example रोन्दी (rondī), पीन्दा (pīndā), etc.

    The present participle combines with auxiliary words such as औना (aunā), रौह्ना (rauhnā),

    जाना (jānā) and होना (honā). Some examples are given below:

    Dogri | Hindi

    तुपदा रेहा होग। (tupadā rehā hog) | ढूंढता रहा होगा। (ḍhūṃḍhatā rahā hogā)

    खंदी होंदी ही। (khaṃdī hoṃdī hī) | खाती रहती थी। (khātī rahatī thī)

    टुरदा आया ऐ। (ṭuradā āyā ai) | चलता आया है। (calatā āyā hai)

    iv. Perfect participle: the perfect participle is formed as follows:

    a) त (t) is suffixed to vowel-ending roots along with the inflectional suffixes, e.g. the root पी

    (pī / drink) takes the following forms with number and gender inflection: पीता

    (pītā), पीती (pītī), पीते (pīte).

    b) With consonant-ending roots, ए (e) is added before the number and gender

    suffixes for the masculine, e.g. टुरेआ (ṭureā / he walked, singular), टुरे (ṭure / they walked, plural).

    c) ई (ī) is appended to the root before the number suffixes in the case of the

    feminine, e.g. टुरी / ṭurī (singular), टुरियां / ṭuriyāṃ (plural).

    The auxiliary verb रौह्ना (rauhnā) is mostly used with the past participle to give the sense of continuity in both

    languages.

    Root ending with आ (ā): when the main verb ends with आ (ā), the auxiliary word करना (karnā) is added.

    It gives the sense of habit, repetition or continuity of the action, e.g. आ’ऊं करा करना। (aūṃ karā

    karanā / I’m doing), आ’ऊं पढ़ा करना। (aūṃ paṛhā karanā / I’m studying).

    In Hindi, the root alone serves as the main verb, e.g.

    मैं पढ़ रहा हूं। (maiṃ paṛha rahā hūṃ / I’m studying)

    वो जा रहा है। (vo jā rahā hai / he is going)
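
    The imperfect (present) participle rule of Section 4.5.5.2.1 can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in the system: the function names, the gender and number labels, and the test for a "vowel-ending" root (last character is a Devanagari independent vowel or vowel sign) are assumptions. The output follows the न्द spelling of the rule; elsewhere in this chapter the same forms are written with an anusvara (e.g. जंदा).

```python
# Inflectional endings as they appear on the root कर (kar) in the text:
# करदा, करदी, करदे, करदियां. The gender/number labels are an assumption
# based on Sections 4.5.5.1.1 and 4.5.5.1.2.
ENDINGS = {
    ("m", "sg"): "ा",
    ("f", "sg"): "ी",
    ("m", "pl"): "े",
    ("f", "pl"): "ियां",
}


def ends_in_vowel(word):
    """True if the last character is an independent vowel or a vowel sign."""
    last = word[-1]
    return "\u0904" <= last <= "\u0914" or "\u093e" <= last <= "\u094c"


def imperfect_participle(root, gender, number):
    """Attach द to consonant-ending roots and न्द to vowel-ending roots."""
    marker = "न्द" if ends_in_vowel(root) else "द"
    return root + marker + ENDINGS[(gender, number)]


print(imperfect_participle("कर", "m", "sg"))
print(imperfect_participle("रो", "f", "sg"))
print(imperfect_participle("पी", "m", "sg"))
```

    The three calls reproduce the forms करदा, रोन्दी and पीन्दा quoted in the rule.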


    4.5.5.3 Causative Verbs

    In Hindi as well as in Dogri, the verb itself undergoes some changes to give its causal form, and the

    causative verb takes the same forms (number, gender, mood, etc.). Many verbs have two causative forms,

    first causative and second causative, but some verbs have only one causative, such as

    देना (denā / to give), दोआना (doānā / to cause to give).

    In Dogri, first causative verbs are formed by appending आऽ / ā̕ to the root. To form the second

    causative verb, ओआऽ / oā̕ is added to the root or its modified form. In Hindi, causative

    verbs can often be formed by appending वा / vā to intransitive verb stems. First causative verbs

    in Hindi are formed by appending ाना / ānā to the root. To form the second causative verb, वाना / vānā is

    added to the root or its modified form. The following table shows examples of the first and second

    causative forms in Hindi and Dogri.

    Table 4-10: Examples of first and second causative forms in Hindi and Dogri

    Language | Root word | First causative form | Second causative form

    Dogri | पढ़ (paṛh) “read” | पढ़ाऽ (paṛhā̕) “to cause to read” | पढ़ोआऽ (paṛhoā̕) “to have someone read”

    Hindi | पढ़ (paṛh) | पढ़ा / पढ़ाओ (paṛhā / paṛhāo) | पढ़वाओ (paṛhvāo)

    4.5.5.3.1 Some rules for the formation of causative verbs in Dogri are as under:

    i. In two-letter roots, the initial vowels आ / ā /, ई / ī /, ऊ / ū /, etc. are shortened and ल / l / is

    inserted between the shortened vowel and the causal affixes, e.g. पी (pī / drink) / पलाऽ

    (palā̕) / पलोआऽ (paloā̕)

    ii. Metathesis is another feature of causative verb formation in Dogri. Metathesis mostly

    occurs between the first two letters, especially when उ / u / and tonal vowels are in initial


    position. E.g. उतार (utār) / तुआर (tuār) / तरोआऽ (taroā̕). In the case of polysyllabic roots, the

    causal affix आ / ā / is inserted before the third letter of the root, e.g. हस्स (hass) / स्हाऽ

    (shā̕) / स्होआऽ (shoā̕). [73]
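
    The regular suffixation pattern of Section 4.5.5.3 can be illustrated with a short Python sketch. It is a simplification under stated assumptions: the function names are hypothetical, only consonant-ending roots are handled (so the affixes are written with vowel signs), and the special rules of Section 4.5.5.3.1 (vowel shortening with ल insertion, metathesis) are not covered.

```python
def dogri_causatives(root):
    """Return (first, second) causative of a consonant-ending Dogri root.

    First causative: root + आऽ; second causative: root + ओआऽ.
    After a consonant the initial vowel of the affix is written as a matra.
    """
    return root + "ाऽ", root + "ोआऽ"


def hindi_causatives(root):
    """Return (first, second) causative of a consonant-ending Hindi root.

    First causative: root + ाना; second causative: root + वाना.
    """
    return root + "ाना", root + "वाना"


print(dogri_causatives("पढ़"))
print(hindi_causatives("पढ़"))
```

    For the root पढ़ (paṛh) the Dogri function yields the two forms listed in Table 4-10; the Hindi function yields the infinitive forms of the causatives rather than the imperative forms shown in the table.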

    4.5.6 Adverb

    Adverbs accompany the verb and describe it. In Hindi, adverbs are indeclinable; in Dogri, however, an adverb

    in its आ / ā-ending form is inflected like an adjective. Such adverbs in Dogri inflect for

    gender and number. For example:

    Gender | Singular | Plural

    Masculine | रामु! तौले-तौले चल। | रामु ते शामु! तौले-तौले चलो।

    Feminine | रामी! तौले-तौले जा। | रामी ते शामी! तौले-तौले जाओ। [77]

    Table 4-11: Declension of Dogri adverbs

    Gender inflection

    Masculine | रामु! तौला तौला चल। | rāmu! taulā taulā cal / Ramu, walk fast.

    Feminine | रामी! तौली तौली जा। | rāmī! taulī taulī jā / Rami, go fast.

    Number inflection

    Singular (masculine) | रामु! तौला तौला चल। | rāmu! taulā taulā cal / Ramu, walk fast.

    Plural (masculine) | रामु ते शामु! तौले-तौले चलो। | rāmu te śāmu! taule-taule calo / Ramu and Shamu, walk fast.

    Singular (feminine) | रामी! तौली तौली जा। | rāmī! taulī taulī jā / Rami, go fast.

    Plural (feminine) | रामी ते शामी! तौलियां-तौलियां जाओ। | rāmī te śāmī! tauliyāṃ-tauliyāṃ jāo / Rami and Shami, go fast.
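
    The declension in Table 4-11 amounts to choosing an ending by gender and number and reduplicating the adverb. A minimal Python sketch, assuming a hypothetical function name and the stem तौल (taul) extracted from the table entries:

```python
# Endings read off Table 4-11: तौला, तौली, तौले, तौलियां.
ADVERB_ENDINGS = {
    ("m", "sg"): "ा",
    ("f", "sg"): "ी",
    ("m", "pl"): "े",
    ("f", "pl"): "ियां",
}


def decline_adverb(stem, gender, number, sep=" "):
    """Inflect an ā-ending Dogri adverb and reduplicate it."""
    form = stem + ADVERB_ENDINGS[(gender, number)]
    return form + sep + form


print(decline_adverb("तौल", "m", "sg"))
print(decline_adverb("तौल", "f", "pl", sep="-"))
```

    The separator is a parameter because the table writes the reduplicated adverb both with a space and with a hyphen.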


    4.6 Study with regard to Machine Translation

    In developing any machine translation system, a comparative study of the language pair to be

    translated is most important, since it brings out the similarities as well as the differences between

    the two languages. The above study has been very helpful for machine translation from Hindi

    into Dogri. It informed, among other things, the choice of the translation method,

    the identification of collocations in Dogri and the development of the morphological analyzer. Some of

    these are explained briefly below.

    • Choice of Translation Method

    A language pair is said to be closely related if the two languages have grammars that are close in

    structure, contain similar constructs with almost the same semantics, and share a great deal of lexicon.

    Hindi and Dogri are very similar with regard to script and word order, and much of the lexicon is

    shared. It has been observed that Hindi and Dogri share all the features of closely related

    languages, such as common roots, similar alphabets, structural similarity, similar parts of speech and

    grammatical categories, and similar religio-cultural contexts. For such closely related sibling languages,

    effective translation can be achieved by word-for-word translation. Thus, the direct machine translation

    approach is chosen for machine translation from Hindi into Dogri.
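
    The idea of word-for-word (direct) translation can be illustrated with a toy Python sketch. It is not the system described in this thesis: the lexicon contains only word pairs taken from the example sentences of Section 4.5.5.1.1, and the function name is hypothetical.

```python
# Toy Hindi-to-Dogri lexicon built only from example sentences in this chapter.
# A real system would rely on a full bilingual dictionary (assumption).
LEXICON = {
    "लड़का": "जागत",
    "लड़की": "कुड़ी",
    "जाता": "जंदा",
    "जाती": "जंदी",
    "है": "ऐ",
}


def translate_direct(sentence):
    """Translate word by word, keeping the Hindi word order."""
    words = sentence.split()
    # Words missing from the lexicon (e.g. shared vocabulary such as स्कूल)
    # pass through unchanged.
    return " ".join(LEXICON.get(word, word) for word in words)


print(translate_direct("लड़का स्कूल जाता है"))
```

    Because the two languages share word order, no reordering step is needed; the differences discussed in this chapter (inflection, ambiguity, special morphemes) are what the additional modules have to handle.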

    • Development of Rules for Morphological Analyzer

    Morphology is the branch of linguistics that studies the internal structure of words. A root word

    can take many forms. For example, the root word (or stem) खा (khā / eat) is the same in both Hindi and

    Dogri, but the two languages differ in the way it is inflected for number, case and gender. E.g. the stem

    खा (khā) can be inflected to take the following forms in Hindi:

    खाओ (khāo), खाया (khāyā), खाएं (khāeṃ), खाता (khātā), खाती (khātī), खाते (khāte), खाना (khānā), खाने (khāne),

    etc.


    For translating such Hindi words into Dogri words, a morphological analyzer is required.

    As used here, the morphological analyzer is a tool for generating a word given the stem and its affixes. Hindi and

    Dogri are both highly inflected languages that share a great deal of lexicon; the difference is in the

    inflection of the root word. This study has been helpful in designing rules for the morphological

    analyzer.
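
    A minimal Python sketch of stem-and-affix handling is given below for illustration. The suffix list is taken from the खा (khā) forms listed above and is not exhaustive; the function names and the longest-suffix-first strategy are assumptions, not the rules actually used in the system.

```python
# Suffixes read off the Hindi forms of खा listed in the text.
HINDI_SUFFIXES = ["ओ", "या", "एं", "ता", "ती", "ते", "ना", "ने"]


def generate_forms(stem, suffixes=HINDI_SUFFIXES):
    """Generate surface forms by attaching each suffix to the stem."""
    return [stem + suffix for suffix in suffixes]


def analyze(word, suffixes=HINDI_SUFFIXES):
    """Split a word into (stem, suffix), trying longer suffixes first.

    Returns None when no known suffix matches.
    """
    for suffix in sorted(suffixes, key=len, reverse=True):
        if word.endswith(suffix) and len(word) > len(suffix):
            return word[: -len(suffix)], suffix
    return None


print(generate_forms("खा"))
print(analyze("खाती"))
```

    Once a Hindi word is split into stem and suffix, the stem can be looked up in the bilingual lexicon and the suffix mapped to its Dogri counterpart.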

    • Ambiguity

    Ambiguity here means that a word with the same spelling has different meanings. If such words are not

    handled, the sense of the sentence can change. E.g. से (se) takes about seven forms in Dogri (कन्नै / kannai, दा / dā, कोला /

    kolā, चा / cā, गै / gai, गी / gī, जेये / jeye, etc.), depending on the context of the sentence. To

    preserve the meaning of the sentence, disambiguation is done. Some sentences are given in the table

    below to show the different forms of the word से (se):

    Table 4-12: Various forms of the Hindi word से (se) in Dogri

    [The Hindi and Dogri cells of this table are illegible in the source. The contexts of से (se) illustrated are:

    with fear; come from there; said to Ram; out of which; innocent kids; 5th to 10th June;

    fell from the roof; especially; do it properly.]


    • Handling Kar (कर): This module handles the morpheme कर / kar; it is an important part of

    the system and has increased the system accuracy. In Hindi, the word कर (kar) takes

    two forms:

    a) as a morpheme meaning ‘do’;

    b) as a conjunctive participle in a compound verb.

    In Dogri, कर (kar) is replaced with

    a) इयै (iyai) for vowel-ending words, e.g. जाकर / jākar (जाइयै / jāiyai)

    b) ियै (iyai, written with the vowel sign) for consonant-ending words, e.g. रखकर / rakh kar (रक्खियै / rakkhiyai)

    c) If कर (kar) appears after echo words, e.g. दौड़-दौड़ कर, then कर (kar) is again

    replaced with

    i. इयै (iyai), and ई (ī) is appended to the previous word, for vowel-ending words, e.g.

    जा-जाकर / jā-jākar (जाई-जाइयै / jāī-jāiyai)

    ii. ियै (iyai), and ी (ī) is appended to the previous word, for consonant-ending words, e.g.

    रख-रखकर / rakh-rakh kar (रक्खी-रक्खियै / rakkhī-rakkhiyai)

    Handling कर (kar) in Hindi sentences is important for improving the accuracy of the system.
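
    The replacement of कर (kar) can be sketched in Python as follows. This is an illustrative sketch under several assumptions: the independent-vowel form इयै is taken to follow vowel-ending stems (as in जाइयै) and the vowel-sign form ियै to follow consonant-ending stems (as in रक्खियै); the small stem table, including the Dogri stem रक्ख for Hindi रख, is hypothetical; and the echo-word case is not covered.

```python
# Hypothetical Hindi-stem -> Dogri-stem table holding only the two verbs used
# as examples in the text. Restricting the rule to known verb stems avoids
# rewriting nouns that merely end in कर (e.g. नौकर).
VERB_STEMS = {"जा": "जा", "रख": "रक्ख"}


def ends_in_vowel(word):
    """True if the last character is an independent vowel or a vowel sign."""
    last = word[-1]
    return "\u0904" <= last <= "\u0914" or "\u093e" <= last <= "\u094c"


def conjunctive(dogri_stem):
    """Attach the conjunctive-participle ending to a Dogri verb stem."""
    # Assumption: इयै after a vowel, the vowel-sign form ियै after a consonant.
    return dogri_stem + ("इयै" if ends_in_vowel(dogri_stem) else "ियै")


def replace_kar(word):
    """Rewrite a Hindi 'stem + कर' form as a Dogri conjunctive participle."""
    stem = word[:-2]
    if word.endswith("कर") and stem in VERB_STEMS:
        return conjunctive(VERB_STEMS[stem])
    return word


print(replace_kar("जाकर"))
print(replace_kar("रखकर"))
```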

    • Handling special cases pertaining to Dogri

    i. Handling words preceding the morphemes रहा / rahā, रही / rahī and रहे / rahe


    a) In Dogri, if the word preceding रहा / rahā is vowel-ending, then आ (ā) is added to it and

    रहा is changed to दा / dā.

    If it is consonant-ending, the vowel sign ा (ā) is added to it and रहा is changed to दा.

    b) If the word is रही, then, as in the previous case, आ (ā) or ा is appended to the preceding

    word and रही is changed to दी.

    c) If the word is रहे, then, as in the other two cases, आ (ā) or ा is appended to the preceding

    word, but रहे takes two forms depending on the word next to it. The two cases are

    described below:

    i. If रहे / rahe is followed by हो / ho, then रहे हो is replaced with ने.

    ii. If रहे / rahe is followed by हैं / haiṃ, then रहे is replaced with दे.
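
    The rules for रहा / रही / रहे can be sketched over a list of tokens in Python. This is a sketch under stated assumptions, not the module used in the system: the function names are hypothetical, a "vowel-ending" word is taken to be one whose last character is a Devanagari independent vowel or vowel sign, रहे outside the two listed contexts is also mapped to दे, and the translation of the remaining words (e.g. है to ऐ) is left to the rest of the pipeline.

```python
AUX_MAP = {"रहा": "दा", "रही": "दी"}


def ends_in_vowel(word):
    """True if the last character is an independent vowel or a vowel sign."""
    last = word[-1]
    return "\u0904" <= last <= "\u0914" or "\u093e" <= last <= "\u094c"


def handle_raha(tokens):
    """Apply the रहा / रही / रहे rules to a list of Hindi tokens."""
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        if tok in ("रहा", "रही", "रहे") and out:
            # Extend the preceding word: आ after a vowel, ा after a consonant.
            out[-1] += "आ" if ends_in_vowel(out[-1]) else "ा"
            nxt = tokens[i + 1] if i + 1 < len(tokens) else None
            if tok == "रहे" and nxt == "हो":
                out.append("ने")  # रहे हो -> ने (both tokens are consumed)
                i += 2
                continue
            # रहे (before हैं, and by assumption elsewhere) -> दे
            out.append("दे" if tok == "रहे" else AUX_MAP[tok])
        else:
            out.append(tok)
        i += 1
    return out


print(handle_raha(["वह", "रो", "रहा", "है"]))
print(handle_raha(["क्या", "कर", "रहे", "हो"]))
```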

    The following table depicts the various forms of रहा (rahā):

    Table 4-13: Examples to illustrate handling rahā

    Hindi | Dogri

    वह रो रहा है (vaha ro rahā hai / he is crying) | ओह् रोआ दा ऐ (oh roā dā ai)

    वह पढ़ रही है (vaha paṛha rahī hai / she is studying) | ओह् पढ़ा दी ऐ (oh paṛhā dī ai)

    लोग दौड़ रहे हैं (loga dauṛa rahe haiṃ / people are running) | लोग नस्सा दे न (loga nassā de na)

    क्या कर रहे हो? (kyā kara rahe ho / what are you doing?) | केह् करा ने (keh karā ne)


    ii. Handling words preceding the morphemes लगा / lagā, लगी / lagī and लगे / lage

    In Dogri, the inflection े (e) is deleted from the word preceding लगा (lagā),

    e.g. Hindi: मैं काम करने लगा हूं (maiṃ kāma karane lagā hūṃ)

    Dogri: में काम्म करन लग्गा आं (meṃ kāmma karana laggā āṃ)
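
    The लगा (lagā) rule can be sketched in Python as below. The function name is hypothetical, and the sketch assumes that the rule applies to the word immediately before लगा / लगी / लगे; the lexical changes visible in the example (काम to काम्म, लगा to लग्गा) belong to dictionary lookup and are not modelled.

```python
LAG_FORMS = {"लगा", "लगी", "लगे"}


def handle_laga(tokens):
    """Drop a final े from the word immediately before लगा / लगी / लगे."""
    out = list(tokens)
    for i in range(1, len(out)):
        if out[i] in LAG_FORMS and out[i - 1].endswith("े"):
            out[i - 1] = out[i - 1][:-1]
    return out


print(handle_laga(["मैं", "काम", "करने", "लगा", "हूं"]))
```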

    4.7 Summary: The objective of this chapter was to identify the grammatical and inflectional

    similarities and differences between Hindi and Dogri. The observations drawn from this

    study helped in choosing the appropriate method for developing the machine translation

    system, in framing the rules for inflectional analysis, and in identifying collocations, ambiguous

    words, etc. This study will also be useful in understanding the inflections of Dogri, which can support

    further research on translation between Dogri and other languages and the development of other

    translation tools.