First-Character Filtering Method in Syllable Segmentation using Data Dictionary for Myanmar Language

  1. HTET MYET LYNN, 20147728, Department of Computer Engineering, Chosun University, Intelligent Computing Laboratory. First-Character Filtering Method in Syllable Segmentation using Data Dictionary for Myanmar Language
  2. Contents: Nature of Myanmar Words; First-Character Filtering (FCF) Method; Implementation & Result; Future Work
  3. Nature of Myanmar words (1/3): Like other South-East Asian languages, Myanmar is written with no delimiter (whitespace) between words, only between phrases. There are no standard rules for whitespace. The script has 33 consonants and 10 digits.
  4. Nature of Myanmar words (2/3): [figure with parts labelled 1 to 5]
  5. Nature of Myanmar words (3/3): Kinzi; Stacked Consonants; Writing Methods
  6. First-Character Filtering (FCF) Method (1/8): Input Sentence; Get First Character; Syllable Collections; Output Syllable
  7. First-Character Filtering (FCF) Method (2/8): Syllable Collections
  8. First-Character Filtering (FCF) Method (3/8): Syllable Collections
  9. First-Character Filtering (FCF) Method (4/8): Sentence Pre-processing: Whitespace; Punctuation Marks; Number of Input Sentences; Length of Each Sentence
  10. First-Character Filtering (FCF) Method (5/8): Get First Character of the sentence. Input_txt_length = 160
  11. First-Character Filtering (FCF) Method (6/8): Extract Candidates from Syllable Collections
  12. First-Character Filtering (FCF) Method (7/8): Extract Candidates from Syllable Collections. Input_txt_length = 160; Length_of_syl = 1, 4, 8, 12, ... For each $syllable in the collection, with $length_of_syl set to its length: $candidate = substr($Input_txt, 0, $length_of_syl); if ($candidate == $syllable) { Store_Candidate($candidate); } // store candidate in Candidates.txt
  13. First-Character Filtering (FCF) Method (8/8): Store Final Syllable in Results.txt. Final_syllable_length = 20, so Input_txt_length drops from 160 to 140. Destroy candidates.txt.
  14. Implementation & Result
  15. Implementation & Result
  16. Future Work: Algorithms for loan words, Kinzi syllables, stacked-consonant syllables, and word segmentation
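
The FCF steps on slides 6 to 13 can be sketched roughly as follows. This is a minimal illustration written in Python rather than the PHP-like notation of slide 12, and it rests on several assumptions that the slides do not confirm: the final syllable is taken to be the longest matching candidate, an unmatched character is emitted on its own, and punctuation is limited to a small hypothetical set. The names SYLLABLES, build_index, preprocess and segment are invented for this example, and short Latin strings stand in for the Myanmar syllables of the real data dictionary.

```python
import re
from collections import defaultdict

# Hypothetical toy syllable collection (stand-in for the data dictionary).
SYLLABLES = ["ka", "kan", "kant", "ma", "mar", "ta"]


def build_index(syllables):
    """Group syllables by first character, longest first in each group."""
    index = defaultdict(list)
    for syl in syllables:
        index[syl[0]].append(syl)
    for bucket in index.values():
        bucket.sort(key=len, reverse=True)
    return index


def preprocess(text):
    """Split text into sentences on punctuation and drop all whitespace.

    Assumption: the punctuation set is the Myanmar section marks
    U+104A and U+104B plus a few ASCII marks.
    """
    parts = re.split(r"[\u104A\u104B.,!?;:]+", text)
    return ["".join(p.split()) for p in parts if p.strip()]


def segment(sentence, index):
    """Segment one pre-processed sentence into syllables."""
    result = []
    pos = 0
    while pos < len(sentence):
        first = sentence[pos]
        # First-character filter: only syllables starting with `first`
        # are compared against the remaining input.
        candidates = [s for s in index.get(first, [])
                      if sentence.startswith(s, pos)]
        # Assumption: the longest candidate becomes the final syllable;
        # a character with no candidate is emitted by itself.
        final = candidates[0] if candidates else first
        result.append(final)
        pos += len(final)  # the remaining input shrinks by this length
    return result


index = build_index(SYLLABLES)
print(segment("kantmarta", index))  # ['kant', 'mar', 'ta']
```

Filtering by first character keeps each comparison restricted to one bucket of the collection instead of the whole dictionary. Under the longest-match assumption, the sketch keeps candidates in memory instead of writing and destroying a candidates file at every step.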