1
A VERSATILE EXACT PATTERN MATCHING ALGORITHM FOR BIOLOGICAL SEQUENCES Latha Parthiban Assistant Professor of Computer Science Pondicherry University Community College Puducherry, India ABSTRACT Pattern matching is essential for locating nucleotide or amino acid sequence patterns in the biological databases. In this paper, Zhu-Takaoka Fast Search algorithm (ZTFS) algorithms are modified to reduce the computational time complexity and also the number of shifts done while comparing the pattern with the given string. The experimental results on biological sequence of Enterobacteria phage lambda and Measles virus proves that the modified algorithm works faster than other existing algorithms, especially in case of longer patterns of nucleotide sequences. The algorithm has also been extended to classify the input biological sequences into different possible combinations of tricodons and amino acids present in the sequence. Computational complexity calculations and performance analysis for character comparisons are presented to indicate the potential reduction in time complexity of the proposed algorithm. KEY WORDS: Pattern matching, Amino acids, Character comparisons, Nucleotide sequences, Measles virus ث نسهست انبيىنىجيتماخذاسخيقت مخعذدة اذ انخىارسميت انذقحذي ح انمهخصىكهيىحيذاثقع انى نخحذيذ مىساسيبقت انىمظ ا يعخبز مطاnucleotide ىيمي انحمط ا أوماط أو حسهسمamino acid ت. فيث انبيىنىجيعذ بياوا في قىا خىارسميت حعذيم هذي انىرقت حم(ZTFS) مظ حهك انسهسهت. و قذ أثبخجل مفاروت ونمعمىل بها خث ا عذد انخحىنمعقذ و أيضابي انحساخفيط انىقج ا نخ فيزوست فينيت انبيىنىجينمخخات في ابي انخجزي ويائجEnterobacteria phage Lambda و فيزو س انحصبتMeasles virus انخىارسميتم أن عمىكهيذحيذهت نعيىت انىهسهت انطىيىصا في حانت سيت انمىجىدة و خصزع مه انخىارسم انمعذنت أسNucleotide ارسميت نخصىيفضا حىسيع خى . و قذ حم أيميىيت وض احماهسهت اث انممكىت في سنخىنيفاخهف ات في مخث انبيىنىجينياث مخىا مذخtricodons حهيمسابيت و حيم انخعقيذاث انح حمث . و قذ حميت انمفخزحت.يذاث وقج انخىارسمم في حعقض انمحخموخفانت عهى انمىصفاث نذروت انفعانيت نمقا ا

A VERSATILE EXACT PATTERN MATCHING ALGORITHM … · A VERSATILE EXACT PATTERN MATCHING ALGORITHM FOR BIOLOGICAL ... Zhu-Takaoka Fast Search algorithm ... and Measles virus proves

  • Upload
    dokhanh

  • View
    223

  • Download
    0

Embed Size (px)

Citation preview

A VERSATILE EXACT PATTERN MATCHING ALGORITHM FOR

BIOLOGICAL SEQUENCES Latha Parthiban

Assistant Professor of Computer Science

Pondicherry University Community College

Puducherry, India

ABSTRACT Pattern matching is essential for locating nucleotide or amino acid sequence patterns in the biological databases. In

this paper, Zhu-Takaoka Fast Search algorithm (ZTFS) algorithms are modified to reduce the computational time

complexity and also the number of shifts done while comparing the pattern with the given string. The experimental

results on biological sequence of Enterobacteria phage lambda and Measles virus proves that the modified algorithm

works faster than other existing algorithms, especially in case of longer patterns of nucleotide sequences. The

algorithm has also been extended to classify the input biological sequences into different possible combinations of

tricodons and amino acids present in the sequence. Computational complexity calculations and performance analysis

for character comparisons are presented to indicate the potential reduction in time complexity of the proposed

algorithm.

KEY WORDS: Pattern matching, Amino acids, Character comparisons, Nucleotide sequences, Measles virus

ححذيذ انخىارسميت انذقيقت مخعذدة اإلسخخذاماث نسهست انبيىنىجيت

انمهخص

في قىاعذ بياواث انبيىنىجيت. في amino acidأو حسهسم أوماط انحمط األميىي nucleotideيعخبز مطابقت انىمظ اساسي نخحذيذ مىقع انىىكهيىحيذاث

نخخفيط انىقج انحسابي انمعقذ و أيضا عذد انخحىالث انمعمىل بها خالل مفاروت ومظ حهك انسهسهت. و قذ أثبخج (ZTFS)هذي انىرقت حم حعذيم خىارسميت

أن عمم انخىارسميت Measles virusس انحصبت و فيزو Enterobacteria phage Lambdaويائج انخجزيبيت في انمخخانيت انبيىنىجيت في فيزوس

. و قذ حم أيضا حىسيع خىارسميت نخصىيف Nucleotideانمعذنت أسزع مه انخىارسميت انمىجىدة و خصىصا في حانت سهسهت انطىيهت نعيىت انىىكهيذحيذ

. و قذ حم حمثيم انخعقيذاث انحسابيت و ححهيم tricodonsمذخالث مخىانياث انبيىنىجيت في مخخهف انخىنيفاث انممكىت في سهسهت األحماض األميىيت و

انفعانيت نمقاروت انمىصفاث نذالنت عهى اوخفاض انمحخمم في حعقيذاث وقج انخىارسميت انمفخزحت.