DISTANCE MEASURE BETWEEN TWO BIOLOGICAL SEQUENCES

Shweta Kumari
M. Sc. Bioinformatics, 1st semester
Roll no. 21



CONTENT

INTRODUCTION
HAMMING DISTANCE
EXAMPLE OF HAMMING DISTANCE
LIMITATIONS OF HAMMING DISTANCE
EDIT DISTANCE
RULES FOR EDIT DISTANCE
EXAMPLE OF EDIT DISTANCE
HAMMING DISTANCE V/S EDIT DISTANCE
REFERENCES


INTRODUCTION

A distance measure quantifies how different two biological sequences are, which allows us to infer their similarity and possibly related biological functions.

A small distance, that is, high similarity, suggests homology between the sequences.

It also helps us to infer evolutionary relationships.

Distance can be calculated by these methods:

Hamming distance
Edit distance


HAMMING DISTANCE

Hamming distance is a measure of dissimilarity between two sequences of equal length, i.e., the number of positions at which the corresponding characters differ.

It is the minimum number of substitutions required to change one sequence into the other.

It was introduced by Richard Hamming in 1950.


EXAMPLE OF HAMMING DISTANCE

ATCGGTAGT
ATGGTTCCT

The Hamming distance between these two sequences is 4 (they differ at positions 3, 5, 7 and 8).
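The count above can be reproduced with a few lines of Python. This is only a minimal sketch: the function name hamming_distance is chosen here for illustration and does not come from the slides.

```python
def hamming_distance(seq_a: str, seq_b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        # Hamming distance is only defined for sequences of equal length.
        raise ValueError("Hamming distance needs sequences of equal length")
    # Compare the sequences character by character and count mismatches.
    return sum(a != b for a, b in zip(seq_a, seq_b))


print(hamming_distance("ATCGGTAGT", "ATGGTTCCT"))  # 4
```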


LIMITATIONS OF HAMMING DISTANCE

Both sequences must have the same length.

Two highly similar sequences of unequal length cannot be compared position by position.

Insertions and deletions cannot be taken into account; only substitutions are counted.

The distance between two sequences of unequal length can be measured in an alternative way called EDIT DISTANCE.
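To illustrate the limitation, the sketch below compares a sequence with a copy that has lost its first base. The two sequences and the helper name mismatches are hypothetical, chosen only for this illustration. A naive position-by-position comparison reports a mismatch at every shared position, although the sequences differ by a single deletion.

```python
def mismatches(seq_a: str, seq_b: str) -> int:
    """Position-by-position mismatch count over the shared length.

    zip() stops at the end of the shorter sequence, so any extra
    tail of the longer sequence is silently ignored.
    """
    return sum(a != b for a, b in zip(seq_a, seq_b))


full = "ATGCATGC"
shorter = "TGCATGC"  # hypothetical: the same sequence with the first base deleted

print(mismatches(full, shorter))  # 7
```

Because one deletion shifts every following base, all 7 compared positions disagree, which is why a measure that allows insertions and deletions is needed.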


EDIT DISTANCE

Edit distance measures the dissimilarity between two sequences as the minimum number of edit operations (substitution, insertion and deletion) needed to transform one into the other.

It is also called LEVENSHTEIN DISTANCE, after Vladimir Levenshtein, who introduced it in 1965 (the English translation of his paper appeared in 1966).

It indicates how close two sequences are, which is useful for various purposes:

To find genes or proteins that may share functions or properties.

To infer family relationships and evolutionary trees across different organisms.
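The slides do not say how the minimum number of operations is found. One common way is the standard dynamic programming recurrence, sketched below under the assumption that each substitution, insertion and deletion costs 1. The function name edit_distance is chosen here for illustration.

```python
def edit_distance(source: str, target: str) -> int:
    """Levenshtein distance with unit cost for each edit operation."""
    # previous[j] holds the distance between the processed prefix of
    # source and target[:j]; initially the prefix of source is empty.
    previous = list(range(len(target) + 1))
    for i, s_char in enumerate(source, start=1):
        # Turning source[:i] into the empty string needs i deletions.
        current = [i]
        for j, t_char in enumerate(target, start=1):
            substitution = previous[j - 1] + (s_char != t_char)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1]


print(edit_distance("ACGT", "AGT"))  # 1 (delete C)
```

Only two rows of the table are kept at a time, so memory grows with the length of the target sequence rather than with the product of both lengths.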


RULES FOR EDIT DISTANCE

Edit operations (substitutions and indels) must be performed on only one sequence.

After gaps are introduced, the aligned sequences must have the same length.

Every character or gap of the first sequence should be aligned with a character or gap of the other sequence.


EXAMPLE OF EDIT DISTANCE

ATGCA
-T-CA

Minimum number of edit operations = 2 (two deletions: A and G)
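The operations in this example can be counted from a gapped alignment of ATGCA and TCA. The sketch below assumes the alignment ATGCA over -T-CA, with "-" as the gap symbol; both the gap symbol and the function name count_edit_operations are assumptions made for illustration. It counts the operations implied by a given alignment and does not by itself prove that the alignment is minimal.

```python
def count_edit_operations(aligned_a: str, aligned_b: str, gap: str = "-") -> int:
    """Count the substitutions and indels implied by a gapped alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    operations = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap and b == gap:
            raise ValueError("a column cannot pair a gap with a gap")
        if a != b:
            # A gap against a character is an indel; two different
            # characters are a substitution. Each counts as one operation.
            operations += 1
    return operations


print(count_edit_operations("ATGCA", "-T-CA"))  # 2
```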


HAMMING DISTANCE V/S EDIT DISTANCE

Unlike Hamming distance, edit distance allows us to compare sequences of different lengths.

Hamming distance is a simple position-by-position comparison that counts only substitutions, while edit distance additionally allows insertion and deletion operations.


REFERENCES

http://www.maths.manchester.ac.uk/~pas/code/notes/part2.pdf

http://en.wikipedia.org/wiki/Hamming_distance

http://schatzlab.cshl.edu/teaching/2011/2011.Lecture3.Sequence%20Alignment.pdf

https://www.princeton.edu/~achaney/tmve/wiki100k/docs/Hamming_distance.html

http://en.wikipedia.org/wiki/Edit_distance


THANK YOU