Comp. Genomics
Recitation 13
Genome rearrangements
Homework solutions
Exercise 1
• Two haploid, single-chromosome genomes G1 and G2 were sequenced. G1 is an ancestor of G2. G1 is represented by the unsigned permutation 1,2,…,n.
• The region gi,…,gj is known as a “tough chromosomal region”. Reversal events never create breakpoints in this region.
Exercise 1
• Assume that G2 was generated from G1 by the minimal number of reversal events needed to obtain G2.
• Give an upper bound on the number of reversal events that occurred during the evolution from G1 to G2.
Solution 1
• We can apply the same reversals in reverse order to G2 to obtain G1.
• E.g., if a single reversal transformed G1=12345 into G2=14325, we can apply a reversal on the same indices and get G1 back.
• So if we show a series of k reversals that transforms G2 into G1, then k is an upper bound.
Solution 1
• Genes 1,…,i-1 appear in G2 before position i or after position j. In the worst case, we need i-1 reversal operations to get these genes into their correct order.
• Then we have in G2:
1,2,…,i-1,TOUGH_REGION,REST_OF_GENES
where the TOUGH_REGION is either i,i+1,…,j or j,j-1,…,i
Solution 1
• At most one more reversal is needed to flip the TOUGH_REGION into the order i,i+1,…,j.
• We can fix the REST_OF_GENES region in n-j-1 reversal operations, and in total we get (i-1)+1+(n-j-1) = n-(j-i)-1
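The counting argument above places one gene per reversal. A minimal Python sketch of that idea is given here, under stated assumptions: the function names (`reverse_segment`, `greedy_sort`, `tough_region_bound`) are hypothetical, and the greedy sort ignores the tough region. It only illustrates why k genes can be put in order with at most k reversals, and evaluates the bound from the solution.

```python
def reverse_segment(perm, lo, hi):
    """Return a copy of perm with positions lo..hi (0-based, inclusive) reversed."""
    return perm[:lo] + perm[lo:hi + 1][::-1] + perm[hi + 1:]


def greedy_sort(perm):
    """Sort an unsigned permutation of 1..n using one reversal per misplaced gene."""
    perm = list(perm)
    steps = []
    for pos in range(len(perm)):
        where = perm.index(pos + 1)  # current position of gene pos+1
        if where != pos:
            perm = reverse_segment(perm, pos, where)
            steps.append((pos, where))
    return perm, steps


def tough_region_bound(n, i, j):
    """Bound from the solution: (i-1) + 1 + (n-j-1) = n-(j-i)-1."""
    return (i - 1) + 1 + (n - j - 1)


sorted_perm, steps = greedy_sort([1, 4, 3, 2, 5])
print(sorted_perm, steps)
print(tough_region_bound(10, 4, 7))
```

In the slide's example, G2=14325 is sorted back to G1 by a single reversal on the same indices.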
Exercise 2
• A breakpoint is a location i in the sequence such that |π_i − π_{i+1}| ≠ 1.
• Prove or refute: Out of n/2 reversals on the unsigned permutation 1,2,…,n, there is at least one reversal that cancels a breakpoint at some index.
• A reversal operates on a subsequence.
• Note that a reversal can both cancel a breakpoint and create new ones.
Solution 2
• Can you refute it?
• The claim is false.
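• Consider the permutation (1,2,3) and the reversals (1,2,3) → (1,3,2) → (3,1,2) → (1,3,2) → …
• Breakpoint at each adjacency: (1,2,3): No, No; (1,3,2): Yes, No; (3,1,2): Yes, No; (1,3,2): Yes, No. None of these reversals cancels a breakpoint.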
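The counterexample can be checked mechanically. The sketch below assumes the breakpoint definition without framing elements 0 and n+1, which matches the counts used elsewhere in these slides. The helper names `breakpoints` and `cancelled` are hypothetical.

```python
def breakpoints(perm):
    """Set of 0-based indices i with |perm[i] - perm[i+1]| != 1."""
    return {i for i in range(len(perm) - 1) if abs(perm[i] - perm[i + 1]) != 1}


def cancelled(before, after):
    """Indices that are breakpoints before the reversal but not after it."""
    return breakpoints(before) - breakpoints(after)


# The sequence of permutations from the solution.
sequence = [(1, 2, 3), (1, 3, 2), (3, 1, 2), (1, 3, 2)]
cancellations = [cancelled(p, q) for p, q in zip(sequence, sequence[1:])]
print(cancellations)
```

Every entry is the empty set. The last two reversals can be alternated indefinitely, so the sequence can be made as long as needed.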
Exercise 3
• Two reversals occur on the permutation 1,2,…,n. How many breakpoints can occur in the resulting permutation?
Solution 3
• One reversal:
1 2 3 4 5 6 7
1 7 6 5 4 3 2 one breakpoint
1 6 5 4 3 2 7 two breakpoints
Solution 3
• Two reversals:
1 2 3 4 5 6 7
1 6 5 4 3 2 7
1 2 3 4 5 6 7 zero breakpoints
Solution 3
• Two reversals:
1 2 3 4 5 6 7
1 7 6 5 4 3 2
3 4 5 6 7 1 2 one breakpoint
Solution 3
• Two reversals:
1 2 3 4 5 6 7
1 7 6 5 4 3 2
1 3 4 5 6 7 2 two breakpoints
Solution 3
• Two reversals:
1 2 3 4 5 6 7
1 6 5 4 3 2 7
1 6 2 3 4 5 7 three breakpoints
Solution 3
• Two reversals, four breakpoints (the maximum, since each reversal changes at most two adjacencies):
1 2 3 4 5 6 7
1 6 5 4 3 2 7
1 6 5 3 4 2 7 four breakpoints
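The case analysis can be confirmed by brute force for n=7. This sketch assumes a reversal acts on at least two elements and, as in the slides, counts breakpoints without framing elements. The function names are hypothetical.

```python
from itertools import combinations


def breakpoints(perm):
    """Number of adjacent pairs whose values differ by more than 1."""
    return sum(1 for a, b in zip(perm, perm[1:]) if abs(a - b) != 1)


def reversals(perm):
    """Yield every permutation reachable from perm by one reversal of length >= 2."""
    for lo, hi in combinations(range(len(perm)), 2):
        yield perm[:lo] + perm[lo:hi + 1][::-1] + perm[hi + 1:]


def counts_after_two_reversals(n):
    """Set of breakpoint counts reachable from 1..n by exactly two reversals."""
    identity = tuple(range(1, n + 1))
    return {breakpoints(q) for p in reversals(identity) for q in reversals(p)}


print(sorted(counts_after_two_reversals(7)))
```

For n=7 the reachable counts are 0 through 4, matching the examples on the slides.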
DCJ Algorithm
• Why does it run in linear time?
DCJ Algorithm – cont’d
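• dDCJ(A,B) = N – (C + I/2), where N is the number of genes and C and I are the numbers of cycles and odd paths in the adjacency graph of A and B.
• Each iteration increments either C by one or I by two, so it decreases the distance by one.
• Our genome representation allows us to find and perform each sorting operation in constant time.
• The DCJ distance is never larger than N, so there are at most N constant-time iterations.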
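To make the formula concrete, here is a hedged sketch that computes N, C and I and evaluates dDCJ. It computes the distance only and is not the linear-time sorting algorithm from the lecture. The genome encoding (a list of `(genes, is_circular)` pairs with signed integers) and all function names are assumptions made for this illustration. Components are traced over gene extremities, which correspond to the edges of the adjacency graph: a component with no telomere is a cycle, and a path with an odd number of extremities is an odd path.

```python
def extremities(gene):
    """(left, right) extremities of a signed gene; 't' = tail, 'h' = head."""
    g = abs(gene)
    return ((g, 't'), (g, 'h')) if gene > 0 else ((g, 'h'), (g, 't'))


def adjacencies(genome):
    """Map each extremity to its neighbouring extremity, or None at a telomere."""
    partner = {}
    for genes, circular in genome:
        ends = [extremities(g) for g in genes]
        for (_, right), (left, _) in zip(ends, ends[1:]):
            partner[right], partner[left] = left, right
        first_left, last_right = ends[0][0], ends[-1][1]
        if circular:
            partner[last_right], partner[first_left] = first_left, last_right
        else:
            partner[first_left] = partner[last_right] = None
    return partner


def dcj_distance(a, b):
    """N - (C + I/2) for two genomes over the same gene set."""
    adj_a, adj_b = adjacencies(a), adjacencies(b)
    seen, cycles, odd_paths = set(), 0, 0
    for start in adj_a:
        if start in seen:
            continue
        seen.add(start)
        stack, size, has_telomere = [start], 0, False
        while stack:
            x = stack.pop()
            size += 1
            for y in (adj_a[x], adj_b[x]):
                if y is None:
                    has_telomere = True
                elif y not in seen:
                    seen.add(y)
                    stack.append(y)
        if not has_telomere:
            cycles += 1
        elif size % 2 == 1:
            odd_paths += 1
    n_genes = len(adj_a) // 2
    return n_genes - (cycles + odd_paths // 2)


print(dcj_distance([([1, -2, 3], False)], [([1, 2, 3], False)]))
```

A single inversion gives distance 1, and identical genomes give distance 0.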
Question from the Moed A (first sitting) exam, 5767 (2006/07)
• A genome is a set of chromosomes, where each chromosome is a sequence of signed integers. Together, the chromosomes contain the integers 1,…,n without repetitions.
• For example, G={(1,-2,3),(4,5,6,-7)} is a genome with two chromosomes. We assume that a chromosome and its reverse with opposite signs are equivalent.
• Therefore, (4,5,6,-7) is equivalent to (7,-6,-5,-4).
Question from the Moed A (first sitting) exam, 5767 (2006/07)
• A reversal operation reverses the order and the signs of a contiguous segment within a single chromosome. Therefore, a single reversal on the first chromosome of G can produce the genome {(1,-3,2),(4,5,6,-7)}.
• A translocation operation exchanges end segments of two chromosomes (where one of the segments may be empty). For example, a translocation on G can produce the genome {(1,-2,5,6,-7),(4,3)}.
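The two operations can be sketched in Python as follows. The representation (a genome as a tuple of tuples of signed integers, with 0-based indices and cut positions) and the function names are assumptions made for this illustration. This translocation swaps suffixes only; swapping a prefix with a suffix can be obtained by first replacing one chromosome with its equivalent reverse.

```python
def reverse_chromosome(chrom):
    """Reverse the order and flip the signs; the result is an equivalent chromosome."""
    return tuple(-g for g in reversed(chrom))


def reversal(genome, c, lo, hi):
    """Reverse order and signs of positions lo..hi (inclusive) in chromosome c."""
    chrom = genome[c]
    new = chrom[:lo] + reverse_chromosome(chrom[lo:hi + 1]) + chrom[hi + 1:]
    return genome[:c] + (new,) + genome[c + 1:]


def translocation(genome, c1, cut1, c2, cut2):
    """Swap the suffixes of chromosomes c1 and c2 starting at cut1 and cut2."""
    a, b = genome[c1], genome[c2]
    out = list(genome)
    out[c1], out[c2] = a[:cut1] + b[cut2:], b[:cut2] + a[cut1:]
    return tuple(ch for ch in out if ch)  # drop a chromosome that became empty


G = ((1, -2, 3), (4, 5, 6, -7))
print(reverse_chromosome(G[1]))
print(reversal(G, 0, 1, 2))
print(translocation(G, 0, 2, 1, 1))
```

The three calls produce (7,-6,-5,-4), {(1,-3,2),(4,5,6,-7)} and {(1,-2,5,6,-7),(4,3)} for the genome G={(1,-2,3),(4,5,6,-7)} defined in the question.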
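Question from the Moed A (first sitting) exam, 5767 (2006/07)
• The problem is to transform a given genome into another genome using a minimal number of reversal and translocation operations.
• Give an algorithm that guarantees a constant performance ratio for the problem and runs in polynomial time.
Solution
• The problem is equivalent to signed reversal.
• In class we saw a polynomial-time 2-approximation solution.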
HW 3 question 5
• Uniform lifted alignment – a lifted alignment in which, at each level, all strings are lifted either from the right child or from the left child.
• Prove that the optimal uniform lifted alignment has cost at most twice the cost of the optimal tree alignment.
• Give a polynomial algorithm to find the optimal uniform lifted alignment.
HW 3 question 5
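• Uniform lifted alignment, proof:
• Assume we had the optimal tree T*.
• Transform it in the following way:
• To assign strings at level k, consider the two candidate assignments for the whole level: all lifted from the left child, or all lifted from the right child.
• Pick the one with the minimal sum of distances to the optimal labels S*.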
HW 3 – question 5 – cont’d
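• Assign each ‘costly’ edge (T,S), i.e. an edge whose endpoints carry different lifted strings, to a path in the optimal tree:
• The path from leaf (T) to node (S*).
[Figure: a node labelled S (optimal label S*) whose children are labelled T and S, with the leaf T below]
• Together, these paths cover all edges of the tree.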
HW 3 – question 5 – cont’d
• By the triangle inequality: D(S, T) ≤ D(S, S*) + D(S*, T)
• By the choice of left/right: Σ_S [D(S,S*) + D(S*,T)] ≤ Σ_S [D(S*,T) + D(S*,T)] = Σ_S 2D(S*,T)
• => One-sided tree with cost at most twice the optimal.
HW 3 – question 5 – cont’d
• Algorithm:
• Preprocess pairwise sequence distances.
• Try all different left/right assignments for each level, and pick the minimal one.
• Running time (n sequences of length m):
• Preprocessing: O(m^2 n^2).
• Height h, so 2^h different assignments.
• Calculating the cost of a tree: O(n).
• Total: O(m^2 n^2 + 2^h n), which is polynomial when h = O(log n), e.g. for a balanced tree.
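A minimal sketch of the brute-force search is given below, under several assumptions not stated in the slides: the tree is a nested pair of leaf strings, a "level" is the depth of an internal node (root = 0), and plain edit distance stands in for D. Distances are memoised with `lru_cache` instead of a separate preprocessing pass. All names are hypothetical.

```python
from functools import lru_cache
from itertools import product


@lru_cache(maxsize=None)
def edit_distance(a, b):
    """Unit-cost edit distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]


def height(tree):
    """Number of internal levels; a leaf (a string) has height 0."""
    return 0 if isinstance(tree, str) else 1 + max(height(tree[0]), height(tree[1]))


def lift(tree, sides, depth=0):
    """Return (label, cost); nodes at `depth` lift from child sides[depth] (0=left, 1=right)."""
    if isinstance(tree, str):
        return tree, 0
    l_label, l_cost = lift(tree[0], sides, depth + 1)
    r_label, r_cost = lift(tree[1], sides, depth + 1)
    label = l_label if sides[depth] == 0 else r_label
    cost = l_cost + r_cost + edit_distance(label, l_label) + edit_distance(label, r_label)
    return label, cost


def best_uniform_lifted(tree):
    """Try all 2^h left/right assignments and return (minimal cost, assignment)."""
    h = height(tree)
    return min((lift(tree, sides)[1], sides) for sides in product((0, 1), repeat=h))


tree = (("AAAA", "AAAT"), ("AAAA", "TTTT"))
print(best_uniform_lifted(tree))
```

The loop over `product((0, 1), repeat=h)` is the 2^h factor in the running time, and each call to `lift` visits every node once.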
HW 1 question 1
• Question 1: Explain how to compute local alignment in linear space
• The linear space algorithm from the lecture is a global alignment algorithm
solution
[Figure: DP matrix over x and y, contrasting the local alignment with the global alignment]
solution
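• For every cell [i,j] in the DP matrix, add a field b[i,j], the start cell of the best local alignment ending at [i,j], updated as follows:
• If the score of [i,j] is 0 then b[i,j]=(i,j)
• Otherwise
  • If the score came from the diagonal (match or substitution): b[i,j]=b[i-1,j-1]
  • If it came from above (x[i] against a gap): b[i,j]=b[i-1,j]
  • If it came from the left (y[j] against a gap): b[i,j]=b[i,j-1]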
solution
• Use the linear-space algorithm from class (keeping only the previous row) to compute the score of the optimal local alignment.
• At the same time, the field b[i,j] can be updated for every cell, also in linear space.
• Now, “cut out” the small matrix between b[i*,j*] and the cell with the optimal score [i*,j*], and run Hirschberg's linear-space global alignment on it.
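The first two steps can be sketched as a two-row DP that carries the start cell along with each score. The scoring parameters (match=1, mismatch=-1, gap=-1) and the function name are assumptions made for this illustration. The final Hirschberg step on the cut-out substrings is not implemented here.

```python
def local_alignment_span(x, y, match=1, mismatch=-1, gap=-1):
    """Return (best score, start cell b[i*,j*], end cell [i*,j*]) using two rows."""
    prev = [(0, (0, j)) for j in range(len(y) + 1)]  # row 0: (score, start cell)
    best = (0, (0, 0), (0, 0))
    for i in range(1, len(x) + 1):
        cur = [(0, (i, 0))]
        for j in range(1, len(y) + 1):
            sub = match if x[i - 1] == y[j - 1] else mismatch
            candidates = [
                (prev[j - 1][0] + sub, prev[j - 1][1]),  # diagonal
                (prev[j][0] + gap, prev[j][1]),          # from above: x[i] vs gap
                (cur[j - 1][0] + gap, cur[j - 1][1]),    # from the left: y[j] vs gap
            ]
            score, start = max(candidates, key=lambda c: c[0])
            if score <= 0:
                score, start = 0, (i, j)                 # alignment restarts here
            cur.append((score, start))
            if score > best[0]:
                best = (score, start, (i, j))
        prev = cur                                       # keep only one previous row
    return best


score, (bi, bj), (ei, ej) = local_alignment_span("TTACGTAA", "GGACGTCC")
print(score, "TTACGTAA"[bi:ei], "GGACGTCC"[bj:ej])
```

The substrings x[bi:ei] and y[bj:ej] are the "cut out" region: aligning them globally in linear space yields the optimal local alignment itself.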