Genome Rearrangements Unoriented Blocks

Genome Rearrangements

Unoriented Blocks

Quick Review

Looking at evolutionary change through reversals

Find the shortest possible series of reversals that transforms genome A into genome B

For unoriented blocks, this has been shown to be an NP-hard problem

Oriented Blocks

1 2 3 4 5

5 2 1 3 4

1 2 3 4 5

1 2 5 4 3

1 2 5 3 4

5 2 1 3 4

Unoriented Blocks

Orientation of the blocks in the genomes is unknown

2 1 3 7 5 4 8 6

1 2 3 4 5 6 7 8

Definitions

unoriented permutation - a mapping from {1,2,…,n} to a set L of n labels.

reversal – reverses the order of a segment of consecutive labels.
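
A reversal can be sketched in a few lines of Python. This is only an illustration: the function name `reverse_segment` and the 0-based, inclusive positions are assumptions made here, not notation from the slides.

```python
def reverse_segment(perm, i, j):
    """Return a copy of perm with the labels at positions i..j (0-based, inclusive) reversed."""
    return perm[:i] + perm[i:j + 1][::-1] + perm[j + 1:]


alpha = [1, 2, 3, 4, 5]
# Reversing the last three labels of the identity.
print(reverse_segment(alpha, 2, 4))  # [1, 2, 5, 4, 3]
```

Returning a new list rather than modifying `perm` in place makes it easy to keep every intermediate permutation of a series of reversals.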

Definitions (cont.)

reversal distance – if p1, p2, …, pt is a shortest series of reversals such that αp1p2…pt = β, then t is the reversal distance of α with respect to β, denoted by dβ(α)

Example 1

The figure below shows two chromosomes with homologous blocks

2 1 3 7 5 4 8 6

1 2 3 4 5 6 7 8

• Assign labels 1 through 8 to the blocks in the lower chromosome

• Transfer the labels to the upper chromosome, giving equal labels to homologous blocks

• We obtain a starting permutation in the upper chromosome, and our goal is to sort it into the lower one, the identity

Example 1 (cont.)

2 1 3 7 5 4 8 6

1 2 3 7 5 4 8 6

1 2 3 4 5 7 8 6

1 2 3 4 5 7 6 8

1 2 3 4 5 6 7 8
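
The series above can be replayed mechanically. The sketch below is illustrative only: `apply_reversals` is a name chosen here, and each reversal is given as a pair of 0-based, inclusive positions read off from the rows above.

```python
def apply_reversals(perm, reversals):
    """Apply (i, j) reversals in order and return every intermediate permutation."""
    steps = [list(perm)]
    for i, j in reversals:
        cur = steps[-1]
        # Reverse positions i..j (0-based, inclusive) of the current permutation.
        steps.append(cur[:i] + cur[i:j + 1][::-1] + cur[j + 1:])
    return steps


# Segments reversed in Example 1: (2 1), (7 5 4), (8 6), (7 6).
for row in apply_reversals([2, 1, 3, 7, 5, 4, 8, 6], [(0, 1), (3, 5), (6, 7), (5, 6)]):
    print(*row)
```

The printed rows reproduce the five permutations listed above, ending in the identity after four reversals.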

Best Solution?

How do we know that this is the shortest series of reversals?

To decide what the reversal distance should be, we look at the breakpoints

Breakpoints

A breakpoint of an unoriented permutation α is a pair of labels adjacent in α but not in the target.

In the case of the identity, this means adjacent labels that are not consecutive.

Example 2

Assume the identity is the target…

Breakpoints with oriented blocks:

L 5 2 1 3 4 R

Breakpoints with unoriented blocks (dots mark the breakpoints):

L . 5 . 2 1 . 3 4 . R
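
Counting breakpoints of an unoriented permutation against the identity can be sketched as follows. The encoding of L as 0 and R as n + 1 is an assumption of this sketch (it matches the "0, 1, …" and "…, n, n+1" wording used for strips later on), and the function name is made up for illustration.

```python
def breakpoints(perm):
    """List the adjacent pairs of the framed permutation that are breakpoints."""
    # Assumption: L is encoded as 0 and R as n + 1, so the target is 0, 1, ..., n + 1.
    framed = [0] + list(perm) + [len(perm) + 1]
    # With unoriented blocks a pair is not a breakpoint when its labels differ by exactly 1.
    return [(a, b) for a, b in zip(framed, framed[1:]) if abs(a - b) != 1]


print(breakpoints([5, 2, 1, 3, 4]))  # [(0, 5), (5, 2), (1, 3), (4, 6)]
```

For L 5 2 1 3 4 R this gives four breakpoints: the pairs 2 1 and 3 4 are adjacent in the identity, all other pairs are not.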

Example 2 (cont.)

L 2 1 3 7 5 4 8 6 R

• b(α) denotes the number of breakpoints of α

• a reversal can remove at most two breakpoints, hence:

d(α) ≥ b(α) / 2

where d(α) is the reversal distance

• the permutation above has b(α) = 7, so this rule gives d(α) ≥ 4
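
The lower bound can be computed directly. This is a minimal sketch, assuming L = 0 and R = n + 1 and the identity as target; the function names are illustrative. Because d(α) is an integer, the bound b(α) / 2 is rounded up.

```python
def count_breakpoints(perm):
    """Number of breakpoints b of perm against the identity (L = 0, R = n + 1 assumed)."""
    framed = [0] + list(perm) + [len(perm) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if abs(a - b) != 1)


def reversal_lower_bound(perm):
    """Smallest integer that is at least b / 2, since one reversal removes at most two breakpoints."""
    return (count_breakpoints(perm) + 1) // 2


alpha = [2, 1, 3, 7, 5, 4, 8, 6]
print(count_breakpoints(alpha), reversal_lower_bound(alpha))  # 7 4
```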

Strips

L 4 5 3 2 1 R

If we have two adjacent labels that do not make a breakpoint, they must be of the form:

… x (x+1) …

or

… x (x-1) …

Strips (cont.)

strip – a sequence of consecutive labels surrounded by breakpoints but with no internal breakpoints

Two types of strips: increasing and decreasing

Special Rules

A single label surrounded by breakpoints is said to be a strip that is both increasing and decreasing

L and R are always considered part of an increasing strip, even if they are by themselves

L and R are considered a single element for the purpose of defining strips. If 0, 1, … is a strip and …, n, n+1 is a strip (with L = 0 and R = n+1), we consider these two sequences as a single strip. They are linked by the common element L = R.

Example 3

L 1 2 8 7 3 5 6 4 R

Strips

increasing: (R,L,1,2) (5,6)

decreasing: (8,7)

both: (3) (4)
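
A strip decomposition like the one in Example 3 can be sketched as below. Assumptions of this sketch: L is encoded as 0 and R as n + 1, the function name `strips` is illustrative, and the strips containing L and R are reported separately instead of being merged into one strip through L = R.

```python
def strips(perm):
    """Split the framed permutation into strips and classify each one."""
    n = len(perm)
    framed = [0] + list(perm) + [n + 1]  # assumed encoding: L = 0, R = n + 1
    runs, start = [], 0
    for i in range(len(framed)):
        # A strip ends at the last position or wherever a breakpoint follows.
        if i == len(framed) - 1 or abs(framed[i] - framed[i + 1]) != 1:
            runs.append(framed[start:i + 1])
            start = i + 1
    classified = []
    for run in runs:
        if 0 in run or n + 1 in run:
            kind = "increasing"          # L and R always count as increasing
        elif len(run) == 1:
            kind = "both"                # a single label is increasing and decreasing
        else:
            kind = "increasing" if run[0] < run[1] else "decreasing"
        classified.append((kind, run))
    return classified


for kind, run in strips([1, 2, 8, 7, 3, 5, 6, 4]):
    print(kind, run)
```

Here the strip [0, 1, 2] and the strip [9] together correspond to the single strip (R,L,1,2) of the example.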

Theorem 1

If label k belongs to a decreasing strip and k - 1 belongs to an increasing strip, then there is a reversal that removes at least one breakpoint

L 4 5 2 3 1 7 6 R (here k = 6 and k - 1 = 5)

Proof

Labels k – 1 and k must belong to different strips, since only single elements are said to be both increasing and decreasing.

It follows that each one is the last element in its strip (each is followed by a breakpoint): k would be followed by k – 1 if its decreasing strip continued, and k – 1 would be followed by k if its increasing strip continued.

Proof (cont.)

Two possible schemes:

… (k - 1) … k …

… k … (k - 1) …

Performing a reversal on the area between the breakpoints brings k and k-1 together, reducing the number of breakpoints by at least one.

Example 4

L 4 5 2 3 1 7 6 R (k = 6, k - 1 = 5; the segment 2 3 1 7 6 is reversed)

L 4 5 6 7 1 3 2 R
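
The reversal used in Theorem 1 can be sketched as "cut after k and after k - 1, then reverse what lies between the cuts". The code below is an illustration under the assumption L = 0 and R = n + 1; the function names are made up here.

```python
def count_breakpoints(framed):
    """Breakpoints of a framed permutation (L = 0 and R = n + 1 already included)."""
    return sum(1 for a, b in zip(framed, framed[1:]) if abs(a - b) != 1)


def unite_with_predecessor(perm, k):
    """Reverse the segment between the breakpoints that follow k and k - 1."""
    framed = [0] + list(perm) + [len(perm) + 1]
    i, j = sorted((framed.index(k), framed.index(k - 1)))
    # The cuts are after positions i and j, so positions i + 1 .. j are reversed.
    framed[i + 1:j + 1] = framed[i + 1:j + 1][::-1]
    return framed[1:-1]


before = [4, 5, 2, 3, 1, 7, 6]
after = unite_with_predecessor(before, 6)
print(after)  # [4, 5, 6, 7, 1, 3, 2]
print(count_breakpoints([0] + before + [8]), count_breakpoints([0] + after + [8]))  # 5 4
```

With k = 6 the labels 5 and 6 become adjacent and the number of breakpoints drops from 5 to 4.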

Observations

All permutations have at least one increasing strip (the one containing L or R)

Not every permutation has a decreasing strip

If there is a decreasing strip, the previous proof shows that there is a breakpoint-removing reversal: take k to be the smallest label in a decreasing strip, so that k - 1 must belong to an increasing strip

Theorem 2

If label k belongs to a decreasing strip and k + 1 belongs to an increasing strip, then there is a reversal that removes at least one breakpoint.

L 5 4 2 3 1 6 7 R (here k = 5 and k + 1 = 6)

Proof

Two possible schemes:

(k + 1) … k …

k … (k + 1) …

Performing a reversal on the area between the breakpoints brings k and k+1 together, reducing the number of breakpoints by at least one.

Example 5

L 5 4 2 3 1 6 7 R (k = 5, k + 1 = 6; the segment 5 4 2 3 1 is reversed)

L 1 3 2 4 5 6 7 R
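
For Theorem 2 the cuts fall before k and before k + 1 instead. A minimal sketch, again assuming L = 0 and R = n + 1 and using illustrative function names:

```python
def count_breakpoints(framed):
    """Breakpoints of a framed permutation (L = 0 and R = n + 1 already included)."""
    return sum(1 for a, b in zip(framed, framed[1:]) if abs(a - b) != 1)


def unite_with_successor(perm, k):
    """Reverse the segment between the breakpoints that precede k and k + 1."""
    framed = [0] + list(perm) + [len(perm) + 1]
    i, j = sorted((framed.index(k), framed.index(k + 1)))
    # The cuts are before positions i and j, so positions i .. j - 1 are reversed.
    framed[i:j] = framed[i:j][::-1]
    return framed[1:-1]


before = [5, 4, 2, 3, 1, 6, 7]
after = unite_with_successor(before, 5)
print(after)  # [1, 3, 2, 4, 5, 6, 7]
print(count_breakpoints([0] + before + [8]), count_breakpoints([0] + after + [8]))  # 4 2
```

In this instance the reversal happens to remove two breakpoints, since it also brings L next to 1; the theorem only guarantees one.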

The Result

The two proofs just explained show that, as long as we have decreasing strips, we can always reduce the number of breakpoints.

Notice that this also applies to single-element strips

What about when there are no decreasing strips?

Theorem 3

Let α be a permutation with a decreasing strip. If all reversals that remove breakpoints from α leave no decreasing strips, then there is a reversal that removes two breakpoints from α.

Proof

Let k be the smallest label involved in a decreasing strip.

p is the reversal uniting k and k – 1

k – 1 must be to the left of k, otherwise p leaves a decreasing strip.

… (k – 1) … k …

Proof (cont.)

Let ℓ be the largest label involved in a decreasing strip.

σ is the reversal uniting ℓ and ℓ + 1

ℓ + 1 must be to the right of ℓ, otherwise σ leaves a decreasing strip.

… ℓ … (ℓ + 1) …

Proof (cont.)

Observe that k must be inside the interval reversed by σ, otherwise σ would leave k's decreasing strip intact.

Likewise, ℓ must be inside the interval reversed by p.

… (k – 1) ℓ … k (ℓ + 1) …

Proof (cont.)

… (k – 1) ℓ … k (ℓ + 1) …

We can see that p = σ must be true

The reversal removes two breakpoints because k is united with k – 1 and ℓ is united with ℓ + 1

Example 6

L 7 8 3 5 4 6 1 2 R

Reversals that remove breakpoints:

L 7 8 3 5 4 6 1 2 R (k - 1 = 3, ℓ = 5, k = 4, ℓ + 1 = 6)

L 7 8 3 4 5 6 1 2 R

Sorting a Permutation

We can use an algorithm that sorts a permutation using at most 2 * d(α) reversals (that is, twice as many reversals as the minimum possible)

The algorithm assumes that the target is the identity (1, 2, 3, 4, …)

General Idea

A main loop looks at the current permutation and selects the best possible reversal to apply

Update the current permutation and report the reversal applied

The loop stops when the current permutation is the identity

Choosing the Reversal

If there is a decreasing strip, look for a reversal that reduces the number of breakpoints and leaves a decreasing strip.

If no such reversal exists, there is a reversal that encompasses all the decreasing strips and removes two breakpoints.

If there are no decreasing strips, select a reversal that cuts two breakpoints.

Sorting Algorithm

Algorithm: Sorting Unoriented Permutation
input: permutation α
output: series of reversals that sort α

    list ← empty
    while α ≠ I do
        if α has a decreasing strip then
            k ← smallest label in a decreasing strip
            p ← reversal that cuts after k and after k-1
            if αp has no decreasing strip then
                ℓ ← largest label in a decreasing strip
                p ← reversal that cuts before ℓ and before ℓ+1
        else
            p ← reversal that cuts the first two breakpoints
        α ← αp
        list ← list + p
    return list

L 1 2 . 8 7 . 3 . 5 6 . 4 . R

list ← empty

k ← 3, p ← (8 7 3), αp = L 1 2 3 . 7 8 . 5 6 . 4 . R, α ← αp, list ← (8 7 3)

k ← 4, p ← (7 8 5 6 4), αp = L 1 2 3 4 . 6 5 . 8 7 . R, α ← αp, list ← (8 7 3), (7 8 5 6 4)

k ← 5, p ← (6 5), αp = L 1 2 3 4 5 6 . 8 7 . R, α ← αp, list ← (8 7 3), (7 8 5 6 4), (6 5)

k ← 7, p ← (8 7), αp = L 1 2 3 4 5 6 7 8 R, α ← αp, list ← (8 7 3), (7 8 5 6 4), (6 5), (8 7)
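
One possible Python reading of the algorithm is sketched below. It is not taken from the slides: the function names are illustrative, L is encoded as 0 and R as n + 1, single labels other than L and R are counted as decreasing strips (as in the trace above), and each reversal is recorded as the segment it reverses.

```python
def decreasing_labels(framed):
    """Set of labels that lie in decreasing strips of a framed permutation."""
    n = len(framed) - 2
    labels, start = set(), 0
    for i in range(len(framed)):
        # A strip ends at the last position or wherever a breakpoint follows.
        if i == len(framed) - 1 or abs(framed[i] - framed[i + 1]) != 1:
            strip = framed[start:i + 1]
            touches_end = 0 in strip or n + 1 in strip  # L/R strips are increasing
            if not touches_end and (len(strip) == 1 or strip[0] > strip[1]):
                labels.update(strip)
            start = i + 1
    return labels


def reverse_segment(framed, a, b):
    """Copy of framed with positions a..b (inclusive) reversed."""
    return framed[:a] + framed[a:b + 1][::-1] + framed[b + 1:]


def sort_unoriented(perm):
    """Return the reversed segments, in order, that sort perm into the identity."""
    n = len(perm)
    framed = [0] + list(perm) + [n + 1]
    target = list(range(n + 2))
    reversals = []
    while framed != target:
        dec = decreasing_labels(framed)
        if dec:
            k = min(dec)
            i, j = sorted((framed.index(k), framed.index(k - 1)))
            a, b = i + 1, j                      # cut after k and after k - 1
            if not decreasing_labels(reverse_segment(framed, a, b)):
                big = max(dec)
                i, j = sorted((framed.index(big), framed.index(big + 1)))
                a, b = i, j - 1                  # cut before l and before l + 1
        else:
            cuts = [i for i in range(n + 1) if abs(framed[i] - framed[i + 1]) != 1]
            a, b = cuts[0] + 1, cuts[1]          # cut the first two breakpoints
        reversals.append(tuple(framed[a:b + 1]))
        framed = reverse_segment(framed, a, b)
    return reversals


print(sort_unoriented([1, 2, 8, 7, 3, 5, 6, 4]))
# [(8, 7, 3), (7, 8, 5, 6, 4), (6, 5), (8, 7)]
```

On this input the sketch reproduces the list of reversals from the trace above. Scanning for `k` with `list.index` keeps the code short at the cost of linear work per lookup, which is fine for permutations of this size.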

Another Example

L . 2 1 . 3 . 7 . 5 4 . 8 . 6 . R

list ← empty

k ← 1, p ← (2 1), αp = L 1 2 3 . 7 . 5 4 . 8 . 6 . R, α ← αp, list ← (2 1)

k ← 4, p ← (7 5 4), αp = L 1 2 3 4 5 . 7 8 . 6 . R, α ← αp, list ← (2 1), (7 5 4)

k ← 6, p ← (7 8 6), αp = L 1 2 3 4 5 6 . 8 7 . R, α ← αp, list ← (2 1), (7 5 4), (7 8 6)

k ← 7, p ← (8 7), αp = L 1 2 3 4 5 6 7 8 R, α ← αp, list ← (2 1), (7 5 4), (7 8 6), (8 7)

But is it Optimal?

It has been shown: d(α) ≥ b(α) / 2

For the previous example: b(α) = 7, so d(α) ≥ 4

Although the algorithm produces the optimal result in this instance, it is not guaranteed to do so. The algorithm may produce a list containing more reversals than are actually necessary to solve the problem.

Theorem 4

The number of iterations in algorithm Sorting Unoriented Permutation is less than or equal to the number of breakpoints in the initial permutation

Proof

Must prove that, on average, each iteration removes at least one breakpoint.

We can see this is true because the only time we remove 0 breakpoints is immediately after we have removed 2, keeping the average of 1 breakpoint per iteration intact.
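
Theorem 4 and the breakpoint lower bound together give the factor of 2 quoted for the algorithm. As a short derivation, writing r(α) for the number of reversals the algorithm reports (a symbol introduced here only for this note):

```latex
r(\alpha) \le b(\alpha)
\quad\text{and}\quad
d(\alpha) \ge \frac{b(\alpha)}{2}
\;\Longrightarrow\;
r(\alpha) \le b(\alpha) \le 2\,d(\alpha)
```

So the reported series is never more than twice as long as a shortest one, even though it need not be shortest.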