Algorithm Techniques -class1-

Program

Table: Topics

Brute F. Greedy DP D&C Graph Comb. patt. Clust.& trees HMM Rand.

Subject 4 5 6 7 8 9 10 11 12

Mapping DNA *
Sequencing *
Comparing Seqs * * *
Predicting Genes *
Finding Signals * * * *
Identifying Prots *
Repeat Analysis *
DNA arrays * *
Genome Rearrang. *
Molecular evol. *

Bioinfo I (Institut Pasteur de Montevideo) Algorithm Techniques -class1- July 11th, 2011 1 / 85

Recommended book

An Introduction to Bioinformatics Algorithms, Neil Jones and Pavel Pevzner


Algorithm Techniques

In this chapter we will very briefly review the most common algorithmic techniques used in bioinformatics.


Algorithms

An algorithm is a well-defined and finite sequence of steps used to solve a well-defined problem.

Algorithms that solve all instances of the problem for which they were designed are said to be correct.

The running time of an algorithm is the number of machine instructions it executes when run on a particular instance.

For the analysis of the algorithm the running time is computed for the worst-case instance of the problem.


Running time

Computers need a fixed amount of time t_op to execute one operation (e.g. 10^-9 s)

Algorithms need a certain number of steps s

If t_op and s are known → running time of the algorithm: t_op · s

Since t_op changes constantly (it depends on the hardware), we base the analysis on s, which is independent of hardware

s is not always easy to determine → it depends on the input size n


Big-O Notation

Big-O describes the running time of an algorithm

O(n^2): the running time of the algorithm is bounded by a 2nd-degree polynomial

f(n) = O(n^2): f does not grow faster than c · n^2 for some constant c

2n = O(n^2) is valid but uninformative → more informative: 2n = O(n)

Big-O establishes an upper bound for the growth of a function. If f(n) = O(g(n)), then f does not grow faster than g


Definitions

Let f and g be real functions

1 One writes f(x) = O(g(x)) if and only if there exist c and x0 (c, x0 ∈ R, c > 0) such that

f(x) ≤ c · g(x) for all x ≥ x0

2 One writes f(x) = Ω(g(x)) if and only if there exist c and x0 (c, x0 ∈ R, c > 0) such that

f(x) ≥ c · g(x) for all x ≥ x0

3 One writes f(x) = Θ(g(x)) if and only if

f(x) = O(g(x)) and f(x) = Ω(g(x))
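
A short worked example may help to see how the constants c and x0 are chosen. The function f(x) = 3x^2 + 2x below is an illustrative choice, not taken from the slides:

```latex
\begin{align*}
f(x) &= 3x^2 + 2x \\
f(x) &\le 3x^2 + 2x^2 = 5x^2 \quad \text{for all } x \ge 1
  \;\Rightarrow\; f(x) = O(x^2) \text{ with } c = 5,\ x_0 = 1 \\
f(x) &\ge 3x^2 \quad \text{for all } x \ge 1
  \;\Rightarrow\; f(x) = \Omega(x^2) \text{ with } c = 3,\ x_0 = 1 \\
&\Rightarrow\; f(x) = \Theta(x^2)
\end{align*}
```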


Example: Sorting Algorithms

Sorting Problem: Sort a list of integers
Input: A list of n distinct integers a = (a1, a2, ..., an)
Output: Sorted list of integers, that is, a reordering b = (b1, b2, ..., bn) of the integers from a such that b1 < b2 < ... < bn

Selection Sort Algorithm:
SELECTIONSORT(a, n)
  for i ← 1 to n − 1
    aj ← smallest element among ai, ai+1, ..., an
    Swap ai and aj
  return a


Example: Sorting Algorithms

Recursive Selection Sort:
RECURSIVESELECTIONSORT(a, first, last)
  if first < last
    index ← INDEXOFMIN(a, first, last)
    Swap afirst with aindex
    a ← RECURSIVESELECTIONSORT(a, first+1, last)
  return a

INDEXOFMIN(array, first, last)
  index ← first
  for k ← first+1 to last
    if arrayk < arrayindex
      index ← k
  return index
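
The pseudocode for both variants can be sketched in Python roughly as follows. This is a minimal sketch: it uses 0-based indices instead of the 1-based indices of the slides, and the choice to sort a copy rather than the caller's list is an assumption made here for convenience.

```python
def index_of_min(a, first, last):
    """Return the index of the smallest element in a[first..last] (inclusive)."""
    index = first
    for k in range(first + 1, last + 1):
        if a[k] < a[index]:
            index = k
    return index


def selection_sort(a):
    """Iterative selection sort; returns a sorted copy of a."""
    a = list(a)
    for i in range(len(a) - 1):
        j = index_of_min(a, i, len(a) - 1)
        a[i], a[j] = a[j], a[i]  # swap a_i and a_j
    return a


def recursive_selection_sort(a, first=0, last=None):
    """Recursive selection sort; the top-level call works on a copy of a."""
    if last is None:  # top-level call only
        a = list(a)
        last = len(a) - 1
    if first < last:
        index = index_of_min(a, first, last)
        a[first], a[index] = a[index], a[first]
        recursive_selection_sort(a, first + 1, last)
    return a
```

Both functions perform the same comparisons; they differ only in whether the loop over the starting position is written as iteration or as recursion.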


Complexity Analysis

n − 1 iterations

Iteration i analyzes n − i + 1 elements

The approx. number of operations: n + (n − 1) + (n − 2) + ... + 2 + 1

= 1 + 2 + ... + n = n(n+1)/2

In each iteration a swap: 3 ops

Total: n(n+1)/2 + 3(n − 1)

→ O(n^2)

SELECTIONSORT(a, n)
  for i ← 1 to n − 1
    j ← INDEXOFMIN(a, i, n)
    Swap elements ai and aj
  return a

INDEXOFMIN(array, first, last)
  index ← first
  for k ← first+1 to last
    if arrayk < arrayindex
      index ← k
  return index


Complexity Analysis

Let T(n) be the time an algorithm needs for an input of size n

Finding the smallest of n elements → max. n operations

Recursive call on an array of size n − 1 → T(n − 1)

Call on an array of size 1 → T(1)

It holds:
T(n) = n + T(n − 1)
T(1) = 1
T(n) = n + (n − 1) + T(n − 2)
     = n + (n − 1) + (n − 2) + ... + 2 + T(1)
→ O(n^2)

Recursive Selection Sort:

RECURSIVESELSORT(a, first, last)
  if first < last
    index ← INDEXOFMIN(a, first, last)
    Swap afirst with aindex
    a ← RECURSIVESELSORT(a, first+1, last)
  return a
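
As a quick sanity check of the recurrence T(n) = n + T(n − 1) with T(1) = 1, the sketch below evaluates it directly and compares the result with the closed form n(n+1)/2. The function name `t` is chosen here for illustration only.

```python
def t(n):
    """Evaluate T(1) = 1, T(n) = n + T(n - 1) by unrolling the recurrence."""
    total = 1  # T(1)
    for k in range(2, n + 1):
        total += k  # T(k) = k + T(k - 1)
    return total


# The unrolled sum n + (n - 1) + ... + 2 + 1 equals n(n + 1) / 2,
# which grows quadratically in n.
for n in (1, 2, 10, 100):
    assert t(n) == n * (n + 1) // 2
```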


Algorithms

Conceptually we distinguish

Algorithm strategy

Algorithm structure
  - recursive
  - iterative

Algorithm solution
  - find a good solution
  - find the best solution(s)


Algorithm strategies

Brute force algorithms

Greedy algorithms

Recursive algorithms

Backtracking algorithms

Branch and bound algorithms

Divide and conquer algorithms

Dynamic programming algorithms

Heuristic algorithms


Brute Force or Exhaustive Search

Systematically enumerating all possible candidates for the solution and checking whether each candidate satisfies the problem’s statement

Simple

Very slow

Used as starting point for other types of algorithms


Greedy

Many algorithms are iterative processes

Greedy algorithms choose in each iteration the most “attractive” alternative


Recursive

A combinatorial problem: Fibonacci numbers

n    0  1  2  3  4  5  6   7   8   9  10  11
Fn   0  1  1  2  3  5  8  13  21  34  55  89

The Fibonacci numbers are a classical example of a recursively defined problem:

F0 = 0

F1 = 1

Fn = Fn−1 + Fn−2 for n ≥ 2


Recursions

Recursion: reapply the algorithm to a subproblem
Another example: N!, the factorial of a number N:

function fact(N)
  if (N <= 1)
    return 1
  else
    return N*fact(N-1)
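
The same recursion written as runnable Python might look like the sketch below. Treating 0! = 1 as the base case and rejecting negative input are assumptions added here; the slide only shows the recursive step.

```python
def fact(n):
    """Recursive factorial: n! = n * (n - 1)!, with 0! = 1! = 1."""
    if n < 0:
        raise ValueError("factorial is not defined for negative integers")
    if n <= 1:  # base case stops the recursion
        return 1
    return n * fact(n - 1)
```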


Backtracking

Backtracking is a general technique for organizing the exhaustive search for a solution to a combinatorial problem.

The backtracking technique can be applied to those problems that exhibit the domino principle: if a constraint (condition) is not satisfied by a partial solution, the constraint will not be satisfied by any extension of the partial solution to a global solution.


Backtracking

Domino principle

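[Figure: a row of dominoes numbered 1, 2, 3, ..., n, n+1, each of height h, with spacing w between neighbours]

Given h (height of a domino) > w (space between dominoes): we knock over the first domino; if the nth domino falls, then the (n + 1)st domino will fall.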


Backtracking

The backtracking algorithm enumerates a set of partial candidates that could be completed in various ways to give all the possible solutions to the given problem.

The solution is built incrementally, by a sequence of candidate extension steps.

Conceptually, the partial candidates are the nodes of a tree, the “search tree”

Each partial candidate is the parent of the candidates that differ from it by a single extension step

Leaves of the tree are the partial candidates that cannot be further extended

The backtracking algorithm traverses this search tree recursively, from the root down, in depth-first order


Backtracking

[Figure: search tree with a root node and nodes numbered 1 to 6]

At each node c, the algorithm checks whether c can be completed to a valid solution

If it cannot, the whole sub-tree rooted at c is skipped (pruned)

Otherwise, the algorithm (a) checks whether c itself is a valid solution and (b) recursively enumerates all sub-trees of c

The actual search tree that is traversed by the algorithm is only a part of the tree. The total cost of the algorithm is the number of nodes of the actual tree times the cost of obtaining and processing each node.


Backtracking
Example: Eight queens puzzle → How to place 8 queens on a chessboard so that no two queens attack each other

Consider one row of the board at a time

Eliminate most non-solution board positions at a very early stage

Backtracking rejects attacks on incomplete boards, hence it examines only 15,720 possible queen placements (brute force: 64^8 = 281,474,976,710,656)

The actual search tree is only a part of the tree. The total cost of the algorithm is the number of nodes of the actual tree × the cost of obtaining and processing each node.
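
A row-by-row backtracking search for the queens puzzle can be sketched as below. This is a minimal illustration, not the implementation behind the placement counts quoted above; the function name `count_queens` and the decision to count solutions rather than print boards are assumptions made here.

```python
def count_queens(n=8):
    """Count the ways to place n non-attacking queens on an n x n board."""
    cols = []  # cols[r] is the column of the queen already placed in row r
    solutions = 0

    def safe(row, col):
        # A new queen is safe if it shares no column and no diagonal
        # with any queen placed in an earlier row.
        return all(c != col and abs(c - col) != row - r
                   for r, c in enumerate(cols))

    def place(row):
        nonlocal solutions
        if row == n:  # every row holds a queen: a complete solution
            solutions += 1
            return
        for col in range(n):
            if safe(row, col):  # otherwise the whole sub-tree is pruned
                cols.append(col)
                place(row + 1)
                cols.pop()  # backtrack

    place(0)
    return solutions
```

Because a partial board with two attacking queens can never be extended to a solution, the search abandons it immediately instead of enumerating all its completions.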


Branch-and-Bound

The branch-and-bound method can be used for finding one or all solutions of a combinatorial problem, where solutions are associated with a cost, such that the cost of the whole solution cannot be smaller than the cost of any partial solution → optimization problems

The technique consists of remembering the lowest-cost solution found at each stage of the backtracking search, and using its cost as an upper bound on the cost of a least-cost solution to the problem, in order to discard partial solutions with costs larger than the lowest-cost solution found so far.


Branch-and-Bound

Represent again as a tree: The root of the bb-tree is a so-called dummy node of cost zero, the nodes at level one represent the possible values which the first variable can be assigned to, the nodes at level two represent the possible values which the second variable can be assigned to, given the value which the first variable was assigned to, and so on.

Subtrees rooted at nodes of cost greater than the cost of a previous leaf node are pruned off the bb-tree.

[Figure: example bb-tree with nodes labelled S, A, B, C, E and edge costs]

Problem: the running time can still become exponential in the worst case
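
To make the pruning rule concrete, here is a minimal branch-and-bound sketch for a small assignment problem (assign each worker a distinct job at minimum total cost). The problem, the cost matrix in the usage note and the name `min_cost_assignment` are illustrative assumptions, not part of the slides. Costs are assumed to be non-negative, so the cost of a partial assignment never exceeds the cost of any of its completions.

```python
def min_cost_assignment(cost):
    """Return (best_cost, jobs) where jobs[w] is the job given to worker w.

    cost[w][j] is the non-negative cost of giving job j to worker w.
    """
    n = len(cost)
    best_cost = float("inf")
    best_assignment = None
    used = [False] * n  # used[j] is True if job j is already taken
    current = []        # jobs chosen for workers 0..len(current)-1

    def extend(worker, partial_cost):
        nonlocal best_cost, best_assignment
        # Bound: a partial solution that already costs at least as much as
        # the best complete solution found so far cannot improve on it.
        if partial_cost >= best_cost:
            return
        if worker == n:  # leaf: complete assignment, new best
            best_cost = partial_cost
            best_assignment = list(current)
            return
        for job in range(n):  # branch on the job for this worker
            if not used[job]:
                used[job] = True
                current.append(job)
                extend(worker + 1, partial_cost + cost[worker][job])
                current.pop()
                used[job] = False

    extend(0, 0)
    return best_cost, best_assignment
```

For the hypothetical matrix `[[9, 2, 7], [6, 4, 3], [5, 8, 1]]` the six possible assignments cost 14, 20, 9, 10, 21 and 16, so the search returns cost 9 with jobs `[1, 0, 2]`.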


Divide-and-Conquer

Definition: An algorithmic technique. To solve a problem on an instance of size n, a solution is found either directly because solving that instance is easy (typically, because the instance is small) or the instance is divided into two or more smaller instances. Each of these smaller instances is recursively solved, and the solutions are combined to produce a solution for the original instance.


Divide-and-Conquer Methodology

1 Given a problem, identify a small number of significantly smaller subproblems of the same type

2 Solve each subproblem recursively (the smallest possible size of a subproblem is a base case)

3 Combine these solutions into a solution for the main problem

The name divide and conquer reflects that the problem is conquered by dividing it into several smaller problems.


Divide-and-Conquer

The divide-and-conquer technique can be applied to those problems that exhibit the independence principle: the problem instance can be divided into a series of smaller problem instances which are independent of each other.

Example: One of the simplest examples is “Quicksort” of an array: Partition the array into two parts, and quicksort each of the parts. Here in fact, no additional work is required to combine the two sorted parts. Running time: O(n^2) in the worst case, O(n log n) on average
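
A compact way to see the divide, conquer and combine steps is the sketch below. It is a simplified, not in-place variant that always takes the first element as the pivot (both are choices made here for brevity); the combine step is plain list concatenation.

```python
def quicksort(a):
    """Return a sorted copy of a using a simple (not in-place) quicksort."""
    if len(a) <= 1:  # base case: nothing to sort
        return list(a)
    pivot = a[0]
    # Divide: partition the remaining elements around the pivot.
    smaller = [x for x in a[1:] if x < pivot]
    larger = [x for x in a[1:] if x >= pivot]
    # Conquer both parts recursively, then combine by concatenation.
    return quicksort(smaller) + [pivot] + quicksort(larger)
```

With this pivot rule an already sorted input produces maximally unbalanced partitions, which is the situation behind the quadratic worst case.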


Divide-and-Conquer

When a problem is solved by “divide-and-conquer”, sometimes the same subproblem appears multiple times.
A recursive algorithm that follows the recursive definition of the Fibonacci numbers directly is:

Fibonacci-R(i)
  if i = 0
    then return 0
  else
    if i = 1
      then return 1
    else return Fibonacci-R(i-1) + Fibonacci-R(i-2)


Divide-and-Conquer

However, it is easy to see that the algorithm is not efficient, since values of Fi are calculated several times independently.

[Figure: recursion tree of Fibonacci-R(n); the subproblems n−2, n−3, n−4, ... appear several times]
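
The amount of repeated work can be made visible by counting how often each subproblem is solved. The sketch below instruments the naive recursion with a counter; the `calls` parameter is an addition for illustration and not part of Fibonacci-R.

```python
from collections import Counter


def fibonacci_r(i, calls):
    """Naive recursive Fibonacci that records every call in `calls`."""
    calls[i] += 1
    if i == 0:
        return 0
    if i == 1:
        return 1
    return fibonacci_r(i - 1, calls) + fibonacci_r(i - 2, calls)


calls = Counter()
value = fibonacci_r(10, calls)
# value is F10 = 55, but computing it takes 177 calls in total,
# e.g. the subproblem i = 1 alone is solved 55 times.
total_calls = sum(calls.values())
```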


Randomized Algorithms

Toss a coin to decide where to start looking for the phone

Not as intuitive as deterministic algorithms


Machine Learning

Collect statistics over the course of a year about where you leave thephone, learning where the phone tends to end up most of the time.

E.g. 80% of the time it was left in the bathroom, 15% in the bedroom and 5% in the kitchen

Strategy: first look in the bathroom, then in the bedroom and finally in the kitchen


Dynamic Programming

Dynamic Programming is a very general programming technique.

Most often applied in the construction of algorithms to solve a certain class of optimisation problems, i.e. problems which require the minimisation or maximisation of some measure.

Applicable when a large search space can be structured into a succession of stages, such that
  - the initial stage contains trivial solutions to sub-problems,
  - each partial solution in a later stage can be calculated by recurring on only a fixed number of partial solutions in an earlier stage,
  - the final stage contains the overall solution.

The method usually accomplishes this by maintaining a table or matrix of sub-instance results.


Dynamic Programming

Dynamic programming can be thought of as being the reverse of recursion or divide-and-conquer.

Divide-and-conquer is a top-down mechanism: we take a problem, split it up, and solve the smaller problems that are created.

Dynamic programming is a bottom-up mechanism: we solve all possible small problems and then combine them to obtain solutions for bigger problems.
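
The Fibonacci numbers give a small illustration of the bottom-up idea: instead of recursing from n downwards, the table is filled from the trivial cases upwards, so each value is computed exactly once. The function name `fibonacci_dp` is chosen here for illustration.

```python
def fibonacci_dp(n):
    """Bottom-up Fibonacci: fill a table F[0..n] from the smallest cases."""
    if n == 0:
        return 0
    table = [0] * (n + 1)
    table[1] = 1  # initial stage: the trivial sub-problems F0 and F1
    for i in range(2, n + 1):
        # Each later entry depends on a fixed number (two) of earlier ones.
        table[i] = table[i - 1] + table[i - 2]
    return table[n]  # final stage: the overall solution
```

This needs n − 1 additions, in contrast to the exponentially many calls made by the direct recursion.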


Dynamic Programming

A general DP algorithm consists of 4 steps:

1 Characterization of the structure of the (an) optimal solution

2 Recursive definition of the value of an optimal solution

3 Computation of the optimum using recursion

4 Construction of an optimal solution through the computed optimal value.


Dynamic Programming

Example: “The Rocks game”

2 players, 2 piles of rocks, say 10 each

In each turn one player may take either one rock (from either pile) or two rocks (one from each pile). Taken rocks are removed from the game.

The player that takes the last rock wins the game

To find the winning strategy we construct a 10 × 10 table R:
→ If Player 1 can always win the game (i, j), then we say Rij = W
→ If Player 1 loses the game (i, j), then we say Rij = L
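
The table R can be filled bottom-up, as in the sketch below. It assumes that "Player 1" is the player about to move in position (i, j), and it adds a row and a column for empty piles (index 0) so that the base case R[0][0] = L can be stored in the table; both are conventions chosen here.

```python
def rocks_table(n=10, m=10):
    """R[i][j] is 'W' if the player to move wins with piles of i and j rocks."""
    R = [[None] * (m + 1) for _ in range(n + 1)]
    R[0][0] = "L"  # no rocks left: the previous player took the last rock
    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            # Positions reachable in one move (the opponent moves next there).
            reachable = []
            if i > 0:
                reachable.append(R[i - 1][j])       # one rock from pile 1
            if j > 0:
                reachable.append(R[i][j - 1])       # one rock from pile 2
            if i > 0 and j > 0:
                reachable.append(R[i - 1][j - 1])   # one rock from each pile
            # We win if some move leaves the opponent in a losing position.
            R[i][j] = "W" if "L" in reachable else "L"
    return R


R = rocks_table()
```

Every entry depends only on its three neighbours with smaller indices, so the whole table is computed in time proportional to its size. For the 10 + 10 game the entry `R[10][10]` comes out as `L`: the player who moves first loses against perfect play.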


Tractable vs. Non Tractable Problems

Algorithms can be classified according to their complexity

Problems can also be classified according to their inherent complexity

There are problems for which no polynomial algorithm exists: e.g. enumerate all subsets of n elements

Other problems can be solved in polynomial time

Between these two, exponential and polynomial problems, lie the NP-complete problems


Tractable vs. Non Tractable Problems

Problems for which there is no known polynomial algorithm, but for which it has not been proven that none exists

The classic: Traveling-Salesman Problem


Literature

Sources and further recommended reading:

Schöning, Algorithmik, Spektrum Akademischer Verlag, 2001.

Kay Nieselt, Lecture Notes (Grundlagen der Bioinformatik, SS 2007), Eberhard Karls Universität Tübingen

N. C. Jones and P. A. Pevzner, An Introduction to Bioinformatics Algorithms, 2004


DNA Mapping, Motifs and Brute Force Algorithms

In this chapter we will see:

Restriction Enzymes

Gel Electrophoresis

Partial Digest Problem

Brute Force Algorithm for Partial Digest Problem

Branch and Bound Algorithm for Partial Digest Problem

Double Digest Problem

Finding Regulatory Motifs


Molecular Scissors

Molecular Cell Biology, 4th edition


Molecular Scissors

ePlantScience.com, An online botanical encyclopedia, Chapter 3.


Uses of restriction enzymes

Recombinant DNA technology
  - Recombinant technology starts with the isolation of a gene of interest. It is then inserted into a vector and cloned
  - Recombinant proteins result from the expression of rDNA

DNA Cloning
  - A technique to reproduce DNA fragments
  - Cell based or via PCR

cDNA/genomic library construction
  - mRNA → cDNA → restriction enzyme + ligase → into plasmid
  - genomic regions

DNA mapping


Restriction maps

A map showing positions of restriction sites in a DNA sequence

If the DNA sequence is known, then construction of a restriction map is a trivial exercise

In the early days of molecular biology DNA sequences were often unknown

Biologists had to solve the problem of constructing restriction maps without knowing DNA sequences


Full Restriction Digest

A map showing positions of restriction sites in a DNA sequence

Cutting DNA at each restriction site creates multiple restriction fragments:

Is it possible to reconstruct the order of the fragments from the sizes of the fragments {3, 5, 5, 9}?


Full Restriction Digest

Multiple Solutions

[Figure: two different fragment orders that yield the same fragment sizes]


Measuring length of fragments: Gel electrophoresis

Gel electrophoresis: process for separating DNA by size and measuring sizes of restriction fragments

Can separate DNA fragments that differ in length by only 1 nucleotide, for fragments up to 500 nucleotides long

Using an electric field, molecules can be made to move through a gel (agar)


Measuring length of fragments: Gel electrophoresis

The gel is placed in an electrophoresis chamber. When the electric current is applied, the larger molecules move more slowly through the gel while the smaller molecules move faster. The different sized molecules form bands on the gel


Detecting DNA

One possibility to visualize DNA bands: Fluorescence

The gel is incubated with a solution containing the fluorescent dye ethidium

Ethidium binds to the DNA

The DNA lights up when the gel is exposed to ultraviolet light.


Partial Restriction Digest

The sample of DNA is exposed to the restriction enzyme for only a limited amount of time to prevent it from being cut at all restriction sites

This experiment generates the set of all possible restriction fragments between every two (not necessarily consecutive) cuts

This set of fragment sizes is used to determine the positions of the restriction sites in the DNA sequence


Partial Restriction Digest: Example

Partial Digest results in the following 10 restriction fragments:

Multiset: {3, 5, 5, 8, 9, 14, 14, 17, 19, 22}

→ We assume that the multiplicity of a fragment can be detected, i.e., the number of restriction fragments of the same length can be determined (e.g., by observing twice as much fluorescence intensity for a double fragment as for a single fragment)


Partial Digest

Fundamentals:

X: the set of n integers representing the location of all cuts in the restriction map, including the start and end

n: the total number of cuts

DX: the multiset of integers representing the lengths of each of the (n choose 2) = n(n−1)/2 fragments produced from a partial digest


Partial Digest

A way of representing n, X, DX:

Representation of DX = {2, 2, 3, 3, 4, 5, 6, 7, 8, 10} as a two-dimensional table, with the elements of X = {0, 2, 4, 7, 10} along both the top and left side. The element at (i, j) in the table is xj − xi for 1 ≤ i < j ≤ n.
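
The table and the multiset DX can be computed from X with a few lines of Python. The sketch below uses 0-based indices and stores `None` in the cells on and below the diagonal, which are left empty in the table; the function names are chosen here for illustration.

```python
def difference_table(X):
    """table[i][j] = x_j - x_i for i < j, None elsewhere (0-based indices)."""
    xs = sorted(X)
    n = len(xs)
    return [[xs[j] - xs[i] if i < j else None for j in range(n)]
            for i in range(n)]


def delta_x(X):
    """The multiset DX of all pairwise distances, returned as a sorted list."""
    table = difference_table(X)
    return sorted(v for row in table for v in row if v is not None)


table = difference_table([0, 2, 4, 7, 10])
# Row for x = 0 holds 2, 4, 7, 10; row for x = 2 holds 2, 5, 8; and so on.
dx = delta_x([0, 2, 4, 7, 10])
```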


Partial Digest Problem

Formulation:

Goal: Given all pairwise distances between points on a line, reconstruct the positions of those points

Input: The multiset of pairwise distances L, containing n(n−1)/2 integers

Output: A set X of n integers, such that DX = L


Partial Digest Problem: Multiple Solutions

It is not always possible to uniquely reconstruct a set X based only on DX

For example, the sets
X = {0, 2, 5}
and
X + 10 = {10, 12, 15}

both produce DX = {2, 3, 5} as their partial digest set.

The sets {0, 1, 2, 5, 7, 9, 12} and {0, 1, 5, 7, 8, 10, 12} present a less trivial example of non-uniqueness. They both digest into:
{1, 1, 2, 2, 2, 3, 3, 4, 4, 5, 5, 5, 6, 7, 7, 7, 8, 9, 10, 11, 12}
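
Both claims of non-uniqueness are easy to verify mechanically. The helper `pairwise_distances` below is a name introduced for this check:

```python
from itertools import combinations


def pairwise_distances(X):
    """Sorted list of all pairwise distances between the points in X."""
    return sorted(b - a for a, b in combinations(sorted(X), 2))


# Shifting a set does not change its pairwise distances.
shifted_equal = pairwise_distances([0, 2, 5]) == pairwise_distances([10, 12, 15])

# The less trivial pair of sets from the text.
A = [0, 1, 2, 5, 7, 9, 12]
B = [0, 1, 5, 7, 8, 10, 12]
same_digest = pairwise_distances(A) == pairwise_distances(B)
```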


Homometric Sets


Brute Force Algorithms

Also known as exhaustive search algorithms; examine every possible variant to find a solution

Efficient in rare cases; usually impractical


Partial Digest Problem: Brute Force

1. Find the restriction fragment of maximum length M. M is the length of the DNA sequence
2. For every possible set X = {0, x2, ..., xn−1, M} compute the corresponding DX
3. If DX is equal to the experimental partial digest L, then X is the correct restriction map

BruteForcePDP(L, n):
  M ← maximum element in L
  for every set of n − 2 integers 0 < x2 < ... < xn−1 < M
    X ← {0, x2, ..., xn−1, M}
    Form DX from X
    if DX = L
      return X
  output “no solution”
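
BruteForcePDP translates almost line by line into Python. In the sketch below, returning `None` stands in for the “no solution” output, and multisets are compared as sorted lists; both are implementation choices made here.

```python
from itertools import combinations


def delta(X):
    """Multiset of pairwise distances of X, as a sorted list."""
    return sorted(b - a for a, b in combinations(sorted(X), 2))


def brute_force_pdp(L, n):
    """Try every set {0, x2, ..., x(n-1), M} with 0 < x2 < ... < x(n-1) < M."""
    target = sorted(L)
    M = target[-1]  # the longest fragment spans the whole sequence
    for inner in combinations(range(1, M), n - 2):
        X = [0, *inner, M]
        if delta(X) == target:
            return X
    return None  # no solution


X = brute_force_pdp([2, 2, 3, 3, 4, 5, 6, 7, 8, 10], 5)
```

The loop runs over all (n − 2)-element subsets of the M − 1 interior positions, which is what makes the approach impractical for realistic sequence lengths.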


Efficiency of Brute Force

BruteForcePDP takes O(M^(n−2)) time since it must examine all possible sets of positions.

One way to improve the algorithm is to limit the values of xi to only those values which occur in L.

AnotherBruteForcePDP(L, n):
  M ← maximum element in L
  for every set of n − 2 integers 0 < x2 < ... < xn−1 < M from L
    (remaining steps as in BruteForcePDP)



Efficiency of AnotherBruteForcePDP

It is more efficient, but still slow. This algorithm examines (|L| choose n−2) different sets of positions

If L = {2, 998, 1000} (n = 3, M = 1000), BruteForcePDP will be extremely slow, but AnotherBruteForcePDP will be quite fast

Fewer sets are examined, but the runtime is still exponential: O(n^(2n−4))
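
The restriction to values from L works because every cut position xi is itself a distance (to the cut at 0) and therefore must occur in L. A minimal sketch, with `None` again standing in for “no solution” and candidate positions taken from the distinct values of L:

```python
from itertools import combinations


def delta(X):
    """Multiset of pairwise distances of X, as a sorted list."""
    return sorted(b - a for a, b in combinations(sorted(X), 2))


def another_brute_force_pdp(L, n):
    """Like BruteForcePDP, but interior positions are drawn from L only."""
    target = sorted(L)
    M = target[-1]
    candidates = sorted(set(target) - {M})  # distinct values 0 < x < M in L
    for inner in combinations(candidates, n - 2):
        X = [0, *inner, M]
        if delta(X) == target:
            return X
    return None  # no solution


# For L = {2, 998, 1000} only the positions 2 and 998 are tried,
# instead of all 999 interior positions.
X = another_brute_force_pdp([2, 998, 1000], 3)
```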


Branch and Bound Algorithm for PDP

1 Begin with X = {0}

2 Remove the largest element in L and place it in X

3 See if the element fits on the right or left side of the restriction map

4 When it fits, find the other lengths it creates and remove those from L

5 Go back to step 2 until L is empty


PartialDigest Algorithm

Before describing PartialDigest, first define D(y, X) as the multiset of all distances between point y and all other points in the set X

D(y, X) = {|y − x1|, |y − x2|, ..., |y − xn|}

for X = {x1, x2, ..., xn}

PartialDigest(L):
  width ← Maximum element in L
  DELETE(width, L)
  X ← {0, width}
  PLACE(L, X)


PartialDigest Algorithm

PLACE(L, X)
  if L is empty
    output X
    return
  y ← maximum element in L
  if D(y, X) ⊆ L
    Add y to X and remove lengths D(y, X) from L
    PLACE(L, X)
    Remove y from X and add lengths D(y, X) to L
  if D(width − y, X) ⊆ L
    Add width − y to X and remove lengths D(width − y, X) from L
    PLACE(L, X)
    Remove width − y from X and add lengths D(width − y, X) to L
  return
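
The following Python sketch mirrors PartialDigest and PLACE. It is one possible rendering under a few assumptions made here: the multiset L is held in a `collections.Counter`, each recursive call receives a reduced copy of L instead of removing and re-adding lengths in place, and all solutions are collected in a list rather than printed.

```python
from collections import Counter


def distances(y, X):
    """D(y, X): multiset of distances between y and every point of X."""
    return Counter(abs(y - x) for x in X)


def is_submultiset(small, big):
    """True if every length in `small` occurs at least as often in `big`."""
    return all(big[d] >= count for d, count in small.items())


def partial_digest(lengths):
    """Return all sets X (as sorted lists) whose pairwise distances are `lengths`."""
    L = Counter(lengths)
    width = max(L)
    L = L - Counter([width])  # DELETE(width, L)
    solutions = []

    def place(L, X):
        if not L:  # L is empty: X explains every length
            solutions.append(sorted(X))
            return
        y = max(L)
        # The largest remaining length is measured from 0 or from width.
        candidates = (y,) if y == width - y else (y, width - y)
        for position in candidates:
            needed = distances(position, X)
            if is_submultiset(needed, L):  # otherwise prune this branch
                place(L - needed, X + [position])

    place(L, [0, width])
    return solutions


solutions = partial_digest([2, 2, 3, 3, 4, 5, 6, 7, 8, 10])
```

For this input the search reports X = {0, 2, 4, 7, 10} and its mirror image {0, 3, 6, 8, 10}; the mirror image appears because the sketch does not fix the symmetric first choice.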


An Example

L = {2, 2, 3, 3, 4, 5, 6, 7, 8, 10}
X = {0}

Remove 10 from L and insert it into X. We know this must be the length of the DNA sequence because it is the largest fragment.


An Example

L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
X = {0, 10}

Take 8 from L and make y = 2 or 8. But since the two cases are symmetric, we can assume y = 2.


An Example

L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
X = {0, 10}

We find that the distances from y = 2 to the other elements in X are D(y, X) = {8, 2}, so we remove {8, 2} from L and add 2 to X.


An Example

L = {2, 3, 3, 4, 5, 6, 7}
X = {0, 2, 10}

Take 7 from L and make y = 7 or y = 10 − 7 = 3. We will explore y = 7 first, so D(y, X) = {7, 5, 3} = {|7 − 0|, |7 − 2|, |7 − 10|}.


An Example

L = {2, 3, 3, 4, 5, 6, 7}
X = {0, 2, 10}

For y = 7, D(y, X) = {7, 5, 3}. Therefore we remove {7, 5, 3} from L and add 7 to X.


An Example

L = {2, 3, 4, 6}
X = {0, 2, 7, 10}

Take 6 from L and make y = 6. Unfortunately D(y, X) = {6, 4, 1, 4}, which is not a subset of L. Therefore we won’t explore this branch.


An Example

L = {2, 3, 4, 6}
X = {0, 2, 7, 10}

This time make y = 10 − 6 = 4. D(y, X) = {4, 2, 3, 6}, which is a subset of L, so we will explore this branch. We remove {4, 2, 3, 6} from L and add 4 to X.


An Example

L = {}
X = {0, 2, 4, 7, 10}

L is now empty, so we have a solution, which is X.


An Example

L = {2, 3, 4, 6}
X = {0, 2, 7, 10}

To find other solutions, we backtrack.


An Example

L = {2, 3, 3, 4, 5, 6, 7}
X = {0, 2, 10}

We backtrack further.


An Example

L = {2, 3, 3, 4, 5, 6, 7}
X = {0, 2, 10}

This time we will explore y = 3. D(y, X) = {3, 1, 7}, which is not a subset of L, so we won’t explore this branch.


An Example

L = {2, 2, 3, 3, 4, 5, 6, 7, 8}
X = {0, 10}

We have backtracked to the root. Therefore we have found all the solutions.


Complexity analysis of PartialDigest Problem

Still exponential in the worst case, but very fast on average

Informally, let T(n) be the time PartialDigest takes to place n cuts:

No branching case: T(n) = T(n − 1) + O(n)

→ Quadratic

Branching case: T(n) = 2T(n − 1) + O(n) = 2(2T(n − 2) + O(n)) + O(n) = ...

→ Exponential
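
Unrolling the two recurrences shows where the quadratic and the exponential bound come from. In this sketch the O(n) term is written as cn for some constant c, which is an assumption made to keep the algebra explicit:

```latex
\begin{align*}
\text{No branching:}\quad T(n) &= T(n-1) + cn = T(n-2) + c(n-1) + cn = \dots \\
  &= T(1) + c\sum_{k=2}^{n} k = O(n^2) \\[4pt]
\text{Branching:}\quad T(n) &= 2T(n-1) + cn = 4T(n-2) + 2c(n-1) + cn = \dots \\
  &= 2^{n-1}\,T(1) + c\sum_{k=0}^{n-2} 2^{k}(n-k) = O(2^n)
\end{align*}
```

The last sum is bounded by a constant multiple of 2^n because substituting j = n − k turns it into 2^n times a partial sum of the convergent series of j / 2^j.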


Double Digest Problem (DDP)

Double Digest is yet another experimental method to construct restriction maps

Uses two restriction enzymes; three full digests:
  - One with only the first enzyme
  - One with only the second enzyme
  - One with both enzymes

Computationally, the Double Digest problem is more complex than the Partial Digest problem



Without the information about X (i.e. A + B), it is impossible to solve the DDP



Double Digest Problem

Formulation:

Input:
dA → fragment lengths from the digest with enzyme A
dB → fragment lengths from the digest with enzyme B
dX → fragment lengths from the digest with both A and B

Output:
A → location of the cuts in the restriction map for the enzyme A
B → location of the cuts in the restriction map for the enzyme B
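
For very small inputs the formulation can be explored with a brute-force search over all fragment orders. The sketch below is an illustrative exhaustive approach, not an algorithm given in the slides; the function names and the example digest in the usage note are hypothetical.

```python
from itertools import accumulate, permutations


def cuts_from_fragments(fragments):
    """Interior cut positions implied by an ordered list of fragment lengths."""
    return list(accumulate(fragments))[:-1]


def double_digest(dA, dB, dX):
    """Return all (A, B) cut-position pairs consistent with the three digests."""
    target = sorted(dX)
    total = sum(dA)  # length of the DNA sequence
    solutions = set()
    for order_a in set(permutations(dA)):
        for order_b in set(permutations(dB)):
            a_cuts = cuts_from_fragments(order_a)
            b_cuts = cuts_from_fragments(order_b)
            # Cutting with both enzymes uses the union of the cut positions.
            points = sorted(set([0, total] + a_cuts + b_cuts))
            fragments = sorted(q - p for p, q in zip(points, points[1:]))
            if fragments == target:
                solutions.add((tuple(a_cuts), tuple(b_cuts)))
    return sorted(solutions)


maps = double_digest(dA=[3, 4, 3], dB=[5, 5], dX=[3, 2, 2, 3])
```

In this made-up example on a sequence of length 10, the only consistent map has enzyme A cutting at positions 3 and 7 and enzyme B cutting at position 5. The number of fragment orders grows factorially, so the search is only usable for toy inputs.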


DDP: Multiple Solutions

