Challenging Cloning Related Problems with GPU-Based Algorithms
Authors: Thierry Lavoie, Michael Eilers-Smith, Ettore Merlo
Publisher: ACM IWSC’10
Presenter: Ye-Zhi Chen
Date: 2011/12/21



• This paper describes an implementation of the Smith-Waterman algorithm for proper clone filtering.

Introduction


• To address the false-positive problem in clone detection with an appropriate filtering technique, DP-matching seemed to be an interesting choice.

Algorithm

     -    A    B    C    C    A
-    0X   0X   0X   0X   0X   0X
A    0X   1↖   1←   1←   1←   1↖
B    0X   1↑   2↖   2←   2←   2←
A    0X   1↖   2↑   2↑   2↑   3↖
C    0X   1↑   2↑   3↖   3↖   3↑
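The matrix for the strings ABAC (rows) and ABCCA (columns) can be reproduced with a short sequential sketch. The scoring used here (a match adds 1 to the upper-left cell, a mismatch copies the larger of the top and left cells, ties go to the top cell) is an assumption inferred from the example values, not a rule stated on the slides, and the name `dp_matching` is hypothetical.

```python
def dp_matching(s1, s2):
    """Fill the DP matrix for s1 (rows) against s2 (columns).

    Row 0 and column 0 stand for the leading gap and keep score 0
    with the marker 'X' (no predecessor).
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    score = [[0] * cols for _ in range(rows)]
    arrow = [["X"] * cols for _ in range(rows)]
    for i in range(1, rows):
        for j in range(1, cols):
            if s1[i - 1] == s2[j - 1]:
                # Match: extend the alignment ending at the upper-left cell.
                score[i][j] = score[i - 1][j - 1] + 1
                arrow[i][j] = "↖"
            elif score[i - 1][j] >= score[i][j - 1]:
                # Assumed tie-break: prefer the top cell.
                score[i][j] = score[i - 1][j]
                arrow[i][j] = "↑"
            else:
                score[i][j] = score[i][j - 1]
                arrow[i][j] = "←"
    return score, arrow


score, arrow = dp_matching("ABAC", "ABCCA")
for s_row, a_row in zip(score, arrow):
    print("  ".join(f"{s}{a}" for s, a in zip(s_row, a_row)))
```

Each cell reads only its top, left, and upper-left neighbors, which is the property the GPU version relies on.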


GPU DP-matching:

• Find which cells of the matrix are free of computational dependencies, in order to compute their values on separate cores simultaneously.

• It is simple to check that all cells on an anti-diagonal become free of computational dependencies at the same moment, because their values depend solely on the cells of the previous anti-diagonals.
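The anti-diagonal order can be illustrated with a sequential sketch in which the inner loop stands in for the parallel threads. This is not the presented GPU implementation: the function name `dp_matching_wavefront` and the match/mismatch scoring are assumptions made for illustration.

```python
def dp_matching_wavefront(s1, s2):
    """Fill the DP matrix one anti-diagonal at a time.

    Cells (i, j) with i + j == d form anti-diagonal d. They read only
    cells from anti-diagonals d - 1 and d - 2, so the inner loop could
    be distributed over independent threads.
    """
    rows, cols = len(s1) + 1, len(s2) + 1
    score = [[0] * cols for _ in range(rows)]
    for d in range(2, rows + cols - 1):
        i_lo = max(1, d - (cols - 1))   # keep j = d - i inside the matrix
        i_hi = min(rows - 1, d - 1)     # keep j >= 1
        for i in range(i_lo, i_hi + 1):  # independent cells of this step
            j = d - i
            if s1[i - 1] == s2[j - 1]:
                score[i][j] = score[i - 1][j - 1] + 1
            else:
                score[i][j] = max(score[i - 1][j], score[i][j - 1])
    return score


print(dp_matching_wavefront("ABAC", "ABCCA")[-1][-1])
```

The result is the same matrix as a row-by-row sweep, since every cell is computed only after its three neighbors.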

Algorithm


• Let Vk represent the linear buffer computed at step k, and let fk be the following map between the indexes of V and those of the matrix D:

u can be seen as the thread index. The first character of both s1 and s2 is a gap.

Algorithm


Algorithm

     -    A    B    C    C    A
-    0X   0X   0X   0X   0X   0X
A    0X   1↖   1←   1←   1←   1↖
B    0X   1↑   2↖   2←   2←   2←
A    0X   1↖   2↑   2↑   2↑   3↖
C    0X   1↑   2↑   3↖   3↖   3↑


[Figure: the characters being compared and the neighboring cells a cell depends on: top, left, and upper left]


Worst case problem:

• The worst case of the classical DP-matching algorithm has a quadratic running time.

• In the general worst case, the GPU-based implementation also has a quadratic running time.

• However, since a large number of cores perform the computation at the same time, the hidden quadratic constant can be divided by a large factor.
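A rough way to see this is to count parallel rounds for an anti-diagonal sweep with a fixed number of cores. The sketch below is an idealized model, not a measurement: it assumes each core computes one cell per round and ignores memory transfers, and the name `parallel_steps` is hypothetical.

```python
from math import ceil


def parallel_steps(n, m, cores):
    """Rounds needed to sweep an n x m matrix by anti-diagonals.

    Each anti-diagonal of `length` cells takes ceil(length / cores)
    rounds when `cores` cells can be computed at once.
    """
    steps = 0
    for d in range(n + m - 1):
        # Number of cells (i, j) with i + j == d inside the matrix.
        length = min(d, n - 1, m - 1, n + m - 2 - d) + 1
        steps += ceil(length / cores)
    return steps


print(parallel_steps(4, 5, 1))  # one core: every cell is its own round
print(parallel_steps(4, 5, 4))  # four cores: one round per anti-diagonal
```

With one core the count equals n × m. With a fixed number of cores it still grows quadratically for large inputs, but it is divided by roughly the number of cores.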

Algorithm


• On very small instances of DP-matching problems, the CPU might outrun the GPU, mostly because of memory bandwidth limitations

• If such very small instances are computed by matching one string against a set of strings, the data can be packed on the GPU to make the total computation more efficient.

Algorithm


• Let C be a set of strings and let c0 be an element of C. Define C’ as: C’ = C − {c0}

The problem is then defined as matching c0 against all ci in C’.

• Practical implementations need to pad the strings to be matched. This forces the number of computational steps k to be the same in each sub-matrix. The length of the padding p of a ci is defined as follows: p = max(len(cj) | cj ∈ C) − len(ci)

• The padded ci of C’ are then concatenated, separated from one another by a special blank character.
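The padding and concatenation step can be sketched as follows. The pad character `*`, the blank separator, the choice to pad at the end of each string, and the name `pack_strings` are all assumptions for illustration; the slides only state that the strings are padded to a common length and joined with a special blank character.

```python
PAD = "*"  # hypothetical padding character, assumed absent from the inputs
SEP = " "  # hypothetical stand-in for the special blank separator


def pack_strings(c0, others):
    """Pad every string of C' = C - {c0} to the longest length in C and join them.

    Returns the packed buffer and the common padded width.
    """
    width = max(len(c) for c in [c0, *others])
    # Padding length per string: width - len(c), appended at the end (assumed).
    padded = [c + PAD * (width - len(c)) for c in others]
    return SEP.join(padded), width


packed, width = pack_strings("ABAC", ["AB", "ABCCA", "C"])
print(repr(packed), width)
```

With equal widths, every sub-matrix of c0 against a padded ci has the same shape, so all of them need the same number of anti-diagonal steps.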

Algorithm


k’s initial value is not 0; it is (|C’| − 1) ∗ (max(len(ci) | ci ∈ C) + 1)

The number of computational steps k is reduced to 2 ∗ max(len(ci) | ci ∈ C) − 1


The indexes γ corresponding to these cells can be evaluated with this equation: γ = x ∗ (max(len(ci) | ci ∈ C) + 1), ∀ x ∈ {0..|C| − 1}
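As a small worked instance of the formula, the sketch below lists the γ values for a given number of strings and maximum length. The function name `gamma_indexes` and the sample numbers are hypothetical; the only thing taken from the slide is the expression γ = x ∗ (max length + 1).

```python
def gamma_indexes(num_strings, max_len):
    """Return gamma = x * (max_len + 1) for x in 0 .. num_strings - 1."""
    block = max_len + 1  # one padded string plus one separator position
    return [x * block for x in range(num_strings)]


print(gamma_indexes(3, 5))
```

The values are multiples of the padded width plus one, so consecutive γ are spaced one padded string and one separator apart.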


Equipment: Intel Core 2 Duo computer at 3.00 GHz with 6 MB of cache, 3 GB of RAM and a GeForce 8800GT

EXPERIMENTAL