Dynamic Programming on SIMD
SIMD (SSE)
A register is split into parts, say 4, each 8 or 16 bits wide
(a real 128-bit SSE register holds 16 parts of 8 bits or 8 parts of 16 bits)
Each part executes the same operation, in parallel
Example: x := x + y
x and y are memory locations of appropriate alignment, and each refers to 4 different
variables
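The example x := x + y can be imitated in plain Python to make the "same operation on every part" idea concrete. This is only a sketch: the function name `simd_add` is hypothetical, a list stands in for a register, and wrap-around on overflow is an assumption about how each 8-bit part behaves.

```python
def simd_add(x, y, bits=8):
    """Add two 'registers' part by part; each part wraps around at `bits` bits."""
    assert len(x) == len(y), "both registers must have the same number of parts"
    mask = (1 << bits) - 1
    # Every part gets the same operation; no part looks at its neighbours.
    return [(a + b) & mask for a, b in zip(x, y)]


x = [1, 2, 3, 250]
y = [10, 20, 30, 10]
x = simd_add(x, y)
print(x)  # [11, 22, 33, 4]  (250 + 10 wraps around in 8 bits)
```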
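Edit Distance between Strings
Q and D are the two strings
$H(i,j)$ = edit distance between $Q[1..i]$ and $D[1..j]$
$H(i,j) = \max\{\, H(i-1,j-1) + \Delta(Q[i], D[j]),\; E(i,j),\; F(i,j) \,\}$
[Figure: three cases for the last step. $Q[i]$ is aligned with $D[j]$; $D[j]$ stands alone; $Q[i]$ stands alone.]
E(i,j)
[Figure: case E. The alignment steps along D, from $D[j-1]$ to $D[j]$, at a fixed position of Q.]
F(i,j)
[Figure: case F. The alignment steps along Q at a fixed $D[j]$.]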
Summarizing
$H(i,j) = \max\{\, H(i-1,j-1) + \Delta(Q[i], D[j]),\; E(i,j),\; F(i,j) \,\}$
Score table $\Delta$:
     A   G   C   T
A   -1   2   2   2
G    2  -1   2   2
C    2   2  -1   2
T    2   2   2  -1
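A scalar version of the H, E, F recurrence may help before anything is vectorized. The slides do not spell out how E and F are updated, so the sketch below assumes the usual affine-gap form: a gap-initiation penalty `gap_init` charged for the first gap symbol and a gap-extension penalty `gap_ext` for each further one. The function name, the boundary values and the example scores (match +2, mismatch -1) are illustrative assumptions; any Δ can be passed in as `delta`.

```python
NEG = float("-inf")


def align_score(q, d, delta, gap_init, gap_ext):
    """Fill H, E, F for strings q and d and return H(m, n).

    Assumed recurrences (not given explicitly in the slides):
      E(i,j) = max(E(i,j-1) - gap_ext, H(i,j-1) - gap_init)
      F(i,j) = max(F(i-1,j) - gap_ext, H(i-1,j) - gap_init)
      H(i,j) = max(H(i-1,j-1) + delta(Q[i], D[j]), E(i,j), F(i,j))
    """
    m, n = len(q), len(d)
    H = [[NEG] * (n + 1) for _ in range(m + 1)]
    E = [[NEG] * (n + 1) for _ in range(m + 1)]
    F = [[NEG] * (n + 1) for _ in range(m + 1)]
    H[0][0] = 0
    # Assumed boundaries: a prefix aligned against nothing is one long gap.
    for j in range(1, n + 1):
        E[0][j] = H[0][j] = -(gap_init + (j - 1) * gap_ext)
    for i in range(1, m + 1):
        F[i][0] = H[i][0] = -(gap_init + (i - 1) * gap_ext)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            E[i][j] = max(E[i][j - 1] - gap_ext, H[i][j - 1] - gap_init)
            F[i][j] = max(F[i - 1][j] - gap_ext, H[i - 1][j] - gap_init)
            diag = H[i - 1][j - 1] + delta(q[i - 1], d[j - 1])
            H[i][j] = max(diag, E[i][j], F[i][j])
    return H[m][n]


def example_delta(a, b):
    # Illustrative scores only: +2 for equal letters, -1 otherwise.
    return 2 if a == b else -1


print(align_score("AG", "AG", example_delta, gap_init=3, gap_ext=1))  # 4
print(align_score("AG", "A", example_delta, gap_init=3, gap_ext=1))   # -1
```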
Parallel Computing
Three kinds of step lead into a cell: mismatch, insertion in Q, insertion in D
Find the max-value path from each location to the top-left corner
Parallel Computing
Cells on the same anti-diagonal can be computed in parallel
Each cell $(i,j)$ needs its own $\Delta(Q[i], D[j])$
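The anti-diagonal order can be written down directly: all cells with the same i + j depend only on cells with smaller i + j, so each group could be handed to parallel workers. A minimal sketch, assuming 1-based cell indices and a hypothetical helper name `antidiagonals`:

```python
def antidiagonals(m, n):
    """Group the cells (i, j), 1 <= i <= m, 1 <= j <= n, by anti-diagonal i + j."""
    waves = []
    for k in range(2, m + n + 1):
        lo = max(1, k - n)   # smallest i with j = k - i still <= n
        hi = min(m, k - 1)   # largest i with j = k - i still >= 1
        waves.append([(i, k - i) for i in range(lo, hi + 1)])
    return waves


for wave in antidiagonals(2, 3):
    print(wave)
# [(1, 1)]
# [(1, 2), (2, 1)]
# [(1, 3), (2, 2)]
# [(2, 3)]
```

Each cell in a wave reads a different Q[i] and a different D[j], which is why its Δ lookups land at scattered memory locations.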
SIMD?
Each cell on an anti-diagonal needs a different $\Delta(Q[i], D[j])$
The 4 parts go to different memory locations to pick up their values
Partition into chunks; each chunk should have proper alignment
One SIMD register can handle all values in one chunk
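Fix: Go Vertical
Partition each column into vertical chunks; each chunk should have proper
alignment
A column with $D[j] = A$ needs $\Delta(Q[0], A), \Delta(Q[1], A), \ldots, \Delta(Q[m], A)$
$\Delta(Q[*], A)$, $\Delta(Q[*], G)$, $\Delta(Q[*], C)$, $\Delta(Q[*], T)$ can be pre-computed and
chunked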
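The pre-computation can be sketched as a lookup table with one row per letter of the alphabet, each row holding Δ(Q[i], letter) for every i and then cut into register-sized chunks. The helper names `query_profile` and `chunked` are hypothetical; the Δ values are the ones from the score table (-1 on the diagonal, 2 elsewhere).

```python
def slide_delta(a, b):
    # Values from the score table: -1 for equal letters, 2 otherwise.
    return -1 if a == b else 2


def query_profile(q, delta, alphabet="AGCT"):
    """For each letter c, pre-compute the list delta(Q[i], c) over all i."""
    return {c: [delta(x, c) for x in q] for c in alphabet}


def chunked(values, lanes):
    """Cut a list into consecutive chunks of `lanes` values (one register each)."""
    assert len(values) % lanes == 0, "sketch assumes the length is a multiple of lanes"
    return [values[k:k + lanes] for k in range(0, len(values), lanes)]


profile = query_profile("GATTACAG", slide_delta)
print(chunked(profile["A"], 4))  # [[2, -1, 2, 2], [-1, 2, -1, 2]]
```

With this table, a column whose letter is A reads its Δ values from consecutive memory instead of four scattered locations.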
Problem 1: Boundaries
[Figure: the column for $A$, using $\Delta(Q[0], A), \Delta(Q[1], A), \ldots, \Delta(Q[m], A)$, cut into consecutive chunks x, y, z.]
Each chunk needs a shift
One value comes from the previous chunk
• Shift: x becomes x′
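The shift exists because the diagonal term for row i reads row i-1 of the previous column. With consecutive chunks, every chunk must therefore be moved down by one part, taking its first value from the chunk before it. A minimal sketch, assuming a hypothetical helper `shift_in` and a boundary value supplied for the very first row:

```python
def shift_in(chunk, carry):
    """Shift a chunk down by one part; `carry` enters at the top.

    Returns the shifted chunk and the value pushed out at the bottom,
    which becomes the carry for the next chunk.
    """
    return [carry] + chunk[:-1], chunk[-1]


column = [10, 11, 12, 13, 14, 15, 16, 17]   # previous column, rows 0..7
chunks = [column[0:4], column[4:8]]

carry = 0            # assumed boundary value above row 0
shifted = []
for chunk in chunks:
    moved, carry = shift_in(chunk, carry)   # one shift per chunk
    shifted.append(moved)

print(shifted)  # [[0, 10, 11, 12], [13, 14, 15, 16]]
```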
Solution 1: Striping
The x's belong to one chunk, the y's to another,
and so on
[Figure: the column for $A$, using $\Delta(Q[*], A)$, drawn as a stack of registers that each hold the parts (x, y, z); the shifted column holds (x′, y′, z′).]
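In the striped layout each register takes one value from every chunk, so register k holds rows k, k + t, k + 2t, and so on, where t is the chunk length. The diagonal input for register k is then simply register k - 1 of the previous column, and only register 0 needs a shift. The sketch below assumes the column length is a multiple of the number of parts; `striped` and `diagonal_inputs` are hypothetical names.

```python
def striped(values, lanes):
    """Register k holds rows k, k + t, k + 2t, ... with t = len(values) // lanes."""
    assert len(values) % lanes == 0, "sketch assumes no padding is needed"
    t = len(values) // lanes
    return [[values[k + lane * t] for lane in range(lanes)] for k in range(t)]


def diagonal_inputs(regs, boundary):
    """Registers holding row i-1 for every row i of the striped column `regs`."""
    first = [boundary] + regs[-1][:-1]   # the single shift per column
    return [first] + regs[:-1]           # all other registers are reused as-is


prev_column = [0, 1, 2, 3, 4, 5]         # values named after their row number
regs = striped(prev_column, lanes=3)
print(regs)                              # [[0, 2, 4], [1, 3, 5]]
print(diagonal_inputs(regs, boundary=-1))  # [[-1, 1, 3], [0, 2, 4]]
```

Compared with consecutive chunks, this replaces one shift per chunk with one shift per column.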
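Problem 2: Computing F
Sequential dependency: F in one part depends on the part above it in the same column
[Figure: registers (x′, y′, z′); F computed in x′ feeds y′, and F computed in y′ feeds z′.]
Repeat at most as many times as there are blocks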
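Problem 2: Computing F
Sequential dependency
[Figure: registers (x′, y′, z′), as before.]
Repeat only if $F_{z'} \ll 1$, less the gap-extension penalty,
has an entry greater than $H_{x'}$ less the gap-initiation penalty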
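The repeat-only-if-needed idea can be tried on a single column. The sketch below is an assumption-laden illustration, not the slides' own code: lists stand in for registers, `h_in` holds each row's value before F is taken into account, `f_top` is the F entering row 0, and the penalties are assumed to satisfy `init >= ext >= 0`. A first pass lets every part start with no incoming F; the correction loop then shifts F by one part and keeps going only while some entry of F still exceeds H minus the gap-initiation penalty. All function names are hypothetical.

```python
NEG = float("-inf")


def column_scalar(h_in, f_top, init, ext):
    """Reference: one top-to-bottom pass with the sequential F dependency."""
    out, f = [], f_top
    for v in h_in:
        h = max(v, f)
        out.append(h)
        f = max(f - ext, h - init)
    return out


def column_striped(h_in, f_top, init, ext, lanes):
    """Same result using striped registers and a lazy F correction loop."""
    seg = len(h_in) // lanes              # assumes len(h_in) % lanes == 0
    vh = [[h_in[k + lane * seg] for lane in range(lanes)] for k in range(seg)]
    vf = [f_top] + [NEG] * (lanes - 1)    # only part 0 knows its incoming F
    for k in range(seg):                  # first pass, all parts in lockstep
        vh[k] = [max(h, f) for h, f in zip(vh[k], vf)]
        vf = [max(f - ext, h - init) for h, f in zip(vh[k], vf)]
    vf = [NEG] + vf[:-1]                  # F << 1: hand F to the next part
    k = 0
    while any(f > h - init for h, f in zip(vh[k], vf)):
        vh[k] = [max(h, f) for h, f in zip(vh[k], vf)]
        vf = [f - ext for f in vf]
        k += 1
        if k == seg:                      # wrapped around: shift once more
            k = 0
            vf = [NEG] + vf[:-1]
    out = [None] * len(h_in)              # undo the striping
    for k in range(seg):
        for lane in range(lanes):
            out[k + lane * seg] = vh[k][lane]
    return out


h_in = [5, 0, 0, 0, 0, 0]
print(column_scalar(h_in, NEG, init=2, ext=1))            # [5, 3, 2, 1, 0, 0]
print(column_striped(h_in, NEG, init=2, ext=1, lanes=3))  # [5, 3, 2, 1, 0, 0]
```

The loop always stops: every wrap-around shifts one more empty entry into F, so after as many wraps as there are parts nothing is left to propagate.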
Thank You