Generalized Index-Set Splitting

Generalized Index-Set Splitting

Christopher Barton

Arie Tal

Bob Blainey

Jose Nelson Amaral

Generalized Index-Set Splitting

● Overview● Compile-time unknown split points● Loops with multiple split points● Removing “dead” inductive-branches● Nested inductive branches● Controlling Code Growth● Results● Conclusions

Index-Set Splitting

● Divide the index-set (range) of a loop into

sub-ranges● Each sub-range represents a separate loop

for (i=0; i < 100; i++) { if (i < 5) a[i] = 2 * a[i]; else a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

for (i=0; i < 5; i++) { a[i] = 2 * a[i]; b[i] = a[i] * a[i];}for (i=5; i < 100; i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

Compile-time Unknown Split

Points● When the lower-bound, upper-bound or split-

point are unknown at compile time:

for (i=0; i < 100; i++) { if (i < m) a[i] = 2 * a[i]; else a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

for (i=0; i < m; i++) { a[i] = 2 * a[i]; b[i] = a[i] * a[i];}for (i=m; i < 100; i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

Problem: m may be smaller than 0 or greater than 100!


Points● To overcome this problem, we will use the

following sub-ranges:– First loop: 0 – min(m, 100)– Second loop: max(m,0) - 100

for (i=0; i < min(m,100); i++) { a[i] = 2 * a[i]; b[i] = a[i] * a[i];}

for (i=max(m,0); i < 100; i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}



Points

m ≤ 0:

0 < m < 100

100 ≤ m:


second loop iterates from 0 to 100


first loop iterates from 0 to msecond loop iterates from m to 100



first loop iterates from 0 to 100



Points● In general:

– Given a range with lower bound lb, upper bound

ub, and split-point sp, we define two sub-ranges:● lb to min(sp, ub)● max(sp,lb) to ub

for (i=lb; i < min(sp,ub); i++) { a[i] = 2 * a[i]; b[i] = a[i] * a[i];}for (i=max(sp,lb); i < ub; i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

No other conditions are needed in addition to theloop structure

Loops With Multiple Split Points

● Multiple split points can be handled iteratively

● Instead, we build a sub-range tree:

for (i=0; i < 100; i++) { if (i < m) a[i] = 2 * a[i]; if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)

max(0,n),min(m,100) max(m,0),min(n,100)

max(m,n,0),100

if (i < m)if (i < m)

if (i < n)if (i < n)

Loops with multiple split points

● The first level in the sub-range tree

corresponds to the original loop range


0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

Loops with multiple split points

● The second level of the tree corresponds to

the two loops created to remove the (i < m)

inductive branchfor (i=0; i < min(m,100); i++) { a[i] = 2 * a[i]; if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}for (i=max(m,0); i < 100; i++) { if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

Removing Dead Inductive-

Branches● To easily remove dead inductive-branches,

we mark the edges in the Then/Else to

correspond with the conditionfor (i=max(0,n); i<min(m,100); i++) { if (i < m) a[i] = 2 * a[i]; if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

Removing Dead Inductive-

Branches● We examine the path that leads from the root

of the sub-range tree to the desired sub-

range nodefor (i=max(0,n); i<min(m,100); i++) { if (i < m) a[i] = 2 * a[i]; if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

Nested Inductive-branches

● Nested branches only split the nodes of the

ranges to which they apply

for (i=0; i < 100; i++) { if (i < m) a[i] = 2 * a[i]; else if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

max(m,0),min(n,100) max(m,n,0),100

Nested Inductive-branches

● Nested branches only split the nodes of the

ranges to which they apply● The result will be three loops:

for (i=max(m,n,0); i < 100; i++) { b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

max(m,0),min(n,100) max(m,n,0),100

for (i=0; i < min(m,100; i++) { a[i] = 2 * a[i]; b[i] = a[i] * a[i];}

for (i=max(m,0); i < min(n,100); i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

Controlling Code Growth

● We descend down the sub-range tree using

a greedy breadth-first approach


0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

Controlling Code Growth

● We stop when we reach all the leaf nodes, or

the code growth limit● The selected nodes designate the loops that

will be generatedfor (i=0; i<min(m,100); i++) { a[i] = 2 * a[i]; if (i < n) a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

0,100

0,min(m,100) max(m,0), 100

0,min(m,n,100)


max(m,n,0),100

for (i=max(m,n,0); i<100; i++) { b[i] = a[i] * a[i];}

for (i=max(m,0); i<min(n,100); i++) { a[i] = 5 * a[i]; b[i] = a[i] * a[i];}

ISS Opportunities in SPEC2000

0

5

10

15

20

25

Number of ISS Opportunities

bzip2craftyeon gapgzip

perlbmktwolf vpr

appluapsiequakefma3dgalgellucas

sixtrack

Benchmarks

No Loop Fusion

Loop Fusion

Compilation Times for SPEC2000

-3-2-1012345678

Compilation Time Increase (%)

gziplucasbzip2equake

vprcrafty

apsitwolfgalgelappluperlbmk

eon gap

sixtrackfma3d

Benchmarks

Runtimes for SPEC2000

-3-2.5

-2-1.5

-1-0.5

00.5

11.5

2

Runtime Reduction (%)

lucasbzip2equake

vprcrafty

apsi twolfgalgelappluperlbmk

gap

sixtrackfma3d

Benchmark

Conclusions

● Presented a technique to remove loop-variant

branches from loops by splitting the original

loop into several loops with varying ranges● Implemented in IBM’s XL Compiler suite and

showed ISS opportunities in benchmark suites● Exploring ways ISS can be used to improve

effectiveness of other optimizations

Documents

Generalized Index-Set Splitting