Wish Branches

A Review of “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”

Russell Dodd - October 24, 2006



Motivation

Traditional Execution
  Control dependencies can cause delays or stalls.
  Hard-to-predict branches incur performance penalties.

Predicated Execution
  Converts control dependencies in branches into data dependencies.
  Typical hardware cannot override the compiler’s choice of branch type.
  If easy-to-predict branches are predicated unnecessarily, the result is processing overhead and possibly a performance loss.
  Cannot remove backward branches.
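
The conversion from a control dependency to a data dependency can be sketched at the source level. This is only an illustration: the function names are made up for this example, and a real compiler performs if-conversion on machine instructions with predicate registers, not on C expressions.

```c
#include <assert.h>

/* Branching form: which assignment runs depends on the branch outcome
   (a control dependency). */
int pick_with_branch(int cond, int b, int c)
{
    int a;
    if (cond) {
        a = b;
    } else {
        a = c;
    }
    return a;
}

/* If-converted form: no branch in the source. Both candidate values are
   available and the predicate p selects one (a data dependency on p). */
int pick_with_predicate(int cond, int b, int c)
{
    int p = (cond != 0);          /* predicate: 1 if true, 0 if false */
    return p * b + (1 - p) * c;   /* one term is always multiplied by 0 */
}

/* The two forms should agree for every input; spot-check a few. */
void check_forms_agree(void)
{
    for (int cond = -1; cond <= 1; cond++) {
        assert(pick_with_branch(cond, 10, 20) ==
               pick_with_predicate(cond, 10, 20));
    }
}
```

The predicated form always does the work for both sides, which is the overhead the slide refers to when an easy-to-predict branch is predicated unnecessarily.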

Russell Dodd - October 24, 2006 Adapted from H. Kim et al

Proposed Solution

The compiler generates code that can execute in two ways: as predicated code or as non-predicated (branching) code.

The code is the same as predicated code, except that the conditional branches are kept rather than removed.

These conditional branches are called wish branches.


Proposed Solution

Wish Branches
  When a wish branch is fetched, the hardware uses a confidence estimator to judge at runtime how hard the branch is to predict.
  If the branch is hard to predict, the hardware executes the predicated version of the code to avoid a misprediction.
  Otherwise, the hardware uses the branch predictor and ignores the predicate information.
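
A minimal sketch of this runtime decision, assuming a simple saturating counter as the confidence estimator. The counter, its limits (`CONF_MAX`, `CONF_THRESHOLD`), and every name below are illustrative assumptions, not the estimator evaluated by H. Kim et al.

```c
#include <assert.h>

/* Hypothetical tuning values, chosen only for illustration. */
enum { CONF_MAX = 15, CONF_THRESHOLD = 8 };

typedef enum { MODE_LOW_CONFIDENCE, MODE_HIGH_CONFIDENCE } wish_mode;

/* Per-branch state: number of consecutive correct predictions,
   saturating at CONF_MAX. */
typedef struct {
    int counter;
} confidence_entry;

/* Train the estimator with the outcome of one resolved prediction. */
void confidence_update(confidence_entry *e, int prediction_was_correct)
{
    assert(e != 0);
    if (prediction_was_correct) {
        if (e->counter < CONF_MAX) {
            e->counter++;
        }
    } else {
        e->counter = 0;   /* a misprediction resets the streak */
    }
}

/* Decide how a fetched wish branch is treated:
   high confidence -> use the branch predictor,
   low confidence  -> execute the predicated version. */
wish_mode wish_branch_mode(const confidence_entry *e)
{
    assert(e != 0);
    return e->counter >= CONF_THRESHOLD ? MODE_HIGH_CONFIDENCE
                                        : MODE_LOW_CONFIDENCE;
}
```

A branch that has just been mispredicted falls back to low-confidence mode and is treated as predicated code until it builds up a new streak of correct predictions.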


Comparison

[Figure 1 from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”. Panel label: C-code. Slide annotations: Condition==FALSE, Low confidence mode, High confidence mode, Not used.]

Consequences

If the confidence estimate is accurate
  The processor either avoids the overhead of predicated execution or eliminates a branch misprediction.

If the wish branch is mispredicted
  In high-confidence mode, the pipeline must be flushed.
  In low-confidence mode, the processor does not flush the pipeline; the code executes as conventional predicated code.
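
The cases above can be written as a toy cost model. Everything here is an assumption made for illustration: the function name, the idea of charging a fixed number of cycles, and the two cost parameters do not come from the paper.

```c
#include <assert.h>

typedef enum { LOW_CONFIDENCE, HIGH_CONFIDENCE } wish_mode;

/* Extra cycles charged to one dynamic wish branch.
   flush_cycles:         hypothetical cost of a pipeline flush
   predication_overhead: hypothetical cost of executing both paths */
int wish_branch_penalty(wish_mode mode, int mispredicted,
                        int flush_cycles, int predication_overhead)
{
    assert(flush_cycles >= 0 && predication_overhead >= 0);

    if (mode == HIGH_CONFIDENCE) {
        /* Treated like a normal branch: free when the prediction is
           right, a full flush when it is wrong. */
        return mispredicted ? flush_cycles : 0;
    }
    /* Low confidence: runs as predicated code, so the prediction outcome
       does not matter and no flush happens. */
    return predication_overhead;
}
```

With a hypothetical 30-cycle flush and a 4-cycle predication overhead, a mispredicted wish branch costs 30 cycles in high-confidence mode but only 4 in low-confidence mode, which is why the accuracy of the confidence estimate matters.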


Wish Loops

  Used for backward branches, which predication cannot remove directly.
  Use predication to reduce the branch misprediction penalty without removing the branch.


Comparison

[Figure 2 from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”. Panel label: C-code. Slide annotation: Extra instruction to initialize predicate.]

A Closer Look at the Wish Loop in Low Confidence

C-Code

do {
    a++;
    i++;
} while (i < 3);

Trace of the correct execution:

  a = 1, i = 1, i < 3
  a = 2, i = 2, i < 3
  a = 3, i = 3, i !< 3 exit

What if a misprediction occurs during the loop?

Three cases can occur:

1 Early Exit
  The loop iterates fewer times than it should.
  The processor needs to execute the body one more time.
  Requires a pipeline flush, just like a normal mispredicted branch.
  [Trace annotation: Early stop here]

2 Late Exit
  The loop iterates a few more times than it should.
  The extra blocks are allowed to go through the datapath as no-ops.
  Performs better than a normal backward branch because the misprediction penalty is reduced.
  [Trace annotation: one extra block after the correct exit (a = 4, i = 4, i !< 3 exit), marked Late stop here]
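
A small simulation of the late-exit case, assuming the wish loop body is guarded by a predicate that is set to 1 before the loop. The starting values a = 0 and i = 0 are inferred from the trace, the `if (p)` stands in for hardware predication, and all names are made up for this sketch.

```c
#include <assert.h>

typedef struct {
    int a;
    int i;
} loop_state;

/* Execute the do-while body as many times as the front end fetched it.
   Each fetched body is guarded by predicate p, so bodies fetched after
   the real exit have no architectural effect. */
loop_state run_predicated_loop(int fetched_iterations, int limit)
{
    assert(fetched_iterations >= 1);   /* a do-while body runs at least once */

    loop_state s = { 0, 0 };
    int p = 1;                         /* extra instruction: initialize predicate */

    for (int n = 0; n < fetched_iterations; n++) {
        if (p) {                       /* models (p) a++; (p) i++; */
            s.a++;
            s.i++;
            p = (s.i < limit);         /* models (p) p = (i < limit) */
        }
        /* when p is 0 the fetched body acts as a no-op */
    }
    return s;
}
```

Fetching four bodies for the loop above (`run_predicated_loop(4, 3)`) leaves a = 3 and i = 3, the same result as the correct three iterations, so no flush is needed. Fetching only two bodies leaves a = 2, which is the early-exit case that still requires a flush.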

3 No Exit
  Flush any extra blocks and continue, as with a normal mispredicted branch.
  Treating the extra blocks as no-ops, as in the late-exit case, would waste energy.
  [Trace annotation: extra blocks continue after the correct exit (a = N, i = N, i !< 3 exit), marked No exit]
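
The three cases can be summarized as a small classifier. This is a sketch under stated assumptions: the names are hypothetical, and the `front_end_exited` flag is one way to model the difference between late exit and no exit (whether the front end has already left the loop when the misprediction is detected).

```c
#include <assert.h>

typedef enum {
    LOOP_PREDICTED_CORRECTLY,
    LOOP_EARLY_EXIT,
    LOOP_LATE_EXIT,
    LOOP_NO_EXIT
} wish_loop_outcome;

/* fetched:          loop bodies fetched so far by the front end
   actual:           iterations the program really needs
   front_end_exited: nonzero if the front end already left the loop */
wish_loop_outcome classify_wish_loop(int fetched, int actual,
                                     int front_end_exited)
{
    assert(fetched >= 0 && actual >= 0);

    if (fetched > actual) {
        /* Overshoot: harmless if the front end already moved on,
           otherwise it is still fetching useless loop bodies. */
        return front_end_exited ? LOOP_LATE_EXIT : LOOP_NO_EXIT;
    }
    if (fetched < actual && front_end_exited) {
        return LOOP_EARLY_EXIT;        /* left the loop too soon */
    }
    return LOOP_PREDICTED_CORRECTLY;
}

/* Early exit and no exit need a pipeline flush; late exit does not. */
int wish_loop_needs_flush(wish_loop_outcome outcome)
{
    return outcome == LOOP_EARLY_EXIT || outcome == LOOP_NO_EXIT;
}
```

For the three-iteration loop in the trace, `classify_wish_loop(4, 3, 1)` is the late-exit case and is the only misprediction that avoids a flush.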


Wish Branches in Complex Control Flow

[Figure 3 from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”. Panel labels: C-code, Normal, Predicated, Wish Branch.]

Wish Branch Support

ISA
  Can be implemented with existing branch instructions.
  Requires two hint bits to encode four choices (normal branch, wish jump, wish join, and wish loop).

Compiler
  Needs to decide which branches to convert to wish branches.

Hardware
  Requires an accurate confidence estimator.
  Requires changes to the processor front end and the branch misprediction detection module.
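
Two hint bits are enough for the four branch types. The sketch below assumes a 32-bit instruction word with the hint in its two lowest bits and an arbitrary numbering of the four types; both are hypothetical choices for illustration and do not describe the IA-64 encoding.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical numbering of the four choices. */
typedef enum {
    BR_NORMAL    = 0,
    BR_WISH_JUMP = 1,
    BR_WISH_JOIN = 2,
    BR_WISH_LOOP = 3
} branch_kind;

#define HINT_MASK 0x3u   /* assumed position: the two lowest bits */

/* Return the instruction word with its hint bits set to `kind`;
   all other bits are left unchanged. */
uint32_t set_branch_hint(uint32_t insn, branch_kind kind)
{
    assert((unsigned)kind <= 3u);
    return (insn & ~(uint32_t)HINT_MASK) | (uint32_t)kind;
}

/* Decode the hint bits of an instruction word. */
branch_kind get_branch_hint(uint32_t insn)
{
    return (branch_kind)(insn & HINT_MASK);
}
```

Because the hint lives inside an existing branch instruction, hardware without wish branch support could ignore the two bits and treat every such instruction as a normal branch.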


Advantages

Dynamically eliminates the performance overhead of predicated execution.

Allows compilers to generate code more aggressively, since the processor corrects the decisions at runtime.

Exploits predicated execution to reduce branch misprediction penalties for backward branches.

Makes code more adaptable across machines, since each processor decides dynamically when not to use predication.


Disadvantages

Requires extra branch instructions.
  Increases contention for branch predictor table entries.
  Increases interference in pattern history tables.

Reduces the size of basic blocks by adding control dependencies.
  Reduces the scope of optimization.


Performance Evaluation

Implemented in the IA-64 ISA since it has full support for predication.

Used a superscalar out-of-order processor model.


An excerpt from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution” pg 54.

Results

[Figure 4 from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”. Slide annotation: Creates large skew on average.]

[Table 1 from “Wish Branches: Enabling Adaptive and Aggressive Predicated Execution”.]

Analysis

Wish branches improve performance because they divide the work between the compiler and the microarchitecture.
  Compilers excel at analyzing control-flow graphs and generating predicated code.
  The microarchitecture excels at making runtime decisions, such as whether to use predicated execution or branch prediction, based on dynamic program information.


Future Directions

Develop compiler algorithms to decide which branches to convert to wish branches.
  Combat the reduced scope for code optimization.

Develop more accurate confidence estimation mechanisms.

Develop a hardware wish loop predictor.


My Thoughts

Branches make up a large portion of code (~20%), so they are worth the effort to improve.

No mention of code size or instruction count.
  A dynamic micro-instruction count is given in a previous paper, but with no comparison to other styles.

Branch Predicting vs. Wish Branching
  Hypothetically, if a branch predictor can be improved so that it is accurate most of the time at runtime, is it worthwhile to invest in dynamically choosing the branch type (predicated or normal) for a further improvement?


My Thoughts

More hardware isn’t that big of a deal in a billion-transistor world.
  A previous paper details the hardware requirements and their complexity: H. Kim et al., “Wish Branches: Combining Conditional Branching and Predication for Adaptive Predicated Execution”.

Software would not require a large investment, since it is based on existing predicated execution.
  However, compiler support for optimization heuristics could be expensive. How much?

The improvement in execution time is a really interesting development for future processors.

How far could prediction be nested?
  Predict the confidence estimator’s results...


END