Branch Mispredictions in Quicksort

K. Kaligosi (Max-Planck-Inst., Germany)
C. Martínez (Univ. Politècnica de Catalunya, Spain)
P. Sanders (Univ. Karlsruhe, Germany)

AofA 2006, Alden Biesen, Belgium

Introduction

• Modern hardware executes several sequential instructions in a pipelined fashion
• Jump instructions pose a major challenge!
• So we try to predict which branch will be taken ...
• Branch mispredictions are expensive: we have to roll back the pipeline

• In comparison-based algorithms, we want comparisons to yield as much information as possible ⇒ difficult to predict!
• In static branch prediction, jump instructions are statically predicted as TAKEN or NOT TAKEN
• In dynamic branch prediction, the hardware predicts what to do during execution, taking the past into account
  • 1-bit: We predict the instruction will take the same direction it took the last time it was executed
  • 2-bit: We must be wrong twice before we change the prediction
  • ...

2-bit Predictor

[Figure: state diagram of the 2-bit predictor with four states, 0 to 3. States 0 and 1 predict NOT TAKEN (PNT), states 2 and 3 predict TAKEN (PT); transitions are labeled T (branch taken) and NT (branch not taken).]
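The two dynamic schemes can be simulated in a few lines. The sketch below is illustrative only: it assumes the usual 2-bit saturating counter (states 0 and 1 predict not taken, states 2 and 3 predict taken, one step up or down per outcome), and the function names and initial states are choices of this example, not part of the talk.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Count mispredictions of a 1-bit predictor on a sequence of branch
// outcomes (true = taken). The predictor repeats the last observed outcome.
// The initial prediction is an assumption of this sketch.
std::size_t mispredictions_1bit(const std::vector<bool>& outcomes,
                                bool initial_prediction = false) {
    bool prediction = initial_prediction;
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        if (taken != prediction) ++misses;
        prediction = taken;  // remember only the last direction
    }
    return misses;
}

// Count mispredictions of a 2-bit saturating counter with states 0..3.
// States 0 and 1 predict not taken, states 2 and 3 predict taken
// (assumed transition rule: +1 on taken, -1 on not taken, saturating).
std::size_t mispredictions_2bit(const std::vector<bool>& outcomes,
                                int initial_state = 0) {
    assert(initial_state >= 0 && initial_state <= 3);
    int state = initial_state;
    std::size_t misses = 0;
    for (bool taken : outcomes) {
        bool prediction = (state >= 2);
        if (taken != prediction) ++misses;
        if (taken) { if (state < 3) ++state; }
        else       { if (state > 0) --state; }
    }
    return misses;
}
```

On a loop-like pattern (taken three times, then not taken, repeated twice) the 1-bit scheme started at "taken" misses three times, while the 2-bit counter started in state 3 misses only at the two loop exits.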

Partition

#include <utility>
#include <vector>

// Partition A[i..j] around the pivot, which has already been placed in A[i].
// On return, k holds the final position of the pivot.
// Signature assumed for illustration. Loop S relies on a sentinel
// A[j+1] >= pivot (or on some element >= pivot inside A[i+1..j]).
template <typename Elem>
void partition(std::vector<Elem>& A, int i, int j, int& k) {
    int l = i; int u = j + 1; Elem pv = A[i];
    for ( ; ; ) {
        do ++l; while (A[l] < pv); // Loop S: skip elements smaller than the pivot
        do --u; while (A[u] > pv); // Loop G: skip elements greater than the pivot
        if (l >= u) break;
        std::swap(A[l], A[u]);
    }
    std::swap(A[i], A[u]); k = u;
}

Setting up the Recurrences

• Probability that the chosen pivot is the $k$th smallest element out of the $n$: $\pi_{n,k}$
• Average number of branch mispredictions when partitioning an array of size $n$ and the pivot is the $k$th: $b_{n,k}$
• Average number of branch mispredictions when partitioning an array of size $n$:

$$b_n = \sum_{1 \le k \le n} \pi_{n,k} \cdot b_{n,k}$$

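• Average number of branch mispredictions $B_n$ to sort $n$ elements:

$$B_n = b_n + \sum_{k=1}^{n} \pi_{n,k} \cdot (B_{k-1} + B_{n-k})$$

• We will later consider the total cost $T_n$, which satisfies the same recurrence with toll function

$$t_n = n + \xi \cdot b_n + o(n),$$

where $\xi$ is the cost of a branch misprediction relative to that of a comparison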

Sampling

• It is well known that using samples to select the pivot of each recursive stage improves the average performance of quicksort and reduces the probability of worst-case behavior
• For quicksort with samples of size $s$ from which we pick the $(p+1)$th element as the pivot, we have

$$\pi_{n,k} = \frac{\binom{k-1}{p}\binom{n-k}{s-1-p}}{\binom{n}{s}}$$
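The pivot-rank distribution above is easy to evaluate numerically. The following is a minimal sketch, assuming the formula C(k-1, p) · C(n-k, s-1-p) / C(n, s); the helper names `log_binomial` and `pivot_rank_probability` are hypothetical, and logarithms of factorials are used only to avoid overflow for large n.

```cpp
#include <cassert>
#include <cmath>

// ln C(n, k), computed with lgamma; -infinity when C(n, k) = 0.
double log_binomial(int n, int k) {
    if (k < 0 || k > n) return -INFINITY;
    return std::lgamma(n + 1.0) - std::lgamma(k + 1.0)
         - std::lgamma(n - k + 1.0);
}

// Probability that the (p+1)th smallest element of a random sample of
// size s has rank k among all n elements.
double pivot_rank_probability(int n, int k, int s, int p) {
    assert(1 <= s && s <= n && 0 <= p && p < s && 1 <= k && k <= n);
    double log_pi = log_binomial(k - 1, p)
                  + log_binomial(n - k, s - 1 - p)
                  - log_binomial(n, s);
    return std::exp(log_pi);  // exp(-infinity) is 0 for impossible ranks
}
```

With s = 1 and p = 0 this reduces to the uniform distribution 1/n of plain quicksort, and for any valid (s, p) the probabilities sum to 1 over k = 1, ..., n.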

• A typical case is to pick the median of the sample, with $s = 2t+1$ and $p = t$
• We can use variable-size samples with $s = s(n)$; then $s \to \infty$ as $n \to \infty$, but $s$ must grow sublinearly, $s = o(n)$; we use $\psi$ to denote the relative rank of the pivot within the sample ⇒ e.g., $\psi = 1/2$ means choosing the median of the sample

General results

Theorem

The average number of branch mispredictions to sort $n$ elements with quicksort using samples of size $s$ and choosing the $(p+1)$th element in the sample of each stage is

$$B_n = \frac{\tau(s,p)}{H(s,p)}\, n \ln n + O(n),$$

where

$$H(s,p) = H_{s+1} - \frac{p+1}{s+1} H_{p+1} - \frac{s-p}{s+1} H_{s-p}$$

and

$$\tau(s,p) = \lim_{n\to\infty} \frac{b_n}{n} = \lim_{n\to\infty} \frac{1}{n} \sum_{1 \le k \le n} \pi^{(s,p)}_{n,k}\, b_{n,k}$$
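The denominator H(s, p) of the theorem can be evaluated directly. This is a small sketch, assuming that H_m denotes the m-th harmonic number (the usual convention) and that the limiting form for large samples is H(x) = -(x ln x + (1-x) ln(1-x)); the function names are hypothetical.

```cpp
#include <cassert>
#include <cmath>

// m-th harmonic number H_m = 1 + 1/2 + ... + 1/m, with H_0 = 0.
double harmonic(int m) {
    double h = 0.0;
    for (int i = 1; i <= m; ++i) h += 1.0 / i;
    return h;
}

// H(s,p) = H_{s+1} - (p+1)/(s+1) * H_{p+1} - (s-p)/(s+1) * H_{s-p}
double H_fixed(int s, int p) {
    assert(s >= 1 && 0 <= p && p < s);
    return harmonic(s + 1)
         - (p + 1.0) / (s + 1.0) * harmonic(p + 1)
         - (s - p + 0.0) / (s + 1.0) * harmonic(s - p);
}

// Limiting form H(x) = -(x ln x + (1-x) ln(1-x)) for 0 < x < 1.
double H_limit(double x) {
    assert(x > 0.0 && x < 1.0);
    return -(x * std::log(x) + (1.0 - x) * std::log(1.0 - x));
}
```

For plain quicksort H(1, 0) = 1/2, for median-of-3 H(3, 1) = 7/12, and for the median of a large sample H(s, p) approaches H(1/2) = ln 2.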

Theorem

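For variable-sized sampling, if $s \to \infty$ as $n \to \infty$ with $s = o(n)$, and $p/s \to \psi$, then

$$B_n = \frac{\tau(\psi)}{H(\psi)}\, n \ln n + o(n \log n),$$

with $\tau(\psi) = \lim_{n\to\infty} \tau(s, \psi \cdot s + o(s))$ and

$$H(x) = -(x \ln x + (1-x) \ln(1-x))$$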

Theorem

The total cost $T_n$ of quicksort is given by

$$T_n = \frac{1 + \xi \cdot \tau(s,p)}{H(s,p)}\, n \ln n + O(n), \qquad s = \Theta(1),$$

and

$$T_n = \frac{1 + \xi \cdot \tau(\psi)}{H(\psi)}\, n \ln n + o(n \log n), \qquad s = \omega(1),\ s = o(n)$$

• In order to compute $\tau(s,p)$, we can use, under suitable conditions,

$$\tau(s,p) = \frac{s!}{p!\,(s-1-p)!} \int_0^1 x^p (1-x)^{s-1-p}\, b(x)\, dx$$

with

$$b(x) = \lim_{n\to\infty} \frac{b_{n,\,x \cdot n}}{n}$$

• Computing $\tau(\psi)$ is easier!

$$\tau(\psi) = b(\psi)$$
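When no closed form is at hand, the integral for the fixed-sample constant can be approximated numerically. The sketch below uses a plain composite midpoint rule; the function name `tau_fixed`, the number of intervals and the choice of quadrature are assumptions of this example, and any limiting cost function b(x) can be passed in.

```cpp
#include <cassert>
#include <cmath>
#include <functional>

// Approximates s!/(p!(s-1-p)!) * Integral_0^1 x^p (1-x)^(s-1-p) b(x) dx
// with the composite midpoint rule on `intervals` equal subintervals.
double tau_fixed(int s, int p, const std::function<double(double)>& b,
                 int intervals = 200000) {
    assert(s >= 1 && 0 <= p && p < s && intervals > 0);
    // ln of the normalizing constant s!/(p!(s-1-p)!)
    double log_norm = std::lgamma(s + 1.0) - std::lgamma(p + 1.0)
                    - std::lgamma(s - p + 0.0);
    double h = 1.0 / intervals;
    double sum = 0.0;
    for (int i = 0; i < intervals; ++i) {
        double x = (i + 0.5) * h;  // midpoint of the i-th subinterval
        sum += std::pow(x, p) * std::pow(1.0 - x, s - 1 - p) * b(x);
    }
    return std::exp(log_norm) * sum * h;
}
```

As a check, passing b(x) = 2x(1-x) gives values close to 1/3 for (s, p) = (1, 0) and 2/5 for (s, p) = (3, 1).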

• The optimal value $\psi^*$ for $\psi$ minimizes the total cost, i.e., minimizes

$$\phi_\xi(\psi) = \frac{1 + \xi \cdot \tau(\psi)}{H(\psi)}$$

and depends on $\xi$
• It's not difficult to prove that for any $s$ and $p$,

$$\frac{\tau(s,p)}{H(s,p)} > \frac{\tau(\psi^*)}{H(\psi^*)}$$

• In general, there exists a threshold value $\xi_c$ such that if $\xi \le \xi_c$ (branch mispredictions are not too expensive) then we have to take the median of the samples, i.e., $\psi^* = 1/2$
• If $\xi > \xi_c$ (that can happen often in practice!) then $\psi^* < 1/2$ and it is given by the unique solution in $[0, 1/2)$ of the equation

$$\xi \cdot b'(\psi)\, H(\psi) = (1 + \xi \cdot b(\psi))\, H'(\psi)$$

(provided that $b(x)$ is in $C^2[0, 1/2)$)

• The threshold value $\xi_c$ is the solution of

$$\left.\frac{d^2 \phi_\xi(\psi)}{d\psi^2}\right|_{\psi = 1/2} = 0$$

• That is,

$$\xi_c = -\frac{4}{b''(1/2) \ln 2 + 4\, b(1/2)}$$

Static branch prediction

• We analyze here optimal prediction: if the position of the pivot $k \le n/2$ then we predict Loop S not taken and Loop G taken, and the other way around otherwise
• If $k \le n/2$ we incur a branch misprediction every time there is an element which is smaller than the pivot, that is, $k-1$ times; symmetrically, if $k > n/2$ then the number of branch mispredictions is $n-k$
• Hence, $b_{n,k} = \min(k-1, n-k)$, $b(\psi) = \min(\psi, 1-\psi)$ and

$$\phi_\xi(\psi) = \frac{1 + \xi \cdot \min(\psi, 1-\psi)}{H(\psi)}$$
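The optimal relative rank for static prediction can be located numerically by minimizing the cost (1 + ξ · min(ψ, 1-ψ)) / H(ψ), where ξ is the relative cost of a misprediction and ψ the relative rank of the pivot. The sketch below uses a plain grid search over (0, 1/2]; the function names, the grid size and the search method are assumptions of this example rather than the method used in the talk.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// H(x) = -(x ln x + (1-x) ln(1-x)) for 0 < x < 1.
double entropy(double x) {
    return -(x * std::log(x) + (1.0 - x) * std::log(1.0 - x));
}

// Cost coefficient under static prediction, with b(psi) = min(psi, 1 - psi).
double static_cost(double xi, double psi) {
    return (1.0 + xi * std::min(psi, 1.0 - psi)) / entropy(psi);
}

// Grid search for the minimizer of static_cost over (0, 1/2].
// By symmetry it is enough to look at the left half of the interval.
double optimal_psi_static(double xi, int steps = 100000) {
    assert(xi >= 0.0 && steps >= 2);
    double best_psi = 0.5;
    double best_cost = static_cost(xi, 0.5);
    for (int i = 1; i < steps; ++i) {
        double psi = 0.5 * i / steps;
        double cost = static_cost(xi, psi);
        if (cost < best_cost) { best_cost = cost; best_psi = psi; }
    }
    return best_psi;
}
```

With ξ = 0 the search returns the median, 1/2; with ξ = 10 it returns a value between 0.15 and 0.2, and larger ξ pushes the optimum further away from the median.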

[Figure: the value of $\psi^*$ as a function of $\xi$ under static branch prediction, for $\xi$ from 0 to 30]

1-bit branch prediction

• The number of branch mispredictions is twice the number of exchanges: we incur a misprediction each time we abandon the loops S and G
• Hence, $b_{n,k} \approx 2(k-1)(n-k)/n$ and $b(\psi) = 2\psi(1-\psi)$

• We can analyze in full detail the performance when using fixed-size samples. For example, for median-of-$(2t+1)$ we have

$$\tau(2t+1, t) = \frac{t+1}{2t+3}$$

• For variable-size samples, $\tau(\psi) = 2\psi(1-\psi)$.
• The threshold is then at $\xi_c = 2/(2\ln 2 - 1) \approx 5.177\ldots$ and, for $\xi > \xi_c$, $\psi^*$ is the solution in $[0, 1/2)$ of

$$\ln\psi + 2\xi\psi^2 \ln\psi = \ln(1-\psi) + 2\xi(1-\psi)^2 \ln(1-\psi)$$
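Both closed forms for the 1-bit scheme can be checked against the general results, assuming $b(x) = 2x(1-x)$, so that $b(1/2) = 1/2$ and $b''(1/2) = -4$. The first line uses the Beta integral $\int_0^1 x^{a}(1-x)^{c}\,dx = a!\,c!/(a+c+1)!$; the second plugs into the threshold formula $\xi_c = -4/(b''(1/2)\ln 2 + 4\,b(1/2))$.

```latex
\begin{align*}
\tau(2t+1,t) &= \frac{(2t+1)!}{t!\,t!}\int_0^1 x^t(1-x)^t \cdot 2x(1-x)\,dx
   = \frac{2\,(2t+1)!}{(t!)^2}\cdot\frac{\bigl((t+1)!\bigr)^2}{(2t+3)!} \\
 &= \frac{2(t+1)^2}{(2t+2)(2t+3)} = \frac{t+1}{2t+3}, \\[4pt]
\xi_c &= -\frac{4}{b''(1/2)\ln 2 + 4\,b(1/2)}
   = -\frac{4}{-4\ln 2 + 2}
   = \frac{2}{2\ln 2 - 1} \approx 5.177 .
\end{align*}
```

As $t \to \infty$ the first expression tends to $1/2 = b(1/2)$, in agreement with the variable-size result for the median.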

[Figure: the value of $\psi^*$ as a function of $\xi$ under 1-bit branch prediction, for $\xi$ from 0 to 30]

2-bit branch prediction

• In (Kaligosi, Sanders, 2006), an approximate model to compute $b_{n,k}$ is given, from which

$$b(x) = \frac{2x^4 - 4x^3 + x^2 + x}{1 - x(1-x)}$$

follows
• We are working on a more refined analysis of $b_{n,k}$ for this prediction scheme; once $b_{n,k}$ has been found, we should only have to apply the machinery shown here

Some real data

[Figure: Time vs. size on a Pentium 4 (from (Kaligosi, Sanders, 2006)). Vertical axis: time / (n lg n) in ns; horizontal axis: lg n, from 10 to 26. Curves: random pivot, median of 3, exact median, skewed pivot n/10.]

[Figure: Time vs. $1/\psi$ on a Pentium 4. Vertical axis: time / (n lg n) in ns; horizontal axis (labeled 1/α on the plot): from 2 to 18. Curves: n = 2^12, n = 2^19, n = 2^26.]

[Figure: Time vs. size on an Athlon 64. Vertical axis: time / (n lg n) in ns; horizontal axis: lg n, from 10 to 26. Curves: random pivot, median of 3, exact median, skewed pivot n/10.]

[Figure: Time vs. size on an Opteron. Vertical axis: time / (n lg n) in ns; horizontal axis: lg n, from 10 to 26. Curves: random pivot, median of 3, exact median, skewed pivot n/10.]

[Figure: Time vs. size on a Sun. Vertical axis: time / (n lg n) in ns; horizontal axis: lg n, from 10 to 24. Curves: random pivot, median of 3, exact median, skewed pivot n/10.]

Future work

• Complete the analysis of static branch prediction with fixed-size samples (it's not easy to obtain $\tau(s,p)$ for general $s$ and $p$!)
• Analyze the 2-bit prediction scheme and possibly others
• Conduct additional experiments, compare theoretical analysis to real data
• Analyze branch mispredictions and their impact on the performance of other algorithms