39
Compensated Algorithms Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´ e de Perpignan http://webdali.univ-perp.fr Compensated Algorithms – N. Louvet April 18, 2007 1 / 33

Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Embed Size (px)

Citation preview

Page 1: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated Algorithms

Philippe Langlois, Nicolas Louvet

Equipe DALI, Universite de Perpignan

http://webdali.univ-perp.fr

Compensated Algorithms – N. Louvet April 18, 2007 1 / 33

Page 2: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Motivation

Results computed in floating point arithmetic are possibly corrupted by

rounding errors.

Compensated algorithms = algorithms that correct the rounding errors

generated during the computation:

If r is a computed result, how to find a correcting term c such that

r = r ⊕ c is more accurate than r ?

Aim of this presentation is:

I to recall the principle of so-called compensated algorithms,

I to present some details about the compensated Horner algorithm 1.

Context: IEEE-754 fp arithmetique, rounding to the nearest, no underflow.

1S. Graillat, Ph. Langlois, N. Louvet. Compensated Horner Scheme, Research Report

DALI-LP2A, Universite de Perpignan, aug. 2005 (submitted)Compensated Algorithms – N. Louvet April 18, 2007 2 / 33

Page 3: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Outline

1 Introduction

2 Principle of Compensated Algorithms

3 Sketch of proof for the compensated Horner algorithm

4 Conclusion

5 More slides

Faithful rounding with the CHS

Practical efficiency

Compensated Algorithms – N. Louvet April 18, 2007 3 / 33

Page 4: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Introduction

Why do we need compensated algorithms ?

Compensated Algorithms – N. Louvet April 18, 2007 4 / 33

Page 5: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Backward stable algorithms vs. condition number

Backward stable algorithms:

The accuracy of the computed solution satisfies

accuracy . condition number× u,

where

I u is the computing precision:

IEEE-754 double, 53-bits mantissa, rounding to the nearest ⇒ u = 2−53.

1 1+

u

I the condition number quantify the difficulty to solve the problem accuratly.

Examples: summation, dot product, Horner algorithm, substitution for

triangular system solving.

Compensated Algorithms – N. Louvet April 18, 2007 5 / 33

Page 6: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Accuracy of the Horner scheme

We consider the polynomial

p(x) =n∑

i=0

aixi ,

with ai ∈ F, x ∈ F

Algorithm (Horner scheme)function r0 = Horner (p, x)

rn = an

for i = n − 1 : −1 : 0

r i = r i+1 ⊗ x ⊕ ai

end

Relative accuracy of the evaluation with the Horner scheme:

|Horner (p, x)− p(x)||p(x)|

≤ γ2n︸︷︷︸≈2nu

cond(p, x)

u is the computing precision

cond(p, x) denotes the condition number of the evaluation:

cond(p, x) =

∑|aix

i ||p(x)|

≥ 1

.

Compensated Algorithms – N. Louvet April 18, 2007 6 / 33

Page 7: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Accuracy of the Horner scheme

We consider the polynomial

p(x) =n∑

i=0

aixi ,

with ai ∈ F, x ∈ F

Algorithm (Horner scheme)function r0 = Horner (p, x)

rn = an

for i = n − 1 : −1 : 0

r i = r i+1 ⊗ x ⊕ ai

end

Relative accuracy of the evaluation with the Horner scheme:

|Horner (p, x)− p(x)||p(x)|

≤ γ2n︸︷︷︸≈2nu

cond(p, x)

u is the computing precision

cond(p, x) denotes the condition number of the evaluation:

cond(p, x) =

∑|aix

i ||p(x)|

≥ 1

.Compensated Algorithms – N. Louvet April 18, 2007 6 / 33

Page 8: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Accuracy . condition number of the problem × u

1e-18

1e-16

1e-14

1e-12

1e-10

1e-08

1e-06

1e-04

0.01

1

100000 1e+10 1e+15 1e+20 1e+25

rela

tive

forw

ard

erro

r

condition number

Accuracy of the polynomial evaluation [n=50]

u

1/u

γ2n cond

How to manage ill-conditioned cases?

Compensated Algorithms – N. Louvet April 18, 2007 7 / 33

Page 9: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated algorithms

Algorithms that correct the generated rounding errors

Many examples:

I Compensated summation: Neumaier (74), Sum2 in Ogita-Rump-Oishi (05)

I Compensated Horner algorithm

I Compensated substitution for triangular system solving

Accuracy as if computed in twice the working precision:

accuracy . u + condition number× u2

More efficient than fixed length expansion libraries (double-double)

Not considered here: Kahan’s compensated summation (65), Priest’s

doubly compensated summation (92)

Compensated Algorithms – N. Louvet April 18, 2007 8 / 33

Page 10: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Accuracy of the result . u + condition number× u2.

1e-18

1e-16

1e-14

1e-12

1e-10

1e-08

1e-06

1e-04

0.01

1

100000 1e+10 1e+15 1e+20 1e+25 1e+30 1e+35

rela

tive

forw

ard

erro

r

condition number

Accuracy of polynomial evaluation with the compensated Horner scheme [n=50]

u

1/u 1/u2

u + γ2n2 cond

How is computed the compensated result?

Compensated Algorithms – N. Louvet April 18, 2007 9 / 33

Page 11: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Principle of compensated Algorithms

The compensated result is r = r ⊕ c . The correcting term c is

an approximate of the forward error in r ,

computed thanks to “Error-Free Transformations”.

Compensated Algorithms – N. Louvet April 18, 2007 10 / 33

Page 12: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated result

x

Input space D Output space R

Forward error

G

G

r = G(x)

r = G(x)

Forward error analysis

Compensated Algorithms – N. Louvet April 18, 2007 11 / 33

Page 13: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated result

x

Input space D Output space R

x + ∆x

Backward errorForward error

G

G

G

r = G(x)

r = G(x)

Forward error analysis

Backward error analysis

Identify r as the exact solution of a perturbed problem:

r = G (x + ∆x)

Compensated Algorithms – N. Louvet April 18, 2007 11 / 33

Page 14: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated result

x

r

r

r + c

c

c

G

G

Input space D Output space Rr

G

Backward error

x + ∆x

How to improve the quality of the computed result r ?

First, compute an approximate c of the forward error c = r − r .

c is a correcting term for r .

Then, a compensated result r = r ⊕ c .

Compensated Algorithms – N. Louvet April 18, 2007 11 / 33

Page 15: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

How to compute the correcting term c?

c is an approximate of the forward error c = r − r

Classical error analysis:

Let p(x) =∑n

i=0 aixi be a polynomial with floating point coefficient.

I Backward error result: Horner algorithm computes

Horner (p, x) =nX

i=0

(1 + δi )aixi , with |δi | ≤ γ2n ≈ 2nu

I Forward error result:

c = p(x)− Horner (p, x) = −nX

i=0

δiaixi ⇒ |c| ≤ γ2n

nXi=0

|ai ||x |i .

As the δi are unknown, classical error analysis does not solve our problem. . .

But we can do better thanks to Error-Free Transformations!

Compensated Algorithms – N. Louvet April 18, 2007 12 / 33

Page 16: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Error-free transformations

Error-Free Transformations (EFT) are properties and algorithms to compute the

rounding errors at the current working precision.

+ (x , y) = 2Sum (a, b) 6 flops Knuth (74)

such that a + b = x + y and x = a⊕ b

× (x , y) = 2Prod (a, b) 17 flops Veltkamp

such that a× b = x + y and x = a⊗ b Dekker (71)

with a, b, x , y ∈ F.

Compensated Algorithms – N. Louvet April 18, 2007 13 / 33

Page 17: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

How to compute the correcting term c thanks to EFT?

Algorithm (Horner scheme)function r0 = Horner (p, x)

rn = an

for i = n − 1 : −1 : 0

pi = r i+1 ⊗ x % rounding error πi ∈ F ⇒ pi = r i+1x − πi

r i = pi ⊕ ai % rounding error σi ∈ F ⇒ r i = pi + ai − σi

For i = n − 1 : −1 : 0,

r i = r i+1x + ai − πi − σi .

Since rn = an,

Horner (p, x) =n∑

i=0

(ai − πi − σi )xi , with πn = σn = 0.

Compensated Algorithms – N. Louvet April 18, 2007 14 / 33

Page 18: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

How to compute the correcting term c thanks to EFT?

c is an approximate of the forward error c = r − r

Express r w.r.t. the data and the elementary rounding errors:

Horner algorithm computes

Horner (p, x) =n∑

i=0

(ai − πi − σi )xi , with πn = σn = 0.

Deduce an expression for the forward error c = r − r :

c = p(x)− Horner (p, x) =n−1∑i=0

(πi + σi )xi .

If we manage to find a closed form formula for c w.r.t

I the data,

I the elementary rounding errors (exactly computable thanks to EFT),

we can easily compute a correcting term c .

Compensated Algorithms – N. Louvet April 18, 2007 14 / 33

Page 19: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

An EFT for the Horner Algorithm

We have

c =n−1∑i=0

(πi + σi )xi = (pπ + pσ)(x),

with pπ(x) =∑

πiXi and pσ(x) =

∑σiX

i .

Algorithm (EFT for Horner)

function [ r0, pπ, pσ] = EFTHorner (p, x)

rn = an

for i = n − 1 : −1 : 0

[ pi , πi ] = 2Prod ( r i+1, x)

[ r i , σi ] = 2Sum ( pi , ai )

pπ[i ] = πi ; pσ[i ] = σi

This algorithm is an EFT for the Horner algorithm since

p(x) = Horner (p, x) + (pπ + pσ)(x)︸ ︷︷ ︸=c

.

Compensated Algorithms – N. Louvet April 18, 2007 15 / 33

Page 20: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Compensated Horner Algorithm

Since c = (pπ + pσ)(x), we compute c as Horner (pπ ⊕ pσ, x).

Algorithm (Compensated Horner scheme)

function r = CompHorner (p, x)

[ r , pπ, pσ] = EFTHorner (p, x) % r = Horner (p, x)

c = Horner (pπ⊕pσ, x)

r = r⊕ c

Next question: how to prove something about the accuracy of r ?

(Accuracy of the result . u + condition number× u2)

Difficult to answer in a general manner. . .

Compensated Algorithms – N. Louvet April 18, 2007 16 / 33

Page 21: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Sketch of proof for the compensated Horneralgorithm

The key “ingredient” here is the EFT for the Horner algorithm,

p(x) = Horner (p, x) + (pπ + pσ)(x)︸ ︷︷ ︸=c

.

Compensated Algorithms – N. Louvet April 18, 2007 17 / 33

Page 22: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Sketch of proof for the compensated Horner algorithm

We recall:

r = Horner (p, x),

c = (pπ + pσ)(x) is the forward error in r ,

c = Horner (pπ ⊕ pσ, x) is the computed correcting term,

r = r ⊕ c is the compensated result.

Since the compensated result is r = r ⊕ c ,

|p(x)− r | = |p(x)− (1 + ε)( r + c)|, with |ε| ≤ u.

Using the EFT for the Horner scheme, r = p(x)− c ,

|p(x)− r | = |p(x)− (1 + ε)(p(x) + c − c)|

≤ u|p(x)|+ (1 + u)| c − c |

How can we bound |c − c |?

Compensated Algorithms – N. Louvet April 18, 2007 18 / 33

Page 23: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Sketch of proof for the compensated Horner algorithm

|c − c | is the forward error in c = Horner (pπ ⊕ pσ, x). Then

|c − c | ≤ γ2n−1( ˜pπ + pσ)(x),

with ( ˜pπ + pσ)(x) =∑n−1

i=0 |pπ + pσ||x i |. We bound “largely” this term as

follows,

( ˜pπ + pσ)(x) ≤ γ2n p(x),

with p(x) =∑n

i=0 |ai ||x i |. Therefore,

|c − c | ≤ γ2n−1γ2n p(x).

Nota:

γk =ku

1− ku= ku +Ou2.

Compensated Algorithms – N. Louvet April 18, 2007 18 / 33

Page 24: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Sketch of proof for the compensated Horner algorithm

Then,

|p(x)− r | ≤ u|p(x)|+ (1 + u)γ2n−1γ2n p(x),

≤ u|p(x)|+ γ22n p(x).

Now we turn to relative accuracy, and we obtain the following theorem.

TheoremGiven p a polynomial with floating point coefficients, and x a floating point

value, let r be the compensated evaluation of p(x) computed with

CompHorner. Then,

|p(x)− r ||p(x)|

≤ u + γ22n cond(p, x).

Again, γ22n ≈ 4n2u2.

Compensated Algorithms – N. Louvet April 18, 2007 18 / 33

Page 25: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Accuracy of the result . u + condition number× u2.

1e-18

1e-16

1e-14

1e-12

1e-10

1e-08

1e-06

1e-04

0.01

1

100000 1e+10 1e+15 1e+20 1e+25 1e+30 1e+35

rela

tive forw

ard

err

or

condition number

Accuracy of polynomial evaluation [n=25]

u

1/u 1/u2

γ2n cond u + γ2n2 cond

HornerDDHorner

CompHorner

Compensated Algorithms – N. Louvet April 18, 2007 19 / 33

Page 26: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Conclusion

We have recall how to define a correcting term:

correcting term = approximate of the forward error.

Compensating the Horner algorithm improves the accuracy:

I the accuracy of the compensated result is the same as if the result was

computed in doubled working precision.

I Remark: this is also true for compensated summation or compensated

triangular system solving.

Compensated Algorithms – N. Louvet April 18, 2007 20 / 33

Page 27: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Faithful rounding with the CHS

Compensated Algorithms – N. Louvet April 18, 2007 21 / 33

Page 28: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Faithful rounding

Definition

A floating point number x is said to be a faithful rounding of a real number x if

either x = x ,

or x is one of the two floating point neighbours of x .x

x

The error bound

|CompHorner (p, x)− p(x)||p(x)|

≤ u + γ22n︸︷︷︸

2n2u2

cond(p, x)

is too large for reasoning about faithful rounding.

Compensated Algorithms – N. Louvet April 18, 2007 22 / 33

Page 29: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

An a posteriori test

We recall:

I r = CompHorner (p, x) is the compensated result,

I c = (pπ + pσ)(x) is the exact (real) correcting term for br ,I bc = Horner (pπ ⊕ pσ, x) is the computed (floating point) correcting term.

The following error bound on the computed correcting term holds:

|c − c | ≤ fl

(γ2n−1Horner (|pπ| ⊕ |pσ|, |x |)

1− 2(n + 1)u

)=: β.

Then, we can perform a dynamic test for faithful rounding:

Theorem

β <u

2| r | ⇒ |c − c | < u

2| r | ⇒ r is a faithful rounding of p(x).

Compensated Algorithms – N. Louvet April 18, 2007 23 / 33

Page 30: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

An a posteriori test

1e-18

1e-16

1e-14

1e-12

1e-10

1e-08

1e-06

1e-04

0.01

1

100000 1e+10 1e+15 1e+20 1e+25 1e+30 1e+35

rela

tive

forw

ard

erro

r

condition number

Accuracy of polynomial evaluation with the compensated Horner scheme [n=50]

u

1/u 1/u2

u + γ2n2 cond

(1-u)/(2+u)uγ2n-2

Compensated Algorithms – N. Louvet April 18, 2007 24 / 33

Page 31: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Practical efficiency

Compensated Algorithms – N. Louvet April 18, 2007 25 / 33

Page 32: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Overhead to obtain more accuracy

Theoretical ratios (flops):

CompHorner

Horner∼ 10.5

CompHornerIsFaith

Horner∼ 13

DDHorner

Horner∼ 14

Some practical ratios (running times 2):

CompHornerHorner

CompHornerIsFaithHorner

DDHornerHorner

Pentium 4, 3.00 GHz GCC 3.3.5 3.77 5.52 10.00

ICC 9.1 3.06 5.31 8.88

Athlon 64, 2.00 GHz GCC 4.0.1 3.89 4.43 10.48

Itanium 2, 1.4 GHz GCC 3.4.6 3.64 4.59 5.50

ICC 9.1 1.87 2.30 8.78

∼ 2− 4 ∼ 4− 6 ∼ 5− 10

2Average ratios for polynomials of degree 5 to 200.Compensated Algorithms – N. Louvet April 18, 2007 26 / 33

Page 33: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Some comparisons

How does the more accurate algorithms compare to each other?

CompHornerIsFaithCompHorner

DDHornerCompHorner

DDHornerCompHornerIsFaith

Pentium 4, 3.00 GHz GCC 3.3.5 1.46 2.66 1.89

ICC 9.1 1.71 2.92 1.72

Athlon 64, 2.00 GHz GCC 4.0.1 1.14 2.70 2.38

Itanium 2, 1.4 GHz GCC 3.4.6 1.27 1.51 1.27

ICC 9.1 1.24 4.67 3.84

≤ 2 ∼ 2− 5 ∼ 2− 5

Compensated Algorithms – N. Louvet April 18, 2007 27 / 33

Page 34: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

What is Instruction-Level Parallelism?

All processors since about 1985, including those in the embedded space,

use pipelining to overlap the execution of instructions and improve

performance. This potential overlap among instruction is called

instruction-level parallelism (ILP) since the instruction can be evaluated in

parallel. (Hennessy & Patterson)

A wide range of techniques have been developped to exploit the parallelism

available among instructions (pipelining, superscalar architectures...)

Amount of ILP available in a code:

I if two instructions are parallel they can execute simultaneously in a pipeline,

I if two instructions are dependent they must be executed in order.

How to determine wheter an instruction is dependent on another?

Compensated Algorithms – N. Louvet April 18, 2007 28 / 33

Page 35: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Dependences between instructions

Three different types of dependences:

I control dependences,

I name dependences,

I data dependences (or true dependences).

A control dependence determine the ordering of an instruction with respect

to a branch instruction.

A name dependence occurs when two instructions use the same register or

memory location (name), but there is in fact no flow of data between

instructions associated with that name.

But here we are mainly interested by data dependences.

Compensated Algorithms – N. Louvet April 18, 2007 29 / 33

Page 36: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Data dependences

An instruction i is data dependent on an instruction j if either

I instruction j produce a result that may be used by instruction i ,

I their exist a chain of dependences of the first type between i and j .

j

i i

k

j

If two instructions are data dependent, they cannot execute simultaneously.

Dependences are properties of programs: the presence of a data dependence

in an instruction sequence reflects a data dependence in the source code.

What about the dependences in CompHorner and DDHorner?

Compensated Algorithms – N. Louvet April 18, 2007 30 / 33

Page 37: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Difference between DDHorner and CompHorner

function r = CompHorner(P, x)

sn = ai; cn = 0

for i = n − 1 : −1 : 0

[pi, πi] = 2Prod(si+1, x)

[si, σi] = 2Sum(pi, ai)

ci = ci+1 ⊗ x ⊕ (πi ⊕ σi )

end

r = s0 ⊕ c0

function r = DDHorner(P, x)

shn = ai; sln = 0

for i = n − 1 : −1 : 0

%% [phi, pli]=[shi+1, sli+1] ⊗ x

[th, tl] = 2Prod(shi+1, x)

tl = sli+1 ⊗ x ⊕ tl

[phi, pli] = Fast2Sum(th, tl)

%% [shi, sli]=[phi, pli] ⊕ ai

[th, tl] = 2Sum(phi, ai)

tl = tl ⊕ pli

[shi, sli] = Fast2Sum(th, tl)

end

r = sh0

Compensated Algorithms – N. Louvet April 18, 2007 31 / 33

Page 38: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Difference between DDHorner and CompHorner

function r = CompHorner’(P, x)

sn = ai; cn = 0

for i = n − 1 : −1 : 0

[pi, πi] = 2Prod(si+1, x)

ti = ci+1 ⊗ x ⊕ πi

[si, σi] = 2Sum(pi, ai)

ci = ti ⊕ σi

end

r = s0 ⊕ c0

function r = DDHorner(P, x)

shn = ai; sln = 0

for i = n − 1 : −1 : 0

%% [phi, pli]=[shi+1, sli+1] ⊗ x

[th, tl] = 2Prod(shi+1, x)

tl = sli+1 ⊗ x ⊕ tl

[phi, pli] = Fast2Sum(th, tl)

%% [shi, sli]=[phi, pli] ⊕ ai

[th, tl] = 2Sum(phi, ai)

tl = tl ⊕ pli

[shi, sli] = Fast2Sum(th, tl)

end

r = sh0

Compensated Algorithms – N. Louvet April 18, 2007 32 / 33

Page 39: Philippe Langlois, Nicolas Louvet Equipe DALI, Universit´e ... · Outline 1 Introduction 2 Principle of Compensated Algorithms 3 Sketch of proof for the compensated Horner algorithm

Difference in the data flow graph

10

9

1

2

3

4

5

6

7

8

11

12

13

14

15

16

17

18

19

x_lor

r

x_lo x_hi

r r x

+

x_hi

x c

P[i]

P[i]

22

splitter

c

r

20*

+

21

P[i]

P[i]

sh splittersh x sl x

2

3

4

56

7

8

9

10

11

12

13

15

25

* *

−* *

* *−

+

+

+

+

+

sh

x_hi x_lo

ph

sh

14

pl

sl

21

+19

20−

23−−

22−

+24

26+

27−

+28

+16

−17

18−

*2

3−

−4

−5

*8

*6

−7

+9

*12 10

*

+11

+13

14−

15−

16−

18−

17−

+19

*1

*1

DDHornerCompHorner

We represent all data dependences in

the inner loop of each algorithm.

More parallelism among floating point

operations in CompHorner than in

DDHorner.

Thus more potential ILP, and greater

pratical performance!

Compensated Algorithms – N. Louvet April 18, 2007 33 / 33