Behind the Performance of Quake 3 Engine: Fast Inverse Square Root Maksym Zavershynskyi

DESCRIPTION

Quake 3 was probably the most famous first-person shooter of 1999. It had fascinating graphics and very high responsiveness, the result of performance optimization and high-quality code written by the id Software team. One of the most famous optimization tricks is the function that computes an approximation of the inverse (reciprocal) square root through some clever bit hacking. This function is still the subject of investigation by mathematicians and programmers today. In this presentation we try to understand how it works, and we also try to find the author.

Page 1: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Behind the Performance of Quake 3 Engine:

Fast Inverse Square Root

Maksym Zavershynskyi

Page 2: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Quake 3 Arena

First Person Shooter

Released: 1999

Engine: id Tech 3

Average reviewers score: ~9/10

Page 3: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Architecture

• C-Language

• Client-Server separation

• Virtual Machine

• Local C Compiler for Scripts

• Highly Optimized Code

Page 4: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Shading

Creates the perception of depth

Page 5: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Material Based Shading

[Figure: two images combined ("+") into the shaded result ("="), from [1]]

Page 6: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

What makes a nice picture?

• Shading

• Lighting

• Reflections

• ...

Page 7: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Angle of Incidence

[Figure: the angle α between the vectors labelled "normal" and "view"]

The greater α, the darker the shading

Page 8: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Vector Normalization

[Figure: the vector (x,y,z) rescaled to the vector (a,b,c) of length 1]

Page 10: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Fast Inverse Square Root

Page 11: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Inverse Square Root

#include <math.h>

float Q_rsqrt( float number )
{
    return 1.0f / sqrt( number );
}

Page 12: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Fast Approximate Inverse Square Root

float Q_rsqrt( float number )
{
    long i;
    float x2, y;
    const float threehalfs = 1.5F;

    x2 = number * 0.5F;
    y  = number;
    i  = * ( long * ) &y;                       // evil floating point bit level hacking
    i  = 0x5f3759df - ( i >> 1 );               // what the f☀✿k?
    y  = * ( float * ) &i;
    y  = y * ( threehalfs - ( x2 * y * y ) );   // 1st iteration
//  y  = y * ( threehalfs - ( x2 * y * y ) );   // 2nd iteration, this can be removed

    return y;
}

Page 13: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

The function performs three steps:

(1) Interpret the float as an integer: i = * ( long * ) &y; and afterwards back again: y = * ( float * ) &i;

(2) Good initial guess with the magic number 0x5f3759df: i = 0x5f3759df - ( i >> 1 );

(3) One iteration of Newton’s method: y = y * ( threehalfs - ( x2 * y * y ) );
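The original code relies on `long` being 32 bits wide and on reading a float through an integer pointer. On many current 64-bit platforms `long` is 64 bits, and the pointer cast violates C's aliasing rules. The sketch below shows the same three steps with `uint32_t` and `memcpy`. It assumes `float` is a 32-bit IEEE 754 single and that the input is positive; the name `rsqrt_fast` is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Sketch of the three steps with well-defined behaviour in modern C.
   Assumes a 32-bit IEEE 754 float and number > 0. */
static float rsqrt_fast(float number)
{
    uint32_t i;
    float y = number;
    const float x2 = number * 0.5f;

    memcpy(&i, &y, sizeof i);               /* (1) float bits -> integer   */
    i = UINT32_C(0x5f3759df) - (i >> 1);    /* (2) initial guess           */
    memcpy(&y, &i, sizeof y);               /* (1) integer bits -> float   */
    y = y * (1.5f - x2 * y * y);            /* (3) one Newton iteration    */
    return y;
}
```

Compilers typically turn the two `memcpy` calls into plain register moves, so this form should not cost anything compared with the pointer casts.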

Page 14: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(1)Interpret float as integer

32-bit float:

0 | 0 1 1 1 1 1 0 0 | 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

sign (1 bit) | E (8 bits) | M (23 bits)

0.15625 is 1.01 × 2^-3 in binary

E = -3 + 127 = 124, or 01111100 in binary

M = .01
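The field layout can be checked with a short helper that splits a float into sign, exponent and mantissa. This is only a sketch: the `float_fields` struct and `float_decode` function are hypothetical names, and the code assumes a 32-bit IEEE 754 float.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical container for the three fields of a 32-bit IEEE 754 float. */
typedef struct {
    uint32_t sign;      /* 1 bit                                  */
    uint32_t exponent;  /* 8 bits, stored with a bias of 127      */
    uint32_t mantissa;  /* 23 bits, fraction without the leading 1 */
} float_fields;

static float_fields float_decode(float f)
{
    uint32_t bits;
    float_fields out;

    memcpy(&bits, &f, sizeof bits);          /* copy the raw bit pattern */
    out.sign     = bits >> 31;
    out.exponent = (bits >> 23) & 0xFFu;
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For 0.15625f the helper returns sign 0, exponent 124 and mantissa 0x200000, which is the fraction .01 in binary.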

Page 16: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(1)Interpret float as integer

float x=0.15625
0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

x as integer i
0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

shift right i>>1
0 0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

E → E/2

Page 18: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(1)Interpret float as integer

float x=0.15625
0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

x as integer i
0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

shift right i>>1
0 0 0 1 1 1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

the magic number 0x5f3759df
0 1 0 1 1 1 1 1 0 0 1 1 0 1 1 1 0 1 0 1 1 0 0 1 1 1 0 1 1 1 1 1

0x5f3759df - (i>>1)
0 1 0 0 0 0 0 0 0 0 1 0 0 1 1 1 0 1 0 1 1 0 0 1 1 1 0 1 1 1 1 1

result: 2.614 (exact value 1/sqrt(x)=2.52982..)
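The bit rows above can be reproduced in code. In hexadecimal, x = 0.15625 is 0x3E200000, the shifted value is 0x1F100000, and the subtraction gives 0x402759DF. The sketch below computes only the initial guess, with no Newton iteration; the helper names are hypothetical and a 32-bit IEEE 754 float is assumed.

```c
#include <stdint.h>
#include <string.h>

/* Raw bit pattern of a float (assumes a 32-bit IEEE 754 float). */
static uint32_t float_to_bits(float f)
{
    uint32_t u;
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Float whose bit pattern is u. */
static float bits_to_float(uint32_t u)
{
    float f;
    memcpy(&f, &u, sizeof f);
    return f;
}

/* Steps (1) and (2) only: the initial guess before any Newton iteration. */
static float rsqrt_initial_guess(float x)
{
    uint32_t i       = float_to_bits(x);              /* x as integer i      */
    uint32_t shifted = i >> 1;                        /* shift right i>>1    */
    uint32_t guess   = UINT32_C(0x5f3759df) - shifted; /* magic - (i>>1)     */
    return bits_to_float(guess);
}
```

Calling `rsqrt_initial_guess(0.15625f)` yields the value the slide rounds to 2.614, about 3% above the exact 2.52982.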

Page 21: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(2)Magic Number: 0x5f3759df

• Gives a good initial guess.

• Minimizes the relative error.

• Trying to find a better number that minimizes the error of the initial guess, we come up with 0x5f37642f [4]

Did we find a better magic number? ;)

Page 23: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(3)One iteration of Newton’s method

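Newton’s method: given a suitable approximation y_n to the root of f(y), it gives a better one, y_{n+1}, using

$$y_{n+1} = y_n - \frac{f(y_n)}{f'(y_n)}$$

In our case, 1/sqrt(x) is the positive root of $f(y) = \frac{1}{y^2} - x$, so $f'(y) = -\frac{2}{y^3}$ and

$$y_{n+1} = y_n\left(\frac{3}{2} - \frac{x}{2}\,y_n^2\right)$$

which in code is:

y = y * ( 1.5f - ( 0.5f * x * y * y ) );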
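To see the update rule converge, the sketch below applies it repeatedly from a given starting guess. The function names are hypothetical; the step itself is the line shown above.

```c
/* One Newton step for f(y) = 1/y^2 - x, whose positive root is 1/sqrt(x). */
static float rsqrt_newton_step(float x, float y)
{
    return y * (1.5f - 0.5f * x * y * y);
}

/* Apply the step `iterations` times, starting from the guess y0. */
static float rsqrt_newton(float x, float y0, int iterations)
{
    float y = y0;
    for (int k = 0; k < iterations; ++k)
        y = rsqrt_newton_step(x, y);
    return y;
}
```

A simple way to judge a result y without calling a square root is the residual x·y² − 1, which is zero exactly when y = 1/sqrt(x). Starting from the slide's guess 2.614 for x = 0.15625, the residual shrinks with every step, provided the starting guess is reasonably close, as it is here.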

Page 24: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

(3)One iteration of Newton’s method

After one iteration of Newton’s method, our magic number 0x5f37642f gives a worse approximation than the original magic number 0x5f3759df !!! [4]

Open Question: How was the original magic number derived?

Page 25: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Open Question: How was the original magic number 0x5f3759df derived?

•Lomont in 2003 numerically found a slightly better magic number 0x5f375a86 [4]

•Robertson in 2012 analytically found the same better magic number 0x5f375a86 [3]

Page 26: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

How good?

Max relative error: 0.177% [3]

With the 2nd iteration of Newton’s method: 0.00047% [3]
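Error figures like these can be estimated with a brute-force sweep. The sketch below is not the method used in [3] or [4]; it is a simple sampling experiment with hypothetical function names, assuming a 32-bit IEEE 754 float. It samples x = s² for s in [1, 2), so the exact answer is 1/s and no library square root is needed. The error pattern repeats whenever x is multiplied by 4, so this range covers one full period.

```c
#include <stdint.h>
#include <string.h>

/* Inverse square root estimate with a configurable magic constant and
   number of Newton iterations (0 = raw initial guess). */
static float rsqrt_magic(float x, uint32_t magic, int newton_iters)
{
    uint32_t i;
    float y;

    memcpy(&i, &x, sizeof i);
    i = magic - (i >> 1);
    memcpy(&y, &i, sizeof y);
    for (int k = 0; k < newton_iters; ++k)
        y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* Largest sampled relative error over x = s*s with s in [1, 2).
   Rounding s*s to float perturbs the reference 1/s by well under 1e-7
   in relative terms, which is small next to the errors measured. */
static double max_relative_error(uint32_t magic, int newton_iters)
{
    const int samples = 1 << 20;
    double worst = 0.0;

    for (int k = 0; k < samples; ++k) {
        double s     = 1.0 + (double)k / samples;
        float  x     = (float)(s * s);
        double exact = 1.0 / s;
        double err   = (double)rsqrt_magic(x, magic, newton_iters) / exact - 1.0;
        if (err < 0.0) err = -err;
        if (err > worst) worst = err;
    }
    return worst;
}
```

Calling `max_relative_error(0x5f3759df, 1)` and `max_relative_error(0x5f3759df, 2)` gives estimates to compare with the two figures above. Passing 0x5f37642f or 0x5f375a86 instead allows the other magic numbers from the slides to be compared, both before and after the Newton step.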

Page 27: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

How fast?

In 1999: ???

Today, on CPUs: 3-4 times faster than the straightforward code [3]

With the 2nd iteration of Newton’s method: 2-2.5 times faster [3]

Page 28: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who wrote it?

Page 30: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who?

John Carmack?
Lead Programmer of Quake, Doom, Wolfenstein 3D

“...Not me, and I don’t think it is Michael (Abrash). Terje Mathison perhaps?...”

Michael Abrash?
Author of: Zen of Assembly Language, Zen of Graphics Programming

[8]

Page 31: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who?

Terje Mathisen?
Assembly language optimization for x86 microprocessors.

“... I wrote fast & accurate invssqrt()... for a computational fluid chemistry problem...

...The code is not the same as I wrote...” [8]

Page 34: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who?

Gary Tarolli?
Co-founder of 3dfx (later acquired by Nvidia)

“It did pass by my keyboard many many years ago, I may have tweaked the hex constant a bit or so, but other than that I can’t take credit for it, except that I used it a lot and probably contributed to its popularity and longevity.”

This hack is older than 1990!!!

[8]

Page 35: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who?

Cleve Moler (inspiration)
Author of the first MATLAB, one of the founders of MathWorks, and currently Chief Mathematician there. [9]

Greg Walsh (author, most probably)
Has been working on Internet and distributed computing technologies since before it was even the Internet, and helped engineer the first WYSIWYG word processor at Xerox PARC while at Stanford University. [9]

Page 36: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Who?

Inspired by Cleve Moler from the code written by Velvel Kahan and K.C. Ng at Berkeley around 1986!!!

http://www.netlib.org/fdlibm/e_sqrt.c

[10]

Page 37: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Finally

It is Fast: 3-4 times faster than the straightforward code

It is Good: 0.177% maximum relative error

It can be Improved

Dates back to 1986

Page 38: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Thank you!

http://zavermax.github.io

Page 39: Behind the Performance of Quake 3 Engine: Fast Inverse Square Root

Quake 1,3 Architecture

1) Fabien Sanglard, Quake 3 source code review. 2012 http://fabiensanglard.net/quake3/

2) Michael Abrash, Ramblings in Realtime http://www.bluesnews.com/abrash/

Inverse Square Root

3) Matthew Robertson, A Brief History of InvSqrt. 2012 Bachelor’s Thesis. University of New Brunswick

4) Chris Lomont, Fast Inverse Square root, Indiana: Purdue University, 2003

5) Jim Blinn, Floating-point tricks, IEEE Comp. Graphics and Applications 17, no 4, 1997

6) David Eberly, Fast Inverse Square Root (Revisited), Geometric Tools, LLC, 2010

7) Charles McEniry, The Mathematics Behind the Fast Inverse Square Root Function Code, 2007

Investigation of the Authorship

8) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() 2006 http://www.beyond3d.com/content/articles/8/

9) Rys Sommefeldt, Origin of Quake3’s Fast InvSqrt() - Part Two 2007 http://www.beyond3d.com/content/articles/15/

10) http://blogs.mathworks.com/cleve/2012/06/19/symplectic-spacewar/#comment-13

Additional

11) http://en.wikipedia.org/wiki/Fast_inverse_square_root

12) https://github.com/id-Software/Quake-III-Arena

Some literature here