Go Native
Squeeze the juice out of your 64-bit processor
using…
C/C++
Who am I
-> Fernando Moreira ( @fpmore )
-> MSc student @ FEUP
-> Undergraduate Researcher @ Porto Interactive Center
-> Microsoft Student Partner Lead @ M$ PT
-> I’ve been doing C++ for over… 5y
Who are you ?
-> Norte . Centro . Sul . Açores . Madeira . FMI
-> Who has experience with C? And with C++?
-> Who has experience with 64bit native dev?
Talk’s Schedule
int main( int argc, char **argv ) {
    try {
        introducing_x64();
        advantagesOver_x86();
        nativeDev_x64( const Topic &t );
        codeAnalysis_and_DebugTools();
        costProspectionOn_x64Dev();
    } catch( Timeout &e ) { return -1; }
    return 0;
}
Promising not to change the topic.
introducing_x64()
-> The names : x64, x86-64, AMD64, Intel 64, IA-64, etc…
-> Notice : IA-64 ≠ AMD64
-> AMD64 is backwards compatible with x86 (IA-64 isn’t)
-> Some Hardware: Phenom, Athlon 64, Core-iX, Core 2, …
-> Some OS’s : Win(XP.Vista.7), OSX, Several Linux distros.
introducing_x64()
This talk will be focused on the AMD64 architecture.
advantagesOver_x86()
-> Address space : Theoretical limit of 16 ExaBytes (2^64)
-> More available registers. (there’s one called RIP)
-> Larger instruction set with emphasis on SIMD
-> SSE1 and SSE2 are always there (SSE3 is missing on the earliest AMD64 chips)
-> Unified function calling convention (one per platform: Microsoft x64 on Windows, System V AMD64 on Linux and OSX)
advantagesOver_x86()
Can run x86 binaries under x64 environments
On Windows:
. 32-bit processes can’t load 64-bit DLLs for execution
. 64-bit processes can’t load 32-bit DLLs for execution
nativeDev_x64( how_it_looks_like )
-> A valid, yet useless, 64-bit application.
int main( int argc, char **argv ) { return 0; }
nativeDev_x64( how_it_looks_like )
-> A valid, yet useless and dangerous, 64-bit application.
#include <stddef.h>  // size_t
#include <stdint.h>  // SIZE_MAX
int main( int argc, char **argv ) {
    size_t external_debt = SIZE_MAX;
    int *ptr = (int *)&external_debt;  // explicit cast: C++ rejects the implicit size_t* -> int*
    *ptr = 0;                          // int is 4 bytes, so only 4 of the 8 bytes are cleared
    return 0;
}
nativeDev_x64( data_model )
-> On Microsoft Win64 : LLP64 model
-> On Linux : LP64 model
-> LLP64: short( 2 ), int( 4 ), long( 4 ), ptr( 8 ), long long( 8 )
-> LP64: short( 2 ), int( 4 ), long( 8 ), ptr( 8 ), long long( 8 )
Can you see the data portability problem?
Suggestions: Use conditional compilation and type aliasing. Make conscious usage of the sizeof operator.
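The suggestions on this slide can be sketched roughly as follows. This is only an illustration, assuming a C++11 compiler; the alias names `file_offset_t` and `record_id_t`, the `RecordHeader` struct and the `kLongIs64Bit` constant are hypothetical and not part of the talk.

```cpp
#include <cstddef>   // std::size_t
#include <cstdint>   // fixed-width integer types

// Type aliasing: choose widths explicitly instead of relying on `long`,
// which is 4 bytes under LLP64 (Win64) and 8 bytes under LP64 (Linux).
using file_offset_t = std::int64_t;   // hypothetical alias, always 8 bytes
using record_id_t   = std::uint32_t;  // hypothetical alias, always 4 bytes

// Conditional compilation: detect the data model only where it matters.
#if defined(_WIN64)
constexpr bool kLongIs64Bit = false;                // LLP64
#elif defined(__LP64__)
constexpr bool kLongIs64Bit = true;                 // LP64
#else
constexpr bool kLongIs64Bit = (sizeof(long) == 8);  // fallback: ask the compiler
#endif

// Hypothetical record header built only from fixed-width aliases,
// so its layout is the same under both data models.
struct RecordHeader {
    file_offset_t offset;
    record_id_t   id;
    record_id_t   flags;
};

// Conscious use of sizeof: fail the build as soon as an assumption breaks.
static_assert(sizeof(file_offset_t) == 8, "file_offset_t must be 8 bytes");
static_assert(sizeof(record_id_t) == 4, "record_id_t must be 4 bytes");
static_assert(sizeof(RecordHeader) == 16, "RecordHeader layout changed");
static_assert(kLongIs64Bit == (sizeof(long) == 8), "data model detection is wrong");
```

The `static_assert` lines cost nothing at run time, which is why they are a convenient place to write down every size assumption the code makes.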
nativeDev_x64( data_model )
-> On x86 : ptr( 4 ), size_t( 4 ), ptrdiff_t( 4 )
-> On x64 : ptr( 8 ), size_t( 8 ), ptrdiff_t( 8 )
These will increase memory usage…
But they pay off performance-wise.
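Since pointers, `size_t` and `ptrdiff_t` all grow to 8 bytes, code that parks an address or a size in a 32-bit integer silently loses the upper half. The sketch below contrasts that with `std::uintptr_t`, which is wide enough for any object pointer. All helper names are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>
#include <cstdint>

// Models legacy code that stores an address in a 32-bit integer.
inline std::uint32_t to_u32(std::uintptr_t address) {
    return static_cast<std::uint32_t>(address);   // upper 32 bits are dropped on x64
}

// Safe helpers: uintptr_t can hold any object pointer without loss.
inline std::uintptr_t pointer_to_integer(const void *p) {
    return reinterpret_cast<std::uintptr_t>(p);
}
inline const void *integer_to_pointer(std::uintptr_t v) {
    return reinterpret_cast<const void *>(v);
}

// True when the address would survive being stored in 32 bits.
inline bool survives_32bit_storage(std::uintptr_t address) {
    return static_cast<std::uintptr_t>(to_u32(address)) == address;
}

// Example use: a round trip through uintptr_t gives the same pointer back.
inline void round_trip_demo() {
    int value = 42;
    assert(integer_to_pointer(pointer_to_integer(&value)) == &value);
}

static_assert(sizeof(std::uintptr_t) >= sizeof(void *),
              "uintptr_t must be able to hold a pointer");
```

On x86 both paths behave the same, which is exactly why this kind of bug tends to show up only after the port.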
nativeDev_x64( common_pitfalls )
-> Usage of magic numbers & bit-wise ops: 0x7fffffff
-> Functions with variable number of arguments : printf
-> Virtual functions
-> Data exchange between x86 and x64 apps
-> Data misalignment : SSE requires 16-byte alignment
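Two of the pitfalls above, magic numbers and variadic functions, can be illustrated with a short sketch. It assumes a C99/C++11 conforming standard library (for the `%zu` format specifier); the function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <limits>
#include <string>

// Pitfall: a 32-bit magic number used as a mask on a 64-bit value.
inline std::size_t clear_top_bit_wrong(std::size_t v) {
    return v & 0x7fffffff;   // on x64 this also wipes bits 32..63
}

// Fix: derive the mask from the type instead of hard-coding its width.
inline std::size_t clear_top_bit(std::size_t v) {
    return v & (std::numeric_limits<std::size_t>::max() >> 1);
}

// Pitfall: passing a size_t to "%d" or "%u" makes printf read the wrong
// number of bytes. Fix: "%zu" for size_t, "%td" for ptrdiff_t, "%p" for pointers.
inline std::string format_size(std::size_t size) {
    char buffer[32];
    int written = std::snprintf(buffer, sizeof buffer, "%zu", size);
    assert(written > 0 && static_cast<std::size_t>(written) < sizeof buffer);
    return std::string(buffer);
}
```

The wrong mask gives the expected result on x86, where `size_t` is 4 bytes, so only a 64-bit build with large values exposes it.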
nativeDev_x64( optimization_tips )
-> Use native types for loops or tight data usage
-> Use 16-byte alignment for SSE loads and stores
-> Heap-allocs in Win64 and XBOX360 are 16-byte aligned
-> *Use* intrinsics : #include <immintrin.h>
-> Unroll loops and sort an object’s data members by size
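A rough sketch of three of these tips together: a native-width loop counter, aligned SSE loads and stores through intrinsics, and data members sorted by size. The function and struct names are hypothetical, and the scalar fallback is an addition of this sketch so that it still compiles on non-x64 targets.

```cpp
#include <cstddef>
#if defined(__x86_64__) || defined(_M_X64)
#include <immintrin.h>
#define SKETCH_HAS_SSE 1
#endif

// Adds src into dst, four floats per iteration.
// Assumptions of this sketch: both arrays are 16-byte aligned and
// `count` is a multiple of 4.
inline void add_in_place(float *dst, const float *src, std::size_t count) {
#if defined(SKETCH_HAS_SSE)
    for (std::size_t i = 0; i < count; i += 4) {   // size_t: native-width counter
        __m128 a = _mm_load_ps(dst + i);           // aligned load (faults if misaligned)
        __m128 b = _mm_load_ps(src + i);
        _mm_store_ps(dst + i, _mm_add_ps(a, b));   // aligned store
    }
#else
    for (std::size_t i = 0; i < count; ++i) dst[i] += src[i];  // scalar fallback
#endif
}

// Member ordering: putting the largest members first leaves less padding.
struct Unsorted { char tag; double value; char flag; };  // typically 24 bytes on x64
struct Sorted   { double value; char tag; char flag; };  // typically 16 bytes on x64
static_assert(sizeof(Sorted) <= sizeof(Unsorted),
              "sorting members by size should not add padding");
```

A caller would declare its buffers with `alignas(16)`, for example `alignas(16) float a[8];`, before passing them to `add_in_place`.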
nativeDev_x64( real-world_tips )
-> Don’t sacrifice your software architecture.
-> Don’t use it if you don’t know how to.
-> Don’t go into premature optimization.
-> Do it at lower levels and then hide it.
-> Trust your compiler to help you do the job.
codeAnalysis_and_DebugTools()
-> Your IDE : LEARN to fu**** use it!
-> Conditional breakpoints, call stack
-> Free tool : CppCheck (CmdLine, Eclipse, CodeBlocks, …)
-> State-of-the-art tool: PVS-Studio (VS 2005, 2008, 2010)
-> Do pair programming and peer-review if possible
costProspectionOn_x64Dev()
-> Hardware & Software (IDE + Plugins + Tools + Libs)
-> You’ll need to teach the developers (theory & practice)
-> A port takes time, adds bugs, and it’s not creative
-> … plus you’ll probably have to maintain two code paths
-> Full implementation adds creativity, but takes much more time and will add many more bugs.
Let’s go state-of-the-art!
Questions?