albert-defusco
Parallel Gold Sifting
- Each pan can sift a constant amount of dirt per day
  - More pans means more dirt sifted
- Each pan sifts independently
  - Each pan has a defined amount of dirt to sift
- Sifting is parallelizable
https://github.com/AlbertDeFusco/goldRush
Serial Gold Sifting

    #include <iostream>
    #include <list>

    class Pan
    {
    public:
        Pan();          // create array of random integers between 1 and 10,000
        int sift();     // returns frequency of the number 79 (atomic number of gold)
        bool hasGold(); // calls sift; returns true if sift() > 0
    };

    int main()
    {
        std::list<int> withGold;
        Pan* myPans = new Pan[nPans];

        // visit the pans one after another
        for (int i = 0; i < nPans; ++i)
        {
            bool gold = myPans[i].hasGold();
            if (gold) {
                withGold.push_back(i);
            }
        }

        // print the IDs of the pans that contain gold
        std::list<int>::const_iterator iterator;
        for (iterator = withGold.begin(); iterator != withGold.end(); ++iterator)
            std::cout << *iterator << " ";
        std::cout << std::endl;

        return 0;
    }
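The slide only declares the Pan interface. As a minimal sketch of what the three members might look like, assuming a seeded constructor and a default of 10 chunks per pan (neither is stated on the slide, and the real class in the goldRush repository may differ), one could write:

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical implementation of the Pan interface shown on the slide.
// The seed parameter and the chunk count are assumptions for illustration.
class Pan
{
public:
    // Fill the pan with `chunks` random integers between 1 and 10,000.
    explicit Pan(unsigned seed, std::size_t chunks = 10)
        : dirt_(chunks)
    {
        std::mt19937 gen(seed);
        std::uniform_int_distribution<int> dist(1, 10000);
        for (int& chunk : dirt_) chunk = dist(gen);
    }

    // Frequency of the number 79 (atomic number of gold) in this pan.
    int sift() const
    {
        return static_cast<int>(std::count(dirt_.begin(), dirt_.end(), 79));
    }

    // True if sifting turned up at least one 79.
    bool hasGold() const { return sift() > 0; }

    // Helper that is not on the slide: overwrite one chunk with a known value.
    void plant(std::size_t position, int value)
    {
        assert(position < dirt_.size());
        dirt_[position] = value;
    }

private:
    std::vector<int> dirt_;
};
```

Because each pan owns its own dirt and sift() only reads it, calling hasGold() on different pans does not share any state, which is what makes the loop over pans a candidate for parallel execution.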
Gold Rush
Gold Rush!
10000 total chunks of dirt
1000 pans
Found gold in 15 pans
Pan IDs: 94 142 265 268 289 440 442 443 569 600 721 781 783 806 818
serial execution took 5.60495 seconds
Parallel Gold Sifting

    #include <iostream>
    #include <list>
    #include <cilk/cilk.h>

    class Pan
    {
    public:
        Pan();          // create array of random integers between 1 and 10,000
        int sift();     // returns frequency of the number 79 (atomic number of gold)
        bool hasGold(); // calls sift; returns true if sift() > 0
    };

    int main()
    {
        std::list<int> withGold;
        Pan* myPans = new Pan[nPans];

        // iterations may run on different workers
        cilk_for (int i = 0; i < nPans; ++i)
        {
            bool gold = myPans[i].hasGold();
            if (gold) {
                withGold.push_back(i);
            }
        }

        std::list<int>::const_iterator iterator;
        for (iterator = withGold.begin(); iterator != withGold.end(); ++iterator)
            std::cout << *iterator << " ";
        std::cout << std::endl;

        return 0;
    }
Parallel Gold Sifting
- There is a problem to be solved
  - How do we keep track of which pans have gold?
- Where does the result get stored?
  - Who can access the result?
In the cilk_for loop above, the call withGold.push_back(i) may not be thread-safe: every worker appends to the same std::list.
Thread Safety
- Unsafe operations
  - Multiple threads accessing the same address
- Basic types are not thread safe
  - STL containers may be thread safe for some operations
- Threads read and write memory at undetermined times
- Leads to a race condition
Race Condition
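    int total = 0;
    cilk_for (int i = 0; i < 4; ++i)
        ++total;

[Diagram: interleaved Read and Write steps on total by the four iterations. Updates are lost, and the run shown ends with total=1 instead of 4.]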
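To make the lost updates concrete, the following sketch replays one possible interleaving in ordinary serial C++: every iteration reads total before any iteration writes it back. The function name and the two-phase schedule are assumptions chosen for illustration; this is not necessarily the exact schedule drawn on the slide, and a real race can produce any value from 1 to 4.

```cpp
#include <cassert>
#include <vector>

// Replays ++total for `workers` iterations under one specific interleaving:
// all iterations read the shared value first, then all of them write.
int replayLostUpdates(int workers)
{
    assert(workers > 0);
    int total = 0;

    // Read phase: each iteration copies the shared value into a private register.
    std::vector<int> registers(workers);
    for (int w = 0; w < workers; ++w)
        registers[w] = total;

    // Write phase: each iteration stores its own register plus one,
    // overwriting whatever the previous iteration stored.
    for (int w = 0; w < workers; ++w)
        total = registers[w] + 1;

    return total;
}
```

Calling replayLostUpdates(4) returns 1: all four iterations saw 0, so each of them wrote 1, and three increments were lost.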
Inefficient solutions
- Locking
  - Requires careful programming
  - Non-deterministic
- Cannot use cilk_sync in the loop
  - Will only sync child threads, not all threads
- Break the loop
  - Requires more storage and management

    #include <cilk/cilk.h>

    double *sum = new double[N];

    // parallel: each iteration writes only its own element
    cilk_for (int i = 0; i < N; ++i)
        sum[i] = f(i);

    // serial: accumulate the partial results
    double total = 0.0;
    for (int i = 0; i < N; ++i)
        total += sum[i];
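For comparison with the break-the-loop version, here is a minimal sketch of the locking approach. It uses standard C++ (std::thread and std::mutex) rather than Cilk Plus so that it builds with any C++11 compiler; the function f, the strided work split, and the name lockedSum are placeholders introduced for this example.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Placeholder for the per-iteration work; the slide leaves f unspecified.
double f(int i) { return i * 0.5; }

// Sums f(0) .. f(n-1) on `workers` threads, guarding the shared total with a lock.
double lockedSum(int n, int workers)
{
    assert(n >= 0 && workers > 0);
    double total = 0.0;
    std::mutex totalMutex;
    std::vector<std::thread> pool;

    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&, w] {
            // Strided split: worker w handles i = w, w + workers, ...
            for (int i = w; i < n; i += workers) {
                double value = f(i);  // computed outside the lock
                std::lock_guard<std::mutex> guard(totalMutex);
                total += value;       // only one thread updates at a time
            }
        });
    }
    for (std::thread& t : pool) t.join();
    return total;
}
```

The lock removes the race, but the order in which threads acquire it still varies from run to run. With this particular f every partial sum is exactly representable, so the result is stable; for general floating-point work the varying order of additions can change the last bits of the result, which is one sense in which locking is non-deterministic.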
Cilk Reducers
- Provide thread-safe access to a "smart pointer"
- Any associative operation (with an identity value) is a valid reducer
    (x OP y) OP (a OP b) = x OP y OP a OP b
- Small performance overhead for usage
- Very extensible in C++
- Operations are guaranteed to execute in the same order as in serial
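The associativity requirement can be checked by hand with two of the operations a reducer might use. The sketch below is plain C++ with helper names invented for this example: integer addition may be regrouped freely, and list append may be regrouped as long as the left-to-right order is kept.

```cpp
#include <cassert>
#include <list>
#include <numeric>
#include <vector>

// Adds the values as (first half) + (second half), one grouping a reducer might use.
int sumInTwoHalves(const std::vector<int>& values)
{
    assert(!values.empty());
    auto middle = values.begin() + values.size() / 2;
    int left  = std::accumulate(values.begin(), middle, 0);
    int right = std::accumulate(middle, values.end(), 0);
    return left + right;
}

// List append: associative but not commutative.
// Grouping may change, the left-to-right order may not.
std::list<int> appendInOrder(std::list<int> left, std::list<int> right)
{
    left.splice(left.end(), right);  // move all of right behind left
    return left;
}
```

For lists {1}, {2}, {3}, appending as ({1} + {2}) + {3} or as {1} + ({2} + {3}) gives the same list, while swapping the operands does not. This is why a reducer has to combine partial results in serial order even though it is free to choose the grouping.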
Cilk Reducers: views
    #include <cilk/cilk.h>
    #include <cilk/reducer_opadd.h>

    int total = 0;
    cilk::reducer< cilk::op_add<int> > reducer_total(0);
    cilk_for (int i = 0; i < 4; ++i)
        ++*reducer_total;   // dereference to reach this strand's view
    total = reducer_total.get_value();

- At spawn each strand gets a private view of the reducer
- Strands must dereference the pointer to operate on its value
- When strands merge
  - Views are combined by OP
  - The combined view is given to the exit strand
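The idea of private views can be imitated without Cilk Plus. The following sketch is an emulation in standard C++ with std::thread, not the Cilk runtime's actual mechanism: each worker increments its own slot (its "view", starting from the identity 0), and the slots are combined with + after the workers finish. The function name and the strided split are assumptions for illustration.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Emulates a sum reducer: private views per worker, merged after the join.
int countWithPrivateViews(int iterations, int workers)
{
    assert(iterations >= 0 && workers > 0);
    std::vector<int> views(workers, 0);  // one view per worker, identity value 0
    std::vector<std::thread> pool;

    for (int w = 0; w < workers; ++w) {
        pool.emplace_back([&views, w, iterations, workers] {
            int local = 0;
            for (int i = w; i < iterations; i += workers)
                ++local;        // no other thread can see this counter
            views[w] = local;   // each worker writes only its own slot
        });
    }
    for (std::thread& t : pool) t.join();

    // Merge step: combine the views with OP, here integer addition.
    int total = 0;
    for (int view : views) total += view;
    return total;
}
```

With 4 iterations on 4 workers each view ends at 1 and the merged total is 4, with no lock and no shared counter inside the loop.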
Cilk Plus Reducers: views
[Diagram: for the reducer_total loop above, each strand starts from a private view of 0 and increments it; merge updates combine the views until total=4.]
Cilk Plus Reducers: types
usage: cilk::reducer< cilk::REDUCER_TYPE<my_type> > my_reducer;

Reducer Type      Description             Header
op_add            ++, --, +=, -=, +, -    #include <cilk/reducer_opadd.h>
op_vector         Provides push_back()    #include <cilk/reducer_vector.h>
op_list_append    Provides push_back()    #include <cilk/reducer_list.h>
op_list_prepend   Provides push_front()   #include <cilk/reducer_list.h>
op_max            Returns maximum         #include <cilk/reducer_max.h>
op_min            Returns minimum         #include <cilk/reducer_min.h>
op_ostream        Provides <<             #include <cilk/reducer_ostream.h>
Table: https://software.intel.com/en-us/node/522606
Parallel gold sifting
    #include <iostream>
    #include <list>
    #include <cilk/cilk.h>
    #include <cilk/reducer_list.h>

    std::list<int> withGold;
    cilk::reducer< cilk::op_list_append<int> > reducer_withGold;
    Pan* myPans = new Pan[nPans];

    cilk_for (int i = 0; i < nPans; ++i)
    {
        bool gold = myPans[i].hasGold();
        if (gold) {
            reducer_withGold->push_back(i);   // appends to this strand's view
        }
    }
    // retrieve the merged list after the parallel loop
    withGold = reducer_withGold.get_value();

    std::list<int>::const_iterator iterator;
    for (iterator = withGold.begin(); iterator != withGold.end(); ++iterator)
        std::cout << *iterator << " ";
    std::cout << std::endl;
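To see why a list-append reducer can return the pan IDs in serial order, here is an emulation in standard C++ with std::thread. It is a sketch of the idea, not the Cilk Plus implementation: each worker collects hits for a contiguous block of pans in a private list, and the lists are spliced together from left to right. The free function hasGold is a hypothetical stand-in for Pan::hasGold(), and findGold is a name made up for this example.

```cpp
#include <cassert>
#include <list>
#include <thread>
#include <vector>

// Hypothetical stand-in for Pan::hasGold(): pan i has gold when i is a multiple of 7.
bool hasGold(int pan) { return pan % 7 == 0; }

// Finds the pans with gold using one private list per worker.
std::list<int> findGold(int nPans, int workers)
{
    assert(nPans >= 0 && workers > 0);
    std::vector<std::list<int>> views(workers);
    std::vector<std::thread> pool;

    for (int w = 0; w < workers; ++w) {
        // Contiguous block [begin, end) so that block order matches serial order.
        int begin = static_cast<int>(1LL * nPans * w / workers);
        int end   = static_cast<int>(1LL * nPans * (w + 1) / workers);
        pool.emplace_back([&views, w, begin, end] {
            for (int i = begin; i < end; ++i)
                if (hasGold(i)) views[w].push_back(i);  // private list, no lock needed
        });
    }
    for (std::thread& t : pool) t.join();

    // Merge step: append the views from left to right.
    std::list<int> withGold;
    for (std::list<int>& view : views)
        withGold.splice(withGold.end(), view);
    return withGold;
}
```

findGold(30, 4) and findGold(30, 1) both return 0 7 14 21 28: the worker count changes how the work is grouped, not the order of the result.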
Gold Rush
$> cat /proc/cpuinfo | grep Xeon | uniq -c
     16 model name : Intel(R) Xeon(R) CPU E5-2670 0 @ 2.60GHz
$> CILK_NWORKERS=16 ./goldRush
Gold Rush!
100000 total chunks of dirt
1000 pans
Found gold in 15 pans
Pan IDs: 94 142 265 268 289 440 442 443 569 600 721 781 783 806 818
Cilk identified the correct pans
serial execution took 5.60154 seconds
parallel execution took 0.39801 seconds with 16 workers
parallel speedup 14.0739
parallel efficiency 0.879616
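The last two figures follow directly from the timings: speedup is serial time divided by parallel time, and efficiency is speedup divided by the number of workers. A small sketch of that arithmetic (the struct and function names are made up for this example):

```cpp
#include <cassert>

struct Scaling {
    double speedup;     // serial time / parallel time
    double efficiency;  // speedup / number of workers
};

// Computes speedup and efficiency from two wall-clock timings.
Scaling scaling(double serialSeconds, double parallelSeconds, int workers)
{
    assert(serialSeconds > 0.0 && parallelSeconds > 0.0 && workers > 0);
    double speedup = serialSeconds / parallelSeconds;
    return {speedup, speedup / workers};
}
```

With the numbers above, scaling(5.60154, 0.39801, 16) gives a speedup of about 14.07 and an efficiency of about 0.88, matching the program's report.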