Claude Tadonki
Laboratoire de l’Accélérateur Linéaire/IN2P3/CNRS
University of Orsay, Orsay, France
1st Workshop on Applications for Multi and Many Core Architectures
22nd International Symposium on Computer Architecture and High Performance Computing (SBAC-PAD 2010)
October 27–30, 2010, Petrópolis, Rio de Janeiro, Brazil
Ring pipelined algorithm for the algebraic path problem on the CELL Broadband Engine C. TADONKI
The Algebraic Path Problem
The Warshall-Floyd Algorithm
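The body of this slide did not survive extraction, so the following is only a minimal sketch of the textbook Warshall-Floyd recurrence, not necessarily the formulation shown in the talk. The function name `warshall_floyd` and its `plus`/`times` parameters are illustrative assumptions. The sketch also assumes a semiring in which the closure of each pivot element is the identity (true for min-plus with non-negative weights), so the pivot closure step of the general algebraic path problem is omitted.

```python
import math


def warshall_floyd(a, plus=min, times=lambda x, y: x + y):
    """Textbook Warshall-Floyd over a user-supplied (plus, times) pair.

    Defaults to the min-plus semiring, i.e. all-pairs shortest paths.
    Assumption: the closure of every pivot element is the semiring identity,
    so no explicit closure step is applied to d[k][k].
    """
    n = len(a)
    d = [row[:] for row in a]  # work on a copy, leave the input intact
    for k in range(n):          # pivot: allow paths going through vertex k
        for i in range(n):
            for j in range(n):
                d[i][j] = plus(d[i][j], times(d[i][k], d[k][j]))
    return d


INF = math.inf
graph = [
    [0, 3, INF],
    [INF, 0, 1],
    [2, INF, 0],
]
dist = warshall_floyd(graph)
print(dist)  # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```

Passing `plus=lambda x, y: x or y` and `times=lambda x, y: x and y` on a boolean adjacency matrix gives the transitive closure with the same loop nest, which is the sense in which the algebraic path problem unifies these computations.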
Shift-toroidal Reindexation (Kung-Lo-Lewis, 1987)
The CELL Broadband Engine
Ring Pipelined Algorithm for the APP (algorithm)
Interesting properties of our algorithm
Runs with any number of processors p <= N (natural LPGS, i.e. locally parallel, globally sequential partitioning)
Generic tiling applies (LSGP, i.e. locally sequential, globally parallel partitioning, by blocking)
Each processor requires only a buffer of size bN, where b is the block size
Fully pipelined process with local synchronization only
Perfect computation-communication overlap
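The bN buffer requirement above can be turned into a quick sizing check. The sketch below is only illustrative arithmetic: the function names, the 4-byte element size, and the 128 KiB budget are assumptions rather than figures from the talk. The only hardware fact relied on is that each SPE has a 256 KB local store, which also has to hold code and any other buffers.

```python
def ring_buffer_bytes(n, b, elem_bytes=4):
    """Bytes needed for one b-by-n buffer (hypothetical helper).

    elem_bytes=4 assumes single-precision elements.
    """
    return b * n * elem_bytes


def max_block_size(n, budget_bytes, elem_bytes=4):
    """Largest block size b such that a b-by-n buffer fits in budget_bytes."""
    return budget_bytes // (n * elem_bytes)


# Assumed example: N = 1024, and half of a 256 KB local store
# reserved for the buffer (the budget is an assumption, not a measured value).
N = 1024
budget = 128 * 1024
print(ring_buffer_bytes(N, 16))   # 65536 bytes for b = 16
print(max_block_size(N, budget))  # 32
```

Because the buffer grows linearly with N for a fixed b, the block size has to shrink as the matrix grows if the budget stays fixed.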
Ring Pipelined Algorithm for the APP (implementation on the CELL BE)
DMA transfers with the PPE are issued only by the first and the last processors of the ring
Inner SPEs communicate and synchronize locally
Every communication is overlapped with computation
Extends naturally to more SPEs or to CELL Blades
Performance
Conclusion and Perspectives
Our ring SPMD algorithm suits the CELL BE and scales well
Communication and synchronization account for less than 5% overhead
Absolute performance can be improved by optimizing the APP kernel
Close to 80% of peak performance is expected
Our scheduling can be applied to similar problems
END & QUESTIONS