41
Daniel Kleine-Albers, 31.03.2009 JASS09, St. Petersburg Computing eigenvalues in parallel

Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

  • Upload
    others

  • View
    4

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Daniel Kleine-Albers, 31.03.2009JASS09, St. Petersburg

Computing eigenvalues in parallel

Page 2: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

2

Computing eigenvalues in parallel.Contents.

1. Motivation

2. Intro to parallelization

3. Algorithms- QR iteration- Divide and Conquer- MRRR

4. Conclusion

Page 3: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Eigenvalues.How they are defined.

• Av = λv

• λ is the eigenvalue

• v is the corresponding eigenvector

• no change of direction

• Solutions to characteristic polynomial det(A-λI) = 0

• Eigenvalues stay the same for similar matrices

3

Page 4: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Eigenvalues.What they are good for.

• Quantum mechanics

• Compression algorithms

• Eigenfrequency

• Further physical applications in stiffness calculations, load analysis, etc.

4

Page 5: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Parallelisation.Why it is important.

• High performance computers

• Single core clock speeds are close to the physical limits

• Multi core systems are standard

• Even mobile phones will become multi core processors soon

• Flexibility: Parallel programs can easily be run serial, but not the other way round

5

Page 6: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Parallelisation.Basic parallel architectures.

6

Uniform Memory

Processor 1

Processor 1I

Memory

Distributed Memory

Processor 1

Processor 1I

Memory

Memory

Page 7: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Parallelisation.Criteria for parallelization.

• Data locality

• Data dependence

• Communication overhead

• Speedup achieved

7

Page 8: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

8

Computing eigenvalues in parallel.Contents.

1. Motivation

2. Intro to parallelization

3. Algorithms- QR iteration- Divide and Conquer- MRRR

4. Conclusion

Page 9: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Algorithms.Choose depending on your needs.

9

QR

Eigenvalues only Eigenvalues + Eigenvectors

MRRR

Part spectrum Full spectrum

clustered not clusteredDivide and conquer

MRRR MRRRBisection and

inverse iteration

Page 10: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

10

Computing eigenvalues in parallel.Contents.

1. Motivation

2. Intro to parallelization

3. Algorithms- QR iteration- Divide and Conquer- MRRR

4. Conclusion

Page 11: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Basic algorithm.

n = size❨A❩while ❨n > 1❩ ❴

Factorize A = QR using QR decompositionA = RQif ❨an,n-1 < tolerance❩ ❴output λn = an,nremove row and column nn--

❵❵

output λn = an,n11

Page 12: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Notes.

• Based on Givens rotations to zero out the sub-diagonal elements:O(n2) rotations per step

• QR decomposition does not need to be explicitly calculated:A = ... G2 * ❨G1 * A * G1T❩ * G2T ...

• Performance depends mainly on the number of non-zero elements in the matrix

12

Page 13: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Order of Givens rotations.

13

Page 14: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Changed values.

14

1st / 2nd1s

t / 2

nd

1st only2nd only

Values for calculating 1st rotation

Values for calculating 2nd rotation

Page 15: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Pipeline parallelization.

15

1st / 2nd1s

t / 2

nd

1st only2nd only

Values for calculating 1st rotation

Values for calculating 2nd rotation

1 2

3 4

Page 16: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Parallelization.

• Hard to parallelize the algorithm itself because steps need to be executed sequentially -> however a pipeline scheme is possible

• Idea: Transform the matrix to a compact form to reduce number of required Givens rotations

16

Page 17: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Finding eigenvalues efficiently.General approach.

• Transform the matrix to compact formUpper hessenberg for non-symmetric,Tridiagonal for symmetric matrices

• Find eigenvalues of reduced matrix, e.g. by using QR iteration

• Backtransform, if eigenvectors are required

17

Page 18: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Compact matrix forms.

18

Upper Hessenberg

Full matrix

Tridiagonal

Symmetric matrix

Householder transformations

=> only O(n) Givens rotations required in each step

Page 19: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Householder Transformation.Changed values at 2nd rotation.

19

0

0 0

0 0

0 0

0 0

0 0

0 0

0 0

Column vector for calculating 1st rotation

Column vector for calculating 2nd rotation

2nd Householder

Page 20: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Householder Transformation.Parallelization.

• Can we parellelize this? Yes we can!

• Application of Householder similarity transform consists of two matrix-vector products

• Matrix vector products can be parallelized easily ❨but inefficient❩

20

Page 21: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Householder Transformation.Matrix vector product parallelization.

21

1 2

1 2

1 + 2

Page 22: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Blocked Householder Transformation.Parallelization.

• For a blocked householder transformation the transformations are accumulated

• Application as matrix-matrix products, which can be parallelized good

• However - still a lot of matrix-vector operations required

22

Page 23: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Blocked Householder Transformation.Matrix matrix product parallelization.

23

1 2

3 4

1, 3 2, 4

1, 2

3, 4

Page 24: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Some timings.

Measurement done with self-written Groovy code on Core 2 Duo 2GHzon a 10x10 Lotkin Matrix

24

Variant Complexity Time

Full QR O(n3) per step 1467 ms

QR + non-block. transformation

O(n3) for redu.O(n2) per step

693 ms

QR + blocked transformation

O(n3) for redu.O(n2) per step 1017 ms

QR + blocked parallel transf.

O(n3) for redu.O(n2) per step 1109 ms

Page 25: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Householder Reduction.Some timings.

Measurement done with self-written Groovy code on Core 2 Duo 2GHzon a 10x10 Lotkin Matrix

25

Variant Complexity Time

Non-Blocked O(n3) 256 ms

Blocked O(n3) 975 ms

Blocked Parallel O(n3) 552 ms

Page 26: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

QR Iteration.Review.

• Parallelization of reduction is possible

• QR Iteration can not be parallelized good enough

• More optimization can be applied on symmetric matrices

• Reduction can be optimized using 2-step process

• Next step:Replace QR Iteration with something else

26

Page 27: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

27

Computing eigenvalues in parallel.Contents.

1. Motivation

2. Intro to parallelization

3. Algorithms- QR iteration- Divide and Conquer- MRRR

4. Conclusion

Page 28: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.General idea.

• Only applicable on symmetric matrices

• Divide the ❨reduced❩ matrix into submatrices to calculate eigenvalues and -vectors

28

bm

Page 29: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.Divide.

29

+ |bm| v vT

Page 30: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.Conquer.

30

• Mathematical formulation leads to seculiar equation

• Roots are the eigenvalues-> Zero finders used, e.g. Newton’s method

• Deflation leads to fast results

• Eigenvectors can be computed easily when eigenvalues are known

Page 31: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.Algorithm.

function [ Q, A ] = dc_eig(T)if T is 1x1

return Q=1, A=Telse

split T to T1, T2[Q1, A1] = dc_eig(T1)[Q2, A2] = dc_eig(T2)based on A1,A2,Q1,Q2 find combined eigenvalues A and eigenvectors Qreturn Q, A

endifend

31

Can be done in parallel

Page 32: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.Parallelization.

32

Depth = 1 Depth = 2

1

2

1

2

3

4

Page 33: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Divide and Conquer.Review.

• Parallelization of reduction as before

• Divide and Conquer can be parallelized naturally and very easy - “embarrassingly parallel”

• Also computes eigenvectors -> sometimes slower if you only want eigenvalues

• In practice: Used for n > 25smaller n: switch to QR iteration

• Extra memory required

33

Page 34: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

34

Computing eigenvalues in parallel.Contents.

1. Motivation

2. Intro to parallelization

3. Algorithms- QR iteration- Divide and Conquer- MRRR

4. Conclusion

Page 35: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

MRRR.Multiple relatively robust representations.

• Modern algorithm

• Applicable to symmetric matrices

• O(nk)

• Based on multiple LDLT representations

• Calculated eigenvectors are numerically orthogonal without reorthogonalization

• Very high accuracy

35

Page 36: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

MRRR.New Representations.

• For each cluster of eigenvalues a new LDLT representation is build-> eigenvalues need to be “estimated”

• The new representation is a “child” of its “parent” representation

• LCDCLCT = LDLT-τI

• τ is chosen to shift into the cluster

36

Page 37: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

MRRR.Representation tree.

37

L0 D0 L0T

{1, ..., 11}

Initial factorization

Eigenpair indices

L1 D1 L1T

{2, 3}L2 D2 L2T

{5, ..., 10}{1} {4} {11}

{2} {3} L3 D3 L3T

{6, 7, 8}{5} {9} {10}

Single Eigenpair

Cluster

Page 38: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

MRRR.Parallelisation.

• Serially the algorithm builds a work queue that is processed

• This Queue can also be split up to several processors

38

L0 D0 L0T

{1, ..., 11}

L1 D1 L1T

{2, 3}L2 D2 L2T

{5, ..., 10}{1} {4} {11}

1

12 2

34

Page 39: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

MRRR.Review.

• O(nk)

• Calculated eigenvectors are numerically orthogonal without reorthogonalization

• Can be used for subsets of eigenpairs

• Very high accuracy

39

Page 40: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Computing eigenvalues in parallel

JASS09

Daniel Kleine-Albers

31.03.2009

Computing eigenvalues in parallel.Conclusion.

• Three steps:1. Transformation to compact form2. Calculation of eigenvalues/-vectors3. Back-Transformation

• Transformation and Back-Transformation parallelizable using Matrix-Matrix-Products

• Three algorithms:- QR iteration ❨for all kinds of matrices❩- Divide and Conquer ❨symmetric matrices❩- MRRR ❨symmetric matrices❩

40

Page 41: Computing eigenvalues in parallel · 2009. 8. 21. · Symmetric matrix Householder transformations => only O(n) Givens rotations required in each step. Computing eigenvalues in parallel

Thank you for your attention.