Efficient Graph Matching and Coloring on the GPU Jonathan Cohen Patrice Castonguay

Efficient Graph Matching and Coloring on the GPU - GTC 2012 · on-demand.gputechconf.com/gtc/2012/presentations/S0332... · 2013-08-23


Efficient Graph Matching and Coloring on the GPU

Jonathan Cohen

Patrice Castonguay


NVIDIA GTC 2012

Key Idea

Cost model: count global syncs

Reasoning:

One kernel invocation = One global sync

1) Read graph data

2) Compute something

3) Write results

4) Wait for all threads to finish (sync)

Assume “read graph data” and “wait” (sync) dominate

Model is too crude today, but it leads to algorithms that scale with future trends (and bigger machines)

Reducing kernel launches generally improves performance

Conclusion: we want coloring and matching algorithms that require the fewest kernel launches


Graph Coloring

Assignment of a “color” (integer) to each vertex, such that no two adjacent vertices have the same color

Each color forms an independent set (conflict-free), which reveals the parallelism inherent in the graph topology

“inexact” coloring is often ok

Our focus: fast, cheap, non-optimal colorings
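The definition above can be checked mechanically. The following is a minimal sketch, assuming the graph is stored as a symmetric adjacency dictionary; the names `is_valid_coloring` and `color_classes` are illustrative and do not come from the talk.

```python
def is_valid_coloring(adj, color):
    """Return True if no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


def color_classes(color):
    """Group vertices by color. In a valid coloring each group is an
    independent set, so its vertices can be processed in parallel."""
    classes = {}
    for v, c in color.items():
        classes.setdefault(c, []).append(v)
    return classes


# Example: the 4-cycle 0-1-2-3-0 colored with two colors.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [2, 0]}
color = {0: 0, 1: 1, 2: 0, 3: 1}
```

A non-optimal coloring simply produces more (smaller) classes; it remains valid as long as the check passes.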


Parallel Graph Coloring – Luby-Jones

Parallel graph coloring algorithm of Luby / Jones-Plassmann

[Figure: example graph with a random number assigned to each vertex]


Parallel Graph Coloring – Luby-Jones

Classic approach: compute array of random numbers

First optimization: compute a hash function of vertex index on the fly

Each vertex can compute the hash of its neighbors’ indices

Trades bandwidth for compute and skips the kernel that assigns random numbers
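A minimal sketch of the hash-on-the-fly idea. The talk does not say which hash function is used; the 32-bit mixer below (the MurmurHash3 finalizer) is a stand-in chosen for illustration, and `vertex_hash` and `is_local_max` are hypothetical names. Because every step of the mixer is invertible, distinct 32-bit indices get distinct hash values, so there are no ties to break.

```python
MASK32 = 0xFFFFFFFF


def vertex_hash(i, seed=0):
    """Stand-in 32-bit integer hash (MurmurHash3 finalizer); an assumption,
    not the hash used in the talk."""
    h = (i ^ seed) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def is_local_max(v, adj):
    """Recompute the neighbors' hashes from their indices instead of
    reading a precomputed array of random numbers."""
    hv = vertex_hash(v)
    return all(hv > vertex_hash(u) for u in adj[v])


# Example: a path graph 0-1-2-...-7.
adj = {v: [u for u in (v - 1, v + 1) if 0 <= u < 8] for v in range(8)}
maxima = [v for v in adj if is_local_max(v, adj)]
```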


Parallel Graph Coloring – Luby-Jones

Round 1: Each vertex checks if local maximum

=> Adjacent vertices can’t both be local maxima

If max, color=green.


Parallel Graph Coloring – Luby-Jones

Round 2: Each vertex checks if local maximum, ignoring green

If max, color=blue


Parallel Graph Coloring – Luby-Jones

Round 3: Each vertex checks if local maximum, ignoring colored nbrs

If max, color=purple


Parallel Graph Coloring – Luby-Jones

Round 4: Each vertex checks if local maximum, ignoring colored nbrs

If max, color=red


Parallel Graph Coloring – Luby-Jones

Round 5: Each vertex checks if local maximum, ignoring colored nbrs

If max, color=white


Parallel Graph Coloring – Luby-Jones

Round 6: Each vertex checks if local maximum, ignoring colored nbrs

If max, color=yellow

Completes in 6 rounds
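The rounds above can be sketched sequentially as follows. This is an illustrative CPU version under stated assumptions (a simple undirected graph with no self-loops, stored as a symmetric adjacency dictionary with comparable vertex ids), not the authors' GPU kernel. Each pass of the `while` loop stands in for one kernel launch, and `luby_jones_coloring` is a hypothetical name.

```python
import random


def luby_jones_coloring(adj, seed=0):
    """Color the local maxima of the uncolored subgraph, one color per round."""
    rng = random.Random(seed)
    # Random value per vertex; the vertex id breaks ties.
    weight = {v: (rng.random(), v) for v in adj}
    color = {}
    rounds = 0
    while len(color) < len(adj):
        # One "kernel launch": every uncolored vertex compares itself
        # with its uncolored neighbors only.
        winners = [
            v for v in adj
            if v not in color
            and all(weight[v] > weight[u] for u in adj[v] if u not in color)
        ]
        for v in winners:
            color[v] = rounds
        rounds += 1
    return color, rounds


# Example: a 5-cycle.
adj = {v: [(v - 1) % 5, (v + 1) % 5] for v in range(5)}
color, rounds = luby_jones_coloring(adj)
```

The loop always terminates because the uncolored vertex with the largest value wins in every round, and the number of colors equals the number of rounds.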


Parallel Graph Coloring – Min/Max

Realization: Local min and local max are both independent sets

They are disjoint => can produce 2 colors per iteration


Parallel Graph Coloring – Min/Max

Round 1: Each vertex checks if it’s a local maximum or minimum.

If max, color=blue. If min, color=green


Parallel Graph Coloring – Min/Max

Round 2: Each vertex checks if it’s a local maximum or minimum.

If max, color=pink. If min, color=red


Parallel Graph Coloring – Min/Max

Round 3: Each vertex checks if it’s a local maximum or minimum.

If max, color=purple. If min, color=white

Improvement: 3 rounds versus 6
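A sequential sketch of the Min/Max variant, under the same assumptions as any simple adjacency-dictionary model (symmetric, no self-loops, comparable vertex ids). Each round hands out two colors: one to the local maxima and one to the local minima of the uncolored subgraph. The name `min_max_coloring` is illustrative.

```python
import random


def min_max_coloring(adj, seed=0):
    """Assign two colors per round: local maxima get color 2r,
    local minima get color 2r + 1."""
    rng = random.Random(seed)
    weight = {v: (rng.random(), v) for v in adj}  # vertex id breaks ties
    color = {}
    rounds = 0
    while len(color) < len(adj):
        new = {}
        for v in adj:
            if v in color:
                continue
            # Compare against neighbors that were uncolored at round start.
            nbrs = [u for u in adj[v] if u not in color]
            if all(weight[v] > weight[u] for u in nbrs):
                new[v] = 2 * rounds
            elif all(weight[v] < weight[u] for u in nbrs):
                new[v] = 2 * rounds + 1
        color.update(new)
        rounds += 1
    return color, rounds


# Example: a 6-cycle.
adj = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
color, rounds = min_max_coloring(adj)
```

The two sets produced in a round never collide: two adjacent vertices cannot both be maxima or both be minima, and the `if`/`elif` keeps the sets disjoint.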


Parallel Graph Coloring – Multi-Hash

Use multiple hash functions to obtain multiple 2-colorings of the graph

[Figures: the 2-colorings obtained with hash functions 1, 2, and 3]

Combine all 2-colorings – completes in 1 round!

Creates well-balanced graph colorings

Empirically: produces better colorings than Luby-Jones – not sure why
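One way to read the Multi-Hash idea is sketched below. It is an interpretation, not the authors' code: each vertex evaluates several hash functions in a single pass and takes its color from the first one under which it is a local maximum or minimum. The hash (the MurmurHash3 finalizer with a per-function seed), the seed choice, and the name `multi_hash_coloring` are all assumptions. Vertices that are not an extremum under any of the hashes stay uncolored in this sketch and would need more hash functions or another pass.

```python
MASK32 = 0xFFFFFFFF


def vertex_hash(i, seed):
    """Stand-in 32-bit hash (MurmurHash3 finalizer); distinct indices
    give distinct values for a fixed seed."""
    h = (i ^ seed) & MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def multi_hash_coloring(adj, num_hashes=3):
    """Single pass: hash k contributes colors 2k (local max) and 2k + 1 (local min)."""
    seeds = [(k * 0x9E3779B9) & MASK32 for k in range(num_hashes)]
    color = {}
    for v in adj:  # each vertex is independent of the others
        for k, seed in enumerate(seeds):
            hv = vertex_hash(v, seed)
            nbr = [vertex_hash(u, seed) for u in adj[v]]
            if all(hv > h for h in nbr):
                color[v] = 2 * k
                break
            if all(hv < h for h in nbr):
                color[v] = 2 * k + 1
                break
    return color  # may be partial


# Example: a 6-cycle.
adj = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
color = multi_hash_coloring(adj)
```

Two adjacent vertices can never receive the same color here, because they cannot both be the maximum (or both the minimum) under the same hash function.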



100% Coloring Results

Speedup


Number of Colors (100% Coloring)


95% Coloring Results

Speedup


Graph Matching

Set of edges such that no two edges share a vertex

“Maximum matching” – a matching that includes the largest possible number of edges

Equivalent: independent set on the dual (line graph) of the graph, i.e., independent pairs of connected vertices
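The equivalence can be illustrated with a small sketch. It reads the slide's "dual" as the line graph (one node per edge, two nodes adjacent when the edges share a vertex), which is an interpretation; the function names are illustrative and the graph is assumed to have no self-loops.

```python
from itertools import combinations


def is_matching(edges):
    """True if no two edges in the collection share a vertex."""
    seen = set()
    for u, v in edges:
        if u in seen or v in seen:
            return False
        seen.update((u, v))
    return True


def line_graph(edges):
    """Pairs of edges that share a vertex (the adjacencies of the line graph)."""
    return {(e, f) for e, f in combinations(edges, 2) if set(e) & set(f)}


def is_independent_in_line_graph(subset, edges):
    """True if no two chosen edges are adjacent in the line graph."""
    conflicts = line_graph(edges)
    return not any(
        (e, f) in conflicts or (f, e) in conflicts
        for e, f in combinations(subset, 2)
    )


# Example: the path 0-1-2-3.
edges = [(0, 1), (1, 2), (2, 3)]
```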


One-Phase Handshaking


Each vertex extends a hand to its strongest neighbour


Set aggregates

Each vertex checks if its strongest neighbour extended a hand back


Repeat with unmatched vertices
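The steps above can be sketched sequentially as follows. This is an illustrative version, assuming "strongest" means the largest edge weight, that weights are symmetric, and that ties are broken by the larger neighbor id; the weighted adjacency layout and the name `one_phase_handshaking` are assumptions.

```python
def one_phase_handshaking(wadj):
    """wadj[v] maps each neighbor u of v to the weight of edge (v, u)."""
    match = {}
    rounds = 0
    while True:
        # Each unmatched vertex extends a hand to its strongest unmatched neighbor.
        hand = {}
        for v in wadj:
            if v in match:
                continue
            cands = [(w, u) for u, w in wadj[v].items() if u not in match]
            if cands:
                hand[v] = max(cands)[1]
        if not hand:
            break
        rounds += 1
        # A pair is matched when both vertices extended a hand to each other.
        for v, u in hand.items():
            if hand.get(u) == v:
                match[v] = u
    return match, rounds


# Example: path 0-1-2-3 with edge weights 1, 3, 2.
wadj = {0: {1: 1}, 1: {0: 1, 2: 3}, 2: {1: 3, 3: 2}, 3: {2: 2}}
match, rounds = one_phase_handshaking(wadj)
```

In the example only the strongest edge (1, 2) is matched; vertices 0 and 3 are left without an unmatched neighbor, which illustrates why one-phase handshaking can stall below a matching target.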


Two-Phase Handshaking


Extend a first hand to your strongest neighbour


Extend a second hand to the strongest vertex among those who gave a hand to you


Keep only edges which have a handshake

New graph has maximum degree 2


Now do one-phase handshaking


Find perfect matches


Repeat


N-Way Handshaking


Extend N hands at once (N=2)

Similar to first 2 steps of two-phase, but in a single step


Discard edges without a match

Resulting graph has max degree N (N=2)


Now do one-phase


Select perfect matches


Repeat
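A sequential sketch of the N-way steps. It is one interpretation of the slides, not the authors' implementation: "strongest" is taken to mean the largest symmetric edge weight, ties are broken by the larger neighbor id, and a single handshake is performed on the reduced graph before repeating. The function names are hypothetical.

```python
def n_way_round(wadj, matched, n=2):
    """One round: extend n hands, discard one-sided edges, then handshake."""
    # Each unmatched vertex extends hands to its n strongest unmatched neighbors.
    hands = {}
    for v in wadj:
        if v in matched:
            continue
        cands = sorted(
            ((w, u) for u, w in wadj[v].items() if u not in matched),
            reverse=True,
        )
        hands[v] = {u for _, u in cands[:n]}
    # Keep only edges with a hand from both sides; degree is at most n.
    reduced = {
        v: {u: wadj[v][u] for u in hs if v in hands.get(u, ())}
        for v, hs in hands.items()
    }
    # One-phase handshake on the reduced graph.
    pick = {
        v: max((w, u) for u, w in nbrs.items())[1]
        for v, nbrs in reduced.items() if nbrs
    }
    return {v: u for v, u in pick.items() if pick.get(u) == v}


def n_way_handshaking(wadj, n=2):
    """Repeat rounds until no new pair is found."""
    matched = {}
    while True:
        new = n_way_round(wadj, matched, n)
        if not new:
            break
        matched.update(new)
    return matched


# Example: a star with center 0 and leaves 1, 2, 3 (edge weights 1, 2, 3).
wadj = {0: {1: 1, 2: 2, 3: 3}, 1: {0: 1}, 2: {0: 2}, 3: {0: 3}}
matched = n_way_handshaking(wadj)
```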


Graph Matching Performance

(90% Matching Target)


Conclusion

Graph coloring and matching algorithms can be highly data parallel

Key optimizations:

More work per thread, fewer global synchronizations

Replace random numbers with hash functions

One view: recast in terms of generalized Sparse Matrix-Vector product (SpMV)

For each row (in parallel)

Visit each neighbor, compute something

Compute reduction

Write out single result
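The four steps above can be sketched as a single generic routine over a graph in CSR form. This is a sequential illustration with hypothetical names (`generalized_spmv`, `visit`, `reduce_fn`); on a GPU the outer loop over rows would be the parallel dimension.

```python
def generalized_spmv(row_ptr, col_idx, visit, reduce_fn, init):
    """For each row: visit every neighbor, reduce the results, write one value."""
    out = []
    for row in range(len(row_ptr) - 1):          # for each row (in parallel)
        acc = init
        for k in range(row_ptr[row], row_ptr[row + 1]):
            acc = reduce_fn(acc, visit(row, col_idx[k]))  # visit + reduce
        out.append(acc)                          # write out a single result
    return out


# Example: path graph 0-1-2 in CSR form, with one value per vertex.
row_ptr = [0, 1, 3, 4]
col_idx = [1, 0, 2, 1]
x = [5, 9, 3]

# Local-maximum test (the core of a coloring round) as a generalized SpMV.
local_max = generalized_spmv(
    row_ptr, col_idx,
    visit=lambda r, c: x[r] > x[c],
    reduce_fn=lambda a, b: a and b,
    init=True,
)

# Ordinary sum over neighbor values, i.e. SpMV with an all-ones matrix.
nbr_sum = generalized_spmv(
    row_ptr, col_idx,
    visit=lambda r, c: x[c],
    reduce_fn=lambda a, b: a + b,
    init=0,
)
```

Swapping the `visit` and `reduce_fn` arguments is enough to express the local-maximum test used for coloring or the strongest-neighbor search used for handshaking.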


Questions?

Tech report and source code with lots more details are forthcoming

Thanks to the entire NVAMG team