4
Comput. & Elect.Engng Vol. 5, pp. 365-368 0045-790617811201-03651502.0010 (c)Pergamon PressLtd., 197g. Printed in GreatBritain AN ALGORITHM FOR FINDING ALL MAXIMAL COMPLETE SUBGRAPHS AND AN ESTIMATE OF THE ORDER OF COMPUTATIONAL COMPLEXITY S. R. DAS,t C. L. SHENG and Z. CHEN Institute of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, Republic of China (Received 6 April 1978; received for publication 21 September 1978) Abstract--This paper develops an algorithm for finding all maximal complete subgraphs or cliques of an undirected graph. The algorithm is simple, and is based on a refinement of the technique of successive splitting described by Paull and Unger in the determination of maximal compatibles of states in the context of minimization of incomplete sequential machines. The proposed algorithm tends to reduce computation in generating the subgraphs for problems most generally encountered, particularly in relation to the ap- plicability in sequential switching theory. I. INTRODUCTION Linear graph theory is finding increasing applications as a tool of analysis in widely differing areas of science and technology[I-19]. Linear symmetric or undirected graphs in particular have special applications in sequential switching theory, map coloring problems, transportation research, systems programming, etc. and have also been widely studied. In the aforesaid context an often encountered problem is to enumerate all maximal complete subgraphs, also called cliques, or sometimes maximal cliques, of symmetric graphs. A maximal complete subgraph or clique of a symmetric graph G is a complete subgraph that is not contained in any other complete subgraph of G. Many different techniques exist in the literature to solve the clique generation problem of symmetric graphs. The studies made by Augustson and Minker, Bierstone, Mulligan, Mulligan and Corneil, Osteen, Bron and Kerbosch, Tarjan, Das, Das et al. and of others seem rather significant in this regard. Augustson and Minker in their paper described a number of techniques for finding maximal complete subgraphs of a given un- directed graph. The authors evaluated a number of clique finding techniques and report an algorithm by Bierstone as being the most efficient one. Note that Bierstone algorithm as reported by Augustson and Minker contained an error. This error was independently found by Mulligan and Corneil, and by Bron and Kerbosch. Bron and Kerbosch developed two back- tracking algorithms, using a branch-and-bound technique to cut off branches that cannot lead to a clique. Their first version is a straightforward implementation of the basic algorithm and generates cliques in lexicographic order. The second version is derived from the first and generates cliques in an unpredictable order in an attempt to minimize the number of branches to be traversed. The authors claim that both version 1 and version 2 perform significantly better than Bierstone algorithm. The processing time for version 1 is proportional to 4k, whereas for version 2 it is 3.14k, for some constant k characteristic of the graph. Osteen described two clique detection algorithms. One utilizes the cliques of a subgraph of the graph having the same vertices, and the set of edges of the graph not in the subgraph. The other utilizes the cliques of a supergraph of the graph having the same vertices, and the set of edges of the supergraph not in the graph. Tarjan described a recursive algorithm for finding a clique in an undirected graph that has a worst-case time bound of k(1.286)m for some constant k, if m is the number of vertices in the graph. This algorithm is a substantial improvement over the obvious algorithm. Within a fixed time this algorithm can handle a graph with about 2~ as many vertices as the obvious algorithm. In the present paper an algorithm is proposed to solve the clique problem that seems efficient and simple, and tends to reduce computation in generating the subgraphs for problems most often encountered, particularly in relation to the applicability in sequential switching theory. tS. R. Das was formerly with the Institute of Radiophysics and Electronics, University of Calcutta, Calcutta, India. 365

An algorithm for finding all maximal complete subgraphs and an estimate of the order of computational complexity

  • Upload
    sr-das

  • View
    212

  • Download
    0

Embed Size (px)

Citation preview

Comput. & Elect. Engng Vol. 5, pp. 365-368 0045-790617811201-03651502.0010 (c) Pergamon Press Ltd., 197g. Printed in Great Britain

AN ALGORITHM FOR FINDING ALL MAXIMAL COMPLETE SUBGRAPHS AND AN ESTIMATE

OF THE ORDER OF COMPUTATIONAL COMPLEXITY

S. R. DAS,t C. L. SHENG and Z. CHEN Institute of Computer Science, National Chiao Tung University, Hsinchu, Taiwan, Republic of China

(Received 6 April 1978; received for publication 21 September 1978)

Abstract--This paper develops an algorithm for finding all maximal complete subgraphs or cliques of an undirected graph. The algorithm is simple, and is based on a refinement of the technique of successive splitting described by Paull and Unger in the determination of maximal compatibles of states in the context of minimization of incomplete sequential machines. The proposed algorithm tends to reduce computation in generating the subgraphs for problems most generally encountered, particularly in relation to the ap- plicability in sequential switching theory.

I. INTRODUCTION

Linear graph theory is finding increasing applications as a tool of analysis in widely differing areas of science and technology[I-19]. Linear symmetric or undirected graphs in particular have special applications in sequential switching theory, map coloring problems, transportation research, systems programming, etc. and have also been widely studied. In the aforesaid context an often encountered problem is to enumerate all maximal complete subgraphs, also called cliques, or sometimes maximal cliques, of symmetric graphs. A maximal complete subgraph or clique of a symmetric graph G is a complete subgraph that is not contained in any other complete subgraph of G. Many different techniques exist in the literature to solve the clique generation problem of symmetric graphs. The studies made by Augustson and Minker, Bierstone, Mulligan, Mulligan and Corneil, Osteen, Bron and Kerbosch, Tarjan, Das, Das et al.

and of others seem rather significant in this regard. Augustson and Minker in their paper described a number of techniques for finding maximal complete subgraphs of a given un- directed graph. The authors evaluated a number of clique finding techniques and report an algorithm by Bierstone as being the most efficient one. Note that Bierstone algorithm as reported by Augustson and Minker contained an error. This error was independently found by Mulligan and Corneil, and by Bron and Kerbosch. Bron and Kerbosch developed two back- tracking algorithms, using a branch-and-bound technique to cut off branches that cannot lead to a clique. Their first version is a straightforward implementation of the basic algorithm and generates cliques in lexicographic order. The second version is derived from the first and generates cliques in an unpredictable order in an attempt to minimize the number of branches to be traversed. The authors claim that both version 1 and version 2 perform significantly better than Bierstone algorithm. The processing time for version 1 is proportional to 4 k, whereas for version 2 it is 3.14 k, for some constant k characteristic of the graph. Osteen described two clique detection algorithms. One utilizes the cliques of a subgraph of the graph having the same vertices, and the set of edges of the graph not in the subgraph. The other utilizes the cliques of a supergraph of the graph having the same vertices, and the set of edges of the supergraph not in the graph. Tarjan described a recursive algorithm for finding a clique in an undirected graph that has a worst-case time bound of k(1.286) m for some constant k, if m is the number of vertices in the graph. This algorithm is a substantial improvement over the obvious algorithm. Within a fixed time this algorithm can handle a graph with about 2~ as many vertices as the obvious algorithm.

In the present paper an algorithm is proposed to solve the clique problem that seems efficient and simple, and tends to reduce computation in generating the subgraphs for problems most often encountered, particularly in relation to the applicability in sequential switching theory.

tS. R. Das was formerly with the Institute of Radiophysics and Electronics, University of Calcutta, Calcutta, India.

365

366 S . R . DAS et aL

The algorithm is based on a refinement of the technique of successive splitting described by Paull and Unger in the determination of maximal compatibles of states in the context of minimization of incomplete sequential machines. It is to be noted that the clique problem like some of the classical problems of combinatorics such as the traveling salesman problem, the Hamilton circuit problem, etc. is nondeterministic polynomial-time complete (or simply NP- complete) and as such is quite intractable. Examples are provided that demonstrate that the time complexity of no algorithm for the generation of al l maximal complete subgraphs can be less than O(1.4421GI), where IG[ = m denotes the number of nodes of the graph. G. The proposed algorithm, for the types of problems it needs to handle, will have time complexity that is less than O(1.4421~1). In computer implementation of the algorithm standard graph-theoretic languages may be used with breadth-first or depth-first search. It is relevant to mention here that the symmetric graphs considered in this paper have no self loops and multiple edges, and are assumed to be connected.

II. GRAPHS FOR WHICH THE NUMBER OF CLIQUES INCREASES EXPONENTIALLY WITH INCREASE

IN GRAPH DIMENSION

Quite often we are interested in the rate of growth of the time or space required to solve larger and larger instances of a problem. We associate with a problem an integer, called the size of the problem, which is a measure of the quantity of input data. The time needed by an algorithm expressed as a function of the size of a problem is called the time complexity of the algorithm. There is general agreement that if a problem cannot be solved in less than exponential time, then the problem should be considered completely intractable. The growth rate of an exponential function is so explosive that we say a problem is intractable if all algorithms to solve that problem are of at least exponential time complexity. It is known that the class of nondeterministic polynomial-time complete (NP-complete) problems, is quite likely to contain only intractable problems[18].

In this section we describe a class of graphs for which the number of cliques increases exponentially as r m with increase in the size of the graph, r being a constant characteristic of the graph, and m the graph dimension, which is usually the total number of nodes of the graph. Figures l(b, d and f) illustrate three such graphs with dimensions 8, 6 and 8, respectively. The general class of such graphs can be characterized thus. Consider regular graphs of dimension n x k consisting of k disjoint n-cliques, where an n-clique represents a maximal complete subgraph of dimension n. (See graphs in Figs. l(a, c and e).) The graphs constructed as the

1 1 1

.8 2 ...... / \

7 3 7 ] [

" 6 5 4 /' . ." .¢: ....

~ 4

3 \ .... I"6

5

(a)

I

4

(d)

(b) (c)

1

2 / / ~ / / / 8 2 eC__.~

~' / J / / " ' X - _ / /

5 (e)

Fig. I.

An algorithm for finding all maximal complete subgraphs 367

complement of these k disjoint n-cliques represent graphs that contain n k cliques, each of dimension k.

The above statement can be proved in the following way. Denote by G the graph that consists of k disjoint n-cliques. Obviously, the number of nodes in G is IGI = n x k. Let t~ be the complementary graph of G. Now consider any two n-cliques, Ci and Cj, of G. Let the n nodes of Ci and Cj be, respectively, il, i2 . . . . . in and Jl, ./2 . . . . . Jn. Since Ci and Cj are disjoint in G, any node i,, in Ci is adjacent to all the nodes of Ci in t~, and hence is must appear in all the complete subgraphs formed by Ci in t~ [4, 5]. This holds separately for every other node of 6",.. The number of such subgraphs is evidently n x n. Similarly, considering any other n-clique, Cr, every node of C, is adjacent to all the nodes of C~ and Cj in t~, and hence every node must appear separately in all the complete subgraphs formed by Ci and Ci in t~. The number of such subgraphs is n x n × n. Since there are k such isolated n-cliques, the total number of maximal complete subgraphs formed by these k n-cliques in t5 will be n x n x n × . . . (k times), or n k. Each of these n k cliques consists of k nodes and hence is of dimension k.

The total number of k-cliques in G can also be expressed as n k= n n×k/n =(n~/n) n×k= (n l/n)l~l. Note that for n = 3, the graph reduces to a Moon-Moser graph with number of cliques 3 k = (3~/3) IGI = (1.442) IGI. This graph contains the largest number of cliques per node[17].

The above illustrations serve to demonstrate that there exists a class of graphs for which the all clique problem cannot be solved in less than exponential time. There can be no algorithm of which the worst-case time complexity can be less than O(1.442"). But for most of the common graphs the situation may not be that bad. In the following we suggest a splitting algorithm that will generate all cliques of a given undirected graph with minimum amount of redundancy. The algorithm will generate no duplicate cliques.

III. FINDING ALL CLIQUES BY SPLITTING AN UNDIRECTED GRAPH

For two subgraphs Gi and Gk of an undirected graph G, let the set of nodes in Gi be a proper or improper subset of the set of nodes in G~, both Gi and Gk having all the existing edges of G connecting between relevant pairs of nodes. Then Gi C_ Gk. Consider now a set of nonadjacent pairs of nodes (ik, jr), r = I, 2 . . . . . t, in G. Then splitting G into two subgraphs G~ and G~ around pairs of nodes (ik, j,) is to obtain the subgraphs G~ and Gj from G such that G~ contains all the nodes of G except nodes j,, and G~ contains all the nodes of G except node ik, both G~ and Gj having all the existing edges of G connecting between relevant pairs of nodes. Obviously, Gi, Gj C G.

Theorem 1. Let G be an undirected graph, and let (ik, j,), r = 1,2 . . . . . t, be a set of t nonadjacent pairs of nodes of G, Jr being all the t nodes of G nonadjacent to ik. Let G be split around (ik, j,) into two subgraphs G~ and G i, and let this process of splitting around nonadjacent pairs of nodes be iteratively applied to Gi and Gj and to all their subgraphs until in the resulting subgraphs there exist no more nonadjacent pairs of nodes. The final set of these subgraphs includes all cliques of G.

Proof. If nodes ik and jr are nonadjacent in G, they cannot appear in a clique. The decomposition into Gi, Gi will, therefore, leave each clique intact in at least one of the two subgraphs.

The computational requirements for enumerating the cliques are greatly reduced if, at every stage, the process of splitting is carried out around those (ik, j,,) for which ik is a node of least degree in a G, (or highest degree in t~r). The generation of duplicate cliques are completely avoided, though some non-clique subgraphs may be generated.

To cancel non-clique subgraphs that may be generated, we use the following obvious theorem.

Theorem 2. In successively splitting an undirected graph into subgraphs around nonadjacent pairs of nodes, let Gi and Gj be any two subgraphs obtained at different stages such that G~ c_ Gj, but G~ is not derived from Gj. Then in finding only cliques, Gi may be discarded.

A formal algorithm to find all cliques from a given undirected graph is presented next.

ALGORITHM

(1) Given an undirected graph G, find the node (or nodes) of minimum degree in G. (2) Select a set of nonadjacent pairs of nodes (ik, jm) in G, where ik is the node of minimum degree. If

368 S.R. DA~; et al.

more than one node is of minimum degree, select pairs (ik, jm) for any ik of minimum degree. Split G around (it, ira) into two subgraphs Gi and G, such that G, contains all the nodes of G excep t nodes/ 'm, and G, conta ins all the nodes of G except node it. Cons ider now Gi(G~) and go to (I). (3) Cont inue with (1) and (2) until in all the result ing subgraphs there is no nonad jacen t pairs of nodes. (4) In the set of subgraphs obta ined af ter (3), check if any subgraph is conta ined in another subgraph for poss ible cancel la t ion of non-cl ique subgraphs. The resul tant set af ter cancel la t ion, if any, gives all c l iques of G,

In the special case of M o o n - M o s e r graphs this algori thm genera tes all 3 k k-cl iques without genera t ing any non-cl ique subgraphs . The reason is quite obvious . After the first split, one

branch conta ins a node it but does not contain any nodes nonad jacen t to it, while the other branch does not contain ik but contains all of the nodes nonad jacen t to it. Hence in all the cl iques genera ted f rom the fo rmer branch the node it appears , while in all the cliques genera ted f rom the lat ter branch one or the other of the nodes nonad jacen t to it appears , both being ad jacen t to the remaining nodes of the graph. Since the algori thm genera tes subgraphs in the form of a binary tree, the total number of internal tree nodes is 3 ~ - 1,3 t being the total number of cl iques that co r respond to the terminal tree nodes. Hence there will be 3 t - I spli t t ings necessa ry before the cl iques are obtained. Based on this observa t ion the wors t -case complex i ty can be computed . Fo r an m-node graph, finding a node of minimum degree takes m time and split t ing the graph into two subgraphs takes m 2 time. Then the total t ime for generat ing the cl iques T,, ~< 3k - 3 t + 9 k 2- 3 t = a k, which gives a = 3 • (3k) t/g • (3k + 1) ~tk, which for k very

large, is very close to, but greater than 3. Hence the wors t -case t ime complexi ty is O(at ) . In the average case, however , even al lowing the t ime needed for compar i sons among subgraphs for poss ible cancel la t ion of non-cl ique subgraphs , the time complex i ty is expec ted to be less. In actual compute r implementa t ion of the a lgori thm any s tandard graph- theore t ic language may be used with depth-first or breadth-f i rs t search, though depth-first search may be preferable in pract ice because it may be more space conserving!

REFERENCES 1. K. E. Stoffers, Scheduling of traffic lights--a new approach. Trans. Res. 2, 199-234 (1%8). 2. K. E. Stoffers, Sequential algorithm for the determination of maximum compatibles. IEEE Trans. ('omput. ('-23.95-98

(Jan. 1974). 3. J. Nieminen, On finding maximum compatibles. Proc. IEEE 63, 729-730 (April 1975). 4. S. R. Das and C. L. Sheng, On finding maximum compatibles. Proc. IEEE 57, 694-695 (April 1%9). 5. S. R. Das, On a new approach for finding all the modified cut-sets in an incompatibility graph. IEEE Tran.~. Comput.

C-22, 187-193 (Feb. 1973). 6. C. V. Ramamoorthy, Analysis of graphs by connectivity considerations. J. ACM 13, 211-222 (April 1%6). 7. M. C. Paull and S. H. Unger, Minimizing the number of states in incompletely specified sequential switching functions.

IRE Trans. Electron. Comput. EC-8, 356-367 (Sept. 1959). 8. J. Weissman, Boolean algebra, map coloring, and interconnections. Am. Math. Mon. 69, 608-613 (Sept. 1%2). 9. S. Seshu and M. B. Reed, Linear Graphs and Electrical Networks. Addison-Wesley, Reading (1%1).

10. F. Harary, Graph Theory. Addison-Wesley, Reading (1%9). 11. J. G. Augustson and J. Minker, An analysis of some graph theoretical cluster techniques. J ACM 17, 571-588 (Oct.

1970). 12. E. Bierstone, Cliques and generalized cliques in a finite linear graph. Unpublished report. 13. G. D. Mulligan, Algorithms for finding cliques of a graph. MSc. thesis, Dept. Comp. So., University of Toronto,

Toronto, Canada. 14. G. D. Mulligan and D G. Corneil, Corrections to Bierstone's algorithm for generating cliques. J. ACM 19, 244-247

(April 1972). 15. C. Bron and J. Kerbosch, Finding all cliques of an undirected graph. Comm. ACM 16, 575-577 (Sept. 1973). 16. R. E. Tarjan, Finding a maximum clique. Tech. Rep. TR-72-123, Dept. Comp. Sc.. Cornell University, Ithaca, NY

(March 1972). 17. J. W. Moon and L. Moser, On cliques in graphs. Israel J. Math. 3, 23-28 (1%5). 18. A. V. Aho, J. E. Hopcroft and J. D. Ullman, The Design and Analysis of Computer A&orithms. Addison-Wesley,

Reading (1975). 19. S. R, Das, A. K. Choudhury, H. Debnath and D. Sarma, On the MMSC subgraphs of symmetric graphs. 7th Hawaii

Intl. Conf. on System Sciences, Honolulu, Hawaii (8-10 Jan. 1974). (North Hollywood, CA: Western Periodicals, 1974, pp. 121-123).