  • The Fiedler Vector and Graph Partitioning
    Barbara Ball, [email protected]
    Clare Rodgers, [email protected]
    College of Charleston Graduate Math Department
    Research Under Dr. Amy Langville

  • Outline
    General Field of Data Clustering
      Motivation
      Importance
    Previous Work
      Laplacian Method
      Fiedler Vector
      Limitations
      Handling the Limitations

  • Outline
    Our Contributions
      Experiments
        Sorting eigenvectors
        Testing non-symmetric matrices
      Hypotheses
      Implications
    Future Work
      Non-square matrices
      Proofs
    References

  • Understanding Graph Theory
    [Figure: a 10-node graph, nodes 1-10, drawn with no visible structure.]
    Given this graph, there are no apparent clusters.

  • Understanding Graph Theory
    [Figure: the same 10-node graph, rearranged so two groups emerge.]
    Although the clusters are now apparent, we need a better method.

  • Finding the Laplacian Matrix
    [Figure: the 10-node example graph.]
    A = adjacency matrix
    D = degree matrix
    Find the Laplacian matrix, L: L = D - A
    Rows sum to zero.
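
    Below is a minimal sketch of this construction (in Python/NumPy; the original work used MATLAB), on a small hypothetical edge list rather than the 10-node graph in the figure:

      import numpy as np

      # Hypothetical 5-node undirected graph (not the graph from the slide).
      edges = [(0, 1), (0, 2), (1, 2), (3, 4)]
      n = 5

      A = np.zeros((n, n))             # adjacency matrix
      for i, j in edges:
          A[i, j] = A[j, i] = 1        # undirected: symmetric entries

      D = np.diag(A.sum(axis=1))       # degree matrix
      L = D - A                        # Laplacian: L = D - A

      print(L.sum(axis=1))             # every row sums to zero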

  • Behind the Scenes of the Laplacian Matrix
    The Rayleigh Quotient Theorem seeks to minimize the off-diagonal elements of the matrix, i.e., to minimize the cutset of the edges between the clusters.
    [Figure: two sparsity plots, labeled "Not easily clustered" and "Clusters apparent".]

  • Behind the Scenes of the Laplacian Matrix
    Rayleigh Quotient Theorem solution: λ1 = 0, the smallest right-hand eigenvalue of the symmetric matrix L. λ1 corresponds to the trivial eigenvector v1 = e = [1, 1, ..., 1].

    Courant-Fischer Theorem: also based on the symmetric matrix L, it searches for the eigenvector v2 that is furthest from e.
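
    A quick numerical check of these claims (a sketch, not from the slides): the smallest eigenvalue of a Laplacian is 0, with a constant eigenvector.

      import numpy as np

      # Hypothetical connected 4-node graph.
      A = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)
      L = np.diag(A.sum(axis=1)) - A

      eigvals, eigvecs = np.linalg.eigh(L)   # eigh, since L is symmetric
      print(np.isclose(eigvals[0], 0))       # True: lambda_1 = 0
      print(eigvecs[:, 0])                   # a multiple of e = [1, 1, 1, 1]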

  • Using the Laplacian Matrix
    v2 gives relational information about the nodes.

    This relation is usually decided by separating the values across zero.

    A theoretical justification was given by Miroslav Fiedler; hence, v2 is called the Fiedler vector.

  • Using the Fiedler Vector
    v2 is used to recursively partition the graph by separating the components into negative and positive values.
    [Figure: the 10-node graph with the two sign groups colored.]
    Entire graph: sign(v2) = [-, -, -, +, +, +, -, -, -, +]
    Reds (nodes 1, 2, 3, 7, 8, 9): sign(v2) = [-, +, +, +, -, -]
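
    A minimal sketch of one level of this sign-based partitioning, on a hypothetical two-triangle graph rather than the slide's 10-node graph:

      import numpy as np

      # Two triangles joined by a single bridging edge.
      A = np.zeros((6, 6))
      for i, j in [(0, 1), (0, 2), (1, 2),    # first triangle
                   (3, 4), (3, 5), (4, 5),    # second triangle
                   (2, 3)]:                   # bridge
          A[i, j] = A[j, i] = 1
      L = np.diag(A.sum(axis=1)) - A

      v2 = np.linalg.eigh(L)[1][:, 1]         # Fiedler vector
      print(np.sign(v2))                      # separates {0,1,2} from {3,4,5}

    A full implementation would recurse on each sign group, rebuilding the Laplacian of the corresponding subgraph.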

  • Problems With the Laplacian Method
    The Laplacian method requires:
    an undirected graph
    a structurally symmetric matrix
    a square matrix

    Zero may not always be the best choice for partitioning the eigenvector values of v2 (Gleich).

    Recursive algorithms are expensive.

  • Current Clustering Method
    Monika Henzinger, Director of Google Research in 2003, cited generalizing clustering methods to directed graphs as one of the top six algorithmic challenges in web search engines.

  • How Are These Problems Currently Being Solved?
    Forcing symmetry for non-square matrices: suppose A is an (ad × term) non-square matrix. B imposes symmetry on the information:
    [Equation for B lost in transcription; see the sketch below.]

    Example:
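
    The slide's formula for B did not survive transcription. The sketch below shows the usual bipartite construction for ad-by-term data (e.g., in Zhukov's report), which we assume is what was intended:

      import numpy as np

      # Hypothetical 2x3 ad-by-term matrix.
      A = np.array([[1, 0, 2],
                    [0, 3, 1]], dtype=float)

      m, n = A.shape
      B = np.block([[np.zeros((m, m)), A],          # B = [0    A]
                    [A.T, np.zeros((n, n))]])       #     [A^T  0]
      print(B)                                      # (m+n) x (m+n), symmetric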

  • How Are These Problems Currently Being Solved?
    Forcing symmetry in square matrices: suppose C represents a directed graph. D imposes bidirectional information by finding the nearest symmetric matrix:
    D = C + C^T

    Example:
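
    A minimal sketch of D = C + C^T on a hypothetical directed 3-cycle:

      import numpy as np

      # Adjacency matrix of a directed 3-cycle: 0 -> 1 -> 2 -> 0.
      C = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [1, 0, 0]])

      D = C + C.T       # every directed edge becomes bidirectional
      print(D)          # symmetric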

  • How Are These Problems Currently Being Solved?
    Graphically adding data:
    [Figure: a 3-node directed graph before and after edges are added to force symmetry.]

  • How Are These Problems Currently Being Solved?
    Graphically deleting data:
    [Figure: a 3-node directed graph before and after one-way edges are deleted to force symmetry.]

  • Our Wish:
    Use Markov chains and the subdominant right-hand eigenvector (the Ball-Rodgers vector) to cluster asymmetric matrices or directed graphs.

  • Where Did We Get the Idea?
    Stewart, in Introduction to the Numerical Solution of Markov Chains, suggests that the subdominant right-hand eigenvector (the Ball-Rodgers vector) may indicate clustering.

  • The Markov Method

  • Different Matrices and Eigenvectors
    Different matrices:
    A: connectivity matrix
    L = D - A: Laplacian matrix (rows sum to 0)
    P: probability (Markov) matrix (rows sum to 1)
    Q = I - P: transition rate matrix (rows sum to 0)

    Respective eigenvectors:
    2nd largest of A
    2nd smallest of L (Fiedler vector)
    2nd largest of P (Ball-Rodgers vector)
    2nd smallest of Q
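
    A sketch relating the four matrices, again on a hypothetical small graph. Note that P is generally not symmetric, so a general eigensolver is needed:

      import numpy as np

      A = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)

      d = A.sum(axis=1)
      L = np.diag(d) - A              # rows sum to 0
      P = A / d[:, None]              # row-stochastic: rows sum to 1
      Q = np.eye(len(d)) - P          # rows sum to 0

      # Ball-Rodgers vector: right eigenvector of the 2nd-largest eigenvalue of P.
      vals, vecs = np.linalg.eig(P)
      order = np.argsort(vals.real)[::-1]
      print(vecs[:, order[1]].real)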

  • Graph 1: Eigenvector Value Plots
    [Figure: the 10-node graph and four plots of sorted eigenvector values; panels: Second Largest of A, Fiedler Vector, Ball-Rodgers Vector, Second Smallest of Q.]

  • Graph 1: Banding Using Eigenvector Values
    Reorders the matrix just by using the indices of the sorted eigenvector. No recursion.
    [Figure: the 10-node graph and spy plots: Banded A, Banded L, Banded P, Banded Q.]
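
    A sketch of this banding step: sort the eigenvector once and permute the matrix by the resulting indices, with no recursion. The scrambled two-cluster graph is hypothetical:

      import numpy as np

      # Two triangles, {0,3,5} and {1,2,4}, joined by the edge (4,5).
      A = np.zeros((6, 6))
      for i, j in [(0, 3), (3, 5), (0, 5), (1, 2), (2, 4), (1, 4), (4, 5)]:
          A[i, j] = A[j, i] = 1

      L = np.diag(A.sum(axis=1)) - A
      v2 = np.linalg.eigh(L)[1][:, 1]       # Fiedler vector

      perm = np.argsort(v2)                 # indices of the sorted eigenvector
      print(A[np.ix_(perm, perm)])          # banded A: two blocks appear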

  • Graph 1: Reordering Using the Laplacian Method
    [Figure: the 10-node graph; spy plots of A and Reordered L.]

  • Graph 1: Reordering Using the Markov Method
    [Figure: the 10-node graph; spy plots of A, Reordered P, and Reordered Q.]

  • Graph 2: Eigenvector Value Plots
    [Figure: a 23-node graph and four plots of sorted eigenvector values; panels: Second Largest of A, Fiedler Vector, Ball-Rodgers Vector, Second Smallest of Q.]

  • Graph 2: Banding Using Eigenvector Values
    Nicely banded, but no apparent blocks.
    [Figure: the 23-node graph and spy plots: Banded A, Banded L, Banded P, Banded Q.]

  • Graph 2: Reordering Using the Laplacian Method
    [Figure: the 23-node graph; spy plots of A and Reordered L.]

  • Graph 2: Reordering Using the Markov Method
    [Figure: the 23-node graph; spy plots of A, Reordered P, and Reordered Q.]

  • Directed Graph 1
    [Figure: a 10-node directed graph.]
    Although the graph is directed, the Fiedler vector still works.

  • Directed Graph 1
    [Figure: the 10-node directed graph.]
    v2 = [-0.5783, -0.2312, -0.0388, 0.1140, 0.1255, 0.1099, -0.1513, -0.5783, -0.4536, 0.0821]

  • Directed Graph 1: Reordering Using the Laplacian Method
    [Figure: the directed graph; spy plots of A and Reordered L.]

  • Directed Graph 1: Reordering Using the Markov Method
    [Figure: the directed graph; spy plots of A and Reordered P.]

  • Directed Graph 1-B
    [Figure: the 10-node directed graph with one edge annotated "was bi-directional".]

  • Directed Graph 1-B
    The Laplacian method no longer works on this graph. Certain edges must be bi-directional in order to make the matrix irreducible. Currently, to deal with this problem, a small number (0.01 in the figure) is added to each element of the matrix.
    [Figure: the 10-node directed graph annotated with the perturbation 0.01.]
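
    A sketch of this fix (compare PageRank's teleportation term): add a small epsilon to every entry so the chain becomes irreducible, then row-normalize. The matrix is a hypothetical reducible digraph:

      import numpy as np

      A = np.array([[0, 1, 0],
                    [0, 0, 1],
                    [0, 1, 0]], dtype=float)   # node 0 is unreachable

      eps = 0.01
      A_fixed = A + eps                        # every entry now positive
      P = A_fixed / A_fixed.sum(axis=1, keepdims=True)
      print(P)                                 # row-stochastic and irreducible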

  • Directed Graph 1-B: Reordering Using the Markov Method
    [Figure: the directed graph; spy plots of A and Reordered P.]

  • Directed Graph 2: Reordering Using the Markov Method
    [Figure: spy plots of the Answer, A, and Reordered P.]

  • Directed Graph 3: Reordering Using the Markov Method
    [Figure: spy plots of the Answer, A, and Reordered P.]

  • Directed Graph 4: Reordering Using the Markov Method
    10% anti-block elements.
    [Figure: spy plots of the Answer, A, and Reordered P.]

  • Directed Graph 5: Reordering Using the Markov Method
    30% anti-block elements. Only the first partition is shown.
    [Figure: spy plots of the Answer, A, and Reordered P.]

  • Hypothesis Implications
    Plotting the eigenvector values gives better estimates of the number of clusters.

    Sometimes, sorting the eigenvector values clusters the matrix without any type of recursive process.

    The stochastic matrix P can be used to cluster asymmetric matrices or directed graphs. A number other than zero may be used to partition the eigenvector values.

    Recursive methods are time-consuming; the eigenvector plot takes virtually no time at all and requires very little programming or storage! Non-symmetric matrices (or directed graphs) can be clustered without altering data!

  • Future Work
    Experiments on large non-symmetric matrices
    Non-square matrices
    Clustering eigenvector values to avoid recursive programming
    Proofs

  • Questions

  • References
    Friedberg, S., Insel, A., and Spence, L. Linear Algebra, Fourth Edition. Prentice Hall, Upper Saddle River, NJ, 2003.
    Gleich, David. Spectral Graph Partitioning and the Laplacian with Matlab. January 16, 2006. http://www.stanford.edu/~dgleich/demos/matlab/spectral/spectral.html
    Godsil, Chris and Royle, Gordon. Algebraic Graph Theory. Springer-Verlag, New York, 2001.
    Karypis, George. http://glaros.dtc.umn.edu/gkhome/node
    Langville, Amy. The Linear Algebra Behind Search Engines. The Mathematical Association of America Online, http://www.joma.org, December 2005.
    Aldenderfer, Mark S. and Blashfield, Roger K. Cluster Analysis. Sage University Paper Series: Quantitative Applications in the Social Sciences, 1984.
    Moler, Cleve B. Numerical Computing with MATLAB. Society for Industrial and Applied Mathematics, Philadelphia, 2004.
    Roiger, Richard J. and Geatz, Michael W. Data Mining: A Tutorial-Based Primer. Addison-Wesley, 2003.
    Vascellaro, Jessica E. "The Next Big Thing in Searching." Wall Street Journal, January 24, 2006.

  • References
    Zhukov, Leonid. Technical Report: Spectral Clustering of Large Advertiser Datasets, Part I. April 10, 2003.
    Learning MATLAB 7. The MathWorks, 2005. www.mathworks.com
    www.mathworld.com
    en.wikipedia.org
    http://www.resample.com/xlminer/help/HClst/HClst_intro.htm
    http://comp9.psych.cornell.edu/Darlington/factor.htm
    www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Markov.html
    http://leto.cs.uiuc.edu/~spiros/publications/ACMSRC.pdf
    http://www.lifl.fr/~iri-bn/talks/SIG/higham.pdf
    http://www.epcc.ed.ac.uk/computing/training/document_archive/meshdecomp-slides/MeshDecomp-70.html
    http://www.cs.berkeley.edu/~demmel/cs267/lecture20.html
    http://www.maths.strath.ac.uk/~aas96106/rep02_2004.pdf

  • Eigenvector Example
    [Worked example omitted in transcription.]

  • Structurally Symmetric
    A matrix is structurally symmetric when its pattern of nonzero entries is symmetric, even if the values themselves are not.
    [Example omitted in transcription.]
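
    A tiny illustration (hypothetical values): the zero/nonzero pattern is symmetric even though the matrix itself is not.

      import numpy as np

      M = np.array([[0, 2, 0],
                    [7, 0, 1],
                    [0, 4, 0]])

      pattern = (M != 0)
      print(np.array_equal(pattern, pattern.T))   # True: structurally symmetric
      print(np.array_equal(M, M.T))               # False: not numerically symmetric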

  • Theory Behind the Laplacian
    Minimize the edges between the clusters.

  • Theory Behind the Laplacian
    Minimizing edges between clusters is the same as minimizing off-diagonal elements in the Laplacian matrix.

    min p^T L p, where p_i ∈ {-1, 1} for each node i.

    p represents the separation of the nodes into positives and negatives.

    p^T L p = p^T (D - A) p = p^T D p - p^T A p

    However, since each p_i^2 = 1, p^T D p is just the sum of the degrees across the diagonal, so it is a constant.

    Constants do not change the outcome of optimization problems.
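
    A quick check of that step (a sketch): for any sign vector p, each p_i^2 = 1, so p^T D p equals the fixed sum of degrees.

      import numpy as np

      A = np.array([[0, 1, 1],
                    [1, 0, 1],
                    [1, 1, 0]], dtype=float)    # hypothetical triangle graph
      D = np.diag(A.sum(axis=1))

      for p in ([1, 1, -1], [1, -1, -1], [-1, 1, -1]):
          p = np.array(p, dtype=float)
          print(p @ D @ p)                      # always 6.0, the degree sum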

  • Theory Behind the Laplacian
    min p^T L p

    This is an integer nonlinear program.

    It can be changed to a continuous program by using Lagrange relaxation, allowing p to take any value from -1 to 1. We rename this vector x and let its magnitude be N, so x^T x = N.

    min x^T L x - λ(x^T x - N)

    This can be rewritten as the Rayleigh quotient: min x^T L x / x^T x = λ1

  • Theory Behind the Laplacian
    λ1 = 0 and corresponds to the trivial eigenvector v1 = e.

    The Courant-Fischer Theorem seeks the next best solution by adding the extra constraint x ⊥ e.

    This is found to be the subdominant eigenvector v2, known as the Fiedler vector.
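
    A numerical sanity check of this claim (a sketch): among unit eigenvectors orthogonal to e, v2 attains the smallest Rayleigh quotient, λ2.

      import numpy as np

      A = np.array([[0, 1, 1, 0],
                    [1, 0, 1, 0],
                    [1, 1, 0, 1],
                    [0, 0, 1, 0]], dtype=float)  # hypothetical connected graph
      L = np.diag(A.sum(axis=1)) - A

      vals, vecs = np.linalg.eigh(L)
      v2 = vecs[:, 1]                            # Fiedler vector (unit norm)
      print(np.isclose(v2 @ np.ones(4), 0))      # v2 is orthogonal to e
      print(np.isclose(v2 @ L @ v2, vals[1]))    # Rayleigh quotient equals lambda_2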

  • Theory Behind the Laplacian
    Our questions:

    The symmetry requirement is needed for the matrix diagonalization of D. Why is D important, since it is irrelevant to the minimization problem?

    If diagonalization is important, could the SVD be used instead?
