
Lauritzen - Undergraduate Convexity - From Fourier and Motzkin to Kuhn and Tucker


An undergraduate treatment of convexity and optimization. Can be read by anyone familiar with linear algebra.


UNDERGRADUATE CONVEXITY: From Fourier and Motzkin to Kuhn and Tucker



Niels Lauritzen
Aarhus University, Denmark

UNDERGRADUATE CONVEXITY: From Fourier and Motzkin to Kuhn and Tucker

New Jersey · London · Singapore · Beijing · Shanghai · Hong Kong · Taipei · Chennai

    World Scientific


Published by

    World Scientific Publishing Co. Pte. Ltd.

    5 Toh Tuck Link, Singapore 596224

    USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

    UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library.

Cover image: Johan Ludvig William Valdemar Jensen (1859–1925).

    Mathematician and telephone engineer.

    Photograph by Vilhelm Rieger (courtesy of the Royal Library, Copenhagen).

    UNDERGRADUATE CONVEXITY

    From Fourier and Motzkin to Kuhn and Tucker

Copyright © 2013 by World Scientific Publishing Co. Pte. Ltd.

    All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,

    electronic or mechanical, including photocopying, recording or any information storage and retrieval

    system now known or to be invented, without written permission from the Publisher.

    For photocopying of material in this volume, please pay a copying fee through the Copyright

    Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to

    photocopy is not required from the publisher.

    ISBN 978-981-4412-51-3

    ISBN 978-981-4452-76-2 (pbk)

    Printed in Singapore.



    Preface

Convexity is a key concept in modern mathematics with rich applications in economics and optimization.

This book is a basic introduction to convexity based on several years of teaching the one-quarter courses Konvekse Mængder (convex sets) and Konvekse Funktioner (convex functions) to undergraduate students in mathematics, economics and computer science at Aarhus University. The prerequisites are minimal, consisting only of first year courses in calculus and linear algebra.

I have attempted to strike a balance between different approaches to convexity in applied and pure mathematics. Compared to the former, the mathematics takes a front seat. Compared to some of the latter, a key point is that the ability to carry out computations is considered paramount and a crucial stepping stone to the understanding of abstract concepts, e.g., the definition of a face of a convex set does not make much sense before it is viewed in the context of several simple examples and computations.

Chapters 1–6 treat convex subsets from the basics of linear inequalities to Minkowski's theorem on separation of disjoint convex subsets by hyperplanes. The basic idea has been to emphasize part of the rich finite theory of polyhedra before entering into the infinite theory of closed convex subsets.

Fourier-Motzkin elimination is to linear inequalities what Gaussian elimination is to linear equations. It seems appropriate to begin a course on convexity by introducing this simple, yet powerful method. The prerequisites are barely present. Still the first chapter contains substantial results such as a simple algorithm for linear optimization and the fundamental theorem that projections of polyhedra are themselves polyhedra.



Before introducing closed convex subsets, several basic definitions and highlights from the polyhedral world are given: a concise treatment of affine subspaces, faces of convex subsets, Bland's rule from the simplex algorithm as a tool for computing with the convex hull, faces of polyhedra, Farkas's lemma, steady states for Markov chains, duality in linear programming, doubly stochastic matrices and the Birkhoff polytope.

The chapter Computations with polyhedra contains a treatment of two important polyhedral algorithms: the double description method and the simplex algorithm. The double description method is related to Fourier-Motzkin elimination. It is very easily explained in an undergraduate context, especially as a vehicle for computing the bounding half spaces of a convex hull.

The simplex algorithm solves linear optimization problems and is somewhat mysterious from a mathematical perspective. There is no obvious reason it should work well. In fact, the famous mathematician John von Neumann never really believed it would perform in practice. The inventor George B. Dantzig also searched for alternate methods for years before confronting experimental data from some of the world's first computers: the simplex algorithm performed amazingly well in practice. Only recently has a mathematical explanation for this phenomenon been given by Spielman and Teng. Our treatment of the simplex algorithm and the simplex tableau deviates from the standard form and works with the polyhedron in its defining space.

The transition to the continuous theory of non-polyhedral convex subsets comes after the first five chapters. Here it is proved that closed convex subsets serve as generalizations of polyhedra, since they coincide with arbitrary intersections of affine half spaces. The existence of a supporting hyperplane at a boundary point of a convex subset is proved and Minkowski's theorems on compact convex subsets and separation of disjoint convex subsets are given.

Chapters 7–10 treat convex functions from the basic theory of convex functions of one variable with Jensen's inequality to the Karush-Kuhn-Tucker conditions, dual optimization problems and an outline of an interior point algorithm for solving convex optimization problems in several variables. The setting is almost always the simplest. Great generality is fine when you have lived with a subject for years, but in an introductory course it tends to become a burden. You accomplish less by including more.

The main emphasis is on differentiable convex functions. Since undergraduate knowledge of differentiability may vary, we give an almost complete review of the theory of differentiability in one and several variables. The only general result on convex functions not assuming differentiability is the existence of the subgradient at a point.

An understanding of convex functions of several variables is impossible without knowledge of the finer points of linear algebra over the real numbers. Introducing convex functions of several variables, we also give a thorough review of positive semidefinite matrices and reduction of symmetric matrices. This important part of linear algebra is rarely fully understood at an undergraduate level.

The final chapter treats Convex optimization. The key elements are the Karush-Kuhn-Tucker conditions, how saddle points of the Lagrangian lead to a dual optimization problem and finally an outline of an interior point algorithm using bisection and the modified Newton method. Monographs have been written on these three topics. We only give a brief but self-contained introduction with simple examples.

    Suggestions for teaching a one-semester course

The amount of material included in this book exceeds a realistic plan for a one-semester undergraduate course on convexity. I consider Fourier-Motzkin elimination (Chapter 1), affine subspaces (Chapter 2), basics of convex subsets (Chapter 3), the foundational material on polyhedra in Chapter 4, a taste of one of the two algorithms in Chapter 5 and closed convex subsets (Chapter 6) as a minimum, along with almost all of the material in Chapters 7–10.

The progression of learning depends on the proficiency in linear algebra and calculus. The necessary basic concepts from analysis are introduced in Appendix A. In Appendix B there is a review of linear algebra from the point of view of linear equations leading to the rank of a matrix.

In my view, a too rigid focus on the abstract mathematical details before telling about examples and computations is a major setback in the teaching of mathematics at all levels. Certainly the material in this book benefits from being presented in a computational context with lots of examples.

    Aarhus, December 2012


    Acknowledgments

I am extremely grateful to Tage Bai Andersen and Jesper Funch Thomsen for very useful and detailed comments on a second draft for this book. Comments from Kent Andersen, Jens Carsten Jantzen, Anders Nedergaard Jensen and Markus Kiderlen also led to several improvements.

I am an algebraist by training and encountered convexity because of an interest in computational algebra (and computers!). As such, I have benefited immensely over the years from insightful explanations from the following more knowledgeable people: Tage Bai Andersen, Kent Andersen, Kristoffer Arnsfelt Hansen, Peter Bro Miltersen, Marcel Bökstedt, Komei Fukuda, Anders Nedergaard Jensen, Herbert Scarf, Jacob Schach Møller, Andrew du Plessis, Henrik Stetkær, Bernd Sturmfels, Rekha Thomas, Jørgen Tornehave, Jørgen Vesterstrøm and Bent Ørsted.

I am grateful to Jens Carsten Jantzen, Jesper Lützen and Tage Gutmann Madsen for help in tracking down the venerable Jensen inequality postage stamp used for several years by the Department of Mathematical Sciences at the University of Copenhagen. Also, thanks to Tinne Hoff Kjeldsen for sharing her expertise on the fascinating history of convexity and optimization.

A very special thanks to the teaching assistants on Konvekse Mængder and Konvekse Funktioner: Lisbeth Laursen, Jonas Andersen Seebach, Morten Leander Petersen, Rolf Wognsen, Linnea Jørgensen and Dan Zhang. They pointed out several inaccuracies in my lecture notes along the way.

I am grateful to Kwong Lai Fun and Lakshmi Narayanan of World Scientific for their skilled help in the production of this book.

Lars 'daleif' Madsen has been crucial in the technical typesetting with his vast knowledge of LaTeX and his usual careful attention to detail.

Finally, Helle and William deserve an abundance of gratitude for their patience and genuine love.



    Contents

Preface
Acknowledgments

1. Fourier-Motzkin elimination
   1.1 Linear inequalities
   1.2 Linear optimization using elimination
   1.3 Polyhedra
   1.4 Exercises

2. Affine subspaces
   2.1 Definition and basics
   2.2 The affine hull
   2.3 Affine subspaces and subspaces
   2.4 Affine independence and the dimension of a subset
   2.5 Exercises

3. Convex subsets
   3.1 Basics
   3.2 The convex hull
   3.3 Faces of convex subsets
   3.4 Convex cones
   3.5 Carathéodory's theorem
   3.6 The convex hull, simplicial subsets and Bland's rule
   3.7 Exercises

4. Polyhedra
   4.1 Faces of polyhedra
   4.2 Extreme points and linear optimization
   4.3 Weyl's theorem
   4.4 Farkas's lemma
   4.5 Three applications of Farkas's lemma
       4.5.1 Markov chains and steady states
       4.5.2 Gordan's theorem
       4.5.3 Duality in linear programming
   4.6 Minkowski's theorem
   4.7 Parametrization of polyhedra
   4.8 Doubly stochastic matrices: The Birkhoff polytope
       4.8.1 Perfect pairings and doubly stochastic matrices
   4.9 Exercises

5. Computations with polyhedra
   5.1 Extreme rays and minimal generators in convex cones
   5.2 Minimal generators of a polyhedral cone
   5.3 The double description method
       5.3.1 Converting from half space to vertex representation
       5.3.2 Converting from vertex to half space representation
       5.3.3 Computing the convex hull
   5.4 Linear programming and the simplex algorithm
       5.4.1 Two examples of linear programs
       5.4.2 The simplex algorithm in a special case
       5.4.3 The simplex algorithm for polyhedra in general form
       5.4.4 The simplicial hack
       5.4.5 The computational miracle of the simplex tableau
       5.4.6 Computing a vertex in a polyhedron
   5.5 Exercises

6. Closed convex subsets and separating hyperplanes
   6.1 Closed convex subsets
   6.2 Supporting hyperplanes
   6.3 Separation by hyperplanes
   6.4 Exercises

7. Convex functions
   7.1 Basics
   7.2 Jensen's inequality
   7.3 Minima of convex functions
   7.4 Convex functions of one variable
   7.5 Differentiable functions of one variable
       7.5.1 The Newton-Raphson method for finding roots
       7.5.2 Critical points and extrema
   7.6 Taylor polynomials
   7.7 Differentiable convex functions
   7.8 Exercises

8. Differentiable functions of several variables
   8.1 Differentiability
       8.1.1 The Newton-Raphson method for several variables
       8.1.2 Local extrema for functions of several variables
   8.2 The chain rule
   8.3 Lagrange multipliers
   8.4 The arithmetic-geometric inequality revisited
   8.5 Exercises

9. Convex functions of several variables
   9.1 Subgradients
   9.2 Convexity and the Hessian
   9.3 Positive definite and positive semidefinite matrices
   9.4 Principal minors and definite matrices
   9.5 The positive semidefinite cone
   9.6 Reduction of symmetric matrices
   9.7 The spectral theorem
   9.8 Quadratic forms
   9.9 Exercises

10. Convex optimization
   10.1 A geometric optimality criterion
   10.2 The Karush-Kuhn-Tucker conditions
   10.3 An example
   10.4 The Lagrangian, saddle points, duality and game theory
   10.5 An interior point method
       10.5.1 Newtonian descent, exact line search and bisection
       10.5.2 Polyhedral constraints
   10.6 Maximizing convex functions over polytopes
       10.6.1 Convex functions are continuous on open subsets
   10.7 Exercises

Appendix A Analysis
   A.1 Measuring distances
   A.2 Sequences
       A.2.1 Supremum and infimum
   A.3 Bounded sequences
   A.4 Closed subsets and open subsets
   A.5 The interior and boundary of a set
   A.6 Continuous functions
   A.7 The main theorem
   A.8 Exercises

Appendix B Linear (in)dependence and the rank of a matrix
   B.1 Linear dependence and linear equations
   B.2 The rank of a matrix
   B.3 Exercises

Bibliography
Index


    Chapter 1

    Fourier-Motzkin elimination

    You probably agree that it is easy to solve the equation

    2x = 4. (1.1)

This is an example of a linear equation in one variable having the unique solution x = 2. Perhaps you will be surprised to learn that there is essentially no difference between solving a simple equation like (1.1) and the more complicated system

    2x+ y + z = 7

    x+ 2y + z = 8

    x+ y + 2z = 9

    (1.2)

of linear equations in x, y and z. Using the first equation 2x + y + z = 7 we solve for x and get

x = (7 − y − z)/2.     (1.3)

This may be substituted into the remaining two equations in (1.2) and

    we get the simpler system

    3y + z = 9

    y + 3z = 11

of linear equations in y and z. Again using the first equation in this system we get

y = (9 − z)/3     (1.4)

ending up with the simple equation 8z = 24. This is an equation of the type in (1.1) giving z = 3. Now z = 3 gives y = 2 using (1.4). Finally y = 2 and z = 3 gives x = 1 using (1.3).
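For readers who like to experiment, the computation above can be checked with a few lines of Python/NumPy (an illustrative sketch, not part of the text):

    import numpy as np

    # Coefficient matrix and right-hand side of the system (1.2).
    A = np.array([[2.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 1.0, 2.0]])
    b = np.array([7.0, 8.0, 9.0])

    # Gaussian elimination (as performed internally by numpy.linalg.solve)
    # recovers the unique solution x = 1, y = 2, z = 3.
    print(np.linalg.solve(A, b))   # [1. 2. 3.]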


Figure 1.1: Isaac Newton (1642–1727). English mathematician.

Solving a seemingly complicated system of linear equations like (1.2) is really no more difficult than solving the simple equation (1.1). One of the world's greatest scientists, Isaac Newton, found it worthwhile to record this method in 1720 with the words

And you are to know, that by each Æquation one unknown Quantity may be taken away, and consequently, when there are as many Æquations and unknown Quantities, all at length may be reduc'd into one, in which there shall be only one Quantity unknown.

Figure 1.2: Carl Friedrich Gauss (1777–1855). German mathematician.

During the computation of the orbit of the asteroid Pallas around 1810, Gauss encountered the need for solving linear equations related to his famous least squares method. If you spend a little time deciphering the Latin in Gauss's original writings (see Figure 1.3), you will see how elimination appears naturally towards the end of the page. In spite of Newton's explicit description several years before Gauss was born, this procedure is now known as Gaussian elimination (see [Grcar (2011)] for more on the fascinating history of Gaussian elimination).


Figure 1.3: Gauss's encounter with (Gaussian) elimination (published in 1810) for use in the least squares method in computing the orbit of the asteroid Pallas. Notice that the quantity he introduces towards the end of the page is the sum of squares to be minimized.

    1.1 Linear inequalities

Inequalities may be viewed as a generalization of equations, since for two numbers a and b, a = b if and only if a ≤ b and a ≥ b. We will describe


Figure 1.4: Joseph Fourier (1768–1830). French mathematician.

a clever algorithm going back to Fourier for systematically solving systems of linear inequalities. Fourier had a rather concrete problem in mind when he presented his note [Fourier (1826)] (see Figure 1.6).

The algorithm itself is very similar to Gaussian elimination except that we are facing two types of inequalities (≤) and (≥) instead of just one equality (=). First consider the simplest case of linear inequalities in just one variable by way of the example

2x + 1 ≤ 7
3x − 2 ≤ 4
−x + 2 ≤ 3
−x ≤ 0

(1.5)

of inequalities in x. This can be rewritten to

x ≤ 3
x ≤ 2
−1 ≤ x
0 ≤ x

and therefore

x ≤ min{2, 3} = 2
max{−1, 0} = 0 ≤ x

or simply 0 ≤ x ≤ 2. Here the fundamental difference between linear equations and linear inequalities is apparent. Multiplying by −1 leaves = invariant, whereas ≤ changes into ≥.


    Consider now the system

x ≥ 0
x + 2y ≤ 6
x + y ≥ 2
x − y ≤ 3
y ≥ 0

    (1.6)

of inequalities in two variables x and y.

Perhaps the most straightforward way of approaching (1.6) is through

    a sketch. The bounding lines are

    x = 0

    x + 2y = 6

    x + y = 2

x − y = 3
y = 0.

    (1.7)

For each line we pick a point to decide which half plane to shade, e.g., we need to shade below the line x + 2y = 6, since the corresponding inequality is x + 2y ≤ 6 and (for example) 0 + 2 · 0 < 6. The intersection of these half planes is sketched as the shaded area in Figure 1.5.

    Figure 1.5: Sketch of the solutions in (1.6).

We are aiming for a more effective way of representing the solutions. Our sketching techniques are not of much use solving, for example, 17 linear inequalities in 12 unknowns. In order to attack (1.6) algebraically, we first record the following result strongly related to Fourier's problem in Figure 1.6.


Proposition 1.1. Let α1, . . . , αr, β1, . . . , βs ∈ R. Then

max{α1, . . . , αr} ≤ min{β1, . . . , βs}

if and only if αi ≤ βj for every i, j with 1 ≤ i ≤ r and 1 ≤ j ≤ s:

α1 ≤ β1, . . . , α1 ≤ βs
        ⋮
αr ≤ β1, . . . , αr ≤ βs.

Proof. If max{α1, . . . , αr} ≤ min{β1, . . . , βs}, then

αi ≤ max{α1, . . . , αr} ≤ min{β1, . . . , βs} ≤ βj

for every 1 ≤ i ≤ r and 1 ≤ j ≤ s. On the other hand let 1 ≤ i0 ≤ r and 1 ≤ j0 ≤ s be such that

αi0 = max{α1, . . . , αr}
βj0 = min{β1, . . . , βs}.

If αi ≤ βj for every 1 ≤ i ≤ r and 1 ≤ j ≤ s, then αi0 ≤ βj0. Therefore

max{α1, . . . , αr} ≤ min{β1, . . . , βs}.

Inspired by Gaussian elimination we will attempt to isolate and eliminate x. The key point here is that there exists x solving the two inequalities

a ≤ x and x ≤ b if and only if a ≤ b,

where a and b are real numbers. With this in mind, we rewrite (1.6) to

0 ≤ x
x ≤ 6 − 2y
2 − y ≤ x
x ≤ 3 + y
y ≥ 0.

Just like in one variable, this system can be reduced to

x ≤ min{6 − 2y, 3 + y}
max{0, 2 − y} ≤ x
y ≥ 0.     (1.8)


Figure 1.6: The first page of Fourier's note [Fourier (1826)]. Notice the specific problem he is describing. In modern parlance it amounts to finding all x, y, z such that x + y + z = 1 and max{x, y, z} ≤ (1 + r) min{x, y, z} for a fixed r ≥ 0.

Therefore we can eliminate x from (1.8) and deduce that

max{0, 2 − y} ≤ min{6 − 2y, 3 + y}
y ≥ 0     (1.9)

    is solvable in y if and only if (1.8) is solvable in x and y. Now Proposition 1.1


    shows that (1.9) is equivalent to the inequalities

0 ≤ 6 − 2y
0 ≤ 3 + y
2 − y ≤ 6 − 2y
2 − y ≤ 3 + y
0 ≤ y

in the variable y. These inequalities can be solved just like we solved (1.5) and may be reduced to the two inequalities

0 ≤ y ≤ 3.

We have proved that two numbers x and y solve the system (1.6) if and only if

0 ≤ y ≤ 3
max{0, 2 − y} ≤ x ≤ min{6 − 2y, 3 + y}.

If you phrase things a bit more geometrically, the projection of the solutions to (1.6) on the y-axis is the interval [0, 3]. In other words, if x and y solve (1.6), then y ∈ [0, 3] and for a fixed y ∈ [0, 3], x and y solve (1.6) provided that max{0, 2 − y} ≤ x ≤ min{6 − 2y, 3 + y}.

    1.2 Linear optimization using elimination

The elimination method outlined here can be used in solving the problem of maximizing a linear function subject to constraints consisting of linear inequalities (see also Exercises 1.5 and 1.9). Such linear optimization problems are excellent models for many practical problems and are usually solved with the more advanced simplex algorithm, which we will explain later in §5.4. The following example illustrates how elimination is used by adjoining an extra variable for the linear function.

    Example 1.2. Find the maximal value of x+ y subject to the constraints

x + 2y ≤ 3
2x + y ≤ 3
x ≥ 0
y ≥ 0.


Here the trick is to introduce an extra variable z and then find the maximal z, such that the inequalities (and the one equation)

x + y = z
x + 2y ≤ 3
2x + y ≤ 3
x ≥ 0
y ≥ 0

have a solution. First we eliminate x by substituting x = z − y into the inequalities:

z + y ≤ 3
2z − y ≤ 3
z − y ≥ 0
y ≥ 0.

Preparing for elimination of y we write

y ≤ 3 − z
2z − 3 ≤ y
y ≤ z
0 ≤ y.

Therefore

max{0, 2z − 3} ≤ y ≤ min{3 − z, z}     (1.10)

and Proposition 1.1 applies to give the inequalities

0 ≤ 3 − z
0 ≤ z
2z − 3 ≤ 3 − z
2z − 3 ≤ z

with solution 0 ≤ z ≤ 2. Therefore the maximal value of z = x + y is 2. You can obtain a solution (x, y) to the linear optimization problem by first inserting z = 2 into (1.10) to get y and then inserting z and y into x = z − y to get x. This gives the unique optimum (x, y) = (1, 1).

    Now we are ready to enter into the general setting.

    1.3 Polyhedra

    A linear inequality in n variables x1, . . . , xn is an inequality of the form

a1x1 + · · · + anxn ≤ b,


where a1, . . . , an, b ∈ R. By Rn we denote the set of column vectors with n entries (n × 1 matrices). For typographical reasons we will sometimes let (x1, . . . , xn) refer to the column vector in Rn with entries x1, . . . , xn.

    Definition 1.3. The subset

P = { (x1, . . . , xn) ∈ Rn | a11x1 + · · · + a1nxn ≤ b1, . . . , am1x1 + · · · + amnxn ≤ bm } ⊆ Rn

of solutions to a system

a11x1 + · · · + a1nxn ≤ b1
        ⋮
am1x1 + · · · + amnxn ≤ bm

of finitely many linear inequalities (here aij and bi are real numbers) is called a polyhedron.

Notation 1.4. For vectors u = (u1, . . . , un), v = (v1, . . . , vn) ∈ Rn we introduce the notation

u ≤ v ⟺ u1 ≤ v1 and . . . and un ≤ vn.

With this convention the polyhedron in Definition 1.3 is expressed more economically as

P = {x ∈ Rn | Ax ≤ b},

where A is the m × n matrix and b the vector in Rm given by

A =
a11 · · · a1n
 ⋮   ⋱   ⋮
am1 · · · amn

and b = (b1, . . . , bm).

Example 1.5. For u = (1, 2) and v = (2, 3), u ≤ v, whereas neither u ≤ v nor v ≤ u hold for u = (1, 0) and v = (0, 1). The polyhedron in (1.6) can be written as

P = { (x, y) ∈ R2 | A(x, y) ≤ b },  where

A =
−1   0
 1   2
−1  −1
 1  −1
 0  −1

and b = (0, 6, −2, 3, 0).


In modern mathematical terms our computations can be used in proving the main result (Theorem 1.6) in this chapter that the projection of a polyhedron is a polyhedron (see Figure 1.7). This seemingly innocuous result has rather profound consequences. The proof may appear a bit technical at first, but it is simply a formalization of the concrete computations in §1.1.

The elimination method for linear inequalities in §1.1 is called Fourier-Motzkin elimination. Not knowing the classical paper by Fourier, Motzkin¹ rediscovered it in his thesis Beiträge zur Theorie der linearen Ungleichungen supervised by Ostrowski² in Basel, 1933.

    Figure 1.7: Projection to R2 of a polyhedron in R3.

Theorem 1.6. Consider the projection π : Rn → Rn−1 given by

π(x1, . . . , xn) = (x2, . . . , xn).

If P ⊆ Rn is a polyhedron, then

π(P) = {(x2, . . . , xn) | ∃ x1 ∈ R : (x1, x2, . . . , xn) ∈ P} ⊆ Rn−1

is a polyhedron.

¹Theodore Samuel Motzkin (1908–1970). Israeli-American mathematician.
²Alexander Markowich Ostrowski (1893–1986). Russian-Swiss mathematician.

Proof. Suppose that P is the set of solutions to

a11x1 + · · · + a1nxn ≤ b1
        ⋮
am1x1 + · · · + amnxn ≤ bm.

We partition these m inequalities according to the sign of ai1:

G = {i | ai1 > 0}
Z = {i | ai1 = 0}
L = {i | ai1 < 0}.

Inequality number i reduces to

x1 ≤ a′i2x2 + · · · + a′inxn + b′i,

if i ∈ G, and to

a′j2x2 + · · · + a′jnxn + b′j ≤ x1,

if j ∈ L, where a′ik = −aik/ai1 and b′i = bi/ai1 for k = 2, . . . , n. So the inequalities in L and G are equivalent to

max{a′i2x2 + · · · + a′inxn + b′i | i ∈ L} ≤ x1 ≤ min{a′j2x2 + · · · + a′jnxn + b′j | j ∈ G}

by Proposition 1.1. By definition, (x2, . . . , xn) ∈ π(P) if and only if (x2, . . . , xn) satisfies the inequalities in Z and

max{a′i2x2 + · · · + a′inxn + b′i | i ∈ L} ≤ min{a′j2x2 + · · · + a′jnxn + b′j | j ∈ G}.

Proposition 1.1 shows that this inequality is equivalent to the |L||G| inequalities in x2, . . . , xn consisting of

a′i2x2 + · · · + a′inxn + b′i ≤ a′j2x2 + · · · + a′jnxn + b′j

or rather

(a′i2 − a′j2)x2 + · · · + (a′in − a′jn)xn ≤ b′j − b′i

where i ∈ L and j ∈ G. Adding the inequalities in Z, where x1 is not present, it follows that π(P) is the set of solutions to these |L||G| + |Z| linear inequalities in x2, . . . , xn. Therefore π(P) is a polyhedron.
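The proof is constructive and translates directly into an algorithm. The following Python/NumPy sketch of a single elimination step mirrors the partition into G, Z and L above; the function name and the worked example (the system (1.6)) are illustrative choices, not the book's:

    import numpy as np

    def fourier_motzkin_step(A, b):
        # Eliminate x1 from A x <= b and return (A2, b2) so that A2 y <= b2
        # describes the projection onto y = (x2, ..., xn), as in Theorem 1.6.
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        G = [i for i in range(len(b)) if A[i, 0] > 0]    # upper bounds on x1
        Z = [i for i in range(len(b)) if A[i, 0] == 0]   # x1 not present
        L = [i for i in range(len(b)) if A[i, 0] < 0]    # lower bounds on x1
        rows = [A[i, 1:] for i in Z]
        rhs = [b[i] for i in Z]
        for i in L:
            for j in G:
                # normalized bounds: a'_i . y + b'_i <= x1 <= a'_j . y + b'_j
                ai, bi = -A[i, 1:] / A[i, 0], b[i] / A[i, 0]
                aj, bj = -A[j, 1:] / A[j, 0], b[j] / A[j, 0]
                rows.append(ai - aj)     # (a'_i - a'_j) . y <= b'_j - b'_i
                rhs.append(bj - bi)
        return np.array(rows), np.array(rhs)

    # The system (1.6) written as A(x, y) <= b; eliminating x leaves
    # inequalities in y equivalent to 0 <= y <= 3 (some redundant).
    A = [[-1, 0], [1, 2], [-1, -1], [1, -1], [0, -1]]
    b = [0, 6, -2, 3, 0]
    A2, b2 = fourier_motzkin_step(A, b)
    print(np.column_stack([A2, b2]))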


    1.4 Exercises

    Exercise 1.1. Sketch the set of solutions to the system

    2x + y 23x + y 9x + 2y 4

    y 0(1.11)

of linear inequalities. Carry out the elimination procedure for (1.11) as illustrated in §1.1.

    Exercise 1.2. Let

    P =

    (x, y, z) R3x y z 03x y z 1x + 3y z 2x y + 3z 3

and π : R3 → R2 be given by π(x, y, z) = (y, z).

(i) Compute π(P) as a polyhedron, i.e., as the solutions to a set of linear inequalities in y and z.
(ii) Compute the image of P under the map R3 → R given by (x, y, z) ↦ x.
(iii) How many integral points does P contain, i.e., how many elements are in the set P ∩ Z3?

Exercise 1.3. Find all solutions x, y, z ∈ Z to the linear inequalities

    x + y z 0 y + z 0

    z 0x z 1

    y 1z 1

    by using Fourier-Motzkin elimination.

    Exercise 1.4. Does the system

    2x 3y + z 2x + 3y + z 3

    2x 3y + z 2x 3y 3z 12x y + 3z 3


of linear inequalities have a solution x, y, z ∈ R?

Exercise 1.5. Let P ⊆ Rn be a polyhedron and c ∈ Rn. Define the polyhedron P̃ ⊆ Rn+1 by

P̃ = {(x, z) ∈ Rn+1 | cᵗx = z, x ∈ P, z ∈ R}.

(i) How does this setup relate to Example 1.2?
(ii) Show how projection onto the z-coordinate (and Fourier-Motzkin elimination) in P̃ can be used to solve the linear optimization problem of finding x ∈ P, such that cᵗx is minimal (or proving that such an x does not exist).

(iii) Let P denote the polyhedron from Exercise 1.2. You can see that

(0, 0, 0), (1, 1/2, 1/2) ∈ P

have values 0 and 1 on their first coordinates, but what is the minimal first coordinate of a point in P?

Exercise 1.6. Solve the problem appearing in Fourier's article (Figure 1.6) for r = 1 using Fourier-Motzkin elimination.

Exercise 1.7. Let P denote the set of (x, y, z) ∈ R3 satisfying

    2x + y + z 4x 1

    y 2z 3

    x 2y + z 12x + 2y z 5 .

(i) Prove that P is bounded.
(ii) Find (x, y, z) ∈ P with z maximal. Is such a point unique?

Exercise 1.8. A vitamin pill P is produced using two ingredients M1 and M2. The pill needs to satisfy four constraints for the vital vitamins V1 and V2. It must contain at least 6 milligrams and at most 15 milligrams of V1 and at least 5 milligrams and at most 12 milligrams of V2. The ingredient M1 contains 3 milligrams of V1 and 2 milligrams of V2 per gram. The ingredient M2 contains 2 milligrams of V1 and 3 milligrams of V2 per gram:


    V1 V2

    M1 3 2

    M2 2 3

Let x denote the amount of M1 and y the amount of M2 (measured in grams) in the production of a vitamin pill. Write down a system of linear inequalities in x and y describing the constraints above.

We want a vitamin pill of minimal weight satisfying the constraints. How many grams of M1 and M2 should we mix? Describe how Fourier-Motzkin elimination can be used in solving this problem.

Exercise 1.9. Use Fourier-Motzkin elimination to compute the minimal value of

    x1 + 2x2 + 3x3,

    when x1, x2, x3 satisfy

    x1 2x2 + x3 = 4x1 + 3x2 = 5

and x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.


    Chapter 2

    Affine subspaces

    A polyhedron is the set of solutions to a system of linear inequalities. Sets

{x ∈ Rd | Ax = b}     (2.1)

of solutions to a system of linear equations are polyhedra of the simplest kind. Here A is an m × d matrix and b ∈ Rm corresponding to a system of m linear equations with d unknowns.

    Recall that a line in Rd is a subset (see Figure 2.1) of the form

{x + tα | t ∈ R},

where x ∈ Rd is a vector and α ∈ Rd \ {0} a non-zero directional vector. Two distinct points u, v ∈ Rd are contained in the unique line

L = {(1 − t)u + tv | t ∈ R}.     (2.2)

Here a directional vector for L is v − u. It is not too difficult to check that sets of solutions to systems of linear equations such as (2.1) contain the line between any two of their points. Subsets with this intrinsic geometric property are called affine subspaces.

The purpose of this chapter is to give an account of affine subspaces based on systems of linear equations. The difference between affine subspaces and the usual subspaces of linear algebra is that the former do not necessarily contain the zero vector. Affine subspaces also enter into the important definition of the dimension of an arbitrary subset of Rd so that a point has dimension zero, a line dimension one etc.


Figure 2.1: Sketch of the line {x + tα | t ∈ R} in R2 with x and α marked.

    2.1 Definition and basics

We begin by stating some basic properties of affine subspaces. First a motivating example.

Example 2.1. Consider the three points v1 = (2, 1, 0), v2 = (1, 0, 1) and v3 = (0, 4, −1) ∈ R3. You can check through (2.2), that v3 does not lie on the unique line through v1 and v2, hence there is no line containing all three points. Therefore they span a unique plane H in R3. This plane is given parametrically as

v1 + t1(v2 − v1) + t2(v3 − v1) = (1 − t1 − t2)v1 + t1v2 + t2v3     (2.3)

for t1, t2 ∈ R (see Figure 2.2). In other words, H = v1 + W = {v1 + v | v ∈ W}, where W is the linear subspace of R3 spanned by the vectors v2 − v1 and v3 − v1. With numbers inserted this reads

H = {(2, 1, 0) + t1(−1, −1, 1) + t2(−2, 3, −1) | t1, t2 ∈ R}.

By finding a non-zero solution (α1, α2, α3) to the system

−α1 − α2 + α3 = 0
−2α1 + 3α2 − α3 = 0

of linear equations, you can check that W = {(x, y, z) ∈ R3 | 2x + 3y + 5z = 0}. Therefore

H = {(x, y, z) ∈ R3 | 2x + 3y + 5z = 7}     (2.4)

and the plane H is presented as the set of solutions to a linear equation.
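A quick numerical check of Example 2.1 (an illustrative sketch, not part of the text): the normal vector of H is a cross product of the two directional vectors, and the right-hand side is obtained by evaluating at v1.

    import numpy as np

    v1 = np.array([2., 1., 0.])
    v2 = np.array([1., 0., 1.])
    v3 = np.array([0., 4., -1.])

    # A normal of the plane spanned by v2 - v1 and v3 - v1.
    n = np.cross(v2 - v1, v3 - v1)
    c = n @ v1
    print(n, c)                       # a multiple of (2, 3, 5) and of 7
    print(n @ v2 == c, n @ v3 == c)   # True True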


Figure 2.2: Sketch of the plane H ⊆ R3 marking v1, the directional vectors v2 − v1, v3 − v1 and a normal vector proportional to (2, 3, 5).

Definition 2.2. A non-empty subset M ⊆ Rd is called an affine subspace if

(1 − t)u + tv ∈ M

for every u, v ∈ M and every t ∈ R. A map f : Rd → Rn is called an affine map if

f((1 − t)u + tv) = (1 − t)f(u) + tf(v)

for every u, v ∈ Rd and every t ∈ R.

The identity (2.3) in Example 2.1 points to the result below.

Lemma 2.3. Let M be an affine subspace of Rd and v1, . . . , vm ∈ M. Then

λ1v1 + · · · + λmvm ∈ M

for all real numbers λ1, . . . , λm ∈ R with λ1 + · · · + λm = 1.

Proof. This is proved by induction on m. For m = 2 this is the content of Definition 2.2. For m > 2 we must have 1 − λi ≠ 0 for some i = 1, . . . , m. We may assume that 1 − λm ≠ 0. Then

λ1v1 + · · · + λmvm = (1 − λm) ( λ1/(1 − λm) v1 + · · · + λm−1/(1 − λm) vm−1 ) + λmvm.

Since λ1 + · · · + λm−1 = 1 − λm we are done by induction.


    2.2 The affine hull

Definition 2.4. A linear combination

λ1v1 + · · · + λmvm

of vectors v1, . . . , vm ∈ Rd is called an affine linear combination if

λ1 + · · · + λm = 1.

The affine hull, aff(S), of a subset S ⊆ Rd is the set of all affine linear combinations of elements from S, i.e.,

aff(S) := {λ1v1 + · · · + λmvm | m ≥ 1, v1, . . . , vm ∈ S, λ1 + · · · + λm = 1}.

Proposition 2.5. The affine hull, aff(S), of a subset S ⊆ Rd is an affine subspace. It is the smallest affine subspace containing S.

Proof. Suppose that u, v ∈ aff(S), i.e.,

u = λ1v1 + · · · + λrvr
v = μ1v′1 + · · · + μsv′s

where r, s ≥ 1,

λ1 + · · · + λr = μ1 + · · · + μs = 1

and v1, . . . , vr, v′1, . . . , v′s ∈ S. Then

(1 − α)u + αv = (1 − α)λ1v1 + · · · + (1 − α)λrvr + αμ1v′1 + · · · + αμsv′s

for α ∈ R. This is an affine linear combination, since

(1 − α)λ1 + · · · + (1 − α)λr + αμ1 + · · · + αμs
  = (1 − α)(λ1 + · · · + λr) + α(μ1 + · · · + μs)
  = (1 − α) + α = 1

and aff(S) is an affine subspace according to Definition 2.2. If M is an affine subspace containing S, then M ⊇ aff(S) by Lemma 2.3. This proves that aff(S) is the smallest affine subspace containing S.


    2.3 Affine subspaces and subspaces

It is shown below that affine subspaces are solution sets of systems of linear equations. You may find it helpful to compare the proof with the explicit computations in Example 2.1.

Proposition 2.6. For an affine subspace M ⊆ Rd, W = {u − v | u, v ∈ M} is a subspace and

M = {x0 + w | w ∈ W} =: x0 + W,

for every x0 ∈ M.

A subset M ⊆ Rd is an affine subspace if and only if it is the solution set to a system of linear equations.

If h : Rd → Rn is a linear map and b ∈ Rn, then f(x) = h(x) + b is an affine map. If f : Rd → Rn is an affine map, h(x) = f(x) − f(0) is a linear map and f(x) = h(x) + b with b = f(0).

Proof. For u1, v1, u2, v2 ∈ M,

λ(u1 − v1) + μ(u2 − v2) = (λu1 + μu2 + (1 − λ − μ)v1) − (λv1 + μv2 + (1 − λ − μ)v1)

for λ, μ ∈ R and it follows by Lemma 2.3 that W = {u − v | u, v ∈ M} is a subspace. If x0 ∈ M it follows that x0 + (u − v) ∈ M again by Lemma 2.3. Therefore x0 + W ⊆ M. On the other hand if x ∈ M, then x = x0 + (x − x0) ∈ x0 + W. Therefore M ⊆ x0 + W and M = x0 + W.

If M is the solution set {x ∈ Rd | Ax = b} to a system of linear equations we leave it to the reader to verify that M is an affine subspace. If M on the other hand is only assumed to be an affine subspace, we may write M = x0 + W for x0 ∈ M and W as above. Now represent W as {x ∈ Rd | Ax = 0} for a suitable m × d matrix A with m ≤ d (see Exercise 2.3). Let M′ = {x ∈ Rd | Ax = b} with b = Ax0: if x ∈ W, then A(x0 + x) = b and M ⊆ M′. If z ∈ M′, then z − x0 ∈ W and the identity z = x0 + (z − x0) shows that M′ ⊆ M (compare this with the computations in Example 2.1). Therefore M = M′.

The last part of the proposition is left as an exercise (Exercise 2.9).


    2.4 Affine independence and the dimension of a subset

Definition 2.7. The dimension of an affine subspace M ⊆ Rd is defined as

dim(M) := dim W,

where W is the subspace {u − v | u, v ∈ M} ⊆ Rd (see Proposition 2.6). The dimension of an arbitrary subset S ⊆ Rd is defined as

dim S := dim aff(S).

A finite set S = {v1, . . . , vm} is called affinely independent if dim S = m − 1.

This definition is very intuitive, e.g., two points are affinely independent if they are different, three points are affinely independent if they do not lie on the same line etc. A single point has dimension zero, the affine span of two affinely independent points is a line (dimension one), the affine span of three affinely independent points is a plane (dimension two) etc.

If H ⊆ Rd is an affine subspace of dimension d − 1, then H = V + x0, where x0 ∈ H and V = {x ∈ Rd | αᵗx = 0} for some α ∈ Rd \ {0}, i.e., H = {x ∈ Rd | αᵗx = β} for β = αᵗx0. Such an affine subspace is called an affine hyperplane.

Definition 2.8. If H = {x ∈ Rn | αᵗx = β} is an affine hyperplane, we define

H+ = {x ∈ Rn | αᵗx ≥ β},    H++ = {x ∈ Rn | αᵗx > β},
H− = {x ∈ Rn | αᵗx ≤ β},    H−− = {x ∈ Rn | αᵗx < β}.

The two subsets H− and H+ are called (affine) half spaces.

Proposition 2.9. Let S = {v1, . . . , vm} ⊆ Rd. Then aff(S) = v1 + W, where W is the subspace spanned by v2 − v1, . . . , vm − v1. The following conditions are equivalent.

(1) S is affinely independent.
(2) v2 − v1, . . . , vm − v1 are linearly independent.
(3) The equations

λ1v1 + · · · + λmvm = 0
λ1 + · · · + λm = 0

imply that λ1 = · · · = λm = 0.


(4) The vectors (v1, 1), . . . , (vm, 1) are linearly independent in Rd+1.

Proof. By definition

aff(S) = {λ1v1 + · · · + λmvm | λ1 + · · · + λm = 1}

and {v − v1 | v ∈ aff(S)} is the subspace W of Proposition 2.6 for M = aff(S). Since

(λ1 − 1)v1 + λ2v2 + · · · + λmvm = λ2(v2 − v1) + · · · + λm(vm − v1)

it follows that W is spanned by v2 − v1, . . . , vm − v1 as claimed. Therefore (1) is equivalent to (2). To prove (2) ⇒ (3) we write

λ1v1 + · · · + λmvm = λ2(v2 − v1) + · · · + λm(vm − v1)

using that λ1 + · · · + λm = 0. By the linear independence of v2 − v1, . . . , vm − v1 it follows that λ2 = · · · = λm = 0 and therefore also λ1 = 0. The claim (4) is an exact restatement of (3). Assuming (3) holds, let us finish the proof by showing that (2) holds. Suppose that

μ1(v2 − v1) + · · · + μm−1(vm − v1) = 0

for μ1, . . . , μm−1 ∈ R. We may rewrite this as

λv1 + μ1v2 + · · · + μm−1vm

with λ = −μ1 − · · · − μm−1. Now (3) implies μ1 = · · · = μm−1 = 0 showing that v2 − v1, . . . , vm − v1 are linearly independent vectors.
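Condition (4) gives a mechanical test for affine independence. A small illustrative NumPy sketch (not from the text): append a coordinate 1 to each point and compare the rank of the resulting matrix with the number of points.

    import numpy as np

    def affinely_independent(points):
        # Condition (4) of Proposition 2.9: the points v_1, ..., v_m are
        # affinely independent iff the lifted vectors (v_i, 1) are linearly
        # independent in R^(d+1).
        V = np.array(points, dtype=float)
        lifted = np.hstack([V, np.ones((len(V), 1))])
        return np.linalg.matrix_rank(lifted) == len(V)

    print(affinely_independent([(0, 0), (1, 0), (0, 1)]))   # True: a genuine triangle
    print(affinely_independent([(0, 0), (1, 1), (2, 2)]))   # False: three collinear points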

    2.5 Exercises

Exercise 2.1. Let u, v ∈ Rd with u ≠ v. Prove that

L = {(1 − t)u + tv | t ∈ R}

is a line in Rd containing u and v. Prove also that if M is a line in Rd such that u, v ∈ M, then M = L.


Exercise 2.2. Let u = (1, 1, 1) and v = (1, 2, 3) be vectors in R3. Show that u and v are linearly independent and find α ∈ R3 with

W = {x ∈ R3 | αᵗx = 0},

where W is the subspace spanned by u and v.

Exercise 2.3. Let W ⊆ Rd be a subspace and suppose that v1, . . . , vr is a basis of W. Prove that

W⊥ = {u ∈ Rd | uᵗv1 = · · · = uᵗvr = 0} ⊆ Rd

is a subspace. Let u1, . . . , us be a basis of W⊥ and A the s × d matrix with these vectors as rows. Show that s = d − r and

W = {x ∈ Rd | Ax = 0}.

Exercise 2.4. Prove that {v ∈ Rd | Av = b} is an affine subspace of Rd, where A is an m × d matrix and b ∈ Rm.

Exercise 2.5. Let M be an affine subspace. Prove that {u − v | u, v ∈ M} is a subspace.

Exercise 2.6. Can you have two linearly independent vectors in R? What about two affinely independent vectors?

Exercise 2.7. Decide if (2, 1), (3, 2) and (5, 5) are on the same line in R2 applying Proposition 2.9.

Exercise 2.8. Let S = {(1, 1, 1), (2, 3, 4), (1, 2, 3), (2, 1, 0)} ⊆ R3. Compute the smallest affine subspace containing S.

Exercise 2.9. Prove that f(x) = h(x) + b is an affine map if h : Rd → Rn is a linear map and b ∈ Rn. Prove that h(x) = f(x) − f(0) is a linear map if f : Rd → Rn is an affine map.

Exercise 2.10. Prove that you can have no more than d + 1 affinely independent vectors in Rd.

Exercise 2.11. Let v0, . . . , vd be affinely independent points in Rd. Prove that

f(x) = (λ0, λ1, . . . , λd)

is a well defined affine map f : Rd → Rd+1, where

x = λ0v0 + · · · + λdvd

with λ0 + · · · + λd = 1.


Exercise 2.12. Prove that a non-empty open subset U ⊆ Rd has dimension dim U = d. Show that a subset S ⊆ Rd with dim S = d contains a non-empty open subset.


    Chapter 3

    Convex subsets

An affine subspace M is the set of solutions to a system of linear equations and contains the line {(1 − t)x + ty | t ∈ R} between any two of its points x, y ∈ M. A polyhedron P is the set of solutions to a system of linear inequalities and is only guaranteed to contain the line segment

{(1 − t)x + ty | 0 ≤ t ≤ 1}

between any two of its points x, y ∈ P (see Figure 3.1 and Exercise 3.1).


Figure 3.1: A polyhedron P ⊆ R2 with two points x, y ∈ P, the line segment between x and y and the line through x and y.

In this chapter we will go beyond polyhedra and study subsets of Rd with the property that they contain the line segment between any two of their points. Subsets with this nowhere concave property have appeared up through the history of mathematics at least since the time of Archimedes.¹

¹Archimedes of Syracuse (287 BC to 212 BC). Greek mathematician.


The first formal definition appeared around 1896 in Minkowski's famous monograph Geometrie der Zahlen (see [Kjeldsen (2008)]).

    3.1 Basics

The following simple definition (compare with Definition 2.2) is truly fundamental in modern mathematics.

Definition 3.1. A subset C ⊆ Rd is called convex if it contains the line segment between any two of its points:

(1 − t)u + tv ∈ C

for every u, v ∈ C and every t ∈ R with 0 ≤ t ≤ 1.


    Figure 3.2: (a) A non-convex subset of R2. (b) A convex subset of R2.

An intersection of convex subsets is a convex subset. Since an affine half space is a convex subset, it follows that polyhedra are convex subsets. The union of convex subsets does not have to be convex.


    Minkowski sum, dilation and the polar of a subset

A fundamental operation is setwise addition of convex subsets. The sum A + B of two subsets A and B of Rd is defined as the subset

A + B := {u + v | u ∈ A, v ∈ B} ⊆ Rd.

If A and B are convex subsets, A + B is a convex subset and it is called the Minkowski sum of A and B. This rather simple operation on convex subsets turns out to be very important.


    Figure 3.3: Minkowski sum of a triangle and a disc.

Another fundamental operation is dilation of a convex subset C ⊆ Rd by λ ∈ R. This is the convex subset

λC := {λx | x ∈ C} ⊆ Rd.


Figure 3.4: Dilation of a convex subset C with λ = 2. The marked point indicates (0, 0) in R2.

Let S ⊆ Rd be any subset. Then

S° := {α ∈ Rd | αᵗx ≤ 1, for every x ∈ S} ⊆ Rd

is called the polar of S (see Figure 3.5). One can prove that S° is a convex subset of Rd and that 0 ∈ S°.


Figure 3.5: A convex subset and its polar. The marked points indicate (0, 0) in R2.

    3.2 The convex hull

In complete analogy with Lemma 2.3 we have the following (the proof is the same).

Lemma 3.2. Let C be a convex subset of Rd and v1, . . . , vm ∈ C. Then

λ1v1 + · · · + λmvm ∈ C

if λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1.

A linear combination λ1v1 + · · · + λmvm of vectors v1, . . . , vm ∈ Rd is called convex if λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1.

Definition 3.3. The convex hull of a subset S ⊆ Rd is the set of all convex linear combinations of elements from S, i.e.,

conv(S) := {λ1v1 + · · · + λmvm | m ≥ 1, v1, . . . , vm ∈ S, λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1}.

If S is a finite subset, conv(S) is called a polytope. If S ⊆ R2 is a finite subset, we call conv(S) a (convex) polygon.

Proposition 3.4. The convex hull, conv(S), of a subset S ⊆ Rd is a convex subset. It is the smallest convex subset containing S.

The polar set introduced previously has a nice interpretation for the convex hull of a finite set of points.

Proposition 3.5. Let C = conv({v1, . . . , vm}), where v1, . . . , vm ∈ Rd. Then the polar of C is the polyhedron given by

C° = {α ∈ Rd | αᵗv1 ≤ 1, . . . , αᵗvm ≤ 1}.


Figure 3.6: The convex hull of 100 random points in R2. In this example, 90 of the 100 points lie in the convex hull of the 10 "extreme" points. Finding the extreme points is a non-trivial computational task.
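For experiments like the one in Figure 3.6, an off-the-shelf convex hull routine can be used; a minimal illustrative sketch with scipy.spatial.ConvexHull (not part of the text):

    import numpy as np
    from scipy.spatial import ConvexHull

    rng = np.random.default_rng(0)
    points = rng.random((100, 2))        # 100 random points in the unit square

    hull = ConvexHull(points)
    print(len(hull.vertices))            # number of "extreme" points
    print(points[hull.vertices])         # the extreme points, counterclockwise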

Proof. In Exercise 3.9 you are asked to prove this and compute a polar convex hull based on Example 3.6.

Example 3.6. To get a feeling for convex hulls, it is important to play around with (lots of) examples in the plane. In Figure 3.7 you see a finite subset of four planar points and their convex hull.

In suitable coordinates the four points are

(0, 1), (3, 2), (2, 1) and (4, 0).

You can check that the third point is contained in the convex hull of the other points, since

(2, 1) = 3/7 (0, 1) + 2/7 (3, 2) + 2/7 (4, 0).

You may wonder where the coefficients 3/7, 2/7 and 2/7 came from. In fact, the coefficients λ1, λ2 and λ3 in

(2, 1) = λ1 (0, 1) + λ2 (3, 2) + λ3 (4, 0)

coming from Definition 3.3, must solve the system

3λ2 + 4λ3 = 2
λ1 + 2λ2 = 1
λ1 + λ2 + λ3 = 1

of linear equations. This makes them very explicit and computable.
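A tiny NumPy check of this computation (an illustrative sketch, not part of the text):

    import numpy as np

    # Columns are the points (0,1), (3,2), (4,0); the last row forces the
    # coefficients to sum to 1 (an affine, here in fact convex, combination).
    A = np.array([[0., 3., 4.],
                  [1., 2., 0.],
                  [1., 1., 1.]])
    b = np.array([2., 1., 1.])
    print(np.linalg.solve(A, b))   # [0.4286 0.2857 0.2857] = (3/7, 2/7, 2/7)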

    Figure 3.7: Four points (left). The convex hull (right).

One of the points in Figure 3.7 can be omitted without altering the convex hull. This point seems not to be a vertex or a corner. We will give a quite general definition that captures the corners or vertices of a convex subset.

    3.3 Faces of convex subsets

A convex subset C is structured around its faces, which are special convex subsets of C. Informally the zero-dimensional faces are the vertices of C, the one-dimensional faces the extreme line segments of C etc. The vertices (called extreme points) are the points not located in the interior of a line segment in C. Only one of the marked points in the triangle to the right in Figure 3.7 fails to be extreme. The extreme points are the vertices and the one-dimensional faces the edges of the triangle. We will prove later (see Theorem 6.12) that a compact convex subset always is the convex hull of its extreme points. The precise definition of a face in a convex subset is as follows.

Definition 3.7. Let C ⊆ Rd be a convex subset. A subset F ⊆ C is called a face of C if F is convex and for every x, y ∈ C and 0 < λ < 1,

(1 − λ)x + λy ∈ F

implies that x, y ∈ F.


An important type of faces of a convex subset is given by the following result.

Lemma 3.8. Let C ⊆ Rd be a convex subset and α ∈ Rd. Then

F = {z ∈ C | αᵗz ≤ αᵗx, for every x ∈ C}     (3.1)

is a face of C.

Proof. If z1, z2 ∈ F, then αᵗz1 ≤ αᵗz2 and αᵗz2 ≤ αᵗz1. Therefore αᵗz1 = αᵗz2 and F = {x ∈ C | αᵗx = β} with β = αᵗz1, which shows that F is a convex subset as the intersection of C with an affine hyperplane.

Suppose that x, y ∈ C and z := (1 − λ)x + λy ∈ F for 0 < λ < 1, i.e., αᵗz ≤ αᵗv for every v ∈ C, in particular αᵗz ≤ αᵗx and αᵗz ≤ αᵗy. Let us assume that αᵗx ≤ αᵗy. Then

αᵗx = (1 − λ)αᵗx + λαᵗx ≤ (1 − λ)αᵗx + λαᵗy = αᵗz.

This implies x ∈ F. Since we must have αᵗy = αᵗx, it also follows that y ∈ F.

A face F ⊆ C of a convex subset C ⊆ Rd is called exposed if it is given as in (3.1) for some α ∈ Rd. A face of a convex subset does not have to be exposed (see Exercise 3.11). A zero-dimensional face (a point) is called an extreme point. A point z ∈ C is extreme if for every x, y ∈ C

z ∈ conv({x, y})

if and only if z = x or z = y. The set of extreme points in C is denoted ext(C).


Figure 3.8: (a) A line in the plane does not have extreme points. (b) A disc has infinitely many extreme points but no one-dimensional faces. (c) A polygon with nine extreme points and nine one-dimensional faces.


The notion of an extreme point identifies the non-redundant points in a convex hull (see Figure 3.6). Formally, one can show (see Exercise 3.13) by induction that if z is an extreme point of conv({x1, . . . , xn}), then

    z = x1 or z = x2 or . . . or z = xn.

    Interlude: Integral points in convex subsets

This section is meant to be a leisurely break from the formal theory and a pointer to some of the rich and beautiful mathematics that surrounds convexity.

A convex subset does not have to be a polyhedron. An example of a non-polyhedral convex subset is a planar disc (see Figure 3.9).

Figure 3.9: (left) A planar disc is not a polyhedron. (right) The integral points of a planar disc.

Non-polyhedral convex subsets are both interesting and difficult. For one thing, counting the number N(r) of integral points inside a circle (see Figure 3.9) of radius r is a classical and notoriously difficult problem studied by Gauss around 1834. Gauss studied the error term E(r) = |N(r) − πr²| and proved that E(r) ≤ 2√2 πr. In another example, a four-dimensional ball

B(N) = {(x1, x2, x3, x4) ∈ R4 | x1² + x2² + x3² + x4² ≤ N} ⊆ R4

is a convex subset. There is a remarkable connection between the prime


factorization of N ∈ N and counting integral points in B(N). Notice that

|B(N) ∩ Z4| − |B(N − 1) ∩ Z4| = |{(x1, x2, x3, x4) ∈ Z4 | x1² + x2² + x3² + x4² = N}|.

An amazing result [Jacobi (1829)] due to Jacobi² about the number of ways of writing an integer as a sum of four squares, says that

|{(x1, x2, x3, x4) ∈ Z4 | x1² + x2² + x3² + x4² = N}| = 8 Σ_{d | N, 4∤d} d.     (3.2)

As a simple check of this remarkable formula you can verify that 4 is a sum of four squares in the following 8 + 16 different ways:

(±2, 0, 0, 0), (0, ±2, 0, 0), (0, 0, ±2, 0), (0, 0, 0, ±2), (±1, ±1, ±1, ±1).
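A brute-force check of (3.2) for small N is easy; the following Python sketch is illustrative and not part of the text:

    from itertools import product

    def r4(N):
        # Number of ways of writing N as an ordered sum of four squares
        # of integers (signs and order count).
        m = int(N ** 0.5) + 1
        return sum(1 for x in product(range(-m, m + 1), repeat=4)
                   if x[0]**2 + x[1]**2 + x[2]**2 + x[3]**2 == N)

    def jacobi(N):
        # Right-hand side of (3.2): 8 times the sum of the divisors of N
        # that are not divisible by 4.
        return 8 * sum(d for d in range(1, N + 1) if N % d == 0 and d % 4 != 0)

    print(r4(4), jacobi(4))                                  # 24 24
    print(all(r4(N) == jacobi(N) for N in range(1, 30)))     # True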

If you consider a number N = pq, which is the product of two odd prime numbers p and q, you can use (3.2) to find p + q by counting integral points in B(N). But if you know N = pq and p + q, then you can retrieve p and q, since

(x − p)(x − q) = x² − (p + q)x + N.

This may sound esoteric, but in essence, it shows that if you can count integral points in the four-dimensional ball effectively, then you can crack the worldwide encryption system RSA (RSA is based on the computational hardness of prime factorization; we are talking 500 digit numbers or more here).

Counting integral points in polyhedra is difficult but much better understood. For example, if $P$ is a convex polygon with $\dim(P) = 2$ and integral vertices, then the number of integral points in $P$ is given by the formula of Pick³ from 1899:

$$|P \cap \mathbb{Z}^2| = \operatorname{Area}(P) + \tfrac{1}{2}B(P) + 1,$$

where $B(P)$ is the number of integral points on the boundary of $P$. You can easily check this with a few examples. Consider for example the convex polygon $P$ in Figure 3.10.

²Carl Gustav Jacobi (1804–1851). German mathematician.
³Georg Alexander Pick (1859–1942). Austrian mathematician.


    Figure 3.10: Integral points in a convex polygon with integral vertices.

By subdivision into triangles it follows that $\operatorname{Area}(P) = \tfrac{55}{2}$. Also, by an easy count we get $B(P) = 7$. Therefore the formula of Pick shows that

$$|P \cap \mathbb{Z}^2| = \tfrac{55}{2} + \tfrac{1}{2}\cdot 7 + 1 = 32.$$

The polygon contains 32 integral points. This can be verified by an explicit count from Figure 3.10.
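You can repeat this check on any convex polygon with integral vertices using a computer. The vertex coordinates of the polygon in Figure 3.10 are not repeated here, so the Python sketch below uses a small made-up triangle instead; the shoelace formula gives the area, gcd's along the edges give $B(P)$, and a brute-force scan counts the lattice points.

```python
from math import gcd

# A made-up example polygon with integral vertices (not the one in Figure 3.10),
# listed counterclockwise.
verts = [(0, 0), (4, 0), (0, 6)]
edges = list(zip(verts, verts[1:] + verts[:1]))

# Area by the shoelace formula.
area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2

# Integral points on the boundary: gcd(|dx|, |dy|) per edge.
B = sum(gcd(abs(x2 - x1), abs(y2 - y1)) for (x1, y1), (x2, y2) in edges)

# Brute-force count of integral points in the (convex, counterclockwise) polygon.
xs, ys = zip(*verts)
count = sum(1 for x in range(min(xs), max(xs) + 1)
              for y in range(min(ys), max(ys) + 1)
              if all((x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0
                     for (x1, y1), (x2, y2) in edges))

print(area + B / 2 + 1, count)   # Pick's formula and the direct count agree: 19
```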

For a very nice account of counting integral points in polyhedra and beyond look up [de Loera (2005)].

An exceedingly important class of convex subsets is now introduced and the connection to the convex hull is described.

    3.4 Convex cones

A cone in $\mathbb{R}^d$ is a subset $K$ such that $\lambda x \in K$ for every $x \in K$ and every $\lambda \ge 0$. A pointed cone is a cone not containing a line.

Proposition 3.9. A non-empty polyhedron $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ is a cone if and only if $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$. $P$ is a pointed cone if and only if $\operatorname{rk} A = d$.

Proof. If $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$, then $P$ is a cone, since $A(\lambda x) = \lambda(Ax) \le 0$ if $\lambda \ge 0$ and $Ax \le 0$. If $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ is a cone, then $b \ge 0$, since $0 \in P$. Therefore $\{x \in \mathbb{R}^d \mid Ax \le 0\} \subseteq P$. If $x \in P$ and $Ax \not\le 0$, then $a_j^t x > 0$ for some row vector $a_j$ of $A$ and $a_j^t(\lambda x) = \lambda(a_j^t x) \le b_j$ for every $\lambda \ge 0$, since $P$ is a cone. This contradicts that $a_j^t z$ is bounded above by $b_j$ for $z \in P$. Therefore $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$.

If $\operatorname{rk} A < d$, there exists a nonzero $\gamma \in \mathbb{R}^d$ with $A\gamma = 0$. This shows that $P$ contains the line $\{t\gamma \mid t \in \mathbb{R}\}$. On the other hand, suppose that $P$ contains the line $L = \{\alpha + t\gamma \mid t \in \mathbb{R}\}$ with $\gamma \in \mathbb{R}^d \setminus \{0\}$. If $a_j^t\gamma \ne 0$ for some $j$, then we may find $t \in \mathbb{R}$ such that $a_j^t\alpha + t\,a_j^t\gamma > 0$, contradicting that $L \subseteq P$. Therefore $A\gamma = 0$ and we must have $\operatorname{rk} A < d$. $\square$

    Figure 3.11: Part of the Lorentz cone in R3.

A polyhedral cone is a convex cone. An example of a non-polyhedral convex cone is the Lorentz cone (see Figure 3.11) given by

$$C = \{(x, y, z) \in \mathbb{R}^3 \mid z \ge 0,\ x^2 + y^2 \le z^2\}.$$

It is not too hard to prove that $C$ is a cone. Proving that $C$ is a convex subset is more of a challenge (Exercise 3.17).

    The recession cone

To every convex subset $C \subseteq \mathbb{R}^n$ we associate a convex cone denoted $\operatorname{rec}(C)$ called the recession cone of $C$. Informally $\operatorname{rec}(C)$ is the set of directions in which $C$ recedes⁴ (see [Rockafellar (1970)], Section 8). The precise definition is

$$\operatorname{rec}(C) = \{d \in \mathbb{R}^n \mid x + d \in C \text{ for every } x \in C\}. \qquad (3.3)$$

It is left as an exercise (Exercise 3.18) to prove that $\operatorname{rec}(C)$ is a convex cone. If $C$ is a convex cone, then $\operatorname{rec}(C) = C$.

    4Go to or toward a more distant point.


Figure 3.12: The recession cone of the convex subset $C = \{(x, y) \mid xy \ge 1,\ x > 0\} \subseteq \mathbb{R}^2$ is $\operatorname{rec}(C) = \{(x, y) \mid x \ge 0,\ y \ge 0\}$. The dashed arrow is not in $\operatorname{rec}(C)$. Its direction leads to points outside $C$.
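The claim in Figure 3.12 is easy to test numerically. The short Python sketch below is only a spot check on a few sample points and directions (chosen for illustration), not a proof.

```python
def in_C(p):
    """Membership test for C = {(x, y) | xy >= 1, x > 0}."""
    x, y = p
    return x > 0 and x * y >= 1

points = [(1.0, 1.0), (0.5, 2.0), (4.0, 0.25)]        # sample points of C
good_dirs = [(1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]      # directions with x, y >= 0
bad_dir = (1.0, -0.5)                                 # not in rec(C)

# x + d stays in C for every sample point and every direction in the claimed cone.
print(all(in_C((x + dx, y + dy)) for (x, y) in points for (dx, dy) in good_dirs))

# Moving far enough along the bad direction leaves C.
x, y = 1.0, 1.0
print(in_C((x + 10 * bad_dir[0], y + 10 * bad_dir[1])))   # False
```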

    Finitely generated cones

A linear combination $\lambda_1 v_1 + \cdots + \lambda_m v_m$ of vectors $v_1, \dots, v_m \in \mathbb{R}^d$ is called conic if $\lambda_1, \dots, \lambda_m \ge 0$. In complete analogy with the convex hull we introduce the following.

Definition 3.10. The cone generated by a subset $S \subseteq \mathbb{R}^d$ is the set of all conic linear combinations of elements from $S$, i.e.,

$$\operatorname{cone}(S) := \{\lambda_1 v_1 + \cdots + \lambda_m v_m \mid m \ge 1,\ v_1, \dots, v_m \in S,\ \lambda_1, \dots, \lambda_m \ge 0\}.$$

A convex cone $C$ is called finitely generated if $C = \operatorname{cone}(S)$ for a finite set $S \subseteq \mathbb{R}^d$.

Convex cones are closely related to the convex hull. Here we repeat the idea illustrated in Example 3.6 in the context of finitely generated cones.

Example 3.11. A triangle $T$ is the convex hull of three affinely independent points

$$(x_1, y_1),\ (x_2, y_2),\ (x_3, y_3)$$

in the plane and $(x, y) \in T$ if and only if

$$\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ y_2 \\ 1 \end{pmatrix}, \begin{pmatrix} x_3 \\ y_3 \\ 1 \end{pmatrix}\right\}\right). \qquad (3.4)$$


Testing (3.4) amounts to solving the system

$$\begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (3.5)$$

of linear equations. So $(x, y) \in T$ if and only if the unique solution to (3.5) has $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ and $\lambda_3 \ge 0$. Let us experiment with a few concrete numbers. You can plot the points

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix},\quad \begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 5 \\ 3 \end{pmatrix} \in \mathbb{R}^2$$

and realize that their convex hull, $T$, is a very thin triangle. From a drawing it can be difficult to decide if a given point is inside the triangle. Here the $3 \times 3$ matrix from (3.5) helps. In this case

$$A = \begin{pmatrix} 0 & 2 & 5 \\ 0 & 1 & 3 \\ 1 & 1 & 1 \end{pmatrix} \quad\text{and}\quad A^{-1} = \begin{pmatrix} -2 & 3 & 1 \\ 3 & -5 & 0 \\ -1 & 2 & 0 \end{pmatrix}.$$

Let us check if $v_1 = (1, \tfrac{11}{20}) \in T$ and $v_2 = (4, 2) \in T$. In the case of $v_1$,

$$A^{-1}\begin{pmatrix} 1 \\ 11/20 \\ 1 \end{pmatrix} = \begin{pmatrix} 13/20 \\ 1/4 \\ 1/10 \end{pmatrix}.$$

This is the solution $(\lambda_1, \lambda_2, \lambda_3)$ to the linear equations in (3.5). In this case, the solution satisfies $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ and $\lambda_3 \ge 0$. Therefore $v_1 \in T$. As an added bonus you also see that

$$\begin{pmatrix} 1 \\ \tfrac{11}{20} \end{pmatrix} = \tfrac{13}{20}\begin{pmatrix} 0 \\ 0 \end{pmatrix} + \tfrac{1}{4}\begin{pmatrix} 2 \\ 1 \end{pmatrix} + \tfrac{1}{10}\begin{pmatrix} 5 \\ 3 \end{pmatrix}.$$

For $v_2$ we get

$$A^{-1}\begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix}.$$

Here the first coordinate of the solution to (3.5) is negative and therefore $v_2 \notin T$.
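This computation is easy to hand over to a computer. Here is one possible Python/NumPy sketch of the membership test above; it simply solves (3.5) and checks the signs of the solution.

```python
import numpy as np

# Columns are the vertices of the thin triangle lifted to (x, y, 1).
A = np.array([[0.0, 2.0, 5.0],
              [0.0, 1.0, 3.0],
              [1.0, 1.0, 1.0]])

def in_triangle(point):
    """Solve (3.5) for (x, y, 1) and test whether all coordinates are nonnegative."""
    lam = np.linalg.solve(A, np.array([point[0], point[1], 1.0]))
    return bool(np.all(lam >= -1e-12)), lam

print(in_triangle((1.0, 11 / 20)))   # (True,  [ 0.65  0.25  0.1 ])
print(in_triangle((4.0, 2.0)))       # (False, [-1.    2.    0.  ])
```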


In general, the polar of a subset $S \subseteq \mathbb{R}^d$ is defined as

$$S^\circ = \{\alpha \in \mathbb{R}^d \mid \alpha^t x \le 1 \text{ for every } x \in S\}. \qquad (3.6)$$

If $S$ happens to be a cone $K$ we can replace the upper bound of $1$ in (3.6) by $0$, because if $\alpha^t x > 0$ for some $x \in K$, we would have $\alpha^t(\lambda x) = \lambda(\alpha^t x) > 1$ for a sufficiently big $\lambda > 0$. This contradicts that $\alpha \in K^\circ$, since $\lambda x \in K$. Therefore

$$K^\circ = \{\alpha \in \mathbb{R}^d \mid \alpha^t x \le 0 \text{ for every } x \in K\}.$$

The polar cone $K^\circ$ is an intersection of half spaces through $0$ and therefore a convex cone. In complete analogy with Proposition 3.5 we have the following result.

Proposition 3.12. If $C \subseteq \mathbb{R}^d$ is a finitely generated cone, then $C^\circ$ is a polyhedral cone.

Proof. Suppose that $C = \operatorname{cone}(S)$, where $S = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. Then $\alpha^t x \le 0$ for every $x = \lambda_1 v_1 + \cdots + \lambda_m v_m$, where $\lambda_i \ge 0$ for $i = 1, \dots, m$, if and only if $\alpha^t v_i \le 0$ for $i = 1, \dots, m$. Let $A$ be the $m \times d$ matrix with $v_1, \dots, v_m$ as its rows. Then it follows that

$$C^\circ = \{\alpha \in \mathbb{R}^d \mid A\alpha \le 0\}. \qquad \square$$

Polyhedral cones are finitely generated and finitely generated cones are polyhedral. These two deeper results due to Minkowski and Weyl are the focus of the next chapter. In purely algebraic terms, the finite generation of a polyhedral cone says that there exist finitely many solutions $v_1, \dots, v_N \in \mathbb{R}^n$ to a system of linear inequalities

$$\begin{aligned} a_{11}x_1 + \cdots + a_{n1}x_n &\le 0\\ &\ \ \vdots\\ a_{1m}x_1 + \cdots + a_{nm}x_n &\le 0, \end{aligned} \qquad (3.7)$$

such that every solution to (3.7) is a conic linear combination of $v_1, \dots, v_N$. In the setting of linear algebra you know that every solution to the homogeneous linear system corresponding to (3.7) is a linear combination of $n$ solutions. For (3.7) even the existence of finitely many (conic) generating solutions is a non-trivial statement.

Example 3.13. In Figure 3.13 we have sketched a finitely generated cone $C$ and its polar cone $C^\circ$. If you look closer, you will see that

$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}\right) \quad\text{and}\quad C^\circ = \operatorname{cone}\left(\left\{\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}\right).$$


Figure 3.13: A finitely generated cone $C$ and its polar cone $C^\circ$.

Notice also that $C^\circ$ encodes the fact that $C$ is the intersection of the two half planes

$$\left\{\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \;\middle|\; \begin{pmatrix} 1 \\ -2 \end{pmatrix}^{\!t}\begin{pmatrix} x \\ y \end{pmatrix} \le 0\right\} \quad\text{and}\quad \left\{\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \;\middle|\; \begin{pmatrix} -2 \\ 1 \end{pmatrix}^{\!t}\begin{pmatrix} x \\ y \end{pmatrix} \le 0\right\}.$$

3.5 Carathéodory's theorem

In Example 3.11 you saw how to check if a point is in the convex hull of three points in the plane. This led to three linear equations in three unknowns. What if you had more than three points? In §3.6 it is shown how to reduce such a computation to a finite number of the simple computations carried out in Example 3.11. The key point is that a finitely generated cone is the union of finitely many simplicial subcones. A convex cone is called simplicial if it is generated by finitely many linearly independent vectors. The following result was first proved by Carathéodory.⁵

⁵Constantin Carathéodory (1873–1950). Greek mathematician.


Theorem 3.14 (Carathéodory). Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. If $v \in \operatorname{cone}(V)$, then $v$ belongs to the cone generated by a linearly independent subset of $V$.

Proof. Suppose that

$$v = \lambda_1 v_1 + \cdots + \lambda_m v_m$$

with $\lambda_1, \dots, \lambda_m > 0$ and $v_1, \dots, v_m$ linearly dependent. The linear dependence means that there exist $\theta_1, \dots, \theta_m \in \mathbb{R}$ not all zero such that

$$\theta_1 v_1 + \cdots + \theta_m v_m = 0. \qquad (3.8)$$

We may assume that at least one $\theta_i > 0$, multiplying (3.8) by $-1$ if necessary. Introducing the variable $\lambda \in \mathbb{R}$ we have

$$v = v - \lambda(\theta_1 v_1 + \cdots + \theta_m v_m) = (\lambda_1 - \lambda\theta_1)v_1 + \cdots + (\lambda_m - \lambda\theta_m)v_m. \qquad (3.9)$$

Let

$$\lambda^* = \max\{\lambda \ge 0 \mid \lambda_i - \lambda\theta_i \ge 0 \text{ for every } i = 1, \dots, m\} = \min\left\{\frac{\lambda_i}{\theta_i} \;\middle|\; \theta_i > 0,\ i = 1, \dots, m\right\}.$$

When you substitute $\lambda^*$ for $\lambda$ in (3.9), you discover that $v$ lies in the subcone generated by a proper subset of $V$. Basically we are varying $\lambda$ in (3.9) ensuring non-negative coefficients for $v_1, \dots, v_m$ until the first time we reach a zero coefficient in front of some $v_j$. This (or these) $v_j$ is (are) deleted from the generating set. Repeating this process we will eventually end up with a linearly independent subset of vectors from $V$. $\square$

Corollary 3.15. Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. If $v \in \operatorname{conv}(V)$, then $v$ belongs to the convex hull of an affinely independent subset of $V$.

Proof. If $v \in \operatorname{conv}(\{v_1, \dots, v_m\})$, then

$$\begin{pmatrix} v \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} v_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} v_m \\ 1 \end{pmatrix}\right\}\right).$$

Now use Theorem 3.14 to conclude that

$$\begin{pmatrix} v \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}\right\}\right),$$

where

$$\left\{\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}\right\} \subseteq \left\{\begin{pmatrix} v_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} v_m \\ 1 \end{pmatrix}\right\}$$

is a linearly independent subset. Therefore $v \in \operatorname{conv}(\{u_1, \dots, u_k\})$. But

$$\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}$$

are linearly independent if and only if $u_1, \dots, u_k$ are affinely independent by Proposition 2.9. $\square$

A consequence of Corollary 3.15 is that a point in the convex hull of more than three planar points belongs to the convex hull of at most three of the points (see Figure 3.14).

Figure 3.14: A point in the convex hull of some given planar points is in the convex hull of at most three of these points.

The affine equivalent of a simplicial cone is called a simplex. More precisely, a convex subset is called a $d$-simplex if it is the convex hull of $d + 1$ affinely independent points (see Figure 3.15). In these terms, Corollary 3.15 states that the convex hull of finitely many points is a union of finitely many simplices.

    Figure 3.15: Picture of 0-simplex (point), 1-simplex (line segment), 2-simplex (triangle), 3-simplex (tetrahedron).

Example 3.16. The proofs of Theorem 3.14 and Corollary 3.15 may appear quite abstract until you study a few concrete examples. Here is a typical example illustrating the linear algebra that goes into their proofs. Consider the convex hull

$$C = \operatorname{conv}\left(\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$


The identity

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \tfrac{1}{11}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \tfrac{1}{11}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \tfrac{5}{11}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{4}{11}\begin{pmatrix} 4 \\ 0 \end{pmatrix}$$

shows that

$$v = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \in C.$$

Corollary 3.15 says however that $v$ is in the convex hull of at most three of these four vectors (why?). Let us emulate the proof of Corollary 3.15 in a concrete computation. The first step is to find a linear dependence between the four vectors

$$\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} \qquad (3.10)$$

in $\mathbb{R}^3$ (compare this with the beginning of the proof of Corollary 3.15). Here is a linear dependence:

$$5\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} - 6\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + 3\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} - 2\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (3.11)$$

Just to make sure you know how to translate linear dependence into the more mundane setting of linear equations, let us show (in painstaking detail) how (3.11) was found! A linear dependence for the vectors in (3.10) can be found as a non-zero solution to the equations

$$\begin{aligned} \theta_1 + \theta_3 + 4\theta_4 &= 0\\ \theta_2 + 2\theta_3 &= 0\\ \theta_1 + \theta_2 + \theta_3 + \theta_4 &= 0 \end{aligned} \qquad (3.12)$$

in $\theta_1, \theta_2, \theta_3$ and $\theta_4$. Such a non-zero solution always exists (see Theorem B.2). By subtracting the first equation from the third (a step in Gaussian elimination), you get

$$\begin{aligned} \theta_1 + \theta_3 + 4\theta_4 &= 0\\ \theta_2 + 2\theta_3 &= 0\\ \theta_2 - 3\theta_4 &= 0. \end{aligned}$$

From this system of equations you can glean the non-zero solution

$$\theta_2 = -6,\quad \theta_4 = -2,\quad \theta_3 = 3 \quad\text{and}\quad \theta_1 = 5,$$

which is the linear dependence in (3.11). In this particular case (3.9) becomes

$$\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \left(\tfrac{1}{11} - 5\lambda\right)\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \left(\tfrac{1}{11} + 6\lambda\right)\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \left(\tfrac{5}{11} - 3\lambda\right)\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \left(\tfrac{4}{11} + 2\lambda\right)\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}.$$


How big is $\lambda$ allowed to be, when

$$\tfrac{1}{11} - 5\lambda \ge 0 \quad\text{and}\quad \tfrac{5}{11} - 3\lambda \ge 0$$

must hold? Solving the inequalities for $\lambda$ we get $\lambda \le \tfrac{1}{55}$ and $\lambda \le \tfrac{5}{33}$. To ensure that both inequalities are satisfied, we must have $\lambda \le \tfrac{1}{55}$. The magic appears when we pick $\lambda = \tfrac{1}{55}$. Then

$$\begin{aligned} \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} &= \left(\tfrac{1}{11} - \tfrac{1}{55}\cdot 5\right)\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \left(\tfrac{1}{11} + \tfrac{1}{55}\cdot 6\right)\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \left(\tfrac{5}{11} - \tfrac{1}{55}\cdot 3\right)\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \left(\tfrac{4}{11} + \tfrac{1}{55}\cdot 2\right)\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}\\[4pt] &= \tfrac{11}{55}\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}. \end{aligned}$$

This implies

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \tfrac{11}{55}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 4 \\ 0 \end{pmatrix}.$$

Therefore

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} \in \operatorname{conv}\left(\left\{\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$
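The bookkeeping in this example follows a completely mechanical recipe, so it can be scripted. The Python/NumPy sketch below is one possible way to do it (an illustration only): it repeatedly picks a linear dependence among the lifted vectors, moves along it as far as the coefficients allow, and drops a vector whose coefficient has become zero.

```python
import numpy as np

def caratheodory_reduce(vectors, lam):
    """Shrink a conic combination v = sum(lam[i] * vectors[i]) with lam > 0
    until the remaining vectors are linearly independent (as in Theorem 3.14)."""
    vectors = [np.array(v, dtype=float) for v in vectors]
    lam = np.array(lam, dtype=float)
    while True:
        M = np.column_stack(vectors)
        if np.linalg.matrix_rank(M) == len(vectors):
            return vectors, lam                      # linearly independent: done
        theta = np.linalg.svd(M)[2][-1]              # a nonzero solution of M @ theta = 0
        if theta.max() <= 0:
            theta = -theta                           # make sure some theta_i > 0
        pos = theta > 1e-12
        step = np.min(lam[pos] / theta[pos])         # the largest admissible lambda
        lam = lam - step * theta
        keep = lam > 1e-12                           # drop vectors with zero coefficient
        vectors = [v for v, k in zip(vectors, keep) if k]
        lam = lam[keep]

# The data of Example 3.16, lifted to R^3 as in the proof of Corollary 3.15.
vs = [(1, 0, 1), (0, 1, 1), (1, 2, 1), (4, 0, 1)]
vecs, coeffs = caratheodory_reduce(vs, [1/11, 1/11, 5/11, 4/11])
print(coeffs, [tuple(v) for v in vecs])   # three vectors remain, coefficients sum to 1
```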

3.6 The convex hull, simplicial subsets and Bland's rule

How do we decide if a point is in the convex hull of finitely many given points in $\mathbb{R}^d$? You have seen in the proof of Corollary 3.15 and in Example 3.16 how this can be reduced to deciding if a vector is a conic linear combination of finitely many given vectors. In principle, Theorem 3.14 tells us how to check this. But if $V$ in the statement of Theorem 3.14 is a generating set for $\mathbb{R}^d$ with $m$ elements, we need to check the linearly independent subsets of $V$ with $d$ elements. This seems like an insurmountable problem. If for example $m = 30$ and $d = 10$, there can be up to 30 045 015 of the latter.

We will present a rather surprising algorithm for traversing these subsets coming from §7.1 in the classic monograph [Schrijver (1986)]. The surprise is that the algorithm does not get stuck: it manages to traverse the linearly independent subsets until it finds that the vector is in the cone or that this is certifiably false. The method is inspired by Bland's rule in the simplex algorithm (see §5.4).

Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$ be a generating set and $x \in \mathbb{R}^d$. We call a subset $J \subseteq \{1, 2, \dots, m\}$ of size $d$ simplicial if $|J| = d$ and $\{v_j \mid j \in J\}$ is linearly independent.


(1) For a simplicial subset $J = \{i_1, \dots, i_d\} \subseteq \{1, 2, \dots, m\}$, there exist uniquely defined $\lambda_{i_j} \in \mathbb{R}$ for $j = 1, \dots, d$, such that
$$x = \lambda_{i_1} v_{i_1} + \cdots + \lambda_{i_d} v_{i_d}.$$
(2) If $\lambda_{i_j} \ge 0$ for every $j = 1, \dots, d$, we have verified that $x \in \operatorname{cone}(V)$. If not, pick the smallest index $k \in \{i_1, \dots, i_d\}$ with $\lambda_k < 0$. Let $c \in \mathbb{R}^d$ be the unique vector with the property that
$$c^t v_k = 1 \quad\text{and}\quad c^t v_j = 0 \ \text{ if } j \in \{i_1, \dots, i_d\} \setminus \{k\}.$$
This vector is the $k$-th row in $A^{-1}$, where $A$ is the matrix with column vectors $v_{i_1}, \dots, v_{i_d}$. Notice that $c^t x = \lambda_k < 0$.
(3) If $c^t v_j \ge 0$ for every $j \notin J$, then $c^t v \ge 0$ for every $v \in \operatorname{cone}(V)$. Since $c^t x < 0$, this shows that $x \in \operatorname{cone}(V)$ is impossible.
(4) If not, pick the smallest index $j \in \{1, \dots, m\}$ with $c^t v_j < 0$ and repeat the procedure from (1) with the simplicial subset
$$J := (J \setminus \{k\}) \cup \{j\}.$$
Here we will say that $j$ enters and $k$ exits the simplicial subset.

Example 3.17. As an illustration (see Figure 3.16) let us check if

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \in \operatorname{conv}\left(\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 5 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$

Rewriting this as a conic problem we need to check if $x \in \operatorname{cone}(V)$, where

$$V = \left\{\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 5 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}\right\} \quad\text{and}\quad x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

We initiate the algorithm above with the simplicial subset $J = \{3, 4, 5\}$. Here

$$x = v_3 - v_4 + v_5$$

with $k = 4$ and $c = (\tfrac{1}{2}, \tfrac{1}{2}, -2)$. Since $c^t v_1 = -\tfrac{3}{2}$ and $c^t v_2 = -\tfrac{3}{2}$ we get $j = 1$ and repeat the iteration with $J = \{3, 1, 5\}$. Here

$$x = \tfrac{1}{2}v_3 + \tfrac{2}{3}v_1 - \tfrac{1}{6}v_5$$

with $k = 5$ and $c = (\tfrac{1}{3}, -\tfrac{1}{6}, -\tfrac{1}{3})$. Since $c^t v_2 = -\tfrac{1}{2}$ and $c^t v_4 = \tfrac{7}{6}$ we get $j = 2$. Therefore the iteration is repeated with $J = \{3, 1, 2\}$. Here

$$x = \tfrac{1}{3}v_3 + \tfrac{1}{3}v_1 + \tfrac{1}{3}v_2,$$

showing that $x \in \operatorname{cone}(V)$.
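Steps (1)–(4) are short enough to implement directly. Here is one possible Python/NumPy sketch of the procedure (an illustration, not code from the text); it replays the iterations above and either returns nonnegative coefficients or a certificate vector $c$ with $c^t x < 0$ and $c^t v_j \ge 0$ for all $j$.

```python
import numpy as np

def in_cone(V, x, J):
    """Steps (1)-(4) with Bland's smallest-index rule.
    V: list of vectors spanning R^d, x: target vector, J: initial simplicial
    subset given by 0-based indices. Returns ('yes', coefficients) or ('no', c)."""
    V = [np.array(v, dtype=float) for v in V]
    x = np.array(x, dtype=float)
    J = list(J)
    while True:
        A = np.column_stack([V[i] for i in J])             # step (1)
        lam = np.linalg.solve(A, x)
        if np.all(lam >= -1e-12):
            return "yes", dict(zip(J, lam))
        neg = [p for p in range(len(J)) if lam[p] < -1e-12]
        k_pos = min(neg, key=lambda p: J[p])               # step (2): smallest index k
        c = np.linalg.inv(A)[k_pos]                        # row of A^{-1} belonging to k
        entering = [j for j in range(len(V)) if j not in J and c @ V[j] < -1e-12]
        if not entering:                                   # step (3): certificate found
            return "no", c
        J[k_pos] = min(entering)                           # step (4): Bland's rule

# Example 3.17 with 0-based indices; J = {3, 4, 5} becomes [2, 3, 4].
V = [(1, 0, 1), (0, 1, 1), (2, 2, 1), (5, 1, 1), (4, 0, 1)]
print(in_cone(V, (1, 1, 1), [2, 3, 4]))
# ('yes', {2: 0.333..., 0: 0.333..., 1: 0.333...}) after {3,4,5} -> {1,3,5} -> {1,2,3}
```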


Figure 3.16: Geometric illustration of the steps $\{3, 4, 5\} \to \{1, 3, 5\} \to \{1, 2, 3\}$ in detecting that the marked point is in the convex hull of the points in Example 3.17.

    Non-cycling

We will now prove that the procedure outlined in the above steps (1)–(4) terminates. Let $J_t$ denote the simplicial subset in step $t$ of the algorithm. If the algorithm does not terminate it must enter into a cycle: $J_k = J_l$ for some $k < l$.

Let $r \in \{1, 2, \dots, m\}$ be the largest index to exit the simplicial subset in a step $p \in \{k, k+1, \dots, l-1\}$. Since $J_k = J_l$, there must also exist a step $q \in \{k, k+1, \dots, l-1\}$ for which $r$ enters the simplicial subset. Suppose that $J_p = \{i_1, \dots, i_d\}$. Then

$$x = \lambda_{i_1} v_{i_1} + \cdots + \lambda_{i_d} v_{i_d} \qquad (3.13)$$

with $\lambda_{i_j} \ge 0$ for $i_j < r$ and $\lambda_r < 0$.

Let $c$ denote the vector of (2) in step $q$, where $r$ enters. Then

$$c^t x < 0. \qquad (3.14)$$

Since $r$ enters the simplicial subset, we must have $c^t v_{i_j} \ge 0$ for $i_j < r$ and $c^t v_r < 0$ by (4). We already know that $\lambda_r < 0$ and $\lambda_{i_j} \ge 0$ for $i_j < r$. For $i_j > r$ we must have $c^t v_{i_j} = 0$, by the construction of $c$ in (2) and since $r$ is the largest index to exit the simplicial subset. Applying these observations to (3.13), we get $c^t x \ge 0$, contradicting (3.14).

The rule of always exchanging smallest indices is the equivalent of Bland's rule in the simplex algorithm. It is remarkable that this simple rule is so powerful. In fact, the above algorithm furnishes an independent proof of Theorem 3.14 along with the substantial mathematical result: if $x \notin \operatorname{cone}(V)$, there exists $c \in \mathbb{R}^d$ with $c^t x < 0$ and $c^t z \ge 0$ for $z \in \operatorname{cone}(V)$. The latter will appear later under the name Farkas's lemma (Lemma 4.14) with a different proof.


    3.7 Exercises

Exercise 3.1. Let $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ be a polyhedron in $\mathbb{R}^d$. Prove that $(1-t)x + ty \in P$ if $x, y \in P$ and $0 \le t \le 1$.

Exercise 3.2. Let $A, B \subseteq \mathbb{R}^d$ be convex subsets. Prove that $A \cap B$ is a convex subset. Give an example showing that $A \cup B$ does not have to be a convex subset.

Exercise 3.3. Prove in detail that an affine half space is a convex subset and that a polyhedron

$$P = \{x \in \mathbb{R}^d \mid Ax \le b\}$$

is a convex subset of $\mathbb{R}^d$.

Exercise 3.4. Let $A$ be a convex subset of $\mathbb{R}^d$. Prove that

$$A + z := \{x + z \mid x \in A\} \quad\text{and}\quad \lambda A := \{\lambda x \mid x \in A\}$$

are convex subsets of $\mathbb{R}^d$ for $z \in \mathbb{R}^d$ and $\lambda \in \mathbb{R}$. Let $B$ be a convex subset of $\mathbb{R}^d$. Prove that

$$A + B := \{x + y \mid x \in A,\ y \in B\}$$

is a convex subset of $\mathbb{R}^d$.

Exercise 3.5. Let $v_1, v_2, v_3 \in \mathbb{R}^n$. Show that

$$\{(1-\mu)v_3 + \mu((1-\lambda)v_1 + \lambda v_2) \mid \mu \in [0, 1],\ \lambda \in [0, 1]\} = \{\lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3 \mid \lambda_1, \lambda_2, \lambda_3 \ge 0,\ \lambda_1 + \lambda_2 + \lambda_3 = 1\}.$$

Exercise 3.6. Sketch the convex hull of

$$S = \{(0, 0), (1, 0), (1, 1)\} \subseteq \mathbb{R}^2.$$

Write $\operatorname{conv}(S)$ as the intersection of three half planes.

Exercise 3.7. Let $u_1, u_2, v_1, v_2 \in \mathbb{R}^n$. Show that

$$\operatorname{conv}(\{u_1, u_2\}) + \operatorname{conv}(\{v_1, v_2\}) = \operatorname{conv}(\{u_1 + v_1,\ u_1 + v_2,\ u_2 + v_1,\ u_2 + v_2\}).$$


Exercise 3.8. Let $S \subseteq \mathbb{R}^n$ be a convex subset and $v \in \mathbb{R}^n$. Show that

$$\{(1-\lambda)s + \lambda v \mid \lambda \in [0, 1],\ s \in S\}$$

is a convex subset. Hint: compare with Exercise 3.5.

Exercise 3.9. Let $C = \operatorname{conv}(\{v_1, \dots, v_m\})$ and $P = C^\circ$ for $v_1, \dots, v_m \in \mathbb{R}^d$. Prove that

$$P = \{\alpha \in \mathbb{R}^d \mid \alpha^t v_1 \le 1, \dots, \alpha^t v_m \le 1\}$$

and that $P$ is a polyhedron. Compute and sketch $P$ for $C$ given in Example 3.6. Prove in general that $P$ is bounded if $0$ is an interior point of $C$.

Exercise 3.10. If $F \subseteq G \subseteq C$ are convex subsets of $\mathbb{R}^d$, prove that $F$ is a face of $C$ if $F$ is a face of $G$ and $G$ is a face of $C$.

Exercise 3.11. Give an example of a convex subset $C \subseteq \mathbb{R}^d$ and a face $F \subseteq \mathbb{R}^d$, which is not exposed (hint: think about stretching a disc).

Exercise 3.12. Prove that $C \setminus F$ is a convex subset if $F$ is a face of a convex subset $C$. Is it true that $F \subseteq C$ is a face if $C \setminus F$ is a convex subset?

Exercise 3.13. Let $X = \{x_1, \dots, x_n\} \subseteq \mathbb{R}^d$.

(i) Prove that if $z \in \operatorname{conv}(X)$ is an extreme point, then $z \in X$.
(ii) Suppose that $z \notin X$. Prove that $z$ is an extreme point of $\operatorname{conv}(\{z\} \cup X)$ if and only if $z \notin \operatorname{conv}(X)$.

This means that the extreme points in a convex hull consist of the non-redundant generators (compare this with Figure 3.6).

Exercise 3.14. Prove in detail that

$$C = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$$

is a convex subset of $\mathbb{R}^2$. What are the extreme points of $C$? Can you prove it?


Exercise 3.15. Recall the notation

$$A + B = \{u + v \mid u \in A,\ v \in B\} \subseteq \mathbb{R}^n$$

for two subsets $A, B \subseteq \mathbb{R}^n$ and let $[u, v] := \operatorname{conv}(\{u, v\})$ for $u, v \in \mathbb{R}^n$.

(i) Show that
$$[u, v] + \{w\} = [u + w, v + w]$$
for $u, v, w \in \mathbb{R}^n$.

(ii) Sketch
$$P = \left[\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right] + \left[\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 \\ 2 \end{pmatrix}\right]$$
along with its extreme points in the plane.

(iii) Let
$$Q = P + \left[\begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 1 \end{pmatrix}\right].$$
Write $Q$ as a convex hull of the minimal number of points and as an intersection of half planes.

(iv) Let $A$ and $B$ be convex sets and $u_0 \in A$ a point in $A$, which is not extreme. Show that $u_0 + b \in A + B$ is not extreme in $A + B$ for any $b \in B$.

(v) Show that
$$T = [x, y] + [z, w]$$
has at most 4 extreme points for $x, y, z, w \in \mathbb{R}^n$. Can $T$ have 3 extreme points? 2?

(vi) Let $L_i = [u_i, v_i]$ for $i = 1, \dots, m$, where $u_i, v_i \in \mathbb{R}^n$. Give an upper bound for how many extreme points
$$Z = L_1 + \cdots + L_m \qquad (3.15)$$
can have. Show that $Z$ is the image of the unit cube $[0, 1]^m \subseteq \mathbb{R}^m$ under a suitable affine map.

The Minkowski sum of finitely many line segments (as in (3.15)) is called a zonotope.

    Exercise 3.16. Give an example of a non-convex cone.

Exercise 3.17. Prove in detail that

$$C = \{(x, y, z) \in \mathbb{R}^3 \mid z \ge 0,\ x^2 + y^2 \le z^2\}$$

is a convex cone. Is $C$ finitely generated?


Exercise 3.18. Prove in detail that the recession cone $\operatorname{rec}(C)$ defined in (3.3) is a convex cone, where $C$ is a convex subset. Perhaps the identity

$$x + (n+\mu)d = (1-\mu)(x + nd) + \mu(x + (n+1)d)$$

might come in handy.

    Exercise 3.19. What is the recession cone of a bounded convex subset?

Exercise 3.20. Can you give an example of an unbounded convex subset $C \subseteq \mathbb{R}^2$ with $\operatorname{rec}(C) = \{0\}$?

Exercise 3.21. Let

$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}\right).$$

(i) Show that
$$C^\circ = \operatorname{cone}\left(\left\{\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}\right).$$

(ii) Suppose that
$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} a \\ c \end{pmatrix}, \begin{pmatrix} b \\ d \end{pmatrix}\right\}\right),$$
where
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is an invertible matrix. How do you compute $C^\circ$?

Exercise 3.22. The vector

$$v = \begin{pmatrix} 7/4 \\ 19/8 \end{pmatrix}$$

is the convex combination

$$\tfrac{1}{8}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \tfrac{1}{8}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{1}{4}\begin{pmatrix} 2 \\ 2 \end{pmatrix} + \tfrac{1}{2}\begin{pmatrix} 2 \\ 3 \end{pmatrix}$$

of four vectors in $\mathbb{R}^2$. Use the method outlined in Example 3.16 to answer the following questions.

(i) Is $v$ in the convex hull of three of the four vectors?
(ii) Is $v$ in the convex hull of two of the four vectors?


Exercise 3.23. Let $C_1, C_2, C_3, C_4$ be convex subsets of $\mathbb{R}^2$, such that any three of them have non-empty intersection, i.e.,

$$C_1 \cap C_2 \cap C_3 \ne \emptyset,\quad C_1 \cap C_2 \cap C_4 \ne \emptyset,\quad C_1 \cap C_3 \cap C_4 \ne \emptyset,\quad C_2 \cap C_3 \cap C_4 \ne \emptyset.$$

(i) Show that for $v_1, v_2, v_3, v_4 \in \mathbb{R}^2$ there are $\lambda_1, \lambda_2, \lambda_3, \lambda_4 \in \mathbb{R}$, not all zero, such that
$$\lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3 + \lambda_4 v_4 = 0 \quad\text{and}\quad \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 0.$$

(ii) By assumption there exists
$$v_i \in \bigcap_{\substack{j=1\\ j \ne i}}^{4} C_j$$
for $i = 1, \dots, 4$. Suppose that with the notation in (i) we have
$$\lambda_1, \lambda_2 \ge 0 \quad\text{and}\quad \lambda_3, \lambda_4 \le 0.$$
Prove that
$$\frac{\lambda_1}{\lambda_1 + \lambda_2}\,v_1 + \frac{\lambda_2}{\lambda_1 + \lambda_2}\,v_2 \in C_1 \cap C_2 \cap C_3 \cap C_4.$$

(iii) Prove now in general that $C_1 \cap C_2 \cap C_3 \cap C_4 \ne \emptyset$.
(iv) Prove that if $C_1, \dots, C_m$ are convex subsets of $\mathbb{R}^2$ where any three of them have non-empty intersection, then $C_1 \cap \dots \cap C_m \ne \emptyset$.
(v) What is a natural generalization from $\mathbb{R}^2$ to $\mathbb{R}^n$ of the result in (iv)?

Exercise 3.24. Let $S$ be a subset of $\mathbb{R}^n$ containing at least $n + 2$ points. Prove that there exist subsets $S_1, S_2 \subseteq S$, such that

(i) $S_1 \cap S_2 = \emptyset$
(ii) $S_1 \cup S_2 = S$
(iii) $\operatorname{conv}(S_1) \cap \operatorname{conv}(S_2) \ne \emptyset$.

Hint: write down an affine dependence between $v_1, \dots, v_{n+2} \in S$. This result is called Radon's theorem.

Exercise 3.25. Use the result in Exercise 3.24 to give a complete proof of the natural generalization alluded to in Exercise 3.23 (v) (called Helly's theorem).

Exercise 3.26. Let $e_1, e_2, e_3$ denote the canonical basis vectors of $\mathbb{R}^3$ and let

$$C = \operatorname{conv}(\{e_1, -e_1, e_2, -e_2, e_3, -e_3\}).$$

Verify that

$$\left(\tfrac{1}{3}, \tfrac{1}{5}, \tfrac{1}{7}\right) \in C$$

by writing down a convex linear combination.


    Chapter 4

    Polyhedra

Unlike a planar disc (Figure 3.9) and the Lorentz cone (Figure 3.11), polyhedra are intersections of finitely many affine half spaces (see Figure 4.1).

    Figure 4.1: A (famous) polyhedron in R3.

The purpose of this chapter is to dig deeper into the structure of polyhedra only using Fourier-Motzkin elimination and basic linear algebra. A polyhedron has finitely many faces and they can be characterized using linear algebra. As a prelude to this characterization in Proposition 4.3, we will briefly mention the remarkable numerical relation known as Euler's formula. A proof of this formula is beyond the scope of this book. Fortunately it is one of those mathematical miracles that can be appreciated almost instantly (see (4.1) and Figure 4.2).

A central result due to Minkowski¹ and Weyl² says that a polyhedron can be expressed as the Minkowski sum of a finitely generated cone and the convex hull of finitely many points. This result is conveniently treated in the setting of convex cones.

¹Hermann Minkowski (1864–1909). German mathematician.
²Hermann Weyl (1885–1955). German mathematician.


A finitely generated convex cone is a polyhedron. It is not easy to write up a precise mathematical proof of this statement. Hermann Weyl, one of the great mathematicians of the 20th century, found it worthwhile to devote a paper [Weyl (1935)] to it. This result is now known as Weyl's theorem and was the subject of his last seminar in Göttingen in the summer of 1933 before he left for the United States.

We will give a deceptively short proof of Weyl's theorem using Fourier-Motzkin elimination. In all fairness we also enter into the ideas of Weyl's original proof in Chapter 5, hoping to convince the reader that it contains much deeper information on the minimal generators of a polyhedral cone.

The theory of polyhedra is an abundant source of interesting mathematics. We give a few examples in this chapter like Markov chains and steady states, duality in linear programming and doubly stochastic matrices.

    4.1 Faces of polyhedra

Assume in the following remark that the convex hull of finitely many points is a polyhedron and that a polyhedron has finitely many faces. Both of these statements will be proved later.

Remark 4.1. If $C \subseteq \mathbb{R}^3$ is a convex subset, then its extreme points are called vertices, its one-dimensional faces are called edges and its two-dimensional faces are called faces. Let $P \subseteq \mathbb{R}^3$ be the convex hull of finitely many points and suppose that $P$ is not contained in an affine plane. There is a remarkable numerical relation between the number of vertices $V$, the number of edges $E$ and the number of faces $F$ of $P$. This relation is

$$V - E + F = 2 \qquad (4.1)$$

and seems to have been observed first by Descartes³ in 1640. It was later used by Euler⁴ in 1752 (see the chapter on topology in [Courant and Robbins (1941)]) and is today called Euler's formula for polyhedra. You should check (4.1) with the platonic solids depicted as wireframes in Figure 4.2 (for the cube we have $8 - 12 + 6 = 2$).
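If you prefer to let the computer do the checking, here is a quick Python verification of (4.1) with the vertex, edge and face counts of the five platonic solids filled in:

```python
# (V, E, F) for the five platonic solids shown in Figure 4.2.
platonic = {
    "tetrahedron":  (4, 6, 4),
    "cube":         (8, 12, 6),
    "octahedron":   (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron":  (12, 30, 20),
}
for name, (V, E, F) in platonic.items():
    print(name, V - E + F)   # prints 2 for every solid
```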

The formula (4.1) belongs to a rich branch of mathematics called polyhedral combinatorics. For the convex hull $P$ of finitely many points in $\mathbb{R}^d$

³René Descartes (1596–1650). French mathematician.
⁴Leonhard Euler (1707–1783). Swiss mathematician.


Figure 4.2: An illustration of Euler's formula (4.1): the five so-called platonic solids with vertices and (in)visible edges marked, namely the cube (6 faces), the tetrahedron (4 faces), the octahedron (8 faces), the icosahedron (20 faces) and the dodecahedron (12 faces).

not contained in an affine hyperplane, (4.1) generalizes to the identity

$$f_0 - f_1 + f_2 - \cdots + (-1)^{d-1}f_{d-1} = 1 + (-1)^{d-1}, \qquad (4.2)$$

    where fj denotes the n