
Lauritzen - Undergraduate Convexity - From Fourier and Motzkin to Kuhn and Tucker


An undergraduate treatment of convexity and optimization. Can be read by anyone familiar with linear algebra.


UNDERGRADUATE CONVEXITY: From Fourier and Motzkin to Kuhn and Tucker



Niels Lauritzen
Aarhus University, Denmark

UNDERGRADUATE CONVEXITY: From Fourier and Motzkin to Kuhn and Tucker

New Jersey · London · Singapore · Beijing · Shanghai · Hong Kong · Taipei · Chennai

    World Scientific


Published by

    World Scientific Publishing Co. Pte. Ltd.

    5 Toh Tuck Link, Singapore 596224

    USA office: 27 Warren Street, Suite 401-402, Hackensack, NJ 07601

    UK office: 57 Shelton Street, Covent Garden, London WC2H 9HE

    British Library Cataloguing-in-Publication Data

    A catalogue record for this book is available from the British Library.

Cover image: Johan Ludvig William Valdemar Jensen (1859–1925).

    Mathematician and telephone engineer.

    Photograph by Vilhelm Rieger (courtesy of the Royal Library, Copenhagen).

    UNDERGRADUATE CONVEXITY

    From Fourier and Motzkin to Kuhn and Tucker

Copyright © 2013 by World Scientific Publishing Co. Pte. Ltd.

    All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means,

    electronic or mechanical, including photocopying, recording or any information storage and retrieval

    system now known or to be invented, without written permission from the Publisher.

    For photocopying of material in this volume, please pay a copying fee through the Copyright

    Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to

    photocopy is not required from the publisher.

    ISBN 978-981-4412-51-3

    ISBN 978-981-4452-76-2 (pbk)

    Printed in Singapore.



    Preface

Convexity is a key concept in modern mathematics with rich applications in economics and optimization.

This book is a basic introduction to convexity based on several years of teaching the one-quarter courses Konvekse Mængder (convex sets) and Konvekse Funktioner (convex functions) to undergraduate students in mathematics, economics and computer science at Aarhus University. The prerequisites are minimal, consisting only of first year courses in calculus and linear algebra.

I have attempted to strike a balance between different approaches to convexity in applied and pure mathematics. Compared to the former, the mathematics takes a front seat. Compared to some of the latter, a key point is that the ability to carry out computations is considered paramount and a crucial stepping stone to the understanding of abstract concepts, e.g., the definition of a face of a convex set does not make much sense before it is viewed in the context of several simple examples and computations.

Chapters 1–6 treat convex subsets from the basics of linear inequalities to Minkowski's theorem on separation of disjoint convex subsets by hyperplanes. The basic idea has been to emphasize part of the rich finite theory of polyhedra before entering into the infinite theory of closed convex subsets.

Fourier-Motzkin elimination is to linear inequalities what Gaussian elimination is to linear equations. It seems appropriate to begin a course on convexity by introducing this simple, yet powerful method. The prerequisites are barely present. Still the first chapter contains substantial results such as a simple algorithm for linear optimization and the fundamental theorem that projections of polyhedra are themselves polyhedra.



Before introducing closed convex subsets, several basic definitions and highlights from the polyhedral world are given: a concise treatment of affine subspaces, faces of convex subsets, Bland's rule from the simplex algorithm as a tool for computing with the convex hull, faces of polyhedra, Farkas's lemma, steady states for Markov chains, duality in linear programming, doubly stochastic matrices and the Birkhoff polytope.

The chapter Computations with polyhedra contains a treatment of two important polyhedral algorithms: the double description method and the simplex algorithm. The double description method is related to Fourier-Motzkin elimination. It is very easily explained in an undergraduate context, especially as a vehicle for computing the bounding half spaces of a convex hull.

The simplex algorithm solves linear optimization problems and is somewhat mysterious from a mathematical perspective. There is no obvious reason it should work well. In fact, the famous mathematician John von Neumann never really believed it would perform in practice. The inventor George B. Dantzig also searched for alternate methods for years before confronting experimental data from some of the world's first computers: the simplex algorithm performed amazingly well in practice. Only recently has a mathematical explanation for this phenomenon been given by Spielman and Teng. Our treatment of the simplex algorithm and the simplex tableau deviates from the standard form and works with the polyhedron in its defining space.

The transition to the continuous theory of non-polyhedral convex subsets comes after the first five chapters. Here it is proved that closed convex subsets serve as generalizations of polyhedra, since they coincide with arbitrary intersections of affine half spaces. The existence of a supporting hyperplane at a boundary point of a convex subset is proved and Minkowski's theorems on compact convex subsets and separation of disjoint convex subsets are given.

Chapters 7–10 treat convex functions from the basic theory of convex functions of one variable with Jensen's inequality to the Karush-Kuhn-Tucker conditions, dual optimization problems and an outline of an interior point algorithm for solving convex optimization problems in several variables. The setting is almost always the simplest. Great generality is fine when you have lived with a subject for years, but in an introductory course it tends to become a burden. You accomplish less by including more.

The main emphasis is on differentiable convex functions. Since undergraduate knowledge of differentiability may vary, we give an almost complete review of the theory of differentiability in one and several variables. The only general result on convex functions not assuming differentiability is the existence of the subgradient at a point.

An understanding of convex functions of several variables is impossible without knowledge of the finer points of linear algebra over the real numbers. Introducing convex functions of several variables, we also give a thorough review of positive semidefinite matrices and reduction of symmetric matrices. This important part of linear algebra is rarely fully understood at an undergraduate level.

The final chapter treats Convex optimization. The key elements are the Karush-Kuhn-Tucker conditions, how saddle points of the Lagrangian lead to a dual optimization problem and finally an outline of an interior point algorithm using bisection and the modified Newton method. Monographs have been written on these three topics. We only give a brief but self-contained introduction with simple examples.

    Suggestions for teaching a one-semester course

The amount of material included in this book exceeds a realistic plan for a one-semester undergraduate course on convexity. I consider Fourier-Motzkin elimination (Chapter 1), affine subspaces (Chapter 2), basics of convex subsets (Chapter 3), the foundational material on polyhedra in Chapter 4, a taste of one of the two algorithms in Chapter 5 and closed convex subsets (Chapter 6) as a minimum, along with almost all of the material in Chapters 7–10.

The progression of learning depends on the proficiency in linear algebra and calculus. The necessary basic concepts from analysis are introduced in Appendix A. In Appendix B there is a review of linear algebra from the point of view of linear equations leading to the rank of a matrix.

In my view, a too rigid focus on the abstract mathematical details before telling about examples and computations is a major setback in the teaching of mathematics at all levels. Certainly the material in this book benefits from being presented in a computational context with lots of examples.

    Aarhus, December 2012


    Acknowledgments

I am extremely grateful to Tage Bai Andersen and Jesper Funch Thomsen for very useful and detailed comments on a second draft for this book. Comments from Kent Andersen, Jens Carsten Jantzen, Anders Nedergaard Jensen and Markus Kiderlen also led to several improvements.

I am an algebraist by training and encountered convexity because of an interest in computational algebra (and computers!). As such, I have benefited immensely over the years from insightful explanations from the following more knowledgeable people: Tage Bai Andersen, Kent Andersen, Kristoffer Arnsfelt Hansen, Peter Bro Miltersen, Marcel Bökstedt, Komei Fukuda, Anders Nedergaard Jensen, Herbert Scarf, Jacob Schach Møller, Andrew du Plessis, Henrik Stetkær, Bernd Sturmfels, Rekha Thomas, Jørgen Tornehave, Jørgen Vesterstrøm and Bent Ørsted.

I am grateful to Jens Carsten Jantzen, Jesper Lützen and Tage Gutmann Madsen for help in tracking down the venerable Jensen inequality postage stamp used for several years by the Department of Mathematical Sciences at the University of Copenhagen. Also, thanks to Tinne Hoff Kjeldsen for sharing her expertise on the fascinating history of convexity and optimization.

A very special thanks to the teaching assistants on Konvekse Mængder and Konvekse Funktioner: Lisbeth Laursen, Jonas Andersen Seebach, Morten Leander Petersen, Rolf Wognsen, Linnea Jørgensen and Dan Zhang. They pointed out several inaccuracies in my lecture notes along the way.

I am grateful to Kwong Lai Fun and Lakshmi Narayanan of World Scientific for their skilled help in the production of this book.

Lars 'daleif' Madsen has been crucial in the technical typesetting with his vast knowledge of LaTeX and his usual careful attention to detail.

Finally, Helle and William deserve an abundance of gratitude for their patience and genuine love.



    Contents

Preface
Acknowledgments

1. Fourier-Motzkin elimination
   1.1 Linear inequalities
   1.2 Linear optimization using elimination
   1.3 Polyhedra
   1.4 Exercises

2. Affine subspaces
   2.1 Definition and basics
   2.2 The affine hull
   2.3 Affine subspaces and subspaces
   2.4 Affine independence and the dimension of a subset
   2.5 Exercises

3. Convex subsets
   3.1 Basics
   3.2 The convex hull
   3.3 Faces of convex subsets
   3.4 Convex cones
   3.5 Carathéodory's theorem
   3.6 The convex hull, simplicial subsets and Bland's rule
   3.7 Exercises

4. Polyhedra
   4.1 Faces of polyhedra
   4.2 Extreme points and linear optimization
   4.3 Weyl's theorem
   4.4 Farkas's lemma
   4.5 Three applications of Farkas's lemma
       4.5.1 Markov chains and steady states
       4.5.2 Gordan's theorem
       4.5.3 Duality in linear programming
   4.6 Minkowski's theorem
   4.7 Parametrization of polyhedra
   4.8 Doubly stochastic matrices: The Birkhoff polytope
       4.8.1 Perfect pairings and doubly stochastic matrices
   4.9 Exercises

5. Computations with polyhedra
   5.1 Extreme rays and minimal generators in convex cones
   5.2 Minimal generators of a polyhedral cone
   5.3 The double description method
       5.3.1 Converting from half space to vertex representation
       5.3.2 Converting from vertex to half space representation
       5.3.3 Computing the convex hull
   5.4 Linear programming and the simplex algorithm
       5.4.1 Two examples of linear programs
       5.4.2 The simplex algorithm in a special case
       5.4.3 The simplex algorithm for polyhedra in general form
       5.4.4 The simplicial hack
       5.4.5 The computational miracle of the simplex tableau
       5.4.6 Computing a vertex in a polyhedron
   5.5 Exercises

6. Closed convex subsets and separating hyperplanes
   6.1 Closed convex subsets
   6.2 Supporting hyperplanes
   6.3 Separation by hyperplanes
   6.4 Exercises

7. Convex functions
   7.1 Basics
   7.2 Jensen's inequality
   7.3 Minima of convex functions
   7.4 Convex functions of one variable
   7.5 Differentiable functions of one variable
       7.5.1 The Newton-Raphson method for finding roots
       7.5.2 Critical points and extrema
   7.6 Taylor polynomials
   7.7 Differentiable convex functions
   7.8 Exercises

8. Differentiable functions of several variables
   8.1 Differentiability
       8.1.1 The Newton-Raphson method for several variables
       8.1.2 Local extrema for functions of several variables
   8.2 The chain rule
   8.3 Lagrange multipliers
   8.4 The arithmetic-geometric inequality revisited
   8.5 Exercises

9. Convex functions of several variables
   9.1 Subgradients
   9.2 Convexity and the Hessian
   9.3 Positive definite and positive semidefinite matrices
   9.4 Principal minors and definite matrices
   9.5 The positive semidefinite cone
   9.6 Reduction of symmetric matrices
   9.7 The spectral theorem
   9.8 Quadratic forms
   9.9 Exercises

10. Convex optimization
   10.1 A geometric optimality criterion
   10.2 The Karush-Kuhn-Tucker conditions
   10.3 An example
   10.4 The Lagrangian, saddle points, duality and game theory
   10.5 An interior point method
       10.5.1 Newtonian descent, exact line search and bisection
       10.5.2 Polyhedral constraints
   10.6 Maximizing convex functions over polytopes
       10.6.1 Convex functions are continuous on open subsets
   10.7 Exercises

Appendix A Analysis
   A.1 Measuring distances
   A.2 Sequences
       A.2.1 Supremum and infimum
   A.3 Bounded sequences
   A.4 Closed subsets and open subsets
   A.5 The interior and boundary of a set
   A.6 Continuous functions
   A.7 The main theorem
   A.8 Exercises

Appendix B Linear (in)dependence and the rank of a matrix
   B.1 Linear dependence and linear equations
   B.2 The rank of a matrix
   B.3 Exercises

Bibliography
Index


    Chapter 1

    Fourier-Motzkin elimination

    You probably agree that it is easy to solve the equation

    2x = 4. (1.1)

This is an example of a linear equation in one variable having the unique solution x = 2. Perhaps you will be surprised to learn that there is essentially no difference between solving a simple equation like (1.1) and the more complicated system

    2x+ y + z = 7

    x+ 2y + z = 8

    x+ y + 2z = 9

    (1.2)

of linear equations in x, y and z. Using the first equation 2x + y + z = 7 we solve for x and get

x = (7 − y − z)/2.     (1.3)

This may be substituted into the remaining two equations in (1.2) and

    we get the simpler system

    3y + z = 9

    y + 3z = 11

of linear equations in y and z. Again using the first equation in this system we get

y = (9 − z)/3     (1.4)

ending up with the simple equation 8z = 24. This is an equation of the type in (1.1) giving z = 3. Now z = 3 gives y = 2 using (1.4). Finally y = 2 and z = 3 gives x = 1 using (1.3).
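For readers who like to experiment, the computation above can be checked with a few lines of Python/NumPy (an illustrative sketch, not part of the text):

    import numpy as np

    # Coefficient matrix and right-hand side of the system (1.2).
    A = np.array([[2.0, 1.0, 1.0],
                  [1.0, 2.0, 1.0],
                  [1.0, 1.0, 2.0]])
    b = np.array([7.0, 8.0, 9.0])

    # Gaussian elimination (as performed internally by numpy.linalg.solve)
    # recovers the unique solution x = 1, y = 2, z = 3.
    print(np.linalg.solve(A, b))   # [1. 2. 3.]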


Figure 1.1: Isaac Newton (1642–1727). English mathematician.

Solving a seemingly complicated system of linear equations like (1.2) is really no more difficult than solving the simple equation (1.1). One of the world's greatest scientists, Isaac Newton, found it worthwhile to record this method in 1720 with the words

And you are to know, that by each Æquation one unknown Quantity may be taken away, and consequently, when there are as many Æquations and unknown Quantities, all at length may be reduc'd into one, in which there shall be only one Quantity unknown.

Figure 1.2: Carl Friedrich Gauss (1777–1855). German mathematician.

During the computation of the orbit of the asteroid Pallas around 1810, Gauss encountered the need for solving linear equations related to his famous least squares method. If you spend a little time deciphering the Latin in Gauss's original writings (see Figure 1.3), you will see how elimination appears naturally towards the end of the page. In spite of Newton's explicit description several years before Gauss was born, this procedure is now known as Gaussian elimination (see [Grcar (2011)] for more on the fascinating history of Gaussian elimination).


Figure 1.3: Gauss's encounter with (Gaussian) elimination (published in 1810) for use in the least squares method in computing the orbit of the asteroid Pallas. Notice that the quantity he introduces towards the end of the page is the sum of squares to be minimized.

    1.1 Linear inequalities

Inequalities may be viewed as a generalization of equations, since for two numbers a and b, a = b if and only if a ≤ b and a ≥ b. We will describe


Figure 1.4: Joseph Fourier (1768–1830). French mathematician.

a clever algorithm going back to Fourier for systematically solving systems of linear inequalities. Fourier had a rather concrete problem in mind when he presented his note [Fourier (1826)] (see Figure 1.6).

The algorithm itself is very similar to Gaussian elimination except that we are facing two types of inequalities (≤) and (≥) instead of just one equality (=). First consider the simplest case of linear inequalities in just one variable by way of the example

2x + 1 ≤ 7
3x − 2 ≤ 4
−x + 2 ≤ 3
−x ≤ 0

(1.5)

of inequalities in x. This can be rewritten to

x ≤ 3
x ≤ 2
−1 ≤ x
0 ≤ x

and therefore

x ≤ min{2, 3} = 2
max{−1, 0} = 0 ≤ x

or simply 0 ≤ x ≤ 2. Here the fundamental difference between linear equations and linear inequalities is apparent. Multiplying by −1 leaves = invariant, whereas ≤ changes into ≥.


    Consider now the system

x ≥ 0
x + 2y ≤ 6
x + y ≥ 2
x − y ≤ 3
y ≥ 0

    (1.6)

of inequalities in two variables x and y.

Perhaps the most straightforward way of approaching (1.6) is through

    a sketch. The bounding lines are

    x = 0

    x + 2y = 6

    x + y = 2

x − y = 3
y = 0.

    (1.7)

For each line we pick a point to decide which half plane to shade, e.g., we need to shade below the line x + 2y = 6, since the corresponding inequality is x + 2y ≤ 6 and (for example) 0 + 2 · 0 < 6. The intersection of these half planes is sketched as the shaded area in Figure 1.5.

    Figure 1.5: Sketch of the solutions in (1.6).

We are aiming for a more effective way of representing the solutions. Our sketching techniques are not of much use solving, for example, 17 linear inequalities in 12 unknowns. In order to attack (1.6) algebraically, we first record the following result strongly related to Fourier's problem in Figure 1.6.


Proposition 1.1. Let α1, . . . , αr, β1, . . . , βs ∈ R. Then

max{α1, . . . , αr} ≤ min{β1, . . . , βs}

if and only if αi ≤ βj for every i, j with 1 ≤ i ≤ r and 1 ≤ j ≤ s:

α1 ≤ β1, . . . , α1 ≤ βs
        ⋮
αr ≤ β1, . . . , αr ≤ βs.

Proof. If max{α1, . . . , αr} ≤ min{β1, . . . , βs}, then

αi ≤ max{α1, . . . , αr} ≤ min{β1, . . . , βs} ≤ βj

for every 1 ≤ i ≤ r and 1 ≤ j ≤ s. On the other hand let 1 ≤ i0 ≤ r and 1 ≤ j0 ≤ s be such that

αi0 = max{α1, . . . , αr}
βj0 = min{β1, . . . , βs}.

If αi ≤ βj for every 1 ≤ i ≤ r and 1 ≤ j ≤ s, then αi0 ≤ βj0. Therefore

max{α1, . . . , αr} ≤ min{β1, . . . , βs}.

Inspired by Gaussian elimination we will attempt to isolate and eliminate x. The key point here is that there exists x solving the two inequalities

a ≤ x and x ≤ b if and only if a ≤ b,

where a and b are real numbers. With this in mind, we rewrite (1.6) to

0 ≤ x
x ≤ 6 − 2y
2 − y ≤ x
x ≤ 3 + y
y ≥ 0.

Just like in one variable, this system can be reduced to

x ≤ min{6 − 2y, 3 + y}
max{0, 2 − y} ≤ x
y ≥ 0.     (1.8)


Figure 1.6: The first page of Fourier's note [Fourier (1826)]. Notice the specific problem he is describing. In modern parlance it amounts to finding all x, y, z such that x + y + z = 1 and max{x, y, z} ≤ (1 + r) min{x, y, z} for a fixed r ≥ 0.

Therefore we can eliminate x from (1.8) and deduce that

max{0, 2 − y} ≤ min{6 − 2y, 3 + y}
y ≥ 0     (1.9)

    is solvable in y if and only if (1.8) is solvable in x and y. Now Proposition 1.1


    shows that (1.9) is equivalent to the inequalities

0 ≤ 6 − 2y
0 ≤ 3 + y
2 − y ≤ 6 − 2y
2 − y ≤ 3 + y
0 ≤ y

in the variable y. These inequalities can be solved just like we solved (1.5) and may be reduced to the two inequalities

0 ≤ y ≤ 3.

We have proved that two numbers x and y solve the system (1.6) if and only if

0 ≤ y ≤ 3
max{0, 2 − y} ≤ x ≤ min{6 − 2y, 3 + y}.

If you phrase things a bit more geometrically, the projection of the solutions to (1.6) on the y-axis is the interval [0, 3]. In other words, if x and y solve (1.6), then y ∈ [0, 3] and for a fixed y ∈ [0, 3], x and y solve (1.6) provided that max{0, 2 − y} ≤ x ≤ min{6 − 2y, 3 + y}.

    1.2 Linear optimization using elimination

The elimination method outlined here can be used in solving the problem of maximizing a linear function subject to constraints consisting of linear inequalities (see also Exercises 1.5 and 1.9). Such linear optimization problems are excellent models for many practical problems and are usually solved with the more advanced simplex algorithm, which we will explain later in §5.4. The following example illustrates how elimination is used by adjoining an extra variable for the linear function.

    Example 1.2. Find the maximal value of x+ y subject to the constraints

x + 2y ≤ 3
2x + y ≤ 3
x ≥ 0
y ≥ 0.


Here the trick is to introduce an extra variable z and then find the maximal z, such that the inequalities (and the one equation)

x + y = z
x + 2y ≤ 3
2x + y ≤ 3
x ≥ 0
y ≥ 0

have a solution. First we eliminate x by substituting x = z − y into the inequalities:

z + y ≤ 3
2z − y ≤ 3
z − y ≥ 0
y ≥ 0.

Preparing for elimination of y we write

y ≤ 3 − z
2z − 3 ≤ y
y ≤ z
0 ≤ y.

Therefore

max{0, 2z − 3} ≤ y ≤ min{3 − z, z}     (1.10)

and Proposition 1.1 applies to give the inequalities

0 ≤ 3 − z
0 ≤ z
2z − 3 ≤ 3 − z
2z − 3 ≤ z

with solution 0 ≤ z ≤ 2. Therefore the maximal value of z = x + y is 2. You can obtain a solution (x, y) to the linear optimization problem by first inserting z = 2 into (1.10) to get y and then inserting z and y into x = z − y to get x. This gives the unique optimum (x, y) = (1, 1).

    Now we are ready to enter into the general setting.

    1.3 Polyhedra

    A linear inequality in n variables x1, . . . , xn is an inequality of the form

a1x1 + · · · + anxn ≤ b,


where a1, . . . , an, b ∈ R. By Rn we denote the set of column vectors with n entries (n × 1 matrices). For typographical reasons we will sometimes let (x1, . . . , xn) refer to the column vector in Rn with entries x1, . . . , xn.

    Definition 1.3. The subset

P = { (x1, . . . , xn) ∈ Rn | a11x1 + · · · + a1nxn ≤ b1, . . . , am1x1 + · · · + amnxn ≤ bm } ⊆ Rn

of solutions to a system

a11x1 + · · · + a1nxn ≤ b1
        ⋮
am1x1 + · · · + amnxn ≤ bm

of finitely many linear inequalities (here aij and bi are real numbers) is called a polyhedron.

Notation 1.4. For vectors u = (u1, . . . , un), v = (v1, . . . , vn) ∈ Rn we introduce the notation

u ≤ v ⟺ u1 ≤ v1 and . . . and un ≤ vn.

With this convention the polyhedron in Definition 1.3 is expressed more economically as

P = {x ∈ Rn | Ax ≤ b},

where A is the m × n matrix and b the vector in Rm given by

A =
a11 · · · a1n
 ⋮   ⋱   ⋮
am1 · · · amn

and b = (b1, . . . , bm).

Example 1.5. For u = (1, 2) and v = (2, 3), u ≤ v, whereas neither u ≤ v nor v ≤ u hold for u = (1, 0) and v = (0, 1). The polyhedron in (1.6) can be written as

P = { (x, y) ∈ R2 | A(x, y) ≤ b },  where

A =
−1   0
 1   2
−1  −1
 1  −1
 0  −1

and b = (0, 6, −2, 3, 0).


In modern mathematical terms our computations can be used in proving the main result (Theorem 1.6) in this chapter that the projection of a polyhedron is a polyhedron (see Figure 1.7). This seemingly innocuous result has rather profound consequences. The proof may appear a bit technical at first, but it is simply a formalization of the concrete computations in §1.1.

The elimination method for linear inequalities in §1.1 is called Fourier-Motzkin elimination. Not knowing the classical paper by Fourier, Motzkin¹ rediscovered it in his thesis Beiträge zur Theorie der linearen Ungleichungen supervised by Ostrowski² in Basel, 1933.

    Figure 1.7: Projection to R2 of a polyhedron in R3.

Theorem 1.6. Consider the projection π : Rn → Rn−1 given by

π(x1, . . . , xn) = (x2, . . . , xn).

If P ⊆ Rn is a polyhedron, then

π(P) = {(x2, . . . , xn) | ∃ x1 ∈ R : (x1, x2, . . . , xn) ∈ P} ⊆ Rn−1

is a polyhedron.

¹Theodore Samuel Motzkin (1908–1970). Israeli-American mathematician.
²Alexander Markowich Ostrowski (1893–1986). Russian-Swiss mathematician.

Proof. Suppose that P is the set of solutions to

a11x1 + · · · + a1nxn ≤ b1
        ⋮
am1x1 + · · · + amnxn ≤ bm.

We partition these m inequalities according to the sign of ai1:

G = {i | ai1 > 0}
Z = {i | ai1 = 0}
L = {i | ai1 < 0}.

Inequality number i reduces to

x1 ≤ a′i2x2 + · · · + a′inxn + b′i,

if i ∈ G, and to

a′j2x2 + · · · + a′jnxn + b′j ≤ x1,

if j ∈ L, where a′ik = −aik/ai1 and b′i = bi/ai1 for k = 2, . . . , n. So the inequalities in L and G are equivalent to

max{a′i2x2 + · · · + a′inxn + b′i | i ∈ L} ≤ x1 ≤ min{a′j2x2 + · · · + a′jnxn + b′j | j ∈ G}

by Proposition 1.1. By definition, (x2, . . . , xn) ∈ π(P) if and only if (x2, . . . , xn) satisfies the inequalities in Z and

max{a′i2x2 + · · · + a′inxn + b′i | i ∈ L} ≤ min{a′j2x2 + · · · + a′jnxn + b′j | j ∈ G}.

Proposition 1.1 shows that this inequality is equivalent to the |L||G| inequalities in x2, . . . , xn consisting of

a′i2x2 + · · · + a′inxn + b′i ≤ a′j2x2 + · · · + a′jnxn + b′j

or rather

(a′i2 − a′j2)x2 + · · · + (a′in − a′jn)xn ≤ b′j − b′i

where i ∈ L and j ∈ G. Adding the inequalities in Z, where x1 is not present, it follows that π(P) is the set of solutions to these |L||G| + |Z| linear inequalities in x2, . . . , xn. Therefore π(P) is a polyhedron.
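The proof is constructive and translates directly into an algorithm. The following Python/NumPy sketch of a single elimination step mirrors the partition into G, Z and L above; the function name and the worked example (the system (1.6)) are illustrative choices, not the book's:

    import numpy as np

    def fourier_motzkin_step(A, b):
        # Eliminate x1 from A x <= b and return (A2, b2) so that A2 y <= b2
        # describes the projection onto y = (x2, ..., xn), as in Theorem 1.6.
        A = np.asarray(A, dtype=float)
        b = np.asarray(b, dtype=float)
        G = [i for i in range(len(b)) if A[i, 0] > 0]    # upper bounds on x1
        Z = [i for i in range(len(b)) if A[i, 0] == 0]   # x1 not present
        L = [i for i in range(len(b)) if A[i, 0] < 0]    # lower bounds on x1
        rows = [A[i, 1:] for i in Z]
        rhs = [b[i] for i in Z]
        for i in L:
            for j in G:
                # normalized bounds: a'_i . y + b'_i <= x1 <= a'_j . y + b'_j
                ai, bi = -A[i, 1:] / A[i, 0], b[i] / A[i, 0]
                aj, bj = -A[j, 1:] / A[j, 0], b[j] / A[j, 0]
                rows.append(ai - aj)     # (a'_i - a'_j) . y <= b'_j - b'_i
                rhs.append(bj - bi)
        return np.array(rows), np.array(rhs)

    # The system (1.6) written as A(x, y) <= b; eliminating x leaves
    # inequalities in y equivalent to 0 <= y <= 3 (some redundant).
    A = [[-1, 0], [1, 2], [-1, -1], [1, -1], [0, -1]]
    b = [0, 6, -2, 3, 0]
    A2, b2 = fourier_motzkin_step(A, b)
    print(np.column_stack([A2, b2]))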


    1.4 Exercises

    Exercise 1.1. Sketch the set of solutions to the system

    2x + y 23x + y 9x + 2y 4

    y 0(1.11)

of linear inequalities. Carry out the elimination procedure for (1.11) as illustrated in §1.1.

    Exercise 1.2. Let

    P =

    (x, y, z) R3x y z 03x y z 1x + 3y z 2x y + 3z 3

and π : R3 → R2 be given by π(x, y, z) = (y, z).

(i) Compute π(P) as a polyhedron, i.e., as the solutions to a set of linear inequalities in y and z.
(ii) Compute the image of P under the map R3 → R given by (x, y, z) ↦ x.
(iii) How many integral points does P contain, i.e., how many elements are in the set P ∩ Z3?

Exercise 1.3. Find all solutions x, y, z ∈ Z to the linear inequalities

    x + y z 0 y + z 0

    z 0x z 1

    y 1z 1

    by using Fourier-Motzkin elimination.

    Exercise 1.4. Does the system

    2x 3y + z 2x + 3y + z 3

    2x 3y + z 2x 3y 3z 12x y + 3z 3


of linear inequalities have a solution x, y, z ∈ R?

Exercise 1.5. Let P ⊆ Rn be a polyhedron and c ∈ Rn. Define the polyhedron P̃ ⊆ Rn+1 by

P̃ = {(x, z) ∈ Rn+1 | cᵗx = z, x ∈ P, z ∈ R}.

(i) How does this setup relate to Example 1.2?
(ii) Show how projection onto the z-coordinate (and Fourier-Motzkin elimination) in P̃ can be used to solve the linear optimization problem of finding x ∈ P, such that cᵗx is minimal (or proving that such an x does not exist).

(iii) Let P denote the polyhedron from Exercise 1.2. You can see that

(0, 0, 0), (1, 1/2, 1/2) ∈ P

have values 0 and 1 on their first coordinates, but what is the minimal first coordinate of a point in P?

Exercise 1.6. Solve the problem appearing in Fourier's article (Figure 1.6) for r = 1 using Fourier-Motzkin elimination.

Exercise 1.7. Let P denote the set of (x, y, z) ∈ R3 satisfying

    2x + y + z 4x 1

    y 2z 3

    x 2y + z 12x + 2y z 5 .

(i) Prove that P is bounded.
(ii) Find (x, y, z) ∈ P with z maximal. Is such a point unique?

Exercise 1.8. A vitamin pill P is produced using two ingredients M1 and M2. The pill needs to satisfy four constraints for the vital vitamins V1 and V2. It must contain at least 6 milligrams and at most 15 milligrams of V1 and at least 5 milligrams and at most 12 milligrams of V2. The ingredient M1 contains 3 milligrams of V1 and 2 milligrams of V2 per gram. The ingredient M2 contains 2 milligrams of V1 and 3 milligrams of V2 per gram:


    V1 V2

    M1 3 2

    M2 2 3

Let x denote the amount of M1 and y the amount of M2 (measured in grams) in the production of a vitamin pill. Write down a system of linear inequalities in x and y describing the constraints above.

We want a vitamin pill of minimal weight satisfying the constraints. How many grams of M1 and M2 should we mix? Describe how Fourier-Motzkin elimination can be used in solving this problem.

Exercise 1.9. Use Fourier-Motzkin elimination to compute the minimal value of

    x1 + 2x2 + 3x3,

    when x1, x2, x3 satisfy

    x1 2x2 + x3 = 4x1 + 3x2 = 5

and x1 ≥ 0, x2 ≥ 0, x3 ≥ 0.


    Chapter 2

    Affine subspaces

    A polyhedron is the set of solutions to a system of linear inequalities. Sets

{x ∈ Rd | Ax = b}     (2.1)

of solutions to a system of linear equations are polyhedra of the simplest kind. Here A is an m × d matrix and b ∈ Rm corresponding to a system of m linear equations with d unknowns.

    Recall that a line in Rd is a subset (see Figure 2.1) of the form

{x + tα | t ∈ R},

where x ∈ Rd is a vector and α ∈ Rd \ {0} a non-zero directional vector. Two distinct points u, v ∈ Rd are contained in the unique line

L = {(1 − t)u + tv | t ∈ R}.     (2.2)

Here a directional vector for L is v − u. It is not too difficult to check that sets of solutions to systems of linear equations such as (2.1) contain the line between any two of their points. Subsets with this intrinsic geometric property are called affine subspaces.

The purpose of this chapter is to give an account of affine subspaces based on systems of linear equations. The difference between affine subspaces and the usual subspaces of linear algebra is that the former do not necessarily contain the zero vector. Affine subspaces also enter into the important definition of the dimension of an arbitrary subset of Rd so that a point has dimension zero, a line dimension one etc.


Figure 2.1: Sketch of the line {x + tα | t ∈ R} in R2 with x and α marked.

    2.1 Definition and basics

We begin by stating some basic properties of affine subspaces. First a motivating example.

Example 2.1. Consider the three points v1 = (2, 1, 0), v2 = (1, 0, 1) and v3 = (0, 4, −1) ∈ R3. You can check through (2.2), that v3 does not lie on the unique line through v1 and v2, hence there is no line containing all three points. Therefore they span a unique plane H in R3. This plane is given parametrically as

v1 + t1(v2 − v1) + t2(v3 − v1) = (1 − t1 − t2)v1 + t1v2 + t2v3     (2.3)

for t1, t2 ∈ R (see Figure 2.2). In other words, H = v1 + W = {v1 + v | v ∈ W}, where W is the linear subspace of R3 spanned by the vectors v2 − v1 and v3 − v1. With numbers inserted this reads

H = {(2, 1, 0) + t1(−1, −1, 1) + t2(−2, 3, −1) | t1, t2 ∈ R}.

By finding a non-zero solution (α1, α2, α3) to the system

−α1 − α2 + α3 = 0
−2α1 + 3α2 − α3 = 0

of linear equations, you can check that W = {(x, y, z) ∈ R3 | 2x + 3y + 5z = 0}. Therefore

H = {(x, y, z) ∈ R3 | 2x + 3y + 5z = 7}     (2.4)

and the plane H is presented as the set of solutions to a linear equation.
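A quick numerical check of Example 2.1 (an illustrative sketch, not part of the text): the normal vector of H is a cross product of the two directional vectors, and the right-hand side is obtained by evaluating at v1.

    import numpy as np

    v1 = np.array([2., 1., 0.])
    v2 = np.array([1., 0., 1.])
    v3 = np.array([0., 4., -1.])

    # A normal of the plane spanned by v2 - v1 and v3 - v1.
    n = np.cross(v2 - v1, v3 - v1)
    c = n @ v1
    print(n, c)                       # a multiple of (2, 3, 5) and of 7
    print(n @ v2 == c, n @ v3 == c)   # True True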


Figure 2.2: Sketch of the plane H ⊆ R3 marking v1, the directional vectors v2 − v1, v3 − v1 and a normal vector proportional to (2, 3, 5).

Definition 2.2. A non-empty subset M ⊆ Rd is called an affine subspace if

(1 − t)u + tv ∈ M

for every u, v ∈ M and every t ∈ R. A map f : Rd → Rn is called an affine map if

f((1 − t)u + tv) = (1 − t)f(u) + tf(v)

for every u, v ∈ Rd and every t ∈ R.

The identity (2.3) in Example 2.1 points to the result below.

Lemma 2.3. Let M be an affine subspace of Rd and v1, . . . , vm ∈ M. Then

λ1v1 + · · · + λmvm ∈ M

for all real numbers λ1, . . . , λm ∈ R with λ1 + · · · + λm = 1.

Proof. This is proved by induction on m. For m = 2 this is the content of Definition 2.2. For m > 2 we must have 1 − λi ≠ 0 for some i = 1, . . . , m. We may assume that 1 − λm ≠ 0. Then

λ1v1 + · · · + λmvm = (1 − λm) ( λ1/(1 − λm) v1 + · · · + λm−1/(1 − λm) vm−1 ) + λmvm.

Since λ1 + · · · + λm−1 = 1 − λm we are done by induction.


    2.2 The affine hull

Definition 2.4. A linear combination

λ1v1 + · · · + λmvm

of vectors v1, . . . , vm ∈ Rd is called an affine linear combination if

λ1 + · · · + λm = 1.

The affine hull, aff(S), of a subset S ⊆ Rd is the set of all affine linear combinations of elements from S, i.e.,

aff(S) := {λ1v1 + · · · + λmvm | m ≥ 1, v1, . . . , vm ∈ S, λ1 + · · · + λm = 1}.

Proposition 2.5. The affine hull, aff(S), of a subset S ⊆ Rd is an affine subspace. It is the smallest affine subspace containing S.

Proof. Suppose that u, v ∈ aff(S), i.e.,

u = λ1v1 + · · · + λrvr
v = μ1v′1 + · · · + μsv′s

where r, s ≥ 1,

λ1 + · · · + λr = μ1 + · · · + μs = 1

and v1, . . . , vr, v′1, . . . , v′s ∈ S. Then

(1 − α)u + αv = (1 − α)λ1v1 + · · · + (1 − α)λrvr + αμ1v′1 + · · · + αμsv′s

for α ∈ R. This is an affine linear combination, since

(1 − α)λ1 + · · · + (1 − α)λr + αμ1 + · · · + αμs
  = (1 − α)(λ1 + · · · + λr) + α(μ1 + · · · + μs)
  = (1 − α) + α = 1

and aff(S) is an affine subspace according to Definition 2.2. If M is an affine subspace containing S, then M ⊇ aff(S) by Lemma 2.3. This proves that aff(S) is the smallest affine subspace containing S.


    2.3 Affine subspaces and subspaces

It is shown below that affine subspaces are solution sets of systems of linear equations. You may find it helpful to compare the proof with the explicit computations in Example 2.1.

Proposition 2.6. For an affine subspace M ⊆ Rd, W = {u − v | u, v ∈ M} is a subspace and

M = {x0 + w | w ∈ W} =: x0 + W,

for every x0 ∈ M.

A subset M ⊆ Rd is an affine subspace if and only if it is the solution set to a system of linear equations.

If h : Rd → Rn is a linear map and b ∈ Rn, then f(x) = h(x) + b is an affine map. If f : Rd → Rn is an affine map, h(x) = f(x) − f(0) is a linear map and f(x) = h(x) + b with b = f(0).

Proof. For u1, v1, u2, v2 ∈ M,

λ(u1 − v1) + μ(u2 − v2) = (λu1 + μu2 + (1 − λ − μ)v1) − (λv1 + μv2 + (1 − λ − μ)v1)

for λ, μ ∈ R and it follows by Lemma 2.3 that W = {u − v | u, v ∈ M} is a subspace. If x0 ∈ M it follows that x0 + (u − v) ∈ M again by Lemma 2.3. Therefore x0 + W ⊆ M. On the other hand if x ∈ M, then x = x0 + (x − x0) ∈ x0 + W. Therefore M ⊆ x0 + W and M = x0 + W.

If M is the solution set {x ∈ Rd | Ax = b} to a system of linear equations we leave it to the reader to verify that M is an affine subspace. If M on the other hand is only assumed to be an affine subspace, we may write M = x0 + W for x0 ∈ M and W as above. Now represent W as {x ∈ Rd | Ax = 0} for a suitable m × d matrix A with m ≤ d (see Exercise 2.3). Let M′ = {x ∈ Rd | Ax = b} with b = Ax0: if x ∈ W, then A(x0 + x) = b and M ⊆ M′. If z ∈ M′, then z − x0 ∈ W and the identity z = x0 + (z − x0) shows that M′ ⊆ M (compare this with the computations in Example 2.1). Therefore M = M′.

The last part of the proposition is left as an exercise (Exercise 2.9).


    2.4 Affine independence and the dimension of a subset

Definition 2.7. The dimension of an affine subspace M ⊆ Rd is defined as

dim(M) := dim W,

where W is the subspace {u − v | u, v ∈ M} ⊆ Rd (see Proposition 2.6). The dimension of an arbitrary subset S ⊆ Rd is defined as

dim S := dim aff(S).

A finite set S = {v1, . . . , vm} is called affinely independent if dim S = m − 1.

This definition is very intuitive, e.g., two points are affinely independent if they are different, three points are affinely independent if they do not lie on the same line etc. A single point has dimension zero, the affine span of two affinely independent points is a line (dimension one), the affine span of three affinely independent points is a plane (dimension two) etc.

If H ⊆ Rd is an affine subspace of dimension d − 1, then H = V + x0, where x0 ∈ H and V = {x ∈ Rd | αᵗx = 0} for some α ∈ Rd \ {0}, i.e., H = {x ∈ Rd | αᵗx = β} for β = αᵗx0. Such an affine subspace is called an affine hyperplane.

Definition 2.8. If H = {x ∈ Rn | αᵗx = β} is an affine hyperplane, we define

H+ = {x ∈ Rn | αᵗx ≥ β},    H++ = {x ∈ Rn | αᵗx > β},
H− = {x ∈ Rn | αᵗx ≤ β},    H−− = {x ∈ Rn | αᵗx < β}.

The two subsets H− and H+ are called (affine) half spaces.

Proposition 2.9. Let S = {v1, . . . , vm} ⊆ Rd. Then aff(S) = v1 + W, where W is the subspace spanned by v2 − v1, . . . , vm − v1. The following conditions are equivalent.

(1) S is affinely independent.
(2) v2 − v1, . . . , vm − v1 are linearly independent.
(3) The equations

λ1v1 + · · · + λmvm = 0
λ1 + · · · + λm = 0

imply that λ1 = · · · = λm = 0.


(4) The vectors (v1, 1), . . . , (vm, 1) are linearly independent in Rd+1.

Proof. By definition

aff(S) = {λ1v1 + · · · + λmvm | λ1 + · · · + λm = 1}

and {v − v1 | v ∈ aff(S)} is the subspace W of Proposition 2.6 for M = aff(S). Since

(λ1 − 1)v1 + λ2v2 + · · · + λmvm = λ2(v2 − v1) + · · · + λm(vm − v1)

it follows that W is spanned by v2 − v1, . . . , vm − v1 as claimed. Therefore (1) is equivalent to (2). To prove (2) ⇒ (3) we write

λ1v1 + · · · + λmvm = λ2(v2 − v1) + · · · + λm(vm − v1)

using that λ1 + · · · + λm = 0. By the linear independence of v2 − v1, . . . , vm − v1 it follows that λ2 = · · · = λm = 0 and therefore also λ1 = 0. The claim (4) is an exact restatement of (3). Assuming (3) holds, let us finish the proof by showing that (2) holds. Suppose that

μ1(v2 − v1) + · · · + μm−1(vm − v1) = 0

for μ1, . . . , μm−1 ∈ R. We may rewrite this as

λv1 + μ1v2 + · · · + μm−1vm

with λ = −μ1 − · · · − μm−1. Now (3) implies μ1 = · · · = μm−1 = 0 showing that v2 − v1, . . . , vm − v1 are linearly independent vectors.
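Condition (4) gives a mechanical test for affine independence. A small illustrative NumPy sketch (not from the text): append a coordinate 1 to each point and compare the rank of the resulting matrix with the number of points.

    import numpy as np

    def affinely_independent(points):
        # Condition (4) of Proposition 2.9: the points v_1, ..., v_m are
        # affinely independent iff the lifted vectors (v_i, 1) are linearly
        # independent in R^(d+1).
        V = np.array(points, dtype=float)
        lifted = np.hstack([V, np.ones((len(V), 1))])
        return np.linalg.matrix_rank(lifted) == len(V)

    print(affinely_independent([(0, 0), (1, 0), (0, 1)]))   # True: a genuine triangle
    print(affinely_independent([(0, 0), (1, 1), (2, 2)]))   # False: three collinear points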

    2.5 Exercises

Exercise 2.1. Let u, v ∈ Rd with u ≠ v. Prove that

L = {(1 − t)u + tv | t ∈ R}

is a line in Rd containing u and v. Prove also that if M is a line in Rd such that u, v ∈ M, then M = L.


Exercise 2.2. Let u = (1, 1, 1) and v = (1, 2, 3) be vectors in R3. Show that u and v are linearly independent and find α ∈ R3 with

W = {x ∈ R3 | αᵗx = 0},

where W is the subspace spanned by u and v.

Exercise 2.3. Let W ⊆ Rd be a subspace and suppose that v1, . . . , vr is a basis of W. Prove that

W⊥ = {u ∈ Rd | uᵗv1 = · · · = uᵗvr = 0} ⊆ Rd

is a subspace. Let u1, . . . , us be a basis of W⊥ and A the s × d matrix with these vectors as rows. Show that s = d − r and

W = {x ∈ Rd | Ax = 0}.

Exercise 2.4. Prove that {v ∈ Rd | Av = b} is an affine subspace of Rd, where A is an m × d matrix and b ∈ Rm.

Exercise 2.5. Let M be an affine subspace. Prove that {u − v | u, v ∈ M} is a subspace.

Exercise 2.6. Can you have two linearly independent vectors in R? What about two affinely independent vectors?

Exercise 2.7. Decide if (2, 1), (3, 2) and (5, 5) are on the same line in R2 applying Proposition 2.9.

Exercise 2.8. Let S = {(1, 1, 1), (2, 3, 4), (1, 2, 3), (2, 1, 0)} ⊆ R3. Compute the smallest affine subspace containing S.

Exercise 2.9. Prove that f(x) = h(x) + b is an affine map if h : Rd → Rn is a linear map and b ∈ Rn. Prove that h(x) = f(x) − f(0) is a linear map if f : Rd → Rn is an affine map.

Exercise 2.10. Prove that you can have no more than d + 1 affinely independent vectors in Rd.

Exercise 2.11. Let v0, . . . , vd be affinely independent points in Rd. Prove that

f(x) = (λ0, λ1, . . . , λd)

is a well defined affine map f : Rd → Rd+1, where

x = λ0v0 + · · · + λdvd

with λ0 + · · · + λd = 1.


Exercise 2.12. Prove that a non-empty open subset U ⊆ Rd has dimension dim U = d. Show that a subset S ⊆ Rd with dim S = d contains a non-empty open subset.


    Chapter 3

    Convex subsets

An affine subspace M is the set of solutions to a system of linear equations and contains the line {(1 − t)x + ty | t ∈ R} between any two of its points x, y ∈ M. A polyhedron P is the set of solutions to a system of linear inequalities and is only guaranteed to contain the line segment

{(1 − t)x + ty | 0 ≤ t ≤ 1}

between any two of its points x, y ∈ P (see Figure 3.1 and Exercise 3.1).


Figure 3.1: A polyhedron P ⊆ R2 with two points x, y ∈ P, the line segment between x and y and the line through x and y.

In this chapter we will go beyond polyhedra and study subsets of Rd with the property that they contain the line segment between any two of their points. Subsets with this nowhere concave property have appeared up through the history of mathematics at least since the time of Archimedes.¹

¹Archimedes of Syracuse (287 BC to 212 BC). Greek mathematician.


The first formal definition appeared around 1896 in Minkowski's famous monograph Geometrie der Zahlen (see [Kjeldsen (2008)]).

    3.1 Basics

The following simple definition (compare with Definition 2.2) is truly fundamental in modern mathematics.

Definition 3.1. A subset C ⊆ Rd is called convex if it contains the line segment between any two of its points:

(1 − t)u + tv ∈ C

for every u, v ∈ C and every t ∈ R with 0 ≤ t ≤ 1.


    Figure 3.2: (a) A non-convex subset of R2. (b) A convex subset of R2.

An intersection of convex subsets is a convex subset. Since an affine half space is a convex subset, it follows that polyhedra are convex subsets. The union of convex subsets does not have to be convex.


    Minkowski sum, dilation and the polar of a subset

A fundamental operation is setwise addition of convex subsets. The sum A + B of two subsets A and B of Rd is defined as the subset

A + B := {u + v | u ∈ A, v ∈ B} ⊆ Rd.

If A and B are convex subsets, A + B is a convex subset and it is called the Minkowski sum of A and B. This rather simple operation on convex subsets turns out to be very important.


    Figure 3.3: Minkowski sum of a triangle and a disc.

Another fundamental operation is dilation of a convex subset C ⊆ Rd by λ ∈ R. This is the convex subset

λC := {λx | x ∈ C} ⊆ Rd.


Figure 3.4: Dilation of a convex subset C with λ = 2. The marked point indicates (0, 0) in R2.

Let S ⊆ Rd be any subset. Then

S° := {α ∈ Rd | αᵗx ≤ 1, for every x ∈ S} ⊆ Rd

is called the polar of S (see Figure 3.5). One can prove that S° is a convex subset of Rd and that 0 ∈ S°.


Figure 3.5: A convex subset and its polar. The marked points indicate (0, 0) in R2.

    3.2 The convex hull

In complete analogy with Lemma 2.3 we have the following (the proof is the same).

Lemma 3.2. Let C be a convex subset of Rd and v1, . . . , vm ∈ C. Then

λ1v1 + · · · + λmvm ∈ C

if λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1.

A linear combination λ1v1 + · · · + λmvm of vectors v1, . . . , vm ∈ Rd is called convex if λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1.

Definition 3.3. The convex hull of a subset S ⊆ Rd is the set of all convex linear combinations of elements from S, i.e.,

conv(S) := {λ1v1 + · · · + λmvm | m ≥ 1, v1, . . . , vm ∈ S, λ1, . . . , λm ≥ 0 and λ1 + · · · + λm = 1}.

If S is a finite subset, conv(S) is called a polytope. If S ⊆ R2 is a finite subset, we call conv(S) a (convex) polygon.

Proposition 3.4. The convex hull, conv(S), of a subset S ⊆ Rd is a convex subset. It is the smallest convex subset containing S.

The polar set introduced previously has a nice interpretation for the convex hull of a finite set of points.

Proposition 3.5. Let C = conv({v1, . . . , vm}), where v1, . . . , vm ∈ Rd. Then the polar of C is the polyhedron given by

C° = {α ∈ Rd | αᵗv1 ≤ 1, . . . , αᵗvm ≤ 1}.


Figure 3.6: The convex hull of 100 random points in R2. In this example, 90 of the 100 points lie in the convex hull of the 10 "extreme" points. Finding the extreme points is a non-trivial computational task.
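For experiments like the one in Figure 3.6, an off-the-shelf convex hull routine can be used; a minimal illustrative sketch with scipy.spatial.ConvexHull (not part of the text):

    import numpy as np
    from scipy.spatial import ConvexHull

    rng = np.random.default_rng(0)
    points = rng.random((100, 2))        # 100 random points in the unit square

    hull = ConvexHull(points)
    print(len(hull.vertices))            # number of "extreme" points
    print(points[hull.vertices])         # the extreme points, counterclockwise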

Proof. In Exercise 3.9 you are asked to prove this and compute a polar convex hull based on Example 3.6.

Example 3.6. To get a feeling for convex hulls, it is important to play around with (lots of) examples in the plane. In Figure 3.7 you see a finite subset of four planar points and their convex hull.

In suitable coordinates the four points are

(0, 1), (3, 2), (2, 1) and (4, 0).

You can check that the third point is contained in the convex hull of the other points, since

(2, 1) = 3/7 (0, 1) + 2/7 (3, 2) + 2/7 (4, 0).

You may wonder where the coefficients 3/7, 2/7 and 2/7 came from. In fact, the coefficients λ1, λ2 and λ3 in

(2, 1) = λ1 (0, 1) + λ2 (3, 2) + λ3 (4, 0)

coming from Definition 3.3, must solve the system

3λ2 + 4λ3 = 2
λ1 + 2λ2 = 1
λ1 + λ2 + λ3 = 1

of linear equations. This makes them very explicit and computable.
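A tiny NumPy check of this computation (an illustrative sketch, not part of the text):

    import numpy as np

    # Columns are the points (0,1), (3,2), (4,0); the last row forces the
    # coefficients to sum to 1 (an affine, here in fact convex, combination).
    A = np.array([[0., 3., 4.],
                  [1., 2., 0.],
                  [1., 1., 1.]])
    b = np.array([2., 1., 1.])
    print(np.linalg.solve(A, b))   # [0.4286 0.2857 0.2857] = (3/7, 2/7, 2/7)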

    Figure 3.7: Four points (left). The convex hull (right).

One of the points in Figure 3.7 can be omitted without altering the convex hull. This point seems not to be a vertex or a corner. We will give a quite general definition that captures the corners or vertices of a convex subset.

    3.3 Faces of convex subsets

A convex subset C is structured around its faces, which are special convex subsets of C. Informally the zero-dimensional faces are the vertices of C, the one-dimensional faces the extreme line segments of C etc. The vertices (called extreme points) are the points not located in the interior of a line segment in C. Only one of the marked points in the triangle to the right in Figure 3.7 fails to be extreme. The extreme points are the vertices and the one-dimensional faces the edges of the triangle. We will prove later (see Theorem 6.12) that a compact convex subset always is the convex hull of its extreme points. The precise definition of a face in a convex subset is as follows.

Definition 3.7. Let C ⊆ Rd be a convex subset. A subset F ⊆ C is called a face of C if F is convex and for every x, y ∈ C and 0 < λ < 1,

(1 − λ)x + λy ∈ F

implies that x, y ∈ F.


An important type of faces of a convex subset is given by the following result.

Lemma 3.8. Let C ⊆ Rd be a convex subset and α ∈ Rd. Then

F = {z ∈ C | αᵗz ≤ αᵗx, for every x ∈ C}     (3.1)

is a face of C.

Proof. If z1, z2 ∈ F, then αᵗz1 ≤ αᵗz2 and αᵗz2 ≤ αᵗz1. Therefore αᵗz1 = αᵗz2 and F = {x ∈ C | αᵗx = β} with β = αᵗz1, which shows that F is a convex subset as the intersection of C with an affine hyperplane.

Suppose that x, y ∈ C and z := (1 − λ)x + λy ∈ F for 0 < λ < 1, i.e., αᵗz ≤ αᵗv for every v ∈ C, in particular αᵗz ≤ αᵗx and αᵗz ≤ αᵗy. Let us assume that αᵗx ≤ αᵗy. Then

αᵗx = (1 − λ)αᵗx + λαᵗx ≤ (1 − λ)αᵗx + λαᵗy = αᵗz.

This implies x ∈ F. Since we must have αᵗy = αᵗx, it also follows that y ∈ F.

A face F ⊆ C of a convex subset C ⊆ Rd is called exposed if it is given as in (3.1) for some α ∈ Rd. A face of a convex subset does not have to be exposed (see Exercise 3.11). A zero-dimensional face (a point) is called an extreme point. A point z ∈ C is extreme if for every x, y ∈ C

z ∈ conv({x, y})

if and only if z = x or z = y. The set of extreme points in C is denoted ext(C).


Figure 3.8: (a) A line in the plane does not have extreme points. (b) A disc has infinitely many extreme points but no one-dimensional faces. (c) A polygon with nine extreme points and nine one-dimensional faces.


The notion of an extreme point identifies the non-redundant points in a convex hull (see Figure 3.6). Formally, one can show (see Exercise 3.13) by induction that if z is an extreme point of conv({x1, . . . , xn}), then

    z = x1 or z = x2 or . . . or z = xn.

    Interlude: Integral points in convex subsets

This section is meant to be a leisurely break from the formal theory and a pointer to some of the rich and beautiful mathematics that surrounds convexity.

A convex subset does not have to be a polyhedron. An example of a non-polyhedral convex subset is a planar disc (see Figure 3.9).

Figure 3.9: (left) A planar disc is not a polyhedron. (right) The integral points of a planar disc.

Non-polyhedral convex subsets are both interesting and difficult. For one thing, counting the number N(r) of integral points inside a circle (see Figure 3.9) of radius r is a classical and notoriously difficult problem studied by Gauss around 1834. Gauss studied the error term E(r) = |N(r) − πr²| and proved that E(r) ≤ 2√2 πr. In another example, a four-dimensional ball

B(N) = {(x1, x2, x3, x4) ∈ R4 | x1² + x2² + x3² + x4² ≤ N} ⊆ R4

is a convex subset. There is a remarkable connection between the prime


factorization of N ∈ N and counting integral points in B(N). Notice that

|B(N) ∩ Z4| − |B(N − 1) ∩ Z4| = |{(x1, x2, x3, x4) ∈ Z4 | x1² + x2² + x3² + x4² = N}|.

An amazing result [Jacobi (1829)] due to Jacobi² about the number of ways of writing an integer as a sum of four squares, says that

|{(x1, x2, x3, x4) ∈ Z4 | x1² + x2² + x3² + x4² = N}| = 8 Σ_{d | N, 4∤d} d.     (3.2)

As a simple check of this remarkable formula you can verify that 4 is a sum of four squares in the following 8 + 16 different ways:

(±2, 0, 0, 0), (0, ±2, 0, 0), (0, 0, ±2, 0), (0, 0, 0, ±2), (±1, ±1, ±1, ±1).
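A brute-force check of (3.2) for small N is easy; the following Python sketch is illustrative and not part of the text:

    from itertools import product

    def r4(N):
        # Number of ways of writing N as an ordered sum of four squares
        # of integers (signs and order count).
        m = int(N ** 0.5) + 1
        return sum(1 for x in product(range(-m, m + 1), repeat=4)
                   if x[0]**2 + x[1]**2 + x[2]**2 + x[3]**2 == N)

    def jacobi(N):
        # Right-hand side of (3.2): 8 times the sum of the divisors of N
        # that are not divisible by 4.
        return 8 * sum(d for d in range(1, N + 1) if N % d == 0 and d % 4 != 0)

    print(r4(4), jacobi(4))                                  # 24 24
    print(all(r4(N) == jacobi(N) for N in range(1, 30)))     # True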

If you consider a number N = pq, which is the product of two odd prime numbers p and q, you can use (3.2) to find p + q by counting integral points in B(N). But if you know N = pq and p + q, then you can retrieve p and q, since

(x − p)(x − q) = x² − (p + q)x + N.

This may sound esoteric, but in essence, it shows that if you can count integral points in the four-dimensional ball effectively, then you can crack the worldwide encryption system RSA (RSA is based on the computational hardness of prime factorization; we are talking 500 digit numbers or more here).

Counting integral points in polyhedra is difficult but much better understood. For example, if $P$ is a convex polygon with $\dim(P) = 2$ and integral vertices, then the number of integral points in $P$ is given by the formula of Pick³ from 1899:

$$|P \cap \mathbb{Z}^2| = \operatorname{Area}(P) + \tfrac{1}{2}B(P) + 1,$$

where $B(P)$ is the number of integral points on the boundary of $P$. You can easily check this with a few examples. Consider for example the convex polygon $P$ in Figure 3.10.

²Carl Gustav Jacobi (1804–1851). German mathematician.
³Georg Alexander Pick (1859–1942). Austrian mathematician.


    Figure 3.10: Integral points in a convex polygon with integral vertices.

By subdivision into triangles it follows that $\operatorname{Area}(P) = \tfrac{55}{2}$. Also, by an easy count we get $B(P) = 7$. Therefore the formula of Pick shows that

$$|P \cap \mathbb{Z}^2| = \tfrac{55}{2} + \tfrac{1}{2}\cdot 7 + 1 = 32.$$

The polygon contains 32 integral points. This can be verified by an explicit count from Figure 3.10.
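You can repeat this check on any convex polygon with integral vertices using a computer. The vertex coordinates of the polygon in Figure 3.10 are not repeated here, so the Python sketch below uses a small made-up triangle instead; the shoelace formula gives the area, gcd's along the edges give $B(P)$, and a brute-force scan counts the lattice points.

```python
from math import gcd

# A made-up example polygon with integral vertices (not the one in Figure 3.10),
# listed counterclockwise.
verts = [(0, 0), (4, 0), (0, 6)]
edges = list(zip(verts, verts[1:] + verts[:1]))

# Area by the shoelace formula.
area = abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in edges)) / 2

# Integral points on the boundary: gcd(|dx|, |dy|) per edge.
B = sum(gcd(abs(x2 - x1), abs(y2 - y1)) for (x1, y1), (x2, y2) in edges)

# Brute-force count of integral points in the (convex, counterclockwise) polygon.
xs, ys = zip(*verts)
count = sum(1 for x in range(min(xs), max(xs) + 1)
              for y in range(min(ys), max(ys) + 1)
              if all((x2 - x1) * (y - y1) - (y2 - y1) * (x - x1) >= 0
                     for (x1, y1), (x2, y2) in edges))

print(area + B / 2 + 1, count)   # Pick's formula and the direct count agree: 19
```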

For a very nice account of counting integral points in polyhedra and beyond look up [de Loera (2005)].

An exceedingly important class of convex subsets is now introduced and the connection to the convex hull is described.

    3.4 Convex cones

A cone in $\mathbb{R}^d$ is a subset $K$ such that $\lambda x \in K$ for every $x \in K$ and every $\lambda \ge 0$. A pointed cone is a cone not containing a line.

Proposition 3.9. A non-empty polyhedron $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ is a cone if and only if $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$. $P$ is a pointed cone if and only if $\operatorname{rk} A = d$.

Proof. If $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$, then $P$ is a cone, since $A(\lambda x) = \lambda(Ax) \le 0$ if $\lambda \ge 0$ and $Ax \le 0$. If $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ is a cone, then $b \ge 0$, since $0 \in P$. Therefore $\{x \in \mathbb{R}^d \mid Ax \le 0\} \subseteq P$. If $x \in P$ and $Ax \not\le 0$, then $a_j^t x > 0$ for some row vector $a_j$ of $A$ and $a_j^t(\lambda x) = \lambda(a_j^t x) \le b_j$ for every $\lambda \ge 0$, since $P$ is a cone. This contradicts that $a_j^t z$ is bounded above by $b_j$ for $z \in P$. Therefore $P = \{x \in \mathbb{R}^d \mid Ax \le 0\}$.

If $\operatorname{rk} A < d$, there exists a nonzero $\gamma \in \mathbb{R}^d$ with $A\gamma = 0$. This shows that $P$ contains the line $\{t\gamma \mid t \in \mathbb{R}\}$. On the other hand, suppose that $P$ contains the line $L = \{\alpha + t\gamma \mid t \in \mathbb{R}\}$ with $\gamma \in \mathbb{R}^d \setminus \{0\}$. If $a_j^t\gamma \ne 0$ for some $j$, then we may find $t \in \mathbb{R}$ such that $a_j^t\alpha + t\,a_j^t\gamma > 0$, contradicting that $L \subseteq P$. Therefore $A\gamma = 0$ and we must have $\operatorname{rk} A < d$. $\square$

    Figure 3.11: Part of the Lorentz cone in R3.

A polyhedral cone is a convex cone. An example of a non-polyhedral convex cone is the Lorentz cone (see Figure 3.11) given by

$$C = \{(x, y, z) \in \mathbb{R}^3 \mid z \ge 0,\ x^2 + y^2 \le z^2\}.$$

It is not too hard to prove that $C$ is a cone. Proving that $C$ is a convex subset is more of a challenge (Exercise 3.17).

    The recession cone

To every convex subset $C \subseteq \mathbb{R}^n$ we associate a convex cone denoted $\operatorname{rec}(C)$ called the recession cone of $C$. Informally $\operatorname{rec}(C)$ is the set of directions in which $C$ recedes⁴ (see [Rockafellar (1970)], Section 8). The precise definition is

$$\operatorname{rec}(C) = \{d \in \mathbb{R}^n \mid x + d \in C \text{ for every } x \in C\}. \qquad (3.3)$$

It is left as an exercise (Exercise 3.18) to prove that $\operatorname{rec}(C)$ is a convex cone. If $C$ is a convex cone, then $\operatorname{rec}(C) = C$.

    4Go to or toward a more distant point.


Figure 3.12: The recession cone of the convex subset $C = \{(x, y) \mid xy \ge 1,\ x > 0\} \subseteq \mathbb{R}^2$ is $\operatorname{rec}(C) = \{(x, y) \mid x \ge 0,\ y \ge 0\}$. The dashed arrow is not in $\operatorname{rec}(C)$. Its direction leads to points outside $C$.
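The claim in Figure 3.12 is easy to test numerically. The short Python sketch below is only a spot check on a few sample points and directions (chosen for illustration), not a proof.

```python
def in_C(p):
    """Membership test for C = {(x, y) | xy >= 1, x > 0}."""
    x, y = p
    return x > 0 and x * y >= 1

points = [(1.0, 1.0), (0.5, 2.0), (4.0, 0.25)]        # sample points of C
good_dirs = [(1.0, 0.0), (0.0, 1.0), (2.0, 3.0)]      # directions with x, y >= 0
bad_dir = (1.0, -0.5)                                 # not in rec(C)

# x + d stays in C for every sample point and every direction in the claimed cone.
print(all(in_C((x + dx, y + dy)) for (x, y) in points for (dx, dy) in good_dirs))

# Moving far enough along the bad direction leaves C.
x, y = 1.0, 1.0
print(in_C((x + 10 * bad_dir[0], y + 10 * bad_dir[1])))   # False
```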

    Finitely generated cones

A linear combination $\lambda_1 v_1 + \cdots + \lambda_m v_m$ of vectors $v_1, \dots, v_m \in \mathbb{R}^d$ is called conic if $\lambda_1, \dots, \lambda_m \ge 0$. In complete analogy with the convex hull we introduce the following.

Definition 3.10. The cone generated by a subset $S \subseteq \mathbb{R}^d$ is the set of all conic linear combinations of elements from $S$, i.e.,

$$\operatorname{cone}(S) := \{\lambda_1 v_1 + \cdots + \lambda_m v_m \mid m \ge 1,\ v_1, \dots, v_m \in S,\ \lambda_1, \dots, \lambda_m \ge 0\}.$$

A convex cone $C$ is called finitely generated if $C = \operatorname{cone}(S)$ for a finite set $S \subseteq \mathbb{R}^d$.

Convex cones are closely related to the convex hull. Here we repeat the idea illustrated in Example 3.6 in the context of finitely generated cones.

Example 3.11. A triangle $T$ is the convex hull of three affinely independent points

$$(x_1, y_1),\ (x_2, y_2),\ (x_3, y_3)$$

in the plane and $(x, y) \in T$ if and only if

$$\begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} x_1 \\ y_1 \\ 1 \end{pmatrix}, \begin{pmatrix} x_2 \\ y_2 \\ 1 \end{pmatrix}, \begin{pmatrix} x_3 \\ y_3 \\ 1 \end{pmatrix}\right\}\right). \qquad (3.4)$$


Testing (3.4) amounts to solving the system

$$\begin{pmatrix} x_1 & x_2 & x_3 \\ y_1 & y_2 & y_3 \\ 1 & 1 & 1 \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \\ \lambda_3 \end{pmatrix} = \begin{pmatrix} x \\ y \\ 1 \end{pmatrix} \qquad (3.5)$$

of linear equations. So $(x, y) \in T$ if and only if the unique solution to (3.5) has $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ and $\lambda_3 \ge 0$. Let us experiment with a few concrete numbers. You can plot the points

$$\begin{pmatrix} 0 \\ 0 \end{pmatrix},\quad \begin{pmatrix} 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 5 \\ 3 \end{pmatrix} \in \mathbb{R}^2$$

and realize that their convex hull, $T$, is a very thin triangle. From a drawing it can be difficult to decide if a given point is inside the triangle. Here the $3 \times 3$ matrix from (3.5) helps. In this case

$$A = \begin{pmatrix} 0 & 2 & 5 \\ 0 & 1 & 3 \\ 1 & 1 & 1 \end{pmatrix} \quad\text{and}\quad A^{-1} = \begin{pmatrix} -2 & 3 & 1 \\ 3 & -5 & 0 \\ -1 & 2 & 0 \end{pmatrix}.$$

Let us check if $v_1 = (1, \tfrac{11}{20}) \in T$ and $v_2 = (4, 2) \in T$. In the case of $v_1$,

$$A^{-1}\begin{pmatrix} 1 \\ 11/20 \\ 1 \end{pmatrix} = \begin{pmatrix} 13/20 \\ 1/4 \\ 1/10 \end{pmatrix}.$$

This is the solution $(\lambda_1, \lambda_2, \lambda_3)$ to the linear equations in (3.5). In this case, the solution satisfies $\lambda_1 \ge 0$, $\lambda_2 \ge 0$ and $\lambda_3 \ge 0$. Therefore $v_1 \in T$. As an added bonus you also see that

$$\begin{pmatrix} 1 \\ \tfrac{11}{20} \end{pmatrix} = \tfrac{13}{20}\begin{pmatrix} 0 \\ 0 \end{pmatrix} + \tfrac{1}{4}\begin{pmatrix} 2 \\ 1 \end{pmatrix} + \tfrac{1}{10}\begin{pmatrix} 5 \\ 3 \end{pmatrix}.$$

For $v_2$ we get

$$A^{-1}\begin{pmatrix} 4 \\ 2 \\ 1 \end{pmatrix} = \begin{pmatrix} -1 \\ 2 \\ 0 \end{pmatrix}.$$

Here the first coordinate of the solution to (3.5) is negative and therefore $v_2 \notin T$.
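This computation is easy to hand over to a computer. Here is one possible Python/NumPy sketch of the membership test above; it simply solves (3.5) and checks the signs of the solution.

```python
import numpy as np

# Columns are the vertices of the thin triangle lifted to (x, y, 1).
A = np.array([[0.0, 2.0, 5.0],
              [0.0, 1.0, 3.0],
              [1.0, 1.0, 1.0]])

def in_triangle(point):
    """Solve (3.5) for (x, y, 1) and test whether all coordinates are nonnegative."""
    lam = np.linalg.solve(A, np.array([point[0], point[1], 1.0]))
    return bool(np.all(lam >= -1e-12)), lam

print(in_triangle((1.0, 11 / 20)))   # (True,  [ 0.65  0.25  0.1 ])
print(in_triangle((4.0, 2.0)))       # (False, [-1.    2.    0.  ])
```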


In general, the polar of a subset $S \subseteq \mathbb{R}^d$ is defined as

$$S^\circ = \{\alpha \in \mathbb{R}^d \mid \alpha^t x \le 1 \text{ for every } x \in S\}. \qquad (3.6)$$

If $S$ happens to be a cone $K$ we can replace the upper bound of $1$ in (3.6) by $0$, because if $\alpha^t x > 0$ for some $x \in K$, we would have $\alpha^t(\lambda x) = \lambda(\alpha^t x) > 1$ for a sufficiently big $\lambda > 0$. This contradicts that $\alpha \in K^\circ$, since $\lambda x \in K$. Therefore

$$K^\circ = \{\alpha \in \mathbb{R}^d \mid \alpha^t x \le 0 \text{ for every } x \in K\}.$$

The polar cone $K^\circ$ is an intersection of half spaces through $0$ and therefore a convex cone. In complete analogy with Proposition 3.5 we have the following result.

Proposition 3.12. If $C \subseteq \mathbb{R}^d$ is a finitely generated cone, then $C^\circ$ is a polyhedral cone.

Proof. Suppose that $C = \operatorname{cone}(S)$, where $S = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. Then $\alpha^t x \le 0$ for every $x = \lambda_1 v_1 + \cdots + \lambda_m v_m$, where $\lambda_i \ge 0$ for $i = 1, \dots, m$, if and only if $\alpha^t v_i \le 0$ for $i = 1, \dots, m$. Let $A$ be the $m \times d$ matrix with $v_1, \dots, v_m$ as its rows. Then it follows that

$$C^\circ = \{\alpha \in \mathbb{R}^d \mid A\alpha \le 0\}. \qquad \square$$

Polyhedral cones are finitely generated and finitely generated cones are polyhedral. These two deeper results due to Minkowski and Weyl are the focus of the next chapter. In purely algebraic terms, the finite generation of a polyhedral cone says that there exist finitely many solutions $v_1, \dots, v_N \in \mathbb{R}^n$ to a system of linear inequalities

$$\begin{aligned} a_{11}x_1 + \cdots + a_{n1}x_n &\le 0\\ &\ \ \vdots\\ a_{1m}x_1 + \cdots + a_{nm}x_n &\le 0, \end{aligned} \qquad (3.7)$$

such that every solution to (3.7) is a conic linear combination of $v_1, \dots, v_N$. In the setting of linear algebra you know that every solution to the homogeneous linear system corresponding to (3.7) is a linear combination of $n$ solutions. For (3.7) even the existence of finitely many (conic) generating solutions is a non-trivial statement.

Example 3.13. In Figure 3.13 we have sketched a finitely generated cone $C$ and its polar cone $C^\circ$. If you look closer, you will see that

$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}\right) \quad\text{and}\quad C^\circ = \operatorname{cone}\left(\left\{\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}\right).$$


Figure 3.13: A finitely generated cone $C$ and its polar cone $C^\circ$.

Notice also that $C^\circ$ encodes the fact that $C$ is the intersection of the two half planes

$$\left\{\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \;\middle|\; \begin{pmatrix} 1 \\ -2 \end{pmatrix}^{\!t}\begin{pmatrix} x \\ y \end{pmatrix} \le 0\right\} \quad\text{and}\quad \left\{\begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2 \;\middle|\; \begin{pmatrix} -2 \\ 1 \end{pmatrix}^{\!t}\begin{pmatrix} x \\ y \end{pmatrix} \le 0\right\}.$$

3.5 Carathéodory's theorem

In Example 3.11 you saw how to check if a point is in the convex hull of three points in the plane. This led to three linear equations in three unknowns. What if you had more than three points? In §3.6 it is shown how to reduce such a computation to a finite number of the simple computations carried out in Example 3.11. The key point is that a finitely generated cone is the union of finitely many simplicial subcones. A convex cone is called simplicial if it is generated by finitely many linearly independent vectors. The following result was first proved by Carathéodory.⁵

⁵Constantin Carathéodory (1873–1950). Greek mathematician.


Theorem 3.14 (Carathéodory). Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. If $v \in \operatorname{cone}(V)$, then $v$ belongs to the cone generated by a linearly independent subset of $V$.

Proof. Suppose that

$$v = \lambda_1 v_1 + \cdots + \lambda_m v_m$$

with $\lambda_1, \dots, \lambda_m > 0$ and $v_1, \dots, v_m$ linearly dependent. The linear dependence means that there exist $\theta_1, \dots, \theta_m \in \mathbb{R}$ not all zero such that

$$\theta_1 v_1 + \cdots + \theta_m v_m = 0. \qquad (3.8)$$

We may assume that at least one $\theta_i > 0$, multiplying (3.8) by $-1$ if necessary. Introducing the variable $\lambda \in \mathbb{R}$ we have

$$v = v - \lambda(\theta_1 v_1 + \cdots + \theta_m v_m) = (\lambda_1 - \lambda\theta_1)v_1 + \cdots + (\lambda_m - \lambda\theta_m)v_m. \qquad (3.9)$$

Let

$$\lambda^* = \max\{\lambda \ge 0 \mid \lambda_i - \lambda\theta_i \ge 0 \text{ for every } i = 1, \dots, m\} = \min\left\{\frac{\lambda_i}{\theta_i} \;\middle|\; \theta_i > 0,\ i = 1, \dots, m\right\}.$$

When you substitute $\lambda^*$ for $\lambda$ in (3.9), you discover that $v$ lies in the subcone generated by a proper subset of $V$. Basically we are varying $\lambda$ in (3.9) ensuring non-negative coefficients for $v_1, \dots, v_m$ until the first time we reach a zero coefficient in front of some $v_j$. This (or these) $v_j$ is (are) deleted from the generating set. Repeating this process we will eventually end up with a linearly independent subset of vectors from $V$. $\square$

Corollary 3.15. Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$. If $v \in \operatorname{conv}(V)$, then $v$ belongs to the convex hull of an affinely independent subset of $V$.

Proof. If $v \in \operatorname{conv}(\{v_1, \dots, v_m\})$, then

$$\begin{pmatrix} v \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} v_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} v_m \\ 1 \end{pmatrix}\right\}\right).$$

Now use Theorem 3.14 to conclude that

$$\begin{pmatrix} v \\ 1 \end{pmatrix} \in \operatorname{cone}\left(\left\{\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}\right\}\right),$$

where

$$\left\{\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}\right\} \subseteq \left\{\begin{pmatrix} v_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} v_m \\ 1 \end{pmatrix}\right\}$$

is a linearly independent subset. Therefore $v \in \operatorname{conv}(\{u_1, \dots, u_k\})$. But

$$\begin{pmatrix} u_1 \\ 1 \end{pmatrix}, \dots, \begin{pmatrix} u_k \\ 1 \end{pmatrix}$$

are linearly independent if and only if $u_1, \dots, u_k$ are affinely independent by Proposition 2.9. $\square$

A consequence of Corollary 3.15 is that a point in the convex hull of more than three planar points belongs to the convex hull of at most three of the points (see Figure 3.14).

Figure 3.14: A point in the convex hull of some given planar points is in the convex hull of at most three of these points.

The affine equivalent of a simplicial cone is called a simplex. More precisely, a convex subset is called a $d$-simplex if it is the convex hull of $d + 1$ affinely independent points (see Figure 3.15). In these terms, Corollary 3.15 states that the convex hull of finitely many points is a union of finitely many simplices.

    Figure 3.15: Picture of 0-simplex (point), 1-simplex (line segment), 2-simplex (triangle), 3-simplex (tetrahedron).

Example 3.16. The proofs of Theorem 3.14 and Corollary 3.15 may appear quite abstract until you study a few concrete examples. Here is a typical example illustrating the linear algebra that goes into their proofs. Consider the convex hull

$$C = \operatorname{conv}\left(\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$


The identity

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \tfrac{1}{11}\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \tfrac{1}{11}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \tfrac{5}{11}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{4}{11}\begin{pmatrix} 4 \\ 0 \end{pmatrix}$$

shows that

$$v = \begin{pmatrix} 2 \\ 1 \end{pmatrix} \in C.$$

Corollary 3.15 says however that $v$ is in the convex hull of at most three of these four vectors (why?). Let us emulate the proof of Corollary 3.15 in a concrete computation. The first step is to find a linear dependence between the four vectors

$$\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix},\ \begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} \quad\text{and}\quad \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} \qquad (3.10)$$

in $\mathbb{R}^3$ (compare this with the beginning of the proof of Corollary 3.15). Here is a linear dependence:

$$5\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} - 6\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + 3\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} - 2\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \\ 0 \end{pmatrix}. \qquad (3.11)$$

Just to make sure you know how to translate linear dependence into the more mundane setting of linear equations, let us show (in painstaking detail) how (3.11) was found! A linear dependence for the vectors in (3.10) can be found as a non-zero solution to the equations

$$\begin{aligned} \theta_1 + \theta_3 + 4\theta_4 &= 0\\ \theta_2 + 2\theta_3 &= 0\\ \theta_1 + \theta_2 + \theta_3 + \theta_4 &= 0 \end{aligned} \qquad (3.12)$$

in $\theta_1, \theta_2, \theta_3$ and $\theta_4$. Such a non-zero solution always exists (see Theorem B.2). By subtracting the first equation from the third (a step in Gaussian elimination), you get

$$\begin{aligned} \theta_1 + \theta_3 + 4\theta_4 &= 0\\ \theta_2 + 2\theta_3 &= 0\\ \theta_2 - 3\theta_4 &= 0. \end{aligned}$$

From this system of equations you can glean the non-zero solution

$$\theta_2 = -6,\quad \theta_4 = -2,\quad \theta_3 = 3 \quad\text{and}\quad \theta_1 = 5,$$

which is the linear dependence in (3.11). In this particular case (3.9) becomes

$$\begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} = \left(\tfrac{1}{11} - 5\lambda\right)\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \left(\tfrac{1}{11} + 6\lambda\right)\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \left(\tfrac{5}{11} - 3\lambda\right)\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \left(\tfrac{4}{11} + 2\lambda\right)\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}.$$


How big is $\lambda$ allowed to be, when

$$\tfrac{1}{11} - 5\lambda \ge 0 \quad\text{and}\quad \tfrac{5}{11} - 3\lambda \ge 0$$

must hold? Solving the inequalities for $\lambda$ we get $\lambda \le \tfrac{1}{55}$ and $\lambda \le \tfrac{5}{33}$. To ensure that both inequalities are satisfied, we must have $\lambda \le \tfrac{1}{55}$. The magic appears when we pick $\lambda = \tfrac{1}{55}$. Then

$$\begin{aligned} \begin{pmatrix} 2 \\ 1 \\ 1 \end{pmatrix} &= \left(\tfrac{1}{11} - \tfrac{1}{55}\cdot 5\right)\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix} + \left(\tfrac{1}{11} + \tfrac{1}{55}\cdot 6\right)\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \left(\tfrac{5}{11} - \tfrac{1}{55}\cdot 3\right)\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \left(\tfrac{4}{11} + \tfrac{1}{55}\cdot 2\right)\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}\\[4pt] &= \tfrac{11}{55}\begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 1 \\ 2 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}. \end{aligned}$$

This implies

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} = \tfrac{11}{55}\begin{pmatrix} 0 \\ 1 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{22}{55}\begin{pmatrix} 4 \\ 0 \end{pmatrix}.$$

Therefore

$$\begin{pmatrix} 2 \\ 1 \end{pmatrix} \in \operatorname{conv}\left(\left\{\begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$
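The bookkeeping in this example follows a completely mechanical recipe, so it can be scripted. The Python/NumPy sketch below is one possible way to do it (an illustration only): it repeatedly picks a linear dependence among the lifted vectors, moves along it as far as the coefficients allow, and drops a vector whose coefficient has become zero.

```python
import numpy as np

def caratheodory_reduce(vectors, lam):
    """Shrink a conic combination v = sum(lam[i] * vectors[i]) with lam > 0
    until the remaining vectors are linearly independent (as in Theorem 3.14)."""
    vectors = [np.array(v, dtype=float) for v in vectors]
    lam = np.array(lam, dtype=float)
    while True:
        M = np.column_stack(vectors)
        if np.linalg.matrix_rank(M) == len(vectors):
            return vectors, lam                      # linearly independent: done
        theta = np.linalg.svd(M)[2][-1]              # a nonzero solution of M @ theta = 0
        if theta.max() <= 0:
            theta = -theta                           # make sure some theta_i > 0
        pos = theta > 1e-12
        step = np.min(lam[pos] / theta[pos])         # the largest admissible lambda
        lam = lam - step * theta
        keep = lam > 1e-12                           # drop vectors with zero coefficient
        vectors = [v for v, k in zip(vectors, keep) if k]
        lam = lam[keep]

# The data of Example 3.16, lifted to R^3 as in the proof of Corollary 3.15.
vs = [(1, 0, 1), (0, 1, 1), (1, 2, 1), (4, 0, 1)]
vecs, coeffs = caratheodory_reduce(vs, [1/11, 1/11, 5/11, 4/11])
print(coeffs, [tuple(v) for v in vecs])   # three vectors remain, coefficients sum to 1
```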

3.6 The convex hull, simplicial subsets and Bland's rule

How do we decide if a point is in the convex hull of finitely many given points in $\mathbb{R}^d$? You have seen in the proof of Corollary 3.15 and in Example 3.16 how this can be reduced to deciding if a vector is a conic linear combination of finitely many given vectors. In principle, Theorem 3.14 tells us how to check this. But if $V$ in the statement of Theorem 3.14 is a generating set for $\mathbb{R}^d$ with $m$ elements, we need to check the linearly independent subsets of $V$ with $d$ elements. This seems like an insurmountable problem. If for example $m = 30$ and $d = 10$, there can be up to 30 045 015 of the latter.

We will present a rather surprising algorithm for traversing these subsets coming from §7.1 in the classic monograph [Schrijver (1986)]. The surprise is that the algorithm does not get stuck: it manages to traverse the linearly independent subsets until it finds that the vector is in the cone or that this is certifiably false. The method is inspired by Bland's rule in the simplex algorithm (see §5.4).

Let $V = \{v_1, \dots, v_m\} \subseteq \mathbb{R}^d$ be a generating set and $x \in \mathbb{R}^d$. We call a subset $J \subseteq \{1, 2, \dots, m\}$ of size $d$ simplicial if $|J| = d$ and $\{v_j \mid j \in J\}$ is linearly independent.


(1) For a simplicial subset $J = \{i_1, \dots, i_d\} \subseteq \{1, 2, \dots, m\}$, there exist uniquely defined $\lambda_{i_j} \in \mathbb{R}$ for $j = 1, \dots, d$, such that
$$x = \lambda_{i_1} v_{i_1} + \cdots + \lambda_{i_d} v_{i_d}.$$
(2) If $\lambda_{i_j} \ge 0$ for every $j = 1, \dots, d$, we have verified that $x \in \operatorname{cone}(V)$. If not, pick the smallest index $k \in \{i_1, \dots, i_d\}$ with $\lambda_k < 0$. Let $c \in \mathbb{R}^d$ be the unique vector with the property that
$$c^t v_k = 1 \quad\text{and}\quad c^t v_j = 0 \ \text{ if } j \in \{i_1, \dots, i_d\} \setminus \{k\}.$$
This vector is the $k$-th row in $A^{-1}$, where $A$ is the matrix with column vectors $v_{i_1}, \dots, v_{i_d}$. Notice that $c^t x = \lambda_k < 0$.
(3) If $c^t v_j \ge 0$ for every $j \notin J$, then $c^t v \ge 0$ for every $v \in \operatorname{cone}(V)$. Since $c^t x < 0$, this shows that $x \in \operatorname{cone}(V)$ is impossible.
(4) If not, pick the smallest index $j \in \{1, \dots, m\}$ with $c^t v_j < 0$ and repeat the procedure from (1) with the simplicial subset
$$J := (J \setminus \{k\}) \cup \{j\}.$$
Here we will say that $j$ enters and $k$ exits the simplicial subset.

Example 3.17. As an illustration (see Figure 3.16) let us check if

$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} \in \operatorname{conv}\left(\left\{\begin{pmatrix} 1 \\ 0 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \end{pmatrix}, \begin{pmatrix} 5 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \end{pmatrix}\right\}\right).$$

Rewriting this as a conic problem we need to check if $x \in \operatorname{cone}(V)$, where

$$V = \left\{\begin{pmatrix} 1 \\ 0 \\ 1 \end{pmatrix}, \begin{pmatrix} 0 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 2 \\ 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 5 \\ 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 0 \\ 1 \end{pmatrix}\right\} \quad\text{and}\quad x = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix}.$$

We initiate the algorithm above with the simplicial subset $J = \{3, 4, 5\}$. Here

$$x = v_3 - v_4 + v_5$$

with $k = 4$ and $c = (\tfrac{1}{2}, \tfrac{1}{2}, -2)$. Since $c^t v_1 = -\tfrac{3}{2}$ and $c^t v_2 = -\tfrac{3}{2}$ we get $j = 1$ and repeat the iteration with $J = \{3, 1, 5\}$. Here

$$x = \tfrac{1}{2}v_3 + \tfrac{2}{3}v_1 - \tfrac{1}{6}v_5$$

with $k = 5$ and $c = (\tfrac{1}{3}, -\tfrac{1}{6}, -\tfrac{1}{3})$. Since $c^t v_2 = -\tfrac{1}{2}$ and $c^t v_4 = \tfrac{7}{6}$ we get $j = 2$. Therefore the iteration is repeated with $J = \{3, 1, 2\}$. Here

$$x = \tfrac{1}{3}v_3 + \tfrac{1}{3}v_1 + \tfrac{1}{3}v_2,$$

showing that $x \in \operatorname{cone}(V)$.
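Steps (1)–(4) are short enough to implement directly. Here is one possible Python/NumPy sketch of the procedure (an illustration, not code from the text); it replays the iterations above and either returns nonnegative coefficients or a certificate vector $c$ with $c^t x < 0$ and $c^t v_j \ge 0$ for all $j$.

```python
import numpy as np

def in_cone(V, x, J):
    """Steps (1)-(4) with Bland's smallest-index rule.
    V: list of vectors spanning R^d, x: target vector, J: initial simplicial
    subset given by 0-based indices. Returns ('yes', coefficients) or ('no', c)."""
    V = [np.array(v, dtype=float) for v in V]
    x = np.array(x, dtype=float)
    J = list(J)
    while True:
        A = np.column_stack([V[i] for i in J])             # step (1)
        lam = np.linalg.solve(A, x)
        if np.all(lam >= -1e-12):
            return "yes", dict(zip(J, lam))
        neg = [p for p in range(len(J)) if lam[p] < -1e-12]
        k_pos = min(neg, key=lambda p: J[p])               # step (2): smallest index k
        c = np.linalg.inv(A)[k_pos]                        # row of A^{-1} belonging to k
        entering = [j for j in range(len(V)) if j not in J and c @ V[j] < -1e-12]
        if not entering:                                   # step (3): certificate found
            return "no", c
        J[k_pos] = min(entering)                           # step (4): Bland's rule

# Example 3.17 with 0-based indices; J = {3, 4, 5} becomes [2, 3, 4].
V = [(1, 0, 1), (0, 1, 1), (2, 2, 1), (5, 1, 1), (4, 0, 1)]
print(in_cone(V, (1, 1, 1), [2, 3, 4]))
# ('yes', {2: 0.333..., 0: 0.333..., 1: 0.333...}) after {3,4,5} -> {1,3,5} -> {1,2,3}
```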


Figure 3.16: Geometric illustration of the steps $\{3, 4, 5\} \to \{1, 3, 5\} \to \{1, 2, 3\}$ in detecting that the marked point is in the convex hull of the points in Example 3.17.

    Non-cycling

We will now prove that the procedure outlined in the above steps (1)–(4) terminates. Let $J_t$ denote the simplicial subset in step $t$ of the algorithm. If the algorithm does not terminate it must enter into a cycle: $J_k = J_l$ for some $k < l$.

Let $r \in \{1, 2, \dots, m\}$ be the largest index to exit the simplicial subset in a step $p \in \{k, k+1, \dots, l-1\}$. Since $J_k = J_l$, there must also exist a step $q \in \{k, k+1, \dots, l-1\}$ for which $r$ enters the simplicial subset. Suppose that $J_p = \{i_1, \dots, i_d\}$. Then

$$x = \lambda_{i_1} v_{i_1} + \cdots + \lambda_{i_d} v_{i_d} \qquad (3.13)$$

with $\lambda_{i_j} \ge 0$ for $i_j < r$ and $\lambda_r < 0$.

Let $c$ denote the vector of (2) in step $q$, where $r$ enters. Then

$$c^t x < 0. \qquad (3.14)$$

Since $r$ enters the simplicial subset, we must have $c^t v_{i_j} \ge 0$ for $i_j < r$ and $c^t v_r < 0$ by (4). We already know that $\lambda_r < 0$ and $\lambda_{i_j} \ge 0$ for $i_j < r$. For $i_j > r$ we must have $c^t v_{i_j} = 0$, by the construction of $c$ in (2) and since $r$ is the largest index to exit the simplicial subset. Applying these observations to (3.13), we get $c^t x \ge 0$, contradicting (3.14).

The rule of always exchanging smallest indices is the equivalent of Bland's rule in the simplex algorithm. It is remarkable that this simple rule is so powerful. In fact, the above algorithm furnishes an independent proof of Theorem 3.14 along with the substantial mathematical result: if $x \notin \operatorname{cone}(V)$, there exists $c \in \mathbb{R}^d$ with $c^t x < 0$ and $c^t z \ge 0$ for $z \in \operatorname{cone}(V)$. The latter will appear later under the name Farkas's lemma (Lemma 4.14) with a different proof.


    3.7 Exercises

Exercise 3.1. Let $P = \{x \in \mathbb{R}^d \mid Ax \le b\}$ be a polyhedron in $\mathbb{R}^d$. Prove that $(1-t)x + ty \in P$ if $x, y \in P$ and $0 \le t \le 1$.

Exercise 3.2. Let $A, B \subseteq \mathbb{R}^d$ be convex subsets. Prove that $A \cap B$ is a convex subset. Give an example showing that $A \cup B$ does not have to be a convex subset.

Exercise 3.3. Prove in detail that an affine half space is a convex subset and that a polyhedron

$$P = \{x \in \mathbb{R}^d \mid Ax \le b\}$$

is a convex subset of $\mathbb{R}^d$.

Exercise 3.4. Let $A$ be a convex subset of $\mathbb{R}^d$. Prove that

$$A + z := \{x + z \mid x \in A\} \quad\text{and}\quad \lambda A := \{\lambda x \mid x \in A\}$$

are convex subsets of $\mathbb{R}^d$ for $z \in \mathbb{R}^d$ and $\lambda \in \mathbb{R}$. Let $B$ be a convex subset of $\mathbb{R}^d$. Prove that

$$A + B := \{x + y \mid x \in A,\ y \in B\}$$

is a convex subset of $\mathbb{R}^d$.

Exercise 3.5. Let $v_1, v_2, v_3 \in \mathbb{R}^n$. Show that

$$\{(1-\mu)v_3 + \mu((1-\lambda)v_1 + \lambda v_2) \mid \mu \in [0, 1],\ \lambda \in [0, 1]\} = \{\lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3 \mid \lambda_1, \lambda_2, \lambda_3 \ge 0,\ \lambda_1 + \lambda_2 + \lambda_3 = 1\}.$$

Exercise 3.6. Sketch the convex hull of

$$S = \{(0, 0), (1, 0), (1, 1)\} \subseteq \mathbb{R}^2.$$

Write $\operatorname{conv}(S)$ as the intersection of three half planes.

Exercise 3.7. Let $u_1, u_2, v_1, v_2 \in \mathbb{R}^n$. Show that

$$\operatorname{conv}(\{u_1, u_2\}) + \operatorname{conv}(\{v_1, v_2\}) = \operatorname{conv}(\{u_1 + v_1,\ u_1 + v_2,\ u_2 + v_1,\ u_2 + v_2\}).$$


Exercise 3.8. Let $S \subseteq \mathbb{R}^n$ be a convex subset and $v \in \mathbb{R}^n$. Show that

$$\{(1-\lambda)s + \lambda v \mid \lambda \in [0, 1],\ s \in S\}$$

is a convex subset. Hint: compare with Exercise 3.5.

Exercise 3.9. Let $C = \operatorname{conv}(\{v_1, \dots, v_m\})$ and $P = C^\circ$ for $v_1, \dots, v_m \in \mathbb{R}^d$. Prove that

$$P = \{\alpha \in \mathbb{R}^d \mid \alpha^t v_1 \le 1, \dots, \alpha^t v_m \le 1\}$$

and that $P$ is a polyhedron. Compute and sketch $P$ for $C$ given in Example 3.6. Prove in general that $P$ is bounded if $0$ is an interior point of $C$.

Exercise 3.10. If $F \subseteq G \subseteq C$ are convex subsets of $\mathbb{R}^d$, prove that $F$ is a face of $C$ if $F$ is a face of $G$ and $G$ is a face of $C$.

Exercise 3.11. Give an example of a convex subset $C \subseteq \mathbb{R}^d$ and a face $F \subseteq \mathbb{R}^d$, which is not exposed (hint: think about stretching a disc).

Exercise 3.12. Prove that $C \setminus F$ is a convex subset if $F$ is a face of a convex subset $C$. Is it true that $F \subseteq C$ is a face if $C \setminus F$ is a convex subset?

Exercise 3.13. Let $X = \{x_1, \dots, x_n\} \subseteq \mathbb{R}^d$.

(i) Prove that if $z \in \operatorname{conv}(X)$ is an extreme point, then $z \in X$.
(ii) Suppose that $z \notin X$. Prove that $z$ is an extreme point of $\operatorname{conv}(\{z\} \cup X)$ if and only if $z \notin \operatorname{conv}(X)$.

This means that the extreme points in a convex hull consist of the non-redundant generators (compare this with Figure 3.6).

Exercise 3.14. Prove in detail that

$$C = \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 \le 1\}$$

is a convex subset of $\mathbb{R}^2$. What are the extreme points of $C$? Can you prove it?


Exercise 3.15. Recall the notation

$$A + B = \{u + v \mid u \in A,\ v \in B\} \subseteq \mathbb{R}^n$$

for two subsets $A, B \subseteq \mathbb{R}^n$ and let $[u, v] := \operatorname{conv}(\{u, v\})$ for $u, v \in \mathbb{R}^n$.

(i) Show that
$$[u, v] + \{w\} = [u + w, v + w]$$
for $u, v, w \in \mathbb{R}^n$.

(ii) Sketch
$$P = \left[\begin{pmatrix} 1 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right] + \left[\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 3 \\ 2 \end{pmatrix}\right]$$
along with its extreme points in the plane.

(iii) Let
$$Q = P + \left[\begin{pmatrix} 3 \\ 1 \end{pmatrix}, \begin{pmatrix} 4 \\ 1 \end{pmatrix}\right].$$
Write $Q$ as a convex hull of the minimal number of points and as an intersection of half planes.

(iv) Let $A$ and $B$ be convex sets and $u_0 \in A$ a point in $A$, which is not extreme. Show that $u_0 + b \in A + B$ is not extreme in $A + B$ for any $b \in B$.

(v) Show that
$$T = [x, y] + [z, w]$$
has at most 4 extreme points for $x, y, z, w \in \mathbb{R}^n$. Can $T$ have 3 extreme points? 2?

(vi) Let $L_i = [u_i, v_i]$ for $i = 1, \dots, m$, where $u_i, v_i \in \mathbb{R}^n$. Give an upper bound for how many extreme points
$$Z = L_1 + \cdots + L_m \qquad (3.15)$$
can have. Show that $Z$ is the image of the unit cube $[0, 1]^m \subseteq \mathbb{R}^m$ under a suitable affine map.

The Minkowski sum of finitely many line segments (as in (3.15)) is called a zonotope.

    Exercise 3.16. Give an example of a non-convex cone.

Exercise 3.17. Prove in detail that

$$C = \{(x, y, z) \in \mathbb{R}^3 \mid z \ge 0,\ x^2 + y^2 \le z^2\}$$

is a convex cone. Is $C$ finitely generated?


Exercise 3.18. Prove in detail that the recession cone $\operatorname{rec}(C)$ defined in (3.3) is a convex cone, where $C$ is a convex subset. Perhaps the identity

$$x + (n+\mu)d = (1-\mu)(x + nd) + \mu(x + (n+1)d)$$

might come in handy.

    Exercise 3.19. What is the recession cone of a bounded convex subset?

Exercise 3.20. Can you give an example of an unbounded convex subset $C \subseteq \mathbb{R}^2$ with $\operatorname{rec}(C) = \{0\}$?

Exercise 3.21. Let

$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} 2 \\ 1 \end{pmatrix}, \begin{pmatrix} 1 \\ 2 \end{pmatrix}\right\}\right).$$

(i) Show that
$$C^\circ = \operatorname{cone}\left(\left\{\begin{pmatrix} 1 \\ -2 \end{pmatrix}, \begin{pmatrix} -2 \\ 1 \end{pmatrix}\right\}\right).$$

(ii) Suppose that
$$C = \operatorname{cone}\left(\left\{\begin{pmatrix} a \\ c \end{pmatrix}, \begin{pmatrix} b \\ d \end{pmatrix}\right\}\right),$$
where
$$\begin{pmatrix} a & b \\ c & d \end{pmatrix}$$
is an invertible matrix. How do you compute $C^\circ$?

Exercise 3.22. The vector

$$v = \begin{pmatrix} 7/4 \\ 19/8 \end{pmatrix}$$

is the convex combination

$$\tfrac{1}{8}\begin{pmatrix} 1 \\ 1 \end{pmatrix} + \tfrac{1}{8}\begin{pmatrix} 1 \\ 2 \end{pmatrix} + \tfrac{1}{4}\begin{pmatrix} 2 \\ 2 \end{pmatrix} + \tfrac{1}{2}\begin{pmatrix} 2 \\ 3 \end{pmatrix}$$

of four vectors in $\mathbb{R}^2$. Use the method outlined in Example 3.16 to answer the following questions.

(i) Is $v$ in the convex hull of three of the four vectors?
(ii) Is $v$ in the convex hull of two of the four vectors?


Exercise 3.23. Let $C_1, C_2, C_3, C_4$ be convex subsets of $\mathbb{R}^2$, such that any three of them have non-empty intersection, i.e.,

$$C_1 \cap C_2 \cap C_3 \ne \emptyset,\quad C_1 \cap C_2 \cap C_4 \ne \emptyset,\quad C_1 \cap C_3 \cap C_4 \ne \emptyset,\quad C_2 \cap C_3 \cap C_4 \ne \emptyset.$$

(i) Show that for $v_1, v_2, v_3, v_4 \in \mathbb{R}^2$ there are $\lambda_1, \lambda_2, \lambda_3, \lambda_4 \in \mathbb{R}$, not all zero, such that
$$\lambda_1 v_1 + \lambda_2 v_2 + \lambda_3 v_3 + \lambda_4 v_4 = 0 \quad\text{and}\quad \lambda_1 + \lambda_2 + \lambda_3 + \lambda_4 = 0.$$

(ii) By assumption there exists
$$v_i \in \bigcap_{\substack{j=1\\ j \ne i}}^{4} C_j$$
for $i = 1, \dots, 4$. Suppose that with the notation in (i) we have
$$\lambda_1, \lambda_2 \ge 0 \quad\text{and}\quad \lambda_3, \lambda_4 \le 0.$$
Prove that
$$\frac{\lambda_1}{\lambda_1 + \lambda_2}\,v_1 + \frac{\lambda_2}{\lambda_1 + \lambda_2}\,v_2 \in C_1 \cap C_2 \cap C_3 \cap C_4.$$

(iii) Prove now in general that $C_1 \cap C_2 \cap C_3 \cap C_4 \ne \emptyset$.
(iv) Prove that if $C_1, \dots, C_m$ are convex subsets of $\mathbb{R}^2$ where any three of them have non-empty intersection, then $C_1 \cap \dots \cap C_m \ne \emptyset$.
(v) What is a natural generalization from $\mathbb{R}^2$ to $\mathbb{R}^n$ of the result in (iv)?

Exercise 3.24. Let $S$ be a subset of $\mathbb{R}^n$ containing at least $n + 2$ points. Prove that there exist subsets $S_1, S_2 \subseteq S$, such that

(i) $S_1 \cap S_2 = \emptyset$
(ii) $S_1 \cup S_2 = S$
(iii) $\operatorname{conv}(S_1) \cap \operatorname{conv}(S_2) \ne \emptyset$.

Hint: write down an affine dependence between $v_1, \dots, v_{n+2} \in S$. This result is called Radon's theorem.

Exercise 3.25. Use the result in Exercise 3.24 to give a complete proof of the natural generalization alluded to in Exercise 3.23 (v) (called Helly's theorem).

Exercise 3.26. Let $e_1, e_2, e_3$ denote the canonical basis vectors of $\mathbb{R}^3$ and let

$$C = \operatorname{conv}(\{e_1, -e_1, e_2, -e_2, e_3, -e_3\}).$$

Verify that

$$\left(\tfrac{1}{3}, \tfrac{1}{5}, \tfrac{1}{7}\right) \in C$$

by writing down a convex linear combination.


    Chapter 4

    Polyhedra

Unlike a planar disc (Figure 3.9) and the Lorentz cone (Figure 3.11), polyhedra are intersections of finitely many affine half spaces (see Figure 4.1).

    Figure 4.1: A (famous) polyhedron in R3.

The purpose of this chapter is to dig deeper into the structure of polyhedra only using Fourier-Motzkin elimination and basic linear algebra. A polyhedron has finitely many faces and they can be characterized using linear algebra. As a prelude to this characterization in Proposition 4.3, we will briefly mention the remarkable numerical relation known as Euler's formula. A proof of this formula is beyond the scope of this book. Fortunately it is one of those mathematical miracles that can be appreciated almost instantly (see (4.1) and Figure 4.2).

A central result due to Minkowski¹ and Weyl² says that a polyhedron can be expressed as the Minkowski sum of a finitely generated cone and the convex hull of finitely many points. This result is conveniently treated in the setting of convex cones.

¹Hermann Minkowski (1864–1909). German mathematician.
²Hermann Weyl (1885–1955). German mathematician.


A finitely generated convex cone is a polyhedron. It is not easy to write up a precise mathematical proof of this statement. Hermann Weyl, one of the great mathematicians of the 20th century, found it worthwhile to devote a paper [Weyl (1935)] to it. This result is now known as Weyl's theorem and was the subject of his last seminar in Göttingen in the summer of 1933 before he left for the United States.

We will give a deceptively short proof of Weyl's theorem using Fourier-Motzkin elimination. In all fairness we also enter into the ideas of Weyl's original proof in Chapter 5, hoping to convince the reader that it contains much deeper information on the minimal generators of a polyhedral cone.

The theory of polyhedra is an abundant source of interesting mathematics. We give a few examples in this chapter like Markov chains and steady states, duality in linear programming and doubly stochastic matrices.

    4.1 Faces of polyhedra

Assume in the following remark that the convex hull of finitely many points is a polyhedron and that a polyhedron has finitely many faces. Both of these statements will be proved later.

Remark 4.1. If $C \subseteq \mathbb{R}^3$ is a convex subset, then its extreme points are called vertices, its one-dimensional faces are called edges and its two-dimensional faces are called faces. Let $P \subseteq \mathbb{R}^3$ be the convex hull of finitely many points and suppose that $P$ is not contained in an affine plane. There is a remarkable numerical relation between the number of vertices $V$, the number of edges $E$ and the number of faces $F$ of $P$. This relation is

$$V - E + F = 2 \qquad (4.1)$$

and seems to have been observed first by Descartes³ in 1640. It was later used by Euler⁴ in 1752 (see the chapter on topology in [Courant and Robbins (1941)]) and is today called Euler's formula for polyhedra. You should check (4.1) with the platonic solids depicted as wireframes in Figure 4.2 (for the cube we have $8 - 12 + 6 = 2$).
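If you prefer to let the computer do the checking, here is a quick Python verification of (4.1) with the vertex, edge and face counts of the five platonic solids filled in:

```python
# (V, E, F) for the five platonic solids shown in Figure 4.2.
platonic = {
    "tetrahedron":  (4, 6, 4),
    "cube":         (8, 12, 6),
    "octahedron":   (6, 12, 8),
    "dodecahedron": (20, 30, 12),
    "icosahedron":  (12, 30, 20),
}
for name, (V, E, F) in platonic.items():
    print(name, V - E + F)   # prints 2 for every solid
```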

The formula (4.1) belongs to a rich branch of mathematics called polyhedral combinatorics. For the convex hull $P$ of finitely many points in $\mathbb{R}^d$

³René Descartes (1596–1650). French mathematician.
⁴Leonhard Euler (1707–1783). Swiss mathematician.


Figure 4.2: An illustration of Euler's formula (4.1): the five so-called platonic solids with vertices and (in)visible edges marked, namely the cube (6 faces), the tetrahedron (4 faces), the octahedron (8 faces), the icosahedron (20 faces) and the dodecahedron (12 faces).

not contained in an affine hyperplane, (4.1) generalizes to the identity

$$f_0 - f_1 + f_2 - \cdots + (-1)^{d-1}f_{d-1} = 1 + (-1)^{d-1}, \qquad (4.2)$$

    where fj denotes the n