02 Convex Sets

Embed Size (px)

Citation preview

  • 8/9/2019 02 Convex Sets

    1/23

    Convex sets

    In this section, we will be introduced to some of the mathematical

    fundamentals of convex sets. In order to motivate some of the defini-tions, we will look at theclosest point problemfrom several differentangles. The tools and concepts we develop here, however, have manyother applications both in this course and beyond.

    A setC RN is convexif

    x,y C (1 )x +y C for all[0, 1].

    In English, this means that if we travel on a straight line betweenany two points inC, then we never leave C.

    These sets in R2 are convex:

    These sets are not:

    1

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    2/23

    Examples of convex (and nonconvex) sets

    Subspaces. Recall that ifSis a subspace ofRN, thenx,y S ax +by Sfor all a, b R.SoSis clearly convex.

    Affine sets. Affine sets are just subspaces that have been offsetby the origin:

    {x RN : x= y + v, y T }, T= subspace,

    for some fixed vector v. An equivalent definition is thatx,y C ax+ by C for all a, b R the differencebetween this definition and that for a subspace is that subspacesmust include the origin.

    Bound constraints. Rectangular sets of the form

    C={x RN :1x1u1, 2x2u2, . . . , NxNuN}

    for some1, . . . , N, u1, . . . , uN Rare convex.

    Thesimplexin RN

    {x RN :x1+x2+ +xN1, x1, x2, . . . , xN0}

    is convex.

    Any subset ofRN that can be expressed as a set of linear in-equality constraints

    {x RN

    :Ax b}

    is convex. Notice that both rectangular sets and the simplex

    2

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    3/23

    fall into this category for the simplex, take

    A=

    1 1 1 11 0 0 0

    0 1 0 0... . . .0 1

    , b=

    10

    0...0

    .

    In general, when sets like these are bounded, the result is apolyhedron.

    Norm balls. If is a valid norm on RN, then

    Br={x : x r},is a convex set.

    Ellipsoids. An ellipsoid is a set of the form

    E={x : (x x0)TP1(x x0) r},

    for a symmetric positive-definite matrixP. Geometrically, theellipsoid is centered at x0, its axes are oriented with the eigen-

    vectors ofP, and the relative widths along these axes are pro-portional to the eigenvalues ofP.

    A single point {x}is convex.

    The empty set is convex.

    The set{x R2 : x2

    1 2x1 x2+ 1 0}

    is convex. (Sketch it!) The set

    {x R2 : x21

    2x1 x2+ 1 0}

    is notconvex.

    3

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    4/23

    The set{x R2 : x2

    1 2x1 x2+ 1 = 0}

    is certainly not convex.

    Sets defined by linear equality constraints where only some ofthe constraints have to hold are in general not convex. Forexample

    {x R2 : x1 x2 1 and x1+x2 1}

    is convex, while

    {x R2

    : x1 x2 1 or x1+x2 1}is not convex.

    Cones

    A coneis a setCsuch that

    x C x C for all0.

    Convex cones are sets which are both convex and a cone. C is aconvex cone if

    x1,x2 C 1x1+2x2 C for all1, 20.

    Given an x1,x2, the set of all linear combinations with positiveweights makes a wedge. For practice, sketch the region below thatconsists of all such combinations ofx1and x2:

    4

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    5/23

  • 8/9/2019 02 Convex Sets

    6/23

    form a proper cone. Notice that it is not necessary that allthe xn 0; for example t t2 (x1 = 0, x2 = 1, x3 = 1) isnon-negative on [0, 1].

    Norm cones. The subset ofRN+1

    defined by

    {(x, t), x RN, t R : x t}

    is a proper cone for any valid norm and t > 0. Sketchwhat this looks like for the Euclidean norm and N= 2:

    Every proper coneKdefines a partial orderingor generalizedinequality. We write

    xK y when y x K.

    For example, for vectors x,y RN, we say

    xRN+ y when xnyn for alln= 1, . . . , N .

    For symmetric matrices X,Y, we say

    XSN+ Y when Y Xhas non-negative eigenvalues.

    We will typically just use when the context makes it clear. In fact,for RN

    + we will just write x y (as we did above) to mean that

    6

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    7/23

    the entries in x are component-by-component upper-bounded by theentries in y.

    Partial orderings obey share of the properties of the standard onthe real line. For example:

    x y, u v x + u y + v.

    But other properties do not hold; for example, it is not necessarythat either x y or y x. For an extensive list of properties ofpartial orderings (most of which will make perfect sense on sight) canbe found in [BV04, Chapter 2.4].

    Hyperplanes and halfspaces

    Hyperplanes and halfspaces are both very simple constructs, but theywill be crucial to our understanding to convex sets, functions, andoptimization problems.

    A hyperplaneis a set of the form

    {x RN : x, c=t}

    for some fixed vector c = 0 and scalar t. When t = 0, this set isa subspace of dimension N1, and contains all vectors that areorthogonal to c. Fort = 0, this is an affine space consisting of allthe vectors orthogonal to c(call this setC) offset to some x0:

    {x RN : x, c=t}= {x RN : x= x0+ C},

    for any x0 with x0, c = t. We might take x0 = t c/c22, forinstance. The point is, cis a normal vectorof the set.

    Here are some examples in R2:

    7

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    8/23

    1 2 3 4 5

    1

    2

    3

    4

    5

    1

    t = 1 t = 5t = 1

    c =

    1

    0

    4

    1 41

    c =

    2

    1

    t = 5= 1

    It should be clear that hyperplanes are convex sets.

    A halfspaceis a set of the form

    {x RN : x, c t}

    for some fixed vector c = 0 and scalar t. Fort = 0, the halfspacecontains all vectors whose inner product with c is negative (i.e. theangle between x and c is greater than 90). Here is a simple example:

    8

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    9/23

    1

    2

    3

    4

    5

    1 2 3 4 51

    c =

    2

    1

    t = 5

    Separating hyperplanes

    If two convex sets are disjoint, then there is a hyperplane that sepa-

    rates them.

    C

    D

    d

    c

    {x : hx, ai = b}

    a = d c

    b =kdk2

    2 kck2

    2

    2

    9

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    10/23

    This fact is intuitive, and is incredibly useful in understanding the so-lutions to convex optimization programs (we will see this even in thenext section). It also easy to prove under mild additional assump-tions; below we will show how to explicitly construct a separatinghyperplane between two convex sets if there are points in the twosets that are definitely closest to one another. The result extends togeneral convex sets, and the proof is based on the same basic idea,but requires some additional toplogical considerations

    LetC, Dbe disjoint convex sets and suppose that there exists pointsc Cand d D that achieve the minimum distance

    c d2= inf xC,yD x y2.

    Define

    a= d c, b=d2

    2 c2

    2

    2 .

    Then

    x,a b for x C, x,a b for x D.

    That is the hyperplane {x : x,a = b} separates C and D. Tosee this, set

    f(x) =x,a b,

    and show that for and point uinD, we havef(u) 0.

    First, we prove the basic geometric fact that for any two vectors x,y,

    if x +y

    2

    x

    2 for all

    [0, 1] then

    x,y

    0. (1)

    To establish this, we expand the norm as

    x +y22=x2

    2+2y2

    2+ 2x,y,

    10

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    11/23

    from which we can immediately deduce that

    2y2

    2+ x,y 0 for all [0, 1]

    x,y 0.

    Now let u be an arbitrary point in D. SinceD is convex, we knowthatd+ (ud) Dfor all [0, 1]. Sincedas close to cas anyother point inD, we have

    d +(u d) c2 d c2,

    and so by (1), we know that

    d c,u d 0.

    This means

    f(u) =u,d c d2

    2 c2

    2

    2

    =u,d c d + c,d c2

    =u (d + c)/2, d c=u d + d/2 c/2, d c

    =u d,d c +c d2

    2

    20.

    The proof that for every v C,f(v) 0 is exactly the same.

    Note the following:

    11

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    12/23

    We actually only need the interiors of sets CandD to be dis-joint they can intersect at one or more points along theirboundaries. Viz

    C

    D

    But you have to be more careful in defining the hyperplanewhich separates them (as we would have c = d a = 0 inthe argument above).

    There can be many hyperplanes that separate the two sets, butthere is not necessarily more than one.

    The hyperplane {x : x,a = b} strictly separates setsC and D ifx,a < b for all x C and x,a >> b for allx D. There are examples of closed, convex, disjoint setsC, Dthat are separable but not strictly separable. It is a goodexercise to think of an example.

    12

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    13/23

    Supporting hyperplanes

    A direct consequence of the separating hyperplane theorem is thatevery point on the boundary of a convex setCcan be separated fromits interior.

    Ifa=0satisfiesx,a x0,afor all x C, then

    {x: x,a=x0,a}

    is called a supporting hyperplanetoCat x0. Heres a picture:

    0

    The hyperplane is tangent to Cat x0, and the halfspace {x:x,a x0,a}containsC.

    The supporting hyperplane theorem says that a supporting hyper-plane exists at every point x0 on the boundary of a (non-empty)convex setC. Note that there might be more than one:

    x

    0

    13

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    14/23

    Given the separating hyperplane theorem, this is easy to prove. IfChas a non-empty interior, then it follows from just realizing that x0and int(C) are disjoint. If int(C) = , then Cmust be contained inan affine set, and hence also in (possibly many) hyperplanes, any ofwhich will be supporting.

    14

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    15/23

    The closest point problem

    Let x0 RN be given, and letCbe a non-empty, closed, convex set.

    The projection ofx0 onto Cis the closest point (in the standardEuclidean distance, for now) in Cto x0:

    PC(x0) = arg minyC

    x0 y2

    We will see below that there is a unique minimizer to this problem,and that the solution has geometric properties that are analogous tothe case whereCis a subspace.

    Projection onto a subspace

    Lets recall how we solve this problem in the special case whereC := T is a K-dimensional subspace. In this case, the solutionx= PT(x0) is unique, and is characterized by the orthogonalityprinciple:

    x0 x T

    meaning thaty,x0 x= 0 for all y T. The proof of this factis reviewed in the Technical Details section at the end of these notes.

    The orthogonality principle leads immediately to an algorithm forcalculating PT(x0). Letv1, . . . ,vKbe a basis for T; we can writethe solution as

    x=Kk=1

    kvk;

    solving for the expansion coefficientsk is the same as solving for x.We know that

    x0 K

    j=1

    jvj, vk= 0, fork= 1, . . . , K ,

    15

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    16/23

    and so thek must obey the linear system of equations

    K

    j=1

    jvj, vk=x0, vk, fork= 1, . . . , K .

    Concatenating the vk as columns in the N Kmatrix V, and theentriesk into the vector RK, we can write the equations aboveas

    VTV = VTx0.

    Since the{vk}are a basis for T(i.e. they are linearly independent),VTV is invertible, and we can solve for the best expansion coeffi-

    cients:= (VTV)1VTx0.

    Using these expansion coefficient to reconstruct xyields

    x=V= V(VTV)1VTx0.

    In this case, the projectorPT() is a linear map, specified byNNmatrix V(VTV)1VT.

    Projection onto an affine set

    Affine sets are not fundamentally different than subspaces. Any affinesetCcan be written as a subspaceTplus an offset v0:

    C=T + v0={x : x= y + v0, y C}.

    It is easy to translate the results for subspaces above to say that theprojection onto an affine set is unique, and obeys the orthogonalityprinciple

    y x,x0 x= 0, for all y C. (2)

    16

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    17/23

    You can solve this problem by shifting x0 and C by negative v0,projectingx0 v0 onto the subspace C v0, and then shifting theanswer back.

    Projection onto a general convex set

    In general, there is no closed-form expression for the projector ontoa given convex set. However, the concepts above (orthogonality,projection onto a subspace) can help us understand the solution foran arbitrary convex set.

    Uniqueness IfCis closed2 and convex, then for any x0, the program

    minyC

    x0 y2

    has a uniquesolution.

    To see this, suppose that x1 and x2 are solutions3 to the program

    above. Then for every[0, 1], it must be true that

    x0 x12 x0 (1 )x1 x22=x0 x1+x1 x22 x0 x12+x1 x22.

    The only way this inequality holds for all [0, 1] simultaneously isifx1=x2.

    2We review basic concepts in topology for RN in the Technical Details section

    at the end of the notes.3We are taking it for granted that there is indeed at least one point in Cfor

    which the minimum is obtained that this is true is a consequence ofCbeing closed. For a more careful and detailed proof, see [Ber03, Chapter2].

    17

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    18/23

    Obtuseness.Relative to the solution x, every vector inCis at anobtuse angle to the error x0 x. More precisely,PC(x0) =xif andonly if

    y x,x0 x 0. (3)

    Compare with (2) above. Here is a picture:

    x0

    y

    x

    We first prove that (3)PC(x0) =x. For any y C, we have

    y x02

    2=y x +x x0

    2

    2

    =y x22+ x x0

    2

    2+ 2y x,x x0

    x x02

    2+ 2y x,x x0.

    Note that the inner product term above is the same as (3), but withthe sign of the second argument flipped, so we know this term mustbe non-negative. Thus

    y x02

    2 x x0

    2

    2.

    Since this holds uniformly over all y C, x must be the closest pointinCto x0.

    18

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    19/23

    We now show thatPC(x0) =x (3). Letybe an arbitrary pointin C. Since C is convex, the point x +(y x) must also be in Cfor all[0, 1]. Since x= PC(x0),

    x +(y x) x02 x x02 for all [0, 1].

    By our intermediate result (1) a few pages ago, this means

    y x,x x0 0 y x,x0 x 0.

    19

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    20/23

    Technical Details: closest point to a subspace

    In this section, we establish the orthogonality principle for projectionof a point x0 onto a subspaceT. Let xbe a vector which obeys

    e= x x T.

    We will show that x is the unique closest point to x0 in T. Let ybe any other vector inT, and set

    e=x y.

    We will show that

    e

    > e

    (i.e. thatx

    y

    > x

    x

    ) .

    Note that

    e2 =x y2 =e (y x)2

    =e (y x), e (y x)

    =e2 + y x2 e,y x y x,e.

    Since y x T ande T,

    e,y x= 0, and y x,e= 0,

    and soe2 =e2 + y x2.

    Since all three quantities in the expression above are positive and

    y x> 0 y=x,

    we see thaty=x e> e.

    We leave it as an exercise to establish the converse; that ify,x x0= 0 for all y T, thenxis the projection ofx0 ontoT.

    20

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    21/23

    Technical Details: basic topology in RN

    This section contains a brief review of basic topological concepts inR

    N. Our discussion will take place using the standard Euclideandistance measure (i.e. 2 norm), but all of these definitions can begeneralized to other metrics. An excellent source for this material is[Rud76].

    We say that a sequence of vectors{xk, k= 1, 2, . . .}convergestoxif

    xk x2 x as k .

    More precisely, this means that for every > 0, there exists an nsuch thatxk x2 for all kn.

    It is easy to show that a sequence of vectors converge if and only iftheir individual components converge point-by-point.

    A set X is open if we can draw a small ball around every point inXwhich is also entirely contained in X. More precisely, letB(x, )

    be the set of all points within ofx:

    B(x, ) ={y RN : x y2}.

    ThenXis open if for every x X, there exists anx>0 such thatB(x, x) X. The standard example here is open intervals of thereal line, e.g. (0, 1).

    There are many ways to define closed sets. The easiest is that asetX is closed if its complement is open. A more illuminating (andequivalent) definition is thatX is closed if it contains all of its limitpoints. A vector xis a limit pointofX if there exists a sequenceof vectors{xk} Xthat converge to x.

    21

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    22/23

    The closureof general set X, denoted cl(X), is the set of all limitpoints ofX. Note that every x X is trivially a limit point (takethe sequence xk =x), soX cl(X). By construction, cl(X) is thesmallest closed set that containsX.

    Related to the definition of open and closed sets are the technicaldefinitions of boundary and interior. The interiorof a setX is thecollection of points around which we can place a ball of finite widthwhich remains in the set:

    int(X) ={x X : >0 such that B(x, ) X }.

    TheboundaryofXis the set of points in cl(X) that are not in theinterior:

    bd(X) = cl(X)\ int(X).

    Another (equivalent) way of defining this is the set of points that arein both the closure ofXand the closure of its complementXc. Notethat if the set is not closed, there may be boundary points that arenot in the set itself.

    The set X is bounded if we can find a uniform upper bound onthe distance between two points it contains; this upper bound iscommonly referred to as the diameterof the set:

    diam X = supx,yX

    x y2.

    The set X RN is compact if it is closed and bounded. A keyfact about compact sets is that every sequence has a convergent sub-sequence this is known as the Bolzano-Weierstrauss theorem.

    22

    Georgia Tech ECE 8823a Notes by J. Romberg. Last updated 1:23, January 8, 2015

  • 8/9/2019 02 Convex Sets

    23/23

    References

    [Ber03] D. Bertsekas.Convex Analysis and Optimization. Athena

    Scientific, 2003.[BV04] S. Boyd and L. Vandenberghe. Convex Optimization.

    Cambridge University Press, 2004.

    [Rud76] W. Rudin. Principles of Mathematical Analysis.McGraw-Hill, 1976.

    23