Chapter 1 Real Number System - SUNY Cortland, web.cortland.edu/jubrani/MAT603LectureNotes.pdf · 2015. 6. 29.

Chapter 1

Real Number System

1.1 Introduction

Definition 1.1.1. An ordered set is a set S with a relation, denoted by <, with the following two properties:

1. For any x, y ∈ S, one and only one of the following statements is true:

x < y x = y y < x

2. For x, y, z ∈ S, if x < y, y < z then x < z.

Definition 1.1.2. A field is a set F with two operations, addition and multiplication, which satisfy the following axioms:

(A) Axioms for addition

A1.) If x, y ∈ F then x+ y ∈ F

A2.) x+ y = y + x ∀x, y ∈ F

A3.) x+ (y + z) = (x+ y) + z ∀x, y, z ∈ F

A4.) F contains an element 0 such that x+ 0 = x for all x ∈ F .

A5.) For every x ∈ F there is an element −x such that x+ (−x) = 0

(M) Axioms for multiplication

M1.) If x, y ∈ F then xy ∈ F

M2.) xy = yx ∀x, y ∈ F

M3.) x(yz) = (xy)z ∀x, y, z ∈ F

M4.) F contains an element 1 such that 1x = x for all x ∈ F .

M5.) For every x ∈ F, x ≠ 0, there is an element x⁻¹ ∈ F such that x x⁻¹ = 1.

The Distributive Law

x(y + z) = xy + xz ∀ x, y, z ∈ F.

Notation: We will denote the set of all natural numbers by N, the set of all integers by Z, and the set of all rationals by Q.

The set of rational numbers, Q, is both an ordered set and a field, and hence satisfies all the properties listed in the definitions above. Rational numbers further satisfy the following property:

For p, q ∈ Q: p > 0, q > 0⇒ pq > 0, p+ q > 0

There are infinitely many rational numbers. In fact, between any two rationals p and q there is another rational, e.g., (p + q)/2. If we continue taking the midpoint of the previous midpoint and one of the extreme points, then we can also see that between any two rationals there are infinitely many rationals. But we will see below that Q by itself does not provide a whole (or a complete) picture, and that there are ‘gaps’ in Q.

Proposition 1.1.3. x² = 2 has no solution in Q.

Proof. To see this, assume that it does have a solution, say x = m/n, for some m, n ∈ Z, n ≠ 0, and suppose that gcd(m, n) = 1 (i.e., m and n have no common factor other than 1). Then

(m/n)² = 2 ⇒ m² = 2n² ⇒ m² is even ⇒ m is even.

Let m = 2k for some k ∈ Z. Then

(2k)² = 2n² ⇒ n² = 2k² ⇒ n² is even ⇒ n is even.

If both m and n are even, 2 is their common factor. This is a contradiction, since gcd(m,n)= 1.

More generally, for every prime p, x² = p has no solutions in Q.

Define A = {q ∈ Q : q > 0, q² < 2} and B = {q ∈ Q : q > 0, q² > 2}. Then we show (in the theorem below) that A has no largest element, and B has no smallest element.

Theorem 1.1.4. (a) For every p ∈ A, ∃ q ∈ A such that p < q.

(b) For every p ∈ B, ∃ q ∈ B such that p > q.

Proof. (a) Suppose p ∈ A, so p² < 2. Pick h ∈ Q with 0 < h < 1 and

h < (2 − p²)/(2p + 1).

Let q = p + h. Then q > p since h > 0, and

q² = p² + 2ph + h² = p² + (2p + h)h

< p² + (2p + 1)h (since h < 1)

< p² + (2 − p²) = 2.

So q² < 2, i.e., q ∈ A.

(b) Now suppose p ∈ B, so p² > 2. Let

q = p − (p² − 2)/(2p) = (2p² − (p² − 2))/(2p) = (p² + 2)/(2p) = p/2 + 1/p.

Then 0 < q < p and

q² = p² − 2p · (p² − 2)/(2p) + ((p² − 2)/(2p))² > p² − (p² − 2) = 2.

So q ∈ B.
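The two constructions in the proof can be tried out with exact rational arithmetic. The following is a minimal sketch; the function names `step_up` and `step_down` are hypothetical, and the particular choice of h (half the bound, capped at 1/2) is just one admissible choice, not the only one.

```python
from fractions import Fraction

def step_up(p):
    """Given rational p > 0 with p*p < 2, return q > p with q*q < 2 (part (a))."""
    bound = (2 - p * p) / (2 * p + 1)
    h = min(Fraction(1, 2), bound / 2)  # ensures 0 < h < 1 and h < bound
    return p + h

def step_down(p):
    """Given rational p > 0 with p*p > 2, return q with 0 < q < p and q*q > 2 (part (b))."""
    return p / 2 + 1 / p

a = Fraction(7, 5)       # 49/25 < 2, so a is in A
a_next = step_up(a)      # a larger element of A
b = Fraction(3, 2)       # 9/4 > 2, so b is in B
b_next = step_down(b)    # a smaller element of B
```

Repeating either step produces rationals that approach the ‘gap’ from below or above without ever landing on a number whose square is 2.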

1.2 Dedekind Cuts

Definition 1.2.1. A set α in Q is a cut if

(i) α ≠ ∅ and α ≠ Q.

(ii) If p ∈ α and q < p then q ∈ α.

(iii) α contains no largest rational, that is, for every p ∈ α, there is a q ∈ α such that p < q.

Remarks: The following facts will be used frequently and follow from the definition of a cut.

1. p ∉ α ⇐⇒ p > q for all q ∈ α.

Reason: This statement is the contrapositive of the statement that p ∈ α ⇐⇒ ∃ q ∈ α such that q ≥ p.

2. If p ∈ α and q ∉ α, then p < q.

Reason: If p ∈ α and q ≤ p, then by part (ii) of the definition of a cut, q ∈ α.

Definition 1.2.2. An upper number of a cut α is a rational q which is not in α, i.e., q ∈ Q \ α. Every element p in α is called a lower number of the cut α.

So a “cut” divides the set of rationals into two disjoint sets.

Definition 1.2.3. Let r ∈ Q. Define r∗ := {p ∈ Q : p < r}, r∗ is called a rational cut.

Theorem 1.2.4. Let r ∈ Q. Then r∗, defined above, is a cut. Also, r is the smallest upper number of r∗. In particular, 0∗ and 1∗ are cuts.

Proof. (i) Since r − 1 ∈ Q and r − 1 < r, r − 1 ∈ r∗. So r∗ ≠ ∅. Also, since r ∉ r∗, r∗ ≠ Q.

(ii) Let p ∈ r∗ and q < p. Since p < r and q < p, q < r. Thus q ∈ r∗.

(iii) Let p ∈ r∗. Then p < (p + r)/2 < r, so (p + r)/2 ∈ r∗. Thus r∗ has no largest element.

Since r ∉ r∗, r is an upper number of r∗. If p < r, then p ∈ r∗, so no number smaller than r is an upper number.
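A rational cut can be modelled as a membership test on exact rationals. The sketch below is only an illustration of the three parts of the proof for one value of r; the names `rational_cut` and `larger_member` are hypothetical.

```python
from fractions import Fraction

def rational_cut(r):
    """Return the membership test for r* = {p in Q : p < r}."""
    return lambda p: p < r

def larger_member(p, r):
    """For p in r*, return the midpoint (p + r)/2, which is a larger element of r*."""
    return (p + r) / 2

r = Fraction(3, 4)
in_cut = rational_cut(r)
p = Fraction(1, 2)           # an element of r*
q = larger_member(p, r)      # witness that p is not the largest element
```

Here `in_cut(r - 1)` is true (the cut is nonempty), `in_cut(r)` is false (the cut is not all of Q), and `q` lies strictly between p and r.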

Definition 1.2.5. We say that two cuts α and β are equal (written α = β) if α and β are equal as sets, i.e., p ∈ α ⇐⇒ p ∈ β.

Definition 1.2.6. Let α and β be cuts. We say α < β (or β > α) if α is a proper subset of β, i.e., for all p ∈ Q, p ∈ α ⇒ p ∈ β, and ∃ q ∈ Q such that q ∈ β but q ∉ α.

We say α ≤ β if α < β or α = β. And, α ≥ β means β ≤ α.

If α > 0∗, we say α is positive. If α ≥ 0∗, we say α is non-negative.

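Theorem 1.2.7. (O1. Trichotomy Law) Let α and β be cuts. Then exactly one of α = β, α < β, or α > β holds.

Proof. Case I: α = β. By the definition, it is clear that the other two do not hold.

Case II: α ≠ β. Then either α < β or β < α. Indeed, if α is not a subset of β, then ∃ p ∈ α with p ∉ β, so p > q for all q ∈ β; by part (ii) of the definition of a cut every such q lies in α, so β ⊆ α and, since α ≠ β, β < α. Otherwise α ⊆ β and α ≠ β, i.e., α < β.

Suppose both α < β and β < α hold. Then ∃ p ∈ Q such that p ∈ β, p ∉ α. Also, ∃ q ∈ Q such that q ∈ α, q ∉ β. Since p ∈ β and q ∉ β, p < q. On the other hand, q ∈ α and p ∉ α implies q < p. This is a contradiction. Hence this is not possible.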

Theorem 1.2.8. (O2. Transitivity of “<”) Let α, β and γ be cuts. If α < β and β < γ, then α < γ.

Proof. Let α < β and β < γ. Then α ⊆ β ⊆ γ, and ∃ p ∈ Q such that p ∈ β, p ∉ α, and ∃ q ∈ Q such that q ∈ γ, q ∉ β.

Since p ∈ β and q ∉ β, p < q. Further, since p ∉ α and p < q, q ∉ α. Hence q ∈ γ and q ∉ α, that is, α < γ.

Definition 1.2.9. (Sum of cuts) Let α and β be cuts. Define α+β := {r = p+ q : p ∈ α, q ∈ β}.

Theorem 1.2.10. (A1. Closure) If α and β are cuts then α + β is a cut.

Proof. (i) Since α, β ≠ ∅, α + β ≠ ∅.

Since α, β ≠ Q, ∃ s, t ∈ Q such that s ∉ α and t ∉ β. So s > p for all p ∈ α and t > q for all q ∈ β. Thus s + t > p + q for all p ∈ α and q ∈ β, i.e., s + t ∉ α + β. Hence α + β ≠ Q.

(ii) Let r ∈ α + β and s < r. Then r = p + q for some p ∈ α and q ∈ β. We can write s = p + (s − p), where s − p < r − p = q, so s − p ∈ β, and p ∈ α. Thus s ∈ α + β.

(iii) Let r ∈ α + β, so r = p + q for some p ∈ α and q ∈ β. Since α is a cut, p is not the largest element of α, i.e., ∃ s ∈ α, s > p. Then s + q > p + q = r and s + q ∈ α + β.

Theorem 1.2.11. Let α, β and γ be cuts. Then

(i) A2. Commutativity: α + β = β + α

(ii) A3. Associativity: (α + β) + γ = α + (β + γ)

(iii) A4. Identity: α + 0∗ = α.

Proof. (i) and (ii) are easy to prove. For instance, (i) follows from the fact that p + q = q + p for all p, q ∈ Q.

(iii) Let r ∈ α + 0∗. So r = p + q, p ∈ α, q ∈ 0∗. Since q < 0, p + q < p, and hence r = p + q ∈ α. This proves that α + 0∗ ⊆ α.

Let r ∈ α. Pick s ∈ α such that s > r. Let q = r − s. Then q < 0 ⇒ q ∈ 0∗, and r = s + q. So r ∈ α + 0∗. Hence α ⊆ α + 0∗.

Definition 1.2.12. (Negative of a cut) Let α be a cut. Define

−α = {p ∈ Q : −p is an upper number of α but not the smallest upper number}.

Theorem 1.2.13. If α is a cut, then −α is a cut.

Proof. Write β := −α.

(i) Since α ≠ Q, ∃ p ∉ α. Choose q > p. Then q is an upper number of α which is not the smallest. Thus −q ∈ β and β ≠ ∅.

Since α ≠ ∅, ∃ p ∈ α. Then p is not an upper number of α. Hence −p ∉ β, so β ≠ Q.

(ii) Let p ∈ β and q ∈ Q, q < p. Then −p ∉ α and −q > −p. So −q ∉ α, and −q is an upper number which is not the smallest. Thus q ∈ β.

(iii) Let p ∈ β. Then −p ∉ α and −p is not the smallest upper number of α. So pick q ∈ Q with −q < −p, −q ∉ α.

Let r = (p + q)/2. Then −q < −r < −p. So −r is an upper number of α which is not the smallest. So ∃ r > p with r ∈ β.

Lemma 1.2.14. Let α be a cut and let p ∈ Q, p > 0. Then ∃ q, r ∈ Q such that q ∈ α, r ∉ α, r is not the smallest upper number of α, and r − q = p.

Proof. Let s ∈ α. For n = 0, 1, 2, . . ., let s_n = s + np. Then ∃ a unique integer m ≥ 0 such that s_m ∈ α and s_{m+1} ∉ α. If s_{m+1} is not the smallest upper number of α, then let q = s_m, r = s_{m+1}. Otherwise let

q = s_m + p/2, r = s_{m+1} + p/2.
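The stepping argument in the lemma can be sketched as a loop. The code below assumes a cut given by a membership test and, for simplicity, a cut with no smallest upper number (the cut of rationals whose square is below 2, together with the non-positive rationals), so the "otherwise" branch of the proof is never needed. The names `in_sqrt2_cut` and `straddle` are hypothetical.

```python
from fractions import Fraction

def in_sqrt2_cut(x):
    """Membership test for the cut {x in Q : x <= 0 or x*x < 2}."""
    return x <= 0 or x * x < 2

def straddle(in_cut, s, p):
    """Starting from s in the cut, step by p > 0 until the next step leaves the cut.

    Returns (q, r) with q in the cut, r not in the cut, and r - q == p.
    """
    if not in_cut(s) or p <= 0:
        raise ValueError("need s in the cut and p > 0")
    q = s
    while in_cut(q + p):   # terminates because the cut is bounded above
        q += p
    return q, q + p

q, r = straddle(in_sqrt2_cut, Fraction(0), Fraction(1, 10))
```

With s = 0 and p = 1/10 the loop stops at q = 7/5 and r = 3/2, since (7/5)² < 2 < (3/2)².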

Theorem 1.2.15. (A5. Inverse) Let α be a cut. Then −α is the unique cut such that α + (−α) = 0∗.

Proof. Let β := −α. Then we need to show α + β = 0∗.

“⊂”: Let p ∈ α + β. Then p = q + r, q ∈ α, r ∈ β. So

−r ∉ α ⇒ −r > q ⇒ q + r < 0 ⇒ p < 0 ⇒ p ∈ 0∗.

“⊃”: Let p ∈ 0∗, so p < 0. By the lemma, ∃ q ∈ α and r ∉ α, with r not the smallest upper number of α, such that r − q = −p.

So −r ∈ β ⇒ p = q − r = q + (−r) ∈ α + β.

Uniqueness: If α + β1 = α + β2 = 0∗, then

β2 = 0∗ + β2 = (α + β1) + β2 = (α + β2) + β1 = 0∗ + β1 = β1.

Proposition 1.2.16. Let α be a cut. Then α > 0∗ if and only if −α < 0∗.

Proof. (⇒): Let 0∗ < α. Then ∃ p ∈ α, p ∉ 0∗. If p = 0, we can choose q ∈ α, q > p = 0, so that q ≠ 0, q ∈ α, and q ∉ 0∗. Thus we can assume that p ≠ 0.

Claim: −p ∈ 0∗ and −p ∉ −α.

Since p = −(−p) ∈ α, p is not an upper number of α. So −p ∉ −α.

Since p ∉ 0∗ and p ≠ 0 ⇒ p > 0 ⇒ −p < 0 ⇒ −p ∈ 0∗.

(⇐): Let −α < 0∗. Then ∃ p ∈ 0∗ such that p ∉ −α. Choose q > p such that q ∈ 0∗.

Claim: −q ∈ α and −q ∉ 0∗.

Since q ∈ 0∗ ⇒ q < 0 ⇒ −q > 0 ⇒ −q ∉ 0∗.

Since p ∉ −α and q > p, q ∉ −α. By definition, q ∉ −α implies either −q is not an upper number of α (i.e., −q ∈ α) or −q is the smallest upper number of α.

In the first case, we are done.

If −q is the smallest upper number of α, then −q is the smallest number such that −q ∉ α. Now p < q ⇒ −q < −p, so −p ∉ α and also −p is not the smallest upper number. Thus p = −(−p) ∈ −α, which is a contradiction, since p ∉ −α.

Proposition 1.2.17. Let r ∈ Q. Then −r∗ = (−r)∗.

Proof. Let r ∈ Q. Then

−r∗ = {p ∈ Q : −p is an upper number of r∗ but not the smallest upper number}

= {p ∈ Q : −p ∉ r∗ and −p ≠ r}

= {p ∈ Q : −p ≥ r and −p ≠ r}

= {p ∈ Q : −p > r}

= {p ∈ Q : p < −r} = (−r)∗.

Theorem 1.2.18. (O3. Order axiom) Let α, β, γ be cuts such that α < β. Then α + γ < β + γ. In particular, if α = 0∗, then β + γ > 0∗ whenever γ > 0∗ and β > 0∗.

Proof. Let α < β. Since α ⊆ β, every sum p + q with p ∈ α, q ∈ γ lies in β + γ, so α + γ ≤ β + γ. If α + γ = β + γ, then adding −γ to both sides and using A3, A4 and A5 (Theorems 1.2.11 and 1.2.15) gives α = β, which contradicts α < β. Hence α + γ < β + γ.

Definition 1.2.19. (Multiplication of positive cuts) Let α, β > 0∗. Define

αβ = {p ∈ Q : p < rs for some r ∈ α, s ∈ β, r > 0, s > 0}.

Theorem 1.2.20. (M1. Closure) If α and β are cuts with α, β > 0∗, then αβ is a cut.

Definition 1.2.21. (Multiplication for cuts, in general) Define α0∗ = 0∗α = 0∗ and

αβ = −[(−α)β] if α < 0∗, β > 0∗,

αβ = −[α(−β)] if α > 0∗, β < 0∗,

αβ = (−α)(−β) if α < 0∗, β < 0∗.

Theorem 1.2.22. Let α, β, γ be cuts. Then

(i) M2. Commutativity: αβ = βα

(ii) M3. Associativity: α(βγ) = (αβ)γ

(iii) M4. Identity: α1∗ = α.

Theorem 1.2.23. (M5. Inverse) Let α ≠ 0∗. Then there exists a unique cut, written 1/α, such that α · (1/α) = 1∗.

Theorem 1.2.24. Let α, β, γ be cuts. Then

(i) D. Distributive Law: α · (β + γ) = α · β + α · γ

(ii) O4. Order axiom: If 0∗ ≤ α < β and γ > 0∗, then α · γ < β · γ.

In particular, for α = 0∗, β · γ > 0∗ whenever β, γ > 0∗.

Theorem 1.2.25. Let r, s ∈ Q. Then

(i) r∗ + s∗ = (r + s)∗

(ii) r∗s∗ = (rs)∗

(iii) r < s ⇐⇒ r∗ < s∗

Proof. (i) Let p ∈ r∗ + s∗. Then p = q + t for some q ∈ r∗, t ∈ s∗. So p = q + t < r + s ⇒ p ∈ (r + s)∗.

Conversely, let p ∈ (r + s)∗. Then p < r + s.

Let h = r + s − p > 0 and define q := r − h/2, t := s − h/2.

Then q < r and t < s and q + t = r + s − h = p. Hence p ∈ r∗ + s∗.

(iii) (⇒): Let r < s. Then r ∈ s∗, and we know r ∉ r∗. So r∗ < s∗.

(⇐): Let r∗ < s∗. Then ∃ p ∈ s∗, p ∉ r∗, i.e., p < s and p ≥ r. Thus r ≤ p < s, and so r < s.

1.3 Construction of Real Numbers - Dedekind Cuts

Definition 1.3.1. We define the set of real numbers, denoted by R, to be the set of all cuts. The set of irrational numbers is defined to be the set of non-rational cuts.

Remark 1.3.2. 1. We have seen that the set of cuts satisfies the addition properties (A1-A5), multiplication properties (M1-M5), distributive property (D), and order properties (O1-O4). Any set which satisfies these 15 properties is called an ordered field. Thus the set of cuts, and hence the set of real numbers, is an ordered field.

2. The set of rational numbers is identified with the set of rational cuts, and also satisfies these 15 axioms. Thus the set of rational numbers is not just a subset, but a sub-field, of R.

Definition 1.3.3. Let S be any ordered set in R (i.e., one that satisfies O1, O2), and let E ⊆ S. Then E is bounded above if ∃ β ∈ S such that x ≤ β for all x ∈ E.

Similarly, E is bounded below if ∃ β ∈ S such that x ≥ β for all x ∈ E.

Definition 1.3.4. Let S be an ordered set and let E ⊆ S be bounded above. Then α ∈ S is a least upper bound of E if

(i) α is an upper bound of E (i.e., x ≤ α for all x ∈ E);

(ii) if β < α, then β is not an upper bound of E.

We write α = sup E.

A greatest lower bound can be similarly defined.

Definition 1.3.5. If every non-empty subset E of S which is bounded above has a least upper bound in S, then S is said to have the least upper bound property.

The greatest lower bound property can be similarly defined.

Theorem 1.3.6. R has the least upper bound property.

Proof. We will show that the set of cuts has the least upper bound property. Let S be a nonempty set of cuts which is bounded above, and let the cut β be an upper bound for S.

Define γ := ⋃_{α∈S} α.

Claim 1: γ is a cut.

(i) Since S is nonempty and each α ∈ S is nonempty, γ ≠ ∅. Since α ≤ β for all α ∈ S, we have α ⊆ β for all α ∈ S, so γ ⊆ β. As β ≠ Q, ∃ p ∈ Q such that p ∉ β, and then p ∉ γ. Hence γ ≠ Q.

(ii) Let p ∈ γ, and q ∈ Q, q < p. Then p ∈ α0 for some α0 ∈ S, which is a cut. Then q ∈ α0 ⊆ γ.

(iii) Let p ∈ γ. Then p ∈ α0 for some α0 ∈ S, which is a cut. So ∃ q ∈ α0 ⊆ γ such that q > p.

Claim 2: γ = sup S.

(i) Let α ∈ S. Then α ⊆ γ ⇒ α ≤ γ. So γ is an upper bound for S.

(ii) Let β be any upper bound of S, that is, α ≤ β for all α ∈ S. Then

α ⊆ β ∀ α ∈ S ⇒ ⋃_{α∈S} α ⊆ β ⇒ γ ≤ β.

Theorem 1.3.7. (Archimedean Property of R) If x, y ∈ R and x > 0, then ∃ n ∈ N such that nx > y.

Proof. (By contradiction) Let x > 0, y ∈ R. Suppose nx ≤ y for all n ∈ N. Let S = {nx : n ∈ N}. Then S is a nonempty subset of R which is bounded above by y. Then by Theorem 1.3.6, S has a supremum in R.

Let α = sup S. Then α ≥ nx for all n ∈ N. Also, since α − x < α, α − x cannot be an upper bound for S. So ∃ m ∈ N such that

α − x < mx ⇒ α < (m + 1)x.

This is a contradiction, since α ≥ nx for all n ∈ N. Thus our assumption must be false, and the Archimedean property holds.
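For rational inputs the integer n promised by the Archimedean property can be computed directly. This is a small illustrative sketch rather than part of the proof above (which is non-constructive); the function name `archimedean_n` is hypothetical.

```python
from fractions import Fraction
from math import floor

def archimedean_n(x, y):
    """Return a positive integer n with n*x > y, for rational x > 0 and rational y."""
    if x <= 0:
        raise ValueError("x must be positive")
    # floor(y/x) <= y/x < floor(y/x) + 1, so n = floor(y/x) + 1 gives n*x > y.
    return max(1, floor(y / x) + 1)

n = archimedean_n(Fraction(1, 3), Fraction(10))
```

The `max(1, ...)` only matters when y is negative, in which case n = 1 already works.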

Theorem 1.3.8. Q is dense in R, that is, for every x, y ∈ R, x < y, ∃ r ∈ Q such that x < r < y.

Proof. Let x, y ∈ R, x < y. Let x, y correspond to cuts α and β, respectively. Then α < β. So ∃ p ∈ Q such that p ∈ β, p ∉ α. Choose r ∈ Q such that r ∈ β, r > p.

So r ∈ β and r ∉ r∗ ⇒ r∗ < β. Since p < r, p ∈ r∗, but p ∉ α ⇒ α < r∗. Thus ∃ r∗ such that α < r∗ < β, which means x < r < y.
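A rational between two given numbers can also be found constructively. The sketch below does not follow the cut-based proof above; it uses the common alternative argument via the Archimedean property (pick n with n(y − x) > 1, then the first integer m above nx). Floating-point values stand in for real numbers, and the name `rational_between` is hypothetical.

```python
from fractions import Fraction
from math import floor, sqrt

def rational_between(x, y):
    """Return a rational r with x < r < y, assuming x < y."""
    if not x < y:
        raise ValueError("need x < y")
    n = floor(1 / (y - x)) + 1   # n * (y - x) > 1
    m = floor(n * x) + 1         # m - 1 <= n*x < m, hence n*x < m < n*y
    return Fraction(m, n)

r = rational_between(sqrt(2), sqrt(3))
```

For x = √2 and y = √3 this gives n = 4 and m = 6, i.e., r = 3/2.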

Absolute Value. Let x ∈ R. Then the absolute value of x is defined to be

|x| = x if x ≥ 0,

|x| = −x if x < 0.

Proposition 1.3.9. Let x and y be in R. Then

(i) |x| = 0 if and only if x = 0;

(ii) |−x| = |x|;

(iii) x ≤ |x|;

(iv) |xy| = |x| |y|;

(v) if y ≥ 0, then |x| ≤ y if and only if −y ≤ x ≤ y;

(vi) |x+ y| ≤ |x|+ |y|.

Proof. Parts (i)-(v) are easy exercises which can be proved by considering two cases in each part, namely, x ≥ 0 and x < 0. We prove part (vi). Consider

|x + y|² = |(x + y)²| = (x + y)² = x² + 2xy + y²

= |x²| + 2xy + |y²|

≤ |x²| + |2xy| + |y²|

= |x|² + 2|x||y| + |y|²

= (|x| + |y|)².

Taking square roots of both sides gives (vi).

Corollary 1.3.10. For x, y ∈ R,

(i) |x− y| ≤ |x|+ |y|,

(ii) ||x| − |y|| ≤ |x− y|.

1.4 Finite, Countable, and Uncountable Sets

Let f be a mapping from A into B. Then f is one-to-one (1-1) if for any x1, x2 ∈ A, f(x1) = f(x2) implies x1 = x2. f is onto if f(A) = B, i.e., for any y ∈ B there exists an x ∈ A such that f(x) = y. If f is both 1-1 and onto, it is called a bijection.

Definition 1.4.1. Two sets A and B are said to be in 1-1 correspondence if there exists a one-to-one mapping from A onto B. We also say that A and B have the same cardinal number, or that A and B are equivalent, and write A ∼ B.

Proposition 1.4.2. The equivalence relation between sets satisfies the following properties:

(i) Reflexive: A ∼ A.

(ii) Symmetric: If A ∼ B then B ∼ A.

(iii) Transitive: If A ∼ B and B ∼ C, then A ∼ C.

Any relation with these three properties is called an equivalence relation.

Proof. Let A,B,C be sets.

(i) The identity map I : A → A, I(x) = x, is one-to-one and onto.

(ii) Let A ∼ B. Then there is a bijection f : A → B, and f⁻¹ is a bijection from B onto A.

(iii) Let f : A→ B and g : B → C be bijections, then g ◦ f is a bijection from A onto C.

Definition 1.4.3. Let Jn = {1, 2, . . . , n} and N be the set of all positive integers. For any set A, we say

• A is finite if A = ∅ or A ∼ Jn for some n.

• A is infinite if A is not finite.

• A is countable if A ∼ N.

• A is uncountable if A is neither finite nor countable.

Note: Let A and B be finite sets. Then A ∼ B if and only if A and B have the same number of elements. For infinite sets this intuition breaks down: an infinite set can be equivalent to one of its proper subsets, as Example 1 below shows.

Countable sets are sometimes also called denumerable or enumerable sets.

Examples:

1. E ∼ N, where E is the set of all positive even integers. Define f : N → E as f(n) = 2n. Then f is a bijection.

2. Z ∼ N, where Z is the set of all integers. Define f : N → Z as

f(n) = n/2 if n is even,

f(n) = −(n − 1)/2 if n is odd.

Then f is a bijection.
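The bijection of Example 2 is easy to check on an initial segment of N. The sketch below implements f together with a candidate inverse; the name `f_inv` and its formula are not in the notes and are introduced here only to make the check possible.

```python
def f(n):
    """Map n in N = {1, 2, 3, ...} to Z: n/2 for even n, -(n - 1)/2 for odd n."""
    return n // 2 if n % 2 == 0 else -((n - 1) // 2)

def f_inv(z):
    """Candidate inverse: positive z comes from 2z, non-positive z from 1 - 2z."""
    return 2 * z if z > 0 else 1 - 2 * z

first_values = [f(n) for n in range(1, 8)]
```

The first values are 0, 1, −1, 2, −2, 3, −3, so f walks outward from 0, alternating signs.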

Definition 1.4.4. A sequence of real numbers is defined to be a map f from N into R. If f(n) = x_n, then we write the sequence as (x_1, x_2, . . . , x_n, . . .), where x_n is called the nth term of the sequence (x_n).

If A is countable, then there is a bijection from N onto A. Thus we can say that a set A is countable if and only if A is the range of a sequence of distinct terms, i.e., A = {x_n : n = 1, 2, . . .}.

Theorem 1.4.5. Every infinite subset of a countable set is countable.

Proof. Let A be a countable set, and let B ⊆ A be infinite. Let A = {x_n : n = 1, 2, . . .}. Let n_1 be the smallest positive integer such that x_{n_1} ∈ B. Proceed like this: choose n_k to be the smallest integer greater than n_{k−1} such that x_{n_k} ∈ B.

Define f : N → B as f(k) = x_{n_k}. Then by the choice of the subsequence, any x ∈ B will be x_{n_k} for some k, which shows f is onto. Since all x_n’s are distinct, all x_{n_k}’s are distinct. Thus if k ≠ p, then x_{n_k} ≠ x_{n_p}. So f is one-to-one. Thus f is a bijection, and B is countable.

Definition 1.4.6. Let A be a set, and suppose that for every element α ∈ A there is a set E_α. We say {E_α : α ∈ A} is a collection of sets or family of sets. Such a set A is called an index set.

The union of a family of sets is defined to be the set S such that x ∈ S if and only if x ∈ E_α for some α ∈ A; written as

⋃_{α∈A} E_α.

If A = N, then we write ⋃_{n=1}^∞ E_n.

The intersection of a family of sets is defined to be the set P such that x ∈ P if and only if x ∈ E_α for all α ∈ A; written as

⋂_{α∈A} E_α.

If A = N, then we write ⋂_{n=1}^∞ E_n.

Example: Let A = {x ∈ R : 0 < x ≤ 1}. For every x ∈ A, let E_x = {y ∈ R : 0 < y < x}.

If x, z ∈ A, then x < z if and only if E_x ⊂ E_z.

⋃_{x∈A} E_x = E_1.

⋂_{x∈A} E_x = ∅.

Theorem 1.4.7. A countable union of countable sets is countable. That is, if (E_n), n = 1, 2, . . ., is a sequence of countable sets, then S = ⋃_{n=1}^∞ E_n is countable.

Proof. For every n ∈ N, let E_n = {x_{nk} : k = 1, 2, . . .}. Arrange the elements of S in an array as follows:

x11 x12 x13 x14 . . .

x21 x22 x23 x24 . . .

x31 x32 x33 x34 . . .

x41 x42 x43 x44 . . .

. . .

The array shows that the elements of S can be arranged in a sequence

x11; x21, x12; x31, x22, x13; x41, x32, x23, x14; . . . .

Omitting the elements which repeat gives a subsequence (x_p) of the above sequence. This shows that S ∼ N, and hence S is countable.
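The walk through the array along its anti-diagonals can be written as a generator of index pairs (n, k). The sketch below fixes one consistent order (the row index n decreases along each diagonal); the name `diagonal_indices` is hypothetical.

```python
from itertools import islice

def diagonal_indices():
    """Yield index pairs (n, k), n, k >= 1, along successive anti-diagonals n + k = const."""
    total = 2
    while True:
        for n in range(total - 1, 0, -1):   # row index decreases along each diagonal
            yield (n, total - n)
        total += 1

first = list(islice(diagonal_indices(), 6))
```

Every pair (n, k) is reached after finitely many steps, namely on the diagonal n + k, which is what makes the arrangement a genuine sequence.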

Corollary 1.4.8. The set of all rationals is countable.

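Proof. We can write

Q = {m/n : m ∈ Z, n ∈ N} = ⋃_{m∈Z} {m/n : n ∈ N}.

For a fixed m ≠ 0, the set {m/n : n ∈ N} is countable, since the map f(n) = m/n is a bijection from N onto it (easy to prove!); for m = 0 the set is just {0}. We also saw earlier that Z is countable. So Q \ {0} is a countable union of countable sets, and thus, by the above theorem, it is countable. Adding the single point 0 to a countable set leaves it countable, so Q is countable.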
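An explicit enumeration of Q without repetitions can be sketched by walking the diagonals m + n = const over positive m, n and keeping only fractions in lowest terms. This is one possible enumeration under these assumptions, not the one implied by the proof; the name `rationals` is hypothetical.

```python
from fractions import Fraction
from itertools import islice
from math import gcd

def rationals():
    """Yield every rational number exactly once."""
    yield Fraction(0)
    total = 2
    while True:
        for m in range(1, total):
            n = total - m
            if gcd(m, n) == 1:          # skip repeats such as 2/4 = 1/2
                yield Fraction(m, n)
                yield Fraction(-m, n)
        total += 1

first = list(islice(rationals(), 7))
```

The first terms are 0, 1, −1, 1/2, −1/2, 2, −2.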

Theorem 1.4.9. Let A be the set of all sequences whose elements are the digits 0 and 1. Then A is uncountable.

Proof. Let E be a countable subset of A, i.e., E = {s1, s2, s3, . . . : si ∈ A}. Define a sequence s such that the nth term of s is 1 if the nth term of sn is 0, and vice versa. Then s ∉ E, but s ∈ A. Thus E is a proper subset of A. Since E was arbitrary, we have shown that every countable subset of A is a proper subset of A. Thus A cannot be countable, since A cannot be a proper subset of itself.
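The diagonal construction can be illustrated on a finite truncation: given n binary sequences, each cut off after n terms, flip the ith term of the ith sequence. This is only a finite sketch of the idea, and the name `diagonal_flip` is hypothetical.

```python
def diagonal_flip(rows):
    """Given n binary sequences truncated to n terms, return a sequence
    whose i-th term differs from the i-th term of rows[i]."""
    return [1 - rows[i][i] for i in range(len(rows))]

rows = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]
s = diagonal_flip(rows)
```

Because s disagrees with each listed sequence in at least one position, it cannot appear anywhere in the list.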

Remark: The above theorem can be used to show that the set of real numbers, R, is uncountable. To see this, first recall that every real number has a binary expansion. For example, in base 2,

1001 = 1 × 2³ + 0 × 2² + 0 × 2¹ + 1 × 2⁰ = 9,

.01 = 0 × 2⁻¹ + 1 × 2⁻² = 1/4 = .25,

1001.01 = 9.25.

Note that .01 = .010000000000 . . .. Thus every real number x between 0 and 1 can be associated with a sequence of 0’s and 1’s (not necessarily unique though!) using a binary expansion of x.

Let T be the set of all sequences of 0’s and 1’s. Without getting into the details, let us take it on faith that we can define a bijection from T onto a subset S of the interval (0, 1), making T ∼ S. Thus S is an uncountable subset of R. This implies that R must be uncountable, because if R were countable, then S would also be countable.
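The arithmetic of finite binary expansions can be checked with exact fractions. The helper below is a small sketch that handles only finite strings of 0’s and 1’s with an optional binary point; the name `binary_to_fraction` is hypothetical.

```python
from fractions import Fraction

def binary_to_fraction(text):
    """Convert a finite binary string such as '1001.01' to an exact rational."""
    whole, _, frac = text.partition(".")
    value = Fraction(int(whole, 2)) if whole else Fraction(0)
    for i, bit in enumerate(frac, start=1):
        value += Fraction(int(bit), 2 ** i)   # digit i after the point has weight 2^(-i)
    return value

x = binary_to_fraction("1001.01")
```

Here x equals 37/4, i.e., 9.25, in agreement with the worked example.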

Chapter 2

Metric Spaces

2.1 Introduction

A metric space can be thought of as a generalization of the real line to an abstract space which has a notion of length or distance between two elements, similar to that on R. The main concepts and results one studies in an undergraduate real analysis course are

• convergence of a sequence in R,

• Cauchy sequences in R,

• continuity of a function of R,

• every Cauchy sequence converges,

• a function f is continuous if and only if xn → x implies f(xn) → f(x).

We will see that we can generalize these concepts and several results to this new abstract space. In particular, we generalize them to R², R³, . . . , Rⁿ, . . .. Thus we will develop a better understanding of the concepts in higher dimensions, as well as revisit them on R.

Terminology: X will denote a set, not necessarily R or Rⁿ, although R and Rⁿ are the spaces of most interest. Also, we will call x ∈ X a ‘point’ in X, by analogy with a geometrical point in R or Rⁿ. For ease of understanding, most of the time we will confine ourselves to examples on R, R² or R³.

2.2 Basic definitions and examples

Definition 2.2.1. A metric on a set X is a function d : X × X → [0, ∞), (x, y) ↦ d(x, y), which satisfies the following properties for x, y, z ∈ X:

(i) d(x, y) = 0 ⇐⇒ x = y,

(ii) Symmetry: d(x, y) = d(y, x),

(iii) Triangle inequality: d(x, y) ≤ d(x, z) + d(z, y).

The set X together with the metric d is called a metric space, written (X, d).

Examples:

1. Let d : R × R → [0, ∞) be d(x, y) = |x − y|. Then (R, d) [also written as (R, |·|)] is a metric space.

It is easy to see that d(x, y) satisfies properties (i) and (ii). Property (iii) follows from the triangle inequality for the absolute value as follows:

|x − y| = |x + (−z + z) − y| = |(x − z) + (z − y)| ≤ |x − z| + |z − y|.

2. Consider R² and x⃗ ∈ R², so x⃗ = (x1, x2). We define the length of a vector x⃗ as ‖x⃗‖ = √(x1² + x2²).

Let d : R² × R² → [0, ∞) be

d(x⃗, y⃗) = ‖x⃗ − y⃗‖ = √((x1 − y1)² + (x2 − y2)²).

Then (R², d) is a metric space.

Properties (i) and (ii) are easy. To prove property (iii), one needs to first prove the following, and then use the same trick as in Example 1 of adding and subtracting z⃗.

Show that ‖x⃗ + y⃗‖ ≤ ‖x⃗‖ + ‖y⃗‖.

The distance defined on R² (and in particular on R) can be generalized to Rⁿ, and is called the Euclidean distance or metric.

3. If (X, d) is a metric space and Y ⊂ X, then (Y, d) is also a metric space. Thus Q, N and Z are metric spaces in their own right with the Euclidean metric, d(x, y) = |x − y|.

4. If (X, d) is a metric space, and Dn(x, y) = n d(x, y), then (X, Dn) is a metric space for all n ∈ N.

Thus if there is one metric on X, then there are infinitely many metrics on X.

5. If (X, d) is a metric space, and D(x, y) = d(x, y)/(1 + d(x, y)), then (X, D) is a metric space.
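The metric axioms can be spot-checked numerically on a finite set of points. Such a check is of course not a proof, only a sanity test; the helper names (`euclid`, `scaled`, `bounded`, `satisfies_axioms`), the sample points and the tolerance are all assumptions made for this sketch.

```python
from itertools import product
from math import hypot

def euclid(x, y):
    """Euclidean distance on R^2 (Example 2)."""
    return hypot(x[0] - y[0], x[1] - y[1])

def scaled(d, n):
    """The metric n*d of Example 4."""
    return lambda x, y: n * d(x, y)

def bounded(d):
    """The metric d/(1 + d) of Example 5."""
    return lambda x, y: d(x, y) / (1 + d(x, y))

def satisfies_axioms(d, points, tol=1e-12):
    """Check properties (i)-(iii) of a metric on all triples from a finite point set."""
    for x, y, z in product(points, repeat=3):
        if (d(x, y) == 0) != (x == y):
            return False
        if abs(d(x, y) - d(y, x)) > tol:
            return False
        if d(x, y) > d(x, z) + d(z, y) + tol:
            return False
    return True

points = [(0, 0), (1, 0), (2, 0), (0, 1), (1, 1), (2, -1)]
results = [satisfies_axioms(d, points) for d in (euclid, scaled(euclid, 3), bounded(euclid))]
```

By contrast, the squared distance `lambda x, y: euclid(x, y) ** 2` fails the check on these points, because the triangle inequality breaks for (0, 0), (1, 0), (2, 0).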

Definition 2.2.2. A ball in a metric space (X, d) is a set defined as

Br(a) = {x ∈ X : d(x, a) < r},

where a is called its center and r > 0 its radius.

Note:

(i) The empty set ∅ and the whole space X may be considered balls with r = 0 and r = ∞, respectively.

(ii) A ball need not be ‘round’ !!

Examples:

1. In (R, |·|), Br(a) = {x ∈ R : |x − a| < r} = (a − r, a + r).

Also, any open interval (a, b) is a ball centered at (a + b)/2 with radius (b − a)/2.

2. In R² with the Euclidean metric,

Br(a⃗) = {x⃗ ∈ R² : ‖x⃗ − a⃗‖ < r}

is a circular disk with center a⃗ and radius r, without the circular perimeter.

3. In R³ with the Euclidean metric, a ball Br(a⃗) is a solid sphere centered at a⃗ with radius r, without the outer surface.

4. If we consider R² with the metric d(x⃗, y⃗) = max{|x1 − y1|, |x2 − y2|}, where x⃗ = (x1, x2) and y⃗ = (y1, y2), then a ball in R² is a square without the outer perimeter.

5. Since (R, |·|) is a metric space and N ⊆ R, (N, |·|) is a metric space. In this metric, B_{1/2}(m) = {m} and, for m ≥ 2, B_2(m) = {m − 1, m, m + 1}.
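Balls in a subspace such as N can be computed by brute force on a finite window. The sketch below restricts N to {1, . . . , 10} purely for illustration; the helper `ball` and the variable names are hypothetical.

```python
from fractions import Fraction

def ball(space, d, center, radius):
    """Return the open ball {x in space : d(x, center) < radius} for a finite space."""
    return {x for x in space if d(x, center) < radius}

window = range(1, 11)                 # a finite piece of N
dist = lambda x, y: abs(x - y)        # the metric inherited from R

small = ball(window, dist, 5, Fraction(1, 2))
large = ball(window, dist, 5, 2)
edge = ball(window, dist, 1, 2)
```

The ball of radius 1/2 around 5 is the singleton {5}, the ball of radius 2 is {4, 5, 6}, and around m = 1 the ball of radius 2 is only {1, 2}, since 0 is not in N.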

Definition 2.2.3. Let A ⊂ X. A point x ∈ A is called an interior point of A if

∃ ε > 0 such that Bε(x) ⊆ A.

We also say, in this case, that A is a neighborhood of x.

A point x ∈ X (x not in A) is called an exterior point of A if

∃ ε > 0 such that Bε(x) ⊆ X \ A.

A boundary point of A is a point which is neither interior nor exterior.

Examples: In (R, |·|), let A = [0, 1]. Then all points 0 < x < 1 are interior points. The exterior points are all points in (−∞, 0) ∪ (1, ∞). Thus 0 and 1 are the boundary points.

Definition 2.2.4. A set A in X is open if all its points are interior points, i.e., for every x ∈ A, ∃ ε > 0 such that Bε(x) ⊆ A.

Examples:

1. The empty set is open.

2. X is open, because every ball around any point of X is a subset of X.

3. (a, b) is open in R, but [a, b] and {a} are not.

Theorem 2.2.5. All balls are open sets in X.

Proof. Let Br(a) be a ball in X, and x ∈ Br(a). Then d(x, a) < r. Let ε := r − d(x, a), which is positive.

Claim: Bε(x) ⊆ Br(a).

Let y ∈ Bε(x). Then d(y, x) < ε, so

d(y, a) ≤ d(y, x) + d(x, a) < ε + d(x, a) = r.

So y ∈ Br(a).

Note: The converse of the above theorem does not hold, i.e., not every open set is a ball. For example, (1, 2) ∪ (4, 5) is an open set but not a ball in R.

Theorem 2.2.6. A set is open if and only if it is a union of balls.

Proof. Let A be an open set in X. Then,

for each x ∈ A, ∃ ε_x > 0 such that B_{ε_x}(x) ⊆ A.

So, A = ⋃_{x∈A} {x} ⊆ ⋃_{x∈A} B_{ε_x}(x) ⊆ A.

Hence A = ⋃_{x∈A} B_{ε_x}(x).

Conversely, let A = ⋃_α Bα, where each Bα is a ball. If x ∈ A, then x ∈ Bα0 for some α0. Since balls are open, there exists ε > 0 such that Bε(x) ⊆ Bα0 ⊆ A.

Theorem 2.2.7. The following are two basic properties of open sets:

(i) Any union of open sets is open.

(ii) Any finite intersection of open sets is open.

Proof. (i) Let {Aα : α ∈ I} be a family of open sets, and B = ⋃_{α∈I} Aα. Let x ∈ B. Then x ∈ Aα0 for some α0 ∈ I. Since Aα0 is open,

∃ ε > 0 such that Bε(x) ⊆ Aα0 ⊆ ⋃_{α∈I} Aα = B.

(ii) Let B = ⋂_{i=1}^n Ai, where Ai is open in X for all i = 1, 2, . . . , n. Let x ∈ B. Then x ∈ Ai for all i = 1, 2, . . . , n. Since each Ai is open, for each i, ∃ εi > 0 such that Bεi(x) ⊂ Ai.

Let ε := min{εi : i = 1, 2, . . . , n}. Then for each i, Bε(x) ⊆ Bεi(x) ⊆ Ai. Thus Bε(x) ⊆ ⋂_{i=1}^n Ai = B.

Examples: (a, ∞) is open in R, since (a, ∞) = ⋃_{n∈N} (a, a + n), and each (a, a + n) is a ball in R. Similarly, (−∞, b) is open in R.

Counterexample: An arbitrary intersection of open sets need not be open. For each n ∈ N, let An = (−1/n, 1/n). Then the An’s are open in R, but ⋂_{n∈N} An = {0}, which is not open in R.

Can you think of an example in R² (with the Euclidean metric) of infinitely many open sets whose intersection is not open?

Remark: If X is a metric space and Y ⊆ X, then a set A in Y may be open in Y but not open in X. For example, let X = R, Y = N, and A = {5}. Then A is open in N, since {5} = B_{1/2}(5) in N. But A is not open in R.

Theorem 2.2.8. Distinct points in a metric space X can be separated by disjoint balls. That is, if x, y ∈ X, x ≠ y, then ∃ r > 0 such that Br(x) ∩ Br(y) = ∅.

Proof. Let x ≠ y. Then d(x, y) > 0. Let r = d(x, y)/2. We claim that Br(x) ∩ Br(y) = ∅.

Suppose z ∈ Br(x) ∩ Br(y). Then d(z, x) < r and d(z, y) < r. So

d(x, y) ≤ d(x, z) + d(z, y) < r + r = 2r = d(x, y)

⇒ d(x, y) < d(x, y),

which is a contradiction. Hence Br(x) ∩ Br(y) = ∅.

Remark: A space which satisfies the property in the above theorem is called a Hausdorff space. Thus metric spaces are Hausdorff.

Definition 2.2.9. A set F in X is called closed if X \ F is open in X.

Examples:

1. X and ∅ are closed, since X = X \ ∅ and ∅ = X \ X.

2. In R, [a, b] is closed, since [a, b] = R \ ((−∞, a) ∪ (b, ∞)).

3. {a} is closed in R for all a ∈ R, since {a} = R \ ((−∞, a) ∪ (a, ∞)).

4. (a, b] is neither open nor closed in R.

5. {m} is both open and closed in N.

Proposition 2.2.10. Let x ∈ X, then {x} is closed in X, that is, singletons are closed sets.

Proof. We need to prove that A = X \ {x} is an open set in X. Let y ∈ A. Then y ≠ x. Since X is Hausdorff, ∃ r > 0 such that Br(x) ∩ Br(y) = ∅.

We claim that Br(y) ⊂ A. Let z ∈ Br(y). Then clearly z ∈ X, since Br(y) ⊆ X. Further, z ∉ {x}, i.e., z ≠ x. To see this, suppose z = x. Then z ∈ Br(x). We also know z ∈ Br(y), which means z ∈ Br(x) ∩ Br(y). But this is not possible, since Br(x) ∩ Br(y) = ∅. So x ≠ z. This proves our claim.

Theorem 2.2.11. The following are two basic properties of closed sets:

(i) A finite union of closed sets in X is closed in X,

(ii) Any intersection of closed sets in X is closed in X.

Proof. (i) Let {Fi : i = 1, 2, . . . , n} be a finite collection of closed sets. Then each X \ Fi is open in X, hence ⋂_{i=1}^n (X \ Fi) is open in X (by Theorem 2.2.7). Since

X \ (⋃_{i=1}^n Fi) = ⋂_{i=1}^n (X \ Fi),

⋃_{i=1}^n Fi is closed in X.

(ii) Let {Fα : α ∈ I} be an arbitrary collection of closed sets. Then each X \ Fα is open in X, hence ⋃_{α∈I} (X \ Fα) is open in X (by Theorem 2.2.7). Since

X \ (⋂_{α∈I} Fα) = ⋃_{α∈I} (X \ Fα),

⋂_{α∈I} Fα is closed in X.

Corollary 2.2.12. Any finite subset of X is closed in X.

Proof. Let F = {x1, x2, . . . , xn}. Then F = ⋃_{i=1}^n {xi}. Since each {xi} is closed in X (by Proposition 2.2.10), F is closed in X by the above theorem.

Proposition 2.2.13. A closed ball C = {x ∈ X : d(x, a) ≤ r}, a ∈ X, r > 0, is a closed set in X.

Proof is left as an exercise.

Definition 2.2.14. A point x is a limit point of a set A if

for each ε > 0, ∃ y ≠ x such that y ∈ A ∩ Bε(x);

that is, every open ball around x contains a point of A which is different from x.

A point in A which is not a limit point is called an isolated point of A.

In other words, a limit point is a point which cannot be separated or isolated from the set (using open balls). They lie either in the interior or on the boundary of the set. Limit points are also called cluster points or accumulation points.

Examples:

1. In R, if A = (a, b), its limit points are the points of [a, b].

2. In R, if A = [a, b], its limit points are the points of [a, b].

3. In R, if A = {1/n : n ∈ N}, its only limit point is 0.

4. In N, no subset A ⊆ N has a limit point.

5. Any finite set in R has no limit points.

Theorem 2.2.15. A set is closed if and only if it contains all its limit points.

Proof. Let F be a closed set and x be a limit point of F. Suppose x ∉ F. Then x ∈ X \ F, which is open. So ∃ ε > 0 such that Bε(x) ⊆ X \ F. But this means that Bε(x) ∩ F = ∅, which is a contradiction, since x is a limit point of F.

Conversely, suppose F is a set containing all its limit points, and let x ∈ X \ F (i.e., x ∉ F). Then x is not a limit point of F. Therefore,

∃ ε > 0 such that ∀ y ≠ x, y ∉ F ∩ Bε(x)

⇒ F ∩ Bε(x) ⊆ {x}, and since x ∉ F, F ∩ Bε(x) = ∅

⇒ x ∈ Bε(x) ⊆ X \ F ⇒ X \ F is open.

Hence F is closed.

Definition 2.2.16. Closure of a set A, denoted Ā, is defined to be the set A together with its limit points;

Ā = A ∪ {limit points of A}.

Some facts about closure of sets:

• Ā is the smallest closed set containing A, i.e., if F is any closed set containing A, then Ā ⊆ F.

• A is closed if and only if A = Ā.

Thus, A = {1/n : n ∈ N} is not closed since Ā = A ∪ {0}.

• If A ⊆ B, then Ā ⊆ B̄.

Converse is not true. For example, let A = [0, 1] and B = (0, 1), then Ā = B̄ = [0, 1], but A ⊈ B.

• The closure of A ∪ B equals Ā ∪ B̄.

• The closure of A ∩ B is contained in Ā ∩ B̄.

Equality does not hold, in general. For example, let A = (1, 2) and B = (2, 3), then Ā ∩ B̄ = {2}, but A ∩ B = ∅, whose closure is ∅.


2.3 Convergence in metric spaces

A sequence in a metric space X is a function from N into X, x(n) = xn, written as (xn).

Definition 2.3.1. A sequence (xn) in X converges to a point x in X if

for each ε > 0, ∃ N ∈ N such that xn ∈ Bε(x) ∀ n ≥ N,

or equivalently,

for each ε > 0, ∃ N ∈ N such that d(xn, x) < ε ∀ n ≥ N.

We write xn → x as n → ∞, or simply xn → x, or lim_{n→∞} xn = x.

Note:

1. In other words, any open ball centered at x contains all the terms of the sequence, except possibly finitely many (namely some of the first N − 1 terms).

2. N above depends upon ε. Therefore, we sometimes also write it as N(ε) or Nε.

3. Convergence in R: Since in R, d(xn, x) = |xn − x|, it is clear that the above definition generalizes the notion of convergence in R:

(xn) converges in R to x ⇐⇒ for each ε > 0, ∃N ∈ N such that |xn − x| < ε ∀ n ≥ N.

Theorem 2.3.2. Limit of a sequence is unique.

Proof. Let (xn) converge to x and y in X. Let ε > 0, then for ε/2 > 0,

since xn → x, ∃ N1 such that d(xn, x) < ε/2 ∀ n ≥ N1;

since xn → y, ∃ N2 such that d(xn, y) < ε/2 ∀ n ≥ N2.

Let N = max{N1, N2}, then n ≥ N implies n ≥ N1, N2. Let n ≥ N , then

d(x, y) ≤ d(x, xn) + d(xn, y) < ε/2 + ε/2 = ε.

Since ε > 0 was arbitrary, d(x, y) = 0, which implies, x = y.

An alternate way to prove the above result is:

Proof. Let (xn) converge to x and y in X, and suppose x ≠ y. By the Hausdorff property, ∃ ε > 0 such that Bε(x) ∩ Bε(y) = ∅.

Since xn → x, ∃ N1 such that xn ∈ Bε(x) ∀ n ≥ N1;

since xn → y, ∃ N2 such that xn ∈ Bε(y) ∀ n ≥ N2.

Let n0 ≥ max{N1, N2}, then xn0 ∈ Bε(x) and xn0 ∈ Bε(y), which contradicts Bε(x) ∩ Bε(y) = ∅. Thus x = y.


Theorem 2.3.3. Let (xn) be a sequence in A and xn → x, then x ∈ Ā.

Proof. If x ∈ A, then x ∈ Ā. If (xn) is ‘eventually’ a constant sequence, i.e., xn = c for all n ≥ N for some N, then x = c ∈ A ⊂ Ā. Suppose x ∉ A and (xn) is not eventually a constant sequence. Now

xn → x ⇒ for any ε > 0 ∃ N such that xn ∈ Bε(x) ∀ n ≥ N.

But this implies that every open ball centered at x contains points of A, namely the xn’s, which are different from x because x ∉ A. Hence x is a limit point of A, so x ∈ Ā.

Corollary 2.3.4. If (xn) is a convergent sequence in F, and F is a closed set, then lim_{n→∞} xn ∈ F.

Proof. This follows from the above theorem and the fact that F = F̄.

Theorem 2.3.5. If x is a limit point of a set A in X, then ∃ a sequence (xn) in A such that xn → x.

Proof. Construction of (xn): For each n ∈ N, let εn = 1/n. Since x is a limit point of A, for each εn > 0, ∃ xn ∈ Bεn(x) ∩ A, xn ≠ x. Thus, d(xn, x) < εn = 1/n.

Let ε > 0, then by the Archimedean property of R, ∃ N ∈ N such that 1/N < ε. If n ≥ N, then

d(xn, x) < 1/n ≤ 1/N < ε.

Thus xn → x.

Convergence in R

In R with the Euclidean metric, |.|, Bε(x) = (x− ε, x+ ε), and

y ∈ (x− ε, x+ ε) ⇐⇒ x− ε < y < x+ ε ⇐⇒ −ε < y − x < +ε ⇐⇒ |x− y| < ε.

Keeping these in mind, we will consider the following as the definition of convergence in R.

Definition 2.3.6. A sequence (xn) in R converges to x ∈ R if for every ε > 0 ∃ N such that xn ∈ (x − ε, x + ε) (or |xn − x| < ε) for all n ≥ N.

Recall: Archimedean Property of R: This property has various equivalent forms, and it says that in R there is no infinitely small or infinitely large number:

- For every x > 0 (however small), ∃ N such that 1/N < x.

- For every x ∈ R (however large), ∃ N such that x < N .


We will use the Archimedean property in the ε-N proofs below, to find N corresponding to a given ε. (The real number x will be in terms of ε.)

Example 2.3.7. 1. lim_{n→∞} 1/n = 0.

Let ε > 0, then by the Archimedean Property of R, ∃ N such that 1/N < ε.

If n ≥ N, then 1/n ≤ 1/N < ε.

That is, ∃ N such that |1/n − 0| < ε for all n ≥ N.

2. lim_{n→∞} 1/(n + 1)² = 0.

Let ε > 0, and consider

|1/(n + 1)² − 0| = 1/(n + 1)² < 1/n² ≤ 1/n.

Since 1/n → 0, ∃ N such that 1/n < ε for all n ≥ N. Then for n ≥ N, by above,

|1/(n + 1)² − 0| < 1/n < ε.

3. The sequence (1, 0, 1, 0, 1, . . .) does not converge in R.

Suppose the above sequence converges to a point a ∈ R. Let ε = 1/2, then ∃ N such that |xn − a| < 1/2 for all n ≥ N.

Since both 0 and 1 occur as terms xn with n ≥ N, in particular |0 − a| < 1/2 and |1 − a| < 1/2, and

1 = |0 − 1| ≤ |0 − a| + |a − 1| < 1/2 + 1/2 = 1.

This is a contradiction.

4. If 0 < b < 1, then lim_{n→∞} b^n = 0.

Let ε > 0, then we want to show: ∃ N ∈ N such that |b^n − 0| < ε for all n ≥ N.

Consider |b^n − 0| < ε ⇐⇒ b^n < ε ⇐⇒ n ln b < ln ε ⇐⇒ n > ln ε / ln b (since ln b < 0).

By the Archimedean property, for x = ln ε / ln b, we can choose an N ∈ N such that ln ε / ln b < N.

Then if n ≥ N, we have n > ln ε / ln b, which implies that |b^n − 0| < ε.
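
The choice of N in item 4 can be checked numerically. The following is a minimal sketch, not part of the original notes: the function name `smallest_N` and the sample values of b and ε are assumptions made for illustration, and floating-point logarithms are assumed accurate enough for these values.

```python
import math

def smallest_N(b: float, eps: float) -> int:
    """Return a natural number N with N > ln(eps)/ln(b), for 0 < b < 1 (hypothetical helper)."""
    if not (0 < b < 1) or eps <= 0:
        raise ValueError("need 0 < b < 1 and eps > 0")
    x = math.log(eps) / math.log(b)
    # Archimedean property: pick a natural number strictly larger than x.
    return max(1, math.floor(x) + 1)

b, eps = 0.5, 1e-3          # sample values, chosen only for illustration
N = smallest_N(b, eps)
# Every term b^n with n >= N should lie within eps of 0 (checked on a finite range).
tail_ok = all(b**n < eps for n in range(N, N + 50))
print(N, tail_ok)
```

Only finitely many terms can be tested this way; the ε-N argument above is what covers all n ≥ N.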

Proposition 2.3.8. Let a ∈ R, then there exists a sequence of rationals (qn) and a sequence of irrationals (rn) such that qn → a and rn → a, as n → ∞.


Proof. Let a ∈ R. We first prove that there exists a sequence of rationals (qn) such that qn → a.

Step 1: (Construction of the rational sequence) Consider B1(a) = (a − 1, a + 1), then since Q is dense in R, ∃ q1 ∈ B1(a). Consider B_{1/2}(a) = (a − 1/2, a + 1/2), then since Q is dense in R, ∃ q2 ∈ B_{1/2}(a). Continuing this way, we get the nth sequence term qn ∈ Q, such that qn ∈ B_{1/n}(a).

Step 2: (qn → a). Let ε > 0, then since 1/n → 0, ∃ N such that 1/n < ε for all n ≥ N. For n ≥ N, since qn ∈ B_{1/n}(a), |qn − a| < 1/n < ε. Thus qn → a.

Similarly using the fact that irrationals are dense in R, the second part can be proved.

Complete Metric Spaces

Definition 2.3.9. A sequence (xn) in X is Cauchy if

For each ε > 0 ∃ N ∈ N such that d(xn, xm) < ε ∀ n,m ≥ N.

Thus a Cauchy sequence is a sequence in which the terms keep getting closer to each other.

Theorem 2.3.10. Every convergent sequence is Cauchy.

Proof. Let (xn) be a sequence in X and xn → x. Let ε > 0. Then ∃ N ∈ N such that d(xn, x) < ε/2 for all n ≥ N.

Let m, n ≥ N, then

d(xm, xn) ≤ d(xm, x) + d(x, xn) < ε/2 + ε/2 = ε.

Remark: The converse of the above theorem is not true in a general metric space, that is, a Cauchy sequence may not be convergent. For example:

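Let X = Q. Consider the recursively defined sequence (xn) in Q given by x_{n+1} = xn/2 + 1/xn, with x1 = 1. If (xn) converges to x then x must satisfy x = x/2 + 1/x, but this means x² = 2. So x cannot be in Q (Proposition 1.1.3). Thus, (xn) does not converge in Q. On the other hand, (xn) is Cauchy: viewed as a sequence in R it converges (to √2), and every convergent sequence is Cauchy.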
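
The first few terms of this sequence can be computed exactly with rational arithmetic. The sketch below is an illustration only; the function name `iterate` is hypothetical, and the standard-library `fractions.Fraction` type is used so that every term really is an element of Q.

```python
from fractions import Fraction

def iterate(steps: int):
    """Return [x1, x2, ..., x_{steps+1}] for x_{n+1} = x_n/2 + 1/x_n with x1 = 1 (hypothetical helper)."""
    x = Fraction(1)
    terms = [x]
    for _ in range(steps):
        x = x / 2 + 1 / x   # exact rational arithmetic, so x stays in Q
        terms.append(x)
    return terms

terms = iterate(5)
for x in terms:
    # x*x - 2 shrinks rapidly but is never exactly 0 for a rational x
    print(x, float(x * x - 2))
```

The squares of the terms approach 2 very quickly, yet no term squares to exactly 2, which is consistent with the sequence being Cauchy in Q without having a limit in Q.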

Definition 2.3.11. A metric space is complete if every Cauchy sequence converges.

Some facts and remarks:

1. It follows that the set of rational numbers Q is not complete. But the set of real numbers R is complete. This is the main reason why we use the real numbers rather than the rationals in most applications of calculus.

2. X = (0, 1) is not complete, since xn = 1/n is a sequence in X which is Cauchy but not convergent in X.


3. Every metric space X can be completed, i.e., there is a complete metric space X̃ which contains X; for example Q̃ = R, and the completion of (0, 1) is [0, 1].

4. R is complete, that is a sequence in R is Cauchy if and only if it is convergent.

5. In a complete metric space X, any subset F is complete if and only if F is closed in X.


2.4 Continuity

Some set theory facts we will use in this section:

Let f : X → Y, A1, A2, A ⊆ X and B1, B2, B ⊆ Y, then

(i) A1 ⊆ A2 ⇒ f(A1) ⊆ f(A2),

(ii) B1 ⊆ B2 ⇒ f⁻¹(B1) ⊆ f⁻¹(B2),

(iii) A ⊆ f⁻¹[f(A)], equality holds if f is one-one,

(iv) f[f⁻¹(B)] ⊆ B, equality holds if f is onto,

(v) f⁻¹[Y \ B] = X \ f⁻¹(B).

In this section, X, Y will mean (X, dX) and (Y, dY ).

Definition 2.4.1. Let (X, dX) and (Y, dY) be metric spaces, and f : X → Y. Then f is continuous at x ∈ X if

∀ ε > 0, ∃ δ > 0 such that, for all y ∈ X, dX(x, y) < δ ⇒ dY(f(x), f(y)) < ε.

Since d(a, b) < r ⇐⇒ b ∈ Br(a), this can be rewritten as

∀ε > 0, ∃δ > 0 such that y ∈ Bδ(x) ⇒ f(y) ∈ Bε(f(x))

which can be further rewritten as

∀ε > 0, ∃δ > 0 such that f [Bδ(x)] ⊆ Bε(f(x)).

We say f is continuous on X if f is continuous at each x ∈ X.

We say f is discontinuous at x if f is not continuous at x, i.e.,

∃ ε > 0 such that ∀ δ > 0 ∃ y ∈ Bδ(x) but f(y) ∉ Bε(f(x)) [i.e., f[Bδ(x)] ⊈ Bε(f(x))].

The following result characterizes continuous functions as the functions which pull open sets back to open sets. In general topology, this is taken as the definition of a continuous function.

Theorem 2.4.2. Let f : X → Y, then f is continuous ⇐⇒ for every open set V in Y, f⁻¹(V) is open in X.

Proof. Let f be a continuous function and let V be an open set in Y. We want to show that U := f⁻¹(V) is open in X. Let x be any point in U. Then f(x) ∈ V, which is open. Hence

∃ ε > 0 such that f(x) ∈ Bε(f(x)) ⊆ V.

Since f is continuous, for ε > 0 above,

∃ δ > 0 such that f [Bδ(x)] ⊆ Bε(f(x)).


So,

f[Bδ(x)] ⊆ V ⇒ f⁻¹(f[Bδ(x)]) ⊆ f⁻¹(V) (∵ B1 ⊆ B2 ⇒ f⁻¹(B1) ⊆ f⁻¹(B2))

⇒ Bδ(x) ⊆ f⁻¹(V) = U. (∵ A ⊆ f⁻¹(f(A)))

Thus, ∃ δ > 0 such that Bδ(x) ⊆ f⁻¹(V) = U, that is, U is open in X.

Conversely, assume that for every open set V in Y, f⁻¹(V) is open in X.

Let x ∈ X and ε > 0. We want a δ > 0 such that f[Bδ(x)] ⊆ Bε(f(x)).

Let V = Bε(f(x)), then since Bε(f(x)) is open in Y, f⁻¹(V) = f⁻¹[Bε(f(x))] is open in X and contains x, that is,

∃ δ > 0 such that x ∈ Bδ(x) ⊆ f⁻¹[Bε(f(x))].

Now

Bδ(x) ⊆ f⁻¹[Bε(f(x))] ⇒ f[Bδ(x)] ⊆ f(f⁻¹[Bε(f(x))]) (∵ A1 ⊆ A2 ⇒ f(A1) ⊆ f(A2))

⊆ Bε(f(x)) (∵ f(f⁻¹(B)) ⊆ B.)

Thus, ∃ δ > 0 such that f[Bδ(x)] ⊆ Bε(f(x)), that is, f is continuous at x.

Note: A continuous function need not map open sets to open sets. For example, consider the constant function f : R → R, f(x) = 1. Then f is continuous on R, and f[(0, 2)] = {1}; (0, 2) is open in R but its image under f is {1}, which is not open in R.

Corollary 2.4.3. f is continuous ⇐⇒ for every closed set F of Y, f⁻¹(F) is closed in X.

Proof is left as an exercise.

Theorem 2.4.4 (Sequential Criterion for Continuity). Let f : X → Y, then
f is continuous ⇐⇒ xn → x ⇒ f(xn) → f(x) (or f(lim_{n→∞} xn) = lim_{n→∞} f(xn)).

Proof. Let f be a continuous function, and let (xn) be a sequence converging to x in X. We want to show: f(xn) → f(x) in Y. Let ε > 0, and consider Bε(f(x)) which is open in Y. Since f is continuous,

∃δ > 0 such that f [Bδ(x)] ⊆ Bε(f(x)).

Since xn → x, then for δ > 0 above,

∃N ∈ N such that n ≥ N ⇒ xn ∈ Bδ(x).

So, n ≥ N ⇒ xn ∈ Bδ(x) ⇒ f(xn) ∈ f[Bδ(x)] ⊆ Bε(f(x)).


Thus, ∃ N ∈ N such that n ≥ N ⇒ f(xn) ∈ Bε(f(x)), that is, f(xn) → f(x).

Conversely, suppose f is not continuous. Then there is a point x ∈ X such that

∃ ε > 0 such that ∀ δ > 0 f[Bδ(x)] ⊈ Bε(f(x)).

In particular, for each n, let δ = 1/n. Then

∃ ε > 0 such that ∀ n f[B_{1/n}(x)] ⊈ Bε(f(x)).

For each n ∈ N, let xn ∈ B_{1/n}(x) be such that f(xn) ∉ Bε(f(x)). Since xn ∈ B_{1/n}(x) ⇒ d(xn, x) < 1/n, we get xn → x. But f(xn) ↛ f(x), as d(f(xn), f(x)) ≥ ε for all n.

Thus we see that continuous functions preserve convergence.

Proposition 2.4.5. If f : X → Y and g : Y → Z are continuous, then so is g ◦ f : X → Z.

Proof. Let W be an open set in Z. We want to show that (g ◦ f)⁻¹[W] is open in X. Since g is continuous, g⁻¹(W) is open in Y. Since f is continuous, f⁻¹[g⁻¹(W)] is open in X. But f⁻¹[g⁻¹(W)] = (g ◦ f)⁻¹[W].

Note: If f : X → Y is continuous and f⁻¹ : Y → X exists, f⁻¹ may not be continuous.

Definition 2.4.6. A homeomorphism between two metric spaces X and Y is a map φ : X → Y such that

φ is bijective,
φ is continuous,
φ⁻¹ is continuous.

We say X is homeomorphic to Y and write X ∼ Y.

In other words, two spaces are homeomorphic when one can be obtained from the other by a continuous map with a continuous inverse; in effect, when convergence in one space is equivalent to convergence in the other.


Proposition 2.4.7. (a, b) is homeomorphic to (0, 1).

Proof. Define φ : (0, 1) → (a, b) as φ(t) = (b − a)t + a. Then φ is bijective, and φ⁻¹ : (a, b) → (0, 1) is φ⁻¹(t) = (t − a)/(b − a). Both φ and φ⁻¹ are linear functions, which are continuous (we will prove this using an ε − δ proof later).
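
As a quick sanity check of the two formulas in the proof, the sketch below composes φ and φ⁻¹ on a few sample points. The names `phi` and `phi_inv` and the endpoints a = 2, b = 5 are assumptions made only for this illustration; a numerical round trip is evidence for the algebra and is not a proof of bijectivity or continuity.

```python
import math

def phi(t: float, a: float, b: float) -> float:
    """Map (0, 1) onto (a, b): phi(t) = (b - a) t + a."""
    return (b - a) * t + a

def phi_inv(s: float, a: float, b: float) -> float:
    """Map (a, b) back onto (0, 1): phi_inv(s) = (s - a) / (b - a)."""
    return (s - a) / (b - a)

a, b = 2.0, 5.0                      # sample endpoints (assumed, with a < b)
samples = [0.1, 0.25, 0.5, 0.9]      # points of (0, 1)
round_trip_ok = all(
    a < phi(t, a, b) < b and math.isclose(phi_inv(phi(t, a, b), a, b), t)
    for t in samples
)
print(round_trip_ok)
```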


Continuity of functions on R

Definition 2.4.8. Let f : D → R, where D ⊂ R. Then f is continuous at a point c ∈ D if

for every ε > 0, ∃ δ > 0, such that |x− c| < δ ⇒ |f(x)− f(c)| < ε.

If f is continuous at all points in D, then it is said to be continuous on D.

So, f is not continuous (i.e., is discontinuous) at a point c if

∃ ε > 0 such that ∀ δ > 0, ∃ x ∈ D such that |x − c| < δ but |f(x) − f(c)| ≥ ε.

Example: Define f : (0, 4) → R by

f(x) = x if 0 < x < 1,
f(x) = 5 if x = 1,
f(x) = 2 if 1 < x < 4.

Then f is not continuous at x = 1. Take ε = 1. We show that there is no δ > 0 such that if x is in (0, 4) and x is in a δ-ball centered at 1, then f(x) is within 1 [= ε] of 5 [= f(1)].

Now |f(x) − 5| < 1 ⇒ 4 < f(x) < 6.

If 0 < x < 1, then f(x) = x and so 0 < f(x) < 1. If 1 < x < 4, then f(x) = 2.

So there is no open interval (i.e., ball) centered at 1 on which f(x) is within 1 of 5. Hence f is not continuous at x = 1.

Examples: Let’s see in class how to use this ε − δ definition to prove continuity of the following functions on their domains or at the given point.

1. f : R → R, f(x) = 2x − 9.

2. f : R → R, f(x) = x².

3. f : [0, ∞) → R, f(x) = √x.

4. f : R → R, f(x) = (x³ − 4)/(x² + 1), at x = 2.

5. f : R → R, f(x) = x sin(1/x) if x ≠ 0, and f(0) = 0.


Sequential Criterion for Continuity: A function f : D → R is continuous at c ∈ D if and only if for every sequence (xn) in D, xn → c ⇒ f(xn) → f(c).

Let us see how we use this criterion, to prove continuity or discontinuity of a function, with an example.

Example: Let f(x) = x if x ∈ Q, and f(x) = 0 if x ∈ R \ Q.

Prove that f is continuous only at x = 0.

We need to show continuity at x = 0, and prove f is discontinuous everywhere else.

Continuity at x = 0:

We will use the following lemma to show continuity at x = 0.

Lemma 2.4.9. Let f : D −→ R with c ∈ D. Then f is continuous at c, if the following two conditions hold:

(i) every sequence of rationals in D converging to c satisfies the sequential criterion;

(ii) every sequence of irrationals in D converging to c satisfies the sequential criterion.

So to use the lemma above to show continuity at x = 0, let (qn) and (rn) be sequences of rationals and irrationals respectively, such that qn → 0 and rn → 0. Then f(qn) = qn → 0 = f(0) and f(rn) = 0 → 0 = f(0). Hence f is continuous at 0.

Discontinuity at every x ≠ 0:

Let x ≠ 0.

Case 1: x ∈ Q

Since R \ Q is dense in R, we get a sequence (rn) in R \ Q such that rn → x (by Proposition 2.3.8).

If f were continuous at x, then f(rn) would have to converge to f(x) = x. But f(rn) = 0 → 0 and 0 ≠ x. This is a contradiction.

Case 2: x ∈ R \ Q

Since Q is dense in R, we get a sequence (qn) in Q such that qn → x (by Proposition 2.3.8). If f were continuous at x, then f(qn) would have to converge to f(x) = 0. But f(qn) = qn → x and x ≠ 0. This is a contradiction.

So in both cases, there is a sequence xn → x with f(xn) ↛ f(x); hence f is discontinuous at all x ≠ 0.


Uniform Continuity

We will spend almost all of this subsection discussing uniform continuity of functions on R. But to emphasize the fact that the notion of uniform continuity is defined for any metric space, we will start with a definition of a uniformly continuous function on a general metric space X.

Definition 2.4.10. A function f : X → Y is said to be uniformly continuous on X if for every ε > 0, ∃ δ > 0 such that, for all x, y ∈ X, dX(x, y) < δ ⇒ dY(f(x), f(y)) < ε.

Note: The concepts of uniform continuity of a function and continuity of a function are different in the following ways:

1. Uniform continuity is a property of a function on a set, whereas continuity can be defined at a single point. To ask whether a given function is uniformly continuous at a certain point is meaningless.

2. In the definition of continuity, the δ depends upon ε and the point c. But in uniform continuity, δ only depends upon ε, that is, for each ε > 0, we can find a δ > 0 which will work for all points x of X.

3. Every uniformly continuous function on X is continuous on X, but the converse is not true (in general). A counterexample is f(x) = x², which is continuous on R, but we will show shortly that it is not uniformly continuous on R.

We will now talk about uniform continuity of functions on R. Let f : D → R, where D ⊆ R. Then f is uniformly continuous on D if

for every ε > 0, ∃ δ > 0 such that, for all x, y ∈ D, |x− y| < δ ⇒ |f(x)− f(y)| < ε.

Thus, f : D → R is not uniformly continuous if

∃ ε > 0 such that ∀ δ > 0, ∃x, y ∈ D such that |x− y| < δ but |f(x)− f(y)| ≥ ε.

Let’s try to understand uniform continuity graphically: The left graph below shows that for a uniformly continuous function, given an ε > 0, one should be able to find a δ that works for all points c. If we get two different δ, the smaller of them will work for both points. The graph on the right shows a function which is not uniformly continuous: we cannot come up with one δ > 0 which will work for all points. Think about why we can’t find one δ which will work for all points. What is it about the right graph that will not allow it?


Examples: Following are examples of functions, some of which are uniformly continuous and some of which are not. We will see the proofs in class.

1. Let f(x) = mx + b (any linear function), then f is uniformly continuous on R.

2. Let f(x) = x², then f is not uniformly continuous on R or on [a, ∞).

3. Let f(x) = x², then f is uniformly continuous on [a, b].

4. Let f(x) = 1/x, then f is uniformly continuous on [1, ∞), but not on (0, 1].

5. Let f(x) = √x, then f is uniformly continuous on [1, ∞) and also on [0, 1].
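
To make item 2 concrete, here is a minimal sketch of the negated definition for f(x) = x² with ε = 1. The particular witnesses x = 1/δ and y = x + δ/2 are one possible choice, not necessarily the one used in class, and the function name `witness_pair` is hypothetical. Exact rational arithmetic is used so that the inequalities are not blurred by rounding.

```python
from fractions import Fraction

def witness_pair(delta: Fraction):
    """Return (x, y) with |x - y| < delta but |x^2 - y^2| >= 1 (hypothetical helper)."""
    x = 1 / delta
    y = x + delta / 2
    # |x - y| = delta/2 < delta, while |x^2 - y^2| = (delta/2)(2/delta + delta/2) = 1 + delta^2/4
    return x, y

for delta in (Fraction(1), Fraction(1, 10), Fraction(1, 10**6)):
    x, y = witness_pair(delta)
    print(delta, abs(x - y) < delta, abs(x * x - y * y) >= 1)
```

However small δ is taken, the pair lies within δ of each other while the function values differ by at least 1, so no single δ works for ε = 1 on all of R.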

You may ask the question: are there any conditions under which a continuous function becomes uniformly continuous? The answer is yes; for instance, see the following result.

Theorem 2.4.11. A continuous function on a closed and bounded interval [a, b] is uniformly continuous on [a, b].

We will skip the proof for now. Later in the course, time permitting, we will see that this result follows from a general result in metric spaces (in topology, in fact) involving the concept of “compact sets”.

This result re-justifies the fact that f(x) = x² is uniformly continuous on [a, b], even though it is not uniformly continuous on R.


Chapter 3

Differentiation

3.1 Review: Limit of a Function

We begin by making it clear that, in order to talk about the limit of a function f : D → R at a point c, unlike continuity, f need not be defined at c. That is, c need not be a point in the domain of f. However, we do require c to be close to the domain D, in the sense that c must be an accumulation point of D.

Definition 3.1.1. (Limit of a Function at a point) Let D ⊂ R, let f : D → R, let c be an accumulation point of D, and let L ∈ R. Then the limit of f at c is L if

for every open ball centered at L, (L − ε, L + ε), there exists an open ball centered at c, (c − δ, c + δ), such that

x ∈ D, x ∈ (c − δ, c + δ) \ {c} ⇒ f(x) ∈ (L − ε, L + ε) [i.e., f[D ∩ (c − δ, c + δ) \ {c}] ⊆ (L − ε, L + ε)].

⇕

For every ε > 0, ∃ δ > 0 such that if x ∈ D, x ∈ (c − δ, c + δ), and x ≠ c, then f(x) ∈ (L − ε, L + ε).

⇕

For every ε > 0, ∃ δ > 0 such that if x ∈ D and 0 < |x − c| < δ then |f(x) − L| < ε.

If the limit of f at c does not exist, we say that f diverges at c. We often write

L = lim_{x→c} f(x) or L = lim_{x→c} f or f(x) → L as x → c.


Theorem 3.1.2 (Algebra of Limits). Let f : D → R and g : D → R, and let c be an accumulation point of D. Let lim_{x→c} f = L and lim_{x→c} g = M, then

(i) lim_{x→c} (f + g) = L + M

(ii) lim_{x→c} (f − g) = L − M

(iii) lim_{x→c} (f·g) = LM

(iv) If b ∈ R, then lim_{x→c} (bf) = bL

(v) If g(x) ≠ 0 for all x ∈ D and if M ≠ 0, then lim_{x→c} (f/g) = L/M

One-Sided Limits:

• Left-Hand Limit: L is the left-hand limit of f at c, written lim_{x→c−} f = L, if given any ε > 0 there exists a δ > 0 such that for all x ∈ D with 0 < c − x < δ, |f(x) − L| < ε.

• Right-Hand Limit: Similarly, L is the right-hand limit of f at c, written lim_{x→c+} f = L, if given any ε > 0 there exists a δ > 0 such that for all x ∈ D with 0 < x − c < δ, |f(x) − L| < ε.

Remark: Continuity and Limit are related as follows: Let f : D → R and c ∈ D be an accumulation point of D. Then the following are equivalent:

(i) f is continuous at c,

(ii) lim_{x→c} f = f(c),

(iii) lim_{x→c−} f = f(c) = lim_{x→c+} f.

Theorem 3.1.3. (Sequential Criterion for Limits) Let f : D → R, and let c be an accumulation point of D, then the following are equivalent:

(i) lim_{x→c} f = L.

(ii) For every sequence (xn) in D which converges to c such that xn ≠ c for all n ∈ N, the sequence (f(xn)) converges to L.


3.2 The Derivative

In order to talk about the derivative, we will consider functions with interval domain, i.e., f : I → R, where I is an interval of one of the following forms:

(a, b), [a, b), (a, b], [a, b], (a, ∞), [a, ∞), (−∞, b), (−∞, b]

It is possible to define the derivative of a function having a non-interval domain, but the significance of the concept is most naturally apparent for functions defined on intervals. Consequently we shall limit our attention to such functions.

Definition 3.2.1. (Derivative of a function) The derivative of a function f : I → R at c ∈ I, denoted by f′(c), is defined by

f′(c) = lim_{x→c} (f(x) − f(c))/(x − c),

provided this limit exists. The function is then said to be differentiable at c. If D ⊂ I and if f is differentiable at each point of D, then f is said to be differentiable on D.

Notation and Remarks

• Let y = f(x), then f′(x) is also written as df(x)/dx or dy/dx.

• Geometric Interpretation: f′(c) denotes the slope of the tangent line to the graph of f at the point (c, f(c)). The graph of a differentiable function is a smooth curve which does not have any sharp turns or breaks.

• ε − δ definition: f is differentiable at c iff there is a number f′(c) such that for each ε > 0 there exists a δ > 0 such that if x ∈ I and 0 < |x − c| < δ then

|(f(x) − f(c))/(x − c) − f′(c)| < ε.

• Replacing x − c with h in the above definition, we obtain an equivalent expression which is often used:

f′(c) = lim_{h→0} (f(c + h) − f(c))/h


Theorem 3.2.2. If f : I → R is differentiable at c, then f is continuous at c.

Proof. For x ∈ I, x ≠ c,

f(x) − f(c) = ((f(x) − f(c))/(x − c)) (x − c).

Using algebra of limits, and that f is differentiable at c, we have

lim_{x→c} (f(x) − f(c)) = lim_{x→c} ((f(x) − f(c))/(x − c)) · lim_{x→c} (x − c) = f′(c)·0 = 0.

So lim_{x→c} f(x) = f(c), hence f is continuous at c.

Example 3.2.3. The converse of the above result is not true. To see this, consider f(x) = |x|. Then we know f is continuous at c = 0, but for x ≠ 0

(f(x) − f(0))/(x − 0) = |x|/x = 1 if x > 0, and −1 if x < 0.

So lim_{x→0} (f(x) − f(0))/(x − 0) does not exist ⇒ f is not differentiable at 0.

Example 3.2.4. Let’s prove the following functions are differentiable using the definition of a derivative, and find their derivatives:

1. f(x) = x³ for x ∈ R

2. f(x) = 1/√x for x > 0

3. f(x) = x^{3/2} for x > 0

4. f(x) = x^{1/3} for x ∈ R, x ≠ 0

1. Let c ∈ R, and x ≠ c, then

(f(x) − f(c))/(x − c) = (x³ − c³)/(x − c) = (x − c)(x² + xc + c²)/(x − c) = x² + xc + c².

Thus f′(c) = lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} (x² + xc + c²) = 3c².

2. Let c > 0, and x > 0, x ≠ c, then

(f(x) − f(c))/(x − c) = (1/√x − 1/√c)/(x − c) = (√c − √x)/((x − c)√x√c)

= (c − x)/((x − c)√x√c(√c + √x)) (by rationalization)

= −1/(√x√c(√c + √x)).

Thus f′(c) = lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} −1/(√x√c(√c + √x)) = −1/(2c^{3/2}).

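3. Let c > 0, and x > 0, x ≠ c, then

(f(x) − f(c))/(x − c) = (x^{3/2} − c^{3/2})/(x − c) = ((x^{3/2} − c^{3/2})/(x − c)) · ((x^{3/2} + c^{3/2})/(x^{3/2} + c^{3/2}))

= (x³ − c³)/((x − c)(x^{3/2} + c^{3/2}))

= (x − c)(x² + xc + c²)/((x − c)(x^{3/2} + c^{3/2})).

Thus f′(c) = lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} (x² + xc + c²)/(x^{3/2} + c^{3/2}) = 3c²/(2c^{3/2}) = (3/2)c^{1/2}.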

4. Let c ≠ 0, and x ≠ c, then

(f(x) − f(c))/(x − c) = (x^{1/3} − c^{1/3})/(x − c).

Note that

x − c = (x^{1/3})³ − (c^{1/3})³ = [x^{1/3} − c^{1/3}][(x^{1/3})² + x^{1/3}c^{1/3} + (c^{1/3})²]

= [x^{1/3} − c^{1/3}][x^{2/3} + x^{1/3}c^{1/3} + c^{2/3}].

Thus f′(c) = lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} (x^{1/3} − c^{1/3})/([x^{1/3} − c^{1/3}][x^{2/3} + x^{1/3}c^{1/3} + c^{2/3}])

= lim_{x→c} 1/(x^{2/3} + x^{1/3}c^{1/3} + c^{2/3})

= 1/(3c^{2/3}).
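
The four formulas obtained above can be compared against difference quotients with a small step. This is only a numerical sketch: the helper name `difference_quotient`, the sample point c = 2 and the step h = 10⁻⁶ are assumptions chosen for illustration, and agreement up to a small error is evidence, not a proof.

```python
import math

def difference_quotient(f, c: float, h: float) -> float:
    """Return (f(c + h) - f(c)) / h, whose limit as h -> 0 is f'(c)."""
    return (f(c + h) - f(c)) / h

# (function, derivative formula derived in the example, sample point c > 0)
cases = [
    (lambda x: x**3,             lambda c: 3 * c**2,             2.0),
    (lambda x: 1 / math.sqrt(x), lambda c: -1 / (2 * c**1.5),    2.0),
    (lambda x: x**1.5,           lambda c: 1.5 * math.sqrt(c),   2.0),
    (lambda x: x**(1 / 3),       lambda c: 1 / (3 * c**(2 / 3)), 2.0),
]

def max_error(h: float = 1e-6) -> float:
    """Largest gap between a difference quotient and the claimed derivative."""
    return max(abs(difference_quotient(f, c, h) - df(c)) for f, df, c in cases)

print(max_error())
```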


Theorem 3.2.5. (Algebraic Rules for Differentiation) Let f, g : I → R both be differentiable at c ∈ I. Let a ∈ R. Then

(i) f ± g are differentiable at c, with (f ± g)′(c) = f′(c) ± g′(c).

(ii) fg is differentiable at c, with (fg)′(c) = f′(c)g(c) + f(c)g′(c).

(iii) af is differentiable at c, with (af)′(c) = af′(c).

(iv) If g(c) ≠ 0 then f/g is differentiable at c, with (f/g)′(c) = (g(c)f′(c) − f(c)g′(c))/(g(c))².

Before we begin the proof, we state a lemma which we need for part (iv). This lemma states a useful property of continuous functions, that is, if a continuous function is non-zero at a point c, then it stays non-zero in a neighborhood (or an open ball, in metric terms) around c.

Lemma 3.2.6. If g : D → R is continuous at c ∈ D, and g(c) ≠ 0, then ∃ δ > 0 such that g(x) ≠ 0 for all x ∈ (c − δ, c + δ) ∩ D.

Proof of the theorem: Parts (i) and (iii) are easy, and are left as an exercise. We will prove parts (ii) and (iv).

(ii) Let x ∈ I, x ≠ c, and h(x) = f(x)g(x), then

(f(x)g(x) − f(c)g(c))/(x − c)

= (f(x)g(x) − f(x)g(c) + f(x)g(c) − f(c)g(c))/(x − c)

= (f(x)(g(x) − g(c)) + g(c)(f(x) − f(c)))/(x − c)

= f(x)(g(x) − g(c))/(x − c) + g(c)(f(x) − f(c))/(x − c).

Thus,

h′(c) = lim_{x→c} (h(x) − h(c))/(x − c) = lim_{x→c} [f(x)(g(x) − g(c))/(x − c) + g(c)(f(x) − f(c))/(x − c)]

= lim_{x→c} f(x)(g(x) − g(c))/(x − c) + lim_{x→c} g(c)(f(x) − f(c))/(x − c)

= lim_{x→c} f(x) · lim_{x→c} (g(x) − g(c))/(x − c) + g(c) · lim_{x→c} (f(x) − f(c))/(x − c)

= f(c)g′(c) + g(c)f′(c) (using continuity of f at c and that f′, g′ exist at c).

(iv) Let h(x) = f(x)/g(x), let x ∈ I, x ≠ c.

Note that we know g(c) ≠ 0. But we also want g(x) ≠ 0 when x ≠ c is close to c, since otherwise h(x) will not be defined close to c. Luckily, g is differentiable and hence continuous at c. Thus by the lemma above, ∃ δ > 0 such that g(x) ≠ 0 for all x ∈ (c − δ, c + δ) ∩ I.


Let x ∈ (c − δ, c + δ) ∩ I, x ≠ c. Consider

(h(x) − h(c))/(x − c) = (f(x)/g(x) − f(c)/g(c))/(x − c)

= (f(x)g(c) − f(c)g(x))/(g(x)g(c)(x − c))

= (f(x)g(c) − f(c)g(c) + f(c)g(c) − f(c)g(x))/(g(x)g(c)(x − c))

= (g(c)[f(x) − f(c)] + f(c)[g(c) − g(x)])/(g(x)g(c)(x − c))

= (f(x) − f(c))/(g(x)(x − c)) − f(c)[g(x) − g(c)]/(g(x)g(c)(x − c)).

Thus,

h′(c) = lim_{x→c} (h(x) − h(c))/(x − c) = lim_{x→c} [(f(x) − f(c))/(g(x)(x − c)) − f(c)[g(x) − g(c)]/(g(x)g(c)(x − c))]

= lim_{x→c} (f(x) − f(c))/(g(x)(x − c)) − lim_{x→c} f(c)[g(x) − g(c)]/(g(x)g(c)(x − c))

= lim_{x→c} (f(x) − f(c))/(x − c) · lim_{x→c} 1/g(x) − (f(c)/g(c)) · lim_{x→c} (g(x) − g(c))/(x − c) · lim_{x→c} 1/g(x)

= f′(c)/g(c) − f(c)g′(c)/g²(c) (using continuity of g and that f′, g′ exist at c)

= (f′(c)g(c) − f(c)g′(c))/g²(c).


The derivative is defined as the limit of a difference quotient, (f(x) − f(c))/(x − c). When we take the limit x → c, we always assume x ≠ c. The following provides an equivalent characterization of a derivative which does not involve fractions. It also helps reduce theorems on derivatives to theorems on continuity.

Theorem 3.2.7. [Caratheodory’s Theorem] Let f : I → R and c ∈ I. Then f is differentiable at c if and only if there exists a function φ : I → R which is continuous at c and satisfies

f(x)− f(c) = φ(x)(x− c) for all x ∈ I. (∗)

In this case, we also have that φ(c) = f ′(c).

Proof. (⇒) Suppose f′(c) exists, then define φ by

φ(x) := (f(x) − f(c))/(x − c) for x ≠ c, x ∈ I,
φ(x) := f′(c) for x = c.

Since lim_{x→c} φ(x) = f′(c) = φ(c), φ is continuous at c. It is clear that φ satisfies the equation (∗).

(⇐) Suppose φ is a continuous function at c, and φ satisfies (∗). Let x ≠ c, then using (∗) and continuity of φ at c, we have

lim_{x→c} (f(x) − f(c))/(x − c) = lim_{x→c} φ(x) = φ(c).

Therefore, f is differentiable at c and f ′(c) = φ(c).
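
For a concrete instance of (∗), take f(x) = x³: the factorization x³ − c³ = (x − c)(x² + xc + c²) suggests φ(x) = x² + xc + c². The sketch below checks (∗) and φ(c) = 3c² = f′(c) exactly on sample rational points; the names `f` and `phi`, the base point c = 3/2 and the sample grid are assumptions made for illustration.

```python
from fractions import Fraction

def f(x):
    return x**3

def phi(x, c):
    """Candidate Caratheodory function for f(x) = x^3 at the point c."""
    return x * x + x * c + c * c

c = Fraction(3, 2)                                  # sample base point (assumed)
xs = [Fraction(k, 7) for k in range(-20, 21)]       # sample rational points
# Check f(x) - f(c) = phi(x)(x - c) exactly, including the case x = c if it occurs.
identity_holds = all(f(x) - f(c) == phi(x, c) * (x - c) for x in xs)
print(identity_holds, phi(c, c) == 3 * c**2)
```

Since φ is a polynomial in x, it is continuous at c, so the theorem recovers f′(c) = φ(c) = 3c² without forming a quotient.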

Theorem 3.2.8. (Chain Rule) Let I and J be intervals in R, let f : I → J and g : J → R, and let c ∈ I, with f differentiable at c and g differentiable at f(c). Then g ◦ f is differentiable at c and

(g ◦ f)′(c) = g′(f(c))f′(c).

Proof. Since f′(c) exists, Caratheodory’s Theorem implies that there exists a function φ : I → R such that φ is continuous at c and

f(x)− f(c) = φ(x)(x− c), where φ(c) = f ′(c).

Also, since g′(f(c)) exists, there is a function ψ : J → R such that ψ is continuous at d := f(c) and

g(y)− g(d) = ψ(y)(y − d) ∀y ∈ J, where ψ(d) = g′(d).

Substituting y = f(x) and d = f(c), we get

g(f(x))− g(f(c)) = ψ(f(x))(f(x)− f(c)) = ψ(f(x))[φ(x)(x− c)] = [(ψ ◦ f)(x).φ(x)](x− c),

for all x ∈ I (note that f(x) ∈ J for every x ∈ I). Since f is continuous at c and ψ is continuous at f(c), the function (ψ ◦ f)·φ is continuous at c, and its value at c is g′(f(c))·f′(c). So, by Caratheodory’s Theorem, g ◦ f is differentiable at c and (g ◦ f)′(c) = g′(f(c))f′(c).


3.3 Mean Value Theorem and its Applications

The Mean Value Theorem is one of the fundamental results in analysis, as the corollaries, examples, and applications will indicate. We will see its applications to uniform continuity, finding zeroes/roots of a function, and approximation of functions by polynomials (Taylor’s Theorem). We will also see how the Mean Value Theorem permits one to draw conclusions about the nature of a function from information about its derivative.

Geometric Interpretation of Mean Value Theorem: There is a point on the curve y = f(x) at which the tangent line is parallel to the line segment through the points (a, f(a)) and (b, f(b)).

We need some definitions and preliminary results to prove this theorem. These results are important in their own right.

Definition 3.3.1. (Absolute Maximum and Minimum) Let f : I −→ R. Then f has an absolute maximum (respectively, an absolute minimum) on I if there is a c ∈ I such that f(x) ≤ f(c) (respectively, f(c) ≤ f(x)) for all x ∈ I.

The following is an important fact about continuous functions, which we will use in the proof of the Mean Value Theorem: a continuous function on a closed and bounded interval attains both its maximum and its minimum.

Theorem 3.3.2. If f : [a, b] → R is continuous on [a, b], then f has an absolute maximum and an absolute minimum on [a, b].

Definition 3.3.3. (Local Maximum and Minimum) Let f : I −→ R with c ∈ I. Then f has a local maximum (respectively, a local minimum) at c if there is a neighborhood (i.e., an open ball) U = (c − δ, c + δ) of c such that f(x) ≤ f(c) (respectively, f(c) ≤ f(x)) for all x ∈ I ∩ U. We call a local maximum or a local minimum a local extremum.

Theorem 3.3.4. Let f : I → R be differentiable at c ∈ I which is an interior point of I (that is, c is not an end point of I). If f has a local maximum or a local minimum at c, then f′(c) = 0.


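Proof. Let f have a local maximum at c. By definition, ∃ δ > 0 such that f(x) ≤ f(c) for all x ∈ (c − δ, c + δ) ∩ I; since c is an interior point, we may take δ small enough that (c − δ, c + δ) ⊆ I.

If c − δ < x < c, then f(x) − f(c) ≤ 0 and x − c < 0, so (f(x) − f(c))/(x − c) ≥ 0. This implies f′(c) = lim_{x→c−} (f(x) − f(c))/(x − c) ≥ 0.

If c < x < c + δ, then f(x) − f(c) ≤ 0 and x − c > 0, so (f(x) − f(c))/(x − c) ≤ 0. This implies f′(c) = lim_{x→c+} (f(x) − f(c))/(x − c) ≤ 0.

Hence f′(c) = 0. The proof for f having a local minimum at c is similar.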

Theorem 3.3.5. [Mean Value Theorem] Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b), then there is a c ∈ (a, b) such that

f(b)− f(a) = f ′(c)(b− a).

Proof. Define F : [a, b]→ R by

F (x) = [f(b)− f(a)]x− (b− a)f(x).

Since f is continuous on [a, b] and differentiable on (a, b), so is F, and F′(x) = [f(b) − f(a)] − (b − a)f′(x). Also F(a) = F(b).

We need to show that there is a c ∈ (a, b) such that F′(c) = 0.

Case I: F is constant on [a, b]. Then F′(x) = 0 for all x ∈ (a, b).

Case II: F is not constant on [a, b]. Since F is continuous on [a, b], it has an absolute maximum and an absolute minimum on [a, b]. Suppose both of the absolute extremes occur at the end points a and b, then F(a) = F(b) would imply that F is constant on [a, b]. Thus, one of the absolute extremes must occur in the interior of [a, b], i.e., ∃ c ∈ (a, b) such that F(c) is an extremum. By Theorem 3.3.4, F′(c) = 0. In either case, F′(c) = 0 means [f(b) − f(a)] − (b − a)f′(c) = 0, which is the claim.
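
The theorem only asserts that some c exists; for a specific function it can be located numerically. The sketch below uses bisection on f′(x) − (f(b) − f(a))/(b − a) for f(x) = x³ on [0, 2]. The function name `mvt_point` is hypothetical, and bisection is a swapped-in numerical technique (not the argument used in the proof) that assumes this difference changes sign on [a, b].

```python
import math

def mvt_point(f, fprime, a: float, b: float, tol: float = 1e-12) -> float:
    """Find c in (a, b) with f'(c) = (f(b) - f(a))/(b - a) by bisection (hypothetical helper)."""
    slope = (f(b) - f(a)) / (b - a)
    g = lambda x: fprime(x) - slope
    lo, hi = a, b
    if g(lo) * g(hi) > 0:
        raise ValueError("bisection needs a sign change of f' - slope on [a, b]")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        # keep the half-interval on which g still changes sign
        if g(lo) * g(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2

c = mvt_point(lambda x: x**3, lambda x: 3 * x**2, 0.0, 2.0)
print(c, 2 / math.sqrt(3))   # here 3c^2 = 4, so c = 2/sqrt(3)
```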

Consequence I: Uniform Continuity

Corollary 3.3.6. Let f : I → R be differentiable on I. If f has a bounded derivative on I, then f is uniformly continuous on I.

Proof. Let M > 0 be such that |f′(x)| ≤ M for all x ∈ I. Let ε > 0; we need to show: ∃ δ > 0 such that ∀ x, y ∈ I, |x − y| < δ ⇒ |f(x) − f(y)| < ε. By the mean value theorem, for any distinct x, y ∈ I, there is a c between x and y such that

f(x) − f(y) = f′(c)(x − y).

Thus,

|f(x) − f(y)| = |f′(c)| |x − y| ≤ M |x − y|.

Let δ = ε/M, and let x, y ∈ I with |x − y| < δ, then by the argument above we have

|f(x) − f(y)| ≤ M |x − y| < Mδ = ε.
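
The choice δ = ε/M can be tried out on f(x) = sin x, whose derivative cos x is bounded by M = 1. The sketch below is an illustration under assumed sample values (ε = 0.01, a grid on [−10, 10], pairs at distance 0.9δ); the helper name `delta_for` is hypothetical, and a finite grid check does not replace the proof.

```python
import math

def delta_for(eps: float, M: float) -> float:
    """The delta used in the proof: delta = eps / M."""
    return eps / M

M = 1.0                      # bound on |cos x|, the derivative of sin x
eps = 1e-2                   # sample tolerance (assumed)
delta = delta_for(eps, M)
gap = 0.9 * delta            # pairs (x, x + gap) satisfy |x - y| < delta
xs = [k * 0.01 for k in range(-1000, 1001)]
# The same delta is used at every point x: this is what "uniform" means.
worst = max(abs(math.sin(x + gap) - math.sin(x)) for x in xs)
print(worst, worst < eps)
```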


Note: Converse of above theorem is not true. For example, let f(x) =√x, we have seen is

uniformly continuous on [0,∞). But its derivative is unbounded on [0,∞).

Example 3.3.7. It follows from above corollary that f(x) = sinx and g(x) = cosx are uniformlycontinuous on R, since their derivatives on R are cos x and sinx respectively, which are bounded.

Consequence II: Understanding the function via its derivative

Theorem 3.3.8. Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and let f ′(x) = 0 for all x ∈ (a, b). Then f is constant on [a, b].

Proof. Claim: f(x) = f(a) for all x ∈ [a, b].

Let x ∈ (a, b], and apply the Mean Value Theorem to f on [a, x]. There is a c ∈ (a, x) such that f(x) − f(a) = f ′(c)(x − a). But f ′(t) = 0 for all t ∈ (a, b), so f ′(c) = 0, and f(x) − f(a) = 0. Thus f(x) = f(a) for all x ∈ [a, b].

Corollary 3.3.9. Let f, g : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and let f ′(x) = g′(x) for all x ∈ (a, b). Then f = g + C, where C is a constant.
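The corollary is stated without proof. One way to see it, sketched here as a reduction to Theorem 3.3.8, is to consider the difference of the two functions. The name h is introduced only for this sketch; h is continuous on [a, b] and differentiable on (a, b) because f and g are.

```latex
h(x) = f(x) - g(x), \qquad
h'(x) = f'(x) - g'(x) = 0 \quad \text{for all } x \in (a, b)
\;\Longrightarrow\; h(x) = h(a) =: C \ \text{ for all } x \in [a, b]
\;\Longrightarrow\; f = g + C .
```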

Proposition 3.3.10. Let f be differentiable on I. Then

(i) f is increasing on I if and only if f ′(x) ≥ 0 for all x in I;

(ii) f is decreasing on I if and only if f ′(x) ≤ 0 for all x in I.

Proof. (i) Let f be increasing on I. Let c ∈ I.

If x < c, then f(x) ≤ f(c). So $\frac{f(x) - f(c)}{x - c} \ge 0$.

If x > c, then f(x) ≥ f(c). So $\frac{f(x) - f(c)}{x - c} \ge 0$.

Hence $\lim_{x \to c} \frac{f(x) - f(c)}{x - c} \ge 0$, i.e., f ′(c) ≥ 0. Since c ∈ I was arbitrary, f ′(x) ≥ 0 for all x ∈ I.

Conversely, suppose f ′(x) ≥ 0 for all x ∈ I. Let x, y ∈ I with x < y. Then by the Mean Value Theorem, ∃ c ∈ (x, y) such that

f(y) − f(x) = f ′(c)(y − x) ≥ 0 (since f ′(c) ≥ 0 and y > x).

Thus x ≤ y ⇒ f(x) ≤ f(y).

(ii) Similar.
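A small numerical check of part (i), under the assumption that we take f(x) = x − sin x as the sample function (its derivative 1 − cos x is non-negative everywhere, and vanishes at multiples of 2π). The helper `is_nondecreasing` is a name introduced for this sketch, and sampling on a grid only illustrates the statement.

```python
import math

def is_nondecreasing(values):
    """True if each sampled value is >= the previous one."""
    return all(u <= v for u, v in zip(values, values[1:]))

f = lambda x: x - math.sin(x)          # sample function (assumption)
fprime = lambda x: 1 - math.cos(x)     # hand-computed derivative, >= 0

xs = [k / 100 for k in range(-1000, 1001)]   # grid on [-10, 10]
derivative_nonneg = all(fprime(x) >= 0 for x in xs)
samples_increase = is_nondecreasing([f(x) for x in xs])
```

Both flags come out true on this grid, consistent with the equivalence in (i). Note that f ′ is zero at some points, so f ′(x) ≥ 0 does not require f ′ to be strictly positive.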

Consequence III: Roots of a function, establishing inequalities, and approximations

Corollary 3.3.11. [Rolle’s Theorem] Let f : [a, b] → R be continuous on [a, b] and differentiable on (a, b), and let f(a) = f(b). Then there is a c ∈ (a, b) such that

f ′(c) = 0.


Proof. By the Mean Value Theorem there is a c ∈ (a, b) with f ′(c)(b − a) = f(b) − f(a) = 0. Since b − a ≠ 0, f ′(c) = 0.

Proposition 3.3.12. If f : I → R is differentiable on I and f has two distinct roots on I, then f ′ has at least one root in I.

Proof. Let f have roots at a, b ∈ I with a < b, i.e., f(a) = 0 = f(b). Then by Rolle’s Theorem, applied to f on [a, b], ∃ c ∈ (a, b) such that f ′(c) = 0.

Example 3.3.13. Let $f(x) = 2x^5 + x^3 + 3x$. Then f has exactly one root.

Since f(0) = 0, x = 0 is a root. Suppose f has two distinct roots a < b, i.e., f(a) = 0 = f(b). Then by Rolle’s Theorem ∃ c ∈ (a, b) such that f ′(c) = 0. But $f'(x) = 10x^4 + 3x^2 + 3 \ge 3 > 0$ for all x ∈ R, a contradiction. Hence f has exactly one root.
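The two facts used in this example can be sanity-checked numerically. The sketch below samples f and f ′ on a grid over [−5, 5]; the grid and the variable names are choices made here, and the check illustrates the argument rather than replacing it.

```python
def f(x):
    return 2 * x**5 + x**3 + 3 * x

def fprime(x):
    # Derivative of f, computed by hand.
    return 10 * x**4 + 3 * x**2 + 3

xs = [k / 100 for k in range(-500, 501)]     # grid on [-5, 5]

# Smallest sampled slope; the argument says it is never below 3.
min_slope = min(fprime(x) for x in xs)

# Number of times the sampled values pass from negative to non-negative.
sign_changes = sum(1 for u, v in zip(xs, xs[1:]) if f(u) < 0 <= f(v))
```

On this grid the slope never drops below 3 and the sign changes exactly once, at x = 0.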

Example 3.3.14. Prove the following inequalities:

(i) $e^x \ge 1 + x$ for all x ∈ R

(ii) |sin x| ≤ x for all x ≥ 0

(i) Case I: x = 0. Then $e^x = 1 = x + 1$, so the inequality holds.

Case II: x > 0. Since $f(t) = e^t$ is differentiable on R, apply the Mean Value Theorem to f on [0, x]. So ∃ c ∈ (0, x) such that

$e^x - e^0 = e^c (x - 0) \Rightarrow e^x - 1 = e^c x$.

Since 0 < c < x and $e^t$ is an increasing function,

$x e^0 < x e^c < x e^x \Rightarrow x < e^x - 1 < x e^x \Rightarrow e^x > x + 1$.

Case III: x < 0. Consider $f(t) = e^t$ on [x, 0], and proceed as in Case II.

(ii) If x = 0, both sides are 0. If x > 0, apply the Mean Value Theorem to f(t) = sin t on [0, x]. So ∃ c ∈ (0, x) such that

sin x − sin 0 = (cos c)x ⇒ sin x = (cos c)x ⇒ |sin x| = |cos c| |x| ≤ |x| = x.

Thus −x ≤ sin x ≤ x.
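Both inequalities can be spot-checked on a grid. The grid over [−20, 20] and the variable names below are assumptions made for this sketch, and a finite sample is only a sanity check of the proved statements.

```python
import math

xs_all = [k / 20 for k in range(-400, 401)]    # grid on [-20, 20]
xs_nonneg = [x for x in xs_all if x >= 0]

# (i) e^x >= 1 + x for every sampled real x
exp_ok = all(math.exp(x) >= 1 + x for x in xs_all)

# (ii) |sin x| <= x for every sampled x >= 0
sin_ok = all(abs(math.sin(x)) <= x for x in xs_nonneg)
```

Equality in (i) occurs only at x = 0, which is on the grid, so the non-strict comparison matters there.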

Example 3.3.15. Estimate √89.

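Consider f(x) = √x on [81, 89]. By the Mean Value Theorem, ∃ c ∈ (81, 89) such that

$\sqrt{89} - \sqrt{81} = \frac{1}{2\sqrt{c}}(89 - 81) \Rightarrow \sqrt{89} - 9 = \frac{4}{\sqrt{c}}$.

Since 81 < c < 89,

$\sqrt{81} < \sqrt{c} < \sqrt{89} < \sqrt{100} \Rightarrow 9 < \sqrt{c} < 10 \Rightarrow \frac{4}{10} < \frac{4}{\sqrt{c}} < \frac{4}{9}$.

So

$\frac{4}{10} < \sqrt{89} - 9 < \frac{4}{9} \Rightarrow \frac{2}{5} + 9 < \sqrt{89} < \frac{4}{9} + 9 \Rightarrow 9.4 < \sqrt{89} < 9.4445$.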

Note: Using a calculator, √89 ≈ 9.4339811320566.
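The bounds from the Mean Value Theorem can be recomputed directly. In the sketch below the variable names are introduced for illustration; the only inputs are the interval [81, 89] and the bracket 9 < √c < 10 used in the example.

```python
import math

a, x = 81, 89                  # known perfect square and the target
root_a = math.sqrt(a)          # 9.0

# c lies in (81, 89), so 9 < sqrt(c) < 10, which gives
# (x - a) / (2 * 10) < sqrt(x) - 9 < (x - a) / (2 * 9).
lower = root_a + (x - a) / (2 * 10)    # 9 + 4/10
upper = root_a + (x - a) / (2 * 9)     # 9 + 4/9
estimate = math.sqrt(x)                # library value for comparison
```

Here `lower` is 9 + 2/5 and `upper` is 9 + 4/9, and the library value of √89 falls between them.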
