E1: CALCULUS - lecture notes

Stefan Balint, Eva Kaslik, Simona Epure, Simina Maris, Aurelia Tomoioaga

Contents

I Introduction

1 The notions "set", "element of a set", "membership of an element in a set" are basic notions of mathematics
2 Symbols used in set theory
3 Operations with sets
4 Relations
5 Functions
6 Composite function. Inverse of a function.
7 Logic symbols
8 Converse theorem and contrary theorem
9 Necessity and sufficiency

II Single variable calculus

10 Topology in ℝ¹
11 Sequences
12 Convergence
13 Rules (for convergence of sequences)
14 Limit points of a sequence
15 Series of real numbers
16 Rules (for convergence of series)
17 Absolute convergent series
18 Limit of a function at a point
19 Rules for the limit of a function
20 One sided limits
21 Infinite limits
22 Limit points of a function at a point
23 Continuity
24 Rules for continuity
25 Properties of continuous functions
26 Sequence of functions. Set of convergence.
27 Continuity and uniform convergence
28 Equal continuous and equal bounded sequence of functions
29 Series of functions. Convergence and uniform convergence.
30 Convergence criteria for series of functions
31 Power Series
32 Arithmetics of power series
33 Differentiable functions
34 Rules of differentiability
35 Local extremum
36 Theorems concerning basic properties of differentiable functions
37 Higher-order derivatives and differentials
38 Taylor polynomials
39 Classification theorem for local extrema
40 The Riemann-Darboux integral
41 Properties of the Riemann-Darboux integral
42 Classes of Riemann-Darboux integrable functions
43 Mean value theorem
44 The fundamental theorem of calculus
45 Techniques to find primitives
46 Improper integrals
47 Fourier series
48 Different forms of Fourier series

III Functions of several variables

49 Topology in ℝⁿ
50 Limit of a function at a point
51 Continuity
52 Important properties of continuous functions
53 Differentiation
54 Basic properties of differentiable functions
55 Higher order partial differentiability
56 Taylor's theorems
57 Classification theorem for local extrema
58 Conditional extrema
59 Jordan measurable subsets of ℝ²
60 The Riemann-Darboux integral of functions of two variables
61 Integrable functions
62 Properties of the Riemann-Darboux integral
63 Riemann-Darboux integral calculus when A is rectangular
64 Riemann-Darboux integral calculus when A is not a rectangle
65 Jordan measurable subsets of ℝⁿ
66 The Riemann-Darboux integral of a n variable function
67 Integrable functions of n variables
68 Properties of the Riemann-Darboux integral of n-variable functions
69 Riemann-Darboux integral calculus for n-variable functions when A is a hypercube
70 Elementary curves and elementary closed curves
71 Line integral of first type
72 Line integrals of second type
73 Transformation of double integrals into line integrals
74 Elementary Surfaces
75 Surface integrals of first type
76 Surface integrals of second type
77 Properties of surface integrals
78 Differentiation of an integral containing a parameter

In which way can a Calculus course be useful to a first year computer science student?

This is a question frequently asked by first-year students at the beginning of their Calculus course.

It is difficult to give a full and convincing answer to this question at the very beginning of the course, as we have to talk about the utility of some concepts and mathematical instruments that are unknown to those who ask, in solving practical problems which are out of their reach at the moment.

However, the question cannot and must not be avoided. It is necessary to formulate a partial answer showing the utility of this course in solving real problems that future computer scientists could find interesting. We have to emphasize here that for mathematics students, Calculus is a basic and very important part of their curriculum, and its utility is usually not questioned outside the field of mathematics.

So let's get back to giving a partial answer to computer science students. We would like to point out that this course presents basic concepts and instruments used for analyzing real or vector functions of one or more variables. To illustrate the utility of some of these concepts and instruments, we will consider the following practical problem: constructing a train schedule.

Constructing a train schedule for a railway network is a real and complex problem. It is based on the knowledge of speed restrictions in the network, train stations, transport material, options concerning the stops of some trains in certain stations, and a previous computation that guarantees that, in ideal conditions, the trains will not collide. Some concepts of calculus prove to be useful in this computation. To guarantee that the trains will not collide, it is necessary to know, at every moment, the position of every train and to ensure that these positions do not coincide at any moment of time.

Let's consider for example the Timisoara-Bucharest railway, which can be represented as a curve AB (figure not reproduced here). A train that circulates on this railway in the time range [t₀, t₀ + T] will be represented by a point P. If in the considered time range there are more trains circulating on this railway, we will have to describe the motion of each of them. In order to describe the motion of a train represented by the point P, we can associate to each moment of time t ∈ [t₀, t₀ + T] the length of the arc of curve AP, where P is the position on the curve AB where the train is at the moment t. Therefore, a function f is obtained, which is defined for t ∈ [t₀, t₀ + T] and takes its values in the set [0, l]: f : [t₀, t₀ + T] → [0, l], where l is the distance from A to B on the considered railway.

We must emphasize that the object that appeared in a natural way in this problem of describing the position of a train on a railway is a real function of one real variable, a mathematical object that belongs to the field of interest of this course.

Our train has to arrive at given times at its stations and has some speed restrictions along the way; hence, the function f could be quite complicated. However, there are some characteristics of real motion that have to be translated mathematically as properties of the function f. For example, the real motion is continuous, meaning that the train moves from the position P₁ to the position P₂ gradually, passing through all the intermediate positions and not by jumping. This means that the function f, even if complicated, must have the following property: for any t₂ ∈ [t₀, t₀ + T], if t₁ tends to t₂ then f(t₁) tends to f(t₂).

A function with the above property is said to be continuous on the interval [t₀, t₀ + T]. The concept of continuity is studied in this course, revealing several properties. Hence, the continuous functions that are studied in this course are useful, for example, for describing the motion of a train on a railway.

If our train leaves station A at the moment t₀ and moves away from A continuously, without stopping, until it reaches the first station S₁ at the moment t₁, then the function f which describes the motion of the train has the following property: for any t′, t′′ ∈ [t₀, t₁] with t′ < t′′, it results that f(t′) < f(t′′). In this course, such a function is said to be increasing. The course presents several properties of monotone functions. In the case of the considered motion, this concept is useful for expressing moving away or approaching.

Due to speed restrictions and stops at the stations, the velocity of the train depends on its position. More exactly, it depends on the moment of time t, as in the time range [t₀, t₀ + T] the train may pass through the same place a couple of times. In order to find the velocity of the train at the moment t₁, we consider the mean velocity

$$\frac{f(t) - f(t_1)}{t - t_1}$$

(distance over time) on a short time range [t, t₁]; the limit of this mean velocity when t tends to t₁ represents the velocity of the train at the moment t₁. In this course, this limit is called the derivative of the function f at t₁ and is denoted by f′(t₁). If the train stays in a station in the time range [t₁, t₂], then its velocity is zero: f′(t) = 0 for t ∈ [t₁, t₂]. If f′(t) > 0, then the train moves away from A, and if f′(t) < 0, then the train approaches A. If the train moves with a constant velocity in the time range [t₁, t₂], then f′(t) = const in the interval [t₁, t₂]. These observations show the utility of the concept of derivative for describing mechanical motion.
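As a minimal numerical sketch of this limiting process, the mean velocity can be computed on shorter and shorter time ranges. The position function f(t) = 5t² and the moment t₁ = 2 are hypothetical choices made only for illustration; they are not taken from the course.

```python
# Hypothetical position function of the train (illustration only): f(t) = 5 * t**2.
def f(t):
    return 5.0 * t ** 2

def mean_velocity(f, t, t1):
    """Mean velocity (distance over time) on the time range between t and t1."""
    return (f(t) - f(t1)) / (t - t1)

t1 = 2.0
# Let t approach t1: the mean velocities approach the derivative f'(t1) = 10 * t1.
for h in (1.0, 0.1, 0.01, 0.001):
    print(h, mean_velocity(f, t1 + h, t1))
```

For this particular f the mean velocity on the range between t₁ and t₁ + h equals 20 + 5h, so the printed values approach 20 as h shrinks.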

Finally, we point out that starting from a velocity profile v(t) (which results from speed restrictions and from previously assigned arrival and departure times), the function f(t) which describes the motion can be recovered using the integral formula

$$f(t) = f(t_0) + \int_{t_0}^{t} v(\tau)\, d\tau,$$

presented in this course.
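The integral formula can be sketched numerically as follows. This is only an illustration under stated assumptions: the integral is approximated by the trapezoidal rule (a standard numerical method, not something prescribed by the course), and the velocity profile v(t) = 2t is hypothetical.

```python
def position(v, t0, t, f0=0.0, steps=1000):
    """Approximate f(t) = f(t0) + integral of v over [t0, t] by the trapezoidal rule.

    f0 stands for the initial position f(t0).
    """
    h = (t - t0) / steps
    total = 0.5 * (v(t0) + v(t))
    for k in range(1, steps):
        total += v(t0 + k * h)
    return f0 + h * total

# Hypothetical velocity profile: the train accelerates uniformly, v(t) = 2 * t.
def v(t):
    return 2.0 * t

# For this profile the exact position is f(t) = f(0) + t**2.
print(position(v, 0.0, 3.0))
```

The trapezoidal rule is exact (up to rounding) for a linear velocity profile, which makes this example easy to check by hand.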

We hope that this extremely simple and partial reasoning manages to convince computer science students that in this course they will study mathematical objects and results that will be useful in their future careers.

The written course is presented in a standard form, similar to the course presented to mathematics students. However, the spoken course is full of comments and examples that are meant to illustrate the utility and applicability of the concepts and results in solving real problems.

The authors

Part I

Introduction

1 The notions "set", "element of a set", "membership of an element in a set" are basic notions of mathematics

A strict mathematics course requires a precise definition of all the notions used to present the material.

A definition should precisely describe a notion (A) using another notion (B), which is assumed to be known, or in any event simpler than (A).

Notion (B) must also be strictly defined, and its definition will contain another notion (C) simpler than (B), and so on.

For the construction of a mathematical theory with exact definitions of all the notions, it is necessary to have a collection of very simple notions to which the rest can be reduced and which are themselves not defined.

We will call such notions basic notions.

From the point of view of common sense, the basic notions of mathematics are so self-evident that they do not require definitions. The meaning of basic notions can be described by examples.

The notions of a set, an element of a set, and membership of an element in a set are basic notions of mathematics.

We cannot obtain an exact definition of the above notions, but it is possible to clarify their meaning by examples.

Thus, let us consider the notion of a set. We may speak of the set of days in a year, points in a plane, students in a lecture-room, and so on. In these cases, each day of a year, each point in a plane, each student in a lecture-room is an element of the set.

When a concrete set is considered, an essential thing is to be able to affirm, for any element, whether it belongs to the set or not. Thus, for the set of days in a year, the 3rd of July, the 20th of May and the 29th of December are all elements of the set, while "Wednesday", "Friday", "holiday" and "days in a year" are not. In the second example, only the points in the given plane are elements of the set. If a point does not lie in the given plane, or an object is not a point, then it is not an element of the set.

In order to define a concrete set it is necessary to describe clearly the elements belonging to it. Any faulty description may lead to a logical contradiction.

2 Symbols used in set theory

If x is a member (an element) of a set A, then we write x ∈ A; otherwise we write x ∉ A (∈ is called the membership symbol).

Two sets A and B that have precisely the same elements are said to be equal. Thus, with respect to sets, the equality A = B means that the same set is denoted by different letters, that is, A and B are two names for the same set.

The notation A = {x, y, z, ...} means that the set A consists of the elements x, y, z, ... . In this notation, duplicated elements are regarded as one element. For instance: {1, 2, 3, 4, 5} = {1, 1, 1, 2, 2, 3, 4, 5}.

If a set A consists of all the elements x of a set B that possess a given property, then we write A = {x ∈ B | . . . }, where the property is written after the vertical line. For instance, let a and b be two real numbers satisfying the condition a < b; then the set of points of the closed interval [a, b], that is, the set of all real numbers x such that a ≤ x ≤ b, can be written as:

[a, b] = {x ∈ ℝ¹ | a ≤ x ≤ b},

where ℝ¹ means the set of all real numbers.

If every element of a set A is also an element of a set B, then we say that A is a subset of B and write A ⊂ B or B ⊃ A. The first relation reads "set A is contained in set B", and the second relation reads "B contains A".

It is easy to prove that if A ⊂ B and B ⊂ A, then A = B.
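For finite sets, the notation of this section can be tried out directly with Python's built-in `set` type. The sketch below is only an illustration: the sets B and C are hypothetical, and an interval such as [a, b] has infinitely many elements, so it cannot be listed this way.

```python
# Duplicated elements are regarded as one element.
print({1, 2, 3, 4, 5} == {1, 1, 1, 2, 2, 3, 4, 5})

# Set-builder notation A = {x in B | property}.
# Hypothetical choice: B = {0, 1, ..., 9} and the property "x is even".
B = set(range(10))
A = {x for x in B if x % 2 == 0}

print(4 in A, 3 in A)   # membership: 4 belongs to A, 3 does not
print(A <= B)           # A is a subset of B

# If A is a subset of C and C is a subset of A, then A = C.
C = {8, 6, 4, 2, 0}
print(A <= C and C <= A, A == C)
```

The order in which the elements of C are written plays no role, which matches the statement that a set is determined only by its elements.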

3 Operations with sets

Definition 3.1. For any two sets A and B, the set of elements belonging to A or to B or to both sets is called the union of A and B, and is written A ∪ B.

Definition 3.2. For any two sets A and B, the set of elements belonging to A and B at the same time is called the intersection of A and B, and is written A ∩ B.

Definition 3.3. For any two sets A and B, the set of elements of B that are not elements of A is called the difference of B and A, and is written B \ A. If the set A is a subset of B, then B \ A is called the complement of A in B and is denoted by C_B A.

Comment 3.1.

- The notions of union and intersection of sets can be extended to three, four or any number of sets. Namely, the union of n sets A, B, C, . . . is the set of those elements which belong to at least one of these sets. The intersection of n sets A, B, C, . . . is the set of those elements which belong simultaneously to each set.

- It is possible that two sets A and B have no elements in common. In such a case A ∩ B contains no elements. Nevertheless, it is still convenient to view A ∩ B as a set (containing no elements). It is called the empty (or null) set, and is denoted by the symbol ∅.

For any set A we have A ⊃ A and A ⊃ ∅; thus A and ∅ are subsets of A; they are called improper subsets, all other subsets being proper subsets.

Sometimes, the union of sets is called the sum of sets, and the intersection of sets the product of sets.

Usually, the operations of union and intersection of sets are defined on the set of all subsets of a given set S. These operations, for any A, B, C ⊂ S, satisfy the following properties:

• (A ∪ B) ∪ C = A ∪ (B ∪ C) (associativity of union);

• (A ∩ B) ∩ C = A ∩ (B ∩ C) (associativity of intersection);

• A ∪ B = B ∪ A (commutativity of union);

• A ∩ B = B ∩ A (commutativity of intersection);

• (A ∪ B) ∩ C = (A ∩ C) ∪ (B ∩ C) (distributivity of intersection over union);

• (A ∩ B) ∪ C = (A ∪ C) ∩ (B ∪ C) (distributivity of union over intersection);

• for A ⊂ S there is a unique B ⊂ S such that A ∪ B = S and A ∩ B = ∅: this set is S \ A;

• the set S possesses the property A ∩ S = A for any A ⊂ S, and the empty set ∅ possesses the property ∅ ∩ A = ∅ for any A.

There are identities, known as the rules of De Morgan, which relate the operations of complementation, taking unions, and taking intersections. These rules are expressed by the formulas:

C_S(A ∪ B) = C_S A ∩ C_S B;

C_S(A ∩ B) = C_S A ∪ C_S B.

Definition 3.4. For any two sets A and B, the set of ordered couples (a, b) with a ∈ A, b ∈ B is called the cartesian product of A and B and is denoted by A × B.

The cartesian product has the following properties:

A × (B ∪ C) = (A × B) ∪ (A × C);

A × (B ∩ C) = (A × B) ∩ (A × C)

for any sets A, B, C.
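The identities of this section can be checked on small finite sets. The following sketch uses hypothetical sets S, A, B, C chosen only for illustration; a check on one example is of course not a proof, it merely shows how the formulas read in practice.

```python
from itertools import product

S = set(range(8))                                   # hypothetical given set S
A, B, C = {0, 1, 2, 3}, {2, 3, 4, 5}, {1, 3, 5, 7}  # hypothetical subsets of S

def complement(X):
    """Complement of X in S."""
    return S - X

def cartesian(X, Y):
    """Cartesian product of X and Y as a set of ordered couples."""
    return set(product(X, Y))

checks = {
    "intersection distributes over union": (A | B) & C == (A & C) | (B & C),
    "union distributes over intersection": (A & B) | C == (A | C) & (B | C),
    "De Morgan, complement of a union": complement(A | B) == complement(A) & complement(B),
    "De Morgan, complement of an intersection": complement(A & B) == complement(A) | complement(B),
    "product with a union": cartesian(A, B | C) == cartesian(A, B) | cartesian(A, C),
    "product with an intersection": cartesian(A, B & C) == cartesian(A, B) & cartesian(A, C),
}

for name, holds in checks.items():
    print(name, holds)
```

Here `|`, `&` and `-` are Python's operators for union, intersection and difference of sets.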

4 Relations

Definition 4.1. A binary relation in the set A is a subset R of the cartesian product A × A: R ⊂ A × A.

Traditionally, the membership (x, y) ∈ R is denoted by xRy.

The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a binary relation in the set of all real numbers ℝ.

Definition 4.2. A binary relation R in the set A is called reflexive if for any x ∈ A we have xRx.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a reflexive binary relation in the set of all real numbers ℝ.

Definition 4.3. A binary relation R in the set A is called symmetric if

xRy ⇒ yRx for any x, y ∈ A.

The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a symmetric binary relation in the set of all real numbers ℝ.

Definition 4.4. A binary relation R in the set A is called antisymmetric if

xRy and yRx ⇒ x = y for any x, y ∈ A.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is an antisymmetric binary relation in the set of all real numbers ℝ.

Definition 4.5. A binary relation R in the set A is called transitive if

xRy and yRz ⇒ xRz for any x, y, z ∈ A.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a transitive binary relation in the set of all real numbers ℝ.

Definition 4.6. A binary relation R in the set A is total if for any x, y ∈ A, at least one of the following statements is true: xRy, yRx.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a total binary relation in the set of all real numbers ℝ.

Definition 4.7. A binary relation R in the set A is partial if there exist x, y ∈ A such that none of the following statements is true: xRy, yRx.

The set R = {(x, y) ∈ ℝ × ℝ : x² + y² ≤ 1} is a partial binary relation in the set of all real numbers ℝ.

Definition 4.8. A binary relation R in the set A is a relation of partial order if it satisfies the following properties: R is a partial relation; R is reflexive; R is antisymmetric; R is transitive.

The inclusion of sets is a relation of partial order in the set of all parts of a given set S.

Definition 4.9. A binary relation R in the set A is a relation of total order if it satisfies the following properties: R is a total relation; R is reflexive; R is antisymmetric; R is transitive.

The set R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0} is a relation of total order in the set of all real numbers ℝ.

Definition 4.10. A set A, together with a relation of partial order R in A, is called a partially ordered system and is denoted by (A, R).

The set of all parts of a given set S, together with the relation of inclusion, is a partially ordered system.

Definition 4.11. A set A, together with a relation of total order R in A, is called a totally ordered system and is also denoted by (A, R).

The set of real numbers ℝ, together with the binary relation R = {(x, y) ∈ ℝ × ℝ : x − y ≤ 0}, is a totally ordered system.
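On a finite set, the properties from Definitions 4.2 to 4.6 can be tested exhaustively. The sketch below restricts the two relations used in the examples above to the hypothetical finite set {−3, ..., 3} (an assumption made only so that the relations can be stored as finite sets of ordered couples); the function names are illustrative.

```python
from itertools import product

def is_reflexive(A, R):
    return all((x, x) in R for x in A)

def is_symmetric(A, R):
    return all((y, x) in R for (x, y) in R)

def is_antisymmetric(A, R):
    return all(x == y for (x, y) in R if (y, x) in R)

def is_transitive(A, R):
    return all((x, z) in R for (x, y) in R for (w, z) in R if y == w)

def is_total(A, R):
    return all((x, y) in R or (y, x) in R for x, y in product(A, repeat=2))

A = range(-3, 4)
# x R y if and only if x - y <= 0, restricted to A.
leq = {(x, y) for x, y in product(A, repeat=2) if x - y <= 0}
# x R y if and only if x**2 + y**2 <= 1, restricted to A.
disc = {(x, y) for x, y in product(A, repeat=2) if x ** 2 + y ** 2 <= 1}

for name, R in (("x - y <= 0", leq), ("x^2 + y^2 <= 1", disc)):
    print(name,
          "reflexive:", is_reflexive(A, R),
          "symmetric:", is_symmetric(A, R),
          "antisymmetric:", is_antisymmetric(A, R),
          "transitive:", is_transitive(A, R),
          "total:", is_total(A, R))
```

A relation is partial in the sense of Definition 4.7 exactly when `is_total` returns `False`.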

Definition 4.12. Let (A, R) be a partially ordered system and A′ a subset of A: A′ ⊂ A. An element a ∈ A is an upper bound for the set A′ if a′Ra for any a′ ∈ A′. An upper bound a∗ for A′ is said to be a least upper bound for A′ if a∗Ra for any upper bound a of A′. If it exists, a least upper bound of A′ is denoted by sup A′.

Definition 4.13. Let (A, R) be a partially ordered system and A′ a subset of A: A′ ⊂ A. An element a ∈ A is a lower bound for the set A′ if aRa′ for any a′ ∈ A′. A lower bound a∗ for A′ is said to be a greatest lower bound for A′ if aRa∗ for any lower bound a of A′. If it exists, a greatest lower bound of A′ is denoted by inf A′.

Definition 4.14. Let (A, R) be a partially ordered system. An element a ∈ A is maximal if for any a′ ∈ A with the property aRa′, one has a′Ra.

The family P(X) of all subsets of a set X affords an illustration of these concepts. The inclusion relation R = ⊂ between the sets contained in X makes the pair (P(X), ⊂) a partially ordered system. An upper bound for a subfamily $\mathcal{B} \subset P(X)$ is any set containing $\bigcup_{B \in \mathcal{B}} B$, and $\bigcup_{B \in \mathcal{B}} B$ is the only least upper bound of $\mathcal{B}$. Similarly, $\bigcap_{B \in \mathcal{B}} B$ is the only greatest lower bound of $\mathcal{B}$. The only maximal element of P(X) is X.

Definition 4.15. A relation R in a set A is called an equivalency if it possesses the following properties: R is reflexive, symmetric and transitive.

For example, the equality of sets in the set of parts P(X) of a given set X is an equivalency.

The set R = {(x, y) ∈ ℤ × ℤ : x − y is divisible by 5} is a relation of equivalency in the set of integers ℤ.
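To see what the last example does in practice, the sketch below groups a finite range of integers into classes of mutually related elements. The helper `equivalence_classes` and the range −5, ..., 9 are hypothetical choices for illustration; the grouping is only meaningful because the relation is reflexive, symmetric and transitive.

```python
def related(x, y):
    """x R y if and only if x - y is divisible by 5."""
    return (x - y) % 5 == 0

def equivalence_classes(elements, related):
    """Group elements into classes of mutually related elements.

    Assumes `related` is reflexive, symmetric and transitive, so it is
    enough to compare each element with one representative per class.
    """
    classes = []
    for x in elements:
        for cls in classes:
            if related(x, cls[0]):
                cls.append(x)
                break
        else:
            classes.append([x])
    return classes

classes = equivalence_classes(range(-5, 10), related)
for cls in classes:
    print(cls)
```

Each printed class collects the integers that leave the same remainder on division by 5.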

Definition 4.16. A relation R between the elements of a set A and the elements of a set B is a subset of the cartesian product A × B: R ⊂ A × B.

Traditionally, (x, y) ∈ R is denoted xRy.

Definition 4.17. A function (mapping) f of the set A into the set B, written f : A → B, is a relation R between the elements of the sets A and B (R ⊂ A × B) which possesses the following properties:

a) for every x ∈ A there exists y ∈ B such that xRy;

b) if (x, y1), (x, y2) ∈ R, then y1 = y2.

Traditionally, a function f defined on the set A into the set B is denoted by f : A → B.
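For finite sets, conditions a) and b) of Definition 4.17 can be checked mechanically. In the sketch below the sets A, B and the relations f, g, h are hypothetical examples; `is_function` is an illustrative name.

```python
def is_function(A, B, R):
    """Check that R, a set of couples, is a function of A into B (Definition 4.17)."""
    # R must be a subset of the cartesian product A x B.
    inside = all(x in A and y in B for (x, y) in R)
    # a) for every x in A there exists y in B such that x R y.
    everywhere_defined = all(any((x, y) in R for y in B) for x in A)
    # b) if (x, y1) and (x, y2) belong to R, then y1 = y2.
    single_valued = all(y1 == y2 for (x1, y1) in R for (x2, y2) in R if x1 == x2)
    return inside and everywhere_defined and single_valued

A = {1, 2, 3}
B = {"a", "b"}
f = {(1, "a"), (2, "a"), (3, "b")}             # a function of A into B
g = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}   # violates b): 1 has two values
h = {(1, "a"), (2, "b")}                       # violates a): 3 has no value

print(is_function(A, B, f), is_function(A, B, g), is_function(A, B, h))
```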

5 Functions

The notion of function plays an important role in mathematics. It is not a basic notion, since we have already seen that it can be defined in terms of sets. However, for those starting mathematical analysis, it is easiest to consider a mapping (function) as a basic notion, clarifying it by examples and describing it in a manner that satisfies common sense.

If for every x ∈ A an element y ∈ B is chosen according to some rule, then we say that there is a function (mapping) f of the set A into the set B, written f : A → B. Thus, a function is defined uniquely by the rule which makes every x ∈ A correspond to a y ∈ B.

What does the above description of a function lack for it to be a strict definition? Firstly, we must explain what a rule is; secondly, what a correspondence is. Intuitively it is clear what a rule and a correspondence are. In simple cases, these notions do not involve misunderstandings and are sufficient for a meaningful mathematical theory to be constructed on their basis.

Let us note once again that the rule defining the element y ∈ B is applicable to every x ∈ A. The element x ∈ A is called the argument of the function f, the element y ∈ B is called the value of the function f corresponding to the element x ∈ A, written y = f(x), and the function itself is a rule which "processes" every x ∈ A into y = f(x).

The set A is called the domain of the function, and the set of all the elements y ∈ B for which there is an x ∈ A such that y = f(x) is called the range of the function f.

We shall consider functions which associate every real number x ∈ A ⊂ ℝ¹ with a number y = f(x) ∈ ℝ¹. For this kind of function, a rule can be given by an explicit algebraic expression; for instance:

$$y = x^2 + 2x; \qquad y = \frac{1 - x}{\sqrt{x + 2}}; \qquad y = \sqrt[5]{1 + \sqrt[7]{x}}.$$

The right-hand sides of the equalities contain the rule that "processes" x into y. The rule in the first expression is: each x should be squared and added to twice x. The rules in the second and third expressions can be formulated in a similar way.

The rule can also be given by the symbols exp, log_a, sin, cos, tan, cot and by combinations of these symbols and algebraic operations. For instance,

$$y = \log_2 \sqrt{1 + \sin x}; \qquad y = \frac{1}{(\tan x)^{\frac{1}{2}} - 2^x}.$$

The right-hand sides of the equalities define the rules for "processing" x into y.

The rule can also be given by another frequent method. Let f₁ and f₂ be functions defined by expressions such as those given above, and let a be a number. We then set:

$$f(x) = \begin{cases} f_1(x) & \text{for } x < a \\ f_2(x) & \text{for } x \ge a \end{cases}$$

The above equality can be interpreted as a rule according to which every x has a corresponding y. This rule can be formulated thus: if x is less than a, then the corresponding y is computed by the rule f₁; but if x is greater than or equal to a, then the corresponding y is determined by the rule f₂.
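A rule of this kind translates directly into a conditional. The sketch below is illustrative only: the helper name `piecewise` and the choices f₁(x) = x² + 2x, f₂(x) = 1 − x, a = 0 are hypothetical.

```python
def piecewise(f1, f2, a):
    """Return the function equal to f1(x) for x < a and to f2(x) for x >= a."""
    def f(x):
        return f1(x) if x < a else f2(x)
    return f

# Hypothetical choices: f1(x) = x**2 + 2*x, f2(x) = 1 - x, a = 0.
f = piecewise(lambda x: x ** 2 + 2 * x, lambda x: 1 - x, 0)

print(f(-3))  # computed by the rule f1, since -3 < 0
print(f(0))   # computed by the rule f2, since 0 >= 0
print(f(2))   # computed by the rule f2, since 2 >= 0
```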

6 Composite function. Inverse of a function.

Definition 6.1. Let f : X → U, g : U → Y be, respectively, mappings of the set X into the set U and of the set U into the set Y. For every x ∈ X the element g(f(x)) belongs to the set Y. The correspondence x ↦ g(f(x)) defines a mapping of the set X into the set Y which is denoted by g ◦ f and called the composition of the mappings f and g.

If X, U, Y are sets of numbers, then the composition g ◦ f of the mappings (functions) is called the superposition of the functions, or a composite function.

Comment 6.1. The rule associating the element x ∈ X with the element g(f(x)) is that the mapping f is applied first to x (as a result, the element f(x) ∈ U is obtained), and then the mapping g is applied to the obtained element f(x) ∈ U; finally we have g(f(x)) ∈ Y.

For instance:

- if y = u², u = sin x, then y = (sin x)² = sin² x;

- if y = tan u, u = x², then y = tan(x²);

- if y = cos u, u = x/2, then y = cos(x/2);

are composite functions.
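The three composite functions above can be built with a small helper that applies f first and g second. The helper name `compose` and the variable names are illustrative assumptions.

```python
import math

def compose(g, f):
    """Return the composition g o f, i.e. the mapping x -> g(f(x))."""
    return lambda x: g(f(x))

square_of_sine = compose(lambda u: u ** 2, math.sin)   # y = (sin x)^2
tan_of_square = compose(math.tan, lambda x: x ** 2)    # y = tan(x^2)
cos_of_half = compose(math.cos, lambda x: x / 2)       # y = cos(x / 2)

x = 1.0
print(square_of_sine(x), math.sin(x) ** 2)
print(tan_of_square(x), math.tan(x ** 2))
print(cos_of_half(x), math.cos(x / 2))

# The order of composition matters: sin(x^2) and (sin x)^2 are different functions.
sine_of_square = compose(math.sin, lambda x: x ** 2)
print(sine_of_square(2.0), square_of_sine(2.0))
```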

Definition 6.2. A mapping (function) f : X → Y is said to be injective (an injection) if different values of the argument have different values of the function.

Definition 6.3. A mapping (function) f : X → Y is said to be surjective (a surjection) if every y ∈ Y is the image of some x ∈ X, that is, there is an x ∈ X such that f(x) = y.

Definition 6.4. A mapping (function) f : X → Y is said to be bijective (a bijection) if it is both injective and surjective.

Comment 6.2.

1. An injective mapping possesses the following property: different values of the argument correspond to different values of the function. For instance, the number functions y = 5x, y = e^x, y = arctan x are injective.

2. Surjective functions are also called "onto mappings". For instance, the number function y = sin x is a surjective mapping of ℝ¹ onto the set [−1, 1], but it is not a surjective mapping of ℝ¹ onto all of ℝ¹ (there is no inverse image of the point y = 2).

3. A bijective function is a one-to-one mapping f : X → Y. This means that every x ∈ X has a corresponding y ∈ Y, y = f(x), with different x ∈ X having different corresponding y ∈ Y, and every y ∈ Y having a corresponding x ∈ X (such that y = f(x), different x corresponding to different y ∈ Y).

Definition 6.5. Let f : X → Y be a bijective mapping. Then for every y ∈ Y there exists a unique x ∈ X such that f(x) = y. The correspondence y ↦ x defines a mapping Y → X, which is called the inverse of f and is denoted by f⁻¹. For number sets X and Y, the mapping f⁻¹ is called the inverse of the function f (or an inverse function).

Comment 6.3.

1. The rule in Definition 6.5 implies the following property of an inverse mapping (inverse function):

f(f⁻¹(y)) = y for any y ∈ Y.

2. The functions (mappings) f and f⁻¹ are mutually inverse, that is, (f⁻¹)⁻¹ = f.

3. To find the inverse of a given number function y = f(x), we must express x in terms of y. Thus, for y = 3x + 2 the inverse mapping is x = (y − 2)/3; for y = x³ it is x = ∛y; for y = 10^x it is x = log₁₀ y.
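The property f(f⁻¹(y)) = y can be checked numerically for the three inverse pairs of item 3. The sketch below is an illustration under stated assumptions: the sample points are hypothetical, equality is tested only up to floating-point rounding, and the cube root is computed by a small helper so that negative arguments are handled as real numbers.

```python
import math

def cube_root(y):
    """Real cube root, valid for negative y as well."""
    return math.copysign(abs(y) ** (1.0 / 3.0), y)

# Each entry is (f, inverse of f, sample points y in the range of f).
examples = [
    (lambda x: 3 * x + 2, lambda y: (y - 2) / 3, [-4.0, 0.0, 5.0]),
    (lambda x: x ** 3, cube_root, [-8.0, 0.0, 27.0]),
    (lambda x: 10 ** x, math.log10, [0.01, 1.0, 1000.0]),
]

def is_right_inverse(f, f_inv, ys):
    """Check f(f_inv(y)) == y, up to rounding, on the sample points."""
    return all(math.isclose(f(f_inv(y)), y, rel_tol=1e-9, abs_tol=1e-9) for y in ys)

results = [is_right_inverse(f, f_inv, ys) for f, f_inv, ys in examples]
print(results)
```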

7 Logic symbols

The expressions "for any element" and "there exists" are frequently used in mathematics. They are designated in a special manner:

- the first is denoted by the symbol ∀ (the first letter of the word "Any", inverted);

- the second by the symbol ∃ (the first letter of the word "Exist", reflected).

We shall also use the symbol ⇒ to mean "follows". Thus, if A and B are two sentences, then A ⇒ B means that B follows from A.

If A ⇒ B and B ⇒ A, then the sentences A and B are said to be equivalent, written A ⇔ B (A is equivalent to B).

Using this notation, the injectivity of a mapping f : X → Y can be written in the form:

∀ x₁, x₂ ∈ X, x₁ ≠ x₂ ⇒ f(x₁) ≠ f(x₂)

and the surjectivity of the same mapping in the form:

∀ y ∈ Y, ∃ x ∈ X | f(x) = y;

the vertical line before f(x) = y is read "such that".
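For finite sets, the two quantified statements above can be evaluated literally: Python's `all` plays the role of ∀ and `any` the role of ∃. The sets and functions in this sketch are hypothetical examples.

```python
def is_injective(f, X):
    """For all x1, x2 in X: x1 != x2 implies f(x1) != f(x2)."""
    return all(f(x1) != f(x2) for x1 in X for x2 in X if x1 != x2)

def is_surjective(f, X, Y):
    """For all y in Y there exists x in X such that f(x) = y."""
    return all(any(f(x) == y for x in X) for y in Y)

X = [-2, -1, 0, 1, 2]

def square(x):
    return x * x

def double(x):
    return 2 * x

print(is_injective(square, X))                # square(-1) == square(1)
print(is_surjective(square, X, [0, 1, 4]))
print(is_injective(double, X))
print(is_surjective(double, X, range(-4, 5)))  # odd numbers are not reached
```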

The designation $A \overset{\text{def}}{\iff} B$ is used when we want to describe a notion A using a sentence B. It is read "A is by definition B". For instance, the notation

$$X \subset Y \overset{\text{def}}{\iff} \{(\forall x)\,(x \in X) \Rightarrow (x \in Y)\}$$

defines X as a subset of the set Y: the right-hand side of this notation is a sentence, and it is read "any element x of X is also an element of the set Y".

8 Converse theorem and contrary theorem

Many mathematical statements (including theorems) have the following form: "if A, then B", or, which is the same, "B follows from A", A ⇒ B, where A is the condition and B is the conclusion of the theorem.

For any statement A ⇒ B we can construct a new statement by interchanging A and B, namely B ⇒ A, that is, "if B, then A", "A follows from B".

The theorem (statement) B ⇒ A is the converse of the theorem (statement) A ⇒ B. It is obvious that the converse of a converse is the original theorem; therefore the two theorems are said to be mutually converse. If the direct theorem is true, its converse may be either true or false.

Example 8.1. The direct theorem (Pythagoras' theorem) is: if a triangle is right-angled, then the square of the hypotenuse is equal to the sum of the squares of the other two sides. The converse is: if the square of the longest side equals the sum of the squares of the two shorter sides, then the triangle is right-angled. In this case, both the direct theorem and the converse are true.

Example 8.2. The direct theorem is: if two angles are right angles, then they are equal. The converse is: if two angles are equal, then they are right angles. Here the direct theorem is true, but the converse is false.

For any statement A we denote by $\overline{A}$ the proposition that A is false.

Example 8.3. If A denotes the statement "7 is an even number", then $\overline{A}$ denotes the statement "7 is not an even number". If A is the statement "It will rain tomorrow", then $\overline{A}$ is the statement "It will not rain tomorrow". If A is the statement "All bullets will hit the target", then $\overline{A}$ is the statement "At least one bullet will not hit the target".

For the theorem "if A, then B", the statement "if $\overline{A}$, then $\overline{B}$" is called the contrary theorem. The contrary of a contrary theorem is the initial theorem.

Example 8.4. For the theorem "If the sum of two opposite angles in a quadrilateral is equal to 180°, then a circle can be circumscribed about the quadrilateral", the contrary theorem is "If the sum of two opposite angles in a quadrilateral is not equal to 180°, then a circle cannot be circumscribed about the quadrilateral".

In this case, both the direct theorem and its contrary are true.

The contrary theorem is equivalent to the converse. This means that the contrary theorem is true if and only if the converse theorem is true.
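The equivalence of the contrary and the converse can be verified by listing all truth values of A and B. The sketch below treats "if p, then q" as the usual truth function of propositional logic (false only when p is true and q is false); this reading is an assumption of the illustration, and the names are hypothetical.

```python
from itertools import product

def implies(p, q):
    """Truth value of 'if p, then q'."""
    return (not p) or q

rows = []
for A, B in product([False, True], repeat=2):
    direct = implies(A, B)             # if A, then B
    converse = implies(B, A)           # if B, then A
    contrary = implies(not A, not B)   # if not A, then not B
    rows.append((A, B, direct, converse, contrary))

print("A      B      direct converse contrary")
for row in rows:
    print(*row)
```

In every row the last two columns agree, while the direct statement and its converse differ in some rows, which matches Example 8.2.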

9 Necessity and sufficiency

Let the statement "if A, then B" be true. In this case the condition A is said to be sufficient for B, and the condition B to be necessary for A.

Let the converse also be true, that is, "if B, then A". In this case B is a sufficient condition for A, and the condition A is necessary for B.

Thus, the condition A is necessary and sufficient for B (and the condition B is necessary and sufficient for A). In other words, conditions A and B are equivalent: A is true if and only if B is true.

Example 9.1. Bezout's theorem is: "If α is a root of a polynomial P(x), then the polynomial P(x) is divisible by x − α without remainder". The converse is: "If a polynomial P(x) is divisible by x − α, then α is a root of the polynomial P(x)". We know that both Bezout's theorem and its converse are true. Therefore, a necessary and sufficient condition for the number α to be a root of a polynomial P(x) is that "the polynomial P(x) is divisible by x − α". The following statement is also true: "for a polynomial P(x) to be divisible by x − α without remainder, it is necessary and sufficient that the number α be a root of the polynomial P(x)".
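Example 9.1 can be illustrated by dividing a polynomial by x − α and inspecting the remainder. The sketch below uses Horner's scheme (synthetic division), a standard technique that is not named in the text; the polynomial P(x) = x³ − 6x² + 11x − 6 = (x − 1)(x − 2)(x − 3) is a hypothetical example.

```python
def divide_by_linear(coeffs, alpha):
    """Divide P(x) by (x - alpha) using Horner's scheme.

    coeffs lists the coefficients of P from the highest degree down.
    Returns (quotient coefficients, remainder); the remainder equals P(alpha).
    """
    acc = []
    value = 0
    for c in coeffs:
        value = value * alpha + c
        acc.append(value)
    return acc[:-1], acc[-1]

# Hypothetical polynomial P(x) = x^3 - 6x^2 + 11x - 6.
P = [1, -6, 11, -6]

for alpha in (2, 4):
    quotient, remainder = divide_by_linear(P, alpha)
    print(alpha, quotient, remainder)
```

For α = 2, a root of P, the remainder is zero; for α = 4, which is not a root, the remainder equals P(4) and is different from zero.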

Part II

Single variable calculus

10 Topology in R1

Definition 10.1. A neighborhood of the point $x \in \mathbb{R}^1$ is a set $V \subset \mathbb{R}^1$ which contains an open interval $(a, b) \subset \mathbb{R}^1$ containing $x$; i.e. $x \in (a, b) \subset V$.
For instance, any open interval containing $x$ is a neighborhood of the point $x$.

Definition 10.2. Let $A \subset \mathbb{R}^1$. A point $x \in \mathbb{R}^1$ is called an interior point of the set $A$ if there exists an open interval $(a, b)$ such that $x \in (a, b) \subset A$.
For instance, every point $x$ of the open interval $(a, b)$ is an interior point of the set $(a, b)$.

Definition 10.3. The interior of a set $A \subset \mathbb{R}^1$ is the set of all interior points of $A$.
Usually, the interior of a set $A$ is denoted by $\mathring{A}$ or $\mathrm{Int}(A)$.
For instance, if $A$ is an open interval $A = (a, b)$, then $\mathring{A} = (a, b) = A$.

Definition 10.4. A set $A \subset \mathbb{R}^1$ is open if $\mathring{A} = A$.
For instance, any open interval is an open set.
A set $A \subset \mathbb{R}^1$ is open if and only if it contains a neighborhood of each of its points.
The union of any family of open sets is open.
The set of all real numbers $\mathbb{R}^1$ and the empty set $\emptyset$ are open.
The intersection of a finite number of open sets is open.

Definition 10.5. A set $A \subset \mathbb{R}^1$ is said to be closed if its complement is open.
The intersection of any family of closed sets is closed.
The union of a finite number of closed sets is closed.
The set of all real numbers $\mathbb{R}^1$ and the empty set are closed.
Any closed interval $[a, b]$ is a closed set.

Definition 10.6. If $A$ is a subset of $\mathbb{R}^1$, then a point $x \in \mathbb{R}^1$ is a limit point, or a point of accumulation, of $A$ provided every neighborhood of $x$ contains at least one point $y \neq x$ with $y \in A$.

Definition 10.7. The closure $\overline{A}$ of a set $A \subset \mathbb{R}^1$ is the intersection of all closed sets containing $A$. The set of points belonging to $\overline{A}$ and not to the interior $\mathring{A}$ of $A$ is called the boundary of $A$, usually denoted by $\partial A$.

The closure operation has the following properties:

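a) $\overline{A \cup B} = \overline{A} \cup \overline{B}$;

b) $\overline{A} \supset A$;

c) $\overline{\overline{A}} = \overline{A}$;

d) $\overline{A} = A$ if and only if $A$ is a closed set;

e) $x \in \overline{A}$ if and only if every neighborhood $V(x)$ of $x$ intersects $A$.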

Definition 10.8. A set $A \subset \mathbb{R}^1$ is bounded if there exist $m, M \in \mathbb{R}^1$ such that $m \le x \le M$ for every $x \in A$.

Definition 10.9. A set $A \subset \mathbb{R}^1$ is compact if it is both bounded and closed.

For instance, any closed interval $[a, b]$ is compact.

11 Sequences

Definition 11.1. A function whose domain is the set of positive integers $\mathbb{N} = \{1, 2, \ldots, n, \ldots\}$ and whose values belong to the set $\mathbb{R}^1$ of real numbers is called a sequence of real numbers.

Comment 11.1. The value of the function (defining a sequence of real numbers) corresponding to the argument 1 is denoted by $a_1$, that corresponding to the argument 2 by $a_2$, ..., that corresponding to the argument $n$ by $a_n$. Here, $a_1$ is called the first term of the sequence, $a_2$ the second term, ..., $a_n$ the n-th term.
The sequence $a_1, a_2, \ldots, a_n, \ldots$ is denoted by $(a_n)$.

In order to define a sequence, the values of the first, second, ..., and n-th terms of the sequence must be indicated. In other words, a rule must be given for evaluating the n-th term of the sequence, given its place in the sequence, for $n = 1, 2, \ldots$.

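Example 11.1. Let $a_n = q^{n-1}$, $q \neq 0$; then $a_1 = 1$, $a_2 = q$, $a_3 = q^2$, ..., $a_n = q^{n-1}$, ... .

Example 11.2. Let $a_n = \frac{1}{n}$; then $a_1 = 1$, $a_2 = \frac{1}{2}$, $a_3 = \frac{1}{3}$, ..., $a_n = \frac{1}{n}$, ... .

Example 11.3. Let $a_n = n^2$; then $a_1 = 1$, $a_2 = 4$, $a_3 = 9$, ..., $a_n = n^2$, ... .

Example 11.4. Let $a_n = (-1)^n$; then $a_1 = -1$, $a_2 = 1$, $a_3 = -1$, ..., $a_n = (-1)^n$, ... . Thus:

$$a_n = \begin{cases} -1 & \text{for } n \text{ odd} \\ 1 & \text{for } n \text{ even} \end{cases}$$

Example 11.5. Let $a_n = \frac{1 + (-1)^n}{2}$; then $a_1 = 0$, $a_2 = 1$, $a_3 = 0$, $a_4 = 1$. Thus:

$$a_n = \begin{cases} 0 & \text{for } n \text{ odd} \\ 1 & \text{for } n \text{ even} \end{cases}$$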

It may happen that as the number $n$ increases, the terms $a_n$ of the sequence increase too.

Definition 11.2. An increasing sequence $(a_n)$ is one in which $a_n \le a_{n+1}$ for all $n \in \mathbb{N}$.

Definition 11.3. A decreasing sequence $(a_n)$ is one in which $a_n \ge a_{n+1}$ for all $n \in \mathbb{N}$.

Definition 11.4. A sequence which is either increasing or decreasing is called a monotone sequence.

Example 11.6. If $q > 1$, then the sequence $a_n = q^n$ is increasing, and if $0 < q < 1$, then the sequence $a_n = q^n$ is decreasing. If $q \in (0, +\infty)$ and $q \neq 1$, then the sequence $a_n = q^n$ is monotone.

Definition 11.5. A sequence $(a_n)$ is called bounded if there exists a number $M$ such that $|a_n| \le M$ for all $n$.

For instance, if $0 < q < 1$, then the sequence $a_n = q^n$ is bounded ($|a_n| < 1$). The sequence $a_n = (-1)^n$ is also bounded ($|a_n| \le 1$).

Definition 11.6. A sequence $(a_n)$ is called unbounded if it is not bounded; in other words, if for any $M > 0$ there exists $n_M$ such that $|a_{n_M}| > M$.
For instance, if $q > 1$, then the sequence $a_n = q^n$ is unbounded.

Definition 11.7. If $(a_n)$ is a sequence, then any sequence $(a_{n_k})$, where $(n_k) = n_1, n_2, \ldots$ is a strictly increasing sequence of positive integers, is called a subsequence of the sequence $(a_n)$.

Comment 11.2.

• any subsequence of an increasing sequence is increasing;

• any subsequence of a decreasing sequence is decreasing;

• any subsequence of a bounded sequence is bounded.

12 Convergence

It may happen that as the number $n$ increases without bound, the terms $a_n$ of the sequence closely approach a certain number $L$. In this case we arrive at an important mathematical concept, that of the limit of a sequence.

Definition 12.1. A number $L$ is said to be the limit of the sequence $(a_n)$ if for any number $\varepsilon > 0$ there is a number $N$ (dependent on $\varepsilon$) such that all the terms $a_n$ of the sequence with subscript $n$ exceeding $N$ satisfy the condition:

$$|a_n - L| < \varepsilon.$$

In this case we write $\lim_{n\to\infty} a_n = L$ and read: "as n tends to infinity, the limit of $a_n$ equals L", or $a_n \xrightarrow[n\to\infty]{} L$ and read: "as n tends to infinity, $a_n$ tends to L".

If $a_n \xrightarrow[n\to\infty]{} L$, then the sequence $(a_n)$ is said to be convergent to $L$.

Comment 12.1.

• If the sequence $(a_n)$ converges to $L$, then any subsequence $(a_{n_k})$ of the sequence $(a_n)$ converges to $L$.
Indeed: for any $\varepsilon > 0$ there exists $N$ such that for $n > N$ we have $|a_n - L| < \varepsilon$. Hence for $n_k > N$ we have $|a_{n_k} - L| < \varepsilon$.

• Not every sequence has a limit. For instance, the sequence $a_n = (-1)^n$ has no limit. That is because the subsequence $a_{2k} = (-1)^{2k} = 1$ converges to $1$ and the subsequence $a_{2k+1} = (-1)^{2k+1} = -1$ converges to $-1$.

• The limit of a sequence $(a_n)$, if it exists, is unique.
Assuming the contrary, that is, $(a_n)$ converges to $L_1$ and $L_2$ with $L_1 \neq L_2$, we find $N_1$ and $N_2$ such that $|a_n - L_1| < |L_1 - L_2|/2$ for $n > N_1$ and $|a_n - L_2| < |L_1 - L_2|/2$ for $n > N_2$. Since for $n > \max\{N_1, N_2\}$ we have $|L_1 - L_2| \le |L_1 - a_n| + |L_2 - a_n| < |L_1 - L_2|$, we obtain $|L_1 - L_2| < |L_1 - L_2|$, which is absurd.

• If the sequence $(a_n)$ converges to $L$, then it is bounded. Indeed, considering $\varepsilon = 1$ and $N_1$ such that $|a_n - L| < 1$ for $n > N_1$, we have

$$|a_n| = |a_n - L + L| \le |a_n - L| + |L| < 1 + |L| \quad \text{for } n > N_1.$$

Therefore $|a_n| \le \max\{|a_1|, |a_2|, \ldots, |a_{N_1}|, 1 + |L|\}$ for all $n$.

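Example 12.1. Let us show that $\lim_{n\to\infty} \frac{1}{\sqrt{n}} = 0$.

Indeed, let $\varepsilon > 0$. Consider the inequality

$$\left| \frac{1}{\sqrt{n}} - 0 \right| < \varepsilon.$$

We have $\frac{1}{\sqrt{n}} < \varepsilon$, $\frac{1}{n} < \varepsilon^2$, that is $n > \frac{1}{\varepsilon^2}$.

We set $N = \left[\frac{1}{\varepsilon^2}\right] + 1$, where $\left[\frac{1}{\varepsilon^2}\right]$ is the integral part of the number $\frac{1}{\varepsilon^2}$. It is obvious that if $n > N$, then $n > \frac{1}{\varepsilon^2}$ and the inequality $\left| \frac{1}{\sqrt{n}} - 0 \right| < \varepsilon$ will be fulfilled.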
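The choice of $N$ in Example 12.1 can be checked numerically. The following is a minimal sketch, not part of the original notes: the function names `threshold` and `check` are illustrative, and the check only samples finitely many indices, so it illustrates the definition rather than proving anything. Floating-point rounding may shift the computed $N$ by one, which does not affect the inequality.

```python
import math

def threshold(eps):
    """N = [1/eps^2] + 1, the bound chosen in Example 12.1 (hypothetical helper)."""
    return math.floor(1.0 / eps**2) + 1

def check(eps, how_many=1000):
    """Test |1/sqrt(n) - 0| < eps for the first `how_many` indices n > N."""
    n0 = threshold(eps)
    return all(1.0 / math.sqrt(n) < eps
               for n in range(n0 + 1, n0 + 1 + how_many))

for eps in (0.5, 0.1, 0.01):
    print(eps, threshold(eps), check(eps))
```

Smaller values of `eps` give larger thresholds, which matches the dependence of $N$ on $\varepsilon$ in Definition 12.1.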

Note that when proving the existence of a limit we calculated the number $N$ for the given $\varepsilon$ in a formal, textbook manner. From now on, we shall compute limits using other, simpler and more convenient rules.

In some cases the limit of a sequence $(a_n)$ is said to be infinity. The meaning of this concept is the following:

Definition 12.2. The limit of the sequence $(a_n)$ is said to be $+\infty$ if for any $M > 0$ there is $N_M$ such that $a_n > M$ for $n > N_M$.

For instance, the limit of the sequence $a_n = n^2$ is $+\infty$.

Definition 12.3. The limit of the sequence $(a_n)$ is said to be $-\infty$ if for any $M > 0$ there is $N_M$ such that $a_n < -M$ for $n > N_M$.

For instance, the limit of the sequence $a_n = -n^2$ is $-\infty$.

13 Rules (for convergence of sequences)

Suppose that $(a_n)$ and $(b_n)$ are convergent sequences with limits $a$ and $b$ respectively. Then the following rules apply:

Sum rule: $(a_n + b_n)$ converges to $a + b$.

Proof. Consider the inequality:

$$|(a_n + b_n) - (a + b)| = |(a_n - a) + (b_n - b)| \le |a_n - a| + |b_n - b|.$$

Given $\varepsilon > 0$, let $\varepsilon' = \frac{1}{2}\varepsilon$. Then $\varepsilon' > 0$ and, since $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$, there exist natural numbers $N_1$ and $N_2$ such that $n > N_1 \Rightarrow |a_n - a| < \varepsilon'$ and $n > N_2 \Rightarrow |b_n - b| < \varepsilon'$.
Let $N$ be the maximum of $N_1$ and $N_2$; then $n > N \Rightarrow |a_n - a| + |b_n - b| < 2\varepsilon' = \varepsilon$.
In other words, $(a_n + b_n)$ converges to $a + b$.

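Product rule: $(a_n \cdot b_n)$ converges to $a \cdot b$.

Proof. Since $\lim_{n\to\infty} b_n = b$, the sequence $(b_n)$ is bounded, so there is $M > 0$ such that $|b_n| \le M$ for any $n \in \mathbb{N}$. It follows that:

$$|a_n b_n - a b| = |a_n b_n - a b_n + a b_n - a b| = |b_n(a_n - a) + a(b_n - b)| \le |b_n| \cdot |a_n - a| + |a| \cdot |b_n - b| \le M |a_n - a| + |a| \cdot |b_n - b|, \quad \text{for all } n \in \mathbb{N}.$$

Given $\varepsilon > 0$, let $\varepsilon_1 = \frac{\varepsilon}{2M}$ and $\varepsilon_2 = \frac{\varepsilon}{2(|a| + 1)}$.

Since $\lim_{n\to\infty} a_n = a$ and $\lim_{n\to\infty} b_n = b$, there exist $N_1$ and $N_2$ such that

$$n > N_1 \Rightarrow |a_n - a| < \varepsilon_1 \quad \text{and} \quad n > N_2 \Rightarrow |b_n - b| < \varepsilon_2.$$

Let $N_3$ be the maximum of $N_1$ and $N_2$. If $n > N_3$ then $|a_n b_n - a b| < M \varepsilon_1 + |a| \varepsilon_2 < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon$.
In other words, $\lim_{n\to\infty} a_n \cdot b_n = a \cdot b$.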

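Quotient rule: $(a_n / b_n)$ converges to $a / b$ provided that $b_n \neq 0$ for each $n$ and $b \neq 0$.

Proof. Firstly it is shown that if $\lim_{n\to\infty} b_n = b \neq 0$ and $b_n \neq 0$ for all $n$, then $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{b}$.

It is clear that we have:

$$\left| \frac{1}{b_n} - \frac{1}{b} \right| = \frac{|b_n - b|}{|b_n| \cdot |b|}.$$

Since $\lim_{n\to\infty} b_n = b$ there exists an integer $N_1$ such that $|b_n - b| < \frac{1}{2}|b|$ for all $n > N_1$, and hence $|b_n| > \frac{1}{2}|b|$ for these $n$. Let $M$ be the maximum of $\frac{2}{|b|}, \frac{1}{|b_1|}, \ldots, \frac{1}{|b_{N_1}|}$. Then $\left| \frac{1}{b_n} \right| \le M$ for all $n$.

So, given any $\varepsilon > 0$, let $\varepsilon' = \frac{\varepsilon \cdot |b|}{M}$. Then $\varepsilon' > 0$ and there exists an integer $N_2$ such that $|b_n - b| < \varepsilon'$ for all $n > N_2$. Hence $\left| \frac{1}{b_n} - \frac{1}{b} \right| < \varepsilon$ for all $n > N_3$, where $N_3$ is the maximum of $N_1$ and $N_2$. In other words, $\lim_{n\to\infty} \frac{1}{b_n} = \frac{1}{b}$. By the product rule, then, $\lim_{n\to\infty} \frac{a_n}{b_n} = \frac{a}{b}$.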

Scalar product rule: $(k \cdot a_n)$ converges to $k \cdot a$ for every real number $k$.
The scalar product rule is a special case of the product rule.

Application 13.1. Evaluate

$$\lim_{n\to\infty} \frac{n^2 + 2n + 3}{4n^2 + 5n + 6}.$$

Solution: The quotient rule cannot be applied directly, since neither the numerator nor the denominator of $\frac{n^2 + 2n + 3}{4n^2 + 5n + 6}$ converges to a finite limit.

However, if the numerator and denominator are divided by the dominant term $n^2$, the following is obtained:

$$a_n = \frac{1 + \frac{2}{n} + \frac{3}{n^2}}{4 + \frac{5}{n} + \frac{6}{n^2}}.$$

It is easy to prove that $\frac{1}{n} \to 0$ as $n \to \infty$ and that the constant sequence $(k)$ has limit $k$. Hence $\lim_{n\to\infty} a_n = \frac{1}{4}$, freely using the sum, product, scalar product and quotient rules.
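As a numerical illustration of Application 13.1, the sketch below evaluates the original quotient and the form obtained after dividing by $n^2$, and prints the distance to $\frac{1}{4}$. The function names are illustrative only and are not taken from the notes.

```python
def a(n):
    """Original form of the n-th term."""
    return (n**2 + 2*n + 3) / (4*n**2 + 5*n + 6)

def a_divided(n):
    """Same term after dividing numerator and denominator by n^2."""
    return (1 + 2/n + 3/n**2) / (4 + 5/n + 6/n**2)

for n in (1, 10, 100, 10_000, 1_000_000):
    # Third column: distance of a_n from the claimed limit 1/4.
    print(n, a(n), abs(a(n) - 0.25))
```

The printed distances shrink roughly like $1/n$, consistent with the limit $\frac{1}{4}$.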

Squeeze rule: Let $(a_n)$, $(b_n)$, $(c_n)$ be sequences satisfying $a_n \le b_n \le c_n$ for all $n \in \mathbb{N}$. If $(a_n)$ and $(c_n)$ both converge to the same limit $L$, then $(b_n)$ also converges to $L$.

Proof. If $a_n \le b_n \le c_n$, then $a_n - L \le b_n - L \le c_n - L$. Hence $|b_n - L| \le \max\{|a_n - L|, |c_n - L|\}$. Given $\varepsilon > 0$, there exist natural numbers $N_1$ and $N_2$ such that $n > N_1 \Rightarrow |a_n - L| < \varepsilon$ and $n > N_2 \Rightarrow |c_n - L| < \varepsilon$. Let $N$ be the maximum of $N_1$ and $N_2$. Then for $n > N$ it follows that $|b_n - L| < \varepsilon$. In other words, $\lim_{n\to\infty} b_n = L$.

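Application 13.2. Show that $\lim_{n\to\infty} (-1)^n \cdot \frac{1}{n^2} = 0$.

Solution: Note that

$$\left| (-1)^n \cdot \frac{1}{n^2} \right| = \frac{1}{n^2}, \quad \text{so} \quad -\frac{1}{n^2} \le (-1)^n \cdot \frac{1}{n^2} \le \frac{1}{n^2}.$$

Now let $a_n = -\frac{1}{n^2}$, $b_n = (-1)^n \cdot \frac{1}{n^2}$, $c_n = \frac{1}{n^2}$.

Both $(a_n)$ and $(c_n)$ converge to 0. By the squeeze rule $(b_n)$ converges to 0.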

Principle of monotone sequences: A bounded monotone sequence is convergent.

Proof. The statement is proved for a bounded increasing sequence, the proof being similar for a decreasing sequence.
Let $(a_n)$ be such that $a_1 \le a_2 \le \ldots \le a_n \le \ldots$ and $a_n \le M$ for all $n \in \mathbb{N}$. Let $M_0 = \sup\{a_n \mid n \in \mathbb{N}\}$ be the least upper bound of the set of numbers appearing in the sequence. Given $\varepsilon > 0$, $M_0 - \varepsilon$ cannot be an upper bound for $\{a_n \mid n \in \mathbb{N}\}$. Hence, there exists a value $n = N$ such that $a_N > M_0 - \varepsilon$. Since the sequence is increasing, $a_n \ge a_N > M_0 - \varepsilon$ for $n > N$. Furthermore $a_n \le M_0$ by the definition of $M_0$ and hence, for $n > N$, $|a_n - M_0| < \varepsilon$. This proves that $\lim_{n\to\infty} a_n = M_0$.

Application 13.3. A sequence $(a_n)$ is defined by $a_1 = 1$ and $a_{n+1} = \sqrt{a_n + 1}$ for $n \ge 1$. Show that $\lim_{n\to\infty} a_n = \frac{1 + \sqrt{5}}{2}$.

Solution: First, it is shown by induction on $n$ that $(a_n)$ is an increasing sequence. Since $a_1 = 1$ and $a_2 = \sqrt{2}$, it follows that $a_1 \le a_2$. Now

$$a_{n+1} - a_n = \sqrt{a_n + 1} - \sqrt{a_{n-1} + 1} = \frac{a_n - a_{n-1}}{\sqrt{a_n + 1} + \sqrt{a_{n-1} + 1}}$$

and, since $\sqrt{a_n + 1} + \sqrt{a_{n-1} + 1}$ is positive, if $a_{n-1} \le a_n$ then $a_n \le a_{n+1}$. So, by induction, $(a_n)$ is an increasing sequence.

Now $a_n^2 - a_{n+1}^2 = a_n^2 - a_n - 1 = \left(a_n - \frac{1}{2}\right)^2 - \frac{5}{4}$ and, since $(a_n)$ is increasing, $\left(a_n - \frac{1}{2}\right)^2 - \frac{5}{4} \le 0$. This quickly leads to $(a_n)$ being bounded above by $\frac{1}{2}(1 + \sqrt{5})$.

By the principle of monotone sequences $(a_n)$ is convergent. Hence, suppose that $\lim_{n\to\infty} a_n = L$. Since $\lim_{n\to\infty} a_{n+1} = L$ we obtain $L = \sqrt{L + 1}$ and so $L^2 = L + 1$. The quadratic equation $L^2 = L + 1$ has two roots, namely $\frac{1}{2}(1 \pm \sqrt{5})$. Since $a_n \ge 1$ for all $n \in \mathbb{N}$, the positive root is required. Hence $L = \frac{1}{2}(1 + \sqrt{5})$.
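The recursion of Application 13.3 is easy to iterate numerically. The sketch below, with an illustrative helper `iterate` that is not part of the notes, generates the first terms and checks monotonicity and the distance to $\frac{1}{2}(1 + \sqrt{5})$. It is a floating-point experiment under these assumptions, not a substitute for the induction argument.

```python
import math

def iterate(n_terms):
    """Return [a_1, ..., a_{n_terms}] with a_1 = 1, a_{n+1} = sqrt(a_n + 1)."""
    terms = [1.0]
    while len(terms) < n_terms:
        terms.append(math.sqrt(terms[-1] + 1.0))
    return terms

terms = iterate(20)
limit = (1 + math.sqrt(5)) / 2

# Each term should be no smaller than its predecessor.
increasing = all(x <= y for x, y in zip(terms, terms[1:]))
print(increasing, terms[-1], abs(terms[-1] - limit))
```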

Theorem 13.1 (Bolzano-Weierstrass theorem). Any bounded sequence $(a_n)$ of real numbers contains a convergent subsequence.

Proof. Let $S_N = \{a_n \mid n > N\}$. If every $S_N$ has a maximum element, then define a subsequence of $(a_n)$ as follows: $b_1 = a_{n_1}$ is the maximum of $S_1$, $b_2 = a_{n_2}$ is the maximum of $S_{n_1}$, $b_3 = a_{n_3}$ is the maximum of $S_{n_2}$, and so on. Then $(b_n)$ is a monotone decreasing subsequence of $(a_n)$. Since $(a_n)$ is bounded, so is $(b_n)$. It follows that $(b_n)$ is a convergent subsequence of $(a_n)$.
On the other hand if, for some $M$, $S_M$ does not have a maximum element, then for any $a_m$ with $m > M$ there exists an $a_n$ following $a_m$ with $a_n > a_m$. Now let $c_1 = a_{M+1}$ and let $c_2$ be the first term of $(a_n)$ following $c_1$ for which $c_2 > c_1$. Let $c_3$ be the first term of $(a_n)$ following $c_2$ for which $c_3 > c_2$, and so on. Then $(c_n)$ is a monotone increasing subsequence of $(a_n)$. Since $(c_n)$ is bounded, it is convergent.

It is intuitively clear that if $a_n \xrightarrow[n\to\infty]{} L$, then all the terms of the sequence with large subscripts will differ very little from one another, all of them being approximately equal to $L$.
More precisely, we have:

Theorem 13.2 (Cauchy's criterion for the convergence of a sequence). A sequence $(a_n)$ has a limit if and only if for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that all the terms of the sequence with subscripts $p, q > N_\varepsilon$ satisfy $|a_p - a_q| < \varepsilon$.

Proof. Assume that the sequence $(a_n)$ has a limit $L$ and let $\varepsilon > 0$ be a number. Consider the number $\frac{\varepsilon}{2}$; by the definition of the limit there exists an integer $N$ such that $|a_n - L| < \frac{\varepsilon}{2}$ for all $n > N$. Hence $|a_p - L| < \frac{\varepsilon}{2}$ and $|a_q - L| < \frac{\varepsilon}{2}$ for $p, q > N$, and it follows that $|a_p - a_q| \le |a_p - L| + |a_q - L| < \varepsilon$ for $p, q > N$.

Assume now that for any $\varepsilon > 0$ there exists $N$ such that $|a_p - a_q| < \varepsilon$ for $p, q > N$.
Considering $\varepsilon = 1$ and $N_1$ such that $|a_p - a_q| < 1$ for $p, q > N_1$, we have:

$$|a_n| = |a_n - a_{N_1+1} + a_{N_1+1}| \le |a_n - a_{N_1+1}| + |a_{N_1+1}| \le 1 + |a_{N_1+1}|, \quad \text{for } n > N_1.$$

Therefore, for all $n$:

$$|a_n| \le \max\{|a_1|, |a_2|, \ldots, |a_{N_1}|, |a_{N_1+1}| + 1\} = M.$$

According to the Bolzano-Weierstrass theorem, the sequence $(a_n)$ contains a convergent subsequence $(a_{n_k})$. Let $L = \lim_{n_k\to\infty} a_{n_k}$ and let $\varepsilon > 0$ be a number. There exists $N_1$ such that for $n_k > N_1$ we have $|a_{n_k} - L| < \frac{\varepsilon}{2}$, and $N_2$ such that $|a_p - a_q| < \frac{\varepsilon}{2}$ for $p, q > N_2$.
Considering $N_3 = \max\{N_1, N_2\}$ and $n > N_3$ we have:

$$|a_n - L| \le |a_n - a_{n_k}| + |a_{n_k} - L| < \varepsilon,$$

where $n_k > N_3$ is fixed.

14 Limit points of a sequence

Definition 14.1. The set of limit points of a sequence $(a_n)$ is the collection of points $x \in \mathbb{R}^1$ for which there exists a subsequence $(a_{n_k})$ of the sequence $(a_n)$ such that $\lim_{n_k\to\infty} a_{n_k} = x$.

Usually the set of limit points of the sequence $(a_n)$ is denoted by $L(a_n)$.

A bounded sequence $(a_n)$ converges and $\lim_{n\to+\infty} a_n = L$ if and only if $L(a_n) = \{L\}$.

Definition 14.2. The limit superior of a sequence $(a_n)$ is $\sup L(a_n)$. The limit superior of a sequence $(a_n)$ is usually denoted by $\limsup_{n\to\infty} a_n$ or by $\overline{\lim}_{n\to\infty}\, a_n$.

Definition 14.3. The limit inferior of a sequence $(a_n)$ is $\inf L(a_n)$. The limit inferior of a sequence $(a_n)$ is usually denoted by $\liminf_{n\to\infty} a_n$ or by $\underline{\lim}_{n\to\infty}\, a_n$.

Example 14.1. If $a_n = (-1)^n$ then $L(a_n) = \{-1, 1\}$ and $\liminf_{n\to\infty} a_n = -1$, $\limsup_{n\to\infty} a_n = 1$.

A bounded sequence $(a_n)$ converges to $L$ if and only if

$$\liminf_{n\to\infty} a_n = \limsup_{n\to\infty} a_n = L.$$

15 Series of real numbers

If a sequence $(a_n)$ is given, the finite sum

$$s_n = a_1 + a_2 + \cdots + a_n$$

can be formed for each $n \in \mathbb{N}$.
If the sequence $(s_n)$ converges to some limit $s$, then $s$ can justifiably be called the sum of the infinite series

$$\sum_{n=1}^{\infty} a_n = a_1 + a_2 + \ldots .$$

More precisely:

Definition 15.1. It is said that the symbol $\sum_{n=1}^{\infty} a_n$ is a convergent series, with sum $s$, if the sequence $(s_n)$ of n-th partial sums converges to $s$.

If $(s_n)$ is a divergent sequence then, irrespective of its precise behavior, $\sum_{n=1}^{\infty} a_n$ is called a divergent series.

Rather regrettably, $\sum_{n=1}^{\infty} a_n$ is still used to denote a divergent series even though it does not possess a sum.

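Example 15.1. Show that $\frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \ldots$ has sum 1.

Solution: The n-th partial sum of $\sum_{n=1}^{\infty} \frac{1}{2^n}$ is $s_n = \frac{1}{2} + \frac{1}{2^2} + \frac{1}{2^3} + \cdots + \frac{1}{2^n} = 1 - \frac{1}{2^n}$. Since $\lim_{n\to\infty} s_n = 1$ it can be deduced that $\sum_{n=1}^{\infty} \frac{1}{2^n}$ converges and has sum 1.

Example 15.2. Show that $1 + 2 + 3 + \ldots$ is a divergent series.

Solution: The n-th partial sum is $s_n = 1 + 2 + \cdots + n = \frac{1}{2} n (n + 1)$. Since $(s_n)$ is a divergent sequence, $\sum_{n=1}^{\infty} n$ is divergent.

Example 15.3. Show that $\sum_{n=1}^{\infty} \frac{1}{n^2 + n} = 1$.

Solution: Since

$$\frac{1}{n^2 + n} = \frac{1}{n(n + 1)} = \frac{1}{n} - \frac{1}{n + 1},$$

the n-th partial sum of $\sum_{n=1}^{\infty} \frac{1}{n^2 + n}$ can be written as

$$s_n = \left(1 - \frac{1}{2}\right) + \left(\frac{1}{2} - \frac{1}{3}\right) + \cdots + \left(\frac{1}{n} - \frac{1}{n + 1}\right) = 1 - \frac{1}{n + 1}.$$

Now $\lim_{n\to\infty} s_n = 1$.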

Example 15.4. $\sum_{n=1}^{\infty} \frac{1}{2^n}$ is a special case of an important class of infinite series, namely the geometric series $\sum_{n=0}^{\infty} a \cdot x^n$, where $x$ is a real number.

Notice that the summation here begins at $n = 0$, and not at $n = 1$. For this series the sum of the first $n$ terms is

$$s_n = a + a x + a x^2 + \cdots + a x^{n-1},$$

so

$$x \cdot s_n = a x + a x^2 + \cdots + a x^n,$$

and by subtraction

$$s_n = \frac{a(1 - x^n)}{1 - x}$$

for $x \neq 1$. Therefore, $\lim_{n\to\infty} s_n = \frac{a}{1 - x}$ for $|x| < 1$.

Since $(s_n)$ diverges for $|x| \ge 1$, the following result is obtained:
Result: The geometric series

$$\sum_{n=0}^{\infty} a \cdot x^n = a + a x + a x^2 + \ldots, \quad a \neq 0,$$

converges if and only if $|x| < 1$. Moreover, its sum is then $\frac{a}{1 - x}$.
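The closed form for the partial sums of the geometric series can be compared with term-by-term summation. The sketch below uses illustrative function names and sample values $a = 3$, $x = 0.5$ chosen only for the demonstration.

```python
def partial_sum(a, x, n):
    """s_n = a + a*x + ... + a*x**(n-1), summed term by term."""
    return sum(a * x**k for k in range(n))

def closed_form(a, x, n):
    """s_n = a(1 - x^n)/(1 - x), valid for x != 1."""
    return a * (1 - x**n) / (1 - x)

a, x = 3.0, 0.5
for n in (1, 5, 20, 50):
    # Columns: n, direct sum, closed form, claimed limit a/(1 - x).
    print(n, partial_sum(a, x, n), closed_form(a, x, n), a / (1 - x))
```

For $|x| < 1$ the first two columns agree and approach the last one; for $|x| \ge 1$ the partial sums do not settle, in line with the result above.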

Since the sum of a convergent series is defined to be the limit of the sequence of n-th partial sums of the series in question, the rules concerning the convergence of sequences can be used to establish theorems concerning the convergence of series.

The first result provides a useful test for the divergence of series.

The vanishing condition

If $\sum_{n=1}^{\infty} a_n$ is convergent, then $\lim_{n\to\infty} a_n = 0$.

Proof. Suppose that $(s_n)$ converges to some limit $s$. Hence, $(s_{n-1})$ also converges to $s$. But $a_n = s_n - s_{n-1}$ and so $\lim_{n\to\infty} a_n = 0$.

Example 15.5. Consider $\sum_{n=1}^{\infty} \frac{n}{n + 1}$. Since $a_n = \frac{n}{n + 1}$ and $\lim_{n\to\infty} a_n = 1 \neq 0$, by the vanishing condition $\sum_{n=1}^{\infty} \frac{n}{n + 1}$ does not converge. In other words, $\sum_{n=1}^{\infty} \frac{n}{n + 1}$ is divergent.

It is important to note that the converse of the vanishing condition is false! In other words, there are divergent series whose terms nevertheless tend to zero.

Example 15.6. Consider $\sum_{n=1}^{\infty} (\sqrt{n} - \sqrt{n - 1})$. The n-th partial sum may be written as:

$$s_n = (\sqrt{1} - \sqrt{0}) + (\sqrt{2} - \sqrt{1}) + \cdots + (\sqrt{n} - \sqrt{n - 1}) = \sqrt{n}.$$

Clearly $(s_n)$ is a divergent sequence and so $\sum_{n=1}^{\infty} (\sqrt{n} - \sqrt{n - 1})$ is a divergent series. However

$$a_n = \sqrt{n} - \sqrt{n - 1} = \frac{(\sqrt{n} - \sqrt{n - 1})(\sqrt{n} + \sqrt{n - 1})}{\sqrt{n} + \sqrt{n - 1}} = \frac{1}{\sqrt{n} + \sqrt{n - 1}} \to 0.$$
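The behaviour in Example 15.6 can be observed numerically: the terms shrink while the partial sums keep growing like $\sqrt{n}$. This is a small sketch with illustrative helper names, using floating-point sums.

```python
import math

def term(n):
    """a_n = sqrt(n) - sqrt(n - 1)."""
    return math.sqrt(n) - math.sqrt(n - 1)

def partial_sum(n):
    """s_n = a_1 + ... + a_n, summed term by term (telescopes to sqrt(n))."""
    return sum(term(k) for k in range(1, n + 1))

for n in (10, 100, 10_000):
    # Columns: n, a_n (tends to 0), s_n, sqrt(n) (unbounded).
    print(n, term(n), partial_sum(n), math.sqrt(n))
```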

Cauchy's criterion for the convergence of a series

A series $\sum_{n=1}^{\infty} a_n$ converges if and only if for any $\varepsilon > 0$ there exists $N$ such that for $n \ge N$ and $p \ge 1$ the following inequality holds:

$$|a_{n+1} + a_{n+2} + \cdots + a_{n+p}| < \varepsilon.$$

Proof. Let $s_n$ be the n-th partial sum of the series:

$$s_n = a_1 + a_2 + \cdots + a_n.$$

The series $\sum_{n=1}^{\infty} a_n$ converges if and only if the sequence $(s_n)$ converges. The sequence $(s_n)$ converges if and only if for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $q, r > N_\varepsilon$ the following inequality holds:

$$|s_q - s_r| < \varepsilon.$$

This is equivalent to the condition: for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $n \ge N_\varepsilon$ and $p \ge 1$ the following inequality holds:

$$|a_{n+1} + a_{n+2} + \cdots + a_{n+p}| < \varepsilon.$$

16 Rules (for convergence of series)

By considering the n-th partial sums of the appropriate series and the sum and scalar product rules for sequences, the following elementary results can be easily proved.

Sum rule: If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are convergent series, then $\sum_{n=1}^{\infty} (a_n + b_n)$ is also convergent and

$$\sum_{n=1}^{\infty} (a_n + b_n) = \sum_{n=1}^{\infty} a_n + \sum_{n=1}^{\infty} b_n.$$

Scalar product rule: If $\sum_{n=1}^{\infty} a_n$ is convergent, then $\sum_{n=1}^{\infty} (k \cdot a_n)$ is convergent for any $k \in \mathbb{R}^1$ and

$$\sum_{n=1}^{\infty} (k \cdot a_n) = k \sum_{n=1}^{\infty} a_n.$$

Rules will be established below which can be used to test whether a given series converges or not.

Integral test: Let $f : \mathbb{R}^1_+ \to \mathbb{R}^1_+$ be a decreasing function and let $a_n = f(n)$ for each $n \in \mathbb{N}$. Let $j_n = \int_1^n f(x)\, dx$. The series $\sum_{n=1}^{\infty} a_n$ converges if and only if the sequence $(j_n)$ converges.

The proof of the statement is given in the section where the Riemann integral is rigorously defined.

Application 16.1. Establish that the p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$ converges if and only if $p > 1$.

Solution: Consider the function $f_p : \mathbb{R}^1_+ \to \mathbb{R}^1_+$ given by $f_p(x) = \frac{1}{x^p}$. When $p > 0$ this is a decreasing function of $x$, and $a_n = \frac{1}{n^p} = f_p(n)$ is the n-th term of the p-series $\sum_{n=1}^{\infty} \frac{1}{n^p}$.

For $p \neq 1$

$$j_n = \int_1^n \frac{1}{x^p}\, dx = \left. \frac{x^{1-p}}{1 - p} \right|_1^n = \frac{1}{1 - p}\left(n^{1-p} - 1\right).$$

So, for $p > 1$, $\lim_{n\to\infty} j_n = \frac{1}{p - 1}$, and for $0 < p < 1$ the sequence $(j_n)$ is divergent. For $p = 1$,

$$j_n = \int_1^n \frac{1}{x}\, dx = \ln x \Big|_1^n = \ln(n)$$

and so $(j_n)$ again diverges.

When $p \le 0$, $\sum_{n=1}^{\infty} \frac{1}{n^p}$ diverges by the vanishing condition.

Example 16.1. The harmonic series $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges and the series $\sum_{n=1}^{\infty} \frac{1}{n^2}$ converges.

The p-series, together with the geometric series, provide a stock of known convergent and divergent series.
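The link between the p-series and the integrals $j_n$ can be explored numerically. The sketch below is illustrative only: the helper names are hypothetical, and the two-sided bound it prints, $a_2 + \cdots + a_n \le j_n \le a_1 + \cdots + a_{n-1}$, is the standard estimate for a decreasing positive function, which is not proved in this section.

```python
import math

def partial_sum(p, n):
    """s_n = 1/1^p + 1/2^p + ... + 1/n^p."""
    return sum(1.0 / k**p for k in range(1, n + 1))

def j(p, n):
    """j_n = integral of x^(-p) from 1 to n, using the formulas of Application 16.1."""
    if p == 1:
        return math.log(n)
    return (n**(1 - p) - 1) / (1 - p)

n = 1000
for p in (0.5, 1, 2):
    lower = partial_sum(p, n) - 1.0      # a_2 + ... + a_n
    upper = partial_sum(p, n - 1)        # a_1 + ... + a_{n-1}
    print(p, lower, j(p, n), upper)
```

For $p = 2$ all three columns stay bounded as `n` grows, while for $p = 0.5$ and $p = 1$ they grow without bound together.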


First comparison test: If $0 \le a_n \le b_n$ for all $n \in \mathbb{N}$, then $\sum_{n=1}^{\infty} b_n$ convergent implies $\sum_{n=1}^{\infty} a_n$ convergent.

Proof. Let $s_n = \sum_{k=1}^{n} a_k$ and $t_n = \sum_{k=1}^{n} b_k$. From the given conditions, $0 \le s_n \le t_n$ for all $n \in \mathbb{N}$. If $\sum_{n=1}^{\infty} b_n$ converges, then $\lim_{n\to\infty} t_n = t$ and, since $(t_n)$ is an increasing sequence, $t_n \le t$ for all $n \in \mathbb{N}$.
Therefore, $s_n \le t$ for all $n \in \mathbb{N}$ and hence $(s_n)$ is a bounded and increasing sequence. It follows that $(s_n)$ converges and hence $\sum_{n=1}^{\infty} a_n$ converges.

Example 16.2. The series $\sum_{n=1}^{\infty} \frac{1 + \cos n}{3^n + 2 n^3}$ is convergent.

Solution: Let $a_n = \frac{1 + \cos n}{3^n + 2 n^3}$. Then $a_n \ge 0$ since $\cos n \ge -1$. Also $a_n \le \frac{2}{3^n + 2 n^3}$ since $\cos n \le 1$. Therefore, since $3^n > 0$, $a_n < \frac{2}{2 n^3} = \frac{1}{n^3}$. Let $b_n = \frac{1}{n^3}$. Then $\sum_{n=1}^{\infty} b_n$ converges, being the p-series with $p = 3$. Hence $\sum_{n=1}^{\infty} a_n$ also converges.

Second comparison test: Let $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ be positive term series such that

$$\lim_{n\to\infty} \frac{a_n}{b_n} = L \neq 0.$$

Then $\sum_{n=1}^{\infty} a_n$ converges if and only if $\sum_{n=1}^{\infty} b_n$ converges.

Proof. Suppose that $\sum_{n=1}^{\infty} b_n$ is convergent and let

$$s_n = a_1 + a_2 + \cdots + a_n; \quad t_n = b_1 + b_2 + \cdots + b_n.$$

Since $\lim_{n\to\infty} \frac{a_n}{b_n} = L$, for $\varepsilon = 1$ there is an $N_1$ such that

$$\left| \frac{a_n}{b_n} - L \right| < 1, \quad \text{for all } n > N_1.$$

Hence,

$$\frac{a_n}{b_n} = \left| \frac{a_n}{b_n} \right| = \left| \frac{a_n}{b_n} - L + L \right| \le \left| \frac{a_n}{b_n} - L \right| + |L| < 1 + |L| = k, \quad n > N_1.$$

Now consider the positive term series $\sum_{n=1}^{\infty} \alpha_n$ and $\sum_{n=1}^{\infty} \beta_n$, where $\alpha_n = a_{N_1+n}$ and $\beta_n = k \cdot b_{N_1+n}$.

Hence $0 \le \alpha_n \le \beta_n$ for all $n \in \mathbb{N}$. Since $\sum_{n=1}^{\infty} b_n$ converges, so does $\sum_{n=N_1+1}^{\infty} b_n$, and hence $\sum_{n=1}^{\infty} \beta_n$ converges by the scalar product rule for series. By the first comparison test, $\sum_{n=1}^{\infty} \alpha_n$ converges and, since the addition of a finite number of terms to a convergent series produces another convergent series, $\sum_{n=1}^{\infty} a_n$ also converges. This proves that $\sum_{n=1}^{\infty} b_n$ convergent implies $\sum_{n=1}^{\infty} a_n$ convergent. The converse of this statement can be proved by reversing the roles of $a_n$ and $b_n$ in the above argument and observing that $\frac{b_n}{a_n} \to \frac{1}{L}$.

Example 16.3. Show that the series $\sum_{n=1}^{\infty} \frac{2n}{n^2 - 5n + 8}$ is divergent.

Solution: Let $a_n = \frac{2n}{n^2 - 5n + 8}$ and $b_n = \frac{1}{n}$. Then $\frac{a_n}{b_n} = \frac{2n^2}{n^2 - 5n + 8} \to 2 \neq 0$. Hence $\sum_{n=1}^{\infty} a_n$ diverges by comparison with the divergent harmonic series.

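Ratio test: Let $\sum_{n=1}^{\infty} a_n$ be a series of positive terms and for each $n \in \mathbb{N}$ let $\alpha_n = \frac{a_{n+1}}{a_n}$. Suppose that $(\alpha_n)$ converges to some limit $L$. If $L > 1$ then $\sum_{n=1}^{\infty} a_n$ diverges; if $L < 1$ then $\sum_{n=1}^{\infty} a_n$ converges; and if $L = 1$ the test gives no information.

Proof. Suppose that $L < 1$ and let $\varepsilon = \frac{1}{2}(1 - L)$.

Now $\varepsilon > 0$ and $L + \varepsilon = k < 1$. Since $\lim_{n\to\infty} \alpha_n = L$ there is a value $N_\varepsilon$ such that $\alpha_n = |\alpha_n - L + L| \le \varepsilon + L = k < 1$ for all $n > N_\varepsilon$. Therefore $a_{n+1} \le k \cdot a_n$ for all $n > N_\varepsilon$.
Let $\beta_n = a_{N_\varepsilon + n}$. Then $\beta_{n+1} \le k \cdot \beta_n$ for all $n \in \mathbb{N}$ and so (by induction on $n$)

$$\beta_{n+1} \le k^n \cdot \beta_1, \quad \text{for all } n \in \mathbb{N}.$$

Now $\sum_{n=0}^{\infty} k^n \cdot \beta_1$ is a convergent geometric series since $k < 1$. Hence, by the first comparison test, $\sum_{n=1}^{\infty} \beta_n$ converges and $\sum_{n=1}^{\infty} a_n$ also converges.

Suppose now that $L > 1$ and let $\varepsilon = L - 1$. Now $\varepsilon > 0$ and, since $\lim_{n\to\infty} \alpha_n = L$, there is a value $N_\varepsilon$ such that $\alpha_n > L - \varepsilon = 1$ for all $n > N_\varepsilon$.
Hence $a_{n+1} > a_n$ for all $n > N_\varepsilon$ and so $a_n \ge a_{N_\varepsilon + 1} > 0$ for all $n > N_\varepsilon$. Therefore $(a_n)$ does not tend to zero. By the vanishing condition $\sum_{n=1}^{\infty} a_n$ diverges.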

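Example 16.4. Determine those values of $x$ for which $\sum_{n=1}^{\infty} n \cdot (4x^2)^n$ is convergent.

Solution: Here $a_n = n \cdot (4x^2)^n$, and so

$$\alpha_n = \frac{(n + 1) \cdot (4x^2)^{n+1}}{n \cdot (4x^2)^n} = 4x^2 \left(1 + \frac{1}{n}\right).$$

Now $\alpha_n \to 4x^2$. Hence $\sum_{n=1}^{\infty} a_n$ diverges if $4x^2 > 1$ (in other words $|x| > \frac{1}{2}$), $\sum_{n=1}^{\infty} a_n$ converges if $|x| < \frac{1}{2}$, and no information is gained if $|x| = \frac{1}{2}$.

If $|x| = \frac{1}{2}$ then $\sum_{n=1}^{\infty} a_n = \sum_{n=1}^{\infty} n$ diverges.

Therefore $\sum_{n=1}^{\infty} n \cdot (4x^2)^n$ converges if and only if $|x| < \frac{1}{2}$.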
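The ratio $\alpha_n$ of Example 16.4 and the resulting behaviour of the partial sums can be inspected numerically. In the sketch below the function names are illustrative; the closed form $\frac{q}{(1-q)^2}$ for $\sum n q^n$ with $|q| < 1$ is a standard identity that is not derived in these notes and is used here only as a reference value.

```python
def ratio(x, n):
    """alpha_n = a_{n+1}/a_n = 4 x^2 (1 + 1/n) for a_n = n (4 x^2)^n."""
    return 4 * x**2 * (1 + 1 / n)

def partial_sum(x, n_terms):
    """Sum of n (4 x^2)^n for n = 1, ..., n_terms."""
    q = 4 * x**2
    return sum(n * q**n for n in range(1, n_terms + 1))

for x in (0.25, 0.6):
    print(x, ratio(x, 10**6), partial_sum(x, 50))

# Reference value for x = 0.25, i.e. q = 0.25 (assumed standard identity).
print(0.25 / (1 - 0.25) ** 2)
```

With $x = 0.25$ the ratio tends to $0.25 < 1$ and the sums stabilise; with $x = 0.6$ the ratio tends to $1.44 > 1$ and the sums blow up.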

The root test: Let $\sum_{n=1}^{\infty} a_n$ be a series of positive terms. If there exist $N$ and $k \in (0, 1)$ such that $\sqrt[n]{a_n} \le k$ for $n > N$, then the series converges.
If $\sqrt[n]{a_n} \ge 1$ for infinitely many terms of the series, then the series diverges.

Proof. If there exist $N$ and $k \in (0, 1)$ such that $\sqrt[n]{a_n} \le k$ for $n > N$, then $a_n \le k^n$ for $n > N$. Consequently, the series $\sum_{n=1}^{\infty} a_n$ can be compared with the geometric series $\sum_{n=1}^{\infty} k^n$, which converges since $k < 1$. This proves the first case.
If $\sqrt[n]{a_n} \ge 1$ for infinitely many terms of the series, then $a_n \ge 1$ for infinitely many terms of the series. Hence the vanishing condition is not satisfied and the series diverges.

Application 16.2. The series $\sum_{n=1}^{\infty} \frac{1}{n^n}$ converges. Using the root test we have $\sqrt[n]{\frac{1}{n^n}} = \frac{1}{n} \le \frac{1}{2}$ for $n \ge 2$.

Alternating series test (Leibnitz): If the sequence $(b_n)$ is a monotonic decreasing sequence and $b_n \xrightarrow[n\to\infty]{} 0$, then the alternating series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot b_n$ converges.

Proof. Let $s_m = \sum_{n=1}^{m} (-1)^{n-1} \cdot b_n$.

Then $s_{2m} = b_1 - (b_2 - b_3) - \cdots - (b_{2m-2} - b_{2m-1}) - b_{2m} \le b_1$ and hence the sequence $(s_{2m})$ is bounded above.
Since $s_{2m} = (b_1 - b_2) + (b_3 - b_4) + \cdots + (b_{2m-1} - b_{2m})$ and $(b_n)$ is decreasing, $(s_{2m})$ is increasing. Consequently, $(s_{2m})$ converges; let $s = \lim_{m\to\infty} s_{2m}$.

Similarly, the sequence $(s_{2m+1})$ is a decreasing sequence which is bounded below by $b_1 - b_2$, and so $(s_{2m+1})$ converges to a limit $t = \lim_{m\to\infty} s_{2m+1}$.

Now $t - s = \lim_{m\to\infty} (s_{2m+1} - s_{2m}) = \lim_{m\to\infty} b_{2m+1} = 0$.

Finally, it is shown that $\lim_{n\to\infty} s_n = s$.

If $\varepsilon > 0$, there are integers $N_1$ and $N_2$ such that $|s_{2m} - s| < \varepsilon$ for all $m > N_1$ and $|s_{2m+1} - s| < \varepsilon$ for all $m > N_2$. Let $N$ be the maximum of $2N_1$ and $2N_2 + 1$. If $n > N$, then either $n = 2m$ (and $m > N_1$) or $n = 2m + 1$ (and $m > N_2$). In either case $|s_n - s| < \varepsilon$ for all $n > N$. In other words, the sequence $(s_n)$ converges and hence the series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot b_n$ is convergent.

The series $\sum_{n=1}^{\infty} (-1)^{n-1} \cdot \frac{1}{n}$ is convergent, since $\left(\frac{1}{n}\right)$ is a decreasing sequence with limit zero.
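The structure of the proof above can be seen in the partial sums of $\sum (-1)^{n-1} \frac{1}{n}$: the even-indexed sums increase, the odd-indexed sums decrease, and both close in on the same value. The sketch below uses illustrative names; the comparison value $\ln 2$ is the well-known sum of this series, which is not established in this section.

```python
import math

def partial_sums(m):
    """Return [s_1, ..., s_m] for the series sum (-1)^(n-1) / n."""
    sums, s = [], 0.0
    for n in range(1, m + 1):
        s += (-1) ** (n - 1) / n
        sums.append(s)
    return sums

s = partial_sums(2000)
even = s[1::2]   # s_2, s_4, ...  (increasing)
odd = s[0::2]    # s_1, s_3, ...  (decreasing)

print(even[-1], odd[-1], math.log(2))
```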

17 Absolute convergent series

Definition 17.1. The series $\sum_{n=1}^{\infty} a_n$ is said to be absolutely convergent if the series of absolute values $\sum_{n=1}^{\infty} |a_n|$ converges.

A convergent series which is not absolutely convergent is called conditionally convergent.

Absolute convergence implies convergence.

If the series $\sum_{n=1}^{\infty} |a_n|$ converges, then the series $\sum_{n=1}^{\infty} a_n$ converges too.

Proof. By the triangle inequality we have

$$|s_n - s_m| = |a_{m+1} + \cdots + a_n| \le |a_{m+1}| + \cdots + |a_n|, \quad \text{where } n > m.$$

Since $\sum_{n=1}^{\infty} |a_n|$ converges, for any $\varepsilon > 0$ there exists an integer $N$ such that for $n > m > N$ we have $|a_{m+1}| + \cdots + |a_n| < \varepsilon$ and hence $|s_n - s_m| < \varepsilon$. By the Cauchy criterion we obtain that the series $\sum_{n=1}^{\infty} a_n$ converges.


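The series $\sum_{n=0}^{\infty} \frac{(-1)^n}{2^n}$ converges, since it is absolutely convergent: $\sum_{n=0}^{\infty} \frac{1}{2^n} = 2$.

The alternating harmonic series $\sum_{n=1}^{\infty} \frac{(-1)^n}{n}$ is conditionally convergent. It converges (by the Leibnitz criterion) but it is not absolutely convergent, since $\sum_{n=1}^{\infty} \frac{1}{n}$ diverges.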

Comment 17.1. Absolute convergence can be tested by the convergence tests given above for series with positive terms.
Absolute convergence is important for the following reason: the sum of an absolutely convergent series $\sum_{n=1}^{\infty} a_n$ does not depend on the order in which the terms $a_n$ are taken.

It can be shown that for any conditionally convergent series $\sum_{n=1}^{\infty} a_n$ and any real number $S$, we can have $\sum_{n=1}^{\infty} a_n = S$ by rearranging the terms of $\sum_{n=1}^{\infty} a_n$. For example, rearranging the terms of the alternating harmonic series we can have it sum to $10^6$ or to $-10^6$, or to within $\varepsilon$ of the number of atoms in the universe.
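One common way to build such a rearrangement is a greedy procedure: add unused positive terms while the running sum is at or below the target, and unused negative terms otherwise. The sketch below applies this to the series $1 - \frac{1}{2} + \frac{1}{3} - \ldots$; the function name and the small targets are illustrative assumptions. Targets such as $10^6$ are reachable in principle, but far too slowly to simulate, because sums of the form $1 + \frac{1}{3} + \frac{1}{5} + \ldots$ grow only logarithmically.

```python
def rearranged_partial_sum(target, n_terms):
    """Greedy rearrangement of 1 - 1/2 + 1/3 - 1/4 + ... aimed at `target`.

    Positive terms 1, 1/3, 1/5, ... and negative terms -1/2, -1/4, ...
    are each used in their original order, exactly once.
    """
    pos, neg = 1, 2          # next unused odd / even denominator
    s = 0.0
    for _ in range(n_terms):
        if s <= target:
            s += 1.0 / pos   # take the next positive term
            pos += 2
        else:
            s -= 1.0 / neg   # take the next negative term
            neg += 2
    return s

for target in (1.5, -1.0):
    print(target, rearranged_partial_sum(target, 100_000))
```

Every term of the original series is eventually used, so the result is a genuine rearrangement, and the running sum stays within the size of the most recently used term of the target.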

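Cauchy product of series: If $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are absolutely convergent series and

$$c_n = a_1 b_n + a_2 b_{n-1} + \cdots + a_n b_1,$$

then $\sum_{n=1}^{\infty} c_n$ is absolutely convergent and

$$\sum_{n=1}^{\infty} c_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right).$$

Proof. Suppose first that $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ are positive term series and consider the array:

$$\begin{array}{cccc}
a_1 b_1 & a_1 b_2 & a_1 b_3 & \ldots \\
a_2 b_1 & a_2 b_2 & a_2 b_3 & \ldots \\
a_3 b_1 & a_3 b_2 & a_3 b_3 & \ldots \\
\ldots & \ldots & \ldots & \ldots
\end{array}$$

If $w_n$ is the sum of the terms in the array that lie in the $n \times n$ square with $a_1 b_1$ at one corner, then $w_n = s_n \cdot t_n$, where $s_n$ and $t_n$ are the n-th partial sums of $\sum_{n=1}^{\infty} a_n$ and $\sum_{n=1}^{\infty} b_n$ respectively.

Hence $\lim_{n\to\infty} w_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right)$.

Now $\sum_{n=1}^{\infty} c_n$ is the sum of the terms in the array summed "by diagonals", and so if $u_n$ is the n-th partial sum of $\sum_{n=1}^{\infty} c_n$ then:

$$w_{[n/2]} \le u_n \le w_n.$$

By the squeeze rule we have

$$\lim_{n\to\infty} u_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right)$$

as required.

For the general case the above argument can be applied to the series $\sum_{n=1}^{\infty} |a_n|$, $\sum_{n=1}^{\infty} |b_n|$ and $\sum_{n=1}^{\infty} |c_n|$ to deduce that the series $\sum_{n=1}^{\infty} c_n$ is absolutely convergent. As $\sum_{n=1}^{\infty} c_n$ is a linear combination of the Cauchy products of the positive term series $\sum_{n=1}^{\infty} a_n^+$, $\sum_{n=1}^{\infty} a_n^-$, $\sum_{n=1}^{\infty} b_n^+$ and $\sum_{n=1}^{\infty} b_n^-$ (formed from the positive and negative parts of the terms, $a_n = a_n^+ - a_n^-$, $b_n = b_n^+ - b_n^-$), we have:

$$\sum_{n=1}^{\infty} c_n = \left( \sum_{n=1}^{\infty} a_n \right) \cdot \left( \sum_{n=1}^{\infty} b_n \right).$$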
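The product formula can be checked on a pair of absolutely convergent geometric series. In the sketch below the sample terms $a_n = (-\frac{1}{2})^n$ and $b_n = (\frac{1}{3})^n$ (with $n \ge 1$) and the helper names are chosen only for illustration; their sums are $-\frac{1}{3}$ and $\frac{1}{2}$ by the geometric series formula, so the Cauchy product should approach $-\frac{1}{6}$.

```python
def a(n):
    return (-0.5) ** n        # sum over n >= 1 is (-1/2)/(1 + 1/2) = -1/3

def b(n):
    return (1.0 / 3.0) ** n   # sum over n >= 1 is (1/3)/(1 - 1/3) = 1/2

def cauchy_partial_sum(a, b, n):
    """u_n = c_1 + ... + c_n with c_k = a_1 b_k + a_2 b_{k-1} + ... + a_k b_1."""
    total = 0.0
    for k in range(1, n + 1):
        total += sum(a(i) * b(k + 1 - i) for i in range(1, k + 1))
    return total

for n in (1, 5, 20, 60):
    print(n, cauchy_partial_sum(a, b, n))
```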

18 Limit of a function at a point

An important concept in calculus is the limit of a function at a point. It is used in the study of continuity, derivatives, integrals, and other important topics in calculus.

Consider a function $f : A \to \mathbb{R}^1$, whose domain $A$ is a subset of $\mathbb{R}^1$. The behaviour of the function $f$ as $x$ approaches a fixed real value "a" shall now be investigated. It shall be assumed that $f(x)$ is defined for all $x$ close to "a" but not necessarily at "a". In other words, it is assumed that the domain of $f$ contains a set of the form $(a - r, a) \cup (a, a + r)$ for some $r > 0$.

Definition 18.1. $L$ is called the limit of $f(x)$ as $x$ approaches "a", if for any $\varepsilon > 0$ there exists a number $\delta > 0$ (depending on $\varepsilon$) such that $|f(x) - L| < \varepsilon$ for all $x \in A$ with $x \neq a$ and $|x - a| < \delta$.

We denote this by $\lim_{x\to a} f(x) = L$ or $f(x) \to L$ for $x \to a$.

Comment 18.1. Definition 18.1 does not depend on the value of $f$ at "a" (if this exists), as the point "a" is excluded from consideration. If the value $f(a)$ exists, it may violate the inequality.
Given the function $f$ and the value $L$, the inequality $|f(x) - L| < \varepsilon$ means

$$L - \varepsilon < f(x) < L + \varepsilon$$

and therefore $\varepsilon$ can be regarded as the prescribed accuracy of approximating $L$, i.e. how close to $L$ one wants to get.
The number $\delta$ is not uniquely determined by $\varepsilon$. You can always "take a smaller δ" in the sense that if, for a given $\varepsilon$, we have

$$0 < |x - a| < \delta_1 \Rightarrow |f(x) - L| < \varepsilon,$$

then, for any $0 < \delta < \delta_1$, we have

$$0 < |x - a| < \delta \Rightarrow |f(x) - L| < \varepsilon.$$

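Example 18.1. Let us show that

$$\lim_{x\to 2} \frac{x^2 - 4}{x - 2} = 4.$$

Let $\varepsilon > 0$. The inequality

$$\left| \frac{x^2 - 4}{x - 2} - 4 \right| < \varepsilon \quad \text{or} \quad 4 - \varepsilon < \frac{x^2 - 4}{x - 2} < 4 + \varepsilon$$

is equivalent, for $x \neq 2$, to $4 - \varepsilon < x + 2 < 4 + \varepsilon$ or $2 - \varepsilon < x < 2 + \varepsilon$, showing that we can take $\delta = \varepsilon$.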

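Example 18.2. Let us show that at any point $a > 0$, the function $f(x) = \sqrt{x}$ has the limit

$$\lim_{x\to a} \sqrt{x} = \sqrt{a}.$$

Indeed, let $\varepsilon > 0$; since a smaller $\varepsilon$ only makes the requirement stronger, we may assume $\varepsilon < \sqrt{a}$. The inequality

$$|\sqrt{x} - \sqrt{a}| < \varepsilon \quad \text{or} \quad \sqrt{a} - \varepsilon < \sqrt{x} < \sqrt{a} + \varepsilon$$

becomes, by squaring (all terms being positive),

$$a - 2\sqrt{a} \cdot \varepsilon + \varepsilon^2 < x < a + 2\sqrt{a} \cdot \varepsilon + \varepsilon^2.$$

For the given "a" and $\varepsilon$ we can take $\delta = 2\sqrt{a} \cdot \varepsilon - \varepsilon^2 > 0$, the distance from $a$ to the nearer endpoint of this interval: if $|x - a| < \delta$, then $a - 2\sqrt{a} \cdot \varepsilon + \varepsilon^2 < x < a + 2\sqrt{a} \cdot \varepsilon - \varepsilon^2 < a + 2\sqrt{a} \cdot \varepsilon + \varepsilon^2$.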

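Example 18.3. The limit $\lim_{x\to 0} \sin\frac{1}{x}$ does not exist.

Recall that arbitrarily close to $a = 0$ there exist points $x$ at which $f(x) = \sin\frac{1}{x}$ equals $1$, as well as points at which it equals $-1$. Therefore for any $L$ there is $\varepsilon > 0$ (for instance $\varepsilon = \frac{1}{2}$) such that for any $\delta > 0$ there exists $x$ such that:

$$x \neq 0, \quad |x - 0| < \delta \quad \text{and} \quad \left| \sin\frac{1}{x} - L \right| > \varepsilon.$$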
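The points used in Example 18.3 can be written down explicitly. The sketch below, with illustrative helper names, uses $x_k = \frac{1}{\pi/2 + 2\pi k}$, where $\sin\frac{1}{x}$ equals $1$, and $y_k = \frac{1}{3\pi/2 + 2\pi k}$, where it equals $-1$; both families tend to $0$ as $k$ grows. Values are computed in floating point, so they agree with $\pm 1$ only up to rounding.

```python
import math

def x_plus(k):
    """Point where sin(1/x) = 1."""
    return 1.0 / (math.pi / 2 + 2 * math.pi * k)

def x_minus(k):
    """Point where sin(1/x) = -1."""
    return 1.0 / (3 * math.pi / 2 + 2 * math.pi * k)

for k in (1, 10, 100, 1000):
    xp, xm = x_plus(k), x_minus(k)
    # Both points shrink towards 0 while the function values stay near +1 and -1.
    print(k, xp, math.sin(1 / xp), xm, math.sin(1 / xm))
```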

The limit of $f$ as $x$ approaches "a" is unique.

Indeed: assume that $\lim_{x\to a} f(x) = L_1$ and $\lim_{x\to a} f(x) = L_2$ with $L_1 \neq L_2$. For $\varepsilon = \frac{|L_1 - L_2|}{2}$ there exists $\delta_1$ such that $|f(x) - L_1| < \varepsilon$ for $0 < |x - a| < \delta_1$, and there exists $\delta_2$ such that $|f(x) - L_2| < \varepsilon$ for $0 < |x - a| < \delta_2$. Hence, for $0 < |x - a| < \min\{\delta_1, \delta_2\}$ we have $|L_1 - L_2| \le |L_1 - f(x)| + |f(x) - L_2| < |L_1 - L_2|$, which is absurd.

Heine's criterion for the limit: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ has a limit as $x$ approaches "a" if and only if for any sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as $n \to \infty$, the sequence $(f(x_n))$ converges.

Proof. Assume that $L$ is the limit of $f(x)$ as $x$ approaches "a" and consider a sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as $n \to \infty$. For $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x - a| < \delta \Rightarrow |f(x) - L| < \varepsilon$. For this $\delta > 0$ there exists $N$ such that $|x_n - a| < \delta$ for $n > N$. Hence $|f(x_n) - L| < \varepsilon$ for $n > N$. Therefore, the sequence $(f(x_n))$ converges.

Assume now that for any sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as $n \to \infty$, the sequence $(f(x_n))$ converges. Firstly, we show that $\lim_{n\to\infty} f(x_n)$ is independent of the choice of $(x_n)$. For that, assume the contrary, i.e. there exist $(x'_n)$, $(x''_n)$ with $x'_n, x''_n \in A$, $x'_n \neq a$, $x''_n \neq a$ and $\lim_{n\to\infty} x'_n = \lim_{n\to\infty} x''_n = a$, for which $\lim_{n\to\infty} f(x'_n) = L' \neq L'' = \lim_{n\to\infty} f(x''_n)$. Consider the sequence $(x_n)$ defined as

$$x_n = \begin{cases} x'_k & \text{for } n = 2k \\ x''_{k+1} & \text{for } n = 2k + 1 \end{cases}$$

and remark that $x_n \in A$, $x_n \neq a$ and $\lim_{n\to\infty} x_n = a$. Hence the sequence $(f(x_n))$ converges.

Let $L = \lim_{n\to\infty} f(x_n)$. Since $(f(x'_n))$ and $(f(x''_n))$ are subsequences of $(f(x_n))$, their limits $L'$ and $L''$ have to be the same $L$, i.e. $L' = L$ and $L'' = L$. It follows that $L' = L''$, which is absurd.

Denote now by $L$ the common value of $\lim_{n\to\infty} f(x_n)$ and show that $\lim_{x\to a} f(x) = L$. For that, assume the contrary. It follows that there exists $\varepsilon_0 > 0$ such that for any $n \in \mathbb{N}$ there exists $x_n \in A$, $x_n \neq a$, such that $|x_n - a| < \frac{1}{n}$ and $|f(x_n) - L| \ge \varepsilon_0$. Hence the sequence $(f(x_n))$ does not converge to $L$, even though $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as $n \to \infty$. That is absurd.

Cauchy-Bolzano's criterion for the limit: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ has a limit as x approaches "a" if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x' - a| < \delta$ and $0 < |x'' - a| < \delta \Rightarrow |f(x') - f(x'')| < \varepsilon$.

Proof. Assume that $L = \lim_{x \to a} f(x)$ and consider $\varepsilon > 0$. There is $\delta > 0$ such that

\[ 0 < |x - a| < \delta \Rightarrow |f(x) - L| < \frac{\varepsilon}{2}. \]

Hence, $0 < |x' - a| < \delta$ and $0 < |x'' - a| < \delta \Rightarrow |f(x') - f(x'')| \leq |f(x') - L| + |f(x'') - L| < \varepsilon$.
Assume now that for any $\varepsilon > 0$ there exists $\delta > 0$ such that $0 < |x' - a| < \delta$ and $0 < |x'' - a| < \delta \Rightarrow |f(x') - f(x'')| < \varepsilon$, and consider a sequence $(x_n)$ with $x_n \in A$, $x_n \neq a$ and $x_n \to a$ as $n \to \infty$. For this $\delta > 0$ there is N such that $0 < |x_n - a| < \delta$ for $n > N$.

Hence, for $n, m > N$ we have $|f(x_n) - f(x_m)| < \varepsilon$. This means that $(f(x_n))$ is a Cauchy sequence and therefore converges.
Applying Heine's criterion, the function f has a limit as x approaches "a".

19 Rules for the limit of a function

a) If k is a constant, then $\lim_{x \to a} k = k$.

b) If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} (f(x) \pm g(x)) = L \pm M$.

c) If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} g(x) = M$, then $\lim_{x \to a} f(x) \cdot g(x) = L \cdot M$.

d) If $\lim_{x \to a} f(x) = L$, $g(x) \neq 0$ and $\lim_{x \to a} g(x) = M \neq 0$, then $\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{L}{M}$.

Proof. We prove part b) (the other parts are proved similarly; the proof is rather technical and can be skipped at first reading). For any $\varepsilon > 0$, there are positive $\delta_1$ and $\delta_2$ such that

\[ 0 < |x - a| < \delta_1 \Rightarrow |f(x) - L| < \frac{\varepsilon}{2}, \]

\[ 0 < |x - a| < \delta_2 \Rightarrow |g(x) - M| < \frac{\varepsilon}{2}. \]

For $0 < |x - a| < \min\{\delta_1, \delta_2\}$ we have

\[ |f(x) \pm g(x) - (L \pm M)| \leq |f(x) - L| + |g(x) - M| < \varepsilon. \]

Pinching rule: Suppose that the inequality $f(x) \leq g(x) \leq h(x)$ holds for all x in some interval around "a", except perhaps at $x = a$. If $\lim_{x \to a} f(x) = L$ and $\lim_{x \to a} h(x) = L$, then also

\[ \lim_{x \to a} g(x) = L. \]

Proof. Since $f(x) \leq g(x) \leq h(x)$, we have $f(x) - L \leq g(x) - L \leq h(x) - L$. Hence $|g(x) - L| \leq \max\{|f(x) - L|, |h(x) - L|\}$. For $\varepsilon > 0$ there exist $\delta_1 > 0$ and $\delta_2 > 0$ such that

\[ 0 < |x - a| < \delta_1 \Rightarrow |f(x) - L| < \varepsilon, \]

\[ 0 < |x - a| < \delta_2 \Rightarrow |h(x) - L| < \varepsilon. \]

Hence, for $0 < |x - a| < \min\{\delta_1, \delta_2\}$ we have

\[ |g(x) - L| \leq \max\{|f(x) - L|, |h(x) - L|\} < \varepsilon. \]

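Example 19.1. Show that $\lim_{\theta \to 0} \sin\theta = 0$ and $\lim_{\theta \to 0} \cos\theta = 1$.

Let $\theta$ be measured in radians and consider the angle $\theta$ as a central angle AOB in a circle with centre O and radius 1.

The area of the circular sector OAB is $\frac{\theta}{2}$.

The area of the triangle OAB is $\frac{\sin\theta}{2}$.

Since the triangle is contained in the sector, $0 \leq \frac{|\sin\theta|}{2} \leq \frac{|\theta|}{2}$.

This "pinches" $|\sin\theta|$ between 0 and $|\theta|$, both of which approach 0 as $\theta \to 0$, proving the first limit.
From the Pythagorean relation $\sin^2\theta + \cos^2\theta = 1$ we get moreover

\[ \lim_{\theta \to 0} \cos^2\theta = \lim_{\theta \to 0} (1 - \sin^2\theta) = 1. \]

Since $0 < \cos\theta \leq 1$ near $\theta = 0$, we have $\cos^2\theta \leq \cos\theta \leq 1$ there, and the pinching rule gives $\lim_{\theta \to 0} \cos\theta = 1$. The value $-1$, which would also be compatible with $\lim_{\theta \to 0} \cos^2\theta = 1$, is eliminated because $\cos\theta$ is positive near $\theta = 0$.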

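Example 19.2. We use the pinching rule to prove

\[ \lim_{x \to 0} x \cdot \sin\frac{1}{x} = 0. \]

The function $x \cdot \sin\frac{1}{x}$ is bounded below by $-|x|$ and above by $|x|$, so that

\[ -|x| \leq x \cdot \sin\frac{1}{x} \leq |x|. \]

As $x \to 0$, we also have $|x| \to 0$ and $-|x| \to 0$, and therefore the pinching rule gives

\[ \lim_{x \to 0} x \cdot \sin\frac{1}{x} = 0, \]

proving the claimed limit.

We can similarly show that $\lim_{x \to 0} x^2 \cdot \sin\frac{1}{x} = 0$.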

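Example 19.3. We can show that $\lim_{x \to 0} e^x = 1$ by pinching the exponential function $e^x$ near 0 between the functions $1 + x$ and $1 + x + x^2$, i.e.

\[ 1 + x < e^x < 1 + x + x^2 \quad \text{for } x \in (-\infty, 1),\ x \neq 0. \]

Subtracting 1 and dividing by x (which reverses the inequalities for $x < 0$) pinches $\frac{e^x - 1}{x}$ between 1 and $1 + x$, so the above inequalities similarly give

\[ \lim_{x \to 0} \frac{e^x - 1}{x} = 1. \]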
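The pinching inequalities of Example 19.3 can be checked at sample points. The sketch below is a numerical illustration only; `pinch_holds` and `difference_quotient` are hypothetical helper names, and very small $|x|$ is avoided in the inequality check because the three quantities then differ by less than floating-point resolution.

```python
import math

def pinch_holds(x):
    """True if 1 + x <= e^x <= 1 + x + x^2 holds at the point x."""
    return 1 + x <= math.exp(x) <= 1 + x + x * x

def difference_quotient(x):
    """(e^x - 1) / x, computed with expm1 for accuracy near 0."""
    return math.expm1(x) / x

for x in (-5.0, -1.0, -0.1, 0.1, 0.5, 0.9):
    print(x, pinch_holds(x))

for x in (0.1, 0.01, 0.001, -0.001):
    # The quotient lies between 1 and 1 + x, so it approaches 1.
    print(x, difference_quotient(x))
```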

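The substitution rule: Assume that $\lim_{x \to a} f(x) = L$ and $\lim_{y \to L} g(y) = M$, and that $f(x) \neq L$ for all $x \neq a$ sufficiently close to "a". Then

\[ \lim_{x \to a} g(f(x)) = M. \]

Proof. Let $\varepsilon > 0$ be given. Since $g(y) \to M$ as $y \to L$, there exists $\delta_1 > 0$ such that $0 < |y - L| < \delta_1 \Rightarrow |g(y) - M| < \varepsilon$. Also, since $\lim_{x \to a} f(x) = L$ and $f(x) \neq L$ near "a", there exists $\delta_2 > 0$ such that $0 < |x - a| < \delta_2 \Rightarrow 0 < |f(x) - L| < \delta_1$.
Therefore, with $y = f(x)$, $0 < |x - a| < \delta_2 \Rightarrow 0 < |y - L| < \delta_1 \Rightarrow |g(y) - M| = |g(f(x)) - M| < \varepsilon$, proving that $\lim_{x \to a} g(f(x)) = M$.
The hypothesis $f(x) \neq L$ can be dropped when g is defined at L and $g(L) = M$, since then $|g(y) - M| < \varepsilon$ also holds for $y = L$.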


20 One sided limits

The limit $\lim_{x \to a} f(x) = L$ in Definition 18.1 is a two-sided limit, since the variable x approaches the point "a" from both sides. We now analyze one-sided limits, where the variable x approaches the point "a" from one side. This is necessary if the function is defined only on one side of the point in question, or if approaching the point from different sides gives different limits.

We use the following terminology:
"x approaches "a" from the right", also "x approaches "a" from above", denoted by $x \to a^+$ or $x \searrow a$, means $a < x < a + \delta$ for $\delta > 0$ sufficiently small.
"x approaches "a" from the left", also "x approaches "a" from below", denoted by $x \to a^-$ or $x \nearrow a$, means $a - \delta < x < a$ for $\delta > 0$ sufficiently small.

Definition 20.1. (one-sided limits)

a) L is called the right limit of f at "a", or the limit of f(x) as x approaches "a" from the right (or from above), denoted by

\[ \lim_{x \to a^+} f(x) = L \quad \text{or} \quad \lim_{x \searrow a} f(x) = L, \]

if for any $\varepsilon > 0$ there exists a number $\delta > 0$ such that

\[ a < x < a + \delta \Rightarrow |f(x) - L| < \varepsilon; \]

b) L is called the left limit of f at "a", or the limit of f(x) as x approaches "a" from the left (or from below), denoted by

\[ \lim_{x \to a^-} f(x) = L \quad \text{or} \quad \lim_{x \nearrow a} f(x) = L, \]

if for any $\varepsilon > 0$ there exists a number $\delta > 0$ such that

\[ a - \delta < x < a \Rightarrow |f(x) - L| < \varepsilon. \]

Remark 20.1. If the left limit and the right limit exist and are equal,

\[ \lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x) = L, \]

then the limit of f at "a" exists and equals the same value L:

\[ \lim_{x \to a} f(x) = L. \]

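Example 20.1. The function $f(x) = \sqrt{x}$ is defined only for $x \geq 0$. As x approaches 0 from the right, the value of $\sqrt{x}$ tends to 0: $\lim_{x \searrow 0} \sqrt{x} = 0$.

Example 20.2. The function

\[ \operatorname{sign}(x) = \frac{x}{|x|} = \begin{cases} 1 & \text{if } x > 0 \\ -1 & \text{if } x < 0 \end{cases} \]

does not have a limit at $a = 0$; however, the two one-sided limits exist:

\[ \lim_{x \to 0^+} \operatorname{sign}(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} \operatorname{sign}(x) = -1. \]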


Example 20.3. Step and staircase function. The step function is defined as:

\[ \operatorname{step}(x) = \begin{cases} 0 & \text{if } x < 0 \\ \frac{1}{2} & \text{if } x = 0 \\ 1 & \text{if } x > 0 \end{cases} \]

which, for $x \neq 0$, can be expressed as

\[ \operatorname{step}(x) = \frac{1}{2}(1 + \operatorname{sign}(x)). \]

The step function has one-sided limits at 0:

\[ \lim_{x \to 0^+} \operatorname{step}(x) = 1 \quad \text{and} \quad \lim_{x \to 0^-} \operatorname{step}(x) = 0. \]

The translated step function $\operatorname{step}(x - a)$ has its step at the point "a", where it has the two one-sided limits

\[ \lim_{x \to a^+} \operatorname{step}(x - a) = 1 \quad \text{and} \quad \lim_{x \to a^-} \operatorname{step}(x - a) = 0. \]

A staircase function is a function with several steps, for example,

\[ S_m(x) = \sum_{n=0}^{m} \operatorname{step}(x - n). \]

At each step point, the staircase function has a left limit and a right limit which are different, and also not equal to the value of the function $S_m$ at that point. At all other points the left limits and the right limits coincide, and therefore the two-sided limits exist at $x \neq n$.
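The behaviour of the staircase function at a step point can be seen by evaluating it just to the left of, at, and just to the right of that point. The Python sketch below mirrors the definitions of Example 20.3; the function names `step` and `staircase` are chosen here for illustration.

```python
def step(x):
    """The step function: 0 for x < 0, 1/2 at x = 0, 1 for x > 0."""
    if x < 0:
        return 0.0
    if x == 0:
        return 0.5
    return 1.0

def staircase(x, m):
    """S_m(x) = sum of step(x - n) for n = 0, ..., m."""
    return sum(step(x - n) for n in range(m + 1))

h = 1e-9  # a small offset standing in for "x close to the step point"
for point in (0, 2, 5):
    print(point, staircase(point - h, 5), staircase(point, 5), staircase(point + h, 5))
```

At the step point $x = 2$, for instance, the three values are 2, 2.5 and 3: the one-sided limits differ from each other and from the value of the function.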

Definition 20.2. A function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is increasing if $x_1, x_2 \in A$, $x_1 < x_2 \Rightarrow f(x_1) \leq f(x_2)$.

Definition 20.3. A function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is decreasing if $x_1, x_2 \in A$, $x_1 < x_2 \Rightarrow f(x_1) \geq f(x_2)$.

Definition 20.4. A function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is monotone if it is an increasing or a decreasing function.

Monotone limits exist: If a function $f : (a, b) \to \mathbb{R}$ is monotone, then the one-sided limits $\lim_{x \nearrow x_0} f(x)$ and $\lim_{x \searrow x_0} f(x)$ exist for any $x_0 \in (a, b)$.

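Proof. Consider $x_0 \in (a, b)$ and the set

\[ S_{x_0} = \{ f(x) \mid a < x < x_0 \}. \]

If f is increasing, then the set $S_{x_0}$ is bounded above by $f(x_0)$, and if f is decreasing, then the set $S_{x_0}$ is bounded below by $f(x_0)$.
If f is increasing, then the least upper bound of $S_{x_0}$, i.e. $\sup S_{x_0}$, is the left limit of f at $x_0$: for $\varepsilon > 0$ there is $x_1 < x_0$ with $f(x_1) > \sup S_{x_0} - \varepsilon$, and then $\sup S_{x_0} - \varepsilon < f(x) \leq \sup S_{x_0}$ for all $x_1 < x < x_0$. If f is decreasing, the greatest lower bound of $S_{x_0}$, i.e. $\inf S_{x_0}$, is the left limit of f at $x_0$ by the same argument.
In this way it was shown that for an increasing function f the left limit at $x_0$ exists and

\[ \lim_{x \to x_0^-} f(x) = \sup S_{x_0}. \]

For a decreasing function f the left limit at $x_0$ exists and

\[ \lim_{x \to x_0^-} f(x) = \inf S_{x_0}. \]

Considering the set $R_{x_0} = \{ f(x) \mid x_0 < x < b \}$ it can be proven, in a similar way, that if f increases, then

\[ \lim_{x \to x_0^+} f(x) = \inf R_{x_0}, \]

and if f decreases, then

\[ \lim_{x \to x_0^+} f(x) = \sup R_{x_0}. \]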

21 Infinite limits

"Infinity" ($\pm\infty$) is a mathematical symbol and not a number which is subject to arithmetic operations.

Definition 21.1. (infinite limits)
The function f has the right limit $+\infty$ at "a", denoted by $\lim_{x \to a^+} f(x) = +\infty$, if for any $M > 0$ there is a $\delta > 0$ such that $f(x) > M$ whenever $a < x < a + \delta$.
The function f has the right limit $-\infty$ at "a", denoted by $\lim_{x \to a^+} f(x) = -\infty$, if for any $M > 0$ there is a $\delta > 0$ such that $f(x) < -M$ whenever $a < x < a + \delta$.

The left limits

\[ \lim_{x \to a^-} f(x) = +\infty, \quad \lim_{x \to a^-} f(x) = -\infty \]

and the two-sided limits

\[ \lim_{x \to a} f(x) = +\infty, \quad \lim_{x \to a} f(x) = -\infty \]

are defined analogously.

Example 21.1. Show that

a) $\lim_{x \to 0} \frac{1}{x^2} = +\infty$.

b) $\lim_{x \to 0^-} \frac{1}{x} = -\infty$ and $\lim_{x \to 0^+} \frac{1}{x} = +\infty$.


Example 21.2. A function can have, at a point, a finite one-sided limit as the point is approached from one side, and an infinite one-sided limit as the point is approached from the other side.
For example, the function

\[ h(x) = \begin{cases} 0 & \text{if } x \leq 0 \\ \frac{1}{x} & \text{if } x > 0 \end{cases} \]

has

\[ \lim_{x \to 0^-} h(x) = 0, \quad \lim_{x \to 0^+} h(x) = +\infty. \]

Limits as $x \to +\infty$ or as $x \to -\infty$ are called limits at infinity, not to be confused with the infinite limits!

Definition 21.2. (limits at infinity) The number L is the limit of f(x) as x approaches $+\infty$, denoted by

\[ \lim_{x \to +\infty} f(x) = L, \]

if for any $\varepsilon > 0$ there exists a number $M > 0$ such that

\[ x > M \Rightarrow |f(x) - L| < \varepsilon. \]

The limit at $-\infty$, $\lim_{x \to -\infty} f(x) = L$, is defined analogously.

Example 21.3. The function $f(x) = \frac{1 - x^2}{1 + x + x^2}$ has the following limits at infinity:

\[ \lim_{x \to -\infty} f(x) = -1 \quad \text{and} \quad \lim_{x \to +\infty} f(x) = -1. \]
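Definition 21.2 can be made concrete for Example 21.3. Since $f(x) + 1 = \frac{2 + x}{1 + x + x^2}$, for $x \geq 2$ one has $|f(x) + 1| < \frac{2x}{x^2} = \frac{2}{x}$, so $M = \max(2, 2/\varepsilon)$ is one admissible choice. This bound is derived here for illustration and does not appear in the notes; the helper name `threshold` is likewise hypothetical.

```python
def f(x):
    # The function of Example 21.3.
    return (1 - x**2) / (1 + x + x**2)

def threshold(eps):
    """One admissible M for the limit at +infinity: x > M implies |f(x) + 1| < eps."""
    return max(2.0, 2.0 / eps)

for eps in (0.1, 0.01, 0.001):
    M = threshold(eps)
    x = M + 1.0                     # any point beyond the threshold
    print(eps, M, abs(f(x) + 1) < eps)

# The same limit is approached at -infinity.
print(f(1e6), f(-1e6))
```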

22 Limit points of a function at a point

Definition 22.1. (limit point at "a")
The number L is a limit point of f(x) at "a" if there exists a sequence $(x_n)$ such that $x_n \in A$, $x_n \neq a$, $\lim_{n \to +\infty} x_n = a$ and $\lim_{n \to +\infty} f(x_n) = L$.

Usually, the set of limit points of f(x) at "a" is denoted by $L_a(f)$.

• $\inf L_a(f)$ is called the inferior limit of f at "a" and it is denoted by $\liminf_{x \to a} f(x)$:

\[ \liminf_{x \to a} f(x) \overset{\text{def}}{=} \inf L_a(f). \]

• $\sup L_a(f)$ is called the superior limit of f at "a" and it is denoted by $\limsup_{x \to a} f(x)$:

\[ \limsup_{x \to a} f(x) \overset{\text{def}}{=} \sup L_a(f). \]

The following statement holds: The number L is the limit of f(x) as x approaches "a" if and only if

\[ \liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L. \]

Proof. Assume first that $\lim_{x \to a} f(x) = L$ and consider a sequence $(x_n)$ such that $x_n \in A$, $x_n \neq a$ and $\lim_{n \to \infty} x_n = a$. For $\varepsilon > 0$ there exists $\delta > 0$ such that

\[ 0 < |x - a| < \delta \Rightarrow |f(x) - L| < \varepsilon. \]

Since $\lim_{n \to \infty} x_n = a$, there is N such that $0 < |x_n - a| < \delta$ for $n > N$. Hence $|f(x_n) - L| < \varepsilon$ for $n > N$.
Therefore, $\lim_{n \to \infty} f(x_n) = L$. We obtain in this way that $L_a(f) = \{L\}$ and consequently

\[ \liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L. \]

Assume now that $\liminf_{x \to a} f(x) = \limsup_{x \to a} f(x) = L$ and suppose that L is not the limit of f(x) as x approaches "a". Then there exists $\varepsilon_0 > 0$ such that for every $n \in \mathbb{N}$, $n \geq 1$, there exists $x_n \in A$, $x_n \neq a$, such that $|x_n - a| < \frac{1}{n}$ and $|f(x_n) - L| \geq \varepsilon_0$. On the other hand, the sequence $(f(x_n))$ is bounded and has a subsequence $(f(x_{n_k}))$ which has a limit. It is clear that $\lim_{k \to \infty} f(x_{n_k})$ is different from L.
Hence $L_a(f)$ contains an element different from L, which contradicts $\inf L_a(f) = \sup L_a(f) = L$.

Example 22.1. If $f(x) = \sin\frac{1}{x}$ for $x \in \mathbb{R}^1 \setminus \{0\}$, then $L_0(f) = [-1, 1]$.

23 Continuity

Naively, a function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is continuous if its graph is a continuous curve. In particular, if the domain of f contains a neighborhood of a fixed real number "a", then the graph of f can be drawn through the point $(a, f(a))$ without removing the pen from the paper. The desired behavior at $(a, f(a))$ can be arranged by insisting that, for all values of x sufficiently close to "a", f(x) is close to f(a).

If $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is a function whose domain contains a neighborhood of "a", then the following definition applies.

Definition 23.1. The function f is continuous at "a" if $\lim_{x \to a} f(x) = f(a)$.

Note that this definition demands three things:

- firstly, that $\lim_{x \to a} f(x)$ exists,

- secondly, that f(a) is defined,

- finally, that the previous two values are equal.

The $\varepsilon, \delta$ definition of $\lim_{x \to a} f(x) = L$ can be easily adapted to give the following $\varepsilon, \delta$ definition of continuity at "a".

Definition 23.2. The function f is continuous at "a" if and only if for every $\varepsilon > 0$ there exists $\delta > 0$ such that:

\[ |x - a| < \delta \Rightarrow |f(x) - f(a)| < \varepsilon. \]

Example 23.1. Use the $\varepsilon, \delta$ definition of continuity to prove that $f(x) = x^2$ is continuous at $a = 0$.

Solution: For any $\varepsilon > 0$, determine those x for which $|f(x) - f(0)| < \varepsilon$. Now $|f(x) - f(0)| = |x^2 - 0| = |x|^2 < \varepsilon$ provided $|x| < \sqrt{\varepsilon}$. So let $\delta = \sqrt{\varepsilon}$. If $|x - 0| < \delta$, then $|f(x) - f(0)| < \varepsilon$. In other words, $\lim_{x \to 0} f(x) = 0$ and, since $f(0) = 0$, $\lim_{x \to 0} f(x) = f(0)$.

Hence, f is continuous at 0.

If a function f is continuous for all x in the range $a < x < b$, then it can be said that f is continuous on the interval $(a, b)$.
If f is continuous for all x in its domain, it can be simply said that f is continuous.

Definition 23.3. If $\lim_{x \to a^+} f(x)$ exists and equals f(a), then f is called right-continuous at "a".

Definition 23.4. If $\lim_{x \to a^-} f(x)$ exists and equals f(a), then f is called left-continuous at "a".

The $\varepsilon, \delta$ formulations of the last two definitions are not difficult to write down. Moreover, the following result holds.

A function f is continuous at "a" if and only if it is both left-continuous and right-continuous at "a".

Note: If a function f is only defined on the closed interval $[a, b]$ and it is claimed that "f is continuous on $[a, b]$", what is meant is that f is continuous on $(a, b)$, right-continuous at "a" and left-continuous at "b".

Heine's criterion for continuity: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is continuous at $a \in A$ if and only if for any sequence $(x_n)$ with $x_n \in A$ and $x_n \to a$ as $n \to \infty$, the sequence $(f(x_n))$ converges to f(a).

Proof. The result is obtained from Heine's criterion for the limit.

Cauchy-Bolzano's criterion for continuity: The function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is continuous at $a \in A$ if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that $|x' - a| < \delta$ and $|x'' - a| < \delta \Rightarrow |f(x') - f(x'')| < \varepsilon$.

Proof. The result is obtained from Cauchy-Bolzano's criterion for the limit.


24 Rules for continuity

Sum rule: If f and g are continuous at "a", then $f + g$ is continuous at "a".

Product rule: If f and g are continuous at "a", then $f \cdot g$ is continuous at "a".

Reciprocal rule: If f is continuous at "a" and $f(a) \neq 0$, then $\frac{1}{f}$ is continuous at "a".

Squeeze rule: Let f, g and h be such that

\[ h(x) \leq f(x) \leq g(x) \]

for all x in some neighborhood of "a" and such that $h(a) = f(a) = g(a)$. If h and g are continuous at "a", then so is f.

The proofs of the above rules are left as exercises.

Composite rule: Let f and g be continuous at "a" and f(a), respectively. Then $g \circ f$ is continuous at "a".

Proof. Let $f(a) = b$. Since g is continuous at b, for every $\varepsilon > 0$ there exists a $\delta_1 > 0$ such that

\[ |t - b| < \delta_1 \Rightarrow |g(t) - g(b)| < \varepsilon. \]

Since f is continuous at "a", for this $\delta_1 > 0$ there exists a $\delta_2 > 0$ such that

\[ |x - a| < \delta_2 \Rightarrow |f(x) - f(a)| < \delta_1. \]

Now we deduce that

\[ |x - a| < \delta_2 \Rightarrow |g(f(x)) - g(f(a))| < \varepsilon. \]

Hence, for any $\varepsilon > 0$ there exists a $\delta = \delta_2 > 0$ such that

\[ |x - a| < \delta \Rightarrow |(g \circ f)(x) - (g \circ f)(a)| < \varepsilon. \]

Example 24.1. Given that the identity function $x \mapsto x$, the constant function $x \mapsto k$ and the trigonometric functions sine and cosine are all continuous, the rules above show that the following functions are continuous:

a) $x \mapsto \frac{x^2 + 2x + 3}{x^2 + x + 1}$,

b) $x \mapsto x^3 \cdot \cos x^2$,

c) $x \mapsto \begin{cases} x \cdot \sin\frac{1}{x} & \text{if } x \neq 0 \\ 0 & \text{if } x = 0 \end{cases}$


The extension by continuity: If $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ is not defined at the point "a" but $L = \lim_{x \to a} f(x)$ exists, then the function $g : A \cup \{a\} \subset \mathbb{R}^1 \to \mathbb{R}^1$, defined by $g(x) = f(x)$ for $x \neq a$ and $g(a) = L$, is continuous at "a".
The function g is called the extension by continuity of the function f.

Example 24.2. The function $g : \mathbb{R}^1 \to \mathbb{R}^1$ defined by

\[ g(x) = \begin{cases} \frac{\sin x}{x} & \text{for } x \neq 0 \\ 1 & \text{for } x = 0 \end{cases} \]

is the extension by continuity of the function $f : \mathbb{R}^1 \setminus \{0\} \to \mathbb{R}^1$, defined by:

\[ f(x) = \frac{\sin x}{x}. \]

25 Properties of continuous functions

The results in this section show that the definition of continuity leads naturally to the intuitive geometric interpretation which is used when sketching the graphs of continuous functions.

The boundedness property: Let f be continuous on the interval $[a, b]$. Then

(1) f is bounded on $[a, b]$;

(2) f attains its bounds somewhere on $[a, b]$.

Comment 25.1. What is being said is that, for (1), there exist numbers m and M such that

\[ m \leq f(x) \leq M \quad \text{for all } x \in [a, b]. \]

For (2), if m and M are chosen to be the infimum and supremum, respectively, of the set $\{f(x) \mid a \leq x \leq b\}$, then (2) claims that there are numbers c and d in $[a, b]$ such that $m = f(c)$ and $M = f(d)$.

Proof of boundedness property.
(1) Let $B = \{x \mid x \in [a, b] \text{ and } f \text{ is bounded on } [a, x]\}$. Clearly $a \in B$ and B is bounded above by b. By the completeness property of $\mathbb{R}^1$, B possesses a least upper bound. Let $c = \sup B$. Since f is right-continuous at "a", for $\varepsilon = 1$ there exists $\delta > 0$ such that

\[ a < x < a + \delta \Rightarrow |f(x) - f(a)| < 1 \Rightarrow |f(x)| < 1 + |f(a)|. \]

Hence, f is bounded on $\left[a, a + \frac{\delta}{2}\right]$ and so $c \geq a + \frac{\delta}{2} > a$. We want to show that $c = b$.

Suppose that $c < b$. Since $a < c < b$, f is continuous at c. Then, for $\varepsilon = 1$ there exists a $\delta' > 0$ such that

\[ |x - c| < \delta' \Rightarrow |f(x)| \leq 1 + |f(c)|. \]

In other words, f is bounded on $(c - \delta', c + \delta')$. Since f is also bounded on $\left[a, c - \frac{\delta'}{2}\right]$ by the definition of c, f is bounded on $\left[a, c + \frac{\delta'}{2}\right]$. But then $c + \frac{\delta'}{2} \in B$ and this contradicts c being the supremum of B. Thus $c = b$; the same argument, using left-continuity at b, shows that $b \in B$, and so f is bounded on $[a, b]$, as required.
(2) Since f is bounded on $[a, b]$,

\[ A = f([a, b]) = \{f(x) \mid a \leq x \leq b\} \]

is a set which is bounded both above and below. Let $m = \inf A$ and $M = \sup A$. Suppose that there is no $x \in [a, b]$ such that $f(x) = M$ and define $g(x) = \frac{1}{M - f(x)}$ for $x \in [a, b]$.

Now g is continuous on $[a, b]$ by the sum and reciprocal rules. By (1), g is bounded on $[a, b]$. Let such a bound be $K > 0$. Then:

\[ g(x) \leq K \Rightarrow \frac{1}{M - f(x)} \leq K \Rightarrow \frac{1}{K} \leq M - f(x) \Rightarrow f(x) \leq M - \frac{1}{K}. \]

This contradicts the fact that M is the least upper bound for f on $[a, b]$, so the assumption that f never takes the value M is false. Hence f attains its upper bound on $[a, b]$. A similar argument shows that f attains its lower bound somewhere on $[a, b]$.

The intermediate value property: Let f be continuous on $[a, b]$ and suppose that $f(a) = \alpha$ and $f(b) = \beta$. For every real number $\gamma$ strictly between $\alpha$ and $\beta$ there exists a number c, $a < c < b$, with $f(c) = \gamma$.

Comment 25.2. Here it is being said that, if f takes the values $\alpha$ and $\beta$ somewhere on the interval $[a, b]$, then f must take all possible values between $\alpha$ and $\beta$.

Proof of the intermediate value property. Suppose $\alpha < \gamma < \beta$ and let

\[ S = \{x \mid x \in [a, b] \text{ and } f(x) < \gamma\}. \]

The set S is non-empty, since it contains "a". Let $c = \sup S$. It is clear that $a < c < b$.
If $f(c) < \gamma$, then, for $\varepsilon = \gamma - f(c) > 0$, there exists a $\delta > 0$ such that

\[ |x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon. \]

In particular

\[ \left| f\left(c + \frac{\delta}{2}\right) - f(c) \right| < \varepsilon \]

and so

\[ f\left(c + \frac{\delta}{2}\right) - f(c) < \gamma - f(c). \]

But then $f\left(c + \frac{\delta}{2}\right) < \gamma$ and hence $c + \frac{\delta}{2} \in S$, which contradicts the fact that c is the supremum of S. Hence, $f(c) \geq \gamma$.
If $f(c) > \gamma$ then, for $\varepsilon = f(c) - \gamma > 0$, there exists a $\delta > 0$ such that

\[ |x - c| < \delta \Rightarrow |f(x) - f(c)| < \varepsilon. \]

Hence $c - \delta < x \leq c \Rightarrow f(x) > \gamma$ and so x does not lie in S. In other words, $\sup S \leq c - \delta$, which in turn contradicts the definition of c.
It follows that $f(c) = \gamma$. Hence, $\gamma$ is a value of f.
The intermediate value property has many applications and the following example illustrates one of these.

Example 25.1. Any polynomial of odd degree has at least one real root.

Solution: Let $P(X) = a_0 + a_1 X + \cdots + a_n X^n$ where n is odd and, without loss of generality, let $a_n = 1$. We know that P is continuous. Define

\[ r(x) = \frac{P(x)}{x^n} - 1, \quad x \neq 0. \]

Now

\[ |r(x)| = \left| \frac{P(x)}{x^n} - 1 \right| = \left| \frac{a_{n-1}}{x} + \cdots + \frac{a_1}{x^{n-1}} + \frac{a_0}{x^n} \right| \leq \left| \frac{a_{n-1}}{x} \right| + \cdots + \left| \frac{a_1}{x^{n-1}} \right| + \left| \frac{a_0}{x^n} \right| \]

and if M is the maximum of $|a_{n-1}|, \ldots, |a_0|$, then

\[ |r(x)| \leq M \cdot \sum_{r=1}^{n} \frac{1}{|x|^r} \leq M \cdot \sum_{r=1}^{\infty} \frac{1}{|x|^r} = \frac{M \cdot \frac{1}{|x|}}{1 - \frac{1}{|x|}} = \frac{M}{|x| - 1}, \quad \text{for } |x| > 1. \]

Hence, $|r(x)| < 1$ for $|x| > 1 + M$. In particular, $1 + r(x) > 0$ for $|x| > 1 + M$. Hence, $P(x) = x^n(1 + r(x))$ has the same sign as $x^n$ for $|x| > 1 + M$. Since n is odd, there exist $\alpha, \beta \in \mathbb{R}^1$ with $P(\alpha) > 0$ (choose $\alpha > 1 + M$) and $P(\beta) < 0$ (choose $\beta < -(1 + M)$). By the intermediate value property, $P(\gamma) = 0$ for some $\gamma$ between $\beta$ and $\alpha$. Since $P(x) \neq 0$ for $|x| > 1 + M$, this zero satisfies $|\gamma| \leq 1 + M$. Incidentally, this shows that P has a zero in the interval $[-(1 + M), 1 + M]$. In fact, all the real zeros of P lie in this interval.
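The argument of Example 25.1 is constructive enough to drive a root search: the bound $1 + M$ gives a bracket on which P changes sign, and repeated halving of the bracket (bisection, a technique not discussed in the notes and used here only as an illustration) locates a zero. The names `poly` and `real_root_odd` are hypothetical.

```python
def poly(coeffs, x):
    """Evaluate a_0 + a_1 x + ... + a_n x^n by Horner's rule; coeffs = [a_0, ..., a_n]."""
    result = 0.0
    for a in reversed(coeffs):
        result = result * x + a
    return result

def real_root_odd(coeffs, tol=1e-12):
    """Approximate a real root of a monic odd-degree polynomial by bisection."""
    n = len(coeffs) - 1
    if n % 2 != 1 or coeffs[-1] != 1:
        raise ValueError("expects odd degree and leading coefficient 1")
    M = max(abs(a) for a in coeffs[:-1])
    # Beyond 1 + M the sign of P agrees with the sign of x^n, so P(lo) < 0 < P(hi).
    lo, hi = -(2.0 + M), 2.0 + M
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if poly(coeffs, mid) < 0:
            lo = mid        # keep P(lo) < 0
        else:
            hi = mid        # keep P(hi) >= 0
    return (lo + hi) / 2

# P(x) = x^3 - 2x - 5, so M = 5 and the bracket is [-7, 7].
root = real_root_odd([-5.0, -2.0, 0.0, 1.0])
print(root, poly([-5.0, -2.0, 0.0, 1.0], root))
```

The intermediate value property is what guarantees that the sign change maintained by the loop encloses an actual zero.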

Theorem 25.1 (The interval theorem). Let f be continuous on $I = [a, b]$. Then, $f(I)$ is a closed bounded interval.

Comment 25.3. The claim here is that continuous functions map intervals onto intervals. This means that the intuitive picture of a continuous function as one having a continuous graph is sound.

Proof of the interval theorem. By the boundedness property there exist numbers c and d in I such that $f(c) = m_0$ and $f(d) = M_0$ and $m_0 \leq f(x) \leq M_0$ for all $x \in I$.
Suppose, for simplicity, that $c \leq d$. Apply the intermediate value property to f on the subinterval $[c, d]$ to deduce that f takes all possible values between $f(c) = m_0$ and $f(d) = M_0$. In other words, $f(I) = [m_0, M_0]$.

Theorem 25.2 (The fixed point theorem). Let $f : [a, b] \to [a, b]$ be a continuous function. Then there is at least one number c which is fixed by f, that is, $f(c) = c$.

Comment 25.4. This result says that if proceeding continuously from $(a, f(a))$ to $(b, f(b))$, then the line $y = x$ must be crossed.


Proof of the fixed point theorem. Let $g : [a, b] \to \mathbb{R}^1$ be defined by $g(x) = f(x) - x$. Since the identity function $x \mapsto x$ is continuous on $[a, b]$, the function g is continuous on $[a, b]$.
If $f(a) = a$ or $f(b) = b$, then there is nothing to prove. So it is assumed that $f(a) \neq a$ and $f(b) \neq b$. Since f maps $[a, b]$ into $[a, b]$, we have $f(a) > a$ and $f(b) < b$, i.e. $g(a) > 0$ and $g(b) < 0$.
The intermediate value property applied to g on the interval $[a, b]$ implies that $g(c) = 0$ for some c, $a < c < b$. Hence $f(c) = c$.
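The proof above reduces the fixed point problem to a sign change of $g(x) = f(x) - x$, which suggests a simple numerical search. The sketch below applies bisection to g (an illustrative technique, not part of the notes) for the assumed sample function $f = \cos$ on $[0, 1]$, which maps $[0, 1]$ into itself; `fixed_point` is a hypothetical helper name.

```python
import math

def fixed_point(f, a, b, tol=1e-12):
    """Approximate c in [a, b] with f(c) = c, for continuous f mapping [a, b] into [a, b]."""
    def g(x):
        return f(x) - x
    lo, hi = a, b               # invariant: g(lo) >= 0 >= g(hi)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(mid) >= 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

c = fixed_point(math.cos, 0.0, 1.0)
print(c, math.cos(c))
```

The invariant holds initially because $f(a) \geq a$ and $f(b) \leq b$, exactly the observation used in the proof.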

Theorem 25.3 (The continuity of the inverse function). Suppose that $f : A \to B$ is a bijection where A and B are intervals. If f is continuous on A, then $f^{-1}$ is continuous on B.

Proof. Consider the continuous bijection $f : A \to B$ where A and B are intervals. First it is shown that f is either strictly increasing or strictly decreasing. If f is neither strictly increasing nor strictly decreasing, then, without loss of generality, there are numbers $a_1$, $a_2$ and $a_3$ such that $a_1 < a_2 < a_3$ and $f(a_1) < f(a_3) < f(a_2)$. Apply the intermediate value property to f on the interval $[a_1, a_2]$ to deduce that $f(c) = f(a_3)$ for some $c \in (a_1, a_2)$. This contradicts the fact that f is a bijection. For the rest of the proof it is assumed that f is strictly increasing, the proof for the strictly decreasing case being similar.
Hence $f^{-1}$ is strictly increasing. Let $b \in B$ and $f^{-1}(b) = a$, so that $f(a) = b$. For every $\varepsilon > 0$ small enough that $I = [a - \varepsilon, a + \varepsilon] \subset A$ (if "a" is an endpoint of A, the corresponding one-sided interval is used instead), f maps the interval I onto some interval $f(I) = [m, M]$. Since f is strictly increasing, $m < b < M$, so let $\delta$ be the minimum of $b - m$ and $M - b$. Clearly $\delta > 0$.
Now $[b - \delta, b + \delta]$ is a subset of $[m, M] = f(I)$ and so $f^{-1}$ maps $[b - \delta, b + \delta]$ into $I = [a - \varepsilon, a + \varepsilon]$. Thus, given any $\varepsilon > 0$, there exists a $\delta > 0$ such that

\[ |y - b| < \delta \Rightarrow |f^{-1}(y) - f^{-1}(b)| < \varepsilon. \]

If $f : A \to B$ is a strictly monotone surjection, where A and B are intervals, then f and $f^{-1}$ are continuous.

Let $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ be a continuous function.

Definition 25.1. The function f is called uniformly continuous on A if for any $\varepsilon > 0$ there exists $\delta > 0$ such that

\[ |x' - x''| < \delta \Rightarrow |f(x') - f(x'')| < \varepsilon \quad \text{for any } x', x'' \in A. \]

Theorem 25.4 (Theorem of the uniform continuity). If $f : [a, b] \subset \mathbb{R}^1 \to \mathbb{R}^1$ is continuous, then f is uniformly continuous.

Comment 25.5. This result says that a continuous function on a closed bounded interval $[a, b]$ is uniformly continuous: for a given $\varepsilon$, a single $\delta$ works at every point of $[a, b]$.

26 Sequence of functions. Set of convergence.

Definition 26.1. A sequence of real valued functions defined on $A \subset \mathbb{R}$ is a function $F : \mathbb{N} \to \{f \mid f : A \to \mathbb{R}\}$. We write $F(n) = f_n$ and the sequence of functions is denoted by $(f_n)$.

Let $A \subset \mathbb{R}^1$ and let $f_1, f_2, \ldots, f_n, \ldots$ be a sequence of real valued functions $f_n : A \to \mathbb{R}^1$. We will denote this sequence by $(f_n)$.

Definition 26.2. An element $a \in A$ is called a point of convergence of the sequence $(f_n)$ if the sequence $(f_n(a))$ converges.
The set of all points of convergence is called the set of convergence of the sequence $(f_n)$.

Let $B \subset A$ be the set of convergence of the sequence $(f_n)$. For $x \in B$ we denote by f(x) the limit

\[ f(x) = \lim_{n \to \infty} f_n(x). \]

We establish in this way a correspondence from B to $\mathbb{R}^1$, i.e. a function $f : B \subset A \subset \mathbb{R}^1 \to \mathbb{R}^1$.
The function f defined above is called the limit function, on the set B, of the sequence $(f_n)$. We will say that the sequence $(f_n)$ converges on B to f.

Definition 26.3. Let $(f_n)$ be a sequence of functions defined on A, i.e. $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$. A function $f : A \to \mathbb{R}$ is called the limit function of the sequence $(f_n)$ if for any $x \in A$ and $\varepsilon > 0$ there exists $N(x, \varepsilon)$ such that for $n > N(x, \varepsilon)$ we have

\[ |f_n(x) - f(x)| < \varepsilon, \]

written: $f_n \xrightarrow[n \to \infty]{} f$ on A.

Comment 26.1. If in the above definition N depends only on $\varepsilon$ and does not depend on x, then we say that the sequence $(f_n)$ converges uniformly to f on A.

Definition 26.4. The sequence $(f_n)$ is uniformly convergent on A to f if for any $\varepsilon > 0$ there exists $N(\varepsilon)$ such that for $n > N(\varepsilon)$ and $x \in A$ we have

\[ |f_n(x) - f(x)| < \varepsilon. \]

If the sequence $(f_n)$ is uniformly convergent to f, we will write $f_n \xrightarrow[n \to \infty]{u} f$.

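Example 26.1. $A = [0, 1]$, $f_n(x) = x^n$, $f(x) = \begin{cases} 1 & \text{for } x = 1 \\ 0 & \text{for } x \in [0, 1) \end{cases}$; here $f_n \xrightarrow[n \to \infty]{} f$, but $(f_n)$ does not converge uniformly to f.

Example 26.2. $A = [0, 2\pi]$, $f_n(x) = \frac{\sin nx}{n}$, $f(x) = 0$; here $f_n \xrightarrow[n \to \infty]{u} f$.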
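The difference between the two examples shows up in the quantity $\sup_{x \in A} |f_n(x) - f(x)|$. The Python sketch below is an illustration under stated assumptions: for $f_n(x) = x^n$ it exhibits a point of $[0, 1)$ where $f_n$ equals $\frac{1}{2}$, so the supremum never drops below $\frac{1}{2}$; for $f_n(x) = \frac{\sin nx}{n}$ it estimates the supremum on a grid, which can only underestimate the true value, itself bounded by $\frac{1}{n}$. The helper names are hypothetical.

```python
import math

def witness_power(n):
    """Return a point x in [0, 1) and x**n, where x is chosen so that x**n = 1/2."""
    x = 0.5 ** (1.0 / n)
    return x, x ** n

def sup_sine(n, points=10001):
    """Grid estimate of sup |sin(n x)/n - 0| over [0, 2*pi]."""
    grid = (2 * math.pi * k / (points - 1) for k in range(points))
    return max(abs(math.sin(n * x) / n) for x in grid)

for n in (1, 10, 100):
    x, value = witness_power(n)
    print(n, x, value, sup_sine(n))
```

For every n the first sequence stays at distance $\frac{1}{2}$ from its limit somewhere on $[0, 1)$, while the second is within $\frac{1}{n}$ of 0 everywhere.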

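Uniform convergence criteria: Let $(f_n)$ be a sequence of functions $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}$.

• First criterion (Cauchy): The sequence $(f_n)$ converges uniformly to a function f defined on A if and only if for any $\varepsilon > 0$ there exists $N(\varepsilon)$ such that, for any $n, m > N(\varepsilon)$ and any $x \in A$, we have:

\[ |f_n(x) - f_m(x)| < \varepsilon. \]

Proof. Assume first that $f_n \xrightarrow[n \to \infty]{u} f$. For $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $p > N_\varepsilon$ we have

\[ |f_p(x) - f(x)| < \frac{\varepsilon}{2}, \quad \text{for any } x \in A. \]

We have

\[ |f_n(x) - f_m(x)| \leq |f_n(x) - f(x)| + |f(x) - f_m(x)| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon \]

for any $n, m > N_\varepsilon$ and $x \in A$.
Assume now that for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $n, m > N_\varepsilon$ and $x \in A$ we have

\[ |f_n(x) - f_m(x)| < \varepsilon, \]

and show that there exists $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ such that $f_n \xrightarrow[n \to \infty]{u} f$.

From the hypothesis, for every $x \in A$ the sequence of real numbers $(f_n(x))$ is a Cauchy sequence and therefore converges. Let $f(x) = \lim_{n \to \infty} f_n(x)$. We obtain in this way a function $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$. The sequence of functions $(f_n)$ converges at any point $x \in A$ to f(x). Let now $\varepsilon > 0$ and $N_\varepsilon$ be such that for $n, m > N_\varepsilon$ and $x \in A$ we have

\[ |f_n(x) - f_m(x)| < \frac{\varepsilon}{2}. \]

We choose $n_0 > N_\varepsilon$. Since $f_n \xrightarrow[n \to \infty]{} f$, we have $f_n - f_{n_0} \xrightarrow[n \to \infty]{} f - f_{n_0}$, and letting $n \to \infty$ in the previous inequality (with $m = n_0$) gives

\[ |f(x) - f_{n_0}(x)| \leq \frac{\varepsilon}{2} < \varepsilon \quad \text{for } x \in A. \]

Since $n_0 > N_\varepsilon$ was arbitrarily chosen, it follows that for any $n > N_\varepsilon$ and $x \in A$ we have

\[ |f_n(x) - f(x)| < \varepsilon, \]

i.e. $f_n \xrightarrow[n \to \infty]{u} f$.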

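• Second criterion: Let $(f_n)$ be a sequence of functions defined on A, $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$, and $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$. If there exists a sequence $(a_n)$ of positive real numbers ($a_n > 0$) which converges to 0 ($a_n \to 0$), such that $|f_n(x) - f(x)| \leq a_n$ for any $n \in \mathbb{N}$ and any $x \in A$, then $f_n \xrightarrow[n \to \infty]{u} f$.

Proof. Let $\varepsilon > 0$. Since $a_n \to 0$, there exists $N_\varepsilon$ such that for any $n \geq N_\varepsilon$ we have $a_n < \varepsilon$. It follows that

\[ |f_n(x) - f(x)| < \varepsilon \]

for $n \geq N_\varepsilon$ and $x \in A$, i.e. $f_n \xrightarrow[n \to \infty]{u} f$.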

27 Continuity and uniform convergence

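The following statement shows that uniform convergence preserves continuity.

Proposition 27.1. Let $(f_n)$ be a sequence of functions $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$ which converges uniformly to $f : A \subset \mathbb{R}^1 \to \mathbb{R}^1$, $f_n \xrightarrow[n \to \infty]{u} f$. If all the functions $f_n$ are continuous at a point $a \in A$, then f is continuous at "a".

Proof. Let $\varepsilon > 0$. Since $f_n \xrightarrow[n \to \infty]{u} f$, there exists $N_\varepsilon$ such that for any $n \geq N_\varepsilon$ and $x \in A$ we have

\[ |f_n(x) - f(x)| < \frac{\varepsilon}{3}. \]

In particular, for $n = N_\varepsilon$ we have

\[ |f_{N_\varepsilon}(x) - f(x)| < \frac{\varepsilon}{3} \ \text{ for } x \in A, \quad \text{and} \quad |f_{N_\varepsilon}(a) - f(a)| < \frac{\varepsilon}{3}. \]

Since $f_{N_\varepsilon}$ is continuous at "a", there exists $\delta_\varepsilon > 0$ such that $|x - a| < \delta_\varepsilon \Rightarrow |f_{N_\varepsilon}(x) - f_{N_\varepsilon}(a)| < \frac{\varepsilon}{3}$. Therefore, for any $x \in A$ with $|x - a| < \delta_\varepsilon$ we have

\[ |f(x) - f(a)| \leq |f(x) - f_{N_\varepsilon}(x)| + |f_{N_\varepsilon}(x) - f_{N_\varepsilon}(a)| + |f_{N_\varepsilon}(a) - f(a)| < \frac{\varepsilon}{3} + \frac{\varepsilon}{3} + \frac{\varepsilon}{3} = \varepsilon. \]

That means that f is continuous at "a".

Corollary 27.1. The limit of a uniformly convergent sequence of continuous functions is a continuous function.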

28 Equal continuous and equal bounded sequence of functions

Let $A \subset \mathbb{R}^1$ and let $(f_n)$ be a sequence of functions defined on A, $f_n : A \to \mathbb{R}^1$.

The expression "$(f_n)$ is a sequence of continuous functions" means: for any $n \in \mathbb{N}$, $x \in A$ and $\varepsilon > 0$ there exists $\delta = \delta(n, x, \varepsilon) > 0$ such that:

\[ |x' - x| < \delta \Rightarrow |f_n(x') - f_n(x)| < \varepsilon. \]

If the functions $f_n$ are uniformly continuous on A, then $\delta$ does not depend on x. Hence:
"$(f_n)$ is a sequence of uniformly continuous functions on A" means:
for any $n \in \mathbb{N}$ and any $\varepsilon > 0$, there exists $\delta = \delta(n, \varepsilon) > 0$ such that:

\[ |x' - x''| < \delta \Rightarrow |f_n(x') - f_n(x'')| < \varepsilon \quad \text{for any } x', x'' \in A. \]

It is possible that $\delta$ does not depend on n, but depends on x and $\varepsilon$. In this case the sequence $(f_n)$ is a sequence of equal continuous (equicontinuous) functions. More precisely:
The sequence $(f_n)$ is a sequence of equal continuous functions on A if for any $x \in A$ and $\varepsilon > 0$ there exists $\delta = \delta(x, \varepsilon) > 0$ such that for any n:

\[ |x' - x| < \delta \Rightarrow |f_n(x') - f_n(x)| < \varepsilon. \]

If $\delta$ is independent of x and n, then the functions of the sequence $(f_n)$ are uniformly continuous and equal continuous too; they are equal uniformly continuous. More precisely:

$(f_n)$ is a sequence of equal uniformly continuous functions on A if for any $\varepsilon > 0$ there is a $\delta = \delta(\varepsilon) > 0$ such that

\[ |x' - x''| < \delta(\varepsilon) \Rightarrow |f_n(x') - f_n(x'')| < \varepsilon \]

for any $n \in \mathbb{N}$ and $x', x'' \in A$.

The sequence $(f_n)$ is a sequence of bounded functions on A if for any $n \in \mathbb{N}$ there is $M = M(n) > 0$ such that

\[ |f_n(x)| < M \]

for any $x \in A$.

If M is independent of n, then the functions of the sequence $(f_n)$ are equal bounded (uniformly bounded). More precisely:
$(f_n)$ is a sequence of equal bounded functions on A if there is $M > 0$ such that

\[ |f_n(x)| < M \]

for $n \in \mathbb{N}$, $x \in A$.

Theorem 28.1 (Arzela-Ascoli). Let $I = [a, b]$ be a closed interval and $(f_n)$ a sequence of functions $f_n : I \to \mathbb{R}^1$.
If $(f_n)$ is a sequence of equal continuous and equal bounded functions, then $(f_n)$ contains a subsequence $(f_{n_k})$ which is uniformly convergent on I.

Proof. The proof is rather technical and it will be omitted.

29 Series of functions. Convergence and uniform convergence.

Let $A \subset \mathbb{R}^1$ and let $(f_n)$ be a sequence of functions $f_n : A \to \mathbb{R}^1$.

Definition 29.1. It is said that the symbol $\sum_{n=1}^{\infty} f_n$ is a convergent series of functions at the point $a \in A$ if the numerical series $\sum_{n=1}^{\infty} f_n(a)$ is convergent.

The symbol $\sum_{n=1}^{\infty} f_n$ is a divergent series of functions at the point $a \in A$ if the numerical series $\sum_{n=1}^{\infty} f_n(a)$ diverges.

A point $a \in A$ is called a point of convergence of the series of functions $\sum_{n=1}^{\infty} f_n$ if the series converges at "a".

The collection of all the points of convergence of the series is called the set of convergence of the series $\sum_{n=1}^{\infty} f_n$.

Let $B \subset A$ be the set of convergence of the series $\sum_{n=1}^{\infty} f_n$. For $x \in B$ we denote by S(x) the sum

\[ S(x) = \sum_{n=1}^{\infty} f_n(x). \]

We establish in this way a correspondence from B to $\mathbb{R}^1$, i.e. a function

\[ S : B \subset A \subset \mathbb{R}^1 \to \mathbb{R}^1. \]

The function S defined above is called the sum function, on the set B, of the series $\sum_{n=1}^{\infty} f_n$.

We will say that the series $\sum_{n=1}^{\infty} f_n$ converges to S on B, and we will write

\[ S = \sum_{n=1}^{\infty} f_n \quad \text{on } B. \]

Definition 29.2. Let $\sum_{n=1}^{\infty} f_n$ be a series of functions defined on A, and S a function defined on $B \subset A$. The series $\sum_{n=1}^{\infty} f_n$ converges to S on B if for any $x \in B$ and any $\varepsilon > 0$ there exists $N = N(x, \varepsilon) > 0$ such that for any $n > N$ we have

\[ |f_1(x) + f_2(x) + \cdots + f_n(x) - S(x)| < \varepsilon. \]

If the number N is independent of x, then the series is uniformly convergent on B to S.

In this case we have the following: the series $\sum_{n=1}^{\infty} f_n$ converges uniformly to S on B if for any $\varepsilon > 0$ there is $N = N(\varepsilon) > 0$ such that

\[ |f_1(x) + f_2(x) + \cdots + f_n(x) - S(x)| < \varepsilon \]

for any $x \in B$ and $n > N(\varepsilon)$.

The series $\sum_{n=1}^{\infty} f_n$ converges absolutely on B if the series $\sum_{n=1}^{\infty} |f_n|$ converges on B.

If the series $\sum_{n=1}^{\infty} f_n$ converges absolutely on B, then it converges on B.


Example 29.1.

a) Consider the sequence of functions defined by

$$f_n(x) = \frac{x^2}{(1 + x^2)^n}, \quad n \ge 0,$$

and the series $\sum_{n=0}^{\infty} f_n(x)$.

The set of convergence of this series is R1 and the sum of the series is

$$S(x) = \begin{cases} 1 + x^2 & \text{for } x \ne 0 \\ 0 & \text{for } x = 0. \end{cases}$$

The series is absolutely convergent on R1.

b) For n ≥ 1 consider $f_n$ defined on R1 by

$$f_n(x) = \frac{\sin^n x}{n^2}$$

and the series $\sum_{n=1}^{\infty} f_n$.

The series is absolutely convergent on R1. The series is uniformly convergent on R1.

c) For n ≥ 1 consider $f_n(x) = \cos^n x$ and the series $\sum_{n=1}^{\infty} f_n$.

The set of convergence is $\mathbb{R}^1 \setminus \{k\pi\}_{k \in \mathbb{Z}}$. The series is absolutely convergent on the set of convergence.

d) For n ≥ 1 consider the functions $f_n(x) = \frac{e^{n|x|}}{n}$ and the series $\sum_{n=1}^{\infty} f_n$.

The set of convergence of the series is empty.
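The sum function in part a) can be checked numerically. The following is a minimal sketch, not part of the original notes; the helper name `partial_sum` and the number of terms are illustrative assumptions. It compares a partial sum of the series with 1 + x² for x ≠ 0 and with 0 for x = 0.

```python
def partial_sum(x, terms):
    """Partial sum of x^2 / (1 + x^2)^n for n = 0, ..., terms - 1 (hypothetical helper)."""
    return sum(x**2 / (1 + x**2) ** n for n in range(terms))

for x in (0.0, 0.5, 2.0):
    expected = 1 + x**2 if x != 0 else 0.0   # the sum function S(x) from part a)
    print(x, partial_sum(x, 200), expected)
```

Each term is continuous, yet the sum function jumps at x = 0, which suggests that the convergence cannot be uniform on any interval containing 0.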

30 Convergence criteria for series of functions

Consider the series of functions $\sum_{n=1}^{\infty} f_n$ defined on A, i.e. $f_n : A \subset \mathbb{R}^1 \to \mathbb{R}^1$.

Definition 30.1. The series of functions $\sum_{n=k+1}^{\infty} f_n$ is called the remainder of order k of the series $\sum_{n=1}^{\infty} f_n$.


1st Criterion: The series $\sum_{n=1}^{\infty} f_n$ converges if and only if the remainder of any order k of the series converges.

Proof. Consider

$$S_k = f_1 + f_2 + \cdots + f_k \quad \text{and} \quad \sigma_p = f_{k+1} + f_{k+2} + \cdots + f_{k+p}$$

and remark that

$$S_{k+p} = S_k + \sigma_p.$$

Therefore, the sequence $(S_{k+p})$ converges as p → ∞ if and only if the sequence $(\sigma_p)$ converges as p → ∞.

2nd Criterion: The series $\sum_{n=1}^{\infty} f_n$ converges if and only if the sequence of the sums of the remainders tends to 0.

Proof. Obvious.

3rd Criterion (Cauchy): The series $\sum_{n=1}^{\infty} f_n$ converges uniformly on A if and only if for any ε > 0 there is N = N(ε) such that for n ≥ N and p ≥ 1 we have

$$|f_{n+1}(x) + f_{n+2}(x) + \cdots + f_{n+p}(x)| < \varepsilon$$

for all x ∈ A.

Proof. An immediate consequence of the Cauchy criterion for sequences.

4th Criterion: Let $\sum_{n=1}^{\infty} a_n$ be a convergent series of positive numbers.

If $|f_n(x)| \le a_n$ for x ∈ A and n ∈ N, then the series $\sum_{n=1}^{\infty} f_n$ is uniformly convergent on A.

Proof. Obvious.
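The 4th criterion can be illustrated numerically with the series of Example 29.1 b), taking $a_n = 1/n^2$. The sketch below is only an illustration under stated assumptions: the function names, the grid of sample points and the truncation of the tails are hypothetical choices. It compares the largest sampled tail of the series with the corresponding tail of the majorant series.

```python
import math

def tail_sup(N, terms=500, grid=200):
    """Largest |sum_{n=N+1}^{N+terms} sin(x)^n / n^2| over sample points x in [0, 2*pi]."""
    worst = 0.0
    for i in range(grid + 1):
        s = math.sin(2 * math.pi * i / grid)
        tail = sum(s**n / n**2 for n in range(N + 1, N + terms + 1))
        worst = max(worst, abs(tail))
    return worst

def majorant_tail(N, terms=500):
    """Truncated tail sum_{n=N+1}^{N+terms} 1/n^2 of the majorant series."""
    return sum(1.0 / n**2 for n in range(N + 1, N + terms + 1))

for N in (5, 10, 50):
    print(N, tail_sup(N), majorant_tail(N))
```

The bound on the tail does not depend on x, which is exactly what uniform convergence requires.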

31 Power Series

Definition 31.1. A series of functions of the form $\sum_{n=0}^{\infty} a_n x^n$ is called a power series.

Clearly, any power series converges when x = 0.


Theorem 31.1 (The set of convergence of a power series. Abel-Cauchy-Hadamard theorem).

- The power series $\sum_{n=0}^{\infty} a_n x^n$ is absolutely convergent for |x| < R (R is called the radius of convergence), where R is given by

$$R = \frac{1}{\omega} \ \text{ if } 0 < \omega \le +\infty, \qquad R = +\infty \ \text{ if } \omega = 0,$$

and $\omega = \lim_{n \to \infty} \sqrt[n]{|a_n|}$ (with the convention that R = 0 when ω = +∞).

- The series diverges for any x with |x| > R.

- For any r ∈ (0, R) the series is uniformly convergent on the closed interval [−r, r].

Proof. Consider $x_0 \in \mathbb{R}^1$ and the series $\sum_{n=0}^{\infty} |a_n| \cdot |x_0|^n$. Apply the root test to this series and obtain:

If $\lim_{n \to \infty} \sqrt[n]{|a_n|} \cdot |x_0| < 1$, then the series $\sum_{n=0}^{\infty} a_n x_0^n$ is absolutely convergent. In other words, the series $\sum_{n=0}^{\infty} a_n x_0^n$ is absolutely convergent for $|x_0| < R$, where $R = \frac{1}{\omega}$ if 0 < ω ≤ +∞ and R = +∞ if ω = 0, with $\omega = \lim_{n \to +\infty} \sqrt[n]{|a_n|}$.

Applying the same test, it follows that the series diverges for any $x_0$ with $|x_0| > R$.

For r ∈ (0, R) the series $\sum_{n=0}^{\infty} |a_n| \cdot r^n$ converges (x = r is a point at which the series $\sum_{n=0}^{\infty} a_n x^n$ converges absolutely) and for x ∈ [−r, r] we have

$$|a_n| \cdot |x|^n \le |a_n| \cdot r^n.$$

According to the 4th criterion of convergence, the series $\sum_{n=0}^{\infty} a_n x^n$ is uniformly convergent for |x| ≤ r.

Example 31.1.

a) The series $\sum_{n=0}^{\infty} x^n$ converges absolutely for |x| < 1 and diverges for |x| > 1. The radius of convergence is R = 1; the convergence set is (−1, 1).

b) For the series $\sum_{n=1}^{\infty} \frac{x^n}{n}$ the radius of convergence is R = 1. The convergence set is [−1, 1).

c) The set of convergence of the series $\sum_{n=1}^{\infty} \frac{(-1)^n \cdot x^n}{n}$ is (−1, 1].

d) The series $\sum_{n=1}^{\infty} \frac{x^n}{n^{\alpha}}$ (α > 1) is absolutely convergent on [−1, 1].
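The radius of convergence in these examples can be estimated numerically from the root formula of Theorem 31.1. The following is a rough sketch, assuming the limit ω exists; the helper `root_estimate` and the chosen index n are illustrative assumptions, and a single large n only approximates the limit.

```python
def root_estimate(coeff, n):
    """n-th root of |a_n|, used as an approximation of omega for large n (hypothetical helper)."""
    return abs(coeff(n)) ** (1.0 / n)

# a_n = 1/n, as in Example 31.1 b): omega is close to 1, so R is close to 1.
omega = root_estimate(lambda n: 1.0 / n, 10**6)
print("estimated R for a_n = 1/n:", 1.0 / omega)

# a_n = 2^n: omega = 2, so R = 1/2.
omega = root_estimate(lambda n: 2.0**n, 50)
print("estimated R for a_n = 2^n:", 1.0 / omega)
```

The estimate says nothing about the endpoints x = ±R, which must be examined separately, as parts b) and c) show.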

Concerning the continuity of the sum of a power series we have:

• The sum S of the power series $\sum_{n=0}^{\infty} a_n x^n$ is a continuous function on (−R, R).

Proof. Let $x_0 \in (-R, R)$. There exists r ∈ (0, R) such that $-R < -r < x_0 < r < R$. Since the series converges uniformly on the closed interval [−r, r], and the terms of the series are continuous functions, the sum S is continuous on [−r, r]. In particular, it is continuous at $x_0$.

• The sum S of the power series $\sum_{n=0}^{\infty} a_n x^n$ is uniformly continuous on any compact interval contained in (−R, R).

32 Arithmetics of power series

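Let $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$ be power series with radii of convergence $R_1$ and $R_2$, respectively, where $0 \le R_1 \le R_2$. Then:

- the sum $\sum_{n=0}^{\infty} (a_n + b_n) \cdot x^n$,

- the scalar product $\sum_{n=0}^{\infty} k \cdot a_n \cdot x^n$,

- the Cauchy product $\sum_{n=0}^{\infty} c_n \cdot x^n$, where $c_n = \sum_{k=0}^{n} a_k \cdot b_{n-k}$,

all have radius of convergence at least $R_1$.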

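Moreover, if $\sum_{n=0}^{\infty} a_n x^n$ has the sum f(x) and $\sum_{n=0}^{\infty} b_n x^n$ has the sum g(x), then:

$$\sum_{n=0}^{\infty} (a_n + b_n) \cdot x^n = f(x) + g(x)$$

$$\sum_{n=0}^{\infty} (k \cdot a_n) \cdot x^n = k \cdot f(x)$$

$$\sum_{n=0}^{\infty} c_n \cdot x^n = f(x) \cdot g(x).$$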

Proof. The claims concerning the sum and scalar product follow from the sum and scalar product rules for series. To establish the Cauchy product result, note that $\sum_{n=0}^{\infty} a_n x^n$ and $\sum_{n=0}^{\infty} b_n x^n$ are absolutely convergent for $|x| < R_1$. Since

$$c_n \cdot x^n = \sum_{k=0}^{n} (a_k \cdot x^k)(b_{n-k} \cdot x^{n-k}),$$

the series $\sum_{n=0}^{\infty} c_n x^n$ is absolutely convergent for $|x| < R_1$, and has the sum stated.

Much of the preceding discussion can be modified to apply to series of the form $\sum_{n=0}^{\infty} a_n \cdot (x - a)^n$.
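The coefficients of the Cauchy product are finite sums and are easy to compute. The sketch below is illustrative only (the function name `cauchy_product` is an assumption); it multiplies the geometric series $\sum x^n$ by itself, which should give the coefficients n + 1 of the expansion of $1/(1-x)^2$.

```python
def cauchy_product(a, b):
    """First coefficients c_n = sum_{k=0}^{n} a_k * b_{n-k} of the Cauchy product.

    `a` and `b` are lists holding the leading coefficients of the two series.
    """
    n_terms = min(len(a), len(b))
    return [sum(a[k] * b[n - k] for k in range(n + 1)) for n in range(n_terms)]

geometric = [1] * 6                      # 1 + x + x^2 + ... (first six coefficients)
print(cauchy_product(geometric, geometric))
```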

33 Differentiable functions

Intuitively, a function f : A ⊂ R1 → R1 is differentiable at c ∈ A if a tangent can be drawn to the curve at the point P(c, f(c)).

Figure 33.1:

The slope of the chord PQ in Figure 33.1 is

$$\frac{f(x) - f(c)}{x - c}$$

and as Q moves closer to P it is required that the slope of PQ approaches the slope of the tangent line at P.

This geometric idea motivates the following formal definition:

Definition 33.1. A function f : A ⊂ R1 → R1 is differentiable at c ∈ A if

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c}$$

exists. The value of this limit is written f′(c) and is called the derivative of f at c.

An alternative form of the limit is obtained by setting x = c + h. Then

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{h \to 0} \frac{f(c + h) - f(c)}{h}.$$


Example 33.1. The function $f(x) = x^2$ is differentiable for all x.

Solution: Consider $\frac{f(x) - f(c)}{x - c}$ for any x ≠ c, where c is fixed. Now

$$\lim_{x \to c} \frac{f(x) - f(c)}{x - c} = \lim_{x \to c} \frac{x^2 - c^2}{x - c} = \lim_{x \to c} (x + c) = 2c.$$

Hence, f is differentiable at c and f′(c) = 2c. Since c was arbitrary, the derivative function f′ can be defined as

$$f'(x) = 2x.$$

Example 33.2. The function f(x) = |x| is not differentiable at c = 0.

Solution: Consider

$$\frac{f(x) - f(0)}{x - 0} = \frac{|x|}{x} = \begin{cases} 1 & \text{for } x > 0 \\ -1 & \text{for } x < 0. \end{cases}$$

Hence

$$\lim_{x \to 0^+} \frac{f(x) - f(0)}{x - 0} = 1 \quad \text{and} \quad \lim_{x \to 0^-} \frac{f(x) - f(0)}{x - 0} = -1.$$

Since these right and left limits differ, f is not differentiable at 0. It is easy to show that f is differentiable for all x ≠ 0, with f′(x) = 1 for x > 0 and f′(x) = −1 for x < 0.
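The two examples above can be mirrored numerically by evaluating the difference quotient for small positive and negative h. This is a minimal sketch; the helper `difference_quotient` and the step sizes are illustrative assumptions, and a finite step only approximates the limit.

```python
def difference_quotient(f, c, h):
    """Slope of the chord through (c, f(c)) and (c + h, f(c + h))."""
    return (f(c + h) - f(c)) / h

# f(x) = x^2 at c = 3: both one-sided quotients approach f'(3) = 6.
print(difference_quotient(lambda x: x * x, 3.0, 1e-6))
print(difference_quotient(lambda x: x * x, 3.0, -1e-6))

# f(x) = |x| at c = 0: the right and left quotients are 1 and -1.
print(difference_quotient(abs, 0.0, 1e-6))
print(difference_quotient(abs, 0.0, -1e-6))
```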

In general, points where f is not differentiable can often be detected by examining the left and the right limits of $\frac{f(x) - f(c)}{x - c}$ as x → c.

The left limit $\lim_{x \to c^-} \frac{f(x) - f(c)}{x - c}$ is called the left derivative of f at c and is denoted by $f'_-(c)$.

Similarly, the right limit $\lim_{x \to c^+} \frac{f(x) - f(c)}{x - c}$ is called the right derivative of f at c and is denoted by $f'_+(c)$.

Clearly, f′(c) exists if and only if $f'_-(c)$ and $f'_+(c)$ both exist and are equal.

The next result establishes that only continuous functions can be differentiable.

Theorem 33.1. If f is differentiable at c, then f is continuous at c.

Proof. Define the function

$$F_c(x) = \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{if } x \ne c \\ f'(c) & \text{if } x = c. \end{cases}$$

Since f is differentiable at c, $\lim_{x \to c} F_c(x) = F_c(c)$ and hence $F_c$ is continuous at c.

Now

$$f(x) = f(c) + F_c(x) \cdot (x - c) \quad \text{for all } x.$$

Since $F_c$ and the identity and constant functions are all continuous at c, f is continuous at c.


Note that Example 33.2 shows that there are continuous functions which are not differentiable.

The following table gives certain elementary functions and their derivatives.

Function f                          Derivative f′
$f(x) = k$ (a constant)             $f'(x) = 0$
$f(x) = x^n$, n ∈ N                 $f'(x) = n \cdot x^{n-1}$
$f(x) = \sqrt{x}$                   $f'(x) = \frac{1}{2\sqrt{x}}$
$f(x) = \sin x$                     $f'(x) = \cos x$
$f(x) = \cos x$                     $f'(x) = -\sin x$
$f(x) = \tan x$                     $f'(x) = \frac{1}{\cos^2 x}$
$f(x) = \cot x$                     $f'(x) = -\frac{1}{\sin^2 x}$
$f(x) = e^x$                        $f'(x) = e^x$
$f(x) = \ln x$                      $f'(x) = \frac{1}{x}$

34 Rules of differentiability

Sum rule Let f and g be functions differentiable at c. Then their sum f + g is differentiable at c and

$$(f + g)'(c) = f'(c) + g'(c).$$

Product rule Let f and g be functions differentiable at c. Then their product f · g is differentiable at c and

$$(f \cdot g)'(c) = f'(c) \cdot g(c) + f(c) \cdot g'(c).$$

Reciprocal rule Let f be a function which is non-zero and differentiable at c. Then $\frac{1}{f}$ is differentiable at c and

$$\left(\frac{1}{f}\right)'(c) = -\frac{f'(c)}{f^2(c)}.$$


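Proof of product rule. For x ≠ c

$$\frac{(f \cdot g)(x) - (f \cdot g)(c)}{x - c} = \frac{f(x) \cdot g(x) - f(c) \cdot g(c)}{x - c} = \frac{f(x) \cdot g(x) - f(x) \cdot g(c) + f(x) \cdot g(c) - f(c) \cdot g(c)}{x - c}$$

$$= \frac{f(x) - f(c)}{x - c} \cdot g(c) + f(x) \cdot \frac{g(x) - g(c)}{x - c}.$$

As x → c, $\frac{f(x) - f(c)}{x - c} \to f'(c)$ and $\frac{g(x) - g(c)}{x - c} \to g'(c)$. Moreover f(x) → f(c), because f is continuous at c by Theorem 33.1.

Therefore

$$\frac{(f \cdot g)(x) - (f \cdot g)(c)}{x - c} \xrightarrow[x \to c]{} f'(c) \cdot g(c) + f(c) \cdot g'(c).$$

The product and the reciprocal rules can be combined as follows, to give the quotient rule.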

Quotient rule If f and g are differentiable at c and g(c) ≠ 0, then $\frac{f}{g}$ is differentiable at c and

$$\left(\frac{f}{g}\right)'(c) = \frac{f'(c) \cdot g(c) - f(c) \cdot g'(c)}{[g(c)]^2}.$$

Example 34.1. Use the above rules to prove that each of the following functions is differentiable at the points indicated:

i) $f(x) = x^2 + \sin x$, x ∈ R1;

ii) $f(x) = x^2 \cdot \sin x$, x ∈ R1;

iii) $f(x) = \tan x$, x ∈ R1 and $x \ne (2n + 1)\frac{\pi}{2}$, n integer;

iv) $f(x) = x^n$, n ∈ Z, n ≠ 0;

v) $f(x) = \sec x$, $x \ne (2n + 1)\frac{\pi}{2}$;

vi) $f(x) = \csc x$, $x \ne n\pi$;

vii) $f(x) = \tan x$, $x \ne (2n + 1)\frac{\pi}{2}$;

viii) $f(x) = \cot x$, $x \ne n\pi$.


Squeeze rule Let f, g and h be three functions such that g(x) ≤ f(x) ≤ h(x) for all x in some neighborhood of c and such that g(c) = f(c) = h(c). If g and h are differentiable at c, then so is f and f′(c) = g′(c) = h′(c).

Proof. The given inequalities imply

$$\frac{g(x) - g(c)}{x - c} \le \frac{f(x) - f(c)}{x - c} \le \frac{h(x) - h(c)}{x - c}$$

for all x > c, and the inequality signs are reversed for x < c. The result follows by the squeeze rule for limits of functions, provided that g′(c) = h′(c) can be established. To this end let

$$G_c(x) = \begin{cases} \dfrac{g(x) - g(c)}{x - c} & \text{if } x \ne c \\ g'(c) & \text{if } x = c \end{cases}$$

and

$$H_c(x) = \begin{cases} \dfrac{h(x) - h(c)}{x - c} & \text{if } x \ne c \\ h'(c) & \text{if } x = c. \end{cases}$$

Since g and h are differentiable at c, $G_c$ and $H_c$ are continuous at c. Let $k(x) = G_c(x) - H_c(x)$. Thus k is continuous at c. The earlier inequalities imply that if x > c, then k(x) ≤ 0 and if x < c, then k(x) ≥ 0. Hence, k(c) = 0 and so $G_c(c) = H_c(c)$. In other words, g′(c) = h′(c).

Example 34.2. The function

$$f(x) = \begin{cases} x^2 \sin\dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0 \end{cases}$$

can be squeezed between $g(x) = -x^2$ and $h(x) = x^2$ at x = 0. Since g and h are differentiable at 0 with common derivative of value 0, the squeeze rule gives that f is differentiable at x = 0 and f′(0) = 0. By the other rules, f is also differentiable for x ≠ 0. Moreover,

$$f'(x) = \begin{cases} 2x \sin\dfrac{1}{x} - \cos\dfrac{1}{x} & \text{if } x \ne 0 \\ 0 & \text{if } x = 0. \end{cases}$$

Note that $\lim_{x \to 0} f'(x)$ does not exist, so that f′(0) exists but f′ is not continuous at 0.
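The failure of the limit of f′ at 0 can be made visible by evaluating f′ along two sequences tending to 0. The sketch below is illustrative (the name `f_prime` and the sample points are assumptions): along x = 1/(2kπ) the values stay near −1, and along x = 1/((2k+1)π) they stay near +1.

```python
import math

def f_prime(x):
    """Derivative of x^2 * sin(1/x), extended by f'(0) = 0 as in Example 34.2."""
    if x == 0:
        return 0.0
    return 2 * x * math.sin(1 / x) - math.cos(1 / x)

for k in (10, 100, 1000):
    x_even = 1 / (2 * k * math.pi)        # cos(1/x) = 1 here, so f'(x) is close to -1
    x_odd = 1 / ((2 * k + 1) * math.pi)   # cos(1/x) = -1 here, so f'(x) is close to +1
    print(k, f_prime(x_even), f_prime(x_odd))
```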

Composite rule Let f be differentiable at c, and g be differentiable at b = f(c). Then g ∘ f is differentiable at c and

$$(g \circ f)'(c) = g'(f(c)) \cdot f'(c).$$


Proof. Let

$$F_c(x) = \begin{cases} \dfrac{f(x) - f(c)}{x - c} & \text{if } x \ne c \\ f'(c) & \text{if } x = c \end{cases}$$

and

$$G_b(y) = \begin{cases} \dfrac{g(y) - g(b)}{y - b} & \text{if } y \ne b \\ g'(b) & \text{if } y = b. \end{cases}$$

Then $F_c$ is continuous at x = c and, for all x,

$$f(x) = f(c) + (x - c) \cdot F_c(x).$$

$G_b$ is continuous at y = b and, for all y,

$$g(y) = g(b) + (y - b) \cdot G_b(y).$$

Now, with y = f(x),

$$(g \circ f)(x) = g(f(x)) = g(y) = g(b) + (y - b) \cdot G_b(y)$$

$$= g(f(c)) + (f(x) - f(c)) \cdot G_b(f(x))$$

$$= g(f(c)) + (x - c) \cdot F_c(x) \cdot G_b(f(x)).$$

So, for x ≠ c,

$$\frac{(g \circ f)(x) - (g \circ f)(c)}{x - c} = F_c(x) \cdot G_b(f(x)).$$

The function on the right-hand side of the above equality is continuous at x = c. Hence

$$\lim_{x \to c} \frac{(g \circ f)(x) - (g \circ f)(c)}{x - c} = F_c(c) \cdot G_b(f(c)) = f'(c) \cdot g'(f(c)),$$

as required.

The composite rule is often called the chain rule, and the formula given for the derivative of a composite is more suggestive in Leibniz notation. Let ∆x = h and ∆y = f(x + h) − f(x). Then

$$f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{\Delta x \to 0} \frac{\Delta y}{\Delta x}.$$

The Leibniz notation for this limit is $\frac{dy}{dx}$. Write y = g(u) where u = f(x). Then $f'(x) = \frac{du}{dx}$, $g'(f(x)) = \frac{dy}{du}$ and $(g \circ f)'(x) = \frac{dy}{dx}$. The chain rule can now be written as

$$\frac{dy}{dx} = \frac{dy}{du} \cdot \frac{du}{dx}.$$

Example 34.3. Show that $h(x) = \sin x^2$ is differentiable.

Solution: Let g(x) = sin x and $f(x) = x^2$, then h = g ∘ f. Since f and g are everywhere differentiable, the composite rule gives that h is differentiable. Moreover,

$$h'(x) = g'(f(x)) \cdot f'(x) = 2x \cos x^2.$$


Inverse rule Suppose that f : A → B is a continuous bijection where A and B are intervals. If f is differentiable at a ∈ A and f′(a) ≠ 0, then $f^{-1}$ is differentiable at b = f(a) and

$$(f^{-1})'(b) = \frac{1}{f'(a)}.$$

Proof. For a ∈ A, let

$$F_a(x) = \begin{cases} \dfrac{f(x) - f(a)}{x - a} & \text{if } x \ne a \\ f'(a) & \text{if } x = a. \end{cases}$$

$F_a$ is continuous at x = a and for all x ∈ A

$$f(x) = f(a) + (x - a) \cdot F_a(x).$$

Given f(a) = b and letting f(x) = y for x ∈ A we have

$$F_a(x) = \frac{y - b}{x - a} \quad \text{for } x \ne a.$$

Consider

$$G_b(y) = \frac{x - a}{y - b} \quad \text{for } y \ne b,$$

so

$$G_b(y) = \frac{x - a}{f(x) - f(a)} = \frac{1}{F_a(x)} = \frac{1}{(F_a \circ f^{-1})(y)} \quad \text{for } y \ne b.$$

Since f is bijective and continuous, $f^{-1}$ is continuous too. Also $f^{-1}(b) = a$ and $F_a$ is continuous at x = a. Hence $F_a \circ f^{-1}$ is continuous at y = b and

$$(F_a \circ f^{-1})(b) = F_a(f^{-1}(b)) = F_a(a) = f'(a) = (f' \circ f^{-1})(b) \ne 0.$$

So

$$\frac{x - a}{y - b} = G_b(y) = \frac{1}{(F_a \circ f^{-1})(y)} \to \frac{1}{(F_a \circ f^{-1})(b)} = \frac{1}{f'(a)} \quad \text{as } y \to b.$$

In other words

$$\lim_{y \to b} \frac{f^{-1}(y) - f^{-1}(b)}{y - b} = \left(\frac{1}{f' \circ f^{-1}}\right)(b) = \frac{1}{f'(a)}.$$

If the function f : A → f(A) is differentiable on the interval A and f′(a) ≠ 0 for any a ∈ A, then $f^{-1}$ is differentiable on f(A) and $(f^{-1})'(f(a)) = \frac{1}{f'(a)}$.

Example 34.4. The function f : (0, ∞) → (0, ∞) given by $f(x) = x^2$ is a bijection. Its inverse function is given by $f^{-1}(x) = \sqrt{x}$. Now f is differentiable for x > 0 and f′(x) = 2x ≠ 0. Hence, by the inverse rule, $f^{-1}(x) = \sqrt{x}$ is differentiable and

$$(f^{-1})'(x) = \frac{1}{2\sqrt{x}}.$$


Example 34.5. The function $g : \left(-\frac{\pi}{2}, \frac{\pi}{2}\right) \to (-1, 1)$ given by g(x) = sin x is a bijection with inverse given by $g^{-1}(x) = \arcsin x$. Now g is differentiable and

$$(g' \circ g^{-1})(x) = \cos(\arcsin x) = \sqrt{1 - x^2} \ne 0 \quad \text{for } |x| < 1.$$

Hence, by the inverse rule, $g^{-1}$ is differentiable and

$$(g^{-1})'(x) = \frac{1}{\sqrt{1 - x^2}} \quad \text{for } |x| < 1.$$

35 Local extremum

In this section we present a result which helps to locate the local maxima and minima of a differentiable function.

Definition 35.1. A function f has a local maximum value at c if c is contained in some open interval I for which f(x) ≤ f(c) for each x ∈ I. If f(x) ≥ f(c) for each x ∈ I, then f has a local minimum value at c.

Theorem 35.1 (Local extremum theorem, Fermat). If f is differentiable at c and possesses a local maximum or a local minimum at c, then f′(c) = 0.

Proof. Consider the case of a local minimum at x = c. There is an open interval I containing c such that f(x) − f(c) ≥ 0 for all x ∈ I. If x > c, then $\frac{f(x) - f(c)}{x - c} \ge 0$ and if x < c, then $\frac{f(x) - f(c)}{x - c} \le 0$. Thus $f'_+(c) \ge 0$ and $f'_-(c) \le 0$. But f′(c) exists and so $f'_+(c) = f'_-(c)$. Thus f′(c) = 0.

Note that although f′ must vanish at a local extremum, this condition is not sufficient for such a point. For example, consider the behavior of $f(x) = x^3$ at x = 0. Here f′(0) = 0, but 0 is neither a local maximum nor a local minimum.

Example 35.1. Locate the local maximum and the local minimum of the function

$$f(x) = x (x - 1) (x - 2).$$
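One possible way to work this example, sketched here under the assumption that the candidates are found from f′(x) = 0 and then classified by the sign change of f′: expanding gives f(x) = x³ − 3x² + 2x, so f′(x) = 3x² − 6x + 2. The variable names below are illustrative.

```python
import math

def f(x):
    return x * (x - 1) * (x - 2)

def f_prime(x):
    # f(x) = x^3 - 3x^2 + 2x, hence f'(x) = 3x^2 - 6x + 2
    return 3 * x**2 - 6 * x + 2

# Roots of 3x^2 - 6x + 2 = 0 by the quadratic formula: x = 1 -/+ 1/sqrt(3).
disc = math.sqrt(6**2 - 4 * 3 * 2)
x_max = (6 - disc) / 6   # f' changes sign from + to -: local maximum
x_min = (6 + disc) / 6   # f' changes sign from - to +: local minimum

print("local maximum at", x_max, "with value", f(x_max))
print("local minimum at", x_min, "with value", f(x_min))
```

Since f′(0) = 2 > 0, f′(1) = −1 < 0 and f′(2) = 2 > 0, the sign pattern of f′ confirms the classification.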

36 Theorems concerning basic properties of differentiable functions

This section establishes some basic properties of differentiable functions.

Theorem 36.1 (Rolle’s theorem). Let f be differentiable on (a, b) and continuous on [a, b]. If f(a) = f(b) then there exists c ∈ (a, b), such that f′(c) = 0.

Proof. Since f is continuous on [a, b], it attains a maximum value $f(c_1)$ and a minimum value $f(c_2)$ on [a, b] by the boundedness property.
If $f(c_1) = f(c_2)$, then f is constant on [a, b], hence f′(x) = 0 for all x ∈ (a, b) and the result follows.
If $f(c_1) \ne f(c_2)$, then at least one of $c_1$ and $c_2$ is neither a nor b. Hence f has a local maximum or minimum (or both) inside the interval (a, b).
By the local extremum theorem, f′ is zero at one or more points of (a, b).

Theorem 36.2 (Mean value theorem, Lagrange). Let f be differentiable on (a, b) and continuous on [a, b]. Then there exists c ∈ (a, b), such that

$$f'(c) = \frac{f(b) - f(a)}{b - a}.$$

Proof. Let g(x) = f(x) − λx, where $\lambda = \frac{f(b) - f(a)}{b - a}$. g is differentiable on (a, b) and continuous on [a, b]. The choice of λ means that g(a) = g(b). Applying Rolle’s theorem, there is c ∈ (a, b) such that g′(c) = 0. Hence, f′(c) − λ = 0, so $f'(c) = \frac{f(b) - f(a)}{b - a}$.

Theorem 36.3 (the increasing-decreasing theorem). If f is differentiable on (a, b) and continuous on [a, b] then

(1) f ′(x) > 0 for all x ∈ (a, b) implies f is strictly increasing on [a, b];

(2) f ′(x) < 0 for all x ∈ (a, b) implies f is strictly decreasing on [a, b];

(3) f ′(x) = 0 for all x ∈ (a, b) implies f is constant on [a, b].

Proof. Let $x_1, x_2 \in [a, b]$ with $x_1 < x_2$. Since f satisfies the hypotheses of the mean value theorem on the interval $[x_1, x_2]$, we have

$$\frac{f(x_2) - f(x_1)}{x_2 - x_1} = f'(c)$$

for some c with $x_1 < c < x_2$. In case (1), f′(c) > 0 and so $f(x_2) > f(x_1)$. In other words, f is strictly increasing on [a, b].

The proof of (2) and (3) is similar.

Comment 36.1. The increasing-decreasing theorem is useful for finding and classifying local extrema and establishing inequalities between functions.

Example 36.1. Find and describe the local extrema of $f(x) = x^2 \cdot e^{-x}$.

Solution: f is everywhere differentiable and

$$f'(x) = e^{-x} (2 - x) \cdot x.$$

Local extrema occur only when f′(x) = 0, and so x = 0 or x = 2. Since $e^{-x} > 0$, if x < 0 then f′(x) < 0, if x ∈ (0, 2) then f′(x) > 0 and if x > 2 then f′(x) < 0. Thus f is decreasing on (−∞, 0), increasing on (0, 2) and decreasing again on (2, +∞). This means that x = 0 gives a local minimum and x = 2 gives a local maximum of f.


Example 36.2. Prove that $e^x \ge 1 + x$ for all x.

Solution: Let $f(x) = e^x - 1 - x$. f is differentiable and $f'(x) = e^x - 1$. Hence f′(x) > 0 for x > 0 and so f(x) > f(0) = 0 for x > 0. Since f′(x) < 0 for x < 0, it follows that f(x) > f(0) = 0 for x < 0. Finally we obtain that f(x) ≥ 0 for all x ∈ R1. That means $e^x \ge 1 + x$.

The next result is difficult to interpret geometrically but it will be needed to prove l’Hospital’s rule. This is a rule which is well suited to the evaluation of limits of the form

$$\lim_{x \to x_0} \frac{f(x)}{g(x)}, \quad \text{where } f(x_0) = g(x_0) = 0.$$

Theorem 36.4 (Cauchy’s mean value theorem). Let f and g be differentiable on (a, b) and continuous on [a, b]. Then there exists c ∈ (a, b), such that

$$\frac{f'(c)}{g'(c)} = \frac{f(b) - f(a)}{g(b) - g(a)}$$

provided that g′(x) ≠ 0 for all x ∈ (a, b).

Proof. First note that g(a) ≠ g(b), otherwise Rolle’s theorem applied to g on [a, b] would mean that g′ vanished somewhere on (a, b). Let h(x) = f(x) − λg(x) where

$$\lambda = \frac{f(b) - f(a)}{g(b) - g(a)}.$$

By the sum and product rules for continuity and differentiability and our choice of λ, h satisfies all the hypotheses of Rolle’s theorem. Hence there is c ∈ (a, b), such that h′(c) = 0. This gives f′(c) = λ · g′(c) and the result now follows.

Theorem 36.5 (l’Hospital’s rule, version A). Let f and g satisfy the hypotheses of Cauchy’s mean value theorem and let $x_0$ satisfy $x_0 \in (a, b)$. If $f(x_0) = g(x_0) = 0$, then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \lim_{x \to x_0} \frac{f'(x)}{g'(x)}$$

provided that the latter limit exists.

Proof. Apply Cauchy’s mean value theorem to f and g on the interval $[x_0, x]$ where $x_0 < x \le b$. Hence there exists c, $x_0 < c < x$, such that

$$\frac{f'(c)}{g'(c)} = \frac{f(x) - f(x_0)}{g(x) - g(x_0)} = \frac{f(x)}{g(x)}.$$

Now, since $c \to x_0^+$ as $x \to x_0^+$,

$$\lim_{x \to x_0^+} \frac{f(x)}{g(x)} = \lim_{c \to x_0^+} \frac{f'(c)}{g'(c)} = \lim_{x \to x_0^+} \frac{f'(x)}{g'(x)}$$

provided that the latter exists. A similar argument applied on the interval $[x, x_0]$ where $a \le x < x_0$ gives that

$$\lim_{x \to x_0^-} \frac{f(x)}{g(x)} = \lim_{x \to x_0^-} \frac{f'(x)}{g'(x)}$$

again provided that the latter exists.

The rule now follows.


Example 36.3. Show that

$$\lim_{x \to 0} \frac{\sin x}{x} = 1.$$

Solution: The functions f(x) = sin x and g(x) = x satisfy the hypotheses of l’Hospital’s rule. Moreover

$$\lim_{x \to 0} \frac{\sin x}{x} = \lim_{x \to 0} \frac{\cos x}{1} = 1.$$

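Example 36.4. Show that

$$\lim_{x \to 0} (1 + x)^{\frac{1}{x}} = e.$$

Solution: Via the composite rule for limits of functions we have

$$\ln\left(\lim_{x \to 0} (1 + x)^{\frac{1}{x}}\right) = \lim_{x \to 0} \ln\left((1 + x)^{\frac{1}{x}}\right) = \lim_{x \to 0} \frac{\ln(1 + x)}{x}.$$

By l’Hospital’s rule

$$\lim_{x \to 0} \frac{\ln(1 + x)}{x} = \lim_{x \to 0} \frac{1}{x + 1} = 1.$$

Hence

$$\lim_{x \to 0} (1 + x)^{\frac{1}{x}} = e.$$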

L’Hospital’s rule can be used to evaluate many indeterminate limits once they have been expressed as the limit of a quotient of differentiable functions, provided of course that $\lim_{x \to x_0} \frac{f'(x)}{g'(x)}$ can be evaluated.

Often this final limit is itself indeterminate (in other words $f'(x_0) = g'(x_0) = 0$) and it may be tempting to apply l’Hospital’s rule again. But this requires that f′ and g′ are themselves differentiable.

37 Higher-order derivatives and differentials

Definition 37.1. If the derived function f′ of a given differentiable function f is itself differentiable, it is said that f is twice differentiable, and f′′ or $f^{(2)}$ denotes its second derivative (f′)′.

Definition 37.2. In general, f is n times differentiable if f is (n − 1) times differentiable and its (n − 1)-th derivative is differentiable. The n-th derivative $(f^{(n-1)})'$ is denoted by $f^{(n)}$. If, moreover, $f^{(n)}$ is a continuous function, then f is said to be n times continuously differentiable.

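Example 37.1.

1. If $f(x) = x^m$, m ∈ N, then

$$f^{(n)}(x) = \begin{cases} \dfrac{m!}{(m - n)!} \cdot x^{m-n} & \text{for } n \le m \\ 0 & \text{for } n > m. \end{cases}$$

2. If f(x) = sin x then $f^{(n)}(x) = \sin\left(x + \frac{n\pi}{2}\right)$ for n ≥ 1.

3. If f(x) = ln x then $f^{(n)}(x) = \frac{(-1)^{n-1} \cdot (n - 1)!}{x^n}$ for n ≥ 1.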

The successive derivatives of a product of two functions are:

$$(f \cdot g)' = f' \cdot g + f \cdot g' \qquad (37.1)$$

$$(f \cdot g)^{(2)} = f^{(2)} \cdot g + 2 \cdot f' \cdot g' + f \cdot g^{(2)} \qquad (37.2)$$

$$(f \cdot g)^{(3)} = f^{(3)} \cdot g + 3 \cdot f^{(2)} \cdot g' + 3 \cdot f' \cdot g^{(2)} + f \cdot g^{(3)}. \qquad (37.3)$$

Generally the following formula applies. It can be proved by induction on n.

Theorem 37.1 (Leibniz formula). Let f and g be n times continuously differentiable. Then h = f · g is also n times continuously differentiable and

$$h^{(n)} = \sum_{k=0}^{n} C_n^k \cdot f^{(n-k)} \cdot g^{(k)},$$

where $C_n^k = \frac{n!}{k!\,(n-k)!}$ is the binomial coefficient.

Theorem 37.2 (l’Hospital’s rule, version B). Let f and g be n times continuously differentiable on the interval (a, b) and let $x_0$ satisfy $a < x_0 < b$. If

$$f^{(k)}(x_0) = g^{(k)}(x_0) = 0 \quad \text{for } 0 \le k \le n - 1$$

and

$$g^{(n)}(x_0) \ne 0$$

then

$$\lim_{x \to x_0} \frac{f(x)}{g(x)} = \frac{f^{(n)}(x_0)}{g^{(n)}(x_0)}.$$

Example 37.2. Prove that

$$\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{1}{2}$$

and

$$\lim_{x \to 0} \left(\frac{1}{x} - \cot x\right) = 0.$$
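One possible route to both limits, sketched here as an application of version B of l’Hospital’s rule with n = 2 and x0 = 0. The names F and G for the numerator and denominator of the second limit are introduced only for this sketch.

```latex
% First limit: f(x) = 1 - \cos x, \quad g(x) = x^2.
\begin{align*}
f(0) &= 0, & f'(0) &= \sin 0 = 0, & f''(0) &= \cos 0 = 1, \\
g(0) &= 0, & g'(0) &= 0,          & g''(0) &= 2 \neq 0,
\end{align*}
\[
\lim_{x \to 0} \frac{1 - \cos x}{x^2} = \frac{f''(0)}{g''(0)} = \frac{1}{2}.
\]

% Second limit: write the difference as a single quotient F/G.
\[
\frac{1}{x} - \cot x = \frac{\sin x - x \cos x}{x \sin x},
\qquad F(x) = \sin x - x \cos x, \quad G(x) = x \sin x.
\]
\begin{align*}
F'(x) &= x \sin x,            & F''(x) &= \sin x + x \cos x, \\
G'(x) &= \sin x + x \cos x,   & G''(x) &= 2 \cos x - x \sin x,
\end{align*}
\[
F(0) = F'(0) = 0, \quad G(0) = G'(0) = 0, \quad F''(0) = 0, \quad G''(0) = 2 \neq 0,
\]
\[
\lim_{x \to 0} \left( \frac{1}{x} - \cot x \right) = \frac{F''(0)}{G''(0)} = \frac{0}{2} = 0.
\]
```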

38 Taylor polynomials

Suppose that $\sum_{n=0}^{\infty} a_n x^n$ is a power series with radius of convergence R > 0. Let f(x) be the sum of the series for |x| < R. It can be proved that f is differentiable and that

$$f'(x) = \sum_{n=1}^{\infty} a_n \cdot n \cdot x^{n-1} \quad \text{for } |x| < R.$$

Continually differentiating in this manner leads to

$$f^{(k)}(x) = \sum_{n=k}^{\infty} a_n \cdot n \cdot (n - 1) \cdot \ldots \cdot (n - k + 1) \cdot x^{n-k} \quad \text{for } |x| < R.$$

If x = 0, then

$$a_k = \frac{f^{(k)}(0)}{k!} \quad \text{for } k = 1, 2, \ldots$$

Thus the coefficient of $x^n$ in any power series is $\frac{f^{(n)}(0)}{n!}$, where f(x) is the sum of the given power series. Hence

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n \quad \text{for } |x| < R.$$

For small values of x, the sum f(x) can be approximated by the polynomial

$$f(0) + \frac{f'(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \ldots + \frac{f^{(N)}(0)}{N!} x^N$$

for any value of N.

This section investigates how good an approximation this polynomial is when f(x) is not necessarily the sum of a given power series.

Definition 38.1. Let f be an n times continuously differentiable function at 0. The Taylor polynomial of degree n for f at 0 is defined by:

$$T_n f(x) = f(0) + \frac{f'(0)}{1!} x + \frac{f^{(2)}(0)}{2!} x^2 + \ldots + \frac{f^{(n)}(0)}{n!} x^n.$$

Example 38.1. Let $f(x) = e^x$. For k = 1, 2, . . . we have $f^{(k)}(0) = 1$. Thus

$$T_0 f(x) = 1$$

$$T_1 f(x) = 1 + x$$

$$T_2 f(x) = 1 + x + \frac{1}{2} x^2$$

and so on.

The first result provides an estimate for the difference between f(b), the value of a given function at x = b, and $T_n f(b)$, the value of its Taylor polynomial of degree n at x = b.

Theorem 38.1 (The first remainder theorem). Let f be (n + 1) times continuously differentiable on an open interval containing the points 0 and b. Then the difference between f and $T_n f$ at x = b is given by

$$f(b) - T_n f(b) = \frac{b^{n+1}}{(n + 1)!} \cdot f^{(n+1)}(c)$$

for some c between 0 and b.


Proof. For simplicity assume that b > 0. Let

$$h_n(x) = f(b) - \sum_{k=0}^{n} \frac{f^{(k)}(x)}{k!} (b - x)^k, \quad x \in [0, b].$$

Then $h_n(b) = 0$ and $h_n(0) = f(b) - T_n f(b)$. Let

$$g(x) = h_n(x) - \left(\frac{b - x}{b}\right)^{n+1} \cdot h_n(0), \quad x \in [0, b].$$

The function g is continuous on [0, b] and differentiable on (0, b), and g(0) = g(b) = 0. Hence by Rolle’s theorem g′(c) = 0 for some c between 0 and b. Now

$$h_n'(x) = -\frac{f^{(n+1)}(x)}{n!} (b - x)^n$$

after a straightforward calculation. Thus

$$g'(x) = -\frac{f^{(n+1)}(x)}{n!} (b - x)^n + \frac{(n + 1)(b - x)^n}{b^{n+1}} \cdot h_n(0)$$

and so

$$0 = g'(c) = -\frac{f^{(n+1)}(c)}{n!} (b - c)^n + \frac{(n + 1)(b - c)^n}{b^{n+1}} \cdot h_n(0)$$

leading to

$$h_n(0) = \frac{b^{n+1}}{(n + 1)!} \cdot f^{(n+1)}(c).$$

Denote $f(b) - T_n f(b)$ by $R_n f(b)$ and call it the remainder term at x = b. Thus

$$f(b) = T_n f(b) + R_n f(b)$$

and so the error in approximating f(b) by $T_n f(b)$ is given by the remainder term $R_n f(b)$. Since $f^{(n+1)}$ is continuous on a closed interval containing 0 and b, it is bounded on that interval. So there exists a number M such that $|f^{(n+1)}(c)| \le M$ and so

$$|R_n f(b)| \le \left|\frac{b^{n+1}}{(n + 1)!}\right| \cdot M.$$

Thus, for a fixed n, the remainder term will be small for b close to zero. In other words, Taylor polynomials provide good approximations of the function near x = 0. The next example illustrates this.

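Example 38.2. Let f(x) = sin x. Then

$$T_7 f(x) = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!}, \qquad R_7 f(x) = \frac{x^8}{8!} \sin c$$

for some c between 0 and x (note that $f^{(8)}(x) = \sin x$). By the first remainder theorem, since |sin c| ≤ 1,

$$|R_7 f(0.1)| \le \frac{0.1^8}{8!} \approx 2.48 \cdot 10^{-13}.$$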
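The estimate can be compared with the actual error. The sketch below is illustrative only; the helper `taylor_sin` is an assumed name, and `math.sin` is taken as the reference value. It evaluates the Taylor polynomial of degree 7 at x = 0.1 and compares the observed error with the bound from the first remainder theorem.

```python
import math

def taylor_sin(x, degree):
    """Taylor polynomial of sin at 0, keeping all powers up to `degree`."""
    total = 0.0
    k = 0
    while 2 * k + 1 <= degree:
        total += (-1) ** k * x ** (2 * k + 1) / math.factorial(2 * k + 1)
        k += 1
    return total

error = abs(math.sin(0.1) - taylor_sin(0.1, 7))
bound = 0.1**8 / math.factorial(8)   # remainder bound, using |sin c| <= 1
print("observed error:", error)
print("remainder bound:", bound)
```

The observed error is well below the bound, which is expected because the coefficient of x⁸ in the series of sin is zero.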


It can now be shown how Taylor polynomials can be used to generate power series expansions for functions f which are infinitely differentiable on an open interval containing 0 and x. For such x we have:

$$f(x) = T_n f(x) + R_n f(x).$$

Now

$$\lim_{n \to \infty} T_n f(x) = \sum_{k=0}^{\infty} \frac{f^{(k)}(0)}{k!} \cdot x^k \quad \text{for } |x| < R$$

where R is the radius of convergence of the resulting power series.

If it can be shown that $\lim_{n \to \infty} R_n f(x) = 0$ for |x| < R′ < R for some R′, then

$$f(x) = \sum_{n=0}^{\infty} \frac{f^{(n)}(0)}{n!} \cdot x^n \quad \text{for } |x| < R'.$$

This power series is called the Maclaurin series for f(x).

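Example 38.3. Derive the Maclaurin series for $f(x) = e^x$.

Solution: Firstly

$$T_n f(x) = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!}$$

and $R_n f(x) = \frac{x^{n+1}}{(n + 1)!} e^c$ for some c between 0 and x. Now the series $\sum_{n=0}^{\infty} \frac{x^n}{n!}$ is absolutely convergent for all x by the ratio test, so for any fixed real number x

$$\lim_{n \to \infty} T_n f(x) = \sum_{n=0}^{\infty} \frac{x^n}{n!}.$$

By the vanishing condition, $\frac{x^n}{n!} \to 0$ as n → ∞. Thus, since $e^c \le e^{|x|}$,

$$|R_n f(x)| = \left|\frac{x^{n+1}}{(n + 1)!} e^c\right| \to 0 \quad \text{as } n \to \infty$$

for fixed x. Hence, for any x, $\lim_{n \to \infty} R_n f(x) = 0$. Hence

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!} = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \ldots + \frac{x^n}{n!} + \ldots$$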
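The decay of the remainder for a fixed x can be observed numerically. The following is a small sketch under stated assumptions: the helper names are hypothetical, x = 5 is an arbitrary sample point, and for x > 0 the remainder is bounded using $e^c \le e^x$.

```python
import math

def taylor_exp(x, n):
    """Taylor polynomial T_n of exp at 0, evaluated at x."""
    return sum(x**k / math.factorial(k) for k in range(n + 1))

def remainder_bound(x, n):
    """Bound |x|^(n+1) / (n+1)! * e^|x| on the Lagrange remainder."""
    return abs(x) ** (n + 1) / math.factorial(n + 1) * math.exp(abs(x))

x = 5.0
for n in (5, 10, 20):
    error = abs(math.exp(x) - taylor_exp(x, n))
    print(n, error, remainder_bound(x, n))
```

For small n the polynomial is a poor approximation at x = 5, but the factorial in the denominator eventually dominates the power in the numerator.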

In a similar manner, power series can be generated for all the standard functions. In each case the sum of the power series is just $\lim_{n \to \infty} T_n f(x)$ and the range of validity is precisely those x for which:

a) the resulting power series converges and

b) the remainder $R_n f(x) \to 0$ as n → ∞.


In deriving the following list, the trickiest part is establishing b):

$$e^x = \sum_{n=0}^{\infty} \frac{x^n}{n!}, \quad x \in \mathbb{R}^1 \qquad (38.1)$$

$$\sin x = \sum_{n=0}^{\infty} \frac{(-1)^n \cdot x^{2n+1}}{(2n + 1)!}, \quad x \in \mathbb{R}^1 \qquad (38.2)$$

$$\cos x = \sum_{n=0}^{\infty} \frac{(-1)^n \cdot x^{2n}}{(2n)!}, \quad x \in \mathbb{R}^1 \qquad (38.3)$$

$$(1 + x)^t = 1 + \sum_{n=1}^{\infty} \frac{t \cdot (t - 1) \cdot \ldots \cdot (t - n + 1)}{n!} x^n, \quad t \notin \mathbb{N}, \ x \in (-1, 1) \qquad (38.4)$$

$$\ln(1 + x) = \sum_{n=1}^{\infty} \frac{(-1)^{n-1} \cdot x^n}{n}, \quad -1 < x \le 1. \qquad (38.5)$$

The form of the remainder found in the first remainder theorem is called the Lagrange form.

The first few terms of the Maclaurin series for a given function f provide a good approximation of f(x) close to 0.

But what happens when approximations for x close to some other real number a are required? Polynomials must be considered not in powers of x but in powers of (x − a).

Definition 38.2. Let f be n times continuously differentiable on an open interval containing a fixed real number a. Define the Taylor polynomial of degree n for f at a by

$$T_{n,a} f(x) = f(a) + \frac{f'(a)}{1!} (x - a) + \frac{f^{(2)}(a)}{2!} (x - a)^2 + \ldots + \frac{f^{(n)}(a)}{n!} (x - a)^n.$$

The first remainder theorem can now be generalized.

Theorem 38.2 (Taylor’s theorem). Let f be (n + 1) times continuously differentiable on an open interval containing the points a and b. Then the difference between f and $T_{n,a} f$ at b is given by

$$f(b) - T_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c)$$

for some c between a and b.

Proof. For each t between a and b

$$f(b) = f(t) + \frac{f'(t)}{1!} (b - t) + \ldots + \frac{f^{(n)}(t)}{n!} (b - t)^n + F(t)$$

where

$$F(t) = R_{n,t} f(b) = f(b) - T_{n,t} f(b).$$

Differentiating with respect to t gives:

$$0 = f'(t) + \left(-f'(t) + \frac{f^{(2)}(t)}{1!} (b - t)\right) + \left(-\frac{f^{(2)}(t)}{1!} (b - t) + \frac{f^{(3)}(t)}{2!} (b - t)^2\right) + \ldots$$

$$\ldots + \left(-\frac{f^{(n)}(t)}{(n - 1)!} (b - t)^{n-1} + \frac{f^{(n+1)}(t)}{n!} (b - t)^n\right) + F'(t).$$

Cancellation now gives that

$$F'(t) = -\frac{f^{(n+1)}(t)}{n!} (b - t)^n.$$

Apply Cauchy’s mean value theorem to the functions F and G on the interval with endpoints a and b, where $G(t) = (b - t)^{n+1}$. Thus, there is a number c between a and b such that

$$\frac{F(b) - F(a)}{G(b) - G(a)} = \frac{F'(c)}{G'(c)} = \frac{-\dfrac{f^{(n+1)}(c)}{n!} (b - c)^n}{-(n + 1)(b - c)^n}.$$

Since F(b) = 0, $F(a) = f(b) - T_{n,a} f(b)$, G(b) = 0 and $G(a) = (b - a)^{n+1}$, we obtain

$$\frac{-(f(b) - T_{n,a} f(b))}{-(b - a)^{n+1}} = \frac{\dfrac{f^{(n+1)}(c)}{n!} (b - c)^n}{(n + 1)(b - c)^n}$$

or

$$f(b) - T_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c).$$

The error in approximating f(b) by the polynomial $T_{n,a} f(b)$ is just the remainder term:

$$R_{n,a} f(b) = \frac{(b - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c)$$

where c lies between a and b. The approximation is good for b close to a.

Just as before, power series can be generated in powers of (x − a), called Taylor series, for suitable functions of x. The range of validity is again those x for which

a) the resulting power series converges, and

b) $R_{n,a} f(x) \to 0$ as n → ∞.

39 Classification theorem for local extrema

A form of Taylor’s theorem much used in numerical analysis is derived below and used to round off the investigation of local extrema.

From Taylor’s theorem, it follows that

$$f(x) = T_{n,a} f(x) + R_{n,a} f(x)$$

$$= f(a) + \frac{f'(a)}{1!} (x - a) + \frac{f^{(2)}(a)}{2!} (x - a)^2 + \ldots + \frac{f^{(n)}(a)}{n!} (x - a)^n + \frac{(x - a)^{n+1}}{(n + 1)!} f^{(n+1)}(c)$$

for some c between a and x.

Let x − a = h. Then c lies between a and a + h. Thus c = a + θ · h for some θ ∈ (0, 1). Hence the following result holds:

$$f(a + h) = f(a) + \frac{h}{1!} f'(a) + \ldots + \frac{h^n}{n!} f^{(n)}(a) + \frac{h^{n+1}}{(n + 1)!} f^{(n+1)}(a + \theta \cdot h)$$

for some θ ∈ (0, 1).

This expression emphasizes that the value of f at a + h is determined by the values of f and its derivatives at a, with θ measuring the degree of indeterminacy.

When the extrema of a function f which has a stationary point at x = a (i.e. f′(a) = 0) are investigated, it is required that the sign of f(a + h) − f(a) for all small h be determined. The above expression relates f(a + h) − f(a) to the derivatives of f at a, thus enabling the following to be proved.

Theorem 39.1 (Classification theorem for local extrema). If f is (n + 1) times continuously differentiable on a neighborhood of a and $f^{(k)}(a) = 0$ for k = 1, 2, . . . , n (in particular f′(a) = 0 and so x = a is a stationary point of f) and $f^{(n+1)}(a) \ne 0$, then:

(1) n + 1 even and $f^{(n+1)}(a) > 0$ implies that f has a local minimum at x = a.

(2) n + 1 even and $f^{(n+1)}(a) < 0$ implies that f has a local maximum at x = a.

(3) n + 1 odd implies that f has neither a local maximum nor a local minimum at x = a.

Proof. Since $f^{(k)}(a) = 0$ for k = 1, 2, . . . , n,

$$f(a + h) - f(a) = \frac{h^{n+1}}{(n + 1)!} f^{(n+1)}(a + \theta h)$$

where 0 < θ < 1. Since $f^{(n+1)}(a) \ne 0$ and $f^{(n+1)}$ is continuous, there is a δ > 0 such that $f^{(n+1)}(x) \ne 0$ for |x − a| < δ. Thus for all h satisfying |h| < δ, $f^{(n+1)}(a + \theta h)$ has the same sign as $f^{(n+1)}(a)$, so f(a + h) − f(a) has the same sign as $h^{n+1} \cdot f^{(n+1)}(a)$ for all h with 0 < |h| < δ.

(1) If n + 1 is even and $f^{(n+1)}(a) > 0$, then f(a + h) − f(a) > 0 for 0 < |h| < δ. Hence, x = a gives a local minimum of f.

(2) If n + 1 is even and $f^{(n+1)}(a) < 0$, then f(a + h) − f(a) < 0 for 0 < |h| < δ. Hence, x = a gives a local maximum of f.

(3) If n + 1 is odd, the sign of f(a + h) − f(a) changes with the sign of h. It is said that x = a gives a horizontal point of inflection.

Example 39.1. Determine the nature of the stationary points of

f(x) = x6 − 4x4.
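The classification theorem can be tried out on this example with a short script. The following is only a minimal sketch, assuming polynomials are stored as coefficient lists (`coeffs[k]` multiplies $x^k$); the function names and the tolerance `tol` are illustrative choices, not part of the notes. It differentiates repeatedly until the first derivative that does not vanish at the stationary point and then applies cases (1)-(3).

```python
def derivative(coeffs):
    """Coefficients of the derivative; coeffs[k] is the coefficient of x**k."""
    return [k * c for k, c in enumerate(coeffs)][1:]


def evaluate(coeffs, x):
    """Evaluate the polynomial at x by Horner's scheme (empty list means 0)."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def classify_stationary_point(coeffs, a, tol=1e-9):
    """Apply the classification theorem to a polynomial at the point a."""
    d = derivative(coeffs)
    if abs(evaluate(d, a)) > tol:
        return "not stationary"
    order = 1  # order of the derivative currently stored in d
    while d:
        d = derivative(d)
        order += 1
        value = evaluate(d, a)
        if abs(value) > tol:
            # 'order' plays the role of n + 1 in the theorem
            if order % 2 == 1:
                return "horizontal inflection"
            return "local minimum" if value > 0 else "local maximum"
    return "undetermined"


# f(x) = x^6 - 4x^4 has stationary points 0 and +-sqrt(8/3)
f = [0, 0, 0, 0, -4, 0, 1]
for point in (0.0, (8 / 3) ** 0.5, -((8 / 3) ** 0.5)):
    print(point, classify_stationary_point(f, point))
```

At x = 0 the first non-vanishing derivative is the fourth one, $f^{(4)}(0) = -96 < 0$, so case (2) applies; at $x = \pm\sqrt{8/3}$ the second derivative is positive, so case (1) applies.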


40 The Riemann-Darboux integral

It is intended to give one form of the definition of the Riemann integral $\int_a^b f(x)\,dx$. The definition involves the areas of rectangles and applies to a wider class of functions than continuous ones.

Definition 40.1. Let [a, b] be a given finite interval. A partition P of [a, b] is a finite set of points $\{x_0, x_1, \ldots, x_n\}$ satisfying

$$a = x_0 < x_1 < \ldots < x_n = b.$$

Suppose now that f is a function defined and bounded on [a, b] (if f were continuous on [a, b] this would certainly be the case). Then f is bounded on each of the subintervals $[x_{i-1}, x_i]$. Hence f has a least upper bound $M_i$ and a greatest lower bound $m_i$ on $[x_{i-1}, x_i]$.

Definition 40.2. The upper Darboux sum of f related to P is defined by

$$U_f(P) = \sum_{i=1}^{n} M_i\,(x_i - x_{i-1})$$

where $M_i = \sup\{f(x) \mid x_{i-1} \le x \le x_i\}$.

The lower Darboux sum of f related to P is defined by

$$L_f(P) = \sum_{i=1}^{n} m_i\,(x_i - x_{i-1})$$

where $m_i = \inf\{f(x) \mid x_{i-1} \le x \le x_i\}$.

Now f is bounded above and below on the whole of [a, b]. So there exist numbers m and M with

$$m \le f(x) \le M \quad \text{for all } x \in [a, b].$$

Thus for any partition P of [a, b]:

$$m(b-a) \le L_f(P) \le U_f(P) \le M(b-a).$$

Hence the set

$$\mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } [a, b]\}$$

is bounded above and the set

$$\mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } [a, b]\}$$

is bounded below. So $L_f = \sup \mathcal{L}_f$ and $U_f = \inf \mathcal{U}_f$ exist.

The first result establishes the intuitively obvious fact that Lf ≤ Uf .

Proposition 40.1. If f is defined and bounded on [a, b], then Lf ≤ Uf .


Proof. Let P be a partition of [a, b] and P′ be the partition P ∪ {y} where $x_{i-1} < y < x_i$ for one particular i, 1 ≤ i ≤ n. In other words, P′ is obtained by adding one more point to P. It is now shown that $L_f(P) \le L_f(P')$ and $U_f(P) \ge U_f(P')$. Let

$$M_i' = \sup\{f(x) \mid x_{i-1} \le x \le y\} \quad \text{and} \quad M_i'' = \sup\{f(x) \mid y \le x \le x_i\}.$$

Clearly, $M_i' \le M_i$ and $M_i'' \le M_i$. Hence:

$$\begin{aligned}
U_f(P') &= \sum_{j=1}^{i-1} M_j (x_j - x_{j-1}) + M_i'(y - x_{i-1}) + M_i''(x_i - y) + \sum_{j=i+1}^{n} M_j (x_j - x_{j-1}) \\
&\le \sum_{j=1}^{i-1} M_j (x_j - x_{j-1}) + M_i (y - x_{i-1}) + M_i (x_i - y) + \sum_{j=i+1}^{n} M_j (x_j - x_{j-1}) \\
&= \sum_{j=1}^{n} M_j (x_j - x_{j-1}) = U_f(P).
\end{aligned}$$

In a similar way, it can be shown that $L_f(P) \le L_f(P')$. It now follows that if $P'' = P \cup \{y_1, y_2, \ldots, y_m\}$, where the $y_i$ are distinct numbers in [a, b], then $L_f(P) \le L_f(P'')$ and $U_f(P) \ge U_f(P'')$.

Now suppose that $P_1$ and $P_2$ are two partitions of [a, b] and let $P_3 = P_1 \cup P_2$. Thus, $L_f(P_1) \le L_f(P_3)$ and $U_f(P_2) \ge U_f(P_3)$. Since $L_f(P_3) \le U_f(P_3)$ it can be deduced that $L_f(P_1) \le U_f(P_2)$. In other words, the lower sum related to a given partition of [a, b] does not exceed the upper sum related to any partition of [a, b]. Hence every lower sum is a lower bound for the set of upper sums. So $L_f(P) \le U_f$ for all possible partitions P. But then $U_f$ is an upper bound for the set of lower sums. Thus $L_f \le U_f$.

Definition 40.3. A function f defined and bounded on [a, b] is Riemann-Darboux integrable on [a, b] if $L_f = U_f$. This common value is denoted by

$$\int_a^b f(x)\,dx = L_f = U_f.$$

Example 40.1. Prove that f(x) = x is Riemann-Darboux integrable on [0, 1].

Solution: For n ∈ N let $P_n = \left\{0, \frac{1}{n}, \frac{2}{n}, \ldots, 1\right\}$. Hence $U_f(P_n) = \frac{n+1}{2n}$ and $L_f(P_n) = \frac{n-1}{2n}$. So

$$\frac{n-1}{2n} \le L_f \le U_f \le \frac{n+1}{2n}.$$

Letting n → ∞ it can be deduced that $L_f = U_f = \frac{1}{2}$.
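The two sums in Example 40.1 can be reproduced exactly with rational arithmetic. The sketch below is an illustration only: it relies on the fact that f(x) = x is increasing, so the supremum on each subinterval is attained at the right endpoint and the infimum at the left endpoint. The function name is a hypothetical choice.

```python
from fractions import Fraction


def darboux_sums_identity(n):
    """Upper and lower Darboux sums of f(x) = x on the uniform partition P_n of [0, 1]."""
    points = [Fraction(i, n) for i in range(n + 1)]
    upper = Fraction(0)
    lower = Fraction(0)
    for left, right in zip(points, points[1:]):
        width = right - left
        upper += right * width  # M_i = f(right endpoint) because f is increasing
        lower += left * width   # m_i = f(left endpoint)
    return upper, lower


for n in (1, 10, 100):
    upper, lower = darboux_sums_identity(n)
    print(n, lower, upper, upper - lower)
```

The gap between the two sums is exactly 1/n, which is why both squeeze to the common value 1/2.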

Example 40.2. Show that the function

$$f(x) = \begin{cases} 1 & \text{if } x \text{ is rational} \\ 0 & \text{if } x \text{ is irrational} \end{cases}$$

is not Riemann-Darboux integrable on any interval [a, b].

Solution: For any partition P it follows that $L_f(P) = 0$ and $U_f(P) = b - a$, since any interval of real numbers contains infinitely many rationals and irrationals. Hence $L_f = 0$ and $U_f = b - a$ and so $L_f \neq U_f$.

This definition of $\int_a^b f(x)\,dx$ is only one of the many ways of assigning areas to bounded regions. There are others, notably the Lebesgue integral; all, however, give the same "answer" for areas under the graphs of continuous functions. It will be proved that all continuous functions are Riemann-Darboux integrable, and a neat method of evaluating the integral involved is derived.

Firstly, some elementary properties of the Riemann integral must be established, all of which are essentially properties of areas.

41 Properties of the Riemann-Darboux integral

Proposition 41.1. If f and g are Riemann-Darboux integrable on [a, b] then all the integrals below exist and

(1) $\displaystyle \int_a^b (\alpha f(x) + \beta g(x))\,dx = \alpha \int_a^b f(x)\,dx + \beta \int_a^b g(x)\,dx$ for $\alpha, \beta \in \mathbb{R}^1$.

(2) $\displaystyle \int_a^b f(x)\,dx = \int_a^c f(x)\,dx + \int_c^b f(x)\,dx$ for $a \le c \le b$.

(3) if $f(x) \le g(x)$ on [a, b] then $\displaystyle \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx$.

(4) $\displaystyle \left|\int_a^b f(x)\,dx\right| \le \int_a^b |f(x)|\,dx$.

Property (1) is described as the linearity of the integral and (2) is called the additive property.

Proof of (1). It is sufficient to prove that the following equalities hold:

$$\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx \quad \text{and} \quad \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

For α ≥ 0, the equality $\int_a^b \alpha f(x)\,dx = \alpha \int_a^b f(x)\,dx$ follows from the equalities $L_{\alpha f}(P) = \alpha L_f(P)$ and $U_{\alpha f}(P) = \alpha U_f(P)$, valid for any α ≥ 0 and any partition P of [a, b].

For α < 0, the same equality follows from the case α ≥ 0 together with $U_{-f}(P) = -L_f(P)$ (and likewise $L_{-f}(P) = -U_f(P)$), valid for any partition P of [a, b].

The equality $\int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx$ is obtained by observing that for any partition P of [a, b] the following inequalities hold:

$$L_f(P) + L_g(P) \le L_{f+g}(P) \le U_{f+g}(P) \le U_f(P) + U_g(P),$$

from which

$$L_f + L_g \le L_{f+g} \le U_{f+g} \le U_f + U_g.$$

These inequalities together with

$$L_f = U_f = \int_a^b f(x)\,dx \quad \text{and} \quad L_g = U_g = \int_a^b g(x)\,dx$$

prove that

$$L_{f+g} = U_{f+g} = \int_a^b (f(x) + g(x))\,dx = \int_a^b f(x)\,dx + \int_a^b g(x)\,dx.$$

Proof of (2). Let $P_1$ and $P_2$ be partitions of [a, c] and [c, b] respectively. Then $P = P_1 \cup P_2$ is a partition of [a, b]. Clearly, $L_f(P) = L_f(P_1) + L_f(P_2)$. Let

$$L_1 = \sup\{L_f(P_1) \mid P_1 \text{ is a partition of } [a, c]\} \quad \text{and} \quad L_2 = \sup\{L_f(P_2) \mid P_2 \text{ is a partition of } [c, b]\}.$$

Since $L_f(P) \le \int_a^b f(x)\,dx$ (by definition) we have

$$L_f(P_1) + L_f(P_2) \le \int_a^b f(x)\,dx.$$

Hence $L_f(P_1) \le \int_a^b f(x)\,dx - L_f(P_2)$ and so $L_1 \le \int_a^b f(x)\,dx - L_f(P_2)$.

Hence $L_f(P_2) \le \int_a^b f(x)\,dx - L_1$ and so $L_2 \le \int_a^b f(x)\,dx - L_1$, or $L_1 + L_2 \le \int_a^b f(x)\,dx$.

Now consider upper sums and observe that $U_f(P) = U_f(P_1) + U_f(P_2)$. Let

$$U_1 = \inf\{U_f(P_1) \mid P_1 \text{ is a partition of } [a, c]\} \quad \text{and} \quad U_2 = \inf\{U_f(P_2) \mid P_2 \text{ is a partition of } [c, b]\}.$$

Since $U_f(P) \ge \int_a^b f(x)\,dx$ (by definition) we have

$$U_f(P_1) + U_f(P_2) \ge \int_a^b f(x)\,dx.$$

Hence $U_f(P_1) \ge \int_a^b f(x)\,dx - U_f(P_2)$ and so $U_1 \ge \int_a^b f(x)\,dx - U_f(P_2)$.

Hence $U_f(P_2) \ge \int_a^b f(x)\,dx - U_1$ and so $U_2 \ge \int_a^b f(x)\,dx - U_1$, or $U_1 + U_2 \ge \int_a^b f(x)\,dx$.

Thus

$$L_1 + L_2 \le \int_a^b f(x)\,dx \le U_1 + U_2.$$

Since f is Riemann integrable on [a, b], for any ε > 0 a partition P of [a, b] can be chosen such that $U_f(P) - L_f(P) < \varepsilon$; adding the point c to P if necessary (which does not increase $U_f(P) - L_f(P)$), we may write $P = P_1 \cup P_2$ as above. Then:

$$U_f(P_1) - L_f(P_1) + U_f(P_2) - L_f(P_2) = [U_f(P_1) + U_f(P_2)] - [L_f(P_1) + L_f(P_2)] = U_f(P) - L_f(P) < \varepsilon.$$

Hence

$$0 \le U_f(P_1) - L_f(P_1) < \varepsilon \quad \text{and} \quad 0 \le U_f(P_2) - L_f(P_2) < \varepsilon.$$

Hence $L_1 = U_1$ and $L_2 = U_2$. In other words, f is Riemann-Darboux integrable on both [a, c] and [c, b]. Hence the additive property is established.

Proof of (3). It is sufficient to prove that if f(x) ≥ 0 on [a, b] then:

$$\int_a^b f(x)\,dx \ge 0.$$

The above inequality follows from the inequalities

$$0 \le L_f(P) \le U_f(P)$$

which are valid for any partition P of [a, b].

Proof of (4). We first prove that |f| is Riemann-Darboux integrable on [a, b]. We consider the functions $f^+, f^- : [a, b] \to \mathbb{R}$ defined by:

$$f^+(x) = \begin{cases} f(x) & \text{if } f(x) \ge 0 \\ 0 & \text{if } f(x) \le 0 \end{cases} \quad \text{and} \quad f^-(x) = \begin{cases} 0 & \text{if } f(x) \ge 0 \\ -f(x) & \text{if } f(x) \le 0 \end{cases}$$

and we remark that:

$$f(x) = f^+(x) - f^-(x) \quad \text{and} \quad |f(x)| = f^+(x) + f^-(x).$$

We will show now that $f^+, f^- : [a, b] \to \mathbb{R}$ are Riemann-Darboux integrable on [a, b]. The boundedness of $f^+$ and $f^-$ is obvious. Consider a partition P of [a, b] and denote:

$$m_i^+ = \inf\{f^+(x) \mid x \in [x_{i-1}, x_i]\} \quad \text{and} \quad M_i^+ = \sup\{f^+(x) \mid x \in [x_{i-1}, x_i]\}.$$

Remark that:

$$M_i^+ - m_i^+ \le M_i - m_i, \quad i = 1, 2, \ldots, n,$$

where:

$$m_i = \inf\{f(x) \mid x \in [x_{i-1}, x_i]\} \quad \text{and} \quad M_i = \sup\{f(x) \mid x \in [x_{i-1}, x_i]\}.$$

Hence, we obtain the inequalities

$$0 \le U_{f^+}(P) - L_{f^+}(P) \le U_f(P) - L_f(P)$$

for any partition P. It follows that $f^+$ is Riemann-Darboux integrable on [a, b]. In a similar way, we obtain that $f^-$ is Riemann-Darboux integrable on [a, b]. Using (1) and the equality $|f| = f^+ + f^-$ we obtain that |f| is Riemann-Darboux integrable on [a, b]. In order to obtain the inequality (4) we use:

$$-|f(x)| \le f(x) \le |f(x)|$$

and we deduce that:

$$-\int_a^b |f(x)|\,dx \le \int_a^b f(x)\,dx \le \int_a^b |f(x)|\,dx.$$

42 Classes of Riemann-Darboux integrable functions

Theorem 42.1. If f is continuous on [a, b], then f is Riemann-Darboux integrable on [a, b].

Proof. If ε′ > 0 then either

$$|f(x) - f(a)| < \varepsilon' \quad \text{for all } x \in [a, b]$$

or else

$$S = \{x \mid x \in [a, b] \text{ and } |f(x) - f(a)| = \varepsilon'\}$$

is non-empty by the intermediate value property, and inf S exists. A partition P of [a, b] is constructed as follows: if

$$|f(x) - f(a)| < \varepsilon' \quad \text{for all } x \in [a, b]$$

let $x_0 = a$ and $x_1 = b$, otherwise let $x_0 = a$ and $x_1 = \inf S$. Hence $x = x_1$ is the first element of [a, b] with $|f(x) - f(a)| = \varepsilon'$. If $x_1 < b$ let $x_2$ be the first element of $[x_1, b]$ with $|f(x) - f(x_1)| = \varepsilon'$ if such an element exists, otherwise let $x_2 = b$. Define $x_3, x_4, \ldots$ and so on in a similar manner. If this process continues indefinitely, a sequence $(x_n)$ has been produced in which $|f(x_n) - f(x_{n-1})| = \varepsilon'$ for all n ∈ N. Now $(x_n)$ is an increasing sequence which is bounded above and so $(x_n)$ tends to some limit x. Since f is continuous, this implies that $f(x_n) \to f(x)$. Hence $f(x_{n-1}) \to f(x)$ also. But this contradicts the condition $|f(x_n) - f(x_{n-1})| = \varepsilon'$. So there is an integer N such that $P = \{x_0, x_1, \ldots, x_N\}$ is a partition of [a, b] for which $|f(x) - f(x_{i-1})| \le \varepsilon'$ for all $x \in [x_{i-1}, x_i]$, i = 1, 2, . . . , N. For this partition P, we have

$$M_i - m_i \le 2\varepsilon' \quad \text{for all } i = 1, 2, \ldots, N.$$

Hence

$$U_f(P) - L_f(P) = \sum_{i=1}^{N} (M_i - m_i)(x_i - x_{i-1}) \le 2\varepsilon'(b - a).$$

Now for any ε > 0 consider $\varepsilon' = \frac{\varepsilon}{2(b-a)} > 0$ and deduce that there exists a partition P with $U_f(P) - L_f(P) \le \varepsilon$. Now $L_f \ge L_f(P) \ge U_f(P) - \varepsilon \ge U_f - \varepsilon$. Since ε is arbitrary, $L_f \ge U_f$. Since $L_f \le U_f$ by an earlier result, $L_f = U_f$. Thus, f is Riemann-Darboux integrable on [a, b].

Definition 42.1. A function f is called piecewise continuous on [a, b] if there exists a partition $P = \{x_0, x_1, \ldots, x_n\}$ of [a, b] and continuous functions $f_i$ defined on $[x_{i-1}, x_i]$, such that $f(x) = f_i(x)$ for $x \in (x_{i-1}, x_i)$, i = 1, 2, . . . , n.

A partition Q of [a, b] can be chosen to contain the points $x_0, x_1, \ldots, x_n$. Then $Q = P_1 \cup P_2 \cup \ldots \cup P_n$ where $P_i$ is a partition of $[x_{i-1}, x_i]$. Hence

$$L_f(Q) = \sum_{i=1}^{n} L_{f_i}(P_i)$$

where $L_f(Q)$ is the lower sum of f related to Q. For each i, $L_{f_i}(P_i)$ is the lower sum of $f_i$ related to $P_i$. Thus $\sum_{i=1}^{n} L_{f_i}(P_i) \le L_f$, where $L_f$ is the supremum of all the lower sums for f on [a, b]. Since $f_i$ is continuous on $[x_{i-1}, x_i]$, $f_i$ is Riemann integrable on $[x_{i-1}, x_i]$. Hence:

$$\sum_{i=1}^{n} \left(\int_{x_{i-1}}^{x_i} f_i(x)\,dx\right) \le L_f.$$

In a similar fashion

$$U_f \le \sum_{i=1}^{n} \left(\int_{x_{i-1}}^{x_i} f_i(x)\,dx\right)$$

where $U_f$ is the infimum of all upper sums of f on [a, b]. Hence, using $L_f \le U_f$, we obtain

$$L_f = U_f.$$

In other words, a piecewise continuous function is Riemann-Darboux integrable and

$$\int_a^b f(x)\,dx = \sum_{i=1}^{n} \int_{x_{i-1}}^{x_i} f_i(x)\,dx.$$

Example 42.1. The function given by f(x) = x − [x] is piecewise continuous on [0, 3]. Compute $\int_0^2 f(x)\,dx$.
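The piece-by-piece formula can be sketched numerically for Example 42.1. The code below is an illustration under stated assumptions: it takes the continuous pieces to be $f_i(x) = x - (i-1)$ on $[i-1, i]$ and uses a midpoint rule (a numerical stand-in chosen here, not a method from the notes); the helper names are hypothetical.

```python
def midpoint_rule(func, left, right, steps=1000):
    """Midpoint-rule approximation of the integral of func over [left, right]."""
    width = (right - left) / steps
    return sum(func(left + (k + 0.5) * width) for k in range(steps)) * width


def integrate_fractional_part(upper_limit):
    """Approximate the integral of x - [x] over [0, upper_limit], upper_limit a positive integer."""
    total = 0.0
    for i in range(1, upper_limit + 1):
        # continuous piece f_i(x) = x - (i - 1) on [i - 1, i]
        piece = lambda x, shift=i - 1: x - shift
        total += midpoint_rule(piece, i - 1, i)
    return total


print(integrate_fractional_part(2))
```

Each piece contributes 1/2, so the integral over [0, 2] is 1; the midpoint rule is exact for linear pieces up to rounding.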


43 Mean value theorem

Theorem 43.1 (the integral mean value theorem). If f and g are continuous on [a, b] and g(x) ≥ 0 for x ∈ [a, b], then there exists c between a and b such that

$$\int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$

Proof. By the interval theorem applied to f on [a, b], m ≤ f(x) ≤ M for all x ∈ [a, b], where m is the infimum and M is the supremum of f on [a, b]. Since g(x) ≥ 0 we have

$$m \cdot g(x) \le f(x) \cdot g(x) \le M \cdot g(x) \quad \text{for } x \in [a, b].$$

Hence

$$m \int_a^b g(x)\,dx \le \int_a^b f(x)\,g(x)\,dx \le M \int_a^b g(x)\,dx.$$

If $\int_a^b g(x)\,dx = 0$, then $\int_a^b f(x)\,g(x)\,dx = 0$ as well and any c ∈ [a, b] will do. Otherwise $\int_a^b g(x)\,dx > 0$ and

$$k = \frac{\int_a^b f(x)\,g(x)\,dx}{\int_a^b g(x)\,dx} \in [m, M].$$

By the intermediate value property, there exists c ∈ [a, b] with f(c) = k. Hence

$$\int_a^b f(x) \cdot g(x)\,dx = f(c) \int_a^b g(x)\,dx.$$

Corollary 43.1. If f is continuous on [a, b], then there exists c ∈ [a, b] such that

$$\int_a^b f(x)\,dx = f(c)(b - a).$$

Application 43.1 (Integral test for series). Let $f : \mathbb{R}_+ \to \mathbb{R}_+$ be a continuous decreasing function and let $a_n = f(n)$ for each n ∈ N. Let $j_n = \int_1^n f(x)\,dx$. The series $\sum_{n=1}^{\infty} a_n$ converges if and only if $(j_n)$ converges.

Proof. Since f(n + 1) ≤ f(x) ≤ f(n) for all x ∈ [n, n + 1], n ∈ N, we have

$$\int_n^{n+1} f(n+1)\,dx \le \int_n^{n+1} f(x)\,dx \le \int_n^{n+1} f(n)\,dx.$$

Therefore

$$f(n+1) \le \int_n^{n+1} f(x)\,dx \le f(n)$$

and so

$$\sum_{k=2}^{n+1} f(k) = \sum_{k=1}^{n} f(k+1) \le \int_1^{n+1} f(x)\,dx \le \sum_{k=1}^{n} f(k).$$

Now let $a_n = f(n)$ and $j_n = \int_1^n f(x)\,dx$. Then

$$\sum_{k=2}^{n} a_k \le j_n \le \sum_{k=1}^{n-1} a_k.$$

If $(j_n)$ converges, then the n-th partial sums of $\sum_{n=1}^{\infty} a_n$ are increasing and bounded above and so $\sum_{n=1}^{\infty} a_n$ is a convergent series.

Conversely, if $\sum_{n=1}^{\infty} a_n$ is a convergent series, then $(j_n)$ is increasing and bounded above and hence, it is a convergent sequence.
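The double inequality used in the proof can be checked exactly on a concrete case. The sketch below assumes $f(x) = 1/x^2$ (an example chosen here for illustration), for which $j_n = 1 - 1/n$ is known in closed form; the function name is hypothetical.

```python
from fractions import Fraction


def integral_test_bounds(n):
    """For f(x) = 1/x^2 return (sum_{k=2}^{n} a_k, j_n, sum_{k=1}^{n-1} a_k)."""
    a = lambda k: Fraction(1, k * k)
    lower = sum((a(k) for k in range(2, n + 1)), Fraction(0))
    upper = sum((a(k) for k in range(1, n)), Fraction(0))
    j_n = 1 - Fraction(1, n)  # integral of x^(-2) from 1 to n
    return lower, j_n, upper


for n in (2, 5, 50):
    lower, j_n, upper = integral_test_bounds(n)
    print(n, float(lower), float(j_n), float(upper))
```

Since $(j_n)$ is bounded by 1 here, the partial sums of $\sum 1/n^2$ are bounded by 2, in line with the first half of the proof.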

44 The fundamental theorem of calculus

Theorem 44.1. If f is Riemann-Darboux integrable on [a, b] and $F(x) = \int_a^x f(t)\,dt$, then F is continuous on [a, b]. Furthermore, if f is continuous on [a, b], then F is differentiable on [a, b] and F′ = f.

Proof. Since f is integrable, it is bounded on [a, b]. So there exists some number M with |f(t)| ≤ M for all t ∈ [a, b]. For a fixed c and, say, x ≥ c (the case x < c is symmetric), we have

$$|F(x) - F(c)| = \left|\int_a^x f(t)\,dt - \int_a^c f(t)\,dt\right| = \left|\int_c^x f(t)\,dt\right| \le \int_c^x |f(t)|\,dt \le \int_c^x M\,dt.$$

Since the constant function x → M has U(P) = M|x − c| for the trivial partition P of the interval with endpoints x and c,

$$|F(x) - F(c)| \le M|x - c|.$$

Now given ε > 0, choose $\delta = \frac{\varepsilon}{M}$. Hence

$$|x - c| < \delta \Rightarrow |F(x) - F(c)| < \varepsilon.$$

In other words F is continuous at c and, since c was arbitrary, F is continuous on [a, b].


Let c ∈ [a, b] and consider x > c. Then

$$\left|\frac{F(x) - F(c)}{x - c} - f(c)\right| = \left|\frac{\int_a^x f(t)\,dt - \int_a^c f(t)\,dt}{x - c} - f(c)\right| = \left|\frac{\int_c^x f(t)\,dt}{x - c} - f(c)\right| = \left|\frac{\int_c^x (f(t) - f(c))\,dt}{x - c}\right| \le \frac{\int_c^x |f(t) - f(c)|\,dt}{x - c}$$

(the third equality holds since f(c) is constant, so that $\int_c^x f(c)\,dt = f(c)(x - c)$).

Given ε > 0, there exists δ > 0 such that |f(t) − f(c)| < ε for |t − c| < δ. Hence, for 0 < x − c < δ,

$$\left|\frac{F(x) - F(c)}{x - c} - f(c)\right| \le \frac{\int_c^x |f(t) - f(c)|\,dt}{x - c} \le \frac{\int_c^x \varepsilon\,dt}{x - c} = \varepsilon.$$

In other words, $F'_+(c) = f(c)$. Similarly $F'_-(c) = f(c)$. Hence F is differentiable and F′ = f.

Remark 44.1. Suppose that it is required to evaluate

$$\int_{x_1}^{x_2} f(t)\,dt$$

where $x_1, x_2 \in [a, b]$ and f is continuous on [a, b]. Using the additive property we have

$$\int_{x_1}^{x_2} f(t)\,dt = \int_a^{x_2} f(t)\,dt - \int_a^{x_1} f(t)\,dt$$

and by the fundamental theorem

$$\int_{x_1}^{x_2} f(t)\,dt = F(x_2) - F(x_1)$$

where $F(x) = \int_a^x f(t)\,dt$.

Remark 44.2. If f is continuous on [a, b] and Φ′ = f on [a, b], then there exists a constant c such that Φ(x) = F(x) + c for any x ∈ [a, b], where $F(x) = \int_a^x f(t)\,dt$. That is because (Φ − F)′ = 0 and, by the mean value theorem, Φ − F = c.

Hence if f is continuous on [a, b] and Φ is continuously differentiable on [a, b] such that Φ′ = f, then for any $x_1, x_2 \in [a, b]$ we have:

$$\int_{x_1}^{x_2} f(t)\,dt = \Phi(x_2) - \Phi(x_1).$$


Notice that this method of evaluating integrals $\int_a^b f(t)\,dt$ hinges on the ability to determine Φ such that Φ′ = f on [a, b].

Definition 44.1. Any function Φ such that Φ′ = f is called a primitive for f.

Unfortunately, most functions do not possess primitives expressible in terms of the elementary functions alone. In such cases it is necessary to settle for numerical estimates.

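Example 44.1.

a) $\displaystyle \int_0^1 (x^3 + 2)\,dx = \left.\left(\frac{1}{4}x^4 + 2x\right)\right|_0^1 = \frac{9}{4}$.

b) $\displaystyle \int_{-1}^{0} (x^2 - x)\,dx = \left.\left(\frac{x^3}{3} - \frac{x^2}{2}\right)\right|_{-1}^{0} = \frac{5}{6}$.

c) Find the primitives for the following functions:

1. $f(x) = x^2 + 3x - 2$;

2. $f(x) = 1 + \cos 3x$;

3. $f(x) = e^x \cosh 2x$;

4. $f(x) = \dfrac{1}{\sqrt{9 - x^2}}$;

5. $f(x) = |x|$.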
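Parts a) and b) can be cross-checked against a direct numerical estimate of the integral. The sketch below is illustrative only: it compares Φ(b) − Φ(a) with a midpoint-rule sum, where the midpoint rule and the helper names are assumptions made here and not part of the notes.

```python
def midpoint_integral(func, a, b, steps=10000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width


def integral_via_primitive(primitive, a, b):
    """Evaluate the integral as primitive(b) - primitive(a)."""
    return primitive(b) - primitive(a)


# a) integrand x^3 + 2 on [0, 1] with primitive x^4/4 + 2x
exact_a = integral_via_primitive(lambda x: x ** 4 / 4 + 2 * x, 0.0, 1.0)
approx_a = midpoint_integral(lambda x: x ** 3 + 2, 0.0, 1.0)

# b) integrand x^2 - x on [-1, 0] with primitive x^3/3 - x^2/2
exact_b = integral_via_primitive(lambda x: x ** 3 / 3 - x ** 2 / 2, -1.0, 0.0)
approx_b = midpoint_integral(lambda x: x ** 2 - x, -1.0, 0.0)

print(exact_a, approx_a)
print(exact_b, approx_b)
```

When no elementary primitive is available, only the numerical side of this comparison remains, which is the situation described above.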

45 Techniques to find primitives

In the applications of integral calculus, it is necessary to find primitives (when primitives exist and are expressible in terms of simple functions). Various techniques exist for determining primitives and this short section looks at two of the most important: integration by parts and change of variables.

Integration by parts

Proposition 45.1. If the functions f and g are continuously differentiable on [a, b], then

$$\int f(x) \cdot g'(x)\,dx = f(x) \cdot g(x) - \int f'(x) \cdot g(x)\,dx,$$

where $\int f(x)\,g'(x)\,dx$ represents the set of primitives of fg′ and $\int f'(x)\,g(x)\,dx$ represents the set of primitives of f′g.

Proof. The function h = f · g is differentiable and its derivative h′ is continuous on [a, b]. By the product rule of differentiation, we have

$$h'(x) = f'(x) \cdot g(x) + f(x) \cdot g'(x).$$

Let now $\varphi \in \int f(x)\,g'(x)\,dx$ and $\psi = \varphi - fg$. It is easy to see that $\psi' = \varphi' - f'g - fg' = -f'g$ and therefore $\psi \in -\int f'(x)\,g(x)\,dx$. We obtain that $\varphi = fg + \psi$ and $\psi \in -\int f'(x)\,g(x)\,dx$. In other words, $\varphi \in fg - \int f'(x)\,g(x)\,dx$. It can be shown in the same manner that for every $\psi \in -\int f'(x)\,g(x)\,dx$, the function $\varphi = fg + \psi$ belongs to $\int f(x)\,g'(x)\,dx$.

The value of this formula lies in the hope that the primitive on the right-hand side is easier to evaluate than the original one.

Corollary 45.1. If the functions f and g are continuously differentiable on [a, b] then

$$\int_a^b f(x) \cdot g'(x)\,dx = f(x) \cdot g(x)\Big|_a^b - \int_a^b f'(x) \cdot g(x)\,dx.$$

Example 45.1. Many so-called reduction formulae can be established by repeated integration by parts. For example, let

$$I_n = \int \cos^n x\,dx.$$

Then

$$I_n = \cos^{n-1} x \cdot \sin x + (n-1)\,I_{n-2} - (n-1)\,I_n.$$

Hence:

$$n\,I_n = \cos^{n-1} x \cdot \sin x + (n-1)\,I_{n-2}, \quad n \ge 2.$$

This formula, together with the fact that $I_0 = x$ and $I_1 = \sin x$, leads to the evaluation of $\int \cos^n x\,dx$ for n ∈ N.
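The reduction formula can be checked in its definite form. The sketch below makes an assumption beyond the text: it applies the formula on $[0, \pi/2]$, where the boundary term $\cos^{n-1} x \cdot \sin x$ vanishes at both endpoints for n ≥ 2, so that $n I_n = (n-1) I_{n-2}$ with $I_0 = \pi/2$ and $I_1 = 1$. The comparison against a midpoint rule and the function names are illustrative choices.

```python
import math


def cos_power_integral(n):
    """Integral of cos^n x over [0, pi/2] via n*I_n = (n-1)*I_{n-2}."""
    if n == 0:
        return math.pi / 2
    if n == 1:
        return 1.0
    return (n - 1) / n * cos_power_integral(n - 2)


def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width


for n in range(6):
    by_reduction = cos_power_integral(n)
    by_quadrature = midpoint_integral(lambda x: math.cos(x) ** n, 0.0, math.pi / 2)
    print(n, by_reduction, by_quadrature)
```

The recursion needs only two base cases, which mirrors how $I_0$ and $I_1$ determine all the primitives in the example.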

Example 45.2. Show that

$$\int_1^2 x^2 e^{2x}\,dx = \frac{1}{4}\,e^2\,(5e^2 - 1).$$

Change of variables

Proposition 45.2. If the function g : [α, β] → [a, b] is a continuously differentiable bijection having the property g(α) = a, g(β) = b and $f : [a, b] \to \mathbb{R}^1$ is continuous, then

$$\left(\int f(x)\,dx\right) \circ g = \int (f \circ g)(t) \cdot g'(t)\,dt,$$

where $\int f(x)\,dx$ represents the set of primitives of f and $\int (f \circ g)(t) \cdot g'(t)\,dt$ represents the set of primitives of (f ∘ g) · g′.

Proof. Let $F \in \int f(x)\,dx$ and G(t) = (F ∘ g)(t). By the composite rule of differentiation

$$G'(t) = F'(g(t)) \cdot g'(t) = f(g(t)) \cdot g'(t) = (f \circ g)(t) \cdot g'(t).$$

Hence

$$G \in \int (f \circ g)(t) \cdot g'(t)\,dt.$$

Let now $G \in \int (f \circ g)(t) \cdot g'(t)\,dt$ and consider $g^{-1} : [a, b] \to [\alpha, \beta]$. We have $(g^{-1})'(x) = \frac{1}{g'(t)}$ where x = g(t), and the function $F = G \circ g^{-1}$ verifies

$$F'(x) = G'(g^{-1}(x))\,(g^{-1})'(x) = (f \circ g)(g^{-1}(x))\,[(g^{-1})'(x)]^{-1}\,(g^{-1})'(x) = f(x).$$

Therefore, $F \in \int f(x)\,dx$ and $G = F \circ g \in \left(\int f(x)\,dx\right) \circ g$.

Example 45.3. Evaluate

$$\int \frac{1}{x \ln x}\,dx.$$

Solution: Let $f(x) = \frac{1}{x \ln x}$ and $g(t) = e^t$. Then $g'(t) = e^t$ and so

$$\int \frac{1}{x \ln x}\,dx = \int \frac{1}{e^t \cdot t}\,e^t\,dt = \int \frac{1}{t}\,dt = \ln t = \ln(\ln x).$$

Corollary 45.2. If the function g : [α, β] → [a, b] is a continuously differentiable bijection having the property g(α) = a, g(β) = b and $f : [a, b] \to \mathbb{R}^1$ is continuous, then

$$\int_a^b f(x)\,dx = \int_\alpha^\beta (f \circ g)(t) \cdot g'(t)\,dt.$$

Example 45.4. Evaluate

$$\int_1^2 t^2 \sqrt{t^3 - 1}\,dt.$$

Solution: Let $f(x) = \sqrt{x}$ and $g(t) = t^3 - 1$. Then $g'(t) = 3t^2$ and therefore

$$\int_1^2 \sqrt{t^3 - 1} \cdot 3t^2\,dt = \int_{g(1)}^{g(2)} \sqrt{x}\,dx.$$

Hence

$$\int_1^2 t^2 \sqrt{t^3 - 1}\,dt = \frac{1}{3} \int_0^7 \sqrt{x}\,dx = \frac{1}{3} \cdot \frac{2}{3}\,x^{3/2}\Big|_0^7 \approx 4.116.$$

The change of variables formula is often called integration by substitution, where x = g(t) gives the substitution to be used. It is extensively used in elementary calculus books, where trial substitutions, which depend on the form of the integrand, are suggested.
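The substitution in Example 45.4 can be sanity-checked numerically. The following sketch is an illustration only: it approximates both sides of the change of variables formula with a midpoint rule (a choice made here, not a method from the notes) and compares them with the closed form $\frac{2}{9} \cdot 7^{3/2}$; the function names are hypothetical.

```python
import math


def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width


def check_substitution():
    """Return (integral in t, integral in x divided by 3, closed form) for Example 45.4."""
    in_t = midpoint_integral(lambda t: t * t * math.sqrt(t ** 3 - 1), 1.0, 2.0)
    in_x = midpoint_integral(math.sqrt, 0.0, 7.0) / 3
    closed_form = 2 / 9 * 7 ** 1.5
    return in_t, in_x, closed_form


print(check_substitution())
```

All three numbers agree to several decimal places, the small discrepancies coming from the square-root behaviour of the integrands at the left endpoints.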


Remark 45.1. If f : [−a, a] → R is piecewise continuous and symmetric (f(−x) = f(x)) then:

$$\int_{-a}^{a} f(x)\,dx = 2 \int_0^a f(x)\,dx$$

and if f is antisymmetric (f(−x) = −f(x)) then:

$$\int_{-a}^{a} f(x)\,dx = 0.$$

In order to obtain the above equalities, the integral $\int_{-a}^{a} f(x)\,dx$ is written as:

$$\int_{-a}^{a} f(x)\,dx = \int_{-a}^{0} f(x)\,dx + \int_0^a f(x)\,dx$$

and then, in the first integral, the substitution x = −t is made.

Remark 45.2. If f : R → R is periodic of period T and piecewise continuous, then for any a ∈ R, the following equality holds:

$$\int_a^{a+T} f(x)\,dx = \int_0^T f(x)\,dx.$$

In order to prove this equality, we write:

$$\int_a^{a+T} f(x)\,dx = \int_a^0 f(x)\,dx + \int_0^T f(x)\,dx + \int_T^{a+T} f(x)\,dx.$$

In the integral $\int_T^{a+T} f(x)\,dx$ the substitution x = t + T is made, obtaining:

$$\int_T^{a+T} f(x)\,dx = -\int_a^0 f(x)\,dx.$$

46 Improper integrals

The definition of the Riemann-Darboux integral applies only to bounded functions defined on bounded intervals. This section relaxes these conditions and defines improper integrals.

The integral of a function defined and bounded on an interval which is not bounded is defined below. This is called an improper integral of the first kind.

Definition 46.1. Let f be a function bounded on [a, +∞) and Riemann-Darboux integrable on [a, b] for every b > a. If $\lim_{b \to \infty} \int_a^b f(x)\,dx$ exists, it is said that $\int_a^{\infty} f(x)\,dx$ converges. Otherwise $\int_a^{\infty} f(x)\,dx$ diverges.

A completely analogous definition holds for integrals of the form $\int_{-\infty}^{a} f(x)\,dx$.

Since it is necessary to preserve the additivity of the integral for improper integrals, the following is defined:

$$\int_{-\infty}^{+\infty} f(x)\,dx = \int_{-\infty}^{0} f(x)\,dx + \int_0^{+\infty} f(x)\,dx$$

provided both improper integrals on the right-hand side converge.

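Example 46.1.

a) The integral $\displaystyle \int_0^{+\infty} \frac{1}{1 + x^2}\,dx$ converges to $\dfrac{\pi}{2}$.

b) The integral $\displaystyle \int_1^{+\infty} \frac{1}{x^2}\,dx$ converges to 1.

c) The integral $\displaystyle \int_1^{+\infty} \frac{1}{\sqrt{x}}\,dx$ diverges.

d) The integral $\displaystyle \int_{-\infty}^{+\infty} \sin x\,dx$ diverges.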

The integral of a function over a bounded interval where the function is not bounded will now be defined. This is called an improper integral of the second kind.

Definition 46.2. Let f be a function defined on (a, b] and Riemann-Darboux integrable on [a + ε, b], for ε ∈ (0, b − a). If

$$\lim_{\varepsilon \to 0^+} \int_{a+\varepsilon}^{b} f(x)\,dx$$

exists, then it is said that $\int_a^b f(x)\,dx$ converges.

Example 46.2.

a) The integral $\displaystyle \int_0^1 \frac{1}{\sqrt{x}}\,dx$ converges to 2.

b) The integral $\displaystyle \int_0^1 \frac{1}{\sqrt{1 - x^2}}\,dx$ converges to $\dfrac{\pi}{2}$.

c) The integral $\displaystyle \int_0^1 \frac{1}{x}\,dx$ diverges.

d) The integral $\displaystyle \int_0^{+\infty} \frac{1}{\sqrt{x}}\,dx$ diverges.
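The limits in Definitions 46.1 and 46.2 can be watched on the examples whose primitives are elementary. The sketch below simply evaluates the closed-form values of the truncated integrals; the primitives $2\sqrt{x}$, $-1/x$ and $\ln x$ are standard, while the function names are illustrative assumptions.

```python
import math


def second_kind_partial(eps):
    """Integral of x^(-1/2) over [eps, 1], equal to 2 - 2*sqrt(eps)."""
    return 2.0 - 2.0 * math.sqrt(eps)


def first_kind_partial(b):
    """Integral of x^(-2) over [1, b], equal to 1 - 1/b."""
    return 1.0 - 1.0 / b


def divergent_partial(eps):
    """Integral of 1/x over [eps, 1], equal to -ln(eps); unbounded as eps -> 0+."""
    return -math.log(eps)


for exponent in (2, 6, 12):
    eps = 10.0 ** (-exponent)
    print(eps, second_kind_partial(eps), first_kind_partial(1 / eps), divergent_partial(eps))
```

The first two columns settle near 2 and 1, while the third grows without bound, matching items a) and c) of Example 46.2 and item b) of Example 46.1.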

Theorem 46.1 (Comparison test for integrals). Let f and g be defined on [a, +∞) and Riemann-Darboux integrable on [a, b] for every b > a. Suppose that

a) 0 ≤ f(x) ≤ g(x) for all x ≥ a;

b) $\displaystyle \int_a^{+\infty} g(x)\,dx$ converges.

Then $\displaystyle \int_a^{+\infty} f(x)\,dx$ converges too.

Proof. For every b > a,

$$0 \le \int_a^b f(x)\,dx \le \int_a^b g(x)\,dx.$$

Since g(x) ≥ 0, $\int_a^b g(x)\,dx$ increases to its limiting value as b → +∞. Since f(x) ≥ 0, $\int_a^b f(x)\,dx$ is increasing in b, and it is bounded above by $\int_a^{+\infty} g(x)\,dx$. Therefore $\int_a^{+\infty} f(x)\,dx$ is a convergent integral.

A comparison test for improper integrals of the second kind is easily formulated and proved in a similar manner.

Example 46.3. The integral $\displaystyle \int_0^{+\infty} \frac{e^{-x}}{1 + x^2}\,dx$ converges.

Solution: Consider the functions $f(x) = \dfrac{e^{-x}}{1 + x^2}$ and $g(x) = \dfrac{1}{1 + x^2}$ and apply the comparison test.

47 Fourier series

The idea that a function may be represented by its Taylor series has already been discussed. We saw that in order to be able to write the Taylor series of a function f at a point x, it needs to be infinitely differentiable. This is a severe restriction that most functions do not satisfy. Even when Taylor's theorem with remainder is employed, the function still needs to be differentiable a finite number of times and this, like infinite differentiability, certainly implies that the function must be continuous. Nevertheless, many functions used to describe important physical phenomena are discontinuous and cannot be represented by Taylor series. For example, the function used to describe the voltage behavior in time in a circuit in which a switch is suddenly operated is discontinuous, just like the functional behavior of the gas pressure across a shock front.

In principle, at least, Fourier series offer the possibility of representation of continuous and piecewise continuous functions, because whereas for Taylor series expansion a function needs to be differentiable, for Fourier series expansion it would appear that it only needs to be integrable; the Fourier coefficients can be computed when f(x) is piecewise continuous.

Definition 47.1. The Fourier series of a piecewise continuous function f(x) defined on the interval [−π, π] is the series

$$f(x) \sim \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cdot \cos nx + b_n \cdot \sin nx)$$

in which the Fourier coefficients $a_n$, $b_n$ are given by

$$a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cdot \cos nx\,dx \quad \text{for } n = 0, 1, 2, \ldots$$

$$b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cdot \sin nx\,dx \quad \text{for } n = 1, 2, \ldots$$

Example 47.1.

a) Determine the Fourier series of the function $f(x) = \pi^2 - x^2$ for x ∈ [−π, π].

Solution: $a_0 = \dfrac{4\pi^2}{3}$, $a_n = (-1)^{n+1} \cdot \dfrac{4}{n^2}$ for n = 1, 2, ..., $b_n = 0$ for all n.

b) Determine the Fourier series of the function f(x) = |x| for x ∈ [−π, π].

Solution: $a_0 = \pi$, $a_{2n} = 0$ and $a_{2n-1} = \dfrac{-4}{\pi(2n-1)^2}$ for n = 1, 2, ..., $b_n = 0$ for all n.

c) Determine the Fourier series of the function

$$f(x) = \begin{cases} a & \text{for } -\pi \le x < 0 \\ b & \text{for } 0 \le x < \pi \end{cases}$$

Solution: $a_0 = a + b$, $a_n = 0$ for n = 1, 2, ..., $b_{2n} = 0$ for n = 1, 2, ..., and $b_{2n+1} = \dfrac{2(b - a)}{(2n + 1)\pi}$ for n = 0, 1, 2, ....
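Coefficients of this kind can be cross-checked by evaluating the defining integrals numerically. The sketch below is an illustration under stated assumptions: it uses a midpoint rule on [−π, π] (a choice made here) and tests the functions $\pi^2 - x^2$ and $|x|$ against the standard closed forms $a_n = (-1)^{n+1} \cdot 4/n^2$ and $a_n = 2((-1)^n - 1)/(\pi n^2)$; the function name is hypothetical.

```python
import math


def fourier_coefficients(func, n, steps=20000):
    """Midpoint-rule approximations of the Fourier coefficients a_n and b_n on [-pi, pi]."""
    width = 2 * math.pi / steps
    a_n = 0.0
    b_n = 0.0
    for k in range(steps):
        x = -math.pi + (k + 0.5) * width
        a_n += func(x) * math.cos(n * x)
        b_n += func(x) * math.sin(n * x)
    return a_n * width / math.pi, b_n * width / math.pi


parabola = lambda x: math.pi ** 2 - x ** 2
for n in range(0, 4):
    print(n, fourier_coefficients(parabola, n), fourier_coefficients(abs, n))
```

For n = 0 the first component is $a_0$ itself (the series uses $a_0/2$), and the vanishing $b_n$ reflect that both functions are even.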

Usually, until the convergence problem has been resolved, it is customary to denote the relationship between f(x) and its Fourier series by the sign ∼ instead of an equality.

The main result of this section will be establishing a fundamental theorem on the convergence of Fourier series of a piecewise continuous function f(x). However, as this will require several subsidiary results, which are important in their own way, we now establish them in the form of two lemmas.

Lemma 47.1 (Integral representation of $S_n(x)$). The n-th partial sum of the Fourier series of the function f(x) defined on the fundamental interval [−π, π], and prolonged by periodic extension outside it, may be represented in the form:

$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x - u) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u}\,du.$$

Proof. First, using the summation formula for a geometric progression it follows immediately that

$$\sum_{k=1}^{n} e^{ikx} = \frac{\exp\left[i\left(n + \frac{1}{2}\right)x\right] - \exp\left[\frac{1}{2}ix\right]}{2i \sin \frac{1}{2}x}.$$

Hence, equating the real parts of the two sides of this equation, we deduce that

$$\frac{1}{2} + \sum_{k=1}^{n} \cos kx = \frac{\sin\left(n + \frac{1}{2}\right)x}{2 \sin \frac{1}{2}x}.$$

Integration of this expression over the intervals [−π, 0] and [0, π] shows that

$$\int_{-\pi}^{0} \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u}\,du = \int_0^{\pi} \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u}\,du = \frac{1}{2}\pi$$

since the only contribution from the left-hand side comes from the constant term.
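The trigonometric identity above is easy to spot-check numerically. The sketch below is an illustration only: it compares the finite cosine sum with the closed form at a few sample points that are not multiples of 2π (where the closed form would be 0/0); the function names are hypothetical.

```python
import math


def kernel_sum(n, x):
    """Left-hand side: 1/2 + sum_{k=1}^{n} cos(kx)."""
    return 0.5 + sum(math.cos(k * x) for k in range(1, n + 1))


def kernel_closed(n, x):
    """Right-hand side: sin((n + 1/2) x) / (2 sin(x / 2)), for x not a multiple of 2*pi."""
    return math.sin((n + 0.5) * x) / (2 * math.sin(0.5 * x))


for n in (1, 5, 20):
    for x in (0.3, 1.0, 2.5, -1.7):
        print(n, x, kernel_sum(n, x), kernel_closed(n, x))
```

Both columns agree to rounding error, and as x → 0 both tend to n + 1/2, so the quotient extends continuously to x = 0.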

Now consider the n-th partial sum $S_n(x)$ of the Fourier series of f(x):

$$S_n(x) = \frac{a_0}{2} + \sum_{k=1}^{n} (a_k \cdot \cos kx + b_k \cdot \sin kx).$$

By the definition of the Fourier coefficients $a_k$ and $b_k$ we may write

$$S_n(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t)\,dt + \frac{1}{\pi} \sum_{k=1}^{n} \left[\cos kx \int_{-\pi}^{\pi} f(t) \cos kt\,dt + \sin kx \int_{-\pi}^{\pi} f(t) \sin kt\,dt\right].$$

Taking the functions cos kx, sin kx under the integral signs, and employing the trigonometric identity

$$\cos k(x - t) = \cos kx \cdot \cos kt + \sin kx \cdot \sin kt$$

allows us to write

$$S_n(x) = \frac{1}{\pi} \int_{-\pi}^{\pi} f(t) \left[\frac{1}{2} + \sum_{k=1}^{n} \cos k(x - t)\right] dt.$$

Applying the identity

$$\frac{1}{2} + \sum_{k=1}^{n} \cos k(x - t) = \frac{\sin\left(n + \frac{1}{2}\right)(x - t)}{2 \sin \frac{1}{2}(x - t)}$$

and writing x − t = u, this becomes

$$S_n(x) = \frac{1}{\pi} \int_{x-\pi}^{x+\pi} f(x - u) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u}\,du.$$

The trigonometric factor in this integrand has period 2π so that if, for the purpose of the study of its Fourier series, f(x) itself is also regarded as periodic with period 2π, then the entire integrand is periodic with period 2π. Consequently, a definite integral of this function taken over any interval of length 2π will be the same, showing that we may replace the limits x − π and x + π by −π and π, respectively. This assumption of the periodicity of the function f(x) outside [−π, π] in fact places no restriction on f(x), because the Fourier series can only represent f(x) in the fundamental interval [−π, π], so that the manner in which f(x) is defined outside it is immaterial.

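Lemma 47.2. For a piecewise continuous function f(x) defined on [−π, π] the following equalities hold:

a) $\displaystyle \lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \cos nx\,dx = 0$ and $\displaystyle \lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \sin nx\,dx = 0$, and

b) $\displaystyle \lim_{n \to \infty} \int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx = 0$ if −π ≤ a < b ≤ π.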

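Proof. Consider the identity:

$$\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = \int_{-\pi}^{\pi} [f(x)]^2\,dx - 2 \int_{-\pi}^{\pi} f(x)\,S_n(x)\,dx + \int_{-\pi}^{\pi} [S_n(x)]^2\,dx.$$

From the definition of the Fourier coefficients, the orthogonality property of the trigonometric system, i.e.

$$\int_{-\pi}^{\pi} \sin mx \cos nx\,dx = 0 \quad \text{for all } m, n;$$

$$\int_{-\pi}^{\pi} \sin mx \sin nx\,dx = \begin{cases} 0 & \text{for } m \neq n \\ \pi & \text{for } m = n \end{cases};$$

$$\int_{-\pi}^{\pi} \cos mx \cos nx\,dx = \begin{cases} 0 & \text{for } m \neq n \\ \pi & \text{for } m = n \neq 0 \\ 2\pi & \text{for } m = n = 0 \end{cases};$$

and from the form of the n-th partial sum $S_n(x)$, it follows that

$$\int_{-\pi}^{\pi} [S_n(x)]^2\,dx = \int_{-\pi}^{\pi} f(x)\,S_n(x)\,dx = \pi \left[\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2)\right].$$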

Combining the last two equalities we have:

$$\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = \int_{-\pi}^{\pi} [f(x)]^2\,dx - \pi \left[\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2)\right].$$

As the integrand of the left-hand side integral is a square, it is either positive or zero, so we may conclude

$$\frac{a_0^2}{2} + \sum_{k=1}^{n} (a_k^2 + b_k^2) \le \frac{1}{\pi} \int_{-\pi}^{\pi} f^2(x)\,dx.$$

This is known as the Bessel inequality and it is true for all n. The fact that the right-hand side is finite by hypothesis implies that the series of squares of the Fourier coefficients is convergent. This result implies that $a_n \to 0$ and $b_n \to 0$ for n → ∞. In terms of the definition of Fourier coefficients, the limits $a_n \to 0$ and $b_n \to 0$ are seen to be equivalent to

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \cos nx\,dx = 0 \quad \text{and} \quad \lim_{n \to \infty} \int_{-\pi}^{\pi} f(x) \sin nx\,dx = 0.$$

Observe that when

$$\lim_{n \to \infty} \int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx = 0,$$

then we have

$$\frac{a_0^2}{2} + \sum_{k=1}^{\infty} (a_k^2 + b_k^2) = \frac{1}{\pi} \int_{-\pi}^{\pi} [f(x)]^2\,dx.$$

This last result is known as Parseval's equality.

The convergence

$$\int_{-\pi}^{\pi} [f(x) - S_n(x)]^2\,dx \xrightarrow[n \to \infty]{} 0$$

is usually known as the convergence in the mean.
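The Bessel inequality and Parseval's equality can be observed on a concrete function. The sketch below takes $f(x) = \pi^2 - x^2$ with the coefficients $a_0 = 4\pi^2/3$, $a_k = (-1)^{k+1} \cdot 4/k^2$, $b_k = 0$; the value $\frac{1}{\pi}\int_{-\pi}^{\pi} (\pi^2 - x^2)^2\,dx = \frac{16\pi^4}{15}$ is computed here by direct integration and is not quoted from the notes. The function names are illustrative.

```python
import math


def bessel_partial_sum(n):
    """a_0^2/2 + sum_{k=1}^{n} (a_k^2 + b_k^2) for f(x) = pi^2 - x^2."""
    a0 = 4 * math.pi ** 2 / 3
    return a0 ** 2 / 2 + sum((4 / k ** 2) ** 2 for k in range(1, n + 1))


def mean_square():
    """(1/pi) times the integral of (pi^2 - x^2)^2 over [-pi, pi], i.e. 16*pi^4/15."""
    return 16 * math.pi ** 4 / 15


for n in (1, 10, 100, 1000):
    print(n, bessel_partial_sum(n), mean_square() - bessel_partial_sum(n))
```

The partial sums increase and stay below the bound (Bessel), and the gap shrinks to zero (Parseval), roughly like $16/(3n^3)$.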

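We will show now that if f is a piecewise continuous function on [−π, π], then for a, b such that −π ≤ a < b ≤ π we have

$$\lim_{n \to \infty} \int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx = 0.$$

For that, we first remark that for any α < β

$$\left|\int_\alpha^\beta \sin\left(n + \frac{1}{2}\right)x\,dx\right| = \left|\frac{\cos\left(n + \frac{1}{2}\right)\alpha - \cos\left(n + \frac{1}{2}\right)\beta}{n + \frac{1}{2}}\right| \le \frac{2}{n + \frac{1}{2}}.$$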

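Now let $a = x_0 < x_1 < \ldots < x_p = b$ be a partition of the closed interval [a, b] and consider the corresponding decomposition of the integral:

$$\int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} f(x) \sin\left(n + \frac{1}{2}\right)x\,dx.$$

Consider $m_i = \inf\{f(x) \mid x \in [x_i, x_{i+1}]\}$ and represent $\int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx$ in the following form:

$$\int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx = \sum_{i=0}^{p-1} \int_{x_i}^{x_{i+1}} [f(x) - m_i] \sin\left(n + \frac{1}{2}\right)x\,dx + \sum_{i=0}^{p-1} m_i \int_{x_i}^{x_{i+1}} \sin\left(n + \frac{1}{2}\right)x\,dx.$$

For $\omega_i = M_i - m_i$, where $M_i = \sup\{f(x) \mid x \in [x_i, x_{i+1}]\}$, we have

$$0 \le f(x) - m_i \le M_i - m_i = \omega_i.$$

Now, with $\Delta x_i = x_{i+1} - x_i$,

$$\left|\int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx\right| \le \sum_{i=0}^{p-1} \omega_i \cdot \Delta x_i + \frac{2}{n + 1/2} \sum_{i=0}^{p-1} |m_i|.$$

For ε > 0 we choose the partition such that $\sum_{i=0}^{p-1} \omega_i \cdot \Delta x_i < \frac{\varepsilon}{2}$. This is possible because the piecewise continuous function f is integrable.

Now we can take $n > \frac{4}{\varepsilon}\,pM$, where $M = \sup\{|f(x)| \mid x \in [a, b]\}$, so that $\frac{2}{n + 1/2} \sum_{i=0}^{p-1} |m_i| \le \frac{2pM}{n + 1/2} < \frac{\varepsilon}{2}$, and we obtain that for such values of n we have

$$\left|\int_a^b f(x) \sin\left(n + \frac{1}{2}\right)x\,dx\right| < \varepsilon.$$

Collecting the results together we arrive at Lemma 47.2.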

Now we are ready to prove the fundamental Fourier theorem on convergence.

Theorem 47.1 (Fourier theorem). Let f be a piecewise continuous function defined on the interval [−π, π] and extended by periodicity outside it. If f(x) has finite left-hand and right-hand side derivatives at its points of discontinuity, then:

a) when $x = x_0$ is a point of continuity of f, then

$$\lim_{n \to \infty} S_n(x_0) = f(x_0).$$

b) when $x = x_0$ is a point of discontinuity of f, then

$$\lim_{n \to \infty} S_n(x_0) = \frac{1}{2}\left[f(x_0^+) + f(x_0^-)\right].$$

Proof. Consider a function f(x) defined on [−π, π] being defined by periodic extensionoutside [−π, π]. Assume that f is piecewise continuous on [−π, π] and has a finitediscontinuity at x0. Denote

f(x−0 ) = limx→x−0

f(x) and f(x+0 ) = lim

x→x+0

f(x).

Then from the Lemma 47.1 we may write:

Sn(x0) =1

π

∫ π

−π

f(x0 − u) · sin(n + 1

2

)u

2 sin 12u

du.

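We have also:

\[ \frac{1}{2} f(x_0^+) = \frac{1}{\pi} \int_{-\pi}^{0} f(x_0^+) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du \]

and

\[ \frac{1}{2} f(x_0^-) = \frac{1}{\pi} \int_{0}^{\pi} f(x_0^-) \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du. \]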

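From the above results we deduce:

\[ S_n(x_0) - \frac{1}{2}\left[f(x_0^+) + f(x_0^-)\right] = \frac{1}{\pi} \int_{-\pi}^{0} \left[f(x_0 - u) - f(x_0^+)\right] \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du + \frac{1}{\pi} \int_{0}^{\pi} \left[f(x_0 - u) - f(x_0^-)\right] \cdot \frac{\sin\left(n + \frac{1}{2}\right)u}{2 \sin \frac{1}{2}u} \, du. \]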


The integrands on the right-hand side are well defined everywhere except, maybe, at $u = 0$, where they require examination. The first integrand can be written in the form

\[ F_1(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \]

where

\[ F_1(u) = \frac{f(x_0 - u) - f(x_0^+)}{u} \cdot \frac{\frac{1}{2}u}{\sin \frac{1}{2}u}. \]

As $u \to 0$, the second factor tends to 1 and, when the right-hand derivative of $f$ exists at $x = x_0$, the first factor tends to $-f'(x_0^+)$. So $F_1(0) = -f'(x_0^+)$ and the integrand is well defined at $u = 0$. Similarly, if

\[ F_2(u) = \frac{f(x_0 - u) - f(x_0^-)}{u} \cdot \frac{\frac{1}{2}u}{\sin \frac{1}{2}u} \]

and if the left-hand derivative of $f$ exists at $x = x_0$, then $F_2(0) = -f'(x_0^-)$ and the second integrand is also well defined at $u = 0$. We may thus write

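\[ S_n(x_0) - \frac{1}{2}\left[f(x_0^+) + f(x_0^-)\right] = \frac{1}{\pi} \int_{-\pi}^{0} F_1(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \, du + \frac{1}{\pi} \int_{0}^{\pi} F_2(u) \cdot \sin\left(n + \tfrac{1}{2}\right)u \, du. \]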

Applying Lemma 47.2 we conclude that

\[ \lim_{n \to \infty} S_n(x_0) = \frac{1}{2}\left[f(x_0^+) + f(x_0^-)\right]. \]

If $f$ is continuous at $x_0$, then

\[ \lim_{n \to \infty} S_n(x_0) = f(x_0). \]

We have thus proved one form of the Fourier theorem on convergence of Fourier series.
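The behaviour at a jump can be illustrated with a step function taking the value $a$ on $[-\pi, 0]$ and $b$ on $(0, \pi]$. The sketch below is an illustration only: the coefficients $a_0 = a + b$, $a_n = 0$, $b_n = (b - a)(1 - (-1)^n)/(n\pi)$ were computed by hand from the usual coefficient formulas and are an assumption of this example, and the function name `partial_sum` is hypothetical.

```python
import math

def partial_sum(x, a, b, n_terms):
    """Partial Fourier sum S_n(x) of the step function equal to a on [-pi, 0] and b on (0, pi].

    Assumed (hand-computed) coefficients:
    a_0 = a + b, a_n = 0, b_n = (b - a) * (1 - (-1)**n) / (n * pi).
    """
    s = (a + b) / 2.0
    for n in range(1, n_terms + 1):
        b_n = (b - a) * (1 - (-1) ** n) / (n * math.pi)
        s += b_n * math.sin(n * x)
    return s

# At the jump x = 0 every partial sum equals the mean of the one-sided limits.
at_jump = partial_sum(0.0, 1.0, 3.0, 500)
# At the continuity point x = pi/2 the partial sums approach f(pi/2) = b.
at_continuity = partial_sum(math.pi / 2, 1.0, 3.0, 2000)
```

With $a = 1$ and $b = 3$, the partial sums at the jump equal $2 = \frac{1}{2}(a + b)$, while at $x = \pi/2$ they approach $b = 3$ as the number of terms grows.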

Example 47.2.

a) Deduce the Fourier series expansion of $f(x) = \pi^2 - x^2$ in the interval $[-\pi, \pi]$.

b) Deduce the Fourier series expansion of the function $f(x) = |x|$ in the interval $[-\pi, \pi]$.

c) Deduce the Fourier series expansion of the function

\[ f(x) = \begin{cases} a & \text{if } x \in [-\pi, 0] \\ b & \text{if } x \in (0, \pi] \end{cases} \]

in the interval $[-\pi, \pi]$.


48 Different forms of Fourier series

Theorem 48.1 (change of the origin of the fundamental interval). If $f(x)$ is a piecewise continuous function defined in the fundamental interval $[-\pi, \pi]$ and by periodic extension outside it, then for any $\alpha$, the Fourier coefficients $a_n$, $b_n$ are given by

\[ a_n = \frac{1}{\pi} \int_{\alpha - \pi}^{\alpha + \pi} f(x) \cdot \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]

\[ b_n = \frac{1}{\pi} \int_{\alpha - \pi}^{\alpha + \pi} f(x) \cdot \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]

The Fourier series of $f(x)$ converges at every point of continuity and:

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} (a_n \cos nx + b_n \sin nx) \quad \text{for } x \in [\alpha - \pi, \alpha + \pi]. \]

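Theorem 48.2 (change of the interval length). The Fourier expansion of the piecewise continuous function $f(x)$ defined on $[-L, L]$ is the series

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left( a_n \cos \frac{n\pi x}{L} + b_n \sin \frac{n\pi x}{L} \right) \]

with

\[ a_n = \frac{1}{L} \int_{-L}^{L} f(x) \cdot \cos \frac{n\pi x}{L} \, dx \quad \text{for } n = 0, 1, 2, \dots \]

and

\[ b_n = \frac{1}{L} \int_{-L}^{L} f(x) \cdot \sin \frac{n\pi x}{L} \, dx \quad \text{for } n = 1, 2, \dots \]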
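The coefficient formulas on $[-L, L]$ can be evaluated approximately by numerical quadrature. The following sketch assumes a simple midpoint rule; the helper name `fourier_coefficients` is hypothetical, and the reference value for $b_1$ of $f(x) = x^3$ on $[-1, 1]$ was obtained by integrating by parts by hand.

```python
import math

def fourier_coefficients(f, L, n, steps=20000):
    """Midpoint-rule approximations of a_n and b_n of f on [-L, L]."""
    h = 2.0 * L / steps
    a_n = 0.0
    b_n = 0.0
    for k in range(steps):
        x = -L + (k + 0.5) * h
        a_n += f(x) * math.cos(n * math.pi * x / L)
        b_n += f(x) * math.sin(n * math.pi * x / L)
    return a_n * h / L, b_n * h / L

# f(x) = x^3 on [-1, 1] is odd, so a_1 should vanish.
a1, b1 = fourier_coefficients(lambda x: x ** 3, 1.0, 1)
# Hand-computed value: b_1 = 2 * (1/pi - 6/pi^3).
b1_exact = 2.0 * (1.0 / math.pi - 6.0 / math.pi ** 3)
```

The numerical values agree with the hand computation up to the quadrature error, and the vanishing of $a_1$ reflects the oddness of $x^3$.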

Example 48.1.

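a) Deduce the Fourier series expansion of the function

\[ f(x) = \begin{cases} -x & \text{for } -\frac{\pi}{2} \le x < 0 \\ x & \text{for } 0 \le x < \pi \\ 2\pi - x & \text{for } \pi \le x \le \frac{3\pi}{2}. \end{cases} \]

b) Deduce the Fourier series expansion of $f(x) = x^3$ for $-1 \le x \le 1$.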

When $f(x)$ is an even function defined on the interval $[-\pi, \pi]$, then $f(-x) = f(x)$. Thus, it follows directly that $f(x) \cdot \cos nx$ is an even function, because $\cos nx$ is even, and $f(x) \cdot \sin nx$ is an odd function, because $\sin nx$ is odd.

Consider the Fourier coefficients $a_n$ of an even function $f(x)$, that we choose to write in the form

\[ a_n = \frac{1}{\pi} \int_{-\pi}^{0} f(x) \cdot \cos nx \, dx + \frac{1}{\pi} \int_{0}^{\pi} f(x) \cdot \cos nx \, dx. \]


Then, changing the variable in the first integral by writing $u = -x$, employing the even nature of the integrand to replace $f(-u) \cdot \cos n(-u)$ by $f(u) \cdot \cos nu$ and changing the sign of the integral by reversing the limits, we find

\[ a_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]

The same argument applied to the coefficients bn shows that

bn = 0 for n = 1, 2, . . .

Consequently, if $f(x)$ is an even function on $[-\pi, \pi]$, its Fourier series contains only cosine functions and is of the form

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{for } x \in [-\pi, \pi]. \]

This is called the Fourier cosine expansion of the even function f(x) in [−π, π].

Example 48.2.

a) Deduce the Fourier series expansion of the even function $f(x) = x^2$ in $[-\pi, \pi]$.

b) Deduce the Fourier series expansion of the even function f(x) = |x| in [−π, π].

When $f(x)$ is an odd function defined on the interval $[-\pi, \pi]$, then $f(-x) = -f(x)$. A similar argument as in the case of even functions leads us to

\[ a_n = 0 \quad \text{for } n = 0, 1, \dots \]

and

\[ b_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]

from which it follows that the Fourier series of an odd function defined on $[-\pi, \pi]$ contains only sine functions and is of the form

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad \text{for } x \in [-\pi, \pi]. \]

This is called the Fourier sine expansion of the odd function f(x) in [−π, π].

These results can be usefully interpreted in terms of an arbitrary function $f(x)$ which is to be expanded in the half interval $[0, \pi]$. Defining a new function $g(x)$ by the rule

\[ g(x) = \begin{cases} f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi \end{cases} \]

we see that $g(x)$ is an even function which is equal to $f(x)$ in the required interval $[0, \pi]$. Thus, as a Fourier cosine expansion of $g(x)$ only requires the knowledge of $g(x)$ in the half interval $[0, \pi]$ in which $g(x) = f(x)$, it follows that

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx \quad \text{for } x \in [0, \pi] \]

is the desired cosine expansion of $f(x)$ in $[0, \pi]$.

Alternatively, we may expand the same function $f(x)$ in the half interval $[0, \pi]$ in a Fourier sine series as follows: define a new function $h(x)$ by the rule

\[ h(x) = \begin{cases} -f(-x) & \text{for } -\pi \le x < 0 \\ f(x) & \text{for } 0 \le x \le \pi. \end{cases} \]

Then, $h(x)$ is an odd function which is equal to $f(x)$ in the required interval $[0, \pi]$. The Fourier sine expansion of $h(x)$ only requires the knowledge of $h(x)$ in the half interval $[0, \pi]$ where $h(x) = f(x)$. So

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx \quad \text{for } 0 \le x \le \pi \]

provides the desired sine expansion of $f(x)$ for $x \in [0, \pi]$. These expansions are often called the half-range expansions of $f(x)$.

We have proved the following theorem:

Theorem 48.3 (Fourier sine and cosine series). If $f(x)$ is an arbitrary function defined and piecewise continuous on $[0, \pi]$, then it may either be expanded as a Fourier cosine series

\[ f(x) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos nx, \quad 0 \le x \le \pi \]

in which

\[ a_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \cos nx \, dx \quad \text{for } n = 0, 1, 2, \dots \]

or as a Fourier sine series

\[ f(x) = \sum_{n=1}^{\infty} b_n \sin nx, \quad 0 \le x \le \pi \]

in which

\[ b_n = \frac{2}{\pi} \int_{0}^{\pi} f(x) \sin nx \, dx \quad \text{for } n = 1, 2, \dots \]
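As an illustration of the half-range formulas, the sketch below approximates the cosine and sine coefficients of $f(x) = x$ on $[0, \pi]$ with a midpoint rule. The helper names are hypothetical, and the reference values $a_0 = \pi$, $a_1 = -4/\pi$ and $b_1 = 2$ were obtained by hand from the formulas of the theorem.

```python
import math

def cosine_coefficient(f, n, steps=20000):
    """Midpoint-rule approximation of a_n = (2/pi) * integral of f(x) cos(nx) over [0, pi]."""
    h = math.pi / steps
    total = sum(f((k + 0.5) * h) * math.cos(n * (k + 0.5) * h) for k in range(steps))
    return 2.0 / math.pi * total * h

def sine_coefficient(f, n, steps=20000):
    """Midpoint-rule approximation of b_n = (2/pi) * integral of f(x) sin(nx) over [0, pi]."""
    h = math.pi / steps
    total = sum(f((k + 0.5) * h) * math.sin(n * (k + 0.5) * h) for k in range(steps))
    return 2.0 / math.pi * total * h

def identity(x):
    return x

a0 = cosine_coefficient(identity, 0)
a1 = cosine_coefficient(identity, 1)
b1 = sine_coefficient(identity, 1)
```

The same function thus has two different half-range expansions on $[0, \pi]$, one with cosine coefficients and one with sine coefficients.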


Part III

Functions of several variables

49 Topology in Rn

Definition 49.1. The set $\mathbb{R}^n$ is the collection of all the finite sequences $x = (x_1, x_2, \dots, x_n)$ of $n$ real numbers:

\[ \mathbb{R}^n = \{(x_1, x_2, \dots, x_n) \mid x_i \in \mathbb{R}^1, \ i = 1, 2, \dots, n\}. \]

Definition 49.2. A real valued function of $n$ variables associates to every finite sequence of $n$ real numbers of a set $A \subset \mathbb{R}^n$ a unique real number.

Formally, $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is given by

\[ (x_1, x_2, \dots, x_n) = x \mapsto f(x) = f(x_1, x_2, \dots, x_n) \]

where $x = (x_1, x_2, \dots, x_n)$ is an element of $A \subset \mathbb{R}^n$.

Example 49.1. The function $f : \mathbb{R}^2 \to \mathbb{R}^1$ given by $f(x_1, x_2) = x_1^2 + x_2^2$ is a real valued function of two variables.

Definition 49.3. A vector valued function $f$ of $n$ variables associates to every finite sequence $x = (x_1, x_2, \dots, x_n)$ of $n$ real numbers of the set $A \subset \mathbb{R}^n$ a unique vector $f(x)$ from $\mathbb{R}^m$.

Formally, $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is given by

\[ x \mapsto f(x_1, x_2, \dots, x_n) = (f_1(x_1, x_2, \dots, x_n), \dots, f_m(x_1, x_2, \dots, x_n)) \]

where $x = (x_1, x_2, \dots, x_n) \in A$ and $(f_1(x_1, x_2, \dots, x_n), \dots, f_m(x_1, x_2, \dots, x_n)) \in \mathbb{R}^m$.

Example 49.2. The function f : R3 → R2 given by

f(x1, x2, x3) = (x1 + x2 + x3, x1 · x2 · x3)

is a vector function of three variables. Here f1(x1, x2, x3) = x1+x2+x3 and f2(x1, x2, x3) =x1 · x2 · x3.

If $f(x_1, x_2, \dots, x_n) = (f_1(x_1, x_2, \dots, x_n), \dots, f_m(x_1, x_2, \dots, x_n))$, then $f_i$ are real valued functions of $n$ variables for $i = 1, \dots, m$ and are called the scalar components of the vector function $f$.

When functions of one real variable were discussed, it was necessary to investigate real numbers which were close to a fixed real number $a$. This led to an interest in the quantity $|x - a|$. Analogues of length and distance in $\mathbb{R}^n$ can be obtained as follows:


$\mathbb{R}^n$ is organized as an $n$-dimensional vector space using the sum and the multiplication by a scalar defined by:

(x1, x2, . . . , xn) + (y1, y2, . . . , yn) = (x1 + y1, x2 + y2, . . . , xn + yn)

k(x1, x2, . . . , xn) = (kx1, kx2, . . . , kxn)

For $x \in \mathbb{R}^n$ the norm (or length) of $x$ is defined by

\[ \|x\| = \sqrt{\sum_{i=1}^{n} x_i^2} = \sqrt{x_1^2 + x_2^2 + \dots + x_n^2}. \]

The distance between $x$ and $a = (a_1, a_2, \dots, a_n)$ is taken to be $\|x - a\|$. So

\[ \|x - a\| = \sqrt{\sum_{i=1}^{n} (x_i - a_i)^2} = \sqrt{(x_1 - a_1)^2 + (x_2 - a_2)^2 + \dots + (x_n - a_n)^2}. \]

A neighborhood of $a \in \mathbb{R}^n$ is a set $V \subset \mathbb{R}^n$ which contains a hypersphere $S_r(a)$ centered at $a$,

\[ S_r(a) = \{x \in \mathbb{R}^n \mid \|x - a\| < r\}, \quad r > 0, \quad S_r(a) \subset V. \]

Deleting a from a neighborhood V of a, a deleted neighborhood of a will be obtained:

V 0 = V \ {a}.

Example 49.3. For any $(a_1, a_2) \in \mathbb{R}^2$ a hypersphere $S_r(a)$ is a neighborhood of $a = (a_1, a_2)$. For example, the set

\[ S_1(0, -1) = \{(x_1, x_2) \mid x_1^2 + (x_2 + 1)^2 < 1\} \]

is a neighborhood of a = (0,−1).
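The norm, the distance and the membership test for an open hypersphere translate directly into code. The following is a minimal sketch; the function names `norm`, `distance` and `in_open_ball` are hypothetical and points are represented as plain tuples.

```python
import math

def norm(x):
    """Euclidean norm of a point of R^n given as a sequence of numbers."""
    return math.sqrt(sum(xi * xi for xi in x))

def distance(x, a):
    """Distance between x and a, i.e. the norm of x - a."""
    return norm([xi - ai for xi, ai in zip(x, a)])

def in_open_ball(x, a, r):
    """True when x belongs to the open hypersphere S_r(a)."""
    return distance(x, a) < r

# The point (0.5, -1) lies in S_1(0, -1); the point (1, -1) lies on its boundary, not inside.
inside = in_open_ball((0.5, -1.0), (0.0, -1.0), 1.0)
on_boundary = in_open_ball((1.0, -1.0), (0.0, -1.0), 1.0)
```

The strict inequality in `in_open_ball` mirrors the strict inequality in the definition of $S_r(a)$.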

Now the limit of a sequence $(x_k)$ of points of $\mathbb{R}^n$ can be defined. A sequence $(x_k)$ of points of $\mathbb{R}^n$ is a function whose domain is the set of natural numbers and whose values belong to $\mathbb{R}^n$. The value of the function corresponding to argument $k$ is denoted by $x_k$. The sequence $x_1, x_2, \dots, x_k, \dots$ is denoted by $(x_k)$.

Definition 49.4. A vector $x \in \mathbb{R}^n$ is said to be the limit of the sequence $(x_k)$ if for any $\varepsilon > 0$ there exists $N = N(\varepsilon) > 0$ such that for any $k > N$ we have $\|x_k - x\| < \varepsilon$. In this case we write $\lim_{k \to \infty} x_k = x$.

The limit of the sequence $(x_k)$, if it exists, is unique. If a sequence $(x_k)$ converges to $x$, then the sequence is bounded, i.e. there exists $M > 0$ such that $\|x_k\| < M$ for any $k \in \mathbb{N}$.

If a sequence $(x_k)$ converges to $x$, then any subsequence $(x_{k_l})$ of the sequence $(x_k)$ converges to $x$.


A sequence $(x_k)$, $x_k = (x_{1k}, x_{2k}, \dots, x_{nk}) \in \mathbb{R}^n$, converges to $x = (x_1, x_2, \dots, x_n) \in \mathbb{R}^n$ if and only if the sequence $(x_{ik})$ converges to $x_i$ for any $i = 1, 2, \dots, n$.

According to the Bolzano-Weierstrass theorem, any bounded sequence $(x_k)$ of points of $\mathbb{R}^n$ contains a convergent subsequence.

Cauchy's criterion for the convergence of a sequence $(x_k)$ of points $x_k \in \mathbb{R}^n$ states that $(x_k)$ converges if and only if for any $\varepsilon > 0$ there exists $N_\varepsilon$ such that for $p, q > N_\varepsilon$ we have $\|x_p - x_q\| < \varepsilon$.

Definition 49.5. Let $A \subset \mathbb{R}^n$. A point $x \in \mathbb{R}^n$ is called an interior point of the set $A$ if there exists a hypersphere $S_r(x)$ such that $S_r(x) \subset A$.

For instance, a point y of a hypersphere Sr(a) is an interior point of the hypersphere.

Definition 49.6. The interior set of $A \subset \mathbb{R}^n$ is the set of all interior points of the set $A$. Usually the interior of a set $A$ is denoted by $\mathrm{Int}(A)$.

For instance, if A is the hypersphere Sr(a), then Int(A) = Sr(a) = A.

Definition 49.7. A set A ⊂ Rn is said to be open if A = Int(A).

For instance, any hypersphere Sr(x) is an open set.

A set A ⊂ Rn is open if and only if it contains a neighborhood of each of its points.

The union of any family of open sets is open.

The sets Rn and ∅ are open.

The intersection of a finite number of open sets is open.

Definition 49.8. A set A ⊂ Rn is said to be closed if its complement is open.

The intersection of any family of closed sets is closed.

The union of a finite number of closed sets is closed.

The sets Rn and ∅ are closed.

A closed hypersphere $\overline{S}_r(a)$ defined as:

\[ \overline{S}_r(a) = \{x \in \mathbb{R}^n \mid \|x - a\| \le r\} \]

is closed.

Definition 49.9. A point $a \in \mathbb{R}^n$ is a limit point (or a point of accumulation) of the set $A \subset \mathbb{R}^n$ provided every deleted neighborhood of $a$ intersects $A$.

Definition 49.10. The closure $\overline{A}$ of a set $A \subset \mathbb{R}^n$ is the intersection of all closed sets containing $A$.

The set of points in $\overline{A}$ and not in the interior $\mathrm{Int}(A)$ of $A$ is called the boundary of $A$ and it is denoted by $\partial A$.


The closure operation has the properties:

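• $\overline{A \cup B} = \overline{A} \cup \overline{B}$;

• $\overline{A} \supset A$;

• $\overline{\overline{A}} = \overline{A}$;

• $A = \overline{A} \Leftrightarrow A$ is closed;

• $x \in \overline{A}$ if and only if every neighborhood $V$ of $x$ intersects $A$.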

Definition 49.11. A set A ⊂ Rn is bounded if there exists r > 0 such that A ⊂ Sr(0).

Definition 49.12. A set A ⊂ Rn is compact if it is both bounded and closed.

For instance, a closed hypersphere $\overline{S}_r(a)$ is a compact set.

Example 49.4. The set $D$ defined by

\[ D = \{(x, y) \mid x + y \le 1 \text{ and } x \ge 0 \text{ and } y \ge 0\} \]

is a compact subset of $\mathbb{R}^2$.

Solution: $D$ is bounded since it is contained in the hypersphere $S_2(0) = \{(x, y) \mid x^2 + y^2 < 4\}$. If $a$ is an element of the complement of $D$, then $a$ lies a distance $r > 0$ away from at least one of the lines $x + y = 1$, $x = 0$ or $y = 0$. Hence, the open hypersphere $S_r(a)$ lies in one of the regions $x + y > 1$, $x < 0$ or $y < 0$. So $S_r(a)$ lies in the complement of $D$. Thus, $D$ is closed.

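Remark 49.1. If $A \subset \mathbb{R}^n$ is a compact set, then every sequence $(x_k)$ with $x_k \in A$ contains a subsequence $(x_{k_l})$ which converges to a point $x_0 \in A$.

Definition 49.13. A set $A \subset \mathbb{R}^n$ is connected if there are no open sets $G_1, G_2$ such that

\[ A \subset G_1 \cup G_2, \quad A \cap G_1 \ne \emptyset, \quad A \cap G_2 \ne \emptyset, \quad \text{and} \quad (A \cap G_1) \cap (A \cap G_2) = \emptyset. \]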

50 Limit of a function at a point

Definition 50.1. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued $n$ variable function and $a$ a point of accumulation of $A$ (i.e. every deleted neighborhood of $a$ contains at least one point $a' \in A$). The real number $L$ is called the limit of $f(x)$ as $x$ tends to $a$ if for any $\varepsilon > 0$, there exists $\delta = \delta(\varepsilon) > 0$ such that

\[ 0 < \|x - a\| < \delta \Rightarrow |f(x) - L| < \varepsilon. \]

We write

\[ \lim_{x \to a} f(x) = L. \]


Just like in the case of functions of one variable, this definition is technically difficult to implement except for the simplest functions. However, the obvious generalization of the sum, product and quotient rules can be proved. Their use is illustrated in the following example.

Example 50.1. Evaluate $\lim_{(x,y) \to (2,1)} f(x, y)$ where $f(x, y) = \dfrac{x^2 - y^2}{x^2 + y^2}$.

Solution: As $x \to 2$ and $y \to 1$, $x^2 - y^2 \to 3$ and $x^2 + y^2 \to 5$. Hence

\[ \frac{x^2 - y^2}{x^2 + y^2} \xrightarrow[(x,y) \to (2,1)]{} \frac{3}{5}. \]

Definition 50.2. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a vector valued function of $n$ variables and $a$ a point of accumulation of $A$. $L \in \mathbb{R}^m$ is called the limit of $f(x)$ as $x$ tends to $a$, if for any $\varepsilon > 0$, there exists $\delta > 0$ such that

\[ 0 < \|x - a\| < \delta \Rightarrow \|f(x) - L\| < \varepsilon. \]

We write

\[ L = \lim_{x \to a} f(x). \]

If $f(x_1, \dots, x_n) = (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n))$ and $L = (L_1, \dots, L_m)$, then the following statement holds:

\[ \lim_{x \to a} f(x) = L \Leftrightarrow \lim_{x \to a} f_i(x) = L_i, \quad i = 1, \dots, m. \]

Example 50.2. Evaluate $\lim_{(x,y) \to (2,1)} f(x, y)$ where $f(x, y) = \left( \dfrac{xy}{x^2 + y^2}, \ x - y \right)$.

Theorem 50.1 (Heine's criterion for the limit). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ has a limit as $x$ approaches $a$ if and only if for any sequence $(x_k)$, $x_k \in A$, $x_k \ne a$, and $x_k \to a$ as $k \to \infty$, the sequence $(f(x_k))$ converges.

Proof. As for the one variable real valued functions.

Theorem 50.2 (Cauchy-Bolzano's criterion for the limit). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ has a limit as $x \to a$ if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that

\[ 0 < \|x' - a\| < \delta \text{ and } 0 < \|x'' - a\| < \delta \Rightarrow \|f(x') - f(x'')\| < \varepsilon. \]

Proof. As for the one variable real valued functions.

51 Continuity

Definition 51.1. A real valued $n$ variable function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous at $a \in A$ if $\lim_{x \to a} f(x) = f(a)$.


This definition requires three things: firstly that $\lim_{x \to a} f(x)$ exists, secondly that $f(a)$ is defined, and finally that the previous two values are equal.

In terms of ε, δ this is equivalent to the following:

Definition 51.2. A function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous at $a \in A$ if for every $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that

\[ \|x - a\| < \delta \Rightarrow |f(x) - f(a)| < \varepsilon. \]

Definition 51.3. A vector valued $n$-variable function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$ if for every $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that

\[ \|x - a\| < \delta \Rightarrow \|f(x) - f(a)\| < \varepsilon. \]

Example 51.1. Use the $\varepsilon, \delta$ condition of continuity to prove that the following functions are continuous at the mentioned points:

a) $f(x, y) = x^2 + y^2$ at $x = y = 0$;

b) $f(x, y) = (x^2 - y^2, \ x \cdot y)$ at $x = 1$, $y = 1$;

c) $f(x, y, z) = x + y + z$ at $x = y = z = 0$;

d) $f(x, y, z) = (x^2 + y^2 + z^2, \ x + y + z)$ at $x = y = z = 1$.

The rules for continuous functions of one variable can be generalized to give corresponding rules for functions of several variables. These are stated in the next two theorems.

Theorem 51.1. Let $f$ and $g$ be real valued functions of $n$ variables defined in a neighborhood of $a$. If $f$ and $g$ are continuous at $a$, then so are $f + g$, $f \cdot g$, and, when $f(a) \ne 0$, $\dfrac{1}{f}$.

Theorem 51.2. Let $f : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^m$ be continuous at $a \in A$ and $g : B \subset \mathbb{R}^m \to \mathbb{R}^p$ be continuous at $f(a) = b \in \mathbb{R}^m$. Then the composite function $g \circ f : A \to \mathbb{R}^p$ is continuous at $a$.

Discontinuities of functions of more than one variable are often difficult to spot. In the case of a function $f$ of two variables, any discontinuities can be visualized geometrically by appealing to the surface in $\mathbb{R}^3$ represented by the equation $z = f(x, y)$. Some, but by no means all, of the discontinuities of such a surface correspond to holes or tears.

Example 51.2. The function $f : \mathbb{R}^2 \to \mathbb{R}^1$ given by $f(x, y) = x^2 + y^2$ is continuous for all $(x, y)$. The surface given by $z = f(x, y)$ is a parabolic bowl. To see this, notice that a horizontal section for $z = k \ge 0$ gives the circle $x^2 + y^2 = k$ and a vertical cross-section for fixed $x$ or $y$ gives a parabola.

Example 51.3. Investigate the behavior of the function $f$ given by

\[ f(x, y) = \begin{cases} \dfrac{xy}{x^2 + y^2} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases} \]

as $(x, y)$ approaches $(0, 0)$.


Solution: $f$ is discontinuous at $(0, 0)$ and the discontinuity is much nastier than the removable or finite jump discontinuities seen for functions of one variable. Recall that when establishing the discontinuity of a function of one variable at some point, often the right- and left-hand limits existed, but were unequal. Clearly, if $\lim_{(x,y) \to (0,0)} f(x, y)$ exists, its value must be independent of the way in which $(x, y)$ approaches $(0, 0)$. Consider $\lim_{x \to 0} f(x, mx)$:

\[ \lim_{x \to 0} f(x, mx) = \lim_{x \to 0} \frac{m x^2}{x^2 + m^2 x^2} = \frac{m}{1 + m^2}. \]

But this quantity varies with $m$, so the limit does not exist and $f$ cannot be continuous at $(0, 0)$, no matter what value is specified for $f(0, 0)$.
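The dependence on the direction of approach can be checked by evaluating the function close to the origin along several lines $y = mx$. This is only a numerical illustration; the helper name `along_line` and the sample abscissa `1e-8` are arbitrary choices made for this sketch.

```python
def f(x, y):
    """The function of Example 51.3, with the value 0 assigned at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

def along_line(m, x=1e-8):
    """Value of f at a point close to the origin on the line y = m * x."""
    return f(x, m * x)

# Different slopes give different values arbitrarily close to (0, 0).
samples = {m: along_line(m) for m in (0.0, 1.0, 2.0)}
```

The sampled values match $m/(1 + m^2)$ for each slope, so no single limit value can exist at the origin.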

Example 51.4. Investigate the behavior of the function $f$ given by

\[ f(x, y) = \begin{cases} \dfrac{x y^3}{x^2 + y^6} & \text{if } (x, y) \ne (0, 0) \\ 0 & \text{if } (x, y) = (0, 0) \end{cases} \]

as $(x, y)$ approaches $(0, 0)$.

Solution: Firstly

\[ \lim_{x \to 0} f(x, mx) = \lim_{x \to 0} \frac{m^3 x^4}{x^2 + m^6 x^6} = \lim_{x \to 0} \frac{m^3 x^2}{1 + m^6 x^4} = 0. \]

However this is not sufficient evidence to suppose that $f(x, y)$ approaches 0 as $(x, y)$ approaches $(0, 0)$. In fact

\[ \lim_{x \to 0} f(x, \sqrt[3]{x}) = \lim_{x \to 0} \frac{x^2}{x^2 + x^2} = \frac{1}{2}. \]

Hence $f$ is discontinuous at $(0, 0)$.
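The contrast between straight-line approaches and the curve $y = \sqrt[3]{x}$ can also be seen numerically. In the sketch below the curve is parametrized as $(t^3, t)$ to avoid computing cube roots; the function name `g` and the sample points are assumptions of this illustration.

```python
def g(x, y):
    """The function of Example 51.4, with the value 0 assigned at the origin."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y ** 3 / (x * x + y ** 6)

# Along the line y = 2x, close to the origin, the values are close to 0.
on_line = g(1e-4, 2e-4)

# Along the curve x = t^3, y = t (i.e. y is the cube root of x) the value stays near 1/2.
t = 1e-3
on_curve = g(t ** 3, t)
```

Straight lines alone therefore cannot detect this discontinuity; a curved path of approach is needed.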

Theorem 51.3. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, $f(x) = (f_1(x), \dots, f_m(x))$ and $a \in A$. The function $f$ is continuous at $a \in A$ if and only if the functions $f_i$, $i = 1, 2, \dots, m$ are continuous at $a$.

Theorem 51.4 (Heine's criterion for continuity). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$ if and only if for any sequence $(x_k)$, $x_k \in A$, $x_k \to a$ as $k \to \infty$, the sequence $(f(x_k))$ converges to $f(a)$.

Theorem 51.5 (Cauchy-Bolzano's criterion for continuity). The function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$ if and only if for any $\varepsilon > 0$ there exists $\delta > 0$ such that

\[ \|x' - a\| < \delta \text{ and } \|x'' - a\| < \delta \Rightarrow \|f(x') - f(x'')\| < \varepsilon. \]

Remark 51.1. If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous at $a \in A$, then the function $\|f\| : A \subset \mathbb{R}^n \to \mathbb{R}^1_+$ defined by $\|f\|(x) = \|f(x)\|$ is continuous at $a$.

Remark 51.2. Generalizations of some important theorems, proved for single variable real valued continuous functions, require higher dimensional analogues of the topology in $\mathbb{R}^1$.


52 Important properties of continuous functions

Theorem 52.1 (The boundedness property). If $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous on the compact set $A$, then

a) the set $f(A)$ is bounded and

b) there exists $a \in A$ such that $\|f(a)\| = \sup \|f(A)\|$.

Proof.
a) Assume that $f(A)$ is unbounded. Then for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that $\|f(x_k)\| > k$. The sequence $(x_k)$ is bounded and therefore there exists a subsequence $(x_{k_l})$ of the sequence $(x_k)$ which converges towards a point $x_0 \in A$, $x_{k_l} \to x_0$ as $k_l \to \infty$. Hence, the sequence $(f(x_{k_l}))$ converges to $f(x_0)$. Therefore, there exists $N$ such that for $k_l > N$ we have $\|f(x_{k_l})\| \le \|f(x_0)\| + 1$. This contradicts $\|f(x_{k_l})\| > k_l$, which is absurd.

b) Consider $R = \sup \|f(A)\|$ and note that for every $k \in \mathbb{N}$ there exists $x_k \in A$ such that

\[ R - \frac{1}{k} < \|f(x_k)\| \le R. \]

For the sequence $(x_k)$ there exists a subsequence $(x_{k_l})$ such that $x_{k_l} \to x_0 \in A$ as $k_l \to \infty$. Therefore $f(x_{k_l}) \to f(x_0)$ and $\|f(x_{k_l})\| \to \|f(x_0)\|$ as $k_l \to \infty$. From the inequality

\[ R - \frac{1}{k_l} < \|f(x_{k_l})\| \le R \]

it follows that $\|f(x_0)\| = R$.

Corollary 52.1. If f : A ⊂ Rn → R1 is continuous on the compact set A, then:

a) there exists m,M such that

m = inf{f(x) | x ∈ A} M = sup{f(x) | x ∈ A}

b) there exist c and d, c, d ∈ A such that f(c) = m and f(d) = M .

Definition 52.1. A function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is uniformly continuous (on $A$) if for every $\varepsilon > 0$ there exists $\delta = \delta(\varepsilon) > 0$ such that for $x', x'' \in A$ we have

\[ \|x' - x''\| < \delta \Rightarrow \|f(x') - f(x'')\| < \varepsilon. \]

Theorem 52.2. A vector valued function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is uniformly continuous (on $A$) if and only if its scalar components $f_1, f_2, \dots, f_m : A \to \mathbb{R}^1$ are uniformly continuous.

Theorem 52.3. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous on the compact set $A$, then $f$ is uniformly continuous on $A$.

Theorem 52.4. Let $A \subset \mathbb{R}^n$, $A' \subset \mathbb{R}^m$ and $f : A \to A'$ such that $f(A) = A'$. The function $f$ is continuous on $A$ if and only if for every open set $G' \subset \mathbb{R}^m$, there exists an open set $G \subset \mathbb{R}^n$ such that

\[ G \cap A = f^{-1}(G' \cap A'). \]


Corollary 52.2. The function $f$ is continuous on $A$ if and only if for every closed set $F' \subset \mathbb{R}^m$, there exists a closed set $F \subset \mathbb{R}^n$ such that

\[ F \cap A = f^{-1}(F' \cap A'). \]

Theorem 52.5. If the set $A \subset \mathbb{R}^n$ is connected and the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is continuous, then the set $f(A)$ is connected.

Corollary 52.3. If the set $A \subset \mathbb{R}^n$ is compact and connected and the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is continuous, then $f(A)$ is a closed interval.

53 Differentiation

This section defines what is meant by saying that a function of $n$ variables is differentiable, but to begin, let's examine the concept of partial differentiability.

Definition 53.1. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued function of $n$ variables and $a$ an interior point of the set $A$. The function $f$ is said to be partially differentiable with respect to $x_i$ at $a$ if

\[ \lim_{h \to 0} \frac{f(a_1, \dots, a_{i-1}, a_i + h, a_{i+1}, \dots, a_n) - f(a_1, \dots, a_i, \dots, a_n)}{h} \]

exists. The value of this limit is denoted by $\dfrac{\partial f}{\partial x_i}(a)$ and is called the partial derivative of $f$ with respect to $x_i$ at the point $a$.

Definition 53.2. If $f$ is partially differentiable with respect to $x_i$ in a neighborhood of $a$, then the function $x \mapsto \dfrac{\partial f}{\partial x_i}(x)$ defined on that neighborhood is called the partial derivative of $f$ with respect to $x_i$.

Remark 53.1. To calculate partial derivatives, one has to differentiate (in the normal manner) with respect to $x_i$ keeping all the other variables fixed. Hence, the obvious rules for partially differentiating sums, products and quotients can be used.

Example 53.1. Calculate the partial derivatives of the function $f$ given by

\[ f(x, y, z) = x^2 y + x \cdot \sin y + \frac{y}{z}. \]

Solution: The partial derivative with respect to $x$ is

\[ \frac{\partial f}{\partial x} = 2xy + \sin y. \]

Similarly,

\[ \frac{\partial f}{\partial y} = x^2 + x \cos y + \frac{1}{z} \quad \text{and} \quad \frac{\partial f}{\partial z} = -\frac{y}{z^2}. \]
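The partial derivatives computed above can be cross-checked against difference quotients, following the definition of a partial derivative as a limit. The sketch below uses a central difference with a small fixed step; the helper names `partial` and `analytic_gradient` and the step size are assumptions of this illustration, not part of the notes.

```python
import math

def f(x, y, z):
    return x * x * y + x * math.sin(y) + y / z

def partial(func, point, i, h=1e-6):
    """Central-difference approximation of the i-th partial derivative of func at point."""
    up = list(point)
    down = list(point)
    up[i] += h
    down[i] -= h
    return (func(*up) - func(*down)) / (2.0 * h)

def analytic_gradient(x, y, z):
    """Partial derivatives of f as obtained in the solution of Example 53.1."""
    return (2 * x * y + math.sin(y),
            x * x + x * math.cos(y) + 1.0 / z,
            -y / (z * z))

point = (1.0, 2.0, 3.0)
numeric = [partial(f, point, i) for i in range(3)]
exact = analytic_gradient(*point)
```

At the sample point the difference quotients agree with the analytic partial derivatives up to the discretization error.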



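Example 53.2. Calculate the partial derivatives of the function $f$ given by

\[ f(x_1, \dots, x_n) = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} x_i x_j, \quad (x_1, \dots, x_n) \in \mathbb{R}^n. \]

Solution:

\[ \frac{\partial f}{\partial x_k} = \sum_{j=1}^{n} (a_{kj} + a_{jk}) x_j. \]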

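Definition 53.3. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be an $n$ variable vector valued function (i.e. $f(x_1, \dots, x_n) = (f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n))$) and $a$ an interior point of the set $A$. The function $f$ is said to be partially differentiable with respect to $x_i$ at $a$ if the scalar components $f_1(x_1, \dots, x_n), \dots, f_m(x_1, \dots, x_n)$ are partially differentiable with respect to $x_i$ at $a$. The vector $\left( \dfrac{\partial f_1}{\partial x_i}(a), \dots, \dfrac{\partial f_m}{\partial x_i}(a) \right)$ is called the partial derivative of $f$ with respect to $x_i$ at $a$ and it is denoted by $\dfrac{\partial f}{\partial x_i}(a)$:

\[ \frac{\partial f}{\partial x_i}(a) = \left( \frac{\partial f_1}{\partial x_i}(a), \dots, \frac{\partial f_m}{\partial x_i}(a) \right). \]

If $f = (f_1, \dots, f_m)$ is partially differentiable with respect to $x_i$ in a neighborhood of $a$, a function $\dfrac{\partial f}{\partial x_i}$ defined on that neighborhood, called the partial derivative of $f$ with respect to $x_i$, is obtained:

\[ x \mapsto \frac{\partial f}{\partial x_i}(x) = \left( \frac{\partial f_1}{\partial x_i}(x), \dots, \frac{\partial f_m}{\partial x_i}(x) \right). \]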


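Example 53.3. Calculate the partial derivatives of the function $f$ given by

\[ f(x, y, z) = (x + y + z, \ xy + xz + yz, \ xyz). \]

Solution:

\[ \frac{\partial f}{\partial x} = (1, \ y + z, \ yz); \quad \frac{\partial f}{\partial y} = (1, \ x + z, \ xz); \quad \frac{\partial f}{\partial z} = (1, \ x + y, \ xy). \]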

Definition 53.4. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ be a real valued function of $n$ variables, $a$ an interior point of the set $A$ and $u$ a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$). If the limit

\[ \lim_{t \to 0} \frac{f(a + t \cdot u) - f(a)}{t} \]

exists, it is called the directional derivative of $f$ at the point $a$ and it is denoted by

\[ \nabla_u f(a) = \lim_{t \to 0} \frac{f(a + t \cdot u) - f(a)}{t}. \]

Remark 53.2. Let $e_i = (0, \dots, 0, 1, 0, \dots, 0)$, with 1 in position $i$. The directional derivative of $f$ at $a$ in the direction $e_i$ is

\[ \nabla_{e_i} f(a) = \frac{\partial f}{\partial x_i}(a), \quad i = 1, \dots, n. \]

Hence, partial derivatives are special cases of directional derivatives.


Example 53.4. If $u = (u_x, u_y)$ and $f(x, y) = x \cdot y$, then

\[ \nabla_u f(x, y) = \frac{\partial f}{\partial x} \cdot u_x + \frac{\partial f}{\partial y} \cdot u_y = y \cdot u_x + x \cdot u_y. \]
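The formula of this example can be compared with a difference quotient taken along the direction $u$. The following is a hedged sketch: `directional_derivative` is a hypothetical helper using a symmetric difference quotient with a small fixed $t$, and the sample point and unit vector are arbitrary.

```python
def directional_derivative(func, a, u, t=1e-6):
    """Symmetric difference quotient of func at a in the direction of the unit vector u."""
    plus = [ai + t * ui for ai, ui in zip(a, u)]
    minus = [ai - t * ui for ai, ui in zip(a, u)]
    return (func(*plus) - func(*minus)) / (2.0 * t)

def f(x, y):
    return x * y

a = (2.0, 3.0)
u = (0.6, 0.8)  # unit vector: 0.36 + 0.64 = 1

numeric = directional_derivative(f, a, u)
# Formula of Example 53.4: y * u_x + x * u_y.
formula = a[1] * u[0] + a[0] * u[1]
```

For the sample data the formula gives $3 \cdot 0.6 + 2 \cdot 0.8 = 3.4$, and the difference quotient agrees with it up to rounding.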

Definition 53.5. Let $f = (f_1, \dots, f_m)$ be a vector valued $n$ variables function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, $a$ an interior point of $A$ and $u$ a unit vector in $\mathbb{R}^n$ (i.e. $\|u\| = 1$). If the limit

\[ \lim_{t \to 0} \frac{f(a + t \cdot u) - f(a)}{t} \]

exists, it is called the directional derivative of $f$ at the point $a$ and it is denoted by $\nabla_u f(a)$.

It is easy to see that $\nabla_u f(a) = (\nabla_u f_1(a), \dots, \nabla_u f_m(a))$.

Remark 53.3. The directional derivative $\nabla_u f(a)$ exists if the directional derivatives $\nabla_u f_i(a)$, $i = 1, \dots, m$ exist.

Example 53.5. The directional derivative of the function $f(x, y, z) = (xy + xz + yz, \ xyz)$ at the point $(x, y, z)$ is

\[ \nabla_u f(x, y, z) = ((y + z) u_x + (x + z) u_y + (x + y) u_z, \ yz \cdot u_x + xz \cdot u_y + xy \cdot u_z). \]

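Theorem 53.1. Let $f$ be a real valued function of $n$ variables, $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$, and $a$ an interior point of $A$. If the partial derivatives $\dfrac{\partial f}{\partial x_i}$, $i = 1, \dots, n$ exist in a neighborhood of $a$ and they are continuous at $a$, then the following equality holds:

\[ \lim_{h \to 0} \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = 0. \]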

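Proof. Consider the vectors $v_j = (a_1, a_2, \dots, a_j, a_{j+1} + h_{j+1}, \dots, a_n + h_n)$ for $j = 0, \dots, n - 1$ and $v_n = a$, and represent $f(a + h) - f(a)$ in the form:

\[ f(a + h) - f(a) = \sum_{j=0}^{n-1} [f(v_j) - f(v_{j+1})] = \sum_{j=0}^{n-1} \frac{\partial f}{\partial x_{j+1}}(v_{j+1} + \theta_{j+1} \cdot h_{j+1} \cdot e_{j+1}) \, h_{j+1} = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) \cdot h_i \quad \text{with } 0 \le \theta_i \le 1, \]

where $e_i = (0, \dots, 0, 1, 0, \dots, 0)$. Hence:

\[ \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = \frac{1}{\|h\|} \sum_{i=1}^{n} \left[ \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) - \frac{\partial f}{\partial x_i}(a) \right] \cdot h_i. \]

Now remark that $\|v_i + \theta_i \cdot h_i \cdot e_i - a\| \le \|h\|$, $i = 1, \dots, n$ and therefore, for $\varepsilon > 0$, there is $\delta > 0$ such that

\[ \|h\| < \delta \Rightarrow \left| \frac{\partial f}{\partial x_i}(v_i + \theta_i \cdot h_i \cdot e_i) - \frac{\partial f}{\partial x_i}(a) \right| < \frac{\varepsilon}{n}, \quad i = 1, \dots, n. \]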


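So,

\[ \|h\| < \delta \Rightarrow \frac{1}{\|h\|} \left| f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right| < \varepsilon. \]

The above theorem shows that in a small neighborhood of $a$ the function $f$ can be approximated by the polynomial of first degree $f(a) + \sum_{i=1}^{n} \dfrac{\partial f}{\partial x_i}(a) \cdot h_i$.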

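Definition 53.6. A real valued $n$ variables function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is said to be differentiable at $a$ if it is partially differentiable at $a$ with respect to every variable $x_i$ and

\[ \lim_{h \to 0} \frac{1}{\|h\|} \left[ f(a + h) - f(a) - \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i \right] = 0. \]

The function $d_a f : \mathbb{R}^n \to \mathbb{R}^1$ defined by

\[ d_a f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(a) \cdot h_i, \quad \forall h \in \mathbb{R}^n \]

is called the Frechet derivative of $f$ at $a$.

Remark 53.4. The Frechet derivative $d_a f : \mathbb{R}^n \to \mathbb{R}^1$ of a function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ at $a$ is a linear function on $\mathbb{R}^n$. It is a polynomial of first degree in $h_1, h_2, \dots, h_n$.

Remark 53.5. For $\|h\| = 1$, we have $d_a f(h) = \nabla_h f(a)$.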

Example 53.6. Show that the following functions are differentiable and compute their Frechet derivatives.

a) $f(x, y) = x^2 + y^2$, $d_{(x,y)} f(h_x, h_y) = 2x \cdot h_x + 2y \cdot h_y$;

b) $f(x, y) = x \cdot y$, $d_{(x,y)} f(h_x, h_y) = y \cdot h_x + x \cdot h_y$;

c) $f(x, y, z) = x \cdot y + x \cdot z + y \cdot z$, $d_{(x,y,z)} f(h_x, h_y, h_z) = (y + z) \cdot h_x + (x + z) \cdot h_y + (x + y) \cdot h_z$.

Remark 53.6. If the real valued function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ is differentiable at $a \in A$, then $f$ is continuous at $a$.

Definition 53.7. Let $f = (f_1, \dots, f_m)$ be a vector valued function of $n$ variables, $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$, and $a$ an interior point of $A$. The function $f$ is said to be differentiable at $a$ if every scalar component $f_j$, $j = 1, \dots, m$ of $f$ is differentiable at $a$. The function $d_a f : \mathbb{R}^n \to \mathbb{R}^m$ defined by

\[ d_a f(h) = \sum_{j=1}^{m} \left( \sum_{i=1}^{n} \frac{\partial f_j}{\partial x_i}(a) \cdot h_i \right) \cdot e_j \]

is called the Frechet derivative of $f$ at $a$, where $e_j = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{R}^m$.


The Frechet derivative is a set of $m$ polynomials of first degree in $h_1, h_2, \dots, h_n$.

Example 53.7. Show that $f(x_1, x_2, x_3) = (x_1 x_2 x_3, \ x_1^2 + x_2^2 + x_3^2)$ is differentiable at any point and compute its Frechet derivative.

Solution: $d_x f(h) = (x_2 x_3 h_1 + x_1 x_3 h_2 + x_1 x_2 h_3, \ 2 x_1 h_1 + 2 x_2 h_2 + 2 x_3 h_3)$.

Definition 53.8. The matrix of the linear function $d_a f$ is called the Jacobi matrix of $f$ at $a$. This is an $m \times n$ matrix and is given by

\[ J_a(f) = \left( \frac{\partial f_i}{\partial x_j}(a) \right)_{m \times n}. \]

We have $d_a f(h) = J_a(f) \cdot h$.
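The relation between the Jacobi matrix and the partial derivatives can be checked on the function of Example 53.7. The sketch below builds the matrix column by column from central differences and compares it with the hand-computed entries; all helper names (`numeric_jacobian`, `analytic_jacobian`, `apply`) are hypothetical.

```python
def f(x):
    x1, x2, x3 = x
    return (x1 * x2 * x3, x1 ** 2 + x2 ** 2 + x3 ** 2)

def numeric_jacobian(func, a, h=1e-6):
    """m x n matrix of central-difference approximations of the partial derivatives at a."""
    m, n = len(func(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up, down = list(a), list(a)
        up[j] += h
        down[j] -= h
        f_up, f_down = func(up), func(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2.0 * h)
    return J

def analytic_jacobian(x):
    """Jacobi matrix of f, computed by hand from the partial derivatives."""
    x1, x2, x3 = x
    return [[x2 * x3, x1 * x3, x1 * x2],
            [2 * x1, 2 * x2, 2 * x3]]

def apply(J, h):
    """Matrix-vector product J * h, i.e. the Frechet derivative applied to h."""
    return [sum(Jij * hj for Jij, hj in zip(row, h)) for row in J]

a = [1.0, 2.0, 3.0]
J_num = numeric_jacobian(f, a)
J_exact = analytic_jacobian(a)
```

Applying `apply(J_exact, h)` to a vector $h$ reproduces the two first-degree polynomials of the Frechet derivative found in Example 53.7.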

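Example 53.8. Consider the vector valued function of $n$ variables $f : \mathbb{R}^n \to \mathbb{R}^m$ defined by

\[ f(x_1, \dots, x_n) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} x_j \right) \cdot e_i. \]

Show that $f$ is Frechet differentiable at any point $x \in \mathbb{R}^n$ and the following relations hold:

\[ d_x f(h) = \sum_{i=1}^{m} \left( \sum_{j=1}^{n} a_{ij} h_j \right) \cdot e_i \]

\[ J_x(f) = \left( \frac{\partial f_i}{\partial x_j}(x) \right)_{m \times n} = (a_{ij})_{m \times n}. \]

Remark 53.7. If the vector valued function of $n$ variables $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ is differentiable at $a \in A$, then $f$ is continuous at $a$.

Theorem 53.2. Let $f : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^m$ and $g : B \subset \mathbb{R}^m \to \mathbb{R}^p$. If $f$ is differentiable at $a \in \mathrm{Int}(A)$ and $g$ is differentiable at $f(a) = b \in \mathrm{Int}(B)$, then $h = g \circ f$ is differentiable at $a$ and $d_a h = d_b g \circ d_a f$.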

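Proof. $f$ differentiable at $a$ implies

\[ f(x) - f(a) = d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \]

with $\varepsilon_1(x) \to 0$ as $x \to a$. $g$ differentiable at $b = f(a)$ implies

\[ g(y) - g(b) = d_b g(y - b) + \varepsilon_2(y) \cdot \|y - b\| \]

with $\varepsilon_2(y) \to 0$ as $y \to b$. Hence

\[ \begin{aligned} h(x) - h(a) &= g(f(x)) - g(f(a)) = d_b g(f(x) - f(a)) + \varepsilon_2(f(x)) \cdot \|f(x) - f(a)\| \\ &= d_b g\big(d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\|\big) + \varepsilon_2(f(x)) \cdot \big\| d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \big\| \\ &= d_b g \circ d_a f(x - a) + \|x - a\| \, d_b g(\varepsilon_1(x)) + \big\| d_a f(x - a) + \varepsilon_1(x) \|x - a\| \big\| \cdot \varepsilon_2(f(x)). \end{aligned} \]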


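Denote

\[ \varepsilon_3(x) = \frac{h(x) - h(a) - d_b g \circ d_a f(x - a)}{\|x - a\|} = d_b g(\varepsilon_1(x)) + \frac{\big\| d_a f(x - a) + \varepsilon_1(x) \cdot \|x - a\| \big\|}{\|x - a\|} \cdot \varepsilon_2(f(x)). \]

Hence

\[ \|\varepsilon_3(x)\| \le \|d_b g\| \cdot \|\varepsilon_1(x)\| + (\|d_a f\| + \|\varepsilon_1(x)\|) \cdot \|\varepsilon_2(f(x))\| \]

and $\varepsilon_3(x) \to 0$ as $x \to a$.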

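Remark 53.8. The Jacobi matrix of $h$ at $a$ is the product of the Jacobi matrix of $g$ at $b$ and the Jacobi matrix of $f$ at $a$:

\[ \frac{\partial h_i}{\partial x_j}(a) = \sum_{k=1}^{m} \frac{\partial g_i}{\partial y_k}(b) \cdot \frac{\partial f_k}{\partial x_j}(a), \quad i = 1, \dots, p, \quad j = 1, \dots, n. \]

Example 53.9. Consider $f : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $f(x_1, x_2) = (x_1 + x_2, \ x_1 \cdot x_2)$ and $g : \mathbb{R}^2 \to \mathbb{R}^2$ defined by $g(\rho, \theta) = (\rho \cos \theta, \ \rho \sin \theta)$. Find $h(\rho, \theta) = (f \circ g)(\rho, \theta)$ and $\dfrac{\partial h}{\partial \rho}$, $\dfrac{\partial h}{\partial \theta}$.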
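The product rule for Jacobi matrices can be tested numerically on the two maps of this example, where the outer map is $f$ and the inner map is $g$. The sketch below is an illustration under stated assumptions: the Jacobi matrices `jac_f` and `jac_g` were written down by hand, and the helpers `matmul` and `numeric_jacobian` are hypothetical.

```python
import math

def g(p):
    rho, theta = p
    return (rho * math.cos(theta), rho * math.sin(theta))

def f(x):
    x1, x2 = x
    return (x1 + x2, x1 * x2)

def h(p):
    return f(g(p))

def jac_g(p):
    rho, theta = p
    return [[math.cos(theta), -rho * math.sin(theta)],
            [math.sin(theta), rho * math.cos(theta)]]

def jac_f(x):
    x1, x2 = x
    return [[1.0, 1.0],
            [x2, x1]]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(len(B))) for j in range(len(B[0]))]
            for i in range(len(A))]

def numeric_jacobian(func, a, step=1e-6):
    """Central-difference approximation of the Jacobi matrix of func at a."""
    m, n = len(func(a)), len(a)
    J = [[0.0] * n for _ in range(m)]
    for j in range(n):
        up, down = list(a), list(a)
        up[j] += step
        down[j] -= step
        f_up, f_down = func(up), func(down)
        for i in range(m):
            J[i][j] = (f_up[i] - f_down[i]) / (2.0 * step)
    return J

p = (2.0, 0.7)
J_chain = matmul(jac_f(g(p)), jac_g(p))   # Jacobi matrix of f at g(p) times that of g at p
J_direct = numeric_jacobian(h, p)          # difference quotients of the composite map
```

The first column of `J_chain` approximates $\partial h / \partial \rho$ and the second column $\partial h / \partial \theta$ at the sample point.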

Corollary 53.1. Let $f : A \subset \mathbb{R}^n \to B \subset \mathbb{R}^n$ be a bijection where $A, B$ are open subsets of $\mathbb{R}^n$. If $f$ is differentiable at $a \in A$ and $f^{-1}$ is differentiable at $b = f(a)$, then $d_a f$ is a bijection of $\mathbb{R}^n$ on $\mathbb{R}^n$ and

\[ (d_a f)^{-1} = d_{f(a)} f^{-1}. \]

The above statement follows from the equality $f^{-1} \circ f = i_A$ using the rule of differentiable composite functions.

54 Basic properties of differentiable functions

Mean value theorem (Lagrange). Let $x, h \in \mathbb{R}^n$. Denote by $[x, x + h]$ the set defined by:

\[ [x, x + h] = \{x + th \in \mathbb{R}^n \mid 0 \le t \le 1\}. \]

This set is called the closed segment which joins $x$ and $x + h$. Consider $A \subset \mathbb{R}^n$, $A$ an open subset, $f : A \to \mathbb{R}^1$ and $x, x + h \in A$.

Theorem 54.1. If the following conditions hold:

a) $[x, x + h] \subset A$

b) $f$ is differentiable on $[x, x + h]$

then there exists $t_0 \in (0, 1)$ such that:

\[ f(x + h) - f(x) = d_{x + t_0 h} f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + t_0 h) \cdot h_i. \]


Proof. Consider the function $\varphi(t) = f(x + th)$ for $t \in [0, 1]$. The function $\varphi$ is differentiable on $[0, 1]$ and $\varphi'(t) = d_{x + th} f(h)$. Since $f(x + h) - f(x) = \varphi(1) - \varphi(0)$, applying the mean value theorem for the function $\varphi$ on $[0, 1]$, we obtain that there exists $t_0 \in (0, 1)$ such that:

\[ \varphi(1) - \varphi(0) = \varphi'(t_0) = d_{x + t_0 h} f(h) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(x + t_0 h) \cdot h_i. \]

Remark 54.1. For a function $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ with $m \ge 2$, the above theorem is false. In order to illustrate this, consider for example the function $f : [0, 2\pi] \subset \mathbb{R} \to \mathbb{R}^2$, $f(t) = (\cos t, \sin t)$.

In the case $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ ($m \ge 2$) the mean value theorem is the following:

Theorem 54.2. If the following conditions hold:

1) $[x, x + h] \subset A$

2) $f$ is differentiable on $[x, x + h]$

3) $\|d_{x + th} f\| \le M$ for all $t \in [0, 1]$

then $\|f(x + h) - f(x)\| \le M \cdot \|h\|$.

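Proof. Consider again the function $\varphi(t) = f(x + th)$ for $t \in [0, 1]$ and

\[ \psi(t) = \sum_{i=1}^{m} (\varphi_i(1) - \varphi_i(0)) \cdot \varphi_i(t) \quad \text{for } t \in [0, 1]. \]

For $\psi$ there exists $t_0 \in (0, 1)$ such that $\psi(1) - \psi(0) = \psi'(t_0)$. Hence

\[ \|\varphi(1) - \varphi(0)\|^2 = \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 = \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)] \cdot \varphi_i'(t_0) \le \left[ \sum_{i=1}^{m} [\varphi_i(1) - \varphi_i(0)]^2 \right]^{\frac{1}{2}} \cdot \left[ \sum_{i=1}^{m} [\varphi_i'(t_0)]^2 \right]^{\frac{1}{2}} \le M \cdot \|h\| \cdot \|\varphi(1) - \varphi(0)\| \]

and

\[ \|\varphi(1) - \varphi(0)\| \le M \cdot \|h\|. \]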

Local extremum

Definition 54.1. A function f : A ⊂ Rn → R1 has a local maximum value at c ∈ A ifthere exists some open neighborhood V ⊂ A of c for which f(x) ≤ f(c), for any x ∈ V .It f(x) ≥ f(c) for any x ∈ V , then f has a local minimum value at c.

Theorem 54.3 (Fermat). If $f$ is differentiable at $c$ and possesses a local maximum or a local minimum at $c$, then $\frac{\partial f}{\partial x_i}(c) = 0$ for $i = 1, \dots, n$.


Proof. Consider $h \in \mathbb{R}^n$ and, for $t \in \mathbb{R}^1$ sufficiently close to 0, the function $\varphi(t) = f(c + th)$. If the function $f$ possesses a local maximum or a local minimum at $c$, then $\varphi$ possesses a local maximum or a local minimum at $t = 0$. Since the derivative of $\varphi$ at a local extremum is equal to 0, it follows that

$$\varphi'(0) = \sum_{i=1}^{n} \frac{\partial f}{\partial x_i}(c) \cdot h_i = 0 \quad \forall h \in \mathbb{R}^n$$

Therefore

$$\frac{\partial f}{\partial x_i}(c) = 0 \quad \text{for } i = 1, \dots, n$$

Note that although $\frac{\partial f}{\partial x_i}(c) = 0$ at a local extremum $c$, this is not a sufficient condition for such a point.

Example 54.1. If $f(x, y) = xy$, then $\frac{\partial f}{\partial x} = y$ and $\frac{\partial f}{\partial y} = x$, and hence $(0, 0)$ is the only possible local extremum of $f$. However, for any $\delta > 0$, $f(\delta, \delta) = \delta^2 > 0$ and $f(-\delta, \delta) = -\delta^2 < 0$. Hence, $f$ takes both positive and negative values in any neighborhood of $(0, 0)$. Thus $(0, 0)$ is neither a local maximum nor a local minimum.

Definition 54.2. A point $c$ is a stationary point of $f$ if $\frac{\partial f}{\partial x_i}(c) = 0$ for $i = 1, \dots, n$.

Just like in the case of functions of one variable, stationary points for functions of several variables can be classified with the aid of Taylor approximations.

Definition 54.3. Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a differentiable function on the open set $A$. If the partial derivatives $A \ni x \mapsto \frac{\partial f_i}{\partial x_j}$ are continuous, $i = 1, \dots, m$, $j = 1, \dots, n$, then $f$ is said to be continuously differentiable.

Theorem 54.4 (of local inversion). If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable on the open set $A$ and, at a point $a \in A$, the Jacobi matrix of $f$, $\left(\frac{\partial f_i}{\partial x_j}(a)\right)_{n \times n}$, is not singular (i.e. $\det\left(\frac{\partial f_i}{\partial x_j}(a)\right) \ne 0$), then there exist an open neighborhood $U$ of $a$ and an open neighborhood $V$ of $f(a) = b$ such that $f : U \to V$ is bijective. Moreover, the inverse $f^{-1} : V \to U$ is differentiable at $b = f(a)$ and the following equality holds:

$$d_b f^{-1} = (d_a f)^{-1}$$

Proof. The proof of this theorem is rather technical and will be skipped.

Example 54.2. Show that if $\rho \ne 0$, then the function $f(\rho, \theta) = (\rho \cos\theta, \rho \sin\theta)$ is locally invertible.

Example 54.3. Show that if $\rho \ne 0$ and $\sin\theta \ne 0$, then the function $f(\rho, \theta, \varphi) = (\rho \sin\theta \cos\varphi, \rho \sin\theta \sin\varphi, \rho \cos\theta)$ is locally invertible.
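By Theorem 54.4, both examples reduce to checking that the Jacobi determinant is nonzero. The sketch below is an illustration added here, not the notes' own solution: it approximates the Jacobi matrices by central differences and compares the determinants with the closed forms $\rho$ (polar) and $\rho^2 \sin\theta$ (spherical). The helpers `jacobian` and `det` and the sample points are assumptions made for this example.

```python
import math

def jacobian(f, p, eps=1e-6):
    """Central-difference Jacobi matrix of f at p (rows: components, columns: variables)."""
    n = len(p)
    columns = []
    for j in range(n):
        plus, minus = list(p), list(p)
        plus[j] += eps
        minus[j] -= eps
        fp, fm = f(*plus), f(*minus)
        columns.append([(u - v) / (2 * eps) for u, v in zip(fp, fm)])
    # Transpose the list of columns into a list of rows.
    return [[columns[j][i] for j in range(n)] for i in range(len(columns[0]))]

def det(m):
    """Determinant by cofactor expansion along the first row (fine for 2x2 and 3x3)."""
    if len(m) == 1:
        return m[0][0]
    total = 0.0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def polar(rho, theta):
    return (rho * math.cos(theta), rho * math.sin(theta))

def spherical(rho, theta, phi):
    return (rho * math.sin(theta) * math.cos(phi),
            rho * math.sin(theta) * math.sin(phi),
            rho * math.cos(theta))

det_polar = det(jacobian(polar, (2.0, 0.7)))
det_spherical = det(jacobian(spherical, (2.0, 0.7, 1.1)))

print(det_polar, 2.0)                        # compare with rho
print(det_spherical, 4.0 * math.sin(0.7))    # compare with rho^2 * sin(theta)
```

The spherical determinant $\rho^2 \sin\theta$ vanishes where $\sin\theta = 0$, so the inverse function theorem gives no information at those points.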


Implicit functions

Consider two open subsets $A \subset \mathbb{R}^n$ and $B \subset \mathbb{R}^m$ and a function $f : A \times B \to \mathbb{R}^m$. Denote by $f_1, f_2, \dots, f_m$ the scalar components of $f$, i.e. $f = (f_1, f_2, \dots, f_m)$, and consider the system of equations:

$$(*) \quad \begin{cases} f_1(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0 \\ f_2(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0 \\ \cdots \\ f_m(x_1, x_2, \dots, x_n, y_1, y_2, \dots, y_m) = 0 \end{cases}$$

Definition 54.4. If there exists a function $\varphi : A' \subset A \to B$, $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_m)$, such that the following equalities hold:

$$(**) \quad \begin{cases} f_1(x_1, \dots, x_n, \varphi_1(x_1, \dots, x_n), \dots, \varphi_m(x_1, \dots, x_n)) = 0 \\ f_2(x_1, \dots, x_n, \varphi_1(x_1, \dots, x_n), \dots, \varphi_m(x_1, \dots, x_n)) = 0 \\ \cdots \\ f_m(x_1, \dots, x_n, \varphi_1(x_1, \dots, x_n), \dots, \varphi_m(x_1, \dots, x_n)) = 0 \end{cases}$$

for any $(x_1, x_2, \dots, x_n) \in A'$, then the function $\varphi = (\varphi_1, \varphi_2, \dots, \varphi_m)$ is said to be defined implicitly by the system of equations $(*)$.

It is clear that the system of equations $(*)$ can be written briefly as

$$f(x, y) = 0$$

where $x = (x_1, x_2, \dots, x_n)$ and $y = (y_1, y_2, \dots, y_m)$.

Theorem 54.5 (Implicit function theorem). If the function f has the following properties:

1) there exists a ∈ A and b ∈ B such that f(a, b) = 0

2) f is continuously differentiable on A×B

3) the Frechet derivative $d_b f_a : \mathbb{R}^m \to \mathbb{R}^m$ is bijective

where $f_a : B \to \mathbb{R}^m$ is defined by $f_a(y) = f(a, y)$, then there exist an open neighborhood $U$ of $a$, an open neighborhood $V$ of $b$ and a function $\varphi : U \to V$ having the following properties:

i) $\varphi(a) = b$

ii) $f(x, \varphi(x)) = 0 \quad \forall x \in U$

iii) $\varphi$ is continuously differentiable on $U$

and the following equality holds:

$$d_x \varphi = -(d_y f_x)^{-1} \circ (d_x f_y)$$

where $f_x(y) = f(x, y)$, $f_y(x) = f(x, y)$ and $y = \varphi(x)$.

Note that the implicitly defined function $\varphi$ cannot always be written as an explicit formula.

Example 54.4. a) Find the function defined implicitly by the equation $x^2 + y^2 = 1$.

b) Show that the equation $y^5 + y - x = 0$ defines implicitly a function $y = y(x)$.
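For part b), the formula of Theorem 54.5 with $f(x, y) = y^5 + y - x$ gives $y'(x) = -\left(\frac{\partial f}{\partial y}\right)^{-1} \frac{\partial f}{\partial x} = \frac{1}{5y^4 + 1}$. The following sketch (an illustration added here, with hypothetical helper names `solve_y` and `implicit_derivative`) solves the equation by Newton's method and compares this formula with a finite-difference slope.

```python
def solve_y(x, tol=1e-14, max_iter=100):
    """Newton iteration for the real root y of y**5 + y - x = 0.

    The left-hand side is strictly increasing in y (its derivative is
    5*y**4 + 1 >= 1), so the root is unique for every real x.
    """
    y = 0.0
    for _ in range(max_iter):
        step = (y**5 + y - x) / (5 * y**4 + 1)
        y -= step
        if abs(step) < tol:
            break
    return y

def implicit_derivative(x):
    """y'(x) from the implicit function theorem: 1 / (5*y**4 + 1) with y = y(x)."""
    y = solve_y(x)
    return 1.0 / (5 * y**4 + 1)

x0 = 2.0
y0 = solve_y(x0)                 # y = 1 solves 1 + 1 - 2 = 0
h = 1e-6
finite_difference = (solve_y(x0 + h) - solve_y(x0 - h)) / (2 * h)

print(y0, implicit_derivative(x0), finite_difference)
```

No closed-form expression for $y(x)$ is used anywhere: the derivative is obtained from the equation alone, which is the point of the theorem.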


55 Higher order partial differentiability

Let $f : A \subset \mathbb{R}^n \to \mathbb{R}^m$ be a function partially differentiable with respect to every variable $x_j$, $j = 1, \dots, n$, on the open set $A$.

Definition 55.1. If the partial derivatives $x \mapsto \frac{\partial f_i}{\partial x_j}$ are partially differentiable at $a \in A$ with respect to every variable $x_k$, it is said that $f$ is twice partially differentiable at $a$ with respect to every variable. The partial derivative with respect to the variable $x_k$ of the partial derivative $\frac{\partial f_i}{\partial x_j}$ will be denoted by $\frac{\partial^2 f_i}{\partial x_k \partial x_j}(a)$, i.e.

$$\frac{\partial}{\partial x_k}\left(\frac{\partial f_i}{\partial x_j}\right)(a) = \frac{\partial^2 f_i}{\partial x_k \partial x_j}(a)$$

and will be called a second-order partial derivative of $f$.

For a scalar component $f_i$ there exist $n^2$ second-order partial derivatives.

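Example 55.1. Consider $f(x, y) = x^2 + y^2 + e^x \cos y$. The first-order partial derivatives of $f$ exist at every point $(x, y) \in \mathbb{R}^2$ and are given by

$$\frac{\partial f}{\partial x} = 2x + e^x \cos y \qquad \frac{\partial f}{\partial y} = 2y - e^x \sin y$$

The first-order partial derivatives of $f$ are themselves partially differentiable at any $(x, y) \in \mathbb{R}^2$ with respect to $x$ and $y$, and we have:

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial x^2} = 2 + e^x \cos y$$

$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y \partial x} = -e^x \sin y$$

$$\frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x \partial y} = -e^x \sin y$$

$$\frac{\partial}{\partial y}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial y^2} = 2 - e^x \cos y$$

These four functions are the second-order partial derivatives of $f$.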
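The second-order partial derivatives of Example 55.1 can be cross-checked against finite differences. This is a rough numerical sketch added for illustration; the step size `h` and the sample point are arbitrary choices made here.

```python
import math

def f(x, y):
    return x**2 + y**2 + math.exp(x) * math.cos(y)

def second_differences(f, x, y, h=1e-4):
    """Central-difference approximations of f_xx, f_xy and f_yy at (x, y)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    return fxx, fxy, fyy

x0, y0 = 0.3, 1.2
fxx, fxy, fyy = second_differences(f, x0, y0)

# Closed forms from Example 55.1.
exact_fxx = 2 + math.exp(x0) * math.cos(y0)
exact_fxy = -math.exp(x0) * math.sin(y0)
exact_fyy = 2 - math.exp(x0) * math.cos(y0)

print(fxx - exact_fxx, fxy - exact_fxy, fyy - exact_fyy)
```

The three printed differences are small, which agrees with the formulas in the example, including the equality of the two mixed derivatives.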

Definition 55.2. In general, $f$ is said to be $k$-times partially differentiable at $a \in A$ with respect to every variable if $f$ is $(k-1)$-times partially differentiable with respect to every variable on an open neighborhood of $a$ and every $(k-1)$-th order partial derivative $\frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}}$ is partially differentiable with respect to every variable $x_{j_k}$ at $a$.

We denote

$$\frac{\partial}{\partial x_{j_k}}\left(\frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}}\right)(a) = \frac{\partial^k f_i}{\partial x_{j_k} \partial x_{j_{k-1}} \cdots \partial x_{j_1}}(a)$$

and we call it a $k$-th order partial derivative of $f$.

Example 55.2. Find the $k$-th order partial derivatives of the function $f(x, y) = x^2 + y^2 + e^x \cos y$.

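Definition 55.3. If the first-order partial derivatives $x \mapsto \frac{\partial f_i}{\partial x_j}$ are differentiable at a point $a \in A$, it is said that $f$ is twice differentiable at $a$.

Definition 55.4. The second Frechet derivative of $f$ at $a$ is denoted by $d_a^2 f$ and defined as a function $d_a^2 f : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^m$ by the formula

$$d_a^2 f(u)(v) = \sum_{i=1}^{m} \left(\sum_{j=1}^{n} \sum_{k=1}^{n} \frac{\partial^2 f_i}{\partial x_j \partial x_k}(a) \cdot u_j \cdot v_k\right) e_i$$

where $u, v \in \mathbb{R}^n$ and $e_i = (0, \dots, 0, 1, 0, \dots, 0) \in \mathbb{R}^m$, $i = 1, \dots, m$.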


The second Frechet derivative $d_a^2 f$ is a system of $m$ bilinear forms in $u_1, u_2, \dots, u_n$; $v_1, v_2, \dots, v_n$.

The second Frechet derivative of $f$ at $a$ satisfies

$$\lim_{u \to 0} \frac{1}{\|u\|} \left\| d_{a+u} f(v) - d_a f(v) - d_a^2 f(u)(v) \right\| = 0$$

for every $v \in \mathbb{R}^n$. In other words, the polynomial $d_{a+u} f(v)$ can be approximated by the polynomial $[d_a f + d_a^2 f(u)](v)$.

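Theorem 55.1 (mixed derivative theorem of Schwarz). If the function $f$ is twice differentiable at $a$, then the following relations hold:

$$\frac{\partial^2 f_i}{\partial x_j \partial x_k}(a) = \frac{\partial^2 f_i}{\partial x_k \partial x_j}(a), \quad i = 1, \dots, m, \quad j, k = 1, \dots, n$$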

Proof. The proof of this theorem is rather technical and will be skipped.

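Example 55.3. Consider $f(x, y, z) = (xy + xz + yz, xyz)$ and verify that:

$$\frac{\partial^2 f}{\partial x \partial y} = \frac{\partial^2 f}{\partial y \partial x} \qquad \frac{\partial^2 f}{\partial x \partial z} = \frac{\partial^2 f}{\partial z \partial x} \qquad \frac{\partial^2 f}{\partial y \partial z} = \frac{\partial^2 f}{\partial z \partial y}$$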

Theorem 55.2 (Criterion for second-order differentiability). If the second-order partial derivatives $\frac{\partial^2 f_i}{\partial x_j \partial x_k}$ exist in a neighborhood of $a$ and they are continuous at $a$, then $f$ is twice differentiable at $a$.

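Definition 55.5. If the partial derivatives of order $k-1$, $\frac{\partial^{k-1} f_i}{\partial x_{j_{k-1}} \cdots \partial x_{j_1}}$, are differentiable at $a \in A$, it is said that $f$ is $k$-times differentiable at $a$.

The Frechet derivative of order $k$ of $f$ at $a$ is defined as the function $d_a^k f : \mathbb{R}^n \times \cdots \times \mathbb{R}^n \to \mathbb{R}^m$ given by

$$d_a^k f(u^1)(u^2) \cdots (u^k) = \sum_{i=1}^{m} \left(\sum_{j_1=1}^{n} \sum_{j_2=1}^{n} \cdots \sum_{j_k=1}^{n} \frac{\partial^k f_i}{\partial x_{j_k} \cdots \partial x_{j_1}}(a) \cdot u^1_{j_1} u^2_{j_2} \cdots u^k_{j_k}\right) e_i$$

The Frechet derivative of order $k$ of $f$ at $a$ satisfies:

$$\lim_{\|u^k\| \to 0} \frac{1}{\|u^k\|} \left\| d^{k-1}_{a+u^k} f(u^1)(u^2) \cdots (u^{k-1}) - d^{k-1}_a f(u^1)(u^2) \cdots (u^{k-1}) - d^k_a f(u^1)(u^2) \cdots (u^k) \right\| = 0$$

If the function is $k$-times differentiable at $a$, then the following relations hold:

$$\frac{\partial^k f_i}{\partial x_{j_1} \partial x_{j_2} \cdots \partial x_{j_k}}(a) = \frac{\partial^k f_i}{\partial x_{\sigma(j_1)} \partial x_{\sigma(j_2)} \cdots \partial x_{\sigma(j_k)}}(a)$$

where $(\sigma(j_1), \dots, \sigma(j_k))$ is any rearrangement of $(j_1, \dots, j_k)$.

Theorem 55.3 (Criterion for k-times differentiability). If the partial derivatives of $k$-th order exist in a neighborhood of $a$ and they are continuous at $a$, then $f$ is $k$-times differentiable at $a$.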


56 Taylor’s theorems

Theorem 56.1 (Taylor's formula with integral remainder). If the partial derivatives of order $m + 1$ of the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$ are continuous and the closed segment $[x, x + h]$ is included in the open set $A$, then:

$$f(x+h) = f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f \overbrace{(h) \cdots (h)}^{m} + \frac{1}{m!} \int_0^1 (1-t)^m \cdot d^{m+1}_{x+th} f \overbrace{(h) \cdots (h)}^{m+1} \, dt$$

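Proof. Consider the function $g(t) = f(x + th)$ for $t \in [0, 1]$. For $g$ the following relations hold:

$$\frac{d^k g}{dt^k} = g^{(k)}(t) = d^k_{x+th} f \overbrace{(h) \cdots (h)}^{k}, \quad k = 1, \dots, m+1$$

On the other hand,

$$\frac{d}{dt}\left[g(t) + \frac{1-t}{1!} g'(t) + \cdots + \frac{(1-t)^m}{m!} g^{(m)}(t)\right] = \frac{(1-t)^m}{m!} g^{(m+1)}(t)$$

Integrating this equality over $[0, 1]$ gives

$$g(1) - \left[g(0) + \frac{1}{1!} g'(0) + \cdots + \frac{1}{m!} g^{(m)}(0)\right] = \frac{1}{m!} \int_0^1 (1-t)^m \cdot g^{(m+1)}(t) \, dt$$

Hence

$$f(x+h) = f(x) + \frac{1}{1!} d_x f(h) + \frac{1}{2!} d_x^2 f(h)(h) + \cdots + \frac{1}{m!} d_x^m f \overbrace{(h) \cdots (h)}^{m} + \frac{1}{m!} \int_0^1 (1-t)^m \cdot d^{m+1}_{x+th} f \overbrace{(h) \cdots (h)}^{m+1} \, dt$$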

Theorem 56.2 (Taylor's formula with the Lagrange remainder). If the function $f : A \subset \mathbb{R}^n \to \mathbb{R}^p$ is $m + 1$ times differentiable on $A$ and $\|d_y^{m+1} f\| \le M$ on the closed segment $[x, x + h]$, which is included in $A$, then:

$$\left\| f(x+h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f \overbrace{(h) \cdots (h)}^{m} \right\| \le M \cdot \frac{\|h\|^{m+1}}{(m+1)!}$$

Proof. The proof of this theorem is rather technical and it will be skipped.

Theorem 56.3 (Taylor's formula with $O(\|h\|^m)$ remainder). If the function $f$ is $(m-1)$-times differentiable on $A$ and $m$-times differentiable at $x \in A$, then:

$$\left\| f(x+h) - f(x) - \frac{1}{1!} d_x f(h) - \cdots - \frac{1}{m!} d_x^m f \overbrace{(h) \cdots (h)}^{m} \right\| = O(\|h\|^m)$$

Proof. By induction.


57 Classification theorem for local extrema

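Theorem 57.1. If $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ has continuous second-order partial derivatives on the open set $A$ and $d_a f = 0$ for some $a \in A$, then:

i) if $f$ has a local minimum at $a$, then $d_a^2 f(h)(h) \ge 0$ for every $h \in \mathbb{R}^n$

ii) if $f$ has a local maximum at $a$, then $d_a^2 f(h)(h) \le 0$ for every $h \in \mathbb{R}^n$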

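Proof. Consider the Taylor formula

$$f(a+h) = f(a) + \frac{1}{1!} d_a f(h) + \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2)$$

Since $d_a f = 0$, it follows that

$$f(a+h) - f(a) = \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2)$$

i) If $a$ is a local minimum for $f$, then there exists $r > 0$ such that for $\|h\| < r$ we have

$$f(a+h) - f(a) = \frac{1}{2!} d_a^2 f(h)(h) + O(\|h\|^2) \ge 0$$

Let $h \in \mathbb{R}^n$, $h \ne 0$, and $t \in \mathbb{R}^1$, $t \ne 0$, such that $|t| < \frac{r}{\|h\|}$. We have

$$\frac{1}{2!} d_a^2 f(th)(th) + O(\|th\|^2) \ge 0$$

$$t^2 \left[\frac{1}{2!} d_a^2 f(h)(h) + \|h\|^2 \cdot \frac{O(\|th\|^2)}{\|th\|^2}\right] \ge 0$$

$$d_a^2 f(h)(h) + \|h\|^2 \cdot \frac{O(\|th\|^2)}{\|th\|^2} \ge 0$$

Here the remainder of the Taylor formula is of smaller order than $\|h\|^2$, so the quotient $\frac{O(\|th\|^2)}{\|th\|^2}$ tends to 0 as $t \to 0$. It follows that $d_a^2 f(h)(h) \ge 0$.

ii) The second statement is proved similarly.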

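Theorem 57.2 (Sufficient condition for local extrema). Assume that $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$ has continuous second-order partial derivatives on $A$ and $d_a f = 0$ for $a \in A$.

i) If $d_a^2 f(h)(h) \ge 0$ for $h \in \mathbb{R}^n$ and $\det\left(\frac{\partial^2 f}{\partial x_i \partial x_j}(a)\right) \ne 0$, then $f$ has a local minimum at $a$;

ii) If $d_a^2 f(h)(h) \le 0$ for $h \in \mathbb{R}^n$ and $\det\left(\frac{\partial^2 f}{\partial x_i \partial x_j}(a)\right) \ne 0$, then $f$ has a local maximum at $a$.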

Proof. The proof is rather technical and will be skipped.

Example 57.1. Find and classify the stationary points of

$$f(x, y) = x^4 - y^4 - 2(x^2 - y^2)$$
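One possible way to organize the computation for Example 57.1 is sketched below; it is an illustration added here, not the notes' own solution. The gradient $(4x^3 - 4x,\; -4y^3 + 4y)$ vanishes exactly when $x, y \in \{-1, 0, 1\}$, and the Hessian matrix is diagonal, so its sign pattern can be read off directly and Theorems 57.1 and 57.2 applied. The function names are hypothetical.

```python
from itertools import product

def gradient(x, y):
    """Gradient of f(x, y) = x**4 - y**4 - 2*(x**2 - y**2)."""
    return (4 * x**3 - 4 * x, -4 * y**3 + 4 * y)

def hessian_diagonal(x, y):
    """Diagonal of the Hessian; the mixed second derivatives are zero."""
    return (12 * x**2 - 4, -12 * y**2 + 4)

def classify(x, y):
    fxx, fyy = hessian_diagonal(x, y)
    if fxx > 0 and fyy > 0:
        return "local minimum"      # d^2 f positive definite (Theorem 57.2 i)
    if fxx < 0 and fyy < 0:
        return "local maximum"      # d^2 f negative definite (Theorem 57.2 ii)
    if fxx * fyy < 0:
        return "saddle point"       # d^2 f changes sign (Theorem 57.1 excludes an extremum)
    return "undetermined"

# Candidates: all combinations of the roots of 4x^3 - 4x and -4y^3 + 4y.
stationary = [(x, y) for x, y in product((-1, 0, 1), repeat=2)
              if gradient(x, y) == (0, 0)]
result = {point: classify(*point) for point in stationary}

for point, kind in sorted(result.items()):
    print(point, kind)
```

Because the Hessian is diagonal here, no eigenvalue computation is needed; for a general function one would test the definiteness of the full matrix instead.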

Example 57.2. Determine the values of $k$ for which

$$f(x, y) = k(e^y - 1) \sin x - \cos x \cos 2y + 1$$

possesses a minimum at $(0, 0)$.


58 Conditional extrema

Let us consider a function $f : A \subset \mathbb{R}^n \to \mathbb{R}^1$, $A$ an open set, and $\Gamma \subset A$ defined by:

$$\Gamma = \{x \in A \mid g_i(x) = 0, \; i = 1, \dots, p\}$$

where $g_i : A \to \mathbb{R}^1$ and $p < n$. It will be assumed that $f$ and $g_i$, $i = 1, \dots, p$, have continuous first-order partial derivatives on $A$.

Definition 58.1. If the restriction $f|_\Gamma$ has an extremum at $a \in \Gamma$, then we call this extremum conditional (by the equations $g_i(x) = 0$, $i = 1, \dots, p$).

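Theorem 58.1. If $d_a g_1, \dots, d_a g_p$ are linearly independent and $f$ has a conditional extremum at $a$, then there exist $p$ constants $\lambda_1, \dots, \lambda_p$ such that

$$d_a f = \sum_{i=1}^{p} \lambda_i \, d_a g_i$$

or

$$\frac{\partial f}{\partial x_k}(a) = \sum_{i=1}^{p} \lambda_i \frac{\partial g_i}{\partial x_k}(a), \quad k = 1, \dots, n$$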

Proof. The proof is rather technical and it will be skipped.

Example 58.1. Find the conditional extrema of the following functions:

a) $f(x, y) = x^3$ if $x^2 + 6xy + y^2 = 1$

b) $f(x, y) = xy$ if $2x + 3y = 1$

c) $f(x, y, z) = x^2 + y^2 + z^2$ if $x + y + z = 1$
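For case c), the condition of Theorem 58.1 with $g(x, y, z) = x + y + z - 1$ reads $2x = 2y = 2z = \lambda$, which together with the constraint gives $\lambda = 2/3$ and the candidate point $(1/3, 1/3, 1/3)$. The sketch below, added here for illustration, checks the multiplier condition and compares the candidate with randomly sampled points of the constraint plane; the sampling range and seed are arbitrary assumptions.

```python
import random

def f(x, y, z):
    return x * x + y * y + z * z

def grad_f(x, y, z):
    return (2 * x, 2 * y, 2 * z)

def grad_g(x, y, z):
    """Gradient of the constraint g(x, y, z) = x + y + z - 1."""
    return (1.0, 1.0, 1.0)

lam = 2.0 / 3.0
candidate = (lam / 2, lam / 2, lam / 2)

# Residual of the Lagrange condition grad f = lambda * grad g at the candidate.
residual = max(abs(df - lam * dg)
               for df, dg in zip(grad_f(*candidate), grad_g(*candidate)))

rng = random.Random(0)

def random_feasible_point():
    """A random point on the plane x + y + z = 1."""
    x = rng.uniform(-2, 2)
    y = rng.uniform(-2, 2)
    return (x, y, 1 - x - y)

# No sampled feasible point gives a smaller value than the candidate.
is_minimum = all(f(*random_feasible_point()) >= f(*candidate) - 1e-12
                 for _ in range(1000))

print(candidate, f(*candidate), residual, is_minimum)
```

The sampling is only a plausibility check; that the candidate is a minimum follows from the fact that $f$ is the squared distance to the origin and the candidate is the point of the plane closest to it.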

59 Jordan measurable subsets of R2

Consider the one-dimensional bounded intervals of the form $(a, b)$, $[a, b)$, $(a, b]$, $[a, b]$, where $a, b \in \mathbb{R}$. The cartesian product $\Delta = I_1 \times I_2$ of two intervals of this type will be called a rectangle in $\mathbb{R}^2$. The area of such a rectangle $\Delta$ is defined by

$$\text{area}(\Delta) = \text{length}(I_1) \cdot \text{length}(I_2)$$

Consider the set $\mathcal{P}$ of all finite unions of rectangles $\Delta$, i.e. $P \in \mathcal{P}$ if and only if there exist $\Delta_1, \Delta_2, \dots, \Delta_n$ such that

$$P = \bigcup_{i=1}^{n} \Delta_i$$

Proposition 59.1. If P1, P2 ∈ P, then P1 ∪ P2 ∈ P and P1 \ P2 ∈ P.


Proof. Direct verification.

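Proposition 59.2. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \dots, \Delta_n$ such that $P = \bigcup_{i=1}^{n} \Delta_i$ and $\Delta_i \cap \Delta_j = \emptyset$ if $i \ne j$.

Proof. The proof is rather technical and will be skipped.

Definition 59.1. For $P \in \mathcal{P}$ the area is defined by

$$\text{area}(P) = \sum_{i=1}^{n} \text{area}(\Delta_i)$$

where $P = \bigcup_{i=1}^{n} \Delta_i$ and $\Delta_1, \Delta_2, \dots, \Delta_n$ are disjoint.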

Proposition 59.3. The area defined in this way for $P \in \mathcal{P}$ satisfies:

1. $\text{area}(P) \ge 0$ for $P \in \mathcal{P}$

2. if $P_1, P_2 \in \mathcal{P}$ and $P_1 \cap P_2 = \emptyset$, then $\text{area}(P_1 \cup P_2) = \text{area}(P_1) + \text{area}(P_2)$

3. it is independent of the decomposition of $P$ into a finite union of disjoint rectangles.

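Definition 59.2. For a bounded set $A \subset \mathbb{R}^2$ we define

$$\text{area}_i(A) = \sup_{P \subset A, \, P \in \mathcal{P}} \text{area}(P) \qquad \text{area}_e(A) = \inf_{P \supset A, \, P \in \mathcal{P}} \text{area}(P)$$

Definition 59.3. A bounded set $A \subset \mathbb{R}^2$ is said to be Jordan measurable if

$$\text{area}_i(A) = \text{area}_e(A)$$

Definition 59.4. If the bounded set $A \subset \mathbb{R}^2$ is Jordan measurable, then the area of $A$ is defined as

$$\text{area}(A) = \text{area}_i(A) = \text{area}_e(A)$$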

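Proposition 59.4. A bounded set $A \subset \mathbb{R}^2$ is Jordan measurable if and only if for any $\varepsilon > 0$ there exist $P_\varepsilon, Q_\varepsilon \in \mathcal{P}$ such that $P_\varepsilon \subset A \subset Q_\varepsilon$ and $\text{area}(Q_\varepsilon) - \text{area}(P_\varepsilon) < \varepsilon$.

Proposition 59.5. A bounded set $A \subset \mathbb{R}^2$ is Jordan measurable if and only if there exist two sequences $(P_n)$, $(Q_n)$, with $P_n, Q_n \in \mathcal{P}$ and $P_n \subset A \subset Q_n$, such that

$$\lim_{n \to \infty} \text{area}(P_n) = \lim_{n \to \infty} \text{area}(Q_n)$$

In this case we have

$$\text{area}(A) = \lim_{n \to \infty} \text{area}(P_n) = \lim_{n \to \infty} \text{area}(Q_n)$$

Proposition 59.6. A bounded set $A \subset \mathbb{R}^2$ is Jordan measurable if and only if the area of its boundary is equal to zero.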


Proposition 59.7. If $A_1$ and $A_2$ are Jordan measurable sets, then $A_1 \cup A_2$ and $A_1 \setminus A_2$ are Jordan measurable, and if $A_1 \cap A_2 = \emptyset$, then $\text{area}(A_1 \cup A_2) = \text{area}(A_1) + \text{area}(A_2)$.

Proposition 59.8. Let $M \subset \mathbb{R}^2$ be a bounded set. If for any $\varepsilon > 0$ there exist two Jordan measurable sets $A$ and $B$ such that $A \subset M \subset B$ and $\text{area}(B) - \text{area}(A) < \varepsilon$, then the set $M$ is Jordan measurable.

Proposition 59.9. If there exist two sequences $(A_n)$, $(B_n)$ of Jordan measurable sets such that $A_n \subset M \subset B_n$ and

$$\lim_{n \to \infty} \text{area}(A_n) = \lim_{n \to \infty} \text{area}(B_n)$$

then $M$ is Jordan measurable and

$$\text{area}(M) = \lim_{n \to \infty} \text{area}(A_n) = \lim_{n \to \infty} \text{area}(B_n)$$

Proof. The proofs of the above statements are rather technical and will be skipped.

60 The Riemann-Darboux integral of functions of two variables

Let A be a given bounded and Jordan measurable subset of R2.

Definition 60.1. A partition $P$ of $A$ is a finite set of subsets $A_i$, $i = 1, \dots, n$, of $A$ satisfying: $\bigcup_{i=1}^{n} A_i = A$, every $A_i$ is Jordan measurable, and if $i \ne j$, then $A_i \cap A_j = \emptyset$.

The diameter of the set $A_i$ is the number $d(A_i)$ defined by

$$d(A_i) = \sup_{(x', y'), (x'', y'') \in A_i} \sqrt{(x' - x'')^2 + (y' - y'')^2}$$

The norm of the partition $P$ is the number

$$\nu(P) = \max\{d(A_1), d(A_2), \dots, d(A_n)\}$$

Suppose now that $f$ is a function defined and bounded on $A$, $f : A \to \mathbb{R}^1$. Then $f$ is bounded on each part $A_i$. Hence $f$ has a least upper bound $M_i$ and a greatest lower bound $m_i$ on $A_i$.

Definition 60.2. The upper Darboux sum of $f$ related to $P$ is defined by

$$U_f(P) = \sum_{i=1}^{n} M_i \cdot \text{area}(A_i)$$

where $M_i = \sup\{f(x, y) \mid (x, y) \in A_i\}$.

The lower Darboux sum of $f$ related to $P$ is defined by

$$L_f(P) = \sum_{i=1}^{n} m_i \cdot \text{area}(A_i)$$

where $m_i = \inf\{f(x, y) \mid (x, y) \in A_i\}$.


Definition 60.3. The Riemann sum of $f$ related to $P$ is defined by

$$\sigma_f(P) = \sum_{i=1}^{n} f(\xi_i, \eta_i) \cdot \text{area}(A_i)$$

where $(\xi_i, \eta_i) \in A_i$.

Remark 60.1. It is obvious that we have

$$L_f(P) \le \sigma_f(P) \le U_f(P)$$

Now $f$ is bounded above and below on $A$. So there exist numbers $m$ and $M$ with $m \le f(x, y) \le M$ for all $(x, y) \in A$. Thus for any partition $P$ of $A$ we have

$$m \cdot \text{area}(A) = m \cdot \sum_{i=1}^{n} \text{area}(A_i) \le L_f(P) \le U_f(P) \le M \cdot \sum_{i=1}^{n} \text{area}(A_i) = M \cdot \text{area}(A)$$

Hence the set

$$\mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } A\}$$

is bounded above and the set

$$\mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } A\}$$

is bounded below. So $L_f = \sup_P L_f(P)$ and $U_f = \inf_P U_f(P)$ exist.
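To make the lower and upper Darboux sums concrete, here is a small sketch added for illustration. It assumes $A = [0, 1] \times [0, 1]$, a partition into $n \times n$ equal square cells, and a function that is nondecreasing in each variable, so that the infimum and supremum over a cell are the values at its lower-left and upper-right corners; the function `darboux_sums` and the test function $f(x, y) = x + y$ are choices made here, not part of the notes.

```python
def darboux_sums(f, a, b, c, d, n):
    """Lower and upper Darboux sums of f on [a, b] x [c, d] for an n-by-n grid.

    Assumption: f is continuous and nondecreasing in each variable, so on
    every cell inf f is the lower-left corner value and sup f is the
    upper-right corner value.
    """
    hx, hy = (b - a) / n, (d - c) / n
    lower = upper = 0.0
    for i in range(n):
        for j in range(n):
            x0, y0 = a + i * hx, c + j * hy
            lower += f(x0, y0) * hx * hy            # m_ij * area(A_ij)
            upper += f(x0 + hx, y0 + hy) * hx * hy  # M_ij * area(A_ij)
    return lower, upper

def f(x, y):
    return x + y

lower, upper = darboux_sums(f, 0.0, 1.0, 0.0, 1.0, 100)
print(lower, upper, upper - lower)
```

For this function the gap between the two sums is $2/n$, so refining the grid squeezes both sums toward the same value.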

The first result establishes the intuitively obvious fact that for a bounded function $L_f \le U_f$.

Theorem 60.1. If f is defined and bounded on A, then Lf ≤ Uf .

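Proof. Let $P$ be a partition of $A$ and let $P'$ be the partition obtained from $P$ by replacing one particular $A_i$, $1 \le i \le n$, with two disjoint Jordan measurable sets $A_i'$, $A_i''$ such that $A_i' \cup A_i'' = A_i$. In other words, $P'$ is obtained by decomposing $A_i$ into two measurable subsets.

It is now shown that $L_f(P) \le L_f(P')$ and $U_f(P) \ge U_f(P')$.

Let $M_i' = \sup\{f(x, y) \mid (x, y) \in A_i'\}$ and $M_i'' = \sup\{f(x, y) \mid (x, y) \in A_i''\}$. Clearly $M_i' \le M_i$ and $M_i'' \le M_i$. Hence

$$U_f(P') = \sum_{j=1}^{i-1} M_j \cdot \text{area}(A_j) + M_i' \cdot \text{area}(A_i') + M_i'' \cdot \text{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \text{area}(A_j) \le$$

$$\le \sum_{j=1}^{i-1} M_j \cdot \text{area}(A_j) + M_i \cdot \text{area}(A_i') + M_i \cdot \text{area}(A_i'') + \sum_{j=i+1}^{n} M_j \cdot \text{area}(A_j) =$$

$$= \sum_{j=1}^{n} M_j \cdot \text{area}(A_j) = U_f(P)$$

In a similar fashion it can be shown that $L_f(P) \le L_f(P')$.

It now follows that if $P''$ is obtained from $P$ by decomposing several parts $A_{i_1}, \dots, A_{i_m}$ in this way into $A_{i_1}', A_{i_1}'', \dots, A_{i_m}', A_{i_m}''$, then $L_f(P) \le L_f(P'')$ and $U_f(P) \ge U_f(P'')$.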


Now suppose that $P_1$ and $P_2$ are two partitions of $A$, $P_1 = \{A_1, \dots, A_m\}$ and $P_2 = \{B_1, \dots, B_n\}$, and let $P_3$ be the partition $P_3 = \{A_i \cap B_j \mid i = 1, \dots, m, \; j = 1, \dots, n\}$. Thus $L_f(P_1) \le L_f(P_3)$ and $U_f(P_2) \ge U_f(P_3)$. Since $L_f(P_3) \le U_f(P_3)$, it can be deduced that $L_f(P_1) \le U_f(P_2)$. In other words, the lower sum related to a given partition of $A$ does not exceed the upper sum related to any partition of $A$. Hence every lower sum is a lower bound for the set of upper sums. So $L_f(P) \le U_f$ for all possible partitions $P$. But then $U_f$ is an upper bound for the set of lower sums. Thus $L_f \le U_f$.

Definition 60.4. A function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if $L_f = U_f$. This common value is denoted by

$$\iint_A f(x, y) \, dx \, dy$$

and it is called the double integral of $f$.

Theorem 60.2. The function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if and only if for any $\varepsilon > 0$ there exists a partition $P$ such that $U_f(P) - L_f(P) < \varepsilon$.

Proof. It follows from Theorem 60.1.

Theorem 60.3. The function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if and only if there exists a number $I$ ($= L_f = U_f$) such that for any $\varepsilon > 0$ there exists $\delta > 0$ such that for $\nu(P) < \delta$ we have $|\sigma_f(P) - I| < \varepsilon$.

Proof. It follows from Remark 60.1 and Theorem 60.2.

Remark 60.2. The constant function $f(x, y) = 1$ is Riemann-Darboux integrable on $A$ and

$$\iint_A f(x, y) \, dx \, dy = \text{area}(A)$$

Remark 60.3. This definition of the integral $\iint_A f(x, y) \, dx \, dy$ is only one of the many ways to define the integral of a function of two variables. There are others, notably the Lebesgue integral; all, however, give the same value in the case of continuous functions.

61 Integrable functions

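Theorem 61.1. If $f$ is continuous on $A$ and $A$ is Jordan measurable, then $f$ is Riemann-Darboux integrable on $A$.

Proof. Since $A$ is compact, the function $f$ is uniformly continuous. For $\varepsilon > 0$ there exists $\delta > 0$ such that

$$\left[(x' - x'')^2 + (y' - y'')^2\right]^{\frac{1}{2}} < \delta \;\Rightarrow\; |f(x', y') - f(x'', y'')| < \frac{\varepsilon}{\text{area}(A)}$$

Choose $P$ such that

$$(x', y'), (x'', y'') \in A_i \;\Rightarrow\; \sqrt{(x' - x'')^2 + (y' - y'')^2} < \delta$$

for $i = 1, \dots, n$. Hence

$$U_f(P) - L_f(P) = \sum_{i=1}^{n} (M_i - m_i) \, \text{area}(A_i) \le \frac{\varepsilon}{\text{area}(A)} \cdot \sum_{i=1}^{n} \text{area}(A_i) = \varepsilon$$

Hence, by Theorem 60.2, $f$ is Riemann-Darboux integrable on $A$.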

Definition 61.1. A function $f$ is called piecewise-continuous on $A$ if there exist a partition $P = \{A_1, \dots, A_n\}$ of $A$ and continuous functions $f_i$, $i = 1, \dots, n$, defined on $A_i$ such that $f(x) = f_i(x)$ for $x \in \text{Int}(A_i)$.

Theorem 61.2. A piecewise-continuous function is Riemann-Darboux integrable and

$$\iint_A f(x, y) \, dx \, dy = \sum_{i=1}^{n} \iint_{A_i} f_i(x, y) \, dx \, dy$$

Proof. The proof is rather technical and it will be skipped.

62 Properties of the Riemann-Darboux integral

Theorem 62.1. If $f$ and $g$ are Riemann-Darboux integrable on $A$, then all the integrals below exist and the following relations hold:

(1) $\iint_A [\alpha f(x, y) + \beta g(x, y)] \, dx \, dy = \alpha \iint_A f(x, y) \, dx \, dy + \beta \iint_A g(x, y) \, dx \, dy$, $\quad \alpha, \beta \in \mathbb{R}^1$

(2) $\iint_A f(x, y) \, dx \, dy = \iint_{A_1} f(x, y) \, dx \, dy + \iint_{A_2} f(x, y) \, dx \, dy$, where $A_1 \cup A_2 = A$ and $A_1 \cap A_2 = \emptyset$

(3) if $f(x, y) \le g(x, y)$ on $A$, then $\iint_A f(x, y) \, dx \, dy \le \iint_A g(x, y) \, dx \, dy$

(4) $\left| \iint_A f(x, y) \, dx \, dy \right| \le \iint_A |f(x, y)| \, dx \, dy$

Property (1) is called the linearity of the integral and (2) is called the additive property.

Proof. The proof uses the definition of the Riemann-Darboux integral.


Theorem 62.2 (The mean value theorem). Let $f : A \to \mathbb{R}^1$ be integrable on $A$ and satisfy

$$m \le f(x, y) \le M \quad \text{for } (x, y) \in A$$

Then

$$m \cdot \text{area}(A) \le \iint_A f(x, y) \, dx \, dy \le M \cdot \text{area}(A)$$

Proof. Use property (3) from Theorem 62.1.

63 Riemann-Darboux integral calculus when A is rectangular

We intend to show that under some conditions the computation of the integral of a function of two variables reduces to the iterated computation of integrals of functions of one variable. Assume that $A$ is given by $A = [a, b] \times [c, d]$ and $f : A \to \mathbb{R}^1$.

Theorem 63.1. If the function $f$ is integrable on $A$ and if for every fixed $x \in [a, b]$ the function $f_x(y) = f(x, y)$ is integrable on $[c, d]$, i.e. the integral

$$I(x) = \int_c^d f_x(y) \, dy = \int_c^d f(x, y) \, dy$$

exists, then the iterated integral

$$\int_a^b dx \int_c^d f(x, y) \, dy$$

exists too and the following equality holds:

$$\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy$$

Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_n = b\}$ of $[a, b]$ and $P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_m = d\}$ of $[c, d]$. Hence $P = \{A_{ij}\}_{i = 0, \dots, n-1, \; j = 0, \dots, m-1}$ is a partition of $A$, where $A_{ij} = [x_i, x_{i+1}) \times [y_j, y_{j+1})$. Denote by

$$m_{ij} = \inf\{f(x, y) \mid (x, y) \in A_{ij}\} \quad \text{and} \quad M_{ij} = \sup\{f(x, y) \mid (x, y) \in A_{ij}\}$$

For $(x, y) \in A_{ij}$ we have

$$m_{ij} \le f(x, y) \le M_{ij}, \quad i = 0, \dots, n-1, \quad j = 0, \dots, m-1$$


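Hence

$$m_{ij} \cdot [y_{j+1} - y_j] \le \int_{y_j}^{y_{j+1}} f(x, y) \, dy \le M_{ij} \cdot [y_{j+1} - y_j]$$

for $i = 0, \dots, n-1$, $j = 0, \dots, m-1$ and $x \in [x_i, x_{i+1}]$. Therefore

$$m_{ij} \cdot [y_{j+1} - y_j] \le \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \le \int_{y_j}^{y_{j+1}} f(x, y) \, dy \le$$

$$\le \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \le M_{ij} \cdot [y_{j+1} - y_j]$$

and hence

$$m_{ij} [y_{j+1} - y_j][x_{i+1} - x_i] \le [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \le$$

$$\le [x_{i+1} - x_i] \cdot \int_{y_j}^{y_{j+1}} f(x, y) \, dy \le$$

$$\le [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\} \le M_{ij} [y_{j+1} - y_j][x_{i+1} - x_i]$$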

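Hence, summing over $i$ and $j$, for any choice of points $\xi_i \in [x_i, x_{i+1}]$ we obtain

$$L_f(P) \le L_{I(x)}(P_x) \le \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \int_c^d f(\xi_i, y) \, dy \le U_{I(x)}(P_x) \le U_f(P)$$

where

$$L_f(P) = \sum_{i=0}^{n-1} \sum_{j=0}^{m-1} m_{ij} [x_{i+1} - x_i][y_{j+1} - y_j]$$

$$L_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \inf_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\}$$

$$U_{I(x)}(P_x) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} [x_{i+1} - x_i] \cdot \sup_x \left\{ \int_{y_j}^{y_{j+1}} f(x, y) \, dy \;\Big|\; x \in [x_i, x_{i+1}] \right\}$$

$$U_f(P) = \sum_{j=0}^{m-1} \sum_{i=0}^{n-1} M_{ij} [x_{i+1} - x_i][y_{j+1} - y_j]$$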


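Since

$$\sup L_f(P) = \inf U_f(P) = \iint_A f(x, y) \, dx \, dy$$

we have

$$\sup L_{I(x)}(P_x) = \inf U_{I(x)}(P_x) = \int_a^b dx \int_c^d f(x, y) \, dy$$

and

$$\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_c^d f(x, y) \, dy$$

Interchanging the roles of $x$ and $y$ (that is, assuming that for every fixed $y \in [c, d]$ the function $x \mapsto f(x, y)$ is integrable on $[a, b]$) we obtain also

$$\iint_A f(x, y) \, dx \, dy = \int_c^d dy \int_a^b f(x, y) \, dx$$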

Example 63.1. Consider $A = [1, 3] \times [1, 2]$ and $f(x, y) = \frac{1}{(x + y)^2}$. Evaluate $\iint_A f(x, y) \, dx \, dy$.
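For Example 63.1, Theorem 63.1 gives $\int_1^3 dx \int_1^2 \frac{dy}{(x+y)^2} = \int_1^3 \left(\frac{1}{x+1} - \frac{1}{x+2}\right) dx = \ln\frac{6}{5}$. The sketch below, added for illustration, approximates the same iterated integral with a midpoint rule; the function name `iterated_midpoint` and the grid size are assumptions made here.

```python
import math

def iterated_midpoint(f, a, b, c, d, n=200):
    """Approximate the iterated integral of f over [a, b] x [c, d].

    For each midpoint x of the outer grid the inner integral over y is
    approximated first, mirroring the iterated integral of Theorem 63.1.
    """
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        inner = sum(f(x, c + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

approx = iterated_midpoint(lambda x, y: 1.0 / (x + y) ** 2, 1.0, 3.0, 1.0, 2.0)
exact = math.log(6.0 / 5.0)

print(approx, exact)
```

The numerical value agrees with $\ln(6/5)$ to several decimal places, which is a useful sanity check on the hand computation.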

Example 63.2. Evaluate $\iint_A f(x, y) \, dx \, dy$ in the following cases:

a) $A = [1, 3] \times [2, 5]$ and $f(x, y) = 5x^2 y - 2y^3$

b) $A = [0, 1] \times [0, 1]$ and $f(x, y) = \frac{x^2}{1 + y^2}$

c) $A = [0, 1] \times [0, 1]$ and $f(x, y) = \frac{y}{(1 + x^2 + y^2)^{\frac{3}{2}}}$

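Theorem 63.2. If $A = [a, b] \times [c, d]$ and $f : A \to \mathbb{R}^1$ is continuous, then the function

$$F(x, y) = \iint_{[a, x] \times [c, y]} f(u, v) \, du \, dv = \int_a^x \int_c^y f(u, v) \, du \, dv$$

has continuous first-order partial derivatives:

$$\frac{\partial F}{\partial x} = \int_c^y f(x, v) \, dv \qquad \frac{\partial F}{\partial y} = \int_a^x f(u, y) \, du$$

The second-order partial derivative $\frac{\partial^2 F}{\partial x \partial y}$ exists and

$$\frac{\partial^2 F}{\partial x \partial y} = f(x, y) \quad \text{for } (x, y) \in A$$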


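Proof. Represent $F(x, y)$ in the following form

$$F(x, y) = \int_a^x \int_c^y f(u, v) \, du \, dv = \int_a^x du \int_c^y f(u, v) \, dv = \int_c^y dv \int_a^x f(u, v) \, du$$

and apply the fundamental theorem of calculus to each of the two iterated integrals.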

Theorem 63.3. If there exists $\Phi : A \to \mathbb{R}^1$ such that

$$\frac{\partial^2 \Phi}{\partial x \partial y} = f(x, y)$$

then

$$\iint_A f(x, y) \, dx \, dy = \Phi(a, c) - \Phi(b, c) + \Phi(b, d) - \Phi(a, d)$$

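Proof. Consider the partitions $P_x = \{a = x_0 < x_1 < \cdots < x_i < \cdots < x_m = b\}$ and $P_y = \{c = y_0 < y_1 < \cdots < y_j < \cdots < y_n = d\}$ of $[a, b]$ and $[c, d]$ respectively. Now let $P = \{A_{ij}\}_{i = 0, \dots, m-1, \; j = 0, \dots, n-1}$ be a partition of $A$, where $A_{ij} = [x_i, x_{i+1}] \times [y_j, y_{j+1}]$, and apply the mean value theorem twice to the expression

$$\Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j)$$

obtaining

$$\Phi(x_{i+1}, y_{j+1}) - \Phi(x_{i+1}, y_j) - \Phi(x_i, y_{j+1}) + \Phi(x_i, y_j) =$$

$$= \frac{\partial^2 \Phi}{\partial x \partial y}(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) = f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j)$$

where $x_i \le \xi_{ij} \le x_{i+1}$ and $y_j \le \eta_{ij} \le y_{j+1}$. Summing over $i$ and $j$, the left-hand sides telescope, hence

$$\sum_{i=0}^{m-1} \sum_{j=0}^{n-1} f(\xi_{ij}, \eta_{ij})(x_{i+1} - x_i)(y_{j+1} - y_j) = \Phi(b, d) - \Phi(b, c) - \Phi(a, d) + \Phi(a, c)$$

The left-hand side is a Riemann sum of $f$ related to $P$, so letting $\nu(P) \to 0$ gives the equality

$$\iint_A f(x, y) \, dx \, dy = \Phi(a, c) - \Phi(b, c) + \Phi(b, d) - \Phi(a, d)$$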
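Theorem 63.3 can be tried on a simple case. The sketch below is an illustration added here: it takes $f(x, y) = xy$ on $A = [0, 1] \times [0, 2]$ with $\Phi(x, y) = \frac{x^2 y^2}{4}$ (so that $\frac{\partial^2 \Phi}{\partial x \partial y} = xy$), and compares the corner formula with a midpoint-rule approximation of the double integral. The choice of $f$, $\Phi$ and the rectangle is an assumption made for this example.

```python
def phi(x, y):
    """A function whose mixed second derivative is f(x, y) = x * y."""
    return x * x * y * y / 4.0

def f(x, y):
    return x * y

def midpoint_double_integral(f, a, b, c, d, n=100):
    """Midpoint-rule approximation of the double integral of f over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
    return total * hx * hy

a, b, c, d = 0.0, 1.0, 0.0, 2.0
corner_formula = phi(a, c) - phi(b, c) + phi(b, d) - phi(a, d)
numeric = midpoint_double_integral(f, a, b, c, d)

print(corner_formula, numeric)
```

Both values equal 1 up to rounding, in line with the theorem; this is the two-variable analogue of evaluating a primitive at the endpoints.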

64 Riemann-Darboux integral calculus when A is not a rectangle

Let $A$ be the set defined by

$$A = \{(x, y) \mid x \in [a, b] \text{ and } y \in [g(x), h(x)]\}$$

where $g, h$ are continuous functions satisfying $g(x) \le h(x)$ for every $x \in [a, b]$.


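Theorem 64.1. If the function $f : A \to \mathbb{R}^1$ is integrable on $A$ and for every $x \in [a, b]$ the integral

$$I(x) = \int_{g(x)}^{h(x)} f(x, y) \, dy$$

exists, then the iterated integral

$$\int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy$$

exists too and the following equality holds:

$$\iint_A f(x, y) \, dx \, dy = \int_a^b dx \int_{g(x)}^{h(x)} f(x, y) \, dy$$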

Proof. This case can be reduced to the case when A is a rectangle.

Example 64.1. Compute $\iint_A y^2 \sqrt{R^2 - x^2} \, dx \, dy$ where $A = \{(x, y) \mid x^2 + y^2 \le R^2\}$.

Example 64.2. Compute $\iint_A (x^2 + y) \, dx \, dy$ where $A = \{(x, y) \mid y^2 - x \le 0 \text{ and } x^2 - y \le 0\}$.
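In Example 64.2 the set $A$ can be written as $\{(x, y) \mid x \in [0, 1], \; x^2 \le y \le \sqrt{x}\}$, so Theorem 64.1 applies with $g(x) = x^2$ and $h(x) = \sqrt{x}$, and a hand computation gives $\int_0^1 \left(x^{5/2} - \tfrac{3}{2}x^4 + \tfrac{1}{2}x\right) dx = \frac{33}{140}$. The sketch below, added here as an illustration with a hypothetical helper `integrate_between_curves`, checks this value numerically.

```python
import math

def integrate_between_curves(f, a, b, g, h, n=400):
    """Midpoint approximation of the iterated integral over
    {(x, y) : a <= x <= b, g(x) <= y <= h(x)} as in Theorem 64.1."""
    hx = (b - a) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * hx
        lo, hi = g(x), h(x)
        hy = (hi - lo) / n
        inner = sum(f(x, lo + (j + 0.5) * hy) for j in range(n)) * hy
        total += inner * hx
    return total

approx = integrate_between_curves(
    lambda x, y: x * x + y,   # integrand of Example 64.2
    0.0, 1.0,
    lambda x: x * x,          # lower curve y = x^2
    math.sqrt,                # upper curve y = sqrt(x)
)
exact = 33.0 / 140.0

print(approx, exact)
```

The same helper can be reused for other regions of this type by changing the two boundary functions.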

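Let $A, B \subset \mathbb{R}^2$ be Jordan measurable sets.

Theorem 64.2. If $T : B \to A$ is a bijection such that $T$ and $T^{-1}$ have continuous partial derivatives, then

$$\text{area}(A) = \iint_A dx \, dy = \iint_B \left| \, \begin{vmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{vmatrix} \, \right| \, d\xi \, d\eta$$

where $T(\xi, \eta) = (x(\xi, \eta), y(\xi, \eta))$ for every $(\xi, \eta) \in B$.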

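Proof. Consider a partition $P_B = \{B_1, B_2, \dots, B_n\}$ of $B$ and the corresponding partition $P_A = \{A_1, A_2, \dots, A_n\}$ of $A$ with $A_i = T(B_i)$. If $\nu(P_B)$ is small, then, with the determinant evaluated at a point of $B_i$,

$$\text{area}(A_i) \approx \left| \, \begin{vmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{vmatrix} \, \right| \cdot \text{area}(B_i)$$

Hence

$$\sum_{i=1}^{n} \text{area}(A_i) \approx \sum_{i=1}^{n} \left| \, \begin{vmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{vmatrix} \, \right| \cdot \text{area}(B_i)$$

Considering a sequence of partitions $P_B^n$ with $\nu(P_B^n) \to 0$ as $n \to \infty$, we obtain the stated result.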


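Theorem 64.3. If $A, B \subset \mathbb{R}^2$ are Jordan measurable sets, $T : B \to A$ is a bijection such that $T$ and $T^{-1}$ have continuous partial derivatives and $f : A \to \mathbb{R}^1$ is an integrable function, then the following equality holds:

$$\iint_A f(x, y) \, dx \, dy = \iint_B f(x(\xi, \eta), y(\xi, \eta)) \left| \, \begin{vmatrix} \frac{\partial x}{\partial \xi} & \frac{\partial x}{\partial \eta} \\ \frac{\partial y}{\partial \xi} & \frac{\partial y}{\partial \eta} \end{vmatrix} \, \right| \, d\xi \, d\eta$$

Proof. Similar to the proof of Theorem 64.2.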


Example. Compute $\iint_A y^2 \sqrt{R^2 - x^2} \, dx \, dy$ where $A = \{(x, y) \mid x^2 + y^2 \le R^2\}$ by an appropriate change of variables.
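One natural change of variables for this integral is the polar map $T(\rho, \theta) = (\rho \cos\theta, \rho \sin\theta)$ on $B = [0, R] \times [0, 2\pi]$, whose Jacobi determinant is $\rho$. The sketch below is an illustration added here, not the notes' own solution: it applies the formula of Theorem 64.3 numerically for $R = 1$ and compares the result with the value $\frac{32 R^5}{45}$ obtained by iterated integration in Cartesian coordinates. The polar map fails to be a bijection on the boundary of $B$, which is assumed here to be harmless since that set has area zero.

```python
import math

def midpoint_2d(f, a, b, c, d, n=200):
    """Midpoint-rule approximation of the double integral of f over [a, b] x [c, d]."""
    hx, hy = (b - a) / n, (d - c) / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += f(a + (i + 0.5) * hx, c + (j + 0.5) * hy)
    return total * hx * hy

R = 1.0

def transformed_integrand(rho, theta):
    """f(x(rho, theta), y(rho, theta)) times |Jacobi determinant| = rho."""
    x = rho * math.cos(theta)
    y = rho * math.sin(theta)
    return y * y * math.sqrt(R * R - x * x) * rho

approx = midpoint_2d(transformed_integrand, 0.0, R, 0.0, 2.0 * math.pi)
exact = 32.0 * R**5 / 45.0

print(approx, exact)
```

In polar coordinates the domain of integration becomes a rectangle, so the iterated-integral formula for rectangles applies directly.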

65 Jordan measurable subsets of Rn

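Consider the one-dimensional bounded intervals of the form $(a, b)$, $[a, b)$, $(a, b]$, $[a, b]$, where $a, b \in \mathbb{R}$. The cartesian product $\Delta = I_1 \times \cdots \times I_n$, where the $I_i$ are intervals of this type, is called a hypercube in $\mathbb{R}^n$. The volume of such a hypercube $\Delta$ is defined by

$$\text{vol}(\Delta) = \text{length}(I_1) \cdot \text{length}(I_2) \cdots \text{length}(I_n)$$

Consider the set $\mathcal{P}$ of the finite unions of hypercubes $\Delta$, i.e. $P \in \mathcal{P}$ if and only if there exist $\Delta_1, \Delta_2, \dots, \Delta_k$ such that

$$P = \bigcup_{l=1}^{k} \Delta_l$$

It is easy to verify that the following statement holds:

$$P_1, P_2 \in \mathcal{P} \;\Rightarrow\; P_1 \cup P_2 \in \mathcal{P} \text{ and } P_1 \setminus P_2 \in \mathcal{P}$$

Proposition 65.1. For any $P \in \mathcal{P}$ there exist $\Delta_1, \Delta_2, \dots, \Delta_k$ such that $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_p \cap \Delta_q = \emptyset$ if $p \ne q$.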

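Definition 65.1. For $P \in \mathcal{P}$ the volume is defined as

$$\text{vol}(P) = \sum_{l=1}^{k} \text{vol}(\Delta_l)$$

where $P = \bigcup_{l=1}^{k} \Delta_l$ and $\Delta_1, \Delta_2, \dots, \Delta_k$ are given by Proposition 65.1.

Proposition 65.2. The volume defined above for $P \in \mathcal{P}$ satisfies:

1. $\text{vol}(P) \ge 0$ for $P \in \mathcal{P}$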


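2. $P_1, P_2 \in \mathcal{P}$, $P_1 \cap P_2 = \emptyset \;\Rightarrow\; \text{vol}(P_1 \cup P_2) = \text{vol}(P_1) + \text{vol}(P_2)$

3. $\text{vol}(P)$ is independent of the decomposition of $P$.

Definition 65.2. For a bounded set $A \subset \mathbb{R}^n$ we define

$$\text{vol}_i(A) = \sup_{P \subset A, \, P \in \mathcal{P}} \text{vol}(P) \qquad \text{vol}_e(A) = \inf_{P \supset A, \, P \in \mathcal{P}} \text{vol}(P)$$

Definition 65.3. A bounded set $A \subset \mathbb{R}^n$ is said to be Jordan measurable if

$$\text{vol}_i(A) = \text{vol}_e(A)$$

Definition 65.4. If the bounded set $A \subset \mathbb{R}^n$ is Jordan measurable, then the volume of $A$ is defined as

$$\text{vol}(A) = \text{vol}_i(A) = \text{vol}_e(A)$$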

Proposition 65.3. A bounded set $A \subset \mathbb{R}^n$ is Jordan measurable if and only if for any $\varepsilon > 0$ there exist $P_\varepsilon, Q_\varepsilon \in \mathcal{P}$ such that $P_\varepsilon \subset A \subset Q_\varepsilon$ and $\text{vol}(Q_\varepsilon) - \text{vol}(P_\varepsilon) < \varepsilon$.

Proposition 65.4. A bounded set $A \subset \mathbb{R}^n$ is Jordan measurable if and only if there exist two sequences $(P_k)$, $(Q_k)$, with $P_k, Q_k \in \mathcal{P}$ and $P_k \subset A \subset Q_k$, such that

$$\lim_{k \to \infty} \text{vol}(P_k) = \lim_{k \to \infty} \text{vol}(Q_k)$$

In this case we have

$$\text{vol}(A) = \lim_{k \to \infty} \text{vol}(P_k) = \lim_{k \to \infty} \text{vol}(Q_k)$$

Proposition 65.5. A bounded set $A \subset \mathbb{R}^n$ is Jordan measurable if and only if the volume of its boundary is equal to zero.

Proposition 65.6. If $A_1$ and $A_2$ are Jordan measurable sets, then $A_1 \cup A_2$ and $A_1 \setminus A_2$ are Jordan measurable, and if $A_1 \cap A_2 = \emptyset$, then $\text{vol}(A_1 \cup A_2) = \text{vol}(A_1) + \text{vol}(A_2)$.

Proposition 65.7. Let $M \subset \mathbb{R}^n$ be a bounded set. If for any $\varepsilon > 0$ there exist two Jordan measurable sets $A$ and $B$ such that $A \subset M \subset B$ and $\text{vol}(B) - \text{vol}(A) < \varepsilon$, then the set $M$ is Jordan measurable.

Proposition 65.8. If there exist two sequences $(A_k)$ and $(B_k)$ of Jordan measurable sets such that $A_k \subset M \subset B_k$ and

$$\lim_{k \to \infty} \text{vol}(A_k) = \lim_{k \to \infty} \text{vol}(B_k)$$

then $M$ is Jordan measurable and

$$\text{vol}(M) = \lim_{k \to \infty} \text{vol}(A_k) = \lim_{k \to \infty} \text{vol}(B_k)$$

The proofs of the above statements are rather technical and will be omitted.


66 The Riemann-Darboux integral of an n-variable function

Let A be a bounded and Jordan measurable subset of Rn.

Definition 66.1. A partition $P$ of $A$ is a finite set of subsets $A_l$, $l = 1, \dots, k$, of $A$ having the following properties:

1) every $A_l$ is Jordan measurable

2) $\bigcup_{l=1}^{k} A_l = A$

3) if $p \ne q$ then $A_p \cap A_q = \emptyset$

The diameter of the set $A_l$ is the number

$$d(A_l) = \sup_{x, y \in A_l} \|x - y\|$$

The norm of the partition $P$ is the number

$$\nu(P) = \max\{d(A_1), \dots, d(A_k)\}$$

Suppose now that $f$ is a real-valued function of $n$ variables defined and bounded on $A$, $f : A \to \mathbb{R}^1$. Then $f$ is bounded on each part $A_l$, $l = 1, \dots, k$. Hence $f$ has a least upper bound $M_l$ and a greatest lower bound $m_l$ on $A_l$ ($l = 1, \dots, k$).

Definition 66.2. The upper sum of $f$ related to $P$ is defined by

$$U_f(P) = \sum_{l=1}^{k} M_l \cdot \text{vol}(A_l)$$

where $M_l = \sup\{f(x) \mid x \in A_l\}$.

The lower sum of $f$ related to $P$ is defined by

$$L_f(P) = \sum_{l=1}^{k} m_l \cdot \text{vol}(A_l)$$

where $m_l = \inf\{f(x) \mid x \in A_l\}$.

The Riemann sum of $f$ related to $P$ is defined by

$$\sigma_f(P) = \sum_{l=1}^{k} f(\xi_l) \cdot \text{vol}(A_l)$$

where $\xi_l \in A_l$.


Remark 66.1. It is clear that the following inequalities hold:

$$L_f(P) \le \sigma_f(P) \le U_f(P)$$

Now $f$ is bounded above and below on $A$. So there exist numbers $m$ and $M$ such that

$$m \le f(x) \le M \quad \text{for } x \in A$$

Thus for any partition $P$ of $A$ we have

$$m \cdot \text{vol}(A) = m \cdot \sum_{l=1}^{k} \text{vol}(A_l) \le L_f(P) \le \sigma_f(P) \le U_f(P) \le M \cdot \sum_{l=1}^{k} \text{vol}(A_l) = M \cdot \text{vol}(A)$$

Hence the set

$$\mathcal{L}_f = \{L_f(P) \mid P \text{ is a partition of } A\}$$

is bounded above and the set

$$\mathcal{U}_f = \{U_f(P) \mid P \text{ is a partition of } A\}$$

is bounded below. So $L_f = \sup \mathcal{L}_f$ and $U_f = \inf \mathcal{U}_f$ exist. The first result establishes the intuitively obvious fact that for a bounded function $L_f \le U_f$.

Theorem 66.1. If f is defined and bounded on A, then Lf ≤ Uf .

Proof. The same as in two dimensions.

Definition 66.3. A function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if $L_f = U_f$. This common value is denoted by

$$\int_A f(x) \, dx \quad \text{or} \quad \int \cdots \int_A f(x_1, \dots, x_n) \, dx_1 \cdots dx_n$$

and is called the Riemann-Darboux integral of $f$.

Theorem 66.2. The function $f$ defined and bounded on $A$ is Riemann-Darboux integrable on $A$ if and only if for any $\varepsilon > 0$ there exists a partition $P$ such that

$$U_f(P) - L_f(P) < \varepsilon$$

Proof. The same as in two dimensions.

Remark 66.2. The constant function $f(x) = 1$ for $x \in A$ is Riemann-Darboux integrable on $A$ and

$$\int \cdots \int_A 1 \, dx_1 \cdots dx_n = \text{vol}(A)$$


67 Integrable functions of n variables

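Theorem 67.1. If f is continuous on A and A is Jordan measurable, then f is Riemann-Darboux integrable on A.

Proof. Since A is compact, the function f is uniformly continuous. For ε > 0 there exists δ > 0 such that

$$\|x' - x''\| < \delta \;\Rightarrow\; |f(x') - f(x'')| < \frac{\varepsilon}{\mathrm{vol}(A)}$$

Choose a partition $P = \{A_1, \dots, A_k\}$ such that $x', x'' \in A_i \Rightarrow \|x' - x''\| < \delta$ for $i = 1, \dots, k$. Hence

$$U_f(P) - L_f(P) = \sum_{i=1}^{k} (M_i - m_i) \cdot \mathrm{vol}(A_i) < \frac{\varepsilon}{\mathrm{vol}(A)} \cdot \sum_{i=1}^{k} \mathrm{vol}(A_i) = \varepsilon$$

Hence, by Theorem 66.2, f is Riemann-Darboux integrable on A.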

Definition 67.1. A function f is called piecewise-continuous on A if there exists a partition $P = \{A_1, \dots, A_k\}$ of A and continuous functions $f_i$, $i = 1, \dots, k$, defined on $A_i$ such that $f(x) = f_i(x)$ for $x \in \mathrm{Int}(A_i)$.

Theorem 67.2. A piecewise-continuous function is Riemann-Darboux integrable and

$$\int_A f(x)\,dx = \sum_{i=1}^{k} \int_{A_i} f_i(x)\,dx$$

Proof. The proof is technical and is omitted.

68 Properties of the Riemann-Darboux integral of n-variable functions

Theorem 68.1. If f and g are Riemann-Darboux integrable on $A \subset \mathbb{R}^n$, then all the integrals below exist and the following relations hold:

(1) $\int_A [\alpha f(x) + \beta g(x)]\,dx = \alpha \int_A f(x)\,dx + \beta \int_A g(x)\,dx$, for $\alpha, \beta \in \mathbb{R}^1$

(2) $\int_A f(x)\,dx = \int_{A_1} f(x)\,dx + \int_{A_2} f(x)\,dx$, where $A_1 \cup A_2 = A$ and $A_1 \cap A_2 = \emptyset$

(3) if $f(x) \le g(x)$ on A, then $\int_A f(x)\,dx \le \int_A g(x)\,dx$

(4) $\left| \int_A f(x)\,dx \right| \le \int_A |f(x)|\,dx$


Property (1) is called the linearity of the integral and (2) is called the additive property.

Proof. It is proved using the definition of the Riemann-Darboux integral.

Theorem 68.2 (The mean value theorem). Let $f : A \to \mathbb{R}^1$ be integrable on A and satisfy

$$m \le f(x) \le M \quad \text{for } x \in A$$

Then

$$m \cdot \mathrm{vol}(A) \le \int_A f(x)\,dx \le M \cdot \mathrm{vol}(A)$$

Proof. Use property (3) from Theorem 68.1.

69 Riemann-Darboux integral calculus for n-variable functions when A is a hypercube

We intend to show that, under certain conditions, the computation of the integral of a function of n variables reduces to the iterated computation of integrals of functions of one variable.

Assume that A is a hypercube $A = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_n, b_n]$ and $f : A \to \mathbb{R}^1$.

Theorem 69.1. If the function f is integrable on A, if for every fixed $x_1 \in [a_1, b_1]$ the function $f_{x_1}(x_2, \dots, x_n) = f(x_1, x_2, \dots, x_n)$ is integrable on $A_1 = [a_2, b_2] \times \cdots \times [a_n, b_n]$, i.e. the integral

$$I(x_1) = \int \cdots \int_{A_1} f_{x_1}(x_2, \dots, x_n)\,dx_2 \cdots dx_n = \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \dots, x_n)\,dx_2 \cdots dx_n$$

exists, and if the integral

$$\int_{a_1}^{b_1} I(x_1)\,dx_1 = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \dots, x_n)\,dx_2 \cdots dx_n$$

exists, then

$$\int_A f(x)\,dx = \int_{a_1}^{b_1} I(x_1)\,dx_1 = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} \cdots \int_{a_n}^{b_n} f(x_1, x_2, \dots, x_n)\,dx_2 \cdots dx_n$$

Proof. Similar to the proof for functions of two variables.


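Example 69.1. Compute $\iiint_A \frac{dx\,dy\,dz}{(1 + x + y + z)^3}$ for the set A bounded by the planes x = 0, y = 0, z = 0 and x + y + z = 1.

Example 69.2. Compute $\iiint_A z\,dx\,dy\,dz$ for the set A defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$.

Example 69.3. Compute $\iiint_A \left( \frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \right) dx\,dy\,dz$ for the set A defined by $\frac{x^2}{a^2} + \frac{y^2}{b^2} + \frac{z^2}{c^2} \le 1$.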

Remark 69.1. The above theorem successively reduces the evaluation of the integral to the evaluation of integrals of functions of one variable:

$$\int_{a_1}^{b_1} \cdots \int_{a_n}^{b_n} f(x_1, \dots, x_n)\,dx_1\,dx_2 \cdots dx_n = \int_{a_1}^{b_1} dx_1 \int_{a_2}^{b_2} dx_2 \cdots \int_{a_n}^{b_n} f(x_1, \dots, x_n)\,dx_n$$
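The reduction to one-variable integrals can be sketched in code. The following is an illustrative sketch only, assuming a continuous integrand on a box and the midpoint rule for each one-dimensional integral; the function names `integrate_1d` and `integrate_box` are hypothetical and not part of the notes.

```python
def integrate_1d(g, a, b, n=20):
    """Midpoint rule for a function of one variable on [a, b]."""
    h = (b - a) / n
    return h * sum(g(a + (i + 0.5) * h) for i in range(n))


def integrate_box(f, bounds, n=20):
    """Iterated integral of f over the box bounds = [(a1, b1), ..., (an, bn)]."""
    if not bounds:
        return f()  # all variables have been fixed
    (a, b), rest = bounds[0], bounds[1:]

    def inner(x1):
        # I(x1): integral over the remaining variables with x1 held fixed.
        return integrate_box(lambda *xs: f(x1, *xs), rest, n)

    return integrate_1d(inner, a, b, n)


# f(x, y, z) = x*y*z on the unit cube; the exact value is 1/8.
value = integrate_box(lambda x, y, z: x * y * z, [(0.0, 1.0)] * 3)
print(value)
```

The recursion mirrors the remark: the outermost one-dimensional integral is taken over $x_1$, and its integrand $I(x_1)$ is itself an iterated integral over the remaining variables.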

Remark 69.2. Theorem 69.1 is also valid for more general sets A, as in the 2-dimensional case.

Theorem 69.2. If $A, B \subset \mathbb{R}^n$ are Jordan measurable and $T : B \to A$ is a bijection, T and $T^{-1}$ having continuous partial derivatives, then

$$\mathrm{vol}(A) = \int_A 1\,dx = \int \cdots \int_A dx_1 \cdots dx_n = \int \cdots \int_B \left| \frac{D(x_1, \dots, x_n)}{D(\xi_1, \dots, \xi_n)} \right| d\xi_1 \cdots d\xi_n$$

where $T(\xi_1, \dots, \xi_n) = (x_1(\xi_1, \dots, \xi_n), \dots, x_n(\xi_1, \dots, \xi_n))$.

Proof. As in the 2-dimensional case.

Theorem 69.3. If $A, B \subset \mathbb{R}^n$ are Jordan measurable sets, $T : B \to A$ is a bijection, T and $T^{-1}$ having continuous partial derivatives, and $f : A \to \mathbb{R}^1$ is an integrable function, then the following equality holds:

$$\int_A f(x)\,dx = \int_B f(x(\xi)) \left| \frac{D(x_1, \dots, x_n)}{D(\xi_1, \dots, \xi_n)} \right| d\xi$$

where $\left| \frac{D(x_1, \dots, x_n)}{D(\xi_1, \dots, \xi_n)} \right|$ is the absolute value of the determinant of the Jacobi matrix of T.

Proof. As in two dimensions.
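As a numerical illustration of the change of variables formula (a sketch under stated assumptions, not part of the original notes), take T to be the spherical coordinate map from the box B = [0, R] × [0, π] × [0, 2π] onto the ball of radius R. This T fails to be a bijection only on the boundary of B, a set of volume zero, which is assumed here not to affect the integral. The Jacobian determinant is approximated by central differences and the integral over B by the midpoint rule; `abs_jacobian` and `ball_volume` are hypothetical names.

```python
import math


def T(r, theta, phi):
    """Spherical coordinates: maps (r, theta, phi) in B to a point of the ball."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def abs_jacobian(xi, eps=1e-6):
    """|det| of the Jacobi matrix of T at xi, by central differences."""
    cols = []
    for j in range(3):
        up, dn = list(xi), list(xi)
        up[j] += eps
        dn[j] -= eps
        cols.append([(p - q) / (2 * eps) for p, q in zip(T(*up), T(*dn))])
    # cols holds the transposed Jacobi matrix; the determinant is unchanged.
    (a, b, c), (d, e, f), (g, h, i) = cols
    return abs(a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g))


def ball_volume(R, n=24):
    """Midpoint-rule integral of |Jacobian| over B = [0,R] x [0,pi] x [0,2pi]."""
    hr, ht, hp = R / n, math.pi / n, 2 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            for k in range(n):
                xi = ((i + 0.5) * hr, (j + 0.5) * ht, (k + 0.5) * hp)
                total += abs_jacobian(xi) * hr * ht * hp
    return total


volume = ball_volume(1.0)
print(volume, 4 * math.pi / 3)
```

The computed value should be close to $4\pi R^3/3$, in agreement with the closed-form Jacobian $r^2 \sin\theta$ of spherical coordinates.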

Example 69.4. Compute the volume of the set $A \subset \mathbb{R}^3$ defined by $x^2 + y^2 + z^2 \le R^2$.

Example 69.5. Compute the integral $\iiint_A \frac{xyz}{x^2 + y^2}\,dx\,dy\,dz$ where A is bounded above by the surface $(x^2 + y^2 + z^2)^2 = a^2 xy$ and below by the surface z = 0.


70 Elementary curves and elementary closed curves

The way of defining a line integral is quite similar to the familiar way of defining a definite integral known from calculus. In order to do this, we must introduce the concepts of curve and arc length. We will present these concepts in a particular framework which can be extended in a natural way.

Definition 70.1. An elementary curve (elementary arc) is a set of points $C \subset \mathbb{R}^3$ for which there exists a closed interval $[a, b] \subset \mathbb{R}$ and a function $\varphi : [a, b] \to C$ having the following properties:

a) ϕ is bijective;

b) ϕ is of class $C^1$ and $\varphi'(t) \neq 0$, $\forall t \in [a, b]$.

The points A = ϕ(a) and B = ϕ(b) are called the end points of the curve. The function ϕ is called a parametric representation of the curve and the vector ϕ′(t) is tangent to the curve at the point ϕ(t).

Figure 70.1:

Definition 70.2. An elementary closed curve is a set of points $C \subset \mathbb{R}^3$ for which there exists a closed interval $[a, b] \subset \mathbb{R}$ and a function $\varphi : [a, b] \to C$ with the following properties:

a) ϕ is bijective from [a, b) to C and ϕ(a) = ϕ(b);

b) ϕ is of class $C^1$ and $\varphi'(t) \neq 0$, $\forall t \in [a, b]$.

The function ϕ is called a parametric representation of the curve and the vector ϕ′(t) is tangent to the curve at the point ϕ(t).

Example 70.1. If $x^0 = (x_1^0, x_2^0, x_3^0)$ and $h = (h_1, h_2, h_3)$ with $h \neq 0$, the closed segment $[x^0, x^0 + h]$ which joins the points $x^0$ and $x^0 + h$ is an elementary curve. In this case, we can take [a, b] = [0, 1] and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (x_1^0 + t h_1, x_2^0 + t h_2, x_3^0 + t h_3)$.

Example 70.2. The circle C defined by $x_1^2 + x_2^2 = 1$ and $x_3 = 0$ is an elementary closed curve. In this case we can take [a, b] = [0, 2π] and $\varphi : [a, b] \to C$ is given by $\varphi(t) = (\cos t, \sin t, 0)$.

Figure 70.2:

Remark 70.1. Any elementary curve or elementary closed curve possesses infinitely many parametric representations.

Remark 70.2. The end points A, B of an elementary curve are independent of the parametric representation of the curve. This means that for every parametric representation $\psi : [c, d] \to C$ of the curve we have $\{\psi(c), \psi(d)\} = \{\varphi(a), \varphi(b)\} = \{A, B\}$.

Using a parametric representation of an elementary curve or of an elementary closed curve we can define the curve length.

Definition 70.3. The length of the elementary curve (elementary closed curve) C is given by:

$$l = \int_a^b \|\varphi'(t)\|\,dt = \int_a^b \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}\,dt$$

where $\varphi(t) = (\varphi_1(t), \varphi_2(t), \varphi_3(t))$ is a parametric representation of C and $\dot\varphi_i(t) = \frac{d\varphi_i}{dt}(t)$, i = 1, 2, 3.

Remark 70.3. The curve length is independent of the parametric representation of the curve C. In other words, if $\psi : [c, d] \to C$ is a second parametric representation of C, then

$$\int_c^d \|\psi'(\tau)\|\,d\tau = \int_a^b \|\varphi'(t)\|\,dt$$

Example 70.3. The length of the closed segment $[x^0, x^0 + h]$ represented by $\varphi(t) = x^0 + th$, $t \in [0, 1]$ is

$$l = \int_0^1 \sqrt{h_1^2 + h_2^2 + h_3^2}\,dt = \|h\|$$

and the length of the circle $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}$ represented by $\varphi(t) = (\cos t, \sin t, 0)$, $t \in [0, 2\pi]$ is

$$l = \int_0^{2\pi} \sqrt{\sin^2 t + \cos^2 t}\,dt = 2\pi$$

Example 70.4 (Shows that the curve length is independent of the parametric representation). For the circle $C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}$ the parametric representation $\varphi : [0, \pi] \to C$, $\varphi(t) = (\cos 2t, \sin 2t, 0)$ is chosen. Using this representation we have

$$l = \int_0^{\pi} 2\sqrt{\sin^2 2t + \cos^2 2t}\,dt = 2\pi$$

This is the same value as the one obtained in the case of the parametric representation $\varphi(t) = (\cos t, \sin t, 0)$, $t \in [0, 2\pi]$.
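The independence of the length from the parametric representation can also be checked numerically. The sketch below is illustrative only: it assumes the derivative of the representation is supplied by hand and approximates the integral of the speed by the midpoint rule; `curve_length`, `dphi` and `dpsi` are hypothetical names.

```python
import math


def curve_length(dphi, a, b, n=1000):
    """Midpoint-rule approximation of the integral of ||phi'(t)|| over [a, b]."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        total += math.sqrt(sum(c * c for c in dphi(t))) * h
    return total


def dphi(t):
    """Derivative of phi(t) = (cos t, sin t, 0), t in [0, 2*pi]."""
    return (-math.sin(t), math.cos(t), 0.0)


def dpsi(t):
    """Derivative of psi(t) = (cos 2t, sin 2t, 0), t in [0, pi]."""
    return (-2 * math.sin(2 * t), 2 * math.cos(2 * t), 0.0)


len_phi = curve_length(dphi, 0.0, 2 * math.pi)
len_psi = curve_length(dpsi, 0.0, math.pi)
print(len_phi, len_psi)
```

Both calls approximate the same number, the circumference 2π of the unit circle.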

Remark 70.4. Since $\varphi'(t) \neq 0$, $\forall t \in [a, b]$, an elementary curve C has only two end points. In other words, if C is an elementary curve then there exist two points A, B ∈ C such that for any parametric representation $\psi : [c, d] \to C$ of the curve C we have $\{\psi(c), \psi(d)\} = \{A, B\}$.

When ψ(c) = A and ψ(d) = B, then if τ moves from c to d, ψ(τ) moves from A to B. When ψ(c) = B and ψ(d) = A, then if τ moves from c to d, ψ(τ) moves from B to A.

The two ways to cover the curve C, from A to B or from B to A, are called orientations of C, and the facts presented above show that on an elementary curve there are two orientations. Moreover, the covering given by an arbitrary representation of C is one of the above mentioned orientations.

In other words, the set of representations is divided into two classes: all the representations which belong to one of the classes give one orientation (say from A to B) and those of the other class give the opposite orientation (from B to A).

Figure 70.3:

Consider now an elementary curve C with the parametric representation $\varphi : [a, b] \to C$ for which the orientation of the curve is from A to B (ϕ(a) = A, ϕ(b) = B).

If in the formula

$$l = \int_a^b \|\varphi'(\tau)\|\,d\tau$$

we replace the fixed upper limit b with a variable upper limit t, the integral becomes

$$s_A(t) = \int_a^t \|\varphi'(\tau)\|\,d\tau$$

The value $s_A(t)$ represents the length of the arc $AA' \subset C$, where A′ = ϕ(t).

The function $s_A$ is defined on the closed interval [a, b] and it is a bijection from [a, b] to [0, l], with $s_A(a) = 0$ and $s_A(b) = l$. Moreover, $s_A$ and $s_A^{-1}$ are continuously differentiable.

The function $s_A$ can be used to define a new parametric representation of the curve C, namely $x_A : [0, l] \to C$, $x_A = \varphi \circ s_A^{-1}$. In this representation of C, s ∈ [0, l] serves as a parameter and $x_A(0) = A$, $x_A(l) = B$. When s moves from 0 to l, $x_A(s)$ moves from A to B. The parametric representation $x_A$ is canonic when the orientation of C is from A to B, i.e. the parameter s is the length of the arc $A\,x_A(s)$ and there is no other representation with this property.

Consider now, for the same elementary curve C with the end points A and B, a parametric representation $\psi : [c, d] \to C$ for which the orientation is from B to A (ψ(c) = B and ψ(d) = A). If in the formula

$$l = \int_c^d \|\psi'(\tau)\|\,d\tau$$

we replace the fixed upper limit d with a variable upper limit t, the integral becomes

$$s_B(t) = \int_c^t \|\psi'(\tau)\|\,d\tau$$

The value $s_B(t)$ represents the length of the arc $BB' \subset C$, where B′ = ψ(t).

The function $s_B : [c, d] \to [0, l]$ is a bijection and $s_B(c) = 0$, $s_B(d) = l$. The function $s_B^{-1}$ can be used to define a new parametric representation $x_B = \psi \circ s_B^{-1}$ of C, such that $x_B(0) = B$, $x_B(l) = A$. If s moves from 0 to l, then $x_B(s)$ moves from B to A. The representation $x_B$ is canonic if the orientation of C is from B to A, i.e. the parameter s is the length of the arc $B\,x_B(s)$ and there is no other representation with this property.

Example 70.5. In the case of the closed segment $[x^0, x^0 + h]$ which joins the points $A = x^0$ and $B = x^0 + h$, if the parametric representation $\varphi(t) = x^0 + th$, $t \in [0, 1]$ is chosen, then ϕ(t) moves from A to B when t moves from 0 to 1. If the parametric representation $\psi(\tau) = x^0 + (2 - \tau)h$, $\tau \in [1, 2]$ is chosen, then ψ(τ) moves from $B = x^0 + h$ to $A = x^0$ when τ moves from 1 to 2.

Using the representation ϕ, the arc length AA′ is given by

$$s_A(t) = \int_0^t \sqrt{h_1^2 + h_2^2 + h_3^2}\,d\tau = t \cdot \|h\|$$

with $s_A(0) = 0$ and $s_A(1) = \|h\|$. The function $s_A^{-1} : [0, \|h\|] \to [0, 1]$ is $s_A^{-1}(s) = \frac{s}{\|h\|}$, and the canonic parametric representation $x_A : [0, \|h\|] \to C$ is given by

$$x_A(s) = (\varphi \circ s_A^{-1})(s) = x^0 + \frac{1}{\|h\|} \cdot s \cdot h.$$

Using the representation ψ, the arc length BB′ is given by

$$s_B(t) = \int_1^t \|\psi'(\tau)\|\,d\tau = \int_1^t \sqrt{h_1^2 + h_2^2 + h_3^2}\,d\tau = (t - 1) \cdot \|h\|$$

with $s_B(1) = 0$ and $s_B(2) = \|h\|$. The function $s_B^{-1} : [0, \|h\|] \to [1, 2]$ is given by $s_B^{-1}(s) = 1 + \frac{s}{\|h\|}$, and the canonic parametric representation $x_B : [0, \|h\|] \to C$ is

$$x_B(s) = (\psi \circ s_B^{-1})(s) = x^0 + \left(2 - 1 - \frac{s}{\|h\|}\right) \cdot h = x^0 + \left(1 - \frac{s}{\|h\|}\right) \cdot h$$

The representation $x_A(s) = x^0 + \frac{1}{\|h\|} \cdot s \cdot h$ is canonic in the orientation from A to B, and the representation $x_B(s) = x^0 + \left(1 - \frac{s}{\|h\|}\right) \cdot h$ is canonic in the orientation from B to A. In both cases, the parameter s moves from 0 to l.

Now consider an elementary closed curve C. In this case there are no two end points A and B, so we cannot speak about the orientation from A to B or from B to A. In what follows, we show how to introduce two orientations in the case of an elementary closed curve C.

For the elementary closed curve C let us consider a parametric representation $\varphi : [a, b] \to C$ and the point A = ϕ(a) ∈ C.

Since $\varphi'(t) \neq 0$, if t moves from a to b then ϕ(t) describes the curve, moving in a unique way. This movement of ϕ(t) is one orientation of the curve. The opposite movement on C is the opposite orientation. If $\psi : [c, d] \to C$ is an arbitrary representation of C, then when τ increases on [c, d], ψ(τ) moves on C according to one of the orientations. Write $t = (\varphi^{-1} \circ \psi)(\tau)$ for the change of parameter, so that ϕ(t) = ψ(τ):

if $\frac{d}{d\tau}\left( (\varphi^{-1} \circ \psi)(\tau) \right) = \frac{dt}{d\tau} > 0$ then ϕ(t) and ψ(τ) move in the same way when t moves from a to b and τ moves from c to d;

if $\frac{d}{d\tau}\left( (\varphi^{-1} \circ \psi)(\tau) \right) = \frac{dt}{d\tau} < 0$ then ϕ(t) and ψ(τ) move in opposite senses as t moves from a to b and τ moves from c to d.

As in the case of an elementary curve C we can consider the function $s_A : [a, b] \to [0, l]$ defined by

$$s_A(t) = \int_a^t \|\varphi'(\tau)\|\,d\tau$$

If A′ = ϕ(t) then $s_A(t)$ is the length of the arc AA′ described by ϕ(τ) when τ moves from a to t.

The function $s_A : [a, b] \to [0, l]$ is a bijection and can be used to define a new representation of C: $x_A : [0, l] \to C$, $x_A = \varphi \circ s_A^{-1}$. In this representation, s ∈ [0, l] is the parameter and $x_A(0) = x_A(l) = A$. When s moves from 0 to l, $x_A(s)$ moves on C and describes the curve C moving in the same sense as ϕ(t). The function $\tilde{x}_A : [0, l] \to C$ defined by $\tilde{x}_A(s) = x_A(l - s)$ is the representation of C which corresponds to the opposite orientation.

For instance, in the case of the circle

$$C = \{(x_1, x_2, x_3) \mid x_1^2 + x_2^2 = 1,\ x_3 = 0\}$$

considering the parametric representation $\varphi(t) = (\cos t, \sin t, 0)$ with $t \in [0, 2\pi]$ we have: $\varphi(0) = (1, 0, 0) = A$, $s_A(t) = \int_0^t d\tau = t$; $x_A(s) = (\cos s, \sin s, 0)$, $s \in [0, 2\pi]$; $x_A(0) = (1, 0, 0) = A$; $x_A(s)$ moves on the circle as in Fig. 70.4, and $\tilde{x}_A(s) = (\cos(2\pi - s), \sin(2\pi - s), 0)$ moves on the circle as in Fig. 70.5.

Figure 70.4:

Figure 70.5:

It follows that an elementary curve and an elementary closed curve can be represented as:

$$X(s) = (X_1(s), X_2(s), X_3(s)), \quad 0 \le s \le l$$

where l is the curve length and s ∈ [0, l] is the length of the arc $X(0)X(s) \subset C$.

For an elementary curve C, there exist two representations of this kind corresponding to the two orientations of C. For an elementary closed curve C, if we fix a point A on C, then we also have two representations of this kind corresponding to the two orientations of C.

71 Line integral of first type

Consider now an elementary curve C and one of its parametric representations

$$x(s) = (x_1(s), x_2(s), x_3(s)), \quad 0 \le s \le l$$

in terms of the arc length s. To fix ideas, assume that x(s) moves from A to B when s moves from 0 to l.

Let now $f(x_1, x_2, x_3)$ be a given function which is defined (at least) at each point of C and is a continuous function of s, i.e. $s \mapsto f(x_1(s), x_2(s), x_3(s))$ is continuous. We subdivide C into n portions in an arbitrary manner:


Figure 71.1:

Let $P_0 (= A), P_1, P_2, \dots, P_{n-1}, P_n (= B)$ be the end points of these portions and let

$$s_0 (= 0) < s_1 < s_2 < \cdots < s_n (= l)$$

be the lengths of the arcs $AP_i$: $s_i = \mathrm{length}(AP_i)$. Then we choose an arbitrary point on each portion, say, a point $Q_1$ between $P_0$ and $P_1$, a point $Q_2$ between $P_1$ and $P_2$, etc. Taking the values of f at these points $Q_1, Q_2, \dots, Q_n$ we form the sum

$$I_n = \sum_{m=1}^{n} f(Q_m)(s_m - s_{m-1})$$

We do this for n = 2, 3, ... in a completely independent manner, but so that the greatest $\Delta s_m = s_m - s_{m-1}$ approaches zero as n approaches infinity. We obtain a sequence of real numbers $I_2, I_3, \dots$. The limit of this sequence is called the line integral of first type of f along C from A to B and is denoted by

$$\int_C f\,ds$$

The curve C is called the path of integration. Since, by assumption, f is continuous and C is smooth, that limit exists and is independent of the orientation and of the choice of subdivisions and points $Q_m$. In fact, the position of a point P on C is determined by the corresponding value of the arc length s; since A and B correspond to s = 0 and s = l respectively, we have

$$\int_C f\,ds = \int_0^l f(x_1(s), x_2(s), x_3(s))\,ds$$

It is easy to see that if C is represented by the continuously differentiable vector function $\varphi : [a, b] \to \mathbb{R}^3$, $\varphi = \varphi(t)$, then

$$\int_C f\,ds = \int_a^b f(\varphi_1(t), \varphi_2(t), \varphi_3(t)) \cdot \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}\,dt$$

Hence the line integral of first type is equal to a definite integral, and the familiar properties of ordinary definite integrals are equally valid for line integrals.

Proposition 71.1.

a) $\int_C k \cdot f\,ds = k \int_C f\,ds$, k = constant

b) $\int_C (f + g)\,ds = \int_C f\,ds + \int_C g\,ds$

c) $\int_C f\,ds = \int_{C_1} f\,ds + \int_{C_2} f\,ds$, where the path C is subdivided into two disjoint arcs $C_1$ and $C_2$

Figure 71.2:

Remark 71.1. If C is an elementary closed curve, the line integral of first type is defined similarly.

For a line integral over a closed path C, the symbol $\oint_C$ (instead of $\int_C$) is sometimes used in the literature.

Example 71.1. Evaluate the following line integrals:

1) $\int_C xy\,ds$, where C is the segment which joins the points A(0, 0) and B(1, 1);

2) $\oint_C (x + y)\,ds$, where C is the closed curve given by the parametric representation x = cos t, y = sin t, t ∈ [0, 2π].
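The two integrals of Example 71.1 can be checked numerically from the formula that expresses a line integral of first type as a definite integral of f(ϕ(t))·‖ϕ′(t)‖. The sketch below is illustrative only: the parametrizations and their derivatives are supplied by hand, the midpoint rule is an assumed choice of quadrature, and `line_integral_ds` is a hypothetical name.

```python
import math


def line_integral_ds(f, phi, dphi, a, b, n=2000):
    """Midpoint-rule approximation of the integral of f(phi(t)) * ||phi'(t)||."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        speed = math.sqrt(sum(c * c for c in dphi(t)))
        total += f(*phi(t)) * speed * h
    return total


# 1) f = x*y on the segment from (0, 0) to (1, 1), phi(t) = (t, t), t in [0, 1]
seg = line_integral_ds(lambda x, y: x * y,
                       lambda t: (t, t),
                       lambda t: (1.0, 1.0),
                       0.0, 1.0)

# 2) f = x + y on the unit circle, phi(t) = (cos t, sin t), t in [0, 2*pi]
circ = line_integral_ds(lambda x, y: x + y,
                        lambda t: (math.cos(t), math.sin(t)),
                        lambda t: (-math.sin(t), math.cos(t)),
                        0.0, 2 * math.pi)

print(seg, math.sqrt(2) / 3)
print(circ)
```

The first value can be compared with the hand computation $\int_0^1 t^2 \sqrt{2}\,dt = \sqrt{2}/3$; the second integral vanishes by symmetry.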

72 Line integrals of second type

In many applications the integrands appearing in the line integrals of first type are of the form:

$$f \cdot \frac{dx_1}{ds} \quad \text{or} \quad g \cdot \frac{dx_2}{ds} \quad \text{or} \quad h \cdot \frac{dx_3}{ds}$$

where $\frac{dx_1}{ds}$, $\frac{dx_2}{ds}$, $\frac{dx_3}{ds}$ are the derivatives of the functions occurring in the parametric representation of the path of integration.

The integrals $\int_C f \cdot \frac{dx_1}{ds}\,ds$, $\int_C g \cdot \frac{dx_2}{ds}\,ds$, $\int_C h \cdot \frac{dx_3}{ds}\,ds$ are called line integrals of second type. Their values depend on the orientation of C; when the orientation of C is changed, the integrals are multiplied by −1.

We simply denote these integrals by:

$$\int_C f \cdot \frac{dx_1}{ds}\,ds = \int_C f\,dx_1$$

$$\int_C g \cdot \frac{dx_2}{ds}\,ds = \int_C g\,dx_2$$

$$\int_C h \cdot \frac{dx_3}{ds}\,ds = \int_C h\,dx_3$$

In terms of the considered parametric representation, these line integrals of second type are equal to the following Riemann integrals:

$$\int_C f\,dx_1 = \int_0^l f(x_1(s), x_2(s), x_3(s)) \frac{dx_1}{ds}\,ds = \int_0^l f(x_1(s), x_2(s), x_3(s)) \cdot \cos\alpha(s)\,ds$$

$$\int_C g\,dx_2 = \int_0^l g(x_1(s), x_2(s), x_3(s)) \frac{dx_2}{ds}\,ds = \int_0^l g(x_1(s), x_2(s), x_3(s)) \cdot \cos\beta(s)\,ds$$

$$\int_C h\,dx_3 = \int_0^l h(x_1(s), x_2(s), x_3(s)) \frac{dx_3}{ds}\,ds = \int_0^l h(x_1(s), x_2(s), x_3(s)) \cdot \cos\gamma(s)\,ds$$

where α(s), β(s), γ(s) are the angles between the tangent to the curve and the coordinate axes $Ox_1$, $Ox_2$, $Ox_3$, respectively.

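In terms of an arbitrary parametric representation $\varphi : [a, b] \to C$ which corresponds to the same orientation, we have $\cos\alpha = \dot\varphi_1 / \|\dot\varphi\|$ (and similarly for β, γ) and $ds = \|\dot\varphi(t)\|\,dt$, where $\|\dot\varphi(t)\| = \sqrt{\dot\varphi_1^2(t) + \dot\varphi_2^2(t) + \dot\varphi_3^2(t)}$. Hence these line integrals of second type are equal to the following Riemann integrals:

$$\int_C f\,dx_1 = \int_a^b f(\varphi(t)) \cdot \frac{\dot\varphi_1(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\,dt = \int_a^b f(\varphi(t)) \cdot \dot\varphi_1(t)\,dt$$

$$\int_C g\,dx_2 = \int_a^b g(\varphi(t)) \cdot \frac{\dot\varphi_2(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\,dt = \int_a^b g(\varphi(t)) \cdot \dot\varphi_2(t)\,dt$$

$$\int_C h\,dx_3 = \int_a^b h(\varphi(t)) \cdot \frac{\dot\varphi_3(t)}{\|\dot\varphi(t)\|} \cdot \|\dot\varphi(t)\|\,dt = \int_a^b h(\varphi(t)) \cdot \dot\varphi_3(t)\,dt$$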

All these integrals depend on the orientation of the curve C. If the orientation changes, the value of the integral changes its sign.

For the sums of these types of integrals along the same path C we adopt the simplified notation

$$\int_C f\,dx_1 + g\,dx_2 + h\,dx_3$$

which is equal to the Riemann integral

$$\int_0^l \left[ f(x_1(s), x_2(s), x_3(s)) \cdot \cos\alpha(s) + g(x_1(s), x_2(s), x_3(s)) \cdot \cos\beta(s) + h(x_1(s), x_2(s), x_3(s)) \cdot \cos\gamma(s) \right] ds$$

Example 72.1. Evaluate the line integral

$$I = \int_C \left[ x_1^2 x_2\,dx_1 + (x_1 - x_3)\,dx_2 + x_1 x_2 x_3\,dx_3 \right]$$

where C is the arc of the parabola $x_2 = x_1^2$ in the plane $x_3 = 2$ from A(0, 0, 2) to B(1, 1, 2).

Example 72.2. Evaluate the above line integral where C is the segment of the straight line $x_2 = x_1$, $x_3 = 2$ from A(0, 0, 2) to B(1, 1, 2).
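A numerical check of Examples 72.1 and 72.2 can be sketched as follows. It is an illustration under stated assumptions, not the notes' own method: it uses the substitution $dx_i = \dot\varphi_i(t)\,dt$ for a parametrization ϕ that follows the prescribed orientation, hand-written derivatives, and the midpoint rule; `line_integral_second_type` is a hypothetical name.

```python
def line_integral_second_type(F, phi, dphi, a, b, n=2000):
    """Approximates the integral over C of f dx1 + g dx2 + h dx3, F = (f, g, h),
    as the integral over [a, b] of F(phi(t)) . phi'(t) dt (midpoint rule)."""
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        values = F(*phi(t))
        total += sum(v * d for v, d in zip(values, dphi(t))) * step
    return total


def F(x1, x2, x3):
    return (x1 ** 2 * x2, x1 - x3, x1 * x2 * x3)


# Example 72.1: parabola x2 = x1^2, x3 = 2, from A(0,0,2) to B(1,1,2)
parabola = line_integral_second_type(
    F, lambda t: (t, t * t, 2.0), lambda t: (1.0, 2 * t, 0.0), 0.0, 1.0)

# Example 72.2: segment x2 = x1, x3 = 2, from A to B
segment = line_integral_second_type(
    F, lambda t: (t, t, 2.0), lambda t: (1.0, 1.0, 0.0), 0.0, 1.0)

# Same segment traversed from B to A: the sign of the integral changes.
reverse = line_integral_second_type(
    F, lambda t: (1 - t, 1 - t, 2.0), lambda t: (-1.0, -1.0, 0.0), 0.0, 1.0)

print(parabola, segment, reverse)
```

The values can be compared with the hand computations $\int_0^1 (t^4 + 2t^2 - 4t)\,dt = -17/15$ and $\int_0^1 (t^3 + t - 2)\,dt = -5/4$. The two paths join the same end points but give different values, and reversing the orientation flips the sign.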

73 Transformation of double integrals into line integrals

Double integrals over a plane region may be transformed into line integrals over the boundary of the region and conversely. This transformation is of practical as well as theoretical interest and can be done by means of the following basic theorem.

Theorem 73.1 (Green's theorem in the plane). Let R be a closed bounded region in the x, y-plane whose boundary C consists of finitely many elementary closed curves. Let f(x, y) and g(x, y) be functions which are continuous and have continuous partial derivatives of first order everywhere in some domain containing R. Then the following equality holds:

$$\iint_R \left( \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right) dx\,dy = \oint_C f\,dx + g\,dy = \oint_C [f \cos\alpha + g \cos\beta]\,ds$$

The integration is taken along the entire boundary C of R such that R is on the left as one moves on C.

Figure 73.1:


Proof. We first prove the theorem for a special region R which can be represented in both of the forms:

$$R = \{(x, y) \mid a \le x \le b,\ u(x) \le y \le v(x)\}$$

and

$$R = \{(x, y) \mid c \le y \le d,\ p(y) \le x \le q(y)\}$$

Figure 73.2:

Figure 73.3:

In this case we have

$$\iint_R \frac{\partial f}{\partial y}\,dx\,dy = \int_a^b \left( \int_{u(x)}^{v(x)} \frac{\partial f}{\partial y}\,dy \right) dx = \int_a^b [f(x, v(x)) - f(x, u(x))]\,dx$$

$$= -\int_a^b f(x, u(x))\,dx - \int_b^a f(x, v(x))\,dx = -\int_{C^*} f(x, y)\,dx - \int_{C^{**}} f(x, y)\,dx$$

$$= -\int_C f(x, y)\,dx$$

since y = u(x) represents the oriented curve $C^*$ and y = v(x) represents the oriented curve $C^{**}$.

If portions of C are segments parallel to the y-axis, such as $\tilde{C}$ and $\tilde{\tilde{C}}$,

Figure 73.4:

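then the result is the same as before, because the integrals over these portions are zero (dx = 0 along them) and may be added to the integrals over $C^*$ and $C^{**}$ to obtain the integral over the whole boundary C.

Similarly we obtain

$$\iint_R \frac{\partial g}{\partial x}\,dx\,dy = \int_c^d \left( \int_{p(y)}^{q(y)} \frac{\partial g}{\partial x}\,dx \right) dy = \int_C g(x, y)\,dy$$

Therefore

$$\iint_R \left[ \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y} \right] dx\,dy = \int_C f\,dx + g\,dy$$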

and the theorem is proved for special regions.

We now prove the theorem for a region R which itself is not a special region but can be subdivided into finitely many special regions. In this case we apply the theorem to each subregion and then add the results; the left-hand members add up to the integral over R, while the right-hand members add up to the line integral over C plus integrals over the curves introduced for subdividing R. Each of the latter integrals occurs twice, taken once in each direction. Hence these two integrals cancel each other, and we are left with the line integral over C.

Example 73.1. Using Green's theorem, evaluate the following integrals:

a) $\int_C y\,dx + 2x\,dy$, where C is the boundary of the square 0 ≤ x ≤ 1, 0 ≤ y ≤ 1 (counterclockwise);

b) $\int_C y^3\,dx + (x^3 + 3y^2 x)\,dy$, where C is the boundary of the region between $y = x^2$ and y = x, where 0 ≤ x ≤ 1 (counterclockwise);

c) $\int_C 2xy\,dx + (e^x + x^2)\,dy$, where C is the boundary of the triangle with vertices (0, 0), (1, 0), (1, 1) (clockwise);

d) $\int_C -xy^2\,dx + x^2 y\,dy$, where C is the boundary of the region in the first quadrant bounded by $y = 1 - x^2$ (counterclockwise).
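Green's theorem can be checked numerically on item a). The sketch below is illustrative only and rests on assumptions: the four edges of the unit square are parametrized by hand in counterclockwise order, the partial derivatives $\partial g/\partial x = 2$ and $\partial f/\partial y = 1$ are computed by hand, and both sides are approximated by the midpoint rule. The names `line_integral` and `double_integral` are hypothetical.

```python
def line_integral(f, g, phi, dphi, a, b, n=400):
    """Midpoint-rule approximation of the integral of f dx + g dy along phi."""
    step = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * step
        x, y = phi(t)
        dx, dy = dphi(t)
        total += (f(x, y) * dx + g(x, y) * dy) * step
    return total


def double_integral(F, n=200):
    """Midpoint-rule approximation of the double integral of F over [0,1]x[0,1]."""
    h = 1.0 / n
    return sum(F((i + 0.5) * h, (j + 0.5) * h)
               for i in range(n) for j in range(n)) * h * h


def f(x, y):
    return y


def g(x, y):
    return 2 * x


# Boundary of the unit square, counterclockwise, each edge with t in [0, 1].
edges = [
    (lambda t: (t, 0.0), lambda t: (1.0, 0.0)),         # bottom
    (lambda t: (1.0, t), lambda t: (0.0, 1.0)),         # right
    (lambda t: (1.0 - t, 1.0), lambda t: (-1.0, 0.0)),  # top
    (lambda t: (0.0, 1.0 - t), lambda t: (0.0, -1.0)),  # left
]
boundary = sum(line_integral(f, g, phi, dphi, 0.0, 1.0) for phi, dphi in edges)

# dg/dx - df/dy = 2 - 1, computed by hand for this f and g.
interior = double_integral(lambda x, y: 2.0 - 1.0)

print(boundary, interior)
```

Both sides agree, as the theorem predicts; here the common value equals the area of the square because the integrand $\partial g/\partial x - \partial f/\partial y$ is identically 1.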

Now we present a second theorem of Green, which concerns the transformation of a double integral of the Laplacian of a function into a line integral of its normal derivative.

Let w(x, y) be a function which has continuous second order partial derivatives in a domain D of the x, y-plane.

Definition 73.1. The Laplacian of w is by definition the function $\Delta w : D \to \mathbb{R}^1$ defined by

$$\Delta w = \frac{\partial^2 w}{\partial x^2} + \frac{\partial^2 w}{\partial y^2}$$

Assume now that D contains a region R (R ⊂ D) of the type indicated in Green’s theorem.

Theorem 73.2. The following equality holds:

$$\iint_R \Delta w\,dx\,dy = \oint_C \frac{\partial w}{\partial n}\,ds = \oint_C \nabla_n w\,ds = \oint_C \mathrm{grad}\,w \cdot n\,ds$$

where n is the outward unit normal vector to C.

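Proof. Consider $f = -\frac{\partial w}{\partial y}$ and $g = \frac{\partial w}{\partial x}$ and remark that $\Delta w = \frac{\partial g}{\partial x} - \frac{\partial f}{\partial y}$. Applying Green's theorem in the plane, we obtain

$$\iint_R \Delta w\,dx\,dy = \oint_C -\frac{\partial w}{\partial y}\,dx + \frac{\partial w}{\partial x}\,dy = \oint_C \left( -\frac{\partial w}{\partial y} \cdot \frac{dx}{ds} + \frac{\partial w}{\partial x} \cdot \frac{dy}{ds} \right) ds = \oint_C \mathrm{grad}\,w \cdot n\,ds$$

The integrand of the last integral may be written as the dot product of the vectors

$$\mathrm{grad}\,w = \left( \frac{\partial w}{\partial x}, \frac{\partial w}{\partial y} \right) \quad \text{and} \quad n = \left( \frac{dy}{ds}, -\frac{dx}{ds} \right)$$

that is

$$n \cdot \mathrm{grad}\,w = \frac{\partial w}{\partial x} \cdot \frac{dy}{ds} - \frac{\partial w}{\partial y} \cdot \frac{dx}{ds}$$

The vector n is the outward unit normal vector to C. Indeed, the vector $\tau = \left( \frac{dx}{ds}, \frac{dy}{ds} \right)$ is the unit tangent vector to C, so $\|n\| = \|\tau\| = 1$ and $\tau \cdot n = 0$; moreover n points to the right of τ, that is, away from R, since R lies on the left as one moves on C.

The dot product $n \cdot \mathrm{grad}\,w$ is the directional derivative $\frac{\partial w}{\partial n} = \nabla_n w$. Therefore we have

$$\iint_R \Delta w\,dx\,dy = \oint_C \frac{\partial w}{\partial n}\,ds = \oint_C \nabla_n w\,ds$$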

Let v(x, y) = (f(x, y), g(x, y)) be a vector-valued function which has continuous first order partial derivatives in a domain D of the x, y-plane.

Definition 73.2. The divergence of v is by definition the real-valued function $\mathrm{div}\,v : D \to \mathbb{R}^1$ defined by

$$\mathrm{div}\,v = \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y}$$


Theorem 73.3. If D contains a region R (R ⊂ D) of the type indicated in Green's theorem, then the following equality holds:

$$\iint_R \mathrm{div}\,v\,dx\,dy = \oint_C v \cdot n\,ds$$

where n is the outward unit normal vector to C.

Proof. Applying Theorem 73.1 we obtain

$$\iint_R \mathrm{div}\,v\,dx\,dy = \iint_R \left( \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} \right) dx\,dy = \oint_C -g\,dx + f\,dy = \oint_C v \cdot n\,ds$$

Example 73.2. Verify this formula when v = (x, y) and C is the circle $x^2 + y^2 = 1$.

74 Elementary Surfaces

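We shall consider surface integrals. These considerations require knowledge of some basic facts about surfaces, which we shall now explain and illustrate by simple examples.

Definition 74.1. An elementary surface is a set of points $S \subset \mathbb{R}^3$ for which there exists a bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the following properties:

a) ϕ is bijective;

b) ϕ is of class $C^1$ and the vector

$$N_\varphi = \frac{\partial\varphi}{\partial u} \times \frac{\partial\varphi}{\partial v} = \left( \frac{\partial\varphi_2}{\partial u} \frac{\partial\varphi_3}{\partial v} - \frac{\partial\varphi_3}{\partial u} \frac{\partial\varphi_2}{\partial v},\ \frac{\partial\varphi_3}{\partial u} \frac{\partial\varphi_1}{\partial v} - \frac{\partial\varphi_1}{\partial u} \frac{\partial\varphi_3}{\partial v},\ \frac{\partial\varphi_1}{\partial u} \frac{\partial\varphi_2}{\partial v} - \frac{\partial\varphi_2}{\partial u} \frac{\partial\varphi_1}{\partial v} \right)$$

is different from 0 for any (u, v) ∈ D.

The function ϕ is called a parametric representation of S. The vectors $\frac{\partial\varphi}{\partial u}$, $\frac{\partial\varphi}{\partial v}$ are tangent to the surface S at the point ϕ(u, v).

The vector $N_\varphi$ is called the normal vector to the surface S at the point ϕ(u, v) and the vector $n_\varphi = \frac{N_\varphi}{\|N_\varphi\|}$ is the unit normal vector to the surface S at the point ϕ(u, v).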

Example 74.1. The set $S = \{(x_1, x_2, 1) \in \mathbb{R}^3 \mid 0 < x_1 < 1,\ 0 < x_2 < 1\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:

$$D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < 1,\ 0 < v < 1\} \quad \text{and} \quad \varphi(u, v) = (u, v, 1)$$

A normal vector to S at the point ϕ(u, v) is $N_\varphi = (0, 0, 1)$, which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (1, 0, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 1, 0)$ are tangent to S at the point ϕ(u, v).

Example 74.2. The set $S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_1 > 0,\ x_2 > 0,\ 0 < x_3 < 1\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:

$$D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < 1\} \quad \text{and} \quad \varphi(u, v) = (\cos u, \sin u, v)$$

A normal vector to S at the point ϕ(u, v) is $N_\varphi = (\cos u, \sin u, 0)$, which is actually a unit normal vector. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u, \cos u, 0)$ and $\frac{\partial\varphi}{\partial v} = (0, 0, 1)$ are tangent to S at the point ϕ(u, v).

Example 74.3. The set $S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 + x_3^2 = 1,\ x_1 > 0,\ x_2 > 0,\ x_3 > 0\}$ is an elementary surface. A bounded, open and connected set $D \subset \mathbb{R}^2$ and a function $\varphi : D \to S$ having the properties a) and b) are:

$$D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < \tfrac{\pi}{2}\} \quad \text{and} \quad \varphi(u, v) = (\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)$$

The vector $N_\varphi = -\sin v \cdot (\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)$ is a normal vector to S and $n_\varphi = -(\cos u \cdot \sin v, \sin u \cdot \sin v, \cos v)$ is a unit normal vector to S. The vectors $\frac{\partial\varphi}{\partial u} = (-\sin u \sin v, \cos u \sin v, 0)$ and $\frac{\partial\varphi}{\partial v} = (\cos u \cos v, \sin u \cos v, -\sin v)$ are tangent to S at the point ϕ(u, v).

Remark 74.1. An elementary surface possesses infinitely many parametric representations.

The direction of $N_\varphi$ is independent of the parametric representation, but the orientation of the normal vector $N_\varphi$ depends on the parametric representation of S. If instead of the parametric representation x = ϕ(u, v), (u, v) ∈ D we consider the parametric representation x = ψ(u′, v′), where ψ(u′, v′) = ϕ(−u′, v′), (u′, v′) ∈ D′ and $T : D \to D'$ is defined by T(u, v) = (−u, v), then the orientation of $N_\varphi$ changes: $N_\varphi = -N_\psi$.

In general, if x = ϕ(u, v), (u, v) ∈ D and x = ψ(u′, v′), (u′, v′) ∈ D′ are two parametric representations of S, then:

- if the determinant of the Jacobi matrix of $T = \psi^{-1} \circ \varphi$, $T : D \to D'$, T(u, v) = (u′, v′) is positive, then the orientation of the normal vector to S does not change;

- if the determinant of the Jacobi matrix of $T = \psi^{-1} \circ \varphi$, $T : D \to D'$, T(u, v) = (u′, v′) is negative, then the orientation of the normal vector to S changes.

Remark 74.2. If an elementary surface S is given by the parametric representation $\varphi : D \subset \mathbb{R}^2 \to S$, x = ϕ(u, v), and C is an elementary curve on S (C ⊂ S), then $C' = \varphi^{-1}(C)$ is an elementary curve in D. If u = g(t) and v = h(t), t ∈ [a, b] is a parametric representation of C′, then a parametric representation of C is obtained as x = ϕ(g(t), h(t)), t ∈ [a, b].

Example 74.4. Consider the elementary surface

$$S = \{(x_1, x_2, 1) \in \mathbb{R}^3 \mid 0 < x_1 < 1,\ 0 < x_2 < 1\}$$

with the parametric representation ϕ(u, v) = (u, v, 1), (u, v) ∈ (0, 1) × (0, 1), and the elementary curve C on S (C ⊂ S) defined by

$$C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1 = x_2,\ x_3 = 1,\ x_1 \in [\tfrac{1}{3}, \tfrac{2}{3}]\}$$

The curve $C' = \varphi^{-1}(C)$, C′ ⊂ D has the parametric representation u = t, v = t, $t \in [\tfrac{1}{3}, \tfrac{2}{3}]$. The parametric representation x = ϕ(g(t), h(t)) of C in this case is given by x(t) = (t, t, 1).

Example 74.5. Consider the elementary surface

$$S = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_1 > 0,\ x_2 > 0,\ 0 < x_3 < 1\}$$

with the parametric representation ϕ(u, v) = (cos u, sin u, v), $(u, v) \in D = \{(u, v) \in \mathbb{R}^2 \mid 0 < u < \tfrac{\pi}{2},\ 0 < v < 1\}$, and the elementary curve on S defined by

$$C = \{(x_1, x_2, x_3) \in \mathbb{R}^3 \mid x_1^2 + x_2^2 = 1,\ x_3 = \tfrac{1}{2},\ \tfrac{1}{2} \le x_1 \le \tfrac{\sqrt{3}}{2},\ \tfrac{1}{2} \le x_2 \le \tfrac{\sqrt{3}}{2}\}$$

The curve $C' = \varphi^{-1}(C)$, C′ ⊂ D has the parametric representation u = t, $v = \tfrac{1}{2}$, $t \in [\tfrac{\pi}{6}, \tfrac{\pi}{3}]$. The parametric representation x = ϕ(g(t), h(t)) of the curve C in this case is $x(t) = (\cos t, \sin t, \tfrac{1}{2})$.

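Remark 74.3. Using the parametric representation x = ϕ(u(t), v(t)) of the elementary curve C ⊂ S, we obtain that the tangent vector to C at ϕ(u(t), v(t)) is given by:

$$\frac{dx}{dt} = \left( \frac{\partial\varphi_1}{\partial u} \cdot \frac{du}{dt} + \frac{\partial\varphi_1}{\partial v} \cdot \frac{dv}{dt},\ \frac{\partial\varphi_2}{\partial u} \cdot \frac{du}{dt} + \frac{\partial\varphi_2}{\partial v} \cdot \frac{dv}{dt},\ \frac{\partial\varphi_3}{\partial u} \cdot \frac{du}{dt} + \frac{\partial\varphi_3}{\partial v} \cdot \frac{dv}{dt} \right)$$

The vectors

$$\frac{\partial\varphi}{\partial u} = \left( \frac{\partial\varphi_1}{\partial u}, \frac{\partial\varphi_2}{\partial u}, \frac{\partial\varphi_3}{\partial u} \right) \quad \text{and} \quad \frac{\partial\varphi}{\partial v} = \left( \frac{\partial\varphi_1}{\partial v}, \frac{\partial\varphi_2}{\partial v}, \frac{\partial\varphi_3}{\partial v} \right)$$

are tangent vectors to S at ϕ(u, v) and $\frac{dx}{dt} = \frac{\partial\varphi}{\partial u} \cdot \frac{du}{dt} + \frac{\partial\varphi}{\partial v} \cdot \frac{dv}{dt}$.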

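If the elementary surface S is given by the parametric representation x = ϕ(u, v), (u, v) ∈ D and the elementary curve C on S (C ⊂ S) is given by the parametric representation u = u(t), v = v(t), t ∈ [a, b], then the length l of the curve C is given by:

$$l = \int_a^b \sqrt{E \cdot \dot u^2 + 2F \cdot \dot u \cdot \dot v + G \cdot \dot v^2}\,dt$$

where:

$$E = \left\| \frac{\partial\varphi}{\partial u}(u(t), v(t)) \right\|^2, \quad F = \frac{\partial\varphi}{\partial u}(u(t), v(t)) \cdot \frac{\partial\varphi}{\partial v}(u(t), v(t)), \quad G = \left\| \frac{\partial\varphi}{\partial v}(u(t), v(t)) \right\|^2, \quad \dot u = \frac{du}{dt}, \quad \dot v = \frac{dv}{dt}$$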
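The length formula in terms of E, F, G can be sketched in code. This is an illustration under assumptions, not the notes' own computation: the surface is the cylinder piece ϕ(u, v) = (cos u, sin u, v) of Example 74.2 with hand-written partial derivatives, the curve u = t, v = t/2, t ∈ [0.2, 1] is a hypothetical choice lying inside D, and the integral is approximated by the midpoint rule. The function names are hypothetical.

```python
import math


def first_fundamental_form(phi_u, phi_v, u, v):
    """Coefficients E, F, G at (u, v) from the tangent vectors phi_u, phi_v."""
    a, b = phi_u(u, v), phi_v(u, v)
    E = sum(x * x for x in a)
    F = sum(x * y for x, y in zip(a, b))
    G = sum(y * y for y in b)
    return E, F, G


def length_on_surface(phi_u, phi_v, curve, dcurve, a, b, n=1000):
    """Midpoint rule for the integral of sqrt(E u'^2 + 2 F u' v' + G v'^2)."""
    h = (b - a) / n
    total = 0.0
    for i in range(n):
        t = a + (i + 0.5) * h
        u, v = curve(t)
        du, dv = dcurve(t)
        E, F, G = first_fundamental_form(phi_u, phi_v, u, v)
        total += math.sqrt(E * du * du + 2 * F * du * dv + G * dv * dv) * h
    return total


def phi_u(u, v):
    """Partial derivative with respect to u of phi(u, v) = (cos u, sin u, v)."""
    return (-math.sin(u), math.cos(u), 0.0)


def phi_v(u, v):
    """Partial derivative with respect to v of phi(u, v) = (cos u, sin u, v)."""
    return (0.0, 0.0, 1.0)


# Hypothetical curve in D: u = t, v = t / 2, t in [0.2, 1.0]
length = length_on_surface(phi_u, phi_v,
                           lambda t: (t, t / 2), lambda t: (1.0, 0.5),
                           0.2, 1.0)
print(length, 0.8 * math.sqrt(1.25))
```

On the cylinder E = G = 1 and F = 0, so the integrand is the constant $\sqrt{1 + 1/4}$ and the length is $0.8\sqrt{1.25}$, which the sketch reproduces.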

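The expression $E \cdot \dot u^2 + 2F \cdot \dot u \cdot \dot v + G \cdot \dot v^2$ is called the first fundamental form of S. It is of basic importance because it enables us to measure lengths, angles between curves and areas on the corresponding surface. In fact, we have already seen how to compute the length of a curve. Now we consider the measurement of the angle between the curves

$$C_1 : x = \varphi(g(t), h(t)) \qquad C_2 : x = \varphi(p(t), q(t))$$

Let P ∈ S, $P = \varphi(g(t_0), h(t_0)) = \varphi(p(t_0), q(t_0))$ be a point of intersection of the two curves.

The tangent vector at the point P to the curve $C_1$ is $T_1$ and the tangent vector at the point P to the curve $C_2$ is $T_2$:

$$T_1 = \left( \frac{\partial\varphi_1}{\partial u} \dot g + \frac{\partial\varphi_1}{\partial v} \dot h,\ \frac{\partial\varphi_2}{\partial u} \dot g + \frac{\partial\varphi_2}{\partial v} \dot h,\ \frac{\partial\varphi_3}{\partial u} \dot g + \frac{\partial\varphi_3}{\partial v} \dot h \right)$$

$$T_2 = \left( \frac{\partial\varphi_1}{\partial u} \dot p + \frac{\partial\varphi_1}{\partial v} \dot q,\ \frac{\partial\varphi_2}{\partial u} \dot p + \frac{\partial\varphi_2}{\partial v} \dot q,\ \frac{\partial\varphi_3}{\partial u} \dot p + \frac{\partial\varphi_3}{\partial v} \dot q \right)$$

The angle at the point of intersection P between $C_1$ and $C_2$ is defined as the angle γ between $T_1$ and $T_2$ at P and

$$\cos\gamma = \frac{T_1 \cdot T_2}{\|T_1\| \cdot \|T_2\|}$$

Since:

$$T_1 \cdot T_2 = E \cdot \dot g \cdot \dot p + F (\dot g \cdot \dot q + \dot h \cdot \dot p) + G \cdot \dot h \cdot \dot q$$

$$\|T_1\| = \sqrt{E \cdot \dot g^2 + 2F \cdot \dot g \cdot \dot h + G \cdot \dot h^2}$$

$$\|T_2\| = \sqrt{E \cdot \dot p^2 + 2F \cdot \dot p \cdot \dot q + G \cdot \dot q^2}$$

we obtain that the angle between two intersecting curves on a surface can be expressed in terms of E, F, G and the derivatives of the functions representing the curves, evaluated at the point of intersection.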

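We now show how to compute areas on a surface $S$.

The area $A'$ of a part $S'$ of the elementary surface $S$, represented by $x = \varphi(u, v)$, $(u, v) \in D$, is given by:

$$A' = \iint_{D'} \sqrt{EG - F^2}\, du\, dv$$

where $D'$ is the part of $D$ corresponding to $S'$. This formula can be made plausible by noting that:

$$\Delta A = \sqrt{EG - F^2} \cdot \Delta u \cdot \Delta v$$

is the area of a small parallelogram whose sides are the vectors

$$\frac{\partial x}{\partial u} \Delta u \quad \text{and} \quad \frac{\partial x}{\partial v} \Delta v$$

From the definition of the vector product it follows that:

$$\Delta A = \left| \frac{\partial x}{\partial u} \Delta u \times \frac{\partial x}{\partial v} \Delta v \right| = \left| \frac{\partial x}{\partial u} \times \frac{\partial x}{\partial v} \right| \cdot \Delta u \cdot \Delta v = \sqrt{EG - F^2} \cdot \Delta u \cdot \Delta v$$

The last equality uses the identity $|a \times b|^2 = \|a\|^2 \|b\|^2 - (a \cdot b)^2$.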
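The area formula can be checked numerically. The following sketch is an added illustration under stated assumptions: the surface is a torus with radii `R_MAJOR` and `R_MINOR` (hypothetical values), the partial derivatives are written out by hand, and the double integral is approximated by the midpoint rule on an $n \times n$ grid.

```python
import math

R_MAJOR, R_MINOR = 2.0, 0.5  # assumed torus radii for the illustration

def partials(u, v):
    """Partial derivatives of x(u, v) = ((R + r cos v) cos u, (R + r cos v) sin u, r sin v)."""
    w = R_MAJOR + R_MINOR * math.cos(v)
    xu = (-w * math.sin(u), w * math.cos(u), 0.0)
    xv = (-R_MINOR * math.sin(v) * math.cos(u),
          -R_MINOR * math.sin(v) * math.sin(u),
          R_MINOR * math.cos(v))
    return xu, xv

def area_element(u, v):
    """sqrt(E G - F^2) at (u, v)."""
    xu, xv = partials(u, v)
    E = sum(a * a for a in xu)
    F = sum(a * b for a, b in zip(xu, xv))
    G = sum(b * b for b in xv)
    return math.sqrt(max(E * G - F * F, 0.0))  # guard against tiny negative rounding

def surface_area(n=200):
    """Midpoint rule for the double integral over [0, 2*pi] x [0, 2*pi]."""
    h = 2 * math.pi / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            total += area_element((i + 0.5) * h, (j + 0.5) * h)
    return total * h * h

area = surface_area()
exact = 4 * math.pi ** 2 * R_MAJOR * R_MINOR  # classical torus area
print(area, exact)
```

The parametrization covers the torus except for a set of zero area, which does not affect the integral.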


The integral is obtained by subdividing $S'$ into parts $S_1, S_2, \dots, S_n$, approximating the area of each part $S_k$ by the area of a parallelogram in the tangent plane to $S$ at a point of $S_k$, and forming the sum of all the approximating areas. This is done for each $n = 1, 2, \dots$ so that the dimensions of the largest $S_k$ approach zero as $n \to \infty$. The limit of these sums is the integral.

In various applications, surface integrals occur for which the concept of orientation of a surface is essential.

In the case of an elementary surface, there exist two orientations of the unit normal vector $n$ (see Fig. 74.1), and to each of them we can associate one orientation of the elementary surface $S$ (as for elementary curves, where there are two ways to traverse the curve). The set of representations of the elementary surface $S$ is decomposed (according to these orientations) into two disjoint classes. For all representations belonging to one of these classes, we have one of the two orientations of $n$ (of $S$), and for all representations from the other class we have the opposite orientation of $n$ (of $S$).

Figure 74.1:

If a smooth surface $S$ is orientable, then we may orient $S$ by choosing one of the two possible directions of the unit normal vector $n$.

If the boundary of the elementary surface $S$ is a simple closed curve $C$, then we may associate with each of the two possible orientations of $S$ an orientation of $C$, as shown in Fig. 74.2.

Figure 74.2:

The rule is: looking at the curve $C$ from the tip of the unit normal vector $n$, the sense on the curve is always counterclockwise.

Using this idea, we may extend the concept of orientation to surfaces which can be decomposed into elementary surfaces, as follows: a surface $S$ which can be decomposed into elementary surfaces is said to be orientable if we can orient each elementary piece of $S$ in such a manner that along each curve $C^*$ which is a common boundary of two pieces $S_1$ and $S_2$, the positive direction of $C^*$ related to $S_1$ is opposite to the positive direction of $C^*$ related to $S_2$.

Figure 74.3:

However, orientability may fail in the large: there are non-orientable surfaces. An example of such a surface is the Möbius strip. A model of a Möbius strip can be made by taking a long rectangular piece of paper and sticking the shorter sides together so that the two points $A$ and the two points $B$ coincide (see Fig. 74.4).

Figure 74.4: Möbius strip

Definition 74.2. A surface $S$ is orientable if a chosen normal orientation given at an arbitrary point $P_0 \in S$ can be continued in a unique and continuous way to the entire surface $S$.

Hence, the surface $S$ is orientable if there does not exist a closed curve $C \subset S$ passing through $P_0$ such that the chosen normal orientation reverses by moving continuously along the curve $C$ from $P_0$ and back to $P_0$.
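The reversal of the normal along a closed curve can be observed numerically on the Möbius strip. The sketch below is an added illustration: it assumes the standard parametrization $x(u, v) = ((1 + v \cos\tfrac{u}{2}) \cos u,\ (1 + v \cos\tfrac{u}{2}) \sin u,\ v \sin\tfrac{u}{2})$, which does not appear in the notes, and follows the unit normal along the center curve $v = 0$ using central differences. The function names are hypothetical.

```python
import math

def mobius(u, v):
    """Assumed standard parametrization of a Moebius strip around the unit circle."""
    w = 1.0 + v * math.cos(u / 2)
    return (w * math.cos(u), w * math.sin(u), v * math.sin(u / 2))

def unit_normal(u, v, eps=1e-6):
    """Normalized cross product x_u x x_v, with central-difference partials."""
    xu = [(p - m) / (2 * eps) for p, m in zip(mobius(u + eps, v), mobius(u - eps, v))]
    xv = [(p - m) / (2 * eps) for p, m in zip(mobius(u, v + eps), mobius(u, v - eps))]
    n = (xu[1] * xv[2] - xu[2] * xv[1],
         xu[2] * xv[0] - xu[0] * xv[2],
         xu[0] * xv[1] - xu[1] * xv[0])
    length = math.sqrt(sum(c * c for c in n))
    return tuple(c / length for c in n)

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

# Follow the normal once around the closed center curve v = 0, u from 0 to 2*pi
steps = 100
normals = [unit_normal(2 * math.pi * k / steps, 0.0) for k in range(steps + 1)]

# Consecutive normals stay close, so the continuation is continuous ...
min_consecutive = min(dot(normals[k], normals[k + 1]) for k in range(steps))
# ... yet the normal returns to the starting point reversed
start_end = dot(normals[0], normals[-1])
start_point, end_point = mobius(0.0, 0.0), mobius(2 * math.pi, 0.0)
print(min_consecutive, start_end)
```

The start and end parameter values give the same point of the strip, while the continuously transported normal has changed sign, which is exactly the situation excluded by the definition of orientability.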

75 Surface integrals of first type

Surface integrals occur in many applications, for example in connection with the center of gravity of a curved lamina or the potential due to charges distributed on surfaces.

Let $S$ be an elementary surface of finite area and let $f$ be a real valued function which is defined and continuous on $S$. We subdivide $S$ into $n$ parts $S_1, S_2, \dots, S_n$ of areas $A_1, A_2, \dots, A_n$. In each part $S_k$ we choose an arbitrary point $P_k$ and form the sum:

$$I_n = \sum_{k=1}^{n} f(P_k) \cdot A_k$$


We do this for each $n = 1, 2, \dots$ in an arbitrary manner, but so that the largest part $S_k$ shrinks to a point as $n$ approaches infinity. The sequence $I_1, I_2, \dots, I_n, \dots$ has a limit which is independent of the choice of subdivisions and points $P_k$. This limit is called the surface integral of first type of $f$ over $S$ and is denoted by

$$\iint_S f\, dS$$

To evaluate the surface integral, we may reduce it to a double integral as follows: if $S$ is represented in parametric form by a vector function $x = x(u, v)$, then

$$\iint_S f\, dS = \iint_R f(x(u, v)) \cdot \sqrt{EG - F^2}\, du\, dv$$

where $R$ is the region corresponding to $S$ in the $u, v$ plane.

The value of the surface integral of first type does not depend on the parametric representation of the surface.

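If the elementary surface $S$ is represented in the form $x_3 = g(x_1, x_2)$, then

$$\iint_S f\, dS = \iint_R f(x_1, x_2, g(x_1, x_2)) \cdot \sqrt{1 + \left( \frac{\partial g}{\partial x_1} \right)^2 + \left( \frac{\partial g}{\partial x_2} \right)^2}\, dx_1\, dx_2$$

where $R$ is the orthogonal projection of $S$ onto the $x_1, x_2$ plane.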
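A short numerical check of the formula for a surface given as a graph, added here as an illustration. It assumes $f \equiv 1$ (so the integral is the area of $S$), the paraboloid $g(x_1, x_2) = x_1^2 + x_2^2$ over the unit disk, and polar coordinates $dx_1\, dx_2 = r\, dr\, d\theta$ for the plane region; all names are hypothetical.

```python
import math

def g(x1, x2):
    """Assumed example surface x3 = g(x1, x2): a paraboloid."""
    return x1 * x1 + x2 * x2

def grad_g(x1, x2):
    """Partial derivatives of g, computed by hand."""
    return 2 * x1, 2 * x2

def surface_integral_graph(f, nr=400, nt=64):
    """Midpoint rule over the unit disk in polar coordinates (dx1 dx2 = r dr dtheta)."""
    dr = 1.0 / nr
    dt = 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * dr
        for j in range(nt):
            theta = (j + 0.5) * dt
            x1, x2 = r * math.cos(theta), r * math.sin(theta)
            g1, g2 = grad_g(x1, x2)
            weight = math.sqrt(1 + g1 * g1 + g2 * g2)
            total += f(x1, x2, g(x1, x2)) * weight * r * dr * dt
    return total

area = surface_integral_graph(lambda x1, x2, x3: 1.0)
exact = math.pi / 6 * (5 * math.sqrt(5) - 1)  # closed form of the same integral
print(area, exact)
```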


$$\iint_S \mu D^2\, dS,$$

where $S$ is defined by $x_1^2 + x_2^2 + x_3^2 = a^2$ and $D^2 = x_1^2 + x_2^2$.
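The integral above can be evaluated with the parametric formula. The sketch below is an added illustration that assumes a constant density $\mu$, the parametrization $x = (a \cos u \cos v,\ a \sin u \cos v,\ a \sin v)$ of the sphere, for which $E = a^2 \cos^2 v$, $F = 0$, $G = a^2$, and the midpoint rule; the function name `lamina_integral` and the sample values of `a` and `mu` are hypothetical.

```python
import math

def lamina_integral(a, mu, n=200):
    """Midpoint rule for mu * (integral over the sphere of radius a of D^2 dS)."""
    du = 2 * math.pi / n
    dv = math.pi / n
    total = 0.0
    for i in range(n):
        u = (i + 0.5) * du
        for j in range(n):
            v = -math.pi / 2 + (j + 0.5) * dv
            x1 = a * math.cos(u) * math.cos(v)
            x2 = a * math.sin(u) * math.cos(v)
            # First fundamental form of this parametrization (computed by hand)
            E = (a * math.cos(v)) ** 2
            F = 0.0
            G = a * a
            total += mu * (x1 * x1 + x2 * x2) * math.sqrt(E * G - F * F) * du * dv
    return total

a, mu = 1.5, 2.0  # assumed sample radius and constant density
value = lamina_integral(a, mu)
exact = 8 * math.pi * mu * a ** 4 / 3  # value obtained by integrating by hand
print(value, exact)
```

With $M = 4 \pi a^2 \mu$ the exact value can also be written as $\tfrac{2}{3} M a^2$.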

76 Surface integrals of second type

Let $S$ be an elementary surface. We orient $S$ by choosing a unit normal vector $n$. Denoting the angles between $n$ and the positive $x_1$, $x_2$ and $x_3$ axes by $\alpha_1$, $\alpha_2$ and $\alpha_3$ respectively, we have

$$n = (\cos \alpha_1, \cos \alpha_2, \cos \alpha_3)$$

Let $u_1$, $u_2$ and $u_3$ be given functions which are defined and continuous at every point of $S$. The surface integrals to be considered are usually written in the form:

$$\iint_S u_1\, dx_2\, dx_3 \qquad \iint_S u_2\, dx_3\, dx_1 \qquad \iint_S u_3\, dx_1\, dx_2$$

and by definition this means

$$\iint_S u_1\, dx_2\, dx_3 = \iint_S u_1 \cos \alpha_1\, dS$$

$$\iint_S u_2\, dx_3\, dx_1 = \iint_S u_2 \cos \alpha_2\, dS$$

$$\iint_S u_3\, dx_1\, dx_2 = \iint_S u_3 \cos \alpha_3\, dS$$

It is clear that the value of such an integral depends on the choice of $n$, that is, on the orientation of $S$. The transition to the opposite orientation corresponds to the multiplication of the integral by $-1$, because then the components $\cos \alpha_1, \cos \alpha_2, \cos \alpha_3$ of $n$ are multiplied by $-1$.

The sum of the above three integrals may be written in a simple form by using vector notation. In fact, we introduce the vector function

$$u(x_1, x_2, x_3) = (u_1(x_1, x_2, x_3), u_2(x_1, x_2, x_3), u_3(x_1, x_2, x_3))$$

and we obtain

$$\iint_S u_1\, dx_2\, dx_3 + u_2\, dx_3\, dx_1 + u_3\, dx_1\, dx_2 = \iint_S (u_1 \cos \alpha_1 + u_2 \cos \alpha_2 + u_3 \cos \alpha_3)\, dS = \iint_S u \cdot n\, dS$$

To evaluate the above integrals we may reduce them to double integrals over a plane region. If $S$ can be represented as

$$x_3 = h(x_1, x_2)$$

and is oriented such that $n$ points upward (so that $\alpha_3$ is acute), then

$$\iint_S u_3\, dx_1\, dx_2 = \iint_R u_3(x_1, x_2, h(x_1, x_2))\, dx_1\, dx_2$$

where $R$ is the orthogonal projection of $S$ onto the $x_1, x_2$ plane. If $n$ points downward (so that $\alpha_3$ is obtuse), then we have

$$\iint_S u_3\, dx_1\, dx_2 = -\iint_R u_3(x_1, x_2, h(x_1, x_2))\, dx_1\, dx_2$$

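For the other integrals the situation is quite similar. If $S$ is represented in parametric form

$$x = x(u, v)$$

then the normal vector is

$$\text{(a)} \quad n = +\frac{\dfrac{\partial x}{\partial u} \times \dfrac{\partial x}{\partial v}}{\left\| \dfrac{\partial x}{\partial u} \times \dfrac{\partial x}{\partial v} \right\|} \qquad \text{or} \qquad \text{(b)} \quad n = -\frac{\dfrac{\partial x}{\partial u} \times \dfrac{\partial x}{\partial v}}{\left\| \dfrac{\partial x}{\partial u} \times \dfrac{\partial x}{\partial v} \right\|}$$

and

$$\iint_S u_3\, dx_1\, dx_2 = \pm \iint_R u_3(x(u, v)) \cdot \frac{D(x_1, x_2)}{D(u, v)}\, du\, dv$$

with $+$ if $n$ is as in (a) and $-$ if $n$ is as in (b). Here $R$ is the region corresponding to $S$ in the $u, v$ plane.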
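The parametric formula can be illustrated numerically. The following sketch is an addition under stated assumptions: the surface is the unit sphere, the parameters called $(u, v)$ in the text are named `lon` and `lat` in the code to avoid a clash with the vector field, the field is $(x_1, x_2, x_3)$, and the normal is taken as in case (a), which is the outward normal for this parametrization. The analogous Jacobians $D(x_2, x_3)/D(u, v)$ and $D(x_3, x_1)/D(u, v)$ are used for the other two integrals.

```python
import math

def point(lon, lat):
    """Unit sphere; (lon, lat) play the role of the parameters (u, v) in the text."""
    return (math.cos(lon) * math.cos(lat), math.sin(lon) * math.cos(lat), math.sin(lat))

def partials(lon, lat):
    """Partial derivatives of the parametrization, computed by hand."""
    xu = (-math.sin(lon) * math.cos(lat), math.cos(lon) * math.cos(lat), 0.0)
    xv = (-math.cos(lon) * math.sin(lat), -math.sin(lon) * math.sin(lat), math.cos(lat))
    return xu, xv

def jacobians(lon, lat):
    """D(x2,x3)/D(u,v), D(x3,x1)/D(u,v), D(x1,x2)/D(u,v)."""
    xu, xv = partials(lon, lat)
    j23 = xu[1] * xv[2] - xv[1] * xu[2]
    j31 = xu[2] * xv[0] - xv[2] * xu[0]
    j12 = xu[0] * xv[1] - xv[0] * xu[1]
    return j23, j31, j12

def second_type_integrals(field, sign=+1, n=200):
    """Midpoint rule for the three integrals; sign=+1 is case (a), sign=-1 is case (b)."""
    d_lon = 2 * math.pi / n
    d_lat = math.pi / n
    sums = [0.0, 0.0, 0.0]
    for i in range(n):
        lon = (i + 0.5) * d_lon
        for j in range(n):
            lat = -math.pi / 2 + (j + 0.5) * d_lat
            values = field(*point(lon, lat))
            for k, jac in enumerate(jacobians(lon, lat)):
                sums[k] += sign * values[k] * jac * d_lon * d_lat
    return tuple(sums)

i1, i2, i3 = second_type_integrals(lambda x1, x2, x3: (x1, x2, x3))
flux = i1 + i2 + i3  # equals the integral of u . n dS
print(i1, i2, i3, flux)
```

Each of the three integrals is approximately $4\pi/3$, so the total flux is approximately $4\pi$; passing `sign=-1` reverses the orientation and changes the sign of every integral.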


77 Properties of surface integrals

Let $A$ be a closed bounded region in space whose boundary $S$ is a union of elementary surfaces and is orientable.

Theorem 77.1 (divergence theorem of Gauss). If the vector function $u(x_1, x_2, x_3)$ has continuous first order partial derivatives in some domain containing $A$, then

$$\iiint_A \operatorname{div} u\, dV = \iint_S u \cdot n\, dS$$

where n is the outer unit normal vector of S.

Proof. The proof is technical and will be skipped.
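Although the proof is skipped, the statement can be checked numerically on a simple region. The sketch below is an added illustration: it assumes the unit cube $[0, 1]^3$ as the region $A$ and the field $u = (x_1 x_2,\ x_2 x_3,\ x_3 x_1)$, whose divergence $x_1 + x_2 + x_3$ is computed by hand. Both sides are approximated by the midpoint rule.

```python
def field(x1, x2, x3):
    """Assumed example vector field u."""
    return (x1 * x2, x2 * x3, x3 * x1)

def divergence(x1, x2, x3):
    """div u for the field above, computed by hand."""
    return x2 + x3 + x1

n = 20
h = 1.0 / n
mids = [(k + 0.5) * h for k in range(n)]

# Left-hand side: triple integral of div u over the unit cube
volume_integral = sum(divergence(a, b, c) for a in mids for b in mids for c in mids) * h ** 3

# Right-hand side: flux through the six faces with outer normals +-e1, +-e2, +-e3
flux = 0.0
for a in mids:
    for b in mids:
        flux += (field(1.0, a, b)[0] - field(0.0, a, b)[0]) * h * h  # faces x1 = 1, x1 = 0
        flux += (field(a, 1.0, b)[1] - field(a, 0.0, b)[1]) * h * h  # faces x2 = 1, x2 = 0
        flux += (field(a, b, 1.0)[2] - field(a, b, 0.0)[2]) * h * h  # faces x3 = 1, x3 = 0

print(volume_integral, flux)
```

Both values are approximately $3/2$; the midpoint rule is exact here up to rounding because the integrands are linear.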

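Corollary 77.1. If $u = \operatorname{grad} f$, then

$$\iiint_A \Delta f\, dV = \iint_S \frac{\partial f}{\partial n}\, dS$$

and for $\Delta f = 0$ we have

$$\iint_S \frac{\partial f}{\partial n}\, dS = 0$$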

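Corollary 77.2. If $u = f \cdot \operatorname{grad} g$, then

$$\operatorname{div} u = f \cdot \Delta g + \operatorname{grad} f \cdot \operatorname{grad} g$$

$$u \cdot n = f \cdot (n \cdot \operatorname{grad} g) = f \cdot \frac{\partial g}{\partial n}$$

and

$$\iiint_A (f \cdot \Delta g + \operatorname{grad} f \cdot \operatorname{grad} g)\, dV = \iint_S f \cdot \frac{\partial g}{\partial n}\, dS$$

This equality is the first Green's formula. By interchanging $f$ and $g$ and subtracting the resulting equality from the first one, we obtain

$$\iiint_A (f \cdot \Delta g - g \cdot \Delta f)\, dV = \iint_S \left( f \cdot \frac{\partial g}{\partial n} - g \cdot \frac{\partial f}{\partial n} \right) dS$$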

This equality is the second Green’s formula.

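Let $S$ be an elementary surface in space and let $C$ be the boundary of $S$, an elementary closed curve.

Theorem 77.2 (Stokes's theorem). If the vector valued function $v$ has continuous first order partial derivatives in a domain in space which contains $S$, then

$$\iint_S (\operatorname{curl} v) \cdot n\, dS = \int_C v \cdot t\, ds$$

Here

$$\operatorname{curl} v = \left( \frac{\partial v_3}{\partial x_2} - \frac{\partial v_2}{\partial x_3},\; \frac{\partial v_1}{\partial x_3} - \frac{\partial v_3}{\partial x_1},\; \frac{\partial v_2}{\partial x_1} - \frac{\partial v_1}{\partial x_2} \right),$$

$n$ is the unit normal vector of $S$ and $t$ is the unit tangent vector of $C$, with the orientation of $C$ associated with that of $S$ by the counterclockwise rule of Section 74.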
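Stokes's theorem can be checked numerically on a simple case. The sketch below is an added illustration: it assumes the upper unit hemisphere with upward normal as $S$, the unit circle in the plane $x_3 = 0$ traversed counterclockwise as $C$, and the field $v = (x_3 - x_2,\ x_1,\ x_1 x_2)$, whose curl $(x_1,\ 1 - x_2,\ 2)$ is computed by hand. The surface parameters are named `lon` and `lat` to avoid a clash with the field name.

```python
import math

def v_field(x1, x2, x3):
    """Assumed example vector field v."""
    return (x3 - x2, x1, x1 * x2)

def curl_v(x1, x2, x3):
    """curl v for the field above, computed by hand."""
    return (x1, 1.0 - x2, 2.0)

def surface_side(n=200):
    """Midpoint rule for the integral of (curl v) . n dS over the upper unit hemisphere."""
    d_lon = 2 * math.pi / n
    d_lat = (math.pi / 2) / n
    total = 0.0
    for i in range(n):
        lon = (i + 0.5) * d_lon
        for j in range(n):
            lat = (j + 0.5) * d_lat
            x = (math.cos(lon) * math.cos(lat), math.sin(lon) * math.cos(lat), math.sin(lat))
            c = curl_v(*x)
            # For this parametrization n dS = cos(lat) * x d_lon d_lat (upward normal)
            total += (c[0] * x[0] + c[1] * x[1] + c[2] * x[2]) * math.cos(lat) * d_lon * d_lat
    return total

def boundary_side(n=2000):
    """Midpoint rule for the integral of v . t ds along the unit circle, counterclockwise."""
    dt = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * dt
        x = (math.cos(t), math.sin(t), 0.0)
        dx = (-math.sin(t), math.cos(t), 0.0)  # t ds = dx = x'(t) dt
        w = v_field(*x)
        total += (w[0] * dx[0] + w[1] * dx[1] + w[2] * dx[2]) * dt
    return total

lhs = surface_side()
rhs = boundary_side()
print(lhs, rhs)
```

Both sides are approximately $2\pi$ for this choice of field.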


78 Differentiation of an integral containing a parameter

It can sometimes happen that an integrand, in addition to being a function of $x$, also depends on a parameter $t$. Furthermore, the domain of integration may also depend on the parameter $t$, so that the value of the integral itself depends on $t$. In this section we study the differentiation of such an integral with respect to $t$. First we consider the one-dimensional case, i.e.

$$I(t) = \int_{\varphi(t)}^{\psi(t)} f(x, t)\, dx$$

Theorem 78.1. If the functions $\varphi(t)$, $\psi(t)$ are differentiable with respect to $t$ in some interval $c \le t \le d$ and the function $f(x, t)$ is continuous with respect to $x$ over the interval $\varphi(t) \le x \le \psi(t)$ and continuously differentiable with respect to $t$, then the function $I(t)$ is differentiable and

$$\frac{d}{dt} \int_{\varphi(t)}^{\psi(t)} f(x, t)\, dx = \psi'(t) \cdot f(\psi(t), t) - \varphi'(t) \cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\, dx$$

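Proof. By the mean value theorem for derivatives, for $t + h \in [c, d]$ we have

$$\varphi(t + h) = \varphi(t) + h \cdot \frac{d\varphi}{dt}(\xi) \quad \text{with } \xi \in (t, t + h)$$

$$\psi(t + h) = \psi(t) + h \cdot \frac{d\psi}{dt}(\eta) \quad \text{with } \eta \in (t, t + h)$$

$$f(x, t + h) = f(x, t) + h \cdot \frac{\partial f}{\partial t}(x, \zeta) \quad \text{with } \zeta \in (t, t + h)$$

Now we have

$$I(t + h) = \int_{\varphi(t+h)}^{\psi(t+h)} f(x, t + h)\, dx$$

$$= \int_{\varphi(t) + h \varphi'(\xi)}^{\varphi(t)} f(x, t + h)\, dx + \int_{\varphi(t)}^{\psi(t)} f(x, t + h)\, dx + \int_{\psi(t)}^{\psi(t) + h \psi'(\eta)} f(x, t + h)\, dx$$

$$= -h \cdot \varphi'(\xi) \cdot f(x', t + h) + h \cdot \psi'(\eta) \cdot f(x'', t + h) + \int_{\varphi(t)}^{\psi(t)} f(x, t + h)\, dx$$

where, by the mean value theorem for integrals,

$$\varphi(t) \le x' \le \varphi(t) + h \cdot \varphi'(\xi) \qquad \psi(t) \le x'' \le \psi(t) + h \cdot \psi'(\eta)$$

Next, forming the difference $I(t + h) - I(t)$ and combining the integrals, we obtain

$$I(t + h) - I(t) = h \cdot \psi'(\eta) \cdot f(x'', t + h) - h \cdot \varphi'(\xi) \cdot f(x', t + h) + h \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, \zeta)\, dx$$

Finally, forming the difference quotient $\frac{I(t + h) - I(t)}{h}$ and taking the limit as $h \to 0$, it follows that $\xi$, $\eta$, $\zeta$ all tend to $t$, while $x' \to \varphi(t)$ and $x'' \to \psi(t)$. Hence

$$\frac{dI}{dt} = \psi'(t) \cdot f(\psi(t), t) - \varphi'(t) \cdot f(\varphi(t), t) + \int_{\varphi(t)}^{\psi(t)} \frac{\partial f}{\partial t}(x, t)\, dx$$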
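The differentiation formula can be compared with a direct numerical derivative. The sketch below is an added illustration: it assumes $f(x, t) = \sin(x t)$, $\varphi(t) = t$, $\psi(t) = t^2$, Simpson's rule for the integrals, and a central difference for $dI/dt$; the helper names are hypothetical.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

# Assumed example data: integrand, its t-derivative, and the moving limits
def f(x, t): return math.sin(x * t)
def df_dt(x, t): return x * math.cos(x * t)
def phi(t): return t
def dphi(t): return 1.0
def psi(t): return t * t
def dpsi(t): return 2 * t

def I(t):
    return simpson(lambda x: f(x, t), phi(t), psi(t))

def dI_formula(t):
    """Right-hand side of the differentiation formula of Theorem 78.1."""
    boundary = dpsi(t) * f(psi(t), t) - dphi(t) * f(phi(t), t)
    interior = simpson(lambda x: df_dt(x, t), phi(t), psi(t))
    return boundary + interior

t0, h = 1.5, 1e-5
dI_numeric = (I(t0 + h) - I(t0 - h)) / (2 * h)  # central difference of I
dI_exact = dI_formula(t0)
print(dI_numeric, dI_exact)
```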

Corollary 78.1. If $f(x, t)$ is continuous with respect to $x$ over the interval $[a, b]$ and continuously differentiable with respect to $t$, then

$$\frac{d}{dt} \int_a^b f(x, t)\, dx = \int_a^b \frac{\partial f}{\partial t}(x, t)\, dx$$

Now we consider the three-dimensional case. Let $R \subset \mathbb{R}^3$ be a domain and $(a, b) \subset \mathbb{R}^1$ an open interval. We consider a continuously differentiable function $x : (a, b) \times R \to \mathbb{R}^3$ and a bounded set $\omega_0 \subset R$ whose boundary is a smooth surface $S_0$. For $t$ arbitrary, but fixed, denote by $\omega(t)$ the set

$$\omega(t) = \{ x(t, \xi) \mid \xi \in \omega_0 \}$$

and assume that the Jacobian of the function $x_t : \omega_0 \to \omega(t)$ defined by

$$x_t(\xi) = x(t, \xi)$$

is different from zero:

$$\frac{D(x_1, x_2, x_3)}{D(\xi_1, \xi_2, \xi_3)} \neq 0$$

Now consider a function $F : \mathbb{R}^3 \times (a, b) \to \mathbb{R}^1$ having continuous first order partial derivatives and the integral

$$I(t) = \iiint_{\omega(t)} F(x, t)\, dx_1\, dx_2\, dx_3$$

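Theorem 78.2. The function $I(t)$ is continuously differentiable and

$$\frac{dI}{dt} = \iiint_{\omega(t)} \frac{\partial F}{\partial t}(x, t)\, dx_1\, dx_2\, dx_3 + \iint_{S(t)} F \cdot (v \cdot n)\, dS$$

where $S(t)$ is the boundary of $\omega(t)$, $v = \frac{\partial x}{\partial t}$ and $n$ is the outer unit normal vector of the surface $S(t)$.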

Proof. The proof is rather technical and is omitted.
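In place of the omitted proof, the formula can at least be checked on a simple case. The sketch below is an added illustration under strong simplifying assumptions: $x(t, \xi) = t \xi$ with $\omega_0$ the unit ball, so that $\omega(t)$ is the ball of radius $t$ (Jacobian $t^3 \neq 0$ for $t > 0$), and $F(x, t) = t \|x\|^2$ depends on $x$ only through $r = \|x\|$. The volume integrals then reduce to radial integrals $4\pi \int_0^t (\cdot)\, r^2\, dr$, and on $S(t)$ we have $v = \xi$ and $n = \xi$, so $v \cdot n = 1$.

```python
import math

def simpson(func, a, b, n=1000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(a + k * h)
    return total * h / 3

def F(r, t):
    """Assumed radially symmetric integrand F(x, t) = t * |x|^2, written in terms of r = |x|."""
    return t * r * r

def dF_dt(r, t):
    """Partial derivative of F with respect to t, computed by hand."""
    return r * r

def I(t):
    """Integral of F over the ball of radius t, via a radial integral."""
    return 4 * math.pi * simpson(lambda r: F(r, t) * r * r, 0.0, t)

def dI_formula(t):
    """Right-hand side of Theorem 78.2 for the expanding ball."""
    volume_term = 4 * math.pi * simpson(lambda r: dF_dt(r, t) * r * r, 0.0, t)
    v_dot_n = 1.0  # v = dx/dt = xi and n = xi on the sphere |x| = t
    surface_term = F(t, t) * v_dot_n * 4 * math.pi * t * t
    return volume_term + surface_term

t0, h = 1.2, 1e-5
dI_numeric = (I(t0 + h) - I(t0 - h)) / (2 * h)  # central difference of I
dI_rhs = dI_formula(t0)
dI_exact = 24 * math.pi * t0 ** 5 / 5  # from I(t) = 4*pi*t^6/5, integrated by hand
print(dI_numeric, dI_rhs, dI_exact)
```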


References

[1] R. Haggarty, Fundamentals of Mathematical Analysis ; Addison-Wesley, 1989, Oxford

[2] A. B. Israel, R. Gilbert, Computer-Supported Calculus ; Springer Wien New York,2001, RISC Johannes Kepler University, Linz, Austria

[3] C. Lanczos, Applied Analysis ; Sir Isaac Pitman, 1967, London

[4] F. Ayres, J. Cault, Differential and Integral Calculus in Simetric Units ; Mc.Grow-Hill, 1988

[5] A. Jeffrey, Mathematics for engineers ad scientists ; Van Nostrand, 1961

[6] E. Kreiszig, Advanced engineering mathematics ; Wiley & Sons, 1967

[7] O. V. Manturov, N. M. Matveev, A course of higher mathematics ; Mir, 1989
