
MATH 234 THIRD SEMESTER CALCULUS

Fall 2013


Math 234 – 3rd Semester Calculus
Lecture notes version 0.9 (Fall 2013)

This is a self contained set of lecture notes for Math 234. The notes were written by Sigurd Angenent; some problems were taken from Guichard’s open calculus text, which is available at http://www.whitman.edu/mathematics/multivariable/src/

The LaTeX files, as well as the P and I files that were used to produce the notes before you, can be obtained from the following web site:

http://www.math.wisc.edu/~angenent/Free-Lecture-Notes

They are meant to be freely available for non-commercial use, in the sense that “free software” is free. More precisely:

Copyright (c) 2009 Sigurd B. Angenent. Permission is granted to copy, distribute and/or modify this document under the terms of the GNU Free Documentation License, Version 1.2 or any later version published by the Free Software Foundation; with no Invariant Sections, no Front-Cover Texts, and no Back-Cover Texts. A copy of the license is included in the section entitled “GNU Free Documentation License”.

Contents

Chapter 1. Vector Geometry in Three dimensional space
1. Three dimensional space
2. Geometric description of vectors
3. Arithmetic of vectors
4. Vector algebra
5. Component representation of vectors
6. The dot product
7. The cross product
8. The triple product
9. Determinants
10. Determinants, the triple product, and the cross product
11. Defining equations for lines and planes
12. Problems

Chapter 2. Parametric curves and vector functions
1. Vector functions
2. Using vector functions to describe motion
3. Lines
4. Circular motion
5. The cycloid
6. The helix
7. The derivative of a vector function
8. The derivative as velocity vector
9. Acceleration
10. The differentiation rules
11. Vector functions of constant length
12. Two examples
13. Arc length
14. Arc length derivative
15. Unit Tangent and Curvature
16. Osculating plane
17. Problems

Chapter 3. Functions of more than one variable
1. Functions of two variables and their graphs
2. Linear functions
3. Quadratic forms
4. Functions in polar coordinates r, θ
5. Methods of visualizing the graph of a function
Problems

Chapter 4. Derivatives
1. Interior points and continuous functions
2. Partial Derivatives
3. Problems
4. The linear approximation to a function
5. The tangent plane to a graph
6. The Two Variable Chain Rule
7. Problems
8. Gradients
9. The chain rule and the gradient of a function of three variables
10. Implicit Functions
Problems
11. The Chain Rule with more Independent Variables; Coordinate Transformations
12. Problems
13. Higher Partials and Clairaut’s Theorem
14. Finding a function from its derivatives
15. Problems

Chapter 5. Maxima and Minima
1. Local and Global extrema
2. Continuous functions on closed and bounded sets
3. Problems
4. Critical points
5. When there are more than two variables
6. Problems
7. A Minimization Problem: Linear Regression
8. Problems
9. The Second Derivative Test
10. Problems
11. Second derivative test for more than two variables
12. Optimization with constraints and the method of Lagrange multipliers
13. Problems

Chapter 6. Integrals
1. Ways of Integrating
2. Double Integrals
3. Problems
4. Triple integrals
5. Why compute a Triple Integral?
6. Integration in special coordinate systems
7. Problems

Chapter 7. Vector Calculus
1. Vector Fields
2. Examples of vector fields
3. Line integrals
4. Problems
5. Line integrals of vector fields
6. Another Fundamental Theorem of Calculus
7. Conservative vector fields
8. Problems
9. Flux integrals
10. Green’s Theorem
11. Conservative vector fields and Clairaut’s theorem
12. Problems
13. Surfaces and Surface integrals
14. Examples
15. The divergence theorem and Stokes’ theorem
16. $\vec\nabla$ – differentiating vector fields
17. Problems

Math 234 – Answers and Hints

CHAPTER 1

Vector Geometry in Three dimensional space

1. Three dimensional space

The world according to our first and second semester calculus courses is flat: except for a brief digression about surfaces of revolution, everything that we discussed in Math 221 and 222 took place in the (x, y)-plane. All curves were curves in the plane and all functions had graphs that were curves in the plane. This semester we leave two dimensions behind and enter the three dimensional world. In order to understand the objects we will be dealing with, such as curves that are free to loop around in space, or functions whose graphs are themselves two dimensional curved surfaces, we will first review some three dimensional geometry. In particular, we will review the use of vectors in three dimensional geometry.

2. Geometric description of vectors

2.1. Points and their coordinates. We are used to describing the location of any point in the plane by choosing two perpendicular “coordinate axes” (the x and y axes), and specifying the corresponding (x, y)-coordinates of any given point. In the same way we can describe where points are in three dimensional space by choosing three mutually perpendicular axes, which we call the x, y, and z axes. To say where some given point P is, we travel from the origin to P, first along the x axis, then parallel to the y-axis, and finally parallel to the z-axis. The distances we had to go in the x, y, and z directions are the x, y, and z coordinates of our point P.

Figure 1. To determine the location of points in three dimensional space (such as the center of the blue sphere in this drawing), we should choose three coordinate axes, and specify three numbers: the x, y, and z coordinates of the point.

2.2. Vectors. While points and their coordinates are used to describe locations in space, vectors are used to describe displacements, i.e. how to go from one point to another. Such a displacement has a size (how far we have to go), and a direction (which way we go). Vectors also get used in non-geometric situations to describe objects that have size and direction; velocities and forces in physics are typical examples of vector-like objects.

Informal definition of “vectors”. We will think of a vector as an arrow connecting two points. If the points are $A$ and $B$ then we call the vector $\overrightarrow{AB}$. If we translate a vector $\overrightarrow{AB}$ without turning it then we say that the resulting vector $\overrightarrow{CD}$ is the same vector as the original vector $\overrightarrow{AB}$. A more precise way of saying that we should be able to move $\overrightarrow{AB}$ “without turning,” is to insist that the line segments $AB$ and $CD$ should be parallel, and have the same length and orientation.

Figure 2. This figure contains four points ($A$, $B$, $C$, $D$), two line segments ($AB$ and $CD$), but only one vector since $\overrightarrow{AB}$ and $\overrightarrow{CD}$ represent the same vector: $\overrightarrow{AB} = \overrightarrow{CD}$.

We say that the arrows $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ both represent the same vector. Since both $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ are the same vector we will often want to use a notation for vectors that does not emphasize any particular choice of initial point and endpoint. The notation we will use in this course is

$$\vec a = \overrightarrow{AB} = \overrightarrow{PQ},$$

i.e., a single letter with an arrow on top will always stand for a vector in this course.

Figure 3. Adding vectors. To add two vectors $\vec a$ and $\vec b$, move one vector until its initial point is the end point of the other, and combine them to obtain $\vec a + \vec b$.

3. Arithmetic of vectors

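To add two vectors $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ we first translate the vector $\overrightarrow{PQ}$ so that its initial point becomes $B$; let the result of this translation be the vector $\overrightarrow{BC}$. Then, by definition, the sum of $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ is $\overrightarrow{AC}$: in a formula,

$$\overrightarrow{AB} + \overrightarrow{PQ} = \overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}.$$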

An equivalent way of adding two vectors $\overrightarrow{AB}$ and $\overrightarrow{PQ}$ is to move the vectors around until they have the same initial point. Two vectors with a common initial point form two sides of a parallelogram (see Figure 4) and the sum of the two vectors is the diagonal of that parallelogram.

Figure 4. Using a parallelogram to add vectors. To find $\overrightarrow{AB} + \overrightarrow{AD}$ we move the vector $\overrightarrow{AD}$ so that its initial point is at $B$, i.e. the endpoint of $\overrightarrow{AB}$. This gives us a parallelogram $ABCD$, where $\overrightarrow{AD} = \overrightarrow{BC}$. Therefore $\overrightarrow{AB} + \overrightarrow{AD} = \overrightarrow{AB} + \overrightarrow{BC} = \overrightarrow{AC}$.

One can also multiply vectors with numbers. To multiply a vector $\vec a$ with a positive real number $t > 0$, we multiply the length of the vector by a factor $t$, without changing the direction of the vector.

Figure 5. Multiplying and subtracting vectors: the vectors $\vec a$, $2\vec a$, and $-\vec a$, and the differences $\vec a - \vec b$ and $\vec b - \vec a$.

4. Vector algebra

The addition and multiplication of vectors and numbers satisfy a number of algebraic properties that should look familiar, as they are very similar to the usual algebraic properties for adding and multiplying numbers. Here they are:

$$\begin{aligned}
\vec a + \vec b &= \vec b + \vec a && \text{commutative law}\\
(\vec a + \vec b) + \vec c &= \vec a + (\vec b + \vec c), \qquad t\cdot(s\cdot\vec a) = (ts)\cdot\vec a && \text{associative laws}\\
t\cdot(\vec a + \vec b) &= t\vec a + t\vec b, \qquad (t+s)\vec a = t\vec a + s\vec a && \text{distributive laws}
\end{aligned}$$


5. Component representation of vectors

5.1. Components of a vector in two dimensional space. There is a way to represent a vector by specifying a list of numbers instead of by giving a geometric description of the vector. To do this for vectors in the plane, we must choose two perpendicular coordinate axes (the “x” and “y” axes). We define

$\vec e_1$ = vector with length 1, in the direction of the x axis
$\vec e_2$ = vector with length 1, in the direction of the y axis

Then any other vector can be written as the sum of a multiple of $\vec e_1$ and another multiple of $\vec e_2$:

$$\vec a = a_1\vec e_1 + a_2\vec e_2. \tag{1}$$

See Figure 6. The numbers $a_1$ and $a_2$ are called the components of the vector $\vec a$. If we know the components $a_1$ and $a_2$ of a vector, and if we know the two vectors $\vec e_1$ and $\vec e_2$, then we can reconstruct the vector $\vec a$ by using the formula (1).

Figure 6. Describing a vector in terms of its components.

Instead of using the notation (1), one very often writes

$$\vec a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \quad\text{or}\quad \vec a = \begin{bmatrix} a_1 \\ a_2 \end{bmatrix}, \quad\text{or}\quad \vec a = \langle a_1, a_2\rangle. \tag{2}$$

This notation says that $\vec a$ is the vector whose components are $a_1$ and $a_2$. Since the two vectors $\vec e_1$ and $\vec e_2$ depend on our choice of coordinate axes, we can only use the component notation if it is clear to everyone how we chose the coordinate axes.

The first way of writing the vector, in which the components $a_1$ and $a_2$ are listed in a column enclosed in either parentheses or square brackets, is the standard way of writing “column vectors,” and is used in linear algebra courses (Math 320, 340, 341, etc.), as well as by most computational software (Matlab™, Octave, etc.). The other way of writing the components, i.e. as $\langle a_1, a_2\rangle$, also gets used, especially when one has to type the equations rather than write them by hand.

5.2. Components of a vector in three dimensional space. The preceding also applies to vectors in three dimensional space: instead of choosing two coordinate axes we choose three axes, and call them the x, y, and z axes (or, the $x_1$, $x_2$, and $x_3$ axes). Then we define $\vec\imath$, $\vec\jmath$, and $\vec k$ (or $\vec e_1$, $\vec e_2$, and $\vec e_3$) to be vectors of length one in the direction of the three coordinate axes. A vector $\vec a$ in space can then be written as a combination of the three vectors $\vec\imath$, $\vec\jmath$, and $\vec k$, namely,

$$\vec a = a_1\vec\imath + a_2\vec\jmath + a_3\vec k, \quad\text{or}\quad \vec a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}.$$

The $\vec e_1, \vec e_2, \vec e_3$ notation is more systematic, but the $\vec\imath, \vec\jmath, \vec k$ notation, which was introduced into vector geometry and vector calculus by J.W. Gibbs (Josiah Willard Gibbs, 1839–1903, https://en.wikipedia.org/wiki/Josiah_Willard_Gibbs), is also very common.

Figure 7. Components of a vector in three dimensional space. The vector $\vec a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}$ is $\vec a = a_1\vec e_1 + a_2\vec e_2 + a_3\vec e_3$.

5.3. Length of a vector whose components are given. We will write $\|\vec a\|$ for the length of a vector $\vec a$. If the vector is given in components,

$$\vec a = a_1\vec e_1 + a_2\vec e_2, \quad\text{or}\quad \vec a = a_1\vec e_1 + a_2\vec e_2 + a_3\vec e_3,$$

then the length of the vector is determined by Pythagoras’ law (see Figures 6 and 7):

$$\|\vec a\| = \sqrt{a_1^2 + a_2^2}, \quad\text{or}\quad \|\vec a\| = \sqrt{a_1^2 + a_2^2 + a_3^2}. \tag{3}$$
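Formula (3) translates directly into a few lines of code. The following is a minimal Python sketch; the function name `norm` and the sample vectors are illustrative choices and not part of the notes.

```python
import math

def norm(a):
    """Length of a vector given by its components (two or three of them), as in (3)."""
    return math.sqrt(sum(c * c for c in a))

print(norm((3.0, 4.0)))       # 5.0
print(norm((1.0, 2.0, 2.0)))  # 3.0
```

The same function handles plane and space vectors because Pythagoras' law has the same shape in both cases.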

6. The dot product

There are two different descriptions of the dot product of two vectors: one geometric, and the other in terms of the components of the vectors.

6.1. Geometric description of the dot product. If $\vec a$ and $\vec b$ are two given vectors, then, by definition,

$$\vec a \bullet \vec b = \|\vec a\|\,\|\vec b\| \cos\theta, \tag{4}$$

where $\theta$ is the angle between the two vectors $\vec a$ and $\vec b$.

(Margin figure: the dot product between two vectors.)



6.2. The dot product in terms of vector components. If we choose an orthonormal set of vectors $\vec e_1$, $\vec e_2$, $\vec e_3$, and write

$$\vec a = a_1\vec e_1 + a_2\vec e_2 + a_3\vec e_3 = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec b = b_1\vec e_1 + b_2\vec e_2 + b_3\vec e_3 = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix},$$

then

$$\vec a \bullet \vec b = a_1b_1 + a_2b_2 + a_3b_3. \tag{5}$$

The fact that (4) and (5) always give the same result is not obvious (the formulas look very different), and requires a proof. A very common proof relies on the law of cosines (it was given in Math 222; see also Problem 12.10).
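A numerical check is no substitute for the proof, but it can make the agreement of (4) and (5) plausible. The sketch below uses two plane vectors (the two-component analogue of (5)) whose lengths and polar angles are chosen for illustration, so that the angle between them is known in advance; all names and values are assumptions made for this example.

```python
import math

def dot(a, b):
    """Component formula (5), here for vectors with any number of components."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

# Two plane vectors described by a length and a polar angle (illustrative values).
alpha, beta = math.radians(20.0), math.radians(80.0)
a = (2.0 * math.cos(alpha), 2.0 * math.sin(alpha))
b = (3.0 * math.cos(beta), 3.0 * math.sin(beta))

theta = beta - alpha                              # angle between a and b
geometric = norm(a) * norm(b) * math.cos(theta)   # formula (4)
algebraic = dot(a, b)                             # formula (5)
print(geometric, algebraic)                       # both approximately 3.0
```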

6.3. Algebraic properties of the dot product. The dot product has the following algebraic properties, which we will use very often throughout this course:

$$\begin{aligned}
\vec a \bullet \vec b &= \vec b \bullet \vec a && \text{commutative}\\
s(\vec a \bullet \vec b) &= (s\vec a) \bullet \vec b && \text{associative}\\
(\vec a + \vec b) \bullet \vec c &= \vec a \bullet \vec c + \vec b \bullet \vec c && \text{distributive}
\end{aligned}$$

We will not prove these properties here. Proofs can be given if one starts either from the algebraic description of the dot product (5), or from the geometric description (4) (although the distributive property is more difficult to prove from the geometric description than from the algebraic description).

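The sign of the dot product tells us if the angle between two vectors is acute, obtuse, or if the vectors are perpendicular:

$$\begin{aligned}
\vec a \perp \vec b &\iff \vec a \bullet \vec b = 0 && \text{(6a)}\\
\vec a \bullet \vec b > 0 &\iff \theta < \tfrac{\pi}{2} && \text{(6b)}\\
\vec a \bullet \vec b < 0 &\iff \theta > \tfrac{\pi}{2} && \text{(6c)}
\end{aligned}$$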
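The sign test (6a)–(6c), together with formula (4) solved for the angle, can be sketched in Python as follows. The helper names `angle` and `classify` and the sample vectors are illustrative assumptions, not taken from the notes.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    return math.sqrt(dot(a, a))

def angle(a, b):
    """Angle between two nonzero vectors, obtained by solving (4) for theta."""
    return math.acos(dot(a, b) / (norm(a) * norm(b)))

def classify(a, b):
    """Apply (6a)-(6c). Comparing with 0 exactly is only reliable for integer components."""
    d = dot(a, b)
    if d == 0:
        return "perpendicular"
    return "acute" if d > 0 else "obtuse"

print(classify((1, 0, 0), (0, 1, 0)))             # perpendicular
print(classify((1, 1, 0), (1, 0, 0)))             # acute
print(classify((1, 0, 0), (-1, 1, 0)))            # obtuse
print(math.degrees(angle((1, 1, 0), (1, 0, 0))))  # approximately 45
```

With floating point components one would normally replace the exact comparison `d == 0` by a small tolerance.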

7. The cross product

As with the dot product, the cross product of two vectors also has a geometric description, and a description in terms of components.

7.1. Geometric description of the cross product. Let $\vec a$ and $\vec b$ be two vectors in three dimensional space, then their cross product is the vector $\vec a \times \vec b$ that satisfies

• $\vec a \times \vec b$ is perpendicular to $\vec a$, and also to $\vec b$,
• the length of $\vec a \times \vec b$ is given by $\|\vec a \times \vec b\| = \|\vec a\|\,\|\vec b\| \sin\theta$, where $\theta$ is the angle between the vectors $\vec a$ and $\vec b$,
• the three vectors $\vec a$, $\vec b$, $\vec a \times \vec b$ satisfy the right hand rule: if on your right hand $\vec a$ is the index finger and $\vec b$ is the middle finger, then your thumb points in the direction of $\vec a \times \vec b$. See Figure 8.

Figure 8. The cross product: $\vec a \times \vec b$ is perpendicular to both $\vec a$ and $\vec b$; its direction follows from the right-hand rule.

The length of the cross product of two vectors has a geometric interpretation. Namely, the quantity $\|\vec a\|\,\|\vec b\| \sin\theta$ is exactly the area of the parallelogram spanned by the vectors $\vec a$ and $\vec b$: if we take $\vec b$ as the base, then the base has length $\|\vec b\|$, the height is $\|\vec a\| \sin\theta$, and the area is height × base.

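7.2. Algebraic description of the cross product. If $\vec a$ and $\vec b$ are given in components, i.e. by

$$\vec a = a_1\vec e_1 + a_2\vec e_2 + a_3\vec e_3 = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec b = b_1\vec e_1 + b_2\vec e_2 + b_3\vec e_3 = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix},$$

then

$$\vec a \times \vec b = \begin{pmatrix} a_2b_3 - a_3b_2 \\ a_3b_1 - a_1b_3 \\ a_1b_2 - a_2b_1 \end{pmatrix}.$$

7.3. Algebraic properties of the cross product. The cross product has the distributive property, namely,

$$(\vec a + \vec b) \times \vec c = \vec a \times \vec c + \vec b \times \vec c \tag{7}$$

holds true for any three vectors $\vec a$, $\vec b$, $\vec c$.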

The cross product is not commutative: $\vec a \times \vec b$ and $\vec b \times \vec a$ are not the same thing. Instead, we have

$$\vec a \times \vec b = -\vec b \times \vec a. \tag{8}$$

Because of this property the cross product is said to be “anti-commutative.”

The associative property fails completely for the cross product: for most vectors $\vec a$, $\vec b$, $\vec c$ one has

$$(\vec a \times \vec b) \times \vec c \neq \vec a \times (\vec b \times \vec c). \tag{9}$$

If you need a vector that is perpendicular to two given vectors, take their cross product.

The length of the cross product $\vec a \times \vec b$ is the area of the parallelogram spanned by those vectors.
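The component formula for the cross product and the properties listed in this section can be tried out on a concrete pair of vectors. This is a small Python sketch; the function names and the vectors (1, 2, 3) and (4, 5, 6) are arbitrary illustrative choices.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Cross product of two space vectors from their components."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2,
            a3 * b1 - a1 * b3,
            a1 * b2 - a2 * b1)

a = (1, 2, 3)
b = (4, 5, 6)
c = cross(a, b)

print(c)                      # (-3, 6, -3)
print(dot(c, a), dot(c, b))   # 0 0, so c is perpendicular to both a and b
print(cross(b, a))            # (3, -6, 3), the opposite vector, as in (8)
print(math.sqrt(dot(c, c)))   # area of the parallelogram spanned by a and b
```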

8. The triple product

Just as two vectors in the plane form a parallelogram, three vectors in space will form a shape called a parallelepiped. By definition, a parallelepiped is a solid body each of whose faces is a parallelogram.

Figure 9. A parallelepiped spanned by three vectors $\vec a$, $\vec b$, $\vec c$. Since the base of the parallelepiped is a parallelogram with edges $\vec b$ and $\vec c$, we have

Area of base = $\|\vec b \times \vec c\|$.

The height of the parallelepiped is $\|\vec a\| \cos\theta$, and therefore the volume is given by

Volume = height · area of base = $\|\vec a\|\,\|\vec b \times \vec c\| \cos\theta = \vec a \bullet (\vec b \times \vec c)$.

This derivation applies to the situation on the left, where the vector $\vec a$ and the cross product $\vec b \times \vec c$ point in the same direction. If these vectors form an obtuse angle, as is the case on the right, then $\cos\theta < 0$, and the height is $-\|\vec a\| \cos\theta$. In that case one has

Volume = height · area of base = $-\|\vec a\|\,\|\vec b \times \vec c\| \cos\theta = -\vec a \bullet (\vec b \times \vec c)$.

If we are given three vectors $\vec a$, $\vec b$, and $\vec c$, then the volume of the parallelepiped they determine is given by the formula

“Volume equals Area of base times height”

In terms of the three vectors this is

$$V = \left| \vec a \bullet (\vec b \times \vec c) \right|. \tag{10}$$

A derivation is sketched in Figure 9. The quantity $\vec a \bullet (\vec b \times \vec c)$ (without the absolute values) is called the triple product of the three vectors $\vec a$, $\vec b$, and $\vec c$. Apart from its use in computing the volume of a parallelepiped, the triple product appears in many other contexts. At first sight the expression $\vec a \bullet (\vec b \times \vec c)$ suggests that the order in which the vectors appear is important, but for cyclic permutations of the three vectors this turns out not to be true. One has

$$\vec a \bullet (\vec b \times \vec c) = \vec b \bullet (\vec c \times \vec a) = \vec c \bullet (\vec a \times \vec b)$$

for any $\vec a$, $\vec b$, $\vec c$. (Swapping two of the vectors, on the other hand, changes the sign of the triple product, because of (8).)
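The volume formula (10) and the behaviour of the triple product under reordering can be checked on an example. The following Python sketch uses three illustrative vectors chosen for this example only; the helper names are likewise assumptions.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def triple(a, b, c):
    """Triple product a . (b x c)."""
    return dot(a, cross(b, c))

a, b, c = (1, 2, 3), (0, 1, 4), (5, 6, 0)

print(triple(a, b, c), triple(b, c, a), triple(c, a, b))  # 1 1 1
print(triple(a, c, b))                                    # -1
print(abs(triple(a, b, c)))                               # volume: 1
```

The three cyclic orderings agree, while exchanging two vectors flips the sign; the volume (10) is the absolute value in either case.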

9. Determinants

For any four numbers $a$, $b$, $c$, $d$, one defines the 2 × 2 determinant to be

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc. \tag{11}$$

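One can also define 3 × 3 determinants. Namely, for any nine numbers $a_1, \dots, c_3$ one defines

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1b_2c_3 - a_1b_3c_2 - a_2b_1c_3 + a_2b_3c_1 + a_3b_1c_2 - a_3b_2c_1. \tag{12}$$

This can be written as

$$\begin{aligned}
\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix}
&= a_1(b_2c_3 - b_3c_2) - a_2(b_1c_3 - b_3c_1) + a_3(b_1c_2 - b_2c_1)\\
&= a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - a_2 \begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix} + a_3 \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix}
\end{aligned} \tag{13}$$

where each coefficient in the first column is multiplied with the 2 × 2 determinant that remains after one deletes the row and column containing the coefficient.

Instead of expanding along the first column one can also expand along the first row:

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = a_1 \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} - b_1 \begin{vmatrix} a_2 & c_2 \\ a_3 & c_3 \end{vmatrix} + c_1 \begin{vmatrix} a_2 & b_2 \\ a_3 & b_3 \end{vmatrix} \tag{14}$$

Many other mnemonic devices exist to remember how to compute a 3 × 3 determinant. A popular trick is “Sarrus’ rule” (see Figure 10).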

One can also define larger determinants, i.e. 4 × 4, 5 × 5, etc., and generally n × n determinants. The theory, which is beyond the scope of this course, is treated in linear algebra courses such as Math 320, 340, or 341.
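The different ways of computing a 3 × 3 determinant described in this section should all give the same number. The Python sketch below implements the expansions (13) and (14) and Sarrus' rule for a matrix given as three rows; the function names and the sample matrix are illustrative assumptions.

```python
def det2(a, b, c, d):
    """2 x 2 determinant | a b ; c d | = ad - bc, formula (11)."""
    return a * d - b * c

def det3_formula_13(m):
    """Expansion (13): coefficients a1, a2, a3."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * det2(b2, c2, b3, c3)
            - a2 * det2(b1, c1, b3, c3)
            + a3 * det2(b1, c1, b2, c2))

def det3_formula_14(m):
    """Expansion (14): coefficients a1, b1, c1."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * det2(b2, c2, b3, c3)
            - b1 * det2(a2, c2, a3, c3)
            + c1 * det2(a2, b2, a3, b3))

def det3_sarrus(m):
    """Sarrus' rule: three 'downward' diagonals minus three 'upward' ones."""
    (a1, b1, c1), (a2, b2, c2), (a3, b3, c3) = m
    return (a1 * b2 * c3 + b1 * c2 * a3 + c1 * a2 * b3
            - c1 * b2 * a3 - a1 * c2 * b3 - b1 * a2 * c3)

m = ((1, 0, 5),
     (2, 1, 6),
     (3, 4, 0))
print(det3_formula_13(m), det3_formula_14(m), det3_sarrus(m))  # 1 1 1
```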

10. Determinants, the triple product, and the cross product

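If the numbers $a_1, \dots, c_3$ in a determinant happen to be the components of three vectors $\vec a$, $\vec b$, $\vec c$, i.e. if

$$\vec a = \begin{pmatrix} a_1 \\ a_2 \\ a_3 \end{pmatrix}, \qquad \vec b = \begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix}, \qquad \vec c = \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix},$$

then the corresponding determinant is exactly the triple product:

$$\begin{vmatrix} a_1 & b_1 & c_1 \\ a_2 & b_2 & c_2 \\ a_3 & b_3 & c_3 \end{vmatrix} = \vec a \bullet (\vec b \times \vec c). \tag{15}$$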
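Identity (15) can be checked numerically by placing three vectors in the columns of a determinant. This is a hedged Python sketch: the determinant is computed from the six-term definition (12), and the three vectors are arbitrary illustrative choices.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    a1, a2, a3 = a
    b1, b2, b3 = b
    return (a2 * b3 - a3 * b2, a3 * b1 - a1 * b3, a1 * b2 - a2 * b1)

def det_from_columns(a, b, c):
    """Determinant (12) whose columns are the vectors a, b, c."""
    a1, a2, a3 = a
    b1, b2, b3 = b
    c1, c2, c3 = c
    return (a1 * b2 * c3 - a1 * b3 * c2 - a2 * b1 * c3
            + a2 * b3 * c1 + a3 * b1 * c2 - a3 * b2 * c1)

a, b, c = (2, 0, 1), (1, 3, 0), (0, 1, 4)
print(det_from_columns(a, b, c))   # 25
print(dot(a, cross(b, c)))         # 25, the triple product
```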


Figure 10. Computing 3 × 3 determinants. There are several shortcuts to remember how to compute a 3 × 3 determinant. Pictured here is “Sarrus’ rule,” which tells us to copy the first two columns of the determinant to the right of the determinant, and read off the six terms in the determinant by following the diagonals: the terms $a_1b_2c_3$, $a_2b_3c_1$, $a_3b_1c_2$ get a plus sign, and the terms $a_3b_2c_1$, $a_1b_3c_2$, $a_2b_1c_3$ get a minus sign.

Related to this is the following practical trick for computing the cross product of two column vectors. Given two column vectors $\vec b$ and $\vec c$ one can write their cross product as

$$\begin{pmatrix} b_1 \\ b_2 \\ b_3 \end{pmatrix} \times \begin{pmatrix} c_1 \\ c_2 \\ c_3 \end{pmatrix} = \begin{vmatrix} \vec e_1 & b_1 & c_1 \\ \vec e_2 & b_2 & c_2 \\ \vec e_3 & b_3 & c_3 \end{vmatrix} = \begin{vmatrix} b_2 & c_2 \\ b_3 & c_3 \end{vmatrix} \vec e_1 - \begin{vmatrix} b_1 & c_1 \\ b_3 & c_3 \end{vmatrix} \vec e_2 + \begin{vmatrix} b_1 & c_1 \\ b_2 & c_2 \end{vmatrix} \vec e_3.$$

The 3 × 3 determinant in this equation is unusual in that some of its entries are vectors instead of numbers. The intention of this notation is that one expands the determinant along the first column, as in (13), and then interprets the result as a vector.

11. Defining equations for lines and planes

11.1. Lines. Let $\ell$ be a line in the plane, and suppose we know one point $A$ on the line, and that we also have a vector $\vec n$ that is perpendicular to the line (and we exclude $\vec n = \vec 0$). Such a vector is called a normal vector to the line. Given any other point $X$ in the plane we can form the vector $\overrightarrow{AX}$ and consider its dot product with the normal. We have

$$\vec n \bullet \overrightarrow{AX} = \|\vec n\|\,\|\overrightarrow{AX}\| \cos\theta,$$

where $\theta$ is the angle between the normal vector $\vec n$ and $\overrightarrow{AX}$.

The combination $\|\overrightarrow{AX}\| \cos\theta$ is, up to its sign, the distance from the line $\ell$ to the point $X$: if $X$ lies on the side of $\ell$ at which the normal vector points then $\vec n \bullet \overrightarrow{AX} > 0$; if $X$ lies on the other side then $\vec n \bullet \overrightarrow{AX} < 0$. We therefore have the following formula for the distance between a point $X$ and the line $\ell$:

$$d = \frac{\vec n \bullet \overrightarrow{AX}}{\|\vec n\|}. \tag{16}$$

When we use this equation to compute the distance from $X$ to $\ell$, it is good to recall that if $\vec x = \begin{pmatrix} x_1 \\ x_2 \end{pmatrix}$ and $\vec a = \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}$ are the position vectors of the points $X$ and $A$, then

$$\overrightarrow{AX} = \vec x - \vec a = \begin{pmatrix} x_1 - a_1 \\ x_2 - a_2 \end{pmatrix}.$$

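(Figure: the distance $d$ from a point $X$ to the line $\ell$ through $A$ with normal vector $\vec n$. If $\vec n \bullet \overrightarrow{AX} > 0$ then $d = \|\overrightarrow{AX}\| \cos\theta$. If $\vec n \bullet \overrightarrow{AX} < 0$ then $d = \|\overrightarrow{AX}\| \cos(\pi - \theta) = -\|\overrightarrow{AX}\| \cos\theta$.)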

Moreover, the length of the normal vector is $\|\vec n\| = \sqrt{n_1^2 + n_2^2}$, so we can rewrite (16) as

$$d = \frac{n_1(x_1 - a_1) + n_2(x_2 - a_2)}{\sqrt{n_1^2 + n_2^2}}.$$

This last formula is more impressive than (16), but it is better to remember (16).

The equation for the distance from any point $X$ to a given line $\ell$ is also important because it gives us the defining equation for the line $\ell$. The defining equation is an equation that tells us for any given point $X$ in the plane if that point is on the line or not. Since $X$ is on $\ell$ exactly when the distance from $\ell$ to $X$ vanishes, it follows from (16) that $X$ is on $\ell$ if and only if

$$\vec n \bullet \overrightarrow{AX} = 0. \tag{17}$$

We can again rewrite this equation in a few different ways. If we want to write it in terms of the position vectors of $A$ and $X$, then we get

$$\vec n \bullet (\vec x - \vec a) = 0, \quad\text{i.e.:}\quad \vec n \bullet \vec x = \vec n \bullet \vec a.$$

Written without vectors, but in terms of the coordinates of the points $A$, $X$, and the components of the normal vector $\vec n$, we can write this last version of our equation as

$$n_1x_1 + n_2x_2 = n_1a_1 + n_2a_2.$$
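Formula (16) and the defining equation (17) can be sketched in a few lines of Python. The function name `signed_distance`, the normal vector (3, 4) and the point A = (0, 1) are illustrative assumptions; the value returned is the signed quantity from (16), so it is negative for points on the side of the line away from the normal.

```python
import math

def signed_distance(n, a, x):
    """Formula (16): d = n . AX / ||n||, for a line in the plane."""
    ax = (x[0] - a[0], x[1] - a[1])          # the vector AX = x - a
    return (n[0] * ax[0] + n[1] * ax[1]) / math.hypot(n[0], n[1])

n = (3.0, 4.0)   # normal vector of the line (illustrative)
A = (0.0, 1.0)   # a known point on the line (illustrative)

print(signed_distance(n, A, (3.0, 5.0)))    # 5.0, on the side the normal points to
print(signed_distance(n, A, (0.0, 0.0)))    # -0.8, on the other side
print(signed_distance(n, A, (4.0, -2.0)))   # 0.0, so this point satisfies (17)
```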

11.2. Planes. We can repeat the derivation of the distance from a point to a line in the plane and derive a formula for the distance from a point in three dimensional space to a given plane. The drawings are harder to make (at first only, practice makes perfect!), but the resulting formulas are the same.

The distance from a point $X$ to a plane $P$ is given by equation (16), where $\vec n$ is a normal vector to the plane (a vector that is perpendicular to the plane), and $A$ is some point on the plane that we happen to know.
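Because the formula is the same, the computation for a plane differs from the one for a line only in the number of components. A minimal Python sketch, with an illustrative normal vector and point that are not taken from the notes:

```python
import math

def distance_to_plane(n, a, x):
    """Formula (16) in space: d = n . AX / ||n|| (a signed distance)."""
    ax = tuple(xi - ai for xi, ai in zip(x, a))
    n_dot_ax = sum(ni * di for ni, di in zip(n, ax))
    return n_dot_ax / math.sqrt(sum(ni * ni for ni in n))

n = (1.0, 2.0, 2.0)   # normal vector to the plane (illustrative)
A = (1.0, 0.0, 0.0)   # a known point on the plane (illustrative)

print(distance_to_plane(n, A, (2.0, 2.0, 2.0)))    # 3.0
print(distance_to_plane(n, A, (1.0, 1.0, -1.0)))   # 0.0, this point lies on the plane
```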


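(Figure: the distance from a point $X$ to a plane through $A$ with normal vector $\vec n$: $d = \|\overrightarrow{AX}\| \cos\theta$, and $\vec n \bullet \overrightarrow{AX} = \|\vec n\|\,\|\overrightarrow{AX}\| \cos\theta$.)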

12. Problems

1. Explain how you can use the dot product to find the angle between the vectors $\vec a = 2\vec\imath - 3\vec\jmath$, and $\vec b = \vec\jmath + \vec k$.

2. For which value(s) of the number $s$ are the vectors $\vec a = \begin{pmatrix} s \\ 1 - s \end{pmatrix}$ and $\vec b = \begin{pmatrix} 2 \\ 3 \end{pmatrix}$ perpendicular? For which values of $s$ do they make an acute angle? •

3. Figure 11 shows a cube whose sides have length 1. Choose $A$ to be the origin, and let the x, y, and z axes be along the sides $AB$, $AD$, and $AE$, respectively.

(a) Draw the vectors $\vec e_1$, $\vec e_2$, and $\vec e_3$ in the figure.

(b) Find a normal vector to the plane through the points $B$, $D$, and $E$.

(c) Draw the plane through $ACH$ (or at least the portion of that plane that lies inside the cube). Find a normal to the plane $ACH$.

(d) Find the angle between the two planes $BDE$ and $ACH$. (The angle between two planes is the same as the angle between their normal vectors, i.e. to find the angle between two planes find a normal vector for each of the planes and compute the angle between these two vectors.)

(e) Find the angle between the two planes $BDE$ and $HFC$.

4. (a) Draw two vectors $\vec a$ and $\vec b$ for which $\vec a$ has length 3, $\vec b$ has length 5, and for which $\vec a \bullet \vec b = -12$. How many solutions are there? •

(b) Can there be two vectors $\vec a$ and $\vec b$ whose lengths are $\|\vec a\| = 3$ and $\|\vec b\| = 5$, and whose inner product is $\vec a \bullet \vec b = 25$? •

5. Compute $\vec a = (\vec\imath \times \vec\jmath) \times \vec\jmath$ and $\vec b = \vec\imath \times (\vec\jmath \times \vec\jmath)$. What does your answer say about the associative property for the cross product? (See § 7.3.)

What about $\vec c = (\vec\imath \times \vec\jmath) \times \vec k$ and $\vec d = \vec\imath \times (\vec\jmath \times \vec k)$?

6. Which of the following vector equations are true for any pair of vectors $\vec a$ and $\vec b$? Either give a proof (using the algebraic properties or the algebraic or geometric descriptions), or give a counterexample.

(a) $(\vec a + \vec b) \bullet (\vec a - \vec b) = \|\vec a\|^2 - \|\vec b\|^2$ ? •

(b) If $\vec a \perp \vec b$ then $\|\vec a + \vec b\|^2 = \|\vec a\|^2 + \|\vec b\|^2$ ? •

(c) If $\vec a \perp \vec b$ then $\|\vec a - \vec b\|^2 = \|\vec a\|^2 - \|\vec b\|^2$ ? •

Figure 11. Figure for problem 12.3: a cube with vertices $A$, $B$, $C$, $D$, $E$, $F$, $G$, $H$.

7. True or False:

(a) If $\vec a \perp \vec b$ and also $\vec b \perp \vec c$ then $\vec a \perp \vec c$ ?

(b) If $\vec a \perp \vec b$ and also $\vec a \perp \vec c$ then $\vec a \perp (\vec b + \vec c)$ ?

(c) If $\vec a \perp \vec b$ and also $\vec b \perp \vec c$ then $\vec b \perp (\vec a - \vec c)$ ?

(d) If $\vec a \perp \vec b + \vec c$ and also $\vec a \perp \vec b - \vec c$ then $\vec a \perp \vec b$ ?

8. Simplify the following expressions

(a) $(\vec a + \vec b) \times (\vec a + \vec b)$ •

(b) $(\vec a + \vec b + \vec c) \times (\vec a + \vec b + \vec c)$ •

(c) $(\vec a + \vec b + \vec c) \times (\vec a + \vec b + \vec c)$ •

(d) $(\vec a + \vec b - \vec c) \times (\vec a - \vec b + \vec c)$

(e) $(\vec a + \vec b - \vec c) \bullet (\vec a - \vec b + \vec c)$

9. This problem is about “cross division,” i.e. can you solve $\vec a \times \vec b = \vec c$ for $\vec b$ if you know $\vec a$ and $\vec c$?

(a) Let $\vec a = \vec e_1 - \vec e_3$, $\vec c = \vec e_1 + 3\vec e_2 + 2\vec e_3$. Find a vector $\vec b$ for which $\vec a \times \vec b = \vec c$, if there is such a thing. (Hint: if $\vec c = \vec a \times \vec b$, then what do you know about $\vec a \bullet \vec c$?) •

(b) Let $\vec a = 2\vec e_1 - \vec e_3$, and $\vec c = \vec e_1 + 3\vec e_2 + 2\vec e_3$. Find a vector $\vec b$ for which $\vec a \times \vec b = \vec c$, if such a thing exists. •

10. The law of cosines says that in a triangle $\triangle ABC$ for which you know the sides $AB$ and $AC$, as well as the angle $\angle A$, the length of the opposing side $BC$ is given by

$$(BC)^2 = (AB)^2 + (AC)^2 - 2(AB)(AC)\cos\angle A.$$

Show how you can use the dot product to (re)prove this law.

Hint: consider the vector equation $\overrightarrow{BC} = \overrightarrow{AC} - \overrightarrow{AB}$. You will need both the geometric description (4) of the dot product, and the algebraic properties from § 6.3.

CHAPTER 2

Parametric curves and vector functions

1. Vector functions

So far in calculus we have only considered functions $y = f(x)$ where both the independent variable $x$ and the dependent variable $y$ are real numbers.

A vector function is a function of one variable whose values are vectors instead of numbers. One way to specify a vector function is to say what its components are:

$$\vec x(t) = \begin{pmatrix} x(t) \\ y(t) \\ z(t) \end{pmatrix} = x(t)\vec e_1 + y(t)\vec e_2 + z(t)\vec e_3.$$

2. Using vector functions to describe motion

One way to visualize a vector function $\vec x(t)$ is to think of the vector $\vec x(t)$ for any given value of $t$ as the position vector of some point in space (or the plane, if $\vec x(t)$ is a two-dimensional vector). In other words, we represent the vector $\vec x(t)$ as an arrow starting at the origin, and ending at some point $X(t)$ whose coordinates are $(x(t), y(t), z(t))$:

$$\vec x(t) = \overrightarrow{OX}(t).$$

As $t$ varies, the point $X(t)$ moves around and traces out a curve. Such a curve is called a parametrized curve, or a parametric curve. The quantity $t$ is called the parameter.

We will now take a look at some examples of parametric curves.

Figure 1. A parametric curve: as the parameter $t$ changes, the vector $\vec x(t)$ will also move. Keeping the initial point of the vector $\vec x(t)$ at the origin $O$, the endpoint $X(t)$ traces out a space curve.

3. Lines

Consider the parametric curve given by

$$\vec x(t) = \vec a + t\vec v \tag{18}$$

where $\vec a$ and $\vec v$ are given constant vectors. As before we let $X(t)$ be the point with $\vec x(t) = \overrightarrow{OX}(t)$, i.e. $\vec x(t)$ is the position vector of the point $X(t)$, and as $t$ changes, $X(t)$ traces out the parametric curve.

To see what the parametric curve looks like, we let $A$ be the point with $\overrightarrow{OA} = \vec a$; then, since

$$\overrightarrow{OX}(t) = \overrightarrow{OA} + \overrightarrow{AX}(t),$$

it follows from (18) that $\overrightarrow{AX}(t) = t\vec v$. Now consider going from the origin $O$ to the point $X(t)$ in two steps: first move from $O$ to the point $A$, then go from $A$ to $X(t)$. The displacement in the second step is $\overrightarrow{AX}(t) = t\vec v$. Changing $t$ will then make the point $X(t)$ slide along the line through the point $A$ in the direction of $\vec v$.

Figure 2. Vector form of linear motion given by $\vec x(t) = \vec a + t\vec v$.

We say that $\vec x(t)$ given by (18) describes motion with constant velocity, whose velocity vector is $\vec v$.
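To see what “constant velocity” means in practice, one can tabulate a few points on the line (18). In the Python sketch below the vectors `a` and `v` are illustrative values; the displacement over any unit time interval comes out equal to `v`.

```python
def line(a, v, t):
    """Position vector x(t) = a + t v from formula (18)."""
    return tuple(ai + t * vi for ai, vi in zip(a, v))

a = (1.0, 2.0, 0.0)    # position at time t = 0 (illustrative)
v = (2.0, -1.0, 3.0)   # constant velocity vector (illustrative)

for t in (0.0, 1.0, 2.0):
    print(t, line(a, v, t))

# Displacement between t = 1 and t = 2: exactly the velocity vector v.
step = tuple(p - q for p, q in zip(line(a, v, 2.0), line(a, v, 1.0)))
print(step)   # (2.0, -1.0, 3.0)
```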

4. Circular motion

For given constants $R > 0$ and $\omega$ we consider the vector function

$$\vec x(t) = R\cos\omega t\,\vec e_1 + R\sin\omega t\,\vec e_2 = \begin{pmatrix} R\cos\omega t \\ R\sin\omega t \end{pmatrix}. \tag{19}$$

The corresponding point is $X(t) = (R\cos\omega t, R\sin\omega t)$. It lies on the circle of radius $R$ with center at the origin, and the angle subtended by $OX(t)$ and the positive x-axis is exactly $\omega t$.

If $\omega > 0$ then as $t$ increases, the angle $\omega t$ increases and the point $X(t)$ goes around the circle in counter-clockwise direction. If $\omega < 0$ then $X(t)$ goes around in the clockwise direction.

The number $\omega$ is the rate of increase of the angle $\omega t$, and is called the angular velocity of the motion.
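The two claims about (19), that $X(t)$ stays on the circle of radius $R$ and that its polar angle is $\omega t$, can be checked numerically. A minimal Python sketch, with illustrative values for `R` and `omega`:

```python
import math

def circle(R, omega, t):
    """Position (19) of a point in circular motion."""
    return (R * math.cos(omega * t), R * math.sin(omega * t))

R, omega = 2.0, math.pi / 2   # illustrative: a quarter turn per unit of time

for t in (0.0, 0.5, 1.0):
    x, y = circle(R, omega, t)
    radius = math.hypot(x, y)        # distance from the origin, should equal R
    polar_angle = math.atan2(y, x)   # should equal omega * t for these values of t
    print(t, radius, polar_angle, omega * t)
```

The comparison of `polar_angle` with `omega * t` only works directly while the angle stays between $-\pi$ and $\pi$, since `atan2` returns angles in that range.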

Figure 3. Circular motion with angular velocity $\omega$.

5. The cycloid

The cycloid is the curve we get if we put a (bicycle) wheel on the ground, mark the point on the tire that touches the ground, and follow this point as we roll the wheel forward. If we call the point $X$, then it depends on the angle $\theta$ that the wheel has turned since $X$ was on the ground. Figure 4 provides a derivation of the vector function $\vec x(\theta) = \overrightarrow{OX}(\theta)$ that describes the cycloid. The result is

$$\vec x(\theta) = \begin{pmatrix} R\theta - R\sin\theta \\ R - R\cos\theta \end{pmatrix}. \tag{20}$$

Figure 4. The cycloid. A wheel of radius $R$ rolls over the x-axis. Initially the wheel touches the x-axis at the origin $O$. The cycloid is the curve traced out by a point $X$ on the wheel.

Derivation of the cycloid motion. The arc $AX$ and the line segment $OA$ have the same length. Since $AX$ has length $R\theta$, the x coordinates of the points $A$, $B$, and $C$ are $R\theta$. The right triangle $CXB$ has hypotenuse $R$, so the lengths of $XB$ and $CB$ are $R\sin\theta$, and $R\cos\theta$, respectively. Therefore the coordinates of the point $X$ are $x = R\theta - R\sin\theta$, and $y = R - R\cos\theta$.
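The cycloid formula can be sanity-checked against the geometry of the derivation: the centre of the wheel is at $(R\theta, R)$, so the point $X$ should always stay at distance $R$ from it. The Python sketch below does this for a few angles; the radius 1.5 and the function name are illustrative assumptions.

```python
import math

def cycloid(R, theta):
    """Coordinates of the point X from the derivation above."""
    return (R * theta - R * math.sin(theta), R - R * math.cos(theta))

R = 1.5   # wheel radius (illustrative)

for theta in (0.0, math.pi / 2, math.pi, 2 * math.pi):
    x, y = cycloid(R, theta)
    # The centre of the wheel is at (R*theta, R); X is a point on the rim.
    distance_to_centre = math.hypot(x - R * theta, y - R)
    print(theta, (x, y), distance_to_centre)
```

At $\theta = 0$ the point is at the origin, at $\theta = \pi$ it is at the top of the wheel (height $2R$), and at $\theta = 2\pi$ it is back on the ground, up to rounding.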

6. The helix

When we walk up a spiral staircase we are tracing out a helix: we are going around in circles, and moving upward at the same time. The parametric curve that does this (and that has the z-axis as its central axis) is given by

$$\vec x(\theta) = \begin{pmatrix} R\cos\theta \\ R\sin\theta \\ a\theta \end{pmatrix}, \quad\text{or:}\quad \vec x(\theta) = R\cos\theta\,\vec e_1 + R\sin\theta\,\vec e_2 + a\theta\,\vec e_3. \tag{21}$$

Here $R > 0$ is the radius of the helix, i.e. the radius of the circle on the ground above which the helix lies; the number $a$ represents the rate at which the helix goes up.


Figure 5. The Helix. The point X traces out a helix: it sits at a height aθ above the point Y, while Y runs around on a circle of radius R; here θ = ∠AOY.

7. The derivative of a vector function

For a function y = f(x) of one variable we had two ways of describing the derivative: on one hand we had a geometric description of f′(x) as “the slope of the tangent to the graph,” and on the other we could describe f′(x) in terms of a difference quotient, i.e.

$$f'(x) = \lim_{\Delta x\to 0} \frac{f(x+\Delta x) - f(x)}{\Delta x}.$$

For vector functions we can imitate both descriptions. We begin with the formal definition in terms of limits and then proceed to the geometric description, in which we interpret the derivative as the “instantaneous velocity vector.”

Definition. If $\vec x(t)$ is a vector function, then we set

$$\vec x\,'(t) \overset{\text{def}}{=} \lim_{\Delta t\to 0} \frac{\vec x(t+\Delta t) - \vec x(t)}{\Delta t}. \tag{22}$$

For (22) to make sense we would have to define what the limit of a vector function is. This can be done, but we will not go into the precise definitions in this course. More important for our use is that if the components of a vector function $\vec x(t)$ are given, then the derivative can be computed by just differentiating those components:

$$\vec x\,'(t) = \begin{pmatrix} x'(t) \\ y'(t) \\ z'(t) \end{pmatrix}, \quad\text{or}\quad \vec x\,'(t) = x'(t)\,\vec e_1 + y'(t)\,\vec e_2 + z'(t)\,\vec e_3. \tag{23}$$

As with ordinary functions of one variable we will use Leibniz’ notation for the derivative whenever it seems convenient. Thus the following are equivalent ways of expressing the same derivative:

$$\vec a\,'(t) = \frac{d\vec a(t)}{dt} = \frac{d}{dt}\vec a(t).$$

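Example. For instance,

$$\vec x(\theta) = \begin{pmatrix} \cos\theta \\ 0 \\ \theta \end{pmatrix} = \cos\theta\,\vec e_1 + \theta\,\vec e_3$$

defines a vector function. Here we have called the independent variable θ instead of t. The derivative of this vector function is

$$\frac{d\vec x}{d\theta} = \frac{d}{d\theta}\begin{pmatrix} \cos\theta \\ 0 \\ \theta \end{pmatrix} = \begin{pmatrix} -\sin\theta \\ 0 \\ 1 \end{pmatrix} = -\sin\theta\,\vec e_1 + \vec e_3.$$

8. The derivative as velocity vector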

Suppose the motion of some point X(t) in space is described by its position vector function $\vec x(t)$. Let us try to define the instantaneous velocity of the point. This velocity should have magnitude (“how fast the point is moving”) and also direction (“which way is the point going?”). The velocity should therefore be a vector. To see which vector, we go back to the notion that “velocity” is always “displacement divided by time.”

Figure 6. The vector function $\vec x(t)$ traces out a curve in space. The vector $\vec x(t)$ is the position vector of a point X(t) on this curve. As we increase time from t to t + ∆t, the point X(t) moves. The displacement of the point X(t) is given by $\Delta\vec x = \vec x(t+\Delta t) - \vec x(t)$. The average velocity vector during this displacement is “displacement/time”, i.e. $\Delta\vec x/\Delta t$. If we let ∆t → 0, then the average velocity becomes the instantaneous velocity at time t: $\vec v = \lim_{\Delta t\to 0} \Delta\vec x/\Delta t = \vec x\,'(t)$. This vector is tangent to the curve traced out by the vector function $\vec x(t)$. We call it a tangent vector.

We consider two instances in time, say, time t and time t + ∆t. Then the position vectors of the point X at these two different times are $\vec x(t)$ and $\vec x(t+\Delta t)$. The displacement of the point X between these two times is then

$$\Delta\vec x = \vec x(t+\Delta t) - \vec x(t)$$

(see Figure 6). We say that the average velocity over the time interval from t to t + ∆t is “the displacement divided by ∆t,” i.e.

$$\vec v_{\text{average}} = \frac{\vec x(t+\Delta t) - \vec x(t)}{\Delta t}.$$

Note that the average velocity is a vector. If we write it out in components, we get a much larger formula:

$$\vec v_{\text{average}} = \begin{pmatrix} \dfrac{x(t+\Delta t) - x(t)}{\Delta t} \\[1ex] \dfrac{y(t+\Delta t) - y(t)}{\Delta t} \\[1ex] \dfrac{z(t+\Delta t) - z(t)}{\Delta t} \end{pmatrix}.$$

One big advantage of using vector notation is that many formulas simplify considerably when written in terms of vectors.

To get the instantaneous velocity, we do the same thing as in one variable calculus: we take the limit as ∆t → 0 of the average velocity over the time interval from t to t + ∆t. Thus we get

$$\vec v(t) = \lim_{\Delta t\to 0} \frac{\vec x(t+\Delta t) - \vec x(t)}{\Delta t} \overset{\text{def}}{=} \frac{d\vec x}{dt}. \tag{24}$$

In terms of components this derivative is

$$\vec x\,'(t) = \frac{d\vec x}{dt} = \begin{pmatrix} x'(t) \\ y'(t) \\ z'(t) \end{pmatrix}.$$

Thus the velocity vector of any given vector function $\vec x(t)$ is the same as the derivative of this vector function.
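To see the limit in (24) at work, one can compare the average velocity over shorter and shorter time intervals with the componentwise derivative. The sketch below is an illustration only: it uses the helix (21) with the assumed values R = 1 and a = 0.5, and the helper names (`position`, `velocity`, `average_velocity`, `max_error`) are made up for this example.

```python
import math

R, a = 1.0, 0.5  # illustrative helix constants (assumed values)

def position(t):
    return (R * math.cos(t), R * math.sin(t), a * t)

def velocity(t):
    """Derivative of position(t), obtained by differentiating each component."""
    return (-R * math.sin(t), R * math.cos(t), a)

def average_velocity(t, dt):
    """Displacement divided by elapsed time."""
    p, q = position(t), position(t + dt)
    return tuple((qi - pi) / dt for pi, qi in zip(p, q))

def max_error(t, dt):
    """Largest componentwise gap between average and instantaneous velocity."""
    return max(abs(u - v) for u, v in zip(average_velocity(t, dt), velocity(t)))

for dt in (1e-1, 1e-3, 1e-5):
    print(dt, max_error(1.0, dt))
```

The printed gap shrinks roughly in proportion to ∆t, which is what one expects from a one-sided difference quotient.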

9. Acceleration

Having found the velocity vector of a point X(t) whose position vector is a given vector function $\overrightarrow{OX}(t) = \vec x(t)$, we can also define the acceleration vector of the moving point. By definition, the acceleration vector is the derivative of the velocity vector, i.e.

$$\vec a(t) = \frac{d\vec v}{dt} = \frac{d^2\vec x}{dt^2} = \begin{pmatrix} x''(t) \\ y''(t) \\ z''(t) \end{pmatrix}. \tag{25}$$

This definition is entirely analogous to the definition of acceleration (“$a = \frac{dv}{dt}$”) from first semester calculus. The only difference is that, here, the position, velocity, and acceleration all have directions in addition to magnitudes: they are vectors.

Newton’s famous law relating forces and acceleration continues to hold. If a point X(t) moves according to some vector function $\vec x(t)$, then some force must be acting on this point. This force is a vector (it has magnitude and direction), and, according to Newton, it is given by

$$\vec F = m\vec a = m\frac{d\vec v}{dt} = m\frac{d^2\vec x}{dt^2}, \tag{26}$$

where m is the mass of the object at the point X(t) whose motion we are considering. It is always assumed to be a positive number.

Note that according to this law, the absence of forces, i.e. $\vec F = \vec 0$, is the same as $\frac{d\vec v}{dt} = \vec 0$, i.e. no force acts on the point if and only if its velocity vector is constant. Here “constant” means constant magnitude and constant direction.

10. The differentiation rules

Just as with ordinary derivatives, the derivatives of vector functions satisfy certain rules, such as the product rule. The purpose of these rules is not the same as in one variable calculus. There we used sum, product, quotient and chain rules to compute derivatives of given functions without having to fall back on the definition of a derivative all the time. For vector functions we do not need such rules, because we can differentiate them by simply differentiating each of their components (see the above example). Instead, the differentiation rules for vector functions are mostly used to gain insight and establish general facts about vector functions, a number of which we will see shortly.

10.1. The sum rule. The analog of the sum rule (“derivative of the sum is the sum of the derivatives”) looks exactly like the ordinary sum rule. It says that for any two vector functions $\vec a(t)$ and $\vec b(t)$ one has

$$\frac{d}{dt}\bigl(\vec a(t) \pm \vec b(t)\bigr) = \frac{d\vec a(t)}{dt} \pm \frac{d\vec b(t)}{dt}.$$

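10.2. The many product rules. There is no quotient rule for vector functions, simply because we have no way of dividing vectors. On the other hand we have two ways of multiplying vectors, and we can also multiply vectors and numbers, so there are three different product rules. Fortunately they all look like the product rule from first semester calculus.

If $\vec a(t)$ and $\vec b(t)$ are vector functions, and if f(t) is a function, then

$$\begin{aligned}
\frac{d\,\vec a(t)\cdot\vec b(t)}{dt} &= \frac{d\vec a(t)}{dt}\cdot\vec b(t) + \vec a(t)\cdot\frac{d\vec b(t)}{dt}, \\
\frac{d\,\vec a(t)\times\vec b(t)}{dt} &= \frac{d\vec a(t)}{dt}\times\vec b(t) + \vec a(t)\times\frac{d\vec b(t)}{dt}, \\
\frac{d\,f(t)\vec a(t)}{dt} &= \frac{df(t)}{dt}\vec a(t) + f(t)\frac{d\vec a(t)}{dt}.
\end{aligned}$$

In spite of the fact that these rules “look right,” they could still be wrong, so to be sure we would have to prove them. The proofs are very straightforward. Here is a short proof for the product rule involving the dot product, written out for vectors with two components (a third component only adds one more term of the same kind). To shorten the formulas we omit the “(t)” from all functions:

$$\begin{aligned}
\frac{d\,\vec a\cdot\vec b}{dt} &= \frac{d}{dt}\bigl(a_1b_1 + a_2b_2\bigr) \\
&= \frac{d\,a_1b_1}{dt} + \frac{d\,a_2b_2}{dt} \\
&= \frac{da_1}{dt}b_1 + a_1\frac{db_1}{dt} + \frac{da_2}{dt}b_2 + a_2\frac{db_2}{dt} && \text{ordinary product rule} \\
&= \frac{da_1}{dt}b_1 + \frac{da_2}{dt}b_2 + a_1\frac{db_1}{dt} + a_2\frac{db_2}{dt} && \text{switch terms around} \\
&= \frac{d\vec a}{dt}\cdot\vec b + \vec a\cdot\frac{d\vec b}{dt}. && \text{recognize the dot-products}
\end{aligned}$$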
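The product rules for the dot product and the cross product can also be checked numerically. The following sketch is not a proof and not part of the original notes: the two sample vector functions `a` and `b` (with hand-computed derivatives `da`, `db`) are arbitrary choices, and the derivative of the product is approximated by a symmetric difference quotient.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def add(u, v):
    return tuple(ui + vi for ui, vi in zip(u, v))

# Two sample vector functions and their componentwise derivatives.
def a(t):  return (math.cos(t), t * t, 1.0)
def da(t): return (-math.sin(t), 2 * t, 0.0)
def b(t):  return (t, math.exp(t), math.sin(t))
def db(t): return (1.0, math.exp(t), math.cos(t))

def central_diff(f, t, h=1e-5):
    """Symmetric difference quotient; works for numbers and for tuples."""
    p, q = f(t + h), f(t - h)
    if isinstance(p, tuple):
        return tuple((pi - qi) / (2 * h) for pi, qi in zip(p, q))
    return (p - q) / (2 * h)

t = 0.7
lhs_dot = central_diff(lambda s: dot(a(s), b(s)), t)
rhs_dot = dot(da(t), b(t)) + dot(a(t), db(t))
lhs_cross = central_diff(lambda s: cross(a(s), b(s)), t)
rhs_cross = add(cross(da(t), b(t)), cross(a(t), db(t)))

print(lhs_dot, rhs_dot)
print(lhs_cross, rhs_cross)
```

Note that in the cross product rule the order of the factors matters, so the sketch keeps $\vec a$ on the left and $\vec b$ on the right in both terms.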

11. Vector functions of constant length

As an immediate application of the product rule for the dot-product we prove the following fact about vector functions whose length does not change, i.e. vector functions $\vec a(t)$ that change their direction, but not their length.

If a vector function $\vec a(t)$ has constant length, then, when the parameter t undergoes a small change ∆t, the corresponding small change $\Delta\vec a$ in the vector function will be almost perpendicular to $\vec a(t)$ itself.

Theorem. Let $\vec a(t)$ be a vector function. Then a necessary and sufficient condition for the length $\|\vec a(t)\|$ to be constant is that $\vec a(t)$ and $\vec a\,'(t)$ be perpendicular for all t.

Proof. Differentiating both sides of the equation

$$\|\vec a(t)\|^2 = \vec a(t)\cdot\vec a(t)$$

we get

$$\frac{d}{dt}\|\vec a(t)\|^2 = \vec a\,'(t)\cdot\vec a(t) + \vec a(t)\cdot\vec a\,'(t) = 2\,\vec a(t)\cdot\vec a\,'(t). \tag{27}$$

If $\vec a(t)$ has constant length, then $\|\vec a(t)\|^2$ is also constant, and thus $\frac{d}{dt}\|\vec a(t)\|^2 = 0$. Therefore, for a vector function $\vec a(t)$ whose length is constant, $\vec a(t)\cdot\vec a\,'(t) = 0$, i.e. $\vec a(t) \perp \vec a\,'(t)$.

Conversely, if $\vec a(t)$ is a vector function for which $\vec a(t) \perp \vec a\,'(t)$ holds for all t, then $\vec a(t)\cdot\vec a\,'(t) = 0$, and (27) implies that $\frac{d}{dt}\|\vec a(t)\|^2 = 0$, i.e. that $\|\vec a(t)\|^2$ and hence $\|\vec a(t)\|$ are constant. □

12. Two examples

12.1. Motion on a straight line. We return to the motion given by (18), i.e.

$$\vec x(t) = \vec a + t\vec v. \tag{28}$$

The velocity and acceleration are easy to compute:

$$\frac{d\vec x(t)}{dt} = \vec v, \qquad \frac{d^2\vec x(t)}{dt^2} = \frac{d\vec v}{dt} = \vec 0,$$

since $\vec v$ is a constant vector in this case.

We see that if a point X(t) moves according to the parametrization (18), then its velocity is constant, and its acceleration is zero. According to Newton’s law, no force is exerted on an object undergoing this motion.

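12.2. Circular motion. For the point X(t) moving on a circle of radius R with angular velocity ω we have (19), i.e.

$$\vec x(t) = R\cos\omega t\,\vec e_1 + R\sin\omega t\,\vec e_2$$

so that the velocity and acceleration are easy to compute:

$$\begin{aligned}
\vec v(t) &= \vec x\,'(t) = -\omega R\sin\omega t\,\vec e_1 + \omega R\cos\omega t\,\vec e_2, \\
\vec a(t) &= \vec v\,'(t) = -\omega^2 R\cos\omega t\,\vec e_1 - \omega^2 R\sin\omega t\,\vec e_2.
\end{aligned}$$

Note that the velocity vector $\vec v(t)$ is perpendicular to the position vector $\vec x(t)$, as predicted in § 11. Our expression for the velocity vector $\vec v(t)$ contains the familiar relation between angular velocity and velocity: the velocity $v = \|\vec v(t)\|$ with which the point X(t) is moving is

$$\begin{aligned}
v(t) &= \|-\omega R\sin\omega t\,\vec e_1 + \omega R\cos\omega t\,\vec e_2\| \\
&= \sqrt{\omega^2R^2\sin^2\omega t + \omega^2R^2\cos^2\omega t} \\
&= \omega R \qquad (\text{for } \omega > 0;\ \text{in general } |\omega| R).
\end{aligned} \tag{29}$$

Hence the angular velocity of an object undergoing circular motion (with ω > 0) is

$$\omega = \frac{v}{R}. \tag{30}$$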

Figure 7. If an object moves along a circle with constant angular velocity, then the force $\vec F$ required to make the object follow that motion is $\vec F = -m\omega^2\vec x$. In particular it is parallel to the position vector $\vec x$ but in the opposite direction.

We also note that the acceleration is a multiple of the position vector:

$$\vec a(t) = -\omega^2\vec x(t).$$

According to Newton the force acting on the object at X(t) is $\vec F = m\vec a = -m\omega^2\vec x$, and its magnitude is

$$F = \|\vec F\| = \|m\omega^2\vec x(t)\| = m\omega^2R, \tag{31}$$

because $\|\vec x(t)\| = R$ at all times.

Using (30) we can replace the angular velocity ω by the actual velocity, which leads to the classical formula for the centripetal force (the force directed toward the center that keeps the object on the circle)

$$F = \frac{mv^2}{R}. \tag{32}$$
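The relations v = ωR and F = mv²/R are easy to confirm with numbers. The sketch below is only an illustration: the values of R, ω, m and t are arbitrary assumptions, and the helper names are not from the notes.

```python
import math

def circular_motion(t, R, omega):
    """Position, velocity and acceleration for motion on a circle of radius R."""
    c, s = math.cos(omega * t), math.sin(omega * t)
    x = (R * c, R * s)
    v = (-omega * R * s, omega * R * c)
    a = (-omega**2 * R * c, -omega**2 * R * s)
    return x, v, a

def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))

R, omega, m, t = 2.0, 3.0, 0.5, 0.4   # illustrative values
x, v, a = circular_motion(t, R, omega)

speed = norm(v)                        # should equal omega * R
force = m * norm(a)                    # magnitude of F = m a
x_dot_v = x[0] * v[0] + x[1] * v[1]    # should vanish: v is perpendicular to x

print(speed, omega * R)
print(force, m * speed**2 / R)
print(x_dot_v)
```

Both printed pairs agree up to rounding, and the dot product of position and velocity is zero up to rounding, in line with § 11.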

13. Arc length

For any given vector function there is a simple formula for the length of the curve it traces out. The formula is essentially the same as the formula for the length of a parametric curve (or, to a lesser extent, of the graph of a function) that was described in Math 221. Here we repeat the intuitive derivation of the formula, written in terms of vectors this time.

Let $\vec x(t)$ (a ≤ t ≤ b) be a vector function. To determine the length of the arc traced out by X(t) as t varies from t = a to b, we divide the interval a ≤ t ≤ b into many very short subintervals. The corresponding points X(t) on the curve split the curve into many short segments, each of which will be “close to a line segment.” We approximate the length of the curve by adding the lengths of all these short segments. Finally we take the limit in which the number of partition points becomes infinite and our sum of lengths of short segments becomes an integral. To see which integral we get, we need to find an expression for the length of a short segment between two adjacent partition points on the curve.

Suppose we have two points on the curve, with parameter values t and t + ∆t, respectively. The points are X(t) and X(t + ∆t), and the distance between them is the length of the vector $\Delta\vec x$ from one point to the next. This vector is

$$\Delta\vec x = \vec x(t+\Delta t) - \vec x(t) = \frac{\vec x(t+\Delta t) - \vec x(t)}{\Delta t}\,\Delta t \approx \vec x\,'(t)\,\Delta t,$$

so that its length is $\approx \|\vec x\,'(t)\|\,\Delta t$. Adding the lengths of the short segments together, we find that the length is approximately $\sum \|\vec x\,'(t)\|\,\Delta t$ (where the summation is over all short pieces of the curve). Taking the limit we arrive at this formula for the length of the curve traced out by $\vec x(t)$, a ≤ t ≤ b:

$$\text{Length} = \int_{t=a}^{b} \|\vec x\,'(t)\|\,dt. \tag{33}$$

This integral looks simple, but that appearance turns out to be deceptive as we find out when we write it in terms of the components of the vector function $\vec x(t)$. Suppose $\vec x(t) = x(t)\,\vec e_1 + y(t)\,\vec e_2 + z(t)\,\vec e_3$. Then

$$\vec x\,'(t) = x'(t)\,\vec e_1 + y'(t)\,\vec e_2 + z'(t)\,\vec e_3,$$

so that

$$\|\vec x\,'(t)\| = \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}.$$

Therefore the length formula (33) of the curve is equivalent to

$$\text{Length} = \int_{t=a}^{b} \sqrt{x'(t)^2 + y'(t)^2 + z'(t)^2}\,dt. \tag{34}$$

The square root makes this formula a reliable source of very difficult integrals. In fact the list of curves whose length one can actually compute by doing the integral is rather short (see Problem …).
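When the integral cannot be done by hand, the derivation itself suggests a numerical method: add up the lengths of many short chords. The sketch below (an illustration, with made-up function names) applies this to one arch of the cycloid (20) with R = 1. For this particular curve the integral can be evaluated: $\|\vec x\,'(\theta)\| = 2R\,|\sin(\theta/2)|$, so the length of one arch is 8R.

```python
import math

def cycloid(theta, R=1.0):
    return (R * theta - R * math.sin(theta), R - R * math.cos(theta))

def polygonal_length(x, a, b, n):
    """Sum of the lengths of n chords between consecutive partition points."""
    total = 0.0
    prev = x(a)
    for k in range(1, n + 1):
        cur = x(a + (b - a) * k / n)
        total += math.dist(prev, cur)
        prev = cur
    return total

for n in (10, 100, 1000):
    print(n, polygonal_length(cycloid, 0.0, 2 * math.pi, n))
```

The sums increase toward 8 as the partition is refined, since each chord is never longer than the piece of curve it replaces.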

14. Arc length derivative

Let $\vec x(t)$ be some vector function that describes the motion through space of some point X(t), and let f(t) be some other function. In what follows it will help to think of the parameter t as “time.” Typical examples of functions f that we might want to consider are $f(t) = \|\vec x(t)\|$ (the distance to the origin of the point X(t)) or $f(t) = \|\vec x\,'(t)\|$ (the speed at which the point is moving).

To describe the rate with which f(t) is changing we could compute its derivative,

$$\frac{df}{dt},$$

which tells us what the ratio between the change ∆f of f, and the change ∆t in the parameter t is (at least approximately, if ∆t is small). If we interpret t as “time” then this derivative tells us how fast f(t) changes per second. But sometimes it is more useful to know how much f changes after we have travelled a small distance along the curve, rather than after a short amount of time has passed. In other words, for two nearby points X(t) and X(t + ∆t) on the curve we would like to know the ratio

$$\frac{\text{change in } f}{\text{distance travelled}} = \frac{f(t+\Delta t) - f(t)}{\text{distance from } X(t) \text{ to } X(t+\Delta t)}. \tag{35}$$

We can work this out by observing that the distance from X(t) to X(t + ∆t) is the length of the vector from X(t) to X(t + ∆t), i.e.

$$\text{distance from } X(t) \text{ to } X(t+\Delta t) = \|\vec x(t+\Delta t) - \vec x(t)\|.$$

Assuming ∆t is small and positive, we have

$$\|\vec x(t+\Delta t) - \vec x(t)\| = \left\|\frac{\vec x(t+\Delta t) - \vec x(t)}{\Delta t}\right\|\,\Delta t \approx \|\vec x\,'(t)\|\,\Delta t.$$

We substitute this in (35), and get

$$\frac{\text{change in } f}{\text{distance travelled}} \approx \frac{f(t+\Delta t) - f(t)}{\|\vec x\,'(t)\|\,\Delta t}.$$

Now let ∆t → 0: the quantity on the left becomes what is called the arc length derivative of the function f along the curve $\vec x(t)$, and which is commonly denoted by $\frac{df}{ds}$. In the quantity on the right we recognize the derivative of f with respect to t (time), which leads to

$$\frac{df}{ds} = \frac{1}{\|\vec x\,'(t)\|}\frac{df}{dt}. \tag{36}$$

Here $\frac{df}{dt} = f'(t)$ is the usual derivative of f with respect to t.

If we want to emphasize the distinction between these two derivatives, then we can call $\frac{df}{dt}$ the “time derivative of f.”
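As a concrete instance of (36), take the helix (21) and let f be the height of the point above the ground, f(t) = at. Then f′(t) = a and $\|\vec x\,'(t)\| = \sqrt{R^2 + a^2}$, so df/ds = a/√(R² + a²): the height gained per unit of distance travelled along the staircase. The sketch below compares this with the ratio (35) for small ∆t; the constants and function names are assumptions made for this illustration.

```python
import math

R, a = 1.0, 0.5   # illustrative helix constants (assumed values)

def helix(t):
    return (R * math.cos(t), R * math.sin(t), a * t)

def height(t):
    return a * t

def ratio(t, dt):
    """(change in f) / (distance travelled) between X(t) and X(t + dt)."""
    return (height(t + dt) - height(t)) / math.dist(helix(t), helix(t + dt))

# Arc length derivative: f'(t) divided by the speed ||x'(t)||.
df_ds = a / math.sqrt(R**2 + a**2)

for dt in (1e-1, 1e-2, 1e-3):
    print(dt, ratio(1.0, dt), df_ds)
```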


15. Unit Tangent and Curvature

15.1. Unit tangent. We have seen that we can find a tangent vector to the curve traced out by some vector function $\vec x(t)$, simply by differentiating the vector function: $\vec x\,'(t)$ always provides a tangent vector (if $\vec x\,'(t) \neq \vec 0$). In fact any multiple $\lambda\vec x\,'(t)$ of this vector will also be a tangent vector (provided λ ≠ 0). We can single out one special tangent vector, by choosing λ > 0 so that $\lambda\vec x\,'(t)$ has length 1. (A vector with length 1 is called a unit vector.) Since for λ > 0 we have $\|\lambda\vec x\,'(t)\| = \lambda\|\vec x\,'(t)\|$, the value of λ that will make $\lambda\vec x\,'(t)$ a unit vector is $\lambda = 1/\|\vec x\,'(t)\|$.

For this reason the vector

$$\vec T(t) = \frac{d\vec x}{ds} = \frac{\vec x\,'(t)}{\|\vec x\,'(t)\|} \tag{37}$$

is called the unit tangent vector to the curve corresponding to the vector function $\vec x(t)$.

15.2. Example. For our constant velocity parametrization (18) of a straight line from § 3 we have

$$\vec x(t) = \vec a + t\vec v,$$

so that $\vec x\,'(t) = \vec v$ and hence

$$\vec T = \frac{\vec v}{\|\vec v\|}.$$

We see that the unit tangent vector is constant.

15.3. Curvature and normal. If the curve described by a vector function $\vec x(t)$ is not a straight line, then the tangent to the curve will turn as one moves along the curve. The curvature vector $\vec\kappa$ measures how much the curve is curved. It is defined to be the rate of change of the unit tangent, but with respect to arc length instead of with respect to the given parameter t. Thus

$$\vec\kappa \overset{\text{def}}{=} \frac{d\vec T}{ds}. \tag{38}$$

According to our definition of “derivative with respect to arc length” the right hand side stands for

$$\frac{d\vec T}{ds} = \frac{1}{\|\vec x\,'(t)\|}\frac{d\vec T}{dt}. \tag{39}$$

To write this completely in terms of the original vector function $\vec x(t)$ we use (37):

$$\vec\kappa = \frac{1}{\|\vec x\,'(t)\|}\frac{d}{dt}\left\{\frac{1}{\|\vec x\,'(t)\|}\frac{d\vec x}{dt}\right\}. \tag{40}$$

This formula is not as short as the original definition (38), but it does show that the curvature vector comes about by differentiating the vector function $\vec x(t)$ twice (and dividing by $\|\vec x\,'(t)\|$ at the right moments).

Theorem. The curvature vector $\vec\kappa$ is perpendicular to the tangent, i.e. $\vec\kappa \perp \vec T$.

Proof. We have to show that $\vec\kappa\cdot\vec T = 0$. From the second form (39) of the definition of $\vec\kappa$ we see

$$\vec\kappa\cdot\vec T = \left(\frac{1}{\|\vec x\,'(t)\|}\frac{d\vec T}{dt}\right)\cdot\vec T = \frac{1}{\|\vec x\,'(t)\|}\,\frac{d\vec T}{dt}\cdot\vec T.$$

Remember that $\vec T(t)$ is always a unit vector, i.e. $\vec T(t)$ has constant length: by § 11 this implies that $\frac{d\vec T}{dt} \perp \vec T(t)$, so the right hand side vanishes and we are done. □

There are two concepts that are derived from the curvature vector: the curvature κ is by definition the length of the curvature vector $\vec\kappa$,

$$\kappa = \|\vec\kappa\| = \left\|\frac{d\vec T}{ds}\right\|, \tag{41}$$

and the normal vector to the curve is

$$\vec N = \frac{\vec\kappa}{\|\vec\kappa\|} = \frac{d\vec T/ds}{\left\|d\vec T/ds\right\|}. \tag{42}$$

The normal vector is undefined when $\vec\kappa = \vec 0$, because it would require division by zero.

Since $\vec\kappa$ is perpendicular to $\vec T$, the normal vector $\vec N$ is also perpendicular to $\vec T$ (hence its name). Combining (38), (41) and (42) we can write

$$\frac{d\vec T}{ds} = \kappa\vec N. \tag{43}$$
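The definitions (37) to (41) can be followed step by step on a computer, replacing each derivative by a symmetric difference quotient. The sketch below is an illustration under stated assumptions: the helper names are invented, the step size h is an arbitrary choice, and the test curve is the circular motion of § 12.2 with the assumed values R = 2 and ω = 3. For a circle of radius R one expects the curvature 1/R, whatever the angular velocity.

```python
import math

def diff(f, t, h=1e-4):
    """Symmetric difference quotient of a tuple-valued function."""
    p, q = f(t + h), f(t - h)
    return tuple((pi - qi) / (2 * h) for pi, qi in zip(p, q))

def norm(u):
    return math.sqrt(sum(ui * ui for ui in u))

def unit_tangent(x, t):
    """T = x' / ||x'||, with x' approximated numerically."""
    v = diff(x, t)
    n = norm(v)
    return tuple(vi / n for vi in v)

def curvature_vector(x, t):
    """kappa = (1 / ||x'||) dT/dt, with both derivatives approximated numerically."""
    dT = diff(lambda s: unit_tangent(x, s), t)
    speed = norm(diff(x, t))
    return tuple(c / speed for c in dT)

R, omega = 2.0, 3.0   # illustrative values

def circle(t):
    return (R * math.cos(omega * t), R * math.sin(omega * t))

t = 0.4
T = unit_tangent(circle, t)
kappa_vec = curvature_vector(circle, t)
kappa = norm(kappa_vec)
kappa_dot_T = sum(k * c for k, c in zip(kappa_vec, T))

print(kappa, 1 / R)
print(kappa_dot_T)
```

The second printed number is close to zero, as the theorem predicts. Because the derivatives are only approximated, the agreement is up to a small numerical error rather than exact.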

16. Osculating plane

At any point X(t) on a space curve given by $\vec x(t)$ one defines the osculating plane to be the plane that contains the point X(t) and that is parallel to both the tangent $\vec T(t)$ and normal $\vec N(t)$ of the curve.

If we want to write a defining equation for the osculating plane as in § 11.2 then we need a vector perpendicular to the osculating plane. Since this plane is defined to be parallel to both $\vec T$ and $\vec N$, we can find a normal vector to the osculating plane by taking the cross product of $\vec T$ and $\vec N$. This vector is called the binormal to the curve. In a formula, it is defined to be

$$\vec B = \vec T\times\vec N. \tag{44}$$

17. Problems

1. What sign does ω have in Figure 7? How would the figure change if we change the sign of ω? Does the force $\vec F$ on the object change if we change the sign of ω?

2. While reading the definition of the curvature vector and especially after seeing the not-so-nice formula (40) for the curvature vector it is natural to think “isn’t there a simpler way to define curvature?” Here is one attempt. The questions invite you to judge this alternative definition of “curvature” on its merits.

For any parametric curve a tangent vector is given by $\vec x\,'(t)$. To see if the curve deviates from being a straight line we simply check if $\vec x\,'(t)$ changes, and we can do this by computing the derivative of $\vec x\,'(t)$. So let us define

$$\vec K(t) = \vec x\,''(t),$$

and let us see if this measures how much the curve is curved. (As mathematicians we can define whatever we want, and this definition is a lot simpler than (40).)

(a) True or False: if $\vec K(t) = \vec 0$ for all t then the curve is a straight line …?

(b) Compute $\vec K(t)$ for $\vec x(t) = \begin{pmatrix} t^2 \\ t^2 \end{pmatrix}$, and draw the curve traced out by this particular vector function.

(c) True or False: if $\vec x(t)$ traces out a straight line then $\vec K = \vec 0$ …?

(d) Conclusion: To what extent does the statement “$\vec K = \vec 0$” have anything to do with the statement “$\vec x(t)$ traces out a straight line”?

3. Suppose a point P is rotating around a line ℓ, keeping its distance to the line fixed at r, and moving in a plane perpendicular to the line. Suppose the point has angular velocity ω: this means that during a time interval of length t the angle swept out by the line segment connecting P to ℓ is exactly ωt.

In a previous math or physics class it was shown that the velocity of the point P is ωr, where r is the distance from P to the line ℓ.

The angular velocity vector is defined to be the vector $\vec\omega$ whose length is ω, and that is parallel to the line ℓ. There are two such vectors ($\pm\vec\omega$). By definition $\vec\omega$ points in the direction in which a screw would move if it were turning in the same direction as the point P.

(a) Assuming the line ℓ passes through the origin show from the drawing that the velocity vector of the point P is $\vec v = \vec\omega\times\vec x$. (There are two things to worry about: does $\vec\omega\times\vec x$ have the same direction as $\vec v$? and does $\vec\omega\times\vec x$ have the same length as $\vec v$?)

(b) Show that the acceleration vector is given by $\vec a = \vec\omega\times(\vec\omega\times\vec x)$. (Hint: don’t use the drawing, but combine the definitions of $\vec v$ and $\vec a$ in (24) and (25) and also the product rule; finally, keep in mind that you have just found that $\vec v = \vec\omega\times\vec x$.)

(c) If someone told you they had computed the acceleration vector and found

$$\vec a = (\vec\omega\times\vec\omega)\times\vec x,$$

could they be right? Explain! What if they told you they got $\vec a = \vec\omega\times\vec\omega\times\vec x$?

(d) True or False (explain your answers): (a) $\vec v \perp \vec x$? (b) $\vec a \perp \vec v$? (c) $\vec a$ and $\vec x$ are parallel?

(e) Include the acceleration vector $\vec a$ in the above drawing.

[Drawing for Problem 3: the point P at distance r from the line ℓ through the origin, with position vector $\vec x$, angular velocity vector $\vec\omega$ along ℓ, velocity $\vec v = \vec\omega\times\vec x$, and arc ∆s = r∆θ = rω∆t.]

4. Consider the “twisted cubic,” i.e. the curve given by $\vec x(t) = t\,\vec e_1 + t^2\,\vec e_2 + t^3\,\vec e_3$.

(a) For any given t find the tangent line to the curve at the point X(t), and find where this tangent line intersects the xy-plane.

(b) If you call that intersection point P(t), then which curve is traced out by the point P(t) as t varies?

5. Compute the length of one full turn of the helix by taking the parametrization given in (21) and computing the length of the segment with 0 ≤ θ ≤ 2π.

After computing the length, consider this: let P be the perimeter of the circle underneath the helix, and let H be the height achieved by one full turn of the helix. Show that the length L of the helix satisfies $L^2 = P^2 + H^2$.

6. There is a multistory parking ramp where the way out is a path in the shape of a helix that is wound around the outside of the building. As a car drives down this path at night its headlights shine a spot on the ground. Which curve is traced out by this light spot as the car drives all the way down?

Make a good drawing. Assume for simplicity that the center of the parking ramp is the z-axis.

7. Compute the tangent, curvature, normal and binormal for the following curves.

(a) The parabola: $\vec x(t) = \begin{pmatrix} t^2 \\ t \end{pmatrix}$. At which point on the curve is the curvature the largest?

(b) Neile’s parabola: $\vec x(t) = \begin{pmatrix} t^2 \\ t^3 \end{pmatrix}$. At which point on the curve is the curvature the largest?

(c) The helix: $\vec x(\theta) = \begin{pmatrix} R\cos\theta \\ R\sin\theta \\ a\theta \end{pmatrix}$ (see § 6 for an explanation of the constants R and a). At which point on the curve is the curvature the largest?

(d) The graph of $y = e^x$ by using the parametrization $\vec x(t) = \begin{pmatrix} t \\ e^t \end{pmatrix}$. Where on the graph is the curvature the largest?

CHAPTER 3

Functions of more than one variable

1. Functions of two variables and their graphs

1.1. Definition. A function of two variables has two ingredients: a domain and a rule. The domain of the function is a collection of points in the xy-plane. For each point (x, y) from the domain of the function, the rule should tell us how to find the function value f(x, y).

Just as with functions of one variable, the “rule” that gives us the function value is often specified by some formula, e.g. f(x, y) = x + y. The domain of a function is the set of points at which we define the function. This can in principle be any set of points in the plane. Typically the domain will be a rectangle, or a disc, or it could be the entire xy-plane, possibly with some points and lines removed.


Figure 1. The graph of some function, and its domain (a rectangle in this example).

1.2. Graphs. By definition, the graph of a function z = f(x, y) is the collection of all points (x, y, z) in three dimensional space that satisfy the equation z = f(x, y).

The graph is usually a surface that floats above (or below) the domain of the function (see Figure 2).

1.3. Level sets. The graph of a function of two variables is a surface sitting in three dimensional space, which can be difficult to draw or visualize. Instead of looking at the graph we can also consider its level sets. If c is any real number, then, by definition, the level set at level c of the function is the set of all points (x, y) in the plane that satisfy f(x, y) = c.


Figure 2. The graph of some function (top), and a construction of one of its level sets (bottom). Note that by definition the level set (“at level c”) is the curve in the xy-plane under the graph: it is obtained by intersecting the graph of the function with a horizontal plane at height c, and then projecting this curve of intersection onto the xy-plane.

Since the level set is the set of all solutions to the equation f(x, y) = c, one often uses the notation $f^{-1}(c)$ (“f-inverse of c”) for the level set. We can summarize the definition in an equation:

$$f^{-1}(c) = \{(x, y) : f(x, y) = c\}.$$

Note that the definition says that $f^{-1}(c)$ is not a number, but a set of points!

Level sets tend to be curves in the xy-plane, although in general level sets can have any shape (see Problem 5.13 for an example). They are usually easier to draw than the graphs of the corresponding functions.
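To make the definition of a level set concrete, here is a small sketch for the function f(x, y) = x² + y², whose level set at level c = 1 is the unit circle and whose level set at level c = 0 is the single point (0, 0). The helper `on_level_set` and its tolerance are assumptions made for this illustration: in floating point arithmetic “f(x, y) = c” can only be tested up to rounding.

```python
import math

def f(x, y):
    return x * x + y * y

def on_level_set(f, point, c, tol=1e-9):
    """True if f(point) equals c up to a small tolerance."""
    return abs(f(*point) - c) <= tol

# Eight points on the unit circle; all of them should belong to the level set at level 1.
circle_points = [(math.cos(k * math.pi / 4), math.sin(k * math.pi / 4)) for k in range(8)]

print(all(on_level_set(f, p, 1.0) for p in circle_points))
print(on_level_set(f, (0.0, 0.0), 1.0))
print(on_level_set(f, (0.0, 0.0), 0.0))
```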

1.4. An example from the “real” world. Here is a function of local interest. The domain of the function is the water surface of Lake Mendota (let’s pretend this is a plane domain), and the function, which we will call d instead of f, is given by d(x, y) = the depth of the lake at location (x, y). There is no formula for this function, but the Wisconsin Department of Natural Resources has measured the depth and presented the results in terms of the level sets of the function d.

Figure 3. The level curves of a function z = d(x, y). The domain of this function is the lake surface, and d(x, y) is the depth in meters of Lake Mendota at (x, y). To see the graph of the function we could try to drain the lake.

See http://limnology.wisc.edu/lake_information/mendota/mendota.html

1.5. A comment about language and set-theoretic notation. We will often say “consider a function z = f(x, y)…”, but there is a sense in which this is incorrect. It is convenient to say “consider a function z = f(x, y)…” since it not only names the function, but it also gives the independent variables x, y, and the dependent variable z a name. Nevertheless, the symbol in the equation z = f(x, y) that actually represents the function is “f”. The correct way of introducing the function¹ would be to say “consider a function f.”

In fact, in the notation that is used in modern mathematics one would write “Consider the function f : D → R…” Here f is the name of the function we are introducing, D is the domain of that function (so D is a set of points in the plane), and R stands for the set of real numbers, indicating that computing f always results in a real number.

¹Saying “consider the function z = f(x, y)…” to introduce the function f is like saying “Please meet my brother Joe, Bill, and Sue” when you want to introduce your brother Joe, who happens to be standing next to Bill and Sue. To introduce your brother, you would of course say “Please meet my brother Joe.” and to introduce the function you should really say “Consider the function f.”

1.6. Vector notation. If $\vec x$ is the position vector of the point (x, y) in the plane, i.e. if $\vec x = \begin{pmatrix} x \\ y \end{pmatrix}$, then one sometimes writes

$$f(x, y) = f(\vec x).$$

Physicists have a preference for $\vec r$ instead of $\vec x$ (because they call the position vector the “radius vector”), and will write $f(x, y) = f(\vec r)$.

2. Linear functions

The simplest functions of one variable are those of the form f(x) = ax + b. Their graphs are lines, and we called them linear functions.

A linear function of two variables is a function f of the form

(45) z = f(x, y) = ax+ by + c,

where a, b, c are constants.


Figure 4. The graph of a linear function z = ax+ by + c.

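The graph of a linear function is always a plane. Indeed, the graph consists of all points (x, y, z) that satisfy the equation

$$-ax - by + z = c,$$

which we can write as

$$\vec n\cdot\vec x = \vec n\cdot\vec p,$$

where

$$\vec n = \begin{pmatrix} -a \\ -b \\ 1 \end{pmatrix}, \quad\text{and}\quad \vec p = \begin{pmatrix} 0 \\ 0 \\ c \end{pmatrix}.$$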
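The claim that every point of the graph satisfies the plane equation can be spot-checked. In the sketch below the coefficients a, b, c and the sample points are arbitrary assumptions; for each point (x, y, f(x, y)) on the graph, the dot product with the normal vector (−a, −b, 1) is compared with the dot product of the normal vector and the point (0, 0, c).

```python
def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

a, b, c = 2.0, -1.0, 3.0          # illustrative coefficients

def f(x, y):
    return a * x + b * y + c

n = (-a, -b, 1.0)                  # normal vector to the graph
p = (0.0, 0.0, c)                  # the point of the graph above the origin

samples = [(0.0, 0.0), (1.0, 2.0), (-3.5, 0.25)]
graph_points = [(x, y, f(x, y)) for x, y in samples]

for point in graph_points:
    print(point, dot(n, point), dot(n, p))
```

In each row the last two numbers coincide: both are equal to c.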

3. Quadratic forms

After learning about linear functions in pre-calculus one usually goes on to quadratic functions. We will do the same for functions of two variables and study Quadratic Forms. Just as in the one variable case where quadratic functions can have a maximum or minimum, quadratic forms provide examples of functions of two variables that can have a maximum or a minimum, or, it turns out, a third kind of “min-max” or “saddle shape.” They provide the basic profile of what we will run into when we look for local minima and maxima of functions of two variables. In particular, the technique of classifying quadratic forms by completing the square, which we will see in this section, is the key to the second derivative test for functions of more than one variable.

3.1. Definition. The general quadratic form in two variables is

$$f(x, y) = Ax^2 + Bxy + Cy^2, \tag{46}$$

where A, B, and C are constants. Depending on the values of these constants the graphs of the functions can have a number of different shapes.

In addition to these quadratic forms one can also consider the more general class of quadratic functions,

$$f(x, y) = Ax^2 + Bxy + Cy^2 + Dx + Ey + F,$$

which also have terms of degree 1 and 0. We will restrict ourselves to quadratic forms (for now).

The prototypical examples. There are four important special cases that are representative of what the graphs of quadratic forms can look like. These special cases are

$$f(x, y) = x^2 + y^2, \quad\text{and}\quad g(x, y) = -x^2 - y^2, \tag{47a}$$

$$h(x, y) = x^2, \quad\text{and}\quad \tilde h(x, y) = -x^2, \tag{47b}$$

$$k(x, y) = xy. \tag{47c}$$

Their graphs are discussed in Figure 5.

3.2. Classifying quadratic forms – the general procedure. All quadratic forms havegraphs that look like one of the examples shown above – but how can we tell which itis? In other words, if Q(x, y) is a given quadratic form how can we tell if it is definite,indefinite, or semidefinite? How do we know for which (x, y) the formQ(x, y) is positiveor negative? It turns out that we can always find out by using the trick of “completingthe square.”

e general procedure for a given quadratic formQ(x, y) = Ax2+Bxy+Cy2 is asfollows:

(1) If A = 0, then we really have Q = Bxy + Cy2 and we can factor Q as

Q(x, y) = (Bx+ Cy)y.


(2) Assume A ̸= 0. We factor out A, and complete the square for the first twoterms:

Q(x, y) = A{x2 +

B

Axy +

C

Ay2}

= A{(

x+B

2Ay)2 − ( B

2Ay)2

+C

Ay2}

= A{(

x+B

2Ay)2︸︷︷︸

u2

+4AC −B2

4A2y2︸︷︷︸

±v2

}.

(3) If 4AC −B2 > 0, then the expression in braces is positive, and we can write

Q(x, y) = A(u2 + v2), where u = x+B

2Ay, and v =

√4AC −B2

2Ay.

Depending on the sign of A our function is always positive or always negative,and we say the form is positive definite or negative definite.

The two forms f and g from (47a) are called definite, since they cannot change sign:

$f(x, y) = x^2 + y^2$

is the sum of two squares, and therefore is always positive, unless both x and y vanish. Similarly, g(x, y) = −f(x, y) is always negative, except at (x, y) = (0, 0).

The form $h(x, y) = x^2$ is called semidefinite because it too cannot change its sign. Clearly, $h(x, y) = x^2$ is never negative, but for h(x, y) to be positive, we need x ≠ 0. So, the function h(x, y) is positive, except on the line x = 0 (the y-axis). The graph of the function $\tilde h(x, y) = -x^2$ is similar, but upside down.

The form k(x, y) = xy is called indefinite, because it can be both positive and negative: if x and y have the same sign, then xy > 0, but if they have opposite signs, then xy < 0. Thus the graph of z = xy lies above the xy-plane in the first and third quadrants, and below the xy-plane in the second and fourth quadrants.

[Diagram: the xy-plane, with xy > 0 in the first and third quadrants and xy < 0 in the second and fourth quadrants.]

Figure 5. Graphs of some representative quadratic forms.


(4) If $4AC - B^2 < 0$, then we have

\[ Q(x, y) = A(u^2 - v^2), \quad\text{where}\quad u = x + \frac{B}{2A}y, \quad\text{and}\quad v = \frac{\sqrt{B^2 - 4AC}}{2A}y. \]

When this happens we can factor the quadratic form, i.e. we have

\[ Q(x, y) = A(u + v)(u - v). \]

The form is indefinite.

(5) In the only remaining case we have $4AC - B^2 = 0$, so that

\[ Q(x, y) = A\Bigl(x + \frac{B}{2A}y\Bigr)^2. \]

In this case the form is a perfect square (times $A$). The form is semidefinite.

To understand this procedure it is perhaps best to look at how it works in some examples.
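The case distinction in the procedure above can also be sketched in a few lines of Python. This is only an illustration: the function name `classify_quadratic_form` and the extra label "zero form" are our own choices and do not appear in the notes. The sketch relies on the observation that when A = 0 the quantity 4AC − B² equals −B², so case (1) is covered by the same sign test.

```python
def classify_quadratic_form(A, B, C):
    """Classify Q(x, y) = A*x**2 + B*x*y + C*y**2 by the sign of 4AC - B^2."""
    if A == 0 and B == 0 and C == 0:
        return "zero form"  # extra label (an assumption): Q vanishes identically
    D = 4 * A * C - B * B
    if D > 0:
        # Case (3). D > 0 forces A*C > 0, so A != 0 and its sign decides.
        return "positive definite" if A > 0 else "negative definite"
    if D < 0:
        # Case (4); also case (1) with B != 0, where Q = (Bx + Cy) y changes sign.
        return "indefinite"
    # D == 0: case (5), a perfect square times A; or A = B = 0 and Q = C*y**2.
    return "semidefinite"


# The prototypical forms (47a)-(47c).
examples = {
    "x^2 + y^2": (1, 0, 1),
    "-x^2 - y^2": (-1, 0, -1),
    "x^2": (1, 0, 0),
    "-x^2": (-1, 0, 0),
    "xy": (0, 1, 0),
}
for name, (A, B, C) in examples.items():
    print(name, "is", classify_quadratic_form(A, B, C))
```

The sign test only tells us which case we are in; to find where Q is positive or negative we still need the factorization obtained by completing the square.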

3.3. Classifying quadratic forms – two examples.

3.3.1. An indefinite quadratic form. Consider the form $Q(x, y) = -3x^2 + 6xy + 9y^2$. We rewrite this as follows:

\begin{align*}
Q &= -3x^2 + 6xy + 9y^2\\
&= -3\bigl(x^2 - 2xy - 3y^2\bigr)\\
&= -3\bigl[\underbrace{x^2 - 2xy + y^2} - 4y^2\bigr] && \text{complete the square}\\
&= -3\bigl[(x - y)^2 - 4y^2\bigr] && \text{a difference of two squares, so use } a^2 - b^2 = (a - b)(a + b)\\
&= -3(x - y - 2y)(x - y + 2y)\\
&= -3(x - 3y)(x + y).
\end{align*}

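This shows that Q(x, y) > 0 exactly when the factors x − 3y and x + y have opposite signs, and Q(x, y) < 0 when they have the same sign. In the right half plane x > 0 this means Q(x, y) > 0 when $y > \frac{1}{3}x$ or $y < -x$, and Q(x, y) < 0 when $-x < y < \frac{1}{3}x$. In the left half plane x < 0 the inequalities are reversed: Q(x, y) > 0 when $y < \frac{1}{3}x$ or $y > -x$, and Q(x, y) < 0 when $\frac{1}{3}x < y < -x$.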
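A quick numerical sanity check of this factorization is sketched below. It uses the form as it appears in the first line of the derivation, −3x² + 6xy + 9y², and compares it with −3(x − 3y)(x + y) on random integer points, where the arithmetic is exact. The helper names `Q`, `Q_factored` and `sign` are ours.

```python
import random


def Q(x, y):
    return -3 * x**2 + 6 * x * y + 9 * y**2


def Q_factored(x, y):
    return -3 * (x - 3 * y) * (x + y)


def sign(t):
    """Return -1, 0 or 1 according to the sign of t."""
    return (t > 0) - (t < 0)


random.seed(0)
samples = [(random.randint(-10, 10), random.randint(-10, 10)) for _ in range(1000)]

# The expanded and the factored expression agree at every sample point.
factorization_ok = all(Q(x, y) == Q_factored(x, y) for x, y in samples)

# Q is positive exactly where the two linear factors have opposite signs.
sign_rule_ok = all(
    sign(Q(x, y)) == -sign(x - 3 * y) * sign(x + y) for x, y in samples
)

print(factorization_ok, sign_rule_ok)
```

A check like this cannot replace the algebra, but it catches sign and coefficient slips quickly.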



3.3.2. A positive definite quadratic form. To see a different example, consider the quadratic form $Q(x, y) = 2x^2 - 4xy + 6y^2$. By completing the square we can write it as

\begin{align*}
Q(x, y) &= 2\bigl\{x^2 - 2xy + 3y^2\bigr\}\\
&= 2\bigl\{x^2 - 2xy + y^2 + 2y^2\bigr\} && \text{the square is complete}\\
&= 2\bigl\{(x - y)^2 + 2y^2\bigr\}\\
&= 2(x - y)^2 + 4y^2.
\end{align*}

We see that this particular quadratic form is positive definite.
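As a cross-check, we can feed the same form into the general procedure of § 3.2, with $A = 2$, $B = -4$, $C = 6$. The computation below only substitutes these numbers into the formulas for $u$ and $v$ from case (3).

```latex
\begin{align*}
4AC - B^2 &= 4 \cdot 2 \cdot 6 - (-4)^2 = 48 - 16 = 32 > 0, \qquad A = 2 > 0,\\
u &= x + \frac{B}{2A}\,y = x - y, \qquad
v = \frac{\sqrt{4AC - B^2}}{2A}\,y = \frac{\sqrt{32}}{4}\,y = \sqrt{2}\,y,\\
Q(x, y) &= A\,(u^2 + v^2) = 2\bigl\{(x - y)^2 + 2y^2\bigr\} = 2(x - y)^2 + 4y^2 .
\end{align*}
```

Both routes give the same sum of squares, and since $A > 0$ the form is positive definite.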

4. Functions in polar coordinates r, θ

Recall that instead of using Cartesian coordinates (x, y) to specify the location of points in the plane, we can also use polar coordinates. In many cases it is much easier to describe a function using polar coordinates than in Cartesian coordinates.

To go back and forth between Cartesian and polar coordinates we can use the following relations:

\begin{align}
x &= r\cos\theta \tag{48a}\\
y &= r\sin\theta \tag{48b}\\
r &= \sqrt{x^2 + y^2} \tag{48c}\\
\theta &= \arctan\frac{y}{x} \tag{48d}
\end{align}

The equation for θ is only valid for x > 0, where $-\frac{\pi}{2} < \theta < \frac{\pi}{2}$. In other regions of the plane there are other expressions relating θ to (x, y). See problem 5.8.
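The relations (48) translate directly into code. The sketch below is one possible version, with function names of our own choosing. Note one substitution: instead of arctan(y/x), which is only correct for x > 0, it uses Python's `math.atan2(y, x)`, which returns an angle in the interval from −π to π for every point other than the origin.

```python
import math


def to_cartesian(r, theta):
    """Equations (48a) and (48b)."""
    return r * math.cos(theta), r * math.sin(theta)


def to_polar(x, y):
    """Equation (48c) for r; atan2 replaces (48d) so that x <= 0 also works."""
    r = math.hypot(x, y)
    theta = math.atan2(y, x)
    return r, theta


# In the right half plane atan2 agrees with arctan(y/x) ...
print(to_polar(1.0, 1.0)[1], math.atan(1.0 / 1.0))

# ... but in the left half plane arctan(y/x) is off by pi.
print(to_polar(-1.0, 1.0)[1], math.atan(1.0 / -1.0))
```

The second printed pair differs by exactly π, which is the correction asked for in the problem on expressions for θ.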

Figure 7. Polar coordinates are defined in the picture on the right (see also equations (48)). On the left: the set of points at which θ has one given value θ0 form a half line emanating from the origin that makes an angle θ0 with the positive x-axis. The set of points at which r has a given value r0 form a circle centered at the origin, with radius r0.

The simplest kinds of functions one can consider in polar coordinates are those that only depend on one of those coordinates, i.e. functions that only depend on the radius r, and functions that only depend on the polar angle θ. Let’s look at some examples of such functions.


[Figure 8 panels: the surface $z = r = \sqrt{x^2 + y^2}$ over the xy-plane, and the curve $z = \Phi(r) = r$ in the rz-plane.]

Figure 8. Radially symmetric functions. The graph of z = r.

4.1. Radially symmetric functions. The functions

\[ f(x, y) = x^2 + y^2, \qquad g(x, y) = \sqrt{x^2 + y^2}, \qquad h(x, y) = \ln\bigl(x^2 + y^2\bigr), \]

all can be expressed in terms of the radius r only. Namely, using $r^2 = x^2 + y^2$, we have

\[ f(x, y) = r^2, \qquad g(x, y) = r, \qquad h(x, y) = \ln r^2 \;(= 2\ln r). \]

In general, a function z = f(x, y) that can be written in terms of the radius r only, i.e. a function for which there is some function Φ of one variable with

\[ f(x, y) = \Phi(r), \quad\text{i.e.}\quad f(x, y) = \Phi\bigl(\sqrt{x^2 + y^2}\bigr), \]

is called a radially symmetric function.

Since a radially symmetric function only depends on the radius r, its level sets consist of circles centered at the origin (one exception: the origin, r = 0, can also be a level set, and this is obviously not a circle but a point).

As an example, we consider the function $g(x, y) = \sqrt{x^2 + y^2} = r$ in more detail. The function Φ of one variable here is Φ(r) = r. We can try to visualize the graph of g by first looking at the positive x-axis only. There we have $g(x, 0) = \sqrt{x^2} = x$. We get the graph of g by revolving the graph of z = x around the z-axis. See Figure 8.
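The definition of a radially symmetric function can be mirrored in code: starting from a one-variable function Φ we build f(x, y) = Φ(r), and then confirm numerically that f takes a single value on a circle around the origin. The names `radial` and `values_on_circle` below are hypothetical helpers introduced for this sketch.

```python
import math


def radial(phi):
    """Return the two-variable function f(x, y) = phi(r), r = sqrt(x^2 + y^2)."""
    return lambda x, y: phi(math.hypot(x, y))


g = radial(lambda r: r)      # g(x, y) = r, the cone of Figure 8
f = radial(lambda r: r * r)  # f(x, y) = r^2


def values_on_circle(func, r0, n=8):
    """Sample func at n equally spaced points of the circle of radius r0."""
    angles = [2 * math.pi * k / n for k in range(n)]
    return [func(r0 * math.cos(a), r0 * math.sin(a)) for a in angles]


print(values_on_circle(g, 2.0))  # all samples agree up to rounding
print(values_on_circle(f, 2.0))
```

Up to rounding error every sample on the circle of radius 2 gives the same value, which is the statement that circles centered at the origin lie inside level sets.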

4.2. Functions of θ only. Here are two functions that happen to depend on the polar angle θ only:

\[ f(x, y) = \sin\theta, \qquad h(x, y) = \theta. \]

We can rewrite these functions in terms of x and y by using the relations between Cartesian and polar coordinates (48). We get

\[ f(x, y) = \sin\theta = \frac{y}{r} = \frac{y}{\sqrt{x^2 + y^2}} \]

for f, and

\[ h(x, y) = \theta = \arctan\frac{y}{x} \]

for h, at least in the right half plane where x > 0.

A function that only depends on θ is constant on rays emanating from the origin because the polar angle θ is constant on such rays. The level sets of such a function therefore consist of half-lines (“rays”) starting at the origin. Its graph consists of “spokes” attached to the z-axis. Each spoke lies above a ray in the xy-plane with some polar angle θ, and is attached to the z-axis at a height given by the function value. As we vary θ, the spoke rotates around the vertical axis and moves up or down, as dictated by the function. Figure 9 shows what happens for f(x, y) = sin θ.

[Figure 9 labels: a “ray” in the xy-plane at polar angle θ, and the “spoke” above it at height z = f(θ).]

Figure 9. The graph of a function of θ only consists of horizontal spokes attached to the z-axis. The graph of z = sin θ (the x-axis is coming right at us).

The function z = θ has a simpler formula in polar coordinates but actually has a more complicated graph. Let us try to visualize its graph: the spokes that make up the graph are horizontal, attached to the z-axis, and are at height θ. If we increase the angle θ the spokes go up at a steady rate in a way that should remind us of a helix (see § 6 and Figure 5). Based on this description its graph should look like the surface drawn in Figure 10. The surface is called the helicoid, and it is not the graph of a function (it fails the “vertical line test”). We could have known this from the beginning, because when we described our function as h(x, y) = θ, we should have immediately asked: which θ? The polar angle θ of any given point is only determined up to a multiple of 2π. The “graph” that we have drawn of the “function” z = θ reflects this. To make h(x, y) = θ into an honest function we have to say which of the many possible angles θ we choose when we are given a point. One possible choice is to always require the polar angle θ to lie between 0 and 2π (radians). More precisely, we can insist on

0 ≤ θ < 2π.

If we do this then there is a unique angle θ for each point (x, y) in the plane other than the origin, where the polar angle is not defined. The graph of this function is shown on the right in Figure 10.
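Choosing the branch 0 ≤ θ < 2π can be sketched as follows. This is only one way to do it: we start from `math.atan2`, whose values lie between −π and π, and shift the negative angles up by 2π. The function name `polar_angle` is ours, and raising an error at the origin is a design choice of this sketch.

```python
import math

TWO_PI = 2 * math.pi


def polar_angle(x, y):
    """Polar angle of (x, y), chosen in the interval 0 <= theta < 2*pi."""
    if x == 0 and y == 0:
        raise ValueError("the polar angle is undefined at the origin")
    theta = math.atan2(y, x)
    if theta < 0:
        theta += TWO_PI
    # Guard against rounding: a tiny negative angle plus 2*pi can round to 2*pi.
    return 0.0 if theta >= TWO_PI else theta


# The price of choosing a branch: a jump across the positive x-axis.
just_above = polar_angle(1.0, 1e-6)
just_below = polar_angle(1.0, -1e-6)
print(just_above, just_below)
```

Just above the positive x-axis the angle is close to 0, just below it is close to 2π. This jump is the vertical edge along which the helicoid was cut to obtain the graph of a function.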

5. Methods of visualizing the graph of a function

5.1. Freezing a variable. If a function is not familiar, then a good strategy for drawing its graph is to “freeze a variable.” In other words, to analyze a function z = f(x, y) we pretend y is a constant: then x is the only independent variable, and we can try to draw the graph of the function z = f(x, y), now thinking of this as a function of only one variable. This graph is a curve in the xz plane. We get one such curve for each choice of y. Piecing these graphs together then gives us the graph of the two-variable function z = f(x, y).

We could apply the same procedure with the roles of x and y switched: i.e. for each fixed x we try to graph z = f(x, y) as a function of the variable y only, after which we try to fit all the graphs we get for different values of x together.


Figure 10. The graph of z = θ is the helicoid. It is not the graph of a function, but one can extract a function by choosing a “branch” of the function. One possible choice, drawn here on the right, is to restrict the polar angle θ to the interval 0 ≤ θ < 2π. There are many other possible choices.

5.2. Moving graphs. There is another way of visualizing a function z = f(x, y) of two variables in which we think of one of the independent variables (e.g. y) as “time.” The final picture is not one static image of a three dimensional surface, but rather a movie of a graph that is moving around in the xz plane.

If we have a function z = f(x, y), then let us think of y as time, and let us relabel it as t, so that we are looking at the function z = f(x, t). Now at each moment in time t we can think of z = f(x, t) as a function of one variable x whose graph we can try to draw, regarding it as a still-image. Then, as we let time t vary, putting the still images in a sequence, we get a movie of a graph of a changing function of one variable.

For instance, if the function is (once again) the saddle surface function z = xy, then we would be considering the function z = xt. At each moment t the graph of z = xt is a line with slope t. Putting these graphs together gives a movie which begins with a line of rather negative slope; during the movie the slope increases, and in the middle of the movie our line has achieved horizontality; finally, the closing shot presents us with a line with a very positive slope. Figure 11 shows some stills from the movie.

[Figure 11 panels: the graph of z = xt in the xz-plane at t = −1, t = −1/2, t = 0, t = 1/2, and t = 1.]

Figure 11. The saddle movie. It’s about a line segment whose slope changes, even though it is otherwise stuck to the origin.

This interpretation is not very different from the procedure of “freezing the y variable.” The only real difference lies in what we do with all the separate graphs we get after we freeze a variable. In one case we try to piece them together to make a bigger drawing of a three-dimensional object, in the other we put them together to make a motion picture.
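Both points of view start from the same data, namely one graph of a one-variable function for each frozen value of the second variable. The sketch below produces that data for the saddle z = xy and checks that the frame at time t is a line of slope t through the origin. The helper names `slices` and `slope`, and the sample values of x and t, are assumptions made for this illustration.

```python
def slices(f, xs, ys):
    """For each frozen value y in ys, sample the one-variable function x -> f(x, y)."""
    return {y: [(x, f(x, y)) for x in xs] for y in ys}


def saddle(x, y):
    return x * y


xs = [-2, -1, 0, 1, 2]
ts = [-1, -0.5, 0, 0.5, 1]  # the five moments shown in the saddle movie

frames = slices(saddle, xs, ts)


def slope(frame):
    """Slope of the line through the first and last sample of a frame."""
    (x0, z0), (x1, z1) = frame[0], frame[-1]
    return (z1 - z0) / (x1 - x0)


slopes = [slope(frames[t]) for t in ts]
print(slopes)
```

Stacking the frames side by side along a y-axis gives the surface of § 5.1; showing them one after another gives the movie of § 5.2.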

Problems

In the problems in this stage of the course, you will be asked to “sketch the graph of a function.” From math 221 you remember that this meant you had to find minima, maxima, inflection points, and other features of the graph. In 234 you will learn to do the same for functions of two (and more) variables, but for now you should try to use the method of “freezing a variable” or other similar tricks to get an idea of what the graph of f looks like.

You can use a graphing program (such as Grapher.app on the Mac, GraphCalc on Windows, or one of the many websites such as http://www.graphycalc.com/) to check your answer.

Note: very often students try to fit their drawings into a region the size of a post-it. In this course, whenever you make a drawing, especially if it’s a three-dimensional drawing, make it large! Use half a page for a drawing. Make sure you have enough paper; try to find lots of cheap scrap paper.

1. If we were to drain Lake Mendota, as suggested in § 1.4, would the lake bottom give us the graph of d(x, y) or of −d(x, y) (where d is the depth of the lake)? •

2. What are the signs of the coefficients a, b, and c for the linear function whose graph is drawn in Figure 4? •

3. About planes and their intersections with the coordinate axes.

(a) Where does the plane z = 3x − y + 6 intersect the three coordinate axes? •

(b) Find the equation for the plane that intersects the x-axis at x = 4, the y-axis at y = 2, and the z-axis at z = 3. •

(c) Find the equation for the plane that intersects the x-axis at x = a, the y-axis at y = b, and the z-axis at z = c. (Write the equation as nicely as possible.) •

4. Find a formula for the distance to the origin of the graph of (45). •

5. Classify the following quadratic forms as definite, indefinite, or other, by completing the square. Determine the zero set for each of these quadratic forms.

(a) $f(x, y) = x^2 + 2y^2$ •

(b) $Q(x, y) = x^2 - y^2$ •

(c) $g(x, y) = x^2 - 4xy + 3y^2$ •

(d) $Q(s, t) = 9s^2 - 36st + 81t^2$ •

(e)M(α, β) = 12α2 − αβ + β2. •

(f) $Q(x, y) = xy + y^2$ •

(g) $Q(x, y) = x^2 + 2xy$ •

6. For which values of the constant k is the quadratic form

\[ Q(x, y) = x^2 + 2kxy + y^2 \]

positive definite? •

7. Which functions of two variables z = f(x, y) are defined by the following formulae?


▷ Find and draw the domain of each function (the largest domain on which the definition would make sense).

▷ Try to sketch their graphs.

▷ Draw the level sets for each function.

(a) $z = xy$ •

(b) $z - x^2 = 0$ •

(c) $z^2 - x = 0$ •

(d) $z - x^2 - y^2 = 0$ •

(e) $z^2 - x^2 - y^2 = 0$ •

(f) $xyz = 1$ •

(g) $xy/z^2 = 1$ •

(h) $x + y + z^2 = 0$

(i) $x + y + z^2 = 1$

8. The following expressions are all equal to the polar angle θ in some region of the xy-plane. Explain why the expression gives θ, and identify in which region this holds.

(a) $\theta = \arctan\dfrac{y}{x}$ •

(b) $\theta = \pi + \arctan\dfrac{y}{x}$ •

(c) $\theta = 2\pi + \arctan\dfrac{y}{x}$ •

(d) $\theta = \dfrac{\pi}{2} - \arctan\dfrac{x}{y}$ •

(e) $\theta = \arcsin\dfrac{y}{\sqrt{x^2 + y^2}}$. •

9. “The level set is always a curve…” – not! If d(x, y) is the depth function of Lake Mendota (see § 1.4), then what are the level sets $d^{-1}(c)$ for c = 0, c = +24 and for c = −24
