Unit #22: The Chain Rule, Higher Partial Derivatives & Optimization

Goals:

• We will study the chain rule for functions of several variables.

• We will compute and study the meaning of higher partial derivatives.

• We will begin a discussion of optimization.

Reading: Sections 14.6, 14.7, 15.1.

Composing Functions of Several Variables

Reading: Section 14.6.

There are many ways to compose multi-variable functions. Suppose we have a function f(x, y) with output z. That is, z = f(x, y). Suppose we let x be the output of a function g(t) and y the output of a function h(s). Then, by substituting these functions in for x and y, we get a new function

f(g(t), h(s))

This is a function of two variables t and s. If, on the other hand, the second function were also a function of t, as in y = h(t), we would end up with a function of a single variable,

f(g(t), h(t))

Draw a diagram showing the relationships between the variables in these two examples.

Example (Ideal Gas): The pressure P (in kilopascals), volume V (in liters), and temperature T (in kelvin) of one mole of an ideal gas are related by the formula

PV = nRT = (8.31)T if n = 1.

Suppose the pressure is increasing at 0.05 kPa per second, and the temperature is increasing at 0.15 K per second. Identify the way in which functions are composed in this problem.

Now suppose we want to find the rate of change of volume at the moment when the pressure is 20 kPa and the temperature is 320 K. In other words, we are given dP/dt, dT/dt, P, and T, and are trying to find dV/dt. (Think back to our much earlier work with related rates: this is the same idea coming back!)

The Chain Rule

The Chain Rule for the form of composition dealt with above should answer the following question: If z = f(g(t), h(t)), how does the derivative dz/dt relate to the derivatives of f, g, and h?

There is a separate chain rule for every form of composition. By obtaining the chain rule in a few simple cases, it will be easy to see how to get others.

We start with our understanding of linear approximations:

∆z ≈ f_x(x, y) ∆x + f_y(x, y) ∆y,

∆x ≈ g′(t) ∆t,  ∆y ≈ h′(t) ∆t,

x = g(t) and y = h(t).

Combining these, we get

∆z ≈ f_x(g(t), h(t)) · g′(t) ∆t + f_y(g(t), h(t)) · h′(t) ∆t.

Dividing by ∆t and taking limits to get derivatives, we obtain
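Written out in full, the limit of the approximation for ∆z takes the following form. This is a sketch of the step using the same f, g, and h; the subscripts denote partial derivatives.

```latex
\frac{dz}{dt}
  = f_x\bigl(g(t), h(t)\bigr)\, g'(t)
  + f_y\bigl(g(t), h(t)\bigr)\, h'(t)
```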

For short,

dz/dt = (∂z/∂x)(dx/dt) + (∂z/∂y)(dy/dt)

Now apply this Chain Rule to solve the ideal gas question above.

dP/dt = 0.05 kPa/s and dT/dt = 0.15 K/s

P = 20 kPa and T = 320 K
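As a hedged numerical sketch of how the Chain Rule applies here, the snippet below solves PV = 8.31T for V and combines the two partial derivatives of V with the given rates. The function name `dV_dt` and its argument names are illustrative choices, not part of the notes.

```python
# Ideal gas with n = 1: P*V = 8.31*T, so V(P, T) = 8.31*T / P.
R = 8.31  # constant from the example; P in kPa, V in liters, T in kelvin


def dV_dt(P, T, dP_dt, dT_dt):
    """Chain rule: dV/dt = (dV/dP)(dP/dt) + (dV/dT)(dT/dt)."""
    dV_dP = -R * T / P**2  # partial of V with respect to P (T held fixed)
    dV_dT = R / P          # partial of V with respect to T (P held fixed)
    return dV_dP * dP_dt + dV_dT * dT_dt


rate = dV_dt(P=20.0, T=320.0, dP_dt=0.05, dT_dt=0.15)
print(f"dV/dt = {rate:.4f} L/s")
```

With these inputs the volume is decreasing at roughly 0.27 liters per second: the effect of the rising pressure outweighs the effect of the rising temperature.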

Example: Consider a hillside defined by the function z = f(x, y) = xy², where z is in meters, and x and y are in kilometers. We are walking along a straight path, with

x(t) = t and y(t) = 3t

where t is measured in hours.

Sketch out the path we are taking in the xy-plane.

Give the units of dz/dt, dx/dt, and ∂z/∂x.

z = f(x, y) = xy², x(t) = t and y(t) = 3t

How quickly are we moving up or downhill one hour into the hike (at t = 1)?
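One way to check an answer to this question is to evaluate the Chain Rule directly and compare it with a finite-difference estimate of dz/dt along the path. The sketch below does both; the helper names (`x_of`, `y_of`, `dz_dt`, `dz_dt_numeric`) are made up for illustration.

```python
def f(x, y):
    return x * y**2  # height in meters, x and y in kilometers


def x_of(t):
    return t


def y_of(t):
    return 3 * t


def dz_dt(t):
    """Chain rule: dz/dt = f_x * dx/dt + f_y * dy/dt along the path."""
    x, y = x_of(t), y_of(t)
    f_x = y**2       # partial of x*y^2 with respect to x
    f_y = 2 * x * y  # partial of x*y^2 with respect to y
    dx_dt, dy_dt = 1.0, 3.0
    return f_x * dx_dt + f_y * dy_dt


def dz_dt_numeric(t, h=1e-6):
    """Central-difference estimate of the same derivative."""
    return (f(x_of(t + h), y_of(t + h)) - f(x_of(t - h), y_of(t - h))) / (2 * h)


print(dz_dt(1.0), dz_dt_numeric(1.0))
```

Both values are about 27, so at t = 1 the walker is climbing at roughly 27 meters per hour.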

Indicate the interpretation of your result on the contour diagram for f(x, y) = xy².

[Figure: contour diagram of f(x, y) = xy² for 0 ≤ x ≤ 5 and 0 ≤ y ≤ 5, with contours labelled 2 through 20.]

Higher Order Partial Derivatives

Reading: Section 14.7.

If f(x, y) = x e^y + y/x, then

∂f/∂x (x, y) =

and

∂f/∂y (x, y) =

These are the first order partial derivatives (only one derivative is taken).

If we differentiate ∂f/∂x again, with respect to x, the result is denoted ∂²f/∂x². It is a second order partial derivative (we have taken two derivatives of f). [See also H-H, Examples 1 and 2 of Section 14.7.]

Compute ∂²f/∂x².

Compute the other second order partial derivatives,

• ∂²f/∂x∂y = ∂/∂x (∂f/∂y) =

• ∂²f/∂y∂x = ∂/∂y (∂f/∂x) =

• ∂²f/∂y² = ∂/∂y (∂f/∂y) =
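For reference, here is one possible worked version of these derivatives, taking f(x, y) = x e^y + y/x (with x ≠ 0) as the function. Treat it as a check on your own work rather than the official solution.

```latex
\begin{aligned}
f_x &= e^{y} - \frac{y}{x^{2}}, &
f_y &= x e^{y} + \frac{1}{x}, \\
\frac{\partial^2 f}{\partial x^2} &= \frac{2y}{x^{3}}, &
\frac{\partial^2 f}{\partial y^2} &= x e^{y}, \\
\frac{\partial}{\partial x}\!\left(\frac{\partial f}{\partial y}\right) &= e^{y} - \frac{1}{x^{2}}, &
\frac{\partial}{\partial y}\!\left(\frac{\partial f}{\partial x}\right) &= e^{y} - \frac{1}{x^{2}}.
\end{aligned}
```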

Find ∂²f/∂x∂y and ∂²f/∂y∂x for f(x, y) = sin(x²y).
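A numerical check can help confirm a hand computation for this function. The sketch below differentiates the hand-computed first partials of sin(x²y) by central differences, once in each order, and compares both with a closed-form candidate for the mixed partial. The sample point (0.7, 1.3) and all helper names are arbitrary illustrative choices.

```python
import math


def f_x(x, y):
    return 2 * x * y * math.cos(x**2 * y)  # partial of sin(x^2 y) w.r.t. x


def f_y(x, y):
    return x**2 * math.cos(x**2 * y)       # partial of sin(x^2 y) w.r.t. y


def d_dx(g, x, y, h=1e-6):
    """Central-difference estimate of the x-derivative of g."""
    return (g(x + h, y) - g(x - h, y)) / (2 * h)


def d_dy(g, x, y, h=1e-6):
    """Central-difference estimate of the y-derivative of g."""
    return (g(x, y + h) - g(x, y - h)) / (2 * h)


def mixed_exact(x, y):
    """Hand-derived candidate: 2x cos(x^2 y) - 2 x^3 y sin(x^2 y)."""
    u = x**2 * y
    return 2 * x * math.cos(u) - 2 * x**3 * y * math.sin(u)


x0, y0 = 0.7, 1.3
d_dx_of_fy = d_dx(f_y, x0, y0)  # derivative in x of the y-partial
d_dy_of_fx = d_dy(f_x, x0, y0)  # derivative in y of the x-partial
print(d_dx_of_fy, d_dy_of_fx, mixed_exact(x0, y0))
```

The three printed numbers agree to within the finite-difference error, which is what the equality of mixed partials predicts.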

The pattern in these two examples is not a coincidence. It is a general theorem that for all reasonable functions f,

∂²f/∂x∂y = ∂²f/∂y∂x.

The results for higher derivatives are the same: only the variables used in the derivative matter, not the order in which the derivatives are taken.

[H-H, portion headed “The Mixed Partial Derivatives are Equal” in Section 14.7]

What do the second derivatives tell you?

Example: Consider the function z = x² − y². Find each of the first and second partial derivatives, evaluated at the point (−2, 1).
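A sketch of the computation, offered as a check on your own work:

```latex
\begin{aligned}
z_x &= 2x, & z_x(-2, 1) &= -4, \\
z_y &= -2y, & z_y(-2, 1) &= -2, \\
z_{xx} &= 2, & z_{yy} &= -2, \\
z_{xy} &= z_{yx} = 0.
\end{aligned}
```

The second derivatives are constant here, so their values at (−2, 1) are the same as everywhere else.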

The point (−2, 1) is indicated on the graph. Indicate the interpretation of the first derivatives on the graph.

Indicate now the interpretation of the second derivatives, ∂²f/∂x² and ∂²f/∂y², on the graph.

What do mixed partial derivatives tell you?

The value of ∂²f/∂x∂y answers the question: “for a small step in the x direction, what is the rate of change of the y-slope?” (or vice-versa) [See Example 3 in Section 14.7 of H-H.]

Consider the function f(x, y) = xy.

If you slice through the graph of f(x, y) = xy using the plane x = k, what does the intersection look like? Note that it passes through the point (k, 0, 0) on the x-axis.

What happens to the slope of this intersection when you change k? At what rate does the slope change? How would you describe, in words, the change in the shape of the surface as you move along the x-axis? That is, what would the sign of ∂²f/∂x∂y be?
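One way to organize the reasoning for f(x, y) = xy, as a sketch: the slice at x = k is a straight line in the y-direction, and its slope grows at a constant rate as k increases.

```latex
\begin{aligned}
\text{slice at } x = k:\quad & z = k\,y \quad (\text{a line through } (k, 0, 0) \text{ with } y\text{-slope } k), \\
f_y(x, y) &= x, \\
\frac{\partial^2 f}{\partial x\, \partial y}
  &= \frac{\partial}{\partial x}\bigl(f_y\bigr) = 1 > 0.
\end{aligned}
```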

Example: Consider the contour diagram below.

[Figure: contour diagram in the xy-plane with the point P marked; contour labels −2, −1, 0, 2.]

Determine the sign of each of the following first derivatives.

• f_x(P): Neg / Pos

• f_y(P): Neg / Pos

[Figure: the same contour diagram with the point P marked.]

Determine the sign of each of the following pure second derivatives.

• f_xx(P): Neg / Pos

• f_yy(P): Neg / Pos

[Figure: the same contour diagram with the point P marked.]

Determine the sign of the mixed second derivative.

• f_xy(P): Neg / Pos

Taylor Polynomials of Degree Two for Functions of Several Variables

Reading: Taylor Approximations in Section 14.7.

We have already discussed the linear approximation

f(x, y) ≈ f(a, b) + f_x(a, b)(x − a) + f_y(a, b)(y − b),

which is valid if (x, y) is near (a, b). This formula can be called:

• the local linearization of f near (a, b)

• the equation of the plane tangent to the surface f(x, y) at (a, b)

• the Taylor polynomial of degree 1 for f(x, y) centered at (a, b)

If we wanted a better approximation, we could use a 2D parabolic shape to mimic the function, instead of a simple plane.

Write your guess as to the natural form of a quadratic Taylor polynomial for a 2-variable function, around the point (x, y) = (a, b). Be careful with the constants.

The reason for some coefficients having 2’s, and others not, comes from matching the second derivatives of the function and the Taylor polynomial. The derivation of this is tedious, but straightforward.
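For comparison with your guess, the standard degree-two Taylor polynomial about (a, b) is sketched below; the name Q is used here only for convenience.

```latex
\begin{aligned}
Q(x, y) = {} & f(a, b) + f_x(a, b)(x - a) + f_y(a, b)(y - b) \\
& + \frac{f_{xx}(a, b)}{2}(x - a)^2
  + f_{xy}(a, b)(x - a)(y - b)
  + \frac{f_{yy}(a, b)}{2}(y - b)^2
\end{aligned}
```

Differentiating Q twice with respect to x gives f_xx(a, b), and differentiating once in x and once in y gives f_xy(a, b), which is why the pure terms carry a factor of 1/2 and the mixed term does not.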

Calculate the quadratic Taylor polynomial approximating cos(3x)y² for (x, y) near (0, 1).
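A quick numerical comparison is one way to test a candidate answer. The sketch below assumes the hand-computed values f = 1, f_x = 0, f_y = 2, f_xx = −9, f_xy = 0, f_yy = 2 at (0, 1), builds the quadratic from them, and compares it with cos(3x)y² at a few nearby points. The name `Q` and the sample points are illustrative.

```python
import math


def f(x, y):
    return math.cos(3 * x) * y**2


def Q(x, y):
    """Candidate quadratic Taylor polynomial of f about (0, 1).

    Uses the hand-computed values at (0, 1):
    f = 1, f_x = 0, f_y = 2, f_xx = -9, f_xy = 0, f_yy = 2.
    """
    return 1 + 2 * (y - 1) - 4.5 * x**2 + (y - 1) ** 2


for x, y in [(0.1, 1.1), (0.05, 0.95), (0.01, 1.01)]:
    print(f"({x}, {y}): f = {f(x, y):.6f}, Q = {Q(x, y):.6f}")
```

The gap between f and Q shrinks rapidly as the sample point approaches (0, 1), as expected for a second-order approximation.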

Local and Global Extrema

Reading: Section 15.1.

The definitions for local maximum, global maximum, local minimum, and global minimum are very similar to those used for functions of a single variable.

Local Extrema

f has a local maximum/minimum at (a, b) if

• (a, b) is not on the boundary of the domain, and

• f(a, b) ≥ f(x, y) for all points (x, y) near (a, b). (f(a, b) ≤ f(x, y) for a local minimum)

Global Extrema

f has a global maximum/minimum at (a, b) if

• f(a, b) ≥ f(x, y) for all points (x, y) in the domain of f. (f(a, b) ≤ f(x, y) for a global minimum)

The definition of global extremum captures our goal in a search for global optima, but a global extremum is not necessarily easy to find! We need to do some work to find a strategy for identifying global extrema.

Example: Draw a contour diagram for f = x² + y² on the domain −2 ≤ x ≤ 2, −2 ≤ y ≤ 2. Use heights of z = 0, 1, 2, 3, 4, etc.

As a review, draw the direction of the gradient vector at several points on the contour diagram.

Based on the contour diagram, identify the global maxima and minima of f(x, y) = x² + y² on the domain D, where D is the square −2 ≤ x ≤ 2, −2 ≤ y ≤ 2.

What can you say about the gradient at the global minimum?

Global Extrema on a Closed Bounded Domain

If f(x, y) is defined on a closed, bounded domain, the global extrema occur either

• on the boundary of the domain, or

• at a point where grad f = <0, 0> in the interior of the domain.

(Note: closed means the domain includes its boundary, and bounded means that the domain does not stretch out to infinity in any direction.)
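To illustrate the statement on the earlier example f(x, y) = x² + y² over the square −2 ≤ x ≤ 2, −2 ≤ y ≤ 2, the sketch below samples the domain on a grid and reports where the smallest and largest sampled values occur. A brute-force grid search is only an illustration, not the method these notes develop; the grid size `n` and the variable names are arbitrary choices.

```python
def f(x, y):
    return x**2 + y**2


n = 41  # grid points per axis; includes -2, 0, and 2 exactly
grid = [-2 + 4 * i / (n - 1) for i in range(n)]
points = [(x, y) for x in grid for y in grid]

lo = min(points, key=lambda p: f(*p))  # location of the smallest sampled value
hi = max(points, key=lambda p: f(*p))  # first location of the largest sampled value

print("min:", lo, f(*lo))
print("max:", hi, f(*hi))
```

The minimum lands at the interior point (0, 0), where grad f = <0, 0>, while the maximum value 8 is attained on the boundary at a corner of the square (all four corners tie).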

For problems where the domain of the function is not bounded, proving a local maximum is in fact a global maximum can be very difficult, and requires building arguments for each problem. We will only ask this in relatively simple problems.

Example: Where are the global maximum and global minimum of the function on the area shown below?

[Figure: contour diagram with contours labelled 0, 20, 30, 40, 50, 60, 70, 80, 90, 100.]

Identify any local min and max points on the contour diagram below.

[Figure: contour diagram with contours labelled from 20 to 260.]

Where are the global maximum and global minimum of the function on the domain shown?

Identify any local min and max points on the contour diagram below.

[Figure: contour diagram with contours labelled from 0.1 to 3.]

Where are the global maximum and global minimum of the function on the area shown?