Finance Theory Robert C. Merton

Table of Contents

I. Introduction

II. On the Arithmetic of Compound Interest: The Time Value of Money

III. On the Theory of Accumulation and Intertemporal Consumption Choice by Households in an Environment of Certainty

IV. On the Role of Business Firms, Financial Instruments and Markets in an Environment of Certainty

V. The "Default-Free" Bond Market and Financial Intermediation in Borrowing and Lending

VI. The Value of the Firm Under Certainty

VII. The Firm's Investment Decision Under Certainty: Capital Budgeting and Ranking of New Investment Projects

VIII. Forward Contracts, Futures Contracts and Options

IX. The Financing Decision by Firms: Impact of Capital Structure Choice on Value

X. The Investor's Decision Under Uncertainty: Portfolio Selection

XI. Implications of Portfolio Theory for the Operation of the Capital Markets: The Capital Asset Pricing Model

XII. Risk-Spreading via Financial Intermediation: Life Insurance

XIII. Optimal Use of Security Analysis and Investment Management

XIV. Theory of Value and Capital Budgeting Under Uncertainty

XV. Introduction to Mergers and Acquisitions: Firm Diversification

XVI. The Financing Decision by Firms: Impact of Dividend Policy on Value

XVII. Security Pricing and Security Analysis in an Efficient Market

Copyright © 1982 by Robert C. Merton. These Notes are not to be reproduced without the author’s written permission. All rights reserved.

I. INTRODUCTION

[Figure: Domain of Finance. Diagram relating Households, Manufacturing or Business Firms, and Financial Intermediaries to the Product Markets, the Labor Markets, and the Capital Markets (Stock, Bond, Money, Futures). Flow labels: Consumption, Savings, Savings (Borrowings), Output, Investment Capital.]

This course is an introduction to the theory of optimal financial management of households,

business firms, and financial intermediaries. For the term "optimal" to have meaning, a criterion for

measuring performance must be established. For households, it is assumed that each consumer has

a criterion or "utility" function representing his preferences among alternatives, and this set of

preferences is taken as "given" (i.e., as exogenous to the theory). This traditional approach to

households and their tastes does not extend to economic organizations and institutions. That is,

they are regarded as existing primarily because of the functions they serve instead of functioning

primarily because they exist. Economic organizations and institutions, unlike households and their

tastes, are endogenous to the theory. Hence, in the theory of the firm, it is not a fruitful approach to

treat the firm as an "individual" with exogenous preferences. Rather, it is assumed that firms are

created as means to the ends of consumer-investor welfare, and therefore, the criterion function for

judging optimal management of the firm will be endogenous.

In a modern large-scale economy, it is neither practical nor necessary for management to

"poll" the owners of the firm to make decisions. Instead certain data gathered from the capital

markets can be used as "indirect" signals for the determination of the optimal investment and

financing decisions. What the labor and product markets are to the marketing, production and

product-pricing managers, the capital markets are to the financial manager. Hence, a good financial

manager must understand how capital markets work.

Since the capital markets are central, it is quite natural to begin the study of Finance with

the theory of capital markets. To derive the functions of financial markets and institutions, we

investigate the behavior of individual households. Using portfolio selection theory, the households'

demand functions for assets and financial securities are derived to develop the demand side of

capital markets. Taking as given the supply of available assets (i.e., the investment and financing

decisions of business firms), the demands of households are aggregated and equated to aggregate

supplies to determine the equilibrium structure of returns of assets traded in the capital market.

Inspection of the structure of these demand functions leads in a natural way to an introductory

theory for the existence and optimal management of financial intermediaries.

In the second part of the course, the supply side of the capital markets is developed by

studying the optimal management of business firms (given the demand functions of households).

The two elements which make Finance a nontrivial subject are time and uncertainty.

Capital investments often require substantial commitments of resources to earn uncertain cash

flows which may not be generated before some distant future date. It is the financial manager's

responsibility to determine under what conditions such investments should be taken and to ensure

that sufficient funds will be available to take the investments. Because future flows and rates of

return are not known with certainty, to make good decisions, the financial manager must have a

thorough understanding of the tradeoff between risk and return.

While the basic mode of approach has universal application, it should be understood that

the assumed environment is the (reasonably) large corporation in a large-scale economy with well-

developed capital markets and institutions similar to those in the United States. Although the

emphasis is on the private sector, most of the analysis can be applied directly to public sector

financing and investment decisions. However, certain assumptions made in developing the theory

(which are quite reasonable in the assumed environment) will require modification before being

applied to small businesses with limited access to the capital markets or to foreign countries with

significantly different institutional and social structures.

Summary of Different Parts of Finance

Households (Personal Finance)

Taken as Given:
1. A criterion function for choice among alternative consumption programs
2. Initial endowments

To be Determined:
1. Optimal consumption-saving decision
2. Optimal allocation of savings (portfolio selection)

Manufacturing or Business Firms (Corporate Finance)

Taken as Given:
1. Owners of the firm are households [either directly or through financial intermediaries]
2. Proper management is to operate the firm in the best interests of the owners or shareholders
3. The technology or "blueprints" of available projects (including cost and revenue forecasts) are known either as point values (certainty) or as probability distributions.

To be Determined:
1. An operation criterion for measuring good management
2. Investment decision in physical assets (capital budgeting)
   a. Which assets to invest in
   b. How much to invest in total
3. The long-term financing decision
   a. Dividend policy
   b. Capital structure decisions and the cost of capital
4. The short-term financing decision
   a. Management of working capital and cash
5. Mergers and Acquisitions: Firm diversification
6. Taxation and its impact on 2-5 (above)

Financial Intermediaries (Financial Institutions)

Taken as Given:
1. Owners of the intermediary are households [either directly or through other financial intermediaries]
2. Proper management is to operate the intermediary in the best interests of the owners or shareholders

To be Determined:
1. Why they exist and what services they provide
2. How the management of financial intermediaries differs from the management of business firms
3. Efficient management and measurement of performance
4. The role of market makers

Capital Markets and Financial Instruments (Capital Market Finance)

To be Determined:
1. Why they exist and what services they provide
2. The characteristics of an "efficient" capital market
3. How an efficient capital market permits decentralization of decision making
4. The role of capital markets as a source of information (or "signals") for efficient decision making by households and managers of business firms and financial intermediaries
5. The empirical testing of finance theories using capital market data

Basic Methodology and Approach of the Course

1. How should the system work?

2. Does it work that way?

3. If not, is there an opportunity for improvement (and hence, a profit opportunity)?

4. If you and the market "disagree," then who is right?

Frequently-Used Concepts

Equilibrium: To understand each element of the system, one must frequently analyze the whole

system. To do so, we look at the aggregated resultant of the actions of each unit. If each unit is

choosing the "best" plan possible and the aggregation of the actions implied by these plans is such

that the market clears (i.e., supply equals demand for every item), then these "best" plans can be

realized, and the market is said to be in equilibrium. In general, it will be assumed that the markets

are in or tending toward equilibrium.

Competition: The basic paradigm adopted is that markets operate such that the very best at their

"job" will earn a "fair" return and those that are not will earn a less-than-fair return. This is in

contrast to the view that anyone can earn a "fair" return and the "smart" people will earn a "super"

return. In certain situations, it will be assumed that the capital markets satisfy the technical

conditions of pure competition.

"Perfect" or "Frictionless" Markets: At times, we will use the abstract concept of a perfect market.

That is, there are no transactions costs or other frictions; there are no institutional restrictions

against market transactions of any sort; there are no divisibility problems with respect to the scale of

transactions; and equal information is available to all market participants. In some cases, actual

markets will be sufficiently "close" to this abstraction to use the resulting analysis directly. In other

cases, it provides a "benchmark" for the study of imperfections.

Summary 53-Year Return Experience: Stocks and Bonds (1926–1978)

Source: “Stocks, Bonds, Bills, and Inflation: Historical Returns (1926–1978),” R.G. Ibbotson and R.A. Sinquefield, Financial Analysts Foundation (1979).

Type                          Average Annual Return   Standard Deviation   Growth of $1000 (Average Compound Return)
Common Stocks (S&P 500)       11.2%                   22.2%                $89,592 (8.9%)
Long-Term Corporate Bonds      4.1%                    5.6%                 $7,807 (4.0%)
Long-Term Government Bonds     3.4%                    5.7%                 $5,342 (3.2%)
U.S. Treasury Bills            2.5%                    2.2%                 $3,728 (2.5%)

“Inflation-Adjusted” (Consumers Price Index) (“Real”) Returns

Type                          Average Annual Return   Standard Deviation   Growth of $1000 (Average Compound Return)
Common Stocks (S&P 500)        8.7%                   22.3%                $23,399 (6.1%)
Long-Term Corporate Bonds      1.6%                   NA                    $2,018 (1.3%)
Long-Term Government Bonds     0.9%                   NA                    $1,377 (0.6%)
U.S. Treasury Bills            0.0%                    4.6%                   $965 (0.0%)

II. ON THE ARITHMETIC OF COMPOUND INTEREST: THE TIME VALUE OF MONEY

From our everyday experiences, we all recognize that we would not be indifferent to a

choice between a dollar to be paid to us at some future date (e.g., three years from now) or a

dollar paid to us today. Indeed, all of us would prefer to receive the dollar today. The

assumption implicit in this common-sense choice is that having the use of money for a period of

time, like having the use of an apartment or a car, has value. The earlier receipt of a dollar is

more valuable than a later receipt, and the difference in value between the two is called the time

value of money. This positive time value of money makes the choice among various

intertemporal economic plans dependent not only on the magnitudes of receipts and expenditures

associated with each of the plans but also upon the timing of these inflows and outflows.

Virtually every area in Finance involves the solution of such intertemporal choice problems, and

hence a fundamental understanding of the time value of money is an essential prerequisite to the

study of Finance. It is, therefore, natural to begin with those basic definitions and analytical tools

required to develop this fundamental understanding. The formal analysis, sometimes called the

arithmetic of compound interest, is not difficult, and indeed many of the formulas to be derived

may be quite familiar. However, the assumptions upon which the formulas are based may not be

so familiar. Because these formulas are so fundamental and because their valid application

depends upon the underlying assumptions being satisfied, it is appropriate to derive them in a

careful and axiomatic fashion. Then, armed with these analytical tools, we can proceed in

subsequent sections with the systematic development of finance theory. Although the emphasis

of this section is on developing the formulas, many of the specific problems used to illustrate

their application are of independent substantive importance.

A positive time value of money implies that rents are paid for the use of money. For goods

and services, the most common form of quoting rents is to give a money rental rate which is the

dollar rent per unit time per unit item rented. A typical example would be the rental rate on an

apartment which might be quoted as "$200 per month (per apartment)." However, a rental rate

can be denominated in terms of any commodity or service. For example, the wheat rental rate

would have the form of so many bushels of wheat rent per unit item rented. So the wheat rental

rate on an apartment might be quoted as "125 bushels of wheat per month (per apartment)."

In the special case when the unit of payment is the same as the item rented, the rental rate

is called the own rental rate, and is quoted as a pure percentage per unit time. So, for example, if

the wheat rental rate on wheat were ".01 bushels of wheat per month per bushel of wheat rented,"

then the rental rate would simply be stated as "1 percent per month." In general, the own rental

rate on an item is called that item's interest rate, and therefore, an interest rate always has the

form of a pure percentage per unit time.

Because it is so common to quote rental rates in terms of money, the money rental rate

(being an own rental rate) is called the money interest rate, or simply the interest rate, and the

rents received for the use of money are called interest payments. Moreover, as is well known, to

rent money from an entity is to borrow, and to rent money to an entity is to lend. If one borrows

money, he is a debtor, and if he lends money, he is a creditor.

Throughout this section, we maintain four basic assumptions:

(A.II.1) Certainty: There is no uncertainty about either the magnitude or timing of any payments. In particular, all financial obligations are paid in the amounts and at the time promised.

(A.II.2) No Satiation: Individuals always strictly prefer more money to less.

(A.II.3) No Transactions Costs: The interest rate at which an individual can lend in a given period is equal to the interest rate at which he can borrow in that same period. I.e., the borrowing and lending rates are equal.

(A.II.4) Price-Taker: The interest rate in a given period is the same for a particular individual independent of the amount he borrows or lends. I.e., the choices made by the individual do not affect the interest rate paid or charged.

In addition, we will frequently make the further assumption that the rate of interest in each

period is the same, and when such an assumption is made, that common per period rate will be

denoted by r. Although no specific institutional structure for borrowing or lending is presumed,

the reader may find it helpful to think of the described financial transactions as being between an

individual and a bank. Indeed, for expositional convenience, we will call loans made by

individuals, "deposits."

Compound Interest Formulas

Compound Value

Let \(V_n\) denote the amount of money an individual would have at the end of n periods if he initially deposits \(V_0\) dollars and allows all interest payments earned to be left on deposit (i.e., reinvested). \(V_n\) is called the compound value of \(V_0\) dollars invested for n periods. Suppose the interest rate is the same each period. At the end of the first period, the individual would have the initial amount \(V_0\) plus the interest earned, \(rV_0\), or \(V_1 = V_0 + rV_0 = (1+r)V_0\). If he redeposits \(V_1\) dollars for the second period at rate r, then \(V_2 = (1+r)V_1 = (1+r)[(1+r)V_0] = (1+r)^2 V_0\). Similarly, at the end of period (t-1), he will have \(V_{t-1}\) and redeposited, he will have \(V_t = (1+r)V_{t-1} = (1+r)^t V_0\) at the end of period t. Therefore, the compound value is given by

\[ V_n = (1+r)^n V_0, \tag{II.1} \]

and \((1+r)^n\) is called the compound value of a dollar invested at rate r for n periods.
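
As a quick numerical check of (II.1), the following Python sketch compares period-by-period reinvestment with the closed form. The function names and the sample figures (a $1,000 deposit at 6 percent for 10 periods) are illustrative assumptions and do not come from the notes.

```python
def compound_value(v0: float, r: float, n: int) -> float:
    """Closed form (II.1): V_n = (1 + r)^n * V_0, with r the same each period."""
    return v0 * (1.0 + r) ** n


def compound_value_by_reinvestment(v0: float, r: float, n: int) -> float:
    """Roll the deposit forward one period at a time: V_t = (1 + r) * V_{t-1}."""
    balance = v0
    for _ in range(n):
        balance += r * balance  # interest earned this period is left on deposit
    return balance


if __name__ == "__main__":
    v0, r, n = 1000.0, 0.06, 10  # hypothetical deposit, rate, and horizon
    print(f"closed form:  {compound_value(v0, r, n):.2f}")
    print(f"step by step: {compound_value_by_reinvestment(v0, r, n):.2f}")
```

The two results agree up to floating-point rounding, which is the content of the induction argument above.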

Problem II.1. "Doubling Your Money": Given that the interest rate is the same each period, how

many periods will it take before the individual doubles his initial deposit? This is the same as

asking how many periods does it take before the compound value equals twice the initial deposit

(i.e., \(V_n = 2V_0\)). Substituting into (II.1), we have that the number of periods required, n*, is

given by

\[ n^* = \frac{\log(2)}{\log(1+r)} = \frac{0.69315}{\log(1+r)} \tag{II.2} \]

where "log" denotes the natural logarithm (i.e., to the base e). Two "rules of thumb" used to

approximate n* in (II.2) are:

\[ n^* \approx \frac{72}{100r} \quad \text{("Rule of 72")} \tag{II.3} \]

and

\[ n^* \approx \frac{69}{100r} + 0.35 \quad \text{("Rule of 69")} \tag{II.4} \]

Of the two, the Rule of 69 is the more precise although the Rule of 72 has the virtue of requiring

only one number to remember. Both rules provide reasonable approximations to n*. For

example, if r equals 6 percent per annum, to one decimal place, the Rule of 72 gives n* = 12.0

years while the Rule of 69 and the exact solution give n* = 11.9 years. Moreover, in this day of

hand calculators, any more accurate estimates should simply be computed using (II.2). For

further discussion of these rules, see Gould and Weil (1974).
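
The comparison in Problem II.1 can be repeated for other rates with a short Python sketch. The function names and the list of sample rates are assumptions made for illustration; only formulas (II.2) through (II.4) are taken from the text.

```python
import math


def doubling_periods_exact(r: float) -> float:
    """Exact solution (II.2): n* = log(2) / log(1 + r), using natural logs."""
    return math.log(2.0) / math.log(1.0 + r)


def rule_of_72(r: float) -> float:
    """Rule of 72 (II.3): n* is approximately 72 / (100 r)."""
    return 72.0 / (100.0 * r)


def rule_of_69(r: float) -> float:
    """Rule of 69 (II.4): n* is approximately 69 / (100 r) + 0.35."""
    return 69.0 / (100.0 * r) + 0.35


if __name__ == "__main__":
    for r in (0.02, 0.06, 0.10, 0.15):  # illustrative rates only
        print(f"r = {r:.0%}: exact {doubling_periods_exact(r):.2f}, "
              f"Rule of 72 {rule_of_72(r):.2f}, Rule of 69 {rule_of_69(r):.2f}")
```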

Present Value of a Future Payment

The present value of a payment of $x, n periods from now, \(PV_n(x)\), is defined as the smallest number of dollars one would have to deposit today so that with it and cumulated interest, a payment of $x could be made at the end of period n. It is, therefore, equal to the number of dollars deposited today such that its compound value at the end of period n is $x. If one can earn at the same rate of interest r per period on all funds (including cumulated interest) for each of the n periods, then the present value can be computed by setting \(x = V_n\) in (II.1) and solving for \(V_0 = V_n/(1+r)^n = x/(1+r)^n\). I.e.,

\[ PV_n(x) = \frac{x}{(1+r)^n}, \tag{II.5} \]

and \(1/(1+r)^n\) is the present value of a dollar to be paid n periods from now.
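
A minimal Python sketch of (II.5) follows, assuming a constant per-period rate. The function name and the sample payment, rate, and horizon are hypothetical; the round trip shows that depositing the present value and reinvesting all interest reproduces the future payment.

```python
def present_value(x: float, r: float, n: int) -> float:
    """(II.5): PV_n(x) = x / (1 + r)^n for a constant per-period rate r."""
    return x / (1.0 + r) ** n


if __name__ == "__main__":
    x, r, n = 1000.0, 0.08, 3  # hypothetical payment, rate, and horizon
    deposit = present_value(x, r, n)
    # Depositing PV_n(x) today and reinvesting all interest grows back to x.
    grown = deposit * (1.0 + r) ** n
    print(f"deposit today: {deposit:.2f}, value after {n} periods: {grown:.2f}")
```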

If one were offered a payment of $x, n periods from now, what is the most that he would pay for this claim on a future payment today? The answer is \(PV_n(x)\). To see this, suppose that the cost of the future claim were \(P > PV_n(x)\). Further, suppose that instead of buying the future claim, he deposited $P today and reinvested all interest payments for n periods. At the end of n periods, he would have \(\$P(1+r)^n\), which by hypothesis is larger than \(PV_n(x)(1+r)^n = \$x\). I.e., he would have more money at the end of n periods by simply depositing the money rather than by purchasing the future claim for P. Therefore, he would be better off not to purchase the future claim.

If one owned a future claim on a payment of $x, n periods from now, what is the least amount that he would sell this claim for today? Again, the answer is \(PV_n(x)\). Suppose that the price offered for the future claim today were \(P < PV_n(x)\). If he sells, then he will have $P today. Suppose that, instead of selling the future claim, he borrows \(\$PV_n(x)\) today for one period. At the end of the first period, he will owe \(PV_n(x)\) plus interest, \(rPV_n(x)\), for a total of \((1+r)PV_n(x)\). If he pays off this loan and interest by borrowing \(\$(1+r)PV_n(x)\) for another period (i.e., he "refinances" the loan), then at the end of this (the second) period, he will owe \((1+r)PV_n(x)\) plus interest, \(r(1+r)PV_n(x)\), for a total of \((1+r)^2 PV_n(x)\). If he continues to refinance the loans in the same fashion for n periods, then at the end of period n, he will owe \((1+r)^n PV_n(x)\) or $x, which he can exactly pay off with the $x payment from the claim he owns. The net of these transactions is that he will have received \(\$PV_n(x)\) initially, which by hypothesis is larger than $P. I.e., he would have more money initially by borrowing the money "against" the future claim rather than by selling the future claim for $P, and therefore he would be better off not to sell the future claim.

In summary, if the price of the future claim, P, exceeds its present value, \(PV_n(x)\), then the individual would prefer to sell the claim rather than hold it (or if he did not own it, he would not buy it). If the price of the future claim, P, is less than its present value, \(PV_n(x)\), then the individual would prefer to hold it rather than sell it (or if he did not own it, he would buy it). Therefore, at \(P = PV_n(x)\), the individual would have no preference between buying, holding, or selling the future claim. Hence, the present value of a future payment is such that the individual would be indifferent between having that number of dollars today or having a claim on the future payment.
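
The borrow-and-refinance argument for the seller can be traced numerically. The Python sketch below is illustrative only: the function name and the sample claim ($1,000 due in 3 periods at 8 percent) are assumptions. It borrows the present value today, rolls the loan over each period, and shows that the balance owed at the end of period n equals the payment from the claim.

```python
from typing import List


def refinance_schedule(x: float, r: float, n: int) -> List[float]:
    """Amount owed at the end of each period when PV_n(x) is borrowed today
    and the loan plus interest is rolled over every period."""
    owed = x / (1.0 + r) ** n  # borrow PV_n(x) today
    schedule = []
    for _ in range(n):
        owed *= 1.0 + r  # principal plus one period of interest, refinanced
        schedule.append(owed)
    return schedule


if __name__ == "__main__":
    x, r, n = 1000.0, 0.08, 3  # hypothetical claim, rate, and horizon
    for t, owed in enumerate(refinance_schedule(x, r, n), start=1):
        print(f"end of period {t}: owe {owed:.2f}")
    # The final balance equals x, so the claim's payment exactly retires the loan.
```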

Present Value of Multiple Future Payments

The present value of a stream of payments with a schedule of \(\$x_t\) paid at the end of period t for \(t = 1, 2, \ldots, N\) is defined as the smallest number of dollars one would have to deposit today so that with it and cumulated interest, a payment of \(\$x_t\) could be made at the end of period t for each period t, \(t = 1, 2, \ldots, N\). We denote this present value by \(PV(x_1, x_2, \ldots, x_N)\). To derive the formula for its present value, we proceed as follows: Suppose that we establish today N separate bank accounts where in "Account #t," we deposit \(PV_t(x_t)\) dollars, \(t = 1, 2, \ldots, N\). If we let the interest payments accumulate in Account #t until the end of period t, then the amount of money in the account at that time will equal the compound value of \(PV_t(x_t)\). By the definition of the present value of a single future payment, we will have just enough money to make a payment of \(\$x_t\) at the end of period t by liquidating Account #t. If we follow this procedure for each of the N separate accounts, then we would be able to make exactly the schedule of payments required. Hence, the present value of the stream of payments with this schedule is equal to the total amount of deposits required for these N accounts. I.e.,

\[ PV(x_1, x_2, \ldots, x_N) = PV_1(x_1) + PV_2(x_2) + \cdots + PV_N(x_N) = \sum_{t=1}^{N} PV_t(x_t). \tag{II.6} \]

So, the present value of a stream of payments is just equal to the sum of the present values of each of the payments. Hence, if one can earn at the same rate of interest r per period on all funds (including cumulated interest) for each of the N periods, then from (II.5) and (II.6), we have that

\[ PV(x_1, x_2, \ldots, x_N) = \sum_{t=1}^{N} \frac{x_t}{(1+r)^t}. \tag{II.7} \]

As this derivation demonstrates, a claim on a stream of future payments is formally equivalent to a set of claims with one claim for each of the future payments. As was shown, an individual would be indifferent between having \(\$PV_t(x_t)\) today or a payment of \(\$x_t\) at the end of period t. It, therefore, follows that he would be indifferent between having \(\$PV(x_1, x_2, \ldots, x_N)\) today or a claim on the stream of future payments with the schedule of \(\$x_t\) paid at the end of period t for \(t = 1, 2, \ldots, N\).
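
The separate-accounts construction behind (II.6) and the closed form (II.7) can be checked against each other in a few lines of Python. This is a sketch under the constant-rate assumption; the function name and the sample schedule are hypothetical.

```python
from typing import Sequence


def present_value_of_stream(payments: Sequence[float], r: float) -> float:
    """(II.7): sum of x_t / (1 + r)^t, where payments[0] is paid at the end of period 1."""
    return sum(x / (1.0 + r) ** t for t, x in enumerate(payments, start=1))


if __name__ == "__main__":
    payments = [100.0, 200.0, 300.0]  # hypothetical schedule x_1, x_2, x_3
    r = 0.05
    # (II.6): fund one account per payment with PV_t(x_t) and add up the deposits.
    deposits = [x / (1.0 + r) ** t for t, x in enumerate(payments, start=1)]
    print(f"sum of separate deposits: {sum(deposits):.2f}")
    print(f"PV of the stream:         {present_value_of_stream(payments, r):.2f}")
```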

As may already be apparent, the present value concept is an important tool for the solution of intertemporal choice problems. For example, suppose that one has a choice between two claims: the first, call it "claim Y," provides a stream of payments of \(\$y_t\) at the end of period t for \(t = 1, 2, \ldots, N\), and the second, call it "claim X," provides a stream of payments of \(\$x_t\) at the end of period t for \(t = 1, 2, \ldots, N\). Which claim would one choose? We have already seen that one would be indifferent between having a claim on a stream of future payments or having its present value in dollars today. So one would be indifferent between having claim Y or \(\$PV(y_1, y_2, \ldots, y_N)\) today, and similarly, one would be indifferent between having claim X or \(\$PV(x_1, x_2, \ldots, x_N)\) today. Hence to make a choice between having \(\$PV(y_1, y_2, \ldots, y_N)\) today or \(\$PV(x_1, x_2, \ldots, x_N)\) today is formally equivalent to making a choice between claim Y or claim X. But, as long as one prefers more to less, the former choice is trivial to make: Namely, one would always prefer the larger of \(\$PV(y_1, y_2, \ldots, y_N)\) or \(\$PV(x_1, x_2, \ldots, x_N)\) today. Thus, one would prefer claim Y to claim X if \(PV(y_1, y_2, \ldots, y_N) > PV(x_1, x_2, \ldots, x_N)\), and would prefer claim X to claim Y if \(PV(y_1, y_2, \ldots, y_N) < PV(x_1, x_2, \ldots, x_N)\). Moreover, if the two present values are equal, then one would be indifferent between the two claims.

In the formal notation, both claim X and claim Y had the same number of payments:

namely N. However, nowhere was it assumed that some of the \(x_t\) or \(y_t\) could not be zero.

Thus, the timing of the payments need not be the same. Moreover, nowhere was it assumed that

some of the \(x_t\) or \(y_t\) could not be negative. Since the \(x_t\) or \(y_t\) represent cash payments to

the owner of the claim (i.e., a receipt) a negative magnitude for these variables is interpreted as a

cash payment from the owner of the claim (i.e., an expenditure). Indeed, it is entirely possible for

the present value of a stream of payments to be negative which simply means one would be

willing to make an expenditure and pay someone to take the claim. Hence, the present value tool

provides a systematic method for comparing claims whose schedules of payments can differ

substantially both with respect to magnitude and timing. While our illustration applied it to

choosing between two claims, it can obviously be extended to the problem of choosing from

among several claims. Its use in this intertemporal choice problem can be formalized as follows:

Present Value Rule:

If one must choose among several claims, then proceed by: first, computing the present

values of all the claims. Second, rank or order all the claims in terms of their present values from

the highest to the lowest. Third, if one must choose only one claim, then take the first claim (i.e.,

the one with the highest present value). More generally, if one must choose k claims out of a

larger group, then take the first k claims in the ordering (i.e., those claims with the k largest

present values in the group). This procedure for choosing among several claims is called the

Present Value Rule.
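
The three steps of the Present Value Rule (compute, rank, take the first k) translate directly into code. The Python sketch below is one possible rendering; the function names and the three sample claims are hypothetical, and a negative entry stands for an expenditure by the owner of the claim.

```python
from typing import Dict, List, Sequence


def present_value_of_stream(payments: Sequence[float], r: float) -> float:
    """Sum of x_t / (1 + r)^t, where payments[0] is paid at the end of period 1."""
    return sum(x / (1.0 + r) ** t for t, x in enumerate(payments, start=1))


def choose_claims(claims: Dict[str, Sequence[float]], r: float, k: int = 1) -> List[str]:
    """Present Value Rule: rank claims by present value, highest first, and take the first k."""
    ranked = sorted(
        claims,
        key=lambda name: present_value_of_stream(claims[name], r),
        reverse=True,
    )
    return ranked[:k]


if __name__ == "__main__":
    claims = {  # hypothetical claims; schedules differ in timing and sign
        "A": [100.0, 100.0, 100.0],
        "B": [0.0, 0.0, 330.0],
        "C": [250.0, 0.0, -20.0],
    }
    for r in (0.0, 0.10):
        print(f"r = {r:.0%}: best two claims are {choose_claims(claims, r, k=2)}")
```

Running it at two different rates shows that the ordering of the same claims can change with the interest rate.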

Note that if the rate of interest in every period were zero, then the present value of a

stream of payments is just equal to the sum of all the payments (i.e., \(PV(x_1, x_2, \ldots, x_N) = \sum_{t=1}^{N} x_t\)).

In this case, the Present Value Rule would simply say "choose that claim which pays one the

most money in total (without regard to when the payments are received)." However, because of

the time value of money, the interest rate will not be zero, and no such simple rule will apply.

That one cannot rank or choose between alternative claims without taking into account the

specific interest rate available is demonstrated by the following problem:

Problem II.2. Choosing Between Claims: Suppose that one has a choice between "claim X"

which pays $100 at the end of each year for ten years or "claim Y" which provides for a single

payment of $900 at the end of the third year. Given that the interest rate will be the same each

year for the next ten years, which one should be chosen? The Present Value Rule says "Choose

the one with the larger present value." However, as the following table demonstrates, the claim

chosen depends upon the interest rate.

Interest Rate, r   Present Value of Claim X   Present Value of Claim Y
 0%                $1000                      $900
 2%                  898                       848
 5%                  772                       777
 8%                  671                       714
10%                  614                       676
12%                  565                       641

While the present values of both claims decline as one moves in the direction of higher interest

rates, the rate of decline in the present value of Claim Y is smaller than the rate of decline for

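Claim X. Hence, for interest rates below the rate at which the two present values are equal (a little under 5 percent, at roughly 4.7 percent), one should choose Claim X, and for rates

above it, including 5 percent itself, one should choose Claim Y.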
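
The table in Problem II.2 can be reproduced, and the rate at which the choice switches located, with the Python sketch below. The helper names, the bisection bracket of 0 to 12 percent, and the tolerance are assumptions for illustration; the bisection relies on the difference in present values changing sign exactly once inside that bracket, which the tabulated values suggest.

```python
from typing import Sequence


def pv_stream(payments: Sequence[float], r: float) -> float:
    """Sum of x_t / (1 + r)^t, where payments[0] is paid at the end of period 1."""
    return sum(x / (1.0 + r) ** t for t, x in enumerate(payments, start=1))


CLAIM_X = [100.0] * 10        # $100 at the end of each year for ten years
CLAIM_Y = [0.0, 0.0, 900.0]   # a single $900 payment at the end of year three


def switching_rate(lo: float = 0.0, hi: float = 0.12, tol: float = 1e-10) -> float:
    """Bisection on PV(X) - PV(Y), which is positive at lo and negative at hi."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if pv_stream(CLAIM_X, mid) > pv_stream(CLAIM_Y, mid):
            lo = mid  # X still preferred: the switch lies at a higher rate
        else:
            hi = mid
    return (lo + hi) / 2.0


if __name__ == "__main__":
    for r in (0.0, 0.02, 0.05, 0.08, 0.10, 0.12):
        print(f"{r:>4.0%}  X: {pv_stream(CLAIM_X, r):7.0f}  Y: {pv_stream(CLAIM_Y, r):7.0f}")
    print(f"present values are equal near r = {switching_rate():.3%}")
```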

The result obtained here that one claim is chosen over the other for some interest rates

and the reverse choice is made for other interest rates often occurs in choice problems and is

called the switching phenomenon. It is called this because an individual would "switch" his

choice if he were faced with a sufficiently different interest rate. Hence, without knowing the

interest rate, the choice between two claims will, in general, be ambiguous. So, in general,

unqualified questions like "which claim is better?" will not be well posed without reference to

the specific environment in which the choice must be made. Note, however, that for a specified

interest rate, the present value of each claim is uniquely determined, and therefore the choice

between them at that interest rate level is always unambiguous.

In Problem II.2, it was stressed that, in general, the solution to the problem of choosing

among alternative claims will depend upon the interest rate at which the individual can borrow or

lend. However, it is equally important to stress that the solution depends only upon that interest

rate. Specifically, given that rate of interest, the solution is not altered by the existence of other

claims that an individual owns (i.e., his endowment). Moreover, the solution does not depend

upon whether he plans to use the payments received for current consumption or to save them for

consumption in the future. That is, the solution does not depend upon the individual's

preferences or tastes for future consumption. While this demonstrated independence of the

solution to either the individual's tastes or endowments has far-ranging implications for the

theory of Finance, further discussion is postponed to Section III where the general intertemporal

choice problem for the individual is systematically examined.

Continuous Compounding

It is not uncommon to see an interest rate quoted as "R% per year, compounded n times

a year." For example, a bank might quote its rate on deposits as "7% per year, compounded

quarterly (i.e., every three months or four times a year)" or "7% per year, compounded monthly

(i.e., every month or twelve times a year)." Provided that funds are left on deposit until the end

of a compounding date, such quotations can be interpreted to mean that n times a year, the

account is credited with cumulated interest earned at the rate, (R/n), per period of (1/n) years.

The "true" annual rate of interest, call it i_n, when there are n such compoundings per year can
be derived using the compound value formula (II.1). From that formula, one dollar will grow to
$(1 + R/n)^n in one year, and therefore,

(II.8)    1 + i_n = (1 + R/n)^n.

By inspection of (II.8), for a given value of R, more frequent compoundings (i.e., larger n)
result in a larger "true" annual interest rate, i_n. The limiting case of n → ∞ is called
continuous compounding, and the limit of (II.8) is

(II.9)    1 + i_∞ = e^R

where "e" is a constant equal to 2.7183..., and e^R is called the exponential factor. The
difference between the true or effective annual rate i_∞ and the stated rate R will be larger, the
larger is R although for typical interest rates, this difference will not be large. For example, at a
stated rate of R = 5%, i_∞ = 5.13%. However, the cumulative difference in compound value for

higher interest rates and over several years can be significant as is illustrated in the following

table:

Compound Value of $100 at the End of N Years

           At 10%         At 10% per Year,
   N      per Year     Compounded Continuously

   1     $  110.00          $  110.52
   2        121.00             122.14
   5        161.05             164.87
  10        259.37             271.83
  15        417.72             448.17
  20        672.75             738.91
  30      1,744.93           2,008.55

One can, of course, invert the original question and ask "What continuously-compounded
rate, r_c, will produce a "true" annual interest rate, r?" From (II.9), we have that

(II.10)    e^{r_c} ≡ 1 + r,

or by taking (natural) logarithms of both sides of (II.10), we can rewrite (II.10) as

(II.11)    r_c ≡ log(1 + r).

In the analysis of interest rate problems, it is frequently more convenient to work with the
continuously-compounded rate, r_c, rather than the actual rate, r. For example, in Problem II.1,
we derived a formula for the number of periods required to double our money, n*. Substituting
from (II.11) into (II.2), we have that

(II.12)    n* = log(2)/r_c = .69315/r_c.
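
A minimal Python sketch of formulas (II.8) through (II.12) follows. The function names are illustrative and not part of the text; the printed checks correspond to figures quoted in this section.

```python
import math

def effective_rate(R, n):
    """True annual rate i_n from (II.8) for stated rate R compounded n times a year."""
    return (1 + R / n) ** n - 1

def effective_rate_continuous(R):
    """Limit of (II.8) as n grows without bound, i.e. (II.9)."""
    return math.exp(R) - 1

def continuous_rate(r):
    """Continuously-compounded rate r_c equivalent to annual rate r, from (II.11)."""
    return math.log(1 + r)

def doubling_time(r):
    """Number of periods needed to double one's money, from (II.12)."""
    return math.log(2) / continuous_rate(r)

print(effective_rate(0.07, 4), effective_rate(0.07, 12))  # quarterly vs. monthly
print(effective_rate_continuous(0.05))   # about 0.0513, as in the text
print(100 * math.exp(0.10 * 10))         # about 271.83, as in the table
print(doubling_time(0.08))               # years to double at 8% per year
```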

If, in addition, one approximates the stream of payments from a claim, {x_t}, by a
continuous stream of payments, {x(t)}, then the discrete-time formula for the present value of a
stream of payments, (II.7), can be approximated by the integral formula,

(II.13)    PV(x_1, x_2, ..., x_N) ≈ \int_0^N x(t) e^{-r_c t} dt,

and in some cases, the integral expression in (II.13) provides an easier way to compute the
present value than its discrete-time counterpart in (II.7).

Annuity Formulas

A claim which provides for a stream of payments of equal fixed amounts at the end of

each period for a specified number of periods is called an annuity. Suppose that one owned an

annuity claim which pays $y at the end of each year for N years. How much money would one

have at the end of year N if payments are immediately deposited in an account which earns r%

per year (on both cumulated interest and the initial deposit) in each year? Using the compound

value formula, (II.1), we have that:

    year 1's payment will grow to        y(1+r)^{N-1}
    year 2's payment will grow to        y(1+r)^{N-2}
    year 3's payment will grow to        y(1+r)^{N-3}
    . . . . . . . . . . . . . . . . . .
    year (N-1)'s payment will grow to    y(1+r)
    year N's payment will grow to        y.

Hence, the total amount accumulated, S_N, will be the sum of all N terms. I.e.,

    S_N = \sum_{t=1}^{N} y(1+r)^{N-t} = y \sum_{t=0}^{N-1} (1+r)^t.

To further simplify the formula, we make a brief digression to develop a mathematical formula.
The sum of a geometric progression, 1 + x + x^2 + ... + x^{N-1} = \sum_{t=0}^{N-1} x^t, is given
by the formula

(II.14)    \sum_{t=0}^{N-1} x^t = (x^N - 1)/(x - 1).

From (II.14), we also have that

(II.14a)   \sum_{t=1}^{N} x^t = x(x^N - 1)/(x - 1).

Applying (II.14) with x = 1 + r to the expression for S_N, we can rewrite it as

(II.15)    S_N = y[(1+r)^N - 1]/r.

N

S_N is called the compound value of an annuity, and [(1+r)^N - 1]/r is called the annuity
compound value factor.

Maintaining the assumption that the interest rate is the same each year, what is the present

value of an annuity (denoted by A_N)? From (II.7), we have that

    A_N = \sum_{t=1}^{N} y/(1+r)^t = y \sum_{t=1}^{N} 1/(1+r)^t.

From (II.14a) with x = 1/(1+r), we can rewrite the expression for the present value as

(II.16)    A_N = y[1 - 1/(1+r)^N]/r

and [1 - 1/(1+r)^N]/r is called the annuity present value factor.

Formula (II.16) could have been derived by a different (but equivalent) method. From
(II.15), we know that an N-year annuity paying $y per year is equivalent to a claim which
provides a single payment of $S_N paid at the end of year N. From (II.5), we have that
PV_N(S_N) = S_N/(1+r)^N. But, the present values of two equivalent streams are the same, and
therefore A_N = S_N/(1+r)^N. The reader may verify that this is the case by inspection of (II.15)
and (II.16).
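
The two annuity factors can be checked numerically. The sketch below, with illustrative function names and arbitrary input values, compares the closed forms (II.15) and (II.16) against the term-by-term sums they replace, and confirms the equivalence between the present value and the discounted compound value.

```python
def annuity_compound_value(y, r, N):
    """S_N from (II.15): value at year N of $y deposited at the end of each year."""
    return y * ((1 + r) ** N - 1) / r

def annuity_present_value(y, r, N):
    """A_N from (II.16): present value of $y paid at the end of each of N years."""
    return y * (1 - 1 / (1 + r) ** N) / r

y, r, N = 100.0, 0.08, 25   # illustrative values, not from the text

# Brute-force sums that the closed forms are meant to replace.
s_direct = sum(y * (1 + r) ** (N - t) for t in range(1, N + 1))
a_direct = sum(y / (1 + r) ** t for t in range(1, N + 1))

print(annuity_compound_value(y, r, N), s_direct)
print(annuity_present_value(y, r, N), a_direct)
# Equivalence noted in the text: A_N = S_N / (1+r)^N.
print(annuity_compound_value(y, r, N) / (1 + r) ** N)
```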

Note that if one has an N-period annuity at time (t=) zero, then this same claim will
become an (N-1) period annuity at time t = 1, and at time t, it will be an (N-t) period annuity.
Hence, the change in the present value of an N-period annuity over one period is equal to
A_{N-1} - A_N, and from (II.16), can be written as

(II.17)    A_{N-1} - A_N = -y/(1+r)^N.

Inspection of (II.17) shows that the present value of an annuity declines each period until at time

t = N (called its expiration date), its present value is zero. Note further that the rate of decline is

larger the closer the annuity is to its expiration date. However, in the special limiting case of a

perpetual annuity or perpetuity where N = ∞, the present value remains unchanged through

time, and is given by

(II.18)    A_∞ = y/r.

Problem II.3. Mortgage Payment Calculations: Probably the annuity claim with which

households are most familiar is the mortgage which is a specific form of loan used to finance the

purchase of a house. The terms of a standard or conventional mortgage call for the borrower to

repay the loan with interest by making a series of periodic payments of equal size for a specified

length of time. In effect, the house buyer "issues" to the lender (usually a bank) an annuity claim

in exchange for cash today. Typically, the length of time, the periodicity of the payments, and

the interest rate are quoted by the bank. Given this information, one can then determine the size

of the periodic payments as a function of the amount of money to be borrowed. Suppose the

bank quotes its mortgage terms as follows: the length of the mortgage's life or term is 25 years;

the periodicity of the payments is once a year; and the interest rate charged is 8 percent per year.

If the amount of money to be borrowed is $30,000, then what will be the annual payments

required? To solve this problem, we use formula (II.16). The amount of money received in

return for the annuity, $30,000, equals the present value of the annuity, A_N. The number of
payments, N, equals 25, and the annual interest rate, r, equals .08. Thus, the required annual
payments, y, are given by the formula

(II.19)    y = rA_N/[1 - 1/(1+r)^N].

The annuity present value factor for r = .08 and N = 25 equals 10.675. Therefore, y =

$30,000/10.675 or approximately $2810 per year.

Although the size of the payments remains the same over the life of the mortgage, the

amount of money actually borrowed (called the principal of the loan) does not. In addition to

covering interest payments, a portion of each year's payment is used to reduce the principal. In

the example above, during the first year of the mortgage, the amount of money borrowed is

$30,000, and therefore, the interest part of the payment is .08 × $30,000 or $2,400. However,

because the total payment made is $2,810, the balance after interest, $410, is used to reduce the

principal. Hence, for the second year in the life of the mortgage, the amount actually borrowed is

not $30,000, but $29,590. The following table illustrates how the payments are
distributed between interest payments and principal reduction over the life of the mortgage.

25-Year 8% Mortgage: Distribution of Payments

                      Interest Payments      Principal Reduction     Amount of Loan
Year  Total Payment   Amount   % of Total    Amount   % of Total      Outstanding

  1      $2,810       $2,400     85.4%       $  410     14.6%          $29,590
  2       2,810        2,367     84.2           443     15.8            29,147
  5       2,810        2,252     80.1           558     19.9            27,589
 10       2,810        1,990     70.8           820     29.2            24,052
 15       2,810        1,605     57.1         1,205     42.9            18,855
 20       2,810        1,039     37.0         1,771     63.0            11,220
 25       2,810          208      7.4         2,602     92.6                 0

Note that early in the life of the mortgage, almost all of the total payment goes for interest

payments. However, by the seventeenth year, the distribution of the payment is approximately

half interest payment and half principal reduction, and as the mortgage approaches its expiration

date, virtually all the payment goes for the reduction of principal.
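
The split of each payment between interest and principal can be generated year by year. The following Python sketch is one way to do so under the terms of Problem II.3; the function names are illustrative. Because it uses the unrounded payment from (II.19) rather than $2,810, its figures may differ from the table by a few dollars.

```python
def level_payment(loan, r, N):
    """Annual payment y from (II.19) for a loan repaid in N equal instalments."""
    return r * loan / (1 - 1 / (1 + r) ** N)

def amortization_schedule(loan, r, N):
    """Yield (year, payment, interest, principal_reduction, balance) per year."""
    y = level_payment(loan, r, N)
    balance = loan
    for year in range(1, N + 1):
        interest = r * balance          # interest on the amount outstanding
        principal = y - interest        # the remainder reduces the principal
        balance -= principal
        yield year, y, interest, principal, balance

schedule = list(amortization_schedule(30_000, 0.08, 25))
for year, y, interest, principal, balance in schedule:
    if year in (1, 2, 5, 10, 15, 20, 25):
        print(f"{year:2d} {y:9.2f} {interest:9.2f} {principal:9.2f} {balance:10.2f}")
```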

The general case for the distribution of the payments between interest and principal

reduction can be solved by using formulas (II.16) and (II.17). Because the amount of the

mortgage outstanding always equals its present value, the principal at time t, A_{N-t}, is given by
A_{N-t} = y[1 - 1/(1+r)^{N-t}]/r. We can rewrite this expression in terms of the initial size of the
mortgage, A_N, as

(II.20)    A_{N-t} = A_N[(1+r)^N - (1+r)^t]/[(1+r)^N - 1].

Moreover, the change in principal between t and t + 1 is equal to A_{N-t-1} - A_{N-t}, which from
(II.17) can be written as

(II.21)    A_{N-t-1} - A_{N-t} = -y/(1+r)^{N-t},

and the percentage of the total payment used to reduce principal between t and t + 1 can be
written as

(II.22)    [A_{N-t} - A_{N-t-1}]/y = 1/(1+r)^{N-t}.
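
As a check on the closed forms, the sketch below evaluates (II.20) and (II.22) for the mortgage of Problem II.3 and compares the share of a payment that reduces principal, computed from successive balances, with the formula. Function names are illustrative.

```python
def principal_outstanding(loan, r, N, t):
    """A_{N-t} from (II.20): loan balance after t of N payments have been made."""
    return loan * ((1 + r) ** N - (1 + r) ** t) / ((1 + r) ** N - 1)

def principal_share(r, N, t):
    """Fraction of the payment made between t and t+1 that reduces principal, (II.22)."""
    return 1 / (1 + r) ** (N - t)

loan, r, N = 30_000, 0.08, 25
y = r * loan / (1 - 1 / (1 + r) ** N)   # level payment, from (II.19)

for t in (0, 9, 16, 24):
    reduction = (principal_outstanding(loan, r, N, t)
                 - principal_outstanding(loan, r, N, t + 1))
    # Columns: payment number, share from balances, share from (II.22).
    print(t + 1, round(reduction / y, 4), round(principal_share(r, N, t), 4))
```

With t = 16 (the seventeenth payment) the share first exceeds one half, in line with the remark following the table.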

Problem II.4. Saving for Retirement: A bank recently advertised that if one would deposit $100

a month for twelve years, then at that time, the bank would pay the depositor $100 a month

forever. This is an example of a regular saving plan designed to produce a perpetual stream of

income later, and frequently arises in analyses of retirement plans. For example, how many years

in advance of retirement should one begin to save $X a year so that at retirement, one would

receive $C a year forever?

If it is assumed that the annual rate of interest is the same in each year and if one starts

saving T years prior to retirement, then from formula (II.15), a total of $X[(1+r)^T - 1]/r will

have been accumulated by the retirement date. From formula (II.18), it will take $C/r at that

time to purchase a perpetual annuity of $C per year. Hence, the required number of years of

saving is derived by equating the accumulated sum to the cost of the annuity. By taking the

logarithms of both sides and rearranging terms, we have that

(II.23)    T = log[1 + C/X]/log[1 + r],

or alternatively, using (II.11), we can rewrite (II.23) in terms of the equivalent
continuously-compounded interest rate as

(II.24)    T = log[1 + C/X]/r_c.

Note that for a fixed ratio of C/X, the length of time required is inversely proportional to the

(continuously-compounded) interest rate. So, if that rate is doubled, then the required saving

period is halved. In the special case where C = X, (II.24) reduces to

(II.25)    T = 0.69315/r_c

where 0.69315 ≈ log(2). Comparing (II.25) with (II.2), the number of years of required saving is

exactly equal to the number of years it takes to "double your money," and therefore a "quick"

solution for T can be obtained by using either the Rule of 72 or the Rule of 69. Applying (II.25)

to the bank advertisement, we can derive the monthly interest rate implied by the bank to be 0.48

percent per month or 5.93 percent per year.
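
A small Python sketch of Problem II.4 follows. `years_of_saving` implements (II.23); `implied_rate` is a hypothetical helper, not in the text, that inverts (II.23) to recover the per-period rate from a quoted saving period, as is done for the bank advertisement.

```python
import math

def years_of_saving(C, X, r):
    """T from (II.23): periods of saving $X per period to fund $C per period forever."""
    return math.log(1 + C / X) / math.log(1 + r)

def implied_rate(C, X, T):
    """Per-period rate r that makes T periods of saving exactly sufficient."""
    return (1 + C / X) ** (1 / T) - 1

# With C = X the answer coincides with the doubling time at the same rate.
print(years_of_saving(C=100, X=100, r=0.06))

# The bank advertisement: $100 a month for twelve years, then $100 a month forever.
monthly = implied_rate(C=100, X=100, T=12 * 12)
print(monthly)   # implied rate per month, as a decimal
```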

Problem II.5. The Choice Between a Lump-Sum Payment or an Annuity at Retirement: Having

participated in a pension plan, it is not uncommon for the individual to be offered the choice at

retirement between a single, lump-sum payment or a lifetime annuity. Suppose one is offered a

choice between a single payment of $x or an annuity of $y per year for the rest of his life.

Given that the interest rate at which he can invest for the rest of his life is r, which should he

choose? Provided that y > rx, the proper choice depends upon the number of years that the

individual will live. Clearly, if he expects to live long enough, then he should choose the

annuity. Otherwise, he should take the lump-sum payment. We can determine the "switch point"

in terms of life expectancy by solving for the number of years, N* , such that the present value

of the annuity is just equal to the lump-sum payment x. Substituting x for AN in (II.16) and

rearranging terms, we have that

(II.25)    N* = log[y/(y - rx)]/log[1 + r].

Hence, if he expects to live longer than N* years, then he should choose the annuity.
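
The switch point of Problem II.5 is easy to compute. In the sketch below the offer (a lump sum of $100,000 against $10,000 a year at 5 percent) is hypothetical, and the function name is illustrative. When y does not exceed rx, the interest on the lump sum alone covers the annuity payment forever, so the function returns infinity.

```python
import math

def switch_point(x, y, r):
    """N* at which an annuity of $y a year is worth exactly the lump sum $x."""
    if y <= r * x:
        # Interest on the lump sum covers y indefinitely: no finite switch point.
        return math.inf
    return math.log(y / (y - r * x)) / math.log(1 + r)

x, y, r = 100_000.0, 10_000.0, 0.05    # hypothetical offer
n_star = switch_point(x, y, r)
print(n_star)

# At N = N*, the annuity's present value from (II.16) equals the lump sum.
print(y * (1 - 1 / (1 + r) ** n_star) / r)
```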

Problem II.6. Tax-Deferred Saving for Retirement: Under certain provisions of the tax code,

individuals are permitted to establish tax-deferred savings plans for retirement (e.g., Individual

Retirement Accounts or Keogh Plans). Contributions to these plans are deductible from current

income for tax purposes and interest on these contributions is not taxed when earned. These

plans are called "tax-deferred" rather than "tax-free" because any amounts withdrawn from the

plan are taxed at that time. Suppose that an individual faces a proportional tax rate of τ which is

the same each period and that the interest rate r is the same each period. Further suppose that

he contributes $y each year to the plan until he retires N years from now at which time he

begins a withdrawal program on an annuity basis for n years. Assuming that his first

contribution to the plan takes place one year from now, what is the economic benefit of the tax-

deferred saving plan over an ordinary saving plan?

Using formula (II.15), his total before-tax amount accumulated at retirement, S_N,
is $y[(1+r)^N - 1]/r. From formula (II.16), he can generate a withdrawal plan of
$q = rS_N/[1 - 1/(1+r)^n] per year for n years from this accumulated sum. However, he must
pay taxes of $τq each year on the withdrawals. Hence, the tax-deferred plan will produce an
after-tax stream of payments for n years beginning at retirement of

(II.26)    $q_1 = (1-τ)y[(1+r)^N - 1]/[1 - 1/(1+r)^n].

If, instead, he had chosen an ordinary saving plan, he would have had to pay $τy additional

taxes each year during the accumulation period because contributions to an ordinary saving plan

are not deductible. So, without changing his expenditures on other items during the

accumulation period, he could only contribute $(1-τ)y each year. Moreover, the interest

earned in an ordinary saving plan is taxable at the time it is earned. Therefore, instead of earning

at the rate r each year on invested money, he only receives rate (1-τ)r after tax. Again using
formula (II.15), his total amount accumulated at retirement from the ordinary saving plan, S_2,
is $(1-τ)y[(1+(1-τ)r)^N - 1]/[(1-τ)r]. Because he has paid the taxes on contributions and
interest along the way, the $S_2 accumulated is not subject to further tax. However, any interest
earned on invested money during the subsequent withdrawal period is taxed at rate τ. Thus,
from formula (II.16), he can generate an after-tax withdrawal plan of
$q_2 = (1-τ)rS_2/[1 - 1/(1+(1-τ)r)^n] per year for n years which can be rewritten as

(II.27)    $q_2 = (1-τ)y[(1+(1-τ)r)^N - 1]/[1 - 1/(1+(1-τ)r)^n].

Clearly, the tax-deferred plan provides a positive benefit because q1 > q2. Inspection of

(II.26) and (II.27) shows that this differential can be expressed in terms of a higher effective

interest rate on accumulations in the tax-deferred plan. Specifically, the tax-deferred plan is

formally equivalent to having an ordinary saving plan where the interest earned is not taxed.
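
The comparison between (II.26) and (II.27) can be sketched in Python as follows. The two formulas differ only in the interest rate at which money accumulates and is drawn down, which the shared helper makes explicit. The inputs are hypothetical and the function names illustrative.

```python
def withdrawal_factor(rate, N, n):
    """[(1+rate)^N - 1] / [1 - 1/(1+rate)^n], common to (II.26) and (II.27)."""
    return ((1 + rate) ** N - 1) / (1 - 1 / (1 + rate) ** n)

def q_tax_deferred(y, r, tau, N, n):
    """After-tax annual withdrawal q_1 from (II.26): money grows at the full rate r."""
    return (1 - tau) * y * withdrawal_factor(r, N, n)

def q_ordinary(y, r, tau, N, n):
    """After-tax annual withdrawal q_2 from (II.27): money grows at (1 - tau) * r."""
    return (1 - tau) * y * withdrawal_factor((1 - tau) * r, N, n)

y, r, tau, N, n = 2_000.0, 0.08, 0.30, 30, 20   # hypothetical inputs
print(q_tax_deferred(y, r, tau, N, n))
print(q_ordinary(y, r, tau, N, n))
```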

Problem II.7. The Choice Between Buying or Renting a Consumer Durable: For most large

consumer durables (e.g., a house or car), the individual can either choose to buy the good or rent

it. Suppose an individual faces the decision of whether to buy a house for $I or rent it where the

annual rental charge is $X per year. If he buys the house, then he must spend $M for

maintenance and $PT for property taxes each year. These are both included in the rent.

Suppose that the individual faces a proportional tax rate of τ which is the same each period and

that the interest rate r is the same each period. His problem is to choose the method of

obtaining housing services with the lowest (present value of) cost.

The present value of cost equals the discounted value of the after-tax outflows discounted

at the after-tax rate of interest, (1-τ)r. Because property taxes can be deducted from income

for federal income tax purposes, the after-tax outflow for property taxes each year is (1-τ)PT.
Hence, the cost of owning the house, PCO, can be written as

(II.28)    PCO = I + \sum_{t=1}^{∞} [M + (1-τ)PT]/(1+(1-τ)r)^t
               = I + PT/r + M/[(1-τ)r]

where we have assumed that the (properly-maintained) house continues in perpetuity and applied
the annuity formula. Similarly, the cost of renting the house, PCR, can be written as

(II.29)    PCR = \sum_{t=1}^{∞} X/(1+(1-τ)r)^t
               = X/[(1-τ)r].

Hence, if PCR > PCO, then it is better to own rather than rent. Of course, the relationship

between PCR and PCO depends upon the rent charged. In a competitive market, the rent

charged should be such that the landlord earns a return competitive with alternative investments.

Hence, X should be such that the present value of the after-tax cash flows to the landlord equals

the cost of his investment I. The pretax net cash flow to the landlord each year is (X-M-PT). In

computing his tax liability, the landlord can deduct depreciation, D, a non-cash item. Hence,

his taxes are τ(X-M-PT-D), where τ is his proportional tax rate. Therefore, his after-tax
cash flow is (X-M-PT)(1-τ) + τD. Discounting these after-tax cash flows at his after-tax
interest rate, (1-τ)r, we have that X must satisfy
I = [(X-M-PT)(1-τ) + τD]/[(1-τ)r], or

(II.30)    X = rI + M + PT - τD/(1-τ).

From (II.28), (II.29), and (II.30), we have that the cost saving of owning over renting can be

written as

(II.31)    PCR - PCO = τ[I + PT/r]/(1-τ) - τD/[(1-τ)(1-τ)r].

The advantage to ownership is that one is not taxed on the rent paid to oneself. The disadvantage

is that one cannot take a tax deduction for the (non-cash) depreciation item. So if the

depreciation rate on the property is high or the individual is in a low tax bracket, then renting is

less costly. On the other hand, if property taxes are high and the individual is in a high tax

bracket, then owning is probably less costly.
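
The buy-versus-rent comparison can be sketched numerically. The Python fragment below computes the competitive rent from (II.30), the two present-value costs from (II.28) and (II.29), and the closed-form difference (II.31). All input figures are hypothetical and the function names are illustrative. As in the text, the landlord and the individual are assumed to face the same tax rate τ.

```python
def competitive_rent(I, M, PT, D, r, tau):
    """Rent X from (II.30) at which the landlord just earns the after-tax rate."""
    return r * I + M + PT - tau * D / (1 - tau)

def cost_of_owning(I, M, PT, r, tau):
    """PCO from (II.28)."""
    return I + PT / r + M / ((1 - tau) * r)

def cost_of_renting(X, r, tau):
    """PCR from (II.29)."""
    return X / ((1 - tau) * r)

def owning_advantage(I, PT, D, r, tau):
    """PCR - PCO from (II.31); a positive value means owning is cheaper."""
    return tau * (I + PT / r) / (1 - tau) - tau * D / ((1 - tau) ** 2 * r)

# Hypothetical house: price, maintenance, property tax, depreciation, rate, tax rate.
I, M, PT, D, r, tau = 100_000.0, 1_000.0, 1_500.0, 2_500.0, 0.08, 0.30
X = competitive_rent(I, M, PT, D, r, tau)
print(cost_of_renting(X, r, tau) - cost_of_owning(I, M, PT, r, tau))
print(owning_advantage(I, PT, D, r, tau))   # should agree with the line above
```

Raising D in this sketch eventually turns the difference negative, which is the case where renting is less costly.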

"Pure" Discount Loan

A pure discount loan calls for the borrower to repay the loan with interest by making a

single lump-sum payment to the lender at a specified future date called the maturity or expiration

date. Hence, unlike an annuity-type loan, there are no interim payments made to the lender. This

form of loan is most common for short maturity loans, and the best known examples are U.S.

Treasury Bills and corporate commercial paper. If it is assumed that the interest rate is the same

each period, then the present value of a discount loan (denoted by D_N) which has a promised
payment of $M to be paid N periods from now can be written as

(II.32)    D_N = M/(1+r)^N.

If one has an N-period discount loan at time (t=) zero, then this same loan will become an (N-1)
period discount loan at time t = 1, and at time t, it will be an (N-t) period discount loan.
Hence, the change in the present value of an N-period discount loan over one period is equal to
D_{N-1} - D_N, and from (II.32), can be written as

(II.33)    D_{N-1} - D_N = rM/(1+r)^N
                         = rD_N.

Inspection of (II.33) shows that unlike an annuity, the present value of a discount loan increases

each period until at t = N, its present value is M. Hence, the amount of money actually

borrowed increases over the life of the loan. The rate of increase each period is the same and

equal to the interest rate r.

"Interest-Only" Loans

Another common form for a loan is an "interest-only" loan which calls for the borrower to

make a series of periodic payments equal in amount to the interest payments for a specified

length of time and, in addition, at the end of that length of time, to make a single payment equal

to the initial amount borrowed (i.e., the principal). The periodic payments are called coupon

payments, and the single, lump-sum (or "balloon") payment at the end is called the return of

principal or simply the principal payment. This form of loan is most common for long maturity

loans, and the best known examples are U.S. Treasury Notes and corporate bonds.

The structure of "interest-only" loans is a mixture of the annuity and pure discount forms

of loans. With the exception of the principal payment, the payment patterns are like those of an

annuity because the coupon payments are all the same size. Like a discount loan, there is a

lump-sum payment at the maturity date. However, unlike both the annuity and discount loans,

the amount of the loan outstanding or the principal remains the same throughout the term of the

loan. If it is assumed that the interest rate is the same each period, then the present value of an

interest-only loan (denoted by I_N) which has a coupon payment of $C per period and a
balloon payment of $M can be written as

(II.34)    I_N = \sum_{t=1}^{N} C/(1+r)^t + M/(1+r)^N
               = C[1 - 1/(1+r)^N]/r + M/(1+r)^N.

If the initial amount borrowed is $M and the coupon is set equal to the interest on the amount

borrowed (i.e., C = rM), then substituting into (II.34), we have that

(II.35)    I_N = M

independent of N. Hence, the present value of the loan remains the same over the life of the

loan.
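
The contrasting time paths of the three loan forms can be tabulated with a short Python sketch. The loan terms below are hypothetical and the function names illustrative; the formulas are (II.16), (II.32), and (II.34), evaluated with the number of periods remaining.

```python
def annuity_value(y, r, periods_left):
    """Present value of a level-payment loan, from (II.16)."""
    return y * (1 - 1 / (1 + r) ** periods_left) / r

def discount_loan_value(M, r, periods_left):
    """Present value of a pure discount loan, from (II.32)."""
    return M / (1 + r) ** periods_left

def interest_only_value(C, M, r, periods_left):
    """Present value of an interest-only loan, from (II.34)."""
    return annuity_value(C, r, periods_left) + discount_loan_value(M, r, periods_left)

r, N, M = 0.08, 10, 1_000.0            # hypothetical terms
for t in range(N + 1):
    left = N - t
    print(t,
          round(annuity_value(100.0, r, left), 2),           # falls to zero
          round(discount_loan_value(M, r, left), 2),         # rises to M
          round(interest_only_value(r * M, M, r, left), 2))  # stays at M
```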

Compound and Present Values When the Interest Rate Changes Over Time

To this point, all the formulas were derived using the assumption that the interest rate at

which the individual can borrow or lend is the same in each period. We now consider the general

case where the interest rate can vary, and we denote by r_t the one-period rate of interest which will
obtain for the period beginning at time (t-1) and ending at time t. If, as before, V_n denotes
the compound value of V_0 dollars invested for n periods, then
V_1 = (1+r_1)V_0; V_2 = (1+r_2)V_1 = (1+r_2)(1+r_1)V_0; and
V_t = (1+r_t)V_{t-1} = (1+r_t)(1+r_{t-1})(1+r_{t-2})...(1+r_1)V_0. Hence, the analogous formula to (II.1)
for the compound value is

(II.36)    V_n = [\prod_{t=1}^{n} (1+r_t)] V_0

where "\prod" is a shorthand notation for the "product of." I.e.,
\prod_{t=1}^{n} (1+r_t) ≡ (1+r_1)(1+r_2)...(1+r_{n-1})(1+r_n). For notational simplicity, we define
the number R_n as that rate such that compounding at that (equal) rate each period for n periods

will give the same compound value as compounding at the actual (and different) one-period

rates. That is,

(II.37)    (1+R_n)^n ≡ \prod_{t=1}^{n} (1+r_t),

and therefore, 1 + R_n is the geometric average of the {1+r_t}, t = 1,2,...,n. Hence, we can
rewrite (II.36) as

(II.38)    V_n = (1+R_n)^n V_0.

From (II.38) and the definition of present value, the present value of a payment of $x, n
periods from now, can be written as

(II.39)    PV_n(x) = x/(1+R_n)^n,

and the present value of a stream of payments with a schedule of $x_t paid at the end of period
t, t = 1,2,...,N, can be written as

(II.40)    PV(x_1, x_2, ..., x_N) = \sum_{t=1}^{N} PV_t(x_t)
                                  = \sum_{t=1}^{N} x_t/(1+R_t)^t.

Using the formalism of R_n, the compound and present value formulas when interest rates vary
look essentially the same as in the constant interest rate case. However, care should be exercised
to ensure that one does not confuse the "R_n" with the "r_n". The former depends upon the

entire path of interest rates from time t = 1 to time t = n while the latter is simply the one-

period rate that obtains between t = n – 1 and t = n. For example, from (II.37), we have that

(II.41)    R_n ⋛ R_{n-1} if and only if r_n ⋛ R_{n-1}.

Hence, r_n = R_n if and only if R_n = R_{n-1}. Moreover, r_n > R_{n-1} does not imply that r_n > r_{n-1}.
Further discussion of the relationship between the {R_t} and {r_t} is postponed until Section V

where they will be placed in substantive context.
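
The distinction between the average rates {R_t} and the one-period rates {r_t} can be made concrete with a short Python sketch. The rate path below is hypothetical and the function names are illustrative.

```python
def average_rates(one_period_rates):
    """R_1, ..., R_n from (II.37): R_n is the geometric-average rate over n periods."""
    result, growth = [], 1.0
    for n, r_t in enumerate(one_period_rates, start=1):
        growth *= 1 + r_t                      # running product of (1 + r_t)
        result.append(growth ** (1 / n) - 1)
    return result

def present_value(payments, one_period_rates):
    """PV of payments x_1..x_N from (II.40), discounting x_t at (1 + R_t)^t."""
    R = average_rates(one_period_rates)
    return sum(x / (1 + R_t) ** t
               for t, (x, R_t) in enumerate(zip(payments, R), start=1))

rates = [0.01, 0.10, 0.06]        # hypothetical path r_1, r_2, r_3
R = average_rates(rates)
print(R)
print(present_value([100.0, 100.0, 100.0], rates))
```

In this path r_3 exceeds R_2, so R_3 exceeds R_2 as (II.41) requires, even though r_3 is below r_2.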

This completes the formal preparation on the time value of money, and, as promised, we

now turn to the systematic development of finance theory.

III. ON THE THEORY OF ACCUMULATION AND INTERTEMPORAL CONSUMPTION CHOICE BY HOUSEHOLDS IN AN ENVIRONMENT OF CERTAINTY

We begin the study of Finance with the analysis of an economy where all future outcomes are

known with certainty, but households receive income (their endowments) and consume at

different points in time. In particular, it is shown how the consumption-saving decision is made

and why the introduction of a capital market and financial securities can improve consumer

welfare.

As was discussed in the Introduction, the major decisions of the financial manager are to

choose which (physical) investments to make and to choose the appropriate means for financing

them. It is assumed that the "correct" policies chosen will be those that maximize some criterion

function (or performance index) specified by the firm. We prepare for the study of corporate

finance by deducing here and in Section IV a rational criterion function for the firm and the

management rules which optimize this criterion function in the simplified world of perfect

markets and certainty. Despite the simplicity of the model relative to the "real" world, the results

derived from this model form a basis for the rationalization of the more complex decision rules

developed later. Hence, while the manifest functions of the analysis are to show how

intertemporal allocations are made and to show what role capital markets play in these

allocations, an important latent function of the analysis is to provide a foundation for corporate

financial theory.

We begin the analysis by solving the two-period problem and then extend it in a natural

fashion to the general case of many periods.

Consumer Behavior: The Two-Period Case

The four assumptions of Section II (A.II.1) - (A.II.4), are maintained throughout the

analysis. It is further assumed that each consumer has a well-behaved utility function expressing

his preferences between current consumption, C0, and next period's consumption, C1.

Because the emphasis is on the intertemporal allocation of consumption, it is assumed that there

is a single consumption good in each period. The consumer's utility function is denoted by

U[C0,C1]. Because both periods' consumptions are considered goods (in contrast to "bads"), it is

assumed that U1[C0,C1] ≡ ∂U[C0,C1]/ ∂C0 > 0 and U2[C0,C1] ≡ ∂U[C0,C1]/∂C1 > 0. By

assuming the strict inequality, we rule out the possibility of satiation. I.e., consumers will always

strictly prefer more to less of either C0 or C1. We also assume sufficient regularity and

concavity of U to ensure existence of unique interior maximums.

An indifference curve is the set of all combinations of current and next-period

consumption, (C0,C1), such that the consumer is indifferent among these alternative

combinations i.e., they are curves of equal utility or iso-utility curves. Formally, it is the

functional relationship between C0 and C1 such that U[C0,C1] = Ū, where Ū is a constant.

Figure 1 illustrates the general shape of the indifference curves, and as they are drawn,

U_1 > U_2 > U_3.

Analytically, by the Implicit Function Theorem or heuristically, by using
differentials, we have that dŪ = 0 = U1 dC0 + U2 dC1, or that

(III.1)    (dC1/dC0)|_{U=Ū} = -U1[C0,C1]/U2[C0,C1] < 0,

where (dC1/dC0) is the slope of the indifference curve defined by U[C0,C1] = Ū at the point
(C0,C1). As shown in Figure 1, this slope is always strictly negative.

Case 1. The Simplest Capital Market: Pure Exchange

For this case, we assume that there are no means of physical production. I.e., there is no

way of using the current period's goods to produce additional goods next period. However,

suppose there does exist a market for trading current period's goods in return for a claim on

goods next period. So, an individual can go to the market and exchange current period goods for

"pieces of paper" which, in turn, can be exchanged next period for goods. Alternatively, he can

receive current period goods by issuing "pieces of paper" which he must redeem for goods next

period. In effect, in the former case, he is lending and in the latter, he is borrowing.

If, by convention, the price per unit of current period goods is set equal to one (i.e., a unit

of current period goods is numeraire), then the (current) price per unit of next period goods, P,

is the rate of exchange for claims on next period goods in terms of current period goods. So, P

units of current period goods can buy a claim on one unit of next period goods. In an

intertemporal context, this price is also written as P ≡ 1/(1 + r) where r is the rate of interest.

Hence, one unit of current goods can be exchanged for (1 + r) units of goods delivered next

period.

Figure III.1

Indifference Curves

A consumer's endowment of exogenous income is denoted by (y0,y1) where y0 is the

number of units of current goods he owns and y1 is the number of units of goods that he will

receive next period. The consumer's current wealth, W0, is equal to the value of his endowment

i.e., W0 = y0 + Py1. The consumer's feasible consumption set is the set of all combinations

(C0,C1) which he can afford to buy. Thus, if (C0,C1) are in the consumer's feasible

consumption set, then the cost of that consumption program, C0 + PC1, can be no larger than his

wealth W0. Moreover, as long as a consumer prefers more consumption to less, he would never

choose a program which costs less than his wealth. Hence, if it is assumed that the consumer

will choose the most preferred feasible consumption program, then he will act so as to maximize

U[C0,C1] subject to his budget constraint that W0 = C0 + PC1.

Substituting for C0 in U from the budget constraint, we can write the consumer choice

problem as

(III.2)  $\max_{C_1} \; U[W_0 - PC_1,\, C_1]$

which leads to the first-order condition for an interior maximum

(III.3)  $\frac{dU}{dC_1} = 0 = -P\,U_1[W_0 - PC_1^*,\, C_1^*] + U_2[W_0 - PC_1^*,\, C_1^*],$

where (C0*,C1*) is the optimal consumption program. Noting that C0* = W0 - PC1*, we can

rewrite (III.3) as

(III.4)  $\left(\frac{dC_1}{dC_0}\right)_{U=U^*} = -\frac{1}{P} = -(1+r),$

where U* ≡ U[C0*,C1*] is the maximum feasible value of utility. Hence, the optimum occurs at

the point where an indifference curve is tangent to the budget constraint as shown in Figure 2.

Note that in arriving at the optimality condition (III.3), we have used assumption (A.II.4)

that the consumer acts as a pure competitor or price-taker. So, in solving for his most preferred

consumption program, the consumer treats the price (or interest rate) as a given number which

does not change in response to the different consumption choices that he might make.

Figure III.2

In the absence of an exchange market and without physical storage of goods through time,

the optimal consumption program for the consumer will simply be to consume current income.

I.e., C0 = y0 and C1 = y1. Hence, if the solution to (III.3) yields C0* ≠ y0 (and therefore,

C1* ≠ y1), then the consumer will be better off as a result of the creation of an

exchange market. Moreover, he can be no worse off because he always has the option not to use

the market and choose C0 = y0 and C1 = y1 which is called the autarky point.

Even if physical storage of goods is feasible, then in the absence of an exchange market,

the feasible consumption choices are constrained to have C0 ≤ y0. That is, physical storage

allows one to "move" goods "forward" in time for consumption, but it does not allow one to

"move" goods "backward" in time.

So, for example, suppose that one had an income stream of (y0 =) ten bushels of wheat

this period and (y1 =) fifty bushels of wheat next period. In the absence of an exchange market,

there is no way that he can consume more than ten bushels of wheat this period even if costless

storage of wheat were available. However, in the presence of an exchange market, in addition to

the ten bushels he has, he could consume up to 50/(1+r) bushels of wheat in the initial period

where r is the market interest rate. Even if his endowment had been y0 = 50 and y1 = 10, then

he would still be better off to save wheat for next period through the exchange market rather than

by storage provided that the interest rate is positive.
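
The wheat example can be checked with a few lines of arithmetic. The sketch below assumes an interest rate of 10 percent (the text leaves r unspecified) and costless storage; the function names are hypothetical and chosen only for illustration.

```python
def max_current_consumption(y0, y1, r, market=True):
    """Largest feasible C0 for the endowment (y0, y1)."""
    # Without a market, next period's income cannot be moved back in time.
    return y0 + y1 / (1.0 + r) if market else y0


def next_period_goods(saved, r, market=True):
    """Goods available next period from giving up `saved` units today."""
    # Costless storage returns one unit per unit stored; lending returns 1 + r.
    return saved * (1.0 + r) if market else saved


r = 0.10  # assumed rate, for illustration only

# Endowment (10, 50): the market relaxes the ceiling on current consumption.
print(max_current_consumption(10, 50, r, market=False))
print(max_current_consumption(10, 50, r))

# Endowment (50, 10): saving 20 bushels through the market beats storing them.
print(next_period_goods(20, r, market=False))
print(next_period_goods(20, r))
```

With a zero interest rate the two ways of saving coincide, which is why the comparison in the text is conditioned on a positive rate.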

Problem III.1: Choosing an Optimal Consumption Allocation: Suppose that one has a

preference function given by U[C0,C1] = log(C0) + log(C1)/(1+δ) and an endowment of y0 = y1

= y. If r is the market rate of interest, then what is the optimal allocation

(C0*, C1*)? From (III.3), we have that U1[C0*,C1*]/U2[C0*,C1*] = (1+δ)C1*/C0* = 1+r, or that

C1* = (1+r)C0*/(1+δ). From the budget constraint, W0 = y + Py = (2+r)y/(1+r) and

C0* = W0 - PC1* = [(2+r)y - C1*]/(1+r). Substituting into the budget constraint for

C1* from the optimality condition, we have that

(III.5a)  $C_0^* = \left\{ \frac{(1+\delta)(2+r)}{(1+r)(2+\delta)} \right\} y$

and

(III.5b)  $C_1^* = \frac{(2+r)\,y}{2+\delta}.$
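
As a check on (III.5a) and (III.5b), the following sketch evaluates the closed-form allocation and confirms that it exhausts wealth, satisfies the tangency condition, and is not beaten by any point on a fine grid along the budget line. The numerical values of y, r and δ are assumptions made for illustration.

```python
import math


def optimal_allocation(y, r, delta):
    """Closed-form solution (III.5a)-(III.5b) for the log preference function."""
    c0 = (1 + delta) * (2 + r) * y / ((1 + r) * (2 + delta))
    c1 = (2 + r) * y / (2 + delta)
    return c0, c1


def utility(c0, c1, delta):
    return math.log(c0) + math.log(c1) / (1 + delta)


y, r, delta = 100.0, 0.08, 0.03  # illustrative values, not from the text
c0, c1 = optimal_allocation(y, r, delta)
P = 1 / (1 + r)
W0 = y + P * y

# The allocation exhausts wealth ...
assert abs(c0 + P * c1 - W0) < 1e-9
# ... and satisfies the tangency condition U1/U2 = 1 + r.
assert abs((1 + delta) * c1 / c0 - (1 + r)) < 1e-12

# Brute-force check: search a grid of C0 values along the budget line.
grid = [W0 * k / 10000 for k in range(1, 10000)]
best = max(grid, key=lambda c: utility(c, (W0 - c) / P, delta))
print(c0, c1, best)
```

Because the utility function is strictly concave, the best grid point lies within one grid step of the closed-form C0*.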

Time Preference

A consumer is said to have a positive time preference if for every (a,b) such that b > a,

U[b,a] > U[a,b]. He has no time preference if U[b,a] = U[a,b], and a negative time preference

if U[b,a] < U[a,b].

In the example of preferences used in Problem III.1, δ can be interpreted as the

consumer's rate of time preference. If δ > 0, then he has positive time preference. If δ = 0,

then he has no time preference, and if δ < 0, then he has negative time preference. Note that in

that example, if the interest rate exceeds his rate of time preference (r > δ), then C0* < y, and

he will save some of his current period's income to consume next period. If r < δ, then

C0* > y and he borrows against next period's income to consume more than his current income.

If r = δ, then C0* = y, and he does not trade, but consumes exactly his income in each period.

Suppose that the consumer in this example were the only person in the economy (i.e., a

"Robinson Crusoe" economy). Because he can only trade with himself, the autarky solution is

the only feasible solution. However, we can compute the "equilibrium" rate of interest consistent

with autarky and that rate clearly must be r = δ. Hence, by this example, we have illustrated one

of the possible explanations for a positive rate of interest: namely, consumers' impatience to

consume or a positive time preference.
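
The relation between r, δ and the direction of trade can be illustrated numerically. The sketch below uses the solution (III.5a) of Problem III.1 and finds, by bisection, the interest rate at which the consumer neither lends nor borrows; the search bracket and the sample values are assumptions.

```python
def current_consumption(y, r, delta):
    """C0* from (III.5a) when y0 = y1 = y."""
    return (1 + delta) * (2 + r) * y / ((1 + r) * (2 + delta))


def saving(y, r, delta):
    """Current income not consumed: positive means lending, negative borrowing."""
    return y - current_consumption(y, r, delta)


def autarky_rate(y, delta, lo=-0.5, hi=1.0, tol=1e-12):
    """Bisect on r until desired saving is zero; saving is increasing in r."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if saving(y, mid, delta) > 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


y, delta = 100.0, 0.05  # illustrative values
for r in (0.02, 0.05, 0.08):
    print(r, saving(y, r, delta))
print(autarky_rate(y, delta))
```

The rate returned by the bisection coincides with δ, which is the "equilibrium" rate of the Robinson Crusoe economy described above.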

Case 2. A No-Exchange Market Economy: Pure Production

As in the first case, we assume that the consumer has an endowment of exogenous income

(y0,y1), but in addition, he has the opportunity to use some of his current income to produce

next-period goods. One may wish to think of the "good" as seed which can either be eaten

(consumed) or planted (invested). However, because there is no exchange market, physical

production is the only means he has to increase his next period's consumption beyond next

period's income. Moreover, because there is no exchange market, the only way that he can

produce is by forgoing some current consumption i.e., if X0 denotes the amount he invests in

production, then

(III.6)  $X_0 = y_0 - C_0 \geq 0.$

The technology available to him is described by a production function f, such that X0

units of current goods invested will produce X1 = f(X0) units of the good next period. It is

assumed that f(0) = 0 and df/dX0 ≡ f′ (X0) > 0. It is further assumed that the production

technology exhibits non-increasing returns to scale (i.e., $d^2 f/dX_0^2 \leq 0$). Figure 3 illustrates

the production function for decreasing returns to scale, and for 0 ≤ X0 ≤ y0, describes his

Production Possibility Frontier. The maximum output that he can produce is $X_1^{\max} = f(y_0)$

which corresponds to X0 = y0 and C0 = 0. Hence, f(y0) ≥ X1 ≥ 0. His next period's

consumption can be written as

(III.7)  $C_1 = y_1 + X_1 = y_1 + f(y_0 - C_0)$

Figure III.3

Production Function

which for y0 - C0 ≥ 0, describes his feasible consumption set or Consumption Possibility

Frontier. Because there is no exchange market and therefore, no prices, the consumer does not

have a budget constraint of the type in Case 1. However, his consumption choices are

constrained by (III.7) which is called a technological budget constraint. Hence, if, as in Case 1,

it is assumed that the consumer will choose the most-preferred feasible consumption program,

then he will act so as to maximize U[C0,C1] subject to his technological budget constraint.

Substituting for C1 in U from (III.7), we can write the consumer choice problem as

$\max_{C_0} \; U[C_0,\, y_1 + f(y_0 - C_0)]$

which leads to the first-order condition for an interior maximum

(III.8)  $0 = U_1[C_0^*,\, y_1 + f(y_0 - C_0^*)] - f'(y_0 - C_0^*)\,U_2[C_0^*,\, y_1 + f(y_0 - C_0^*)].$

Assuming that the optimum is interior, we can rewrite (III.8) as

(III.9)  $\frac{U_1[C_0^*, C_1^*]}{U_2[C_0^*, C_1^*]} = f'(X_0^*)$

where C1* = y1 + f(X0*) and the optimal amount to plant, X0*, is given by

X0* = y0 - C0*. As was done in Case 1, we have from (III.1) that (III.9) can be rewritten in

terms of the slope of an indifference curve through (C0*,C1*) as

(III.10)  $\left(\frac{dC_1}{dC_0}\right)_{U=U^*} = -f'(X_0^*)$

where U* = U[C0*,C1*]. Figure 4 plots the Consumption Possibility Frontier along with a

graphical solution of the optimal consumption-production program (C0*,C1*,X0*). Because

there is no exchange market, he cannot "borrow" against next period's income, y1, to consume

more in the current period (i.e., C1 ≥ y1 and C0 ≤ y0). Hence, as shown in Figure 4, the

Consumption Possibility Frontier has a vertical portion for C1 ≤ y1.

Figure III.4

Although there is no market rate of interest in Case 2, we can define an "implied" or

"technological" rate of interest, $\bar{r}$, by $1 + \bar{r} \equiv f'(X_0^*)$. By comparing (III.10) with (III.4), we

see that $\bar{r}$ serves as a surrogate for the market rate r, and hence illustrates a second reason for

a positive rate of interest: namely, the productivity of (physical) investment.
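
To make Case 2 concrete, the sketch below solves the optimality condition (III.9) by bisection and reports the implied rate f'(X0*) - 1. The technology f(X) = A·sqrt(X), the log preference function of Problem III.1, and all numerical values are assumptions introduced for illustration; none of them comes from the text.

```python
import math

A, delta = 12.0, 0.05          # assumed technology scale and rate of time preference
y0, y1 = 100.0, 20.0           # assumed endowment


def f(x):
    """Hypothetical production function: f(0) = 0, f' > 0, f'' < 0."""
    return A * math.sqrt(x)


def f_prime(x):
    return A / (2.0 * math.sqrt(x))


def foc_gap(x):
    """U1/U2 - f'(X0) for U = log(C0) + log(C1)/(1 + delta); zero at the optimum."""
    c0, c1 = y0 - x, y1 + f(x)
    return (1 + delta) * c1 / c0 - f_prime(x)


# foc_gap is increasing in X0, negative near 0 and positive near y0.
lo, hi = 1e-9, y0 - 1e-9
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if foc_gap(mid) < 0:
        lo = mid
    else:
        hi = mid
x_star = 0.5 * (lo + hi)

implied_rate = f_prime(x_star) - 1.0   # the "technological" rate of interest
print(x_star, y0 - x_star, y1 + f(x_star), implied_rate)
```

With these assumed numbers the implied rate is positive; a less productive technology (a smaller A) can make it negative, so its sign depends on the productivity of investment at the chosen plan.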

Case 3. Production Within an Exchange Market Economy

We maintain the same assumptions about the consumer's endowment of exogenous income

and a production technology as in Case 2. However, we now allow for an exchange market as in

Case 1 where the current market price of next period's goods is P = 1/(1+r). In this environment

his current wealth, W0, can be written as

(III.11)  $W_0 = y_0 + Py_1 + Pf(X_0) - X_0$

where the first two terms on the right-hand side represent the current value of his endowment of

exogenous income and the last two terms represent the net current value of operating his

production technology with an input intensity of X0. That is, if he buys inputs today with a

current value of X0, then he will receive an output next period of f(X0) which has a current

value of Pf(X0). The difference between the two is the net increment to his current wealth from

operating the technology at that intensity. Note that unlike in Case 1, the consumer's current

wealth is affected by one of his decisions: namely, the amount of physical production he

undertakes, X0.

As in Cases 1 and 2, the consumer chooses an investment-consumption program,

(X0,C0,C1), so as to maximize U[C0,C1] subject to the budget constraint that W0 = C0 + PC1.

Because there now exists an exchange market, (III.6) in Case 2 is no longer a constraint i.e., the

consumer can borrow against future income to either consume or invest in physical production in

the current period. Substituting for C0 from the budget constraint, we can write the consumer

choice problem as

$\max_{\{X_0,\, C_1\}} \; U[y_0 + Py_1 + Pf(X_0) - X_0 - PC_1,\, C_1]$

which leads to the set of first-order conditions for an interior maximum

(III.12a)  $\partial U/\partial X_0 = 0 = U_1[C_0^*, C_1^*]\,(Pf'(X_0^*) - 1)$

and

(III.12b)  $\partial U/\partial C_1 = 0 = -P\,U_1[C_0^*, C_1^*] + U_2[C_0^*, C_1^*],$

where (X0*,C0*,C1*) denotes the quantities chosen for the optimal investment-consumption program, and

C0* = y0 + Py1 + Pf(X0*) - X0* - PC1*. Because the consumer is assumed never to be

satiated, U1[C0*,C1*] > 0, and we can rewrite (III.12a) as

(III.13a)  $f'(X_0^*) = 1/P = 1 + r.$

By inspection of (III.13a), we see that, unlike in (III.9) of the Robinson Crusoe Case 2, the

optimal amount to invest in physical production, X0*, does not depend either upon the

consumer's preferences, U, or his endowment, (y0,y1). Hence, two consumers with quite

different preferences between current and future consumption and with quite different

endowments, but who face the same market rate of interest and have the same production

technologies, will choose the same level of physical investment in their technologies,

X0*. Such a result about physical production is called an efficiency condition because it is

independent of either preferences or endowments, and hence independent of who owns the

production technology.
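
The efficiency condition can be illustrated with two consumers who differ in both tastes and endowments. The sketch below assumes the technology f(X) = A·sqrt(X) and the log preference function of Problem III.1; these forms, the parameter values, and the function names are hypothetical.

```python
import math

A, r = 12.0, 0.10               # assumed technology scale and market rate
P = 1 / (1 + r)


def f(x):
    """Hypothetical technology, f(X) = A * sqrt(X)."""
    return A * math.sqrt(x)


def optimal_investment():
    """Solve f'(X0) = A / (2 sqrt(X0)) = 1 + r in closed form, as in (III.13a)."""
    return (A / (2 * (1 + r))) ** 2


def plan(y0, y1, delta):
    """Investment and consumption for U = log(C0) + log(C1)/(1 + delta)."""
    x0 = optimal_investment()                 # no y0, y1 or delta involved
    w0 = y0 + P * y1 + P * f(x0) - x0         # wealth from (III.11)
    c0 = (1 + delta) * w0 / (2 + delta)
    c1 = (1 + r) * w0 / (2 + delta)
    return x0, c0, c1


patient = plan(y0=100.0, y1=20.0, delta=0.02)
impatient = plan(y0=10.0, y1=150.0, delta=0.40)
print(patient)
print(impatient)
```

Both consumers invest the same amount even though their consumption programs differ. The second consumer invests more than his current income, which is possible only because he can borrow in the exchange market.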

One interpretation of the optimality condition (III.13a) can be derived as follows: as

previously noted, the current wealth of the consumer is affected by the choice of production

intensity. I.e., W0 can be written as W0(X0). If X0** denotes that amount of physical

investment which maximizes the current wealth of the consumer, then from (III.11),

X0** is the solution to the problem:

$\max_{\{X_0\}} \; [y_0 + Py_1 + Pf(X_0) - X_0]$

which leads to the first-order condition for an interior maximum

(III.14)  $dW_0/dX_0 = 0 = Pf'(X_0^{**}) - 1$

which is identical to (III.13a); i.e., X0** = X0*. Hence, optimality condition (III.13a) can be

interpreted as saying "Choose physical investment so as to maximize one's current wealth." This

is called the Value Maximization Rule and it has significant implications for the theory of

Finance. However, discussion of these implications is postponed until Section IV.

Consider now the second optimality condition (III.12b). From (III.1), it can be rewritten in

terms of the slope of an indifference curve through the point (C0*,C1*) as

(III.13b)  $\left(\frac{dC_1}{dC_0}\right)_{U=U^*} = -\frac{1}{P} = -(1+r).$

Comparing (III.13b) with (III.4), we find that it is identical to the optimality condition in the Pure

Exchange Case 1 if we use as current wealth, W0* ≡ y0 + Py1 + Pf(X0*) - X0*. Hence, one

can describe the solution of the optimal investment-consumption program for the consumer as

taking place in two steps. First, choose physical investment so as to maximize current wealth.

Second, as in the case of pure exchange, use the exchange market to borrow or lend (against this

maximized wealth) so as to achieve the most-preferred, feasible consumption allocation. Figure

5 provides a graphical solution of the problem, and is, in essence, a composite of Figures 3 and 4.

As inspection of Figure 5 clearly demonstrates, the consumer is better off in the presence of an

exchange market than he was in the Robinson Crusoe framework of Case 2.

Hence, the existence of an exchange or capital market will not only affect the patterns of

consumption chosen but also will alter the allocation of physical investment among the various

technologies, and in so doing affect the total output for the economy.
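
The two-step solution and the gain from the exchange market can be compared numerically with the Robinson Crusoe outcome of Case 2. The sketch below is one way to do so under assumed functional forms (f(X) = A·sqrt(X) and the log preference function of Problem III.1) and assumed parameter values; the helper `argmax` is a hypothetical ternary search, valid here because both objectives are strictly concave.

```python
import math

A, r, delta = 12.0, 0.10, 0.05     # assumed parameters
y0, y1 = 100.0, 20.0               # assumed endowment
P = 1 / (1 + r)


def f(x):
    return A * math.sqrt(x)


def u(c0, c1):
    return math.log(c0) + math.log(c1) / (1 + delta)


def crusoe_utility(x):
    """Case 2: utility along the Consumption Possibility Frontier (III.7)."""
    return u(y0 - x, y1 + f(x))


def argmax(fn, lo, hi, iters=200):
    """Ternary search for the maximizer of a strictly concave function."""
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if fn(m1) < fn(m2):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


x_crusoe = argmax(crusoe_utility, 1e-9, y0 - 1e-9)

# Case 3: step 1 maximizes wealth, step 2 allocates consumption.
x_market = argmax(lambda x: P * f(x) - x, 0.0, 10 * y0)
w0 = y0 + P * y1 + P * f(x_market) - x_market
c0, c1 = (1 + delta) * w0 / (2 + delta), (1 + r) * w0 / (2 + delta)

print(x_crusoe, crusoe_utility(x_crusoe))
print(x_market, u(c0, c1))
```

Under these assumptions the level of physical investment changes when the market is introduced, and the consumer attains a higher level of utility than in Case 2.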

Figure III.5

Note: By trading, he reaches a higher indifference curve.

The Multi-Period Consumption and Allocation Decision: The T-Period Case

We now extend the previous analysis to a consumer who lives for T-periods with a utility

function for lifetime consumption described by U[C0,C1,...,CT-1,CT] where Ct is his

consumption in period t, t = 0,...,T. Let yt denote the exogenous income he will receive in

period t, t = 0,...,T. There exists an exchange market which is open each period and allows for

trading the current period's consumption good and claims on consumption goods in the future.

Specifically, at each point in time, there are (T+1) different claims traded in units where the τth

such claim gives its owner the right to one unit of the consumption good payable τ periods from

the date at which it is issued, τ = 0,...,T. In effect, these claims are pure discount loans as

defined in Section II. Let Pt(τ) denote the price at date t of a discount loan which pays one unit

of the consumption good τ periods from date t (i.e., at date t + τ). If, by convention, the

current period's (or "spot") price of the consumption good is taken as numeraire, then Pt(0) = 1,

for all t.

In the absence of any production capabilities, the consumer's current wealth, W0, at date t

= 0 can be written as

(III.15)  $W_0 = \sum_{\tau=0}^{T} P_0(\tau)\, y_\tau.$

As in the two-period analysis, the consumer's feasible consumption set is the set of all

consumption programs that he can afford to buy. Hence, for a consumption program to be

feasible, it must satisfy $\sum_{\tau=0}^{T} P_0(\tau)\, C_\tau \leq W_0$, which defines the feasible consumption set.

Provided that satiation is ruled out, the T-period consumer allocation problem is formulated as

maximize U[C0,C1,...,CT] subject to the budget constraint that $W_0 = \sum_{\tau=0}^{T} P_0(\tau)\, C_\tau$. Noting

that P0(0) = 1, we can substitute for C0 from the budget constraint, and rewrite the problem as

$\max_{\{C_1, C_2, \ldots, C_T\}} \; U\!\left[W_0 - \sum_{\tau=1}^{T} P_0(\tau)\, C_\tau,\; C_1, C_2, \ldots, C_T\right]$

which leads to T first-order conditions

(III.16)  $0 = -U_1[C_0^*, C_1^*, \ldots, C_T^*]\, P_0(\tau) + U_{\tau+1}[C_0^*, C_1^*, \ldots, C_T^*], \quad \tau = 1, 2, \ldots, T,$

where Uτ ≡ ∂U[C0,C1,...,CT]/∂Cτ-1 denotes the partial derivative of U with respect to its τth

argument and (C0*,C1*,...,CT*) is the optimal consumption program with

$C_0^* = W_0 - \sum_{\tau=1}^{T} P_0(\tau)\, C_\tau^*$. In words, (III.16) says that at the optimum, the ratio of the marginal

utility of consumption in period τ to the marginal utility of current consumption should just equal

the ratio of the marginal cost of consumption in period τ, P0(τ), to the marginal cost of current

consumption, P0(0) = 1. From (III.16), we have that

(III.17)  $\frac{U_{t+1}[C_0^*, \ldots, C_T^*]}{U_{s+1}[C_0^*, \ldots, C_T^*]} = \frac{P_0(t)}{P_0(s)}, \quad s, t = 0, 1, \ldots, T.$
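
Conditions (III.16) and (III.17) can be verified on a small example. The sketch below assumes an additively separable log utility, U = Σ β^t log(C_t), and prices generated by a constant interest rate; both are assumptions made for illustration and are not imposed by the text. For this utility function the first-order conditions give C_t = β^t C0/P0(t), and the budget constraint then pins down C0.

```python
def optimal_program(y, prices, beta):
    """Solve the T-period problem for U = sum_t beta**t * log(C_t).

    `y` is the income stream (y_0..y_T); `prices[t]` is P_0(t), with P_0(0) = 1.
    """
    w0 = sum(p * yt for p, yt in zip(prices, y))              # wealth, (III.15)
    c0 = w0 / sum(beta ** t for t in range(len(y)))
    return w0, [beta ** t * c0 / p for t, p in enumerate(prices)]


r, beta = 0.06, 0.97                       # assumed, illustrative values
y = [50.0, 60.0, 70.0, 0.0, 0.0]           # income for t = 0..4 (T = 4)
prices = [(1 + r) ** -t for t in range(len(y))]
w0, c = optimal_program(y, prices, beta)

# Marginal utility of C_t under the assumed utility is beta**t / C_t.
mu = [beta ** t / ct for t, ct in enumerate(c)]
print(w0, c)
```

The consumer in this example earns income only in the first three periods, yet consumes in all five by lending early and drawing down the claims later.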

As with Case 3 of the two-period analysis, we now expand the analysis of the T-period

case to allow for production. Generalizing the production function description of the technology

from the two-period case, let ft(X0t,X1t,...,Xt-1,t) denote the production function for output in

period t, (t=1,2,...,T) where Xjt is the amount of input required to be invested in period j,

(j=0,1,2,...t–1), in order to produce output ft in period t. In an analogous fashion to (III.11) in

the two-period case, we can write the current wealth of the consumer as

(III.18)  $W_0 = \sum_{\tau=0}^{T} P_0(\tau)\, y_\tau + \sum_{\tau=1}^{T} P_0(\tau)\, f_\tau(X_{0\tau}, X_{1\tau}, \ldots, X_{\tau-1,\tau}) - \sum_{\tau=0}^{T-1} P_0(\tau)\, X_\tau$

where $X_\tau \equiv \sum_{j=\tau+1}^{T} X_{\tau j}$ is the total amount of inputs required in period τ to allow production

plan {ft}, τ = 0,...,T-1. Define the net increment to the consumer's current wealth of production

plan {ft}, V0, by

(III.19)  $V_0 \equiv \sum_{\tau=1}^{T} P_0(\tau)\, f_\tau(X_{0\tau}, X_{1\tau}, \ldots, X_{\tau-1,\tau}) - \sum_{\tau=0}^{T-1} P_0(\tau)\, X_\tau.$

The combined investment-consumption choice problem is formulated as choose the production

and consumption program so as to maximize U[C0,C1,...,CT] subject to the budget constraint

that $W_0 = \sum_{\tau=0}^{T} P_0(\tau)\, y_\tau + V_0 = \sum_{\tau=0}^{T} P_0(\tau)\, C_\tau$. Substituting for C0 from the budget constraint, the

problem can be rewritten as choose (C1,...,CT) and (Xjt, j = 0,...,t-1 and t = 1,...,T) so as to

$\max \; U\!\left[\sum_{\tau=0}^{T} P_0(\tau)\, y_\tau + V_0 - \sum_{\tau=1}^{T} P_0(\tau)\, C_\tau,\; C_1, \ldots, C_T\right]$

which leads to T(T+1)/2 first-order conditions for the production choices

(III.20a)  $\partial U/\partial X_{jt} = 0 = U_1[C_0^*, C_1^*, \ldots, C_T^*]\; \partial V_0/\partial X_{jt}, \quad j = 0, 1, \ldots, t-1 \text{ and } t = 1, \ldots, T$

and T first-order conditions for the consumption choices

(III.20b)  $\partial U/\partial C_\tau = 0 = -U_1[C_0^*, \ldots, C_T^*]\, P_0(\tau) + U_{\tau+1}[C_0^*, \ldots, C_T^*], \quad \tau = 1, \ldots, T.$

Noting that U1 > 0, the first-order conditions (III.20a) can be rewritten as ∂V0/∂Xjt = 0, j =

0,1,...,t-1 and t = 1,...,T, and in that form, are simply the generalization of condition

(III.13a) in the two-period case. Indeed, the interpretation given to (III.13a) in the two-period

case of choosing a physical production program so as to maximize the consumer's current wealth

carries over exactly to the T-period case. From (III.18) and (III.19), the set {Xjt} which

maximizes W0 is the one that maximizes V0. But the first-order conditions for

maximizing V0 are simply ∂V0/∂Xjt = 0. Hence, (III.20a) simply says choose physical production

so as to maximize current wealth, and therefore the Value Maximization Rule applies in the

general T-period case.

Inspection of (III.20b) shows that it is identical to the first-order conditions for the pure-

exchange case (III.16) where the level of current wealth used is the maximized value,

W0*. Hence, as was shown in the two-period case, the solution of the T-period optimal

investment-consumption program for the consumer can be described as taking place in two steps:

namely, first, choose physical investments so as to maximize current wealth. Second, use the

exchange market to borrow or lend so as to achieve the most preferred feasible consumption

allocation.

On the Connection Between the T-Period and Two-Period Analyses

While the T-period consumer choice is a more realistic description of the world than the

two-period formulation, the analysis is more complex and is burdened by a barrage of notation.

Moreover, it does not readily lend itself to the relatively intuitive graphical display of the

solution. We have already shown that the fundamental behavioral characteristics (such as the

Value Maximization Rule) deduced in the two-period case carry over to the general T-period

case. We now show that, in essence, the general T-period problem can always be structured so as

to "look like" a two-period problem. Not only does this connection between the two problems

make the analysis of the T-period problem more tractable, but it also provides the appropriate

framework for studying the intertemporal consumption-investment choice problem in an

uncertain environment.

In the previous analysis, we solved the entire lifetime consumption choice problem by

having the consumer choose at date t = 0, (C0,C1,...,CT) so as to maximize U[C0,C1,...,CT]

subject to his budget constraint $W_0 = \sum_{\tau=0}^{T} P_0(\tau)\, C_\tau$. Suppose we move ahead one period to

date t = 1. Suppose further that the consumer consumed C0 units at date t = 0. The consumer

choice problem at date t = 1 can be formulated as choose (C1,C2,...,CT) so as to maximize

U[C0,C1,...,CT] subject to his budget constraint $W_1 = \sum_{\tau=1}^{T} P_1(\tau-1)\, C_\tau$, where W1 is his

wealth at date t = 1. Note: C0 is not a choice variable at t = 1 because whatever was

consumed at time t = 0 is now past history.

We can solve the optimal choice problem at t=1 in the same way that the problem was

solved at t = 0, and in analogous fashion to (III.16), we arrive at the (T-1) first-order conditions

that

(III.21)  $0 = -U_2\!\left[C_0,\; W_1 - \sum_{k=2}^{T} P_1(k-1)\, C_k^*,\; C_2^*, \ldots, C_T^*\right] P_1(\tau-1) + U_{\tau+1}[C_0, C_1^*, \ldots, C_T^*], \quad \tau = 2, \ldots, T,$

where $C_1^* = W_1 - \sum_{\tau=2}^{T} P_1(\tau-1)\, C_\tau^*$. From (III.21), it is clear that the optimal solution

(C1*,C2*,...,CT*) will depend upon the amount of wealth W1, the prices

{P1(1),...,P1(T-1)}, C0, and the form of the utility function U.

Define the function J by

(III.22)  $J[C_0, W_1; P_1(1), \ldots, P_1(T-1)] \equiv U[C_0, C_1^*, \ldots, C_T^*].$

J is the "level" of utility associated with a consumption program of (C0,C1*,C2*,...,CT*), and is

the maximal level of utility (corresponding to the most preferred feasible program) conditional

on consuming C0 units at date t = 0 and having wealth W1 at time t = 1. Because the prices

{P1(τ)} are not affected by the choices made by the consumer, they can be treated as

parameters. Hence, a shortened form for J is simply to write it as "J[C0,W1]."

Return now to the original problem of selecting an optimal consumption program at time

t = 0. Of course, at t = 0, the consumer is free to choose any (feasible) level for C0. A

necessary condition for a consumption program to be optimal is that whatever level of

consumption is chosen for C0, the choices made for C1,C2,...,CT must be the best one can do

conditional on having chosen level C0. That, of course, is exactly what

C1*,C2*,...,CT* represent in the t = 1 problem just solved where they represent the best the

consumer can do conditional on having chosen to consume C0 at time t = 0. Further, we have

that wealth at time t = 1, W1, can be expressed in terms of wealth at time t = 0 as

(III.23)  $W_1 = [W_0 - C_0]/P_0(1).$

I.e., whatever part of current wealth that is not currently consumed will grow in one period by the

one-period interest rate. Hence, having solved the conditional (on t = 0 consumption)

optimization problem as of t = 1, we can reformulate the consumer choice problem at t = 0 as:

Choose current consumption, C0, so as to maximize J[C0,W1] subject to the budget constraint

W0 = C0 + P0(1)W1. Expressed in this way, except for some notational differences, this problem

is in essence the same as the two-period choice problem solved in Case 1 of this section where

the utility function "J" replaces "U[C0,C1]" and "W1" replaces next period consumption "C1"

i.e., the T-period consumption problem can be reformulated as a two-period problem.

Although in the formulation the utility function, J, has utility depending upon (next

period's) wealth, the consumer still only gets direct utility from consumption. In effect, wealth

W1 acts as a "surrogate" for future consumption so that the utility "tradeoff" between C0 and

W1 is really a tradeoff between current and future consumption. J is sometimes called the

indirect or derived utility function, and provided that the direct utility function, U[C0,C1,...,CT],

is a well behaved, (quasi) concave function, J will be a well behaved, (quasi) concave function

in (C0,W1).

To solve the problem, we substitute for W1 using the budget constraint to get:

$\max_{C_0} \; J[C_0,\, (W_0 - C_0)/P_0(1)]$

which leads to the first-order condition

(III.24)  $0 = J_1[C_0^*, W_1^*] - J_2[C_0^*, W_1^*]/P_0(1)$

where subscripts denote partial derivatives of J with respect to the appropriate arguments and

W1* ≡ (W0 - C0*)/P0(1). As in (III.4), we can rewrite (III.24) in terms of the slope of an

indifference curve through the point (C0*,W1*) as

(III.25)  $\left(\frac{dW_1}{dC_0}\right)_{J=J^*} = -\frac{1}{P_0(1)},$

where J* ≡ J[C0*,W1*]. Figure 6 provides the graphical solution which is

Figure III.6

analogous to the one displayed in Figure 2 for the two-period problem. Although the derivation

presented here is more descriptive than rigorous, the analysis can be made rigorous by using the

mathematical technique of dynamic programming. While the explicit development of this

technique is more appropriately the subject of an advanced treatment of Finance, the interested

reader can find its development in the context of this problem in Fama [American Economic

Review, March 1970].
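
The reduction of the T-period problem to a two-period problem can be illustrated for T = 2. The sketch below assumes the utility function U = log(C0) + β log(C1) + β² log(C2), assumed one-period prices at dates 0 and 1, and a two-period price P0(2) = P0(1)·P1(1); that last link is an assumption stated here so that (III.23) is consistent with the date-0 budget constraint. The derived utility J of (III.22) is built from the solution of the date-1 problem and then maximized over C0 alone.

```python
import math

beta = 0.95                      # assumed weight on future utility
P0_1, P1_1 = 0.94, 0.93          # assumed one-period prices at dates 0 and 1
P0_2 = P0_1 * P1_1               # assumed two-period price at date 0
W0 = 100.0                       # assumed current wealth


def U(c0, c1, c2):
    return math.log(c0) + beta * math.log(c1) + beta ** 2 * math.log(c2)


def J(c0, w1):
    """Derived utility (III.22): the best continuation given C0 and W1."""
    c1 = w1 / (1 + beta)         # solves the date-1 problem for this log utility
    c2 = (w1 - c1) / P1_1
    return U(c0, c1, c2)


def solve_two_period(iters=200):
    """Maximize J[C0, (W0 - C0)/P0(1)] over C0 by ternary search."""
    lo, hi = 1e-9, W0 - 1e-9
    for _ in range(iters):
        m1, m2 = lo + (hi - lo) / 3, hi - (hi - lo) / 3
        if J(m1, (W0 - m1) / P0_1) < J(m2, (W0 - m2) / P0_1):
            lo = m1
        else:
            hi = m2
    return 0.5 * (lo + hi)


c0_dp = solve_two_period()
c0_direct = W0 / (1 + beta + beta ** 2)   # from solving (III.16) at date 0
print(c0_dp, c0_direct)
```

The two approaches give the same current consumption, up to the tolerance of the search, which is the sense in which wealth W1 serves as a surrogate for future consumption.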

In summary, we have solved the general intertemporal consumption-investment problem in

a certainty environment. In so doing, we have shown that the creation of financial securities and

an exchange market will make the consumer better off. In particular, we showed that an

exchange market was the only means by which an individual consumer can convert future

income or output into current consumption. While this manifest function of the exchange market

more than justifies its existence (indeed, if such markets did not exist, we would have to invent

them), it has an important latent function as the means for permitting an efficient organization of

the economy's production. This important latent function is the topic of the next section.

IV. ON THE ROLE OF BUSINESS FIRMS, FINANCIAL INSTRUMENTS, AND MARKETS IN AN ENVIRONMENT OF CERTAINTY

Every modern economy has as part of its institutional structure financial instruments,

capital markets, and business firms. In the previous section, we saw that the creation of a capital

market made households better off even in the simplistic environment of certainty. In this

section, we expand upon that analysis to explain the role of business firms.

In Section III, it was shown that in the presence of a capital market the optimal production

rule was to choose investment so as to maximize one's current wealth. In that model, the

consumer-owner of the technology made the production decisions, and therefore all technologies

were presumed to be owner-managed. However, for most modern economies, a majority of

production is carried out by business firms whose managers are not the (sole or even majority)

owners of the firm. This empirical fact raises several questions. In particular, why is this the

structure that we observe? What changes in the analysis of Section III are induced by this

separation of ownership from management? How can an efficient allocation of resources be

achieved with this (at least partial) centralization of production decisions?

To answer these questions, we begin with the following stylized description of the

formation of a business firm. First, the individual of Section III (the "founder") forms a

corporation and contributes his technology described by the production function f to the firm.

In return, he receives ownership of the whole firm, and therefore the right to one hundred percent

of the output of the firm. Second, he hires a manager (or technocrat) whose job it is to run the

firm. Specifically, the manager must choose the amount of physical production to undertake, and

then raise the additional resources necessary to carry out the production plans. The former is

called the investment decision and the latter is called the financing decision.

In this structure, the consumer has "turned over" the production and financing decisions

to the manager but still retains complete (albeit indirect) ownership of the technology through his

ownership of the firm's stock. Clearly, since the manager is hired by the owner, the manager's

job is to make decisions which are in the best interests of the owner. What is not so clear is how

he can achieve this goal. Of course, the manager could review each decision with the owner

including the production choices, cost of obtaining capital, etc., and ask him which combination

he prefers. However, in that case, the owner would have to have the same knowledge and spend

essentially the same amount of time as he would as an owner-manager, and therefore there would

be little point in hiring a manager to "run the business". Moreover, while this procedure might be

feasible when there is a single owner of the firm, it becomes increasingly more difficult as the

number of owners becomes large. Indeed, for a large corporation in the United States, the

number of shareholders or "owners" can range from several thousand to over a million. Hence, a

feasible or operational rule for managing the firm should not require the manager to "poll" the

owner(s) about his decisions. Furthermore, to be effective, the "right rule" should not require the

manager to know the tastes or endowments of the owner(s) because such data are virtually

impossible to obtain, and even if the data were available as of one point in time, they would

change over time. Indeed, since shares of stock change hands every day, the owners of the

corporation change every day. Thus, to be feasible, the right rule should be independent of who

the owner or owners are.

If a feasible rule for the manager to follow were found which would lead him to make the

same investment and financing decisions that each of the individual owners would have made

had they made the decisions themselves, then such a rule would clearly be the "right rule."

Because it was shown in (III.13a) and (III.20a) of Section III that an individual owner would

choose the investment plan which maximizes his current wealth, it follows that the right rule for

the manager is to choose investment so as to maximize current stockholders' (owners') wealth.

Moreover, inspection of (III.13a) and (III.20a) will show that the optimal investment decision

depends only upon the structure of the production technology and market interest rates (i.e., bond

prices). Specifically, it does not depend upon the tastes or endowments of the owners, and so it

can be made without any specific information about the owners. Therefore the manager can

follow the "right rule" without polling the owners with respect to his decisions.

There are a variety of ways to restate the operational criterion by which the manager

should make the investment decision for the firm. One such restatement is: "Choose investment

so as to maximize profits." To see this, consider the two-period case of Section III where the

production technology available to the firm is described by f(X0) with X0 denoting the input

provided at time zero. If, as defined in Section III, P = 1/(1+r) is the price today of a unit of

output delivered next period, then by selling the (future) output of the firm today, the current

revenues of the firm are Pf(X0) and the current profit, \(\Pi\), equals current revenues minus costs, or \(\Pi = P f(X_0) - X_0\). If the manager chooses X0 so as to maximize \(\Pi\), then the chosen amount of investment \(X_0^{*}\) will satisfy \(P f'(X_0^{*}) = 1\), which is exactly condition (III.13a).

This restatement of the operational criterion is also valid in the general case of T-periods

and uncertain cash flows to the firm provided that "profit" is defined in a very technical fashion.

However, it can be misleading if one applies the common (accounting or flow) usage for the

word "profit": namely, "profit in period t" is equal to period t gross cash flow minus period t

costs. So, for example, if the production process requires many periods, then which period's

profit is to be maximized? Or if either future revenues or costs are uncertain, then what is the

meaning of "maximize profits" when profits are described by a random variable?

A second restatement of the operational criterion is: "Using market interest rates, choose

investment so as to maximize the present value of the firm's net cash flows." This is the Present

Value Rule deduced for choosing among claims in Section II where the discount rates used are

the market interest rates because these represent the "cost of money" to the firm. In the two-

period case of Section III, the net cash flow in period 0 is – X0 and in period 1 is f(X0). Hence,

from (II.40), the Present Value Rule says "choose X0 so as to maximize – X0 + f(X0)/(1+r)."

The maximizing amount of investment, \(X_0^{*}\), will satisfy \(f'(X_0^{*}) = 1 + r\), which is exactly

(III.13a).
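As a rough numerical check that the "maximize profits" and "Present Value Rule" restatements select the same investment level in the two-period case, the following sketch uses a hypothetical technology f(X) = 20·sqrt(X) and a hypothetical interest rate r = 0.25. Neither the functional form nor the numbers come from these Notes; they are assumed only so that the optimum falls on a round number.

```python
import math

# Hypothetical inputs (not from the Notes): interest rate and bond price.
R = 0.25
P = 1.0 / (1.0 + R)  # price today of one unit of output delivered next period


def f(x):
    """Assumed production technology: output next period from input x today."""
    return 20.0 * math.sqrt(x)


def profit(x):
    """Current profit: revenue from selling future output today, minus input cost."""
    return P * f(x) - x


def present_value(x):
    """Present value of net cash flows: -x today and f(x) next period."""
    return -x + f(x) / (1.0 + R)


# Crude grid search over integer investment levels 0..200.
grid = range(0, 201)
x_star_profit = max(grid, key=profit)
x_star_pv = max(grid, key=present_value)

# For f(x) = 20*sqrt(x), the marginal product is f'(x) = 10/sqrt(x).
marginal_product = 10.0 / math.sqrt(x_star_pv)

print(x_star_profit, x_star_pv, marginal_product)
```

Under these assumed inputs both searches return an investment of 64, at which the marginal product equals 1.25 = 1 + r, in line with the first-order condition quoted above.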

In the general T-period case of Section III, P0(t) denotes the current market price of $1

payable t periods in the future. The discount rate for period t cash flows, Rt, is defined in

(II.37). Because the Present Value Rule is to be applied using market interest rates, we have

from the present value formula (II.39) with x = 1 that \(P_0(t) = 1/(1+R_t)^{t}\). If, as in Section III, ft

denotes production output in period t and Xt denotes the total amount of inputs required in

period t, then (ft - Xt) is the net cash flow in period t, t = 0,1,...,T. Therefore, from (II.40), the

Present Value Rule says "choose production inputs, (Xjt, j = 0,...,t-1 and t = 1,...,T-1) so as to

maximize the present value of the net cash flows, PV0" which can be written as

(IV.1)
\[ PV_0 \equiv \sum_{\tau=0}^{T} \frac{f_\tau(X_{0\tau}, X_{1\tau}, \ldots, X_{\tau-1,\tau}) - X_\tau}{(1+R_\tau)^{\tau}}. \]

Noting that \(f_0 \equiv 0\) and \(X_T \equiv 0\), we can rewrite (IV.1) as

(IV.2)
\[ PV_0 = \sum_{\tau=1}^{T} \frac{f_\tau(X_{0\tau}, X_{1\tau}, \ldots, X_{\tau-1,\tau})}{(1+R_\tau)^{\tau}} - \sum_{\tau=0}^{T-1} \frac{X_\tau}{(1+R_\tau)^{\tau}}. \]

However, \(P_0(\tau) = 1/(1+R_\tau)^{\tau}\). Therefore, from (III.19) and (IV.2), PV0 = V0. Hence, the set of choices for Xjt which maximize PV0 will satisfy \(\partial PV_0/\partial X_{jt} \equiv \partial V_0/\partial X_{jt} = 0\), j = 0,1,...,t-1 and t = 1,...,T-1, which is exactly condition

(III.20a). Thus, unlike the "Profit Maximization" restatement, the "Present Value Rule"

restatement causes no ambiguities when the production process involves many periods.

However, like the "Profit Maximization" restatement, the Present Value Rule is not well defined

if the future net cash flows are uncertain i.e., what does it mean to maximize the discounted sum

of T random variables?
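The discounted sum in (IV.1) can be sketched in a few lines under certainty. The cash flows and rates below are hypothetical, and the function names are illustrative only; `net_cash_flows[t]` stands in for (ft - Xt) and `rates[t]` for Rt.

```python
def present_value(net_cash_flows, rates):
    """PV_0 = sum over t of net_cash_flows[t] / (1 + rates[t]) ** t."""
    if len(net_cash_flows) != len(rates):
        raise ValueError("need one discount rate per period")
    return sum(cf / (1.0 + r) ** t
               for t, (cf, r) in enumerate(zip(net_cash_flows, rates)))


def bond_price(rate, t):
    """P_0(t): current price of $1 payable t periods from now."""
    return 1.0 / (1.0 + rate) ** t


# Hypothetical three-period plan: invest 100 now, receive 60 in each later period.
flows = [-100.0, 60.0, 60.0]
rates = [0.0, 0.10, 0.10]  # R_0 is irrelevant because (1 + R_0) ** 0 == 1

pv_direct = present_value(flows, rates)
# Equivalent calculation: value each period's flow at the price of a $1 claim.
pv_via_prices = sum(cf * bond_price(r, t)
                    for t, (cf, r) in enumerate(zip(flows, rates)))

print(round(pv_direct, 4), round(pv_via_prices, 4))
```

Both calculations agree (about 4.13 for these assumed numbers), which mirrors the step from discount rates to the bond prices P0(t).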

A third restatement of the operational criterion is: "choose investment so as to maximize

the current market value of the firm." In the two-period case, we determine the current market

value of the firm as a function of the investment decision X0 as follows. Suppose the firm

chooses to operate its technology at the intensity X0, and makes known to the public what its

plans are. At this time which is prior to the actual raising of the necessary funds to implement

the production plan, a market price for the firm is established. Call this market value V_. Note:

since, at this point, the original owner or "founder" still owns one hundred percent of the firm,

V_(X0) is the market value of his ownership (contingent on the firm being operated at intensity

X0). Moreover, although the firm has neither implemented its production plan nor even raised

the necessary funds to purchase the inputs, the founder could actually sell either all or part of his

holdings for λV_ where λ is the fraction of his holdings that he chooses to sell. To determine

V_, we first establish what value the firm will have after it has raised the necessary additional

capital and entered into production. This value, call it V+(X0), is determined by noting that it

must be priced to yield a return competitive with other securities available to investors. In the

certainty environment, this competitive rate of return will be the interest rate r. Since the end of

period value of the firm will be f(X0), V+ must satisfy (1+r)V+(X0) = f(X0), and therefore

V+(X0) = f(X0)/(1+r).

The firm can raise the additional capital by either issuing debt or more equity. In either

case, it must raise $X0 to realize the production plan. If it is done by a debt issue, then the firm

issues a claim promising to pay a fixed amount, b, at the end of the period. If investors are to

provide $X0 to the firm today, then b must be chosen so that they will earn the competitive

interest rate r on their investment i.e., b = (1+r)X0, and the current market price of the debt will

be $X0. By definition, the market value of the firm is equal to the market value of its liabilities

which in this case are debt and equity. Hence, the market value of equity will equal V+(X0) - X0.

But, under this financing arrangement, the founder retains ownership of all the equity, and

therefore \(V_-(X_0) = V_+(X_0) - X_0\), or \(V_-(X_0) = f(X_0)/(1+r) - X_0\). Hence, if the

manager chooses X0 so as to maximize V_(X0), then that X0 will satisfy (III.13a) and will,

therefore, coincide with the decision which would have been made by the owner had he made it.

If the necessary capital is raised by issuing additional equity, then the original owner(s)

must give up some percentage of the equity. As with debt, the additional equity must be priced

to yield a competitive return. If γ is the percentage ownership given to the new shareholders,

then the value of their holdings as of next period will be γ f(X0). Therefore, to raise $X0 today,

γ must be chosen so that γ f(X0) = (1+r)X0 or γ = (1+r)X0/f(X0). Under this financing method,

the original owner's holdings will be worth (1–γ)V+(X0) = f(X0)/(1+r) – X0 = V_(X0) which is the

same as for the debt financed case. So, for either form of financing the right rule is to maximize

V_(X0).
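The equivalence of the two financing methods can be checked with a small sketch. The numbers are hypothetical (a plan requiring 64 today that produces 160 next period, with r = 25 percent), and the function names are illustrative.

```python
def founder_value_debt(output, x0, r):
    """Founder keeps all the equity; the firm owes b = (1 + r) * x0 at period end."""
    v_plus = output / (1.0 + r)   # value of the firm once it is financed
    debt_value = x0               # debt is priced to yield r, so it sells for x0
    return v_plus - debt_value


def founder_value_equity(output, x0, r):
    """New shareholders receive the fraction gamma that earns them r on x0."""
    v_plus = output / (1.0 + r)
    gamma = (1.0 + r) * x0 / output
    return (1.0 - gamma) * v_plus


# Hypothetical plan: raise 64 today, produce 160 next period, r = 25 percent.
output, x0, r = 160.0, 64.0, 0.25
value_debt = founder_value_debt(output, x0, r)
value_equity = founder_value_equity(output, x0, r)

print(value_debt, value_equity)
```

For these assumed inputs the founder's holdings are worth 64 under either method, so the choice between debt and equity does not affect the investment rule.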

To complete the analysis, suppose that in fact the firm has other assets in addition to the

production technology represented by f. Suppose these other assets are simply cash in the

amount of $C. By an analysis similar to the ones just used, one can show that V_(X0) =

f(X0)/(1+r) – X0 + C. Hence, even in the case where C ≥ X0 so that no external financing is

required to implement the production plan, the value-maximization rule leads to the right

decision: namely, choose \(X_0 = X_0^{*}\) so as to satisfy \(f'(X_0^{*}) = 1 + r\).

Using similar arguments in the general T-period case, one can show that the current

market value of the firm, V_, is equal to V0 as defined in (III.14). Hence, if the manager

chooses the production inputs {Xjt} so as to maximize V_, then the resulting choices will

maximize V0 which is exactly condition (III.20a). Therefore, the "Maximize Current Market

Value" Rule leads to the correct decisions when the production process involves many periods.

Although we have not as yet analyzed the case where future cash flows are uncertain, it is

clear that in that case the current market value of the firm is still well defined. (E.g., the future

cash flows of the IBM corporation are uncertain, but there is a current price for its stock which is

not uncertain). Hence, unlike the other two restatements, the "Maximize Current Market Value"

Rule causes no ambiguities if future cash flows of the firm are uncertain. Moreover, as will be

shown later in these Notes, provided that the capital markets are competitive, this Rule leads to

the "right" decision even in an uncertain environment.

In summary, the objective or criterion function for the firm is its current market value,

and good management is to make decisions so as to maximize the firm's criterion function.

Provided that managers operate in this fashion, an efficient allocation of the economy's

productive resources can be achieved with the ownership and management functions separated.

Note that the existence of a well-functioning capital market is essential to the feasibility

of this efficient separation. Of course, the manifest function of the capital market in terms of the

firm's actual transactions is to provide a means for the firm to raise the necessary resources to

carry out its production plans. However, an equally important, but latent function is to provide

information which is necessary for the manager so that he can make the "correct" decisions about

operating the firm. Specifically, while it is reasonable to assume that a good manager will have

as much information about his firm's production technology, {ft}, as anyone, such "internal (to

the firm)" information is not sufficient to make decisions. Indeed, in the absence of a capital

market, we saw in (III.9) that, in addition, the manager would require "external (to the firm)"

information: Namely, the tastes and endowments of the owner. While, in the presence of a

capital market, the manager no longer requires this specific set of external information, he still

requires external information in the form of interest rates or prices. The existence of a capital

market allows the manager to substitute one set of external information which is relatively easy

to obtain for another set which is virtually impossible to obtain. In essence, prices in the capital

market "capture" all the essential information about tastes, endowments, and other investment

opportunities that the manager requires to make the correct decisions.

In reaching these results about the appropriate criterion function for the firm and the role

that capital markets play in the allocation of the economy's productive resources, we have made a

number of abstractions from reality: Namely, we assume perfect certainty about all current and

future events, and a "frictionless" world with no transactions costs, no indivisibilities, all

information available to everyone at no cost; and no explicit labor costs including management's

compensation. Moreover, we assumed that both individuals and firms behave competitively with

respect to their transactions in the capital markets. While, under these hypothesized conditions,

the owner would be just indifferent between the owner-manager structure and the separated

structure of a non-owner manager who makes decisions so as to maximize market value, the

introduction of the slightest "frictions" will generally lead to a definite preference for the

separated structure.

For example, a standard division of labor argument would lead to a definite preference for

the separated structure if either the cost of paying the professional manager is less than that

which the owner could earn in some other occupation or for the same cost, a professional

manager could be found who has a superior understanding of the firm's technology. Indeed, in an

owner-manager structure, the owner must have both the talents of a manager and the financial

resources necessary to carry out production. In the separated structure, no such coincidence is

required. Further, there is the "learning curve" or "going concern" effect which favors the

separated structure. Suppose the owner wants to sell all or part of his technology either now or at

a later date. In an owner-manager structure, the new owner will incur additional costs while he

becomes familiar with the operations of the firm. If there are economies of scale (a form of

"synergism"), then the separated structure is again favored because more than one person's

technology can be managed within a single entity at lower costs than within separate entities. As

will be shown later in the Notes, the introduction of uncertainty will cause individuals to want to

diversify their investments across many technologies, and diversification is difficult to achieve

within an owner-manager structure. Finally, provided that the manager has the most accurate

information about the firm's technology available (i.e., he is technically competent) and provided

that he uses this information to maximize the market value of the firm (i.e., he is benevolent),

then the owners of the firm need to know nothing about either the technology of the firm or the

intensity at which it is being operated. Hence, the separated structure allows for savings in the

costs of information gathering.

Thus, in an economy with production activities and a well functioning capital market, one

would expect to find that, in general, the owners of business firms will not be the managers and

that the ownership of such firms is dispersed among many individuals. Further, one would

expect to observe that, over time, the changes in the composition of the ownership would be far

more volatile than the changes in the composition of the management. However, if the

management follows the value-maximization rule, then it will be acting in the best interests of

the owners at each point in time.

Of course, one might be skeptical about the realism of such "mutual-admiration society"

behavior. It is certainly possible for the current management of a firm to be either incompetent

or malevolent, or both. Of course, the owners could "fire" the management by voting them out.

However, since a major benefit of the separated structure is that the owners can remain relatively

uninformed about the operations of the firm, it is not apparent how these owners could know

whether their firm is being mismanaged or not. Voting rights become an even less feasible solution to the problem if ownership of the firm is widely dispersed. If that is the

situation, then the holdings of any single owner are likely to be so small that he would not incur

the expense to become informed and to convey this information to other owners. Thus, voting

rights alone can do little to solve this dilemma. However, there is another mechanism called the

takeover which, at least in part, can.

Suppose some entity has identified a significantly mismanaged firm (i.e., one whose

management has chosen an investment plan which leads to a market value that is significantly

less than the maximum value that could be achieved). Specifically, the firm has production

technology f(X0) and the management has announced that their investment plan is to operate at

intensity \(X_0^{+}\), where \(X_0^{+} \neq X_0^{*}\) and \(X_0^{*}\) satisfies \(f'(X_0^{*}) = 1 + r\). Moreover, by supposition, \(V_-(X_0^{+})\) is significantly less than \(V_-(X_0^{*})\). In response to the current management's announced plan, the market value of the firm will be \(V_-(X_0^{+})\). Suppose that the entity buys all the shares of this firm at the current market value. Having done so, it fires the management and installs a new management that will choose to operate the firm at intensity \(X_0^{*}\). Having announced the change in the firm's investment plans, the entity now sells the shares of the firm at the new market price, \(V_-(X_0^{*})\), based upon the new investment plan. Hence, by taking over the firm and changing its investment plans, the entity earns an immediate profit of \(V_-(X_0^{*}) - V_-(X_0^{+})\). Note: the entity did not have to add any tangible resources to

the firm to achieve this profit. Hence, the only expenses incurred are the cost of identifying a

mismanaged firm and the cost of acquiring the firm's shares.
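To make the takeover arithmetic concrete, here is a minimal sketch with assumed numbers. The technology f(X) = 20·sqrt(X), the rate r = 0.25, and the mismanaged intensity of 16 are all hypothetical, and the costs of search and acquisition are ignored.

```python
import math

R = 0.25  # hypothetical market interest rate


def f(x):
    """Assumed technology, chosen so that f'(64) = 10 / sqrt(64) = 1 + R."""
    return 20.0 * math.sqrt(x)


def market_value(x):
    """Value of the firm once the plan x is announced: f(x)/(1+r) - x."""
    return f(x) / (1.0 + R) - x


x_announced = 16.0  # hypothetical plan of the current management
x_optimal = 64.0    # value-maximizing intensity for the assumed technology

purchase_price = market_value(x_announced)  # what the acquirer pays
resale_price = market_value(x_optimal)      # value after the plan is changed
takeover_profit = resale_price - purchase_price

print(purchase_price, resale_price, takeover_profit)
```

In this illustration the acquirer buys at 48, resells at 64, and earns 16 without adding any tangible resources to the firm.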

While the cost of identifying a mismanaged firm will vary, it can be quite low if the entity

happens to be a supplier, customer, or competitor of the firm because much of the information

required may have been gathered for other purposes already. For this reason, the takeover

mechanism can work even if resources are not spent for the explicit reason of identifying

mismanaged firms. However, if significant mismanagement of firms were widespread, then it

would pay to spend resources in search for such firms in much the same way that resources are

spent on research for new physical investment projects. Therefore, the threat of a takeover and

the subsequent removal of management provides a strong incentive for current management

(acting in its own self interest) to act in the interests of the firm's current stockholders by

maximizing market value. Indeed, even in the absence of any explicit instructions from the

shareholders or knowledge of the theory for "good management," one might expect managers to

move in the direction of value maximization as simply a matter of self-preservation. Moreover,

it should be noted that the analysis depends in no way on whether the source of the

mismanagement is incompetence or malevolence (i.e., whether the current management are

"fools" or "knaves"), and therefore the takeover mechanism serves equally well to correct either

one. Of course, the effectiveness of the takeover mechanism will depend upon how much of a

threat it poses for current management. For example, in an attempt to prevent the formation of

monopolies in various product markets, the Justice Department will take legal action under the

anti-trust laws to prevent mergers or acquisitions which might reduce competition. Because it is

more likely that a supplier, customer, or competitor will be the entity to identify a mismanaged

firm, this public policy will tend to reduce the threat of takeover. For much the same reason, the

managements of larger firms are probably less vulnerable to a takeover bid. As an aside, this

example illustrates how public policy objectives can be in conflict with one another where no

simple resolution of the conflict is available.

In summary, the gains in efficient resource allocation and reduced costs from the

combined institutional structure of a well functioning capital market and owner-separated-from-manager business firms do not rest upon the delicate and naive assumption of a mutual-admiration society with no conflicts between the interests of owners and managers. Indeed, in

the absence of any external "checks," the management of a firm with dispersed ownership

certainly has the opportunity to enrich themselves at the expense of the owners. However,

because a larger market value for the firm reduces the chances for a takeover and makes the

owners better off, the external check of the takeover mechanism forces management to act as if

its interests were coincident with those of the owners.

In concluding this analysis of the business firm, it is worthwhile to reiterate remarks made

in the Introduction. Firms are economic organizations designed to serve people by performing

specific functions. While in the corporate form, the firm legally has a "corpus," it does not have

a "soul," and therefore has no independent right to existence. Thus, the reader should examine

with care those theories that treat the firm "as if" it were an individual and then deduce the

"proper" rules for good management based upon an exogenously specified utility function for the

firm. Similarly, the reader should be skeptical of theories that treat the firm as if it were "an

island unto itself" and, as such, have management decisions based only upon data which are

"internal" to the firm. The decision as to whether or not a specific project is to be undertaken

should not be based solely upon the engineering and economic specifications of the project.

Such decisions must take into account the economic environment in which the firm is operating,

and doing so requires external information. In the analysis presented here, market interest rates

provide the appropriate connection with the outside environment. The switching phenomenon

(illustrated in Problem II.2 of Section II) clearly demonstrates that the right decision will not be

invariant to these external rates.

Stocks, bonds, and other financial instruments which are an essential part of the

preceding analyses are all examples of financial assets. In its purest form, a financial asset,

unlike a physical asset, has no value for itself but derives all its value from what it gives its

owner a claim on. For example, a stock certificate has virtually no value as a physical asset (i.e.,

as a physical asset, its value is that of a used piece of paper), but it may have great value as a

financial asset because it represents a percentage ownership or claim on the firm's physical assets

and their associated earnings flows. In our stylized description of the formation of a business

firm, the founder gave a physical asset (the technology) to the firm in exchange for a financial

asset (shares of stock in the firm) giving him a claim on the output of the firm. When the firm

raised the necessary additional capital for production, investors may have given the firm physical

assets (the inputs required for production) in return for a financial asset (either debt or equity in

the firm). Or, more likely, the investors exchanged one financial asset (money) for another (debt

or equity), and then the firm gave this money to another firm in return for raw materials.

The principal function of a financial asset is to serve as a store of value. Indeed, the

capital markets could not exist without financial assets. While it is not necessary that a financial

asset have no intrinsic worth (i.e., stock certificates could take the form of engravings on gold

bars), there are two good reasons why it is preferable that it not have any significant value as a

physical asset. First, because to have a positive value as a physical asset is not required for a

financial asset to serve its function, to use something which has a significant physical value as a

financial asset is to waste scarce economic resources. Second, if a financial asset also has value

as a physical asset, then its value will be determined as the maximum of its value as either a

physical asset or a financial asset. If its value as a physical asset should exceed its value as a

financial asset, then it will cease to serve the function of a financial asset. For example, coins

made from metals (e.g., copper, silver or gold) have frequently had the value of their (melted-

down) metal content exceed their stated monetary value in which case they have ceased to be

used as money.

Another function served by financial assets is to allow divisibility of ownership of

physical assets which are not generally divisible. For example, to physically divide a race horse

would be to destroy virtually all its value. However, issuing a financial asset which provides

for a fractional ownership of the race horse (i.e., a right to a certain percentage of all purses, stud

fees, and sales) accomplishes divisibility without affecting the underlying physical asset.

The types of financial assets that are traded in markets are easily identified and are of a

standard form. Hence, they are reasonably liquid in that they can be sold within a short period of

time at something near to the current market price. In general, the existence of financial assets

lowers the requirements for information needed by both parties in order to have trade. Unlike

many physical assets, financial assets are relatively easy to transport from one physical location

to another.

While these services alone would be sufficient to explain the existence of financial assets,

the most important reason for their creation is that without financial assets, it is necessary that

saving must equal investment for each economic unit. For households, saving equals income

minus consumption. In general, for all units, saving equals current income minus current

expenditures. In every case without financial assets, the saving of each unit would have to equal

its investment in physical assets. Indeed, in the "Robinson Crusoe" economy with no financial

assets in Section III, this constraint was specifically stated in (III.6), and it was the relaxation of

this constraint through the introduction of financial assets which allowed the individual to choose

a better allocation. While, even with financial assets, it is still necessary that for the economy as

a whole aggregate saving must equal aggregate investment, it is no longer necessary that saving

must equal investment for each unit.

As a form of summary, we illustrate the benefits of financial assets and a capital market

for both the pure exchange and production cases with a three-period, two-person economy

example. In all cases, we assume that the preferences and "wage income" endowment of person

#I are given by:

(IV.3)
\[ U^{I}(C_0^{I}, C_1^{I}, C_2^{I}) = \log(C_0^{I}) + \log(C_1^{I}) + \log(C_2^{I}) \]

and

\[ y_0^{I} = 300; \quad y_1^{I} = 1500; \quad \text{and} \quad y_2^{I} = 2700. \]

Similarly, for person #II, we assume that the preferences and "wage income" endowment are given

by

(IV.4)
\[ U^{II}(C_0^{II}, C_1^{II}, C_2^{II}) = \log(C_0^{II}) + \log(C_1^{II}) + \log(C_2^{II}) \]

and

\[ y_0^{II} = 2700; \quad y_1^{II} = 1500; \quad \text{and} \quad y_2^{II} = 300. \]

Note that in this example, both people have identical preferences and similar magnitudes of

income except the time patterns of their receipt are reversed.

Problem IV.1: No Production and No Exchange Market:

In this case, both individuals have no choice but to consume their current income each period

because "no production" implies "no storage." Hence,

\(C_0^{I} = y_0^{I} = 300\); \(C_1^{I} = y_1^{I} = 1500\); \(C_2^{I} = y_2^{I} = 2700\), and therefore \(U^{I} = 9.08\). Similarly, \(C_0^{II} = y_0^{II} = 2700\); \(C_1^{II} = y_1^{II} = 1500\); \(C_2^{II} = y_2^{II} = 300\); and \(U^{II} = 9.08\).
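The utility figure for Problem IV.1 can be reproduced with a short sketch. The logarithm base is not stated in the text; the reported value of 9.08 matches base-10 logarithms (natural logarithms would give roughly 20.9), so base 10 is assumed here.

```python
import math


def utility(consumption):
    """Sum of base-10 logs of consumption in each period (base assumed)."""
    return sum(math.log10(c) for c in consumption)


income_I = (300, 1500, 2700)
income_II = (2700, 1500, 300)

# With no production and no exchange, each person consumes current income.
u_I = utility(income_I)
u_II = utility(income_II)

print(round(u_I, 2), round(u_II, 2))
```

Both utilities round to 9.08, since the two endowments contain the same amounts in reverse order.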

Problem IV.2: No Production with an Exchange Market:

Suppose now that there is an exchange market with market interest rates (r1,r2). The current

wealth of person j is given by \(W_0^{j} = y_0^{j} + y_1^{j}/(1+r_1) + y_2^{j}/[(1+r_1)(1+r_2)]\), j = I, II. As discussed in Section III, person j will choose a consumption program as follows:

\[ \max \{ \log(C_0^{j}) + \log(C_1^{j}) + \log(C_2^{j}) \} \]

subject to the constraint that \(W_0^{j} = C_0^{j} + C_1^{j}/(1+r_1) + C_2^{j}/[(1+r_1)(1+r_2)]\). The optimal

solution is given by

(IV.5)
\[ C_0^{j*} = W_0^{j}/3 \]
\[ C_1^{j*} = (1+r_1)\,C_0^{j*} = (1+r_1)\,W_0^{j}/3 \]
\[ C_2^{j*} = (1+r_2)\,C_1^{j*} = (1+r_1)(1+r_2)\,W_0^{j}/3 \]

for j = I,II.

To determine the market interest rates (r1,r2), we impose the equilibrium condition that

aggregate (planned) saving must equal aggregate (planned) physical investment for the economy

in each period. Because there are no means of production (including storage), aggregate

investment, and hence aggregate saving, must equal zero i.e.,

\(C_t^{I*} + C_t^{II*} = y_t^{I} + y_t^{II}\), t = 0,1,2. The equilibrium set of interest rates that allow these

market clearing conditions to be satisfied is (r1 = 0, r2 = 0). Hence,

\(W_0^{I} = W_0^{II} = 4500\), and from (IV.5), we have that \(C_t^{j*} = 1500\) for j = I, II and t = 0,1,2 and that \(U^{I} = U^{II} = 9.53\). Hence, both people are better off with an exchange market than they were in Problem IV.1. Note that in both problems, aggregate saving, \(S_t^{I} + S_t^{II}\), equals zero in each period, where \(S_t^{j} \equiv y_t^{j} - C_t^{j*}\) (j = I, II and t = 0,1,2). However, in Problem IV.1, \(S_t^{I} = S_t^{II} = 0\), whereas in this problem \(S_t^{j} \neq 0\), i.e., saving need

not equal investment for each unit.
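The exchange-market solution of Problem IV.2 can be verified numerically. The sketch below takes the equilibrium rates r1 = r2 = 0 as given from the text rather than solving for them, applies (IV.5), and again assumes base-10 logarithms for utility; the helper names are illustrative.

```python
import math


def wealth(income, r1, r2):
    """Current wealth: wage income discounted at the market rates."""
    return income[0] + income[1] / (1 + r1) + income[2] / ((1 + r1) * (1 + r2))


def optimal_consumption(w0, r1, r2):
    """Consumption plan (IV.5) for log utility."""
    c0 = w0 / 3
    c1 = (1 + r1) * c0
    c2 = (1 + r2) * c1
    return (c0, c1, c2)


incomes = {"I": (300, 1500, 2700), "II": (2700, 1500, 300)}
r1, r2 = 0.0, 0.0  # equilibrium rates stated in the text

plans = {j: optimal_consumption(wealth(y, r1, r2), r1, r2)
         for j, y in incomes.items()}
# Saving of each person in each period: income minus planned consumption.
saving = {j: tuple(y - c for y, c in zip(incomes[j], plans[j])) for j in incomes}
aggregate_saving = tuple(saving["I"][t] + saving["II"][t] for t in range(3))
utilities = {j: sum(math.log10(c) for c in plans[j]) for j in plans}

print(plans, saving, aggregate_saving, utilities)
```

Each person consumes 1500 per period; Person #I borrows 1200 in period 0 and repays it in period 2, Person #II does the reverse, and aggregate saving is zero in every period.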

Note that even if we had relaxed the no-production condition in Problem IV.1 to allow for

costless storage, the resulting solution would not have been the same as with an exchange

market. Person #II could achieve the optimal (1500, 1500, 1500) consumption plan using

storage. However, even with storage, Person #I would still choose the same (300, 1500, 2700)

allocation chosen in the absence of storage. This underscores the point that storage only allows

one to "transport" goods forward in time and not backwards whereas by having financial assets

and an exchange market, one can change his allocation in either direction.
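The asymmetry between storage and an exchange market can be illustrated with a simple feasibility check. The sketch assumes costless storage and a hypothetical helper, `feasible_with_storage`, which asks whether a consumption plan can be financed from income and goods carried forward, without ever drawing on the future.

```python
def feasible_with_storage(income, consumption):
    """True if the plan never requires a negative stock of stored goods."""
    stock = 0.0
    for y, c in zip(income, consumption):
        stock += y - c      # goods carried forward into the next period
        if stock < 0:
            return False    # the plan would need goods from a later period
    return True


target = (1500, 1500, 1500)
person_II_ok = feasible_with_storage((2700, 1500, 300), target)
person_I_ok = feasible_with_storage((300, 1500, 2700), target)

print(person_II_ok, person_I_ok)
```

The smooth plan is feasible for Person #II, whose income arrives early, but not for Person #I, whose income arrives late.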

Problem IV.3: Production and No Exchange Market

Suppose that Person #I has, in addition to his wage income endowment, a production technology

which transforms one unit of input this period into two units of output next period

(i.e., \(f_{t+1}^{I} = 2X_t\), t = 0,1). Suppose further that Person #II has the (storage) production
technology which transforms one unit of input this period into one unit of output next period
(i.e., \(f_{t+1}^{II} = X_t\), t = 0,1). Even though Person #I's technology provides for a 100 percent rate
of return per period, his optimal choice is to not use his technology to produce any goods. The
reason is that his current period's income (\(y_0^{I} = 300\)) is so small by comparison with his later

period's income that he prefers to consume all his current income rather than produce. Therefore,

he derives no benefit from his technology, and has the same consumption allocation as in the no-

production case of Problem IV.1. On the other hand, Person #II does use his production

technology to achieve the optimal allocation (1500, 1500, 1500). Because Person #II is

producing goods with a technology which is inferior to the one owned by Person #I who is not

producing at all, there is an obvious "loss" to the economy. Under this allocation, there is a total

of 3000 available to the economy in each period. However, if the 1200 that Person #II carries

over from the current period by storage had been employed in Person #I's technology, then there

would have been an extra 1200 available to the economy in the second period with a

corresponding compound increase for the third period. As we now show, this inefficient

allocation is corrected by the introduction of a competitive exchange market.

Problem IV.4: Production with an Exchange Market

Suppose we now combine the production technologies of Problem IV.3 with the exchange

market in Problem IV.2. For a competitive exchange market and the given technologies the

equilibrium interest rates (r1,r2) must each be greater than or equal to 100 percent. Otherwise,

Person #I would register an indefinitely large demand for current period goods. Indeed, by

requiring that markets clear, the equilibrium rates will just equal 100 percent. At r1 = r2 = 1, the

present value of the (superior) technology will be zero. Hence, the wealth of each person will be

equal to the present (or "capitalized") value of his wage income. Thus,

\( W_0^{I} \) = 300 + 1500/2 + 2700/4 = 1725 and \( W_0^{II} \) = 2700 + 1500/2 + 300/4 = 3525. From

(IV.5), the optimal consumption program chosen by Person

#I is \( C_0^{I*} \) = 575; \( C_1^{I*} \) = 1150; and \( C_2^{I*} \) = 2300 with \( U^{I} \) = 9.182. Similarly for

Person #II, \( C_0^{II*} \) = 1175; \( C_1^{II*} \) = 2350; and \( C_2^{II*} \) = 4700 with \( U^{II} \) = 10.113. As a

result of the introduction of an exchange market, both people are better off than they were in

Problem IV.3.
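The figures quoted in Problem IV.4 can be checked with a short calculation. The Python sketch below rests on two assumptions that are not stated in this passage: that (IV.5) allocates one third of wealth to current consumption with consumption then growing at the rate of interest, and that utility is the sum of base-10 logarithms of consumption. Both choices reproduce the numbers in the text; the function names are illustrative only.

```python
import math

def present_value(incomes, rate):
    """Present value at t = 0 of incomes received at t = 0, 1, 2, ..."""
    return sum(y / (1.0 + rate) ** t for t, y in enumerate(incomes))

def consumption_plan(wealth, rate, periods=3):
    """Assumed form of (IV.5): equal present-value shares of wealth per period."""
    c0 = wealth / periods
    return [c0 * (1.0 + rate) ** t for t in range(periods)]

def utility(plan):
    """Assumed utility function: sum of base-10 logs of consumption."""
    return sum(math.log10(c) for c in plan)

rate = 1.0  # r1 = r2 = 100 percent
wages = {"I": [300, 1500, 2700], "II": [2700, 1500, 300]}

results = {}
for person, incomes in wages.items():
    wealth = present_value(incomes, rate)
    plan = consumption_plan(wealth, rate)
    results[person] = (wealth, plan, utility(plan))
    print(person, wealth, plan, round(utility(plan), 3))
```

Because the technology earns exactly the market rate of interest, it adds nothing to wealth, which is why only wage income enters `present_value`.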

In summary, the need for financial assets arises from the discrepancy between (desired)

saving and investment of individual economic units. If investment exceeds saving for a given

economic unit, it can finance this "saving deficit" by either issuing a financial asset (a liability to

the issuer) or by selling an already existing financial asset. Purchase and sale transactions of

already existing financial assets take place in a secondary market. Hence, with such a market,

the outstanding stock of financial assets need not change even if some units have a saving deficit.

Primary and secondary markets are an efficient means of channeling required investment funds

to the most productive units. While the analyses and examples have been centered around

private sector investment for a closed economy with no government, the same principles apply to

public investment. Thus, the same analyses could be applied to less developed countries where

the government is the main investment unit. While the analyses in Problems IV.1 - IV.4 were

structured along the lines of two individuals, the same analyses and resulting benefits would

apply if the two people were reinterpreted as two countries where international capital flows

replaced individual savings-investment deficits. The following flow and balance sheet

statements provide a detailed description of savings and investment flows for the case examined

in Problem IV.4.

Flow Statement t = 0

                              Person #I    Person #II    Aggregate (I + II)
Wage Income                         300         2,700                 3,000
Production Income                     0             0                     0
Operating Income                    300         2,700                 3,000
Interest Income (Expense)             0             0                     0
Net Income                          300         2,700                 3,000
–Consumption                       –575        –1,175                –1,750
Savings                           (275)         1,525                 1,250
–Investment                      –1,250             0                –1,250
Savings Surplus (Deficit)       (1,525)         1,525                     0

Balance Sheet t = 0+

Assets                        Person #I    Person #II    Aggregate
Capital                           1,250             0        1,250
Bonds                                 0         1,525        1,525
Capitalized Wage Income           1,425           825        2,250
Total                             2,675         2,350        5,025

Liabilities                   Person #I    Person #II    Aggregate
Debt                              1,525             0        1,525
Net Worth                         1,150         2,350        3,500
Total                             2,675         2,350        5,025

Flow Statement t = 1

                              Person #I    Person #II    Aggregate (I + II)
Wage Income                       1,500         1,500                 3,000
Production Income                 1,250             0                 1,250
Operating Income                  2,750         1,500                 4,250
Interest Income (Expense)       (1,525)         1,525                     0
Net Income                        1,225         3,025                 4,250
–Consumption                     –1,150        –2,350                –3,500
Savings                              75           675                   750
–Investment                        –750             0                  –750
Savings Surplus (Deficit)         (675)           675                     0

Balance Sheet t = 1+

Assets                        Person #I    Person #II    Aggregate
Capital                           2,000             0        2,000
Bonds                                 0         2,200        2,200
Capitalized Wage Income           1,350           150        1,500
Total                             3,350         2,350        5,700

Liabilities                   Person #I    Person #II    Aggregate
Debt                              2,200             0        2,200
Net Worth                         1,150         2,350        3,500
Total                             3,350         2,350        5,700

Flow Statement t = 2

                              Person #I    Person #II    Aggregate (I + II)
Wage Income                       2,700           300                 3,000
Production Income                 2,000             0                 2,000
Operating Income                  4,700           300                 5,000
Interest Income (Expense)       (2,200)         2,200                     0
Net Income                        2,500         2,500                 5,000
–Consumption                     –2,300        –4,700                –7,000
Savings                             200       (2,200)               (2,000)
–Investment (Liquidation)      –(2,000)             0              –(2,000)
Savings Surplus (Deficit)         2,200       (2,200)                     0
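The flow statements can be regenerated period by period from the wage and consumption figures of Problem IV.4. The following Python sketch is one way to do so. It assumes, as the statements imply, that Person #II lends his entire savings to Person #I and that Person #I invests those funds together with his own savings; the variable names are illustrative.

```python
RATE = 1.0  # equilibrium one-period interest rate (100 percent)
wages = {"I": [300, 1500, 2700], "II": [2700, 1500, 300]}
consumption = {"I": [575, 1150, 2300], "II": [1175, 2350, 4700]}

capital = 0.0  # Person #I's capital in place at the start of the period
debt = 0.0     # Person #I's debt owed to Person #II at the start of the period
statements = []
for t in range(3):
    production_income = RATE * capital
    interest = RATE * debt
    net_income_I = wages["I"][t] + production_income - interest
    net_income_II = wages["II"][t] + interest
    savings_I = net_income_I - consumption["I"][t]
    savings_II = net_income_II - consumption["II"][t]
    # Assumption: all of Person #II's savings are lent to Person #I, so
    # aggregate investment equals aggregate savings (negative = liquidation).
    investment = savings_I + savings_II
    statements.append({
        "t": t,
        "net_income_I": net_income_I,
        "net_income_II": net_income_II,
        "investment": investment,
        "surplus_I": savings_I - investment,
        "surplus_II": savings_II,
    })
    capital += investment
    debt += savings_II

for row in statements:
    print(row)
```

After the final period both `capital` and `debt` return to zero, which is the accounting counterpart of the liquidation shown in the t = 2 statement.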

V. THE "DEFAULT-FREE" BOND MARKET AND FINANCIAL INTERMEDIATION IN BORROWING AND LENDING

In Sections III and IV, we derived some of the important functions served by a capital

market in the efficient allocation of the economy's productive resources. By making a number of

abstractions from reality, we were able to derive these functional characteristics using a relatively

simple structure. The most important of these abstractions in terms of simplification was the

assumption of a perfect certainty environment. This assumption ensured that the future course of

interest rates was known in advance and that the promised payments on all claims would be met

at the time promised. Unfortunately, the perfect certainty assumption is also the least realistic of

the abstractions made, and therefore will be jettisoned beginning in Section VIII at the cost of

introducing a more complex structure. However, in this section, we continue (at least in part)

with the assumptions of Sections III and IV to analyze the "default-free", fixed-income securities

part of the capital market where maintaining the certainty assumption does the least violence to

reality.

Fixed income securities are claims with fixed or stated payments promised at specified

times. The most common type of fixed-income security is debt. However, what is "promised" is

not always paid, and the event of not meeting a promise on a fixed-income security is called

default. A default-free, fixed-income security is contained in that subset of fixed-income

securities where the promised payments will be met with (virtual) certainty. Strictly interpreted,

the only securities that fall in this class are debt issues of the federal government and its agencies

or debt issues which are guaranteed by the federal government, because the federal

government can always meet money-fixed obligations by "printing" money. However, in

practice, many state and some local government issues as well as some "gilt-edge" corporate debt

issues are treated as if they were default-free. These securities are not only important because

they represent a not insignificant fraction of the capital market ($800 billion of federal

government debt obligations are held by the private sector), but also because their prices provide

the base yield upon which other securities' prices are determined. For example, because the

promised payments on fixed income securities are also the maximum payments that their holders

can receive, the promised yield on a fixed income security must be at least as large as the yield on

a corresponding, default-free fixed income security.

On the Pricing of Discount Bonds and the Term Structure of Interest Rates

We begin the study of default-free income securities by examining how prices are

determined for discount bonds which promise a payment of $1. As in Section III, let Pt(τ)

denote the price at date t of a default-free bond which promises a payment of $1 at date (t + τ).

Let rt denote the one-period rate of interest that can be earned between date (t - 1) and date t.

Suppose that there is a market in which these discount bonds are traded and that this market is

"open" for trading each period. Further, suppose that there are no transactions costs or taxes.

Consider two bonds with maturities τ1 and τ2 respectively at date t = 0. The return per dollar

from holding bond j over the next period is equal to the ratio of bond j's price next period to

its current price, i.e., P1(τj – 1)/P0(τj), j = 1,2. Suppose that P1(τ1 – 1)/P0(τ1) < P1(τ2 –

1)/P0(τ2). Then, any investor who plans to invest in the first bond now would be better off to

purchase the second bond now and wait (at least) until next period to purchase the first bond. To

see this, let the investor have $I to invest now. If he buys the first bond, then he will purchase

N1 = I/P0(τ1) bonds which will be worth $N1P1(τ1–1) next period. If instead he invests in the

second bond now, then he will purchase N2 = I/P0(τ2) bonds now which will be worth $N2P1(τ2 -

1) next period. By hypothesis, N2P1(τ2–1) > N1P1(τ1–1). Hence, by following the second

strategy, the investor will have enough money next period to buy N1 of the first bonds plus he

will have $[N2P1(τ2–1) – N1P1(τ1–1)] left over. At these prices, the second bond is said to

dominate the first bond because independent of preferences or time horizon, every investor

would prefer the second bond to the first.
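A small numerical illustration may make the dominance argument concrete. The Python sketch below uses purely hypothetical prices (they do not come from the text) chosen so that the second bond offers the higher one-period return per dollar.

```python
def one_period_value(budget, price_now, price_next):
    """Value next period of `budget` dollars invested in a bond today."""
    bonds_bought = budget / price_now
    return bonds_bought * price_next

budget = 1000.0                     # $I available to invest now
# Hypothetical prices, for illustration only:
p0_first, p1_first = 0.90, 0.95     # P0(tau1), P1(tau1 - 1)
p0_second, p1_second = 0.80, 0.88   # P0(tau2), P1(tau2 - 1)

n1 = budget / p0_first              # bonds bought under the first strategy
value_first = one_period_value(budget, p0_first, p1_first)
value_second = one_period_value(budget, p0_second, p1_second)

# Under the second strategy the investor can buy the same n1 first bonds
# next period and still keep the difference.
left_over = value_second - n1 * p1_first
print(round(value_first, 2), round(value_second, 2), round(left_over, 2))
```

Whatever the investor's horizon, the second strategy delivers the same holding of the first bond next period plus `left_over` in cash, which is the sense in which the second bond dominates.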

If P1(τ1 - 1)/P0(τ1) > P1(τ2 - 1)/P0(τ2), then by a similar argument, the first bond would

dominate the second bond. Since no investor would be willing to hold a dominated bond, a

necessary condition for equilibrium is that no bond dominate any other bond. Thus, in

equilibrium, we have that

(V.1)    \( P_1(\tau_1 - 1)/P_0(\tau_1) = P_1(\tau_2 - 1)/P_0(\tau_2) \)

for all maturities τ1 and τ2. Because these bonds are default-free, Pt(0) ≡ 1 for all t, and by

definition, Pt(0)/Pt-1(1) ≡ 1 + rt. Hence, we can rewrite equilibrium condition (V.1) as

(V.2)    \( P_1(\tau - 1)/P_0(\tau) = 1 + r_1 \)

for all maturities τ. Moreover, the same argument can be used to show that for any starting

date t, the one-period return per dollar on bonds of all maturities must be the same. Therefore,

condition (V.2) can be rewritten more generally as

(V.3)    \( P_{t+1}(\tau - 1)/P_t(\tau) = 1 + r_{t+1} \)

for all dates t = 0, 1, 2, ... and all maturities τ.

Consider now a specific bond which at t = 0 has maturity T. At date t = T, the bond

matures and will therefore have price PT(0) = 1. At date t = T-1, we have from (V.3) that its

price must satisfy PT-1(1) = PT(0)/(1 + rT) = 1/(1 + rT). At date t = T - 2, we have again from

(V.3) that its price must satisfy PT-2(2) = PT-1(1)/(1+rT-1) = 1/[(1 + rT)(1 + rT-1)]. Continuing in

this "backwards" recursive fashion, we can derive the price that this bond must have so that it

neither dominates nor is dominated by a one-period bond at any point in time during its

existence. The price formula is given by

(V.4)    \( P_0(T) = 1 \Big/ \left[ \prod_{j=1}^{T} (1 + r_j) \right] \)

and this must hold for all maturities T.
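The "backwards" recursion and the closed-form product in (V.4) can be compared directly. The Python sketch below is illustrative only; the function names and the three sample rates are assumptions made for the example.

```python
def price_by_recursion(rates):
    """P_0(T) by backward use of (V.3); rates[j-1] holds r_j, T = len(rates)."""
    price = 1.0                      # P_T(0) = 1 at maturity
    for r in reversed(rates):        # step back from date T to date 0
        price = price / (1.0 + r)
    return price

def price_by_product(rates):
    """P_0(T) from the product formula (V.4)."""
    product = 1.0
    for r in rates:
        product *= 1.0 + r
    return 1.0 / product

sample_rates = [0.02, 0.05, 0.10]    # r_1, r_2, r_3 (sample values)
p_recursive = price_by_recursion(sample_rates)
p_product = price_by_product(sample_rates)
print(round(p_recursive, 6), round(p_product, 6))
```

The two functions agree because each backward step divides by one more factor of (1 + r_j), so the recursion simply builds the product one term at a time.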

As the reader will note from (II.37) and (II.38), (V.4) simply says that "the equilibrium

price for a default-free, discount bond is given by the present value formula using the current and

future one-period, market interest rates." Further, using (II.38), we can rewrite (V.4) in terms of

the average compound rate of return as

(V.5)    \( P_0(T) = 1/(1 + R_T)^T . \)

Thus, given complete knowledge of the future course of one-period interest rates,

{r1,r2,...,rT,...}, one can use (V.4) to determine the current prices of default-free, discount bonds

of all maturities. However, the process can also be "reversed": namely, given a complete set of

current prices for default-free, discount bonds of all maturities, {P0(1),...,P0(T),...}, the future

course of one-period interest rates can be determined. From (V.4), we have that

(V.6)    \( r_T = [P_0(T - 1)/P_0(T)] - 1 \)

for T = 1, 2,... . Note that the difference between (V.6) and (V.3) is that (V.6) specifies a

condition on the price ratio of two bonds with different maturities at the same point in calendar

time while (V.3) specifies a condition on the price ratio of the same bond at two different points

in calendar time. However, (V.3) and (V.6) can be combined to specify a relationship between

the dynamics or time series of a specific bond's price over time and the statics or cross-sectional

series of different maturity bond prices at the current time. This relationship can be written as

(V.7)    \( P_t(\tau - 1)/P_{t-1}(\tau) = P_0(t - 1)/P_0(t), \)

for all future dates t = 1, 2, ... and all maturities τ. Thus, from a cross section of current bond

prices, one can deduce the dynamics of future bond prices and interest rates.
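The "reversed" process can be sketched in a few lines of Python. The helper `future_price` below relies on the relation P_t(τ) = P_0(t + τ)/P_0(t), which is not written out in the text but follows from applying (V.7) repeatedly under certainty; it, like the other function names and sample rates, should be read as an illustrative assumption.

```python
def discount_prices(rates):
    """[P_0(0), P_0(1), ..., P_0(T)] from r_1..r_T via (V.4)."""
    prices = [1.0]
    for r in rates:
        prices.append(prices[-1] / (1.0 + r))
    return prices

def implied_rates(prices):
    """(V.6): r_T = P_0(T-1)/P_0(T) - 1 for T = 1, 2, ..."""
    return [prices[T - 1] / prices[T] - 1.0 for T in range(1, len(prices))]

def future_price(prices, t, tau):
    """P_t(tau) implied by today's prices (derived relation, see lead-in)."""
    return prices[t + tau] / prices[t]

sample_rates = [0.02, 0.05, 0.10]
prices = discount_prices(sample_rates)
recovered = implied_rates(prices)

# Check (V.7) for t = 2, tau = 1: P_2(0)/P_1(1) = P_0(1)/P_0(2)
lhs = future_price(prices, 2, 0) / future_price(prices, 1, 1)
rhs = prices[1] / prices[2]
print([round(r, 4) for r in recovered], round(lhs, 4), round(rhs, 4))
```

In this certainty setting the implied rates are simply the future one-period rates themselves, so `recovered` reproduces `sample_rates`.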

In describing the cross-sectional structure of current bond prices, it is the practice to quote

the average compound returns or yields on the different maturity bonds rather than their prices.

These yields are determined by the current prices using (V.5). I.e.,

(V.8)    \( R_T = [P_0(T)]^{-1/T} - 1 . \)

The curve generated by plotting the yield, RT, against maturity, T, is called the yield curve or

the term structure of interest rates. A "rising" term structure is one where RT+1 > RT for all T

and is illustrated in Figure V.1. A "U-shaped" term structure is one where either RT+1 > RT for

0 < T < T* and RT+1 < RT for T* < T or RT+1 < RT for 0 < T < T* and RT+1 > RT for T* < T,

and is illustrated in Figure V.2. A "flat" term structure is one where RT = RT+1 for all T. Of

course, in general, the only restrictions on the shape of the term structure are that the current

bond prices implied by these yields satisfy (V.4) and that the future one-period interest rates

implied by these yields are non-negative.

As was discussed briefly at the end of Section II, one should not confuse "RT" with "rT"

in interpreting the yield curve. While the {rT} can be deduced from {RT}, the two are not

equal to one another, and indeed a plot of the rT versus T can look qualitatively quite different

from the yield curve. Moreover, if one buys a discount bond at a yield of RT(T > 1), then, in

general, its rate of return or growth in value in each period will not be the same and, indeed, can

be quite different from RT. Table V.1 illustrates these points by showing how the yields for

different maturity bonds at the current time correspond to the time pattern of one-period interest

rates. In addition, it also provides a comparison of the pattern of appreciation from an initial

$1000 investment in a fifteen-period discount bond with the pattern which would be generated if

each period the $1000 investment grew at the yield rate, R15.

Figure V.1 A “Rising” Term Structure

Figure V.2 A “U” Shaped Term Structure

Table V.1

Interest Rates, Yields, and Investment Returns Comparisons

At Calendar    One-Period Interest    Yield,    Actual Value of $1000    Value of $1000 Initial Investment
Time, t        Rate, rt               Rt        Initial Investment       at R = 6.2% per Period

 1              2%                    2.0%      $1020                    $1062
 2              5                     3.5        1071                     1128
 3             10                     5.6        1178                     1199
 4              8                     6.2        1272                     1273
 5              4                     5.8        1323                     1353
 6              2                     5.1        1350                     1437
 7              3                     4.8        1390                     1526
 8              4                     4.7        1446                     1621
 9              5                     4.7        1518                     1722
10              6                     4.9        1609                     1830
11              7                     5.1        1722                     1944
12              8                     5.3        1860                     2065
13              9                     5.6        2027                     2193
14             10                     5.9        2230                     2330
15             11                     6.2        2475                     2475

As Table V.1 along with Figure V.3 demonstrates, the actual appreciation pattern from

investing in a fifteen-period discount bond is very different from the hypothetical pattern

generated by a constant rate of growth at that bond's yield rate. Because the bond's price

dynamics must satisfy (V.3), the rate of return on the bond in each period must equal that period's

one-period interest rate, and hence, the observed erratic return pattern is a direct reflection of the

variability in those rates. Although the actual and hypothetical investments are (virtually) equal

at the end of period four, this is a coincidence of the particular pattern in the one-period rates. In

general, unless the term structure is flat, the values of the two investments will coincide only at

the maturity date of the bond.
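The entries in Table V.1 can be regenerated from the column of one-period rates alone. The Python sketch below does this using (V.4) and (V.8). One assumption is made explicit: the hypothetical growth column appears to have been computed with the unrounded fifteen-period yield (about 6.23 percent) rather than exactly 6.2 percent, since that is what reproduces the tabulated values.

```python
rates_pct = [2, 5, 10, 8, 4, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
rates = [r / 100.0 for r in rates_pct]

# Cumulative growth factors: product of (1 + r_j) for j = 1..T
growth = []
factor = 1.0
for r in rates:
    factor *= 1.0 + r
    growth.append(factor)

# Yields from (V.8), using P_0(T) = 1/growth[T-1]
yields = [growth[T - 1] ** (1.0 / T) - 1.0 for T in range(1, len(rates) + 1)]

# Actual value of a $1000 investment in the fifteen-period bond
actual = [1000.0 * g for g in growth]

# Hypothetical value if the investment grew at the (unrounded) yield R_15
r15 = yields[-1]
hypothetical = [1000.0 * (1.0 + r15) ** t for t in range(1, len(rates) + 1)]

for t in range(len(rates)):
    print(t + 1, rates_pct[t], round(100 * yields[t], 1),
          round(actual[t]), round(hypothetical[t]))
```

The two value columns necessarily agree at t = 15, because the yield is defined as the constant growth rate that reproduces the bond's value at maturity.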

Similarly, Table V.1 and Figure V.4 show that the cross-sectional pattern of yields is very

different from the time series of one-period interest rates. As with any average, the changes in

the yields are less pronounced than the changes in the one-period rates. Moreover, the "turning

points" or the (approximately) flat points in the yield curve always occur after the "turning

points" in the one-period rates. So, for example, the local "peak" in the yield curve at the end of

period four occurs after the local peak in the one-period rates at the end of period three.

Similarly, the local "trough" in the yield curve between periods eight and nine occurs after the

trough in the one-period rates at the end of period six. Note too that the longer-term yields are

less sensitive than the shorter-term yields to a change in any one of the one-period rates. From

period one to period three, the one-period rates went from 2 percent to 10 percent while the

yields went from 2 percent to 5.6 percent. However, from period six to period fifteen, the one-

period rates increased steadily from 2 percent to 11 percent while the yields only went from 5.1

percent to 6.2 percent. Finally, because the yields are "geometric" averages, the T-period yield

will be less than the (arithmetic) average of the one-period rates for the T periods whenever those rates are not all equal (i.e.,

\( R_T < (\sum_{t=1}^{T} r_t)/T \)).
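The inequality between the T-period yield and the arithmetic average of the one-period rates follows from the concavity of the logarithm. As a brief sketch, using only the definition of the yield in (V.4) and (V.5):

```latex
\begin{align*}
\log(1 + R_T) &= \frac{1}{T} \sum_{t=1}^{T} \log(1 + r_t)
  \;\le\; \log\!\Big(1 + \frac{1}{T} \sum_{t=1}^{T} r_t\Big), \\
\text{so}\quad R_T &\le \frac{1}{T} \sum_{t=1}^{T} r_t ,
\end{align*}
```

with equality only when all of the one-period rates are equal.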

Figure V.3

Figure V.4

The relationships between the {rT} and {RT} illustrated in Table V.1 are patterns that

hold in general. From (V.4) and (V.5), we have that

(V.9)    \( (1 + R_{T+1})/(1 + R_T) = [(1 + r_{T+1})/(1 + R_T)]^{1/(T+1)} . \)

Therefore, as was pointed out in (II.41), \( R_{T+1} \gtreqless R_T \) if and only if \( r_{T+1} \gtreqless R_T \). Specifically, flat or

turning points in the yield curve correspond to maturities where rT+1 = RT. Hence, if the pattern

of one-period rates between t = 0 and t = T* is a rising one (i.e., r1 < r2 < ... < rT*), then from

(V.9), the yields for maturities in that region will also be rising (i.e., RT > RT-1 for T = 1,

2,...,T*). Moreover, for the peak in yields to coincide with the peak in one-period rates (i.e.,

RT*+1 < RT*), the one-period rate rT*+1 would have to satisfy

\( \log(1 + r_{T^*+1}) < \left[ \sum_{t=1}^{T^*} \log(1 + r_t) \right] / T^* \), which for T* much larger than one would require that rT*+1

<< rT*. Hence, if the yield curve rises significantly over an extended number of periods, then

almost certainly, the peak in the yield curve will occur after the peak in the one-period rates. A

similar argument applies for the trough in the yield curve occurring after the trough in one-period

rates when the yield curve is declining.
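The lag between turning points can be checked against the rates of Table V.1. The Python sketch below verifies the sign condition implied by (V.9) at every maturity and then locates the interior local peaks and troughs of both series; the helper names are illustrative assumptions.

```python
rates = [r / 100.0 for r in (2, 5, 10, 8, 4, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11)]

def yields_from_rates(rates):
    """R_1..R_T as geometric averages of the one-period rates."""
    out, growth = [], 1.0
    for T, r in enumerate(rates, start=1):
        growth *= 1.0 + r
        out.append(growth ** (1.0 / T) - 1.0)
    return out

def turning_points(series, kind):
    """1-based positions of interior local peaks or troughs of `series`."""
    found = []
    for i in range(1, len(series) - 1):
        left, mid, right = series[i - 1], series[i], series[i + 1]
        if kind == "peak" and left < mid > right:
            found.append(i + 1)
        if kind == "trough" and left > mid < right:
            found.append(i + 1)
    return found

yields = yields_from_rates(rates)

# (V.9): R_{T+1} > R_T exactly when r_{T+1} > R_T.
# yields[T-1] is R_T, yields[T] is R_{T+1}, rates[T] is r_{T+1}.
consistent = all(
    (yields[T] > yields[T - 1]) == (rates[T] > yields[T - 1])
    for T in range(1, len(rates))
)

print(consistent)
print(turning_points(rates, "peak"), turning_points(yields, "peak"))
print(turning_points(rates, "trough"), turning_points(yields, "trough"))
```

For these rates the peak in one-period rates at period three is followed by a peak in yields at period four, and the trough at period six by a trough in yields at period eight.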

From (II.37), we can derive the effect on the T-period yield from a change in one of the

one-period rates to be

(V.10)    \( \partial R_T / \partial r_t = \frac{1}{T} \{ (1 + R_T)/(1 + r_t) \}, \quad t = 1, 2, \ldots, T. \)

Hence, the sensitivity of the yield curve between (T - 1) and T to the one-period interest rate for

that period can be written as

(V.11)    \( \partial (R_T - R_{T-1}) / \partial r_T = \frac{1}{T} \{ (1 + R_T)/(1 + r_T) \} . \)

Inspection of (V.11) shows that the longer is the maturity, the less sensitive the yield curve will

be to distant future one-period rates. Indeed, in the limit as T → ∞, ∂(RT - RT–1)/ ∂rT → 0.

Therefore, virtually all yield curves will exhibit a "flattening" pattern for very long maturities.

Since this pattern will occur for virtually all time paths in future one-period rates, great care must

be exercised in using the yield curve to draw inferences about the distant future one-period

interest rates. For example, such a pattern does not imply the existence of a stable, "long-run" or

"steady-state" one-period interest rate.
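The sensitivity formula (V.10) can be compared with a finite-difference estimate. The Python sketch below is only a numerical check under assumed sample rates; the function names and the step size are illustrative choices.

```python
def yield_from_rates(rates):
    """R_T, the geometric-average yield for T = len(rates)."""
    growth = 1.0
    for r in rates:
        growth *= 1.0 + r
    return growth ** (1.0 / len(rates)) - 1.0

def sensitivity_formula(rates, t):
    """(V.10): dR_T/dr_t = (1/T)(1 + R_T)/(1 + r_t); t is 1-based."""
    T = len(rates)
    return (1.0 + yield_from_rates(rates)) / (1.0 + rates[t - 1]) / T

def sensitivity_numeric(rates, t, h=1e-6):
    """Central finite-difference estimate of dR_T/dr_t."""
    up, down = list(rates), list(rates)
    up[t - 1] += h
    down[t - 1] -= h
    return (yield_from_rates(up) - yield_from_rates(down)) / (2.0 * h)

sample_rates = [0.02, 0.05, 0.10, 0.08, 0.04]
exact = sensitivity_formula(sample_rates, 3)
approx = sensitivity_numeric(sample_rates, 3)

# With a flat term structure the sensitivity reduces to 1/T,
# so it shrinks as the maturity lengthens.
flat_short = sensitivity_formula([0.05] * 5, 5)
flat_long = sensitivity_formula([0.05] * 50, 50)
print(round(exact, 6), round(approx, 6), round(flat_short, 4), round(flat_long, 4))
```

The 1/T factor is what drives the "flattening" at long maturities: each additional one-period rate carries a progressively smaller weight in the average.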

While we have formulated the term structure analysis here in discrete time with an (as of

yet) unspecified minimum time interval of "one period", it is common practice to study the yield

curve as if it were continuous and to assume that the one-period or "shortest" bond has an

infinitesimal length of time until maturity: namely dt. Using the notation developed in (II.10) of

Section II, let rc(t) denote the rate of interest between dates t and t + dt. Equation (V.3) can be

rewritten as

(V.12)    \( P_{t+dt}(\tau - dt)/P_t(\tau) = 1 + r_c(t)\,dt . \)

By using the same "backwards" recursive analysis which led to (V.4), we derive from (V.12) that

bond prices at date t = 0 must satisfy

(V.13)    \( P_0(T) = \exp\left[ -\int_0^T r_c(s)\,ds \right] , \)

for all maturities T. If Rc(T) denotes the average continuously-compounded rate of return on a

discount bond that matures at time T in the future, then it follows that

(V.14)    \( P_0(T) = \exp[-R_c(T)\,T] , \)

and from (V.13) and (V.14), the relationship between the yield curve and future interest rates is

given by

(V.15)    \( R_c(T) = \left[ \int_0^T r_c(s)\,ds \right] / T \)

Hence, in the limiting case of continuous time, the average compound return is equal to a simple

arithmetic average of the future short rates.

To further explore the relationship between the yield curve and future short interest rates,

we differentiate (V.15) to obtain

(V.16)    \( dR_c(T)/dT = [r_c(T) - R_c(T)]/T . \)

In an analogous fashion to (V.9) in discrete time, we have from (V.16) that \( dR_c(T)/dT \gtreqless 0 \) if

and only if \( r_c(T) \gtreqless R_c(T) \). Therefore, turning points in the yield curve correspond to maturities

{T*} where rc(T*) = Rc(T*). As in the discrete time analysis, dRc(T)/dT tends to zero as T →

∞, and therefore, independent of rc, the yield curve "flattens out" for large T.
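For completeness, the differentiation step leading from (V.15) to (V.16) can be written out. The following is a sketch that assumes rc is continuous at T, so that the integral in (V.15) can be differentiated with respect to its upper limit:

```latex
\begin{align*}
T\,R_c(T) &= \int_0^T r_c(s)\,ds \\
R_c(T) + T\,\frac{dR_c(T)}{dT} &= r_c(T) \\
\frac{dR_c(T)}{dT} &= \frac{r_c(T) - R_c(T)}{T}
\end{align*}
```

The middle line also gives the short rate directly in terms of the yield curve, rc(T) = Rc(T) + T dRc(T)/dT.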

The curvature of the yield curve can be studied using (V.16) and its derivative which is

given by

(V.17)    \( d^2R_c(T)/dT^2 = \{ dr_c(T)/dT - 2\,dR_c(T)/dT \}/T . \)

At turning point maturities {T*}, the yield curve will have a local peak if drc(T*)/dt < 0 and a

local trough if drc(T*)/dt > 0. Hence, each turning point in the yield curve will always occur

after the corresponding turning point in future short interest rates. Points of inflection or zero

curvature in the yield curve will occur for those maturities {T+} such that drc(T+)/dt =

2dRc(T+)/dT.

Problem V.1: Analyzing the Term Structure

Suppose that the yield curve at the current time is given by:

\( R_c(T) = \begin{cases} \bar{R} + AT^2 - BT^3, & 0 \le T \le \bar{T} \\ \bar{R} + [A\bar{T}^3 - B\bar{T}^4]/T, & T \ge \bar{T} \end{cases} \)

where A > 0, B > 0, and \( \bar{T} \equiv 3A/4B \). What is the future time pattern of short interest rates

implied by this yield curve, and how does this pattern compare with the shape of the yield curve?

From (V.15) and (V.16), we have that the time pattern of short rates implied by this yield

curve can be written as

\( r_c(t) = \begin{cases} \bar{R} + t^2(3A - 4Bt), & 0 \le t \le \bar{T} \\ \bar{R}, & t \ge \bar{T} . \end{cases} \)

From (V.16), we have that

\( dR_c(T)/dT = T(2A - 3BT), \quad 0 \le T \le \bar{T} . \)

Hence, the time path of short rates starts at \( r_c(0) = \bar{R} \); rises monotonically until it peaks at t* =

A/2B; it then declines monotonically until at t = \( \bar{T} \), it remains constant at \( \bar{R} \). The yield curve, by comparison, peaks at T* = 2A/3B, where dRc(T*)/dT = 0.

While the two patterns are similar, the time path of the short rates rises more steeply and

peaks earlier than the yield curve (i.e., t* < T*). Moreover, the peak level of the short rates

\( r_c(t^*) = \bar{R} + A^3/4B^2 \) is higher than the peak level of the yield curve \( R_c(T^*) = \bar{R} + 4A^3/27B^2 \).

To examine the curvature of the yield curve, we have from (V.17) that

\( d^2R_c(T)/dT^2 = \begin{cases} 2(A - 3BT), & 0 \le T < \bar{T} \\ A\bar{T}^3/2T^3, & T > \bar{T} . \end{cases} \)

Hence, the yield curve starts out convex until it reaches an inflection point at T+ = A/3B and

becomes concave on the interval \( (T^+, \bar{T}) \). While the first derivative of the yield

curve is continuous at \( \bar{T} \), the second derivative is not, and Rc is again convex for \( (\bar{T}, \infty) \).

To examine the curvature of the time path of future short rates, we derive the second

derivative of the path to be

\( d^2r_c(t)/dt^2 = \begin{cases} 6(A - 4Bt), & 0 \le t \le \bar{T} \\ 0, & t > \bar{T} . \end{cases} \)

Like the yield curve, the time path of short rates starts out convex, reaches an inflection point at

t+ = A/4B where it becomes concave until t = \( \bar{T} \). Although both the time path and the yield

curve reach their inflection points midway between the starting point and the peak, the inflection

point of the time path occurs earlier than the inflection point for the yield curve (i.e., t+ < T+).

As a form of summary, the analysis shows in continuous time what Figure V.4 illustrated for the

discrete-time analysis: Namely, that the yield curve and the future time path of interest rates can

differ significantly.
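The closed-form results of Problem V.1 can be checked numerically. The Python sketch below uses hypothetical parameter values (A = 0.003, B = 0.0004 and a long-run level of 5 percent, none of which come from the text) and confirms that averaging the implied short rates as in (V.15) recovers the assumed yield curve on both segments.

```python
# Hypothetical parameters, chosen only for illustration
A, B, R_BAR = 0.003, 0.0004, 0.05
T_BAR = 3 * A / (4 * B)

def yield_curve(T):
    """Yield curve R_c(T) as given in Problem V.1."""
    if T <= T_BAR:
        return R_BAR + A * T**2 - B * T**3
    return R_BAR + (A * T_BAR**3 - B * T_BAR**4) / T

def short_rate(t):
    """Implied short rate r_c(t) = R_c(t) + t * dR_c(t)/dt."""
    if t <= T_BAR:
        return R_BAR + t**2 * (3 * A - 4 * B * t)
    return R_BAR

def average_short_rate(T, steps=20000):
    """Midpoint-rule approximation to (V.15)."""
    h = T / steps
    return sum(short_rate((i + 0.5) * h) for i in range(steps)) * h / T

t_star = A / (2 * B)        # peak of the short-rate path
T_star = 2 * A / (3 * B)    # peak of the yield curve
peak_short = short_rate(t_star)
peak_yield = yield_curve(T_star)

print(round(T_BAR, 3), round(t_star, 3), round(T_star, 3))
print(round(peak_short, 6), round(peak_yield, 6))
print(round(average_short_rate(3.0), 6), round(yield_curve(3.0), 6))
print(round(average_short_rate(8.0), 6), round(yield_curve(8.0), 6))
```

With these values the short rate peaks at t* = 3.75 and the yield curve at T* = 5, and the peak short rate exceeds the peak yield, as the analysis predicts.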

In summary, the yield curve or term structure is a plot at a given point in time of a cross-

section of discount bond yields which differ only with respect to their maturities. Although

inherently a static construct, the yield curve derived from equilibrium bond prices in an

environment of certainty has an exact relationship to the dynamics or time path of future interest

rates. Even in an environment where future interest rates are uncertain, the term structure is still

well-defined. In such an environment, there will be a set of prices for discount bonds {Pt(τ)} at

each point in time, and by the definition of yield, these prices can be used in (V.8) to uniquely

determine a set of yields {RT} which can then be plotted against maturity to form a yield curve.

Of course, once future interest rates are stochastic, the relationship derived between current

prices and future interest rates, (V.6), will no longer be valid. However, with some additional

assumptions, the yield curve can still be used to make inferences about the structure of the

stochastic processes which describe interest rate dynamics. Moreover, as will be seen, the yield

curve frequently provides sufficient information to solve problems involving the pricing of fixed-

income securities.

On the Pricing of the General Default-Free Fixed Income Securities

In preceding analyses, we studied the price relationships among default-free, discount

bonds with the same promised payment ($1) at maturity. Consider now a general default-free,

fixed-income security with a schedule of promised payments of $xt to be paid at the end of

period t, t = 1,2,...,T. xt can be either positive in which case the owner of the security receives

a payment of $xt, or negative in which case the owner must pay out, $|xt|. We denote the

equilibrium market price of this security at time t = 0 by V0(x1,...,xT).

If there exists a set of default-free discount bonds with current equilibrium prices denoted

as before by {P0(τ)}, then the current equilibrium price of the general default-free, fixed-income

security must satisfy

(V.18)    \( V_0(x_1, x_2, \ldots, x_T) = \sum_{\tau=1}^{T} x_\tau P_0(\tau) . \)

The proof that (V.18) must hold in equilibrium is by contradiction. Namely, if (V.18) does not

hold, then we will show that the general security either dominates or is dominated by other

available securities, and a necessary condition for equilibrium is that no such dominance exists.

Define δ by \( \delta \equiv V_0(x_1, x_2, \ldots, x_T) - \sum_{\tau=1}^{T} x_\tau P_0(\tau) \). Suppose that V0 were larger than

\( \sum_{\tau=1}^{T} x_\tau P_0(\tau) \) and hence, δ > 0. If an investor purchases the general security for V0 then he will

receive in return a stream of payments of $xt at the end of period t for periods t = 1,2,...,T.

Consider an alternative investment which calls for the purchase of a group of discount bonds in

the following quantities: Buy \( N_\tau \equiv [x_\tau + \delta/(P_0(\tau)T)] \) bonds each of which pays $1 at its

maturity date τ periods from now and do this for bond maturities τ = 1,2,...,T. The cost of

acquiring these bonds is \( \sum_{\tau=1}^{T} N_\tau P_0(\tau) = \sum_{\tau=1}^{T} x_\tau P_0(\tau) + \delta \) and hence, is the same as the cost of

the general security. Because each of the t-period maturity bonds purchased will pay $1 at date

t, the investor will receive a stream of payments of $Nt at the end of period t for periods t =

1,2,...,T. By hypothesis, δ > 0 and therefore, Nt > xt for t = 1,2,...,T. So, for the same initial

cost, the investor will receive each period a larger payment from the alternative investment than

he would receive from the general security. Hence, every investor would strictly prefer the

alternative investment to the general security. The general security is dominated by the

alternative investment, and therefore, the hypothesized condition is not consistent with

equilibrium pricing.

Suppose instead that V0 were smaller than \( \sum_{\tau=1}^{T} x_\tau P_0(\tau) \) and hence δ < 0. The entity

that issued the general security (e.g., an individual, firm, or financial institution) is required to

make a payment of $xt at the end of period t to the owner of the general security for periods

t = 1,2,...,T. Suppose the entity purchases the general security in the market and finances this

purchase by issuing Nτ discount bonds of maturity τ for τ = 1,2,...,T. As was shown, the total

proceeds from issuing these bonds \( \sum_{\tau=1}^{T} N_\tau P_0(\tau) \) is equal to V0 and hence, the total transaction

does not change the current cash position of the entity. However, the net resultant of the

transaction is to replace the entity's liability to pay $xt at the end of period t with a liability to

pay $Nt at the end of period t for t = 1,2,...,T. By hypothesis, δ < 0 and hence, Nt < xt for

each t. By making the transaction, the entity reduces the amount it has to pay in every period.

Therefore, as long as δ < 0, the entity can make itself better off by purchasing the general

security and financing its purchases by issuing the appropriate quantities of discount bonds.

From the viewpoint of an issuer, the general security (as a means of raising money) is dominated

by the alternative of issuing discount bonds. Of course, from the viewpoint of a buyer, the

general security dominates the specific package of discount bonds {Nτ}. Thus, the hypothesized

condition that δ < 0 is not consistent with equilibrium pricing, and this completes the proof that

(V.18) must obtain in equilibrium.
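The construction used in the proof can be traced numerically. The Python sketch below prices a hypothetical three-period payment schedule from assumed discount bond prices, then builds the alternative bond package for a mispriced security. The payment schedule, rates and size of the mispricing are all illustrative assumptions.

```python
def replicating_value(payments, prices):
    """Right-hand side of (V.18): sum over tau of x_tau * P_0(tau)."""
    return sum(x * p for x, p in zip(payments, prices))

def alternative_holdings(payments, prices, market_price):
    """N_tau = x_tau + delta / (P_0(tau) * T), as defined in the proof."""
    T = len(payments)
    delta = market_price - replicating_value(payments, prices)
    return [x + delta / (p * T) for x, p in zip(payments, prices)]

# Hypothetical inputs: payments x_1..x_3 and prices P_0(1)..P_0(3)
payments = [100.0, 100.0, 1100.0]
prices = [1 / 1.02, 1 / (1.02 * 1.05), 1 / (1.02 * 1.05 * 1.10)]
fair_value = replicating_value(payments, prices)

# Case delta > 0: the general security is overpriced by $30
overpriced = fair_value + 30.0
n_over = alternative_holdings(payments, prices, overpriced)
cost_over = sum(n * p for n, p in zip(n_over, prices))

# Case delta < 0: the general security is underpriced by $30
underpriced = fair_value - 30.0
n_under = alternative_holdings(payments, prices, underpriced)

print(round(fair_value, 2), round(cost_over, 2))
print([round(n, 2) for n in n_over], [round(n, 2) for n in n_under])
```

In the overpriced case the bond package costs exactly what the security costs yet pays more in every period. In the underpriced case the package an issuer would sell raises the same proceeds while promising less in every period.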

On Arbitrage Opportunities: A Special Case of Dominance

The requirement that prices be such that no investment dominates any other investment is

frequently called a "No-Arbitrage" (or "No-Easy Money") condition although the two are not

strictly the same. An arbitrage opportunity is said to exist if there is a set of feasible transactions

which require no cash payments at any time, and the resultant of these transactions is to produce

positive cash receipts at one or more points in time. In effect, the existence of an arbitrage

opportunity implies that it is possible to get something of value for nothing.

A simple example of an arbitrage opportunity would be as follows: Suppose that shares

of General Motors stock were selling for $54 a share on the New York Stock Exchange while at

the same time, these shares were selling for $55 on the London Stock Exchange. An investor

who simultaneously sold k shares of GM on the London Exchange for a total of $55k and

bought k shares of GM on the New York Exchange for a total of $54k would immediately

produce a positive cash receipt of $55k - $54k = $k. By delivering the shares purchased in New

York to cover the shares sold in London, the investor would eliminate any further liabilities

associated with these transactions, and hence, this set of transactions requires no cash payments

by him at any time. However, as a result of these transactions, the investor has immediately

increased his wealth by $k. Indeed, as long as the contemporaneous prices for GM on the two

exchanges are different, the investor can continue to increase his wealth by making these

transactions. The investor is truly getting something for nothing. Just as the laws of

thermodynamics rule out the existence of a perpetual-motion machine, so the laws of economics

rule out the existence of persistent arbitrage opportunities.

As a second, somewhat more-complicated example of an arbitrage opportunity, we

reexamine the analysis used to derive (V.18) with the additional institutional assumption that at

least one investor can buy or issue (sell) any of the available securities in arbitrary amounts.

Consider the following set of transactions: buy xτ units of a τ-period discount bond for

maturities τ = 1,2,...,T and simultaneously, issue (or sell) one unit of the general security. Let k

denote the number of "units" of this "package" taken by an investor where k > 0 means "buy

kxτ units of the τ-period discount bond for τ = 1,2,...,T and issue k units of the general security" and k <

0 means "issue |k|xτ units of the τ-period discount bond for τ = 1,2,...,T and buy |k| units of the general security."

At the time that the transactions are made (t = 0), there is a cash outflow of \(\$k\left[\sum_{\tau=1}^{T} x_\tau P_0(\tau)\right]\) and a cash inflow of \(\$kV_0\). Hence, \(\$k\delta \equiv \$k\left[V_0 - \sum_{\tau=1}^{T} x_\tau P_0(\tau)\right]\) is the current net cash flow

to the investor. So, by choosing the sign of k such that kδ > 0, the investor receives an

immediate, positive cash payment of $kδ as a result of these transactions. Note that in period t

(t = 1,2,...,T), the investor receives $kxt from the discount bonds which mature in that period

and pays out $kxt on the k units of the general security issued. Hence, for any k chosen, the

net cash flows associated with the investment package are zero in every future period. Just as in

the first example, the result of these transactions is to immediately increase the investor's wealth

by $kδ. By assumption, the magnitude of k is not bounded. Therefore, as long as δ ≠ 0, the

investor can continue to increase his wealth without bound. So, either the investor ends up with

all of society's wealth or the prices of the discount bonds and the general security change so that

δ = 0. Clearly, the latter is the sensible conclusion, and therefore, by his actions, the investor will

"force" prices to adjust until δ = 0. Thus, under the hypothesized institutional conditions, prices

must satisfy (V.18).
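The arbitrage package just described can be sketched numerically. The following is a minimal illustration, not part of the original derivation: the function names and all of the numbers (payments, discount bond prices, and the market price of the general security) are hypothetical and chosen only to show that the package produces $kδ today and zero in every future period.

```python
def mispricing(x, discount_prices, v0):
    """delta = V0 - sum over tau of x_tau * P0(tau), as defined in the text."""
    replication_cost = sum(x_t * p_t for x_t, p_t in zip(x, discount_prices))
    return v0 - replication_cost


def package_cash_flows(x, delta, k):
    """Cash flows from taking k units of the package.

    k > 0: buy k*x_tau discount bonds and issue k units of the general security.
    k < 0: issue |k|*x_tau discount bonds and buy |k| units of the security.
    """
    today = k * delta  # inflow k*V0 less outflow k*sum(x_tau * P0(tau))
    # In each future period the bond receipts k*x_t exactly offset
    # the payments k*x_t owed on the general security.
    future = [k * x_t - k * x_t for x_t in x]
    return today, future


# Hypothetical data: payments x_1..x_3 and prices P0(1)..P0(3).
x = [100.0, 100.0, 1100.0]
prices = [0.91, 0.87, 0.79]

delta = mispricing(x, prices, v0=1070.0)   # security priced above its replication cost
k = 1 if delta > 0 else -1                 # choose the sign of k so that k*delta > 0
today, future = package_cash_flows(x, delta, k)
```

Whatever the sign of δ, choosing k with the same sign makes the immediate receipt positive, which is why only δ = 0 is consistent with the absence of arbitrage.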

Although subtle, the differences between a dominance situation and an arbitrage

opportunity are important. The price conditions required to rule out dominance are formally the

same as the ones that rule out arbitrage opportunities, and the existence of an arbitrage

opportunity necessarily implies a dominance situation. However, the existence of a dominance

situation does not necessarily imply an arbitrage opportunity. To see this, consider the case

where δ > 0 and therefore, a collection of discount bonds dominates the general security.

Suppose this dominance situation is recognized by a specific investor. If the institutional

structure permits, he can and will enter into a set of arbitrage transactions, and by his actions in

the market, he will unilaterally force prices to adjust until δ = 0. However, suppose that this

investor owns none of the general security and further suppose that institutional restrictions

prevent him from issuing the general security. Then, at least for this investor, δ > 0 does not

provide an arbitrage opportunity because the set of transactions required to institute arbitrage is

not feasible. The only action that he can take is simply not to purchase any of the general

security, and this action provides little, if any, pressure on prices to adjust so that δ = 0.

Of course, if it is feasible for some other investor to issue the general security and if this

other investor recognizes that the dominance situation exists, then this other investor can perform

the arbitrage transaction and prices will adjust. Or, as described in the dominance proof of

(V.18), if the investors who own the general security recognize that the dominance situation

exists, then these investors will sell their holdings of the general security, and their collective

actions will tend to force prices to adjust until δ = 0. Thus, the significant difference between

arbitrage and dominance is the mechanism by which such opportunities are eliminated. In the

case of arbitrage, it takes only one investor who recognizes the opportunity to force prices to

adjust until the opportunity is eliminated. In the case of dominance generally, several investors

with specific endowments must recognize the opportunity for the same price adjustment to

obtain. For this reason, price relationships derived from a "No-Arbitrage" condition are less

likely to be violated than ones derived from a "No-Dominance" condition. However, it should be

pointed out that the occurrence of a significant dominance situation is an infrequent event

although it will occur far more frequently than a true arbitrage opportunity.

As was the case in the arbitrage derivation of (V.18), most arbitrage opportunities can be

exploited only if the arbitrageur can sell securities that he does not own. While in that

derivation, the term "issue" was used to describe all such sales, it is usually only used to describe

the sale of a security whose obligations to the purchaser are those of the seller. For example, if

General Motors sells a fixed income security which obliges General Motors to make the

specified payments to the purchaser, then General Motors is said to have "issued" that fixed

income security. Such sales are called primary (market) offerings, and are rarely, if ever, made

by individuals. The purchase or sale of already-existing securities whose obligations are not

those of the seller is called a secondary (market) transaction, and most arbitrage transactions are

of this type. A secondary-market sale of a security not owned by the seller is called a short-sale.

A short-sale is accomplished by borrowing the security from someone who owns it and

then selling it in the market. The terms of the "loan agreement" are typically as follows: (1)

Like a standard demand loan, either the borrower (short-seller) or the lender can terminate the

loan at any time. At the time of termination, the borrower must return the security borrowed to

the lender by either purchasing the security in the market ("covering" his short) or borrowing the

security from another lender. (2) During the time that the security is borrowed, the borrower

must reimburse the lender for all payments (including interest, dividends, and other distributions)

that he would have received from the security had he not lent it to the borrower. (3) The

borrower may be required to post and maintain sufficient collateral to ensure his ability to meet

his obligations, (1) and (2), to the lender. Unlike a conventional money loan, the lender is not

paid interest for lending his security. Hence, the lender earns a return equal to the one he would

have received had he remained the owner and not lent the security. However, because he is no

longer the owner of record, he forgoes any non-cash benefits of ownership (e.g., voting rights)

while the security is on loan. Hence, for this and other inconveniences associated with lending

the security including the risk that the short-seller may not meet his obligations, the lender may

require some additional compensation. The usual form of the compensation is to require that at

least some of the collateral for the loan be cash which in effect, provides the lender with an

"interest-free" loan. Alternatively, the borrower may pay a fee or premium for the loan.

In summary, the short-sale is an important transaction for the exploitation of arbitrage

opportunities. Therefore, in institutional environments which prohibit short-sales, one must rely

on the weaker mechanism of dominance to ensure that price relationships such as (V.18) will

obtain. Fortunately, the actual institutional structure that exists permits most securities traded in

organized markets to be sold short.

Thus, especially in environments which permit short-sales, one would expect the price

relationship between pure discount bonds and general fixed-income default-free securities to

satisfy (V.18). From (V.8), we can rewrite (V.18) as

(V.19)    \[ V_0 = \sum_{\tau=1}^{T} \frac{x_\tau}{(1 + R_\tau)^{\tau}} . \]

From (V.19), one can evaluate any default-free security using a properly-constructed yield curve.

While (V.19) looks like a present value formula, nowhere in either the dominance or arbitrage

derivation of (V.18) was it required that the future time path of interest rates be known with

certainty. Hence (V.18), and therefore (V.19), provide the proper equilibrium price relationships

even when interest rates are stochastic.

Having established the fundamental price relationship between default-free discount

bonds and default-free fixed-income securities, we now demonstrate its use in a number of

specific applications.

On Coupon Bonds and Estimating the Term Structure of Interest Rates

As the analysis leading to (V.18) demonstrates, it is sufficient to have a complete set of

current discount bond prices to determine the equilibrium price of any default-free fixed-income

security. It was also shown that such a set is sufficient to construct the term structure of interest

rates and forward prices. However, while discount bonds are frequently issued with maturities of

less than one year, they are rarely issued with longer maturities, and this is the case not only for

government debt, but for corporate debt as well. Therefore, one cannot generate the term

structure by simply observing the contemporaneous prices of discount bonds for all maturities

because such an array of bonds does not exist. However, by using the current prices of the

default-free bonds which are available, it is possible to estimate both the "missing" discount bond

prices and the term structure.

The most common form for intermediate and longer-term debt is the coupon bond. Like

the "Interest-Only" loans discussed in Section II, the coupon bond calls for a stream of periodic

and equal-in-size (coupon) payments and a single, lump-sum (principal) payment at maturity. If

Cj denotes the coupon payment per period for periods 1,2,...,Tj and Mj denotes the principal

payment at the maturity date Tj, then, from (V.18), the equilibrium price of coupon bond #j, Bj

must satisfy

(V.20)    \[ B_j = C_j \sum_{t=1}^{T_j} P_0(t) + M_j P_0(T_j) . \]

Equivalently, the price of the coupon bond can be written in terms of yields as

(V.21)    \[ B_j = C_j \sum_{t=1}^{T_j} \frac{1}{(1 + R_t)^{t}} + \frac{M_j}{(1 + R_{T_j})^{T_j}} . \]

Again, it should be emphasized that (V.20) and (V.21) must be satisfied in equilibrium even if

interest rates are stochastic. In the special case of nonstochastic interest rates and a "flat" term

structure, (V.21) reduces to (II.34) and can be rewritten as

(V.22)    \[ B_j = \frac{C_j \left[ 1 - 1/(1 + r)^{T_j} \right]}{r} + \frac{M_j}{(1 + r)^{T_j}} \]

where r is the per period rate of interest common to all periods.
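As a quick check on how (V.20) and (V.22) relate, the sketch below prices a coupon bond both ways under a flat term structure. The function names and the bond terms (a $60 coupon, $1000 principal, ten periods, 8 percent per period) are hypothetical and serve only as an illustration.

```python
def coupon_bond_price(coupon, principal, discount_prices):
    """(V.20): discount_prices[t-1] holds P0(t) for t = 1, ..., T_j."""
    return coupon * sum(discount_prices) + principal * discount_prices[-1]


def flat_rate_price(coupon, principal, maturity, r):
    """(V.22): closed form when every period's yield equals r."""
    d = (1.0 + r) ** (-maturity)
    return coupon * (1.0 - d) / r + principal * d


# Hypothetical flat term structure: P0(t) = 1/(1+r)^t with r = 8 percent.
r = 0.08
flat_prices = [(1.0 + r) ** (-t) for t in range(1, 11)]

p_general = coupon_bond_price(60.0, 1000.0, flat_prices)
p_flat = flat_rate_price(60.0, 1000.0, 10, r)
```

The two numbers agree up to rounding, and a bond whose coupon rate equals r (for example, an $80 coupon here) prices at its principal.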

To estimate the missing discount bond prices using available coupon bond prices, we

proceed as follows: Suppose there are n coupon bonds numbered in ascending order with

respect to their maturities (i.e., T1 ≤ T2 ≤ ... ≤ Tn ≡ T where T is the maximum maturity of any of

the bonds). In equilibrium, the prices of these bonds must satisfy (V.20), which can be written as

the system of equations

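(V.23)
\[
\begin{aligned}
B_1 &= C_1 P_0(1) + C_1 P_0(2) + \cdots + (C_1 + M_1) P_0(T_1) \\
B_2 &= C_2 P_0(1) + C_2 P_0(2) + \cdots + C_2 P_0(T_1) + \cdots + (C_2 + M_2) P_0(T_2) \\
&\;\;\vdots \\
B_n &= C_n P_0(1) + C_n P_0(2) + \cdots + C_n P_0(T_1) + \cdots + C_n P_0(T_2) + \cdots + (C_n + M_n) P_0(T_n) .
\end{aligned}
\]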

Because we know the terms {Cj,Mj,Tj} and current prices {Bj} of the coupon bonds, (V.23)

can be viewed as a system of n linear equations for the T (unknown) discount bond prices

{P0(1),P0(2),...,P0(Tn)}. (V.23) can be rewritten in compact vector-matrix notation as

(V.24) B = AP

where B denotes an n × 1 vector of the coupon bond prices {B1,...,Bn}; P denotes a T × 1

vector of the discount bond prices; and A denotes an n × T matrix whose elements aij are: for i

= 1,...,n, aij = Ci for j = 1,2,...,(Ti - 1); aij = (Ci + Mi) for j = Ti; aij = 0 for j = Ti + 1,...,T.

Because (V.23) is a linear set of equations, there are well-established procedures for

solving it when a solution exists. If the number of equations is fewer than the number of

unknowns (i.e., n < T), then clearly, there will not be a unique solution because not enough

information is available. If n = T, then a unique solution will exist provided that the n

equations in (V.23) are linearly independent (i.e., the rank of the matrix A in (V.24) is equal to

T). Such linear independence will occur if the n bonds chosen are sufficiently different with

respect to their terms. In this case, the solution for P is obtained by matrix inversion

(P = A^{-1}B). Although matrix inversion is a difficult operation to do by hand, there exist very

efficient computer programs which solve these equations with little difficulty even when n is

quite large.
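For the square case n = T, the construction of A and the solution of (V.24) can be sketched as follows. This is an illustrative implementation using exact rational arithmetic and Gauss-Jordan elimination rather than any particular library routine; the function names and the two-bond example are hypothetical.

```python
from fractions import Fraction


def cash_flow_matrix(bonds, horizon):
    """Rows of A in (V.24); each bond is a (coupon, principal, maturity) tuple."""
    rows = []
    for coupon, principal, maturity in bonds:
        row = [Fraction(0)] * horizon
        for t in range(1, maturity + 1):
            row[t - 1] = Fraction(coupon)        # a_ij = C_i up to maturity
        row[maturity - 1] += Fraction(principal)  # a_ij = C_i + M_i at j = T_i
        rows.append(row)
    return rows


def solve_square(a, b):
    """Solve a square system by Gauss-Jordan elimination.

    Raises ValueError when the equations are not linearly independent.
    """
    n = len(a)
    m = [list(row) + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if m[r][col] != 0), None)
        if pivot is None:
            raise ValueError("equations are not linearly independent")
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[n] for row in m]


# Hypothetical example: a one-period and a two-period bond, each with a
# $10 coupon and $100 principal, priced at $99 and $97.
bonds = [(10, 100, 1), (10, 100, 2)]
A = cash_flow_matrix(bonds, horizon=2)
P = solve_square(A, [99, 97])
```

With these inputs the implied discount bond prices are 9/10 and 4/5. Two bonds with identical terms would make A singular, and the solver reports that rather than returning a meaningless answer.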

Problem V.2: Using Coupon Bond Prices to Estimate Discount Bond Prices

There are three coupon bonds with the following terms and current prices: (B1 = $961,

C1 = $100, M1 = $900, T1 = 2), (B2 = $968, C2 = $100, M2 = $900, T2 = 3), and (B3 = $879,

C3 = $50, M3 = $950, T3 = 3). What are the implicit current discount bond prices for periods

1,2, and 3, and what are the corresponding term structure yields? From (V.23), we have that

(i) 961 = 100 P0(1) + 1000 P0(2) + 0 P0(3)

(ii) 968 = 100 P0(1) + 100 P0(2) + 1000 P0(3)

(iii) 879 = 50 P0(1) + 50 P0(2) + 1000 P0(3) .

If we multiply equation (iii) by 2 and subtract equation (ii) from it, then we find that 790 = 1000

P0(3) or P0(3) = $0.79. Substitute $0.79 for P0(3) in (ii) to get 100 P0(1) + 100 P0(2) = 178

and subtract this from equation (i). The resultant is 783 = 900 P0(2) or P0(2) = $0.87. Finally,

substitute $0.87 for P0(2) in equation (i) to get 100 P0(1) = 91 or P0(1) = $0.91. Having

solved for the discount bond prices, we now use (V.8) to determine the term structure: Namely,

R1 = 9.89%; R2 = 7.21%; and R3 = 8.17%. Finally, using these discount bond prices, we can

value payments to be made during the first three periods on any default-free security. For

example, what is the current value for a default-free security with a stream of payments, x1 =

$800; x2 = –$600; and x3 = $2500? Using the above prices in (V.18), we have that V0 = 800 ×

.91 – 600 × .87 + 2500 × .79 = $2181. The reader should note that the only data required in this

problem were the terms and current prices for the coupon bonds. Nowhere was it assumed that

interest rates were nonstochastic.
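The arithmetic of Problem V.2 can be checked by substituting the solved discount bond prices back into the three bond equations and then recomputing the yields and the value of the example security. In this sketch the function names are hypothetical, and the yield formula assumes the relationship P0(t) = 1/(1 + R_t)^t that links (V.18) to (V.19).

```python
# (price, coupon, principal, maturity) for the three bonds of Problem V.2.
coupon_bonds = [
    (961, 100, 900, 2),
    (968, 100, 900, 3),
    (879, 50, 950, 3),
]
discount_prices = {1: 0.91, 2: 0.87, 3: 0.79}  # the solution found in the text


def model_price(coupon, principal, maturity, prices):
    """Right-hand side of (V.20) for one bond."""
    coupons = sum(coupon * prices[t] for t in range(1, maturity + 1))
    return coupons + principal * prices[maturity]


def yields_from_prices(prices):
    """R_t implied by P0(t) = 1/(1 + R_t)^t."""
    return {t: p ** (-1.0 / t) - 1.0 for t, p in prices.items()}


def value(payments, prices):
    """(V.18): sum of each payment times the matching discount bond price."""
    return sum(x * prices[t] for t, x in payments.items())


pricing_errors = [
    model_price(c, m, T, discount_prices) - b for b, c, m, T in coupon_bonds
]
term_structure = yields_from_prices(discount_prices)
v0 = value({1: 800, 2: -600, 3: 2500}, discount_prices)
```

The pricing errors are zero up to floating-point rounding, and the yields and V0 reproduce the figures quoted in the problem.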

The central purpose of Problem V.2 was to illustrate how one can compute the discount

bond prices needed to use (V.18) and to construct the term structure. However, the analysis used

also illustrates how an investor can "manufacture" discount bonds when none exists provided

that short-selling is permitted. For example, suppose the situation is as in Problem V.2 and an

investor would like to have a three-period discount bond. Consider an investing "package"

where he buys 2k units of bond #3 and sells short k units of bond #2. At the end of period 1,

he will receive coupon payments of $100k on the bonds that he owns, and he must pay $100k

to the entity which lent him the bonds for short sale. Hence, the net cash flow from the investment

package at the end of period 1 is zero. By the same analysis, the net cash flow from the package

at the end of period 2 is zero. At the end of period 3, he will receive $2000k in coupon and

principal payments on the bonds he owns, but he must pay $100k in coupon payments and $900k

to repurchase the bonds he has shorted. Hence, the net cash flow from the investment package at

the end of period 3 is $1000k. Thus, the pattern of returns from this investment package is

identical to those of a three-period discount bond with a promised payment at maturity of

$1000k. k is simply a scale factor chosen by the investor in the same way that he would choose

the number of discount bonds he wants to purchase. The cost of the package would be 2k × 879

– 968k or $790k. Thus, the formal mathematical manipulations used to deduce implicit

discount prices are the same as the ones used to determine the combination of purchases and

short-sales required to "manufacture" discount bonds when such bonds do not exist.
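The "manufactured" three-period discount bond can be verified by netting the cash flows of the long and short positions. The sketch below uses the bond data of Problem V.2 with k = 1; the `package` helper and its data layout are hypothetical conveniences.

```python
# Bond number -> (current price, cash flows at the end of periods 1, 2, 3).
bonds = {
    2: (968, [100, 100, 1000]),
    3: (879, [50, 50, 1000]),
}


def package(holdings, bonds):
    """Cost and per-period net cash flows of a portfolio.

    holdings maps bond number -> units held; negative units are short sales.
    """
    cost = sum(units * bonds[j][0] for j, units in holdings.items())
    periods = len(next(iter(bonds.values()))[1])
    flows = [
        sum(units * bonds[j][1][t] for j, units in holdings.items())
        for t in range(periods)
    ]
    return cost, flows


# Buy 2 units of bond #3 and sell short 1 unit of bond #2.
cost, flows = package({3: 2, 2: -1}, bonds)
implied_p3 = cost / flows[2]  # price today per $1 received in period 3
```

The package costs $790 and pays $1000 in period 3 only, so its implied price per dollar matches the P0(3) = $0.79 obtained by solving the equations.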

To complete the analysis of (V.23), we now examine the case where the number of bonds

in the sample exceeds the number of maturities (i.e., n > T). In this case, there are more

equations than unknowns, and hence, for a solution to exist, the "extra" equations must be

redundant. Specifically, a unique solution will exist if and only if the number of linearly

independent equations in (V.23) equals T, and the solution for each such linearly independent

subset of equations satisfies the other (n - T) equations. In terms of (V.24), this condition

requires that the rank of A and the rank of the augmented matrix [A B] both be equal to T. If different linearly independent

subsets of the n equations lead to different values for the discount bond prices, then the rank of the

augmented matrix will exceed the rank of A, and no solution will exist. The economic implication of

nonexistence is that not all the coupon bond prices satisfy (V.18), and therefore, either a

dominance or an arbitrage situation exists among the outstanding coupon bonds. This point is

demonstrated in the following problem.

Problem V.3: Arbitrage Opportunities in Coupon Bonds

Assume the same environment as described in Problem V.2, but now add one more bond

with a market price and terms given by (B4 = $4000, C4 = $300, M4 = $4000, T4 = 3). If

there are to be no arbitrage opportunities, then the price of bond #4 must satisfy (V.20). I.e.,

(iv) 4000 = 300 P0(1) + 300 P0(2) + 4300 P0(3) .

Therefore, the system of equations corresponding to (V.23) is (i), (ii), (iii) from Problem V.2 and

(iv). From the solution of Problem V.2, (i), (ii), and (iii) are a linearly independent subset of this

system with a solution {P0(1) = .91, P0(2) = .87, P0(3) = .79}. Thus, if there is a solution to

this system, (iv) must be a redundant equation satisfied by the solution to (i), (ii), and (iii). But,

300 × .91 + 300 × .87 + 4300 × .79 = 3931 which is not equal to B4 = 4000. Indeed, based upon

the prices of the other three bonds, bond #4 is "overpriced" in the sense that at these prices, bond

#4 is dominated by the purchase of some combination of the other three bonds.

To show this dominance, let k4 denote the number of units of bond #4 either owned or to

be purchased by an investor and let kj denote the number of units of bond #j in the proposed

dominating investment package, j = 1,2,3. If the {kj} are selected so as to satisfy the

conditions: (a) 100 k1 + 100 k2 + 50 k3 = 300 k4; (b) 1000 k1 + 100 k2 + 50 k3 = 300 k4; (c)

1000 k2 + 1000 k3 = 4300 k4, then the cash receipts from the proposed package in periods 1, 2,

and 3 will be identical to the cash receipts from k4 units of bond #4 in those periods. These

conditions are satisfied by: k1 = 0; k2 = 1.7 k4; k3 = 2.6 k4. However, the cost of acquiring these

identical streams of payments is not the same. The cost of acquiring the k4 units of bond #4 is

$4000 k4. The cost of acquiring the "package" is $961 k1 + $968 k2 + $879 k3 = $3931 k4. Thus,

the package dominates bond #4. Of course, if the institutional structure permits short-sales, then

these prices would imply an arbitrage opportunity where the arbitrageur would purchase 1.7 k4

units of bond #2 and 2.6 k4 units of bond #3 for each k4 units of bond #4 sold short. For each

such transaction, his wealth would increase by $69 k4.
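The numbers in Problem V.3 can be reproduced exactly with rational arithmetic. The sketch below takes k4 = 1; the variable names are hypothetical, while the bond data, discount bond prices, and package weights are those given in the problem.

```python
from fractions import Fraction

# Bond number -> (current price, cash flows at the end of periods 1, 2, 3).
bonds = {
    2: (968, [100, 100, 1000]),
    3: (879, [50, 50, 1000]),
    4: (4000, [300, 300, 4300]),
}
discount_prices = [Fraction(91, 100), Fraction(87, 100), Fraction(79, 100)]

# Price of bond #4 implied by the other bonds, via (V.20).
fair_b4 = sum(cf * p for cf, p in zip(bonds[4][1], discount_prices))

# Dominating package per unit of bond #4: 1.7 units of #2 and 2.6 units of #3.
k2, k3 = Fraction(17, 10), Fraction(26, 10)
replica_flows = [k2 * a + k3 * b for a, b in zip(bonds[2][1], bonds[3][1])]
replica_cost = k2 * bonds[2][0] + k3 * bonds[3][0]

# Arbitrage receipt per unit of bond #4 sold short against the package.
profit_per_unit = bonds[4][0] - replica_cost
```

The package reproduces the payments of bond #4 period by period at a cost of $3931, leaving $69 per unit of bond #4 sold short.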

As in the case for (V.20), price relationships deduced from the condition that no

dominance situations exist are relative pricing formulas. I.e., they specify conditions under

which a set of prices will be internally consistent with respect to one another. Because only a

subset of securities are examined, a set of prices that satisfies such relative pricing formulas need

not be one which will clear markets in equilibrium. Therefore, relative pricing formulas provide

necessary, but not sufficient, conditions for equilibrium.

For example, in the solution of Problem V.3, it was shown that bond #4 was "overpriced"

relative to the prices of bonds #2 and #3 in the sense that at these prices, anyone would prefer an

appropriate mix of these bonds to holding bond #4. Hence, the posited prices in that problem

cannot be equilibrium prices. However, if the price of bond #4 were changed so as to be

consistent with the other three bond prices (i.e., B4 = 3931), there is not sufficient information

given in the problem to determine whether or not these prices would clear the market.

In summary, we have shown how discount bond prices can be estimated using coupon bond

prices, and how these estimates can be used to identify mispriced securities. While there are

many (virtually) default-free coupon bonds traded in the market, their diversity in terms of

coupon, principal, and maturity is usually not sufficient to generate a unique set of discount bond

prices for all maturities. Moreover, differences in terms, other than those discussed here, can

also cause errors in the price estimates obtained from (V.23). Some examples would be

differences in sinking fund and call provisions and the tax treatment of the returns earned from

holding the bond. The latter is especially important in the case of municipal bonds whose

coupon payments are usually exempt from federal income taxes. Hence, precise estimates for the

{P0(τ)} can rarely be made. However, statistical techniques can be applied to the structural

equations (V.23) to estimate both the prices and the precision of the estimates.

Yield-to-Maturity and Duration for Coupon Bonds

A frequently suggested alternative to (V.23) for estimating the term structure and

identifying mispriced default-free securities is the yield-to-maturity method. The yield-to-

maturity for a coupon bond r* is defined as that value of r which causes (V.22) to obtain for

the current market price of the bond. I.e., it is that common per period rate of interest which

would obtain if: (i) the bond price is an equilibrium price; (ii) the term structure is "flat"; (iii)

future interest rates are nonstochastic.

By manipulating (V.22), we have that the yield-to-maturity for coupon bond #j, r*j is equal to (x

– 1) where x is the real-root solution to the polynomial equation

(V.25)    \[ 0 = B_j x^{T_j+1} - (B_j + C_j) x^{T_j} - M_j x + (M_j + C_j) . \]

Certainly, the yield-to-maturity method appears to be an attractive alternative to (V.23) in terms

of data requirements and computational simplicity. Both methods require only current bond

prices and their terms to estimate the term structure. However, to identify RT using (V.23)

requires a minimum of two different coupon bonds whereas a single T-period bond can be used

to compute that bond's yield-to-maturity. Because Bj, Cj, and Mj are all positive, one can be

assured by Descartes' Rule of Signs that, apart from the trivial root x = 1, there is only one positive real-root solution to (V.25), and there

exist very fast and accurate numerical methods for finding the root of such a polynomial

equation. Moreover, unlike (V.23) which requires a simultaneous solution of a system of

equations, the yield-to-maturity equation can be solved separately for each bond. Undoubtedly,

these attractive computational features provide the genesis of the standard practice of quoting

coupon bond prices as "priced to yield 100 r*j percent." There is no harm in such a convention

unless it is misused in the resolution of substantive issues. Specifically, these features of the

yield-to-maturity method are attractive only if it provides valid estimates for the term structure

and correctly identifies mispriced securities.
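One simple way to compute a yield-to-maturity numerically is bisection on (V.22), since the right-hand side of (V.22) falls as r rises. The sketch below is only one of many possible root-finding approaches; the function names, the search bracket (yields between roughly 0 and 100 percent per period), and the par-bond example are all assumptions made for illustration.

```python
def flat_rate_price(coupon, principal, maturity, r):
    """Right-hand side of (V.22)."""
    d = (1.0 + r) ** (-maturity)
    return coupon * (1.0 - d) / r + principal * d


def yield_to_maturity(price, coupon, principal, maturity,
                      lo=1e-9, hi=1.0, tol=1e-12):
    """Bisection for r* such that (V.22) reproduces the market price.

    Assumes the answer lies between lo and hi.
    """
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if flat_rate_price(coupon, principal, maturity, mid) > price:
            lo = mid  # model price too high, so the trial rate is too low
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def v25_polynomial(x, price, coupon, principal, maturity):
    """Left-hand side of the polynomial form (V.25), with x = 1 + r."""
    return (price * x ** (maturity + 1)
            - (price + coupon) * x ** maturity
            - principal * x
            + (principal + coupon))


# Hypothetical bond selling at par: $100 coupon, $1000 principal, 5 periods.
r_star = yield_to_maturity(1000.0, 100.0, 1000.0, 5)
residual = v25_polynomial(1.0 + r_star, 1000.0, 100.0, 1000.0, 5)
```

For a bond selling at par the yield-to-maturity equals the coupon rate, 10 percent here, and x = 1 + r* makes the polynomial vanish. Note that x = 1 also makes (V.25) vanish for any bond, so a polynomial root-finder applied to (V.25) directly must exclude that root.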

To determine the conditions under which this method does provide valid estimates, we

begin by examining pure discount bonds (i.e., Cj = 0). Inspection of (V.25) shows that the yield-

to-maturity on a pure discount bond is simply its yield \(R_{T_j}\) as defined in (V.8). So, trivially,

the yield-to-maturity method applied to pure discount bonds provides a valid estimate for the

term structure. The yield method also works in identifying mispriced securities when comparing

pure discount bonds of the same maturity. However, for pure discount bonds with different

maturities, the one with the higher yield-to-maturity need not be the better buy. To illustrate this

point, consider the time pattern of interest rates and yields presented in Table V.1. By

construction, the discount bond prices and their yields displayed there are "fair" in the sense that

bonds of all maturities will have the same holding period returns. Yet, inspection of Table V.1

shows that the yield-to-maturity on a two-period bond R2 equals 3.5 percent while the yield on a

four-period bond R4 equals 6.2 percent. Indeed, if the price of a four-period bond were such

that its yield were 6 percent and other bond prices were unchanged, then for the same time

pattern of interest rates, the four-period bond would be dominated by an initial investment in the

two-period bond followed by a "rolling-over" of one-period bonds for periods three and four.

That is, by investing $1000 in the four-period bond when it is priced to yield 6 percent, the value

of the investment at the end of four periods would be $1262. However, by investing $1000 in the

alternative, the value at the end of four periods would be $1272.
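The two terminal values can be reproduced approximately by compounding. Because the rate path of Table V.1 is not restated here, this sketch uses the rounded four-period yield of 6.2 percent as a stand-in for the compounded two-period and rollover returns, which is an assumption that holds only because the table's prices are "fair" by construction.

```python
def terminal_value(amount, per_period_yield, periods):
    """Value of an investment compounded at a constant per-period yield."""
    return amount * (1.0 + per_period_yield) ** periods


# Four-period bond mispriced to yield 6 percent.
priced_at_6 = terminal_value(1000.0, 0.06, 4)

# Alternative strategy, approximated by the fair four-period yield of 6.2 percent.
alternative = terminal_value(1000.0, 0.062, 4)
```

Rounded to the nearest dollar, these give the $1262 and $1272 quoted above.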

In a similar fashion, it is straightforward to show that the yield-to-maturity method cannot

be used to identify the "better buy" when comparing coupon bonds with different maturities.

Moreover, as the following problem demonstrates, it cannot in general be used to compare

coupon bonds with the same maturity.

Problem V.4: Bond Swapping with Coupon Bonds

Investment strategies which attempt to improve the returns on a portfolio of fixed-income

securities by exchanging bonds currently in the portfolio for other bonds with the same maturity

and risk are called bond swapping strategies. Suppose that a portfolio of default-free fixed-

income securities contains bond #1 which is a 15-year bond with an annual coupon C1 = $50; a

principal or "face value" M1 = $4865; and a current market price of $1545. Suppose further that

bond #2 which is a 15-year bond with an annual coupon C2 = $200 and a face value M2 =

$1000 is currently selling for $1545. As the investment manager of this portfolio, should you

"swap" bond #1 for bond #2?

Solving (V.25) for bond #1 and bond #2, we have that their yields-to-maturity are \(r_1^* = .10\) and \(r_2^* = .12\), or 10 percent and 12 percent, respectively. Thus, if the manager uses the

yield-to-maturity method for selecting bonds, then he will choose to swap bond #1 for bond #2.

However, is this the correct decision?

To answer this question requires additional information. Suppose that the future time

path of interest rates is as described earlier in this section in Table V.1. Because the initial

investment required to acquire either bond is the same, clearly, the proper choice is the bond

which provides the larger cumulative increment to the value of the portfolio. In a similar fashion

to the derivation of the annuity formula in Section II, we can use the actual time path of interest

rates described in Table V.1 to determine the accumulated sum at the end of fifteen years,

\(S_{15}^{j}\), from holding bond #j to maturity and reinvesting all coupon payments received in the

interim. So, for example, the $50 received from bond #1 at the end of year 1 will be deposited

for year 2 at 5 percent and then reinvested along with accumulated interest at 10 percent for year

3, and so on. At the end of year 15, this payment will have grown to \(\$50 \prod_{t=2}^{15} (1 + r_t)\). Doing

this for all payments for both bonds, we find that the accumulated sum at the end of fifteen years

for bond #1 is \(S_{15}^{1} = \$6127\) and for bond #2, \(S_{15}^{2} = \$6049\). Therefore, bond #1 is actually the

better investment.
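The reinvestment calculation can be written as a short routine. Since the full rate path of Table V.1 is not restated in this passage, the sketch below does not attempt to reproduce the $6127 and $6049 figures; the function name is hypothetical, and the example rate paths are illustrative (only the 5 percent and 10 percent rates for years 2 and 3 come from the text).

```python
def accumulated_sum(coupon, principal, rates):
    """Value at maturity of all payments, with coupons reinvested.

    rates[t-1] is the one-period rate r_t earned during year t, t = 1..T.
    A payment received at the end of year t earns r_(t+1), ..., r_T.
    """
    T = len(rates)
    total = 0.0
    for t in range(1, T + 1):
        payment = coupon + (principal if t == T else 0.0)
        growth = 1.0
        for s in range(t + 1, T + 1):
            growth *= 1.0 + rates[s - 1]
        total += payment * growth
    return total


# With a flat 10 percent rate, a 10 percent par bond simply compounds at 10 percent.
flat_check = accumulated_sum(100.0, 1000.0, [0.10, 0.10, 0.10])

# Coupons only, three years, with years 2 and 3 at 5 and 10 percent.
# The year-1 rate is a placeholder: no payment is reinvested during year 1.
coupons_only = accumulated_sum(50.0, 0.0, [0.03, 0.05, 0.10])
```

In the second example the first $50 coupon grows to 50 × 1.05 × 1.10 = $57.75, the second to $55, and the third is received at the horizon, for a total of $162.75.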

Hence, unlike the yield on pure discount bonds, the yield-to-maturity on coupon bonds

does not provide a ranking for comparing bonds of the same maturity. Moreover, while the yield

on a discount bond is always equal to the actual average compound return earned from holding

the bond to maturity, the yield-to-maturity on a coupon bond does not. To see this, note that

bond #2 is formally equivalent to a 15-year pure discount bond which pays $6049 at maturity and

has current market price of $1545. From (V.8), the average compound return on such a bond is

given by (6049/1545)^{1/15} – 1 or 9.5 percent which is significantly different from its 12 percent

yield-to-maturity. Thus, even if bond #2 had been the better investment, in general, its yield-to-

maturity will not be equal to the average compound return from holding it to maturity. This

significant discrepancy also demonstrates that the yield-to-maturity method as an alternative to

solving (V.23) can produce significant errors in estimating the term structure.
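The average compound return quoted for bond #2 follows directly from its accumulated sum and current price. The helper below is a hypothetical illustration; applying the same calculation to bond #1's accumulated sum of $6127 is an extension of the text's arithmetic rather than a figure it reports.

```python
def average_compound_return(terminal_value, initial_price, periods):
    """Per-period rate that grows initial_price into terminal_value."""
    return (terminal_value / initial_price) ** (1.0 / periods) - 1.0


realized_2 = average_compound_return(6049.0, 1545.0, 15)  # bond #2
realized_1 = average_compound_return(6127.0, 1545.0, 15)  # bond #1
```

Bond #2's realized return is about 9.5 percent against a 12 percent yield-to-maturity, and bond #1's realized return is slightly higher even though its yield-to-maturity is only 10 percent.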

In summary, the yield-to-maturity method is not a reliable one either for making bond

swapping decisions or for estimating the term structure. The correct method is to estimate the

pure discount bond prices using (V.23) and then to evaluate individual coupon bonds using

(V.21).

The reason that the yield-to-maturity on a coupon bond fails to provide either the correct

return on the bond when held until maturity or the correct ranking of alternative bond

investments can be traced to the derivation of the present value formula in Section II. In that

derivation, it was essential that the interest rates used be the ones at which payments received

could be reinvested. Therefore, (V.22) from which the yield-to-maturity is derived, is a valid

present value formula only if the coupon payments received can be reinvested each period at rate

\(r_j^*\). Hence, unless the reinvestment rates each period happen to equal \(r_j^*\), the average compound

return from holding a coupon bond until maturity will not equal its yield-to-maturity. Indeed, as

was demonstrated by Problem II.2 in Section II, the choice among claims cannot in general be

made without reference to these reinvestment rates. Because the yield-to-maturity method makes

no reference to such reinvestment rates, it is perhaps not surprising that it cannot provide an

unambiguous ranking among investments.

While the preceding analysis provides essentially a negative report on the yield-to-

maturity method, it was pursued in detail because this method is frequently used and mis-used.

Also, in the study of corporate finance in Section VII, the same issues will arise again with

respect to the internal rate of return method for making capital budgeting decisions.

Two other yield terms frequently used in connection with coupon bonds are the coupon

rate and the current yield. The coupon rate r+ is defined to be the ratio of the coupon payment

per period to the principal or face value of the bond (e.g., for bond #j, \(r_j^{+} \equiv C_j/M_j\)). The

current yield r is defined to be the ratio of the coupon payment per period to the current

market price of the bond (e.g., for bond #j, \(r_j \equiv C_j/B_j\)). The current yield is the rate of

return that would be earned from holding the bond for one period if the price of the bond does

not change. Using (V.22), the relationship among the coupon rate, the current yield, and the

yield-to-maturity for bond #j can be expressed by

(V.26)    \[ \frac{r_j^* - r_j}{r_j^* - r_j^{+}} = \frac{r_j}{r_j^{+}} \, (1 + r_j^*)^{-T_j} \]

So, for example, the current yield will equal the yield-to-maturity if and only if either Tj = ∞

(i.e., the bond is a perpetuity) or Bj = Mj (i.e., the current yield equals the coupon rate). If

\(r_j > r_j^{+}\), then \(r_j^* > r_j > r_j^{+}\), and if \(r_j < r_j^{+}\), then \(r_j^* < r_j < r_j^{+}\). As with the yield-to-maturity,

neither the coupon rate nor the current yield is particularly useful for estimating the return from

holding a coupon bond.
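
The ordering of the three yield measures can be checked numerically. The sketch below is illustrative only: the bond (price $900, coupon $80, face value $1000, ten periods) and the function names are hypothetical. It computes the coupon rate, the current yield, and the yield-to-maturity, and then verifies that (r* minus current yield)/(r* minus coupon rate) equals (current yield/coupon rate)(1 + r*)^(-T), which is relationship (V.26) as read here.

```python
def yield_to_maturity(price, coupon, face, periods):
    """Bisection search for the rate that discounts the payments back to `price`."""
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        pv = sum(coupon / (1 + mid) ** t for t in range(1, periods + 1))
        pv += face / (1 + mid) ** periods
        if pv > price:
            lo = mid  # present value too high, so the yield must be higher
        else:
            hi = mid
    return (lo + hi) / 2


def yield_measures(price, coupon, face, periods):
    """Return (coupon rate, current yield, yield-to-maturity)."""
    return coupon / face, coupon / price, yield_to_maturity(price, coupon, face, periods)


# Hypothetical discount bond: price 900, coupon 80, face value 1000, 10 periods.
coupon_rate, current_yield, ytm = yield_measures(900, 80, 1000, 10)
lhs = (ytm - current_yield) / (ytm - coupon_rate)
rhs = (current_yield / coupon_rate) * (1 + ytm) ** -10
print(coupon_rate, current_yield, ytm, lhs, rhs)
```

Because this bond sells below its face value, the current yield exceeds the coupon rate and the yield-to-maturity exceeds both.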

All pure discount bonds with the same maturity date have the same time pattern of

payments: Namely, all payments are made on the maturity date. However, depending upon the

relationship between the relative size of the coupon and principal payments, the time pattern of

payments for coupon bonds with the same maturity date need not be the same. For example, in

Problem V.4, both bonds had fifteen-year maturities. However, bond #1 had a $200 per year

coupon and a $1000 principal whereas bond #2 had a $50 per year coupon with a $4865

principal. Thus, a $1545 investment in bond #1 would receive a larger fraction of its total

payments relatively earlier than the same investment in bond #2. The duration of a default-free

coupon bond is the (value-weighted) average time of the payments received on the bond and is

defined for bond #j by

(V.27)  \[ D_j \equiv \sum_{t=1}^{T_j} t\,\delta_{jt} \]

where \(\delta_{jt} \equiv C_j P_0(t)/B_j\) for \(t = 1, 2, \ldots, T_j - 1\) and \(\delta_{jT_j} \equiv (C_j + M_j) P_0(T_j)/B_j\). Hence, \(\delta_{jt}\) is

the fraction of bond #j's total value attributable to the payment received in period t, t = 1,...,Tj.

Because coupon bond #j is formally equivalent to a collection or portfolio of pure discount bonds

with \(\$\,\delta_{jt} B_j\) invested in pure discount bonds which mature at date t, \(D_j\) is equal to the (value-

weighted) average maturity of this portfolio of bonds. While duration does provide more

information about the time pattern of payments than the maturity, it does not appear to have

much operational importance for the evaluation of coupon bonds. For example, it is not true in

general that, of two coupon bonds with the same duration, the one with the higher yield-

to-maturity is the better buy.
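
The duration definition can be illustrated with a short Python sketch. The coupon and principal amounts are those quoted above for the two bonds of Problem V.4; the flat 10 percent set of discount bond prices and the function name are assumptions made purely for illustration and are not the term structure used in that problem.

```python
def duration(coupon, face, discount_prices):
    """Value-weighted average time of the payments on a coupon bond.

    discount_prices[t - 1] is P0(t), the price today of $1 paid at date t.
    Returns (duration, bond price).
    """
    periods = len(discount_prices)
    payments = [coupon] * (periods - 1) + [coupon + face]
    values = [pay * p for pay, p in zip(payments, discount_prices)]
    price = sum(values)
    weights = [v / price for v in values]  # the fractions delta_jt, which sum to one
    return sum(t * w for t, w in enumerate(weights, start=1)), price


# Assumed flat 10 percent curve over fifteen years, used only for illustration.
curve = [1 / 1.10 ** t for t in range(1, 16)]
d1, b1 = duration(200, 1000, curve)  # bond #1: $200 coupon, $1000 principal
d2, b2 = duration(50, 4865, curve)   # bond #2: $50 coupon, $4865 principal
d0, _ = duration(0, 1000, curve)     # a pure discount bond for comparison
print(d1, d2, d0)
```

Under this assumed curve, bond #1 has the shorter duration because more of its value comes from early coupons, while the pure discount bond's duration equals its maturity.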

An alternative definition of duration sometimes used is

(V.28)  \[ D'_j \equiv \sum_{t=1}^{T_j} t\,\gamma_{jt} \]

where \(\gamma_{jt} \equiv C_j (1 + r_j^*)^{-t}/B_j\) for \(t = 1, 2, \ldots, T_j - 1\) and \(\gamma_{jT_j} \equiv (C_j + M_j)(1 + r_j^*)^{-T_j}/B_j\). This

measure differs from the original because it replaces the actual present value of the individual

payments with the discounted value using the yield-to-maturity of the bond. Two attractive

features of this measure are: (1) it can be computed without knowledge of discount bond prices

\(\{P_0(t)\}\). (2) It does provide a measure of the sensitivity of the bond's price to a change in its

yield-to-maturity. Specifically,

(V.29)  \[ \frac{1 + r_j^*}{B_j}\,\frac{\partial B_j}{\partial r_j^*} = -D'_j. \]

Because it is equal to the (negative of the) price elasticity of a bond with respect to a change in one plus its

yield-to-maturity, \(D'_j\) is sometimes used as a measure of the relative price variability of a

coupon bond with respect to a change in interest rates. However, it is not a reliable measure

because actual changes in interest rates do not affect the yield-to-maturity on different bonds in

the same way. That is, a bond with a longer duration than another bond may have a smaller

percentage price change in response to a change in interest rates because the effect of this change

on its yield-to-maturity may be smaller than the effect on the second bond's yield-to-maturity.

For a short, but excellent, discussion on the use and misuse of duration as a measure of price

variability, see Cox, Ingersoll, and Ross (1979).
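
The link between the yield-based duration measure and price sensitivity can be checked numerically. The sketch below is a hedged illustration: the bond (coupon $80, face value $1000, ten periods, yield 10 percent) and the function names are hypothetical. It compares the yield-based duration with a finite-difference estimate of the elasticity of the bond's price with respect to one plus its yield-to-maturity.

```python
def price(coupon, face, periods, y):
    """Bond price when every payment is discounted at the yield-to-maturity y."""
    coupons = sum(coupon / (1 + y) ** t for t in range(1, periods + 1))
    return coupons + face / (1 + y) ** periods


def yield_duration(coupon, face, periods, y):
    """Average payment date, weighting each date by the share of price
    obtained when payments are discounted at the yield-to-maturity."""
    total = sum(t * coupon / (1 + y) ** t for t in range(1, periods + 1))
    total += periods * face / (1 + y) ** periods
    return total / price(coupon, face, periods, y)


# Hypothetical bond: coupon 80, face value 1000, 10 periods, yield 10 percent.
y, h = 0.10, 1e-6
slope = (price(80, 1000, 10, y + h) - price(80, 1000, 10, y - h)) / (2 * h)
elasticity = (1 + y) * slope / price(80, 1000, 10, y)  # elasticity w.r.t. (1 + y)
d_prime = yield_duration(80, 1000, 10, y)
print(elasticity, d_prime)
```

The two numbers agree in magnitude and differ in sign. The check says nothing about how a given change in market interest rates maps into each bond's own yield-to-maturity, which is the source of the unreliability discussed above.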

Financial Intermediation and Interest Rate Spreads

We have assumed throughout this section that the fixed-income securities are traded in

organized markets. That is, to raise money, individuals or firms issue fixed-income securities

directly in the market, and to invest money, they purchase such securities directly in the market.

However, there is an alternative to direct market participation: Namely, fixed-income securities

can be issued to or purchased from a financial intermediary. A financial intermediary is defined

as an economic organization whose principal function is to purchase financial securities and

finance these purchases by issuing financial securities. Probably the best-known type of financial

intermediary is a bank which makes ("purchases") loans and finances them by ("issuing")

deposits. Although the operational differences have become progressively less distinct, the

banking function is further specialized by both the type of loan made and the form of deposits

issued. For example, commercial banks specialize in making short-term loans to business firms

and individuals and finance them principally by demand deposits. Savings and loan associations

and mutual savings banks specialize in making long-term loans (principally mortgages) and

finance them by time deposits. Other examples of financial intermediaries are finance

companies, insurance companies, and investment companies.

Whether a specific financial security is best handled by a market or financial intermediary

will, of course, depend upon the relative costs associated with the two alternatives. The market

system works best when there are a large number of both buyers and sellers of the security

willing to transact in minimum lot sizes sufficient to cover the costs of maintaining a market. In

general, such securities would have to be of standard form and available in reasonably large

quantities. Further, information about the issuer which is relevant to the evaluation of the

securities would have to be available to a large number of potential participants at a reasonable

cost.

The advantages of using financial intermediation come when there are important

economies of scale. For example, by its geographical location, a bank may have significantly

lower costs in gathering information about the local real estate market than would a nonlocal

entity. Significant information asymmetries in general will favor financial intermediation. If the

economic lot size required to support a market is large, then the financial intermediary may

provide divisibility otherwise unavailable. It may also provide added flexibility over a market by

allowing nonstandardized contracts. As will be shown later, a financial intermediary may

provide more-efficient risk-spreading at a lower cost than could be achieved with markets alone.

In the light of these differences between the two alternatives, it is not surprising that virtually all

borrowing by individuals is done through financial intermediaries rather than by issuing claims

directly in the market and the majority of fixed-income securities held by individuals are claims

against financial intermediaries.

The no-arbitrage pricing formulas derived for default-free fixed-income securities imply

that there is a single interest rate \(r_t\) for each period. In fact, it is not uncommon to find fixed-

income securities with the same maturities and promised payments that sell for different prices

and hence, different yields. These persistent differences in promised yields are called interest

rate spreads, and they occur both for fixed-income securities traded in markets and for similar

securities available through financial intermediaries. As mentioned earlier, one reason for these

differences is that the observed securities are not all default-free. When there are different

probability assessments of receiving the promised payments, then promised yields will be

different. A second reason for these differences is that the terms (other than the maturity and

promised payments) are different. Two examples would be differences in sinking fund and call

provisions.

A third reason is that the tax treatment of the returns earned on the securities is different.

Coupon bond payments on municipal bonds are exempt from Federal and (sometimes) State

income taxes. Returns earned from price appreciation on a bond are taxed at a different (capital

gains) rate than coupon payments. One series of US Government bonds (appropriately called

"flower bonds") provided a means for reducing Federal estate taxes. A more subtle example of a

tax difference occurs for demand deposits issued by banks. Instead of paying the market rate of

interest on such deposits, banks frequently provide the service and convenience of a checking

account at "no charge." Because interest income is taxable but service charges for a checking

account are in general not tax-deductible, the implicit interest received in the form of these "free

services" is, in effect, tax-free. In general, one must include not only the explicit cash payments

but also the value of any "payments in kind" when comparing returns on fixed-income securities.

A fourth reason for the differences is a difference in the transactions costs associated

with different fixed-income securities. Dealers and market-makers who provide the services of

an orderly market for trading these securities are compensated for these services by buying at one

price (the bid price) and selling at a higher price (the ask price). The average compensation per

"round trip" trade is the difference between the ask price and the bid price, which is called the bid-

ask spread. Because there are now two "prices" for a fixed-income security, two identical

securities could have different observed transaction prices if the last trade for one were a

purchase and the last trade for the other were a sale. If the marginal cost of making a market in

fixed-income security #j is higher than the marginal cost of making a market in fixed-income

security #i, then security #j is said to have lower marketability than security #i. Other things the

same, a security which has lower marketability will have a larger bid-ask spread. Because a

larger bid-ask spread implies a greater cost to the investor making transactions in that security,

equilibrium promised yields on less-marketable securities will be higher. Interest rate spreads

caused by costs are especially common for fixed-income securities available through financial

intermediaries as Problem V.5 illustrates.

Problem V.5: Interest Spreads on Consumer Installment Loans

A consumer goes to a bank to obtain a 36-month loan of $3000 to buy an automobile. As

is standard for such loans, the terms call for a series of equal monthly payments to repay the loan

with interest (i.e., it is an annuity type loan). It is given that future market interest rates are

nonstochastic and the term structure is "flat" at a level of 10 percent per year (i.e., 0.7974% per

month). The cost to the bank of closing the loan is $60 and the monthly cost to the bank of

servicing the loan is $0.65. If there is no chance that the consumer will default on the loan, what

is the smallest monthly payment required by the bank so that it would make the loan? What

would be the corresponding "quoted" interest rate on this loan?

Because the bank can always invest its funds at 10 percent per year in default-free fixed-

income securities, it will only make the loan on terms that will generate (at least) a 10 percent

return on its funds and cover all costs. If $x denotes the monthly payment by the consumer,

then the payment received by the bank per month net of servicing costs is $y ≡ $(x – .65). To

receive this stream of payments, the bank must initially pay out $3000 to the consumer and $60

in closing costs or a total of $3060. Because the reinvestment rate is the same each period, we

can use the present value formula for an annuity derived in Section II to determine the monthly

payments y which will generate a 10 percent annual return. Substituting r = .007974, N = 36,

and \(A_N = 3060\) in (II.19), we have that

(V.30)  \[ y = \frac{r A_N}{1 - 1/(1 + r)^N} = \$98.12. \]

Therefore, the monthly payment made by the consumer x is equal to $98.77. As discussed in

Problem II.2, the practice is for the bank to quote the terms of the loan in the form of an annual

interest rate based upon the amount of money borrowed by the consumer. This "quoted" interest

rate is the yield-to-maturity r* (annualized) on an annuity which pays $x per month for 36

months and has a present value of $3000. I.e., r* is the solution to equation (V.30) where \(B_j\) =

$3000, \(C_j\) = x and \(M_j\) = 0. Solving this equation, we have that r* = .009488 or .9488% per

month. The annualized interest rate implied by this monthly rate is given by \((1 + r^*)^{12} - 1\) or 12

percent. Although the quoted rate is 200 basis points higher than the market rate (100 basis

points equals 1 percent on an interest rate), the bank only earns the market rate of 10 percent on

its funds. The difference in rates just covers the cost of creating and servicing the loan. Of

course, if there is a chance that the consumer will default on the loan, then the additional costs

associated with repossessing the automobile and selling it would have to be covered by the

promised monthly payments, and the spread between the quoted and market rates would be even

larger.
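
The arithmetic of Problem V.5 can be reproduced with a few lines of Python. This is a sketch under the problem's stated assumptions (flat 10 percent annual rate, $60 closing cost, $0.65 monthly servicing cost, no default). The function names are hypothetical, and the quoted rate is found by a simple bisection search, which is one convenient method and not necessarily the one used in the text.

```python
def annuity_payment(present_value, rate, n):
    """Level payment on an n-period annuity worth `present_value` at `rate` per period."""
    return rate * present_value / (1 - (1 + rate) ** -n)


def annuity_yield(present_value, payment, n):
    """Bisection search for the per-period rate at which n level payments
    are worth `present_value`."""
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = (lo + hi) / 2
        if payment * (1 - (1 + mid) ** -n) / mid > present_value:
            lo = mid  # payments worth too much, so the rate must be higher
        else:
            hi = mid
    return (lo + hi) / 2


monthly_rate = 1.10 ** (1 / 12) - 1                # 10 percent per year, compounded monthly
y = annuity_payment(3000 + 60, monthly_rate, 36)   # net monthly payment the bank needs
x = y + 0.65                                       # add back the monthly servicing cost
r_star = annuity_yield(3000, x, 36)                # monthly "quoted" rate on the $3000 borrowed
quoted_annual = (1 + r_star) ** 12 - 1
print(round(y, 2), round(x, 2), r_star, quoted_annual)
```

The payments round to the $98.12 and $98.77 given above, and the annualized quoted rate comes out at about 12 percent. The monthly rate from the search can differ from the text's figure in the last digit or so because the intermediate values here are not rounded.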

This completes our study of default-free fixed-income securities. In Sections II-V, we

have emphasized the intertemporal aspects of financial markets and instruments with little or no

explicit consideration of uncertainty. In Sections VI and VII, corporate investment decisions are

examined in such an environment. Beginning in Section VIII, the balance of this Volume will be

devoted to the role of uncertainty in financial theory.

Table III. Summary of Selected Financial Instruments

For each market the entries are: obligation; secondary market; maturities; denominations; volume; volatility (range over 2 years: high, low); quotation basis; forward and futures markets.

Treasury Bills

Obligation: U.S. Government obligations.

Secondary market: Excellent secondary market.

Maturities: 3 month, 6 month, 1 year.

Denominations: $10,000; $15,000; $50,000; $100,000; $500,000; $1,000,000.

Volume: $6 billion per week in 3- and 6-month maturities; $3-5 billion per day in secondary market.

Volatility (high, low): 9.37, 4.76.

Quotation basis: Discounted interest based on actual days in 360-day year.

Forward and futures markets: One-week forward market sometimes available on bid side. Futures traded on International Monetary Market.

U.S. Agency Paper

Obligation: Obligations of U.S. agencies established by Congressional acts.

Secondary market: Good secondary market.

Maturities: 30 days to 40 years.

Denominations: $1,000 to $100,000.

Volume: $87 billion outstanding in May, 1976.

Volatility (high, low): Varies with each issue and maturity.

Quotation basis: Discounted or interest-bearing. Interest based on 30-day month or 360-day year.

Forward and futures markets: 3-5 day forward market for some "when issued" securities; some trading for delayed delivery; standby commitments available. GNMA futures traded on Chicago Board of Trade.

Prime Commercial Paper

Obligation: Promissory notes of issuing companies (industrial & financial).

Secondary market: No secondary market.

Maturities: 30-270 days.

Denominations: $5,000 to $5 million; $100,000 basic trading unit.

Volume: $50.5 billion outstanding in May, 1976.

Volatility (high, low): 12.50, 5.00.

Quotation basis: Discounted or occasionally interest bearing. Based on actual days in 360-day year.

Forward and futures markets: No forward market. No futures market.

Certificates of Deposit

Obligation: Obligation of bank accepting the deposit.

Secondary market: Good secondary market.

Maturities: 1-12 months and occasionally up to 18 months.

Denominations: $100,000 and up; $500,000 minimum trading unit; $1,000,000 most common trading unit.

Volume: $70.6 billion outstanding in June, 1976 in CDs of over $100,000.

Volatility (high, low): 12.66, 5.56.

Quotation basis: Yield basis. Interest paid on actual days in 360-day year. Interest & principal paid at maturity.

Forward and futures markets: No forward market. No futures market.

Bankers Acceptances

Obligation: Obligation of bank against which draft is drawn and which accepts draft.

Secondary market: Good secondary market.

Maturities: 30-180 days. 90 days is most common primary market maturity.

Denominations: Issued in odd denominations; traded in $100,000 to $1,000,000 lots.

Volume: $19.5 billion outstanding in May, 1976.

Volatility (high, low): 12.16, 4.94.

Quotation basis: Discounted. Interest based on 360-day year.

Forward and futures markets: No forward market. No futures market.

Federal Funds

Obligation: Obligations of the bank borrowing funds.

Secondary market: No secondary market.

Maturities: Usually overnight.

Denominations: Negotiated among participants, generally $1 million units.

Volume: $26.9 billion average weekly volume for July, 1976.

Volatility (high, low): 13.55, 4.73.

Quotation basis: Par basis; interest paid based on 360-day year. Interest & principal paid at maturity.

Forward and futures markets: There is a forward market of one to two weeks. No futures market.

Long Term & Intermediate Term Government Securities

Obligation: U.S. Government obligation.

Secondary market: Limited secondary market; good for some short terms but very thin for long terms.

Maturities: 1-10 year notes and 10-40 year bonds.

Denominations: $10,000 minimum.

Volume: $188 billion outstanding in June, 1976.

Volatility (high, low): 3-5 year: 8.69, 6.71. Long term: figures not shown.

Quotation basis: Price basis; quoted in dollars per hundred dollars face value.

Forward and futures markets: No forward market. No futures market.

VI. THE VALUE OF THE FIRM UNDER CERTAINTY

In Sections III and IV, it was shown that the maximization of the current value of the firm is

the appropriate primary objective for good management. It is therefore natural to begin the study of

corporate finance by first developing techniques for determining the value of the firm and then

examining how that market value is affected by those investment and financing decision variables

which are under management's control. In this section, valuation formulas are derived in a certainty

environment where the future cash flows of the firm are known, and in Section VII, the capital

budgeting or the firm's investment decision problem is studied within this same framework. In

Sections XIV and XV, the investment decision by firms is reexamined in the context of uncertainty.

Valuation techniques can be separated into three categories:

(i) Rules of thumb intended to facilitate comparisons of value among similar assets.

(ii) Approaches based upon the economic theory of market value under certainty.

(iii) Approaches which explicitly recognize uncertainty and take into account risk in a market context.

While category (i) techniques are frequently used in practice (especially in security analysis),

they are less useful for corporate financial decisions because of the difficulty in determining the

impact of alternative management decisions on market value. Moreover, the use of such techniques

can be "dangerous" unless the user understands the set of implicit assumptions upon which their

valid application depends and the associated limits within which they can be relied upon.

As will be demonstrated, most Rules of Thumb are simplified abstractions of the techniques

in category (ii). Since these techniques also permit the analysis of alternative management decisions

on firm value, we begin the study of value with these techniques. However, the reader is warned

that because they assume certainty, the valuation formulas derived in category (ii) are themselves

significant abstractions, and care must be exercised in applying them in practice. Category (iii) to be

examined in Section XIV is the least abstract of the three and therefore, the most rigorous.

However, these techniques are also the most complicated and require more information and analysis

to implement. Which technique to use will depend upon the situation and the judgment of the

manager. One important purpose of this course is to help develop this judgment.

There are four basic approaches used to determine value in a certainty context:

(I) The value of the firm is the present value of the stream of dividends paid by the firm.

(The "Dividend-Discount" approach.)

(II) The value of the firm is the present value of the cash flows generated by the firm.

(The "Discounted Cash Flow" approach.)

(III) The value of the firm is the present value of the earnings generated by the firm. (The

"Discounted Earnings" approach.)

(IV) The value of the firm is the present value of earnings generated from assets currently

in place plus future investment opportunities. (The "Growth Opportunities"

approach.)

As an aid to the reader, a glossary of notation used in the analysis to follow is presented on

the next page.

To determine which (if any) of the four statements of values, I-IV, are correct, we start from

first principles. If Z(t) is the return per dollar from investing in the equity of the firm between time

t and t+1, then, by definition

(VI.1)  \[ Z(t) \equiv \frac{d(t+1) + S(t+1)}{S(t)} \]

where d(t+1) is the dividend per share paid at time (t+1) and S(t+1) is the price per share (ex-

dividend paid at time t+1). From the identity (VI.1), we derive a price restriction from the

(arbitrage) condition under certainty that all securities must yield the interest rate. I.e., that

(VI.2)  \[ 1 + r(t) = Z(t) \equiv \frac{d(t+1) + S(t+1)}{S(t)} \]

where r(t) is the one-period rate of interest from time t to t+1.

Consider a firm which will remain in business for T periods (from now) and then liquidate.

To deduce the value of the stock today, we first go forward in time and then work backwards to

today.

At time T in the future, the firm will pay its last dividend, d(T), per share and the ex-

dividend price per share at that time will be the salvage value (per share) of the firm, SALV, which

is assumed to be paid out as either a liquidating dividend or return of capital.

Glossary of Notation

V(t) ≡ market value of the firm at time t

n(t) ≡ number of shares of the firm's stock outstanding at time t

S(t) ≡ price per share of stock at time t (ex-dividend paid at time t)

V(t) ≡ n(t)S(t) if the firm is all equity-financed

D(t) ≡ total dividends paid by the firm at time t

d(t) ≡ dividend per share = D(t)/n(t–1)

Z(t) ≡ return per dollar to the investor in the firm

REV(t) ≡ total revenues in period t = stream of cash receipts

O(t) ≡ total operating cash outflow in period t

π(t) ≡ after-tax profits in period t

DEP(t) ≡ depreciation in period t

CGS(t) ≡ cost of goods sold in period t

τ(t) ≡ taxes paid in period t

I(t) ≡ gross investment (both new and replacement) in period t

i(t) ≡ net (new) investment in period t

I(t) ≡ i(t) + DEP(t)

X(t) ≡ "gross" profit or net cash flow in period t

X(t) ≡ π(t) + DEP(t)

\(P_t(s)\) ≡ price of a default-free discount bond at time t which pays $1 at time t+s (i.e., s

periods in the future).

r(t) ≡ short-term, one-period riskless interest rate for period t

Except for some tax implications, we could assume that d(T) includes this payment in which case

SALV = 0. From (VI.2), we have that

(VI.3)  \[ 1 + r(T-1) = Z(T-1) = \frac{d(T) + S(T)}{S(T-1)} = \frac{d(T) + \mathrm{SALV}}{S(T-1)} \]

or

(VI.3')  \[ S(T-1) = \frac{d(T) + \mathrm{SALV}}{1 + r(T-1)}. \]

Consider an investor who at time (T–2) is going to buy the stock; the total return in dollars

for holding one share for one period [from (T–2) to (T–1)] will be d(T–1) + S(T–1), and again to

avoid arbitrage, we have from (VI.2) that

(VI.4)  \[ 1 + r(T-2) = Z(T-2) = \frac{d(T-1) + S(T-1)}{S(T-2)} \]

or

(VI.4')  \[ S(T-2) = \frac{d(T-1) + S(T-1)}{1 + r(T-2)} \]

Substituting for S(T–1) from (VI.3') into (VI.4'), we have that

(VI.5)  \[ S(T-2) = \frac{d(T-1)}{1 + r(T-2)} + \frac{d(T) + \mathrm{SALV}}{[1 + r(T-2)][1 + r(T-1)]}. \]

At time (T–3) from now, we have from (VI.2) that

(VI.6)  \[ 1 + r(T-3) = \frac{d(T-2) + S(T-2)}{S(T-3)} \]

or from (VI.5) and (VI.6),

(VI.6')
\[
\begin{aligned}
S(T-3) &= \frac{d(T-2) + S(T-2)}{1 + r(T-3)} \\
&= \frac{d(T-2)}{1 + r(T-3)} + \frac{d(T-1)}{[1 + r(T-3)][1 + r(T-2)]} + \frac{d(T) + \mathrm{SALV}}{[1 + r(T-3)][1 + r(T-2)][1 + r(T-1)]}
\end{aligned}
\]

Proceeding inductively in this backwards fashion, we arrive at the price per share today (time zero)

which ensures that an investor buying the stock at any time and selling at any other time will earn a

fair return and no arbitrage opportunities will be created. I.e.,

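(VI.7)
\[
\begin{aligned}
S(0) &= \frac{d(1)}{1 + r(0)} + \frac{d(2)}{[1 + r(0)][1 + r(1)]} + \cdots + \frac{d(T) + \mathrm{SALV}}{[1 + r(0)] \cdots [1 + r(T-1)]} \\
&= \sum_{t=1}^{T} \frac{d(t)}{\prod_{s=1}^{t} [1 + r(s-1)]} + \frac{\mathrm{SALV}}{\prod_{s=1}^{T} [1 + r(s-1)]} \\
&= \sum_{t=1}^{T} \frac{d(t)}{[1 + R(t)]^t} + \frac{\mathrm{SALV}}{[1 + R(T)]^T} \\
&= \sum_{t=1}^{T} P_0(t)\,d(t) + P_0(T)\,\mathrm{SALV}
\end{aligned}
\]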

where R(t) and P0(t) are as defined in Section V. Indeed, (VI.7) follows directly as a special case

of valuation formulas (V.18) and (V.19) in Section V.
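
The backward-induction argument can be sketched in Python. The dividend stream, one-period rates, and salvage value below are hypothetical, as are the function names. The first function applies the one-period no-arbitrage condition (VI.2) repeatedly from the liquidation date back to today, and the second discounts each payment directly by the product of one-period rates.

```python
def share_price_backward(dividends, rates, salvage=0.0):
    """Work backward from the liquidation date using S(t-1) = [d(t) + S(t)] / [1 + r(t-1)].

    dividends[t - 1] is d(t) for t = 1..T; rates[t] is r(t) for t = 0..T-1.
    """
    price = salvage  # S(T): the ex-dividend price at the liquidation date
    for t in range(len(dividends), 0, -1):
        price = (dividends[t - 1] + price) / (1 + rates[t - 1])  # gives S(t-1)
    return price


def share_price_discounted(dividends, rates, salvage=0.0):
    """Discount each dividend, and the salvage value, by the product of one-period rates."""
    total, discount = 0.0, 1.0
    for d, r in zip(dividends, rates):
        discount /= 1 + r        # discount factor for date t: 1 / prod [1 + r(s-1)]
        total += discount * d
    return total + discount * salvage


# Hypothetical three-period firm.
dividends = [2.0, 2.5, 3.0]   # d(1), d(2), d(3)
rates = [0.05, 0.06, 0.07]    # r(0), r(1), r(2)
backward = share_price_backward(dividends, rates, salvage=40.0)
direct = share_price_discounted(dividends, rates, salvage=40.0)
print(backward, direct)
```

Both routes give the same price, which is the content of the induction step.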

If it is assumed that the firm is financed entirely by equity (of a single-homogeneous class)1,

then we have that the current market value of the firm is V(0) ≡ n(0)S(0) where n(0) is the number

of shares currently outstanding. From (VI.7), we have

(VI.8)  \[ V(0) = \sum_{t=1}^{T} \frac{n(0)\,d(t)}{[1 + R(t)]^t} + \frac{n(0)\,\mathrm{SALV}}{[1 + R(T)]^T} \]

For notational convenience, we rewrite (VI.7) and (VI.8) as

1 An assumption maintained until we reach Section IX.

(VI.7')  \[ S(0) = \sum_{t=1}^{\infty} \frac{d(t)}{[1 + R(t)]^t} \]

and

(VI.8')  \[ V(0) = n(0) \sum_{t=1}^{\infty} \frac{d(t)}{[1 + R(t)]^t} \]

where it is understood that a finite-lived firm will have d(t) = 0 for t > T, and any

salvage value is incorporated in d(T).

Returning to our four approaches to valuation: From (VI.8'), (I) is valid provided that it is

more carefully stated to say that the current market value of the firm is equal to the present

(discounted) value of the stream of dividends paid by the firm to the current shares outstanding.

Total dividends paid by the firm at time t, D(t), are equal to n(t–1)d(t).2

So, in general, \( V(0) \neq \sum_{t=1}^{\infty} D(t)/[1 + R(t)]^t \) unless \( n(t) \equiv n(0) \) for all t,

i.e., unless the firm neither issues any

new shares (to raise additional capital) nor purchases any shares ("share repurchase") for treasury

stock.

Note: Even though the value of the firm (or an individual share) is written as the present

value of future dividends, the investor will earn the market return on his investment over any sub-

period of time even if no dividends are paid during that time. E.g., if d(3) = 0, then the return from

period 2 to period 3 will be \( Z(2) = S(3)/S(2) = 1 + r(2) \).

To work out the dynamics of how an all-equity-financed firm's value changes through time,

it is important to distinguish between the change in an individual investor's wealth from the return

earned by the firm and the change in the firm's total value. By definition,

\( V(t+1) \equiv n(t+1)S(t+1) = n(t)S(t+1) + [n(t+1) - n(t)]S(t+1). \) Substituting (partly) for S(t+1) from

(VI.2), we have that

2 Note it is "n(t–1)" because we have assumed that dividends paid at time t go only to shares outstanding as of time (t–1).

(VI.9)
\[
\begin{aligned}
V(t+1) &= n(t)\left[(1 + r(t))S(t) - d(t+1)\right] + m(t+1)S(t+1) \\
&= (1 + r(t))V(t) - D(t+1) + m(t+1)S(t+1)
\end{aligned}
\]

where m(t+1) ≡ n(t+1) – n(t) = number of new shares issued by the firm at the (ex-dividend) price

S(t+1). [If m(t+1) < 0, then this corresponds in absolute value to the number of shares purchased

by the firm from shareholders.] From (VI.9), the change in firm value, ∆V ≡ V(t+1) - V(t), can be

written as

(VI.10)
\[
\begin{aligned}
\Delta V &= [r(t)V(t) - D(t+1)] + [m(t+1)S(t+1)] \\
&= r(t)V(t) + \{m(t+1)S(t+1) - D(t+1)\}
\end{aligned}
\]

In the last expression, r(t)V(t) is the total change in shareholders' wealth from investing in the firm, and {m(t+1)S(t+1) – D(t+1)} is the net new financing by the firm, or net new capital raised by the company.

So, for example, AT&T could have a beginning-of-the-year market value of $20 billion; investors

could average a 10% return for the year (or $2 billion); AT&T could pay out dividends (and interest)

of $1.5 billion; and it could issue $4 billion worth of new shares (and debt). From (VI.10), the

change in firm value would be ∆V = $2 billion + ($4 billion – $1.5 billion) = $4.5 billion while the net gain to

investors would be $2 billion. Hence, the change in the market value of the firm can be larger than,

smaller than, or equal to the change in shareholders' wealth. From (VI.9), we also have

(VI.11)  \[ V(t) = \frac{1}{1 + r(t)}\,\{V(t+1) + D(t+1) - m(t+1)S(t+1)\}. \]
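
The distinction between the change in firm value and the change in shareholders' wealth can be traced through the AT&T figures given in the text (in billions of dollars). The Python below is only a sketch and its function name is hypothetical. It separates the investors' return from net new financing and then recovers the beginning-of-year value by discounting end-of-year value plus dividends less new financing.

```python
def firm_value_change(value, rate, dividends, new_financing):
    """Split the change in firm value into investors' return and net new capital.

    Returns (change in firm value, change in shareholders' wealth).
    """
    investor_gain = rate * value                 # r(t) V(t)
    net_new_capital = new_financing - dividends  # m(t+1) S(t+1) - D(t+1)
    return investor_gain + net_new_capital, investor_gain


# Figures from the AT&T example, in billions of dollars.
value, rate, dividends, new_financing = 20.0, 0.10, 1.5, 4.0
delta_v, investor_gain = firm_value_change(value, rate, dividends, new_financing)

# Discount end-of-year value plus dividends less new financing back one period.
recovered = (value + delta_v + dividends - new_financing) / (1 + rate)
print(delta_v, investor_gain, recovered)
```

The change in firm value is $4.5 billion against an investor gain of $2 billion, and the discounted expression returns the $20 billion starting value.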

Having established the validity of the stream-of-dividends approach, what can be said about

the other three methods? Using REV(t+1) to denote the total revenues [or stream of cash receipts

during period (t to t+1)] and O(t+1) to denote total cash (operating) outflow, the basic cash flow

accounting identity can be written as

(VI.12)  REV(t+1) + m(t+1)S(t+1) = O(t+1) + D(t+1)

where the left-hand side is total cash inflow and the right-hand side is total cash outflow.

REV(t) and O(t) can be expressed in terms of their component parts as

\[ \mathrm{REV}(t) = \pi(t) + \tau(t) + \mathrm{CGS}(t) + \mathrm{DEP}(t) \]

and

\[ O(t) = I(t) + \tau(t) + \mathrm{CGS}(t) \]

where the terms are defined in the glossary.

Moreover, from the accounting identity (VI.12), we have that

(VI.13a)  \[ D(t+1) - m(t+1)S(t+1) = \mathrm{REV}(t+1) - O(t+1) \]

(VI.13b)  \[ D(t+1) - m(t+1)S(t+1) = \pi(t+1) + \mathrm{DEP}(t+1) - I(t+1) = X(t+1) - I(t+1) \]

(VI.13c)  \[ D(t+1) - m(t+1)S(t+1) = \pi(t+1) - i(t+1) \]

From (VI.11) and (VI.13a) - (VI.13c), we have that

(VI.14a)  \[ V(t) = \frac{1}{1 + r(t)}\,\{V(t+1) + \mathrm{REV}(t+1) - O(t+1)\} \]

(VI.14b)  \[ V(t) = \frac{1}{1 + r(t)}\,\{V(t+1) + X(t+1) - I(t+1)\} \]

(VI.14c)  \[ V(t) = \frac{1}{1 + r(t)}\,\{V(t+1) + \pi(t+1) - i(t+1)\}. \]

Let V(T) denote the value of the firm at time T in the future; then, by employing the same

backward technique used in deducing the value of a share of stock in (VI.7), we can work backward

to solve (VI.14a) - (VI.14c) which can be rewritten as

(VI.15a)  \[ V(0) = \sum_{t=1}^{T} \frac{\mathrm{REV}(t) - O(t)}{[1 + R(t)]^t} + \frac{V(T)}{[1 + R(T)]^T} \]

(VI.15b)  \[ V(0) = \sum_{t=1}^{T} \frac{X(t) - I(t)}{[1 + R(t)]^t} + \frac{V(T)}{[1 + R(T)]^T} \]

(VI.15c)  \[ V(0) = \sum_{t=1}^{T} \frac{\pi(t) - i(t)}{[1 + R(t)]^t} + \frac{V(T)}{[1 + R(T)]^T} \]

Provided that \( \lim_{T \to \infty} V(T)/[1 + R(T)]^T = 0 \), we can rewrite (VI.15a) - (VI.15c) for a firm that is

going to continue indefinitely as

(VI.15a')  \[ V(0) = \sum_{t=1}^{\infty} \frac{\mathrm{REV}(t) - O(t)}{[1 + R(t)]^t} \]

(VI.15b')  \[ V(0) = \sum_{t=1}^{\infty} \frac{X(t) - I(t)}{[1 + R(t)]^t} \]

(VI.15c')  \[ V(0) = \sum_{t=1}^{\infty} \frac{\pi(t) - i(t)}{[1 + R(t)]^t} \]

From (VI.15a'), we see that the value of the firm can be written as the present value of the cash flows

generated by the firm and hence, approach (II) is a valid description of value.

From (VI.15c'), it is not a valid claim that the current value of the firm can be written as the

present discounted value of future earnings (i.e., \( V(0) \neq \sum_{t=1}^{\infty} \pi(t)/[1 + R(t)]^t \)) because in general, to

generate a specific earnings flow, it is necessary to make capital expenditures in the future.

Working in net terms, if it is necessary to make (net new) investment expenditures i(t) in the tth

period, then the present value of this opportunity cost is \( i(t)/[1 + R(t)]^t \). Summing over all t, we get

the total additional cost required to generate the stream of earnings {π(t)} to be \( \sum_{t=1}^{\infty} i(t)/[1 + R(t)]^t \).

Subtracting these costs from the present value of the earnings will give the value of the firm which

is verified in (VI.15b') in gross terms or in (VI.15c') in net terms.

Note: \( \sum_{t=1}^{\infty} \pi(t)/[1 + R(t)]^t \) may either overstate or understate the correct market value because

i(t) can be negative or positive although i(t) ≥ – DEP(t).

In a contracting industry, one might expect to find that gross investment may not be as large

as required for replacement (i.e., i(t) < 0), and hence, capacity would decline over time. In a stable

or stagnant industry, gross investment might just match replacement requirements (i.e.,

i(t) ≈ 0), and capacity would remain about constant over time. In an expanding industry, gross

investment would probably exceed replacement requirements (i.e., i(t) > 0), and capacity would

increase over time. So, unless the economy as a whole is stagnant, \( \sum_{t=1}^{\infty} \pi(t)/[1 + R(t)]^t \) will be a biased

estimate for market value. Nonetheless, approach (III) is valid if interpreted in the sense of

(VI.15c').
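
The equivalence of the gross and net forms, and the bias from discounting earnings alone, can be seen in a small numerical sketch. Everything below is hypothetical: the five-period profit, depreciation, and net investment figures, the flat 10 percent rate, the assumption that no value remains after the last period, and the function name.

```python
def present_value(flows, rate):
    """Discount flows received at t = 1, 2, ... at a flat per-period rate."""
    return sum(f / (1 + rate) ** t for t, f in enumerate(flows, start=1))


# Hypothetical expanding firm over five periods; no value is assumed to remain afterwards.
profits = [100, 110, 120, 130, 140]   # pi(t), after-tax profits
depreciation = [20, 22, 24, 26, 28]   # DEP(t)
net_investment = [30, 30, 30, 30, 0]  # i(t), positive while the firm expands
rate = 0.10

gross_cash_flow = [p + d for p, d in zip(profits, depreciation)]          # X(t) = pi(t) + DEP(t)
gross_investment = [i + d for i, d in zip(net_investment, depreciation)]  # I(t) = i(t) + DEP(t)

value_gross = present_value([x - g for x, g in zip(gross_cash_flow, gross_investment)], rate)
value_net = present_value([p - i for p, i in zip(profits, net_investment)], rate)
earnings_only = present_value(profits, rate)
print(value_gross, value_net, earnings_only)
```

The gross and net calculations agree because depreciation cancels, while the present value of earnings alone exceeds both by the present value of the net investment required to generate those earnings.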

Although equivalent to the other three statements of value, the "current earnings and future

investment opportunities" approach (IV) is probably the most interesting. It is an especially useful

form for the investor planning to invest in the firm, and is the most natural approach for a (single)

owner planning to take over the firm.

To avoid notational complexities, let us assume that $r(t) = r$ and $R(t) = r$ for all $t$ (i.e., a

"flat" term structure).

To determine the value of the firm, the take-over investor considers three things:

(1) the competitive ("normal" or "alternative") rate he can make in the market which is r.

(2) the earnings potential of the existing assets of the firm.

(3) the opportunities (if any) for the firm to invest in real assets that will yield more than the

competitive rate of return. (Due to special advantages of the firm.)

Clearly, the take-over investor is not concerned with dividend patterns because he can choose any

pattern he wishes.


To evaluate the earnings potential of the existing physical (tangible) assets, one can use the

regular discounted cash flow formula. Using the annuity formula (Section II, formula (II.16)), we

can compute an equivalent perpetual constant flow. Call it $X_{ce}$, and the value of the firm's tangible assets is $X_{ce}/r$.

To evaluate the (intangible) assets associated with future investment opportunities, first, consider those projects beginning in period $t$ requiring investment in that period of $I(t)$. Second, using the discounted cash flow method, evaluate the projects as of date $t$. Third, using the annuity formula, convert this value into an equivalent perpetual annuity with a constant flow of $F(t)$ dollars at the end of each year. Define: $r^{*}(t) \equiv F(t)/I(t)$. Then $r^{*}(t)$ is the average rate of return per period on projects taken in period $t$ (and $F(t)$ is the (equivalent) dollar return on projects undertaken in period $t$). The present value at the beginning of period $t$ of these projects is

$$\frac{F(t)}{r} = \frac{r^{*}(t) I(t)}{r}.$$

Note: $r^{*}(t)$ is an average rate of return. Since one can always earn at least $r$ by buying market securities, one would not (voluntarily) take investments yielding less than $r$. Therefore $r^{*}(t) \geq r$.

The "goodwill" difference between worth and cost is

$$\frac{r^{*}(t) I(t)}{r} - I(t) = I(t) \left[ \frac{r^{*}(t) - r}{r} \right],$$

and the present value of this "goodwill" is $I(t) \left[ \frac{r^{*}(t) - r}{r} \right] (1+r)^{-t}$; and the present value of all such "goodwill" for all the future is

$$\sum_{t=1}^{\infty} \frac{1}{(1+r)^{t}} I(t) \left[ \frac{r^{*}(t) - r}{r} \right].$$

The current value of the firm will be the sum of the value of current assets plus the current value of

"goodwill." I.e.,


$$V(0) = \frac{X_{ce}}{r} + \sum_{t=1}^{\infty} \frac{1}{(1+r)^{t}} I(t) \left[ \frac{r^{*}(t) - r}{r} \right]. \tag{VI.16}$$

To show that (VI.16) is equivalent to the other formulations for value, note that by definition of $r^{*}(t)$ and $X_{ce}$,

$$
\begin{aligned}
X(1) &= X_{ce} \\
X(2) &= X_{ce} + r^{*}(1) I(1) \\
X(3) &= X_{ce} + r^{*}(1) I(1) + r^{*}(2) I(2) = X(2) + r^{*}(2) I(2) \\
X(t) &= X(t-1) + r^{*}(t-1) I(t-1) = X_{ce} + \sum_{s=1}^{t-1} r^{*}(s) I(s).
\end{aligned}
$$

From (VI.15b'), we have that

$$
\begin{aligned}
V(0) &= \sum_{t=1}^{\infty} \frac{X(t) - I(t)}{(1+r)^{t}} = \frac{X(1) - I(1)}{1+r} + \sum_{t=2}^{\infty} \frac{X(t) - I(t)}{(1+r)^{t}} \\
&= \frac{X_{ce} - I(1)}{1+r} + \sum_{t=2}^{\infty} \frac{1}{(1+r)^{t}} \left[ X_{ce} + \sum_{s=1}^{t-1} r^{*}(s) I(s) - I(t) \right],
\end{aligned}
$$

so that

$$V(0) = \frac{X_{ce}}{r} + \sum_{t=2}^{\infty} \sum_{s=1}^{t-1} \frac{r^{*}(s) I(s)}{(1+r)^{t}} - \sum_{t=1}^{\infty} \frac{I(t)}{(1+r)^{t}}.$$

Note:

$$\sum_{t=2}^{\infty} \sum_{s=1}^{t-1} \frac{r^{*}(s) I(s)}{(1+r)^{t}} = r^{*}(1) I(1) \sum_{k=2}^{\infty} \frac{1}{(1+r)^{k}} + r^{*}(2) I(2) \sum_{k=3}^{\infty} \frac{1}{(1+r)^{k}} + \cdots + r^{*}(s) I(s) \sum_{k=s+1}^{\infty} \frac{1}{(1+r)^{k}} + \cdots$$

and

$$\sum_{k=s+1}^{\infty} \frac{1}{(1+r)^{k}} = \frac{1}{(1+r)^{s}} \sum_{k=s+1}^{\infty} \frac{1}{(1+r)^{k-s}} = \frac{1}{(1+r)^{s}} \sum_{j=1}^{\infty} \frac{1}{(1+r)^{j}} = \frac{1}{r} \cdot \frac{1}{(1+r)^{s}}.$$

So,

$$\sum_{t=2}^{\infty} \sum_{s=1}^{t-1} \frac{r^{*}(s) I(s)}{(1+r)^{t}} = \sum_{s=1}^{\infty} \frac{r^{*}(s) I(s)}{r (1+r)^{s}}$$

or

$$V(0) = \underbrace{\frac{X_{ce}}{r}}_{\text{Value of tangible assets}} + \underbrace{\sum_{t=1}^{\infty} \frac{1}{(1+r)^{t}} I(t) \left[ \frac{r^{*}(t) - r}{r} \right]}_{\text{Value of future opportunities}}$$

which is (VI.16). //

Inspection of (VI.16) demonstrates two important points: (1) a firm can have positive value without

any physical assets; (2) the current value of the firm will only be affected by future investment

opportunities if those opportunities have rates of return on physical assets that exceed the market

rate.
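The equivalence of (VI.15b') and (VI.16) can also be checked numerically. The following is a minimal sketch, not part of the original text: the function names and the sample figures are hypothetical, and it assumes that investment stops after a finite number of periods so that earnings become a level perpetuity thereafter.

```python
import math

def value_from_cash_flows(x_ce, invest, r_star, r):
    """(VI.15b'): discount X(t) - I(t), with X(t+1) = X(t) + r*(t) I(t).

    Assumption: I(t) = 0 after the last listed period, so X stays constant.
    """
    n = len(invest)
    x = x_ce                      # X(1) = X_ce
    value = 0.0
    for t in range(1, n + 1):
        value += (x - invest[t - 1]) / (1 + r) ** t
        x += r_star[t - 1] * invest[t - 1]
    # Level perpetuity of X(n+1) starting in period n+1, discounted to date 0.
    value += (x / r) / (1 + r) ** n
    return value

def value_from_goodwill(x_ce, invest, r_star, r):
    """(VI.16): tangible assets X_ce / r plus the present value of goodwill."""
    goodwill = sum(i_t * (rs - r) / r / (1 + r) ** t
                   for t, (i_t, rs) in enumerate(zip(invest, r_star), start=1))
    return x_ce / r + goodwill

# Hypothetical inputs: X_ce = 100, market rate 10%, four years of projects.
invest = [50.0, 40.0, 0.0, 30.0]
r_star = [0.15, 0.12, 0.10, 0.20]
v_flows = value_from_cash_flows(100.0, invest, r_star, 0.10)
v_goodwill = value_from_goodwill(100.0, invest, r_star, 0.10)
print(v_flows, v_goodwill)
```

When every r*(t) equals r, the goodwill term vanishes and both functions return X_ce / r, which is point (2) above.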

To summarize, the four major approaches to valuation (appropriately interpreted) are equally

valid, and in fact, equivalent. Because each follows from the other using the basic accounting

identity, none is "more primal" than any other. Which one to use is more a matter of convenience.

The above analysis is precise and without controversy. If the world were certain, then there

would be nothing more to do in terms of valuation formulas. However, the future is uncertain, and

the impact of uncertainty on valuation is non-trivial. As will be shown in Section XIV, it is possible

to develop precise valuation formulas under uncertainty, but these (or the underlying assumptions)

are subject to controversy.

Before going into the use of these formulas in the firm's investment decisions, we conclude

this section with a brief discussion of "growth" and "glamour" stocks which gives a precise

definition for such stocks and may clear up some misconceptions about what growth stocks are.

Growth Stocks


A common rule of thumb used to value a firm is to compute the average or "normal" price-to-earnings ratio ("PE") for companies within the same industry (or risk class), and to estimate the value of the firm by assuming that it should have the same price-to-earnings ratio. I.e., if there are $n$ firms in the industry, and $\pi_i(0)$ and $V_i(0)$ denote current earnings and firm value for the $i$th firm, then $PE_i \equiv V_i(0)/\pi_i(0)$ and the industry average is $\overline{PE} \equiv \frac{1}{n} \sum_{i=1}^{n} PE_i$. To find the "fair value" for the firm, we set $V(0) = (\overline{PE}) \cdot \pi(0)$. A variant of this "quick-and-dirty" method is to compute the average earnings-to-price ratio, $\overline{EP} \equiv \frac{1}{n} \sum_{i=1}^{n} EP_i = \frac{1}{n} \sum_{i=1}^{n} \frac{\pi_i(0)}{V_i(0)}$, and set $V(0) = \pi(0)/\overline{EP}$. The crude justification for this method goes as follows: Earnings are what is available to shareholders; if the current earnings of the firm are reasonable estimates of future earnings and if the firm is expected to remain in business indefinitely, then the shareholders can reasonably expect to receive in payment (or in equivalent increase in share value) $\pi(0)$ each period indefinitely. Viewed in this light, ownership of shares is essentially the same as owning a perpetual annuity or consol bond (page II-17, formula (II.18)). From page II-17, if $A_{\infty}$ is the value of such an annuity; $y$ is the payment per period; $k$ = required rate of return (or discount rate) on the stream, then $A_{\infty} = y/k$.

Applying the analog to the $i$th firm, we have that $V_i(0) = \pi_i(0)/k_i$ and $EP_i \equiv \pi_i(0)/V_i(0) = k_i = 1/PE_i$. If it is assumed that, on average, the required return on all firms in the same industry (or risk class) is the same, then $k \equiv \overline{EP}$ is the required return for the stream of the firm, and therefore, $V(0)$ should equal $\pi(0)/k = \pi(0)/\overline{EP}$.
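As an illustration of the two variants of the rule of thumb, here is a small sketch. The peer data and function names are hypothetical and serve only to show the arithmetic; earnings are assumed positive so that every ratio is defined.

```python
def fair_value_by_pe(earnings, peers):
    """V(0) = (average PE) * pi(0); peers is a list of (V_i(0), pi_i(0)) pairs."""
    avg_pe = sum(v / pi for v, pi in peers) / len(peers)
    return avg_pe * earnings

def fair_value_by_ep(earnings, peers):
    """V(0) = pi(0) / (average EP), the earnings-to-price variant."""
    avg_ep = sum(pi / v for v, pi in peers) / len(peers)
    return earnings / avg_ep

# Hypothetical industry of three firms: (market value, current earnings).
peers = [(100.0, 10.0), (240.0, 20.0), (90.0, 10.0)]
value_pe = fair_value_by_pe(5.0, peers)
value_ep = fair_value_by_ep(5.0, peers)
print(value_pe, value_ep)
```

The two variants generally give different numbers: averaging PE ratios is an arithmetic mean, while inverting the average EP is a harmonic mean of the same ratios, and the arithmetic mean is never the smaller of the two.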

In the light of our previous analysis, is there a rationale for this approach, and if so, under

what conditions is it valid?

From the analysis of the stream-of-earnings approach to valuation, we found ((VI.15c')) that the firm's value could be written as

$$V(0) = \sum_{t=1}^{\infty} \frac{\pi(t) - i(t)}{[1+R(t)]^{t}}.$$

Under what conditions will $V(0) = \pi(0)/k$ where $k$ is the return per period on the firm? If current earnings are an estimate of future earnings ($\pi(t) \equiv \pi(0)$ for all $t$) and the firm requires no net new investment to generate these earnings (i.e., $i(t) \equiv 0$ for all $t$) and if the term structure is essentially "flat" (at least, in "real" terms), then $V(0) = \pi(0)/r$ where $r$ is the per period rate of return. In simple terms, if the firm's investment

(gross) is essentially matched by depreciation (i.e., maintenance); if current earnings are

representative of future earnings; if the reinvestment rate is reasonably constant, then the rule of

thumb method is reasonable.

Further, if one interprets the earnings used in computing the PE or EP as a kind of "long-term average earnings," then even for cyclical-type companies, the rule of thumb has some validity. Thus, by using the discounted cash flow method to convert the flows from current tangible assets to an equivalent equal annual flow $\{\pi(0)\}$, then provided the other conditions hold, $V(0) = \pi(0)/r$.

In early empirical studies, when earnings were "smoothed" in this fashion to eliminate transient

earnings, the PE rule of thumb was at times a reasonably good forecaster of value except for an

important subset of stocks.

These stocks had unusually high price-to-earnings ratios, and such stocks have traditionally

been identified as "glamour" or "growth" stocks.

Note: The stocks in this category have a consistently higher-than-average PE ratio. I.e., many stocks may have current PE ratios, $V(0)/\pi(0)$, which are very high (or even undefined, if $\pi(0) \leq 0$), but have long-run $V(0)/\pi(0)$ which are not "out-of-line." Such stocks are not what is meant by growth or glamour stocks. Rather, growth stocks are stocks with $V(0)/\pi(0)$ that are unusually high persistently.

In earlier times, such stocks were thought to be outside the traditional mode of analysis and

were simply excluded from such discussions. The problem was that 1/PE = EP was treated as an estimate for the required rate of return k, and often, when "normal" companies would have PE's of 10 or 12 implying k = 8.5% - 10%, growth stocks would have PE's of 25-50 implying k = 2% - 4%, which would often be below the riskless rate r. Did this mean that investors in such

stocks were "fools," and that investment in such stocks could only be justified on the basis that

somehow investors would be willing to pay even more for them later, independent of the "rational,"

implicitly-low return?

An alternative, rational explanation can be found in the previous analysis of growth opportunities. From (VI.16), we can rewrite the expression for $PE = V(0)/\pi(0)$ (noting that $X_{ce} = \pi(0)$) as

$$PE = \frac{V(0)}{\pi(0)} = \frac{1}{r} + \frac{1}{\pi(0)} \sum_{t=1}^{\infty} \frac{1}{(1+r)^{t}} \left[ \frac{I(t)\,(r^{*}(t) - r)}{r} \right] > \frac{1}{r} \quad \text{if } r^{*}(t) > r. \tag{VI.17}$$

Thus, if a company has some special advantages (e.g., patents, superior distribution capabilities,

monopoly, etc.) so that it can reasonably be expected to find investment opportunities in the future

which yield (non-competitive) rates of return $r^{*}(t)$ which exceed the market required return r,

then from (VI.17), it is quite rational to bid the price of the firm beyond the normal PE associated

with the profits generated by the assets currently in place. Moreover, from the derivation of

(VI.16), the investor will earn a rate of return r on such investments. The magnitude of the

difference between the PE ratio and 1/r will depend on the size and number of investment

opportunities that have returns that exceed r (i.e., I(t)) and the size of the spread between $r^{*}(t)$

(the average return on projects) and r (the required rate of return by investors).


Examples of companies that have at times been termed "growth" stocks are IBM, Coca Cola,

Polaroid, Xerox, and several drug companies.

Identification of "Growth" Stocks

Questions for thought

(Q.1) Is a company whose total asset size and earnings are growing over time (in a steady trend) a

growth stock?

(Q.2) Is a company whose earnings per share are growing over time a growth stock?

(Q.3) Is a company whose earnings per share are growing over time at a rate less than the required

rate of return r not a growth stock?

(Q.4) Is a company whose earnings per share are growing over time at a rate greater than the

required rate of return r a growth stock?

(Q.5) As an investor, what are the main questions you would want answered in deciding whether a

firm was a growth company or not?

Example: The Constant-Growth Case

Although not exactly empirically relevant, the constant-growth case displays some of the

qualitative characteristics of growth stocks. Assume that the firm does have investments such that $r^{*}(t) > r$ and further that the average return on investments per period, $r^{*}(t)$, is the same in every period. I.e., $r^{*}(t) \equiv r^{*} > r$ for all $t$.

From the previous analysis, we have that $X(t) = X(t-1) + r^{*} I(t-1)$; $X(1) = \pi(0)$. Assume that the firm's investment policy leads to a total investment each period which is a constant fraction of that period's gross earnings, i.e., $I(t) = \delta X(t)$ where $0 \leq \delta \leq 1$. Then,

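$$X(t) = X(t-1) + \delta r^{*} X(t-1) = X(t-1)[1 + \delta r^{*}] = \pi(0)[1 + \delta r^{*}]^{t-1}$$

and

$$\frac{X(t) - X(t-1)}{X(t-1)} = \delta r^{*} = \text{rate of growth of earnings}.$$

Substituting for $I(t) = \delta X(t) = \delta \pi(0)[1 + \delta r^{*}]^{t-1}$ into (VI.16), we have that

$$
\begin{aligned}
V(0) &= \frac{\pi(0)}{r} + \sum_{t=1}^{\infty} \left( \frac{r^{*} - r}{r} \right) \frac{\delta \pi(0) [1 + \delta r^{*}]^{t-1}}{[1+r]^{t}} \\
&= \frac{\pi(0)}{r} \left[ 1 + \frac{\delta (r^{*} - r)}{(1+r)} \sum_{t=1}^{\infty} \left[ \frac{1 + \delta r^{*}}{1 + r} \right]^{t-1} \right].
\end{aligned}
$$

From p. II-15, with $y \equiv \frac{1 + \delta r^{*}}{1 + r}$,

$$\sum_{t=1}^{\infty} y^{t-1} = \sum_{j=0}^{\infty} y^{j} = \frac{-1}{y - 1} = \frac{1}{1 - y} = \frac{1 + r}{r - \delta r^{*}}, \quad \text{provided that } r > \delta r^{*}.$$

So,

$$V(0) = \frac{\pi(0)}{r} \left[ 1 + \frac{\delta (r^{*} - r)}{(1+r)} \cdot \frac{1 + r}{r - \delta r^{*}} \right]$$

or

$$V(0) = \frac{(1 - \delta)\, \pi(0)}{r - \delta r^{*}} = \frac{(1 - \delta)\, \pi(0)}{r - g}, \quad \text{where } g \equiv \delta r^{*} = \text{rate of growth of earnings}. \tag{VI.18}$$

Note that $D(0) = (1 - \delta)\pi(0)$, and therefore, $V(0) = \frac{D(0)}{r - g}$, a version of the "dividend-discount" model.

Question: What is the interpretation of the $g \geq r$ case?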


What is the model telling you?
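A numerical sketch of the constant-growth case may help. It evaluates the closed form (VI.18) and, as a cross-check, a truncated version of the sum in (VI.16). The function names, the truncation at 5,000 periods, and the sample parameters are assumptions made for illustration.

```python
def value_constant_growth(pi0, delta, r_star, r):
    """(VI.18): V(0) = (1 - delta) * pi(0) / (r - g), with g = delta * r_star."""
    g = delta * r_star
    if g >= r:
        # The geometric sum behind (VI.18) requires r > delta * r_star.
        raise ValueError("closed form requires g < r")
    return (1 - delta) * pi0 / (r - g)

def value_by_summation(pi0, delta, r_star, r, periods=5000):
    """Truncated (VI.16) with I(t) = delta * pi(0) * (1 + g)**(t - 1)."""
    g = delta * r_star
    goodwill = sum(
        delta * pi0 * (1 + g) ** (t - 1) * (r_star - r) / r / (1 + r) ** t
        for t in range(1, periods + 1)
    )
    return pi0 / r + goodwill

# Hypothetical firm: pi(0) = 10, half of earnings reinvested at 15%, r = 10%.
v_closed = value_constant_growth(10.0, 0.5, 0.15, 0.10)
v_summed = value_by_summation(10.0, 0.5, 0.15, 0.10)
print(v_closed, v_summed, v_closed / 10.0)   # last figure is the PE ratio
```

With these inputs g = 7.5% and the PE ratio is 20, twice the 1/r = 10 that the same earnings would command with no above-market investment opportunities.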


VII. THE FIRM'S INVESTMENT DECISION UNDER CERTAINTY: CAPITAL BUDGETING AND RANKING OF NEW INVESTMENT PROJECTS

The most important decisions for a firm's management are its investment decisions.

While it is surely possible to get the firm into "trouble" through poor financing decisions or

improper management of working capital, the value of the firm is principally determined by the

prospects for its investments. Investments by the firm take two forms: (i) internally-generated

projects which, if undertaken, create new assets; and (ii) the acquisition of external already-

existing assets from other firms by either direct purchase of the assets or the acquisition of the

whole firm by merger, consolidation, or takeover.

Mergers and acquisitions are important topics for financial management and will be

discussed in Section XV. However, with the exception of a few specialized firms, the primary

function of the business firm is to find and undertake profitable new projects, and it is this form

of investment which is the topic of this section. The capital budgeting problem is how to select

those physical investments or projects so as to maximize the value of the firm. Much of the

formal apparatus has already been developed in Sections II, VI, and to some extent in Section V.

However, to put these tools in a more specific framework, we examine the various traditional

capital budgeting methods used to evaluate projects.

Before proceeding, we begin with some definitions:

A project is defined by the series of net cash flows it generates at the end of each period,

{X(1),X(2),...,X(N)}. These flows {X(t)} can be either positive or negative. If X(t) is positive,

then the project provides a net flow of cash into the firm at the end of period t, and if X(t) is

negative, then it causes a net flow of cash out of the firm. Since most projects require an initial

outflow, it is a common convention to denote this flow by "–I0" where I0 is the (positive)

outflow or initial investment in the project. For symmetry, we will also denote –I0 by "X(0)",

the net cash flow at the end of the "zeroth" period (or the beginning of the first period).

X(t) = [Revenues – Costs – Depreciation] × (1 – tax rate) + Depreciation – Investment (in the project)

= After-tax Operating Profits – net new investment (in the project)

Let k denote the cost of capital to the firm (measured in percent per period) where the

cost of capital is the (external) rate of return required by investors for providing funds to the firm

and it reflects all the market opportunities available to investors. In a world of certainty (which is

the formal setting for this section), the cost of capital is simply the market rate of interest, r.

However, we follow the tradition of using "k" rather than "r" to include the possibility in a quasi-uncertainty sense (made rigorous in Section XIV) that different risk projects will have different

required returns (and in particular, required rates different from the riskless interest rate).

Following the practice of Section II, to simplify the analysis, it is assumed that the

explicit opportunity cost to investors for investing in the firm, k, is constant over time. If k

were changing over time, then in an analogous fashion to R(t) in Sections II and V, we could define K(t) by $[1 + K(t)]^{t} \equiv \prod_{j=1}^{t} [1 + k(j)]$, and use "$[1 + K(t)]^{t}$" everywhere in the formulas where "$[1 + k]^{t}$" appears.

Independent Projects are projects such that the firm can decide to do both or either one or neither.

(Note: this definition has no implications of statistical independence among projects.)

Mutually Exclusive Projects are projects such that the firm can only do one or the other, but not

both.

Traditional Methods of Project Selection

I. Pay-Back Method


If $I_0$ is the initial investment, then the payback period is that value of $T$ such that

$$I_0 = \sum_{t=1}^{T} X(t).$$

I.e., it is the minimum length of time until the net cash flows sum to the value of the initial

investment. The payback method says rank all (independent) projects from the shortest to the

longest and then take (invest in) all projects with a payback period less than or equal to some

given time, T*. When choosing among mutually exclusive projects, select the one with the

smaller payback period.
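The payback rule can be sketched in a few lines. This is an illustrative helper, not part of the original text; it assumes flows arrive only at period ends, so the payback period is taken as the first whole period in which cumulative flows reach $I_0$.

```python
def payback_period(initial_investment, flows):
    """Smallest T with X(1) + ... + X(T) >= I_0, or None if I_0 is never recovered.

    flows[0] is X(1), flows[1] is X(2), and so on (undiscounted).
    """
    total = 0.0
    for t, x in enumerate(flows, start=1):
        total += x
        if total >= initial_investment:
            return t
    return None

# Two hypothetical projects costing 1000 each.
print(payback_period(1000, [214] * 15))          # level annual receipts
print(payback_period(1000, [0] * 14 + [4177]))   # one receipt in year 15
```

A firm using a cut-off T* would accept any project whose returned value is not None and is at most T*.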

II. Present Value Method (Review Section II)

The (net) present value of a project is

$$PV = -I_0 + \sum_{t=1}^{N} \frac{X(t)}{(1+k)^{t}} = \sum_{t=0}^{N} \frac{X(t)}{(1+k)^{t}}.$$

As described in Section II, the present value rule says rank all (independent) projects from the

highest to the lowest, and then take all investments with positive (or as a matter of indifference,

zero) present value. When choosing among mutually exclusive projects, select the one with the

largest present value.

Note: If the cost of capital were changing over time, then the present value of the project would be

$$PV = -I_0 + \sum_{t=1}^{N} \frac{X(t)}{[1+K(t)]^{t}},$$

and the method is still applicable.
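Both versions of the present value formula can be written as short functions. This is a sketch with hypothetical names; `one_period_rates` holds the per-period rates k(1), k(2), ..., whose running product gives $[1+K(t)]^{t}$.

```python
def present_value(flows, k):
    """PV with constant cost of capital k; flows[0] = X(0) = -I_0, flows[t] = X(t)."""
    return sum(x / (1 + k) ** t for t, x in enumerate(flows))

def present_value_varying(flows, one_period_rates):
    """PV when the cost of capital changes; one_period_rates[j-1] = k(j)."""
    pv = flows[0]
    growth = 1.0
    for x, k_j in zip(flows[1:], one_period_rates):
        growth *= 1 + k_j          # running product equals [1 + K(t)]**t
        pv += x / growth
    return pv

flows = [-100.0, 60.0, 60.0]       # hypothetical project
print(present_value(flows, 0.05))
print(present_value_varying(flows, [0.05, 0.08]))
```

Under the present value rule, an independent project is taken when the returned value is positive (or zero, as a matter of indifference).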

III. Internal Rate of Return Method (Review Section V on Yield-to-Maturity)

The internal rate of return for a project, i, is that discount rate such that the present

value of the project (computed at that rate) is zero. I.e., $i$ is the solution to

$$0 = -I_0 + \sum_{t=1}^{N} \frac{X(t)}{[1+i]^{t}}.$$

It is called an internal rate because, unlike k (the cost of capital), which is an (external) market

(opportunity cost) rate, i depends only on the nature of the time-flow patterns of the project and

is completely unrelated to any market rate.

The internal rate of return rule says rank all (independent) projects from the highest to

the lowest, and then take all investments whose internal rate of return is greater than some

specified rate i* (usually taken to be the cost of capital, i.e., i* = k). When choosing among

mutually-exclusive projects, select the one with the largest internal rate of return.
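For a project with a single change of sign in its flows, the internal rate of return can be located by bisection. The sketch below is one simple way to do this, not a method taken from the text; the bracket [0, 1] and the tolerance are assumptions, and the routine refuses to run if present value does not change sign over the bracket.

```python
def present_value(flows, rate):
    """flows[0] = X(0) = -I_0, flows[t] = X(t)."""
    return sum(x / (1 + rate) ** t for t, x in enumerate(flows))

def internal_rate_of_return(flows, low=0.0, high=1.0, tol=1e-10):
    """Bisection on PV(i) = 0; assumes PV changes sign exactly once on [low, high]."""
    pv_low = present_value(flows, low)
    if pv_low * present_value(flows, high) > 0:
        raise ValueError("no sign change of PV on the bracket")
    while high - low > tol:
        mid = (low + high) / 2
        if present_value(flows, mid) * pv_low > 0:
            low = mid              # root lies to the right of mid
        else:
            high = mid             # root lies at or to the left of mid
    return (low + high) / 2

irr = internal_rate_of_return([-100.0, 110.0])
print(irr)
```

The rule then compares the result with the cut-off i* (usually k). When the flows change sign more than once, a bracket may contain several roots and bisection will find only one of them.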

IV. Profitability Index Method

$$\text{Profitability Index} = PI = \left[ \sum_{t=1}^{T} \frac{X(t)}{(1+k)^{t}} \right] \Big/ I_0$$

Method: Rank all (independent) projects from the highest to the lowest and take all investments

with profitability index greater than one. When choosing among mutually exclusive projects,

select the one with the largest profitability index.
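A short sketch of the profitability index follows; the function name and the sample project are hypothetical.

```python
def profitability_index(initial_investment, flows, k):
    """PI = [sum of X(t) / (1 + k)**t for t >= 1] / I_0; flows[0] is X(1)."""
    pv_inflows = sum(x / (1 + k) ** t for t, x in enumerate(flows, start=1))
    return pv_inflows / initial_investment

pi_index = profitability_index(100.0, [60.0, 60.0], 0.05)
print(pi_index)
```

Since PV = I_0 × (PI − 1), an index above one is the same condition as a positive present value when I_0 > 0; the two rules can differ only in how they rank projects of different scale.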

Evaluation of these Methods:

Problems with Payback

1. Neglects the time value of money (no discounting)

2. Neglects all flows beyond the payback period (implicit "infinite" discounting). Therefore, misses future negative or positive flows.

A related method sometimes used is the "Modified" Payback Method. The modified payback period is defined as the minimum T such that

$$I_0 = \sum_{t=1}^{T} \frac{X(t)}{(1+k)^{t}}.$$
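The modified payback period differs from simple payback only in that each flow is discounted before it is accumulated. A minimal sketch, with a hypothetical function name and the same whole-period convention (first period in which the discounted sum reaches $I_0$):

```python
def modified_payback_period(initial_investment, flows, k):
    """Smallest T with sum of X(t) / (1 + k)**t over t <= T reaching I_0, else None."""
    total = 0.0
    for t, x in enumerate(flows, start=1):
        total += x / (1 + k) ** t
        if total >= initial_investment:
            return t
    return None

print(modified_payback_period(1000, [214] * 15, 0.0))
print(modified_payback_period(1000, [214] * 15, 0.10))
```

Discounting repairs the first problem listed above, but flows after the cut-off are still ignored.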


Present Value

In perfect capital markets and certainty, the value of the firm is equal to the present value of all

its future flows discounted at the (market-determined) cost of capital. Hence, the present value

rule maximizes the value of the firm. It is sometimes called a conservative rule because the firm

always has available investments which will earn k: namely, it can buy its own stock. Thus, the

firm should never take negative PV projects. In uncertainty, the rule can be modified according

to the "risk-adjusted" method to be discussed later: Namely,

$$PV = -I_0 + \sum_{t=1}^{N} \frac{\alpha_t \bar{X}(t)}{(1+r)^{t}}$$

where $\alpha_t$ = a certainty equivalent and $\bar{X}(t)$ is the expected cash flow. In general, present value is

the most appropriate of these four traditional techniques.

Internal Rate of Return

While the present value method assumes that the flows can be reinvested at the cost of capital

(which is always possible), the internal rate of return assumes that the flows can be reinvested at

the internal rate of return i.

Technical Problems that Can Arise with Internal Rate of Return

1. There may be either more than one value of i or no value of i which makes the

present value of the project zero.

2. If the cost of capital is varying over time, the "cut-off" rule of taking only projects

with i > k is not well-defined.

Example: Present Value vs. Internal Rate of Return

Assume k = .05


Present Value

Project A (Mutually Exclusive of B)

Year End    Net Cash Flow    Discount 1/(1+k)^t    Present Value
0             -1,000,000     1                       -1,000,000
1              3,150,000     .952                     2,998,800
2             -3,307,500     .907                    -2,999,992
3              1,157,630     .864                     1,001,192

Present Value of A = Sum of PV = 0

Project B (Mutually Exclusive of A)

Year End    Net Cash Flow    Discount 1/(1+k)^t    Present Value
0             -1,000,000     1                       -1,000,000
1              3,210,000     .952                     3,055,920
2             -3,433,800     .907                    -3,114,457
3              1,224,080     .864                     1,057,605

Present Value of B = Sum of PV = -4,548

So, by the Present Value Method, A is preferred to B.

Example (continued)

Internal Rate of Return

Project A: Let x = 1 + i


$$PV = -1{,}000{,}000 + \frac{3{,}150{,}000}{x} - \frac{3{,}307{,}500}{x^{2}} + \frac{1{,}157{,}630}{x^{3}} = 0$$

or find the roots of the cubic equation:

$$-x^{3} + 3.15x^{2} - 3.3075x + 1.15763 = 0$$

3 roots $(x_1, x_2, x_3)$ are

$$x_1 = 1 + i_1 = 1.05, \quad x_2 = 1 + i_2 = 1.05, \quad x_3 = 1 + i_3 = 1.05,$$

so $i^{A} = .05$.

Project B: Let x = 1 + i

$$PV = -1{,}000{,}000 + \frac{3{,}210{,}000}{x} - \frac{3{,}433{,}800}{x^{2}} + \frac{1{,}224{,}080}{x^{3}} = 0$$

or find the roots of the cubic equation:

$$-x^{3} + 3.21x^{2} - 3.4338x + 1.22408 = 0$$

3 roots $(x_1, x_2, x_3)$ are

$$x_1 = 1 + i_1 = 1.04, \quad x_2 = 1 + i_2 = 1.07, \quad x_3 = 1 + i_3 = 1.10.$$

So there are three internal rates of return: $i_1^{B} = .04$; $i_2^{B} = .07$; $i_3^{B} = .10$.


Example (continued)

"Switching Points": at k = .05, take A over B; at k = .08, take B over A.
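The switching behavior can be reproduced by discounting the two flow patterns without rounding the discount factors. The sketch below uses the cash flows from the tables in this example; the helper names and the set of rates printed are illustrative choices. Because the tables round the discount factors to three digits, their sums differ from the unrounded present values, which are only a few dollars in magnitude near these rates.

```python
FLOWS_A = [-1_000_000, 3_150_000, -3_307_500, 1_157_630]
FLOWS_B = [-1_000_000, 3_210_000, -3_433_800, 1_224_080]

def present_value(flows, k):
    """flows[t] is the net cash flow at the end of year t."""
    return sum(x / (1 + k) ** t for t, x in enumerate(flows))

def preferred(k):
    """Return the project with the larger present value at cost of capital k."""
    return "A" if present_value(FLOWS_A, k) > present_value(FLOWS_B, k) else "B"

for k in (0.04, 0.05, 0.07, 0.08, 0.10):
    print(f"k={k:.2f}  PV_A={present_value(FLOWS_A, k):9.2f}  "
          f"PV_B={present_value(FLOWS_B, k):9.2f}  larger PV: {preferred(k)}")
```

The present value of B is essentially zero at each of its three internal rates (4%, 7%, 10%), which is why a single "internal rate of return" gives no usable ranking here.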


As noted in Section VII, this is not a paradox because different interest rates imply different

"worlds" with different alternatives. Thus, which technology to use (e.g., wood bridge versus a

steel bridge) rarely can be answered with knowledge of the technology only. The switching

problem (or multiple-roots problem) occurs when there is more than one positive root which

makes the present value equal to zero. One can use the following rule to check to see whether

more than one such root can occur: if $x^{n} + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n = 0$, then (Descartes' rule of signs) the number of positive roots either is equal to the number of variations of signs of the $a_i$'s or is less than this number of variations by an even integer.

In the example, both projects had three sign changes, and hence, either three or one

positive roots.
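Counting sign variations is easy to mechanize. The helper below is a small illustrative sketch (the function name is hypothetical); zero coefficients are skipped, as the rule requires.

```python
def sign_changes(coefficients):
    """Number of sign variations in a coefficient sequence, ignoring zeros."""
    signs = [c > 0 for c in coefficients if c != 0]
    return sum(1 for a, b in zip(signs, signs[1:]) if a != b)

# Cubic for Project A in the example, written in powers of x = 1 + i.
print(sign_changes([-1, 3.15, -3.3075, 1.15763]))
# A conventional project: one outlay followed by fifteen equal receipts.
print(sign_changes([-1000] + [214] * 15))
```

A count of one guarantees exactly one positive root in x; a count of three allows three or one.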

It should be noted that from the tables in this example, both the payback and the modified

payback methods would have picked Project B over Project A.

More on Present Value versus Internal Rate of Return

If X(0) = – I0 < 0 and all X(t) ≥ 0, for t = 1,2,... for all the projects being considered

and if the projects are independent, then the Present Value Rule and Internal Rate of Return Rule

will lead to the same answer with respect to which projects will be taken. To see this, note that a

plot of present value versus cost of capital will look like:


Hence, if i > k, then the present value will be positive. However, even in the case of a single

positive root, the rankings of projects by the two methods can be different. Hence, danger lurks

for evaluating mutually exclusive projects using i or in using it in the case of capital rationing as

the following example illustrates.

Example: Suppose that you have $1000 and you can purchase either Project A or Project B.

Given that the only investment alternative available in future years for any money received will

be to stuff it in a mattress or bury it in a coffee can (i.e., k = 0), which should you take?

Project A: Pay $1000 today (i.e., I0 = 1000) and you receive no payments until the end of

fifteen years when you will receive $4,177 (i.e., x(1) = x(2) = x(3) = ... = x(14) = 0 and x(15) =

4177).

Project B: Pay $1000 today (i.e., I0 = 1000) and you receive $214 at the end of each year for

fifteen years (i.e., x(1) = x(2) = x(3) = ... = x(14) = x(15) = 214).

We know by Descartes' rule of signs that both projects have only one positive root. Hence, the internal rate of return for both is unique. Using the present value tables and the formula for an annuity, the internal rate of return on A is $i_A$ = .10 and on B is $i_B$ = .20. Clearly, on an IRR basis, B is preferred to A. What about present value? At k = 0,

$$PV_A = -1000 + \frac{4177}{(1+0)^{15}} = 3{,}177$$

and

$$PV_B = -1000 + \sum_{t=1}^{15} \frac{214}{(1+0)^{t}} = -1000 + (15 \times 214) = 2{,}210.$$

Clearly, by the Present Value Rule, A is preferred to B. Which is "more" correct? First, note that since no interim payments can be invested to earn a positive return, it is easy to compute how much money we will have at the end of fifteen years from each project: for Project A, we will have $4177 and for Project B, we will have only $3210. Since they both cost the same, which do you prefer? Further, since we know the final amounts, we can compute an actual average compound return per year for both. I.e., $(1+R_A)^{15} = \frac{4177}{1000}$ has the solution $R_A$ = .10. So, the true return per year from A is 10%, and $(1+R_B)^{15} = \frac{3210}{1000}$ has the solution $R_B$ ≈ 8.1%. So, the true return per year from B is about 8.1%, NOT 20%. Hence, Present Value is a better ranker. Note: the internal return, $i_B$, is a number and need not bear a close relationship to the actual returns earned. E.g., 20% versus 8.1%.
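The terminal-wealth calculation in this example can be sketched as follows. The function names are hypothetical, and the `reinvest_rate` parameter is an added generalization: the example itself uses a reinvestment rate of zero (the mattress), while setting it to the internal rate shows the reinvestment assumption implicit in the internal rate of return.

```python
I0 = 1000
FLOWS_A = [0] * 14 + [4177]     # Project A: single receipt at the end of year 15
FLOWS_B = [214] * 15            # Project B: 214 at the end of each year

def terminal_wealth(flows, reinvest_rate=0.0):
    """Value at the end of the last year when each receipt earns reinvest_rate."""
    n = len(flows)
    return sum(x * (1 + reinvest_rate) ** (n - t)
               for t, x in enumerate(flows, start=1))

def realized_return(flows, initial_investment, reinvest_rate=0.0):
    """Average compound return R solving (1 + R)**n = terminal wealth / I_0."""
    n = len(flows)
    return (terminal_wealth(flows, reinvest_rate) / initial_investment) ** (1 / n) - 1

print(terminal_wealth(FLOWS_A), realized_return(FLOWS_A, I0))
print(terminal_wealth(FLOWS_B), realized_return(FLOWS_B, I0))
print(realized_return(FLOWS_B, I0, reinvest_rate=0.20))
```

Only when B's receipts can actually be reinvested at about 20% does its realized return come out near its internal rate; with k = 0 it falls well short.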

In bond evaluation, yield-to-maturity is just an internal rate of return calculation, and

therefore, as noted in Section V, the same warnings apply to comparing alternative bond

investments by yield-to-maturity even when the bonds have the same maturity date.

Imperfections and Capital Budgeting

If the firm is a "perfect competitor" for capital (i.e., the firm's cost of capital is unaffected by the

scale of its investments) and capital markets are "reasonably" perfect, then the correct capital

budgeting decision rule is present value. However, in the face of certain imperfections, this

decision rule may require modification.

Capital Rationing: an examination of all the decision rules given shows that each assumes that

there is no budget constraint for profitable investments. I.e., each period, the firm looks over all

available project proposals and selects all projects with positive present value. This done, then a

budget is established to determine how much capital is needed (and from which sources it will be

raised) to carry out the program. If the estimates of the cost of capital and the cash flows are

accurate, then there should be little problem in raising the necessary (additional) funds in the

capital market. Further, this procedure is optimal relative to the (efficiency) criterion of

maximizing market value. Note: the procedure just described is contrary to the one an

individual consumer would follow in allocating his income (and wealth) over various

consumption goods at different points in time.


However, in certain situations, there may be a (predetermined) absolute limit to the

amount that can be invested by the firm in any one period. This situation is called capital

rationing. It may occur for the firm in countries where there are no (or poorly-organized) capital

markets; or for divisions of firms where (incorrectly determined) decentralization rules dictate a

fixed budget for each division prior to the examination of the projects available; it is not an

infrequent case in the public sector where resources are at times allocated (prior to specific

knowledge of projects) on the basis of "last year's" allocation (of I0).

Under capital rationing, it is sometimes suggested that the Profitability Index (or

"Benefit/Cost" ratio) is a better rule than present value. While it is true that the profitability

index gives the most present value per dollar of initial investment which is highly suggestive of

what one should do in a constrained situation, it does not reflect future budgetary constraints.

Thus, a plan may satisfy the current budget constraint, but violate all future constraints.

The best technique in this situation is to maximize present value subject to the budget

constraint in each year using mathematical programming techniques. While such a procedure is

not optimal relative to (unconstrained) maximizing of market value, it does produce a feasible

program. Moreover, the "shadow prices" or dual variables will give an explicit estimate of the

marginal costs of the rationing. These values can often be used to argue for the elimination of

the constraints, particularly if the costs are high. Always, ask yourself: why the constraint? How

much is it costing? Is it rational?
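For a handful of projects, the constrained problem can be illustrated by brute-force enumeration, which stands in here for the mathematical programming techniques mentioned above (it does not produce shadow prices). The project data, names, and budgets below are hypothetical.

```python
from itertools import combinations

def best_program(projects, budgets):
    """Pick the subset with the largest total PV that respects every period's budget.

    projects: name -> (present_value, [outlay in period 0, outlay in period 1, ...])
    budgets:  [limit in period 0, limit in period 1, ...]
    """
    best_names, best_pv = (), 0.0
    names = list(projects)
    for size in range(1, len(names) + 1):
        for subset in combinations(names, size):
            outlays = [sum(projects[n][1][p] for n in subset)
                       for p in range(len(budgets))]
            if all(o <= b for o, b in zip(outlays, budgets)):
                pv = sum(projects[n][0] for n in subset)
                if pv > best_pv:
                    best_names, best_pv = subset, pv
    return best_names, best_pv

projects = {
    "P1": (50, [100, 0]),
    "P2": (40, [60, 30]),
    "P3": (35, [40, 50]),
    "P4": (20, [50, 10]),
}
budgets = [150, 60]
chosen, total_pv = best_program(projects, budgets)
print(chosen, total_pv)
```

In this made-up case, ranking by present value per dollar of period-0 outlay would select P3, P2 and P4, which exactly uses the period-0 budget but requires 90 in period 1 against a limit of 60. Enumeration over both constraints selects P1 and P3 instead. Enumeration grows as 2^n in the number of projects, so it is only practical for small examples.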

Rising Cost of Capital. It is typically assumed that the cost of capital is a constant function of the

amount of investment, in each period. However, if k depends on the scale, then programming

techniques must be employed.


Application of Present Value: The Replacement Problem

The product decisions are already made and the decision is to choose between two

alternative machines to produce the product. Technical change is neglected and the optimal

horizon for the product run is given.

I. Replacement time for each machine is known.

Same product for T years: which machine? Life of machine A is T1 years. Life of machine B

is T2 years.


Machine A costs IA and has operating costs per year C,...,C,C A2A

1A

T1 and has salvage value with

T)-T(2 1 years (to go), of S.

Machine B costs IB and has operating costs per year .C,...,C,C B2B

1B

T 2 Assume replace each

machine with the same machine.

The present value of the costs of machine A over its life is
\[ P_A = I_A + \sum_{t=1}^{T_1} \frac{C^A_t}{(1+k)^t} \]
For machine B
\[ P_B = I_B + \sum_{t=1}^{T_2} \frac{C^B_t}{(1+k)^t} \]

If we choose machine A, then it must be replaced at time \(T_1\). At that time, the present value of
costs will be
\[ I_A + \sum_{t=1}^{T-T_1} \frac{C^A_t}{(1+k)^t} - \frac{S}{(1+k)^{T-T_1}}, \]
because it is not used for its full life.

If we choose machine B, then it must be replaced at time T2. At that time, the present value of

costs will be PB again because it is used for its full life. To decide which machine to use, we

compare the present values of costs for the entire product life, which are, today,

\[ P'_A = P_A + \frac{1}{(1+k)^{T_1}} \left[ I_A + \sum_{t=1}^{T-T_1} \frac{C^A_t}{(1+k)^t} - \frac{S}{(1+k)^{T-T_1}} \right] \]

and


\[ P'_B = P_B + \frac{1}{(1+k)^{T_2}} P_B = P_B \left[ 1 + \frac{1}{(1+k)^{T_2}} \right], \]
and choose the smaller one between \(P'_A\) and \(P'_B\).
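The comparison of total costs for the two machines can be sketched numerically as follows. This is a minimal illustration, assuming the setup implied by the formulas: machine B is used for exactly two full lives (T = 2·T2), and the second machine A runs for only T − T1 years before being sold for S. The function names and all numbers are hypothetical.

```python
def pv_costs(investment, costs, k, salvage=0.0):
    """I + sum_t C_t/(1+k)^t, less salvage received after the last cost year."""
    pv = investment + sum(c / (1 + k) ** t for t, c in enumerate(costs, start=1))
    return pv - salvage / (1 + k) ** len(costs)

def total_cost_A(I_A, costs_A, S, T, k):
    """One full life of A, then a second A used for T - T1 years and sold for S."""
    T1 = len(costs_A)
    first = pv_costs(I_A, costs_A, k)
    second = pv_costs(I_A, costs_A[:T - T1], k, salvage=S)
    return first + second / (1 + k) ** T1

def total_cost_B(I_B, costs_B, k):
    """Two full lives of B, i.e. a product run of T = 2 * T2 years."""
    T2 = len(costs_B)
    P_B = pv_costs(I_B, costs_B, k)
    return P_B * (1 + 1 / (1 + k) ** T2)

# Hypothetical data: T = 4, machine A lasts 3 years, machine B lasts 2 years.
k = 0.10
cost_A = total_cost_A(100.0, [20.0, 20.0, 20.0], S=40.0, T=4, k=k)
cost_B = total_cost_B(60.0, [30.0, 30.0], k=k)
choice = "A" if cost_A < cost_B else "B"
```

With these made-up figures machine B has the smaller present value of costs, but the ranking can change with k, S, or the cost profiles.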

A common situation is when the length of the product run is anticipated to run

indefinitely into the future (formally, T = ∞). Hence, if a machine has life of length n, we

anticipate replacing the machine every n years and making an "infinite" number of

replacements. If P is the present value of costs of the machine over one cycle, i.e.,
\[ P = I + \sum_{t=1}^{n} \frac{C_t}{(1+k)^t}, \]
then, as above, the present value of costs over the product life will be
\[ \begin{aligned}
P' &= P + \frac{P}{(1+k)^{n}} + \frac{P}{(1+k)^{2n}} + \frac{P}{(1+k)^{3n}} + \frac{P}{(1+k)^{4n}} + \cdots \\
   &= P \left[ 1 + \frac{1}{[(1+k)^n]^1} + \frac{1}{[(1+k)^n]^2} + \frac{1}{[(1+k)^n]^3} + \frac{1}{[(1+k)^n]^4} + \cdots \right]
\end{aligned} \]

or
\[ P' = P \sum_{j=0}^{\infty} X^j \quad \text{where} \quad X \equiv \frac{1}{(1+k)^n}. \]
From (II.14), we have that
\[ \sum_{j=0}^{N-1} X^j = \frac{1 - X^N}{1 - X} \to \frac{1}{1 - X} \quad \text{as } N \to \infty \text{ for } X < 1. \]
So substituting for X, we have
\[ P' = P \left[ \frac{1}{1 - \frac{1}{(1+k)^n}} \right] = P \left[ \frac{(1+k)^n}{(1+k)^n - 1} \right] \]
So to determine which machine to choose, take the one with the smaller \(P'\).
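A quick numerical check of the closed form may be helpful. The sketch below compares P(1+k)^n / ((1+k)^n − 1) with a long truncated sum of the replacement cycles; the function name `perpetual_pv` and the inputs are illustrative assumptions only.

```python
def perpetual_pv(P, n, k):
    """Present value of costs when a cycle costing P is repeated every n years forever."""
    growth = (1 + k) ** n
    return P * growth / (growth - 1)

# Hypothetical one-cycle cost of 100 on a 3-year machine at k = 10%.
closed_form = perpetual_pv(100.0, 3, 0.10)
# Direct sum of P / (1+k)^(j*n) over many cycles, as in the series above.
truncated_sum = sum(100.0 / (1.10) ** (3 * j) for j in range(2000))
```

The two numbers agree to within rounding because X = 1/(1+k)^n is less than one whenever k > 0.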


An alternative representation (due to Lewellen) is to convert the present value
calculations into a constant annual cost (flow) comparison. This approach is useful for
comparing buying and maintaining the machine with renting (or leasing) the machine with a
service contract from another firm, i.e., what is the (maximum) constant annual payment that
you would be willing to pay at the end of each year for renting the machine and having it
serviced? Clearly, this is an annuity payment problem. The present value of the annuity is
\(P'\), the present value of costs over the product life; the (maximum) rate to be paid is k;
find the annual payment implied. From Section II, we have that the formula for an N-year annuity is
\[ A_N = \frac{y \left[ 1 - \frac{1}{(1+k)^N} \right]}{k} \]
where y = the annual payment.
Corresponding to the perpetuity (N = ∞), we have
\[ A_\infty = \frac{y}{k} \]
so the annual flow (cost) will be
\[ y = kP' = \frac{kP}{1 - \frac{1}{(1+k)^n}} \]

Again, we can select between two machines by choosing the one with the smaller y, and if a

leasing contract is available for less than y, then take it.
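The buy-versus-lease test can be illustrated with a short sketch. It computes the equivalent annual cost y = kP / (1 − (1+k)^(−n)) for a purchased machine and compares it with a lease quote. The machine data and the lease quote are hypothetical, and the function name is an assumption made for this example.

```python
def equivalent_annual_cost(P, n, k):
    """Constant end-of-year payment whose n-year annuity value equals P."""
    return k * P / (1 - (1 + k) ** -n)

k = 0.10
investment, costs = 100.0, [20.0, 20.0, 20.0]   # hypothetical 3-year machine
# One-cycle present value of costs, P = I + sum_t C_t / (1+k)^t.
P = investment + sum(c / (1 + k) ** t for t, c in enumerate(costs, start=1))
y = equivalent_annual_cost(P, len(costs), k)

lease_quote = 58.0                               # hypothetical annual lease payment
take_lease = lease_quote < y
```

Here y is roughly 60.21 per year, so a serviced lease at 58 per year would be taken under the rule stated above.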

II. Optimal Replacement Time

In the previous analysis, it was assumed that the machines' replacement times were

known and that they corresponded to their physical lengths of life. Rarely is this the case.

Normally, the decision to replace a machine is an economic one. As before, assume that we will

always be replacing old machines with new machines and that replacement goes on indefinitely.


Let I = initial investment and C1,C2,C3,...,CT be the annual operating costs up to the end of the

physical life of the machine, T; let S1,S2,S3,...,ST-1,ST (= 0) be the salvage value of the machine

at the end of each year. From the assumptions of the problem, an optimal replacement time

solution will be the same for all time, i.e., it will never be optimal to replace every two years for a

while and then switch to replacing every three years, etc.

Let τ = length of time between replacements (0 < τ ≤ T). To "convert" the problem to

the type of the previous section, consider that each of T different replacement strategies defines

a "different" machine (which it does in the economic sense). Thus, let P(τ) = present value of

costs for one cycle for the machine replaced every τ years. Then
\[ P(\tau) = I + \sum_{t=1}^{\tau} \frac{C_t}{(1+k)^t} - \frac{S_\tau}{(1+k)^{\tau}}. \]
Let \(P'(\tau)\) be the present value of costs of the machine (replaced every τ years) over the product
life. Then
\[ P'(\tau) = \left[ \frac{1}{1 - \frac{1}{(1+k)^{\tau}}} \right] P(\tau). \]
The optimal replacement time, τ*, will then be the τ such that
\[ P'(\tau^*) \le P'(\tau) \quad \text{for all possible } \tau = 1, 2, 3, \ldots, T. \]
We can also write the conditions in terms of equalized annual costs by: \(y_\tau = kP'(\tau)\), and τ* is then such that
\[ y_{\tau^*} \le y_\tau \quad \text{for all possible } \tau. \]

Note: τ* will depend on k, the structure of the operating costs, and the salvage values, and

hence will be different for different cost of capital, etc.
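The search over replacement intervals is easy to sketch in code. The example below evaluates the equalized annual cost for each candidate τ and picks the smallest. The cost and salvage schedules are hypothetical, and `optimal_replacement` is a name introduced only for this illustration.

```python
def optimal_replacement(I, costs, salvage, k):
    """Return (tau_star, {tau: equalized annual cost}) for tau = 1..T.

    costs[t-1] is the operating cost C_t; salvage[t-1] is the salvage value S_t.
    """
    annual = {}
    for tau in range(1, len(costs) + 1):
        # One-cycle present value of costs when replacing every tau years.
        P = I + sum(costs[t - 1] / (1 + k) ** t for t in range(1, tau + 1))
        P -= salvage[tau - 1] / (1 + k) ** tau
        # Present value over an indefinite product life, then the annual flow.
        P_total = P / (1 - (1 + k) ** -tau)
        annual[tau] = k * P_total
    tau_star = min(annual, key=annual.get)
    return tau_star, annual

# Hypothetical machine: rising operating costs, falling salvage value, S_T = 0.
tau_star, annual = optimal_replacement(
    I=100.0, costs=[10.0, 20.0, 40.0, 80.0], salvage=[60.0, 40.0, 15.0, 0.0], k=0.10
)
```

For these inputs the annual cost is lowest when the machine is replaced every two years; rerunning with a different k or different schedules can move the optimum.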


VIII. FORWARD CONTRACTS, FUTURES CONTRACTS AND OPTIONS

In this section, the assumption of certainty is removed and we begin the study of financial

instruments in an uncertain environment. Futures and option securities permit investors to

modify the patterns of returns which would otherwise be received from the underlying securities

and in particular, to eliminate or hedge against the uncertainties of price changes in these

securities. Forward or futures contracts have long been traded on basic commodities and in

recent years, have been widely expanded to include financial securities (see chart on the last page

of Section V). Although most organized markets use futures contracts, the more-intuitive

contract is the forward contract. As an introduction, therefore, we analyze the forward contract

in the context of a forward loan agreement. The reader may find it helpful to review the analysis

of default-free fixed income security pricing in Section V before proceeding to the example.

It is a common practice for borrowers to obtain a commitment for a loan in advance of

actually receiving the money. For example, a firm may undertake a project that does not require

investment until some future date, and therefore, the firm may have no need for funds now.

However, to ensure the availability and terms of a loan sufficient to finance this future

investment, the firm may enter into an agreement with a bank now to borrow the money at a

specified future date. While the terms of such agreements are variable, a typical example would

be a τ-period term discount loan where the bank agrees to lend $L to the firm at date T and the

firm agrees to pay back to the bank $M at date T + τ.

Such an agreement is an example of a forward contract. Specifically, it is a forward

contract where the firm agrees to deliver to the bank at date T a τ-period discount bond on

which the firm promises to pay $M at maturity and the bank agrees to pay the firm $L (the

delivery price) on delivery. Although not necessary, typically no money changes hands at the

time that the forward contract is made. Under this assumption and the assumption that the loan

is default-free, what is the equilibrium value for M?

The spot price for an item is defined as the price for that item delivered immediately. For

example, the current spot price for a discount bond which pays $1 at date T is \(P_0(T)\), and
at date t, the spot price for that same bond will be \(P_t(T-t)\). The forward price associated


with a forward contract is defined as that delivery price which makes the value of the forward

contract equal to zero at the time that the contract is made.

Because in the case at hand, nothing is paid by either party to the other for making the

contract, the terms must be such that the value of the forward contract (not to be confused with

the forward price of the contract) is zero at the time it is made. Otherwise, if the value of the

contract to the lender (borrower) were positive, then the value of the contract to the borrower

(lender) would be negative, and the borrower (lender) would be giving something of value away

for nothing. Hence, the relationship between L and M must be such that L is equal to the

forward price of the contract. If \(F_t(L, M, T, \tau)\) denotes the value of the forward contract (to the
lender) at time t then, to avoid arbitrage, from (V.18) it must satisfy
(VIII.1) \[ F_t(L, M, T, \tau) = -L\,P_t(T-t) + M\,P_t(T+\tau-t) \]
for t = 0,1,2,…,T. Of course, the value of the forward contract to the borrower at time t is
\(-F_t(L, M, T, \tau)\). From the condition that \(F_0(L, M, T, \tau) = 0\), we have that
\(M = L\,P_0(T)/P_0(T+\tau)\), and therefore, (VIII.1) can be rewritten as
(VIII.2) \[ F_t = L\,P_0(T) \left\{ \frac{P_t(T+\tau-t)}{P_0(T+\tau)} - \frac{P_t(T-t)}{P_0(T)} \right\} \]

From (VIII.2), the value of the forward contract to the lender at date t is proportional to the

difference between the return per dollar from holding a (T + τ)-period discount bond for the

period [0,t] and the return per dollar from holding a T-period discount bond for the same

period. The proportionality factor is equal to the value at the time the contract is made of a

discount bond which pays $L at its maturity date T. Thus, while \(F_0 = 0\), the value of the
contract at date t, \(F_t\), will not equal zero if the holding period returns on the two bonds are not

the same. This can happen if interest rates are stochastic, and will happen whenever the ex-post

time path of interest rates is different from what was expected ex-ante. Of course, if Ft is

positive, then, ex-post, the borrower is worse off than if he had not entered into the agreement

because the value of the contract to him, –Ft , is negative. However, just as the lender is

committed to making the loan on the terms agreed upon, so the borrower is equally committed to


take the loan on these terms. If the borrower had the choice of not taking the loan, then the

agreement would not be a forward contract but rather an option contract.
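To make the forward loan concrete, the following sketch computes the repayment M that gives the contract zero value at inception and then revalues the contract after bond prices have moved, using (VIII.1) and (VIII.2). All bond prices are hypothetical, and the function names are introduced only for this example.

```python
def face_value(L, P0_T, P0_T_tau):
    """M = L * P_0(T) / P_0(T + tau), the repayment that makes F_0 = 0."""
    return L * P0_T / P0_T_tau

def forward_value(L, M, Pt_T, Pt_T_tau):
    """(VIII.1): value to the lender, F_t = -L * P_t(T-t) + M * P_t(T+tau-t)."""
    return -L * Pt_T + M * Pt_T_tau

# Hypothetical spot prices today for $1 paid at T and at T + tau.
L, P0_T, P0_T_tau = 1000.0, 0.90, 0.80
M = face_value(L, P0_T, P0_T_tau)
F0 = forward_value(L, M, P0_T, P0_T_tau)

# Hypothetical prices of the same two bonds at a later date t.
Pt_T, Pt_T_tau = 0.95, 0.82
Ft = forward_value(L, M, Pt_T, Pt_T_tau)
# (VIII.2): the same value from the difference in holding-period returns.
Ft_check = L * P0_T * (Pt_T_tau / P0_T_tau - Pt_T / P0_T)
```

In this example the shorter bond earned the higher holding-period return, so the contract has negative value to the lender and positive value to the borrower.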

Because $L is the T-period forward price for a default-free discount bond which pays

$M at date (T + τ), L/M is the T-period forward price for a default-free discount bond which

pays $1 at date (T + τ). If we define \({}_T P_0(\tau)\) to be the forward price for delivery at time T of a
discount bond which pays $1 at date (T + τ), then \({}_T P_0(\tau) = L/M\) and therefore,
(VIII.3) \[ {}_T P_0(\tau) = P_0(T+\tau)/P_0(T), \qquad \tau = 0, 1, \ldots \]
Since \(P_0(T)\) is the amount one would pay today for one dollar delivered at date T and
\(P_0(T+\tau)\) is the current price of the bond, from (VIII.3) the forward price \({}_T P_0(\tau)\) is equal to the
current price of the bond measured in units of dollars paid at date T. With a complete set of
these forward prices for all T and τ, the forward price associated with a forward contract for
any default-free, fixed-income security can be computed. If \({}_s V_0\) denotes the s-period forward
price for a default-free security which pays \(\$x_t\) at date t, t = 1,2,...,T, then
(VIII.4) \[ {}_s V_0 = \sum_{t=s+1}^{T} {}_s P_0(t-s)\, x_t \]
for s = 0,1,...,T−1, where it is assumed that the security is delivered at date s but ex-the-period
s-payment (i.e., after the payment of \(\$x_s\) has been made).

The only data required to compute forward prices for all default-free securities are the

current spot prices of discount bonds or equivalently, the current term structure of interest rates.

Indeed, the more-common practice is to quote a forward yield rate, \({}_T R_0(\tau)\), rather than the forward
price \({}_T P_0(\tau)\), where, in an analogous fashion to (V.8) in Section V, \({}_T R_0(\tau)\) is defined by
(VIII.5) \[ {}_T R_0(\tau) \equiv \left[ {}_T P_0(\tau) \right]^{-1/\tau} - 1 \]
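Since only the current spot prices of discount bonds are needed, (VIII.3) through (VIII.5) translate directly into a few lines of code. The sketch below assumes a hypothetical flat 5% term structure and stores spot prices in a dictionary keyed by maturity; the function names are illustrative, not part of the text.

```python
def forward_price(P0, T, tau):
    """(VIII.3): forward price at T of $1 paid at T + tau, from spot prices P0."""
    return P0[T + tau] / P0[T]

def forward_yield(P0, T, tau):
    """(VIII.5): forward yield rate implied by the forward price (tau >= 1)."""
    return forward_price(P0, T, tau) ** (-1.0 / tau) - 1.0

def forward_security_price(P0, s, payments):
    """(VIII.4): s-period forward price of a security paying payments[t] at date t."""
    return sum(forward_price(P0, s, t - s) * x for t, x in payments.items() if t > s)

# Hypothetical flat term structure at 5% per period; P0[0] = 1 by construction.
P0 = {t: 1.05 ** -t for t in range(0, 7)}
payments = {1: 10.0, 2: 10.0, 3: 110.0}        # hypothetical coupon bond

fy = forward_yield(P0, T=2, tau=3)
spot_value = forward_security_price(P0, 0, payments)
one_period_forward = forward_security_price(P0, 1, payments)
```

With a flat curve every forward yield equals the spot yield; with s = 0 the forward price collapses to the ordinary present value of the payments.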

Consider the more general case of a forward contract which calls for delivery of one unit

(e.g., share) of a security or commodity at date T at a price L. Let X(t) denote the spot price
of the commodity at time t. If shortsales are permitted; if there are no storage or transactions
costs; and if the commodity or security provides no payouts prior to T, then it follows from the
condition of no arbitrage that the value of the forward contract is
(VIII.6) \[ F_t(L, T) = X(t) - L\,P_t(T-t) \]
From the definition of the forward price at time t, \(L_t\), we have from (VIII.6) that
(VIII.7) \[ L_t = X(t)/P_t(T-t). \]

The dollar gain on a forward contract entered into at time t between t and t + 1 is given by
(VIII.8) \[ F_{t+1}(L_t, T) - F_t(L_t, T) = X(t) \left[ \frac{X(t+1)}{X(t)} - \frac{P_{t+1}(T-t-1)}{P_t(T-t)} \right]. \]
The change in the forward price is given by
(VIII.9) \[ L_{t+1} - L_t = \frac{X(t)}{P_{t+1}(T-t-1)} \left[ \frac{X(t+1)}{X(t)} - \frac{P_{t+1}(T-t-1)}{P_t(T-t)} \right]. \]
Combining (VIII.8) and (VIII.9), we have that
(VIII.10) \[ F_{t+1}(L_t, T) - F_t(L_t, T) = P_{t+1}(T-t-1)\,[L_{t+1} - L_t]. \]
The change in the value of a contract entered into at time t = 0 between t = 0 and t = T, the
delivery date, is given by
(VIII.11) \[ \begin{aligned}
F_T(L_0, T) - F_0(L_0, T) &= \sum_{t=0}^{T-1} [F_{t+1}(L_0, T) - F_t(L_0, T)] \\
&= X(T) - L_0 \\
&= L_T - L_0
\end{aligned} \]

In preparation for the analysis of futures contracts, consider the following investment

strategy in forward contracts: enter into a forward contract at time t. At time t + 1, settle the

contract and put the proceeds into a discount bond which matures at time T . Now enter into a

new forward contract. The initial value of the investment is zero. The increment to value

between t and t + 1 is given by (VIII.10). This increment (invested in a discount bond) will be
worth, at time T, \(\{F_{t+1}(L_t, T) - F_t(L_t, T)\}/P_{t+1}(T-t-1) = L_{t+1} - L_t\). The total value of this
investment strategy at time T will, therefore, be given by
(VIII.12) \[ \sum_{t=0}^{T-1} \{F_{t+1}(L_t, T) - F_t(L_t, T)\}/P_{t+1}(T-t-1) = \sum_{t=0}^{T-1} [L_{t+1} - L_t] = L_T - L_0 \]

which is identical to (VIII.11), the increment from holding a single forward contract for the entire

period until delivery.
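The equivalence between rolling over forward contracts and holding one contract to delivery can be checked on any price path. The sketch below uses a hypothetical path for the spot price X(t) and for the bond price P_t(T − t), and applies (VIII.6), (VIII.7), and (VIII.12) step by step; the names are assumptions made for the example.

```python
def forward_prices(X, P):
    """(VIII.7): L_t = X(t) / P_t(T - t), where P[t] stands for P_t(T - t)."""
    return [x / p for x, p in zip(X, P)]

def rollover_value(X, P):
    """Date-T value of settling and re-entering a forward contract every period."""
    T = len(X) - 1
    L = forward_prices(X, P)
    total = 0.0
    for t in range(T):
        # Value at t+1 of the contract struck at L_t, from (VIII.6).
        gain = X[t + 1] - L[t] * P[t + 1]
        # Each dollar put into the bond maturing at T grows to 1 / P_{t+1}(T-t-1).
        total += gain / P[t + 1]
    return total

# Hypothetical paths with T = 3; the bond is worth 1 at its own maturity.
X = [100.0, 104.0, 97.0, 110.0]
P = [0.86, 0.91, 0.95, 1.0]
L = forward_prices(X, P)
rolled = rollover_value(X, P)
held = L[-1] - L[0]          # (VIII.11): single contract held until delivery
```

The two totals coincide for any path, which is the content of (VIII.12).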

If a person is long in a futures contract, then he is required to purchase at date T one unit

of the security (or commodity) at the then futures price, denoted by f(T), and he also will
receive in cash at date t + 1 the difference between the futures price at that date and the futures
price at date t [i.e., f(t+1) − f(t)] for t = 0,...,T−1. [Note: If f(t+1) − f(t) < 0, then he
must pay out |f(t+1) − f(t)| in cash.]

If a person is short in a futures contract, then he is required to sell at date T one unit of
the security (or commodity) at the then futures price, f(T), and he also must pay in cash at date
t + 1 the difference between the futures price at that date and the futures price at date
t, f(t+1) − f(t), for t = 0,...,T−1. [Note: If f(t+1) − f(t) < 0, then he will receive
|f(t+1) − f(t)| in cash.]

The futures price at date t, f(t), is defined to be that price such that the value of a

futures contract is zero. It follows immediately that the futures price at the delivery date T is

equal to the spot price at that date, i.e., f(T) = X(T).

Note that holding a futures contract long until the delivery date is quite similar to the

examined investment strategy of "rolling over" forward contracts which from (VIII.12) was

shown to be equivalent to simply holding a forward contract until delivery.

Consider the analogous investment strategy in futures contracts: At the beginning of each

period t (t = 0,...,T−1), enter into \(N_t\) futures contracts. At the end of the period (i.e., t + 1),
you will receive \(\$N_t[f(t+1) - f(t)]\). Invest this money in bonds that mature at date T. [So, at
date T, you will have \(\$N_t[f(t+1) - f(t)]/P_{t+1}(T-t-1)\) from this transaction.] Adjust your
position so that you have \(N_{t+1}\) contracts for the period t + 1 to t + 2.
The accumulated sum at date T, \(V_T\), from this strategy will be
(VIII.13) \[ V_T = \sum_{t=0}^{T-1} N_t\,[f(t+1) - f(t)]/P_{t+1}(T-t-1). \]

Consider the case where we choose \(N_t = 1\). In this case, from (VIII.13), the accumulated
sum will be
(VIII.14) \[ \begin{aligned}
V_T &= \sum_{t=0}^{T-1} [f(t+1) - f(t)] + \sum_{t=0}^{T-1} [f(t+1) - f(t)] \left[ \frac{1}{P_{t+1}(T-t-1)} - 1 \right] \\
&= f(T) - f(0) + \sum_{t=0}^{T-1} [f(t+1) - f(t)]\,[1 - P_{t+1}(T-t-1)]/P_{t+1}(T-t-1).
\end{aligned} \]

Thus, by inspection of (VIII.14), unlike a forward contract, entering into a futures contract and
remaining long one contract until the delivery date will not, in general, produce a dollar return
equal to the difference between the price at delivery, f(T), and the futures price at the time of
initial entry, f(0).
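A small numerical example shows the size of the extra term in (VIII.14). The sketch below evaluates (VIII.13) with one contract held throughout, on a hypothetical path of futures prices and bond prices; `futures_accumulated` is a name chosen for this illustration.

```python
def futures_accumulated(f, P, N):
    """(VIII.13): date-T value of daily settlements reinvested in bonds maturing at T.

    f[t] is the futures price, P[t] stands for P_t(T - t), and N[t] is the number
    of contracts held over the period from t to t + 1.
    """
    T = len(f) - 1
    return sum(N[t] * (f[t + 1] - f[t]) / P[t + 1] for t in range(T))

# Hypothetical paths with T = 3.
f = [100.0, 104.0, 97.0, 110.0]
P = [0.86, 0.91, 0.95, 1.0]
V_T = futures_accumulated(f, P, N=[1, 1, 1])

price_change = f[-1] - f[0]
# Second term of (VIII.14): interest earned (or paid) on the interim settlements.
interest_term = sum((f[t + 1] - f[t]) * (1 - P[t + 1]) / P[t + 1] for t in range(3))
```

The accumulated sum differs from f(T) − f(0) by the interest on the interim cash flows, which depends on when the gains and losses happen to arrive.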


However, there is one case where a strategy in the futures contracts will exactly replicate

the outcome of a forward contract. Suppose that changes in interest rates over the life of the

futures contract are known with certainty. Then, a feasible strategy would be to set
\(N_t = P_{t+1}(T-t-1)\). [Note: since this strategy requires one to know at time t the price that a
bond will have at time t + 1, it is not feasible in a world of uncertain interest rates.]
Substituting for \(N_t\) in (VIII.13), we have that
(VIII.15) \[ V_T = \sum_{t=0}^{T-1} [f(t+1) - f(t)] = f(T) - f(0). \]

Suppose that simultaneously with following this strategy, we also go short one forward contract

at time t = 0. Because both the forward contract and the futures contract have zero value, these
positions require zero investment. From (VIII.11) and (VIII.15), the accumulated value from
these combined positions, \(V'_T\), can be written as
(VIII.16) \[ V'_T = f(T) - f(0) - [L_T - L_0] = L_0 - f(0) \]
because X(T) = f(T) and, from (VIII.7), \(L_T = X(T)\). But, to achieve \(V'_T\) requires no
investment. Hence, if \(L_0 \neq f(0)\), then an arbitrage opportunity would exist. Thus, to avoid
arbitrage, we have that, for nonstochastic interest rates,
(VIII.17) \[ f(t) = L_t, \]
and from (VIII.7), that
(VIII.18) \[ f(t) = X(t)/P_t(T-t). \]

Moreover, for most practical cases of relatively short-lived futures contracts (i.e., T not too
large), the uncertainty about the bond price at time t + 1 viewed from time t will be small, and

therefore (VIII.17) should be an excellent approximation.
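The arbitrage argument behind (VIII.16) and (VIII.17) can be traced numerically. The sketch below assumes the path of bond prices is known in advance, holds N_t = P_{t+1}(T − t − 1) futures contracts each period, and goes short one forward contract at date 0. The price paths, including a deliberately "mispriced" initial futures price, are hypothetical.

```python
def tailed_futures_value(f, P):
    """(VIII.15): date-T value when N_t = P_{t+1}(T-t-1) contracts are held over [t, t+1]."""
    T = len(f) - 1
    total = 0.0
    for t in range(T):
        N_t = P[t + 1]                       # requires knowing next period's bond price
        receipt = N_t * (f[t + 1] - f[t])    # cash settled at date t + 1
        total += receipt / P[t + 1]          # reinvested in a bond maturing at T
    return total

def combined_position(f, P, X, L0):
    """(VIII.16): tailed long futures plus one forward contract sold at L0."""
    short_forward = L0 - X[-1]               # deliver at L0 what is worth X(T)
    return tailed_futures_value(f, P) + short_forward

# Hypothetical paths with T = 3 and f(T) = X(T).
X = [100.0, 104.0, 97.0, 110.0]
P = [0.86, 0.91, 0.95, 1.0]
L0 = X[0] / P[0]                              # forward price from (VIII.7)
f = [115.0, 114.0, 103.0, 110.0]              # f(0) set below L0 on purpose
profit = combined_position(f, P, X, L0)
```

The profit equals L0 − f(0) whatever happens to prices in between, and it requires no investment, so it can only be ruled out if f(0) = L0.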


The reader is warned that while the relation between the futures and forward price derived

in (VIII.17) will hold whenever interest rates are nonstochastic, the relation between the futures

and spot price in (VIII.18) will only obtain under the posited assumptions about the security or

commodity underlying the contract. That is, if the underlying item is a commodity which is not

being stored, then f(t) < X(t)/\(P_t(T-t)\) can be a stable result because it is not possible to

shortsell the commodity spot. Similarly, if there are payouts on the security prior to the delivery

date or storage costs for the commodity, then (VIII.18) need not obtain without creating an

arbitrage opportunity.

Options: Insurance for the Value of Risky Securities

A Put Option gives its owner the right to sell a specified number of shares of stock at a
specified price per share (the "exercise price") on or before a specified date (the "expiration
date"). If the option is not exercised on or before the expiration date, then it expires and becomes

worthless.

If T denotes the expiration date and if S(t) denotes the stock price per share at date t,

then the value of the put option per share on its expiration date is Max[0, E − S(T)], where E

denotes the exercise price.


Put Option Viewed as a Term Insurance Policy

                               General                    Example (3/6/79)
Asset Insured                  Stock                      IBM
Asset's Current Value          Stock Price, $S            $303.875
Term of Policy                 Time until expiration      7 months and 14 days
                               of the put                 (10/20/79)
Maximum Insurance Coverage     Exercise Price             $300.00
[Face Value of Policy]         of the Put, $E
(maximum loss to insurer)
Amount of the Deductible       $[S – E]                   $3.875
(maximum loss to insured)
Insurance Premium              Put Price per share        $15.25

Important Differences

• Early Exercise and Marketability

• Dividends

Three Ways of Reducing Risk

1) Diversification: "Mixing" less-than-perfectly correlated risky assets

2) Substituting the riskless security for risky assets

3) Insurance: options

If an investor holds a risky security and reduces his risk by the purchase of a put option on

that risky security, then such an investment strategy is called a "Protective Put" or "Insured

Equity" strategy.

Figure VIII.1 illustrates the basic payoff structure to a "Protective Put" strategy for the

case when the risky security is IBM stock. Note that the payoff structure is a nonlinear function

of the price of IBM stock, and therefore, this method of reducing risk is fundamentally different


from the alternative method of reducing risk which is to reduce one's holdings of IBM and invest

in the riskless security.
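The nonlinear payoff of the protective put can be tabulated with a few lines of code. The sketch below uses the IBM figures from the insurance table above (stock price $303.875, exercise price $300, put price $15.25); the terminal stock prices are hypothetical, and interest on the premium and dividends are ignored as a simplifying assumption.

```python
def put_payoff(S_T, E):
    """Value of the put per share at expiration: Max[0, E - S(T)]."""
    return max(0.0, E - S_T)

def protective_put_payoff(S_T, E):
    """Stock plus put at expiration: never worth less than the exercise price."""
    return S_T + put_payoff(S_T, E)

S0, E, premium = 303.875, 300.0, 15.25        # figures from the table in the text
outlay = S0 + premium                          # cost of one share plus one put

# Hypothetical stock prices at expiration and the resulting gain or loss.
profits = {S_T: protective_put_payoff(S_T, E) - outlay
           for S_T in (200.0, 250.0, 300.0, 350.0, 400.0)}
worst_case = min(profits.values())
```

Below the exercise price the loss is capped at the deductible plus the premium, while above it the position gains dollar for dollar with the stock, less the premium.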

Reducing the risk of a portfolio of stocks by the purchase of put options can be

accomplished by either purchasing a put option on each individual stock within the portfolio or

by purchasing a put option on the portfolio itself. The pattern of returns achieved by these

alternate approaches to the Protective Put Strategy will be somewhat different.

The following table provides the simulated return experience from following a protective

put strategy where one purchases a put option on each stock in the portfolio.


Summary Statistics for Rate of Return Simulations*
Stocks Mixed With Commercial Paper Strategies Versus "Protective Put" Strategies
July 1963 - June 1977

Semi-Annual:                 Stocks 1/       Protective Put 2/    Protective Put 3/
                                             (E = S)              (E = .9S)
Average Rate of Return         4.6%            4.5%                 4.7%
Standard Deviation            13.7%            7.9%                10.4%
Highest Return                49.1%           35.1%                40.8%
Lowest Return                –16.4%           –1.8%                –7.0%
Average Compound Return        3.7%            4.2%                 4.2%
Growth of $1000              $2,829          $3,209               $3,218

Semi-Annual:                 75% Stocks 1/   50% Stocks 1/        Commercial
                             25% Paper       50% Paper            Paper
Average Rate of Return         4.2%            3.8%                 3.1%
Standard Deviation            10.0%            6.9%                 1.0%
Highest Return                37.9%           26.8%                 5.9%
Lowest Return                –10.8%           –5.3%                 1.7%
Average Compound Return        3.8%            3.6%                 3.1%
Growth of $1000              $2,812          $2,717               $2,339

*Source: "The Returns and Risks of Alternative Put Option Portfolio Investment Strategies," by Robert C. Merton, Myron S. Scholes, and Mathew Gladstein (Journal of Business, January 1982).

1/ Equal-dollar weighted portfolio of 30 Dow Jones Industrial stocks rebalanced semi-annually. Returns include reinvesting all dividends. No provisions for taxes or transaction costs.

2/ Same as footnote 1/ plus a six-month put option with exercise-price-equal-to-initial-stock price for each share of each stock.

3/ Same as 2/ except exercise price is equal to 90% of initial stock price.

The following table provides the simulated return experience from following a protective

put strategy where one purchases a put option on the whole portfolio. The portfolio chosen was a

value-weighted portfolio of all New York Stock Exchange stocks and the particular Protective

Put Strategy examined was to purchase a one-month put on the portfolio with an exercise price

equal to the initial value of the portfolio times one plus the one-month interest rate.

Summary Statistics for Rate of Return Simulations
January 1927 - December 1978

Per Month:                         NYSE Stocks    Protective Put    30-Day U.S. Treasury Bills
Average Rate of Return                0.85%          0.55%             0.21%
Standard Deviation                    5.89%          3.55%             0.19%
Highest Return                       38.55%         30.14%             0.81%
Lowest Return                       -29.12%         -7.06%            -0.24%
Average Compound Return               0.68%          0.49%             0.21%
Growth of $1000                     $67,527        $21,400            $3,604
Average Annual Compound Return        8.47%          6.04%             2.55%

A call option gives its owner the right to buy a specified number of shares of stock at a

specified price per share (the "exercise price") on or before a specified date (the "expiration

date"). If the option is not exercised on or before the expiration date, then it expires and becomes

worthless. The value of the call option per share on its expiration date is Max[0, S(T) − E].

As an exercise, show that the value at the expiration date of a protective put strategy

levered by going short (i.e., borrowing) in a riskless discount bond with face value of $E and

maturity date equal to the expiration date of a put is exactly equal to the value of a call option on


the same stock with exercise price and expiration date the same as for the put. Having shown

this, you will have proved that the purchase of a call option is equivalent to buying the stock;

levering the position by borrowing; and insuring the risk by purchasing a put option.
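A numerical check is not a proof, but it can confirm that the claimed equivalence holds before one works through the exercise. The sketch below compares the expiration-date value of the levered protective put with that of the call over a grid of hypothetical stock prices; the function names and the exercise price are illustrative assumptions.

```python
def call_payoff(S_T, E):
    """Value of the call per share at expiration: Max[0, S(T) - E]."""
    return max(0.0, S_T - E)

def levered_protective_put(S_T, E):
    """Long stock, long put with exercise price E, short a discount bond with face value E."""
    stock = S_T
    put = max(0.0, E - S_T)
    bond_repayment = E
    return stock + put - bond_repayment

E = 300.0                                         # hypothetical exercise price
grid = [float(s) for s in range(0, 601, 25)]      # hypothetical terminal stock prices
differences = [levered_protective_put(s, E) - call_payoff(s, E) for s in grid]
```

Every entry of `differences` is zero: when S(T) < E the put covers the loan shortfall, and when S(T) ≥ E the put expires worthless and the stock pays off the loan.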

On the Relationship Between Risky Debt and Options

Consider a firm with two classes of liabilities: equity and debt. Assume that there is a

single, homogeneous class of debt with the following terms:

1. The debt is a "pure" discount loan where the firm promises to pay $M ("face value") for each bond on the maturity date T. If there are n bonds outstanding, then the total promised payment to the debtholders is $B ≡ nM on the maturity date T.

2. In the event that the firm does not make the promised payment ("default"), then the firm is turned over to the debtholders, and each bondholder will receive his pro rata share of the "reorganized" firm. The original equityholders will receive nothing in that event.

Let V(t) denote the market value of the firm at date t (which, by definition, will

always be equal to the sum of the market value of debt plus equity).

On the maturity date of the debt, if the value of the firm exceeds the amount of the

promised payment (i.e., V(T) > B), then it is in the interest of the equityholders (who elect
management) to have the debt paid. Thus, the value of the debt issue in that event will be B,
and the value of equity will be V(T) − B.

On the maturity date of the debt, if the value of the firm is less than the amount of the

promised payment (i.e., V(T) < B), then the firm cannot make the promised payment. Because

corporate equity enjoys limited liability, the equityholders cannot be compelled to contribute the

"short fall" to pay the bondholders, and it is, clearly, not in their interests to do so. Thus, the firm

will default, and the value of the debt issue in that event will be V(T), and the value of equity

will be 0.

In summary, on the maturity date, the
(VIII.19) \[ \text{value of debt issue} = \text{Min}[B, V(T)] = B - \text{Max}[0, B - V(T)] \]
and the
(VIII.20) \[ \begin{aligned}
\text{value of equity} &= V(T) - \text{Min}[B, V(T)] \\
&= V(T) - B + \text{Max}[0, B - V(T)] \\
&= \text{Max}[0, V(T) - B].
\end{aligned} \]
Note: If the debt issue were default-free, then the value of the debt at maturity would always equal B, the promised amount.

Inspection of the above value formula shows, therefore, that risky corporate debt "looks

like" a combined position of buying a default-free discount bond with face value B and maturity

T and issuing (short-selling) a put option on the firm's value with an exercise price = B and an

expiration date T. If there were no limited liability or, equivalently, if the equityholders had
chosen to get leverage by personal (unlimited liability) borrowing where the aggregate face value
of the loan were B, then the value of the equity at maturity would be V(T) − B [which could, of
course, be negative].

Inspection of the value formula for equity (VIII.20) shows that equity levered by

corporate borrowing "looks like" a combined position of levering equity with an unlimited

liability loan and purchasing a put option on the firm's value with an exercise price = B and

expiration date T.

Thus, if one buys corporate debt, then one is not only lending money, but is also issuing

insurance to the equityholders against declines in the asset value of the firm below $B.

Similarly, if one issues corporate debt, then one is not only borrowing money, but is also

purchasing insurance against a decline in the value of the firm's assets below $B.
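The option decomposition of debt and equity in (VIII.19) and (VIII.20) can be verified value by value. The sketch below uses a hypothetical promised payment B and a grid of hypothetical firm values at maturity; the function names are introduced only for this example.

```python
def debt_value(V_T, B):
    """(VIII.19): debtholders receive the smaller of the promise and the firm."""
    return min(B, V_T)

def equity_value(V_T, B):
    """(VIII.20): equityholders receive what is left, but never less than zero."""
    return max(0.0, V_T - B)

def put_on_firm(V_T, B):
    """Put option on the firm's value with exercise price B."""
    return max(0.0, B - V_T)

B = 80.0                                          # hypothetical promised payment
for V_T in (0.0, 40.0, 80.0, 120.0, 200.0):       # hypothetical firm values at T
    # Risky debt = default-free bond minus a put written to the equityholders.
    assert debt_value(V_T, B) == B - put_on_firm(V_T, B)
    # Limited-liability equity = unlimited-liability levered equity plus the put.
    assert equity_value(V_T, B) == (V_T - B) + put_on_firm(V_T, B)
    # Debt and equity together are worth the firm.
    assert debt_value(V_T, B) + equity_value(V_T, B) == V_T
```

The put is the insurance referred to above: the bondholders have written it and the equityholders hold it.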

Most kinds of insurance or guarantees of the value of a security can be viewed as options.

Hence, the theory of option price determination has much broader application beyond simply

evaluating puts and calls. Some examples would be deposit insurance and loan guarantees.


IX. THE FINANCING DECISIONS BY FIRMS: IMPACT OF CAPITAL STRUCTURE CHOICE ON VALUE

The capital structure of a firm is defined to be the menu of the firm's liabilities (i.e., the

"right-hand side" of the balance sheet). A great variety of types of securities can be and are used

in firms' capital structures. In addition to common stock equity, some typical examples are bank

loans, commercial paper, secured bonds, debentures, convertible bonds, income bonds, preferred

stock, and warrants. While the traditional treatment of capital structure is to examine each of

these types of securities separately, the modern approach (as was suggested in Section VIII)

views all these types as part of a unified theory of contingent claims pricing. That is, each of

these "hybrid" securities can be represented as a "mixture" (albeit at times a complicated one) of

"pure" "default-free" debt and "pure" (as if 100%-financed by common stocks) equity. Beyond

simply providing a unified theory of pricing, this approach avoids many of the pitfalls and

misconceptions about the costs and benefits of different capital structure choices. Therefore, the

analysis in this section of the capital structure choice and its impact on the total market value of

the firm will focus almost exclusively on the choice between debt and equity in providing the firm's

external financing.

As background, the reader should become familiar with the meaning (and effects) of

financial leverage and with the distinction between financial risk and asset (or business) risk. It

will be helpful to develop a feel for typical debt-to-asset ratios in various industries.

In the study of the firm's investment decisions in Section VII, it was assumed that all

external financing was done by issuing equity, and therefore that the firm had the simplest

structure possible: namely, all claims on the firm are homogeneous equity. The fundamental

question explored here is: Given the investment decision of the firm, does the financial structure

of the firm "matter"? That is, for a fixed investment policy, will a change in the firm's mix

between debt and equity cause a change in the market value of the firm?

As one might expect, the answer to this fundamental question depends upon the assumed

environment. Since, by hypothesis, the investment policy of the firm is fixed, the total cash flow

generated by the firm will not be affected by the capital structure choice. Thus, if the capital

structure matters (and thereby, a change in it will cause a change in the market value of the firm),


then, from the valuation formulas derived in Sections VI and VII, a capital structure change must

cause a change in the cost of capital. [The only exception would be if the capital structure choice

changes either the tax liabilities or the level of government subsidies to the firm, a topic

addressed later in this section.] Since, as shown in Section VII, the cost of capital is used in

determining the (optimal) investment decision, if the capital structure matters, then it is necessary

for the manager to consider simultaneously both the investment and financing decision in making

an overall optimal set of decisions for the firm.

Why should the financial structure matter? With the exception of certain tax features, it

is necessary to assume some type of uncertainty to give this question serious meaning because

otherwise, debt and equity are essentially indistinguishable. Possibilities are:

1. Does the issuance of debt create a new set of securities which were previously not

available? (i.e., how substitutable is personal leverage for corporate leverage?)

2. Are there costs to bankruptcies?

3. Are there tax features unique to corporate debt?

4. What are the effects on control of the firm?

5. Does the existence of outstanding debt "induce" changes in investment policy?

Other factors which are often considered by managers in deciding on the debt/equity ratio are:

1. growth rate of future sales

2. stability of future sales

3. the competitive structure of the industry

4. the asset structure of the industry

5. lender attitudes toward the firm and its industry.

To analyze the problem, we begin by studying the impact of capital structure in a specific

environment and use this as a benchmark for insights into why financing decisions might affect

value. This environment includes the following assumptions:

(A.1) No income taxes (to be modified later).


(A.2) The debt-to-equity ratio is changed by issuing debt to repurchase stock or

by issuing equity to pay off debt. Moreover, a change in the capital

structure is effected immediately and there are no transactions costs.

(A.3) The firm finances all investment by external means (i.e., dividend policy is

to pay dividends equal to earnings).

(A.4) The expected values of the (subjective) probability distributions of future

(operating) earnings for each firm are the same for all investors

("homogeneous investor beliefs").

(A.5) No growth of earnings: the expected value of operating earnings is the same for all

future periods.

(A.6) All investments that the firm considers are from the same risk class, i.e.,

the business risk characteristics are independent of the number of projects

taken, and are taken as constant.

A "Benchmark": The "Pure Equity" Case (100% Financing by Equity)

Let \( X \equiv \) average expected dollar return per period for the firm

Let \( k_0 \equiv \) cost of capital for 100% equity financed firm

= expected rate of return required for the firm's particular risk characteristics, and it is assumed to be constant over time.

Then, the market value of this firm is:

(IX.1) \( V_0 = \dfrac{X}{k_0} \)

Let F = (expected) annual interest payments on debt outstanding

\( B_0 \) = market value of debt outstanding

E = (expected) annual earnings available to shareholders = \( X - F \)

\( S_0 \) = market value of equity outstanding

\( k_i \equiv F/B_0 \) = cost of debt = required (expected) return by investors to hold this amount of debt in the firm

\( k_e \equiv E/S_0 \) = cost of equity = required (expected) return by investors to hold this amount of equity in the firm

\( V_0 = B_0 + S_0 \)

\[ k_0 = \frac{X}{V_0} = \frac{F + E}{V_0} = \frac{k_i B_0 + k_e S_0}{V_0} \]

or

(IX.2) \( k_0 = k_i \left( \dfrac{B_0}{V_0} \right) + k_e \left( \dfrac{S_0}{V_0} \right) = k_i \left( \dfrac{B_0}{B_0 + S_0} \right) + k_e \left( \dfrac{S_0}{B_0 + S_0} \right). \)

\( k_0 \) is called the "weighted" cost of capital and is the relevant number to use in the investment

(capital budgeting) decision of Sections VI and VII.

The question "Does financial structure 'matter'?" can be restated as "does k0 change for

different mixes of debt and equity (given a fixed investment policy)?" Or, for a given level of business or

asset risk, does changing the financial risk of equity change the total value of the firm?


"Extreme" Classical Theory: The "Net Income" Approach

Assumption: \( k_e \) and \( k_i \) are constants with \( k_e > k_i \)

Example: \( k_i = 5\% \); \( k_e = 10\% \); and net operating earnings = $1000

Case 1: \( B_0 = \$3000 \), which implies \( F = \$150 \), and so,

Net Operating Earnings                   $ 1000
minus interest payments                   (150)
Earnings available to equity             $  850
÷ k_e = .10                              ______
Market Value of Equity (S_0)             $ 8500
+ Market Value of Debt (B_0)               3000
Market Value of Firm (V_0)               $11500

\[ k_0 = k_i \left( \frac{B_0}{V_0} \right) + k_e \left( \frac{S_0}{V_0} \right) = (.05)\left( \frac{3000}{11500} \right) + (.10)\left( \frac{8500}{11500} \right) = \frac{1000}{11500} \approx 8.7\% \]

Leverage Factor \( \equiv \dfrac{B_0}{S_0} = \dfrac{3000}{8500} \approx .35 \)

Case 2: \( B_0 = \$6000 \), which implies \( F = \$300 \), and so,

Net Operating Earnings                   $ 1000
minus interest payments                   (300)
Earnings available to equity             $  700
÷ k_e = .10                              ______
Market Value of Equity (S_0)             $ 7000
+ Market Value of Debt (B_0)               6000
Market Value of Firm (V_0)               $13000

\[ k_0 = (.05)\left( \frac{6000}{13000} \right) + (.10)\left( \frac{7000}{13000} \right) = \frac{1000}{13000} \approx 7.7\% \]

Leverage Factor \( \equiv \dfrac{B_0}{S_0} = \dfrac{6000}{7000} \approx .86 \)
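The arithmetic of the two cases can be reproduced mechanically. The following Python sketch is only an illustration of the net income approach under its stated assumption that \( k_i \) and \( k_e \) are constants; the function name `net_income_valuation` and the dictionary keys are our own labels and do not appear in these Notes.

```python
def net_income_valuation(operating_earnings, debt_value, k_i, k_e):
    """Value a firm under the net income approach (k_i and k_e held constant)."""
    interest = k_i * debt_value                       # F = k_i * B_0
    equity_earnings = operating_earnings - interest   # E = X - F
    equity_value = equity_earnings / k_e              # S_0 = E / k_e
    firm_value = debt_value + equity_value            # V_0 = B_0 + S_0
    k_0 = operating_earnings / firm_value             # weighted cost of capital X / V_0
    leverage = debt_value / equity_value              # leverage factor B_0 / S_0
    return {"F": interest, "E": equity_earnings, "S0": equity_value,
            "V0": firm_value, "k0": k_0, "leverage": leverage}


case_1 = net_income_valuation(1000.0, 3000.0, k_i=0.05, k_e=0.10)
case_2 = net_income_valuation(1000.0, 6000.0, k_i=0.05, k_e=0.10)

for name, case in (("Case 1", case_1), ("Case 2", case_2)):
    print(name, {key: round(value, 4) for key, value in case.items()})
```

Because the approach holds \( k_e \) fixed while substituting cheaper debt for equity, the computed \( k_0 \) falls and \( V_0 \) rises as \( B_0 \) increases, which is exactly the feature of the approach that the later analysis questions.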

Classical Approach (generally) (as is illustrated in figure above)

1. Assumes that there is an optimal capital structure and hence, through the

appropriate choice of leverage, the value of the firm can be increased.

2. Assumes that beyond some point of leverage, ke rises at an increasing rate.

3. Assumes that beyond some point of leverage, ki may rise.


Modigliani-Miller (as is illustrated in figure above)

Assume perfect capital markets: equal information; no transactions costs; investors are

rational and believe everyone else is; free access to borrowing and lending; no taxes; firm debt is

default-free. Their basic propositions are:

(1) The total market value of the firm and its cost of capital, \( k_0 \), are independent of its capital

structure (the total market value of a firm is calculated by capitalizing the expected stream of

operating earnings at a discount rate appropriate for its risk class).


(2) The required expected return on equity, \( k_e \), is equal to the capitalization rate of a "pure" equity

stream, plus a premium for financial risk which equals the difference between the "pure" equity

capitalization rate and \( k_i \), times the leverage factor \( (B_0/S_0) \).

(3) Therefore, the "cut off" rate for asset selection (investment policy), \( k_0 \), is independent of

the financing decision.

Graphically, under the assumption that the debt has no risk of default (hence, ki is constant),

the M-M result is:

Proposition 2 can be derived formally from equation (IX.2) as follows:

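\[ k_0 = k_i \left( \frac{B_0}{V_0} \right) + k_e \left( \frac{S_0}{V_0} \right) \]

\[ k_e = k_0 \left( \frac{V_0}{S_0} \right) - k_i \left( \frac{B_0}{S_0} \right) = k_0 \left( \frac{B_0 + S_0}{S_0} \right) - k_i \left( \frac{B_0}{S_0} \right) \]

or

(IX.3) \( k_e = k_0 + \left( \dfrac{B_0}{S_0} \right) (k_0 - k_i) \)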
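As a numerical check on Proposition 2, the sketch below computes \( k_e \) from the relation \( k_e = k_0 + (B_0/S_0)(k_0 - k_i) \) for several debt levels and confirms that the weighted cost of capital stays at \( k_0 \). The rates, the firm value of $10,000, and the function names are hypothetical choices made only for this illustration.

```python
def mm_cost_of_equity(k_0, k_i, debt_value, equity_value):
    """Required return on equity: k_e = k_0 + (B_0 / S_0)(k_0 - k_i)."""
    return k_0 + (debt_value / equity_value) * (k_0 - k_i)


def weighted_cost_of_capital(k_i, k_e, debt_value, equity_value):
    """Weighted cost of capital: k_i * B_0 / V_0 + k_e * S_0 / V_0."""
    firm_value = debt_value + equity_value
    return k_i * debt_value / firm_value + k_e * equity_value / firm_value


k_0, k_i, firm_value = 0.10, 0.05, 10_000.0   # hypothetical inputs
rows = []
for debt_value in (0.0, 2_000.0, 5_000.0, 8_000.0):
    equity_value = firm_value - debt_value
    k_e = mm_cost_of_equity(k_0, k_i, debt_value, equity_value)
    wacc = weighted_cost_of_capital(k_i, k_e, debt_value, equity_value)
    rows.append((debt_value, k_e, wacc))
    print(f"B0={debt_value:7.0f}  B0/S0={debt_value / equity_value:5.2f}  "
          f"k_e={k_e:.4f}  k_0={wacc:.4f}")
```

The required return on equity rises linearly in the leverage factor while the weighted average is unchanged, which is the content of Propositions 1 and 2 taken together when \( k_i \) is constant.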


(First) Proof of M–M result by arbitrage

Suppose the borrowing and lending rates are equal and the same for all investors and corporations, i.e., \( k_i = r \equiv \) constant rate of interest (the debt is default-free). Consider two companies with identical anticipated earnings, i.e., \( X_1 = X_2 = X \). Suppose that company #1 has no debt (financed completely by equity) and company #2 has some debt.

              Company #1                              Company #2
Earnings:     \( X_1 = X \)                           \( X_2 = X \)
Debt:         \( B_0^1 = 0 \) (at rate \( k_i = r \))       \( B_0^2 > 0 \) (at rate \( k_i = r \))
Equity:       \( S_0^1 \; (= V_0^1) \)                \( S_0^2 \)
Firm Value:   \( V_0^1 \)                             \( V_0^2 \; (= B_0^2 + S_0^2) \)

Consider an investor who currently owns \( S_2 \) dollars of stock in company #2. If \( \alpha \) = fraction of the total shares of company #2 held by this investor, we have that \( S_2 = \alpha S_0^2 \).

Suppose that \( V_0^2 > V_0^1 \), i.e., by the "right" choice of leverage, company #2 will have a larger value than #1. If the investor continues to hold his present portfolio, his return in dollars, \( Y_2 \), will be his fractional claim, \( \alpha \), times the portion of earnings available to shareholders, \( X - rB_0^2 \). I.e.,

(IX.4) \( Y_2 = \alpha (X - rB_0^2) \)

The investor could sell out his present holdings and choose an alternative portfolio as follows:

Step 1: sell his current holding for \( S_2 \; (= \alpha S_0^2) \) dollars.

Step 2: borrow \( \alpha B_0^2 \) dollars.

Step 3: with the proceeds from steps 1 and 2, buy shares in company #1.

If \( S_1 \) = number of dollars invested in company #1's shares, then

\[ S_1 = S_2 + \alpha B_0^2 = \alpha S_0^2 + \alpha B_0^2 = \alpha (S_0^2 + B_0^2) = \alpha V_0^2 . \]


Step 4: as an owner of \( \$S_1 \) worth of shares of company #1, he will have claim on the fraction \( (S_1/S_0^1) \) of #1's earnings. Because #1 has no debt, \( S_0^1 = V_0^1 \) and so, he has claim on \( (S_1/S_0^1)X = (S_1/V_0^1)X \) dollars of return.

The return in dollars on his new portfolio, \( Y_1 \), will be

\[ Y_1 = \left( \frac{S_1}{V_0^1} \right) X - r \alpha B_0^2 = \left( \frac{\alpha V_0^2}{V_0^1} \right) X - r \alpha B_0^2 , \quad \text{from step 3} \]

(IX.5) \( Y_1 = \alpha \left[ \left( \dfrac{V_0^2}{V_0^1} \right) X - rB_0^2 \right] = \alpha (X - rB_0^2) + \alpha X \left( \dfrac{V_0^2 - V_0^1}{V_0^1} \right) \)

or

\[ Y_1 = Y_2 + \alpha X \left( \frac{V_0^2 - V_0^1}{V_0^1} \right) \]

so \( Y_1 > Y_2 \) for \( V_0^2 > V_0^1 \). Hence, we have demonstrated that for the same number of dollars invested in either case, if \( V_0^2 > V_0^1 \), then the investor can earn a higher return in the second (alternative) portfolio than in the first (current) portfolio, for every possible outcome of earnings, X. Therefore, rational investors will "switch" from the first portfolio to the second, selling shares of company #2 and buying shares of company #1, until \( V_0^2 \le V_0^1 \). The argument goes through in precisely the same way, if it was assumed that \( V_0^1 > V_0^2 \). Therefore, to avoid arbitrage or dominance,

(IX.6) \( V_0^1 = V_0^2 . \)
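The steps of the first proof can be traced numerically. The sketch below follows Steps 1 through 4 for an investor holding a fraction \( \alpha \) of company #2 and compares the dollar return on the alternative portfolio, \( Y_1 \), with the return on the current holding, \( Y_2 \). The firm values, the interest rate, and the earnings outcomes are hypothetical numbers chosen only for illustration, and the function name is our own.

```python
def arbitrage_returns(X, r, alpha, V1, V2, B2):
    """Return (Y1, Y2): dollar returns on the alternative and the current portfolio."""
    S2 = alpha * (V2 - B2)              # Step 1: proceeds from selling the stake in #2
    borrowed = alpha * B2               # Step 2: personal borrowing at rate r
    S1 = S2 + borrowed                  # Step 3: dollars of #1 bought, equal to alpha * V2
    Y2 = alpha * (X - r * B2)           # return on the current holding of #2
    Y1 = (S1 / V1) * X - r * borrowed   # Step 4: share of #1's earnings less interest
    return Y1, Y2


V1, V2, B2 = 10_000.0, 11_000.0, 4_000.0   # hypothetical: levered firm valued higher
r, alpha = 0.05, 0.01
outcomes = (500.0, 1_000.0, 1_500.0)       # possible (positive) earnings X

gaps = []
for X in outcomes:
    Y1, Y2 = arbitrage_returns(X, r, alpha, V1, V2, B2)
    gaps.append(Y1 - Y2)
    print(f"X={X:7.0f}  Y1={Y1:7.3f}  Y2={Y2:7.3f}  Y1-Y2={Y1 - Y2:6.3f}")
```

For each earnings outcome the gap equals \( \alpha X (V_0^2 - V_0^1)/V_0^1 \), and it vanishes when the two firm values are set equal, consistent with the no-dominance condition.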

A second proof of the M-M proposition using the same notation and company data as in

the first proof is as follows (where firm #1 has no debt and firm #2 has some debt):

Case I: Hold as an investment: \( \$S_1 \) of shares of company #1, where \( S_1 = \alpha S_0^1 = \alpha V_0^1 \). The return

from the investment will be \( \alpha X \).


Alternative Investment:

Transaction                                      Investment                                              Return
(1) (sell \( \$S_1 \) of #1 and) buy the same
    fraction \( \alpha \) of the shares of #2    \( \alpha S_0^2 \equiv \alpha [V_0^2 - B_0^2] \)        \( \alpha (X - rB_0^2) \)
(2) buy \( \alpha \) percent of the bonds
    of firm #2                                   \( \alpha B_0^2 \)                                      \( \alpha r B_0^2 \)
Total                                            \( \alpha V_0^2 \)                                      \( \alpha X \)

So, if \( V_0^1 > V_0^2 \), the investor gets the same return for \( \alpha (V_0^1 - V_0^2) \) fewer dollars invested.

Case II: Hold \( \alpha \)% of shares of firm #2: \( \$S_2 \) of shares of #2, where \( S_2 = \alpha S_0^2 = \alpha (V_0^2 - B_0^2) \). The return will be \( \alpha (X - rB_0^2) \).

Alternative Investment:

Transaction                                      Investment                                  Return
(1) buy fraction \( \alpha \) of the
    shares of #1                                 \( \alpha S_0^1 \equiv \alpha V_0^1 \)      \( \alpha X \)
(2) borrow \( \alpha B_0^2 \) dollars            \( -\alpha B_0^2 \)                         \( -\alpha r B_0^2 \)
Total                                            \( \alpha (V_0^1 - B_0^2) \)                \( \alpha (X - rB_0^2) \)

If \( V_0^2 > V_0^1 \), then the investor gets the same return for \( \alpha (V_0^2 - V_0^1) \) fewer dollars. Hence, to

avoid dominance or arbitrage, \( V_0^1 = V_0^2 \).


Thus, given their assumptions, M-M demonstrate, by a powerful arbitrage argument, that

the capital structure or financing decision among alternative instruments does not affect the

market value of the firm or its (average) cost of capital for determining which assets to purchase.

The intuition is that a purely financial transaction for a fixed amount of real assets should not

affect any "real" decisions or values. Or, if personal borrowing is a perfect substitute for

corporate borrowing, then M-M holds because investors will not pay more for firms that borrow

for them if they can do it themselves. The classical view of the capital structure simply assumes

that leverage "matters." M-M showed why it does not. To disagree with the conclusions of M-M

one must therefore disagree with their assumptions.

Items that could affect the M-M Conclusion

1. Tax deductibility of interest payments by the firm.

2. The risks of personal versus corporate leverage (limited liability and bankruptcy).

3. Cost of borrowing may be higher for the investor than the firm.

4. Institutional restrictions may prevent institutional investors from "levering". Margin

requirements restrict individuals.

5. Transactions costs in establishing the arbitrage position.

6. Moral Hazard: management makes decisions in the best interests of the shareholders that

may conflict with the interests of the bondholders, and therefore, reduce the overall

market value of the firm.

Of these items, by far, the most important are (1) and (2). In the proofs of M-M, it is

assumed that both corporate debt and personal borrowing are default-free, and therefore, investors

who levered equity by personal borrowing could exactly replicate the payoffs to investors who

held shares levered by corporate borrowing. As was already demonstrated in Section VIII, this

will no longer be the case when there is a possibility of default on the debt of the firm. To

review the difference between personal and corporate borrowing, we maintain the assumption


that personal borrowing is default-free and briefly reexamine the case of a pure discount term

loan:

Consider a personal term loan (with no interim interest payments) with a face value of B

dollars due at time T in the future. Let the firm (unlevered) have a current value of V0. Then

the value of levered equity, \( E_p \), would be \( E_p = V_0 - B/[1 + R(T)]^T \), and at time T in the future

if the firm is worth V(T), the payoffs to the debt and levered equity will be

Consider a corporate loan with the same terms except limited liability for the shareholders. The

payoffs are:


By inspection, the payoff to the debtholders in the corporate case is less favorable than in the

personal case. Correspondingly, the payoff to the equityholders in the corporate case is more

favorable than in the personal case. Therefore, the current value of the corporate-levered equity

will exceed the current value of the personal-levered equity, i.e., Ec > Ep. Correspondingly, the

current value of the corporate debt will be less valuable than the current value of the personal

debt, i.e., Dc < Dp.
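The payoff comparison can be made concrete with a short sketch. The payoff tables themselves are not reproduced in this copy of the Notes, so the two functions below are our reading of the surrounding text and should be taken as an assumption: the personal loan is default-free (the lender always receives B), while under limited liability the corporate lender receives the smaller of V(T) and B. The function names and the numbers are hypothetical.

```python
def personal_leverage_payoffs(firm_value_T, face_value):
    """Default-free personal loan: lender gets B, levered equity gets V(T) - B."""
    debt_payoff = face_value
    equity_payoff = firm_value_T - face_value     # can be negative
    return debt_payoff, equity_payoff


def corporate_leverage_payoffs(firm_value_T, face_value):
    """Limited liability: lender gets min(V(T), B), equity gets max(V(T) - B, 0)."""
    debt_payoff = min(firm_value_T, face_value)
    equity_payoff = max(firm_value_T - face_value, 0.0)
    return debt_payoff, equity_payoff


face_value = 100.0
differences = []
for firm_value_T in (60.0, 100.0, 140.0):
    D_p, E_p = personal_leverage_payoffs(firm_value_T, face_value)
    D_c, E_c = corporate_leverage_payoffs(firm_value_T, face_value)
    differences.append((firm_value_T, E_c - E_p, D_p - D_c))
    print(f"V(T)={firm_value_T:6.1f}  E_p={E_p:6.1f}  E_c={E_c:6.1f}  "
          f"D_p={D_p:6.1f}  D_c={D_c:6.1f}")
```

In every state the equity gain under limited liability equals the debtholders' shortfall, max(B - V(T), 0), which is the payoff of a put option on the firm with exercise price B.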

Inspection of the payoffs to Ep versus Ec shows that, in the event that V(T) < B,

Ec(T) – Ep(T) = B – V(T). As noted in Section VIII, the corporate-levered equityholders are

"insured" against losses that would occur for the personal-levered equityholders if V(T) < B. So,

we can write the value of the corporate-levered equity as

\[ E_c = E_p + g = V_0 - B/[1 + R(T)]^T + g \]

where g is the value of this "downside insurance" (i.e., the put option insurance premium).

Similarly, the corporate debtholder is not only lending money but, in addition, is "insuring" the

equityholder, i.e., we can write the value of corporate debt as

\[ D_c = D_p - g' = B/[1 + R(T)]^T - g' \]

where g′ is the "liability" associated with issuing the put insurance (its cost). If V0 is the value

of the unlevered firm and if g = g′, then M-M holds even when bankruptcy is possible. That is,

even in the presence of default possibilities, M-M will hold if either put options on the stock exist

or if these options can be created by low-transaction cost investment strategies. Of course, if

there are significant "dead-weight" losses to the firm's liability holders from a bankruptcy (e.g.,

attorney fees, disruption of the operations of the firm), then corporate leverage can matter.

This analysis should serve to underscore once again (as noted in Section VIII) that one

cannot compare the "true" or "economic" cost of the debt of one firm with that of another by

simply comparing promised yields on debt. That is, by definition, the promised yield (for the

period) is simply [B/Dc] – 1 which can be rewritten (for T = 1) as


(IX.7) \( \text{Promised yield} = \dfrac{B}{D_c} - 1 = \dfrac{RB + g(1+R)}{B - g(1+R)} = R + \dfrac{g(1+R)^2}{B - g(1+R)} \)

where g is the value of the (implicit) put option. If the value of the put option on one firm is

larger than the value of a corresponding put option on the other, then it is entirely possible that

the debt with the higher promised yield could have a lower economic cost than the debt with the

lower promised yield.
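A small numerical sketch may help fix the relation between the promised yield and the value of the implicit put. It assumes T = 1 and the decomposition \( D_c = B/(1+R) - g \) from the text above, and it checks the definition \( B/D_c - 1 \) against the closed form \( R + g(1+R)^2/[B - g(1+R)] \) that this decomposition implies. The face value, the riskless rate, and the put values are hypothetical.

```python
def promised_yield_from_price(face_value, riskless_rate, put_value):
    """Promised yield B / D_c - 1, with D_c = B / (1 + R) - g and T = 1."""
    debt_value = face_value / (1.0 + riskless_rate) - put_value
    return face_value / debt_value - 1.0


def promised_yield_closed_form(face_value, riskless_rate, put_value):
    """Equivalent closed form: R + g (1 + R)^2 / (B - g (1 + R))."""
    growth = 1.0 + riskless_rate
    return riskless_rate + put_value * growth ** 2 / (face_value - put_value * growth)


face_value, riskless_rate = 100.0, 0.05       # hypothetical inputs
yields = {}
for put_value in (0.0, 2.0, 5.0):
    direct = promised_yield_from_price(face_value, riskless_rate, put_value)
    closed = promised_yield_closed_form(face_value, riskless_rate, put_value)
    yields[put_value] = (direct, closed)
    print(f"g={put_value:4.1f}  promised yield={direct:.4%}  closed form={closed:.4%}")
```

With g = 0 the promised yield collapses to the riskless rate, and it rises with g even though the lender is being compensated for writing the put rather than receiving a higher expected return.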

The analysis also makes clear why the promised yield on a personal loan will be lower

than on a (comparable) corporate loan because in the former, the investor pledges all his assets

and in the latter (with limited liability) he pledges only his share of the corporate assets.

Effect of Corporate and Personal Taxes on the M-M Result

The federal tax law allows corporations and individuals to deduct interest payments from

their income before computing taxes. This tax shield is a subsidy to borrowers and may induce

corporations and individuals to borrow when they otherwise might not.

Taxation in the M-M Model

Let X = operating income (before interest and taxes) and XT = after-tax earnings before

interest; Tc = corporate tax rate; Tp = personal tax rate; Bc = "long-run" amount of debt

outstanding and R = interest payments. V0 = value of unlevered firm and V = value of levered

firm. Let k0 = required pre-tax expected return on the unlevered firm. Then

(IX.8) \( V_0 = \dfrac{X(1-T_c)(1-T_p)}{(1-T_p)k_0} = \dfrac{(1-T_c)X}{k_0} . \)

On the levered firm,

\[ X_T = (X - R)(1 - T_c) + R = (1 - T_c)X + T_c R \]


If the debt is riskless, MM argue that \( R = rB_c \); then the value of the debt is \( \dfrac{(1-T_p)rB_c}{(1-T_p)r} = B_c \),

and they show, by their arbitrage argument, that

(IX.9) \( V = \dfrac{(1-T_c)X}{k_0} + \dfrac{T_c R}{r} = V_0 + T_c B_c \)

In essence, because of the tax subsidy, the levered firm is equivalent to the unlevered firm plus a

certain number (TcBc) of riskless bonds. MM assume that all earnings are paid out as dividends

which are taxable at Tp. Moreover, they assumed that the magnitude of the tax shield is certain

which need not be so if Bc is "pegged" to V. The latter is not of substantive importance because

the value of the tax shield can be shown to equal TcBc, even if there is a possibility of default.

Alternatively to corporate borrowing, let all the flows be riskless and let Y = after-tax

income to an investor who maintains a fixed total leverage (corporate + personal borrowing =

constant) position. Let Bp = amount of personal borrowing. Then,

(IX.10) \( Y = [(X - rB_c)(1-T_c)(1-T_p) - rB_p(1-T_p)] \)

and the value of this stream will be

(IX.11) \( V(Y) = V_0 - \dfrac{r[B_c(1-T_c)(1-T_p) + B_p(1-T_p)]}{(1-T_p)r} = V_0 - B_c(1-T_c) - B_p \)

If \( B_c + B_p \) = constant, then \( dB_c = -dB_p \) and

(IX.12) \( \dfrac{dV}{dB_c} = -(1-T_c) + 1 = T_c \), as MM claim.

However, suppose that one pays tax at the capital gains rate, \( T_g \), on the income of the firm; then

(IX.13) \( Y = [(X - rB_c)(1-T_c)(1-T_g) - rB_p(1-T_p)] , \)

and if we capitalize at \( (1 - T_p)r \), then

(IX.14) \( V(Y) = V_0 - \dfrac{r[B_c(1-T_c)(1-T_g) + B_p(1-T_p)]}{r(1-T_p)} \) and

(IX.15) \( \dfrac{dV}{dB_c} = 1 - \dfrac{(1-T_c)(1-T_g)}{(1-T_p)} \), which is \( > 0 \) if \( T_p < T_c \), and otherwise is positive, zero, or negative depending on \( T_c \), \( T_g \), and \( T_p \).
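To see how the sign of the leverage effect depends on the three tax rates, the following sketch evaluates the marginal effect \( 1 - (1-T_c)(1-T_g)/(1-T_p) \) and cross-checks it against a one-dollar shift from personal to corporate borrowing in the capitalized value \( V_0 - [B_c(1-T_c)(1-T_g) + B_p(1-T_p)]/(1-T_p) \). All tax rates and dollar amounts are hypothetical, and the function names are our own.

```python
def marginal_value_of_corporate_debt(T_c, T_p, T_g):
    """dV/dB_c = 1 - (1 - T_c)(1 - T_g) / (1 - T_p), with B_c + B_p held constant."""
    return 1.0 - (1.0 - T_c) * (1.0 - T_g) / (1.0 - T_p)


def capitalized_value(V_0, B_c, B_p, T_c, T_p, T_g):
    """V(Y) = V_0 - [B_c(1-T_c)(1-T_g) + B_p(1-T_p)] / (1-T_p); the rate r cancels."""
    return V_0 - (B_c * (1.0 - T_c) * (1.0 - T_g) + B_p * (1.0 - T_p)) / (1.0 - T_p)


# Hypothetical tax regimes: (T_c, T_p, T_g)
regimes = {
    "equity income taxed at T_p (MM case)": (0.34, 0.28, 0.28),
    "low capital gains rate, T_p < T_c": (0.34, 0.28, 0.10),
    "low capital gains rate, T_p > T_c": (0.34, 0.50, 0.10),
}
effects = {}
for label, (T_c, T_p, T_g) in regimes.items():
    analytic = marginal_value_of_corporate_debt(T_c, T_p, T_g)
    # Shift one dollar of borrowing from the investor to the firm.
    shifted = capitalized_value(1000.0, 301.0, 199.0, T_c, T_p, T_g)
    base = capitalized_value(1000.0, 300.0, 200.0, T_c, T_p, T_g)
    effects[label] = (analytic, shifted - base)
    print(f"{label:40s} dV/dB_c = {analytic:+.4f}")
```

When equity income is taxed at the personal rate the effect reduces to \( T_c \); with a sufficiently low capital gains rate and a high personal rate it can turn negative, so that corporate borrowing lowers value.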

In summary, while the theoretical and empirical evidence is hardly conclusive on whether

or not capital structure matters, it is probably a reasonable conclusion that generally, the effects

of capital structure on the firm's cost of capital will not be large enough to make a capital

budgeting project worth undertaking when it would not have been undertaken if financed entirely

by equity. There are, of course, exceptions to this general rule especially when projects are

subsidized by government and the subsidy takes the form of below-market interest rate loans,

loan guarantees, or tax exemption for corporate debt.

In completing this section, we present another example of the care that must be exercised

in computing the cost of borrowing. The example is that of a bank loan with compensating

balances and line fees.

Problem IX.1. On the Cost of Bank Borrowing

Loan Commitment = Maximum that can be Borrowed = "Line" ≡ L

Principal Amount Borrowed = Gross Borrowings ≡ B

Stated Interest Rate on Loan ≡ R = r + δ

where r = "prime" rate and δ ≡ amount "over prime" charged.


\( \overline{CB} \equiv \) required (by the bank) amount to be kept on deposit in free balances in the

form of noninterest-bearing demand deposits. ("Compensating Balances")

= \( c_1 L + c_2 B \) (i.e., a fraction of the line plus a fraction of the principal)

P ≡ penalty charged for not maintaining sufficient compensating balances

= \( R_p[\overline{CB} - CB] \)

where \( R_p \equiv \) penalty rate and CB ≡ compensating balances actually maintained.

D ≡ amount of noninterest-bearing demand deposits maintained by firm

d ≡ amount of noninterest-bearing demand deposits which would have been maintained by

the firm even if there were no loans.

Of each $1 deposited, $.16 must be maintained at the Federal Reserve, so that only $.84

represent free-balances.

Therefore, CB = .84D or D = CB/.84 ≈ 1.19CB

Fee is payable to the bank for the unused part of the line [i.e., L–B]. Let

\( R_L \equiv \) rate paid as a line fee

M ≡ actual amount of money available for corporate purposes

(IX.16) \( M = B - D + d \)

I ≡ $ charges paid for money

(IX.17) \( I = R \cdot B + R_L (L - B) + R_p [\overline{CB} - CB] \)

Let \( R_T \equiv \) the "true" interest rate cost of borrowing = I/M


(IX.18) \( R_T = \{ [R - R_L + c_2 R_p] B + [R_L + c_1 R_p] L - R_p CB \} / M \)

(IX.18') \( R_T = \{ [R - R_L + c_2 R_p] B + [R_L + c_1 R_p] L - .84 R_p D \} / [B - D + d] \)

Should the firm maintain the compensating balance or pay the penalty? [i.e., which D should be

chosen for \( d \le D \le 1.19\,\overline{CB} + d \)]. Holding fixed the amount of money available for corporate

purposes, M, how is \( R_T \) affected by the choice of D?

\[ dR_T = \{ [R - R_L + c_2 R_p]\, dB - .84 R_p\, dD \} / M - \frac{R_T}{M} [dB - dD] \]

But \( dM = 0 \Rightarrow dB = dD \). Therefore,

(IX.19) \( \left( \dfrac{dR_T}{dD} \right)_{M \text{ fixed}} = [R - R_L - (.84 - c_2) R_p] / M \)

\[ \left( \frac{dR_T}{dD} \right)_M < 0 \quad \text{if } R_p > \frac{R - R_L}{.84 - c_2} \;\Rightarrow\; D_{\text{optimum}} = 1.19\,\overline{CB} + d \]

\[ \left( \frac{dR_T}{dD} \right)_M = 0 \quad \text{if } R_p = \frac{R - R_L}{.84 - c_2} \;\Rightarrow\; \text{indifference w.r.t. choice of } D \]

\[ \left( \frac{dR_T}{dD} \right)_M > 0 \quad \text{if } R_p < \frac{R - R_L}{.84 - c_2} \;\Rightarrow\; D_{\text{optimum}} = d \]

A Numerical Example: Prime = r = 18%; δ = 2% so that R = 20%


Compensating Balance Requirement: 10% of the Line plus 10% of the Principal

[i.e., c1 = c2 = .10]

Compensating Balance Penalty Rate: Rp = [R – RL]/[.84 – c2]

Line fee: RL = 0.5%; Payments of interest and fees once a year.

Line = L = $10,000,000; d = 0.

Since \( R_p \) is such that for fixed M, the level of deposits has no effect upon \( R_T \), assume that D

is chosen such that \( D = \overline{CB}/.84 \) (i.e., no penalties)

Amount for Corporate Purposes (M)     "True" Interest Cost \( R_T \)
$ 1,000,000                           53.49%
  2,000,000                           37.81
  3,000,000                           32.59
  4,000,000                           29.97
  5,000,000                           28.41
  6,000,000                           27.36
  7,000,000                           26.61
  8,000,000                           26.05
  9,000,000                           25.62
 10,000,000                           25.27

This \( R_T \) should be compared with the "stated" rate of R = 20%

Note: \( \dfrac{\partial R_T}{\partial r} \approx 1.25 \)

[i.e., an increase in prime of 100 basis points will cause (at least) a 125 basis point increase in the

cost of the loan.]
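The table can be regenerated from the definitions of the problem. The sketch below assumes, as in the example, that deposits just meet the requirement (so no penalty is paid), solves M = B - D + d for the gross borrowing B, and then computes I/M with I = R·B + R_L(L - B). The function and variable names are our own; note also that for the larger values of M the implied B exceeds the line L, and the sketch simply applies the formula as written, which appears to be what the table does.

```python
FREE_BALANCE_FRACTION = 0.84   # $.84 of each deposited dollar counts as free balances


def true_interest_cost(M, L, R, R_L, c1, c2, d=0.0):
    """True cost R_T = I / M when D = (c1*L + c2*B) / .84, i.e. no penalty is paid."""
    # M = B - D + d with D = (c1*L + c2*B) / .84, solved for gross borrowing B.
    B = (M - d + c1 * L / FREE_BALANCE_FRACTION) / (1.0 - c2 / FREE_BALANCE_FRACTION)
    interest_and_fees = R * B + R_L * (L - B)    # I, with the penalty term equal to zero
    return interest_and_fees / M


L, R, R_L, c1, c2 = 10_000_000.0, 0.20, 0.005, 0.10, 0.10
amounts = [1_000_000.0 * k for k in range(1, 11)]
costs_percent = [100.0 * true_interest_cost(M, L, R, R_L, c1, c2) for M in amounts]

for M, cost in zip(amounts, costs_percent):
    print(f"M = ${M:>12,.0f}   R_T = {cost:5.2f}%")
```

The fixed component of the compensating balance (the fraction of the line) is spread over more usable dollars as M grows, which is why the true cost falls toward, but stays well above, the stated 20% rate.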


X. THE INVESTOR'S DECISION UNDER UNCERTAINTY: PORTFOLIO SELECTION

By assuming certainty and perfect exchange markets in Sections II-VII, the optimal

consumption and investment decisions by households were derived; a rational criterion function

for the firm was deduced and rules for investment choice by firms were established. Beginning

with Section VIII, and for the balance of these Notes, the certainty assumption is dropped. The

introduction of uncertainty substantially complicates decision making by all economic units. As

a result, the structure of the capital market and the types of financial instruments and

intermediaries required for an efficiently functioning economy are greatly expanded. As in the

certainty case, we begin with the analysis of the individual household or consumer allocation

problem. To do so, we postulate that the criterion of choice for individuals satisfies the von

Neumann-Morgenstern Expected Utility Maxim. That is, in choosing among uncertain

alternatives, each person's rankings of those alternatives can be represented by computing the

expected value of some utility function of the random variable payoffs to these alternatives.

In making an allocation of his wealth, the investor has many assets to choose from, and

within limits of divisibility and transactions costs, he can choose mixes or combinations of these

assets to form alternative portfolios. The solution to the general problem of selecting the best

asset mix is called portfolio theory. It takes as given the menu of available assets where assets

are operationally-defined by their joint probability distribution of end-of-period values. Thus,

strictly defined, portfolio theory has nothing to say about where these distributions come from or

about why some assets exist and others do not. However, we will use the theory in an

equilibrium context to deduce certain properties of these distributions; to determine what

information about the distributions is required by investors to make optimal decisions; and to

answer (at least in part) the question of why certain types of assets exist.

The basic formulation of the portfolio selection problem is as follows: Assume that the

investor has a von Neumann-Morgenstern utility function for end-of-period wealth, U(W) and

assume further that U is strictly concave (such investors are called "globally risk averse"). Let \( W_0 \)


denote his initial wealth and suppose that there are n different assets or securities available in units called "shares". Let \( P_{0i} \) denote the current price per share of asset i, i = 1,2,...,n, which is known. The investor can buy or sell all the shares he wants at the current price (i.e., he acts as a "price taker"). Denote by \( N_i \) the number of shares (or units) of the ith security that he chooses to purchase. His set of feasible choices is determined by his budget constraint: \( W_0 = \sum_{i=1}^{n} N_i P_{0i} \). Suppose that he has a probability distribution for the end-of-period price per share, \( P_{1i} \), for each asset. Then, his end-of-period wealth will be the random variable \( W_1 = \sum_{i=1}^{n} N_i P_{1i} \). Define \( w_i \equiv \dfrac{N_i P_{0i}}{W_0} \) to be the fraction of his wealth invested in the ith security and define the (random) variable return (per dollar invested) in the ith asset to be \( Z_i \equiv \dfrac{P_{1i}}{P_{0i}} \). Then, we can write the expression for \( W_1 \) as

(X.1) \( W_1 = \sum_{i=1}^{n} N_i P_{1i} = \sum_{i=1}^{n} \left( \dfrac{N_i P_{0i}}{W_0} \right) \left( \dfrac{P_{1i}}{P_{0i}} \right) W_0 = W_0 \sum_{i=1}^{n} w_i Z_i \)

Note: by definition of \( w_i \) and the budget constraint: \( \sum_{i=1}^{n} w_i = 1 \).

So the investor's problem of portfolio selection can be written as either

(X.2) \( \max_{\{N_1, N_2, \ldots, N_n\}} E\left\{ U\left( \sum_{i=1}^{n} N_i P_{1i} \right) \right\} \) subject to \( W_0 = \sum_{i=1}^{n} N_i P_{0i} \)

or equivalently

(X.2′) \( \max_{\{w_1, w_2, \ldots, w_n\}} E\left\{ U\left( W_0 \sum_{i=1}^{n} w_i Z_i \right) \right\} \) subject to \( \sum_{i=1}^{n} w_i = 1 . \)
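The equivalence of the share formulation and the portfolio-weight formulation rests on the identity \( W_1 = W_0 \sum w_i Z_i \). The sketch below verifies it for one realization of end-of-period prices. The share holdings and the prices are hypothetical, and the function name is ours.

```python
def portfolio_summary(shares, prices_now, prices_end):
    """Return (W_0, weights w_i, returns Z_i, W_1) for one realization of end prices."""
    initial_wealth = sum(n * p for n, p in zip(shares, prices_now))   # budget constraint
    weights = [n * p / initial_wealth for n, p in zip(shares, prices_now)]
    returns = [p_end / p_now for p_now, p_end in zip(prices_now, prices_end)]
    end_wealth = sum(n * p for n, p in zip(shares, prices_end))
    return initial_wealth, weights, returns, end_wealth


shares = [100.0, 20.0, 50.0]        # N_i, hypothetical
prices_now = [10.0, 25.0, 40.0]     # current prices per share, hypothetical
prices_end = [11.0, 24.0, 42.0]     # one possible set of end-of-period prices

W0, w, Z, W1 = portfolio_summary(shares, prices_now, prices_end)
W1_from_weights = W0 * sum(w_i * z_i for w_i, z_i in zip(w, Z))

print("W0 =", W0, " weights sum to", sum(w))
print("W1 (shares) =", W1, " W1 (weights) =", W1_from_weights)
```

Working with weights and per-dollar returns removes the share counts and price levels from the problem, which is why the remainder of the analysis is carried out in terms of \( w_i \) and \( Z_i \).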


Suppose that one of the available securities is "riskless" and offers a return per dollar of R with certainty. If, by convention, we choose the nth security to be riskless, then we have that \( Z_n = \dfrac{P_{1n}}{P_{0n}} = R \) with certainty. Note that \( 1 = \sum_{1}^{n} w_i = \sum_{1}^{n-1} w_i + w_n \) or \( w_n = 1 - \sum_{1}^{m} w_i \) where \( m \equiv n - 1 \). Define: \( Z \equiv \dfrac{W_1}{W_0} = \sum_{1}^{n} w_i Z_i \) = return per dollar on the portfolio. With a riskless asset, we can rewrite Z as

\[ Z = \sum_{1}^{n} w_i Z_i = \sum_{1}^{m} w_i Z_i + w_n Z_n = \sum_{1}^{m} w_i Z_i + \left( 1 - \sum_{1}^{m} w_i \right) Z_n = \sum_{1}^{m} w_i (Z_i - Z_n) + Z_n = \sum_{1}^{m} w_i (Z_i - R) + R . \]

Hence, the investor problem becomes:

(X.3) \( \max_{\{w_1, w_2, \ldots, w_m\}} E\left\{ U\left( W_0 \left[ \sum_{1}^{m} w_i (Z_i - R) + R \right] \right) \right\} \)

where \( \{ w_1, \ldots, w_m \} \) are "free" decision variables since the budget constraint was substituted out. To solve (X.3), the usual calculus technique gives us that at the maximum

(X.4)

\[ \frac{\partial (EU)}{\partial w_1} = 0 = E\{ U'(W_1)(Z_1 - R) \} \]

\[ \frac{\partial (EU)}{\partial w_2} = 0 = E\{ U'(W_1)(Z_2 - R) \} \]

\[ \vdots \]

\[ \frac{\partial (EU)}{\partial w_m} = 0 = E\{ U'(W_1)(Z_m - R) \} \]

or

(X.4') \( E\{ U'(W_1) Z_i \} = R\, E\{ U'(W_1) \} , \quad i = 1, 2, \ldots, m . \)

Call \( \{ w_1^*, w_2^*, \ldots, w_m^* \} \) (and by the budget constraint, \( w_n^* = 1 - \sum_{1}^{m} w_i^* \)) the optimal portfolio (proportions) which will be the solution to (X.4). Suppose some \( w_i^* < 0 \). This means that the investor will short-sell security i. To short-sell, the investor borrows shares today and sells them (using the proceeds of the sale to purchase other securities). He must return the same number of units of that security at the end of the period. Unlike borrowing money, the short-seller has a liability for returning the specified number of shares instead of a specified number of dollars. If \( w_n^* < 0 \), then we call it


borrowing. Since the nth asset is riskless, the liability can be denoted as a specified dollar amount.

Let \( Z^* \equiv \sum_{1}^{m} w_i^* (Z_i - R) + R \) = return per dollar on the optimal portfolio.

\[ E\{ U'(W_1) Z^* \} = E\left\{ U'(W_1) \left( \sum_{1}^{m} w_i^* (Z_i - R) + R \right) \right\} = \sum_{1}^{m} w_i^* E\{ U'(W_1)(Z_i - R) \} + R\, E\{ U'(W_1) \} . \]

But from (X.4), \( E\{ U'(W_1)(Z_i - R) \} = 0 \), i = 1,2,...,m. So

(X.5) \( E\{ U'(W_1) Z^* \} = R\, E\{ U'(W_1) \} . \)

Example: Quadratic Utility

\[ U(W) = W - \frac{b}{2} W^2 , \quad 0 \le W \le 1/b , \; b > 0 ; \qquad U(W) = \frac{1}{2b} , \quad W > \frac{1}{b} . \]

Suppose that there are just two securities: asset #1 has (uncertain) return \( Z_1 \) and asset #2 has a certain return \( Z_2 = R \). If \( w_1 = w \) = the fraction of his wealth invested in the "risky" asset #1 and \( w_2 = 1 - w_1 = 1 - w \) = fraction of his wealth invested in the "safe" asset #2, then the investor's portfolio problem can be described by:

\[ \max_{\{w\}} E\{ U(W_1) \} = \max_{\{w\}} E\{ U( W_0 [ w(Z_1 - R) + R ] ) \} \]

\[ = \max_{\{w\}} E\left\{ W_0 [ w(Z_1 - R) + R ] - \frac{b}{2} W_0^2 \left[ w^2 (Z_1 - R)^2 + 2wR(Z_1 - R) + R^2 \right] \right\} \]

Maximizing, by the usual calculus, gives the condition that \( w = w^* \), the optimal portfolio, when

\[ 0 = \frac{dEU}{dw} = E\left\{ W_0 (Z_1 - R) - b W_0^2 \left[ w^* (Z_1 - R)^2 + R(Z_1 - R) \right] \right\} \]


or by dividing by \( W_0 \) and rearranging terms

\[ E(Z_1 - R) = b W_0 w^* E\{ (Z_1 - R)^2 \} + b W_0 R\, E(Z_1 - R) \]

or

(X.6) \( w^* = \dfrac{(1 - b W_0 R)\, E[Z_1 - R]}{b W_0\, E[(Z_1 - R)^2]} \)

If \( R W_0 < \dfrac{1}{b} \) (i.e., if it is not possible to achieve the "satiation" level of wealth for certain), then a necessary condition for him to hold some amount of the "risky" asset (i.e., \( w^* > 0 \)) is that \( E(Z_1 - R) > 0 \) or \( E(Z_1) > R \), i.e., the expected return on the "risky" asset must exceed the certain return R. Note: the expected return on the portfolio \( E(Z^*) = E[ w^* (Z_1 - R) + R ] > R \) for \( w^* > 0 \) only if \( E(Z_1) > R \).
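The closed-form solution for the quadratic case can be checked against the first-order condition \( E\{U'(W_1)(Z_1 - R)\} = 0 \), with \( U'(W) = 1 - bW \). The sketch below uses a hypothetical two-outcome distribution for \( Z_1 \) and hypothetical values of b, \( W_0 \), and R, chosen so that end-of-period wealth stays below the satiation level 1/b; the function names are our own.

```python
def quadratic_optimal_weight(b, W0, R, outcomes, probabilities):
    """w* = (1 - b W0 R) E[Z1 - R] / (b W0 E[(Z1 - R)^2]) for a discrete Z1."""
    excess_mean = sum(p * (z - R) for z, p in zip(outcomes, probabilities))
    excess_second_moment = sum(p * (z - R) ** 2 for z, p in zip(outcomes, probabilities))
    return (1.0 - b * W0 * R) * excess_mean / (b * W0 * excess_second_moment)


def first_order_condition(w, b, W0, R, outcomes, probabilities):
    """E{U'(W1)(Z1 - R)} with U'(W) = 1 - bW and W1 = W0 [w (Z1 - R) + R]."""
    total = 0.0
    for z, p in zip(outcomes, probabilities):
        end_wealth = W0 * (w * (z - R) + R)
        total += p * (1.0 - b * end_wealth) * (z - R)
    return total


b, W0, R = 0.5, 1.0, 1.05                 # hypothetical; satiation wealth is 1/b = 2
outcomes, probabilities = (0.90, 1.30), (0.5, 0.5)

w_star = quadratic_optimal_weight(b, W0, R, outcomes, probabilities)
foc_at_optimum = first_order_condition(w_star, b, W0, R, outcomes, probabilities)
end_wealths = [W0 * (w_star * (z - R) + R) for z in outcomes]

print("w* =", round(w_star, 6))
print("first-order condition at w*:", foc_at_optimum)
print("end-of-period wealth by outcome:", [round(x, 5) for x in end_wealths])
```

Here \( E(Z_1) = 1.10 > R \), so the optimal holding of the risky asset is positive; it exceeds one, meaning that the investor borrows at the riskless rate to finance part of the position.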

Mean-Variance Portfolio Selection and the Effects of Diversification

While in general, the expected utility maxim requires knowledge of the complete joint

distribution of asset returns to determine the optimal portfolio, under certain conditions, it is

sufficient to know only the first two moments of the joint distribution. That is, the criterion

function for choice can be written as a function of just the expected value (mean) of end-of-period wealth and the variance of end-of-period wealth (\( V[E(W_1), \mathrm{Var}(W_1)] \) where V is an

increasing function of its first argument and a decreasing function of its second argument).

Under these conditions, the choice problem is called the Mean-Variance portfolio selection

problem. While beyond the scope of these Notes, it can be shown that if the time interval

between successive portfolio revisions is small, the optimal portfolio choice can be well-

approximated by the mean-variance problem's solution. For the balance of these Notes, we shall

focus exclusively upon environments in which the mean-variance criterion is appropriate.


The mean-variance model is the first step in introducing uncertainty quantitatively into

the ranking of portfolios or investments. Classical methods of ranking investments use a single

parameter measure such as (expected) rate of return or (expected) present discounted value.

Although such one parameter measures make ranking quite easy (highest to lowest), they clearly

do not reflect differences among alternatives due to uncertainty. With a two parameter ranking,

there is no simple ranking like highest to lowest. The purpose of the mean-variance model is to

determine optimal portfolios and to make explicit the tradeoff between risk and return.

A digression on some characteristics of probability distributions

1. The first moment or expected value or the mean of the random variable X, which can take on value $x_1$ with probability $P_1$; value $x_2$ with probability $P_2$; ...; value $x_n$ with probability $P_n$, is defined as

$$E(X) = \bar{X} = \sum_{i=1}^{n} P_i x_i$$

2. The second (central) moment or variance of the random variable X is defined by

$$\sigma_X^2 = E[(X-\bar{X})^2] = \sum_{i=1}^{n} P_i (x_i - \bar{X})^2.$$

The standard deviation, $\sigma_X$, is a measure of the dispersion of possible outcomes around the expected value, as is illustrated in Figure X.1.


Figure X.1.a Figure X.1.b

In Figure X.1, the distribution in (b) is more disperse than in (a), and (b) has a larger

standard deviation (and variance) than (a). Note: in the special case when there is only

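one possible outcome for X, call it y, then $y = \bar{X} = E(X)$ and $\sigma_X^2 = \sigma_X = 0$. An alternative useful formula for $\sigma_X^2$ is:

$$\begin{aligned}
\sigma_X^2 &= \sum_{i=1}^{n} P_i (x_i - \bar{X})^2 = \sum_{i=1}^{n} P_i (x_i^2 - 2\bar{X}x_i + \bar{X}^2) \\
&= \sum_{i=1}^{n} P_i x_i^2 - 2\bar{X}\sum_{i=1}^{n} P_i x_i + \bar{X}^2 \sum_{i=1}^{n} P_i \\
&= E[X^2] - 2\bar{X}^2 + \bar{X}^2 = E[X^2] - \bar{X}^2, \quad \text{where } E[X^2] = \sum_{i=1}^{n} P_i x_i^2 .
\end{aligned}$$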

3. The covariance between two random variables X and Z is defined by

$$\sigma_{XZ} = \mathrm{Cov}(X,Z) = E\{(X-\bar{X})(Z-\bar{Z})\} = \sum_{i=1}^{n}\sum_{j=1}^{m} P_{ij}(x_i-\bar{X})(z_j-\bar{Z})$$

where $P_{ij}$ is the probability that $X = x_i$ and $Z = z_j$. It is a measure of the co-relationship between variations in possible outcomes of the two random variables X and Z around their expected outcomes.

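Note: (1) the covariance is symmetric, i.e.,

$$\mathrm{Cov}(X,Z) = E\{(X-\bar{X})(Z-\bar{Z})\} = E\{(Z-\bar{Z})(X-\bar{X})\} = \mathrm{Cov}(Z,X)$$

(2) the covariance of a random variable with itself is the variance, i.e.,

$$\mathrm{Cov}(X,X) = E\{(X-\bar{X})(X-\bar{X})\} = E\{(X-\bar{X})^2\} = \sigma_X^2$$

(3) if X and Z are independent, then $\mathrm{Cov}(X,Z) = 0$. I.e., if they are independent, then $P_{ij} = P_i^X P_j^Z$ where $P_i^X = \mathrm{prob}\{X = x_i\}$ and $P_j^Z = \mathrm{prob}\{Z = z_j\}$. Hence,

$$\mathrm{Cov}(X,Z) = \sum_{i=1}^{n}\sum_{j=1}^{m} P_{ij}(x_i-\bar{X})(z_j-\bar{Z}) = \Big[\sum_{i=1}^{n} P_i^X (x_i-\bar{X})\Big]\Big[\sum_{j=1}^{m} P_j^Z (z_j-\bar{Z})\Big]$$

but

$$\sum_{i=1}^{n} P_i^X (x_i-\bar{X}) = \sum_{i=1}^{n} P_i^X x_i - \bar{X}\sum_{i=1}^{n} P_i^X = \bar{X} - \bar{X} = 0$$

so $\mathrm{Cov}(X,Z) = 0$.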

4. The correlation coefficient between the random variables X and Z is defined by

$$\rho_{XZ} \equiv \frac{\mathrm{Cov}(X,Z)}{\sigma_X \sigma_Z} = \frac{\sigma_{XZ}}{\sigma_X \sigma_Z}.$$

Roughly, its absolute magnitude represents the percentage of the variation in X explained by the variation in Z. Its sign says whether X and Z tend to move in the same or opposite directions. Note: $\rho_{XZ} = \rho_{ZX}$; $|\rho_{XZ}| \le 1$. If $\rho_{XZ} = 1$, then X and Z are perfectly positively correlated and if $\rho_{XZ} = -1$, then X and Z are perfectly negatively correlated. If X and Z are independent, $\rho_{XZ} = 0$ and X and Z are said to be uncorrelated. Of course, $\rho_{XX} = 1$.
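The four definitions in this digression can be checked numerically. The following is a minimal sketch in Python, assuming small hypothetical two-outcome distributions (the outcome values, the joint probability tables and all function names are illustrative, not taken from the Notes). It evaluates the mean, variance, covariance and correlation directly from the sums above.

```python
from math import sqrt

def mean(probs, values):
    """E(X) = sum_i P_i * x_i."""
    return sum(p * x for p, x in zip(probs, values))

def variance(probs, values):
    """sigma_X^2 = sum_i P_i * (x_i - Xbar)^2."""
    xbar = mean(probs, values)
    return sum(p * (x - xbar) ** 2 for p, x in zip(probs, values))

def covariance(joint, xs, zs):
    """Cov(X, Z) = sum_i sum_j P_ij * (x_i - Xbar) * (z_j - Zbar)."""
    px = [sum(row) for row in joint]          # marginal probabilities of X
    pz = [sum(col) for col in zip(*joint)]    # marginal probabilities of Z
    xbar, zbar = mean(px, xs), mean(pz, zs)
    return sum(joint[i][j] * (xs[i] - xbar) * (zs[j] - zbar)
               for i in range(len(xs)) for j in range(len(zs)))

def correlation(joint, xs, zs):
    """rho_XZ = Cov(X, Z) / (sigma_X * sigma_Z)."""
    px = [sum(row) for row in joint]
    pz = [sum(col) for col in zip(*joint)]
    return covariance(joint, xs, zs) / sqrt(variance(px, xs) * variance(pz, zs))

xs = [0.0, 0.2]    # hypothetical outcomes of X
zs = [-0.1, 0.3]   # hypothetical outcomes of Z

# Independent case: P_ij = P_i^X * P_j^Z with P^X = (0.5, 0.5), P^Z = (0.25, 0.75).
independent = [[0.125, 0.375], [0.125, 0.375]]
# Perfectly dependent case: Z is high exactly when X is high.
dependent = [[0.5, 0.0], [0.0, 0.5]]

px = [0.5, 0.5]
var_definition = variance(px, xs)
var_alternative = mean(px, [x * x for x in xs]) - mean(px, xs) ** 2  # E[X^2] - Xbar^2

cov_independent = covariance(independent, xs, zs)
rho_dependent = correlation(dependent, xs, zs)

print(cov_independent, rho_dependent, var_definition, var_alternative)
```

Up to floating-point rounding, the independent table gives a covariance of zero, the perfectly dependent table gives a correlation of one, and the two variance formulas agree.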


Characteristics of Portfolios and the Effects of Diversification

Consider the case of n assets with (random) variable returns (per dollar) $Z_i$. As done in the general case, if $w_i$ = fraction (of each dollar invested in the portfolio) invested in the ith asset, then we can define $Z \equiv \sum_{i=1}^{n} w_i Z_i$ as the (random) variable return (per dollar invested) on the composite security or portfolio, with proportions $(w_1, w_2, ..., w_n)$ invested in each asset. The expected return on the portfolio, $E(Z) = \bar{Z}$, is

$$E(Z) = E\Big[\sum_{i=1}^{n} w_i Z_i\Big] = \sum_{i=1}^{n} w_i E(Z_i) = \sum_{i=1}^{n} w_i \bar{Z}_i ,$$

the weighted sum of the expected returns on the individual assets. Let $\sigma_i^2$ be the variance of the return on the ith asset (i.e., $\sigma_i^2 = E\{[Z_i - \bar{Z}_i]^2\}$) and $\sigma_{ij}$ be the covariance between the returns on the ith and jth assets (i.e., $\sigma_{ij} \equiv E\{(Z_i-\bar{Z}_i)(Z_j-\bar{Z}_j)\}$ and $\sigma_{ii} = \sigma_i^2$). Then, the variance of the portfolio can be computed to be

$$\sigma^2 = E[(Z-\bar{Z})^2] = E\Big[\Big(\sum_{i=1}^{n} w_i (Z_i - \bar{Z}_i)\Big)^2\Big].$$

Note:

$$Z - \bar{Z} = \sum_{i=1}^{n} w_i Z_i - \sum_{i=1}^{n} w_i \bar{Z}_i = \sum_{i=1}^{n} w_i (Z_i - \bar{Z}_i)$$

$$\Big[\sum_{i=1}^{n} w_i (Z_i-\bar{Z}_i)\Big]^2 = \sum_{i=1}^{n} w_i (Z_i-\bar{Z}_i) \sum_{j=1}^{n} w_j (Z_j-\bar{Z}_j) = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j (Z_i-\bar{Z}_i)(Z_j-\bar{Z}_j)$$

Hence,

$$\begin{aligned}
\sigma^2 &= E\Big\{\Big[\sum_{i=1}^{n} w_i (Z_i-\bar{Z}_i)\Big]^2\Big\} = E\Big[\sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j (Z_i-\bar{Z}_i)(Z_j-\bar{Z}_j)\Big] \\
&= \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j E[(Z_i-\bar{Z}_i)(Z_j-\bar{Z}_j)] = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} .
\end{aligned}$$
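As a numerical illustration of the result just derived, the sketch below compares the double-sum formula for the portfolio variance with the variance computed directly from the portfolio's own returns. It assumes a hypothetical world of four equally likely states and three assets; the return table and weights are made-up numbers chosen only for illustration.

```python
# Hypothetical data: returns[s][i] = return per dollar on asset i in state s.
probs = [0.25, 0.25, 0.25, 0.25]
returns = [
    [1.10, 1.05, 0.98],
    [0.95, 1.02, 1.07],
    [1.20, 0.99, 1.01],
    [1.00, 1.08, 1.03],
]
w = [0.5, 0.3, 0.2]   # portfolio proportions, summing to one
n = len(w)

# Expected returns Zbar_i and covariances sigma_ij of the individual assets.
zbar = [sum(p * r[i] for p, r in zip(probs, returns)) for i in range(n)]
cov = [[sum(p * (r[i] - zbar[i]) * (r[j] - zbar[j]) for p, r in zip(probs, returns))
        for j in range(n)] for i in range(n)]

# Formulas from the text: weighted sum of means, double sum of w_i * w_j * sigma_ij.
mean_formula = sum(w[i] * zbar[i] for i in range(n))
var_formula = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))

# Direct computation from the portfolio return in each state.
port = [sum(w[i] * r[i] for i in range(n)) for r in returns]
mean_direct = sum(p * z for p, z in zip(probs, port))
var_direct = sum(p * (z - mean_direct) ** 2 for p, z in zip(probs, port))

print(mean_formula, mean_direct)
print(var_formula, var_direct)
```

The two routes agree up to rounding, which is all the derivation claims: the portfolio variance needs only the weights and the individual variances and covariances.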


Example: The effects of diversification

Suppose there are two risky assets with (random) variable returns $Z_1$ and $Z_2$, and suppose that $E(Z_1) = \bar{Z}_1 = E(Z_2) = \bar{Z}_2 = m$, i.e., they have the same expected return; suppose that $E[(Z_1-\bar{Z}_1)^2] = \sigma_1^2 = E[(Z_2-\bar{Z}_2)^2] = \sigma_2^2 = v^2$, i.e., they have the same variance of return. Let $\sigma_{12} = E\{(Z_1-\bar{Z}_1)(Z_2-\bar{Z}_2)\}$ = covariance of the returns, which can be written in terms of the correlation coefficient, $\rho_{12}$, between asset #1 and #2, as $\sigma_{12} = \rho_{12}\sigma_1\sigma_2 = \rho_{12}v^2$. Let Z = return on the portfolio mix of #1 and #2: $Z = w_1 Z_1 + w_2 Z_2$ (for $w_1 + w_2 = 1$). The expected return on Z can be computed as

$$\bar{Z} = E(Z) = E(w_1 Z_1 + w_2 Z_2) = w_1 E(Z_1) + w_2 E(Z_2) = w_1 m + w_2 m = (w_1 + w_2)m = m .$$

Hence, in this example, for any mix $(w_1, w_2)$, the expected return on the portfolio will be the same, namely m. What about the variance of Z?

$$\begin{aligned}
\sigma^2 = E[(Z-\bar{Z})^2] &= E[(w_1(Z_1-m) + w_2(Z_2-m))^2] \\
&= E[w_1^2(Z_1-m)^2 + 2w_1w_2(Z_1-m)(Z_2-m) + w_2^2(Z_2-m)^2] \\
&= w_1^2E[(Z_1-m)^2] + 2w_1w_2E[(Z_1-m)(Z_2-m)] + w_2^2E[(Z_2-m)^2] \\
&= w_1^2\sigma_1^2 + 2w_1w_2\rho_{12}\sigma_1\sigma_2 + w_2^2\sigma_2^2 \\
&= w_1^2v^2 + 2w_1w_2\rho_{12}v^2 + w_2^2v^2 = v^2[w_1^2 + 2w_1w_2\rho_{12} + w_2^2].
\end{aligned}$$

Hence, as the relative proportions $(w_1, w_2)$ are varied, the variance of the portfolio is changed. Since $w_1 + w_2 = 1$, to see the effect on $\sigma^2$, we first substitute $w_1 = w$ and $w_2 = 1 - w_1 = 1 - w$ to get

$$\sigma^2 = v^2[w^2 + 2w(1-w)\rho_{12} + (1-w)^2] = v^2[1 - 2(1-\rho_{12})w(1-w)].$$

So $\sigma^2$, as a function of the "mix parameter" w, is a parabola. Since the expected return is the same for all mixes, a risk-averse investor would want to choose the portfolio with the smallest variance (dispersion). Formally, we can calculate the "variance minimizing" mix, $w^*$, by using the calculus as follows:

$$\min_{w} \sigma^2 = \min_{w} \{v^2[1 - 2(1-\rho_{12})w(1-w)]\}$$

$$\frac{d\sigma^2}{dw}\Big|_{w=w^*} = 0 = -v^2[2(1-\rho_{12})(1-2w^*)] \quad \text{or} \quad w^* = \frac{1}{2}$$

which is not exactly a big surprise because of the symmetry of the problem. However, this

general technique is applicable even if the individual variances are not equal. Figure X.2

presents the graph of $\sigma^2$ for various correlation coefficients and mixes. From above, the minimum variance (corresponding to $w = \frac{1}{2}$) for a given $\rho_{12}$ is

$$\sigma^2_{\min} = \frac{v^2}{2}(1 + \rho_{12})$$
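The example can be reproduced with a few lines of Python. This is only a sketch under assumed numbers: the common standard deviation v = 0.2 and the three correlation values are hypothetical, and the minimum is located by a simple grid search rather than by calculus.

```python
def mix_variance(w, v, rho):
    """Variance of the two-asset mix when both assets have variance v**2."""
    return v ** 2 * (1.0 - 2.0 * (1.0 - rho) * w * (1.0 - w))

v = 0.20                                   # hypothetical common standard deviation
grid = [k / 100 for k in range(0, 101)]    # candidate mixes w = 0.00, 0.01, ..., 1.00

results = {}
for rho in (-0.5, 0.0, 0.5):               # hypothetical correlation coefficients
    w_best = min(grid, key=lambda w: mix_variance(w, v, rho))
    results[rho] = (w_best, mix_variance(w_best, v, rho), v ** 2 * (1.0 + rho) / 2.0)
    print(rho, results[rho])
```

For each correlation below one, the grid minimum sits at w = 0.5 and matches the closed-form minimum variance. With a correlation of zero the minimum is half of the single-asset variance.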


Figure X.2

As Figure X.2 shows, as long as the two assets are not perfectly positively correlated, the investor

can lower the variance of his return by mixing. This phenomenon is called the diversification

effect. Note that the less positively correlated are the returns, the greater the effect. If they are

independent (i.e., $\rho_{12} = 0$), then the variance is halved. Although negative correlation is even better,

the existence of such assets in the real world is rare. Although this example was very specialized, the

diversification effect holds generally. Hence, risk-averters will tend to diversify if they act rationally.

While diversification is not a new idea (or rule), our systematic approach will allow us to measure

quantitatively how much diversification is provided by adding securities to a portfolio and how much the

investor should diversify. Later, these quantitative results will lead to a number of new qualitative

insights.

Constructing Portfolios and Composite Securities


Consider the two-risky asset case of the previous example, but now allow $E(Z_1) = \bar{Z}_1$ and $E[Z_2] = \bar{Z}_2$ not to be equal. Further, assume $E\{(Z_1-\bar{Z}_1)^2\} = \sigma_1^2$ and $E\{(Z_2-\bar{Z}_2)^2\} = \sigma_2^2$ are not equal, and by convention, assume that $\sigma_2^2 > \sigma_1^2$. Form a composite security from a combination of security #1 and security #2. Denote its random variable return by Z and its expected return by $\bar{Z}$ and variance by $\sigma^2$. Let w = fraction of each dollar of the composite security invested in security #2; 1 - w = fraction invested in #1. Then

$$\bar{Z} = E(wZ_2 + (1-w)Z_1) = w\bar{Z}_2 + (1-w)\bar{Z}_1 = \bar{Z}_1 + w(\bar{Z}_2 - \bar{Z}_1).$$

If $\bar{Z}_2 > \bar{Z}_1$, then

Figure X.3.a

If $\bar{Z}_2 < \bar{Z}_1$, then

Figure X.3.b

The variance of Z, $\sigma^2$, is given by

$$\begin{aligned}
\sigma^2 &= w^2\sigma_2^2 + 2w(1-w)\rho_{12}\sigma_1\sigma_2 + (1-w)^2\sigma_1^2 \\
&= w^2[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2] + 2w[\rho_{12}\sigma_1\sigma_2 - \sigma_1^2] + \sigma_1^2
\end{aligned}$$

The rest of this section is devoted to finding the characteristics of Z (i.e., $\sigma^2$, $\bar{Z}$) as w changes and for different assumptions about $Z_1$ and $Z_2$.

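Case 1: Suppose that $Z_1$ and $Z_2$ are perfectly positively correlated, i.e., $\rho_{12} = +1$; then

$$\sigma^2 = w^2\sigma_2^2 + 2w(1-w)\sigma_1\sigma_2 + (1-w)^2\sigma_1^2 = [w\sigma_2 + (1-w)\sigma_1]^2$$

or

$$\sigma = |w\sigma_2 + (1-w)\sigma_1|$$

i.e., $\sigma$ is in a linear relationship to $\sigma_1$ and $\sigma_2$, and linear in w for $\rho_{12} = +1$. (In the perfectly negatively correlated case, $\rho_{12} = -1$, the same steps give $\sigma = |w\sigma_2 - (1-w)\sigma_1|$, which is piecewise linear in w.)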


Figure X.4a

The only other case where $\sigma$ is in a linear relationship to $\sigma_1$ and $\sigma_2$ is when $\sigma_1 = 0$, i.e., security #1 is riskless. Then $\sigma^2 = w^2\sigma_2^2$ and $\sigma = |w|\sigma_2$.

Figure X.4b

Proposition: $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 \ge 0$, and $= 0$ only if $\rho_{12} = 1$.


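Proof: 1. clearly, if $\rho_{12} \le 0$, then $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 > 0$.

2. suppose for $\rho_{12} > 0$, it were possible that $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 < 0$. Then, $\sigma_1^2 + \sigma_2^2 < 2\rho_{12}\sigma_1\sigma_2$. But $\rho_{12} \le 1$, so it must be that $\sigma_1^2 + \sigma_2^2 < 2\sigma_1\sigma_2$ or $\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2 < 0$. But that means that $(\sigma_1 - \sigma_2)^2 < 0$, which cannot be. Hence, $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 \ge 0$.

3. Suppose $\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2 = 0$. Then $\sigma_1^2 + \sigma_2^2 - 2\sigma_1\sigma_2 = 2(\rho_{12} - 1)\sigma_1\sigma_2$, or $(\sigma_1 - \sigma_2)^2 = 2(\rho_{12} - 1)\sigma_1\sigma_2$. But $\rho_{12} \le 1$ and $(\sigma_1 - \sigma_2)^2 \ge 0$. So the only way for equality to hold is if $\rho_{12} = 1$ and $\sigma_1 = \sigma_2$.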

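Case 2: $Z_1$ and $Z_2$ are not perfectly correlated, i.e., $\rho_{12} < 1$.

$$\sigma^2 = [\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2]w^2 - 2[\sigma_1^2 - \rho_{12}\sigma_1\sigma_2]w + \sigma_1^2$$

and

$$\sigma = \{[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2]w^2 - 2[\sigma_1^2 - \rho_{12}\sigma_1\sigma_2]w + \sigma_1^2\}^{1/2}$$

1st derivatives of $\sigma^2$ and $\sigma$ with respect to w

(X.7a) $\dfrac{d\sigma^2}{dw} = 2\{w[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2] - [\sigma_1^2 - \rho_{12}\sigma_1\sigma_2]\}$

(X.7b) $\dfrac{d\sigma}{dw} = \dfrac{w[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2] - [\sigma_1^2 - \rho_{12}\sigma_1\sigma_2]}{\sigma}$

2nd derivatives of $\sigma^2$ and $\sigma$ with respect to w

(X.8a) $\dfrac{d^2\sigma^2}{dw^2} = 2[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2] > 0$, from the Proposition

(X.8b) $\dfrac{d^2\sigma}{dw^2} = \dfrac{\sigma_1^2\sigma_2^2(1 - \rho_{12}^2)}{\sigma^3} > 0$ for $|\rho_{12}| < 1$

From the formulae for $\sigma^2$ and $\sigma$, $\sigma^2$ as a function of w is a parabola and $\sigma$ as a function of w is a hyperbola. From (X.8a) and (X.8b), both $\sigma^2$ and $\sigma$ are strictly convex functions of w.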
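The step from (X.7b) to (X.8b) is not spelled out in the Notes. One way to fill it in is sketched below, using the shorthand a, b, c for the coefficients of the parabola; these three symbols are introduced here for convenience and do not appear in the Notes.

```latex
\begin{align*}
\sigma^2 &= a w^2 - 2 b w + c, \qquad
  a \equiv \sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2, \quad
  b \equiv \sigma_1^2 - \rho_{12}\sigma_1\sigma_2, \quad
  c \equiv \sigma_1^2 \\
\frac{d\sigma}{dw} &= \frac{a w - b}{\sigma} \\
\frac{d^2\sigma}{dw^2} &= \frac{a}{\sigma} - \frac{(a w - b)^2}{\sigma^3}
  = \frac{a\sigma^2 - (a w - b)^2}{\sigma^3}
  = \frac{a c - b^2}{\sigma^3} \\
a c - b^2 &= \sigma_1^2\left[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2
  - (\sigma_1 - \rho_{12}\sigma_2)^2\right]
  = \sigma_1^2\sigma_2^2\,(1 - \rho_{12}^2)
\end{align*}
```

The numerator is independent of w, so the sign of the second derivative is the sign of $1 - \rho_{12}^2$.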

The minimum variance composite security with "mix" (wmin,1-wmin)

Because $\sigma^2$ is a (strictly) convex parabola, there exists a unique minimum value corresponding to proportion $w_{\min}$. Of course, this value of w minimizes $\sigma$ as well. The minimum point will occur where the 1st derivative is zero, i.e., $\dfrac{d\sigma^2}{dw}\Big|_{w=w_{\min}} = 0$. From (X.7a) or (X.7b) we have that

(X.9) $w_{\min} = \dfrac{\sigma_1(\sigma_1 - \rho_{12}\sigma_2)}{[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2]}$

Thus, for $w < w_{\min}$, $\dfrac{d\sigma^2}{dw} < 0$ (and $\dfrac{d\sigma}{dw} < 0$) and for $w > w_{\min}$, $\dfrac{d\sigma^2}{dw} > 0$ (and $\dfrac{d\sigma}{dw} > 0$). Can $w_{\min} > 1$? Suppose so, then from (X.9)

$$w_{\min} = \frac{\sigma_1^2 - \rho_{12}\sigma_1\sigma_2}{[\sigma_1^2 + \sigma_2^2 - 2\rho_{12}\sigma_1\sigma_2]} > 1 \quad \text{or} \quad \rho_{12} > \frac{\sigma_2}{\sigma_1}.$$

But, by convention, $\sigma_2^2 > \sigma_1^2$ or $\sigma_2 > \sigma_1$. So, $\rho_{12} > 1$, which is impossible. Hence, $w_{\min} < 1$. Can $w_{\min} < 0$? From (X.9), this would imply that $\rho_{12} > \dfrac{\sigma_1}{\sigma_2}$, which is possible since $\sigma_2 > \sigma_1$. Thus, if $\dfrac{\sigma_1}{\sigma_2} > \rho_{12}$, then $0 < w_{\min} < 1$ (Note: if $\rho_{12} \le 0$, then $w_{\min} > 0$). So, if $\sigma_2$ is not too much larger than $\sigma_1$ or if $Z_1$ and $Z_2$ are not too highly (positively) correlated, then $0 < w_{\min} < 1$. Given (X.7b), (X.8b), and (X.9), we can graph $\sigma$ as a function of w as:

Figure X.5
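Formula (X.9) and the sign conditions on $w_{\min}$ can be checked with a short calculation. The sketch below assumes hypothetical standard deviations of 0.10 and 0.20 and two hypothetical correlation values; here w is the fraction held in security #2, as in the text.

```python
from math import sqrt

def sigma(w, s1, s2, rho):
    """Standard deviation of the mix with fraction w in security #2."""
    return sqrt(w * w * s2 * s2 + 2.0 * w * (1.0 - w) * rho * s1 * s2
                + (1.0 - w) ** 2 * s1 * s1)

def w_min(s1, s2, rho):
    """Minimum-variance proportion in security #2, as in (X.9)."""
    return s1 * (s1 - rho * s2) / (s1 * s1 + s2 * s2 - 2.0 * rho * s1 * s2)

s1, s2 = 0.10, 0.20    # hypothetical standard deviations, with s2 > s1 by convention

w_low = w_min(s1, s2, 0.25)    # rho below s1/s2 = 0.5: interior minimum expected
w_high = w_min(s1, s2, 0.80)   # rho above s1/s2 = 0.5: minimum requires shorting #2

print(w_low, sigma(w_low, s1, s2, 0.25))
print(w_high, sigma(w_high, s1, s2, 0.80))

# Convexity check: at w = 0.5 the curve lies below the chord joining
# (w = 0, sigma_1) and (w = 1, sigma_2).
chord_mid = 0.5 * (s1 + s2)
curve_mid = sigma(0.5, s1, s2, 0.25)
print(curve_mid, chord_mid)
```

With a correlation of 0.25 the minimum lies strictly between 0 and 1. With a correlation of 0.80, which exceeds the ratio of the standard deviations, the computed $w_{\min}$ is negative, matching the discussion above.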

Note in Figure X.5: because $\sigma$ is a convex function of w, the curve will always lie below the straight line connecting $(w = 0, \sigma_1)$ and $(w = 1, \sigma_2)$. For our purposes, it will be much more useful to work with the relationship between $\sigma$ and $\bar{Z}$ rather than with $\sigma$ and w (i.e., combine graphs Figures (X.3a) or (X.3b) with (X.5)). Because $\bar{Z} = \bar{Z}_1 + w(\bar{Z}_2 - \bar{Z}_1)$, we have by the "chain rule" (valid for $\bar{Z}_2 \ne \bar{Z}_1$) that

(X.10) $\dfrac{d\sigma}{d\bar{Z}} = \dfrac{d\sigma}{dw}\dfrac{dw}{d\bar{Z}} = \dfrac{d\sigma/dw}{d\bar{Z}/dw} = \dfrac{d\sigma/dw}{(\bar{Z}_2 - \bar{Z}_1)}$

and

(X.11) $\dfrac{d^2\sigma}{d\bar{Z}^2} = \dfrac{d^2\sigma/dw^2}{(\bar{Z}_2 - \bar{Z}_1)^2}$.

From (X.10), if $\bar{Z}_2 > (<)\ \bar{Z}_1$, then $\dfrac{d\sigma}{d\bar{Z}}$ will have the same (opposite) sign as $\dfrac{d\sigma}{dw}$. From (X.11), independent of the relative sizes of $\bar{Z}_2$ and $\bar{Z}_1$, $\dfrac{d^2\sigma}{d\bar{Z}^2}$ will have the same sign as $\dfrac{d^2\sigma}{dw^2}$. But, from (X.8b), $\dfrac{d^2\sigma}{dw^2} > 0$. So, $\dfrac{d^2\sigma}{d\bar{Z}^2} > 0$ and $\sigma$ is a convex function of $\bar{Z}$. Graphically, the two cases are:

Figure X.6a


Figure X.6b

By convention, graphs such as Figures (X.6) are plotted with the expected return on the ordinate and standard deviation on the abscissa:

Figure X.7

Note: $\sigma_{\min}$ occurs where $\dfrac{d\sigma}{d\bar{Z}} = 0$ or where $\dfrac{d\bar{Z}}{d\sigma} = \infty$.

The curve in Figure (X.7) traces out all the feasible expected return-standard deviation (or mean-variance) combinations possible from the two risky assets. And to each $(\bar{Z}, \sigma)$ point on that curve, there corresponds a unique portfolio of these assets described by $(w, 1-w)$.


General Composite Securities

The preceding analysis examined the simplest composite security constructed from two

securities. We now analyze composite securities constructed from many assets.

First, consider composite security #I constructed from securities with returns $Z_1$ and $Z_2$, where $Z_1$ has mean, variance, and covariances $(\bar{Z}_1, \sigma_1^2, \sigma_{12}, \sigma_{13}, \sigma_{14})$ and $Z_2$ has mean, variance, and covariances $(\bar{Z}_2, \sigma_2^2, \sigma_{21}, \sigma_{23}, \sigma_{24})$. If $w_I$ is the fraction of composite security #I invested in security #1 and $(1-w_I)$ is the fraction invested in security #2, then the return on #I, $Z_I$, is $Z_I = w_I Z_1 + (1-w_I)Z_2$, and $Z_I$ has an expected return $\bar{Z}_I$ and variance $\sigma_I^2$ as constructed in the previous section.

Second, consider the composite security #II constructed from other securities with returns $Z_3$ and $Z_4$, where $Z_3$ has mean, variance, and covariances $(\bar{Z}_3, \sigma_3^2, \sigma_{31}, \sigma_{32}, \sigma_{34})$ and $Z_4$ has mean, variance, and covariances $(\bar{Z}_4, \sigma_4^2, \sigma_{41}, \sigma_{42}, \sigma_{43})$. If $w_{II}$ is the proportion of the composite security #II invested in $Z_3$ and $(1-w_{II})$ is the proportion invested in $Z_4$, then the return on #II is $Z_{II} = w_{II}Z_3 + (1-w_{II})Z_4$, and $Z_{II}$ has expected return $\bar{Z}_{II}$ and variance $\sigma_{II}^2$ as constructed in the previous section.

Third, one can compute the covariance between composite securities #I and #II, $\mathrm{Cov}(Z_I, Z_{II}) \equiv \sigma_{I,II}$, from knowledge of the variances and covariances of $Z_1$, $Z_2$, $Z_3$, and $Z_4$.

Fourth, form a composite security with return Z, constructed from (composite) securities I and II with returns $Z_I$ and $Z_{II}$, where $Z_I$ has expected return, variance, and covariance $(\bar{Z}_I, \sigma_I^2, \sigma_{I,II})$. If w is the fraction of the composite security invested in security #I and $(1-w)$ is the fraction invested in security #II, then $Z = wZ_I + (1-w)Z_{II}$, and the mean and variance as a function of w can be computed as was done in the previous section. Further, if $\sigma_I^2 > 0$ and $\sigma_{II}^2 > 0$ and $Z_I$ and $Z_{II}$ are not perfectly correlated, then Figures (X.5)-(X.7) will describe the mean-variance "tradeoff." Otherwise, Figure (X.4) will be the description.

Note:

$$\begin{aligned}
Z &= w[w_I Z_1 + (1-w_I)Z_2] + (1-w)[w_{II}Z_3 + (1-w_{II})Z_4] \\
&= (w\,w_I)Z_1 + (w(1-w_I))Z_2 + ((1-w)w_{II})Z_3 + [(1-w)(1-w_{II})]Z_4 \\
&= \mu_1 Z_1 + \mu_2 Z_2 + \mu_3 Z_3 + \mu_4 Z_4
\end{aligned}$$

So, we see that composite securities containing many securities can be constructed by combining

securities to form composite securities and combining these composite securities to form (more

complicated) composite securities, etc. Hence, we can generate any portfolio by this process and

each portfolio will have an expected return and variance. Further, the graph of the mean-

variance "tradeoff" will be like either Figures (X.5) - (X.7) or Figure (X.4) as in the two-security

case.
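The four-step construction can be traced numerically. The sketch below assumes a hypothetical 4x4 covariance matrix and hypothetical mixing fractions (none of these numbers come from the Notes). It builds composites #I and #II, computes their covariance, and confirms that treating the result as a two-security problem gives the same variance as the four-security double sum with the implied weights $\mu_1, ..., \mu_4$.

```python
# Hypothetical covariance matrix sigma_ij for the four basic securities.
cov = [
    [0.0400, 0.0100, 0.0000, 0.0050],
    [0.0100, 0.0900, 0.0200, 0.0000],
    [0.0000, 0.0200, 0.0625, 0.0150],
    [0.0050, 0.0000, 0.0150, 0.1600],
]

def bilinear(a, b):
    """sum_i sum_j a_i * b_j * sigma_ij: a variance if a == b, else a covariance."""
    return sum(a[i] * b[j] * cov[i][j] for i in range(4) for j in range(4))

w_I, w_II, w = 0.6, 0.3, 0.45   # hypothetical mixing fractions

# Steps one and two: composites #I (securities 1, 2) and #II (securities 3, 4).
hold_I = [w_I, 1.0 - w_I, 0.0, 0.0]
hold_II = [0.0, 0.0, w_II, 1.0 - w_II]
var_I, var_II = bilinear(hold_I, hold_I), bilinear(hold_II, hold_II)

# Step three: covariance between the two composites.
cov_I_II = bilinear(hold_I, hold_II)

# Step four: variance from the two-security formula.
var_two_step = w * w * var_I + 2.0 * w * (1.0 - w) * cov_I_II + (1.0 - w) ** 2 * var_II

# Implied weights mu_k on the four basic securities, and the direct double sum.
mu = [w * w_I, w * (1.0 - w_I), (1.0 - w) * w_II, (1.0 - w) * (1.0 - w_II)]
var_direct = bilinear(mu, mu)

print(mu, sum(mu))
print(var_two_step, var_direct)
```

The implied weights sum to one, and the two variance calculations agree, which is why portfolios of portfolios never leave the set of ordinary portfolios of the basic securities.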

Portfolios and Efficient Portfolios

Suppose that there are n securities with (random) variable returns $Z_i$ with expected returns $E(Z_i) = \bar{Z}_i$; variances of returns $\sigma_i^2$; covariances of returns $\sigma_{ij}$, for i, j = 1,2,...,n. Further, suppose that all n securities have uncertain returns (i.e., $\sigma_i^2 > 0$ for all securities). By mixing

these securities together to form portfolios, one can create "new" (composite) securities which

will also have expected returns, variances, and covariances with the other (both "basic" and

"composite") securities. Hence, one can create an infinite number of securities from the original

n. Is there a way to reduce the number of securities (or portfolios) that one need consider as

possibilities for selected portfolios? We know that, asked to choose a portfolio from a group of

portfolios all with the same expected return, risk-averse mean-variance maximizers will choose


the portfolio with the smallest variance. Suppose that we classify or subdivide all the possible

portfolios into groups where each portfolio within a given group has the same expected return;

then determine, for each group, which member has the smallest variance. The collection of

"winner" portfolios from each group is called the Frontier portfolio set. A portfolio is a member

of the Frontier portfolio set if and only if among all portfolios possible with the same expected

return, it has the smallest variance. Clearly, it is a necessary condition that a portfolio be a

Frontier portfolio if it is ever going to be chosen by a risk-averter (as an optimal portfolio). We

can reduce the possibilities even more: given a choice between two portfolios with the same

variance, a risk-averter will prefer the one with the larger expected return. So, among Frontier

portfolios, compare all portfolios with the same variance and select the one with the largest

expected return. This final collection of portfolios is called the Efficient Portfolio Set. A

portfolio is a member of the Efficient portfolio set if and only if there does not exist another

portfolio which has a variance smaller than or equal to its variance and which has an expected return

greater than or equal to its expected return. Clearly, any portfolio selected by a risk-averter (as an

optimal portfolio) must be an efficient portfolio.


Figure X.8

In Figure X.8, the cross-hatched area represents feasible (possible) portfolios; the boundary line

(which is a parabola) is the Frontier portfolio set; the heavy-lined part of the boundary is the

Efficient portfolio set. The point $(\bar{Z}_{\min}, \sigma^2_{\min})$ is called the minimum-variance portfolio, and is

a part of the Efficient portfolio set. As noted, it is common practice to plot the portfolio sets in

Expected Return-Standard deviation space where it is a hyperbola.


Figure X.9

We now present an analytical derivation of the Frontier and Efficient Portfolio sets to show that

the qualitative results presented in Figures X.8 and X.9 are correct and to demonstrate that in

practice, given the expected returns, variances, and covariances of the primary securities, the

efficient frontier can be computed.

Let Z be the random variable return on any portfolio (constructed from the n "primary" securities) which has expected return m, i.e.,

(X.12) $Z = \sum_{i=1}^{n} w_i Z_i$

where the portfolio weights are restricted to satisfy

(X.12a) $\sum_{i=1}^{n} w_i = 1$

and

(X.12b) $\bar{Z} = E(Z) = \sum_{i=1}^{n} w_i E(Z_i) = \sum_{i=1}^{n} w_i \bar{Z}_i = m.$

Obviously, all possible combinations of $w_1, w_2, ..., w_n$ which satisfy the constraints (X.12a) and (X.12b) represent all the possible portfolios with expected return m. To find the Frontier portfolio set, we must determine the particular combination $(w_1, w_2, ..., w_n)$ which satisfies constraints (X.12a) and (X.12b) and minimizes the variance. Formally, this is a constrained minimization problem which can be solved by using Lagrange multipliers, i.e., minimize $\sigma^2$ subject to (X.12a) and (X.12b), or

(X.13) $\min \Big\{ \dfrac{1}{2}\sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij} + \lambda_1\Big[m - \sum_{i=1}^{n} w_i \bar{Z}_i\Big] + \lambda_2\Big[1 - \sum_{i=1}^{n} w_i\Big] \Big\}$

where $\lambda_1$ and $\lambda_2$ are the multipliers and remember that $\sigma^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i w_j \sigma_{ij}$. To determine a critical point, we differentiate (X.13) with respect to $w_1, w_2, ..., w_n, \lambda_1, \lambda_2$ and set each partial derivative equal to zero, to obtain the optimality conditions

(X.14)
$0 = \sum_{j=1}^{n} w_j \sigma_{ij} - \lambda_1 \bar{Z}_i - \lambda_2$, for i = 1,2,...,n
$m = \sum_{i=1}^{n} w_i \bar{Z}_i$
$1 = \sum_{i=1}^{n} w_i$

These are (n+2) linear equations to be solved for the (n+2) unknowns $w_1, w_2, ..., w_n, \lambda_1$, and $\lambda_2$. Let $[v_{ij}]$ be the elements of the inverse of the variance-covariance matrix of returns $[\sigma_{ij}]$. Then, if we call:

$$A \equiv \sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}\bar{Z}_j ; \quad B \equiv \sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij}\bar{Z}_i\bar{Z}_j ; \quad C \equiv \sum_{i=1}^{n}\sum_{j=1}^{n} v_{ij} ; \quad D \equiv BC - A^2 ,$$

the solutions are:

(X.15)
$w_i = \dfrac{m\sum_{j=1}^{n} v_{ij}(C\bar{Z}_j - A) + \sum_{j=1}^{n} v_{ij}(B - A\bar{Z}_j)}{D}$
$\lambda_1 = \dfrac{Cm - A}{D} ; \quad \lambda_2 = \dfrac{B - Am}{D}$

From (X.15), we can compute the portfolio variance, $\sigma^2$, to be

(X.16) $\sigma^2 = \dfrac{Cm^2 - 2Am + B}{D}$

Note: the variance of the Frontier portfolio set is a parabola as a function of the expected return, m.

One can solve for the expected return as a function of the standard deviation [using (X.16)] to be:

(X.17a) Frontier: $m = \dfrac{A}{C} \pm \dfrac{1}{C}\sqrt{D(C\sigma^2 - 1)}$

and the Efficient (part of) Frontier to be

(X.17b) Efficient Set: $m = \dfrac{A}{C} + \dfrac{1}{C}\sqrt{D(C\sigma^2 - 1)}$
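To show that the frontier really can be computed from the means, variances and covariances alone, here is a minimal sketch in plain Python. The three expected returns, the covariance matrix and the target return m are hypothetical numbers, and the small Gauss-Jordan routine is just one convenient way to obtain the inverse $[v_{ij}]$; the formulas for A, B, C, D, the weights and the variance follow (X.15) to (X.17b) as reconstructed above.

```python
from math import sqrt

def invert(matrix):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    aug = [[float(x) for x in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

zbar = [1.05, 1.10, 1.15]               # hypothetical expected returns per dollar
cov = [[0.04, 0.01, 0.00],              # hypothetical variance-covariance matrix
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.16]]
n = len(zbar)
v = invert(cov)                         # v[i][j] = elements of the inverse matrix

A = sum(v[i][j] * zbar[j] for i in range(n) for j in range(n))
B = sum(v[i][j] * zbar[i] * zbar[j] for i in range(n) for j in range(n))
C = sum(v[i][j] for i in range(n) for j in range(n))
D = B * C - A * A

def frontier_weights(m):
    """Weights of the frontier portfolio with expected return m, as in (X.15)."""
    return [(m * sum(v[i][j] * (C * zbar[j] - A) for j in range(n))
             + sum(v[i][j] * (B - A * zbar[j]) for j in range(n))) / D
            for i in range(n)]

def frontier_variance(m):
    """Variance of the frontier portfolio with expected return m, as in (X.16)."""
    return (C * m * m - 2.0 * A * m + B) / D

m = 1.12                                # hypothetical target expected return
w = frontier_weights(m)
mean_check = sum(w[i] * zbar[i] for i in range(n))
var_direct = sum(w[i] * w[j] * cov[i][j] for i in range(n) for j in range(n))
m_from_sigma = A / C + sqrt(D * (C * var_direct - 1.0)) / C   # efficient branch (X.17b)

print(w, sum(w), mean_check)
print(var_direct, frontier_variance(m), m_from_sigma)
```

In this example the target m lies above A/C, the expected return of the minimum-variance portfolio, so the portfolio sits on the efficient branch and (X.17b) recovers m from its variance.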

More on the Role of Financial Instruments and Intermediaries: A Mutual Fund Theorem

The previous analysis assumed that all the securities available were risky (i.e., $\sigma_i^2 > 0$). What happens if an (n+1)st riskless security becomes available with (certain) return R? Before answering that question, we digress:

Digression: Suppose that you already have a composite security or portfolio (containing only risky assets) with (random) variable return $Z_P$; expected return $E(Z_P) = \bar{Z}_P$; variance of return $E[(Z_P - \bar{Z}_P)^2] = \sigma_P^2$. Suppose that there is now available a riskless security with (certain) return R, and you want to construct a new portfolio by combining the "old" portfolio with the riskless security. Let w = fraction of your wealth invested in the "old" (risky) portfolio and (1-w) = fraction of your wealth invested in the riskless asset. If Z is the (random) variable return on the new portfolio, then

$$Z = wZ_P + (1-w)R = w(Z_P - R) + R$$

and the expected return is $E(Z) = \bar{Z} = w(\bar{Z}_P - R) + R$.

Note: $Z - \bar{Z} = w(Z_P - R) + R - w(\bar{Z}_P - R) - R = w(Z_P - \bar{Z}_P)$. The variance of the new portfolio is

$$\sigma^2 = E[(Z-\bar{Z})^2] = E[w^2(Z_P - \bar{Z}_P)^2] = w^2E[(Z_P - \bar{Z}_P)^2] = w^2\sigma_P^2 .$$

The standard deviation of the new portfolio, $\sigma$, is $\sqrt{\sigma^2}$ or

$$\sigma = |w|\sigma_P$$

Note: the standard deviation is linear in the "mix" w, as was shown earlier.

Figures X.10 illustrate how the variance, standard deviation, and expected return vary as one

alters the mix.


Figure X.10.a Figure X.10.b

Figure X.10.c

We can also solve for the expected return as a function of the standard deviation: since $\sigma = |w|\sigma_P$, if $w \ge 0$, then $w = \sigma/\sigma_P$ and

$$\bar{Z} = w(\bar{Z}_P - R) + R = \frac{(\bar{Z}_P - R)}{\sigma_P}\sigma + R .$$

Graphically,


Figure X.11

The important point to remember from this analysis is that various combinations of a risky

security with a riskless security plot as straight lines in the Expected Return - Standard Deviation

plane (Figure X.11).
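The straight-line result of the digression is easy to verify. The sketch below assumes a hypothetical risky portfolio (expected return 1.12 per dollar, standard deviation 0.20) and a hypothetical riskless return of 1.03, and checks that every non-negative mix lies on the line with intercept R and slope $(\bar{Z}_P - R)/\sigma_P$.

```python
z_p, sigma_p = 1.12, 0.20   # hypothetical risky portfolio: mean and standard deviation
R = 1.03                    # hypothetical riskless return per dollar

def mix(w):
    """Mean and standard deviation with fraction w in the risky portfolio."""
    return w * (z_p - R) + R, abs(w) * sigma_p

slope = (z_p - R) / sigma_p

points = []
for w in (0.0, 0.5, 1.0, 1.5):          # w > 1 means borrowing at R to buy more of P
    mean, sd = mix(w)
    points.append((w, mean, sd, R + slope * sd))
    print(w, mean, sd, R + slope * sd)
```

The second and fourth numbers printed on each row coincide, so the (standard deviation, expected return) pairs all fall on one straight line through the point (0, R).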

- End of Digression -

Return now to the question posed before the digression: What is the effect on the efficient

portfolio frontier of adding a riskless security? Using the result displayed in Figure X.11, we can

determine the answer geometrically (as is done in Figure X.12) by combining Figure X.11 with

the "old" frontier for (risky) assets as displayed in Figure X.9.


Figure X.12

The curve DBE is the "old" efficient frontier when only risky assets were available. In

particular, the point B is a portfolio which contains only risky assets because it lies on DBE.

Think of this specific portfolio as the "old" risky portfolio analyzed in the digression (i.e., take

$\bar{Z}_P = \bar{Z}^*$ and $\sigma_P = \sigma^*$). In that case, line ABC in Figure X.12 corresponds exactly to the line in

Figure X.11, and it represents various (possible) positive mixes of the "old" portfolio with the

riskless security. Therefore, every point on line ABC is now a feasible portfolio with the

introduction of the (additional) riskless security. Note that every point on ABC is (strictly,

except for point B) above points on DBE, and hence, the new efficient portfolio frontier is the

straight line ABC. Thus, every portfolio in the efficient portfolio set can be interpreted as a

"mix" of two portfolios: namely, a portfolio containing only risky securities in the proportions

described by point B and a (trivial) portfolio containing only the riskless security (point A).

Because of the importance of the particular portfolio $(\bar{Z}^*, \sigma^*)$ represented by point B, the specific weights of the holdings of basic securities in that portfolio, $(w_1^*, w_2^*, ..., w_n^*)$, are called the optimal combination of risky assets. Further, we can determine explicitly what these optimal proportions are from the (formal) analysis previously done. The proportions are (using the notation of that analysis)

(X.18) $w_i^* = \dfrac{\sum_{j=1}^{n} v_{ij}(\bar{Z}_j - R)}{(A - RC)}$, i = 1,2,...,n
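A small two-asset illustration of (X.18) follows. It is a sketch under assumed inputs: the expected returns, standard deviations, correlation and riskless return are hypothetical, and the 2x2 inverse is written out by hand. The check at the end compares the slope $(\bar{Z} - R)/\sigma$ of the computed portfolio with that of every other mix on a grid, since point B is the risky portfolio through which the steepest line from R passes.

```python
from math import sqrt

zbar = [1.08, 1.14]                 # hypothetical expected returns per dollar
s1, s2, rho = 0.10, 0.20, 0.25      # hypothetical standard deviations and correlation
R = 1.03                            # hypothetical riskless return per dollar

cov = [[s1 * s1, rho * s1 * s2], [rho * s1 * s2, s2 * s2]]
det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]
v = [[cov[1][1] / det, -cov[0][1] / det],      # inverse of the 2x2 covariance matrix
     [-cov[1][0] / det, cov[0][0] / det]]

A = sum(v[i][j] * zbar[j] for i in range(2) for j in range(2))
C = sum(v[i][j] for i in range(2) for j in range(2))

# Optimal combination of risky assets, as in (X.18).
w_star = [sum(v[i][j] * (zbar[j] - R) for j in range(2)) / (A - R * C)
          for i in range(2)]

def slope(w2):
    """Slope (mean - R) / sigma of the line from R through the mix (1 - w2, w2)."""
    w = [1.0 - w2, w2]
    mean = sum(w[i] * zbar[i] for i in range(2))
    sd = sqrt(sum(w[i] * w[j] * cov[i][j] for i in range(2) for j in range(2)))
    return (mean - R) / sd

best_slope = slope(w_star[1])
grid_slopes = [slope(k / 1000) for k in range(0, 1001)]

print(w_star, sum(w_star), best_slope, max(grid_slopes))
```

No mix on the grid produces a steeper line than the portfolio given by (X.18), and the weights depend only on the means, variances, covariance and R.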

We now summarize: (I) we know that risk-averters in selecting an optimal portfolio from among the (n+1) individual securities will always choose a portfolio which lies along the Efficient Portfolio Frontier (line ABC in Figure X.12), which is a straight line with slope $\dfrac{(\bar{Z}^* - R)}{\sigma^*}$ and intercept R; (II) we know that every efficient portfolio can be constructed by mixing two particular portfolios (or "mutual funds"), and that one "fund" holds just the riskless asset and the other holds only risky assets in the proportions described in (X.18); (III) the proportions described in (X.18) depend only on the expected returns, variances, and covariances of the "primary" securities and require no other information to compute.

Note: the proportions, $w_i^*$, in (X.18) do not depend on the individual investors' utility functions or on how much wealth they have.

A "Mutual Fund" or "Separation" Theorem

Every risk-averse, mean-variance utility maximizer would be indifferent between

selecting his optimal portfolio from among the original (n+1) securities or from just the two mutual funds described in (II), provided that the investor agrees with the estimates of $(\bar{Z}_i, \sigma_{ij})$ used to form the (optimal) risky mutual fund.

Proof: follows from (I) and (II) and the definition of the efficient portfolio set.


Remember the first such separation theorem was deduced in an earlier set of lectures where it

was shown that the individual investors could hire a "technocrat" to make all production

decisions, and provided that he followed the "right" rule (i.e., maximize market value), they

would be indifferent between his handling production or each of them doing it individually.

Here, we find that all the individual risk-averse investors can hire a "technocrat" portfolio

manager and give him the rule to hold proportions $w_i^*$ in his fund, and the only decision that the

individual investor need make is what proportion of his wealth to hold in the riskless asset.

(Essentially, the problem solved in the digression). It is a true separation or decentralization

because the portfolio manager need only "worry" about determining the expected returns,

variances, and covariances of the individual securities and need not know what the investors'

preferences or wealth levels are to do his job; and the investors do not need to know the

individual expected returns, variances, covariances, etc. of the securities, but only the aggregate

$(\bar{Z}^*, \sigma^*)$ to make their decisions.

How the Investor selects the optimal "mix" between the two funds: A Graphical Solution. If the investor makes his decisions solely on the basis of the mean and variance of his portfolio and if he is risk-averse, then one can solve for indifference curves (lines of constant utility level) showing the tradeoff between expected return and standard deviation (or variance), and these curves will have a shape as displayed in Figure X.13.


Figure X.13

His individual optimal portfolio will be the point where one of his indifference curves is tangent

to the Efficient Frontier, and $(\bar{Z}_{optimal}, \sigma_{optimal})$ are the expected return and standard deviation of the return on his optimal portfolio. From the lower half of the graph, we see that this implies putting $w_{optimal}$ percent of his wealth in the "risky" fund and the rest in the riskless asset. Note: He only required knowledge of $(\bar{Z}^*, \sigma^*, R)$ to choose his optimal portfolio.
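The Notes solve this step graphically and do not commit to a particular utility function. Purely as an illustration, the sketch below assumes a criterion of the form $V = \bar{Z} - \frac{a}{2}\sigma^2$, which is increasing in the mean and decreasing in the variance; the risk-aversion parameter a and all numbers are hypothetical. Under that assumption the tangency condition has the closed form $w_{optimal} = (\bar{Z}^* - R)/(a\,\sigma^{*2})$, which the code compares with a brute-force search.

```python
z_star, sigma_star = 1.10, 0.15   # hypothetical mean and std. dev. of the risky fund
R = 1.03                          # hypothetical riskless return per dollar
a = 4.0                           # hypothetical risk-aversion parameter (an assumption)

def utility(w):
    """Assumed criterion V = mean - (a / 2) * variance for fraction w in the risky fund."""
    mean = R + w * (z_star - R)
    var = (w * sigma_star) ** 2
    return mean - 0.5 * a * var

# Closed form from dV/dw = (z_star - R) - a * w * sigma_star**2 = 0.
w_closed = (z_star - R) / (a * sigma_star ** 2)

# Brute-force search over mixes from 0% to 200% in the risky fund.
grid = [k / 1000 for k in range(0, 2001)]
w_grid = max(grid, key=utility)

print(w_closed, w_grid)
```

Only $(\bar{Z}^*, \sigma^*, R)$ and the investor's own preference parameter enter the calculation, which is the point of the Note above. A more risk-averse investor (larger a) simply holds less of the same risky fund.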

Domain of Investment Management: Stages of Production Process

[Diagram. Recoverable box labels:]

• Passive Well-Diversified Efficient Portfolio: "Efficient Exposures"

• Superior Performing Micro Aggregate Excess-Return Portfolio: "Alpha Engines"

• Active Asset-Class Allocation: Macro Sector Market Timing

• Super Efficient Portfolio of Risky Assets (Optimal Combination of Risky Assets)

• Riskless Asset Portfolio

• Optimal Portfolio of Assets

• Alter Shape of Payoffs on Underlying Optimal Portfolio (Derivative Securities with Non-Linear Payoffs)

• Structured Efficient Form of Payouts to Client

• Client: Households, Entrepreneurs, Endowment, Corporation

[Recoverable annotations:]

• Components of Best Performing Risky Assets Only Portfolio

• Diversification Risk Modulation

• Risk Modulation through Hedging or Leveraging

• Market Timing Active Management

• Risk Modulation through Insurance or non-linear leverage

• Pre-programmed dynamic trading

• "Building Block" State-Contingent Securities to create specialized payout patterns

• Tax efficient

• Regulatory efficient

• Liquidity allocation


Passive Management: Efficient Exposures to Various Asset Classes

Implementing Diversification as one of the Three Risk Management Tools

[Diagram. Recoverable labels:]

Macro Asset Classes: Small-Cap Domestic Equities; Mid-Cap Domestic Equities; Large-Cap Domestic Equities; Fixed-Income; Real Estate; Other

Passive Well-Diversified Efficient Portfolio: Weighted to match a benchmark; Indexing of portfolios


Active Management: Enhancing Portfolio Performance

Asset-Class Allocation: Macro-Sector Market Timing

"Long-Short" combinations to change fractional allocations from Benchmark Weights

ASSET CLASS               BENCHMARK WEIGHT    LONG (SHORT) INCREMENTAL    REVISED WEIGHT

Small-Cap Equity                 5%                    +5%                     10%

Mid-Cap Equity                  10%                     0%                     10%

Large-Cap Equity                30%                   (10%)                    20%

Emerging Market Equity          15%                    (5%)                    10%

Domestic Fixed-Income           30%                     5%                     35%

Real Estate                     10%                     5%                     15%

Total                          100%                     0%                    100%

Micro "Excess Return" Portfolio: Security Selection: "Alpha Engines"

[Diagram. Recoverable labels:]

Super-Performing Micro Aggregate Excess-Return Portfolio, formed by Optimal Weighting of:

Engine #1: U.S. Risk Arbitrage Hedge Fund

Engine #2: Technical Analysis of Equities Fund

Engine #3: Fundamental Analysis of Equities Fund

Engine #4: Foreign Currency Forecast Fund

Engine #5: Private Equity Fund

Engine #N: Mortgage-back Security Relative Value Fund

• Security Analysis

• Technical Analysis

• Proprietary Derivative-Security Pricing Models


Creating the Optimal Portfolio of Assets: Mix of Optimal Combination of Risky Assets ("OCRA") and the Riskless Asset

[Figure: Risk/Expected Reward "Menu". Recoverable axis and marker labels: Expected Reward; Risk; Riskless Reward; Optimal Portfolio Reward; OCRA Reward; Optimal Risk; OCRA Risk; Percentage of Optimal Portfolio Invested in OCRA (0, Optimum Percent, 100%).]

• Hedge or Leverage OCRA to obtain Optimal Portfolio

• Implement Macro Market timing of Risky versus Riskless Asset Performance


Transform Shape of Payoffs from Investing in the Optimal Portfolio: Derivatives

• Insurance and non-linear leverage

[Figure: “Insured Equity” Payoff versus “Uninsured Equity” Payoff. Horizontal axis: Value of Optimal Portfolio, $. Vertical axis: Value of Investor Insured Portfolio, $. Marked levels: $100,000 and the $95,000 Minimum Guarantee Floor.]

• Transform Payoff Pattern to fit precise preferences: custom design

[Figure: custom payoff pattern. Horizontal axis: Value of Optimal Portfolio, $. Vertical axis: Value of Investor Custom Pattern Portfolio, $. Marked levels: the $95,000 Minimum Guarantee Floor and the “Ceiling” Maximum Payout of $190,000.]


Structured Holdings to Create Most Efficient Form of Payouts to Client

• Tax efficient (income, wealth, estate/inheritance)

• Regulatory efficient

• Liquidity efficient

Tools

Derivatives: Futures, Forwards, Swap Contracts

Special Purpose Vehicle (SPV): Custom-created targeted-purpose security

Asset Substitution:

Municipal (tax-exempt) bonds for taxable bonds

liquid “on-the-run” US Treasury Bonds for “off-the-run” less-liquid US Treasury or Agency bonds

Location of Entity: (e.g., Bermuda for insurance)

Location of Assets and Liabilities: on or off-balance sheet; investment versus trading account; taxable or non-taxable part of one’s accounts.


XI. IMPLICATIONS OF PORTFOLIO THEORY FOR THE OPERATION OF THE CAPITAL MARKETS: THE CAPITAL ASSET PRICING MODEL

We have shown that for all risk-averse, mean-variance utility-maximizers who agree on

the expected returns, variances, and covariances of the individual basic securities, the optimal

portfolio chosen can be represented as a mix of two securities (portfolios): one security is the

riskless security with return R and the other is a particular combination of risky assets. We now

consider the implications of these results for equilibrium expected returns and asset prices.

Suppose that everyone in the market agreed on expectations. Then, if the market is in

equilibrium (i.e., the prices of securities are such that when investors are holding their optimal

portfolios, the aggregate supply of each security is equal to the aggregate amount of that security

demanded), what must be the composition of the "risky" portfolio represented by point B in

X.11 in Section X (i.e., the optimal combination of risky assets with mean \(\bar{Z}_*\) and variance \(\sigma_*^2\))? In Section X, it was shown that all investors would be indifferent between selecting an optimal

portfolio from the n risky assets and the riskless asset and from just two assets: the "risky"

mutual fund composed of the optimal combination of risky assets and the riskless asset. Hence,

for expositional purposes, assume that the investors just invest in the risky fund and the riskless

asset and that the fund then invests the money in the primary risky securities according to

formula (X.18). Let there be \(K\) investors and consider investor #\(k\), \(k = 1,2,\ldots,K\). Let

\(w_*^k\) = fraction of the kth investor's wealth invested in the risky fund in his optimal portfolio;

\(W_0^k\) = amount of initial wealth of the kth investor;

\(d^k \equiv w_*^k W_0^k\) = number of dollars invested by the kth investor in the fund = demand for the fund.

Define: \(M\) = equilibrium market value of all risky assets ("the market") \(= \sum_{i=1}^{n} N_i P_i\), where \(N_i\) = number of shares of security \(i\) outstanding and \(P_i\) = equilibrium price per share of firm \(i\).

Define: \(V_R\) = equilibrium market value of the aggregate supply of riskless asset (which may be zero).

In equilibrium, aggregate wealth \(W\) must satisfy

(XI.1) \[ W \equiv \sum_{k=1}^{K} W_0^k = \sum_{i=1}^{n} N_i P_i + V_R = M + V_R. \]

In equilibrium, aggregate demand = aggregate supply, i.e.,

(XI.2) \[ \sum_{k=1}^{K} d^k = \sum_{k=1}^{K} w_*^k W_0^k = M; \qquad \sum_{k=1}^{K} (1 - w_*^k)\, W_0^k = V_R. \]

So, \(M\) is the total number of dollars invested in the fund. How much is (implicitly) invested in risky primary asset \(i\)? From (X.18) the total dollars of investment demanded in security \(i\) is

(XI.3) \[ D_i \equiv w_i^* M, \qquad i = 1,2,\ldots,n. \]

But, in equilibrium, the supply of asset \(i\) must equal the demand, i.e.,

(XI.4) \[ D_i = N_i P_i, \qquad i = 1,2,\ldots,n. \]

From (XI.3) and (XI.4),

(XI.5) \[ w_i^* = \frac{N_i P_i}{M} = \frac{N_i P_i}{\sum_{j=1}^{n} N_j P_j}, \qquad i = 1,2,\ldots,n. \]


Thus, in equilibrium, the prices must be such that the fraction of the optimal-combination-of-

risky-assets portfolio allocated to security i must equal the ratio of the market value of the ith

security to the market value of all risky assets. A portfolio which holds assets in proportion to

their market value is called a market portfolio. (XI.5) states that in equilibrium, the optimal

combination of risky assets must be a market portfolio. Since each investor's optimal portfolio is

a combination of the optimal combination of risky assets and the riskless asset, we have that in

equilibrium, each investor holds a combination of the market portfolio and the riskless asset.

Further, since we have that every efficient portfolio (except just holding the riskless asset alone)

is perfectly positively correlated with every other, all investors' portfolios are perfectly correlated. Further, since

the relative holdings of risky assets by each investor are the same as in the market portfolio and

since prices cannot be negative, we have that in equilibrium, no investor will optimally short-sell

any risky asset. Can we say more? Let \(Z_M\) be the random variable return per dollar invested in the market portfolio; then \(\bar{Z}_M \equiv E(Z_M)\) is the expected return on the market and \(\sigma_M^2 \equiv E\{(Z_M - \bar{Z}_M)^2\}\) the variance of the return on the market. In equilibrium,

\[ \bar{Z}_M = \bar{Z}_* = \sum_{i=1}^{n} w_i^* \bar{Z}_i; \qquad \sigma_M^2 = \sigma_*^2 = \sum_{i=1}^{n}\sum_{j=1}^{n} w_i^* w_j^* \sigma_{ij}, \]

where \(w_i^*\) is as defined in (X.18).
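A minimal numerical sketch of (XI.5) and of the market mean and variance may help fix ideas. The share counts, prices, expected returns, and covariances below are hypothetical inputs chosen only for illustration, and the function names are not from the notes.

```python
def market_weights(shares, prices):
    """(XI.5): w_i = N_i P_i / M, with M the total market value of risky assets."""
    values = [n * p for n, p in zip(shares, prices)]
    total = sum(values)
    return [v / total for v in values], total


def portfolio_mean_variance(weights, means, cov):
    """Mean = sum_i w_i Z_i; variance = sum_i sum_j w_i w_j sigma_ij."""
    n = len(weights)
    mean = sum(w * z for w, z in zip(weights, means))
    variance = sum(weights[i] * weights[j] * cov[i][j]
                   for i in range(n) for j in range(n))
    return mean, variance


# Hypothetical three-security market.
shares = [100, 50, 200]          # N_i
prices = [10.0, 40.0, 5.0]       # P_i
means = [1.08, 1.12, 1.06]       # expected return per dollar on each security
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.0225]]     # sigma_ij

w_market, M = market_weights(shares, prices)
mean_M, var_M = portfolio_mean_variance(w_market, means, cov)
print(w_market, M, mean_M, var_M)
```

The weights depend only on relative market values, so doubling every price leaves the market portfolio's composition unchanged.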

The following derivation is designed to avoid using any mathematics beyond the

elementary calculus, and therefore, is somewhat tedious. A direct analytical proof can be found

in Merton, "Analytical Derivation... Portfolio Frontier", p. 1868-1871.

Question: In equilibrium, can we deduce the relationship among expected rates of return on

securities and develop a systematic, quantitative measure of the "risk" of a security?


Figure XI.1

Consider a Portfolio of Three Securities

Let \(w_1\) = % invested in security \(i\) (any security not on efficient frontier)

\(w_2\) = % invested in the market portfolio (optimal combination of risky assets)

\(1 - w_1 - w_2\) = % invested in the riskless asset

(security \(i\) has expected return \(\bar{Z}_i\) and standard deviation \(\sigma_i\), i.e., point A)

The return on the portfolio, \(Z\), is

(XI.6) \[ Z = w_1 Z_i + w_2 Z_M + (1 - w_1 - w_2) R = w_1 (Z_i - R) + w_2 (Z_M - R) + R. \]

The expected return is

(XI.7) \[ E(Z) = \bar{Z} = w_1 (\bar{Z}_i - R) + w_2 (\bar{Z}_M - R) + R \]


and the variance is

(XI.8) \[ \mathrm{Var}(Z) = \sigma^2 = E\{(Z - \bar{Z})^2\} = w_1^2 \sigma_i^2 + w_2^2 \sigma_M^2 + 2 w_1 w_2 \sigma_{iM} \]

where \(\sigma_{iM}\) = the covariance between the return on the ith security and the market portfolio. Clearly, from the definition of efficient portfolio, the only time that this three-asset portfolio is efficient is when \(w_1 = 0\). I.e., no investor would hold this portfolio as an optimal portfolio unless \(w_1 = 0\). Now, only consider mixes of the three securities which lead to an expected return on the portfolio \(= m\ (= \bar{Z}_i)\). (In Figure XI.1, this is represented by the dotted line through AC.) How do we find the minimum-variance portfolio constructed from these three securities with expected return \(m\)? Set \(m = E(Z) = \bar{Z} = w_1 (m - R) + w_2 (\bar{Z}_M - R) + R\) (using \(\bar{Z}_i = m\)). Then

(XI.9) \[ w_2 = \frac{m - R}{\bar{Z}_M - R} - w_1 \frac{\bar{Z}_i - R}{\bar{Z}_M - R} = \frac{(1 - w_1)(m - R)}{\bar{Z}_M - R}. \]

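Substitute for \(w_2\) from (XI.9) into the expression for the variance (XI.8) to get

(XI.10) \[ \sigma^2 = w_1^2 \sigma_i^2 + \left[ \frac{(m - R)^2}{(\bar{Z}_M - R)^2} - \frac{2 (m - R)(\bar{Z}_i - R)}{(\bar{Z}_M - R)^2}\, w_1 + w_1^2 \frac{(\bar{Z}_i - R)^2}{(\bar{Z}_M - R)^2} \right] \sigma_M^2 + 2 w_1 \left[ \frac{m - R}{\bar{Z}_M - R} - w_1 \frac{(\bar{Z}_i - R)}{(\bar{Z}_M - R)} \right] \sigma_{iM} \]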

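or, by rearranging terms,

(XI.10') \[ \sigma^2 = \left[ \sigma_i^2 + \frac{(\bar{Z}_i - R)^2}{(\bar{Z}_M - R)^2} \sigma_M^2 - \frac{2 (\bar{Z}_i - R)}{(\bar{Z}_M - R)} \sigma_{iM} \right] w_1^2 + 2 \left[ \sigma_{iM} - \frac{(\bar{Z}_i - R)}{(\bar{Z}_M - R)} \sigma_M^2 \right] \frac{(m - R)}{(\bar{Z}_M - R)}\, w_1 + \frac{(m - R)^2}{(\bar{Z}_M - R)^2} \sigma_M^2 \]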

Now, to find the minimum variance portfolio, we differentiate \(\sigma^2\) in (XI.10') with respect to \(w_1\) and the minimizing \(w_1\) will be where

\[ \frac{d\sigma^2}{dw_1} = 0. \]

Call \(w_1^*\) the \(w_1\) which minimizes (XI.10').

Differentiating (XI.10'), we have that

(XI.11) \[ \frac{d\sigma^2}{dw_1} = 2 w_1^* \left[ \sigma_i^2 + \frac{(\bar{Z}_i - R)^2}{(\bar{Z}_M - R)^2} \sigma_M^2 - \frac{2 (\bar{Z}_i - R)}{(\bar{Z}_M - R)} \sigma_{iM} \right] + 2 \left[ \sigma_{iM} - \frac{(\bar{Z}_i - R)}{(\bar{Z}_M - R)} \sigma_M^2 \right] \left( \frac{m - R}{\bar{Z}_M - R} \right) = 0 \quad \text{at } w_1 = w_1^*. \]

But, we know that the variance-minimizing portfolio will be on the efficient frontier (point C in Figure XI.1) where \(w_1 = 0\). Therefore,

(XI.12) \[ w_1^* = 0. \]

But, from this condition (XI.12) and (XI.11), we have that either (a) \(m = \bar{Z}_i = R\); or (b)

\[ \left[ \sigma_{iM} - \frac{(\bar{Z}_i - R)}{(\bar{Z}_M - R)} \sigma_M^2 \right] \equiv 0. \]

Since we chose security \(i\) arbitrarily, unless the expected return on all securities = \(R\), it must be that condition (b) holds. So that in equilibrium,

(XI.13) \[ \bar{Z}_i - R = \left( \frac{\sigma_{iM}}{\sigma_M^2} \right) (\bar{Z}_M - R) \]

for any security. (XI.13) is more commonly written as

(XI.14) \[ \bar{Z}_i - R = \beta_i (\bar{Z}_M - R) \]

where \(\beta_i \equiv \sigma_{iM} / \sigma_M^2\) ("beta").

(XI.13), or equivalently (XI.14), is the fundamental equation relating the equilibrium expected return on any security to that on any other. This equation is called the Security Market Line.
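The Security Market Line can be checked numerically. The sketch below assumes that the optimal combination of risky assets has weights proportional to the inverse covariance matrix applied to the excess expected returns (the usual mean-variance tangency solution, taken here as the content of (X.18)); the inputs are hypothetical, and `solve`, `tangency_weights`, and `betas` are illustrative names.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def tangency_weights(means, cov, R):
    """Weights proportional to (covariance matrix)^-1 (means - R), scaled to sum to one."""
    x = solve(cov, [z - R for z in means])
    total = sum(x)
    return [xi / total for xi in x]


def betas(weights, cov):
    """beta_i = sigma_iM / sigma_M^2 for the portfolio described by `weights`."""
    n = len(weights)
    cov_iM = [sum(weights[j] * cov[i][j] for j in range(n)) for i in range(n)]
    var_M = sum(weights[i] * cov_iM[i] for i in range(n))
    return [c / var_M for c in cov_iM]


# Hypothetical inputs: R is one plus the riskless rate.
R = 1.03
means = [1.08, 1.12, 1.06]
cov = [[0.04, 0.01, 0.00],
       [0.01, 0.09, 0.02],
       [0.00, 0.02, 0.0225]]

w = tangency_weights(means, cov, R)
mean_M = sum(wi * zi for wi, zi in zip(w, means))
beta = betas(w, cov)

# Gap between each security's risk premium and beta_i times the market premium.
sml_gap = [(means[i] - R) - beta[i] * (mean_M - R) for i in range(len(means))]
print(w, beta, sml_gap)
```

Up to rounding error every gap is zero: when the "market" is the optimal combination of risky assets, each security's risk premium equals its beta times the market's risk premium.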

Under conditions of homogeneous expectations and equilibrium, the efficient portfolio

frontier is called the Capital Market Line.

Figure XI.2

The equation of that line can be written as

(XI.15) \[ \bar{Z} = R + r_e \sigma, \]

and, in equilibrium, the market portfolio is an efficient portfolio, and hence, must be on the line. I.e., \(\bar{Z}_M = R + r_e \sigma_M\), or

(XI.16) \[ r_e \equiv \frac{\bar{Z}_M - R}{\sigma_M}, \]

where \(r_e\) is called the price of risk-reduction for efficient portfolios. It will be important for later analysis to remember that: even if expectations are not homogeneous; even if the

market is not in equilibrium; even if people are not mean-variance maximizers, one can still form

a market portfolio; and by computing its mean and standard deviation, the Capital Market Line in

Figure XI.2 can be formed. This line represents all portfolio combinations of the market

portfolio with the riskless asset (where the market portfolio is never sold short). The conditions

that the market is in equilibrium and that there is agreement, imply that the Capital Market Line

is the locus of efficient portfolios and that there are no feasible portfolios with mean-variance

combinations above that line.

While the Capital Market Line describes the equilibrium expected return relationship

among efficient portfolios, the Security Market Line (XI.13) or (XI.14) describes the equilibrium

expected return relationships among all individual securities or portfolios (efficient or not). We

can rewrite (XI.13) or (XI.14) as

(XI.17) \[ \bar{Z}_i - R = r_s \sigma_{iM}, \qquad i = 1,2,\ldots,n \]

where \(r_s \equiv \left( \dfrac{\bar{Z}_M - R}{\sigma_M^2} \right)\) is called the price of risk-reduction for securities.

Figure XI.3

Note: \(\sigma_{kM}\) can be negative in which case the equilibrium expected return on that security, \(\bar{Z}_k\), will be less than \(R\).

The Risk of a Security

If risk is defined as that measure such that as it increases, a risk-averse investor would have to be compensated by a larger expected return in order that he would continue to hold it in his optimal portfolio, then, from Figure XI.3 the measure of a security's (relative) risk is its covariance with the market. An equivalent measure is from (XI.14), the "beta" of the security.

Note: From Figure XI.3, only the risk of efficient portfolios can be measured by its standard deviation or variance. By definition, \(\sigma_{iM} \equiv \rho_{iM} \sigma_i \sigma_M\) where \(\rho_{iM} \equiv\) correlation coefficient between the return on the ith security and the market portfolio. So, we can rewrite (XI.17) in terms of \(r_e\) [in (XI.16)] as

(XI.18) \[ \bar{Z}_i - R = r_e\, \rho_{iM}\, \sigma_i. \]

For a fixed risk premium, \(\bar{Z}_i - R\), and fixed price of risk-reduction, \(r_e\), what value for \(\rho_{iM}\) allows \(\sigma_i\) to be as small as possible? Clearly, the largest value of \(\rho_{iM}\), namely, \(\rho_{iM} = 1\). Note: when \(\rho_{iM} = 1\), (XI.18) and (XI.15) are the same. So again, we see that all efficient portfolios (with \(\sigma > 0\)) are perfectly correlated with the market portfolio.

Figure XI.4

Note: For \(\rho_{iM} = 0\), \(\bar{Z}_i = R\), independent of \(\sigma_i\).

The intuition behind the Security Market Line can be developed in a variety of ways.

One way is by using a marginal analysis to study the effect of a small change in portfolio

composition.


If the market portfolio combined with the riskless asset is an efficient portfolio, then one

cannot both increase the expected return and reduce the variance by combining this portfolio with

another asset.

Consider an investor who has selected a portfolio with return given by

\(Z^* = w^* Z_M + (1 - w^*) R\). The expected return on this portfolio is \(\bar{Z}^* = w^* (\bar{Z}_M - R) + R\) and its variance is \(\mathrm{Var}(Z^*) = [w^*]^2 \sigma_M^2\). Consider the effect of a small change in this portfolio achieved by increasing the fraction held in asset \(i\) by \(\delta\) and decreasing the holding in the riskless asset by \(\delta\). The return on this portfolio can be written as

\[ Z = w^* Z_M + (1 - w^*) R + \delta (Z_i - R) = Z^* + \delta (Z_i - R), \]

and it follows that

(XI.19a) \[ \bar{Z} = \bar{Z}^* + \delta (\bar{Z}_i - R) \]

and

(XI.19b) \[ \mathrm{Var}(Z) = \mathrm{Var}(Z^*) + 2 \delta w^* \sigma_{iM} + \delta^2 \sigma_i^2. \]

The effect of this small change on the mean and variance can be determined by differentiating (XI.19) with respect to \(\delta\) and evaluating the derivative at \(\delta = 0\). That is,

(XI.20a) \[ \left[ d\bar{Z} / d\delta \right]_{\delta = 0} = \bar{Z}_i - R \]

and

(XI.20b) \[ \left[ d\,\mathrm{Var}(Z) / d\delta \right]_{\delta = 0} = 2 w^* \sigma_{iM}. \]

Case (i): Suppose that \(\sigma_{iM} < 0\). If \(\bar{Z}_i - R \ge 0\), then by moving \(\delta\) from \(\delta = 0\) to \(\delta > 0\), one could reduce the variance of the portfolio and not reduce its expected return. But, this would contradict the efficiency of \(Z^*\) and therefore, the efficiency of the market portfolio. Hence, if the market portfolio is efficient and \(\sigma_{iM} < 0\), then \(\bar{Z}_i - R < 0\).

Case (ii): Suppose that \(\sigma_{iM} > 0\). If \(\bar{Z}_i - R \le 0\), then by moving \(\delta\) from \(\delta = 0\) to \(\delta < 0\), one could reduce the variance of the portfolio and not reduce its expected return. Again, such a possibility would contradict the efficiency of the market. So, if the market portfolio is efficient and \(\sigma_{iM} > 0\), then \(\bar{Z}_i - R > 0\).

Case (iii): Suppose that \(\sigma_{iM} = 0\). If \(\bar{Z}_i > R\), then by moving \(\delta\) from \(\delta = 0\) to \(\delta > 0\), one could increase the expected return on the portfolio and not increase its variance at the margin. If \(\bar{Z}_i < R\), then by moving \(\delta\) from \(\delta = 0\) to \(\delta < 0\), one could increase the expected return and not increase its variance. Because either possibility would violate the efficiency of the market, it follows that \(\bar{Z}_i = R\) if \(\sigma_{iM} = 0\).

Perhaps because equations (XI.17) and (XI.18) lack some intuitive appeal, expression

(XI.14) for the Security Market Line has generally been the preferred form in popular use.

"Beta" in (XI.14) is frequently called the (relative) "volatility coefficient of the security." To see

why, we proceed as follows:

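Define the random variables \(\varepsilon_k \equiv Z_k - \bar{Z}_k\) and \(\varepsilon_M \equiv Z_M - \bar{Z}_M\). Then, by construction, \(E(\varepsilon_k) = E(\varepsilon_M) = 0\) and \(\mathrm{Var}(\varepsilon_k) = \sigma_k^2\) and \(\mathrm{Var}(\varepsilon_M) = \sigma_M^2\). \(\varepsilon_k\) and \(\varepsilon_M\) are the unanticipated parts of the returns on asset \(k\) and the market portfolio, respectively.

From the definition of \(\varepsilon_k\) and \(\varepsilon_M\) and from (XI.13), we can write the return on asset \(k\) as

(XI.21) \[ \begin{aligned} Z_k &= R + \frac{\sigma_{kM}}{\sigma_M^2} (\bar{Z}_M - R) + \varepsilon_k \\ &= R + \frac{\sigma_{kM}}{\sigma_M^2} (Z_M - \varepsilon_M - R) + \varepsilon_k \\ &= a_k + \beta_k Z_M + u_k \end{aligned} \]

where \(\beta_k \equiv \sigma_{kM} / \sigma_M^2\); \(a_k \equiv (1 - \beta_k) R\); and \(u_k \equiv \varepsilon_k - \beta_k \varepsilon_M\). By construction, the random variable \(u_k\) has the property that \(E(u_k) = 0\) and \(\mathrm{Var}(u_k) = \sigma_k^2 - \beta_k^2 \sigma_M^2 = (1 - \rho_{kM}^2)\, \sigma_k^2\). Further we have that

(XI.22) \[ \begin{aligned} \mathrm{Cov}(Z_M, u_k) &= \mathrm{Cov}(\varepsilon_M, \varepsilon_k) - \beta_k\, \mathrm{Cov}(\varepsilon_M, \varepsilon_M) \\ &= \sigma_{kM} - \beta_k \sigma_M^2 \\ &= 0 \end{aligned} \]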

since \(\beta_k = \sigma_{kM} / \sigma_M^2\). That is, \(u_k\) is uncorrelated with the return on the market. For the reader familiar with basic single variable regression theory, \(\beta_k\) is equal to the (theoretical) regression coefficient from regressing \(Z_k\) on \(Z_M\) and a constant. If (XI.21) were viewed formally as a regression equation, \(u_k\) would be called the residual and represent that part of the return on asset \(k\), \(Z_k\), which is not "explained" by the return on the market, \(Z_M\). As is well-known, the residual in a least-squares regression is always uncorrelated with the independent variable: Hence, (XI.22) simply reaffirms that result.
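The sample analogue of (XI.22) can be illustrated with simulated data. The return-generating process below (a "true" beta of 1.5, the chosen volatilities, and the seed) is purely hypothetical and serves only to produce numbers; none of it comes from the notes.

```python
import random

random.seed(0)  # fixed seed so the illustration is repeatable

# Hypothetical sample of market returns and returns on asset k.
T = 500
z_m = [1.08 + random.gauss(0.0, 0.20) for _ in range(T)]
z_k = [1.03 + 1.5 * (zm - 1.03) + random.gauss(0.0, 0.10) for zm in z_m]


def mean(xs):
    return sum(xs) / len(xs)


def cov(xs, ys):
    """Sample covariance (divisor T), matching the definition sigma_xy = E[(x - Ex)(y - Ey)]."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Sample beta: sigma_kM / sigma_M^2.
beta_k = cov(z_k, z_m) / cov(z_m, z_m)

# Unanticipated parts of the returns and the residual u_k = eps_k - beta_k * eps_M.
mean_k, mean_m = mean(z_k), mean(z_m)
eps_k = [z - mean_k for z in z_k]
eps_m = [z - mean_m for z in z_m]
u_k = [ek - beta_k * em for ek, em in zip(eps_k, eps_m)]

cov_u_m = cov(u_k, z_m)   # zero by construction, up to rounding error
print(beta_k, mean(u_k), cov_u_m)
```

The residual is uncorrelated with the market return whatever the data look like, because beta is defined as the covariance divided by the market variance. That is the sense in which (XI.22) holds "by construction".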

Viewing (XI.21) as a regression equation is probably the reason that \(\beta_k\) is thought of as a relative volatility measure. That is, (neglecting the "residual," \(u_k\)), if the market goes up 10%, and if \(\beta_k = 2\), then the return on security \(k\) would be 20% (plus \(a_k\)); and if the market goes down 10%, then the return on security \(k\) would be –20% (plus \(a_k\)). Thus, securities (or portfolios) with large "betas" \((\beta_k)\) are called "volatile" or "aggressive" securities and in a similar fashion, securities with small "betas" are called "defensive" securities. From (XI.14), we can draw the Security Market Line in terms of beta:


Figure XI.5

Note: All securities with \(\beta_k = 1\) have expected returns \(= \bar{Z}_M\); securities with \(\beta_k = 0\) have expected returns \(= R\).

The reader is warned that, in general, the regression interpretation of β k is only a

heuristic. (XI.21) was derived simply by construction with no assumptions about the joint distribution of \(Z_k\) and \(Z_M\). Although, by construction, \(u_k\) is always uncorrelated with \(Z_M\), the stronger condition that \(E(u_k \mid Z_M) = 0\) is required for (XI.21) to be a valid regression equation. Moreover, even if this condition were satisfied, one cannot attribute strict causality between \(Z_M\) and \(Z_k\). Nonetheless,

this interpretation does provide some intuition for what beta is.

Systematic (or "Market") Risk and Unsystematic (or "Pure" or "Unnecessary") Risk

Equation (XI.21) holds for all securities or portfolios in equilibrium. Further, if the portfolio \(k\) is efficient, then \(u_k \equiv 0\). Hence, if an investor holds an efficient portfolio, then he will not be subjected to the (additional) uncertainty of return caused by \(u_k\). Since all investors can satisfy their portfolio demands (in equilibrium) by holding efficient portfolios, any investor who holds an (inefficient) portfolio where \(u_k \not\equiv 0\) is exposing himself to an (unnecessary) additional risk. Hence, \(u_k\) is called the unsystematic or unnecessary risk of security or portfolio \(k\). Note that even if the investor holds an efficient portfolio \((u_k \equiv 0)\), the return on the portfolio (for beta \(\neq 0\)) is uncertain because the return on the market is uncertain. This is an irreducible or "necessary" risk that he must take to get the expected return \(\bar{Z}_k\). Thus \(\beta_k Z_M\), or really \(\beta_k (Z_M - \bar{Z}_M) = \beta_k \varepsilon_M\), is called the systematic risk of security \(k\) and because it is proportional to the market return, it is often called the market risk of security \(k\).

The uncertain part of a security's return, \(\varepsilon_k\), can always be written as the sum of systematic and unsystematic risk: namely, as

(XI.23) \[ \varepsilon_k = \beta_k \varepsilon_M + u_k. \]

From (XI.22) and (XI.23), the total variance of the return can be written as

(XI.24) \[ \sigma_k^2 = \underbrace{\beta_k^2 \sigma_M^2}_{\text{Variance of Systematic Part}} + \underbrace{\mathrm{Var}(u_k)}_{\text{Variance of Unsystematic Part}} \]

Note: From (XI.14), the equilibrium (expected) reward or risk premium or excess return, \(\bar{Z}_k - R\), is proportional to \(\beta_k \sigma_M\), the standard deviation of the systematic part of the total risk, and not \(\sigma_k\), the standard deviation of total risk.

Hence, investors only get extra (expected) return from bearing larger systematic risk, or

alternatively, the market does not reward investors for choosing inefficient portfolios and

exposing themselves to more risk than is necessary.
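The decomposition in (XI.24) and the pricing statement in (XI.14) can be put side by side in a small sketch. The standard deviations, correlation, and expected returns below are hypothetical inputs, and `decompose_risk` is an illustrative name.

```python
def decompose_risk(sigma_k, sigma_M, rho_kM):
    """Split total variance into systematic and unsystematic parts, as in (XI.24)."""
    beta = rho_kM * sigma_k / sigma_M          # sigma_kM / sigma_M^2
    systematic = beta ** 2 * sigma_M ** 2      # equals rho^2 * sigma_k^2
    unsystematic = (1.0 - rho_kM ** 2) * sigma_k ** 2
    return beta, systematic, unsystematic


# Hypothetical security: 30% total standard deviation, correlation 0.5 with the market.
beta, systematic, unsystematic = decompose_risk(sigma_k=0.30, sigma_M=0.20, rho_kM=0.5)

# Equilibrium expected return from (XI.14), with hypothetical R and market mean.
R, mean_M = 1.03, 1.10
expected_k = R + beta * (mean_M - R)
print(beta, systematic, unsystematic, expected_k)
```

In this example three quarters of the security's variance is unsystematic, yet its risk premium depends only on beta. A second security with the same beta and no unsystematic risk would carry the same expected return.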

Implications of the Capital Asset Pricing Model for Portfolio Selection

The analysis of the previous section provides a very simple portfolio selection strategy

(independently of whether the CAPM holds in the "real" world): Namely, (1) diversify your

holdings as much as possible (i.e., hold each security available in proportion to its value relative


to the market); (2) Borrow (or lend) to lever (or "cool down") this portfolio until one achieves the

"right" expected return-standard deviation tradeoff. This selection rule is called a naive or

passive rule because it requires little analysis (only an estimate of the market expected return and

variance) and nothing about individual securities.

Since it is always a feasible portfolio policy, one can use this rule to provide a benchmark

for comparison of overall portfolio performance of active portfolio selection rules as will be

shown in Section XIII. Clearly, such active portfolio management should, as a minimum,

provide at least as good performance (after deducting costs) as the passive policy.

Except for certain bookkeeping and purchasing economies of scale, the naive strategy eliminates the need for a portfolio manager altogether.
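The passive rule can be sketched as a one-line calculation: pick the fraction of wealth in the market portfolio that delivers the desired standard deviation, and read the expected return off the Capital Market Line. The inputs below are hypothetical, and `passive_portfolio` is an illustrative name.

```python
def passive_portfolio(target_sigma, mean_market, sigma_market, riskless):
    """Fraction held in the market portfolio and the resulting expected return.

    A fraction above 1 means borrowing to lever the market portfolio;
    a fraction below 1 means lending part of one's wealth ("cooling down").
    """
    fraction = target_sigma / sigma_market
    expected = riskless + fraction * (mean_market - riskless)
    return fraction, expected


# Hypothetical market: expected return 1.10 per dollar, standard deviation 0.20, R = 1.03.
cool = passive_portfolio(0.10, mean_market=1.10, sigma_market=0.20, riskless=1.03)
levered = passive_portfolio(0.30, mean_market=1.10, sigma_market=0.20, riskless=1.03)
print(cool, levered)
```

Only the market's expected return and standard deviation enter the calculation, which is why the rule requires no analysis of individual securities.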


XII. RISK-SPREADING VIA FINANCIAL INTERMEDIATION: LIFE INSURANCE

As discussed briefly at the end of Section V, financial assets can be traded directly in the

capital markets or indirectly through financial intermediaries. In general, "standardized"

securities are traded in markets (e.g., government bonds, wheat futures, shares of IBM) while

"custom" contracts (e.g., individual mortgage, personal loan, or insurance) are handled through

financial intermediaries. In this section, the classical case of pure life insurance is examined to

show how efficient risk-sharing can be achieved using a combination of financial intermediation

and the capital market.

The Life Insurance Company

Suppose that there are \(N\) people in the economy each with wealth (per capita) \(W\). Hence, national wealth \(= NW\). Suppose further that each person purchases a one-year term life insurance policy which pays $c in the event of death and we define \(q \equiv c/W\) to be the amount of insurance coverage purchased by each person as a fraction of his wealth. Let \(y_i\) be a random variable describing the death of the ith person where \(y_i = 0\) if person \(i\) survives the year and \(y_i = 1\) if person \(i\) dies during the year. Assume that the mortality tables are such that \(E(y_i) = \rho\), the same for all people, \(i = 1,2,\ldots,N\), and \(\mathrm{Var}(y_i) = V^2\), which is also the same for all people. Hence, \(\rho\) is the expected number of deaths per person \((0 < \rho < 1)\). Let \(Y_N = \sum_{i=1}^{N} y_i\) be the random variable for deaths of all people and it is equal to the number of deaths in the economy. If the death of one person is independent of another (a crucial but reasonable assumption), then

\[ E[Y_N] = N\rho; \qquad \mathrm{Var}(Y_N) = N V^2. \]


If a single competitive insurance company writes all the policies, then the analysis will determine

the:

• Premium per policy charged, \(P_N\).

• The amount of equity capital required by the company to do business, \(K_N\).

• The required (expected) return on the equity by investors in the insurance company.

Premiums are received at the beginning of the year in the amount \(N P_N\). Benefits are paid at the end of the year in the (random variable) amount \(C_N \equiv c\,Y_N\). Hence,

\[ E[C_N] = \bar{C}_N = N c \rho; \qquad \mathrm{Var}(C_N) = \sigma_N^2 = N c^2 V^2. \]

Suppose that investors are mean-variance maximizers and that the conditions for the Capital

Asset Pricing Model (Section XI) hold. If R = 1 + rate of interest, then the return per dollar

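invested in equity of the insurance company is \(Z_N \equiv \dfrac{R (N P_N + K_N) - C_N}{K_N}\), and

\[ E[Z_N] = \bar{Z}_N = \frac{R (N P_N + K_N) - N c \rho}{K_N}; \qquad \mathrm{Var}[Z_N] = \sigma_{Z_N}^2 = \frac{\sigma_N^2}{K_N^2} = \frac{N c^2 V^2}{K_N^2}. \]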

In equilibrium, supply must equal demand, and so, for the equity of the insurance company to be held, we have that \(\bar{Z}_N\) must satisfy the basic equilibrium condition for the CAPM:

\[ \frac{K_N}{NW} = w^*_{Z_N} \]

(supply = demand for the asset, i.e., its fraction of the market portfolio), where \(w^*_{Z_N}\) is also the fraction in the optimal combination of risky assets given by:

\[ w^*_{Z_N} = \frac{\sum_{j=1}^{n} v_{Z_N j}\, (\bar{Z}_j - R)}{A - RC}. \]

Suppose (as is reasonable) that \(Z_N\) is independent of the returns on all other assets (i.e., \(\mathrm{Cov}(Z_N, Z_j) = 0\) for \(Z_j \neq Z_N\)), then

\[ \mathrm{Cov}(Z_N, Z_M) = \sum_{j=1}^{n} w_j^*\, \mathrm{Cov}(Z_N, Z_j) = w^*_{Z_N}\, \mathrm{Var}(Z_N) = \left( \frac{K_N}{NW} \right) \left( \frac{N c^2 V^2}{K_N^2} \right) = \frac{c^2 V^2}{W K_N}. \]

From Section XI, we have from the Security Market Line that

\[ \bar{Z}_N - R = r_S\, \mathrm{Cov}(Z_N, Z_M) = r_S \left( \frac{c^2 V^2}{W K_N} \right). \]

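Substituting for \(\bar{Z}_N\), we have that

\[ \frac{R (N P_N + K_N) - N c \rho}{K_N} - R = r_S \left( \frac{c^2 V^2}{W K_N} \right) \]

or

\[ R N P_N - N c \rho = r_S \left( \frac{c^2 V^2}{W} \right) \]

or, substituting for \(c = qW\),

\[ R P_N - q W \rho = r_S \left( \frac{q^2 W V^2}{N} \right). \]

The (equilibrium) premium per policy can be written as

\[ P_N = \frac{\rho\, q W}{R} + \frac{1}{NR}\, r_S\, q^2 W V^2 = \frac{\rho\, c}{R} + \frac{1}{NR}\, r_S\, q\, c\, V^2, \]

and therefore

\[ \bar{Z}_N = R + \frac{r_S\, q\, c\, V^2}{K_N}, \]

the (equilibrium) expected return on equity in the insurance company.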
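The premium formula is easy to evaluate. The sketch below uses illustrative inputs (a $30,000 policy, a death probability of .0025, and assumed values for the price of risk-reduction and the coverage ratio); the function names are not from the notes.

```python
def premium_per_policy(N, c, rho, V2, R, r_S, q):
    """P_N = rho*c/R + r_S*q*c*V^2 / (N*R): actuarial value plus a risk charge."""
    return rho * c / R + r_S * q * c * V2 / (N * R)


def expected_return_on_equity(K_N, c, V2, R, r_S, q):
    """Z_N (expected) = R + r_S*q*c*V^2 / K_N."""
    return R + r_S * q * c * V2 / K_N


# Illustrative inputs: c = $30,000, rho = .0025, V^2 = .0025, R = 1, r_S = 2, q = 1.
inputs = dict(c=30_000, rho=0.0025, V2=0.0025, R=1.0, r_S=2.0, q=1.0)

single_policy = premium_per_policy(N=1, **inputs)          # one insurer, one insured
large_pool = premium_per_policy(N=100_000_000, **inputs)   # a very large pool
print(single_policy, large_pool)
```

The risk charge falls in proportion to 1/N, so for a large pool the premium approaches the discounted expected benefit, \(\rho c / R\).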


Note: In this formal analysis, we have not taken into account the limited liability feature of the

equity of the insurance company which leads us to the last question to be answered:

What is the appropriate value for \(K_N\)?

To answer this question, one must go back and ask what service is the financial intermediary to provide to the customer? What does he want? The customer wants a certain payment, c, in the event of death. Now, if the total number of deaths is such that the (ex-post) benefits required to be paid, \(C_N\), are larger than the company's total assets, \(R (N P_N + K_N)\), then by limited liability on equity, the customer will not receive the full benefits promised, but only \(\dfrac{R}{Y_N} (N P_N + K_N) < c\). Obviously, the larger is \(K_N\) the less likely is default. If \(K_N\) is "too small", then the probability of default is higher, and the customer in purchasing the policy does not get the simple security he wanted which pays $c for sure in the event of his death, but rather has the more complicated security which pays $c in the event of his death, conditional on the company being solvent, and pays the (r.v.) amount \(\dfrac{R}{Y_N} (N P_N + K_N)\) in the event of death, conditional on the company not being solvent.

Clearly, to assess the probabilities about possible payoffs, the customer would have to know the amount

of capital the firm has; the nature and quantity of policies written for other customers; the probability of

these customers dying; etc. In short, nearly everything about the company that the management knows,

the customer would have to know and analyze. Essentially, the customer takes a (partial) equity position

in the company. Since one purpose of the financial intermediary is to limit the amount of information

required by customers to make a decision and because the service wanted is basically life insurance, the

equity capital should be large enough to (virtually) eliminate the chance of default. In doing so, the

separation between customer and equityholder (or general liability or debtholder) is made as large as

possible. Define: reserves as the amount of assets required to be held by the insurance company to ensure that payment will be made to customers with some probability. Let \(R_N \equiv\) required reserves \(= R (N P_N + K_N)\). So, given the premium, there is a one-to-one correspondence between reserves and capital (equity). Clearly, the amount of reserves required to ensure with absolute certainty that all customers will be paid in every state of the world would come by requiring that assets be large enough to pay off everyone in the event that everyone dies. I.e., \(C_N^{\max} = Nc\) or \(R_N^{\max}\) = maximum required reserves \(= Nc = R (N P_N + K_N^{\max})\) or

\[ K_N^{\max} = Nc \left( \frac{1 - \rho}{R} \right) - \frac{r_S\, q\, c\, V^2}{R}. \]

However, for \(Nc\) large and \(\rho\) reasonably small, the amount of capital required to meet the maximum reserves could be prohibitively large. Further, if \(\rho\) is small, there is a very small chance that everyone will die (especially since the events of death are independent) and one would expect that for large \(N\), there would be some diversifying effects. Hence, it may not be necessary for the company to hold the maximum reserves while still performing the essential service required.

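Suppose that instead it was required that reserves be such that the probability of default is less than some assigned level, \(p^*\), i.e., \(\mathrm{Prob}\{C_N > R_N\} \le p^*\), and define the associated required capital as \(K_N(p^*)\). Note: \(K_N(0) = K_N^{\max}\).

\[ \begin{aligned} \mathrm{Prob}\{C_N > R_N\} &= \mathrm{Prob}\Big\{ \Big( \sum_{i=1}^{N} y_i \Big) c > R\, K_N(p^*) + N c \rho + r_S\, c\, q\, V^2 \Big\} \\ &= \mathrm{Prob}\Big\{ X_N > \Big( \frac{R}{Vc} \Big) \frac{K_N(p^*)}{\sqrt{N}} + \frac{r_S\, q\, V}{\sqrt{N}} \Big\} \end{aligned} \]

where \(X_N \equiv \dfrac{\sum_{i=1}^{N} y_i - N\rho}{V \sqrt{N}}\), and \(E(X_N) = 0\); \(\mathrm{Var}(X_N) = 1\). For large \(N\), \(X_N\) will be distributed approximately standard normal (Gaussian). Hence,

\[ \mathrm{Prob}\Big\{ X_N > \Big( \frac{R}{Vc} \Big) \frac{K_N(p^*)}{\sqrt{N}} + \frac{r_S\, q\, V}{\sqrt{N}} \Big\} \approx 1 - \Phi\Big[ \Big( \frac{R}{Vc} \Big) \frac{K_N(p^*)}{\sqrt{N}} + \frac{r_S\, q\, V}{\sqrt{N}} \Big] = p^* \]

where \(\Phi[\,\cdot\,]\) is the cumulative distribution function for the normal distribution. For this distribution, there is a one-to-one correspondence between \(p^*\) and the number of standard deviations to the right of the mean. I.e., let \(\mu\) = number of standard deviations, then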

μ(p*)     p*
0         .5000
1.0       .1600
2.0       .0230
2.33      .0100
3.10      .0010
3.70      .0001   (1 chance in 10,000)
4.00      .00004  (1 chance in 25,000)
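The correspondence between \(p^*\) and \(\mu\) can be recomputed with the standard normal distribution in Python's standard library. This is only a cross-check; the function names are illustrative. Evaluated this way, the upper-tail probabilities come out slightly different from several of the tabulated figures (for example, about .1587 at one standard deviation and about .00003 at four), which suggests the table above is rounded.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def tail_probability(mu):
    """p* = 1 - Phi(mu): probability of landing more than mu standard deviations above the mean."""
    return 1.0 - std_normal.cdf(mu)


def deviations_for(p_star):
    """Inverse map: the mu for which the upper-tail probability equals p*."""
    return std_normal.inv_cdf(1.0 - p_star)


for mu in (0.0, 1.0, 2.0, 2.33, 3.10, 3.70, 4.00):
    print(f"mu = {mu:4.2f}   p* = {tail_probability(mu):.6f}")

print("mu for p* = .00004:", deviations_for(0.00004))
```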

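Hence, \(\mu(p^*) = \Big( \dfrac{R}{Vc} \Big) \dfrac{K_N(p^*)}{\sqrt{N}} + \dfrac{r_S\, q\, V}{\sqrt{N}}\) or

\[ K_N(p^*) = \Big( \frac{Vc}{R} \Big) \Big( \mu(p^*) \sqrt{N} - r_S\, q\, V \Big) \]

for large \(N\). Thus, for a given \(p^*\) (or \(\mu(p^*)\)), we have an expression for the required equity capital for large \(N\).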

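Asymptotic Results for large N (N → ∞) \(\big( \tfrac{1}{2} > p^* > 0 \big)\)

\[ K_N(p^*) \approx \Big( \frac{Vc}{R} \Big)\, \mu(p^*)\, \sqrt{N} \]

\[ \lim_{N \to \infty} (P_N) = \frac{\rho c}{R}; \qquad \lim_{N \to \infty} (\bar{Z}_N) = R; \qquad \lim_{N \to \infty} (\sigma_{Z_N}^2) = \frac{R^2}{\mu^2(p^*)} > 0. \]

\[ \lim_{N \to \infty} \Big( \frac{\text{Required Equity}}{\text{Total Premiums}} \Big) = \lim_{N \to \infty} \Big( \frac{K_N(p^*)}{N P_N} \Big) = 0; \qquad \lim_{N \to \infty} \Big( \frac{K_N(p^*)}{NW} \Big) = 0; \]

\[ \lim_{N \to \infty} \Big( \frac{\text{Required Equity}}{\text{Maximum Equity}} \Big) = \lim_{N \to \infty} \Big( \frac{K_N(p^*)}{K_N^{\max}} \Big) = 0. \]

Suppose \(\rho = .0025\); c = $30,000; \(p^* = .00004 = 1/25{,}000\) \([\mu(p^*) = 4]\); \(V^2 = .0025\); \(R = 1\).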

N                                   N P_N                           K_N(0) = K_N^max                 K_N(.00004)                  K_N(.00004) / (N P_N)
10,000 (1x10^4)                     $750,000                        $299,250,000                     $600,000                     .8000
90,000 (9x10^4)                     $6,750,000                      $2,693,250,000 ($2.7 billion)    $1,800,000                   .2667
1,000,000 (1x10^6, one million)     $75,000,000                     $30,000,000,000 ($30 billion)    $6,000,000                   .0800
9,000,000 (9x10^6)                  $675,000,000                    $270 billion                     $18,000,000                  .0267
1x10^8 (one hundred million)        $7,500,000,000 ($7.5 billion)   $3 trillion (3x10^12)            $60,000,000 (6x10^7)         .0080
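The table can be reproduced from the formulas for \(N P_N\), \(K_N^{\max}\), and \(K_N(p^*)\). The sketch below assumes that the small risk-charge term in \(r_S\) is dropped, which appears to be how the tabulated figures were obtained, and it takes \(\mu(p^*) = 4\) as stated in the example. Function names are illustrative.

```python
import math

# Example inputs from the text: rho = .0025, c = $30,000, V^2 = .0025 (V = .05), R = 1, mu = 4.
RHO, C, V, R, MU = 0.0025, 30_000.0, 0.05, 1.0, 4.0


def total_premiums(N):
    """N * P_N with the r_S risk charge dropped (assumption): N * rho * c / R."""
    return N * RHO * C / R


def max_capital(N):
    """K_N^max with the r_S term dropped (assumption): N * c * (1 - rho) / R."""
    return N * C * (1.0 - RHO) / R


def required_capital(N, mu=MU):
    """K_N(p*) for large N with the r_S term dropped (assumption): (V*c/R) * mu * sqrt(N)."""
    return (V * C / R) * mu * math.sqrt(N)


for N in (10_000, 90_000, 1_000_000, 9_000_000, 100_000_000):
    premiums = total_premiums(N)
    k_star = required_capital(N)
    print(f"{N:>11,d}  {premiums:>16,.0f}  {max_capital(N):>20,.0f}  "
          f"{k_star:>12,.0f}  {k_star / premiums:.4f}")
```

Required capital grows with \(\sqrt{N}\) while premiums and maximum reserves grow with \(N\), so both ratios shrink as the pool gets larger.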

Thus, we observe a characteristic property of (many) financial intermediaries: namely, that net

worth is a small fraction of total assets (in the example, less than 1%); further total (potential)

liabilities are many orders of magnitude larger than total assets or reserves. Of course, sales and

other operating expenses would have to be added to the premium and other assets (buildings,

etc.) have been excluded.

The benefits of the financial intermediary in this case are obvious: if each insurance

policy were written by one person for one other person (and if 1), q 2 rS ≈≈ and then the

minimum premium for a $30,000 policy would be $225 versus $75 charged by the company.

Further, the reserves required would be $30,000 (or \(K_1^{\max}\)) per policy versus $.60 per policy for

the intermediary! (50,000 times as much!)

Note that, despite the tremendous diversifying power of many policies, if there were no equity

capital market to raise the funds, it is doubtful if such an organization could occur without

substantially higher premiums. Suppose one (wealthy) individual provided all the capital ($60

million): (at the derived rates with R = 1) the expected rate of return is zero and there is a .16

probability of one standard deviation to the right which translates into a $15 million loss! Few


risk-averse utility maximizers would accept such an investment. But, by diversifying the risk by

issuing equity in the capital market and if individuals hold well-diversified equity portfolios (as

they should), then the loss would be around 15¢ per investor which is trivial for an investor with

initial wealth of $30,000. Thus, through the combined use of the capital market and the financial

intermediary, the individual investor can get the service or asset he wants (virtually no-default

life insurance) to eliminate a substantial non-systematic risk, at minimum cost.


XIII. OPTIMAL USE OF SECURITY ANALYSIS AND INVESTMENT MANAGEMENT

In Section XI, we used portfolio analysis to derive the Capital Asset Pricing Model which

provides a relationship between expected return and risk in equilibrium. In an environment

where there is no significant differential information among investors (i.e., if distributional

beliefs about security returns are homogeneous), it was further shown that all efficient portfolios

can be represented as a simple combination of the market portfolio and the riskless asset. This

analysis suggested a naive or passive portfolio strategy (namely, hold the market mixed with the

riskless asset) which does not require the investor to undertake security analysis of individual

firms. In an environment where some investors may have differential information, this strategy

is still appropriate for those investors who do not have such information available to them. That

is, it is appropriate for those investors with information sets that do not reveal mispriced

securities. This strategy provides the best “protection” to such investors from those investors

who do have significant differential information (the “information traders”).

This passive strategy does require some forecasting of the “macro” type: namely, an estimate of \(\bar{Z}_M\) and \(\sigma_M\). However, this information is only required so that the investor can pick the right efficient portfolio for his specific preference function. That is, no matter what combination of the market and riskless asset he selects, the investor will have chosen an efficient portfolio. If his estimates of \(\bar{Z}_M\) and \(\sigma_M\) are in error, then he will select the wrong efficient portfolio for his specific tastes. Nonetheless, he will receive the highest expected return available (based upon his information set) for whatever level of risk he in fact did bear. One simple method for estimating \(\bar{Z}_M\) and \(\sigma_M\) (R is, of course, observable) is to use historical data to estimate \(\bar{Z}_M - R\) and \(\sigma_M\), or \((\bar{Z}_M - R)/\sigma_M\).
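As a minimal sketch of the historical-data method just mentioned, the snippet below estimates the mean excess return, the market standard deviation, and their ratio from past observations. The function name and the sample figures are illustrative assumptions, not data from the text. Returns are per dollar invested, in keeping with Z and R above, and the sample mean and sample standard deviation are one reasonable choice of estimator rather than a prescribed one.

```python
import statistics

def estimate_market_inputs(market_returns, riskless_returns):
    """Estimate (mean excess return, sigma_M, ratio) from matched historical periods."""
    if len(market_returns) != len(riskless_returns):
        raise ValueError("need one riskless return per market return")
    # period-by-period excess return Z_M - R
    excess = [zm - r for zm, r in zip(market_returns, riskless_returns)]
    mean_excess = statistics.mean(excess)           # estimate of Z_M-bar minus R
    sigma_m = statistics.stdev(market_returns)      # sample standard deviation of Z_M
    return mean_excess, sigma_m, mean_excess / sigma_m

# hypothetical annual returns per dollar on the market and on the riskless asset
market = [1.10, 0.95, 1.20, 1.05]
riskless = [1.03, 1.03, 1.03, 1.03]
print(estimate_market_inputs(market, riskless))
```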

The passive strategy presumes that market prices for securities reflect the information that

the investor has, and therefore, relative to the investor's information set, security prices will be

such that expected returns will satisfy the Security Market Line. Hence, the strategy's success

depends upon at least some entities undertaking individual security analysis and acting on this

information to ensure that market prices reflect information available to the investor. Who these

“informed” investors are, as well as how successful they are, is an empirical question which will be addressed in Section XVII. In this section, we develop the procedures for optimal use of such differential information if one were to have it. While the emphasis of the analysis is on optimal utilization of security analysts who do only “micro” or individual security forecasts, the final part

of the section combines both micro and macro forecasts.

We begin by using the Capital Asset Pricing Model to develop an operationally useful

definition of “under” and “over-priced” securities. The reader should note that all distributional

estimates are computed from a particular entity's information set.

Figure XIII.1

An undervalued security has an expected return greater than that predicted by the SML (e.g.,

security i in Figure XIII.1). An overvalued security has an expected return less than that

predicted by the SML. Write the expected return on security k as

(XIII.1) \[ \bar{Z}_k = R + \beta_k(\bar{Z}_M - R) + \alpha_k \]

if \(\alpha_k = 0\), then security k is "fairly priced"

if \(\alpha_k > 0\), then security k is "under priced"

if \(\alpha_k < 0\), then security k is "over priced"

A portfolio with a consistent positive “alpha” (α) shows evidence of ability to forecast a

security (or securities) better than the “market.”
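The classification in (XIII.1) amounts to one subtraction. The sketch below is illustrative: the function names, the tolerance, and the sample numbers are assumptions, and all expected returns are per dollar invested as estimated from a particular information set.

```python
def alpha(zbar_k, beta_k, zbar_m, r):
    """alpha_k from (XIII.1): expected return in excess of the Security Market Line."""
    return zbar_k - r - beta_k * (zbar_m - r)

def classify(alpha_k, tol=1e-12):
    """Label a security by the sign of its alpha (tol guards against rounding noise)."""
    if alpha_k > tol:
        return "under priced"
    if alpha_k < -tol:
        return "over priced"
    return "fairly priced"

# hypothetical forecast: Z_k-bar = 1.12, beta_k = 1.0, Z_M-bar = 1.10, R = 1.04
a = alpha(1.12, 1.0, 1.10, 1.04)
print(round(a, 4), classify(a))
```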

Figure XIII.2

A superior-performing portfolio (“super efficient”) has

\[ \frac{\bar{Z}_P - R}{\sigma_P} > \frac{\bar{Z}_M - R}{\sigma_M}. \]

An inferior-performing portfolio (“sub efficient”) has

\[ \frac{\bar{Z}_P - R}{\sigma_P} < \frac{\bar{Z}_M - R}{\sigma_M}. \]

On the Relationship Between Superior Stock Selection & Super-Efficiency

Consider a portfolio constructed by a manager with superior stock selection skills that has a positive alpha. I.e.,

\[ \bar{Z}_P - R = \beta_P(\bar{Z}_M - R) + \alpha_P, \quad \alpha_P > 0; \qquad \sigma_P^2 = \beta_P^2\sigma_M^2 + \mathrm{Var}[U_P] \]

Question 1: Is it possible to have a portfolio with a positive alpha that is a subefficient portfolio?

Yes.

Figure XIII.3 Figure XIII.4

Question 2: Is it always possible to construct a super-efficient portfolio if one has available a

portfolio with a positive alpha? Yes.

An analytical demonstration is as follows:

Form a portfolio of three securities:

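Let \(w_1\) = fraction invested in portfolio P

\(w_2\) = fraction invested in market portfolio

\(w_3\) = fraction invested in riskless security \(= 1 - w_1 - w_2\)

\[ \bar{Z} = w_1(\bar{Z}_P - R) + w_2(\bar{Z}_M - R) + R = w_1\alpha_P + [w_1\beta_P + w_2][\bar{Z}_M - R] + R \]

\[ \mathrm{Var}(Z) = w_1^2\sigma_P^2 + w_2^2\sigma_M^2 + 2w_1w_2\sigma_{PM} = w_1^2\sigma_P^2 + w_2^2\sigma_M^2 + 2w_1w_2\beta_P\sigma_M^2 \]

\[ = w_1^2\,\mathrm{Var}[U_P] + [w_1^2\beta_P^2 + 2w_1w_2\beta_P + w_2^2]\sigma_M^2 \]

\[ = w_1^2\,\mathrm{Var}[U_P] + [w_1\beta_P + w_2]^2\sigma_M^2 \]

Find the minimum-variance portfolio with expected return equal to \(\bar{Z}_M\).

\[ \bar{Z} = \bar{Z}_M = w_1\alpha_P + [w_1\beta_P + w_2][\bar{Z}_M - R] + R \]

\[ w_2 = 1 - w_1[\beta_P(\bar{Z}_M - R) + \alpha_P]/(\bar{Z}_M - R) \]

\[ \mathrm{Var}(Z) = w_1^2\,\mathrm{Var}[U_P] + \frac{[\bar{Z}_M - R - w_1\alpha_P]^2\sigma_M^2}{(\bar{Z}_M - R)^2} \]

\[ \frac{d\,\mathrm{Var}(Z)}{dw_1} = 2w_1\mathrm{Var}[U_P] - \frac{2[\bar{Z}_M - R - w_1\alpha_P]\sigma_M^2\alpha_P}{(\bar{Z}_M - R)^2} = 0 \quad \text{at } w_1 = w_1^* \]

\[ w_1^* = \frac{\alpha_P}{(\bar{Z}_M - R)\left[\dfrac{\mathrm{Var}[U_P]}{\sigma_M^2}\right] + \dfrac{\alpha_P^2}{(\bar{Z}_M - R)}} \]

Note: \(\alpha_P > 0\) implies that \(w_1^* > 0\).

Exercise: Show that

\[ \frac{\bar{Z} - R}{\sqrt{\mathrm{Var}(Z)}} > \frac{\bar{Z}_M - R}{\sigma_M} \]

when \(\bar{Z} = w_1^*(\bar{Z}_P - R) + w_2^*(\bar{Z}_M - R) + R = \bar{Z}_M\)

and \(\mathrm{Var}(Z) = (w_1^*)^2\,\mathrm{Var}[U_P] + [\bar{Z}_M - R - w_1^*\alpha_P]^2\sigma_M^2/(\bar{Z}_M - R)^2\).

Hint: just show that \(\mathrm{Var}(Z) < \sigma_M^2\).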
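A numerical check can make the exercise concrete without replacing the proof. The sketch below computes the minimum-variance weights from the three-security construction above and compares the resulting variance with that of the market. The function name and all input values are hypothetical, chosen only for illustration.

```python
import math

def super_efficient_mix(alpha_p, beta_p, var_u, zbar_m, sigma_m, r):
    """Weights (w1, w2, w3), mean and variance of the minimum-variance mix with mean Z_M-bar."""
    ex = zbar_m - r                                            # market excess return
    w1 = alpha_p / (ex * var_u / sigma_m**2 + alpha_p**2 / ex)  # w1* from the first-order condition
    w2 = 1 - w1 * (beta_p * ex + alpha_p) / ex                  # keeps expected return equal to Z_M-bar
    w3 = 1 - w1 - w2                                            # riskless holding
    mean = w1 * alpha_p + (w1 * beta_p + w2) * ex + r
    var = w1**2 * var_u + (w1 * beta_p + w2) ** 2 * sigma_m**2
    return w1, w2, w3, mean, var

# hypothetical inputs: alpha_P = .02, beta_P = 1.2, Var[U_P] = .04, Z_M-bar = 1.10, sigma_M = .2, R = 1.04
w1, w2, w3, mean, var = super_efficient_mix(0.02, 1.2, 0.04, 1.10, 0.2, 1.04)
print(w1, w2, w3)
print("mix ratio   :", (mean - 1.04) / math.sqrt(var))
print("market ratio:", (1.10 - 1.04) / 0.2)
```

With these inputs the mix has the same expected return as the market but a smaller variance, so its ratio of excess return to standard deviation is higher.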


If a single entity has micro forecasts for the means, variances, and covariances of all

available securities, then the optimal utilization of such information is to form the risky portfolio

using formula (X.18) (the “optimal combination of risky assets”). This portfolio can then be

mixed with the riskless asset to produce an efficient frontier as described in Section X. However,

operationally, this may not be feasible because:

(i) the large number of available securities makes it unlikely that any single unit would be

able to make estimates for all available securities;

(ii) a particular unit may have superior forecasting capability only with respect to a subset

of available securities;

(iii) a control mechanism should be employed to keep the portfolio manager from making

decisions based on forecasts which are inaccurate; and further, to reward analysts for

doing a “good job” at what they were hired to do, it is necessary to develop

performance measures for each of the roles leading to the “best” portfolio. E.g., a

portfolio could have “bad performance” even though the analyst is doing his job

because of poor portfolio management or vice versa.


Figure XIII.5 ACHIEVING SUPER EFFICIENCY FROM SUPERIOR STOCK SELECTION


Suppose that we are doing security analysis on the shares of m companies whose returns are represented by \(Z_k\), k = 1,2,...,m. We can always write the returns, as is done in Section XI, as

(XIII.2) \[ Z_k = (1-\beta_k)R + \beta_k Z_M + U_k, \quad k = 1,2,\ldots,m \]

where \(Z_M\) is the return on the market; \(\beta_k \equiv \sigma_{kM}/\sigma_M^2\); \(U_k\) is a random variable such that \(\mathrm{Cov}(U_k, Z_M) = 0\). The capital asset pricing model predicts that if security k is “fairly” priced, \(E(U_k) = 0\). However, the purpose of security analysis is to find securities that are either under- or over-valued. I.e., where \(E(U_k) \neq 0\). Define: \(U_k \equiv \alpha_k + \varepsilon_k\) where \(\varepsilon_k\) is a random variable such that \(E(\varepsilon_k) = 0\); \(\mathrm{Cov}(\varepsilon_k, Z_M) = 0\), k = 1,...,m; \(\mathrm{Cov}(\varepsilon_k, \varepsilon_j) \equiv \sigma'_{kj}\), k = 1,...,m; j = 1,...,m. Rewrite (XIII.2) as

(XIII.3) \[ Z_k = (1-\beta_k)R + \beta_k Z_M + \alpha_k + \varepsilon_k, \quad k = 1,\ldots,m \]

Suppose that we do micro forecasts on the m stocks, but no forecasts on the market. From (XIII.3), one can think of the return \(Z_k\) as coming from two sources: (1) movements in the market, and (2) movements individual to the stock which are independent of the market. Hence, this type of forecasting implies estimates of \(\alpha_k\) and \(\varepsilon_k\) by the analysts, without knowledge of \(Z_M\):

I. “Active” Portfolio

Consider the following portfolio constructed from (m+2) securities: the m stocks being analyzed, the market portfolio, and the riskless asset. Let \(w_i^a \equiv\) fraction of the portfolio invested in the ith stock, i = 1,2,...,m; \(w_M^a \equiv\) fraction of the portfolio invested in the market portfolio; \(w_R^a \equiv\) fraction of the portfolio invested in the riskless asset. Further, restrict the portfolio weights to satisfy

(XIII.4a) \[ \sum_{i=1}^{m} w_i^a + w_M^a + w_R^a = 1 \]

(XIII.4b) \[ \sum_{i=1}^{m} w_i^a\beta_i + w_M^a = 0 \]

If \(Z_a \equiv\) return per dollar invested in the active portfolio, then

(XIII.5)
\[ Z_a = \sum_{i=1}^{m} w_i^a(Z_i - R) + w_M^a(Z_M - R) + R \quad \text{from (XIII.4a)} \]
\[ = \sum_{i=1}^{m} w_i^a[\beta_i(Z_M - R) + \alpha_i + \varepsilon_i] + w_M^a(Z_M - R) + R \quad \text{from (XIII.3)} \]
\[ = R + \sum_{i=1}^{m} w_i^a\alpha_i + \sum_{i=1}^{m} w_i^a\varepsilon_i \quad \text{from (XIII.4b)} \]

Note: since \(\mathrm{Cov}(\varepsilon_i, Z_M) = 0\), from (XIII.5), \(\mathrm{Cov}(Z_a, Z_M) = 0\). I.e., \(\beta_a = \mathrm{Cov}(Z_a, Z_M)/\sigma_M^2 = 0\).

All active portfolios satisfying (XIII.4a) and (XIII.4b) are uncorrelated with the market. By constructing the active portfolio in this way, the returns on the portfolio, \(Z_a\), depend only on the \((\alpha_i, \varepsilon_i)\) which we have forecasts on and not on \(Z_M\) (about which we are assumed to have no forecasts). If we write \(Z_a\) in the form of (XIII.3), then

(XIII.6) \[ Z_a = (1-\beta_a)R + \beta_a Z_M + \alpha_a + \varepsilon_a \]

where: \(\beta_a = 0\); \(\alpha_a \equiv \sum_{i=1}^{m} w_i^a\alpha_i\); \(\varepsilon_a \equiv \sum_{i=1}^{m} w_i^a\varepsilon_i\).

Note that:

(XIII.7)
\[ E(Z_a) \equiv \bar{Z}_a = \alpha_a + R = \sum_{i=1}^{m} w_i^a\alpha_i + R; \]
\[ \mathrm{Var}(Z_a) \equiv \sigma_a^2 = \sum_{i=1}^{m}\sum_{j=1}^{m} w_i^a w_j^a \sigma'_{ij} \]

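Consider the efficient portfolio set constructed from all such active portfolios (i.e., the set of portfolios with maximum expected return for a given variance). Mathematically, fix the variance at \(\sigma_a^2\), then

\[ \max_{\{w_1^a,\ldots,w_m^a\}} \Big\{ \bar{Z}_a + \frac{\lambda}{2}\Big[\sigma_a^2 - \sum_{i=1}^{m}\sum_{j=1}^{m} w_i^a w_j^a\sigma'_{ij}\Big] \Big\} = \max_{\{w_i^a\}} \Big\{ R + \sum_{i=1}^{m} w_i^a\alpha_i + \frac{\lambda}{2}\Big[\sigma_a^2 - \sum_{i=1}^{m}\sum_{j=1}^{m} w_i^a w_j^a\sigma'_{ij}\Big] \Big\} \]

where λ is the Lagrange Multiplier. The first-order conditions are

(XIII.8)
\[ \frac{\partial}{\partial w_i^a}: \quad 0 = \alpha_i - \lambda\sum_{j=1}^{m} w_j^a\sigma'_{ij}, \quad i = 1,2,\ldots,m \]
\[ \frac{\partial}{\partial\lambda}: \quad 0 = \sigma_a^2 - \sum_{i=1}^{m}\sum_{j=1}^{m} w_i^a w_j^a\sigma'_{ij} \]

Multiply the first equation by \(w_i^a\) and sum i = 1,...,m to get

\[ 0 = \sum_{i=1}^{m} w_i^a\alpha_i - \lambda\sum_{i=1}^{m}\sum_{j=1}^{m} w_i^a w_j^a\sigma'_{ij} \]

or

(XIII.9) \[ \lambda = \frac{\bar{Z}_a - R}{\sigma_a^2} \]

The “efficient” part will have \(\bar{Z}_a \geq R\) or, from (XIII.9), \(\lambda \geq 0\). So, from (XIII.8) and (XIII.9), we have that

(XIII.8′) \[ 0 = \alpha_i - \left(\frac{\bar{Z}_a - R}{\sigma_a^2}\right)\sum_{j=1}^{m} w_j^a\sigma'_{ij}, \quad i = 1,2,\ldots,m \]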

If \(v'_{ij} \equiv\) the i-jth element of the inverse of the variance-covariance matrix \([\sigma'_{ij}]\), then [in an analogous fashion to (X.18)],

(XIII.10) \[ w_i^a = \left(\frac{\sigma_a^2}{\bar{Z}_a - R}\right)\sum_{j=1}^{m} v'_{ij}\alpha_j, \quad i = 1,2,\ldots,m \]

In the special case where the unsystematic parts of the returns on the securities are uncorrelated with each other (i.e., \(\mathrm{Cov}(\varepsilon_i, \varepsilon_j) = 0\), \(i \neq j\)), then \(\sigma'_{ij} = 0\), \(i \neq j\), and \(\sigma'_{ii} = (\sigma'_i)^2\), and (XIII.8′) becomes

\[ 0 = \alpha_i - \left(\frac{\bar{Z}_a - R}{\sigma_a^2}\right) w_i^a(\sigma'_i)^2, \quad i = 1,\ldots,m \]

or

(XIII.11) \[ w_i^a = \left(\frac{\sigma_a^2}{\bar{Z}_a - R}\right)\frac{\alpha_i}{(\sigma'_i)^2}, \quad i = 1,2,\ldots,m \]

From either (XIII.10) or (XIII.11), note that the ratio of “risky assets”

\[ \frac{w_i^a}{w_k^a} = \frac{\sum_{j=1}^{m} v'_{ij}\alpha_j}{\sum_{j=1}^{m} v'_{kj}\alpha_j} \quad \text{or} \quad \frac{\alpha_i(\sigma'_k)^2}{\alpha_k(\sigma'_i)^2} \]

is independent of the point chosen on the frontier. “Risky assets” is put in quotation marks because it refers only to risky assets 1,2,...,m. From (XIII.4b), this portfolio also contains the market risky asset (unless \(\beta_i = 0\), i = 1,...,m). However, it is also true that

\[ \frac{w_M^a}{w_k^a} = -\sum_{i=1}^{m}\frac{w_i^a}{w_k^a}\beta_i \]

is the same for all points on the frontier. Thus all “efficient” portfolios constructed from the active portfolio can be thought of as a combination of a risky-asset only portfolio and the riskless asset, and therefore they are perfectly correlated. To find the particular “efficient” active portfolio with risky assets only, we set \(w_R^a = 0\) in (XIII.4a) and require therefore that

\[ \sum_{i=1}^{m} w_i^a + w_M^a = 1. \]

This can be done by requiring that the \(w_i^a\) satisfy \(\sum_{i=1}^{m} w_i^a(1-\beta_i) = 1\). (This is possible unless \(\beta_i = 1\) for all i; a technical point is that even if \(\beta_i \neq 1\), such an all-risky asset portfolio may not be efficient although it will be on the frontier.) In any event, the important point is that from (XIII.10) or (XIII.11) the holdings depend only on the forecasted variables \((\alpha_i, \sigma'_{ij})\) and not on \(\bar{Z}_M\).
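For the uncorrelated special case (XIII.11), the active portfolio can be sketched in a few lines. The code below is a hedged illustration, not a definitive implementation: the function name and inputs are hypothetical, the weights are taken proportional to \(\alpha_i/(\sigma'_i)^2\), scaled so that \(\sum_i w_i^a(1-\beta_i) = 1\), and the market weight then follows from (XIII.4b). If the scaling factor turns out negative, the resulting all-risky portfolio lies on the frontier but not on its efficient part, which is the technical point raised above.

```python
def active_portfolio(alphas, betas, resid_vars):
    """All-risky active portfolio for uncorrelated residuals (XIII.11).

    Returns (stock weights, market weight, alpha_a, sigma_a^2).
    """
    raw = [a / v for a, v in zip(alphas, resid_vars)]     # proportional to alpha_i / sigma'_i^2
    scale = sum(x * (1 - b) for x, b in zip(raw, betas))  # enforces sum w_i (1 - beta_i) = 1
    if scale == 0:
        raise ValueError("no all-risky active portfolio exists for these inputs")
    w = [x / scale for x in raw]
    w_m = -sum(wi * b for wi, b in zip(w, betas))         # (XIII.4b): portfolio beta is zero
    alpha_a = sum(wi * a for wi, a in zip(w, alphas))     # equals Z_a-bar minus R
    var_a = sum(wi**2 * v for wi, v in zip(w, resid_vars))
    return w, w_m, alpha_a, var_a

# hypothetical forecasts for two stocks
w, w_m, alpha_a, var_a = active_portfolio([0.02, -0.01], [0.8, 1.2], [0.04, 0.02])
print(w, w_m, alpha_a, var_a)
```

Note that the market forecast \(\bar{Z}_M\) never enters the function, in line with the observation that the holdings depend only on the forecasted \((\alpha_i, \sigma'_{ij})\) and the betas.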

II. Mixing the “Active” Portfolio with the “Passive” (Market) Portfolio to Produce an Optimal Combination of Risky Assets

Consider an investor presented with the active portfolio \((\bar{Z}_a, \sigma_a^2)\), the market portfolio \((\bar{Z}_M, \sigma_M^2)\) and the riskless asset R. The efficient portfolio set can be generated by maximizing the mean for a given variance. I.e., let \(Z_p\) be the return on a portfolio constructed from \(Z_a\), \(Z_M\), and R; and let \(\delta_1\) be the fraction of that portfolio invested in the market; \(\delta_2\) be the fraction invested in the active portfolio; \(1-\delta_1-\delta_2\) be the fraction invested in the riskless asset. Then

\[ \bar{Z}_p = \delta_1(\bar{Z}_M - R) + \delta_2(\bar{Z}_a - R) + R; \qquad \sigma_p^2 = \delta_1^2\sigma_M^2 + \delta_2^2\sigma_a^2 \]

because \(\mathrm{Cov}(Z_a, Z_M) = 0\). The problem becomes

\[ \max_{\{\delta_1,\delta_2,w_1^a,\ldots,w_m^a\}} \Big\{ R + \delta_1(\bar{Z}_M - R) + \delta_2(\bar{Z}_a - R) + \frac{\gamma}{2}\big[\sigma_p^2 - \delta_1^2\sigma_M^2 - \delta_2^2\sigma_a^2\big] \Big\} \]

where γ is a Lagrange multiplier. Note: he is not only allowed to pick \(\delta_1\) and \(\delta_2\), but in addition, \(w_1^a,\ldots,w_m^a\). I.e., he can select which active portfolio he wants. So the investor's choice is not limited, and he is using the analysts' forecasts of \(\alpha_i, \sigma'_{ij}\) as well as an estimate of \(\bar{Z}_M\) and \(\sigma_M^2\).

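The first-order conditions are

(XIII.12a) \[ \frac{\partial}{\partial\delta_1}: \quad 0 = \bar{Z}_M - R - \gamma\delta_1\sigma_M^2 \]

(XIII.12b) \[ \frac{\partial}{\partial\delta_2}: \quad 0 = \bar{Z}_a - R - \gamma\delta_2\sigma_a^2 \]

(XIII.12c) \[ \frac{\partial}{\partial\gamma}: \quad 0 = \sigma_p^2 - \delta_1^2\sigma_M^2 - \delta_2^2\sigma_a^2 \]

and

(XIII.13) \[ \frac{\partial}{\partial w_i^a}: \quad 0 = \delta_2\Big\{\alpha_i - \gamma\delta_2\sum_{j=1}^{m} w_j^a\sigma'_{ij}\Big\}, \quad i = 1,\ldots,m \]

since

\[ \frac{\partial\bar{Z}_a}{\partial w_i^a} = \alpha_i \quad \text{and} \quad \frac{\partial[\sigma_a^2]}{\partial w_i^a} = 2\sum_{j=1}^{m} w_j^a\sigma'_{ij}. \]

From (XIII.12b), \(\gamma\delta_2 = (\bar{Z}_a - R)/\sigma_a^2\) and substituting into (XIII.13), we have that

(XIII.14) \[ 0 = \alpha_i - \left(\frac{\bar{Z}_a - R}{\sigma_a^2}\right)\sum_{j=1}^{m} w_j^a\sigma'_{ij}, \quad i = 1,\ldots,m, \quad \text{provided } \delta_2 \neq 0 \]

Comparing (XIII.14) with (XIII.8′), we see that they are identical. Thus, the correct combination of the securities in the active portfolio can be determined by choosing an “efficient” portfolio for the active portfolio without knowledge of the characteristics of the market portfolio.

From (XIII.12a) and (XIII.12b) we have that

(XIII.15) \[ \frac{\delta_2}{\delta_1} = \frac{(\bar{Z}_a - R)}{(\bar{Z}_M - R)}\left(\frac{\sigma_M^2}{\sigma_a^2}\right). \]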

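Thus, the optimal combination of risky assets will have the active portfolio in the amount

\[ \left[\frac{(\bar{Z}_a - R)/\sigma_a^2}{\dfrac{(\bar{Z}_a - R)}{\sigma_a^2} + \dfrac{(\bar{Z}_M - R)}{\sigma_M^2}}\right] \]

and the market in the amount

\[ \left[\frac{(\bar{Z}_M - R)/\sigma_M^2}{\dfrac{(\bar{Z}_M - R)}{\sigma_M^2} + \dfrac{(\bar{Z}_a - R)}{\sigma_a^2}}\right] \]

Compare the “new” optimal combination of risky assets (using forecasting) with the “old” optimal combination (with no forecasting): Note

\[ \bar{Z}^* = R + \left[\frac{(\bar{Z}_a - R)/\sigma_a^2}{\dfrac{(\bar{Z}_a - R)}{\sigma_a^2} + \dfrac{(\bar{Z}_M - R)}{\sigma_M^2}}\right](\bar{Z}_a - R) + \left[\frac{(\bar{Z}_M - R)/\sigma_M^2}{\dfrac{(\bar{Z}_a - R)}{\sigma_a^2} + \dfrac{(\bar{Z}_M - R)}{\sigma_M^2}}\right](\bar{Z}_M - R) \]

\[ \sigma^{*2} = \left[\frac{(\bar{Z}_a - R)/\sigma_a^2}{\dfrac{(\bar{Z}_a - R)}{\sigma_a^2} + \dfrac{(\bar{Z}_M - R)}{\sigma_M^2}}\right]^2\sigma_a^2 + \left[\frac{(\bar{Z}_M - R)/\sigma_M^2}{\dfrac{(\bar{Z}_a - R)}{\sigma_a^2} + \dfrac{(\bar{Z}_M - R)}{\sigma_M^2}}\right]^2\sigma_M^2 \]

\[ \left(\frac{\bar{Z}^* - R}{\sigma^*}\right)^2 = \left(\frac{\bar{Z}_a - R}{\sigma_a}\right)^2 + \left(\frac{\bar{Z}_M - R}{\sigma_M}\right)^2 > \left(\frac{\bar{Z}_M - R}{\sigma_M}\right)^2 \quad \text{for } \{\bar{Z}_a \neq R,\ \sigma_a < \infty\} \]

So, the “new” combination is super-efficient.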
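The mixing rule and the resulting gain can be checked numerically. The sketch below is illustrative: the function name and the sample inputs are assumptions, and it presumes the active portfolio is uncorrelated with the market (as constructed in part I) and that both excess returns are positive so the weights are well defined.

```python
import math

def mix_active_passive(zbar_a, var_a, zbar_m, var_m, r):
    """Optimal risky-asset mix of an uncorrelated active portfolio and the market.

    Returns (weight on active, weight on market, excess return per unit of sigma).
    """
    a = (zbar_a - r) / var_a            # (Z_a-bar - R) / sigma_a^2
    m = (zbar_m - r) / var_m            # (Z_M-bar - R) / sigma_M^2
    w_a = a / (a + m)                   # fraction of the risky mix in the active portfolio
    w_m = m / (a + m)                   # fraction in the market
    mean = r + w_a * (zbar_a - r) + w_m * (zbar_m - r)
    var = w_a**2 * var_a + w_m**2 * var_m   # no cross term because Cov(Z_a, Z_M) = 0
    return w_a, w_m, (mean - r) / math.sqrt(var)

# hypothetical inputs: Z_a-bar = 1.06, sigma_a^2 = .01, Z_M-bar = 1.10, sigma_M^2 = .04, R = 1.04
w_a, w_m, ratio = mix_active_passive(1.06, 0.01, 1.10, 0.04, 1.04)
print(w_a, w_m, ratio)
```

With these inputs the active and market ratios are .2 and .3, and the squared ratio of the mix equals the sum of their squares, so it exceeds that of the market alone.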

Summary: Organizational Structure for Portfolio Management

[Chart: organizational structure, showing product and information flows among the boxes listed below; each flow is rated (good; bad) by its recipient.]

Final Portfolio Manager
1. Form Final Portfolio by mixing Active and Passive Portfolio
2. Monitor Performance of Active Portfolio
3. Monitor Performance of Passive (Market) Portfolio
Product: Final Risky Portfolio, \(Z_p\); \((\bar{Z}_p, \sigma_p^2)\), delivered to Customers

Control (Final Portfolio)
1. Is Final Portfolio Super-Efficient?

Active Portfolio Manager
1. Form Active Portfolio
2. How good are beta estimates?
3. How good are \(\alpha_i, \sigma'_{ij}\) estimates?
Product: Active Portfolio, \(Z_a\); \((\bar{Z}_a, \sigma_a^2)\)

Control (Active Portfolio)
1. Is \(\bar{Z}_a - R - \beta_a(\bar{Z}_M - R) > 0\)?
2. Is \(\beta_a \approx 0\)?

Passive Portfolio Manager
1. Form a Market Portfolio at minimum cost
Product: Passive Portfolio, \(Z_M\); \((\bar{Z}_M, \sigma_M^2)\)

Control (Passive Portfolio)
1. Is correlation of portfolio with Market = 1?

Beta Analyst
1. Estimate \(\beta_i\)

Group Analyst
1. Correct Analyst bias
2. Estimate \(\sigma'_{ij}\)
Passes \((\alpha_i, \sigma'_{ij})\) to the Active Portfolio Manager

Micro Analysts
1. Estimate \(\alpha_i^e\)

Macro Analysts
1. Estimate \(\bar{Z}_M, \sigma_M^2\)

This rough chart lays out the organizational structure of product and information flows and responsibility. The dotted line shows decentralized units which could feasibly be separate organizations. Thus, there are three separate units: (1) the producer of the final product, which is a risky portfolio that should be (at least) efficient and hopefully super-efficient. Its final product is a risky portfolio (and the information \((\bar{Z}_p, \sigma_p^2)\)) which is suitable for mixing with the riskless asset and being held as a final portfolio by individual investors. (2) the producer of an active portfolio which has a zero-beta and makes optimal use of the micro-analysis done by the security analysts. Its final product is a risky portfolio (with a zero-beta) whose expected return is greater than the riskless rate (lies above the Security Market Line). However, it will not, in general, be suitable for a final portfolio for individual investors. (3) the producer of a passive portfolio which should be as close as possible to the market portfolio, but constructed at minimum cost. Estimates of \(\bar{Z}_M\) and \(\sigma_M^2\) are produced. Its final product is a well-diversified portfolio suitable for a final portfolio for individual investors who want to follow the naive or passive strategy or for mixing with active portfolios for those investors willing to try for superior performance through a blending of “managed” and “unmanaged” portfolios. The various roles and responsibilities of the “boxed-in” sub-units are now described.

In Unit 1 (Product: Final Risky Asset Portfolio for Individual Investors)

A. The Final Portfolio Manager

1. (a) He receives estimates of \(\bar{Z}_a, \sigma_a^2, \bar{Z}_M, \sigma_M^2\) and forms the final portfolio with return \(Z_p\) according to the rule (XIII.15)

\[ \delta_2/\delta_1 = [(\bar{Z}_a - R)\sigma_M^2]/[(\bar{Z}_M - R)\sigma_a^2] \]

(b) He computes \(\bar{Z}_p\) and \(\sigma_p^2\) and announces them to his customers. He also supplies these figures to his Control Management.

2. He receives (complaints/compliments) on the portfolio's performance from customers (external) and his Control Management (internal).

3. He is responsible for tracing back to the source of the (poor/good) performance. The three sources are: (a) himself: did he follow the “\((\delta_2/\delta_1)\)” rule? (b) the active portfolio: were \(\bar{Z}_a, \sigma_a^2, \beta_a (= 0)\) accurate? Was \(\bar{Z}_a - R - \beta_a(\bar{Z}_M - R) > 0\)? He either complains to or compliments the active portfolio manager. (c) the passive portfolio: was the passive portfolio highly correlated with the market? Were the estimates of \(\bar{Z}_M\) and \(\sigma_M^2\) accurate? He either complains to or compliments the passive portfolio manager.

B. (Final) Control

1. They monitor the performance of the final portfolio by comparing \((\bar{Z}_p - R)/\sigma_p\) to \((\bar{Z}_M - R)/\sigma_M\) and then either compliment or complain to the Final Portfolio Manager.

In Unit 2 (Product: Zero-Beta Risky Asset Portfolio representing superior Micro Forecasting)

A. Active Portfolio Manager

1. (a) He receives estimates of \(\beta_i\) from the beta analyst and of \((\alpha_i, \sigma'_{ij})\) from the group analyst and forms the active portfolio according to the rule (XIII.10) or (XIII.11):

\[ \frac{w_i^a}{w_k^a} = \frac{\sum_j v'_{ij}\alpha_j}{\sum_j v'_{kj}\alpha_j}; \quad i, k = 1,\ldots,m; \qquad \sum_{i=1}^{m} w_i^a(1-\beta_i) = 1; \qquad w_M^a = 1 - \sum_{i=1}^{m} w_i^a. \]

(b) He computes \(\bar{Z}_a, \sigma_a^2\) and announces them to his customer (unit 1), and to his control management.

2. He receives (complaints/compliments) on the portfolio's performance from the Final Portfolio Manager (“external”) and his Control Management (internal).

3. He is responsible for tracing back to the source of the (poor/good) performance of the active portfolio. The three sources are: (a) himself: did he follow the “(XIII.10) rule”? (b) the beta analyst: if the active portfolio does not have a zero-beta \((\beta_a = 0)\), is it because the \(\beta_i\) estimates supplied were inaccurate? (c) the group analyst: were the \(\alpha_i\) and \(\sigma'_{ij}\) estimates accurate? After finding the source(s), he then complains to or compliments the responsible analyst.

B. (Active Portfolio) Control

1. They monitor the performance of the active portfolio by measuring whether the portfolio shows evidence of superior forecasting capabilities: I.e., does it lie above the Security Market Line? Is \(\bar{Z}_a - R - \beta_a(\bar{Z}_M - R) > 0\)?

2. Does the active portfolio have a zero beta \((\beta_a = 0)\)?

3. They either compliment or complain to the Active Portfolio Manager.

C. Beta Analyst

1. It is his job to provide the estimates of the betas on those stocks being actively

considered by the micro-security analysts.

D. Group Analyst

1. It is his job to get the individual micro-security analyst's estimate \((\alpha_i^e)\) of the \(\alpha_i\) and correct it for the (historical) bias of each analyst to get (an unbiased) estimate \(\alpha_i\); he must also estimate \(\sigma'^2_i\) and \(\sigma'_{ij}\). These estimates are provided to the Active Portfolio Manager.

2. He receives (complaints/compliments) on these estimates from the Active Portfolio Manager.

3. He is responsible for rating the individual analyst's performance and deciding on whether the analyst is worth keeping. A (rough) measure would be the ratio \((\alpha_i/\sigma'^2_i)\). The larger (in absolute value) this ratio is on average, the more valuable the analyst.

E. Micro Security Analysts

1. They are responsible for estimating the mean return on the nonsystematic part of

the return on individual securities. They are not responsible for that part of the

returns which can be explained by macro-market effects.

In Unit 3 (Product: A portfolio which is perfectly correlated with the Market and estimates of \(\bar{Z}_M\) and \(\sigma_M^2\) showing superior Macro Forecasting capabilities)

A. Passive Portfolio Manager

1. He must form a well-diversified portfolio which is as nearly perfectly correlated with the market portfolio as possible at minimum cost.

2. He receives estimates of \(\bar{Z}_M\) and \(\sigma_M^2\) from the Macro Analysts and then reports them to his customer (unit 1) and to his control management.

3. He receives (complaints/compliments) on the portfolio's performance (i.e., how highly correlated it was with the market) and on the estimates of \(\bar{Z}_M\) and \(\sigma_M^2\) from the Final Portfolio Manager (“external”) and his Control Management (internal).

4. He is responsible for the portfolio's performance. If the estimates by the Macro-Analysts are (good/poor), he registers the appropriate compliment or complaint with them. He measures their performance by comparing a strategy of a variable position in the market, depending on the estimated \((\bar{Z}_M - R)/\sigma_M\) ratio, with a buy-and-hold strategy.

B. (Passive Portfolio) Control

1. They monitor the performance of the passive portfolio by estimating its

correlation with the market, and then complain to or compliment the Passive

Portfolio Manager.

C. Macro Analysts

1. They are responsible for estimating the expected return and variance of the market

portfolio or equivalently, the slope of the Capital Market Line.

Of course, this is not the only way the organization could be structured. However, it does have

the property that the three major sub-units operate in a decentralized fashion. All measurements

of performance are net of operating costs (e.g., management fees, salaries, transactions costs,

computer costs, etc.). If, for example, the active portfolio “extra” returns do not cover costs,

then, like an unprofitable division of a manufacturing firm, it should be dropped. Many of the

basic techniques used here could be applied to the management of a manufacturing corporation.

Note that each level of decision making is subject to two forms of control: external and internal.

Each decision-maker's performance is judged on that aspect of the operation for which he is responsible and over which he has authority. Note that throughout there is a kind of “automatic” control which keeps “poor” performers from having much impact on the final product. In the special case where \(\sigma'_{ij} = 0\) \((i \neq j)\), the weight of the individual micro analyst in the active portfolio is proportional to \(\alpha_i/\sigma'^2_i\), which means that if his error in estimating is large (i.e., \(\sigma'^2_i\) large), then independent of \(\alpha_i\), “his” security does not get into the portfolio. Similarly, if the aggregate error in the active portfolio estimates is large, then

\[ \left(\frac{\bar{Z}_a - R}{\sigma_a^2}\right) \]

will be small and it will have little weight in the final portfolio. In the limit as

\[ \left(\frac{\bar{Z}_a - R}{\sigma_a^2}\right) \to 0, \]

the final portfolio is the market portfolio, as it should be for someone with no forecasting capability.

XIV. THEORY OF VALUE AND CAPITAL BUDGETING UNDER UNCERTAINTY

The valuation formulas and capital budgeting rules developed in Sections VI and VII take

into account the intertemporal characteristics of the firm's cash flows. In this section, the

analysis is extended to explicitly recognize the uncertainty associated with these flows. The

introduction of uncertainty makes the analysis much more complex. Hence, we begin with the

simple case of a one-period firm whose end-of-period output is distributed to its stockholders

through liquidation. In studying this case, it is further assumed that the equity market is such that

the Capital Asset Pricing Model (of Section XI) holds, and therefore, in equilibrium, securities

are priced so as to satisfy the Security Market Line. Having analyzed this case, we then derive

the valuation formulas for a multi-period firm.

It is assumed throughout this section that the firm is all-equity financed. Although, in

principle, the capital budgeting and financing (capital structure) decisions cannot be made

independently, the study of the financing decision in Section IX and, in particular, the derived

Modigliani-Miller Theorem, suggests that the method of financing should generally have little, if

any, impact on the choice of investment projects by the firm. Indeed, a good rule of thumb is to

be suspicious of projects which do not look attractive when evaluated on an all-equity financed

basis but which do appear attractive when presented in conjunction with a "creative" financing

plan. [There are, of course, exceptions to this rule as for example, when the government

provides subsidies to certain private sector projects by using below-market interest rate loans or

loan guarantees.] With this as background, we now turn to the analysis of a one-period firm.

Suppose that firm i has made (or is considering making) an investment of I$ i in a

project with end-of-period random variable cash flow of x~I ii where x i~ is the random variable

average cash flow per dollar of investment. Let . )x~(Var ν and )x~E( x i2iii ≡≡ If Vi is the

current market value of the firm after the investment is made and if the firm has no other

projects, then the return per dollar on the shares of the firm is given by

with and Var22

ii ii iii ii i 2

i i i

IxI Ix E( ) = ( ) = .Z ZZ ZV V V

ν≡ ≡% The covariance of the return of the

firm's equity with the market, , σiM is given by

Page 273: Finance Theory - Robert C. Merton

Finance Theory

271

Cov Cov i i Mi i iMi M MiM

i i

IxI [ , ] = , = Z Z ZV V

ρν σσ

⎡ ⎤≡ ⎢ ⎥

⎣ ⎦

%

where ρiM ≡ the correlation coefficient between . Z and x~ Mi

In equilibrium, the equity of firm i will be priced so as to satisfy the Security Market Line:

(XIV.1) )

V

ρνI( λ =

σσλ = R)-Z(

σσ = R - Z

i

iMiie

M

iMeM2

M

iMi

where e MM ( - R)/Zλ σ≡ is the Market Price of Risk and it does not depend upon the decisions

made by firm i . Substituting for Zi into (XIV.1), we have that ),V

ρνI(λ = R -

V

xI

i

iMiie

i

ii or

rearranging terms, that

(XIV.2) ]ρνλ-x[RI = V iMieii

i

(XIV.2) gives the equilibrium market value of the firm after having expended $\$I_i$ in resources in the project. Under what conditions should the firm make this expenditure and take on the investment? If the firm operates so as to maximize its market value, then it should take all projects which increase its market value; be indifferent to projects which leave its value unchanged; and not take projects which will lower its market value. Thus, it should take the project if $V_i - I_i > 0$; be indifferent if $V_i - I_i = 0$; and not take it if $V_i - I_i < 0$. From (XIV.2), we have that

(XIV.3) $$V_i - I_i = \frac{I_i}{R}\,[\bar{x}_i - R - \lambda_e\nu_i\rho_{iM}].$$


So, for a given $I_i$, if $[\bar{x}_i - R - \lambda_e\nu_i\rho_{iM}] > 0$, take it; if $[\bar{x}_i - R - \lambda_e\nu_i\rho_{iM}] = 0$, be indifferent; and if $[\bar{x}_i - R - \lambda_e\nu_i\rho_{iM}] < 0$, do not take it.

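Define: The beta of a project or project beta, $\beta_i^p$, by

$$\beta_i^p \equiv \frac{\mathrm{Cov}[\tilde{x}_i, \tilde{Z}_M]}{\mathrm{Var}(\tilde{Z}_M)} = \frac{\rho_{iM}\,\nu_i\,\sigma_M}{\sigma_M^2} = \frac{\rho_{iM}\,\nu_i}{\sigma_M}.$$

Note: The beta of the equity of firm i (its "market beta") is given by

$$\beta_i = \frac{\sigma_{iM}}{\sigma_M^2} = \left(\frac{I_i}{V_i}\right)\left(\frac{\rho_{iM}\,\nu_i}{\sigma_M}\right) = \frac{I_i}{V_i}\,\beta_i^p.$$

Hence, for a positive project beta, $\beta_i^p \gtreqless \beta_i$ as $V_i - I_i \gtreqless 0$.

Consider a concept similar to the Security Market Line except use project instead of market betas: I.e., define the Project Market Line by $\bar{x} - R = \beta^p(\bar{Z}_M - R)$, where $\bar{x}$ is the expected cash flow per dollar of investment in the project and $\beta^p$ is the project beta.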


The graph of the Project Market Line is analogous to the Security Market Line in Section XI. However, unlike the SML, this graph relates the returns on non-market assets or projects to market returns. From (XIV.3), we express the capital budgeting rule as

(XIV.4a) $\bar{x}_i > R + \beta_i^p(\bar{Z}_M - R) \;\Rightarrow\; V_i > I_i \;\Rightarrow\;$ Take the Project

(XIV.4b) $\bar{x}_i = R + \beta_i^p(\bar{Z}_M - R) \;\Rightarrow\; V_i = I_i \;\Rightarrow\;$ Indifference to the Project

(XIV.4c) $\bar{x}_i < R + \beta_i^p(\bar{Z}_M - R) \;\Rightarrow\; V_i < I_i \;\Rightarrow\;$ Do not take the Project

In the graph, project #2 corresponds to (XIV.4a); project #3 corresponds to (XIV.4b); project #1

corresponds to (XIV.4c). That is, the firm should take all projects that lie above the Project

Market Line and reject all those that lie below the line.
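As an illustration of the rule in (XIV.4a)-(XIV.4c), the following minimal Python sketch computes the project beta, the right-hand side of (XIV.4), and the value from (XIV.2) for one project. The class and function names and all numerical inputs are hypothetical choices made for this example; R is treated as a gross return (one plus the riskless rate), as the algebra leading to (XIV.2) implies.

```python
from dataclasses import dataclass


@dataclass
class Project:
    investment: float  # I_i, dollars invested today
    mean_cf: float     # x-bar_i, expected end-of-period cash flow per dollar invested
    sd_cf: float       # nu_i, standard deviation of cash flow per dollar invested
    corr_mkt: float    # rho_iM, correlation of x_i with the market return Z_M


def project_beta(p: Project, sigma_m: float) -> float:
    """Project beta: rho_iM * nu_i / sigma_M."""
    return p.corr_mkt * p.sd_cf / sigma_m


def hurdle_rate(p: Project, R: float, mean_zm: float, sigma_m: float) -> float:
    """Right-hand side of (XIV.4): R + beta_p * (Z-bar_M - R), a gross return per dollar."""
    return R + project_beta(p, sigma_m) * (mean_zm - R)


def firm_value(p: Project, R: float, mean_zm: float, sigma_m: float) -> float:
    """Equilibrium value from (XIV.2): (I/R) * [x-bar - lambda_e * nu * rho]."""
    lam_e = (mean_zm - R) / sigma_m  # market price of risk
    return p.investment / R * (p.mean_cf - lam_e * p.sd_cf * p.corr_mkt)


def decide(p: Project, R: float, mean_zm: float, sigma_m: float, tol: float = 1e-12) -> str:
    """Apply (XIV.4a)-(XIV.4c) by comparing x-bar with the hurdle rate."""
    gap = p.mean_cf - hurdle_rate(p, R, mean_zm, sigma_m)
    if gap > tol:
        return "take"
    if gap < -tol:
        return "reject"
    return "indifferent"


# Hypothetical inputs: R = 1.05 (gross riskless return), Z-bar_M = 1.12, sigma_M = 0.20.
R, MEAN_ZM, SIGMA_M = 1.05, 1.12, 0.20
project = Project(investment=100.0, mean_cf=1.15, sd_cf=0.30, corr_mkt=0.5)

print(decide(project, R, MEAN_ZM, SIGMA_M))
print(firm_value(project, R, MEAN_ZM, SIGMA_M) - project.investment)  # V_i - I_i
```

The decision depends only on the sign of the gap between the expected cash flow per dollar and the hurdle rate, which by (XIV.3) is also the sign of $V_i - I_i$; the project's total variance enters only through its product with the correlation.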


In the capital budgeting analysis in Section VII, we defined the cost of capital k and used

it for deriving capital budgeting rules rather than the riskless rate of interest. Although in the

certainty environment of that section, the two must be equal to avoid arbitrage, it was noted there

that the distinction was made in preparation for the analysis of projects whose future cash flows

are uncertain. To connect the results here with the rules of this earlier section, we restate the

capital budgeting rule in terms of the cost of capital.

Define the cost of capital for project #i, $k_i$, by

(XIV.5) $$k_i \equiv R + \beta_i^p(\bar{Z}_M - R).$$

It follows from (XIV.4) that the correct rule for choosing projects is to take all (independent, as defined in Section VII) projects whose expected return per dollar of investment, $\bar{x}_i$, exceeds the associated cost of capital, $k_i$, and to reject all projects whose expected return per dollar is less than its cost of capital; $k_i$ is also called the "hurdle rate" for project i. The larger is $\beta_i^p$, the larger is the hurdle rate or the minimum required expected return on the project in order to justify taking the project. In analogous fashion to securities, $\beta_i^p$ is the appropriate measure of the risk of project i, and the riskier is the project, the higher is its hurdle rate. As with securities, it is the project's systematic risk ($\beta_i^p$) that matters in making the decision whether to invest or not, and not the project's total risk ($\nu_i^2$).

Two important implications for firm investment behavior (which were not evident from

the certainty analysis of Section VII) follow from the derived capital budgeting rule: First, the

cost of capital to be used for evaluating a project is the one associated with the project and not

the firm evaluating the project. That is, two different firms evaluating the same project (by "same" we mean that $\tilde{x}_i$ has the identical distribution from both firms' perspectives) should use the same cost of capital [given by (XIV.5)]. To see this, note by inspection of (XIV.5) that $k_i$ depends only upon the distribution of $\tilde{x}_i$ and its joint distribution with the market. It does not in addition depend upon the joint distribution of $\tilde{x}_i$ with other projects that the firm may have


(or plan to undertake).

Second, since the correct decision on the project depends only upon its systematic risk

(and not its total risk), unlike a person selecting his optimal portfolio, a firm has no need to

consider (internal) diversification. This important conclusion will be discussed in depth in

Section XV.

Having established the correct capital budgeting rule in a one-period model, we now turn

to the evaluation of the firm and its projects in a multi-period or intertemporal framework.

Theory of Value Under Uncertainty (Multi-period Cash Flows)

Before proceeding to the development of the valuation formulas, we provide a quick

review of conditional expectation. (For further discussion, consult any reasonable book on

probability.)

Digression: Review of Expectation and Joint Probabilities

Let X be a random variable which can take on the values $x_1, x_2, x_3, \ldots$

Let Y be a random variable which can take on the values $y_1, y_2, y_3, \ldots$

Let $P\{X = x_j\} = f(x_j)$ be the probability that $X = x_j$, $j = 1, 2, 3, \ldots$

Let $P\{Y = y_k\} = g(y_k)$ be the probability that $Y = y_k$, $k = 1, 2, 3, \ldots$

Let $P\{X = x_j, Y = y_k\} = p(x_j, y_k)$ be the probability that $X = x_j$ and $Y = y_k$, $j, k = 1, 2, \ldots$

$\{p(x, y)\}$ is called the joint distribution for X and Y, and $\{f(x)\}$ and $\{g(y)\}$ are called the marginal distributions for X and Y, respectively.

(XIV.6) $$f(x_j) = \sum_k p(x_j, y_k); \qquad g(y_k) = \sum_j p(x_j, y_k).$$

Let $P\{Y = y_k \mid X = x_j\}$ be the conditional probability that $Y = y_k$, given that $X = x_j$.


(XIV.7) $$P\{Y = y_k \mid X = x_j\} = \frac{p(x_j, y_k)}{f(x_j)}.$$

Let $E(X)$ = (unconditional) expected value of $X = \sum_j x_j f(x_j)$.

Let $E(Y \mid X = x_j)$ = conditional expected value of Y, given that $X = x_j$.

(XIV.8) $$E(Y \mid X = x_j) = \sum_k y_k\,P\{Y = y_k \mid X = x_j\} = \sum_k \frac{y_k\,p(x_j, y_k)}{f(x_j)}$$

(XIV.9) $$E(E(Y \mid X)) = \sum_j E(Y \mid X = x_j)\,f(x_j) = \sum_j \sum_k y_k\,p(x_j, y_k) = \sum_k y_k\Big(\sum_j p(x_j, y_k)\Big) = \sum_k y_k\,g(y_k) = E(Y)$$

If X and Y are mutually independent, then

(XIV.10) $$p(x_j, y_k) = f(x_j)\,g(y_k); \qquad E\{XY\} = E(X)E(Y).$$
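The identity in (XIV.9) can be checked on a small discrete example. The sketch below is only an illustration: the joint distribution and the function names are hypothetical, and exact fractions are used so that the equality holds without rounding error.

```python
from fractions import Fraction as F

# Hypothetical joint distribution p(x_j, y_k); the probabilities sum to one.
JOINT = {
    (1, 10): F(1, 8), (1, 20): F(3, 8),
    (2, 10): F(1, 4), (2, 20): F(1, 4),
}


def marginal_x(joint):
    """f(x_j) = sum over k of p(x_j, y_k), as in (XIV.6)."""
    f = {}
    for (x, _y), prob in joint.items():
        f[x] = f.get(x, 0) + prob
    return f


def cond_exp_y(joint, x):
    """E(Y | X = x) = sum over k of y_k p(x, y_k) / f(x), as in (XIV.8)."""
    fx = marginal_x(joint)[x]
    return sum(y * prob for (xj, y), prob in joint.items() if xj == x) / fx


def exp_y(joint):
    """Unconditional E(Y), computed directly from the joint distribution."""
    return sum(y * prob for (_x, y), prob in joint.items())


def iterated_exp_y(joint):
    """E(E(Y | X)) = sum over j of E(Y | X = x_j) f(x_j), the left side of (XIV.9)."""
    return sum(cond_exp_y(joint, x) * fx for x, fx in marginal_x(joint).items())


print(cond_exp_y(JOINT, 1), cond_exp_y(JOINT, 2))
print(iterated_exp_y(JOINT) == exp_y(JOINT))
```

The conditional expectations differ across the two values of X, yet averaging them with the marginal weights f(x) recovers E(Y), which is the property used repeatedly in the valuation derivation that follows.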

For purposes of this course, we will be dealing primarily with random variables describing an

outcome as of a given date t. E.g., $\tilde{\pi}(t)$ may be a random variable describing profits for date t. In general, the distribution for such a random variable, X(t), will depend on outcomes which occur at an earlier date: denote these random variables by $Y(t-1), Y(t-2), \ldots$ If the value of X(t) is a function of these random variables, $X(t) = F(Y(t-1), Y(t-2), \ldots)$, then the expected value of X(t) will depend on the point in time at which the expectation is computed. Let "$E_t$" denote the conditional expectation operator, conditional on knowing all (relevant) information that has occurred up to and including time t. Then, $E_t\{\tilde{X}(t)\} = x(t)$, the particular value that X(t) took on at time t, and $\tilde{X}(t)$ is not a random variable relative to time t. If X(t) depends on $Y(t-1), \ldots,$ then


$E_0\{\tilde{X}(t)\}$ will include the joint distribution over all $\{Y(t)\}$, conditional on knowing that $Y(0) = y_0$. $E_{t-1}\{\tilde{X}(t)\}$ will be the conditional expectation, conditional on knowing that $Y_{t-1} = y_{t-1}, Y_{t-2} = y_{t-2}, \ldots, Y_0 = y_0$. From (XIV.9), we have that

(XIV.11) $$E_{t-2}\{E_{t-1}[\tilde{X}(t)]\} = E_{t-2}\{\tilde{X}(t)\}$$

or more generally,

(XIV.11') $$E_{t-k}\{E_{t-j}[\tilde{X}(t)]\} = E_{t-k}[\tilde{X}(t)] \quad \text{for } k \ge j \ge 0.$$

End of Digression -

Valuation Under Uncertainty: The General Case

The derivation of the valuation formula follows the same format as the certainty analysis

in Section VI. If $\tilde{Z}(t)$ is the (random variable) return per dollar from investing in the equity of the firm between time t and t+1, then, by definition,

(XIV.12) $$\tilde{Z}(t) = \frac{\tilde{d}(t+1) + \tilde{s}(t+1)}{s(t)}$$

where tildes ~ denote random variables relative to time t (e.g., s(t) will be known for certain

at time t).

Let k(t) be the equilibrium market required expected rate of return for investing in the

firm between t and t+1. (Again, k(t) may be a random variable relative to dates earlier than t,

but at time t, it is known). Then, in equilibrium, the price per share of the stock at time t must

be such that $E_t\{\tilde{Z}(t)\} = 1 + k(t)$ or

(XIV.13) $$s(t) = \frac{1}{[1+k(t)]}\,E_t[\tilde{s}(t+1) + \tilde{d}(t+1)],$$


and for equilibrium, (XIV.13) must hold for each t.

Consider a firm which will remain in business for T periods (from now) and then

liquidates. As in the certainty analysis, to deduce the value of the stock today, we first go

forward in time and then, work backwards to today (time zero).

At time T in the future, the firm will pay its last dividend per share, d(T), and as

discussed in the parallel analysis in Section VI, without loss of generality, we can assume that the

salvage value at that time is zero, and hence, with probability one, the ex-dividend price per share

at time T will be zero (i.e., S(T) = 0).

Consider an investor at time $(T-1)$: If he buys one share of stock, his expected dollar return at time T is $E_{T-1}[\tilde{d}(T)]$. For the market to be in equilibrium, we have that $S(T-1)$ must be such as to satisfy (XIV.13). I.e.,

(XIV.14) $$S(T-1) = \frac{1}{[1+k(T-1)]}\,E_{T-1}[\tilde{d}(T)].$$

Consider when we reach time $(T-2)$. In order for the market to be in equilibrium, $S(T-2)$ must again satisfy (XIV.13). I.e.,

(XIV.15) $$S(T-2) = \frac{1}{[1+k(T-2)]}\,E_{T-2}[\tilde{d}(T-1) + \tilde{S}(T-1)].$$

Substituting for $S(T-1)$ from (XIV.14) into (XIV.15), we have that

(XIV.16) $$S(T-2) = \frac{E_{T-2}[\tilde{d}(T-1)]}{[1+k(T-2)]} + \frac{1}{[1+k(T-2)]}\,E_{T-2}\left\{\frac{1}{[1+\tilde{k}(T-1)]}\,E_{T-1}[\tilde{d}(T)]\right\}$$

where $k(T-1)$ has a ~ over it because relative to time $(T-2)$ it may be uncertain (i.e., a random variable). Noting that $\frac{1}{[1+k(T-1)]}\,E_{T-1}[\tilde{d}(T)] = E_{T-1}\left\{\frac{\tilde{d}(T)}{[1+k(T-1)]}\right\}$ because $k(T-1)$ is not a random variable relative to time $(T-1)$, we have that


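$$E_{T-2}\left\{\frac{1}{[1+\tilde{k}(T-1)]}\,E_{T-1}[\tilde{d}(T)]\right\} = E_{T-2}\left\{E_{T-1}\left[\frac{\tilde{d}(T)}{[1+\tilde{k}(T-1)]}\right]\right\} = E_{T-2}\left[\frac{\tilde{d}(T)}{[1+\tilde{k}(T-1)]}\right]$$

using the fundamental relationship on conditional expectations given in (XIV.11) or (XIV.11'). Thus, we can rewrite (XIV.16) as

(XIV.17)
$$\begin{aligned}
S(T-2) &= \frac{1}{[1+k(T-2)]}\,E_{T-2}\left\{\tilde{d}(T-1) + \frac{\tilde{d}(T)}{[1+\tilde{k}(T-1)]}\right\} \\
&= E_{T-2}\left\{\frac{\tilde{d}(T-1)}{[1+k(T-2)]} + \frac{\tilde{d}(T)}{[1+k(T-2)][1+\tilde{k}(T-1)]}\right\}.
\end{aligned}$$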

At time $(T-3)$, for markets to clear, $S(T-3)$ must satisfy (XIV.13), or

(XIV.18) $$S(T-3) = \frac{1}{[1+k(T-3)]}\,E_{T-3}\{\tilde{d}(T-2) + \tilde{S}(T-2)\}$$

Substituting from (XIV.17) into (XIV.18); noting that $k(T-2)$ may be a random variable relative to time $(T-3)$ and using the result that $E_{T-3}E_{T-2} = E_{T-3}$, we can rewrite (XIV.18) as

(XIV.19)
$$\begin{aligned}
S(T-3) &= \frac{1}{[1+k(T-3)]}\,E_{T-3}\left\{\tilde{d}(T-2) + \frac{\tilde{d}(T-1)}{[1+\tilde{k}(T-2)]} + \frac{\tilde{d}(T)}{[1+\tilde{k}(T-2)][1+\tilde{k}(T-1)]}\right\} \\
&= E_{T-3}\left\{\frac{\tilde{d}(T-2)}{[1+k(T-3)]} + \frac{\tilde{d}(T-1)}{[1+k(T-3)][1+\tilde{k}(T-2)]} + \frac{\tilde{d}(T)}{[1+k(T-3)][1+\tilde{k}(T-2)][1+\tilde{k}(T-1)]}\right\}
\end{aligned}$$

Proceeding inductively in this backwards fashion, we arrive at the price per share today (time

zero) which ensures that an investor buying the stock at any time and selling at any other time

will face an ex-ante expectation of a fair return and that the markets will clear. I.e.,


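(XIV.20)
$$\begin{aligned}
S(0) &= E_0\left\{\frac{\tilde{d}(1)}{[1+k(0)]} + \frac{\tilde{d}(2)}{[1+k(0)][1+\tilde{k}(1)]} + \ldots + \frac{\tilde{d}(T)}{[1+k(0)][1+\tilde{k}(1)]\ldots[1+\tilde{k}(T-1)]}\right\} \\
&= E_0\left\{\sum_{t=1}^{T}\frac{\tilde{d}(t)}{\prod_{s=1}^{t}[1+\tilde{k}(s-1)]}\right\} = E_0\left\{\sum_{t=1}^{T}\frac{\tilde{d}(t)}{[1+\tilde{K}(t)]^t}\right\}
\end{aligned}$$

where $\tilde{K}(t)$ is a random variable defined for notational convenience as

$$\tilde{K}(t) \equiv \left[\prod_{s=1}^{t}[1+\tilde{k}(s-1)]\right]^{1/t} - 1.$$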
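The backward argument leading to (XIV.20) can be traced on a small event tree. The sketch below is an illustration under assumed inputs: a hypothetical two-period tree in which both the dividend and the next-period required return k depend on the state reached. It prices the share once by recursing backwards with (XIV.13) and once by taking the time-zero expectation of path-by-path discounted dividends, as in (XIV.20). The node layout and function names are choices made for this example only.

```python
def make_node(d=0.0, k=None, children=()):
    """A tree node: dividend d paid on arrival, required return k over the next
    period, and (probability, child) pairs. Terminal nodes have no children."""
    return {"d": d, "k": k, "children": tuple(children)}


def price_backward(node):
    """Ex-dividend price via (XIV.13): S(t) = E_t[S(t+1) + d(t+1)] / [1 + k(t)],
    with S(T) = 0 at terminal nodes."""
    if not node["children"]:
        return 0.0
    expected = sum(p * (child["d"] + price_backward(child)) for p, child in node["children"])
    return expected / (1.0 + node["k"])


def price_by_paths(node, discount=1.0):
    """Time-zero expectation in (XIV.20): each dividend is divided by the product
    of [1 + k] realized along its own path, then averaged over paths."""
    if not node["children"]:
        return 0.0
    disc = discount * (1.0 + node["k"])
    return sum(p * (child["d"] / disc + price_by_paths(child, disc))
               for p, child in node["children"])


# Hypothetical tree: k(0) = 10%; at time 1 the state is "up" or "down" with equal odds.
up = make_node(d=6.0, k=0.08, children=[(0.5, make_node(d=8.0)), (0.5, make_node(d=6.0))])
down = make_node(d=4.0, k=0.12, children=[(0.5, make_node(d=5.0)), (0.5, make_node(d=3.0))])
root = make_node(k=0.10, children=[(0.5, up), (0.5, down)])

print(price_backward(root), price_by_paths(root))
```

The two numbers agree because the inner expectation at each node is taken with that node's own k, which is known there; this is the step justified by (XIV.11) in the text.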

Comparing (XIV.20) with the certainty case in Section VI, there are some obvious similarities. Moreover, if $\tilde{d}(t) = d(t)$ (i.e., future dividends are known with certainty), then by arbitrage $k(t) = r(t)$; $E_0\left[\frac{\tilde{d}(t)}{[1+k(t)]^t}\right] = \frac{d(t)}{[1+k(t)]^t}$, and (XIV.20) becomes the same as in VI. As in the certainty case, we can write (XIV.20) in its infinite-lived form and for an all-equity financed firm, we have that $V(0) = n(0)s(0)$. I.e.,

(XIV.20') $$S(0) = E_0\left\{\sum_{t=1}^{\infty}\frac{\tilde{d}(t)}{[1+\tilde{K}(t)]^t}\right\}$$

and

(XIV.21) $$V(0) = n(0)\,E_0\left\{\sum_{t=1}^{\infty}\frac{\tilde{d}(t)}{[1+\tilde{K}(t)]^t}\right\}$$

While (XIV.20), (XIV.20'), and (XIV.21) represent a completely general valuation formula, they are operationally of little use without some further specification of the structure for the probability distributions for both the $\{\tilde{d}(t)\}$ and the $\{\tilde{k}(t)\}$.


The balance of this section will be devoted to specific forms for (XIV.20') and (XIV.21) deduced from special characteristics assumed for the structure of the market (i.e., $\tilde{k}(t)$) and the firm-specific characteristic (i.e., $\tilde{d}(t)$). It should be remembered that these cases are only representative, and in any given situation, it may be appropriate to return to the general form (XIV.20') and (XIV.21).

Cost of Capital:

"The cost of capital" is a term often used in corporate finance, and is usually defined as

the opportunity cost (expressed as a rate of return) to investors of a given risk project. It is

definitely an external (to the firm) rate. While in certainty analysis, it is well-defined (namely,

equal to the {r(t)}), under uncertainty, it is a "fuzzy" notion. Nonetheless, the term is usually taken to describe the structure of the {k(t)}.

Special Cases of Valuation Under Uncertainty

Case A. Suppose that the required expected returns $\{\tilde{k}(t)\}$ and the dividend stream per share $\{\tilde{d}(t)\}$ are mutually independent. Define $\bar{d}(t) \equiv E_0[\tilde{d}(t)]$ = expected dividend per share at time t. Then, from (XIV.10), we have that

$$E_0\left[\frac{\tilde{d}(t)}{[1+\tilde{K}(t)]^t}\right] = E_0[\tilde{d}(t)] \cdot E_0\left[\frac{1}{[1+\tilde{K}(t)]^t}\right] = \frac{\bar{d}(t)}{[1+\rho(t)]^t}$$

where $\rho(t)$ is defined by $\frac{1}{[1+\rho(t)]^t} \equiv E_0\left[\frac{1}{[1+\tilde{K}(t)]^t}\right]$. In this case, (XIV.20') and (XIV.21) can be written as

(XIV.22) $$S(0) = \sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+\rho(t)]^t}$$

and


(XIV.23) $$V(0) = n(0)\sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+\rho(t)]^t}.$$

Warning: $\rho(t) \ne E_0[\tilde{K}(t)]$ and $\rho(t) \ne E_0\{\tilde{k}(t-1)\}$.
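The warning can be made concrete with a two-scenario example. The sketch below is an illustration with hypothetical numbers: it solves the defining relation $1/[1+\rho(t)]^t = E_0\{1/[1+\tilde{K}(t)]^t\}$ for $\rho(t)$ and compares the result with $E_0[\tilde{K}(t)]$. Because $[1+K]^{-t}$ is a convex function of K, Jensen's inequality implies $\rho(t) \le E_0[\tilde{K}(t)]$, with equality only when $\tilde{K}(t)$ is not random.

```python
def rho(scenarios, t):
    """Solve 1/[1+rho]^t = E_0[1/[1+K]^t] for rho, given (probability, K) pairs."""
    expected_discount = sum(p * (1.0 + K) ** (-t) for p, K in scenarios)
    return expected_discount ** (-1.0 / t) - 1.0


def mean_K(scenarios):
    """E_0[K(t)] over the same scenarios."""
    return sum(p * K for p, K in scenarios)


# Hypothetical: K(10) is 5% or 15% with equal probability, so E_0[K(10)] = 10%.
SCENARIOS = [(0.5, 0.05), (0.5, 0.15)]
T = 10

print(rho(SCENARIOS, T), mean_K(SCENARIOS))
```

In this example the discount rate that reproduces the expected discount factor lies below the 10 percent mean of $\tilde{K}$, so discounting expected dividends at $E_0[\tilde{K}(t)]$ would understate value.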

Case B. Suppose that the $\{\tilde{k}(t)\}$ are nonstochastic and constant, i.e., $\tilde{k}(t) \equiv k$. Then (XIV.20') and (XIV.21) can be rewritten as

(XIV.24) $$S(0) = \sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+k]^t}$$

and

(XIV.25) $$V(0) = n(0)\sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+k]^t}.$$

In this case, k is the required expected rate of return by investors in the firm, i.e., the cost of

capital. Therefore, the value of the stock is equal to the present discounted value of expected

dividends per share, discounted at the cost of capital. This is very close to the certainty formula

in VI where "expected dividends" replace "dividends received" and the market "expected rate of

return" replaces the market "realized rate of return."

Case C. A slight generalization of Case B is when the $\{\tilde{k}(t)\}$ are nonstochastic, but vary in a deterministic way over time. I.e., $\tilde{k}(t) = k(t)$. Then (XIV.20') and (XIV.21) can be written as

(XIV.24') $$S(0) = \sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+K(t)]^t}$$

and

(XIV.25') $$V(0) = n(0)\sum_{t=1}^{\infty}\frac{\bar{d}(t)}{[1+K(t)]^t}.$$

Note: the K(t) are nonstochastic because the k(t) are nonstochastic. However, K(t) is not the cost of capital, and in an analogous fashion to the R(t) in the certainty case, the required expected return is not K(t). Rather, K(t) is the average expected compound return from investing in the


stock (including reinvesting dividends paid) from time zero to time t. I.e., if at time zero, one invested $W_0$ dollars in the stock and reinvested all dividends received in the stock, then the expected value of the position at time t would be

$$E_0[\tilde{W}_t] = W_0\prod_{s=1}^{t}[1+k(s-1)] = W_0[1+K(t)]^t \quad\text{or}\quad \frac{E_0[\tilde{W}_t]}{W_0} = [1+K(t)]^t \quad\text{or}\quad K(t) = \left[\frac{E_0[\tilde{W}_t]}{W_0}\right]^{1/t} - 1.$$
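For Case C, K(t) is just the geometric average of the one-period gross required returns. A minimal sketch, assuming a hypothetical deterministic path k(0), k(1), k(2) and hypothetical function names:

```python
import math


def average_compound_return(k_path, t):
    """K(t) = (prod_{s=1..t} [1 + k(s-1)])^(1/t) - 1 for a known path k(0), k(1), ..."""
    if not 1 <= t <= len(k_path):
        raise ValueError("t must be between 1 and len(k_path)")
    growth = math.prod(1.0 + k for k in k_path[:t])
    return growth ** (1.0 / t) - 1.0


def expected_wealth(w0, k_path, t):
    """E_0[W_t] = W_0 [1 + K(t)]^t with all dividends reinvested."""
    return w0 * (1.0 + average_compound_return(k_path, t)) ** t


K_PATH = [0.10, 0.08, 0.12]  # hypothetical k(0), k(1), k(2)

for t in (1, 2, 3):
    print(t, average_compound_return(K_PATH, t), expected_wealth(100.0, K_PATH, t))
```

Note that K(3) here is slightly below the arithmetic average of the three one-period rates, which is one way to see that K(t) and the period-by-period required return k(t) are different objects.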

Case D. Suppose that the $\{\tilde{k}(t)\}$ are nonstochastic and constant, and the expected dividend per share grows at a constant rate per period g. I.e., $\bar{d}(t) = d(0)[1+g]^t$. Substituting for $\bar{d}(t)$ into (XIV.24), we have that

(XIV.26)
$$\begin{aligned}
S(0) &= d(0)\sum_{t=1}^{\infty}\left[\frac{1+g}{1+k}\right]^t = d(0)\sum_{t=1}^{\infty}y^t \quad\text{for } y \equiv \frac{1+g}{1+k} \\
&= \frac{d(0)[1+g]}{k-g}, \quad\text{provided } y < 1 \text{ (i.e., } k > g\text{)} \\
&= \frac{\bar{d}(1)}{k-g} \quad\text{because } \bar{d}(1) = d(0)[1+g].
\end{aligned}$$

(XIV.27) $$V(0) = n(0)S(0) = \frac{\bar{D}(1)}{k-g}$$

because $\tilde{D}(t) \equiv n(t-1)\tilde{d}(t)$ and $\bar{D}(1) \equiv E_0\{\tilde{D}(1)\} = E_0[n(0)\tilde{d}(1)] = n(0)E_0[\tilde{d}(1)] = n(0)\bar{d}(1)$.
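The closed form in (XIV.26) can be checked against a long truncated sum of (XIV.24). The sketch below uses hypothetical values for d(0), k, and g; the function names are illustrative.

```python
def constant_growth_price(d0, k, g):
    """Closed form of (XIV.26): S(0) = d(0)[1+g] / (k - g), valid only for k > g."""
    if k <= g:
        raise ValueError("the sum in (XIV.26) converges only if k > g")
    return d0 * (1.0 + g) / (k - g)


def truncated_sum(d0, k, g, horizon):
    """Partial sum of (XIV.24) with expected dividends d(0)[1+g]^t, t = 1..horizon."""
    y = (1.0 + g) / (1.0 + k)
    return sum(d0 * y ** t for t in range(1, horizon + 1))


D0, K, G = 2.0, 0.10, 0.04  # hypothetical inputs

print(constant_growth_price(D0, K, G))
print(truncated_sum(D0, K, G, 50), truncated_sum(D0, K, G, 2000))
```

The partial sums approach the closed form from below as the horizon lengthens; when k is close to g the convergence is slow and the price is very sensitive to either input.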

In the certainty analysis of Section VI, the cash flow accounting identity was used to

show that the four statements of what determines the value of a firm are equivalent. Fortunately,

the analysis presented in that section carries over almost completely to the uncertainty case.

Under the assumption that the firm is financed entirely by equity, the current market value of the

firm is given by n(0)S(0) = V(0) where n(0) is the number of shares currently outstanding.


Moreover, at each point in time t, V(t) = n(t)S(t). Equation (XIV.21) gives an expression for

V(0), and from (XIV.13), we have that

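(XIV.28)
$$\begin{aligned}
V(t) = n(t)S(t) &= \frac{n(t)}{[1+k(t)]}\,E_t[\tilde{S}(t+1) + \tilde{d}(t+1)] \\
&= \frac{1}{[1+k(t)]}\,E_t\{n(t)\tilde{S}(t+1) + n(t)\tilde{d}(t+1)\} \\
&= \frac{1}{[1+k(t)]}\,E_t\{\tilde{n}(t+1)\tilde{S}(t+1) + n(t)\tilde{d}(t+1) - [\tilde{n}(t+1) - n(t)]\tilde{S}(t+1)\} \\
&= \frac{1}{[1+k(t)]}\,E_t\{\tilde{V}(t+1) + \tilde{D}(t+1) - \tilde{m}(t+1)\tilde{S}(t+1)\},
\end{aligned}$$

where $\tilde{m}(t+1) \equiv \tilde{n}(t+1) - n(t)$.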

Moreover, the accounting identity in Section VI is an identity, and therefore, holds for each possible outcome. I.e., it states that

(XIV.29) $$\tilde{R}(t+1) + \tilde{m}(t+1)\tilde{S}(t+1) = \tilde{O}(t+1) + \tilde{D}(t+1)$$

Or equivalently, that

(XIV.30a) $$\tilde{D}(t+1) - \tilde{m}(t+1)\tilde{S}(t+1) = \tilde{R}(t+1) - \tilde{O}(t+1)$$

(XIV.30b) $$\tilde{D}(t+1) - \tilde{m}(t+1)\tilde{S}(t+1) = \tilde{X}(t+1) - \tilde{I}(t+1)$$

(XIV.30c) $$\tilde{D}(t+1) - \tilde{m}(t+1)\tilde{S}(t+1) = \tilde{\pi}(t+1) - \tilde{i}(t+1)$$

Substituting from (XIV.30) into (XIV.28), we have that

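(XIV.31a) $$V(t) = \frac{1}{[1+k(t)]}\,E_t\{\tilde{V}(t+1) + \tilde{R}(t+1) - \tilde{O}(t+1)\}$$

(XIV.31b) $$V(t) = \frac{1}{[1+k(t)]}\,E_t\{\tilde{V}(t+1) + \tilde{X}(t+1) - \tilde{I}(t+1)\}$$

(XIV.31c) $$V(t) = \frac{1}{[1+k(t)]}\,E_t\{\tilde{V}(t+1) + \tilde{\pi}(t+1) - \tilde{i}(t+1)\}.$$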


We can solve (XIV.31) using the same backward technique used to solve for S(0) starting with

(XIV.13). Namely, we have that

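(XIV.32a) $$V(0) = E_0\left\{\sum_{t=1}^{\infty}\frac{[\tilde{R}(t) - \tilde{O}(t)]}{[1+\tilde{K}(t)]^t}\right\}$$

(XIV.32b) $$V(0) = E_0\left\{\sum_{t=1}^{\infty}\frac{[\tilde{X}(t) - \tilde{I}(t)]}{[1+\tilde{K}(t)]^t}\right\}$$

(XIV.32c) $$V(0) = E_0\left\{\sum_{t=1}^{\infty}\frac{[\tilde{\pi}(t) - \tilde{i}(t)]}{[1+\tilde{K}(t)]^t}\right\}$$

Together, (XIV.21) and (XIV.32) provide four alternative but equivalent expressions for the value of the firm under uncertainty.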

Using (XIV.21) and (XIV.30), we have the following expressions for the expected change

in the value of the firm from time t to t+1:

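(XIV.33a) $$E_t[\tilde{V}(t+1) - V(t)] = E_t\{\Delta V\} = k(t)V(t) + E_t[\tilde{m}(t+1)\tilde{S}(t+1) - \tilde{D}(t+1)]$$

(XIV.33b) $$E_t\{\Delta V\} = k(t)V(t) - E_t\{\tilde{R}(t+1) - \tilde{O}(t+1)\}$$

(XIV.33c) $$E_t\{\Delta V\} = k(t)V(t) - E_t\{\tilde{X}(t+1) - \tilde{I}(t+1)\}$$

(XIV.33d) $$E_t\{\Delta V\} = k(t)V(t) - E_t\{\tilde{\pi}(t+1) - \tilde{i}(t+1)\}$$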

so, from (XIV.33), the expected change in the value of the firm is not equal to the expected

change in shareholders' wealth {i.e., k(t)V(t)}.

As promised, the evaluation of projects in an uncertain environment is considerably more

complex than in the certainty case. While the formulas for value under certainty derived in

Section VI do bear some resemblance to the ones derived here, the valid application of the

former has been shown to be limited to cases of projects with specific distributional

characteristics and specific market structures (e.g., CAPM).

While further development of these techniques is beyond the scope of the course, we


end this section with a brief discussion of the certainty equivalent method of valuation.

The certainty equivalent to a particular cash flow $\tilde{X}(t)$ is defined to be that number of dollars, $X_{ce}(t)$, such that an investor would be indifferent between receiving $X_{ce}(t)$ for certain at time t or the random variable cash flow $\tilde{X}(t)$ at time t. Since, by definition, the market would be willing to exchange $X_{ce}(t)$ dollars for certain for the $\tilde{X}(t)$, it must be that

(XIV.34) $$V(0) = \sum_{t=1}^{\infty}\frac{\alpha(t)\bar{X}(t)}{[1+r]^t}$$

where $\alpha(t) \equiv X_{ce}(t)/\bar{X}(t)$ and $\bar{X}(t) \equiv E_0[\tilde{X}(t)]$.

While, in general, one might expect $\alpha(t) < 1$, it need not be, as for example in the CAPM if $\tilde{X}(t)$ has a negative beta. Moreover, $\alpha(t)$ need not be a decreasing function of t. That is, it

is not always true that the farther in the future a cash flow will occur, the more uncertainty or risk

it must have.
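To see how $\alpha$ can exceed one, it may help to read the one-period formula (XIV.2) as a certainty-equivalent valuation: the term $\bar{x}_i - \lambda_e\nu_i\rho_{iM}$ is discounted at the riskless gross return R, so it plays the role of the certainty equivalent per dollar invested. The sketch below relies on that reading, which is an interpretation for illustration rather than a statement from the text, and uses hypothetical inputs and names.

```python
def certainty_equivalent_factor(mean_cf, sd_cf, corr_mkt, lam_e):
    """alpha = X_ce / X-bar, taking X_ce = x-bar - lambda_e * nu * rho from (XIV.2)."""
    return (mean_cf - lam_e * sd_cf * corr_mkt) / mean_cf


LAMBDA_E = 0.35              # hypothetical market price of risk
MEAN_CF, SD_CF = 1.15, 0.30  # hypothetical x-bar and nu per dollar invested

for corr in (0.5, 0.0, -0.5):
    print(corr, certainty_equivalent_factor(MEAN_CF, SD_CF, corr, LAMBDA_E))
```

With a positive correlation with the market the factor is below one, with zero correlation it equals one, and with a negative correlation (a negative-beta cash flow) it exceeds one, in line with the remark about the CAPM above.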


XV. INTRODUCTION TO MERGERS AND ACQUISITIONS: FIRM DIVERSIFICATION

In the introduction to Section VII, it was noted that firms can acquire assets by either

undertaking internally-generated new projects or by acquiring existing assets of other firms.

Having examined the former there and again in Section XIV, we now turn to the latter.

Under the operational criterion for good management of maximizing current shareholders'

wealth, there are essentially three reasons for considering the acquisition of another company:

1. Synergy: By combining the two companies, the value of the operating assets of the

combined firm will exceed the sum of the values of the operating assets of the two

companies taken separately. Such synergy will occur if there are economies of scale in

marketing, purchasing of materials, plant size, and distribution system. It can also occur

through the elimination of duplicate efforts in management or research and development.

Such economies are most likely to occur with either horizontal or vertical mergers. In

essence, the value goes up because the factors of production are more efficiently

organized in the combined firm.

2. Taxes: The market value of the firm reflects its value to the private sector. Of course,

since the firm pays taxes (or may pay taxes in the future), there is an additional "shadow"

value of the firm to the public sector in the form of the present value of its tax payments.

The sum of the market value and this "shadow" value is the value of the firm to society.

In the case of synergy, the value of the firm to society is increased with a corresponding

increase in both the market and shadow values of the firm. However, if a combination of

two firms can reduce the combined present values of these firms' tax payments taken

separately, then the market value of the combined firm can exceed the sum of the values

of the two firms taken separately even if the value of the combined firm to society is just

equal to the sum of the values to society of the two firms. I.e., this combination does not

increase the total value to society, but it does redistribute the total between the


shareholders of the firms and the public sector. Two examples are: (a) a more-effective

use of a tax-loss carryover; (b) increased debt capacity for the combined firm which may

reduce taxes if there is a "tax-shield" value to the deductibility of interest [see Section IX

for further discussion].

3. The Firm to be Acquired is a "Bargain": If the firm to be acquired has a market value

which is less than its "fair" value, then by acquiring the firm, the management of the

acquiring firm can increase its stockholders' wealth. There are two distinct reasons why a

firm could be selling for less than "fair" value. The first is that relative to the acquiring

firm's information set, the stock market is not efficient in the sense to be discussed in

Section XVII. That is, the management of the acquiring firm believes that it has

information such that if this information were widely-known, the market value of the firm

to be acquired would be higher than its acquisition cost. If this is the principal reason for

the acquisition, then the management's behavior is identical to that of a security analyst

whose job it is to identify mispriced securities. In terms of the CAPM and Section XIII,

the management believes it is purchasing a security with a positive "alpha" (α) . Hence,

all the warnings about being able to "beat the market" given in that section apply equally

well here.

A second reason why a firm could be selling for less than its "fair value" is that the firm

to be acquired is currently being mismanaged. That is, through either incompetence or

malevolence, the current management is not managing the firm's resources so as to maximize the

market value of the firm. Unlike the first reason, this reason is completely consistent with an

efficient capital market. Indeed, as discussed at length in Section III, from society's point of

view, this reason is probably the most important one for permitting mergers and takeovers.


Firm Diversification

Notable by its absence among the three reasons for acquisitions is diversification: That

is, the acquisition of another firm for the sole purpose of reducing the volatility (variance or

"total" riskiness) of the firm's operations. Although "diversification" is a frequently cited reason

for an acquisition, it is often not the "real" reason. More often than not, it will be for one of the

three reasons already given. However, if diversification is the real reason, then the acquisition

route will in general be an inefficient way to achieve it.

The argument for firm diversification is often presented by analogy with an individual

investor where we have seen that diversification is quite important. However, this type of

argument simply illustrates the pitfalls of treating the firm "as if" it were an individual household

with exogenous preferences rather than as an economic organization designed to serve specific

economic functions.

To show why firm diversification is not an important activity for management and if it is

undertaken, why the acquisition route is inefficient, we begin with an explicit analysis of the

value of the firm under the capital asset pricing model. Let there be two firms where each firm

has a single project as described in the beginning of Section XIV. From formula (XIV.2), the

value of firm i (i = 1,2) is given by

(XV.1) $$V_i = \frac{I_i}{R}\,[\bar{x}_i - \lambda_e\nu_i\rho_{iM}], \quad i = 1, 2.$$

Suppose that firms #1 and #2 merge to form firm #3. In an analogous fashion to firms #1 and #2,

define $I_3$ as the investment in firm #3 and $\tilde{x}_3$ as the random variable end-of-period cash flow of firm #3 per dollar of investment. If no changes in the investment plans of the firms occur as a result of the combination, then $I_3 \equiv I_1 + I_2$ and $\tilde{x}_3 \equiv \delta\tilde{x}_1 + (1-\delta)\tilde{x}_2$, where $\delta \equiv I_1/(I_1 + I_2)$, and

$$\nu_3^2 \equiv \mathrm{Var}(\tilde{x}_3) = \delta^2\nu_1^2 + (1-\delta)^2\nu_2^2 + 2\delta(1-\delta)\,\mathrm{Cov}(\tilde{x}_1, \tilde{x}_2).$$


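(XV.2)
$$\begin{aligned}
\rho_{3M} \equiv \frac{\mathrm{Cov}[\tilde{x}_3, \tilde{Z}_M]}{\nu_3\sigma_M} &= \frac{\delta\,\mathrm{Cov}[\tilde{x}_1, \tilde{Z}_M] + (1-\delta)\,\mathrm{Cov}[\tilde{x}_2, \tilde{Z}_M]}{\nu_3\sigma_M} \\
&= \frac{\delta\nu_1\sigma_M\rho_{1M} + (1-\delta)\nu_2\sigma_M\rho_{2M}}{\nu_3\sigma_M} \\
&= \frac{\delta\nu_1\rho_{1M} + (1-\delta)\nu_2\rho_{2M}}{\nu_3}
\end{aligned}$$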

From Section XIV, (XIV.2), the value of firm #3 will satisfy

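(XV.3) $$V_3 = \frac{I_3}{R}\,[\bar{x}_3 - \lambda_e\nu_3\rho_{3M}]$$

Substituting into (XV.3) for $\rho_{3M}$ from (XV.2), we have that

(XV.4) $$V_3 = \frac{I_3}{R}\,\{\bar{x}_3 - \lambda_e[\delta\nu_1\rho_{1M} + (1-\delta)\nu_2\rho_{2M}]\}$$

Noting that $\bar{x}_3 \equiv E[\tilde{x}_3] = \delta\bar{x}_1 + (1-\delta)\bar{x}_2$, we have that

(XV.5) $$V_3 = \frac{I_3}{R}\,\{\delta[\bar{x}_1 - \lambda_e\nu_1\rho_{1M}] + (1-\delta)[\bar{x}_2 - \lambda_e\nu_2\rho_{2M}]\}$$

But $I_3\delta = I_1$ and $I_3(1-\delta) = I_2$. Hence, from (XV.5), we have that

(XV.6) $$V_3 = \frac{I_1}{R}\,[\bar{x}_1 - \lambda_e\nu_1\rho_{1M}] + \frac{I_2}{R}\,[\bar{x}_2 - \lambda_e\nu_2\rho_{2M}] = V_1 + V_2.$$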

Thus, the value of the combined firm will just equal the sum of the values of the two firms prior

to the merger.
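The chain (XV.1)-(XV.6) can be checked numerically. The sketch below values two hypothetical firms separately with (XV.1), builds the merged firm's $\nu_3$ and $\rho_{3M}$ from the definitions above and (XV.2), and values it with (XV.3). All inputs, including the correlation between the two firms' cash flows, are assumed for illustration, and the function names are not from the text.

```python
import math


def value(I, mean_cf, sd_cf, corr_mkt, R, lam_e):
    """(XIV.2): V = (I/R) * [x-bar - lambda_e * nu * rho_M]."""
    return I / R * (mean_cf - lam_e * sd_cf * corr_mkt)


def merged_firm(I1, x1, nu1, rho1, I2, x2, nu2, rho2, corr12):
    """Return (I_3, x-bar_3, nu_3, rho_3M) for the combined firm."""
    delta = I1 / (I1 + I2)
    x3 = delta * x1 + (1 - delta) * x2
    # Var(x_3), with Cov(x_1, x_2) = corr12 * nu1 * nu2
    var3 = ((delta * nu1) ** 2 + ((1 - delta) * nu2) ** 2
            + 2 * delta * (1 - delta) * corr12 * nu1 * nu2)
    nu3 = math.sqrt(var3)
    rho3 = (delta * nu1 * rho1 + (1 - delta) * nu2 * rho2) / nu3  # (XV.2)
    return I1 + I2, x3, nu3, rho3


R, LAM_E = 1.05, 0.35             # hypothetical gross riskless return and price of risk
FIRM1 = (100.0, 1.15, 0.30, 0.5)  # I, x-bar, nu, rho_M
FIRM2 = (50.0, 1.10, 0.20, 0.3)
CORR12 = 0.2                      # hypothetical correlation between the two cash flows

v1 = value(*FIRM1, R, LAM_E)
v2 = value(*FIRM2, R, LAM_E)
I3, x3, nu3, rho3 = merged_firm(*FIRM1, *FIRM2, CORR12)
v3 = value(I3, x3, nu3, rho3, R, LAM_E)

print(v1 + v2, v3)
print(nu3)
```

The merged firm's total risk $\nu_3$ is lower than the investment-weighted average of $\nu_1$ and $\nu_2$, yet $V_3$ matches $V_1 + V_2$: the product $\nu_3\rho_{3M}$, which is what enters the valuation, is unaffected by the correlation between the two firms.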

In connection with both mergers and firms possibly undertaking many (independent)

capital budgeting projects, we generalize the above demonstration to a firm with m projects.


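Let firm P take on m different projects where physical investment in project i is $I_i$ and the random variable end-of-period cash flow is $I_i\tilde{x}_i$. Total physical investment is $I_P = \sum_{i=1}^{m} I_i$, and total firm end-of-period cash flow per dollar of physical investment, $\tilde{x}_P$, can be written as

(XV.7) $$\tilde{x}_P = \sum_{i=1}^{m}[I_i\tilde{x}_i]/I_P = \sum_{i=1}^{m}\delta_i\tilde{x}_i$$

where $\delta_i \equiv I_i/I_P$ and $\sum_{i=1}^{m}\delta_i = 1$.

It follows from (XV.7) that

(XV.8) $$\bar{x}_P = \sum_{i=1}^{m}\delta_i\bar{x}_i$$

and

(XV.9) $$\nu_P^2 \equiv \mathrm{Var}(\tilde{x}_P) = \sum_{i=1}^{m}\sum_{j=1}^{m}\delta_i\delta_j\,\mathrm{Cov}(\tilde{x}_i, \tilde{x}_j).$$

It follows also that

(XV.10) $$\mathrm{Cov}(\tilde{x}_P, \tilde{Z}_M) = \rho_{PM}\nu_P\sigma_M = \sum_{i=1}^{m}\delta_i\rho_{iM}\nu_i\sigma_M.$$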


From (XIV.2), (XV.8), and (XV.10), we have that

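(XV.11)
$$\begin{aligned}
V_P &= \frac{I_P}{R}\,[\bar{x}_P - \lambda_e\nu_P\rho_{PM}] \\
&= \frac{I_P}{R}\left[\sum_{i=1}^{m}\delta_i\bar{x}_i - \lambda_e\sum_{i=1}^{m}\delta_i\nu_i\rho_{iM}\right] \\
&= \frac{I_P}{R}\sum_{i=1}^{m}\delta_i(\bar{x}_i - \lambda_e\nu_i\rho_{iM}) \\
&= \sum_{i=1}^{m}V_i
\end{aligned}$$

where $V_i = I_i(\bar{x}_i - \lambda_e\nu_i\rho_{iM})/R$ is the "stand-alone" value of project i.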

Hence, diversification does nothing to the market values of the firms and so, according to the value-maximization criterion, it is not important. The result shown in (XV.6) and (XV.11) is called value additivity and can be shown to obtain in quite general structures (provided that there exists a well-functioning capital market).

An intuitive explanation of why the market values are unaffected even though the

combined firm may have a smaller total risk (variance) than the individual firms is as follows: In

order for investors to be willing to pay a higher price for the combined firm than they were

willing to pay for the two firms separately, the act of combining the two firms must provide a

"service" to the investors which they were previously unable to obtain. However, prior to the

combination, any investor could purchase shares of either or both firms in any mix he wants.

And, in particular, in the case of the merger, the investor could purchase the shares of firm #1 and firm #2 in the ratio $V_1/V_2$, which is exactly the ratio implicit in the combined firm. Hence, each

investor could achieve for himself (prior to the merger) the same amount of diversification (of the risks

of the firms #1 and #2) as is provided by the combined firm, and therefore, the merger provides no new

diversification opportunities to investors. For that reason, investors would not pay a premium for the

combined firm.

Although it will not be the case for the capital asset pricing model, it is possible that the

combined firm could sell for less than the sum of the values of the two separate firms, i.e., that

firm diversification could "hurt" market value. The reason is that post-consolidation, investors

have fewer choices for portfolio construction than they did pre-consolidation. For example, prior


to the merger, an investor could hold positive amounts of firm #1 and none of firm #2 or vice

versa. Post the merger, the only way that an investor can hold firm #1 is to invest in the

combined firm #3 which means he must also invest in firm #2.

Indeed, he can only invest in firm #1 if he is willing to invest in firm #2 in the relative proportion $V_1/V_2$. The reason that this "loss of freedom" does not have a negative effect on the combined firm's value in the CAPM is that in that model, it is optimal for all investors to hold firm #1 and firm #2 in the relative proportions $V_1/V_2$, which is exactly the proportion provided by the combined firm #3.

Note that this "negative" aspect of firm diversification applies even in a "frictionless"

world of no transactions costs and where the merger takes place on terms where no premium

above market value is paid for the acquired firm by the acquiring firm. In the real world, the

acquiring firm must usually pay a premium above the market value to acquire a firm. The

premium can range from 5 to more than 100 percent with an average somewhere around 20

percent. A natural question to ask is "Why do the owners of the firm to be acquired demand a

premium for their shares?" While there are several possible explanations, one that is consistent

with our previous analyses is as follows: If the acquiring firm's management is behaving

optimally, then the reason for their making a takeover attempt must be one of the three reasons

discussed at the outset of this section. Since any one of these three reasons will increase the value

of the acquiring firm's shares, the acquired firm's shareholders are demanding compensation for

providing the means for this increase in value. How this potential increase in value is shared

between the acquiring and acquired firms' shareholders cannot be determined in general (as is the

usual case for bilateral bargaining), but almost certainly, the acquired firm's shareholders will

demand some positive share. Of course, the acquired firm's shareholders do not know what the

acquiring firm's management believes the value of the acquired firm is. Hence, it might appear

that no consolidation could be consummated because whatever price is offered, clearly, the

acquiring firm's management believes it is worth more, and therefore, the acquired firm's

shareholders should demand more. However, the fact that the acquiring firm believes it is worth

more does not mean that it is, indeed, worth more. I.e., their beliefs may be wrong. Hence, at a

high enough price above market, the acquired firm's shareholders will take the "sure" premium,

and let the acquiring firm take the risk (and earn the possible reward) that its information is

sufficiently superior to the market's that the acquired firm is still a "bargain."

Whether or not the acquired firm's shareholders or the acquiring firm's shareholders come

out ahead on these takeovers is still an open empirical question. However, it is clear that

acquiring another firm for the sole purpose of diversification is a losing proposition for the

acquiring firm because it must pay a premium for a firm whose acquisition promises no increase

in market value even if it is purchased at market.

While the premium paid over market for the acquired firm is usually the principal cost of

an acquisition, there are other costs as well which can frequently be substantial. In an

uncontested merger, there are legal costs and management's time which could be spent on other

activities. There are uncertainties created for the acquired firm's management, suppliers, and

customers which could affect the operations of that firm during the negotiations and subsequent

transition. Of course, if the merger is contested, then litigation costs will be substantial.

Even if it is decided that firm diversification is warranted, achieving this

diversification through acquisition is very costly. If, because of management risk aversion or

debt capacity or supplier concerns, it is decided that the volatility or total risk of the firm should

be reduced, then this can be achieved much more efficiently (i.e., at lower cost) by simply

purchasing a portfolio of equities and fixed-income securities where no premium must be paid

over market and no significant transactions costs must be paid. If diversification is desired to

provide "cash flow" from these operations to fund growth investments in current operations, then

it is almost certainly less costly to issue securities and raise the funds in the capital markets.

Don't pay $12 to $20 to acquire $10 in cash!

If it is costly for your shareholders to diversify their portfolios by direct purchase of

individual firms' shares, then this service can be provided at less cost by mutual funds,

investment companies, and other financial intermediaries. In summary, there are three types of

reasons for a firm to consider the acquisition of another firm:

1) Synergy

2) Taxes

3) The firm to be acquired is a "Bargain"

They all have in common that the acquisition should increase the value of the acquiring firm's

current stockholders' wealth.

The possibility of a takeover of one firm by another is an important "check" which serves to force

managements to pursue policies which are (at least approximately) value-maximizing.

Diversification by the firm is, in general, not an important objective for the management of the

firm. Hence, if pursued, then a minimum of resources should be used to achieve it. Specifically,

the acquisition of another firm is a costly way to achieve diversification.

Warning: "diversification" is frequently given as the reason for acquiring a firm by the

acquiring firm's management. If carefully investigated, (most of the time) the meaning of

"diversification" as used is not the one described here, and the real reasons will be one or more of

the three (proper) reasons for making an acquisition.

XVI. THE FINANCING DECISION BY FIRMS: IMPACT OF DIVIDEND POLICY ON VALUE

In Section IX, the choice of capital structure part of the firm's financing decision was

examined to determine if this choice has a significant effect on the market value of the firm. In a

parallel fashion, we examine here the impact of dividend policy on the market value of the firm.

That is, we address the question, "Does dividend policy ‘matter’?" As with the analysis of the

capital structure issue in Section IX, this question is well posed only if it is qualified to reflect

what are the "givens" of the environment. As in Section IX, we ask this question in the context

of a given or prespecified investment policy. That is, given that the firm has already set its

investment plan in real assets, can alternative choices among dividend policies change the market

value of the firm? As will be shown using the basic cash flow accounting identity,

asking the question in this context is equivalent to asking whether or not it matters that

the firm finances its investments by internally-generated funds or by raising the necessary money

externally in the capital markets (or through financial intermediaries).

Using the notation of Sections VI and XIV, an investment policy or plan corresponds to a

specific set of cash flows over time, {X(t)}, and investments over time {I(t)}. From the valuation

formulas (XIV.20) or (XIV.21), a seemingly obvious answer is that "of course, dividend policy

affects the value of the firm." From the valuation formula (XIV.32b), however, an equally

obvious answer is that "given that the distribution for {X(t)} and {I(t)} is fixed, V(0) cannot

change by changing the payout stream, and hence, dividend policy does not affect the value of

the firm." In fact, neither answer is universally correct. Thus, the second answer is correct

provided that the cost of capital, {k(t)}, does not depend on dividend policy. But, it remains to

be determined under what conditions this lack of dependence will obtain.

Before exploring this issue, we briefly digress to list some factors which appear to

influence dividend policy:

Factors influencing dividend policy:

(1) legal restrictions
(2) cash position
(3) need to repay debt
(4) restrictions in debt contracts
(5) rate of asset expansion
(6) profit rates
(7) access to capital markets (tradeability of equity)
(8) control of the firm
(9) tax position of shareholders
(10) corporate tax liabilities

Observed stability of dividend policy with respect to earnings or cash flows.

Modigliani-Miller Theorem on Dividend Policy

First proof that "dividends do not matter"

Assume an environment in which short sales are allowed with full use of the proceeds. Suppose

there are two firms with identical investment policies, i.e.,

\[ \tilde{X}_1(t) \equiv \tilde{X}_2(t) \quad \text{and} \quad \tilde{I}_1(t) \equiv \tilde{I}_2(t) \]

Suppose that the dividend policies of the two firms for time t > T are identical, but their dividend policies differ for t ≤ T. Suppose that their values today are different. By convention, V2(0) > V1(0). For simplicity, assume that n1(0) = n2(0), where n1 and n2 are the number of shares issued by the two firms.

Consider the following portfolio strategy:

At time zero, buy λ% of firm #1 and sell short λ% of firm #2. Since V2(0) > V1(0), my total

position is at this point:

(a) cash = λ[V2(0) – V1(0)] > 0
(b) long λn1(0) shares of firm #1
(c) short λn2(0) shares of firm #2

Suppose that the portfolio policy is pursued of always maintaining a long position in firm #1

equal to λ% of its value and a short position in firm #2 equal to λ% of its value.

Let N1(t) = number of shares of firm #1 which you are long at time t.

Let N2(t) = number of shares of firm #2 which you are short at time t.

Then N1(0) = λn1(0) and N2(0) = λn2(0) and

(XVI.1a)  \( N_1(t+1) = N_1(t) + \lambda m_1(t+1) \), and

(XVI.1b)  \( N_2(t+1) = N_2(t) + \lambda m_2(t+1) \)

where m1 and m2 are the changes in n1 and n2.

Let C(t) = total cash flow from this portfolio strategy at time t. Then:

\[ C(0) = \lambda n_2(0) S_2(0) - \lambda n_1(0) S_1(0) = \lambda [V_2(0) - V_1(0)] > 0 \]

where S1 and S2 are the share prices for firms #1 and #2.

Assume that C(0) is invested in riskless-in-terms-of-default, T-period discount bonds with yield

to maturity of R(T). For t > 0 and t < T–1 , we have that

(XVI.2)  \( C(t+1) = N_1(t) d_1(t+1) - N_2(t) d_2(t+1) + \lambda m_2(t+1) S_2(t+1) - \lambda m_1(t+1) S_1(t+1) \)

where d1 and d2 are the dividends per share.

From the strategy design described in (XVI.1), we have that:

(XVI.3a)  \( N_1(t) d_1(t+1) = \lambda D_1(t+1) \)

(XVI.3b)  \( N_2(t) d_2(t+1) = \lambda D_2(t+1) \)

where D1 and D2 are the total dividends paid by the two firms respectively.

Substituting from (XVI.3) into (XVI.2), we have that:

(XVI.4)
\[
\begin{aligned}
C(t+1) &= \lambda D_1(t+1) - \lambda D_2(t+1) + \lambda m_2(t+1) S_2(t+1) - \lambda m_1(t+1) S_1(t+1) \\
       &= \lambda \{ [D_1(t+1) - m_1(t+1) S_1(t+1)] - [D_2(t+1) - m_2(t+1) S_2(t+1)] \}
\end{aligned}
\]

From the cash flow accounting identity (VI.12), we have that:

(XVI.5a)  \( D_1(t+1) - m_1(t+1) S_1(t+1) \equiv X_1(t+1) - I_1(t+1) \equiv Y_1(t+1) \)

(XVI.5b)  \( D_2(t+1) - m_2(t+1) S_2(t+1) \equiv X_2(t+1) - I_2(t+1) \equiv Y_2(t+1) \),

and by hypothesis of a fixed investment policy, \( Y_1(t+1) \equiv Y_2(t+1) \) for all t. Therefore, substituting into (XVI.4), we have that:

(XVI.6)  \( C(t+1) = 0 \) for \( 0 < t < T-1 \)

If the positions are liquidated at time T, then we have that:

(XVI.7)  \( C(T) = \lambda \{ [D_1(T) - m_1(T) S_1(T)] - [D_2(T) - m_2(T) S_2(T)] \} + \lambda V_1(T) - \lambda V_2(T) + C(0)[1+R(T)]^T \)

where λV1(T) is the sale of the shares held long, λV2(T) is the purchase of the shares sold short, and C(0)[1+R(T)]^T is the cash and interest on maturity of the bonds.

By assumption, after date T, the dividend policies of the two firms are identical. So after the

dividend payments at time T, it must be that the two firms have identical market values, i.e.,

V1(t) = V2(t) for t ≥ T . In particular, V1(T) = V2(T) . From this and (XVI.5), we have that:

(XVI.8)  \( C(T) = C(0)[1+R(T)]^T = \lambda [V_2(0) - V_1(0)][1+R(T)]^T > 0 \) if \( V_2(0) > V_1(0) \)

Therefore, by investing no money at any time during the interim, the investor can earn C(T) at

time T . Therefore, to avoid arbitrage, C(T) ≡ 0 or:

(XVI.9)  \( V_2(0) = V_1(0) \)

Therefore, the values of the two firms must be equal and dividend policy "does not matter."
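The cash flows of the arbitrage strategy can be traced numerically. The following is a minimal sketch, not part of the original argument: the function name `arbitrage_cash_flows`, its argument layout, and all numbers are hypothetical, and λ is entered as a decimal fraction. It applies (XVI.4) at the interim dates and (XVI.7) at liquidation.

```python
def arbitrage_cash_flows(lam, V1_0, V2_0, net_payouts_1, net_payouts_2,
                         V1_T, V2_T, R):
    """Cash flows C(0), ..., C(T) of the long-#1 / short-#2 strategy.

    net_payouts_i[k] is D_i(t) - m_i(t) S_i(t) at date t = k + 1,
    which equals Y_i(t) by (XVI.5). lam is the fraction held.
    """
    T = len(net_payouts_1)
    flows = [lam * (V2_0 - V1_0)]            # C(0): short proceeds less purchase cost
    for t in range(1, T):                    # interim dates, as in (XVI.4)
        flows.append(lam * (net_payouts_1[t - 1] - net_payouts_2[t - 1]))
    # Liquidation at T, as in (XVI.7): last net payouts, sale of the long
    # position, repurchase of the short position, and the maturing bonds.
    final = lam * (net_payouts_1[T - 1] - net_payouts_2[T - 1])
    final += lam * V1_T - lam * V2_T + flows[0] * (1 + R) ** T
    flows.append(final)
    return flows


# Hypothetical two-period case: identical investment policies (same Y),
# identical values at T, but firm #2 priced $100 higher today.
flows = arbitrage_cash_flows(0.01, 2900.0, 3000.0,
                             [60.0, 64.8], [60.0, 64.8],
                             3500.0, 3500.0, 0.05)
print(flows)
```

With identical net payouts the interim flow is zero, and the final flow is just C(0) compounded at R(T), which is positive whenever V2(0) > V1(0). That is the arbitrage profit that forces (XVI.9).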

Second proof that "dividends do not matter":

Assume that: (1) Imputed Rationality: in forming expectations, each individual investor assumes that every other trader in the market (A) is rational in the sense of preferring more wealth to less, independent of the form an increment in wealth may take, and (B) imputes rationality to all other investors. (2) Symmetric Market Rationality (SMR): the market as a whole satisfies SMR if every trader is both rational in behavior and imputes rationality to the market.

We do not assume that short sales can be made with the full use of the proceeds. Consider two

firms as in the "first proof." Suppose that at time t = T–1 there is an investor who is considering

buying λ% of firm #2 for $λV2(T–1). Suppose instead he bought λ% of firm #1 and did the

following: at time T, he will receive λD1(T) in dividends. Suppose he sells (ex-dividend)

$λ[D2(T) – D1(T)] of his stock for cash if D2(T) ≥ D1(T), or if D1(T) > D2(T) , then he buys

$λ[D1(T) – D2(T)] of the stock of firm #1. At this point, he will then have ${λD1(T) + λ[D2(T)

– D1(T)]} = $λD2(T) in cash and ${λ[V1(T) – m1(T)S1(T)] + λ[D1(T) – D2(T)]} worth of firm

#1's stock. From (XVI.5), we have that D1(T) – m1(T)S1(T) = D2(T) – m2(T)S2(T). So, –λ[m1(T)S1(T) – D1(T) + D2(T)] = –λm2(T)S2(T). Therefore, our investor would have:

$λD2(T) in cash, and
$λ[V1(T) – m2(T)S2(T)] = $λ[V2(T) – m2(T)S2(T)] in stock, because V1(T) = V2(T).

But, this is exactly the amount of cash and stock which he would have had if he bought λ% of

firm #2. If V1(T–1) < V2(T–1), then every investor (who prefers more to less) would be better

off to buy firm #1 instead of firm #2. Hence, unless V1(T–1) = V2(T–1), there will be a

dominance of one of the firms over the other. If one firm dominates the other, who would buy

the dominated firm, or who would hold it? Clearly, no one. Hence, V1(T–1) = V2(T–1) .

Suppose at some date τ, V1(τ) = V2(τ) , then, by the same argument (with "τ" replacing "T"),

we have that V1(τ–1) = V2(τ–1). Proceeding inductively, we have that V1(0) = V2(0). Both

proofs neglect transactions costs and personal taxes. We now explore what effect these might

have.
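The replication step in the second proof is simple arithmetic and can be checked directly. The sketch below is illustrative only: the function name `homemade_dividend` and the numbers are hypothetical, and λ is entered as a decimal fraction. It compares what the investor holds at time T after buying λ of firm #1 and trading ex-dividend with what he would hold had he bought λ of firm #2.

```python
def homemade_dividend(lam, D1, D2, V1_T, V2_T, m1S1, m2S2):
    """Return ((cash, stock) via firm #1, (cash, stock) via firm #2) at time T.

    V1_T, V2_T are total firm values at T; m1S1, m2S2 are the dollar
    amounts of new shares issued at T by each firm.
    """
    cash = lam * D1 + lam * (D2 - D1)               # dividend plus sale (or less purchase)
    stock = lam * (V1_T - m1S1) + lam * (D1 - D2)   # holding left after the trade
    direct_cash = lam * D2                          # dividend from lam of firm #2
    direct_stock = lam * (V2_T - m2S2)              # old shareholders' claim on firm #2
    return (cash, stock), (direct_cash, direct_stock)


# Hypothetical case with the same net payout D - mS = 60 for both firms:
# firm #1 pays 60 and issues nothing; firm #2 pays 100 and issues 40.
via_1, via_2 = homemade_dividend(0.01, 60.0, 100.0, 3240.0, 3240.0, 0.0, 40.0)
print(via_1, via_2)
```

Because D1(T) – m1(T)S1(T) = D2(T) – m2(T)S2(T) and V1(T) = V2(T), the two positions coincide, so the investor is indifferent between the firms only if they sell for the same price at T–1.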

Dividend Policy & Market Imperfections: It appears that reductions in current dividends per

share (for fixed investment policy) may increase stockholders' wealth.

(i) because substantial underwriting costs are incurred in issuing stock, shareholders should

prefer a reduction in dividends to a stock issue.

(ii) because capital gains are taxed at a lower rate than dividends and only at the time of their

realization through sale.

Informational Content of Dividends

Since the practice is that dividend payments are smoothed to conform to managers'

estimates of average earnings, the announcement of an increase in dividend payments implies

that management has raised its estimate of average future earnings. If unanticipated through

other means, such an announcement would be expected to affect the stock price.

Generally, (i) Managers are reluctant to cut the dividend rate for fear that this would be

interpreted as a sign of poor earning prospects.

(ii) Dividends are increased only when management is reasonably confident that

the increase can be maintained.

(iii) Payout ratios (Dividends/Earnings) fluctuate because dividends are more stable than earnings. But, a firm's target payout ratio is normally stable over time.

(iv) Target payout ratios vary widely from company to company. A typical ratio

is .50 - .60 .

Example: The Constant-Growth Case: Growth Stocks

Review Section VI, pp. 6-20 and 6-21.

Consider the constant-growth case examined there:

We have from (VI.18) that:

(XVI.10)  \( V(0) = \dfrac{(1-\delta)\,\pi(0)}{r - \delta r^*} \)

where r*δ = rate of growth of earnings

and δ = fraction of profits allocated to new investment.

From (XIV.27), we have that:

(XVI.11)  \( V(0) = \dfrac{D(1)}{r - g} \)

where g is the rate of growth of dividends per share. From the accounting identity, D(t) –

m(t)S(t) = X(t) – I(t). If I(t) = δX(t), then D(t) – m(t)S(t) = [1–δ]X(t). Let δr = fraction of current earnings retained (i.e., D(t) = [1–δr]X(t)). Let δe = the amount of external financing required expressed as a fraction of current earnings (i.e., m(t)S(t) = δeX(t)). It follows that [1–δr]X(t) – δeX(t) = [1–δ]X(t) or δe = δ – δr.

X(1) = π(0), so D(1) = [1–δr]π(0).

From (XVI.10) and (XVI.11), we have that

(XVI.12)  \( V(0) = \dfrac{(1-\delta)\,\pi(0)}{r - \delta r^*} = \dfrac{[1-\delta_r]\,\pi(0)}{r - g} \)

or

(XVI.12')  \( g = \dfrac{\delta r^* (1-\delta_r)}{1-\delta} - \dfrac{r\,\delta_e}{1-\delta} \)

Note: Unless δe = 0 (i.e., no external financing), the rate of growth of dividends, g, is less than the rate of growth of profits, δr*. Further, even if the firm pays out all of its current earnings in dividends, i.e., δr = 0, dividends and price per share will grow over time, i.e.,

\[ g = \frac{\delta (r^* - r)}{1-\delta}. \]
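Formulas (XVI.10) and (XVI.12') are easy to check against each other numerically. The sketch below is a hedged illustration: the function names `firm_value` and `dividend_growth` and the parameter values are hypothetical choices, not from the text. It confirms that the two expressions for V(0) in (XVI.12) agree and that the δr = 0 case reduces to δ(r* – r)/(1 – δ).

```python
def firm_value(pi0, delta, r, r_star):
    """V(0) from (XVI.10); meaningful only when r > delta * r_star."""
    return (1 - delta) * pi0 / (r - delta * r_star)


def dividend_growth(delta, delta_r, r, r_star):
    """Growth rate g of dividends per share from (XVI.12')."""
    delta_e = delta - delta_r          # external financing as a fraction of earnings
    return (delta * r_star * (1 - delta_r) - r * delta_e) / (1 - delta)


# Hypothetical parameters: full payout (delta_r = 0), all investment financed externally.
pi0, delta, r, r_star = 100.0, 0.4, 0.10, 0.20
g = dividend_growth(delta, 0.0, r, r_star)
value_from_earnings = firm_value(pi0, delta, r, r_star)   # left side of (XVI.12)
value_from_dividends = (1 - 0.0) * pi0 / (r - g)          # right side of (XVI.12)
print(g, value_from_earnings, value_from_dividends)
```

The two valuations agree for any split of δ between δr and δe, which is the constant-growth counterpart of the statement that dividend policy does not affect V(0).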

Example: three firms all with π(0) = $100 and identical investment policies:

Firm      I         II        III
π(0)      $100      $100      $100
r         .10       .10       .10
r*        .20       .20       .20
δ         .40       .40       .40
δr        .40       0         .20
δe        0         .40       .20
V(0)      $3,000    $3,000    $3,000
I(1)      $40       $40       $40
n(0)      1,000     1,000     1,000
S(0)      $3.00     $3.00     $3.00

Firm I: Finances all its investment internally through retained earnings, i.e., δr = δ = .40 and

δe = 0 .

From (XVI.10),

\[ V(0) = \frac{(1-\delta)\,\pi(0)}{r - \delta r^*} = \frac{(.6)(100)}{.10 - .4(.2)} = \frac{60}{.02} = \$3{,}000 \]

\[ I(1) = \delta X(1) = .4(\$100) = \$40. \]

Since this firm does no external financing, D(1) = dividends = X(1) – I(1) = $100 – 40 = $60, by the accounting identity. Dividends per share are d(1) = D(1)/n(0) = 60/1,000 = $.06 per share. We have that

\[ X(t) = X(t-1) + r^* I(t-1) = \pi(0)[1 + r^* \delta]^{t-1}. \]

Hence, X(2) = X(1)[1 + r*δ] = 100(1.08) = $108. Therefore, the value of the firm next period will be:

\[ V(1) = \frac{X(2)(1-\delta)}{r - \delta r^*} = \frac{\$108(1 - .4)}{.10 - .4 \times .2} = \$3{,}240. \]

Since no new shares are issued, n(1) = n(0) = 1,000 shares. So, the price per share will be $3.24. The total rate of return to the stockholder will be:

\[ \frac{d(1) + S(1) - S(0)}{S(0)} = \frac{.06 + 3.24 - 3.00}{3.00} = 10\% = r. \]

The rate of growth of dividends, g, from (XVI.12'), will be

\[ g = \frac{\delta r^* (1-\delta_r)}{1-\delta} - \frac{r\,\delta_e}{1-\delta} = \delta r^* = .08 = 8\% \]

which is also the rate of growth of the firm, [V(1) – V(0)]/V(0), and the rate of growth of price per share, [S(1) – S(0)]/S(0).

Firm II: Finances all new investment by issuing new shares and pays out all earnings as

dividends. As has been demonstrated previously, since the investment policy is the same for all

three firms, X(1), X(2), V(0), V(1), and S(0) will be the same for all firms, and they depend on

the profitability of current assets and future investment opportunities which are independent of

dividend policy. Hence, V(1) = $3240 for this firm, but at that point, the firm will not belong

completely to the shares outstanding at time zero. Namely, it must issue m(1) new shares at

price S(1) to finance investment I(1). I.e., m(1)S(1) = I(1) = $40, V(1) = n(0)S(1) + m(1)S(1) or

\[ S(1) = \frac{V(1) - m(1)S(1)}{n(0)} = \frac{3240 - 40}{1000} \]

or S(1) = $3.20 and m(1) = 12.5. The return to the

shareholders is

\[ \frac{d(1) + S(1) - S(0)}{S(0)} = \frac{.10 + 3.20 - 3.00}{3.00} = 10\% = r \]

since D(1) = X(1) = $100 and d(1) = D(1)/n(0) = 100/1000 = $.10. Note that the larger dividend of Firm II is offset by a smaller capital gain.

The rate of growth of dividends, g, from (XVI.12') is

\[ g = \frac{\delta r^* (1-\delta_r)}{1-\delta} - \frac{r\,\delta_e}{1-\delta} = \frac{(.4)(.2)(1-0)}{.6} - \frac{.4(.1)}{.6} = 0.0667 = \frac{S(1) - S(0)}{S(0)} = \text{rate of growth of price per share}. \]

Note: The growth of dividends is smaller than for Firm I.

Firm III: Uses a mix of one-half internal and one-half external financing. Hence, m(1)S(1) = .5I(1) = $20 and again,

\[ S(1) = \frac{V(1) - m(1)S(1)}{n(0)} = \frac{3240 - 20}{1000} = \$3.22 \text{ per share} \]

\[ D(1) = X(1) - I(1) + m(1)S(1) = 100 - 40 + 20 = \$80 \]

\[ d(1) = \frac{D(1)}{n(0)} = \$0.08 \quad \text{and} \quad \frac{d(1) + S(1) - S(0)}{S(0)} = \frac{.08 + 3.22 - 3.00}{3.00} = 10\% = r \]

\[ g = \frac{\delta r^* (1-\delta_r)}{1-\delta} - \frac{r\,\delta_e}{1-\delta} = \frac{(.08)(.8)}{.6} - \frac{(.2)(.1)}{.6} = .0733 = \frac{S(1) - S(0)}{S(0)} = \text{rate of growth of stock price}. \]

Note: r = 10% = d(1)/S(0) + g = current dividend yield + growth.
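The three-firm comparison can be reproduced in a few lines. This is a sketch under the example's own assumptions (constant growth, new shares issued at the ex-dividend price S(1)); the function name `first_period` and the dictionary keys are hypothetical.

```python
def first_period(pi0, r, r_star, delta, delta_r, n0):
    """Per-share results over the first period of the constant-growth example."""
    denom = r - delta * r_star
    V0 = (1 - delta) * pi0 / denom            # (XVI.10)
    X1 = pi0
    X2 = X1 * (1 + r_star * delta)            # earnings grow at rate r* delta
    V1 = (1 - delta) * X2 / denom             # firm value one period later
    external = (delta - delta_r) * X1         # m(1)S(1): new equity raised
    D1 = (1 - delta_r) * X1                   # total dividends paid
    S0 = V0 / n0
    S1 = (V1 - external) / n0                 # price per share after the issue
    d1 = D1 / n0
    return {
        "S1": S1,
        "d1": d1,
        "new_shares": external / S1,
        "total_return": (d1 + S1 - S0) / S0,
        "growth": (S1 - S0) / S0,
    }


for name, delta_r in (("I", 0.40), ("II", 0.0), ("III", 0.20)):
    print(name, first_period(100.0, 0.10, 0.20, 0.40, delta_r, 1000))
```

In each case the total return comes out at r, while the split between dividend yield and price growth moves one-for-one with the payout choice.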

On Corporate Earnings and Investor Returns

What is the relationship between total dollar returns to shareholders in a particular period

(i.e., dividends plus capital gains) and total dollar earnings of the firm, X(t)? If G(t) = capital

gains to shareholders between period t – 1 and t, then D(t) + G(t) = (1–δr)X(t) + gV(t–1) ,

because (1–δr)X(t) is the amount of earnings not retained, and g, the rate of growth of

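dividends, is equal to the rate of growth of price per share. We have that

\[ V(t-1) = \frac{(1-\delta)\,X(t)}{r - \delta r^*} \]

and from (XVI.12'), that

\[ g = \frac{\delta r^* (1-\delta_r)}{1-\delta} - \frac{r\,\delta_e}{1-\delta}. \]

Hence,

\[
\begin{aligned}
D(t) + G(t) &= X(t)\left\{ (1-\delta_r) + \frac{\delta r^* (1-\delta_r)(1-\delta)}{(r - \delta r^*)(1-\delta)} - \frac{r\,\delta_e (1-\delta)}{(r - \delta r^*)(1-\delta)} \right\} \\
&= \frac{X(t)}{r - \delta r^*} \left\{ r(1-\delta_r) - \delta r^* (1-\delta_r) + \delta r^* (1-\delta_r) - r\,\delta_e \right\} \\
&= \frac{r X(t)}{r - \delta r^*} \left\{ 1 - \delta_r - \delta_e \right\} = \frac{r X(t)}{r - \delta r^*} [1 - \delta] \quad \text{since } \delta = \delta_r + \delta_e .
\end{aligned}
\]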

So,

\[
\frac{D(t) + G(t)}{X(t)} = \frac{r(1-\delta)}{r - \delta r^*}
\begin{cases}
= 1 & \text{for } r^* = r \\
> 1 & \text{for } r^* > r \\
< 1 & \text{for } r^* < r
\end{cases}
\qquad \text{for } 0 < \delta < 1 .
\]

Note: From (XVI.10), \( \dfrac{X(1)}{V(0)} = \dfrac{\pi(0)}{V(0)} = \dfrac{r - \delta r^*}{1-\delta} \), and from (XVI.11), \( \dfrac{D(1)}{V(0)} = r - g \).

So, in general, neither the earnings-to-price nor the dividends-to-price ratio is an unbiased

estimate of the cost of capital, r.
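A quick numerical check makes the bias concrete. The sketch below is illustrative only: the function names are hypothetical, and the parameter values are borrowed from the three-firm example (r = .10, r* = .20, δ = .40) plus two hypothetical alternatives for r*.

```python
def return_to_earnings_ratio(r, r_star, delta):
    """[D(t) + G(t)] / X(t) = r (1 - delta) / (r - delta r*)."""
    return r * (1 - delta) / (r - delta * r_star)


def earnings_yield(r, r_star, delta):
    """X(1) / V(0) = (r - delta r*) / (1 - delta), from (XVI.10)."""
    return (r - delta * r_star) / (1 - delta)


r, delta = 0.10, 0.40
for r_star in (0.05, 0.10, 0.20):
    print(r_star,
          return_to_earnings_ratio(r, r_star, delta),
          earnings_yield(r, r_star, delta))
```

Only when r* = r does the earnings-to-price ratio equal r and shareholder dollar returns equal earnings; with r* > r the earnings yield understates the cost of capital.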

Does dividend policy "matter"? Empirical Evidence

Graham & Dodd (early work)

As the result of a cross-sectional fit of companies, they found the following relationship:

\[ P = m\left[ D + \frac{E}{3} \right] \]

where

E = earnings; D = dividends; change in retained earnings = ∆RE;

P = price of stock; m = constant.

Because E = D + ∆RE , we also have

\[ P = \frac{m}{3}\left[ 4D + \Delta RE \right] . \]
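Written out, the substitution of E = D + ∆RE into the fitted relationship is a one-line check of the algebra (nothing here goes beyond the two equations already stated):

```latex
\begin{align*}
P &= m\left[D + \frac{E}{3}\right]
   = m\left[D + \frac{D + \Delta RE}{3}\right] \\
  &= m\left[\frac{4D}{3} + \frac{\Delta RE}{3}\right]
   = \frac{m}{3}\left[4D + \Delta RE\right]
\end{align*}
```

In this fit a dollar of dividends therefore carries four times the weight of a dollar of retained earnings.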

The weighted average is important and the dividends have a large weight. Implied policy: make

the dividend as large as possible. The equation was "derived" by looking at the data (although it

did not do well for growth stocks, e.g., IBM). Regression or "fit" was done as follows:

Implication: Other things equal, the higher the payout ratio, the higher the price.

Is there any problem with this analysis?

Suppose: P = price is a function of future earnings and managements choose dividends as a

function of future earnings. Does it follow that because P plotted against D gives a good fit,

one can raise price by increasing the dividend payout if the anticipated future earnings stream

remains the same? I.e., is the Graham-Dodd result a causal relationship?

Suppose: the price-earnings ratio properly computed using long-run "smooth" earnings (i.e., P/π(0)) and the "target" payout ratio, D/π(0), are independent of each other. At a point in time, some firms' earnings will be transitorily lower than their long-run average. Realizing the transitory nature of the lower earnings, management does not "cut" the dividend, which is based on the "long-run" earnings trend. Hence, D/E > D/π(0). Similarly, the market, recognizing that price is dependent on "long-run" earnings, will not bid down the price. Hence, P/E > P/π(0). At the same time, some firms' earnings will be transitorily higher than their long-run average. For the same reasons, management does not raise the dividend nor does the market bid up the price. Hence, D/E < D/π(0) and P/E < P/π(0).

In a cross-section, the strong positive fit between D/E and P/E could merely reflect

transitory earnings coupled with managements having a target payout based on long-run

"smoothed" earnings.

Because of their concern over the information effect of dividends, management may well

"smooth" dividend payments to match their long-run expectations about the earnings of the firm.

Suppose: (as seems to be the case empirically), that dividend payout policy and the risk (as, for

example, measured by beta) of a firm's underlying assets are not independent. I.e., that high (or

low) dividend payout policies are not randomly distributed across firms. Moreover (as seems to

be the case), suppose that low-risk firms tend to also have high payout policies. Then in a cross-

section of firms, one would expect to find that high-payout ratios would be associated with high

price-earnings ratios. Yet, such a finding does not imply that a firm can raise its PE ratio by

increasing its payout ratio if it maintains the same risk level for its assets.
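The transitory-earnings explanation above can be illustrated with a small simulation. Everything in this sketch is an assumption made for illustration (the uniform distributions, their ranges, the sample size, and the function names); it is not the Graham-Dodd data or procedure. Long-run P/π(0) and D/π(0) are drawn independently, and measured earnings E differ from π(0) by a transitory shock.

```python
import random


def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate_cross_section(n_firms=5000, seed=0):
    """Return (corr of long-run ratios, corr of measured D/E and P/E)."""
    rng = random.Random(seed)
    pe_long, payout_long, pe_obs, payout_obs = [], [], [], []
    for _ in range(n_firms):
        p_over_pi = rng.uniform(9.0, 11.0)    # P / pi(0), drawn independently
        d_over_pi = rng.uniform(0.45, 0.55)   # D / pi(0), drawn independently
        e_over_pi = rng.uniform(0.5, 1.5)     # transitory shock: E / pi(0)
        pe_long.append(p_over_pi)
        payout_long.append(d_over_pi)
        pe_obs.append(p_over_pi / e_over_pi)       # measured P/E
        payout_obs.append(d_over_pi / e_over_pi)   # measured D/E
    return pearson(payout_long, pe_long), pearson(payout_obs, pe_obs)


long_run_corr, measured_corr = simulate_cross_section()
print(long_run_corr, measured_corr)
```

The long-run ratios are uncorrelated by construction, yet the measured D/E and P/E are strongly positively correlated because both are divided by the same transitory earnings figure. A good cross-sectional fit of this kind therefore says nothing about whether raising the payout would raise the price.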

Black-Scholes Dividend Paper

As was discussed in the beginning of this section, the only way that dividend policy can

affect the value of the firm (given a fixed investment policy) is if alternative choices for dividend

policy affect the required expected return on the firm (i.e., the {k(t)}). In their dividend paper,

Black and Scholes provide a test of the hypothesis that alternative dividend policies differentially affect

required expected returns. To overcome the inherent difficulties with simple cross-sectional analysis,

their test is a combined time-series and cross-sectional analysis. Moreover, their test attempts to correct

for the different risks inherent in a cross-section of stocks. In constructing the test procedure, they begin

with a (generalized) Capital Asset Pricing Model specification for expected returns on securities:

(XVI.13)  \( E(Z_j) = \gamma_0 + [E(Z_M) - \gamma_0]\,\beta_j + \gamma_1 [\delta_j - \delta_M] / \delta_M \)

where δj = current dividend yield on security j; δM = current dividend yield on the market; γ0 =

expected return on a "zero-beta" portfolio; and γ1 is the "expected return" on the dividend factor.

Possibilities:

(i) The classical security market line relationship of the CAPM would predict γ0 =

R, the riskless rate, and γ1 = 0 .

Thus, if they could not reject γ0 = R and γ1 = 0 , we cannot reject the CAPM and

we cannot reject the hypothesis that dividend policy "does not matter."

(ii) If γ1 ≠ 0 , then the data suggest that dividend policy does differentially affect

returns. Further, if γ1 > 0 , then this would imply that investors prefer low-

dividend yielding stocks. If γ1 < 0 , then this would imply that investors prefer

high-dividend yielding stocks.
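The role of γ1 in (XVI.13) can be seen by evaluating the specification for two stocks that differ only in dividend yield. This is a sketch with hypothetical inputs: the function name `expected_return` and all numbers are made up for illustration and are not estimates from the Black-Scholes paper.

```python
def expected_return(gamma0, gamma1, exp_market, beta_j, yield_j, yield_m):
    """E(Z_j) under the generalized CAPM specification (XVI.13)."""
    return (gamma0
            + (exp_market - gamma0) * beta_j
            + gamma1 * (yield_j - yield_m) / yield_m)


# Two hypothetical stocks with the same beta but different dividend yields.
for gamma1 in (0.0, 0.01):
    high = expected_return(0.05, gamma1, 0.12, 1.0, yield_j=0.06, yield_m=0.03)
    low = expected_return(0.05, gamma1, 0.12, 1.0, yield_j=0.01, yield_m=0.03)
    print(gamma1, high, low)
```

With γ1 = 0 the two expected returns coincide, so dividend yield is not priced. With γ1 > 0 the high-yield stock must offer a higher expected return, which is the sense in which investors would then "prefer" low-dividend stocks.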

Their findings were that while they could reject the hypothesis that γ0 = R, they could not reject

the hypothesis that γ1 = 0 .

Their results seem somewhat surprising in the light of our proof that dividend policy does

not matter in the absence of transactions costs and personal taxes. Since both exist in the real

world, one's prior might be that γ1 > 0 . I.e., investors prefer low-dividend yielding stocks.

The Black-Scholes explanation of this result is as follows: because payout policies are

not randomly distributed across the firms and risk classes, to achieve dividend yields that are

significantly different from the market's, the investor must hold a less-than-well-diversified

portfolio. Thus, to achieve a higher (or lower) dividend-yielding portfolio, one must pay a price

in the form of increased variance. Because dividend-yield is only a small fraction of the total

return on the market and the maximum tax-saving is even smaller, it does not pay to adjust one's

portfolio to avoid dividends. Moreover, unless a taxpayer is in the maximum tax bracket, he

does not know if he would prefer high or low-dividend paying portfolios unless he knows the

"spread" between pre-tax yields. Hence, they conclude that for stock portfolios (in the world as it

is) investors neglect tax differentials between dividends and capital gains.

XVII. SECURITY PRICING AND SECURITY ANALYSIS IN AN EFFICIENT MARKET

Consider the following somewhat simplified description of a typical analyst-investor's

actions in making an investment decision. First, he collects the information or "facts" (both

fundamental and technical) about the company and related matters which may affect the

company. Second, he analyzes this information in such a way so as to determine his best

estimate (as of today, time "zero") of the stock price at a future date (time "one"). This best

estimate is the expected stock price at time one, which we denote by \( \bar{P}(1) \). From looking at the

current stock price, P(0), he can estimate an expected return on the stock, \( \bar{Z} \), which is \( \bar{Z} = \bar{P}(1)/P(0) \).

However, the analyst's job is not finished. Because he recognizes that his information is not

perfect (i.e., subject to error, unforeseen events which may occur, etc.), he must also give

consideration to the range of possible future prices. In particular, he must estimate how

dispersed this range is about his best estimate and how likely is a deviation of a certain size from

this estimate. This analysis then gives him an estimate of the deviations of the rate of return from

the expected rate and the likelihood of such deviations. Obviously, the better his information, the

smaller will be the dispersion and the less risky the investment.

Third, armed with his estimates of the expected rate of return and the dispersion, he must

make an investment decision and determine how much of the stock to buy or sell. How much

will depend on how good the risk-return tradeoff on this stock is in comparison with alternative

investments available and on how much money he has to invest (either personally or as a

fiduciary). The higher the expected return and the more money he has (or controls), the more of

the stock he will want to buy. The larger the dispersion (i.e., the less accurate the information

that he has), the smaller the position he will take in the stock.

To see how the current market price of the stock is determined, we look at the

aggregation of all analysts' estimates, and assume that on the average the market is in

equilibrium. I.e., on average, the price will be such that total (desired) demand equals total

supply. Analysts' estimates may differ for two reasons: (1) they may have access to different

amounts of information (although presumably public information is available to all); (2) they

may analyze the information differently with regard to its impact on future stock prices.

Nonetheless, each analyst comes to a decision as to how much to buy or sell at a given market

price, P(0). The aggregation of these decisions gives us the total demand for shares of the

company at the price, P(0). Suppose that the price were such that there were more shares

demanded than supplied (i.e., it is too low), then one would expect the price to rise, and vice

versa, if there were more shares available at a given price than were demanded. Hence, the

market price of the stock will reflect a weighted average of the opinions of all analysts. The key

question is: what is the nature of this weighting? Because "votes" in the marketplace are cast

with dollars, the analysts with the biggest impact will be the ones who control the larger amounts

of money, and among these, the ones who have the strongest "opinions" about the stock will be

the most important. Note: the ones with the strongest "opinions" have them because (they

believe that) they have better information (resulting in a smaller dispersion around their best

estimate). Further, because an analyst who consistently overestimates the accuracy of his

estimates will eventually lose his customers, one would expect that among the analysts who

control large sums, the ones that believe that they have better information, on average, probably

do.

From all this, we conclude that the market price of the stock will reflect the weighted

average of analysts' opinions with heavier weights on the opinions of those analysts with control

of more than the average amount of money and with better than average amounts of information.

Hence, the estimate of "fair" or "intrinsic" value provided by the market price will be more

accurate than the estimate obtained from an average analyst.
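
The following is a rough sketch of why a weighted average of opinions can be more accurate than a typical single opinion. It assumes, beyond anything stated in the text, that each analyst's estimate equals the true value plus an independent, zero-mean error. The error standard deviations are hypothetical, and the function and variable names are illustrative only.

```python
def weighted_average_variance(weights, variances):
    """Error variance of a weighted average of independent, unbiased estimates."""
    total = sum(weights)
    normalized = [w / total for w in weights]
    return sum(w * w * v for w, v in zip(normalized, variances))

# Hypothetical error standard deviations (in dollars) for five analysts.
error_sds = [2.0, 4.0, 4.0, 8.0, 8.0]
variances = [sd ** 2 for sd in error_sds]

# Assumption: heavier weights go to better-informed analysts
# (here, weight = 1 / error variance).
precision_weights = [1.0 / v for v in variances]

market_variance = weighted_average_variance(precision_weights, variances)
average_analyst_variance = sum(variances) / len(variances)

print(f"error variance of weighted 'market' estimate: {market_variance:.2f}")
print(f"error variance of an average analyst:         {average_analyst_variance:.2f}")
```

Under these assumptions the weighted estimate has a smaller error variance than even the best single analyst, which is the sense in which the market price can be more accurate than an average analyst's estimate.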

Now, suppose that you are an analyst and you find a stock whose market price is low

enough that you consider it a "bargain" (if you never find this situation, then there is no point

being in the analyst business). From the above discussion, there are two possibilities: (1) you do

have a bargain: your estimate is more accurate than the market's, i.e., you have better than

average information about future events which may affect the stock price and/or you do a better than

average job of analyzing information; or (2) others have better information than you do or


process available information better, and your "bargain" is not a bargain.

Which of the two it is depends on how good the other analysts are relative to

oneself. There are important reasons why one would expect the quality of analysts to be high:

(1) the enormous rewards to anyone who can consistently beat the average attract large numbers

of intelligent people to the business; (2) the relative ease of entry into the (analyst) business

implies that competition will force the analysts to get better information and better techniques for

processing this information just to survive; (3) the stock market has been around long enough for

these competitive forces to take effect. Unfortunately, the tendency is to underestimate the

capabilities of other analysts. Ask any analyst if he is better than average, and invariably he

answers "yes." Clearly, this cannot be true for all analysts by the very definition of average. If

the analysts are so good, why aren't most of them rich? Precisely because they compete with

each other, the market price becomes a better and better estimate of "fair value," and it becomes

more difficult to find profit opportunities. To stay ahead, the analyst must develop new ideas

continually. As the limiting case of this process, one would expect that as market prices become

better estimates of "fair value" in the sense of fully reflecting all relevant known information, the

fluctuations of stock prices around the expected "fair return" will be solely the result of

unanticipated events and new information. Hence, these fluctuations are random and not

forecastable. And it is in this sense that the fluctuations in stock prices can be described by a

random walk.

This also explains why the performance of most "managed" portfolios will be no better

than the performance of an "unmanaged" well-diversified portfolio. In fact, the "unmanaged"

portfolio, because it takes market prices as the best estimate of value, is equivalent to a

"managed" portfolio whose manager is a no-worse-than-average analyst! The investor who buys

such a portfolio is simply "piggy-backing" on the actions taken by active analyst-investors

competing with each other.

This is essentially the story behind the "Random Walk Theory." It does not imply that a

better-than-average analyst cannot make greater than fair returns. It does not imply that all

analysts should quit their jobs, and in fact, its cornerstone is that enough analysts remain and


actively compete so that market prices are good estimates of "fair" value. It is only in this way

that the "piggy-backing" by investors can be justified. Further, it does not imply that all investors

should hold "unmanaged" portfolios. If an investor can identify an analyst with above-average

capabilities and is willing to bear the risk of his capabilities, then a "bargain" can be struck so

that both are rewarded for the effort. The theory does imply that to make "extra" profits, one

must have superior techniques which process information in a way not generally known in the

market, and that the longer the market has been in existence and the greater the number of

participants, the more difficult it is to make these "extra" profits.

An Example to Illustrate the Efficient Market Concept

Consider a firm in a cyclical business whose earnings are completely predictable but vary

in the following fashion: If the earnings per share this period are $50, then next period's earnings

per share will be $100, and if the earnings per share this period are $100, then next period's

earnings per share will be $50. I.e., if \(E_t\) denotes earnings in period \(t\) and if \(E_0 = \$50\), then

\[ E_1 = 100,\quad E_2 = 50,\quad E_3 = 100,\ \ldots \qquad\text{or}\qquad E_{t+1} = E_t + 50\,(-1)^t . \]

If the firm pays out all earnings as dividends \((D_t = E_t)\) and if the required return ("fair market

return") is 20% per period, then the correct price per share, \(S_t\) (ex-dividend), is given by

\[ S_0 = \$386.36,\quad S_1 = \$363.64,\quad S_2 = \$386.36,\quad S_3 = \$363.64,\ \ldots \qquad\text{or}\qquad S_{t+1} = S_t + 22.72\,(-1)^{t+1} . \]

I.e., the return per dollar from investing in the shares from time 0 to time 1 is

\[ Z_1 = \frac{D_1 + S_1}{S_0} = \frac{100 + 363.64}{386.36} = 1.20, \]

and from time 1 to time 2,

\[ Z_2 = \frac{D_2 + S_2}{S_1} = \frac{50 + 386.36}{363.64} = 1.20, \]

and so forth.
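
The prices in this example can be checked with a short calculation. The sketch below assumes only what the example states (dividends alternating between $100 and $50, a required return of 20% per period) and solves the two pricing conditions S0 = (D1 + S1)/(1 + r) and S1 = (D2 + S0)/(1 + r) jointly. The function name is illustrative.

```python
def alternating_prices(d_odd, d_even, r):
    """Ex-dividend prices when dividends alternate d_odd, d_even, d_odd, ...

    d_odd is paid at times 1, 3, 5, ...; d_even at times 2, 4, 6, ...
    Solves S0 = (d_odd + S1)/(1+r) and S1 = (d_even + S0)/(1+r) jointly.
    """
    g = 1.0 + r
    s0 = (d_odd * g + d_even) / (g * g - 1.0)
    s1 = (d_even * g + d_odd) / (g * g - 1.0)
    return s0, s1

s0, s1 = alternating_prices(d_odd=100.0, d_even=50.0, r=0.20)
z1 = (100.0 + s1) / s0   # return per dollar, time 0 to time 1
z2 = (50.0 + s0) / s1    # return per dollar, time 1 to time 2 (S2 equals S0)
print(f"S0 = {s0:.2f}, S1 = {s1:.2f}, Z1 = {z1:.2f}, Z2 = {z2:.2f}")
```

With correct pricing, the return per dollar is 1.20 in every period even though earnings and prices both oscillate.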


Suppose that investors are myopic and assume that current earnings (and hence, current

dividends) are permanent. I.e., their best guess of future dividends is that they will be equal to

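current dividends. If \(S'_t\) denotes price per share under this belief, then

\[ S'_0 = \frac{D_0}{r} = \frac{50}{.2} = \$250;\quad S'_1 = \frac{D_1}{r} = \frac{100}{.2} = \$500, \]

\[ S'_2 = \frac{D_2}{r} = \frac{50}{.2} = \$250;\quad S'_3 = \frac{D_3}{r} = \frac{100}{.2} = \$500,\ \ldots \]

or

\[ S'_{t+1} = S'_t + 250\,(-1)^t . \]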

The return per dollar from investing in the shares from time 0 to time 1 under this pricing is

\[ Z'_1 = \frac{D_1 + S'_1}{S'_0} = \frac{100 + 500}{250} = 2.4 \quad\text{or } 140\%, \]

and from time 1 to time 2, \(Z'_2\) is

\[ Z'_2 = \frac{D_2 + S'_2}{S'_1} = \frac{50 + 250}{500} = 0.6 \quad\text{or } -40\%, \]

and it continues to alternate.
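
The alternating returns under myopic pricing can be reproduced as follows. This is a minimal sketch using only the numbers of the example (the dividend pattern and r = 20%); the variable names are illustrative.

```python
r = 0.20
dividends = [50.0, 100.0, 50.0, 100.0, 50.0]  # D_0, D_1, ... as in the example

# Myopic price: the current dividend is treated as a perpetuity, S'_t = D_t / r.
myopic_prices = [d / r for d in dividends]

# Realized return per dollar from t-1 to t: Z'_t = (D_t + S'_t) / S'_{t-1}.
myopic_returns = [
    (dividends[t] + myopic_prices[t]) / myopic_prices[t - 1]
    for t in range(1, len(dividends))
]

for t, z in enumerate(myopic_returns, start=1):
    print(f"Z'_{t} = {z:.2f} ({(z - 1) * 100:+.0f}%)")
```

Unlike the constant 1.20 obtained under correct pricing, these returns are perfectly predictable from the period alone.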


Empirical Studies of Capital Market Theory

In Sections X and XI, we developed a theory for the capital markets based on essentially

rational behavior and optimal portfolio selection. Specifically, by applying the mean-variance

model and aggregating demands, we deduced the Capital Asset Pricing model, which provided a

specification for equilibrium expected returns among securities. Based on this model, we

deduced a naive or benchmark portfolio strategy. From our analysis of an efficient speculative

market, we deduced a rationale for random selection of securities or the naive strategy as possible

portfolio strategies. Since these models have important implications for both corporate finance

and financial intermediation, it is most important that empirical testing of the models is

performed. Basically, there are three questions to be answered: (i) How does the "random walk"

theory hold up against the data? (ii) Is the security market line specification a reasonable

description of returns on securities? (iii) How does the performance of the naive strategy

compare with managed portfolio strategies?


The answer to (i) is simply that tests of a large number of technical trading strategies (filtering,

serial correlation, charting services, volume analysis, etc.) have produced no evidence to refute

the random walk hypothesis. To the extent that any serial correlation in the returns was present,

it was of such small magnitude and "short-lived" nature that no profitable trading was possible.

Other studies of brokerage house and general service recommendations, dividend announcements

and earnings reports have shown no evidence that they provide trading profits. "Dart throwing" or

more careful random selection of portfolios provides no evidence against the random walk

hypothesis.

In the study of managed portfolio performance, both the random walk hypothesis and the

asset pricing model are implicitly tested.

Returns on the "Market"

NYSE index: value-weighted index of all stocks on the New York Stock Exchange

(≈80% in market value of all securities)

S&P index: Standard & Poor's 500-stock index including the largest companies (in

1965 representing ≈80% of market value of NYSE stocks)

Random Selection of Stocks (Fisher & Lorie): Equally-weighted portfolio of all stocks on

the New York Stock Exchange, 1926-1965.

Average Return (1-year): including dividends, no taxes or commissions

Years        Average Annual Return    Standard Deviation
             (Arithmetic Average)     (Annual)
1926-1945    17.8%                    41.2%
1946-1965    15.1%                    19.8%
1926-1965    16.5%                    32.3%

"Market" (in this sense) was much more volatile in the pre-war versus post-war period.

Average Compound Return: including dividends, no taxes, but including purchase commissions:

Years        Average Compound Return
             (Geometric Average)
1926-1945     6.3%
1946-1965    12.6%
1926-1965     9.3%
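
The compound (geometric) averages are well below the arithmetic averages, most of all in the volatile pre-war period. Part of the gap reflects the purchase commissions included in the compound figures, but volatility alone lowers a geometric average relative to an arithmetic one. The sketch below illustrates this with two hypothetical return sequences that share the same arithmetic average; the sequences are invented for illustration and are not the Fisher-Lorie data.

```python
import math

def arithmetic_average(returns):
    """Simple mean of the per-period returns."""
    return sum(returns) / len(returns)

def geometric_average(returns):
    """Constant per-period return that gives the same ending wealth."""
    growth = math.prod(1.0 + r for r in returns)
    return growth ** (1.0 / len(returns)) - 1.0

# Hypothetical return sequences with the same arithmetic average (10%).
calm = [0.10, 0.10, 0.10, 0.10]
volatile = [0.60, -0.40, 0.60, -0.40]

for name, series in [("calm", calm), ("volatile", volatile)]:
    print(name,
          f"arithmetic = {arithmetic_average(series):.1%}",
          f"geometric = {geometric_average(series):.1%}")
```

The calm sequence compounds at its arithmetic average, while the volatile one compounds at a negative rate despite the same 10% arithmetic average.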

All Stocks on the New York Stock Exchange: Value-Weighted

Cowles (1871-1937): Average Compound Return: 6.6%

Since the Fisher-Lorie average performance of randomly selected portfolios is as good as that of

managed portfolios on average over the same period, this is additional evidence in favor of the

Random Walk.


Simulated Rate of Return Experience for Successful Market Timing*

Monthly Forecasts: P = Probability of Correct Forecast

January 1927 – December 1978

                                Market Timing                                                              NYSE
Per Month                       P=1.0          P=.90          P=.75          P=.60          P=.50          Stocks
Average Rate of Return          2.58%          2.17%          1.56%          0.94%          0.53%          0.85%
Standard Deviation              3.82%          3.98%          4.13%          4.19%          4.18%          5.89%
Highest Return                  38.55%         38.27%         37.61%         36.41%         35.12%         38.55%
Lowest Return                   -0.06%         -17.05%        -22.02%        -24.52%        -25.64%        -29.12%
Average Compound Return         2.51%          2.10%          1.47%          0.85%          0.44%          0.68%
Growth of $1,000                $5,362,212,000 $418,902,144   $9,146,722     $199,718       $15,602        $67,527
Average Annual Compound Return  34.65%         28.32%         19.14%         10.69%         5.41%          8.47%

*Buy the market when the forecast is for stocks to do better than bonds. Buy bonds when the forecast is for bonds to do better than stocks.
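
The last three rows of the table are linked by compounding. As a rough check, the sketch below annualizes the monthly compound returns and compounds $1,000 over the 624 months from January 1927 through December 1978. Because the monthly figures in the table are rounded to two decimals, the recomputed terminal wealth agrees with the table only approximately. The function names are illustrative.

```python
def annualize(monthly_compound_return):
    """Annual compound return implied by a monthly compound return."""
    return (1.0 + monthly_compound_return) ** 12 - 1.0

def terminal_wealth(initial, monthly_compound_return, months):
    """Value of `initial` dollars after compounding for `months` months."""
    return initial * (1.0 + monthly_compound_return) ** months

MONTHS = 52 * 12  # January 1927 through December 1978

# Average monthly compound returns taken from the table above.
monthly = {"P=1.0": 0.0251, "P=.50": 0.0044, "NYSE Stocks": 0.0068}

for label, m in monthly.items():
    print(f"{label:12s} annual = {annualize(m):.2%}  "
          f"$1,000 grows to about ${terminal_wealth(1000, m, MONTHS):,.0f}")
```

Note that forecasting correctly only half the time (P=.50) does worse than simply holding NYSE stocks throughout.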


Average Annual Compound Return on the Market (value-weighted, including reinvesting

dividends, no commissions, or taxes) (Scholes)

                        NYSE                      S&P
                        Average    Avg. Excess    Average    Avg. Excess
Years                   Return     Return         Return     Return
Total:    1953-1972     11.98%      7.57%         11.63%      7.22%
10 Years: 1953-1962     13.11%     10.00%         13.38%     10.25%
          1963-1972     10.86%      5.15%          9.90%      4.19%
5 Years:  1953-1957     12.25%      9.45%         13.50%     10.70%
          1958-1962     13.99%     10.52%         13.26%      9.79%
          1963-1967     14.53%      9.70%         12.34%      7.50%
          1968-1972      7.31%      0.71%          7.52%      0.92%

Jensen Performance of Mutual Funds Study 1945-1964

Testing 115 Funds' Ability to Forecast (relative to Security Market Line):

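Model Specification:

\[ Z_j(t) = R(t) + \beta_j\,[Z_M(t) - R(t)] + \alpha_j(t) \]

Test:

\[ \tilde{Z}_j(t) = R(t) + \beta_j\,[\tilde{Z}_M(t) - R(t)] + \alpha_j(t) + \tilde{\varepsilon}_j(t) \]

\[ E(\tilde{\varepsilon}_j(t)) = 0; \qquad \mathrm{Cov}(\tilde{\varepsilon}_j(t), \tilde{\varepsilon}_i(t-k)) = 0, \quad i, j = 1, \ldots;\ k = 1, 2, \ldots \]

Assumes: \(\beta_j\) is stationary. Suppose not:

\[ \tilde{\beta}_j(t) = \beta_j + \tilde{U}_j(t) \]

If you can forecast the market, then \(\mathrm{Cov}(\tilde{U}_j, \tilde{Z}_M) > 0\), which would imply \(E(\beta_j^{\text{estimated}}) < \beta_j\)

and biases tests in favor of superior performance (i.e., larger \(\alpha_j\)).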
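
As a sketch of how the test specification can be estimated, the code below fits an ordinary least squares regression of a fund's excess return on the market's excess return. It assumes a constant alpha and a stationary beta over the sample, and it uses hypothetical data constructed so that the true values are known. The function name and all numbers are illustrative and are not taken from Jensen's study.

```python
def estimate_alpha_beta(fund_returns, market_returns, riskless_rates):
    """OLS fit of (Z_j - R) = alpha + beta * (Z_M - R) + error."""
    y = [z - r for z, r in zip(fund_returns, riskless_rates)]
    x = [m - r for m, r in zip(market_returns, riskless_rates)]
    n = len(x)
    x_bar = sum(x) / n
    y_bar = sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    return alpha, beta

# Hypothetical annual data: a fund with beta 0.9 that lags by 1% per year.
riskless = [0.03, 0.03, 0.04, 0.04, 0.05]
market = [0.12, -0.08, 0.20, 0.05, 0.15]
fund = [r + 0.9 * (m - r) - 0.01 for m, r in zip(market, riskless)]

alpha, beta = estimate_alpha_beta(fund, market, riskless)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```

Because the hypothetical fund has no noise, the fit recovers the constructed alpha and beta; with real returns the estimates carry sampling error, which is why the significance of measured alphas matters.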


115 Funds Studied. Returns net of all costs including management fees.

76 funds had measured \(\alpha_j < 0\)

39 funds had measured \(\alpha_j \geq 0\)

Average \(\alpha\) = -.011 = -1.1%

The statistical significance of the positive \(\alpha_j\) was no more than would have been expected by

chance when the true \(\alpha_j\) = average \(\alpha\).

Using Returns Gross of Management Fees

55 funds had measured \(\alpha_j < 0\)

60 funds had measured \(\alpha_j \geq 0\)

Average \(\alpha\) = -.004 = -0.4%

The statistical significance of the positive \(\alpha_j\) was no more than would have been expected by

chance when the true \(\alpha_j = 0\).

Conclusions: Funds taken as a whole do not show evidence of superior forecasting capability;

and, of course, do not show evidence of sufficient superior forecasting to cover costs.

What about individual funds? Even if funds as a whole do not show evidence of superior

forecasting, what about the performance of particular funds over time? Is it true that funds with

observed positive \(\alpha_j\) in the past tend to have positive \(\alpha_j\) in the future? Jensen & Black studied

the 115 funds for the years 1955-1964, computing the realized \(\alpha_j\) for each year (a total of 10 ×

115 = 1150 observations). The differential returns were computed gross of management fees.

The results were


Number of Successive Years    Number of Times    Percent of Cases Followed
of Observed Positive "α"      Observed           by Another Positive "α"
1                             574                50.4%
2                             312                52.0%
3                             161                53.4%
4                             79                 55.8%

Conclusion: It appears that funds that did well in the past show little evidence of continuing to

do so.
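
One rough way to see how close these percentages are to a coin flip is to compare each with 50% using a normal approximation. The sketch below assumes that the cases in each row are independent, which is a simplification (the same funds appear in several rows and years), so the z-scores are indicative only. The function name is illustrative.

```python
import math

def z_score_vs_coin_flip(success_fraction, n_cases):
    """Normal-approximation z-score for a proportion against a null of 0.5."""
    standard_error = math.sqrt(0.25 / n_cases)
    return (success_fraction - 0.5) / standard_error

# (successive years of positive alpha, times observed, fraction followed by another)
table = [(1, 574, 0.504), (2, 312, 0.520), (3, 161, 0.534), (4, 79, 0.558)]

z_scores = [z_score_vs_coin_flip(frac, n) for _, n, frac in table]
for (years, n, frac), z in zip(table, z_scores):
    print(f"{years} year(s) of positive alpha: n = {n}, z = {z:.2f}")
```

Under this simplified check, none of the rows differs from 50% by as much as two standard errors.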

Jensen also found that there was no significant evidence of serial correlation in the return

series, which supports the Random Walk Hypothesis.

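With respect to providing efficient (or well-diversified) portfolios, Jensen found that, on average,

85% of the variance of the funds' returns was due to market movements. I.e.,

\[ \sigma_p \approx (1.085)\,\beta_p\,\sigma_M = 1.085\,\rho_{pM}\,\sigma_p \qquad\text{or}\qquad \rho_{pM} \approx \frac{1}{1.085} = .9216, \]

so that \(\rho_{pM}^2 \approx .85\).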

Further, on the whole, funds tended to keep about the same level of βp or σp through time.

Overall Summary

1. Over the last forty years, randomly selected portfolios have had returns greater than or equal

to those of randomly selected managed portfolios.

2. Most mutual funds are reasonably well diversified (i.e., have reasonably low non-

systematic risk).

3. On average, funds did not perform, before expenses, any better than a naive strategy

portfolio with the same beta.


4. On average, funds did worse, after expenses, than the naive strategy portfolio with the

same beta.

5. Few, if any, individual funds showed any consistent performance superior to the naive

strategy over time.

6. Most funds spend too much money trying to forecast returns on stocks, either explicitly

in analyst salaries and support or implicitly through brokerage commissions and spreads

arising from excess turnover.

Investment prescription: Since these results did not include sales commissions on "load" funds,

which run from 1½% to 8½%, clearly one should buy "no load" funds (with no sales commissions).

To achieve an efficient investment strategy, choose a mix of a few well-diversified, no load

funds. Select funds with the lowest costs (management fees and turnover).

(ii) Testing the Capital Asset Pricing Model

(Miller and Scholes; Black-Jensen-Scholes)

The capital asset pricing model specifies that

\[ E(Z_j) = R + \beta_j\,[E(Z_M) - R] \qquad\text{and}\qquad E(Z_M) > R. \]

I.e., investors are risk-averse; the expected excess return on a security is proportional to its beta; it

depends only on beta; and it is linear in beta.

The Black-Jensen-Scholes paper is one of the most sophisticated tests of the capital asset

pricing model. Using monthly returns from 1931-1965 on 600-1100 securities, they found the

following:

1. The expected return on the market is greater than the riskless rate \((\bar{Z}_M > R)\).

2. Expected return on individual securities (portfolios) is an increasing function of its beta

and the excess returns are linear in beta.

3. Expected return depends on beta.


4. The empirical Security Market Line is too "flat." I.e., the returns on "low beta" (β < 1)

stocks were higher than predicted by the Capital Asset Pricing Model and the returns on

"high beta" (β > 1) stocks were lower than predicted by the Capital Asset Pricing Model.

Results 1-3 are consistent with the capital asset pricing model; result 4 is not, and it has

been the cause for much concern as well as new research in this area. To analyze this problem,

BJS constructed a "zero-beta" portfolio by combining stocks only (so it has variance), and this

portfolio had realized returns significantly greater than the riskless rate. I.e., \(\bar{Z}_{0-\beta} > R\), where \(\bar{Z}_{0-\beta}\)

is the expected return on the minimum-variance, zero-beta portfolio constructed from stocks.

The specification that they fit was

\[ \bar{Z}_j - R = \beta_j\,(\bar{Z}_M - R) + \gamma(\beta_j)\,(\bar{Z}_{0-\beta} - R), \]

where \(d\gamma/d\beta < 0\) and \(\gamma(1) = 0\).
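
To illustrate how such a specification flattens the Security Market Line, the sketch below compares the simple Capital Asset Pricing Model with the fitted form under one assumed choice of the function γ. The text states only that γ is decreasing in beta and that γ(1) = 0; the linear form γ(β) = 1 - β used here is an assumption chosen because it satisfies both conditions. The rates of return are hypothetical.

```python
def capm_expected_return(beta, riskless, market):
    """Simple CAPM: R + beta * (Z_M - R)."""
    return riskless + beta * (market - riskless)

def linear_gamma(beta):
    """Assumed form (not given in the text): decreasing in beta, zero at beta = 1."""
    return 1.0 - beta

def two_factor_expected_return(beta, riskless, market, zero_beta, gamma=linear_gamma):
    """R + beta * (Z_M - R) + gamma(beta) * (Z_0 - R)."""
    return (riskless
            + beta * (market - riskless)
            + gamma(beta) * (zero_beta - riskless))

# Hypothetical rates: riskless 4%, market 12%, zero-beta portfolio 7%.
R, Z_M, Z_0 = 0.04, 0.12, 0.07

for beta in (0.5, 1.0, 1.5):
    simple = capm_expected_return(beta, R, Z_M)
    fitted = two_factor_expected_return(beta, R, Z_M, Z_0)
    print(f"beta = {beta:.1f}: CAPM = {simple:.3f}, with zero-beta term = {fitted:.3f}")
```

With the zero-beta return above the riskless rate, low-beta stocks earn more than the simple model predicts and high-beta stocks earn less, which is the "too flat" pattern described in result 4.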

While there are many possible theoretical and empirical explanations for this finding, such

analyses are beyond the level of this course. It is evident that the simple form of the Capital

Asset Pricing Model as a means for estimating expected returns on individual securities is not

sufficient; however, the main results implied by that model (1-3) do seem to describe returns and,

as a good approximation, its specification is not unreasonable.