
Chapter 3: Production and Cost Function Analysis ∗

Steven Berry
Yale Univ.

Ariel Pakes
Harvard Univ.

October 2, 2003

1 Introduction

In the discussion in Chapter 2, we assumed that we had only data on market-level prices and quantities, together with cost and demand shifters. In this case, we were forced to make inferences about costs from the market outcomes together with an equilibrium assumption on the model. We saw how the equilibrium assumption can influence the resulting estimates of cost. Instead of learning about costs via equilibrium assumptions, we might be better off collecting data on the underlying cost and production process. The resulting data could be used together with a model of production or cost-minimization to estimate the production parameters without any reference to overall market equilibrium, or else the production and market equilibrium assumptions could be combined in one analysis.

The careful treatment of cost and production functions brings up a range of classic economic and policy questions, including the issue of returns to scale (which is bound up with many questions of public regulation and/or ownership of firms) and questions about firm-level productivity and how productivity is affected by policy.

There are two related literatures. One focuses on estimation of the production function and is greatly concerned with simultaneity and with correct estimates of productivity. A second literature does not focus on productivity and instead focuses on the shape of the cost functions; this latter literature is often concerned with economies of scale.

∗ This is very much a work-in-progress. Corrections welcome. Copyright by Steven Berry and Ariel Pakes.

1.1 Cost Functions, Economies of Scale and Functional Form.

As an alternative to panel data methods based on production function estimation, one can use duality theory to derive the cost function as well as the conditional and unconditional input demands. You should re-derive the following Cobb-Douglas equations as a review. For the Cobb-Douglas function $Q = L^{\alpha} K^{\beta} M^{\delta}$, with $\delta \ln(M) = \mu + \varepsilon$, we have the cost function:

$$\ln(C) = \text{constant} + \frac{\alpha}{\alpha+\beta}\ln(w) + \frac{\beta}{\alpha+\beta}\ln(r) + \frac{1}{\alpha+\beta}\ln(Q) - \frac{\delta}{\alpha+\beta}\,\varepsilon \tag{1}$$

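Conditional input demands:

$$\ln(L) = \text{constant} - \frac{\beta}{\alpha+\beta}\ln(w) + \frac{\beta}{\alpha+\beta}\ln(r) + \frac{1}{\alpha+\beta}\ln(Q) - \frac{\delta}{\alpha+\beta}\,\varepsilon \tag{2}$$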

(and similarly for the other input(s)). Unconditional input demands (assuming p = mc):

$$\ln(L) = \text{constant} - \frac{1-\beta}{\delta}\ln(w) - \frac{\beta}{\delta}\ln(r) + \frac{1}{\delta}\ln(p) + \varepsilon. \tag{3}$$

From (3) we have immediate confirmation of our intuition of the last section that increases in productivity ε directly cause an increase in the profit-maximizing level of labor.
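The input-price and output coefficients of the cost function (1) can be checked numerically. The sketch below is an illustration only: it lumps the management term into a single hypothetical productivity level `A` (standing in for $M^{\delta}$), uses arbitrary parameter values, solves the cost-minimization problem in closed form, and compares finite-difference elasticities of cost with the coefficients in (1).

```python
import numpy as np

# Hypothetical parameter values, chosen only for illustration.
ALPHA, BETA, A = 0.6, 0.3, 1.5   # A stands in for M**delta

def conditional_demands(Q, w, r):
    """Cost-minimizing (L, K) for the technology Q = A * L**ALPHA * K**BETA."""
    s = ALPHA + BETA
    L = (Q / A) ** (1 / s) * (ALPHA * r / (BETA * w)) ** (BETA / s)
    K = (BETA * w / (ALPHA * r)) * L   # first-order condition: w/r = (ALPHA/BETA) * K/L
    return L, K

def cost(Q, w, r):
    L, K = conditional_demands(Q, w, r)
    return w * L + r * K

def log_elasticity(f, x, h=1e-5):
    """Central finite difference of ln f with respect to ln x."""
    return (np.log(f(x * np.exp(h))) - np.log(f(x * np.exp(-h)))) / (2 * h)

Q0, w0, r0 = 10.0, 2.0, 1.0
e_w = log_elasticity(lambda w: cost(Q0, w, r0), w0)   # compare with alpha/(alpha+beta)
e_r = log_elasticity(lambda r: cost(Q0, w0, r), r0)   # compare with beta/(alpha+beta)
e_Q = log_elasticity(lambda Q: cost(Q, w0, r0), Q0)   # compare with 1/(alpha+beta)
print(e_w, e_r, e_Q)
```

The output elasticity of cost, 1/(α + β), is the quantity that the returns-to-scale discussion turns on: it is below one exactly when α + β exceeds one.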

One could estimate the cost function on a cross-section of firms using some instruments for Q, such as demand shifters (when would these be good instruments?). However, cost functions are often (if not usually) estimated without reference to the problem of simultaneity, a methodological choice that Griliches and Mairesse (1995) attribute to “denial”. In fact, the problem of simultaneity is exactly the same as in the estimation of production functions.

One exception to this is provided by Nerlove (1963) (also Nerlove (1965)), who argued that Q can be treated as exogenous in regulated industries if (i) price is set by regulators who do not observe ε and (ii) firms have to serve all demand at the regulated price. Assumption (ii) is (or at least was) the standard regulatory rule, while assumption (i) may not be unreasonable if regulators have the same data that we do. In a time series, though, the unobserved productivity term is probably correlated over time and one wonders why the regulators never learn anything about it.

The measurement error problems, especially in input prices, that were previously discussed seem if anything worse here than in production function estimation. In a cross-section of perfectly competitive firms within one market, there is no reason to observe any variation in the right-hand side input prices. In a panel data set of firms within markets, there is a question of whether the variation in input prices is exogenous, although this case seems more reasonable as input prices (as opposed to output prices) might be determined in a much larger market; e.g. the price of energy as an input is probably not much influenced by what goes on in some particular small manufacturing industry.

1.2 The use of input demands

We have modelled the production function as having only one “error”, ε. Consequently, the various input demand equations are redundant, given the cost function, in the sense that the error term in each is identical. In empirical work, this is usually ignored, with extra error terms being “tacked on” to each equation. McElroy (1987) shows one way to introduce input-specific shocks into the production function and then derive input demands with separate error terms. She thereby provides a theoretical rationale for, and an interpretation of, common practice. McElroy models the use of inputs, x, as

$$x^{*} = x - \varepsilon$$

where x is the physical amount of the input, $x^{*}$ is the “effective” amount of the input and ε is an input-specific productivity shock. It is easy to show that such a linear input error also shows up in the input demands as a linear error. However, note that the Cobb-Douglas input demand is usually written in terms of log input demand, and McElroy’s error terms aren’t linear in logs.


1.3 The Functional Form of Cost and Testing for Natural Monopoly

The Cobb-Douglas production function is obviously a very restricted functional form. The Cobb-Douglas implies unit-elastic substitution between inputs, which makes the functional form useless for discussing input substitution. Also, most real-world firms are multi-product firms. This has led to suggestions of multiproduct cost functions that involve various less restrictive functional forms. Typically, these functional forms are polynomials in some functions of the outputs and inputs. Such polynomials can often be interpreted as approximations to more general functional forms.

In the “old days” there was a strong preference for functional forms that are linear in parameters, so that standard linear estimation techniques could be used. One example is the “translog” functional form. In the case of translog cost functions, ln(C) is a quadratic function of log input prices and outputs (plus a linear error term), whereas the Cobb-Douglas is linear in the same terms. The translog is consistent with a second-order Taylor-series approximation, in logs, to some more general functional forms.
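As a concrete illustration, a translog cost function with input prices $w_i$ and outputs $Q_k$ can be written as follows. The coefficient names ($a$, $b$, $c$) and the error term $u$ are notation introduced here for the sketch and are not taken from any particular paper.

```latex
\ln C = a_0 + \sum_i a_i \ln w_i + \sum_k b_k \ln Q_k
      + \tfrac{1}{2}\sum_i\sum_j a_{ij}\,\ln w_i \ln w_j
      + \tfrac{1}{2}\sum_k\sum_m b_{km}\,\ln Q_k \ln Q_m
      + \sum_i\sum_k c_{ik}\,\ln w_i \ln Q_k + u ,
\qquad a_{ij} = a_{ji},\quad b_{km} = b_{mk}.
```

Setting every second-order coefficient to zero gives back the Cobb-Douglas form. The function is linear in its parameters, so it can be estimated by linear methods, and homogeneity of degree one in input prices corresponds to the linear restrictions $\sum_i a_i = 1$, $\sum_j a_{ij} = 0$ for each $i$, and $\sum_i c_{ik} = 0$ for each $k$.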

A number of other “flexible” functional forms are found in Fuss, McFadden, and Mundlak (1978). Also, that same source discusses how to test cost functions for the various restrictions implied by cost minimization (such as homogeneity of degree one in input prices). Such tests are interesting, although one, of course, never knows if it is cost minimization or the specific functional form approximation that is being rejected.

An important economic question, often posed in the empirical literature, is whether the cost function implies a natural monopoly cost structure for the industry. Notions of “natural monopoly” are discussed in Baumol and Willig (1982). The definition of a natural monopoly is that, for a given industry output Q, the cheapest method of production involves production by only one firm. More formally, a cost function C implies a strict natural monopoly at Q if and only if the cost function is subadditive, that is,

$$C(Q) < \sum_i C(q_i)$$

for all $q_1, \ldots, q_n$ such that $\sum_i q_i = Q$. The definition of subadditivity is extended to multi-product firms simply by treating Q as a vector of outputs of various products, e.g. $Q = (Q_1, Q_2)$. Note that the definition of subadditivity requires us to specify a level of industry output; a cost function that implies a natural monopoly at one Q may not imply it elsewhere. Also, the Baumol and Willig (1982) treatment does not, unfortunately, consider heterogeneous firms, although it is easy to consider modifications of the definition of natural monopoly that would consider firms of differing productivity levels.

For single product firms, decreasing average costs implies subadditivity, but subadditivity (contrary to the treatment in some undergraduate texts) does not require decreasing average cost. For an example, consider the classic U-shaped average cost curve at a Q just slightly past the minimum point of the curve. For multiproduct firms, it is not even obvious what the definition of “decreasing cost” is. One concept provided by Baumol and Willig (1982) is Decreasing Ray Average Cost, which is decreasing cost along the ray that connects the origin to a given output vector Q. This captures the notion that costs decrease as all outputs increase proportionately, but says nothing about why two goods should be produced within one firm. With two goods, and a cost function $C(Q_1, Q_2)$, economies of scope are defined as

$$C(Q_1, Q_2) < C(Q_1, 0) + C(0, Q_2), \tag{4}$$

that is, joint production is cheaper than stand-alone production. One reason for this might be shared fixed costs. A local notion of economies of joint production, on the other hand, is cost complementarities, that the marginal cost with respect to good one is declining with respect to good two. These and other concepts of Baumol and Willig (1982) provide a rich vocabulary for describing the features of empirical cost functions that can lead to natural monopoly.

The Baumol and Willig (1982) book discusses some combinations of these definitions, and others, that are sufficient for natural monopoly. It is typically difficult to relate these sufficient conditions to simple tests on coefficients. However, given an empirically estimated cost function, it is also possible to check the definition of subadditivity by “brute force” at different levels of industry output Q. A concern with any empirical test of subadditivity is that errors in the functional form approximation can lead to misleading results. This worry accounts for the interest in “flexible functional forms”, but the functional forms in use all have their own inflexibilities. For example, the translog is defined in logs and so cannot be evaluated at zero output levels and so cannot be used to detect economies of scope. (Of course, realistically, if zero output levels are never observed in the data, then no estimated functional form may do well in predicting costs at zero outputs.) In general, polynomial approximations can do very poorly at points far away from the original point of approximation, but they are still likely to provide richer results than a simple Cobb-Douglas in cases where the data allow us to estimate additional parameters.
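A brute-force subadditivity check is easy to sketch for a single-product cost function. The example below is only an illustration under stated assumptions: the cost function (a fixed cost `F` plus quadratic variable cost) is hypothetical, and the check considers splits across two firms only, whereas the definition covers any number of firms. It also reproduces the point that subadditivity can hold where average cost is already rising.

```python
import numpy as np

F = 1.0  # hypothetical fixed cost, paid only when output is positive

def cost(q):
    """Illustrative cost function: C(0) = 0 and C(q) = F + q**2 for q > 0."""
    q = np.asarray(q, dtype=float)
    return np.where(q > 0, F + q ** 2, 0.0)

def subadditive_two_firms(Q, n_grid=1001):
    """True if one firm producing Q is cheaper than every two-firm split on the grid."""
    q1 = np.linspace(0.0, Q, n_grid)[1:-1]        # interior splits: both firms active
    cheapest_split = np.min(cost(q1) + cost(Q - q1))
    return bool(cost(Q) < cheapest_split)

def average_cost_rising(Q, h=1e-6):
    """True if average cost is locally increasing at Q."""
    ac = lambda q: float(cost(q)) / q
    return ac(Q + h) > ac(Q)

# Average cost 1/q + q is minimized at q = 1 for this cost function.
for Q in (0.5, 1.2, 2.0):
    print(Q, subadditive_two_firms(Q), average_cost_rising(Q))
```

At Q = 1.2 the check reports subadditivity even though average cost is rising, while at Q = 2.0 an even split across two firms is cheaper. With an estimated multiproduct cost function the same idea applies, with the grid running over output vectors.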

1.4 Application to the Deregulation of Telecommunication

Here, one would discuss the actual empirical findings of Olley-Pakes for the deregulation of telecommunication manufacturing and the empirical findings of Evans and Heckman (1983) for economies of scope in Long Distance and Local Phone Service. The papers are a nice contrast because they focus on opposite methodological issues in a way that reflects a general split between a set of papers that look at natural monopoly issues versus a set of papers that focus on productivity analysis. The first set of papers considers an important economic question and makes use of richer functional forms that explicitly account for the multiproduct nature of firms. The second set of papers also considers important economic questions while providing a richer set of econometric tools that are appropriate for market generated data. The only point here is that it would be nice to combine the concerns of the two literatures.

Evans and Heckman look at data from the Bell telephone system in the United States from 1947-1977. In most regions of the U.S. at that time, the Bell system was the regulated monopoly provider of both local and long distance telephone service. They observe firm level input prices for capital and labor, as well as outputs of long-distance and local telephone service. They estimate a translog cost function, with a time trend as an additional variable that proxies for “technology”. The estimation procedure uses the cost function plus the cost shares as implied by the translog model (see the exercises below), with additive, normally distributed errors “tacked on” to each equation, with no discussion of the economic interpretation of those errors. The strong parametric assumption on the errors allows for estimation by maximum likelihood. In the following table of parameter estimates, t is the time trend, while ln(QLocal) is the output of local phone service and so forth.

It would be nice if these parameter estimates could be directly tested to see if the cost function was subadditive, but there is no such test. However, one can test for local cost complementarities, which is the notion, for example, that the marginal cost of providing long-distance service declines in the quantity of local service provided. Unfortunately for the old Bell System, the estimated coefficient on the local/long distance interaction is positive. To check for subadditivity, Evans and Heckman apply the definition of subadditivity in a brute force way to the cost function, checking to see if observed output levels could be reallocated across two firms with this same cost function, in a way that could reduce total cost. They reject subadditivity, finding that output reallocations to two firms would typically reduce industry costs. Regressions of this sort were influential in the eventual break-up of the Bell System.
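For a translog specification, the link between the interaction coefficient and cost complementarity can be made explicit. Write $e_k \equiv \partial \ln C / \partial \ln Q_k$ for the cost elasticity of output $k$ and $\gamma_{12}$ for the coefficient on the single cross term $\ln(Q_1)\ln(Q_2)$. This notation is introduced here for the sketch, and the exact normalization used by Evans and Heckman is not reproduced. Differentiating gives:

```latex
\frac{\partial C}{\partial Q_1} = \frac{C}{Q_1}\,e_1 ,
\qquad
\frac{\partial^2 C}{\partial Q_1\,\partial Q_2}
  = \frac{C}{Q_1 Q_2}\,\bigl(\gamma_{12} + e_1 e_2\bigr).
```

Cost complementarity requires the cross-partial to be negative. With positive cost elasticities, a positive $\gamma_{12}$ therefore rules it out at that point, while a negative $\gamma_{12}$ is necessary but not sufficient.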

Evans and Heckman’s Bell System Translog Cost Function Estimates
MLE (normal “errors”), 1947-1977

Parameter                 Estimate     SE
Constant                     9.06     0.20
ln(r)                        0.54     0.01
ln(w)                        0.35     0.01
ln(QLocal)                   0.29     0.26
ln(QLong)                    0.42     0.20
t                           −0.16     0.07
ln(r)^2                      0.20     0.02
ln(w)^2                      0.18     0.03
ln(r)ln(w)                  −0.16     0.02
ln(QLong)^2                 −5.28     1.70
ln(QLocal)^2                −2.64     1.13
ln(QLong)ln(QLocal)          7.76     2.70
t^2                          0.41     0.80
ln(r)ln(QLong)               0.35     0.10
ln(r)ln(QLocal)             −0.35     0.09
ln(w)ln(QLong)              −0.22     0.09
ln(w)ln(QLocal)              0.21     0.08
ln(r)t                       0.11     0.04
ln(w)t                      −0.11     0.03
ln(QLong)t                  −0.97     1.20
ln(QLocal)t                  0.36     1.20


1.5 Production Functions and Productivity

Let us begin with the firm’s problem and the question of estimating the underlying parameters that determine the “supply curve.” At the firm level, this means that we will explicitly consider a production and/or cost function. For an excellent review of this literature and the problems associated with it, see Griliches and Mairesse (1995).¹

The classic Cobb-Douglas production function (Cobb and Douglas 1928) provides a familiar starting point:

$$Q = L^{\alpha} K^{\beta} M^{\delta}, \tag{5}$$

where L is labor input, K is capital input and M is the exogenously given “management talent” of the firm. Assume for now that we do not observe M, which is observed only by the firm. Perhaps we can assume that

$$\delta \ln(M_j) = \mu + \varepsilon_j,$$

with ε a random variable whose distribution we will have to discuss at some point. Note we are asserting for now that the unobserved variation in output is due to something that the firm observes, but we do not. This seems to be the reasonable first assumption for most industries, although we will consider other possibilities.

In a perfectly competitive market, the observed exogenous data is, say, Z = (w, r, p, x), where w is the wage, r the “price” of capital, p the market output price, and x the demand shifters. The endogenous variables, some or all of which may be observed, are then Q, L, K, total costs C, and profits, π. The parameters of the models are then θ = (α, β, δ, µ).

Remember that in practice it is hard to measure each of these “observables”. Outputs and inputs are not homogeneous. For example, labor varies in quality, while measures of capital can hardly help but be artificial aggregates of past investments. In many cases, output is “measured” as total revenue of the firm divided by a price index, and labor input may be measured in the same way. Differences in the output product mix or in input quality can cause serious problems here. In general, measurement error in the right-hand side inputs introduces yet another difficult econometric problem. Similarly, measured input and output prices may reflect differences in quality more than exogenous variation in cost, e.g. a “high wage” plant may simply be employing better qualified workers. In this case, it is not reasonable to treat observed input prices as simple exogenous shifters of production cost.

¹ Our outline of the history of estimating cost and production functions owes much to this same review.

Putting measurement error aside for now, it is tempting to estimate the production function from data on a cross-section of firms, using OLS to estimate

$$q_j = \mu + \alpha l_j + \beta k_j + \varepsilon_j,$$

where

$$q_j \equiv \ln(Q_j), \tag{6}$$

$$l_j \equiv \ln(L_j), \text{ and} \tag{7}$$

$$k_j \equiv \ln(K_j). \tag{8}$$

However, there is the obvious endogeneity problem that ε helps to determine l and k. That is, more productive firms will use different quantities of labor and capital than will unproductive firms. The problem of endogenous inputs in the production function was emphasized by Marschak and Andrews (1944). Griliches and Mairesse (1995) highlight the following relevant quote from Marschak and Andrews: “... the manpower and capital used by each firm is determined by the firm, not by the economist. This determination is expressed by a system of functional relationships; the production function, in which the economist happens to be interested, is but one of them.”
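A small simulation makes the direction of the problem concrete. This is a sketch under made-up assumptions: the input rules, parameter values and variances below are hypothetical and chosen only so that more productive firms use more of both inputs.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
alpha, beta, mu = 0.6, 0.3, 1.0          # hypothetical true parameters

eps = rng.normal(0.0, 0.5, n)            # productivity, seen by the firm but not by us
# Hypothetical input rules: inputs respond positively to productivity.
l = 1.0 * eps + rng.normal(0.0, 1.0, n)
k = 0.5 * eps + rng.normal(0.0, 1.0, n)
q = mu + alpha * l + beta * k + eps      # log production function

# OLS of q on a constant, l and k.
X = np.column_stack([np.ones(n), l, k])
mu_hat, alpha_hat, beta_hat = np.linalg.lstsq(X, q, rcond=None)[0]
print(alpha_hat, beta_hat)
```

Under these assumptions both coefficients are overstated, because part of the effect of ε on output is attributed to the inputs that respond to it.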

One possible solution is to use instruments for l and k. Input prices and demand shifters are obvious instruments, but often they vary little in the cross-section and may not be well-measured. There is also our question about the endogeneity of the prices themselves; input prices may be positively correlated with the unobserved “quality” of the inputs.

As an alternative to instrumental variables, panel data techniques are often employed to help solve the simultaneity problem. Following Mundlak and Hoch (Mundlak 1961, Hoch 1962 and Mundlak 1963), suppose that we observe a set of firms over time and assume that the unobserved component of firm productivity can be decomposed into a mean that is fixed over time (a “fixed effect”) and an i.i.d. random error about that mean:

$$\ln(M_{jt}) = \mu_j + \varepsilon_{jt}$$

A critical assumption is then that the firm observes $\varepsilon_{jt}$ only after choosing L and K (or else that $\varepsilon_{jt}$ is pure measurement error in ln(Q)). In agricultural markets, $\varepsilon_{jt}$ might be thought of as weather. It has been tempting for researchers in agricultural markets to assume that $\mu_j$ does not vary at all across firms and to associate the entire error with unforecastable events (Zellner, Kmenta, and Dreze 1966). While this seems a bit strained in agricultural markets (where unobserved land quality may play the role of $\mu_j$), it seems quite untenable in the manufacturing or retail sectors.

Given the assumptions of the last paragraph, inputs are correlated with $\mu_j$ but not with $\varepsilon_{jt}$. We can then eliminate the fixed effect $\mu_j$ from the equation

$$q_{jt} = \delta\mu_j + \alpha l_{jt} + \beta k_{jt} + \delta\varepsilon_{jt} \tag{9}$$

by subtracting the mean level (over time and within firms) from each variable, to obtain:

$$q_{j\cdot} = \alpha l_{j\cdot} + \beta k_{j\cdot} + \delta\varepsilon_{j\cdot},$$

where the subscript, ·, indicates a variable that has been differenced from its mean. This last equation can then be estimated by OLS.² The method makes use only of changes in inputs and output over time in the same firm. The “between-firm” portion of the data has been differenced out, leaving us to use only the “within-firm” variance in inputs and outputs. Note that the ε’s from each time period show up in the error of the transformed equation, so it is in fact necessary that $\varepsilon_{jt}$ have no effect on output in periods after t. Chamberlain (1982) has a comprehensive discussion of related models and specification tests.
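The within transformation is short to write down in code. The sketch below simulates a hypothetical panel in which inputs load on the firm effect but not on the transitory error (with δ normalized to one and all numbers invented for illustration), then compares pooled OLS with OLS on firm-demeaned data.

```python
import numpy as np

rng = np.random.default_rng(1)
N, T = 200, 8
alpha, beta = 0.6, 0.3                        # hypothetical true parameters

mu = rng.normal(0.0, 1.0, (N, 1))             # firm fixed effect
l = mu + rng.normal(0.0, 1.0, (N, T))         # inputs are correlated with mu ...
k = 0.5 * mu + rng.normal(0.0, 1.0, (N, T))
eps = rng.normal(0.0, 0.2, (N, T))            # ... but not with the transitory error
q = mu + alpha * l + beta * k + eps

def ols(y, *xs):
    """OLS coefficients of y on the listed regressors (arrays of shape N x T)."""
    X = np.column_stack([x.ravel() for x in xs])
    return np.linalg.lstsq(X, y.ravel(), rcond=None)[0]

def demean(x):
    """Subtract each firm's time average (the within transformation)."""
    return x - x.mean(axis=1, keepdims=True)

pooled = ols(q, np.ones((N, T)), l, k)[1:]    # drop the intercept
within = ols(demean(q), demean(l), demean(k))
print(pooled, within)
```

In this design the pooled estimates are biased upward by the firm effect, while the within estimates are close to the assumed values because the simulated inputs are measured without error.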

A well-known problem is that such deviations over time may be measured very badly, so that the fixed-effects estimator can easily do worse than the simple OLS estimator even in the presence of endogeneity. This is particularly true if average differences across firms are well measured, but the within-firm fluctuations are badly measured. The fixed effect model throws out the across-firm variation and so can throw out the useful part of the data. A frequent finding in applied work is that the estimates of α and β decline greatly after the fixed effects transformation is applied, so that there appear to be large decreasing returns to scale. Most researchers seem to believe that this represents a bias toward zero that is caused by mismeasurement, rather than simply a correction of simultaneity. Note, though, that the direction of change is consistent with a correction for simultaneity.

² With a long time series, the $\mu_j$ could themselves be estimated by OLS. However, it may be less plausible that productivity levels remain constant over a long time period.


1.6 Using Investment to Help with Simultaneity

Olley and Pakes (1996) suggest the use of additional information on the unobserved productivity of the firm. In their study of the telecommunications industry, they consider models in which investment is increasing in the persistent portion of productivity. They show conditions under which observed investment decisions can help us to control for systematic productivity differences and therefore help solve the simultaneity problem.

Drawing from formal models of industry dynamics (discussed in a later part of the course), they assume that investment, I, is a function of the firm’s persistent productivity shock, ω, and of (log) capital, k:

$$I_{jt} = \iota_t(\omega_{jt}, k_{jt}). \tag{10}$$

Olley and Pakes combine this investment equation with the simplest Cobb-Douglas production function, although as usual the exact functional form is not important. Consider a form for our management talent term of

$$\delta \ln(M_{jt}) = M + \omega_{jt} + \eta_{jt} \tag{11}$$

where the distinction between ω and η is now just that ω is known to the firm and possibly persistent, while η is unknown prior to the input choice and is purely transitory. Because ω is known to the firm, it is likely to be correlated with inputs and this introduces the simultaneity problem. The full production function is then (ignoring constants):

$$q_{jt} = \alpha l_{jt} + \beta k_{jt} + \omega_{jt} + \eta_{jt}. \tag{12}$$

If input choices are correlated with the innovation in ω, then differencing the data won’t solve the endogeneity problem.

To compensate for this, Olley and Pakes use observed investment to “solve out” for the unobserved ω. If the investment function ι is strictly increasing in ω, then it is invertible and

$$\omega_{jt} = \iota_t^{-1}(I_{jt}, k_{jt}) \equiv g_t(I_{jt}, k_{jt}). \tag{13}$$

The t subscript on ι allows it to be shifted by macro variables (and by changes in competitive conditions over time; this latter change becomes particularly important in an oligopoly model).


If we plug this back into the production function, we obtain

$$q_{jt} = \alpha l_{jt} + \phi(I_{jt}, k_{jt}) + \eta_{jt}, \tag{14}$$

where $\phi(I, k) = \beta k + g(I, k)$.

Olley and Pakes approximate φ by various non-parametric methods (the simplest of which is to approximate φ with a high-order polynomial). Note that β can no longer be estimated from the equation, because k enters φ both through the direct effect of capital on production and through the effect of capital on investment as captured in g. However, (14) can be used to estimate α.

Once α is known, we can think of estimating the equation

$$q_{jt} - \alpha l_{jt} = \beta k_{jt} + \omega_{jt} + \eta_{jt}. \tag{15}$$

But once again capital is likely correlated with current $\omega_{jt}$. Fortunately, it is possible to use lagged values of ω together with previously estimated parameters to control for this. Olley and Pakes begin by taking the expected value of $q_{jt} - \alpha l_{jt}$, conditional on $k_{jt}$ and $\omega_{t-1}$. Note that $\omega_{t-1}$ is not directly observed, but we can difference lagged (12) and (14) to obtain

$$\omega_{t-1} = \phi(I_{jt-1}, k_{jt-1}) - \beta k_{jt-1} \tag{16}$$

Thus, $\omega_{t-1}$ is observed up to the single unknown parameter β.

The desired expected value is

$$E[q_t - \alpha l_t \mid k_t, \omega_{t-1}] = \beta k_t + E[\omega_t \mid \omega_{t-1}, k_t]. \tag{17}$$

We don’t know the expectation on the right-hand side. However, let us assume that $\omega_t$ follows an exogenous first-order Markov process so that

$$E[\omega_t \mid \omega_{t-1}, k_t] = E[\omega_t \mid \omega_{t-1}].$$

Then, the right-hand side of (17) is just $\beta k_t$ plus an unknown function of $\omega_{t-1}$. If we let $E[\omega_t \mid \omega_{t-1}] \equiv g(\omega_{t-1})$, then

$$E[q_t - \alpha l_t \mid k_t, \omega_{t-1}] = \beta k_t + E[\omega_t \mid \omega_{t-1}] \tag{18}$$

$$= \beta k_t + g(\omega_{t-1}) \tag{19}$$

$$= \beta k_t + g\bigl(\phi(I_{jt-1}, k_{jt-1}) - \beta k_{jt-1}\bigr) \tag{20}$$


We can once again approximate g by semi-parametric methods, for example by replacing g with a high-order polynomial. There is now one parameter, β, to estimate, as the function φ is known from the first stage and, from the conditioning argument, $k_t$ is available as an instrument. (In fact, extending the conditioning argument shows that all observed variables prior to t are also instruments, so that the equation is over-identified.)
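The two steps can be sketched on simulated data. Everything in the data-generating process below is a hypothetical choice made for illustration (a linear investment rule, an AR(1) productivity process, independent noise in labor and capital, no exit), and the low-order polynomials and grid search are stand-ins for the non-parametric methods in the text. It is not the authors' implementation and it ignores selection.

```python
import numpy as np

rng = np.random.default_rng(2)
N, T = 500, 10
alpha, beta, rho = 0.6, 0.3, 0.7            # hypothetical true values

# --- simulate a toy panel (all behavioral rules here are assumptions) ---
omega = np.zeros((N, T))
k = np.zeros((N, T))
omega[:, 0] = rng.normal(0.0, 0.3 / np.sqrt(1 - rho**2), N)
k[:, 0] = rng.normal(0.0, 1.0, N)
for t in range(1, T):
    inv_lag = omega[:, t-1] + 0.3 * k[:, t-1]                  # investment, increasing in omega
    k[:, t] = 0.7 * k[:, t-1] + 0.3 * inv_lag + rng.normal(0.0, 0.5, N)
    omega[:, t] = rho * omega[:, t-1] + rng.normal(0.0, 0.3, N)  # first-order Markov
inv = omega + 0.3 * k
l = 0.5 * omega + 0.3 * k + rng.normal(0.0, 0.5, (N, T))       # labor responds to omega
q = alpha * l + beta * k + omega + rng.normal(0.0, 0.1, (N, T))

def fit(X, y):
    """Least-squares coefficients and residuals."""
    coef = np.linalg.lstsq(X, y, rcond=None)[0]
    return coef, y - X @ coef

# --- step 1: regress q on l and a polynomial in (I, k); recover alpha and phi ---
i_, k_ = inv.ravel(), k.ravel()
poly = np.column_stack([np.ones(N * T), i_, k_, i_**2, i_ * k_, k_**2])
coef1, _ = fit(np.column_stack([l.ravel(), poly]), q.ravel())
alpha_hat = coef1[0]
phi_hat = (poly @ coef1[1:]).reshape(N, T)

# --- step 2: search over beta, approximating g by a cubic in lagged omega ---
y = (q - alpha_hat * l)[:, 1:].ravel()
k_now = k[:, 1:].ravel()
k_lag = k[:, :-1].ravel()
phi_lag = phi_hat[:, :-1].ravel()

def ssr(b):
    w = phi_lag - b * k_lag                  # lagged omega implied by candidate beta
    G = np.column_stack([np.ones_like(w), w, w**2, w**3])
    _, resid = fit(G, y - b * k_now)
    return resid @ resid

grid = np.linspace(0.0, 1.0, 101)
beta_hat = grid[np.argmin([ssr(b) for b in grid])]
print(alpha_hat, beta_hat)
```

The independent noise in labor is what lets the first step separate α from φ in this toy design, and the noise in capital accumulation keeps current capital from being a function of lagged ω alone.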

1.6.1 Selection

Olley and Pakes are also concerned with the effect of sample selection. A large fraction of plants exit their sample over the period of observation, while other plants enter. It is traditional to construct a “balanced panel” of plants that are in the sample for the entire period, but this ignores the economic process that generates entry and exit. The firms that exit, for example, are likely to be those firms with a low value of $\omega_{jt}$. However, firms with a lot of capital are likely to stay in the market even in the presence of bad ω draws. This suggests that the use of a balanced sample will bias the coefficient on capital down toward zero.

One solution suggested by Olley and Pakes is just to keep the firms in the sample for as long as they operate. In fact, this simple change to an unbalanced panel seems to solve most of the selection problem. They also provide more sophisticated econometric techniques that correct for the remainder of the selectivity bias. In particular, they have to consider the expected value of ω conditional on survival when they construct their econometric estimations.

Again relying on a dynamic industry model, OP assume that firms exit when ω falls below a cut-off point. The relevant cut-off for ω varies with the level of capital. In particular, the exit rule is

$$\omega_{jt} < \underline{\omega}(k_{jt}), \tag{21}$$

where $\underline{\omega}(k_{jt})$ is the cut-off, which is decreasing in k. To consider the econometric consequences, let us ignore the simultaneity problem for a moment, assuming that k is uncorrelated with the random draws on ω. However, k will still be correlated with the ω’s of surviving firms, because $\underline{\omega}$ declines in k. The second step of OP then depends on:

$$E[y_{jt} \mid k_{jt}, \omega_{jt-1}] = \beta k_{jt} + E[\omega_{jt} \mid k_{jt}, \omega_{jt-1}]. \tag{22}$$

For surviving firms, the expectation on the right-hand side is just

$$\frac{\int_{\underline{\omega}(k_t)} \omega_t \Pr(d\omega_t \mid \omega_{t-1})}{\Pr(\omega_t > \underline{\omega}(k_t))}. \tag{23}$$

(Recall that ω follows an exogenous Markov process so that conditioning on $\omega_{t-1}$ removes any dependence of $\omega_t$ on $k_t$.) The integral in (23) is an unknown function of $\underline{\omega}(k_t)$ and $\omega_{t-1}$. We have seen that $\omega_{t-1}$ is known up to the parameter β, but we haven’t seen how to deal with $\underline{\omega}(k_t)$.

OP borrow a technique from the semi-parametric sample selection literature [cites]. In particular, they use the data on exit and survival of firms to obtain a non-parametric estimate of the probability of survival, which is $\Pr(\omega_t > \underline{\omega})$. By assumption of the exit model, this is monotone in $\underline{\omega}$. Therefore, including the exit probability as a control is the same as including $\underline{\omega}$ directly. Similar to the pure simultaneity case, then, OP include polynomials (or alternatively other flexible functions like kernels) in both $\omega_{t-1}$ and the survival probabilities. This corrects both for simultaneity and for selection.

We can summarize the “extra information” used by OP as follows. Investment and capital together control for the persistent shock in productivity and allow estimation of the labor coefficient. Lagged productivity then allows for estimation of the capital coefficient in the presence of simultaneity. To also correct for selection, one must introduce estimates of the exit probabilities.

1.6.2 Olley-Pakes Results on Coefficients.

The following table traces out their capital and labor coefficients across some different specifications. The coefficient on l is in the second column, while the coefficient on k is in the third. Note that the coefficient on k tends to decline dramatically in the fixed effects specifications, which might reflect measurement error in k. However, this coefficient jumps when the full sample is used, perhaps because the balanced panel throws out many of the low output/low capital firms. The full method has the lowest coefficient on l, but still has a substantial coefficient on k. One hopes that this reflects that the simultaneity bias, which may be particularly large for l, has been corrected without biasing the coefficient on k.


Some Parameter Estimates from Olley-Pakes

Method                            α       β
OLS, balanced panel             .851    .173
Fixed Effects, balanced panel   .728    .067
OLS, full sample                .693    .304
Within, full sample             .629    .150
Full method, full sample        .608    .342

1.7 Productivity in Telecomm

Olley and Pakes consider the telecommunications equipment manufacturing side of the Bell system. As Bell bought telecom equipment largely from itself, it was a de facto near monopolist on the manufacturing side as well as on the phone service side. After deregulation of the system, entry was allowed into telecom manufacturing and the newly created “Baby Bells” bought equipment from a variety of providers; consumers were also free to choose suppliers of telephones, answering machines and so forth. Olley and Pakes find that productivity gains from the deregulation of telecommunications equipment manufacturing came largely from entry and exit: less efficient plants were closed down and more efficient plants were opened. This gain offset a static loss that came from a less efficient allocation of output across existing plants. When we study oligopoly, we will see that such static losses may be common in oligopoly markets, although they should not be present in monopoly markets. Olley and Pakes provide an important reminder that dynamic efficiency gains can more than offset these oligopoly inefficiencies.

1.8 Recent Work

Levinsohn-Petrin; Ackerberg

1.9 Methodological Lessons from the Estimation of Cost and Production Functions

We see that both the theory and the nature of the data can influence our choice of technique; e.g. panel data techniques that look good in theory may be confounded by right-hand side measurement error. The production function literature also tells us to think hard about who observes different variables and, critically, when these variables are observed. Also, note that the nature of market regulation may influence our choice of technique, as when a regulatory process exogenously fixes output quantities.

2 Estimating Homogeneous Goods Demand from Consumer-level Data

Traditionally, IO pays closer attention to estimating the cost and production functions that define the supply curve than to the utility functions that define the demand curve. However, if one wants an answer to a practical question about markets, it is often as necessary to model demand carefully as it is to model costs. For example, we might want to know the compensating variation from a policy change and we can’t calculate this without knowledge of the distribution of preferences across households. For partial equilibrium analysis (which is the only kind of analysis we will consider), one might start with a simple model in which utility is

U(Qi, Mi, ζi).

where Q is the quantity of the good consumed and ζ is an index of the tastes of the household. In the spirit of partial equilibrium, say that M is a “composite good” with a price of 1, so Mi = Yi − pQi, where Yi is household income. For example, we might assume that

$$U_i \equiv \zeta_i Q_i^{\lambda} + M_i^{\delta}, \tag{24}$$

where λ and δ are between zero and one. Standard utility theory then tells us that the consumer sets

$$\frac{\partial U(Q_i, M_i, \zeta_i)}{\partial Q} = \frac{\partial U(Q_i, M_i, \zeta_i)}{\partial M}\, p, \tag{25}$$

which for the functional form just assumed works out to imply that

ln(Qi) = constant + ln(ζi)− αln(p) + βln(M),

where the constant term and the demand coefficients α and β depend on the utility parameters λ and δ. It remains to discuss the household attribute index, which we might parameterize as

ln(ζi) = ξ + ziγ + νi,


where ξ is a mean level, zi is a vector of observed household attributes and νi is an unobserved household level demand shifter. The parameters γ are to be estimated. The resulting equation (with ξ absorbed into the constant) is then

ln(Qi) = constant + ziγ − αln(p) + βln(M) + νi.
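As a check on the algebra, the log-linear form can be recovered from (24) and (25) as sketched below. The sketch treats Mi as given when taking logs. Note that the coefficient on ln(ζi) that emerges is 1/(1 − λ) rather than 1; the notation in the text implicitly absorbs this scaling into the definition of the taste index.

```latex
\begin{align*}
\zeta_i \lambda Q_i^{\lambda-1} &= \delta M_i^{\delta-1}\, p
  && \text{(25) applied to (24)} \\
\ln Q_i &= \frac{1}{1-\lambda}\Big[\ln(\lambda/\delta) + \ln\zeta_i
          - \ln p + (1-\delta)\ln M_i\Big] \\
\alpha &= \frac{1}{1-\lambda} > 0, \qquad
\beta = \frac{1-\delta}{1-\lambda} > 0
  && \text{since } 0 < \lambda, \delta < 1 .
\end{align*}
```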

First, let us consider estimation on a cross-section of households in a single time period and living in a single regional market. In this case, there is no reason for prices to vary at all, and the elasticity of demand α cannot be estimated. This is the same situation as cost function estimation on a cross-section of firms within a single market: there is no reason for input prices to vary and so not all the parameters can be estimated. However, for the current functional form of demand we could learn about γ and β. The expenditure on other goods, Mi, is correlated with the unobserved component of the taste for this good. However, Yi is available as an instrument. In practice, most applications don’t treat M as endogenous if spending on the good in question is small relative to income. In this case, M is approximately Y anyway and ln(Y) might be substituted for ln(M).

Now consider estimation on panel data, with a series of household cross-sections across time or across regional markets. In this case, we need to add a t subscript and to consider the possibility that the mean of the unobserved taste distribution varies across markets. This would give us

ln(ζit) = ξt + zitγ + νit

and a resulting demand equation of

ln(Qit) = constant + zitγ − αln(pt) + βln(Mit) + ξt + νit.

In this case, it seems reasonable to assume, via the typical supply and demand argument, that ξt and pt are correlated. Once again, supply-side instruments that vary across markets but are uncorrelated with ξt could be used to form an instrumental variables estimator.
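The point can be illustrated with a small simulation. This is only a sketch under assumed values: the parameter values, the pricing rule linking ln(pt) to a cost shifter `w` and to ξt, and the treatment of ln(M) as exogenous are all illustrative assumptions, not part of the text. Pooled OLS on household data understates the price elasticity because ξt enters both price and demand, while a just-identified IV estimator using `w` recovers it.

```python
import numpy as np

rng = np.random.default_rng(0)

T, n = 400, 50                        # markets and households per market (assumed)
alpha, beta, gamma = 1.5, 0.5, 0.8    # illustrative "true" parameters

# Market level: xi_t is the mean unobserved taste, w_t a supply-side cost shifter.
xi = rng.normal(0.0, 0.5, T)
w = rng.normal(0.0, 1.0, T)
# Assumed pricing rule: log price rises with cost and with the demand shock.
ln_p = 0.7 * w + 0.6 * xi + rng.normal(0.0, 0.1, T)

# Household level: observed attribute z, log expenditure on other goods, taste nu.
z = rng.normal(0.0, 1.0, (T, n))
ln_M = rng.normal(3.0, 0.5, (T, n))   # treated as exogenous in this sketch
nu = rng.normal(0.0, 0.3, (T, n))
ln_Q = 1.0 + gamma * z - alpha * ln_p[:, None] + beta * ln_M + xi[:, None] + nu

# Stack households; price and instrument are constant within a market.
y = ln_Q.ravel()
ones = np.ones(T * n)
X = np.column_stack([ones, z.ravel(), np.repeat(ln_p, n), ln_M.ravel()])
Z = np.column_stack([ones, z.ravel(), np.repeat(w, n), ln_M.ravel()])

b_ols = np.linalg.lstsq(X, y, rcond=None)[0]
b_iv = np.linalg.solve(Z.T @ X, Z.T @ y)   # just-identified IV: (Z'X)^{-1} Z'y

alpha_ols, alpha_iv = -b_ols[2], -b_iv[2]
print("alpha (OLS):", round(alpha_ols, 3))
print("alpha (IV): ", round(alpha_iv, 3))
```

Having many households per market does not shrink the OLS bias here, because the bias comes from the market-level term ξt rather than from the household-level error.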

Perhaps surprisingly, it is often argued that the use of household data removes the problem of simultaneity. A usual argument is that a single household’s demand is too small to “cause” the market price and that therefore the problem of joint causation disappears. But we have just seen that the necessary condition for estimating demand curves by OLS (aside from the endogeneity of M) is that prices vary across markets while the mean unobserved tastes of consumers do not. This amounts to an assertion that the market demand shifters are associated with changes in the distribution of observed z’s and not with changes in the distribution of the unobserved tastes.

In most cases, this seems unlikely, as all demand shifters have to be in the household data. Some demand shifters, like the distribution of income, are likely to be in the micro data, but then the distribution of income is often available at the market level even when data on household demand levels, Qi, are not. On the other hand, some demand shifters, like regional variations in taste, are unlikely to be directly observed in any data set at any level of aggregation. Therefore, one should be skeptical of claims that collecting household demand data eliminates the endogeneity problem. What matters at either level of aggregation is whether all market demand shifters are observed.

As we note in the next section, household level demand systems can be compared to market-level demand so as to provide a formal test of whether the observed z’s account for all shifts in demand across markets. In particular, as the number of households in the market grows, if all demand shifters are in z, then the sum of predicted household demands should exactly equal market demand in each observed market.

2.1 Functional Form

The simple functional forms here are in fact too simple, and we might want to use more “flexible” functional forms and/or some kind of semi-parametric or non-parametric methods.

2.1.1 Example

To continue with the telecommunications examples, Wolak (1996) specifies a translog function for indirect utility from consuming long distance and local telephone service (we consider multi-product translog demand more carefully in the next chapter). Wolak uses the demand system to analyze the effect of proposed price changes in local and long-distance telephone service on household welfare. In particular, he considers proposals to “rebalance” prices in the direction of true costs, which would involve higher local service prices and lower long-distance prices. Regulators have been particularly worried about the effects of this change on poor households, who might use little long-distance service.

Wolak looks at household-level data from the consumer expenditure survey. He can consider the distribution of welfare outcomes and not just the average outcome. This is important to regulatory bodies, who worry about the effect of deregulation on potentially vulnerable sub-groups.

There is an econometric complication caused by those who choose a zero level of long-distance, or who have no phone service at all (Lee and Pitt 1986), and this requires a full parametric assumption on the unobservables.

Wolak finds that there would be little net welfare loss even for those at the bottom end of the income distribution. However, he does not fully incorporate the demand errors (“random utility components”) into his welfare analysis and does not consider possible endogeneity problems.

2.2 Discrete Choice Models of Demand

Equation (25) assumes that the quantity demanded is set according to a first-order condition. However, at the household level some choices may be better approximated by a discrete choice model. Let us say that the household buys either one unit of a good, or else none. A “probit” model might assume that if the good is purchased a consumer receives utility of

ui = ziβ − αp + εi, (26)

where zi is again an observed vector of household attributes and εi is the unobserved taste of the household, while utility from not consuming the good is normalized to zero. Once again, if there is no variation in price one can’t estimate α; the term αp is subsumed into the constant in ziβ.³ Traditional discrete choice models require us to make a parametric assumption on the unobservables, not just a mean independence assumption. If the distribution of ε is Φ, then the probability of purchase is the probability that ε > −ziβ:

1− Φ(−ziβ). (27)

This probability can then be used to form the log-likelihood of the sample and the vector β can be estimated by traditional maximum likelihood methods.
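A minimal sketch of this likelihood and its maximization follows. It assumes ε is standard normal (so that 1 − Φ(−zβ) = Φ(zβ)) and uses Fisher scoring as the optimizer, which is one convenient choice rather than anything prescribed by the text. The function names and the simulated data are hypothetical.

```python
import math
import numpy as np

def norm_cdf(x):
    """Standard normal CDF, applied elementwise via math.erf."""
    return 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def norm_pdf(x):
    """Standard normal density."""
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)

def probit_loglik(beta, Z, d):
    """Sample log-likelihood; a purchase (d = 1) has probability Phi(z beta)."""
    p = np.clip(norm_cdf(Z @ beta), 1e-12, 1 - 1e-12)
    return np.sum(d * np.log(p) + (1 - d) * np.log(1 - p))

def probit_mle(Z, d, iters=50, tol=1e-10):
    """Maximize the probit log-likelihood by Fisher scoring from beta = 0."""
    beta = np.zeros(Z.shape[1])
    for _ in range(iters):
        idx = Z @ beta
        p = np.clip(norm_cdf(idx), 1e-12, 1 - 1e-12)
        f = norm_pdf(idx)
        score = Z.T @ (f * (d - p) / (p * (1 - p)))          # gradient
        info = (Z * (f ** 2 / (p * (1 - p)))[:, None]).T @ Z  # expected information
        step = np.linalg.solve(info, score)
        beta = beta + step
        if np.max(np.abs(step)) < tol:
            break
    return beta

# Simulated single-market sample: a constant (which absorbs -alpha*p) and one attribute.
rng = np.random.default_rng(0)
n = 5000
beta_true = np.array([-0.5, 1.0])
Z = np.column_stack([np.ones(n), rng.normal(size=n)])
d = (Z @ beta_true + rng.normal(size=n) > 0).astype(float)

beta_hat = probit_mle(Z, d)
print("beta_hat:", np.round(beta_hat, 3))
```

The normalization of the variance of ε to one is what pins down the scale of β; only β/σ is identified in general.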

³ Sometimes there is variation in price across households even within a given “market”; for example, transportation costs depend on the location of the household, not just on the city of residence. See McFadden [cite].


If one has panel data across markets then there is plausible variation in p and one might want to consider the possibility that the mean of ε varies across markets t. Consider the utility function

uit = zitβ − αpt + ξt + εit. (28)

Remember that ξt helps to determine pt. But traditional IV estimation techniques don’t carry over to discrete choice models. The method of moments IV estimator, as discussed above, made use of the value of the error as a function of the parameters. Unfortunately, the discrete choice model doesn’t reveal an exact value for the composite error ξ + ε as a function of model parameters, but only a range of possible values.

An alternative method is to define

δt = β0 − αpt + ξt, (29)

where β0 is the constant in zβ. If we have many observations on consumers in each market t, then δt can be estimated as a parameter of the model (this is like estimating a linear fixed effect as the number of observations per individual grows large). The estimated δt can then be regressed on a constant and pt, using the usual sorts of instruments for price, to obtain consistent estimates of α.⁴
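The two-step idea can be sketched in a deliberately stripped-down setting. The assumptions here are mine: there are no household covariates, so the purchase share in market t converges to Φ(δt) and δt can be backed out by inverting the normal CDF; the pricing rule and the cost shifter `w` used as an instrument are hypothetical. With covariates, the first step would instead be a probit with market-specific intercepts.

```python
import numpy as np
from statistics import NormalDist

rng = np.random.default_rng(1)
T, n = 300, 2000                 # markets and households per market (assumed)
beta0, alpha = 0.5, 1.0          # illustrative "true" parameters

xi = rng.normal(0.0, 0.3, T)     # mean unobserved taste in market t
w = rng.normal(0.0, 1.0, T)      # supply-side instrument (assumed available)
p = 1.0 + 0.5 * w + 0.5 * xi + rng.normal(0.0, 0.1, T)   # assumed pricing rule
delta = beta0 - alpha * p + xi   # market-level mean utility

# Step 1: households buy when delta_t + eps_it > 0; invert the purchase share.
eps = rng.normal(0.0, 1.0, (T, n))
share = (delta[:, None] + eps > 0).mean(axis=1)
share = np.clip(share, 0.5 / n, 1 - 0.5 / n)   # guard against shares of 0 or 1
inv_cdf = NormalDist().inv_cdf
delta_hat = np.array([inv_cdf(s) for s in share])

# Step 2: regress delta_hat on a constant and price, by OLS and by IV.
X = np.column_stack([np.ones(T), p])
Zm = np.column_stack([np.ones(T), w])
alpha_ols = -np.linalg.lstsq(X, delta_hat, rcond=None)[0][1]
alpha_iv = -np.linalg.solve(Zm.T @ X, Zm.T @ delta_hat)[1]

print("alpha (OLS on delta_hat):", round(alpha_ols, 3))
print("alpha (IV on delta_hat): ", round(alpha_iv, 3))
```

The second step is linear in the unobservable ξt, which is what makes the usual IV moment condition usable again.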

Note that the probit functional form greatly restricts the slope of the average demand curve. The derivative of the purchase probability with respect to price is

−αφ(δt + zitβ),

where φ is the density of ε. In the product differentiation section below, we will discuss methods for improving the discrete choice functional form.

3 Aggregation to the Market Level: In progress

Here, we consider how to aggregate firm cost functions and household demand functions to the market level. We discuss two methods: analytic aggregation via special functional forms, and simulation.

⁴ Of course, the standard errors will have to be adjusted to account for the fact that we are using estimates of δ on the left hand side instead of the actual values.


Once the firm-level cost functions and household level demand functions have been estimated, the market level demand and costs are found simply by aggregating the micro demands and supplies. In practice, this is not always so easy, because functional forms that are typically used at the micro level do not necessarily result in attractive functional forms at the market level.

One way around this is to look for household or firm-level functional forms that aggregate nicely. Houthakker (1955) does this for production functions, showing how a distribution of firms with Leontief production technologies can aggregate, under special assumptions on the distribution of micro production coefficients, into a market-level Cobb-Douglas production function. [Is this right?] As another example, Anderson, DePalma, and Thisse (1992) show how special case discrete-choice assumptions can lead to a market level Constant Elasticity demand function. One way of looking at such papers is that they tell us what we are assuming when we make functional form assumptions at the market level.

Pakes (1986) suggests another method, which is to simulate the aggregate functions. In this case, we need to make a parametric assumption on the distribution of the household or firm-level unobservables. We can then think of taking a set of random draws from the joint distribution of observed and unobserved micro data. In practice, we draw the observed micro data randomly from the micro dataset and then associate with each of these draws a random draw from the distribution of unobservables. For each draw, we calculate the micro quantity demanded or supplied. We finally take the mean of these calculations and multiply it by the number of households or firms in the market. As the number of draws gets large, this is a consistent estimate of the market quantity.
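The simulation steps just described can be sketched as follows. The micro demand function, the normal distribution for the unobservable, and all parameter values are hypothetical choices made for illustration; any estimated micro model could be substituted. The log-linear form is convenient here only because it has a closed-form mean to compare against.

```python
import numpy as np

def household_demand(z, nu, p, theta):
    """Hypothetical micro demand: log-linear in attribute z, price p, taste nu."""
    const, gamma, alpha = theta
    return np.exp(const + gamma * z - alpha * np.log(p) + nu)

def simulate_market_demand(z_data, p, theta, sigma_nu, n_households, n_draws, rng):
    """Simulated market quantity: (mean micro demand over draws) x (market size)."""
    # Draw observed attributes at random, with replacement, from the micro dataset.
    z_draws = rng.choice(z_data, size=n_draws, replace=True)
    # Pair each with a draw from the assumed distribution of the unobservable.
    nu_draws = rng.normal(0.0, sigma_nu, size=n_draws)
    # Compute the micro quantity for each draw, average, and scale up.
    q = household_demand(z_draws, nu_draws, p, theta)
    return n_households * q.mean()

rng = np.random.default_rng(0)
z_data = rng.normal(0.0, 1.0, 500)     # stand-in for an observed micro dataset
theta = (1.0, 0.5, 1.2)                # (const, gamma, alpha), illustrative values
sigma_nu, price, n_households = 0.3, 2.0, 10_000

simulated = simulate_market_demand(z_data, price, theta, sigma_nu,
                                   n_households, n_draws=200_000, rng=rng)

# For this functional form the aggregate is available analytically, given z_data:
# E[exp(nu)] = exp(sigma^2 / 2) when nu is normal with mean zero.
analytic = (n_households * np.exp(sigma_nu ** 2 / 2)
            * np.mean(household_demand(z_data, 0.0, price, theta)))

print("simulated:", round(simulated, 1), " analytic:", round(analytic, 1))
```

In estimation, the same draws would typically be held fixed as the parameters change, so that the simulated aggregate is a smooth function of the parameters.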

Using either simulation or, in special cases, direct analytic aggregation, we can think of combining micro and market level data to gain efficiency. We might see micro data for some subset of time periods or some subset of the cross-section of markets. The market level data might be available for more time periods or for a larger cross-section. In this case, the micro data will tell us particularly about the effect of observed micro shifters, like household or firm characteristics. The aggregate data across markets will help us to learn about the effects of market-level shifters that are common to the micro participants within a given market.

Also, in some cases the micro data will have a small number of observations, but the market level data is, by its nature, created from the true aggregation of very many participants. In this case, the market level data can add efficiency even when all the parameters of the model are theoretically identified from the micro data.

4 Exercises

Exercise 1 Elementary Micro-economics:

1. What restrictions does theory place on market level demand Equation (??), e.g. should (??) satisfy the Slutsky equation?

2. Show how to derive (??) if the market consists of many heterogeneous firms, each firm j in market n having a linear marginal cost curve:

mcjn = wnγ + λjqjn + ωjn (30)

3. Is Equation (??) consistent with a long-run zero-profit condition? If Equations (30) and (??) are true, is it possible to redistribute output across firms (holding the set of active firms constant) and lower industry costs, while holding total output Qn fixed?

Exercise 2 Elementary Econometrics:

1. Give conditions on the system (??) and (??) such that price is endogenous even if λ = 0.

2. Alternatively, what conditions would ensure that price is uncorrelated with ε?

3. What is the “reduced form” of the supply and demand system? Can the parameters of supply and demand be derived from the reduced form?

4. What is the “rank” condition in IV methods and when might it be satisfied when z contains elements of x and w?

5. Derive OLS and 2SLS from mean independence assumptions. Show that the 2SLS formula is consistent with the procedure of (i) regressing ln(p) on Z, (ii) placing the fitted value $\widehat{\ln(p)}$ back into the demand equation and (iii) running OLS on x and $\widehat{\ln(p)}$.


6. Show that the optimal weighting matrix for the single equation model when ε is i.i.d. is in fact $(Z'Z)^{-1}$.

7. Show that for the constant elasticity demand function the “optimal” instruments for (??) using the traditional instrument vector Z are x and E(ln(p) | Z).

Exercise 3 Cost and Production Functions

1. Show that the (labour) cost share is equal to $\partial \ln(C) / \partial \ln(w)$ and in the Cobb-Douglas case is equal to the constant $\alpha / (\alpha + \beta)$.

2. Derive the translog cost share equations and show that they vary with input prices and output.

3. What happens to the fixed-effects estimator for Equation (9) if ln(εjt) is auto-correlated?

4. Something About Ray Average Cost.

Exercise 4 Demand

Show that weakly separable demand functions imply demands that are functions of within group prices and of group expenditure.

5 Chapter Appendix

Here we will collect some more detailed results for selected topics in this chapter.

5.1 Details on Simultaneity in Household Demand

We expand on this topic because, of the topics in this chapter, we have found it to be the least understood and most controversial. Our basic proposition is as follows: when simultaneity is a problem in demand estimation at the market level, it is typically still a problem when household consumption data are available.

Consider the following very general form for the demand of household i in market t:

qit = f(zit, νit, pt, θ),


where z is a vector of observed (by us) household attributes, ν is a vector of unobserved (by us) attributes, p is price and θ is a vector of parameters to be estimated. Some of the demand shifters z may be constant across households within a market, so that they are “household” attributes in only a trivial sense; for example zit might contain seasonal dummies or measures of the (exogenous?) prices of substitute products in a market.

Market demand is then

$$q_t = \int\!\!\int f(z, \nu, p_t, \theta)\, \Lambda_t(d\nu \mid z)\, \Phi_t(dz),$$

where Φ is the population distribution of z and Λ is the conditional distribution of the unobserved household demand shifters.

Let us consider the following “market level” data: qt, pt and some information on the distribution of zt. Formally, assume that in each market we observe, in addition to (qt, pt), N draws from the distribution Φt(zt). These might be, for example, draws from the Current Population Survey on income, household size, geographic location, etc. We also might observe household consumption data. In particular, in each market assume that we observe n draws on the pair (qit, zit). The additional information in the household consumption data is the match between qit and zit, which is not provided in CPS data.

There are two cases to consider: either the distribution of the unobservables is the same in every market or it varies across markets. In the first case, there is no simultaneity problem when we use either market or household level data. In the second case, there is typically a simultaneity problem at both levels of aggregation.

The first case is easy, because the ν’s just “integrate out”. Formally, expected household demand is

$$E(q_{it} \mid z_{it}, p_t, \theta) \equiv h(z_{it}, p_t, \theta) = \int f(z_{it}, \nu, p_t, \theta)\, \Lambda(d\nu \mid z_{it}).$$

That is, the unobservables just help to determine the functional form of the expected demand function. Indeed, in most cases like this one would just forget the ν’s and make a direct functional form assumption on expected demand h (or else could even estimate h via non-parametric methods). Given household consumption data, we could estimate the equation

qit = h(zit, pt, θ) + εit,


where εit is uncorrelated with both z and p, so there is no problem of simultaneity.

However, there is no simultaneity problem at the aggregate level either. We could approximate expected market demand via

$$E(q_t \mid p_t, \Phi_t, \theta) \approx \frac{1}{N} \sum_{i=1}^{N} h(z_{it}, p_t, \theta),$$

where the right-hand side replaces an integral with a sample mean, using our N draws from the CPS-like data. The estimating equation

$$q_t = \frac{1}{N} \sum_{i=1}^{N} h(z_{it}, p_t, \theta) + \varepsilon_t$$

again has an error that is mean zero conditional on pt and which goes to zero as the sample size N increases. Since the CPS is very large, we expect this estimator to perform well. Indeed, if N is much larger than n, we might be better off using the market level estimator instead of using the household consumption data. (Of course, we would be even better off using both.) Also, if N is large, the market-level equation should fit almost exactly, which provides a test of whether we really observe all the determinants of demand that vary across markets.

Now let us consider the second case, where the distribution of ν varies across t and is unknown, although one might be willing to assume a functional form for its distribution. As a very simple case, suppose that there is only one element of ν and that

νit = µt + εit,

with εit i.i.d. Here, the only shift in the distribution of unobservables is a shift in the mean of the unobserved taste for the product. Further, consider a linear demand curve,

qit = µt + zitγ − αpt + εit,

which implies a market-level demand curve of

qt = z̄tγ − αpt + µt.

Here, the mean unobservable, µt, is the market-level demand shock, but this term also enters the household consumption data. Note that the linear demand function implies that all we have to observe at the market level are the mean demand shifters, z̄t.
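To see why µt survives aggregation while the idiosyncratic term does not, one can average the household equation over the households in market t. This is only a sketch; the symbol n_t (the number of households in the market) and the bar notation for within-market means are introduced here for illustration.

```latex
\begin{align*}
q_t \equiv \frac{1}{n_t}\sum_{i=1}^{n_t} q_{it}
  &= \mu_t + \bar{z}_t\gamma - \alpha p_t + \bar{\varepsilon}_t, \\
\bar{\varepsilon}_t = \frac{1}{n_t}\sum_{i=1}^{n_t}\varepsilon_{it}
  &\to 0 \quad \text{as } n_t \to \infty .
\end{align*}
```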


One approach to estimation at the household level is to consider an aggregate error, uit = µt + εit. Note that µt is correlated with pt, a problem that could be solved via instrumental variables. Note that the problem of estimating α is essentially the same whether we use household or aggregate data because µt enters the estimation equations at both levels of aggregation.

However, the household data might help us to estimate parameters other than α from within-market information on households. In particular, consider a market-specific intercept term,

δt = µt − αpt,

and then estimate

qit = δt + zitγ + εit, (31)

using OLS on household data. This gives us the γ’s, but not α. We could, as a second stage, then estimate α from the δt’s, which are effectively market-level aggregates. In the regression of δt on pt, we still face the simultaneity problem.
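The two stages can be sketched on simulated data. The sketch assumes a scalar z, an invented pricing rule in which price responds to µt, and a supply-side instrument `w`; none of these specifics come from the text. The within-market regression recovers γ, the second-stage OLS regression of the estimated δt on pt is biased, and an IV second stage is not.

```python
import numpy as np

rng = np.random.default_rng(2)
T, n = 300, 40                  # markets and households per market (assumed)
alpha, gamma = 1.0, 0.7         # illustrative "true" parameters

mu = rng.normal(0.0, 0.5, T)    # market-level demand shock
w = rng.normal(0.0, 1.0, T)     # supply-side instrument (assumed available)
p = 2.0 + 0.6 * w + 0.5 * mu + rng.normal(0.0, 0.1, T)   # assumed pricing rule
z = rng.normal(0.0, 1.0, (T, n))
eps = rng.normal(0.0, 1.0, (T, n))
q = mu[:, None] + gamma * z - alpha * p[:, None] + eps    # household demand

# Stage 1: OLS with market-specific intercepts, via within-market demeaning.
z_dm = z - z.mean(axis=1, keepdims=True)
q_dm = q - q.mean(axis=1, keepdims=True)
gamma_hat = (z_dm * q_dm).sum() / (z_dm ** 2).sum()
delta_hat = q.mean(axis=1) - gamma_hat * z.mean(axis=1)   # estimated delta_t

# Stage 2: regress delta_hat on a constant and price.
X = np.column_stack([np.ones(T), p])
Zm = np.column_stack([np.ones(T), w])
alpha_ols = -np.linalg.lstsq(X, delta_hat, rcond=None)[0][1]
alpha_iv = -np.linalg.solve(Zm.T @ X, Zm.T @ delta_hat)[1]

print("gamma_hat:", round(gamma_hat, 3))
print("alpha (second-stage OLS):", round(alpha_ols, 3))
print("alpha (second-stage IV): ", round(alpha_iv, 3))
```

Because price does not vary within a market, the first stage carries no information about α, which is the point made in the paragraph above.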

Non-linear functional forms will change the details of the argument in the last paragraph, but the basic point remains: as long as we have access to the marginal distribution of observed household demand shifters in the population, joint information on household attributes and household consumption choices does not help us to solve the simultaneity problem. In some cases, parameters not related to price could, however, be estimated from the household data by exploiting the within-market variance in zit and qit. The within-market argument doesn’t help us with the effect of price, though, because the price is fixed within a given market.

5.2 Details on Houthakker

Here we include some additional details on Houthakker’s aggregation argument, mostly because it is so clever . . .

References

Anderson, S., A. DePalma, and F. Thisse (1992): Discrete Choice Theory of Product Differentiation. MIT Press, Cambridge MA.


Baumol, W., J. Panzar, and R. Willig (1982): Contestable Markets and the Theory of Industry Structure. Harcourt Brace Jovanovich, New York.

Chamberlain, G. (1982): “Multivariate Regression Models for Panel Data,” Journal of Econometrics, 18(1), 5–46.

Cobb, C. W., and P. Douglas (1928): “A Theory of Production,” American Economic Review, 18(1), 139–72, Supplement.

Evans, D., and J. Heckman (1983): “Multiproduct Cost Functions and Natural Monopoly Test for the Bell System,” in Breaking Up Bell: Essays on Industrial Organization and Regulation, ed. by D. Evans. North-Holland, Amsterdam.

Fuss, M., D. McFadden, and Y. Mundlak (1978): “A Survey of Functional Forms in the Economic Analysis of Production,” in Production Economics: A Dual Approach, vol. 1. North-Holland, Amsterdam.

Griliches, Z., and J. Mairesse (1995): “Production Functions: The Search for Identification,” Discussion Paper 5067, NBER.

Hoch, I. (1962): “Estimation of Production Function Parameters Combining Time Series and Cross-Section Data,” Econometrica.

Houthakker, H. S. (1955): “The Pareto Distribution and the Cobb-Douglas Production Function,” Review of Economic Studies.

Lee, L.-F., and M. M. Pitt (1986): “Microeconomic Demand Systems with Binding Nonnegativity Constraints: The Dual Approach,” Econometrica, 54, 1237–42.

Marschak, and Andrews (1944): “Random Simultaneous Equations and the Theory of Production,” Econometrica, 12, 133–205.

McElroy, M. B. (1987): “Additive General Error Models for Production, Cost, and Derived Demand or Share Systems,” Journal of Political Economy, 95(4), 737–757.

Mundlak, Y. (1961): “Empirical Production Function Free of Management Bias,” Journal of Farm Economics, 43, 44–56.


Mundlak, Y. (1963): “Estimation of Production and Behavioral Functions from a Combination of Cross-section and Time-series Data,” in Measurement in Economics, ed. by C. Christ, pp. 138–166. Stanford Univ. Press, Stanford.

Nerlove, M. (1963): “Returns to Scale in Electricity Supply,” in Measurement in Economics, ed. by C. Christ. Stanford Univ. Press, Stanford.

Nerlove, M. (1965): Estimation and Identification of Cobb-Douglas Production Functions. North-Holland, Amsterdam.

Olley, S. G., and A. Pakes (1996): “The Dynamics of Productivity in the Telecommunications Equipment Industry,” Econometrica, 64(6), 1263–97.

Pakes, A. (1986): “Patents as Options: Some Estimates of the Value of Holding European Patent Stocks,” Econometrica, 54, 755–784.

Wolak, F. A. (1996): “The Welfare Impacts of Competitive Telecommunications Supply: A Household Level Analysis,” Brookings Papers: Microeconomics, pp. 269–340.

Zellner, A., J. Kmenta, and J. Dreze (1966): “Specification and Estimation of Cobb-Douglas Production Function Models,” Econometrica, 34(4), 784–795.
