Frontier Functions

7/29/2019 Frontier Functions


Frontier Functions:

Stochastic Frontier Analysis (SFA) &

Data Envelopment Analysis (DEA)

Sponsored by: The Martin School of Public Policy and Administration

The Department of Economics

The Research Office

University of Kentucky



Production and cost functions

A researcher wishes to estimate a production function or a cost function.

The object is to estimate not the average production or average cost, but the maximum possible production given a set of inputs or the minimum possible cost of a set of outputs.

OLS regression estimates the mean of the dependent variable conditional on the explanatory variables;

Quantile regression is based on a quantile (e.g. 10th, 25th, median, 75th, 90th), not the maximum or minimum;

The max or min cannot be detected directly and used to define the sample for selection bias analysis;

Limited dependent variable models truncate the dependent variable into categories or limits but not the maximum or minimum.



Frontier functions: definition

None of those standard econometric models is the answer.

The answer is frontier functions: econometric stochastic frontier analysis (SFA) or linear programming data envelopment analysis (DEA).

Frontier functions estimate maxima or minima of a dependent variable given explanatory variables, usually to estimate production or cost functions.

All frontier functions come from one paper, Aigner and Chu (1968).



Aigner and Chu (1968)

D.J. Aigner and S.F. Chu (AER 1968), "On Estimating the Industry Production Function," invented this area.

"A viable distinction between the average and frontier functions as predictors of capacity derives from a probability interpretation of alternative forecasts. ... the frontier we construct is truly a surface of maximum points." This became Stochastic Frontier Analysis, Stochastic = probability interpretation.

Estimation, for primary metals production in state aggregates:

one stage least squares and two stage least squares,

quadratic programming (now rarely estimated), and

linear programming, developed into Data Envelopment Analysis in Charnes, Cooper, and Rhodes (1978) and subsequent research.



Varian (1984)

Varian shows how to estimate and test for the Weak Axiom of Cost Minimization (WACM) and other microeconomic assumptions

Varian suggests using either regression (SFA) or linear programming (DEA)

The WACM applies to for-profit, not-for-profit, private, and public producers

The only requirement is that minimum inputs are intended to be used to produce desired output, or maximum output is intended from inputs used

Profit maximization is not required



SFA and DEA

Two large differences and another possible difference

SFA has a stochastic frontier with a probability distribution

DEA has a non-stochastic frontier

SFA has one output, or an a priori weighted average of multiple outputs

DEA often has more than one output, no a priori weights, but assumes input-output separability

Both can have stochastic inefficiency; SFA always does, DEA sometimes does



One-sided disturbances

In frontier functions, the disturbance has a distribution all on one side of zero

the maximum production must be greater than or equal to any value in the sample,

the minimum cost must be less than or equal to any value in the sample.

produced quantities are bounded by the maximum, with non-positive disturbances

costs are bounded by the minimum, with non-negative disturbances



MLE with a one-sided disturbance

does not work well

MLE and the Cramér-Rao lower bound (minimum variance of an asymptotically unbiased estimator, usually the MLE) are questionable!

Begin with a likelihood function L which shows the probability of the data x given the parameters $\theta$.

The parameters might be the mean and standard deviation or might just be mathematical parameters.



Setting up MLE

Limits are a function of parameters in a non-stochastic frontier function: production function (max), cost function (min)

L is Likelihood, L* is log likelihood.

L() is always a probability distribution, so it follows that it integrates to 1.0 over the range of the data, from lower bound A to upper bound Z.

$\int_A^Z L(x \mid \theta)\,dx = 1.$

Take the derivative wrt $\theta$:



MLE: problems

$\int_A^Z [dL(x \mid \theta)/d\theta]\,dx + [dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) = 0$

$E(dL^*/d\theta) + [dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) = 0.$

The first derivatives of the log likelihood do not have mean 0 if those extra terms stay.

Second derivatives add more unwanted derivatives if the limits are functions of the parameters.

The negative inverse Hessian is not the variance of the MLE.

This is not working at all.
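
A worked case may make the extra terms concrete. As an illustration that is not on the slides, assume x is uniform on $[0, \theta]$, so that A = 0, Z = $\theta$, and $L(x \mid \theta) = 1/\theta$:

```latex
\begin{align*}
\int_0^{\theta} \frac{1}{\theta}\,dx &= 1, \\
\int_0^{\theta} \frac{\partial}{\partial\theta}\!\left(\frac{1}{\theta}\right) dx
  + \frac{dZ}{d\theta}\,L(Z) - \frac{dA}{d\theta}\,L(A)
  &= -\frac{1}{\theta} + 1\cdot\frac{1}{\theta} - 0 = 0, \\
E\!\left(\frac{dL^*}{d\theta}\right) &= -\frac{1}{\theta} \neq 0 .
\end{align*}
```

In this assumed case the score has mean $-1/\theta$ rather than 0, because the upper limit moves with the parameter.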



MLE: possible repairs

Make the frontier stochastic and limits of production or cost not a function of the parameters, completely eliminating the problem.

Make the probability distribution have pdf of 0 and derivatives of 0 at the limits, even though the limit itself is a function of the parameters:

$[dZ/d\theta]\,L(Z) = 0$ and $[dA/d\theta]\,L(A) = 0$

The Gamma Distribution can do that (Greene (1980)).



The Gamma Distribution

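The Gamma Distribution describes a non-negative random variable with two parameters $\alpha$ (shape) and $\lambda$ (spread)

If $\varepsilon \sim \Gamma(\alpha, \lambda)$, $E(\varepsilon) = \alpha/\lambda$, $V(\varepsilon) = \alpha/\lambda^2$

$\text{pdf}(\varepsilon) = \lambda^{\alpha} \exp(-\lambda\varepsilon)\,\varepsilon^{\alpha-1}/\Gamma(\alpha)$,

with different shapes for $\alpha > 0$, in ranges: less than 1, 1, between 1 and 2, 2, greater than 2

A graph follows; $\alpha > 2$ is required for the pdf and its derivatives to be zero at the limits.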
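
The shape condition can be checked numerically. The sketch below is illustrative only: it evaluates the pdf and its first derivative very close to the lower limit for the four shapes in the graph, with rate 1. The function names and the evaluation point `near_zero` are assumptions made for this example.

```python
import math

def gamma_pdf(eps, alpha, lam):
    """Gamma pdf with shape alpha and rate lam."""
    return lam**alpha * math.exp(-lam * eps) * eps**(alpha - 1) / math.gamma(alpha)

def gamma_pdf_slope(eps, alpha, lam):
    """Analytic first derivative of the pdf with respect to eps."""
    return gamma_pdf(eps, alpha, lam) * ((alpha - 1) / eps - lam)

near_zero = 1e-8  # stand-in for the lower limit eps -> 0 (hypothetical choice)
for alpha in (1.0, 1.5, 2.0, 2.5):
    # Only alpha > 2 gives both a pdf and a slope that vanish at the limit
    print(alpha, gamma_pdf(near_zero, alpha, 1.0), gamma_pdf_slope(near_zero, alpha, 1.0))
```

For shape 1 the pdf itself is near 1 at the limit, for shape 1.5 the slope grows without bound, for shape 2 the slope stays near 1, and only for shape 2.5 are both close to zero.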



Gamma Distribution: shapes

[Figure: Gamma distributions (lambda = 1). Density plotted against x from 0 to 5 for alpha = 1.0, 1.5, 2.0, and 2.5.]



Ok, so a Gamma Distribution?

No, not really. The parameters are restricted mathematically. That really annoys researchers.

Some other distribution? No, no other one-sided distribution has the required properties at the limits.

This is why no one has just one disturbance $\varepsilon$.



Composite disturbances

The disturbance has two parts

Stochastic frontier (v), unlimited range as usual. The limits of the production or cost function are at infinity, not a function of the parameters

Inefficiency (u), one sided, non-positive for production, non-negative for cost

Finally, $y_j = \beta x_j + u_j + v_j$, that is, $\varepsilon_j = u_j + v_j$

So there are two disturbance terms to keep the parameters from affecting the limits
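
A small simulation can show what the composite disturbance looks like for a production function. This is a sketch under stated assumptions: v is normal, u is the negative of a half-normal draw, and the two standard deviations and the seed are hypothetical values chosen for the example.

```python
import random

random.seed(0)
n = 20000
sigma_v = 0.3   # hypothetical SD of the two-sided frontier noise v
sigma_u = 0.6   # hypothetical scale of the one-sided inefficiency u

eps = []
for _ in range(n):
    v = random.gauss(0.0, sigma_v)        # two-sided, unlimited range
    u = -abs(random.gauss(0.0, sigma_u))  # one-sided, non-positive (production)
    eps.append(u + v)                     # composite disturbance

mean = sum(eps) / n
m2 = sum((e - mean) ** 2 for e in eps) / n
m3 = sum((e - mean) ** 3 for e in eps) / n
skewness = m3 / m2 ** 1.5
print(round(mean, 3), round(skewness, 3))
```

The composite disturbance has unlimited range, a negative mean, and negative skewness; the skewness is what separates u from v in estimation.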



Panel data: Fixed effects

Panel data researchers would like to include fixed or random effects in everything, so why not frontier models?

Greene (2005) addresses this in detail.

Fixed effects have special problems in non-linear models, but they can work

Random effects are offered by Stata.

Now there are three disturbance terms!

$y_{jt} = \beta x_{jt} + \alpha_j + u_{jt} + v_{jt}$



Fixed effects in non-linear models

Fixed effects have well known advantages in linear models but in non-linear models they:

are inconsistent (too small sample for each fixed effect),

cannot be differenced out (differences of non-linear models are still non-linear),

spread their inconsistency to other coefficients (assuming correlation with other explanatory variables, which is the motivation for fixed rather than random effects).



Wait, maybe fixed effects are ok

With few units and many observations, fixed effects work because the sample size for each fixed effect might be large enough. Greene (2005) points this out.

Stata refuses to enter fixed effects in the model.

The user can enter fixed effects.

Random effects, normally distributed, are offered by Stata. As always, they must be assumed to be uncorrelated with explanatory variables.

The independence assumption cannot be tested by Stata, and there is no Hausman test, but:

Estimate fixed effects by direct inclusion and regress the fixed effects on explanatory variables to test the independence required for consistent random effects.



Stata: all MLE, all the time

Stata offers MLE with composite disturbances.

The one-sided distribution is half-normal, truncated normal, or exponential (restricted Gamma)

frontier dependent explanatory, d(hn) or d(tn) or d(e)

In Stata, u is one-sided inefficiency and v is the two-sided stochastic frontier. Stata uses notation from Greene (1990) in which $\lambda$ = ratio of standard deviations $\sigma_u/\sigma_v$, so that $\lambda = 0$ means there is no inefficiency.

Fixed effects sneaked in by the user under frontier, or random effects by Stata (normally distributed).

xtfrontier dependent explanatory, re i(group_id)

For minimization, use the option , cost
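
To make the half-normal case concrete, the sketch below writes out the textbook density of a composite disturbance with normal v and half-normal u, and the log likelihood it implies for a list of residuals. It is not Stata's implementation; the function names and the `cost` flag are assumptions made for this example, and `sigma_v` must be positive.

```python
import math

def norm_pdf(z):
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def norm_cdf(z):
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def composite_density(eps, sigma_u, sigma_v, cost=False):
    """Density of eps = u + v with v normal and u half-normal.

    u is non-positive (production) unless cost=True, where it is non-negative.
    """
    sigma = math.sqrt(sigma_u**2 + sigma_v**2)
    lam = sigma_u / sigma_v          # ratio of standard deviations
    sign = 1.0 if cost else -1.0
    return (2.0 / sigma) * norm_pdf(eps / sigma) * norm_cdf(sign * eps * lam / sigma)

def log_likelihood(residuals, sigma_u, sigma_v, cost=False):
    """Sum of log densities over the residuals y - beta*x."""
    return sum(math.log(composite_density(e, sigma_u, sigma_v, cost)) for e in residuals)
```

With `sigma_u = 0` the ratio is 0 and the density collapses to the ordinary normal density of v, which matches the statement that a zero ratio means no inefficiency.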



Stata: heteroscedasticity

Stata offers a lot of heteroscedasticity: either u or v can be heteroscedastic, or both.

Heteroscedastic u (one-sided error, inefficiency)

Heteroscedastic v (two-sided error, random variation)

The same explanatory variables, or different variables, can appear in the frontier and in the heteroscedasticity.

frontier, uhet(var_name) vhet(var_name)



Stata estimates the inefficiency

Stata estimates the technical efficiency, the percentage of estimated frontier output attained or the extra percentage spent beyond frontier cost

predict var_name, te

As usual, many other options exist using predict.

Successful Stata estimation is illustrated at this point.



Is MLE necessary?

If you always use Stata's options, yes!

If not, no!

Not-MLE (1) Corrected OLS

Not-MLE (2) Fixed effects in panels

Not-MLE (3) Gamma-distributed inefficiency

Note: the Gamma distribution or any other distribution of inefficiency is unrestricted if MLE is not used; only MLE has a range problem



Not MLE(1)

Corrected OLS

Estimate OLS: that's all, just OLS

$y_j = \beta x_j + \varepsilon_j$

Estimate residuals $e_j$ and interpret them as inefficiency

Assuming production, inefficiency
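
As a sketch of the idea, the code below fits a one-regressor OLS line and then measures each observation against the largest residual. Shifting by the maximum residual is the usual corrected-OLS convention and is an assumption here, since the formula on the slide is cut off; the data and function names are hypothetical.

```python
def ols_line(x, y):
    """Slope and intercept of a one-regressor OLS fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxy = sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
    sxx = sum((a - xbar) ** 2 for a in x)
    slope = sxy / sxx
    return slope, ybar - slope * xbar

def cols_inefficiency(x, y):
    """Shift the OLS line up to the largest residual; gaps below it are inefficiency."""
    slope, intercept = ols_line(x, y)
    resid = [b - (intercept + slope * a) for a, b in zip(x, y)]
    top = max(resid)
    return [top - e for e in resid]

# Hypothetical data: one input x, one output y
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.5, 4.5, 4.0, 6.0]
ineff = cols_inefficiency(x, y)
print([round(v, 2) for v in ineff])
```

The observation with the largest residual defines the frontier and has inefficiency 0; every other observation has a positive value.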



Not MLE (2)

Fixed effects as inefficiency

Schmidt and Sickles (1984) but not in Stata: fixed effects required!

Given panel data and fixed effects, assume that inefficiency is the fixed effect

Estimate $y_{jt} = \beta x_{jt} + \alpha_j + v_{jt}$ by xtreg

predict the fixed effects $\alpha_j$ and define the most efficient (production) $\alpha_{max}$

Inefficiency = $\alpha_{max} - \alpha_j$

Min and reverse sign for cost functions
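
The steps above can be sketched without any statistics package for the one-regressor case. This is an illustration only: it uses the within (demeaned) estimator for the common slope, recovers each unit's fixed effect from its means, and measures inefficiency from the largest fixed effect. The panel below is hypothetical and noise-free so the result is easy to verify by eye.

```python
def fixed_effects_inefficiency(panel):
    """panel maps unit id -> list of (x, y) observations over time."""
    means = {j: (sum(x for x, _ in obs) / len(obs), sum(y for _, y in obs) / len(obs))
             for j, obs in panel.items()}
    # Within (demeaned) estimator of the common slope beta
    num = sum((x - means[j][0]) * (y - means[j][1]) for j, obs in panel.items() for x, y in obs)
    den = sum((x - means[j][0]) ** 2 for j, obs in panel.items() for x, _ in obs)
    beta = num / den
    # Fixed effect of each unit: unit mean of y minus beta times unit mean of x
    alpha = {j: means[j][1] - beta * means[j][0] for j in panel}
    alpha_max = max(alpha.values())
    return beta, {j: alpha_max - a for j, a in alpha.items()}

# Hypothetical noise-free panel: y = 2x + alpha_j with alpha = 5, 4, 2.5
panel = {
    "A": [(1.0, 7.0), (2.0, 9.0), (3.0, 11.0)],
    "B": [(1.0, 6.0), (3.0, 10.0), (4.0, 12.0)],
    "C": [(2.0, 6.5), (3.0, 8.5), (5.0, 12.5)],
}
beta, ineff = fixed_effects_inefficiency(panel)
print(beta, ineff)
```

Unit A has the largest fixed effect and defines the frontier; B and C fall short by 1 and 2.5.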



Not MLE (3)

Gamma-distributed inefficiency

Greene (1990), not in Stata

Not a panel, $y_j = \beta x_j + (u_j + v_j)$ by reg and predict the residuals $e_j = u_j + v_j$

Adjust residuals to one side of 0 by the max or min; the constant absorbs $e_{max/min}$

Assume $v \sim (0, \sigma_v^2)$ and $u \sim$ a Gamma distribution and estimate $\sigma_v^2$, $\alpha$, and $\lambda$

$E(e)$ isn't useful, fixed to 0 by OLS but



Not MLE (3)

Gamma-distributed inefficiency

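$V(e) = \sigma_v^2 + \alpha/\lambda^2$

$\text{Skewness}(e) = 2\alpha/\lambda^3$ (third central moment)

$\text{Kurtosis}(e) = 3V^2(e) + 6\alpha/\lambda^4$ (fourth central moment, with v normal)

Three equations in three unknowns: V(v), two parameters of the distribution of u

Standard errors by delta method or GMM

But the range of the data is a function of the parameters? No problem, not MLE!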
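
The three moment equations can be solved in closed form. The sketch below assumes v is normal, so that the fourth central moment of e equals three times the squared variance plus the Gamma term $6\alpha/\lambda^4$, and it takes the third moment in absolute value because production residuals are skewed left. The function name and the round-trip parameter values are hypothetical.

```python
def gamma_moment_estimates(m2, m3, m4):
    """Solve the moment equations for (sigma_v^2, alpha, lam).

    m2, m3, m4 are the second, third, and fourth central moments of the residuals.
    """
    k3 = abs(m3)             # equals 2*alpha/lam**3
    k4 = m4 - 3.0 * m2**2    # equals 6*alpha/lam**4 when v is normal
    lam = 3.0 * k3 / k4
    alpha = k3 * lam**3 / 2.0
    sigma_v2 = m2 - alpha / lam**2
    return sigma_v2, alpha, lam

# Round trip: build the moments from known values, then recover them
alpha0, lam0, sv2 = 2.5, 2.0, 0.09
m2 = sv2 + alpha0 / lam0**2
m3 = -2.0 * alpha0 / lam0**3
m4 = 3.0 * m2**2 + 6.0 * alpha0 / lam0**4
print(gamma_moment_estimates(m2, m3, m4))
```

In practice the sample moments are noisy, so the implied variance of v or the shape parameter can come out negative; nothing in this sketch guards against that.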



Failure of well-specified MLE:

parameters

Failure to converge; estimation continues indefinitely through many iterations with no sign of stopping.

Repeated non-concave log likelihood means the log likelihood is not maximized, maximum likelihood fails; "backed up" means the log likelihood decreases.

Estimation fails to start, "initial values not feasible". OLS starting values imply negative infinite log likelihood.

Apparent estimates but the SD of inefficiency ($\sigma_u$) is small or the ratio of $\sigma_u$ to the SD of the stochastic frontier ($\sigma_v$), $\lambda = \sigma_u/\sigma_v$, is small, e.g. .01; sometimes $\lambda$ goes as close to zero as Stata can make it, e.g. 0.00001.



Failure of well-specified MLE:

distributions

The truncated normal distribution of inefficiency has an extra parameter, the mean of the normal truncated at 0, which often fails in estimation.

The exponential distribution slopes down from 0 smoothly, which leads to "initial values not feasible" if inefficiency is not strongly skewed right.

The stochastic frontier can disappear from the model, leaving one-sided inefficiency that violates the MLE range rule (range not a function of the parameters).

The half-normal is the most often successful, the most common in the literature, and the default in Stata.

Unsuccessful Stata estimation is illustrated at this point.



Data Envelopment Analysis (DEA)

Envelop the m inputs and n outputs in m+n space, i.e. a graph with points, with hyperplanes, i.e. lines/planes/etc.

Linear programming

Constant returns to scale (CRS) = CCR for Charnes, Cooper, and Rhodes (1978),

Variable returns to scale (VRS) = BCC for Banker, Charnes, and Cooper (1984).

Aigner and Chu (1968) did it first and also did quadratic programming



DEA assumptions

DMU = decision making unit, business, bank, farm, not-for-profit, government, university, etc.

All actual observed inputs and outputs of any DMUs are feasible for all DMUs

All linear combinations of observed inputs and outputs are feasible.

Free disposal of inputs and outputs.

The production function or cost function is piecewise linear, implying linear or non-differentiable functions everywhere.



DEA efficiency without prices

Output-oriented technical efficiency is producing the greatest possible output in the sense of a linear function of a set of outputs given the value of a linear function of inputs. No prices are involved. Efficiency = output that could be produced from inputs used, if >100%, inefficient.

Input-oriented technical efficiency is producing a given set of outputs with the smallest linear function of inputs. No prices are involved. Efficiency = percentage of actual inputs used that would be needed, if



DEA efficiency with prices

Allocative efficiency is minimizing the cost of the linear combination of the outputs produced, using input prices.

Profit maximization: maximizing the value of outputs minus the value of inputs, using both output and input prices

Scale efficiency is operating at the scale of operation maximizing the ratio of the linear sum of outputs to the linear sum of inputs.

An economically efficient business is technically and scale efficient.



DEA: tiny example

constant returns to scale

DMU   x    y    y/x    efficiency   supereffic
1     1    6    6.00   1.0000       1.0909
2     2    8    4.00   0.6667
3     2    11   5.50   0.9167
4     3    9    3.00   0.5000
5     3    13   4.33   0.7222
6     5    15   3.00   0.5000

DMU#1 has the highest y/x and others are inefficient according to their ratios of y/x

DMU#1 could drop to 5.5 and still be efficient (see DMU#3); DMU#1's superefficiency is 6.00/5.50 = 1.0909
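
With one input and one output, the constant-returns scores in the table reduce to ratio arithmetic, which the short sketch below reproduces. It covers only this single-input, single-output case and is not a general DEA solver; the variable and function names are chosen for the example.

```python
# (x, y) for DMUs 1..6, copied from the table
dmus = {1: (1, 6), 2: (2, 8), 3: (2, 11), 4: (3, 9), 5: (3, 13), 6: (5, 15)}

ratio = {k: y / x for k, (x, y) in dmus.items()}
best = max(ratio.values())
efficiency = {k: r / best for k, r in ratio.items()}

def superefficiency(k):
    """Own ratio relative to the best ratio among the other DMUs."""
    return ratio[k] / max(r for j, r in ratio.items() if j != k)

for k in sorted(dmus):
    print(k, round(efficiency[k], 4))
print("superefficiency of DMU 1:", round(superefficiency(1), 4))
```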



DEA: tiny example

variable returns to scale

DMU   x    y    y/x    efficiency   supereffic
1     1    6    6.00   1.0000       2.0000
2     2    8    4.00   0.7000
3     2    11   5.50   1.0000       1.2143
4     3    9    3.00   0.5333
5     3    13   4.33   1.0000       1.1667
6     5    15   3.00   1.0000       big

DMUs #1, 3, 5, 6 define the frontier

DMU#2 is inefficient relative to 0.6 X #1 + 0.4 X #3

DMU#4 is inefficient relative to 0.4 X #1 + 0.6 X #3

DMU#1 could use twice the input and still be efficient
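
The variable-returns scores in the table are consistent with an input-oriented calculation: find the smallest input on the frontier that still yields the DMU's output, and divide by the DMU's own input. The sketch below does this by brute force for one input and one output, where the best reference point is either a single DMU or a mix of two. It is an illustration under those assumptions, not a linear programming solver, and the function name is hypothetical.

```python
dmus = {1: (1, 6), 2: (2, 8), 3: (2, 11), 4: (3, 9), 5: (3, 13), 6: (5, 15)}

def vrs_input_efficiency(k, exclude_self=False):
    """Smallest frontier input yielding at least DMU k's output, over k's own input.

    With exclude_self=True the DMU is left out of the reference set, which gives
    its superefficiency; None is returned when no reference point exists.
    """
    x0, y0 = dmus[k]
    refs = [v for j, v in dmus.items() if not (exclude_self and j == k)]
    candidates = [x for x, y in refs if y >= y0]   # single DMUs producing enough
    for xp, yp in refs:
        for xq, yq in refs:
            if yp < y0 < yq:                       # mix a lower and a higher DMU
                w = (y0 - yp) / (yq - yp)          # weight on the higher one
                candidates.append((1 - w) * xp + w * xq)
    return min(candidates) / x0 if candidates else None

for k in sorted(dmus):
    print(k, vrs_input_efficiency(k), vrs_input_efficiency(k, exclude_self=True))
```

For DMU 6 the superefficiency call returns `None` because no other DMU, alone or mixed, reaches an output of 15; this corresponds to the "big" entry in the table.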



DEA graph

[Figure: Output-oriented DEA: input x, output y. CRS and VRS frontiers plotted with x from 0 to 5 on the horizontal axis and y from 0 to 30 on the vertical axis.]



DEA: standard setup

N decision making units (DMU).

Assume a linear function of n inputs produces m outputs.

There is no economic production function or cost function in basic DEA.

Assume the linear function of the inputs is minimized given the linear function of the outputs,

Equivalent: the linear function of the outputs is maximized given the linear function of the inputs.

Call inputs x and outputs y as in regression.

Call the coefficients on inputs b and the coefficients on outputs c. These are shadow prices in economics.



DEA: linear programming,

input oriented (production)

Consider DMU t, $1 \le t \le N$, N total producers to study, with m outputs and n inputs.

DEA estimates each DMU's efficiency by itself, not relative to one estimated frontier. Each DMU t has an individual input and output function.

Max, over $c_{\cdot t}$ and $b_{\cdot t}$, $\sum_{i=1}^{m} c_{it} y_{it} \,/\, \sum_{j=1}^{n} b_{jt} x_{jt}$ s.t. $b_{jt} \ge 0$ and $\sum_{i=1}^{m} c_{it} y_{ip} \,/\, \sum_{j=1}^{n} b_{jt} x_{jp} \le 1$, all DMUs p.

Linear fractional programming is difficult, maximizing the ratio of two linear functions; restate to maximize the numerator minus the denominator, which is a linear program.



DEA: avoiding

linear fractional programming

Max $\sum_{i=1}^{m} c_{it} y_{it} - \sum_{j=1}^{n} b_{jt} x_{jt}$ s.t. $b_{jt} \ge 0$, all j, and $\sum_{j=1}^{n} b_{jt} x_{jt} = 1$, a normalization of total cost, and $\sum_{i=1}^{m} c_{it} y_{ip} - \sum_{j=1}^{n} b_{jt} x_{jp} \le 0$, all DMUs p.

Note on math: given real z, functions f(z), g(z) all >0; substituting max f(z)-g(z) for max f(z)/g(z) implies that f(z) and g(z) are near 1.0, so that ln(f(z)) and ln(g(z)) are approximately linear.

Setting total costs = 1.0 is a normalization but setting total output ($\le 1$) near 1.0 is an assumption that inefficiency is not too large. Linearizing overstates large inefficiencies.

No standard errors, no statistical tests.
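
For one input and one output (m = n = 1) the linear program above can be solved by hand, which ties it back to the tiny constant-returns example. The steps below are a sketch for that special case only:

```latex
\begin{align*}
b_t x_t = 1 \;&\Rightarrow\; b_t = \frac{1}{x_t}, \\
c_t y_p - b_t x_p \le 0 \ \text{for all } p \;&\Rightarrow\;
  c_t \le \frac{x_p}{x_t\,y_p} \ \text{for all } p
  \;\Rightarrow\; c_t = \frac{1}{x_t \max_p (y_p/x_p)}, \\
c_t y_t &= \frac{y_t/x_t}{\max_p (y_p/x_p)} .
\end{align*}
```

For DMU#3 in the tiny example this gives 5.50/6.00 = 0.9167, the constant-returns efficiency reported in that table.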



DEA including prices

Basic DEA has no economic production or cost function, but see Ray (2004, Chapter 9), linear programming, with additional constraints.

Add constraints to the production or cost (linear) function using the market prices.

Maximize output given inputs but add the linear constraint on inputs that cost adds up to a total variable cost budget.

An explicit production function can be added as a constraint.

DEA for profit maximization explicitly maximizes the total revenue from outputs minus the variable cost of inputs as a linear function.



DEA: what management

consultants do

Rank DMUs by efficiency

Benchmark to efficient units

Estimate superefficiency

Use the coefficients to suggest alterations in resource allocations.

Assumption: the production function that applies to a particular DMU (farm, hospital, or university, e.g.) can be expanded or contracted linearly.

DMUs with unusual combinations of inputs can appear efficient but be very difficult to emulate.



DEA: attempted standard errors

Interpretation as MLE on efficiency: estimate a probability distribution of estimated efficiencies. The frontier is still non-stochastic; the probability distribution is descriptive and post-estimation; this is not MLE.

Chance-constrained linear programming: add a disturbance (maybe Gamma) to the non-stochastic frontier. The frontier is still non-stochastic; not MLE.

Bootstrap variances: random sampling variation in estimated efficiencies does not represent behavior of DMUs or the observed frontier in the actual data.

No method provides econometric standard errors, the reason many econometricians just say no.



Comparing DEA and SFA

Comparing SFA to DEA has not been done very much

Some work on hospitals

The correlation of efficiency estimates is not very high: 0.13-0.63 in hospitals, apparently similar elsewhere

DEA focuses on individual DMUs, while SFA focuses on estimating the frontier.



Research on frontier functions

SFA and DEA results

What systematic factors are associated with failure of SFA models:

topic (banks, farms, hospitals, states, etc.),

distribution (exponential, half-normal, truncated normal, gamma),

explanatory variables, sample size, etc.?

What systematic factors are associated with SFA and DEA results being similar or different?



Research on frontier functions:

methodology

No theoretical reason to avoid the Gamma distribution, so use it in research and compare results.

Apply SFA based on moments and compare with MLE.

Quadratic programming (minimizing the sum of squared inefficiency terms) in DEA was difficult decades ago, but today? The method of Wolfe (1959) can be used.

Aigner and Chu (1968) estimated quadratic programming and had apparently different estimates (with no standard errors) of capital-output elasticity and technology-output elasticity.

Fractional linear programming also might be feasible in DEA given modern computing resources.



Go estimate frontier functions

Economics and policy are often concerned with efficiency of banks, farms, governments, private and public agencies, for-profit and not-for-profit producers.

The weak axiom of cost minimization is reasonable; profit maximization is not required.

Stata's frontier and xtfrontier are available and Stata's restrictions can be evaded.

DEA is used by management consultants, estimated by general and specific linear programming packages.

Comparative or methodological research is possible.
