E62: Stochastic Frontier Models and Efficiency Analysis E-1
E62: Stochastic Frontier Models and Efficiency Analysis
Chapters E62-E65 present LIMDEP's programs for two types of efficiency analysis: stochastic frontier analysis (SFA) and data envelopment analysis (DEA). To a large extent, these are
competing methodologies. No formulation has yet been devised that unifies the two in a single
analytical framework. Arguably, the former is a fully parameterized model whereas the latter is
nonparametric, albeit also atheoretical in nature.
The stochastic frontier model is used in a large literature of studies of production, cost,
revenue, profit and other models of goal attainment. The model as it appears in the current literature
was originally developed by Aigner, Lovell, and Schmidt (1977). The canonical formulation that
serves as the foundation for other variations is their model,
\[ y = \beta'x + v - u, \]
where y is the observed outcome (goal attainment), \(\beta'x + v\) is the optimal, frontier goal (e.g.,
maximal production output or minimum cost) pursued by the individual, \(\beta'x\) is the deterministic part
of the frontier and \(v \sim N[0, \sigma_v^2]\) is the stochastic part. The two parts together constitute the
stochastic frontier.
The amount by which the observed individual fails to reach the optimum (the frontier) is u, where
\[ u = |U| \quad \text{and} \quad U \sim N[0, \sigma_u^2] \]
(change to \(v + u\) for a stochastic cost frontier or any setting in which the optimum is a minimum). In
this context, u is the inefficiency. This is the normal-half normal model, the basic form of the stochastic frontier model.
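The data generating process just described can be sketched in a few lines of Python. This is only an illustration, not LIMDEP code: the function name, the single regressor, and the parameter values are all hypothetical. The `cost` flag switches the sign on u for the cost frontier case. The last lines compare the simulated mean of u with the half normal mean, \(\sigma_u\sqrt{2/\pi}\).

```python
import math
import random

def simulate_frontier(n, beta, sigma_v, sigma_u, cost=False, seed=0):
    """Draw n observations from y = beta'x + v - u (or + u for a cost frontier).

    beta = (intercept, slope); x is a single hypothetical regressor.
    Returns a list of (x, y, u) tuples.
    """
    rng = random.Random(seed)
    sign = 1.0 if cost else -1.0               # cost frontier: y = beta'x + v + u
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 2.0)
        v = rng.gauss(0.0, sigma_v)            # symmetric noise, N[0, sigma_v^2]
        u = abs(rng.gauss(0.0, sigma_u))       # half normal inefficiency, u = |U|
        frontier = beta[0] + beta[1] * x + v   # the stochastic frontier
        rows.append((x, frontier + sign * u, u))
    return rows

sample = simulate_frontier(20000, beta=(1.0, 0.5), sigma_v=0.2, sigma_u=0.3)
mean_u = sum(u for _, _, u in sample) / len(sample)
expected_u = 0.3 * math.sqrt(2.0 / math.pi)    # E|U| for U ~ N[0, sigma_u^2]
print(round(mean_u, 3), round(expected_u, 3))
```

Because u is never negative, every simulated production outcome lies on or below its own stochastic frontier, while v moves the frontier itself up or down.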
Many varieties of the stochastic frontier model have appeared in the literature. A major
survey that presents an extensive catalog of these formulations is Kumbhakar and Lovell (2000).
(See, as well, Bauer (1990), Greene (2008) and several other surveys, many of which are cited in
Kumbhakar and Lovell and in Greene.) The estimator in LIMDEP computes parameter estimates for
most single equation cross section and panel data variants of the stochastic frontier model.
A large number of variants of the stochastic frontier model based on different assumptions
about the distribution of the inefficiency term, u, have been proposed in the received literature. Most of these are available in LIMDEP, as suggested in the list below. The bulk of the received
technology centers on cross section style modeling. However, recent advances include many
extensions that take advantage of the features of panel data. A large array of panel data estimators
is also supported by LIMDEP.
The conventional approach to deterministic frontier estimation is currently data envelopment
analysis. This is usually handled with linear programming techniques. The analysis assumes that
there is a frontier technology (in the same spirit as the stochastic frontier production model) that can
be described by a piecewise linear hull that envelops the observed outcomes. Some (efficient)
observations will be on the frontier while other (inefficient) individuals will be inside. The
technique produces a deterministic frontier that is generated by the observed data, so by construction,
some individuals are efficient. This is one of the fundamental differences between DEA and SFA. Data envelopment analysis is documented in Chapter E65.
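The point that some observations are efficient by construction can be seen in the simplest possible case. The sketch below assumes one input, one output and constant returns to scale, so that the envelope reduces to the ray with the largest observed output/input ratio and no linear program is needed. The function name and the data are hypothetical. The general method documented in Chapter E65 is not reproduced here.

```python
def dea_crs_single(inputs, outputs):
    """Efficiency scores with one input, one output and constant returns.

    In this special case the frontier is the ray through the origin with the
    largest observed output/input ratio; each score is a ratio to that best.
    """
    ratios = [y / x for x, y in zip(inputs, outputs)]
    best = max(ratios)
    return [r / best for r in ratios]

# Hypothetical data for five firms.
inputs = [2.0, 4.0, 5.0, 8.0, 10.0]
outputs = [1.0, 3.0, 3.0, 5.0, 6.0]
scores = dea_crs_single(inputs, outputs)
print([round(s, 3) for s in scores])
```

At least one firm always receives a score of exactly one, because the frontier is built from the data. In the stochastic frontier model, by contrast, no observation need lie on the frontier.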
The analysis of production, cost, etc. in the stochastic frontier framework involves two steps.
In the first, the frontier model is estimated, usually by maximum likelihood. In the second, the
estimated model is used to construct measures of inefficiency or efficiency. Individual specific
estimates are computed that provide the basis of comparison of firms either to absolute standards or
to each other. The sections of this chapter develop several model forms used in the first step.
Efficiency estimation, the second step, appears formally in Section E62.8. The general methodology
is then applied to the specifications already developed and to several others proposed in the sections that
follow, as well as in Chapters E63 and E64.
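A minimal sketch of the two steps for the normal-half normal production frontier follows. It is not LIMDEP's implementation. The formulas are the standard ones from the literature, not ones taken from this chapter: the log density of \(\varepsilon = v - u\) for step one, and the conditional mean \(E[u \mid \varepsilon]\) for step two. The function names, the residuals, and the variance values are hypothetical. A real application would maximize the summed log likelihood over all parameters.

```python
import math

def norm_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the lower tail."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

def norm_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def loglik_obs(eps, sigma_u, sigma_v):
    """Step 1 ingredient: log density of eps = v - u, normal-half normal case."""
    sigma = math.sqrt(sigma_u ** 2 + sigma_v ** 2)
    lam = sigma_u / sigma_v
    return (math.log(2.0 / sigma)
            - 0.5 * math.log(2.0 * math.pi)
            - 0.5 * (eps / sigma) ** 2
            + math.log(norm_cdf(-eps * lam / sigma)))

def cond_mean_u(eps, sigma_u, sigma_v):
    """Step 2: E[u | eps], the mean of a normal distribution truncated at zero."""
    sigma2 = sigma_u ** 2 + sigma_v ** 2
    mu_star = -eps * sigma_u ** 2 / sigma2
    sigma_star = sigma_u * sigma_v / math.sqrt(sigma2)
    z = mu_star / sigma_star
    return mu_star + sigma_star * norm_pdf(z) / norm_cdf(z)

# Hypothetical residuals eps_i = y_i - b'x_i and hypothetical variance estimates.
residuals = [-0.50, -0.10, 0.00, 0.25]
sigma_u, sigma_v = 0.3, 0.2
log_likelihood = sum(loglik_obs(e, sigma_u, sigma_v) for e in residuals)
u_hat = [cond_mean_u(e, sigma_u, sigma_v) for e in residuals]
te_hat = [math.exp(-u) for u in u_hat]   # efficiency when y is in logs
print(round(log_likelihood, 4), [round(u, 4) for u in u_hat])
```

The more negative the residual, the larger the estimated inefficiency, which is what makes the individual specific estimates usable for ranking firms.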
E62.2 Stochastic Frontier Model Specifications
The stochastic frontier model is
\[ y = \beta'x + v - u, \quad u = |U|. \]
In this area of study, unlike most others, estimation of the model parameters is usually not the
primary objective. Estimation and analysis of the inefficiency of individuals in the sample and of the
aggregated sample are usually of greater interest. This part of the development will present tools for
estimation of inefficiency.
Typically, the production or cost model is based on a Cobb-Douglas, translog, or other form
of logarithmic model, so that the essential form is
\[ \log y = \beta'x + v - u \]
where the components of x are generally logs of inputs for a production model or logs of output and
input prices for a cost model, or their squares and/or cross products. In this form, then, at least for
relatively small variation, u represents the proportion by which y falls short of the goal, and has a
natural interpretation as proportional or percentage inefficiency. The numerous examples below will
demonstrate. Users are also referred to the various survey sources listed earlier.
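The "relatively small variation" qualification can be checked numerically. When the dependent variable is in logs, observed y equals the frontier value times \(\exp(-u)\), so the exact proportional shortfall is \(1 - \exp(-u)\), which is close to u only when u is small. The short sketch below (the function name is hypothetical) prints both quantities for a few values of u.

```python
import math

def efficiency(u):
    """Ratio of observed y to the frontier value when log y = frontier - u."""
    return math.exp(-u)

for u in (0.02, 0.05, 0.10, 0.50):
    shortfall = 1.0 - efficiency(u)   # exact proportional shortfall
    print(f"u = {u:.2f}  exact shortfall = {shortfall:.4f}  approximation = {u:.4f}")
```

For u near 0.05 the two agree to about a tenth of a percentage point. For u = 0.5 the approximation overstates the shortfall by roughly ten percentage points.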
The results one obtains are, of course, critically dependent on the model assumed. Thus,
specification and estimation of model parameters, while perhaps of secondary interest, are
nonetheless a major first step in the model building process. In nearly all received formulations, the
random component, v, is assumed to be normally distributed with zero mean. In some models, v may
be heteroscedastic. But, in either form, the large majority of the different frontier models that have
been proposed result from variations on the distribution of the inefficiency term, u. The range of
specifications examined in this chapter includes the following:
Distributional assumptions: half normal, exponential, gamma
Partially nonparametric frontier function
Sample selection model
The following extensions are presented in Chapter E63:
Truncated normal with nonzero, heterogeneous mean in the underlying U
Heteroscedasticity in v and/or u
Heterogeneity in the parameter of the exponential or gamma distribution
Amsler et al.'s scaling model
Alvarez et al.'s model of fixed, latent management
A number of treatments for panel data are presented in Chapter E64.
E62.3 Basic Commands for Stochastic Frontier Models
The command for all specifications of the stochastic frontier model is
FRONTIER ; Lhs = y ; Rhs = one, ... ; other specifications $
NOTE: The constant term, one, must be the first variable in the Rhs list in all model specifications.
The default specification is Aigner, Lovell and Schmidt's canonical normal-half normal model. The default form is a production frontier model,
\[ y = \beta'x + v - u, \quad u = |U|. \]
That is, the right hand side of the equation specifies the maximum goal attainable. To specify a cost
frontier model or other model in which the frontier represents a minimum, so that
\[ y = \beta'x + v + u, \quad u = |U|, \]
use
This specification is used in all forms of the stochastic frontier model. As noted below, one
additional specification you may find useful is
; Start = values for \(\beta\), \(\lambda\), \(\sigma\).
(The meanings of the parameters are developed below.) ALS also developed the normal-exponential
model, in which u has an exponential distribution rather than a half normal distribution. To request
the exponential model, use
; Model = Exponential (or ; Model = E )
in the FRONTIER command. For this model, the parameters are \((\beta, \theta, \sigma_v)\). Further details appear below. Several other model forms and numerous modifications, such as heteroscedasticity,
are also developed below.
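To see how the exponential alternative changes the inefficiency term, the sketch below draws the composed error \(v - u\) with u exponentially distributed. It assumes that the exponential parameter, called `theta` here, is a rate, so that the density of u is \(\theta e^{-\theta u}\) and \(E[u] = 1/\theta\). That parameterization, the function name, and the numerical values are assumptions made for illustration, not a statement of LIMDEP's internals.

```python
import random

def draw_normal_exponential(n, theta, sigma_v, seed=0):
    """Draw (v - u, u) pairs with v ~ N[0, sigma_v^2] and u ~ exponential(rate=theta)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        v = rng.gauss(0.0, sigma_v)    # symmetric noise
        u = rng.expovariate(theta)     # one sided inefficiency, mean 1/theta
        draws.append((v - u, u))
    return draws

theta, sigma_v = 4.0, 0.2
draws = draw_normal_exponential(20000, theta, sigma_v)
mean_u = sum(u for _, u in draws) / len(draws)
mean_eps = sum(e for e, _ in draws) / len(draws)
print(round(mean_u, 3), round(mean_eps, 3))
```

Under this assumption the simulated mean of u should be close to 1/theta and the mean of the composed error close to its negative, since v has mean zero.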
This is the full list of general specifications that are applicable to this model estimator.
Controlling Output from Model Commands
; Par keeps ancillary parameters \(\lambda\), \(\sigma\), etc. with main parameter vector in b.
; OLS displays least squares starting values when (and if) they are computed.
; Table = name saves model results to be combined later in output tables.
Robust Asymptotic Covariance Matrices
; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown),
same as ; Printvc.
; Choice uses choice based sampling (sandwich with weighting) estimated matrix.
; Cluster = spec requests computation of the cluster form of corrected covariance estimator.
Optimization Controls for Nonlinear Optimization
; Start = list gives starting values for a nonlinear model.
; Tlg [ = value] sets convergence value for gradient.
; Tlf [ = value] sets convergence value for function.
; Tlb [ = value] sets convergence value for parameters.
; Alg = name requests a particular algorithm, Newton, DFP, BFGS, etc.
; Maxit = n sets the maximum iterations.
; Output = n requests technical output during iterations; the level n is 1, 2, 3 or 4.
; Set keeps current setting of optimization parameters as permanent.