<ul><li><p>E62: Stochastic Frontier Models and Efficiency Analysis</p>
<p>E62.1 Introduction</p>
<p>Chapters E62-E65 present LIMDEP's programs for two types of efficiency analysis, stochastic frontier analysis (SFA) and data envelopment analysis (DEA). To a large extent, these are competing methodologies. No formulation has yet been devised that unifies the two in a single analytical framework. Arguably, the former is a fully parameterized model whereas the latter is nonparametric, albeit also atheoretical in nature.</p>
<p>The stochastic frontier model is used in a large literature of studies of production, cost, revenue, profit and other models of goal attainment. The model as it appears in the current literature was originally developed by Aigner, Lovell, and Schmidt (1977). The canonical formulation that serves as the foundation for other variations is their model,</p>
<p>\[ y = \beta'x + v - u, \]</p>
<p>where \(y\) is the observed outcome (goal attainment), \(\beta'x + v\) is the optimal, frontier goal (e.g., maximal production output or minimum cost) pursued by the individual, \(\beta'x\) is the deterministic part of the frontier and \(v \sim N[0, \sigma_v^2]\) is the stochastic part. The two parts together constitute the stochastic frontier. The amount by which the observed individual fails to reach the optimum (the frontier) is \(u\), where</p>
<p>\[ u = |U| \quad \text{and} \quad U \sim N[0, \sigma_u^2] \]</p>
<p>(change to \(v + u\) for a stochastic cost frontier or any setting in which the optimum is a minimum). In this context, \(u\) is the inefficiency. This is the normal-half normal model, which is the basic form of the stochastic frontier model.</p>
<p>Many varieties of the stochastic frontier model have appeared in the literature. A major survey that presents an extensive catalog of these formulations is Kumbhakar and Lovell (2000). 
(See, as well, Bauer (1990), Greene (2008) and several other surveys, many of which are cited in Kumbhakar and Lovell and in Greene.) The estimator in LIMDEP computes parameter estimates for most single equation cross section and panel data variants of the stochastic frontier model.</p>
<p>A large number of variants of the stochastic frontier model, based on different assumptions about the distribution of the inefficiency term, \(u\), have been proposed in the received literature. Most of these are available in LIMDEP, as suggested in the list below. The bulk of the received technology centers on cross section style modeling. However, recent advances include many extensions that take advantage of the features of panel data. A large array of panel data estimators is also supported by LIMDEP.</p></li><li>
<p>The conventional approach to deterministic frontier estimation is currently data envelopment analysis. This is usually handled with linear programming techniques. The analysis assumes that there is a frontier technology (in the same spirit as the stochastic frontier production model) that can be described by a piecewise linear hull that envelops the observed outcomes. Some (efficient) observations will be on the frontier while other (inefficient) individuals will be inside. The technique produces a deterministic frontier that is generated by the observed data, so by construction, some individuals are efficient. This is one of the fundamental differences between DEA and SFA. Data envelopment analysis is documented in Chapter E65.</p>
<p>The analysis of production, cost, etc. in the stochastic frontier framework involves two steps. In the first, the frontier model is estimated, usually by maximum likelihood. 
In the second, the estimated model is used to construct measures of inefficiency or efficiency. Individual specific estimates are computed that provide the basis of comparison of firms either to absolute standards or to each other. The sections of this chapter develop several model forms used in the first step. Efficiency estimation, the second step, appears formally in Section E62.8. The general methodology is then applied to the specifications already developed and to several proposed in the sections that follow, as well as in Chapters E63 and E64.</p>
<p>E62.2 Stochastic Frontier Model Specifications</p>
<p>The stochastic frontier model is</p>
<p>\[ y = \beta'x + v - u, \quad u = |U|. \]</p>
<p>In this area of study, unlike most others, estimation of the model parameters is usually not the primary objective. Estimation and analysis of the inefficiency of individuals in the sample and of the aggregated sample are usually of greater interest. This part of the development will present tools for estimation of inefficiency.</p>
<p>Typically, the production or cost model is based on a Cobb-Douglas, translog, or other form of logarithmic model, so that the essential form is</p>
<p>\[ \log y = \beta'x + v - u \]</p>
<p>where the components of \(x\) are generally logs of inputs for a production model or logs of output and input prices for a cost model, or their squares and/or cross products. In this form, then, at least for relatively small variation, \(u\) represents the proportion by which \(y\) falls short of the goal, and has a natural interpretation as proportional or percentage inefficiency. The numerous examples below will demonstrate. Users are also referred to the various survey sources listed earlier.</p>
<p>The results one obtains are, of course, critically dependent on the model assumed. 
Thus, specification and estimation of model parameters, while perhaps of secondary interest, are nonetheless a major first step in the model building process. In nearly all received formulations, the random component, \(v\), is assumed to be normally distributed with zero mean. In some models, \(v\) may be heteroscedastic. But, in either form, the large majority of the different frontier models that have been proposed result from variations on the distribution of the inefficiency term, \(u\). The range of specifications examined in this chapter includes the following:</p>
<p>• Distributional assumptions: half normal, exponential, gamma</p>
<p>• Partially nonparametric frontier function</p>
<p>• Sample selection model</p></li><li>
<p>The following extensions are presented in Chapter E63:</p>
<p>• Truncated normal with nonzero, heterogeneous mean in the underlying \(U\)</p>
<p>• Heteroscedasticity in \(v\) and/or \(u\)</p>
<p>• Heterogeneity in the parameter of the exponential or gamma distribution</p>
<p>• Amsler et al.'s scaling model</p>
<p>• Alvarez et al.'s model of fixed, latent management</p>
<p>A number of treatments for panel data are presented in Chapter E64.</p>
<p>E62.3 Basic Commands for Stochastic Frontier Models</p>
<p>The command for all specifications of the stochastic frontier model is</p>
<p>FRONTIER ; Lhs = y ; Rhs = one, ... ; other specifications $</p>
<p>NOTE: The variable one must be the first variable in the Rhs list in all model specifications.</p>
<p>The default specification is Aigner, Lovell and Schmidt's canonical normal-half normal model. The default form is a production frontier model,</p>
<p>\[ y = \beta'x + v - u, \quad u = |U|. \]</p>
<p>That is, the right hand side of the equation specifies the maximum goal attainable. 
To specify a cost frontier model or other model in which the frontier represents a minimum, so that</p>
<p>\[ y = \beta'x + v + u, \quad u = |U|, \]</p>
<p>use</p>
<p>; Cost</p>
<p>This specification is used in all forms of the stochastic frontier model. As noted below, one additional specification you may find useful is</p>
<p>; Start = values for \(\beta\), \(\lambda\), \(\sigma\).</p>
<p>(The meanings of the parameters are developed below.) ALS also developed the normal-exponential model, in which \(u\) has an exponential distribution rather than a half normal distribution. To request the exponential model, use</p>
<p>; Model = Exponential (or ; Model = E )</p>
<p>in the FRONTIER command. For this model, the parameters are (\(\beta\), \(\theta\), \(\sigma_v\)). Further details appear below. There are also several other model forms and numerous modifications, such as heteroscedasticity, that are developed below.</p></li><li>
<p>This is the full list of general specifications that are applicable to this model estimator.</p>
<p>Controlling Output from Model Commands</p>
<p>; Par keeps ancillary parameters \(\lambda\), \(\sigma\), etc. with main parameter vector in b.</p>
<p>; OLS displays least squares starting values when (and if) they are computed.</p>
<p>; Table = name saves model results to be combined later in output tables.</p>
<p>Robust Asymptotic Covariance Matrices</p>
<p>; Covariance Matrix displays estimated asymptotic covariance matrix (normally not shown), same as ; Printvc.</p>
<p>; Choice uses choice based sampling (sandwich with weighting) estimated matrix.</p>
<p>; Cluster = spec requests computation of the cluster form of corrected covariance estimator.</p>
<p>Optimization Controls for Nonlinear Optimization</p>
<p>; Start = list gives starting values for a nonlinear model.</p>
<p>; Tlg [ = value] sets convergence value for gradient.</p>
<p>; Tlf [ = value] sets convergence value for function.</p>
<p>; Tlb [ = value] sets convergence value for parameters. 
</p>
<p>; Alg = name requests a particular algorithm, Newton, DFP, BFGS, etc.</p>
<p>; Maxit = n sets the maximum iterations.</p>
<p>; Output = n requests technical output during iterations; the level n is 1, 2, 3 or 4.</p>
<p>; Set keeps current setting of optimization parameters as permanent.</p>
<p>Predictions and Residuals</p>
<p>; List displays a list of fitted values with the model estimates.</p>
<p>; Keep = name keeps fitted values as a new (or replacement) variable in data set.</p>
<p>; Res = name keeps residuals as a new (or replacement) variable.</p>
<p>; Fill fills missing values (outside estimating sample) for fitted values.</p>
<p>Hypothesis Tests and Restrictions</p>
<p>; Test: spec defines a Wald test of linear restrictions.</p>
<p>; Wald: spec defines a Wald test of linear restrictions, same as ; Test: spec.</p>
<p>; CML: spec defines a constrained maximum likelihood estimator.</p>
<p>; Rst = list specifies equality and fixed value restrictions.</p>
<p>; Maxit = 0 ; Start = the restricted values specifies Lagrange multiplier test.</p></li><li>
<p>E62.3.1 Predictions, Residuals and Partial Effects</p>
<p>Predicted values and residuals for the stochastic frontier models are computed as follows. The same forms are used for cross section and panel data forms. The predicted value is \(\beta'x\). (These are rarely useful in this setting.) The residual is computed directly as</p>
<p>\[ e_i = y_i - \beta'x_i. \]</p>
<p>This residual is usually not of interest in itself. It is, however, the crucial ingredient in the efficiency estimator discussed in Section E62.8. The estimator of \(u_i\) that we will use is computed by the Jondrow et al. (JLMS) formula, \(E[u|v-u]\), or \(E[u|v+u]\) if based on a cost frontier,</p>
<p>\[ E[u|\varepsilon] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\phi(w)}{1-\Phi(w)} - w\right], \quad \varepsilon = v - u, \quad w = \varepsilon\lambda/\sigma, \]</p>
<p>\[ \sigma^2 = \sigma_u^2 + \sigma_v^2, \quad \lambda = \sigma_u/\sigma_v. \]</p>
<p>(For a cost frontier, \(\varepsilon = v + u\) and the sign of \(w\) is reversed.) In the JLMS formula, \(e_i\) is the estimator of \(\varepsilon_i\). The formulas and computations are discussed in Section E62.8. 
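</p>
<p>The JLMS calculation can be sketched outside LIMDEP as follows. This is a minimal illustration, not LIMDEP's implementation: it assumes the standard normal-half normal form \(E[u|\varepsilon] = \frac{\sigma\lambda}{1+\lambda^2}\left[\frac{\phi(w)}{1-\Phi(w)} - w\right]\) with \(w = \varepsilon\lambda/\sigma\), and the function name <code>jlms_inefficiency</code>, its arguments and the <code>cost</code> flag are hypothetical names chosen for this example.</p>

```python
import math


def std_normal_pdf(z):
    """Standard normal density phi(z)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def std_normal_cdf(z):
    """Standard normal distribution function Phi(z), via erfc."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def jlms_inefficiency(residual, sigma_u, sigma_v, cost=False):
    """JLMS estimate E[u | eps] for the normal-half normal model.

    residual : e_i = y_i - b'x_i, the estimate of eps_i
    sigma_u  : standard deviation of U, where u = |U|
    sigma_v  : standard deviation of v
    cost     : hypothetical flag; True means eps = v + u (cost frontier),
               which is handled by reversing the sign of the residual
    """
    sigma = math.hypot(sigma_u, sigma_v)   # sqrt(sigma_u^2 + sigma_v^2)
    lam = sigma_u / sigma_v                # lambda
    eps = -residual if cost else residual
    w = eps * lam / sigma
    # 1 - Phi(w) equals Phi(-w); this form avoids subtracting from 1.
    hazard = std_normal_pdf(w) / std_normal_cdf(-w)
    return sigma * lam / (1.0 + lam * lam) * (hazard - w)


if __name__ == "__main__":
    # Illustrative residuals only; these are not estimates from any data set.
    for e in (-0.5, 0.0, 0.5):
        print(e, jlms_inefficiency(e, sigma_u=1.0, sigma_v=1.0))
```

<p>With \(\sigma_u = \sigma_v = 1\) and a zero residual, the function returns \(1/\sqrt{\pi}\), about 0.564. On a production frontier, more negative residuals give larger estimates of \(u_i\), and every estimate is positive. 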
</p>
<p>The frontier model is, save for its involved disturbance term, a linear regression model. The conditional mean in the model is</p>
<p>\[ E[y_i|x_i] = \beta'x_i - E[u_i|x_i]. \]</p>
<p>In most cases, \(E[u_i|x_i]\) is not a function of \(x_i\), so the derivatives of \(E[y_i|x_i]\) with respect to \(x_i\) are just \(\beta\). In other cases we will consider, the conditional mean of \(u_i\) does depend on \(x_i\) or other variables, so the partial effects in the model might be more involved than this. Once again, however, these will usually not be of direct interest in the study. But, in all cases, \(E[u|\varepsilon]\) will be an involved function of \(x_i\) and any other variables that appear anywhere else in the model. We will examine the partial effects on the efficiency estimators in Section E62.8.</p>
<p>E62.3.2 Results Saved by the Frontier Estimator</p>
<p>The results saved by the frontier estimator are</p>
<p>Matrices: b = regression parameters, \(\beta\); varb = asymptotic covariance matrix</p>
<p>Scalars: sy, ybar, nreg, kreg, and logl</p>
<p>Last Function: JLMS estimator of \(u_i\).</p></li><li>
<p>Use ; Par to add the ancillary parameters to these. The ancillary parameters that are estimated for the various models are as follows, including the scalars saved by the estimation program:</p>
<p>Half and truncated normal: estimates \(\lambda\), \(\sigma\); saves lmda and s = \(\sigma\)</p>
<p>Truncated normal: same as half normal; estimates \(\mu\), saved as mu</p>
<p>Exponential: estimates \(\theta\), \(\sigma_v\); saves theta and s = \(\sigma_v\)</p>
<p>Heteroscedastic model: average value of \(\sigma\) as s, average value of \(\lambda\) as lmda</p>
<p>Heterogeneity in mean: estimates \(\lambda\), \(\sigma\); saves lmda and s = \(\sigma\).</p>
<p>E62.4 Data for the Analysis of Frontier Models</p>
<p>We will use two data sets to illustrate the frontier estimators. The first, the data on U.S. airlines, is a panel data set that we will use primarily for illustrating the stochastic frontier model. 
The second, the famous WHO data on health care attainment, will be used both for the stochastic frontier models and for the later work on data envelopment analysis.</p>
<p>E62.4.1 Data on U.S. Airlines</p>
<p>We will develop several examples in this section using a panel data set on the U.S. airline industry from the pre-deregulation period (airlines.dat). The observations are an unbalanced panel on 25 airlines. The original balanced panel data set contained 15 observations (1970-1984) on each of 25 airlines. Mergers, strikes and other data problems reduced the sample to the unbalanced panel of 256 observations. The group sizes (number of firms) are 2 (4), 4 (1), 7 (1), 9 (3), 10 (3), 11 (1), 12 (2), 13 (1), 14 (3) and 15 (6). The variables in the data set are</p>
<p>firm = ID, 1,...,25; year = 1970,...,1984; t = year - 1969 = 1,...,15</p>
<p>cost = total cost; revenue = revenue; output = total output</p>
<p>stage = average stage length; points = number of points served; loadfct = load factor</p>
<p>cmtl = materials cost; mtl = materials quantity; pm = price of material</p>
<p>cfuel = fuel cost; fuel = fuel quantity; pf = fuel price</p>
<p>ceqpt = equipment cost; eqpt = equipment quantity; pe = equipment price</p>
<p>clabor = labor cost; labor = labor quantity; pl = labor price</p>
<p>cprop = property cost; property = property quantity; pp = property price</p>
<p>k = capital index; pk = capital price index</p>
<p>Transformed variables used in the examples are as follows:</p>
<p>lc = log(cost); cn = cost/pp; lcn = log(cn)</p>
<p>lpm = log(pm); lpf = log(pf); lpe = log(pe)</p>
<p>lpl = log(pl); lpp = log(pp); lpk = log(pk)</p>
<p>lpmpp = log(pm/pp); lpfpp = log(pf/pp); lpepp = log(pe/pp)</p>
<p>lplpp = log(pl/pp); lf = log(fuel); lm = log(mtl)</p>
<p>le = log(eqpt); ll = log(labor); lp = log(property)</p>
<p>lq = log(output); lq2 = lq²</p></li><li>
<p>E62.4.2 World 
Health Organization (WHO) Health Attainment Data </p><p> The data used by the WHO in their 2000 World Health Report assessment of health care </p><p>attainment by 191 countries have been used by many researchers worldwide both for developing </p><p>frontier models and for analyzing health outcomes. The data are a panel of five years, 1993-1997, on </p><p>health outcome data for 191 countries and a number of internal political units, e.g., the states of </p><p>Mexico. The ma...</p></li></ul>

