
Frontier Functions: Stochastic Frontier Analysis SFA Data Envelopment Analysis DEA

## Frontier Functions: Stochastic Frontier Analysis (SFA) and Data Envelopment Analysis (DEA)


**1. **Frontier Functions: Stochastic Frontier Analysis (SFA) & Data Envelopment Analysis (DEA)

**3. **Frontier functions: definition
None of those standard econometric models is the answer.
The answer is frontier functions: either econometric stochastic frontier analysis (SFA) or linear-programming data envelopment analysis (DEA).
Frontier functions estimate maxima or minima of a dependent variable given explanatory variables, usually to estimate production or cost functions.
All frontier functions come from one paper, Aigner and Chu (1968).

**4. **Aigner and Chu (1968)
D.J. Aigner and S.F. Chu (AER 1968), “On Estimating the Industry Production Function,” invented this area.
“A viable distinction between the average and frontier functions as predictors of capacity…derives from a probability interpretation of alternative forecasts….the frontier we construct is truly a surface of maximum points.”
This became Stochastic Frontier Analysis; “stochastic” refers to the probability interpretation.
Estimation, for primary metals production in state aggregates:
“one stage least squares” and two stage least squares,
quadratic programming (now rarely estimated), and
linear programming, developed into Data Envelopment Analysis in Charnes, Cooper, and Rhodes (1978) and subsequent research.

**5. **Varian (1984)
Varian shows how to estimate and test for the Weak Axiom of Cost Minimization (WACM) and other microeconomic assumptions
Varian suggests using either regression (SFA) or linear programming (DEA)
The WACM applies to for-profit, not-for-profit, private, and public producers
The only requirement is that minimum inputs are intended to be used to produce desired output, or maximum output is intended from inputs used
Profit maximization is not required

**6. **SFA and DEA
Two large differences and another possible difference
SFA has a stochastic frontier with a probability distribution
DEA has a non-stochastic frontier
SFA has one output, or an a priori weighted average of multiple outputs
DEA often has more than one output, no a priori weights, but assumes input-output separability
Both can have stochastic inefficiency: SFA always does, DEA sometimes does

**7. **One-sided disturbances
In frontier functions, the disturbance has a distribution all on one side of zero:
the maximum production must be greater than or equal to any value in the sample,
the minimum cost must be less than or equal to any value in the sample.
produced quantities are bounded by the maximum, with non-positive disturbances
costs are bounded by the minimum, with non-negative disturbances

**8. **MLE with a one-sided disturbance does not work well
MLE and the Cramér-Rao lower bound (minimum variance of an asymptotically unbiased estimator, usually the MLE) are questionable!
Begin with a likelihood function $L$, which gives the probability of the data $x$ given the parameters $\theta$.
The parameters might be the mean and standard deviation or might just be mathematical parameters.

**9. **Setting up MLE
Limits are a function of parameters in a non-stochastic frontier function: production function (max), cost function (min)
$L$ is the likelihood, $L^*$ is the log likelihood.
$L(x \mid \theta)$ is always a probability distribution, so it integrates to 1.0 over the range of the data, from lower bound $A$ to upper bound $Z$:
$\int_A^Z L(x \mid \theta)\,dx = 1$.
Take the derivative with respect to $\theta$:

**10. **MLE: problems
$\int_A^Z [\partial L(x \mid \theta)/\partial\theta]\,dx + [dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) = 0$
$E(\partial L^*/\partial\theta) + [dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) = 0$.
The first derivatives of the log likelihood do not have mean 0 if those extra terms stay.
Second derivatives add more unwanted derivatives if the limits are functions of the parameters.
The negative inverse Hessian is not the variance of the MLE.
This is not working at all.
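
A worked case may help here. It is an illustration chosen for simplicity, not an example from the slides: take $x$ uniform on $[0, \theta]$, so the lower limit $A = 0$ is fixed and the upper limit $Z = \theta$ is itself the parameter.

$$
\begin{aligned}
L(x \mid \theta) &= 1/\theta, \qquad L^* = -\ln\theta, \qquad \partial L^*/\partial\theta = -1/\theta \\
E(\partial L^*/\partial\theta) &= -1/\theta \neq 0 \\
[dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) &= 1 \cdot (1/\theta) - 0 = 1/\theta \\
E(\partial L^*/\partial\theta) + [dZ/d\theta]\,L(Z) - [dA/d\theta]\,L(A) &= -1/\theta + 1/\theta = 0
\end{aligned}
$$

The identity holds only because of the boundary term; the expected score alone is $-1/\theta$, not zero.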

**11. **MLE: possible repairs
Make the frontier stochastic and limits of production or cost not a function of the parameters, completely eliminating the problem.
Make the probability distribution have a pdf of 0 and derivatives of 0 at the limits, even though the limit itself is a function of the parameters: $[dZ/d\theta]\,L(Z) = 0$ and $[dA/d\theta]\,L(A) = 0$.
The Gamma Distribution can do that (Greene (1980)).

**12. **The Gamma Distribution
The Gamma Distribution describes a non-negative random variable with two parameters, $\alpha$ (shape) and $\theta$ (spread).
If $\varepsilon \sim G(\alpha, \theta)$, then $E(\varepsilon) = \alpha/\theta$ and $V(\varepsilon) = \alpha/\theta^2$.
$\mathrm{pdf}(\varepsilon) = \theta^{\alpha} \exp(-\theta\varepsilon)\,\varepsilon^{\alpha-1}/\Gamma(\alpha)$,
with different shapes for $\alpha > 0$, in ranges: less than 1, 1, between 1 and 2, 2, greater than 2.
A graph follows; $\alpha > 2$ is required for the pdf and its derivatives to be zero at the limits.
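
The shape ranges can be checked numerically. The sketch below is only an illustration: it evaluates the Gamma density and its first derivative very close to the lower limit 0, with the second parameter set to 1 for convenience. The function names are hypothetical, not from any package.

```python
import math

def gamma_pdf(e, alpha, theta):
    """Gamma density theta^alpha * exp(-theta*e) * e^(alpha-1) / Gamma(alpha), e > 0."""
    return theta ** alpha * math.exp(-theta * e) * e ** (alpha - 1) / math.gamma(alpha)

def gamma_pdf_slope(e, alpha, theta):
    """Analytic first derivative of the density with respect to e."""
    return gamma_pdf(e, alpha, theta) * ((alpha - 1) / e - theta)

def behaviour_near_zero(alpha, theta=1.0, e=1e-8):
    """Density and slope evaluated just above the lower limit."""
    return gamma_pdf(e, alpha, theta), gamma_pdf_slope(e, alpha, theta)

for alpha in (0.5, 1.0, 1.5, 2.0, 3.0):
    pdf0, slope0 = behaviour_near_zero(alpha)
    print(f"alpha={alpha}: pdf near 0 = {pdf0:.3g}, slope near 0 = {slope0:.3g}")
```

Only the shape above 2 gives both a density and a slope that vanish at the limit; at shape 2 the density vanishes but the slope does not.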

**13. **Gamma Distribution: shapes

**14. **Ok, so a Gamma Distribution?
No, not really.
The parameters are restricted mathematically. That really annoys researchers.
Some other distribution? No, no other one-sided distribution has the required properties at the limits.
This is why no one uses just one disturbance $\varepsilon$.

**15. **Composite disturbances
The disturbance has two parts:
Stochastic frontier ($v$), unlimited range as usual. The limits of the production or cost function are at infinity, not a function of the parameters.
Inefficiency ($u$), one-sided: non-positive for production, non-negative for cost.
Finally, $y_j = x_j'\beta + u_j + v_j$, that is, $\varepsilon_j = u_j + v_j$.
So there are two disturbance terms to keep the parameters from affecting the limits.
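
A small simulation shows what the composite disturbance looks like. This is a sketch under assumptions that are not in the slides: a half-normal distribution for the one-sided part, standard deviations of 1.0 and 0.5, and hypothetical function names.

```python
import random

def simulate_composite(n, sigma_u=1.0, sigma_v=0.5, cost=False, seed=0):
    """Draw n composite disturbances e = u + v.

    v is normal (two-sided); u is half-normal and one-sided:
    non-positive for production, non-negative for cost.
    """
    rng = random.Random(seed)
    sign = 1.0 if cost else -1.0
    return [sign * abs(rng.gauss(0.0, sigma_u)) + rng.gauss(0.0, sigma_v)
            for _ in range(n)]

def skewness(values):
    """Standardized third central moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((e - mean) ** 2 for e in values) / n
    m3 = sum((e - mean) ** 3 for e in values) / n
    return m3 / m2 ** 1.5

production = simulate_composite(20000)
cost = simulate_composite(20000, cost=True)
print("production skewness:", round(skewness(production), 3))
print("cost skewness:", round(skewness(cost), 3))
print("largest production disturbance:", round(max(production), 3))
```

The composite disturbance is skewed toward the inefficiency side, but its range is unbounded in both directions because of $v$, so the limits no longer depend on the parameters.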

**16. **Panel data: Fixed effects
Panel data researchers would like to include fixed or random effects in everything, so why not frontier models?
Greene (2005) addresses this in detail.
Fixed effects have special problems in non-linear models, but they can work
Random effects are offered by Stata.
Now there are three disturbance terms!
$y_{jt} = x_{jt}'\beta + \alpha_j + u_{jt} + v_{jt}$

**17. **Fixed effects in non-linear models
Fixed effects have well known advantages in linear models, but in non-linear models they:
are inconsistent (too small sample for each fixed effect),
cannot be differenced out (differences of non-linear models are still non-linear),
spread their inconsistency to other coefficients (assuming correlation with other explanatory variables, which is the motivation for fixed rather than random effects).

**18. **Wait, maybe fixed effects are ok
With few units and many observations, fixed effects work because the sample size for each fixed effect might be large enough. Greene (2005) points this out.
Stata refuses to enter fixed effects in the model, but the user can enter them directly.
Random effects, normally distributed, are offered by Stata. As always, they must be assumed to be uncorrelated with explanatory variables.
The independence assumption cannot be tested by Stata, and there is no Hausman test, but…
Estimate fixed effects by direct inclusion and regress the fixed effects on explanatory variables to test the independence required for consistent random effects.

**19. **Stata: all MLE, all the time
Stata offers MLE with composite disturbances.
The one-sided distribution is half-normal, truncated normal, or exponential (restricted Gamma)
frontier dependent explanatory, d(hn) or d(tn) or d(e)
In Stata, “u” is one-sided inefficiency and “v” is the two-sided stochastic frontier. Stata uses notation from Greene (1990) in which $\lambda$ is the ratio of standard deviations $\sigma_u/\sigma_v$, so that $\lambda = 0$ means there is no inefficiency.
Fixed effects sneaked in by the user under frontier, or random effects by Stata (normally distributed).
xtfrontier dependent explanatory, re i(group_id)
For minimization, use the option “, cost”

**20. **Stata: heteroscedasticity
Stata offers a lot of heteroscedasticity: either u or v can be heteroscedastic, or both.
Heteroscedastic u (one-sided error, inefficiency)
Heteroscedastic v (two-sided error, random variation)
The same explanatory variables, or different variables, can appear in the frontier and in the heteroscedasticity.
frontier…, uhet(var_name) vhet(var_name)

**21. **Stata estimates the inefficiency
Stata estimates the technical efficiency: the percentage of estimated frontier output attained, or the extra percentage spent beyond frontier cost.
predict var_name, te
As usual, many other options exist using predict.
Successful Stata estimation is illustrated at this point.

**22. **Is MLE necessary?
If you always use Stata’s options, yes!
If not, no!
Not-MLE (1) Corrected OLS
Not-MLE (2) Fixed effects in panels
Not-MLE (3) Gamma-distributed inefficiency
Note: the Gamma distribution or any other distribution of inefficiency is unrestricted if MLE is not used; only MLE has a range problem

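**23. **Not MLE (1) Corrected OLS
Estimate OLS: that’s all, just OLS
$y_j = x_j'\beta + \varepsilon_j$
Estimate residuals $\varepsilon_j$ and interpret them as inefficiency.
Assuming production, inefficiency < 0 and the most efficient unit has $\varepsilon_{\max} = \max(\varepsilon_1, \varepsilon_2, \ldots, \varepsilon_n)$.
Inefficiency of unit $j$ = $\varepsilon_{\max} - \varepsilon_j$.
For a cost function, substitute min and $\varepsilon_j - \varepsilon_{\min}$.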
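
The corrected OLS steps can be sketched in a few lines. This is a minimal illustration with one explanatory variable and made-up data; the function names and the numbers are hypothetical, not from the presentation.

```python
def ols_simple(x, y):
    """Intercept and slope of a one-regressor OLS fit."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
             / sum((xi - xbar) ** 2 for xi in x))
    return ybar - slope * xbar, slope

def corrected_ols(x, y, cost=False):
    """Shift the OLS line to the extreme residual and measure inefficiency."""
    intercept, slope = ols_simple(x, y)
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    if cost:
        bound = min(resid)                    # frontier is the minimum
        ineff = [e - bound for e in resid]
    else:
        bound = max(resid)                    # frontier is the maximum
        ineff = [bound - e for e in resid]
    return intercept + bound, slope, ineff

# Made-up production data: one input x, one output y.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 2.9, 4.2, 4.8, 6.1]
frontier_intercept, frontier_slope, inefficiency = corrected_ols(x, y)
print(frontier_intercept, frontier_slope)
print([round(v, 3) for v in inefficiency])
```

The most efficient unit has inefficiency 0 by construction, and every observation lies on or below the shifted line.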

**24. **Not MLE (2) Fixed effects as inefficiency
Schmidt and Sickles (1984), but not in Stata; fixed effects required!
Given panel data and fixed effects, assume that inefficiency is the fixed effect.
Estimate $y_{jt} = x_{jt}'\beta + \alpha_j + v_{jt}$ by xtreg.
Predict the fixed effects $\alpha_j$ and define the most efficient (production) $\alpha_{\max}$.
Inefficiency = $\alpha_{\max} - \alpha_j$.
For cost functions, use min and reverse the sign.
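
The fixed-effects approach can be sketched without any package. This is an illustration only, assuming one explanatory variable, the usual within (demeaned) estimator, and a tiny noise-free panel invented for the purpose; the function name is hypothetical.

```python
def fixed_effects_inefficiency(groups, x, y, cost=False):
    """Within estimator with one regressor; inefficiency from the fixed effects."""
    ids = sorted(set(groups))
    xbar, ybar = {}, {}
    for g in ids:
        rows = [i for i, gi in enumerate(groups) if gi == g]
        xbar[g] = sum(x[i] for i in rows) / len(rows)
        ybar[g] = sum(y[i] for i in rows) / len(rows)
    # Slope from deviations around each unit's own means.
    num = sum((x[i] - xbar[g]) * (y[i] - ybar[g]) for i, g in enumerate(groups))
    den = sum((x[i] - xbar[g]) ** 2 for i, g in enumerate(groups))
    slope = num / den
    effects = {g: ybar[g] - slope * xbar[g] for g in ids}
    if cost:
        best = min(effects.values())
        ineff = {g: a - best for g, a in effects.items()}
    else:
        best = max(effects.values())
        ineff = {g: best - a for g, a in effects.items()}
    return slope, effects, ineff

# Invented panel: y = 2x + unit effect, with effects A=1.0, B=0.5, C=0.2.
groups = ["A", "A", "A", "B", "B", "B", "C", "C", "C"]
x = [1.0, 2.0, 3.0, 2.0, 3.0, 5.0, 1.0, 4.0, 6.0]
true_effect = {"A": 1.0, "B": 0.5, "C": 0.2}
y = [2.0 * xi + true_effect[g] for xi, g in zip(x, groups)]
slope, effects, ineff = fixed_effects_inefficiency(groups, x, y)
print(slope, effects, ineff)
```

Unit A has the largest fixed effect, so it defines the frontier and the other units are measured against it.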

**25. **Not MLE (3) Gamma-distributed inefficiency
Greene (1990), not in Stata.
Not a panel: estimate $y_j = x_j'\beta + (u_j + v_j)$ by reg and predict the residuals $\varepsilon_j = u_j + v_j$.
Adjust residuals to one side of 0 by the max or min; the constant absorbs $\varepsilon_{\max}$ or $\varepsilon_{\min}$.
Assume $v \sim (0, \sigma_v^2)$ and $u \sim$ a Gamma distribution, and estimate $\sigma_v^2$, $\alpha$, and $\theta$.
$E(\varepsilon)$ isn’t useful, fixed to 0 by OLS, but…

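**26. **Not MLE (3) Gamma-distributed inefficiency
$V(\varepsilon) = \sigma_v^2 + \alpha/\theta^2$
Skewness($\varepsilon$), as the third central moment, $= 2\alpha/\theta^3$ (sign reversed for production, where $u$ is non-positive)
Kurtosis($\varepsilon$), as the fourth central moment, $= 3V^2(\varepsilon) + 6\alpha/\theta^4$ (assuming normal $v$)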
Three equations in three unknowns: V(v), two parameters of the distribution of u
Standard errors by delta method or GMM
But the range of the data is a function of the parameters? No problem, not MLE!
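
The three moment equations have a closed-form solution. The sketch below assumes normal $v$, non-negative Gamma $u$ with mean shape/rate, and a fourth central moment of $3V^2 + 6 \cdot \text{shape}/\text{rate}^4$ (the normal part contributes the $3V^2$). The function name is hypothetical and no standard errors are computed.

```python
def gamma_normal_moments(m2, m3, m4):
    """Solve the moment equations for (sigma_v^2, shape, rate).

    m2, m3, m4 are the second, third and fourth central moments of the
    residuals, with u non-negative (cost orientation). For a production
    function, pass -m3.
    """
    excess = m4 - 3.0 * m2 ** 2            # equals 6 * shape / rate^4
    if m3 <= 0 or excess <= 0:
        raise ValueError("moments are inconsistent with Gamma inefficiency")
    rate = 3.0 * m3 / excess               # ratio of the last two equations
    shape = m3 * rate ** 3 / 2.0           # from m3 = 2 * shape / rate^3
    sigma_v2 = m2 - shape / rate ** 2      # from m2 = sigma_v^2 + shape / rate^2
    return sigma_v2, shape, rate

# Population moments implied by sigma_v^2 = 0.5, shape = 3, rate = 2.
sigma_v2, shape, rate = gamma_normal_moments(1.25, 0.75, 5.8125)
print(sigma_v2, shape, rate)
```

With sample moments in place of population moments the same function gives method-of-moments estimates. Third and fourth moments are noisy in small samples, so the consistency check can fail.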

**27. **Failure of well-specified MLE: parameters
Failure to converge: estimation continues indefinitely through many iterations with no sign of stopping.
Repeated “non-concave” log likelihood means the log likelihood is not maximized, so “maximum” likelihood fails; “backed up” means the log likelihood decreases.
Estimation fails to start, “initial values not feasible.” OLS starting values imply negative infinite log likelihood.
Apparent estimates, but the SD of inefficiency ($\sigma_u$) is small, or the ratio of $\sigma_u$ to the SD of the stochastic frontier ($\sigma_v$), $\lambda = \sigma_u/\sigma_v$, is small, e.g. 0.01; $\lambda$ sometimes goes as close to zero as Stata can make it, e.g. 0.00001.

**28. **Failure of well-specified MLE: distributions
The truncated normal distribution of inefficiency has an extra parameter, the mean of the normal truncated at 0, which often fails in estimation.
The exponential distribution slopes down from 0 smoothly, which leads to “initial values not feasible” if inefficiency is not strongly skewed right.
The stochastic frontier can disappear from the model, leaving one-sided inefficiency that violates the MLE range rule (range not a function of the parameters).
The half-normal is the most often successful, the most common in the literature, and the default in Stata.
Unsuccessful Stata estimation is illustrated at this point.

**29. **Data Envelopment Analysis (DEA)
Envelop the observed points (n inputs and m outputs, plotted in m+n space) with hyperplanes, i.e. lines, planes, etc.
Linear programming
Constant returns to scale (CRS) = CCR for Charnes, Cooper, and Rhodes (1978),
Variable returns to scale (VRS) = BCC for Banker, Charnes, and Cooper (1984).
Aigner and Chu (1968) did it first and also did quadratic programming

**30. **DEA assumptions
DMU = decision making unit: business, bank, farm, not-for-profit, government, university, etc.
All actual observed inputs and outputs of any DMUs are feasible for all DMUs
All linear combinations of observed inputs and outputs are feasible.
Free disposal of inputs and outputs.
The production function or cost function is piecewise linear, implying linear or non-differentiable functions everywhere.

**31. **DEA efficiency without prices
Output-oriented technical efficiency is producing the greatest possible output, in the sense of a linear function of a set of outputs, given the value of a linear function of inputs. No prices are involved. Efficiency = output that could be produced from the inputs used, as a percentage of actual output; a value above 100% indicates inefficiency.
Input-oriented technical efficiency is producing a given set of outputs with the smallest linear function of inputs. No prices are involved. Efficiency = inputs that would be needed, as a percentage of actual inputs used; a value below 100% indicates inefficiency.
Constant returns to scale: output and input orientation are the same.
Variable returns to scale: output and input orientation are different.

**32. **DEA efficiency with prices
Allocative efficiency is minimizing the cost of the linear combination of the outputs produced, using input prices.
Profit maximization: maximizing the value of outputs minus the value of inputs, using both output and input prices
Scale efficiency is operating at the scale of operation maximizing the ratio of the linear sum of outputs to the linear sum of inputs.
An economically efficient business is technically and scale efficient.

**33. **DEA: tiny example, constant returns to scale

| DMU | x | y | y/x | efficiency | superefficiency |
|---|---|---|---|---|---|
| 1 | 1 | 6 | 6.00 | 1.0000 | 1.0909 |
| 2 | 2 | 8 | 4.00 | 0.6667 | |
| 3 | 2 | 11 | 5.50 | 0.9167 | |
| 4 | 3 | 9 | 3.00 | 0.5000 | |
| 5 | 3 | 13 | 4.33 | 0.7222 | |
| 6 | 5 | 15 | 3.00 | 0.5000 | |

DMU#1 has the highest y/x; the others are inefficient in proportion to their ratios of y/x.
DMU#1’s y/x could drop to 5.50 and it would still be efficient (see DMU#3), so DMU#1’s superefficiency is 6.00/5.50 = 1.0909.
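
With one input and one output, the constant-returns numbers in the table need no linear programming solver. A minimal sketch, with hypothetical function names:

```python
def crs_efficiency(x, y):
    """Efficiency of each DMU: its y/x relative to the best y/x."""
    ratios = [yi / xi for xi, yi in zip(x, y)]
    best = max(ratios)
    return [r / best for r in ratios]

def crs_superefficiency(x, y, k):
    """y/x of DMU k relative to the best y/x among the other DMUs."""
    ratios = [yi / xi for xi, yi in zip(x, y)]
    best_other = max(r for i, r in enumerate(ratios) if i != k)
    return ratios[k] / best_other

x = [1, 2, 2, 3, 3, 5]
y = [6, 8, 11, 9, 13, 15]
print([round(e, 4) for e in crs_efficiency(x, y)])
# [1.0, 0.6667, 0.9167, 0.5, 0.7222, 0.5]
print(round(crs_superefficiency(x, y, 0), 4))
# 1.0909
```

With several inputs or outputs the ratio is no longer a single number per DMU, and the linear program is needed.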

**34. **DEA: tiny example, variable returns to scale

| DMU | x | y | y/x | efficiency | superefficiency |
|---|---|---|---|---|---|
| 1 | 1 | 6 | 6.00 | 1.0000 | 2.0000 |
| 2 | 2 | 8 | 4.00 | 0.7000 | |
| 3 | 2 | 11 | 5.50 | 1.0000 | 1.2143 |
| 4 | 3 | 9 | 3.00 | 0.5333 | |
| 5 | 3 | 13 | 4.33 | 1.0000 | 1.1667 |
| 6 | 5 | 15 | 3.00 | 1.0000 | big |

DMUs #1, #3, #5, and #6 define the frontier.
DMU#2 is inefficient relative to 0.6 × #1 + 0.4 × #3.
DMU#4 is inefficient relative to 0.4 × #1 + 0.6 × #3.
DMU#1 could use twice the input and still be efficient.
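
The variable-returns numbers can also be reproduced without a solver. The sketch assumes input orientation and relies on the fact that, with one input and one output, the cheapest way to reach a target output uses either a single DMU or a mix of two. The function names are hypothetical.

```python
def vrs_min_input(x, y, target_y, exclude=None):
    """Smallest input with which a convex combination of observed DMUs
    produces at least target_y (free disposal of output).
    Returns None if no combination reaches target_y."""
    pts = [(xi, yi) for i, (xi, yi) in enumerate(zip(x, y)) if i != exclude]
    candidates = [xi for xi, yi in pts if yi >= target_y]
    for xa, ya in pts:
        for xb, yb in pts:
            if ya < target_y < yb:
                w = (yb - target_y) / (yb - ya)   # weight on the lower-output DMU
                candidates.append(w * xa + (1 - w) * xb)
    return min(candidates) if candidates else None

def vrs_efficiency(x, y, k):
    """Input needed on the frontier as a share of the input DMU k uses."""
    return vrs_min_input(x, y, y[k]) / x[k]

def vrs_superefficiency(x, y, k):
    """Same ratio with DMU k removed from the frontier; None if unbounded."""
    needed = vrs_min_input(x, y, y[k], exclude=k)
    return None if needed is None else needed / x[k]

x = [1, 2, 2, 3, 3, 5]
y = [6, 8, 11, 9, 13, 15]
print([round(vrs_efficiency(x, y, k), 4) for k in range(6)])
# [1.0, 0.7, 1.0, 0.5333, 1.0, 1.0]
print([vrs_superefficiency(x, y, k) for k in (0, 2, 4, 5)])
```

DMU#6 has no superefficiency value because no other DMU produces as much output, which is the “big” entry in the table.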

**35. ** DEA graph

**36. **DEA: standard setup
N decision making units (DMU).
Assume a linear function of n inputs produces m outputs.
There is no economic production function or cost function in basic DEA.
Assume the linear function of the inputs is minimized given the linear function of the outputs.
Equivalently, the linear function of the outputs is maximized given the linear function of the inputs.
Call inputs x and outputs y as in regression.
Call the coefficients on inputs b and the coefficients on outputs c. These are shadow prices in economics.

**37. **DEA: linear programming, input oriented (production)
Consider DMU $t$, $1 \le t \le N$, with $N$ total producers to study, $m$ outputs, and $n$ inputs.
DEA estimates each DMU’s efficiency by itself, not relative to one estimated frontier. Each DMU $t$ has an individual input and output function.
$\max_{c_{\cdot t},\, b_{\cdot t}} \; \sum_{i=1}^{m} c_{it} y_{it} \Big/ \sum_{j=1}^{n} b_{jt} x_{jt}$
s.t. $b_{jt} \ge 0$ and $\sum_{i=1}^{m} c_{it} y_{ip} \Big/ \sum_{j=1}^{n} b_{jt} x_{jp} \le 1$, all DMUs $p$.
Linear fractional programming, maximizing the ratio of two linear functions, is difficult; restate to maximize the numerator minus the denominator, which is a linear program.

**38. **DEA: avoiding linear fractional programming
$\max \sum_{i=1}^{m} c_{it} y_{it} - \sum_{j=1}^{n} b_{jt} x_{jt}$ s.t. $b_{jt} \ge 0$, all $j$, and $\sum_{j=1}^{n} b_{jt} x_{jt} = 1$, a normalization of total cost, and $\sum_{i=1}^{m} c_{it} y_{ip} - \sum_{j=1}^{n} b_{jt} x_{jp} \le 0$, all DMUs $p$.
Note on math: given real $z$ and functions $f(z)$, $g(z)$, all > 0, substituting $\max f(z) - g(z)$ for $\max f(z)/g(z)$ implies that $f(z)$ and $g(z)$ are near 1.0, so that $\ln(f(z))$ and $\ln(g(z))$ are approximately linear.
Setting total costs = 1.0 is a normalization, but setting total output (≤ 1) near 1.0 is an assumption that inefficiency is not too large. Linearizing overstates large inefficiencies.
No standard errors, no statistical tests.

**39. **DEA including prices
Basic DEA has no economic production or cost function, but see Ray (2004, Chapter 9): linear programming with additional constraints.
Add constraints to the production or cost (linear) function using the market prices.
Maximize output given inputs but add the linear constraint on inputs that cost adds up to a total variable cost budget.
An explicit production function can be added as a constraint.
DEA for profit maximization explicitly maximizes the total revenue from outputs minus the variable cost of inputs as a linear function.

**40. **DEA: what management consultants do
Rank DMUs by efficiency
Benchmark to efficient units
Estimate superefficiency
Use the coefficients to suggest alterations in resource allocations.
Assumption: the production function that applies to a particular DMU (farm, hospital, or university, e.g.) can be expanded or contracted linearly.
DMUs with unusual combinations of inputs can appear efficient but be very difficult to emulate.

**41. **DEA: attempted standard errors
Interpretation as MLE on efficiency: estimate a probability distribution of estimated efficiencies. The frontier is still non-stochastic; the probability distribution is descriptive and post-estimation; this is not MLE.
Chance-constrained linear programming: add a disturbance (maybe Gamma) to the non-stochastic frontier. The frontier is still non-stochastic; not MLE.
Bootstrap variances: random sampling variation in estimated efficiencies does not represent behavior of DMUs or the observed frontier in the actual data.
No method provides econometric standard errors, which is why many econometricians just say no.

**42. **Comparing DEA and SFA
Comparing SFA to DEA has not been done very much
Some work on hospitals
The correlation of efficiency estimates is not very high: 0.13-0.63 in hospitals, apparently similar elsewhere
DEA focuses on individual DMUs, while SFA focuses on estimating the frontier.

**43. **Research on frontier functions: SFA and DEA results
What systematic factors are associated with failure of SFA models:
topic (banks, farms, hospitals, states, etc.),
distribution (exponential, half-normal, truncated normal, gamma),
explanatory variables, sample size, etc.?
What systematic factors are associated with SFA and DEA results being similar or different?

**44. **Research on frontier functions: methodology
No theoretical reason to avoid the Gamma distribution, so use it in research and compare results.
Apply SFA based on moments and compare with MLE.
Quadratic programming (minimizing the sum of squared inefficiency terms) in DEA was difficult decades ago, but today? The method of Wolfe (1959) can be used.
Aigner and Chu (1968) estimated quadratic programming and had apparently different estimates (with no standard errors) of capital-output elasticity and technology-output elasticity.
Fractional linear programming also might be feasible in DEA given modern computing resources.

**45. **Go estimate frontier functions
Economics and policy are often concerned with efficiency of banks, farms, governments, private and public agencies, for-profit and not-for-profit producers.
The weak axiom of cost minimization is reasonable; profit maximization is not required.
Stata’s frontier and xtfrontier are available and Stata’s restrictions can be evaded.
DEA is used by management consultants, estimated by general and specific linear programming packages.
Comparative or methodological research is possible.