Span of Control in Production Hierarchies∗
Jan Eeckhout and Roberto Pinheiro
University of Pennsylvania
February, 2008
(Preliminary Draft)
Abstract
The allocation of skills in the firm is determined by the span of control of managers. The span
of control of any given manager includes the lower skilled managers and the workers that are in the
span of control of those lower skilled managers. At each level, skills are imperfect substitutes in the
production of output and there are decreasing returns to hiring more agents with the same skill level.
In a competitive labor market with atomless firms, we find that: 1. firms have a non-degenerate skill
distribution; 2. larger firms hire disproportionately more skilled workers. As a result, large firms
have a skill distribution that is more skewed and they pay on average higher wages; 3. The presence
of a non-divisibility as in Lucas (1978) will pin down the skill level of the highest skilled manager.
When investment in skills is endogenized, we find that the equilibrium skill distribution has a long
right tail, even if ex ante all agents are identical.
1 Introduction
The firm is a distribution of skills. Workers with many different talents and skills populate a typical
firm. Even the smallest firms have occupations ranging from CEO to janitor. Most production processes
involve the collaboration of agents with different ability and skill, and typically there is a hierarchy of
decision making and execution of tasks. With those hierarchies comes a distribution of skills, and
as a consequence, a distribution of wages. We put forward a matching theory where firms operate in
a competitive labor market and where workers optimally sort in different firms. The theory aims to
characterize the firm as an equilibrium composition of skilled workers. Each firm has an endogenous
organigram that maximizes output. With this production technology, it does not only matter who your
workers are, but also what they do and what the organigram of the firm is.

∗ We are grateful to numerous colleagues for valuable discussions and comments.
[Figure 1: Span of Control: h(n) versus Lucas (1978). Axes: n (horizontal), h(n) (vertical); plotted: the smooth function a + bn^γ and the Lucas step function.]
The starting point of our analysis is the Lucas (1978) span of control model. There, the firm consists
of a manager who hires workers, and her productivity is limited by her span of control. Hiring more
workers has decreasing returns, which in equilibrium determines the boundaries of the firm. Because
there are complementarities between managers and workers, higher skilled managers in equilibrium hire
more workers. The firm is fully characterized by one manager and her skill, and the equilibrium number
of workers. The key assumption is that a firm has exactly one manager. This non-divisibility seems
reasonable: to run a firm, we need to hire one manager who is full time devoted to the firm.
We further build on this model in two ways. First, while we maintain that a limited amount of time
commitment by a manager is needed, we relax the assumption that a firm is restricted to hiring exactly
one manager. Our interpretation is that a minimum scale of managerial skill is needed, but that we
can extend beyond the minimum scale. When a firm hires more managers with the same skill level, we
assume that there are decreasing returns. We model this by a function h(n) which measures the
productivity at a given skill level. In Lucas, this is a step function valued zero if the firm hires fewer than
one manager, and valued one if it hires one or more managers. We have a smooth concave function that
reflects the decreasing returns to hiring more workers. If the intercept of that function is negative, we
assume it to be bounded at zero: the firm can always opt out and not hire any manager of that skill
type at all. This is illustrated in Figure 1 where h(n) is a simple polynomial.
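The contrast between the two technologies can be sketched numerically. The snippet below is a minimal illustration rather than anything taken from the paper: the function names `h_lucas` and `h_smooth` and the parameter values a = -0.5, b = 1, γ = 0.5 are assumptions chosen for the example.

```python
def h_lucas(n: float) -> float:
    """Lucas (1978) step function: zero below one manager, one otherwise."""
    return 1.0 if n >= 1.0 else 0.0


def h_smooth(n: float, a: float = -0.5, b: float = 1.0, gamma: float = 0.5) -> float:
    """Smooth concave alternative a + b*n**gamma, bounded below at zero.

    The parameter values are illustrative assumptions. A negative value is
    truncated at zero because the firm can always hire nobody of this type.
    """
    return max(0.0, a + b * n ** gamma)


if __name__ == "__main__":
    for n in (0.0, 0.1, 0.5, 1.0, 2.0, 4.0):
        print(f"n={n:4.1f}  Lucas={h_lucas(n):.1f}  smooth={h_smooth(n):.3f}")
```

With a negative intercept the smooth function stays at zero for small n and then rises with decreasing increments, whereas the Lucas function jumps from zero to one at n = 1.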
Second, we assume that each skilled agent faces the same technology within the firm, i.e. there is no
distinction between managers and workers. A higher skilled manager has span of control over the lower
skilled managers, who in turn have span of control over the next lower skilled managers.
Within this framework we study how the distribution of skills and earnings differs between firms.
We will be able to establish whether larger firms hire more skilled workers and whether the distribution
of skills inside large firms is more dispersed than in small firms. While all firms hire a wide range of
skills, it is possible that large firms like GE or IBM have a very different composition of skills than
small family firms.
The firm exists because different inputs in production are needed to produce output. And while any
particular individual or individuals of similar ability may be able to provide those different inputs in
production, typically it will be optimal to allocate differently skilled agents to different jobs even if one
particular skilled agent is better at performing all other tasks. This is because the firm faces a trade-off
between allocating a more skilled worker who contributes more to output at a higher wage, and a less
productive agent who commands a lower wage. The price system, market wages for different skill levels,
determines the optimal resolution of the trade-off and therefore fixes the equilibrium allocation of skills.
The optimal solution to this trade-off fundamentally boils down to the allocation of talent according
to comparative advantage, as captured by the well known metaphor of the attorney and the secretary.
Even if the best lawyer is also the best secretary, it is quite clear that this lawyer will focus on the
task of being an attorney by employing a secretary instead of doing all the paperwork by himself, or by
hiring a secretary who is as skilled as he is himself. Even though he is the best secretary and the best
lawyer, he can earn more by running a law firm and employing a low skilled secretary at a low wage
than by hiring another expensive attorney to do the secretarial work.
This implicitly involves production hierarchies as the marginal product of a lower skilled agent is
affected by the skill-level of those higher in the hierarchy. There are many reasons why such production
hierarchies emerge. This may be due to the efficient processing of information and resolution of prob-
lems. Garicano (2000) and Antras, Garicano and Rossi-Hansberg (2006) provide micro foundations for
why particular production functions may be optimal. The hierarchies may alternatively also be due to
O-ring type production technologies with asymmetry as in Kremer-Maskin (1996). A small mistake by
one worker in the production chain can have implications of unprecedented dimensions. One bug in
the software may lead to the malfunctioning of millions of electronic devices, or the inadequate quality
control for lead in paint can lead to a worldwide recall of a toy.
Using this technology has two main implications for the equilibrium allocation:
1. The firm size is endogenous and consists of a non-degenerate distribution of skills.
The imperfect substitutability of workers as inputs in production implies that the size of the firm is
endogenous. For reasons of comparative advantage in different jobs, firms in equilibrium decide to hire
workers with different talent. Of course, quite a lot is known about the size distribution of firms (for
recent examples, see Luttmer (2007) and Rossi-Hansberg and Wright (2007)). The interest here is how
firm size relates to the internal distribution of skills within the firm.
2. Firms differ in their composition of talent. Firms with higher firm-specific total factor
productivity will hire more labor which is due to the complementarity between capital and labor inputs.
More interestingly, the equilibrium distributions of skills within different firms are not identical. We
show that only if the elasticity of substitution between different skills is constant and there are no
indivisibilities, will the distribution of skills within different firms be identical. We characterize the
properties of different firms for some skill distributions. Because size and firm-specific TFP are related,
we can characterize the firm's skill distribution as a function of firm size.
First we analyze the general case without any indivisibility, i.e. with h(n) positive everywhere. We
show that if the elasticity of substitution between different skills is constant, all firms will obtain the
same distribution of skills. Firms with higher capital stocks will be larger, but they will not differ in
the composition of skills. In contrast, if the elasticity of substitution is decreasing, the skill distribution
of larger firms stochastically dominates the distribution of smaller firms. The implication of the first-
order stochastic dominance in the skill distribution is that there is also stochastic dominance in the
wage distribution. We analyze a competitive equilibrium in which market wages for the same skills are
the same. Larger firms therefore hire on average more skilled workers and therefore pay on average
higher wages. This can explain a well-documented fact in the empirical labor literature, that there is an
employer-size wage premium. While this fact is typically established after controlling for observables,
it may nonetheless be determined by skill heterogeneity unobserved by the econometrician.
Second, we consider the case as in Lucas, where a minimum scale of output is needed. This is the
case where h is negative for some values. We find that in equilibrium, firms with larger capital stocks
will be larger and will find it profitable to hire proportionally more high skilled workers. This implies
that the skill distribution in large firms is skewed to the right compared to the distribution in small
firms. In addition, the highest skilled manager in the large firm will be more skilled than the CEO in a
small firm. As a result, the support of skills of the small firm is included in the support of skills of the
large firm. This is illustrated in Figure 1.
In Section 3 below, we analyze the impact of investment in skills by ex ante identical agents and show
that in equilibrium, there will be an endogenous distribution of skills. Even with no or small ex ante
heterogeneity, there can be considerable ex post inequality as this technology enhances heterogeneity.
In equilibrium, if there is scarcity of any one particular input, the returns to obtaining that skill are
high. With increasing investment costs, the ensuing distribution of skills is decreasing in type as the
returns in terms of wages must be increasing to compensate for higher investment costs. Wages can
only be increasing if there is sufficient scarcity in that particular input.

[Figure 2: Stochastic Dominance of Skill Distribution in Large Firms. Axes: skills (horizontal), density (vertical); curves: large firm, small firm.]
One further application is Occupational Choice and span-of-control. A CEO chooses to optimally
design the hierarchy of the firm, and she herself competes on a market where she can choose, based
on equilibrium compensation schedules between a job as an employee or as a CEO. The span-of-
control of the CEO determines her productivity. We extend the technology of production hierarchies
to incorporate this feature.
2 The model
Population. Consider a population consisting of agents endowed with talent x, a one-dimensional skill
characteristic. Skills are distributed according to the distribution function F (x). The measure of agents
is normalized to one. There is a measure of capitalists, each of whom is atomless, who have the property
rights to a production process k. This can be interpreted as firm-specific total factor productivity. Let
µ(k) denote the measure of each type k.
Production. Firms produce output y using the input k and a set of workers of different skills. The
production function is given by
$$y = k\,L(n, x)$$
where n is the vector of quantities $n_i$ and x is the vector of skills $x_i$, and where
$$L(n; x) = \left[\sum_{i=1}^{N} h(n_i)\,x_i\right]^{\beta}$$
where h(·) is monotonically increasing and concave and β > 0.
It is important to note at this stage that y is a firm-level production function and that in general it
is not equal to the aggregate production function.
For most of the paper, we consider a discrete distribution of types x. A continuous distribution of
types is analogously represented by
$$L(n; x) = \left[\int h(n(x))\,x\,dF_k(x)\right]^{\beta}$$
where Fk(x) denotes the distribution of skills in firm k. Below we derive that this is the continuous
limit of the production technology with finite skill types.
The firm’s optimization problem. Markets are competitive and the atomless firms act as price
takers. Given a vector of wages w(x) (normalize the output price to 1) firm k’s problem is given by:
$$\pi(k; w(\cdot)) = \max_{n_1,\dots,n_N}\; k\left[\sum_{i=1}^{N} h(n_i)\,x_i\right]^{\beta} - \sum_{i=1}^{N} n_i\,w(x_i)$$
A competitive equilibrium of the economy can be defined as follows:
Definition 1 In a competitive equilibrium in this economy: 1. Firms maximize profits πk; 2. workers
choose the job with the highest wage offered w(x) for a type x; 3. markets clear.
Now the main properties of this production process, as discussed in the introduction, are made
precise. 1. The marginal product of the second CEO, or the second janitor, is lower. There are
decreasing returns to hiring more of the same worker since h(n) is concave. 2. The same number of
more skilled workers is more productive than the same number of low skilled workers: $h(n)x_i > h(n)x_j$ if $x_i > x_j$. 3. There
is a notion of scarcity. A shortage of a particular skill level can drive up its price. A lower skilled
employee $x_j$ can be more productive than the high skilled employee $x_i$ if she is sufficiently scarce in
the firm: $h(n_i)x_i < h(n_j)x_j$ provided $n_i \ll n_j$. 4. Inputs in production can be either complements or
substitutes.
[Figure 3: Complements and Substitutes. Parameter regions in terms of β and γ (each axis marked at 1): substitutes with strict concavity, complements with strict concavity, and strict quasiconcavity.]
Complements and Substitutes. From the firm’s objective function, we derive
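$$\frac{\partial^2 \pi}{\partial n_i\,\partial n_j} = k\beta(\beta-1)\left[\sum_{i=1}^{N} h(n_i)\,x_i\right]^{\beta-2} h'(n_i)\,h'(n_j)\,x_i x_j$$
Notice that $\frac{\partial^2 \pi}{\partial n_i \partial n_j} > 0 \iff \beta > 1$. Therefore, β determines whether $x_i$ and $x_j$ are gross complements
or substitutes.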
Claim 2 If β > 1, inputs are complements. If β < 1 they are substitutes.
For example, let $h(n_i) = n_i^{\gamma}$; then we can summarize this in terms of the parameter values for
β ∈ R+ and γ ∈ [0, 1]. The firm's problem is well-defined for β < 1/γ (a sufficient condition for
concavity is γβ < 1). Then, in Figure 3, the yellow area is the range of parameters where inputs in production are
complements, and the green area where they are substitutes.
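The sign of the cross-partial can be checked numerically with a finite-difference approximation. This is a minimal sketch under assumed values (two skill types with x = (1, 2), wages w = (0.3, 0.5), k = 1, γ = 0.5, evaluated at n = (1, 1)); the helper names `profit` and `cross_partial` are hypothetical and not part of the paper.

```python
def profit(n, x, w, k=1.0, beta=0.8, gamma=0.5):
    """Firm profit k*[sum_i n_i**gamma * x_i]**beta - sum_i n_i*w_i, with h(n) = n**gamma."""
    labor = sum(ni ** gamma * xi for ni, xi in zip(n, x))
    return k * labor ** beta - sum(ni * wi for ni, wi in zip(n, w))


def cross_partial(beta, gamma=0.5, eps=1e-4):
    """Central finite-difference estimate of d^2 pi / (d n_1 d n_2) at n = (1, 1)."""
    x, w = (1.0, 2.0), (0.3, 0.5)  # illustrative skills and wages (assumed)

    def f(n1, n2):
        return profit((n1, n2), x, w, beta=beta, gamma=gamma)

    return (f(1 + eps, 1 + eps) - f(1 + eps, 1 - eps)
            - f(1 - eps, 1 + eps) + f(1 - eps, 1 - eps)) / (4 * eps ** 2)


if __name__ == "__main__":
    for beta in (0.8, 1.5):
        print(f"beta={beta}: cross-partial = {cross_partial(beta):+.4f}")
```

The wage bill is linear in each $n_i$, so it drops out of the cross-partial; only β determines the sign, in line with Claim 2.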
Elasticity of Substitution. A key characteristic of the firm's production function is its Elasticity of
Substitution between inputs $n_i$ and $n_j$, denoted by σ. The elasticity of substitution is defined as
$$\sigma = \frac{d\ln(n_j/n_i)}{d\ln(TRS)}$$
where $TRS = \frac{\partial y/\partial n_i}{\partial y/\partial n_j}$ is the technical rate of substitution. Then
$$\sigma = -\frac{h'(n_i)}{h''(n_i)}\,\frac{1}{n_i}.$$
Claim 3 Let $h(n_i) = n_i^{\gamma}$. Then the production function is CES (σ is constant) and L(n; x) is homogeneous of degree one.
We can show in greater generality necessary and sufficient conditions for the production function
to be CES, namely that $h(n_i) = a + b n_i^{\gamma}$ with a, b constants. In the appendix, we prove that if σ is a
constant, we must have that $h(n_i)$ is of the form $a + b n_i^{\gamma}$, where a and b are constants, and that L(n; x)
is homothetic if and only if h(·) is of the form $a + b n_i^{\gamma}$. For the remainder of this and the next subsection,
we assume that a is non-negative.
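The formula σ(n) = -h'(n)/(h''(n) n) can be evaluated directly to see when the elasticity is constant. The sketch below compares the CES form $a + b n^{\gamma}$ with a hypothetical non-CES function h(n) = 1 - exp(-n); the latter, the parameter values, and all function names are assumptions introduced only for illustration.

```python
import math


def sigma(h1, h2, n):
    """Elasticity of substitution sigma(n) = -h'(n) / (h''(n) * n)."""
    return -h1(n) / (h2(n) * n)


GAMMA, B = 0.5, 2.0  # illustrative parameter values (assumptions)


def ces_h1(n):
    """First derivative of h(n) = a + B * n**GAMMA (the intercept a drops out)."""
    return B * GAMMA * n ** (GAMMA - 1)


def ces_h2(n):
    """Second derivative of h(n) = a + B * n**GAMMA."""
    return B * GAMMA * (GAMMA - 1) * n ** (GAMMA - 2)


def exp_h1(n):
    """First derivative of the hypothetical non-CES example h(n) = 1 - exp(-n)."""
    return math.exp(-n)


def exp_h2(n):
    """Second derivative of h(n) = 1 - exp(-n)."""
    return -math.exp(-n)


if __name__ == "__main__":
    for n in (0.5, 1.0, 2.0, 4.0):
        print(n, sigma(ces_h1, ces_h2, n), sigma(exp_h1, exp_h2, n))
```

For the CES form the elasticity equals 1/(1 - γ) at every n, independently of a and b, while for the exponential example it equals 1/n and therefore falls as n grows.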
CES: $h(n_i) = n_i^{\gamma}$. The CES production function can be written as:
$$y = k\left[\sum_{i=1}^{N} n_i^{\gamma} x_i\right]^{\beta}.$$
A special case of this CES production function is the one in Kremer (1993), which is equivalent to our
model when $L = \left[\frac{1}{N}\sum_{i=1}^{N} n_i^{\gamma}\right]^{N/\gamma}$ with γ → 0.
Finally, we can show that if γβ < 1, then the firm’s objective function as defined generally above is
strictly concave. This Claim is proven in the Appendix.
2.1 The equilibrium allocation
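We now explicitly derive the equilibrium in this economy. From the firm's problem, we obtain the
F.O.C.s:
$$k\beta\left[\sum_{i=1}^{N} n_i^{\gamma} x_i\right]^{\beta-1} b\gamma\, n_i^{\gamma-1} x_i = w(x_i), \quad \forall i \in \{1,\dots,N\}$$
Then, rearranging, we obtain:
$$\frac{n_i}{n_j} = \left(\frac{w(x_j)\,x_i}{w(x_i)\,x_j}\right)^{\frac{1}{1-\gamma}}$$
Substituting back, we obtain the demand for labor quality $x_j$ as a function of wages:
$$n_j(k) = \left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}} \left(\frac{x_j}{w(x_j)}\right)^{\frac{1}{1-\gamma}} \left[\sum_{i=1}^{N}\left(\frac{x_i}{w(x_i)^{\gamma}}\right)^{\frac{1}{1-\gamma}}\right]^{\frac{\beta-1}{1-\gamma\beta}}$$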
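Market clearing satisfies:
$$\sum_{k} n_j(k)\,\mu(k) = m(x_j)$$
where $m(x_j) = F(x_j) - F(x_{j-1})$ is the measure of worker type $x_j$. Substituting for the equilibrium
quantity of $n_j(k)$ and solving for $w(x_j)$, we obtain the equilibrium wages:
$$w(x_j) = \frac{x_j}{m(x_j)^{1-\gamma}} \left[\sum_{i=1}^{N}\left(\frac{x_i}{w(x_i)^{\gamma}}\right)^{\frac{1}{1-\gamma}}\right]^{\frac{(\beta-1)(1-\gamma)}{1-\gamma\beta}} \left[\sum_{k}\left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]^{1-\gamma}$$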
Now, substituting the equilibrium wages into labor demand, we obtain the equilibrium allocations:
$$n_j(k) = \frac{k^{\frac{1}{1-\gamma\beta}}\, m(x_j)}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}.$$
Then, looking at the total labor force of a firm with capital k, we have:
$$n(k) = \sum_{j=1}^{N} n_j(k) = \frac{k^{\frac{1}{1-\gamma\beta}}\, m}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}$$
where $m \equiv \sum_{j=1}^{N} m(x_j)$. This expression is strictly increasing and convex in k, and the next Proposition
therefore immediately follows.
Proposition 4 Firms with higher k have a larger labor force.
Firms with higher firm-specific TFP k are larger. The productivity per worker is higher, and
therefore at common economy-wide wage rates, it is optimal for them to hire more workers. The
question remains how the skill distributions within the different firms compare.
Proposition 5 When the production function is CES, in equilibrium all firms have the same skill
distribution $F_k(x)$, which is equal to the economy's skill distribution F(x).
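To see this, look at the fraction of quality j workers in terms of the total number of workers, and
we have:
$$\frac{n_j(k)}{n(k)} = \frac{\dfrac{k^{\frac{1}{1-\gamma\beta}}\, m(x_j)}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}}{\dfrac{k^{\frac{1}{1-\gamma\beta}}\, m}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}} = \frac{m(x_j)}{m}$$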
for every k. Therefore, the distribution of workers inside a firm is exactly the same as the one in any
other firm and mimics the distribution in the market. Firms differ in size, given different values of k,
but they look alike in the distribution of labor qualities inside the firm. This is a consequence
of the constant elasticity of substitution assumption or, more generally, homotheticity. This assumption
imposes a lot of structure on the firm's production function.
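The CES allocation can be verified on a small discrete economy. The sketch below assumes h(n) = n^γ (so b = 1), three firm types and three skill types with made-up measures, and γ = 0.5, β = 0.8; the function names `allocation` and `implied_wages` are hypothetical. It computes the allocation formula above and the marginal product of each skill type in each firm.

```python
def allocation(k, ks, mus, m, gamma=0.5, beta=0.8):
    """Hires n_j(k) = k**e * m_j / sum_k' k'**e * mu(k'), with e = 1/(1 - gamma*beta)."""
    e = 1.0 / (1.0 - gamma * beta)
    denom = sum(kk ** e * mu for kk, mu in zip(ks, mus))
    return [k ** e * mj / denom for mj in m]


def implied_wages(k, n, x, gamma=0.5, beta=0.8):
    """Marginal products k*beta*S**(beta-1)*gamma*n_j**(gamma-1)*x_j for h(n) = n**gamma."""
    s = sum(nj ** gamma * xj for nj, xj in zip(n, x))
    return [k * beta * s ** (beta - 1) * gamma * nj ** (gamma - 1) * xj
            for nj, xj in zip(n, x)]


# Illustrative economy (all numbers are assumptions for the example).
ks, mus = [1.0, 2.0, 3.0], [0.5, 0.3, 0.2]   # firm types and their measures
x, m = [1.0, 2.0, 4.0], [0.6, 0.3, 0.1]      # skill levels and their measures

allocs = {k: allocation(k, ks, mus, m) for k in ks}
for k, n in allocs.items():
    shares = [nj / sum(n) for nj in n]
    wages = implied_wages(k, n, x)
    print(k, [round(s, 3) for s in shares], [round(wj, 4) for wj in wages])
```

In this example the skill shares coincide across firms and the marginal product of each skill type is the same in every firm, which is what price-taking at common wages requires; only firm size varies with k.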
2.2 Different Production Hierarchies
We show here that the equilibrium distribution of labor abilities inside a firm varies with k according
to changes in σ and the measure of labor qualities in the economy. We establish the following result,
consistent with the distributional properties discussed in Figure 1.¹
Proposition 6 Let σ′ < 0. If the density of x is decreasing then:
1. Higher k firms are larger;
2. Average skills and average wages are higher in larger firms than in smaller firms;
3. The skill and wage distribution in larger firms First-Order Stochastically dominates those in small
firms.
To derive this result, we show how the elasticity of substitution and the equilibrium allocation relate.
In particular, the following result holds.
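Proposition 7 If σ is decreasing², then higher k firms hire more of the scarce skilled workers ($m(x_1) < m(x_2)$):
$$\sigma\!\left(\frac{m(x_2)}{2}\right) < \sigma\!\left(\frac{m(x_1)}{2}\right) \;\Rightarrow\; \left.\frac{\partial\left(n_1^1/n_2^1\right)}{\partial k_1}\right|_{k_1=k_2} > 0$$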
Proof. We prove this result here for β = 1. In the appendix, we provide the proof for general β. Let $n_j^1$ denote firm 1's hires of type $x_j$, so that by market clearing firm 2 hires $m(x_j) - n_j^1$. Observe
that the first-order conditions, after substituting for market clearing, imply
$$k_1 h'(n_1^1) = k_2 h'(m(x_1) - n_1^1)$$
$$k_1 h'(n_2^1) = k_2 h'(m(x_2) - n_2^1)$$
These conditions implicitly define the equilibrium allocations $n_1^1$ and $n_2^1$. Applying the implicit function
theorem, we get
$$\frac{\partial n_1^1}{\partial k_1} = -\frac{h'(n_1^1)}{k_1 h''(n_1^1) + k_2 h''(m(x_1) - n_1^1)} \quad\text{and}\quad \frac{\partial n_2^1}{\partial k_1} = -\frac{h'(n_2^1)}{k_1 h''(n_2^1) + k_2 h''(m(x_2) - n_2^1)}$$
When $k_1 = k_2$, each firm hires $n_j^1 = m(x_j)/2$, and we obtain from using the quotient rule that:
$$\left.\frac{\partial(n_1^1/n_2^1)}{\partial k_1}\right|_{k_1=k_2} = \frac{\dfrac{-h'(m(x_1)/2)}{h''(m(x_1)/2)}\cdot\dfrac{m(x_2)}{2} - \dfrac{-h'(m(x_2)/2)}{h''(m(x_2)/2)}\cdot\dfrac{m(x_1)}{2}}{2k_1\left(\dfrac{m(x_2)}{2}\right)^2}$$
The left-hand side is positive provided
$$\frac{-h'(m(x_1)/2)}{h''(m(x_1)/2)}\,\frac{1}{m(x_1)/2} > \frac{-h'(m(x_2)/2)}{h''(m(x_2)/2)}\,\frac{1}{m(x_2)/2}$$
Recall that the elasticity of substitution is
$$\sigma = -\frac{h'(n_i)}{h''(n_i)}\,\frac{1}{n_i},$$
and therefore
$$\left.\frac{\partial(n_1^1/n_2^1)}{\partial k_1}\right|_{k_1=k_2} > 0 \quad\text{if}\quad \sigma\!\left(\frac{m(x_2)}{2}\right) < \sigma\!\left(\frac{m(x_1)}{2}\right)$$

¹ To simplify exposition, here we consider the case in which we have only two firms with different managerial skills
or TFP, $k_1$ and $k_2$, and β = 1. We have shown that exactly the same results hold in general, though the derivation is
somewhat more involved.
² A sufficient condition for σ decreasing is h′′′ < 0. This immediately follows from
$$\frac{d\sigma}{dn} = -\frac{[h'']^2 n - h'\,[h'' + n h''']}{[n h'']^2}$$
and the fact that h′ > 0, h′′ < 0.
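The two-firm case of Proposition 7 with β = 1 can be illustrated numerically. The sketch below assumes the hypothetical function h(n) = 1 - exp(-n), chosen because its elasticity σ(n) = 1/n is decreasing, and made-up measures m(x_1) = 0.5 < m(x_2) = 1.5; it solves firm 1's first-order conditions by bisection. The names `firm1_hires` and `skill_ratio` are illustrative, and the solver presumes an interior solution.

```python
import math


def h_prime(n):
    """h'(n) for the illustrative h(n) = 1 - exp(-n); here sigma(n) = 1/n is decreasing."""
    return math.exp(-n)


def firm1_hires(k1, k2, m, tol=1e-12):
    """Solve k1*h'(n) = k2*h'(m - n) for firm 1's hires n in (0, m) by bisection."""
    lo, hi = 0.0, m
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # The gap is decreasing in n because h' is decreasing.
        if k1 * h_prime(mid) - k2 * h_prime(m - mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def skill_ratio(k1, k2=1.0, m1=0.5, m2=1.5):
    """Ratio n_1^1 / n_2^1 of scarce (type 1) to abundant (type 2) hires in firm 1."""
    return firm1_hires(k1, k2, m1) / firm1_hires(k1, k2, m2)


if __name__ == "__main__":
    for k1 in (1.0, 1.1, 1.2):
        print(f"k1={k1}: n_1^1 / n_2^1 = {skill_ratio(k1):.4f}")
```

With equal productivities the two firms split each skill type evenly; raising $k_1$ above $k_2$ increases firm 1's ratio of scarce to abundant skills, as the proposition states.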
Considering the more scarce labor quality as the one with higher levels of education or human
capital, an economy with production hierarchies will have larger firms hiring more heavily at the top,
i.e., they will have more skilled workers. This effectively means that they have proportionally more
managerial positions compared to smaller firms. One possible interpretation is that of an increase in
the monitoring cost. In order to manage a larger hierarchy, the demands on communication skills and
span-of-control go up, leading to the hiring of more skilled types. Notice that this is due to the relative
scarcity of each type of labor. The result is driven by the elasticity of substitution of a given quality of
labor compared to others.
2.3 Minimum Scale of Operation
We derive under plausible conditions that the highest skilled worker has a higher type in larger firms
than in smaller firms. This implies that the distribution of higher k firms has fat tails at the top as
long as the skill distribution has decreasing density.
Suppose there is some non-convexity in the production technology. On any given task, firms incur
a fixed cost.³ Consider the production function we used above, $h(n) = a + b n^{\gamma}$, where a < 0. A firm
will hire a type x if for that type the equilibrium $n^*$ yields positive output, $h(n^*) = a + b(n^*)^{\gamma} > 0$, where
we derived $n^*$ earlier as:
$$n^*(k) = \frac{k^{\frac{1}{1-\gamma\beta}}\, m(x)}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}.$$
The firm's decision problem is therefore to choose $n^*$ as long as
$$h^* = a + b\left(\frac{k^{\frac{1}{1-\gamma\beta}}\, m(x)}{\sum_{k} k^{\frac{1}{1-\gamma\beta}}\mu(k)}\right)^{\gamma} > 0.$$
A firm with capital k will therefore be indifferent between hiring and not hiring provided
$$k = \left(\frac{-a}{b}\right)^{\frac{1-\gamma\beta}{\gamma}} \left[\frac{1}{m(x)}\sum_{k\in K(x)} k^{\frac{1}{1-\gamma\beta}}\mu(k)\right]^{1-\gamma\beta}.$$
The only caveat is of course that the summation over k is for all k actively hiring workers of type x.
K(x) denotes the set of firms actively hiring type x workers.

³ A special case of this technology is that in Antras, Garicano and Rossi-Hansberg (2006).
Proposition 8 Let the elasticity of substitution σ be constant, and let there be a fixed cost of employing
one skill type (a < 0). Then: 1. higher k firms hire more workers; 2. the support of skills hired in lower
k firms is included in the support of skills of higher k firms; 3. when the skill density is decreasing,
higher k firms hire more skilled workers.
Example. Let skills be distributed according to the Pareto with location 1 and coefficient 1. Then the
cdf is $P(x) = 1 - x^{-1}$ and the density is $p(x) = x^{-2}\,(= m(x))$. Let the distribution of firms be uniform,
µ = 1 for k ∈ [0, 1]. Let $h(n) = a + n^{1/2}$, and β = 1. We have:
$$h(n) = \begin{cases} a + n^{1/2} & \text{if } n > 0 \\ 0 & \text{if } n = 0 \end{cases}$$
where a < 0. From previous calculations, we obtain:
$$n_x(k) = \frac{k^2 x^{-2}}{\int_K k^2\,dk}.$$
Define $\underline{k}(x) = \{k \in K \mid h(n_x(k)) = 0\}$. Therefore, there exists a threshold such that if $k < \underline{k}(x)$,
$\max\{0,\, a + n_x(k)^{1/2}\} = 0$. This implies that $K = [\underline{k}(x), 1]$. Solving for $\underline{k}(x)$:
$$a + \left[\frac{\underline{k}(x)^2}{x^2 \int_{\underline{k}(x)}^{1} k^2\,dk}\right]^{1/2} = 0,$$
and rearranging, we have:
$$3\underline{k}(x)^2 = (-ax)^2\left[1 - \underline{k}(x)^3\right], \qquad (1)$$
which defines $\underline{k}(x)$. From the implicit function theorem, we have:
$$\frac{d\underline{k}(x)}{dx} = \frac{2a^2 x\left[1 - \underline{k}(x)^3\right]}{3\underline{k}(x)\left[2 + a^2x^2\underline{k}(x)\right]} > 0.$$
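Claim 9 $x \to \infty$ as $\underline{k}(x) \to 1$.
Proof. Assume that there is an $x^* \in \mathbb{R}$ such that $\underline{k}(x^*) = 1$. But then, from (1) we must have:
$$\underbrace{3\underline{k}(x^*)^2}_{=3} - (-ax^*)^2\underbrace{\left[1 - \underline{k}(x^*)^3\right]}_{=0} = 0 \;\Rightarrow\; 3 = 0$$
which is a contradiction. Then, we cannot have $\underline{k}(x^*) = 1$ for $x^*$ finite. Since $\frac{d\underline{k}(x)}{dx} > 0$ for all $\underline{k}(x) \in (0, 1)$,
we must have $\underline{k}(x) \to 1$ as $x \to \infty$.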
Claim 10 $\underline{k}(1) > 0$, i.e., some firms shut down in equilibrium.
Proof. From (1), we have:
$$3\underline{k}(1)^2 = (-a)^2\left[1 - \underline{k}(1)^3\right]$$
Now, observe that the LHS of this equality is strictly increasing in $\underline{k}(1)$, while the RHS is strictly
decreasing. But if $\underline{k}(1) = 0$, we have LHS < RHS, so we must have that $\underline{k}(1) > 0$.
The fact that $\underline{k}$ is increasing in x of course also implies that the larger firms k have higher cut-off
types for their highest skilled employee. The maximum quality of x that a given k firm hires is
$$\bar{x}(k) = \frac{\sqrt{3}\,k}{-a\,(1-k^3)^{1/2}}$$
and is increasing in k. The lowest firm that has positive profits in this market:
$$x = \frac{\sqrt{3}\,k}{0.5\,(1-k^3)^{1/2}}, \qquad k = 0.25$$
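Equation (1) has no closed-form solution for the threshold, but it is easy to solve numerically. The sketch below is an illustration under the assumed value a = -0.5; the function names `k_low` and `x_bar` are hypothetical. It finds the threshold by bisection and inverts equation (1) to recover the highest skill hired by a given firm.

```python
import math


def k_low(x, a=-0.5, tol=1e-12):
    """Solve 3*k**2 = (a*x)**2 * (1 - k**3) for k in (0, 1) by bisection (equation (1))."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        # LHS - RHS is increasing in k: negative at k = 0, positive at k = 1.
        if 3 * mid ** 2 - (a * x) ** 2 * (1 - mid ** 3) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def x_bar(k, a=-0.5):
    """Highest skill hired by firm k, obtained by inverting equation (1)."""
    return math.sqrt(3) * k / (-a * math.sqrt(1 - k ** 3))


if __name__ == "__main__":
    for x in (1.0, 2.0, 5.0, 20.0):
        print(f"x={x:5.1f}  threshold k={k_low(x):.4f}")
```

Under these assumptions the threshold is strictly positive at x = 1, rises with x, and stays below 1 for any finite x, which is the pattern described in Claims 9 and 10.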
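Finally, we also verify that the demand in the right tail is in fact decreasing as x increases:
$$\frac{dn_x(k)}{dx} = \frac{d\left\{\dfrac{3k^2}{x^2[1-\underline{k}(x)^3]}\right\}}{dx} = \frac{-3k^2\left\{2x\left[1-\underline{k}(x)^3\right] - 3\underline{k}(x)^2\,\dfrac{d\underline{k}(x)}{dx}\,x^2\right\}}{x^4\left[1-\underline{k}(x)^3\right]^2}$$
Substituting $\frac{d\underline{k}(x)}{dx}$ and rearranging, we have:
$$\frac{dn_x(k)}{dx} = \frac{-12xk^2}{x^4\left[2 + a^2x^2\underline{k}(x)\right]\left[1-\underline{k}(x)^3\right]} < 0$$
So, the demand is strictly decreasing in x for a given k, and a cut-off rule is optimal.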
For this example, we now explicitly have the measure of skills within a firm
$$n(x \mid k) = \frac{3k^2}{x^2\left[1-\underline{k}(x)^3\right]}$$
where $\underline{k}(x)$ solves (1). Normalizing this measure to sum up to one, we obtain the firm's distribution of
skills. Larger firms hire more workers of all skill types, but from simple comparison of the normalized
densities, we see that the low k firms hire proportionally more low skilled workers. The high k firm's
skill distribution is therefore heavy in the tail and skewed to the right.
3 Applications
3.1 Investment in skills: Endogenous heterogeneity
With production hierarchies, ex ante identical agents have incentives to take on different levels of
investment. Because all skill levels are needed in production, there cannot be an equilibrium in which all
agents choose to invest the same amount and obtain the same skill level. We now show this
for the constant elasticity case with $h(n_i) = n_i^{\gamma}$. Let $c(x_i)$ be the cost associated with obtaining skill
level $x_i$.
We first find an expression for wages. From our previous calculations, we obtain:
$$w(x_j) = \frac{x_j}{m(x_j)^{1-\gamma}}\left(\sum_{i=1}^{N} m(x_i)^{\gamma} x_i\right)^{\beta-1}\left[\sum_{k}\left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]^{1-\gamma\beta}.$$
Notice that $w(x_j)$ depends on the ratio $\frac{x_j}{m(x_j)^{1-\gamma}}$, where $m(x_j)$ is the aggregate supply of skill j.
Then, the worker's problem is given by:
$$\max_{\{x_1,\dots,x_N\}}\;\{w(x_1) - c(x_1),\, \dots,\, w(x_N) - c(x_N)\}$$
Since in equilibrium all skills must be offered⁴ and workers are ex ante symmetric, we must have:
$$w(x_1) - c(x_1) = \dots = w(x_N) - c(x_N) = v$$
where v is a constant.
Then $w(x_i)$ is identical to the cost function up to a constant. Using this, we can calculate $m(x_i)$,
which is the supply of skill i. Observe that:
$$v + c(x_j) = \frac{x_j}{m(x_j)^{1-\gamma}}\left(\sum_{i=1}^{N} m(x_i)^{\gamma}x_i\right)^{\beta-1}\left[\sum_k (k\beta\gamma)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]^{1-\gamma\beta}$$
and that:
$$\frac{v + c(x_j)}{v + c(x_i)} = \frac{x_j / m(x_j)^{1-\gamma}}{x_i / m(x_i)^{1-\gamma}}.$$

⁴ With $h(n_i) = n_i^{\gamma}$ the Inada conditions hold and the marginal productivity at $n_i = 0$ is infinity. Hence all skills will be
offered in equilibrium.
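Then, from the expression for $v + c(x_j)$, we have:
$$v + c(x_j) = \frac{x_j}{m(x_j)^{1-\gamma}}\left(m(x_j)^{\gamma}\sum_{i=1}^{N}\left[\frac{m(x_i)}{m(x_j)}\right]^{\gamma}x_i\right)^{\beta-1}\left[\sum_k\left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]^{1-\gamma\beta}.$$
Substituting the above expression and rearranging:
$$m(x_j) = \left(\frac{x_j}{v + c(x_j)}\right)^{\frac{1}{1-\gamma}}\left(\sum_{i=1}^{N}\left(\frac{x_i}{[v + c(x_i)]^{\gamma}}\right)^{\frac{1}{1-\gamma}}\right)^{\frac{\beta-1}{1-\gamma\beta}}\left[\sum_k\left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]$$
Then:
$$m'(x_j) = \frac{1}{1-\gamma}\left(\frac{x_j}{v + c(x_j)}\right)^{\frac{\gamma}{1-\gamma}}\left(\sum_{i=1}^{N}\left(\frac{x_i}{[v+c(x_i)]^{\gamma}}\right)^{\frac{1}{1-\gamma}}\right)^{\frac{\beta-1}{1-\gamma\beta}} \times \left[\sum_k\left(k\beta\gamma b^2\right)^{\frac{1}{1-\gamma\beta}}\mu(k)\right]\left(\frac{v + c(x_j) - c'(x_j)\,x_j}{(v + c(x_j))^2}\right)$$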
If there is no fixed cost, m′ (xj) < 0, for every xj . The density of skill types is downward sloping.
The higher the skill level, the higher the cost of obtaining those skills. As a result, wages must be higher
for higher skill level to compensate for the cost. For that to be the case, there must be fewer people in
equilibrium who invest to obtain high skill levels.⁵ Observe that the properties of the distribution here
are derived in the context of a competitive market without any externalities.⁶
3.2 Occupational Choice and Span-of-Control
Models of occupational choice have increasingly received attention as a way of explaining the aggregate
outcomes by means of micro-founded allocation problems. Lucas (1978) uses a matching problem where
differently skilled agents decide to become either workers or entrepreneurs. The span-of-control of the
manager determines the size of the firm, which in equilibrium generates an equilibrium distribution of
firms. In our context, we can extend Lucas' framework to allow for CEOs to run the firm, rather than
capitalists. The production technology then becomes y = xL(n,x) instead of y = kL(n,x). There will
be a wage for all types both as a worker and as a CEO, and the equilibrium allocation is determined
by the occupational choice of each type, driven by the maximum over both wages. With a distribution
of skills, CEOs of different skills will manage teams, each potentially with a different composition and
distribution. This relates to the findings in Gabaix and Landier (2008) who analyze the matching
problem of CEOs to firms with different capital stocks. Here we interpret the task of the CEO as
managing the composition and distribution of the work force.

⁵ If there is a fixed cost, then there is the possibility that $m'(x_j)$ is positive for small $x_j$, and there can be a skewed
unimodal distribution with a long upper tail. The intuition comes from the compensation for the fixed cost. Since these
skills are not that valuable per se (small x), but there is this fixed cost that workers have to pay, we need to increase
wages by decreasing m(x), so we have this initial increase in m(x) as we increase x and then we start decreasing.
⁶ For a framework with spillovers from technology adoption and the ensuing endogenous heterogeneity of ex ante identical
agents, see for example Eeckhout and Jovanovic (2002).
We derive the equilibrium condition that determines the occupational choice decision for the con-
tinuous type distribution (the derivation of the continuous type formulation as the limit of the discrete
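type case is at the end of this section). If the distance between two sequential qualities is Δ, we have
$N = 1 + \frac{\bar{x} - \underline{x}}{\Delta}$. Then, $m(x_i) = F(x_i) - F(x_{i-1}) = F(x_i) - F(x_i - \Delta)$. Wages and profits, after taking
the limit for Δ → 0, satisfy:
$$w(x_i) = \frac{\gamma x_i^{\alpha}\left[\int_E x^{\frac{1}{1-\gamma}} f(x)\,dx\right]^{1-\gamma}}{f(x_i)^{1-\gamma}}$$
and
$$\pi(x, w(\cdot)) = (1-\gamma)\,x\int_{E^c} x_i^{\gamma}\left(\frac{\gamma x x_i^{\alpha}}{w(x_i)}\right)^{\frac{\gamma}{1-\gamma}} dx_i.$$
Then, substituting $w(x_i)$, we obtain:
$$\pi(x, w(\cdot)) = \frac{(1-\gamma)\,x^{\frac{1}{1-\gamma}}}{\left[\int_E x_j^{\frac{1}{1-\gamma}} f(x_j)\,dx_j\right]^{\gamma}}\int_{E^c}\left[x_i f(x_i)\right]^{\gamma} dx_i.$$
Then the condition to become a manager is:
$$\frac{(1-\gamma)\,x^{\frac{1}{1-\gamma}}}{\left[\int_E x_j^{\frac{1}{1-\gamma}} f(x_j)\,dx_j\right]^{\gamma}}\int_{E^c}\left[x_i f(x_i)\right]^{\gamma} dx_i \;\ge\; \frac{\gamma x^{\alpha}\left[\int_E x^{\frac{1}{1-\gamma}} f(x)\,dx\right]^{1-\gamma}}{f(x)^{1-\gamma}}$$
Rearranging, we have:
$$x^{\frac{1-\alpha+\alpha\gamma}{1-\gamma}} f(x)^{1-\gamma} \;\ge\; \frac{\gamma}{1-\gamma}\,\frac{\int_E x^{\frac{1}{1-\gamma}} f(x)\,dx}{\int_{E^c}\left[x_i f(x_i)\right]^{\gamma} dx_i}.$$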
We can also go further and introduce span-of-control at each skill level. More generally therefore,
we can use our set up and specify h as h (ni;n−i,x) instead of h (ni) : the returns to each occupation do
not only depend on the number of people of the same skill, but on the number of people of other skill
levels. As a result, there is an occupational choice decision for each job, and each occupation, driven
by the local span-of-control of that occupation. As in Lucas (1978), this partitions the set of skills into
different distributions.
This generalized production function that combines the optimal allocation of skills with an occu-
pational choice decision is really getting to the heart of production hierarchies. At all levels within
the firm, managers can be interpreted as having span-of-control of different degrees over workers with
different skill levels. One of the main objectives of this paper is to further elaborate the links between
production hierarchies and span-of-control and occupational choice.
4 Appendix
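Claim 11 If $\sigma$ is a constant, we must have that $h(n_i)$ is of the form $a + b n_i^{\gamma}$, where $a$ and $b$ are constants.
Proof. Since $\sigma$ is a constant, we have that:
\[ h''(n_i) + \frac{1}{\sigma n_i}\, h'(n_i) = 0 \]
is a homogeneous second order linear differential equation. Considering $h'(n_i) = g(n_i)$ we reduce it to a first order ODE. Solving it, we obtain:
\[ h'(n_i) = h'(n_0)\, e^{-\int_{n_0}^{n_i} \frac{1}{\sigma y}\,dy} \]
where $h'(n_0)$ is the initial condition. Taking the integral on both sides, we obtain:
\[ h(n_i) - h(n_0) = h'(n_0) \int_{n_0}^{n_i} e^{-\int_{n_0}^{z} \frac{1}{\sigma y}\,dy}\,dz \]
Then, notice that:
\[ -\int_{n_0}^{z} \frac{1}{\sigma y}\,dy = \frac{1}{\sigma}\int_{z}^{n_0} \frac{1}{y}\,dy = \frac{1}{\sigma}\,\ln y\,\Big|_{z}^{n_0} = \frac{1}{\sigma}\ln\frac{n_0}{z} \]
Substituting back, we have:
\[ e^{-\int_{n_0}^{z} \frac{1}{\sigma y}\,dy} = \left[e^{\ln\left(\frac{n_0}{z}\right)}\right]^{\frac{1}{\sigma}} = \left(\frac{n_0}{z}\right)^{\frac{1}{\sigma}} \]
Substituting back again, we have:
\[ h(n_i) - h(n_0) = h'(n_0) \int_{n_0}^{n_i} \left(\frac{n_0}{z}\right)^{\frac{1}{\sigma}} dz \]
\[ h(n_i) - h(n_0) = h'(n_0)\, n_0^{\frac{1}{\sigma}} \int_{n_0}^{n_i} z^{-\frac{1}{\sigma}}\,dz \]
Solving the integral, we obtain:
\[ h(n_i) - h(n_0) = h'(n_0)\, n_0^{\frac{1}{\sigma}} \left[\frac{\sigma}{\sigma-1}\, z^{\frac{\sigma-1}{\sigma}}\,\Big|_{n_0}^{n_i}\right] \]
Then, rearranging, we have:
\[ h(n_i) = h(n_0) - \frac{\sigma}{\sigma-1}\, h'(n_0)\, n_0 + \frac{\sigma}{\sigma-1}\, h'(n_0)\, n_0^{\frac{1}{\sigma}}\, n_i^{\frac{\sigma-1}{\sigma}} \]
Therefore:
\[ h(n_i) = a + b n_i^{\gamma} \]
where:
\[ a := h(n_0) - \frac{\sigma}{\sigma-1}\, h'(n_0)\, n_0, \qquad b := \frac{\sigma}{\sigma-1}\, h'(n_0)\, n_0^{\frac{1}{\sigma}}, \qquad \gamma := \frac{\sigma-1}{\sigma}. \]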
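The converse direction of Claim 11 is easy to check numerically. The following is a minimal sketch, not part of the original derivation: it evaluates $\sigma(n) = -h'(n)/(h''(n)\,n)$, which is the ODE in the proof solved for $\sigma$, for $h(n) = a + b n^{\gamma}$. The parameter values and the grid of $n$ are illustrative assumptions.

```python
# Sketch: for h(n) = a + b*n**gamma, sigma(n) = -h'(n) / (h''(n) * n)
# should be constant and equal to 1 / (1 - gamma).
# b, gamma and the grid are illustrative assumptions, not values from the paper.

def h_prime(n, b, gamma):
    """First derivative of a + b*n**gamma (the constant a drops out)."""
    return b * gamma * n ** (gamma - 1)

def h_double_prime(n, b, gamma):
    """Second derivative of a + b*n**gamma."""
    return b * gamma * (gamma - 1) * n ** (gamma - 2)

def sigma(n, b, gamma):
    """sigma implied by h''(n) + h'(n) / (sigma * n) = 0."""
    return -h_prime(n, b, gamma) / (h_double_prime(n, b, gamma) * n)

b, gamma = 2.0, 0.4
grid = (0.1, 1.0, 5.0, 50.0)
sigmas = [sigma(n, b, gamma) for n in grid]
print(sigmas)
```

Under these assumptions every entry of `sigmas` coincides with $1/(1-\gamma)$, which inverts the definition $\gamma = (\sigma-1)/\sigma$ used in the claim.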
Claim 12 $L(n; x)$ is homothetic if and only if $h(\cdot)$ is of the form $a + b n_i^{\gamma}$.
Proof. We know that, by definition, $L(n; x)$ is homothetic if for any $i, j \in \{1, ..., N\}$ and for any $t > 0$, we have that:
\[ \frac{\partial L(n;x)/\partial n_i}{\partial L(n;x)/\partial n_j} = \frac{\partial L(tn;x)/\partial n_i}{\partial L(tn;x)/\partial n_j} \]
But then, we should have:
\[ \frac{h'(n_i)}{h'(n_j)} = \frac{h'(tn_i)}{h'(tn_j)} \]
Rearranging:
\[ \frac{h'(tn_j)}{h'(n_j)} = \frac{h'(tn_i)}{h'(n_i)} \]
Since this must always be satisfied, we must have:
\[ \frac{h'(tn_i)}{h'(n_i)} = c \]
where $c$ is a constant. But then, we must have:
\[ h'(tn_i) = c\,h'(n_i) \]
Since the function $f(\beta) = t^{\beta}$, with $t > 0$ and $t \neq 1$, is continuous and has image $(0,\infty)$, by the intermediate value theorem there is a real number $\gamma - 1$ such that $t^{\gamma-1} = c$. Therefore, we have:
\[ h'(tn_i) = t^{\gamma-1} h'(n_i) \]
Therefore, $h'(\cdot)$ is a homogeneous function of degree $\gamma - 1$.
Since $h'(\cdot)$ is a univariate function, it is easy to see that it must be of the form $d n_i^{\gamma-1}$, where $d$ is a constant (note that $h'(n_i) = h'(n_i \cdot 1) = n_i^{\gamma-1} h'(1) = d n_i^{\gamma-1}$, where $d = h'(1)$). But then, we have:
\[ h(n_i) = \int h'(n_i)\,dn_i = \int d n_i^{\gamma-1}\,dn_i = \frac{d}{\gamma}\, n_i^{\gamma} + a \]
Define $b = \frac{d}{\gamma}$, so we have:
\[ h(n_i) = a + b n_i^{\gamma}. \]
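Claim 13 $\gamma\beta < 1$ is a sufficient condition for strict concavity of the firm's objective function, whenever $a \geq 0$.
Proof. Notice that:
\[ \frac{\partial^2 \pi}{\partial n_i^2} = k\beta(\beta-1)\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{\beta-2} b^2\gamma^2 n_i^{2(\gamma-1)} x_i^2 + k\beta\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{\beta-1} b\gamma(\gamma-1) n_i^{\gamma-2} x_i. \]
Rearranging:
\[ \frac{\partial^2 \pi}{\partial n_i^2} = k\beta\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{\beta-2} b\gamma n_i^{\gamma-2} x_i \left\{(\beta-1)\, b\gamma n_i^{\gamma} x_i + (\gamma-1)\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]\right\} \]
Then, $\frac{\partial^2 \pi}{\partial n_1^2} < 0$ if we have:
\[ k\beta\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{\beta-2} b\gamma n_1^{\gamma-2} x_1 \left\{(\beta-1)\, b\gamma n_1^{\gamma} x_1 + (\gamma-1)\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]\right\} < 0 \]
Since the factor multiplying the braces is positive, this implies:
\[ (\beta-1)\, b\gamma n_1^{\gamma} x_1 + (\gamma-1)\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right] < 0 \]
Rearranging (the $j = 1$ term of the sum is absorbed into the first term), we have:
\[ (\gamma\beta-1)\, b n_1^{\gamma} x_1 + (\gamma-1)\left[a\sum_{j=1}^{N} x_j + b\sum_{j=2}^{N} n_j^{\gamma} x_j\right] < 0 \]
From our assumption that $h'(\cdot) > 0$, we must have $b > 0$. However, initially we do not have any assumptions on $a$. If we consider $a \geq 0$, we notice that a sufficient condition would be $\gamma\beta < 1$ (we already assume, by concavity of $h(\cdot)$, that $\gamma < 1$). To get Inada conditions, we necessarily have $a = 0$. If $a < 0$, then we would not have strict concavity holding for all $n$.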
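Let's now consider the second principal minor. Then, our condition is given by:
\[ k^2\beta^2\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{2\beta-3} b^2\gamma^2 n_1^{\gamma-2} n_2^{\gamma-2} x_1 x_2\, (\gamma-1) \left\{(\gamma\beta-1)\, b\,(n_1^{\gamma} x_1 + n_2^{\gamma} x_2) + (\gamma-1)\left[a\sum_{j=1}^{N} x_j + b\sum_{j=3}^{N} n_j^{\gamma} x_j\right]\right\} > 0 \]
Again, for the case in which $a \geq 0$, $\gamma\beta < 1$ is a sufficient condition, since $\gamma < 1$.
Let's now consider the third principal minor. Then, our condition is given by:
\[ k^3\beta^3\left[\sum_{j=1}^{N} (a + b n_j^{\gamma}) x_j\right]^{3\beta-4} b^3\gamma^3 n_1^{\gamma-2} n_2^{\gamma-2} n_3^{\gamma-2} x_1 x_2 x_3\, (\gamma-1)^2 \left\{(\gamma\beta-1)\, b\,(n_1^{\gamma} x_1 + n_2^{\gamma} x_2 + n_3^{\gamma} x_3) + (\gamma-1)\left[a\sum_{j=1}^{N} x_j + b\sum_{j=4}^{N} n_j^{\gamma} x_j\right]\right\} < 0 \]
Then, again, for the case in which $a \geq 0$, $\gamma\beta < 1$ is a sufficient condition. We can also see the pattern for these conditions, meaning that $\gamma\beta < 1$ is a sufficient condition for any $N$ and $a \geq 0$. Therefore, $\gamma\beta < 1$ is a sufficient condition for strict concavity of the objective function whenever $a \geq 0$.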
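The sign pattern of the leading principal minors can be illustrated numerically. The sketch below is only an illustration under assumed parameter values (none of them come from the paper): it builds the Hessian of $k\left[\sum_j (a + b n_j^{\gamma}) x_j\right]^{\beta}$ for $N = 3$ from the second derivatives above, with $\beta > 1$ but $\gamma\beta < 1$ and $a \geq 0$, and computes the three leading principal minors. The wage bill is linear in $n$ and does not enter the Hessian.

```python
# Sketch: leading principal minors of the Hessian of
# k * (sum_j (a + b*n_j**gamma) * x_j)**beta for N = 3.
# All parameter values are illustrative assumptions.

def hessian(n, x, k, beta, a, b, gamma):
    S = sum((a + b * nj ** gamma) * xj for nj, xj in zip(n, x))
    # v_i = h'(n_i) * x_i and d_i = h''(n_i) * x_i for h(n) = a + b*n**gamma
    v = [b * gamma * nj ** (gamma - 1) * xj for nj, xj in zip(n, x)]
    d = [b * gamma * (gamma - 1) * nj ** (gamma - 2) * xj for nj, xj in zip(n, x)]
    size = len(n)
    H = [[k * beta * (beta - 1) * S ** (beta - 2) * v[i] * v[j]
          for j in range(size)] for i in range(size)]
    for i in range(size):
        H[i][i] += k * beta * S ** (beta - 1) * d[i]
    return H

def det(M):
    """Determinant by Laplace expansion along the first row (fine for small N)."""
    if len(M) == 1:
        return M[0][0]
    total = 0.0
    for j in range(len(M)):
        minor = [row[:j] + row[j + 1:] for row in M[1:]]
        total += (-1) ** j * M[0][j] * det(minor)
    return total

def leading_minors(H):
    return [det([row[:r] for row in H[:r]]) for r in range(1, len(H) + 1)]

H = hessian(n=(0.5, 1.0, 2.0), x=(1.0, 2.0, 3.0),
            k=1.0, beta=1.5, a=0.5, b=1.0, gamma=0.5)  # gamma*beta = 0.75 < 1
minors = leading_minors(H)
print(minors)
```

With these values the minors alternate in sign (negative, positive, negative), which is the pattern required for a negative definite Hessian, even though $\beta > 1$.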
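Proof of Proposition 7 for general $\beta$.
Proof. Equilibrium conditions. Two firms, two skills; endogenous variables: $n^1_1, n^1_2, n^2_1, n^2_2, w_1, w_2$ (superscripts index firms, subscripts index skills).
\[ k_1\beta\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_1)\,x_1 = w_1 \tag{1} \]
\[ k_1\beta\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_2)\,x_2 = w_2 \tag{2} \]
\[ k_2\beta\left[h(n^2_1)x_1 + h(n^2_2)x_2\right]^{\beta-1} h'(n^2_1)\,x_1 = w_1 \tag{3} \]
\[ k_2\beta\left[h(n^2_1)x_1 + h(n^2_2)x_2\right]^{\beta-1} h'(n^2_2)\,x_2 = w_2 \tag{4} \]
\[ n^1_1 + n^2_1 = m(x_1) \tag{5} \]
\[ n^1_2 + n^2_2 = m(x_2) \tag{6} \]
For the general case, when $\beta \neq 1$, we can reduce the system to:
\[ k_1\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_1) - k_2\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h'(m(x_1)-n^1_1) = 0 \tag{F1} \]
\[ k_1\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_2) - k_2\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h'(m(x_2)-n^1_2) = 0 \tag{F2} \]
The main problem is that this is a non-linear non-separable system.
From (F1)/(F2), we have:
\[ \frac{h'(n^1_1)}{h'(n^1_2)} = \frac{h'(m(x_1)-n^1_1)}{h'(m(x_2)-n^1_2)} \]
Then, let's prepare ourselves for the IFT:
\[ D_kF = \begin{pmatrix} \frac{\partial F_1}{\partial k_1} & \frac{\partial F_1}{\partial k_2} \\ \frac{\partial F_2}{\partial k_1} & \frac{\partial F_2}{\partial k_2} \end{pmatrix} \]
where:
\[ \frac{\partial F_1}{\partial k_1} = \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_1) \]
\[ \frac{\partial F_1}{\partial k_2} = -\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h'(m(x_1)-n^1_1) \]
\[ \frac{\partial F_2}{\partial k_1} = \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_2) \]
\[ \frac{\partial F_2}{\partial k_2} = -\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h'(m(x_2)-n^1_2) \]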
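And,
\[ D_nF = \begin{pmatrix} \frac{\partial F_1}{\partial n^1_1} & \frac{\partial F_1}{\partial n^1_2} \\ \frac{\partial F_2}{\partial n^1_1} & \frac{\partial F_2}{\partial n^1_2} \end{pmatrix} \]
where:
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_1} ={}& k_1\left\{(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left[h'(n^1_1)\right]^2 x_1 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h''(n^1_1)\right\} \\ &- k_2\left\{-(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left[h'(m(x_1)-n^1_1)\right]^2 x_1 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h''(m(x_1)-n^1_1)\right\} \end{aligned} \]
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_2} ={}& k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_1)\,h'(n^1_2)\,x_2 \\ &+ k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_1)-n^1_1)\,h'(m(x_2)-n^1_2)\,x_2 \end{aligned} \]
\[ \begin{aligned} \frac{\partial F_2}{\partial n^1_1} ={}& k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_2)\,h'(n^1_1)\,x_1 \\ &+ k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_2)-n^1_2)\,h'(m(x_1)-n^1_1)\,x_1 \end{aligned} \]
\[ \begin{aligned} \frac{\partial F_2}{\partial n^1_2} ={}& k_1\left\{(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left[h'(n^1_2)\right]^2 x_2 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h''(n^1_2)\right\} \\ &- k_2\left\{-(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left[h'(m(x_2)-n^1_2)\right]^2 x_2 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h''(m(x_2)-n^1_2)\right\} \end{aligned} \]
Then, we have:
\[ \det D_nF = \frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial n^1_2} - \frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial n^1_2} \]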
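So:
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial n^1_2} ={}& \Big( k_1\left\{(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left[h'(n^1_1)\right]^2 x_1 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h''(n^1_1)\right\} \\ &\;\; - k_2\left\{-(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left[h'(m(x_1)-n^1_1)\right]^2 x_1 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h''(m(x_1)-n^1_1)\right\} \Big) \\ &\cdot \Big( k_1\left\{(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left[h'(n^1_2)\right]^2 x_2 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h''(n^1_2)\right\} \\ &\;\; - k_2\left\{-(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left[h'(m(x_2)-n^1_2)\right]^2 x_2 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h''(m(x_2)-n^1_2)\right\} \Big) \end{aligned} \]
Rearranging:
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial n^1_2} ={}& \Big( k_1\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left\{(\beta-1)\left[h'(n^1_1)\right]^2 x_1 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right] h''(n^1_1)\right\} \\ &\;\; - k_2\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left\{-(\beta-1)\left[h'(m(x_1)-n^1_1)\right]^2 x_1 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right] h''(m(x_1)-n^1_1)\right\} \Big) \\ &\cdot \Big( k_1\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left\{(\beta-1)\left[h'(n^1_2)\right]^2 x_2 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right] h''(n^1_2)\right\} \\ &\;\; - k_2\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left\{-(\beta-1)\left[h'(m(x_2)-n^1_2)\right]^2 x_2 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right] h''(m(x_2)-n^1_2)\right\} \Big) \end{aligned} \]
and
\[ \begin{aligned} \frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial n^1_2} ={}& \Big( k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_1)\,h'(n^1_2)\,x_2 \\ &\;\; + k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_1)-n^1_1)\,h'(m(x_2)-n^1_2)\,x_2 \Big) \\ &\cdot \Big( k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_2)\,h'(n^1_1)\,x_1 \\ &\;\; + k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_2)-n^1_2)\,h'(m(x_1)-n^1_1)\,x_1 \Big) \end{aligned} \]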
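Now, consider the symmetric equilibrium in which $k_1 = k_2$, $n^1_1 = \frac{m(x_1)}{2}$ and $n^1_2 = \frac{m(x_2)}{2}$. Then, we have:
\[ \begin{aligned} \left.\frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial n^1_2}\right|_{k_1=k_2=k} ={}& \left(2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{\beta-2}\right)^{2} \\ &\cdot \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^2 x_1 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_1)}{2}\right)\right\} \\ &\cdot \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^2 x_2 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_2)}{2}\right)\right\} \end{aligned} \]
and
\[ \left.\frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial n^1_2}\right|_{k_1=k_2=k} = \left[2k(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{\beta-2} h'\!\left(\tfrac{m(x_1)}{2}\right) h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^{2} x_2 x_1 \]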
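Then, $\det D_nF$ becomes:
\[ \begin{aligned} \det D_nF ={}& 4k^2\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-4} \\ &\cdot \Big\{(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]\left[h''\!\left(\tfrac{m(x_2)}{2}\right)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^2 x_1 + h''\!\left(\tfrac{m(x_1)}{2}\right)\left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^2 x_2\right] \\ &\quad + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2} h''\!\left(\tfrac{m(x_1)}{2}\right) h''\!\left(\tfrac{m(x_2)}{2}\right)\Big\} \end{aligned} \]
If $\beta < 1$, this is necessarily different from zero: since $h''(\cdot) < 0$, both terms in braces are positive. Otherwise, this could be zero, but the set of parameters for which this occurs has measure zero. Then: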
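\[ D_n^{-1}F = \frac{1}{|\det D_nF|} \begin{pmatrix} \frac{\partial F_2}{\partial n^1_2} & -\frac{\partial F_1}{\partial n^1_2} \\ -\frac{\partial F_2}{\partial n^1_1} & \frac{\partial F_1}{\partial n^1_1} \end{pmatrix} \]
Then:
\[ \begin{pmatrix} \frac{\partial n^1_1}{\partial k_1} & \frac{\partial n^1_1}{\partial k_2} \\ \frac{\partial n^1_2}{\partial k_1} & \frac{\partial n^1_2}{\partial k_2} \end{pmatrix} = -D_n^{-1}F \cdot D_kF \]
Substituting, we have:
\[ \begin{pmatrix} \frac{\partial n^1_1}{\partial k_1} & \frac{\partial n^1_1}{\partial k_2} \\ \frac{\partial n^1_2}{\partial k_1} & \frac{\partial n^1_2}{\partial k_2} \end{pmatrix} = -\frac{1}{|\det D_nF|} \begin{pmatrix} \frac{\partial F_2}{\partial n^1_2} & -\frac{\partial F_1}{\partial n^1_2} \\ -\frac{\partial F_2}{\partial n^1_1} & \frac{\partial F_1}{\partial n^1_1} \end{pmatrix} \cdot \begin{pmatrix} \frac{\partial F_1}{\partial k_1} & \frac{\partial F_1}{\partial k_2} \\ \frac{\partial F_2}{\partial k_1} & \frac{\partial F_2}{\partial k_2} \end{pmatrix} \]
Then:
\[ \frac{\partial n^1_1}{\partial k_1} = -\frac{1}{|\det D_nF|}\left(\frac{\partial F_2}{\partial n^1_2}\cdot\frac{\partial F_1}{\partial k_1} - \frac{\partial F_1}{\partial n^1_2}\cdot\frac{\partial F_2}{\partial k_1}\right) \]
Then: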
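\[ \begin{aligned} \frac{\partial F_2}{\partial n^1_2}\cdot\frac{\partial F_1}{\partial k_1} ={}& \Big( k_1\left\{(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left[h'(n^1_2)\right]^2 x_2 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h''(n^1_2)\right\} \\ &\;\; - k_2\left\{-(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left[h'(m(x_2)-n^1_2)\right]^2 x_2 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-1} h''(m(x_2)-n^1_2)\right\} \Big) \\ &\cdot \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_1) \end{aligned} \]
At $k_1 = k_2 = k$ and the symmetric equilibrium, we have:
\[ \begin{aligned} \left.\frac{\partial F_2}{\partial n^1_2}\cdot\frac{\partial F_1}{\partial k_1}\right|_{k_1=k_2=k} ={}& 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \\ &\cdot \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^2 x_2 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_2)}{2}\right)\right\} h'\!\left(\tfrac{m(x_1)}{2}\right) \end{aligned} \]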
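and
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_2}\cdot\frac{\partial F_2}{\partial k_1} ={}& \Big( k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_1)\,h'(n^1_2)\,x_2 \\ &\;\; + k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_1)-n^1_1)\,h'(m(x_2)-n^1_2)\,x_2 \Big) \\ &\cdot \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_2) \end{aligned} \]
Again, at $k_1 = k_2 = k$, we have:
\[ \left.\frac{\partial F_1}{\partial n^1_2}\cdot\frac{\partial F_2}{\partial k_1}\right|_{k_1=k_2=k} = 2k(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^{2} h'\!\left(\tfrac{m(x_1)}{2}\right) x_2 \]
Putting everything together at $k_1 = k_2 = k$, we have:
\[ \begin{aligned} &\left.\frac{\partial F_2}{\partial n^1_2}\cdot\frac{\partial F_1}{\partial k_1}\right|_{k_1=k_2=k} - \left.\frac{\partial F_1}{\partial n^1_2}\cdot\frac{\partial F_2}{\partial k_1}\right|_{k_1=k_2=k} \\ &\quad = 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^2 x_2 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_2)}{2}\right)\right\} h'\!\left(\tfrac{m(x_1)}{2}\right) \\ &\qquad - 2k(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \left[h'\!\left(\tfrac{m(x_2)}{2}\right)\right]^{2} h'\!\left(\tfrac{m(x_1)}{2}\right) x_2 \\ &\quad = 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-2} h''\!\left(\tfrac{m(x_2)}{2}\right) h'\!\left(\tfrac{m(x_1)}{2}\right) \end{aligned} \]
Therefore: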
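\[ \left.\frac{\partial n^1_1}{\partial k_1}\right|_{k_1=k_2=k} = -\frac{2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-2} h''\!\left(\tfrac{m(x_2)}{2}\right) h'\!\left(\tfrac{m(x_1)}{2}\right)}{|\det D_nF|} > 0 \]
Now, let's calculate $\frac{\partial n^1_2}{\partial k_1}$. Then, we have:
\[ \left.\frac{\partial n^1_2}{\partial k_1}\right|_{k_1=k_2=k} = -\frac{1}{|\det D_nF|}\left(\frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial k_1} - \frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial k_1}\right) \]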
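Then, let's substitute this step by step:
\[ \begin{aligned} \frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial k_1} ={}& \Big( k_1\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2}\left\{(\beta-1)\left[h'(n^1_1)\right]^2 x_1 + \left[h(n^1_1)x_1 + h(n^1_2)x_2\right] h''(n^1_1)\right\} \\ &\;\; - k_2\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2}\left\{-(\beta-1)\left[h'(m(x_1)-n^1_1)\right]^2 x_1 \right. \\ &\qquad\quad \left. - \left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right] h''(m(x_1)-n^1_1)\right\} \Big) \\ &\cdot \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_2) \end{aligned} \]
At $k_1 = k_2 = k$, we have:
\[ \begin{aligned} \left.\frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial k_1}\right|_{k_1=k_2=k} ={}& 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \\ &\cdot \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^2 x_1 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_1)}{2}\right)\right\} h'\!\left(\tfrac{m(x_2)}{2}\right) \end{aligned} \]
and
\[ \begin{aligned} \frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial k_1} ={}& \Big( k_1(\beta-1)\left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-2} h'(n^1_2)\,h'(n^1_1)\,x_1 \\ &\;\; + k_2(\beta-1)\left[h(m(x_1)-n^1_1)x_1 + h(m(x_2)-n^1_2)x_2\right]^{\beta-2} h'(m(x_2)-n^1_2)\,h'(m(x_1)-n^1_1)\,x_1 \Big) \\ &\cdot \left[h(n^1_1)x_1 + h(n^1_2)x_2\right]^{\beta-1} h'(n^1_1) \end{aligned} \]
Then, at $k_1 = k_2 = k$, we have:
\[ \left.\frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial k_1}\right|_{k_1=k_2=k} = 2k(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} h'\!\left(\tfrac{m(x_2)}{2}\right)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^{2} x_1 \]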
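Then, we have:
\[ \begin{aligned} &\left.\frac{\partial F_1}{\partial n^1_1}\cdot\frac{\partial F_2}{\partial k_1}\right|_{k_1=k_2=k} - \left.\frac{\partial F_2}{\partial n^1_1}\cdot\frac{\partial F_1}{\partial k_1}\right|_{k_1=k_2=k} \\ &\quad = 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} \left\{(\beta-1)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^2 x_1 + \left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right] h''\!\left(\tfrac{m(x_1)}{2}\right)\right\} h'\!\left(\tfrac{m(x_2)}{2}\right) \\ &\qquad - 2k(\beta-1)\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-3} h'\!\left(\tfrac{m(x_2)}{2}\right)\left[h'\!\left(\tfrac{m(x_1)}{2}\right)\right]^{2} x_1 \\ &\quad = 2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-2} h''\!\left(\tfrac{m(x_1)}{2}\right) h'\!\left(\tfrac{m(x_2)}{2}\right) \end{aligned} \]
Then:
\[ \left.\frac{\partial n^1_2}{\partial k_1}\right|_{k_1=k_2=k} = -\frac{2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-2} h''\!\left(\tfrac{m(x_1)}{2}\right) h'\!\left(\tfrac{m(x_2)}{2}\right)}{|\det D_nF|} > 0 \]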
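Then:
\[ \frac{\partial\left(n^1_1 / n^1_2\right)}{\partial k_1} = \frac{\frac{\partial n^1_1}{\partial k_1}\, n^1_2 - \frac{\partial n^1_2}{\partial k_1}\, n^1_1}{\left(n^1_2\right)^2} \]
\[ \left.\frac{\partial\left(n^1_1 / n^1_2\right)}{\partial k_1}\right|_{k_1=k_2} = \frac{2k\left[h\!\left(\tfrac{m(x_1)}{2}\right)x_1 + h\!\left(\tfrac{m(x_2)}{2}\right)x_2\right]^{2\beta-2}}{|\det D_nF|\left(\tfrac{m(x_2)}{2}\right)^2} \left\{-h''\!\left(\tfrac{m(x_2)}{2}\right) h'\!\left(\tfrac{m(x_1)}{2}\right)\tfrac{m(x_2)}{2} + h''\!\left(\tfrac{m(x_1)}{2}\right) h'\!\left(\tfrac{m(x_2)}{2}\right)\tfrac{m(x_1)}{2}\right\} \]
Therefore, we have:
\[ \left.\frac{\partial\left(n^1_1 / n^1_2\right)}{\partial k_1}\right|_{k_1=k_2} > 0 \quad\text{if}\quad -h''\!\left(\tfrac{m(x_2)}{2}\right) h'\!\left(\tfrac{m(x_1)}{2}\right)\tfrac{m(x_2)}{2} > -h''\!\left(\tfrac{m(x_1)}{2}\right) h'\!\left(\tfrac{m(x_2)}{2}\right)\tfrac{m(x_1)}{2} \]
Rearranging, we have:
\[ -\frac{h''\!\left(\tfrac{m(x_2)}{2}\right)}{h'\!\left(\tfrac{m(x_2)}{2}\right)}\,\frac{m(x_2)}{2} > -\frac{h''\!\left(\tfrac{m(x_1)}{2}\right)}{h'\!\left(\tfrac{m(x_1)}{2}\right)}\,\frac{m(x_1)}{2} \]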
which is exactly the same condition we obtained before for the case in which β = 1.
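A closed-form special case may help fix ideas. As an assumption made purely for illustration (it is not a case treated in the text), take $h(n) = n^{\gamma}$, so that $-h''(n)\,n/h'(n) = 1-\gamma$ is constant and the condition above holds with equality. Guessing $n^1_1 = s\,m(x_1)$ and $n^1_2 = s\,m(x_2)$ in (F1) and (F2) reduces both equations to $k_1 s^{\gamma\beta-1} = k_2 (1-s)^{\gamma\beta-1}$, that is, $s/(1-s) = (k_1/k_2)^{1/(1-\gamma\beta)}$. The sketch below checks this candidate solution numerically; all parameter values are hypothetical.

```python
# Sketch: with h(n) = n**gamma (an assumed special case), firm 1 hires the
# same share s of both skills, so n11/n12 = m1/m2 does not move with k1.
# Parameter values are illustrative assumptions.

def h(n, gamma):
    return n ** gamma

def h_prime(n, gamma):
    return gamma * n ** (gamma - 1)

def system(n11, n12, k1, k2, m1, m2, x1, x2, beta, gamma):
    """Left-hand sides of (F1) and (F2)."""
    A1 = h(n11, gamma) * x1 + h(n12, gamma) * x2
    A2 = h(m1 - n11, gamma) * x1 + h(m2 - n12, gamma) * x2
    F1 = (k1 * A1 ** (beta - 1) * h_prime(n11, gamma)
          - k2 * A2 ** (beta - 1) * h_prime(m1 - n11, gamma))
    F2 = (k1 * A1 ** (beta - 1) * h_prime(n12, gamma)
          - k2 * A2 ** (beta - 1) * h_prime(m2 - n12, gamma))
    return F1, F2

def firm1_share(k1, k2, beta, gamma):
    """Share s solving s / (1 - s) = (k1 / k2) ** (1 / (1 - gamma * beta))."""
    ratio = (k1 / k2) ** (1.0 / (1.0 - gamma * beta))
    return ratio / (1.0 + ratio)

m1, m2, x1, x2, beta, gamma, k2 = 2.0, 1.0, 1.0, 2.0, 0.8, 0.5, 1.0
results = {}
for k1 in (1.0, 1.2):
    s = firm1_share(k1, k2, beta, gamma)
    residuals = system(s * m1, s * m2, k1, k2, m1, m2, x1, x2, beta, gamma)
    results[k1] = (s, residuals)
    print(k1, s, residuals)
```

In this special case firm 1's share $s$ rises with $k_1$ whenever $\gamma\beta < 1$, while the skill ratio $n^1_1/n^1_2$ stays at $m(x_1)/m(x_2)$; a non-constant $\sigma$ is what makes the ratio respond to $k_1$.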
Example in which $\sigma'(n) < 0$
→ Case without Inada conditions: $\arctan(n)$:
\[ h(n) = \arctan(n) \]
\[ h'(n) = \frac{1}{1+n^2} > 0 \]
but note that $h'(n) \to 1$ as $n \to 0$, and
\[ h''(n) = -\frac{2n}{(1+n^2)^2} < 0 \]
Then:
\[ \sigma = -\frac{h'(n)}{h''(n)}\,\frac{1}{n}. \]
Solving it:
\[ \sigma = -\frac{\left(\frac{1}{1+n^2}\right)}{\left(-\frac{2n}{(1+n^2)^2}\right)}\,\frac{1}{n} \]
\[ \sigma = \frac{1+n^2}{2n^2} \]
and
\[ \sigma' = -\frac{1}{n^3} < 0 \]
but note that:
\[ h'''(n) = -\frac{2\left(1-3n^2\right)}{(1+n^2)^3} \]
Notice that this derivative is negative until $n = \frac{1}{\sqrt{3}}$ and then becomes positive. Since we want a bounded function, unless $\lim_{n\to\infty} h'(n)$ is not defined, we must have $\lim_{n\to\infty} h'(n) = 0$, and then we need this long tail. If $h'''(\cdot) < 0$ this would be impossible.
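→ Example with Inada conditions: $\chi^2_1$ (chi-square with one degree of freedom).
The pdf of the chi-square $\chi^2_k$ is:
\[ h'(n) = \frac{1}{2^{\frac{k}{2}}\,\Gamma\!\left(\frac{k}{2}\right)}\, n^{\frac{k}{2}-1} e^{-\frac{n}{2}}. \]
where:
\[ \Gamma(k) = \int_0^{\infty} t^{k-1} e^{-t}\,dt \]
Remark: we are considering the distribution function as $h(\cdot)$.
We can show that $\Gamma\!\left(\frac{1}{2}\right) = \sqrt{\pi}$. Then, the pdf of $\chi^2_1$ is:
\[ h'(n) = \frac{1}{\sqrt{2\pi}}\, n^{-\frac{1}{2}} e^{-\frac{n}{2}}. \]
Then:
\[ h''(n) = -\frac{1}{\sqrt{2\pi}}\,\frac{1}{2}\, n^{-\frac{3}{2}} e^{-\frac{n}{2}} (1+n) \]
Finally:
\[ \sigma = \frac{2}{1+n}. \]
Notice that $h'(n) \to \infty$ as $n \to 0$ and $h''(\cdot) < 0$. Again, we have $h'''(\cdot) > 0$.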
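Both examples can be verified numerically. The following sketch evaluates $\sigma(n) = -h'(n)/(h''(n)\,n)$ from the derivatives stated above and compares it with the closed forms $\frac{1+n^2}{2n^2}$ and $\frac{2}{1+n}$; it also checks $h''$ against a central finite difference of $h'$. The grid and step size are arbitrary assumptions.

```python
import math

# Sketch: check the two sigma formulas on a small grid (grid is an assumption).

def sigma(h1, h2, n):
    """sigma(n) = -h'(n) / (h''(n) * n)."""
    return -h1(n) / (h2(n) * n)

def atan_h1(n):          # h'(n) for h(n) = arctan(n)
    return 1.0 / (1.0 + n * n)

def atan_h2(n):          # h''(n) for h(n) = arctan(n)
    return -2.0 * n / (1.0 + n * n) ** 2

def chi_h1(n):           # pdf of chi-square with one degree of freedom
    return n ** -0.5 * math.exp(-n / 2.0) / math.sqrt(2.0 * math.pi)

def chi_h2(n):           # derivative of that pdf
    return -0.5 * n ** -1.5 * math.exp(-n / 2.0) * (1.0 + n) / math.sqrt(2.0 * math.pi)

def central_diff(f, n, step=1e-6):
    return (f(n + step) - f(n - step)) / (2.0 * step)

grid = (0.5, 1.0, 2.0, 5.0)
atan_sigmas = [sigma(atan_h1, atan_h2, n) for n in grid]
chi_sigmas = [sigma(chi_h1, chi_h2, n) for n in grid]
print(atan_sigmas)
print(chi_sigmas)
```

In both cases $\sigma$ falls as $n$ grows, which is the property $\sigma'(n) < 0$ that the examples are meant to deliver.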
Looking at $\frac{d\left(n^1_1 / n^1_2\right)}{dk_1}$
In fact, we can give a general proof that $\frac{d\left(n^1_1 / n^1_2\right)}{dk_1} < 0$ if we assume $h'''(\cdot) < 0$ (at least for the simplest case in which $\beta = 1$).
Proposition 14 If $\beta = 1$ and $h'''(\cdot) < 0$, we have that $\frac{d\left(n^1_1 / n^1_2\right)}{dk_1} < 0$.
Proof. First of all, remember that, simplifying the system of equilibrium conditions, we end up with the following two conditions:
\[ k_1 h'(n^1_1) = k_2 h'(m(x_1) - n^1_1) \tag{1} \]
\[ k_1 h'(n^1_2) = k_2 h'(m(x_2) - n^1_2) \tag{2} \]
Then, rearranging eq. (1), we have:
\[ \frac{k_1}{k_2} = \frac{h'(m(x_1) - n^1_1)}{h'(n^1_1)} \]
Now, consider that we increase $m(x_1)$. Since the LHS is constant and $h''(\cdot) < 0$, we must increase $n^1_1$ to increase the numerator and decrease the denominator. Therefore, an increase in $m(x_1)$ increases both $m(x_1) - n^1_1$ and $n^1_1$. Now, using the IFT, we have:
\[ \frac{\partial n^1_1}{\partial k_1} = -\frac{h'(n^1_1)}{k_1 h''(n^1_1) + k_2 h''(m(x_1) - n^1_1)} \]
Similarly:
\[ \frac{\partial n^1_2}{\partial k_1} = -\frac{h'(n^1_2)}{k_1 h''(n^1_2) + k_2 h''(m(x_2) - n^1_2)} \]
Considering that $m(x_1) > m(x_2)$, using a similar argument as the one we used above, we have $n^1_1 > n^1_2$ and $m(x_1) - n^1_1 > m(x_2) - n^1_2$ (an increase in $m$ increases $n$ but less than proportionally). Then, using the fact that $h''(\cdot) < 0$ and $h'''(\cdot) < 0$, we have that $h'(n^1_1) < h'(n^1_2)$ and
\[ -\left[k_1 h''(n^1_1) + k_2 h''(m(x_1) - n^1_1)\right] > -\left[k_1 h''(n^1_2) + k_2 h''(m(x_2) - n^1_2)\right] \]
(since $h''(n^1_1)$ is more negative than $h''(n^1_2)$, and so on). Therefore, $\frac{\partial n^1_1}{\partial k_1} < \frac{\partial n^1_2}{\partial k_1}$. But then:
\[ \frac{\partial n^1_1}{\partial k_1}\, n^1_2 - \frac{\partial n^1_2}{\partial k_1}\, n^1_1 < 0 \]
and
\[ \frac{d\left(n^1_1 / n^1_2\right)}{dk_1} = \frac{\frac{\partial n^1_1}{\partial k_1}\, n^1_2 - \frac{\partial n^1_2}{\partial k_1}\, n^1_1}{\left(n^1_2\right)^2} < 0. \]
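We can also obtain a broader result. First of all, notice that:
\[ \frac{\partial n^1_1}{\partial k_1} = -\frac{h'(n^1_1)}{k_1 h''(n^1_1) + k_2 h''(m(x_1) - n^1_1)} \]
Dividing above and below by $-h'(n^1_1)$, we have:
\[ \frac{\partial n^1_1}{\partial k_1} = \frac{1}{-k_1\frac{h''(n^1_1)}{h'(n^1_1)} - k_2\frac{h''(m(x_1) - n^1_1)}{h'(n^1_1)}} \]
Since $h'(n^1_1) = \frac{k_2}{k_1} h'(m(x_1) - n^1_1)$, we have:
\[ \frac{\partial n^1_1}{\partial k_1} = \frac{1}{k_1\left\{-\frac{h''(n^1_1)}{h'(n^1_1)} - \frac{h''(m(x_1) - n^1_1)}{h'(m(x_1) - n^1_1)}\right\}} \]
A similar argument can be made for $\frac{\partial n^1_2}{\partial k_1}$. Then, for $\frac{\partial n^1_1}{\partial k_1}\, n^1_2 - \frac{\partial n^1_2}{\partial k_1}\, n^1_1 < 0$, we have:
\[ \frac{1}{\left\{-\frac{h''(n^1_1)\, n^1_1}{h'(n^1_1)} - \frac{h''(m(x_1) - n^1_1)\, n^1_1}{h'(m(x_1) - n^1_1)}\right\}} < \frac{1}{\left\{-\frac{h''(n^1_2)\, n^1_2}{h'(n^1_2)} - \frac{h''(m(x_2) - n^1_2)\, n^1_2}{h'(m(x_2) - n^1_2)}\right\}} \]
Inverting both sides and substituting the elasticity of substitution, we have:
\[ -\frac{h''(n^1_2)\, n^1_2}{h'(n^1_2)} - \frac{h''(m(x_2) - n^1_2)\, n^1_2}{h'(m(x_2) - n^1_2)} < -\frac{h''(n^1_1)\, n^1_1}{h'(n^1_1)} - \frac{h''(m(x_1) - n^1_1)\, n^1_1}{h'(m(x_1) - n^1_1)} \]
\[ \frac{1}{\sigma(n^1_2)} + \frac{n^1_2}{m(x_2) - n^1_2}\cdot\frac{1}{\sigma(m(x_2) - n^1_2)} < \frac{1}{\sigma(n^1_1)} + \frac{n^1_1}{m(x_1) - n^1_1}\cdot\frac{1}{\sigma(m(x_1) - n^1_1)}. \]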
Derivation of the continuous case. We need to be careful about which assumptions we impose on $n(x)$ for writing down the continuous case. If we rewrite the model with $\Delta$s, we are using a partition/refinement argument, which delivers a Riemann integral (see footnote 7). Based on this, we must have a piecewise continuous $n(x)$. Consider a partition $P$ and an associated set of points $X$ in which $X_i \in I_i$, where $I_i$ is an interval in the partition $P$. Then, $S[(P, X), f]$ is defined by:
\[ S[(P, X), f] = \sum_{i=1}^{N-1} h(n(X_i))\, X_i\, |I_i|. \]
A function $f$ is integrable if and only if:
\[ \lim_{|P| \to 0} S[(P, X), f] = \int_{\underline{x}}^{\bar{x}} h(n(x))\, x\,dx \]
for any $(P, X)$. We can show that any piecewise continuous function satisfies integrability. The continuous case can be derived by taking the appropriate limit for $\Delta \to 0$:
\[ L(n, x) = \left[\int_{\underline{x}}^{\bar{x}} h(n(x))\, x\,dx\right]^{\beta} \]
A special case with $h$ CES:
\[ L(n, x) = \left[\sum_{i=1}^{N} n_i^{\gamma} x_i^{\alpha}\right]^{\beta} \]
then becomes in the continuous case:
\[ L(n, x) = \left[\int_{\underline{x}}^{\bar{x}} n(x)^{\gamma} x^{\alpha}\,dx\right]^{\beta}. \]
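The limiting argument can be illustrated with a piecewise continuous $n(x)$. In the sketch below, the support $[0, 1]$, the step function $n(x)$ with a jump at $x = 1/3$, and $h(n) = n^{1/2}$ are all assumptions chosen for illustration; the Riemann sum $\sum_i h(n(X_i))\, X_i\, |I_i|$ with midpoints $X_i$ is compared with the exact value of $\int h(n(x))\, x\,dx$.

```python
import math

# Sketch: Riemann sums of h(n(x)) * x for a piecewise continuous n(x).
# Support, n(x) and h are illustrative assumptions.

def n_of_x(x):
    """Step function: one unit of labor below x = 1/3, two units above."""
    return 1.0 if x < 1.0 / 3.0 else 2.0

def h(n):
    return math.sqrt(n)

def riemann_sum(lower, upper, cells):
    """Sum of h(n(X_i)) * X_i * |I_i| over a uniform partition, X_i = midpoint."""
    width = (upper - lower) / cells
    total = 0.0
    for i in range(cells):
        x_mid = lower + (i + 0.5) * width
        total += h(n_of_x(x_mid)) * x_mid * width
    return total

# Exact integral: int_0^{1/3} x dx + sqrt(2) * int_{1/3}^1 x dx
exact = 1.0 / 18.0 + math.sqrt(2.0) * (4.0 / 9.0)
errors = {cells: abs(riemann_sum(0.0, 1.0, cells) - exact)
          for cells in (10, 1000, 10000)}
print(errors)
```

Only the cell containing the jump contributes an error, and that error shrinks with the mesh of the partition, which is why a single discontinuity does not prevent convergence.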
7. A bounded function on a closed interval is Riemann integrable if and only if it is continuous almost everywhere, i.e., it is discontinuous on at most a set of measure zero.
References
Antras, Pol, Luis Garicano and Esteban Rossi-Hansberg, “Offshoring in a Knowledge Economy”, Quarterly Journal of Economics 121(1), 2006, 31-77.
Eeckhout, Jan, and Boyan Jovanovic, “Knowledge Spillovers and Inequality”, American Economic Review 92(5), 2002, 1290-1307.
Eeckhout, Jan, and Boyan Jovanovic, “Occupational Sorting and Development”, NBER working paper w13686, 2007.
Gabaix, Xavier, and Augustin Landier, “Why has CEO Pay Increased so Much?”, Quarterly Journal of Economics, forthcoming 2008.
Gale, David, and Lloyd Shapley, “College Admission and the Stability of Marriage”, American Mathematical Monthly 69, 1962, 9-15.
Garicano, Luis, “Hierarchies and the Organization of Knowledge in Production”, Journal of Political Economy 108(5), 2000.
Kelso, Alexander, and Vincent Crawford, “Job Matching, Coalition Formation, and Gross Substitutes”, Econometrica 50, 1982.
Koopmans, T. C., and M. J. Beckmann, “Assignment Problems and the Location of Economic Activity”, Econometrica 25, 1957, 52-76.
Kremer, Michael, “The O-Ring Theory of Economic Development”, Quarterly Journal of Economics 108(3), 1993, 551-575.
Kremer, Michael, and Eric Maskin, “Wage Inequality and Segregation by Skill”, NBER Working Paper No. w5718, 1996.
Lucas, Robert, “On the Size Distribution of Business Firms”, Bell Journal, 1978.
Luttmer, Erzo, “Selection, Growth, and the Size Distribution of Firms”, Quarterly Journal of Economics 122(3), 2007, 1103-1144.
Rossi-Hansberg, Esteban, and Mark Wright, “Establishment Size Dynamics in the Aggregate Economy”, American Economic Review 97(5), 2007, 1639-1666.