Gender Wage Gaps Reconsidered: A Structural Approach Using Matched
Employer-Employee Data*
Cristian Bartolucci
Collegio Carlo Alberto†
August 2012
Abstract
In this paper, we study the extent to which wage differentials between men and women can be explained by differences in productivity, disparities in friction patterns, segregation and wage discrimination. For this purpose, we propose an equilibrium search model that features rent splitting, on-the-job search and two-sided heterogeneity in productivity. The model is estimated using German matched employer-employee data. Overall, the results reveal that female workers are less productive and more mobile than males. In addition, female workers have on average slightly lower bargaining power than their male counterparts. The raw gender wage gap is 41 percent; 65 percent of this gap is accounted for by differences in productivity, 17 percent of this gap is driven by segregation, while differences in destruction rates explain 9 percent of the total wage-gap. Netting out differences in offer-arrival rates would increase the gap by 13 percent. Differences in the rent-splitting parameter explain 21 percent of the gap, which means that female workers receive wages that are 9 percent lower than their equivalent male counterparts.
JEL Code: J70, C51, J64
KEYWORDS: Labor market discrimination, search frictions, structural estimation, matched employer-employee data.
1 Introduction
In this paper, we estimate an equilibrium search model to study the extent to which wage
differentials between men and women can be explained by differences in productivity, disparities
in friction patterns, segregation and direct wage discrimination. The model features on-the-job
search, rent-splitting, and productivity heterogeneity in firms and workers. The structural
estimation combines group-specific productivity measures and the empirical distribution of firms'
productivity estimated from firm-level data, group-specific friction patterns estimated from
individual duration data, and individual wages. The model implies a structural wage equation, which
illustrates the precise relationship between wages, worker ability, firm productivity, friction
patterns, and bargaining power. The wage equation estimated in this paper may be best understood
as the structural counterpart of a standard Mincer equation, including only wage determinants
which are relevant from a theoretical point of view. The model allows for counterfactual analysis,
such as the decomposition of the observed wage gaps in terms of the differences in productivity
and frictions, or segregation and discrimination, taking into account equilibrium effects.
*I am especially grateful to Manuel Arellano for his guidance and constant encouragement. I would also like to thank Stéphane Bonhomme, Christian Bontemps, Raquel Carrasco, Zvi Eckstein, Chris Flinn, Andrés Erosa, Pietro Garibaldi, Carlos González-Aguado, José Labeaga, Joan Llull, Claudio Michelacci, Pedro Mira, Enrique Moral-Benito, Diego Puga, Jean Marc Robin, Steven Stern, Nieves Valdez, two anonymous referees and seminar participants at Uppsala University, Bank of Canada, Collegio Carlo Alberto, Bank of Italy, Bank of France, PSE, UQAM, CEMFI, University of Tucuman, University College London, ICEEE (2011), European Meeting of the Econometric Society (2009), Cosme Workshop, EC-Squared Conference on Structural Microeconometrics, SEA meetings (2008) and IZA-ESSLE (2008) for very helpful comments. Special thanks are due to Emily Moschini for excellent research assistance and to Nils Drews, Peter Jacobebbinghaus and Dana Muller at the Institute for Employment Research for invaluable support with the data. First Version: August 2008.
†Via Real Collegio, 30 - 10024 - Moncalieri (Torino), Italy. Email: [email protected].
Understanding wage gaps across different demographic groups has always been a central
concern in labor economics. Many studies have focused on explaining how much of the unconditional
mean wage differential between groups may be understood as wage discrimination.1 The
standard strategy involves estimating Mincer-type equations for both groups and then decomposing
the difference in mean wages into "explained" and "unexplained" components. The fraction of
the gap that cannot be explained by differences in observable characteristics is considered
discrimination. This kind of analysis is very informative from a descriptive perspective, but the
causal interpretation and the nature of discrimination are not clear. The reason is that most
of the observable characteristics included in the standard reduced-form wage equations, such as
education or experience, may be understood as proxies of the true wage determinants, such as
productivity, outside options, and the rent splitting rule. Whenever there is a residual difference
between groups in some of these wage determinants, the measure of discrimination may be biased.
The primary candidate to generate these biases is a between-group difference in productivity that
may persist after controlling for education, occupation, age and tenure.
To deal with this shortcoming, Hellerstein and Neumark (1999) and Hellerstein, Neumark
and Troske (1999) exploit newly available matched employer-employee data to calculate a novel
indicator for wage discrimination. They estimate the relative marginal products of various worker
types, which are then compared with their relative wages. If productivity is the only wage
determinant, which is true if there is perfect competition in the labor market, or if there are no
significant differences in friction patterns or in the distribution of firms faced by both groups,
any difference in wages that is not driven by a difference in productivity may be understood as
discrimination.
1 See Blau and Kahn (2003) and Altonji and Blank (1999) for good surveys.
However, comparing wages and productivity may provide an incomplete picture of the
problem. For example, wage differentials across groups are often accompanied by unemployment-rate
and job-duration differentials. There is a vast literature estimating differentials in
job-finding and job-retention rates across groups, directly observing duration in unemployment
and employment, or using experiments such as audit studies.2 Although there is agreement in
predicting effects of frictions on wages,3 there is scarce empirical evidence as to how much of the
wage gap can be accounted for by the differences in the friction patterns. Also, the variation
in worker composition is likely to be correlated with heterogeneity in the production technology,
which generates labor market segregation.4
Estimated structural models may provide a complete interpretation of observed wage gaps as
a consequence of between-groups differences in every wage determinant. Nevertheless, progress in
this direction has been slow, primarily due to the difficulty in separately identifying the impacts
of worker productivity differentials and discrimination, solely from worker-level data. The main
references are Eckstein and Wolpin (1999) and Bowlus and Eckstein (2002). Both papers are
focused on racial discrimination in the U.S. and deal with this empirical identification problem
through structural assumptions. Eckstein and Wolpin (1999) proposed a method based on a
two-sided, search-matching model that formally accounts for unobserved heterogeneity and unobserved
offered wages. They argued that differences in the bargaining power parameter (their index of
discrimination) are not identified unless some firm-side data are available. They computed bounds
for discrimination that were ultimately not informative on the estimation sample they worked
with. Bowlus and Eckstein (2002) also proposed a search model with heterogeneity in worker
productivity but including an appearance-based employer disutility factor. As long as there are
firms that do not discriminate, they are able to identify between-groups differences in the skill
distribution as well as the proportion of discriminatory employers. The first attempts to use
an equilibrium search model to study gender discrimination were made by Bowlus (1997) and
Flabbi (2010). Bowlus only focuses on the effect of gender differences in friction patterns on wage
differentials without distinguishing between differences in productivity and discrimination. Flabbi
(2010) uses the identification strategy proposed in Bowlus and Eckstein (2002) to additionally
disentangle the effect of productivity differentials.5
2 See Chapter 4 in Altonji and Blank (1999) for a good survey.
3 See van den Berg and van Vuuren (2010) for a good discussion of this issue.
4 See Altonji and Blank (1999) and Bayard, Hellerstein, Neumark and Troske (2003).
In this paper, we link two independent branches of the wage-discrimination literature. We
propose a natural step in extending the structural estimation literature focused on wage
discrimination. The availability of better data enables us to estimate a more complete model,6 using a
more transparent identification strategy to measure differences in productivity and taking into
account the effect on wages of labor market segregation. This approach may also be understood
as a step forward from the strategy proposed by Hellerstein and Neumark (1999), where only
productivity gaps were estimated. Here, we provide a more complete interpretation of the observed
wage gap, where the productivity gap is only one of the potential determinants of the difference
in wages between males and females.
The model builds upon the equilibrium search model with strategic wage bargaining and
on-the-job search presented in Cahuc, Postel-Vinay and Robin (2006). The model allows for
rent-splitting and, as in Eckstein and Wolpin (1999), we allow for different rent-splitting rules for male
and female workers to capture wage discrimination. The model has two sources of heterogeneity.
Workers are heterogeneous in terms of productive efficiency to capture the effect on equilibrium
wages of differences across sexes in productivity. Firms' technology is also heterogeneous and its
distribution is group-specific. This feature allows the model to fit an economy with segregation.
Workers do not find jobs instantaneously and, when working, there is a positive probability of
losing the current job. The parameters that describe these events are also allowed to be
gender-specific in order to capture the effect of frictions on the gender wage gap.
We use LIAB, a 1996-2005 matched employer-employee panel dataset made available by the
German Labor Agency.7 This panel provides us with a unique opportunity to disentangle the different
sources of the gender wage gap, because it contains valuable information on gender, wages and
occupations, but also because it is a panel that tracks firms as opposed to individuals.
Tracking firms is essential in the estimation of production functions using panel estimation methods.
To the best of our knowledge, this paper presents the first structural estimation using matched
employer-employee data to study labor market discrimination.8
5 To have an estimable model with employee-level data, Flabbi (2010) only includes heterogeneity at the match level, not taking segregation into account, and not allowing for on-the-job search. Flabbi does allow for wage bargaining; however, the bargaining power is not estimated.
6 Relative to Flabbi (2010), which at the moment is the most complete structural estimation in the wage-discrimination literature, the model presented in this paper allows for two-sided heterogeneity and on-the-job search. In the estimation, rent splitting parameters are not imposed, but are estimated. Note that we are not forcing the same distribution of firms across genders. See Sections 2 and 4 for details.
7 This dataset is subject to strict confidentiality restrictions. It is not directly available. After the IAB has approved the research project, the Research Data Center (FDZ) provides on-site use or remote access to external researchers.
8 There is a recent paper by Sulis (2007) that studies gender wage differentials in Italy, estimating a structural model. Sulis uses employee-level data with firm identifiers, without data on firms, such as capital or output.
To have a more homogeneous sample, we only consider firms in West Germany. The gender
wage gap in this part of Germany is particularly large. According to Blau and Kahn (2000), the
gender hourly-wage gap in West Germany is 32 percent, placing West Germany in position 6 in a
ranking of 22 industrialized countries. Many researchers are trying to understand the nature of
this large earning differential (Lauer, 2000; Heinze and Wolf, 2010; Antonczyk, Fitzenberger and
Sommerfeld, 2010).
The empirical analysis initially involves calculating the empirical distribution of firms'
productivity and differences in productivity between men and women. The linkage between worker-level
information and firm-level data is essential at this stage. From the worker-level data, we recover
the composition of the firm during each time period. The production function estimation exploits
within-firm variation in worker composition, capital and output. We estimate group-specific
productivity parameters and a firm-specific technology measure. We find large productivity
differentials for women and noticeable differences in the distribution of firms faced by both kinds of
workers.
We then analyze group-specific friction patterns by survival analysis. We find that women have
higher job-creation rates than men, and that females also have higher job-destruction rates than
males. Finally, we estimate bargaining power for male and female workers by means of individual
wages, gender-specific productivity measures for each firm, the gender-specific distribution of
firms' productivity and gender-specific friction patterns. In spite of the large wage differentials,
on average, women are found to have only slightly lower bargaining power than men.
In terms of wages, the raw gender wage gap is 41 percent. It turns out that most of the gap (65
percent) is accounted for by differences in productivity. Differences in destruction rates explain
9 percent while differences in the distribution of firms' productivity faced by male and female
workers explain 17 percent of the total wage-gap. Netting out differences in offer-arrival rates
would increase the gap by 13 percent. Presumably, within differences in arrival rates, destruction
rates, the distribution of employers and productivity there is some content of discrimination. The
economic literature, as well as the legislation of many countries, certainly recognizes these potential
sources of discrimination. But if we focus our attention on direct wage discrimination, we find
that differences in the rent-splitting parameter are responsible for 21 percent of the wage gap,
implying that female workers receive wages that are 8.6 percent lower than equivalent males in
terms of productivity and treated the same as males in terms of the offer arrival rate, destruction
rate and distribution of employers.
The rest of this paper is organized as follows. In the next section, we describe the structural
model. Section 3 describes the data. Section 4 presents the estimation of the structural model
inputs, namely the productivity measures and friction parameters. We present and discuss the
intermediate results and then describe the empirical strategy used to estimate the structural
wage equation and its results. In Section 5, the counterfactual experiments are discussed and we
compare our empirical results with those found using other strategies for detecting discrimination.
Conclusions are offered in Section 6.
2 Structural framework
In this section we describe the behavioral model of labor market search with matching and
rent-splitting. The primary goal of estimating a structural model is to clearly state a wage-setting
equation that enables us to measure the effect of each wage determinant. Once this wage equation
is estimated, it becomes straightforward to obtain the effect of discriminatory wage policies by
comparing a man's wage with the wage that a woman with the same wage determinants would
receive. This comparison is equivalent to a standard Oaxaca-Blinder decomposition; the main
difference lies in the definition of wage determinants. An Oaxaca-Blinder decomposition includes
observable characteristics of the worker and the job that are potentially correlated with the wage.
By comparison, in our decomposition we include wage determinants that are driven by economic
theory. We model the labor market, and therefore we propose a theory that explains how wages
are determined. According to that theory, wages are a function of the extent of frictions in the
labor market, the productivity of the worker, the productivity of the firm and the bargaining
power of the worker. Knowing the function that links wages to these wage determinants, we can
calculate the portion of the wage gap that is driven by differences in each wage determinant.
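For readers less familiar with the reduced-form benchmark mentioned above, the following is a minimal sketch of a two-fold Oaxaca-Blinder decomposition. It is not the decomposition used in this paper: the function names (`ols`, `oaxaca_blinder`), the choice of male coefficients as the reference, and the synthetic schooling and log-wage data are all illustrative assumptions.

```python
import numpy as np

def ols(X, y):
    """Least-squares coefficients of y on X (X already contains a constant)."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

def oaxaca_blinder(X_m, y_m, X_f, y_f):
    """Two-fold decomposition of mean(y_m) - mean(y_f), male coefficients as reference."""
    b_m, b_f = ols(X_m, y_m), ols(X_f, y_f)
    xbar_m, xbar_f = X_m.mean(axis=0), X_f.mean(axis=0)
    explained = (xbar_m - xbar_f) @ b_m    # gap due to different characteristics
    unexplained = xbar_f @ (b_m - b_f)     # gap due to different returns
    return explained, unexplained

# Hypothetical synthetic data: log wages depend on years of schooling only.
rng = np.random.default_rng(0)
n = 5000
school_m = rng.normal(12.0, 2.0, n)
school_f = rng.normal(11.5, 2.0, n)
X_m = np.column_stack([np.ones(n), school_m])
X_f = np.column_stack([np.ones(n), school_f])
y_m = 1.0 + 0.08 * school_m + rng.normal(0.0, 0.3, n)
y_f = 0.9 + 0.08 * school_f + rng.normal(0.0, 0.3, n)

explained, unexplained = oaxaca_blinder(X_m, y_m, X_f, y_f)
raw_gap = y_m.mean() - y_f.mean()
```

Because each regression includes a constant, the two components add up to the raw gap exactly. The structural decomposition described in this section keeps that accounting logic but replaces the observable regressors with the wage determinants implied by the model.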
Previous research has shown the aptitude of search-matching models in describing labor market
outcomes and dynamics. Building on these assessments, we are interested in using the structural
model as a measurement tool that enables us to empirically disentangle the effect of each wage
determinant on the gender wage gap. Search-matching models have been used as an instrument
to address empirical questions in a variety of papers. Examples include the previously mentioned
papers in the discrimination literature, but there are also interesting contributions in measuring
returns to education (Eckstein and Wolpin, 1995) or in analyzing the effect of a change in the
minimum wage (Flinn, 2006).
2.1 Assumptions
We propose a continuous time, infinite horizon, stationary economy. This economy is populated
by infinitely lived firms and workers. All agents are risk-neutral and discount future income at
the rate $\rho > 0$.
Workers: We normalize the measure of workers to one. Workers may belong to one of the
groups ($k$) defined in terms of gender.9 Workers have different abilities ($\varepsilon$) measured in terms
of efficiency units they provide per unit of time. The distribution of ability in the population
of workers is exogenous and specific to each group, with cumulative distribution function $L_k(\varepsilon)$.
Group-specific distributions of efficiency units provided by workers are crucial in considering the
between-groups differences in productivity. This source of heterogeneity is perfectly observable by
every agent in the economy. Each worker may be either unemployed or employed. The workers
from a generic group $k$ that are not currently working receive a flow utility proportional to their
ability, $b_k \varepsilon$. The workers' efficiency in home production is associated with their productivity in the
labor market and the parameter that describes that association is allowed to be gender-specific.10
9 The structural model abstracts from many dimensions that may be relevant in the wage setting. In order to compare jobs that are as similar as possible, the empirical analysis is clustered at the sector and occupation level. See Section 4 for details.
10 The assumption that worker productivities "at home" and at work are proportional greatly simplifies the upcoming analysis. Furthermore, we allow $b$ to be gender-specific to be consistent with a potential gender specialization in home production.
Firms: Every firm is characterized by its productivity ($p$). We assume that $p$ is distributed
across firms according to a given cumulative distribution function $H_k(p)$, which is continuously
differentiable with support $[p_{\min}, p_{\max}]$. The primitive distribution of firms' productivity faced
by workers may differ among groups, allowing the model to be robust to labor market
segregation. This source of heterogeneity is perfectly observable by every agent in the economy. The
opportunity cost of recruiting a worker is zero.
Each firm contacts a worker of a given group $k$ at the same constant rate, regardless of the
firm's bargained wage, its productivity or how many filled jobs it has. Unemployed workers receive
job offers at a Poisson rate $\lambda_{0k} > 0$. Employed workers may also search for a better job while
employed and receive job offers at a Poisson rate $\lambda_{1k} > 0$. We treat $\lambda_{0k}$ and $\lambda_{1k}$ as exogenous
parameters specific to each group $k$. Searching while unemployed and searching while employed
have no cost. Employment relationships are exogenously destroyed at a constant rate $\delta_k > 0$,
leaving the worker unemployed and the firm with nothing. The marginal product of a match
between a worker with ability $\varepsilon$ and a firm with productivity $p$ is $\varepsilon p$.
Whenever an employed worker meets a new firm, the worker must choose an employer and
then, if she switches employers, she bargains with the new employer with no possibility of recalling
her old job. If she stays at her old job, nothing happens. Consequently, when a worker negotiates
with a firm, her alternative option is always unemployment. The surplus generated by the match
is split in proportions $\beta_k$ and $(1 - \beta_k)$, for the worker and the firm respectively. $\beta_k$ is exogenously
given and specific to each group $k$. We will refer to $\beta_k$ as the rent-splitting parameter. As in
Eckstein and Wolpin (1999), we interpret $\beta_{male} - \beta_{female}$ as an index of the level of discrimination
in the labor market. A difference in $\beta$ in the same kind of job and sector reveals differential
payments unrelated to productivity and outside options, which are only driven by belonging to a
given group.
It is well known that many other factors may have an impact on the rent-splitting
parameters. The main candidates are: negotiation skills, risk aversion and the discount factor. In this
model agents are assumed to be risk-neutral and the discount rate is homogeneous across groups.
Although this could be part of the story that explains differences in $\beta$, there is scant convincing
empirical evidence of gender differences in these primitive parameters.11 This could be the reason
why risk aversion and the discount factor have been held constant across genders in most of the
empirical studies on wage discrimination.
11 The most convincing evidence comes from artificial experiments; see Croson and Gneezy (2009) for a good survey.
It is not clear whether $\beta$ can be interpreted as a Nash bargaining power. Shimer (2006) argues
that in a simple search-matching model with on-the-job search, the standard axiomatic Nash
bargaining solution is inapplicable, because the set of feasible payoffs is not convex. This
non-convexity arises because an increase in the wage has a direct negative effect on the firm's rents,
but an indirect positive effect through raising the duration of the job. Whether this critique holds
depends on the shape of the firm productivity distribution. Whether $\beta$ can be understood as a Nash
bargaining power is not essential for this study. If this critique holds up, we still interpret $\beta$ as a
rent-splitting parameter that simply states the proportion of the surplus that goes to the worker.
A difference in these parameters remains informative about wage discrimination.
The model assumes that the worker does not have the option of recalling the old employer.
Hence, there is no possibility of Bertrand competition between firms of the kind featured in Cahuc,
Postel-Vinay and Robin (2006). Whether to allow firm competition à la Bertrand or not is controversial.
While the Cahuc et al. bargaining scenario may be conceptually more appealing and may help
to avoid the Shimer critique, it is not clear how realistic this assumption is. Mortensen (2003)
argues that counteroffers are empirically uncommon. Moscarini (2008) illustrates that, in a model
with search effort, firms may credibly commit to ignore outside offers to their employees, letting
them go without a counteroffer, and suffer the loss, in order to keep in line the other employees'
incentives to not search on the job. Moreover, it can be shown that including a marginally positive
cost of negotiation makes it unprofitable for firms to try to poach the worker from better firms,
and then the Bertrand competition vanishes.
In an environment where contracts cannot be written and wages are continuously negotiated,
the outside option of the job is always unemployment. In this context, if a worker receives an offer
from a firm with higher productivity, she must switch. She cannot use this offer to renegotiate
with her current firm, because she knows that this offer will not be available tomorrow, and then
her future option will be unemployment.12 This possibility is also discussed in Flinn and
Mabli (2010).
12 If wages are continuously negotiated, firms could increase the wage of the worker at the same moment as the on-the-job offer to try to keep the worker from quitting. If the alternative employer is more productive, it can force the transition by also paying a premium. This auction for the worker finishes when the current firm cannot pay more than the full productivity and the transition takes place, as in a Bertrand competition. This premium may be considered a hiring cost for the firm. Modeling this possibility is outside of the scope of this paper.
Beyond the theoretical relevance of between-firms Bertrand competition, this assumption is
not critical for most of the results presented in this paper. In the Appendix, we illustrate that
using the same data with a variation of the model where between-firm Bertrand competition is
allowed, the gender wage-gap decomposition remains practically unchanged.
This model is similar to the model presented in Flinn and Mabli (2010). The primary difference
is in the distributions of productivity. To have a model that is estimable only with employee-level
data, they assume that there is a technologically-determined discrete distribution of worker-firm
productivity. In other words, they assume discrete heterogeneity at the match level, while here
we assume two-sided continuous heterogeneity. The model presented here also has the convenient
property of producing a closed-form solution for the wage setting equation.
2.2 Value Functions
The expected value of income for a worker with ability $\varepsilon$, who belongs to group $k$, currently
employed at wage $w(p,\varepsilon,k)$ is denoted by $E(w(p,\varepsilon,k),\varepsilon,k)$ and it satisfies:
$$
\begin{aligned}
\rho E(w(p,\varepsilon,k),\varepsilon,k) ={}& w(p,\varepsilon,k) + \delta_k\bigl(U(\varepsilon,k) - E(w(p,\varepsilon,k),\varepsilon,k)\bigr) \\
&+ \lambda_{1k}\int_{w(p,\varepsilon,k)}^{w_{\max}(p,\varepsilon,k)} \bigl[E(\tilde w(p,\varepsilon,k),\varepsilon,k) - E(w(p,\varepsilon,k),\varepsilon,k)\bigr]\, dF(\tilde w(p,\varepsilon,k)\mid\varepsilon,k).
\end{aligned}
\tag{1}
$$
The expected value of being unemployed for a worker with ability $\varepsilon$, who belongs to group $k$
is given by:
$$
\rho U(\varepsilon,k) = b_k\varepsilon + \lambda_{0k}\int_{w_{\min}(p,\varepsilon,k)}^{w_{\max}(p,\varepsilon,k)} \bigl[E(\tilde w(p,\varepsilon,k),\varepsilon,k) - U(\varepsilon,k)\bigr]\, dF(\tilde w(p,\varepsilon,k)\mid\varepsilon,k).
$$
Finally, the value of the match with productivity $p\varepsilon$ for the firm when paying a wage $w(p,\varepsilon,k)$
to a worker of group $k$ is given by:
$$
\rho J(w(p,\varepsilon,k),p,\varepsilon,k) = p\varepsilon - w(p,\varepsilon,k) - \bigl(\delta_k + \lambda_{1k}\bar F(w(p,\varepsilon,k)\mid\varepsilon,k)\bigr)\,J(w(p,\varepsilon,k),p,\varepsilon,k),
\tag{2}
$$
where $\bar F(w(p,\varepsilon,k)\mid\varepsilon,k) = 1 - F(w(p,\varepsilon,k)\mid\varepsilon,k)$ and $F(w(p,\varepsilon,k)\mid\varepsilon,k)$ is the equilibrium cumulative
distribution function of wages paid by firms with a productivity lower than $p$ to workers with an
ability $\varepsilon$ who belong to group $k$. Note that every parameter is group-specific. As the alternative
value of the match for the firm is always zero, this value does not depend on alternative matches
and is therefore independent of the parameters of the other groups of workers. Although every
group is sharing the same labor market, all value functions may be considered group-by-group as
if they were in independent markets. For simplicity of the notation, we therefore omit the $k$-index.
Note that $w(p,\varepsilon)$ is a function of $p$ and $\varepsilon$; therefore, given $p$ and $\varepsilon$, the wage is a redundant state
variable which is only included for simplicity of exposition.
These expressions are equivalent to the value functions of the model with heterogeneous firms
presented in Shimer (2006), including heterogeneity in workers' ability. Here, wages are determined
by the following surplus splitting rule:
$$
(1-\beta)\bigl[E(w(p,\varepsilon),\varepsilon) - U(\varepsilon)\bigr] = \beta J(w(p,\varepsilon),p,\varepsilon).
\tag{3}
$$
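After some algebra (see the Online Appendix for the proof13), it can be shown that:
$$
w(p,\varepsilon) = p\varepsilon - \bigl(\rho+\delta+\lambda_1\bar F(w(p,\varepsilon)\mid\varepsilon)\bigr)\,\frac{1-\beta}{\beta}\int_{w_{\min}(p,\varepsilon)}^{w(p,\varepsilon)} \frac{1}{\rho+\delta+\lambda_1\bar F(\tilde w(p,\varepsilon)\mid\varepsilon)}\, d\tilde w(p,\varepsilon).
$$
Noting that $\bar F(w(p,\varepsilon)\mid\varepsilon) = \bar H(p)$ and changing the variable within the integral, we obtain a
first-order differential equation,
$$
w(p,\varepsilon) = p\varepsilon - \bigl(\rho+\delta+\lambda_1\bar H(p)\bigr)\,\frac{1-\beta}{\beta}\int_{p_{\min}(\varepsilon)}^{p} \frac{1}{\rho+\delta+\lambda_1\bar H(p')}\,\frac{d w(p',\varepsilon)}{dp'}\,dp'.
$$
Solving this differential equation yields:
$$
w(p,\varepsilon) = \varepsilon p - \varepsilon(1-\beta)\bigl(\rho+\delta+\lambda_1\bar H(p)\bigr)^{\beta}\int_{p_{\min}}^{p} \bigl(\rho+\delta+\lambda_1\bar H(p')\bigr)^{-\beta}\,dp'.
\tag{4}
$$
13 Online Appendix available at http://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf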
This expression states a clear relationship between wages $w(p,\varepsilon)$, workers' ability ($\varepsilon$), firm
productivity ($p$), the distribution of firms' productivity ($H(\cdot)$), friction patterns ($\lambda_1$, $\delta$) and the
rent-splitting parameter ($\beta$).14 This wage equation is relatively similar to the one proposed by
Cahuc, Postel-Vinay and Robin (2006) when the wage is bargained between a firm with
productivity $p$ and an unemployed worker with ability $\varepsilon$.15
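The step from the value functions (1)-(2) and the splitting rule (3) to the first integral expression for the wage can be sketched as follows. This is only an outline under two assumptions that the text does not spell out: the worker's value $E$ is differentiable in the wage, and $E$ equals $U$ at the lowest accepted wage $w_{\min}$ (zero surplus at the marginal match). The Online Appendix remains the reference for the actual proof.

```latex
\begin{align*}
% Firm value, obtained by solving (2) for J:
J(w,p,\varepsilon) &= \frac{p\varepsilon - w}{\rho+\delta+\lambda_1\bar F(w\mid\varepsilon)},\\
% Differentiating (1) with respect to w (Leibniz rule):
\frac{\partial E(w,\varepsilon)}{\partial w} &= \frac{1}{\rho+\delta+\lambda_1\bar F(w\mid\varepsilon)},\\
% Integrating from w_min, assuming E(w_min, eps) = U(eps):
E(w,\varepsilon)-U(\varepsilon) &= \int_{w_{\min}}^{w}\frac{d\tilde w}{\rho+\delta+\lambda_1\bar F(\tilde w\mid\varepsilon)},\\
% Substituting both expressions into the splitting rule (3):
(1-\beta)\int_{w_{\min}}^{w}\frac{d\tilde w}{\rho+\delta+\lambda_1\bar F(\tilde w\mid\varepsilon)}
  &= \beta\,\frac{p\varepsilon-w}{\rho+\delta+\lambda_1\bar F(w\mid\varepsilon)}.
\end{align*}
```

Solving the last line for $w$ gives the integral expression stated right after the splitting rule (3).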
As expected, the model predicts that the mean equilibrium wage increases in $\beta$, and that
the mean wage paid by a firm with productivity $p$ increases in $p$. Note in Equation (4) that if
$\beta = 1$ then $w(p,\varepsilon) = p\varepsilon$: the maximum wage that a firm with productivity $p$ can pay to a worker
with ability $\varepsilon$ is the full productivity. If $\beta = 0$ then $w(p,\varepsilon) = p_{\min}\varepsilon$, that is, the minimum wage that a
worker would accept to leave unemployment. Figure 1 illustrates that the mean equilibrium wage
increases when $\lambda_1$ increases and when $\delta$ decreases.16 Many models in the literature predict that
the mean equilibrium wage decreases in the amount of frictions (see for example the models in
Burdett and Mortensen, 1998; Bontemps, Robin and van den Berg, 2000; Postel-Vinay and Robin,
2002; Cahuc, Postel-Vinay and Robin, 2006). Wages are decreasing in the degree of frictions ($\lambda_1/\delta$)
due to a compositional effect and a direct effect. The compositional effect is due to the fact that
when frictions decrease the workers are, on average, higher in the job ladder. The direct effect
arises because the model is asymmetric in workers and employers: when frictions decrease the outside
option of the worker increases, but the outside option of the firm is always zero.17
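The limiting cases and comparative statics of Equation (4) can be checked numerically. The sketch below evaluates the wage equation by trapezoidal quadrature. The function names `wage` and `H_uniform`, the uniform productivity distribution on [100, 3000] and the discount rate of 0.05 are illustrative assumptions, not estimates from the paper; only the three values used for beta, lambda_1 and delta are taken from the calibration reported for Figure 1.

```python
import numpy as np

def wage(p, eps, beta, rho, delta, lam1, H, p_min, n_grid=2001):
    """Evaluate wage equation (4) by trapezoidal quadrature on [p_min, p]."""
    grid = np.linspace(p_min, p, n_grid)
    # A(p') = rho + delta + lambda_1 * (1 - H(p')), evaluated on the grid
    A = rho + delta + lam1 * (1.0 - H(grid))
    integrand = A ** (-beta)
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid))
    return eps * p - eps * (1.0 - beta) * A[-1] ** beta * integral

# Hypothetical primitives: uniform firm productivity and an assumed discount rate.
P_MIN, P_MAX, RHO = 100.0, 3000.0, 0.05

def H_uniform(p):
    """Assumed cdf of firm productivity (uniform on [P_MIN, P_MAX])."""
    return np.clip((p - P_MIN) / (P_MAX - P_MIN), 0.0, 1.0)

# beta, lambda_1 and delta as reported for the Figure 1 calibration.
BETA, LAM1, DELTA = 0.292, 0.217, 0.034

w_mid = wage(1500.0, 1.0, BETA, RHO, DELTA, LAM1, H_uniform, P_MIN)
w_full = wage(1500.0, 1.0, 1.0, RHO, DELTA, LAM1, H_uniform, P_MIN)   # beta = 1
w_floor = wage(1500.0, 1.0, 0.0, RHO, DELTA, LAM1, H_uniform, P_MIN)  # beta = 0
```

Under these assumed primitives, `w_full` equals the full match productivity and `w_floor` equals the productivity of the least productive firm times ability, matching the two limits discussed above, while `w_mid` lies strictly between them. Note that the sketch holds firm productivity fixed, so it captures only the direct effect of frictions and not the compositional effect on the mean wage.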
14 Solving the differential equation we have:
$$
w(p,\varepsilon) = \varepsilon p - \varepsilon(1-\beta)\bigl(\rho+\delta+\lambda_1\bar H(p)\bigr)^{\beta}\int_{p_{\min}(\varepsilon)}^{p} \bigl(\rho+\delta+\lambda_1\bar H(p')\bigr)^{-\beta}\,dp'.
$$
In Section A.2 of the Online Appendix, we show that the minimum accepted productivity is independent of the worker type. Therefore we directly write $p_{\min}$ in the lower limit of integration.
15 Note that in Cahuc et al. (2006), when the wage is bargained between a firm and an unemployed worker, Bertrand competition does not hold. Therefore, their proposed scenario and ours are equivalent. The only difference stems from the fact that both parties take into account future Bertrand competition.
16 These simulations are calibrated using the estimated parameters of male highly-qualified workers in the manufacturing sector; see Section 4. The parameters are: $\beta = 0.292$, $\lambda_1 = 0.217$ and $\delta = 0.034$.
17 See van den Berg and van Vuuren (2010) for a deeper discussion on the effect of frictions on wages.
Figure 1: Wage Setting Equation
[Four panels: mean daily wage against beta; within-firm mean daily wage against the firm's productivity; mean daily wage against delta; mean daily wage against lambda.]
We assume that the economy is in a steady state. The standard stationary equilibrium
conditions are exploited. The inflow must balance the outflow for every stock of workers, defined
in terms of individual ability, employment status, and for those workers who are employed, firm
productivity.
• The inflow to unemployment must be equal to its outflow, $\lambda_0 u = \delta(1 - u)$, where $u$ is the
unemployment rate given by:
$$
u = \frac{\delta}{\delta + \lambda_0}
\tag{5}
$$
• The inflow to jobs in firms with productivity $p$ or lower must be equal to its outflow:
$$
\lambda_0 H(p)\,u = \bigl(\lambda_1\bar H(p) + \delta\bigr)\,G(p)\,(1 - u),
$$
where $G(p)$ is the fraction of workers employed at a firm with productivity $p$ or lower.
Using condition (5) and rearranging:
$$
G(p) = \frac{H(p)}{1 + \frac{\lambda_1}{\delta}\bar H(p)}.
\tag{6}
$$
This stationary condition (or its counterpart in terms of wages) is quite common and has
been broadly used after Burdett and Mortensen (1998) to infer the primitive distribution of
productivity (or the primitive distribution of wages) when only the distribution of
productivity (or distribution of wages) for employed workers is observable. Since we use matched
employer-employee data, we can directly observe the empirical distribution of
productivity at the firm level. We use this stationary condition to construct the likelihood for the
duration analysis in Section 4.
� The fraction of employed workers with ability " or lower than " that are working in �rms
with productivity p or lower than p are (1��) ~F ("; p); where ~F ("; p) is the joint cdf of " and
p: These workers leave this group due to a better o¤er, or because they become unemployed;
such event occurs with probability (� + �1 �H(p)). The in�ow to this group is given by the
unemployed workers with ability " or lower than ", (ie: L(")�) who receive an o¤er from a
�rm with productivity p or lower than p. This last event occurs with probability �0H(p).
We now obtain the following condition:
(1� �)(� + �1 �H(p))F ("; p) = �0H(p)L(")�:
Using conditions (5) and (6) and rearranging:
F ("; p) =H(p)
(1 + �1��H(p))
L(") = G(p)L("): (7)
This expression states that there is no sorting between firm's productivity and worker's ability;18 this statement is controversial, and there is an active debate in the assortative matching literature.
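The stationarity conditions (5) and (6) can be checked numerically. The sketch below is only an illustration: the function names and the job-finding rate `lambda0 = 1.0` are hypothetical, while the values of lambda1 and delta are borrowed from the estimates reported later for highly-qualified men in manufacturing. It assumes that H-bar(p) = 1 - H(p).

```python
def unemployment_rate(delta, lambda0):
    """Steady-state unemployment rate implied by condition (5)."""
    return delta / (delta + lambda0)


def employed_cdf(H_p, lambda1, delta):
    """G(p) from condition (6), given the value H(p) of the sampling cdf."""
    ratio = lambda1 / delta
    return H_p / (1.0 + ratio * (1.0 - H_p))  # assumes H-bar(p) = 1 - H(p)


def flow_gap(H_p, lambda0, lambda1, delta):
    """Inflow minus outflow for jobs with productivity <= p (zero in steady state)."""
    unemp = unemployment_rate(delta, lambda0)
    G_p = employed_cdf(H_p, lambda1, delta)
    inflow = lambda0 * H_p * unemp
    outflow = (lambda1 * (1.0 - H_p) + delta) * G_p * (1.0 - unemp)
    return inflow - outflow


lambda0 = 1.0                    # hypothetical value, not taken from the paper
lambda1, delta = 0.217, 0.034    # per annum estimates, high-qualification men, manufacturing
for H_p in (0.1, 0.5, 0.9):
    print(H_p, employed_cdf(H_p, lambda1, delta), flow_gap(H_p, lambda0, lambda1, delta))
```

Because employed workers keep climbing the job ladder, G(p) lies below H(p) everywhere except at the top of the support, where both equal one.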
Becker (1973) showed that in a model without search frictions but with transferable utility, if production functions are supermodular, then any competitive equilibrium exhibits positive assortative matching. More recently, Shimer and Smith (2000) and Atakan (2006) show that in search models, complementarities in production functions are not sufficient to ensure assortative matching.
18 To show that there is no sorting, condition (7) is necessary but not sufficient. $p_{min}$, the minimum productivity, also needs to be independent of the worker ability. This second condition also holds in this model (the proof is in the Online Appendix at https://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf).
After Abowd, Kramarz, and Margolis (1999), the empirical literature has primarily focused on estimating the correlation between worker and firm fixed effects, using matched employer-employee data. However, there are still no definitive results. Abowd et al. (1999) find a small negative correlation between firm and worker fixed effects for France and a zero correlation for the U.S. Lindeboom, Mendez and van den Berg (2010), using a Portuguese matched employer-employee dataset, find that there is positive assortative matching.
3 Data
Linked Employer-Employee Data from the German Federal Employment Agency
(LIAB).
We use the linked employer-employee dataset of the IAB (denoted LIAB), covering the period 1996-2005. LIAB was created by matching the data of the IAB establishment panel and the process-produced data of the Federal Employment Services (social security records). The distinctive feature of these data is the combination of information about individuals with details concerning the firms in which these people work. The worker-level source contains valuable data on age, sex, nationality, daily wage (censored at the upper earnings limit for social security contributions),19 schooling/training, the establishment number and occupation. Occupation is recorded using a 3-digit code, but in this paper it is collapsed into two categories: high and low qualification occupations.20
The firm-level data come from the IAB Establishment Panel. These data are drawn from a stratified sample of plants, where the strata are defined over industries and plant sizes (large plants are oversampled), but the sampling within each cell is random. The firm data provide details on total sales, value added, investment, depreciation,21 number of workers and sector.22 In particular, only firms with more than 10 workers, positive output and positive depreciated capital were included in our subsample. Since firms of different sectors do not share the same market, we construct separate samples for each sector. LIAB has a very detailed industry classification. We focus on four primary industries: Manufacturing, Construction, Trade, and Services.23 Participation of establishments is voluntary, but the response rates are high (exceeding 70 percent). However, the response rate on some variables key for our purposes is lower. Among survey respondents, only 60 percent of firms in the previous four industries provided valid responses for output. To estimate productivity, we need data on output and number of workers in each category. We only consider observations from the old Federal Republic of Germany (West Germany). The final number of observations in our sample of firms is 15,174. Table 1 provides descriptive statistics of the final sample of firms.

19 In the sample of firms used in this paper, 14.7% of the worker observations have censored information on wages. This proportion varies significantly across genders and occupations: 4.7% of female observations and 18.1% of male observations are censored. The proportion of worker observations in high-qualification occupations with wages that exceed the upper earnings limit for social security contributions is 37.0%, while the corresponding proportion in low-qualification occupations is 3.1%.
20 We assigned the following groups to the low qualification occupations: agrarian occupations, manual occupations, services and simple commercial or administrative occupations. We classified the following as high qualification occupations: engineers, professional or semi-professional occupations, qualified commercial or administrative occupations, and managerial occupations.
21 The survey gives information about investment made to replace depreciated capital.
22 For a more detailed description of this dataset, see Alda et al. (2005).
Table 1: Firms - Descriptive Statistics

              No. of    Output     No. of        Women (%)         Men (%)
              firms     (mean)*    workers     Low-Q   High-Q   Low-Q   High-Q
Manufact.      7,354     151.0    4,297,762     11.9     8.1     55.3    24.7
Construct.     1,491      30.2      170,786     12.9    11.5     58.1    17.6
Trade          2,078      67.4      247,884     30.6    17.5     34.4    17.6
Services       4,251      30.4    1,043,678     21.1    21.0     38.9    20.0
Total         15,174      92.8    5,760,110     14.2    11.0     51.5    23.4

* Per annum total output in millions of euros
One of the main advantages of this dataset is that it has information for all employees subject to social security in each firm.24 The employee data are matched to firms for which we have valid estimates of productivity through a unique firm identifier. The raw data contain 21,246,022 observations between 1996 and 2005. After the final trimming we have a 10-year unbalanced panel, including a total of 5,760,110 worker observations distributed across 15,174 firm observations. Given this selection on top of the original oversampling of large plants, the sample becomes less representative. According to the Federal Statistical Office (Statistisches Bundesamt Deutschland), between 1996 and 2005 the proportion of the workforce in the manufacturing sector ranges between 25.6 and 31.7 percent. In the sample used in this paper, this proportion exceeds 74 percent (Table 1). For that reason, the analysis is performed clustering at the industry and occupation level and, when making inference on the total population, the between-cluster aggregation of results is calculated using weights obtained from the G-SOEP.25

23 The service sector includes three kinds of services defined in the survey: industrial, transport and communication, and other services. In the database, there is also information about firms in the financial and public sectors. No measure of output is well-defined for these sectors, so they have been excluded from the analysis.
24 Employees subject to social security are workers, other employees and trainees who are liable to health, pension and/or unemployment insurance or whose contributions to pension insurance are partly paid by the employer. We do not have information on workers who are not liable to social security. The following forms of employment are not considered liable to social security: civil servants, self-employed persons, unpaid family workers and so-called "marginal" part-time workers (a "marginal" part-time worker is a person who is either only employed for a short term or paid a maximum wage of €400 per month).
In the sample, women are, on average, younger than men. They also have less tenure and less experience. Women are overrepresented in high qualification occupations. The proportion of immigrants is larger within the male group. See Table 2 for details on the sample of workers.
Table 2: Workers - Descriptive Statistics

                            Women         Men
Immigrant (%)                 8.4        10.4
Age (years)                  39.2        40.7
Tenure (years)               10.1        12.0
Experience (years)           15.3        17.1
High-Qualification (%)       46.4        31.9
Observations            1,290,156   4,130,453
The primary goal of this study is to understand the gender wage gap. In LIAB, individual wages are top-coded. Consequently, the wage gap is estimated by maximum likelihood assuming that log-wages are normally distributed within the firm. The mean of the log-wage for each group is estimated using worker-level data, maximizing saturated normal likelihoods at the firm level. This is consistent with the model if we assume that, conditional on the worker group, ability is log-normally distributed.
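To make the top-coding correction concrete, the following sketch estimates the mean and standard deviation of log-wages for a single group when wages above a cap are censored. It is an illustration under assumptions, not the estimation code behind the paper: the EM iteration is just one convenient way to maximize a censored-normal likelihood, and the function names and the pseudo-sample are hypothetical. The mean wage is then recovered from the log-normal mean formula.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal helper (pdf, cdf, inv_cdf)


def censored_normal_em(values, cap, n_iter=500):
    """EM estimate of (mean, sd) for normal data right-censored at `cap`.

    Entries >= cap are treated as top-coded; only their count is used.
    """
    observed = [y for y in values if y < cap]
    n = len(values)
    n_cens = n - len(observed)
    # Naive starting values that ignore the censoring.
    mu = (sum(observed) + n_cens * cap) / n
    var = (sum((y - mu) ** 2 for y in observed) + n_cens * (cap - mu) ** 2) / n
    sigma = math.sqrt(var) if var > 0 else 1.0
    if n_cens == 0:
        return mu, sigma
    for _ in range(n_iter):
        a = (cap - mu) / sigma
        tail = max(1.0 - STD_NORMAL.cdf(a), 1e-300)
        mills = STD_NORMAL.pdf(a) / tail                 # inverse Mills ratio
        e_y = mu + sigma * mills                         # E[y | y > cap]
        e_y2 = e_y ** 2 + sigma ** 2 * (1.0 + a * mills - mills ** 2)  # E[y^2 | y > cap]
        mu_new = (sum(observed) + n_cens * e_y) / n
        ss = sum((y - mu_new) ** 2 for y in observed)
        ss += n_cens * (e_y2 - 2.0 * mu_new * e_y + mu_new ** 2)
        mu, sigma = mu_new, math.sqrt(ss / n)
    return mu, sigma


def mean_wage(mu, sigma):
    """Mean of a log-normal wage given the mean and sd of the log-wage."""
    return math.exp(mu + 0.5 * sigma ** 2)


# Hypothetical pseudo-sample: evenly spaced normal quantiles, top-coded at the 85th percentile.
true_dist = NormalDist(mu=4.5, sigma=0.4)
n = 2000
log_wages = [true_dist.inv_cdf((i + 0.5) / n) for i in range(n)]
cap = true_dist.inv_cdf(0.85)
top_coded = [min(y, cap) for y in log_wages]

naive_mu = sum(top_coded) / n
mu_hat, sigma_hat = censored_normal_em(top_coded, cap)
print(naive_mu, mu_hat, sigma_hat, mean_wage(mu_hat, sigma_hat))
```

Ignoring the cap biases the mean log-wage downward, and more so for groups with a larger censored share, which is why the correction matters for a comparison between men and women.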
The difference in conditional means is 21 percent (see Table 17 in the Appendix). This implies that women, on average, have salaries that are 21 percent lower than men with the same set of observable characteristics. The unconditional wage differential is 41 percent, but it is not stable across sectors and occupations (see Table 3). Mean wages, estimated across industries and occupations, illustrate that the gap ranges between 30 and 45 percent. Wage gaps are significantly different from zero for every sector and for every group. Earning differentials are larger for highly qualified workers in the manufacturing, construction and trade sectors.

25 The weights for each group are estimated from the relative frequencies in the 1996-2005 sample of the G-SOEP; these include: manufacturing-high qualification-men 12.3%, manufacturing-low qualification-men 15.6%, manufacturing-high qualification-women 6.1%, manufacturing-low qualification-women 4.8%, construction-high qualification-men 4.1%, construction-low qualification-men 7.0%, construction-high qualification-women 1.1%, construction-low qualification-women 0.2%, trade-high qualification-men 6.1%, trade-low qualification-men 2.9%, trade-high qualification-women 10.5%, trade-low qualification-women 2.8%, services-high qualification-men 10.3%, services-low qualification-men 4.2%, services-high qualification-women 8.7%, and services-low qualification-women 3.1%.
Table 3: Gender Wage Gap

                          Mean Daily-Wage        W-Gap
                          Women       Men         (%)
Manufact.    Low-Q        76.94     109.25      29.57%
                         (0.41)     (0.59)     (1.00%)
             High-Q      105.74     190.30      44.43%
                         (0.11)     (0.22)     (0.33%)
Construct.   Low-Q        53.35      96.59      44.77%
                         (0.14)     (0.16)     (0.30%)
             High-Q       82.81     150.90      45.11%
                         (0.27)     (0.54)     (0.81%)
Trade        Low-Q        47.64      84.90      43.16%
                         (0.08)     (0.53)     (0.51%)
             High-Q       67.76     119.20      43.88%
                         (0.02)     (0.09)     (0.11%)
Services     Low-Q        49.11      88.47      44.80%
                         (0.07)     (0.17)     (0.24%)
             High-Q       87.44     156.31      44.06%
                         (0.26)     (0.68)     (0.94%)
Weighted Average          72.08     124.10      41.93%

Note: Standard errors are given in parentheses. Individual wages are top-coded; therefore means of log-wages are estimated using worker-level data, maximizing saturated normal likelihoods at the firm level. Means of wages are then recovered by the moment generating function. Standard errors are obtained using the Delta Method.
German Socio-Economic Panel
The LIAB version used in this paper is a panel of firms complemented by worker data. As it does not track workers, it is not possible to distinguish between attrition26 and job termination.27 For that reason we use the G-SOEP (German Socio-Economic Panel) to estimate the group-specific transition parameters.28 The G-SOEP is a representative repeated survey of households in Germany. This survey has been carried out annually with the same people and families in Germany since 1984 (note that this study only analyzes the 1996-2005 data).29

26 There is practically no attrition in establishment-level data. There is attrition in the worker-level data due to changes in the worker's identifier and changes across establishments of the same firm.
27 Unless the worker leaves the establishment and moves to another establishment within the panel.
28 Cahuc, Postel-Vinay and Robin (2006) follow the same strategy for estimating transition parameters with the French Labor Force Survey.
4 Empirical Strategy and Results
The discrete nature of annual data implies a complicated censoring of the continuous-time trajectories generated by the theoretical model. Because of these complications, a potentially efficient full-information maximum likelihood is not considered as a candidate for the estimation. Instead, we perform a multi-step estimation procedure.30

Even though it may be asymptotically inefficient, we prefer a step-by-step method. One reason is that transition parameters are better estimated using a standard labor force survey, such as the G-SOEP. Another reason is that the consistency of every parameter in full-information maximum likelihood is only guaranteed when the full model is correctly specified. However, we are interested in obtaining productivity differences and transition parameter estimates that are robust to misspecification in other parts of the model, such as the wage setting mechanism. This facilitates the comparability of the intermediate results with previous reduced-form studies which measure gender differentials in productivity and job-retention rates. Moreover, even in an informal way, models are generally incomplete. Therefore, it seems prudent to use estimators that are as robust to misspecification as possible.

The structural model abstracts from many dimensions that may be relevant in wage setting (e.g., amenities or union pressure). These omitted dimensions may be mainly associated with different job types. As can be seen in Tables 1 and 2, there are important differences between men and women in terms of occupation and sector. To compare jobs that are as similar as possible, the empirical analysis is clustered at the sector and occupation level. The model is estimated independently for each of the four sectors. In order to control for occupation, the transition and rent-splitting parameters are estimated independently for both types of jobs, in each sector and gender group. We only control for occupation parametrically when we estimate productivity, because we need to consider the full workforce at each firm.31
29 See Wagner, Burkhauser, and Behringer (1993) for further details on the G-SOEP.
30 Multi-step estimation has been done in many papers. Good examples include Bontemps, Robin, and van den Berg (2000), Postel-Vinay and Robin (2002), and Cahuc, Postel-Vinay and Robin (2006).
31 We treat selection into occupations and sectors as exogenous. Nevertheless, this selection may also have some content of discrimination (e.g., Blau and Hendricks, 1979).
4.1 Productivity
We assume, as in the theory laid out in Section 2, that the distribution of abilities in worker groups within each firm fluctuates around some fixed density, say $l(\varepsilon \mid p, k)$. According to the model (see condition (7)), there is no assortative matching between workers and firms. Therefore, for a given worker group ($k$), the equilibrium distribution of worker types, conditional on the firm type, equals the marginal one: $L(\varepsilon \mid p, k) = L(\varepsilon \mid k)$. The model describes the matching process, and the wage setting, for single-worker firms. For the sake of consistency with multi-worker firms, we assume that workers are perfectly substitutable in the production process, between and within skill categories ($k$'s). As in Cahuc, Postel-Vinay and Robin (2006), we define the total amount of efficient labor employed at firm $j$ at time $t$ as:
$$L_{jt} = \sum_{k}^{K} \gamma_k L_{kjt},$$
where $L_{kjt}$ is the total number of workers of group $k$ in firm $j$ at time $t$, and $\gamma_k = \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} \varepsilon\, l(\varepsilon \mid k)\, d\varepsilon$ is the steady-state mean ability of workers of group $k$. We then specify firm $j$'s total per-period output as a constant-returns Cobb-Douglas function of capital and quality-adjusted labor input:
$$Y_{jt} = A_j K_{jt}^{(1-\alpha)} L_{jt}^{\alpha}, \qquad (8)$$
where $Y_{jt}$ is the value added produced by firm $j$ in period $t$, $K_{jt}$ is the total capital and $A_j$ is a firm-specific productivity parameter. A similar specification, without imposing constant returns to scale, has already been used in the discrimination literature to estimate between-group productivity gaps (e.g., Hellerstein and Neumark, 1999, and Hellerstein, Neumark and Troske, 1999).
We are primarily interested in four groups of workers depending on gender (men and women) and occupation (high and low qualification). We normalize $\gamma_{hm} = 1$, considering highly-qualified male workers as the reference group.32 As such, $\gamma_k = \tilde{\gamma}_k / \tilde{\gamma}_{hm}$ is the proportional productivity of group $k$ relative to the productivity of highly-qualified male workers.

32 Due to this normalization, the firm-specific productivity $\tilde{A}_j$ is redefined as $A_j = \tilde{A}_j \tilde{\gamma}_{hm}^{\alpha_l}$.

We further assume that firms can adjust capital instantaneously, which makes (8) totally consistent with the theory presented in Section 2, where we have assumed that the productivity of a match is $p\varepsilon$. If the firm chooses the level of capital, the firm solves the following problem:33
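$$\max_{K_{jt}} \left( A_j K_{jt}^{(1-\alpha)} L_{jt}^{\alpha} - r K_{jt} \right),$$
where $r$ is the cost of capital. Therefore, $K^*_{jt}(A_j, L_{jt})$, that is, the optimal choice of capital given the stock of workers and the firm-specific productivity parameter, is:
$$K^*_{jt}(A_j, L_{jt}) = L_{jt} \left( \frac{1-\alpha}{r} A_j \right)^{\frac{1}{\alpha}}.$$
Plugging $K^*_{jt}(A_j, L_{jt})$ into (8) and rearranging, we have that:
$$Y_{jt} = A_j \left[ L_{jt} \left( \frac{1-\alpha}{r} A_j \right)^{\frac{1}{\alpha}} \right]^{1-\alpha} L_{jt}^{\alpha} = \left[ A_j^{\frac{1}{\alpha}} \left( \frac{1-\alpha}{r} \right)^{\frac{1-\alpha}{\alpha}} \right] L_{jt} = p_j L_{jt}. \qquad (9)$$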
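The algebra behind (9) can be verified numerically. The short sketch below uses hypothetical parameter values and function names; it checks that the closed-form capital choice maximizes output net of capital costs and that output at this choice is linear in labor with slope p_j.

```python
import math


def optimal_capital(A, L, alpha, r):
    """Capital level solving the firm's first-order condition."""
    return L * ((1.0 - alpha) * A / r) ** (1.0 / alpha)


def output(A, K, L, alpha):
    """Constant-returns Cobb-Douglas output as in (8)."""
    return A * K ** (1.0 - alpha) * L ** alpha


def firm_productivity(A, alpha, r):
    """p_j as defined in (9)."""
    return A ** (1.0 / alpha) * ((1.0 - alpha) / r) ** ((1.0 - alpha) / alpha)


A, L, alpha, r = 2.0, 50.0, 0.6, 0.1   # hypothetical values, for illustration only
K_star = optimal_capital(A, L, alpha, r)


def net_output(K):
    """Output minus the cost of capital."""
    return output(A, K, L, alpha) - r * K


print(K_star, output(A, K_star, L, alpha), firm_productivity(A, alpha, r) * L)
```

With these values the two printed output figures coincide, which is the content of (9): once capital is chosen optimally, only the scalar p_j and the labor input in efficiency units matter.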
Note that (9) is equivalent to the production function presented in Section 2, where $p_j$ is a function of the firm-specific productivity parameter, $p_j = A_j^{\frac{1}{\alpha}} \left( \frac{1-\alpha}{r} \right)^{\frac{1-\alpha}{\alpha}}$, and $L_{jt}$ is the total labor input in efficiency units. Using the panel with firm-level data on output ($\bar{Y}_{jt}$),34 depreciated capital ($K^d_{jt}$)35 and the number of workers in each category, we estimate the production function in logs imposing that the relative difference in productivity between low and high qualification occupations is constant across gender ($\gamma_{lw} = \gamma_w \times \gamma_l$):
$$\log(\bar{Y}_{jt}) = \log(A_j) + (1-\alpha_l) \log(K^d_{jt}) + \alpha_l \log\left( L^{hm}_{jt} + \gamma_w L^{hw}_{jt} + \gamma_l L^{lm}_{jt} + \gamma_w \gamma_l L^{lw}_{jt} \right) + u_{jt}, \qquad (10)$$
where $L^{hw}_{jt}$ and $L^{hm}_{jt}$ are, respectively, the number of women and men in high-qualification occupations in firm $j$ at time $t$, while $L^{lw}_{jt}$ and $L^{lm}_{jt}$ are, respectively, the number of women and men in low-qualification occupations in firm $j$ at time $t$; and $u_{jt}$ is a zero-mean stationary productivity shock.

33 Section A.2 in the Appendix provides more detail and robustness checks on this assumption.
34 There were problems of convergence in the estimation of (8) using value added. For that reason, output measures were used in the estimation of $\alpha$ and $\gamma$ in (8). If a constant fraction of output is spent on materials, both estimators are consistent for $\alpha$ and $\gamma$. There is only a difference in the constant term of (8). On the other hand, to estimate $p_j$ in (9), the constant term matters, and hence we use measures of value added.
35 Assuming that a constant fraction ($d$) of capital depreciates per unit of time: $K^d_{jt} = d \times K_{jt} \Rightarrow \log(K^d_{jt}) = \log(d) + \log(K_{jt})$. Therefore $\alpha_k \log(d)$ goes to the constant term.
The model predicts that more productive firms are able to attract more workers of every type. Moreover, the optimal choice of capital depends on $A_j$. As a result, productive inputs are correlated with the firm fixed effect (i.e., $cov(A_j, L_{jt}) \neq 0$ and $cov(A_j, K_{jt}) \neq 0$), which would generate inconsistent estimates of $\alpha$ and $\gamma$ if we used ordinary least squares in the estimation. Therefore, we estimate equation (10) by within-firm non-linear least squares to remove the firm fixed effects.
Table 4: Production Function Estimates

WF-NLLS Estimates of (10)
                     $\alpha_l$   $\gamma_w$   $\gamma_l$
Manufacturing          0.963        0.672        0.484
                      (0.005)      (0.062)      (0.042)
Construction           0.961        0.701        0.444
                      (0.006)      (0.052)      (0.039)
Trade                  0.971        0.804        0.487
                      (0.009)      (0.092)      (0.056)
Services               0.945        0.588        0.298
                      (0.007)      (0.068)      (0.030)
Weighted Average       0.96         0.67         0.43

Note: Time dummies included. Standard errors robust to heteroskedasticity are given in parentheses. Weighted averages take into account the number of firms in each sector.
The within-firm non-linear least squares results are presented in Table 4. Women's productivity is lower than men's productivity in similar jobs. This difference ranges from 20 to 41 percent. On average across sectors, female workers are 33 percent less productive than males in each job. One of the primary candidates to explain this large productivity gap is that these estimates do not take into account that women work, on average, fewer hours than men. Using the G-SOEP,36 we find that the average hour gap is 19.9 percent; hence, differences in hours are likely to be one of the main determinants of the productivity gap.37
Low-qualification workers are also found to be between 51 percent and 70 percent less productive than highly qualified workers. As in most production function estimations using microdata, $\alpha_l$ is found to be very near one, and hence $\alpha_k$ is very small but statistically different from zero. Although this finding is standard,38 the main results of this paper are not very sensitive to this issue. If instead of $\alpha_l = 0.96$ we impose an a priori more realistic value of $\alpha_l = 0.60$, the wage gap decomposition does not change significantly.

36 LIAB does not provide information on hours.
37 Although differences in hours are illustrated as being important, the primary results of this paper remain valid. We only mention them in order to have a better understanding of the estimated productivity gap. See Section 5.1 for a more detailed discussion of this issue.
Pioneered by Hellerstein and Neumark (1999) and Hellerstein, Neumark and Troske (1999), differences in productivity across genders are now well documented in the literature. The first paper finds, with Israeli firm-level data, a productivity gap of 17 percent. The second paper, using a U.S. sample of manufacturing plants, reports a productivity gap of 15 percent. These studies have been criticized mainly due to the potential endogeneity of the proportion of female workers in the firm.39

In this paper, we treat the number of workers of each group as potentially correlated with the firm fixed effect.40 Estimating equation (10) by within-firm non-linear least squares, the firm fixed effect is completely removed; hence, our estimates are robust to a correlation between the firm fixed effect and the firm labor input level, and also to a correlation between the firm fixed effect and the labor composition of the firm. In this dataset, we find a significant correlation between the firm's fixed effect and the firm's labor input. Estimating equation (10) by NLLS without fixed effects, the estimates of the $\gamma$'s are significantly lower: the average of $\gamma_w^{NLLS}$ across sectors is 0.38 and the average of $\gamma_l^{NLLS}$ across sectors is 0.26. See Table 12 in the Appendix.
The estimates presented in Table 4 require orthogonality of productive inputs to the sequence of shocks $u_{jt}$. In the theory presented in Section 2, we abstract from productivity shocks. Assuming that $u_{jt}$ is uncorrelated with the productive inputs would be consistent with the model. As we assume that capital is perfectly adjustable, if the productivity shock is observed by the manager after production takes place, capital is exogenous and our results are consistent. Nevertheless, this extra assumption could be avoided by dropping capital altogether and estimating equation (9) directly. Results of this last exercise are presented in Table 11 in the Appendix. The estimates of $\gamma_w$ and $\gamma_l$ are not statistically different from the ones that come out of the estimation of equation (10), which suggests that the productivity shock is not observed before the manager chooses the level of $K$. The current productivity shock might also affect vacancy creation, making it possibly correlated with $L_{j,\tilde{t}}$ for all $\tilde{t} > t$, which would go against the condition of strict exogeneity required for consistency of fixed-effects estimators. As in the case of capital, the number of vacancies opened by the firm would not be correlated with the productivity shock if $u_{jt}$ were realized after production takes place. However, there is a large literature suggesting that productive inputs are determined simultaneously with the productivity shock.41 It is possible to estimate $\alpha_l$ and $\gamma_k$ by GMM using internal instruments, assuming that $L_{jt}$ and $K_{jt}$ are predetermined. For the sake of robustness of our results, we have explored this possibility, but unfortunately there is a severe lack of precision in the GMM estimates of the parameters (see Section A.1 in the Appendix for details).

38 One interpretation of this result is that $K^d_{jt}$ only captures variable capital whereas fixed capital is subsumed in the firm effect. If this is the case, the constant returns restriction is dubious.
39 See Altonji and Blank (1999).
40 Indeed, the model predicts that more productive firms are able to attract more workers of every type. Therefore, the total labor input will be correlated with the firm fixed effect, but not with the labor input composition.
4.2 Labor Market Dynamics
In the model described in Section 2, job terminations might occur endogenously, due to job-to-job transitions, or exogenously, due to job destruction. As both processes are Poisson, the model defines the precise distribution of job durations (i.e., $t$) conditional on the firm productivity (i.e., $p$):
$$L(t \mid p) = \left( \delta + \lambda_1 \bar{H}(p) \right) e^{-[\delta + \lambda_1 \bar{H}(p)] t}. \qquad (11)$$
We use the G-SOEP to estimate transition parameters. Unfortunately, this dataset does not have productivity measures, so we estimate $\lambda_1$ and $\delta$ treating $p$ as an unobservable. To do this, we maximize the unconditional likelihood $L(t) = \int L(t \mid p)\, g(p)\, dp$, where $g(p)$ is the probability density function of the firm's productivity among employed workers implied by the model.

Taking derivatives with respect to $p$ in equation (6), we obtain the density of firm productivity in the population of workers as:
$$g(p) = \frac{\left( 1 + \frac{\lambda_1}{\delta} \right) h(p)}{\left[ 1 + \frac{\lambda_1}{\delta} \bar{H}(p) \right]^2}. \qquad (12)$$
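The differentiation step from (6) to (12) can be spelled out as follows. The shorthand $\kappa_1 = \lambda_1/\delta$ is introduced here only for readability, and the derivation assumes that $\bar{H}(p) = 1 - H(p)$, so that $\bar{H}'(p) = -h(p)$.

```latex
\begin{aligned}
g(p) = \frac{dG(p)}{dp}
  &= \frac{h(p)\,\bigl[1 + \kappa_1 \bar{H}(p)\bigr] + \kappa_1 h(p)\, H(p)}
          {\bigl[1 + \kappa_1 \bar{H}(p)\bigr]^{2}} \\
  &= \frac{h(p)\,\bigl[1 + \kappa_1 \bigl(\bar{H}(p) + H(p)\bigr)\bigr]}
          {\bigl[1 + \kappa_1 \bar{H}(p)\bigr]^{2}}
   = \frac{(1 + \kappa_1)\, h(p)}{\bigl[1 + \kappa_1 \bar{H}(p)\bigr]^{2}} .
\end{aligned}
```

The last equality uses $\bar{H}(p) + H(p) = 1$. The density $g(p)$ puts more weight on high-productivity firms than $h(p)$ does, because the denominator shrinks as $p$ rises.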
In the Appendix we show that the individual contributions to the unconditional likelihood become simple enough to be estimated and are given by:
$$L(t) = \frac{\delta \left( 1 + \frac{\lambda_1}{\delta} \right)}{\frac{\lambda_1}{\delta}} \left[ \int_{\delta t}^{\left( 1 + \frac{\lambda_1}{\delta} \right) \delta t} \frac{e^{-x}}{x}\, dx \right].$$

41 If the firm has some knowledge of the productivity shock, it will adjust its choice of productive inputs accordingly. See for example Griliches and Mairesse (1998).
Integrating the unobserved productivity out of the conditional likelihood removes $p$ and all reference to the sampling distribution $H(p)$. This method is robust to any misspecification in wage bargaining. The only required property of the structural model is that there exists a scalar firm index, in this case $p$, which monotonically defines transitions.
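The claim that the sampling distribution drops out can be illustrated numerically. The sketch below compares the closed-form unconditional density with a direct integration of the conditional density (11) against the density (12). The uniform sampling distribution on [0, 1], the Simpson integration routine and all function names are assumptions made for this example only; the parameter values are the per annum estimates for highly-qualified men in manufacturing.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def duration_density(t, lambda1, delta):
    """Closed-form unconditional density of a job duration t."""
    ratio = lambda1 / delta
    integral = simpson(lambda x: math.exp(-x) / x, delta * t, (1.0 + ratio) * delta * t)
    return delta * (1.0 + ratio) / ratio * integral


def duration_density_by_mixing(t, lambda1, delta):
    """Same density, integrating (11) against (12) with H(p) = p on [0, 1] (assumed)."""
    ratio = lambda1 / delta

    def integrand(p):
        h_bar = 1.0 - p
        exit_rate = delta + lambda1 * h_bar
        g = (1.0 + ratio) / (1.0 + ratio * h_bar) ** 2   # h(p) = 1 for the uniform case
        return exit_rate * math.exp(-exit_rate * t) * g

    return simpson(integrand, 0.0, 1.0)


lambda1, delta = 0.217, 0.034
for t in (0.5, 2.0, 10.0):
    print(t, duration_density(t, lambda1, delta), duration_density_by_mixing(t, lambda1, delta))
```

Any other continuous sampling distribution would give the same closed form, since the change of variable from $p$ to the exit rate absorbs $H(p)$ entirely.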
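In the Appendix,42 we show how to obtain the exact form of the likelihood that takes into account that some durations are right-censored, while others started before the survey's beginning. Finally, an individual contribution to the log-likelihood is:
$$l(t_i) = (1 - c_i) \log \left( \frac{ \int_{\delta t_i}^{\left( 1 + \frac{\lambda_1}{\delta} \right) \delta t_i} \frac{e^{-x}}{x}\, dx }{ \frac{e^{-\delta H_i}}{\delta} - \frac{e^{-\delta \left( 1 + \frac{\lambda_1}{\delta} \right) H_i}}{\delta \left( 1 + \frac{\lambda_1}{\delta} \right)} - H_i \int_{\delta H_i}^{\left( 1 + \frac{\lambda_1}{\delta} \right) \delta H_i} \frac{e^{-x}}{x}\, dx } \right) + c_i \log \left( \frac{ \frac{e^{-\delta t_i}}{\delta} - \frac{e^{-\delta \left( 1 + \frac{\lambda_1}{\delta} \right) t_i}}{\delta \left( 1 + \frac{\lambda_1}{\delta} \right)} - t_i \int_{\delta t_i}^{\left( 1 + \frac{\lambda_1}{\delta} \right) \delta t_i} \frac{e^{-x}}{x}\, dx }{ \frac{e^{-\delta H_i}}{\delta} - \frac{e^{-\delta \left( 1 + \frac{\lambda_1}{\delta} \right) H_i}}{\delta \left( 1 + \frac{\lambda_1}{\delta} \right)} - H_i \int_{\delta H_i}^{\left( 1 + \frac{\lambda_1}{\delta} \right) \delta H_i} \frac{e^{-x}}{x}\, dx } \right),$$
where $c_i$ is a right-censored spell indicator and $H_i$ is the time elapsed before the sample started.43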
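A minimal sketch of one such log-likelihood contribution is given below. It is not the author's MATA code: the function names and the Simpson routine used for the exponential-integral term are hypothetical, and the sketch assumes that the duration is measured from the start of the spell, so that it is at least as large as the elapsed time.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def exp_integral_term(t, lambda1, delta):
    """Integral of exp(-x)/x between delta*t and (delta + lambda1)*t."""
    return simpson(lambda x: math.exp(-x) / x, delta * t, (delta + lambda1) * t)


def survival_kernel(t, lambda1, delta):
    """Unconditional survivor function of durations, up to a constant factor."""
    total_rate = delta + lambda1
    tail = 0.0 if t == 0 else t * exp_integral_term(t, lambda1, delta)
    return math.exp(-delta * t) / delta - math.exp(-total_rate * t) / total_rate - tail


def log_lik_contribution(t, elapsed, censored, lambda1, delta):
    """Log-likelihood of a spell of length t, observed only after `elapsed` periods."""
    denominator = survival_kernel(elapsed, lambda1, delta)
    if censored:
        numerator = survival_kernel(t, lambda1, delta)    # spell still running at t
    else:
        numerator = exp_integral_term(t, lambda1, delta)  # spell ended at t
    return math.log(numerator / denominator)


# Hypothetical spells evaluated at the estimates for highly-qualified men in manufacturing.
print(log_lik_contribution(6.0, 2.0, False, 0.217, 0.034))
print(log_lik_contribution(6.0, 2.0, True, 0.217, 0.034))
```

Dividing by the survivor term at the elapsed time conditions on the spell having lasted until the start of the observation window, which is what corrects for spells that began before the survey.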
Maximum likelihood estimates are reported in Table 5. The average duration of an employment spell, $1/\delta$ (possibly changing employers), is between 10 and 32 years, and the mean duration across sectors is 20.2 years. The average time between two outside offers, $1/\lambda_1$, ranges from 1.7 to 4.6 years. These durations seem fairly long, but they are compatible with the previous literature. van den Berg and Ridder (2003) used a similar specification, but with German aggregated data, and found $\delta$ to be equal to 0.060 and $\lambda_1/\delta$ equal to 6.5.44 Here, the weighted average of $\delta$ across sectors and groups is 0.0574 and the weighted average of $\lambda_1/\delta$ is 6.41.

Highly qualified workers have, in general, lower transition rates to unemployment and lower on-the-job offer arrival rates. Women are more mobile than men in terms of job-to-job transitions and, in general, have higher job-destruction rates. Considering $\lambda_1/\delta$ as an index of friction (van den Berg and Ridder, 2003), we do not find significant differences in the extent of labor market friction between men and women.

42 Online Appendix available at http://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf
43 The MATA code for computing the exponential integral and the MATA code to maximize this likelihood are available from the author upon request.
44 van den Berg and Ridder (2003, p. 237) report a monthly rate of $\lambda_1 = 0.028$ and $\lambda_1/\delta = 6.5$.

Table 5: Transition Parameters - Maximum Likelihood Estimates

Low-Qualification
                          Women                                     Men
              $\lambda_1$  $\delta$  $\lambda_1/\delta$   $\lambda_1$  $\delta$  $\lambda_1/\delta$
Manufact.       0.406      0.044       9.202               0.314      0.031      10.095
               (0.041)    (0.004)     (0.127)             (0.021)    (0.002)     (0.176)
Construct.      0.601      0.098       6.085               0.437      0.105       4.162
               (0.188)    (0.031)     (0.150)             (0.025)    (0.006)     (0.118)
Trade           0.5257     0.094       5.613               0.432      0.074       5.478
               (0.050)    (0.009)     (0.085)             (0.042)    (0.008)     (0.090)
Services        0.559      0.095       5.866               0.458      0.086       5.313
               (0.051)    (0.009)     (0.073)             (0.037)    (0.007)     (0.049)

High-Qualification
                          Women                                     Men
              $\lambda_1$  $\delta$  $\lambda_1/\delta$   $\lambda_1$  $\delta$  $\lambda_1/\delta$
Manufact.       0.308      0.044       7.025               0.217      0.034       6.373
               (0.031)    (0.004)     (0.316)             (0.015)    (0.002)     (0.103)
Construct.      0.510      0.090       5.691               0.255      0.071       3.620
               (0.095)    (0.017)     (0.107)             (0.025)    (0.006)     (0.048)
Trade           0.327      0.060       5.459               0.353      0.050       7.093
               (0.020)    (0.004)     (0.078)             (0.032)    (0.004)     (0.219)
Services        0.393      0.073       5.363               0.227      0.050       4.547
               (0.024)    (0.004)     (0.062)             (0.015)    (0.003)     (0.051)

Note: Per annum estimates. Standard errors are given in parentheses.
4.3 The Wage Equation: Closing the Model
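The structural wage equation (4) can be written as:
$$w_{j,t,i} = \varepsilon_i\, w^p_{j,t,k}(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k),$$
where $w_{j,t,i}$ is the daily wage of worker $i$, who belongs to group $k$,45 in a firm $j$ with productivity $p_j$ at time $t$, and:
$$w^p_{j,t,k}(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k) = p_j - (1-\beta_k) \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p_j) \right)^{\beta_k} \int_{p_{min}}^{p_j} \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p') \right)^{-\beta_k} dp'.$$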
As illustrated in equation (7), $\varepsilon$ is statistically independent of $p$; thus
$$E(w_{j,t,i}) = E\left( \varepsilon_i\, w^p_{j,t,k}(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k) \right) = E_k(\varepsilon)\, E\left( w^p_{jtk}(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k) \right).$$
$E_k(\varepsilon) = \gamma_k$ is the mean efficiency units of workers in group $k$, relative to the male highly-qualified group. Therefore, the predicted mean wage for workers of group $k$ working in firms with productivity $p_j$ at time $t$ is:
$$E(w_{jtk}) = \gamma_k\, w^p_{jtk}(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k). \qquad (13)$$
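To see how the rent-splitting parameter moves the predicted wage, the sketch below evaluates the wage per efficiency unit for a few values of beta. It is an illustration only: the uniform sampling distribution on the hypothetical support [50, 300], the integration routine and the function names are assumptions, while lambda1 and delta are the per annum estimates for highly-qualified men in manufacturing and rho is the 5 percent annual discount rate.

```python
def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def wage_per_efficiency_unit(p, beta, lambda1, delta, rho, h_bar, p_min):
    """Wage paid per efficiency unit of labor by a firm with productivity p."""
    def rate(q):
        return rho + delta + lambda1 * h_bar(q)

    integral = simpson(lambda q: rate(q) ** (-beta), p_min, p)
    return p - (1.0 - beta) * rate(p) ** beta * integral


def predicted_mean_wage(gamma_k, p, beta, lambda1, delta, rho, h_bar, p_min):
    """Predicted mean wage of a worker group with relative productivity gamma_k."""
    return gamma_k * wage_per_efficiency_unit(p, beta, lambda1, delta, rho, h_bar, p_min)


P_MIN, P_MAX = 50.0, 300.0   # hypothetical support of firm productivity


def uniform_h_bar(q):
    """1 - H(q) for a uniform sampling distribution on [P_MIN, P_MAX] (assumed)."""
    return min(1.0, max(0.0, (P_MAX - q) / (P_MAX - P_MIN)))


params = dict(lambda1=0.217, delta=0.034, rho=0.05, h_bar=uniform_h_bar, p_min=P_MIN)
for beta in (0.0, 0.292, 1.0):
    print(beta, wage_per_efficiency_unit(200.0, beta, **params))
```

The two limiting cases are a useful sanity check: with beta equal to one the worker receives the whole productivity p, and with beta equal to zero the wage falls to the minimum productivity.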
The group chosen for normalization is irrelevant. Changing this group to a generic group $k$, we would change our measure of productivity. Instead of $p_j$, that is, the productivity measured in terms of efficiency units of highly-qualified males, we would have $p^k_j = \gamma_k p_j$, that is, the productivity measured in terms of efficiency units of group $k$. To define equation (13) in terms of the productivity of group $k$, we only need to put $\gamma_k$ inside the expectation operator:
$$E(w_{j,t,i}) = E\left( \gamma_k p_j - (1-\beta_k) \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p_j) \right)^{\beta_k} \int_{p_{min}}^{p_j} \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p') \right)^{-\beta_k} \gamma_k\, dp' \right).$$
45 Notice that given the individual identifier $i$, the group $k$ is fixed. Therefore we could replace $k$ by $k(i)$. To keep the notation as simple as possible we simply refer to $k$.
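Noting that $\frac{dp}{dp^k} = \frac{1}{\gamma_k}$, and changing the variable within the integral, we obtain:
$$E(w_{j,t,i}) = E\left( p^k_j - (1-\beta_k) \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p^k_j) \right)^{\beta_k} \int_{p^k_{min}}^{p^k_j} \left( \rho + \delta_k + \lambda_{1,k} \bar{H}_k(p^{k\prime}) \right)^{-\beta_k} dp^{k\prime} \right).$$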
For each firm in the sample we estimate the average daily wage $\bar{w}_{jtk}$ paid to the workers of group $k$ at time $t$. Since wages are top-coded, we estimate the firm mean wage for each worker group (i.e., $\bar{w}_{jtk}$) by maximum likelihood at the firm level, assuming that $\varepsilon_i$ is log-normally distributed.46 Under the steady-state assumption, and according to the theory presented in Section 2, $\bar{w}_{jtk}$ exhibits stationary fluctuation around the steady-state mean wage $E(w_{jtk})$ paid by firm $j$ with productivity $p_j$.
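Writing equation (13) in logs:
$$\log \bar{w}_{jtk} = \ln(\gamma_k) + \ln\left[p_j - (1-\beta_k)\left(\rho + \delta_k + \lambda_{1,k}\bar{H}_k(p_j)\right)^{\beta_k} \int_{p_{\min}}^{p_j} \left(\rho + \delta_k + \lambda_{1,k}\bar{H}_k(p')\right)^{-\beta_k} dp'\right] + v_{jtk}. \tag{14}$$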
We estimate equation (14) by weighted non-linear least squares at the firm level, where $\gamma_k$, $\lambda_{1k}$ and $\delta_k$ are parameters estimated in the previous stages, $p_j$ is the productivity of firm $j$, and $v_{jtk}$ is a transitory shock that arises from within-firm aggregation. As usual, the discount rate has been set to an annual rate of 5 percent (daily rate of 0.0134 percent).
Standard errors have to take into account that $\gamma$, $\lambda_1$ and $\delta$ are estimated in the previous stages. To solve this problem, we combine the bootstrap for $\gamma$ with the analytical solution for $\lambda_1$ and $\delta$. Hence, we obtain standard errors by replicating the productivity estimation and the bargaining power estimation in 200 resamples of the original LIAB sample, with replacement, but treating the estimated transition parameters as the population parameters. To correct these preliminary
46 The within-firm distribution of wages of each worker group is the same as the distribution of ability. As wages are linear in $\varepsilon$, assuming log-normality in the distribution of $\varepsilon$ implies log-normal wages for each group within the firm.
standard errors, we add to them the analytical term corresponding to the standard errors of $\lambda_1$ and $\delta$ reported in Table 5. The analytical correction is straightforward in this case because the estimators come from different samples, so that we can omit the term corresponding to the outer product of the scores in the first and second stages.
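Consistent standard errors are given by:
$$\widehat{Var}(\hat{\beta}) = Var(\hat{\beta} \mid \hat{\lambda}_1, \hat{\delta})_{bootstrap} + \frac{\hat{\Phi}_{\beta\lambda_1} \widehat{Var}(\hat{\lambda}_1) \hat{\Phi}_{\beta\lambda_1} + \hat{\Phi}_{\beta\delta} \widehat{Var}(\hat{\delta}) \hat{\Phi}_{\beta\delta}}{\hat{\Phi}_{\beta\beta} \cdot \hat{\Phi}_{\beta\beta}},$$
where $\Phi$ is the objective function in the optimization, which in this case is the weighted sum of squares, and $\hat{\Phi}_{xy} = \partial^2 \Phi / \partial x \, \partial y$ evaluated at the estimates. Second derivatives of $\hat{\Phi}$ are obtained numerically.47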
Results are presented in Table 6. Women are found to have lower rent-splitting parameters than men in construction and trade for both high- and low-qualification occupations, and in highly qualified manufacturing occupations. Female workers receive larger portions of the surplus than males in services and in low-qualification manufacturing occupations, but these differences are not significant.
There is a clear pattern in terms of occupations. Low-qualification workers receive larger shares of the surplus in every sector, for both female and male workers.48 Union coverage, which is higher for low-qualification occupations, and differences in compensating differentials are potential explanations for this difference.
Estimates of the rent-splitting parameter are considerably higher than the parameters reported in Cahuc, Postel-Vinay and Robin (2006). This is probably due to the differences in our definition of match rents.49 In a similar model estimated with US employee-level data by Flinn and Mabli (2010), the overall bargaining power is found to be 0.45. Here, the weighted average across cells
47 The analytical correction $\left(\hat{\Phi}_{\beta\lambda_1} \widehat{Var}(\hat{\lambda}_1) \hat{\Phi}_{\beta\lambda_1} + \hat{\Phi}_{\beta\delta} \widehat{Var}(\hat{\delta}) \hat{\Phi}_{\beta\delta}\right) / \left(\hat{\Phi}_{\beta\beta} \cdot \hat{\Phi}_{\beta\beta}\right)$ is not significant in any industry, for either high- or low-qualification workers.
48 Differences between industries and occupations may be understood as a consequence of compensating differentials or differences in union pressure. They cannot be understood as discrimination, because we are comparing occupations and not workers.
49 The surplus is defined in terms of the productivity of the match and the outside option. The two models imply different outside options. Without Bertrand competition, the worker's outside option is unemployment. When allowing for Bertrand competition, the worker's outside option is the whole productivity of the poaching firm. As the outside option in the model with Bertrand competition dominates the one in the model without Bertrand competition, the estimated bargaining power is smaller.
Table 6: Rent-Splitting Parameter Estimates
WNLLS estimates of (14)

Sector          Occupation   Women $\beta$     Men $\beta$
Manufacturing   Low-Q        0.419 (0.129)     0.398 (0.096)
                High-Q       0.226 (0.088)     0.292 (0.036)
Construction    Low-Q        0.214 (0.090)     0.408 (0.045)
                High-Q       0.113 (0.078)     0.186 (0.104)
Trade           Low-Q        0.339 (0.173)     0.382 (0.106)
                High-Q       0.152 (0.066)     0.222 (0.064)
Services        Low-Q        0.849 (0.125)     0.757 (0.125)
                High-Q       0.413 (0.104)     0.324 (0.073)

Note: Corrected bootstrap standard errors are given in parentheses.
is remarkably similar, 0.421.
Allowing for between-firm Bertrand competition, as in Cahuc, Postel-Vinay and Robin (2006), changes the magnitude of $\beta$ for males and females, but not the difference across genders. In the Appendix, we present a numerical exercise where $\beta$ is recovered by the simulated method of moments using the same data and a model with between-firm Bertrand competition. The bargaining power is found to be significantly lower; the weighted average is 0.219, but the gender and occupation patterns do not change. Women are found to have lower bargaining power than males in construction and trade, while there is no clear pattern in manufacturing and services. As in the model without Bertrand competition, workers in low-qualification occupations are also found to have higher $\beta$ than workers in high-qualification occupations. These findings are not consistent with the results in Cahuc, Postel-Vinay and Robin (2006), who find a positive association between bargaining power and job qualification in France. As in this case we are estimating a model similar to that of Cahuc, Postel-Vinay and Robin (2006), this occupation pattern does not seem to be a modeling artifact, but instead a difference between German and French
labor markets.50
Differences in rent-splitting parameters are not significant in every sector. We only find that male workers receive larger shares of the surplus than females in the construction sector, where bootstrap p-values of the differences in $\beta$ are 4.5 percent for low-qualification workers and 9.6 percent for high-qualification workers.
5 Wage-Gap Decomposition
The structural wage setting equation provides us with a direct way to isolate the effect of each wage determinant on the overall wage differential. Hence, we are able to calculate what fraction of the wage gap is due to segregation or differences in the rent-splitting parameter, productivity or friction patterns. Although the model takes into account the equilibrium effects of changes in the primitives, this is always conditional on what we have defined as primitives. For example, there are good reasons to think that the offer arrival rate might be a function of the distribution of offers and the bargaining power. Changes in the distribution of offers and the bargaining power might change the search effort of the worker, but in the model we do not endogenize search effort. Moreover, differences in productivity might also be a function of frictions. If a demographic group faces a labor market with more frictions, its members are likely to adjust their optimal investment in human capital. In principle, every mechanism could be endogenized, and the decomposition would then be more complete and its interpretation cleaner. The purpose of the estimation presented in this paper is to provide a wage gap decomposition that gives a better understanding of the main determinants of wage differentials between men and women. This is not a final answer, since we are not explaining what is causing the difference in each wage determinant, but our results provide insights into the relative importance of each wage determinant in the determination of the wage gap.
To calculate our decomposition we proceed as follows. Using the structural wage equation (4):
$$w_{j,t,i} = \varepsilon_i \, w^p(p_j, \beta_k, H_k(p), \lambda_{1k}, \delta_k),$$
for each sector and each worker group, it is possible to measure the wage differential caused by
50 See Section A.4 for details.
the differences in each wage determinant. For example, we can estimate the wage gap accounted for by $\beta$ as the relative difference between the mean wage that women actually receive and the mean wage that women would receive if they had the male rent-splitting parameter:
$$1 - \frac{E\left[\varepsilon_i \, w^p(p_j, \beta_W, H_W(p), \lambda_{1,W}, \delta_W)\right]}{E\left[\varepsilon_i \, w^p(p_j, \beta_M, H_W(p), \lambda_{1,W}, \delta_W)\right]}. \tag{15}$$
As shown in equation (7), $\varepsilon$ is independent of $p$; therefore we can estimate equation (15) at the firm level:
$$1 - \frac{\gamma_W \sum_j N_j \, w^p(p_j, \beta_W, H_W(p), \lambda_{1,W}, \delta_W)}{\gamma_W \sum_j N_j \, w^p(p_j, \beta_M, H_W(p), \lambda_{1,W}, \delta_W)},$$
where $N_j$ is the number of female workers in each firm. To decompose the wage gap, we sequentially replace each female parameter with a male parameter until we reach the male predicted mean wage. Counterfactual wages are presented in Table 7.
Differences in friction patterns imply differences in the observed distributions of firm productivity among the employed workers of different groups. This is also true when both groups face the same primitive distribution of productivity (see condition (6)). Consequently, a change in frictions generates a direct effect and a compositional effect on mean wages. The direct effect is the change in the wage of a worker $\varepsilon$, who belongs to group $k$, working in a firm $p$, driven by the change in the value of the outside option of the worker; this is the effect displayed in Figure 1. The second effect is the one that comes from aggregation, due to the changes in the distribution of the accepted wages.
The wage setting equation implies that the higher the offer-arrival rate, the higher the wage, and the higher the job destruction rate, the lower the wage (see Figure 1). In addition, increasing the offer-arrival rate makes the counterfactual firm productivity distribution among workers stochastically dominate the original one, while increasing the destruction rate has the opposite effect.51 Hence, the direct effect and the aggregation effect follow the same pattern when we
51 Given that the offer arrival rate of women is higher than that of men, the distribution of firm productivity across the population of female workers stochastically dominates the counterfactual distribution of firm productivity corresponding to female workers if they had the offer arrival rate of male workers. The proof is trivial; note that if $\lambda_1^W > \lambda_1^M$:
$$G(\lambda_1^W, \delta^W) = \frac{H(p)}{1 + \frac{\lambda_1^W}{\delta^W}\bar{H}(p)} \le \frac{H(p)}{1 + \frac{\lambda_1^M}{\delta^W}\bar{H}(p)} = G(\lambda_1^M, \delta^W).$$
change $\lambda_1$ and $\delta$.
It is straightforward to calculate the counterfactual wage of one worker if we change $\delta$ or $\lambda_1$. But, as the distribution of accepted wages changes, it is more problematic to aggregate individual wages when some of these parameters change. Therefore, the counterfactual wages are calculated by simulating the distribution of productivity among employed female workers using the friction parameters of male workers.52
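The aggregation effect can be illustrated with a small sketch. It is not the author's Mata simulation code, which is not reproduced in the paper. It uses the steady-state relationship from footnote 51, $G(p) = H(p) / (1 + (\lambda_1/\delta)\bar{H}(p))$, under the assumptions that $\bar{H}(p) = 1 - H(p)$ and that $H$ is log-normal; the function names, the log-normal parameters and the friction values are all hypothetical.

```python
from math import exp
from statistics import NormalDist


def employed_productivity_quantile(u, kappa, mu=0.0, sigma=0.5):
    """Invert G(p) = H(p) / (1 + kappa * (1 - H(p))) for a log-normal H.

    Solving G = u for H gives H = u * (1 + kappa) / (1 + kappa * u).
    """
    h = u * (1.0 + kappa) / (1.0 + kappa * u)
    return exp(mu + sigma * NormalDist().inv_cdf(h))


def mean_employed_productivity(lam1, delta, n=10_000):
    """Mean firm productivity among employed workers for given frictions."""
    kappa = lam1 / delta
    # Midpoint quantiles stand in for random draws, so the result is reproducible.
    quantiles = ((i + 0.5) / n for i in range(n))
    return sum(employed_productivity_quantile(u, kappa) for u in quantiles) / n


# Hypothetical daily rates: own offer-arrival rate vs. a lower counterfactual one.
mean_own = mean_employed_productivity(lam1=0.0020, delta=0.0005)
mean_cf = mean_employed_productivity(lam1=0.0015, delta=0.0005)
print(mean_own, mean_cf)
```

Lowering the offer-arrival rate while holding $H$ fixed shifts employed workers toward less productive firms, which is the compositional channel described above.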
Table 7: Counterfactual wages
Counterfactual mean daily wages by sector (M = manufacturing, C = construction, T = trade, S = services)

Occupations           Counterfactual wage                                          M       C       T       S
High qualification    $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^M, \beta^M)$      190.3   150.9   119.2   156.3
                      $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^M, \beta^F)$      153.6   103.1    84.7   194.7
                      $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^F, \beta^F)$      151.1   104.6    92.8   138.3
                      $w(\lambda_1^M, \delta^M, \gamma^F, H(p)^F, \beta^F)$      101.5    73.3    74.6    81.3
                      $w(\lambda_1^M, \delta^F, \gamma^F, H(p)^F, \beta^F)$       95.2    69.5    69.7    71.6
                      $w(\lambda_1^F, \delta^F, \gamma^F, H(p)^F, \beta^F)$      105.7    82.8    67.8    87.4
Low qualification     $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^M, \beta^M)$      109.2    96.6    84.9    88.5
                      $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^M, \beta^F)$      114.1    56.3    76.2    97.6
                      $w(\lambda_1^M, \delta^M, \gamma^M, H(p)^F, \beta^F)$      116.2    68.1    58.6    80.6
                      $w(\lambda_1^M, \delta^M, \gamma^F, H(p)^F, \beta^F)$       78.1    47.7    47.1    47.4
                      $w(\lambda_1^M, \delta^F, \gamma^F, H(p)^F, \beta^F)$       71.0    48.5    42.9    45.7
                      $w(\lambda_1^F, \delta^F, \gamma^F, H(p)^F, \beta^F)$       76.9    53.3    47.6    49.1
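The sequential decomposition can be traced through one column of Table 7. The sketch below takes the manufacturing, high-qualification wages and attributes to each parameter the change between consecutive rows, divided by the total male-female difference. That share formula is an assumption about how the contributions are formed; the figures in Table 8 aggregate across sectors and are not reproduced by this single column. The function name `sequential_shares` is hypothetical.

```python
# Manufacturing, high-qualification column of Table 7 (mean daily wages).
# Each step switches one more parameter from its male to its female value.
chain = [
    ("all male", 190.3),
    ("beta", 153.6),
    ("H(p)", 151.1),
    ("gamma", 101.5),
    ("delta", 95.2),
    ("lambda_1", 105.7),  # all parameters female
]


def sequential_shares(chain):
    """Percent of the male-female wage difference attributed to each step."""
    total = chain[0][1] - chain[-1][1]
    shares = {}
    for (_, prev_wage), (name, wage) in zip(chain, chain[1:]):
        shares[name] = 100.0 * (prev_wage - wage) / total
    return shares


shares = sequential_shares(chain)
for name, share in shares.items():
    print(f"{name:>9}: {share:6.1f}%")
```

By construction the shares sum to 100 percent, and a step that raises the counterfactual wage (here the offer-arrival rate) enters with a negative sign.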
Considering the direct effect and the aggregation effect together, the effect of friction is significant. We find that, on average across sectors, gender differences in destruction rates, $\delta$, explain 9 percent of the total wage gap, while decreasing the female offer arrival rate, $\lambda_1$, to equal the male one would increase the gap by 13 percent. Differences in $\delta$ increase the gap, while differences in $\lambda_1$ reduce the gap. Therefore, the effects of the two types of friction partially offset each other. The sizes of these effects are consistent with Bowlus (1997), who analyzed samples of high school
On the other hand, given that women have higher destruction rates than men,
$$G(\lambda_1^W, \delta^W) \ge G(\lambda_1^W, \delta^M).$$
52 The model used for simulations is a simplified version of the model presented in Section 2, where worker heterogeneity has been omitted. Simulations use the point estimates of $\lambda_1$, $\delta$, $\gamma_w$, $\gamma_l$ and $\alpha_l$ for every sector and worker group, reported in Section 3. We assume that the primitive distribution of firm productivity is log-normal. The mean of the distribution of firm productivity is calibrated in equilibrium by matching mean wages for each occupation and for each sector. MATA codes for simulating the model previously described are available from the author upon request.
and college graduates from the National Longitudinal Survey of Youth (NLSY). These behavioral patterns were found to account for 20 to 30 percent of the wage differentials.
It is possible to disentangle the direct effect of friction on wages from the aggregation effect. For this purpose, we calculate the mean wage of women, changing the friction parameters but maintaining the original distribution of firm productivity, as illustrated in Figure 1. These effects are surprisingly small; differences in $\delta$ explain, on average, 1.3 percent of the total wage gap, while netting out $\lambda_1$ increases the wage gap by 2.4 percent. Hence, most of the effect of friction comes through the aggregation effect.
The proportion of the wage gap due to differences in productivity connects directly with a branch of the literature initiated by Hellerstein et al. (1999). This line of work assumes proportionality between wages and productivity. Therefore, an inequality in wages that is not driven by a difference in productivity suggests that there is discrimination. In this paper we impose more structure, which allows a more complete interpretation of the fraction of the wage gap that is not explained by differences in productivity.
On average, 65 percent of the total wage gap is accounted for by differences in productivity. The role of productivity in explaining the wage gap is large, but not surprising, given the definition of productivity used in this exercise. Behind these large productivity gaps, there are important differences in productivity determinants. Table 2 shows significant differences in age, tenure and potential experience. Moreover, Lauer (2000) provides evidence suggesting that a large part of the gender wage gap in Germany can be attributed to the fact that women have a lower endowment of human capital than men.
There is a large literature investigating the effect of segregation on the gender wage gap.53 The structural estimation enables us to compare the current mean wages of female workers with the counterfactual female mean wages if they faced the primitive distribution of firm productivity faced by male workers (i.e., replacing $H_W(p)$ with $H_M(p)$). We find that 17 percent of the wage gap is accounted for by this difference. These results are consistent with those of Bayard, Hellerstein, Neumark and Troske (2003), who found a small negative effect of the proportion of women in the establishment on wages in the U.S. One of the advantages of estimating this structural model is that it enables us to empirically disentangle the effect of segregation
53 See Altonji and Blank (1999).
from differences in the distribution of firm productivity generated by gender differences in friction patterns.54 In this exercise, we are not modeling firm behavior explicitly, and therefore differences in $H(p)$ have no direct structural interpretation.
The effect of differences in productivity is not stable across occupations. Three quarters of the wage gap is explained by differences in productivity in low-qualification occupations, while this proportion falls to 58 percent in high-qualification occupations. Finally, these productivity measures are not on an hourly basis: the number of hours differs significantly between male and female workers (see Section 5.1 for a more detailed discussion of this issue).
Table 8: Gender Wage-Gap decomposition
% of the wage gap explained by differences in:

               Low-Qualification   High-Qualification   All
$\delta$             10.3                  8.5            9.3
$\lambda_1$         -12.6                -13.8          -13.3
$\gamma$             74.5                 58.3           65.1
$H(p)$               23.5                 13.4           17.7
$\beta$               4.2                 33.6           21.2
Wage-Gap             39.0                 44.3           41.9
On average, 21 percent of the total wage gap is accounted for by the differences in the rent-splitting parameter. This means that women receive wages that are 9 percent lower than the wages received by equivalent men. This finding is not significantly different from what we obtain using the traditional approach based on Mincer equations, where female workers are found to receive wages that are 15 percent lower than those of equivalent male workers (see Section A.5 in the Appendix). The reported gender differences in bargaining power represent a potential explanation of the observed differences in destruction rates and in productivity. Neumark and McLennan (1995) show that as women experience labor market discrimination, they have more career interruptions, and therefore less human capital accumulation.
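As a back-of-the-envelope check, and assuming the 9 percent figure is simply the rent-splitting share applied to the raw gap reported in the "All" column of Table 8, the arithmetic is:

```latex
\[
  \underbrace{0.212}_{\text{share due to } \beta} \times \underbrace{41.9\%}_{\text{raw wage gap}} \approx 8.9\% \approx 9\%.
\]
```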
As in the case of differences in productivity, differences in the wage setting parameter are also occupation-specific. As can be seen from Table 8, the difference in rent-splitting parameters turns out to be an important determinant of the gender wage gap of workers in high-qualification occupations, whereas it explains almost none of the wage gap in low-qualification jobs. This
54 To the best of our knowledge, there is currently no available paper that disentangles both effects.
result is consistent with the growing literature on glass ceilings. One potential explanation for this pattern is the higher union coverage in low-qualification occupations. Collective bargaining agreements regulating wage rates and general working conditions may limit the extent of wage discrimination. See Heinze and Wolf (2010) and Antonczyk, Fitzenberger and Sommerfeld (2010) for empirical evidence of a relationship between the gender wage gap and coverage by collective wage bargaining in Germany.55
5.1 Productivity-gap and differences in hours
One possible explanation for the large estimated productivity and wage gaps may be that male workers work more hours than females. One of the main limitations of the LIAB is that it does not provide any measure of hours. Therefore, the estimated differences in productivity and the estimated differences in wages are not on an hourly basis. To tackle this problem, one alternative is to look for an external source of information about hours worked by each group in each sector. Using the G-SOEP, we find significant differences in mean hours between genders (see Table 9). On average, female workers work almost 20 percent fewer hours than their male counterparts.
As the structural wage equation is linear in workers' ability, correcting for hours does not affect the estimated rent-splitting parameters.56 Nevertheless, an interesting exercise is to obtain an hourly-wage-gap decomposition by directly plugging in the hour correction.57 Correcting mean wages using mean hours is valid whenever hours worked are uncorrelated with hourly wages, and there is evidence suggesting that this may be the case.58
Results are presented in Table 10. When correcting for hours, the wage gap is significantly smaller. We find that the average hourly-wage gap is 27.5 percent. A smaller fraction of this gap, 48 percent, is now explained by differences in productivity. On the other hand, a larger fraction of the gap, 32 percent,59 is due to differences in rent-splitting parameters. Segregation is responsible
55 The order of the sequential decomposition steps matters. As a robustness check, we reversed the order of the decomposition (see Table 16 in the Online Appendix). Results remain qualitatively similar.
56 If wages and productivity are linear in time worked, the hour correction only modifies the worker productivity and the wage. Consider the case where a worker $i$ works $h$ percent of the time of a full-time worker:
$$h \, w_{i,j,t} = h \, \varepsilon_i \, w^p_{j,t,k(i)}(p_{j,t}, \beta_{k(i)}, H_{k(i)}(p), \lambda_{1k(i)}, \delta_{k(i)}).$$
$h$ cancels out of both sides of the equation and the original wage setting equation holds.
57 I must thank Zvi Eckstein for this suggestion.
58 See Blundell and MaCurdy (1999).
59 Female workers used to receive wages that were 6.2% lower than equivalent males, due to differences in the
Table 9: Mean-Hours Per Week

Sector          Occupation   Men             Women           Hours-Gap
Manufacturing   Low-Q        39.95 (0.10)    38.84 (0.21)    12.8% (0.56%)
                High-Q       41.68 (0.08)    38.69 (0.32)     7.2% (0.78%)
Construction    Low-Q        43.09 (0.26)    34.40 (1.65)    20.2% (3.85%)
                High-Q       43.62 (0.12)    38.74 (0.97)    11.2% (2.24%)
Trade           Low-Q        41.23 (0.47)    28.27 (0.45)    31.4% (1.35%)
                High-Q       44.01 (0.24)    33.89 (0.52)    23.0% (1.25%)
Services        Low-Q        44.67 (0.43)    28.12 (0.40)    37.0% (1.08%)
                High-Q       46.33 (0.30)    42.08 (0.72)     9.2% (1.66%)

Note: Standard errors are given in parentheses. Means of weekly hours are estimated using worker-level data from the G-SOEP. Standard errors of the hours gap are obtained using the Delta-Method.
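The delta-method step mentioned in the note can be sketched as follows. This is an illustration rather than the author's procedure: it assumes the hours gap is defined as one minus the ratio of female to male mean hours and that the two sample means are independent; the function name `hours_gap_with_se` is hypothetical.

```python
from math import sqrt


def hours_gap_with_se(mean_m, se_m, mean_w, se_w):
    """Hours gap g = 1 - mean_w / mean_m and its delta-method standard error.

    Assumes the two sample means are independent, so that
    Var(g) ~ (se_w / mean_m)**2 + (mean_w * se_m / mean_m**2)**2.
    """
    gap = 1.0 - mean_w / mean_m
    se = sqrt((se_w / mean_m) ** 2 + (mean_w * se_m / mean_m ** 2) ** 2)
    return gap, se


# Construction, low-qualification row of Table 9.
gap, se = hours_gap_with_se(43.09, 0.26, 34.40, 1.65)
print(f"gap = {100 * gap:.1f}%, se = {100 * se:.2f}%")
```

For this row the sketch lands close to the reported gap and standard error, with small discrepancies attributable to rounding of the published means.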
Table 10: Gender Wage-Gap decomposition
% of the hourly wage gap explained by differences in:

               Low-Qualification   High-Qualification   All
$\delta$             32.0                 11.1           17.7
$\lambda_1$         -39.2                -18.1          -25.4
$\gamma$             43.8                 21.8           48.3
$H(p)$               53.7                 15.7           27.0
$\beta$               9.6                 39.4           32.4
Wage-Gap             17.1                 37.8           27.5
for 27 percent of the unconditional wage gap in hourly wages. Netting out the offer arrival rate would increase the hourly-wage differential by 25 percent; if women had the male destruction rate, the gap would be 17 percent smaller.
rent-splitting parameter. This number does not change because the rent-splitting parameter estimates do not change.
6 Concluding Remarks
This paper presents the first estimation of an equilibrium search model using matched employer-employee data to study wage discrimination. The combination of firm and worker data is crucial for these purposes for many reasons. Firstly, and most importantly, it allows us to separately identify differences in workers' productivity across groups and differences in wage policies toward those groups. Secondly, better data makes the estimation simpler, and therefore a more complete model can be estimated. Finally, it enables us to produce measures of the effect of labor market segregation, disentangled from the effect of friction on the distribution of accepted wages.
The structural estimation involved several steps. First, we estimated group-specific productivity for each firm, relying on the production function estimation at the firm level, using LIAB. Second, we computed job-retention and job-locating rates using G-SOEP employee-level data. And third, we estimated the wage-setting parameters (bargaining power) using individual wage records in LIAB, and transition parameters and productivity measures specific to each group and firm estimated in the previous steps.
When analyzing productivity, we observe that women are 33 percent less productive than men in similar jobs. This difference is reduced to 17.5 percent if we control for differences in hours. The primary findings, in terms of friction patterns, are that women are, in general, more mobile than men in terms of job-to-job transitions and that they have higher job-destruction rates. In spite of the large wage differentials, women were only found to have significantly lower rent-splitting parameters than men in the construction sector.
In terms of wages, we find that the unconditional gender wage gap is 42 percent. It turns out that 65 percent of the gap is accounted for by differences in productivity. Differences in destruction rates explain 9 percent of the total wage gap and segregation is responsible for 17 percent of this wage differential. Netting out differences in offer-arrival rates would increase the gap by 13 percent. Differences in the rent-splitting parameter generate 21 percent of the wage gap, which implies that female workers receive wages 9 percent lower than equivalent males. This is not a final word, given that we still have to understand what is causing the difference in each wage determinant. Moreover, in the model all the parameters are considered exogenous primitives. In reality, most of them are the outcome of individual behavior, and they are not independent.
Still, the purpose of our decomposition is to identify which wage determinants contribute the most to the wage gap. Since we find that the gender productivity gap is very large and accounts for most of the wage differential between men and women, a promising area for future research is understanding what is behind the productivity gap. Our model is not the most appropriate framework to analyze this question, because the productivity of the worker is assumed to be a fixed primitive characteristic and not the result of a decision to invest in human capital. In a life-cycle model, it is likely that some of the difference in productivity between men and women is a response to a difference between both groups in $\beta$ or $H(p)$.60
Our results also rely on simplifying assumptions that would need further scrutiny. We now list two very desirable extensions. A first interesting extension is to relax the assumption of stationarity. Men and women have different levels of experience in the data. Differences in experience are immaterial in the model, which is in steady state with infinitely-lived agents. However, in the real world, workers with shorter experience are less productive and have had less time to climb the job ladder. Incorporating experience and human capital accumulation through learning-by-doing in the model would generate a much richer interpretation of the gender difference in $H(p)$ and the gender differences in productivity. This extension is not trivial, because the equilibrium allocation of workers to firms in such a model has assortative matching (see Carrillo-Tudela, 2009) and the estimation of the production function would have to account for it. A second desirable extension is to look into our reported gender differences in rent-splitting parameters. Within this difference, apart from pure wage discrimination, there can be primitive differences in bargaining performance (Croson and Gneezy, 2009). Empirically disentangling wage discrimination from these behavioral differences is surely difficult, but it would certainly yield further insight into the gender wage gap.
A Appendix
A.1 Production Function - Specification Tests
The robustness of the productivity estimates is crucial for reaching reliable conclusions about wage discrimination. As a robustness check, we present results of the productivity estimates
60 This kind of mechanism was first proposed by Coate and Loury (1993).
under different sets of assumptions.
In Table 11, we report within-group non-linear least squares estimates of (10) and (9). In the estimation of (9) we do not use measures of capital. Relative productivity estimates (i.e., $\gamma_w$ and $\gamma_l$) are very similar in both estimations: differences between parameters are always smaller than one standard deviation. This robustness check is essential because in this kind of model it is assumed that capital adjusts instantaneously to match the number of workers in each period. Although this assumption may seem controversial, it turns out that using observed capital or using the theoretical optimal choice of capital does not change the relative productivity estimates.
Table 11: Production Function: Optimal Capital Input

                WG-NLLS of (10)                    WG-NLLS of (9)
Sector          $\gamma_w$       $\gamma_u$        $\gamma_w$       $\gamma_u$
Manufacturing   0.672 (0.062)    0.484 (0.042)     0.642 (0.132)    0.478 (0.070)
Construction    0.701 (0.052)    0.444 (0.039)     0.839 (0.141)    0.485 (0.833)
Trade           0.804 (0.092)    0.487 (0.056)     0.745 (0.176)    0.541 (0.982)
Services        0.588 (0.068)    0.298 (0.030)     0.595 (0.115)    0.301 (0.040)

Note: Time dummies included. Heteroskedasticity-robust standard errors are given in parentheses.
Worker Composition Endogeneity: One of the main criticisms of the productivity estimates reported in Hellerstein and Neumark (1999) and in Hellerstein, Neumark, and Troske (1999) was that the proportion of women in the firm is likely to be correlated with the firm's technology.61
$$\log(Y_{jt}) = \log(\bar{A}_j) + \alpha_k \log(K^d_{jt}) + \alpha_l \log\left(L^{mh}_{jt} + \gamma_w L^{wh}_{jt} + \gamma_l L^{ml}_{jt} + \gamma_w \gamma_l L^{wl}_{jt}\right) + u_{jt}. \tag{16}$$
Note that (16) is the original Cobb-Douglas production function in logs without imposing constant returns to scale and including the depreciated capital instead of the optimal capital input. In this estimation we are assuming that the depreciation rate is constant, and hence depreciated capital is a constant fraction of the total capital.
61 See Altonji and Blank (1999).
Table 12: Production Function: Non Linear Least Squares in Levels
NLLS of (16)

Sector          $\alpha_k$       $\alpha_l$       $\gamma_w$       $\gamma_l$
Manufacturing   0.153 (0.007)    0.905 (0.018)    0.352 (0.032)    0.300 (0.016)
Construction    0.084 (0.017)    1.034 (0.028)    0.451 (0.082)    0.382 (0.043)
Trade           0.110 (0.018)    0.963 (0.029)    0.562 (0.068)    0.365 (0.039)
Services        0.180 (0.098)    0.839 (0.015)    0.356 (0.029)    0.292 (0.017)

Note: Time dummies included. Standard errors are given in parentheses.
The results are presented in Table 12. Female productivity estimates are significantly smaller. This finding holds for all sectors. The difference between the results presented in Table 12 and those presented in Table 4, where firm fixed effects were removed, implies a non-zero correlation between the proportion of women in the firm and the firm fixed effect. Hausman tests reject equality of $\gamma_w$ in every sector. In terms of $\gamma_l$, the results are not so different and we only reject equality for manufacturing.
Predetermined inputs:
It is possible to estimate $\alpha_l$ and $\gamma_k$ by GMM using internal instruments, assuming that $L_{jt}$ and $K_{jt}$ are predetermined. The estimates of the $\gamma$'s by non-linear GMM are imprecise in every sector and for every group. We have compared different identification strategies, in terms of the sets of internal instruments and optimization routines. $\gamma$ has been estimated using only lagged levels as instruments in the equation in differences (Arellano and Bond, 1991); lagged levels as instruments in the equation in differences jointly with lagged differences as instruments in the equation in levels (e.g., Arellano and Bover, 1995); and only lagged differences as instruments in the equation in levels (e.g., Cahuc, Postel-Vinay and Robin, 2006). We obtained extremum estimators that minimize the two-stage robust GMM objective function and the iterated-GMM objective function.62 We also calculated Markov chain Monte Carlo type estimators for continuously-updated GMM (Chernozhukov and Hong, 2003). Non-linear System-GMM estimates of the production function (9) are reported in Table 13.63 The lack of precision in the quality parameter estimates is a
62 See Arellano (2003), pp. 184-197.
63 MATA codes for computing the non-linear estimators previously described are available from the author upon
Table 13: Production Function: Non-Linear SYSTEM-GMM
System-GMM of (16)

Sector          $\alpha_k$       $\alpha_l$       $\gamma_w$       $\gamma_l$       Sargan p-value
Manufacturing   0.060 (0.046)    0.938 (0.011)    0.70 (0.454)     0.15 (0.069)     58%
Construction    0.016 (0.043)    1.164 (0.160)    0.442 (0.267)    0.188 (0.083)    98%
Trade           0.053 (0.050)    1.003 (0.170)    0.370 (0.290)    0.322 (0.215)    97%
Services        0.117 (0.071)    0.786 (0.146)    0.434 (0.158)    0.204 (0.090)    93%

Note: Time dummies included. Standard errors are given in parentheses.
pervasive problem in this kind of production function specification when the lag of capital and the lag of labor are used to instrument current capital and labor, partialling out the firm fixed effects. See, for example, Cahuc, Postel-Vinay and Robin (2006) or Hellerstein and Neumark (1998).
Estimating (10), the firm fixed effect is completely removed, but the simultaneity problem is not totally solved. One alternative would be to treat $L_{jt}$ and $K_{jt}$ as predetermined. In Table 13 we report System-GMM estimates of (16). However, the precision of the GMM estimates of the $\gamma$'s is poor. Capital coefficients are not significantly different from zero and constant returns to scale are not rejected in any sector. Sargan tests do not reject compatibility of instruments in any sector. Hausman tests of equality between the $\gamma$'s reported in Table 13 and those in Table 4 do not reject equality in any group.
request.
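As an illustrative check, not part of the original estimation, the statement that capital coefficients are not significantly different from zero can be verified from the numbers in Table 13. The Python sketch below divides each capital estimate by its standard error and compares the ratio with 1.96, the two-sided 5 percent normal critical value. The 5 percent level and all variable names are assumptions made for this illustration.

```python
# Capital coefficient estimates and standard errors transcribed from Table 13.
capital = {
    "Manufacturing": (0.060, 0.046),
    "Construction": (0.016, 0.043),
    "Trade": (0.053, 0.050),
    "Services": (0.117, 0.071),
}

CRITICAL_5PCT = 1.96  # two-sided normal critical value (assumed test level)


def t_ratio(estimate, std_error):
    """Ratio of a point estimate to its standard error."""
    return estimate / std_error


# t-ratio of the capital coefficient in each sector.
t_ratios = {sector: t_ratio(b, se) for sector, (b, se) in capital.items()}

# A coefficient counts as significant if |t| exceeds the critical value.
significant = {sector: abs(t) > CRITICAL_5PCT for sector, t in t_ratios.items()}

for sector, t in t_ratios.items():
    print(f"{sector:<14} t = {t:.2f}  significant at 5%: {significant[sector]}")
```

Every ratio falls below 1.96, which is consistent with the text. A formal test of constant returns to scale would also need the covariance between the capital and labor coefficients, which Table 13 does not report.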
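The estimators discussed in this appendix are non-linear System-GMM estimators computed in MATA. For readers unfamiliar with the mechanics, the following Python sketch shows the two-step GMM logic and a Sargan-type overidentification statistic in a much simpler setting: a linear model with one endogenous regressor and two instruments, on simulated data. It is a stand-in under stated assumptions, not the author's code. The function name `two_step_gmm`, the data-generating process and all parameter values are hypothetical.

```python
import numpy as np


def two_step_gmm(y, X, Z):
    """Two-step GMM for y = X b + u with instruments Z; returns (b, J, df)."""
    n = y.shape[0]
    ZX, Zy = Z.T @ X, Z.T @ y

    def solve(W):
        # Minimizer of g(b)' W g(b), where g(b) = Z'(y - X b) / n.
        return np.linalg.solve(ZX.T @ W @ ZX, ZX.T @ W @ Zy)

    # Step 1: weight by (Z'Z/n)^{-1}, which reproduces 2SLS.
    b1 = solve(np.linalg.inv(Z.T @ Z / n))

    # Step 2: heteroskedasticity-robust weight built from step-1 residuals.
    Zu = Z * (y - X @ b1)[:, None]
    W2 = np.linalg.inv(Zu.T @ Zu / n)
    b2 = solve(W2)

    # Overidentification statistic: n * g(b2)' W2 g(b2).
    g = Z.T @ (y - X @ b2) / n
    J = float(n * g @ W2 @ g)
    return b2, J, Z.shape[1] - X.shape[1]


# Simulated data (hypothetical): x is correlated with the error u through v.
rng = np.random.default_rng(0)
n = 5000
Z = rng.normal(size=(n, 2))            # two valid instruments
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)       # structural error
x = Z @ np.array([1.0, 0.5]) + v       # endogenous regressor
y = 1.5 * x + u                        # true slope is 1.5

b, J, df = two_step_gmm(y, x[:, None], Z)
print("slope estimate:", b[0], "J statistic:", J, "degrees of freedom:", df)
```

Under the null hypothesis that the instruments are valid, J is asymptotically chi-squared with `df` degrees of freedom. A large p-value, such as those in the Sargan column of Table 13, means the overidentifying restrictions are not rejected.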
References
[1] Abowd, J. M., Kramarz, F., and Margolis, D. N. (1999), "High Wage Workers and High Wage Firms", Econometrica, 67(2), pp. 251-334.
[2] Antonczyk, D., B. Fitzenberger, and K. Sommerfeld (2010), "Rising Wage Inequality, the Decline of Collective Bargaining, and the Gender Wage Gap", Labour Economics, 17(5), pp. 835-847.
[3] Alda, H., St. Bender, and H. Gartner (2005), "The Linked Employer-Employee Dataset of the IAB (LIAB)", IAB Discussion Paper 6/2005.
[4] Altonji, J.G. and R.M. Blank (1999), "Race and Gender in the Labor Market", in: O. Ashenfelter and D. Card (eds.), Handbook of Labor Economics, Volume 3, Amsterdam: Elsevier Science.
[5] Arellano, M. and S.R. Bond (1991), "Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations", Review of Economic Studies, 58(2), pp. 277-297.
[6] Arellano, M. and O. Bover (1995), "Another Look at the Instrumental-Variable Estimation of Error-Components Models", Journal of Econometrics, 68(1), pp. 29-52.
[7] Atakan, A. (2006), "Assortative Matching with Explicit Search Cost", Econometrica, 74(3), pp. 667-680.
[8] Bayard, K., J. Hellerstein, D. Neumark, and K. Troske (2003), "New Evidence on Sex Segregation and Sex Differences in Wages from Matched Employee-Employer Data", Journal of Labor Economics, 21(4), pp. 887-922.
[9] Becker, G. (1973), "A Theory of Marriage: Part I", Journal of Political Economy, 81(4), pp. 813-846.
[10] Blau, F. and W. Hendricks (1979), "Occupational Segregation by Sex: Trends and Prospects", The Journal of Human Resources, 14(2), pp. 197-210.
[11] Blau, F. and L. Kahn (2000), "Gender Differences in Pay", Journal of Economic Perspectives, 14(4), pp. 75-99.
[12] Blau, F. and L. Kahn (2003), "Understanding International Differences in the Gender Pay Gap", Journal of Labor Economics, 21(1), pp. 106-144.
[13] Blundell, R. and T. MaCurdy (1999), "Labor Supply: A Review of Alternative Approaches", Handbook of Labor Economics, Volume 3, Amsterdam: Elsevier Science.
[14] Bontemps, C., J.-M. Robin, and G. J. van den Berg (2000), "Equilibrium Search with Continuous Productivity Dispersion: Theory and Non-Parametric Estimation", International Economic Review, 41(2), pp. 305-358.
[15] Bowlus, A. (1997), "A Search Interpretation of Male-Female Wage Differentials", Journal of Labor Economics, 15(4), pp. 625-657.
[16] Bowlus, A. and Z. Eckstein (2002), "Discrimination and Skill Differences in an Equilibrium Search Model", International Economic Review, 43(4), pp. 1309-1345.
[17] Burdett, K. and D. T. Mortensen (1998), "Wage Differentials, Employer Size and Unemployment", International Economic Review, 39(2), pp. 257-273.
[18] Cahuc, P., F. Postel-Vinay and J.-M. Robin (2006), "Wage Bargaining with On-the-job Search: Theory and Evidence", Econometrica, 74(2), pp. 323-364.
[19] Carrillo-Tudela, C. (2009), "An Equilibrium Search Model when Firms Observe Workers' Employment Status", International Economic Review, 50(2), pp. 485-506.
[20] Chernozhukov, V. and H. Hong (2003), "An MCMC Approach to Classical Estimation", Journal of Econometrics, 115(2), pp. 293-346.
[21] Croson, R. and U. Gneezy (2009), "Gender Differences in Preferences", Journal of Economic Literature, 47(2), pp. 448-474.
[22] Eckstein, Z. and K.I. Wolpin (1995), "Duration to First Job and Return to Schooling: Estimates from a Search-Matching Model", Review of Economic Studies, 62(2), pp. 263-286.
[23] Eckstein, Z. and K. Wolpin (1999), "Estimating the Effect of Racial Discrimination on First Job Wage Offers", Review of Economics and Statistics, 81(3), pp. 384-392.
[24] Flabbi, L. (2010), "Gender Discrimination Estimation in a Search Model with Matching and Bargaining", International Economic Review, 51(3), pp. 745-783.
[25] Flinn, C. (2006), "Minimum Wage Effects on Labor Market Outcomes under Search, Bargaining, and Endogenous Contact Rates", Econometrica, 74(4), pp. 1013-1062.
[26] Flinn, C. and J. Mabli (2010), "On-the-Job Search, Minimum Wages, and Labor Market Outcomes in an Equilibrium Bargaining Framework", NYU mimeo.
[27] Griliches, Z. and J. Mairesse (1998), "Production Functions: the Search for Identification", Essays in Honour of Ragnar Frisch, S. Strom (ed.), Econometric Society Monograph Series, Cambridge University Press.
[28] Heinze, A. and E. Wolf (2010), "The Intra-Firm Gender Wage Gap: a New View on Wage Differentials Based on Linked Employer-Employee Data", Journal of Population Economics, 23(3), pp. 851-879.
[29] Hellerstein, J. and D. Neumark (1998), "Wage Discrimination, Segregation, and Sex Differences in Wage and Productivity Within and Between Plants", Industrial Relations, 37(2), pp. 232-260.
[30] Hellerstein, J. and D. Neumark (1999), "Sex, Wages, and Productivity: An Empirical Analysis of Israeli Firm-Level Data", International Economic Review, 40(1), pp. 95-123.
[31] Hellerstein, J., D. Neumark, and K. R. Troske (1999), "Wages, Productivity, and Worker Characteristics: Evidence from Plant-Level Production Functions and Wage Equations", Journal of Labor Economics, 17(3), pp. 409-46.
[32] Jolivet, G., F. Postel-Vinay and J.-M. Robin (2006), "The Empirical Content of the Job Search Model: Labor Mobility and Wage Distributions in Europe and the U.S.", European Economic Review, 50(4), pp. 877-907.
[33] Lauer, C. (2000), "Gender Wage Gap in West Germany: How Far Do Gender Differences in Human Capital Matter?", ZEW Discussion Paper No. 00-07.
[34] Lindeboom, M., R. Mendes and G.J. van den Berg (2010), "An Empirical Assessment of Assortative Matching in the Labor Market", Labour Economics, 17(6), pp. 919-929.
[35] Mondal, S. (2006), "Employer Discrimination and Racial Differences in Wages and Turnover", Unpublished Manuscript.
[36] Mortensen, D.T. (2003), "Wage Dispersion", MIT Press, Cambridge.
[37] Moscarini, G. (2008), "Job-to-Job Quits and Corporate Culture", Unpublished Working Paper.
[38] Neumark, D. and M. McLennan (1995), "Sex Discrimination and Women's Labor Market Outcomes", Journal of Human Resources, 30(4), pp. 713-740.
[39] Postel-Vinay, F. and J.-M. Robin (2002), "Equilibrium Wage Dispersion with Heterogeneous Workers and Firms", Econometrica, 70(6), pp. 2295-2350.
[40] Ridder, G. and G.J. van den Berg (2003), "Measuring Labor Market Frictions: a Cross-Country Comparison", Journal of the European Economic Association, 1(1), pp. 224-244.
[41] Rubinstein, A. (1982), "Perfect Equilibrium in a Bargaining Model", Econometrica, 50(1), pp. 97-109.
[42] Shimer, R. (2006), "On-the-Job Search and Strategic Bargaining", European Economic Review, 50(4), pp. 811-830.
[43] Shimer, R. and L. Smith (2000), "Assortative Matching and Search", Econometrica, 68(2), pp. 343-369.
[44] Sulis, G. (2011), "Gender Wage Differentials in Italy: a Structural Estimation Approach", Journal of Population Economics, 25(1), pp. 53-87.
[45] Van den Berg, G.J. and A. van Vuuren (2010), "The Effect of Search Frictions on Wages", Labour Economics, 17(6), pp. 875-885.
[46] Wagner, G., R.V. Burkhauser, and F. Behringer (1993), "The English Language Public Use File of The German Socio-Economic Panel", Journal of Human Resources, 28(2), pp. 429-433.