Gender Wage Gaps Reconsidered: A Structural Approach Using

Gender Wage Gaps Reconsidered:A Structural Approach Using Matched

Employer-Employee Data�

Cristian BartolucciCollegio Carlo Albertoy

August 2012

Abstract

In this paper, we study the extent to which wage di¤erentials between men and women canbe explained by di¤erences in productivity, disparities in friction patterns, segregation andwage discrimination. For this purpose, we propose an equilibrium search model that featuresrent splitting, on-the-job search and two-sided heterogeneity in productivity. The model isestimated using German matched employer-employee data. Overall, the results reveal thatfemale workers are less productive and more mobile than males. In addition, female workershave on average slightly lower bargaining power than their male counterparts. The rawgender wage gap is 41 percent; 65 percent of this gap is accounted for by di¤erences inproductivity, 17 percent of this gap is driven by segregation, while di¤erences in destructionrates explain 9 percent of the total wage-gap. Netting out di¤erences in o¤er-arrival rateswould increase the gap by 13 percent. Di¤erences in the rent-splitting parameter explain 21percent of the gap, which means that female workers receive wages that are 9 percent lowerthan their equivalent male counterparts.JEL Code: J70, C51, J64KEYWORDS: Labor market discrimination, search frictions, structural estimation, matched

employer-employee data.

1 Introduction

In this paper, we estimate an equilibrium search model to study the extent to which wage dif-

ferentials between men and women can be explained by di¤erences in productivity, disparities

in friction patterns, segregation and direct wage discrimination. The model features on-the-job

�I am specially grateful to Manuel Arellano for his guidance and constant encouragement. I would also liketo thank Stéphane Bonhomme, Christian Bontemps, Raquel Carrasco, Zvi Eckstein, Chris Flinn, Andrés Erosa,Pietro Garibaldi, Carlos González-Aguado, José Labeaga, Joan Llull, Claudio Michelacci, Pedro Mira, EnriqueMoral-Benito, Diego Puga, Jean Marc Robin, Steven Stern, Nieves Valdez, two anonymous referees and seminarparticipants at Uppsala University, Bank of Canada, Collegio Carlo Alberto, Bank of Italy, Bank of France, PSE,UQAM, CEMFI, University of Tucuman, University College London, ICEEE (2011), European Meeting of theEconometric Society (2009), Cosme Workshop, EC-Squared Conference on Structural Microeconometrics, SEAmeetings (2008) and IZA-ESSLE (2008) for very helpful comments. Special thanks is due to Emily Moschinifor excellent research assistance and to Nils Drews, Peter Jacobebbinghaus and Dana Muller at the Institute forEmployment Research for invaluable support with the data. First Version: August 2008.

yVia Real Collegio, 30 - 10024 - Moncalieri (Torino), Italy. Email: [email protected].

1

search, rent-splitting, and productivity heterogeneity in �rms and workers. The structural es-

timation combines group-speci�c productivity measures and the empirical distribution of �rms�

productivity estimated from �rm-level data, group-speci�c friction patterns estimated from indi-

vidual duration data, and individual wages. The model implies a structural wage equation, which

illustrate the precise relationship between wages, worker ability, �rm productivity, friction pat-

terns, and bargaining power. The wage equation estimated in this paper may be best understood

as the structural counterpart of a standard Mincer equation, including only wage determinants

which are relevant from a theoretical point of view. The model allows for counterfactual analysis,

such as the decomposition of the observed wage gaps in terms of the di¤erences in productivity

and frictions, or segregation and discrimination, taking into account equilibrium e¤ects.

Understanding wage gaps across di¤erent demographic groups has always been a central con-

cern in labor economics. Many studies have focused on explaining how much of the unconditional

mean wage di¤erential between groups may be understood as wage discrimination.1 The stan-

dard strategy involves estimating Mincer-type equations for both groups and then decomposing

the di¤erence in mean wages into �explained�and �unexplained�components. The fraction of

the gap that cannot be explained by di¤erences in observable characteristics is considered dis-

crimination. This kind of analysis is very informative from a descriptive perspective, but the

causal interpretation and the nature of discrimination are not clear. The reason is that most

of the observable characteristics included in the standard reduced-form wage equations, such as

education or experience may be understood as proxies of the true wage determinants, such as

productivity, outside options, and the rent splitting rule. Whenever there is a residual di¤erence

between groups in some of these wage determinants, the measure of discrimination may be biased.

The primary candidate to generate these biases is a between-group di¤erence in productivity that

may persist after controlling for education, occupation, age and tenure.

To deal with this shortcoming, Hellerstein and Neumark (1999) and Hellerstein, Neumark

and Troske (1999) exploit newly available matched employer-employee data to calculate a novel

indicator for wage discrimination. They estimate the relative marginal products of various worker

types, which are then compared with their relative wages. If productivity is the only wage

determinant, which is true if there is perfect competition in the labor market, or if there are no

1See Blau and Kahn (2003) and Altonji and Blank (1999) for good surveys.

2

signi�cant di¤erences in friction patterns or in the distribution of �rms faced by both groups,

any di¤erence in wages that is not driven by a di¤erence in productivity may be understood as

discrimination.

However, comparing wages and productivity may provide an incomplete picture of the prob-

lem. For example, wage di¤erentials across groups are often accompanied by unemployment-rate

and job-duration di¤erentials. There is a vast expanse of literature estimating di¤erentials in

job-�nding and job-retention rates across groups, directly observing duration in unemployment

and employment, or using experiments such as audit studies.2 Although there is agreement in

predicting e¤ects of frictions on wages,3 there is scarce empirical evidence as to how much of the

wage gap can be accounted for by the di¤erences in the friction patterns. Also, the variation

in worker composition is likely to be correlated with heterogeneity in the production technology,

which generates labor market segregation.4

Estimated structural models may provide a complete interpretation of observed wage gaps as

a consequence of between-groups di¤erences in every wage determinant. Nevertheless, progress in

this direction has been slow, primarily due to the di¢ culty in separately identifying the impacts

of worker productivity di¤erentials and discrimination, solely from worker-level data. The main

references are Eckstein and Wolpin (1999) and Bowlus and Eckstein (2002). Both papers are

focused on racial discrimination in the U.S. and deal with this empirical identi�cation problem

through structural assumptions. Eckstein and Wolpin (1999) proposed a method based on a two-

sided, search-matching model that formally accounts for unobserved heterogeneity and unobserved

o¤ered wages. They argued that di¤erences in the bargaining power parameter (their index of

discrimination) are not identi�ed unless some �rm-side data are available. They compute bounds

for discrimination that were ultimately not informative on the estimation sample they worked

with. Bowlus and Eckstein (2002) also proposed a search model with heterogeneity in worker

productivity but including an appearance-based employer disutility factor. As long as there are

�rms that do not discriminate, they are able to identify between-groups di¤erences in the skill

distribution as well as the proportion of discriminatory employers. The �rst attempts to use

an equilibrium search model to study gender discrimination were made by Bowlus (1997) and

2See Chapter 4 in Altonji and Black (1999) for a good survey.3See van den Berg and van Vuuren (2010) for a good discussion of this issue.4See Altonji and Blank (1999) and Bayard, Hellerstein, Neumark and Troske (2003).

3

Flabbi (2010). Bowlus only focuses on the e¤ect of gender di¤erences in friction patterns on wage

di¤erentials without distinguishing between di¤erences in productivity and discrimination. Flabbi

(2010) uses the identi�cation strategy proposed in Bowlus and Eckstein (2002) to additionally

disentangle the e¤ect of productivity di¤erentials.5

In this paper, we link two independent branches of the wage-discrimination literature. We

propose a natural step in extending the structural estimation literature focused on wage discrim-

ination. The availability of better data enables us to estimate a more complete model,6 using a

more transparent identi�cation strategy to measure di¤erences in productivity and taking into

account the e¤ect on wages of labor market segregation. This approach may also be understood

as a step forward from the strategy proposed by Hellerstein and Neumark (1999), where only pro-

ductivity gaps were estimated. Here, we provide a more complete interpretation of the observed

wage gap, where the productivity gap is only one of the potential determinants of the di¤erence

in wages between males and females.

The model builds upon the equilibrium search model, with strategic wage bargaining and

on-the-job search presented in Cahuc, Postel-Vinay and Robin (2006). The model consents rent-

splitting and, as in Eckstein and Wolpin (1999), we allow for di¤erent rent-splitting rules for male

and female workers to capture wage discrimination. The model has two sources of heterogeneity.

Workers are heterogeneous in terms of productive e¢ ciency to capture the e¤ect on equilibrium

wages of di¤erences across sexes in productivity. Firms technology is also heterogeneous and its

distribution is group-speci�c. This feature allows the model to �t an economy with segregation.

Workers do not �nd jobs instantaneously and, when working, there is a positive probability of

losing the current job. The parameters that describe these events are also allowed to be gender-

speci�c in order to capture the e¤ect of frictions on the gender wage gap.

We use LIAB, a 1996-2005 matched employer-employee panel data made available by the Ger-

man Labor Agency.7 This panel provides us with a unique opportunity to disentangle the di¤erent

5To have an estimable model with employee-level data, Flabbi (2010) only includes heterogeneity at the matchlevel, not taking segregation into account, and not allowing for on-the-job search. Flabbi does allow for wagebargaining; however, the bargaining power is not estimated.

6Particularly related to Flabbi (2010), which at the moment is the most complete structural estimation in thewage-discrimination literature, the model presented in this paper allows for two-sided heterogeneity and on-the-jobsearch. In the estimation, rent splitting parameters are not imposed, but are estimated. Note that we are notforcing the same �rms distribution across gender. See Sections 2 and 4 for details.

7This dataset is subject to strict con�dentiality restrictions. It is not directly available. After the IAB hasapproved the research project, The Research Data Center (FDZ) provides on-site use or remote access to external

4

sources of the gender wage gap, because it contains valuable information on gender, wages and

occupations, but also because it is a panel that tracks �rms as opposed to individuals. Track-

ing �rms is essential in the estimation of production functions using panel estimation methods.

To the best of our knowledge, this paper presents the �rst structural estimation using matched

employer-employee data to study labor market discrimination.8

To have a more homogeneous sample, we only consider �rms in West Germany. The gender

wage gap in this part of Germany is particularly large. According to Blau and Kahn (2000) the

gender hourly-wage gap in West Germany is 32 percent; placing west Germany in position 6 in a

ranking of 22 industrialized countries. Many researchers are trying to understand the nature of

this large earning di¤erential (Lauer, 2000; Heinze and Wolf, 2010; Antonczyk, Fitzenberger and

Sommerfeld, 2010).

The empirical analysis initially involves calculating the empirical distribution of �rms�produc-

tivity and di¤erences in productivity between men and women. The linkage between worker-level

information and �rm-level data is essential at this stage. From the worker-level data, we recover

the composition of the �rm during each time period. The production function estimation exploits

within �rm variation in worker composition, capital and output. We estimate group-speci�c

productivity parameters and a �rm-speci�c technology measure. We �nd large productivity dif-

ferentials for women and noticeable di¤erences in the distribution of �rms faced by both kind of

workers.

We then analyze group-speci�c friction patterns by survival analysis. We �nd that women have

higher job-creation rates than men, and that females also have higher job-destruction rates than

males. Finally, we estimate bargaining power for male and female workers by means of individual

wages, gender-speci�c productivity measures for each �rm, the gender-speci�c distribution of

�rms�productivity and gender-speci�c frictions patterns. In spite of the large wage di¤erentials,

on average, women are found to have only slightly lower bargaining power than men.

In terms of wages, the raw gender wage gap is 41 percent. It turns out that most of the gap �65

percent �is accounted for by di¤erences in productivity. Di¤erences in destruction rates explain

9 percent while di¤erences in the distribution of the �rm�s productivity faced by male and female

researchers.8There is a recent paper by Sulis (2007) that studies gender wage di¤erentials in Italy, estimating a structural

model. Sulis uses employee level data with �rm identi�ers, without data on �rms, such as capital or output.

5

workers explain 17 percent of the total wage-gap. Netting out di¤erences in o¤er-arrival rates

would increase the gap by 13 percent. Presumably, within di¤erences in arrival rates, destruction

rates, the distribution of employers and productivity there is some content of discrimination. The

economic literature, as well as the legislation of many countries, certainly recognize these potential

sources of discrimination. But if we focus our attention on direct wage discrimination, we �nd

that di¤erences in the rent-splitting parameter are responsible for 21 percent of the wage gap,

implying that female workers receive wages that are 8.6 percent lower than equivalent males in

terms of productivity and treated the same as males in terms of the o¤er arrival rate, destruction

rate and distribution of employers.

The rest of this paper is organized as follows. In the next section, we describe the structural

model. Section 3 describes the data. Section 4 presents the estimation of the structural model

inputs, namely the productivity measures and friction parameters. We present and discuss the

intermediate results and then describe the empirical strategy used to estimate the structural

wage equation and its results. In Section 5, the counterfactual experiments are discussed and we

compare our empirical results with those found using other strategies for detecting discrimination.

Conclusions are o¤ered in Section 6.

2 Structural framework

In this section we describe the behavioral model of labor market search with matching and rent-

splitting. The primary goal of estimating a structural model is to clearly state a wage-setting

equation that enables us to measure the e¤ect of each wage determinant. Once this wage equation

is estimated, it becomes straightforward to obtain the e¤ect of discriminatory wage policies by

comparing a man�s wage with the wage that a woman with the same wage determinants would

receive. This comparison is equivalent to a standard Oaxaca-Blinder decomposition, the main

di¤erence lies in the de�nition of wage determinants. A Oaxaca-Blinder decomposition includes

observable characteristics of the worker and the job that are potentially correlated with the wage.

By comparison, in our decomposition we include wage determinants that are driven by economic

theory. We model the labor market, and therefore we propose a theory that explains how wages

are determined. According to that theory, wages are a function of the extent of frictions in the

6

labor market, the productivity of the worker, the productivity of the �rm and the bargaining

power of the worker. Knowing the function that links wages to these wage determinants, we can

calculate the portion of the wage gap that is driven by di¤erences in each wage determinant.

Previous research has shown the aptitude of these models in describing the labor market

outputs and dynamics. Building on these assessments, we are interested in using the structural

model as a measurement tool that enables us to empirically disentangle the e¤ect of each wage

determinant on the gender wage gap. Search-matching models have been used as an instrument

to address empirical questions in a variety of papers. Examples include the previously mentioned

papers in the discrimination literature, but there are also interesting contributions in measuring

returns to education (Eckstein and Wolpin, 1995) or in analyzing the e¤ect of a change in the

minimum wage (Flinn, 2006).

2.1 Assumptions

We propose a continuous time, in�nite horizon, stationary economy. This economy is populated

by in�nitely lived �rms and workers. All agents are risk-neutral and discount future income at

the rate � > 0.

Workers: We normalize the measure of workers to one. Workers may belong to one of the

groups (k) de�ned in terms of gender.9 Workers have di¤erent abilities (") measured in terms

of e¢ ciency units they provide per unit of time. The distribution of ability in the population

of workers is exogenous and speci�c to each group, with cumulative distribution function Lk(").

Group-speci�c distributions of e¢ ciency units provided by workers are crucial in considering the

between-groups di¤erences in productivity. This source of heterogeneity is perfectly observable by

every agent in the economy. Each worker may be either unemployed or employed. The workers

from a generic group k that are not actually working receive a �ow utility proportional to their

ability, bk": The workers�e¢ ciency in home production is associated with their productivity in the

labor market and the parameter that describes that association is allowed to be gender-speci�c.10

9The structural model abstracts many dimensions that may be relevant in the wage setting. In order to comparejobs that are as similar as possible, the empirical analysis is clustered at the sector and occupation level. See Section4 for details.10The assumption that worker productivities �at home� and at work are proportional greatly simpli�es the

upcoming analysis. Furthermore, we allow b to be gender-speci�c to be consistent with a potential gender special-ization in home production.

7

Firms: Every �rm is characterized by its productivity (p). We assume that p is distributed

across �rms according to a given cumulative distribution function Hk(p), which is continuously

di¤erentiable with support [pmin; pmax]: The primitive distribution of �rms� productivity faced

by workers may di¤er among groups, allowing the model to be robust to labor market segrega-

tion. This source of heterogeneity is perfectly observable by every agent in the economy. The

opportunity cost of recruiting a worker is zero.

Each �rm contacts a worker of a given group k at the same constant rate, regardless of the

�rm�s bargained wage, its productivity or how many �lled jobs it has. Unemployed workers receive

job o¤ers at a Poisson rate �0k > 0. Employed workers may also search for a better job while

employed and receive job o¤ers at a Poisson rate �1k > 0. We treat �0k and �1k as exogenous

parameters speci�c to each group k. Searching while unemployed and searching while employed

has no cost. Employment relationships are exogenously destroyed at a constant rate �k > 0;

leaving the worker unemployed and the �rm with nothing. The marginal product of a match

between a worker with ability " and a �rm with productivity p is "p:

Whenever an employed worker meets a new �rm, the worker must choose an employer and

then, if she switches employers, she bargains with the new employer with no possibility of recalling

her old job. If she stays at her old job, nothing happens. Consequently when a worker negotiates

with a �rm, her alternative option is always unemployment. The surplus generated by the match

is split in proportions �k and (1��k), for the worker and the �rm respectively. �k is exogenously

given and speci�c to each group k. We will refer to �k as the rent-splitting parameter. As in

Wolpin and Eckstein (1999), we interpret �male��female as an index of the level of discrimination

in the labor market. A di¤erence in � in the same kind of job and sector, reveals di¤erential

payments unrelated to productivity and outside options, which are only driven by belonging to a

given group.

It is well known that many other factors may have an impact on the rent-splitting parame-

ters. The main candidates are: negotiation skills, risk aversion and the discount factor. In this

model agents are assumed to be risk-neutral and the discount rate is homogeneous across groups.

Although this could be part of the story that explains di¤erences in �, there is scant convincing

empirical evidence of gender di¤erences in these primitive parameters.11 This could be the reason

11The most convincing evidence comes from arti�cial experiments, see Croson and Gneezy (2009) for a good

8

why risk aversion and the discount factor have been held constant across genders in most of the

empirical studies on wage discrimination.

It is not clear whether � can be interpreted as a Nash bargaining power. Shimer (2006) argues

that in a simple search-matching model with on-the-job search, the standard axiomatic Nash

bargaining solution is inapplicable, because the set of feasible payo¤s is not convex. This non-

convexity arises because an increase in the wage has a direct negative e¤ect over the �rm�s rents,

but an indirect positive e¤ect raising the duration of the job. This critique will hold depending

on the shape of the �rm productivity distribution. Whether � can be understood as a Nash

Bargaining power is not essential for this study. If this critique holds up, we still interpret � as a

rent-splitting parameter that simply states the proportion of the surplus that goes to the worker.

A di¤erence in these parameters remains informative about wage-discrimination.

The model assumes that the worker does not have the option of recalling the old employer.

Hence, there is no possibility of Bertrand competition between �rms, as in Cahuc, Postel-Vinay

and Robin (2006). Whether to allow �rm competition a la Bertrand or not is controversial.

While the Cahuc et al bargaining scenario may be conceptually more appealing and may help

to avoid the Shimer critique, it is not clear how realistic this assumption is. Mortensen (2003)

argues that countero¤ers are empirically uncommon. Moscarini (2008) illustrates that, in a model

with search e¤ort, �rms may credibly commit to ignore outside o¤ers to their employees, letting

them go without a countero¤er, and su¤er the loss, in order to keep in line the other employees�

incentives to not search on the job. Moreover, it can be shown that including a marginally positive

cost of negotiation, makes it unpro�table for �rms to try to poach the worker from better �rms,

and then the Bertrand competition vanishes.

In an environment where contracts cannot be written and wages are continuously negotiated,

the outside option of the job is always unemployment. In this context, if a worker receives an o¤er

from a �rm with higher productivity, she must switch. She cannot use this o¤er to renegotiate

with her current �rm, because she knows that this o¤er will not be available tomorrow, and then

her future option will be the unemployment.12 This possibility is also discussed in Flinn and

survey.12If wages are continuously negotiated, �rms could increase the wage of the worker at the same moment as the

on-the-job o¤er to try to keep the worker from quitting. If the alternative employer is more productive, it canforce the transition by also paying a premium. This auction for the worker �nishes when the actual �rm cannotpay more than the full productivity and transition holds, as in a Bertrand competition. This premium may be

9

Mabli (2010).

Beyond the theoretical relevance of between-�rms Bertrand competition, this assumption is

not critical for most of the results presented in this paper. In the Appendix, we illustrate that

using the same data with a variation of the model where between-�rm Bertrand competition is

allowed, the gender wage-gap decomposition remains practically unchanged.

This model is similar to the model presented in Flinn and Mabli (2010). The primary di¤erence

is in the distributions of productivity. To have a model that is estimable only with employee-level

data, they assume that there is a technologically-determined discrete distribution of worker-�rm

productivity. In other words, they assume discrete heterogeneity at the match level, while here

we assume two-sided continuous heterogeneity. The model presented here also has the convenient

property of producing a closed-form solution for the wage setting equation.

2.2 Value Functions

The expected value of income for a worker with ability ", who belongs to group k, currently

employed at wage w(p; "; k) is denoted by E(w(p; "; k); "; k) and it satis�es:

�E(w(p; "; k); "; k) (1)

= w(p; "; k) + �k(U("; k)� E(w(p; "; k); "; k)) +

�1k

Z wmax (p;";k)

w(p;";k)

[E( ~w(p; "; k); "; k)� E(w(p; "; k); "; k)] dF ( ~w(p; "; k)j"; k):

The expected value of being unemployed for a worker with ability ", who belongs to group k

is given by:

�U("; k)

= bk"+ �0k

Z wmax(p;";k)

wmin(p;":k)

[E( ~w(p; "; k); "; k)� U("; k)] dF ( ~w(p; "; k)j"; k):

Finally, the value of the match with productivity p" for the �rm when paying a wage w(p; "; k)

considered a hiring cost for the �rm. Modeling this possibility is outside of the scope of this paper.

10

to a worker of group k is given by:

�J(w(p; "; k); p; "; k) (2)

= p"� w(p; "; k)� (�k + �1k �F (w(p; "; k)j"; k))J(w(p; "; k); p; "; k);

where �F (w(p; "; k)j"; k) = 1�F (w(p; "; k)j"; k) and F (w(p; "; k)j"; k) is the equilibrium cumulative

distribution function of wages paid by �rms with a productivity lower than p to workers with an

ability " who belong to group k. Note that every parameter is group-speci�c. As the alternative

value of the match for the �rm is always zero, this value does not depend on alternative matches

and is therefore independent of the parameters of the other groups of workers. Although every

group is sharing the same labor market, all value functions may be considered group-by-group as

if they were in independent markets. For simplicity of the notation, we therefore omit the k-index.

Note that w(p; ") is a function of p and ", therefore given p and ", the wage is a redundant state

variable which is only included for exposition simplicity.

These expressions are equivalent to the value functions of the model with heterogeneous �rms

presented in Shimer (2006), including heterogeneity in workers ability. Here, wages are determined

by the following surplus splitting rule:

(1� �) [E(w(p; "); ")� U(")] = �J(w(p; "); p; "): (3)

After some algebra (see the Online Appendix for the proof13), it can be shown that:

w(p; ") = p"� (�+ � + �1 �F (w(p; ")j"))

�(1� �)�

Z w(p;")

wmin(p;")

1

(�+ � + � �F ( ~w(p; ")j"))d( ~w(p; ")):

Noting that �F (w(p; ")j") = �H(p) and changing the variable within the integral, we obtain a

�rst-order di¤erential equation,

w(p; ") = p"� (�+ � + �1 �H(p))(1� �)�

Z p

pmin(")

1

(�+ � + � �H(p0))

d(w(p; "))

dp0dp0:

13Online Appendix available at http://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf

11

After some algebra, solving for the di¤erential equation yields:

w(p; ") = "p� "(1� �)(�+ � + �1 �H(p))�Z p

pmin

��+ � + �1 �H(p

0)��

dp0: (4)

This expression states a clear relationship between wages w(p; "), workers�ability ("), �rm

productivity (p), the distribution of �rms�productivity (H(:)), friction patterns (�1; �) and the

rent-splitting parameter (�):14 This wage equation is relatively similar to the one proposed by

Cahuc, Postel-Vinay and Robin (2006) when the wage is bargained between a �rm with produc-

tivity p and an unemployed worker with ability ":15

As expected, the model predicts that the mean equilibrium wage increases in �; and that

the mean wage paid by a �rm with productivity p increases in p: Note in Equation (4) that if

� = 1 ) w(p; ") = p"; the maximum wage that a �rm with productivity p can pay to a worker

with ability " is the full productivity: If � = 0) w(p; ") = pmin"; that is the minimum wage that a

worker would accept to leave unemployment. Figure 1 illustrates that the mean equilibrium wage

increases when �1 increases and when � decreases.16 Many models in the literature predict that

the mean equilibrium wage decreases in the amount of frictions (see for example the models in

Burdett and Mortensen 1998, Bontemps, Robin and van den Berg, 2000; Postel-Vinay and Robin,

2002; Cahuc, Postel-Vinay and Robin 2006). Wages are decreasing in the degree of frictions (�1�)

due to a compositional e¤ect and a direct e¤ect. The compositional e¤ect is due to the fact that

when frictions decrease the workers are, on average, higher in the job ladder. The direct e¤ect is

because the model is asymmetric in workers and employers, when frictions decrease the outside

option of the worker increases, but the outside option of the �rm is always zero.17

We assumed that the economy is in a steady state. The standard stationary equilibrium

14Solving the di¤erential equation we have:

w(p; ") = "p� "(1� �)(�+ � + �1 �H(p))�Z p

pmin(")

��+ � + �1 �H(p

0)��

dp0:

In section A.2 of the Online Appendix, we show that the minimum accepted productivity is independent of theworker type. Therefore we directly write pmin in the lower limit of integration.15Note that in Cahuc et al. (2006), when the wage is bargained between a �rm and an unemployed worker,

Bertrand competition does not hold. Therefore, their proposed scenario and ours are equivalent. The only di¤erencestems from the fact that both parts take into account future Bertrand competition.16These simulations are calibrated using the estimated parameters of male highly-quali�ed workers in the man-

ufacturing sector, see Section 4. The parameters are: � = 0:292, �1 = 0:217 and � = 0:034.17See van den Berg and van Vuuren (2010) for a deeper discussion on the e¤ect of friction on wages.

12

020

040

060

0M

ean

Dai

ly W

age

0 .2 .4 .6 .8 1beta

020

040

060

080

010

00W

ithin

Firm

Mea

n D

aily

Wag

e

0 1000 2000 3000Firm's Productiv ity

170

180

190

200

Mea

n D

aily

Wag

e

0 .2 .4 .6 .8 1delta

160

170

180

190

200

Mea

n D

ailly

Wag

e0 .2 .4 .6 .8 1

lambda

Figure 1: Wage Setting Equation

conditions are exploited. The in�ow must balance the out�ow for every stock of workers, de�ned

in terms of individual ability, employment status, and for those workers who are employed, �rm

productivity.

� The in�ow to unemployment must be equal to its out�ow, �0� = �(1 � �), where � is the

unemployment rate given by:

� =�

� + �0(5)

� The in�ow to jobs in �rms with productivity p or lower than p must be equal to its out�ow:

�0H(p)� =��1 �H(p) + �

�G(p)(1� �);

where G(p) is the fraction of workers employed at a �rm with productivity p or lower than

p: Using condition (5) and rearranging:

G(p) =H(p)

1 + �1��H(p)

; (6)

This stationary condition, (or its counterpart in terms of wages) is quite common and has

13

been broadly used after Burdett and Mortensen (1998) to infer the primitive distribution of

productivity (or the primitive distribution of wages) when only the distribution of produc-

tivity (or distribution of wages) for employed workers is observable. Since we use matched

employer-employee data, we can directly observe the empirical distribution of productiv-

ity at the �rm level. We use this stationary condition to construct the likelihood for the

duration analysis in Section 4.

� The fraction of employed workers with ability " or lower than " that are working in �rms

with productivity p or lower than p are (1��) ~F ("; p); where ~F ("; p) is the joint cdf of " and

p: These workers leave this group due to a better o¤er, or because they become unemployed;

such event occurs with probability (� + �1 �H(p)). The in�ow to this group is given by the

unemployed workers with ability " or lower than ", (ie: L(")�) who receive an o¤er from a

�rm with productivity p or lower than p. This last event occurs with probability �0H(p).

We now obtain the following condition:

(1� �)(� + �1 �H(p))F ("; p) = �0H(p)L(")�:

Using conditions (5) and (6) and rearranging:

F ("; p) =H(p)

(1 + �1��H(p))

L(") = G(p)L("): (7)

This expression states that there is no sorting between �rm�s productivity and worker�s abil-

ity;18 this statement is controversial, and there is an active debate in the assortative matching

literature.

Becker (1973) showed that in a model without search frictions but with transferable utility, if

there are supermodular production functions, then any competitive equilibrium exhibits positive

assortative matching. In a more recent work, Shimer and Smith (2000) and Atakan (2006) show

that in search models, complementaries in production functions are not su¢ cient enough to ensure

assortative matching.

18To show that there is no sorting, condition (7) is necessary but not su¢ cient. pmin; the minimum productivity,also needs to be independent of the worker ability. This second condition also holds in this model (the proof is inthe Online Appendix at https://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf ).

14

After Abowd, Kramarz, and Margolis (1999), the empirical literature has primarily focused

on estimating the correlation between worker�s and �rm�s �xed e¤ects, using matched employer-

employee data. However, there are still no de�nitive results. Abowd et al (1999) �nd a negative

and small correlation between �rms and workers �xed e¤ects for France and a zero correlation for

the U.S.. Lindeboom, Mendez and van den Berg (2010), using a Portuguese matched employer-

employee dataset, �nd that there is positive assortative matching.

3 Data

Linked Employer-Employee Data from the German Federal Employment Agency

(LIAB).

We use the linked employer-employee dataset of the IAB (denoted LIAB), covering the period

of 1996-2005. LIAB was created by matching the data of the IAB establishment panel and

the process-produced data of the Federal Employment Services (social security records). The

distinctive feature of this data is the combination of information about individuals with details

concerning the �rms in which these people work. The workers source contains valuable data

on age, sex, nationality, daily wage (censored at the upper earnings limit for social security

contributions),19 schooling/training, the establishment number and occupation. Occupation is

recorder using a 3-digit code but in this paper is collapsed into two categories: high and low

quali�cation occupations.20

The �rm level data come form IAB Establishment Panel. These data is drawn from a strati�ed

sample of the plants, where the strata are de�ned over industries and plant sizes (large plants are

oversampled), but the sampling within each cell is random. The �rm�s data provide details on total

sales, value added, investment, depreciation,21 number of workers and sector.22 In particular, only

19In the sample of �rms used in this paper, 14.7% of the worker observations have censored information onwages. This proportion varies signi�cantly across genders and occupations. 4.7% of female observations and 18.1%of males observations are censored. The proportion of worker observations in high-quali�cation occupations, withwages that exceed the upper earning limit for social security contributions, is 37.0% while the correspondingproportion of low-quali�cation occupations is 3.1%.20We assigned the following groups to the low quali�cation occupations: agrarian occupations, manual occupa-

tions, services and simple commercial or administrative occupations. We classi�ed the following as high quali�ca-tion occupations: engineers, professional or semi-professional occupations, quali�ed commercial or administrativeoccupations, and managerial occupations.21The survey gives information about investment made to replace depreciated capital.22For a more detailed description of this dataset, see Alda et al (2005)

15

�rms with more than 10 workers, a positive output and positive depreciated capital were included

in our subsample. Since �rms of di¤erent sectors do not share the same market, we construct

separate samples for each sector. LIAB has a very detailed industry classi�cation. We focus on

four primary industries: Manufacturing, Construction, Trade, and Services.23 Participation of

establishments is voluntary, but the response rates are high (exceeding 70 per cent). However, the

response rate on some variables key for our purposes is lower. Among survey respondents, only 60

percent of �rms in the previous four industries provided valid responses for output. To estimate

productivity, we need data on output and number of workers in each category. We only consider

observations from the old Federal Republic of Germany (West-Germany). The �nal number of

observations in our sample of �rms is 15,174. Table 1 provides descriptive statistics of the �nal

sample of the �rms.

Table 1: Firms - Descriptive Statistics

no of Output no of Women (%) Men (%)�rms (mean)* workers Low-Q High-Q Low-Q High-Q

Manufact. 7,354 151.0 4,297,762 11.9 8.1 55.3 24.7Construct. 1,491 30.2 170,786 12.9 11.5 58.1 17.6

Trade 2,078 67.4 247,884 30.6 17.5 34.4 17.6Services 4,251 30.4 1,043,678 21.1 21.0 38.9 20.0Total 15,174 92.8 5,760,110 14.2 11.0 51.5 23.4

* Per annum total output in millions of euros

One of the main advantages of this data-set is that it has information for all employees subject

to social security in each �rm.24 The employee data are matched to �rms for which we have valid

estimates of productivity through a unique �rm identi�er. The raw data contains 21,246,022

observations between 1996 and 2005. After the �nal trimming we have a 10-year unbalanced panel,

including a total of 5,760,110 workers�observations distributed into 15,174 �rms�observations.

Given this selection on top of the original over-sampling of large plants, the sample becomes less

23The service sector includes three kind of services de�ned in the survey: industrial, transport and communica-tion, and other services. In the database, there is also information about �rms in the �nancial and public sectors.No measure of output is well-de�ned for these sectors, so they have been excluded from the analysis.24Employees subject to social security are workers, other employees and trainees who are liable to health, pension

and/or unemployment insurance or whose contributions to pension insurance is partly paid by the employer. Wedo not have information on workers who are not liable to social security. The following forms of employmentare not considered liable to social security: civil servants, self-employed persons, unpaid family workers and so-called "marginal" part-time workers (A "marginal" part-time worker is a person who is either only employed fora short-term or paid a maximum wage of e400 per month).

16

representative. According to the Federal Statistical O¢ ce (Statistisches Bundesamt Deutschland),

between 1996 and 2005, the proportion of the workforce in the manufacturing sector ranges

between 25.6 and 31.7 percent. In the sample used in this paper, this proportion exceeds 74 percent

(Table 1). For that reason, the analysis is performed clustering at the industry and occupation

level and when making inference on the total population, the between-clusters aggregation of

results are calculated using weights obtained from the G-SOEP.25

In the sample, women are, on average, younger than men. They also have less tenure and

less experience. Women are overrepresented in high quali�cation occupations. The proportion of

immigrants is larger within the male group. See Table 2 for details on the sample of workers.

Table 2: Workers - Descriptive Statistics

Women MenImmigrant (%) 8.4 10.4Age (years) 39.2 40.7Tenure (years) 10.1 12.0Experience (years) 15.3 17.1High-Qualification (%) 46.4 31.9Observations 1,290,156 4,130,453

The primary goal of this study is to understand the gender wage gap. In LIAB, individual

wages are top-coded. Consequently, the wage gap is estimated by maximum likelihood assuming

that log-wages are normally distributed within the �rm. The mean of the log-wage for each group

is estimated using worker-level data, maximizing saturated normal-likelihoods at the �rm level.

This is consistent with the model if we assume that, conditioning on the worker group, the ability

is log-normally distributed.

The di¤erence in conditional means is 21 percent (see Table 17 in the Appendix). This implies

that women, on average, have salaries that are 21 percent lower than men with the same set

of observable characteristics. The unconditional wage di¤erential is 41 percent, but it is not25The weights for each groups are estimated with the relative frequencies in the 1996-2005 sample of the G-

SOEP; these include: manufacturing-high quali�cation-men 12.3%, manufacturing-low quali�cation-men 15.6%,manufacturing-high quali�cation-women 6.1%, manufacturing-low quali�cation-women 4.8%, construction-highquali�cation-men 4.1%, construction-low quali�cation-men 7.0%, construction-high quali�cation-women 1.1%,construction-low quali�cation-women 0.2%, trade-high quali�cation-men 6.1%, trade-low quali�cation-men 2.9%,trade-high quali�cation-women 10.5%, trade-low quali�cation-women 2.8%, services-high quali�cation-men 10.3%,services-low quali�cation-men 4.2%, services-high quali�cation-women 8.7%, and services-low quali�cation-women3.1%.

17

stable across sectors and occupations (see Table 3). Mean-wages, estimated across industries and

occupations, illustrate that the gap ranges between 30 and 45 percent. Wage gaps are signi�cantly

di¤erent from zero for every sector and for every group. Earning di¤erentials are larger for highly

quali�ed workers in manufacturing, construction and trade sectors.

Table 3: Gender Wage Gap

Mean Daily-Wage W-GapWomen Men (%)

Manufact. Low-Q 76.94 109.25 29.57%(0.41) (0.59) (1.00%)

High-Q 105.74 190.30 44.43%(0.11) (0.22) (0.33%)

Construct. Low-Q 53.35 96.59 44.77%(0.14) (0.16) (0.30%)

High-Q 82.81 150.90 45.11%(0.27) (0.54) (0.81%)

Trade Low-Q 47.64 84.90 43.16%(0.08) (0.53) (0.51%)

High-Q 67.76 119.20 43.88%(0.02) (0.09) (0.11%)

Services Low-Q 49.11 88.47 44.80%(0.07) (0.17) (0.24%)

High-Q 87.44 156.31 44.06%(0.26) (0.68) (0.94%)

Weighted Average 72.08 124.10 41.93%

Note: Standard errors are given in parentheses. Individual wages are top-coded therefore means of log-wage are estimated using worker-level data maximizing saturated normal-likelihoods at the �rm level.Means of wages are then recovered by the moment generating function. Standard errors are obtainedusing the Delta-Method.

German Socio-Economic Panel

The LIAB version used in this paper is a panel of �rms complemented by worker data. As it

does not track workers, it is not possible to distinguish between attrition26 and job-termination.27

For that reason we use the G-SOEP (German Socio-Economic Panel) to estimate the group-

speci�c transition parameters.28 The G-SOEP is a representative repeated survey of households

in Germany. This survey has been carried out annually with the same people and families in

26There is practically no attrition in a establishment level data. There is attrition in the worker level data dueto changes in the worker�s identi�er and changes across establishments of the same �rm.27Unless the worker leaves the establishment and moves to another establishment within the panel.28Cahuc, Postel-Vinay and Robin (2006) follow the same strategy for estimating transition parameters with the

French Labor Force Survey.

18

Germany since 1984 (note that this study only analyzes the 1996-2005 data).29

4 Empirical Strategy and Results

The discrete nature of annual data implies a complicated censoring of the continuous-time trajec-

tories generated by the theoretical model. Because of these complications, a potentially e¢ cient

full-information maximum likelihood is not considered as a candidate for the estimation. Instead,

we perform a multi-step estimation procedure.30

Even though it may be asymptotically ine¢ cient, we prefer a step-by-step method. One reason

is that transition parameters are better estimated using a standard labor force survey, such as

G-SOEP. Another reason is that the consistency of every parameter in full-information maximum

likelihood is only guaranteed in the case of the correct speci�cation of the full model. However,

we are interested in obtaining productivity di¤erences and transition parameter estimates that

are robust to misspeci�cation in other parts of the model, such as the wage setting mechanism.

This facilitates the comparability of the intermediate results with previous reduced-form studies

which measure gender di¤erentials in productivity and job-retention rates. Moreover, even in an

informal way, models are generally incomplete. Therefore, it seems prudent to use estimators that

are as robust to misspeci�cation as possible.

The structural model abstracts many dimensions that may be relevant in the wage setting

(e.g., amenities or union pressure). These omitted dimensions may be mainly associated with

di¤erent job types. As can be seen in Tables 1 and 2, there are important di¤erences between men

and women in terms of occupation and sector. To compare jobs that are as similar as possible,

the empirical analysis is clustered at the sector and occupation level. The model is estimated

independently for each of the four sectors. In order to control for occupation, transition and

the rent-splitting parameters are estimated independently for both types of jobs, in each sector

and gender group. We only control for occupation parametrically when we estimate productivity,

because we need to consider the full workforce at each �rm.31

29See Wagner, Burkhauser, and Behringer (1993) for further details on the G-SOEP.30Multi-step estimation has been done in many papers. Good examples include Bontemps, Robin, and Van den

Berg (2000), Postel-Vinay and Robin (2002), and Cahuc, Postel-Vinay and Robin (2006).31We treat selection into occupations and sector as exogenous. Nevertheless, this selection may also have some

content of discrimination (e.g. Blau and Hendricks, 1979).

19

4.1 Productivity

We assume, as in the theory laid out in Section 2, that the distribution of abilities in worker

groups within each �rm �uctuates around some �xed density, say l("jp; k). According to the

model (see condition (7)), there is no assortative matching between workers and �rms. Therefore,

for a given worker group (k), the equilibrium distribution of worker types, conditional on the �rm

type, equals the marginal one: L("jp; k) = L("jk). The model describes the matching process,

and the wage setting, for single-worker �rms. For the sake of consistency with multi-worker �rms,

we assume that workers are perfectly substitutable in the production process, between and within

skill categories (k0s). Equivalently to Cahuc, Postel-Vinay and Robin (2006), we de�ne the total

amount of e¢ cient labor employed at �rm j at time t as:

Ljt =XK

kLkjt:

where Lkjt is the total amount of workers of group k in �rm j at time t; and k =

"Z"

"l("jk)d"

is the steady-state mean ability of workers of group k. We then specify �rm j�s total per-period

output at the constant-return Cobb�Douglas function of capital and quality adjusted labor input.

Yjt = AjK(1��)jt L�jt; (8)

where, Yjt is the value added produced by �rm j in period t, Kjt is the total capital and Aj is

a �rm-speci�c productivity parameter. A similar speci�cation, without imposing constant return

to scale, has already been used in the discrimination literature to estimate between-group pro-

ductivity gaps (e:g: Hellerstein and Neumark, 1999 and Hellerstein, Neumark and Troske, 1999).

We are primarily interested in four groups of workers depending on gender (men and women) and

occupation (high and low quali�cation). We normalize hm = 1; considering highly-quali�ed male

workers as the reference group.32 As such, k = ~ k=~ hm is the proportional productivity of group

k relative to the productivity of highly-quali�ed male workers.

We further assume that �rms can adjust capital instantaneously, what makes (8) totally con-

32Due to this normalization, the �rm speci�c productivity ~Aj is rede�ned as Aj �lhm:

20

sistent with the theory presented in Section 2, where we have assumed that the productivity of a

match is p". If the �rm chooses the level of capital, the �rm solves the following problem:33

maxKjt

(AjK(1��)jt L�jt � rKjt);

where r is the cost of capital. Therefore, K�jt(Aj; Ljt) �that is the optimal choice of capital given

the stock of workers and the �rm speci�c productivity parameter �is:

K�jt(Aj; Ljt) = Ljt

�1� �r

Aj

� 1�

Plugging K�jt(Aj; Ljt) into (8) and rearranging, we have that:

Yjt = Aj

"Ljt

�1� �r

Aj

� 1�

#1��L�jt =

"A

1�j

�1� �r

� 1��

#Ljt = pjLjt (9)

Note that (9) is equivalent to the production function presented in Section 2, where pj is a function

of the �rm-speci�c productivity parameter�e:g: : pj = A

1�j

�1��r

� 1��

�and Ljt is the total labor

input in e¢ ciency units. Using the panel with �rm level data on output ( �Yjt)34, depreciated

capital (Kdjt)

35 and the number of workers in each category, we estimate the production function

in logs imposing that the relative di¤erence in productivity between low and high quali�cation

occupations is constant across gender ( lw = w � l).

log( �Yjt) = log(Aj) + (1� �l) log(Kdjt) + (10)

�l log(Lhmjt + wL

hwjt + lL

lmjt + w lL

lwjt ) + ujt;

where Lhwjt and Lhmjt are, respectively, the number of women and men in high-quali�cation occu-

pations in �rm j at time t while Llwjt and Llmjt are, respectively, the number of women and men in

low-quali�cation occupations in �rm j at time t; and ujt is a zero mean stationary productivity

33Section A.2 in the Appendix provides more detail and robustness checks on this assumption.34There were problems of convergence in the estimation of (8) using value added. For that reason, output

measures where used in the estimation of � and in (8). If a constant fraction of output is spent on materials,both estimators are consistent for � and . There is only a di¤erence in the constant term of (8).On the other hand, to estimate pj in (9), the constant term matters, and hence we use measures of value-added.35Assuming that a constant fraction (d) of capital depreciates by unit of time: Kd

jt = d � Kjt ) log(Kdjt) =

log(d) + log(Kjt): Therefore �k log(d) goes to the constant term.

21

shock.

The model predicts that more productive �rms are able to attract more workers of every

type. Moreover, the optimal choice of capital depends on Aj: As a result, productive inputs

are correlated with the �rm �xed e¤ect (ie:cov(Aj; Ljt) 6= 0 and cov(Aj; Kjt) 6= 0); which would

generates inconsistent estimates of � and if we used ordinary least squares in the estimation.

Therefore, we estimate equation (10) by within-�rms non-linear least squares to remove the �rm

�xed e¤ects.

Table 4: Production Function Estimates

WF-NLLS Estimates of (10)�l w l

Manufacturing 0.963 0.672 0.484(0.005) (0.062) (0.042)

Construction 0.961 0.701 0.444(0.006) (0.052) (0.039)

Trade 0.971 0.804 0.487(0.009) (0.092) (0.056)

Services 0.945 0.588 0.298(0.007) (0.068) (0.030)

Weighted Average 0.96 0.67 0.43

Note: Time dummies included. Standard errors robust to heteroskedasticity are given in parentheses.Weighted averages take into account the number of �rms in each sector.

The within-�rms non-linear least squares results are presented in Table 4. Women�s produc-

tivity is lower than men�s productivity in similar jobs. This di¤erence ranges from 20 to 41

percent. On average across sectors, female workers are 33 percent less productive than males in

each job. One of the primary candidates to explain this large productivity gap, is that these

estimates are not taking into account that women work, on average, fewer hours than men. Using

the G-SOEP36, we �nd that the average hour-gap is 19.9 percent, hence, di¤erences in hours are

likely to be one of the main determinants of the productivity gap.37

Low-quali�cation workers are also found to be between 51 percent and 70 percent less produc-

tive than highly quali�ed workers. As in most production function estimations using microdata,

36LIAB does not provide information on hours.37Although di¤erences in hours are illustrated as being important, the primary results of this paper remain valid.

We only mention them in order to have a better understanding of the estimated productivity gap. See Section 5.1for a more detailed discussion of this issue.

22

�l is found to be very near one, and hence, �k is very small but statistically di¤erent from zero.

Although this �nding is standard,38 the main results of this paper are not very sensitive to this

issue. If instead of �l = 0:96; we include an a priori more realistic value of �l = 0:60; the wage

gap decomposition does not change signi�cantly.

Pioneered by Hellerstein and Neumark (1999) and Hellerstein, Neumark and Troske (1999),

di¤erences in productivity across genders are now well documented in the literature. The �rst

paper �nds, with Israeli �rm-level data, a productivity gap of 17 percent. The second paper, using

a U.S. sample of manufacturing plants, reports a productivity gap of 15 percent. These studies

have been criticized mainly due to the potential endogeneity of the proportion of female workers

in the �rm.39

In this paper, we treat the number of workers of each group as potentially correlated with

the �rm �xed e¤ect.40 Estimating equation (10) by within-�rm non linear least-squares, the �rm

�xed e¤ect is completely removed; hence, our estimates are robust to a correlation between the

�rm �xed e¤ect and the �rm labor input level, but also robust to a correlation between the �rm

�xed e¤ect and the labor composition of the �rm. In this dataset, we �nd a signi�cant correlation

between the �rm�s �xed e¤ect and the �rm�s labor input. Estimating equation (10) by NLLS

without �xed e¤ects, 0s estimates are signi�cantly lower, the average of NLLSw across sectors is

0.38 and the average of NLLSl across sectors is 0.26. See Table 12 in the Appendix.

The estimates presented in Table 4, require orthogonality of productive inputs to the sequence

of shock ujt: In the theory presented in Section 2, we abstract from productivity shock. Assuming

that ujt is uncorrelated with the productive inputs would be consistent with the model. As we

assume that capital is perfectly adjustable, if the productivity shock is observed by the manager

after production takes place, capital is exogenous and our results are consistent. Nevertheless, this

extra assumption could be avoided by dropping capital altogether and estimating equation (9)

directly. Results of this last exercise are presented in Table 11 in the Appendix. The estimates of

w and l are not statistically di¤erent from the ones that come out from the estimation of equa-

tion (10), what suggests that the productivity shock is not observed before the manager chooses

38One interpretation of this result is that Kdjt only captures variable capital whereas �xed capital is subsumed

in the �rm e¤ect. If this is the case, the constant returns restriction is dubious.39See Altonji & Blank (1999).40Indeed, the model predicts that more productive �rms are able to attract more workers of every type. Therefore,

the total labor input will be correlated with the �rm �xed e¤ect, but not with the labor input composition.

23

the level of K. The current productivity shock might also a¤ect the vacancy creation, making it

possibly correlated with Lj;~t 8 ~t > t, which would go against the condition of strict exogeneity

required for consistency of �xed-e¤ects estimators. As in the case of capital, the number of vacan-

cies opened by the �rm would not be correlated with the productivity shock if ujt were realized

ex-post the production takes place. However, there is large literature suggesting that productive

inputs are determined simultaneously with the productivity shock.41 It is possible to estimate �l

and k by GMM using internal instruments, assuming that Ljt and Kjt are predetermined. For

the sake of robustness of our results, we have explored this possibility, but unfortunately there is

a severe problem of lack of precision on the GMM estimates of the parameters (See Section A.1

in the Appendix for details).

4.2 Labor Market Dynamics

In the model described is Section 2, job terminations might occur endogenously, due to job-to-job

transitions, or exogenously, due to job destruction. As both processes are Poisson, the model

de�nes the precise distribution of job durations (i:e:; t) conditional on the �rm productivity

(i:e:; p):

L(tjp) =�� + �1 �H(p)

�e�[�+�1

�H(p)]t: (11)

We use the G-SOEP to estimate transition parameters. Unfortunately, this dataset does not

have productivity measures, so we estimate �1 and � treating p as an unobservable. To do this,

we maximize the unconditional likelihood L(t) =RL(tjp)g(p)dp; where g(p) is the probability

density function of the �rm�s productivity among employed workers implied by the model.

Taking derivatives with respect to p in Equation (6), we obtain the density of �rm productivity

in the population of workers as:

g(p) =(1 + �1

�)h(p)

1 + �1��H(p)

: (12)

In the Appendix we show the individual contributions to the unconditional likelihood become

41If the �rm has some knowledge of the productivity shock, it will adjust its choice of productive inputs accord-ingly. See for example Griliches and Mairesse, (1998).

24

simple enough to be estimated and are given by:

L(t) =�(1 + �1

�)

�1�

"Z (1+�1�)�t

�t

e�x

xdx

#:

Integrating the unobserved productivity out of the conditional likelihood removes p and all

reference to the sampling distribution H(p). This method is robust to any misspeci�cation in

wage bargaining. The only required property of the structural model is that there exists a scalar

�rm index, in this case p; which monotonously de�nes transitions.

In the Appendix42, we show how to obtain the exact form of the likelihood that takes into

account that some durations are right-censored, while others started before the survey�s beginning.

Finally, an individual contribution to the log-likelihood is:

l(ti) = (1� ci) log

0B@ R (1+�1�)�t

�te�x

xdx

e��Hi�� e��(1+

�1�)Hi

�(1+�1�)�Hi

R (1+�1�)�Hi

�Hi

e�x

xdx

1CA+

ci log

0B@ e��ti�� e��(1+

�1�)ti

�(1+�1�)� ti

R (1+�1�)�ti

�ti

e�x

xdx

e��Hi�� e��(1+

�1�)Hi

�(1+�1�)�Hi

R (1+�1�)�Hi

�Hi

e�x

xdx

1CA ;where ci is a right-censored spell indicator and Hi is the time period elapsed before the sample

started.43

Maximum likelihood estimates are reported in Table 5. The average duration of an employment

spell, 1=� (possibly changing employers), is between 10 and 32 years, but the mean-duration across

sectors is 20.2 years. The average time between two outside o¤ers, 1=�1, ranges from 1.7 to 4.6

years. These results seem to be fairly large, but they are compatible with the previous literature.

van den Berg and Ridder (2003) used a similar speci�cation, but with German aggregated data,

and found � to be equal to 0.060 and �1�equal to 6.5,44 Here, the weighted average of � across

sectors and groups is 0.0574 and the weighted average of �1�is 6.41.

Highly quali�ed workers have, in general, lower transition rates to unemployment and lower

42Online Appendix available at http://sites.google.com/site/cristianbartolucci/DetectingWageDiscirmination_OA.pdf43The MATA code for computing the exponential integral and the MATA code to maximize this likelihood are

available from the author upon request.44van den Berg and Ridder (2003, p.237) report monthly rates for �1 = 0:028 and �1 = �1

� = 6:5:

25

Table 5: Transition Parameters - Maximum Likelihood Estimates

Low-QualificationWomen Men

�1 � �1�

�1 � �1�

Manufact. 0.406 0.044 9.202 0.314 0.031 10.095(0.041) (0.004) (0.127) (0.021) (0.002) (0.176)

Construct. 0.601 0.098 6.085 0.437 0.105 4.162(0.188) (0.031) (0.150) (0.025) (0.006) (0.118)

Trade 0.5257 0.094 5.613 0.432 0.074 5.478(0.050) (0.009) (0.085) (0.042) (0.008) (0.090)

Services 0.559 0.095 5.866 0.458 0.086 5.313(0.051) (0.009) (0.073) (0.037) (0.007) (0.049)

High-QualificationWomen Men

�1 � �1�

�1 � �1�

Manufact. 0.308 0.044 7.025 0.217 0.034 6.373(0.031) (0.004) (0.316) (0.015) (0.002) (0.103)

Construct. 0.510 0.090 5.691 0.255 0.071 3.620(0.095) (0.017) (0.107) (0.025) (0.006) (0.048)

Trade 0.327 0.060 5.459 0.353 0.050 7.093(0.020) (0.004) (0.078) (0.032) (0.004) (0.219)

Services 0.393 0.073 5.363 0.227 0.050 4.547(0.024) (0.004) (0.062) (0.015) (0.003) (0.051)

Note: Per annum estimates. Standard errors are given in parentheses.

on-the-job o¤er arrival rates. Women are more mobile than men in terms of job-to-job transitions

and, in general, have higher job-destruction rates. Considering �1�as an index of friction (van den

Berg and Ridder 2003), we do not �nd signi�cant di¤erences in the extent of labor market friction

between men and women.

4.3 The Wage Equation: Closing the Model

The structural wage equation (4) can be written as:

wj;t;i = "iwpj;t;k(pj; �k; Hk(p); �1k; �k);

26

where wj;t;i is the daily wage of worker i; who belongs to group k,45 in a �rm j with productivity

pj at time t; and:

wpj;t;k(pj ; �k; Hk(p); �1k; �k)

= pj � (1� �k)(�+ �k + �1;k �Hk(pj))�k

�Z pj�

pmin

��+ �k + �1;k �Hk(p

0)��k dp0:

As illustrated in equation (7), " is statistically independent of p; thus

E(wj;t;i) = E("iwpj;t;k(pj; �k; Hk(p); �1k; �k))

= Ek(")E(wpjtk(pj; �k; Hk(p); �1k; �k)):

Ek(") = k is the mean e¢ ciency units of workers in group k; relative to the male highly-

quali�ed group. Therefore, the predicted mean wage for workers of group k working in �rms with

productivity pj at time t is:

E(wjtk) = kwpjtk(pj; �k; Hk(p); �1k; �k): (13)

The group chosen for normalization is irrelevant. Changing this group to a generic group k;

we would change our measure of productivity. Instead of pj, that is, the productivity measured

in terms of e¢ ciency units of highly-quali�ed males, we would have pkj = kpj ; that is, the

productivity measured in term of e¢ ciency units of group k: To de�ne equation (13) in terms of

the productivity of group k; we only need to put k inside the expectation operator:

E(wj;t;i) = E( kpj �

(1� �k)(�+ �k + �1;k �Hk(pj))�k

�Z pj

pmin

��+ �k + �1;k �Hk(p

0)��k kdp0):

45Notice that given the individual identi�er i, the group k is �xed. Therefore we could replace k by k(i): Tokeep the notation as simple as possible we simply refer to k.

27

Noting that dpdpk= 1

k; and changing the variable within the integral, we obtain:

E(wj;t;i) = E(pkj �

(1� �k)(�+ �k + �1;k �Hk(pkj ))�k

�Z pkj

pkmin

��+ �k + �1;k �Hk(p

k0)��k dpk0) �

For each �rm in the sample we estimate the average daily wage �wjtk paid to the workers of

group k at time t. Since wages are top-coded, we estimate the �rm mean-wage for each worker

group (i:e:; �wjtk) by maximum likelihood at the �rm level assuming that "i is distributed as a log-

normal.46 Under the steady state assumption, and according to the theory presented in Section

2, �wjtk exhibits stationary �uctuation around the steady state mean wage E(wjtk) paid by �rm j

with productivity pj.

Writing equation (13) in logs:

log �wjtk = ln( k) + (14)

ln[pj � (1� �k)(�+ �k + �1;k �Hk(pj))�k

�Z pj

pmin

��+ �k + �1;k �Hk(p

0)��k dp0]

+vjtk;

We estimate equation (14) by weighted non-linear least squares at the �rm level, where k, �k

and �k are parameters estimated in the previous stages, pj is the productivity of �rm j, and vjtk

is a transitory shock that comes out from within �rm aggregation. As usual, the discount factor

has been set to an annual rate of 5 percent (daily rate of 0.0134 percent).

Standard errors have to take into account that ; �1 and � are estimated in the previous

stages. To solve this problem, we combine bootstrap for with the analytical solution for �1 and

�. Hence, we obtain standard errors replicating the productivity estimation and the bargaining

power estimation in 200 resamples of the LIAB original sample, with replacement, but considering

the estimated transition parameters as the population parameters. To correct these preliminary

46The within-�rm distribution of wages of each worker group, is the same as the distribution of ability. As wagesare linear in "; to assume log-normality in the distribution of " implies log-normal wages for each group within the�rm.

28

standard errors, we add to them the analytical term corresponding to the standard errors of �1

and � reported in Table 5. The analytical correction is straightforward in this case, because the

estimators come from di¤erent samples, such that we can omit the term corresponding to the

outer product of the scores in the �rst and second stages.

Consistent standard errors are given by:

\V ar(�̂) = V ar(�̂j�̂1; �̂)bootstrap +

�̂��1\V ar(�̂1)�̂��1 + �̂��

\V ar(�̂)�̂��̂�� ̂��

;

where � is the objective function in the optimization, which, in this case is the weighted sum of

squares and �̂��=@�@�̂@�

�@�

. Second derivatives of �̂ are obtained numerically.47

Results are presented in Table 6. Women are found to have lower rent-spitting parameters

than men in construction and trade for both high and low quali�cation occupations, and in

manufacturing highly quali�ed occupations. Female workers receive larger portions of the surplus

than males in services and in manufacturing low quali�cation occupations, but these di¤erences

are not signi�cant.

There is a clear pattern in terms occupations. Low quali�cation workers receive larger shares

of the surplus in every sector, considering female and male workers.48 Union coverage, which is

higher for low quali�cation occupations, and di¤erences in compensating di¤erentials, are potential

reasons to explain this di¤erence.

Estimates of the rent-splitting parameter are considerably higher than the parameters reported

in Cahuc, Postel-Vinay and Robin (2006). This is probably due to the di¤erences in our de�nition

of match rents.49 In a similar model estimated with US employee-level data by Flinn and Mabli

(2010), the overall bargaining power is found to be 0:45. Here, the weighted average across cells

47The analytical correction �̂��1\V ar(�̂1)�̂��1+�̂��

\V ar(�̂)�̂��̂��̂��

is not signi�cant in any industry, neither for high nor

low quali�cation workers.48Di¤erences between industries and occupations may be understood as consequence of compensating di¤eren-

tials or di¤erences in union pressure. They cannot be understood as discrimination, because we are comparingoccupations and not workers.49The surplus is de�ned in terms of the productivity of the match and the outside option. Both models imply

di¤erent outside options. Without Bertrand competition the worker outside option is unemployment. Whileallowing for Bertrand competition, the worker�s outside option is the whole productivity of the poaching �rm.As the outside option in the model with Bertrand competition dominates the one in the model without Bertrandcompetition, the estimated bargaining power is smaller.

29

Table 6: Rent-Splitting Parameter Estimates

.

WNLLS estimates of (14)Women Men

� �

Manufacturing Low-Q 0.419 0.398(0.129) (0.096)

High-Q 0.226 0.292(0.088) (0.036)

Construction Low-Q 0.214 0.408(0.090) (0.045)

High-Q 0.113 0.186(0.078) (0.104)

Trade Low-Q 0.339 0.382(0.173) (0.106)

High-Q 0.152 0.222(0.066) (0.064)

Services Low-Q 0.849 0.757(0.125) (0.125)

High-Q 0.413 0.324(0.104) (0.073)

Note: Corrected Bootstrap Standard errors are given in parentheses.

is remarkably similar, 0:421.

Allowing between-�rms Bertrand competition as in Cahuc, Postel-Vinay and Robin (2006),

changes the magnitude of � for males and females, but not the di¤erence across genders. In

the Appendix, we present a numerical exercise, where � is recovered by the simulated method

of moments using the same data and a model with between-�rm Bertrand competition. The

bargaining power is found to be signi�cantly lower; the weighted average is 0:219, but the gender

and occupation patterns do not change. Women are found to have lower bargaining power than

males in construction and trade, while there is no clear pattern in manufacturing and services.

As in the model without Bertrand competition, workers in low-quali�cation occupations are also

found to have higher � than workers in high-quali�cation occupations. These �ndings are not

consistent with results found in Cahuc, Postel-Vinay and Robin (2006), where they �nd a positive

association between bargaining power and job quali�cation in France. As in this case we are

estimating a similar model to Cahuc, Postel-Vinay and Robin (2006), this occupation pattern

does not seem to be a modeling artifact, but instead is a di¤erence between German and French

30

labor markets.50

Di¤erences in rent-splitting parameters are not signi�cant in every sector. We only �nd that

male workers receive larger shares of the surplus than females in the construction sector, where

bootstrap p-values of the di¤erences in � are 4.5 percent for low-quali�cation workers and 9.6

percent for high-quali�cation workers..

5 Wage-Gap Decomposition

The structural wage setting equation provides us with a direct way to isolate the e¤ect of each

wage determinant on the overall wage di¤erential. Hence, we are able to calculate what fraction

of the wage gap is due to segregation or di¤erences in the rent-splitting parameter, productivity

or friction patterns. Although the model takes into account the equilibrium e¤ects of changes

in the primitives, this is always conditional to what we have de�ned as primitives. For example,

there are good reasons to think that the o¤er arrival rate might be a function of the distribution

of o¤ers and the bargaining power. Changes in the distribution of o¤ers and the bargaining power

might change the search e¤ort of the worker, but in the model we do not endogeneize wage e¤ort.

Moreover, di¤erences in productivity might be also a function of frictions. If a demographic group

faces a labor market with more frictions, its members are likely to accommodate their optimal

investment in human capital. In principle every mechanism could be endogeneized, and for sure

the decomposition would more complete and its interpretation would be cleaner. The purpose of

the estimation presented in this paper is to provide a wage gap decomposition to have a better

understanding of the main determinants of wage di¤erentials between male and women. This

is not a �nal answer since we are not explaining what is causing the di¤erence in each wage

determinant, but our results provide insights of relative importance of each wage determinant in

the determination of the wage gap.

To calculate our decomposition we proceed as follow. Using the structural wage equation (4):

wj;t;i = "iwp(pj; �k; Hk(p); �1k; �k);

For each sector and each worker group, it is possible to measure the wage di¤erential caused by

50See Section A.4 for details.

31

the di¤erences in each wage determinant. For example, we can estimate the wage gap accounted

for by � as the relative di¤erence between the mean-wage that women actually receive and the

mean wage that women would receive if they had the male rent-splitting parameter.

1� E ["iwp (pj; �W ; HW (p); �1;W ; �W )]

E ["iwp (pj; �M ; HW (p); �1;W ; �W )]: (15)

As shown in equation (7) " is independent of p; therefore we can estimate equation (15) at the

�rm level:

1�

WXj

Nj � wp (pj; �W ; HW (p); �1;W ; �W )

WXj

Nj � wp (pj; �M ; HW (p); �1;W ; �W );

where Nj is the number of female workers in each �rm. To decompose the wage gap, we sequen-

tially replace each female parameter with a male parameter until we reach the male predicted

mean wage: Counterfactual wages are presented in Table 7.

Di¤erences in friction patterns imply di¤erences in the observed distributions of �rm produc-

tivity within the employed workers of di¤erent groups. This is also true when both groups face

the same primitive distribution of productivity (see condition (6)). Consequently, a change in

frictions generates a direct e¤ect and a compositional e¤ect on mean wages. The direct e¤ect is

the change in the wage of a worker ", who belongs to the group k, working in a �rm p, driven by

the change in the value of the outside option of the worker; this is the e¤ect displayed in Figure 1.

The second e¤ect is the one that comes from aggregation, due to the changes in the distribution

of the accepted wages.

The wage setting equation implies that the higher the o¤er-arrival rate, the higher the wage

and the higher the job destruction rate, the lower the wage (see Figure 1). In addition, increasing

the o¤er-arrival rate makes the counterfactual �rm productivity distribution among workers to

stochastically dominate the original one, while increasing the destruction rate has the opposite

e¤ect.51 Hence, the direct e¤ect and the aggregation e¤ect follow the same pattern when we

51Given that the o¤er arrival rate of women is higher than the one of men, the distribution of �rms productivityacross the population of female worker stochastically dominates the couterfactual distribution of �rm�s productivitycorresponding to female workers if they had the o¤er arrival rate of male workers. The proof is trivial, note thatif �F1 > �

M1 :

G(�W1 ; �W ) =

H(p)

1 +�W1�W

�H(p)6 H(p)

1 +�M1�W

�H(p)= G(�M1 ; �

W ) �

32

change �1 and �:

It is straightforward to calculate the counterfactual wage of one worker if we change � or �1.

But, as the distribution of accepted wages change, it is more problematic to aggregate individual

wages when some of this parameters change. Therefore, the counterfactual wages are calculated

by simulating the distribution of productivity among employed female workers using the friction

parameters of male workers.52

Table 7: Counterfactual wages

Counterfactual SectorsMean-Daily Wages M C T S

w(�M1 ; �M ; M ; H(p)M ; �M) 190.3 150.9 119.2 156.3

High w(�M1 ; �M ; M ; H(p)M ; �F ) 153.6 103.1 84.7 194.7

Qualification w(�M1 ; �M ; M ; H(p)F ; �F ) 151.1 104.6 92.8 138.3

Occupations w(�M1 ; �M ; F ; H(p)F ; �F ) 101.5 73.3 74.6 81.3

w(�M1 ; �F ; F ; H(p)F ; �F ) 95.2 69.5 69.7 71.6

w(�F1 ; �F ; F ; H(p)F ; �F ) 105.7 82.8 67.8 87.4

w(�M1 ; �M ; M ; H(p)M ; �M) 109.2 96.6 84.9 88.5

Low w(�M1 ; �M ; M ; H(p)M ; �F ) 114.1 56.3 76.2 97.6

Qualification w(�M1 ; �M ; M ; H(p)F ; �F ) 116.2 68.1 58.6 80.6

Occupations w(�M1 ; �M ; F ; H(p)F ; �F ) 78.1 47.7 47.1 47.4

w(�M1 ; �F ; F ; H(p)F ; �F ) 71.0 48.5 42.9 45.7

w(�F1 ; �F ; F ; H(p)F ; �F ) 76.9 53.3 47.6 49.1

Considering the direct e¤ect and the aggregation e¤ect together, the e¤ect of friction is signif-

icant. We �nd that, on average across sectors, gender di¤erences in destruction rates, �, explain 9

percent of the total wage gap, while decreasing the female o¤er arrival rate, �1, to be equal to the

male one would increase the gap by 13 percent. Di¤erences in � increase the gap, while di¤erences

in �1 reduce the gap. Therefore, the e¤ects of both type of frictions are partially compensated for.

The sizes of these e¤ects are consistent with Bowlus (1997), who analyzed samples of high school

On the other hand, given that women have higher destruction rates than men

G(�W1 ; �W ) 6 G(�W1 ; �M ):

52The model used for simulations is a simpli�ed version of the model presented in Section 2, where workerheterogeneity has been omitted. Simulations use the punctual estimates of �1; �; w; l and �l for every sectorand worker group, reported in Section 3. We assume that the primitive distribution of �rm productivity is log-normal. The mean of the distribution of �rm�s productivity is calibrated in equilibrium by matching mean wagesfor each occupation and for each sector. MATA codes for simulating the model previously described are availablefrom the author upon request.

33

and college graduates from the National Longitudinal Survey of Youth (NLSY). These behavioral

patterns were found to account for 20 - 30 percent of the wage di¤erentials.

It is possible to disentangle the direct e¤ect of friction on wages from the aggregation e¤ect.

For this purpose, we calculate the mean wage of women, changing the friction parameters, but

maintaining the original distribution of �rm productivity, as is illustrated in Figure 1. These

e¤ects are surprisingly small; di¤erences in � explain, on average, 1.3 percent of the total wage

gap, while netting out �1 increases the wage-gap by 2.4 percent. In view of that, most of the

e¤ect of friction comes through the aggregation e¤ect.

The proportion of the wage gap due to di¤erences in productivity connects directly with a

branch of the literature initiated by Hellerstein et al. (1999). In this line of work, they assumed

proportionality between wages and productivity. Therefore, an inequality in wages that is not

driven by a di¤erence in productivity, suggests that there is discrimination. In this paper we

impose more structure, which allows us to have a more complete interpretation of the fraction of

the wage gap that is not explained by di¤erences in productivity.

On average, 65 percent of the total wage gap is accounted for by di¤erences in productivity.

The role of productivity in explaining the wage gap is large, but not surprising, given the de�nition

of productivity used in this exercise. Behind these large productivity-gaps, there are important

di¤erences in productivity determinants. It can be seen on Table 2 that there are signi�cant

di¤erences in age, tenure and potential experience. Moreover, in Lauer (2000), there is also

evidence suggesting that a large part of the gender wage gap in Germany can be attributed to

the fact that women have a lower endowment in human capital than men.

There is a large literature base investigating the e¤ect of segregation on the gender wage gap.53

The structural estimation enables us to compare the current mean-wages of female workers with

the counterfactual female mean-wages, if they faced the primitive distribution of �rm productivity

faced by male workers (i.e., changing HW (p) for HM(p)). We �nd that 17 percent of the wage

gap is accounted for by this di¤erence. These results are consistent with the results of Bayard,

Hellerstein, Neumark and Troske (2003). who found a negative and small e¤ect of the proportion

of women in the establishment, over wages in the U.S.. One of the advantages of the estimation

of this structural model is that it enables us to empirically disentangle the e¤ect of segregation

53See Altonji and Blank (1999).

34

from di¤erences in the distribution of �rm productivity generated by gender di¤erences in fric-

tion patterns.54 In this exercise, we are not modeling the �rm behavior explicitly and therefore

di¤erences in H(p) have no direct structural interpretation.

The e¤ect of di¤erences in productivity is not stable across occupations. Three quarters of

the wage gap is explained by di¤erences in productivity in low quali�cation occupations, while

this proportion reduces to 58 percent when considering high quali�cation occupations. Finally,

these productivity measures are not on an hourly-basis: the numbers of hours signi�cantly di¤ers

between male and female workers (see Section 5.1 for a more detailed discussion on this issue).

Table 8: Gender Wage-Gap decomposition

.

% of the Wage-Gap explained byDifferences in Low-Qualification High-Qualification All

� 10.3 8.5 9.3� -12.6 -13.8 -13.3 74.5 58.3 65.1

H(p) 23.5 13.4 17.7� 4.2 33.6 21.2

Wage-Gap 39.0 44.3 41.9

On average, 21 percent of the total wage gap is accounted for by the di¤erences in the rent-

splitting parameter. This means that women receive wages that are 9 percent lower than the

wages received by equivalent men. This �nding is not signi�cantly di¤erent to what we obtained

using the traditional approach based on Mincer-equations, where female workers are found to

receive wages that are 15 percent lower than those of equivalent male workers (see Section A.5

in the Appendix). The reported gender di¤erences in bargaining power represent a potential

explanation of the observed di¤erences in destruction rates and in productivity. Neumark and

McLennan (1995) show that as women experience labor market discrimination, they have more

career interruptions, and therefore less human capital accumulation.

As in the case of di¤erences in productivity, di¤erences in the wage setting parameter are

also occupation-speci�c. As can be seen from Table 8, the di¤erence in rent splitting parameters

turns out to be an important determinant of the gender wage gap of workers in high quali�ca-

tion occupations; this explains almost none of the wage gap in the low quali�cation jobs. This

54To the best of our knowledge, there is currently no available paper that disentangles both e¤ects.

35

result is consistent with the growing literature on glass ceilings. One potential reason to explain

this pattern is the higher union coverage in low quali�cation occupations. Collective bargaining

agreements regulating wage rates and general working conditions may limit the extent of wage

discrimination. See Heinze and Wolf (2010) and Antonczyk, Fitzenberger and Sommerfeld (2010)

for empirical evidence of a relationship between the gender wage gap and the coverage by collective

wage bargaining in Germany.55

5.1 Productivity-gap and di¤erences in hours

One possible explanation for the large estimated productivity and wage gaps may be that male

workers work more hours than females. One of the main limitations of the LIAB is that it does

not provide any measure of hours. Therefore, the estimated di¤erences in productivity and the

estimated di¤erences in wages are not on an hourly-basis. To tackle this problem, one alternative

is to look for an external source of information about hours worked by each group in each sector.

Using the G-SOEP, we �nd signi�cant di¤erences in mean-hours between genders (see Table 9).

On average, female workers work almost 20 percent fewer hours than their male counterparts.

As the structural wage equation is linear in workers� ability, correcting for hours does not

a¤ect the estimated rent-splitting parameters.56 Nevertheless, an interesting exercise is to have

an hourly-wage-gap decomposition directly plugging in the hour correction.57 Correcting mean-

wages using mean-hours is going to be valid whenever hours worked are uncorrelated with hourly

wages, and there is evidence suggesting that this may be the case.58

Results are presented in Table 10. When correcting for hours, the wage gap is signi�cantly

smaller. We �nd that the average hourly-wage gap is 27.5 percent. A smaller fraction of this gap,

48 percent, is now explained by di¤erences in productivity. On the other hand, a larger fraction of

the gap, 32 percent,59 is due to di¤erences in rent-splitting parameters. Segregation is responsible

55The order of the sequential decomposition steps matters. As a robustness check, we reversed the order of thedecomposition (see Table 16 in the Online Appendix). Results remain qualitatively similar.56If wages and productivity are linear in time worked, the hour-correction only modi�es the worker productivity

and the wage. Consider the case where a worker i works h percent of time of a full time worker.

hwi;j;t = h"iwpj;t;k(i)(pj;t; �k(i);Hk(i)(p); �1k(i); �k(i))

h is canceled out of both sides of the equation and the original wage setting equation holds.57I must thank Zvi Eckstein for this suggestion.58See Blundell and MaCurdy (1999).59Female workers used to receive wages that were 6.2% lower than equivalent males, due to di¤erences in the

36

Table 9: Mean-Hours Per Week

Men Women Hours-GapManufact. Low-Q. 39.95 38.84 12.8%

(0.10) (0.21) (0.56%)High-Q 41.68 38.69 7.2%

(0.08) (0.32) (0.78%)Construct. Low-Q 43.09 34.40 20.2%

(0.26) (1.65) (3.85%)High-Q 43.62 38.74 11.2%

(0.12) (0.97) (2.24%)Trade Low-Q 41.23 28.27 31.4%

(0.47) (0.45) (1.35%)High-Q 44.01 33.89 23.0%

(0.24) (0.52) (1.25%)Services Low-Q 44.67 28.12 37.0%

(0.43) (0.40) (1.08%)High-Q 46.33 42.08 9.2%

(0.30) (0.72) (1.66%)Note: Standard errors are given in parentheses. Means of weekly-hours are estimated using worker-level

data from the G-SOEP. Standard errors of the hours gap are obtained using the Delta-Method.

Table 10: Gender Wage-Gap decomposition

.

% of the Hourly Wage Gap explained byDifferences in Low-Qualification High-Qualification All

� 32.0 11.1 17.7� -39.2 -18.1 -25.4 43.8 21.8 48.3

H(p) 53.7 15.7 27.0� 9.6 39.4 32.4

Wage-Gap 17.1 37.8 27.5

for 27 percent of the unconditional wage gap in hourly wages. Netting out the o¤er arrival rate

would increase the hourly-wage di¤erential by 25 percent; if women had the male destruction rate,

the gap would be 17 percent smaller.

rent-splitting parameter. This number does not change because the rent-splitting parameter estimates don�tchange.

37

6 Concluding Remarks

This paper presents the �rst estimation of an equilibrium search model using matched employer-

employee data to study wage discrimination. The combination of �rm and worker data is crucial

for these purposes for many reasons. Firstly, and most importantly, it allows us to separately

identify di¤erences in workers productivity across groups and di¤erences in wage policies toward

those groups. Secondly, better data makes the estimation simpler, and therefore, a more complete

model can be estimated. Finally, it enables us to produce measures of the e¤ect of labor market

segregation, disentangled from the e¤ect of friction on the distribution of accepted wages.

The structural estimation involved several steps. First, we estimated group-speci�c produc-

tivity for each �rm, relying on the production function estimation at the �rm-level, using LIAB.

Second, we computed job-retention and job-locating rates using G-SOEP employee-level data.

And third, we estimated the wage-setting parameters (bargaining power) using individual wage

records in LIAB, and transition parameters and productivity measures speci�c to each group and

�rm estimated in the previous steps.

When analyzing productivity, we observe that women are 33 percent less productive than men

in similar jobs. This di¤erence is reduced to 17.5 percent if we control for di¤erences in hours. The

primary �ndings, in terms of friction patterns, are that women are, in general, more mobile than

men in terms of job-to-job transitions and that they have higher job-destruction rates. In spite of

having large wage di¤erentials, women were only found to have signi�cantly lower rent-splitting

parameters than men in the construction sector.

In terms of wages, we �nd that the unconditional gender wage gap is 42 percent. It turns

out that 65 percent of the gap is accounted for by di¤erences in productivity. Di¤erences in

destruction rates explain 9 percent of the total wage-gap and segregation is responsible for 17

percent of this wage di¤erential. Netting out di¤erences in o¤er-arrival rates would increase the

gap by 13 percent. Di¤erences in the rent-splitting parameter generate 21 percent of the wage gap,

which implies that female workers receive wages 9 percent lower than equivalent males. This is

not a �nal word given that we still have to understand what is causing the di¤erence in each wage

determinant. Moreover, in the model all the parameters are considered as exogenous primitives.

In reality, most of them are the outcome of individual behavior, and they are not independent.

38

Still, the purpose of our decomposition is to identity which wage determinants are contributing

the most to the wage gap. Since we �nd that the gender productivity gap is very large and it

accounts for most of the wage di¤erential between male and women. A promising area for future

research is now understanding what is behind the productivity gap. Our model is not the most

appropriate framework to analyze this question because the productivity of the worker is assumed

to be a �xed primitive characteristic and not a result of a decision of investment in human capital.

In a life cycle model, it is likely that some of the di¤erence in productivity between male and

women is a response to a di¤erence between both groups in � or H(p).60

Our results also rely on simplifying assumptions that would need further scrutiny. We now list

two very desirable extensions. A �rst interesting extension is to relax the assumption of station-

arity. Men and women have di¤erent levels of experience in the data. Di¤erences in experience

are immaterial in the model, which is in steady-state with in�nitely-lived agents. However, in the

real world workers with shorter experience are less productive and have had less time to climb the

job ladder. To incorporate experience and human capital accumulation by learning-by-doing in

the model would generate a much richer interpretations of the gender di¤erence in H(p) and the

gender di¤erences in productivity. This extension is not trivial because the equilibrium allocation

of workers to �rms in such model has assortative matching (see Carrillo-Tudela, 2009) and the

estimation of the production function would have to account for it. A second desirable extension

is to look into our reported gender di¤erences in rent splitting parameters. Within this di¤erence,

apart from pure wage discrimination, there can be primitive di¤erences in bargaining performance

(Croson and Gneezy, 2009). To empirically disentangle wage discrimination from these behavioral

di¤erences is surely di¢ cult, but it would certainly yield further insight into the understanding

of the gender wage gaps.

A Appendix

A.1 Production Function - Speci�cation Tests

The robustness of the productivity estimates are crucial to be able to reach reliable conclusions

about wage discrimination. As robustness check, we present results of the productivity estimates

60This kind of mechanisms was �rst proposed by Coate and Loury (1993).

39

under di¤erent sets of assumptions.

In Table 11, we report within-group non-linear least squares estimates of (10) and (9). In the

estimation of (9) we do not use measures of capital. Relative productivity estimates (ie: w and

l) are very similar in both estimations: di¤erences between parameters are always smaller than a

standard deviation. This robustness check is essential because in this kind of model, it is assumed

that capital adjusts instantaneously to match the number of workers in each period. Although

this assumption may seem controversial, it turns out that using observed capital or using the

theoretical optimal choice of capital does not change the relative productivity estimates.

Table 11: Production Function: Optimal Capital Input

WG-NLLS of (10) WG-NLLS of (9) w u w u

Manufacturing 0.672 0.484 0.642 0.478(0.062) (0.042) (0.132) (0.070)

Construction 0.701 0.444 0.839 0.485(0.052) (0.039) (0.141) (0.833)

Trade 0.804 0.487 0.745 0.541(0.092) (0.056) (0.176) (0.982)

Services 0.588 0.298 0.595 0.301(0.068) (0.030) (0.115) (0.040)

Note: Time dummies included. Robust to heteroskedasticity standard errors are given in parentheses.

Worker Composition Endogeneity: One of the main criticisms to the productivity estimates

reported in Hellerstein and Neumark (1999) and in Hellerstein, Neumark, and Troske (1999) was

that the proportion of women in the �rm is likely to be correlated with the �rm�s technology.61

log(Yjt) = log( �Aj) + �k log(Kdjt) + (16)

�l log(Lmhjt + wL

whjt + lL

mljt + w lL

wljt ) + ujt:

Note that (16) is the original Cobb-Douglas production function in logs without imposing

constant returns to scale and including the depreciated capital instead of the optimal capital

input. In this estimation we are assuming that the depreciation rate is constant, and hence

depreciated capital is a constant fraction of the total capital.

61See Altonji and Blank (1999).

40

Table 12: Production Function: Non Linear Least Squares in Levels

NLLS of (16)�k �l w l

Manufacturing 0.153 0.905 0.352 0.300(0.007) (0.018) (0.032) (0.016)

Construction 0.084 1.034 0.451 0.382(0.017) (0.028) (0.082) (0.043)

Trade 0.110 0.963 0.562 0.365(0.018) (0.029) (0.068) (0.039)

Services 0.180 0.839 0.356 0.292(0.098) (0.015) (0.029) (0.017)

Note: Time dummies included. Standard errors are given in parentheses.

The results are presented in Table 12. Female productivity estimates are signi�cantly smaller.

This �nding is true for all sectors. The di¤erence between results presented in Table 12 and

results presented in Table 4, where �rm�s �xed e¤ects were removed, implies a non-zero correlation

between the proportion women of women in the �rm and the �rm �xed e¤ect. Hausman tests

reject equality of w in every sector. In terms of l; the results are not so di¤erent and we only

reject equality for manufacturing.

Predetermined inputs:

It is possible to estimate �l and k by GMM using internal instruments, assuming that Ljt

and Kjt are predetermined. The estimates of 0s by non-linear GMM are imprecise in every sector

and for every group. We have contrasted di¤erent identi�cation strategies, in terms of the sets of

internal instruments and optimization routines. has been estimated using only lagged levels as

instruments in the equation in di¤erences (Arellano and Bond, 1991); lagged levels as instruments

in the equation in di¤erences jointly with lagged di¤erences as instruments in the equation in

levels (e.g. Arellano and Bover, 1995); and only lagged di¤erences as instrument in the equation

in levels (e.g. Cahuc, Postel-Vinay and Robin, 2006). We obtained extremum estimators that

minimize the two-stage robust GMM objective function and iterated-GMM objective function.62

We also calculated Monte Carlo Markov Chain type of estimators for continuously-updated GMM

(Chernozhukov and Hong, 2003). Non-linear System-GMM estimates of the production function

(9) are reported in Table 13 .63 The lack of precision in the quality parameter estimates is a

62See Arellano 2003 pp 184-19763MATA codes for computing the non-linear estimators previously described are available from the author upon

41

Table 13: Production Function: Non-Linear SYSTEM-GMM

System-GMM of (16)�k �l w l Sargan p-v

Manufacturing 0.060 0.938 0.70 0.15 58%(0.046) (0.011) (0.454) (0.069)

Construction 0.016 1.164 0.442 0.188 98%(0.043) (0.160) (0.267) (0.083)

Trade 0.053 1.003 0.370 0.322 97%(0.050) (0.170) (0.290) (0.215)

Services 0.117 0.786 0.434 0.204 93%(0.071) (0.146) (0.158) (0.090)

Note: Time dummies included. Standard errors are given in parentheses.

pervasive problem in this kind of production function speci�cation when the lag of capital and

the lag of labor are used to instrument the current capital and labor, partialling out the �rm

�xed e¤ects. See for example Cahuc, Postel-Vinay and Robin (2006) or Hellerstein and Neumark

(1998).

Estimating (10) the �rm �xed e¤ect is completely removed, but the simultaneity problem is not

totally solved. One alternative would be to treat Ljt and kjt. In Table 13 we report System-GMM

estimates of (16). However the precision in the 0s GMM estimates is poor. Capital coe¢ cients

are not signi�cantly di¤erent from zero and constant returns to scale are not rejected in any sector.

Sargan tests do not reject compatibility of instruments in any sector. Haussman test of equality

between 0s reported in Table 13 and those in Table 4 do not reject equality in any group.

request.

42

References

[1] Abowd, J. M., Kramarz, F., and Margolis, D. N. (1999), "High wage workers and high wage�rms", Econometrica, 67(2), pp. 251-334.

[2] Antonczyk, D., B. Fitzenberger,and K. Sommerfeld (2010) "Rising Wage Inequality, theDecline of Collective Bargaining, and the Gender Wage Gap", Labour Economics 17(5), pp.835-847.

[3] Alda H., St. Bender, H. Gartner (2005), "The Linked Employer-Employee Dataset of theIAB (LIAB)", IAB-Discussion Paper.6/2005.

[4] Altonji, J.G. and R.M. Blank (1999), �Race and Gender in the Labor Market�, in: O.Ashenfelter and D. Card (eds.), Handbook of Labor Economics, Volume 3, Amsterdam:Elsevier Science.

[5] Arellano, M. and S.R. Bond (1991), �Some Tests of Speci�cation for Panel Data: MonteCarlo Evidence and an Application to Employment Mquations�, Review of Economic Studies,58(2), pp. 277�297.

[6] Arellano, M., and O. Bover (1995), �Another Look at the Instrumental-Variable Estimationof Error-Components Models�, Journal of Econometrics, 68(1), pp. 29�52.

[7] Atakan, A. (2006) "Assortative Matching with Explicit Search Cost", Econometrica, 74(3),pp. 667-680

[8] Bayard, K., J. Hellerstein, D. Neumark, and K. Troske (2003). �New Evidence on Sex Seg-regation and Sex Di¤erences in Wages from Matched Employee-Employer Data�, Journal ofLabor Economics 21(4), pp. 887-922.

[9] Becker, G. (1973), �A Theory of Marriage: Part I�, Journal of Political Economy, 81(4), pp.813-846.

[10] Blau, F. andW. Hendricks (1979) "Occupational Segregation by Sex: Trends and Prospects",The Journal of Human Resources, 14(2), pp. 197-210.

[11] Blau, F. and L. Kahn (2000), "Gender Di¤erences in Pay", Journal of Economic Perspectives14(49), pp. 75�99

[12] Blau, F. and L. Kahn (2003), �Understanding International Di¤erences in the Gender PayGap�, Journal of Labor Economics, 21(1), pp. 106-144.

[13] Blundell, R. and T. MaCurdy (1999) "Labor Supply: A Review of Alternative Approaches",Handbook of Labor Economics, Volume 3, Amsterdam: Elsevier Science.

[14] Bontemps, C., J.-M. Robin, and G. J. van den Berg (2000), �Equilibrium Search with Con-tinuous Productivity Dispersion: Theory and Non-Parametric Estimation�, InternationalEconomic Review, 41(2), pp. 305�358.

[15] Bowlus, A. (1997), �A Search Interpretation of Male-Female Wage Di¤erentials�, Journal ofLabor Economics, 15(4), pp. 625-657.

[16] Bowlus, A. and Z. Eckstein (2002), �Discrimination and Skill Di¤erences in an EquilibriumSearch Model�, International Economic Review, 43(4), pp. 1309-1345.

43

[17] Burdett, K.and D. T. Mortensen (1998), �Wage�Di¤erentials, Employer Size and Unemploy-ment", International Economic Review, 39(2), pp. 257�273.

[18] Cahuc, P., F. Postel-Vinay and J.-M. Robin, (2006), �Wage Bargaining with On-the-jobsearch: Theory and Evidence�, Econometrica, 74(2), pp. 323-364.

[19] Carrillo-Tudela, C., (2009), �An Equilibrium Search Model when Firms Observe Workers�Employment Status�, International Economic Review, 50(2), pp. 485-506.

[20] Chernozhukov, V. and H. Hong, (2003), "An MCMC Approach to Classical Estimation"Journal of Econometrics 115(2), pp. 293-346.

[21] Croson, R., U. Gneezy, (2009) "Gender Di¤erences in Preferences", Journal of EconomicLiterature, 47(2), pp. 448-474.

[22] Eckstein, Z. and K.I.Wolpin (1995), "Duration to First Job and Return to Schooling: Esti-mates from a Search-Matching Model", Review of Economic Studies, 62(2), pp. 263-286.

[23] Eckstein, Z. and K. Wolpin (1999), �Estimating the E¤ect of Racial Discrimination on FirstJob Wage O¤ers�, Review of Economics and Statistics, 81(3), pp. 384-392.

[24] Flabbi L.(2010), "Gender Discrimination Estimation in a Search Model with Matching andBargaining", International Economic Review, 51(3), pp. 745-783.

[25] Flinn, C. (2006), �Minimum Wage E¤ects on Labor Market Outcomes under Search, Bar-gaining, and Endogenous Contact Rates�, Econometrica 74 (4), pp. 1013-1062.

[26] Flinn, C and J. Mabli (2010), "On-the-Job Search, Minimum Wages, and Labor MarketOutcomes in an Equilibrium Bargaining Framework", NYU mimeo.

[27] Griliches, Z. and J. Mairesse (1998), "Production Functions: the Search for Identi�cation",Essays in Honour of Ragnar Frisch. S. Strom (ed.), Econometric Society Monograph Series,Cambridge University Press.

[28] Heinze, A. and E. Wolf (2010) "The Intra-Firm Gender Wage Gap: a New View on WageDi¤erentials Based on Linked Employer�Employee Data", Journal of Population Economics23(3), pp. 851-879.

[29] Hellerstein, J., Neumark, D. (1998) "Wage Discrimination, Segregation, and Sex Di¤erencesin Wage and Productivity Within and Between Plants", Industrial Relations 37(2), pp. 232-260.

[30] Hellerstein, J. and D. Neumark (1999), "Sex, Wages, and Productivity: An Empirical Analy-sis of Israeli Firm-Level Data". International Economic Review 40(1), pp. 95�123.

[31] Hellerstein, J., D. Neumark, and K. R. Troske (1999), "Wages, Productivity, and WorkerCharacteristics: Evidence from Plant-Level Production Functions and Wage Equations",Journal of Labor Economics 17(3), pp. 409�46.

[32] Jolivet, G., F. Postel-Vinay and J.-M. Robin (2006), �The Empirical Content of the JobSearch Model: Labor Mobility and Wage Distributions in Europe and the U.S.�, EuropeanEconomic Review, 50(4), pp. 877-907.

[33] Lauer, C. (2000), "Gender Wage Gap in West Germany: How Far Do Gender Di¤erences inHuman Capital Matter?", ZEW Discussion Paper No. 00-07.

44

[34] Lindeboom, M., R. Mendes and G.J. van den Berg (2010), "An Empirical Assessment ofAssortative Matching in the Labor Market", Labour Economics, 17(6), pp. 919-929.

[35] Mondal S. (2006), "Employer Discrimination and Racial Di¤erences in Wages and Turnover",Unpublished Manuscript.

[36] Mortensen, D.T. (2003), "Wage Dispersion", MIT Press, Cambridge.

[37] Moscarini, G, (2008), "Job-to-Job Quits and Corporate Culture", Unpublished Working Pa-per.

[38] Neumark, D. and M. McLennan (1995) "Sex Discrimination and Women�s Labor MarketOutcomes", Journal of Human Resources, 30(4), pp. 713-740.

[39] Postel-Vinay, F. and J-M Robin (2002), �Equilibrium Wage Dispersion with HeterogeneousWorkers and Firms", Econometrica 70(6), pp. 2295-1350.

[40] Ridder, G. and G.J. van den Berg (2003), �Measuring labor market frictions: a cross-countrycomparison�, Journal of the European Economic Association, 1(1), pp. 224�244.

[41] Rubinstein, A. (1982), �Perfect Equilibrium in a Bargaining Model�, Econometrica, 50(1),pp. 97�109.

[42] Shimer, R. (2006), �On-the-Job Search and Strategic Bargaining�, European Economic Re-view 50(4), pp. 811.830

[43] Shimer, R. and L. Smith (2000), "Assortative Matching and Search", Econometrica, 68(2),pp. 343 369

[44] Sulis, G. (2011), "Gender Wage Di¤erentials in Italy: a Structural Estimation Approach",Journal of Population Economics, 25(1), pp 53-87.

[45] Van den Berg, G.J. and A. van Vuuren (2010), "The E¤ect of Search Frictions on Wages",Labour Economics, 17(6), pp. 875-885.

[46] Wagner, G., R.V. Burkhauser, and F. Behringer (1993), �The English Language Public UseFile of The German Socio-Economic Panel�, Journal of Human Resources, 28(2), pp. 429-433.

45

Documents

Gender Wage Gaps Reconsidered: A Structural Approach Using