Page 1: Location and the effect of demographic traits on earnings

Regional Science and Urban Economics 29 (1999) 445–461

Location and the effect of demographic traits on earnings

Stuart A. Gabriel ᵃ,*, Stuart S. Rosenthal ᵇ

ᵃ Department of Finance and Business Economics, Marshall School of Business, University of Southern California, Los Angeles, CA 90089-1421, USA

ᵇ Department of Economics and Center for Policy Research, Syracuse University, Syracuse, NY 13244-1090, USA

Received 15 October 1996; received in revised form 18 January 1999; accepted 27 January 1999

Abstract

With mobile workers and competitive markets, equilibrium nominal wage rates rise with the local cost of living but fall with the value of local amenities. Earnings and wage regressions that ignore such effects may suffer from omitted variable bias because observed education and demographic attributes affect both worker skill levels and location choice. Geographic fixed effects can be used to control for unobserved locational attributes provided that their scope is at least as narrow as the underlying labor markets, but not so narrow as to introduce simultaneity problems arising from the endogenous choice of location on the basis of income. Estimates from the 1985–1989 American Housing Survey suggest that SMSA-level fixed effects control for unobserved locational attributes without introducing simultaneity problems. In addition, failure to control for location leads to biased estimates of the effect of important demographic characteristics. © 1999 Elsevier Science B.V. All rights reserved.

Keywords: Returns to labor; Location effects; Compensating variations

JEL classification: J3; J31

*Corresponding author. Tel.: +1-213-740-6523. E-mail address: [email protected] (S.A. Gabriel)

0166-0462/99/$ – see front matter © 1999 Elsevier Science B.V. All rights reserved. PII: S0166-0462(99)00008-3


1. Introduction

Over the years, a large number of studies have found pronounced differences in earnings across workers of different education, race, and gender [recent examples include Blau and Beller (1992), Katz and Murphy (1992), and Murphy and Welch (1992)]. That literature is fundamental to the study of labor economics and has provided important insight into the returns to education, in addition to prompting debate about the extent to which discrimination depresses wage rates for women and minorities. However, most earnings studies largely fail to control for location specific cost of living and amenity differentials that may comprise important components of the worker’s equilibrium compensation package.¹ In an open city model with mobile workers, metropolitan area wage rates must rise to offset higher housing costs, ceteris paribus. In addition, because mobile workers choose where to locate in part based on preferences for location specific amenities [e.g., Tiebout (1956), Hamilton (1976), and Epple and Romer (1991)], wage rates should fall with an increase in the value of the amenities specific to a given geographically distinct labor market.²

One implication of these arguments is that a worker’s compensation package is comprised of both real pecuniary and nonpecuniary earnings. Real pecuniary earnings are given by nominal earnings deflated by a location specific cost of living index, whereas nonpecuniary earnings take the form of location specific amenities. It follows that with competitive markets and mobile households, in equilibrium differences in nominal earnings across similarly skilled workers situated in geographically distinct labor markets should be offset by compensating variations in location specific amenity and cost of living differentials.³ Hence, earnings and wage regressions that omit those amenity and cost of living differentials will suffer from omitted variable bias to the extent that observable worker characteristics influence both the worker’s skill level and the worker’s choice of labor market.

¹ Occasionally wage and earnings studies have included broad regional dummy variables to control for locational effects. However, inclusion of such variables is generally done without attention to the tradeoff between omitted variable and simultaneity bias to be discussed here, and the relationship of that tradeoff to the theory of compensating differentials.

² Thus, both wage rates and house prices adjust across cities in response to local amenity and fiscal differentials. See Henderson (1982), Roback (1982, 1988), Blomquist et al. (1988), Beeson and Eberts (1989), and Gyourko and Tracy (1989, 1991), for example.

³ In an earlier paper that focuses on household commutes [Gabriel and Rosenthal (1996)], we treat commute times as an additional component to the labor compensation package. However, commutes are excluded from the present analysis for three reasons. First, theoretical arguments by Mills and Hamilton (1989) and empirical work by Ihlanfeldt (1992) suggest that the relationship between commutes and wage rates within cities is quite flat. Second, our data contains commute times only for 1985 whereas all other variables are available for both 1985 and 1989. Thus, dropping commute times allows us to estimate earnings regressions for both 1985 and 1989. Third, we ultimately focus on the effect of SMSA-level locational attributes that include the set of commute opportunities available in any given SMSA.

More generally, these arguments are part of a broader phenomenon described by Rosen (1986) as the theory of compensating wage differentials. In particular, wages adjust for all non-pecuniary job-specific attributes, including occupational hazards, employment and income security, as well as locational amenities. However, while there have been many empirical attempts to examine the degree to which wages adjust to offset differences in occupational safety and job security, treatment of locational effects has been quite limited. The limited attention to location may reflect, in part, the enormous data requirements should one attempt to control for location through direct inclusion of individual location-specific attributes in the wage regression (see Henderson (1982), Roback (1988), Blomquist et al. (1988), and Gyourko and Tracy (1991), for example).⁴ Such an approach, however, remains open to the possibility of omitted (locational) variable bias since it is impossible to fully specify the complete set of locational attributes. For the sizable literature that focuses on the wage and earnings effects of education, race, and other demographic characteristics of the worker, such omitted variable bias may be important.

Controlling for locational effects is problematic, however, because as suggested above, accurate data on the value of localized amenity and cost of living differentials across labor markets may not exist. Thus, direct inclusion of a long list of observable locational attributes is at best an imperfect means of controlling for location. As an alternative, fixed effects methods can be used to control for unobserved locational attributes if workers can be grouped into spatially distinct clusters. We show that such fixed effects models yield consistent estimates of the impact of education and demographic traits on earnings provided that two conditions hold: the geographic scope of the fixed effects must be at least as narrow as the workers’ underlying labor markets, but not so narrow as to introduce simultaneity problems arising from the endogenous stratification of workers across neighborhoods on the basis of income.

These issues are explored using a unique data base from the 1985–1989 American Housing Survey (AHS) that permits us to group workers both at the within-SMSA neighborhood level and at the SMSA level. Results favor the SMSA model and suggest that inclusion of SMSA fixed effects largely controls for unobserved locational attributes without introducing simultaneity problems. Moreover, failing to control for location leads to downward biased estimates of the black earnings deficit by roughly 6 percentage points and downward biased estimates of the returns to males by 3 to 6 percentage points. More generally, our findings suggest that wage and earnings studies that seek to obtain consistent estimates of demographic effects can control for possible omitted locational attributes by including SMSA-level dummy variables in the regression, an approach that has modest data requirements and is easily implemented.

⁴ Henderson (1982), Roback (1988), Blomquist et al. (1988), and Gyourko and Tracy (1991) estimate quality-of-life differentials across metropolitan areas, where quality of life is measured by summing the degree to which individual SMSA-level amenities are capitalized into wages and house prices.

To clarify these and other results the paper is organized as follows. The following section describes the theoretical underpinnings of the earnings and location analysis. Section 3 presents the econometric model and discusses estimation procedures. Section 4 describes the data and variables. The final two sections of the paper present estimation results and concluding remarks.

2. Theoretical model

Household utility (v) is given by the sum of a systematic (V) and an idiosyncratic (f) component,

$$v_{ij} = V(x_{ij}, a_j) + f_i \tag{2.1}$$

where V increases with the consumption of a privately purchased composite good ($x_{ij}$) and a local public good ($a_j$) ($V_1 > 0$, $V_2 > 0$) at a diminishing rate ($V_{11} < 0$, $V_{22} < 0$), while $f_i$ is unforecastable with mean zero and finite variance. To simplify exposition, we define location j as a geographically distinct labor market in which workers have access to the same set of labor market specific locational attributes and amenities.⁵ Notationally, variables that are specific to a given location are subscripted only by j, as in $a_j$. Variables that are unique to a given household regardless of the location of the family’s residence are subscripted only by i, as in $f_i$. Variables that are sensitive to both locational and household specific characteristics are subscripted by i and j. Hence, $x_{ij}$ is double subscripted because consumption of the private composite good depends on a household’s endowment and on the labor market specific price of housing and other goods (in a manner to be clarified below).

Each worker inelastically supplies one unit of labor. Labor markets are competitive, and an employer at a given location varies wage rates across workers only in response to differences in endowment that affect the worker’s skill level. However, employers in different geographically distinct labor markets are free to offer different wages to similarly skilled workers. Accordingly, wage rates are given by $y_{ij} = y_j(m_i)$, and the household’s budget constraint is,

$$y_j(m_i) = p_j x_{ij}. \tag{2.2}$$

As indicated in expression (2.2), households spend all of their monetary earnings on the private composite good. Note also that within a given labor market, the price of the composite good ($p_j$) is constant across households and is subscripted only by j.⁶

We adopt an open city, long run perspective, in which families are perfectly mobile (i.e., zero moving costs). Hence, households maximize Eq. (2.1) subject to Eq. (2.2) by choosing where to locate from among all possible geographically distinct labor markets. Those decisions determine $y_{ij}$, $p_j$, and $a_j$, while $x_{ij}$ is obtained from the budget constraint as $x_{ij} = y_j(m_i)/p_j$. A spatial equilibrium is attained when workers with identical endowment obtain equal expected utility regardless of the labor market in which they are situated. Substituting for $x_{ij}$ in the utility function and taking expectations yields,

$$E[V(x_{ij}, a_j) + f_i] = V(y_j(m_i)/p_j, a_j) = k(m_i), \tag{2.3}$$

where k is the expected utility for a worker with endowment, $m_i$, and E is the expectations operator.

Expression (2.3) implies that the worker’s equilibrium compensation is a package comprised of real pecuniary and nonpecuniary earnings: pecuniary earnings are given by nominal wage receipts deflated by the labor market specific price level, $y_j(m_i)/p_j$, while nonpecuniary earnings are provided in the form of location specific amenities, $a_j$. Hence, in equilibrium, an increase in $p_j$ or a decrease in $a_j$ must be offset by higher nominal wages, $y_{ij}$, ceteris paribus.

⁵ For the purposes of this paper, by definition labor market specific attributes include only those amenities that would affect the equilibrium supply of labor and wage in a given area. Examples of such labor market specific amenities include the weather, natural features, public safety, public services, and the like.
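The comparative statics stated at the end of Section 2 can be checked by totally differentiating (2.3) while holding the worker's endowment, and hence $k(m_i)$, fixed. The short derivation below is a sketch that uses only the sign assumptions $V_1 > 0$ and $V_2 > 0$ from (2.1); it is not taken from the paper itself.

```latex
\begin{align*}
0 &= V_1 \left( \frac{dy_{ij}}{p_j} - \frac{y_{ij}}{p_j^{2}}\, dp_j \right) + V_2\, da_j, \\
\left. \frac{\partial y_{ij}}{\partial p_j} \right|_{a_j} &= \frac{y_{ij}}{p_j} > 0,
\qquad
\left. \frac{\partial y_{ij}}{\partial a_j} \right|_{p_j} = -\, p_j\, \frac{V_2}{V_1} < 0 .
\end{align*}
```

Nominal wages therefore rise one-for-one in proportional terms with the local price level and fall with local amenities, which is the compensating pattern the text describes.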

3. Econometric model

To facilitate the empirical work two simplifying assumptions are imposed. First, a worker’s exogenously given equilibrium level of utility is specified as a linear function of the worker’s demographic and human capital characteristics, or $k(m_i) = d_i b_d$, where $d_i$ includes information on age, education, gender, and race, for example. In addition, the systematic component to a worker’s utility is written as a linear function of locational amenities and the log of the privately purchased composite good, $V(y_j(m_i)/p_j, a_j) = \log(y_{ij}) - \log(p_j) + a_j$. Substituting into (2.3) and allowing for random effects associated with the idiosyncratic component to utility,

$$\log(y_{ij}) - \log(p_j) + a_j = d_i b_d + e_i \tag{3.1}$$

where $e_i$ is a mean zero normally distributed error term with variance $s^2$, and $a_j$, $p_j$, and $y_{ij}$ are defined as before. Rearranging terms, (3.1) can be rewritten as,

$$\log(y_{ij}) = \log(p_j) - a_j + d_i b_d + e_i \tag{3.2}$$

which says that log wage earnings are a linear function of the labor market specific amenities and cost of living, and the worker’s demographic and human capital characteristics.

⁶ The composite good is composed of goods and services whose prices may vary across geographically distinct labor markets. As such, $p_j$ is the cost of living index for labor market j.

The key to our empirical approach is to recognize that $a_j$ and $p_j$ are location specific fixed effects. Including dummy variables for each location yields,

$$\log(y_{ij}) = g_j + d_i b_d + e_{ij} \tag{3.3}$$

where $g_j = \log(p_j) - a_j$. If the geographic scope of the individual clusters defined by $j = 1, \ldots, J$, is at least as narrow as their underlying labor markets, then (3.3) takes into account all relevant locational effects and omitted variable bias goes to zero. This is convenient since one could never fully specify the complete vector of labor market specific amenities nor obtain perfectly accurate measures of the cost of living in a given labor market. In contrast, most previous analyses of earnings differentials implicitly constrain all of the $g_j$ in (3.3) to be equal, and in so doing ignore the impact of location specific amenity and cost of living differentials on worker compensation.
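The omitted variable argument behind (3.3) can be illustrated with a small simulation. The sketch below is not the authors' estimation code: it assumes a single binary trait, a hypothetical rule that makes the trait more common in high-$g_j$ markets, and made-up parameter values. It compares a pooled regression (all $g_j$ constrained to be equal) with the within-location estimator, which is numerically equivalent to including a dummy for each location.

```python
import random


def ols_slope(x, y):
    """Slope from a one-regressor OLS of y on x (with intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean; equivalent to including group dummies."""
    totals, counts = {}, {}
    for v, g in zip(values, groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


def simulate(n_locations=50, n_per_location=200, b_true=-0.15, seed=0):
    """Return (pooled slope, within-location slope) for simulated data."""
    rng = random.Random(seed)
    d, logy, loc = [], [], []
    for j in range(n_locations):
        g_j = rng.gauss(0.0, 0.3)  # location effect g_j = log(p_j) - a_j
        # Assumption for illustration: the trait is more common where g_j is high.
        share = min(max(0.3 + 0.5 * g_j, 0.05), 0.95)
        for _ in range(n_per_location):
            d_i = 1.0 if rng.random() < share else 0.0
            e_ij = rng.gauss(0.0, 0.3)
            d.append(d_i)
            logy.append(g_j + b_true * d_i + e_ij)
            loc.append(j)
    pooled = ols_slope(d, logy)
    within = ols_slope(demean_by_group(d, loc), demean_by_group(logy, loc))
    return pooled, within
```

Under these assumed parameters, `simulate()` should return a pooled slope that is pulled away from `b_true`, while the within-location slope stays close to it.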

A weakness of (3.3) is that estimates of the slope coefficients in the model may suffer from simultaneity bias arising from the endogenous choice of location.⁷ For example, consider an earnings regression that controls for location at the neighborhood level. On the one hand, such a regression is appealing in that neighborhood locational controls would certainly eliminate omitted variable bias arising from locational effects given that labor markets are more broad in geographic scope than individual neighborhoods. On the other hand, considerable evidence indicates that workers stratify themselves across neighborhoods on the basis of earnings. Moreover, given the fixed effects approach, the slope coefficients in the model will be identified based on within neighborhood variation in the data. This implies that the estimated returns to education, for example, will reflect only information from mixed education neighborhoods. Such neighborhoods, however, will tend to be populated with low education workers with above average earnings, and high education workers with below average earnings. Estimates from a neighborhood fixed effects model, therefore, should underestimate the returns to education because of simultaneity bias.

⁷ Note that while $a_j$ and $p_j$ are endogenous, simultaneity problems arise only if $\mathrm{Cov}(g_j, e_{ij})$ differs from zero. Since $g_j = \log(p_j) - a_j$, and local price levels increase with $a_j$ [$\log(p_j) = p(a_j)$, $p'(a_j) > 0$],

$$\mathrm{Cov}(g_j, e_{ij}) = \mathrm{Cov}(p(a_j), e_{ij}) - \mathrm{Cov}(a_j, e_{ij}).$$

Although individuals with unobserved skills and earnings ($e_{ij} > 0$) choose to locate in more attractive areas, those areas are more expensive. Thus, both $\mathrm{Cov}(p(a_j), e_{ij})$ and $\mathrm{Cov}(a_j, e_{ij})$ are positive and without further restrictions on $p(a_j)$, the sign of $\mathrm{Cov}(g_j, e_{ij})$ and whether (3.3) is subject to simultaneity bias is ambiguous.

Such simultaneity problems disappear if workers do not sort themselves by income class across the locations specified by the fixed effects. At the SMSA level, estimates are unlikely to suffer from simultaneity bias since virtually all major US metropolitan areas contain a range of different income neighborhoods over which individual workers can choose. Accordingly, workers with higher than expected earnings would likely locate in higher income neighborhoods within a given metropolitan area, rather than seek out different metropolitan areas, per se. Suppose also that individual metropolitan areas comprise the set of geographically distinct labor markets in the United States, which seems reasonable as a first approximation. These arguments imply that earnings regressions that control for SMSA-level fixed effects likely avoid both omitted variable and simultaneity bias (at least as pertains to locational effects) and are preferred to models that ignore location.⁸
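The contrast between neighborhood and SMSA fixed effects can also be sketched numerically. The simulation below is illustrative only and rests on stated assumptions: a single metropolitan area with no location effects, a binary education indicator, and a hypothetical sorting rule in which workers are ranked on earnings (plus noise) and cut into small neighborhoods. All parameter values and function names are made up for the example.

```python
import random


def within_slope(x, y, groups):
    """OLS slope of y on x after removing group means (group fixed effects)."""
    sums = {}
    for xi, yi, g in zip(x, y, groups):
        s = sums.setdefault(g, [0.0, 0.0, 0])
        s[0] += xi
        s[1] += yi
        s[2] += 1
    sxy = sxx = 0.0
    for xi, yi, g in zip(x, y, groups):
        mx = sums[g][0] / sums[g][2]
        my = sums[g][1] / sums[g][2]
        sxy += (xi - mx) * (yi - my)
        sxx += (xi - mx) ** 2
    return sxy / sxx


def simulate_sorting(n=20000, block=10, b_true=0.25, sort_noise=0.2, seed=1):
    """Return (SMSA-level slope, neighborhood-level slope) on education."""
    rng = random.Random(seed)
    educ = [1.0 if rng.random() < 0.4 else 0.0 for _ in range(n)]
    logy = [b_true * d + rng.gauss(0.0, 0.4) for d in educ]
    # Hypothetical sorting rule: rank on earnings plus noise, cut into blocks.
    order = sorted(range(n), key=lambda i: logy[i] + rng.gauss(0.0, sort_noise))
    neigh = [0] * n
    for rank, i in enumerate(order):
        neigh[i] = rank // block
    one_market = [0] * n  # a single SMSA-style cluster: no sorting on income
    smsa_est = within_slope(educ, logy, one_market)
    neigh_est = within_slope(educ, logy, neigh)
    return smsa_est, neigh_est
```

With these assumed settings the SMSA-style estimate should sit near `b_true`, while the neighborhood estimate is pulled toward zero, because within an income-sorted neighborhood the more educated residents are those with below average unobserved earnings.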

4. Data and variables

Data for the study are drawn from a unique neighborhood supplement to the 1985 and 1989 national core files of the American Housing Survey (AHS). In 1985, the AHS selected 680 urban housing units at random from the overall core file of roughly 55 000 housing units. For each selected housing unit, the AHS then conducted a full survey of up to ten of that unit’s ‘closest neighbors.’ The data permit one to group workers into their respective neighborhood clusters. Alternatively, one can also identify the SMSA in which a worker resides, allowing one to group workers by SMSA location.

From these data we omit observations with household heads that are other than White, Black, or Asian.⁹ Also omitted were observations with missing values and the top 1 percent of wage earners whose salaries were top coded at $100 000.¹⁰ The resulting sample totaled 6088 cases in 1985 and 5933 cases in 1989. Restricting the sample to labor force participants reduced the sample by 45% to 3173 households in 1985 and 3239 households in 1989; the reduction in sample size is roughly consistent with the national household labor force participation rate of about 65 percent.¹¹ Summary statistics appear in Appendix B.

⁸ See Appendix A for a full discussion of the trade-offs between omitted variable and simultaneity bias.

⁹ Following the AHS, white Hispanics are counted as White and black Hispanics are counted as Black.

¹⁰ The analysis was also conducted including the top coded observations (TCO) with the TCO set equal to $100 000. Results were essentially identical to when the TCO were excluded.

As discussed above, the dependent variable in the earnings Eq. (3.3) is the log of the household head’s annual wage earnings (LogEARN). Regressors in the earnings model account for well established determinants of earnings, including education, work experience, race, gender, marital status, and family size. Education is measured by a series of dichotomous variables coded 1 or 0 based on whether or not the highest level of education attained by the household head is some graduate training (EDGRAD, 1 if yes), a college degree (EDCOL, 1 if yes), one to three years of college (EDSMCOL, 1 if yes), and less than a high school degree (EDLTHS, 1 if yes). The omitted educational stratum in all of the regressions is a high school degree (EDHS, 1 if yes).

Work experience is proxied by age of the household head as follows. The continuous age variable, AGE, is interacted three separate times with dummy variables corresponding to whether the household head is under age 50, between 50 and 60, or age 60 and over, respectively. The three resulting interactive variables, AGE50, AGE5060, and AGE60, are included in the regression. We tie the age profile together in this manner to allow the effect of a one year increase in age to differ according to the broad age category to which the household head belongs.¹²

Additional variables control for whether the worker is black (BLACK, 1 if yes), Asian (ASIAN, 1 if yes), male (MALE, 1 if yes), or married (MARR, 1 if yes). The number of children in the household (CHILD) is also included in the model.
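The variable definitions above can be summarized in a short sketch. The category labels passed to `education_dummies` (`"grad"`, `"college"`, and so on) are hypothetical, since the AHS codes are not given in the text, and the treatment of the exact boundary ages (50 and 60) is an assumption; the paper states only "under age 50, between 50 and 60, or age 60 and over".

```python
def education_dummies(level):
    """Map a schooling category to the four dummies; high school is omitted."""
    names = {
        "grad": "EDGRAD",
        "college": "EDCOL",
        "some_college": "EDSMCOL",
        "lt_hs": "EDLTHS",
    }
    if level != "hs" and level not in names:
        raise ValueError(f"unknown education level: {level!r}")
    row = {var: 0 for var in names.values()}
    if level in names:  # "hs" (the omitted stratum) leaves all dummies at zero
        row[names[level]] = 1
    return row


def age_terms(age):
    """Interact AGE with three age-bracket dummies."""
    return {
        "AGE50": age if age < 50 else 0,
        "AGE5060": age if 50 <= age < 60 else 0,  # boundary handling assumed
        "AGE60": age if age >= 60 else 0,
    }
```

Exactly one of the three age terms is nonzero for any household head, so each coefficient measures the effect of a one year increase in age within its own bracket.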

5. Estimation results

Table 1 presents earnings equations for 1985 while Table 2 presents findings for 1989. For both years, earnings were estimated using three different specifications with respect to locational controls: ignoring locational effects as in most previous studies, controlling for locational effects at the SMSA level, and controlling for locational effects at the neighborhood level.¹³ In addition, for each year, each of these models was estimated without controls for sample selection related to labor force participation (Tables 1a and 2a), and then again using Heckman two-step

¹¹ We should emphasize that the AHS follows housing units over time, not households. Among labor force participants, roughly 43% of the 1985 homes were occupied by different households in 1989. Hence, 57% of the 1989 sample is based on the same set of households as in 1985.

¹² Separate age terms for various categories of workers under age 50 were also tested. Results were robust to those transformations of the age variable.

¹³ Fixed effects for the SMSA and Neighborhood models are not presented to conserve space.


Table 1ᵃ
(a) 1985 earnings regressions: OLS models. (b) 1985 earnings regressions: selection models

                Ignore location     SMSA location       Neigh location
                Coeff     t-ratio   Coeff     t-ratio   Coeff     t-ratio

(a)
Constant        9.2958    225.4     –         –         –         –
Edgrd           0.3625    17.306    0.3346    15.599    0.2327    10.048
Edcl            0.2638    13.013    0.2342    11.456    0.1289    6.04
Edltcl          0.0878    4.577     0.0737    3.818     0.0366    1.881
Edlths          -0.2007   -8.654    -0.1922   -8.174    -0.1079   -4.337
Black           -0.1203   -5.255    -0.1621   -6.584    0.0100    0.269
Asian           -0.1311   -2.956    -0.1114   -2.37     -0.1067   -2.129
Age50           0.0130    12.062    0.0122    11.203    0.0090    7.775
Age5060         0.0103    13.21     0.0096    12.193    0.0065    7.658
Age60           0.0093    13.07     0.0089    12.212    0.0062    7.961
Male            0.2150    10.58     0.2123    10.382    0.2210    10.658
Marr            0.1132    6.077     0.1088    5.821     0.0294    1.495
Child           -0.0055   -0.794    -0.0063   -0.893    -0.0053   -0.733

Num Obs.        3173                3173                3173
Num Locations   –                   102                 612
Residual SS     469.16              439.41              301.09
R-squared       0.281               0.326               0.538
Std Error       0.385               0.379               0.344

(b)
Constant        9.2490    29.49     –         –         –         –
Edgrd           0.3731    5.142     0.3853    3.544     0.2370    6.228
Edcl            0.2706    5.291     0.2660    3.631     0.1320    4.609
Edltcl          0.0910    2.775     0.0893    2.045     0.0381    1.769
Edlths          -0.2170   -2.153    -0.2719   -1.856    -0.1150   -2.176
Black           -0.1280   -2.206    -0.2008   -2.38     0.0063    0.146
Asian           -0.1340   -2.146    -0.1245   -1.709    -0.1070   -2.131
Age50           0.0132    6.312     0.0130    4.547     0.0091    7.237
Age5060         0.0103    8.51      0.0098    6.493     0.0066    7.596
Age60           0.0085    2.035     0.0054    0.877     0.0059    2.726
Male            0.2308    2.325     0.2873    1.961     0.2280    4.721
Marr            0.1199    2.558     0.1403    2.142     0.0325    1.201
Child           -0.0070   -0.49     -0.0164   -0.747    -0.0063   -0.676
Mills           0.0567    0.154     0.2689    0.481     0.0266    0.169

Num Obs.        3173                3173                3173
Num Locations   –                   102                 612
Residual SS     469.13              438.93              301.08
R-squared       0.281               0.327               0.538
Std Error       0.385               0.38                0.344

ᵃ T-ratios for the Ignore Location and SMSA models were corrected for selection effects as described in Maddala (1983), page 252. Uncorrected (OLS) t-ratios are presented for the Neighborhood model because the large number of fixed effects made it cumbersome to correct for preestimation error: those t-ratios are upward biased for reasons outlined in Maddala (1983).


Table 2ᵃ
(a) 1989 earnings regressions: OLS models. (b) 1989 earnings regressions: selection models

                Ignore location     SMSA location       Neigh location
                Coeff     t-ratio   Coeff     t-ratio   Coeff     t-ratio

(a)
Constant        9.4384    215.644   –         –         –         –
Edgrd           0.3840    17.456    0.3497    15.832    0.2367    9.992
Edcl            0.2879    14.07     0.2566    12.625    0.1391    6.477
Edltcl          0.0716    3.646     0.0668    3.444     0.0217    1.097
Edlths          -0.2689   -10.724   -0.2513   -10.054   -0.1544   -5.662
Black           -0.0800   -3.512    -0.1266   -5.313    0.0242    0.711
Asian           -0.0880   -2.161    -0.1219   -2.944    -0.1306   -2.967
Age50           0.0130    11.676    0.0125    11.391    0.0084    7.317
Age5060         0.0106    12.823    0.0101    12.4      0.0065    7.433
Age60           0.0072    9.607     0.0067    9.012     0.0031    3.911
Male            0.2099    10.766    0.2119    10.985    0.2110    10.64
Marr            0.0933    5.231     0.0881    4.941     0.0385    2.079
Child           -0.0102   -1.552    -0.0091   -1.376    -0.0099   -1.408

Num Obs.        3239                3239                3239
Num Locations   –                   101                 611
Residual SS     524.15              475.76              331.6
R-squared       0.281               0.347               0.545
Std Error       0.403               0.39                0.356

(b)
Constant        9.7501    29.397    –         –         –         –
Edgrd           0.3058    4.059     0.2881    2.961     0.1765    5.055
Edcl            0.1982    2.287     0.1885    1.747     0.0727    2.036
Edltcl          0.0302    0.602     0.0349    0.61      -0.0106   -0.439
Edlths          -0.1588   -1.766    -0.1681   -1.519    -0.0736   -1.665
Black           -0.0362   -0.617    -0.0895   -1.334    0.0576    1.564
Asian           -0.0896   -1.335    -0.1276   -2.126    -0.1362   -3.093
Age50           0.0120    4.197     0.0117    4.082     0.0076    6.315
Age5060         0.0109    7.145     0.0104    7.711     0.0067    7.601
Age60           0.0136    3.104     0.0117    1.874     0.0079    3.592
Male            0.1061    1.134     0.1325    1.12      0.1355    3.609
Marr            0.0474    1.05      0.0535    1.018     0.0043    0.184
Child           0.0058    0.319     0.0031    0.148     0.0008    0.094
Mills           -0.4403   -1.056    -0.3396   -0.637    -0.3258   -2.35

Num Obs.        3239                3239                3239
Num Locations   –                   101                 611
Residual SS     520.74              474.53              331.00
R-squared       0.286               0.349               0.546
Std Error       0.402               0.398               0.356

ᵃ T-ratios for the Ignore Location and SMSA models were corrected for selection effects as described in Maddala (1983), page 252. Uncorrected (OLS) t-ratios are presented for the Neighborhood model because the large number of fixed effects made it cumbersome to correct for preestimation error: those t-ratios are upward biased for reasons outlined in Maddala (1983).


methods to control for sample selection (Tables 1b and 2b).¹⁴ In both cases, because our theory applies primarily to full-time workers but our data do not include hours worked, all of the earnings regressions were estimated for individuals with annual earnings greater than $10 000 in order to reduce the presence of part time workers.¹⁵

A quick review of Tables 1b and 2b indicates that the Mills ratio term is negative and generally insignificant for the 1989 models and small, positive, and insignificant for the 1985 models.¹⁶ Nevertheless, we retain the selectivity corrected results because coefficient estimates for the remaining variables in the model differ somewhat when the Mills ratio terms are omitted, as is apparent from a comparison of Tables 1a to 1b and 2a to 2b.¹⁷ This pattern of results may reflect the absence of exclusion restrictions which makes identification of the Mills ratio term difficult.¹⁸ Regardless, as will become apparent, the principal results below are robust to controls for labor force participation, especially as regards the tradeoff between omitted variable and simultaneity bias.

Turning now to the other variables in the model, estimates from the Ignore Location models in Tables 1 and 2 are broadly comparable to those of existing

¹⁴ The selection models for the Ignore Location and SMSA Location specifications were estimated using the Stata Heckman2 routine available at the Stata website, www.stata.com. That routine corrects the standard errors for preestimation of the Mills ratio term as described by Maddala (1983), page 252. Correcting the standard errors in the Neighborhood model, however, is cumbersome because of the large number of fixed effects. For the neighborhood model, therefore, uncorrected OLS t-ratios are presented and are upward biased [see Maddala (1983)]. As will become apparent, that bias does not affect the qualitative nature of the results.

¹⁵ We also estimated the earnings regressions using only workers with earnings greater than $1000. The qualitative nature of the main results to be described below was robust to that change. In addition, results from the first stage work/don’t work probit model (based on the $10 000 definition of working) are presented in Appendix B for both years.

¹⁶ The Mills ratio coefficient equals the covariance between the error terms from the first and second stage equations [e.g. Maddala (1983)]. Thus, a negative Mills ratio coefficient, as in 1989, implies that individuals with lower than average wage offers are more likely to work, a seemingly counterintuitive result. However, assuming that the unobserved determinants of $w_o$ and $w_r$ (the wage offer and reservation wage, respectively) are positively correlated, one can show that the coefficient on the Mills ratio will be negative if the variance of $w_o$ is less than the covariance of $w_o$ and $w_r$, as would occur if $w_r = A w_o$, with $A > 1$, for example. Under those conditions, individuals with below average wage offers ($w_o < 0$) have even lower reservation wages ($w_r < w_o$) and are more likely to work. Conversely, with $w_o$ and $w_r$ positively correlated, the coefficient on the Mills ratio will be positive if the variance of $w_o$ is greater than the covariance of $w_o$ and $w_r$.

¹⁷ See for example the coefficients on the education variables.

¹⁸ Individuals work if $w_o > w_r$, where $w_o$ is the employer’s wage offer and $w_r$ is the worker’s reservation wage. The observed wage, in contrast, is given by $w_o$. Given that the determinants of $w_r$ likely also affect the individual’s skill level and, therefore, $w_o$, all of the probit model slope regressors are included in the earnings regressions. As such, identification of the Mills ratio term relies on the nonlinearity of the Mills ratio.
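The sign condition described in footnote 16 can be checked with a small simulation. The sketch below is illustrative and assumes that the unobserved parts of the wage offer and reservation wage are mean zero deviations, that the reservation-wage shock equals A times the offer shock plus independent noise, and that observed determinants are held equal so only the shocks matter. The function name and parameter values are hypothetical.

```python
import random


def mean_offer_shock_among_workers(A, n=100000, noise_sd=0.5, seed=2):
    """Average unobserved wage-offer component among those who choose to work."""
    rng = random.Random(seed)
    total, count = 0.0, 0
    for _ in range(n):
        u_o = rng.gauss(0.0, 1.0)                 # unobserved part of w_o
        u_r = A * u_o + rng.gauss(0.0, noise_sd)  # unobserved part of w_r
        if u_o > u_r:                             # individual works if w_o > w_r
            total += u_o
            count += 1
    return total / count
```

Here Cov(w_o, w_r) equals A times Var(w_o). With `A = 1.5` the covariance exceeds the variance and workers are selected on below average offers (a negative Mills coefficient); with `A = 0.5` the selection runs the other way.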


studies. For example, controlling for sample selection, the Ignore Location model in Table 1b indicates that workers with a college degree earn 27 percent more than workers with a high school degree, ceteris paribus. In contrast, estimates from a variety of studies [see, for example, Reimers (1983), Montgomery and Wascher (1987) and O’Neill (1990)] suggest that each additional year of education beyond high school adds 4 to 6 percent to a worker’s wage, or the wage premium for a four year college education is 16 to 24 percent.¹⁹ Similarly, the estimated coefficients on BLACK in the selection controlled Ignore Location models suggest a black–white earnings differential in the range of 4 to 12 percent. By comparison, in a review of estimated earnings disparities, Ehrenberg and Smith (1985) report that black workers earn 5 to 8 percent less than comparably skilled white workers.²⁰ Finally, estimates from our selection controlled Ignore Location models suggest that in 1985, male workers earned 23 percent more than women. Those results coincide with findings of Ehrenberg and Smith (1985), who report a male–female earnings differential of approximately 20 percent. Given the similarity between our Ignore Location models and previous wage and earnings studies, results from our fixed effects models reported below likely generalize to the large number of earnings and wage studies in the literature.²¹

A review of Tables 1 and 2 reveals that the Neighborhood and Ignore Location models yield markedly different estimates of the coefficients on education and race regardless of controls for sample selection. In 1985, for example, the estimated premium associated with a graduate degree (EDGRAD) in the selection controlled neighborhood model is 14 percentage points lower than when location is ignored, and 12 percentage points lower in 1989. Similarly, the estimated earnings deficit of blacks in the neighborhood model is reversed in sign relative to the constrained model that ignores location. As argued in Section 3, however, controlling for location with neighborhood fixed effects likely causes the estimated returns to education to be downward biased because households sort themselves across neighborhoods on the basis of income. A similar argument suggests that the

[19] Additional estimates of the returns to education are provided by Kosters (1990) and Murphy and Welch (1992), who estimate somewhat larger premiums for a college degree than those above using CPS data from the mid-1980s. Other studies on the returns to education frequently use data from the National Survey of Youth, and the Survey of Income and Education.

[20] Using microdata from the 1987 National Survey of Youth and controlling for human capital, work experience, and a large number of other demographic effects, O’Neill (1990) estimates black–white wage differentials in the range of 2 to 10 percent among men aged 22 to 29 years. Blau and Beller (1992), using CPS data, calculate an earnings differential among black and white males of 15 percent for 1988; in the case of females, the racial disparity in earnings declines to 6 to 9 percent.

[21] Previous versions of this paper also analyzed the distribution of earnings based on the Lorenz curve. When locational effects were ignored, results based on the 1985 AHS were virtually identical to published Lorenz curves based on 1986 CPS data (e.g. Bishop et al., 1991).


neighborhood model may underestimate the black earnings deficit as well.[22] Thus, findings from the neighborhood model appear to be consistent with the presence of simultaneity bias, although the neighborhood model avoids omitted variable bias by controlling for all relevant locational attributes.
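The simultaneity problem attributed to the neighborhood model can be illustrated with simulated data. The following is a minimal sketch, assuming a single binary trait, no true locational wage effects, and neighborhoods formed purely by ranking workers on earnings; the sample size, the neighborhood size, and the coefficient value are hypothetical and are not taken from the paper's estimates. Because residents of an income-sorted neighborhood have nearly identical earnings, differencing off neighborhood means removes most of the earnings variation that the trait explains.

```python
import numpy as np

def within_slope(y, d, group):
    """Slope from regressing group-demeaned y on group-demeaned d."""
    counts = np.bincount(group)
    y_dm = y - (np.bincount(group, weights=y) / counts)[group]
    d_dm = d - (np.bincount(group, weights=d) / counts)[group]
    return float(d_dm @ y_dm / (d_dm @ d_dm))

rng = np.random.default_rng(0)
n, hood_size, beta = 10_000, 20, 0.30               # hypothetical values
d = (rng.random(n) < 0.3).astype(float)             # binary trait, e.g. a degree
y = beta * d + 0.5 * rng.standard_normal(n)         # log earnings, no location effect

# Neighborhoods sorted on income: rank workers by y, cut into blocks of 20.
rank = np.argsort(np.argsort(y))
hood_sorted = rank // hood_size
# Benchmark: neighborhoods assigned at random.
hood_random = rng.permutation(n) // hood_size

b_pooled = within_slope(y, d, np.zeros(n, dtype=int))   # OLS with a constant only
b_random = within_slope(y, d, hood_random)               # fixed effects, random groups
b_sorted = within_slope(y, d, hood_sorted)               # fixed effects, sorted groups

print(f"true {beta:.2f}, pooled {b_pooled:.3f}, "
      f"random hoods {b_random:.3f}, income-sorted hoods {b_sorted:.3f}")
```

In this sketch the pooled and random-neighborhood estimates should sit near the assumed coefficient, while the income-sorted fixed effects estimate is pulled strongly toward zero.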

For reasons outlined in Section 3, however, the SMSA model likely avoids both omitted variable and simultaneity bias, at least as related to locational effects. Comparing the Ignore Location to the SMSA model (controlling for sample selection), findings from 1985 and 1989 suggest that failing to control for location leads to downward biased estimates of the black earnings deficit of roughly 5 to 7 percentage points, while the male earnings premium is downward biased by 3 to 6 percentage points. These findings suggest that black workers and male workers tend to live in cities that are expensive relative to the locational amenities in those labor markets. As a result, failing to control for SMSA location causes one to underestimate race and gender effects because the extra compensation black and male workers receive for working in less attractive and/or more expensive cities is overlooked.[23]

6. Conclusions

With mobile workers and competitive markets, equilibrium nominal wage rates vary across geographically distinct labor markets, rising with the local cost of living but falling with the value of the local amenities. Earnings and wage regressions that ignore such effects may suffer from omitted variable bias because education and demographic attributes affect both worker skill levels and worker choice among geographically distinct labor markets. We show that fixed effects methods can be used to control for unobserved locational attributes provided two conditions hold: the geographic scope of the fixed effects must be at least as narrow as the workers’ underlying labor markets, but not so narrow as to introduce

[22] Suppose that higher income whites prefer to live in white as opposed to integrated neighborhoods, whereas higher income blacks prefer to live in integrated as opposed to black neighborhoods. Then among white workers, residents of white neighborhoods will tend to be of higher income than residents of integrated neighborhoods, whereas among black workers, residents of integrated neighborhoods will tend to be of higher income than residents of black neighborhoods. Controlling for neighborhood effects and worker attributes, the estimated black earnings deficit will be downward biased because it is based on integrated neighborhoods populated by blacks with above average earnings but whites with below average earnings.

[23] Relative to the SMSA model, for 1985 ignoring location also downward biases the effect of being married and of having less than a high school degree. Apart from these effects and those reported above, results from the Ignore Location and SMSA models were similar.


simultaneity problems arising from the endogenous stratification of workers across neighborhoods on the basis of income.

These issues are explored using a unique data base from the 1985–1989 American Housing Survey (AHS) that permits us to group workers both at the within-SMSA neighborhood level and at the SMSA level. Results favor the SMSA model and suggest that inclusion of SMSA fixed effects largely controls for unobserved locational attributes without introducing simultaneity problems. Relative to the SMSA model, failing to control for location downward biases estimates of the black earnings deficit by roughly 6 percentage points and downward biases estimates of the returns to males by 3 to 6 percentage points. These findings suggest that black workers and male workers tend to live in cities that are expensive relative to the locational amenities in those labor markets. Thus, failing to control for SMSA location reduces estimated earnings disparities on the basis of race and gender because the additional compensation black workers and male workers receive for working in less attractive and/or more expensive cities is not taken into account. More generally, our findings suggest that wage and earnings studies that seek to obtain consistent estimates of demographic effects can control for possible omitted locational attributes by including SMSA-level dummy variables in the regression, an approach that has modest data requirements and is easily implemented.
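The mechanism summarized above can be sketched with simulated data. This is a minimal illustration, assuming 50 hypothetical cities whose wage premiums reflect cost of living net of amenities, a binary demographic trait that is more common in high-premium cities, and a true earnings deficit of 10 log points for that trait. None of these numbers come from the AHS estimates; they are chosen only to show the direction of the bias and how city-level dummy variables (implemented here by differencing off city means) remove it.

```python
import numpy as np

rng = np.random.default_rng(1)
n_cities, n = 50, 20_000                      # hypothetical sizes
beta = -0.10                                  # assumed true deficit (log points)

premium = rng.normal(0.0, 0.15, n_cities)     # city wage premium, unobserved
city = np.arange(n) % n_cities                # 400 workers per city
# The trait is more common in cities with a high wage premium.
p_trait = 1.0 / (1.0 + np.exp(-(-1.5 + 3.0 * premium[city])))
d = (rng.random(n) < p_trait).astype(float)
y = beta * d + premium[city] + 0.4 * rng.standard_normal(n)   # log earnings

def slope(y, d):
    """OLS slope of y on d with an intercept."""
    d_c, y_c = d - d.mean(), y - y.mean()
    return float(d_c @ y_c / (d_c @ d_c))

def city_demean(x):
    """Subtract the city mean, equivalent to including city dummies."""
    return x - (np.bincount(city, weights=x) / np.bincount(city))[city]

b_ignore = slope(y, d)                              # ignores location
b_city_fe = slope(city_demean(y), city_demean(d))   # city fixed effects

print(f"true {beta:.3f}, ignore location {b_ignore:.3f}, city fixed effects {b_city_fe:.3f}")
```

When location is ignored, the estimated deficit is closer to zero than the assumed value because the city premium is attributed to the trait; the fixed effects estimate should sit near the assumed value.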

Acknowledgements

We thank John Engberg, Jeff Groger, David Green, and anonymous referees for helpful comments. Gabriel acknowledges financial support from the Faculty Research Fund of the USC Marshall School of Business. Rosenthal acknowledges financial support from the Social Science Research Council of Canada (SSHRC) and the Real Estate Institute of British Columbia.

Appendix A. Omitted variable versus simultaneity bias

To clarify the manner in which Eq. (3.3) may suffer from simultaneity bias, first rewrite (3.3) with the locational means differenced off from each variable in the model,

$$y^*_{ij} = d^*_{ij} b_d + e^*_{ij}, \tag{A.1}$$

where $e^*_{ij} = e_i - \bar{e}_{ij}$, $d^*_{ij} = d_i - \bar{d}_{ij}$, $y^*_{ij} = y_i - \bar{y}_{ij}$, and $y$ is in logs to further simplify the notation. As shown by Hsiao (1986) and others, OLS estimates of (3.3) and (A.1) yield identical estimates of $b_d$. The least squares estimate for $b_d$ can then be written as,

$$b_{OLS} = (d^{*\prime} d^*)^{-1} d^{*\prime} y^* = b_d + (d^{*\prime} d^*)^{-1} d^{*\prime} e^*, \tag{A.2}$$

where $d^*$, $y^*$, and $e^*$ are matrices with the subscripts suppressed to simplify notation. Clearly, for $b_{OLS}$ to be consistent, $d^{*\prime} e^*$ must go to zero as sample size gets large. Alternatively, since[24]

$$d^{*\prime} e^* = \frac{1}{n}\sum_{i=1}^{n} d_i e_i - \frac{1}{n}\sum_{i=1}^{n} \bar{d}_{ij}\bar{e}_{ij}, \tag{A.3}$$

and $d_i$ is assumed to be orthogonal to $e_i$, $b_{OLS}$ is consistent if,

$$-\frac{1}{n}\sum_{i=1}^{n} \bar{d}_{ij}\bar{e}_{ij} \to 0, \quad \text{as } n \to \infty. \tag{A.4}$$

Expression (A.4) says that fixed effect estimates of $b_d$ do not suffer from simultaneity bias provided the sample mean of $\bar{d}_{ij}\bar{e}_{ij}$ goes to zero as sample size becomes large. An example of the type of situation in which that condition is violated is provided in Section 3 of the text.
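Two steps of this appendix lend themselves to a quick numerical check: the equivalence of the dummy-variable regression and the mean-differenced regression (A.1), and the decomposition in (A.3). The following is a minimal sketch on simulated data with a single scalar regressor; the number of locations, the sample size, and the coefficient values are assumptions made for illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_loc, n = 8, 400                               # hypothetical sizes
loc = np.arange(n) % n_loc                      # location index, 50 workers each
d = rng.standard_normal(n) + 0.5 * loc          # regressor correlated with location
e = rng.standard_normal(n)                      # error term
y = 0.25 * d + 0.1 * loc + e                    # 0.25 plays the role of b_d

def group_mean(x, g):
    """Location mean of x, repeated for every worker in that location."""
    return (np.bincount(g, weights=x) / np.bincount(g))[g]

# Mean-differenced variables, as in (A.1).
d_star = d - group_mean(d, loc)
y_star = y - group_mean(y, loc)
e_star = e - group_mean(e, loc)
b_within = float(d_star @ y_star / (d_star @ d_star))

# Dummy-variable regression: d plus one dummy per location.
X = np.column_stack([d, np.eye(n_loc)[loc]])
b_dummy = float(np.linalg.lstsq(X, y, rcond=None)[0][0])

# Decomposition (A.3): mean of d*e* equals mean of d_i e_i minus
# the mean of the product of the location means.
lhs = float(np.mean(d_star * e_star))
rhs = float(np.mean(d * e) - np.mean(group_mean(d, loc) * group_mean(e, loc)))

print(f"within {b_within:.6f}, dummy-variable {b_dummy:.6f}")
print(f"(A.3) left side {lhs:.6f}, right side {rhs:.6f}")
```

Both comparisons should agree to numerical precision, since they are algebraic identities rather than statistical approximations.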

Appendix B. Supplemental tables

Table B-1. Selected summary statistics

Sample sizes: 1985 Probit sample, 6088; 1985 OLS sample, 3173; 1989 Probit sample, 5933; 1989 OLS sample, 3854.

| Variable | 1985 Probit Mean | 1985 Probit Std dev | 1985 OLS Mean | 1985 OLS Std dev | 1989 Probit Mean | 1989 Probit Std dev | 1989 OLS Mean | 1989 OLS Std dev |
| EARN > $10 000 | 0.5212 | 0.4996 | 1.0000 | 0.0000 | 0.5459 | 0.4979 | 1.0000 | 0.0000 |
| LogEARN | 6.4720 | 4.6557 | 10.1060 | 0.4535 | 6.5214 | 4.7518 | 10.2200 | 0.4745 |
| EDGRAD | 0.1107 | 0.3138 | 0.1582 | 0.3650 | 0.1165 | 0.3208 | 0.1541 | 0.3611 |
| EDCOL | 0.1291 | 0.3353 | 0.1733 | 0.3786 | 0.1417 | 0.3488 | 0.1933 | 0.3949 |
| EDSMCOL | 0.1715 | 0.3770 | 0.2039 | 0.4030 | 0.1839 | 0.3874 | 0.2170 | 0.4123 |
| EDLTHS | 0.2502 | 0.4331 | 0.1204 | 0.3255 | 0.2161 | 0.4116 | 0.1081 | 0.3105 |
| BLACK | 0.1330 | 0.3397 | 0.1018 | 0.3024 | 0.1347 | 0.3414 | 0.1127 | 0.3163 |
| ASIAN | 0.0218 | 0.0146 | 0.0246 | 0.1549 | 0.0270 | 0.1620 | 0.0315 | 0.1747 |

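[24] To obtain (A.3) note that,

$$d^{*\prime} e^* = \frac{1}{n}\sum_{i=1}^{n} d^*_{ij} e^*_{ij} = \frac{1}{n}\sum_{i=1}^{n} (d_i - \bar{d}_{ij})(e_i - \bar{e}_{ij})$$

$$= \frac{1}{n}\sum_{i=1}^{n} d_i e_i - \frac{1}{n}\sum_{j} n_j \frac{1}{n_j}\sum_{i=1}^{n_j} \left(d_i \bar{e}_{ij} + \bar{d}_{ij} e_i - \bar{d}_{ij}\bar{e}_{ij}\right) = \frac{1}{n}\sum_{i=1}^{n} d_i e_i - \frac{1}{n}\sum_{i=1}^{n} \bar{d}_{ij}\bar{e}_{ij}.$$

The last equality holds because, within each location $j$, each of the three terms in parentheses sums to $n_j \bar{d}_{ij}\bar{e}_{ij}$.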


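Table B-1. Continued

| Variable | 1985 Probit Mean | 1985 Probit Std dev | 1985 OLS Mean | 1985 OLS Std dev | 1989 Probit Mean | 1989 Probit Std dev | 1989 OLS Mean | 1989 OLS Std dev |
| AGE50 | 19.226 | 18.230 | 25.889 | 16.894 | 19.239 | 18.513 | 26.742 | 16.980 |
| AGE5060 | 7.5915 | 18.940 | 9.6571 | 20.853 | 7.4690 | 18.750 | 9.3032 | 20.475 |
| AGE60 | 21.711 | 32.818 | 5.7800 | 18.343 | 22.685 | 33.399 | 5.4992 | 17.995 |
| MALE | 0.6590 | 0.4741 | 0.7835 | 0.4119 | 0.6525 | 0.4762 | 0.7586 | 0.4280 |
| MARR | 0.5526 | 0.4973 | 0.6543 | 0.4757 | 0.5228 | 0.4995 | 0.6045 | 0.4890 |
| CHILD | 0.6853 | 1.1000 | 0.8295 | 1.1167 | 0.6759 | 1.1588 | 0.8290 | 1.1837 |
| MIDWEST | 0.2224 | 0.4159 | 0.2209 | 0.4149 | 0.2233 | 0.4165 | 0.2223 | 0.4158 |
| SOUTH | 0.3137 | 0.4640 | 0.2991 | 0.4579 | 0.3073 | 0.4614 | 0.2859 | 0.4519 |
| WEST | 0.2385 | 0.4262 | 0.2465 | 0.4310 | 0.2434 | 0.4292 | 0.2470 | 0.4313 |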

Table B-2. Probit model of earnings greater than $10 000. (Dependent variable equals 1 if earnings greater than $10 000.)

| Variable | 1985 Coeff | 1985 t-ratio | 1989 Coeff | 1989 t-ratio |
| CONSTANT | 0.0176 | 0.15 | 0.2910 | 2.34 |
| EDGRAD | 0.4258 | 6.50 | 0.4227 | 6.41 |
| EDCOL | 0.2589 | 4.32 | 0.4759 | 7.61 |
| EDSMCOL | 0.1203 | 2.26 | 0.2136 | 3.92 |
| EDLTHS | -0.4936 | -9.60 | -0.4380 | -7.91 |
| BLACK | -0.2573 | -4.51 | -0.2135 | -3.65 |
| ASIAN | -0.1080 | -0.87 | 0.0444 | 0.37 |
| AGE50 | -0.0065 | -2.23 | 0.0065 | 2.08 |
| AGE5060 | -0.0017 | -0.82 | -0.0007 | -0.33 |
| AGE60 | -0.0019 | -11.85 | -0.0230 | -13.14 |
| MALE | 0.5032 | 10.33 | 0.4608 | 9.40 |
| MARR | 0.2253 | 4.69 | 0.2107 | 4.37 |
| CHILD | -0.0723 | -3.81 | -0.0693 | -3.86 |
| MIDWEST | -0.0570 | -1.02 | -0.2129 | -3.62 |
| SOUTH | -0.1463 | -2.82 | -0.2748 | -5.03 |
| WEST | -0.1340 | -2.44 | -0.3310 | -5.77 |
| Sample size | 6088 | | 5933 | |
| Log-L | -2400.78 | | -2718.05 | |
| Percent Correctly Predicted (a) | 77.69 | | 79.86 | |

(a) Predicted values are determined by the outcome with the higher probability.
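To show how the probit coefficients feed into the selection correction, the following minimal sketch evaluates the 1985 probit index at the 1985 probit-sample means from Table B-1 and computes the implied probability and inverse Mills ratio. The coefficient and mean values are copied as printed in the tables; evaluating at sample means is an illustrative simplification (the paper computes the Mills ratio observation by observation), and the variable names are assumptions.

```python
from math import erf, exp, pi, sqrt

# 1985 probit coefficients as printed in Table B-2.
coef_1985 = {
    "EDGRAD": 0.4258, "EDCOL": 0.2589, "EDSMCOL": 0.1203, "EDLTHS": -0.4936,
    "BLACK": -0.2573, "ASIAN": -0.1080, "AGE50": -0.0065, "AGE5060": -0.0017,
    "AGE60": -0.0019, "MALE": 0.5032, "MARR": 0.2253, "CHILD": -0.0723,
    "MIDWEST": -0.0570, "SOUTH": -0.1463, "WEST": -0.1340,
}
constant_1985 = 0.0176

# 1985 probit-sample means as printed in Table B-1.
mean_1985 = {
    "EDGRAD": 0.1107, "EDCOL": 0.1291, "EDSMCOL": 0.1715, "EDLTHS": 0.2502,
    "BLACK": 0.1330, "ASIAN": 0.0218, "AGE50": 19.226, "AGE5060": 7.5915,
    "AGE60": 21.711, "MALE": 0.6590, "MARR": 0.5526, "CHILD": 0.6853,
    "MIDWEST": 0.2224, "SOUTH": 0.3137, "WEST": 0.2385,
}

# Probit index at the sample means.
index = constant_1985 + sum(coef_1985[k] * mean_1985[k] for k in coef_1985)

# Standard normal CDF and density at the index.
prob = 0.5 * (1.0 + erf(index / sqrt(2.0)))
density = exp(-0.5 * index ** 2) / sqrt(2.0 * pi)

# Inverse Mills ratio for a selected (earning above $10 000) observation.
inverse_mills = density / prob

print(f"index {index:.4f}, probability {prob:.4f}, inverse Mills ratio {inverse_mills:.4f}")
```

The probability at the means need not equal the sample share of workers with earnings above $10 000, because the normal CDF is nonlinear in the index.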

References

Beeson, P., Eberts, R., 1989. Identifying productivity and amenity effects in interurban wage differentials. Review of Economics and Statistics, 443–461.

Bishop, J., Formby, J., Smith, J., 1991. Lorenz dominance and welfare: Changes in the U.S. distribution of income, 1967–1986. Review of Economics and Statistics LXXIII, 134–139.

Blau, F., Beller, A., 1992. Black–white earnings over the 1970s and 1980s: Gender differences in trends. Review of Economics and Statistics LXXIV, 276–286.


Blomquist, G., Berger, M., Hoehn, J., 1988. New estimates of the quality of life in urban areas. American Economic Review 78, 89–107.

Ehrenberg, R., Smith, R., 1985. Modern Labor Economics, Scott, Foresman, Glenview, IL.

Epple, D., Romer, T., 1991. Mobility and redistribution. Journal of Political Economy 99, 828–858.

Gabriel, S., Rosenthal, S., 1996. Commute times, neighborhood effects, and earnings: An analysis of compensating differentials and racial discrimination. Journal of Urban Economics 40, 61–83.

Gyourko, J., Tracy, J., 1989. Local public rent-seeking and its impact on local land values. Regional Science and Urban Economics 19.

Gyourko, J., Tracy, J., 1991. The structure of local public finance and the quality of life. Journal of Political Economy 99, 775–806.

Hamilton, B.W., 1976. Capitalization of interjurisdictional differences in local tax prices. American Economic Review 66, 743–753.

Henderson, V.J., 1982. Evaluating consumer amenities and interregional welfare differences. Journal of Urban Economics 11, 32–59.

Hsiao, C., 1986. Analysis of Panel Data, Cambridge University Press, New York.

Ihlanfeldt, K.R., 1992. Intraurban wage gradients: evidence by race, gender, occupational class, and sector. Journal of Urban Economics 32, 70–91.

Katz, L., Murphy, K., 1992. Changes in relative wages, 1963–1987: Supply and demand factors. Quarterly Journal of Economics, 35–78.

Kosters, M., 1990. Schooling, work experience, and wage trends. American economics association papers and proceedings 80, 308–312.

Maddala, G.S., 1983. Limited Dependent and Qualitative Variables in Econometrics, Cambridge University Press, New York NY.

Mills, E., Hamilton, B., 1989. Urban Economics, 4th edn, Scott, Foresman, Boston.

Montgomery, E., Wascher, W., 1987. Race and gender wage inequality in services and manufacturing. Industrial Relations 26, 284–290.

Murphy, K., Welch, F., 1992. The structure of wages. Quarterly Journal of Economics, 285–326.

O’Neill, J., 1990. The role of human capital in earnings differences between black and white men. Journal of Economic Perspectives 4, 25–45.

Reimers, C., 1983. Labor market discrimination against hispanic and black men. Review of Economics and Statistics, 570–579.

Roback, J., 1982. Wages, rents, and the quality of life. Journal of Political Economy, 90.

Roback, J., 1988. Wages, rents, and amenities: Differences among workers and regions. Economic Inquiry 26.

Rosen, S., 1986. The theory of equalizing differences. In: Ashenfelter, O.C., Layard, R. (Eds.), Handbook of Labor Economics, Vol. 1, Elsevier Science, Amsterdam, pp. 641–692, Chapter 12.

Tiebout, C., 1956. A pure theory of local public expenditures. Journal of Political Economy 64, 416–424.

