(Re)Modeling Race: A Latent Variable Approach for Research on Racial Inequality

Aliya Saperstein, University of California, Berkeley

Abstract

Debates about whether race should be included in survey research and government data gathering have raged in the United States and around the world over the last decade. Researchers in the United States have retained their ability to use such data to monitor racial inequality, but without addressing the substantial gap between social science theory about race and racial inequality and actual research practices. In this paper, I propose a new approach to analyzing race in survey research that uses multiple measures of race to better capture the fluidity, complexity and negotiation stressed in sociological theories of race. I demonstrate that incorporating multiple measures of race reveals more complex patterns of advantage and disadvantage in studies of U.S. racial inequality than can be seen using standard methods. Further, I find that different measures of race better explain inequalities in different domains (e.g., health care or socioeconomic status). This suggests that race is not a monolithic axis of difference in American society, but a context-specific one – both in its effects and the mechanisms through which inequalities are perpetuated.

Paper prepared for the conference on Social Statistics and Ethnic Diversity held in Montreal, December 2007. Please do not quote or cite this draft without permission. Many thanks go to Michael Hout, Claude Fischer, Michael Omi, Sandra Smith and Amani Nuru-Jeter for their helpful comments and suggestions. Direct correspondence to Aliya Saperstein, University of California-Berkeley, Graduate Group in Sociology and Demography, 2232 Piedmont Avenue, Berkeley, CA 94720-2120. E-mail: [email protected].


Debates about whether race should be included in survey research and government data

gathering have raged in the United States and around the world over the last decade. The

consensus position in the American academy of ‘Yes, if it is used carefully and for good reason’

retains researchers’ ability to monitor racial inequality in the United States but has failed to

address the substantial gap between social science theory about race and racial inequality and

actual research practices. I propose a new approach to analyzing race in survey research that uses

multiple measures of race to better capture the fluidity, complexity and negotiation stressed in

sociological theories of race.

Social science theories of race – and ethnicity – suggest that these are not intrinsic

characteristics of individuals but multidimensional constructs comprised of identities and

classifications that can change over time and across contexts. Thus, I argue that these concepts

cannot be completely captured by a single survey question posed to a single respondent. If we

believe that race and ethnicity are markers of status that get deployed in interactions between

(and among) individuals, institutions and the state – as several theories of race, ethnicity and

inequality suggest – then research on racial inequality should include at least two measures of

race: one for each party in the interaction. So, for example, research on racial disparities in health

care should take into account both how respondents identify and how they are perceived by their

physicians or other health care personnel.

In this paper, I demonstrate that incorporating multiple measures of race in studies of

racial inequality in the United States reveals more complex patterns of advantage and

disadvantage than can be seen using standard methods.1 Drawing from previously unanalyzed

1 In general, I use the term “race” in this paper for the sake of simplicity and because it is the term used in the survey measures I analyze below. However, in my discussion of the theoretical and empirical literature, I refer to studies that analyze race, ethnicity or both. I acknowledge there are wide-ranging debates about distinctions between “race” and “ethnicity,” and that different terms are considered more appropriate in different countries. I link them here


data in the National Survey of Family Growth (NSFG) that includes both how the respondents

identified racially and how they were perceived by the survey interviewer, I show that perceived

race is more closely associated with receiving basic health screenings while self-identified race is

more closely related to the economic well-being of one’s family. This suggests that inequality is

perpetuated through different mechanisms in different domains of life, a point that not only

affects how we understand the relationship between race and inequality in the United States, but

also has important implications for both standard methods of data collection and policy

interventions.

Why multiple measures of race have meaning

The idea that race is “socially constructed” – that it is not an intrinsic characteristic of

individuals or a natural distinction among human groups – is not a new one. Yet, the connection

between social science theories of race and the practices of survey research remains tenuous at

best. The political nature of counting by race certainly has something to do with this conceptual

divide, as do the constraints of conducting censuses and surveys for large populations within

limited budgets (Skerry 2000, Nobles 2000, Anderson 1988). However, I argue that failing to

take social science theories of race seriously leaves survey researchers with the ability to

describe broad patterns of inequality without being able to explain how they are perpetuated and

thus to design effective policies to eliminate troubling disparities. By using a single measure of

race, researchers simply reify “common sense” notions of racial difference instead of exploring

the many ways that race comes to matter in people’s lives (cf. Wacquant 1997).

because I argue that, for studies of inequality, making an analytical distinction between the two terms is largely semantic (cf. Loveman 1999). Both “races” and “ethnicities” are maintained through processes of identification and ascription, and both have conditioned the distribution of societal resources in different places and eras. Moreover, many American survey respondents do not (or cannot) distinguish between the two concepts.


Numerous studies have noted that population counts, as well as estimates of racial

disparities in fertility, mortality and injury rates vary depending on whether the race of the

individual in question was measured by self-report or recorded by another party, such as a

relative, nurse or funeral director (e.g., Arias et al. 2007; Morgan et al. 1999; Sugarman et al.

1993; Hahn et al. 1992). However, these studies often assume that one or the other method of

measuring race is the “correct” one. They also rarely discuss whether different measures of race

may be related to not just quantitatively different estimates but substantively different

explanations of the observed disparities. For example, individual aspirations and behavior may

help explain some of the well known racial disparities in educational attainment, income and

health, but other factors such as how an individual is perceived by teachers, employers, and

doctors have also been shown to play a role (e.g., van Ryn et al. 2006; Ferguson 1998;

Kirschenman and Neckerman 1991).

In advocating a multiple-measure approach to race, I draw from theories of racial

inequality (e.g., Tilly 1998), stereotypes and perceptions (e.g., Greenwald et al. 1998), racial

classification (e.g., Omi and Winant 1994) and identity (e.g., Nagel 1994). Together they build a

case for thinking of race (and its effects) as a product of interactions, and thus for why considering

other people’s racial perceptions of an individual is as important as, if not more important than,

considering an individual’s self-identity alone.

These interactions can occur at both the micro and macro levels. For example, in Durable

Inequality, Tilly (1998) argues that race is a convenient categorical distinction used by actors in

institutions to distribute scarce resources and rewards. Similarly, work in cognitive psychology

on implicit attitudes (e.g., Greenwald et al. 1998) suggests that not only do people have

subconscious biases toward perceived out-group members, but these biases can affect their


behavior in mixed-group interactions. These scenarios fit typical definitions of discrimination,

which suggest that it is not how you think of yourself that determines your life outcomes, but

how others perceive you.

These individual or interactional level theories are also similar to Weber’s (1978) macro-

level claim regarding how status groups maintain power through exclusion or “social closure”. It

applies in the case of race (or ethnicity, the term Weber uses) because, as Weber explains, status

groups create ‘us’ and ‘them’ distinctions by highlighting supposedly socially significant

characteristics that are perceived to differ between groups. Over time, these characteristics can

come to be seen as biological, and therefore “natural” in origin through processes of social

closure such as strict prohibitions on intermarriage (Weber 1978: 385-99).

In the contemporary United States, Omi and Winant (1994) suggest that one of the key

sites of struggle for power – or against racialized exclusion – is between the state and social

movement organizations. In this struggle, racial categories or group boundaries are subject to

constant negotiation and rearticulation as groups seek more resources and the state (or rather the

group in control of the state) tries to limit access to the same. The implication of this cycle of

equilibrium and disequilibrium for survey research is that both official racial categories and the

identities that are perceived as possible for individuals to claim will change over time, and those

changes will be related to stratification outcomes.2

Back at the individual level, Nagel (1994) explicitly theorizes that there are two aspects

to any one person’s ethnic (or racial) identity. Everyone has an internal identity, but this identity

may not match the way other people perceive them (or what is perceived as appropriate for them

to claim). Nagel suggests that external perceptions trump internal identities, causing people to

2 This suggests that not only should multiple measures of race be included in standard survey research, but they should be measured in each wave of longitudinal surveys to capture the inevitable fluidity.


tailor the expression of their identity to social expectations (cf. Harris and Sim 2002). This

would explain why most of the time, for most Americans, multiple measures of race (or

ethnicity) provide consistent results. However, if some people do not feel compelled to alter their

expressed identities, measures of perceived and self-identified race will not match for those

individuals. To the extent that characteristics that predict inconsistent classifications also predict

economic stratification (e.g., education, immigrant generation, age, etc.), then different measures

of race will provide different – but complementary – descriptions of inequality in the United

States (Saperstein 2006).

Bridging theory and method: Estimating race as a latent variable

A latent variable is often used in social science research to capture the pattern of

responses to a series of survey questions that are all thought to measure a single concept, such as

“depression” or “socioeconomic status.” Similarly, I argue that the concept of “race” would be

better measured by taking into account information from multiple sources.

The primary assumption in latent class analysis is that unobserved variation among

individuals or the underlying structure of a population can be inferred from the pattern of

individuals’ responses to a given set of questions in a survey. The survey items are considered

manifest, or observable, indicators of the latent, or unobservable, characteristics. So, for

example, answering affirmatively to a question about “feeling blue” might be an indicator of the

unobservable construct “depression.” The goal of a latent class analysis is to end up with

meaningful groupings of individuals for whom there are multiple pieces of information that the

researcher thinks are related. The model estimates these latent “classes” as clusters of

observations with similar conditional probabilities of giving the same pattern of responses to the


chosen “indicators” (in this case, observed and self-reported race). To name the latent classes,

researchers often rely on the modal responses given by individuals assigned to that cluster (e.g.,

“perceived white, self-reported blacks”). The characteristics of these classes, including their size,

can then be compared to the racial categories typically used in quantitative research.3
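To make the estimation logic described above concrete, the following is a minimal sketch of a basic latent class model fit by expectation-maximization, assuming categorical indicators that are independent within each class. It is an illustration only and not the estimation routine used for the analyses in this paper; the function name, the argument names and the toy data are all hypothetical, and the number of classes is chosen arbitrarily.

```python
import numpy as np

def fit_latent_classes(X, n_levels, n_classes, n_iter=200, seed=0):
    """EM for a basic latent class model with categorical indicators.

    X         : (n, J) integer array, X[i, j] in {0, ..., n_levels[j] - 1}
    n_levels  : number of response categories for each indicator
    n_classes : number of latent classes to estimate (chosen by the analyst)
    Returns class shares, per-indicator conditional response probabilities,
    and each case's posterior probability of belonging to each class.
    """
    rng = np.random.default_rng(seed)
    n, J = X.shape
    shares = np.full(n_classes, 1.0 / n_classes)
    # cond[j][c, k] = P(indicator j takes value k | class c); random start
    cond = [rng.dirichlet(np.ones(k), size=n_classes) for k in n_levels]

    for _ in range(n_iter):
        # E-step: posterior class probabilities given each response pattern,
        # assuming indicators are independent within a class
        joint = np.tile(shares, (n, 1))
        for j in range(J):
            joint *= cond[j][:, X[:, j]].T
        post = joint / joint.sum(axis=1, keepdims=True)

        # M-step: update class shares and conditional response probabilities
        shares = post.mean(axis=0)
        for j in range(J):
            for k in range(n_levels[j]):
                cond[j][:, k] = post[X[:, j] == k].sum(axis=0)
            cond[j] /= cond[j].sum(axis=1, keepdims=True)
    return shares, cond, post

# Hypothetical toy data (NOT the NSFG): one 3-category observed-race item
# and three binary self-report items, generated at random.
rng = np.random.default_rng(1)
toy = np.column_stack([
    rng.integers(0, 3, size=500),
    rng.integers(0, 2, size=500),
    rng.integers(0, 2, size=500),
    rng.integers(0, 2, size=500),
])
shares, cond, post = fit_latent_classes(toy, n_levels=[3, 2, 2, 2], n_classes=3)
modal_class = post.argmax(axis=1)  # modal assignment of each case to a class
```

The estimated shares correspond to the sizes of the latent classes, and the rows of each conditional probability table are what one would inspect to name a class by its modal responses. Because the toy data are random, the resulting classes have no substantive meaning.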

In other work, I discuss how I use latent class analysis to estimate racial “classes” and

assess changes in aggregate-level racial variation in the United States from 1973-1988

(Saperstein 2007). Here, I simply want to note the potential that such an approach holds for the

study of racial formation and racial inequality. Standard methods of survey data collection often

include just two potential racial indicators, thereby restricting the formulation of more complex

latent structure models. However, the latent variable approach itself is a relatively flexible one

that allows researchers to determine the type and number of observed indicators, the scale of the

latent variable (i.e., categorical or continuous), the number of latent variables in the analysis, and

what each might represent (e.g., unmeasured genetic factors, relative propensities of racial

classification, or internal and therefore necessarily unspoken identities). I argue that by

incorporating multiple measures of race and conceptualizing race as a latent variable, survey

research can become a stronger tool for testing the claims of existing racial theories, bringing

new insight to the dynamics of persistent racial inequality and building new theories about the

relationship between race and inequality in the United States and around the world.

Nevertheless, since studying race using latent class analysis or other similar latent

structure models awaits survey data that includes many more measures of race than is typical in

current surveys, the purpose of my analyses below is a more modest one: to demonstrate why

one should incorporate multiple measures of race into research on inequality (regardless of

3 For a more formal statistical discussion of the latent class model and estimation techniques, see McCutcheon (1987). For a recent overview of the use of latent variables in the social sciences, see Bollen (2002).


which method one might choose to analyze the resulting data). I rely on my latent class analyses

here only to identify potentially important groupings of individuals based on their perceived and

self-identified race.4 For the most part, these groupings are easily visible in the cross-tabulation

of perceived and self-reported race as those cells that include the largest number of cases (see

Table 1 below).5

The Data: National Survey of Family Growth

The NSFG is based on in-person interviews with women aged 15-44 and is typically used

for studies of pregnancy, childbearing, contraception, and related aspects of maternal and infant

health.6 However, the survey also includes detailed background information about the

respondent and her husband (if relevant), such as education, religion, ethnic origin, occupation,

and earnings.7 Further, the first four cycles of the survey (1973, 1976, 1982 and 1988) include

the interviewer’s classification of the respondent’s race. Interestingly, none of the published

studies I found that use NSFG data from these years note that multiple measures of race exist in

4 For readers familiar with latent class analysis, what this means is that the categories I discuss below do not include all individuals estimated to have a high probability of exhibiting the specific racial responses, as would be the case if I used modal assignment to convert my latent class results to standard categorical independent variables; they are the actual individuals who responded as described. All the individuals not accounted for by the seven categories I examine are combined into a residual group that is included in my regression analyses solely as a control.

5 This raises the question of why one would use latent class analysis at all if the largest and therefore most important groupings can be spotted with the naked eye. Again, with more measures of race and more categories per indicator than I use here, the patterns of association would not be nearly as obvious.

6 For additional details on the survey, see the National Center for Health Statistics website (http://www.cdc.gov/nchs/products/elec_prods/subject/nsfg.htm) or the NSFG webpage from the Office of Population Research at Princeton University (http://opr.princeton.edu/archive/nsfg/).

7 The fact that the NSFG only samples women is a limitation, but I don’t expect it to affect my conclusions about whether using multiple measures is a useful way to study racial inequality. In previous work comparing observed and self-reported race, I did not find statistically significant gender differences in the probability of having an inconsistent racial classification (Saperstein 2006).


the survey, nor do the authors state explicitly which measure they use in their analyses.8 Here, I

present results only from the 1988 survey because the data on health screenings and family

income are of higher quality than in previous cycles.

Measures of race. In 1988, the NSFG coded the respondent’s race in two ways. First, the

interviewer made her observation of the respondent’s race, recording it as one of three

categories: Black, White or Other. Then, amidst a series of background questions, the respondent

was asked “Which of these groups best describe your racial background?”9 The category options

were: American Indian, Asian or Pacific Islander, Black and White. Respondents could choose

all four of the racial categories if they wished.

To preserve these multiple mentions, I recoded the respondents’ self-reports into three

binary variables: one for whether or not the respondent reported herself as “Black,” one for

whether or not she reported herself as “White,” and one for whether or not she reported herself as

either “American Indian” or “Asian or Pacific Islander.” Women who gave more than one racial

background response are coded as “yes” on more than one of these variables. I collapsed the

“American Indian” and “Asian” response categories into an aggregate self-reported “Other”

category, despite the loss of detail, to make the coding consistent with that for observed race.

Forty-eight respondents had missing data for observed race and 123 had missing data for self-

reported race; these cases were dropped from the analysis.
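As an illustration only, the recoding of multiple mentions described above can be sketched as follows. The function name and the variable names are hypothetical and do not come from the NSFG codebook; the sketch assumes each respondent's mentions are available as a list of the four category labels.

```python
def recode_self_report(mentions):
    """Turn a respondent's racial mentions into three binary indicators."""
    mentions = set(mentions)
    return {
        "self_black": int("Black" in mentions),
        "self_white": int("White" in mentions),
        # "American Indian" and "Asian or Pacific Islander" are collapsed
        # into a single self-reported "Other" indicator
        "self_other": int(bool(mentions & {"American Indian",
                                           "Asian or Pacific Islander"})),
    }

# Hypothetical respondents, for illustration only
single = recode_self_report(["White"])
multiple = recode_self_report(["Black", "American Indian"])
```

A respondent who gives more than one racial background, like the second hypothetical case, is coded "yes" on more than one indicator, which is what preserves the multiple mentions.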

8 I used the web-based search engines for JSTOR and the Social Science Citation Index to identify articles with the words “NSFG” and “race” or “ethnicity” in their text.

9 NSFG respondents were also asked about which groups best describe their “national origin or ancestry.” I do not make use of this data here.


The NSFG oversampled black women in order to allow for meaningful statistical

comparisons between blacks and whites.10 Thus, black women make up approximately one-third

of the 8,279 women in my study sample though, according to census data, blacks were about 12

percent of the total U.S. population at the time. However, I do not use post-stratification weights

to reapportion the sample in any of the descriptive statistics or in my analyses below. My goal is

not to provide nationally representative estimates and in my multivariate analyses the weights act

as more of a hindrance—in terms of understating the power of the data to distinguish differences

between blacks and whites—than a help.

Table 1 shows the cross-tabulation of observed and self-reported race in the 1988

NSFG.11 Nearly 200 women (2.4 percent) chose to report more than one racial category,12 and

all but two of the multiracial women were observed to be one of the races they named. Of those

who reported a single race, 169 women (2 percent) were classified by the interviewer as one of

the two races they did not report for themselves. It is these “inconsistencies,” where the measures

of observed and self-reported race do not match, that are completely hidden in single-measure

methods of analyzing race in quantitative research. Previous research shows that these racial

classification inconsistencies are not random “error” and that describing these individuals by only

one of the two measures of race will lead to substantively different conclusions about racial

inequality in the United States (Saperstein 2006). There is also reason to believe that racial

classification inconsistency would have increased over the two decades since this data was

10 Given that the NSFG oversampled in black census tracts, and that the 1980 census race data is based on self-reports, this suggests the survey oversamples self-reported black women.

11 The cross-classification table on which my models are based is actually a 2x2x2x3 table. I condense those dimensions in Table 1 for ease of interpretation.

12 This is the same proportion of people who were counted as multiracial in the 2000 Census after the U.S. government’s official endorsement of the “mark one or more” method of counting by race (Jones and Smith 2001).


collected (e.g., Omi 2001), so the 1988 NSFG should be regarded as an underestimate of the

limiting effect of using a single measure of race in research in the contemporary period.
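The cross-classification and the notion of an "inconsistency" discussed above can be sketched in a few lines, assuming the three binary self-report indicators and the three-category observed measure. The records below are hypothetical and are not NSFG cases; the function names are likewise illustrative, and cases with missing data are assumed to have been dropped already.

```python
from collections import Counter

def self_report_label(black, white, other):
    """Label a self-report from the three binary indicators."""
    pairs = (("Black", black), ("White", white), ("Other", other))
    return "+".join(name for name, flag in pairs if flag)

def is_inconsistent(observed, black, white, other):
    """True when the interviewer's category is not among the races named."""
    return not {"Black": black, "White": white, "Other": other}[observed]

# Hypothetical records: (observed race, self_black, self_white, self_other)
records = [
    ("White", 0, 1, 0),
    ("Black", 1, 0, 0),
    ("White", 1, 0, 0),  # seen as white, identifies as black
    ("Black", 1, 0, 1),  # seen as black, identifies as multiracial
    ("Other", 0, 0, 1),
]
crosstab = Counter((obs, self_report_label(b, w, o)) for obs, b, w, o in records)
n_inconsistent = sum(is_inconsistent(*r) for r in records)
```

Under this definition a multiracial respondent is treated as consistently classified whenever the interviewer recorded one of the races she named, so only the third hypothetical record counts as inconsistent.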

<<Table 1: Cross-tabulation of Observed and Self-reported Race in the 1988 NSFG>>

Health screenings. For my health care analyses, I use a series of questions from the 1988

NSFG regarding whether and under what circumstances the respondent received any of several

health screenings. Each test or exam was covered in a pair of questions. The first read: “In the

past 12 months, during a visit for family planning services, did you have a pap smear?” This

question was followed by: “Did you have a pap smear as part of a general check-up or other

medical visit in the past 12 months?” The possible responses to each were simply “yes” or “no.”

The pair of questions regarding pap smears was followed by identically worded pairs of

questions on whether the respondent had a pelvic exam, a breast exam, a blood-pressure test and

a urine test. I analyze only the likelihood of having a pap smear, a breast exam and a blood

pressure test. Also, I do not distinguish between where the exam was conducted, only if it was or

not.
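A minimal sketch of how each pair of screening questions collapses into a single indicator, assuming the answers are stored as the strings "yes" and "no". The dictionary keys and the function name are hypothetical.

```python
def screened_past_year(family_planning_visit, other_medical_visit):
    """An exam counts as received if it was reported in either setting."""
    return family_planning_visit == "yes" or other_medical_visit == "yes"

# Hypothetical answers to the paired pap smear questions
answers = {"pap_family_planning": "no", "pap_general": "yes"}
had_pap = screened_past_year(answers["pap_family_planning"],
                             answers["pap_general"])
```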

Among NSFG respondents, ages 18-44, there was no clear “trend” in the number of

exams the women received. The most common outcome for all women was receiving all three

exams, followed by receiving none of them. The third most common outcome was receiving just

one exam, and it was most often a blood-pressure check-up (results not shown). Rates of having

been screened in the past year were relatively high for each exam, ranging between 76 and 86

percent of all women.

Family income. The NSFG measures family income with the following question:

“Card 32 shows amounts of weekly, monthly, and yearly income. Would you tell me what letter represents (your total income/the total combined income of your


family) in the past 12 months, including income from all sources such as wages, salaries, social security or retirement benefits, help from relatives, rent from property and so forth.”

The respondent was then handed a card with 17 possible responses ranging from under $2,500

yearly to $50,000 or more yearly. Because each of these categories represents a range, I coded

each category to its midpoint; so, women who selected the first income category were assigned

$1,250 in annual income. Women who chose the open-ended top category were assigned

$62,500 in annual income. To the extent that perceived white and other women are over-

represented among those with the highest family income, this arbitrary top code likely

understates differences in average income across the racial populations, making my estimates

below conservative ones.
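The midpoint coding can be sketched as follows. Only the two values stated above are used ($1,250 for the first category and $62,500 for the open-ended top category); the boundaries of the intermediate categories on the response card are not reproduced here, and the function name is hypothetical.

```python
TOP_CODE = 62_500  # value assigned to the open-ended "$50,000 or more" category

def midpoint_income(lower, upper=None):
    """Annual income for a category: its midpoint, or the top code if open-ended."""
    if upper is None:
        return TOP_CODE
    return (lower + upper) / 2

first_category = midpoint_income(0, 2_500)  # "under $2,500 yearly"
top_category = midpoint_income(50_000)      # "$50,000 or more yearly"
```

The top code is 1.25 times the lower bound of the open-ended category; any other multiplier would be equally arbitrary, which is why the choice matters for how large the estimated income gaps appear.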

In order to isolate adult family income from childhood family income, I restrict the age

range of my sample to women 25 years and older. This also increases the likelihood that the

respondents will have completed their educational attainment, which is an important covariate as

it is related to both own earnings and marital outcomes. The downside of the age restriction is

that it eliminates a disproportionate number of cases from some of the racial populations that

tend to be younger (e.g., women who identify as black in combination with another race).

However, two-thirds of the women who had missing data on family income were also under the

age of 25 so, in general, the age restriction should provide a more accurate assessment of average

differences in family income.

Example 1: Racial disparities in health screenings

Previous studies of racial disparities in preventative health care consistently find that

“black” women are equally likely if not more likely than “white” women to receive typical


screenings such as papanicolaou tests (pap smears), clinical breast exams, blood pressure checks

and tests for sexually transmitted infections. This finding runs counter to racial disparities in

health generally (Keppel 2007, IOM 2002), but has been found across several national surveys

and dates from at least 1985 to the present (Hiatt et al. 2002, Hewitt et al. 2002, Wilcox and

Mosher 1993, Mosher and Aral 1991, Makuc et al. 1989). There is some question in the public

health literature as to the accuracy of this reversed racial disparity.13 For my purposes here,

though, I simply want to know whether we can uncover more information about typical patterns

in screening by using multiple measures of race. My question is not whether “black” women have

higher rates of screenings than “white” women, but which “black” women report higher rates of

screening: women who are seen as black, women who identify as black or both?

From even a cursory glance at the observed frequencies, it is clear that women who are

seen as black are more likely to receive pap smears, breast exams and blood pressure checks than

women who are perceived as either white or other. Table 2 presents these frequencies by

perceived and self-identified race, along with descriptive statistics of the key covariates for

which I control in the multivariate analyses below. Comparing across the exams, racial

disparities are largest for pap smears (27 percent difference between the highest and lowest

groups) and smallest for blood pressure checks (20 percent difference).14 In general, women who

are consistently classified as other (column 7) are the least likely to report receiving any of the

screenings, and women who are seen as black but identify as multiracial (column 5) are the most

likely to report they were screened in the past year.

13 Some scholars suggest that the finding of higher rates of screening for black women is an artifact of using patients’ retrospective self-reports (Fiscella et al. 2006; McPhee et al. 2002; Gordon et al. 1993), which are subject to both recall and social desirability biases that may themselves differ by race (Warnecke et al. 1997; Zapka et al. 1996).

14 This mirrors the ranking of exams from the most to the least invasive.

<<Table 2. Descriptive statistics for health screening sample, ages 18-44, 1988 NSFG>>

Because using multiple measures of race is not standard practice, it is important to note

the insignificant differences in Table 2, as well as the significant ones. For example, among

perceived blacks there are small – and insignificant – differences between women who identify

as black alone and women who identify as both black and another race (columns 5 and 6).15 The

gaps between the two groups are on the order of 3-6 percent, a figure not statistically significant

given the sample sizes for the two groups and not substantively significant in any case.16 Among

perceived whites, the two inconsistently classified groups – women who identify as other and

women who identify as black – have the lowest frequencies of reporting they had health

screenings in the past year (columns 3 and 4).

In general, women who are seen as white but identify as other have rates of screenings

more similar to those of consistently classified others than they do to consistently classified

whites (i.e., the rates are no more than 6 percent apart). However, women who are seen as white

and identify as black have rates of screening that are significantly lower than either of their

reference groups. For example, 77 percent of consistently classified black women reported

having a breast exam, followed by 71 percent of consistently classified white women. At 57

percent, the rate of clinical breast exams for women who are seen as white but identify as black

lags far behind both.

15 Two-thirds of these women reported “Black” and “American Indian” as their racial backgrounds.

16 I say the small differences between the two groups, and their direction, are not substantively significant in part because women who are seen as black but identify as multiracial exhibit a number of characteristics, such as higher rates of hypertension, that should result in higher rates of screening. There may be substantively interesting differences among perceived blacks in these other characteristics (e.g., why might multiracial women be more likely to have hypertension and more likely to finish high school?), but exploring them is beyond the scope of this study.

These findings alone support my argument that multiple measures of race can provide

unique and useful information in studies of racial inequality. The simple picture one gets from

previous studies that “blacks” are more likely to report being screened than “whites” or “others”

is complicated here by: 1) the fact that the overarching pattern is defined by perceived race, not

self-identity; and 2) among perceived whites, women who identify as being nonwhite have

significantly lower rates of being screened.

Multivariate analyses of preventative health screenings

One should always interpret multivariate results about racial inequalities with great

caution. It is important to explain how I went about the process of model building and

interpretation, and the assumptions I make, because social scientists are often guilty of poorly

specifying or incorrectly interpreting models that include racial categories as independent

variables (Martin and Yeung 2003; Zuberi 2001). First and foremost, given that I define “race”

not as an intrinsic characteristic of individuals but a marker of socially constructed status

relationships, it would be inappropriate to attribute causal effects to “race” in multivariate

analyses. It is not being “black” per se that causes lower educational attainment, higher mortality

rates or – in this case – higher reported use of preventative health care. However, one can try to

identify potential causal mechanisms that mediate the relationships between race and an outcome

of interest. That is the approach I take here.

I first estimate a model that includes only indicators for the various racial categories; I

call this a “gross race effect” model because it identifies disparities prior to controlling for any

other factors that may be related to an individual’s race, the outcome of interest, or both. I then

define related sets of controls – such as those for insurance status, those that measure health

history and/or frequent use of health care, those that represent general determinants of health

care access (e.g., income) and other compositional characteristics (e.g., age) – and add them one

by one, observing changes to the coefficients for the racial categories. If a given control or set of

controls significantly decreases the effect otherwise attributed to race, then I assume I have

identified an intervening mechanism that helps to perpetuate racial disparities.17
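
The bookkeeping behind this nested-model strategy can be sketched in a few lines. This is a minimal illustration, not the estimation code used for the analyses: the function name `share_explained` and every coefficient value below are hypothetical, chosen only to show how the change in a racial category's coefficient might be tracked as sets of controls are added. In logistic models, coefficients from nested specifications are not strictly comparable because the scale of the estimates can shift as covariates enter, so the shares should be read as descriptive.

```python
def share_explained(gross_coef: float, adjusted_coef: float) -> float:
    """Share of the gross coefficient that disappears once controls are added.

    Positive values mean the controls absorb part of the disparity
    (a candidate intervening mechanism); negative values mean the
    disparity grows once the controls are held constant.
    """
    if gross_coef == 0:
        raise ValueError("gross coefficient is zero; the share is undefined")
    return 1.0 - adjusted_coef / gross_coef


# Hypothetical coefficients for ONE racial category across nested models.
# These numbers are illustrative only and do not come from Table 3 or Table 5.
nested_models = [
    ("race only (gross effect)", -0.60),
    ("+ insurance status", -0.57),
    ("+ health history and use", -0.55),
    ("+ access and composition", -0.54),
]

gross = nested_models[0][1]
for label, coef in nested_models:
    share = share_explained(gross, coef)
    print(f"{label:28s} coef = {coef:+.2f}   share explained = {share:.0%}")
```

A control set that moves the coefficient only slightly, as in this made-up sequence, would not count as an intervening mechanism under the rule described above; a negative share corresponds to the situation described in footnote 17.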

Interestingly, none of the covariates in the health screening analyses significantly alter

the general pattern of racial disparities visible in Table 2. Thus, I present only the final “full”

models in Table 3. The left panel of Table 3 provides estimates of the log odds of having the

screening named in the column for each of the racial categories named in the row, compared to

the log odds for consistently classified black women (the reference group). The fact that all but

one set of coefficients (for women who are perceived as black but identify as multiracial) are

negative indicates that, net of everything from health history to urban residence, perceived black

women are still more likely to report receiving each of the preventative health screenings. The

only estimates among perceived white women that are not statistically significant are those for

women who identify as both white and “other”.18

<<Table 3. Logistic regressions predicting reported health screenings, 1988 NSFG>>

However, for women who are seen as white but identify as black, the findings from the

full sample models in Table 3 only tell part of the story. The left panel of Table 3 shows that

17 Similarly, if a control or set of controls significantly increases the effect attributed to race, I interpret that to mean that individuals in the given racial category are doing better, on average, than we would otherwise expect based on average differences between them and the rest of the population on the characteristic being controlled.

18 These women are more likely to live in rural areas (i.e., neither in the center city, nor in the suburbs) and less likely to have sought medical care for the purposes of family planning in the past year (see Table 2), which appears to explain some of the difference between their lower rates of health screening and the rates of consistently classified black women (results not shown).

these women remain less likely to report having preventative health screenings compared to

women who are both seen as and identify as black, after controlling for a host of other factors.

Does the same hold true when the reference is women who are both seen as and identify as

white?19 The right panel in Table 3 provides the answer, which is “maybe”. Though all three

coefficients are negative, indicating that women who are seen as white but identify as black are

less likely to report screenings than consistently classified white women, only one (predicting

breast exams) is marginally statistically significant.

The clear message from both the observed frequencies in Table 2 and the multivariate

analyses in Table 3 is that the peculiar “black” advantage in preventative health screenings is

better attributed to being perceived as black than simply identifying as black. This distinction

may sound inconsequential, semantic even, but according to my results the magnitude of the

effects is far from trivial – even if the number of women affected may be relatively small.20 For

example, in multivariate analyses among self-identified black women (not shown), being seen as

white is the third largest predictor of rates of pap testing (β = -.734), behind the positive effects of

seeking medical assistance with family planning (β = 1.879) and whether the woman was using

oral contraception (β = .742). In the breast exam and blood pressure models, being seen as white

is the second largest and most significant predictor behind only whether the woman went to at

19 The full sample model in the left-hand panel of Table 3 is similar but not completely comparable to the estimates one would get from doing a restricted sample comparison among self-identified blacks. This is because the distribution of characteristics on the covariates would be different, and so would their estimated effects on preventative health care. For example, having Medicaid coverage is not a significant predictor of reporting health screenings among “blacks” or in the population as a whole but it is among “whites” (results not shown).

20 It is difficult to estimate how many women might be perceived as white but identify as black among the entire U.S. population. In the NSFG, they are .5 percent of the female population aged 15-44. Assuming that proportion has stayed constant over time and across age groups (which is a rather strong assumption), of today’s population of 300 million (of whom about 152 million are women) there would be more than three-quarters of a million women who are seen as white but identify as black – a population larger than the city of San Francisco. Even if we just assume the proportion among women ages 15-44 has remained constant, the estimated population of people who are seen as white but identify as black among women of reproductive age would be similar to that of Pittsburgh, Pennsylvania. (See http://www.census.gov/popest/cities/tables/SUB-EST2006-01.xls)

least one family planning visit. Further, knowing that the association is stronger for looking

black than “feeling” black should lead researchers to examine mechanisms related to interactions

between patients and health care personnel rather than individual-level attitudes or health

behaviors.21
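
Because the coefficients above are on the log-odds scale, their magnitudes are easier to compare once exponentiated into odds ratios. The sketch below applies that standard conversion to the three pap-test coefficients quoted in the text; the dictionary keys and the helper name `odds_ratio` are labels introduced here for illustration, and no other values from the unreported models are assumed.

```python
import math

# Log-odds coefficients quoted in the text for pap testing among
# self-identified black women (the underlying models are not shown).
reported_log_odds = {
    "seen as white": -0.734,
    "family planning visit": 1.879,
    "oral contraception": 0.742,
}


def odds_ratio(log_odds_coef: float) -> float:
    """Convert a logistic regression coefficient into an odds ratio."""
    return math.exp(log_odds_coef)


odds_ratios = {name: odds_ratio(b) for name, b in reported_log_odds.items()}

for name, ratio in odds_ratios.items():
    print(f"{name:24s} odds ratio = {ratio:.2f}")
```

Read this way, a coefficient of -.734 corresponds to odds of reporting a pap test that are a little less than half those of the comparison group, holding the other covariates constant.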

Example 2: Racial inequality in family income

Previous studies of racial disparities in income have shown that although “black” women

have achieved relative equality in terms of their own earnings potential (net of human capital and

other controls), the yawning gap between “blacks” and everyone else in family income has yet to

be closed – and has even widened over time. Racial differences in marital status and the rise in

female-headed households help explain part of the gap (Darity and Myers 1998). Even when

married, however, the family incomes of “black” women are lower on average than those of other

married women because “black” women are more likely to be married to “black” men, whose

earnings are far lower than those of the average American man (Katz et al. 2005; Darity and

Myers 1998).

Table 4 shows average family incomes by race for women, ages 25-44, along with the

means or other descriptive statistics for the control variables in the subsequent analyses. What is

striking here, in contrast to the pattern for preventative health care discussed above, is that all

women who self-identify as black have similar average family income (columns 4, 5 and 6),

which is approximately $10,000 lower than the income of women who self-identify as either

white or other. So, again we see a black-nonblack divide in well-being, but in this case the divide

appears to be defined not by perceived race but by self-identity.

21 I put quotes around the word feeling to indicate that it is not known what exactly people mean when they report they identify with a particular race. It could represent cultural affinity, political allegiance, what other people told them they are, what they know about their ancestry or some combination of the above.

<<Table 4. Descriptive statistics for family income sample, ages 25-44, 1988 NSFG>>

Also in contrast to the health care analyses, the likely mechanism for this racial disparity

is quite obvious: marital status. Women who are seen as white but identify as black are far less

likely to be married (32 percent) than women who are perceived to be white and do not

self-identify as black (73 percent, averaged across columns 1, 2 and 3). When they are married,

women who are seen as white but identify as black are also far more likely to report that they

have a black spouse (66 percent, not shown) than other women who are seen as white (2

percent).22 At the same time, the rates of marriage are very similar among the three groups of

self-identified blacks. As the multivariate results will show, these differences play an important

explanatory role in the lower average family income of self-identified black women.

Multivariate analysis of family income

To estimate racial disparities in family income, I use ordinary least squares regression.

The dependent variable is family income coded as described above and then logged to better

represent the relationship between income and well-being (i.e., having an extra $1,000 at the

bottom of the distribution is more significant than having an extra $1,000 at the top). The

coefficients can be interpreted roughly as the percent increase (or decrease) in family

income attributed to the given characteristic. The independent variables are typical of

22 I do not include information on the spouses or partners of NSFG women in Table 4 or in my multivariate results, though it is available in the survey. Unfortunately, an already small sample of inconsistently classified women combined with low rates of marriage for self-identified blacks means that for the most interesting group – women who are seen as white but identify as black – the possible number of partners is extremely small. Also, because spousal information is acquired from the respondent, there are large numbers of missing values even when there is a spouse or partner present.

stratification research on earnings and income: Hispanic origin, age, educational attainment,

metropolitan residence, region of residence, marital status and number of children.23
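
The variable coding described in footnote 23 can be made concrete with a short sketch. The function names and the string labels for marital status are hypothetical conveniences, not NSFG variable names; only the coding rules themselves (education collapsed below 12 years, two indicators for children with one or two children as the reference, and two indicators for marital status) are taken from the footnote.

```python
def code_education(years_completed: int) -> int:
    """0 for 0-11 completed years, 1 for 12 years, 2 for 13, ... up to 7 for 18."""
    if not 0 <= years_completed <= 18:
        raise ValueError("years of schooling must be between 0 and 18")
    return max(0, years_completed - 11)


def code_children(n_children: int) -> dict:
    """Two indicators; having one or two children is the reference group."""
    if n_children < 0:
        raise ValueError("number of children cannot be negative")
    return {
        "no_children": int(n_children == 0),
        "three_or_more_children": int(n_children >= 3),
    }


def code_marital_status(status: str) -> dict:
    """Two indicators; any status other than these two is the reference group.

    The labels 'married' and 'cohabiting' are assumed for illustration.
    """
    return {
        "married": int(status == "married"),
        "cohabiting": int(status == "cohabiting"),
    }


# Example respondent: 14 years of schooling, two children, cohabiting.
row = {"education": code_education(14)}
row.update(code_children(2))
row.update(code_marital_status("cohabiting"))
print(row)
```

Collapsing all schooling below 12 years into a single value reflects the finding, noted in the footnote, that the effect of education on family income did not vary among respondents without a high school degree.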

As before, I conducted the family income analyses by first estimating a “gross race

effect” model. I then added groups of independent variables, one by one, to explore how the

magnitude and direction of the effects otherwise attributed to race changed. Unlike in the health

care analyses, the control variables do substantively alter the model predictions here, suggesting

that we have identified some of the characteristics that mediate the relationship between “race”

and family income. Thus, Table 5 presents three models predicting annual family income: a race

only model, a model that controls for all the characteristics from Table 4 except for marital status

and a full model that includes all controls. As before, Table 5 is split into two panels: on the left

are models run on the full income sample, while models on the right are restricted to

comparisons among perceived whites. In each panel the most interesting comparison is between

the second model without marital status and the third model with marital status.24

<<Table 5. Ordinary least squares regressions predicting logged annual family income>>

23 I include both linear and squared terms for age and educational attainment. Educational attainment is measured in years of schooling completed, but is not coded continuously from 0 to 18. In separate analyses, I determined that the effect of education on family income was consistent for respondents who had less than a high school degree, regardless of how much less. Thus, education is coded as 0 for those with 0-11 completed years of schooling, 1 for 12 years of completed schooling, 2 for 13 years and so on up to 7 for 18. Similarly, I determined that the effect of how many of the woman’s biological children were living in the household was also nonlinear. So I control for the number of children with two categorical variables: 1) whether there are no children present and 2) whether there are three or more children present (having one or two children is the reference group). Marital status is controlled by two categorical variables for whether the respondent is married or cohabiting. As for metropolitan residence, in the NSFG it is possible to distinguish (roughly) center city residents from other metropolitan (i.e., suburban) residents and rural residents. In the preventative health care analyses distinctions between all three groups were salient, so I included two categorical variables (with rural as the reference group). Here, the operative distinction was between women who lived in a metropolitan area and those who did not, so I include just one (again, rural is the reference).

24 In the left panel the second model also does not include controls for the number of biological children in the household. I do this because the most salient difference between self-identified black women who are seen as white and consistently classified black women is not in their marital status but in the number of children they have living with them. Women who are seen as white but identify as black are about as likely to have more than three children as other self-identified black women, but they are less likely to be childless (see Table 4).

There are three results of interest in the full sample models. First, it is clear that the

dotted line dividing women who are perceived as black from everyone else is no longer the most

salient distinction, as it was in Table 3. In the first two models, women who are seen as white but

identify as black have an average (logged) family income that is slightly larger, but not

significantly different from that of consistently classified black women.25 This difference

becomes marginally significant in the final model because women who are seen as white but

identify as black have a higher income than we would expect given their average number of

children. Interestingly, also between the second and the final model, the difference in family

income between consistently classified other women and consistently classified black women

drops by more than 95 percent and is no longer statistically significant. This suggests that while

there is a (self-identified) black-nonblack divide in family income generally, among otherwise

similar women the divide is better described as drawn along (perceived) white-nonwhite lines.26

Further support for this conclusion comes from the right-hand panel of Table 5. In the

models restricted to perceived whites, we see that while women who identify as black have

significantly lower family income in general, once we control for average differences in marital

status they go from having an average income that is half as large as that of consistently

classified whites to an average income that is 17 percent lower (a difference that is no longer

statistically significant given the small number of self-identified blacks relative to the sample

25 This relationship is different from that observed in Table 4 because the average incomes in Table 4 are not logged. It turns out that women who are seen as white but identify as black have a more compressed distribution of income such that the poorest of them are less poor than the poorest consistently classified black women (results not shown).

26 Of course, for the women who are seen as white but identify as black, artificially assigning them the average characteristics of the population as a whole (or even self-identified blacks specifically) may remove the very variation that explains their inconsistent classification in the first place. Only further research into what “causes” their inconsistent racial classifications and identities can resolve this quandary.

size).27 Again, this suggests that much of the disparity in family income among perceived whites

– as in the population as a whole – is associated with the rates of marriage among self-identified

racial groups.28
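
Since the dependent variable is logged, the income comparisons in this paragraph correspond to coefficients on the log scale, and the two are linked by exponentiation. The sketch below works through that conversion for the two ratios stated in the text ("half as large" and "17 percent lower"). The implied coefficients are back-calculated under the assumption that those percentages are exact ratios of average incomes; they are not the estimates reported in Table 5, and the function names are introduced here for illustration.

```python
import math


def pct_difference(log_coef: float) -> float:
    """Exact percent difference in income implied by a log-income coefficient."""
    return 100.0 * (math.exp(log_coef) - 1.0)


def implied_log_coef(income_ratio: float) -> float:
    """Log-income coefficient implied by a ratio of incomes (e.g., 0.5 = half)."""
    if income_ratio <= 0:
        raise ValueError("income ratio must be positive")
    return math.log(income_ratio)


# Ratios stated in the text; the coefficients are back-calculated, not from Table 5.
stated_ratios = {
    "before controlling for marital status": 0.50,  # "half as large"
    "after controlling for marital status": 0.83,   # "17 percent lower"
}

for label, ratio in stated_ratios.items():
    b = implied_log_coef(ratio)
    print(f"{label:40s} b = {b:+.3f}   exact difference = {pct_difference(b):+.1f}%")
```

The comparison shows why reading a coefficient directly as a percent change is only a rough guide. For the smaller gap, the implied coefficient (about -0.19) is close to the 17 percent difference, but for the larger gap the implied coefficient (about -0.69) is far from the 50 percent difference it represents.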

In the health care analyses, we saw that racial disparities in reported health screenings,

though they ran in the opposite direction from what one would expect, were best summarized by

differences between women who were perceived as black and everyone else. There was also

evidence that while women who are seen as white but identify as black clearly fall on the

“everyone else” side of that health care divide, they also exhibited uniquely low rates of

screening that suggest their experiences in the health care system may not be easily summed up

as either “black” or “white”. Interestingly, neither of those conclusions would help us predict the

racial disparities in family income analyzed above.

Average differences in family income among American women clearly fall along self-

identified black-nonblack lines. So, in this case, it is not looking black that is the most important

distinction, but “feeling” black. In the multivariate analyses above, I find the relationship

between self-identity and family income is mediated to a large extent by marital status. That is,

the dramatically lower rates of marriage among self-identified blacks account for anywhere from

half to two-thirds of the difference between their average income and that of self-identified

27 The rest of the difference among perceived whites appears to be mediated by having a Hispanic origin. That is, women who are seen as white but identify as black only have lower average incomes among perceived whites who are non-Hispanic. Among perceived whites who report a Hispanic origin, women who are consistently classified as white actually have lower incomes than their perceived white peers and not the other way around. The same relationship is evident in the full sample models; across all racial categories, only consistently classified white women have less income if they also report being Hispanic (results available upon request). This is a fascinating result that deserves more attention in future research, including by incorporating Hispanic origin not as a control but as one of the multiple measures of “race.”

28 Put another way, women who are seen as white but identify as black earn a larger proportion of their family incomes themselves. This can be demonstrated by running a regression of the ratio of the respondent’s income to her family income on the same list of covariates used above. By doing so, we see that women who are seen as white but identify as black account for nearly two-thirds of their family incomes with their own earnings, while consistently classified white women contribute less than half of their family incomes through earnings (results not shown).

whites. There is some evidence that, net of their low rates of marriage and comparatively high

rates of childbearing, women who are seen as white but identify as black have higher average

income than consistently classified blacks – placing them somewhere between “blacks”, at the

bottom of the racial distribution of income, and “whites”, at the top. Thus, while their

experiences in the marriage market may be related to the fact that they “feel” black, all else being

equal the family income of these women may be better predicted by the fact they look white.29

Implications

The value added by my multiple measure approach to studying racial inequality cannot

be judged by the results from either of the above empirical examples alone. Separately, the

results from my health care and income analyses show that there may be some intriguing

intraracial differences in outcomes that have yet to be explored – but the differences are small

compared to the broad picture of racial inequality that we see with far less effort using standard

methods of measuring race.30 It is only by comparing results across the domains of health care

and family income that we learn something we did not already know about how race operates to

perpetuate inequalities. Namely, that it operates in different ways in different domains. My

results suggest that inequalities in health care depend more on how a patient is perceived racially

than how she self-identifies. At the same time, a woman’s self-identity better explains her marital

29 It is tempting to think that the racial identity of these women follows their marital status. For example, that because they are more likely to marry black men (compared to other women who are perceived as white), they are also more likely to think of themselves as black. However, it is important to recall that the vast majority of women who are seen as white but identify as black are not married.

30 Of course, this characterization of my approach ignores the improvement that comes from measuring race in a way that more clearly matches our theories of what “race” is and how it operates to perpetuate inequalities.

status and, through it, her family income.31 Future research and policy interventions would do

well to take these distinctions into account.

In the examples I discuss above, the data on race is gathered from an in-person interview

– a costly process that is not feasible for many research purposes, including the enumeration of

the United States population through the decennial census. I have, however, compared the

analyses presented here with analyses using data from the Behavioral Risk Factor Surveillance

System (BRFSS), an annual telephone survey organized by the U.S. Centers for Disease Control

and Prevention. The BRFSS also includes multiple measures of race, but in this case – because the

interviewer cannot see the respondent on the other end of the telephone – perceived race is

measured by proxy with the question: “How do others in this country typically classify you?”

The substance of the results I present here is replicated in the BRFSS data, suggesting that

mail-out surveys, such as the census, could include a similar proxy for perceived race and

achieve the same purposes of better identifying the mechanisms that link race to inequality in the

United States.32

I believe my multiple measure approach to studying racial inequality also has

applications outside the United States. Researchers in Brazil and several eastern European

countries have also begun examining how comparing self-identified and perceived race (or

ethnicity) can help inform processes of categorization, discrimination and inequality (e.g.,

Ahmed et al. 2007; Telles 2002; Telles and Lim 1998). Further, we know that racial identities

and racial perceptions are themselves affected by a number of characteristics, including language

31 There are likely other factors that intervene between self-identity and marital status, such as neighborhood effects, that require further research. It is also possible that these relationships could work in the opposite direction, such that people with poor health outcomes are more likely to be perceived as black, or that women with low incomes and poor marriage prospects are more likely to self-identify as black. Demonstrating the direction of the relationship awaits longitudinal data with the appropriate measures of race (though the National Longitudinal Study of Adolescent Health may be a good place to start).

32 Results from these BRFSS analyses are available from the author upon request.

use, religion, country of origin, and citizenship. Though their salience may vary by country and

over time, exploring the intersections among these characteristics and various measures of race,

ethnicity, ancestry and nationality would be a fruitful focus of international empirical research;

one that allows researchers to move beyond simply describing patterns of racial inequality and

toward explaining how the disparities persist.

References

Ahmed, Patricia, Cynthia Feliciano and Rebecca Jean Emigh. 2007. “Internal and External Ethnic Assessments in Eastern Europe.” Social Forces 86,1: 231-255.

Anderson, Margo J. 1988. The American Census: A Social History. New Haven, Conn.: Yale University Press.

Arias, Elizabeth, William S. Schaumann and Paul Sorlie. 2007. “Race and Hispanic Origin Reporting on Death Certificates in the United States: Status and Effects.” Paper presented at the Population Association of America Annual Meeting, New York. Available online: http://paa2007.princeton.edu/download.aspx?submissionId=71842

Bollen, Ken. 2002. “Latent Variables in Psychology and the Social Sciences.” Annual Review of Psychology 53: 605-34.

Darity, William A. and Samuel L. Myers Jr. 1998. Persistent Disparity: Race and Economic Inequality in the United States since 1945. Edward Elgar Publishing Company.

Ferguson, Ronald. 1998. “Teachers’ Perceptions and Expectations and the Black-White Test Score Gap.” Pp. 273-317 in The Black-White Test Score Gap, Jencks, Christopher and Meredith Phillips, eds. Washington D.C.: Brookings Institution Press.

Fiscella, Kevin, Kathleen Holt, Sean Meldrum and Peter Franks. 2006. “Disparities in preventive procedures: comparisons of self-report and Medicare claims data.” BMC Health Services Research 6:122-130.

Gordon NP, RA Hiatt and JI Lampert. 1993. “Concordance of Self-reported Data and Medical Record Audit for Six Cancer Screening Procedures.” Journal of the National Cancer Institute 85: 795-800.

Greenwald, Anthony G, Debbie McGhee and Jordan Schwartz. 1998. “Measuring individual differences in implicit cognition: The implicit association test.” Journal of Personality and Social Psychology 74(6): 1464-1480.

Hahn, R.A., J. Mulinare, and S.M. Teutsch. 1992. “Inconsistencies in Coding of Race and Ethnicity between Birth and Death in US Infants: A New Look at Infant Mortality, 1983 through 1985.” Journal of the American Medical Association 267:259-63.

Harris, David R. and Jeremiah Joseph Sim. 2002. “Who is Multiracial?: Assessing the Complexity of Lived Race.” American Sociological Review 67:614-27.

Hewitt, Maria, Susan Devesa and Nancy Breen. 2002. “Papanicolaou Test Use Among Reproductive-Age Women at High Risk for Cervical Cancer: Analyses of the 1995 National Survey of Family Growth.” American Journal of Public Health 92: 666-669.

Hiatt, Robert A., Carrie Klabunde, Nancy Breen, Judith Swan, Rachel Ballard-Barbash. 2002. “Cancer Screening Practices From National Health Interview Surveys: Past, Present, and Future.” Journal of the National Cancer Institute 94, 24: 1837-1846.

Institute of Medicine. 2002. Unequal Treatment: Confronting Racial and Ethnic Disparities in Health Care. National Academy Press.

Jones, Nicholas and Amy Symens Smith. 2001. “The Two or More Races Population: 2000.” Census 2000 Brief. U.S. Census Bureau.

Katz, Michael B., Mark J. Stern and Jamie J. Fader. 2005. “The New African American Inequality.” The Journal of American History 92,1.


Keppel, Kenneth G. 2007. “Ten Largest Racial and Ethnic Health Disparities in the United States based on Healthy People 2010 Objectives.” American Journal of Epidemiology 166: 97-103.

Kirschenman, Jolene and Katherine Neckerman. 1991. “We’d Love to Hire Them, But …” In The Urban Underclass, eds. Christopher Jencks and Paul E. Peterson. Washington D.C.: The Brookings Institution.

Loveman, Mara. 1999. “Is ‘Race’ Essential?” American Sociological Review 64, 6: 891-898.

Makuc DM, VM Freid, JC Kleinman. 1989. “National trends in the use of preventive health care by women.” American Journal of Public Health 79:21-26.

Martin, John Levi and King-To Yeung. 2003. “The Use of the Conceptual Category of Race in American Sociology, 1937-99.” Sociological Forum 18:521-543.

McCutcheon, Allan L. 1987. Latent class analysis. Beverly Hills: Sage Publications.

McPhee, Stephen J., Tung T. Nguyen, Sarah J. Shema, Bang Nguyen, Carol Somkin, Phuong Vo, and Rena Pasick. 2002. “Validation of Recall of Breast and Cervical Cancer Screening by Women in an Ethnically Diverse Population.” Preventive Medicine 35: 463-473.

Morgan, S. Philip, N. Botev, R.B. Chen and J.P. Huang. 1999. “White and Nonwhite Trends in First Birth Timing: Comparisons using Vital Registration and Current Population Surveys.” Population Research and Policy Review 18: 339-356.

Mosher, William D. and Sevgi O. Aral. 1991. “Testing for Sexually Transmitted Diseases Among Women of Reproductive Age.” Family Planning Perspectives 23,5: 216-221.

Nagel, Joanne. 1994. “Constructing Ethnicity: Creating and Recreating Ethnic Identity and Culture.” Social Problems 41: 152-176.

Nobles, Melissa. 2000. Shades of Citizenship: Race and the Census in Modern Politics. Stanford, CA: Stanford University Press.

Omi, Michael. 2001. “The Changing Meaning of Race.” In America Becoming: Racial Trends and Their Consequences, eds. Neil J. Smelser, William Julius Wilson and Faith Mitchell. Washington D.C.: National Academy Press.

___________ and Howard Winant. 1994. Racial Formation in the United States: From the 1960s to the 1990s. New York: Routledge.

Saperstein, Aliya. 2006. “Double-checking the Race Box: Examining Inconsistency between Survey Measures of Observed and Self-Reported Race.” Social Forces 85(1): 57-74.

______________ 2007. “The Many Dimensions of Race: Capturing Complexity with Latent Variables.” Paper presented at the Population Association of America Annual Meeting, New York. Available online: http://paa2007.princeton.edu/download.aspx?submissionId=70498

Skerry, Peter. 2000. Counting on the Census. Washington, D.C.: Brookings.

Sugarman, Jonathan R., Robert Soderberg, Jane E. Gordon, and Frederick P. Rivara. 1993. “Racial misclassification of American Indians: Its effect on injury rates in Oregon, 1989 through 1990.” American Journal of Public Health 83(5):681-684.

Taylor, Teletia R., Carla D. Williams, Kepher H. Makambi, Charles Mouton, Jules P. Harrell, Yvette Cozier, Julie R. Palmer, Lynn Rosenberg and Lucile L. Adams-Campbell. 2007. “Racial Discrimination and Breast Cancer Incidence in U.S. Black Women.” American Journal of Epidemiology 166:46-54.


Telles, Edward E. 2002. “Racial Ambiguity among the Brazilian Population.” Ethnic and Racial Studies 25: 415-441.

______ and Nelson Lim. 1998. “Does it Matter Who Answers the Race Question? Racial Classification and Income Inequality in Brazil.” Demography 35: 465-74.

Tilly, Charles. 1998. Durable Inequalities. University of California Press.

Wacquant, Loic. 1997. “For an Analytic of Racial Domination.” Political Power and Social Theory 11: 221-234.

Van Ryn, Michelle, Diana Burgess, Jennifer Malat and Joan Griffin. 2006. “Physicians’ Perceptions of Patients’ Social and Behavioral Characteristics and Race Disparities in Treatment Recommendations for Men With Coronary Artery Disease.” American Journal of Public Health 96,2: 351-357.

Warnecke, Richard B., Timothy P. Johnson, Noel Chavez, Seymour Sudman, Diane P. O'Rourke, Loretta Lacey, and John Horm. 1997. “Improving question wording in surveys of culturally diverse populations.” Annals of Epidemiology 7:334-34.

Weber, Max. 1978. Economy and Society, trans. by G. Roth and C. Wittich. University of California Press.

Wilcox, Lynne S. and William D. Mosher. 1993. “Factors Associated with Obtaining Health Screening Among Women of Reproductive Age.” Public Health Reports 108: 76-86.

Zapka, Jane G., Carol Bigelow, Thomas Hurley, Leigh Durland Ford, John Egelhofer, W. Max Cloud and Eckart Sachsse. 1996. “Mammography use among sociodemographically diverse women: the accuracy of self-report.” American Journal of Public Health 86,7:1016–21.

Zuberi, Tukufu. 2003. Thicker Than Blood: How Racial Statistics Lie. University of Minnesota Press.


Consistency is the norm, but perceived race and self-identity are not necessarily equivalent

| Self-identified race | Perceived as Black | Perceived as White | Perceived as Other | Row total |
|---|---|---|---|---|
| Black | 2634 | 41 | 5 | 2680 |
| White | 20 | 5118 | 30 | 5168 |
| Other | 15 | 58 | 161 | 234 |
| Black-White | 11 | 1 | 1 | 13 |
| Black-Other | 54 | 0 | 2 | 56 |
| White-Other | 1 | 108 | 3 | 112 |
| All three | 12 | 0 | 0 | 12 |
| DK/Refused | 1 | 2 | 1 | 4 |
| Column total | 2748 | 5328 | 203 | 8279 |

Table 1: Cross-tabulation of Perceived and Self-identified Race in the 1988 NSFG

Note: Unweighted counts. Self-identified “Other” combines American Indian and Asian or Pacific Islander race responses. Hispanic origin was coded separately by NSFG and is included only as a control in the analyses below.
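As a quick arithmetic check on the counts in Table 1, the following sketch computes the share of respondents whose perceived race matches a single self-identified race. The dictionary layout and variable names are illustrative assumptions and are not part of the original analysis; only the counts come from the table.

```python
# Unweighted counts copied from Table 1 (1988 NSFG).
# Keys are self-identified race; values are counts by perceived race
# in the order (Black, White, Other). This layout is illustrative only.
crosstab = {
    "Black": (2634, 41, 5),
    "White": (20, 5118, 30),
    "Other": (15, 58, 161),
    "Black-White": (11, 1, 1),
    "Black-Other": (54, 0, 2),
    "White-Other": (1, 108, 3),
    "All three": (12, 0, 0),
    "DK/Refused": (1, 2, 1),
}
perceived = ("Black", "White", "Other")

# Grand total across all cells.
total = sum(sum(row) for row in crosstab.values())

# "Consistent" here means a single-race self-identification that
# matches the perceived race (the three diagonal cells).
consistent = sum(crosstab[race][perceived.index(race)] for race in perceived)

share_consistent = consistent / total
print(f"total = {total}, consistent = {consistent}, share = {share_consistent:.1%}")
```

Under this definition of consistency, 7,913 of the 8,279 respondents fall on the diagonal, which is the sense in which consistency is the norm while a nontrivial minority is classified differently by the two measures.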


Women who are seen as white are less likely to receive health screenings compared to women who are seen as black

| | Seen as white: identifies as white only | Seen as white: identifies as multiracial | Seen as white: identifies as other only | Seen as white: identifies as black only | Seen as black: identifies as multiracial | Seen as black: identifies as black only | Seen as other: identifies as other only |
|---|---|---|---|---|---|---|---|
| Had a pap smear in past 12 mos. | 69% | 67% | 56% w | 65% b | 82% | 78% | 55% |
| Had a breast exam | 71% | 70% | 63% | 57% bw | 80% | 77% | 58% |
| Had blood pressure checked | 83% | 85% | 78% | 73% bw | 92% | 86% | 72% |
| Has some health insurance | 52% | 44% | 51% | 51% | 55% | 47% | 40% |
| Covered by Medicaid | 2% | 5% | 10% | 16% | 11% | 18% | 3% |
| Insurance coverage missing | 15% | 14% | 20% | 14% | 11% | 12% | 41% |
| Had one or more fam. plan. visits | 36% | 28% | 37% | 35% | 42% | 39% | 24% |
| History of PID | 11% | 23% | 17% | 22% | 29% | 20% | 7% |
| History of hypertension | 12% | 23% | 22% | 19% | 24% | 19% | 7% |
| Currently pregnant | 5% | 2% | 5% | 0% | 9% | 4% | 5% |
| Currently using the pill | 19% | 18% | 17% | 24% | 23% | 21% | 11% |
| Abstinent past 12 mos. | 6% | 4% | 2% | 5% | 5% | 6% | 5% |
| Hispanic | 7% | 3% | 37% | 11% | 6% | 3% | 5% |
| Age 18-29 | 41% | 39% | 44% | 46% | 48% | 46% | 45% |
| Age 30-44 | 59% | 61% | 56% | 54% | 52% | 54% | 55% |
| Did not graduate from HS | 13% | 26% | 22% | 24% | 14% | 24% | 11% |
| Average family income | $35,279 | $31,570 | $34,549 | $22,250 | $21,110 | $22,171 | $32,968 |
| Lives in central city | 17% | 14% | 17% | 43% | 58% | 51% | 27% |
| Lives in suburbs | 57% | 51% | 66% | 49% | 29% | 33% | 57% |
| N (unweighted) | 4397 | 93 | 41 | 37 | 66 | 2183 | 132 |
| Pct. of study sample | 62.7% | 1.3% | 0.6% | 0.5% | 0.9% | 31.1% | 1.9% |

Table 2. Descriptive statistics for health screening sample, ages 18-44, 1988 NSFG

Note: b indicates observed frequency differs significantly from consistently classified blacks (p<.05, one-tailed test); w indicates observed frequency differs significantly from consistently classified whites (p<.05, one-tailed test). For explanation of data on insurance status, please see text.


Perceived race remains salient, even after controlling for health history, insurance coverage and other factors

Full sample health screening models

| | Pap smear | Breast exam | BP check |
|---|---|---|---|
| Consistently classified whites | -.544 *** (.078) | -.441 *** (.076) | -.361 *** (.085) |
| Seen as white, IDs as multiracial | -.389 (.251) | -.223 (.253) | -.071 (.309) |
| Seen as white, identifies as other | -1.295 *** (.379) | -.826 * (.373) | -.755 † (.417) |
| Seen as white, identifies as black | -.741 † (.391) | -1.090 ** (.375) | -.934 * (.404) |
| Consistently classified others | -.950 *** (.210) | -.820 *** (.206) | -.821 *** (.219) |
| Seen as black, IDs as multiracial | .098 (.351) | .012 (.338) | .448 (.484) |
| Consistently classified blacks (Constant) | -.126 (.304) | -.122 (.295) | 1.103 *** (.093) |
| N | 7012 | 7012 | 7005 |

Health screening models for perceived whites only

| | Pap smear | Breast exam | BP check |
|---|---|---|---|
| Identifies as black | -.231 (.394) | -.621 † (.377) | -.539 (.401) |
| Identifies as multiracial | .141 (.247) | .236 (.249) | .305 (.306) |
| Identifies as “other” | -.780 * (.379) | -.396 (.371) | -.358 (.413) |
| Consistently classified whites (Constant) | -.896 * (.392) | -.766 * (.376) | .754 *** (.119) |
| N | 4571 | 4571 | 4567 |

Table 3. Logistic regressions predicting reported health screenings, 1988 NSFG

Note: † p<.10 * p<.05 ** p<.01 *** p<.001. All models include controls for any remaining inconsistent classifications, Hispanic origin, age, marital status, education, family income, metropolitan residence, insurance coverage, number of family planning visits in the past year, current pregnancy status, history of pelvic inflammatory disease, history of hypertension, whether the woman was abstinent in the past year and whether she was currently taking oral contraceptives.
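Because the entries in Table 3 are logistic regression coefficients (log-odds), exponentiating them gives odds ratios, which are often easier to read. The sketch below does this for the pap smear column of the full-sample model. It assumes that the constant row, labeled consistently classified blacks, marks the reference category; the dictionary and variable names are illustrative and not drawn from the paper.

```python
import math

# Pap smear coefficients (log-odds) copied from the full-sample model in
# Table 3. Assumption: consistently classified blacks are the reference
# category, since that label is attached to the constant.
pap_smear_logits = {
    "Consistently classified whites": -0.544,
    "Seen as white, IDs as multiracial": -0.389,
    "Seen as white, identifies as other": -1.295,
    "Seen as white, identifies as black": -0.741,
    "Consistently classified others": -0.950,
    "Seen as black, IDs as multiracial": 0.098,
}

# exp(coefficient) is the odds ratio relative to the reference category,
# holding the other covariates in the model constant.
odds_ratios = {group: math.exp(b) for group, b in pap_smear_logits.items()}

for group, ratio in odds_ratios.items():
    print(f"{group}: odds ratio = {ratio:.2f}")
```

Read this way, the coefficient of -.544 for consistently classified whites corresponds to odds of reporting a pap smear that are roughly 0.58 times those of consistently classified blacks, net of the controls listed in the table note.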


Women who identify as black have lower family income, on average, regardless of how they are perceived racially

| | Seen as white: identifies as white only | Seen as white: identifies as multiracial | Seen as white: identifies as other only | Seen as white: identifies as black only | Seen as black: identifies as multiracial | Seen as black: identifies as black only | Seen as other: identifies as other only |
|---|---|---|---|---|---|---|---|
| Annual family income | $37,157 | $33,303 | $37,328 | $22,036 w | $21,872 w | $23,579 w | $35,548 |
| R's education (years) | 13.5 | 12.5 | 13.2 | 12.5 | 13.0 | 12.8 | 14.1 |
| Married | 73% | 73% | 79% | 32% | 30% | 39% | 76% |
| Cohabiting | 4% | 5% | 3% | 0% | 2% | 5% | 1% |
| Does not have children | 31% | 32% | 21% | 14% | 26% | 26% | 31% |
| Has three or more kids | 15% | 15% | 28% | 21% | 17% | 22% | 13% |
| Hispanic | 7% | 4% | 28% | 11% | 6% | 3% | 6% |
| Lives in metro. area | 75% | 65% | 83% | 93% | 87% | 84% | 87% |
| Region of residence: Northeast | 21% | 13% | 10% | 14% | 19% | 16% | 13% |
| Region of residence: South | 31% | 35% | 17% | 39% | 23% | 54% | 10% |
| Region of residence: Midwest | 28% | 21% | 17% | 36% | 26% | 22% | 14% |
| Region of residence: West | 19% | 31% | 55% | 11% | 32% | 8% | 63% |
| N (unweighted) | 3481 | 75 | 29 | 28 | 47 | 1673 | 104 |
| Pct. of study sample | 63.3% | 1.4% | 0.5% | 0.5% | 0.9% | 30.4% | 1.9% |

Table 4. Descriptive statistics for family income sample, ages 25-44, 1988 NSFG

Note: w indicates observed frequency differs significantly from consistently classified whites (p<.05, one-tailed test). Annual family income includes income from all sources for the previous 12 months.


Self-identifying as black is a salient predictor of family income, largely mediated through differences in marital status

Full sample family income models

| | Race only | All but marital status and # of kids | Full model |
|---|---|---|---|
| Consistently classified whites | .658 *** (.024) | .578 *** (.023) | .287 *** (.021) |
| Seen as white, IDs as multiracial | .557 *** (.094) | .621 *** (.086) | .307 *** (.076) |
| Seen as white, identifies as other | .670 *** (.148) | .671 *** (.137) | .335 ** (.120) |
| Seen as white, identifies as black | .072 (.151) | .106 (.139) | .195 † (.121) |
| Seen as other, identifies as other | .577 *** (.080) | .408 *** (.075) | .010 (.219) |
| Seen as black, IDs as multiracial | -.057 (.117) | -.141 (.108) | -.055 (.094) |
| Consistently classified blacks (Constant) | 9.684 *** (.019) | 9.061 *** (.046) | 8.789 *** (.042) |
| N | 5495 | 5495 | 5495 |

Family income models for perceived whites only

| | Self-identified race only | All but marital status | Full model |
|---|---|---|---|
| Identifies as black | -.586 *** (.130) | -.513 *** (.118) | -.169 (.106) |
| Identifies as multiracial | -.102 (.080) | .026 (.073) | .009 (.065) |
| Identifies as “other” | .012 (.128) | .108 (.117) | .070 (.104) |
| Consistently classified whites (Constant) | 10.342 *** (.012) | 9.753 *** (.047) | 9.163 *** (.046) |
| N | 3615 | 3615 | 3615 |

Table 5. Ordinary least squares regressions predicting logged annual family income, 1988 NSFG

Note: † p<.10 * p<.05 ** p<.01 *** p<.001. Full models include controls for any remaining inconsistent classifications, Hispanic origin, age, marital status, education, metropolitan residence, region of residence, and how many biological children under the age of 18 are currently living in the household.
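Since the outcome in Table 5 is logged annual family income, the coefficients can be translated back into dollars and proportional differences by exponentiating. The sketch below does this for the “Race only” full-sample model, using the constant (consistently classified blacks) and the coefficient for consistently classified whites. The variable names are illustrative assumptions; only the two numbers come from the table.

```python
import math

# Values copied from the "Race only" full-sample model in Table 5.
# The outcome is logged annual family income.
constant_blacks = 9.684  # consistently classified blacks (constant)
coef_whites = 0.658      # consistently classified whites

# Back-transformed predictions in dollars. Because the model is fit on
# the log scale, these behave like geometric rather than arithmetic means.
income_blacks = math.exp(constant_blacks)
income_whites = math.exp(constant_blacks + coef_whites)

# Proportional difference implied by the coefficient: exp(b) - 1.
pct_diff = math.exp(coef_whites) - 1

print(f"consistently classified blacks: ${income_blacks:,.0f}")
print(f"consistently classified whites: ${income_whites:,.0f}")
print(f"whites relative to blacks: {pct_diff:+.1%}")
```

The sum 9.684 + .658 = 10.342 equals the constant reported for consistently classified whites in the perceived-whites-only model, so the two panels agree on the baseline for that group when race is the only predictor.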