Cross-cultural surveys

Part of the Comparative Cross-national Electoral Research Programme

Funded by the Economic and Social Research Council

Contact us:

Mediaeffectsresearch.wordpress.com

@femalebrain

Cross-Cultural Surveys:

• Part I: Total Survey Error: survey quality (comparability, translation); data quality and the Total Survey Error approach; sampling; questionnaire design

• Part II: Data processing and statistical adjustment (i.e. survey weights); linking micro-macro data

PART I: DATA QUALITY IN CROSS CULTURAL SURVEYS

Total Survey Error Paradigm and CCS

Why Cross-Cultural Surveys?

• Represent opinion - Want to know what people think, reports of behaviour

• Represent people: describe a population

• Represent relationships: among attitudes & attributes

• Cross-cultural surveys - monitor and explain trends in attitudes, beliefs and values across countries

• We are interested in contextual/cultural influences

• Rapid growth in the conduct of surveys across a large number of countries and increasing availability of these data

Major Cross Cultural Surveys

Surveys

• CSES*, EES, ESS, ISSP*, WVS*

• Barometers

– Euro

– Latin

– Asia

– Asian

– Afro

• 218 FH countries

• 124 are present in one cross-national survey

• *includes USA and Canada

[Figure: Number of countries covered by each survey (bars include Afro-barometer, Eurobarometer, ESS, ISSP; axis 0 to 100)]

Data sets used today

• World Values Survey

• Ron Inglehart

• European Social Survey

• Centrally funded and managed, Descartes prize winner

• Comparative Study of Electoral Systems

• Country teams for national election studies, coordinated by UMich

CCS & Total Survey Error Approach I

• Total survey error (TSE) = all errors that may arise in the design, collection, processing, and analysis of survey data.

• survey error is defined as: error = abs[TRUE VALUE – ACTUAL RESPONSE]

• errors can arise from survey frame deficiencies, sampling error, interviewer and mode effects, item and survey non-response…

CCS & Total Survey Error Approach II

• CCS adds additional layers to the types of errors and how errors may affect the quality of data

• comparability

• Translation - do concepts translate?

• time use patterns and survey traditions may affect response rates

• Sampling – selection of countries to study

Questionnaire Design & Development

• Objective – Comparability

• Our goal is to be able to compare responses across a set of cultural contexts.

• Differences should be due to underlying differences in values rather than error (e.g. question wording)

• Do not assume “that the use of similar instruments administered under similar conditions is truly sufficient to ensure that respondents from different cultural groups will arrive at the same interpretations of survey items” (Harkness, Van de Vijver, Mohler 2003)

Achieving measurement equivalence with translation:

• Decentering

• Develop questionnaire in two languages

• Back translation

• After translating into other languages, translate back into the original language

• See guidelines developed for European Election Study

Survey (Unit) Non-response

• Response rates vary across countries – what are the implications for cross-cultural survey quality?

• Lower response rates may indicate poorer quality in terms of representativeness, but greater efforts to increase response rates may include ‘lazy’ respondents.

• Responding to a survey = ability & motivation AND skill of interviewer

• Improve response rates by re-contacting and refusal conversion

• Bias introduced when non-response is non-random

Non-response, motivation & data quality

[Figure: % with no initial refusal, low vs. high political interest (axis 0.0 to 60.0)]

[Figure: interviewer ratings (Reluctant, Clarification, Understood) for non-refusal vs. refusal respondents (axis 0.0 to 5.0)]

European Social Survey: Interviewer ratings of respondents' engagement with survey.

European Social Survey: How interest motivates participation and may lead to bias in surveys.

Item Non-response

• Item non-response occurs when a respondent does not give an answer to a question

• As with the survey as a whole, respondents need to be motivated to respond to a question – optimise rather than satisfice

• Optimising requires comprehension (cognition) and judgement

• Satisficing – straightlining, middle responses, don’t know answers

Measuring Data Quality in CCS

[Figure: Average number of professional survey organisations in each country, by survey (Afro-barometer, Asian-barometer, Eurobarometer, EES, ESS, Latin-barometer, CSES, ISSP, WVS; axis 0.0 to 2.5)]

Measuring Data Quality in CCS

[Figure: Response rates for CSES and WVS, OECD vs. non-OECD countries (axis 0.0 to 90.0)]

Relationship between non-response & data quality

[Figure: Reluctance to give answers and satisficing, by survey (Afro-barometer, Asian-barometer, Eurobarometer, EES, ESS, Latin-barometer, CSES, ISSP, WVS; axis 0.0 to 2.5)]

Sampling I

• At least two levels of sampling

• Country (level 2) and individual (level 1)

• Probability sample at level 1 (can be multi-stage)

• Level 2 - not a probability sample of countries

• Assumption of exchangeability violated

Sampling II

• Level 2 - not a probability sample of countries – purposive, convenience or the population

• Researchers concerned about number of countries when conducting statistical analysis.

• However, should also be concerned about type of sample:

• Assumption of exchangeability violated

• Not a problem if it is plausible that factors related to outcomes are unrelated to selection (Snijders)

Sample Characteristics – Bias?

[Figure: Turnout by survey (bars include Afro-barometer, Eurobarometer, ESS, ISSP; axis 0 to 100); global average 66%]

[Figure: Education by survey (bars include Afro-barometer, Eurobarometer, ESS, ISSP; axis 0 to 12); global average 8]

And now for some comparisons using CSES data:

[Figure: men vs. women (axis 0.0 to 60.0)]

% feeling close to political party by gender and FH Political Rights

[Figure: women vs. men across FH Political Rights categories, from “Political Rights” through 2, 3, 4, 5 to “No Political Rights” (axis 0.0 to 80.0)]

Additional resources on Translation

Harkness, JA. 2008. “Round 4 ESS Translation Strategies and Procedures”, European Social Survey, [http://www.europeansocialsurvey.org/index.php?option=com_docman&task=doc_download&gid=351&itemid=80].

Harkness, JA, Schoua-Glusberg, A. 1998. “Questionnaires in Translation”. In: Harkness, JA (ed.), Cross-Cultural Survey Equivalence. ZUMA-Nachrichten Spezial No. 3. Mannheim: ZUMA.

PART II: DATA PROCESSING, STATISTICAL ADJUSTMENTS AND ANALYSIS

• Data linking – measures of context linked to individual survey responses

• Multi-level data

Weighting CCS Data to Adjust for Non-Response

Types of Weights

• Sample design weights, non-response weights, and post-stratification weights

• Weighting adjusts for unequal selection probabilities, as well as for nonresponse and stratification; it compensates for different probabilities of being selected

• Treating the data as a simple random or representative sample may lead to standard errors that are too small

Post-stratification weights to Adjust for Non-Response

• Available in most survey data sets

• Requires the use of auxiliary information about the population and may take a number of different variables into account.

Information usually needed:

• Population estimates of the distribution of a set of demographic characteristics that have also been measured in the sample

• For example, information found in the Census such as:

• Gender, Age, Educational attainment, Household size, Residence (e.g., rural, urban, metropolitan), Region
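The logic of post-stratification can be sketched in a few lines of Python. This is a minimal illustration under simple assumptions, not the procedure used by any particular survey: there is one stratifying variable, and the weight for each respondent is the population share of their cell divided by the sample share of that cell. The function name, the cell labels and the population shares are all hypothetical.

```python
from collections import Counter

def post_stratification_weights(sample_cells, population_shares):
    """Return one weight per respondent: population share / sample share of their cell."""
    n = len(sample_cells)
    counts = Counter(sample_cells)
    weights = []
    for cell in sample_cells:
        sample_share = counts[cell] / n
        weights.append(population_shares[cell] / sample_share)
    return weights

# Hypothetical sample of 10 respondents in which women are under-represented
# (40% of the sample, but 50% of the population according to census figures).
sample = ["woman"] * 4 + ["man"] * 6
population = {"woman": 0.5, "man": 0.5}  # assumed census shares

weights = post_stratification_weights(sample, population)
print(weights[0], weights[-1])  # weight for a woman, weight for a man
```

Under-represented groups receive weights above 1 and over-represented groups weights below 1, and the weights sum to the sample size. With several demographic variables the cells would be their cross-classification, or the weights would be found by raking, which this sketch does not cover.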


Example: Weights in European Social Survey

• Sampling Design Weights: Kish (1994, p.173) provides the starting point of the sampling expert panel’s work: “Sample designs may be chosen flexibly and there is no need for similarity of sample designs. Flexibility of choice is particularly advisable for multinational comparisons, because the sampling resources differ greatly between countries. All this flexibility assumes probability selection methods: known probabilities of selection for all population elements.”

• Following this statement, an optimal sample design for cross-cultural surveys should consist of the best random sampling practice used in each participating country. The choice of a specific sample design depends on the availability of frames, experience, and of course also the costs in different countries. If, after the survey has been conducted, adequate estimators are chosen, the resulting values are comparable. To ensure this comparability, design weights have to be computed for each country. For this, the inclusion probabilities of every sample member at each stage of selection must be known and recorded in the Sample Design Data File (SDDF).
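As a rough illustration of how a design weight follows from recorded inclusion probabilities, the sketch below multiplies the stage-level probabilities for each respondent and takes the inverse. Rescaling the weights to a mean of 1 is an assumption made here for readability, and the two-stage design (household, then one adult within the household) and all numbers are hypothetical rather than taken from any SDDF.

```python
def design_weights(stage_probabilities):
    """Inverse of the overall inclusion probability, rescaled to have mean 1.

    stage_probabilities: one tuple per respondent holding the inclusion
    probability at each stage of selection.
    """
    raw = []
    for probs in stage_probabilities:
        overall = 1.0
        for stage_p in probs:
            overall *= stage_p  # probability of surviving every stage
        raw.append(1.0 / overall)
    mean_raw = sum(raw) / len(raw)
    return [w / mean_raw for w in raw]

# Hypothetical two-stage design: each household is selected with probability
# 0.01, then one adult is chosen at random among 1, 2 or 3 adults.
respondents = [(0.01, 1 / 1), (0.01, 1 / 2), (0.01, 1 / 3)]
dweights = design_weights(respondents)
print(dweights)
```

Respondents from larger households had a smaller chance of being selected, so they receive larger weights.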

Problems with Weights

• Weights primarily adjust means and proportions. OK for descriptive statistics but may adversely affect inferential statistics and standard errors.

• Weights almost always increase the standard errors of your estimates (or assume the sample is self-weighted).

• Self-weighted means that every unit has an equal probability of selection into the sample.

• Therefore, for analysis do you need to weight data? Most of us want to go beyond descriptive statistics.

• NO WEIGHTS FOR LEVEL 2 – to adjust for non-response or probability of selection
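One common way to see why unequal weights inflate standard errors is Kish's approximate effective sample size, n_eff = (Σw)² / Σw². The sketch below is only an illustration of that approximation with hypothetical weights; it is not part of the original slides.

```python
def kish_effective_n(weights):
    """Kish's approximate effective sample size: (sum of w)^2 / sum of w^2."""
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return total * total / total_sq

# Hypothetical samples of 100 respondents each.
equal = [1.0] * 100                   # self-weighted sample
unequal = [0.5] * 50 + [1.5] * 50     # weights vary by a factor of three

n_eff_equal = kish_effective_n(equal)
n_eff_unequal = kish_effective_n(unequal)
design_effect = len(unequal) / n_eff_unequal
print(n_eff_equal, n_eff_unequal, design_effect)
```

With equal weights the effective sample size equals the actual N. The more the weights vary, the smaller the effective sample size and the larger the standard errors, by roughly the square root of the design effect.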

Data Analysis with Weighted Data

• Should use a statistical procedure that adjusts for the impact of the weights on the standard errors. Standard errors should be based on the actual N and not the weighted N.

• Not available in SPSS. SPSS treats weights incorrectly in inferential statistics

• SVY procedures in Stata.

• Also use of pweight.

• fweight not correct

– Another choice is to not use weights at all for regression models. Instead, include all the variables used to create the weights as independent variables. Results in unbiased estimates and standard errors. [MY PREFERENCE]
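The difference between basing standard errors on the actual N and on the weighted N can be illustrated with a weighted mean. The sketch below is a Python stand-in, not Stata code. It compares a Taylor-linearised standard error, which assumes a single-stage design with no strata or clusters, against the standard error obtained when the same weights are treated as frequency counts. The data and function names are hypothetical.

```python
import math

def weighted_mean(y, w):
    return sum(wi * yi for wi, yi in zip(w, y)) / sum(w)

def linearized_se(y, w):
    """Linearised SE of a weighted mean, based on the actual number of respondents."""
    n = len(y)
    total_w = sum(w)
    m = weighted_mean(y, w)
    ss = sum((wi * (yi - m)) ** 2 for wi, yi in zip(w, y))
    return math.sqrt(n / (n - 1) * ss) / total_w

def frequency_weight_se(y, w):
    """SE when weights are read as frequency counts, so N becomes sum(w)."""
    big_n = sum(w)
    m = weighted_mean(y, w)
    variance = sum(wi * (yi - m) ** 2 for wi, yi in zip(w, y)) / (big_n - 1)
    return math.sqrt(variance / big_n)

# Hypothetical data: 8 respondents whose weights sum to 160.
y = [0, 1, 0, 1, 1, 0, 1, 1]
w = [10, 10, 10, 10, 30, 30, 30, 30]

se_actual_n = linearized_se(y, w)
se_weighted_n = frequency_weight_se(y, w)
print(se_actual_n, se_weighted_n)
```

Both approaches give the same point estimate, but treating the weights as frequencies pretends there are 160 observations instead of 8, so the standard error comes out far too small.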

Macro and Micro: Data Linking

• We collect survey data across a number of countries because we are interested in variation across different cultural contexts.

• Mean differences correlated with country level factors (differences in intercepts)

• Relationship between variables of interest varies across countries (differences in slopes)

• Want to measure country level (contextual) factors so they become explanatory (level 2) variables

• Need to link these country level factors to individual level survey variables.

CCS - Multi-level Data Structure (2 levels)

[Diagram: two-level data structure, with respondents 1…j (or parties 1…i) nested within countries 1…k]

Macro and Micro: Data Linking

Technical issues:

• To link micro and macro data you need a linking identifier that exists as a variable in each data set (e.g. country_number).

• Sort each data set by the country identifier (for example, macro_id) and save it.

• Merge both datasets: merge macro_id using macrodata

• Check the merged data: tabulate the new variable _merge

NOTE: Macro data will have one line of data per country, while survey data will have a line of data for each individual within each country.
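The same many-to-one linking logic can be sketched outside Stata. The Python example below is only an illustration: it attaches one row of country-level data to every respondent from that country and records a `_merge` status that mimics the check described above. The variable names other than `macro_id` and `_merge` (for example `pr_system` and `voted`) and all the data are hypothetical.

```python
from collections import Counter

def link_macro(survey, macro, key="macro_id"):
    """Attach country-level (macro) variables to each survey row."""
    lookup = {row[key]: row for row in macro}
    if len(lookup) != len(macro):
        raise ValueError("macro data must have exactly one row per country")
    merged, matched_keys = [], set()
    for row in survey:
        out = dict(row)
        context = lookup.get(row[key])
        if context is None:
            out["_merge"] = "survey only"   # respondent without country data
        else:
            out.update(context)
            out["_merge"] = "matched"
            matched_keys.add(row[key])
        merged.append(out)
    for country, context in lookup.items():
        if country not in matched_keys:     # country data without respondents
            merged.append({**context, "_merge": "macro only"})
    return merged

# Hypothetical data: one row per respondent, one row per country.
survey_rows = [
    {"macro_id": 1, "resp_id": 1, "voted": 1},
    {"macro_id": 1, "resp_id": 2, "voted": 0},
    {"macro_id": 2, "resp_id": 1, "voted": 1},
    {"macro_id": 3, "resp_id": 1, "voted": 1},  # no macro data for country 3
]
macro_rows = [
    {"macro_id": 1, "pr_system": 1},
    {"macro_id": 2, "pr_system": 0},
    {"macro_id": 4, "pr_system": 1},            # no survey data for country 4
]

merged = link_macro(survey_rows, macro_rows)
merge_counts = Counter(row["_merge"] for row in merged)
print(merge_counts)
```

Tabulating the merge status shows at a glance whether any respondents lack contextual data, or whether any countries in the macro file have no respondents.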

CCS & Multi-level data structure

Contain multiple levels of analysis, with each level consisting of distinct units of analysis.

• Most common: hierarchical data.

• Two-level structure: units from the lowest level of analysis (level-1 units) are nested within units from a higher level of analysis (level-2 units). Data are “clustered”.

– Voters nested within districts

– Voters nested within time

– Panel data

– Time-series cross-sectional data (TSCS)

• Three-level structure: e.g., voters nested within districts nested within countries, or students nested within classes nested within schools


Different ways to deal with clustered data, depending on how one treats between-cluster and within-cluster variation

Pooling: degree to which cluster mean is drawn towards overall mean

• Complete pooling: OLS (doesn't distinguish within- versus between-cluster variation)

• No pooling: fixed-effects estimator (ignores between-cluster variation)

• Between-effects estimator (ignores within-cluster variation; regression of cluster means)

• Partial pooling: random intercept model. Takes information from both the clusters and the overall sample
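The three pooling strategies can be compared for cluster means with a small sketch. It is a simplification: the between-cluster variance (`tau2`) and within-cluster variance (`sigma2`) are treated as known, whereas a random intercept model estimates them from the data, and the overall mean is taken as the simple mean of all observations. The function name and the data are hypothetical.

```python
def pooling_estimates(groups, tau2, sigma2):
    """Compare complete, no and partial pooling estimates of cluster means.

    groups: dict mapping cluster name to a list of responses.
    tau2:   between-cluster variance (assumed known here).
    sigma2: within-cluster variance (assumed known here).
    """
    all_values = [v for values in groups.values() for v in values]
    grand_mean = sum(all_values) / len(all_values)
    results = {}
    for name, values in groups.items():
        n_j = len(values)
        cluster_mean = sum(values) / n_j
        # Share of the estimate taken from the cluster's own data.
        reliability = tau2 / (tau2 + sigma2 / n_j)
        partial = reliability * cluster_mean + (1 - reliability) * grand_mean
        results[name] = {
            "complete": grand_mean,   # every cluster gets the overall mean
            "none": cluster_mean,     # every cluster gets its own mean
            "partial": partial,       # cluster mean shrunk towards overall mean
        }
    return results

# Hypothetical countries: A has 2 respondents, B has 6.
groups = {"A": [2.0, 4.0], "B": [6.0, 6.0, 6.0, 6.0, 6.0, 6.0]}
est = pooling_estimates(groups, tau2=1.0, sigma2=2.0)
print(est["A"], est["B"])
```

The small cluster is pulled further towards the overall mean than the large one, which is the sense in which partial pooling takes information from both the cluster and the overall sample.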


What sort of variation are we looking at across countries (or clusters)?

Heterogeneity in the response

• Factors specific to each cluster may influence the outcome; these factors are shared by observations within each cluster

• Method: random intercept model

Variance – intercept/cluster

Different slopes – what happens if clustering ignored


Rather than just the mean in each country being different, the relationship between two variables of interest could vary across countries. For example, the relationship between gender and political participation may be weaker in countries with more egalitarian values and political structures.

Causal heterogeneity

• When the relationship between X and Y varies across clusters

• How higher-level variables shape lower-level relationships

• Method: random coefficient model

• Prevents cluster-confounding: assuming within-cluster and between-cluster effects are identical, while in fact they aren't

Different patterns

A couple of different approaches:

• Two stage (Shively and Long-Jusko in Political Analysis)

– Predict regression line in each level 2 unit (e.g. constituency)

– And then model the coefficients

• Packages such as MLwiN and HLM

• Stata and SPSS have routines
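The two-stage idea can be sketched as follows. This is a simplified illustration, not the published estimator: the second stage here is plain unweighted OLS, whereas a full two-step analysis would account for the uncertainty in the first-stage coefficients. The data are hypothetical: `x` is gender, `y` is participation and `z` is a country-level egalitarianism score.

```python
def ols_line(x, y):
    """Return (intercept, slope) of a simple least-squares regression of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical level-2 units with four respondents each.
countries = {
    "A": {"z": 0.0, "x": [0, 0, 1, 1], "y": [1.0, 3.0, 4.0, 6.0]},
    "B": {"z": 1.0, "x": [0, 0, 1, 1], "y": [2.0, 4.0, 4.0, 6.0]},
    "C": {"z": 2.0, "x": [0, 0, 1, 1], "y": [3.0, 5.0, 4.0, 6.0]},
}

# Stage 1: estimate the regression line separately within each country.
first_stage = {name: ols_line(d["x"], d["y"]) for name, d in countries.items()}

# Stage 2: model the country-specific slopes with the country-level variable.
z_values = [countries[name]["z"] for name in countries]
slopes = [first_stage[name][1] for name in countries]
stage2_intercept, stage2_slope = ols_line(z_values, slopes)
print(first_stage, stage2_intercept, stage2_slope)
```

In this made-up example the gender gap shrinks as the egalitarianism score rises, so the second-stage slope is negative.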
