Rss Oct 2011 Mixed Modes Pres4

The role of the interviewer in producing mode effects: Results from a mixed modes experiment

Steven Hope (University College London)

Pamela Campanelli (Independent Survey Methods Consultant)

Gerry Nicolaas (National Centre for Social Research)

Peter Lynn (University of Essex)

Annette Jäckle (University of Essex)

Alita Nandi (University of Essex)

Possible roles of the survey interviewer:

Positive

Respondent motivation

Reducing task difficulty

Negative

Reducing privacy of interview situation

Satisficing

Social desirability

Face-to-face

Telephone

Web self-completion

Interviewer modes

Impact of interviewer motivation (reduction of satisficing)

Primacy and recency on long scales:

Less primacy in F2F with showcards vs web

Less recency in F2F (no showcards) and tel vs web

Acquiescence on agree/disagree: Less acquiescence in interviewer modes vs web

Middle categories on long scales: Less middle category effect in F2F and tel vs web

Non-differentiation on ranking: Less non-differentiation in F2F (and tel) vs web

Interviewers can help respondents with difficult tasks

Ranking questions produce less item non-response in F2F vs web

End-labelled scales produce less item non-response in F2F and tel vs web

Interviewer presence will lead to respondents giving socially desirable answers to sensitive questions

Higher levels of social desirability in F2F and tel vs web on sensitive Agree/Disagree series

Quantitative results

Omnibus

BHPS (CATI and CAWI only)

BHPS confirms Omnibus

Cognitive interview results

Questions with long answer lists: Less primacy in F2F with showcards vs web

Not supported

Less recency in F2F (no showcards) and tel vs web

Not supported

Virtually no traditional primacy and recency effects found (1 of 24 comparisons)

But CATI primacy effects (positivity bias – see Christian, Dillman and Smyth, 2008; Ye, Fulton, and Tourangeau, 2011) on long and short scales

Other “difficult” formats

End-labelled questions: Higher levels of primacy/recency in web vs f2f and tel

Not supported

Ranking: Higher levels of primacy in web vs f2f

Not supported

Virtually no traditional primacy and recency effects found

CATI primacy effect on end-labelled satisfaction question

8 Non-sensitive Agree/Disagree scales. Acquiescence measured in two ways:

(1) percent strongly agree and agree

(2) those agreeing to opposite statements

Less acquiescence in interviewer modes vs web

Not supported

More acquiescence in interview modes

More apparent in telephone

True for both methods of measuring acquiescence

Some evidence of CATI primacy effects, as BHPS CATI respondents were more likely to pick the first category regardless of the direction of the item (perhaps not hearing the “not” or “rarely”)
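The two acquiescence measures named on this slide could be computed along the following lines. This is only a minimal sketch: the 1 to 5 response coding, the function names, and the four sample respondents are assumptions made for illustration, and N36/N38 are used as the opposing pair simply because they appear as an example later in the deck.

```python
from typing import Dict, List, Tuple

# Assumed coding: 1 = strongly agree, 2 = agree, 3 = neither,
# 4 = disagree, 5 = strongly disagree.
AGREE_CODES = {1, 2}


def percent_agreeing(responses: List[Dict[str, int]], items: List[str]) -> float:
    """Measure (1): percent of item responses that are 'strongly agree' or 'agree'."""
    answers = [r[item] for r in responses for item in items if item in r]
    if not answers:
        return 0.0
    return 100.0 * sum(a in AGREE_CODES for a in answers) / len(answers)


def percent_agreeing_to_opposites(
    responses: List[Dict[str, int]], pairs: List[Tuple[str, str]]
) -> float:
    """Measure (2): percent of respondents agreeing with both statements
    of at least one pair of opposite statements."""
    if not responses:
        return 0.0
    hits = sum(
        any(r.get(a) in AGREE_CODES and r.get(b) in AGREE_CODES for a, b in pairs)
        for r in responses
    )
    return 100.0 * hits / len(responses)


# Hypothetical respondents (not data from the study).
sample = [
    {"N36": 2, "N38": 1},  # agrees with both opposite statements
    {"N36": 4, "N38": 2},
    {"N36": 3, "N38": 3},
    {"N36": 1, "N38": 5},
]

print(percent_agreeing(sample, ["N36", "N38"]))                 # 50.0
print(percent_agreeing_to_opposites(sample, [("N36", "N38")]))  # 25.0
```

Note that measure (2) flags a respondent as soon as one opposing pair is agreed to; whether the study counted pairs or respondents is not stated here, so that choice is also an assumption.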

Long scales: Higher levels of middle category selection in web vs f2f and tel

Not supported for long scales (7-8 category)

Supported for short scales (3 category)

Agree / disagree scale (5 category): Higher levels of middle category selection in web vs f2f and tel

Unexpected finding

Ranking: Higher levels of non-differentiation in web vs f2f

Supported

End-labelled scales: Higher levels of missing data in web vs f2f and tel

Not supported (virtually no missing data)

Ranking: Higher levels of missing data in web vs f2f

Not supported (virtually no missing data)

4 Sensitive Agree/Disagree scales: Greater levels of socially desirable responding in interviewer modes vs web

Supported

To explore the nature of acquiescence

To learn more about the nature of choosing a category for less than an optimal reason

To understand how respondents understood a ranking task

To delve into non-differentiation in rating

To investigate curious findings on two factual questions in 3 versus 7 or 8 category format

Almost all respondents who had agreed to opposite statements had done so for justifiable reasons.

Example:

N36. Compared to other neighbourhoods, this neighbourhood has more properties that are in a poor state of repair.

N38. Compared to other neighbourhoods, this neighbourhood has more properties that are well kept.

“In this village, . . . it’s like half and half. There is a bit [that] . . . wants doing up and there’s the other part which doesn’t” (Female, no qualifications, very low income, White British)

Two cases problematic

Clear acquiescence (possibly due to cultural politeness – see Javeline, 1999): “I think I don’t understand that, I just say agree”

(Female, no qualifications, low income, Pakistani with poor English)

Possible acquiescence: R had ambivalent feelings; found it hard to choose agree or disagree (Female, CSE / O or A Level, low income, White British)

Other Rs with similar views chose the middle category, thus choice of ‘agree’ could be a type of acquiescence

Some cognitive Rs choose categories for less than optimal reasons

This occurred in the Agree/Disagree format as well as other formats

This occurred for middle categories as well as for other categories

Cognitive interviewing could distinguish between those who chose a category for justifiable reasons and those who appeared to be satisficing

More possible and clear satisficing in CAWI and CATI than CAPI

Examples of possible satisficing:

Chose ‘neither nor’ because not that bothered about the state of repair of properties (Female, higher education below degree, medium income, White British)

Admitted this is not something she thinks about (Female, first degree, high income, White British)

“Is slightly satisfied the middle one? I’ll go for the middle one” (Female, first degree, high income, White British)

Examples of clear satisficing:

“I’ll be truthful, I just answered that, with no thought in my head” (Male, no qualifications, low income, White British)

“To tell you the truth, I just clicked it” (Female, no qualifications, very low income, White British)

“I’m not too sure, I think you have me on that one.” (Male, high school equivalent, on incapacity benefit, White British)

RANKING

None of the Rs did the ranking task correctly

Most had duplicate ranks

Others ranked the first item as 1 and left the rest blank

All confused by the task

But note the findings are partially confounded by the difficulty of the ‘difficult’ survey question
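The ranking problems described above (duplicate ranks, or only the first item ranked) can be detected mechanically. The sketch below is purely illustrative: the function name, the category labels, and the convention that a blank is stored as `None` are assumptions, not part of the study's processing.

```python
from typing import List, Optional


def classify_ranking(ranks: List[Optional[int]]) -> str:
    """Classify one respondent's answer to a ranking task.

    `ranks` holds the rank given to each item, in item order;
    None marks an item left blank (assumed convention).
    """
    n = len(ranks)
    given = [r for r in ranks if r is not None]
    if len(set(given)) < len(given):
        return "duplicate ranks"  # same rank used for more than one item
    if len(given) < n:
        return "incomplete"       # some items left blank
    if sorted(given) != list(range(1, n + 1)):
        return "out of range"     # ranks are not exactly 1..n
    return "valid"


print(classify_ranking([1, 2, 3, 4]))           # valid
print(classify_ranking([1, 1, 2, 3]))           # duplicate ranks
print(classify_ranking([1, None, None, None]))  # incomplete
```

Duplicates are checked before blanks so that an answer showing both problems is counted as a duplicate; the opposite ordering would be an equally defensible choice.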

‘Rating’ as opposed to ‘ranking’ is known to be vulnerable to non-differentiation (Krosnick and Alwin, 1988)

This is because it is . . .

An easier task than ‘ranking’

Set up as battery of questions

Mixed modes experiment found:

Non-differentiation in both ranking and rating

Interviewer administration less prone than CAWI

Higher percentage of non-differentiation with rating

Cognitive interviews investigated

Example based on CAWI results (percentage non-differentiating)

                        Rating    Ranking

Children’s game         34.9      13.7

List of improvements    9.3       3.6
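A percentage of this kind could be derived from a rating battery roughly as follows. This is a sketch under a stated assumption: a respondent is treated as non-differentiating when every item in the battery receives the same rating. The deck does not spell out its operational definition, and the example batteries are invented.

```python
from typing import List


def is_non_differentiated(ratings: List[int]) -> bool:
    """True when all items in the battery received the same rating
    (assumed definition of non-differentiation)."""
    return len(ratings) > 1 and len(set(ratings)) == 1


def percent_non_differentiated(batteries: List[List[int]]) -> float:
    """Percent of respondents whose battery shows no differentiation."""
    if not batteries:
        return 0.0
    flagged = sum(is_non_differentiated(b) for b in batteries)
    return 100.0 * flagged / len(batteries)


# Hypothetical ratings, one list per respondent (not study data).
example_batteries = [
    [3, 3, 3, 3],  # same rating throughout
    [1, 2, 4, 5],
    [5, 5, 5, 5],  # same rating throughout
    [2, 2, 3, 2],
]

print(percent_non_differentiated(example_batteries))  # 50.0
```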

Non-differentiation found among cognitive Rs

A majority of Rs showed non-differentiation on 2, 3 or 4 questions of 4 item subset

But this does not appear to be satisficing

All of these Rs gave clear and justifiable answers!

In cases of non-differentiation, Rs were then asked to:

Choose which was most important

Say how easy or difficult that was to do

The majority of respondents answered 'difficult'

Example:

Interviewer: “Is that an easy or a difficult choice to make?”

Respondent: “Probably quite difficult, really.”

Interviewer: “So what would make that a difficult choice?”

Respondent: “Well, obviously we’d like more parking here in the close, but then, I don’t think it’s really achievable, so then you kind of think, ‘Well, that’s what I’d like in an ideal world,’ but then the next thing would be, I suppose, the schools would be more important, would be important to me. But they’re secondary because they don’t apply to me at the moment, but they will in the future.”

(Female, first degree, high income, White British)

3 versus 7 or 8 category split ballot experiment

When the longer version is recoded to the categories of the shorter version, one would expect the two distributions to be equivalent. But this was not the case!

True for all question comparisons: 2 satisfaction, 2 nominal factual and 2 ordinal factual questions

Makes sense on satisfaction questions - more Rs chose the middle category in 3-point versus 7-point scale

But very surprising for factual questions - particularly for 2 questions (see next slide)

Given that respondents had been randomly assigned to the two question formats, it would be easy to conclude that such differences were due to randomness (a Type 1 error) rather than being an important finding

Required further investigation using cognitive interviews
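One conventional way to judge whether a difference between two randomly assigned formats exceeds chance is a Pearson chi-square test of homogeneity on the recoded counts. The deck does not say which test was used, so the sketch below is an assumption, and the counts in it are invented for illustration only.

```python
from typing import List, Tuple


def chi_square_homogeneity(table: List[List[int]]) -> Tuple[float, int]:
    """Pearson chi-square statistic and degrees of freedom.

    Rows are question formats, columns are answer categories.
    Cells with an expected count of zero are skipped.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            if expected > 0:
                stat += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(col_totals) - 1)
    return stat, dof


# Hypothetical counts (NOT results from the experiment):
# row 0 = 3-category version, row 1 = longer version recoded to 3 categories.
counts = [
    [60, 30, 10],
    [40, 30, 30],
]

stat, dof = chi_square_homogeneity(counts)
print(stat, dof)  # 14.0 2

# With 2 degrees of freedom the 5% critical value is about 5.991,
# so these made-up counts would differ by more than chance alone suggests.
```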

7 or 8 Category Versions and 3 Category Versions (for each question, the longer version is listed first, then the shorter version)

FM75. Which of these best describes your home? Would you say a . . . (READ OUT) . . .

Detached house 1

Semi-detached house 2

Terraced house 3

Bungalow 4

Flat in a block of flats 5

Flat in a house 6

Maisonette 7

Or other? 8

FM75. Which of these best describes your home?

Would you say a . . . (READ OUT) . . .

House 1

Flat or maisonette 2

Or other? 3

FM82. How long have you lived in this area?

Would you say . . . READ OUT . . .

Less than 12 months 1

12 months or more but less than 2 years 2

2 years or more but less than 3 years 3




20 years or longer 7

FM82. How long have you lived in this area?

Would you say . . . READ OUT . . .

Less than 3 years 1


10 years or longer 3
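Recoding the longer FM75 version into the categories of the shorter one, as the split ballot comparison requires, might look like the sketch below. The mapping is an assumption: in particular, treating "Bungalow" as "House" is a guess, since the slides do not show the recoding rules. The answers are hypothetical. FM82 is left out because its middle categories are not reproduced here.

```python
from collections import Counter
from typing import Dict, List

# Assumed mapping from the 8-category FM75 codes to the 3-category codes.
# 1 = House, 2 = Flat or maisonette, 3 = Other.
LONG_TO_SHORT_FM75: Dict[int, int] = {
    1: 1,  # Detached house
    2: 1,  # Semi-detached house
    3: 1,  # Terraced house
    4: 1,  # Bungalow (assumed to count as a house)
    5: 2,  # Flat in a block of flats
    6: 2,  # Flat in a house
    7: 2,  # Maisonette
    8: 3,  # Other
}


def recode(answers: List[int], mapping: Dict[int, int]) -> List[int]:
    """Collapse long-version codes into short-version codes."""
    return [mapping[a] for a in answers]


def distribution(codes: List[int]) -> Dict[int, float]:
    """Percentage of answers falling in each category."""
    counts = Counter(codes)
    return {code: 100.0 * counts[code] / len(codes) for code in sorted(counts)}


# Hypothetical long-version answers, one of each category.
long_answers = [1, 2, 3, 4, 5, 6, 7, 8]
short_equivalent = recode(long_answers, LONG_TO_SHORT_FM75)

print(short_equivalent)                # [1, 1, 1, 1, 2, 2, 2, 3]
print(distribution(short_equivalent))  # {1: 50.0, 2: 37.5, 3: 12.5}
```

The resulting distribution can then be set beside the distribution from respondents who were asked the 3-category version directly.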

For the Cognitive Interviewing:

Rs were asked the 3 category version of the questions as part of the survey questions

Later, Rs were presented with a showcard with the more detailed categories (without reminding them of their original survey answer)

Cognitive interviewers were to probe any inconsistencies

None of the 12 respondents were inconsistent

But it was found that both the ‘dwelling’ and ‘years lived in area’ questions were confusing.

“What the hell difference is there between a maisonette and a flat and a block of flats, a flat and a house?” (Male, postgraduate degree, high income, White British)

Regarding a maisonette. Household member: “I’ll call it a duplex, yeah.” Respondent: “Well, it’s what they call it in the South.” (Male, postgraduate degree, high income, White British)

R answered ‘flat,’ but the interviewer observed it as semi-detached house. R said it had to be very large to be called a house (Female, higher education below degree level, medium income, other ethnicity)

Years lived in area:

Difficulty in remembering the number of years

Feeling stuck between two categories

Feeling the short version was much simpler

There were differences between interviewer and self-completion modes:

Levels of satisficing

Socially desirable responding

Satisficing behaviour differed by mode and question format

Findings for Omnibus and BHPS were very similar

Thank you

Evidence found in:

Long and short scales

Agree/Disagree scales

End-labelled satisfaction scale

25 Rs given ‘alternative’ versions of the same questions in CATI and CAWI.

Rs who had chosen more first category responses in CATI than CAWI were compared to the others.

But nothing notable to distinguish their answers

Those choosing satisfied were happy with facility/service or mentioned minor problems.

Those who had chosen dissatisfied had clear complaints
