The role of the interviewer in producing mode effects: Results from a mixed modes experiment
Steven Hope (University College London)
Pamela Campanelli (Independent Survey Methods Consultant)
Gerry Nicolaas (National Centre for Social Research)
Peter Lynn (University of Essex)
Annette Jäckle (University of Essex)
Alita Nandi (University of Essex)
Possible roles of the survey interviewer:
Positive
Respondent motivation
Reducing task difficulty
Negative
Reducing privacy of interview situation
[Diagram: satisficing and social desirability by mode, comparing face-to-face and telephone (interviewer modes) with web self-completion]
Impact of interviewer motivation (reduction of satisficing)
Primacy and recency on long scales: Less primacy in F2F with showcards vs web; less recency in F2F (no showcards) and tel vs web
Acquiescence on agree/disagree: Less acquiescence in interviewer modes vs web
Middle categories on long scales: Less middle category effect in F2F and tel vs web
Non-differentiation on ranking: Less non-differentiation in F2F (and tel) vs web
Interviewers can help respondents with difficult tasks
Ranking questions produce less item non-response in F2F vs web
End-labelled scales produce less item non-response in F2F and tel vs web
Interviewer presence will lead to respondents giving socially desirable answers to sensitive questions
Higher levels of social desirability in F2F and tel vs web on sensitive Agree/Disagree series
Quantitative results
Omnibus
BHPS (CATI and CAWI only)
BHPS confirms Omnibus
Cognitive interview results
Questions with long answer lists
Less primacy in F2F with showcards vs web
Not supported
Less recency in F2F (no showcards) and tel vs web
Not supported
Virtually no traditional primacy and recency effects found (1 of 24 comparisons)
But CATI primacy effects (positivity bias: see Christian, Dillman and Smyth, 2008; Ye, Fulton, and Tourangeau, 2011) on long and short scales
Other “difficult” formats
End-labelled questions: Higher levels of primacy/recency in web vs f2f and tel
Not supported
Ranking: Higher levels of primacy in web vs f2f
Not supported
Virtually no traditional primacy and recency effects found
CATI primacy effect on end-labelled satisfaction question
8 Non-sensitive Agree/Disagree scales
Acquiescence measured as (1) percent strongly agree and agree and (2) those agreeing to opposite statements
Less acquiescence in interviewer modes vs web
Not supported
More acquiescence in interviewer modes
More apparent in telephone
True for both methods of measuring acquiescence
Some evidence of CATI primacy effects, as BHPS CATI respondents were more likely to pick the first category regardless of the direction of the item (perhaps not hearing the “not” or “rarely”)
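The two acquiescence measures named above can be sketched in a few lines. This is only an illustration under stated assumptions: the 1 to 5 response coding, the function names, and the toy data are all hypothetical and are not taken from the study; N36 and N38 are used as an example of an opposite-statement pair.

```python
from typing import Dict, List, Tuple

# ASSUMED coding of a 5-point agree/disagree item (not given in the slides):
# 1 = strongly agree, 2 = agree, 3 = neither, 4 = disagree, 5 = strongly disagree
AGREE_CODES = {1, 2}


def percent_agree(responses: List[Dict[str, int]], items: List[str]) -> float:
    """Measure 1: percent of all item answers that are 'strongly agree' or 'agree'."""
    answers = [r[item] for r in responses for item in items if item in r]
    if not answers:
        return 0.0
    return 100.0 * sum(a in AGREE_CODES for a in answers) / len(answers)


def percent_agreeing_to_opposites(
    responses: List[Dict[str, int]], pairs: List[Tuple[str, str]]
) -> float:
    """Measure 2: percent of respondents who agree with both statements
    of at least one pair of opposite statements."""
    if not responses:
        return 0.0
    flagged = sum(
        any(r.get(a) in AGREE_CODES and r.get(b) in AGREE_CODES for a, b in pairs)
        for r in responses
    )
    return 100.0 * flagged / len(responses)


# Hypothetical toy data: four respondents answering the opposite pair N36 / N38.
toy = [
    {"N36": 2, "N38": 2},  # agrees with both statements
    {"N36": 4, "N38": 1},
    {"N36": 3, "N38": 5},
    {"N36": 1, "N38": 2},  # agrees with both statements
]

print(percent_agree(toy, ["N36", "N38"]))
print(percent_agreeing_to_opposites(toy, [("N36", "N38")]))
```

Computing both figures separately per mode (CAPI, CATI, CAWI) would give the kind of comparison reported on this slide.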
Long scales
Higher levels of middle category selection in web vs f2f and tel
Not supported for long scales (7-8 category)
Supported for short scales (3 category)
Agree/disagree scale (5 category)
Higher levels of middle category selection in web vs f2f and tel
Unexpected finding
Ranking
Higher levels of non-differentiation in web vs f2f
Supported
End-labelled scales
Higher levels of missing data in web vs f2f and tel for end-labelled scales
Not supported (virtually no missing data)
Ranking
Higher levels of missing data in web vs f2f for ranking task
Not supported (virtually no missing data)
4 Sensitive Agree/Disagree scales
Greater levels of socially desirable responding in interviewer modes vs web for sensitive series of Agree/Disagree questions
Supported
To explore the nature of acquiescence
To learn more about the nature of choosing a category for less than an optimal reason
To understand how respondents understood a ranking task
To delve into non-differentiation in rating
To investigate curious findings on two factual questions in 3 versus 7 or 8 category format
Almost all respondents who had agreed to opposite statements had done so for justifiable reasons.
Example:
N36. Compared to other neighbourhoods, this neighbourhood has more properties that are in a poor state of repair.
N38. Compared to other neighbourhoods, this neighbourhood has more properties that are well kept.
“In this village, . . . it’s like half and half. There is a bit [that] . . . wants doing up and there’s the other part which doesn’t” (Female, no qualifications, very low income, White British)
Two cases problematic
Clear acquiescence (possibly due to cultural politeness – see Javeline, 1999): “I think I don’t understand that, I just say agree”
(Female, no qualifications, low income, Pakistani with poor English)
Possible acquiescence: R had ambivalent feelings; found it hard to choose agree or disagree (Female, CSE / O or A Level, low income, White British)
Other Rs with similar views chose the middle category, thus choice of ‘agree’ could be a type of acquiescence
Some cognitive Rs chose categories for less than optimal reasons
This occurred in the Agree/Disagree format as well as other formats
This occurred for middle categories as well as for other categories
Cognitive interviewing could distinguish between those who chose a category for justifiable reasons and those who appeared to be satisficing
More possible and clear satisficing in CAWI and CATI than CAPI
Examples of possible satisficing:
Chose ‘neither nor’ because not that bothered about the state of repair of properties (Female, higher education below degree, medium income, White British)
Admitted this is not something she thinks about (Female, first degree, high income, White British)
“Is slightly satisfied the middle one? I’ll go for the middle one” (Female, first degree, high income, White British)
Examples of clear satisficing:
“I’ll be truthful, I just answered that, with no thought in my head” (Male, no qualifications, low income, White British)
“To tell you the truth, I just clicked it” (Female, no qualifications, very low income, White British)
“I’m not too sure, I think you have me on that one.” (Male, high school equivalent, on incapacity benefit, White British)
RANKING
None of the Rs did the ranking task correctly
Most had duplicate ranks
Others ranked the first item as 1 and left the rest blank
All confused by the task
But note the findings are partially confounded by the difficulty of the ‘difficult’ survey question
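The ranking errors described above (duplicate ranks, or only the first item ranked) can be detected mechanically. The following is a minimal sketch, assuming each respondent's answer is stored as a list of ranks with `None` for a blank; the function name and the category labels are hypothetical.

```python
from typing import List, Optional


def classify_ranking(ranks: List[Optional[int]]) -> str:
    """Classify one respondent's answer to a ranking task with len(ranks) items.

    None means the item was left blank. A correct answer uses each rank
    from 1 to the number of items exactly once.
    """
    n = len(ranks)
    given = [r for r in ranks if r is not None]
    if not given:
        return "all blank"
    if len(given) == n and sorted(given) == list(range(1, n + 1)):
        return "complete ranking"
    if len(set(given)) < len(given):
        return "duplicate ranks"
    if ranks[0] == 1 and all(r is None for r in ranks[1:]):
        return "first item only"
    return "other"


# Hypothetical answers to a four-item ranking task.
print(classify_ranking([1, 2, 3, 4]))           # a correct ranking
print(classify_ranking([1, 1, 2, 3]))           # the same rank used twice
print(classify_ranking([1, None, None, None]))  # first item ranked, rest blank
```

Tabulating these labels by mode would separate item non-response from ranking tasks that were answered but not answered as intended.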
‘Rating’ as opposed to ‘ranking’ is known to be vulnerable to non-differentiation (Krosnick and Alwin, 1988)
This is because it is . . .
An easier task than ‘ranking’
Set up as battery of questions
Mixed modes experiment found
Non-differentiation in both ranking and rating
Interviewer administration less prone than CAWI
Higher percentage of non-differentiation with rating
Cognitive interviews investigated
Example based on CAWI results (percentage showing non-differentiation)
Question                Rating    Ranking
Children’s game         34.9      13.7
List of improvements    9.3       3.6
Non-differentiation found among cognitive Rs
A majority of Rs showed non-differentiation on 2, 3 or 4 questions of the 4 item subset
But this does not appear to be satisficing
All of these Rs gave clear and justifiable answers!
In cases of non-differentiation, Rs were then asked to
Choose which was most important
Say how easy or difficult that was to do
The majority of respondents answered ‘difficult’
Example:
Interviewer: “Is that an easy or a difficult choice to make?”
Respondent: “Probably quite difficult, really.”
Interviewer: “So what would make that a difficult choice?”
Respondent: “Well, obviously we’d like more parking here in the close, but then, I don’t think it’s really achievable, so then you kind of think, ‘Well, that’s what I’d like in an ideal world,’ but then the next thing would be, I suppose, the schools would be more important, would be important to me. But they’re secondary because they don’t apply to me at the moment, but they will in the future.”
(Female, first degree, high income, White British)
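Non-differentiation in a rating battery can be operationalised in more than one way, and the slides do not say which definition was used. The sketch below shows one possible version under that caveat: it counts how many items in a battery share the respondent's most common rating and flags the respondent when that count reaches a threshold. The function names, the `min_tied` parameter, and the toy data are hypothetical.

```python
from collections import Counter
from typing import List, Optional


def largest_tie(ratings: List[Optional[int]]) -> int:
    """Number of items sharing the most common rating.

    1 means every answered item got a different rating; 0 means nothing was answered.
    """
    answered = [r for r in ratings if r is not None]
    if not answered:
        return 0
    return Counter(answered).most_common(1)[0][1]


def percent_non_differentiating(
    battery_answers: List[List[Optional[int]]], min_tied: int = 2
) -> float:
    """Percent of respondents giving the same rating to at least min_tied items."""
    if not battery_answers:
        return 0.0
    flagged = sum(largest_tie(r) >= min_tied for r in battery_answers)
    return 100.0 * flagged / len(battery_answers)


# Hypothetical ratings of a four-item subset by four respondents.
toy = [
    [5, 5, 5, 5],  # same rating on all 4 items
    [5, 4, 3, 2],  # fully differentiated
    [4, 4, 2, 1],  # same rating on 2 items
    [3, 3, 3, 1],  # same rating on 3 items
]

print(percent_non_differentiating(toy, min_tied=2))
print(percent_non_differentiating(toy, min_tied=4))
```

As the cognitive interviews suggest, a respondent flagged this way may still have clear and justifiable reasons for giving equal ratings, so the flag marks a response pattern and does not by itself show satisficing.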
3 versus 7 or 8 category split ballot experiment
When the longer version is recoded to the categories of the shorter version, we would expect the two to be equivalent. But this was not the case!
True for all question comparisons: 2 satisfaction, 2 nominal factual and 2 ordinal factual questions
Makes sense on satisfaction questions - more Rs chose the middle category in 3-point versus 7-point scale
But very surprising for factual questions - particularly for 2 questions (see next slide)
Given that respondents had been randomly assigned to the two question formats, it would be easy to conclude that such differences were due to randomness (a Type 1 error) rather than being an important finding
Required further investigation using cognitive interviews
7 or 8 Category Version
FM75. Which of these best describes your home? Would you say a . . . (READ OUT) . . .
Detached house 1
Semi-detached house 2
Terraced house 3
Bungalow 4
Flat in a block of flats 5
Flat in a house 6
Maisonette 7
Or other? 8
3 Category Version
FM75. Which of these best describes your home? Would you say a . . . (READ OUT) . . .
House 1
Flat or maisonette 2
Or other? 3
7 or 8 Category Version
FM82. How long have you lived in this area? Would you say . . . READ OUT . . .
Less than 12 months 1
12 months or more but less than 2 years 2
2 years or more but less than 3 years 3
3 years or more but less than 5 years 4
5 years or more but less than 10 years 5
10 years or more but less than 20 years 6
20 years or longer 7
3 Category Version
FM82. How long have you lived in this area? Would you say . . . READ OUT . . .
Less than 3 years 1
3 years or more but less than 10 years 2
10 years or longer 3
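The split ballot comparison depends on recoding the long version to the categories of the short version. A minimal sketch for FM82 follows; the mapping is read off the category labels of the two versions, while the function names and the answer lists are hypothetical toy values, not study data.

```python
from collections import Counter
from typing import Dict, List

# 7-category FM82 codes mapped to 3-category FM82 codes:
# 1-3 (under 3 years) -> 1, 4-5 (3 to under 10 years) -> 2, 6-7 (10 years or longer) -> 3
FM82_7_TO_3 = {1: 1, 2: 1, 3: 1, 4: 2, 5: 2, 6: 3, 7: 3}


def recode(answers: List[int], mapping: Dict[int, int]) -> List[int]:
    """Recode long-version answers to the short-version categories."""
    return [mapping[a] for a in answers]


def distribution(answers: List[int], categories: List[int]) -> Dict[int, float]:
    """Percent of answers falling in each category."""
    counts = Counter(answers)
    total = len(answers)
    return {c: (100.0 * counts[c] / total if total else 0.0) for c in categories}


# Hypothetical toy answers from the two randomly assigned groups.
long_form = [1, 3, 4, 5, 6, 7, 7, 2]   # answered the 7-category version
short_form = [1, 2, 2, 3, 3, 3, 2, 1]  # answered the 3-category version

recoded = distribution(recode(long_form, FM82_7_TO_3), [1, 2, 3])
direct = distribution(short_form, [1, 2, 3])

# Percentage-point difference per category (recoded long version minus short version).
diff = {c: recoded[c] - direct[c] for c in [1, 2, 3]}
print(recoded, direct, diff)
```

With random assignment the two distributions should differ only by chance, so a formal test (for example a chi-square test) would normally follow this step; it is left out here to keep the sketch short.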
For the cognitive interviewing,
Rs were asked the 3 category version of the questions as part of the survey questions
Later, Rs were presented with a showcard with the more detailed categories (without being reminded of their original survey answer)
Cognitive interviewers were to probe any inconsistencies
None of the 12 respondents were inconsistent
But it was found that both the ‘dwelling’ and ‘years lived in area’ questions were confusing.
“What the hell difference is there between a maisonette and a flat and a block of flats, a flat and a house?” (Male, postgraduate degree, high income, White British)
Regarding a maisonette. Household member: “I’ll call it a duplex, yeah.” Respondent: “Well, it’s what they call it in the South.” (Male, postgraduate degree, high income, White British)
R answered ‘flat,’ but the interviewer observed it as a semi-detached house. R said it had to be very large to be called a house (Female, higher education below degree level, medium income, other ethnicity)
Years lived in area
Difficulty in remembering the number of years
Feeling stuck between two categories
Feeling the short version was much simpler
There were differences between interviewer and self-completion modes:
Levels of satisficing
Socially desirable responding
Satisficing behaviour differed by mode and question format
Findings for Omnibus and BHPS were very similar
Thank you
Evidence found in:
Long and short scales
Agree/Disagree scales
End-labelled satisfaction scale
25 Rs given ‘alternative’ versions of the same questions in CATI and CAWI.
Rs who had chosen more first category responses in CATI than CAWI were compared to the others.
But nothing notable to distinguish their answers
Those choosing satisfied were happy with facility/service or mentioned minor problems.
Those who had chosen dissatisfied had clear complaints