1
Class 9
Interpreting Pretest Data, Considerations in Modifying or Adapting Measures
November 13, 2008
Anita L. Stewart
Institute for Health & Aging
University of California, San Francisco
2
Overview of Class 9
Analyzing pretest data
Modifying/adapting measures
Keeping track of your study measures
Creating and testing scales in your sample
3
Summarize Data on Pretest Interviews
Summarize problems and nature of problems for each item
Determine how important problems are
Results become basis for possible revisions/adaptations
4
Methods of Analysis
Optimal: transcripts of all pretest interviews
For each item - summarize all problems
Analyze dialogue (narrative) for clues to solve problems
5
Behavioral Coding
Systematic approach to identifying problems with items
– “interviewer” and “respondent” problems
Can code problems based on:
– Standard administration
– Responses to specific probes
6
Examples of Interviewer “Behaviors” Indicating Problem Items
Question misread or altered
– Slight change – meaning not affected
– Major change – alters meaning
Question skipped
7
Examples of Respondent “Behaviors” Indicating Problem Items
Asks for clarification or repeat of question
Did not understand question
Doesn’t know the answer
Qualified answer (e.g., it depends)
Indicates answer falls between existing response choices
Refusal
8
Summarize Behavioral Coding For Each Item
Proportion of interviews (respondents) with each problematic behavior
# of occurrences of problem divided by N
– 7/48 respondents requested clarification
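The proportion described above can be sketched in a few lines. This is only an illustration: the data layout (one list of (item, behavior) codes per interview) and the function name are assumptions, not part of the course materials.

```python
from collections import Counter

def behavior_proportions(coded_interviews, n_interviews):
    """Return {(item, behavior): proportion of interviews showing it}."""
    counts = Counter()
    for interview in coded_interviews:
        # Count each (item, behavior) pair at most once per interview.
        for pair in set(interview):
            counts[pair] += 1
    return {pair: c / n_interviews for pair, c in counts.items()}

# Hypothetical coded data: one list of (item number, behavior) per interview.
interviews = [
    [(1, "clarification")],
    [(1, "clarification"), (3, "repeat")],
    [],
    [],
]
props = behavior_proportions(interviews, n_interviews=len(interviews))
```

Each value is the number of interviews with the behavior divided by N, so the denominator stays the same for every item.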
9
Behavioral Coding Summary Sheet: Standard Administration (N=20)
Item #   Interviewer:         Subject:           Subject:
         difficulty reading   asks to repeat Q   asks for clarification
1        2/20                 0                  1/20
2        0                    0                  0
3        1/20                 3/20               2/20
4        0                    1/20               0
10
Can Identify Problems Even When No Problem “Behaviors” Found
Respondents appear to answer question appropriately
Additional problems identified with probes
– Probe on meaning: Response indicates lack of understanding
– Probe on use of response options: Response indicates options are problematic
11
Behavioral Coding of Probe Results
I asked you how often doctors asked you about your health beliefs. What does the term “health beliefs” mean to you?
Behavioral coding: # of times the response indicated the term was not understood as intended
– e.g., 2/15 respondents did not understand meaning based on response to probe
12
Behavioral Coding Summary: Standard Administration (N=20) + Probes (N=10)
Item #   Probe   Meaning   Interviewer:         Subject:           Subject:
                 unclear   difficulty reading   asks to repeat Q   asks for clarification
1        10      2/10      2/20                 0                  1/20
2        0       0         0                    0                  0
3        10      4/15      1/20                 3/20               2/20
4        10      0         0                    1/20               0
13
Interpret Behavioral Coding Results
Determine if problems are common
– Items with only a few problems may be fine
Quantifying “common” problems
– several types of problems (many row entries)
– several subjects experienced a problem
» problem w/item identified in >15% of interviews
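As a rough sketch of the >15% rule, the snippet below flags items where any single problem behavior appears in more than 15% of interviews. The counts and the function name are hypothetical, and applying the threshold per behavior (rather than per item overall) is an assumption.

```python
def flag_common_problems(problem_counts, n_interviews, threshold=0.15):
    """Return item numbers where any behavior exceeds the threshold share."""
    flagged = []
    for item, counts in problem_counts.items():
        if any(c / n_interviews > threshold for c in counts.values()):
            flagged.append(item)
    return sorted(flagged)

# Hypothetical counts of interviews showing each behavior, N = 20.
counts = {
    1: {"clarification": 4},   # 20% of interviews
    2: {"clarification": 1},   # 5%
    3: {"repeat": 3},          # exactly 15%, not above the threshold
}
common = flag_common_problems(counts, n_interviews=20)
```

Items flagged this way are only candidates; the next step is to judge how serious each problem is.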
14
Continue Analyzing Items with “Common” Problems
Identify “serious” common problems
– Gross misunderstanding of the question
– Yields completely erroneous answer
– Couldn’t answer the question at all
Some less serious problems can be addressed by improved instructions or a slight modification
15
Addressing More Serious Problems
Conduct content analysis of transcript
– Use qualitative analysis software (e.g., NVIVO)
For these items: review dialogue that ensued during administration of item and probes
– can reveal source of problems
– can help in deciding whether to keep, modify or drop items
16
Results: Probing Meaning of Phrase
I asked you how often doctors asked you about your health beliefs. What does the term ‘health beliefs’ mean to you?
“.. I don’t want medicine”
“.. How I feel, if I was exercising…”
“.. Like religion? --not believing in going to doctors?”
17
Results: Probing Meaning of a Phrase
What does the phrase “office staff” mean to you?
“the receptionist and the nurses”
“nurses and appointment people”
“the person who takes your blood pressure and the clerk in the front office”
18
Results: Probing Meaning of Phrase
On about how many of the past 7 days did you eat foods that are high in fiber, like whole grains, raw fruits, and raw vegetables?
– Probe: what does the term “high fiber” mean to you?
Behavioral coding of item
– Over half of respondents exhibited a problem
Review answers to probe
– Over ¼ did not understand the term
Blixt S et al., Proceedings of section on survey research methods, American Statistical Association, 1993:1442.
19
Results: No Behavior Coding Issues but Probe Detected Problems
I seem to get sick a little easier than other people (definitely true, mostly true, mostly false, definitely false)
Behavioral coding of item
– Very few problems
Review answers to probe
– Almost 3/4 had comprehension problems
– Most problems around term “mostly” (either it’s true or it’s not)
Blixt S et al., Proceedings of section on survey research methods, American Statistical Association, 1993:1442.
20
Results: Beck Depression Inventory (BDI) and Literacy
Cognitive interviews: older adults, oncology pts, and less educated adults
Administered REALM (reading literacy test) and some selected BDI items
Asked to paraphrase items
TL Sentell, Community Mental Health Journal, 2003;39:323
21
Results: Beck Depression Inventory (BDI) and Literacy (cont)
Depending on the item, 0-62% of respondents correctly paraphrased it
Most misunderstandings: vocabulary confusion
Phrase: I am critical of myself for my weaknesses and mistakes
– “Critical is when you’re very sick”
– “I don’t know how to explain mistakes”
22
Interpreting Pretest Results of Self-Administered Questionnaires
Missing data is a clue to problematic items
– More missing data associated with unclear, difficult, or irrelevant items
– Cognitive interviewing can help determine reasons for missing data
23
How Missing Data Prevalence Helps
Items with large percent of responses missing – clue to problem
In H-CAHPS® pretest: Did hospital staff talk with you about whether you would have the help you needed when you left the hospital?
– 35% missing for Spanish group
– 29% missing for English group
MP Hurtado et al. Health Serv Res, 2005;40-6, Part II:2140-2161
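A percent-missing figure like the ones above could be computed per item and per group along these lines. The responses below are invented for illustration (they are not the H-CAHPS data), and treating `None` as a blank answer is an assumption about how the data are stored.

```python
def percent_missing(responses):
    """Percent of responses that are missing (None) for one item."""
    missing = sum(1 for r in responses if r is None)
    return 100.0 * missing / len(responses)

# Hypothetical answers to a single item, grouped by interview language.
by_group = {
    "Spanish": [1, None, 2, None, 1, 2, 1, None, 2, 1],
    "English": [1, 2, None, 1, 2, 1, 2, 1, None, 1],
}
missing_by_group = {g: percent_missing(r) for g, r in by_group.items()}
```

Running the same function over every item gives a quick list of candidates for follow-up cognitive interviewing.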
24
Exploring Differences by Diverse Groups
Back to issue of “equivalence” of meaning across groups
All cognitive interview analyses can be done separately by group
25
Results: Use of Response Scale
Do diverse groups use the response scale in similar ways?
Re questions about cultural competence of providers
– Interviewers reported that Asian respondents who were completely satisfied did not like to use the highest score on the rating scale
California Pan-Ethnic Health Network (CPEHN) Report, 2001
26
Results: Use of Response Scale (cont)
Behavioral Risk Factor Surveillance System (BRFSS) pretesting
Found that Puerto Rican, Mexican American, and African American respondents more likely to choose extreme response categories than Whites.
RB Warnecke et al, Ann Epidemiol, 1997:7:334-342
27
Differential Use of CAHPS® 0-10 Global Rating Scale
Compared Medicaid and commercially insured adults on use of scale
Medicaid enrollees more likely than commercial participants to use extreme ends of scale
– All other things being equal
PC Damiano et al, Health Serv Outcomes Res Method, 2004:5:193-205
28
Results: Probe on Difficulty: CES-D Item
“During the past week, how often have you felt that you could not shake off the blues, even with help from family and friends?”
Probe: Do you feel this is a question that people would or would not have difficulty understanding?
– Latinos more likely than other groups to report people would have difficulty
TP Johnson, Health Survey Research Methods, 1996
29
Overview of Class 9
Analyzing pretest data
Modifying/adapting measures
Keeping track of your study measures
Creating and testing scales in your sample
30
Now What!
Issues in adapting measures based on pretest results
Cognitive interview pretesting during development phases of measure
– Can modify items and continue pretesting
Cognitive interview pretesting prior to using published survey:
– More problematic
31
Modification: Probing the Meaning of a Phrase
What does the phrase “office staff” mean to you?
“the receptionist and the nurses”
“nurses and appointment people”
“the person who takes your blood pressure and the clerk in the front office”
We changed the question to “receptionist and appointment staff”
32
Results: Probing Meaning and Cultural Appropriateness
I asked you how often doctors asked you about your health beliefs? What does the term ‘health beliefs’ mean to you?
“.. I don’t want medicine”
“.. How I feel, if I was exercising…”
“.. Like religion? --not believing in going to doctors?”
We changed the question to “personal beliefs about your health”
33
Criteria for Whether or Not to Modify Measure
Contact author
– May be open to modifications, working with you
Be sure your opinion is based on extensive pretests with consistent problems
– Don’t rely on a few comments in a small pretest
Work with a measurement specialist to ensure that proposed modifications are likely to solve problem
34
Tradeoffs of Using Adapted Measures
Advantages
– Improve internal validity
Disadvantages
– Lose external validity
– Know less about modified measure
– Need to defend new measure
35
Adding New (Modified) Items
One approach if you find serious problems with a standard measure
– Write new items you think will be better (use same format)
– Retain original intact items and add modified items
Can test original scale and revised scale with modified items instead of original items
36
Modifying response categories
If response choices are too few and/or coarse, can improve without compromising too much
– Try adding levels within existing response scale
37
One Modification: Too Many Response Choices
SF36 version 1
1 - All of the time
2 - Most of the time
3 - A good bit of the time
4 - Some of the time
5 - A little of the time
6 - None of the time
SF36 version 2
1 - All of the time
2 - Most of the time
3 - Some of the time
4 - A little of the time
5 - None of the time
38
Modification of Health Perceptions Response Choices for Thai Translation
Usual responses:
1 - Definitely true
2 - Mostly true
3 - Don’t know
4 - Mostly false
5 - Definitely false
Modified:
1 - Not at all true
2 - A little true
3 - Somewhat true
4 - Mostly true
5 - Definitely true
e.g., My health is excellent, I expect my health to get worse
39
Modifying Item Stems
If item wording will not be clear to your population
– Can add parenthetical phrases
Have you ever been told by a doctor that you have diabetes (high blood sugar)?
40
Strategy for Modified Measures
Test measure in original and adapted form
Choose measure that performs the best
41
Analyzing New (Modified) Measure
Factor analysis
– All original items
– Original plus new items replacing original
Correlations with other variables
– Does the new measure detect stronger associations?
Outcome measure
– Does the new measure detect more change over time?
42
Analytic Strategy: CAHPS® 0-10 Global Rating Scale: Response
Usual classifications 0-9, 10 (dichotomy)
Proposed classification 0-8, 9-10
PC Damiano et al, Health Serv Outcomes Res Method, 2004:5:193-205
Can’t change the scale – part of standardized survey
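Because the response scale itself stays fixed, the two classifications differ only in where the analysis cuts the 0-10 rating. A minimal sketch, with hypothetical ratings and a hypothetical helper name:

```python
def dichotomize(rating, cutoff):
    """Return 1 if the 0-10 rating is at or above the cutoff, else 0."""
    if not 0 <= rating <= 10:
        raise ValueError("rating must be on the 0-10 scale")
    return int(rating >= cutoff)

ratings = [10, 9, 8, 5, 10, 9]  # hypothetical global ratings

usual = [dichotomize(r, cutoff=10) for r in ratings]     # 0-9 vs. 10
proposed = [dichotomize(r, cutoff=9) for r in ratings]   # 0-8 vs. 9-10
```

The raw ratings are kept as collected; only the derived variable changes, so both versions can be reported side by side.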
43
Overview of Class 9
Analyzing pretest data
Modifying/adapting measures
Keeping track of your study measures
Creating and testing scales in your sample
44
Questionnaire Guides
Organizing your survey measures
– Keep track of measurement decisions
Sample guide to measures (last week)
– Documents sources of measures
– Any modifications, reason for modification
45
“Sample Guide to Measures” Handout
Type of variable
Concept
Measure
Data source
Number of items/survey question numbers
Number of scores or scales for each measure
References
46
Sample “Summary of Survey Variables..” Handout
Develop “codebook” of scoring rules
Several purposes
– Variable list
– Meaning of scores (direction of high score)
– Special coding
– How missing data handled
– Type of variable (helps in analyses)
47
Item Naming Conventions
Optimal coding is to assign raw items their questionnaire number
– Can always link back to questionnaire easily
Some people assign a variable name to the questionnaire item
– This will drive you crazy
48
Variable Naming Conventions
Assigning variable names is an important step
– make them as meaningful as possible
– plan them for all questionnaires at the beginning
For study with more than one source of data, a suffix can indicate which point in time and which questionnaire
– B for baseline, 6 for 6-month, Y for one year
– M for medical history, L for lab tests
49
Variable Naming Conventions (cont)
Medical History Questionnaire
HYPERTMB   HYPERTM6
Baseline   6 months
50
Variable Naming Conventions (cont)
A prefix can help sort variable groupings alphabetically
– e.g., S for symptoms
SPAINB, SFATIGB, SSOBB
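One way to keep such names consistent is to build them from their parts instead of typing them by hand. The helper below is a hypothetical sketch; the stems and the order of prefix, stem, questionnaire suffix, and time-point suffix are inferred from the examples on these slides.

```python
def variable_name(stem, source="", timepoint="", prefix=""):
    """Compose prefix + stem + questionnaire suffix + time-point suffix."""
    return f"{prefix}{stem}{source}{timepoint}".upper()

# Hypertension item, Medical history questionnaire, Baseline and 6 months.
hypert_baseline = variable_name("HYPERT", source="M", timepoint="B")
hypert_6month = variable_name("HYPERT", source="M", timepoint="6")

# Symptom items share the S prefix so they sort together.
pain_baseline = variable_name("PAIN", timepoint="B", prefix="S")
```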
51
Overview of Class 9
Analyzing pretest data
Modifying/adapting measures
Keeping track of your study measures
Creating and testing scales in your sample
52
On to Your Field Test or Study
What to do once you have your baseline data
How to create summated scale scores
53
Preparing Surveys for Data Entry: 4 Steps
Review surveys for data quality
Reclaim missing and ambiguous data
Address ambiguities in the questionnaire prior to data entry
Code open-ended items
54
Review Surveys for Data Quality
Examine each survey in detail as soon as it is returned, and mark any..
– Missing data
– Inconsistent or ambiguous answers
– Skip patterns that were not followed
55
Reclaim Missing and Ambiguous Data
Go over problems with respondent
– If survey returned in person, review then
– If mailed, call respondent ASAP, go over missing and ambiguous answers
– If you cannot reach by telephone, make a copy for your files and mail back the survey with request to clarify missing data
56
Address Ambiguities in the Questionnaire Prior to Data Entry
When two choices are circled for one question, randomly choose one (flip a coin)
Clarify entries that might not be clear to data entry person
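The coin flip can also be done in software so the choice is documented and repeatable. A small sketch, assuming responses are stored as numeric codes; the seed value and function name are arbitrary choices for illustration.

```python
import random

def resolve_double_circle(circled, rng=random):
    """Pick one of the circled response codes at random."""
    # Sort first so a seeded generator gives the same pick on every run.
    return rng.choice(sorted(circled))

rng = random.Random(2008)  # seeded so the decision can be audited later
chosen = resolve_double_circle({2, 3}, rng)
```

Recording the seed (or the chosen value) on the questionnaire keeps the decision traceable for the data entry person.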
57
Code Open-Ended Items
Open-ended responses have no numeric code
– e.g., name of physician, reason for visiting physician
Goal of coding open-ended items
– create meaningful categories from variety of responses
– minimize number of categories for better interpretability
– Assign a numeric score for data entry
58
Example of Open-Ended Responses
1. What things do you think are important for doctors at this clinic to do to give you high quality care?
Listen to your patients more often
Pay more attention to the patient
Not to wait so long
Be more caring toward the patient
Not to have so many people at one time
Spend more time with the patients
Be more understanding
59
Process of Coding Open-Ended Data
Develop classification scheme
– Review responses from 25 or more questionnaires
– Begin a classification scheme
– Assign unique numeric codes to each category
– Maintain a list of codes and the verbatim answers for each
– Add new codes as new responses are identified
If a response cannot be classified, assign a unique code and address it later
60
Example of Open-Ended Codes
Communication = 1
– Listen to your patients more often = 1
– Pay more attention to the patient = 1
Access to care = 2
– Not to wait so long = 2
– Not to have so many people at one time = 2
Allow more time = 3
– Spend more time with the patients = 3
Emotional Support = 4
– Be more understanding = 4
– Be more caring toward the patient = 4
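The code list above can be kept as a lookup table that maps each verbatim answer to its numeric code. This is a sketch only: the code 99 for not-yet-classified answers is a hypothetical choice, and assigning 4 to "Be more caring toward the patient" is inferred from its placement under Emotional Support.

```python
from collections import Counter

CATEGORY_NAMES = {1: "Communication", 2: "Access to care",
                  3: "Allow more time", 4: "Emotional Support"}

RESPONSE_CODES = {
    "Listen to your patients more often": 1,
    "Pay more attention to the patient": 1,
    "Not to wait so long": 2,
    "Not to have so many people at one time": 2,
    "Spend more time with the patients": 3,
    "Be more understanding": 4,
    "Be more caring toward the patient": 4,  # inferred from its category
}
UNCLASSIFIED = 99  # hypothetical placeholder code, to be resolved later

def code_response(text):
    """Return the numeric code for a verbatim answer."""
    return RESPONSE_CODES.get(text.strip(), UNCLASSIFIED)

# Tally how many verbatim answers fall in each category.
tally = Counter(CATEGORY_NAMES[code] for code in RESPONSE_CODES.values())
```

Keeping the verbatim text next to each code makes it easy for a second coder to check the assignments.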
61
Verify Assigned Codes
Have a second person independently classify each response using final codes
Investigator can review a small subset of questionnaires to assure that coding assignment criteria are clear and are being followed
62
Reliability of Open-Ended Codes
Depends on quality of question, of codes assigned, and the training and supervision of coders
Initial coder and second coder should be concordant in over 90% of cases
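Concordance between the two coders can be checked as simple percent agreement. The sketch below uses invented codes for ten responses; it does not adjust for chance agreement, which is a simplification.

```python
def percent_agreement(codes_a, codes_b):
    """Percent of responses given the same code by both coders."""
    if len(codes_a) != len(codes_b):
        raise ValueError("both coders must code the same responses")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return 100.0 * matches / len(codes_a)

# Hypothetical codes assigned to the same ten responses.
first_coder = [1, 1, 2, 2, 3, 4, 4, 1, 2, 3]
second_coder = [1, 1, 2, 2, 3, 4, 1, 1, 2, 3]

agreement = percent_agreement(first_coder, second_coder)
meets_target = agreement > 90  # target from the slide: over 90% of cases
```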
63
Data Entry
Set up file
Double entry of about 10% of surveys
– SAS or SPSS will compare two for accuracy
» Acceptable 0-5% error
» If 6% or greater – consider re-entering data
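The comparison that SAS or SPSS performs can be illustrated with a short sketch. The data layout (survey ID mapped to a dictionary of field values) and the example values are assumptions.

```python
def entry_error_rate(first_pass, second_pass):
    """Percent of fields that differ between two entries of the same surveys."""
    total = mismatches = 0
    for survey_id, fields in first_pass.items():
        for name, value in fields.items():
            total += 1
            if second_pass[survey_id].get(name) != value:
                mismatches += 1
    return 100.0 * mismatches / total

# Hypothetical double-entered surveys (ID -> item values).
entry_1 = {101: {"Q1": 1, "Q2": 3, "Q3": 2, "Q4": 5, "Q5": 1},
           102: {"Q1": 2, "Q2": 2, "Q3": 4, "Q4": 1, "Q5": 3}}
entry_2 = {101: {"Q1": 1, "Q2": 3, "Q3": 2, "Q4": 5, "Q5": 1},
           102: {"Q1": 2, "Q2": 2, "Q3": 1, "Q4": 1, "Q5": 3}}

error_rate = entry_error_rate(entry_1, entry_2)
consider_reentry = error_rate >= 6  # guideline from the slide
```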
64
Print Frequencies of Each Item and Review: Range Checks
Verify that responses for each item are within acceptable range
– Out of range values can be checked on original questionnaire
» corrected or considered “missing”
– Sometimes out of range values mean that an item has been entered in the wrong column
» a check on data entry quality
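A range check is easy to automate alongside the printed frequencies. In this sketch the item names, allowed ranges, and records are hypothetical; blanks (`None`) are left for the missing-data rules rather than reported as out of range.

```python
def out_of_range(records, valid_ranges):
    """Return (record index, item, value) for each value outside its range."""
    problems = []
    for i, record in enumerate(records):
        for item, (low, high) in valid_ranges.items():
            value = record.get(item)
            if value is not None and not low <= value <= high:
                problems.append((i, item, value))
    return problems

valid = {"Q1": (1, 5), "Q2": (1, 2)}   # hypothetical allowed codes
data = [
    {"Q1": 3, "Q2": 1},
    {"Q1": 7, "Q2": 2},      # 7 is not a valid Q1 code
    {"Q1": None, "Q2": 0},   # 0 is not a valid Q2 code
]
flagged = out_of_range(data, valid)
```

Each flagged entry points back to a specific questionnaire, which can then be pulled and checked by hand.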
65
Print Frequencies of Each Item and Review: Consistency Checking
Determine that skip patterns were followed
Number of responses within a skip pattern needs to equal number who answered “skip in” question appropriately
66
Print Frequencies of Each Item and Review: Consistency Checking (N=90)
1. Did your doctor prescribe any medications? (75 = yes, 15 = no)
1a. If yes, did your doctor explain the side effects of the medication? (80 responses)
Often will find that more people answered the second question than were supposed to
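Besides comparing the two frequencies, it helps to list which respondents answered the follow-up without the "skip in" answer. A sketch under assumed item names and a small invented data set:

```python
def skip_violations(records, filter_item, follow_up, skip_in_value):
    """IDs of respondents who answered follow_up without the skip-in answer."""
    return [rid for rid, rec in records.items()
            if rec.get(follow_up) is not None
            and rec.get(filter_item) != skip_in_value]

# Hypothetical records: Q1 = medications prescribed, Q1a = side effects explained.
records = {
    1: {"Q1": "yes", "Q1a": "yes"},
    2: {"Q1": "no", "Q1a": "no"},    # should have skipped Q1a
    3: {"Q1": "no", "Q1a": None},
}
violations = skip_violations(records, "Q1", "Q1a", skip_in_value="yes")
```

In the slide's example, this kind of list would account for the extra responses to item 1a (80 responses against 75 "yes" answers).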
67
Print Frequencies of Each Item and Review: Consistency Checking (cont.)
Go back to the questionnaires of those with problems
– check whether initial “filter” item was incorrectly answered or whether respondent inadvertently answered subset
– sometimes you won’t know which was correct
Hopefully this was caught during initial review of questionnaire and corrected by asking respondent
68
Deriving Scale Scores
Create scores with computer algorithms in SAS, SPSS, or other program
Review scores to detect programming errors
Revise computer algorithms as needed
Review final scores
69
Creating Likert Scale Scores
Translate codebook scoring rules into program code (SAS, SPSS):
– Reverse all items as specified
– Apply scoring rules
– Apply missing data rules
Sample for SAS (see handout)
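The three steps can be sketched outside SAS or SPSS as well. This is not the handout's program: the 1-5 response range, the mean-of-items scoring rule, and the rule that at least half of the items must be answered are all assumptions chosen for illustration.

```python
def score_scale(responses, reverse_items, low=1, high=5, min_answered=0.5):
    """Mean of answered items after reversal; None if too many are missing."""
    values = []
    for item, value in responses.items():
        if value is None:
            continue  # blank item, handled by the missing-data rule below
        if item in reverse_items:
            value = low + high - value  # e.g. 1 becomes 5 on a 1-5 scale
        values.append(value)
    # Assumed missing-data rule: require at least half of the items.
    if len(values) < min_answered * len(responses):
        return None
    return sum(values) / len(values)

# Hypothetical 4-item scale where Q2 is worded in the opposite direction.
person_a = score_scale({"Q1": 5, "Q2": 1, "Q3": None, "Q4": 5}, {"Q2"})
person_b = score_scale({"Q1": 4, "Q2": None, "Q3": None, "Q4": None}, {"Q2"})
```

Whatever rules are used, they should match the codebook so that scores can be reproduced later.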
70
Testing Scaling Properties and Reliability in Your Sample for Multi-Item Scales
Obtain item-scale correlations
– Part of internal consistency reliability program
Calculate reliability in your sample (regardless of known reliability in other studies)
– internal-consistency for multi-item scales
– test-retest if you obtained it
71
SAS – Chapter 3: Assessing Reliability with Coefficient Alpha
Review statements and output
How to test your scales for internal consistency and appropriate item-scale correlations
72
SAS/SPSS Both Make Item Convergence Analysis Easy
Reliability programs provide:
– Item-scale correlations corrected for overlap
– Internal consistency reliability (coefficient alpha)
– Reliability with each item removed
» To see effect of removing an item
73
SAS – Obtaining Item-Scale Correlations and Coefficient Alpha
PROC CORR
– DATA=data-set-name
– ALPHA
– NOMISS
– VAR (list of variables)
Output:
– Coefficient alpha
– Item correlations
– Item-scale correlations corrected for overlap
SAS Manual, Chapter 3: Assessing Scale Reliability with Coefficient Alpha
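To show what the ALPHA output reports, here is a plain-Python sketch of coefficient alpha and item-scale correlations corrected for overlap. It is not SAS code or SAS output; the function names and data are hypothetical, and dropping incomplete rows is meant to mirror what NOMISS does.

```python
from statistics import variance

def complete_cases(rows):
    """Keep only respondents with no missing items (like NOMISS)."""
    return [row for row in rows if None not in row]

def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def coefficient_alpha(rows):
    """Cronbach's alpha from item variances and total-score variance."""
    k = len(rows[0])
    item_vars = [variance([row[j] for row in rows]) for j in range(k)]
    total_var = variance([sum(row) for row in rows])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)

def corrected_item_scale(rows):
    """Correlate each item with the sum of the other items (overlap removed)."""
    k = len(rows[0])
    return [pearson([row[j] for row in rows],
                    [sum(row) - row[j] for row in rows]) for j in range(k)]

# Hypothetical responses: one row per respondent, one column per item.
raw = [[1, 2, 1], [2, 2, 3], [3, 4, 3], [4, 4, 5], [5, 5, 4], [3, None, 2]]
rows = complete_cases(raw)
alpha = coefficient_alpha(rows)
item_scale = corrected_item_scale(rows)
```

Re-running `coefficient_alpha` with one column removed gives the "reliability with each item removed" figure mentioned earlier.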
75
Testing Reliability in STATA
www.stata.com/help.cgi?alpha
alpha varlist [if] [in] [, options]
SEE HANDOUT
76
What if Reliability is Too Low?
Have to decide if you need to modify a scale
New scales under development
– Modify using item-scale criteria
Standard scales – cannot change
– Simply report problems as caveats in your analyses
If problem is substantial
– Can create a modified scale and report results using standard and modified scale
77
Value of Pretesting: Experts Say..
…evidence from our work suggests that many survey questions are seriously underevaluated
Evaluating items at final pretest phase is often too late in the process
– Too late for extensive question redesign
A series of question evaluation steps is needed beginning well before the survey
FJ Fowler and CF Cannell. Using behavioral coding to identify problems with survey questions. In Answering Questions…, eds N Schwarz et al, Jossey-Bass, 1996
78
Homework for Class 10
Conduct 2 pretest interviews with individuals similar to your target population
– Administer all questions
– Administer your 4 probes
Summarize briefly your pretest results
Indicate whether the measure appears to be appropriate for the 2 pretest subjects
– No inferences to broader sample needed