IS 4800 Empirical Research Methods for Information Science Class Notes Feb. 24, 2012 Instructor: Prof. Carole Hafner, 446 WVH [email protected] Tel: 617-373-5116 Course Web site: www.ccs.neu.edu/course/is4800sp12/


IS 4800 Empirical Research Methods for Information Science

Class Notes Feb. 24, 2012

Instructor: Prof. Carole Hafner, 446 WVH [email protected] Tel: 617-373-5116

Course Web site: www.ccs.neu.edu/course/is4800sp12/

Types of Quantitative Studies We’ve Discussed

• Observational
• Survey
• Experimental
  – One-factor, two-level, between-subjects
  – One-factor, two-level, within-subjects
    • aka “repeated measures” or “crossover”
  – Matched pairs

Types of Experimental Designs

• Between-Subjects Design
  – Different groups of subjects are randomly assigned to the levels of your independent variable
  – Data are averaged for analysis
  – Use t-test for independent means
  – Example: “single factor, two-level, between subjects” design
    • Level A = Word vs. Level B = Wizziword
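
The t-test for independent means named above can be sketched in a few lines. This is a minimal illustration, assuming the pooled-variance (Student) form of the statistic; the function name `independent_t` and the task times are hypothetical, not data from the course.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for two independent groups, using pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    mean_a, mean_b = statistics.mean(group_a), statistics.mean(group_b)
    var_a, var_b = statistics.variance(group_a), statistics.variance(group_b)
    df = n_a + n_b - 2
    # Pooled variance weights each group's sample variance by its df.
    pooled_var = ((n_a - 1) * var_a + (n_b - 1) * var_b) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean_a - mean_b) / std_err, df


# Hypothetical task times (minutes): different subjects in each group.
word = [12.0, 15.0, 11.0, 14.0, 13.0]
wizziword = [10.0, 9.0, 12.0, 8.0, 11.0]

t_stat, df = independent_t(word, wizziword)
print(f"t = {t_stat:.2f}, df = {df}")
```

The resulting t is then compared against a t distribution with `df` degrees of freedom to obtain a p-value.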

Types of Experimental Designs

• Within-Subjects Design
  – A single group of subjects is exposed to all levels of the independent variable
  – Data are averaged for analysis
  – aka “repeated measures design”, “crossover design”
  – Use t-test for dependent means aka “paired samples t-test”
  – We will discuss “single factor, two-level, within subjects” designs.
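
The paired-samples t-test works on per-subject difference scores rather than on the two groups of raw scores. The sketch below is illustrative only; `paired_t` and the scores are hypothetical.

```python
import math
import statistics


def paired_t(cond_a, cond_b):
    """Return (t, df) for paired scores: one (a, b) pair per subject."""
    if len(cond_a) != len(cond_b):
        raise ValueError("each subject needs a score in both conditions")
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    # Standard error of the mean difference.
    std_err = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / std_err, n - 1


# Hypothetical task times (minutes): the same five subjects in both conditions.
level_a = [12.0, 15.0, 11.0, 14.0, 13.0]
level_b = [10.0, 12.0, 10.0, 11.0, 12.0]

t_stat, df = paired_t(level_a, level_b)
print(f"t = {t_stat:.2f}, df = {df}")
```

Because each subject serves as their own control, differences between subjects drop out of the difference scores.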

Between-Subjects Design

• Each group is a sample from a population
• Big question: are the populations the same (null hypothesis) or are they significantly different?

Sidebar: Randomization

• Crucial: method must not be applied subjectively
• Point in time at which randomization occurs is important

Timeline: recruiting → randomization → experiment → final measures

Sidebar: Randomization

• Simple randomization
  – Flip a coin
  – Random number generator
  – Table of random numbers
  – Partition numeric range into number of conditions
• Problems?
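
Simple randomization with a random number generator might look like the sketch below. The function name, seed, and sample size are assumptions for illustration. One likely answer to "Problems?" is that nothing forces the group sizes to be equal, which the printed counts make easy to check.

```python
import random
from collections import Counter


def simple_randomization(n_subjects, conditions=("A", "B"), seed=None):
    """Assign each subject independently, like one coin flip per subject."""
    rng = random.Random(seed)
    return [rng.choice(conditions) for _ in range(n_subjects)]


# Hypothetical study with 20 subjects and two conditions.
assignments = simple_randomization(20, seed=2012)
counts = Counter(assignments)
print(assignments)
print(dict(counts))  # group sizes are not guaranteed to be 10 and 10
```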

Sidebar: Randomization

• Blocked randomization
  – Avoids serious imbalances in assignments of subjects to conditions
  – Guarantees that imbalance will never be larger than a specified amount
  – Example: want to ensure that every 4 subjects we have an equal number assigned to each of 2 conditions => “block size of 4”
  – Method: write all orderings of the N conditions within a block of size B (B = block size) in which each condition appears equally often
    • Example: AABB, ABAB, BAAB, BABA, BBAA, ABBA
  – At the start of each block, select one of the orderings at random
  – Should use block sizes > 2
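
The blocked method can be sketched as follows: list every balanced ordering for a block, then pick one at random at the start of each block. The function names are hypothetical, and the sketch assumes single-character condition labels and a block size that is a multiple of the number of conditions.

```python
import random
from itertools import permutations


def balanced_orderings(conditions=("A", "B"), block_size=4):
    """All distinct orderings of a block with each condition equally often."""
    per_condition, remainder = divmod(block_size, len(conditions))
    if remainder:
        raise ValueError("block size must be a multiple of the number of conditions")
    pool = "".join(c * per_condition for c in conditions)
    # A set removes duplicates caused by repeated labels.
    return sorted({"".join(p) for p in permutations(pool)})


def blocked_randomization(n_blocks, conditions=("A", "B"), block_size=4, seed=None):
    """Pick one balanced ordering at random at the start of each block."""
    rng = random.Random(seed)
    orderings = balanced_orderings(conditions, block_size)
    return [rng.choice(orderings) for _ in range(n_blocks)]


orderings = balanced_orderings()
print(orderings)  # the same six orderings listed on the slide

blocks = blocked_randomization(n_blocks=5, seed=2012)
print(blocks)
```

After every completed block the two conditions have exactly the same number of subjects, so the imbalance can never exceed half a block.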

Sidebar: Randomization

• Stratified randomization
  – First stratify Ss based on measured factors (prior to randomization) (e.g., gender)
  – Within each stratum, randomize
    • Either simple or blocked

Stratum  Sex  Condition assignment
1        M    ABBA BABA …
2        F    BABA BBAA …
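
A sketch of stratified randomization with blocked assignment inside each stratum is shown below. The subject records, the `stratum_of` callback, and the function name are hypothetical. Shuffling a balanced block is used here as a stand-in for selecting one of the written-out orderings at random.

```python
import random
from collections import defaultdict


def stratified_assignment(subjects, stratum_of, conditions=("A", "B"),
                          block_size=4, seed=None):
    """Group subjects by stratum, then use blocked randomization in each."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[stratum_of(subject)].append(subject)

    assignment = {}
    for members in strata.values():
        for start in range(0, len(members), block_size):
            # One balanced block, e.g. A A B B, in random order.
            block = list(conditions) * (block_size // len(conditions))
            rng.shuffle(block)
            for subject, condition in zip(members[start:start + block_size], block):
                assignment[subject] = condition
    return assignment


# Hypothetical subjects as (id, sex) tuples: four men and four women.
subjects = [(i, "M" if i % 2 else "F") for i in range(1, 9)]
assignment = stratified_assignment(subjects, stratum_of=lambda s: s[1], seed=2012)
for subject in subjects:
    print(subject, assignment[subject])
```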

Within-Subjects Designs: Benefits

• More Power! Why?
  – Controls for all inter-subject variability
  – Randomized between-subjects design just balances the effects between groups
  – (Matched-pair controls for identified and matched extraneous variables)

The Problem of Error Variance

• Error variance is the variability among scores not caused by the independent variable
  – Error variance is common to all experimental designs
  – Error variance is handled differently in each design
• Sources of error variance (“extraneous variables”)
  – Individual differences among subjects
  – Environmental conditions not constant across levels of the independent variable
  – Fluctuations in the physical/mental state of an individual subject

Error Variance

[Diagram: Independent Variable, Individual Differences, and Environmental Conditions combine (+) to produce Measured Outcomes]

Handling Error Variance

• Taking steps to reduce error variance
  – Hold extraneous variables constant by treating subjects as similarly as possible
  – Match subjects on crucial characteristics
• Increasing the effectiveness of the independent variable
  – Strong manipulations yield less error variance than weak manipulations

Matched Group Design

• Use when you know some third variable has significant correlation with outcome
• A between-subjects design
• Use paired-samples t-test!

[Diagram: Match Pairs → Randomize → Treatment 1 / Treatment 2]

Handling Error Variance

• Randomizing error variance across groups
  – Distribute error variance equivalently across levels of the independent variable
  – Accomplished with random assignment of subjects to levels of the independent variable
• Statistical analysis
  – Random assignment tends to equalize error variance across groups, but does not guarantee that it will
  – You can estimate the probability that observed differences are due to error variance by using inferential statistics

Within-Subjects Designs

• Subjects are not randomly assigned to treatment conditions
  – The same subjects are used in all conditions
  – Closely related to the matched-groups design
• Advantages
  – Reduces error variance due to individual differences among subjects across treatment groups
  – Reduced error variance results in a more powerful design
    • Effects of independent variable are more likely to be detected

Within-Subjects Designs: Disadvantages

• More demanding on subjects, especially in complex designs
• Subject attrition is a problem
• Carryover effects: Exposure to a previous treatment affects performance in a subsequent treatment

Carryover Example

• Embodied Conversational Agents to Promote Health Literacy for Older Adults

[Diagram: time points T0, T1, T2; treatments Brochure and Computer; a Diabetes Knowledge Assessment at each time point]

Sources of Carryover

• Learning
  – Learning a task in the first treatment may affect performance in the second
• Fatigue
  – Fatigue from earlier treatments may affect performance in later treatments
• Habituation
  – Repeated exposure to a stimulus may lead to unresponsiveness to that stimulus
• Sensitization
  – Exposure to a stimulus may make a subject respond more strongly to another
• Contrast
  – Subjects may compare treatments, which may affect behavior
• Adaptation
  – If a subject undergoes adaptation (e.g., dark adaptation), then earlier results may differ from later ones

Dealing With Carryover Effects

• Counterbalancing
  – The various treatments are presented in a different order for different subjects
  – May be complete or partial
  – Balances the effects of carryover on each treatment
  – Assumes carryover effect is independent of the order
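
Complete counterbalancing uses every possible treatment order, so k treatments need k! orders (2 orders for 2 treatments, 24 for 4). The sketch below enumerates the orders and hands them out in rotation. The round-robin assignment and the function name are simplifying assumptions; in a real study, subjects would typically be assigned to orders at random.

```python
from itertools import permutations


def complete_counterbalance(treatments, subjects):
    """Map each subject to one treatment order, cycling through all orders."""
    orders = list(permutations(treatments))
    return {subj: orders[i % len(orders)] for i, subj in enumerate(subjects)}


# Hypothetical: two treatments, four subjects.
schedule = complete_counterbalance("AB", subjects=[1, 2, 3, 4])
for subject, order in schedule.items():
    print(subject, " then ".join(order))

# The number of orders grows factorially with the number of treatments.
print(len(list(permutations("ABCD"))))
```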

Dealing With Carryover Effects

• Taking Steps to Minimize Carryover
  – Techniques such as pre-training, practice sessions, or rest periods between treatments can reduce some forms of carryover
• Make Treatment Order an Independent Variable
  – Allows you to measure the size of carryover effects, which can be taken into account in future experiments

Dealing With Carryover Effects

• The Latin Square Design
  – Sample partial counterbalancing approach
  – Used when you make the number of treatment orders equal to the number of treatments (each treatment occurs once in every row and column)
  – Example: want to evaluate 4 different word processors, using 4 admins in 4 departments. A completely counterbalanced design would require 4x4x4=64 trials.
  – Latin square attempts to eliminate systematic bias in assignment of treatment to departments & subjects.

Treatments A-D:

        Department
Subj    1  2  3  4
1       C  B  A  D
2       B  A  D  C
3       D  C  B  A
4       A  D  C  B
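
The defining property of a Latin square (each treatment once in every row and every column) is easy to check mechanically. The sketch below verifies the square from the slide and builds another one by cyclic shifting. The function names are hypothetical, and the cyclic construction is just one simple way to obtain a Latin square, not necessarily how the slide's square was produced.

```python
def is_latin_square(square):
    """True if every symbol appears exactly once in each row and column."""
    n = len(square)
    symbols = set(square[0])
    if len(symbols) != n:
        return False
    rows_ok = all(len(row) == n and set(row) == symbols for row in square)
    cols_ok = all(set(col) == symbols for col in zip(*square))
    return rows_ok and cols_ok


def cyclic_latin_square(treatments):
    """Build a Latin square by shifting the treatment list one step per row."""
    n = len(treatments)
    return [[treatments[(r + c) % n] for c in range(n)] for r in range(n)]


# Rows = subjects 1-4, columns = departments 1-4, as on the slide.
slide_square = [list("CBAD"), list("BADC"), list("DCBA"), list("ADCB")]
print(is_latin_square(slide_square))

for row in cyclic_latin_square("ABCD"):
    print(" ".join(row))
```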

Example of a Counterbalanced Single-Factor Design With Two Treatments

Order   Treatment Sequence
1       A B
2       B A

Subject   Order   Treatment A   Treatment B
1         2       23.5          14.2
2         1       14.6          11.5
…         …       …             …

How do you test for “order effects”?
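
One possible approach, offered as a sketch rather than as the course's answer, is to treat order as a between-subjects factor: compute each subject's A − B difference and compare the two order groups with an independent-means t statistic. Only the first row of each group below comes from the slide; the remaining rows and the function name are hypothetical.

```python
import math
import statistics


def independent_t(group_a, group_b):
    """Return (t, df) for two independent groups, using pooled variance."""
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * statistics.variance(group_a)
                  + (n_b - 1) * statistics.variance(group_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (statistics.mean(group_a) - statistics.mean(group_b)) / std_err, df


# (Treatment A score, Treatment B score) per subject, grouped by order.
order_1 = [(14.6, 11.5), (16.0, 12.0), (15.0, 12.5)]  # A then B; rows 2-3 hypothetical
order_2 = [(23.5, 14.2), (21.0, 13.0), (22.0, 14.0)]  # B then A; rows 2-3 hypothetical

diffs_1 = [a - b for a, b in order_1]
diffs_2 = [a - b for a, b in order_2]

t_stat, df = independent_t(diffs_1, diffs_2)
print(f"t = {t_stat:.2f}, df = {df}")
```

A large |t| suggests that the size of the A − B difference depends on which treatment came first.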

Types of Studies We’ve Discussed

• Review pros and cons of between subjects and within subjects. What is matched pairs?


Example – Best Design?

• You’ve developed a new web-based help system for your email client. You want to compare your system to the old printed manual.

Example – Best Design?

• You’ve just developed the “Matchmaker” – a handheld device that beeps when you are in the vicinity of a compatible person who is also carrying a Matchmaker.
• You evaluate the number of users who are married after six months of use compared to a non-intervention control group.

Example – Best Design?

• You’ve just developed “Reado Speedo” that reads print books using OCR and speaks them to you at twice your normal reading rate. You want to evaluate your product against the old-fashioned way on reading rate, comprehension and satisfaction.

Introduction to Usability Testing

I. Summative Evaluation: Measure/compare user performance and satisfaction
  • Quantitative measures
  • Statistical methods

II. Formative Evaluation: Identify Usability Problems
  • Quantitative and Qualitative measures
  • Ethnographic methods such as interviews, focus groups

Usability Goals (Nielsen)

1. Learnability
2. Efficiency
3. Memorability
4. Error avoidance/recovery
5. User satisfaction

Operationalize these goals to evaluate usability

What is a Usability Experiment?

Usability testing in a controlled environment
• There is a test set of users
• They perform pre-specified tasks
• Data is collected (quantitative and qualitative)
• Take mean and/or median value of measured attributes
• Compare to goal or another system

Contrasted with “expert review” and “field study” evaluation methodologies

The growth of usability groups and usability laboratories
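
The "take mean and/or median, compare to goal" step can be illustrated with a few lines. The task times and the 60-second goal are hypothetical. The example is chosen so that one slow user pulls the mean above the goal while the median stays under it, which is one reason to look at both.

```python
import statistics

# Hypothetical task completion times in seconds, one per test user.
task_times = [48.0, 52.0, 61.0, 45.0, 120.0]
goal_seconds = 60.0  # hypothetical usability goal

mean_time = statistics.mean(task_times)
median_time = statistics.median(task_times)

print(f"mean   = {mean_time:.1f} s, meets goal: {mean_time <= goal_seconds}")
print(f"median = {median_time:.1f} s, meets goal: {median_time <= goal_seconds}")
```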

Experimental factors

Subjects
  – representative
  – sufficient sample

Variables
  – independent variable (IV): characteristic changed to produce different conditions, e.g. interface style, number of menu items.
  – dependent variable (DV): characteristics measured in the experiment, e.g. time taken, number of errors.

Experimental factors (cont.)

Hypothesis
  – prediction of outcome framed in terms of IV and DV
  – null hypothesis: states no difference between conditions; aim is to disprove this.

Experimental design
  – within groups design
    • each subject performs experiment under each condition.
    • transfer of learning possible
    • less costly and less likely to suffer from user variation.
  – between groups design
    • each subject performs under only one condition
    • no transfer of learning
    • more users required
    • variation can bias results.

Summative Analysis

What to measure? (and its relationship to a usability goal)

• Total task time
• User “think time” (dead time??)
• Time spent not moving toward goal

• Ratio of successful actions/errors
• Commands used/not used

• Frequency of user expression of: confusion, frustration, satisfaction

• Frequency of reference to manuals/help system
• Percent of time such reference provided the needed answer

Measuring User Performance

Measuring learnability
  – Time to complete a set of tasks
  – Learnability/efficiency trade-off

Measuring efficiency
  – Time to complete a set of tasks
  – How to define and locate “experienced” users

Measuring memorability
  – The most difficult, since “casual” users are hard to find for experiments
  – Memory quizzes may be misleading

Measuring User Performance (cont.)

Measuring user satisfaction
  – Likert scale (agree or disagree)
  – Semantic differential scale
  – Physiological measure of stress

Measuring errors
  – Classification of minor v. serious

Reliability and Validity

Reliability means repeatability. Statistical significance is a measure of reliability.

Validity means: will the results transfer into a real-life situation? It depends on matching the users, task, environment.

Reliability is difficult to achieve because of high variability in individual user performance.

Formative Evaluation: What is a Usability Problem??

Unclear - the planned method for using the system is not readily understood or remembered (info. design level)

Error-prone - the design leads users to stray from the correct operation of the system (any design level)

Mechanism overhead - the mechanism design creates awkward work flow patterns that slow down or distract users.

Environment clash - the design of the system does not fit well with the users’ overall work processes. (any design level)

Ex: incomplete transaction cannot be saved

Qualitative methods for collecting usability problems

Thinking aloud studies
  – Difficult to conduct
  – Experimenter prompting, non-directive
  – Alternatives: constructive interaction, coaching method, retrospective testing
  – Output: notes on what users did and expressed: goals, confusions or misunderstandings, errors, reactions expressed

Questionnaires
  – Should be usability-tested beforehand

Focus groups, interviews

Observational Methods - Think Aloud

• user observed performing task
• user asked to describe what he is doing and why, what he thinks is happening etc.

Advantages
  – simplicity - requires little expertise
  – can provide useful insight
  – can show how system is actually used

Disadvantages
  – subjective
  – selective
  – act of describing may alter task performance

Observational Methods - Cooperative evaluation

• variation on think aloud
• user collaborates in evaluation
• both user and evaluator can ask each other questions throughout

Additional advantages
  – less constrained and easier to use
  – user is encouraged to criticize system
  – clarification possible

Observational Methods - Protocol analysis

paper and pencil
  – cheap, limited to writing speed
audio
  – good for think aloud, difficult to match with other protocols
video
  – accurate and realistic, needs special equipment, obtrusive
computer logging
  – automatic and unobtrusive, large amounts of data difficult to analyze
user notebooks
  – coarse and subjective, useful insights, good for longitudinal studies

Mixed use in practice.
Transcription of audio and video difficult and requires skill.
Some automatic support tools available

Query Techniques - Interviews

• analyst questions user on one to one basis
• usually based on prepared questions
• informal, subjective and relatively cheap

Advantages
  – can be varied to suit context
  – issues can be explored more fully
  – can elicit user views and identify unanticipated problems

Disadvantages
  – very subjective
  – time consuming

Query Techniques - Questionnaires

Set of fixed questions given to users

Advantages
  – quick and reaches large user group
  – can be analyzed more rigorously

Disadvantages
  – less flexible
  – less probing

Laboratory studies: Pros and Cons

Advantages:
  – specialist equipment available
  – uninterrupted environment

Disadvantages:
  – lack of context
  – difficult to observe several users cooperating

Appropriate
  – if actual system location is dangerous or impractical
  – to allow controlled manipulation of use.

Steps in a usability experiment

1. The planning phase

2. The execution phase

3. Data collection techniques

4. Data analysis

The planning phase

Who, what, where, when and how much?
• Who are test users, and how will they be recruited?
• Who are the experimenters?
• When, where, and how long will the test take?
• What equipment/software is needed?
• How much will the experiment cost?

Prepare detailed test protocol
  – What test tasks? (written task sheets)
  – What user aids? (written manual)
  – What data collected? (include questionnaire)

How will results be analyzed/evaluated?

Pilot test protocol with a few users

Detailed Test Protocol

• What tasks?
• Criteria for completion?
• User aids
• What will users be asked to do (thinking aloud studies)?
• Interaction with experimenter
• What data will be collected?

All materials to be given to users as part of the test, including detailed description of the tasks.

Execution phase

Prepare environment, materials, software

Introduction should include:
  – purpose (evaluating software)
  – voluntary and confidential
  – explain all procedures
    • recording
    • question-handling
  – invite questions

During experiment
  – give user written task description(s), one at a time
  – only one experimenter should talk

De-briefing

Execution phase: ethics of human experimentation applied to usability testing

Users feel exposed using unfamiliar tools and making errors

Guidelines:
• Re-assure that individual results not revealed
• Re-assure that user can stop any time
• Provide comfortable environment
• Don’t laugh or refer to users as subjects or guinea pigs
• Don’t volunteer help, but don’t allow user to struggle too long
• In de-briefing
  – answer all questions
  – reveal any deception
  – thanks for helping

Execution Phase: Designing Test Tasks

Tasks:
  – Are representative
  – Cover most important parts of UI
  – Don’t take too long to complete
  – Goal or result oriented (possibly with scenario)
  – Not frivolous or humorous (unless part of product goal)

First task should build confidence
Last task should create a sense of accomplishment

Data collection - usability labs and equipment

Pad and paper: the only absolutely necessary data collection tool!

Observation areas (for other experimenters, developers, customer reps, etc.) - should be shown to users

Videotape (may be overrated) - users must sign a release
Video display capture

Portable usability labs
Usability kiosks

Analysis of data

Before you start to do any statistics:
  – look at data
  – save original data

Choice of statistical technique depends on
  – type of data
  – information required

Type of data
  – discrete - finite number of values
  – continuous - any value

Testing usability in the field

1. Direct observation in actual use
  – discover new uses
  – take notes, don’t help, chat later

2. Logging actual use
  – objective, not intrusive
  – great for identifying errors
  – which features are/are not used
  – privacy concerns

Testing Usability in the Field (cont.)

3. Questionnaires and interviews with real users
  – ask users to recall critical incidents
  – questionnaires must be short and easy to return

4. Focus groups
  – 6-9 users
  – skilled moderator with pre-planned script
  – computer conferencing??

5. On-line direct feedback mechanisms
  – initiated by users
  – may signal change in user needs
  – trust but verify

6. Bulletin boards and user groups
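
For "Logging actual use" (method 2 in the list above), a log can be summarized to show which features are and are not used. The sketch below is illustrative only: the event format, feature names, and function name are hypothetical, and a real log would also raise the privacy concerns already noted.

```python
from collections import Counter


def feature_usage(events, all_features):
    """Count uses per feature and list the features that never appear."""
    counts = Counter(event["feature"] for event in events)
    unused = sorted(set(all_features) - set(counts))
    return counts, unused


# Hypothetical usage log: one dict per logged user action.
events = [
    {"user": "u1", "feature": "search"},
    {"user": "u1", "feature": "reply"},
    {"user": "u2", "feature": "search"},
    {"user": "u2", "feature": "attach"},
]
all_features = ["search", "reply", "attach", "filters", "help"]

counts, unused = feature_usage(events, all_features)
print(counts.most_common())
print("never used:", unused)
```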

Field Studies: Pros and Cons

Advantages:
  – natural environment
  – context retained (though observation may alter it)
  – longitudinal studies possible

Disadvantages:
  – distractions
  – noise

Appropriate
  – for “beta testing”
  – where context is crucial
  – for longitudinal studies