
Analysing Qualitative data

Amela Karahasanović, amela@sintef.no

Don't forget the quiz on working with humans and qualitative analysis on Friday!

2

Plan for today

• Advantages and disadvantages of qualitative data

• Grounded theory
  – Experimental research versus grounded theory
  – How to do it?

• Analyzing the content – coding
• Ensuring high-quality analysis
  – Validity
  – Reliability

3

• Do you like my app?

4

• What are they doing when they use your app?

• What kind of problems might they have?
• What do they do in their spare time?

5

• Quantitative data: time, error rate, rankings

6

[Figure: chart for Subject_id 15, Task 2, showing Sum of Seconds (Minutes) per Visited window of the Unified Library Application. Legend: Quit, Unified Library Application, Title Information, Return Item, Reserve Title, Lend Item, Insert Title Window, Find Title, Compilation.]

Data collection methods

• Participant observations
• Interviews
• Documentation (text, pictures)
• Video and audio material
• Diaries
• Open-ended questions in surveys

7

Qualitative data

• Data represented as words and pictures, not numbers

• Coming from educational and social sciences to study the complexity of human behaviour
  – Motivation, communication, understanding

• HCI blends technology and human behaviour

8

Advantages and disadvantages

• (+) Enables digging into the complexity of the problems rather than abstracting it away
• (+) The results are richer and more informative
• (-) Qualitative data analysis is more labor-intensive than quantitative analysis
• (-) The results are considered "softer" and "fuzzier"
• (-) The results are more difficult to summarize or simplify

9

Decision support tools in ATM

Interviews, observer notes, survey, video and audio material, screen captures, log files

10

14:04 What prohibited you in making the optimal decisions?
Definitely, the radar screen was very important. By that I could decide if an airplane was ready for push back. The human/machine interface is important. This update took too long time, maybe 2-3 sec. It should be 1 sec maximum. The interfaces are important, overlapping aircraft symbols, is confusing, the clarification is not clear. My mental work flow will slow down. Another point is that the preview of what is coming next has to be well defined.

• Four controllers – 30 minutes each; questions about decisions, the process in control tower, the tools that were used, the experiment

11

Stages of qualitative analysis

1. Start with the data set containing information about the substance (e.g. communication between ATCOs) and identify its major components (e.g. giving instructions, asking for more information, negotiating)

2. Study the properties and dimensions of each component (the nature of the components, their relationships)

3. Use the knowledge about each component to understand the original substance

12

Online behaviour of internet users

• First stage – users' behaviour is affected by their personality, education, computer experience

• Second stage – study each of them, read literature describing types of personality, their development, effects on social behaviour

• Third stage – go back and examine how each of these components influences online behaviour

• Note! The experience of the researcher is critical for the discovery process

13

Grounded theory

• Not a theory
• Qualitative research method
• Goal – develop a theory grounded in systematically collected and analyzed data
• Used either as a full methodology or just as a set of instructions, such as a coding procedure
• Can be applied to different methods (ethnography, case studies and interviews)

14

GT - Inductive method different from experimental research

Experimental research: Theory -> Hypothesis -> Study -> Data -> Y/N

Grounded theory: Study -> Data -> Theory
  – Several rounds; reverse engineering

15

Grounded theory cont.

• GT is simply the discovery of emerging patterns in data

• Conceptualizing patterns and acting in terms of them

• Sequential and iterative

16

Remember!

• No pre-formed hypothesis!
• No favorite solutions!
• Creativity and open mind!
• Let the data lead you!

17

Procedures for grounded theory

• Open coding
  – Identify phenomena
• Development of the concepts
  – Group phenomena into concepts
• Grouping concepts into categories
  – Grouping and interpretation
• Formation of a theory
  – Create inferential and predictive statements on phenomena

18

Advantages

• Systematic approach to analysis of qualitative data

• Allows generating theory grounded in data and coding

• One can study data early on, and formulate and refine the theory through constant interplay between data collection and analysis

19

Disadvantages

• One can be overwhelmed by details
• Theories might be difficult to evaluate
  – Textual data, less strict measures, coding
• Might be biased
• Keep in mind
  – Be open-minded and creative
  – Listen to data

20

Analysis - coding

• More than word counting
• "Involves interacting with data, making comparisons between data, and so on, and in doing so, deriving concepts to stand for those data, then developing these concepts in terms of their properties and dimensions"

21

Coding

• Extracting values for quantitative variables from qualitative data in order to perform some quantitative analysis

• A common assumption: quantitative data are objective and qualitative data are subjective

• However, subjectivity/objectivity is orthogonal to whether the data is qualitative or quantitative

22

Examples

• "Ola, Kari and Rune were the only participants at the meeting" -> num_participants = 3

• "Kari said that this particular INF2260 lesson was really easy to understand, and not very complex at all compared to other classes" -> complexity = low
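The two examples above could be recorded as coded values that keep a link back to the original statement, so the coding can be checked later. This is only a minimal sketch: the `CodedValue` class and its field names are hypothetical and are not part of the lecture material.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedValue:
    """One value extracted from a qualitative statement (hypothetical structure)."""
    variable: str   # name of the variable being coded
    value: object   # the coded value
    source: str     # original statement, kept so the coding can be reviewed


coded = [
    CodedValue(
        "num_participants",
        3,
        "Ola, Kari and Rune were the only participants at the meeting",
    ),
    CodedValue(
        "complexity",
        "low",
        "Kari said that this particular INF2260 lesson was really easy to "
        "understand, and not very complex at all compared to other classes",
    ),
]

for item in coded:
    print(f"{item.variable} = {item.value}")
```

Keeping the source statement next to each value is one way to limit the loss of information that coding can cause.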

23

Problems

• Possible loss of information
• Subjects use different words for the same phenomenon or the same word for different phenomena
• Subjects use straightforward words that mask the meaning
  – "Low complexity of the code" -> easy to read, easy to understand or just small
• Things can be rated differently by different subjects (average, high, low)

24

A priori

• Codes from the literature
• Analysis
• Several coders
• Reliability check to see if the coding is consistent
• Works fine for known domains

25

A priori coding

"Okay, it worked well. Then I looked through the class diagrams, okay. . . Then I understood better how it worked. But, okay, after that I looked at the task. Just make changes in. . ." -> search, explore, action

26

Emergent coding

• Appropriate for new topics
• Several researchers examine the data and develop key coding categories
• Comparison, discussion, common list
• Multiple coders do the coding
• Reliability measures calculated; if ok, proceed with the coding; if not, go back

27

Emergent coding

"Definitely, the radar screen was very important. By that I could decide if an airplane was ready for push back. The human/machine interface is important. This update took too long time, maybe 2-3 sec. It should be 1 sec maximum." -> push back decision, waiting

28

Identifying coding categories

• Very important as they lead the analysis
• Demanding
• Codes come from
  – Theoretical framework
  – Researcher's interpretation (researcher-denoted concepts)
  – Participants (in-vivo codes)

29

Theoretical framework

• We start research by literature review and identifying theoretical framework related to our research topic

• Difficulties experienced by senior citizens when using computers
  – Human capabilities: cognitive, physical, perceptual
• Taxonomies
  – Categories of users, tasks, errors

30

Researcher denoted concepts

• Identify patterns, opinions, behaviour in your data -> codes

• Open coding

• "I was looking for 'find' …and it was not there…It was so irritating" -> find, frustration

31

• In-vivo codes
  – Participants have good descriptions
  – Use them in your coding
  – "Curriculum integration" from one parent's response, the name of a TV show in the analysis of QoE
• Building a code structure
  – Participants express the same ideas in different ways
  – Code list – nomenclature
  – Several levels (different levels of detail)

32

Coding the text

• Read the text (watch the video) before starting the coding

• Difficult to find anything interesting, or too many interesting things

• Procedure:
  – Look for specific items
  – Ask questions constantly about the data
  – Make comparisons constantly at various levels

33

Look for key items

• Some statements have more valuable information
  – Objectives: computers for education
  – Actions: click on
  – Outcomes: error message appeared
  – Consequences: I stopped using it
  – Causes: my old laptop
  – Context: I was in bus
  – Strategies: I first browse

34

Ask questions about data

• The art of asking questions in a larger context
  – Sensitizing questions
    • What is happening here? What did the user click? How did she reach www.ifi.uio.no?
  – Theoretical questions
    • What is the relationship between two factors? How does interaction change over time?

35

Making comparisons of data

• Compare instances under different categories
  – Frequency for different capabilities (physical, cognitive, perceptual) for elderly
• Compare the results between different groups
  – Age, background, family support
• Compare to previously published results
  – Same/contradictory, related studies
• Computer software
  – NVivo, Concordance, SPSS TextSmart
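Comparing code frequencies between groups can be sketched in a few lines. The data below is invented purely for illustration (the group labels and counts are assumptions, not results from any study), and the helper name `code_frequencies` is hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical coded segments: (participant group, assigned code).
coded_segments = [
    ("age 65-74", "physical"),
    ("age 65-74", "cognitive"),
    ("age 65-74", "physical"),
    ("age 75+", "perceptual"),
    ("age 75+", "cognitive"),
    ("age 75+", "perceptual"),
]


def code_frequencies(segments):
    """Count how often each code occurs within each group."""
    frequencies = defaultdict(Counter)
    for group, code in segments:
        frequencies[group][code] += 1
    return frequencies


frequencies = code_frequencies(coded_segments)
for group in sorted(frequencies):
    print(group, dict(frequencies[group]))
```

Dedicated tools such as those listed above offer far more than this, but a table of counts per group is often the first comparison one looks at.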

36

Ensuring high-quality analysis

• Subjective analysis
  – Which category? Are they in the same group? Are 'good' and 'ok' the same?
• Validity
  – Use of well-established and well-documented procedures to increase the accuracy of findings; did we get it right?
• Reliability
  – Consistency of results; would other researchers draw the same conclusions based on the same data set?

37

Validity

• Construction of a database with the collected data material: raw data (notes, documents, photos…) and results of the analysis
  – This also increases reliability
• Data source triangulation
  – Interviews, observations, diaries
• Avoid having pet theories
• Consider alternative theories

38

Reliability

• The same word might have different meanings
• Body language, facial expressions, drawings might have different meanings
• Large studies -> different coders analyse different data subsets
• Different people should code in the same way

39

• Intra-coder reliability (stability)
  – Whether the same coder does the same throughout the whole process; would he do the same next time? (50% A, 30% B, 20% C)
• Inter-coder reliability (reproducibility; investigator triangulation)
  – Whether different coders would do the same
  – Multiple coders with different backgrounds

40

• To achieve good reliability
  – Good coding instructions
  – Training
  – Test on a limited amount of data
• Reliability measure
  – % agreement = the number of cases coded the same way by multiple coders / the total number of cases
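The percent agreement measure can be computed directly from two coders' labels. The following is a minimal sketch for the two-coder case; the function name `percent_agreement` and the example labels are assumptions made for illustration.

```python
def percent_agreement(coder1, coder2):
    """Share of cases that two coders coded the same way (0.0 to 1.0)."""
    if len(coder1) != len(coder2):
        raise ValueError("Both coders must code the same cases")
    if not coder1:
        raise ValueError("At least one coded case is required")
    same = sum(a == b for a, b in zip(coder1, coder2))
    return same / len(coder1)


# Hypothetical codes assigned to the same four cases by two coders.
coder1 = ["physical", "cognitive", "perceptual", "cognitive"]
coder2 = ["physical", "cognitive", "cognitive", "cognitive"]

agreement = percent_agreement(coder1, coder2)
print(f"Agreement: {agreement:.0%}")
```

In this invented example the coders agree on three of the four cases.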

41

• Coders can do the same by chance
• Cohen's Kappa (0-1; 0 – coded the same by chance; 1 – perfect reliability; values below 0 are possible when agreement is worse than chance)
• K = (Pa – Pc)/(1 – Pc)
• Pa – percentage of cases on which the coders agree
• Pc – percentage of agreed cases by chance
• More than 60% (K > 0.6) is satisfactory
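As an illustration of the formula K = (Pa – Pc)/(1 – Pc), here is a minimal sketch for two coders. It assumes that Pc is estimated from each coder's own category proportions, multiplied per category and summed; the function name `cohens_kappa` and the example labels are hypothetical.

```python
from collections import Counter


def cohens_kappa(coder1, coder2):
    """Cohen's Kappa for two coders who coded the same cases."""
    if len(coder1) != len(coder2) or not coder1:
        raise ValueError("Both coders must code the same non-empty set of cases")
    n = len(coder1)

    # Pa: observed proportion of cases on which the coders agree.
    pa = sum(a == b for a, b in zip(coder1, coder2)) / n

    # Pc: agreement expected by chance, from each coder's category proportions.
    counts1, counts2 = Counter(coder1), Counter(coder2)
    categories = set(counts1) | set(counts2)
    pc = sum((counts1[c] / n) * (counts2[c] / n) for c in categories)

    if pc == 1:
        raise ValueError("Kappa is undefined when chance agreement is 1")
    return (pa - pc) / (1 - pc)


# Hypothetical codes for four cases.
coder1 = ["A", "A", "B", "B"]
coder2 = ["A", "B", "B", "B"]
kappa = cohens_kappa(coder1, coder2)
print(f"K = {kappa:.2f}")
```

Here the coders agree on 75% of the cases, but half of that agreement would be expected by chance, so Kappa is noticeably lower than the raw agreement.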

42

43

[Tables: proportions of cases as coded by both coders; proportions expected when coded by chance]

• 0.26, 0.12 and 0.35 of the cases were coded the same by both coders
• 7% were coded physical by coder 1 and cognitive by coder 2
• Expected agreement when the data is coded by chance, e.g. 0.37 * 0.39 = 0.14
• K = (Pa – Pc)/(1 – Pc)
• Pa = 0.26 + 0.12 + 0.35 = 0.73
• Pc = 0.14 + 0.04 + 0.18 = 0.36
• K = 0.58; K > 0.6 would be ok
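Written out step by step with the values from this example (Pa = 0.73, Pc = 0.36), the Kappa calculation is:

```latex
\[
K = \frac{P_a - P_c}{1 - P_c}
  = \frac{0.73 - 0.36}{1 - 0.36}
  = \frac{0.37}{0.64}
  \approx 0.58
\]
```

This falls just below the 0.6 threshold mentioned above.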

Subjective versus objective coders

• Subjective/inside coders
  – Designed the study, developed the coding scheme, collected the data
  – (+) know the literature; know the topic; easier to interpret the data; minimal training
  – (-) might be biased and unable to see new patterns, new behaviour

• Objective/outside coders

44

Multimedia content

• Image, audio, video, screen shots
• Cursor movement tracks, facial expressions, gestures, intonation provide a rich pool of data
• Extremely time consuming
• Same principles as for text analysis: study the literature; define the scope, context and objectives; identify key instances you want to annotate; analysis; evaluation of the reliability

45

Approaches

• Manual annotation
  – Labor-intensive, might be affected by the coder's subjectivity
• Partially automated annotation
  – Humans code some sequences that are used to train the application to establish the relationship between low-level features and high-level concepts
• Fully automated annotation
  – Highly error-prone
• Other tools: annotating pictures when taken on a mobile phone, organizing pictures in a spreadsheet

46

Exercise 1 – group work

• Take your data
• Make a plan for applying GT to the analysis of your data
• Make two iterations with your data
• What is your experience? Advantages? Disadvantages?

47

Exercise 2 – group work

• Do coding for another group
  – Two coders
• What are the problems? Compare the results. What was the agreement? Which cases did you disagree on? What can you say about reliability? What have you done, or what can you do, to assure validity?

48
