
Page 1: Clinical Disagreement and The Kappa

UNDERSTANDING KAPPA STATISTIC

Page 2: Clinical Disagreement and The Kappa

KAPPA

• Kappa = river imp, water sprite.

• Origin = Japan (with Chinese and Hindu antecedents).

• The kappa has a beak, webbed feet and a shell on its back, and dwells under bridges, pouncing on any who attempt to cross the river.

Page 3: Clinical Disagreement and The Kappa

OBJECTIVES

1. To understand the concept of clinical disagreement and observer variability.

2. To understand Kappa and related statistics.

Page 4: Clinical Disagreement and The Kappa

LEARNING STRATEGIES

1. Identify the situations when clinical disagreement occurs.

2. Calculate Kappa and understand the concepts of Kappa.

Page 5: Clinical Disagreement and The Kappa

Epictetus, 2nd Century:

Appearances of the mind are of four kinds:

1. Things are what they appear to be. (It is pneumonia, and appears like one.)

2. Things neither are, nor appear to be. (It is not pneumonia, nor does it appear like one.)

3. They are, and do not appear to be. (It is pneumonia, but does not look like one.)

4. They are not, yet appear to be. (It is not pneumonia, but looks like one.)

Sorting out all of these appearances in everyday life is the task of wise men (doctors). (Example: diagnosing pneumonia by history and physical examination.)

Page 6: Clinical Disagreement and The Kappa

Why the disagreement?

Sources of variation:

• How the measurements are carried out (the instruments, the tests, the person carrying out the measurements).

• Biological factors (variation within individual patients and among patients).

Page 7: Clinical Disagreement and The Kappa

CASE SCENARIOS OF CLINICAL DISAGREEMENT

1. Two radiologists, A and B, disagreeing on evidence of malignancy.

2. Disagreement between two cardiologists in the interpretation of electrocardiograms to look for evidence of ischemia.

3. A psychiatrist at HUKM disagreeing with the diagnosis of a psychotic disorder made earlier for the same patient by a colleague at HKL.

Page 8: Clinical Disagreement and The Kappa

WHY KAPPA?

• In reading the medical literature on diagnosis and the interpretation of diagnostic tests, our attention is generally focused on items such as sensitivity, specificity, predictive values and likelihood ratios. These address the validity of the tests.

• But if the people who actually interpret the test cannot agree on the interpretation of the results, the test results will be of little use!

Page 9: Clinical Disagreement and The Kappa

RATIONALE OF USING KAPPA

• Kappa tries to eliminate agreement which would be expected by chance alone.

• Let's say you and I agree 95% of the time on a specific test. Merely saying 95% agreement is not enough!

• Kappa is the observed agreement beyond chance (observed agreement minus the agreement attributable to chance) divided by the potential agreement beyond chance (100% minus the agreement attributable to chance).

Page 10: Clinical Disagreement and The Kappa

Cont…

Example: Let’s say that my diagnostic test to confirm a malaria slide is to flip a coin. I find that I agree with my colleague (Dr X) 55% of the time.

Is that agreement good enough?

To answer that, I would have to work out how much of that agreement would occur by chance alone.

Page 11: Clinical Disagreement and The Kappa

Cont….

Chance alone should account for 50%. Of the remaining potential 50%, I observed only 5% agreement beyond chance. Kappa is therefore (55 - 50) divided by (100 - 50) = 5/50 = 0.1.

The Answer: It is a low agreement.

The lesson: You cannot diagnose malaria by flipping coins!

Page 12: Clinical Disagreement and The Kappa

Two clinicians look at the same 100 mammograms, and each reads 20 as positive for breast cancer. Three possible patterns of agreement:

              Both neg.   Both pos.   Disagreements   Agreement   Kappa
Scenario 1        80          20            0 + 0        100%      1.0
Scenario 2        75          15            5 + 5         90%      0.69
Scenario 3        70          10          10 + 10         80%      0.38
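Why does kappa fall so much faster than the raw agreement? All three scenarios have the same marginal totals (each reader calls 80 negative and 20 positive), so the agreement expected by chance is the same (0.68) in every case; only the agreement beyond that chance level shrinks. A minimal Python sketch (using the chance-corrected formula developed on the later slides; the function name is illustrative) that reproduces the three kappa values:

```python
# Kappa for the three mammogram scenarios above.
# Cells: a = both negative, b and c = the two kinds of disagreement, d = both positive.
def kappa_2x2(a, b, c, d):
    n = a + b + c + d
    po = (a + d) / n                                      # observed agreement
    pe = ((a + b) * (a + c) + (c + d) * (b + d)) / n**2   # chance agreement from the marginals
    return (po - pe) / (1 - pe)

print(kappa_2x2(80, 0, 0, 20))     # scenario 1: 1.0
print(kappa_2x2(75, 5, 5, 15))     # scenario 2: ~0.69
print(kappa_2x2(70, 10, 10, 10))   # scenario 3: ~0.38
```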

Page 13: Clinical Disagreement and The Kappa

INTER-RATER RELIABILITY

• Inter-rater reliability refers to the consistency of responses from two or more raters, each evaluating the same endpoint or making the same measurements on multiple subjects.

• Inter-rater reliability is an important concept in clinical research.

• Errors may arise as a result of differences in interpretation, e.g. how different pathologists read the same histopathology slide.

Page 14: Clinical Disagreement and The Kappa

KAPPA COEFFICIENT & CORRELATION COEFFICIENT

• The kappa coefficient is analogous to the Pearson correlation coefficient (or the Spearman rank correlation coefficient) and has the same range of values (+1 to -1).

• However, it is better in several ways at identifying disagreement than the Pearson or Spearman rank correlation.

Page 15: Clinical Disagreement and The Kappa

WHY NOT USE CORRELATION COEFFICIENT?

1. The correlation coefficient is high for any linear relationship, not just when the first measurement equals the second. If the second measurement is multiplied by 3, the correlation coefficient remains the same, although the measurements no longer agree (see the sketch after this list).

2. The test of significance for the correlation coefficient uses the absence of any relationship as the null hypothesis. This will invariably be rejected, since of course there is a relationship between the first and second measurements, even if they do not agree with each other very well. The null hypothesis is rarely of clinical interest (as variables being correlated usually have some relationship to each other).
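A quick numerical illustration of point 1 (the measurement values are made up for the example): if the second rater's measurements are exactly three times the first rater's, the Pearson correlation is still a perfect 1.0 even though the two sets of measurements never agree.

```python
import numpy as np

first = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
second = 3 * first                          # systematically three times larger

print(np.corrcoef(first, second)[0, 1])     # 1.0 -- perfect linear correlation
print(np.any(first == second))              # False -- yet no measurement agrees
```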

Page 16: Clinical Disagreement and The Kappa

Kappa Formula

• Kappa takes into account the probability that some agreement will occur by chance.

K = (observed agreement - chance agreement) / (1 - chance agreement)

  = (Po - Pe) / (1 - Pe)
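As a minimal sketch, the formula translates directly into code (the function name is illustrative); checking it against the coin-flip example from the earlier slide (Po = 0.55, Pe = 0.50) reproduces the kappa of 0.1 found there:

```python
def kappa(po, pe):
    """Cohen's kappa from observed agreement (po) and chance agreement (pe)."""
    return (po - pe) / (1 - pe)

print(kappa(0.55, 0.50))   # 0.1 -- the coin-flip example
```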

Page 17: Clinical Disagreement and The Kappa

KAPPA COEFFICIENT (KAPPA STATISTIC)

                      Observer 1
                   Pos    Neg    Marg. total
Observer 2   Pos    a      b        g1
             Neg    c      d        g2
Marg. total         f1     f2       n

Page 18: Clinical Disagreement and The Kappa

Reading 2 X 2 Table

a and d = number of times the observers agree.

b and c = number of times the observers disagree.

If there are no disagreements, b and c = 0 and the observed agreement Po is 1.

If there are no agreements, a and d = 0 and the observed agreement Po is 0.

Page 19: Clinical Disagreement and The Kappa

Kappa Formula

Whereby:

Po = (a + d) / n

Pe = [ (f1 × g1)/n + (f2 × g2)/n ] / n

Page 20: Clinical Disagreement and The Kappa

KAPPA interpretation

Range of possible values for Kappa = -1 to 1.

Poor agreement: < 0.2
Fair agreement: 0.2 to 0.4
Moderate agreement: 0.4 to 0.6
Good agreement: 0.6 to 0.8
Very good agreement: 0.8 to 1.0

It is rare that we get a perfect or negative agreement.

Page 21: Clinical Disagreement and The Kappa

Kappa

                           Observer 1
                     Positive   Negative   Total
Observer 2 Positive      40         10       50
           Negative      20         30       50
Total                    60         40      100

Po = (40 + 30)/100 = 0.7
Pe = ((60 × 50)/100 + (40 × 50)/100) / 100 = (30 + 20)/100 = 0.5
Kappa = (0.7 - 0.5) / (1 - 0.5) = 0.2/0.5 = 0.4
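The same arithmetic, written as a short self-contained sketch in the cell notation of the earlier 2 × 2 table (a, b, c, d with marginal totals f1, f2, g1, g2); the variable names simply mirror that slide:

```python
# Worked example above, in the notation of the earlier 2 x 2 table.
a, b, c, d = 40, 10, 20, 30            # a, d = agreements; b, c = disagreements
n = a + b + c + d

g1, g2 = a + b, c + d                  # Observer 2 (row) marginal totals
f1, f2 = a + c, b + d                  # Observer 1 (column) marginal totals

po = (a + d) / n                       # observed agreement = 0.7
pe = (f1 * g1 / n + f2 * g2 / n) / n   # chance agreement = 0.5
print((po - pe) / (1 - pe))            # kappa = 0.4
```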

Page 22: Clinical Disagreement and The Kappa

WEIGHTED KAPPA

• Used with ordinal data. Ordinal data have an inherent order, such as pain rated 'none', 'mild', 'moderate' or 'severe'.

• A weight of 0 is given when the two raters are maximally apart, a weight of 1 when there is exact agreement, and proportionately spaced weights for intermediate levels of agreement.

• The formula is the same as for ordinary kappa, except that observed and expected agreement are summed not just along the diagonal but over the whole table, with each cell first multiplied by its weight.

Page 23: Clinical Disagreement and The Kappa

Example: Weighted 0, 0.5, 1.0

                        Observer 1
              Normal   Mild   Serious   Total
Observer 2
  Normal         7       2        1       10
  Mild           5      10        5       20
  Serious        3       3       14       20
Total           15      15       20       50

Page 24: Clinical Disagreement and The Kappa

Observed Agreement

Exact-agreement (diagonal) cells: Normal 7, Mild 10, Serious 14.

Observed agreement = Po = (7 + 10 + 14)/50 = 0.62

Page 25: Clinical Disagreement and The Kappa

Expected (Perfect) Agreement

Counts expected by chance in the exact-agreement (diagonal) cells (row total × column total / n):
Normal: (10 × 15)/50 = 3, Mild: (20 × 15)/50 = 6, Serious: (20 × 20)/50 = 8

Expected agreement = Pe = (3 + 6 + 8)/50 = 0.34

Page 26: Clinical Disagreement and The Kappa

Observed weighted agreement

Cells where the raters are one category apart (Observer 2 category listed first): Normal-Mild 2, Mild-Normal 5, Mild-Serious 5, Serious-Mild 3.

Partial agreement = (2 + 5 + 5 + 3)/50 = 15/50 = 30%

But we give only ½ credit for this partial agreement, so we get 30% × 0.5 = 15%.

Page 27: Clinical Disagreement and The Kappa

Expected partial agreement for weighted kappa

Expected counts in the cells one category apart (Observer 2 category listed first): Normal-Mild 3, Mild-Normal 6, Mild-Serious 8, Serious-Mild 6.

Expected partial agreement = (3 + 6 + 8 + 6)/50 = 23/50 = 0.46

Since we are giving ½ credit: 0.46 × 0.5 = 0.23

Page 28: Clinical Disagreement and The Kappa

Weighted Kappa

Total observed agreement = 0.62+0.15 = 0.77

Total expected agreement = 0.34 + 0.23 = 0.57

Weighted Kappa = (0.77-0.57)/(1-0.57) = 0.465
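The whole weighted calculation can be checked with a short NumPy sketch; the weight matrix has 1 on the diagonal, 0.5 for cells one category apart and 0 for cells two apart, exactly as in the slides above:

```python
import numpy as np

# 3 x 3 table from the earlier slide: rows = Observer 2, columns = Observer 1.
counts = np.array([[7,  2,  1],
                   [5, 10,  5],
                   [3,  3, 14]])
n = counts.sum()

# Agreement weights: 1 exact, 0.5 one category apart, 0 two apart.
k = counts.shape[0]
i, j = np.indices((k, k))
weights = 1 - np.abs(i - j) / (k - 1)

# Counts expected by chance (row total x column total / n).
expected = np.outer(counts.sum(axis=1), counts.sum(axis=0)) / n

po_w = (weights * counts).sum() / n      # 0.77
pe_w = (weights * expected).sum() / n    # 0.57
print((po_w - pe_w) / (1 - pe_w))        # ~0.465
```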

Page 29: Clinical Disagreement and The Kappa

Weighted values

For 3 categories:

• Complete disagreement (two categories apart): weight 0.
• Partial agreement (one category apart): weight ½.
• Complete agreement: weight 1.

For 4 categories the weights are 0, 0.33, 0.67 and 1.0.
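These values are the usual linearly spaced weights, 1 - |i - j| / (k - 1) for categories i and j on a k-point scale; a quick check for 3 and 4 categories:

```python
# Linear agreement weights by distance between categories, for k = 3 and k = 4.
for k in (3, 4):
    print(k, [round(1 - d / (k - 1), 2) for d in range(k)])
# 3 [1.0, 0.5, 0.0]
# 4 [1.0, 0.67, 0.33, 0.0]
```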

Page 30: Clinical Disagreement and The Kappa

Using SPSS to Compute Kappa

In Variable View, define Doctor1 and Doctor2 as string variables and Count as numeric.

Use Data > Weight Cases to weight the cases by Count.

Select Analyze > Descriptive Statistics > Crosstabs from the SPSS menu. In the dialog box, click the Statistics button and then tick the Kappa option box.

Note: make sure your data are in the right columns and rows.
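For readers working in Python rather than SPSS, scikit-learn offers an equivalent: cohen_kappa_score expects one pair of ratings per case, so a (Doctor1, Doctor2, Count) layout like the one described above is first expanded into individual cases. A minimal sketch using the 2 × 2 worked example from the earlier slide; the labels and variable names are illustrative:

```python
# A minimal sketch: computing kappa in Python instead of SPSS.
# The table from the earlier worked example (40/10/20/30) is expanded so that
# each case contributes one pair of ratings, as cohen_kappa_score expects.
from sklearn.metrics import cohen_kappa_score

rows = [("pos", "pos", 40), ("neg", "pos", 10),   # (Doctor1, Doctor2, Count)
        ("pos", "neg", 20), ("neg", "neg", 30)]

doctor1 = [d1 for d1, d2, count in rows for _ in range(count)]
doctor2 = [d2 for d1, d2, count in rows for _ in range(count)]

print(cohen_kappa_score(doctor1, doctor2))   # 0.4
# For ordinal ratings, weights="linear" gives the weighted kappa instead.
```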

Page 31: Clinical Disagreement and The Kappa

Final notes

• Kappa should not be viewed as the unequivocal standard to assess rater agreement.

• The kappa value itself is subject to chance (sampling) variation; a confidence interval for kappa may be more informative.

• Kappa may not be reliable for rare observations. Kappa is affected by prevalence: for rare findings, a very low kappa does not necessarily reflect a low rate of overall agreement.

• Because it is affected by prevalence, it may not be appropriate to compare kappa values between different studies or populations.

Page 32: Clinical Disagreement and The Kappa

THANK YOU