For testing significance of patterns in qualitative data Test statistic is based on counts that represent the number of items that fall in each category

Page 1

χ² test

•For testing significance of patterns in qualitative data

•The test statistic is based on counts that represent the number of items that fall in each category

•The test statistic measures the agreement between actual counts and expected counts, assuming the null hypothesis

Chi-squared Tests

Page 2

χ² Distribution

The chi-square distribution can be used to see whether observed counts agree with expected counts. Let

O = observed count and

E = expected count

Chi-squared Distribution

$$\chi^2 = \sum \frac{(O - E)^2}{E}$$

Page 3

Testing if Observed Counts are in Agreement with Known Percentages

Consider items of a population distributed over k categories in proportions $\pi_1, \pi_2, \ldots, \pi_k$. The null hypothesis specifies these proportions:

$$H_0: \pi_i = \pi_{i0}, \quad i = 1, 2, \ldots, k$$

If H0 is true then we expect $E_i = n\pi_{i0}$, the expected frequency for the ith category, as opposed to $O_i$, the observed frequency. Here n is the total number of items.
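The expected-frequency rule above can be sketched in a few lines of Python. This is only an illustration: the function name `expected_counts` is hypothetical, and the fair-coin numbers (100 tosses, proportions of 0.5 each) are taken from the coin example in these slides.

```python
def expected_counts(n, proportions):
    """Return the expected frequencies E_i = n * pi_i0 under H0."""
    # The hypothesized proportions must cover the whole population.
    if abs(sum(proportions) - 1.0) > 1e-9:
        raise ValueError("hypothesized proportions must sum to 1")
    return [n * p for p in proportions]


# Fair coin tossed 100 times: H0 says heads and tails each have proportion 0.5.
print(expected_counts(100, [0.5, 0.5]))  # [50.0, 50.0]
```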

Page 4

An Example: Biased Coin?

        Observed    Expected
        Frequency   Frequency
H       40          50
T       60          50
sum     100         100

Page 5

χ² statistic formula

$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
       &= \frac{(40 - 50)^2}{50} + \frac{(60 - 50)^2}{50} \\
       &= \frac{(-10)^2}{50} + \frac{(10)^2}{50} \\
       &= \frac{100}{50} + \frac{100}{50} \\
       &= 2 + 2 = 4
\end{aligned}
$$
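The coin calculation can be checked with a short Python sketch. The helper name `chi_square_statistic` is an illustrative choice, not something from the slides. The observed and expected counts are the ones in the coin table.

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


coin_observed = [40, 60]  # heads, tails
coin_expected = [50, 50]  # fair coin, 100 tosses
print(chi_square_statistic(coin_observed, coin_expected))  # 4.0
```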

Page 6

χ² distribution

degrees of freedom = (R - 1)(C - 1)

R = number of rows

C = number of columns

Page 7

χ² distribution

Is our chi-square value an extreme outcome that arose just by chance, while in fact the null hypothesis is true and the sample frequencies do not differ significantly from the ideal frequencies?

Note that the chi-squared statistic is never negative.
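One way to answer this question is to compute the upper-tail probability of the statistic. The sketch below does so for the coin example (statistic 4, 1 degree of freedom). It relies on the fact that a chi-squared variable with 1 degree of freedom is the square of a standard normal variable, so the shortcut through `math.erfc` is valid only for 1 degree of freedom. The function name is hypothetical.

```python
import math


def chi_square_p_value_df1(statistic):
    """Upper-tail probability P(X >= statistic) for chi-squared with 1 df."""
    if statistic < 0:
        raise ValueError("a chi-squared statistic cannot be negative")
    # P(Z^2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)) for standard normal Z.
    return math.erfc(math.sqrt(statistic / 2.0))


p_value = chi_square_p_value_df1(4.0)
print(round(p_value, 4))  # 0.0455
```

A value this small says that a fair coin would rarely produce counts this far from 50/50.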

Page 8

χ² test

•only the right-hand side of the table is used

•nondirectional test

•the statistic has no sign

Page 9

Biased Die?

        Observed    Expected
Die     Frequency   Frequency
1       4           10
2       6           10
3       17          10
4       16          10
5       8           10
6       9           10
sum     60          60

Page 10

χ² statistic formula

$$
\begin{aligned}
\chi^2 &= \sum \frac{(O - E)^2}{E} \\
       &= \frac{(4 - 10)^2}{10} + \frac{(6 - 10)^2}{10} + \frac{(17 - 10)^2}{10} + \frac{(16 - 10)^2}{10} + \frac{(8 - 10)^2}{10} + \frac{(9 - 10)^2}{10} \\
       &= 14.2
\end{aligned}
$$

Page 11

χ² distribution

degrees of freedom = number of terms - 1
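As a rough check of the die example, the sketch below recomputes the statistic, applies the "number of terms - 1" rule, and compares the result with a table value. The 5% significance level is an assumption made here for illustration. The slides do not state one. The critical value 11.07 is the standard chi-squared table entry for 5 degrees of freedom at that level.

```python
die_observed = [4, 6, 17, 16, 8, 9]  # counts for faces 1..6
die_expected = [10] * 6              # fair die, 60 rolls

statistic = sum((o - e) ** 2 / e for o, e in zip(die_observed, die_expected))
degrees_of_freedom = len(die_observed) - 1  # number of terms - 1

# Assumption: 5% significance level; 11.07 is the tabulated critical value for 5 df.
CRITICAL_VALUE_5PCT_DF5 = 11.07
reject_h0 = statistic > CRITICAL_VALUE_5PCT_DF5

print(round(statistic, 1), degrees_of_freedom, reject_h0)  # 14.2 5 True
```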

Page 12

χ² test: 2 x 2 contingency tables

Chi-squared test for independence

                 Var B
                 b1     b2     total
Var A   a1
        a2
        total

H0: The two variables are independent

Ha: The two variables are associated

Page 13

χ² test

             Result
Operator     def.    not def.   total
A            100     900        1000
B            60      440        500
total        160     1340       1500

Page 14

χ² test

             Result
Operator     def.    not def.   total
A            100     900        1000
B            60      440        500
total        160     1340       1500

Total number of items = 1500

Total number of defective items = 160

Overall defective rate = 160/1500 = 0.1067

Now, apply this rate to the number of items produced by each operator.

Page 15

χ² test

             Result
Operator     def.    not def.   total
A            100     900        1000
B            60      440        500
total        160     1340       1500

Expected defective from Operator A

= 1000 * 0.1067 = 106.7

(expected not defective = 1000 - 106.7 = 893.3)

Expected defective from Operator B

= 500 * 0.1067 = 53.3

(expected not defective = 500 - 53.3 = 446.7)
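The expected counts for both operators can be reproduced with a small Python sketch that applies the overall defective rate to each operator's output. The dictionary layout and variable names are illustrative choices. The counts are those in the table.

```python
observed = {
    "A": {"def": 100, "not def": 900},
    "B": {"def": 60, "not def": 440},
}

grand_total = sum(sum(row.values()) for row in observed.values())  # 1500
total_defective = sum(row["def"] for row in observed.values())     # 160
overall_rate = total_defective / grand_total                       # about 0.1067

expected = {}
for operator, row in observed.items():
    produced = sum(row.values())
    expected_def = produced * overall_rate  # apply the pooled rate
    expected[operator] = {"def": expected_def, "not def": produced - expected_def}

for operator, row in expected.items():
    print(operator, round(row["def"], 1), round(row["not def"], 1))
# A 106.7 893.3
# B 53.3 446.7
```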

Page 16

χ² test

             Expected
Operator     def.    not def.   total
A            106.7   893.3      1000
B            53.3    446.7      500
total        160     1340       1500

             Result
Operator     def.    not def.   total
A            100     900        1000
B            60      440        500
total        160     1340       1500
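With the observed and expected tables side by side, the statistic follows from the same (O - E)²/E sum over the four cells. The sketch below is illustrative. It uses unrounded expected counts (for example 1000 * 160 / 1500 rather than 106.7), so the result may differ slightly from a hand calculation with the rounded values. The 5% level and its table value 3.84 for 1 degree of freedom are assumptions added here, not part of the slides.

```python
observed = {
    ("A", "def"): 100, ("A", "not def"): 900,
    ("B", "def"): 60,  ("B", "not def"): 440,
}
# Expected count = row total * column total / grand total (unrounded).
expected = {
    ("A", "def"): 1000 * 160 / 1500, ("A", "not def"): 1000 * 1340 / 1500,
    ("B", "def"): 500 * 160 / 1500,  ("B", "not def"): 500 * 1340 / 1500,
}

statistic = sum((observed[cell] - expected[cell]) ** 2 / expected[cell]
                for cell in observed)
degrees_of_freedom = (2 - 1) * (2 - 1)  # (R - 1)(C - 1) for a 2 x 2 table

# Assumption: 5% significance level; 3.84 is the tabulated critical value for 1 df.
CRITICAL_VALUE_5PCT_DF1 = 3.84
print(round(statistic, 2), degrees_of_freedom, statistic > CRITICAL_VALUE_5PCT_DF1)
```

Under these assumptions the statistic falls well below the critical value, so the data would not contradict independence of operator and result.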

Page 17

χ² test

r x c contingency tables

        SA    A     NO    D     SD
Gr1     12    18    4     8     12
Gr2     48    22    10    8     10
Gr3     10    4     12    10    12
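The same procedure extends to a table with any number of rows and columns. The following is a minimal sketch, assuming each expected count is row total * column total / grand total, as in the 2 x 2 case. The function name `chi_square_independence` is hypothetical. The data are the three groups and five response categories from the table above.

```python
def chi_square_independence(table):
    """Return (statistic, degrees of freedom, expected table) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    # Expected count under independence for each cell.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)  # (R - 1)(C - 1)
    return statistic, df, expected


# Rows: Gr1, Gr2, Gr3; columns: SA, A, NO, D, SD.
survey = [
    [12, 18, 4, 8, 12],
    [48, 22, 10, 8, 10],
    [10, 4, 12, 10, 12],
]
statistic, df, expected = chi_square_independence(survey)
print(f"chi-squared = {statistic:.2f}, df = {df}")
```

The resulting statistic would then be compared with a chi-squared table at the reported degrees of freedom.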

Page 18

χ² test

•use when you have categorical data

•measure the difference between actual counts and expected counts

•test the independence of two variables

•Assumptions: the data set is a random sample, and you have at least 5 counts in each category

•degrees of freedom = (categories var1 - 1)(categories var2 - 1)