Chi-square Test and Correlation Statistics 101 - Villejo

Lecture 19. Chi-square and Correlation

Statistics 101 UP Diliman Lecture Slide

Page 1: Lecture 19. Chi-square and Correlation

Chi-square Test and Correlation

Statistics 101 - Villejo

Objectives

• Know the assumptions of the chi-square test for independence.

• Perform a chi-square test for independence.

• Learn the concept of correlation.

• Compute and interpret the coefficient of correlation.

• Perform the test of significance on the coefficient of correlation.

Chi-square Distribution

a. Like the t-distribution, it has a single parameter called the degrees of freedom.

b. The distribution is skewed to the right. As the degrees of freedom increase, the distribution becomes more symmetric.

c. Its mean is equal to its degrees of freedom. Its variance is twice its degrees of freedom.

d. Notation: If X is a random variable that follows a chi-square distribution with v degrees of freedom, we write $X \sim \chi^2(v)$.

e. If $X \sim \chi^2(v)$, then $\chi^2_\alpha(v)$ satisfies the condition that

$P(X > \chi^2_\alpha(v)) = \alpha$ or $P(X \le \chi^2_\alpha(v)) = 1 - \alpha$

Example

Suppose $X \sim \chi^2(v)$. Determine the following:

a. P(X > 18.307)

b. P(X < 20.483)

c. $\chi^2_{0.01}(10)$
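The three quantities in this example can be checked numerically. The sketch below is a minimal illustration using only the Python standard library. It assumes v = 10 degrees of freedom, which is an assumption on our part (it matches part c and the usual tabled values 18.307 and 20.483). The helper names `chi2_cdf` and `chi2_critical` are hypothetical, not part of any library.

```python
import math

def chi2_cdf(x, df):
    """P(X <= x) for X ~ chi-square(df), using the power series for the
    regularized lower incomplete gamma function."""
    if x <= 0:
        return 0.0
    a, z = df / 2.0, x / 2.0
    term = 1.0 / a
    total = term
    n = 1
    while term > 1e-16 * total:          # stop once terms no longer matter
        term *= z / (a + n)
        total += term
        n += 1
    return total * math.exp(-z + a * math.log(z) - math.lgamma(a))

def chi2_critical(alpha, df):
    """Value c with P(X > c) = alpha, located by bisection."""
    lo, hi = 0.0, float(df)
    while 1.0 - chi2_cdf(hi, df) > alpha:  # widen until the upper tail is below alpha
        hi *= 2.0
    for _ in range(100):
        mid = (lo + hi) / 2.0
        if 1.0 - chi2_cdf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

v = 10                                  # assumed degrees of freedom
p_a = 1.0 - chi2_cdf(18.307, v)         # part a: P(X > 18.307)
p_b = chi2_cdf(20.483, v)               # part b: P(X < 20.483)
c = chi2_critical(0.01, v)              # part c: upper 1% point
print(round(p_a, 3), round(p_b, 3), round(c, 3))
```

Under the v = 10 assumption, the results should agree with a chi-square table: about 0.05 for part a, 0.975 for part b, and 23.209 for part c.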

• The chi-square test for independence is used to determine whether two categorical variables are related.

• Example:

Remarks

• An r x c contingency table has r rows and c columns.

• The marginal frequencies are the row and column totals.

Procedure in testing for independence

1. State the null and alternative hypotheses:

Ho: the two variables are independent

Ha: the two variables are not independent

2. Choose the level of significance α

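3. The test statistic is:

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$

where the sum runs over all cells of the table, the $O_i$'s are the observed frequencies, and the $E_i$'s are the expected frequencies given by:

$E_i = \frac{(\text{row total}) \times (\text{column total})}{\text{grand total}}$

Decision Rule: Reject Ho if $\chi^2 > \chi^2_\alpha(v)$, where $v = (r - 1)(c - 1)$.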

4. Compute the value of the test statistic.

5. Make the decision (whether the two variables are independent or not). Express this decision in terms of the problem.
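The five steps can be sketched in code. The example below is a minimal illustration, not a reference implementation. The 2 x 3 table of counts is hypothetical, and the function name `chi2_independence` is our own. Expected frequencies are taken as (row total x column total) / grand total. Because the table has v = (2 - 1)(3 - 1) = 2 degrees of freedom, the critical value uses the closed form -2 ln(α), which holds only for v = 2; for other tables the critical value would come from a chi-square table.

```python
import math

def chi2_independence(table):
    """Return (statistic, degrees of freedom, expected counts) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Expected count for each cell under independence
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    stat = sum((o - e) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Hypothetical observed counts (2 rows, 3 columns), for illustration only
observed = [[30, 20, 10],
            [20, 30, 40]]

alpha = 0.05                                   # step 2: level of significance
stat, df, expected = chi2_independence(observed)  # steps 3 and 4
critical = -2.0 * math.log(alpha)              # exact upper-alpha point, valid for df == 2 only
reject = stat > critical                       # step 5: decision rule
print(round(stat, 4), df, round(critical, 3), reject)
```

For these hypothetical counts the statistic is about 16.667 against a critical value of about 5.991, so Ho (independence) would be rejected at the 0.05 level.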

The chi-square test for independence is valid only if

1. At least 80% of the cells have expected frequencies ≥ 5.

2. No cell has expected frequency ≤ 1.

• For a 2 x 2 contingency table, we use Yates' correction for continuity:

$\chi^2 = \sum_i \frac{(|O_i - E_i| - 0.5)^2}{E_i}$
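The two validity conditions and the continuity correction can be sketched as follows. This is a minimal illustration with a hypothetical 2 x 2 table. The function names are our own, and the corrected statistic assumes the usual form of Yates' correction, in which 0.5 is subtracted from each absolute difference |O - E| before squaring.

```python
def expected_counts(table):
    """Expected frequencies: (row total x column total) / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]

def conditions_hold(expected):
    """Check both validity conditions stated above."""
    cells = [e for row in expected for e in row]
    enough_large = sum(e >= 5 for e in cells) / len(cells) >= 0.80
    none_too_small = all(e > 1 for e in cells)
    return enough_large and none_too_small

def yates_chi2(table):
    """Chi-square statistic with Yates' continuity correction (2 x 2 tables)."""
    expected = expected_counts(table)
    return sum((abs(o - e) - 0.5) ** 2 / e
               for obs_row, exp_row in zip(table, expected)
               for o, e in zip(obs_row, exp_row))

# Hypothetical 2 x 2 counts, for illustration only
observed = [[12, 8],
            [6, 14]]
valid = conditions_hold(expected_counts(observed))
corrected = yates_chi2(observed)
print(valid, round(corrected, 4))
```

Here every expected count is 9 or 11, so both conditions hold, and the corrected statistic is about 2.525. The uncorrected statistic for the same table would be about 3.636, which shows that the correction makes the test more conservative.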

Example. People are chosen at random from the state of Illinois until 1000 people have been classified as to whether they are of Protestant, Catholic, or Jewish faith, and whether or not they worship regularly. Test at 0.05 level of significance if religion is independent of pattern of worship.

Definition

• The coefficient of correlation is a measure of the strength of the linear relationship existing between two variables, X and Y, that is independent of their respective scales of measurement.

• This is denoted by the Greek letter ρ (rho), where -1 ≤ ρ ≤ 1.

• ρ = 1 → perfect positive linear relationship

• ρ = -1 → perfect negative linear relationship

• ρ = 0 → no linear relationship between X and Y

Remarks

• The coefficient of correlation can establish the strength of the linear relationship between two quantitative variables, but it cannot establish causality!

• We cannot say X causes Y or Y causes X.

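Pearson product-moment coefficient of correlation (the sample estimate r of ρ):

$r = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2 \, \sum_i (y_i - \bar{y})^2}}$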
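As a minimal sketch, the sample coefficient can be computed as the sum of cross-products of deviations from the means, divided by the square root of the product of the two sums of squared deviations. The data values and the function name `pearson_r` below are hypothetical and serve only as an illustration.

```python
import math

def pearson_r(xs, ys):
    """Sample Pearson product-moment correlation coefficient."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical paired observations, for illustration only
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)
r_line = pearson_r([1, 2, 3], [2, 4, 6])   # points on an exact increasing line
print(round(r, 4), r_line)
```

For the hypothetical data, r is about 0.7746, a fairly strong positive linear relationship. Points that fall exactly on an increasing line give r = 1, the perfect positive case.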

Testing the Correlation Coefficient
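A common way to test Ho: ρ = 0 against Ha: ρ ≠ 0 uses the statistic t = r√(n - 2) / √(1 - r²), which follows a t-distribution with n - 2 degrees of freedom when Ho is true. The sketch below assumes that test. The values of r and n are hypothetical, the function name `correlation_t` is our own, and the critical value 2.306 is the tabled t value for α/2 = 0.025 with 8 degrees of freedom.

```python
import math

def correlation_t(r, n):
    """t statistic for testing Ho: rho = 0, with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1.0 - r * r)

# Hypothetical sample correlation and sample size, for illustration only
r, n = 0.80, 10
t = correlation_t(r, n)

t_critical = 2.306              # t table value for alpha/2 = 0.025, df = n - 2 = 8
reject = abs(t) > t_critical    # two-sided decision rule
print(round(t, 4), reject)
```

With these hypothetical values, t is about 3.771, which exceeds 2.306, so Ho would be rejected at the 0.05 level and the linear relationship judged significant.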