Virtual COMSATS Inferential Statistics, Lecture-25, Ossam Chohan, Assistant Professor, CIIT Abbottabad


Recap of previous lectures

• We are working on hypothesis testing. We discussed the following topics in our previous lectures:
– Introduction to hypothesis testing.
– Six steps of hypothesis testing.
– Hypothesis testing for a single population, i.e. for mean and proportion.
– Hypothesis testing for paired observations.
– Chi-square distribution.
– Test of independence.
– Test for homogeneity.
– Test for variances.
– Goodness of fit test.
– Fisher’s Exact Test.
– F-distribution.
– ANOVA.
– One-way ANOVA.
– Two-way ANOVA.
– Multiple comparison using LSD.


Objective of lecture-25

• Introduction to Correlation and Regression.
– Simple correlation and its significance in research.
– Solution to problems.
– Properties of correlation.
– Scatter plot.
– Coefficient of determination.
– Regression analysis.
– Simple regression model.
– Probable error and standard error.
– Hypothesis testing for the correlation coefficient.

Correlation and Regression

• Is there a relationship between x and y?
• What is the strength of this relationship?
– Pearson’s r
• Can we describe this relationship and use it to predict y from x?
– Regression
• Is the relationship we have described statistically significant?
– F- and t-tests

Discussion on Correlation


Correlation

• Correlation analysis is used to measure the strength of the association (linear relationship) between two variables, such as fertilizer and yield.
– It is only concerned with the strength of the relationship.
– No causal effect is implied.
– The sample correlation coefficient is represented by r.

Scatter Plot

• A scatter plot is a graph of a collection of ordered pairs (x , y).

• A scatter plot (or scatter diagram) is used to show the relationship between two variables.

• Correlation can be represented using a scatter plot.

• Why a scatter plot?

Scatter Plot Examples

[Figure: scatter plots of y against x. Panels: linear relationships; curvilinear relationships.]

Scatter Plot Examples (continued)

[Figure: scatter plots of y against x. Panels: strong relationships; weak relationships.]

Can we show it numerically?

Scatter Plot Examples (continued)

[Figure: scatter plots of y against x. Panel: no relationship.]

Correlation Coefficient

• The population correlation coefficient is denoted by the Greek letter ρ (pronounced “rho”).

• It measures the strength of the association between the variables.

• The sample correlation coefficient r is an estimate of ρ.

• It is used to measure the strength of the linear relationship in the sample observations.

Properties of ρ and r

• Unit free.
• Values always lie between -1 and 1.
• The closer to -1, the stronger the negative linear relationship.
• The closer to 1, the stronger the positive linear relationship.
• The closer to 0, the weaker the linear relationship.
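These properties can be checked numerically. The sketch below is illustrative only: the fertilizer and yield figures are made-up numbers, not data from the lecture, and `pearson_r` is a hypothetical helper name. It shows that changing the unit of x leaves r unchanged, and that reversing the direction of y flips the sign of r without changing its magnitude.

```python
import math

def pearson_r(xs, ys):
    """Sample correlation coefficient computed from deviations about the means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data (assumption): fertilizer in kg, yield in tonnes.
fertilizer_kg = [10, 20, 30, 40, 50]
yield_tonnes = [2.1, 2.9, 4.2, 4.8, 6.1]

r_original = pearson_r(fertilizer_kg, yield_tonnes)

# Unit free: expressing fertilizer in grams does not change r.
fertilizer_g = [1000 * x for x in fertilizer_kg]
r_rescaled = pearson_r(fertilizer_g, yield_tonnes)

# Negating y turns a positive linear relationship into a negative one.
r_negated = pearson_r(fertilizer_kg, [-y for y in yield_tonnes])

print(r_original, r_rescaled, r_negated)
```

All three values lie between -1 and 1, and the first two agree up to floating-point rounding.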


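Calculating the Correlation Coefficient

Sample correlation coefficient:

$$r = \frac{\sum (x - \bar{x})(y - \bar{y})}{\sqrt{\left[\sum (x - \bar{x})^2\right]\left[\sum (y - \bar{y})^2\right]}}$$

In algebraic equivalent:

$$r = \frac{n\sum xy - \sum x \sum y}{\sqrt{\left[n\sum x^2 - \left(\sum x\right)^2\right]\left[n\sum y^2 - \left(\sum y\right)^2\right]}}$$

where:
r = sample correlation coefficient
n = sample size (number of pairs of values)
x = value of the independent variable
y = value of the dependent variable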
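As a rough check that the deviation form and the sums-based (algebraic) form of the sample correlation coefficient agree, the following sketch computes r both ways. The function names and the five data pairs are assumptions chosen for illustration; they do not come from the lecture.

```python
import math

def r_from_deviations(xs, ys):
    """r from deviations of x and y about their sample means."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def r_from_sums(xs, ys):
    """r from the raw sums: n, sum x, sum y, sum xy, sum x^2, sum y^2."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# Made-up pairs (assumption), small enough to verify by hand:
# numerator = 5*66 - 15*20 = 30, denominator = sqrt(50 * 30).
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

r_dev = r_from_deviations(xs, ys)
r_sum = r_from_sums(xs, ys)
print(r_dev, r_sum)
```

The sums-based form is usually the more convenient one for hand calculation, since it needs only a table of x, y, xy, x² and y² with column totals.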


Problem-27

• The test-retest method is one way of establishing the reliability of a test. The test is administered and then, at a later date, the same test is re-administered to the same individuals. Find the correlation coefficient between the two sets of scores.

First score:  75 87 60 75 98 80 68 84 47 72
Second score: 72 90 52 75 94 78 72 80 53 70

Problem-27 Solution
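One possible way to work the problem is with the sums-based form of the sample correlation coefficient, treating the first score as x and the second score as y. The sketch below is not the worked solution from the lecture; the variable names are chosen here for illustration, and the column totals in the comments were computed from the ten score pairs given in the problem.

```python
import math

first = [75, 87, 60, 75, 98, 80, 68, 84, 47, 72]    # x: first score
second = [72, 90, 52, 75, 94, 78, 72, 80, 53, 70]   # y: second score

n = len(first)                                       # 10 pairs
sum_x = sum(first)                                   # 746
sum_y = sum(second)                                  # 736
sum_xy = sum(x * y for x, y in zip(first, second))   # 56574
sum_x2 = sum(x * x for x in first)                   # 57496
sum_y2 = sum(y * y for y in second)                  # 55826

numerator = n * sum_xy - sum_x * sum_y               # 565740 - 549056 = 16684
denominator = math.sqrt(
    (n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2)  # 18444 * 16564
)
r = numerator / denominator

print(f"r = {r:.4f}")  # prints "r = 0.9545"
```

A value of r this close to 1 indicates a strong positive linear relationship between the two administrations of the test.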


Coefficient of Determination, R²

• The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable.

• The coefficient of determination is also called R-squared and is denoted as R², where

$$0 \le R^2 \le 1$$

Note: In the single independent variable case, the coefficient of determination is

$$R^2 = r^2$$

where:
R² = coefficient of determination
r = simple correlation coefficient
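To illustrate the single-variable case, the sketch below fits a least-squares line to the test-retest scores from Problem-27 and compares the explained share of variation with the square of r. Using a least-squares fit here is an assumption made for illustration (the regression model has not yet been introduced at this point), and the variable names are hypothetical.

```python
import math

xs = [75, 87, 60, 75, 98, 80, 68, 84, 47, 72]   # first score (independent)
ys = [72, 90, 52, 75, 94, 78, 72, 80, 53, 70]   # second score (dependent)

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
sxx = sum((x - mean_x) ** 2 for x in xs)
syy = sum((y - mean_y) ** 2 for y in ys)         # total variation in y

r = sxy / math.sqrt(sxx * syy)                   # simple correlation coefficient

# Least-squares line y_hat = b0 + b1 * x (assumed fitting method).
b1 = sxy / sxx
b0 = mean_y - b1 * mean_x
fitted = [b0 + b1 * x for x in xs]

# Variation in y explained by the fitted line, as a share of total variation.
explained = sum((y_hat - mean_y) ** 2 for y_hat in fitted)
r_squared = explained / syy

print(round(r_squared, 3), round(r ** 2, 3))
```

Both printed values agree, at roughly 0.911: about 91% of the variation in the second score is explained by the first score.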