Computing in Archaeology
Session 11. Correlation and regression analysis
© Richard Haddlesey www.medievalarchitecture.net


Lecture aims

To introduce correlation and regression techniques

The scattergram

In correlation, we are always dealing with paired scores, and so values of the two variables taken together will be used to make a scattergram

Example

Quantities of New Forest pottery recovered from sites at varying distances from the kilns

Site   Distance (km)   Quantity
1      4               98
2      20              60
3      32              41
4      34              47
5      24              62

Negative correlation

Here we can see that the quantity of pottery decreases as distance from the source increases

Positive correlation

Here we see that the taller a pot, the wider the rim

Curvilinear monotonic relation

Again, the further from the source, the smaller the quantity of artefacts

Arched relationship (non-monotonic)

Here we see that the first molar grows with age and is then worn down as the animal gets older

Scattergram

This shows us that scattergrams are the most important means of studying relationships between two variables

REGRESSION

Regression differs from other techniques we have looked at so far in that it is concerned not just with whether or not a relationship exists, or the strength of that relationship, but with its nature

In regression analysis we use an independent variable to estimate (or predict) the values of a dependent variable

Regression equation

y = f(x)

y = y axis (in this case the dependent variable)
f = function (of x)
x = x axis

Examples of y = f(x):

y = x
y = 2x
y = x²
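As a small illustration of what y = f(x) means in practice, the sketch below evaluates the three example functions at a few x values. The x values and the helper name `tabulate` are illustrative assumptions, not taken from the slides.

```python
# Evaluate the three example functions from the slide at a few x values.
# The x values below are arbitrary illustrations, not data from the lecture.

def tabulate(f, xs):
    """Return the list of (x, f(x)) pairs for the given x values."""
    return [(x, f(x)) for x in xs]

functions = {
    "y = x": lambda x: x,
    "y = 2x": lambda x: 2 * x,
    "y = x^2": lambda x: x ** 2,
}

xs = [0, 1, 2, 3]
for label, f in functions.items():
    print(label, tabulate(f, xs))
```

The first two functions plot as straight lines of different slope; the third plots as a curve, because its rate of increase grows with x.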

General linear equations

y = a + bx

Where y is the dependent variable, x is the independent variable, and the coefficients a and b are constants, i.e. they are fixed for a given data set

Therefore:

If x = 0 then the equation reduces to y = a, so a represents the point where the regression line crosses the y axis (the intercept)

The b constant defines the slope or gradient of the regression line

Thus for the pottery quantity in relation to distance from source, b represents the amount of decrease in pottery quantity for each unit increase in distance from the source

y = a + bx

Least-squares
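The least-squares slides survive here only as titles. As a reminder, and assuming the usual ordinary least-squares criterion (which the surviving text does not spell out), the line y = a + bx is chosen to minimise the sum of squared vertical distances between the observed and predicted y values. That criterion leads to the following estimates of b and a:

```latex
\[
\min_{a,\,b}\ \sum_{i=1}^{n} \bigl(y_i - (a + b x_i)\bigr)^2
\quad\Longrightarrow\quad
b = \frac{n\sum xy - (\sum x)(\sum y)}{n\sum x^2 - (\sum x)^2},
\qquad
a = \bar{y} - b\,\bar{x}
\]
```

Here n is the number of paired observations, and x̄ and ȳ are the means of the two variables.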


y = 102.64 – 1.8x
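The fitted line can be checked against the New Forest table given earlier in the lecture. The sketch below applies the standard least-squares formulas to those five sites; the function name `least_squares` is an assumption for illustration. Unrounded, the slope comes out near -1.80 and the intercept near 102.69. Rounding the slope to -1.8 before computing the intercept gives 102.64, the value quoted on the slide.

```python
# New Forest pottery data from the lecture: distance from the kilns (km)
# and quantity of pottery recovered at each of the five sites.
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

def least_squares(xs, ys):
    """Return (a, b) for the least-squares line y = a + b*x."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = sum_y / n - b * sum_x / n  # the line passes through the mean point
    return a, b

a, b = least_squares(distance, quantity)
print(f"unrounded:     a = {a:.2f}, b = {b:.2f}")

# Rounding the slope first reproduces the intercept quoted on the slide.
b_rounded = round(b, 1)
mean_x = sum(distance) / len(distance)
mean_y = sum(quantity) / len(quantity)
a_from_rounded = mean_y - b_rounded * mean_x
print(f"rounded slope: a = {a_from_rounded:.2f}, b = {b_rounded}")
```

Once a and b are known, the equation can be used for prediction: substituting a distance for x gives the estimated quantity y at that distance.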

CORRELATION

1 correlation coefficient
• r
• -1 to +1

2 significance

Levels of measurement:

• nominal – in name only
• ordinal – forming a sequence
• interval – a sequence with fixed distances
• ratio – fixed distances with a datum point

Product-Moment Correlation Coefficient: interval or ratio
Spearman’s Rank Correlation Coefficient: ordinal, interval or ratio
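The link between level of measurement and choice of coefficient can be sketched as a small lookup. This is one reading of the slides, not a rule stated in them: it assumes the weaker of the two variables limits the choice, which is consistent with the later requirement that both variables be at least ordinal for Spearman's coefficient. The names `Level` and `applicable_coefficients` are hypothetical.

```python
from enum import IntEnum

class Level(IntEnum):
    """Levels of measurement, ordered from weakest to strongest."""
    NOMINAL = 1
    ORDINAL = 2
    INTERVAL = 3
    RATIO = 4

def applicable_coefficients(x_level, y_level):
    """Return the coefficients from this lecture usable for a pair of variables."""
    weakest = min(x_level, y_level)  # assumption: the weaker variable decides
    options = []
    if weakest >= Level.INTERVAL:
        options.append("product-moment")
    if weakest >= Level.ORDINAL:
        options.append("Spearman's rank")
    return options

print(applicable_coefficients(Level.RATIO, Level.RATIO))
print(applicable_coefficients(Level.ORDINAL, Level.RATIO))
print(applicable_coefficients(Level.NOMINAL, Level.RATIO))
```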


The Product-Moment Correlation Coefficient

Sample: 20 bronze spearheads, length (cm) and width (cm)

n = 20

$$ r = \frac{n\sum xy - (\sum x)(\sum y)}{\sqrt{[n\sum x^2 - (\sum x)^2]\,[n\sum y^2 - (\sum y)^2]}} = +0.67 $$
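The spearhead measurements themselves are not reproduced in these slides, so the sketch below applies the same raw-sums formula to the New Forest pottery table from earlier in the lecture instead. The function name `pearson_r` is an assumption for illustration.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Product-moment correlation coefficient using the raw-sums formula."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    sum_y2 = sum(y * y for y in ys)
    numerator = n * sum_xy - sum_x * sum_y
    denominator = sqrt((n * sum_x2 - sum_x ** 2) * (n * sum_y2 - sum_y ** 2))
    return numerator / denominator

# New Forest pottery: distance from the kilns (km) and quantity recovered.
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

r = pearson_r(distance, quantity)
print(round(r, 2))  # strongly negative: quantity falls as distance rises
```

The sign of r gives the direction of the relationship and its size gives the strength, so a value close to -1 matches the negative correlation described for the pottery data.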

Test of product moment correlation coefficient

H0 : true correlation coefficient = 0

H1 : true correlation coefficient ≠ 0

Assumptions: both variables approximately normally distributed

Sample statistics needed: n and r

Test statistic: TS = r

Table: product moment correlation coefficient table.

n = 20, r = 0.67, p < 0.01

The correlation between spearhead length (cm) and width (cm) is therefore significant at the 1% level, and H0 is rejected
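The slides read significance from a printed table of critical values. An equivalent check is sketched here, under the assumption that such tables are based on the usual t transformation of r with n - 2 degrees of freedom. It converts r to t and compares it with the two-tailed 1% critical value of t for 18 degrees of freedom (2.878, taken from standard t tables rather than from the slides). The function name `t_from_r` is illustrative.

```python
from math import sqrt

def t_from_r(r, n):
    """Convert a correlation coefficient to a t statistic with n - 2 d.f."""
    return r * sqrt((n - 2) / (1 - r ** 2))

n, r = 20, 0.67              # values reported on the slide
T_CRIT_18_DF_1PCT = 2.878    # two-tailed 1% critical t for 18 d.f. (standard tables)

t = t_from_r(r, n)
significant = abs(t) > T_CRIT_18_DF_1PCT
print(f"t = {t:.2f}, significant at 1%: {significant}")
```

Because the computed t exceeds the critical value, the result agrees with the p < 0.01 reported on the slide.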

Spearman’s Rank Correlation Coefficient (rs)

H0 : true correlation coefficient = 0

H1 : true correlation coefficient ≠ 0

Assumptions: both variables at least ordinal

Sample statistics needed: n and rs

Test statistic: TS = rs

Table: Spearman’s rank correlation coefficient table
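The slides do not show how rs is calculated. As a minimal sketch, the code below ranks the New Forest pottery data from earlier in the lecture and applies the common shortcut formula rs = 1 - 6Σd² / (n(n² - 1)), where d is the difference between the two ranks of each pair. Using this formula is an assumption here, and it is valid only when there are no tied ranks. The names `ranks` and `spearman_rs` are illustrative.

```python
def ranks(values):
    """Rank values from 1 (smallest) upwards; assumes there are no ties."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]

def spearman_rs(xs, ys):
    """Spearman's rank correlation via the no-ties shortcut formula."""
    n = len(xs)
    d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# New Forest pottery: distance from the kilns (km) and quantity recovered.
distance = [4, 20, 32, 34, 24]
quantity = [98, 60, 41, 47, 62]

rs = spearman_rs(distance, quantity)
print(round(rs, 2))
```

The resulting rs is then compared with the critical value for the given n in the Spearman table, as in the test outlined on the slide.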