Statistics Psych 231: Research Methods in Psychology

Page 1: Statistics Psych 231: Research Methods in Psychology

Statistics

Psych 231: Research Methods in Psychology

Page 2: Statistics Psych 231: Research Methods in Psychology

Distribution

Properties of a distribution

– Shape
  • Symmetric v. asymmetric (skew)
  • Unimodal v. multimodal

– Center
  • Where most of the data in the distribution are
  • Mean, Median, Mode

– Spread (variability)
  • How similar/dissimilar are the scores in the distribution?
  • Standard deviation (variance), Range

Page 3: Statistics Psych 231: Research Methods in Psychology

Center

There are three main measures of center

– Mean (M): the arithmetic average
  • Add up all of the scores and divide by the total number
  • Most used measure of center

– Median (Mdn): the middle score in terms of location
  • The score that cuts off the top 50% of the scores from the bottom 50%
  • Good for skewed distributions (e.g. net worth)

– Mode: the most frequent score
  • Good for nominal scales (e.g. eye color)
  • A must for multi-modal distributions

Page 4: Statistics Psych 231: Research Methods in Psychology

The Mean

The most commonly used measure of center
The arithmetic average

– Computing the mean

– The formula for the population mean is (a parameter):

  $\mu = \frac{\sum X}{N}$

  Add up all of the X’s, then divide by the total number in the population

– The formula for the sample mean is (a statistic):

  $\bar{X} = \frac{\sum X}{n}$

  Add up all of the X’s, then divide by the total number in the sample

Page 5: Statistics Psych 231: Research Methods in Psychology

The Mean

The most commonly used measure of center
The arithmetic average

– Computing the mean

Our population: 2, 4, 6, 8

$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$
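The same arithmetic can be checked with a few lines of Python. This is only a sketch; the function name `mean` is chosen here for illustration and is not part of the slides.

```python
# Population from the slide.
population = [2, 4, 6, 8]


def mean(scores):
    """Arithmetic average: add up all of the scores, divide by how many there are."""
    return sum(scores) / len(scores)


mu = mean(population)
print(mu)  # 5.0
```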

Page 6: Statistics Psych 231: Research Methods in Psychology

Spread (Variability)

How similar are the scores?

– Range: the maximum value minus the minimum value
  • Only takes two scores from the distribution into account
  • Influenced by extreme values (outliers)

– Standard deviation (SD): (essentially) the average amount that the scores in the distribution deviate from the mean
  • Takes all of the scores into account
  • Also influenced by extreme values (but not as much as the range)

– Variance: standard deviation squared
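As a rough illustration of these three measures, the sketch below computes each of them for the population 2, 4, 6, 8 and again after adding one extreme score. The outlier value 40 is a hypothetical addition, not from the slides. The helper `value_range` is a name chosen here, and the SD and variance come from Python's standard `statistics` module (population versions).

```python
import statistics

scores = [2, 4, 6, 8]         # population used in these slides
with_outlier = scores + [40]  # hypothetical extreme score, added for illustration


def value_range(xs):
    """Range: maximum value minus minimum value (uses only two scores)."""
    return max(xs) - min(xs)


for data in (scores, with_outlier):
    sd = statistics.pstdev(data)      # population standard deviation
    var = statistics.pvariance(data)  # population variance = SD squared
    print(data, "range:", value_range(data), "SD:", round(sd, 2), "variance:", round(var, 2))
```

Both the range and the SD grow once the extreme score is included.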

Page 7: Statistics Psych 231: Research Methods in Psychology

Variability

Low variability
– The scores are fairly similar

High variability
– The scores are fairly dissimilar

[Figure: two distributions, each labeled with its mean]

Page 8: Statistics Psych 231: Research Methods in Psychology

Standard deviation

The standard deviation is the most popular and most important measure of variability.

– In essence, the standard deviation measures how far all of the individuals in the distribution are from a standard, where that standard is the mean of the distribution (μ). Essentially, it is the average of the deviations.

Page 9: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

Step 1: To get a measure of the deviation we need to subtract the population mean from every individual in our distribution.

Our population: 2, 4, 6, 8

$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$

X - μ = deviation scores

2 - 5 = -3

[Figure: number line from 1 to 10 marking μ and the deviation of -3]


Page 12: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

Step 1: To get a measure of the deviation we need to subtract the population mean from every individual in our distribution.

Our population: 2, 4, 6, 8

$\mu = \frac{\sum X}{N} = \frac{2 + 4 + 6 + 8}{4} = \frac{20}{4} = 5.0$

X - μ = deviation scores

2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3

[Figure: number line from 1 to 10 marking μ and the deviation of each score]

Notice that if you add up all of the deviations they must equal 0.
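The deviation scores for this population, and the fact that they sum to zero, can be verified with a short sketch (variable names are chosen here for illustration):

```python
population = [2, 4, 6, 8]
mu = sum(population) / len(population)  # population mean, 5.0

# Step 1: subtract the mean from every score.
deviations = [x - mu for x in population]

print(deviations)       # [-3.0, -1.0, 1.0, 3.0]
print(sum(deviations))  # 0.0
```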

Page 13: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

Step 2: Because the deviations always sum to 0, we have to get rid of the negative signs. We do this by squaring the deviations and then adding them up to get the sum of the squared deviations (SS).

X - μ = deviation scores

2 - 5 = -3
4 - 5 = -1
6 - 5 = +1
8 - 5 = +3

$SS = \sum (X - \mu)^2 = (-3)^2 + (-1)^2 + (+1)^2 + (+3)^2 = 9 + 1 + 1 + 9 = 20$

Page 14: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

Step 3: Now we have the sum of squares (SS), but what we want is the variance, which is simply the average of the squared deviations.

– We want the population variance, not just the SS, because the SS depends on the number of individuals in the population, so we want the mean
  • So to get the mean, we need to divide by the number of individuals in the population.

variance = $\sigma^2 = \frac{SS}{N}$

Page 15: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

Step 4: However, the population variance isn’t exactly what we want; we want the standard deviation from the mean of the population. To get this we need to take the square root of the population variance.

$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$

standard deviation = $\sigma = \sqrt{\sigma^2} = \sqrt{\frac{\sum (X - \mu)^2}{N}}$

Page 16: Statistics Psych 231: Research Methods in Psychology

Computing standard deviation (population)

To review:

– Step 1: compute deviation scores

– Step 2: compute the SS
  • either by using the definitional formula or the computational formula

– Step 3: determine the variance
  • take the average of the squared deviations
  • divide the SS by N

– Step 4: determine the standard deviation
  • take the square root of the variance
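The four steps can be sketched in Python for the population 2, 4, 6, 8 used in these slides. This follows the definitional formula only. The variable names are chosen here for illustration.

```python
import math

population = [2, 4, 6, 8]
N = len(population)
mu = sum(population) / N  # population mean

# Step 1: deviation scores (X - mu)
deviations = [x - mu for x in population]

# Step 2: sum of squared deviations (SS), definitional formula
ss = sum(d ** 2 for d in deviations)

# Step 3: population variance = SS / N
variance = ss / N

# Step 4: population standard deviation = square root of the variance
sd = math.sqrt(variance)

print(ss, variance, round(sd, 3))  # 20.0 5.0 2.236
```

For this population the variance works out to 20 / 4 = 5.0, so the standard deviation is the square root of 5, about 2.24.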

Page 17: Statistics Psych 231: Research Methods in Psychology

Relationships between variables

Suppose that you notice that the more you study for an exam, the better your score typically is.
– This suggests that there is a relationship between study time and test performance (a “correlation”).

Properties of a correlation
– Form (linear or non-linear)
– Direction (positive or negative)
– Strength (none, weak, strong, perfect)

To examine this relationship you should:
– Make a scatterplot
– Compute the Correlation Coefficient

Page 18: Statistics Psych 231: Research Methods in Psychology

Scatterplot

Plots one variable against the other

Useful for “seeing” the relationship
– Form, Direction, and Strength

Each point corresponds to a different individual

Imagine a line through the data points

Page 19: Statistics Psych 231: Research Methods in Psychology

Scatterplot

X   Y
6   6
1   2
5   6
3   4
3   2

[Figure: scatterplot of the five (X, Y) points; both axes run from 1 to 6]

Page 20: Statistics Psych 231: Research Methods in Psychology

Correlation Coefficient

A numerical description of the relationship between two variables

For the relationship between two continuous variables, we use Pearson’s r

It basically tells us how much our two variables vary together
– As X goes up, what does Y typically do?

• X, Y• X, Y• X, Y
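The slides do not give a formula for Pearson’s r. As a minimal sketch, assuming the standard definitional formula $r = \frac{SP}{\sqrt{SS_X \cdot SS_Y}}$ (where SP is the sum of the products of the X and Y deviations), the code below computes r for the five (X, Y) pairs listed on the scatterplot slide. The function name `pearson_r` is chosen here for illustration.

```python
import math

# The five (X, Y) pairs from the scatterplot slide.
x = [6, 1, 5, 3, 3]
y = [6, 2, 6, 4, 2]


def pearson_r(xs, ys):
    """Pearson's r via the definitional formula SP / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # SP: sum of products of paired deviations
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    # SS for each variable: sum of squared deviations
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    ss_y = sum((b - mean_y) ** 2 for b in ys)
    return sp / math.sqrt(ss_x * ss_y)


r = pearson_r(x, y)
print(round(r, 2))  # 0.9
```

A value near +0.9 indicates a strong positive relationship: as X goes up, Y typically goes up.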

Page 21: Statistics Psych 231: Research Methods in Psychology

Form

[Figure: two scatterplots, one labeled Non-linear and one labeled Linear]

Page 22: Statistics Psych 231: Research Methods in Psychology

Direction

Positive
• As X goes up, Y goes up
• X & Y vary in the same direction
• positive Pearson’s r

Negative
• As X goes up, Y goes down
• X & Y vary in opposite directions
• negative Pearson’s r

[Figure: two scatterplots of Y against X, one for each direction]

Page 23: Statistics Psych 231: Research Methods in Psychology

Strength

Zero means “no relationship”.
– The farther the r is from zero, the stronger the relationship

The strength of the relationship
– Spread around the line (note the axis scales)

Page 24: Statistics Psych 231: Research Methods in Psychology

Strength

r = 1.0: “perfect positive corr.”, $r^2$ = 100%

r = -1.0: “perfect negative corr.”, $r^2$ = 100%

r = 0.0: “no relationship”, $r^2$ = 0%

[Figure: scale of r running from -1.0 through 0.0 to +1.0]

The farther from zero, the stronger the relationship

Page 25: Statistics Psych 231: Research Methods in Psychology

Strength

Which relationship is stronger?

Rel A: r = -0.8, $r^2$ = 64%

Rel B: r = 0.5, $r^2$ = 25%

[Figure: scale of r from -1.0 to +1.0 with -.8 and .5 marked]

Rel A: -0.8 is stronger than +0.5
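The comparison depends on distance from zero, not on sign. A small sketch using the two r values from this slide (the dictionary and variable names are chosen here for illustration):

```python
# r values for the two relationships on the slide.
relationships = {"Rel A": -0.8, "Rel B": 0.5}

for name, r in relationships.items():
    # Strength is the distance of r from zero; r squared is shown as a percentage.
    print(name, "|r| =", abs(r), "r^2 =", round(r ** 2 * 100), "%")

# The stronger relationship is the one whose r is farther from zero.
stronger = max(relationships, key=lambda name: abs(relationships[name]))
print(stronger)  # Rel A
```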


Page 26: Statistics Psych 231: Research Methods in Psychology

Regression

Compute the equation for the line that best fits the data points

Y = (X)(slope) + (intercept)

slope = (Change in Y) / (Change in X) = 0.5

intercept = 2.0

[Figure: Y plotted against X, both axes from 1 to 6, with the fitted line]
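The slides do not say how the best-fitting line is computed. A minimal sketch, assuming the usual least-squares criterion, is shown below. The three data points are hypothetical and were chosen to lie exactly on Y = (X)(0.5) + 2.0, so the fit recovers the slope and intercept used in these slides. The function name `fit_line` is also an assumption made for illustration.

```python
# Hypothetical points lying exactly on Y = 0.5 * X + 2.0 (not from the slides).
x = [2, 4, 6]
y = [3, 4, 5]


def fit_line(xs, ys):
    """Least-squares slope and intercept for Y = X * slope + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # slope = sum of products of deviations / sum of squared X deviations
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    slope = sp / ss_x
    # The least-squares line passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(x, y)
print(slope, intercept)  # 0.5 2.0
```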

Page 27: Statistics Psych 231: Research Methods in Psychology

Regression

Can make specific predictions about Y based on X

Y = (X)(.5) + (2.0)

X = 5, Y = ?

Y = (5)(.5) + (2.0)

Y = 2.5 + 2 = 4.5

[Figure: Y plotted against X, both axes from 1 to 6, with the predicted value 4.5 marked]
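The prediction above is a single substitution into the regression equation, which can be sketched as a small function (the names `SLOPE`, `INTERCEPT`, and `predict` are chosen here for illustration):

```python
SLOPE = 0.5      # slope from the slide
INTERCEPT = 2.0  # intercept from the slide


def predict(x):
    """Predicted Y for a given X: Y = (X)(slope) + (intercept)."""
    return x * SLOPE + INTERCEPT


print(predict(5))  # 4.5
```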

Page 28: Statistics Psych 231: Research Methods in Psychology

Regression

Also need a measure of error

Y = X(.5) + (2.0) + error (the same equation applies to both plots)

[Figure: two scatterplots of Y against X, both axes from 1 to 6, each with the same line]

• Same line, but different relationships (strength difference)
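One way to put a number on the error is the root-mean-square distance of the observed Y values from the line. That specific measure is an assumption here, since the slides do not name one. The sketch below builds two hypothetical data sets around the line Y = X(.5) + (2.0), one tightly clustered and one widely scattered, and compares their error.

```python
import math


def predict(x):
    """The regression line from the slides: Y = X(.5) + (2.0)."""
    return 0.5 * x + 2.0


x = [1, 2, 3, 4, 5, 6]
pattern = [1, -1, -1, 1, 0, 0]  # hypothetical pattern of misses above/below the line

# Two hypothetical data sets scattered around the same line.
tight = [predict(a) + 0.2 * p for a, p in zip(x, pattern)]
loose = [predict(a) + 1.5 * p for a, p in zip(x, pattern)]


def rms_error(xs, ys):
    """Root-mean-square of the errors (observed Y minus predicted Y)."""
    errors = [b - predict(a) for a, b in zip(xs, ys)]
    return math.sqrt(sum(e ** 2 for e in errors) / len(errors))


print(round(rms_error(x, tight), 3))
print(round(rms_error(x, loose), 3))
```

The line is the same in both cases, but the second data set has the larger error and therefore the weaker relationship.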

Page 29: Statistics Psych 231: Research Methods in Psychology

Multiple regression

You want to look at how more than one variable may be related to Y

The regression equation gets more complex
– X, Z, & W variables are used to predict Y
– e.g., $Y = b_1 X + b_2 Z + b_3 W + b_0 + \text{error}$
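As a sketch of how such an equation produces a prediction, the function below evaluates $b_1 X + b_2 Z + b_3 W + b_0$ (the error term is what is left over between this prediction and the observed Y). All coefficient and predictor values are hypothetical and were chosen only to make the arithmetic easy to follow.

```python
def predict_y(x, z, w, b1, b2, b3, b0):
    """Predicted Y from three predictors: b1*X + b2*Z + b3*W + b0."""
    return b1 * x + b2 * z + b3 * w + b0


# Hypothetical coefficients and predictor values (not from the slides).
y_hat = predict_y(x=2.0, z=1.0, w=4.0, b1=0.5, b2=1.0, b3=-0.25, b0=2.0)
print(y_hat)  # 3.0
```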

Page 30: Statistics Psych 231: Research Methods in Psychology

Cautions with correlation

Don’t make causal claims

Don’t extrapolate

Extreme scores (outliers) can strongly influence the calculated relationship
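The outlier caution can be illustrated with the five (X, Y) pairs from the scatterplot slide. The sketch below computes Pearson’s r with the standard definitional formula (an assumption, since the slides give no formula), then recomputes it after adding one hypothetical extreme point, (6, 1), which is not part of the original data.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: sum of products of deviations / sqrt(SS_X * SS_Y)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sp = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    ss_x = sum((a - mean_x) ** 2 for a in xs)
    ss_y = sum((b - mean_y) ** 2 for b in ys)
    return sp / math.sqrt(ss_x * ss_y)


# Data from the scatterplot slide.
x = [6, 1, 5, 3, 3]
y = [6, 2, 6, 4, 2]

r_without = pearson_r(x, y)
# Add one hypothetical outlier: a high X paired with a very low Y.
r_with = pearson_r(x + [6], y + [1])

print(round(r_without, 2), round(r_with, 2))  # 0.9 0.37
```

A single extreme score is enough to pull the correlation from strong to weak in this small data set.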