Psychology 217: Statistical Methods

Lesson U1-3: Summarizing Data

• Tables and graphs*
  – Tables
  – Pie charts
  – Histograms and Polygons
  – Scatterplots/Scattergrams
• Central Tendency
  – Mean
  – Standard Error of the Mean (SEM)
  – Median
  – Mode

*Note: The APA Publication Manual (6th ed.) dedicates 43 pages (pp. 125-167) to Tables and Figures.

Graphics

• “A picture is worth 1,000 words”
• If it takes 1,000 words to explain a picture, then the point of diminishing returns has been violated!

[Figure: chart of Revenue (in $millions, axis 0 to 60) by Region (compass directions N through NNE), with series Q1, Q1R, Q2, Q2R, Q3, Q3R, Q4, Q4R]

Tables

• Tables are most useful when trying to convey “a large amount of data in a small amount of space.”
• Tables should not be overused.
  – The 1st APA Guideline for Table use is: “Is the table necessary?”
• Tables are not stand-alone.
  – They must be discussed in the narrative as well.
• For example,…

• Additionally, preference grouping varied significantly in relation to PPRS ranking (χ²(3) = 87.19, p < .001) (see Table 6). Christian non-Catholic participants (LDS and Christian remainder) were significantly underrepresented in the low PPRS ranking and overrepresented in the high PPRS ranking. Conversely, those without a preference were significantly overrepresented in the low PPRS ranking and underrepresented in the high PPRS ranking.

Table 6. PPRS Ranking by Religious Preference Grouping

Grouping              Low PPRS    High PPRS
LDS                   20 (14%)    64 (47%)
Catholic              11 (8%)     21 (15%)
Christian remainder   29 (21%)    44 (32%)
None                  79 (57%)    8 (6%)

↑ signifies high cell percentage (p<.05) determined by cell standard deviate calculation
↓ signifies low cell percentage (p<.05) determined by cell standard deviate calculation
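The χ² value quoted in the narrative can be checked against the counts in Table 6. The sketch below does this by hand in Python. It assumes the reported statistic is a Pearson chi-square test of independence (the slide does not say which routine was used), and the variable names are illustrative only.

```python
# Observed counts from Table 6: (Low PPRS, High PPRS) for each grouping.
observed = {
    "LDS": (20, 64),
    "Catholic": (11, 21),
    "Christian remainder": (29, 44),
    "None": (79, 8),
}

rows = list(observed.values())
row_totals = [sum(row) for row in rows]
col_totals = [sum(col) for col in zip(*rows)]
n = sum(row_totals)

# Pearson chi-square: sum over cells of (observed - expected)^2 / expected,
# where expected = row total * column total / N.
chi_square = 0.0
for r, row in enumerate(rows):
    for c, obs in enumerate(row):
        expected = row_totals[r] * col_totals[c] / n
        chi_square += (obs - expected) ** 2 / expected

# Degrees of freedom: (rows - 1) * (columns - 1).
df = (len(rows) - 1) * (len(col_totals) - 1)
print(f"chi-square({df}) = {chi_square:.2f}")  # prints: chi-square(3) = 87.19
```

Under that assumption the result agrees with the value reported in the narrative, which is a useful habit when a table and its text are meant to tell the same story.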

Pie Charts

• Pie charts are almost never worth the space
  – Simple visual depiction of proportions
• APA guidelines:
  – “The number of items compared should be kept to five or fewer. Order the segments from large to small, beginning the largest segment at 12 o’clock […] making the smallest segment [shaded] the darkest.”

[Figure: pie chart with five segments labeled A's, B's, C's, D's, F's]

Shapes of Distributions

• Frequency Distribution – a table of counts
  – 2 5 3 6 5 8 2 3 4 1 0 6 8 9 2 3 7 2 9 0
  – 2 zeros, 1 one, 4 twos, 3 threes, 1 four, 2 fives, 2 sixes, 1 seven, 2 eights, and 2 nines
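As a rough illustration, the tally above can be reproduced in a few lines of Python. This is only a sketch: `scores` and `frequency` are names chosen here, not part of the lesson.

```python
from collections import Counter

# The 20 X values from the slide.
scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]

# Counter tallies how many times each score occurs.
frequency = Counter(scores)

# Print the frequency distribution from the highest score down.
print("X  f")
for x in sorted(frequency, reverse=True):
    print(f"{x}  {frequency[x]}")
```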

Cumulative Frequency Distribution and Histogram

X   f   cf
9   2   20 (N)
8   2   18
7   1   16
6   2   15
5   2   13
4   1   11
3   3   10
2   4   7
1   1   3
0   2   2

[Figure: histogram of the frequency distribution; x-axis X = 0 to 9, y-axis frequency = 0 to 4]
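The cumulative frequency (cf) column is a running total of f, starting from the lowest score. A minimal sketch, assuming the same 20 scores and using illustrative variable names:

```python
from collections import Counter
from itertools import accumulate

scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]
frequency = Counter(scores)

# Accumulate frequencies from the lowest X upward.
values = sorted(frequency)
running_totals = accumulate(frequency[x] for x in values)
cf = dict(zip(values, running_totals))

# Print from the highest X down, matching the table layout.
print("X  f  cf")
for x in reversed(values):
    print(f"{x}  {frequency[x]}  {cf[x]}")
```

The cf entry for the highest score always equals N, which is a quick check that no score was missed.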

Frequency Polygon and Histogram

[Figure: histogram (x-axis 0 to 9, frequency 0 to 4) and frequency polygon (x-axis 1 to 10, frequency 0 to 4.5)]

Normal, Negative Skew, Positive Skew, Bi-modal

[Figure: four example distributions, one panel each: normal, negative skew, positive skew, bi-modal; x-axis 1 to 19]

Scatterplots/Scattergrams

• The plotting of paired values
  – For example,…

Name   Age   Toys
Bob    5     6
Anne   6     12
Sue    3     15
Bill   5     7
Tim    4     15

[Figure: scatterplot of Toys (y-axis, 0 to 16) against Age (x-axis, 0 to 7)]
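To make "plotting paired values" concrete, the sketch below draws a rough text version of the Age/Toys scatterplot using only the Python standard library. The grid layout and names are choices made for this illustration; a real figure would normally come from a plotting package.

```python
# (name, age, toys) triples from the slide.
children = [("Bob", 5, 6), ("Anne", 6, 12), ("Sue", 3, 15),
            ("Bill", 5, 7), ("Tim", 4, 15)]

# Each child becomes one (x, y) point: x = Age, y = Toys.
points = {(age, toys) for _, age, toys in children}

# Rows run from Toys = 16 down to 0; columns run from Age = 0 to 7.
lines = []
for toys in range(16, -1, -1):
    row = "".join(" *" if (age, toys) in points else " ." for age in range(8))
    lines.append(f"{toys:>2} |{row}")
lines.append("    " + "".join(f"{age:>2}" for age in range(8)))

print("\n".join(lines))
```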

Central Tendency

The location of data

• Mean
  – For Interval and Ratio data and normal distributions
• Median
  – For Ordinal data or skewed distributions
• Mode
  – For Nominal data or multi-modal distributions

The Mean (aka, the Arithmetic Mean, Average)

The arithmetic center of the distribution

• Symbolized as
  – “M” in academic literature
  – X̄ (“X-bar”) among statisticians
• Operationalized as (∑X)/N
  – Sigma (∑) means “sum of”
  – Meaning, “add up all the X scores and divide by the number of X scores”

The Mean

• Data set of X values
  – 2 5 3 6 5 8 2 3 4 1 0 6 8 9 2 3 7 2 9 0
• ∑X = 85
• N = 20
• M = (∑X)/N = 85/20 = 4.25
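The same arithmetic, written as a short Python sketch (variable names are illustrative):

```python
scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]

total = sum(scores)   # ∑X: add up all the X scores
n = len(scores)       # N: the number of X scores
mean = total / n      # M = (∑X)/N

print(total, n, mean)  # prints: 85 20 4.25
```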

Standard Error of the Mean (SEM)

• When we collect statistics from a sample,
  – our data do not represent the population parameters with 100% accuracy
  – we must adjust for this error in our data
• The Standard Error of the Mean (SEM)
  – is an estimate of this “margin of error”
  – is a necessary calculation in inferential statistics
• The sample mean is a point estimate (location)
• SEM is an interval estimate (spread)

[Figure: four distribution plots; x-axis 1 to 19, frequency axis 0 to 45]

Sampling Distribution of Means

• The Normal Distribution is a distribution of all the individual raw scores
• The Sampling Distribution of Means is a distribution of all the means of the samples that can be drawn from a population
  – The Central Limit Theorem states that, for sufficiently large samples, this distribution is approximately normal and centers on the population mean
  – By definition, SEM is the standard deviation of the Sampling Distribution of Means
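A small simulation can make the sampling distribution less abstract. The sketch below rests on several assumptions that are not in the lesson: it treats the 20 scores used elsewhere in these slides as if they were the whole population, draws samples with replacement, and uses an arbitrary sample size (5), number of samples (10,000), and random seed.

```python
import random
import statistics

random.seed(217)  # arbitrary seed so the run is repeatable

# Assumption: treat these 20 scores as the entire population.
population = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]
sample_size = 5        # hypothetical n per sample
num_samples = 10_000   # hypothetical number of samples drawn

# Draw many samples (with replacement) and record each sample's mean.
sample_means = [
    statistics.fmean(random.choices(population, k=sample_size))
    for _ in range(num_samples)
]

# The means should center on the population mean ...
mean_of_means = statistics.fmean(sample_means)
# ... and their standard deviation is the (simulated) SEM.
sd_of_means = statistics.pstdev(sample_means)
# For comparison: population SD divided by the square root of n.
theoretical_sem = statistics.pstdev(population) / sample_size ** 0.5

print(f"mean of sample means: {mean_of_means:.2f}")
print(f"SD of sample means:   {sd_of_means:.2f}")
print(f"sigma / sqrt(n):      {theoretical_sem:.2f}")
```

The simulated standard deviation of the means should land close to the population standard deviation divided by √n, which is the sense in which SEM is "the standard deviation of the sampling distribution."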

Calculating SEM

• The larger the N, the smaller the SEM
• The lesser the variability, the smaller the SEM

SEM = √(MSWithin / N)

Stay tuned!
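The lesson defers the details of MSWithin, so the following is only a sketch for the single-sample case. It assumes that with one group the within-group mean square is the ordinary sample variance, SS/(N − 1), and applies SEM = √(MSWithin / N) to the 20 scores used elsewhere in these slides.

```python
import math
import statistics

scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]
n = len(scores)

# Assumption: for a single sample, MSWithin is the sample variance SS / (N - 1).
ms_within = statistics.variance(scores)

# SEM = sqrt(MSWithin / N): shrinks as N grows or as variability falls.
sem = math.sqrt(ms_within / n)

print(f"SEM = {sem:.3f}")  # prints: SEM = 0.648
```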

Median

The physical center of the distribution

• Symbolized as “Mdn”
• Operationalized as the X value at the 50th percentile
  – Half the data are below, half the data are above
• Data set of X values
  – 2 5 3 6 5 8 2 3 4 1 0 6 8 9 2 3 7 2 9 0
  – Ascending order
  – 0 0 1 2 2 2 2 3 3 3 4 5 5 6 6 7 8 8 9 9
• Median is (3 + 4) / 2 = 3.5
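The steps above (sort, find the middle, average the two middle values when N is even) can be sketched as follows; the variable names are illustrative.

```python
import statistics

scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]

ordered = sorted(scores)  # ascending order
n = len(ordered)

# N is even here, so the median is the mean of the two middle values
# (the 10th and 11th scores, at indexes 9 and 10).
middle_low, middle_high = ordered[n // 2 - 1], ordered[n // 2]
median = (middle_low + middle_high) / 2

print(middle_low, middle_high, median)  # prints: 3 4 3.5
print(statistics.median(scores))        # the standard library agrees: 3.5
```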

The Mode

The most frequently occurring value

• Symbolized as “Mode”
• Operationalized as the X value(s) with the largest frequency(ies) on the frequency distribution
  – (aka, the tallest bar(s) on the histogram)

Mode = 2

X   f
9   2
8   2
7   1
6   2
5   2
4   1
3   3
2   4
1   1
0   2

[Figure: histogram of the frequency distribution; x-axis X = 0 to 9, y-axis frequency = 0 to 4]
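Finding the mode amounts to locating the largest f in the frequency distribution. A minimal sketch, written so that it would also report ties in a multi-modal data set (names are illustrative):

```python
from collections import Counter

scores = [2, 5, 3, 6, 5, 8, 2, 3, 4, 1, 0, 6, 8, 9, 2, 3, 7, 2, 9, 0]
frequency = Counter(scores)

# The largest frequency is the height of the tallest bar on the histogram.
highest = max(frequency.values())

# Collect every X value that reaches that frequency (more than one if tied).
modes = sorted(x for x, f in frequency.items() if f == highest)

print(modes, highest)  # prints: [2] 4
```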

Positive Skew

(Mode = 2) < (Mdn = 3.5) < (M = 4.25)

[Figure: four distribution plots; three with x-axis 1 to 19 and one with x-axis 1 to 10]

Central Tendency’s Interpretive Power (the need for Dispersion)

X          Y          Z
3          5          11
3          4          1
3          3          1
3          2          1
3          1          1
∑X = 15    ∑Y = 15    ∑Z = 15
MX = 3     MY = 3     MZ = 3
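The three columns share a mean of 3 yet look nothing alike, which is why a measure of dispersion is needed alongside central tendency. The sketch below adds the range and the sample standard deviation for each column; these two measures are chosen here for illustration and are not part of this slide.

```python
import statistics

data = {
    "X": [3, 3, 3, 3, 3],
    "Y": [5, 4, 3, 2, 1],
    "Z": [11, 1, 1, 1, 1],
}

# For each column, record the mean plus two simple measures of spread.
summary = {}
for name, values in data.items():
    summary[name] = {
        "mean": statistics.mean(values),
        "range": max(values) - min(values),
        "sd": statistics.stdev(values),  # sample standard deviation
    }

for name, stats in summary.items():
    print(f"{name}: M = {stats['mean']}, range = {stats['range']}, SD = {stats['sd']:.2f}")
```

Identical means with spreads of 0, about 1.58, and about 4.47 show how little the mean alone says about a distribution.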
