Descriptive Statistics: Maarten Buis
Lecture 1:
Central tendency, scales of measurement,
and shapes of distributions
Outline
• Practicalities
• Central tendency
• Scales of measurement
• Shapes of distributions
Two statistics courses
• Descriptive Statistics (McCall, part 1)
• Inferential Statistics (McCall, part 2)
Course Material
• McCall: Fundamental Statistics for Behavioral Sciences.
• SPSS (available from Surfspot.nl) and chapter 2 of Field
• Lectures: 2 x a week
• computer labs: 1 x a week.
• course mailing list: [email protected]
• course website
setup of lectures
• Recap of material assumed to be known
• New Material
• Student Recap
How to pass this course
• Read assigned portions of McCall before each lecture
• Do the exercises
• Do the computer lab assignments, and hand them in before Tuesday 17:00!
• come to the computer lab
• come to the lectures
• ask questions: during class or to the course mailing list
• answer questions
Recap: mean, median, mode
• Mean of 1, 1, 2, 4 is (1 + 1 + 2 + 4)/4 = 2
• Median of 1, 1, 2, 4 is the middle observation. With an even number of observations there are two middle observations, here 1 and 2; use their mean, which is 1.5.
• Mode of 1, 1, 2, 4 is the most common value: 1.
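The recap above can be checked with a minimal sketch using Python's standard `statistics` module. The variable name `values` is just an illustrative choice.

```python
from statistics import mean, median, mode

values = [1, 1, 2, 4]

# Mean: sum of the observations divided by their number.
print(mean(values))    # 2

# Median: with an even number of observations, the mean of the two middle ones.
print(median(values))  # 1.5

# Mode: the most common value.
print(mode(values))    # 1
```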
Recap: quadratic
• 1² = 1 x 1 = 1
• 2² = 2 x 2 = 4
• 3² = 3 x 3 = 9
• squaring makes large numbers much larger than small numbers
Recap: absolute value
• |3| = 3
• |-3| = 3
• just lose the minus sign if it is there
Data: rents of rooms
room      rent    room      rent
room 1    175     room 11   240
room 2    180     room 12   250
room 3    185     room 13   250
room 4    190     room 14   280
room 5    200     room 15   300
room 6    210     room 16   300
room 7    210     room 17   310
room 8    210     room 18   325
room 9    230     room 19   620
room 10   240
What is a reasonable summary?
• Any single number used as a summary will miss some observations: one always makes errors
• What if you choose the number that minimizes the sum of the absolute errors?
• If you want to put more weight on preventing large errors, you could minimize the sum of the squared errors
mean and median
• the mean is the summary that minimizes the sum of the squared errors
• the median is the summary that minimizes the sum of the absolute errors
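A rough way to see this for the rent data is to try many candidate summaries and keep the one with the smallest total error. The sketch below is only a brute-force illustration: the helper names and the choice to search whole numbers between 150 and 650 are assumptions made for this example, so the best squared-error candidate is the mean rounded to a whole number.

```python
from statistics import mean, median

rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]


def sum_squared_errors(summary, data):
    """Sum of squared differences between each observation and the summary."""
    return sum((x - summary) ** 2 for x in data)


def sum_absolute_errors(summary, data):
    """Sum of absolute differences between each observation and the summary."""
    return sum(abs(x - summary) for x in data)


# Assumption for this sketch: only whole-number summaries are considered.
candidates = range(150, 651)

best_squared = min(candidates, key=lambda c: sum_squared_errors(c, rents))
best_absolute = min(candidates, key=lambda c: sum_absolute_errors(c, rents))

print(best_squared, round(mean(rents)))   # both are the mean, rounded
print(best_absolute, median(rents))       # both are the median
```

The room with a rent of 620 pulls the squared-error summary upward much more than the absolute-error summary, because squaring gives large errors extra weight.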
Measurement
• assigning numbers to observations: for example rents to rooms
• scale of measurement:
  – nominal
  – ordinal
  – interval/ratio
nominal
• == assigning numbers to classify observations in categories
• The categories are exclusive, but have no further relationship with one another.
• typical example: religion
ordinal
• == assigning numbers with the purpose of ordering observations
• It is meaningful to speak of more or less, or lower or higher
• typical example: education
Interval
• == assigning numbers to compare differences
• It is meaningful to say that the “distance between A and C is larger than between B and C”
• Typical example: temperature, intelligence
• Hard to find really good examples, often combined with ratio
ratio
• == assigning numbers to compare ratios of observations
• requires an absolute zero point
• It is meaningful to say “A is twice B”
• typical examples: age, income, percentage immigrant children in a classroom
What is the scale of measurement of:
• Choice of Party during an election
• Gender
• exam grades
• highest achieved level of education: primary, secondary, or tertiary
What is the scale of measurement of:
• income
• percentile of income (top 5% or bottom 20%)
• highest level of education in years
Why bother?
• Determines which statistical techniques are meaningful: mean religion or most common religion
• Use common sense
Central Tendency 2
• Nominal: Mode
• Ordinal: Mode or Median
• Interval/ratio: Mode, Median, or Mean
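The list above can be written as a small lookup, sketched here under a few assumptions: the names `MEANINGFUL` and `summarize`, the example data, and the use of `median_low` for ordinal data (so that the median is always an observed category and not an average of two codes) are all illustrative choices, not part of the course material.

```python
from statistics import mean, median, median_low, mode

# Measures of central tendency that are meaningful per scale of measurement.
MEANINGFUL = {
    "nominal": ("mode",),
    "ordinal": ("mode", "median"),
    "interval/ratio": ("mode", "median", "mean"),
}


def summarize(values, scale):
    """Return only the summaries that are meaningful for the given scale."""
    measures = {
        "mode": mode,
        # Assumption: for ordinal codes, report an observed middle category.
        "median": median_low if scale == "ordinal" else median,
        "mean": mean,
    }
    return {name: measures[name](values) for name in MEANINGFUL[scale]}


# Hypothetical data: religion as labels; education coded
# 1 = primary, 2 = secondary, 3 = tertiary.
religion = ["A", "B", "B", "C"]
education = [1, 2, 2, 3, 3, 3]

print(summarize(religion, "nominal"))
print(summarize(education, "ordinal"))
print(summarize([1, 1, 2, 4], "interval/ratio"))
```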
Dichotomous variable
• only two answers possible: yes/no, male/female, 1/0
• Every variable can be dichotomized
• Dichotomous variables can be treated as interval variables: when coded 1/0, the mean is meaningful, because it is the proportion of “yes” answers.
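As a small illustration, the rent data can be dichotomized and averaged. The cut-off of 250 and the name `expensive` are hypothetical choices made only for this sketch.

```python
from statistics import mean

rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]

# Dichotomize: 1 = "yes, rent is above 250", 0 = "no".
# The threshold 250 is an arbitrary value chosen for this example.
expensive = [1 if rent > 250 else 0 for rent in rents]

# The mean of a 1/0 variable is the proportion of "yes" answers.
share_expensive = mean(expensive)
print(f"{100 * share_expensive:.2f}% of the rooms cost more than 250")
```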
Frequency distribution
• A frequency distribution shows how many times a value occurs within a variable
• Can be visualized in a histogram, frequency polygon, pie chart
rent    Freq.   Percent   Cum.
175     1       5.26%     5.26%
180     1       5.26%     10.53%
185     1       5.26%     15.79%
190     1       5.26%     21.05%
200     1       5.26%     26.32%
210     3       15.79%    42.11%
230     1       5.26%     47.37%
240     2       10.53%    57.89%
250     2       10.53%    68.42%
280     1       5.26%     73.68%
300     2       10.53%    84.21%
310     1       5.26%     89.47%
325     1       5.26%     94.74%
620     1       5.26%     100.00%
Total   19      100.00%
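A frequency table like the one above could be built as follows. This is a minimal sketch using `collections.Counter`; the variable names and the print layout are assumptions of this example, not the SPSS output used in the course.

```python
from collections import Counter

rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]

counts = Counter(rents)   # how many times each rent occurs
total = len(rents)

rows = []
cumulative = 0
for value in sorted(counts):
    freq = counts[value]
    cumulative += freq
    # value, frequency, percent, cumulative percent
    rows.append((value, freq, 100 * freq / total, 100 * cumulative / total))

print(f"{'rent':>5} {'Freq.':>5} {'Percent':>8} {'Cum.':>8}")
for value, freq, percent, cum_percent in rows:
    print(f"{value:>5} {freq:>5} {percent:>7.2f}% {cum_percent:>7.2f}%")
print(f"{'Total':>5} {total:>5} {100:>7.2f}%")
```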
Shapes of distribution
• The mean, median and mode are equal in unimodal symmetric distributions.
• The mean and median are equal in multimodal symmetric distributions
• Skewness: in a right-skewed distribution the mean is to the right of the median. The income distribution is an example of a right-skewed distribution.
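The rent data show the same pattern: one very expensive room pulls the mean to the right of the median. The sketch below compares both summaries with and without that room; dropping the room is done here only to illustrate the effect, not as a recommended way to treat the data.

```python
from statistics import mean, median

rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]

# Same data without the single most expensive room (illustration only).
without_top = [rent for rent in rents if rent != max(rents)]

print(f"all rooms:        mean {mean(rents):.1f}, median {median(rents)}")
print(f"without room 19:  mean {mean(without_top):.1f}, median {median(without_top)}")
```

The mean moves by about 20 when the one expensive room is dropped, while the median moves by only 5.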
Shapes of distributions
• Kurtosis: leptokurtic (peaked) or platykurtic (flat)
• Uniform distribution, each value is equally likely
• For the fans: mean, variance, skewness, and kurtosis are based on the first four moments of a distribution (p. 49 McCall)
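For the fans, a minimal sketch of how these four quantities can be computed from moments. It assumes the simple "divide by n" versions and defines skewness and kurtosis as the standardized third and fourth central moments; the formulas in McCall or in SPSS may differ (for example through small-sample corrections), and the function name `central_moment` is just an illustrative choice.

```python
rents = [175, 180, 185, 190, 200, 210, 210, 210, 230, 240,
         240, 250, 250, 280, 300, 300, 310, 325, 620]


def central_moment(data, k):
    """k-th moment around the mean, dividing by n (an assumption of this sketch)."""
    m = sum(data) / len(data)
    return sum((x - m) ** k for x in data) / len(data)


mean_rent = sum(rents) / len(rents)                        # first moment
variance = central_moment(rents, 2)                        # second central moment
skewness = central_moment(rents, 3) / variance ** 1.5      # standardized third
kurtosis = central_moment(rents, 4) / variance ** 2        # standardized fourth

print(f"mean     {mean_rent:.2f}")
print(f"variance {variance:.2f}")
print(f"skewness {skewness:.2f}  (positive: right-skewed)")
print(f"kurtosis {kurtosis:.2f}  (a normal distribution has 3)")
```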
Do before Wednesday
• Read:
  – McCall Ch 1: 6-14
  – McCall Ch 2: entirely
  – McCall Ch 3: 54-63
• Exercises:
  – 1.1-1.3
  – 2.1-2.7, 2.11, 2.13
Student recap