7/27/2019 Summary of Statistics I
8/26/20
Types of variables:
Quantitative and Categorical
Types of Quantitative Variables
Discrete:
Listable values with gaps
between them.
Note: Not necessarily integers.
May be an infinite number of values.
Continuous:
Any value in an interval
can occur.
Categorical variable
A variable that classifies
people or things is
qualitative or categorical.
Ex: gender, political affiliation,
race, country of origin.
Types of Categorical Variables
Nominal:
Values are simply labels or
categories.
Ex: 1 for Educator
2 for Construction worker
3 for Manufacturing worker
Types of Categorical Variables
Ordinal:
The categories have a natural ordering.
Ex:
1 for President
2 for Vice President
3 for Dean of College
4 for Department Supervisor
5 for Employee
Types of Categorical Variables
Binary: Only two categories
(often labeled 0 and 1).
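A minimal Python sketch of the nominal versus ordinal distinction, using the codes from the examples above. The dictionary and function names are illustrative assumptions, not part of the course material. For an ordinal variable, comparing codes is meaningful; for a nominal variable, the codes are only labels.

```python
# Ordinal: the numeric codes carry a natural ordering (1 = highest rank).
ORDINAL_RANK = {
    "President": 1,
    "Vice President": 2,
    "Dean of College": 3,
    "Department Supervisor": 4,
    "Employee": 5,
}

# Nominal: the numeric codes are arbitrary labels with no ordering.
NOMINAL_CODE = {
    "Educator": 1,
    "Construction worker": 2,
    "Manufacturing worker": 3,
}


def outranks(a: str, b: str) -> bool:
    """Return True if position a ranks above position b (ordinal comparison)."""
    return ORDINAL_RANK[a] < ORDINAL_RANK[b]


def same_category(a: str, b: str) -> bool:
    """The only meaningful comparison for nominal values is equality."""
    return NOMINAL_CODE[a] == NOMINAL_CODE[b]


print(outranks("Dean of College", "Employee"))   # True
print(same_category("Educator", "Educator"))     # True
```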
Identify the type:
Height: Quantitative Continuous
Family Size: Quantitative Discrete
Age: Quantitative Continuous
Course Grade: Categorical (Qualitative) Ordinal
Football jersey number: Categorical (Qualitative) Nominal
Statistics
- Descriptive: Numerical, Graphical
- Inferential
Descriptive Statistics:
numerical summaries and/or graphical displays of (sample) data.
Inferential Statistics:
Using sample data to draw conclusions about the population
from which the sample was taken.
Drawing conclusions about
the population (parameter),
based on a sample (statistic).
Statistical Inference
Population -> (census) -> Parameter (the real value)
Sample -> Statistic, which estimates the parameter
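The relationship between a parameter and a statistic can be sketched in Python as follows. The population here is simulated and entirely hypothetical (uniform ages from 20 to 79), and the sample size of 50 is an arbitrary choice; the point is only that the sample mean (a statistic) estimates the population mean (the parameter).

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is reproducible

# Hypothetical population: 10,000 ages drawn uniformly from 20 to 79.
population = [random.randint(20, 79) for _ in range(10_000)]

# A census uses every member, so it yields the parameter itself.
parameter = statistics.mean(population)

# A sample uses only some members, so it yields a statistic.
sample = random.sample(population, 50)
statistic = statistics.mean(sample)

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

The two printed values will generally be close but not equal; that gap is the sampling error that inferential statistics has to account for.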
After we collect data, then what?
Organize / Summarize Data
Graphical / Numerical
Graphs
- Histogram, Stem-and-leaf plot: for a single quantitative variable
- Bar Chart, Pie Chart: for a single categorical variable
Common Statistical Graphs: Quantitative Data
- Histogram - vertical bar chart of frequencies
- Frequency Polygon - line graph of frequencies
- Ogive - line graph of cumulative frequencies
- Stem and Leaf Plot - like a histogram, but shows individual data values. Useful for small data sets.
Histogram
[Figure: histogram of Years (horizontal axis, 0 to 80) against Frequency (vertical axis, 0 to 20).]
What values of Years fall in the 2nd class?
30 to under 40
How many values fall in the 2nd class?
18 of the 50
What is the sum of the heights of all bars?
50
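As a rough sketch of how histogram class frequencies are obtained, the Python function below counts values per class using the "lower bound included, upper bound excluded" convention (so 30 belongs to the class 30 to under 40). The function name, parameters, and raw data are hypothetical; the data behind the histogram above is not available.

```python
def class_frequencies(values, start, width, n_classes):
    """Count values per class, where class i covers
    [start + i*width, start + (i+1)*width), i.e. "20-under 30" style."""
    counts = [0] * n_classes
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_classes:   # ignore values outside all classes
            counts[index] += 1
    return counts


# Hypothetical raw data (not the data behind the histogram above).
years = [21, 25, 30, 34, 38, 39, 41, 47, 52, 66, 79]
freq = class_frequencies(years, start=20, width=10, n_classes=6)
print(freq)        # [2, 4, 2, 1, 1, 1]
print(sum(freq))   # 11, the number of values binned
```

The sum of the bar heights equals the number of observations, which is why the bars above sum to 50.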
Frequency Polygon
- line graph of frequencies
Class Interval    Frequency
20-under 30        6
30-under 40       18
40-under 50       11
50-under 60       11
60-under 70        3
70-under 80        1
[Figure: frequency polygon of Years (horizontal axis, 0 to 80) against Frequency (vertical axis, 0 to 20).]
Ogive
- line graph of cumulative frequencies
Class Interval    Cumulative Frequency
20-under 30        6
30-under 40       24
40-under 50       35
50-under 60       46
60-under 70       49
70-under 80       50
[Figure: ogive of Years (horizontal axis, 0 to 80) against cumulative Frequency (vertical axis, 0 to 60).]
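The cumulative frequencies plotted by an ogive are running totals of the class frequencies. A short Python sketch, using the class frequencies implied by the cumulative table (6, 18, 11, 11, 3, 1); the variable names are illustrative:

```python
from itertools import accumulate

classes = ["20-under 30", "30-under 40", "40-under 50",
           "50-under 60", "60-under 70", "70-under 80"]
frequencies = [6, 18, 11, 11, 3, 1]

# Running total: each entry is the count of values below the class's upper bound.
cumulative = list(accumulate(frequencies))
for label, cum in zip(classes, cumulative):
    print(f"{label:<12} {cum:>3}")

print(cumulative)  # [6, 24, 35, 46, 49, 50]
```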
Relative Frequency Ogive
Class Interval    Cumulative Relative Frequency
20-under 30        .12
30-under 40        .48
40-under 50        .70
50-under 60        .92
60-under 70        .98
70-under 80       1.00
[Figure: relative frequency ogive of Years (horizontal axis, 0 to 80) against Cumulative Relative Frequency (vertical axis, 0.00 to 1.10).]
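Each cumulative relative frequency is the cumulative frequency divided by the total number of observations (50 here). A minimal Python check of the table, with illustrative variable names:

```python
cumulative = [6, 24, 35, 46, 49, 50]
n = cumulative[-1]  # total number of observations (50)

# Divide each running total by n to get the proportion at or below each class.
cumulative_relative = [round(c / n, 2) for c in cumulative]
print(cumulative_relative)  # [0.12, 0.48, 0.7, 0.92, 0.98, 1.0]
```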
Stem-and-Leaf Plot
Stem-and-leaf plot of Income1   N = 100
Leaf Unit = 1000
    2   1  58
    6   2  0223
   30   2  555566667777778889999999
  (22)  3  0001111222222222333344
   48   3  5555555666777788889
   29   4  00000001111223444
   12   4  56667889
    4   5  0
    3   5  57
    1   6
    1   6
    1   7  2
Stem-and-Leaf Plot
Purpose: Same as for a histogram, but provides a little more
detail about the data distribution.
Data values are split into stem and leaf components:
Choice of split is data dependent.
Ex: 12,345.67 splits into stem 123, leaf 4, discard 5.67.
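The stem/leaf split can be sketched in Python as below. This is an illustration under stated assumptions, not the procedure of any particular statistics package: the function names are hypothetical, values are assumed non-negative, discarded digits are truncated rather than rounded, and each stem gets a single row (some software splits a stem across two rows).

```python
from collections import defaultdict


def split_stem_leaf(value, leaf_unit):
    """Split value into (stem, leaf); digits below leaf_unit are discarded."""
    kept = int(value // leaf_unit)   # drop the discarded digits (truncate)
    return divmod(kept, 10)          # the last kept digit is the leaf


def stem_and_leaf(values, leaf_unit):
    """Return {stem: sorted leaf digits as a string} for non-negative values."""
    rows = defaultdict(list)
    for v in values:
        stem, leaf = split_stem_leaf(v, leaf_unit)
        rows[stem].append(leaf)
    return {stem: "".join(str(d) for d in sorted(leaves))
            for stem, leaves in sorted(rows.items())}


print(split_stem_leaf(12345.67, leaf_unit=10))   # (123, 4)

# Hypothetical incomes, displayed with Leaf Unit = 1000.
incomes = [15200, 18900, 20100, 22500, 23999, 31000, 35750]
for stem, leaves in stem_and_leaf(incomes, leaf_unit=1000).items():
    print(stem, leaves)
```

With a leaf unit of 10, the value 12,345.67 keeps the digits 1234, giving stem 123 and leaf 4, which matches the split shown above.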
Distribution of a
Quantitative Variable
- Illustrated by a histogram or a stem-and-leaf plot.
- Indicates how the values of the variable are spread out across the range of the variable.
Key Features of Data Distributions
Shape
Typical Value
Spread
Outliers
Shapes: Symmetric
Skewed right
Skewed left
Bimodal
(multi-modal)
[Figures: histograms (vertical axis: Frequency) illustrating each shape: Symmetric Distribution; More or Less Symmetric Distribution; Skewed Right; Skewed Left; Bimodal.]
Bar Chart
This is NOT a histogram!
Common Statistical Graphs: Qualitative Data
- Pie Chart - proportional representation for categories of a whole
- Bar Chart - frequency or relative frequency of one or more categorical variables
Second Quarter Truck Production in the U.S.
(Hypothetical values)
Company    2d Quarter Truck Production
A          357,411
B          354,936
C          160,997
D           34,099
E           12,747
Totals     920,190
Second Quarter U.S. Truck Production
Proportion of truck production for each company
Pie Chart Calculations for Company A
Company    2d Quarter Truck Production    Proportion    Degrees
A          357,411                         .388          140
B          354,936                         .386          139
C          160,997                         .175           63
D           34,099                         .037           13
E           12,747                         .014            5
Totals     920,190                        1.000          360
For Company A:
Proportion = 357,411 / 920,190 = .388
Degrees = 360 × .388 ≈ 140
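The same calculation can be repeated for every company. The Python sketch below uses the (hypothetical) production values from the table and computes each slice's proportion of the total and its angle out of 360 degrees; the variable names are illustrative.

```python
production = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}
total = sum(production.values())   # 920,190

rows = {}
for company, units in production.items():
    proportion = units / total     # share of the whole
    degrees = 360 * proportion     # slice angle in the pie chart
    rows[company] = (round(proportion, 3), round(degrees))
    print(f"{company}  {units:>8,}  {proportion:.3f}  {degrees:>4.0f}")

print(rows["A"])  # (0.388, 140)
```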
Pareto Chart
A Pareto chart is a bar chart, sorted from most frequent to least frequent, overlaid with a cumulative line graph (like an ogive).
[Figure: Pareto chart with categories Poor Wiring, Short in Coil, Defective Plug, and Other; left axis Frequency (0 to 100), right axis cumulative percentage (0% to 100%).]
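The two ingredients of a Pareto chart, the sorted bars and the cumulative line, can be computed as in the Python sketch below. The category names come from the chart above, but the counts are hypothetical because the chart's actual values are not recoverable here.

```python
from itertools import accumulate

# Hypothetical counts; the values behind the chart above are not given.
counts = {"Short in Coil": 25, "Other": 5, "Poor Wiring": 50, "Defective Plug": 20}

# Sort categories from most frequent to least frequent (the bars).
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)

# Cumulative percentage after each bar (the overlaid line).
total = sum(counts.values())
running = accumulate(freq for _, freq in ordered)
pareto = [(name, freq, round(100 * cum / total, 1))
          for (name, freq), cum in zip(ordered, running)]

for name, freq, cum_pct in pareto:
    print(f"{name:<15} {freq:>3} {cum_pct:>6.1f}%")
```

The last cumulative percentage is always 100%, just as the last point of an ogive equals the total count.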
Graphs
Scatterplot
for two
quantitative
variables
[Figure: scatterplot of Debt versus Income.]