© University of Minnesota-Duluth, Econ-2030 (Dr. Tadesse)
Chapter 2: Descriptive Statistics:
Tabular and Graphical Presentations
Part A
Exploratory Data Analysis
Cross-tabulations; Scatter Diagrams
Tabular and Graphical Procedures
Data
Qualitative Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Percent Freq. Distribution
    • Crosstabulation
  Graphical Methods
    • Bar Graph
    • Pie Chart
Quantitative Data
  Tabular Methods
    • Frequency Distribution
    • Rel. Freq. Dist.
    • Cum. Freq. Dist.
    • Cum. Rel. Freq. Distribution
    • Stem-and-Leaf Display
    • Crosstabulation
  Graphical Methods
    • Dot Plot
    • Histogram
    • Ogive
    • Scatter Diagram
Stem-and-Leaf Display
A stem-and-leaf display is an exploratory data analysis method that shows both the rank order and the shape of the distribution of the data.
It is similar to a histogram, but it has the advantage of showing the actual data values.
Example: Hudson Auto Repair
Sample of Parts Cost for 50 Tune-ups
 91  78  93  57  75  52  99  80  97  62
 71  69  72  89  66  75  79  75  72  76
104  74  62  68  97 105  77  65  80 109
 85  97  88  68  83  68  71  69  67  74
 62  82  98 101  79 105  79  69  62  73
Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
Steps to Construct a Stem-and-Leaf Display
Arrange the first digits of each data value to the left of a vertical line.
To the right of the vertical line, record the last digit for each item in rank order.
Each line in the display is referred to as a stem.
Each digit on a stem is a leaf.
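The construction steps can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `stem_and_leaf` is hypothetical, and it assumes non-negative whole-number data with a leaf unit of 1, as in the Hudson Auto Repair sample.

```python
from collections import defaultdict

# Parts costs for the 50 tune-ups in the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stem_and_leaf(values):
    """Map each stem (leading digits) to its leaves (last digits) in rank order.

    Hypothetical helper; assumes non-negative integers and a leaf unit of 1.
    """
    stems = defaultdict(list)
    for value in sorted(values):          # sorting first keeps leaves in rank order
        stem, leaf = divmod(value, 10)    # last digit is the leaf, the rest is the stem
        stems[stem].append(leaf)
    return dict(stems)


display = stem_and_leaf(costs)
for stem in sorted(display):
    leaves = " ".join(str(leaf) for leaf in display[stem])
    print(f"{stem:>2} | {leaves}")
```

Sorting the data before splitting each value is what puts the leaves on every stem in rank order without a second pass.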
Stem-and-Leaf Display
 5 | 2 7
 6 | 2 2 2 2 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3 5 8 9
 9 | 1 3 7 7 7 8 9
10 | 1 4 5 5 9
Callouts on the slide label one line of the display as "a stem" and one digit on it as "a leaf".
Stretched Stem-and-Leaf Display
If we believe the display has condensed the data too much, we can stretch the display by using two stems for each leading digit(s).
Whenever a display is stretched, the first stem for a leading digit corresponds to leaf values of 0 to 4, and the second stem corresponds to leaf values of 5 to 9.
Stretched Stem-and-Leaf Display
 5 | 2
 5 | 7
 6 | 2 2 2 2
 6 | 5 6 7 8 8 8 9 9 9
 7 | 1 1 2 2 3 4 4
 7 | 5 5 5 6 7 8 9 9 9
 8 | 0 0 2 3
 8 | 5 8 9
 9 | 1 3
 9 | 7 7 7 8 9
10 | 1 4
10 | 5 5 9
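A stretched display can be produced by splitting each stem into a low half (leaves 0 to 4) and a high half (leaves 5 to 9). The sketch below is one possible way to do it; `stretched_stem_and_leaf` is a hypothetical name, and the code assumes non-negative integers with a leaf unit of 1.

```python
from collections import defaultdict

# Parts costs for the 50 tune-ups in the Hudson Auto Repair sample.
costs = [
    91, 78, 93, 57, 75, 52, 99, 80, 97, 62,
    71, 69, 72, 89, 66, 75, 79, 75, 72, 76,
    104, 74, 62, 68, 97, 105, 77, 65, 80, 109,
    85, 97, 88, 68, 83, 68, 71, 69, 67, 74,
    62, 82, 98, 101, 79, 105, 79, 69, 62, 73,
]


def stretched_stem_and_leaf(values):
    """Return (stem, leaves) rows, two rows per stem: leaves 0-4, then leaves 5-9.

    Hypothetical helper; a half with no data is simply left out of the result.
    """
    rows = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)
        half = 0 if leaf <= 4 else 1      # 0 = first row of the stem, 1 = second row
        rows[(stem, half)].append(leaf)
    return [(stem, rows[(stem, half)]) for stem, half in sorted(rows)]


stretched = stretched_stem_and_leaf(costs)
for stem, leaves in stretched:
    print(f"{stem:>2} | {' '.join(map(str, leaves))}")
```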
Stem-and-Leaf Display: Some Notes
Leaf Units
• A single digit is used to define each leaf.
• In the preceding example, the leaf unit was 1.
• Leaf units may be 100, 10, 1, 0.1, and so on.
• Where the leaf unit is not shown, it is assumed to equal 1.
Example: Leaf Unit = 0.1
If we have data with values such as
8.6  11.7  9.4  9.1  10.2  11.0  8.8
a stem-and-leaf display of these data will be
Leaf Unit = 0.1
 8 | 6 8
 9 | 1 4
10 | 2
11 | 0 7
Example: Leaf Unit = 10
If we have data with values such as
1806  1717  1974  1791  1682  1910  1838
a stem-and-leaf display of these data will be
Leaf Unit = 10
16 | 8
17 | 1 9
18 | 0 3
19 | 1 7
The 82 in 1682 is rounded down to 80 and is represented as an 8.
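Both leaf-unit examples follow the same recipe: express each value in leaf units, drop anything smaller than one leaf unit, and split off the last digit. A minimal sketch, assuming non-negative data; the function name and its `leaf_unit` parameter are hypothetical, and `Decimal` is used only to avoid floating-point surprises when dividing by 0.1.

```python
from collections import defaultdict
from decimal import Decimal


def stem_and_leaf(values, leaf_unit):
    """Build a stem -> leaves mapping for a given leaf unit (e.g. 0.1, 1, 10).

    Hypothetical helper; assumes non-negative values.
    """
    unit = Decimal(str(leaf_unit))
    stems = defaultdict(list)
    for value in sorted(values):
        # Express the value in leaf units and truncate the remainder,
        # so with a leaf unit of 10 the 82 in 1682 becomes a leaf of 8.
        scaled = int(Decimal(str(value)) / unit)
        stem, leaf = divmod(scaled, 10)
        stems[stem].append(leaf)
    return dict(stems)


tenths = stem_and_leaf([8.6, 11.7, 9.4, 9.1, 10.2, 11.0, 8.8], leaf_unit=0.1)
tens = stem_and_leaf([1806, 1717, 1974, 1791, 1682, 1910, 1838], leaf_unit=10)
print(tenths)
print(tens)
```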
Crosstabulations and Scatter Diagrams
Crosstabulations and Scatter Diagrams
Thus far we have focused on methods that are used to summarize the data on one variable at a time.
Crosstabulation and a scatter diagram are methods for summarizing the data on two (or more) variables simultaneously.
These methods help us understand the relationship between two variables.
Crosstabulation
A crosstabulation is a tabular summary of data for two variables.
Crosstabulation can be used when:
• one variable is qualitative and the other is quantitative,
• both variables are qualitative, or
• both variables are quantitative.
The left and top margin labels define the classes for the two variables.
Cross-tabulation
Example:
The number of homes sold by style and price for the past two years is shown below.
                            Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000             18         6        19        12        55
> $99,000             12        14        16         3        45
Total                 30        20        35        15       100
Price range is the quantitative variable; home style is the qualitative variable.
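In practice a crosstabulation is built by tallying one record per observation. The sketch below illustrates the tally only; the slides do not show the raw records, so the list of (price range, home style) pairs is reconstructed from the cell counts in the table, and the record layout is an assumption.

```python
from collections import Counter

# Cell counts taken from the table on the slide.
cell_counts = {
    ("<= $99,000", "Colonial"): 18, ("<= $99,000", "Log"): 6,
    ("<= $99,000", "Split"): 19, ("<= $99,000", "A-Frame"): 12,
    ("> $99,000", "Colonial"): 12, ("> $99,000", "Log"): 14,
    ("> $99,000", "Split"): 16, ("> $99,000", "A-Frame"): 3,
}

# ASSUMPTION: one (price_range, home_style) pair per home sold, expanded from
# the counts above purely for illustration; the real records are not given.
records = [pair for pair, n in cell_counts.items() for _ in range(n)]

# The crosstabulation counts how often each pair of classes occurs.
crosstab = Counter(records)

styles = ["Colonial", "Log", "Split", "A-Frame"]
print(f"{'Price Range':<12}" + "".join(f"{style:>10}" for style in styles))
for price in ["<= $99,000", "> $99,000"]:
    print(f"{price:<12}" + "".join(f"{crosstab[(price, s)]:>10}" for s in styles))
```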
Cross-tabulation
Insights Gained from Preceding Cross-tabulation
• The greatest number of homes in the sample (19) are a split-level style and priced at less than or equal to $99,000.
• Only three homes in the sample are an A-Frame style and priced at more than $99,000.
Cross-tabulation: Row or Column Percentages
Converting the entries in the table into row percentages or column percentages can provide additional insight about the relationship between the two variables.
Cross-tabulation
                            Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000             18         6        19        12        55
> $99,000             12        14        16         3        45
Total                 30        20        35        15       100
The Total column is the frequency distribution for the price variable.
The Total row is the frequency distribution for the home style variable.
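The two margins can be recovered from the cell counts by summing across rows and down columns. A small sketch, with variable names chosen here for illustration:

```python
# Cell counts from the home sales crosstabulation.
counts = {
    "<= $99,000": {"Colonial": 18, "Log": 6, "Split": 19, "A-Frame": 12},
    "> $99,000": {"Colonial": 12, "Log": 14, "Split": 16, "A-Frame": 3},
}

# Row totals: the frequency distribution for the price variable.
price_totals = {price: sum(row.values()) for price, row in counts.items()}

# Column totals: the frequency distribution for the home style variable.
style_totals = {}
for row in counts.values():
    for style, n in row.items():
        style_totals[style] = style_totals.get(style, 0) + n

grand_total = sum(price_totals.values())
print(price_totals)
print(style_totals)
print(grand_total)
```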
Cross-tabulation: Row Percentages
                            Home Style
Price Range     Colonial       Log     Split   A-Frame     Total
≤ $99,000          32.73     10.91     34.55     21.82       100
> $99,000          26.67     31.11     35.56      6.67       100
(Colonial and > $99K)/(All > $99K) x 100 = (12/45) x 100
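Row percentages divide each cell by its row total; column percentages, which the slides mention but do not tabulate, divide each cell by its column total. The sketch below shows both under that reading; the function names are hypothetical and results are rounded to two decimals to match the table.

```python
# Cell counts from the home sales crosstabulation.
counts = {
    "<= $99,000": {"Colonial": 18, "Log": 6, "Split": 19, "A-Frame": 12},
    "> $99,000": {"Colonial": 12, "Log": 14, "Split": 16, "A-Frame": 3},
}


def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    result = {}
    for row_label, row in table.items():
        total = sum(row.values())
        result[row_label] = {col: round(100 * n / total, 2) for col, n in row.items()}
    return result


def column_percentages(table):
    """Express each cell as a percentage of its column total."""
    col_totals = {}
    for row in table.values():
        for col, n in row.items():
            col_totals[col] = col_totals.get(col, 0) + n
    return {
        row_label: {col: round(100 * n / col_totals[col], 2) for col, n in row.items()}
        for row_label, row in table.items()
    }


row_pct = row_percentages(counts)
col_pct = column_percentages(counts)
print(row_pct["> $99,000"]["Colonial"])   # share of homes over $99,000 that are Colonial
print(col_pct["> $99,000"]["Colonial"])   # share of Colonial homes priced over $99,000
```

The two views answer different questions, so it helps to decide which variable is being conditioned on before reading the percentages.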
Simpson's Paradox
Often, data in two or more cross-tabulations are aggregated to produce a summary cross-tabulation.
We must be careful in drawing conclusions about the relationship between the two variables in the aggregated cross-tabulation.
Simpson's Paradox: In some cases the conclusions based upon an aggregated cross-tabulation can be completely different from, or even the reverse of, the conclusions based upon the unaggregated data.
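A small numerical illustration makes the reversal concrete. The counts below are hypothetical, invented only for this sketch and not taken from the course material: option A has the higher success rate inside each subgroup, yet option B looks better once the two subgroup tables are added together.

```python
from fractions import Fraction

# HYPOTHETICAL counts, invented only to illustrate the effect:
# (successes, cases) for two options, A and B, in two subgroups.
subgroups = {
    "group 1": {"A": (18, 20), "B": (80, 100)},
    "group 2": {"A": (50, 100), "B": (8, 20)},
}


def rate(successes, cases):
    """Exact success rate as a fraction."""
    return Fraction(successes, cases)


# Unaggregated view: compare A and B within each subgroup.
for name, table in subgroups.items():
    a, b = rate(*table["A"]), rate(*table["B"])
    print(f"{name}: A = {float(a):.1%}, B = {float(b):.1%}")


# Aggregated view: add the subgroup tables together, then compare again.
def aggregate(option):
    successes = sum(table[option][0] for table in subgroups.values())
    cases = sum(table[option][1] for table in subgroups.values())
    return rate(successes, cases)


overall_a, overall_b = aggregate("A"), aggregate("B")
print(f"overall: A = {float(overall_a):.1%}, B = {float(overall_b):.1%}")
```

The reversal arises because A's cases are concentrated in the subgroup where success is harder for both options, so the aggregated table mixes unequal workloads.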
Scatter Diagram and Trend Line
Scatter Diagram and Trend Line
A scatter diagram is a graphical presentation of the relationship between two quantitative variables.
One variable is shown on the horizontal axis and the other variable is shown on the vertical axis.
The general pattern of the plotted points suggests the overall relationship between the variables.
A trend line is an approximation of the relationship.
Scatter Diagram
A Positive Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]
Scatter Diagram
A Negative Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]
Scatter Diagram
No Apparent Relationship
[Figure: scatter diagram with x on the horizontal axis and y on the vertical axis]
Example: Panthers Football Team
The Panthers football team is interested in investigating the relationship, if any, between interceptions made and points scored.
x = Number of Interceptions    y = Number of Points Scored
            1                              14
            3                              24
            2                              18
            1                              17
            3                              30
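One common way to draw a trend line through data like these is a least-squares fit. The slides do not say which method is used, so the choice of least squares here is an assumption, and `least_squares_line` is a hypothetical helper name.

```python
# Panthers data from the slide: interceptions (x) and points scored (y).
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 30]


def least_squares_line(x, y):
    """Return (slope, intercept) of the least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


slope, intercept = least_squares_line(interceptions, points)
print(f"points = {intercept:.2f} + {slope:.2f} * interceptions")
```

For these five games the fitted slope is 5.75 and the intercept is 9.10; the positive slope is what an upward-sloping trend line on the scatter diagram would show.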
Scatter Diagram
[Figure: scatter diagram of the Panthers data, with Number of Interceptions (x) on the horizontal axis, 0 to 4, and Number of Points Scored (y) on the vertical axis, 0 to 35]
Example: Panthers Football Team
Insights Gained from the Preceding Scatter Diagram
• The scatter diagram indicates a positive relationship between the number of interceptions and the number of points scored.
• Higher points scored are associated with a higher number of interceptions.
• The relationship is not perfect; not all plotted points in the scatter diagram lie on a straight line.
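These insights can be checked numerically with the sample correlation coefficient, a measure the slides themselves do not use; it is brought in here only as a hedged cross-check. A value above 0 indicates a positive relationship, and a value below 1 indicates that the points do not all fall on one line.

```python
from math import sqrt

# Panthers data from the slide: interceptions (x) and points scored (y).
interceptions = [1, 3, 2, 1, 3]
points = [14, 24, 18, 17, 30]


def correlation(x, y):
    """Sample (Pearson) correlation coefficient between x and y."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


r = correlation(interceptions, points)
print(f"r = {r:.2f}")
```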