Chapter 4: Quantitative Data
Part 1: Displaying Quant Data (Week 2, Wednesday)
Part 2: Summarizing Quant Data (Week 2, Friday)

Slide 2
Displaying Quantitative Data

Qualitative data
  Few categories make this data easy to display
  Example: Gender has 2 categories (M/F)
  Example: Grade has 5 categories (A/B/C/D/F)
  Qual Tools: Pie Graphs, Frequency Tables, Bar Charts

Quantitative data
  Typically has many distinct values
  Example: Weight, Age, Height, Salary
  Therefore the qualitative tools above won't work
  Quant Tools: Histograms, Stem & Leaf, Dot Plots

Slides 3-6
Displaying Quantitative Data: Histogram (p. 48)
Group data into bins

Example: Test Grades
  55 61 64 65 76 77 90 97
  56 62 64 67 76 78 91 98
  59 62 64 69 77 79 93 98
  60 63 64 75 77 85 94 99

Bins: [55, 60) [60, 65) [65, 70) [70, 75) [75, 80) [80, 85) [85, 90) [90, 95) [95, 100)

*** Notice the observation 60 is placed in the bin [60, 65), not [55, 60) ***
*** This is the standard way to place observations that fall on a boundary ***

Slide 7
Displaying Quantitative Data: Stem and Leaf (p. 50)
Histograms provide an easy-to-understand summary of the distribution, but they don't show the data values themselves. Stem and Leaf displays solve this.

Slides 8-10
Displaying Quantitative Data: Stem and Leaf (p. 50)
Example: Test Grades (the same 32 grades as in the histogram example)
Bins: [50, 60) [60, 70) [70, 80) [80, 90) [90, 100)

  5 | 5 6 9
  6 | 0 1 2 2 3 4 4 4 4 5 7 9
  7 | 5 6 6 7 7 7 8 9
  8 | 5
  9 | 0 1 3 4 7 8 8 9
  Test Grades (1|2 means 12%)

*** See page 51 to learn how to build stem and leaf displays for bins that differ from 10 units in length ***

Slide 11
Displaying Quantitative Data: Dotplots (p. 52)
Example: Test Grades (the same 32 grades)

Slide 12
Chapter 4: Quantitative Data
Part 2: Summarizing Quant Data (Week 2, Friday)

Slide 13
Summarizing Quantitative Data
  Shape
    Mode
    Symmetry (Symmetric, Skewed)
  Center
    Median
    Mean
  Spread
    Range
    Quartiles
    IQR
  Advanced Topics (used throughout rest of semester)
    Variance
    Standard Deviation

Slides 14-17
Summarizing Quantitative Data: Mode (p. 53)
Does the histogram have a single, central hump or several separated humps? These humps are called modes.
  Unimodal: only one central hump
  Bimodal: two central humps
  Multimodal: more than one central hump
  Uniform: all the bars are approximately the same height and no mode is obvious

Slide 18
Summarizing Quantitative Data: Symmetry (p. 54)
Can you fold the histogram along a vertical line through the middle and have the edges match pretty closely, or are more of the values on one side?
Symmetric / Skewed to the Left / Skewed to the Right

Slides 19-21
Summarizing Quantitative Data: Mean VS Median
Mean: what we typically think of when we hear the word "average." Add up the values and divide by the total number of values.
Median: the number such that exactly half of the values are above it and half are below it.

Example 1: Consider the test grades 83, 94, 98, 99, 60
  The mean can be found through:
    Mean = (83+94+98+99+60)/5 = 86.8
  The median can be found by first ordering the values from smallest to largest:
    60 83 94 98 99
  Then selecting the number that is in the middle:
    Median = 94

Example 2: Consider the test grades 83, 94, 98, 99
  The mean can be found through:
    Mean = (83+94+98+99)/4 = 93.5
  The median can be found by first ordering the values from smallest to largest:
    83 94 98 99
  Then averaging the two numbers in the middle:
    Median = (94+98)/2 = 96

Slide 22
Summarizing Quantitative Data: Range
Range = Largest Number - Smallest Number
Example: Consider the test grades 83, 94, 98, 99
  The range can be found through:
    Range = 99 - 83 = 16
*** THE RANGE IS A NUMBER. "83 to 99" IS WRONG ***

Slide 23
Summarizing Quantitative Data: Quartiles (p. 58)
A special way of splitting the data into fourths.
Order the data, split it in half, and find the median of each half.
Lower Quartile (or Q1) is the lower median
Upper Quartile (or Q3) is the upper median

Example 1: (even number of values) Find Q1 and Q3 of the following ages:
    23 34 33 22 50 21 18 22
  First, order the numbers from lowest to highest:
    18 21 22 22 23 33 34 50
  Next, split the data in half (four numbers in each half)
    First half: 18 21 22 22
    Q1 = Median of First Half = (21+22)/2 = 21.5
    Last half: 23 33 34 50
    Q3 = Median of Last Half = (33+34)/2 = 33.5

Slide 24
Summarizing Quantitative Data: Quartiles (p. 58)
Example 2: (odd number of values) Find Q1 and Q3 of the following ages:
    34 33 22 50 21 18 22
  First, order the numbers from lowest to highest:
    18 21 22 22 33 34 50
  Next, split the data in half (the middle value, 22, is included in both halves)
    First half: 18 21 22 22
    Q1 = Median of First Half = (21+22)/2 = 21.5
    Last half: 22 33 34 50
    Q3 = Median of Last Half = (33+34)/2 = 33.5

Slide 25
Summarizing Quantitative Data: IQR (Interquartile Range)
IQR = Q3 - Q1
Single number (just like range is a single number)

Slide 26
Summarizing Quantitative Data: Summation Notation (p. 62)
Consider the grades: 80, 85, 90, 95.
$\sum y$ represents a summation of the grades. That is:
  $\sum y = 80 + 85 + 90 + 95 = 350$
From now on, we will use a new notation for the mean:
  $\bar{y} = \frac{\sum y}{n}$
where $\bar{y}$ (y-bar) represents the mean and n is the number of values of y.

Slide 27
Summarizing Quantitative Data: Variance
The variance of a variable is a measure of how spread out the data is. It is given by the following formula:
  $s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$
Note that $s^2$ (s-squared) represents the variance. The equation is best understood through an example.
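The center and spread measures from the slides on mean, median, range, quartiles, and IQR can be sketched in a few lines of Python. This is only an illustrative sketch: the function names (`mean`, `median`, `quartiles`, `value_range`, `iqr`) are made up for this example, and the quartile rule assumes the convention used in the two ages examples, where the middle value is placed in both halves when the number of values is odd. Other textbooks and software packages may split the data differently and report slightly different quartiles.

```python
def mean(values):
    """Add up the values and divide by how many there are."""
    return sum(values) / len(values)


def median(values):
    """Middle value of the ordered data (average of the two middle values if n is even)."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


def quartiles(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumption: when n is odd, the middle value is included in both halves,
    as in the odd-count ages example.
    """
    ordered = sorted(values)
    n = len(ordered)
    half = (n + 1) // 2          # size of each half (shares the middle value if n is odd)
    lower = ordered[:half]
    upper = ordered[n - half:]
    return median(lower), median(upper)


def value_range(values):
    """Range is a single number: largest minus smallest."""
    return max(values) - min(values)


def iqr(values):
    """Interquartile range: Q3 minus Q1."""
    q1, q3 = quartiles(values)
    return q3 - q1


ages_even = [23, 34, 33, 22, 50, 21, 18, 22]
ages_odd = [34, 33, 22, 50, 21, 18, 22]

print(quartiles(ages_even))            # (21.5, 33.5)
print(quartiles(ages_odd))             # (21.5, 33.5)
print(iqr(ages_even))                  # 12.0
print(value_range([83, 94, 98, 99]))   # 16
print(median([83, 94, 98, 99, 60]))    # 94
```

Sorting first and then slicing keeps each function close to the hand procedure: order the data, split it in half, and take the median of each half.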
Slides 28-35
Summarizing Quantitative Data: Variance Example
Consider the grades: 80, 85, 90, 95. Find the variance through the equation:
  $s^2 = \frac{\sum (y - \bar{y})^2}{n - 1}$

The mean is y-bar = 350/4 = 87.5.

  y     y - y-bar    (y - y-bar)^2
  80      -7.5          56.25
  85      -2.5           6.25
  90       2.5           6.25
  95       7.5          56.25

Sum of the squared deviations = 56.25 + 6.25 + 6.25 + 56.25 = 125
Variance = 125/(4 - 1), which is approximately 41.67

Slides 36-39
Summarizing Quantitative Data: Variance Example
Let's take a closer look at what's happening in the table above:
[Figure: number line with ticks from 30 to 100 showing the y values]
When data is more spread out, the result is a higher variance.
When all of the values are the same, the variance is 0.

Slide 40
Summarizing Quantitative Data: Standard Deviation
Standard deviation is the square root of variance:
  $s = \sqrt{s^2} = \sqrt{\frac{\sum (y - \bar{y})^2}{n - 1}}$
Note: the symbol for standard deviation is s
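The variance table for the grades 80, 85, 90, 95 can be reproduced step by step in Python. The sketch below is illustrative only: the function names `sample_variance` and `sample_std_dev` are hypothetical, and it assumes the sample variance, which divides the sum of squared deviations by n - 1.

```python
import math


def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1 (assumed divisor)."""
    n = len(values)
    y_bar = sum(values) / n
    squared_deviations = [(y - y_bar) ** 2 for y in values]
    return sum(squared_deviations) / (n - 1)


def sample_std_dev(values):
    """Standard deviation is the square root of the variance."""
    return math.sqrt(sample_variance(values))


grades = [80, 85, 90, 95]
y_bar = sum(grades) / len(grades)   # 87.5

# One row per grade: y, deviation from the mean, squared deviation
for y in grades:
    print(y, y - y_bar, (y - y_bar) ** 2)

print(sample_variance(grades))             # 125 / 3, about 41.67
print(sample_std_dev(grades))              # about 6.45
print(sample_variance([90, 90, 90, 90]))   # 0.0 when all values are the same
```

Python's standard library offers `statistics.variance` and `statistics.stdev`, which also use the n - 1 divisor, so they can serve as a cross-check on hand calculations.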