Making Sense of Statistics


  • 8/13/2019 Making Sense of Statistics

    1/39

    by

    Jason Samuels, CUNY-BMCC

    AMATYC 39, 2013-11-2


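    Students Don't Get Statistics

    After years of Algebra courses, Statistics requires a very different way of thinking

    "What's the formula?"
    Some steps require a formula, e.g. find the z-score
    Some steps don't
    e.g. find the z-score ... wait, what?

    $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

    $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

    $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$

    $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$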


    Which Topics Can Be Unified?

    Doing calculations with standard data distributions
    Find the data value, z-score, probability
    Normal distribution, t-distribution, etc.
    Confidence intervals
    Hypothesis tests

    Some ideas so these topics make sense to students


    Key idea #1: Describe the distribution

    Orients the students toward the values they will use in the problem and in their calculations

    Describe the distribution of the data:

    Center (mean)

    Spread (standard deviation)

    Shape (which distribution: normal, t, etc.)


    Describe the distribution: an example

    Ex) A college has an average of 23.7 students in each class, with a standard deviation of 5.6. What is the probability that a sample of 35 classes has an average of more than 25?

    Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$, want $P(\bar{x} > 25)$

    Describe the distribution of $\bar{x}$:
    Mean: $\mu_{\bar{x}} = \mu = 23.7$
    Standard deviation: $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{5.6}{\sqrt{35}} = 0.95$
    Shape: n > 30 so it's normal
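The "describe the distribution" step can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `describe_sample_mean` is hypothetical, and the `n > 30` check simply encodes the rule of thumb used on the slide.

```python
import math

def describe_sample_mean(mu, sigma, n):
    """Return (center, spread, shape) for the distribution of the sample mean."""
    center = mu                    # the mean of x-bar equals the population mean
    spread = sigma / math.sqrt(n)  # standard deviation of x-bar
    # n > 30 is the slide's rule of thumb for treating x-bar as normal
    shape = "normal" if n > 30 else "check the population shape"
    return center, spread, shape

# Facts from the class-size example: mu = 23.7, sigma = 5.6, n = 35
center, spread, shape = describe_sample_mean(mu=23.7, sigma=5.6, n=35)
print(center, round(spread, 2), shape)  # 23.7 0.95 normal
```

Keeping center, spread, and shape together mirrors the way the slide asks students to write all three down before doing any calculation.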


    Key Idea #2: Draw the Graph

    All values can be organized and connected using one graph:


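    Draw the graph: example continued

    From before:
    Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$, want $P(\bar{x} > 25)$
    Describe the distribution of $\bar{x}$: $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = 0.95$, normal

    Now draw the graph:
    [Graph: normal curve with an $\bar{x}$ axis marked at 23.7 and 25, and a z axis alongside it]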


    Key Idea #3: The Flow Chart

    Almost every calculation students will do with standard distributions is guided by this flow chart:


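    Key Idea #4: The Formula

    There is only one formula students need to know:

    $\text{test statistic} = \dfrac{(\text{data value}) - (\text{mean})}{\text{standard deviation}}$

    Or, equivalently:

    data value = (mean) + (test statistic)(standard deviation)

    For a single data value: $z = \dfrac{x - \mu}{\sigma}$ ...or... $x = \mu + z\sigma$

    For a sample mean: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}}$ ...or... $\bar{x} = \mu_{\bar{x}} + z\sigma_{\bar{x}}$

    For a sample proportion: $z = \dfrac{\hat{p} - \mu_{\hat{p}}}{\sigma_{\hat{p}}}$ ...or... $\hat{p} = \mu_{\hat{p}} + z\sigma_{\hat{p}}$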
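As a rough sketch of the "one formula" idea, the two directions can be written as a pair of inverse functions. The names `standardize` and `unstandardize` are hypothetical, chosen for this illustration; the numbers are the class-size example with the rounded standard deviation 0.95.

```python
def standardize(data_value, mean, std_dev):
    """Test statistic: number of standard deviations from the mean."""
    return (data_value - mean) / std_dev

def unstandardize(test_stat, mean, std_dev):
    """Inverse direction: recover the data value from the test statistic."""
    return mean + test_stat * std_dev

# The same two functions serve a single value, a sample mean, or a sample
# proportion; only the mean and standard deviation passed in change.
z = standardize(25, mean=23.7, std_dev=0.95)
x_back = unstandardize(z, mean=23.7, std_dev=0.95)
print(round(z, 2), round(x_back, 2))  # 1.37 25.0
```

The caller supplies the mean and standard deviation that were found in the "describe the distribution" step, so no case-specific formula is needed.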


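    Benefit

    Students learn that z has one meaning (the number of standard deviations from the mean), so z has one formula

    Never again will students use these varied, complex formulas:

    $z = \dfrac{\bar{x} - \mu}{\sigma / \sqrt{n}}$

    $z = \dfrac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$

    $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$

    $z = \dfrac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1-p_1)}{n_1} + \dfrac{p_2(1-p_2)}{n_2}}}$

    Students make fewer order-of-operation calculation errors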


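    The formula: example continued

    From before:
    Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$, want $P(\bar{x} > 25)$
    Describe the distribution of $\bar{x}$: $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = \dfrac{5.6}{\sqrt{35}} = 0.95$, normal

    Now:
    Find the z-score: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{25 - 23.7}{0.95} = 1.37$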


    Flow Chart & Graph - together

    Probability

    Z-score

    Data value


    Flow Chart & Graph - example continued

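    Get the facts: $\mu = 23.7$, $\sigma = 5.6$, $n = 35$, want $P(\bar{x} > 25)$
    Describe the distribution of $\bar{x}$: $\mu_{\bar{x}} = 23.7$, $\sigma_{\bar{x}} = 0.95$, normal

    Now, fill in the graph following the flowchart:

    Data value: $\bar{x} = 25$
    Z-score: 1.37
    Probability: .9147 to the left, .0853 to the right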
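The path through the flow chart (data value, then z-score, then probability, and back again) can be sketched with Python's standard-library `statistics.NormalDist`. This is an illustration, not the presenter's tool: rounding z to two decimals is an assumption made here to mimic a table lookup, and the variable names are hypothetical.

```python
from statistics import NormalDist

mean, std_dev = 23.7, 0.95  # distribution of x-bar from the example (0.95 is rounded)
x_bar = 25

# Data value -> z-score (rounded to two places, as a z-table would require)
z = round((x_bar - mean) / std_dev, 2)

# z-score -> probability (areas under the standard normal curve)
area_left = NormalDist().cdf(z)
area_right = 1 - area_left
print(z, round(area_left, 4), round(area_right, 4))  # 1.37 0.9147 0.0853

# Reverse direction: probability -> z-score -> data value
z_back = NormalDist().inv_cdf(area_left)
x_back = mean + z_back * std_dev
```

Without the rounding of z, the areas differ slightly in the last decimal places from the slide's values.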


    Putting it together: an exercise

    The mean time for all flight delays is 21 minutes with a standard deviation of 12 minutes. What is the probability that a sample of 36 flights has a mean delay of more than 26 minutes?


    Putting it together: an exercise

    Step 1: Get the facts

    $\mu = 21$
    $\sigma = 12$
    $n = 36$

    (1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$, find $P(\bar{x} > 26)$
    (2) Describe the distribution:
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion:


    Putting it together: an exercise

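    Step 2: Describe the distribution

    Center: mean $\mu_{\bar{x}} = \mu = 21$
    Spread: standard deviation $\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}} = \dfrac{12}{\sqrt{36}} = \dfrac{12}{6} = 2$
    Shape: n > 30, so the distribution is normal

    (1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$, find $P(\bar{x} > 26)$
    (2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, normal
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion: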


    Putting it together: an exercise

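    Step 3: Draw the graph

    (1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$, find $P(\bar{x} > 26)$
    (2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}$ axis]
    (4) Do the calculations:
    (5) Conclusion: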


    Putting it together: an exercise

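    Step 4: Do the calculations

    z-score: $z = \dfrac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} = \dfrac{26 - 21}{2} = 2.5$

    Areas (using technology):
    area to the left = .9937
    area to the right = .0063

    (1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$, find $P(\bar{x} > 26)$
    (2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}$ axis, with z = 2.5 marked]
    (4) Do the calculations: z = 2.5, areas = .9937 & .0063
    (5) Conclusion: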


    Putting it together: an exercise

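    Step 5: Write the conclusion

    The probability is .0063

    (1) Get the facts: $\mu = 21$, $\sigma = 12$, $n = 36$, find $P(\bar{x} > 26)$
    (2) Describe the distribution: $\mu_{\bar{x}} = 21$, $\sigma_{\bar{x}} = 2$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}$ axis, with z = 2.5 marked]
    (4) Do the calculations: z = 2.5, areas = .9937 & .0063
    (5) Conclusion: The probability is .0063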
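The five steps of the flight-delay exercise can be checked with a short script. This is a sketch using Python's standard library in place of the technology used in the talk; the variable names are illustrative, and the computed right-tail area may differ from the slide's .0063 in the last digit because of rounding.

```python
import math
from statistics import NormalDist

# (1) Get the facts
mu, sigma, n = 21, 12, 36
cutoff = 26  # we want P(x-bar > 26)

# (2) Describe the distribution of x-bar
mean_xbar = mu
sd_xbar = sigma / math.sqrt(n)  # 12 / 6 = 2
is_normal = n > 30              # the slide's rule of thumb

# (4) Do the calculations (step 3, the graph, is drawn by hand)
z = (cutoff - mean_xbar) / sd_xbar
area_left = NormalDist().cdf(z)
area_right = 1 - area_left

# (5) Conclusion
print(f"z = {z}, P(x-bar > {cutoff}) is about {area_right:.4f}")
```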


    A harder exercise (that's not harder)

    Ex) For United, the mean delay time is 18 minutes, st.dev. = 11 minutes. For Delta, the mean delay time is 22 minutes, st.dev. = 14 minutes. Find the probability that, for a sample of 32 United flights and 34 Delta flights, Delta has a higher mean delay time by over 2 minutes.


    A (not) harder exercise

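    Step 1: Get the facts

    Delta: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$
    United: $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$
    Find $P(\bar{x}_1 - \bar{x}_2 > 2)$

    (1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
    (2) Describe the distribution:
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion: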


    A (not) harder exercise

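    Step 2: Describe the distribution

    Center: mean $\mu_{\bar{x}_1 - \bar{x}_2} = \mu_1 - \mu_2 = 22 - 18 = 4$
    Spread: standard deviation $\sigma_{\bar{x}_1 - \bar{x}_2} = \sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$ or $\sqrt{(\sigma_{\bar{x}_1})^2 + (\sigma_{\bar{x}_2})^2}$, which gives $\sqrt{\dfrac{14^2}{34} + \dfrac{11^2}{32}} = 3.09$
    Shape: $n_1, n_2 > 30$ so it's normal

    (1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
    (2) Describe the distribution: $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$, normal
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion: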


    A (not) harder exercise

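    Step 3: Draw the graph

    (1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
    (2) Describe the distribution: $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}_1 - \bar{x}_2$ axis]
    (4) Do the calculations:
    (5) Conclusion: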


    A (not) harder exercise

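    Step 4: Do the calculations

    z-score: $z = \dfrac{(\bar{x}_1 - \bar{x}_2) - \mu_{\bar{x}_1 - \bar{x}_2}}{\sigma_{\bar{x}_1 - \bar{x}_2}} = \dfrac{2 - 4}{3.09} = -0.65$

    Areas:
    area to the left = .2587
    area to the right = .7413

    (1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
    (2) Describe the distribution: $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}_1 - \bar{x}_2$ axis, with z = -0.65 marked]
    (4) Do the calculations: z = -0.65, areas: .2587 & .7413
    (5) Conclusion: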


    A (not) harder exercise

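    Step 5: Write the conclusion

    The probability is .7413

    (1) Get the facts: $\mu_1 = 22$, $\sigma_1 = 14$, $n_1 = 34$; $\mu_2 = 18$, $\sigma_2 = 11$, $n_2 = 32$; find $P(\bar{x}_1 - \bar{x}_2 > 2)$
    (2) Describe the distribution: $\mu_{\bar{x}_1 - \bar{x}_2} = 4$, $\sigma_{\bar{x}_1 - \bar{x}_2} = 3.090$, normal
    (3) Draw the graph: [Graph: normal curve over an $\bar{x}_1 - \bar{x}_2$ axis, with z = -0.65 marked]
    (4) Do the calculations: z = -0.65, areas: .2587 & .7413
    (5) Conclusion: The probability is .7413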
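The two-sample exercise follows the same steps once the difference of sample means has been described. The sketch below uses Python's standard library rather than the tool from the talk; variable names are illustrative assumptions.

```python
import math
from statistics import NormalDist

# (1) Get the facts
mu1, sigma1, n1 = 22, 14, 34  # Delta
mu2, sigma2, n2 = 18, 11, 32  # United
cutoff = 2                    # we want P(x-bar1 - x-bar2 > 2)

# (2) Describe the distribution of the difference of sample means
mean_diff = mu1 - mu2                                 # center
sd_diff = math.sqrt(sigma1**2 / n1 + sigma2**2 / n2)  # spread
is_normal = n1 > 30 and n2 > 30                       # shape, by the rule of thumb

# (4) Same single formula: (data value - mean) / standard deviation
z = (cutoff - mean_diff) / sd_diff
area_right = 1 - NormalDist().cdf(z)

print(round(sd_diff, 2), round(z, 2), round(area_right, 4))
```

Only step 2 changes relative to the one-sample exercise; the z-score line is the same formula with a different mean and standard deviation.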


    A Handy Tool: StatDisk

    Does all basic statistics calculations with a simple graphical interface and one or two clicks


    The Issue of the Center

    First students learn that they know $\mu$, this defines the center of the distribution, and $x$ (the value from the data) exists relative to that

    In the case of inference (confidence intervals and hypothesis tests), $\mu$ (or $p$) is not known. Rather, we know $\bar{x}$ (or $\hat{p}$) and make an inference about $\mu$ (or $p$).

    What does this mean for the distribution, and the graph?


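    The Issue of the Center: Confidence Interval

    Formula: $(\bar{x} - z\sigma_{\bar{x}},\ \bar{x} + z\sigma_{\bar{x}})$

    What does this imply for the graph?

    [Graph: normal curve with $\bar{x} - z\sigma_{\bar{x}}$, $\bar{x}$, and $\bar{x} + z\sigma_{\bar{x}}$ marked on the axis]

    The center is $\bar{x}$, not $\mu$! We are calculating values for $\mu$, not $\bar{x}$

    With confidence intervals we just use the formula and ignore it

    With hypothesis tests, the issue does not go away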
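To make the "center is the sample mean" point concrete, here is a minimal sketch of the interval formula. The function name `z_confidence_interval` and the 95% default are assumptions for this illustration, and the numbers are borrowed from the family-size exercise elsewhere in the deck.

```python
import math
from statistics import NormalDist

def z_confidence_interval(x_bar, std_dev, n, confidence=0.95):
    """Interval (x_bar - z*sd, x_bar + z*sd), centered on the sample mean."""
    sd_xbar = std_dev / math.sqrt(n)
    # Critical z leaving (1 - confidence) / 2 in each tail
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return x_bar - z * sd_xbar, x_bar + z * sd_xbar

low, high = z_confidence_interval(x_bar=1.92, std_dev=0.9, n=500)
center = (low + high) / 2  # lands on x_bar, not on an unknown mu
print(round(low, 3), round(high, 3), round(center, 2))
```

The midpoint of the interval is the sample mean, which is the observation the slide is drawing attention to.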


    The Issue of the Center: Hypothesis Test

    Old way:

    $H_0: \mu = \mu_0$
    $H_1: \mu > \mu_0$

    [Graph: normal curve with $\mu_0$ and $\bar{x}$ marked on the axis]

    ...and you spend all this time explaining why, even though the hypothesis says $\mu > \mu_0$, you shade to the right of $\bar{x}$

    (and I think students still don't understand, they just do it)


    Recognizing a Different Center: Hypothesis Test

    New way:

    $H_0: \mu = \mu_0$
    $H_1: \mu > \mu_0$

    [Graph: normal curve with $\bar{x}$ and $\mu_0$ marked on the axis]

    ...and now you shade where the claim tells you to shade, and that area is your confidence level


    Why This Makes Sense

    Shaded area matches the claim
    Hypothesis tests and confidence intervals are both inferences about the population, and they should agree
    We are using a distribution of values for $\mu$
    The center is $\bar{x}$

    What does confidence mean?
    It's a type of probabilistic statement
    95% of the time, a conclusion made in this way will be correct


    Different center: an exercise

    Ex) We want to find out if the average American family has more than 1.8 kids (because that places a strain on municipal services). From a survey of 500 families, the mean is 1.92 with a standard deviation of 0.9.


    Different center: an exercise

    Step 1: Get the facts

    $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$
    claim: $\mu > 1.8$

    (1) Get the facts: $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$; test claim: $\mu > 1.8$
    (2) Describe the distribution:
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion:


    Different center: an exercise

    Step 2: Describe the distribution

    Center: mean = 1.92
    Spread: st.dev. $= \dfrac{0.9}{\sqrt{500}} = .0402$
    Shape: n > 30 so it's normal

    (1) Get the facts: $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$; test claim: $\mu > 1.8$
    (2) Describe the distribution: mean = 1.92, st.dev. = .0402, normal
    (3) Draw the graph:
    (4) Do the calculations:
    (5) Conclusion:


    Different center: an exercise

    Step 3: Draw the graph

    (1) Get the facts: $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$; test claim: $\mu > 1.8$
    (2) Describe the distribution: mean = 1.92, st.dev. = .0402, normal
    (3) Draw the graph: [Graph: normal curve centered at $\bar{x}$]
    (4) Do the calculations:
    (5) Conclusion:


    Different center: an exercise

    Step 4: Do the calculations

    z-score: $z = \dfrac{1.8 - 1.92}{.0402} = -2.99$

    Areas:
    area to the left = .0014
    area to the right = .9986

    (1) Get the facts: $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$; test claim: $\mu > 1.8$
    (2) Describe the distribution: mean = 1.92, st.dev. = .0402, normal
    (3) Draw the graph: [Graph: normal curve centered at $\bar{x}$, with z = -2.99 marked]
    (4) Do the calculations: z = -2.99, areas: .0014 & .9986
    (5) Conclusion:


    Different center: an exercise

    Step 5: Write the conclusion

    We are .9986 confident in the claim (the average American family has more than 1.8 children)

    (1) Get the facts: $\bar{x} = 1.92$, st.dev. = 0.9, $n = 500$; test claim: $\mu > 1.8$
    (2) Describe the distribution: mean = 1.92, st.dev. = .0402, normal
    (3) Draw the graph: [Graph: normal curve centered at $\bar{x}$, with z = -2.99 marked]
    (4) Do the calculations: z = -2.99, areas: .0014 & .9986
    (5) Conclusion: We have .9986 confidence that $\mu > 1.8$
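The "different center" calculation can be sketched by building the normal distribution around the sample mean and shading where the claim points. This is an illustration with Python's standard library, not the presenter's tool; variable names are assumptions, and the unrounded z comes out near -2.98 rather than the slide's -2.99, which used the rounded .0402.

```python
import math
from statistics import NormalDist

# (1) Get the facts
x_bar, std_dev, n = 1.92, 0.9, 500
mu_0 = 1.8  # claim: mu > 1.8

# (2) Describe the distribution: centered on x_bar, not on mu_0
sd = std_dev / math.sqrt(n)
dist = NormalDist(mu=x_bar, sigma=sd)

# (4) z-score of the claimed value relative to x_bar, and the two areas
z = (mu_0 - x_bar) / sd
area_left = dist.cdf(mu_0)
area_right = 1 - area_left  # the region where mu > 1.8, matching the claim

# (5) Conclusion
print(f"z = {z:.2f}, confidence that mu > {mu_0}: {area_right:.4f}")
```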


    Big Changes

    All the formulas for the test statistic flip

    For means:
    the center is $\bar{x}$
    The formula for z is: $z = \dfrac{\mu_0 - \bar{x}}{\text{s.d.}}$

    For proportions:
    the center is $\hat{p}$
    The formula for z is: $z = \dfrac{p_0 - \hat{p}}{\text{s.d.}}$

    These are equivalent to the confidence interval formulas (just solve for $\mu_0$) so we already used them without knowing it

    The formulas for $x$ & $z$ (given population info) were inverses;
    now the formulas for $\mu$ and $z$ from inference (confidence intervals & hypothesis tests) are inverses, as they should be
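The "just solve for $\mu_0$" remark can be written out as a short derivation. This is a sketch for the case of means, where s.d. denotes the standard deviation of the sample mean and $z^*$ (a symbol introduced here for illustration) is the critical value for the chosen confidence level.

```latex
\begin{align*}
z &= \frac{\mu_0 - \bar{x}}{\text{s.d.}} \\
z \cdot \text{s.d.} &= \mu_0 - \bar{x} \\
\mu_0 &= \bar{x} + z \cdot \text{s.d.}
\end{align*}
```

Setting $z = -z^*$ and $z = +z^*$ gives the two endpoints $\bar{x} - z^* \cdot \text{s.d.}$ and $\bar{x} + z^* \cdot \text{s.d.}$, which is the confidence interval formula. The same steps with $p_0$ and $\hat{p}$ give the proportion case.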


    Jason Samuels [email protected]