Summary of Statistics I

  • 7/27/2019 Summary of Statistics I

    1/7

    8/26/20

    Types of variables:

    Quantitative and Categorical

    Types of Quantitative Variables

    Discrete: Listable values with gaps between them.

    Note: Not necessarily integers. May be an infinite number of values.

    Continuous: Any value in an interval can occur.

    Categorical variable

    A variable that classifies people or things is qualitative or categorical.

    Ex: gender, political affiliation, race, country of origin.

    Types of Categorical Variables

    Nominal: Values are simply labels or categories.

    Ex: 1 for Educator
        2 for Construction worker
        3 for Manufacturing worker

    Ordinal: The categories have a natural ordering.

    Ex: 1 for President
        2 for Vice President
        3 for Dean of College
        4 for Department Supervisor
        5 for Employee

    Binary: Only two categories (often labeled 0 and 1).
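
    The difference between nominal and ordinal codes can be illustrated with a short Python sketch. The class name Rank and the mapping OCCUPATION are hypothetical names chosen for illustration; the codes themselves are the ones listed in the examples. For the ordinal variable, comparing and sorting codes is meaningful; for the nominal variable, the codes are only labels.

```python
from enum import IntEnum


class Rank(IntEnum):
    """Ordinal codes from the example: a smaller code means a higher rank."""
    PRESIDENT = 1
    VICE_PRESIDENT = 2
    DEAN_OF_COLLEGE = 3
    DEPARTMENT_SUPERVISOR = 4
    EMPLOYEE = 5


# Nominal codes are labels only, so a plain lookup table is enough.
# Comparing or averaging these codes would have no meaning.
OCCUPATION = {1: "Educator", 2: "Construction worker", 3: "Manufacturing worker"}

staff = [Rank.EMPLOYEE, Rank.PRESIDENT, Rank.DEAN_OF_COLLEGE]

# Sorting is meaningful for an ordinal variable because the codes carry order.
ordered = sorted(staff)
print([rank.name for rank in ordered])
print(OCCUPATION[2])
```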

    Identify the type:

    Height: Quantitative, Continuous

    Family Size: Quantitative, Discrete

    Age: Quantitative, Continuous

    Course Grade: Categorical (Qualitative), Ordinal

    Football jersey number: Categorical (Qualitative), Nominal

    Statistics: Descriptive (Numerical, Graphical) and Inferential

    Descriptive Statistics: numerical summaries and/or graphical displays of (sample) data.

    Inferential Statistics: using sample data to draw conclusions about the population from which the sample was taken.

    Drawing conclusions about the population (parameter), based on a sample (statistic).
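
    As a rough illustration of a statistic estimating a parameter, the sketch below builds a hypothetical population (the values, the sizes 10,000 and 500, and the seed are all assumptions, not data from these slides), computes the population mean as the parameter, and compares it with the mean of a simple random sample.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 ages between 20 and 79.
population = [random.randint(20, 79) for _ in range(10_000)]
parameter = statistics.mean(population)  # known only if we take a census

# A simple random sample of 500 individuals, drawn without replacement.
sample = random.sample(population, 500)
statistic = statistics.mean(sample)      # computed from the sample alone

print(f"parameter (population mean): {parameter:.2f}")
print(f"statistic (sample mean):     {statistic:.2f}")
```

    The two printed values should be close but not identical; the gap between them is the sampling error that inferential methods try to quantify.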

    Statistical Inference

    [Diagram: a census of the population gives the real parameter; a sample gives a statistic, which estimates the parameter.]

    After we collect data, then what?

    Organize / Summarize Data: Graphical, Numerical

    Graphs

    Histogram, Stem-and-leaf plot: for a single quantitative variable

    Bar Chart, Pie Chart: for a single categorical variable

    Common Statistical Graphs: Quantitative Data

    Histogram - vertical bar chart of frequencies

    - Frequency Polygon - line graph of frequencies

    - Ogive - line graph of cumulative frequencies

    Stem and Leaf Plot - like a histogram, but shows individual data values. Useful for small data sets.

    Histogram

    [Figure: histogram of Years (horizontal axis, 0 to 80) versus Frequency (vertical axis, 0 to 20).]

    What values of Years fall in the 2nd class? 30 to under 40

    How many values fall in the 2nd class? 18 of the 50

    What is the sum of the heights of all bars? 50
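
    The counting behind a histogram can be sketched in Python as follows. The function name class_frequencies and the list ages are hypothetical; the raw data behind the histogram are not given, so the values here are made up. Each class includes its lower limit and excludes its upper limit, as in "30-under 40".

```python
from collections import Counter


def class_frequencies(values, start, width, n_classes):
    """Count how many values fall in each class [lower, upper)."""
    counts = Counter()
    for v in values:
        index = int((v - start) // width)  # which class the value lands in
        if 0 <= index < n_classes:         # ignore values outside all classes
            counts[index] += 1
    return [
        (start + i * width, start + (i + 1) * width, counts[i])
        for i in range(n_classes)
    ]


# Hypothetical ages, NOT the data behind the histogram on the slide.
ages = [21, 25, 29, 30, 33, 34, 38, 39, 41, 47, 52, 58, 63, 71]
table = class_frequencies(ages, start=20, width=10, n_classes=6)
for lower, upper, freq in table:
    print(f"{lower}-under {upper}: {freq}")

# The bar heights add up to the number of observations.
total = sum(freq for _, _, freq in table)
print("sum of bar heights:", total)
```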

    Frequency Polygon

    - line graph of frequencies

    Class Interval    Frequency

    20-under 30        6
    30-under 40       18
    40-under 50       11
    50-under 60       11
    60-under 70        3
    70-under 80        1

    [Figure: frequency polygon of Frequency (vertical axis) versus Years (horizontal axis, 0 to 80).]
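
    A frequency polygon plots each class frequency against the class midpoint. The sketch below computes those points from the class intervals in the table. Two assumptions are made: the last frequency is taken as 1 so that the counts total 50 (matching the cumulative frequencies in the ogive table), and the polygon is closed with zero-frequency points one class width beyond each end, which is a common convention rather than something stated on the slide. The function name polygon_points is hypothetical.

```python
# Class limits and frequencies read from the frequency table.
# Assumption: the last frequency is 1, so the counts total 50.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80)]
frequencies = [6, 18, 11, 11, 3, 1]


def polygon_points(classes, frequencies, close=True):
    """Return (midpoint, frequency) pairs for a frequency polygon."""
    points = [((lo + hi) / 2, f) for (lo, hi), f in zip(classes, frequencies)]
    if close:
        width = classes[0][1] - classes[0][0]
        first_mid, last_mid = points[0][0], points[-1][0]
        # Common convention: anchor the polygon on the axis one class
        # width beyond each end.
        points = [(first_mid - width, 0)] + points + [(last_mid + width, 0)]
    return points


points = polygon_points(classes, frequencies)
for midpoint, freq in points:
    print(f"{midpoint:>5.1f}  {freq}")
```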

    Ogive

    - line graph of cumulative frequencies

    Class Interval    Cumulative Frequency

    20-under 30        6
    30-under 40       24
    40-under 50       35
    50-under 60       46
    60-under 70       49
    70-under 80       50

    [Figure: ogive of cumulative Frequency (vertical axis, 0 to 60) versus Years (horizontal axis, 0 to 80).]

    Relative Frequency Ogive

    Class Interval    Cumulative Relative Frequency

    20-under 30        .12
    30-under 40        .48
    40-under 50        .70
    50-under 60        .92
    60-under 70        .98
    70-under 80       1.00

    [Figure: ogive of Cumulative Relative Frequency (vertical axis, 0.00 to 1.10) versus Years (horizontal axis, 0 to 80).]
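
    The cumulative and cumulative relative frequencies in the two ogive tables can be reproduced with a running sum. This is a minimal sketch; the class frequencies 6, 18, 11, 11, 3, 1 are the ones implied by the cumulative column, and plotting each cumulative value at the upper class limit (starting from 0 at the first lower limit) is the usual ogive convention, assumed here.

```python
from itertools import accumulate

upper_limits = [30, 40, 50, 60, 70, 80]
frequencies = [6, 18, 11, 11, 3, 1]

# Running total of the class frequencies.
cumulative = list(accumulate(frequencies))
n = cumulative[-1]  # total number of observations

# Divide each running total by n to get cumulative relative frequencies.
relative = [round(c / n, 2) for c in cumulative]

# Ogive points: start at zero at the lower limit of the first class,
# then one point per upper class limit.
ogive = [(20, 0.0)] + list(zip(upper_limits, relative))

for limit, cum, rel in zip(upper_limits, cumulative, relative):
    print(f"under {limit}: {cum:>2}  {rel:.2f}")
```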

    Stem-and-leaf plot of Income1   N = 100

    Leaf Unit = 1000

       2    1  58
       6    2  0223
      30    2  555566667777778889999999
     (22)   3  0001111222222222333344
      48    3  5555555666777788889
      29    4  00000001111223444
      12    4  56667889
       4    5  0
       3    5  57
       1    6
       1    6
       1    7  2

    Stem-and-Leaf Plot

    Purpose: Same as for a histogram, but provides a little more detail about the data distribution.

    Data values are split into stem and leaf components. The choice of split is data dependent.

    Ex: 12,345.67 splits into stem 123, leaf 4, and discarded digits 5.67.
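
    The split into stem, leaf, and discarded digits can be sketched as below. The function names split_value and stem_and_leaf and the list incomes are hypothetical, and the income values are made up for illustration. This simple version handles non-negative values only, puts each stem on a single row (the display above uses two rows per stem), and omits empty stems.

```python
from collections import defaultdict


def split_value(value, leaf_unit):
    """Split a value into (stem, leaf); digits below the leaf unit are discarded."""
    units = int(value // leaf_unit)  # drop everything smaller than one leaf unit
    return units // 10, units % 10   # leaf is the last kept digit, stem is the rest


def stem_and_leaf(values, leaf_unit):
    """Group the leaves of the sorted values under their stems."""
    rows = defaultdict(list)
    for v in sorted(values):
        stem, leaf = split_value(v, leaf_unit)
        rows[stem].append(leaf)
    return {stem: "".join(map(str, leaves)) for stem, leaves in sorted(rows.items())}


# The example from the slide: 12,345.67 with a leaf unit of 10.
print(split_value(12345.67, leaf_unit=10))

# Hypothetical incomes with Leaf Unit = 1000.
incomes = [15200, 18900, 20100, 22500, 23000, 31000, 72400]
plot = stem_and_leaf(incomes, leaf_unit=1000)
for stem, leaves in plot.items():
    print(f"{stem} | {leaves}")
```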

    Distribution of a Quantitative Variable

    - Illustrated by a histogram or a stem-and-leaf plot.

    - Indicates how the values of the variable are spread out across the range of the variable.

    Key Features of Data Distributions

    Shape

    Typical Value

    Spread

    Outliers

    Shapes: Symmetric, Skewed right, Skewed left, Bimodal (multi-modal)

    [Figures: example histograms (vertical axis: Frequency) for a Symmetric Distribution, a More or less Symmetric Distribution, Skewed Right, Skewed Left, and Bimodal.]

    Bar Chart

    This is NOT a histogram!

    Common Statistical Graphs: Qualitative Data

    Pie Chart - proportional representation for categories of a whole

    Bar Chart - frequency or relative frequency of one or more categorical variables

    Second Quarter Truck Production in the U.S. (Hypothetical values)

    Company    2d Quarter Truck Production

    A          357,411
    B          354,936
    C          160,997
    D           34,099
    E           12,747
    Totals     920,190

    [Figure: pie chart, Second Quarter U.S. Truck Production, showing each company's proportion of truck production.]

    Pie Chart Calculations for Company A

    Company    2d Quarter Truck Production    Proportion    Degrees

    A          357,411                         .388          140
    B          354,936                         .386          139
    C          160,997                         .175           63
    D           34,099                         .037           13
    E           12,747                         .014            5
    Totals     920,190                        1.000          360

    Proportion for Company A: 357,411 / 920,190 = .388

    Degrees for Company A: 360 × .388 ≈ 140
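
    The same proportion and degree calculation can be carried out for every company at once. A minimal sketch, using the hypothetical production values from the table; the variable names are illustrative, and the degrees are computed from the unrounded proportion, which gives the same rounded results as the table.

```python
# Second-quarter truck production by company (hypothetical values from the table).
production = {"A": 357_411, "B": 354_936, "C": 160_997, "D": 34_099, "E": 12_747}
total = sum(production.values())

slices = {}
for company, units in production.items():
    proportion = units / total
    # Each slice's angle is its share of the full 360-degree circle.
    slices[company] = (round(proportion, 3), round(proportion * 360))

for company, (proportion, degrees) in slices.items():
    print(f"{company}: {proportion:.3f}  {degrees:>3} degrees")
print("total:", total)
```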

    Pareto Chart

    A Pareto chart is a bar chart, sorted from most frequent to least frequent, overlaid with a cumulative line graph (like an ogive).

    [Figure: Pareto chart with categories Poor Wiring, Short in Coil, Defective Plug, and Other; left axis: Frequency (0 to 100); right axis: cumulative percentage (0% to 100%).]
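
    The numbers behind a Pareto chart are the sorted frequencies and their running percentage of the total. The sketch below shows that calculation; the category names come from the chart, but the counts are hypothetical because the values behind the figure are not recoverable.

```python
from itertools import accumulate

# Hypothetical counts; only the category names come from the chart.
counts = {"Short in Coil": 20, "Poor Wiring": 40, "Other": 10, "Defective Plug": 30}

# Sort the bars from most frequent to least frequent.
ordered = sorted(counts.items(), key=lambda item: item[1], reverse=True)
total = sum(counts.values())

# Running total of the sorted counts, expressed as a percentage of the whole.
running = accumulate(count for _, count in ordered)
pareto = [
    (category, count, 100 * cum / total)
    for (category, count), cum in zip(ordered, running)
]

for category, count, cum_pct in pareto:
    print(f"{category:<15}{count:>4}{cum_pct:>7.0f}%")
```

    The bar heights come from the second column and the overlaid line from the third.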

    Graphs

    Scatterplot: for two quantitative variables

    [Figure: Scatterplot of Debt versus Income]