IMS 504-Week 4&5 new


  • 8/7/2019 IMS 504-Week 4&5 new


    IMS 502

    Information Analysis for Decision Making

    Descriptive Statistics


    Content

    Measure of Location

    Measure of Spread

    Measure of Shape


    Learning Outcomes

    After completing this chapter, you should be able to apply appropriate measures of location, spread and shape.


    Concept of Descriptive Statistics

    Describing Data Numerically

    Measure of Location (Central Tendency)
    Measure of Spread (Variability)
    Measure of Shape


    Concept of Descriptive Statistics

    Measure of Location (Central Tendency)

    A measure of location gives descriptive information about the single numerical value that is considered to be the most typical of the values of a quantitative variable.

    A measure of central tendency indicates where the middle of the data is. Three common measures of central tendency are:

    Mean
    Median
    Mode


    Measures of Central Tendency

    Statistics to represent the centre of a distribution:

    Mode (most frequent)
    Median (50th percentile)
    Mean (average)

    The choice of measure depends on:

    Type of data
    Shape of distribution (esp. skewness)


    Measures of Central Tendency

    Measure of Central Tendency by Level of Measurement

    Level of Measurement   Mode   Median   Mean
    Nominal                X
    Ordinal                X      X        X
    Interval               X      X        X
    Ratio                  X?     X        X


    Measures of Central Tendency

    Choosing a measure of central tendency

    Use the Mode when:
    1. The variable is measured at the nominal level.
    2. You want a quick and easy measure for ordinal and interval-ratio variables.
    3. You want to report the most common score.

    Use the Median when:
    1. The variable is measured at the ordinal level.
    2. A variable measured at the interval-ratio level has a highly skewed distribution.
    3. You want to report the central score. The median always lies at the exact center of a distribution.

    Use the Mean when:
    1. The variable is measured at the interval-ratio level (except when the variable is highly skewed).
    2. You want to report the typical score. The mean is the fulcrum that exactly balances all of the scores.
    3. You anticipate additional statistical analysis.


    Measures of Central Tendency

    Mean

    The mean is the average of the data.

    The population mean, $\mu = \frac{1}{N} \sum_{i=1}^{N} X_i$, is usually unknown, so we use the sample mean to estimate or approximate it.

    The sample mean: $\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i$


    Measures of Central Tendency

    Example: Here is a random sample of size 10 of ages, where

    $x_1 = 42$, $x_2 = 28$, $x_3 = 28$, $x_4 = 61$, $x_5 = 31$, $x_6 = 23$, $x_7 = 50$, $x_8 = 34$, $x_9 = 32$, $x_{10} = 37$.

    $\bar{x} = (42 + 28 + \dots + 37) / 10 = 36.6$
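    The sample-mean calculation can be checked with a few lines of Python. This is only a minimal sketch: the function name `sample_mean` is an illustrative choice, and the data are the ten ages listed in the example.

```python
# Ages from the example: a random sample of size n = 10.
ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]


def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by n."""
    if not values:
        raise ValueError("the mean of an empty sample is undefined")
    return sum(values) / len(values)


mean_age = sample_mean(ages)
print(mean_age)  # 36.6
```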


    Measures of Central Tendency

    Properties of the Mean

    Uniqueness: For a given set of data there is one and only one mean.

    Simplicity: It is easy to understand and to compute.

    Affected by extreme values: All values enter into the computation, so extreme values pull the mean toward them.


    Measures of Central Tendency

    Example: Assume the values are 115, 110, 119, 117, 121 and 126.

    The mean = 118.

    But assume that the values are 75, 75, 80, 80 and 280.

    The mean = 118, a value that is not representative of the set of data as a whole.
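    The effect of the extreme value 280 can be seen by comparing the mean with the median for both sets of values. The sketch below uses Python's standard `statistics` module; the variable names are illustrative only.

```python
from statistics import mean, median

# The two sets of values from the example.
typical = [115, 110, 119, 117, 121, 126]
with_outlier = [75, 75, 80, 80, 280]

mean_typical = mean(typical)
mean_outlier = mean(with_outlier)
median_outlier = median(with_outlier)

# Both means equal 118, but only the first set is centred near 118.
print(mean_typical, mean_outlier)

# The median of the second set stays at 80, close to most of its values.
print(median_outlier)
```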


    Concept of Descriptive Statistics

    Median

    When the data are ordered, the median is the observation that divides the set of observations into two equal parts, such that half of the data are before it and the other half are after it.

    The median is the center point in a set of numbers (50% above, 50% below).


    Concept of Descriptive Statistics

    Median

    Check to see which of the following two rules applies:

    Rule One

    If n is odd, the median will be the middle observation.

    It will be the (n+1)/2 th ordered observation.

    When n = 11, then the median is the 6th observation.

    Example:

    Three is the median for the numbers 1, 1, 3, 4, 9.


    Concept of Descriptive Statistics

    Median

    Check to see which of the following two rules applies:

    Rule Two

    If n is even, there are two middle observations. The median will be the mean of these two middle observations.

    It will be the (n+1)/2 th ordered observation.

    When n = 12, then the median is the 6.5th observation, which is an observation halfway between the 6th and 7th ordered observations.

    Example:

    23, 28, 28, 31, 32, 34, 37, 42, 50, 61.

    Since n = 10, the median is the 5.5th observation, i.e. (32 + 34)/2 = 33.
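    The two rules for the median can be sketched as a short Python function. The name `median_by_position` is hypothetical, and the data are the values used in the examples for Rule One and Rule Two.

```python
def median_by_position(values):
    """Return the median using the odd/even position rules."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    middle = n // 2
    if n % 2 == 1:
        # Rule One: n is odd, so take the (n+1)/2 th ordered observation.
        return ordered[middle]
    # Rule Two: n is even, so average the two middle observations.
    return (ordered[middle - 1] + ordered[middle]) / 2


odd_example = [1, 1, 3, 4, 9]
even_example = [23, 28, 28, 31, 32, 34, 37, 42, 50, 61]

print(median_by_position(odd_example))   # 3
print(median_by_position(even_example))  # 33.0
```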


    Measures of Central Tendency

    Properties of the Median

    Uniqueness: For a given set of data there is one and only one median.

    Simplicity: It is easy to calculate.

    It is not affected by extreme values as the mean is.


    Concept of Descriptive Statistics

    Mode

    The mode is simply the most frequently occurring number.

    If all values are different, there is no mode.

    Sometimes there is more than one mode.


    Measures of Central Tendency

    Example: Assume the values are 23, 28, 28, 31, 32, 34, 37, 42, 50, 61.

    The mode = 28 (repeated two times).
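    A small Python sketch can find the most frequent value and also cover the cases of no mode and more than one mode. The function `modes` is a hypothetical helper; returning an empty list when every value occurs once is an assumption chosen to match the "no mode" convention stated in these slides.

```python
from collections import Counter


def modes(values):
    """Return a sorted list of the most frequent values (empty if no mode)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        # Every value is different: treated here as having no mode.
        return []
    return sorted(value for value, count in counts.items() if count == top)


ages = [23, 28, 28, 31, 32, 34, 37, 42, 50, 61]

print(modes(ages))             # [28]
print(modes([1, 1, 2, 2, 3]))  # [1, 2]  (more than one mode)
print(modes([4, 5, 6]))        # []      (no mode)
```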


    Measures of Central Tendency

    Properties of the Mode

    Sometimes, it is not unique.

    It may be used for describing qualitative data.


    Measures of Central Tendency

    The only procedure in SPSS that will produce all three commonly used measures of location is Frequencies.

    To begin:


    Working Example (Pg. 59)

    One hundred tennis players participated in a serving competition. Gender and number of aces were recorded for each player. The data can be found in Work4.sav on the iLearn web site that accompanies this title.

    Follow steps 1, 2, 3, 4, 6, 8 & 11.


    Exercises

    Use the Frequencies command to get all three (3) measures of location.

    Write a sentence or two reporting each measure. You may choose three (3) variables to report on.


    Concept of Descriptive Statistics

    Measure of Spread (Variability/Dispersion)

    Measures of spread give information on the spread or variability of the data values. They are measures of deviation from the central tendency.

    Non-parametric/non-normal: range, percentiles, min, max

    Parametric: standard deviation (SD) & properties of the normal distribution


    Concept of Descriptive Statistics

    Measure of Spread (Variability/Dispersion)

    Four common measures of spread are:

    Range
    Quartiles
    Variance
    Standard Deviation


    Measures of Spread

    Measure of Spread by Level of Measurement

    Level of Measurement   Range, Min/Max   Percentile   Standard Deviation (SD)
    Nominal
    Ordinal                X
    Interval               X                X            X?
    Ratio                  X                X            X


    Measures of Spread

    Range (R)

    The range is the difference between the highest and lowest scores in a distribution.

    If we examine the marks of the 100 students above, then we can see that the highest score was 85 and the lowest was 35.

    Therefore, the range is 50 (85 - 35 = 50).

    Although the range is limited as a means of telling about the general spread of a group of data, it does set the boundaries of the scores.
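    The range calculation can be sketched in Python as follows. The slides do not list the students' marks, so the `marks` list below is hypothetical data chosen only so that the highest score is 85 and the lowest is 35; `value_range` is likewise an illustrative name.

```python
# Hypothetical marks (not from the slides), with highest 85 and lowest 35.
marks = [35, 48, 52, 60, 67, 71, 85]


def value_range(values):
    """Return the range: highest score minus lowest score."""
    if not values:
        raise ValueError("the range of an empty data set is undefined")
    return max(values) - min(values)


marks_range = value_range(marks)
print(marks_range)  # 50
```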


    Measures of Spread

    Quartiles (Q)

    The quartiles split the ordered data into four quarters:

    Q1, or the 25th percentile: 25.0% of the observations lie at or below it
    Q2, the median: 50.0% of the observations lie at or below it
    Q3, or the 75th percentile: 75.0% of the observations lie at or below it

    The difference Q3 - Q1 is called the inter-quartile range, or IQR.

    25% | Q1 | 25% | Q2 | 25% | Q3 | 25%
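    Quartiles and the inter-quartile range can be sketched with `statistics.quantiles` from Python's standard library (Python 3.8 or later). The data are the ten ages used earlier in these slides. Note that several interpolation conventions for quartiles exist, so other software may report slightly different values; the sketch uses the function's default "exclusive" method.

```python
from statistics import quantiles

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# n=4 asks for the three cut points that split the data into quarters.
q1, q2, q3 = quantiles(ages, n=4)
iqr = q3 - q1

print(q1, q2, q3)  # 28.0 33.0 44.0
print(iqr)         # 16.0
```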


    Measures of Spread

    Variance

    The variance is the square of the standard deviation.

    The lower the variance, the more accurately the mean represents the scores of all cases in a distribution of data.


    Measures of Spread

    Standard Deviation

    The standard deviation provides the researcher with an indicator of how the scores for a variable are spread around the mean.

    The higher the standard deviation, the more the scores are spread out around the mean.
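    The relationship between variance and standard deviation can be illustrated with Python's `statistics` module. This sketch assumes the sample versions (dividing by n - 1) and reuses the ten ages from the earlier mean example; the slides themselves do not state which divisor is intended.

```python
from statistics import mean, stdev, variance

ages = [42, 28, 28, 61, 31, 23, 50, 34, 32, 37]

# Sample variance: sum of squared deviations from the mean, divided by n - 1.
centre = mean(ages)
manual_variance = sum((x - centre) ** 2 for x in ages) / (len(ages) - 1)

sample_variance = variance(ages)  # library version of the same quantity
sample_sd = stdev(ages)           # square root of the sample variance

print(round(sample_variance, 2))  # 132.93
print(round(sample_sd, 2))        # 11.53
```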


    Measures of Spread

    Using SPSS for measures of spread.

    To begin:


    Working Example (Pg. 59)

    One hundred tennis players participated in a serving competition. Gender and number of aces were recorded for each player. The data can be found in Work4.sav on the iLearn web site that accompanies this title.

    Follow steps 1, 2, 3, 4, 5, 7, 8 & 11.


    Exercises

    Choose three (3) variables to work on.

    Write a few sentences summarizing the tables of measures of spread for each variable.

    Describe the difference (if any).


    Concept of Descriptive Statistics

    Measure of Shape

    A measure of shape describes how data are distributed. Use the normal curve (combination of mean & standard deviation) to construct precise descriptive statements. Two common measures of shape are:

    Skewness is a measure of symmetry, or more precisely, the lack of symmetry.
    Shapes: Normal, Negative Skewness, Positive Skewness

    Kurtosis is a measure of whether the data are peaked or flat relative to a normal distribution.
    Shapes: Platykurtic, Leptokurtic


    Measure of Shape - Skewness

    Symmetric (normal): Mean = Median = Mode, Coefficient = 0

    Negative skewness: Mean < Median < Mode, Coefficient = Negative

    Positive skewness: Mode < Median < Mean, Coefficient = Positive
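    The sign of the skewness coefficient can be illustrated with a short Python sketch. The slides do not say which formula is intended, so this assumes the simple moment coefficient m3 / m2^1.5 computed from population moments; statistical packages may report a sample-adjusted version, so exact values can differ even though the sign carries the same meaning. The function name and the example data sets are illustrative.

```python
def skewness(values):
    """Moment coefficient of skewness: m3 / m2**1.5 (population moments)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n
    m3 = sum((x - centre) ** 3 for x in values) / n
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5


symmetric = [1, 2, 3, 4, 5]              # balanced around its mean
right_tail = [75, 75, 80, 80, 280]       # one large value pulls the mean up
left_tail = [-x for x in right_tail]     # mirror image, long tail on the left

print(skewness(symmetric))       # zero coefficient
print(skewness(right_tail) > 0)  # positive skewness
print(skewness(left_tail) < 0)   # negative skewness
```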


    Measure of Shape - Kurtosis

    Kurtosis, $\gamma_2 < 0$ (Platykurtic)

    Data distribution with large standard deviation.
    Data sets with low kurtosis tend to have a flat top near the mean rather than a sharp peak.
    The data will be far away from the mean.

    Kurtosis, $\gamma_2 > 0$ (Leptokurtic)

    Data distribution with small standard deviation.
    Data sets with high kurtosis tend to have a distinct peak near the mean, decline rather rapidly, and have heavy tails.
    The data will cluster around or close to the mean.
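    The sign convention for kurtosis can be illustrated with a short Python sketch. It assumes the coefficient on the slide is excess kurtosis, computed here as m4 / m2^2 - 3 from population moments, so that a normal distribution scores about 0. Statistical packages may report a sample-adjusted version, so exact values can differ. The function name and both data sets are hypothetical.

```python
def excess_kurtosis(values):
    """Excess kurtosis: m4 / m2**2 - 3 (population moments)."""
    n = len(values)
    centre = sum(values) / n
    m2 = sum((x - centre) ** 2 for x in values) / n
    m4 = sum((x - centre) ** 4 for x in values) / n
    if m2 == 0:
        raise ValueError("kurtosis is undefined when all values are equal")
    return m4 / m2 ** 2 - 3


# Hypothetical data: evenly spread values give a flat shape.
flat = [1, 2, 3, 4, 5]
# Hypothetical data: most values at the centre plus two far-out values.
peaked = [0, 0, 0, 0, 0, 0, 0, 0, -10, 10]

print(excess_kurtosis(flat) < 0)    # True: platykurtic
print(excess_kurtosis(peaked) > 0)  # True: leptokurtic
```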


    Measure of Shape

    Using SPSS for measures of shape.

    To begin:


    Working Example (Pg. 59)

    One hundred tennis players participated in a serving competition. Gender and number of aces were recorded for each player. The data can be found in Work4.sav on the iLearn web site that accompanies this title.

    Follow steps 1, 2, 3 and 4, click on Skewness & Kurtosis, then follow steps 8, 9, 10 & 11.


    Exercise

    Choose three (3) variables to work on.

    Write a few sentences summarizing the tables of measures of shape for each variable.

    Describe the difference (if any).
