Central Tendency 2015


  • 7/25/2019 Central Tendency 2015


    Measures of central tendency

    Dr Ismarulyusda bt Ishak


    MEASURES OF CENTRAL TENDENCY FOR UNGROUPED DATA

    Mean

    Median

    Mode

    Relationships among the Mean, Median, and Mode

    Prem Mann, Introductory Statistics, 7/E. Copyright 2010 John Wiley & Sons. All rights reserved.


    RM 3,500.00


    An indication of the location or centrality of the data.


    The most common measures:

    Mean (numerical average)

    Median (the midpoint of an ordered data set, such that half of the data points are above it and half are below it)

    Mode (the value that occurs most frequently)


    POPULATION

    THE ENTIRE COLLECTION OF ITEMS THAT IS THE FOCUS OF CONCERN.


    POPULATION AND SAMPLE

    Population: examined to identify its characteristics.

    Sample: examined to make inferences about the characteristics of the population.


    Figure 1.1 Population and Sample


    Mean

    Arithmetic mean (most familiar)

    Another name for average.

    If describing a population, denoted as $\mu$, the Greek letter mu.

    If describing a sample, denoted as $\bar{x}$, called x-bar.

    Appropriate for describing measurement data.

    Seriously affected by unusual values called outliers.


    Calculating Sample Mean

    Formula: $\bar{x} = \frac{\sum x}{n}$

    That is, add up all of the data points and divide by the number of data points.

    n = sample size

    x = variable used to represent individual data

    Do not round! Mean need not be a whole number.


    Population Mean

    Mean of all values in population

    $\mu = \frac{\sum x}{N}$

    N = number of values in population

    x = variable used to represent individual data


    Table 3.1 2008 Sales of Six U.S. Companies


    Find the 2008 mean sales for these six companies.


    Example 3-1: Solution

    $\sum x = x_1 + x_2 + x_3 + x_4 + x_5 + x_6 = 149 + 406 + 183 + 107 + 426 + 97 = 1368$

    $\bar{x} = \frac{\sum x}{n} = \frac{1368}{6} = 228 = \$228 \text{ billion}$

    Thus, the mean 2008 sales of these six companies was 228, or $228 billion.
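The arithmetic in Example 3-1 can be checked with a few lines of Python. This is only a minimal sketch: the names `sales` and `sample_mean` are illustrative and do not come from the slides.

```python
# Sales (in billions of dollars) of the six companies, as listed in Example 3-1.
sales = [149, 406, 183, 107, 426, 97]


def sample_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


total = sum(sales)               # sum of x
mean_sales = sample_mean(sales)  # sum of x divided by n

print(total, mean_sales)         # 1368 228.0
```

The same function works for a population mean; only the interpretation of the data (all values versus a sample) changes.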


    Example 3-3

    Table 3.2 lists the total philanthropic givings (in millions of dollars) by six companies during 2007.


    Example 3-3

    Notice that the charitable contributions made by Wal-Mart are very large compared to those of other companies. Hence, it is an outlier. Show how the inclusion of this outlier affects the value of the mean.


    Example 3-3: Solution

    If we do not include the charitable giving of Wal-Mart (the outlier), the mean of the charitable contributions of the five companies is

    $\text{Mean} = \frac{22.4 + 31.8 + 19.8 + 9.0 + 27.5}{5} = \$22.1 \text{ million}$


    Example 3-3: Solution

    Now, to see the impact of the outlier on the value of the mean, we include the contributions of Wal-Mart and find the mean contributions of the six companies. This mean is

    $\text{Mean} = \frac{22.4 + 31.8 + 19.8 + 9.0 + 27.5 + 337.9}{6} = \$74.73 \text{ million}$
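The effect of the outlier in Example 3-3 can be reproduced as a short Python sketch. The variable names are illustrative assumptions; the figures are the ones quoted in the example.

```python
# Charitable contributions (in millions of dollars) from Example 3-3.
without_walmart = [22.4, 31.8, 19.8, 9.0, 27.5]
walmart = 337.9  # the outlier


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


mean_without = mean(without_walmart)
mean_with = mean(without_walmart + [walmart])

print(round(mean_without, 2))  # 22.1
print(round(mean_with, 2))     # 74.73
```

Adding a single extreme value more than triples the mean, which is why the mean is described as sensitive to outliers.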


    Median

    Another name for the 50th percentile.

    The middle value when the original data values are arranged in increasing order.

    Appropriate for describing measurement data.

    Robust to outliers, that is, not affected much by unusual values.


    Example 3-4

    The following data give the prices (in thousands of dollars) of seven houses selected from all houses sold last month in a city.

    312 257 421 289 526 374 497

    Find the median.


    Example 3-4: Solution

    First, we rank the given data in increasing order as follows:

    257 289 312 374 421 497 526

    Since there are seven homes in this data set, the middle term is the fourth term, which is 374.

    Thus, the median price of a house is 374, that is, $374 thousand.


    Example 3-5

    Table 3.3 gives the 2008 profits (rounded to billions of dollars) of 12 companies selected from all over the world.


    Table 3.3 Profits of 12 Companies for 2008


    Find the median of these data.


    Example 3-5: Solution

    First we rank the given profits as follows:

    7 8 9 10 11 12 13 13 14 17 17 45

    There are 12 values in this data set. Because

    there is an even number of values in the data set,

    the median is given by the average of the two

    middle values.


    Example 3-5: Solution

    The two middle values are the sixth and seventh in the foregoing list of data, and these two values are 12 and 13.

    $\text{Median} = \frac{12 + 13}{2} = \frac{25}{2} = 12.5 = \$12.5 \text{ billion}$

    Thus, the median profit of these 12 companies is $12.5 billion.
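Both median cases, an odd count (Example 3-4) and an even count (Example 3-5), can be handled by one small function. The following is a sketch under the definition used in these slides; the function and variable names are illustrative.

```python
def median(values):
    """Middle value of the ranked data; average of the two middle values if the count is even."""
    ordered = sorted(values)  # rank the data in increasing order
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]                       # odd count: single middle term
    return (ordered[mid - 1] + ordered[mid]) / 2  # even count: mean of the two middle terms


house_prices = [312, 257, 421, 289, 526, 374, 497]       # Example 3-4, thousands of dollars
profits = [7, 8, 9, 10, 11, 12, 13, 13, 14, 17, 17, 45]  # Example 3-5, billions of dollars

print(median(house_prices))  # 374
print(median(profits))       # 12.5
```

Note that the largest profit, 45, has no influence on the result: replacing it with any larger value leaves the median at 12.5.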


    Median

    The median gives the center of a histogram, with half the data values to the left of the median and half to the right of the median. The advantage of using the median as a measure of central tendency is that it is not influenced by outliers. Consequently, the median is preferred over the mean as a measure of central tendency for data sets that contain outliers or are not normally distributed.


    Case Study 3-2 The Gender Pay Gap


    Mode

    The value that occurs with the highest frequency in a data set.

    From the French word for fashion (most popular/common).

    One data set can have many modes.

    When two values have the same greatest frequency, each one is a mode and the data set is bimodal.

    When more than two values share the greatest frequency, the data set is multimodal.

    When no value is repeated, there is no mode.

    Appropriate for all types of data, but most useful for categorical data or discrete data with only a small number of possible values.


    Speeds (in km/hour) of eight cars that were stopped for speeding violations.

    120 140 167 135 140 182 159 133

    Solution:

    _______ is the mode
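The rules for the mode (one mode, bimodal, multimodal, and no mode when no value repeats) can be sketched in Python as follows. The function name `modes` and the two small extra data sets are hypothetical, added only for illustration; running the first `print` gives the answer to the exercise.

```python
from collections import Counter


def modes(values):
    """Return all values tied for the highest frequency, or [] if no value repeats."""
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []  # no value is repeated, so there is no mode
    return sorted(v for v, c in counts.items() if c == highest)


speeds = [120, 140, 167, 135, 140, 182, 159, 133]  # km/hour, from the slide

print(modes(speeds))
print(modes([1, 2, 2, 3, 3]))  # [2, 3], a bimodal data set (hypothetical)
print(modes([4, 5, 6]))        # [], no mode (hypothetical)
```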


    Relationships among the Mean, Median,

    and Mode

    1. For a symmetric histogram and

    frequency curve with one peak (Figure

    3.2), the values of the mean, median, and

    mode are identical, and they lie at the

    center of the distribution.


    Relationships among the mean, median and

    mode

    Figure 3.2: Mean, median, and mode for a symmetric histogram and frequency distribution curve.


    Relationships among the Mean, Median,

    and Mode

    2. For a histogram and a frequency curve skewed to the right (Figure 3.3), the value of the mean is the largest, that of the mode is the smallest, and the value of the median lies between these two. (Notice that the mode always occurs at the peak point.) The value of the mean is the largest in this case because it is sensitive to outliers that occur in the right tail. These outliers pull the mean to the right.


    Figure 3.3: Mean, median, and mode for a histogram and frequency distribution curve skewed to the right.


    Relationships among the Mean, Median,

    and Mode

    3. If a histogram and a distribution curve are skewed to the left (Figure 3.4), the value of the mean is the smallest and that of the mode is the largest, with the value of the median lying between these two. In this case, the outliers in the left tail pull the mean to the left.


    Figure 3.4: Mean, median, and mode for a histogram and frequency distribution curve skewed to the left.
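The three relationships can be illustrated numerically. The data sets below are hypothetical, invented only to show one symmetric, one right-skewed, and one left-skewed shape; the functions come from Python's standard `statistics` module.

```python
from statistics import mean, median, mode

# Hypothetical data sets, chosen only to illustrate the three shapes.
symmetric = [1, 2, 2, 3, 3, 3, 4, 4, 5]
right_skewed = [1, 2, 2, 2, 3, 3, 4, 5, 9, 14]        # long tail of large values
left_skewed = [1, 6, 10, 11, 12, 12, 13, 13, 13, 14]  # long tail of small values

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(name, "mode:", mode(data), "median:", median(data), "mean:", mean(data))

# symmetric:    mode = median = mean = 3
# right-skewed: mode 2 < median 3 < mean 4.5
# left-skewed:  mean 10.5 < median 12 < mode 13
```

Real data rarely match these orderings exactly, so the pattern is best read as a typical tendency for single-peaked distributions rather than a guarantee.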


    Mean for Grouped Data

    Mean for population data: $\mu = \frac{\sum mf}{N}$

    Mean for sample data: $\bar{x} = \frac{\sum mf}{n}$

    where m is the midpoint and f is the frequency of a class.


    Example 3-14

    Table 3.8 gives the frequency distribution of the daily commuting times (in minutes) from home to work for all 25 employees of a company. Calculate the mean of the daily commuting times.


    Prem Mann, Introductory Statistics, 6/E. Copyright 2007 John Wiley & Sons. All rights reserved.


    $\mu = \frac{\sum mf}{N} = \frac{535}{25} = 21.4 \text{ minutes}$

    Thus, the employees of this company spend an average of 21.40 minutes a day commuting from home to work.
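The grouped-data calculation can be sketched in Python. Table 3.8 itself is not reproduced in these slides, so the class limits and frequencies below are an assumption, chosen to be consistent with the totals reported here (N = 25 and a sum of m times f equal to 535). The name `grouped_mean` is illustrative.

```python
# ASSUMED frequency distribution: Table 3.8 is not shown in the slides.
# These classes are chosen only to agree with N = 25 and sum(m * f) = 535.
classes = [
    ((0, 10), 4),   # (lower limit, upper limit) in minutes, frequency f
    ((10, 20), 9),
    ((20, 30), 6),
    ((30, 40), 4),
    ((40, 50), 2),
]


def grouped_mean(classes):
    """Return (sum of m*f, total frequency, mean), where m is each class midpoint."""
    total_f = sum(f for _, f in classes)
    total_mf = sum((low + high) / 2 * f for (low, high), f in classes)
    return total_mf, total_f, total_mf / total_f


sum_mf, N, mu = grouped_mean(classes)
print(sum_mf, N, mu)  # 535.0 25 21.4
```

Because every observation in a class is represented by the class midpoint, the result is an approximation of the mean that the raw, ungrouped data would give.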


    Choosing Appropriate

    If data are symmetric, the mean, median, and mode will be approximately the same.

    If data are multimodal, report the mean, median and/or mode for each subgroup.

    If data are skewed, report the median.
