Module3a


  • Module 3a. Descriptive Statistics: Numerical Methods

    Measures of Location

    Percentiles and Quartiles

    Measures of Variability

  • Learning Goals

    Understand the purpose of measures of location.

    Be able to compute the mean, median, mode, quartiles, and various percentiles.

    Understand the purpose of measures of variability.

    Be able to compute the range, interquartile range, variance, standard deviation, and coefficient of variation.

  • Measures of Location

    The table on the right contains an excerpt from a data set of salaries for 474 employees at a Midwestern bank.

    We want to use measures of location to describe this data set.

    Sheet1

    Observation  Salary
    1  6300
    2  6360
    3  6480
    4  6480
    5  6480
    6  6540
    7  6600
    8  6660
    9  6720
    10  6780
    11  6780
    12  6780
    13  6840
    14  6840
    15  6900
    16  6960
    17  6960
    18  7080
    19  7260
    20  7260
    21  7380
    22  7500
    23  7680
    24  7680
    25  7860
    26  7860
    27  7860
    28  7860
    29  7860
    30  7860
    31  7920
    32  7980
    33  7980
    34  8040
    35  8040
    36  8160
    37  8160
    38  8160
    39  8220
    40  8280
    41  8280
    42  8340
    43  8340
    44  8340
    45  8340
    46  8340
    47  8400
    48  8400
    49  8460
    50  8460
    51  8520
    52  8520
    53  8520
    54  8520
    55  8580
    56  8580
    57  8580
    58  8640
    59  8640
    60  8640
    61  8700
    62  8700
    63  8700
    64  8760
    65  8760
    66  8760
    67  8760
    68  8760
    69  8820
    70  8820
    71  8820
    72  8820
    73  8880
    74  8880
    75  8880
    76  8940
    77  8940
    78  8940
    79  8940
    80  8940
    81  8940
    82  9000
    83  9000
    84  9000
    85  9000
    86  9000
    87  9000
    88  9000
    89  9060
    90  9060
    91  9120
    92  9120
    93  9180
    94  9180
    95  9180
    96  9180
    97  9240
    98  9240
    99  9240
    100  9240
    101  9300
    102  9300
    103  9360
    104  9360
    105  9360
    106  9360
    107  9420
    108  9420
    109  9480
    110  9480
    111  9540
    112  9540
    113  9600
    114  9600
    115  9600
    116  9600
    117  9600
    118  9600
    119  9600
    120  9600
    121  9660
    122  9660
    123  9660
    124  9660
    125  9720
    126  9720
    127  9720
    128  9720
    129  9780
    130  9780
    131  9780
    132  9780
    133  9780
    134  9780
    135  9780
    136  9780
    137  9840
    138  9840
    139  9900
    140  9900
    141  9900
    142  9900
    143  9960
    144  10020
    145  10020
    146  10020
    147  10020
    148  10080
    149  10080
    150  10080
    151  10080
    152  10080
    153  10140
    154  10140
    155  10200
    156  10200
    157  10200
    158  10200
    159  10260
    160  10260
    161  10320
    162  10320
    163  10380
    164  10380
    165  10380
    166  10380
    167  10440
    168  10440
    169  10500
    170  10500
    171  10500
    172  10500
    173  10500
    174  10500
    175  10500
    176  10560
    177  10560
    178  10560
    179  10560
    180  10620
    181  10620
    182  10620
    183  10620
    184  10620
    185  10680
    186  10680
    187  10680
    188  10680
    189  10680
    190  10680
    191  10680
    192  10740
    193  10740
    194  10800
    195  10800
    196  10800
    197  10860
    198  10860
    199  10920
    200  10920
    201  10920
    202  10920
    203  10920
    204  10980
    205  10980
    206  10980
    207  10980
    208  10980
    209  11040
    210  11040
    211  11100
    212  11100
    213  11100
    214  11100
    215  11100
    216  11100
    217  11100
    218  11160
    219  11160
    220  11160
    221  11160
    222  11220
    223  11220
    224  11220
    225  11280
    226  11340
    227  11340
    228  11340
    229  11400
    230  11400
    231  11400
    232  11400
    233  11400
    234  11400
    235  11460
    236  11520
    237  11520
    238  11580
    239  11640
    240  11640
    241  11640
    242  11640
    243  11640
    244  11664
    245  11700
    246  11700
    247  11736
    248  11760
    249  11760
    250  11760
    251  11760
    252  11760
    253  11820
    254  11880
    255  11940
    256  11940
    257  11940
    258  11940
    259  12000
    260  12000
    261  12000
    262  12000
    263  12000
    264  12060
    265  12060
    266  12108
    267  12120
    268  12120
    269  12120
    270  12120
    271  12180
    272  12240
    273  12240
    274  12240
    275  12300
    276  12300
    277  12300
    278  12300
    279  12300
    280  12300
    281  12300
    282  12300
    283  12300
    284  12300
    285  12300
    286  12300
    287  12300
    288  12360
    289  12360
    290  12360
    291  12420
    292  12480
    293  12480
    294  12480
    295  12540
    296  12540
    297  12540
    298  12600
    299  12600
    300  12600
    301  12660
    302  12660
    303  12660
    304  12660
    305  12780
    306  12780
    307  12780
    308  12780
    309  12840
    310  12960
    311  13020
    312  13020
    313  13020
    314  13140
    315  13200
    316  13260
    317  13320
    318  13320
    319  13320
    320  13380
    321  13416
    322  13500
    323  13560
    324  13560
    325  13560
    326  13560
    327  13560
    328  13560
    329  13764
    330  13800
    331  13800
    332  13800
    333  13800
    334  13800
    335  13848
    336  13920
    337  13920
    338  13980
    339  14040
    340  14040
    341  14100
    342  14100
    343  14100
    344  14100
    345  14220
    346  14220
    347  14280
    348  14280
    349  14280
    350  14400
    351  14400
    352  14400
    353  14400
    354  14460
    355  14640
    356  14820
    357  15000
    358  15060
    359  15120
    360  15120
    361  15120
    362  15360
    363  15420
    364  15480
    365  15540
    366  15540
    367  15660
    368  15720
    369  15840
    370  15960
    371  16020
    372  16080
    373  16080
    374  16080
    375  16080
    376  16140
    377  16140
    378  16320
    379  16320
    380  16440
    381  16620
    382  16800
    383  16920
    384  16920
    385  17200
    386  17364
    387  17400
    388  17460
    389  17580
    390  17950
    391  18000
    392  18060
    393  18100
    394  18250
    395  18400
    396  18400
    397  18750
    398  18900
    399  19020
    400  19200
    401  19500
    402  19600
    403  20000
    404  20220
    405  20400
    406  20500
    407  20580
    408  20850
    409  21060
    410  21250
    411  21600
    412  21750
    413  21950
    414  21960
    415  22000
    416  22000
    417  22000
    418  22200
    419  22300
    420  22600
    421  22620
    422  22700
    423  22800
    424  23250
    425  23500
    426  23750
    427  23760
    428  24000
    429  24000
    430  24150
    431  24250
    432  24500
    433  24750
    434  24750
    435  25000
    436  26000
    437  26000
    438  26000
    439  26400
    440  26500
    441  26700
    442  26750
    443  26750
    444  27000
    445  27250
    446  27250
    447  27500
    448  27500
    449  27700
    450  28000
    451  28000
    452  28350
    453  29000
    454  29400
    455  29500
    456  30000
    457  30000
    458  31250
    459  31300
    460  31400
    461  32000
    462  32500
    463  33000
    464  33500
    465  34500
    466  36250
    467  36500
    468  36800
    469  38800
    470  40000
    471  41400
    472  41500
    473  44250
    474  54000

    Sheet2

    Observation  Salary
    1  6300
    2  6360
    3  6480
    4  6480
    5  6480
    6  6540
    7  6600
    8  6660
    9  6720
    10  6780
    ...  ...
    464  33500
    465  34500
    466  36250
    467  36500
    468  36800
    469  38800
    470  40000
    471  41400
    472  41500
    473  44250


  • Measures of Location

    The following are measures of location:

    Mean

    Median

    Mode

    Percentiles

    Quartiles

  • Mean

    The mean of a data set is the average of all the data values.

    If the data are from a sample, the mean is denoted by $\bar{x}$ (x-bar).

    If the data are from a population, the mean is denoted by $\mu$ (mu).

  • Mean
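
    For reference, the usual textbook formulas for the two means can be written as below. Here n is the sample size and N is the population size; this is standard notation, assumed here rather than taken from the slides.

```latex
\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}
\quad\text{(sample mean)}
\qquad
\mu = \frac{\sum_{i=1}^{N} x_i}{N}
\quad\text{(population mean)}
```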

  • Median

    The median is the measure of location most often reported for annual income and property value data.

    A few extremely large incomes or property values can inflate the mean.

  • Median

    The median of a data set is the value in the middle when the data items are arranged in ascending order.

    For an odd number of observations, the median is the middle value.

    For an even number of observations, the median is the average of the two middle values.

  • Median

    Median = 50th percentile

    i = (p/100)n = (50/100)(474) = 237

    Because n = 474 is even, i is an integer, so we average the 237th and 238th data values:

    Median = (11,520 + 11,580)/2 = 11,550
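
    The odd/even rule can be sketched in a few lines of Python. The function name and the two small samples are illustrative assumptions; only the pair 11,520 and 11,580 comes from the salary example.

```python
def median(values):
    """Middle value for odd n; average of the two middle values for even n."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2


# Hypothetical small samples, not taken from the salary data.
print(median([7, 3, 5]))      # odd n: the middle value, 5
print(median([7, 3, 5, 9]))   # even n: average of 5 and 7

# The two middle salaries (237th and 238th values) from the example above.
print(median([11520, 11580]))
```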

  • Mean and Median Compared

    Both the mean and the median are measures of central location for the data. In this data set, the mean is $2,217.83 more than the median (13,767.83 - 11,550).

    Why is there such a large discrepancy?

    Looking at the frequency distribution of current salaries helps to explain why this discrepancy exists.

  • When a distribution contains values that are much smaller or larger than the rest, so that it is skewed, the mean may not be a good measure of central tendency.

    The histogram on the left shows the distribution of current salary. Two vertical lines run from top to bottom, each labeled with a number: the line on the left is the median (11,550) and the line on the right is the mean (13,768).

    When the distribution, as in this example, has a long tail that extends toward larger values (skewed right), the mean will be larger than the median. If the distribution has a long tail that extends toward smaller values (skewed left), the mean will be smaller than the median. When the data are symmetric (not skewed), the mean and median will be equal.
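
    A small sketch of the skew effect, using Python's standard statistics module. The five salaries are a hypothetical right-skewed sample chosen for illustration, not rows from the bank data.

```python
from statistics import mean, median

# Hypothetical right-skewed sample: four modest values and one very large one.
salaries = [9000, 10000, 11000, 12000, 54000]

sample_mean = mean(salaries)      # pulled upward by the single large value
sample_median = median(salaries)  # depends only on the middle position

print(sample_mean, sample_median)
print(sample_mean > sample_median)  # True for this right-skewed sample
```

    Replacing 54000 with an even larger value would raise the mean further while leaving the median where it is.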

  • Mode

    The mode of a data set is the value that occurs with greatest frequency.

    The greatest frequency can occur at two or more different values.

    If the data have exactly two modes, the data are bimodal.

    If the data have more than two modes, the data are multimodal.

  • Mode

    Example: Salary

    In the salary example, the modal salary was $12,300. This was the current salary of 14 of the 474 employees included in this sample.
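
    Because the greatest frequency can occur at more than one value, a mode routine should return every such value. A minimal sketch, with a hypothetical function name and made-up samples:

```python
from collections import Counter


def modes(values):
    """Return all values that occur with the greatest frequency, in ascending order."""
    counts = Counter(values)
    if not counts:
        raise ValueError("mode of an empty data set is undefined")
    top = max(counts.values())
    return sorted(v for v, c in counts.items() if c == top)


# Hypothetical samples, not taken from the salary data.
print(modes([1, 2, 2, 3]))      # one mode
print(modes([1, 1, 2, 2, 3]))   # two modes: bimodal
```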

  • Percentiles

    Recall how the median divides the sample into two equal parts: half the observations are less than the median and half are greater.

    There are other ways to split the sample on a percentage basis, such as finding the value where 10 percent of the observations are less than that value and 90 percent are greater.

    Admission test scores for colleges and universities are frequently reported in terms of percentiles.

  • Percentiles

    The pth percentile of a data set is a value such that at least p percent of the items take on this value or less and at least (100 - p) percent of the items take on this value or more.

    Arrange the data in ascending order.

    Compute the index i, the position of the pth percentile: i = (p/100)n

    If i is not an integer, round up. The pth percentile is the value in the rounded-up position.

    If i is an integer, the pth percentile is the average of the values in positions i and i + 1.

    Note: There is no universally accepted method for calculating percentiles. The method used in the book is not the same as the one used in SPSS. Further information is available at http://cnx.rice.edu/content/m10805/latest

  • Percentiles

    Example: Salary (Book Method)

    10th Percentile

    i = (p/100)n = (10/100)(474) = 47.4, which rounds up to 48

    The 10th percentile is the 48th data value:

    10th Percentile = 8,400
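
    The book method's round-up-or-average rule can be sketched as follows. The function name is hypothetical, p is assumed to be a whole number strictly between 0 and 100, and integer arithmetic is used so that the "is i an integer?" test is exact. The stand-in data set simply uses each value's rank (1 to 474) so the output shows which position is selected.

```python
def percentile_book(values, p):
    """pth percentile by the book method; p is an integer with 0 < p < 100."""
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    # i = (p/100) * n, split exactly into whole part and remainder.
    i, remainder = divmod(p * n, 100)
    if remainder:
        # i is not an integer: round up to position i + 1 (0-based index i).
        return ordered[i]
    # i is an integer: average the values in positions i and i + 1.
    return (ordered[i - 1] + ordered[i]) / 2


# Stand-in data: value k sits in position k, for n = 474 as in the salary example.
ranks = list(range(1, 475))
print(percentile_book(ranks, 10))  # position 48
print(percentile_book(ranks, 50))  # midway between positions 237 and 238
```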

  • Quartiles

    Quartiles are specific percentiles:

    First Quartile = 25th Percentile

    Second Quartile = 50th Percentile = Median

    Third Quartile = 75th Percentile

  • Quartiles

    Example: Salaries (Book Method)

    Third quartile = 75th percentile

    i = (p/100)n = (75/100)(474) = 355.5, which rounds up to 356

    Third quartile = 356th data value = 14,820

    Using SPSS

    Notice how the value for the 75th percentile calculated using SPSS is different.
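
    One convention that reproduces the SPSS third quartile of 14,865 quoted in the interquartile range example is a weighted average at position (n + 1)p/100. Whether SPSS was configured this way for these slides is an assumption; the sketch below only shows that the arithmetic agrees. The function name is hypothetical, and the filler values 0 and 99999 are placeholders: only the 356th and 357th values (14,820 and 15,000) come from the salary table.

```python
def percentile_weighted(values, p):
    """Interpolate linearly at the 1-based position (n + 1) * p / 100."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("percentile of an empty data set is undefined")
    pos = (n + 1) * p / 100
    lower = int(pos)       # whole part of the position
    frac = pos - lower     # fractional part used as the weight
    if lower < 1:
        return ordered[0]
    if lower >= n:
        return ordered[-1]
    below, above = ordered[lower - 1], ordered[lower]
    return below + frac * (above - below)


# 474 values; positions 356 and 357 hold the real salaries, the rest is filler.
data = [0] * 355 + [14820, 15000] + [99999] * 117
q3 = percentile_weighted(data, 75)
print(q3)  # position 356.25, a quarter of the way from 14,820 to 15,000
```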

  • Measures of Variability

    Measures of location do not tell us how much the observations differ from one another.

    Measures of variability quantify the spread or dispersion of observations.

    Choosing suppliers is an example of why this is important in business. When choosing between suppliers, we might consider not only the average delivery time for each, but also the variability in delivery time for each.

  • Measures of Variability

    Range

    Interquartile Range

    Variance

    Standard Deviation

    Coefficient of Variation

  • Measures of Variability: the Range

    The range of a data set is the difference between the largest and smallest data values.

    It is the simplest measure of variability.

    It is very sensitive to the smallest and largest data values.

    The value of the range does not tell us anything about the variability of the values between the largest and smallest values.

  • Measures of Variability: the Interquartile Range

    The interquartile range of a data set is the difference between the third quartile and the first quartile.

    It is the range for the middle 50% of the data.

    It overcomes the sensitivity to extreme data values.

  • Measures of Variability: the Interquartile Range

    Example: Salaries (Book Method)

    3rd Quartile (Q3) = 14,820

    1st Quartile (Q1) = 9,600

    Interquartile Range = Q3 - Q1 = 14,820 - 9,600 = 5,220

    Using SPSS

    Interquartile Range = Q3 - Q1 = 14,865 - 9,600 = 5,265
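
    Both measures are simple differences, as the sketch below shows. The function names are hypothetical. The quartiles are the ones quoted above; the smallest and largest salaries (6,300 and 54,000) are read from the first and last rows of the salary table, and the resulting range is computed here rather than quoted from the slides.

```python
def value_range(values):
    """Largest value minus smallest value."""
    return max(values) - min(values)


def interquartile_range(q1, q3):
    """Third quartile minus first quartile."""
    return q3 - q1


print(value_range([6300, 54000]))        # 47700
print(interquartile_range(9600, 14820))  # 5220, book-method quartiles
print(interquartile_range(9600, 14865))  # 5265, SPSS third quartile
```

    The single salary of 54,000 accounts for a large share of the range but has no effect on either interquartile range.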

  • Measures of Variability: the Variance

    The variance is a measure of variability that utilizes all the data.

    It is based on the difference between the value of each observation ($x_i$) and the mean ($\bar{x}$ for a sample, $\mu$ for a population).

  • Measures of Variability: the Variance

    The variance is the average of the squared differences between each data value and the mean.

    If the data set is a sample, the variance is denoted by $s^2$.

    If the data set is a population, the variance is denoted by $\sigma^2$.
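
    In symbols, the usual textbook definitions are sketched below. Note that the sample version divides by n - 1 rather than n; n and N denote the sample and population sizes, notation assumed here rather than taken from the slides.

```latex
s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}
\quad\text{(sample variance)}
\qquad
\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}
\quad\text{(population variance)}
```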

  • Measures of Variability: the Standard Deviation

    The standard deviation of a data set is the positive square root of the variance.

    It is measured in the same units as the data, which makes it easier than the variance to compare with the mean.

    If the data set is a sample, the standard deviation is denoted by $s$.

    If the data set is a population, the standard deviation is denoted by $\sigma$ (sigma).
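
    Python's standard statistics module provides both the sample and the population versions. The eight values below are a hypothetical data set with a mean of 5, chosen so the population results come out as whole numbers.

```python
from math import isclose, sqrt
from statistics import pstdev, pvariance, stdev, variance

# Hypothetical data set (mean = 5); not taken from the salary data.
data = [2, 4, 4, 4, 5, 5, 7, 9]

pop_var = pvariance(data)   # sum of squared deviations (32) divided by N = 8
pop_sd = pstdev(data)       # square root of the population variance
samp_var = variance(data)   # same sum divided by n - 1 = 7
samp_sd = stdev(data)       # square root of the sample variance

print(pop_var, pop_sd)
print(samp_var, samp_sd)
print(isclose(samp_sd, sqrt(samp_var)))
```

    The sample figures are slightly larger than the population figures because the same sum of squared deviations is divided by a smaller number.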

  • Measures of Variability: the Coefficient of Variation

    The coefficient of variation indicates how large the standard deviation is in relation to the mean. This enables the comparison of the variability of different variables.

    If the data set is a sample, the coefficient of variation is computed as follows: $\left(\frac{s}{\bar{x}} \times 100\right)\%$

    If the data set is a population, the coefficient of variation is computed as follows: $\left(\frac{\sigma}{\mu} \times 100\right)\%$
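
    A sketch of the sample coefficient of variation, assuming the usual textbook form (sample standard deviation divided by the sample mean, times 100). It returns to the supplier example: the delivery times below are hypothetical, with both suppliers averaging 10 days.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation as a percentage of the sample mean."""
    return stdev(values) / mean(values) * 100


# Hypothetical delivery times in days; both suppliers have a mean of 10.
supplier_a = [10, 10, 11, 9, 10]
supplier_b = [5, 15, 10, 7, 13]

cv_a = coefficient_of_variation(supplier_a)
cv_b = coefficient_of_variation(supplier_b)

print(round(cv_a, 2), round(cv_b, 2))
print(cv_a < cv_b)  # supplier A is the more consistent of the two
```

    The averages alone cannot separate the two suppliers; the coefficient of variation shows that supplier B's delivery times are far more spread out relative to the mean.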

  • Example: Salary

    Variance

    Standard Deviation

    Coefficient of Variation