LECTURE CENTRAL TENDENCIES & DISPERSION POSTGRADUATE METHODOLOGY COURSE

Measures of Central Tendency

Summarize the entire data set in a single value (measurement).

Measures of central tendency include:
– Mode
– Median
– Mean
– Trimmed mean
– Skewness (strictly a measure of shape; covered here because it relates the mode, median, and mean)

Mode

The measurement that occurs most often (with the highest frequency).

Commonly used as a measure of popularity.

– There can be more than one mode.
– Not influenced by extreme measurements.
– Applicable to both qualitative and quantitative data.
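As a minimal sketch of the points above, the mode can be found with Python's standard `statistics` module. The data sets are made up for illustration; `multimode` returns every value tied for the highest frequency, so it handles the case of more than one mode.

```python
from statistics import multimode

# Hypothetical data: survey answers (qualitative) and test scores (quantitative).
colours = ["red", "blue", "blue", "green", "red"]
scores = [3, 5, 5, 7, 100]

# multimode lists all values that share the highest frequency,
# in the order they first appear in the data.
colour_modes = multimode(colours)
score_modes = multimode(scores)

print(colour_modes)  # two modes: "red" and "blue" each occur twice
print(score_modes)   # the extreme score 100 does not affect the mode
```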

Median

The middle value when the measurements are arranged from lowest to highest.

50% of the measurements lie above it and 50% fall below it.

Often used to measure the midpoint of a large set of measurements.

– There is only one median.
– Not influenced by extreme measurements.
– Applicable to data that can be ordered (quantitative or ranked), not to purely categorical data.
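The following sketch, using made-up numbers, illustrates the median with the standard `statistics.median` function. With an even number of measurements this function averages the two middle values, which is one common convention.

```python
from statistics import median

# Hypothetical measurements, chosen only for illustration.
values = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 1000]   # same data, largest value replaced by an outlier
even_count = [2, 4, 6, 8]           # no single middle value

median_plain = median(values)
median_outlier = median(with_outlier)
median_even = median(even_count)    # average of the two middle values, 4 and 6

print(median_plain, median_outlier, median_even)
```

The first two medians are identical, which shows that the outlier has no effect on the median.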

Mean

The sum of the measurements divided by the total number of measurements, better known as the average.

– There is only one mean.
– Its value is influenced by extreme measurements.
– Applicable to quantitative data only.
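A short sketch of the definition, with made-up numbers, shows how a single extreme measurement shifts the mean. The helper name `arithmetic_mean` is chosen here for illustration only.

```python
def arithmetic_mean(data):
    """Sum of the measurements divided by the number of measurements."""
    if not data:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(data) / len(data)

# Hypothetical measurements, chosen only for illustration.
values = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 1000]   # largest value replaced by an outlier

mean_plain = arithmetic_mean(values)          # 30 / 5
mean_outlier = arithmetic_mean(with_outlier)  # 1020 / 5

print(mean_plain, mean_outlier)
```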

Trimmed Mean

The mean is influenced by extreme values (outliers).

To reduce the effect of outliers, which distort the mean value, a variation of the mean is introduced.

The trimmed mean drops the highest and lowest extreme values and averages the rest.
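One possible sketch of a trimmed mean is shown below. The lecture does not say how many extreme values to drop, so the parameter `k` (the number of values removed from each end) and the function name are assumptions made for illustration.

```python
from statistics import mean

def trimmed_mean(data, k=1):
    """Drop the k smallest and k largest values, then average the rest.

    k is an illustrative assumption; the source does not fix how much to trim.
    """
    if k < 0 or 2 * k >= len(data):
        raise ValueError("k must leave at least one value to average")
    ordered = sorted(data)
    return mean(ordered[k:len(ordered) - k])

# Hypothetical measurements with one outlier.
values = [2, 4, 6, 8, 1000]

plain = mean(values)            # pulled upwards by the outlier
trimmed = trimmed_mean(values)  # averages 4, 6 and 8 only

print(plain, trimmed)
```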

Skewness

The relationship between the mode, median, mean, and trimmed mean is reflected in the skewness of the data.

Skewness measures how asymmetrically the data are distributed.

Zero skewness – symmetrical (Mode = Median = Mean)

Positive skewness – skewed to the right (typically Mode < Median < Mean)

Negative skewness – skewed to the left (typically Mode > Median > Mean)
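The lecture does not give a formula for skewness. As an illustrative assumption, the sketch below uses the moment coefficient of skewness (third central moment divided by the second central moment raised to the power 1.5), which is one common definition, and checks its sign on made-up data.

```python
from statistics import mean, median

def skewness(data):
    """Moment coefficient of skewness, m3 / m2**1.5.

    This formula is an assumption for illustration; the source names no formula.
    """
    n = len(data)
    centre = mean(data)
    m2 = sum((x - centre) ** 2 for x in data) / n  # second central moment
    m3 = sum((x - centre) ** 3 for x in data) / n  # third central moment
    if m2 == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return m3 / m2 ** 1.5

# Hypothetical data sets.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 2, 3, 3, 3, 20]          # long tail to the right
left_skewed = [-x for x in right_skewed]       # mirror image, long tail to the left

print(skewness(symmetric))     # zero for symmetric data
print(skewness(right_skewed))  # positive
print(skewness(left_skewed))   # negative
print(mean(right_skewed) > median(right_skewed))  # the mean is pulled towards the tail
```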

[Figure: three charts illustrating a negatively (left) skewed, a symmetrical, and a positively (right) skewed distribution.]

Measures of Variability / Dispersion

It is not sufficient to describe a data set using only measures of central tendency.

We also need to determine how dispersed (spread out) the data are.

Measures of variability/spread include:
– Range
– Percentile / Quartile
– Deviation / Standard deviation (Malay: sisihan piawai)
– Variance
– Coefficient of variation

Range

The difference between the largest and the smallest measurement of the set.

It is easy to compute but very sensitive to outliers.

It does not give much information about the pattern of variability.
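A minimal sketch with made-up numbers shows both the computation and the sensitivity to outliers. The function name `value_range` is chosen here for illustration.

```python
def value_range(data):
    """Difference between the largest and the smallest measurement."""
    return max(data) - min(data)

# Hypothetical measurements.
values = [2, 4, 6, 8, 10]
with_outlier = [2, 4, 6, 8, 1000]   # a single outlier replaces the largest value

range_plain = value_range(values)
range_outlier = value_range(with_outlier)

print(range_plain, range_outlier)
```

One outlier changes the range from 8 to 998, although the other four measurements are unchanged.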

Percentile / Quartile

The pth percentile of a set of n measurements arranged in order of magnitude is the value that has at most p% of the measurements below it and at most (100 – p)% above it.

Example: the 60th percentile has 60% of the data below it and 40% above it.

Percentiles of interest are the 25th, 50th, and 75th percentiles, often called the lower quartile, median, and upper quartile.

Interquartile range – difference between the upper and lower quartiles.
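The sketch below computes quartiles, the interquartile range, and the 60th percentile with `statistics.quantiles` from the Python standard library. Several interpolation conventions exist for percentiles; this uses the function's default ("exclusive") method, so other software may return slightly different values. The data are made up.

```python
from statistics import quantiles

# Hypothetical measurements, already in order of magnitude.
data = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]

# n=4 splits the data into four equal parts and returns the three cut points.
q1, q2, q3 = quantiles(data, n=4)
iqr = q3 - q1   # interquartile range

# n=10 returns the nine deciles; index 5 is the 6th decile, i.e. the 60th percentile.
p60 = quantiles(data, n=10)[5]

print(q1, q2, q3)  # lower quartile, median, upper quartile
print(iqr)
print(p60)
```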

Variance and Standard Deviation

The variance of a set of n measurements y_1, y_2, …, y_n with mean ȳ is the sum of the squared deviations from the mean divided by n – 1: s² = Σ (y_i – ȳ)² / (n – 1).

The standard deviation of a set of measurements is defined to be the positive square root of the variance.

Both measure how spread out the data are from the mean.
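As a worked sketch of these definitions, the code below computes the variance by hand (dividing by n – 1) and compares it with `statistics.variance` and `statistics.stdev`, which use the same n – 1 divisor. The measurements are made up.

```python
from math import isclose, sqrt
from statistics import mean, stdev, variance

# Hypothetical measurements y_1, ..., y_n.
y = [2, 4, 4, 4, 5, 5, 7, 9]

y_bar = mean(y)                                       # 40 / 8
squared_deviations = [(v - y_bar) ** 2 for v in y]    # 9, 1, 1, 1, 0, 0, 4, 16
manual_variance = sum(squared_deviations) / (len(y) - 1)   # 32 / 7
manual_sd = sqrt(manual_variance)                     # positive square root

print(manual_variance, variance(y))
print(manual_sd, stdev(y))
```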

Coefficient of Variation

Measures the variability of the values in a population relative to the magnitude of the population mean.

CV = Standard Deviation / |Mean|

The CV is a unit-free number; it is useful when comparing the variation of different sets of data.
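The following sketch illustrates why the CV is unit-free: the same made-up heights expressed in centimetres and in metres give the same CV. It uses the sample standard deviation (n – 1 divisor) as an assumption; the function name is also chosen for illustration.

```python
from math import isclose
from statistics import mean, stdev

def coefficient_of_variation(data):
    """Standard deviation divided by the absolute value of the mean.

    Uses the sample standard deviation (n - 1 divisor) as an assumption.
    """
    centre = mean(data)
    if centre == 0:
        raise ValueError("the CV is undefined when the mean is zero")
    return stdev(data) / abs(centre)

# Hypothetical heights, once in centimetres and once in metres.
heights_cm = [150, 160, 170, 180, 190]
heights_m = [h / 100 for h in heights_cm]

cv_cm = coefficient_of_variation(heights_cm)
cv_m = coefficient_of_variation(heights_m)

print(cv_cm, cv_m)  # equal up to floating-point rounding
```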

Boxplot

Top line – Maximum

2nd line – Upper quartile

3rd line – Median

4th line – Lower quartile

5th line – Minimum
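The five lines of the boxplot correspond to what is often called a five-number summary. A minimal sketch, assuming the default quartile convention of `statistics.quantiles` and made-up scores, is:

```python
from statistics import quantiles

def five_number_summary(data):
    """Return (minimum, lower quartile, median, upper quartile, maximum)."""
    q1, med, q3 = quantiles(data, n=4)   # default "exclusive" method, an assumption
    return min(data), q1, med, q3, max(data)

# Hypothetical scores on a 0 to 8 scale.
scores = [0, 2, 3, 4, 5, 6, 8]

summary = five_number_summary(scores)
labels = ["Minimum", "Lower quartile", "Median", "Upper quartile", "Maximum"]

# Print from the top line of the boxplot down to the bottom line.
for label, value in reversed(list(zip(labels, summary))):
    print(f"{label}: {value}")
```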

[Figure: boxplot of Item 5 NF (N = 312); vertical axis from 0 to 8.]