Module 4: Measures of Central Tendency and Dispersion


Page 2: Module4

Measures of Central Tendency and Dispersion

Measures of Central Tendency

• Mean: Arithmetic, Geometric, Harmonic, Weighted Mean

• Median, Quartiles, Percentiles, Deciles

• Mode

Measures of Variation

Page 3: Module4

• Range

• Mean Deviation

• Standard Deviation (Variance)

• Interquartile Range

• Coefficient of Variation

Measures of Skewness and Kurtosis

Standardised Variables and Scores

Page 4: Module4

Measures of Location or Central Tendency

A measure of location can be thought of as the centre of gravity of the data.

There are three such measures:

• Mean

• Median, Quartiles, Percentiles and Deciles

• Mode

Page 5: Module4

Properties of a Measure

• It should be easy to understand and calculate.

• It should be based on all observations.

• It should not be much affected by a few extreme observations.

• It should be amenable to mathematical treatment. For example, we should be able to calculate the combined measure for two sets of observations given the measure for each of the two sets.

Page 6: Module4

Mean

There are three types of mean, viz.:

• Arithmetic Mean

• Harmonic Mean

• Geometric Mean

Page 7: Module4

Arithmetic Mean: Ungrouped (Raw) Data

$$\bar{x} = \frac{\text{Sum of Observations}}{\text{Number of Observations}} = \frac{\sum x_i}{n}$$

Page 8: Module4

Illustration 4.1

Table 4.1: Equity Holdings of 20 Indian Billionaires (Rs. in Millions)

2717    2796    3098    3144    3527
3534    3862    4186    4310    4506
4745    4784    4923    5034    5071
5424    5561    6505    6707    6874

Page 9: Module4

Illustration 4.1

For the above data, the A.M. is

$$\bar{x} = \frac{2717 + 2796 + \dots + 4745 + \dots + 5424 + \dots + 6874}{20} = \text{Rs. } 4565.4 \text{ Millions}$$
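The calculation above can be checked with a short Python sketch. The variable names are illustrative only; the data are the 20 values of Table 4.1.

```python
from statistics import mean

# Equity holdings of the 20 billionaires from Table 4.1 (Rs. in Millions)
holdings = [
    2717, 2796, 3098, 3144, 3527,
    3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071,
    5424, 5561, 6505, 6707, 6874,
]

# A.M. = sum of observations / number of observations
arithmetic_mean = sum(holdings) / len(holdings)

# The standard-library helper should agree with the manual formula
assert abs(arithmetic_mean - mean(holdings)) < 1e-9

print(f"A.M. = Rs. {arithmetic_mean:.1f} Millions")
```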

Page 10: Module4

Arithmetic Mean: Grouped Data

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i}$$

Page 11: Module4

Illustration 4.2

The calculation is illustrated with the data relating to equity holdings of the group of 20 billionaires given in Table 4.1.

Class Interval    Frequency (fi)    Mid Value of Class Interval (xi)    fixi
(1)               (2)               (3)                                 (4) = (2) x (3)
2000 – 3000       2                 2500                                5000
3000 – 4000       5                 3500                                17500
4000 – 5000       6                 4500                                27000
5000 – 6000       4                 5500                                22000
6000 – 7000       3                 6500                                19500
Sum               Σfi = 20                                              Σfixi = 91000

Page 12: Module4

Illustration 4.2

Substituting the values of Σfi and Σfixi in the formula, we get

$$\bar{x} = \frac{\sum f_i x_i}{\sum f_i} = \frac{91000}{20} = 4550$$
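A minimal sketch of the grouped-data calculation, assuming each class is stored as a hypothetical `(lower, upper, frequency)` tuple and represented by its mid value:

```python
# Grouped equity holding data: (lower limit, upper limit, frequency)
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]

def grouped_mean(classes):
    """Arithmetic mean of grouped data: sum(f_i * x_i) / sum(f_i)."""
    total_f = sum(f for _, _, f in classes)
    # x_i is the mid value of each class interval
    total_fx = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fx / total_f

print(grouped_mean(classes))
```

The grouped mean (4550) differs slightly from the raw-data mean (4565.4) because every observation in a class is replaced by the class mid value.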

Page 13: Module4

Weighted Arithmetic Mean

If the values x1, x2, x3, …, xi, …, xn have weights w1, w2, w3, …, wi, …, wn, then the weighted mean of x is given as

$$\bar{x} = \frac{\sum w_i x_i}{\sum w_i}$$

Page 14: Module4

Illustration 4.3

Item     Monthly Consumption    Weight (wi)    Rise in Price, Percentage (pi)    wipi
Sugar    5                      5              20                                100
Rice     20                     20             10                                200

Page 15: Module4

Illustration 4.3

Therefore, the average price rise could be evaluated as

$$\bar{p} = \frac{\sum w_i p_i}{\sum w_i} = \frac{100 + 200}{5 + 20} = \frac{300}{25} = 12$$

Thus the average price rise is 12%.
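The weighted mean of Illustration 4.3 can be sketched as follows; the function name `weighted_mean` is an illustrative choice, not something the slides define.

```python
def weighted_mean(values, weights):
    """Weighted arithmetic mean: sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

# Sugar and rice: monthly consumption as weights, price rise in percent
price_rise = [20, 10]
consumption = [5, 20]

print(weighted_mean(price_rise, consumption))
```

An unweighted average of the two price rises would give 15%; the weights pull the result towards rice, which dominates consumption.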

Page 16: Module4

Geometric Mean

The Geometric Mean (G.M.) of a series of observations x1, x2, x3, …, xn is defined as the nth root of the product of these values. Mathematically,

$$\text{G.M.} = (x_1 \, x_2 \, x_3 \cdots x_n)^{1/n}$$

It may be noted that the G.M. cannot be defined if any value of x is zero, as the whole product of the values becomes zero.

Page 17: Module4

Illustration 4.5

For the data with values 2, 4 and 8,

$$\text{G.M.} = (2 \times 4 \times 8)^{1/3} = (64)^{1/3} = 4$$
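A small sketch of the same calculation, assuming a hypothetical helper `geometric_mean` that works through logarithms so that long products do not overflow. Rejecting negative values as well as zero is an assumption of this sketch.

```python
import math

def geometric_mean(values):
    """nth root of the product of the values, computed via logarithms."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    return math.exp(sum(math.log(v) for v in values) / len(values))

gm = geometric_mean([2, 4, 8])
print(round(gm, 6))
```

Python 3.8 and later also ship `statistics.geometric_mean`, which can be used instead of the hand-written helper.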

Page 18: Module4

Average Rate of Growth of Production/Business or Increase in Prices

If P1 is the production in the first year and Pn is the production in the nth year, then the average rate of growth is given by (G – 100)%, where

$$G = 100 \left( \frac{P_n}{P_1} \right)^{1/(n-1)}$$

or

$$\log G = \log 100 + \frac{1}{n-1} \left( \log P_n - \log P_1 \right)$$

Page 19: Module4

Example 4.4

The wholesale price index in the year 2000-01 was 145.3. It increased to 195.5 in the year 2005-06. What has been the average rate of increase in the index during the last 5 years?

Solution: By using the formula (4.8), we have

$$\log G = 2 + \frac{1}{5} \left( \log 195.5 - \log 145.3 \right) = 2.02578$$

Therefore,

G = antilog (2.02578) = 106.11

Thus the average rate of increase = 106.11 – 100 = 6.11%.
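Example 4.4 can be reproduced without log tables. The sketch below assumes a hypothetical function `average_growth_rate` whose `periods` argument corresponds to n – 1 in the formula.

```python
def average_growth_rate(first, last, periods):
    """Average growth per period in percent: G - 100,
    where G = 100 * (last / first) ** (1 / periods)."""
    g = 100 * (last / first) ** (1 / periods)
    return g - 100

# Wholesale price index: 145.3 in 2000-01, 195.5 in 2005-06 (5 periods)
rate = average_growth_rate(145.3, 195.5, periods=5)
print(f"{rate:.2f}%")
```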

Page 20: Module4

Combined G.M. of Two Sets of Data

If G1 and G2 are the geometric means of two sets of data with n1 and n2 observations respectively, then the combined geometric mean, say G, of the combined data is given by:

$$\log G = \frac{n_1 \log G_1 + n_2 \log G_2}{n_1 + n_2}$$

Page 21: Module4

Combined G.M. of Two Sets of Data

As another example, suppose the average growth rate during the first five years of business is 20%, and the average growth rate of business during the next five years is 15%, and we wish to find the average growth rate for the entire period of 10 years. This growth rate can be found by calculating the combined geometric mean of the geometric means 120 and 115 for the two blocks of 5-year periods. Thus, the requisite G.M., say G, can be worked out as follows:

Page 22: Module4

Combined G.M. of Two Sets of Data

$$\log G = \frac{5 \log 120 + 5 \log 115}{5 + 5} = \frac{5 \times 2.07918 + 5 \times 2.06070}{10} = \frac{20.6994}{10} = 2.06994$$

Therefore,

G = antilog 2.06994 = 117.47

Thus the combined average rate of growth for the period of 10 years is 17.47%.
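The combined geometric mean can be sketched directly from the logarithmic formula. The function name `combined_gm` and its argument order are assumptions made for illustration.

```python
import math

def combined_gm(g1, n1, g2, n2):
    """Combined G.M. of two sets:
    log G = (n1 * log G1 + n2 * log G2) / (n1 + n2)."""
    log_g = (n1 * math.log10(g1) + n2 * math.log10(g2)) / (n1 + n2)
    return 10 ** log_g

# Two 5-year blocks with growth indices 120 and 115
g = combined_gm(120, 5, 115, 5)
print(f"G = {g:.2f}, average growth = {g - 100:.2f}%")
```

Because both blocks have equal length here, the result is simply the square root of 120 × 115.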

Page 23: Module4

Weighted Geometric Mean

Just like the weighted arithmetic mean, we also have the weighted geometric mean.

If x1, x2, …, xi, …, xn are n observations with weights w1, w2, …, wi, …, wn, then their G.M. is defined by:

$$\log \text{G.M.} = \frac{\sum w_i \log x_i}{\sum w_i}$$

Page 24: Module4

Harmonic Mean

The harmonic mean (H.M.) is defined as the reciprocal of the arithmetic mean of the reciprocals of the observations.

For example, if x1 and x2 are two observations, then the arithmetic mean of their reciprocals, viz. 1/x1 and 1/x2, is

$$\frac{(1/x_1) + (1/x_2)}{2} = \frac{x_2 + x_1}{2 x_1 x_2}$$

The reciprocal of this arithmetic mean is called the harmonic mean. Thus the harmonic mean of two observations x1 and x2 is

$$\text{H.M.} = \frac{2 x_1 x_2}{x_1 + x_2}$$
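The definition extends to any number of observations. The sketch below uses a hypothetical `harmonic_mean` helper, and the values 2 and 8 are made-up inputs chosen for illustration; they do not come from the slides.

```python
def harmonic_mean(values):
    """Reciprocal of the arithmetic mean of the reciprocals."""
    if any(v <= 0 for v in values):
        raise ValueError("this sketch assumes strictly positive values")
    return len(values) / sum(1 / v for v in values)

x1, x2 = 2, 8  # illustrative values only
hm = harmonic_mean([x1, x2])

# For two observations this reduces to 2 * x1 * x2 / (x1 + x2)
two_value_form = 2 * x1 * x2 / (x1 + x2)
print(hm, two_value_form)
```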

Page 25: Module4

Relationship Among A.M., G.M. and H.M.

The relationships among the magnitudes of the three types of means calculated from the same data are as follows:

(i) H.M. ≤ G.M. ≤ A.M., i.e. the arithmetic mean is greater than or equal to the geometric mean, which is greater than or equal to the harmonic mean.

(ii) $\text{G.M.} = \sqrt{\text{A.M.} \times \text{H.M.}}$, i.e. the geometric mean is the square root of the product of the arithmetic mean and the harmonic mean.

(iii) H.M. = (G.M.)² / A.M.

Relations (ii) and (iii) are exact for two positive observations; they need not hold for larger data sets.
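These relationships can be checked numerically with the standard-library `statistics` module (Python 3.8+). The data values are illustrative assumptions. The second data set shows that the ordering of the three means still holds for three observations, while the square-root identity, which is exact for two positive observations, no longer does.

```python
import math
from statistics import mean, geometric_mean, harmonic_mean

def three_means(values):
    """Return (A.M., G.M., H.M.) for a list of positive values."""
    return mean(values), geometric_mean(values), harmonic_mean(values)

# Two observations: ordering holds and G.M.^2 = A.M. * H.M.
am, gm, hm = three_means([2, 8])
pair_identity = math.isclose(gm ** 2, am * hm)

# Three observations: ordering holds, the identity does not
am3, gm3, hm3 = three_means([1, 2, 9])
triple_identity = math.isclose(gm3 ** 2, am3 * hm3)

print(pair_identity, triple_identity)
```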

Page 26: Module4

Median

Whenever there are some extreme values in the data, calculation of the A.M. is not desirable.

Further, whenever exact values of some observations are not available, the A.M. cannot be calculated.

In both situations, another measure of location called the Median is used.

Page 27: Module4

Median - Ungrouped Data

First the data is arranged in ascending or descending order. In the earlier example relating to the equity holdings data of 20 billionaires given in Table 4.1, the data arranged in ascending order is as follows:

2717    2796    3098    3144    3527
3534    3862    4186    4310    4506
4745    4784    4923    5034    5071
5424    5561    6505    6707    6874

Here, the number of observations is 20, and therefore there is no single middle observation. The two middle-most observations are the 10th and the 11th, with values 4506 and 4745. Therefore, the median is their average:

$$\text{Median} = \frac{4506 + 4745}{2} = \frac{9251}{2} = 4625.5$$

Thus, the median equity holding of the 20 billionaires is Rs. 4625.5 Millions.
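A minimal sketch of the ungrouped median, handling both odd and even numbers of observations. The helper name `median_ungrouped` is an assumption made for illustration.

```python
def median_ungrouped(values):
    """Middle value of the sorted data, or the average of the
    two middle values when the number of observations is even."""
    ordered = sorted(values)
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

holdings = [
    2717, 2796, 3098, 3144, 3527, 3534, 3862, 4186, 4310, 4506,
    4745, 4784, 4923, 5034, 5071, 5424, 5561, 6505, 6707, 6874,
]
print(median_ungrouped(holdings))
```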

Page 28: Module4

Median - Grouped Data

The median for grouped data is defined as the value corresponding to the (n/2)th observation, and is calculated from the following formula:

$$\text{Median} = L_m + \frac{(n/2) - f_c}{f_m} \times w_m$$

where,

• Lm is the lower limit of the median class interval, i.e. the interval which contains the (n/2)th observation

• fm is the frequency of the median class interval

• fc is the cumulative frequency of the classes preceding the median class interval

• wm is the width of the median class interval

• n is the total number of observations.

Page 29: Module4

Illustration 4.2

Class Interval    Frequency    Cumulative Frequency
2000-3000         2            2
3000-4000         5            7
4000-5000         6            13
5000-6000         4            17
6000-7000         3            20

Page 30: Module4

Illustration 4.2

Here, n = 20, and the median class interval is from 4000 to 5000, as the 10th observation lies in this interval. Further,

Lm = 4000

fm = 6

fc = 7

wm = 1000

Therefore,

$$\text{Median} = 4000 + \frac{(20/2) - 7}{6} \times 1000 = 4000 + \frac{3}{6} \times 1000 = 4000 + 500 = 4500$$
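The grouped-median formula can be sketched as below, assuming each class is a hypothetical `(lower, upper, frequency)` tuple and that classes are listed in ascending order.

```python
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]

def grouped_median(classes):
    """Median = L_m + ((n/2) - f_c) / f_m * w_m."""
    n = sum(f for _, _, f in classes)
    target = n / 2
    cumulative = 0  # f_c: frequency of all classes before the current one
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            # This is the median class: it contains the (n/2)th observation
            return lower + (target - cumulative) * (upper - lower) / freq
        cumulative += freq
    raise ValueError("no classes supplied")

print(grouped_median(classes))
```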

Page 31: Module4

Median

The median divides the data into two parts such that the number of observations less than the median is equal to the number of observations greater than it.

This property makes the median a very useful measure when the data is skewed, as with income distribution among persons or households, or marks obtained in competitive examinations such as those for admission to Engineering or Medical Colleges.

Page 32: Module4

Graphical Method of Finding the Median

If we draw both the ogives, viz. "Less Than" and "More Than", for a set of data, then the point of intersection of the two ogives gives the Median.

[Figure: "Less Than" and "More Than" ogives, with their point of intersection marked as the Median]

Page 33: Module4

Quartiles

The median divides the data into two parts such that 50% of the observations are less than it and 50% are more than it. Similarly, there are "Quartiles". There are three quartiles, viz. Q1, Q2 and Q3. These are referred to as the first, second and third quartiles.

The first quartile, Q1, divides the data into two parts such that 25% (a quarter) of the observations are less than it and 75% are more than it.

The second quartile, Q2, is the same as the median.

The third quartile, Q3, divides the data into two parts such that 75% of the observations are less than it and 25% are more than it.

All of these can be determined graphically with the help of the ogive curve.

Page 34: Module4

Quartiles

[Figure: Ogive curve (less than type); bins from 2000 to 7000, with Frequency and Cumulative % from 0% to 100%]

Page 35: Module4

Quartiles

For ungrouped and grouped data, Q1 and Q3 are defined as the values corresponding to the observations given below:

                      Ungrouped Data (arranged in          Grouped Data
                      ascending or descending order)
Lower Quartile Q1     {(n + 1)/4}th                        (n/4)th
Median Q2             {(n + 1)/2}th                        (n/2)th
Upper Quartile Q3     {3(n + 1)/4}th                       (3n/4)th

Page 36: Module4

Quartiles

$$Q_1 = L_{Q_1} + \frac{(n/4) - f_c}{f_{Q_1}} \times w_{Q_1}$$

$$Q_3 = L_{Q_3} + \frac{(3n/4) - f_c}{f_{Q_3}} \times w_{Q_3}$$

Page 37: Module4

Equity Holding Data

Class Interval    Frequency    Cumulative Frequency
2000-3000         2            2
3000-4000         5            7
4000-5000         6            13
5000-6000         4            17
6000-7000         3            20

Page 38: Module4

$$Q_1 = 3000 + \frac{(20/4) - 2}{5} \times 1000 = 3000 + \frac{5 - 2}{5} \times 1000 = 3000 + \frac{3000}{5} = 3000 + 600 = 3600$$

The interpretation of this value of Q1 is that 25% of the billionaires have equity holdings less than Rs. 3600 Millions.

Page 39: Module4

$$Q_3 = 5000 + \frac{15 - 13}{4} \times 1000 = 5000 + \frac{2}{4} \times 1000 = 5500$$

The interpretation of this value of Q3 is that 75% of the billionaires have equity holdings less than Rs. 5500 Millions.
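Q1 and Q3 follow the same interpolation pattern as the grouped median, so one function can cover all of them. The sketch below assumes a hypothetical `grouped_quantile(classes, p)` where `p` is the fraction of observations below the desired value (0.25 for Q1, 0.75 for Q3).

```python
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]

def grouped_quantile(classes, p):
    """Value below which a fraction p of the observations lie:
    L + (p * n - f_c) / f * w for the class containing the (p * n)th observation."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    n = sum(f for _, _, f in classes)
    target = p * n
    cumulative = 0  # f_c: frequency of all classes before the current one
    for lower, upper, freq in classes:
        if cumulative + freq >= target:
            return lower + (target - cumulative) * (upper - lower) / freq
        cumulative += freq
    raise ValueError("no classes supplied")

q1 = grouped_quantile(classes, 0.25)
q3 = grouped_quantile(classes, 0.75)
print(q1, q3)
```

The same call with `p=0.5` gives the grouped median, and other fractions give percentiles and deciles.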

Page 40: Module4

Percentiles

$$P_{95} = L_{P_{95}} + \frac{(95/100)\, n - f_c}{f_{P_{95}}} \times w_{P_{95}}$$

where L P95 is the lower limit of the class interval containing the observation at 95 percent of the total frequency, fc is the cumulative frequency of the classes preceding the 95th percentile interval, f P95 is the frequency of the 95th percentile interval, and wP95 is the width of the 95th percentile interval.

Page 41: Module4

Deciles

Just as quartiles divide the data into four parts, the deciles divide the data into ten parts: the first decile (10%), the second (20%), and so on. In fact, P10, P20, …, P90 are the same as the deciles.

And just as the second quartile and the median are the same, so the fifth decile, i.e. P50, and the median are the same.

Page 42: Module4

Mode

$$\text{Mode} = L_m + \frac{f_m - f_0}{2 f_m - f_0 - f_2} \times w_m$$

where,

Lm is the lower limit of the modal class interval

fm is the frequency of the modal class interval

f0 is the frequency of the interval just before the modal interval

f2 is the frequency of the interval just after the modal interval

wm is the width of the modal class interval

Page 43: Module4

Equity Holding Data

The modal interval, i.e. the class interval with the maximum frequency (6), is 4000 to 5000. Further,

Lm = 4000

wm = 1000

fm = 6

f0 = 5

f2 = 4

Therefore,

Page 44: Module4

Equity Holding Data

$$\text{Mode} = 4000 + \frac{6 - 5}{2 \times 6 - 5 - 4} \times 1000 = 4000 + \frac{1}{3} \times 1000 = 4000 + 333.3 = 4333.3$$

Thus the modal equity holding of the billionaires is Rs. 4333.3 Millions.
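A sketch of the grouped-mode formula, again assuming hypothetical `(lower, upper, frequency)` tuples. Treating a missing neighbouring class as frequency 0 and picking the first class in case of ties are assumptions of this sketch, not statements from the slides.

```python
classes = [
    (2000, 3000, 2),
    (3000, 4000, 5),
    (4000, 5000, 6),
    (5000, 6000, 4),
    (6000, 7000, 3),
]

def grouped_mode(classes):
    """Mode = L_m + (f_m - f_0) / (2 * f_m - f_0 - f_2) * w_m."""
    freqs = [f for _, _, f in classes]
    i = freqs.index(max(freqs))  # modal class: first class with the largest frequency
    lower, upper, f_m = classes[i]
    f_0 = freqs[i - 1] if i > 0 else 0               # class just before
    f_2 = freqs[i + 1] if i < len(freqs) - 1 else 0  # class just after
    denominator = 2 * f_m - f_0 - f_2
    if denominator == 0:
        raise ValueError("mode is not defined by this formula for flat data")
    return lower + (f_m - f_0) * (upper - lower) / denominator

print(round(grouped_mode(classes), 1))
```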

Page 45: Module4

Empirical Relationship among Mean, Median and Mode

In a moderately skewed distribution, it is found that the following relationship generally holds good:

Mean – Mode = 3 (Mean – Median)

From the above relationship among the Mean, Median and Mode, if the values of two of these are given, the value of the third measure can be found.
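Rearranging the relationship gives Mode = 3 Median – 2 Mean. The sketch below applies it to the mean (4565.4) and grouped median (4500) of the equity holding data; the function name is illustrative, and the result is only an approximation since the relationship is empirical.

```python
def mode_from_mean_and_median(mean_value, median_value):
    """Empirical estimate: Mean - Mode = 3 * (Mean - Median),
    hence Mode = 3 * Median - 2 * Mean."""
    return 3 * median_value - 2 * mean_value

estimated_mode = mode_from_mean_and_median(4565.4, 4500)
print(round(estimated_mode, 1))
```

The estimate of about 4369 is close to, but not the same as, the value 4333.3 obtained from the grouped-mode formula.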

Page 46: Module4

Equity Holding Data

[Figure: distribution of the equity holding data, marking Mode = 4333, Median = 4500 and Mean = 4565]

Page 47: Module4

Right Skewed Distribution

[Figure: right-skewed curve showing, from left to right, Mode, Median, Mean]

Page 48: Module4

Symmetrical Distribution

[Figure: symmetrical curve with Mode, Median and Mean marked]

Page 49: Module4

Left Skewed Distribution

[Figure: left-skewed curve showing, from left to right, Mean, Median, Mode]

Page 50: Module4

Features of a Good Statistical Average

• It should be readily computable and easily understood.

• It should be based on all the observations.

• It should be reliable enough to be taken as a true representative of the population.

• It should not be much affected by the extreme values in the data.

• It should be amenable to further mathematical treatment. This property helps in assessing the reliability of conclusions drawn about the population value with the help of the sample value.

• It should not vary much from sample to sample taken from the same population.

Page 51: Module4

Comparison of Measures of Location

Arithmetic Mean

Advantages

(i) Easy to understand and calculate

(ii) Makes use of full data

(iii) Only the number and sum of the observations need be known for its calculation

Disadvantages

(i) Unduly influenced by extreme values

(ii) Cannot be calculated from grouped data with open-end class intervals, or from ungrouped data when the values of all observations are not available and all that is known is that some observations are either less than or greater than some value

Page 52: Module4

Geometric Mean

Advantages

(i) Makes use of full data

(ii) Extreme large values have less impact

(iii) Useful for data relating to ratios and percentages

(iv) Useful for rates of change or growth

Disadvantages

(i) Cannot be calculated if any observation has the value zero

(ii) Difficult to calculate and interpret

Page 53: Module4

Median

Advantages

(i) Simple to understand

(ii) Extreme values do not have any impact

(iii) Can be calculated even if the values of all observations are not known or the data has open-end class intervals

(iv) Used for measuring qualities and factors which are not quantifiable

(v) Can be approximately determined with the help of a graph (ogives)

Disadvantages

(i) Arranging values in ascending or descending order may sometimes be tedious

(ii) The sum of the observations cannot be found if only the Median is known

(iii) Not amenable to mathematical calculations