WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam WFM 5201: Data Management and Statistical Analysis Akm Saiful Islam Lecture-1: Descriptive Statistics [Measures of central tendency] April, 2008 Institute of Water and Flood Management (IWFM) Bangladesh University of Engineering and Technology (BUET)


WFM 5201: Data Management and Statistical Analysis © Dr. Akm Saiful Islam

WFM 5201: Data Management and Statistical Analysis

Akm Saiful Islam

Lecture-1: Descriptive Statistics [Measures of central tendency]

April, 2008

Institute of Water and Flood Management (IWFM)
Bangladesh University of Engineering and Technology (BUET)


Descriptive Statistics

Measures of Central Tendency
Measures of Location
Measures of Dispersion
Measures of Symmetry
Measures of Peakedness


Measures of Central Tendency

The central tendency is measured by averages. These describe the point about which the various observed values cluster.

In mathematics, an average, or central tendency, of a data set is a measure of the "middle" or "expected" value of the data set.


Measures of Central Tendency

Arithmetic Mean
Geometric Mean
Weighted Mean
Harmonic Mean
Median
Mode


Arithmetic Mean

The arithmetic mean is the sum of a set of observations, positive, negative or zero, divided by the number of observations. If we have “n” real numbers $x_1, x_2, x_3, \ldots, x_n$, their arithmetic mean, denoted by $\bar{x}$, can be expressed as:

$$\bar{x} = \frac{x_1 + x_2 + x_3 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$


Arithmetic Mean of Grouped Data

If $z_1, z_2, z_3, \ldots, z_k$ are the mid-values and $f_1, f_2, f_3, \ldots, f_k$ are the corresponding frequencies, where the subscript ‘k’ stands for the number of classes, then the mean is

$$\bar{z} = \frac{\sum_{i=1}^{k} f_i z_i}{\sum_{i=1}^{k} f_i}$$


Geometric Mean

The geometric mean is defined as the positive n-th root of the product of the n observations. Symbolically,

$$G = (x_1 x_2 x_3 \cdots x_n)^{1/n}$$

It is often used for a set of numbers whose values are meant to be multiplied together or are exponential in nature, such as data on the growth of the human population or interest rates of a financial investment.

Find the geometric mean of the rates of growth: 34, 27, 45, 55, 22, 34
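The growth-rate exercise above can be checked with a short sketch. It assumes the six values are plain positive numbers and averages their logarithms instead of multiplying them directly; the function name `geometric_mean` is chosen here for illustration only.

```python
import math

def geometric_mean(values):
    """Return the n-th root of the product of strictly positive values."""
    if any(v <= 0 for v in values):
        raise ValueError("geometric mean needs strictly positive values")
    # Averaging the logarithms is equivalent to taking the n-th root of
    # the product, and avoids very large intermediate products.
    return math.exp(sum(math.log(v) for v in values) / len(values))

growth_rates = [34, 27, 45, 55, 22, 34]
g = geometric_mean(growth_rates)
print(round(g, 1))  # roughly 34.5
```

The result lies below the arithmetic mean of the same values (about 36.2), as expected for positive data that are not all equal.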


Geometric Mean of Grouped Data

If the “n” non-zero and positive variate-values $x_1, x_2, \ldots, x_n$ occur $f_1, f_2, \ldots, f_n$ times, respectively, then the geometric mean of the set of observations is defined by:

$$G = \left[ x_1^{f_1} x_2^{f_2} \cdots x_n^{f_n} \right]^{1/N} = \left[ \prod_{i=1}^{n} x_i^{f_i} \right]^{1/N}$$

Where

$$N = \sum_{i=1}^{n} f_i$$


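Geometric Mean (Revised Eqn.)

Ungrouped Data

$$G = \sqrt[n]{x_1 x_2 x_3 \cdots x_n}$$

$$G = \mathrm{AntiLog}\left( \frac{1}{n} \sum_{i=1}^{n} \mathrm{Log}\, x_i \right)$$

Grouped Data

$$G = \sqrt[N]{x_1^{f_1} x_2^{f_2} x_3^{f_3} \cdots x_n^{f_n}}$$

$$G = \mathrm{AntiLog}\left( \frac{1}{N} \sum_{i=1}^{n} f_i \,\mathrm{Log}\, x_i \right)$$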


Harmonic Mean

Harmonic mean (formerly sometimes called the subcontrary mean) is one of several kinds of average.

Typically, it is appropriate when an average of rates is desired. The harmonic mean is the number of observations divided by the sum of the reciprocals of the observations. It is useful for ratios such as speed (= distance/time).
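As a small illustration of why the harmonic mean suits rates, the sketch below averages two speeds over equal distances. The trip (60 km/h out, 40 km/h back) is a made-up example, not data from the lecture, and `harmonic_mean` is a name chosen here.

```python
def harmonic_mean(values):
    """Number of observations divided by the sum of their reciprocals."""
    if any(v <= 0 for v in values):
        raise ValueError("harmonic mean needs strictly positive values")
    return len(values) / sum(1 / v for v in values)

# Hypothetical trip: the same distance is covered at 60 km/h and at 40 km/h.
speeds = [60, 40]
avg_speed = harmonic_mean(speeds)
print(round(avg_speed, 2))  # about 48, not the arithmetic mean of 50
```

The overall speed for the round trip is total distance over total time, which is what the harmonic mean returns here.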


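Harmonic Mean of Grouped Data

The harmonic mean H of the positive real numbers $x_1, x_2, \ldots, x_n$ is defined to be

Ungrouped Data

$$H = \frac{n}{\sum_{i=1}^{n} \dfrac{1}{x_i}}$$

Grouped Data, where $x_i$ occurs with frequency $f_i$ and $N = \sum_{i=1}^{n} f_i$

$$H = \frac{N}{\sum_{i=1}^{n} \dfrac{f_i}{x_i}}$$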


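Exercise-1: Find the Arithmetic, Geometric and Harmonic Mean

| Class | Frequency (f) | Mid-value (x) | fx | f Log x | f / x |
|-------|---------------|---------------|--------|---------|--------|
| 20-29 | 3 | 24.5 | 73.5 | 4.17 | 0.1224 |
| 30-39 | 5 | 34.5 | 172.5 | 7.69 | 0.1449 |
| 40-49 | 20 | 44.5 | 890 | 32.97 | 0.4494 |
| 50-59 | 10 | 54.5 | 545 | 17.36 | 0.1835 |
| 60-69 | 5 | 64.5 | 322.5 | 9.05 | 0.0775 |
| Sum | N = 43 | | 2003.5 | 71.24 | 0.9778 |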
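One possible way to work Exercise-1 is sketched below. It uses the class mid-values and frequencies from the table, takes Log as the base-10 logarithm, and computes the f / x column as frequency divided by mid-value; the variable names are illustrative.

```python
import math

midpoints = [24.5, 34.5, 44.5, 54.5, 64.5]   # class mid-values x
freqs = [3, 5, 20, 10, 5]                     # class frequencies f

N = sum(freqs)                                # total frequency
sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_f_log_x = sum(f * math.log10(x) for f, x in zip(freqs, midpoints))
sum_f_over_x = sum(f / x for f, x in zip(freqs, midpoints))

arithmetic = sum_fx / N                       # sum(f x) / N
geometric = 10 ** (sum_f_log_x / N)           # AntiLog(sum(f Log x) / N)
harmonic = N / sum_f_over_x                   # N / sum(f / x)

print(round(arithmetic, 2), round(geometric, 2), round(harmonic, 2))
# roughly 46.59, 45.36 and 43.98
```

The three results follow the usual ordering for positive data: harmonic mean, then geometric mean, then arithmetic mean.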


Weighted Mean

The weighted mean of the positive real numbers $x_1, x_2, \ldots, x_n$ with their weights $w_1, w_2, \ldots, w_n$ is defined to be

$$\bar{x} = \frac{\sum_{i=1}^{n} w_i x_i}{\sum_{i=1}^{n} w_i}$$
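A minimal sketch of the weighted mean follows. The values and weights are invented for illustration and do not come from the lecture; `weighted_mean` is a name chosen here.

```python
def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(w * x for w, x in zip(weights, values)) / total_weight

# Hypothetical data: three values with weights 1, 2 and 3.
values = [10, 20, 30]
weights = [1, 2, 3]
result = weighted_mean(values, weights)
print(round(result, 2))  # (10 + 40 + 90) / 6, about 23.33
```

With equal weights the function reduces to the ordinary arithmetic mean.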


Median

The median is the middle value of the observations arranged in order of magnitude, such that the number of observations above it is equal to the number of observations below it.

If “n” is odd

$$M_e = X_{\frac{1}{2}(n+1)}$$

If “n” is even

$$M_e = \frac{1}{2}\left( X_{\frac{n}{2}} + X_{\frac{n}{2}+1} \right)$$


Median of Grouped Data

$$M_e = L_0 + \frac{h}{f_0}\left( \frac{n}{2} - F \right)$$

L0 = Lower class boundary of the median class
h = Width of the median class
f0 = Frequency of the median class
F = Cumulative frequency of the pre-median class
n = Total number of observations


Steps to Find the Median of Grouped Data

1. Compute the less than type cumulative frequencies.

2. Determine N/2 , one-half of the total number of cases.

3. Locate the median class: the first class whose cumulative frequency reaches or exceeds N/2.

4. Determine the lower class boundary of the median class. This is L0.

5. Sum the frequencies of all classes prior to the median class. This is F.

6. Determine the frequency of the median class. This is f0.

7. Determine the class width of the median class. This is h.


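Example-3: Find Median

| Age in years | Number of births | Cumulative number of births |
|--------------|------------------|-----------------------------|
| 14.5-19.5 | 677 | 677 |
| 19.5-24.5 | 1908 | 2585 |
| 24.5-29.5 | 1737 | 4322 |
| 29.5-34.5 | 1040 | 5362 |
| 34.5-39.5 | 294 | 5656 |
| 39.5-44.5 | 91 | 5747 |
| 44.5-49.5 | 16 | 5763 |
| All ages | 5763 | - |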
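The median of the births table can be worked with the grouped-data formula as in the sketch below. It assumes all classes share the width h = 5 and takes the median class to be the first one whose cumulative frequency reaches N/2; `grouped_median` and the variable names are illustrative.

```python
from itertools import accumulate

def grouped_median(lower_bounds, width, freqs):
    """Median of grouped data: L0 + (h / f0) * (N/2 - F)."""
    half = sum(freqs) / 2                      # N/2
    cumulative = list(accumulate(freqs))       # less-than cumulative frequencies
    # Median class: first class whose cumulative frequency reaches N/2.
    k = next(i for i, c in enumerate(cumulative) if c >= half)
    F = cumulative[k - 1] if k > 0 else 0      # cumulative frequency before it
    return lower_bounds[k] + width / freqs[k] * (half - F)

lower_bounds = [14.5, 19.5, 24.5, 29.5, 34.5, 39.5, 44.5]
births = [677, 1908, 1737, 1040, 294, 91, 16]

median_age = grouped_median(lower_bounds, 5, births)
print(round(median_age, 2))  # about 25.35 years
```

Here N/2 = 2881.5, so the median class is 24.5-29.5 with L0 = 24.5, f0 = 1737 and F = 2585.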


Mode

Mode is the value of a distribution for which the frequency is maximum. In other words, mode is the value of a variable, which occurs with the highest frequency.

So the mode of the list (1, 2, 2, 3, 3, 3, 4) is 3. The mode is not necessarily well defined. The list (1, 2, 2, 3, 3, 5) has the two modes 2 and 3.
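The two lists above can be checked with a short sketch that returns every value tied for the highest frequency, so a list with two modes yields both. The helper name `modes` is chosen here for illustration.

```python
from collections import Counter

def modes(values):
    """Return every value that occurs with the highest frequency."""
    counts = Counter(values)
    top = max(counts.values())
    # Counter keeps first-seen order, so modes come out in that order.
    return [value for value, count in counts.items() if count == top]

print(modes([1, 2, 2, 3, 3, 3, 4]))  # [3]
print(modes([1, 2, 2, 3, 3, 5]))     # [2, 3]
```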


Example-2: Find Mean, Median and Mode of Ungrouped Data

The weekly pocket money for 9 first year pupils was found to be:

3 , 12 , 4 , 6 , 1 , 4 , 2 , 5 , 8

Mean = 5

Mode = 4

Median = 4
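These three answers can be reproduced with Python's standard `statistics` module, as in the sketch below; the variable names are illustrative.

```python
import statistics

pocket_money = [3, 12, 4, 6, 1, 4, 2, 5, 8]

mean_value = statistics.mean(pocket_money)      # 45 / 9
median_value = statistics.median(pocket_money)  # 5th of the 9 sorted values
mode_value = statistics.mode(pocket_money)      # most frequent value

print(sorted(pocket_money))  # [1, 2, 3, 4, 4, 5, 6, 8, 12]
print(mean_value, median_value, mode_value)
```

Because n = 9 is odd, the median is the single middle value of the sorted list.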


Mode of Grouped Data

$$M_0 = L_1 + \frac{\Delta_1}{\Delta_1 + \Delta_2}\, h$$

L1 = Lower boundary of modal class
Δ1 = difference of frequency between modal class and class before it
Δ2 = difference of frequency between modal class and class after it
h = class interval


Steps of Finding Mode

Find the modal class, which has the highest frequency

L1 = Lower class boundary of modal class

h = Interval of modal class

Δ1 = difference of frequency of modal class and class before modal class

Δ2 = difference of frequency of modal class and class after modal class


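Example-4: Find Mode

| Slope Angle (°) | Midpoint (x) | Frequency (f) | Midpoint × frequency (fx) |
|-----------------|--------------|---------------|---------------------------|
| 0-4 | 2 | 6 | 12 |
| 5-9 | 7 | 12 | 84 |
| 10-14 | 12 | 7 | 84 |
| 15-19 | 17 | 5 | 85 |
| 20-24 | 22 | 0 | 0 |
| Total | | n = 30 | ∑(fx) = 265 |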
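A possible worked solution for the slope-angle table is sketched below. It assumes the class boundaries sit half a unit outside the stated limits (so the class 5-9 runs from 4.5 to 9.5 with width 5), which the table itself does not state; with a lower boundary of 5 instead, the result would shift by 0.5. The function name `grouped_mode` is illustrative.

```python
def grouped_mode(lower_boundary, width, f_before, f_modal, f_after):
    """Mode of grouped data: L1 + d1 / (d1 + d2) * h."""
    d1 = f_modal - f_before   # modal class minus the class before it
    d2 = f_modal - f_after    # modal class minus the class after it
    return lower_boundary + d1 / (d1 + d2) * width

freqs = [6, 12, 7, 5, 0]
# Assumed class boundaries: half a unit below each lower class limit.
lower_boundaries = [-0.5, 4.5, 9.5, 14.5, 19.5]
width = 5

k = freqs.index(max(freqs))                       # modal class index
f_before = freqs[k - 1] if k > 0 else 0
f_after = freqs[k + 1] if k < len(freqs) - 1 else 0

mode_angle = grouped_mode(lower_boundaries[k], width, f_before, freqs[k], f_after)
print(round(mode_angle, 2))  # about 7.23 degrees
```

The modal class is 5-9 (frequency 12), giving Δ1 = 12 − 6 = 6 and Δ2 = 12 − 7 = 5.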