8/3/2019 Statistics Cental Tendency
Statistics
Prof. Rushen Chahal
Descriptive Statistics - Measures of Central Tendency
Chapters 3 & 4
TODAY'S GOALS
TO CALCULATE THE ARITHMETIC MEAN, THE WEIGHTED MEAN, THE MEDIAN, THE MODE, AND THE GEOMETRIC MEAN.
TO EXPLAIN THE CHARACTERISTICS, USE, ADVANTAGES, AND DISADVANTAGES OF EACH MEASURE OF CENTRAL TENDENCY.
TO IDENTIFY THE POSITION OF THE ARITHMETIC MEAN, MEDIAN, AND MODE FOR BOTH A SYMMETRICAL AND A SKEWED DISTRIBUTION.
POPULATION MEAN
Definition: For ungrouped data, the population mean is the sum of all the population values divided by the total number of population values. To compute the population mean, use the following formula:
$$\mu = \frac{\sum X}{N}$$
where $\mu$ (mu) is the population mean, $\sum$ (sigma) denotes summation, $X$ is an individual value, and $N$ is the population size.
EXAMPLE
Parameter: A measurable characteristic of a population. For example, the population mean.
A racing team has a fleet of four cars. The following are the miles covered by each car over their lives: 23,000, 17,000, 9,000, and 13,000. Find the average miles covered by each car.
Since this fleet is the population, the mean is (23,000 + 17,000 + 9,000 + 13,000)/4 = 15,500.
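The fleet example can be checked with a few lines of Python. This is only a minimal sketch: the function name `population_mean` is chosen here for illustration and is not part of the slides. The same arithmetic applies to a sample mean; only the interpretation of the data changes.

```python
def population_mean(values):
    """Return the arithmetic mean: the sum of the values divided by their count."""
    if not values:
        raise ValueError("the mean of an empty collection is undefined")
    return sum(values) / len(values)


# Lifetime miles for the four cars in the racing team's fleet.
fleet_miles = [23_000, 17_000, 9_000, 13_000]
print(population_mean(fleet_miles))  # 15500.0
```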
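THE SAMPLE MEAN
Definition: For ungrouped data, the sample mean is the sum of all the sample values divided by the number of sample values. To compute the sample mean, use the following formula:
$$\bar{X} = \frac{\sum X}{n}$$
where $\bar{X}$ (X-bar) is the sample mean, $\sum$ (sigma) denotes summation, $X$ is an individual value, and $n$ is the sample size.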
WHAT'S THE DIFFERENCE?
Population Mean $\mu$
Sample Mean $\bar{X}$
Sum of Deviations
Consider the set of values: 3, 8, and 4. The mean is 5. So (3 - 5) + (8 - 5) + (4 - 5) = -2 + 3 - 1 = 0.
Symbolically we write:
$$\sum (X - \bar{X}) = 0$$
The mean is also known as the expected value or average.
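The zero-sum property can be confirmed numerically for the values 3, 8, and 4. The variable names below are illustrative only. For other data the floating-point sum may come out as a very small nonzero number rather than exactly zero.

```python
values = [3, 8, 4]
mean = sum(values) / len(values)

# Deviation of each value from the mean.
deviations = [x - mean for x in values]

# The deviations cancel out, so their sum is zero.
total_deviation = sum(deviations)
print(deviations)       # [-2.0, 3.0, -1.0]
print(total_deviation)  # 0.0
```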
THE MEDIAN
Definition: The midpoint of the values after they have been ordered from the smallest to the largest, or the largest to the smallest. There are as many values above the median as below it in the data array.
Note: For an odd number of values, the median is the middle number in the ordered array.
Note: For an even number of values, the median is the arithmetic average of the two middle numbers.
EXAMPLE
Compute the median:
The road life for a sample of five tires in miles is: 42,000 51,000 40,000 39,000 48,000
Arranging the data in ascending order gives: 39,000 40,000 42,000 48,000 51,000. Thus the median is 42,000 miles.
EXAMPLE
Compute the median:
The following values are years of service for a sample of six store managers: 16 12 8 15 7 23.
Arranging in order gives 7 8 12 15 16 23. Thus the median is (12 + 15)/2 = 13.5 years.
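Both median examples (five tires, six store managers) follow the same two-case rule, which can be sketched in Python as below. The function name `median` is an illustrative choice; Python's standard library also offers `statistics.median`, which gives the same results for these data.

```python
def median(values):
    """Return the middle value of the ordered data (average of the two middle values if the count is even)."""
    ordered = sorted(values)
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty collection is undefined")
    mid = n // 2
    if n % 2 == 1:
        # Odd count: the single middle value.
        return ordered[mid]
    # Even count: arithmetic average of the two middle values.
    return (ordered[mid - 1] + ordered[mid]) / 2


tire_miles = [42_000, 51_000, 40_000, 39_000, 48_000]
years_of_service = [16, 12, 8, 15, 7, 23]
print(median(tire_miles))        # 42000
print(median(years_of_service))  # 13.5
```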
PROPERTIES OF THE MEDIAN
There is a unique median for each data set. It is not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.
It can be computed for ratio-level, interval-level, and ordinal-level data.
It can be computed for an open-ended frequency distribution if the median does not lie in an open-ended class.
THE MODE
Definition: The mode is the value of the observation that appears most frequently.
The exam scores for ten students are: 81 93 75 68 87 81 75 81 87. What is the modal exam score?
Since the score of 81 occurs most often, the modal score is 81.
The next slide shows the histogram with six classes for the water consumption from our previous class. Observe that the modal class is the blue box with a midpoint of 15.
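Counting how often each score occurs gives the modal score directly. The sketch below uses the exam scores as listed in the example. The helper name `modes` is an assumption made for illustration; it returns a list because a data set can have more than one most-frequent value.

```python
from collections import Counter


def modes(values):
    """Return every value that occurs with the highest frequency, in ascending order."""
    counts = Counter(values)
    highest = max(counts.values())
    return sorted(value for value, count in counts.items() if count == highest)


# Exam scores as listed in the example.
scores = [81, 93, 75, 68, 87, 81, 75, 81, 87]
print(modes(scores))  # [81]
```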
[Figure: histogram, WATER CONSUMPTION IN 1,000 GALLONS]
THE WEIGHTED MEAN
Definition: The weighted mean of a set of numbers $X_1, X_2, \ldots, X_n$, with corresponding weights $w_1, w_2, \ldots, w_n$, is computed from the following formula:
$$\bar{X}_w = \frac{w_1 X_1 + w_2 X_2 + \cdots + w_n X_n}{w_1 + w_2 + \cdots + w_n} \quad \text{or} \quad \bar{X}_w = \frac{\sum (w X)}{\sum w}$$
Why would anyone want to weight observations?
EXAMPLE
During a one hour period on a busy Friday night, fifty soft drinks were sold at the Kruzin Cafe. Compute the weighted mean of the price of the soft drinks. (Price ($), Number sold): (0.50, 5), (0.75, 15), (0.90, 15), and (1.10, 15).
The weighted mean is
[0.50 × 5 + 0.75 × 15 + 0.90 × 15 + 1.10 × 15]/[5 + 15 + 15 + 15] = $43.75/50 = $0.875
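The soft drink calculation can be reproduced with a short Python sketch, using the number sold at each price as the weight. The function name `weighted_mean` and the variable names are illustrative, not part of the slides.

```python
def weighted_mean(values, weights):
    """Return sum(w * x) / sum(w) for paired values and weights."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


prices = [0.50, 0.75, 0.90, 1.10]   # price of each soft drink in dollars
number_sold = [5, 15, 15, 15]       # weights: how many were sold at each price
print(round(weighted_mean(prices, number_sold), 3))  # 0.875
```

Weighting matters here because the unweighted average of the four prices would treat the 5 drinks sold at $0.50 as if they counted as much as the 15 sold at each of the other prices.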
THE GEOMETRIC MEAN
Definition: The geometric mean (GM) of a set of n numbers is defined as the nth root of the product of the n numbers. The formula for the geometric mean is given by:
$$GM = \sqrt[n]{(X_1)(X_2)(X_3) \cdots (X_n)}$$
One main use of the geometric mean is to average percents.
How do you compute the nth root of a number?
EXAMPLE
The profits earned by ABC Construction on three projects were 6, 3, and 2 percent respectively. Compute the geometric mean profit and the arithmetic mean and compare.
The geometric mean is
$$GM = \sqrt[3]{(6)(3)(2)} = 3.3019$$
The arithmetic mean profit = (6 + 3 + 2)/3 = 3.6667.
The geometric mean of 3.3019 gives a more conservative profit figure than the arithmetic mean of 3.6667. This is because the GM is not heavily weighted by the profit of 6 percent.
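One common way to take an nth root on a computer is to raise the number to the power 1/n. The sketch below applies that to the ABC Construction profits; the function name `geometric_mean` is illustrative, and `math.prod` requires Python 3.8 or later.

```python
import math


def geometric_mean(values):
    """Return the nth root of the product of n positive numbers."""
    if not values or any(x <= 0 for x in values):
        raise ValueError("the geometric mean needs at least one value, all positive")
    # The nth root of p is p raised to the power 1/n.
    return math.prod(values) ** (1 / len(values))


profits = [6, 3, 2]  # percent profit on the three projects
print(round(geometric_mean(profits), 4))       # 3.3019
print(round(sum(profits) / len(profits), 4))   # 3.6667
```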
THE GEOMETRIC MEAN (continued)
The other main use of the geometric mean is to determine the average percent increase in sales, production, or other business or economic series from one time period to another.
The formula for the geometric mean as applied to this type of problem is:
$$GM = \sqrt[n-1]{\frac{\text{Value at end of period}}{\text{Value at beginning of period}}} - 1$$
Where did this come from?
EXAMPLE
The total enrollment at a large university increased from 18,246 in 1985 to 22,840 in 1995. Compute the geometric mean rate of increase over the period.
Here n = 10, so n - 1 = 9 = (number of periods). The geometric mean rate of increase is given by
$$GM = \sqrt[9]{\frac{22{,}840}{18{,}246}} - 1 = 0.0253$$
That is, the geometric mean rate of increase is 2.53%.
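The enrollment calculation can be sketched in Python as follows. The function name `mean_rate_of_increase` is an illustrative choice, and the number of periods (9) is taken directly from the example rather than derived from the dates.

```python
def mean_rate_of_increase(start_value, end_value, periods):
    """Return the geometric mean rate of increase per period."""
    if start_value <= 0 or end_value <= 0 or periods <= 0:
        raise ValueError("values and number of periods must be positive")
    # Root of the end/beginning ratio, minus 1 to turn a growth factor into a rate.
    return (end_value / start_value) ** (1 / periods) - 1


# Enrollment figures from the example; periods=9 follows the slide's n - 1.
rate = mean_rate_of_increase(18_246, 22_840, periods=9)
print(f"{rate:.2%}")  # 2.53%
```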
EXAMPLE
Do you prefer that I use the arithmetic mean or the geometric mean to compute class score averages?
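THE MEAN OF GROUPED DATA
The mean of a sample of data organized in a frequency distribution is computed by the following formula:
$$\bar{X} = \frac{\sum Xf}{\sum f} = \frac{\sum Xf}{n}$$
where $\bar{X}$ (X-bar) is the sample mean, $X$ is the class midpoint, $f$ is the class frequency, and $n = \sum f$ (the sum of the frequencies) is the sample size.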
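As a rough illustration of the grouped-data formula, the sketch below multiplies each class midpoint by its frequency and divides by the total frequency. The midpoints and frequencies are hypothetical values invented for this sketch, not data from the slides, and the function name `grouped_mean` is likewise an assumption.

```python
def grouped_mean(midpoints, frequencies):
    """Return sum(X * f) / sum(f) for class midpoints X and class frequencies f."""
    if len(midpoints) != len(frequencies):
        raise ValueError("each class needs one midpoint and one frequency")
    n = sum(frequencies)  # sample size is the sum of the frequencies
    if n == 0:
        raise ValueError("the total frequency must be positive")
    return sum(x * f for x, f in zip(midpoints, frequencies)) / n


# Hypothetical frequency distribution (not from the slides).
midpoints = [10, 20, 30]
frequencies = [1, 2, 1]
print(grouped_mean(midpoints, frequencies))  # 20.0
```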
EXAMPLE
A sample of twenty appliance stores in a large metropolitan area revealed the following number of VCRs sold last week. Compute the mean number sold. The formula and computation are shown below.
$$\bar{X} = \frac{\sum Xf}{\sum f} = \frac{\sum Xf}{n} = \frac{325}{20} = 16.25 \text{ VCRs}$$
EXAMPLE (continued)
The table also gives the necessary computations.
SYMMETRIC DISTRIBUTION
Zero skewness: Mode = Median = Mean
RIGHT SKEWED DISTRIBUTION
Positively skewed: Mean and median are to the RIGHT of the mode.
LEFT SKEWED DISTRIBUTION
Negatively skewed: Mean and median are to the LEFT of the mode.
TODAY'S GOALS
TO COMPARE (COMPUTE) VARIOUS MEASURES OF DISPERSION FOR GROUPED AND UNGROUPED DATA.
TO EXPLAIN THE CHARACTERISTICS, USES, ADVANTAGES, AND DISADVANTAGES OF EACH MEASURE OF DISPERSION.
TO EXPLAIN CHEBYSHEV'S THEOREM AND THE EMPIRICAL (NORMAL) RULE.
TO COMPUTE THE COEFFICIENTS OF VARIATION AND SKEWNESS.
TO DEMONSTRATE COMPUTING STATISTICS WITH EXCEL.
MEASURES OF DISPERSION - UNGROUPED DATA
Range: For ungrouped data, the range is the difference between the highest and lowest values in a set of data. To compute the range, use the following formula.
RANGE = HIGHEST VALUE - LOWEST VALUE
EXAMPLE: A sample of five recent accounting graduates revealed the following starting salaries (in $1000): 17 26 18 20 19. The range is thus $26,000 - $17,000 = $9,000.
MEAN DEVIATION
Mean Deviation: The arithmetic mean of the absolute values of the deviations from the arithmetic mean. It is computed by the formula below:
$$MD = \frac{\sum |X - \bar{X}|}{n}$$
where $X$ is an individual value, $\bar{X}$ is the arithmetic mean, and $n$ is the sample size.
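To make the formula concrete, the sketch below applies it to the starting-salary sample used for the range (17, 26, 18, 20, 19, in $1000). Reusing that sample here is a choice made for illustration, and the function name `mean_deviation` is an assumption.

```python
def mean_deviation(values):
    """Return the mean of the absolute deviations from the arithmetic mean."""
    n = len(values)
    if n == 0:
        raise ValueError("the mean deviation of an empty collection is undefined")
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


salaries = [17, 26, 18, 20, 19]  # starting salaries in $1000
# Mean is 20; absolute deviations are 3, 6, 2, 0, 1, which sum to 12.
print(mean_deviation(salaries))  # 2.4
```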
POPULATION VARIANCE
Population Variance: The population variance for ungrouped data is the arithmetic mean of the squared deviations from the population mean. It is computed from the formula below:
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N}$$
where $\sigma^2$ (sigma squared) is the population variance, $X$ is an individual value, $\mu$ is the population mean, and $N$ is the population size.
EXAMPLE
The ages of all the patients in the isolation ward of Yellowstone Hospital are 38, 26, 13, 41, and 22 years. What is the population variance? The computations are given below.
$$\mu = \frac{\sum X}{N} = \frac{140}{5} = 28$$
$$\sigma^2 = \frac{\sum (X - \mu)^2}{N} = \frac{534}{5} = 106.8$$
ALTERNATIVE FORMULA FOR THE POPULATION VARIANCE
$$\sigma^2 = \frac{\sum X^2}{N} - \left(\frac{\sum X}{N}\right)^2$$
Verify, using the above formula, that the population variance is 106.8 for the previous example.
Why would you use this formula?
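The verification can be sketched in Python by computing the variance of the Yellowstone Hospital ages both ways. The function names are illustrative. One plausible reason to prefer the alternative form for hand calculation is that it needs only the sum of the values and the sum of their squares, with no list of deviations; in floating-point arithmetic, however, it can lose precision when the mean is large relative to the spread.

```python
def population_variance(values):
    """Definitional form: mean of the squared deviations from the population mean."""
    n = len(values)
    mu = sum(values) / n
    return sum((x - mu) ** 2 for x in values) / n


def population_variance_alt(values):
    """Alternative form: mean of the squares minus the square of the mean."""
    n = len(values)
    return sum(x * x for x in values) / n - (sum(values) / n) ** 2


ages = [38, 26, 13, 41, 22]
print(round(population_variance(ages), 4))      # 106.8
print(round(population_variance_alt(ages), 4))  # 106.8
```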
THE POPULATION STANDARD DEVIATION
Population Standard Deviation: The population standard deviation ($\sigma$) is the square root of the population variance.
For the previous example, the population standard deviation is $\sigma$ = 10.3344 (square root of 106.8).
Note: If you are given the population standard deviation, just square that number to get the population variance.
SAMPLE VARIANCE
Sample Variance: The formula for the sample variance for ungrouped data is:
$$s^2 = \frac{\sum (X - \bar{X})^2}{n - 1}$$
OR
$$s^2 = \frac{\sum X^2 - \frac{(\sum X)^2}{n}}{n - 1}$$
where $s^2$ is the sample variance.
This sample variance is used to estimate the population variance.
EXAMPLE
A sample of five hourly wages for blue-collar jobs is: 17 26 18 20 19. Find the variance.
$$\bar{X} = \frac{100}{5} = 20$$
$$s^2 = \frac{50}{5 - 1} = 12.5$$
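Both sample-variance formulas can be checked against the hourly wage example, and against `statistics.variance` from Python's standard library, which also divides by n - 1. The two function names defined below are illustrative.

```python
import statistics


def sample_variance(values):
    """Definitional form: squared deviations from the sample mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)


def sample_variance_shortcut(values):
    """Computational form: (sum of squares - (sum)^2 / n) divided by n - 1."""
    n = len(values)
    return (sum(x * x for x in values) - sum(values) ** 2 / n) / (n - 1)


wages = [17, 26, 18, 20, 19]
print(sample_variance(wages))           # 12.5
print(sample_variance_shortcut(wages))  # 12.5
print(statistics.variance(wages))       # standard-library check
```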
SAMPLE STANDARD DEVIATION
Sample Standard Deviation: The sample standard deviation (s) is the square root of the sample variance.
For the previous example, the sample standard deviation is s = 3.5355 (square root of 12.5).
Note: If you are given the sample standard deviation, just square that number to get the sample variance.
INTERPRETATION AND USES OF THE STANDARD DEVIATION
Chebyshev's theorem: For any set of observations (sample or population), the proportion of the values that lie within k standard deviations of the mean is at least $1 - 1/k^2$, where k is any constant greater than 1.
Empirical Rule: For any symmetrical, bell-shaped distribution, approximately 68% of the observations will lie within $\pm 1\sigma$ of the mean ($\mu$); approximately 95% within $\pm 2\sigma$ of the mean ($\mu$); and approximately 99.7% within $\pm 3\sigma$ of the mean ($\mu$).
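The two rules can be compared numerically. The sketch below evaluates Chebyshev's lower bound, which holds for any data set, alongside the proportion of a normal distribution within k standard deviations. Using the normal distribution as the model for "bell-shaped" is an assumption of this sketch, and both function names are illustrative.

```python
import math


def chebyshev_lower_bound(k):
    """Minimum proportion of values within k standard deviations of the mean (any distribution)."""
    if k <= 1:
        raise ValueError("k must be greater than 1")
    return 1 - 1 / k ** 2


def normal_proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 4), round(normal_proportion_within(k), 4))
```

Chebyshev's bound is always the smaller of the two figures, which is the price of making no assumption about the shape of the distribution.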
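Bell-Shaped Curve showing the relationship between $\sigma$ and $\mu$.
[Figure: bell-shaped curve with the horizontal axis marked at $\mu - 3\sigma$, $\mu - 2\sigma$, $\mu - 1\sigma$, $\mu$, $\mu + 1\sigma$, $\mu + 2\sigma$, $\mu + 3\sigma$]
Between:
1. $\mu \pm 1\sigma$: 68.26%
2. $\mu \pm 2\sigma$: 95.44%
3. $\mu \pm 3\sigma$: 99.74%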
RELATIVE DISPERSION
Coefficient of Variation: The ratio of the standard deviation to the arithmetic mean, expressed as a percentage.
$$CV = \frac{s}{\bar{X}} (100\%)$$
For example, if the CVs for the yields of two different stocks are 10 and 25, the stock with the larger CV has more variation relative to the mean yield. That is, the yield for this stock is not as stable as the other.
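As an illustration of the CV formula, the sketch below applies it to the hourly wage sample from the variance example (mean 20, s = 3.5355). Reusing that sample is a choice made for this sketch, and the function name `coefficient_of_variation` is an assumption.

```python
import statistics


def coefficient_of_variation(values):
    """Return the sample standard deviation as a percentage of the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("the coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean * 100


wages = [17, 26, 18, 20, 19]
print(round(coefficient_of_variation(wages), 2))  # 17.68
```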
EXCEL FUNCTIONS
=AVERAGE(A1:A10) Arithmetic Mean
=MEDIAN(A1:A10) Median Value
=MODE(A1:A10) Modal Value
=GEOMEAN(A1:A10) Geometric Mean
=QUARTILE(A1:A10,Q) Quartile Q Value
=MAX(A1:A10)-MIN(A1:A10) Range
=PERCENTILE(A1:A10,P) Percentile P Value
MORE EXCEL FUNCTIONS
=AVEDEV(A1:A10) Mean Absolute Deviation (MAD)
=VAR(A1:A10) Sample Variance
=STDEV(A1:A10) Sample Standard Deviation
=VARP(A1:A10) Population Variance
=STDEVP(A1:A10) Population Standard Deviation
=STDEV(A1:A10)/AVERAGE(A1:A10) Coefficient of Variation
=SKEW(A1:A10) Coefficient of Skewness (not same as book)
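For readers working outside Excel, several of these functions have close counterparts in Python's standard `statistics` module. The pairing below is a rough sketch rather than an exact equivalence, and the quartile, percentile, mean absolute deviation, and skewness functions are left out because the standard library either lacks them or uses different conventions. The data are the exam scores from the mode example, reused here for illustration; `statistics.geometric_mean` requires Python 3.8 or later.

```python
import statistics

data = [81, 93, 75, 68, 87, 81, 75, 81, 87]

summary = {
    "AVERAGE": statistics.mean(data),
    "MEDIAN": statistics.median(data),
    "MODE": statistics.mode(data),
    "GEOMEAN": statistics.geometric_mean(data),
    "MAX-MIN (range)": max(data) - min(data),
    "VAR": statistics.variance(data),      # sample variance (divides by n - 1)
    "STDEV": statistics.stdev(data),       # sample standard deviation
    "VARP": statistics.pvariance(data),    # population variance (divides by N)
    "STDEVP": statistics.pstdev(data),     # population standard deviation
}

for name, value in summary.items():
    print(f"{name}: {value}")
```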