mohammad-afif
CHAPTER 1: DESCRIPTIVE STATISTICS
Introduction
Raw data - Data recorded in the sequence in which they are collected, before they are processed or ranked.
Array data - Raw data that is arranged in ascending or descending order.
Example 1: Quantitative raw data
These data are also called ungrouped data.
Example 2: Qualitative raw data
Organizing and Graphing Qualitative Data
• Frequency Distributions / Table
• Relative Frequency and Percentage Distribution
• Graphical Presentation of Qualitative Data
Frequency Distributions / Table
A frequency distribution for qualitative data lists all categories and the number of elements that belong to each of the categories.
It shows how the frequencies are distributed over the various categories.
It is also called a frequency distribution table or simply a frequency table.
The number of elements that belong to a certain category is called the frequency of that category.
Relative Frequency and Percentage Distribution
A relative frequency distribution is a listing of all categories along with their relative frequencies (given as proportions or percentages).
It is commonplace to give the frequency and relative frequency distribution together.
Calculating relative frequency and percentage of a category
\[ \text{Relative frequency of a category} = \frac{\text{Frequency of that category}}{\text{Sum of all frequencies}} \]
\[ \text{Percentage} = (\text{Relative frequency}) \times 100 \]
Example 3
A sample of UUM staff-owned vehicles produced by Proton was identified and the make of each noted. The resulting sample follows (W = Wira, Is = Iswara, Wj = Waja, St = Satria, P = Perdana, Sv = Savvy):
W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv
Construct a frequency distribution table for these data with their relative frequencies and percentages.
Solution:
| Category | Frequency | Relative Frequency | Percentage (%) |
|---|---|---|---|
| Wira | 19 | 19/50 = 0.38 | 0.38*100 = 38 |
| Iswara | 8 | 0.16 | 0.16*100 = 16 |
| Perdana | 4 | 0.08 | 0.08*100 = 8 |
| Waja | 10 | 0.20 | 0.20*100 = 20 |
| Satria | 5 | 0.10 | 0.10*100 = 10 |
| Savvy | 4 | 0.08 | 0.08*100 = 8 |
| Total | 50 | 1.00 | 100 |
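The tallying in Example 3 can also be checked with a short script. The following is only a minimal sketch: the function name `frequency_table` and the variable names are illustrative assumptions, not part of the original notes.

```python
from collections import Counter

# Sample from Example 3, using the abbreviations defined there.
sample = """
W W P Is Is P Is W St Wj
Is W W Wj Is W W Is W Wj
Wj Is Wj Sv W W W Wj St W
Wj Sv W Is P Sv Wj Wj W W
St W W W W St St P Wj Sv
""".split()

def frequency_table(data):
    """Return {category: (frequency, relative frequency, percentage)}."""
    counts = Counter(data)
    total = sum(counts.values())
    return {cat: (f, f / total, 100 * f / total) for cat, f in counts.items()}

table = frequency_table(sample)
for cat, (f, rel, pct) in table.items():
    print(f"{cat:<3} {f:>3} {rel:6.2f} {pct:6.1f}")
```

Each printed row should agree with the corresponding row of the solution table, for example 19, 0.38 and 38 for Wira (W).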
Graphical Presentation of Qualitative Data
Bar Graphs
A graph made of bars whose heights represent the frequencies of respective categories.
Such a graph is most helpful when you have many categories to represent.
Notice that a gap is inserted between each of the bars.
Bar graphs come in several types:
• simple/vertical bar chart
• horizontal bar chart
• component bar chart
• multiple bar chart
Simple/ Vertical Bar Chart
To construct a vertical bar chart, mark the various categories on the horizontal axis and mark the frequencies on the vertical axis.
Refer to Figure 2.1 and Figure 2.2.
Figure 2.1
Figure 2.2
Horizontal Bar Chart
To construct a horizontal bar chart, mark the various categories on the vertical axis and mark the frequencies on the horizontal axis.
Example 4: Refer to Example 3.
Figure 2.3: Horizontal bar chart titled "UUM Staff-owned Vehicles Produced By Proton" (vertical axis: types of vehicle, Wira to Savvy; horizontal axis: frequency, 0 to 20)
Another example of a horizontal bar chart: Figure 2.4
Figure 2.4: Number of students at Diversity College who are immigrants, by last country of permanent residence
Component Bar Chart
To construct a component bar chart, all categories are placed in one bar, and each bar is divided into components.
The height of each component should tally with the frequency it represents.
Example 5
Suppose we want to illustrate the information below, representing the number of people participating in the activities offered by an outdoor pursuits centre during June of three consecutive years.
| Activity | 2004 | 2005 | 2006 |
|---|---|---|---|
| Climbing | 21 | 34 | 36 |
| Caving | 10 | 12 | 21 |
| Walking | 75 | 85 | 100 |
| Sailing | 36 | 36 | 40 |
| Total | 142 | 167 | 197 |
Solution:
Figure 2.5: Component bar chart titled "Activities Breakdown (June)" (horizontal axis: year, 2004 to 2006; vertical axis: number of participants, 0 to 200; components: Climbing, Caving, Walking, Sailing)
Multiple Bar Chart
To construct a multiple bar chart, the bars that represent the categories are gathered in groups.
The height of each bar represents the frequency of its category.
Useful for making comparisons (two or more values).
Example 6: Refer to Example 5.
Figure 2.6: Multiple bar chart titled "Activities Breakdown (June)" (horizontal axis: year, 2004 to 2006; vertical axis: number of participants, 0 to 120; bars: Climbing, Caving, Walking, Sailing)
Another example of a horizontal bar chart: Figure 2.7
Figure 2.7: Preferred snack choices of students at UUM
Pie Chart
A circle divided into portions that represent the relative frequencies or percentages of a population or a sample belonging to different categories.
An alternative to the bar chart and useful for summarizing a single categorical variable if there are not too many categories.
The chart makes it easy to compare relative sizes of each class/category.
The whole pie represents the total sample or population. The pie is divided into different portions that represent the different categories.
To construct a pie chart, we multiply 360° by the relative frequency of each category to obtain the degree measure, or size of the angle, for the corresponding category.
Example 7 (Table 2.6 and Figure 2.8):
Figure 2.8
Example 8 (Table 2.7 and Figure 2.9):
| Movie Genres | Frequency | Relative Frequency | Angle Size |
|---|---|---|---|
| Comedy | 54 | 0.27 | 360*0.27 = 97.2° |
| Action | 36 | 0.18 | 360*0.18 = 64.8° |
| Romance | 28 | 0.14 | 360*0.14 = 50.4° |
| Drama | 28 | 0.14 | 360*0.14 = 50.4° |
| Horror | 22 | 0.11 | 360*0.11 = 39.6° |
| Foreign | 16 | 0.08 | 360*0.08 = 28.8° |
| Science Fiction | 16 | 0.08 | 360*0.08 = 28.8° |
| Total | 200 | 1.00 | 360° |
Figure 2.9
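The angle sizes for a pie chart follow directly from the relative frequencies. The sketch below recomputes the Example 8 angles; the function name `pie_angles` is an illustrative assumption rather than anything defined in the notes.

```python
def pie_angles(frequencies):
    """Map each category to (relative frequency, angle in degrees)."""
    total = sum(frequencies.values())
    return {cat: (f / total, 360 * f / total) for cat, f in frequencies.items()}

# Frequencies from Example 8 (Table 2.7).
genres = {
    "Comedy": 54, "Action": 36, "Romance": 28, "Drama": 28,
    "Horror": 22, "Foreign": 16, "Science Fiction": 16,
}

angles = pie_angles(genres)
for genre, (rel, angle) in angles.items():
    print(f"{genre:<16} {rel:5.2f} {angle:6.1f}")
```

The angles should add up to 360 degrees, which is a quick check that no category has been left out.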
Line Graph/Time Series Graph
A graph that represents data occurring over a specific period of time.
Line graphs are widely used because their visual characteristics reveal data trends clearly and they are easy to create.
When analyzing the graph, look for a trend or pattern that occurs over the time period.
For example, is the line ascending (indicating an increase over time) or descending (indicating a decrease over time)?
Another thing to look for is the slope, or steepness, of the line. A line that is steep over a specific time period indicates a rapid increase or decrease over that period.
Two data sets can be compared on the same graph (called a compound time series graph) if two lines are used.
Data collected on the same element for the same variable at different points in time or for different periods of time are called time series data.
A line graph is a visual comparison of how two variables—shown on the x- and y-axes—are related or vary with each other. It shows related information by drawing a continuous line between all the points on a grid.
Line graphs compare two variables: one is plotted along the x-axis (horizontal) and the other along the y-axis (vertical).
The y-axis in a line graph usually indicates quantity (e.g., RM, number of sales, litres) or percentage, while the horizontal x-axis often measures units of time. As a result, the line graph is often viewed as a time series graph.
Example 9
A transit manager wishes to use the following data for a presentation showing how Port Authority Transit ridership has changed over the years. Draw a time series graph for the data and summarize the findings.
| Year | Ridership (in millions) |
|---|---|
| 1990 | 88.0 |
| 1991 | 85.0 |
| 1992 | 75.7 |
| 1993 | 76.6 |
| 1994 | 75.4 |
Solution:
[Time series graph: year (1990 to 1994) on the horizontal axis; ridership (in millions, scale 75 to 89) on the vertical axis]
The graph shows a decline in ridership through 1992, followed by a leveling off in 1993 and 1994.
Exercise 1
1. The following data show the method of payment used by 16 customers in a supermarket checkout line. Here, C = cash, CK = check, CC = credit card, D = debit and O = other.
C CK CK C CC D O C
CK CC D CC C CK CK CC
a. Construct a frequency distribution table.
b. Calculate the relative frequencies and percentages for all categories.
c. Draw a pie chart for the percentage distribution.
Solution:
1. a), b) Frequency distribution table, relative frequencies, percentages and angle sizes of all categories.
| Method of payment | Frequency, f | Relative frequency | Percentage (%) | Angle Size (°) |
|---|---|---|---|---|
| Cash | 4 | 0.2500 | 25.00 | 90 |
| Check | 5 | 0.3125 | 31.25 | 112.5 |
| Credit Card | 4 | 0.2500 | 25.00 | 90 |
| Debit | 2 | 0.1250 | 12.50 | 45 |
| Other | 1 | 0.0625 | 6.25 | 22.5 |
| Total | 16 | 1 | 100 | 360 |
c) Pie chart
[Pie chart: Cash 25%, Check 31%, Credit Card 25%, Debit 13%, Other 6%]
Exercise 2: The frequency distribution table below shows the sales of certain products at ZeeZee Company, giving the sales frequency of each product over a certain period. Find the relative frequency and the percentage for each product. Then construct a pie chart using the information obtained.
Solution:
Frequency distribution table, relative frequencies, percentages and angle sizes of all categories.
| Type of product | Frequency | Relative Frequency | Percentage (%) | Angle Size (°) |
|---|---|---|---|---|
| A | 13 | 0.26 | 26 | 93.6 |
| B | 12 | 0.24 | 24 | 86.4 |
| C | 5 | 0.10 | 10 | 36.0 |
| D | 9 | 0.18 | 18 | 64.8 |
| E | 11 | 0.22 | 22 | 79.2 |
| Total | 50 | 1.00 | 100 | 360 |
2.3 ORGANIZING AND GRAPHING QUANTITATIVE DATA
2.3.1 Stem and Leaf Display
2.3.2 Frequency Distribution
2.3.3 Relative Frequency and Percentage Distributions
2.3.4 Graphing Grouped Data
2.3.5 Shapes of Histogram
2.3.6 Cumulative Frequency Distributions
Stem-and-Leaf Display
In a stem-and-leaf display of quantitative data, each value is divided into two portions: a stem and a leaf. The leaves for each stem are then shown separately in a display.
It shows the pattern of the data.
It reveals which values are repeated frequently.
Example 10
25 12 9 10 5 12 23 7
36 13 11 12 31 28 37 6
14 41 38 44 13 22 18 19
Solution:
| Stem | Leaves |
|---|---|
| 0 | 9 5 7 6 |
| 1 | 2 0 2 3 1 2 4 3 8 9 |
| 2 | 5 3 8 2 |
| 3 | 6 1 7 8 |
| 4 | 1 4 |
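A stem-and-leaf display for two-digit data can be built by splitting each value into its tens digit (stem) and units digit (leaf). The sketch below applies this to the Example 10 data; the function name `stem_and_leaf` and its `ranked` flag are illustrative assumptions.

```python
from collections import defaultdict

# Data from Example 10.
data = [25, 12, 9, 10, 5, 12, 23, 7,
        36, 13, 11, 12, 31, 28, 37, 6,
        14, 41, 38, 44, 13, 22, 18, 19]

def stem_and_leaf(values, ranked=True):
    """Return {stem: [leaves]} with the tens digit as stem and the units digit as leaf.

    With ranked=True the leaves are in increasing order; otherwise they keep
    the order in which the values appear in the data.
    """
    display = defaultdict(list)
    for v in (sorted(values) if ranked else values):
        display[v // 10].append(v % 10)
    return dict(sorted(display.items()))

display = stem_and_leaf(data)
for stem, leaves in display.items():
    print(stem, "|", " ".join(str(leaf) for leaf in leaves))
```

Calling `stem_and_leaf(data, ranked=False)` keeps the leaves in data order, which is how a display is usually written on a first pass before ranking.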
Exercise 3:
Queen Bakery is a famous cake shop in Town J. The operations manager of Queen Bakery is interested in studying the time that customers queue before being served at the cashier counter. Below are the data (in minutes) for 20 customers. Construct a stem-and-leaf display.
5 1 8 1 3
3 15 2 12 16
10 16 9 6 14
7 10 3 11 8
Solution:
Leaves in the order the data were recorded:
| Stem | Leaves |
|---|---|
| 0 | 5 1 8 1 3 3 2 9 6 7 3 8 |
| 1 | 5 2 6 0 6 4 0 1 |
Leaves ranked in increasing order:
| Stem | Leaves |
|---|---|
| 0 | 1 1 2 3 3 3 5 6 7 8 8 9 |
| 1 | 0 0 1 2 4 5 6 6 |
Frequency Distributions
A frequency distribution for quantitative data lists all the classes and the number of values that belong to each class.
Data presented in the form of a frequency distribution are called grouped data.
The class boundary is given by the midpoint of the upper limit of one class and the lower limit of the next class. It is also called the real class limit.
To find the midpoint of the upper limit of the first class and the lower limit of the second class, we divide the sum of these two limits by 2.
e.g.:
\[ \text{Class boundary} = \frac{400 + 401}{2} = 400.5 \]
Class Width (class size)
Class width = Upper boundary – Lower boundary
e.g.: Width of the first class = 600.5 – 400.5 = 200
Class Midpoint or Mark
\[ \text{Class midpoint or mark} = \frac{\text{Lower limit} + \text{Upper limit}}{2} \]
e.g.:
\[ \text{Midpoint of the 1st class} = \frac{401 + 600}{2} = 500.5 \]
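The three quantities above can be computed together from the class limits. In the sketch below, only the first class (401 to 600) comes from the worked examples; the two further classes and the function name `describe_classes` are hypothetical and added purely for illustration.

```python
def describe_classes(limits):
    """For consecutive classes given as (lower limit, upper limit), return a list of
    (lower boundary, upper boundary, class width, class midpoint) tuples."""
    # Gap between the upper limit of one class and the lower limit of the next.
    gap = limits[1][0] - limits[0][1]
    rows = []
    for lower, upper in limits:
        lower_boundary = lower - gap / 2
        upper_boundary = upper + gap / 2
        rows.append((lower_boundary, upper_boundary,
                     upper_boundary - lower_boundary,
                     (lower + upper) / 2))
    return rows

# First class from the examples above; the other two are assumed for illustration.
rows = describe_classes([(401, 600), (601, 800), (801, 1000)])
for row in rows:
    print(row)
```

For the first class this reproduces the boundaries 400.5 and 600.5, the width 200 and the midpoint 500.5.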
Constructing Frequency Distribution Tables
1. To decide the number of classes, we use Sturges' formula:
\[ c = 1 + 3.3 \log n \]
where c is the number of classes and n is the number of observations in the data set.
2. Class width:
\[ i = \frac{\text{Range}}{\text{Number of classes}} = \frac{\text{Largest value} - \text{Smallest value}}{c} \]
This class width is rounded to a convenient number.
3. Lower Limit of the First Class or the Starting Point
Use the smallest value in the data set.
Example 11
The following data give the total home runs hit by all players of each of the 30 Major League Baseball teams during the 2004 season.
Solution:
i. Number of classes:
\[ c = 1 + 3.3 \log 30 = 1 + 3.3(1.48) \approx 5.88 \approx 6 \text{ classes} \]
ii. Class width:
\[ i = \frac{242 - 135}{6} \approx 17.8 \approx 18 \]
iii. Starting point = 135
Table 2.10 Frequency Distribution for Data of Table 2.9
| Total Home Runs | Tally | f |
|---|---|---|
| 135 – 152 | IIIII IIIII | 10 |
| 153 – 170 | II | 2 |
| 171 – 188 | IIIII | 5 |
| 189 – 206 | IIIII I | 6 |
| 207 – 224 | III | 3 |
| 225 – 242 | IIII | 4 |
| | | Σf = 30 |
Exercise 4: The following data show the serving time (in minutes) for 40 customers in a post office:
2.0 4.5 2.5 2.9 4.2 2.9 3.5 2.8
3.2 2.9 4.0 3.0 3.8 2.5 2.3 3.5
2.1 3.1 3.6 4.3 4.7 2.6 4.1 3.1
4.6 2.8 5.1 2.7 2.6 4.4 3.5 3.0
2.7 3.9 2.9 2.9 2.5 3.7 3.3 2.4
Construct a frequency distribution table.
Solution:
i. Number of classes:
\[ c = 1 + 3.3 \log 40 = 1 + 5.29 = 6.29 \approx 6 \text{ classes} \]
ii. Class width:
\[ i = \frac{5.1 - 2.0}{6} \approx 0.52 \approx 0.6 \]
iii. Starting point = 2.0
| Time class | Tally | f |
|---|---|---|
| 2.0 – 2.5 | IIIII II | 7 |
| 2.6 – 3.1 | IIIII IIIII IIIII | 15 |
| 3.2 – 3.7 | IIIII II | 7 |
| 3.8 – 4.3 | IIIII I | 6 |
| 4.4 – 4.9 | IIII | 4 |
| 5.0 – 5.5 | I | 1 |
| | | Σf = 40 |
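The three construction steps (number of classes, class width, starting point) can be sketched in code using the Exercise 4 data. The function names, the `scale` parameter and the choice to pass the rounded class width (0.6) in by hand are assumptions made for this illustration; values are converted to integer tenths so that floating-point rounding does not push a value into the wrong class.

```python
import math

# Serving times (minutes) for the 40 customers in Exercise 4.
times = [
    2.0, 4.5, 2.5, 2.9, 4.2, 2.9, 3.5, 2.8,
    3.2, 2.9, 4.0, 3.0, 3.8, 2.5, 2.3, 3.5,
    2.1, 3.1, 3.6, 4.3, 4.7, 2.6, 4.1, 3.1,
    4.6, 2.8, 5.1, 2.7, 2.6, 4.4, 3.5, 3.0,
    2.7, 3.9, 2.9, 2.9, 2.5, 3.7, 3.3, 2.4,
]

def sturges_classes(n):
    """Number of classes from c = 1 + 3.3 log10(n), rounded to the nearest integer."""
    return round(1 + 3.3 * math.log10(n))

def frequency_table(values, start, width, n_classes, scale=10):
    """Return a list of (lower limit, upper limit, frequency) tuples.

    Values are multiplied by `scale` and handled as integers (tenths here).
    """
    lo, w = round(start * scale), round(width * scale)
    counts = [0] * n_classes
    for v in values:
        counts[(round(v * scale) - lo) // w] += 1
    return [((lo + k * w) / scale, (lo + (k + 1) * w - 1) / scale, f)
            for k, f in enumerate(counts)]

c = sturges_classes(len(times))
table = frequency_table(times, start=2.0, width=0.6, n_classes=c)
for lower, upper, f in table:
    print(f"{lower:.1f} - {upper:.1f}  {f}")
```

The frequencies printed should match the f column of the Exercise 4 solution table.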
Relative Frequency and Percentage Distribution
\[ \text{Relative frequency of a class} = \frac{\text{Frequency of that class}}{\text{Sum of all frequencies}} = \frac{f}{\sum f} \]
\[ \text{Percentage} = (\text{Relative frequency}) \times 100 \]
Example 12 (Refer to Example 11)
Table 2.11: Relative Frequency and Percentage Distributions
| Total Home Runs | Class Boundaries | Frequency | Relative Frequency | % |
|---|---|---|---|---|
| 135 – 152 | 134.5 to less than 152.5 | 10 | 0.3333 | 33.33 |
| 153 – 170 | 152.5 to less than 170.5 | 2 | 0.0667 | 6.67 |
| 171 – 188 | 170.5 to less than 188.5 | 5 | 0.1667 | 16.67 |
| 189 – 206 | 188.5 to less than 206.5 | 6 | 0.2 | 20 |
| 207 – 224 | 206.5 to less than 224.5 | 3 | 0.1 | 10 |
| 225 – 242 | 224.5 to less than 242.5 | 4 | 0.1333 | 13.33 |
| Sum | | 30 | 1.0 | 100% |
Exercise 5: Refer to Exercise 4 and construct the relative frequency distribution for the data.
| Time class | f | Relative frequency |
|---|---|---|
| 2.0 – 2.5 | 7 | 0.175 |
| 2.6 – 3.1 | 15 | 0.375 |
| 3.2 – 3.7 | 7 | 0.175 |
| 3.8 – 4.3 | 6 | 0.150 |
| 4.4 – 4.9 | 4 | 0.100 |
| 5.0 – 5.5 | 1 | 0.025 |
| Total | Σf = 40 | 1.000 |
Graphing Grouped Data
Histograms
A histogram is a graph in which the class boundaries are marked on the horizontal axis and either the frequencies, relative frequencies, or percentages are marked on the vertical axis. The frequencies, relative frequencies or percentages are represented by the heights of the bars.
In a histogram, the bars are drawn adjacent to each other, and there is a space between the y-axis and the first bar.
Example 13 (Refer to Example 11)
Figure 2.10: Frequency histogram for Table 2.10 (horizontal axis: total home runs, class boundaries 134.5 to 242.5; vertical axis: frequency, 0 to 12)
Polygon
A graph formed by joining the midpoints of the tops of successive bars in a histogram with straight lines is called a polygon.
Example 13
Figure 2.11: Frequency polygon for Table 2.10 (horizontal axis: total home runs, class boundaries 134.5 to 242.5; vertical axis: frequency, 0 to 12)
For a very large data set, as the number of classes is increased (and the width of classes is decreased), the frequency polygon eventually becomes a smooth curve called a frequency distribution curve or simply a frequency curve.
Figure 2.12: Frequency distribution curve
Shape of Histogram
The shape of a histogram is the same as the shape of its polygon. For a very large data set, the polygon approaches a smooth frequency curve, so the same shape descriptions apply to the curve.
The most common shapes are:
(i) Symmetric
Figure 2.13 & 2.14: Symmetric histograms
(ii) Right skewed and (iii) Left skewed
Figure 2.15 & 2.16: Right skewed and Left skewed
Cumulative Frequency Distributions
A cumulative frequency distribution gives the total number of values that fall below the upper boundary of each class.
Example 14: Using the frequency distribution of Table 2.11,
| Total Home Runs | Class Boundaries | Frequency | Cumulative Frequency |
|---|---|---|---|
| 135 – 152 | 134.5 to less than 152.5 | 10 | 10 |
| 153 – 170 | 152.5 to less than 170.5 | 2 | 10+2 = 12 |
| 171 – 188 | 170.5 to less than 188.5 | 5 | 10+2+5 = 17 |
| 189 – 206 | 188.5 to less than 206.5 | 6 | 10+2+5+6 = 23 |
| 207 – 224 | 206.5 to less than 224.5 | 3 | 10+2+5+6+3 = 26 |
| 225 – 242 | 224.5 to less than 242.5 | 4 | 10+2+5+6+3+4 = 30 |
Ogive
An ogive is a curve drawn for the cumulative frequency distribution by joining with straight lines the dots marked above the upper boundaries of classes at heights equal to the cumulative frequencies of the respective classes.
There are two types of ogive: (i) ogive less than and (ii) ogive greater than.
Example 15 (Ogive Less Than)
| Earnings (RM) | Number of students (f) |
|---|---|
| 30 – 39 | 5 |
| 40 – 49 | 6 |
| 50 – 59 | 6 |
| 60 – 69 | 3 |
| 70 – 79 | 3 |
| 80 – 89 | 7 |
| Total | 30 |
First, build a table of cumulative frequency.
| Earnings (RM) | Cumulative Frequency (F) |
|---|---|
| Less than 29.5 | 0 |
| Less than 39.5 | 5 |
| Less than 49.5 | 11 |
| Less than 59.5 | 17 |
| Less than 69.5 | 20 |
| Less than 79.5 | 23 |
| Less than 89.5 | 30 |
[Graph: Ogive Less Than (horizontal axis: earnings (RM), from "Less than 29.5" to "Less than 89.5"; vertical axis: cumulative frequency, 0 to 35)]
Example 16 (Ogive Greater Than)
| Earnings (RM) | Number of students (f) |
|---|---|
| 30 – 39 | 5 |
| 40 – 49 | 6 |
| 50 – 59 | 6 |
| 60 – 69 | 3 |
| 70 – 79 | 3 |
| 80 – 89 | 7 |
| Total | 30 |
| Earnings (RM) | Cumulative Frequency (F) |
|---|---|
| More than 29.5 | 30 |
| More than 39.5 | 25 |
| More than 49.5 | 19 |
| More than 59.5 | 13 |
| More than 69.5 | 10 |
| More than 79.5 | 7 |
| More than 89.5 | 0 |
[Graph: Ogive Greater Than (horizontal axis: earnings (RM), from "More than 29.5" to "More than 89.5"; vertical axis: cumulative frequency, 0 to 35)]
Exercise 6:
Using the frequency table that you constructed in Exercises 4 and 5, build an ogive less than and an ogive greater than for the table.
| Time class | f |
|---|---|
| 2.0 – 2.5 | 7 |
| 2.6 – 3.1 | 15 |
| 3.2 – 3.7 | 7 |
| 3.8 – 4.3 | 6 |
| 4.4 – 4.9 | 4 |
| 5.0 – 5.5 | 1 |
| Total | Σf = 40 |
Solution (ogive less than):
| Time class | f | Class boundaries | Cumulative frequency |
|---|---|---|---|
| | | Less than 1.95 | 0 |
| 2.0 – 2.5 | 7 | Less than 2.55 | 7 |
| 2.6 – 3.1 | 15 | Less than 3.15 | 22 |
| 3.2 – 3.7 | 7 | Less than 3.75 | 29 |
| 3.8 – 4.3 | 6 | Less than 4.35 | 35 |
| 4.4 – 4.9 | 4 | Less than 4.95 | 39 |
| 5.0 – 5.5 | 1 | Less than 5.55 | 40 |
Solution (ogive greater than):
| Time class | f | Class boundaries | Cumulative frequency |
|---|---|---|---|
| 2.0 – 2.5 | 7 | More than 1.95 | 40 |
| 2.6 – 3.1 | 15 | More than 2.55 | 33 |
| 3.2 – 3.7 | 7 | More than 3.15 | 18 |
| 3.8 – 4.3 | 6 | More than 3.75 | 11 |
| 4.4 – 4.9 | 4 | More than 4.35 | 5 |
| 5.0 – 5.5 | 1 | More than 4.95 | 1 |
| | | More than 5.55 | 0 |
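Both cumulative tables for Exercise 6 can be generated from the class boundaries and the class frequencies. The following is a minimal sketch; the function name `cumulative_tables` is an illustrative assumption, and the boundaries are typed in directly rather than derived from the class limits.

```python
from itertools import accumulate

def cumulative_tables(boundaries, freqs):
    """Return (less_than, more_than), each a list of (boundary, cumulative frequency).

    `boundaries` must have one more entry than `freqs`: the lower boundary of
    the first class followed by the upper boundary of every class.
    """
    less = [0] + list(accumulate(freqs))     # values below each boundary
    total = less[-1]
    more = [total - c for c in less]         # values above each boundary
    return list(zip(boundaries, less)), list(zip(boundaries, more))

# Class boundaries and frequencies from Exercise 6.
boundaries = [1.95, 2.55, 3.15, 3.75, 4.35, 4.95, 5.55]
freqs = [7, 15, 7, 6, 4, 1]

less_than, more_than = cumulative_tables(boundaries, freqs)
for (b, below), (_, above) in zip(less_than, more_than):
    print(f"{b:.2f}  less than: {below:>2}  more than: {above:>2}")
```

Plotting the (boundary, cumulative frequency) pairs and joining them with straight lines gives the two ogives.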
Exercise 7:
Given the following frequency table:
| Amount ($) | Number of Responses | Cumulative frequency |
|---|---|---|
| 0 – 99 | 2 | 25 |
| 100 – 199 | 2 | 23 |
| 200 – 299 | 6 | - |
| 300 – 399 | 9 | - |
| 400 – 499 | 4 | - |
| 500 – 999 | 2 | 2 |
i. Construct a suitable ogive for the data.
Solution:
| Amount ($) | f | Class Boundaries | Cumulative frequency |
|---|---|---|---|
| 0 – 99 | 2 | More than -0.5 | 25 |
| 100 – 199 | 2 | More than 99.5 | 23 |
| 200 – 299 | 6 | More than 199.5 | 21 |
| 300 – 399 | 9 | More than 299.5 | 15 |
| 400 – 499 | 4 | More than 399.5 | 6 |
| 500 – 999 | 2 | More than 499.5 | 2 |
| | | More than 999.5 | 0 |
[Graph: Ogive More Than (horizontal axis: class boundaries, from "More than -0.5" to "More than 999.5"; vertical axis: cumulative frequency, 0 to 30)]
Box-Plot
A box-plot describes the data graphically using five measurements: smallest value, first quartile (K1), second quartile (median or K2), third quartile (K3) and largest value.
[Three box-plot diagrams, each marked with smallest value, K1, median, K3 and largest value: one for symmetric data, one for left-skewed data and one for right-skewed data]
Measures of Central Tendency
| Ungrouped Data | Grouped Data |
|---|---|
| (1) Mean | (1) Mean |
| (2) Median | (2) Median |
| (3) Mode | (3) Mode |
Ungrouped Data
Mean
Mean for population data:
\[ \mu = \frac{\sum x}{N} \]
Mean for sample data:
\[ \bar{x} = \frac{\sum x}{n} \]
where \(\sum x\) = the sum of all values, \(N\) = the population size, \(n\) = the sample size, \(\mu\) = the population mean, and \(\bar{x}\) = the sample mean.
Example 17
The following data give the prices (rounded to thousand RM) of five homes sold recently in Sekayang.
158 189 265 127 191
Find the mean sale price for these homes.
Solution:
\[ \bar{x} = \frac{\sum x}{n} = \frac{158 + 189 + 265 + 127 + 191}{5} = \frac{930}{5} = 186 \]
Thus, these five homes were sold for an average price of RM186 thousand, that is, RM186 000.
Median
Median is the value of the middle term in a data set that has been ranked in increasing order.
Procedure for finding the median:
Step 1: Rank the data set in increasing order.
Step 2: Determine the depth (position or location) of the median:
\[ \text{Depth of median} = \frac{n + 1}{2} \]
Step 3: Determine the value of the median.
Example 19
Find the median for the following data:
10 5 19 8 3
Solution:
(1) Rank the data in increasing order:
3 5 8 10 19
(2) Determine the depth of the median:
\[ \text{Depth of median} = \frac{n + 1}{2} = \frac{5 + 1}{2} = 3 \]
(3) Determine the value of the median:
The median is located in the third position of the data set.
3 5 8 10 19
Hence, the median for the above data = 8
Example 20
Find the median for the following data:
10 5 19 8 3 15
Solution:
(1) Rank the data in increasing order:
3 5 8 10 15 19
(2) Determine the depth of the median:
\[ \text{Depth of median} = \frac{n + 1}{2} = \frac{6 + 1}{2} = 3.5 \]
(3) Determine the value of the median:
The median is located midway between the 3rd and 4th positions of the data set.
\[ \text{Median} = \frac{8 + 10}{2} = 9 \]
Hence, the median for the above data = 9
• The median gives the center of a histogram, with half of the data values to the left of (or less than) the median and half to the right of (or more than) the median.
• The advantage of using the median is that it is not influenced by outliers.
Mode
Mode is the value that occurs with the highest frequency in a data set.
Example 21
1. What is the mode for the given data?
77 69 74 81 71 68 74 73
Solution: Mode = 74 (this number occurs twice): Unimodal
2. What is the mode for the given data?
77 69 68 74 81 71 68 74 73
Solution: Mode = 68 and 74: Bimodal
A major shortcoming of the mode is that a data set may have none or may have more than one mode.
One advantage of the mode is that it can be calculated for both kinds of data, quantitative and qualitative.
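The ungrouped-data measures in Examples 17 to 21 can be checked with Python's standard `statistics` module. This is only a sketch for verification; the variable names are illustrative, and `statistics.multimode` (available from Python 3.8) is used because it returns every value tied for the highest frequency.

```python
import statistics

# Example 17: mean sale price (thousand RM).
mean_price = statistics.mean([158, 189, 265, 127, 191])

# Examples 19 and 20: median for an odd and an even number of values.
median_odd = statistics.median([10, 5, 19, 8, 3])
median_even = statistics.median([10, 5, 19, 8, 3, 15])

# Example 21: one mode (unimodal) and two modes (bimodal).
modes_first = statistics.multimode([77, 69, 74, 81, 71, 68, 74, 73])
modes_second = statistics.multimode([77, 69, 68, 74, 81, 71, 68, 74, 73])

print(mean_price, median_odd, median_even, modes_first, modes_second)
```

Note that `statistics.median` sorts the data internally, so the values do not need to be ranked first.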
Grouped Data
1. Mean
Mean for population data:
\[ \mu = \frac{\sum fx}{N} \]
Mean for sample data:
\[ \bar{x} = \frac{\sum fx}{n} \]
where \(x\) is the midpoint and \(f\) is the frequency of a class.
Example 22
The following table gives the frequency distribution of the number of orders received each day during the past 50 days at the office of a mail-order company. Calculate the mean.
| Number of orders | f |
|---|---|
| 10 – 12 | 4 |
| 13 – 15 | 12 |
| 16 – 18 | 20 |
| 19 – 21 | 14 |
| | n = 50 |
Solution:
Because the data set includes only 50 days, it represents a sample. The value of \(\sum fx\) is calculated in the following table:
| Number of orders | f | x | fx |
|---|---|---|---|
| 10 – 12 | 4 | 11 | 44 |
| 13 – 15 | 12 | 14 | 168 |
| 16 – 18 | 20 | 17 | 340 |
| 19 – 21 | 14 | 20 | 280 |
| | n = 50 | | \(\sum fx\) = 832 |
The value of the sample mean is:
\[ \bar{x} = \frac{\sum fx}{n} = \frac{832}{50} = 16.64 \]
Thus, this mail-order company received an average of 16.64 orders per day during these 50 days.
Exercise 8:
A survey research company asks 100 people how many times they have been to the dentist in the last five years. Their grouped responses appear below.
| Number of Visits | Number of Responses |
|---|---|
| 0 – 4 | 16 |
| 5 – 9 | 25 |
| 10 – 14 | 48 |
| 15 – 19 | 11 |
What is the mean of the data?
Solution:
| Number of Visits | x | Number of Responses, f | fx |
|---|---|---|---|
| 0 – 4 | 2 | 16 | 32 |
| 5 – 9 | 7 | 25 | 175 |
| 10 – 14 | 12 | 48 | 576 |
| 15 – 19 | 17 | 11 | 187 |
| Total | | 100 | 970 |
The value of the sample mean is:
\[ \bar{x} = \frac{\sum fx}{\sum f} = \frac{\sum fx}{n} = \frac{970}{100} = 9.70 \]
Thus, on average, these people have been to the dentist 9.70 times in the last five years.
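The grouped mean can be recomputed from the class limits and frequencies, using each class midpoint as x. The sketch below uses an assumed function name, `grouped_mean`. For Exercise 8 the products are 16(2) = 32, 25(7) = 175, 48(12) = 576 and 11(17) = 187, so the sum of fx is 970 and the mean is 9.70.

```python
def grouped_mean(classes):
    """Mean of grouped data given as (lower limit, upper limit, frequency) tuples.

    Each class is represented by its midpoint x = (lower + upper) / 2.
    """
    n = sum(f for _, _, f in classes)
    sum_fx = sum(f * (lower + upper) / 2 for lower, upper, f in classes)
    return sum_fx / n

# Example 22: orders received per day.
orders = [(10, 12, 4), (13, 15, 12), (16, 18, 20), (19, 21, 14)]
# Exercise 8: visits to the dentist.
visits = [(0, 4, 16), (5, 9, 25), (10, 14, 48), (15, 19, 11)]

mean_orders = grouped_mean(orders)
mean_visits = grouped_mean(visits)
print(mean_orders, mean_visits)
```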
Median
Step 1: Construct the cumulative frequency distribution.
Step 2: Identify the class that contains the median. The median class is the first class whose cumulative frequency is at least n/2.
Step 3: Find the median by using the following formula:
\[ \text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i \]
where:
\(n\) = the total frequency
\(F\) = the cumulative frequency before the median class
\(i\) = the class width
\(L_m\) = the lower boundary of the median class
\(f_m\) = the frequency of the median class
Example 23
Based on the grouped data below, find the median:
| Time to travel to work | Frequency |
|---|---|
| 1 – 10 | 8 |
| 11 – 20 | 14 |
| 21 – 30 | 12 |
| 31 – 40 | 9 |
| 41 – 50 | 7 |
Solution:
1st Step: Construct the cumulative frequency distribution
| Time to travel to work | Frequency | Cumulative Frequency |
|---|---|---|
| 1 – 10 | 8 | 8 |
| 11 – 20 | 14 | 22 |
| 21 – 30 | 12 | 34 |
| 31 – 40 | 9 | 43 |
| 41 – 50 | 7 | 50 |
\[ \frac{n}{2} = \frac{50}{2} = 25 \]
The median class is the 3rd class.
So, \(F = 22\), \(f_m = 12\), \(L_m = 20.5\) and \(i = 10\). Therefore,
\[ \text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i = 20.5 + \left( \frac{25 - 22}{12} \right) 10 = 23 \]
Thus, 25 persons take less than 23 minutes to travel to work and another 25 persons take more than 23 minutes to travel to work.
Exercise 9:
A survey research company asks 100 people how many times they have been to the dentist in the last five years. Their grouped responses appear below.
| Number of Visits | Number of Responses |
|---|---|
| 0 – 4 | 16 |
| 5 – 9 | 25 |
| 10 – 14 | 48 |
| 15 – 19 | 11 |
What is the median of the data?
Solution:
| Number of Visits | Number of Responses | Cumulative frequency |
|---|---|---|
| 0 – 4 | 16 | 16 |
| 5 – 9 | 25 | 41 |
| 10 – 14 | 48 | 89 |
| 15 – 19 | 11 | 100 |
\[ \frac{n}{2} = \frac{100}{2} = 50 \]
The median class is the 3rd class.
So, \(F = 41\), \(f_m = 48\), \(L_m = 9.5\) and \(i = 5\). Therefore,
\[ \text{Median} = L_m + \left( \frac{\frac{n}{2} - F}{f_m} \right) i = 9.5 + \left( \frac{50 - 41}{48} \right) 5 = 10.4375 \]
Thus, 50 people have been to the dentist fewer than 10.4375 times and another 50 people have been to the dentist more than 10.4375 times in the last five years.
Mode
•Mode is the value that has the highest frequency in a data set.
•For grouped data, class mode (or, modal class) is the class with the highest frequency.
•To find mode for grouped data, use the following formula:
$$\text{Mode} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i$$
Where:
$L_{mo}$ is the lower boundary of class mode
$\Delta_1$ is the difference between the frequency of class mode and the frequency of the class before the class mode
$\Delta_2$ is the difference between the frequency of class mode and the frequency of the class after the class mode
i is the class width
Example 24
Based on the grouped data below, find the mode

Time to travel to work (minutes)    Frequency
1 – 10                              8
11 – 20                             14
21 – 30                             12
31 – 40                             9
41 – 50                             7
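Solution: Based on the table, the class mode is the 2nd class (11 – 20), so
$L_{mo}$ = 10.5, $\Delta_1$ = (14 – 8) = 6, $\Delta_2$ = (14 – 12) = 2 and i = 10
$$\text{Mode} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i = 10.5 + \left(\frac{6}{6 + 2}\right) 10 = 18$$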
We can also obtain the mode by using the histogram;
Figure 2.19
Exercise 10:
The following table gives the distribution of the share price for ABC Company, which was listed in BSKL in 2005.

Price (RM)    Frequency
12 – 14       5
15 – 17       14
18 – 20       25
21 – 23       7
24 – 26       6
27 – 29       3

Find the mode for this data using the formula and a histogram.
$$\text{Mode} = L_{mo} + \left(\frac{\Delta_1}{\Delta_1 + \Delta_2}\right) i = 17.5 + \left(\frac{11}{11 + 18}\right) 3 = 18.64$$
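The grouped-mode formula can be checked with a short Python sketch. The function name `grouped_mode` and its arguments are hypothetical; treating a missing neighbouring class as frequency 0 is an assumption of this sketch, and a class mode whose neighbours both have the same frequency as itself is not handled.

```python
def grouped_mode(lower_boundaries, width, freqs):
    """Mode of grouped data: L_mo + (d1 / (d1 + d2)) * i."""
    # Index of the class mode: the class with the highest frequency (first one if tied).
    k = max(range(len(freqs)), key=freqs.__getitem__)
    # Assumption: a class with no neighbour on one side is compared against frequency 0.
    prev_f = freqs[k - 1] if k > 0 else 0
    next_f = freqs[k + 1] if k + 1 < len(freqs) else 0
    d1 = freqs[k] - prev_f  # difference from the class before the class mode
    d2 = freqs[k] - next_f  # difference from the class after the class mode
    return lower_boundaries[k] + d1 / (d1 + d2) * width


# Share price data of Exercise 10: classes 12-14, 15-17, ..., 27-29.
price_mode = grouped_mode([11.5, 14.5, 17.5, 20.5, 23.5, 26.5], 3, [5, 14, 25, 7, 6, 3])
print(round(price_mode, 2))  # 18.64
```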
[Figure: histogram of frequency against class boundaries (11.5, 14.5, 17.5, 20.5, 23.5, 26.5, 29.5); mode read from the histogram = 18.5]
Median using ogive:
Example 1: example 15
Example 2: example 16
Relationship among mean, median & mode
As discussed in the previous topic, a histogram or a frequency distribution curve can assume either a skewed shape or a symmetrical shape.
Knowing the values of the mean, median and mode can give us some idea about the shape of the frequency curve.
Figure 2.20: Mean, median, and mode for a symmetric histogram and frequency distribution curve
Figure 2.21: Mean, median, and mode for a histogram and frequency distribution curve skewed to the right
Figure 2.22: Mean, median, and mode for a histogram and frequency distribution curve skewed to the left
Exercise 11:
For the following situations, state whether the distribution is symmetric, skewed to the right or skewed to the left.
Mean = 10, median = 15, mode = 20: Left skewed
Mean = 15, median = 10, mode = 7: Right skewed
Mean = 10, median = 10, mode = 11: Approximately symmetric
Mean = 11, median = 12, mode = 12: Approximately symmetric
Dispersion Measurement
The measures of central tendency such as mean, median and mode do not reveal the whole picture of the distribution of a data set.
Two data sets with the same mean may have completely different spreads.
The variation among the values of observations for one data set may be much larger or smaller than for the other data set.
Ungrouped Data
1. Range
RANGE = Largest value – Smallest value
Example 25:
Find the range of production for this data set,
Solution:
Range = Largest value – Smallest value = 267 277 – 49 651 = 217 626
Disadvantages:
The range is influenced by outliers.
It is based on two values only. All other values in a data set are ignored.
Variance and Standard Deviation
Standard deviation is the most used measure of dispersion.
A standard deviation value tells how closely the values of a data set are clustered around the mean.
A lower value of standard deviation indicates that the values of the data set are spread over a relatively smaller range around the mean.
A larger value of standard deviation indicates that the values of the data set are spread over a relatively larger range around the mean (far from the mean).
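Variance:
Population:
$$\sigma^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{N}}{N}$$
Sample:
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1}$$
Standard deviation is obtained as the positive square root of the variance:
Population:
$$\sigma = \sqrt{\sigma^2}$$
Sample:
$$s = \sqrt{s^2}$$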
Example 26
Let x denote the total production (in units) of a company

Company    Production
A          62
B          93
C          126
D          75
E          34

Find the variance and standard deviation.
Solution:

Company    Production (x)    x²
A          62                3844
B          93                8649
C          126               15876
D          75                5625
E          34                1156
Total      Σx = 390          Σx² = 35150
$$s^2 = \frac{\sum x^2 - \frac{(\sum x)^2}{n}}{n - 1} = \frac{35150 - \frac{(390)^2}{5}}{5 - 1} = 1182.50$$
Since $s^2$ = 1182.50;
Therefore,
$$s = \sqrt{1182.50} = 34.3875$$
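As a cross-check of Example 26, the sample variance shortcut formula can be written as a small Python sketch. The function name `sample_variance` is chosen here for illustration only.

```python
import math


def sample_variance(values):
    """Sample variance via the shortcut formula (sum x^2 - (sum x)^2 / n) / (n - 1)."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    return (sum_x2 - sum_x ** 2 / n) / (n - 1)


production = [62, 93, 126, 75, 34]  # companies A to E in Example 26
s2 = sample_variance(production)
s = math.sqrt(s2)  # standard deviation is the positive square root of the variance
print(s2, round(s, 4))  # 1182.5 34.3875
```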
The properties of variance and standard deviation:
The standard deviation is a measure of variation of all values from the mean.
The value of the variance and the standard deviation are never negative. Also, larger values of variance or standard deviation indicate greater amounts of variation.
The value of s can increase dramatically with the inclusion of one or more outliers.
Grouped Data
Range = Upper bound of last class – Lower bound of first class

Class       Frequency
41 – 50     1
51 – 60     3
61 – 70     7
71 – 80     13
81 – 90     10
91 – 100    6
Total       40

Upper bound of last class = 100.5
Lower bound of first class = 40.5
Range = 100.5 – 40.5 = 60
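Variance and Standard Deviation
Variance:
Population:
$$\sigma^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{N}}{N}$$
Sample:
$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1}$$
Standard Deviation:
Population:
$$\sigma = \sqrt{\sigma^2}$$
Sample:
$$s = \sqrt{s^2}$$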
Example 27
Find the variance and standard deviation for the following data:

No. of order    f
10 – 12         4
13 – 15         12
16 – 18         20
19 – 21         14
Total           n = 50
Solution:

No. of order    f         x     fx     fx²
10 – 12         4         11    44     484
13 – 15         12        14    168    2352
16 – 18         20        17    340    5780
19 – 21         14        20    280    5600
Total           n = 50          832    14216
Variance,
$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{14216 - \frac{(832)^2}{50}}{50 - 1} = 7.5820$$
Standard Deviation,
$$s = \sqrt{s^2} = \sqrt{7.5820} = 2.75$$
Thus, the standard deviation of the number of orders received at the office of this mail-order company during the past 50 days is 2.75.
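The grouped-data calculation can be reproduced with a brief Python sketch. The function name `grouped_sample_variance` is an assumption of this sketch, and each class is represented by its midpoint x, as in the solution table.

```python
import math


def grouped_sample_variance(midpoints, freqs):
    """Sample variance of grouped data: (sum fx^2 - (sum fx)^2 / n) / (n - 1)."""
    n = sum(freqs)
    sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
    sum_fx2 = sum(f * x * x for f, x in zip(freqs, midpoints))
    return (sum_fx2 - sum_fx ** 2 / n) / (n - 1)


# Number of orders: classes 10-12, 13-15, 16-18, 19-21 with midpoints 11, 14, 17, 20.
order_var = grouped_sample_variance([11, 14, 17, 20], [4, 12, 20, 14])
print(round(order_var, 4), round(math.sqrt(order_var), 2))  # 7.582 2.75
```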
Exercise 13:
Referring to Exercise 10, find the variance and standard deviation for the data.

Price (RM)    Frequency
12 – 14       5
15 – 17       14
18 – 20       25
21 – 23       7
24 – 26       6
27 – 29       3
Solution

Price (RM)    f     x     fx      x²     fx²
12 – 14       5     13    65      169    845
15 – 17       14    16    224     256    3584
18 – 20       25    19    475     361    9025
21 – 23       7     22    154     484    3388
24 – 26       6     25    150     625    3750
27 – 29       3     28    84      784    2352
Total         60          1152           22944
Variance:
$$s^2 = \frac{\sum fx^2 - \frac{(\sum fx)^2}{n}}{n - 1} = \frac{22944 - \frac{(1152)^2}{60}}{60 - 1} = \frac{825.6}{59} = 13.9932$$
Standard deviation:
Since $s^2$ = 13.9932;
Therefore,
$$s = \sqrt{s^2} = \sqrt{13.9932} = 3.7407$$
Relative Dispersion Measurement
To compare two or more distributions that have different units, based on their dispersion, OR
To compare two or more distributions that have the same unit but very different mean values.
FORMULA
$$CV = \frac{\sigma}{\mu} \times 100\% \quad \text{(population)}$$
$$CV = \frac{s}{\bar{x}} \times 100\% \quad \text{(sample)}$$
Also called modified coefficient or coefficient of variation, CV.
Example 26
The mean and standard deviation of monthly salary are given for two groups of workers at ABC company: Group 1: 700 & 20 and Group 2: 1070 & 20. Find the CV for each group and determine which group is more dispersed.
Solution:
$$CV_1 = \frac{20}{700} \times 100\% = 2.86\%$$
$$CV_2 = \frac{20}{1070} \times 100\% = 1.87\%$$
The monthly salary of group 1 workers is more dispersed compared to group 2.
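The comparison of the two groups can be sketched in Python as follows. The function name `coefficient_of_variation` is illustrative, not taken from the notes.

```python
def coefficient_of_variation(std_dev, mean):
    """Coefficient of variation in percent: (standard deviation / mean) * 100."""
    return std_dev / mean * 100


cv1 = coefficient_of_variation(20, 700)   # Group 1: mean 700, standard deviation 20
cv2 = coefficient_of_variation(20, 1070)  # Group 2: mean 1070, standard deviation 20
print(round(cv1, 2), round(cv2, 2))  # 2.86 1.87
```

Both groups have the same standard deviation, so the group with the smaller mean has the larger relative dispersion.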
MEASURE OF POSITION
QUARTILE
INTERQUARTILE RANGE
A measure of position determines the position of a single value in relation to other values in a sample or a population data set.
Quartiles
Quartiles are three summary measures that divide a ranked data set into four equal parts.
•The 1st quartile – denoted as Q1
•The 2nd quartile – median of a data set or Q2
•The 3rd quartile – denoted as Q3
FORMULA
$$\text{Depth of } Q_1 = \frac{n + 1}{4}$$
$$\text{Depth of } Q_3 = \frac{3(n + 1)}{4}$$
Example 27
The table below lists the total revenue for the 11 top tourism companies in Malaysia
109.7 79.9 121.2 76.4 80.2 82.1 79.4 89.3 98.0 103.5 86.8
Solution:
Step 1: Arrange the data in increasing order
76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2
Step 2: Determine the depth for Q1 and Q3
$$\text{Depth of } Q_1 = \frac{n + 1}{4} = \frac{11 + 1}{4} = 3$$
$$\text{Depth of } Q_3 = \frac{3(n + 1)}{4} = \frac{3(11 + 1)}{4} = 9$$
Step 3: Determine the Q1 and Q3
76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2
Q1 = 79.9 ; Q3 = 103.5
Interquartile Range
The difference between the third quartile and the first quartile for a data set.
FORMULA
IQR = Q3 – Q1
Example 28
The table below lists the total revenue for the 12 top tourism companies in Malaysia
109.7 79.9 74.1 121.2 76.4 80.2 82.1 79.4 89.3 98.0 103.5 86.8
Determine the IQR of the data
Solution:
Step 1: Arrange the data in increasing order
74.1 76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2
Step 2: Determine the depth for Q1 and Q3
$$\text{Depth of } Q_1 = \frac{n + 1}{4} = \frac{12 + 1}{4} = 3.25$$
$$\text{Depth of } Q_3 = \frac{3(n + 1)}{4} = \frac{3(12 + 1)}{4} = 9.75$$
Step 3: Determine the Q1 and Q3
74.1 76.4 79.4 79.9 80.2 82.1 86.8 89.3 98.0 103.5 109.7 121.2
Q1 = 79.4 + 0.25 (79.9 – 79.4) = 79.525
Q3 = 98.0 + 0.75 (103.5 – 98.0) = 102.125
Therefore,
IQR = Q3 – Q1 = 102.125 – 79.525 = 22.6
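The depth method used in Examples 27 and 28 can be sketched in Python. The helper names `value_at_depth` and `quartiles` are hypothetical, and the sketch assumes at least three observations so that both depths fall inside the data.

```python
import math


def value_at_depth(sorted_values, depth):
    """Value at a 1-based, possibly fractional depth, using linear interpolation."""
    lower = math.floor(depth)
    fraction = depth - lower
    value = sorted_values[lower - 1]
    if fraction > 0:
        # Move the fractional part of the way towards the next ranked value.
        value += fraction * (sorted_values[lower] - value)
    return value


def quartiles(values):
    """Return (Q1, Q3) using depths (n + 1) / 4 and 3(n + 1) / 4."""
    data = sorted(values)
    n = len(data)
    return value_at_depth(data, (n + 1) / 4), value_at_depth(data, 3 * (n + 1) / 4)


revenue = [109.7, 79.9, 74.1, 121.2, 76.4, 80.2, 82.1, 79.4, 89.3, 98.0, 103.5, 86.8]
q1, q3 = quartiles(revenue)
print(round(q1, 3), round(q3, 3), round(q3 - q1, 3))  # 79.525 102.125 22.6
```

Other textbooks and software packages use slightly different depth conventions, so their quartiles may differ from the values obtained here.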
Quartile For Grouped Data
From Median, we can get Q1 and Q3 equation as follows:
FORMULA
$$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F}{f_{Q_1}}\right) i$$
$$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F}{f_{Q_3}}\right) i$$
Example:
Find Q1 and Q3 for the following data

Time to travel to work    Frequency    Cumulative Frequency
1 – 10                    8            8
11 – 20                   14           22
21 – 30                   12           34
31 – 40                   9            43
41 – 50                   7            50
$$\text{Class } Q_1 = \frac{n}{4} = \frac{50}{4} = 12.5$$
Class Q1 is the 2nd class
Therefore,
$$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F}{f_{Q_1}}\right) i = 10.5 + \left(\frac{12.5 - 8}{14}\right) 10 = 13.7143$$
$$\text{Class } Q_3 = \frac{3n}{4} = \frac{3(50)}{4} = 37.5$$
Class Q3 is the 4th class
Therefore,
$$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F}{f_{Q_3}}\right) i = 30.5 + \left(\frac{37.5 - 34}{9}\right) 10 = 34.3889$$
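The grouped quartile formulas share one pattern, which the following Python sketch makes explicit. The function name `grouped_quartile` and its parameters are assumptions of this sketch; it locates the quartile class with a binary search over the cumulative frequencies and assumes a non-empty table with equal class widths.

```python
from bisect import bisect_left
from itertools import accumulate


def grouped_quartile(k, lower_boundaries, width, freqs):
    """k-th quartile (k = 1, 2, 3) of grouped data: L + ((k*n/4 - F) / f) * i."""
    cumulative = list(accumulate(freqs))
    n = cumulative[-1]
    target = k * n / 4
    # First class whose cumulative frequency is at least the target position.
    idx = bisect_left(cumulative, target)
    before = cumulative[idx - 1] if idx > 0 else 0  # F
    return lower_boundaries[idx] + (target - before) / freqs[idx] * width


boundaries = [0.5, 10.5, 20.5, 30.5, 40.5]  # classes 1-10, 11-20, ..., 41-50
freqs = [8, 14, 12, 9, 7]
travel_q1 = grouped_quartile(1, boundaries, 10, freqs)
travel_q3 = grouped_quartile(3, boundaries, 10, freqs)
print(round(travel_q1, 4), round(travel_q3, 4))  # 13.7143 34.3889
```

With k = 2 the same function gives the grouped median, since the target position becomes n/2.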
Exercise:
Find Q1 and Q3 for the following data

Price (RM)    Frequency
12 – 14       5
15 – 17       14
18 – 20       25
21 – 23       7
24 – 26       6
27 – 29       3
Answer:

Price (RM)    Frequency    Cumulative freq
12 – 14       5            5
15 – 17       14           19
18 – 20       25           44
21 – 23       7            51
24 – 26       6            57
27 – 29       3            60
$$\text{Class } Q_1 = \frac{n}{4} = \frac{60}{4} = 15$$
Class Q1 is the 2nd class
Therefore,
$$Q_1 = L_{Q_1} + \left(\frac{\frac{n}{4} - F}{f_{Q_1}}\right) i = 14.5 + \left(\frac{15 - 5}{14}\right) 3 = 16.6429$$
$$\text{Class } Q_3 = \frac{3n}{4} = \frac{3(60)}{4} = 45$$
Class Q3 is the 4th class
Therefore,
$$Q_3 = L_{Q_3} + \left(\frac{\frac{3n}{4} - F}{f_{Q_3}}\right) i = 20.5 + \left(\frac{45 - 44}{7}\right) 3 = 20.9286$$
MEASURE OF SKEWNESS
•To determine the skewness of data (symmetry, left skewed, right skewed)
Also called Skewness Coefficient or Pearson Coefficient of Skewness
FORMULA
$$S_k = \frac{\text{Mean} - \text{Mode}}{s} \quad \text{OR} \quad S_k = \frac{3(\text{Mean} - \text{Median})}{s}$$
• If Sk is positive: right skewed
• If Sk is negative: left skewed
• If Sk = 0: symmetric
If Sk takes a value in (-0.9999, -0.0001) or (0.0001, 0.9999): approximately symmetric.
Example 32
The duration of stay of cancer patients warded in Hospital Seberang Jaya was recorded in a frequency distribution. From the record, the mean is 28 days, the median is 25 days and the mode is 23 days. The standard deviation is 4.2 days.
What is the type of distribution?
Find the skewness coefficient.
Solution:
This distribution is right skewed because the mean is the largest value
$$S_k = \frac{\text{Mean} - \text{Mode}}{s} = \frac{28 - 23}{4.2} = 1.1905$$
OR
$$S_k = \frac{3(\text{Mean} - \text{Median})}{s} = \frac{3(28 - 25)}{4.2} = 2.1429$$
So, from the Sk value this distribution is right skewed.
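Both versions of the skewness coefficient used in this solution can be checked with a short Python sketch. The function names are chosen for this illustration only.

```python
def skewness_from_mode(mean, mode, std_dev):
    """Pearson coefficient of skewness based on the mode: (mean - mode) / s."""
    return (mean - mode) / std_dev


def skewness_from_median(mean, median, std_dev):
    """Pearson coefficient of skewness based on the median: 3 * (mean - median) / s."""
    return 3 * (mean - median) / std_dev


sk_mode = skewness_from_mode(28, 23, 4.2)
sk_median = skewness_from_median(28, 25, 4.2)
print(round(sk_mode, 4), round(sk_median, 4))  # 1.1905 2.1429
```

Both values are positive, which agrees with the conclusion that the distribution is right skewed.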
EXERCISE
1. A student wants to study the level of satisfaction with the price of a product at Queen Supermarket. She takes a simple random sample of 100 customers and asks them whether they are ‘very satisfied’, ‘satisfied’, ‘not sure’, ‘not satisfied’, or ‘very not satisfied’. State:
Population: All customers at Queen Supermarket
Sample: 100 customers at Queen Supermarket
Variable: Satisfaction
Type of variable: Qualitative variable
Data value: Very satisfied / satisfied / not sure / not satisfied / very not satisfied
Type of data collection in the survey: Face to face