ib mathematics
DESCRIPTIVE STATISTICS
A. Measures of central tendency
I. Simple discrete data
Discrete data: data that can be counted
Examples: number of children, number of cars
(we cannot say 1.5 children or 2.4 cars)
Example:
Consider the numbers 6 8 3 2 6 5 1 6 8. Find the mean, median and mode.
Mean = (6 + 8 + 3 + 2 + 6 + 5 + 1 + 6 + 8)/9 = 45/9 = 5
Mode = 6 (occurs 3 times)
Median: write the numbers in order: 1 2 3 5 6 6 6 8 8
6 is in the middle, so the median is 6
Or
Median: (n+1)/2 = (9+1)/2 = 5
So the 5th term is the median: number 6
Mean: the sum of all values divided by the number of values
Median: the value that is in the middle
Mode: the value that is the most frequent
Very important! The numbers must be in order. Otherwise we cannot find the median.
What if we had 10 numbers instead of 9?
Example:
1 2 3 5 6 8 8 9 9 9
Now the median is between 6 and 8
So median = (6+8)/2 = 7
Or
Median: (n+1)/2 = (10+1)/2 = 5.5
So the median is between the 5th and the 6th term: number 7
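The two examples above can be checked with a short script. This is only a minimal sketch using Python's standard statistics module; the variable names are chosen for illustration and are not part of the notes.

```python
import statistics

odd_data = [6, 8, 3, 2, 6, 5, 1, 6, 8]        # 9 values
even_data = [1, 2, 3, 5, 6, 8, 8, 9, 9, 9]    # 10 values

# Mean: sum of the values divided by how many values there are
mean_odd = sum(odd_data) / len(odd_data)      # 45 / 9

# Median: statistics.median sorts the data first, then takes the middle
# value (odd count) or the average of the two middle values (even count)
median_odd = statistics.median(odd_data)
median_even = statistics.median(even_data)

# Mode: the most frequent value
mode_odd = statistics.mode(odd_data)

print(mean_odd, median_odd, mode_odd)   # 5.0 6 6
print(median_even)                      # 7.0
```

Note that the library sorts the data itself, which is the step that must not be forgotten when working by hand.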
A hard question:
(a) The mean of the 8 numbers is 7:
(7 + 4 + 5 + 8 + T + 14 + 4 + 4)/8 = 7
(46 + T)/8 = 7
46 + T = 56
T = 56 - 46
T = 10
(b) Mode=4 (occurs 3 times)
(c) Median: The numbers must be in order
4 4 4 5 7 8 10 14
Median: (8+1)/2 = 4.5, so it is between the 4th and the 5th term
So median = (5+7)/2 = 6
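Part (a) comes down to solving (46 + T)/8 = 7 for T. The sketch below repeats that step and then checks parts (b) and (c). It assumes the seven known values are the ones in the ordered list above without the 10; the variable names are illustrative only.

```python
import statistics

known = [7, 4, 5, 8, 14, 4, 4]   # assumed known values; they add up to 46
target_mean = 7
n = len(known) + 1               # eight values once T is included

# mean = (sum(known) + T) / n   =>   T = mean * n - sum(known)
T = target_mean * n - sum(known)

data = sorted(known + [T])       # the numbers must be in order for the median
mode = statistics.mode(data)
median = (data[n // 2 - 1] + data[n // 2]) / 2   # average of the 4th and 5th terms

print(T, data)        # 10 [4, 4, 4, 5, 7, 8, 10, 14]
print(mode, median)   # 4 6.0
```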
II. Discrete data in a frequency table
This is the same as if the frequency table was like the table below:
Number of errors    Number of pages
0                   28
1                   24
2                   20
3                   17
4                   11
Total               100
a. Discrete
b. Mean = (0x28 + 1x24 + 2x20 + 3x17 + 4x11)/100 = 1.59 (or by GDC)
c. Median: 1 (GDC)
d. Mode: 0, because zero is the most frequent number here
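For readers without a GDC at hand, the same three answers can be reproduced from the frequency table. This is a minimal sketch, not the method the notes prescribe; it expands the table into the full list of 100 values in order to find the median.

```python
import statistics

errors = [0, 1, 2, 3, 4]
pages = [28, 24, 20, 17, 11]     # frequency of each number of errors

total = sum(pages)               # 100 pages

# Mean: each value multiplied by its frequency, divided by the total frequency
mean = sum(x * f for x, f in zip(errors, pages)) / total

# Median: write out every value as often as it occurs, then take the middle
expanded = [x for x, f in zip(errors, pages) for _ in range(f)]
median = statistics.median(expanded)

# Mode: the value with the largest frequency
mode = errors[pages.index(max(pages))]

print(mean, median, mode)   # 1.59 1.0 0
```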
III. Grouped discrete or continuous data
Continuous data: data that can be measured, for example weight or height
Mean: to calculate the mean we use the midpoint of each class interval
Median: an estimated middle value, read from the cumulative frequency graph
Mode: now we have a modal group. This is the class interval that has the largest frequency
Example
t (min)             Midpoint    Frequency
40 ≤ t < 50         45          20
50 ≤ t < 60         55          61
60 ≤ t < 70         65          83
70 ≤ t < 80         75          90
80 ≤ t < 90         85          106
90 ≤ t < 100        95          62
100 ≤ t < 110       105         49
110 ≤ t < 120       115         29
Total                           500
Mean = (45x20 + 55x61 + ... + 115x29)/500 = 79.6 min
Modal group = 80 ≤ t < 90
Median: we find the median from the cumulative frequency graph (we will see an example in the following pages)
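The grouped calculation can be sketched as follows. The code uses the midpoint of each interval for the estimated mean, picks the interval with the largest frequency as the modal group, and builds the running totals that would be plotted on a cumulative frequency graph. The tuple layout and names are assumptions made for this illustration.

```python
# (lower bound, upper bound, frequency) for each class interval
groups = [
    (40, 50, 20), (50, 60, 61), (60, 70, 83), (70, 80, 90),
    (80, 90, 106), (90, 100, 62), (100, 110, 49), (110, 120, 29),
]

total = sum(f for _, _, f in groups)     # 500

# Estimated mean: every interval is represented by its midpoint
mean = sum((lo + hi) / 2 * f for lo, hi, f in groups) / total

# Modal group: the interval with the largest frequency
modal_lo, modal_hi, _ = max(groups, key=lambda g: g[2])

# Cumulative frequencies: (upper bound, running total) pairs
running = 0
cumulative = []
for lo, hi, f in groups:
    running += f
    cumulative.append((hi, running))

print(round(mean, 1), (modal_lo, modal_hi))   # 79.6 (80, 90)
print(cumulative[-1])                         # (120, 500)
```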
Measures of dispersion
Range: Is the difference between the largest and the smallest value
Quartiles: Quartiles separate the data into four equal parts. Each of these parts contains
25% of the data
Q1 = 1(n+1)/4 th term: Lower Quartile
Q3 = 3(n+1)/4 th term: Upper Quartile
Q3 - Q1 = IQR is the interquartile range
Box and whisker plot: Is a graphic way to display the median, quartiles and extremes of a
data set on a number line, to show the distribution of the data.
Outlier: Is any value more than 1.5xIQR from the nearest quartile (below the lower quartile or above the upper quartile)
Variance: Is the arithmetic mean of the squared differences between each value and the
mean value and is calculated using GDC only
Standard deviation: Is the square root of variance. Low standard deviation shows that the
data points tend to be very close to the mean and high standard deviation shows that the
data is spread out over a large range of values.
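The outlier rule can be turned into two limits, one below Q1 and one above Q3. The sketch below is an illustration only: the function name is hypothetical, the quartiles Q1 = 23 and Q3 = 30 are borrowed from the simple discrete data example in these notes, and a value exactly on a limit is not counted as an outlier, which is an assumption about how the rule is read.

```python
def outlier_limits(q1, q3):
    """Return the (lower, upper) limits, each 1.5 x IQR away from a quartile."""
    iqr = q3 - q1
    return q1 - 1.5 * iqr, q3 + 1.5 * iqr

data = [20, 23, 23, 25, 26, 29, 30, 30, 31]
low, high = outlier_limits(23, 30)

# Assumption: only values strictly outside the limits are treated as outliers
outliers = [x for x in data if x < low or x > high]

print(low, high, outliers)   # 12.5 40.5 []
```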
Example: Simple discrete data
20 23 23 25 26 29 30 30 31
Range: 31 - 20 = 11
Q1: (n+1)/4 = (9+1)/4 = 10/4 = 2.5
So Q1 is between the 2nd and the 3rd number (23 and 23) → Q1 = 23
Median: (n+1)/2 = 10/2 = 5, so the 5th number is the median = 26
Q3: 3(n+1)/4 = 3(9+1)/4 = 30/4 = 7.5
So Q3 is between the 7th and the 8th number (30 and 30) → Q3 = 30
Q3 - Q1 = I Q R = 30 - 23 = 7
Mean = (20 + 23 + 23 + ... + 31)/9 = 26.3
Variance = σ² = 3.65148371² = 13.3 (GDC)
Standard deviation = 3.65 (GDC)
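The whole example can be reproduced in a few lines. This is a sketch under stated assumptions: the helper term is hypothetical and only handles positions that are whole numbers or end in .5, as in this example, and the population versions of variance and standard deviation are used because they match the GDC values quoted above.

```python
import statistics

data = [20, 23, 23, 25, 26, 29, 30, 30, 31]   # already in order
n = len(data)

def term(position):
    """Value at a 1-based position. A position ending in .5 lies halfway
    between two terms, so the two neighbouring terms are averaged."""
    lower = int(position)             # whole part of the position
    if position == lower:
        return data[lower - 1]
    return (data[lower - 1] + data[lower]) / 2

q1 = term((n + 1) / 4)        # position 2.5
median = term((n + 1) / 2)    # position 5
q3 = term(3 * (n + 1) / 4)    # position 7.5
iqr = q3 - q1

mean = statistics.mean(data)
variance = statistics.pvariance(data)   # population variance
std_dev = statistics.pstdev(data)       # population standard deviation

print(q1, median, q3, iqr)                                     # 23.0 26 30.0 7.0
print(round(mean, 1), round(variance, 1), round(std_dev, 2))   # 26.3 13.3 3.65
```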
STEM & LEAF DIAGRAMS
Example: Find the median weight in kg of 34 eight year-old children
Median weight: (n+1) / 2 = (34 + 1)/2 = 17.5
So we must find the number that is between the 17th and the 18th term
(2+4)/2 = 3, so the median is 30.3 kg
Example
b. i. Median: (n+1)/2 = (19+1)/2 = 10th term = 88 cm
ii. Q1: (n+1)/4 = (19+1)/4 = 5th term = 78 cm
Q3: 3(n+1)/4 = 3(19+1)/4 = 15th term = 103 cm
CUMULATIVE FREQUENCY GRAPH
An example of how a cumulative frequency graph is used, and of what can be asked about it, is given
below:
(a) (50/100) x 80 = 40
Because 50/100 means the middle
So median = 30
(b) IQR = Q3 - Q1
Q1: (25/100) x 80 = 20, so Q1 = 20
Q3: (75/100) x 80 = 60, so Q3 = 45
Q3 - Q1 = 45 - 20 = 25
(c) (35/100) x 80 = 28
So 35th percentile = 23
(d) 50 or above: (10/80) x 100% = 12.5%
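Every part of this example starts with the same step: turn a percentage into a cumulative frequency, then read the matching value off the graph. The sketch below covers only that arithmetic, since the graph itself is not reproduced here. The function name is hypothetical, and the total of 80 and the count of 10 values at 50 or above are taken from the working above.

```python
total = 80   # total frequency, read from the top of the graph

def position(percent):
    """Cumulative frequency at which a given percentile is read off the graph."""
    return percent * total / 100

median_pos = position(50)   # part (a): median
q1_pos = position(25)       # part (b): lower quartile
q3_pos = position(75)       # part (b): upper quartile
part_c_pos = position(35)   # part (c)

# Part (d): 10 of the 80 values are 50 or above, expressed as a percentage
share_above = 10 * 100 / total

print(median_pos, q1_pos, q3_pos, part_c_pos)   # 40.0 20.0 60.0 28.0
print(share_above)                              # 12.5
```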