Summary Statistics

One of the main purposes of statistics is to draw conclusions about a (usually large) population from a (usually small) sample of observed values.

Population – The collection of all individuals, items or data under consideration in a statistical study.

Sample – That part of the population from which information is collected.

Descriptive Statistics – Methods of organizing and summarizing information in a clear and effective way.

Inferential Statistics – Methods of drawing conclusions about a population based on information obtained from a sample of the population.

Parameter – A descriptive measure for a population.

Statistic – A descriptive measure for a sample.

Summary statistics for Discrete Data

Median: The middle value after the observations have been arranged in order of magnitude. If the number of values is even, the median is halfway between the two middle values.

Mode: The value (or values) that occurs most frequently.

For grouped data, the term modal class is used.

Mean:

The Mean of the Population is generally unknown.

The Sample Mean serves as an unbiased estimate of the Population Mean.

Population Mean: $\mu = \frac{\sum_{i=1}^{k} f_i x_i}{n}$

Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{k} f_i x_i}{n}$

The shoe sizes of the members of a football team are:

10, 10, 8, 11, 10, 9, 9, 10, 11, 9, 10

Determine the mean, median and mode.

Answer: 9.73, 10, 10
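The three measures can be checked with a short sketch using Python's standard `statistics` module. The variable names are illustrative only.

```python
from statistics import mean, median, multimode

# Shoe sizes from the example above
shoe_sizes = [10, 10, 8, 11, 10, 9, 9, 10, 11, 9, 10]

sample_mean = mean(shoe_sizes)        # sum of the values divided by their count
sample_median = median(shoe_sizes)    # middle value of the sorted data
sample_modes = multimode(shoe_sizes)  # list of the most frequent value(s)

print(round(sample_mean, 2), sample_median, sample_modes)
```

`multimode` returns a list, so a data set with several equally frequent values would report all of them.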

Mean of Grouped Data

$\bar{x} = \frac{\sum_{j=1}^{k} f_j x_j}{n}$, where each value $x_j$ is the midpoint of its class.

Each day, the number of diners, x, in a restaurant was recorded, and the following grouped frequency table was obtained.

Number of diners, x   16-20   21-25   26-30   31-35   36-40
Frequency (days)         67      74      38      39      42

Using the grouped data above, find an estimate of the mean.

Answer: 26.4
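A minimal sketch of the grouped-mean estimate, assuming each class is represented by its midpoint. The helper name `grouped_mean` is hypothetical.

```python
def grouped_mean(classes, freqs):
    """Estimate the mean of grouped data using class midpoints."""
    midpoints = [(low + high) / 2 for low, high in classes]
    n = sum(freqs)
    return sum(f * x for f, x in zip(freqs, midpoints)) / n

# Classes and frequencies from the restaurant table
classes = [(16, 20), (21, 25), (26, 30), (31, 35), (36, 40)]
freqs = [67, 74, 38, 39, 42]

estimate = grouped_mean(classes, freqs)
print(round(estimate, 1))
```

The result is only an estimate, because the individual values inside each class are not known.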

Find the mean for the following set of data:

Interval   Frequency
20-24      1
25-29      6
30-34      10
35-39      2
40-44      1

Answer: 31

A test marked out of 100 is written by 800 students. The cumulative frequency graph for the marks is given below.

(a) Write down the number of students who scored 40 marks or less on the test.

(b) The middle 50% of test results lie between marks a and b, where a < b. Find a and b.

Answers: (a) 100 (b) a = 55, b = 75

SPEC06/HL1/3

The heights of 60 children entering a school were measured. The following cumulative frequency graph illustrates the data obtained.

Estimate

(a) the median height;

(b) the mean height.

Answers: Median = 1.04, Mean = 1.05

M04/HL1/15

The box-and-whisker plots shown represent the heights of female students and the heights of male students at a certain school.

(a) What percentage of female students are shorter than any male students?

(b) What percentage of male students are shorter than some female students?

(c) From the diagram, estimate the mean height of the male students.

Answer: (a) 25% (b) 75% (c) 172 cm

N00/HL1/4

A die is rolled twenty times with the following results:

Outcome     1   2   3   4   5   6
Frequency   2   4   a   7   2   b

Given that the mean is 3.6, find the values of a and b.

Answer: a = 2, b = 3
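The two conditions (the frequencies sum to 20, and the mean is 3.6) can be checked by a small search. This is a sketch, not the intended algebraic solution, and the names used are illustrative.

```python
n_rolls = 20
target_mean = 3.6

# Known frequencies by outcome; outcomes 3 and 6 have unknown frequencies a and b
known = {1: 2, 2: 4, 4: 7, 5: 2}
known_total = sum(outcome * freq for outcome, freq in known.items())
remaining = n_rolls - sum(known.values())  # this is a + b

solutions = []
for a in range(remaining + 1):
    b = remaining - a
    total = known_total + 3 * a + 6 * b  # sum of all twenty outcomes
    if abs(total / n_rolls - target_mean) < 1e-9:
        solutions.append((a, b))

print(solutions)
```

Algebraically, the same conditions reduce to a + b = 5 and 3a + 6b = 24.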

Variance

The variance measures how far the data are spread out from the mean.

Population Variance

$\sigma^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{n} = \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - \mu^2$

The population variance is generally unknown. The sample variance is a biased estimate of it: on average, it slightly underestimates the population variance.

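Sample Variance

$s_n^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n} = \frac{\sum_{i=1}^{k} f_i x_i^2}{n} - \bar{x}^2$

Unbiased Estimate of the Population Variance

$s_{n-1}^2 = \frac{n}{n-1} s_n^2 = \frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n-1} = \frac{\sum_{i=1}^{k} f_i x_i^2}{n-1} - \frac{n}{n-1}\bar{x}^2$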

Therefore, we also have an unbiased estimate of the Population Variance.
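As a rough illustration of the two divisors, the sketch below reuses the shoe-size data from the earlier example. In Python's standard `statistics` module, `pvariance` divides by n and `variance` divides by n - 1.

```python
from statistics import pvariance, variance

# Shoe sizes from the earlier example, treated here as a sample
data = [10, 10, 8, 11, 10, 9, 9, 10, 11, 9, 10]
n = len(data)

var_n = pvariance(data)          # divides by n (the biased form)
var_n_minus_1 = variance(data)   # divides by n - 1 (the unbiased estimate)

# The two are linked by the factor n / (n - 1)
rescaled = n / (n - 1) * var_n

print(var_n, var_n_minus_1, rescaled)
```

The difference between the two values shrinks as n grows, since n / (n - 1) approaches 1.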

Standard Deviation:

The Standard Deviation is the square root of the variance.

Population Standard Deviation: $\sigma = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \mu)^2}{n}}$

Sample Standard Deviation: $s_n = \sqrt{\frac{\sum_{i=1}^{k} f_i (x_i - \bar{x})^2}{n}}$

The nine planets of the solar system have approximate equatorial diameters (in thousands of km) as follows:

4.9, 12.1, 12.8, 6.8, 142.8, 120.0, 52.4, 49.5, 2.5

Determine the standard deviation of these diameters.

Answer: 49.7
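The stated answer corresponds to dividing by n, which makes sense here because the nine planets are treated as the whole population. A minimal check with the standard `statistics` module:

```python
from statistics import pstdev, stdev

# Equatorial diameters in thousands of km, from the example above
diameters = [4.9, 12.1, 12.8, 6.8, 142.8, 120.0, 52.4, 49.5, 2.5]

population_sd = pstdev(diameters)  # divides by n
sample_sd = stdev(diameters)       # divides by n - 1, shown for comparison

print(round(population_sd, 1), round(sample_sd, 1))
```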

Find the standard deviation for the following set of data:

Interval   Frequency
1-7        4
8-14       5
15-21      10
22-28      6

Answer: 7.01

Grouped data for the number of days to maturity of 40 short-term investments are given below. Compute the sample mean and standard deviation.

Days to maturity   Frequency
30-39              3
40-49              1
50-59              8
60-69              10
70-79              7
80-89              7
90-99              4
Total              40

Answer: Mean = 68.0 days, Standard Deviation = 16.2 days

Find the unbiased standard deviation:

Answer: Unbiased Standard Deviation = 16.4 days
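Both answers can be reproduced with a short sketch that takes class midpoints as the x values and differs only in the divisor. Variable names are illustrative.

```python
from math import sqrt

# Days-to-maturity classes and their frequencies
classes = [(30, 39), (40, 49), (50, 59), (60, 69), (70, 79), (80, 89), (90, 99)]
freqs = [3, 1, 8, 10, 7, 7, 4]

midpoints = [(low + high) / 2 for low, high in classes]
n = sum(freqs)

mean = sum(f * x for f, x in zip(freqs, midpoints)) / n
sum_sq_dev = sum(f * (x - mean) ** 2 for f, x in zip(freqs, midpoints))

sd_n = sqrt(sum_sq_dev / n)                # sample standard deviation (divide by n)
sd_n_minus_1 = sqrt(sum_sq_dev / (n - 1))  # based on the unbiased variance estimate

print(round(mean, 1), round(sd_n, 1), round(sd_n_minus_1, 1))
```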

A machine tests the distance, w, measured in thousands of km, that car tires travel before the tire wear reaches a critical amount. For a random sample of tires, the results are summarized below:

Distance w (thousands of km)   Frequency
0 to 25                        12
25 to 30                       23
30 to 35                       48
35 to 45                       15
45 to 60                       3

(a) Find the grouped mean for this data.

(b) Find the unbiased estimate of the variance based on the grouped data.

Answers:

(a) 30.7 thousand km (30 700 km)

(b) 70.9 (thousands of km)^2
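A sketch of the calculation using the computational form of the unbiased variance estimate, with class midpoints and w kept in thousands of km. Under those assumptions the variance comes out in squared thousands of km.

```python
# Class boundaries (thousands of km) and frequencies from the table
classes = [(0, 25), (25, 30), (30, 35), (35, 45), (45, 60)]
freqs = [12, 23, 48, 15, 3]

midpoints = [(low + high) / 2 for low, high in classes]
n = sum(freqs)

sum_fx = sum(f * x for f, x in zip(freqs, midpoints))
sum_fx2 = sum(f * x ** 2 for f, x in zip(freqs, midpoints))

grouped_mean = sum_fx / n
# Unbiased estimate: (sum f x^2 - n * mean^2) / (n - 1)
unbiased_var = (sum_fx2 - n * grouped_mean ** 2) / (n - 1)

print(round(grouped_mean, 1), round(unbiased_var, 1))
```

The classes have unequal widths, which does not affect the method because only the midpoints and frequencies enter the sums.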

A sample of 70 batteries was tested to see how long they last. The results were:

Determine:

(a) the sample standard deviation

(b) an unbiased estimate of the standard deviation from which this sample is taken.

Answers:

(a) 21.4 hours

(b) 21.6 hours

M00/HL1/4

For a set of 9 numbers, $\sum (x - \bar{x})^2 = 60$ and $\sum x^2 = 285$. Find the mean of the numbers.

Answer: mean = 5

For a given frequency distribution, $\sum f(x - \bar{x})^2 = 182.3$, $\sum f x^2 = 1025$ and $\sum f = 30$, find $\sum fx$.

Answer: 159
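Both problems rest on the identity that the sum of squared deviations equals the sum of squares minus n times the squared mean. The sketch below uses a hypothetical helper, `mean_from_sums`, and assumes the mean is non-negative, since the identity only fixes its square.

```python
from math import sqrt

def mean_from_sums(sum_sq_dev, sum_sq, n):
    """Recover the mean from sum f(x - mean)^2 and sum f x^2 (non-negative root)."""
    # sum f(x - mean)^2 = sum f x^2 - n * mean^2
    return sqrt((sum_sq - sum_sq_dev) / n)

# Nine numbers: sum (x - mean)^2 = 60, sum x^2 = 285
mean_nine = mean_from_sums(60, 285, 9)

# Frequency distribution: sum f(x - mean)^2 = 182.3, sum f x^2 = 1025, sum f = 30
mean_freq = mean_from_sums(182.3, 1025, 30)
sum_fx = 30 * mean_freq  # sum f x = n * mean

print(mean_nine, round(sum_fx, 1))
```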

A teacher drives to school. She records the time taken on each of 20 randomly chosen days. She finds that

$\sum_{i=1}^{20} x_i = 626$ and $\sum_{i=1}^{20} x_i^2 = 19780.8$,

where $x_i$ denotes the time, in minutes, taken on the ith day.

Calculate an unbiased estimate of

(a) the mean time taken to drive to school;

(b) the variance of the time taken to drive to school.

Answers:

(a) 31.3

(b) 9.84

M03/HL1/19
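A short check of both estimates from the two given sums, using the n - 1 divisor for the variance. Variable names are illustrative.

```python
n = 20
sum_x = 626          # sum of the recorded times, in minutes
sum_x_sq = 19780.8   # sum of the squared times

mean_estimate = sum_x / n
# Unbiased variance estimate: (sum x^2 - n * mean^2) / (n - 1)
variance_estimate = (sum_x_sq - n * mean_estimate ** 2) / (n - 1)

print(round(mean_estimate, 1), round(variance_estimate, 2))
```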

Chebychev’s Rule

Property 1: At least 75% of the data lie within two standard deviations to either side of the mean.

Property 2: At least 89% of the data lie within three standard deviations to either side of the mean.

Property 3: In general, for any number k > 1, at least $1 - \frac{1}{k^2}$ of the data lie within k standard deviations to either side of the mean.
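Properties 1 and 2 are the general bound 1 - 1/k^2 evaluated at k = 2 and k = 3. A minimal sketch, with a hypothetical function name:

```python
def chebyshev_lower_bound(k):
    """Minimum fraction of data within k standard deviations of the mean."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k ** 2

for k in (2, 3):
    print(k, round(chebyshev_lower_bound(k), 3))
```

The bound holds for any data set, whatever its shape, which is why it is fairly loose.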

Z score: The z-score for a data value is the number of standard deviations that the data value is away from the mean.

Sample z-score: $z = \frac{x - \bar{x}}{s}$

Population z-score: $z = \frac{x - \mu}{\sigma}$

It is known that a coffee machine dispenses an average of 6 fluid ounces of coffee with a standard deviation of 0.2 fluid ounces. A cup of coffee dispensed from the machine is found to contain 7.1 fluid ounces of coffee. Determine and interpret the z-score for this cup of coffee. Does this cup of coffee contain an unusually large amount?

Answer:

The z-score is (7.1 - 6)/0.2 = 5.5, so the cup is 5.5 standard deviations above the mean.

By Chebychev’s rule, at least 96.7% of all cups dispensed lie within 5.5 standard deviations of the mean, so this cup contains an unusually large amount.
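The 96.7% figure matches the bound 1 - 1/k^2 evaluated at k = 5.5. A sketch of both steps, assuming that bound is how the figure was obtained:

```python
mean_volume = 6.0  # fluid ounces
std_dev = 0.2      # fluid ounces
observed = 7.1     # fluid ounces in this cup

# Number of standard deviations the observation lies from the mean
z = (observed - mean_volume) / std_dev

# Minimum fraction of cups within |z| standard deviations of the mean
fraction_within = 1 - 1 / z ** 2

print(round(z, 1), round(fraction_within, 3))
```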

Two major topics of inferential statistics:

1. Using the sample mean to make inferences about the population mean.

2. Using the sample standard deviation to make inferences about the population standard deviation.