Chapter 3 Descriptive Statistics: Numerical Methods

Page 1: Chapter 3 Descriptive Statistics: Numerical Methods

Chapter 3

Descriptive Statistics: Numerical Methods

Page 2: Chapter 3 Descriptive Statistics: Numerical Methods

Measures of Location

• The Mean (A.M., G.M., and H.M.)
• The Median
• The Mode
• Percentiles
• Quartiles

Page 3: Chapter 3 Descriptive Statistics: Numerical Methods

Summary Measures

Describing Data Numerically

• Center and Location: Mean, Median, Mode, Weighted Mean, Percentiles, Quartiles
• Variation: Range, Variance, Standard Deviation, Coefficient of Variation

Page 4: Chapter 3 Descriptive Statistics: Numerical Methods

Mean

• The mean is the average of the data values.
• It is the most common measure of central tendency.
• Mean = sum of values divided by the number of values.

[Figure: two dot plots on an axis from 0 to 10. First panel: Mean = 3. Second panel: Mean = 4, computed as (1 + 2 + 3 + 4 + 10)/5 = 20/5 = 4.]

Page 5: Chapter 3 Descriptive Statistics: Numerical Methods

Mean

The mean (or average) is the basic measure of location, or “central tendency,” of the data.

• The sample mean $\bar{x}$ is a sample statistic.

• The population mean $\mu$ is a population parameter.

Page 6: Chapter 3 Descriptive Statistics: Numerical Methods

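Mean

Sample mean:

$$ \bar{x} = \frac{\sum_{i=1}^{n} x_i}{n} = \frac{x_1 + x_2 + \cdots + x_n}{n} $$

Population mean:

$$ \mu = \frac{\sum_{i=1}^{N} x_i}{N} = \frac{x_1 + x_2 + \cdots + x_N}{N} $$

n = Sample Size

N = Population Size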

Page 7: Chapter 3 Descriptive Statistics: Numerical Methods

Example: College Class Size

We have the following sample of data for 5 college classes:

46 54 42 46 32

We use the notation x1, x2, x3, x4, and x5 to represent the number of students in each of the 5 classes:

x1 = 46 x2 = 54 x3 = 42 x4 = 46 x5 = 32

Thus we have:

$$ \bar{x} = \frac{\sum x_i}{n} = \frac{x_1 + x_2 + x_3 + x_4 + x_5}{5} = \frac{46 + 54 + 42 + 46 + 32}{5} = 44 $$

The average class size is 44 students.
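The class size calculation can be reproduced with a few lines of Python. This is a minimal sketch; the function name `sample_mean` is our own choice for illustration and does not come from the slides.

```python
# Class sizes from the example (a sample of n = 5 classes).
class_sizes = [46, 54, 42, 46, 32]


def sample_mean(values):
    """Return the sum of the values divided by the number of values."""
    if not values:
        raise ValueError("the mean of an empty data set is undefined")
    return sum(values) / len(values)


mean_class_size = sample_mean(class_sizes)
print(mean_class_size)  # 44.0
```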

Page 8: Chapter 3 Descriptive Statistics: Numerical Methods

Median

The median is the value in the middle when the data are arranged in ascending order (from smallest value to largest value).

a. For an odd number of observations the median is the middle value.

b. For an even number of observations the median is the average of the two middle values.

Page 9: Chapter 3 Descriptive Statistics: Numerical Methods

The College Class Size example

First, arrange the data in ascending order:

32 42 46 46 54

Notice that n = 5, an odd number. Thus the median is given by the middle value, which is the 3rd of the 5 ordered values.

The median class size is 46.

Page 10: Chapter 3 Descriptive Statistics: Numerical Methods

Median Starting Salary for a Sample of 12 Business School Graduates

A college placement office has obtained the following data for 12 recent graduates:

Graduate   Starting Salary   Graduate   Starting Salary
1          2850              7          2890
2          2950              8          3130
3          3050              9          2940
4          2880              10         3325
5          2755              11         2920
6          2710              12         2880

Page 11: Chapter 3 Descriptive Statistics: Numerical Methods

First we arrange the data in ascending order:

2710 2755 2850 2880 2880 2890 2920 2940 2950 3050 3130 3325

Notice that n = 12, an even number. Thus we take an average of the middle 2 observations, the 6th and 7th values (2890 and 2920):

$$ \text{Median} = \frac{2890 + 2920}{2} = 2905 $$
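The odd and even cases of the median rule can be sketched in Python as follows. The helper name `median` is our own; the data are the class sizes and starting salaries from the two examples.

```python
def median(values):
    """Middle value of the sorted data; average of the two middle values if n is even."""
    if not values:
        raise ValueError("the median of an empty data set is undefined")
    ordered = sorted(values)        # arrange in ascending order
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:                  # odd n: the single middle value
        return ordered[mid]
    # even n: average of the two middle values
    return (ordered[mid - 1] + ordered[mid]) / 2


class_sizes = [46, 54, 42, 46, 32]
salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

print(median(class_sizes))  # 46
print(median(salaries))     # 2905.0
```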

Page 12: Chapter 3 Descriptive Statistics: Numerical Methods

Mode

The mode is the value that occurs with greatest frequency.

• A measure of central tendency
• Value that occurs most often
• There may be no mode
• There may be several modes

[Figure: two dot plots. First panel: axis from 0 to 14, Mode = 5. Second panel: axis from 0 to 6, No Mode.]

Page 13: Chapter 3 Descriptive Statistics: Numerical Methods

The Mode

MODE The value of the observation that appears most frequently.
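A short Python sketch of the definition follows. It returns a list so that the "no mode" and "several modes" cases are both visible. The function name `modes` and the second and third data sets are made up for illustration; only the class size data come from the slides.

```python
from collections import Counter


def modes(values):
    """Return all values tied for the highest frequency.

    Returns an empty list when no value appears more than once,
    following the convention that such data have no mode.
    """
    counts = Counter(values)
    if not counts:
        return []
    highest = max(counts.values())
    if highest == 1:
        return []
    return sorted(v for v, c in counts.items() if c == highest)


print(modes([46, 54, 42, 46, 32]))  # [46]    one mode (class size data)
print(modes([1, 2, 3, 4, 5]))       # []      no mode (illustrative data)
print(modes([1, 1, 2, 2, 3]))       # [1, 2]  two modes (illustrative data)
```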

Page 14: Chapter 3 Descriptive Statistics: Numerical Methods

Characteristics of the Mean

1. The most widely used measure of location.
2. Major characteristics:
   • All values are used.
   • It is unique.
   • It is calculated by summing the values and dividing by the number of values.
3. Weakness: Its value can be distorted when data values that are extremely large or extremely small compared to the majority of the data are present.

Properties and Uses of the Median

1. There is a unique median for each data set.
2. Not affected by extremely large or small values and is therefore a valuable measure of central tendency when such values occur.

Characteristics of the Mode

1. Mode: the value of the observation that appears most frequently.
2. Advantage: Not affected by extremely high or low values.
3. Disadvantages:
   • For many sets of data, there is no mode because no value appears more than once.
   • For some data sets there is more than one mode.

Page 15: Chapter 3 Descriptive Statistics: Numerical Methods

Weighted Mean

• When the mean is computed by giving each data value a weight that reflects its importance, it is referred to as a weighted mean.
• In the computation of a grade point average (GPA), the weights are the number of credit hours earned for each grade.
• When data values vary in importance, the analyst must choose the weight that best reflects the importance of each value.

Page 16: Chapter 3 Descriptive Statistics: Numerical Methods

Weighted Mean

$$ \bar{x} = \frac{\sum w_i x_i}{\sum w_i} $$

where:

$x_i$ = value of observation i

$w_i$ = weight for observation i

Page 17: Chapter 3 Descriptive Statistics: Numerical Methods

Mean for Grouped Data

Sample Data

$$ \bar{x} = \frac{\sum f_i M_i}{\sum f_i} $$

Population Data

$$ \mu = \frac{\sum f_i M_i}{N} $$

where:

$f_i$ = frequency of class i

$M_i$ = midpoint of class i

Page 18: Chapter 3 Descriptive Statistics: Numerical Methods

Weighted Mean

Used when values are grouped by frequency or relative importance.

Example: Sample of 26 Repair Projects

Days to Complete   Frequency
5                  4
6                  12
7                  8
8                  2

Weighted Mean Days to Complete:

$$ \bar{X}_W = \frac{\sum w_i x_i}{\sum w_i} = \frac{(4)(5) + (12)(6) + (8)(7) + (2)(8)}{4 + 12 + 8 + 2} = \frac{164}{26} = 6.31 \text{ days} $$
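The repair project calculation can be checked with a small Python sketch of the weighted mean formula. The function name `weighted_mean` and its argument order are our own assumptions for illustration.

```python
def weighted_mean(values, weights):
    """Return sum(w_i * x_i) / sum(w_i)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the weights must not sum to zero")
    return sum(w * x for x, w in zip(values, weights)) / total_weight


# Repair project data: days to complete (x_i) and frequency (w_i).
days = [5, 6, 7, 8]
frequency = [4, 12, 8, 2]

result = weighted_mean(days, frequency)
print(round(result, 2))  # 6.31
```

Here the frequencies act as the weights, so the result is the same as the ordinary mean of all 26 individual completion times.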

Page 19: Chapter 3 Descriptive Statistics: Numerical Methods

Example: Apartment Rents

Given below is the previous sample of monthly rents for one-bedroom apartments, presented here as grouped data in the form of a frequency distribution.

Rent ($)   Frequency
420-439    8
440-459    17
460-479    12
480-499    8
500-519    7
520-539    4
540-559    2
560-579    4
580-599    2
600-619    6

Page 20: Chapter 3 Descriptive Statistics: Numerical Methods

Example: Apartment Rents

Mean for Grouped Data

Rent ($)   f_i   M_i     f_i M_i
420-439    8     429.5   3436.0
440-459    17    449.5   7641.5
460-479    12    469.5   5634.0
480-499    8     489.5   3916.0
500-519    7     509.5   3566.5
520-539    4     529.5   2118.0
540-559    2     549.5   1099.0
560-579    4     569.5   2278.0
580-599    2     589.5   1179.0
600-619    6     609.5   3657.0
Total      70            34525.0

$$ \bar{x} = \frac{34{,}525}{70} = 493.21 $$

This approximation differs by $2.41 from the actual sample mean of $490.80.
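The grouped-data calculation can be sketched in Python as below. Each class is stored as (lower limit, upper limit, frequency), and the midpoint is taken as the average of the two class limits, which matches the M_i column. The names `rent_classes` and `grouped_mean` are our own.

```python
# (lower limit, upper limit, frequency) for each rent class.
rent_classes = [
    (420, 439, 8), (440, 459, 17), (460, 479, 12), (480, 499, 8),
    (500, 519, 7), (520, 539, 4), (540, 559, 2), (560, 579, 4),
    (580, 599, 2), (600, 619, 6),
]


def grouped_mean(classes):
    """Approximate the mean as sum(f_i * M_i) / sum(f_i)."""
    total_f = sum(f for _, _, f in classes)
    if total_f == 0:
        raise ValueError("the total frequency must be positive")
    # M_i is the midpoint of the class limits.
    total_fm = sum(f * (lo + hi) / 2 for lo, hi, f in classes)
    return total_fm / total_f


approx_mean = grouped_mean(rent_classes)
print(round(approx_mean, 2))           # 493.21
print(round(approx_mean - 490.80, 2))  # 2.41
```

The result is only an approximation because every observation in a class is treated as if it were equal to the class midpoint.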

Page 21: Chapter 3 Descriptive Statistics: Numerical Methods

Review Example

Five houses on a hill by the beach

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000

Page 22: Chapter 3 Descriptive Statistics: Numerical Methods

Summary Statistics

House Prices: $2,000,000; $500,000; $300,000; $100,000; $100,000 (sum: $3,000,000)

• Mean: $3,000,000 / 5 = $600,000
• Median: middle value of ranked data = $300,000
• Mode: most frequent value = $100,000
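The three summary statistics for the house prices can be checked with Python's standard `statistics` module. The final "what-if" step, which drops the most expensive house, is our own addition and not part of the slides; it is included only to illustrate how strongly a single extreme value pulls the mean.

```python
import statistics

house_prices = [2_000_000, 500_000, 300_000, 100_000, 100_000]

mean_price = statistics.mean(house_prices)
median_price = statistics.median(house_prices)
mode_price = statistics.mode(house_prices)

print(f"Mean:   ${mean_price:,.0f}")    # Mean:   $600,000
print(f"Median: ${median_price:,.0f}")  # Median: $300,000
print(f"Mode:   ${mode_price:,.0f}")    # Mode:   $100,000

# What-if (not from the slides): remove the $2,000,000 house.
without_top = sorted(house_prices)[:-1]
print(f"Mean without top house:   ${statistics.mean(without_top):,.0f}")    # $250,000
print(f"Median without top house: ${statistics.median(without_top):,.0f}")  # $200,000
```

Only one of the five houses costs more than the mean of $600,000, which is why the median is often preferred for skewed data such as prices.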

Page 23: Chapter 3 Descriptive Statistics: Numerical Methods

Percentiles

The pth percentile is a value such that at least p percent of the observations are less than or equal to this value and at least (100 – p) percent of the observations are greater than or equal to this value.

Example: I scored in the 70th percentile on the Graduate Record Exam (GRE), meaning I scored higher than 70 percent of those who took the exam.

Page 24: Chapter 3 Descriptive Statistics: Numerical Methods

Calculating the pth Percentile

• Step 1: Arrange the data in ascending order (smallest value to largest value).

• Step 2: Compute an index i

$$ i = \left( \frac{p}{100} \right) n $$

where p is the percentile of interest and n is the number of observations.

• Step 3: (a) If i is not an integer, round up. The next integer greater than i denotes the position of the pth percentile.

(b) If i is an integer, the pth percentile is the average of the values in positions i and i + 1.

Page 25: Chapter 3 Descriptive Statistics: Numerical Methods

Example: Starting Salaries of Business Grads

Let’s compute the 85th percentile using the starting salary data.

Step 1: Arrange the data in ascending order.

2710 2755 2850 2880 2880 2890 2920 2940 2950 3050 3130 3325

Step 2: Compute the index.

$$ i = \left( \frac{p}{100} \right) n = \left( \frac{85}{100} \right) 12 = 10.2 $$

Step 3: Since 10.2 is not an integer, round up to 11. The 85th percentile is the value in the 11th position (3130).
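The three-step rule can be sketched in Python as follows. The function name `percentile` is our own, and we assume 0 < p < 100 so that position i + 1 always exists in Step 3(b). Note that library routines such as `numpy.percentile` use linear interpolation by default, so they may return slightly different values than this textbook rule.

```python
import math


def percentile(values, p):
    """pth percentile using the three-step index rule (assumes 0 < p < 100)."""
    if not values:
        raise ValueError("the data set must not be empty")
    if not 0 < p < 100:
        raise ValueError("p must be strictly between 0 and 100")
    ordered = sorted(values)      # Step 1: ascending order
    n = len(ordered)
    i = p * n / 100               # Step 2: index i = (p/100) n
    if i == int(i):               # Step 3(b): i is an integer
        k = int(i)
        # average of the values in positions i and i + 1 (1-based)
        return (ordered[k - 1] + ordered[k]) / 2
    position = math.ceil(i)       # Step 3(a): round up
    return ordered[position - 1]  # convert 1-based position to 0-based index


salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

print(percentile(salaries, 85))  # 3130
```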

Page 26: Chapter 3 Descriptive Statistics: Numerical Methods

Quartiles

Quartiles are just specific percentiles.

Let:

Q1 = first quartile, or 25th percentile

Q2 = second quartile, or 50th percentile (also the median)

Q3 = third quartile, or 75th percentile

Let’s compute the 1st and 3rd quartiles using the starting salary data. Note that we already computed the median for this sample, so we know the 2nd quartile.

Page 27: Chapter 3 Descriptive Statistics: Numerical Methods

Now find the 25th percentile:

$$ i = \left( \frac{p}{100} \right) n = \left( \frac{25}{100} \right) 12 = 3 $$

2710 2755 2850 2880 2880 2890 2920 2940 2950 3050 3130 3325

Note that 3 is an integer, so to find the 25th percentile we must average together the 3rd and 4th values:

Q1 = (2850 + 2880)/2 = 2865

Now find the 75th percentile:

$$ i = \left( \frac{p}{100} \right) n = \left( \frac{75}{100} \right) 12 = 9 $$

Note that 9 is an integer, so to find the 75th percentile we must average together the 9th and 10th values:

Q3 = (2950 + 3050)/2 = 3000

Page 28: Chapter 3 Descriptive Statistics: Numerical Methods

Quartiles for the Starting Salary Data

2710 2755 2850 2880 2880 2890 2920 2940 2950 3050 3130 3325

Q1 = 2865

Q2 = 2905 (Median)

Q3 = 3000
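All three quartiles can be obtained by calling the same index rule with p = 25, 50, and 75. The sketch below repeats a compact version of that rule so the block runs on its own; the names `index_rule_percentile` and `quartiles` are hypothetical, chosen for illustration.

```python
import math


def index_rule_percentile(values, p):
    """pth percentile via i = (p/100) n; assumes 0 < p < 100 and non-empty data."""
    ordered = sorted(values)
    i = p * len(ordered) / 100
    if i == int(i):
        k = int(i)
        # i is an integer: average the values in positions i and i + 1
        return (ordered[k - 1] + ordered[k]) / 2
    # i is not an integer: round up to get the position
    return ordered[math.ceil(i) - 1]


def quartiles(values):
    """Return (Q1, Q2, Q3) as the 25th, 50th, and 75th percentiles."""
    return tuple(index_rule_percentile(values, p) for p in (25, 50, 75))


salaries = [2850, 2950, 3050, 2880, 2755, 2710,
            2890, 3130, 2940, 3325, 2920, 2880]

q1, q2, q3 = quartiles(salaries)
print(q1, q2, q3)  # 2865.0 2905.0 3000.0
```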