S519: Evaluation of Information Systems Social Statistics Chapter 7: Are your curves normal?


S519: Evaluation of Information Systems

Social Statistics

Chapter 7: Are your curves normal?


This week

Why is understanding probability important?
What is a normal curve?
How to compute and interpret z scores


What is probability?

The chance of winning a lottery
The chance to get a head on one flip of a coin
Determine the degree of confidence to state a finding


Normal distribution

Figure 7.4 – P157
Almost 100% of the scores fall between (-3SD, +3SD)
Around 34% of the scores fall between (0, 1SD)

Are all distributions normal?


Normal distribution

The distance between    Contains               Range (if mean=100, SD=10)
Mean and 1SD            34.13% of all cases    100-110
1SD and 2SD             13.59% of all cases    110-120
2SD and 3SD             2.15% of all cases     120-130
> 3SD                   0.13% of all cases     >130
Mean and -1SD           34.13% of all cases    90-100
-1SD and -2SD           13.59% of all cases    80-90
-2SD and -3SD           2.15% of all cases     70-80
< -3SD                  0.13% of all cases     <70
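
The band percentages in the table can be checked against the standard normal curve. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `band_percent` is an illustrative assumption, not something from the slides. The 2SD to 3SD band computes to roughly 2.14%, so the table's 2.15% appears to be rounded so that each half of the curve totals 50%.

```python
from statistics import NormalDist

# Standard normal distribution (mean 0, SD 1). The band shares are the same
# for any normal distribution once scores are expressed in SD units.
std_normal = NormalDist(mu=0.0, sigma=1.0)


def band_percent(lower_sd: float, upper_sd: float) -> float:
    """Percentage of cases between two cut points given in SD units."""
    return 100 * (std_normal.cdf(upper_sd) - std_normal.cdf(lower_sd))


bands = {
    "mean to 1SD": band_percent(0, 1),
    "1SD to 2SD": band_percent(1, 2),
    "2SD to 3SD": band_percent(2, 3),
    "beyond 3SD": 100 * (1 - std_normal.cdf(3)),
}

for label, pct in bands.items():
    print(f"{label}: {pct:.2f}%")
```

The bands below the mean are mirror images of these, because the normal curve is symmetric.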


Z score – standard score

If you want to compare individuals in different distributions

Z scores are comparable because they are standardized in units of standard deviations.


Z score

Standard score

$$z = \frac{X - \bar{X}}{s}$$

$X$: the individual score

$\bar{X}$: the mean

$s$: standard deviation

Sample or population?


Z score

Mean and SD for Z distribution?

Mean=25, SD=2, what is the z score for 23, 27, 30?
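
A minimal sketch of this computation, assuming the usual definition of a z score as the raw score minus the mean, divided by the standard deviation. The function name `z_score` is chosen here for illustration.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies above (+) or below (-) the mean."""
    return (x - mean) / sd


# The three raw scores from the slide, with mean=25 and SD=2.
for raw in (23, 27, 30):
    print(raw, z_score(raw, mean=25, sd=2))
# prints: 23 -1.0, then 27 1.0, then 30 2.5
```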


Z score

Z scores across different distributions are comparable

A z score represents a distance of z standard deviations from the mean

Raw score 12.8 (mean=12, SD=2) z=+0.4
Raw score 64 (mean=58, SD=15) z=+0.4

Equal distances from the mean


Comparing apples and oranges:

Eric competes in two track events: standing long jump and javelin. His long jump was 49 inches, and his javelin throw was 92 ft. He then measures the other competitors' results in both events and calculates the mean and standard deviation:

Javelin: M = 86ft, s = 10ft
Long Jump: M = 44, s = 4
Which event did Eric do best in?
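
One way to answer this is to convert both results to z scores, which puts feet and inches on the same scale. This is a sketch; the function and variable names are illustrative assumptions.

```python
def z_score(x: float, mean: float, sd: float) -> float:
    """Number of standard deviations that x lies from the mean."""
    return (x - mean) / sd


javelin_z = z_score(92, mean=86, sd=10)    # 92 ft against M = 86 ft, s = 10 ft
long_jump_z = z_score(49, mean=44, sd=4)   # 49 in against M = 44, s = 4

print(f"javelin z = {javelin_z}, long jump z = {long_jump_z}")
best = "long jump" if long_jump_z > javelin_z else "javelin"
print("Better event relative to the field:", best)
```

The long jump z score (1.25) is larger than the javelin z score (0.6), so relative to the other competitors Eric did better in the long jump.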


Excel for z score

Standardize(x, mean, standard deviation)
(x-average(array))/STDEV(array)
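
For readers working outside Excel, a rough Python counterpart of the second formula is sketched below. The `scores` list is hypothetical data made up for illustration. Excel's `STDEV` is the sample standard deviation, which corresponds to `statistics.stdev`; `statistics.pstdev` would be the population version.

```python
from statistics import mean, stdev

scores = [23, 27, 30, 25, 20]  # hypothetical raw scores, for illustration only

m = mean(scores)   # counterpart of AVERAGE(array)
s = stdev(scores)  # counterpart of STDEV(array): sample SD (n - 1 denominator)

z_scores = [(x - m) / s for x in scores]

print("z scores:", [round(z, 2) for z in z_scores])
print("mean of z scores:", round(mean(z_scores), 6))
print("SD of z scores:", round(stdev(z_scores), 6))
```

Standardizing a set of scores this way always yields z scores with a mean of 0 and a standard deviation of 1.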


What z scores represent?

Raw scores below the mean have negative z scores

Raw scores above the mean have positive z scores

Representing the number of standard deviations from the mean

The more extreme the z score, the further it is from the mean


What z scores represent?

84% of all the scores fall below a z score of +1 (why?)

16% of all the scores fall above a z score of +1 (why?)

This percentage represents the probability of a certain score occurring, or an event happening

If the probability is less than 5%, the event is considered unlikely to happen
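
The "why" follows from the areas under the normal curve: 50% of scores lie below the mean, and another 34.13% lie between the mean and +1SD, giving about 84%. The remaining 16% lie above +1SD. A short check with the standard-library `statistics.NormalDist` (variable names are illustrative):

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, SD 1

below_plus_one = std_normal.cdf(1.0)   # share of scores below z = +1
above_plus_one = 1.0 - below_plus_one  # share of scores above z = +1

print(f"below z=+1: {below_plus_one:.2%}")
print(f"above z=+1: {above_plus_one:.2%}")
```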


Exercise

In a normal distribution with a mean of 100 and a standard deviation of 10, what is the probability that any one score will be 110 or above?

What about 6σ? http://en.wikipedia.org/wiki/Six_Sigma


If z is not an integer

Table B.1 (S-P357-358)
NORMSDIST(z)

To compute the probability associated with a particular z score


Exercise

The probability associated with z=1.38
41.62% of all the cases in the distribution fall between the mean and 1.38 standard deviations
About 92% fall below 1.38 standard deviations
How and why?
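
The two figures differ by the 50% of cases that lie below the mean. A sketch of the lookup in Python, where `NormalDist().cdf(z)` plays the role of `NORMSDIST(z)` and returns the cumulative area below z (variable names are illustrative):

```python
from statistics import NormalDist

z = 1.38

cumulative = NormalDist().cdf(z)  # area below z, the value NORMSDIST(z) gives
mean_to_z = cumulative - 0.5      # area between the mean and z, as read from the table

print(f"between the mean and z: {mean_to_z:.2%}")
print(f"below z: {cumulative:.2%}")
```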


Between two z scores

What is the probability of falling between z scores of 1.5 and 2.5?
Z=1.5, 43.32%
Z=2.5, 49.38%
So around 6% of all the cases in the distribution fall between 1.5 and 2.5 standard deviations.
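
The subtraction above can be written as a small helper. This is a sketch; `share_between` is a hypothetical name, and it subtracts cumulative areas rather than mean-to-z areas, which gives the same difference.

```python
from statistics import NormalDist


def share_between(z_low: float, z_high: float) -> float:
    """Share of a normal distribution lying between two z scores."""
    std_normal = NormalDist()
    return std_normal.cdf(z_high) - std_normal.cdf(z_low)


share = share_between(1.5, 2.5)
print(f"between z=1.5 and z=2.5: {share:.2%}")
```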


Exercise

What percentage of the data falls between 110 and 125 in a distribution with mean=100 and SD=10?


Exercise

What is the probability of a particular score occurring between a z score of +1 and a z score of +2.5?


Exercise

Compute the z scores where mean=50 and standard deviation=5:
55
50
60
57.5
46


Exercise

The math section of the SAT has a μ = 500 and σ = 100. If you selected a person at random:
a) What is the probability he would have a score greater than 650?
b) What is the probability he would have a score between 400 and 500?
c) What is the probability he would have a score between 630 and 700?
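
One possible way to work this exercise is to model the scores directly rather than converting to z scores by hand. The sketch below assumes the scores are normally distributed, which the exercise implies but does not state; the variable names are illustrative.

```python
from statistics import NormalDist

# Assumption: SAT math scores follow a normal distribution with these parameters.
sat_math = NormalDist(mu=500, sigma=100)

p_above_650 = 1 - sat_math.cdf(650)                   # a) score greater than 650
p_400_to_500 = sat_math.cdf(500) - sat_math.cdf(400)  # b) score between 400 and 500
p_630_to_700 = sat_math.cdf(700) - sat_math.cdf(630)  # c) score between 630 and 700

for label, p in [("a", p_above_650), ("b", p_400_to_500), ("c", p_630_to_700)]:
    print(f"{label}) {p:.4f}")
```

The same answers come from converting each cut point to a z score (for example, 650 becomes z = 1.5) and reading the areas from the table.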


Determine sample size

Expected response rate: obtain based on historical data

Number of responses needed: use formula to calculate

$$\text{Sample size} = \frac{\text{Number of responses needed}}{\text{Expected response rate}}$$


Number of responses needed

For large population size:

$$n = \frac{Z^2 \sigma_x^2}{e^2}$$

n = number of responses needed (sample size)
Z = the number of standard deviations that describe the precision of the results
e = accuracy or the error of the results
$\sigma_x^2$ = variance of the data


Deciding $\sigma_x^2$

from previous surveys
intentionally use a large number
conservative estimation

e.g. a 10-point scale; assume that responses will be found across the entire 10-point scale

$3\sigma$ to the left/right of the mean describe virtually the entire area of the normal distribution curve

$\sigma = 10/6 = 1.67$; $\sigma^2 = 2.78$


Example

Z=1.96 (usually rounded as 2)
$\sigma_x^2$=2.78
e=0.2
n=278 (responses needed)
assume response rate is 0.4
Sample size=278/0.4=695

$$n = \frac{Z^2 \sigma_x^2}{e^2}$$
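
The example can be reproduced with the sketch below. The function names are illustrative assumptions. The slide's n=278 corresponds to Z rounded to 2; with Z=1.96 the same formula gives roughly 267 responses.

```python
def responses_needed(z: float, variance: float, error: float) -> float:
    """n = Z^2 * variance / e^2 (the large-population formula from the slides)."""
    return (z ** 2) * variance / (error ** 2)


def sample_size(n_responses: float, response_rate: float) -> float:
    """Number of people to contact so that the expected responses reach n_responses."""
    return n_responses / response_rate


n = responses_needed(z=2, variance=2.78, error=0.2)  # Z rounded to 2, as on the slide
total = sample_size(round(n), response_rate=0.4)

print(f"responses needed: {round(n)}")
print(f"sample size: {round(total)}")
```

In practice the result would normally be rounded up to a whole number of people.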


Exercise

Z=1.96 (usually rounded as 2)
5-point scale (suppose most of the responses are distributed from 1-4)
error tolerance=0.4
assume response rate is 0.6
What is the sample size?

$$n = \frac{Z^2 \sigma_x^2}{e^2}$$