Normal Distribution In-class exercises

Lecture 3 - Standard Scores, Probability, and the Normal Distribution

Howell Chapter 6, 7

Standard Scores

Standard Score: A way of representing performance on a test or some other measure so that persons familiar with the standard score know immediately how well the person did relative to others taking the same test.

1. Percentile based

The Percentile Rank of a score: Percentage of Scores in a reference group less than or equal to a score value.

$$PR_{X_i} = 100 \times \frac{\text{No. of scores} \le X_i}{N}$$

Typical Values

100 Largest possible percentile rank

50 Middle-of-the-road percentile rank (i.e., the median)

0 Lowest possible percentile rank.
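As a rough illustration of the percentile rank formula, here is a minimal Python sketch. The function name `percentile_rank` and the ten reference scores are made up for this example; they are not the RSE sample used in this lecture.

```python
def percentile_rank(scores, x):
    """Percentage of scores in the reference group that are <= x."""
    at_or_below = sum(1 for s in scores if s <= x)
    return 100 * at_or_below / len(scores)

# Hypothetical reference group of 10 scores (not the lecture's RSE data).
reference = [2, 3, 3, 4, 5, 5, 6, 7, 8, 9]

print(percentile_rank(reference, 5))  # 6 of the 10 scores are <= 5, so 60.0
print(percentile_rank(reference, 9))  # every score is <= 9, so 100.0
```

The same count-and-divide logic applies to a reference group of any size.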

Example. How much self-esteem do you have?

Suppose I gave you the Rosenberg Self-esteem scale.

Suppose you scored 6.4.

Do you have much self-esteem?

How can you answer that, since you’ve never seen the RSE or any RSE scores?

Now, suppose I told you that your percentile rank in a sample of persons who’d been given the RSE is 80.6.

Now you know that you have quite a bit of self-esteem, relative to others like you.

Percentile ranks are an excellent way of presenting performance scores to persons.

Most people have learned how to interpret percentile ranks at some time in their lives.

Standard Scores / Normal Distribution - 1 5/17/2023

origorder   RSE    Percentile
 1          4.90   19.90
 2          5.50   38.80
 3          5.30   31.10
 4          6.90   97.60
 5          4.60   14.10
 6          6.40   80.60
 7          5.30   31.10
 8          5.70   51.90
 9          6.20   70.90
10          5.20   27.70
11          6.30   75.20
12          4.50   12.10
13          3.70    2.90
14          1.90     .50
15          5.80   57.80
16          6.10   68.90
17          4.60   14.10
18          6.40   80.60
19          6.70   91.30
20          3.80    3.40

Pros of Percentile Ranks

1. Useful for everyone because it’s easy to understand. (The list of cases above is from the Rosenberg Self-Esteem scale. Raw scores are on a 1-7 scale. I claim that the percentiles are easier for people unfamiliar with the Rosenberg to understand.)

2. Doesn’t depend on distribution shape for precise interpretation (as do Zs)

Cons of Percentile Ranks

1. No statistical heritage, no roots.

2. Not completely linearly related to the original values.

This means that equal score differences don’t have equal percentile differences.

If you’re comparing people in the middle range of possible X values, equal X differences pretty much correspond to equal Percentile rank differences.

But in the tails of the distribution, equal X differences do NOT correspond to equal percentile rank differences.


[Figure: Percentile plotted against Raw score. Equal score differences correspond to unequal percentile differences.]

2. Linear-transformation based

Standard scores based on a linear transformation of the difference between X and the mean.

New score = Multiplicative constant * Old score + Additive constant

a. The Z Score (The Godfather of standard scores)

$$Z = \frac{X - \text{Mean}}{SD} = \left(\frac{1}{SD}\right) X - \frac{\text{Mean}}{SD}$$

Interpretation: Z is number of Standard Deviations X is above or below the mean.

Z = 1 means X is 1 SD above mean. Z = -3 means X is 3 SDs below the mean.

General rule of thumb (for roughly bell-shaped distributions): Most Z scores will be between -3 and +3. About 2/3 of Zs will be between -1 and +1; about 95% of Zs will be between -2 and +2.

Mean of the whole collection of Zs: Will always equal 0.
SD of the whole collection of Zs: Will always equal 1.
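A quick way to convince yourself that a collection of Zs has mean 0 and SD 1 is to compute them. The sketch below uses the 20 RSE raw scores listed earlier in these notes and Python's standard `statistics` module. One assumption to flag: it uses the population SD (dividing by N); the result holds whenever the same SD formula is used both to compute the Zs and to summarize them.

```python
import statistics

# The 20 RSE raw scores listed earlier in these notes.
rse = [4.9, 5.5, 5.3, 6.9, 4.6, 6.4, 5.3, 5.7, 6.2, 5.2,
       6.3, 4.5, 3.7, 1.9, 5.8, 6.1, 4.6, 6.4, 6.7, 3.8]

mean = statistics.mean(rse)
sd = statistics.pstdev(rse)  # assumption: population SD (divide by N)

# Z = (X - Mean) / SD for every score in the collection.
z = [(x - mean) / sd for x in rse]

mean_z = statistics.mean(z)
sd_z = statistics.pstdev(z)

# Both values agree with 0 and 1 up to floating-point rounding.
print(mean_z, sd_z)
```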

b. The T-score – the Education standard score

Ti = 10 * Zi + 50, rounded to nearest whole number.

Mean of Ts is always 50.
SD of Ts is always 10.

c. The SAT score

SATi = 100 * Zi + 500, rounded to nearest whole number.

Mean = 500 and SD = 100.

GRE scores have been scored in this fashion. That changed in Fall 2012.


Note: Must compute from ALL the Zs.

Note: Most tests that are reported as T-scores have special formulas that take you directly to the T, without having to go through Z.

Comparison of Scales of the three types:

[Figure: comparison of the Z, T, and SAT scales.]

This graphic shows that Zs, Ts, and SATs are all giving us the same information: how many standard deviations a raw score is above or below the mean. They’re just giving it to us in slightly different ways.
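The same point can be made numerically. This is a minimal sketch of the two conversion formulas given above; the function names `t_score` and `sat_score` are just labels chosen for this example.

```python
def t_score(z):
    """T = 10 * Z + 50, rounded to the nearest whole number."""
    return round(10 * z + 50)

def sat_score(z):
    """SAT = 100 * Z + 500, rounded to the nearest whole number."""
    return round(100 * z + 500)

# Each row is one position relative to the mean, shown on all three scales.
for z in [-2, -1, 0, 1, 2]:
    print(z, t_score(z), sat_score(z))
```

A person 1 SD above the mean has Z = 1, T = 60, and SAT = 600: three labels for the same position.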


Effects of Linear Transformations of the original scores on measures of central tendency and variability

1. Location shift: A constant is added to each score in the collection.

New X = Old X + A

New measure of central tendency = Old measure + A

New measure of variability = Old measure (no change)

2. Scale change: Multiplying or dividing each score by a constant.

New X = B*Old X.

New measure of central tendency = B*Old measure

New Range = B*Old Range
New Interquartile range = B*Old Interquartile range
New SD = B*Old SD

But . . . New Variance = B^2 * Old Variance
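These rules are easy to check on a small collection. The scores and the constants A and B below are hypothetical, chosen only so the arithmetic comes out cleanly; the sketch assumes B is positive.

```python
import statistics

old = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical scores: mean 5, SD 2
A, B = 10, 3                    # hypothetical constants (B assumed positive)

shifted = [x + A for x in old]  # location shift
scaled = [B * x for x in old]   # scale change

# Location shift: the mean moves by A, the SD does not change.
print(statistics.mean(shifted), statistics.pstdev(shifted))

# Scale change: mean and SD are multiplied by B, variance by B squared.
print(statistics.mean(scaled), statistics.pstdev(scaled),
      statistics.pvariance(scaled))
```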


Probability – Start here on 9/12/17

A. Definition of probability

The probability of an event is a number between 0 and 1 inclusive.
This number represents the likelihood of occurrence of the event.

From Wikipedia: Probability is the measure of the likelihood that an event will occur. Probability is quantified as a number between 0 and 1, where, loosely speaking, 0 indicates impossibility and 1 indicates certainty.

B. Examples of Events

The event "A hurricane will strike Florida in October."
The event "A randomly selected person is a male."
The event "A randomly selected IQ score is larger than 130."
The event "An experimenter will incorrectly reject the null hypothesis."
The event "I change my name to Fred tomorrow."


C. Why do we want probabilities?

Probability is a way of making decisions in the face of uncertainty.

*When our oldest son was 4, we found that he had a tumor on his neck. The doctors, based on the available evidence, gave us the probability of his survival if we did not operate and the probability of his survival if we did operate. Outcomes of many medical conditions are expressed formally in terms of probabilities.

*The consequences of a pre-emptive strike on North Korea’s nuclear facilities have very likely been expressed as probabilities – the probability of a failure of the strike, the probability of NK bombing South Korea and Japan, the probability of NK backing down in its attempts to create a nuclear arsenal, the probability of a strike strengthening the Iran government.

Probabilities are used to determine costs.

*Insurance companies have derived quantitative formulas relating the probability of persons of each age having a car accident to rates based on those probability estimates – the higher the probability, the higher the rate.

Probabilities are used to evaluate the results of experiments

1. An experiment is conducted comparing two pain relievers and the outcome is recorded.
2. The probability of the particular outcome of that experiment is computed assuming that the pain relievers are equal.
3a. If that probability is large, then the researcher concludes that the pain relievers are equal.
3b. But if the probability is small, the researcher concludes that they’re not equal.


D. Computing or determining the probability of an event.

1. Subjective determination. The vast majority of probabilities we use are “computed” subjectively.

For example:
Probability of the light turning red before you get to it.
Probability of the self-checkout being faster than the staffed checkout.

We make subjective determinations all the time.

2. The Relative Frequency method: estimate the probability from the proportion of previous occurrences.

$$\text{Probability estimate} = \frac{\text{Number of previous occurrences}}{\text{Number of previous opportunities to occur}}$$

Applications of this method:
Actuarial tables created by insurance companies.
Determining the probability of a hurricane in the neighborhood to which you intend to move.

Problems with this method:
Depends on the future mirroring the past (no climate change, for example).
Not applicable to events for which we have no history (AIDS).
Accurate counts may not be available (AIDS again).

3. The Universe of Elementary Events method

Let U = A collection of equally likely elementary (indivisible) events.
Let A = an event defined as a subset of U.

$$P(A) = \frac{\text{Number of elementary events in } A}{\text{Number of elementary events in } U}$$

Example: Let U = {1, 2, 3, 4, 5, 6}, the possible faces of a single die.
Suppose A = {1, 3, 5}. Then P(A) = 3 / 6 = 0.5

Problems with the Universe of Elementary Events method:

There aren't many real-world situations to which it applies

Applications: Games of chance
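The die example can be written out directly with sets. This is only a sketch: the helper name `probability` is made up for illustration, and it assumes, as the method requires, that every elementary event in U is equally likely.

```python
from fractions import Fraction

def probability(event, universe):
    """P(A) = (no. of elementary events in A) / (no. in U)."""
    if not event <= universe:
        raise ValueError("A must be a subset of U")
    return Fraction(len(event), len(universe))

U = {1, 2, 3, 4, 5, 6}  # possible faces of a single die
A = {1, 3, 5}           # the event "an odd face comes up"

print(probability(A, U))         # 1/2
print(float(probability(A, U)))  # 0.5
```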


Two important events defined as combinations of other events

1. The intersection or joint-occurrence event, “A and also B”

An outcome is in the intersection of A and B if it represents the occurrence of both A and B.

This event is called the Intersection of A and B. The name can best be appreciated by considering A and B as two streets which cross.

Example:
A = Florida will be hit by a hurricane in September, 2018.
B = Florida will be hit by a hurricane in October, 2018.

The intersection is “Florida will be hit in September and Florida will be hit in October.”

2. The union event, “Either A or B or Both”.

This event is called the Union of A and B. Think of union as in the union of matrimony. After the marriage, either the husband (A) or the wife (B) or both can represent the family.

Example:
A = Florida will be hit by a hurricane in September, 2018.
B = Florida will be hit by a hurricane in October, 2018.
The union is “Hit in September or hit in October.”


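[Figure: two Venn diagrams of events A and B, one marking “A and also B” (the intersection) and one marking “Either A or B or both” (the union).]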

Two important event relationships – “mutually exclusive” and “independent”.

1. Mutually exclusive events.

Two events, A and B, are mutually exclusive if the occurrence of either precludes the occurrence of the other.

They’re mutually exclusive if an outcome cannot simultaneously be both.

Examples.

A: 1st person past the door is taller than 6’.
B: 1st person past the door is shorter than 5’.
The 1st person cannot be both taller than 6’ and shorter than 5’, so A and B are mutually exclusive.

A: The results of an experiment result in rejection of the null hypothesis.
B: The results of the same experiment result in failure to reject the same null hypothesis.

Special type of mutually exclusive event: Complement

A: The occurrence of any event.
~A: The occurrence of any event except A (may also be written as A’ or –A or A^c).

Example:
A = The first card drawn from a deck is the 10 of clubs.
~A = The first card drawn from a deck is not the 10 of clubs.

Diagram representing mutual exclusiveness


[Figure: events A and B drawn as separate regions that do not overlap.]

2. Independent events.

Two events, A and B, are independent if the occurrence of either has no effect on the probability of occurrence of the other.

A and B can occur.
A and ~B can occur.
~A and B can occur.
~A and ~B can occur.

Moreover, the occurrence of A or ~A will not affect the probability of occurrence of B.
And the occurrence of B or ~B will not affect the probability of occurrence of A.

Examples of events that are probably independent.

A: It will rain in Tokyo tomorrow.
B: The Atlanta Falcons will win on Sunday.

A: Experimenter A finds a significant difference between an Experimental and Control group.
B: Experimenter B, working in a different laboratory with no connection or communication with Experimenter A, finds a significant difference between a similar Experimental and Control group.

A: The flip of a coin by Jim in Chattanooga will result in a “Head”.
B: The flip of a coin by Sam in Seattle will result in a “Head”.

Diagram representing independence

[Figure: events A and B overlapping inside the universe U.]

Note that all possible events can occur . . .

A and B
A and ~B
~A and B
~A and ~B

If A occurs, P(B) is unaffected.
If B occurs, P(A) is unaffected.

1. The Multiplicative Laws - Probability of “A and also B”.

In general, P(A and also B) = P(A) * P(B|A) = P(B) * P(A|B)

Where P(B|A) is called the conditional probability of B given A, read as Probability of B given A.

If this were a course on probability, we’d spend a week or so studying conditional probability.

We won’t pursue this general case here, however.

We will consider two special cases:

A. A and B are mutually exclusive

If A and B are mutually exclusive, P(A and also B) = 0 .

Example:
A = 1st draw from a deck of cards is the King of Clubs.
B = 1st draw from same deck of cards is the Ace of Spades.

Probability of A and also B = 0, because the 1st draw cannot be both a King and an Ace.

B. A and B are independent.

If A and B are independent, P(A and also B) = P(A) * P(B) .

This is called the multiplicative law of probability. It only works if A and B are independent.

Example:
A = 1st draw from a deck of cards is the King of Clubs.
B = 2nd (note: 2nd, not 1st) draw from the same deck of cards, after the first card is returned and the deck is reshuffled, is the Ace of Spades.

Probability of A and also B = 1/52 * 1/52 = 1/2704 = .00036982.

A = A hurricane will hit Florida in 2018.
B = A hurricane will hit Florida in 2022.

Probability of A and also B = P(A) * P(B), since a four-year separation will probably ensure that the two events are independent.
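The card example can be checked by brute-force counting. The sketch below assumes the first card is returned to the deck before the second draw, so every ordered pair of cards is equally likely; the helper `prob` and the card labels are made up for this illustration.

```python
from fractions import Fraction
from itertools import product

ranks = ["A", "2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K"]
suits = ["clubs", "diamonds", "hearts", "spades"]
deck = [(r, s) for r in ranks for s in suits]  # 52 cards

# Assumption: the 1st card is replaced and the deck reshuffled, so all
# 52 * 52 ordered (1st draw, 2nd draw) pairs are equally likely.
pairs = list(product(deck, repeat=2))

def prob(event):
    """Proportion of equally likely pairs for which the event is true."""
    return Fraction(sum(1 for pair in pairs if event(pair)), len(pairs))

def a(pair):
    return pair[0] == ("K", "clubs")   # A: 1st draw is the King of Clubs

def b(pair):
    return pair[1] == ("A", "spades")  # B: 2nd draw is the Ace of Spades

p_a = prob(a)
p_b = prob(b)
p_both = prob(lambda pair: a(pair) and b(pair))

print(p_a, p_b, p_both)      # 1/52 1/52 1/2704
print(p_both == p_a * p_b)   # True: the multiplicative law holds
```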


2. The Additive Law - Probability of “Either A or B or Both”

In general P(Either A or B or Both) = P(A) + P(B) – P(A and also B)

Two special cases (same two as above)

A. A and B are mutually exclusive

If A and B are mutually exclusive, P(Either A or B or Both) = P(A) + P(B).

This is called the additive law of probability

Example:
A = 1st card drawn from a deck is King of Clubs.
B = 1st card drawn from same deck is Ace of Spades.

P(Either A or B or Both) = P(A) + P(B) + 0 = 1/52 + 1/52 = 2/52 = .038

Application of the additive law to complements: P(Either A or ~A) = 1

So, P(~A) = 1 – P(A). The probability of the complement of an event is 1 minus probability of the event.

Example:
A = 1st card drawn from a deck is King of Clubs.
B = 1st card drawn from a deck is not the King of Clubs, so B = ~A.

P(A or B or Both) = P(A) + P(~A) + 0 = 1/52 + 51/52 = 52/52 = 1.00

B. A and B are independent

If A and B are independent, P(Either A or B or Both) = P(A) + P(B) – P(A)*P(B).

Example:
A = A hurricane will hit Florida in 2018. Suppose P(A) = .10.
B = A hurricane will hit Florida in 2022. Suppose P(B) = .10.

Suppose A and B are independent.

P(Either A or B or Both) = P(A) + P(B) – P(Both) = .10 + .10 -.01 = .19.
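Both special cases are the general additive law with a different value plugged in for P(A and also B). A minimal sketch, with `p_union` as a made-up helper name:

```python
from fractions import Fraction

def p_union(p_a, p_b, p_both):
    """General additive law: P(A or B or both) = P(A) + P(B) - P(A and B)."""
    return p_a + p_b - p_both

# Mutually exclusive: King of Clubs vs. Ace of Spades on the same draw,
# so P(A and also B) = 0.
card = Fraction(1, 52)
print(p_union(card, card, 0))        # 1/26, i.e. 2/52

# Independent: hurricanes in two different years, each with P = .10,
# so P(A and also B) = P(A) * P(B).
p = Fraction(1, 10)
print(p_union(p, p, p * p))          # 19/100
```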


Probability of one or more: An application of the above laws

Suppose that for each year, the probability of a hurricane hitting Florida = .1.
What’s the probability of 1 or more hurricanes hitting Florida in the next 10 years?
Assume year-to-year occurrences are independent.

I flip a coin 10 times.
What’s the probability of my getting one or more Heads in the 10 flips?

A researcher conducts research that depends on telekinesis. The probability of rejecting the null (incorrectly, since telekinesis does not exist) is .05.
9 other misguided researchers conduct the same research. For each (since telekinesis still does not exist), the probability of (incorrectly) rejecting the null hypothesis is .05.
What’s the probability that one or more of the researchers will (incorrectly) reject the null hypothesis?

Fifty people are in a room. The probability that any one of them will have the same birthday as I is 1/365.
What’s the probability that one or more of them will have the same birthday as I?

Each of these problems has the same form:

What’s the probability of one or more occurrences of a particular event?

If the events are independent, the problem can be solved by

1) Realizing that “None” is the complement of “One or more”

Probability of one or more = 1 – Probability of none

2) Since they’re independent,

Probability of none = P(Not 1st) * P(Not 2nd) * P(Not 3rd) * etc etc etc

So, if multiple, identical events are independent,

Probability of One or more = 1 - (1 - P)^K, where P is the probability that a single event occurs and K is the no. of events. (Here (1 - P)^K is the probability of none.)


Applied to the problems . . .

Probability of a hurricane = .1
What’s the probability of one or more hurricanes in the next 10 years?
Probability of none = .9 * .9 * .9 * .9 * .9 * .9 * .9 * .9 * .9 * .9 = .9^10 = .35
So, Probability of 1 or more = 1 - Probability of none = 1 - .35 = .65.

I flip a coin 10 times.
What’s the probability of my getting one or more Heads in the 10 flips?
Probability of none = (1 - .5)^10

P(One or more Heads) = 1 - P(None) = 1 - (1 - .5)^10 = 1 - .5^10 = 1 - .000977 = .999.

A researcher conducts research that depends on telekinesis. The probability of rejecting the null (incorrectly, since telekinesis does not exist) is .05.
9 other misguided researchers conduct the same research. For each (since telekinesis still does not exist), the probability of (incorrectly) rejecting the null hypothesis is .05.
What’s the probability that one or more of the researchers will (incorrectly) reject the null hypothesis?

P(One or more incorrect rejections) = 1 - (1 - .05)^10 = 1 - .95^10 = 1 - .599 = .40

Suppose there were 99 other misguided researchers. What would be the probability of at least one incorrect rejection?
P = 1 - (1 - .05)^100 = 1 - .95^100 = 1 - .00592 = .994.
So, if there are 100 labs conducting research on telekinesis, it’s almost certain that at least 1 of them will find “evidence” of the phenomenon.

Fifty people are in a room. The probability that any one of them will have the same birthday as I is 1/365.
What’s the probability that one or more of them will have the same birthday as I?

P(One or more with same b’day) = 1 - (1 - .0027)^50 = 1 - .9973^50 = 1 - .87 = .13.

What if there are 500 people in the room?
P = 1 - (1 - .0027)^500 = 1 - .9973^500 = 1 - .259 = .741
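All four problems reduce to the same one-line computation. Here is a minimal sketch; `p_one_or_more` is a name made up for this illustration, and it assumes the K events are independent and share the same probability P.

```python
def p_one_or_more(p, k):
    """P(one or more of k independent events) = 1 - (1 - p)**k."""
    return 1 - (1 - p) ** k

hurricanes = p_one_or_more(0.1, 10)      # 10 years, P = .1 per year
heads = p_one_or_more(0.5, 10)           # 10 coin flips
rejections_10 = p_one_or_more(0.05, 10)  # 10 labs, P = .05 each
rejections_100 = p_one_or_more(0.05, 100)
birthdays = p_one_or_more(1 / 365, 50)   # 50 people in the room

for label, value in [("hurricanes", hurricanes), ("heads", heads),
                     ("10 labs", rejections_10), ("100 labs", rejections_100),
                     ("birthdays", birthdays)]:
    print(label, round(value, 3))
```

The printed values should agree with the hand calculations above to within rounding.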


Probability Applied to Values of Variables: Random Variables

A variable whose values are determined by a random process.

Its values are not predictable.

So we can't know which value a random variable will have. But in many instances, we can know the probability of occurrence of each value.

Random variables are what statisticians study.

Probability Distributions: Probabilities of values of a Random Variable

Q: If we can’t know what the next value of a random variable is, what can we know?

A: We can know the probability that the variable will take on a specific value.

Probability Distribution: A statement, usually in the form of a formula or table, which gives the probability of occurrence of each value of a random variable.


Famous probability distributions

The binomial distribution . . .

The binomial distribution describes probabilities associated with repeated independent occurrences of events, with each event put into one of two categories, e.g., H vs. T.

Suppose a bag of N coins is spilled. Suppose the variable is X, the number of coins that land H side up.
So we want the Probability of X Hs when the bag of N coins is spilled. Possible values of X . . .

X: No. of H's   Probability
N               ___
N-1             ___
...
3               ___
2               ___
1               ___
0               ___

$$P(X) = \frac{N!}{X!\,(N-X)!}\, P^X (1-P)^{N-X}$$

Where
N = No. of coins spilled.
P = Probability of a Head for each coin.
! = Factorial operator. K! = K*(K-1)*(K-2) . . . 2*1

Use the syntax below to illustrate.

Examples:

Number of people getting a disease, such as Ebola.
Number of defective parts in a mass-production line.
Number of hurricanes to hit Florida in 10 years.
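To fill in the blanks of the probability table for a concrete case, the binomial formula can be evaluated directly. This sketch uses N = 10 and P = .1, the values that could stand for the hurricane example; `binomial_pmf` is a name chosen for illustration, and `math.comb(n, x)` computes N! / (X!(N-X)!).

```python
from math import comb

def binomial_pmf(x, n, p):
    """P(X) = N! / (X! (N-X)!) * P**X * (1-P)**(N-X)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 10, 0.1  # e.g. 10 years, P = .1 of a hurricane in each year

# One row per possible value of X, from N down to 0.
for x in range(n, -1, -1):
    print(x, round(binomial_pmf(x, n, p), 4))

total = sum(binomial_pmf(x, n, p) for x in range(n + 1))
```

The probabilities across all possible values of X add up to 1, and the entry for X = 0 is the "probability of none," .9^10.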


input program.
loop #i=1 to 10000.
compute id = $casenum.
end case.
end loop.
end file.
end input program.
execute.
print formats id (f3.0).
compute bino = rv.binom(10,.1).
graph /histogram=bino.

Uniform Probability Distribution

P(X between X1 and X2) = (X2 - X1)/(Max - Min)

Syntax:

input program.
loop #i=1 to 10000.
compute id = $casenum.
end case.
end loop.
end file.
end input program.
execute.
print formats id (f3.0).
compute uni = rv.uniform(0,4).
graph /histogram = uni.
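For readers without SPSS, here is a rough Python counterpart: the formula for the uniform distribution, followed by a simulation check on the interval from 0 to 4. The function name `uniform_prob`, the seed, and the choice of X1 = 1 and X2 = 2 are all assumptions made for this example.

```python
import random

def uniform_prob(x1, x2, lo, hi):
    """P(X between x1 and x2) for X uniform on [lo, hi].

    Assumes lo <= x1 <= x2 <= hi.
    """
    return (x2 - x1) / (hi - lo)

exact = uniform_prob(1, 2, 0, 4)
print(exact)  # 0.25

# Simulation check: proportion of 100,000 uniform(0, 4) values in [1, 2].
rng = random.Random(0)  # fixed seed so the run is repeatable
draws = [rng.uniform(0, 4) for _ in range(100_000)]
simulated = sum(1 for x in draws if 1 <= x <= 2) / len(draws)
print(simulated)
```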

Example - Playing spin-the-bottle


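[Figure: uniform probability distribution running from Min to Max, with X1 and X2 marked between them; vertical axis: Probability. Spin-the-bottle dial marked 4/0, 1, 2, 3.]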

The Normal Distribution

$$P(X \text{ between } X_1 \text{ and } X_2) = \int_{X_1}^{X_2} \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}\, dX$$

Syntax:

input program.
loop #i=1 to 10000.
compute id = $casenum.
end case.
end loop.
end file.
end input program.
execute.
print formats id (f3.0).
compute normi = rv.normal(0,1).
graph /histogram(normal)=normi.



*** Question: how many geniuses are there in 10 million people.
input program.
loop #i=1 to 10000000.
compute id = $casenum.
end case.
end loop.
end file.
end input program.
execute.
print formats id (f3.0).
*** Genius count - takes about 15 seconds for 10,000,000.
compute iq = rv.normal(100,15).
count genius = iq(171 thru high).
fre var=genius.
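The simulation above estimates the genius count by generating 10 million IQs. The expected count can also be computed directly from the normal distribution. This is a sketch using Python's standard `statistics.NormalDist`, with the same assumptions as the syntax: IQ is normal with mean 100 and SD 15, and "genius" means an IQ of 171 or higher.

```python
from statistics import NormalDist

iq = NormalDist(mu=100, sigma=15)  # same mean and SD as rv.normal(100,15)

# Probability that a randomly selected IQ is 171 or higher.
p_genius = 1 - iq.cdf(171)

# Expected number of such people among 10 million.
expected_geniuses = 10_000_000 * p_genius

print(p_genius)
print(expected_geniuses)
```

A simulated count will bounce around this expected value from run to run, since each run is a new random sample.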

More on the Normal Distribution

The most important probability distribution.

Formula, again:

$$P(X) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$$

Why so important?

1. Ubiquity.

Quantities which are the accumulations of more elementary, typically binary, entities tend to follow the normal distribution.

Simplest example: Take a bag of 100 coins. Spill the bag 10,000 times, each time counting the number of Heads that land face up. The distribution of Number of Heads will essentially follow the normal distribution.

input program.
loop #i=1 to 10000.
compute id = $casenum.
end case.
end loop.
end file.
end input program.
execute.
print formats id (f3.0).
compute bino = rv.binom(100,.5).
graph /histogram=bino.
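How close is "essentially"? One way to check is to compare an exact binomial probability with the corresponding normal probability. The sketch below does this for the chance of 45 to 55 Heads out of 100. Two things here are assumptions of the example rather than part of the lecture: the normal curve is given mean N*P and SD sqrt(N*P*(1-P)), and the interval is widened by half a unit on each side (a continuity correction) because the normal is continuous and the count is not.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 100, 0.5  # 100 coins, P(Head) = .5

# Exact binomial probability of 45 through 55 Heads.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(45, 56))

# Normal curve with the same mean and SD as the binomial.
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))

# Continuity correction: 45..55 becomes 44.5..55.5 on the continuous scale.
approx = normal.cdf(55.5) - normal.cdf(44.5)

print(round(exact, 4), round(approx, 4))
```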

So even though the originating event – flip of a coin – is a categorical outcome, the SUM of a bunch of those events yields a quantity that becomes more like a continuous variable as the number of possible events increases.

Even if the world is essentially qualitative, worldly outcomes can be continuous.

In nature, many quantitative outcomes are probably the result of the accumulation of 100’s of binary events, e.g., genes turned on or off. The result of the accumulations is a quantitative outcome and those quantitative outcomes are normally distributed.

Even if IQ is determined by qualitative gene flips “on” or “off”, because it’s the sum of 1000s of such flips, it varies continuously.


2. Sampling Distributions – Distributions of sample statistics

The main reason the normal distribution is so important is because of sampling distributions.

Consider a population.

Now consider taking a sample of size 4 from that population.

Suppose the mean of that sample was computed.

Now repeat the above steps thousands of times.

The result would be a distribution of sample means.

The frequency distribution of the sample means is called the Sampling Distribution of Means.

In general, the distribution of values of any sample statistic is called the Sampling Distribution of that statistic, e.g., the Sampling Distribution of the standard deviation, of the median, etc.


[Figure: distribution of sample means. Horizontal axis: values of sample mean, with a few of the sample means marked.]

Three theoretical facts and 1 practical fact about the Sampling Distribution of sample means . . .

Theoretical facts

1. The mean of the population of sample means will be the same as the mean of the population from which the samples were taken. The mean of the means is the mean: $\mu_{\bar{X}} = \mu_X$

Implication: Sample mean is an unbiased estimate of the population mean.

2. The standard deviation of the population of sample means will be equal to the original population's SD divided by the square root of N, the size of each sample: $\sigma_{\bar{X}} = \sigma_X / \sqrt{N}$

Implication: Sample mean will be closer to population mean, on the average, as sample size gets larger.

3. The shape of the distribution of sample means will be the normal distribution if the original distribution is normal or approach the normal as N gets larger in all other cases. This fact is called the Central Limit Theorem. It is the foundation upon which most of modern day statistics rests.

Why do we care about #3? We care because we’ll need to compute probabilities associated with sample means when doing inferential statistics. To compute those probabilities, we need a probability distribution.

That probability distribution is the normal distribution. This is the main reason the normal distribution is so important.

Practical fact

4. The distribution of Z's computed from each sample, using the formula

$$Z = \frac{\bar{X} - \mu}{\sigma / \sqrt{N}}$$

will be or approach (as sample size gets large) the Standard Normal Distribution with mean = 0 and SD = 1.

This ties the beginning of this lecture (Z-scores) to the rest of it (the Normal Distribution).
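The facts about the sampling distribution of means can be checked by simulation. The sketch below follows the steps described earlier: take a sample of size 4, compute its mean, and repeat thousands of times. The population (normal, mean 100, SD 15), the seed, and the 20,000 repetitions are assumptions chosen for illustration.

```python
import random
import statistics
from math import sqrt

rng = random.Random(2017)  # fixed seed so the run is repeatable
mu, sigma = 100, 15        # assumed population mean and SD
n = 4                      # size of each sample
reps = 20_000              # number of samples drawn

# Draw a sample of size n, compute its mean, and repeat.
sample_means = [
    statistics.mean([rng.gauss(mu, sigma) for _ in range(n)])
    for _ in range(reps)
]

mean_of_means = statistics.mean(sample_means)  # fact 1: close to mu
sd_of_means = statistics.pstdev(sample_means)  # fact 2: close to sigma/sqrt(n)

# Fact 4: Z computed from each sample mean has mean near 0 and SD near 1.
z = [(m - mu) / (sigma / sqrt(n)) for m in sample_means]
mean_z = statistics.mean(z)
sd_z = statistics.pstdev(z)

print(round(mean_of_means, 2), "vs", mu)
print(round(sd_of_means, 2), "vs", sigma / sqrt(n))
print(round(mean_z, 2), round(sd_z, 2))
```

Because this is a simulation, the values are close to the theoretical ones but not exactly equal to them.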

