Chapter 6: Probability Plots
Specifically, normal probability plots.
Section 6.7
Were my data drawn from a normal distribution?
7 | 6
8 | 7
9 | 7
10 | 15
11 | 058
12 | 013
13 | 133455
14 | 12356899
15 | 001344678888
16 | 0003357789
17 | 0112445668
18 | 0011346
19 | 034699
20 | 0178
21 | 8
22 | 189
23 | 7
24 | 5
The decimal point is 1 digit(s) to the right of the |
The compressive strength data above look ‘normally distributed’, but are they really?
The distribution is unimodal and has a bell shape, but do the probabilities line up with a normal distribution?
Having the correct general shape is a start, but there are specific probabilities that coincide with the normal distribution.
Were my data drawn from a normal distribution?
Definition
A probability plot is a graphical method for determining whether sample data conform to some specific distribution (such as normal, exponential, etc.).
It is more reliable than basing the decision on a histogram.
Some examples of probability plots...
NOTE: Different software will label the axes differently, but it is the pattern of the data points in the plot that matters, and that pattern is similar across software.
Were my data drawn from a normal distribution?
In this course, we will limit discussion to the normal probability plot, as we want to know if our data conform to a normal distribution.
We’re just briefly discussing this topic (there are more details to explore).
In a normal probability plot, we plot the ordered observed data points against those that would have been observed if we had sampled from a truly normal distribution.
Ordered observations: x(1), x(2), ..., x(n), where x(1) is the minimum and x(n) is the maximum.
Then each x(j) is plotted against its relevant ‘hypothetical’ z-score, z_j, i.e. the z-score expected at that position if the data were truly normally distributed.
Normal Probability Plots
Example (n=15 observed data points)
For each j, find z such that Φ(z) = (j - 0.5)/n, where Φ is the standard normal CDF.

Ordered observed value     j    (j - 0.5)/n      z
         -8                1      0.0333       -1.83
         -5.5              2      0.1          -1.28
         -2.25             3      0.1667       -0.97
         -1.25             4      0.2333       -0.73
         -0.75             5      0.3          -0.52
         -0.75             6      0.3667       -0.34
         -0.25             7      0.4333       -0.17
         -0.25             8      0.5           0
         -0.25             9      0.5667        0.17
          0               10      0.6333        0.34
          0.5             11      0.7           0.52
          0.75            12      0.7667        0.73
          1               13      0.8333        0.97
          4.5             14      0.9           1.28
         24               15      0.9667        1.83
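The table values can be recomputed in a few lines. The following is a minimal sketch, assuming Python's standard library (`statistics.NormalDist`) rather than any software used in the course; the variable names are illustrative only. It applies the inverse of the standard normal CDF to each plotting position (j - 0.5)/n.

```python
from statistics import NormalDist

# Ordered observations from the n = 15 example in the slides.
observed = sorted([-8, -5.5, -2.25, -1.25, -0.75, -0.75, -0.25, -0.25,
                   -0.25, 0, 0.5, 0.75, 1, 4.5, 24])
n = len(observed)

std_normal = NormalDist()  # mean 0, standard deviation 1

# Plotting positions (j - 0.5)/n for j = 1, ..., n.
positions = [(j - 0.5) / n for j in range(1, n + 1)]

# Hypothetical z-scores: solve Phi(z) = (j - 0.5)/n with the inverse CDF.
z_scores = [std_normal.inv_cdf(p) for p in positions]

# Print one row per observation: value, j, (j - 0.5)/n, z.
for j, (x, p, z) in enumerate(zip(observed, positions, z_scores), start=1):
    print(f"{x:>6} {j:>3} {p:.4f} {z:6.2f}")
```

Plotting `observed` against `z_scores` gives the normal probability plot for this example.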
Normal Probability Plots
Example (n=15 observed data points)
[Figure: Normal Q-Q Plot. x-axis: Theoretical Quantiles; y-axis: Sample Quantiles.]
This normal probability plot suggests the data were NOT drawn from a normal distribution.
Normal Probability Plots
Example (n=15 observed data points)
[Figure: Histogram. x-axis: observed value; y-axis: Frequency.]
The histogram also suggests there’s a very large value, which wouldn’t have been expected if the data were truly from a normal distribution.
Normal Probability Plots - what to look for
If the data were generated from a normal distribution, then the data points in the normal probability plot will fall approximately on a straight diagonal line.
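One way to see why the line is straight: if the data come from a normal distribution with mean mu and standard deviation sigma, the j-th ordered value should be close to mu + sigma * z_j. The sketch below checks this on simulated data. The parameters (mu = 50, sigma = 10, n = 1000) and the seed are arbitrary assumptions for illustration, not values from the slides.

```python
import random
from statistics import NormalDist

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical parameters, chosen only for illustration.
mu, sigma, n = 50.0, 10.0, 1000
sample = sorted(random.gauss(mu, sigma) for _ in range(n))

std_normal = NormalDist()
z_scores = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]

# For normal data, the j-th ordered value should sit near mu + sigma * z_j,
# i.e. on a straight line with intercept mu and slope sigma.
gaps = [abs(x - (mu + sigma * z)) for x, z in zip(sample, z_scores)]
mean_gap = sum(gaps) / n

print(f"mean vertical distance from the line: {mean_gap:.3f}")
```

The average distance should be small relative to sigma. Data from a skewed or heavy-tailed distribution would drift away from any single straight line, most visibly in the tails.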
Normal Probability Plots - what to look for
The patterns below suggest non-normality...
“S” shape: light tails compared to normal
“S” shape: heavy tails compared to normal
“J” shape: right-skewed
All the above patterns are signs of non-normality.
Normal Probability Plots (not a ‘best-fit-line’)
NOTE: The diagonal line is NOT A ‘BEST FIT LINE’ to the data.
The line is simply a ‘reference line’ for your eye.
In R statistical software, the line is drawn by simply connecting the two (x, y) points determined by the values at the 25th and 75th percentiles.
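The construction of the reference line can be sketched as follows, using the n = 15 example data from earlier in these slides. This is an illustration in Python, not R's own code; it assumes the sample quartiles are interpolated between order statistics (`method="inclusive"` in Python's `statistics.quantiles`), which I believe matches R's default quantile rule but should be treated as an assumption.

```python
from statistics import NormalDist, quantiles

# The n = 15 example data from the slides.
observed = [-8, -5.5, -2.25, -1.25, -0.75, -0.75, -0.25, -0.25,
            -0.25, 0, 0.5, 0.75, 1, 4.5, 24]

# Sample 25th and 75th percentiles (y-coordinates of the two points).
q1, _, q3 = quantiles(observed, n=4, method="inclusive")

# Standard normal 25th and 75th percentiles (x-coordinates).
std_normal = NormalDist()
z_q1 = std_normal.inv_cdf(0.25)
z_q3 = std_normal.inv_cdf(0.75)

# Line through (z_q1, q1) and (z_q3, q3).
slope = (q3 - q1) / (z_q3 - z_q1)
intercept = q1 - slope * z_q1

print(f"points: ({z_q1:.3f}, {q1}) and ({z_q3:.3f}, {q3})")
print(f"reference line: y = {intercept:.4f} + {slope:.4f} * z")
```

Because only the two quartiles enter the calculation, the extreme observation 24 does not move the line at all, which is one reason it is a reference line and not a best-fit line.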
Normal Probability Plots
Example (non-normality)
The normal probability plot below suggests there are non-normality issues because of the points at the bottom left. Normality is questionable.
[Figure: Normal Q-Q Plot. x-axis: Theoretical Quantiles; y-axis: Sample Quantiles. The reference line (in blue) connects the values at the 25th and 75th percentiles.]
Normal Probability Plots - Can a transformation help?
Sometimes we can use a transformation of the data to improve the normality (but you’ll be working on the transformed scale after that).
Below, a log-transformation helped, but didn’t quite get us to normality.
[Figure: two normal probability plots side by side, "NPP plot - original scale" and "NPP plot - log scale". x-axis: Theoretical Quantiles; y-axis: Sample Quantiles.]
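The effect of a log-transformation can be sketched with simulated data. Everything here is an assumption for illustration: the data are hypothetical right-skewed values drawn from a lognormal distribution (so the log works better than in the slide's example), and the correlation between ordered data and z-scores is used only as a rough numerical stand-in for "how straight the plot looks", not as a method taken from these slides.

```python
import math
import random
from statistics import NormalDist


def qq_correlation(data):
    """Pearson correlation between the ordered data and normal z-scores."""
    xs = sorted(data)
    n = len(xs)
    std_normal = NormalDist()
    zs = [std_normal.inv_cdf((j - 0.5) / n) for j in range(1, n + 1)]
    mean_x = sum(xs) / n
    mean_z = sum(zs) / n
    sxz = sum((x - mean_x) * (z - mean_z) for x, z in zip(xs, zs))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    szz = sum((z - mean_z) ** 2 for z in zs)
    return sxz / math.sqrt(sxx * szz)


random.seed(2)  # fixed seed so the sketch is repeatable

# Hypothetical right-skewed data (all values are positive, so log is defined).
raw = [random.lognormvariate(3.0, 1.0) for _ in range(500)]
logged = [math.log(x) for x in raw]

r_raw = qq_correlation(raw)
r_log = qq_correlation(logged)

print(f"original scale: r = {r_raw:.3f}")
print(f"log scale:      r = {r_log:.3f}")
```

A value of r closer to 1 on the log scale corresponds to points that hug a straight line more closely. Any later analysis would then be carried out on the log scale.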
Normal Probability Plots
The normal probability plot below looks pretty good. Not perfect, but reasonable to assume approximate normality.
[Figure: Normal Q-Q Plot. x-axis: Theoretical Quantiles; y-axis: Sample Quantiles. The reference line (in blue) connects the values at the 25th and 75th percentiles.]
And yes, I simulated the above data from a normal distribution.
Normal Probability Plots
I’d like to spend more time with normal probability plots, but due to time constraints, I want you to know two main things...
1. We use a normal probability plot to check for normality.
2. What the normal probability plot looks like when the data are normally distributed (and not normally distributed).