jody-freeman
Statistics, Probability, and Decision Making
Which trial represents the length?
Most feel the mean is the best estimate.
Trial   Length
1       25.45
2       25.40
3       25.50
4       25.42
5       25.38
Mean    25.43
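As a quick check on the arithmetic, the mean of the five trial lengths can be recomputed. This is a minimal sketch using Python's standard library; the variable names are illustrative and not part of the original slides. The five lengths sum to 127.15, so the mean is 25.43 to two decimals.

```python
from statistics import mean

# Lengths from the five trials listed in the table.
lengths = [25.45, 25.40, 25.50, 25.42, 25.38]

# The mean is the sum of the lengths divided by the number of trials.
best_estimate = mean(lengths)
print(f"{best_estimate:.2f}")
```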
How Precise is the Estimate?
You decide that the length is 25.43.
But look at the measurements. Is 25.50 a misfit?
What about an unexpected value?
• Get rid of it…
• No, you need a statistical reason!
• Only if it was a mistake.
Is it a mistake?
An outlier: A single observation "far away" from the rest.
Q: How far away is “far away”?
A: It depends on whether the value falls within a “reasonable” range of the rest.
Decisions, decisions…
Rejecting Data in a Small Data Set
Run the “Q-test.”
To test 25.50, calculate Q.
Q = |suspect value - value closest to it| / range
Q = (25.50 - 25.45) / (25.50 - 25.38) = 0.05 ÷ 0.12 ≈ 0.42
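The Q calculation for the suspect value 25.50 can be sketched in a few lines of Python. The function name `q_statistic` is hypothetical and chosen only for illustration; it assumes the suspect value appears in the data set and that the data are not all identical (so the range is nonzero).

```python
def q_statistic(values, suspect):
    """Gap between the suspect and its nearest neighbour, divided by the range."""
    others = list(values)
    others.remove(suspect)  # drop one copy of the suspect itself
    gap = min(abs(suspect - v) for v in others)
    data_range = max(values) - min(values)
    return gap / data_range

# The five trial lengths from the slides.
lengths = [25.45, 25.40, 25.50, 25.42, 25.38]

q = q_statistic(lengths, 25.50)
print(round(q, 2))
```

Rounded to two decimals, this reproduces the 0.42 obtained by hand from 0.05 ÷ 0.12.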
Compare Qcalculated with Qcritical
• If Qcalc > Qcritical, reject.
• If Qcalc < Qcritical, keep.
Number of trials             3     4     5     6     7     8     9     10
Qcritical (90% confidence)   0.94  0.76  0.64  0.56  0.51  0.47  0.44  0.41
From the previous example…
Qcalc = 0.42
N = 5, Qcritical = 0.64
• If Qcalc > Qcritical, reject.
• If Qcalc < Qcritical, keep.
Here 0.42 < 0.64, so keep 25.50.
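The comparison against the critical value can be written as a small lookup. This is a sketch only: the dictionary copies the 90% confidence values from the slides, and the function name `q_test_decision` is hypothetical. Treating an exact tie (Qcalc equal to Qcritical) as "keep" is an assumption, since the slides do not cover that case.

```python
# 90% confidence critical values from the slides, keyed by number of trials.
Q_CRITICAL_90 = {3: 0.94, 4: 0.76, 5: 0.64, 6: 0.56,
                 7: 0.51, 8: 0.47, 9: 0.44, 10: 0.41}

def q_test_decision(q_calc, n_trials):
    """Return 'reject' if q_calc exceeds the critical value, else 'keep'."""
    q_crit = Q_CRITICAL_90[n_trials]  # KeyError for n outside 3..10
    # Assumption: a tie is treated as 'keep'.
    return "reject" if q_calc > q_crit else "keep"

decision = q_test_decision(0.42, 5)
print(decision)
```

For the example data, Qcalc = 0.42 with N = 5 falls below 0.64, so the suspect value 25.50 stays in the data set.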
Rejecting data in a large set
• Find the confidence interval:
µ ± 3σ
• Does the measurement fall outside the confidence interval?
Use a Normal Distribution
About 95% of the data falls within two standard deviations of the mean, and about 99.7% within three.
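The coverage figures for a normal distribution can be checked numerically. This sketch relies on the standard identity that the fraction of a normal distribution within k standard deviations of the mean is erf(k/√2); the function name `normal_coverage` is illustrative.

```python
import math

def normal_coverage(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))

# Coverage for 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(k, round(normal_coverage(k), 4))
```

Two standard deviations cover roughly 95% of the data and three cover roughly 99.7%, which is why a value outside µ ± 3σ is treated as suspicious in a large data set.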
Outliers…
Q: Why worry about them?
A: Values may not be properly distributed.
Q: Where do they come from?
A: Possible sources:
1. Recording and measurement errors
2. Incorrect distribution
3. Unknown data structure
[Figure: outliers are shown in red]
Managing Outliers
If the data follow a normal distribution:
1. Calculate the mean and the standard deviation.
2. Find the ±3 standard deviation range for imposing limits on the data.
3. Identify outliers (values more than 3 standard deviations from the mean).
4. Get rid of them!
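The four steps above can be sketched as a short Python function. Everything here is illustrative: `split_outliers` is a hypothetical name, the readings are made-up data rather than values from the slides, and using the population standard deviation (`pstdev`) instead of the sample standard deviation is an assumption.

```python
from statistics import mean, pstdev

def split_outliers(data, k=3.0):
    """Split data into (kept, outliers) using mean ± k standard deviations."""
    mu = mean(data)                              # step 1: mean
    sigma = pstdev(data)                         # step 1: standard deviation (assumed population)
    low, high = mu - k * sigma, mu + k * sigma   # step 2: limits
    kept = [x for x in data if low <= x <= high]
    outliers = [x for x in data if x < low or x > high]  # step 3: identify
    return kept, outliers                        # step 4: drop by using 'kept'

# Hypothetical readings: 30 values near 10.0 plus one suspicious value.
readings = [10.0] * 10 + [10.2] * 10 + [9.8] * 10 + [14.0]

kept, outliers = split_outliers(readings)
print(outliers)
```

This approach only makes sense for a large data set. In a set of just a few points, a single extreme value inflates the standard deviation so much that it can never fall outside ±3 standard deviations, which is why the Q-test is used for small sets.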