Information Analysis
Gaussian or Normal Distribution
[Figure: normal distribution curve; x-axis: X (0 to 350), y-axis: Probability]
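$\mu$ = mean, estimated as $\bar{x}$
$\bar{x}$ = observed sample mean = $\sum x_i / n$
$\sigma$ = standard deviation, estimated as $s$
$n$ = sample size
$s$ = observed standard deviation

$$ s = \left[ \frac{\sum x_i^2}{n-1} - \frac{\left(\sum x_i\right)^2}{n(n-1)} \right]^{1/2} $$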
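As a minimal sketch of the two estimates defined above, the following Python computes the sample mean and the sample standard deviation from the sums of $x_i$ and $x_i^2$. The data values are hypothetical and only illustrate the arithmetic; the function names are not from the slides.

```python
import math
import statistics

def sample_mean(xs):
    """Observed sample mean: sum of x divided by n."""
    return sum(xs) / len(xs)

def sample_std(xs):
    """Observed standard deviation from the sums of x and x squared."""
    n = len(xs)
    sum_x = sum(xs)
    sum_x2 = sum(x * x for x in xs)
    return math.sqrt(sum_x2 / (n - 1) - sum_x ** 2 / (n * (n - 1)))

# Hypothetical measurements, used only to exercise the formulas.
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

print("mean:", sample_mean(data))
print("s:", round(sample_std(data), 3))
# Cross-check against the standard library's (n - 1) estimator.
print("statistics.stdev:", round(statistics.stdev(data), 3))
```

The sum-of-squares form is convenient by hand but can lose precision when the mean is large compared with the spread; `statistics.stdev` is the safer choice in practice.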
Area under curve = 1
Coefficient of Variation
$C_v = s / \bar{x}$
[Figure: two normal curves with the same mean (150) and different spreads; x-axis: X, y-axis: Probability]
Mean 150, s = 20: Cv = 20/150 = 0.13
Mean 150, s = 60: Cv = 60/150 = 0.40
Example
100 kg of glass is recovered from municipal refuse and processed. The glass is crushed and sieved. Plot the cumulative distribution of particle size from the data below.
4 mm holes: 10 kg of glass remained on the sieve (90 kg went through)
3 mm holes: 25 kg remained on the sieve
2 mm holes: 35 kg remained on the sieve
1 mm holes: 20 kg remained on the sieve
No holes (pan): 10 kg went all the way through
[Figure: bar chart of fraction retained vs. sieve size (mm): 4, 3, 2, 1, Pan]
Sieve Size (mm)    Fraction Retained
4                  10/100 = 0.10
3                  25/100 = 0.25
2                  35/100 = 0.35
1                  20/100 = 0.20
<1 (pan)           10/100 = 0.10
Cumulative Distribution
Sieve Size (mm)    Fraction Smaller Than Sieve Size
4                  1 – 0.1 = 0.9
3                  1 – (0.1 + 0.25) = 0.65
2                  1 – (0.35 + 0.35) = 0.3
1                  1 – (0.7 + 0.2) = 0.1
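The cumulative calculation above can be sketched in a few lines of Python. The masses are the ones given in the example; the variable and function names are illustrative only.

```python
# Mass of glass (kg) retained on each sieve, largest opening first.
# "pan" is the material that passed the 1 mm sieve.
retained_kg = [("4 mm", 10), ("3 mm", 25), ("2 mm", 35), ("1 mm", 20), ("pan", 10)]

def cumulative_distribution(retained):
    """Return (label, fraction retained, fraction smaller than this sieve)."""
    total = sum(kg for _, kg in retained)
    rows = []
    retained_so_far = 0
    for label, kg in retained:
        retained_so_far += kg
        # Everything not yet retained passed through this sieve.
        fraction_smaller = (total - retained_so_far) / total
        rows.append((label, kg / total, fraction_smaller))
    return rows

rows = cumulative_distribution(retained_kg)
for label, frac_retained, frac_smaller in rows:
    print(f"{label:>5}  retained {frac_retained:.2f}  smaller {frac_smaller:.2f}")
```

The pan row comes out as zero because nothing is finer than the material that passed every sieve; only the four sieve sizes are plotted.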
[Figure: cumulative distribution; x-axis: Particle Size (mm), y-axis: Fraction of particles smaller than size indicated]
Graphs
Independent variable Abscissa (x-axis)
Dependent variable Ordinate (y-axis)
A variable is independent if its value is chosen, like sieve size in the previous example.
A variable is dependent if its value is determined by experiment.
Probability Paper
The x-axis is linear.
The y-axis is scaled so that if the distribution is normal (Gaussian), the cumulative probability plots as a straight line.
If this is the case, the mean is at 0.5 (50%), and one standard deviation spans a cumulative probability of about 0.34 on either side of the mean (from about 0.16 to 0.84).
You can also estimate s from the 10% and 90% values: $s \approx \frac{2}{5}(x_{90} - x_{10})$
Example
Consider the recycled glass data from the previous example. What are the mean, the standard deviation, and the 95% interval?
The mean is the value on the x-axis where the y-axis value is 0.5: 2.4 mm.
The standard deviation is the spread around the mean such that 68% of the data fall within the range (about 34% on either side of the mean).
0.5 + 0.34 = 0.84, which corresponds to 3.5 mm, so s = 3.5 – 2.4 = 1.1 mm. Alternatively, s = 2/5 (3.9 – 1.0) = 1.16 mm.
The 95% interval contains 95% of the data, that is, the range between cumulative fractions 0.025 and 0.975, or 0.2 mm to 4.8 mm.
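Reading values off probability paper amounts to fitting a straight line through the points after the cumulative fractions are converted to standard normal scores. A rough numerical sketch of that idea follows, assuming an ordinary least-squares line through the four sieve points (the slides fit the line by eye, so the numbers differ slightly from the 2.4 mm and 1.1 mm read from the graph).

```python
from statistics import NormalDist

# Cumulative distribution from the glass example.
sizes_mm = [1.0, 2.0, 3.0, 4.0]
fraction_smaller = [0.10, 0.30, 0.65, 0.90]

def fit_probability_line(xs, fractions):
    """Least-squares line x = mean + s * z, where z is the normal score."""
    std_normal = NormalDist()
    z = [std_normal.inv_cdf(f) for f in fractions]
    n = len(xs)
    z_bar = sum(z) / n
    x_bar = sum(xs) / n
    slope = (sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, xs))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = x_bar - slope * z_bar
    return intercept, slope  # intercept is the mean, slope is s

mean_mm, s_mm = fit_probability_line(sizes_mm, fraction_smaller)

# The 2/5 (x90 - x10) shortcut, evaluated on the fitted line.
z90 = NormalDist().inv_cdf(0.90)
x10 = mean_mm - z90 * s_mm
x90 = mean_mm + z90 * s_mm
s_shortcut = 0.4 * (x90 - x10)

print(f"mean ~ {mean_mm:.2f} mm, s ~ {s_mm:.2f} mm, shortcut s ~ {s_shortcut:.2f} mm")
```

The shortcut works because the 10% and 90% points of a normal distribution lie about 1.28 standard deviations on either side of the mean, so their difference is about 2.56 s and two fifths of that is close to s.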
Return Period
Return period is how often an event is expected to recur.
If the annual probability of an event occurring is 5%, then the event can be expected to occur once every 20 years, or have a return period of 20 years:
Return period = 1/fractional probability
To determine return periods, first rank time-variant data (smallest to largest or largest to smallest) then calculate the probabilities and plot the data.
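A short sketch of the two steps just described: the return period as the reciprocal of the fractional probability, and ranking data to assign each value a cumulative probability. The plotting position m/n (rank m out of n observations) is the one used in the example that follows; the sample values here are hypothetical.

```python
def return_period(fractional_probability):
    """Return period = 1 / fractional probability of the event."""
    if not 0 < fractional_probability <= 1:
        raise ValueError("probability must be in (0, 1]")
    return 1.0 / fractional_probability

def plotting_positions(values):
    """Rank values smallest to largest and pair each with m/n."""
    ranked = sorted(values)
    n = len(ranked)
    return [(m / n, value) for m, value in enumerate(ranked, start=1)]

# A 5% annual probability corresponds to a 20-year return period.
print(return_period(0.05))

# Hypothetical observations, used only to show the ranking step.
sample = [30, 10, 20, 40]
for probability, value in plotting_positions(sample):
    print(f"{probability:.2f}  {value}")
```

With m/n the largest value is assigned a probability of exactly 1, which cannot be placed on a normal probability scale; other plotting positions such as m/(n+1) avoid this, but that is a variation not used in the slides.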
Return Period Example
The data below are from a wastewater treatment plant. BOD is a measure of organic pollution in water. The BOD is measured daily.
Do these data fit the normal distribution? Can they be used to calculate the mean and standard deviation? What is the worst quality expected in 30 days?
First, rank the data:
Now plot the data. We will plot m/n (the cumulative probability, where m is the rank and n is the number of observations) versus the BOD.
It does fit the normal distribution fairly well
The mean is about 35 mg/L BOD
To find the worst quality in a 30-day period, calculate 29/30 = 0.967. This is the fraction of days on which the quality is better than on the worst day out of 30.
Enter the graph at 0.967 and find the answer: 67 mg/L BOD
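The 29/30 step generalizes to any return period: for an event expected once in N days, enter the graph at (N - 1)/N. The sketch below computes that fraction and, purely as an illustration, the standard deviation a normal distribution would need in order to pass through both values read from the graph (mean 35 mg/L, 67 mg/L at 0.967). That implied s is an inference, not a figure stated in the slides.

```python
from statistics import NormalDist

def non_exceedance_fraction(return_period_days):
    """Fraction of days with better quality than the worst day in the period."""
    return (return_period_days - 1) / return_period_days

p = non_exceedance_fraction(30)
print(f"enter the graph at {p:.3f}")

# Values read from the graph in the example.
mean_bod = 35.0   # mg/L at cumulative probability 0.5
worst_bod = 67.0  # mg/L at cumulative probability 29/30

# Normal score for 29/30, and the standard deviation consistent with
# both readings IF the data were exactly normal (an assumption).
z = NormalDist().inv_cdf(p)
implied_s = (worst_bod - mean_bod) / z
print(f"z = {z:.2f}, implied s ~ {implied_s:.1f} mg/L")
```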
Sometimes data is analyzed after it is grouped. Often the mean is used to analyze the data.
Example:
Using the data from the previous problem, estimate the highest BOD expected to occur once every 30 days using grouped data analysis.
First define groups of BOD values.
Now plot these data. Notice how the data points form a curve. This means the data don't really fit the normal distribution, but we'll go ahead anyway.
Now P29/30 = 0.967 and we read 67 mg/L BOD from the graph.
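The grouping step can be sketched as follows. The BOD readings and group boundaries here are hypothetical stand-ins, since the original data table is not reproduced in the text; the sketch only shows how each group's upper bound is paired with the cumulative fraction of readings at or below it before plotting.

```python
def grouped_cumulative(values, upper_bounds):
    """Pair each group's upper bound with the fraction of values <= it."""
    n = len(values)
    rows = []
    for upper in upper_bounds:
        count = sum(1 for v in values if v <= upper)
        rows.append((upper, count / n))
    return rows

# Hypothetical daily BOD readings (mg/L) and group upper bounds.
bod_mg_per_l = [12, 18, 25, 27, 31, 33, 36, 38, 44, 52]
group_upper_bounds = [20, 30, 40, 50, 60]

table = grouped_cumulative(bod_mg_per_l, group_upper_bounds)
for upper, fraction in table:
    print(f"BOD <= {upper:>2} mg/L: {fraction:.2f}")
```

Each (upper bound, cumulative fraction) pair is then plotted on probability paper in the same way as the ranked individual readings.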