Statistics for Analytical Chemistry

Reading: lots to revise and learn
- Chapter 3
- Chapter 4
- Chapter 5-1 and 5-2
- Chapter 5-3 will be necessary background for the AA lab
- Chapter 5-4 we will use later

Data Analysis
- Most data are quantitative, derived from measurements
- We never really know the error
- With more measurements you get a better idea of what it might be
- Don't spend a lot of time on an answer
  - where only 20% accuracy is required
  - or where sampling error is big (although you don't want to make the error worse)

Significant Figure Convention
- The final answer should only contain figures that are certain, plus the first uncertain number
- e.g. 45.2%
  - error is less than 1%, or we would only write 45%
  - error is larger than 0.05%, or we would write 45.23%

Remember
- Leading zeros are not significant
- Trailing zeros after the decimal point are significant
  - 0.06037: 4 significant figures
  - 0.060370: 5 significant figures
- 1200 ???? (ambiguous as written)
  - 12 x 10^2: 2 significant figures

Rounding Off
- Round a 5 to the nearest even number
  - 4.55 to 4.6
- Carry an extra figure all through calculations
  - BUT NOT 6 EXTRA
- Just round off at the end
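
The round-to-even rule can be checked with a short sketch. This assumes Python's standard `decimal` module; the helper name `round_half_even` is made up for illustration. Values are passed as strings so that binary floating point does not disturb the trailing 5.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def round_half_even(value: str, places: str) -> Decimal:
    """Round a decimal string to the precision of `places` (e.g. "0.1").

    An exact trailing 5 goes to the nearest even digit.
    """
    return Decimal(value).quantize(Decimal(places), rounding=ROUND_HALF_EVEN)


print(round_half_even("4.55", "0.1"))  # 4.6 (rounds up, 6 is even)
print(round_half_even("4.45", "0.1"))  # 4.4 (rounds down, 4 is even)
```

Rounding only the final result, as the slide advises, means calling a helper like this once at the end rather than after every intermediate step.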

Adding
- Absolute uncertainty of the answer must not exceed that of the most uncertain number
- Simple rule: decimal places in answer = decimal places in the number with the fewest places

    12.2
     0.365
     1.04
    ------
    13.605  goes to 13.6

When errors are known
$$R \pm r = (A \pm a) + (B \pm b) + (C \pm c)$$
where
$$r^2 = a^2 + b^2 + c^2$$

Example: Calculate the error in the MW of FeS from the following atomic weights:
- Fe: 55.847 ± 0.004
- S: 32.064 ± 0.003
$$r = (0.004^2 + 0.003^2)^{1/2} = 0.005$$
- MW = 87.911 ± 0.005
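
The FeS example can be reproduced in a few lines. This is a minimal sketch; the function name `sum_uncertainty` is an illustrative choice, not something from the course material.

```python
import math


def sum_uncertainty(*uncertainties: float) -> float:
    """Absolute uncertainty of a sum or difference: root of the summed squares."""
    return math.sqrt(sum(u ** 2 for u in uncertainties))


fe, fe_err = 55.847, 0.004   # atomic weight of Fe and its uncertainty
s, s_err = 32.064, 0.003     # atomic weight of S and its uncertainty

mw = fe + s
mw_err = sum_uncertainty(fe_err, s_err)
print(f"MW(FeS) = {mw:.3f} +/- {mw_err:.3f}")  # MW(FeS) = 87.911 +/- 0.005
```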

Multiplication and Division
- Simplest rule: sig figs in answer = smallest number of sig figs in any value used
- This can lead to problems, particularly if the first digit of the number is 9
  - 1.07400 x 0.993 = 1.07
  - 1.07400 x 1.002 = 1.076
  - The error is ~1/1000 in both cases, therefore 4 significant figs in the answer (1.066 for the first product)

Multiplication and Division
- The relative uncertainty of the answer must fall between 0.2 and 2.0 times the largest relative uncertainty in the data used in the calculation
- Unless otherwise specified, the absolute uncertainty in an experimental measurement is taken to be ± 1 in the last digit

Multiplication and Division
- With known errors, add squares of relative uncertainties
$$\frac{r}{R} = \left[\left(\frac{a}{A}\right)^2 + \left(\frac{b}{B}\right)^2 + \left(\frac{c}{C}\right)^2\right]^{1/2}$$
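
As a sketch of the relative-uncertainty rule, the product 1.07400 x 0.993 from the earlier slide can be propagated numerically. The ± 1 in the last digit on each factor is the default assumption stated above; the function name is hypothetical.

```python
import math


def relative_uncertainty(*pairs: tuple[float, float]) -> float:
    """Relative uncertainty r/R of a product or quotient.

    Each pair is (value, absolute uncertainty).
    """
    return math.sqrt(sum((err / value) ** 2 for value, err in pairs))


a, a_err = 1.07400, 0.00001   # assumed +/- 1 in the last digit
b, b_err = 0.993, 0.001       # assumed +/- 1 in the last digit

product = a * b
rel = relative_uncertainty((a, a_err), (b, b_err))
print(f"product = {product:.4f}, r/R = {rel:.1e}, r = {product * rel:.4f}")
```

The relative uncertainty comes out at about 1 part in 1000, dominated by the three-figure factor, which is why four figures in the answer are justified.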

Logs
- Only figures in the mantissa (after the decimal point) are significant figures
- Use as many places in the mantissa as there are significant figures in the corresponding number
- pH = 2.45 has 2 sig figs

Definitions
- Arithmetic mean (average)
- Median: middle value
  - for N = even number, use the average of the central pair

Accuracy
- Deviation from true answer
- Difficult to know
- Best way is to use reference standards
  - National Bureau of Standards
  - Traceable standards

Precision
- Describes reproducibility of results
- Is what is used to calculate the confidence limit
- Can use deviation from mean or relative deviation
  - 0.1/5 x 1000 = 20 ppt (parts per thousand)
  - 0.1/5 x 100% = 2%

Precision of Analytical Methods
- Absolute standard deviation, s or sd
- Relative standard deviation (RSD)
- Standard deviation of the mean, s_m
$$s_m = \frac{s}{N^{1/2}}$$
- Coefficient of variation (CV): $(s/\bar{x}) \times 100\%$
- Variance: $s^2$
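
The figures of merit listed above can be computed together. A minimal sketch follows; the readings are hypothetical values chosen for illustration, and `precision_summary` is not a library function.

```python
import math
import statistics


def precision_summary(values: list[float]) -> dict[str, float]:
    """Common precision measures for a set of replicate results."""
    mean = statistics.mean(values)
    s = statistics.stdev(values)  # sample standard deviation (N - 1)
    return {
        "mean": mean,
        "s": s,
        "rsd_ppt": s / mean * 1000,            # relative sd, parts per thousand
        "cv_percent": s / mean * 100,          # coefficient of variation
        "s_mean": s / math.sqrt(len(values)),  # standard deviation of the mean
        "variance": s ** 2,
    }


readings = [5.0, 5.1, 4.9, 5.2, 4.8]  # hypothetical replicate results
summary = precision_summary(readings)
for name, value in summary.items():
    print(f"{name:>10}: {value:.4f}")
```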

Standard Curve
[Figure: calibration plot of readout versus [Ca] (mg/L), with fitted line y = 1.9311x + 1.1127, R^2 = 0.988805]
- Not necessarily linear. Linear is mathematically easier to deal with.

Correlation coefficients
- Show how good a fit you have
- R or R^2
- For perfect correlation, R = 1, R^2 = 1
$$R^2 = \frac{\left[\sum (x_i - \bar{x})(y_i - \bar{y})\right]^2}{\sum (x_i - \bar{x})^2 \sum (y_i - \bar{y})^2}$$

LINEST
- Calculates slope and intercept
- Calculates the uncertainty in the slope and the intercept
- Calculates R^2
- Calculates s.d. of the population of y values
- See pp 68-72, Harris
- Use these values to determine the number of sig figs for the slope and intercept
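
For readers without a spreadsheet to hand, the same quantities can be sketched in plain Python. This is not LINEST itself, only the standard unweighted least-squares formulas; the function name and the four data points are hypothetical.

```python
import math


def linear_fit(x: list[float], y: list[float]) -> dict[str, float]:
    """Unweighted least-squares line y = m*x + b with uncertainties and R^2."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxx = sum(xi * xi for xi in x)
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    d = n * sxx - sx ** 2

    m = (n * sxy - sx * sy) / d          # slope
    b = (sxx * sy - sxy * sx) / d        # intercept

    ss_res = sum((yi - (m * xi + b)) ** 2 for xi, yi in zip(x, y))
    s_y = math.sqrt(ss_res / (n - 2))    # s.d. of y about the line
    s_m = s_y * math.sqrt(n / d)         # uncertainty in slope
    s_b = s_y * math.sqrt(sxx / d)       # uncertainty in intercept

    y_mean = sy / n
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return {"m": m, "b": b, "s_m": s_m, "s_b": s_b, "s_y": s_y,
            "r2": 1 - ss_res / ss_tot}


fit = linear_fit([1, 3, 4, 6], [2, 3, 4, 5])  # hypothetical calibration points
print({name: round(value, 4) for name, value in fit.items()})
```

The uncertainties `s_m` and `s_b` are what set the number of significant figures worth quoting for the slope and intercept.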

Dealing with Random Errors

Indeterminate Error
- Repeating a coarse measurement gives the same result
  - e.g. weighing a 50 g object to the nearest g: the only error would be determinate, such as a fault in the balance
- If the same object was weighed to several decimal places, you would get random errors

- How many eggs in a dozen?
- How wide is your desk?
- Will everyone get the same answer?
- What does this depend on?

With a few measurements, the mean won't reflect the true mean as well as if you take a lot of measurements.

Random errors
- With many measurements, more will be close to the mean
- Various little errors add in different ways
- Some cancel; sometimes they will all be one way
- A plot of frequency versus value gives a bell curve, or Gaussian curve, or normal error curve
- Errors in a chemical analysis will fit this curve

Equation for Gaussian Curve
$$y = \frac{e^{-(x_i - \mu)^2 / 2\sigma^2}}{\sigma\sqrt{2\pi}}$$
Let
$$z = \frac{x_i - \mu}{\sigma}$$
Then
$$y = \frac{e^{-z^2/2}}{\sigma\sqrt{2\pi}}$$

If z is the abscissa (x axis)
- The same curve is always obtained, as z expresses the deviation from the mean in units of standard deviation

Statistics
- Statistics apply to an infinite number of results
- Often we only do an analysis 2 or 3 times and want to use the results to estimate the mean and the precision

[Figure: Gaussian curve with the areas within ±1σ, ±2σ and ±3σ of the mean marked]

Standard deviation
- 68.3% of area is within ± 1σ of the mean
- 95.5% of area is within ± 2σ of the mean
- 99.7% of area is within ± 3σ of the mean
- For any analysis, chances are 95.5 in 100 that the error is within ± 2σ
- Can say the answer is within ± 2σ with 95.5% confidence
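
These areas follow from integrating the Gaussian curve, which can be checked with the error function in Python's standard `math` module. A minimal sketch; `area_within` is an illustrative name.

```python
import math


def area_within(k: float) -> float:
    """Fraction of a Gaussian's area within +/- k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"+/- {k} sigma: {100 * area_within(k):.2f}%")
```

The ± 2σ value evaluates to about 95.45%, which is why it is quoted as either 95.4% or 95.5% depending on how it is rounded.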

For a large data set
- Get a good estimate of the mean, μ
- Standard deviation, σ:
$$\sigma = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}}$$
- Know this formula, but use a calculator
- σ² = variance
  - Useful because additive

Small set of data
- Average (x̄)
- An extra uncertainty
- The standard deviation calculated will differ for each small set of data used
- It will be smaller than the value calculated over the larger set
- Could call that a negative bias

s (sample standard deviation)
- For σ use N in the denominator
- For s use N-1 in the denominator (we have one less degree of freedom, as we don't know μ)
- At the end, round s to 2 sig figs, or fewer if there are not enough sig figs in the data
$$s = \sqrt{\frac{\sum_{i=1}^{N} (x_i - \bar{x})^2}{N - 1}}$$
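
The effect of the N versus N-1 denominator is easy to see numerically. The sketch below uses the five calcium results from the worked example later in this deck; the function `std_dev` is written out by hand for illustration.

```python
import math


def std_dev(values: list[float], sample: bool = True) -> float:
    """Standard deviation with N - 1 (sample) or N (population) in the denominator."""
    n = len(values)
    mean = sum(values) / n
    sum_sq = sum((v - mean) ** 2 for v in values)
    return math.sqrt(sum_sq / (n - 1 if sample else n))


calcium = [14.35, 14.41, 14.40, 14.32, 14.37]  # % Ca, from the worked example

print(f"N - 1 denominator: {std_dev(calcium):.3f}")                # 0.037
print(f"N denominator:     {std_dev(calcium, sample=False):.3f}")  # 0.033
```

With only five results the N denominator gives a noticeably smaller value, which is the negative bias the N-1 correction removes.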

Confidence Interval
- We are doing an analysis to find the true mean μ; it is unknown
- What we measure is x̄, but it may not be the same as μ
- Set a confidence limit, e.g. 4.5 ± 0.3 g
  - The mean of the measurements was 4.5 g
  - The true mean is in the interval 4.2-4.8 g with some specified degree of confidence

Confidence limit
- A measure of the reliability (R_e)
- The reliability of a mean (x̄) increases as more measurements are taken
$$R_e = k\,n^{1/2}$$
- Reliability increases with the square root of the number of measurements
- Quickly reach a condition of diminishing returns

Reliability
- Would you want a car that is 95% reliable?
- How often would that break down?

Confidence Interval
- For 100% confidence, need a huge interval
- Often use 95%
- The confidence level chosen can change with the reason for the analysis

Confidence Interval when s ~ σ
- μ = x_i ± 1.96σ for 95% confidence
$$z = \frac{x_i - \mu}{\sigma} = 1.96$$
- Appropriate z values are given as a table
- This applies to a single measurement
- The confidence limit shrinks by a factor of N^{1/2} as more measurements are taken: for the mean of N measurements, μ = x̄ ± zσ/N^{1/2}

Confidence Interval
- In the lab this year I will make you go home before you can get enough data for s to equal σ
- Therefore we will have to do a different kind of calculation to estimate the precision

Student's t-test
- The Student's t-test was formulated by W. S. Gosset in the early 1900s. His employer (a brewery) had regulations concerning trade secrets that prevented him from publishing his discovery, but in light of the importance of the t distribution, Gosset was allowed to publish under the pseudonym "Student".
- The t-test is typically used to compare the means of two populations

t-test
- t depends on
  - desired confidence limit
  - degrees of freedom (N-1)
$$t = \frac{x_i - \mu}{s}$$

Values of t for various levels of probability

Degrees of freedom    80%     90%     95%     99.9%
1                     3.08    6.31    12.7    637
2                     1.89    2.92    4.30    31.6
3                     1.64    2.35    3.18    12.9
4                     1.53    2.13    2.78    8.60
5                     1.48    2.02    2.57    6.86
6                     1.44    1.94    2.45    5.96
7                     1.42    1.90    2.36    5.40
8                     1.40    1.86    2.31    5.04
∞                     1.28    1.64    1.96    3.29

For practical purposes
- Assume σ = s if you have made 20 measurements
- Sometimes σ can be evaluated for a particular technique rather than for each sample
- Usually too time consuming to do 20 replicate measurements on each sample

CONFIDENCE
$$\bar{x} \pm \frac{ts}{\sqrt{N}}$$

Example
- Cal Culator obtained the following results for replicate determinations of calcium in limestone:
  - 14.35%, 14.41%, 14.40%, 14.32%, 14.37%
  - each is x_i
- Calculate the confidence interval

Answer
- Average = 14.37%
- s = 0.037%
- Choose a 95% confidence limit
- Degrees of freedom = N-1 = 5-1 = 4
- From the t-table, t = 2.78
- 14.37% ± ts/N^{1/2}
- 14.37% ± 2.78 x 0.037% / 5^{1/2}
- 14.37 ± 0.05%

Significant figures
- I say: use two or fewer significant figures in a confidence limit. Then use the same number of decimal places in both the mean and the CL (guided by the CL)
- When fewer than two sig figs in the CL?
  - When using two would require you to have more decimal places than were in the actual data

The bunny gave up
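
Pooled standard deviation
$$s_{pooled} = \sqrt{\frac{\sum_{i=1}^{n_1} (x_i - \bar{x}_1)^2 + \sum_{j=1}^{n_2} (x_j - \bar{x}_2)^2}{n_1 + n_2 - 2}}$$
$$s_{pooled} = \sqrt{\frac{s_1^2 (n_1 - 1) + s_2^2 (n_2 - 1) + \dots}{N - n_s}}$$
where N = total number of measurements (n_1 + n_2 + ...) and n_s = no. of groups of samples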
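
A minimal sketch of the pooled calculation for any number of groups follows. The two groups of replicate results are hypothetical, as is the function name.

```python
import math


def pooled_std_dev(*groups: list[float]) -> float:
    """Pooled standard deviation of several groups of replicate measurements."""
    sum_sq = 0.0
    for group in groups:
        mean = sum(group) / len(group)
        sum_sq += sum((x - mean) ** 2 for x in group)
    # total measurements minus one degree of freedom per group
    dof = sum(len(group) for group in groups) - len(groups)
    return math.sqrt(sum_sq / dof)


group_1 = [10.1, 10.3, 10.2]        # hypothetical replicates, sample 1
group_2 = [12.4, 12.6, 12.5, 12.5]  # hypothetical replicates, sample 2

s_pooled = pooled_std_dev(group_1, group_2)
print(f"s_pooled = {s_pooled:.3f}")
```

Each group contributes deviations from its own mean, so groups with different means can still be pooled as long as the same method and precision apply to all of them.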

Comparison of Means
- We analyze several samples and want to know if they are the same or different
- For each sample we take several measurements and obtain a mean
- If
$$|\bar{x}_1 - \bar{x}_2| < t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
  there is no significant difference

Comparing two means
- Compare $\bar{x}_1 - \bar{x}_2$ to the individual confidence limits $ts_1/\sqrt{n_1}$ and $ts_2/\sqrt{n_2}$, or to the combined limit
$$\Delta\bar{x} = t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$
- If s is the pooled standard deviation:
$$t_{calc} = \frac{|\bar{x}_1 - \bar{x}_2|}{s}\sqrt{\frac{n_1 n_2}{n_1 + n_2}}$$
- If t_calc < t_table, then the difference is not significant at the chosen CL

Example
- Two barrels of wine were analyzed for their alcohol content to determine whether or not they were from different sources:
  - 12.61% (6 analyses)
  - 12.53% (4 analyses)
  - Pooled standard deviation = 0.07%
- Degrees of freedom = 6 + 4 - 2 = 8
- t at 95% CL for 8 degrees of freedom = 2.3
$$t_{calc} = \frac{12.61\% - 12.53\%}{0.07\%}\sqrt{\frac{6 \times 4}{6 + 4}} = 1.77$$
- t_calc < t_table, therefore the difference is not significant at the 95% CL: the two samples are the same at the 95% CL
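
The wine comparison can be reproduced as a sketch. The function name `t_calc` is illustrative, and the table value is typed in from the t table in this deck rather than computed.

```python
import math


def t_calc(mean_1: float, mean_2: float, s_pooled: float, n_1: int, n_2: int) -> float:
    """Calculated t for comparing two means that share a pooled standard deviation."""
    return abs(mean_1 - mean_2) / s_pooled * math.sqrt(n_1 * n_2 / (n_1 + n_2))


t = t_calc(12.61, 12.53, 0.07, n_1=6, n_2=4)
t_table = 2.31  # 95% CL, 8 degrees of freedom, from the t table

print(f"t_calc = {t:.2f}")                  # t_calc = 1.77
print(f"significant at 95%: {t > t_table}")  # significant at 95%: False
```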

Rejection of data: Q Test
$$Q_{exp} = \frac{|\text{questionable value} - \text{nearest numerical value}|}{\text{range}}$$
- Look up a table of Q_critical
- If Q_exp < Q_critical, keep the point
- If more observations are taken, it is easier to determine if a point is an outlier
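
The Q test is short enough to sketch directly. The data set below is hypothetical, and Q_critical is passed in as a parameter because it has to be looked up in a published table; the value 0.64 is assumed here for five observations at 90% confidence.

```python
def q_experimental(values: list[float]) -> tuple[float, float]:
    """Return (Q_exp, suspect value) for the most isolated extreme of the data.

    Needs at least three values that are not all identical.
    """
    data = sorted(values)
    spread = data[-1] - data[0]     # range of the data
    gap_low = data[1] - data[0]     # gap if the lowest value is the suspect
    gap_high = data[-1] - data[-2]  # gap if the highest value is the suspect
    if gap_high >= gap_low:
        return gap_high / spread, data[-1]
    return gap_low / spread, data[0]


def keep_point(values: list[float], q_critical: float) -> bool:
    """True if the suspect point should be kept (Q_exp < Q_critical)."""
    q_exp, _ = q_experimental(values)
    return q_exp < q_critical


results = [12.47, 12.48, 12.53, 12.56, 12.67]  # hypothetical replicate results
q_exp, suspect = q_experimental(results)
print(f"suspect = {suspect}, Q_exp = {q_exp:.2f}")
print("keep" if keep_point(results, q_critical=0.64) else "reject")  # 0.64 is an assumed table value
```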

Calibration Sensitivity
- The slope of the calibration curve at the concentration of interest
- Doesn't take precision into account

Analytical Sensitivity
- Slope/s.d. = m/s
  - where s = standard deviation of the signal
- Analytical sensitivity is independent of gain, but can vary with the concentration, as s can depend on concentration

Limit of detection
- The minimum concentration detectable at a known confidence level
- Is the concentration corresponding to the lowest usable reading (LUR)
- LUR = average blank + k x s.d. of blank
- k determines the confidence level
  - We use k = 3 for a 95% C.L.
- Do not confuse LOD and LUR
  - Harris page 103
  - LUR corresponds to the signal detection limit
  - LOD corresponds to the concentration detection limit

When doing this in lab WE CHEAT
- We should have 20 measurements of the blank and we never do because of time constraints. To publish a result or for a paying client, we would need 20.

[Figure: calibration plot of readout versus concentration, with fitted line y = 1.9311x + 1.1127, R^2 = 0.9888]
- Ideally, the average blank = b (the intercept)
- However, if b > average blank, then recalculate LUR using LUR = b + k x s.d. of blank
- Usually say LUR = b + 3 sd
- LOD = 5.2 mg/L (k = 3)
  - Note the 2 significant figures

Quality Assurance
- Begins with sampling
- Calibration check
  - Run standards every few samples
  - Reference standards are of known concentration. Do you get the right answer?
  - Include in Table of Results
- SOPs are very important

SOP (Standard operating procedure)
- Set of written instructions that document a routine or repetitive activity which is followed by employees in an organization
- The development and use of SOPs is an integral part of a successful quality system
- Provides information to perform a job properly and consistently in order to achieve pre-determined specifications and quality

http://people.stfx.ca/tsmithpa/Chem361/

Numerical Criteria for Selecting Analytical Methods
- Precision
- Bias
- Sensitivity
- Detection limit
- Concentration range
- Selectivity

Other characteristics to be considered
- Speed
- Ease and convenience
- Skill required of operator
- Cost and availability of equipment
- Per-sample cost

Criterion              Figure of Merit
Precision              Absolute sd, relative sd, coefficient of variation, variance
Bias                   Absolute systematic error, relative systematic error
Sensitivity            Calibration sensitivity, analytical sensitivity
Limit of detection     Av. blank + 3 sd blank
Concentration range    LOQ to LOL (limit of linearity)
Selectivity            Coefficient of selectivity