The LOWER DELAWARE MONITORING PROGRAM’S
WATER QUALITY DATA ANALYSIS PROTOCOL
Robert Limbeck
Watershed Scientist, DRBC
NJ Volunteer Monitoring Summit
October 1, 2004
Lower Delaware
River Study Area
Control Point Monitoring Concepts
Study Design Objectives
Establish baseline Existing Water Quality for future comparison;
Set targets to maintain water quality where standards are met;
Set targets to improve water quality where standards are not met;
Set geographic and water quality priorities to meet the targets; and
Monitor long-term to assess trends, prioritize management activities, and assess effectiveness of implementation.
How does water quality change from the Delaware Water Gap to Trenton?
Which tributaries produce such changes?
Where should resources be devoted for most water quality benefit?
QA/QC Considerations
Know Your Data:
Precision - The degree of agreement of repeated measurements
Accuracy - How close your results are to a true or expected value
Representativeness - Data represent the true environmental condition
Completeness - Compare amount of valid or useable data planned to be collected versus the amount actually collected
Comparability - The extent to which data can be compared to other sample locations or periods of time
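Two of these measures reduce to simple arithmetic. The sketch below is illustrative only: the function names are hypothetical, and relative percent difference is one common way to express precision for duplicate samples, assumed here rather than taken from the LDMP QAPP. The 35-of-40 figure mirrors the Enterococcus data set later in this talk, where 5 of 40 cases were excluded for missing values.

```python
def percent_completeness(valid_count, planned_count):
    """Valid or usable results as a percentage of the results planned."""
    return 100.0 * valid_count / planned_count


def relative_percent_difference(a, b):
    """Difference between two duplicate measurements, as a percentage of their mean.

    Assumption: RPD is used as the precision measure; check the QAPP for the
    measure and acceptance limits that actually apply.
    """
    return 100.0 * abs(a - b) / ((a + b) / 2.0)


# 35 valid results out of 40 planned samples
print(percent_completeness(35, 40))

# Hypothetical duplicate dissolved oxygen readings, mg/l
print(relative_percent_difference(9.1, 9.4))
```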
Data Management
Go to www.drbc.net, click on Lower Delaware
See reports, QAPP, and data files for data management details
The LDMP Excel File
Available at www.drbc.net
Columns that are constant in the rows shown: Sampling Site# DRBCNJPAC10; rp P; rt R; st DR; RivMile 207.40; Shortnm DelPortland; Concatnm 207.4 DelPortland. Blank cells are shown as --.
Mon Year  yr-mo   Flow (cfs)  DO mg/l  DO%Sat  SpC umhos/cm  pH   Air Temp C  Air Temp F  WT C  WT F  TDS
2000      2000-5  6,398.00    9.1      101.7%  92            7.9  36.0        96.8        20.8  69.4  --
2000      2000-5  16,685.00   10.6     98.6%   70            7.7  14.4        57.9        12.1  53.8  --
2000      2000-6  28,181.00   10.2     100.9%  73            7.2  22.8        73.0        14.9  58.8  --
2000      2000-6  11,566.00   9.2      100.0%  69            7.4  25.6        78.1        19.4  66.9  164
2000      2000-7  3,173.00    7.8      90.9%   105           8.7  28.5        83.3        23.0  73.4  74
2000      2000-7  3,718.00    8.1      91.6%   97            7.8  20.4        68.7        21.4  70.5  70
2000      2000-8  5,920.00    8.8      103.9%  88            7.5  30.3        86.5        23.7  74.7  70
2000      2000-8  3,524.00    9.4      104.0%  99            8.2  23.5        74.3        20.3  68.5  110
2000      2000-9  3,718.00    8.8      100.5%  109           8.5  27.0        80.6        21.9  71.4  82
2000      2000-9  3,524.00    9.7      94.7%   94            7.9  16.9        62.4        14.3  57.7  196
2001      2001-5  3,285.00    10.9     113.7%  108           8.0  27.7        81.9        17.4  63.3  91
2001      2001-5  4,563.00    9.4      96.2%   110           7.4  25.5        77.9        16.5  61.7  280
2001      2001-6  6,534.00    10.5     112.3%  87            7.7  27.2        81.0        18.6  65.5  70
2001      2001-6  5,084.00    7.8      91.6%   102           7.4  25.6        78.1        23.4  74.1  70
2001      2001-7  3,129.00    8.2      96.0%   113           7.5  25.3        77.5        23.2  73.8  90
2001      2001-7  2,745.00    7.2      88.7%   122           7.6  26.4        79.5        26.0  78.8  83
2001      2001-8  2,558.00    6.7      86.0%   110           7.6  29.6        85.3        28.3  82.9  86
2001      2001-8  1,896.00    8.3      99.7%   115           7.4  31.4        88.5        24.6  76.3  86
2001      2001-9  2,175.00    8.6      98.7%   110           7.4  30.4        86.7        22.2  72.0  100
2001      2001-9  3,491.00    8.8      92.2%   120           7.9  20.6        69.1        17.6  63.7  150
2002      2002-5  7,200.82    9.4      94.8%   76            7.5  14.3        57.7        15.8  60.4  86
2002      2002-5  10,860.78   12.3     110.8%  65            7.4  24.3        75.7        10.7  51.3  79
2002      2002-6  7,934.48    9.4      100.7%  76            6.8  25.3        77.5        18.7  65.7  88
2002      2002-6  8,444.20    9.5      103.9%  88            6.9  29.6        85.3        19.7  67.5  65
2002      2002-7  2,613.66    8.0      95.2%   117           --   25.9        78.6        24.1  75.4  110
2002      2002-7  2,694.40    7.9      97.3%   109           7.6  20.7        69.2        26.0  78.8  94
2002      2002-8  2,013.44    8.4      99.8%   113           7.8  27.3        81.1        24.0  75.2  120
2002      2002-8  1,978.62    8.2      100.7%  112           7.6  27.9        82.2        25.8  78.4  99
2002      2002-9  1,944.10    8.8      99.5%   103           8.3  23.6        74.5        21.4  70.5  110
2002      2002-9  2,155.90    8.6      94.8%   104           8.0  19.7        67.5        20.1  68.2  100
2003      2003-5  7,794.00    10.3     98.0%   84            7.1  15.5        59.9        13.1  55.6  150
Always Prepare Metadata
METADATA for 00-03 DATABASE
FIELD (Column) DESCRIPTION
Sampling Site# site number for STORET
rp R-riffle or fast water; P-Pool or just downstream of large pool
rt R-river site; T-tributary site
st State - PA, NJ, or DR=Delaware River Interstate Control Point
Mon Year Calendar year of sample
yr-mo Year-Month combination for seasonal stratification of data
RivMile River mile location upstream from mouth of Delaware Bay
Shortnm Short site name for graphic plots by site
Concatnm Concatenation of river mile and short site name for river mile plots
Site Name Long version of site name, or full site name
DA sq mi drainage area in square miles
MO month of sample
day day of sample
Date date of sample
Time time of day, military
Flow Percentile 0=<10th; 1=10-25%ile; 2=25th-50th%ile; 3=50th-75th%ile; 4=75th-90th%ile; 5=>90th%ile
Flow (cfs) discharge corresponding to time of sample and gage height of sample
log(cfs) log transformation of flow
cfs/sqmi discharge per unit drainage area
log cfs/mi2 log transformation of discharge per unit drainage area
DO mg/l dissolved oxygen concentration
DOSat Value dissolved oxygen concentration at 100% saturation at sample water temperature
DO%Sat calculated dissolved oxygen saturation = DO/DOSatValue
Gage Ht gage height of water level - use gage height vs discharge regression to estimate discharge
Etc.
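Several of the fields above are calculated from other columns. The following is a minimal sketch of those calculations, not the formulas in the LDMP Excel file itself. Assumptions: the log transformations are base 10, the handling of values that fall exactly on a percentile boundary is a guess, and the function names and input values are hypothetical.

```python
import math


def derive_fields(do_mg_l, do_sat_value, flow_cfs, drainage_sq_mi):
    """Return the calculated columns described in the metadata table."""
    cfs_per_sqmi = flow_cfs / drainage_sq_mi
    return {
        "DO%Sat": do_mg_l / do_sat_value,        # DO / DOSatValue
        "log(cfs)": math.log10(flow_cfs),        # assumed base-10 log
        "cfs/sqmi": cfs_per_sqmi,
        "log cfs/mi2": math.log10(cfs_per_sqmi),
    }


def flow_percentile_class(percentile):
    """Map a flow percentile (0 to 100) to the 0-5 Flow Percentile codes.

    Assumption: a value exactly on a boundary goes into the higher class.
    """
    upper_bounds = [10, 25, 50, 75, 90]
    for code, bound in enumerate(upper_bounds):
        if percentile < bound:
            return code
    return 5


# Hypothetical inputs: DO 9.1 mg/l, saturation value 8.95 mg/l,
# flow 6,398 cfs, drainage area 4,000 square miles
fields = derive_fields(9.1, 8.95, 6398.0, 4000.0)
print(f"{fields['DO%Sat']:.1%}", round(fields["log(cfs)"], 3))
print(flow_percentile_class(30))
```

Storing the calculated columns next to the raw ones, and describing both in the metadata, lets a reviewer recompute any derived value from its inputs.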
Data Checking and Cleaning
Check accuracy of data entry (multiple checks are best)
Check validity of data (can fish live in 1000 degrees F?)
Check precision and accuracy of data (QA/QC process)
Once in table form, sort all columns, check formats and ranges
Make sure blank cells are really blank…
Plot data, explain outliers (we rarely exclude outlier data)
Quantify non-detect values: we use ½ MDL (method detection limit) if <20% of values are non-detects
If more than 20% of the data are non-detects, they may bias the data set
The % of non-detects may be a good water quality indicator
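The validity check and the non-detect rule above can be sketched as follows. This is an illustration under stated assumptions, not DRBC's procedure: non-detects are represented as None, the function names and sample values are hypothetical, and the temperature limits are placeholders to be replaced with limits that suit the parameter.

```python
def out_of_range(values, low, high):
    """Return the values that fall outside a plausible range, for review."""
    return [v for v in values if v is not None and not (low <= v <= high)]


def substitute_non_detects(values, mdl, max_fraction=0.20):
    """Replace non-detects (None) with half the method detection limit.

    Returns (cleaned_values, fraction_non_detect). cleaned_values is None
    when the non-detect fraction is not below max_fraction, because
    substitution could then bias the data set.
    """
    if not values:
        return None, 0.0
    fraction = sum(1 for v in values if v is None) / len(values)
    if fraction >= max_fraction:
        return None, fraction
    cleaned = [mdl / 2.0 if v is None else v for v in values]
    return cleaned, fraction


# Hypothetical water temperatures in degrees F, with one data entry error
print(out_of_range([69.4, 53.8, 1000.0, 66.9], low=32.0, high=100.0))

# Hypothetical concentrations with one non-detect and an MDL of 4.0
cleaned, fraction = substitute_non_detects([12.0, None, 30.0, 8.0, 15.0, 22.0], mdl=4.0)
print(cleaned, round(fraction, 3))
```

Returning the non-detect fraction alongside the cleaned values keeps it available for reporting, since the percentage of non-detects can itself serve as an indicator.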
Stats Packages Used at DRBC
Analyse-It Excel Add-In from www.analyse-it.com
SAS Powerful! www.sas.com
Excel Statistical functions available, but…
There are many other good statistical software applications available. Prices range from $100 to $$$$$.
Representative Data
[Figure: bar chart of Percent of Total Watershed Area (0% to 90%) by Tributary. Tributaries shown: Lehigh, PA; Lackawaxen, PA; Neversink, NY; Brodhead, PA; Mongaup, NY; PaulinsKill, NJ; Bushkill (Monroe), PA; Pequest, NJ; Musconetcong, NJ; Tohickon, PA; Callicoon, NY; Shohola, PA; Bushkill (Northampton), PA; Flat Brook, NJ; Equinunk, PA; Pohatcong, NJ; Tenmile, NY; Martins, PA; Calkins, PA; Masthope, PA; Cooks, PA]
Is your data representative of watershed conditions?
Natural Variability
What range of stream flow does your data cover?
Overcome Natural Variability
Variable: Time
n 894
Mean 11:12.8 (95% CI 11:07.3 to 11:17.2)
Variance 0:03.46; SD 1:15.79; SE 0:02.53; CV 11%
Median 11:15.0 (95.2% CI 11:05.0 to 11:20.0)
Range 7:15; IQR 1:50
Percentiles: 10th 9:30.0; 25th 10:15.0; 50th 11:15.0; 75th 12:05.0; 90th 12:50.0
Shapiro-Wilk: coefficient 0.9897, p <0.0001
Skewness: coefficient 0.1101, p 0.1773
Kurtosis: coefficient -0.5621, p <0.0001
[Figure: frequency histogram and normal quantile plot of Time]
What time of day does your data represent?
Summary Stats – Frequency Plot
n 40
Mean 97.48% (95% CI 95.46% to 99.49%)
Variance 0.398%; SD 6.309%; SE 0.998%; CV 6%
[Figure: frequency histogram]
DO % Saturation at Portland, PA 2000-2003
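The summary statistics on this slide are linked by simple formulas: SE = SD / sqrt(n), CV = SD / mean, and the 95% CI is the mean plus or minus t times SE. The sketch below shows those relationships with Python's standard library and checks them against the reported DO % saturation figures. It is not the Analyse-It calculation; the t value of 2.0227 for 39 degrees of freedom is a table lookup supplied here as an assumption, and the short list of values is only a demonstration sample taken from the first rows of the data table.

```python
import math
import statistics


def describe(values):
    """Basic summary statistics using the sample (n - 1) standard deviation."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(values),
        "sd": sd,
        "se": sd / math.sqrt(n),   # standard error of the mean
        "cv": sd / mean,           # coefficient of variation
    }


# Consistency check using the figures reported on the slide
n, mean, sd = 40, 97.48, 6.309
se = sd / math.sqrt(n)
cv = sd / mean
t_crit = 2.0227                    # assumed t value, 95% two-sided, 39 df
ci_low, ci_high = mean - t_crit * se, mean + t_crit * se
print(round(se, 3), f"{cv:.0%}", round(ci_low, 2), round(ci_high, 2))

# Demonstration on a few DO % saturation values from the data table
print(describe([101.7, 98.6, 100.9, 100.0, 90.9]))
```

Recomputing SE, CV, and the confidence interval from n, mean, and SD is a quick way to catch transcription errors when statistics are copied between a stats package and a report.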
Summary Stats (Box and Normal Plots)
Median 97.18% (96.2% CI 94.77% to 99.78%)
Range 27.7%; IQR 8.6%
Percentiles: 10th 89.71%; 25th 92.15%; 50th 97.18%; 75th 100.71%; 90th 103.99%
Shapiro-Wilk: coefficient 0.9578, p 0.1406
Skewness: coefficient 0.6536, p 0.0802
Kurtosis: coefficient 0.5128, p 0.3805
[Figure: box plot, frequency histogram, and normal quantile plot of DO%Sat]
DO % Saturation at Portland, PA 2000-2003 (Data are normally distributed)
Non-Normal Data
n 35 (cases excluded: 5 due to missing values)
Mean 148.1 (95% CI 48.2 to 248.0)
Variance 84,563.20; SD 290.80; SE 49.15; CV 196%
Median 32.0 (95.9% CI 12.0 to 76.0)
Range 1,298; IQR 89
Percentiles: 10th 7.2; 25th 11.0; 50th 32.0; 75th 100.0; 90th 534.0
Shapiro-Wilk: coefficient 0.5443, p <0.0001
Skewness: coefficient 2.9033, p <0.0001
Kurtosis: coefficient 8.5876, p <0.0001
[Figure: box plot, frequency histogram, and normal quantile plot of Entero col/100ml]
Enterococcus counts at Portland, PA 2000-2003
Transformed Data
n 35 (cases excluded: 5 due to missing values)
Mean 1.58761 (95% CI 1.34714 to 1.82808)
Variance 0.490041; SD 0.700029; SE 0.118327; CV 44%
Median 1.50515 (95.9% CI 1.07918 to 1.88081)
Range 2.8129; IQR 0.9604
Percentiles: 10th 0.85311; 25th 1.03959; 50th 1.50515; 75th 2.00000; 90th 2.72752
Shapiro-Wilk: coefficient 0.9466, p 0.0891
Skewness: coefficient 0.5866, p 0.1341
Kurtosis: coefficient -0.3869, p 0.6987
[Figure: box plot, frequency histogram, and normal quantile plot of Ent Log]
Enterococcus counts, Portland, PA 2000-2003, log transformed – produces normality (antilog of results is the geometric mean)
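A short worked example of the antilog step, assuming base-10 logs (the reported median of 1.50515 is log10 of the untransformed median of 32, which supports that assumption). The function name and the two-value demonstration list are hypothetical.

```python
import math


def geometric_mean(values):
    """Antilog of the mean of the base-10 logs; all values must be positive."""
    logs = [math.log10(v) for v in values]
    return 10 ** (sum(logs) / len(logs))


# Mean of the log-transformed Enterococcus counts reported on the slide
mean_log = 1.58761
print(round(10 ** mean_log, 1))   # geometric mean, col/100ml

# Demonstration with two hypothetical counts
print(geometric_mean([10, 1000]))
```

The geometric mean from the slide's figures is roughly 39 col/100ml, far below the arithmetic mean of 148.1 and much closer to the median of 32, which is why it is the usual summary for skewed bacteria counts.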
Comparative Stats
[Figure: plot by river mile site, vertical axis 0 to 4.5. Sites shown: 134.34 DelCalhoun; 141.8 DelWXing; 148.7 DelLambertvll; 155.4 DelBullsIsl; 167.7 DelMilford; 174.8 DelRiegelsBr; 183.82 DelEaston; 197.84 DelBelvidere; 207.4 DelPortland]
River mile plot of log Fecal Coliform counts in the Delaware River 2000-2003
Means or Medians?
Use means and parametric stats if normality assumptions are met (sample population must be a random sample following the normal distribution).
Medians and non-parametric tests make no assumption about the data distribution. Valid for any data set.
Note: As a rule of thumb, parametric tests like the t-test need at least 30 data points before the assumption of normality can be relied on. Do you collect that much information? For parametric comparisons between sites to be made, their data distributions must be the same shape. Do not assume this is the case! Non-parametric stats are safe…
Parametric vs Non-Parametric Tests
Purpose                                   Parametric                                   Non-Parametric
Testing for a Difference                  T-Test                                       Mann-Whitney U Test
                                          1-Way ANOVA                                  Kruskal-Wallis 1-Way ANOVA
Testing for a Change                      Paired Samples T-Test                        Wilcoxon Signed Ranks Test
                                          1-Way Repeat Measures ANOVA                  Friedman 1-Way ANOVA
Testing for an Association (Correlation)  Pearson Correlation                          Spearman Rank Correlation
                                                                                       Kendall Rank Concordance
Prediction                                Linear, Deming, or Polynomial Regression     Passing & Bablok Regression
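To show what one of the non-parametric tests actually computes, here is a minimal sketch of the Mann-Whitney U statistic for two sites. It is written from the textbook definition, not from any of the stats packages named in this talk. The p-value uses the large-sample normal approximation without tie or continuity corrections, so treat it as rough for small samples; the site data are hypothetical.

```python
import math


def mann_whitney_u(a, b):
    """Return (U_a, U_b). U_a counts pairs where a value from a exceeds one from b.

    Ties count as half a pair. U_a + U_b always equals len(a) * len(b).
    """
    u_a = 0.0
    for x in a:
        for y in b:
            if x > y:
                u_a += 1.0
            elif x == y:
                u_a += 0.5
    return u_a, len(a) * len(b) - u_a


def approx_two_sided_p(a, b):
    """Two-sided p-value from the normal approximation (no tie correction)."""
    n1, n2 = len(a), len(b)
    u_a, _ = mann_whitney_u(a, b)
    mean_u = n1 * n2 / 2.0
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12.0)
    z = (u_a - mean_u) / sd_u
    return math.erfc(abs(z) / math.sqrt(2.0))


# Hypothetical bacteria counts (col/100ml) at two sites
upstream = [12, 18, 25, 32, 40, 55]
downstream = [30, 48, 76, 100, 130, 534]
print(mann_whitney_u(upstream, downstream))
print(round(approx_two_sided_p(upstream, downstream), 3))
```

Because the statistic depends only on which values are larger, a single extreme count such as 534 has no more influence than any other high value, which is the practical appeal of rank-based tests for skewed water quality data.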
Presentation of Data – Graphs
All Data Plot
Presentation of Data – Graphs
Medians and Percentiles
Presentation of Data - Tables
Flow Percentile  N    Median Fecal Coliform  10th%ile to 90th%ile
<10th            32   18                     5 to 111
10th to 25th     55   50                     5 to 130
25th to 50th     80   50                     5 to 820
50th to 75th     66   52                     16 to 461
75th to 90th     52   80                     20 to 3,070
>90th            24   190                    57 to 1,450
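A table like this one can be produced by grouping results by flow percentile class and reporting N, the median, and the 10th and 90th percentiles for each class. The sketch below uses Python's standard library. Assumptions: each record is a (class code, value) pair using the 0-5 Flow Percentile codes from the metadata, each class has at least two values, and the "inclusive" percentile method is used. Other software may interpolate percentiles differently and give slightly different results. The demonstration records are hypothetical.

```python
import statistics
from collections import defaultdict


def summarize_by_class(records):
    """Return {class_code: (n, median, 10th percentile, 90th percentile)}."""
    groups = defaultdict(list)
    for flow_class, value in records:
        groups[flow_class].append(value)

    table = {}
    for flow_class in sorted(groups):
        values = groups[flow_class]
        # quantiles(n=10) returns the nine decile cut points
        deciles = statistics.quantiles(values, n=10, method="inclusive")
        table[flow_class] = (
            len(values),
            statistics.median(values),
            deciles[0],    # 10th percentile
            deciles[-1],   # 90th percentile
        )
    return table


# Hypothetical fecal coliform counts tagged with flow percentile class codes
records = [(0, 5), (0, 18), (0, 20), (0, 111), (5, 57), (5, 190), (5, 1450)]
for flow_class, row in summarize_by_class(records).items():
    print(flow_class, row)
```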
Resources
USGS Statistical Guidance Manual for Water Resource Studies (USGS manual is excellent…)
USEPA guidance (numerous sources, try a web search)
Zar, J.H. Biostatistical Analysis.
Many others, give me a call…
Bob Limbeck, DRBC
609-883-9500 ext 230
www.drbc.net