What have we learned from the John Day protocol comparison test?
Brett Roper
John Buffington
Objectives
• How consistent are measurements within a monitoring program?
• How well does each protocol detect environmental heterogeneity (signal-to-noise ratio)?
• How do different monitoring programs' measurements of an attribute relate to one another, and to more intensively measured values determined by a research team (can we share data?)?
Goal – More efficiently collect and use stream habitat data.
Sample Design
7 monitoring programs • 3 crews
3 channel types (12 streams):
• Plane-bed (Tinker, Bridge, Camas, Potamus)
• Pool-riffle (WF Lick, Crane, Trail, Big)
• Step-pool (Whiskey, Myrtle, Indian, Crawfish)
[Photos: example plane-bed, pool-riffle, and step-pool channels]
Maximize variability so we can discern differences
[Diagram: reach layout showing flow direction, a common begin point, and different end points depending upon protocol and crew]
Review of Design at a Stream Site
Fixed transects for selected attributes: bankfull width, bankfull depth, banks.
Conduct surveys in late summer (base flow).
On top of this, "the truth" (the gold standard): an intensive topographic survey of each reach.
[Figure: survey map of a reach showing survey points, with a pool, bar, and riffle labeled; contour interval = 10 cm]
Objective 1
[Table: within-program consistency ratings for Gradient, Sinuosity, Bankfull, WD, % Pool, Pool/km, MRPD, d50, % Fines, and LWD across the AREMP, CFG, EMAP, NIFC, ODFW, PIBO, and UC programs; ratings appeared as symbols in the original slide]
Within a program, many attributes are consistently measured, while some are less so.
[Figure: predicted egg-to-fry survival (0–1.0) for crews 1 and 2, panels a and b]
Egg-to-fry survival rates from estimates of percent fines from Potamus Creek (a) and WF Lick Creek (b), for two PIBO crews.
SEF = [92.65 / (1 + e^(-3.994 + 0.1067 × Fines))] / 100
Al-Chokhachy and Roper, submitted
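The survival relationship above can be evaluated directly; a minimal Python sketch of the slide's equation (the function name is mine):

```python
import math

def egg_to_fry_survival(pct_fines):
    """Egg-to-fry survival (SEF) as a function of percent fines,
    per the logistic relationship on the slide:
    SEF = [92.65 / (1 + e^(-3.994 + 0.1067 * Fines))] / 100
    """
    return (92.65 / (1.0 + math.exp(-3.994 + 0.1067 * pct_fines))) / 100.0

# Survival declines as percent fines increases:
for fines in (0, 10, 20, 30, 40):
    print(f"{fines:>3}% fines -> SEF = {egg_to_fry_survival(fines):.3f}")
```

This makes concrete why crew-to-crew differences in estimating percent fines (panels a and b above) translate into different predicted survival rates.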
Within-Program Consistency
• Most programs collect the majority of their attributes in a consistent manner.
• When problems are identified within a protocol, they can often be quickly addressed through minor changes (additional training, clarifying protocols, increasing operational rule sets).
• QA/QC is the only way to identify problems within a protocol.
• Some sets of stream attributes (habitat units, sediment grain size) are more difficult to measure consistently – a problem because these are often the most important to aquatic biota.
• Consistency is affected (+ and -) by transformations.
Objective 2
[Table: signal-to-noise ratings for the same attributes (Gradient, Sinuosity, Bankfull, WD, % Pool, Pool/km, MRPD, d50, % Fines, LWD) and programs (AREMP, CFG, EMAP, NIFC, ODFW, PIBO, UC); ratings appeared as symbols in the original slide]
Signal-to-noise is generally lower than internal consistency, with two exceptions: bankfull width and large wood.
Detecting Environmental Variability
• Within this sample of streams there may not be sufficient signal in some variables (sinuosity – true; width-to-depth – ??).
• The focus on repeatability may reduce signal; it is hard to look at the photos of the sites and not see a lot of variability.
• For attributes whose signal can be highly variable (large wood), transformations will almost always improve signal and increase the ability to detect differences.
[Figure: maximum attainable r² (0–1) as a function of signal-to-noise ratio (0–10)]
Even if you are measuring the same underlying attribute, the more noise (less signal), the weaker the estimate of the underlying relationship.
Example: assume you knew the truth perfectly but compared it to an imperfect protocol – how strong could the relationship be?
(Stoddard et al. 2008; Kaufmann et al. 1999)
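The curve described here follows from classical attenuation: a protocol's reliability is S:N / (S:N + 1), and that reliability caps the r² attainable against a perfectly known truth. A minimal sketch, assuming that relationship (the function name is mine):

```python
def max_r2_vs_truth(sn):
    """Maximum attainable r^2 between a perfectly measured ("gold
    standard") attribute and one protocol's noisy measurement of it,
    given the protocol's signal-to-noise ratio (S:N).
    Reliability = S / (S + N) = S:N / (S:N + 1).
    """
    return sn / (sn + 1.0)

# The ceiling rises quickly at first, then flattens:
for sn in (0.5, 1, 2, 5, 10):
    print(f"S:N = {sn:>4} -> max r^2 = {max_r2_vs_truth(sn):.2f}")
```

At S:N = 1 the ceiling is already down to r² = 0.5, which is why low-S:N attributes can never correlate strongly with the intensive survey, no matter how good the underlying protocol concept is.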
Objective 3 - Sharing Data
• What are the ranges of relationships between programs, given the signal-to-noise ratios?
• Given some inherent variability in our measurements, are we measuring the same underlying attribute?
Attribute    Highest/Lowest  Group 1  Group 2  S:N 1  S:N 2  Max r²
Gradient     High            AREMP    PIBO     188.2  124.4  0.987
             Low             ODFW     CFG        5.6    4.9  0.704
W/D          High            NIFC     ODFW       6.1    2.2  0.589
             Low             UC       PIBO       1.7    1.5  0.374
% Pools      High            NIFC     ODFW      13.5    5.8  0.794
             Low             PIBO     CFG        1.4    0.4  0.174
Pool Depth   High            UC       PIBO      11.9    7.4  0.813
             Low             ODFW     CFG        3.9    0.2  0.127
d50          High            PIBO     UC         6.0    3.6  0.671
             Low             EMAP     AREMP      1.0    2.4  0.353
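The Max r² values in this table appear consistent (to within rounding of the listed S:N values) with the product of the two programs' reliabilities, S:N / (S:N + 1) each. A hedged check, assuming that relationship (the function name is mine):

```python
def max_r2_between(sn1, sn2):
    """Maximum attainable r^2 between two noisy protocols measuring
    the same underlying attribute, taken as the product of their
    reliabilities SN/(SN+1)."""
    return (sn1 / (sn1 + 1.0)) * (sn2 / (sn2 + 1.0))

# Reproduce rows of the table (table lists 0.987 and 0.704):
print(f"Gradient high (AREMP, PIBO): {max_r2_between(188.2, 124.4):.3f}")
print(f"Gradient low  (ODFW, CFG):   {max_r2_between(5.6, 4.9):.3f}")
```

The small discrepancies on the low-S:N rows (a few thousandths) are what one would expect if the original calculations used unrounded S:N values.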
[Figure: bed-surface diameter (mm) by channel type (plane-bed, pool-riffle, step-pool) for each program: AREMP, CFG, EMAP, NIFC, ODFW, PIBO, UC]
To minimize the effect of observer variation we use the mean of means.
So although there is variation among crews in measuring sediment, it appears the monitoring protocols are measuring the same underlying characteristic.
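The "mean of means" can be sketched as: average each crew's measurements first, then average the crew means, so that a crew with more samples does not dominate the program-level value. An illustrative Python sketch (the data values are hypothetical):

```python
def mean_of_means(measurements_by_crew):
    """Average within each crew first, then across crews, giving
    each crew equal weight regardless of its sample count."""
    crew_means = [sum(vals) / len(vals) for vals in measurements_by_crew]
    return sum(crew_means) / len(crew_means)

# Hypothetical d50 values (mm) from three crews at one reach:
crews = [[42.0, 48.0, 45.0], [50.0, 52.0], [40.0]]
print(f"program-level d50 = {mean_of_means(crews):.1f} mm")
```

Equal weighting per crew is the design choice here: it isolates the between-program comparison from differences in how many observations each crew happened to collect.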
[Figure: percent pool by channel type (plane-bed, pool-riffle, step-pool) for each program: AREMP, CFG, EMAP, NIFC, ODFW, PIBO, UC]
In other cases it is clear programs are measuring different things – likely based on different operational definitions.
Attribute   AREMP  CFG   EMAP  NIFC  ODFW  PIBO  UC
Gradient    0.99   0.98  0.99  NM    0.97  0.99  0.99
Sinuosity   0.93   NM    0.95  NM    NM    0.76  0.87
MBW         0.59   0.63  0.73  0.57  0.65  0.59  0.51
WD          0.01   0.01  0.12  0.33  0.49  0.34  0.03
Pool/km     0.43   0.33  0.03  0.28  0.18  0.30  0.10
MRPD        0.91   0.28  0.87  0.12  0.94  0.93  0.94
d50         0.79   NM    0.87  NM    NM    0.92  0.73
LWD         0.43   0.44  0.76  0.85  0.76  0.58  0.65
You can then relate each program to "the gold standard". These are coefficients of determination (r²) between the intensively measured attributes and each program's measurements (mean of each reach).
What data could we share?
Probably
• Gradient
• Sinuosity
• Median Particle Size
Mostly
• Bankfull
• Residual Depth
• Large Wood
With Difficulty
• Width to depth
• Pools (%,/km)
• Percent Fines
Conclusions
• Most groups do a decent job implementing their own protocol. Every group still has room for improvement through training, improved definitions, …
• QA/QC is key.
• Groups seem to be forgoing some signal in order to minimize noise.
• For many attributes it is difficult to exchange one group's results with another's.
• Shared data are perhaps best treated as a block effect for those attributes with no interaction.
Recommendations
We will never make progress on what the right way is without an improved understanding of the truth, or agreed-upon criteria.
• How should we define a good protocol?
• Which protocols have the strongest relationship with the biota?
• Which best indicates condition?
• Which is closest to the real truth (ground-based LiDAR)?
Issues for paper
• I am trying to incorporate all the final suggestions and should have it out for a quick review, then submission, right after the new year.