Water Quality Models for Supporting
Shellfish Harvesting Area Management
by
Andrew David Gronewold
Department of Environmental Sciences and Policy
Duke University
Date:
Approved:
Dr. Kenneth Reckhow, co-supervisor
Dr. Robert Wolpert, co-supervisor
Dr. Rachel Noble
Dr. William Kirby-Smith
Dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Environmental Sciences and Policy in the Graduate School of
Duke University
2009
ABSTRACT
Water Quality Models for Supporting
Shellfish Harvesting Area Management
by
Andrew David Gronewold
Department of Environmental Sciences and Policy
Duke University
Date:
Approved:
Dr. Kenneth Reckhow, co-supervisor
Dr. Robert Wolpert, co-supervisor
Dr. Rachel Noble
Dr. William Kirby-Smith
An abstract of a dissertation submitted in partial fulfillment of the requirements for the degree of Doctor of Philosophy
in the Department of Environmental Sciences and Policy in the Graduate School of
Duke University
2009
Abstract
This doctoral dissertation presents the derivation and application of a series of wa-
ter quality models and modeling strategies which provide critical guidance to wa-
ter quality-based management decisions. Each model focuses on identifying and
explicitly acknowledging uncertainty and variability in terrestrial and aquatic envi-
ronments, and in water quality sampling and analysis procedures. While the mod-
eling tools I have developed can be used to assist management decisions in waters
with a wide range of designated uses, my research focuses on developing tools which
can be integrated into a probabilistic or Bayesian network model supporting total
maximum daily load (TMDL) assessments of impaired shellfish harvesting waters.
Notable products of my research include a novel approach to assessing fecal indica-
tor bacteria (FIB)-based water quality standards for impaired resource waters and
new standards based on distributional parameters of the in situ FIB concentration
probability distribution (as opposed to the current approach of using most probable
number (MPN) or colony-forming unit (CFU) values). In addition, I develop a model
explicitly acknowledging the probabilistic basis for calculating MPN and CFU values
to determine whether a change in North Carolina Department of Environment and
Natural Resources Shellfish Sanitation Section (NCDENR-SSS) standard operating
procedure from a multiple tube fermentation (MTF)-based procedure to a membrane
filtration (MF) procedure might cause a change in the observed frequency of water
quality standard violations. This comparison is based on an innovative theoretical
model of the MPN probability distribution for any observed CFU estimate from the
same water quality sample, and is applied to recent water quality samples collected
and analyzed by NCDENR-SSS for fecal coliform concentration using both MTF
and MF analysis tests. I also develop the graphical model structure for a Bayesian
network model relating FIB fate and transport processes with water quality-based
management decisions, and encode a simplified version of the model in commercially
available Bayesian network software. Finally, I present a Bayesian strategy for cali-
brating bacterial water quality models which improves model performance by explic-
itly acknowledging the probabilistic relationship between in situ FIB concentrations
and common concentration estimating procedures.
Contents
Abstract iv
List of Figures ix
List of Tables xiii
Acknowledgments xiv
1 Introduction 1
1.1 Dissertation Organization . . . . . . . . . . . . . . . . . . . . . . . . 4
2 Developing a Graphical Model 8
2.1 Selecting the Model Endpoint . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Terrestrial Fate and Transport . . . . . . . . . . . . . . . . . . . . . . 12
2.3 Aquatic Fate and Transport . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.1 Bacteria loss models . . . . . . . . . . . . . . . . . . . . . . . 18
2.3.2 Bacteria transport models . . . . . . . . . . . . . . . . . . . . 23
2.4 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
3 Developing and Applying a Simple Bayesian Network Model 30
3.1 Background . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 Bayesian Networks . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.3 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3.1 Study Area and Data Collection . . . . . . . . . . . . . . . . . 36
3.3.2 Model Variables . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.3.3 Conditional Probabilities . . . . . . . . . . . . . . . . . . . . . 41
3.4 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
4 An Assessment of Fecal Indicator Bacteria-Based Water Quality Standards and Water Quality Model Endpoints 50
5 Modeling the Relationship Between Most Probable Number (MPN) and Colony Forming Unit (CFU) Estimates of Fecal Indicator Bacteria Concentrations 60
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
5.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
5.2.1 Water Quality Monitoring . . . . . . . . . . . . . . . . . . . . 66
5.2.2 Theoretical Probability Model . . . . . . . . . . . . . . . . . . 67
5.2.3 OLS Regression Empirical model . . . . . . . . . . . . . . . . 67
5.3 Results and Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . 68
5.4 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
5.5 Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
6 Improving Parameter Estimation in the Aquatic Fate and Transport Model 79
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
6.1.1 Serial Dilution Analysis . . . . . . . . . . . . . . . . . . . . . 81
6.1.2 Most Probable Number Calculations . . . . . . . . . . . . . . 82
6.2 Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
6.2.1 Data Simulation . . . . . . . . . . . . . . . . . . . . . . . . . . 86
6.2.2 Parameter Estimation . . . . . . . . . . . . . . . . . . . . . . 91
6.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.4 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.5 Conclusions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.6 Computer code . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
A Listing of Impaired Waters 103
B North Carolina Shellfish Harvesting Area Water Quality Standards 104
Bibliography 106
Biography 117
List of Figures
2.1 Graphical representation of critical environmental system response variables and potential model endpoints. Management decisions are indicated by boxes, and variables are represented by rounded nodes. . 11
2.2 Graphical representation of assumed system variables and causal relationships for terrestrial fate and transport of fecal indicator bacteria. Management decisions are indicated by boxes, and variables are represented by rounded nodes. . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3 Graphical representation of environmental processes and system variables affecting aquatic fate and transport of fecal indicator organisms. Management decisions are indicated by boxes, and variables are represented by rounded nodes. . . . . . . . . . . . . . . . . . . . . . . . . 28
2.4 Comprehensive graphical network of fecal contamination in designated resource waters. Management decisions are indicated by boxes, and variables are represented by rounded nodes. . . . . . . . . . . . . . . 29
3.1 Simple network model representing rainfall-induced fecal contamination of a coastal estuary. . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.2 Graphical representation of Bayes’ theorem indicating prior and posterior probability densities, and the normalized likelihood for a water quality standard. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
3.3 Graphical representation of environmental variables and processes associated with fecal contamination in tidal shellfish harvesting areas. Management decisions are indicated by boxes, and variables are represented by rounded nodes. . . . . . . . . . . . . . . . . . . . . . . . . 39
3.4 Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for all variable states are based on monitoring data collected between 1994 and 1997 at a cluster of monitoring stations in the upper reaches of the Newport River Estuary, North Carolina. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.5 Conditional probability distribution table for fecal coliform MPN node. For each of the three states of the MPN node, each row indicates the marginal probability of the node being in that state given the state of the three causal variables. For example, the probability that the MPN is less than 14 organisms per 100 ml, given that the tide is rising, the most recent rainfall was less than one inch, and that it has been less than four days since the most recent rain event, is 0.667. . . . . . . . 44
3.6 Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for fecal coliform MPN states are conditional upon long-term average precipitation and tidal conditions in the upper reaches of the Newport River Estuary, North Carolina. . . . . . 45
4.1 Prior and posterior distributions for σk for five randomly selected stations in the Newport River using the three priors in table 4.4. Each row utilizes the same prior distribution, and each column represents a separate station. Vertical gray lines are added to facilitate comparison between alternative priors for each station. . . . . . . . . . . . . . . . 53
4.2 Combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which yielded MPN (solid lines) or CFU (dashed lines) samples in violation of the NSSP median standard (panel a), geometric mean standard (panel b), 90th percentile standard (panel c), or any standard (panel d) with a frequency of either 0.005 or 0.1. The zone of violations is in the upper right of each panel. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.3 Relationship between the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution and simulated violation of any CFU-based water quality standard (dashed lines) and any MPN-based water quality standard (solid lines) for possible values of the negative binomial dispersion parameter α. Panels a and b indicate µc − σc pairs expected to violate standards with a frequency of 0.1 and 0.005, respectively. . . . . . . . . . . . . . . . . . . . . . . 55
4.4 Log-likelihood (solid line) of transformation parameter γ for σc using paired values of µc and σc. Panel a based on values from table 4.2 for σc > 0.65, panel b based on values from table 4.2 for σc ≤ 0.65, panel c based on values from table 4.3 for σc > 0.65, and panel d based on values from table 4.3 for σc ≤ 0.65. . . . . . . . . . . . . . . . . . . . 56
4.5 Violation contour lines overlaid by violation line best-fit regression model fitted values based on model parameters in table 4.5. . . . . . . 57
4.6 Joint posterior probability density contour lines (solid lines) for four monitoring stations in the Newport River Estuary. Dashed lines indicate combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which violate concentration-based standards no more than 0.5% of the time using MPN or CFU standards as the reference. Confidences of compliance (CC) are given in the lower left of each panel for both MPN and CFU-based standards. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
5.1 Expected values and 95% prediction sets or prediction intervals for observable fecal coliform MPN (panel A) and CFU (panel B) measurements given the true fecal coliform concentration in organisms per 100 ml. For clarity, expected values and 95% prediction sets or intervals are plotted only for every 5th integer-valued concentration c. Maximum true concentrations in each plot are based on maximum MPN and CFU observations in the NCDENR-SSS data set. CFU prediction intervals are based on an MF sample aliquot volume of 100 ml. . . . . 75
5.2 Expected value and 95% credible intervals for the fecal coliform true concentration given MPN (panel A) and CFU (panel B) estimates in organisms per 100 ml. For clarity, panel A includes only the 51 observable MPN estimates presented in standard laboratory analysis MTF conversion tables for the 5-tube serial dilution analysis procedure (see, e.g., Woodward, 1957) and panel B includes only every 5th observable CFU value based on an MF test with a sample aliquot volume of 100 ml. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.3 Empirical linear regression model (panel A) and theoretical probability model (panel B) of the relationship between fecal coliform MPN and CFU estimates from the same water quality sample. . . . . . . . . . . 77
5.4 Observed values, expected values, and the theoretical probability mass function of the MPN for a CFU measurement from the same water quality sample. Observed values are from recent NCDENR-SSS study. 78
6.1 Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for each model parameter based on samples of size 10, 25, or 100. Vertical gray lines indicate the parameter value used to simulate data. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents the parameter estimate based on using the MPN point estimate, and the lower line (with hollow circle) represents parameter estimates based on using the pattern of positive tubes for model calibration. . . . . . 93
6.2 Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for model-predicted FIB concentrations at time t = 1, 4, and 7 days. Vertical gray lines indicate the expected FIB concentration using the “true” parameter values. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents predicted FIB concentrations using the model calibrated with MPN point estimates, and the lower line (with hollow circle) represents predicted FIB concentrations using the model calibrated using the pattern of positive tubes. . . . . . . . . . . . . . 95
List of Tables
3.1 Marginal distribution of fecal coliform MPN results at a selected grouping of monitoring stations. Newport River, North Carolina. . . . . . . 46
3.2 Summary of Bayesian analysis results for Newport River, North Carolina fecal coliform MPN data. . . . . . . . . . . . . . . . . . . . . . . 47
4.1 NSSP shellfish harvesting area fecal coliform water quality standards based on a minimum of 30 randomly collected samples. . . . . . . . . 51
4.2 Values of µc and σc constituting MPN contour line (for simulated violation frequency = 0.005). . . . . . . . . . . . . . . . . . . . . . . . 51
4.3 Values of µc and σc constituting CFU contour line (for simulated violation frequency = 0.005). . . . . . . . . . . . . . . . . . . . . . . . . 52
4.4 Alternative priors for true concentration ck standard deviation σk at station k. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
4.5 Regression model parameters including transformation parameter (γ), intercept (β0), and slope (β1). . . . . . . . . . . . . . . . . . . . . . . 52
4.6 Estimated confidence of compliance (CC), posterior probability of violating any MPN standard, and observed violations for monitoring stations in the Newport River Estuary during the 2000-2005 assessment period. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
6.1 Example of simulated data set with sample size j = 10. Each row represents a simulated grab sample with concentration c collected at time t, a simulated pattern of positive tubes (x1, x2, x3) resulting from standard MTF decimal dilution analysis of the grab sample, and the corresponding MPN (**see Methods section for interpretation of results with all tubes negative, or all tubes positive). . . . . . . . . . . 89
6.2 Simulation steps . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
A.1 Water bodies within shellfish growing area E-4 and their status relative to the 303(d) list of impaired waters. “IR Category” refers to 2008 Draft Integrated Report (IR) Category. . . . . . . . . . . . . . . . . . 103
Acknowledgments
I would like to thank the members of my doctoral committee, particularly Dr. Ken
Reckhow for agreeing to take me on as a graduate student, and for providing me not
only with the opportunity to work on a focused research project closely linked to
my own interests, but also with the opportunity to explore new research trajectories.
I’m also very grateful to Dr. Robert Wolpert for agreeing to work closely on many of
the detailed statistical aspects of my research. Finally, many thanks to Dr. Rachel
Noble and Dr. Bill Kirby-Smith for your support and friendship; it has been a
pleasure working with you.
Much of the research presented in this dissertation was supported with funds
from the United States Environmental Protection Agency (USEPA) through the
North Carolina Division of Water Quality (NCDWQ) 319 program (Contract No.
EW05049). Additional funding was also provided through grants from the National
Science Foundation (NSF Grant Nos. DMS-0112069 and DMS-0422400) and (through
collaboration with Dr. Mark E. Borsuk) the EPA Office of Research and Develop-
ment’s Advanced Monitoring Initiative (AMI) Pilot Projects Focused on GEOSS
(Global Earth Observation System of Systems). I am also very grateful for scholar-
ship support from the Water Environment Federation (WEF), Quantitative Environ-
mental Analysis (QEA), LLC, and the North Carolina Association of Environmental
Professionals (NCAEP).
I owe tremendous thanks to the staff at NCDENR-SSS, including Patti Fowler,
J.D. Potts, Andrew Haines, Shannon Jenkins, Nadine Stoddard, and Diane Mason,
all of whom were consistently generous with their time in answering questions, sharing
and explaining water quality analysis data, and teaching me about their analytical
procedures. Most, if not all, of this dissertation would not have been possible without
their help. I also owe thanks to Larry Wood and his family not only for their kindness
over the years, but also for teaching me about sailing, shellfishing, and appreciating
the beauty and joy found in natural resources, particularly those of Waquoit Bay on
Cape Cod.
Several colleagues from the Nicholas School, particularly those who are either
current or former members of Ken Reckhow’s laboratory group, provided critical
feedback at various stages of my research. In particular, thanks go to Ben Best, Joe
Sexton, Craig Stow, Song Qian, Sean McMahon, Scott Loarie, Rob Schick, Ibrahim
Alameddine, Lori Bennear, Richard Anderson, and Conrad Lamon. I also owe thanks
to my former colleagues at Stearns & Wheler, LLC, particularly Bill Hall, Jr. and
Nate Weeks, both of whom provided kind assurance both during my graduate studies,
and in my decision to return to graduate school. Many thanks also go to the
master’s students who contributed their hard work to research on the Newport River,
including Tammy Hill, Whan Chunkrua, and Ryan O’Banion.
My family is fortunate enough to live in a wonderful community in Old West
Durham, and the support from our friends and neighbors has been invaluable, par-
ticularly from Cyrus, Michelle, Julie, Nancy, Guy, Colleen, Riley, Patrick and Ally.
Jim and Meg Lister also provided invaluable support, keeping my family fed and on
the right parenting track during the first few months after Michael and John were
born. We could not have survived parenting twins, nor could I have maintained
progress in my Ph.D. program, without your help. I also appreciate the one-on-one
help from many of the staff here at the Nicholas School, including Jacqui Franklin,
Deborah Wilson, Nancy Morgans, Laura Turcotte, Stephen Cash, and Meg Stephens.
I also owe special thanks to Lana BenDavid and the graduate school for their support.
I will be indebted for a very long time (if not forever) to my family, particularly
those who managed to survive staying in our crazy home while we tried to raise
twins, support a doctoral dissertation, and do all the other things families try to do.
In particular, Peter and Marcia provided critical support during a major transition
period. Granny and Pop-pop, I hope it goes without saying that your love and
support have made all the difference in the world. Mom and Dad, thank you for
everything. Most of all, thank you Sara. When I asked you to marry me, I failed
to mention that I planned on quitting my job, going back to graduate school in
North Carolina, buying a house, and immediately having twins. Thanks for keeping
everything (including yourself) afloat.
Chapter 1
Introduction
This doctoral dissertation presents the derivation and application of a series of wa-
ter quality models and modeling strategies which provide critical guidance to water
quality-based management decisions by identifying and explicitly acknowledging un-
certainty and variability in terrestrial and aquatic environments, and in water qual-
ity sampling and analysis procedures. While these modeling tools can be used to
assist management decisions in waters with a wide range of designated uses, my re-
search focuses on developing tools which can be integrated into a probabilistic or
Bayesian network model supporting total maximum daily load (TMDL) assessments
of impaired shellfish harvesting waters. Such a model is currently being developed
through an ongoing 319 project with the North Carolina Division of Water Quality,
and a major goal of this dissertation is to provide tools which will improve the per-
formance of that model. Therefore, the research presented in this dissertation should
be viewed as a component of a more comprehensive modeling and research effort.
While my research provides the foundation for building a Bayesian or probabilistic
network model, the final model is not presented explicitly as part of this dissertation.
Section 303(d) of the United States (US) Clean Water Act requires that states
assess the condition of surface waters and report those which fail to meet ambient
water quality standards (Smith et al., 2001; Houck, 2002). These are added to the
US Environmental Protection Agency (EPA) list of impaired waters (U.S. Environ-
mental Protection Agency, 2005b) and can only be removed after the performance
of a TMDL assessment (National Research Council, 2001; Cooter, 2004) followed by
sample-based verification that the standards are being met. As with any TMDL
assessment, the primary objective of the Newport River TMDL is to determine the
maximum allowable pollutant load from point, non-point, and natural sources, in-
cluding a margin of safety (MOS), which can be discharged into a receiving water
without violating water quality standards (National Research Council, 2001; Houck,
2002). Such predictive assessments are usually based on an empirical or mechanistic
water quality model relating pollutant loading levels to water body concentrations
(Borsuk et al., 2002; Benham et al., 2006). Fecal indicator bacteria (FIB), such
as fecal coliform, are commonly used to assess potential pathogen contamination in
coastal waters, and serve as the pollutant of concern for the models presented in this
dissertation (U.S. Environmental Protection Agency, 2001).
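The load allocation described above is conventionally written as a simple sum. The expression below follows the form commonly used in USEPA TMDL guidance; the symbols WLA (wasteload allocation for point source i) and LA (load allocation for non-point or natural background source j) are not defined in the text above and are introduced here only for illustration.

```latex
\mathrm{TMDL} \;=\; \sum_{i} \mathrm{WLA}_{i} \;+\; \sum_{j} \mathrm{LA}_{j} \;+\; \mathrm{MOS}
```

Here MOS is the margin of safety, which may be stated explicitly as a reserved portion of the load or implicitly through conservative modeling assumptions.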
The Model Ordinance in the Guide for the Control of Molluscan Shellfish, prepared by
the National Shellfish Sanitation Program (NSSP), includes recommended FIB-based
water quality criteria for shellfish-growing waters (Food and Drug Administration
and Interstate Shellfish Sanitation Conference, 2005). States which participate in
the NSSP, and which are also members of the Interstate Shellfish Sanitation Confer-
ence, enforce the Model Ordinance as a minimum requirement for sanitary control of
shellfish (Food and Drug Administration and Interstate Shellfish Sanitation Confer-
ence, 2005). Similar FIB-based water quality standards are enforced in surface waters
with other designated uses, such as recreational use (N.C. Department of Environ-
ment and Natural Resources, 2004) and drinking water supply (U.S. Environmental
Protection Agency, 2005a).
The latest official assessment of US water quality data (U.S. Environmental Pro-
tection Agency, 2002) indicates that pathogens are the leading cause of coastal shore-
line standard violations (275 total miles impaired) and the second leading cause of
violations in rivers and streams (82,100 total miles impaired). The Newport River
Estuary and its tributaries, which are collectively designated as growing area E-4
by the North Carolina Department of Environment and Natural Resources Shellfish
Sanitation and Recreational Water Quality Section (NCDENR-SSS), have historically been a
productive shellfish harvesting area. However, all of its segments and tributaries are
either permanently or conditionally closed to shellfishing based on poor water qual-
ity or proximity to known or potential sources of fecal contamination. As a result,
growing area E-4 comprises forty of the designated shellfish harvesting areas in North
Carolina which are currently included in the USEPA 303(d) list and therefore require
a TMDL assessment (see appendix A). Developing modeling tools which support
TMDL assessments in this area not only addresses an acute need, but also provides
additional context for addressing pathogen water quality problems around the US
and the world.
1.1 Dissertation Organization
My dissertation is divided into 6 chapters. This Chapter (Chapter 1) describes the
rationale for my doctoral research including overall research objectives and critical
regulatory requirements. Chapter 2 proposes a new graphical structure for either
a probabilistic or Bayesian network model of water quality in shellfish harvesting
waters. USEPA recommends initiating TMDL projects with an evaluation of ap-
propriate water quality indicators and associated target values which can be used
to assess attainment of the designated use (U.S. Environmental Protection Agency,
2001). Therefore, while chapter 2 defines (rather broadly) the scope of any bacte-
rial TMDL assessment, it also highlights a poorly-defined relationship between water
quality model endpoints and proposed measures of water quality (including alterna-
tive indicator organisms and different testing methods) as well as potential risks to
human and environmental health. Although my dissertation focuses on mitigating
fecal contamination in shellfishing resource areas (and reducing subsequent risk of
the outbreak of shellfish-borne infectious diseases), Chapter 2 serves as a reminder
that pollutants of non-fecal origin (such as red-tide causing ciguatoxins) might be
integrated into ongoing health risk-based management planning (Hackney and Pier-
son, 1994). Chapter 2 indicates a growing need in the microbial analysis and water
quality modeling field to more explicitly quantify the relationship between human
health risks and alternative measures of fecal and non-fecal contamination in coastal
resource waters. Identifying this research need was a major result of the early stages
of my research, and establishes the context for all subsequent Chapters of my dis-
sertation. Much of the research presented in Chapter 2 appears in peer-reviewed
proceedings of the International Water Association (IWA) WaterMatex 2007 Confer-
ence (Gronewold and Reckhow, 2007).
In Chapter 3, I develop and apply a simplified version of the conceptual graphical
model from Chapter 2 to water quality monitoring data from the Newport River
using the Bayesian analysis software package Netica®. This analysis identifies how
presumed critical environmental variables impact water quality-based management
decisions, and whether or not those variables are monitored under truly random con-
ditions. Furthermore, the initial modeling effort in Chapter 3 indicates that critical
model variables (such as the model endpoint) should explicitly acknowledge uncer-
tainty and variability (through, for example, probabilistic models) to allow compari-
son between model output and management decision criteria. The work in Chapter
3 also suggests that fecal indicator bacteria concentration forecasting models must
appropriately reflect uncertainty inherent to specific bacteria water quality analysis
procedures, and that the Netica® software package may not be the most appropriate
tool for doing so. The research in Chapter 3 appears in peer-reviewed conference
proceedings of the Water Environment Federation TMDL 2007 Specialty Conference
held in Bellevue, Washington (Gronewold et al., 2007).
In Chapter 4, I develop a novel approach to assessing FIB-based water quality
standards for pathogenically-impaired resource waters and propose new standards
based on distributional parameters of the in situ FIB concentration probability dis-
tribution (as opposed to the current approach of using most probable number (MPN)
or colony-forming unit (CFU) values). This work is motivated by recommendations
of the National Research Council (2001), and an exploratory analysis of historic New-
port River water quality and environmental data, which suggest that several water
bodies in shellfish growing area E-4 either do not appear to violate water quality
standards, or do not have sufficient data to justify being included in the 303(d) list.
Chapter 4 concludes with a re-evaluation of water quality standard violations in the
Newport River based on my proposed water quality standards. Much of the work
(and text) in Chapter 4 was developed in collaboration with Dr. Mark Borsuk, Dr.
Robert Wolpert, and Dr. Kenneth Reckhow and was recently published (as the cover
article) in Environmental Science & Technology (Gronewold et al., 2008).
Chapter 5 compares different FIB water quality metrics in order to determine
whether an ongoing change in NCDENR-SSS standard operating procedure (and
elsewhere, presumably) from a multiple tube fermentation (MTF)-based procedure
to a membrane filtration (MF) procedure might cause a change in the observed fre-
quency of water quality standard violations. This comparison is based on an inno-
vative theoretical model of the MPN probability distribution for any observed CFU
estimate from the same water quality sample, and is applied to recent water quality
samples collected and analyzed by NCDENR-SSS for fecal coliform concentration us-
ing both MTF and MF analysis tests. This research provides important insight into
whether MPN and CFU intra-sample variability stems from human error, laboratory
procedure variability, or is simply a consequence of the probabilistic basis for calcu-
lating the MPN. This research was conducted in close collaboration with Dr. Robert
Wolpert, and was recently published in Water Research (Gronewold and Wolpert,
2008).
Finally, in Chapter 6, I propose a Bayesian strategy for calibrating FIB water
quality models in which the pattern of positive tubes from a multiple-tube fermen-
tation (MTF) serial dilution analysis is used as data. My proposed strategy assumes
that the pattern of positive tubes or wells in a serial dilution analysis experiment
(using, for example, either the MTF test or IDEXX Quanti-Tray®/2000 system),
when modeled as a series of random variables, reflects variability in serial
dilution analysis procedures and, consequently, uncertainty in the estimate of the true
FIB concentration. I then compare my proposed Bayesian strategy with the common
practice of using MPN point estimates to calibrate FIB water quality models. The
research presented in Chapter 6 highlights how proper acknowledgement (or ignorance)
of model input uncertainty affects both FIB water quality model parameter
estimates and model-based management decisions. Much of this research was
completed in collaboration with Dr. Song Qian, Dr. Robert Wolpert, Dr. Rachel
Noble and Dr. Kenneth Reckhow, and is currently being revised following submittal
to Water Research.
Chapter 2
Developing a Graphical Model
Note: much of the text from this Chapter appears in peer-reviewed proceedings of the
International Water Association (IWA) WaterMatex 2007 conference (Gronewold
and Reckhow, 2007).
Appropriate graphical representation of assumed relationships between environ-
mental system variables, resource area management actions, and human health im-
pacts is the first and potentially most critical stage in the development of a proba-
bilistic or Bayesian network model designed to protect designated resource waters.
A graphical network establishes the cornerstone on which model algorithms are iden-
tified and applied, monitoring plans are implemented, and management alternatives
are evaluated. Furthermore, a graphical model structure facilitates group model
building and dissemination of these model algorithms and assumptions about system
dynamics (Borsuk et al., 2004).
In this Chapter, I present a step-by-step approach to developing a graphical net-
work relating system variables and management actions associated with fecal con-
tamination of resource waters. This graphical network model serves as a foundation
for future implementation of a probabilistic or Bayesian network model designed
to integrate environmental conditions in bacteriologically impaired surface waters
with management alternatives, and to forecast probability distributions of designated
model endpoints.
2.1 Selecting the Model Endpoint
Long-term water resource management projects, such as those implemented through
the TMDL program, should start with an evaluation of appropriate water quality
indicators and associated target values which can be used to assess designated use
attainment (U.S. Environmental Protection Agency, 2001). Current guidelines for
United States shellfish harvesting waters, for example, indicate fecal coliform most
probable number (MPN) and colony forming unit (CFU) values are the basis for water
quality standards, and therefore serve as logical model endpoints (Food and Drug Ad-
ministration and Interstate Shellfish Sanitation Conference, 2005). Recent research,
however, indicates that several alternative indicator organisms may more accurately
reflect the potential health risk associated with fecal contamination (National Re-
search Council, 2001; U.S. Environmental Protection Agency, 2001). Potential alter-
native indicators of fecal contamination include the family of coliform bacteria (which
include total coliform, fecal coliform, and Escherichia coli), fecal streptococci (U.S.
Environmental Protection Agency, 2001; Kashefipour et al., 2005), and Enterococcus
sp. (Sanders et al., 2005). Furthermore, while indicator organism concentrations in
the water column are the standard for assessing water quality and threats of fecal
contamination, human illness and death may also occur from the consumption of
shellfish contaminated with pollutants of non-fecal origin (such as red-tide causing
ciguatoxins), even if the shellfish are properly cooked. Several authors, including
Hackney and Pierson (1994), provide a history of field studies relating human disease
outbreaks with contamination of shellfishing resource areas.
Other human and environmental health measures not directly linked to TMDL
implementation, but of significant concern to public health officials and the public-
at-large, include potential relationships between fecal coliform concentration in the
water column and underlying shellfish tissue, the relationship between fecal indicator
organism concentration in shellfish tissue and risk of human illness, and the relation-
ship (in any media, including waters and shellfish tissue) between fecal indicator and
pathogenic organism concentrations. These environmental and human health vari-
ables are included in the graphical network to improve model flexibility and facilitate
future adaptation to alternative management scenarios, and applications other than
strictly TMDL support.
However, because this dissertation is primarily intended to support the TMDL
assessment process and, in particular, the development of TMDLs in shellfish har-
vesting waters, the model endpoint assumed for most of this dissertation is surface
water fecal coliform concentration assessed using currently approved analytical tech-
niques and standards. I propose this endpoint with implicit understanding that fecal
coliform (or other indicator organism) aquatic and terrestrial transport and transfor-
mation processes used to establish conditional probability relationships in the net-
work model are not likely to accurately represent fate and transport dynamics of the
pathogens they supposedly represent. Developing a network model structure which
can be adapted to a variety of potential model endpoints, including both pathogenic
and non-pathogenic organisms of fecal origin, is an area for additional research. My
proposed graphical representation of critical model endpoints, including critical en-
vironmental system response variables, management decisions, and potential model
endpoints is included in figure 2.1.
Figure 2.1: Graphical representation of critical environmental system response variables and potential model endpoints. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
Uncertainty associated with some fecal indicator organism laboratory analysis
procedures can range up to an order of magnitude, and has significant impacts on
both management actions and perceptions of threats to human health. Though not
immediately obvious in figure 2.1, network models provide a logical framework for
exposing and propagating both intrinsic sources of measurement uncertainty inherent
to bacteriological analytical procedures, as well as extrinsic sources of uncertainty,
into model endpoints. Modeling strategies for addressing these potential sources of
uncertainty will be addressed in detail in Chapters 4, 5, and 6.
2.2 Terrestrial Fate and Transport
In order to guide long-term management strategies, a fecal coliform pollution network
model must address the relationship between land use practices, loading reduction
measures, and predicted changes in pollutant concentration probability distribution
in the receiving water. The first relationship in this causal chain, therefore, is ter-
restrial fecal pollution deposition and wash-off. Establishing conditional probability
relationships between the variables related to this process allows loading reduction
management actions to be simulated in the model either at the pollution generation
level, or at the watershed-water body interface.
Land use practices and land cover types in the coastal watersheds of North Car-
olina, as with many other watersheds discharging into coastal shellfishing resource
waters, are dominated by agriculture and forested areas in which potential fecal
pollution sources can range from waste management infrastructure to wildlife and
agricultural runoff (Weiskel et al., 1996; White et al., 2000; Shen et al., 2005). Ter-
restrial accumulation and decay of fecal indicator bacteria from these, and similar
landscapes, can be approximated by a first-order decay process coupled with a lin-
ear daily loading rate term (Alley and Smith, 1981; U.S. Environmental Protection
Agency, 2001):
$$\frac{dL}{dt} = N(s) - k_t(s)\,L(t) \tag{2.1}$$
where
$L(t)$ = number of organisms on the landscape at time $t$ (often in days)
$N(s)$ = seasonal terrestrial FIB deposition rate (organisms per day)
$k_t(s)$ = seasonal first-order terrestrial decay rate (1/day)
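As a minimal illustration of equation 2.1 (not part of the original analysis), the sketch below evaluates the closed-form solution for the special case in which N and k_t are held constant within a season, so that the load relaxes toward the steady state N/k_t. The function name and all parameter values are hypothetical and were chosen only for illustration.

```python
import math


def terrestrial_load(t_days, n_dep, k_t, l0=0.0):
    """Closed-form solution of dL/dt = N - k_t * L for constant N and k_t.

    t_days : elapsed time (days)
    n_dep  : deposition rate N (organisms per day)
    k_t    : first-order terrestrial decay rate (1/day)
    l0     : organisms on the landscape at t = 0
    """
    steady_state = n_dep / k_t  # load at which deposition balances decay
    return steady_state + (l0 - steady_state) * math.exp(-k_t * t_days)


# Hypothetical values, for illustration only.
N_DEP = 1.0e9  # organisms per day
K_T = 0.5      # 1/day

for day in (0, 1, 5, 20):
    print(day, f"{terrestrial_load(day, N_DEP, K_T):.3e}")
```

Under these assumed values the landscape load approaches N/k_t within roughly ten days, which suggests that the time since the last washoff event matters most over short dry periods.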
Transport of fecal pollution (following deposition on the landscape) and entrapped
indicator organisms is a complicated combination of both surface and groundwater
processes. Groundwater is a potential transport mechanism for pathogens and indi-
cator organisms if it serves as a connection between a receiving surface water and a
land-based pollution source such as a waste lagoon, leaking septic tank, or improperly
designed landfill (Ferguson et al., 2003). Soils may act as a filtering mechanism for
certain pathogens, and studies have indicated that they serve as a significant barrier
for viruses, bacteria, and protozoa (Schijven and Hassanizadeh, 2000, 2002). Trans-
port of pathogens and indicator bacteria, however, varies in any media depending
on chemical, physical, and biological properties of the media (Ferguson et al., 2003).
For example, conditions which promote transfer of pollutants to groundwater include
sudden redistribution of a pollutant-bearing liquid on the land surface (such as la-
goon waste) and naturally occurring soil macropores (e.g. root channels and animal
burrows) which limit soil attenuation (Thomas and Phillips, 1979; McMurry et al.,
1998). In general, microbial subsurface transport is poorly understood and is an
area for further research. In light of these various processes, bacteria and pathogen
terrestrial transport models often express pollutant washoff primarily as a function
of precipitation in the following form (Alley and Smith, 1981; Barbe et al., 1996):
$$\frac{dL}{dt} = -\alpha\, r^{b}\, L(t) \tag{2.2}$$
where
$r$ = precipitation intensity or effective runoff rate (inches/day)
$\alpha$ = washoff coefficient, with conversion units
$b$ = power constant
Equation 2.2 is often applied using watershed-averaged values for impervious sur-
face runoff, but can be applied to pervious watersheds in which the rainfall-induced
washoff is typically less than in impervious areas. Parameter values fitted to actual
data for this model are presented in Alley and Smith (1981).
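To make equation 2.2 concrete, the following sketch assumes that the runoff rate r is constant over a time step, in which case the equation integrates to L(t + Δt) = L(t) exp(−α r^b Δt). The function name and the parameter values are hypothetical; fitted values should be taken from the sources cited above.

```python
import math


def washoff_fraction(r, alpha, b, dt_days=1.0):
    """Fraction of the landscape load removed over dt_days at constant runoff r.

    Follows from integrating dL/dt = -alpha * r**b * L with r held constant.
    """
    return 1.0 - math.exp(-alpha * r**b * dt_days)


# Hypothetical coefficients, for illustration only.
ALPHA = 2.0  # washoff coefficient (units depend on b)
B = 1.0      # power constant

for runoff in (0.0, 0.1, 0.5, 1.0):  # inches/day
    print(runoff, round(washoff_fraction(runoff, ALPHA, B), 3))
```

The fraction is zero without runoff and saturates toward one as runoff increases, so under this form a single large storm can remove most of the accumulated load.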
Another potential algorithm for relating pollutant loading to rainfall is (Reeves
et al., 2004):
$$L \sim Q^{n} \tag{2.3}$$
where
$L$ = pollutant load (organisms/day)
$Q$ = volumetric flow rate (ft$^3$/day)
$n$ = power constant (often between 1 and 1.5)
A similar representation of this equation is the power law of the form (Lee and
Bang, 2000):
$$\frac{L}{A_s} \propto \left[\frac{Q}{A_s}\right]^{n} \tag{2.4}$$
where
$A_s$ = watershed area
$n$ = power constant
Equation 2.4 can be rewritten as a linear model of the form:
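$$\ln\left(\frac{L}{A_s}\right) = \ln(\alpha) + n \ln\left(\frac{Q}{A_s}\right) \tag{2.5}$$
where $\alpha$ is the proportionality constant implied by equation 2.4.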
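The linear form of equation 2.5 lends itself to ordinary least squares on log-transformed data. The sketch below is one possible illustration, using synthetic, noise-free data generated from assumed values of α and n. The function name and the data are hypothetical and do not come from any of the cited studies.

```python
import math


def fit_power_law(q_per_area, load_per_area):
    """Least-squares fit of ln(L/As) = ln(alpha) + n * ln(Q/As).

    Returns the tuple (alpha, n).
    """
    x = [math.log(q) for q in q_per_area]
    y = [math.log(load) for load in load_per_area]
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    n = sxy / sxx                          # slope is the power constant
    alpha = math.exp(y_bar - n * x_bar)    # intercept is ln(alpha)
    return alpha, n


# Synthetic data generated with alpha = 3.0 and n = 1.2 (assumed values).
flows = [0.5, 1.0, 2.0, 4.0, 8.0]
loads = [3.0 * q**1.2 for q in flows]

alpha_hat, n_hat = fit_power_law(flows, loads)
print(alpha_hat, n_hat)
```

With real monitoring data the residuals on the log scale would also carry information about predictive uncertainty, which this sketch ignores.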
Many of these algorithms are encompassed in water quality modeling software
packages, some of which are supported by USEPA (U.S. Environmental Protection
Agency, 2001). Ferguson et al. (2003) and Shen et al. (2005) provide comprehensive
summaries of available software packages, including:
• Hydrologic simulation program - Fortran (HSPF)
• Loading simulation program C++
• Watershed analysis risk management framework (WARMF)
• Soil and water assessment tool (SWAT)
• Agricultural nonpoint sources model
• Storm water management model (SWMM)
The wide range of factors and high degrees of uncertainty affecting the relation-
ship between pollutant accumulation, and washoff during precipitation events, make
collection of appropriate data for such models an often overwhelming task (National
Research Council, 2001). For example, soil moisture conditions are often considered
a critical variable in watershed runoff processes (Beven, 2001), yet the high spatial
and temporal monitoring resolution required to accurately reflect these conditions
would exhaust the resources of most management groups (National Research Coun-
cil, 2001). As a result, I have combined historic algorithms (represented by equations
2.1 and 2.2) into the following:
$$\frac{dL}{dt} = N(s) - k_t(s)\,L(t) - \alpha\, r^{b}\, L(t) \tag{2.6}$$
This approach has the advantage of minimizing dependency on more detailed,
small-scale terrestrial processes, many of which are poorly understood and relatively
underrepresented in the literature, including (Ferguson et al., 2003):
• transport particle size distribution
• relationship between physical properties of watersheds, microbial cellular-scale
properties, and transport phenomena
• microbial die-off and decay upon initial transfer into aquatic environment
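Equation 2.6 can be explored numerically. The sketch below is a minimal forward-Euler integration driven by a daily rainfall series, assuming N, k_t, α, and b are constant over the simulation. The function name, the step size, and all parameter values are hypothetical and serve only to illustrate the structure of the combined accumulation, decay, and washoff model.

```python
def simulate_load(rain, n_dep, k_t, alpha, b, l0=0.0, steps_per_day=24):
    """Forward-Euler integration of dL/dt = N - k_t*L - alpha*r**b*L.

    rain : sequence of daily runoff rates r (inches/day), one value per day
    Returns (daily_load, daily_washoff): the load at the end of each day and
    the number of organisms washed off during that day.
    """
    dt = 1.0 / steps_per_day
    load = l0
    daily_load, daily_washoff = [], []
    for r in rain:
        washed = 0.0
        for _ in range(steps_per_day):
            wash_rate = alpha * r**b * load  # organisms per day leaving by washoff
            load += dt * (n_dep - k_t * load - wash_rate)
            washed += dt * wash_rate
        daily_load.append(load)
        daily_washoff.append(washed)
    return daily_load, daily_washoff


# Hypothetical scenario: ten dry days, one storm day, three dry days.
rain_series = [0.0] * 10 + [1.5] + [0.0] * 3
loads, washoff = simulate_load(rain_series, n_dep=1.0e9, k_t=0.5, alpha=2.0, b=1.0)

for day, (load, washed) in enumerate(zip(loads, washoff), start=1):
    print(day, f"{load:.2e}", f"{washed:.2e}")
```

The explicit Euler step is only stable when the step size is small relative to 1/(k_t + α r^b); an implicit or adaptive scheme would be a reasonable substitute for larger rates.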
A graphical representation of the proposed terrestrial fate and transport compo-
nent is presented in figure 2.2.
Figure 2.2: Graphical representation of assumed system variables and causal relationships for terrestrial fate and transport of fecal indicator bacteria. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.3 Aquatic Fate and Transport
Small coastal embayments, including many tributaries of the Newport River Estuary,
are defined as partially enclosed water bodies with a connection to a larger bay or
estuary. Depths in these small coastal embayments can range between 0.5 and 3.0
meters (Fischer, 1979; Thomann and Mueller, 1987), and pollutant concentrations are
heavily influenced by tidal flushing and surface runoff. Advection and other trans-
port processes in coastal resource waters are frequently dominated by tidal activity
(Grant et al., 2001; Kashefipour et al., 2005; Sanders et al., 2005). The relatively
shallow depth and strong advective forces in small coastal embayments often result
in complete or near-complete vertical mixing (Thomann and Mueller, 1987).
In addition to advective forces, governing processes affecting fecal indicator organ-
ism aquatic fate and transport include settling (Chapra, 1997) and natural mortality
(Gameson and Gould, 1974; Auer and Niehaus, 1993). Mortality of biological organisms
is often represented in water quality models by a first-order loss rate $k_a$
(in units of 1/day), which includes effects of temperature, salinity, and solar radiation
(Chapra, 1997). In addition, coastal embayments are often surrounded by wetlands
which undergo continuous wetting and drying cycles and, as a result, may represent
a non-point source of pollution (Sanders et al., 2005).
In an effort to develop a simple and robust network model structure, I review in
detail (in the following sections) potential approaches to modeling FIB transport and
decay processes. I then extract key model algorithms to be represented in the final
network model.
2.3.1 Bacteria loss models
Bacteria die-off and decay in aquatic environments is typically represented in water
quality models by an effective loss rate, $k_a$ (in units of 1/day), accounting for natural
mortality, solar radiation, and settling (Chapra, 1997):
$$k_a = k_{ad} + k_{ai} + k_{as} \tag{2.7}$$
where $k_{ad}$ represents natural mortality, $k_{ai}$ represents mortality due to solar
radiation, and $k_{as}$ represents loss due to settling. Additional environmental variables which
potentially contribute to bacteria loss, but are not typically addressed explicitly in bacteria
water quality models include (Davies-Colley et al., 1994; U.S. Environmental Protec-
tion Agency, 2001):
• attraction to solids
• water column pH
• starvation and predation
• structural damage
• osmotic pressure induced by salinity gradient following runoff events
• nutrient deficiencies
• turbidity
• variations in spectral quality of sunlight
• oxygen and nutrient concentrations
Natural Die-Off (kad)
Fecal coliform bacteria natural die-off rates can be approximated by a first-order
temperature and salinity-dependent process of the following form (Mancini, 1978;
Thomann and Mueller, 1987; Chapra, 1997):
$$k_{ad} = (0.8 + 0.006\,P_s)\,\theta^{T-20}$$
where $P_s$ is the percentage of sea water, $T$ is the water temperature in degrees Celsius,
and $\theta$ expresses the temperature dependency of a reaction rate (and is typically
between 1.0 and 1.1). This equation can be modified as a function of measured
salinity $S$, assuming a seawater salinity in the range of 30 to 35 parts per thousand
(ppt):
$$k_{ad} = (0.8 + 0.02\,S)\,\theta^{T-20}$$
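The salinity-based form of the natural die-off rate is straightforward to evaluate. The sketch below is illustrative only: the function name is hypothetical, and the default θ of 1.07 is an assumed value within the 1.0 to 1.1 range stated above rather than a value fitted in this work.

```python
def natural_dieoff_rate(salinity_ppt, temp_c, theta=1.07):
    """Natural die-off rate k_ad (1/day) as a function of salinity and temperature.

    salinity_ppt : measured salinity S (parts per thousand)
    temp_c       : water temperature T (degrees Celsius)
    theta        : temperature-dependency constant (assumed default)
    """
    return (0.8 + 0.02 * salinity_ppt) * theta ** (temp_c - 20.0)


# Hypothetical conditions: fresh to near-marine salinity at three temperatures.
for salinity in (0.0, 15.0, 30.0):
    rates = [round(natural_dieoff_rate(salinity, t), 3) for t in (10.0, 20.0, 30.0)]
    print(salinity, rates)
```

Setting `theta=1.0` removes the temperature dependence entirely, which corresponds to the case reported in some of the studies discussed in this section.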
Historic studies indicate a wide range of temperature-dependent pathogen and
indicator organisms survival rates. For example, in research results summarized
by U.S. Environmental Protection Agency (2001), pathogens have been inactivated
following exposure to temperature extremes, including freezing and boiling (Tzipori,
1983; Badenoch et al., 1990), while pathogen survival rates at moderate temperatures
(i.e. between approximately 4 and 20 degrees Celsius) ranged between 2 and 6 months
(Bingham et al., 1979; Adam, 1991; Medema et al., 1997). More recent studies, such
as those conducted and cited by Auer and Niehaus (1993), also indicate no significant
relationship between ambient temperature and decay rate (in the absence of solar
effects), implying $\theta = 1$ in the natural die-off equations above (for additional details, see Mitchell and
Chamberlin, 1979; Moeller and Calkins, 1980; Auer and Niehaus, 1993). Freshwater
studies cited in Novotny and Olem (1994) indicate enteric virus survival rates ranging
between 2 and 188 days.
Death due to Solar Radiation (kai)
Bacterial loss in aquatic environments due to solar radiation is often approximated
as (Mancini, 1978; Thomann and Mueller, 1987; Auer and Niehaus, 1993; Chapra,
1997):
$$k_{ai} = \frac{\alpha I_0}{k_e H}\left(1 - e^{-k_e H}\right) \tag{2.8}$$
where $\alpha$ is a proportionality constant often approximated as unity (Thomann and
Mueller, 1987), $I_0$ is surface light energy, $k_e$ is a light extinction coefficient (typically
in units of 1/m) derived from suspended solids concentration or Secchi disk depth
measurements, and $H$ is the depth (in meters) of the layer over which the approximate
decay rate is being applied. Research on effects of solar radiation on bacteria and
pathogen decay rates includes a comparison between Giardia and Cryptosporidium
decay rates in sunlight (see Johnson et al., 1997; Kashefipour et al., 2005), effects of
turbidity on solar penetration in the water column and subsequent increased survival
of microorganisms (Salomon and Pommepuy, 1990), and comparisons between loss of
viral infectivity under various light and substrate concentration conditions indicating
solar radiation as a significant factor on loss of viral infectivity (Noble and Fuhrman,
1997). While these studies provide insight into the role of environmental variables
on the fate of both pathogenic and indicator organisms, it is likely they could only
be presented in models with a level of detail too high for supporting thousands, and
perhaps tens of thousands, of TMDL assessments.
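Returning to equation 2.8, the depth-averaged solar loss rate can be sketched in a few lines. The function name and the numerical values below are hypothetical, and the units of the surface light energy are left unspecified here because the text above does not fix them; the sketch simply assumes that α I_0 is expressed as a rate in 1/day.

```python
import math


def solar_loss_rate(i0, k_e, depth_m, alpha=1.0):
    """Depth-averaged loss rate due to solar radiation (equation 2.8).

    i0      : surface light energy (units assumed such that alpha * i0 is 1/day)
    k_e     : light extinction coefficient (1/m)
    depth_m : depth H of the layer being averaged (m)
    alpha   : proportionality constant, often approximated as unity
    """
    optical_depth = k_e * depth_m
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return alpha * i0 / optical_depth * -math.expm1(-optical_depth)


# Hypothetical values: the same surface light over increasingly deep layers.
for depth in (0.5, 1.0, 3.0):
    print(depth, round(solar_loss_rate(i0=10.0, k_e=1.0, depth_m=depth), 3))
```

The rate approaches α I_0 for very shallow or very clear layers and falls off as the product k_e H grows, which is consistent with the reported protective effect of turbidity.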
Settling Loss (kas)
Bacteria settling rates are believed to be a function of the fraction of organisms
entrapped in settling solids, and can be approximated as (Chapra, 1997):
$$k_{as} = F_p \frac{v_s}{H} \tag{2.9}$$
where $v_s$ is the settling velocity of the solids (in meters per day), $H$ is the depth of
measurement (in meters), and $F_p$ is the fraction of bacteria attached to solids, which
can be approximated by:
$$F_p \approx \frac{K_d m}{1 + K_d m}$$
where $K_d$ is a partition coefficient (in m$^3$/g) and $m$ is the suspended solids
concentration (in mg/L).
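A short sketch of equation 2.9 follows. It relies on the fact that 1 mg/L equals 1 g/m$^3$, so the product of K_d (in m$^3$/g) and m (in mg/L) is dimensionless. The function names and parameter values are hypothetical and chosen only so that the arithmetic is easy to follow.

```python
def attached_fraction(k_d, m):
    """Fraction of bacteria attached to solids, F_p = K_d*m / (1 + K_d*m).

    k_d : partition coefficient (m^3/g)
    m   : suspended solids concentration (mg/L, numerically equal to g/m^3)
    """
    return k_d * m / (1.0 + k_d * m)


def settling_loss_rate(v_s, depth_m, k_d, m):
    """Settling loss rate k_as = F_p * v_s / H, in 1/day.

    v_s     : settling velocity of the solids (m/day); negative for resuspension
    depth_m : depth of measurement H (m)
    """
    return attached_fraction(k_d, m) * v_s / depth_m


# Hypothetical values, for illustration only.
print(attached_fraction(k_d=0.1, m=10.0))
print(settling_loss_rate(v_s=1.0, depth_m=2.0, k_d=0.1, m=10.0))
```

Because the rate scales linearly with the settling velocity, a negative velocity yields a negative loss rate, that is, a net source to the water column.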
Settling velocity $v_s$ can be zero, positive, or negative. Negative settling velocities
account for microorganisms entrapped in resuspended sediment (see, e.g. Thomann
and Mueller, 1987). Recent studies indicate that resuspension of sediment and en-
trapped bacteria can impair water quality in the absence of precipitation events (see
Irvine and Pettibone, 1993; Weiskel et al., 1996). Obiri-Danso and Jones (2000)
found that fecal indicator organisms, in particular, are susceptible to resuspension
during dry weather. Studies supporting these findings indicate that fecal indicator
and pathogenic bacteria may survive longer in sediment than in overlying waters,
often by an order of magnitude (Ashbolt et al., 1993; Nix et al., 1993; Ghinsberg
et al., 1994; Davies-Colley et al., 1994; Obiri-Danso and Jones, 2000; Sanders
et al., 2005). Some pathogenic organisms, such as Campylobacter, survive for no
more than a few hours in cold weather, and on the order of minutes in the summer,
making their presence in sediment a strong indicator of recent fecal pollution and
potential threats to human health (Obiri-Danso and Jones, 2000). Other potential
factors related to resuspension events include soil characteristics and hydrodynamic
shear forces at the sediment-fluid interface (Blanchard et al., 1997).
2.3.2 Bacteria transport models
Approaches to modeling the transport and fate of fecal indicator organisms in a
shallow tidal estuary range from simple one-dimensional models focusing on first
order decay and dilution to complex 3-dimensional models encompassing diffusivity
gradients, temperature and salinity gradients, and velocity profiles. Some models, for
example, predict continuous advection, dispersion, and die-off throughout tidal cycles
(Sanders et al., 2005). Others, as recommended by Thomann and Mueller (1987),
use a time scale no finer than one point in the tidal cycle to the same point in the
next cycle. Each type of model carries implicit monitoring requirements, with the
more complex models requiring more extensive monitoring networks with a broader
range of environmental parameters.
Regardless of model structure or spatial and temporal scale, microbial transport
models historically address advection, diffusion, and dilution (Fischer, 1979). First-
order decay (or loss) terms appearing in these models can be viewed as an integration
of potential loss factors discussed in section 2.3.1. Fecal coliform concentrations in
tidal estuaries, in particular, are commonly assumed to be governed by river discharge
and tidal range (Grant et al., 2001; Kashefipour et al., 2005). Rarely, however, do
these processes apply equally to a single water body. For example, exploratory anal-
ysis of historic water quality data in the Newport River indicates that the main body
of the Newport River estuary acts as a large tidal basin with high tidal exchange rates
and salinity values, and relatively infrequent water quality violations. Tributaries of
the Newport River estuary, however, exhibit relatively high frequency of standard
violations and are typically small enough (both in surface area and volume) that
tidal advection likely outweighs effects of other hydraulic processes.
For the remainder of this section, I review potential hydraulic transport processes
and associated modeling strategies for tidal estuaries and coastal embayments, fol-
lowed by a summary of modeling approaches most applicable to the coastal waters
of the Newport River and its tributaries. Of course, most of these algorithms will
not likely be included in the final proposed model and are presented with the under-
standing that they provide context and guidance for choosing the proposed model
and, if necessary, for making changes to the model in the future.
Some of the most well-known and frequently applied water quality models are
based on solutions to the advective-diffusion equation, which is commonly used for
modeling bacteria and other non-conservative substances undergoing first-order de-
cay (for details, see Fischer, 1979). Similarly, the QUAL2K pathogen model applies
a mass balance approach to solving fecal bacteria concentration on a reach-by-reach
basis (Chapra et al., 2007). Several recent studies, however, add to the growing evidence
that the advective-diffusion equation, and similar mechanistic models, promote
a level of detail exceeding the limitations of most data collection resources (National
Research Council, 2001; Borsuk et al., 2004). Salomon and Pommepuy (1990), for
example, acknowledge the complexity and cost associated with implementing a 3-
dimensional model, and found (in their particular study) that dilution was so dom-
inant, subsequent detailed investigations of organism mortality were not justified.
Arega and Sanders (2004), while successfully applying the California tidal wetland
modeling system (and providing a comprehensive list of similar studies) demonstrate
the potential large amounts of data and, in their case, the use of dye studies, required
for complex model support. Such effort is not expected to be practical on the scale
of the TMDL program. Furthermore, it is unclear if advection-diffusion equations,
and other high order differential equations typically applied to hydraulic water qual-
ity problems, apply to cellular transport in water bodies dominated by dilution and
advection.
Most importantly, the water quality standards for shellfish harvesting waters are
based on water quality at the surface at a particular monitoring station. As a re-
sult, detailed 2 and 3-dimensional models exceed not only the resources, but also
the needs of the TMDL assessments in the Newport River Estuary. Finally, because
this doctoral research is intended to support model implementation on the scale of
thousands of models, and perhaps tens of thousands of surface waters,
the underlying model algorithm should be as simple as possible to facilitate monitor-
ing and modeling efforts, and to simulate model endpoints within acceptable error
limits (Reckhow, 1999). Tidal flushing models follow a general modeling strategy
recommended for rivers such as the Newport (Thomann and Mueller, 1987), which
combines mass balance theory with volumetric water exchange due to the rise and fall
of the tide and has origins dating back to the work of Ketchum (1951). Subsequent
efforts to revise and apply Ketchum’s tidal flushing model, which are now commonly
referred to as tidal prism models, include Kuo and Neilson (1988), Sanford et al.
(1992), Luketina (1998), and in a recent coastal North Carolina TMDL, Shen et al.
(2005).
The tidal prism is defined as the difference between the volume of water in an
embayment at high and low tide (Luketina, 1998), and the concentration of a non-
conservative pollutant S in a tidal environment can be modeled as follows:
$$\frac{dS}{dt} = \frac{W}{V} - k\,S(t) - \frac{(1-b)\,Q}{V}\left(S(t) - S_{amb}\right) - \frac{I(t)}{V}\left(S(t) - S_i(t)\right) \tag{2.10}$$
where
$S(t)$ = pollutant concentration at time $t$ (in ppt or mg/L)
$t$ = time (days)
$W$ = within-estuary source (mg per day)
$V$ = estuary average volume (L)
$k$ = first order decay rate (1/day)
$b$ = return flow factor ($0 < b < 1$)
$Q$ = estuary outflow (L/day)
$S_{amb}$ = salinity in water outside estuary (ppt)
$I(t)$ = estuary inflow at time $t$ (L/day)
$S_i(t)$ = pollutant concentration in estuary inflow at time $t$ (ppt)
In addition to being relatively simple, the tidal prism model has the advantage
of having only one hydrologic calibration parameter, the return flow factor b (Kuo
et al., 2005). This factor has been reported in the literature to range between 0.23
(Sanford et al., 1992) and 0.3 (Kuo et al., 2005), and these sources caution against
using (in the absence of any monitoring data) the often-recommended value of 0.5.
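Equation 2.10 can be sketched numerically as follows. This is a simplified illustration, not the model implementation used in this dissertation: it assumes the inflow I and the inflow concentration S_i are constant in time, uses a forward-Euler step, and relies on hypothetical function names and parameter values. For constant inputs the model has a closed-form steady state, which the sketch uses as a check on the integration.

```python
def tidal_prism_steady_state(w, v, k, b, q, s_amb, inflow, s_in):
    """Steady-state concentration of equation 2.10 for constant inputs."""
    exchange = (1.0 - b) * q / v   # tidal exchange rate (1/day)
    flushing = inflow / v          # inflow flushing rate (1/day)
    sources = w / v + exchange * s_amb + flushing * s_in
    return sources / (k + exchange + flushing)


def simulate_tidal_prism(s0, days, w, v, k, b, q, s_amb, inflow, s_in,
                         steps_per_day=24):
    """Forward-Euler integration of equation 2.10 with constant I and S_i.

    Returns the concentration at the end of each simulated day.
    """
    dt = 1.0 / steps_per_day
    s = s0
    daily = []
    for _ in range(days):
        for _ in range(steps_per_day):
            ds_dt = (w / v
                     - k * s
                     - (1.0 - b) * q / v * (s - s_amb)
                     - inflow / v * (s - s_in))
            s += dt * ds_dt
        daily.append(s)
    return daily


# Hypothetical embayment, for illustration only (concentration units arbitrary).
params = dict(w=0.0, v=1.0e9, k=1.0, b=0.3, q=5.0e8,
              s_amb=2.0, inflow=1.0e8, s_in=500.0)

series = simulate_tidal_prism(s0=0.0, days=30, **params)
print(round(series[-1], 3), round(tidal_prism_steady_state(**params), 3))
```

Under these assumed values the simulated concentration converges to the steady state within a few days, since the combined decay and flushing rate exceeds 1 per day.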
Based on a review of historic pathogen fate and transport models, I propose that
the tidal prism model is most appropriate for the waters of the Newport River Estu-
ary. While a simple zero-dimensional model may be suitable for the Newport River
Estuary tributaries, the central portion of the Newport River is most likely too large
for representation by a zero-dimensional model with a single reference monitoring
point. The loading reduction requirements for Newport River Estuary tributaries
may therefore have to serve as a conservative guide for the loading reduction re-
quirements of the Estuary itself (if it is found to be in violation of water quality
standards). A graphical representation of my proposed aquatic fate and transport
model, including critical environmental processes and system variables related to a
tidal prism model, is included in figure 2.3.
Figure 2.3: Graphical representation of environmental processes and system variables affecting aquatic fate and transport of fecal indicator organisms. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
2.4 Summary
The comprehensive graphical model is developed by combining designated submodels
for each system component; however, a model-simplifying step adapted from Borsuk
et al. (2004), in which model variables which are not controllable, predictable, or
observable are removed from the network, results in the graphical network presented
in figure 2.4.
Figure 2.4: Comprehensive graphical network of fecal contamination in designated resource waters. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
Chapter 3
Developing and Applying a Simple
Bayesian Network Model
Note: the research in this Chapter appears in peer-reviewed conference proceedings
of the Water Environment Federation TMDL 2007 Specialty Conference in Bellevue,
Washington (Gronewold et al., 2007).
Fate and transport processes related to bacteriological contamination of recre-
ational and shellfish harvesting waters are complicated and often poorly-understood
with a broad range of historic modeling efforts and associated varying degrees of
success. Developing water quality models which reflect implicit causal relationships
between environmental phenomena, land use patterns, and surface water quality is
vital to the long-term success of the USEPA Total Maximum Daily Load (TMDL)
Program. In this Chapter, I develop and apply a simple Bayesian network model
intended to support fecal coliform TMDL assessments in shellfish harvesting waters.
System components are graphically presented and discussed as a critical initial step in
successful model development, followed by establishment of probabilistic relationships
between system components. The subsequent model, while only a simplified version
of the more comprehensive model expected to be developed after my dissertation, is
suggested as an innovative tool for successful implementation of future TMDLs for
microbial contaminants. I begin by describing context for this research along with
a technical description of Bayesian networks. A graphical model representing sys-
tem dynamics in shellfish harvesting waters is presented, followed by application of
a submodel with data from the Newport River Estuary in eastern North Carolina.
3.1 Background
The goal of the TMDL process is to determine the maximum pollutant loading which
can enter a water body without exceeding water quality standards (National Research
Council, 2001; Shen et al., 2005). Despite the complications associated with model-
ing the relationship between fecal indicator bacteria (FIB) loading, surface water FIB
concentrations, and ultimately shellfish contamination, shellfish harvesting resource
area managers are charged with protecting human health by closing harvesting areas
immediately following conditions which may increase the risk of exposure to water-
borne pathogens. Shellfish harvesting areas which violate long-term water quality
standards are placed on the USEPA 303(d) list of impaired waters and are required
to undergo a TMDL assessment. Simple models are therefore needed to simultane-
ously support short-term management programs while providing forecast information
which can guide long-term management actions towards water quality standard com-
pliance. Key characteristics of these simple models should include, but not be limited
to, appropriate acknowledgement of uncertainty (in all phases of the process) and ap-
plicability to thousands of shellfish harvesting areas for which a TMDL assessment
is required, but has not been initiated.
Local shellfishing resource area management plans contain conservative criteria
for shellfish growing area closures and openings in order to protect human and envi-
ronmental health. Closure criteria typically include the volume of recent precipitation
events, while reopening criteria may include a subjective analysis of the number of
days since the precipitation event, event intensity, and monitoring to confirm water
quality restoration. Although these criteria are based on historic relationships be-
tween stormwater runoff and high pathogen concentrations in receiving waters, the
implicit causal relationship between precipitation intensity, lag between precipitation
events, land use patterns, receiving water quality and subsequent shellfish contami-
nation is poorly understood. Because short-term protection of human health takes
priority over long-term restoration of impaired shellfishing areas, effective implemen-
tation of a shellfishing resource area management plan does not necessitate explicit
understanding of the runoff-contamination relationship. Current management prac-
tices reflect the assumption that precipitation-based responses in water quality are
similar within neighboring stations and closure decisions are often subsequently ap-
plied to large areas encompassing several stations.
Additional management scenarios in shellfish harvesting areas include short-term
closure and re-opening of resource areas under the authority of local management
agencies. The primary objective of local management scenarios, as opposed to the
long-term remediation goals of the TMDL program, is protecting human health
through restricting or prohibiting shellfish harvesting either during adverse pollution
conditions, such as a recent rainfall event, or due to long-term water quality standard
violations. Due to the close relationship between the criteria and environmental pro-
cesses related to these two management schemes, fecal pollution modeling strategies
need to be developed that address both public health concerns and retention and/or
restoration of the beneficial uses of the waterbody.
3.2 Bayesian Networks
A Bayesian network is a graphical representation of conditional probability distri-
butions relating a set of system variables coupled with their formal statistical and
probabilistic relationships (see Pearl, 1988; Spiegelhalter et al., 1993, for extensive
definitions). The first of three stages in developing a Bayesian network model is a
qualitative assessment of graphical model structure, in which system variables and
assumptions about their relationships are identified; this stage was discussed
in Chapter 2 (Spiegelhalter et al., 1993). Each system variable in a Bayesian
network model is represented by a node, and the presence or absence of an arc between
nodes indicates conditional dependence or independence, respectively. Although arcs
between variable nodes typically imply causality in Bayesian networks, the condi-
tional dependence represented by an arc may indicate a more complex relationship
(Borsuk et al., 2004). The graphical model, while providing a framework for identi-
fying system variables and qualitative beliefs regarding their interdependence, does
not by itself carry a probabilistic interpretation (Spiegelhalter et al., 1993).
The second stage of Bayesian network model development acknowledges an im-
plicit joint probability distribution encompassing the proposed model variables and
reflecting the graphical structure of the network (Spiegelhalter et al., 1993). For ex-
ample, fecal contamination of coastal estuaries may be represented using a simple
model which relates rainfall distribution (R) to fecal coliform concentration (F ) as a
function of both non-point (N) and point source (P ) loading, as presented in figure
3.1.
Figure 3.1: Simple network model representing rainfall-induced fecal contamination of a coastal estuary.
The joint probability of system variables in this simplified model can be written
via the chain rule as:
p(R, N, P, F ) = p(R)p(N |R)p(P |R, N)p(F |R, N, P )
The implied conditional independence indicated by the lack of an arc between
nodes allows us to simplify the joint probability to:
p(R, N, P, F ) = p(R)p(N |R)p(P |R)p(F |N, P )
This simplification is possible because once the direct causes of a system variable
are observed, other system variables do not influence understanding of the node’s
distribution (Spiegelhalter et al., 1993). The resulting joint probability can therefore
be viewed as a set of several local distributions, each made up of only a node and
its parents (Spiegelhalter et al., 1993; Borsuk et al., 2004). These local distributions,
commonly referred to as belief universes (see, e.g. Jensen et al., 1990), represent
the cornerstones of model decomposition and one of many benefits associated with
modeling an environmental system with a Bayesian network.
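The factorization above can be sketched numerically. The following is a minimal illustration, assuming two hypothetical states per variable and made-up probabilities (none of these values come from the Newport River data); it multiplies the local distributions p(R), p(N|R), p(P|R), and p(F|N, P) to recover the joint distribution and a marginal for F.

```python
from itertools import product

# Hypothetical states and probabilities, chosen only for illustration.
p_R = {"wet": 0.3, "dry": 0.7}
p_N_given_R = {"wet": {"high": 0.8, "low": 0.2},
               "dry": {"high": 0.1, "low": 0.9}}
p_P_given_R = {"wet": {"high": 0.4, "low": 0.6},
               "dry": {"high": 0.2, "low": 0.8}}
p_F_given_NP = {
    ("high", "high"): {"above": 0.9, "below": 0.1},
    ("high", "low"): {"above": 0.6, "below": 0.4},
    ("low", "high"): {"above": 0.5, "below": 0.5},
    ("low", "low"): {"above": 0.1, "below": 0.9},
}


def joint(r, n, p, f):
    """p(R, N, P, F) = p(R) p(N|R) p(P|R) p(F|N, P)."""
    return (p_R[r] * p_N_given_R[r][n] * p_P_given_R[r][p]
            * p_F_given_NP[(n, p)][f])


levels = ("high", "low")
states = list(product(p_R, levels, levels, ("above", "below")))

# The joint distribution sums to one over all combinations of states.
total = sum(joint(*s) for s in states)

# Marginal probability that F is "above", summing out R, N, and P.
p_F_above = sum(joint(r, n, p, f) for r, n, p, f in states if f == "above")
```

Each factor involves only a node and its parents, so each table can be revised without touching the others.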
The third and final stage (Spiegelhalter et al., 1993) of Bayesian network model
development involves encoding the conditional probability distribution within the
graphical model structure. Conditional probability distributions are often established
using model simulations, in some cases combined with expert opinions on system
dynamics.
The Bayesian component of Bayesian network models addresses how new infor-
mation is used to modify the conditional probability relationships between system
variables in an existing model. Computations relating future conditional probabil-
ity relationships (posterior distributions) with previous or current understanding of
the relationships (prior distributions) and new observations (likelihood) are based on
Bayes’ theorem, which can be expressed as the following:
posterior ∝ likelihood × prior
A graphical representation of Bayes’ theorem is included in figure 3.2.
Figure 3.2: Graphical representation of Bayes’ theorem indicating prior and posterior probability densities, and the normalized likelihood for a water quality standard.
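Written out with its normalizing constant, the proportionality in Bayes’ theorem takes the form below. The symbols θ (the unknown quantity) and y (the new observations) are generic notation introduced here for illustration and are not used elsewhere in this chapter; for discrete variable states the integral becomes a sum.

```latex
\[
  p(\theta \mid y)
  = \frac{p(y \mid \theta)\, p(\theta)}{\int p(y \mid \theta')\, p(\theta')\, d\theta'}
  \;\propto\; p(y \mid \theta)\, p(\theta)
\]
```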
3.3 Methods
3.3.1 Study Area and Data Collection
The focus area for this study is the Newport River Estuary (NPRE), located along
the eastern coast of North Carolina in Carteret County. The Newport River and its
tributaries are collectively referred to as shellfish growing area E-4. Shellfish growing
area E-4 is locally managed by the North Carolina Department of Environment and
Natural Resources Shellfish Sanitation and Recreational Water Quality Section (SSS),
and encompasses forty individual harvesting areas currently included in USEPA’s
303(d) list of impaired waters targeted for TMDL assessment (see Appendix A).
Water quality samples from shellfish growing area E-4 are routinely collected by
SSS from 29 sampling stations in accordance with guidelines outlined by the National
Shellfish Sanitation Program (Food and Drug Administration and Interstate Shellfish
Sanitation Conference, 2005). Routine compliance samples are collected roughly
5 to 6 times per year, while adverse condition samples are collected after rainfall
events in order to determine the duration of short-term shellfish harvesting area
closings. The primary data set used for this analysis is the routine monitoring data. In
addition to analyzing samples for fecal coliform concentration, the approximate status
of the tide is recorded during each sampling event. Stations are periodically added to
and removed from the sampling program depending on monitoring needs. The SSS
monitoring data is the longest continuing dataset using consistent station locations
for bacteriological water quality information in the Newport River Estuary and is
the primary source of inference for determining water quality standard violations
and TMDL modeling efforts. Rainfall data for the Newport River Estuary are
obtained from the National Oceanic and Atmospheric Administration’s (NOAA)
National Climatic Data Center (NCDC) weather observation station in Morehead City,
North Carolina.
3.3.2 Model Variables
A comprehensive graphical model representing assumed processes and system compo-
nents in a tidal shellfish harvesting area was developed in Chapter 2. Components of
the graphical model (see figure 3.3) were identified and related to one another based
on a review of historic studies of tidal estuary systems and guidance from USEPA
(Grant et al., 2001; Kashefipour et al., 2005; U.S. Environmental Protection Agency,
2001). Recent research indicates that a wide range of alternative indicator organisms
may reflect the health risks associated with fecal contamination, and therefore may be
considered as potential model endpoints (National Research Council, 2001). Such
organisms include, but are not limited to, the family of coliform bacteria (which includes
total coliform, fecal coliform, and Escherichia coli) and Enterococcus sp. (U.S.
Environmental Protection Agency, 2001). Current guidelines for United States shellfish
harvesting waters indicate fecal coliform most probable number (MPN) and colony
forming unit (CFU) values as a basis for water quality standards.
For the purposes of this study, a simplified network model is derived from the
comprehensive model (figure 3.3) which includes only those variables which are mea-
surable, and which relate precipitation and tidal dynamics with fecal coliform MPN
measurements. Because water quality samples collected for this study were analyzed
by SSS using a 5-tube serial dilution multiple tube fermentation procedure resulting
in MPN estimates of fecal coliform concentration, fecal coliform MPN will serve as the
model endpoint. Exploratory analysis of historical data, local management criteria,
Figure 3.3: Graphical representation of environmental variables and processes associated with fecal contamination in tidal shellfish harvesting areas. Management decisions are indicated by boxes, and variables are represented by rounded nodes.
and conversations with SSS personnel indicate precipitation and tide are two of the
most significant variables affecting bacteriological water quality within the Newport
River Estuary. A similar model simplification process is presented in Borsuk et al.
(2004).
In order to facilitate both graphical representation and Bayesian updating, I
implement the proposed model using the Bayesian network software package Netica.
A critical aspect of implementing a Bayesian network model within most packaged
software programs is variable discretization, and variables in the proposed submodel
are primarily discretized in order to best reflect current local and federal management
criteria. For example, shellfish harvesting areas in the Newport River Estuary are
often closed after a daily rainfall event exceeding one inch. The magnitude of the
most recent rainfall event is therefore selected as a submodel variable with alternative
states of less than one inch and at least one inch. In addition, the shellfish manage-
ment guidelines outlined in Title 15A of the North Carolina Administrative Code
(NCAC), Chapter 18 (Environmental Health), SubChapter A (Sanitation), Sections
.0300 through .0900 (see Appendix B) indicate that the median fecal coliform most
probable number (MPN) or the geometric mean MPN of water shall not exceed 14
organisms per 100 ml, and not more than ten percent of the samples shall exceed
a fecal coliform MPN of 43 organisms per 100 ml (based on the five-tube serial di-
lution analysis procedure used by SSS). A graphical representation of the proposed
submodel, indicating the selected variables and their states, is included as figure 3.4.
Each node in figure 3.4 represents a system variable, and the rows within each node
indicate a variable state along with the associated probability distribution. Where
applicable, the bottom of each node includes the node variable mean and standard
Figure 3.4: Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for all variable states are based on monitoring data collected between 1994 and 1997 at a cluster of monitoring stations in the upper reaches of the Newport River Estuary, North Carolina.
deviation. The values in figure 3.4 are based on water quality data collected from
monitoring stations in the upper reaches of the Newport River between 1994 and
1997.
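The discretization and the numeric criteria quoted above can be sketched as follows. This is a minimal illustration, not the SSS procedure: the function names and the sample values are hypothetical, and because the guideline text refers to the median or the geometric mean, both statistics are reported separately rather than combined into a single verdict.

```python
import statistics

CENTRAL_LIMIT = 14.0     # org/100 ml, limit on median or geometric mean MPN
EXCEEDANCE_LIMIT = 43.0  # org/100 ml, limit exceeded by at most 10% of samples


def mpn_state(mpn):
    """Map an MPN value to one of the three submodel states."""
    if mpn < CENTRAL_LIMIT:
        return "0 to 14"
    if mpn < EXCEEDANCE_LIMIT:
        return "14 to 43"
    return ">= 43"


def assess(samples):
    """Summarize a set of positive MPN values against the quoted criteria."""
    median = statistics.median(samples)
    geo_mean = statistics.geometric_mean(samples)
    frac_over = sum(1 for v in samples if v > EXCEEDANCE_LIMIT) / len(samples)
    return {
        "median": median,
        "geometric_mean": geo_mean,
        "fraction_over_43": frac_over,
        "median_ok": median <= CENTRAL_LIMIT,
        "geometric_mean_ok": geo_mean <= CENTRAL_LIMIT,
        "exceedance_ok": frac_over <= 0.10,
    }


# Hypothetical MPN values (org/100 ml), for illustration only.
example = [2, 2, 4.5, 7.8, 13, 17, 23, 33, 49, 79]
summary = assess(example)
```

With these made-up values the median is 15 and two of ten samples exceed 43, so both the median criterion and the ten percent criterion would be flagged.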
3.3.3 Conditional Probabilities
Representing relationships between variables using conditional probability distribu-
tions facilitates not only model updating (using Bayes’ theorem), but also analysis of
sensitivity of the response variable (fecal coliform MPN) to alternative environmental
states. For example, the probability distributions expressed in figure 3.4 are based
on precipitation and tidal conditions only at the time of sampling. It is therefore
uncertain if the distribution of fecal coliform MPN presented in figure 3.4 is an ap-
propriate indicator of long-term average conditions in the water body and if it can
be used as an accurate tool for assessing impairment of the designated use.
Historic data from the Morehead City NCDC station for this time period indi-
cates that there are between 0 and 4 days of dryness between rainfall events roughly
84% of the time, and more than 4 days of dryness between rainfall events 16% of
the time. Historic data analysis also indicates that the magnitude of daily rainfall
events is less than 1 inch approximately 90% of the time. Adjusting the distribution
of environmental variables to reflect long-term conditions provides a better
understanding of the long-term distribution of the water quality measurement. Netica
stores relationships between causal and response variables in a conditional
probability table. In this example, the relationship between variables does not change as
we modify marginal probability distributions of (assumed) causal variables. Using
the chain rule, we can demonstrate how Netica calculates the marginal
probability distribution for any state of the fecal coliform MPN given different states of the
causal variables. For example, figure 3.5 (from the Netica graphical user
interface) shows the empirically-based conditional probability distribution table for fecal
coliform MPN. Each row corresponds to the conditional probability that the fecal
coliform MPN will be in a given state given the state of all three causal variables.
For example, the first row of the table indicates that there is a 0.67 probability that
the fecal coliform MPN will be below 14 organisms per 100 ml when the tide is rising,
when the most recent rainfall is less than one inch, and when the most recent rainfall
event was less than four days ago. Using the chain rule, the marginal probability
that the fecal coliform MPN is between 0 and 14 organisms per 100 ml (integrated
over all possible states of the three causal variables, expressed here as x1, x2, x3)
can be written as:
\[ p(0 \le \mathrm{MPN} < 14) = \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1, x_2, x_3) \tag{3.1} \]
We assume that x1, x2, x3 are independent, and can therefore rewrite equation 3.1
as:
\[ p(0 \le \mathrm{MPN} < 14) = \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1)\, \pi(x_2)\, \pi(x_3) \]
We can then combine the conditional probabilities for the fecal coliform MPN in
figure 3.5 with any set of marginal probabilities of environmental (causal) variables.
The marginal probability that the fecal coliform MPN is between 0 and 14 organisms
per 100 ml under long-term environmental conditions (which, as stated previously,
are slightly different than those under which the samples were collected) is:
Figure 3.5: Conditional probability distribution table for fecal coliform MPN node. For each of the three states of the MPN node, each row indicates the conditional probability of the node being in that state given the state of the three causal variables. For example, the probability that the MPN is less than 14 organisms per 100 ml, given that the tide is rising, the most recent rainfall was less than one inch, and that it has been less than four days since the most recent rain event, is 0.667.
\[
\begin{aligned}
p(0 \le \mathrm{MPN} < 14) &= \sum_{X} p(0 \le \mathrm{MPN} < 14 \mid x_1, x_2, x_3)\, \pi(x_1)\, \pi(x_2)\, \pi(x_3) \\
&= (0.67)(0.90)(0.50)(0.84) + (0.55)(0.90)(0.50)(0.16) \\
&\quad + (0.33)(0.10)(0.50)(0.84) + (0.33)(0.10)(0.50)(0.16) \\
&\quad + (0.61)(0.90)(0.50)(0.84) + (0.60)(0.90)(0.50)(0.16) \\
&\quad + (0.33)(0.10)(0.50)(0.84) + (0.55)(0.10)(0.50)(0.16) \\
&= 0.60
\end{aligned}
\]
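The eight-term sum can be reproduced with a short script. This is a sketch under stated assumptions: the conditional probabilities and marginals are the rounded values quoted in the text, the state labels are my own shorthand, and the second tide state is labelled "other" because the text names only the rising state explicitly.

```python
from itertools import product

# p(0 <= MPN < 14 | tide, rainfall magnitude, days since rainfall),
# using the rounded values quoted in the worked sum.
p_low_mpn = {
    ("rising", "<1 in", "0-4 d"): 0.67,
    ("rising", "<1 in", ">4 d"): 0.55,
    ("rising", ">=1 in", "0-4 d"): 0.33,
    ("rising", ">=1 in", ">4 d"): 0.33,
    ("other", "<1 in", "0-4 d"): 0.61,
    ("other", "<1 in", ">4 d"): 0.60,
    ("other", ">=1 in", "0-4 d"): 0.33,
    ("other", ">=1 in", ">4 d"): 0.55,
}

# Long-term marginal distributions of the (assumed independent) causal variables.
pi_tide = {"rising": 0.50, "other": 0.50}
pi_rain = {"<1 in": 0.90, ">=1 in": 0.10}
pi_dry = {"0-4 d": 0.84, ">4 d": 0.16}

# Sum the conditional probability over every combination of causal states,
# weighting each term by the product of the three marginals.
marginal = sum(
    p_low_mpn[(t, r, d)] * pi_tide[t] * pi_rain[r] * pi_dry[d]
    for t, r, d in product(pi_tide, pi_rain, pi_dry)
)

print(round(marginal, 2))  # 0.6
```

Changing only the three marginal dictionaries re-weights the same conditional table, which is the adjustment from sampling conditions to long-term conditions described in this section.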
A summary of marginal distributions for each causal variable revised to reflect
long-term conditions in the Newport River Estuary, along with the revised marginal
distribution for the fecal coliform MPN, is presented in figure 3.6.
Figure 3.6: Graphical submodel relating precipitation events, tidal dynamics, and water quality. Probabilities for fecal coliform MPN states are conditional upon long-term average precipitation and tidal conditions in the upper reaches of the Newport River Estuary, North Carolina.
3.4 Results and Discussion
Results of the analysis of the conditional probability distributions of water quality
data within the upper reaches of the Newport River between 1994 and 2004 using
the proposed Bayesian network submodel are presented in table 3.1. The summary
table divides the data into three time periods, and indicates distribution of fecal
coliform MPN under the conditions at the time of sampling (e.g. figure 3.4) and
adjusted for long-term average conditions (e.g. figure 3.6). Analysis of the data in table
3.1 indicates little change in the probability distribution of fecal coliform between
the selected time periods and between the long-term average distribution and the
Table 3.1: Marginal distribution of fecal coliform MPN results at a selected grouping of monitoring stations. Newport River, North Carolina.

                                 Probability under       Probability adjusted for
Period       MPN (org/100mL)     sampling conditions     long-term average conditions
1994-1997    0 to 14             0.58                    0.60
             14 to 43            0.23                    0.20
             ≥ 43                0.19                    0.20
1997-2000    0 to 14             0.64                    0.66
             14 to 43            0.17                    0.17
             ≥ 43                0.19                    0.17
2001-2004    0 to 14             0.53                    0.56
             14 to 43            0.30                    0.28
             ≥ 43                0.17                    0.16
distribution under sampling conditions. Results of the analysis indicate either that
the original monitoring program reflects long-term conditions, or that I do not have
enough data to support alternative conditional scenarios for all possible combinations
of the variable states. Future modeling efforts should include data not initially in-
cluded in the standard SSS monitoring program in order to improve understanding
of the relationship between rainfall events, tide, and bacteria concentrations.
Results of the Bayesian analysis of water quality data are presented in table 3.2,
and indicate that the Bayesian analysis may provide a more representative long-term
indication of water quality in the Newport River. In addition, the results indicate
that a Bayesian analysis provides an opportunity to apply relative weights to current
and historic data based on potential knowledge of changing dynamics within the
contributing watershed.
In particular, a Bayesian analysis yields fecal coliform MPN probability distri-
Table 3.2: Summary of Bayesian analysis results for Newport River, North Carolina fecal coliform MPN data.

Period       MPN (org/100mL)     Prior probability     Posterior probability
1994-1997    0 to 14             0.33 (2)              0.61
             14 to 43            0.33 (2)              0.20
             ≥ 43                0.33 (2)              0.19
1997-2000    0 to 14             0.61                  0.64
             14 to 43            0.20                  0.18
             ≥ 43                0.19                  0.18
2001-2004    0 to 14             0.64                  0.62
             14 to 43            0.18                  0.22
             ≥ 43                0.18                  0.16

NOTES: 1) All distributions conditional on long-term average conditions.
       2) A very low relative weight (effective sample size = 1) was applied to this prior distribution. See text for additional details.
butions at the end of each selected time period (i.e. 1994-1997, 1997-2000, and
2001-2004) with less between-time-period variance than the marginal probability
distributions. For example, the marginal probability that fecal coliform MPN is below
14 is 0.56 for the 2001-2004 period compared to 0.66 for the 1997-2000 period (see
table 3.1). The Bayesian posterior probability that fecal coliform MPN is below 14
following the 2001-2004 time period is 0.62, compared to 0.64 for the 1997-2000 time
period (see table 3.2). These results imply that a Bayesian analysis is less influ-
enced by potential anomalies in the sampling data from a particular time period,
and perhaps provides a better overall representation of conditions within the water
body.
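The between-period comparison can be checked directly from the tabulated values. The sketch below takes the probabilities that MPN is below 14 from tables 3.1 (long-term adjusted column) and 3.2 (posterior column); using the population variance of three values as the measure of spread is my own choice for illustration.

```python
import statistics

# p(MPN < 14) for 1994-1997, 1997-2000, and 2001-2004.
marginal_low = [0.60, 0.66, 0.56]    # table 3.1, adjusted for long-term conditions
posterior_low = [0.61, 0.64, 0.62]   # table 3.2, Bayesian posterior

marginal_var = statistics.pvariance(marginal_low)
posterior_var = statistics.pvariance(posterior_low)

marginal_range = max(marginal_low) - min(marginal_low)
posterior_range = max(posterior_low) - min(posterior_low)
```

With only three periods this is a descriptive comparison rather than a formal test, but both the range and the variance are smaller for the posterior values.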
In addition, Bayesian analysis using the Netica software allows prior and
likelihood information to be weighted in order to reflect possible knowledge that either
historical or current data may serve as a more accurate indication of conditions as-
sessed for regulatory compliance. As an example, the prior probability distributions
presented for the 1994-1997 time period in table 3.2 are intended to reflect complete
ignorance of water quality conditions. A typical Bayesian analysis would reflect this
ignorance through an improper uniform prior distribution, applying equal probability
to all possible values of the fecal coliform MPN. In Netica, a uniform probability
is applied using equal probabilities for all categories of the selected variable. As a
result, the prior distribution in table 3.2 for the 1994-1997 time period contains a
probability of 0.33 for each variable state. In order to minimize the effect of applying
disproportionate prior probabilities to each of the possible values of the fecal coliform
MPN, I apply a relative weight of 1 (i.e. relative sample size = 1) to the prior dis-
tribution allowing the likelihood (with a sample size of roughly 90) to dominate the
posterior distribution.
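One way to picture the weighting described above is a pseudo-count update, in which the prior contributes as many "virtual" observations as its weight. The sketch below is an assumption about the arithmetic, not a description of Netica's internals, and the observed counts are hypothetical values chosen only so that the sample size is roughly 90.

```python
def weighted_update(prior, prior_weight, counts):
    """Posterior state probabilities from a weighted prior and observed counts.

    The prior is treated as `prior_weight` pseudo-observations spread across
    the states in proportion to the prior probabilities.
    """
    n = sum(counts.values())
    return {
        state: (prior_weight * prior[state] + counts[state]) / (prior_weight + n)
        for state in prior
    }


# Uniform prior over the three MPN states, with an effective sample size of 1.
prior = {"0 to 14": 1 / 3, "14 to 43": 1 / 3, ">= 43": 1 / 3}

# Hypothetical counts (total of 90 samples), for illustration only.
counts = {"0 to 14": 55, "14 to 43": 18, ">= 43": 17}

posterior = weighted_update(prior, prior_weight=1, counts=counts)
```

Because the prior weight is 1 against roughly 90 observations, the posterior stays very close to the observed proportions; a larger prior weight would pull the posterior toward the prior, which is how historical data could be given more influence.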
3.5 Conclusions
I have presented a case study applying conditional probability networks and Bayesian
updating to evaluate short and long-term water quality conditions within the Newport
River Estuary in North Carolina. This case study is intended to support the ongoing
evaluation of fecal contamination in the Newport River, and to serve as a precedent
for other water quality assessments conducted through the USEPA TMDL program.
A noted advantage to evaluating fecal contamination with a Bayesian network
model is the ability to easily adjust conditional probability distributions based on
changing knowledge of existing environmental conditions, and integration of new ev-
idence from ongoing and future water quality monitoring programs. The proposed
submodel serves as a template for a more rigorous analysis using the full comprehen-
sive Bayesian network model presented in figure 3.3. This research also suggests that
the current sampling scheme represents well the marginal probability distributions of
dominant environmental factors (e.g. precipitation and tide).
Chapter 4
An Assessment of Fecal Indicator Bacteria-Based Water Quality Standards
and Water Quality Model Endpoints
The content of this Chapter is published in Gronewold et al. (2008) and is available
at doi: 10.1021/es703144k. By permission of the American Chemical Society, the
abstract, figures, and tables are included below.
Abstract
Fecal indicator bacteria (FIB) are commonly used to assess the threat of pathogen
contamination in coastal and inland waters. Unlike most measures of pollutant
levels, however, FIB concentration metrics, such as most probable number (MPN) and
colony-forming units (CFU), are not direct measures of the true in situ concentration
distribution. Therefore, there is the potential for inconsistencies among model and
sample-based water quality assessments, such as those used in the Total Maximum
Daily Load (TMDL) program. To address this problem, we present an innovative
approach to assessing pathogen contamination based on water quality standards that
impose limits on parameters of the actual underlying FIB concentration distribution,
rather than on MPN or CFU values. Such concentration-based standards link more
explicitly to human health considerations, are independent of the analytical proce-
dures employed, and are consistent with the outcomes of most predictive water quality
models. We demonstrate how compliance with concentration-based standards can be
inferred from traditional MPN values using a Bayesian inference procedure. This
methodology, applicable to a wide range of FIB-based water quality assessments, is
illustrated here using fecal coliform data from shellfish harvesting waters in the New-
port River Estuary, North Carolina. Results indicate that areas determined to be
compliant according to the current methods-based standards may actually have an
unacceptably high probability of being in violation of concentration-based standards.
Table 4.1: NSSP shellfish harvesting area fecal coliform water quality standards based on a minimum of 30 randomly collected samples.

                                                      Standard
Basis for standard                                    q50     µgeo    q90
n MPN observations from 5-tube MTF procedure          14      14      43
n CFU observations from MF procedure                  14      14      31
µc   0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45  0.50  0.55
σc   2.03  1.99  1.96  1.92  1.91  1.86  1.83  1.82  1.78  1.72  1.72  1.70

µc   0.60  0.65  0.70  0.75  0.80  0.85  0.90  0.95  1.00  1.05  1.10  1.15
σc   1.67  1.64  1.60  1.58  1.53  1.51  1.47  1.41  1.41  1.36  1.33  1.31

µc   1.20  1.25  1.30  1.35  1.40  1.45  1.50  1.55  1.60  1.65  1.70  1.75
σc   1.28  1.23  1.21  1.17  1.12  1.09  1.05  1.03  0.98  0.94  0.90  0.88

µc   1.80  1.85  1.90  1.95  2.00  2.05  2.10  2.15  2.20  2.25  2.30  2.32
σc   0.83  0.80  0.74  0.72  0.68  0.62  0.57  0.52  0.46  0.38  0.25  0.10

Table 4.2: Values of µc and σc constituting MPN contour line (for simulated violation frequency = 0.005).
µc   0.00  0.05  0.10  0.15  0.20  0.25  0.30  0.35  0.40  0.45
σc   1.93  1.91  1.85  1.83  1.81  1.79  1.75  1.72  1.70  1.67

µc   0.50  0.55  0.60  0.65  0.70  0.75  0.80  0.90  0.95  1.00
σc   1.63  1.60  1.57  1.54  1.52  1.47  1.44  1.38  1.33  1.31

µc   1.05  1.10  1.15  1.20  1.25  1.30  1.40  1.45  1.50  1.55
σc   1.28  1.24  1.21  1.20  1.14  1.10  1.05  1.02  0.98  0.94

µc   1.60  1.65  1.70  1.75  1.80  1.85  1.90  1.95  2.00  2.05
σc   0.92  0.90  0.85  0.82  0.80  0.76  0.72  0.71  0.66  0.62

µc   2.10  2.15  2.20  2.25  2.30  2.35  2.40  2.45  2.50  2.51
σc   0.61  0.57  0.53  0.51  0.48  0.42  0.35  0.26  0.12  0.05

Table 4.3: Values of µc and σc constituting CFU contour line (for simulated violation frequency = 0.005).
Prior              α      β        E(σk)    V(σk)
σk ∼ Un(α, β)      0      100      50       833.33
φk ∼ Ga(α, β)      1.5    0.375    0.69     0.27
φk ∼ Ga(α, β)      1.0    2.0      2.5      ∞

Table 4.4: Alternative priors for true concentration ck standard deviation σk at station k.
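The tabulated moments can be reproduced under one reading of the notation. The sketch below assumes (this is not stated in the table) that φk is a precision with σk = 1/√φk and that Ga(α, β) uses a shape and rate parameterization; under those assumptions E(σk) = √β Γ(α − 1/2)/Γ(α) and E(σk²) = β/(α − 1), the latter being infinite when α ≤ 1.

```python
import math


def uniform_moments(a, b):
    """Mean and variance of a uniform distribution on (a, b)."""
    return (a + b) / 2, (b - a) ** 2 / 12


def sigma_moments_from_gamma_precision(shape, rate):
    """Mean and variance of sigma = 1/sqrt(phi), with phi ~ Gamma(shape, rate).

    Assumes shape > 0.5 so that the mean exists; the second moment of sigma
    is infinite unless shape > 1.
    """
    mean = math.sqrt(rate) * math.gamma(shape - 0.5) / math.gamma(shape)
    second_moment = rate / (shape - 1) if shape > 1 else math.inf
    return mean, second_moment - mean ** 2


uniform_mean, uniform_var = uniform_moments(0, 100)
gamma1_mean, gamma1_var = sigma_moments_from_gamma_precision(1.5, 0.375)
gamma2_mean, gamma2_var = sigma_moments_from_gamma_precision(1.0, 2.0)
```

That these assumptions recover the values in all three rows suggests, but does not prove, that this is the parameterization intended.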
standard    σc range    γ       β0      β1
MPN         >0.65       1.39    2.65    -1.04
MPN         ≤0.65       2.61    2.44    -1.05
CFU         >0.65       1.03    1.98    -0.66
CFU         ≤0.65       2.61    1.65    -0.66

Table 4.5: Regression model parameters including transformation parameter (γ), intercept (β0), and slope (β1).
[Figure 4.1 image omitted: grid of panels plotting prior and posterior density against σ, one row per prior in table 4.4 and one column per station.]
Figure 4.1: Prior and posterior distributions for σk for five randomly selected stations in the Newport River using the three priors in table 4.4. Each row utilizes the same prior distribution, and each column represents a separate station. Vertical gray lines are added to facilitate comparison between alternative priors for each station.
[Figure 4.2 image omitted: four panels (a to d) plotting σc against µc, with lines distinguished by analytical procedure (MPN, CFU).]
Figure 4.2: Combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which yielded MPN (solid lines) or CFU (dashed lines) samples in violation of the NSSP median standard (panel a), geometric mean standard (panel b), 90th percentile standard (panel c), or any standard (panel d) with a frequency of either 0.005 or 0.1. The zone of violations is in the upper right of each panel.
[Figure 4.3 image omitted: two panels (a and b) plotting σc against µc, with separate lines for MPN analysis and CFU analysis at α = 2, 10, and ∞.]
Figure 4.3: Relationship between the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution and simulated violation of any CFU-based water quality standard (dashed lines) and any MPN-based water quality standard (solid lines) for possible values of the negative binomial dispersion parameter α. Panels a and b indicate µc−σc pairs expected to violate standards with a frequency of 0.1 and 0.005, respectively.
[Figure 4.4 image omitted: four panels (a to d) plotting log-likelihood against γ.]
Figure 4.4: Log-likelihood (solid line) of transformation parameter γ for σc using paired values of µc and σc. Panel a based on values from table 4.2 for σc > 0.65, panel b based on values from table 4.2 for σc ≤ 0.65, panel c based on values from table 4.3 for σc > 0.65, and panel d based on values from table 4.3 for σc ≤ 0.65.
[Figure 4.5 image omitted: σc plotted against µc, showing violation frequency contour lines (MPN, CFU), model fits (MPN, CFU), and a reference line at σc = 0.65.]
Figure 4.5: Violation contour lines overlaid by violation line best-fit regression model fitted values based on model parameters in table 4.5.
        CC (%)         Posterior probability of       Violated any MPN standard
Stn.    MPN     CFU    size-30 sample violating       during the 2000–2005
                       any MPN standard               assessment period?
3       52      39     5                              no
4       44      33     6                              no
4A      <1      <1     53                             yes
4B      <1      <1     85                             yes
5A      <1      <1     80                             yes
7       <1      <1     55                             yes
8       14      9      18                             no
8A      15      12     15                             no
9       93      89     <1                             no
10      100     100    <1                             no
11      53      41     5                              no
14A     51      40     6                              no
16A     32      20     12                             no
18      62      50     3                              no
24      80      71     1                              no
25      3       2      32                             no
27A     <1      <1     49                             yes
28      96      93     <1                             no
29      <1      <1     58                             yes
35      80      73     1                              no
41      <1      <1     89                             yes
41A     <1      <1     72                             yes
55      60      49     4                              no
56      78      67     1                              no
83      47      35     5                              no
84      13      6      17                             no
85      94      91     <1                             no
86      87      81     1                              no

Table 4.6: Estimated confidence of compliance (CC), posterior probability of violating any MPN standard, and observed violations for monitoring stations in the Newport River Estuary during the 2000-2005 assessment period.
[Figure 4.6 image omitted: four panels plotting σc against µc, showing probability density contour lines with the MPN and CFU 0.5% violation standards. Panels: Station 25 (CC = 2−3%), Station 27A (CC < 1%), Station 3 (CC = 39−52%), Station 35 (CC = 73−80%).]
Figure 4.6: Joint posterior probability density contour lines (solid lines) for four monitoring stations in the Newport River Estuary. Dashed lines indicate combinations of the mean µc and standard deviation σc of the log-transformed fecal coliform concentration distribution which violate concentration-based standards no more than 0.5% of the time using MPN or CFU standards as the reference. Confidences of compliance (CC) are given in the lower left of each panel for both MPN and CFU-based standards.
Chapter 5
Modeling the Relationship Between Most
Probable Number (MPN) and Colony
Forming Unit (CFU) Estimates of Fecal
Indicator Bacteria Concentrations
Reproduced in part with permission from Gronewold and Wolpert (2008). Copyright
2008 Elsevier. Available at doi:10.1016/j.watres.2008.04.011
Most probable number (MPN) and colony-forming-unit (CFU) estimates of fe-
cal coliform bacteria concentration are common measures of water quality in coastal
shellfish harvesting and recreational waters. Estimating procedures for MPN and
CFU have intrinsic variability and are subject to additional uncertainty arising from
minor variations in experimental protocol. It has been observed empirically that the
standard multiple-tube fermentation (MTF) decimal dilution analysis MPN proce-
dure is more variable than the membrane filtration CFU procedure, and that MTF-
derived MPN estimates are somewhat higher on average than CFU estimates, on
split samples from the same water bodies. I construct a probabilistic model that
provides a clear theoretical explanation for the variability in, and discrepancy be-
tween, MPN and CFU measurements. I then compare my model to water quality
samples analyzed using both MPN and CFU procedures, and find that the (often
large) observed differences between MPN and CFU values for the same water body
are well within the ranges predicted by my probabilistic model. Results indicate that
MPN and CFU intra-sample variability does not stem from human error or labora-
tory procedure variability, but is instead a simple consequence of the probabilistic
basis for calculating the MPN. These results demonstrate how probabilistic models
can be used to compare samples from different analytical procedures, and to deter-
mine whether transitions from one procedure to another are likely to cause a change
in quality-based management decisions.
5.1 Introduction
Coastal water resource management agencies frequently revise standard water qual-
ity analysis procedures based on the latest available technologies. For example, the
North Carolina Department of Environment and Natural Resources Shellfish Sanitation
and Recreational Water Quality Section (NCDENR-SSS), and similar water
resource management agencies, are considering replacing multiple-tube fermentation
(MTF) fecal coliform analysis procedures with membrane filtration (MF) procedures
because MF results, while variable, are much less so than MTF results (as commonly
implemented) from the same water quality sample. NCDENR-SSS and other agen-
cies are concerned, however, that water quality-based management decisions for a
particular water body (such as approval or prohibition of shellfishing) may change
after MF procedures are implemented.
Here, I derive a theoretical model for the probability distribution of MTF and MF
test results from the same water quality sample. This innovative approach allows a
side-by-side comparison of alternative testing methods, accommodating their intrinsic
differences (rather than assuming that these differences have no effect). Further, I
find the probability distributions for the true fecal coliform concentrations associated
with different possible measurement results from each procedure.
Differences, if observed, between the MTF-MF relationship predicted by my model
and the MTF-MF relationship observed empirically in samples from a particular
laboratory, would suggest significant extrinsic sources of uncertainty and variability
(i.e. unrelated to natural spatial distribution of organisms in a sample aliquot volume)
and, more importantly, an increased chance that changing standard fecal coliform
analysis from MTF to MF might lead to a change in water quality-based management
decisions.
Variability in MTF and MF analysis results can be divided into two categories:
intrinsic stochastic variability due to the natural dispersion of bacteria within sample
containers, and extrinsic variability. Intrinsic sources of variability are mostly a
consequence of procedure design, and are explained later in this section. Extrinsic
sources of variability include departures from expected sampling protocol, microbial
cell damage (during filtration, for example) which may reduce the number of viable
organisms (Kloot et al., 2006), and clumping of bacteria cells (Noble et al., 2003b).
Other potential extrinsic sources of variability relate to environmental conditions
at the time of sampling, including antecedent rainfall, turbidity, and season (Cabelli
et al., 1983; Noble et al., 2003a). These extrinsic sources of variability are not included
in my model and, if they actually contribute to MTF-MF intra-sample variability,
will limit my model’s ability to explain the difference between MTF and MF results.
Fecal and total coliform bacteria are indicators of potential fecal pollution and
water-borne pathogenic threats to human health (Cabelli, 1983; LeClerc et al., 2001).
Other bacterial measures of water quality include Escherichia coli (a subset of fecal
coliforms), and enterococci (Noble et al., 2003a). Extensive definitions of fecal and
total coliform bacteria are presented elsewhere (Rompre et al., 2002; Kloot et al.,
2006). My model is applied to monitoring data from shellfish harvesting areas in
which fecal coliform is a more common measure of water quality. As a result, I
discuss only fecal coliform bacteria concentrations for the rest of this paper; however,
probabilistic models of intra-sample variability can be applied to a
wide range of microbial, physical, and chemical pollutants (see, e.g. Kinzelman et al.,
2003; U.S. Geological Survey, 1996; Horowitz, 1986).
MTF and MF are two common procedures for estimating fecal coliform concen-
trations in coastal resource waters (Eckner, 1998; Buckalew et al., 2006). MTF and
MF fecal coliform analysis results are reported as most probable number (MPN) and
colony-forming unit (CFU) estimates of the true fecal coliform concentration c (typ-
ically in organisms per 100 ml). Detailed descriptions of the MF microbial analysis
procedure are presented in Rose et al. (1975), Rippey et al. (1987), Dufour et al.
(1981), Eckner (1998), and Esham and Sizemore (1998). Similar descriptions of the
MTF procedure are presented in Cochran (1950); Hurley and Roscoe (1983); Beliaeff
and Mary (1993); McBride et al. (2003).
MPN estimates derived from a standard (e.g. 5-tube × 3 dilution series) MTF
analysis are, by definition, the values of the concentration at which the
likelihood function (see Calculations section, equation 5.2) attains its maximum. The
likelihood function indicates how strongly an observed pattern of positive tube
counts from an MTF analysis supports each possible value c of the concentration
(McBride, 2005, pp. 12–13). The MPN estimates are highly variable because this
function has a very broad peak, and so is close to its maximum value over a wide
range of possible concentrations.
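
The breadth of the likelihood peak can be illustrated numerically. The following is a minimal sketch, assuming the standard 5-tube decimal dilution design (10, 1, and 0.1 ml aliquots) and a hypothetical tube pattern of (3, 1, 0). It evaluates the log-likelihood on a grid and reports the range of concentrations over which the likelihood stays within half of its maximum. The function names, the grid, and the half-maximum threshold are illustrative choices and are not taken from the analysis described here.

```python
import math

# Assumed design: 5 tubes at each of 10, 1, and 0.1 ml (standard decimal dilution).
VOLUMES_ML = (10.0, 1.0, 0.1)
TUBES = (5, 5, 5)


def log_likelihood(c, positives, volumes=VOLUMES_ML, tubes=TUBES):
    """Log of the MTF likelihood (binomial coefficients omitted) for c > 0."""
    total = 0.0
    for x, v, n in zip(positives, volumes, tubes):
        a = c * v / 100.0                            # Poisson mean organisms per tube
        if x > 0:
            total += x * math.log(-math.expm1(-a))   # positive tubes: log(1 - e^-a)
        total -= (n - x) * a                         # negative tubes: log(e^-a)
    return total


def half_max_range(positives, c_max=200.0, steps=20000):
    """Grid search returning (c at the peak, lowest c, highest c with L >= L_max / 2)."""
    grid = [c_max * (k + 1) / steps for k in range(steps)]
    values = [log_likelihood(c, positives) for c in grid]
    peak = max(values)
    inside = [c for c, val in zip(grid, values) if val >= peak + math.log(0.5)]
    return grid[values.index(peak)], inside[0], inside[-1]


# Hypothetical tube pattern: 3, 1, and 0 positive tubes in the three series.
c_hat, low, high = half_max_range((3, 1, 0))
print(f"peak near {c_hat:.1f}; likelihood at least half its maximum "
      f"for c between {low:.1f} and {high:.1f} organisms per 100 ml")
```

For this pattern the half-maximum range spans more than a factor of two in concentration, which is the sense in which the peak is broad.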
Additional discussion of the statistical assumptions inherent in MTF-based MPN
calculations can be found in Eisenhart and Wilson (1943); Beliaeff and Mary (1993);
Klee (1993). CFU estimates are based on the number of distinguishable bacterial
colonies which form on a culture plate after filtration and incubation. CFU variability
is inversely proportional to the volume of sample water filtered, and therefore while
CFU estimates are variable, the variability is often small compared to that of MTF-
derived MPN estimates when large aliquot volumes are used. The broad likelihood
function of MTF positive tube count observations and variability in the number of
distinguishable bacterial growth colonies are both examples of intrinsic variability in
MPN and CFU estimates, and are therefore addressed explicitly in my model.
Several recent studies document empirical relationships between fecal bacteria
analysis results from different testing procedures (e.g. Eckner, 1998; Noble et al.,
2003b; Kloot et al., 2006). The study by Noble et al. (2003b), for example, which
compares beach water quality analysis results using MF, MTF, and the IDEXX
Quanti-Tray®/2000 chromogenic substrate test (CST) kit, indicates that measurement
error inherent to analytical procedures is likely to exceed differences between
analytical procedures assuming standard laboratory procedures are followed; Buckalew
et al. (2006) also find the intrinsic variability of these methods to exceed their differ-
ences.
Furthermore, Noble et al. (2003b) acknowledge that different test procedures are
likely to yield different fecal coliform concentration estimates because they measure
different metabolic process endpoints. Similar historic studies include a comparison
between MF-derived estimates of enterococci and E. coli by Levin et al. (1975) and
Dufour et al. (1981), a comparison between total coliform, fecal coliform, and fecal
streptococci concentration estimates using MTF procedures by Sayler et al. (1975),
and comparison between both MTF and MF estimates of E. coli, Klebsiella, and
Enterobacter species by Dufour and Cabelli (1975). I know of no study, however,
which attempts to explain the difference between standard MF and MTF procedures
by modeling only intrinsic variability in MPN and CFU estimates.
The remaining sections of this paper include a description of fecal coliform water
quality sampling and analysis procedures, followed by my approach to deriving a
probabilistic model of the relationship between observed MPN and CFU estimates.
I then present results of the analysis, including a comparison of my proposed theo-
retical probability distributions to observations from a recent NCDENR-SSS water
quality study which included analysis for fecal coliform concentration using both
MTF and MF procedures. I fit an ordinary least-squares (OLS) regression model to
the NCDENR-SSS data and compare regression model fitted values and prediction
intervals to my theoretical probability model. I conclude with a discussion of how my
findings might be used to guide water resource area management agencies through
transitions from one standard water quality analysis procedure to another.
5.2 Methods
5.2.1 Water Quality Monitoring
One hundred forty-four surface water quality samples were collected by NCDENR-SSS
personnel at monitoring stations throughout the Newport River Estuary in Eastern
North Carolina between May 2006 and January 2007 (NCDENR, 2007, unpublished
data). As a designated shellfish harvesting area, the Newport River Estuary
is governed by the National Shellfish Sanitation Program (NSSP) whose guidelines
(Food and Drug Administration and Interstate Shellfish Sanitation Conference, 2005)
require that its water quality standards be based on either MPN or CFU estimates of
fecal coliform bacteria concentration. Water quality samples were therefore analyzed
by NCDENR-SSS for fecal coliform concentration using both 5-tube decimal dilution
MTF and MF analysis tests in accordance with both NSSP guidelines and industry
standards (APHA, 2005).
5.2.2 Theoretical Probability Model
I derive a probabilistic model, addressing only intrinsic sources of variability, of the
relationship between fecal coliform MTF and MF measurements from the same water
quality sample. This model is theoretical because it assumes extrinsic sources of
variability are insignificant. I begin by calculating the probability distribution of the
MPN and CFU for any true fecal coliform concentration c (measured in organisms per
100 ml). I then implement a Bayesian analysis to derive the conditional distribution
of the true fecal coliform concentration c for any recorded MPN or CFU estimate.
Finally, I apply conditional probability distribution theory to yield the probability
function of the MPN for any observed CFU estimate from the same sample. Details
of the calculation procedures are included in the last section of this Chapter.
5.2.3 OLS Regression Empirical Model
In addition to deriving a theoretical probability model, I fit a simple empirical log-
scale OLS regression model to the NCDENR-SSS data (see Weisberg, 2005, pp. 21–
30 for details on OLS regression). When all tubes in an MTF test are negative,
the maximum likelihood estimate (and hence the MPN) of the true concentration
c is zero (see Calculations section, equation 5.1). Because the logarithm of zero is
not finite, my regression model excludes 7 NCDENR-SSS data points with an MPN
of 0 organisms per 100 ml, and (for similar reasons) two data points with a CFU
of 0 organisms per 100 ml. The regression model also excludes the 19 NCDENR-
SSS observations whose MF test results were recorded as “too numerous to count
(TNTC)”.
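
The exclusion rules and the log-scale fit can be sketched as follows. This is a minimal illustration, assuming base-10 logarithms and a simple linear model of log MPN on log CFU; the data values are hypothetical and are not NCDENR-SSS observations, and the function name `fit_log_ols` is introduced only for this example.

```python
import math


def fit_log_ols(cfu, mpn):
    """Fit log10(MPN) = intercept + slope * log10(CFU) by ordinary least squares.

    Pairs with a zero value (no finite logarithm) or a missing value
    (None, used here to mark a TNTC plate) are dropped before fitting.
    """
    pairs = [(math.log10(x), math.log10(y)) for x, y in zip(cfu, mpn)
             if x is not None and y is not None and x > 0 and y > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope, n


# Hypothetical split-sample results in organisms per 100 ml.
cfu_obs = [0, 2, 6, 6, 14, 40, None, 120]
mpn_obs = [2, 0, 4.5, 13, 23, 49, 540, 130]
intercept, slope, n_used = fit_log_ols(cfu_obs, mpn_obs)
print(f"used {n_used} of {len(cfu_obs)} pairs; slope = {slope:.2f}")
```

Dropping zero and TNTC results keeps the logarithms finite, at the cost of discarding information from the lowest and highest concentrations.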
5.3 Results and Discussion
In figure 5.1 I present expected values of the MPN (in panel A) and CFU (in panel B)
for every 5th integer-valued true fecal coliform concentration c in the range 0 ≤ c ≤
250, including 95% prediction sets. The 95% prediction set is the finite collection of
highest-probability values from a (perhaps multi-modal, as in the case of the MPN)
discrete probability distribution whose cumulative probability is at least 0.95. While
these sets are well-represented as intervals for the CFU in panel B, it is clear (see
panel A) that the likely MPN values vary widely and the 95% prediction sets are not
well-represented by continuous intervals. The results in figure 5.1 illustrate that the
wide variability of MPN results, a feature which might be misattributed to extrinsic
variability, is really a simple consequence of the probability distribution for the MPN.
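
The construction of a 95% prediction set from a discrete distribution can be sketched in a few lines. The example below is illustrative only: the probability mass function is hypothetical, chosen to be multi-modal so that the resulting set is not a run of consecutive values, and `prediction_set` is a name introduced for this sketch.

```python
def prediction_set(pmf, level=0.95):
    """Collect the highest-probability values until their total probability >= level."""
    chosen, total = [], 0.0
    ranked = sorted(pmf.items(), key=lambda item: item[1], reverse=True)
    for value, prob in ranked:
        if total >= level:
            break
        chosen.append(value)
        total += prob
    return sorted(chosen), total


# Hypothetical multi-modal distribution over a few observable values.
example_pmf = {2: 0.45, 4: 0.02, 8: 0.30, 9: 0.02, 13: 0.21}
values, coverage = prediction_set(example_pmf)
print(values, round(coverage, 2))
```

Because values are admitted in order of probability rather than in numerical order, low-probability values lying between two modes are left out of the set.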
In figure 5.2 I present expected values of the true fecal coliform concentration,
along with 95% credible intervals, for observable MPN estimates (in Panel A) and
for every 5th observable CFU estimate (in Panel B). A Bayesian 95% credible interval
contains the true fecal coliform concentration with posterior probability of at least 0.95; see
Casella and Berger (2002, pp. 436–437) or McBride (2005, pp. 208–209), where credi-
ble intervals are described in detail and contrasted with confidence intervals. Details
of my Bayesian analysis are presented in the Calculations section. The “observable
MPN estimates” are those which can possibly arise from the (NSSP standard) 5-tube
fermentation serial dilution analysis (the most likely ones are presented, for example,
in tables in Woodward, 1957); for a sample aliquot volume of 100 ml (per NCDENR-
SSS operating protocol), the observable CFU estimates are all nonnegative integers.
Lengths of credible intervals depend on the numbers of tubes used, for MPN, and on
aliquot volume, for CFU (see Calculations, equation 5.5); thus, although the credible
intervals for the CFU method are narrower than those for the MPN method
for any fixed sample volume (as suggested by the relative interval lengths in panels
A and B of figure 5.2), intervals could be made narrower for either method by using
more tubes (for MPN) or a greater volume (for CFU).
In figure 5.3 I present OLS regression model fitted values and theoretical probabil-
ity model expected values of the MPN for CFU estimates observed in the NCDENR-
SSS study. In addition, I present MPN 95% prediction intervals and prediction sets
for the regression model and probabilistic model, respectively. Observations from the
NCDENR-SSS study are also plotted in figure 5.3.
Prediction intervals in panel A of figure 5.3 are based on standard assumptions
regarding the distribution of OLS linear regression model fitted value residuals (see
Weisberg, 2005), and are presented to contrast with the true discrete multi-modal
distribution of the MPN presented in both panel B of figure 5.3, and in detail in figure
5.4. Figure 5.4 includes the full theoretical probability distribution of the MPN for
an observed CFU value of 6 organisms per 100 ml along with a histogram of MPN
estimates from 13 of the NCDENR-SSS water quality samples with a CFU estimate
of 6 organisms per 100 ml. Figures 5.3 and 5.4 demonstrate not only that the most
likely MPN estimates for a given water quality sample are a discrete subset of non-
consecutive observable MPN estimates, but also that the NCDENR-SSS observations
are entirely consistent with my theoretical probability model. Furthermore, my the-
oretical probability model explains why the MPN is a positively-biased estimate of
fecal coliform concentration (Garthright, 1993, 1997).
Despite differences between regression model fitted values (panel A of figure 5.3)
and expected values from my theoretical probability model (panel B of figure 5.3),
I expect empirical regression model fitted values to approach expected values of the
MPN for a specific CFU as sample size increases. Differences, if any, between large-
sample empirical regression model fitted values and my theoretical model expected
values might suggest significant non-probabilistic (i.e. extrinsic) sources of variabil-
ity. Exploring comparisons between my proposed probabilistic model and regression
models fit to very large data sets is an area for future research.
5.4 Conclusions
I derived a theoretical model of the MPN probability distribution for any observed
CFU estimate from the same water quality sample. Recent water quality samples
collected and analyzed by NCDENR-SSS for fecal coliform concentration using both
MTF and MF analysis tests yielded MPN and CFU estimates entirely consistent
with my theoretical probabilistic model. My results indicate that MPN and CFU
intra-sample variability does not stem from human error or laboratory procedure
variability, but is instead a simple consequence of the probabilistic basis for calculat-
ing the MPN.
I anticipate this study will serve as a stepping stone towards future research on
whether different fecal coliform analysis procedures might lead to different water
quality standard violation frequencies for the same water body. Method-dependent
differences, if any, might propagate into coastal resource water management decisions
through two undesirable pathways. First, analysis of water quality samples from a
coastal resource water might, depending on the analysis procedure used, result in
different management actions (such as closing or opening a shellfish harvesting area).
Second, if fecal coliform concentration estimates vary depending on whether MTF or
MF procedures are used, potential benefits of merging historic MPN and new CFU
data sets would be limited (Noble et al., 2003b). Future research on the probabilistic
basis for current water quality standard violations, coupled with the modeling tools
presented in this paper, could provide answers to these research questions.
Other suggested studies stemming from this research include, but are not limited
to, quantifying membrane filtration-related fecal coliform thinning and contamination
rates, exploring environmental effects on fecal coliform concentration estimate bias,
and determining how measuring different coliform bacteria metabolic outputs affects
fecal coliform concentration estimates.
5.5 Calculations
If fecal coliform organisms at concentration c (in organisms per 100 ml) are
well mixed in a water sample, aliquots of volume vi ml from the sample are commonly
assumed to contain a Poisson Po(cvi/100) distributed number of fecal coliform
organisms (McCrady, 1915; Greenwood and Yule, 1917; de Man, 1977; Russek
and Colwell, 1983; Best and Rayner, 1985; Woomer et al., 1990; Briones and
Reichardt, 1999). A tube is positive when its aliquot contains at least one organism,
which happens with probability pi = 1 − exp(−cvi/100). Out of ni serial dilution
analysis tubes in dilution series i, the numbers of positive tubes
xi are therefore independent binomial Bi(ni, pi) random variables
(for more on using Poisson and binomial distributions in environmental data analysis,
see Ott, 1995, pp. 93–113 and 127–137). The MPN for m dilution series can therefore
be expressed as:
$$ \mathrm{MPN} = \arg\max_{c} \left[ \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i} \left(e^{-c v_i/100}\right)^{n_i - x_i} \right] \tag{5.1} $$
and the conditional probability distribution of positive tube counts X = {xi}, given
true fecal coliform concentration c, is:
$$ f(x \mid c) = \prod_{i=1}^{m} \binom{n_i}{x_i} \left[1 - e^{-c v_i/100}\right]^{x_i} \left[e^{-c v_i/100}\right]^{n_i - x_i} \tag{5.2} $$
The CFU observation Y is Poisson distributed, Y ∼ Po(λ), with mean λ = cV/100 for
sample aliquot volume V ml. Its conditional probability distribution, given true fecal
coliform concentration c, is
$$ f(y \mid c) = \frac{1}{y!} \left(cV/100\right)^{y} e^{-cV/100} \quad \text{for } y \in \{0, 1, 2, \ldots\} \tag{5.3} $$
The posterior probability distribution of the true fecal coliform concentration c
for an observed tube count combination x, using Jeffreys’ scale-invariant “reference”
prior distribution π(c) ∝ 1/√c (Jeffreys, 1946; Bernardo and Ramon, 1998), is given
by:
$$ f(c \mid x) \propto c^{-1/2} \, e^{-(c/100) \sum_{i=1}^{m} v_i (n_i - x_i)} \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i}, \quad c > 0 \tag{5.4} $$
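
Equation 5.4 has no simple closed form, so posterior summaries must be obtained numerically. The sketch below is one possible approach and is not necessarily the procedure used to produce the results in this chapter. It assumes the standard 5-tube design with 10, 1, and 0.1 ml aliquots, uses midpoint integration after the substitution c = t², which removes the singularity of the prior at c = 0, and reports an equal-tailed interval (an assumption; other interval constructions are possible). The function names and the integration limits are illustrative.

```python
import math

VOLUMES_ML = (10.0, 1.0, 0.1)   # assumed aliquot volumes for the three series
TUBES = (5, 5, 5)               # assumed number of tubes per series


def log_unnormalized_posterior(c, positives):
    """Logarithm of the right-hand side of equation 5.4 for c > 0."""
    value = -0.5 * math.log(c)                       # Jeffreys prior, 1 / sqrt(c)
    for x, v, n in zip(positives, VOLUMES_ML, TUBES):
        a = c * v / 100.0
        if x > 0:
            value += x * math.log(-math.expm1(-a))   # positive tubes
        value -= (n - x) * a                         # negative tubes
    return value


def posterior_summary(positives, steps=20000):
    """Posterior mean and equal-tailed 95% interval for c given tube counts."""
    decay = sum(v * (n - x) for x, v, n in zip(positives, VOLUMES_ML, TUBES)) / 100.0
    if decay == 0.0:
        raise ValueError("all tubes positive: the posterior cannot be normalized")
    h = math.sqrt(60.0 / decay) / steps              # upper limit where e^-60 remains
    cs, weights = [], []
    for k in range(steps):
        t = (k + 0.5) * h
        c = t * t                                    # substitution c = t**2, dc = 2t dt
        cs.append(c)
        weights.append(math.exp(log_unnormalized_posterior(c, positives)) * 2.0 * t * h)
    total = sum(weights)
    mean = sum(c * w for c, w in zip(cs, weights)) / total
    lower = upper = None
    running = 0.0
    for c, w in zip(cs, weights):
        running += w / total
        if lower is None and running >= 0.025:
            lower = c
        if running >= 0.975:
            upper = c
            break
    return mean, lower, upper


mean_c, lower_c, upper_c = posterior_summary((3, 1, 0))   # hypothetical tube pattern
print(f"E(c | x) = {mean_c:.1f}, 95% interval ({lower_c:.1f}, {upper_c:.1f})")
```

When every tube is negative, equation 5.4 reduces to a Gamma distribution with shape 1/2, which provides a convenient check on the numerical integration.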
Using the same Jeffreys’ prior distribution, the posterior distribution of c for a
given CFU observation y is:
$$ f(c \mid y) \propto c^{y-1/2} e^{-cV/100}, \quad c > 0 \tag{5.5} $$
which is a Gamma Ga(α, β) distribution with shape parameter α = y + 1/2 and rate
parameter β = V/100.
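
Because the posterior in equation 5.5 is a Gamma distribution, its mean is (y + 1/2)/(V/100) and its quantiles follow from the Gamma cumulative distribution function. The sketch below uses only the Python standard library: the distribution function is evaluated with the standard power series for the regularized incomplete gamma function, and quantiles are found by bisection. The equal-tailed interval is an assumption made for this illustration, and the function names are hypothetical.

```python
import math


def gamma_cdf(c, shape, rate):
    """Gamma distribution function via the power series of the incomplete gamma function."""
    z = rate * c
    if z <= 0.0:
        return 0.0
    term = 1.0 / shape
    total = term
    k = 1
    while term > 1e-16 * total and k < 10000:
        term *= z / (shape + k)
        total += term
        k += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def gamma_quantile(p, shape, rate):
    """Quantile by bisection on the distribution function."""
    lo, hi = 0.0, shape / rate
    while gamma_cdf(hi, shape, rate) < p:
        hi *= 2.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gamma_cdf(mid, shape, rate) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def cfu_posterior(y, filtered_ml=100.0):
    """Posterior mean and equal-tailed 95% interval for c given a CFU count y."""
    shape = y + 0.5
    rate = filtered_ml / 100.0
    return (shape / rate,
            gamma_quantile(0.025, shape, rate),
            gamma_quantile(0.975, shape, rate))


mean, lower, upper = cfu_posterior(6)   # CFU count of 6 from a 100 ml aliquot
print(f"E(c | y = 6) = {mean:.1f}, 95% interval ({lower:.2f}, {upper:.2f})")
```

Doubling the filtered volume doubles the rate parameter, which shortens the interval for the same underlying concentration.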
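Finally I calculate the probability distribution of the positive tube count vector
x = (x1, . . . , xm), 0 ≤ xi ≤ ni, for any CFU observation y, P[X = x | Y = y], by
combining equations 5.2 and 5.5:
$$
\begin{aligned}
f(x \mid y) &= \int_{0}^{\infty} f(x \mid c) \, f(c \mid y) \, dc \\
&= \frac{(V/100)^{y+1/2}}{\Gamma(y + 1/2)} \int_{0}^{\infty} c^{y-1/2} \, e^{-(c/100)\left[V + \sum_{i=1}^{m} v_i (n_i - x_i)\right]} \prod_{i=1}^{m} \binom{n_i}{x_i} \left(1 - e^{-c v_i/100}\right)^{x_i} dc.
\end{aligned}
\tag{5.6}
$$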
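
The integral in equation 5.6 can be evaluated numerically for every possible tube pattern. The following sketch assumes the standard 5-tube design (10, 1, and 0.1 ml aliquots) and a 100 ml filtered volume, and uses midpoint integration with the substitution c = t². It is an illustration of the calculation rather than the code used for the results reported here; the function name and integration limits are assumptions.

```python
import itertools
import math

VOLUMES_ML = (10.0, 1.0, 0.1)   # assumed aliquot volumes for the three series
TUBES = (5, 5, 5)               # assumed number of tubes per series


def tube_pattern_pmf(y, filtered_ml=100.0, steps=2000):
    """Return {pattern: f(x | y)} for every tube pattern x (equation 5.6)."""
    shape = y + 0.5
    rate = filtered_ml / 100.0
    c_max = (shape + 10.0 * math.sqrt(shape) + 20.0) / rate   # far into the Gamma tail
    h = math.sqrt(c_max) / steps
    log_norm = shape * math.log(rate) - math.lgamma(shape)
    patterns = list(itertools.product(*(range(n + 1) for n in TUBES)))
    pmf = dict.fromkeys(patterns, 0.0)
    for k in range(steps):
        t = (k + 0.5) * h
        c = t * t
        # Gamma posterior density f(c | y) times the Jacobian 2t of c = t**2.
        weight = math.exp(log_norm + (shape - 1.0) * math.log(c) - rate * c) * 2.0 * t * h
        # Binomial probabilities of 0..n positive tubes in each dilution series.
        series = []
        for v, n in zip(VOLUMES_ML, TUBES):
            p = -math.expm1(-c * v / 100.0)
            series.append([math.comb(n, x) * p ** x * (1.0 - p) ** (n - x)
                           for x in range(n + 1)])
        for pattern in patterns:
            prob = weight
            for i, x in enumerate(pattern):
                prob *= series[i][x]
            pmf[pattern] += prob
    return pmf


pmf_given_6 = tube_pattern_pmf(6)   # observed CFU of 6 organisms per 100 ml
most_likely = max(pmf_given_6, key=pmf_given_6.get)
print("most likely tube pattern given CFU = 6:", most_likely)
```

The distribution of the MPN given a CFU observation then follows by mapping each tube pattern to its MPN and summing the probabilities of patterns that share the same value.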
[Figure 5.1 (image not reproduced). Panel A: fecal coliform MPN (organisms per 100 ml) plotted against true fecal coliform concentration (organisms per 100 ml); legend: E(MPN|c), MPN 95% prediction set for specified true concentration, 1:1 line. Panel B: fecal coliform CFU (organisms per 100 ml) plotted against true fecal coliform concentration (organisms per 100 ml); legend: E(CFU|c), 95% prediction interval for specified true concentration, 1:1 line.]

Figure 5.1: Expected values and 95% prediction sets or prediction intervals for observable fecal coliform MPN (panel A) and CFU (panel B) measurements given the true fecal coliform concentration in organisms per 100 ml. For clarity, expected values and 95% prediction sets or intervals are plotted only for every 5th integer-valued concentration c. Maximum true concentrations in each plot are based on maximum MPN and CFU observations in the NCDENR-SSS data set. CFU prediction intervals are based on an MF sample aliquot volume of 100 ml.
[Figure 5.2 (image not reproduced). Panel A: true fecal coliform concentration (organisms per 100 ml) plotted against fecal coliform MPN (organisms per 100 ml); legend: E(c|MPN), 95% credible intervals, 1:1 line. Panel B: true fecal coliform concentration (organisms per 100 ml) plotted against fecal coliform CFU (organisms per 100 ml); legend: E(c|CFU), 95% credible intervals, 1:1 line.]

Figure 5.2: Expected value and 95% credible intervals for the fecal coliform true concentration given MPN (panel A) and CFU (panel B) estimates in organisms per 100 ml. For clarity, panel A includes only the 51 observable MPN estimates presented in standard laboratory analysis MTF conversion tables for the 5-tube serial dilution analysis procedure (see, e.g. Woodward, 1957) and panel B includes only every 5th observable CFU value based on an MF test with a sample aliquot volume of 100 ml.
[Figure 5.3 (image not reproduced). Both panels plot fecal coliform MPN (organisms per 100 ml) against fecal coliform CFU (organisms per 100 ml). Panel A legend: NCDENR-SSS data used in log-linear regression model, NCDENR-SSS data excluded from log-linear regression model, regression model fitted values, 95% MPN prediction interval, 1:1 line. Panel B legend: NCDENR-SSS data, E(MPN|CFU), 95% MPN prediction set (for specified CFU), 1:1 line.]

Figure 5.3: Empirical linear regression model (panel A) and theoretical probability model (panel B) of the relationship between fecal coliform MPN and CFU estimates from the same water quality sample.
[Figure 5.4 (image not reproduced). Horizontal axis: MPN (organisms per 100 ml). Vertical axes: probability mass and number of observations. Legend: observed CFU = 6 organisms per 100 ml; E(MPN|CFU = 6) = 7.6 organisms per 100 ml; observed MPN values when CFU = 6; f(MPN|CFU = 6).]

Figure 5.4: Observed values, expected values, and the theoretical probability mass function of the MPN for a CFU measurement from the same water quality sample. Observed values are from recent NCDENR-SSS study.
Chapter 6
Improving Parameter Estimation in the
Aquatic Fate and Transport Model
Much of the research in this chapter was completed in collaboration with Dr. Song
Qian, Dr. Robert Wolpert, Dr. Rachel Noble and Dr. Kenneth Reckhow, and was
submitted to Water Research.
Water resource management decisions often depend on mechanistic or empirical
models to predict water quality conditions under future pollutant loading scenarios.
While explicitly acknowledging process, observation, and analytical uncertainty in
these models is considered critical to model-based resource management decisions
and protection of human and environmental health, few tools have been developed
which explicitly propagate analytical uncertainty into fecal indicator bacteria (FIB)
water quality models. Here, I explore how ignoring or acknowledging model
input uncertainty affects model parameter estimates in a simple FIB water quality
model. I present two approaches to calibrating the model using simulated results
of a standard multiple-tube fermentation (MTF) serial dilution analysis. The first
approach uses only the most probable number (MPN) point estimate, while the sec-
ond implements a Bayesian approach to modeling the number of positive tubes in
each MTF dilution series as a stochastic random variable. I find that my proposed
Bayesian approach yields parameter estimates which are asymptotically more accurate
and precise, and model predictions with less uncertainty than those based on
using MPN point estimates. These results suggest a potential new strategy for reduc-
ing uncertainty in model-based water resource management decisions, such as those
implemented through the United States Environmental Protection Agency (USEPA)
Total Maximum Daily Load (TMDL) program.
6.1 Introduction
Explicitly acknowledging analytical uncertainty is a potentially critical component of
water quality modeling and model-based water resources management. Nonetheless,
few tools have been developed and applied to propagate intrinsic analysis uncer-
tainty through coastal shellfish harvesting and recreational water quality models into
model forecasts and management decisions. Water quality standards in designated
recreational and shellfish harvesting areas are often based on the concentration of
fecal indicator bacteria (FIB) such as total coliforms, fecal coliforms, and enterococ-
cus (U.S. Environmental Protection Agency, 2001). Estimating FIB concentrations
through dilution series analysis and calculation of a most probable number (MPN) is
a well-documented procedure which, though broadly applied in water resource man-
agement, contains several sources of uncertainty (Best and Rayner, 1985; Woomer
et al., 1990; Garthright, 1993). When MPN estimates are used to calibrate bacterial
water quality models, the uncertainty associated with MPN estimating procedures
is often ignored. Ignoring this uncertainty can lead to poor model parameter estimates and to
misguided management decisions (Qian et al., 2004).
Here, I propose a Bayesian strategy for calibrating FIB water quality models in
which the pattern of positive tubes from a multiple-tube fermentation (MTF) serial
dilution analysis is used as a model input. My proposed strategy assumes that the
number of positive tubes in each series, when modeled as a stochastic random variable,
reflects variability in the MTF analysis procedure and, consequently, uncertainty in
the estimate of the true FIB concentration. I compare the proposed Bayesian strategy
with the common practice of using MPN point estimates to calibrate FIB water
quality models. In the following two subsections, I present a brief introduction to serial
dilution analysis (subsection 6.1.1) and MPN calculation methodology (subsection
6.1.2).
6.1.1 Serial Dilution Analysis
Serial dilution analysis of water quality samples is a procedure commonly used by re-
search laboratories and regulatory agencies to quantify FIB concentrations in coastal
and inland resource waters. Several serial dilution analysis procedures are in common
use, each using different aliquot volumes and different measures of FIB metabolic ac-
tivity. In the standard 5-tube fermentation decimal dilution procedure, water quality
sample aliquots with volume equal to 10 milliliters (ml), 1 ml, or 0.1 ml are trans-
ferred into three respective sets of five tubes, resulting in a total of fifteen tubes.
This procedure is called the 5-tube decimal dilution procedure because each dilution
series has five tubes, and because the volume of the original sample in each series is
separated by a factor of ten. After a period of incubation, the number of positive
tubes in each dilution series is recorded, yielding results of the form (x1, x2, x3) where
xi is the number of positive tubes in dilution series i. Tubes are considered positive
if gaseous by-products of bacteria lactose fermentation are visible. As a result, this
technique is commonly referred to as multiple tube fermentation (MTF). A full de-
scription of MTF laboratory procedures is presented in Standard Methods for the
Examination of Water and Wastewater (APHA, 2005).
While the model calibration procedures presented in this paper are based on
results of a standard 5-tube MTF analysis, other serial dilution analysis procedures
are also widely used. An example is the commercially available semi-automated
IDEXX Quanti-Tray®/2000 system, which includes a sampling tray with 97 wells
(49 have a volume of 1.86 ml, and 48 have a volume of 0.186 ml) into which a 100
ml sample is distributed. The IDEXX Quanti-Tray®/2000 technology represents a
type of alternative serial dilution analysis procedure to which the methods presented
in this paper apply.
6.1.2 Most Probable Number Calculations
McCrady (1915) is often credited with first quantifying FIB concentrations using
MPN theory (Eisenhart and Wilson, 1943; Hurley and Roscoe, 1983). Since Mc-
Crady’s work, numerous articles have been published on the theory behind MPN
calculations and their application in water quality assessment and food sanitation
(see Greenwood and Yule, 1917; de Man, 1977; Russek and Colwell, 1983; Beliaeff
and Mary, 1993; McBride, 2003). MPN calculation theory is often based on the
probability of observing negative or positive bacterial water quality test samples of
volume v taken from a larger sample of volume V . It has been shown previously (e.g.
Cochran, 1950) that if the large sample of volume V contains b organisms, then the
probability of obtaining a positive test sample, pf , is:
$$ p_f = 1 - (1 - v/V)^{b} \tag{6.1} $$
For very small values of v/V , this probability is well approximated by:
$$ p_f \approx 1 - e^{-bv/V} = 1 - e^{-cv/100} \tag{6.2} $$
where c (the parameter of interest in an MTF serial dilution analysis) is the FIB
concentration in the original sample, in organisms per 100 ml.
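
A short numerical check shows how closely equation 6.2 tracks equation 6.1. The values below are hypothetical: a 100 ml sample containing b = 50 organisms (so c = 50 organisms per 100 ml) and test volumes of 10, 1, and 0.1 ml. The function names are introduced only for this sketch.

```python
import math


def p_positive_exact(b, v, sample_ml):
    """Equation 6.1: probability that a test volume v holds at least one of b organisms."""
    return 1.0 - (1.0 - v / sample_ml) ** b


def p_positive_approx(c, v):
    """Equation 6.2: approximation in terms of the concentration c per 100 ml."""
    return -math.expm1(-c * v / 100.0)


b, sample_ml = 50, 100.0            # hypothetical organism count and sample volume
c = 100.0 * b / sample_ml           # concentration in organisms per 100 ml
differences = {}
for v in (10.0, 1.0, 0.1):
    exact = p_positive_exact(b, v, sample_ml)
    approx = p_positive_approx(c, v)
    differences[v] = abs(exact - approx)
    print(f"v = {v:>4} ml: exact = {exact:.5f}, approximate = {approx:.5f}")
```

The discrepancy is largest for the 10 ml test volume, where v/V is not especially small, and shrinks as the test volume decreases.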
If n test samples of volume v are taken from the original sample, the number
of positive tubes x after a period of incubation has a Binomial Bi(n, pf) probability
distribution. The probability mass function of x is therefore:
$$ f(x \mid c) = \binom{n}{x} p_f^{x} (1 - p_f)^{n-x} \tag{6.3} $$
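
Equations 6.2 and 6.3 together give a direct way to simulate the result of an MTF analysis for a known concentration. The sketch below assumes the standard 5-tube decimal dilution design and draws each tube as an independent Bernoulli trial; the function name, the seed, and the example concentration are hypothetical.

```python
import math
import random


def simulate_mtf(c, volumes=(10.0, 1.0, 0.1), tubes=(5, 5, 5), rng=None):
    """Simulate positive tube counts (x1, ..., xm) for true concentration c per 100 ml."""
    rng = rng or random.Random()
    counts = []
    for v, n in zip(volumes, tubes):
        p = -math.expm1(-c * v / 100.0)          # equation 6.2: P(tube is positive)
        counts.append(sum(rng.random() < p for _ in range(n)))
    return tuple(counts)


rng = random.Random(2009)                         # fixed seed so the draws are repeatable
draws = [simulate_mtf(50.0, rng=rng) for _ in range(5)]
print(draws)
```

Repeated draws at the same concentration generally yield different tube patterns, which is the intrinsic variability that a point estimate hides.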
Federal guidelines (Food and Drug Administration and Interstate Shellfish San-
83
itation Conference, 2005, for example) require that dilution series analysis of FIB
water quality samples include multiple dilution sets. These sets are designed to cover
the actual FIB concentration range while reducing the probability of observing ei-
ther zero or n positive test samples in each dilution set, which would result in either
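zero or infinite estimates of FIB concentration. The joint probability of observing x_i
positive test samples in each of m dilution sets (i ∈ {1, ..., m}), where set i has dilution
volume v_i and n_i samples, is represented by the following likelihood function (the
binomial coefficients, which do not depend on c, are omitted):
$$L(x_i, c \mid n_i, v_i) = \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i} \left(e^{-c v_i/100}\right)^{n_i - x_i} \tag{6.4}$$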
Numerous methods for estimating the MPN have been proposed, ranging from
iterative trial-and-error approaches and Bayesian statistical procedures (see, e.g.
Garthright, 1993; Klee, 1993; Roussanov et al., 1996; Briones and Reichardt, 1999), to
approaches proposing an “exact” value of the MPN using classical occupancy theory
(for details, see Tillett and Coleman, 1985; McBride, 2003). A common approach,
which I implement here, is to approximate the MPN as the maximum likelihood
estimate (MLE) of equation 6.4 (as a function of c). The MPN can therefore be
expressed as:
$$\mathrm{MPN} = \arg\max_{c} \left[ \prod_{i=1}^{m} \left(1 - e^{-c v_i/100}\right)^{x_i} \left(e^{-c v_i/100}\right)^{n_i - x_i} \right] \tag{6.5}$$
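One way to solve equation 6.5 numerically is to find the root of the derivative of the log-likelihood (the score function). The sketch below uses plain bisection in Python rather than the R routine listed with the computer code for this chapter; the function names, search bounds, and tolerance are assumptions chosen for illustration.

```python
import math

def score(c, tubes, volumes, n_tubes):
    """Derivative of the log of equation 6.4 with respect to c, up to the
    positive constant factor 1/100. It decreases monotonically in c."""
    total = 0.0
    for x, v, n in zip(tubes, volumes, n_tubes):
        total += x * v / (1.0 - math.exp(-c * v / 100.0)) - n * v
    return total

def mpn_mle(tubes, volumes, n_tubes, lo=1e-3, hi=1e5, tol=1e-9):
    """Bisection search for the c that maximizes equation 6.4.
    Only defined when the result has both positive and negative tubes."""
    if all(x == 0 for x in tubes) or all(x == n for x, n in zip(tubes, n_tubes)):
        raise ValueError("MLE is zero or infinite for all-negative or all-positive results")
    while hi - lo > tol * max(1.0, lo):
        mid = 0.5 * (lo + hi)
        if score(mid, tubes, volumes, n_tubes) > 0:
            lo = mid  # likelihood still increasing: the maximum lies above mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Standard 5-tube decimal dilution (10, 1, and 0.1 ml) with pattern 5-3-1.
mpn = mpn_mle(tubes=(5, 3, 1), volumes=(10.0, 1.0, 0.1), n_tubes=(5, 5, 5))
print(round(mpn, 1))
```

For the 5-3-1 pattern the result is close to the tabulated MPN of 110 organisms per 100 ml.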
From a Bayesian statistics perspective (see Berry, 1996; Bolstad, 2004), equation
6.4 is proportional to the posterior probability distribution of the true FIB concentration
under an implied uniform prior distribution. This Bayesian interpretation suggests
that information about uncertainty in the true FIB concentration is contained in the
pattern of positive serial dilution analysis tubes, and that calculating and report-
ing an MPN point estimate effectively discards that information. The widespread
application of MPN-based water quality standards (Food and Drug Administration
and Interstate Shellfish Sanitation Conference, 2005, for example) has presumably
focused FIB concentration analysis uncertainty on MPN standard errors and MPN
confidence intervals, rather than the FIB concentration likelihood function in equa-
tion 6.4 (e.g. Eisenhart and Wilson, 1943; Cochran, 1950; Aspinall and Kilsby, 1979;
Hurley and Roscoe, 1983; McBride, 2003). As a result, FIB water quality models
are commonly calibrated using only the MPN point estimate. In the next section, I
compare this approach, which implicitly ignores water quality analysis uncertainty,
to my proposed Bayesian modeling strategy, which explicitly acknowledges water
quality analysis uncertainty.
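To illustrate the information that a point estimate discards, the sketch below normalizes the likelihood in equation 6.4 over a grid of concentrations, which amounts to a posterior under a uniform prior. The grid range (up to 2,000 organisms per 100 ml), grid resolution, and function names are assumptions made for this example only.

```python
import math

def log_likelihood(c, tubes, volumes, n_tubes):
    """Log of equation 6.4 for concentration c (organisms per 100 ml)."""
    total = 0.0
    for x, v, n in zip(tubes, volumes, n_tubes):
        rate = c * v / 100.0
        if x > 0:
            total += x * math.log(-math.expm1(-rate))  # log(1 - e^{-rate})
        total -= (n - x) * rate                         # log(e^{-rate})
    return total

def grid_posterior(tubes, volumes, n_tubes, c_max=2000.0, steps=20000):
    """Normalized likelihood on an evenly spaced grid over (0, c_max],
    i.e. the posterior under a uniform prior truncated at c_max."""
    grid = [c_max * (i + 1) / steps for i in range(steps)]
    log_l = [log_likelihood(c, tubes, volumes, n_tubes) for c in grid]
    peak = max(log_l)
    weights = [math.exp(ll - peak) for ll in log_l]
    total = sum(weights)
    return grid, [w / total for w in weights]

def credible_interval(grid, probs, level=0.95):
    """Equal-tailed interval read off the cumulative grid probabilities."""
    tail = (1.0 - level) / 2.0
    cum, lower, upper, have_lower = 0.0, grid[0], grid[-1], False
    for c, p in zip(grid, probs):
        cum += p
        if not have_lower and cum >= tail:
            lower, have_lower = c, True
        if cum >= 1.0 - tail:
            upper = c
            break
    return lower, upper

grid, probs = grid_posterior((5, 3, 1), (10.0, 1.0, 0.1), (5, 5, 5))
print(credible_interval(grid, probs))
```

The mode of this grid posterior coincides with the MPN, while the interval shows how wide a range of concentrations is consistent with a single tube pattern.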
6.2 Methods
I explore potential benefits of the proposed Bayesian modeling strategy by applying
it, along with the traditional approach of using the MPN point estimate, in the
calibration of the following FIB fate and transport model (Thomann and Mueller,
1987; Chapra, 1997):
$$\ln(c) = \ln(c_0) - kt \tag{6.6}$$
in which c is the true FIB concentration (in organisms per 100 ml) at time t, c0 is the
true FIB concentration at time t=0, and k is a first-order decay rate (see Chapra,
1997, for details).
I calibrate the model in equation 6.6 using simulated data, rather than actual
observations, in order to compare parameter estimates from the two modeling strate-
gies to the parameter values used in the simulation. The following sections include
detailed descriptions of my data simulation and model parameter estimating proce-
dures.
6.2.1 Data Simulation
I simulate the evolution of a FIB water quality grab sample with concentration c (per
equation 6.6) into FIB water quality laboratory analysis results through a three-step
approach. First, I simulate values of the true FIB concentration c using the following
modified version of equation 6.6, which includes a lognormally-distributed LN(0, σm)
stochastic model process error term:
$$c = e^{\ln c_0 - kt + \mathrm{No}(0, \sigma_m)} \tag{6.7}$$
While process error is typically included in models to account for uncertainty
and unknown sources of variability, here I include it as one of the parameters to
be estimated during model calibration. For the simulation, I use σm = 0.3 (in log-
organisms per 100 ml). In addition, I use c0 = 1500 organisms per 100 ml and decay
rate k = 0.8 (1/day). My choice of k = 0.8 (1/day) is based on a review of a range
of values presented in Bowie et al. (1985).
In order to assess how model calibration varies with sample size, I simulate 100
sets of j water quality samples for each sample size j ∈ {10, 25, 100}. Each set is simulated using j values
of t evenly spaced between 0 and 10 days. I simulate the model over a period of 10
days in order to generate FIB concentrations at the upper and lower detection limits
of the standard 5-tube decimal dilution procedure. With a decay rate of 0.8 (1/day),
an initial concentration c0 = 1500 organisms per 100 ml is expected to be reduced
by roughly 99.97% after 10 days (to a concentration of about 0.50 organisms per 100 ml).
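The first simulation step can be sketched as follows. This is an illustrative Python version, not the code used in the study; the function names and the use of Python's `random` module are assumptions, while the default parameter values are those stated above.

```python
import math
import random

def simulate_true_concentrations(j, c0=1500.0, k=0.8, sigma_m=0.3,
                                 t_max=10.0, rng=None):
    """Step 1: draw j 'true' concentrations from equation 6.7 at j times
    evenly spaced between 0 and t_max days."""
    rng = rng or random.Random()
    times = [t_max * i / (j - 1) for i in range(j)]
    conc = [math.exp(math.log(c0) - k * t + rng.gauss(0.0, sigma_m))
            for t in times]
    return times, conc

def noise_free_concentration(t, c0=1500.0, k=0.8):
    """Equation 6.6 without the process error term."""
    return c0 * math.exp(-k * t)

times, conc = simulate_true_concentrations(10, rng=random.Random(1))
print(noise_free_concentration(10.0))
```

The noise-free value at t = 10 days is 1500·e^(−8) ≈ 0.50 organisms per 100 ml (applying the rounded 99.97% reduction instead gives 0.45).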
In the second step, I simulate the pattern of positive tubes (x1, x2, x3) result-
ing from a standard (5-tube) MTF decimal dilution analysis of each simulated FIB
concentration c using the following model (see equation 6.3):
$$x_i \sim \mathrm{Bi}\left(n_i = n = 5,\; p_i = 1 - e^{-c v_i/100}\right)$$
This model can be implemented using standard statistical software functions, such as
rbinom in the program R (R Development Core Team, 2006), which generates
binomial random variates given parameters n and p.
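For readers working outside R, the second simulation step might look like the sketch below, which draws each tube as an independent Bernoulli trial. The function name, the `VOLUMES_ML` constant, and the seeded generator are hypothetical choices for illustration.

```python
import math
import random

VOLUMES_ML = (10.0, 1.0, 0.1)  # standard decimal dilution aliquot volumes

def simulate_tubes(c, volumes=VOLUMES_ML, n=5, rng=None):
    """Step 2: simulate the number of positive tubes in each dilution set
    for a concentration c (organisms per 100 ml)."""
    rng = rng or random.Random()
    counts = []
    for v in volumes:
        p = 1.0 - math.exp(-c * v / 100.0)
        # Sum of n Bernoulli(p) draws is a Binomial(n, p) draw.
        counts.append(sum(1 for _ in range(n) if rng.random() < p))
    return tuple(counts)

pattern = simulate_tubes(52.9, rng=random.Random(0))
print(pattern)
```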
In the third and final simulation step, I calculate the MPN associated with each
set of positive tubes simulated in the second step by solving equation 6.5 using the
function uniroot in the software package R (see Appendix). If all of the tubes
in a simulated MTF analysis are negative, the MLE of the likelihood function in
equation 6.4 (and therefore the MPN) is zero (for details, see Qian et al., 2005).
There is no standard, however, for reporting an MPN from an MTF result with all
tubes negative. Furthermore, MPN values of 0 are incompatible with the log-linear
model because the logarithm of 0 is negative infinity. As a result, using MPN point
estimates to calibrate log-linear bacteria water quality model parameters requires a
subjective interpretation of an MTF result with all tubes negative. I incorporate
serial dilution results with all tubes negative (as they arise from the simulation) by
randomly selecting an MPN value from a Uniform U(a, b) probability distribution
with a = 0 and b = 1.7 (the lowest MPN value reported in standard tables, such
as Woodward (1957), when at least one of the fifteen tubes in an MTF analysis is
positive).
If all of the tubes in an MTF decimal dilution analysis are positive, the MLE
of the likelihood function in equation 6.4 (and therefore the MPN) is infinite. I
incorporate simulated MTF analysis results with all tubes positive into a regression-
based calibration assessment using an MPN estimate of 1,700 organisms per 100 ml
(per Food and Drug Administration and Interstate Shellfish Sanitation Conference
(2005)). An alternative modeling strategy, which I do not implement here but is
common in bacterial water quality analysis, is discarding results with either all tubes
negative or all tubes positive.
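The handling of all-negative and all-positive results described above can be summarized in a short function. This is a Python sketch under the rules stated in the text (a U(0, 1.7) draw for all-negative results, 1,700 organisms per 100 ml for all-positive results); the function name, bisection bounds, and iteration count are assumptions.

```python
import math
import random

def mpn_point_estimate(tubes, volumes=(10.0, 1.0, 0.1), n=5, rng=None):
    """Step 3: MPN for one tube pattern, including the conventions used
    here for all-negative and all-positive results."""
    rng = rng or random.Random()
    if all(x == 0 for x in tubes):
        return rng.uniform(0.0, 1.7)   # all negative: random value below 1.7
    if all(x == n for x in tubes):
        return 1700.0                  # all positive: fixed reporting value

    def score(c):
        # Derivative of the log-likelihood (equation 6.4), up to a factor 1/100.
        return sum(x * v / (1.0 - math.exp(-c * v / 100.0)) - n * v
                   for x, v in zip(tubes, volumes))

    lo, hi = 1e-3, 1e5
    for _ in range(200):               # bisection on the decreasing score
        mid = 0.5 * (lo + hi)
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

print(mpn_point_estimate((5, 2, 0)))
```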
Table 6.1 includes a representative sample of simulated data, including theoretical
grab sample FIB concentrations c (each collected at time t), the simulated pattern
of positive tubes (x1, x2, x3) from MTF decimal dilution analysis of each sample, and
the corresponding MPN. I use this data in the next section to estimate parameters
of the model in equation 6.7. A summary of steps in the data simulation process is
included in table 6.2.
t (days)   c (organisms/100 ml)   x1   x2   x3   MPN (organisms/100 ml)
 0.0       1167.9                 5    5    3    920
 1.1        501.6                 5    5    3    920
 2.2        355.4                 5    5    3    920
 3.3         94.2                 5    3    1    110
 4.4         52.9                 5    2    0    49
 5.6         20.9                 4    2    0    22
 6.7          7.9                 1    1    0    4.0
 7.8          2.4                 1    0    0    2.0
 8.9          1.0                 0    0    0    0.4**
10.0          0.4                 0    0    0    0.3**
Table 6.1: Example of simulated data set with sample size j = 10. Each row represents a simulated grab sample with concentration c collected at time t, a simulated pattern of positive tubes (x1, x2, x3) resulting from standard MTF decimal dilution analysis of the grab sample, and the corresponding MPN (**see Methods section for interpretation of results with all tubes negative, or all tubes positive).
Step 1
  Variable(s) simulated or calculated: c
  Model: c = e^{ln c0 − kt + No(0, σm)}
  Parameters: c0, initial FIB concentration (organisms per 100 ml) = 1500
              k, first-order FIB decay rate (1/day) = 0.8
              σm, model residual s.e. (log-organisms per 100 ml) = 0.3
  Predictors: t

Step 2
  Variable(s) simulated or calculated: x1, x2, x3
  Model: xi ∼ Bi(n, pi)
  Parameters: n, number of tubes in each dilution series = 5
              pi, probability of positive test sample in series i = 1 − e^{−c vi/100}
              vi, sample aliquot volume (ml) in series i, ∈ [10, 1, 0.1]
  Predictors: c

Step 3
  Variable(s) simulated or calculated: MPN
  Model: argmax_c [ ∏_{i=1}^{m} (1 − e^{−c vi/100})^{xi} (e^{−c vi/100})^{n−xi} ]
  Parameters: m, number of dilution series = 3
  Predictors: x1, x2, x3

Table 6.2: Summary of steps used to simulate hypothetical water quality analysis data including FIB fate and transport in an aquatic environment with first-order decay (step 1), randomly generated pattern of positive serial dilution analysis tubes (step 2), and calculation of the associated MPN (step 3).
6.2.2 Parameter Estimation
My first approach to estimating parameter values (i.e. c0, k, σm) in the first-order
decay model (equation 6.7) uses an ordinary least-squares (OLS) regression with
ln(MPN) point estimates as the model response variable (see Weisberg, 2005, for
details on OLS regression). The regression model is:
$$\ln(\mathrm{MPN}) = \beta_0 + \beta_1 t + \mathrm{No}(0, \sigma) \tag{6.8}$$
where β0 is an estimate of ln(c0), −β1 is an estimate of k (the fitted slope β1 estimates −k), and σ is an estimate of σm.
For each of the 100 sample sets of each size j (j ∈ {10, 25, 100}), I record the estimated mean
value of ln(c0), k, and σm.
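The regression in equation 6.8 has a closed-form solution. The sketch below is a plain-Python illustration (not the software used for the study) that fits the line to ln(MPN) and recovers the decay rate as the negative of the fitted slope; the function name and returned dictionary keys are assumptions. The example data are the ten rows of Table 6.1.

```python
import math

def ols_decay_fit(times, mpn_values):
    """Ordinary least squares fit of ln(MPN) = beta0 + beta1 * t.
    Returns estimates of c0 = exp(beta0), k = -beta1, and the residual
    standard deviation (with n - 2 degrees of freedom)."""
    y = [math.log(m) for m in mpn_values]
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(y) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (yi - y_bar) for t, yi in zip(times, y))
    beta1 = sxy / sxx
    beta0 = y_bar - beta1 * t_bar
    rss = sum((yi - beta0 - beta1 * t) ** 2 for t, yi in zip(times, y))
    return {"c0": math.exp(beta0), "k": -beta1, "sigma": math.sqrt(rss / (n - 2))}

# Times and MPN values from Table 6.1 (j = 10).
t_obs = [0.0, 1.1, 2.2, 3.3, 4.4, 5.6, 6.7, 7.8, 8.9, 10.0]
mpn_obs = [920, 920, 920, 110, 49, 22, 4.0, 2.0, 0.4, 0.3]
print(ols_decay_fit(t_obs, mpn_obs))
```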
My second approach implements a Bayesian modeling strategy in which I derive
posterior distributions for each model parameter using Markov-chain Monte Carlo
(MCMC) simulations in the WinBUGS software program (Lunn et al., 2000; Spiegel-
halter et al., 2003). My Bayesian modeling approach is based on the assumption
that the number of positive tubes in an MTF dilution series (xi) can be modeled as
a Binomial Bi(n, pf ) random variable evolving from the true FIB concentration c as
follows:
$$\ln(c) = \ln(c_0) - kt + \mathrm{No}(0, \sigma_m)$$
$$x_i \sim \mathrm{Bi}\left(n = 5,\; p_i = 1 - e^{-c v_i/100}\right)$$
For each of the 100 sample sets, I record an estimated mean value of c0, k, and
σm. Detailed code for implementing this approach in WinBUGS, including selection
of parameter prior distributions, is included in the Appendix.
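To make the structure of the hierarchical model concrete, the sketch below writes out the unnormalized log joint density that an MCMC sampler would target for a single sample set. It is a simplified Python illustration, not the WinBUGS implementation: the function names are hypothetical, the prior scale on ln(c0) is fixed (`sigma_c0`) rather than given its own prior, and constant terms are dropped.

```python
import math

VOLUMES_ML = (10.0, 1.0, 0.1)
N_TUBES = 5

def log_binomial(x, n, p):
    """Log of the Binomial(n, p) probability of x, guarding p = 0 and p = 1."""
    if p <= 0.0:
        return 0.0 if x == 0 else -math.inf
    if p >= 1.0:
        return 0.0 if x == n else -math.inf
    return math.log(math.comb(n, x)) + x * math.log(p) + (n - x) * math.log(1.0 - p)

def log_joint(log_c0, k, sigma_m, errors, times, tubes, sigma_c0=10.0):
    """Unnormalized log joint density of parameters, latent process errors,
    and observed tube counts. Uniform U(0, 20) priors on k and sigma_m
    contribute only through their support."""
    if not (0.0 < k < 20.0 and 0.0 < sigma_m < 20.0):
        return -math.inf
    total = -0.5 * (log_c0 / sigma_c0) ** 2           # normal prior on ln(c0)
    for e, t, counts in zip(errors, times, tubes):
        total += -math.log(sigma_m) - 0.5 * (e / sigma_m) ** 2  # process error
        c = math.exp(log_c0 - k * t + e)                        # equation 6.7
        for x, v in zip(counts, VOLUMES_ML):
            p = -math.expm1(-c * v / 100.0)                     # 1 - e^{-cv/100}
            total += log_binomial(x, N_TUBES, p)
    return total

# Three hypothetical samples with process errors set to zero.
obs_times = [0.0, 5.0, 10.0]
obs_tubes = [(5, 5, 3), (5, 1, 0), (0, 0, 0)]
zero_errors = [0.0, 0.0, 0.0]
print(log_joint(math.log(1500.0), 0.8, 0.3, zero_errors, obs_times, obs_tubes))
```

Note that the tube counts enter the density directly; no MPN is computed at any point.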
6.3 Results
Model calibration using both MPN point estimates and the pattern of positive serial
dilution analysis tubes yielded accurate estimates of parameters c0 and k. As shown
in figure 6.1, the inner quartile range (thick black line) contains the “true value” of
c0 and k for both procedures for all three sample sizes. For sample sizes of 10 and
25, however, estimates of c0 and k are more precise in models calibrated using the
MPN point estimate.
Model calibration using the MPN, however, consistently resulted in significant
overestimates of model error (σm) for all sample sizes. Furthermore, the magnitude
of overestimation increased with sample size. In contrast, estimates of σm using
the pattern of positive serial dilution analysis tubes yielded an inner quartile range
containing the true value of σm for samples of size 25 and 100, and parameter 95%
credible intervals containing the true value of σm for samples of size 10. None of the
MPN-based 95% intervals for σm contained its true value, regardless of sample size.
[Figure 6.1: three panels showing estimates of c0, k, and σm (horizontal axes) by sample size (10, 25, 100; vertical axis).]
Figure 6.1: Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for each model parameter based on samples of size 10, 25, or 100. Vertical gray lines indicate the parameter value used to simulate data. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents the parameter estimate based on using the MPN point estimate, and the lower line (with hollow circle) represents parameter estimates based on using the pattern of positive tubes for model calibration.
6.4 Discussion
My analysis indicates that using the pattern of positive tubes from an MTF serial
dilution analysis as data provides far more accurate estimates of the model error term
(σm), but provides somewhat less precise and less accurate estimates of model decay
rate k and initial concentration c0 (particularly with a small sample size). I expect
that the relative uncertainty in c0, particularly when the pattern of positive serial
dilution tubes is used for inference, is a simple consequence of the data-generating
process. More specifically, I set the log-linear model intercept term, c0, to 1500
organisms per 100 ml, which is close to the upper detection limit of the standard
5-tube MTF procedure. Water quality grab samples simulated at time t ≈ 0 might
have yielded MTF results with all tubes positive, and because I assigned these results
an MPN value of 1700 organisms/100 ml, the estimate of c0 appears to be accurate,
when in fact it is likely determined by my choice of the upper value of censored data.
When a dilution series yields all positive or all negative results, the underlying con-
centration is essentially non-identifiable. Common approaches to addressing these
data points in models, including either removing them before analysis or reporting
them as below or above a certain value, often lead to a loss of information (Qian
et al., 2004). I also explored alternative linear modeling procedures for censored
data, including the EM algorithms presented in Schmee and Hahn (1979) and Tanner
(1991). Differences between parameter values estimated using the EM algorithms
and those presented in my results are negligible.
As discussed in Qian et al. (2005), using all of the observed serial dilution counts
as data for model inference (including those with all positive and all negative results)
is expected to yield models which outperform those using MPN-based data, regard-
less of whether those using MPN data omit or censor the MPN values associated
with all positive or all negative tube counts. This study has demonstrated potential
effects of using the MPN on model parameter estimates; however, further analysis is
needed to understand potential effects on model forecasts. Here, I demonstrate how
uncertainty in FIB concentration model parameters propagates into predictions of
FIB concentration. I use a Monte Carlo simulation procedure using triplicate values
of c0, k, and σm to simulate the distribution of FIB concentrations (using my original
model in equation 6.7) at t = 1, 4, and 7 days. I find that model prediction uncertainty
is consistently higher in models calibrated using MPN point estimates than in
models calibrated using the pattern of positive serial dilution analysis tubes (figure
6.2). These results emphasize how explicitly modeling analytical process uncertainty
improves not only understanding of the relationship between pollutant concentrations
in the water column and laboratory-derived estimates of the concentration, but also
how uncertainty in resource area management decisions might relate to variability in
those estimates.
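The propagation of parameter uncertainty into predictions can be sketched as a simple Monte Carlo loop over parameter draws. The Python below is an illustration only: the function names are hypothetical, and the parameter draws shown are made-up placeholders rather than estimates from this study.

```python
import math
import random

def predict_concentrations(param_draws, t, rng=None):
    """For each (c0, k, sigma_m) draw, simulate one concentration at time t
    from equation 6.7."""
    rng = rng or random.Random()
    return [math.exp(math.log(c0) - k * t + rng.gauss(0.0, sigma_m))
            for c0, k, sigma_m in param_draws]

def quantile(values, q):
    """Linear-interpolation quantile of a list of numbers, 0 <= q <= 1."""
    ordered = sorted(values)
    pos = q * (len(ordered) - 1)
    lo = int(math.floor(pos))
    hi = min(lo + 1, len(ordered) - 1)
    frac = pos - lo
    return ordered[lo] * (1.0 - frac) + ordered[hi] * frac

# Placeholder parameter draws (illustrative values only).
rng = random.Random(42)
draws = [(rng.uniform(1200.0, 1800.0), rng.uniform(0.7, 0.9), rng.uniform(0.2, 0.4))
         for _ in range(2000)]
for t in (1.0, 4.0, 7.0):
    pred = predict_concentrations(draws, t, rng=rng)
    print(t, quantile(pred, 0.025), quantile(pred, 0.5), quantile(pred, 0.975))
```

Wider parameter draws, in particular larger values of σm, translate directly into wider prediction intervals.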
[Figure 6.2: three panels (t = 1, 4, and 7 days) showing predicted FIB concentration c (horizontal axes) by sample size (10, 25, 100; vertical axis).]
Figure 6.2: Estimated inner quartile (50%, thick black line) and 95% intervals (thin black line) for model-predicted FIB concentrations at time t = 1, 4, and 7 days. Vertical gray lines indicate the expected FIB concentration using the “true” parameter values. Dots (solid and hollow) indicate median values. For each sample size, the upper line (with solid circle) represents predicted FIB concentrations using the model calibrated with MPN point estimates, and the lower line (with hollow circle) represents predicted FIB concentrations using the model calibrated using the pattern of positive tubes.
I also explored the choice of parameter prior distributions as a potential source
of bias in the posterior parameter distribution. For example, posterior parameter
distributions for k based on a normal prior distribution, k ∼ No(0, σ_k^2) with σ_k ∼
U(0, 20), were compared to the posterior parameter distribution based on a uniform
prior distribution, k ∼ U(0,20) (see Gelman, 2006, for details on prior distribution
parameterization). Differences between the resulting posterior parameter distribu-
tions were negligible, indicating that my selection of prior distributions was not a
significant source of parameter estimation bias.
Opportunities for applying my modeling approach are found in a broad range
of environmental and public health-related disciplines. For example, Harris et al.
(1998) utilize MPN data in the analysis of planktonic diatom concentrations in sedi-
ment samples and cite similar studies using MPN calculations (e.g. Larrazabal et al.,
1990; An et al., 1992). Eckford and Fedorak (2005) use an MPN method to as-
sess nitrate-reducing bacteria growth in oil fields, and Fegan et al. (2004) present a
series of studies enumerating Escherichia coli O157 in cattle feces using MPN pro-
cedures. Additional examples of MPN-based environmental assessment include soil
and groundwater composition analysis (Menyah and Sato, 1996; Papen and von Berg,
1998) and aquifer contamination studies (Bekins et al., 1999). A specific example of
an MPN-based assessment of fecal contamination in recreational water bodies is the
Oregon Beach Monitoring Program (Neumann et al., 2006). This program, while
acknowledging environmental conditions as potential sources of data variability, ap-
plies MPN point estimates of FIB concentration rather than probabilistic estimates,
and therefore represents the type of study which could utilize, and potentially be
improved by, my modeling strategy.
In light of the many examples of uses of MPN data, I must acknowledge an on-
going transition in FIB water quality monitoring from traditional MTF technologies
towards chromogenic substrate (such as IDEXX Quanti-Tray®/2000) and membrane
filtration (MF) technologies (see Noble et al., 2003b, for details). As mentioned pre-
viously, the IDEXX system yields MPN estimates of FIB concentration, and water
quality management laboratories using IDEXX data could apply the methodology
presented in this paper. Laboratories switching to the MF technology are likely to
continue using historic MPN estimates until the MF-based data sets are sufficiently
large. I also recognize that my approach may depend on well-maintained historic
records of MTF serial dilution analysis data. For ongoing programs, my results might
therefore provide an incentive for ensuring this data is readily available. For moni-
toring programs with large historic data sets, the potential effort of retrieving tube
count data would need to be compared with the potential benefits of my modeling
approach on a case by case basis.
This study represents a new contribution to an ongoing initiative within the envi-
ronmental modeling community to improve model-based water resource management
decisions through innovative approaches to addressing potential sources of uncer-
tainty. Cornerstones of this initiative are identified by Reckhow (1994), who warns
against the assumption that water quality assessment is precise, and suggests that
all potential sources of uncertainty should be incorporated into decision making pro-
cesses. Jakeman and Letcher (2003) also argue that model uncertainty and error ac-
cumulation are two important considerations arising from the use of natural resource
management-support models. Similar perspectives emphasizing the importance of
error acknowledgment and propagation through water quality models are presented
by Vandenberghe et al. (2007) and Benham et al. (2006). The modeling strategy pre-
sented here formally acknowledges uncertainty through probabilistic representation
of information from as high up in the hierarchical chain of data evolution as possible
(i.e. the pattern of positive serial dilution analysis tubes), and represents an efficient
approach to addressing current initiatives identified by the environmental modeling
community.
If using the pattern of positive tubes in a dilution series analysis consistently
improves model parameter estimation and, presumably, the predictive capabilities of
bacterial water quality models, potential benefits include more efficient use of manage-
ment resources, reduced effort associated with calculating, reporting, and interpreting
laboratory analysis results, and a shift away from debates over the best approach to
quantifying MPN uncertainty (see, e.g., Roussanov et al., 1996; Garthright, 1997)
to appropriate model selection. My approach to acknowledging and modeling uncer-
tainty in the MTF serial dilution analysis procedure represents a type of innovative
tool (as discussed in Borsuk et al., 2002) for improving local, regional, and global
water resource management plans.
6.5 Conclusions
I present a simulation-based analysis of bacterial model parameter estimation proce-
dures using two approximations of the “true” FIB concentration, each resulting from
a different interpretation of MTF serial dilution analysis results, and each reflecting
a different understanding of uncertainty. My analysis indicates that using the pattern
of positive tubes from a serial dilution analysis improves parameter estimation and
associated model forecasts when compared to using the MPN point estimates of FIB
concentration. Similar results were obtained in a study of mice infectivity rates by
Qian et al. (2005), who found that Bayesian model parameter estimation resulted
in lower uncertainty, and suggested that MPN estimates may not be as suitable for
model parameter estimation as the “count data” from which they are derived.
Recent advances in computational speed and Bayesian analytical software greatly
facilitate the type of probabilistic data representation demonstrated in this paper.
Future research in this area includes using the pattern of positive tubes from a serial
dilution analysis (or similar probabilistic modeling strategies) in more complex mod-
els which traditionally use FIB concentration point estimates. Additional research
opportunities based on this study include potential analysis of changes in long term
water quality standard violation forecasts and resource area management decisions
using probabilistic models. The following is a summary of observations made during
the course of this study:
• Using the pattern of positive serial dilution analysis tubes to calibrate FIB
water quality models yields far more accurate estimates of model error, and
comparable estimates of other model parameters, when compared to using the
MPN.
• Model parameter inference using MPN point estimates yielded a significant
overestimation of model error, leading to unnecessarily large model prediction
intervals. When this uncertainty propagates into water quality-based manage-
ment decisions, it is often accounted for by an implicit margin of safety (MOS).
The model parameter inference procedures presented in this paper allow analysis,
and possibly a reduction, of the MOS.
• Using the pattern of positive serial dilution analysis tubes as a direct model
input eliminates the need to calculate MPN point estimates and upper and
lower limits of censored MPN estimates, thereby simplifying model inference
and avoiding common sources of error and uncertainty.
• Bacterial water quality model inference based on probabilistic representation of
hierarchical data applies to both historic (e.g. traditional 5-tube MTF analysis)
and new (e.g. IDEXX Quanti-Tray®/2000) analytical procedures, and represents
an approach to addressing uncertainty consistent with ongoing objectives
identified by the environmental modeling community.
6.6 Computer code
A. Function for calculating MPN:
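# MPN (organisms per 100 ml) from tube counts; v = aliquot volumes (ml)
calc.mpn <- function(tubes, v, n.tubes) {
  # All tubes negative: draw from U(0.001, 1.7); all positive: report 1700
  if (all(tubes == 0)) return(runif(1, 0.001, 1.7))
  if (all(tubes == n.tubes)) return(1700)
  # Root of the score function (derivative of the log of equation 6.4)
  score <- function(c) sum((tubes * v) / (1 - exp(-(c / 100) * v)) - (v * n.tubes))
  uniroot(score, lower = 0.1, upper = 3000, tol = 1e-10)$root
}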
B. Prior distributions and WinBUGS code for estimating parameters using
the pattern of positive tubes from a decimal dilution analysis:
Following the approach of Gelman (2006), I use the following parameter prior
distributions:
$$\pi(c_0) \sim \mathrm{LN}(0, \sigma_{c_0})$$
$$\pi(\sigma_{c_0}) \sim \mathrm{U}(0, 20)$$
$$\pi(k) \sim \mathrm{U}(0, 20)$$
$$\pi(\sigma_m) \sim \mathrm{U}(0, 20)$$
and the following WinBUGS code:
model {
  for (j in 1:J) {                  # J = number of samples in each set
    t1[j] ~ dbin(p1[j], n)          # id = set number (out of n.run)
    t2[j] ~ dbin(p2[j], n)
    t3[j] ~ dbin(p3[j], n)
    p1[j] <- 1 - exp(-(c[j]/100) * v1)
    p2[j] <- 1 - exp(-(c[j]/100) * v2)
    p3[j] <- 1 - exp(-(c[j]/100) * v3)
    c[j] <- exp(logc0[id[j]] - k[id[j]] * t[j] + error[j])
    error[j] ~ dnorm(0, tau[id[j]])
  }
  v1 <- 10
  v2 <- 1
  v3 <- .1
  n <- 5
  for (i in 1:n.run) {              # n.run = 100 sets
    logc0[i] ~ dnorm(0, tauc0)
    tau[i] <- pow(sigma[i], -2)
    sigma[i] ~ dunif(0, 20)
    k[i] ~ dunif(0, 20)
  }
  tauc0 <- pow(sigmac0, -2)
  sigmac0 ~ dunif(0, 20)
}
Appendix A
Listing of Impaired Waters
Water body name       Assessment Unit   Classification   IR Category
Newport River         21-(17)a          SA               4cs
Newport River         21-(17)b1         SA               4cs
Newport River         21-(17)b2         SA               5
Newport River         21-(17)c          SA               5
Newport River         21-(17)d1         SA               5
Newport River         21-(17)d3         SA               4cs
Newport River         21-(17)e1         SA               4cs
Newport River         21-(17)e2         SA               4cs
Newport River         21-(17)f          SA               4cs
Newport River         21-(17)g1         SA               4cs
Newport River         21-(17)g2         SA               4cs
Newport River         21-(17)h          SA               5
Little Creek Swamp    21-18             SA               4cs
Mill Creek            21-19             SA               4cs
Big Creek             21-20             SA               4cs
Little Creek          21-21             SA               4cs
Harlowe Canal         21-22-1           SA               4cs
Alligator Creek       21-22-2           SA               4cs
Harlowe Creek         21-22a            SA               4cs
Harlowe Creek         21-22b1           SA               4cs
Harlowe Creek         21-22b2           SA               4cs
Harlowe Creek         21-22b3           SA               4cs
Harlowe Creek         21-22c            SA               5
Oyster Creek          21-23a            SA               5
Oyster Creek          21-23b            SA               4cs
Eastman Creek         21-24-1           SA               4cs
Bell Creek            21-24-2a          SA               4cs
Bell Creek            21-24-2b          SA               4cs
Core Creek            21-24a            SA               NA
Core Creek            21-24b1           SA               4cs
Core Creek            21-24b2           SA               4cs
Core Creek            21-24c            SA               4cs
Ware Creek            21-25             SA               5
Russell Creek         21-26a            SA               4cs
Russell Creek         21-26b            SA               4cs
Wading Creek          21-27             SA               4cs
Gable Creek           21-28a            SA               4cs
Gable Creek           21-28b            SA               4cs
Willis Creek          21-29             SA               4cs
Crab Point Bay        21-30             SA               4cs

Table A.1: Water bodies within shellfish growing area E-4 and their status relative to the 303(d) list of impaired waters. “IR Category” refers to 2008 Draft Integrated Report (IR) Category.
Appendix B
North Carolina Shellfish Harvesting Area
Water Quality Standards
Title 15A of the North Carolina Administrative Code (NCAC), Chapter 18 (Environmental Health), SubChapter A (Sanitation), Sections .0300 through .0900 provide rules governing the harvest, growth, distribution and consumption of shellfish. The following is a summary of the four major shellfish growing area classifications as presented in Section .0900 of the pertinent section of the NCAC:
Approved Areas - A shellfish growing area is classified as Approved if the following criteria are met:
1. the shoreline survey has indicated that there is no significant point source contamination;
2. the area is not contaminated with fecal material, pathogenic microorganisms, poisonous and deleterious substances, or marine biotoxins that may render consumption of the shellfish hazardous;
3. the median fecal coliform most probable number (MPN) or the geometric mean MPN of water shall not exceed 14 per 100 milliliters, and not more than ten percent of the samples shall exceed a fecal coliform MPN of 43 per 100 milliliters (per five tube decimal dilution) in those portions of areas most probably exposed to fecal contamination during adverse pollution conditions.
Conditionally Approved Areas - As stated in NCAC, conditionally approved areas are those expected to meet Approved Area criteria for extended periods and the factors determining those periods are known and predictable. Written management plans are developed by the Division of Environmental Health for these areas. When management plan criteria are met, the Division may recommend these areas opened to shellfish harvest on a temporary basis. When management plan criteria are not met, or the public health appears to be jeopardized, the Division recommends immediate closure of the area.
Restricted Areas - An area is classified as Restricted when the sanitary survey indicates a limited degree of pollution, and the area is not contaminated to the extent that indicates that consumption of shellfish could be hazardous after controlled depuration or relaying. According to Shellfish Sanitation Section Staff, shellfish may be transported from restricted areas to other areas for cleansing for a minimum of 14 days.
Prohibited Areas - Areas are classified as Prohibited if there is no current sanitary survey, or if sanitary survey information indicates that the area does not meet criteria for an Approved, Conditionally Approved, or Restricted Area. In addition, areas are classified as Prohibited if the growing area is within a wastewater treatment plant outfall buffer zone or in the immediate vicinity of a marina (unless it has fewer than 30 slips, has no boats over 24 feet in length, or has no boats with heads or cabins). Specific growing area limits are included in Section .0900 of NCAC.
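The numeric part of the Approved Area criteria (item 3 above) can be expressed as a short check on a set of MPN results. The sketch below is a simplified Python reading of that rule, not regulatory code: the function name is hypothetical, and it assumes that satisfying either the median or the geometric mean limit is sufficient and that all MPN values are positive.

```python
import math
import statistics

def meets_approved_fib_criterion(mpn_values, central_limit=14.0,
                                 exceedance_limit=43.0,
                                 max_exceed_fraction=0.10):
    """Check fecal coliform MPN results (per 100 ml) against the limits in
    item 3: median or geometric mean <= 14, and no more than 10 percent of
    samples above 43."""
    median = statistics.median(mpn_values)
    geo_mean = math.exp(sum(math.log(m) for m in mpn_values) / len(mpn_values))
    central_ok = median <= central_limit or geo_mean <= central_limit
    n_exceed = sum(1 for m in mpn_values if m > exceedance_limit)
    return central_ok and n_exceed / len(mpn_values) <= max_exceed_fraction

# Hypothetical monitoring results for one station.
station = [2.0, 4.0, 4.5, 7.0, 11.0, 13.0, 17.0, 22.0, 33.0, 49.0]
print(meets_approved_fib_criterion(station))
```

A check of this kind operates on MPN point estimates only, which is the practice the modeling strategy in Chapter 6 is intended to improve upon.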
Bibliography
Adam, R. D. (1991). The biology of Giardia spp. Microbiological Reviews 55, 4,706–732.
Alley, W. M. and Smith, P. E. (1981). Estimation of accumulation parameters forurban runoff quality modeling. Water Resources Research 17, 6, 1657–1664.
An, K. H., Lassus, P., Maggi, P., Bardouil, M., and Truquet, P. (1992). Dinoflag-ellate cyst changes and winter environmental-conditions in Vilaine Bay, SouthernBrittany (France). Botanica Marina 35, 1, 61–67.
APHA (2005). Standard methods for the examination of water and wastewater. Amer-ican Public Health Association, Washington, DC, 20th edn.
Arega, F. and Sanders, B. F. (2004). Dispersion model for tidal wetlands. Journalof Hydraulic Engineering-ASCE 130, 8, 739–754.
Ashbolt, N. J., Grohmann, G. S., and Kueh, C. S. W. (1993). Significance of spe-cific bacterial pathogens in the assessment of polluted receiving waters of Sydney,Australia. Water Science and Technology 27, 3/4, 449–452.
Aspinall, L. J. and Kilsby, D. C. (1979). A microbiological quality-control procedurebased on tube counts. Journal of Applied Bacteriology 46, 2, 325–329.
Auer, M. T. and Niehaus, S. L. (1993). Modeling Fecal Coliform Bacteria–I. Fieldand Laboratory Determination of Loss Kinetics. Water Research 27, 4, 693–701.
Badenoch, J., Bartlett, L., Benton, C., Casemore, D., Cawthorne, R., Earnshaw, F.,Ives, K., Jeffery, J., Smith, H., Vaile, M., Warrell, D., and Wright, A. (1990). Cryp-tosporidium in water supplies. Report of the group experts. Tech. rep., Departmentof the Environment, Department of Health. London, UK. HMSO.
Barbe, D. E., Cruise, J. F., and Mo, X. (1996). Modeling the buildup and washoff ofpollutants on urban watersheds. Water Resources Bulletin 32, 3, 511–519.
Bekins, B. A., Godsy, E. M., and Warren, E. (1999). Distribution of microbialphysiologic types in an aquifer contaminated by crude oil. Microbial Ecology 37,4, 263–275.
Beliaeff, B. and Mary, J.-Y. (1993). The most probable number estimate and itsconfidence-limits. Water Research 27, 5, 799–805.
Benham, B. L., Baffaut, C., Zeckoski, R. W., Mankin, K. R., Pachepsky, Y. A.,Sadeghi, A. M., Brannan, K. M., Soupir, M. L., and Habersack, M. J. (2006).Modeling bacteria fate and transport in watersheds to support TMDLs. Transac-tions of the ASABE 49, 4, 987–1002.
106
Bernardo, J. M. and Ramon, J. M. (1998). An introduction to Bayesian referenceanalysis: inference on the ratio of multinomial parameters 47, 1, 101–135.
Berry, D. A. (1996). Statistics: a Bayesian Perspective. Duxbury Press, Belmont, California.
Best, D. J. and Rayner, J. C. W. (1985). A comparison of the MPN and Fisher-Yates estimators for the density of organisms. Biometrical Journal 27, 2, 167–172.
Beven, K. (2001). How far can we go in distributed hydrological modelling? Hydrology and Earth System Sciences 5, 1, 1–12.
Bingham, A. K., Jarroll, E. L., and Meyer, E. A. (1979). Giardia-sp - physical factors of excystation in vitro, and excystation vs eosin exclusion as determinants of viability. Experimental Parasitology 47, 2, 284–291.
Blanchard, G. F., Sauriau, P. G., Gall, V. C. L., Gouleau, D., Garet, M. J., and Olivier, F. (1997). Kinetics of tidal resuspension of microbiota: Testing the effects of sediment cohesiveness and bioturbation using flume experiments. Marine Ecology-Progress Series 151, 17–25.
Bolstad, W. M. (2004). Introduction to Bayesian Statistics. Wiley-Interscience, Hoboken, N.J.
Borsuk, M. E., Stow, C. A., and Reckhow, K. H. (2002). Predicting the frequency of water quality standard violations: A probabilistic approach for TMDL development. Environmental Science & Technology 36, 10, 2109–2115.
Borsuk, M. E., Stow, C. A., and Reckhow, K. H. (2004). A Bayesian network of eutrophication models for synthesis, prediction, and uncertainty analysis. Ecological Modelling 173, 2-3, 219–239.
Bowie, G., Mills, W., Porcella, D., Campbell, C., and Chamberlin, C. (1985). Rates, constants, and kinetics formulations in surface water quality modeling. United States Environmental Protection Agency Office of Research and Development Environmental Research Laboratory, Washington, D.C., 2nd edn.
Briones, A. M. and Reichardt, W. (1999). Estimating microbial population counts by ‘most probable number’ using Microsoft Excel®. Journal of Microbiological Methods 35, 2, 157–161.
Buckalew, D. W., Hartman, L. J., Grimsley, G. A., Martin, A. E., and Register, K. M. (2006). A long-term study comparing membrane filtration with Colilert® defined substrates in detecting fecal coliforms and Escherichia coli in natural waters. Journal of Environmental Management 80, 3, 191–197.
Cabelli, V. J. (1983). Water-borne Viral Infections. In: M. Butler, R. Medlen and R. Morris (eds), “Viruses and Disinfection of Water and Wastewater.” Surrey Press, Guildford, England.
Cabelli, V. J., Dufour, A. P., McCabe, L. J., and Levin, M. A. (1983). A marine recreational water-quality criterion consistent with indicator concepts and risk analysis. Journal Water Pollution Control Federation 55, 10, 1306–1314.
Casella, G. and Berger, R. L. (2002). Statistical Inference. Duxbury, Pacific Grove, California.
Chapra, S. C. (1997). Surface water-quality modeling. McGraw-Hill series in water resources and environmental engineering. McGraw-Hill, New York.
Chapra, S. C., Pelletier, G. J., and Tao, H. (2007). QUAL2K: A modeling framework for simulating river and stream water quality, version 2.07: Documentation and user’s manual. Tech. rep., Civil and Environmental Engineering Dept., Tufts University.
Cochran, W. G. (1950). Estimation of bacterial densities by means of the ‘most probable number’. Biometrics 6, 2, 105–116.
Cooter, W. S. (2004). Clean Water Act assessment processes in relation to changing U.S. Environmental Protection Agency management strategies. Environmental Science & Technology 38, 20, 5265–5273.
Davies-Colley, R. J., Bell, R. G., and Donnison, A. M. (1994). Sunlight inactivation of Enterococci and fecal-coliforms in sewage effluent diluted in seawater. Applied and Environmental Microbiology 60, 6, 2049–2058.
de Man, J. C. (1977). MPN tables for more than one test. European Journal of Applied Microbiology and Biotechnology 4, 4, 307–316.
Dufour, A. P. and Cabelli, V. J. (1975). Membrane-filter procedure for enumerating component genera of coliform group in seawater. Applied Microbiology 29, 6, 826–833.
Dufour, A. P., Strickland, E. R., and Cabelli, V. J. (1981). Membrane-filter method for enumerating Escherichia coli. Applied and Environmental Microbiology 41, 5, 1152–1158.
Eckford, R. E. and Fedorak, P. M. (2005). Applying a most probable number method for enumerating planktonic, dissimilatory, ammonium-producing, nitrate-reducing bacteria in oil field waters. Canadian Journal of Microbiology 51, 8, 725–729.
Eckner, K. F. (1998). Comparison of membrane filtration and multiple-tube fermentation by the Colilert and Enterolert methods for detection of waterborne coliform bacteria, Escherichia coli, and enterococci used in drinking and bathing water quality monitoring in southern Sweden. Applied and Environmental Microbiology 64, 8, 3079–3083.
Eisenhart, C. and Wilson, P. W. (1943). Statistical methods and control in bacteriology. Bacteriological Reviews 7, 2, 57–137.
Esham, E. C. and Sizemore, R. K. (1998). Evaluation of two techniques: mFC and mTEC for determining distributions of fecal pollution in small, North Carolina tidal creeks. Water Air and Soil Pollution 106, 1, 179–197.
Fegan, N., Higgs, G., Vanderlinde, P., and Desmarchelier, P. (2004). Enumeration of Escherichia coli O157 in cattle faeces using most probable number technique and automated immunomagnetic separation. Letters in Applied Microbiology 38, 1, 56–59.
Ferguson, C., Husman, A. M. D., Altavilla, N., Deere, D., and Ashbolt, N. (2003). Fate and transport of surface water pathogens in watersheds. Critical Reviews in Environmental Science and Technology 33, 3, 299–361.
Fischer, H. B. (1979). Mixing in inland and coastal waters. Academic Press, New York.
Food and Drug Administration and Interstate Shellfish Sanitation Conference (2005). National Shellfish Sanitation Program - guide for the control of molluscan shellfish.
Gameson, A. and Gould, D. (1974). Effects of solar radiation on the mortality of some terrestrial bacteria in sea water. In International Symposium on Discharge of Sewage from Sea Outfalls, Paper No. 22, London. Pergamon Press.
Garthright, W. E. (1993). Bias in the logarithm of microbial density estimates from serial dilutions. Biometrical Journal 35, 3, 299–314.
Garthright, W. E. (1997). A Bayesian analysis of serial dilutions offers a worse positive bias than the MPN and proposes an inappropriate interval estimate. Food Microbiology 14, 5, 515–517.
Gelman, A. (2006). Prior distributions for variance parameters in hierarchical models (comment on article by Browne and Draper). Bayesian Analysis 1, 3, 515–534.
Ghinsberg, R. C., Dov, L. B., Sheinberg, Y., Nitzan, Y., and Rogol, M. (1994). Monitoring of selected bacteria and fungi in sand and sea-water along the Tel-Aviv coast. Microbios 77, 310, 29–40.
Grant, S. B., Sanders, B. F., Boehm, A. B., Redman, J. A., Kim, J. H., Mrse, R. D., Chu, A. K., Gouldin, M., McGee, C. D., Gardiner, N. A., Jones, B. H., Svejkovsky, J., and Leipzig, G. V. (2001). Generation of Enterococci bacteria in a coastal saltwater marsh and its impact on surf zone water quality. Environmental Science & Technology 35, 12, 2407–2416.
Greenwood, M. and Yule, G. U. (1917). On the statistical interpretation of some bacteriological methods employed in water analysis. The Journal of Hygiene 16, 1, 36–54.
Gronewold, A. D., Borsuk, M. E., Wolpert, R. L., and Reckhow, K. H. (2008). An assessment of fecal indicator bacteria-based water quality standards. Environmental Science & Technology 42, 13, 4676–4682.
Gronewold, A. D. and Reckhow, K. H. (2007). Developing a Bayesian network model for bacteriologically impaired surface waters. In proceedings of the 7th International (IWA) Symposium on Systems Analysis and Integrated Assessment in Water Management (Washington, D.C., USA).
Gronewold, A. D. and Wolpert, R. L. (2008). Modeling the relationship between most probable number (MPN) and colony-forming unit (CFU) estimates of fecal coliform concentration. Water Research 42, 13, 3327–3334.
Gronewold, A. D., Wolpert, R. L., Noble, R. T., Coulliette, A. D., and Reckhow, K. H. (2007). Developing a Bayesian network model for supporting fecal coliform TMDL assessments. In proceedings of the Water Environment Federation Specialty Conference - TMDL 2007 (Bellevue, Washington, USA).
Hackney, C. R. and Pierson, M. D. (1994). Environmental indicators and shellfish safety. Chapman & Hall, New York.
Harris, A. S. D., Jones, K. J., and Lewis, J. (1998). An assessment of the accuracy and reproducibility of the most probable number (MPN) technique in estimating numbers of nutrient stressed diatoms in sediment samples. Journal of Experimental Marine Biology and Ecology 231, 1, 21–30.
Horowitz, A. (1986). Comparison of methods for the concentration of suspended sediment in river water for subsequent chemical analysis. Environmental Science & Technology 20, 2, 155–160.
Houck, O. A. (2002). The Clean Water Act TMDL program: law, policy, and implementation. Environmental Law Institute, Washington, D.C., 2nd edn.
Hurley, M. A. and Roscoe, M. E. (1983). Automated statistical analysis of microbial enumeration by dilution series. Journal of Applied Bacteriology 55, 1, 159–164.
Irvine, K. N. and Pettibone, G. W. (1993). Dynamics of indicator bacteria populations in sediment and river water near a combined sewer outfall. Environmental Technology 14, 6, 531–542.
Jakeman, A. J. and Letcher, R. A. (2003). Integrated assessment and modelling: Features, principles and examples for catchment management. Environmental Modelling & Software 18, 6, 491–501.
Jeffreys, H. (1946). An invariant form for the prior probability in estimation problems. Proceedings of the Royal Society of London Series A–Mathematical and Physical Sciences 186, 1007, 453–461.
Jensen, F. V., Olesen, K. G., and Andersen, S. K. (1990). An algebra of Bayesian belief universes for knowledge-based systems. Networks 20, 5, 637–659.
Johnson, D. C., Enriquez, C. E., Pepper, I. L., Davis, T. L., Gerba, C. P., and Rose, J. B. (1997). Survival of Giardia, Cryptosporidium, poliovirus and salmonella in marine waters. Water Science and Technology 35, 11-12, 261–268.
Kashefipour, S. M., Lin, B., and Falconer, R. A. (2005). Neural networks for predicting seawater bacterial levels. Proceedings of The Institution of Civil Engineers-Water Management 158, 3, 111–118.
Ketchum, B. (1951). The exchanges of fresh and salt waters in tidal estuaries. Journal of Marine Research 10, 1, 18–38.
Kinzelman, J., Ng, C., Jackson, E., Gradus, S., and Bagley, R. (2003). Enterococci as indicators of Lake Michigan recreational water quality: Comparison of two methodologies and their impacts on public health regulatory events. Applied and Environmental Microbiology 69, 1, 92–96.
Klee, A. J. (1993). A computer-program for the determination of most probable number and its confidence-limits. Journal of Microbiological Methods 18, 2, 91–98.
Kloot, R. W., Radakovich, B., Huang, X.-Q., and Brantley, D. (2006). A comparison of bacterial indicators and methods in rural surface waters. Environmental Monitoring and Assessment 121, 1, 275–287.
Kuo, A. and Neilson, B. (1988). Modified Tidal Prism Model for Water Quality in Small Coastal Embayments. Water Science and Technology 20, 6/7, 133–142.
Kuo, A., Park, K., Kim, S., and Lin, J. (2005). A Tidal Prism Water Quality Model for Small Coastal Basins. Coastal Management 33, 1, 101–117.
Larrazabal, M. E., Lassus, P., Maggi, P., and Bardouil, M. (1990). Modern dinoflagellate kysts in Vilaine Bay Southern Brittany (France). Cryptogamie Algologie 11, 3, 171–185.
LeClerc, H., Mossel, D. A. A., Edberg, S. C., and Struijk, C. B. (2001). Advances in the bacteriology of the coliform group: Their suitability as markers of microbial water safety. Annual Review of Microbiology 55, 201–234.
Lee, J. H. and Bang, K. W. (2000). Characterization of urban stormwater runoff. Water Research 34, 6, 1773–1780.
Levin, M. A., Fischer, J. R., and Cabelli, V. J. (1975). Membrane filter technique for enumeration of enterococci in marine waters. Applied Microbiology 30, 1, 66–71.
Luketina, D. (1998). Simple Tidal Prism Models Revisited. Estuarine, Coastal and Shelf Science 46, 1, 77–84.
Lunn, D. J., Thomas, A., Best, N., and Spiegelhalter, D. (2000). WinBUGS - A Bayesian modelling framework: Concepts, structure, and extensibility. Statistics and Computing 10, 4, 325–337.
Mancini, J. L. (1978). Numerical estimates of coliform mortality-rates under various conditions. Journal Water Pollution Control Federation 50, 11, 2477–2484.
McBride, G. B. (2003). Preparing exact most probable number (MPN) tables using occupancy theory, and accompanying measures of uncertainty. NIWA Technical Report 121 62.
McBride, G. B. (2005). Using statistical methods for water quality management. Issues, problems and solutions. John Wiley & Sons Ltd, Chichester, UK.
McBride, G. B., McWhirter, J. L., and Dalgety, M. H. (2003). Uncertainty in most probable number calculations for microbiological assays. Journal of AOAC International 86, 5, 1084–1088.
McCrady, M. H. (1915). The numerical interpretation of fermentation tube results. Journal of Infectious Diseases 17, 1, 183–212.
McMurry, S. W., Coyne, M. S., and Perfect, E. (1998). Fecal coliform transport through intact soil blocks amended with poultry manure. Journal of Environmental Quality 27, 1, 86–92.
Medema, G. J., Bahar, M., and Schets, F. M. (1997). Survival of Cryptosporidium parvum, Escherichia coli, faecal Enterococci and Clostridium perfringens in river water: Influence of temperature and autochthonous microorganisms. Water Science and Technology 35, 11, 249–252.
Menyah, M. K. and Sato, K. (1996). A proposal for re-evaluating the most probable number procedure for estimating numbers of Bradyrhizobium spp. Biology and Fertility of Soils 23, 2, 110–112.
Mitchell, R. and Chamberlin, C. (1979). Indicators of viruses in water and food (edited by Berg G.). 1–12. Ann Arbor Science Publishers, Inc, Ann Arbor, MI.
Moeller, J. R. and Calkins, J. (1980). Bactericidal agents in waste-water lagoons and lagoon design. Journal Water Pollution Control Federation 52, 10, 2442–2451.
National Research Council (2001). Assessing the TMDL approach to water quality management.
N.C. Department of Environment and Natural Resources (2004). Coastal recreational waters monitoring, evaluation, and notification rules: 15A NCAC 18A .3400.
NCDENR (2007). Study on comparison between CFU and MPN estimates of fecal coliform concentration.
Neumann, C. M., Harding, A. K., and Sherman, J. M. (2006). Oregon Beach monitoring program: Bacterial exceedances in marine and freshwater creeks/outfall samples, October 2002-April 2005. Marine Pollution Bulletin 52, 10, 1270–1277.
Nix, P. G., Daykin, M. M., and Vilkas, K. L. (1993). Sediment bags as an integrator of fecal contamination in aquatic systems. Water Research 27, 10, 1569–1576.
Noble, R. T. and Fuhrman, J. A. (1997). Virus decay and its causes in coastal waters. Applied and Environmental Microbiology 63, 1, 77–83.
Noble, R. T., Moore, D. F., Leecaster, M. K., McGee, C. D., and Weisberg, S. B. (2003a). Comparison of total coliform, fecal coliform, and enterococcus bacterial indicator response for ocean recreational water quality testing. Water Research 37, 7, 1637–1643.
Noble, R. T., Weisberg, S. B., Leecaster, M. K., McGee, C. D., Ritter, K. J., Walker, K. O., and Vainik, P. M. (2003b). Comparison of beach bacterial water quality indicator measurement methods. Environmental Monitoring and Assessment 81, 1, 301–312.
Novotny, V. and Olem, H. (1994). Water quality: Prevention, identification, and management of diffuse pollution. Van Nostrand Reinhold, New York, 1st edn.
Obiri-Danso, K. and Jones, K. (2000). Intertidal sediments as reservoirs for hippurate negative Campylobacters, Salmonellae and faecal indicators in three E.U. recognised bathing waters in northwest England. Water Research 34, 2, 519–527.
Ott, W. (1995). Environmental statistics and data analysis. Lewis Publishers, Boca Raton.
Papen, H. and von Berg, R. (1998). A most probable number method (MPN) for the estimation of cell numbers of heterotrophic nitrifying bacteria in soil. Plant and Soil 199, 1, 123–130.
Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann Publishers, San Mateo, Calif.
Qian, S. S., Donnelly, M., Schmelling, D. C., Messner, M., Linden, K. G., and Cotton, C. (2004). Ultraviolet light inactivation of protozoa in drinking water: a Bayesian meta-analysis. Water Research 38, 2, 317–326.
Qian, S. S., Linden, K. G., and Donnelly, M. (2005). A Bayesian analysis of mouse infectivity data to evaluate the effectiveness of using ultraviolet light as a drinking water disinfectant. Water Research 39, 17, 4229–4239.
R Development Core Team (2006). R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. ISBN 3-900051-07-0.
Reckhow, K. H. (1994). Water-quality simulation modeling and uncertainty analysis for risk assessment and decision-making. Ecological Modelling 72, 1, 1–20.
Reckhow, K. H. (1999). Water quality prediction and probability network models. Canadian Journal of Fisheries and Aquatic Sciences 56, 7, 1150–1158.
Reeves, R. L., Grant, S. B., Mrse, R. D., Oancea, C. M. C., Sanders, B. F., and Boehm, A. B. (2004). Scaling and management of fecal indicator bacteria in runoff from a coastal urban watershed in southern California. Environmental Science & Technology 38, 9, 2637–2648.
Rippey, S. R., Adams, W. N., and Watkins, W. D. (1987). Enumeration of fecal-coliforms and Escherichia-coli in marine and estuarine waters - an alternative to the APHA-MPN approach. Journal Water Pollution Control Federation 59, 8, 795–798.
Rompre, A., Servais, P., Baudart, J., de Roubin, M.-R., and Laurent, P. (2002). Detection and enumeration of coliforms in drinking water: current methods and emerging approaches. Journal of Microbiological Methods 49, 1, 31–54.
Rose, R. E., Geldreich, E. E., and Litsky, W. (1975). Improved membrane-filter method for fecal coliform analysis. Applied Microbiology 29, 4, 532–536.
Roussanov, B., Hawkins, D. M., and Tatini, S. R. (1996). Estimating bacterial density from tube dilution data by a Bayesian method. Food Microbiology 13, 5, 341–363.
Russek, E. and Colwell, R. R. (1983). Computation of most probable numbers. Applied and Environmental Microbiology 45, 5, 1646–1650.
Salomon, J. C. and Pommepuy, M. (1990). Mathematical-model of bacterial-contamination of the Morlaix estuary (France). Water Research 24, 8, 983–994.
Sanders, B. F., Arega, F., and Sutula, M. (2005). Modeling the dry-weather tidal cycling of fecal indicator bacteria in surface waters of an intertidal wetland. Water Research 39, 14, 3394–3408.
Sanford, L., Boicourt, W., and Rives, S. (1992). Model for estimating tidal flushing of small embayments. Journal of Waterway, Port, Coastal and Ocean Engineering 118, 6, 635–654.
Sayler, G. S., Nelson, J. D., Justice, A., and Colwell, R. R. (1975). Distribution and significance of fecal indicator organisms in Upper Chesapeake Bay. Applied Microbiology 30, 4, 625–638.
Schijven, J. F. and Hassanizadeh, S. M. (2000). Removal of viruses by soil passage: Overview of modeling, processes, and parameters. Critical Reviews in Environmental Science and Technology 30, 1, 49–127.
Schijven, J. F. and Hassanizadeh, S. M. (2002). Virus removal by soil passage at field scale and groundwater protection of sandy aquifers. Water Science and Technology 46, 3, 123–129.
Schmee, J. and Hahn, G. J. (1979). A simple method for regression analysis with censored data. Technometrics 21, 4, 417–432.
Shen, J., Sun, S.-C., and Wang, T.-P. (2005). Development of the fecal coliform total maximum daily load using Loading Simulation Program C++ and tidal prism model in estuarine shellfish growing areas: A case study in the Nassawadox coastal embayment, Virginia. Journal of Environmental Science and Health, Part A 40, 9, 1791–1807.
Smith, E. P., Ye, K. Y., Hughes, C., and Shabman, L. A. (2001). Statistical assessment of violations of water quality standards under section 303(d) of the Clean Water Act. Environmental Science & Technology 35, 3, 606–612.
Spiegelhalter, D., Dawid, A., Lauritzen, S., and Cowell, R. (1993). Bayesian Analysis in Expert Systems. Statistical Science 8, 3, 219–247.
Spiegelhalter, D. J., Thomas, A., Best, N. G., and Lunn, D. J. (2003). WinBUGS version 1.4 user manual. Tech. rep., Medical Research Council Biostatistics Unit, Cambridge, UK.
Tanner, M. A. (1991). Tools for Statistical Inference. Springer-Verlag, New York, NY.
Thomann, R. V. and Mueller, J. A. (1987). Principles of surface water quality modeling and control. Harper & Row, New York.
Thomas, G. W. and Phillips, R. E. (1979). Consequences of water-movement in macropores. Journal of Environmental Quality 8, 2, 149–152.
Tillett, H. E. and Coleman, R. (1985). Estimated numbers of bacteria in samples from non-homogeneous bodies of water - how should MPN and membrane filtration results be reported. Journal of Applied Bacteriology 59, 4, 381–388.
Tzipori, S. (1983). Cryptosporidiosis in animals and humans. Microbiological Reviews 47, 1, 84–96.
U.S. Environmental Protection Agency (2001). Protocol for developing pathogen TMDLs. Tech. Rep. EPA 841-R-00-002, Office of Water (4503F), United States Environmental Protection Agency, Washington, DC.
U.S. Environmental Protection Agency (2002). National water quality inventory: Report to Congress (2002 reporting cycle), EPA 841-R-07-001.
U.S. Environmental Protection Agency (2005a). Code of federal regulations: Title 40, chapter 1, part 141.
U.S. Environmental Protection Agency (2005b). Guidance for 2006 assessment, listing and reporting requirements pursuant to sections 303(d), 305(b) and 314 of the Clean Water Act.
U.S. Geological Survey (1996). Water quality of the Lower Columbia River Basin: Analysis of current and historical water-quality data through 1994 (Water-resources investigations report 95-4294), 52-53. Tech. rep., U.S. Geological Survey.
Vandenberghe, V., Bauwens, W., and Vanrolleghem, P. A. (2007). Evaluation of uncertainty propagation into river water quality predictions to guide future monitoring campaigns. Environmental Modelling & Software 22, 5, 725–732.
Weisberg, S. (2005). Applied linear regression. Wiley series in probability and statistics. Wiley-Interscience, Hoboken, N.J., 3rd edn.
Weiskel, P. K., Howes, B. L., and Heufelder, G. R. (1996). Coliform contamination of a coastal embayment: Sources and transport pathways. Environmental Science & Technology 30, 6, 1872–1881.
White, N. M., Line, D. E., Potts, J. D., Kirby-Smith, W., Doll, B., and Hunt, W. F. (2000). Jump Run Creek shellfish restoration project. Journal of Shellfish Research 19, 1, 473–476.
Woodward, R. L. (1957). How probable is the most probable number? Journal of the American Water Works Association 49, 1, 1060–1068.
Woomer, P. L., Bennett, J., and Yost, R. (1990). Overcoming the inflexibility of most-probable-number procedures. Agronomy Journal 82, 2, 349–353.
Biography
My research and career objectives first took shape during my undergraduate education
at Cornell University’s School of Civil and Environmental Engineering. After
graduating from Cornell in 1995, I was employed as a project manager and licensed
professional engineer with the environmental engineering consulting firms Stearns &
Wheler, LLC and the Ecological Engineering Group, Inc. Between 1995 and 2003,
I initiated and completed over forty planning, design, and construction projects in
the areas of wastewater, water, and solid waste management. Significant project
accomplishments include obtaining grant funding for point and non-point source pollution
mitigation projects in small communities through the Massachusetts Coastal Zone
Management (CZM) Coastal Pollutant Remediation (CPR) program, and serving as
the resident engineer during the closure of a 54-acre municipal solid waste landfill.
I also completed a series of comprehensive watershed and wastewater management
planning studies for rapidly growing communities in southeastern Massachusetts.
Each planning project included a detailed analysis of environmental management
infrastructure alternatives, evaluation of public policy and regulatory issues, and
extensive field work to determine hydrogeological and surface water quality conditions.
In addition to my work as an environmental engineer, I began supervising and
coordinating a wide variety of research projects in 1999 as a scientist and teacher
with the Sea Education Association (SEA) based in Woods Hole, Massachusetts. I
have since logged over 200 days at sea as a teacher with SEA while advising high
school and college-level students during the data gathering and report writing phases
of individual research projects, including non-point source pollution analysis of
nutrient loading in Samana Bay in the Dominican Republic and distribution of spiny
lobster larvae across the Gulf Stream. As a student with SEA, I investigated the
impacts of stormwater runoff on eutrophication in St. George’s Harbor, Bermuda,
and subsequent implications for outbreaks of shellfish-borne diseases such as paralytic
shellfish poisoning (PSP) and ciguatera. My experience with SEA provided a unique
perspective on global environmental problems through coastal research projects in
Nova Scotia, Bermuda, the Lesser Antilles, and Central America. My passion for
teaching, research, and pursuing graduate study was confirmed by my experience
with SEA, and my enthusiasm persisted through adverse conditions at sea such as
severe weather, sleep deprivation, and seasickness. Throughout my experiences in
engineering consulting and with SEA, however, I repeatedly questioned traditional
approaches to addressing uncertainty in water quality measurements, construction
cost estimates, and other critical environmental management decision criteria. These
questions inspired my return to graduate school and my work on Bayesian statistical
models.
I began graduate studies at the Nicholas School of the Environment at Duke
University under the guidance of Drs. Kenneth H. Reckhow and Robert L. Wolpert.
My research focused on applying statistical models to help solve environmental
resource and infrastructure management problems. I specialize in developing innovative
modeling tools which integrate monitoring data from multiple spatial and temporal
scales to characterize interrelated meteorological and hydrological processes, as well
as ecosystem response dynamics. My doctoral research focuses on developing
modeling tools for evaluating climate change, land use, and pollutant mitigation scenarios
to restore water quality in impaired shellfish harvesting waters in eastern North
Carolina. Significant contributions from this research include a new set of water quality
standards imposing limits on parameters of the true fecal bacteria concentration
(the applicable measure of water quality), as opposed to traditional standards which
impose limits on most probable number (MPN) and colony-forming unit (CFU)
concentration point estimates. This research recently appeared as a cover article in
Environmental Science & Technology. I also developed an innovative approach to
modeling the relationship between alternative measures of fecal coliform concentration,
which provided important guidance to shellfish harvesting area managers currently
debating a shift in standard laboratory protocol. This research recently appeared
in Water Research. The contributions of my graduate work to the scientific
community were acknowledged through several awards and scholarships, including the
Water Environment Federation Robert Canham Graduate Scholarship, the North
Carolina Association of Environmental Professionals Graduate Scholarship, and the
QEA, LLC Graduate Scholarship. In addition, I received an Outstanding Student
Paper Award for a presentation of my research at the American Geophysical Union
(AGU) Fall 2007 Meeting.
While conducting my research at Duke, I also served as the primary instructor
for the Nicholas School of the Environment graduate-level course in water quality
management, and periodically served as a guest lecturer for courses in water quality
modeling and probability. I consistently received positive evaluations from students
at Duke and at SEA, and was awarded the Nicholas School of the Environment
teaching assistant of the year award after my first year of graduate study.