Freshwater Sediment Standards Science Panel
August 25, 2010
Presenters: Russ McMillan – Department of Ecology; Teresa Michelsen – Avocet Consulting

  • Slide 1

Freshwater Sediment Standards Science Panel
August 25, 2010
Presenters: Russ McMillan, Department of Ecology; Teresa Michelsen, Avocet Consulting

  • Slide 2

Freshwater Sediment Standards: Goal For Today

Are the approaches used to develop chemical and biological criteria scientifically defensible?

  • Slide 3

Freshwater Sediment Standards: Discussion Points For Today

- Introduce regulatory and policy context
- Present proposed biological and chemical criteria and framework
- Identify and discuss scientific and technical issues

  • Slide 4

Freshwater Sediment Standards: Policy Decisions

- Consistency with SMS regulatory framework.
- Adopt biological and chemical criteria.
- Two tier structure: SQS and CSL.
- Allowance of some adverse effects.
- Biological override of chemical criteria.

  • Slide 5

Proposed Biological Sediment Standards: Regulatory Framework

- Confirmatory bioassays override chemistry
- Two tier structure: SQS and CSL
- Bioassay suite: multiple species/sensitive life-history stages
- Minimum of 3 endpoints
- Both acute and chronic tests

  • Slide 6

Proposed Suite of Bioassays

- Narrower choice of species than SMS marine criteria due to limited bioassay availability.
- Built on EPA and ASTM protocols well established in the Northwest:
    - Hyalella 10-day mortality: 366
    - Hyalella 28-day mortality: 312
    - Hyalella 28-day growth: 79
    - Chironomus 10-day growth: 525
    - Chironomus 10-day mortality: 568

  • Slide 7

Proposed Suite of Bioassays

- Consistent w/SMS designation of sediment quality
    - SQS: Single SQS level hit
    - CSL: 2+ SQS level hits; 1+ CSL level hit
- Consistent w/SMS sensitive endpoints.

  • Slide 8

Requirements for Proposed Bioassay Suite

Bioassay suite will include sensitive endpoints from chronic and acute tests:

- 3 Endpoints
- 2 Species
- 1 Chronic test
- 1 Sublethal endpoint

  • Slide 9

Bioassays: Acute and Chronic and Endpoint Effects Levels

Test | Acute Bioassays | Chronic Bioassays | Lethal Endpoint | Sub-lethal Endpoint
--- | --- | --- | --- | ---
Hyalella azteca 10-day mortality | X | | X |
Hyalella azteca 28-day mortality | | X | X |
Hyalella azteca 28-day growth | | X | | X
Chironomus dilutus 10-day mortality | X | | X |
Chironomus dilutus 10-day growth | X | | | X
Chironomus dilutus 20-day mortality | | X | X |
Chironomus dilutus 20-day growth | | X | | X

  • Slide 10

Proposed Bioassay Suite Question

Is the proposed bioassay suite scientifically defensible as being appropriately protective of the benthic community?

  • Slide 11

Proposed Freshwater Bioassay Interpretation

- Differs from SMS marine interpretation due to difficulty in identifying appropriate reference areas.
- Due to lack of reference sites, interpretation of an SQS and CSL hit is based on a comparison to control.
- Comparison to control is a more conservative interpretation than comparison to a reference sediment.

  • Slide 12

Bioassay Interpretation: Comparison to Control

Test | QA limits: Control | QA limits: Reference | SQS | CSL
--- | --- | --- | --- | ---
Hyalella azteca 10-day mortality | C ≤ 20% | R ≤ 25% | T - C > 15% | T - C > 25%
Hyalella azteca 28-day mortality | C ≤ 20% | R ≤ 30% | T - C > 10% | T - C > 25%
Hyalella azteca 28-day growth | CF ≥ 0.15 mg/ind | RF ≥ 0.15 mg/ind | T/C < 0.75 | T/C < 0.6
Chironomus dilutus 10-day mortality | C ≤ 30% | R ≤ 30% | T - C > 20% | T - C > 30%
Chironomus dilutus 10-day growth | CF ≥ 0.48 mg/ind | RF/CF ≥ 0.8 | T/C < 0.8 | T/C < 0.7
Chironomus dilutus 20-day mortality | C ≤ 32% | R ≤ 35% | T - C > 15% | T - C > 25%
Chironomus dilutus 20-day growth | CF ≥ 0.48 mg/ind | RF/CF ≥ 0.8 | T/C < 0.75 | T/C < 0.6

C = Control, R = Reference, T = Test, F = Final (SQS and CSL hits: statistically significant difference).

  • Slide 13

Question

- Is it scientifically defensible to base interpretation of a bioassay hit by using a comparison to control rather than a comparison to a reference sediment?
- Is it scientifically defensible to base the designation of sediment quality on a suite of bioassays comparing test to control without the benefit of a reference sediment?

  • Slide 14

[No text on this slide]

  • Slide 15

History of Freshwater SQV Development

- Early work on FW Apparent Effects Thresholds (AETs) & Floating Percentile Method (FPM; Portland Harbor) throughout the late 1990s
- 2002: Formal evaluation of FW AETs and other existing SQV sets (TELs/PELs, etc.)
- Decision that a new approach was needed:
    - FW AETs not sufficiently conservative
    - TELs/PELs, etc., far too conservative
    - National evaluations were not looking at both types of statistical errors

  • Slide 16

Statistical Digression

- False Negative (FN) = Predicting that a sample will be non-toxic when it is actually toxic
- False Positive (FP) = Predicting that a sample will be toxic when it is actually non-toxic

Existing national methods were focused on reducing false negatives at lower screening levels and false positives at upper screening levels, creating substantial errors and inefficiencies in between, where most actual data are located. We focused on reducing both types of errors at the same time, for all levels of effects.

  • Slide 17

History, cont.

- 2003: Developed interim FPM values, used as guidance by Ecology and in the regional dredging manual (SEF)
- 2007: Regional Sediment Evaluation Team (RSET) OR/WA workgroup formed to update FPM SQVs
- 2008: RSET technical work completed; final SQV selection still under discussion and review
- 2009/2010: Ecology begins rule revision, finalizes the values, begins peer review

  • Slide 18

Past and current review

- Presentations at 5 national and regional scientific conferences (1999-2009)
- DEQ-led peer review & public meetings during Portland Harbor (2001 state site)
- Public and agency review of 2003 Ecology report
- Presentations at 4 SMARMs (2003-2010) + numerous RSET public meetings
- SMS Advisory Workgroup peer review of approach and draft SQV report (2010)
- Science Panel and national peer review (2010)

  • Slide 19

Projects/Guidance to Date

- 1999, 2003, 2008: Portland Harbor, OR
- 2001: Onondaga Lake, NY
- 2003: Los Angeles Harbor, CA
- 2004: San Francisco Bay/Oakland Harbor, CA
- 2003: Draft Ecology Freshwater Guidelines, WA
- 2006: Interim Sediment Evaluation Framework Guidelines for WA/OR/ID
- 2010: Updated Freshwater Guidelines, WA

  • Slide 20

Question 1: General Approach

- Is it appropriate to use sediment bioassays to represent effects to the benthic community?
- Is the use of a multivariate model to empirically derive chemical SQVs scientifically defensible?

  • Slide 21

Use of toxicity tests

- Marine AETs were derived based on toxicity tests, benthic community studies, and chemistry
- Benthic community studies were considered equivalent to a chronic bioassay
- Freshwater benthic community data were searched for but not found in OR, WA, or ID
- Substantial diversity of freshwater sites compared to marine areas complicates use of benthic community data, if there were any

  • Slide 22

Floating Percentile Method

Goal: Minimize false negatives and false positives simultaneously

Approach:

- Data QA, screening, and summing
- Determine true toxicity based on bioassay results
- Search for the most predictive chemical concentrations, allowing each chemical to move independently to the level at which it appears to be toxic

  • Slide 23

Features

- Uses synoptic chemistry and bioassay field data
- Multivariate: considers all chemicals at once
- Incorporates measures to address covariance
- Requires selection of false negative target
- Follows with optimization of false positives
- Repeat for a range of false negative targets
- Thoroughly evaluates reliability
- Now automated using a series of Excel spreadsheets

  • Slide 24

[No text on this slide]

  • Slide 25

Covariance

- All chemicals are addressed simultaneously, which avoids inappropriate assignment of toxicity
- Covariance analysis can be run ahead of time
- Model results allow visual identification of covariance patterns
- Appropriate chemical classes can be summed
- Those that can't be summed but often covary are subjected to multiple runs with different starting points to find the low concentration for each chemical, then selection of the combination with the highest reliability

  • Slide 26

Question 2: Data Issues

- Is the data set sufficiently robust and representative?
- Has appropriate data screening and QA been conducted?
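
As a rough illustration of the search described on Slides 22 and 23, the sketch below floats each chemical's criterion upward independently while a chosen false negative target is still met. It is a simplified toy version, not the Excel implementation mentioned in the deck. The names `float_criteria`, `error_rates`, and `predicted_toxic` are hypothetical, and three choices are assumptions made for illustration: a sample is predicted toxic when any chemical exceeds its criterion, candidate levels are the observed concentrations, and chemicals are stepped one level at a time in round-robin order.

```python
from typing import Dict, List, Sequence, Tuple

Sample = Dict[str, float]  # chemical name -> measured concentration


def predicted_toxic(sample: Sample, criteria: Dict[str, float]) -> bool:
    # Assumption: any single exceedance predicts toxicity.
    return any(sample.get(chem, 0.0) > limit for chem, limit in criteria.items())


def error_rates(samples: Sequence[Sample], toxic: Sequence[bool],
                criteria: Dict[str, float]) -> Tuple[float, float]:
    """Return (false negative rate, false positive rate) as fractions."""
    preds = [predicted_toxic(s, criteria) for s in samples]
    fn = sum(1 for p, t in zip(preds, toxic) if t and not p)
    fp = sum(1 for p, t in zip(preds, toxic) if p and not t)
    n_toxic = sum(1 for t in toxic if t)
    n_clean = len(toxic) - n_toxic
    return (fn / n_toxic if n_toxic else 0.0,
            fp / n_clean if n_clean else 0.0)


def float_criteria(samples: Sequence[Sample], toxic: Sequence[bool],
                   fn_target: float) -> Dict[str, float]:
    """Raise each chemical's criterion one observed level at a time,
    keeping a move only if the false negative rate stays within target."""
    chems = sorted({c for s in samples for c in s})
    levels = {c: sorted({s[c] for s in samples if c in s}) for c in chems}
    step = {c: 0 for c in chems}
    criteria = {c: levels[c][0] for c in chems}  # start at the lowest level
    moved = True
    while moved:
        moved = False
        for c in chems:
            if step[c] + 1 >= len(levels[c]):
                continue  # already at the highest observed level
            trial = {**criteria, c: levels[c][step[c] + 1]}
            fn_rate, _ = error_rates(samples, toxic, trial)
            if fn_rate <= fn_target:
                criteria = trial
                step[c] += 1
                moved = True
    return criteria


# Made-up data: toxicity tracks chemical A; chemical B is unrelated.
samples: List[Sample] = [
    {"A": 1.0, "B": 10.0}, {"A": 2.0, "B": 40.0}, {"A": 6.0, "B": 20.0},
    {"A": 8.0, "B": 30.0}, {"A": 3.0, "B": 50.0}, {"A": 9.0, "B": 5.0},
]
toxic = [False, False, True, True, False, True]
criteria = float_criteria(samples, toxic, fn_target=0.0)
print(criteria, error_rates(samples, toxic, criteria))
```

Raising a criterion can only remove predicted hits, so false positives never increase during the search; the false negative target is what stops each chemical. In this toy data set the unrelated chemical B floats to its highest observed level while A stops just below the lowest toxic concentration.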

  • Slide 27

Data Set: Chemistry

- Oregon and Washington
- West and east of the Cascade Mountains
- Lakes, rivers, small and large
- Various geochemical environments
- 50 analytes and sums; 105 chemicals
- Rigorous QA/QC applied (PSEP QA2); qualifiers rectified among data sets

  • Slide 28

Data Set: Bioassay Endpoints

- Hyalella 10-day mortality: 366
- Chironomus 10-day mortality: 550
- Chironomus 10-day growth: 504
- Hyalella 28-day mortality: 319
- Hyalella 28-day growth: 79
- Rigorous QA/QC applied: recent ASTM protocols, QA2 review of lab sheets, etc.

  • Slide 29

Question 3: Reliability Testing

- Is the reliability testing that was conducted an appropriate method for evaluating SQVs?
- Is the comparative reliability analysis that was conducted an appropriate way of making decisions while fine-tuning the approach?
- Are the reliability measures that were used the right ones and were the relative weights given to them appropriate?
- Are there alternative methods of validation that could have been used without collecting additional data?
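
Reliability measures of the kind Question 3 refers to are conventionally computed from counts of correct and incorrect predictions. The sketch below is a minimal illustration with made-up counts. It takes sensitivity as 100% minus the false negative rate and efficiency as 100% minus the false positive rate; the formulas for predicted hit, predicted no-hit, and overall reliability are the usual confusion-matrix definitions and are an assumption here, as are the names `Counts` and `reliability`.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class Counts:
    true_hits: int      # predicted toxic, actually toxic
    false_hits: int     # predicted toxic, actually non-toxic (false positives)
    true_no_hits: int   # predicted non-toxic, actually non-toxic
    false_no_hits: int  # predicted non-toxic, actually toxic (false negatives)


def _pct(num: int, den: int) -> float:
    # Percentage, or NaN when the denominator is empty.
    return 100.0 * num / den if den else float("nan")


def reliability(c: Counts) -> Dict[str, float]:
    toxic = c.true_hits + c.false_no_hits
    nontoxic = c.true_no_hits + c.false_hits
    return {
        "sensitivity": 100.0 - _pct(c.false_no_hits, toxic),
        "efficiency": 100.0 - _pct(c.false_hits, nontoxic),
        "predicted_hit": _pct(c.true_hits, c.true_hits + c.false_hits),
        "predicted_no_hit": _pct(c.true_no_hits, c.true_no_hits + c.false_no_hits),
        "overall": _pct(c.true_hits + c.true_no_hits, toxic + nontoxic),
    }


# Made-up counts for illustration only.
example = reliability(Counts(true_hits=40, false_hits=20,
                             true_no_hits=30, false_no_hits=10))
for name, value in example.items():
    print(f"{name}: {value:.1f}%")
```

Sensitivity and efficiency are conditioned on the true status of a sample, while the two predicted reliabilities are conditioned on the prediction, which is why a set of values can score well on one pair and poorly on the other.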

  • Slide 30

Reliability

- Sensitivity (100% - false negatives)
- Efficiency (100% - false positives)
- Predicted no-hit reliability
- Predicted hit reliability
- Overall reliability

All measures of reliability were used for ALL effects levels; endpoints given greater weight are shown in yellow.

  • Slide 31

[Figure: Reliability Measures Diagram]

  • Slide 32

Reliability Goals

The RSET Workgroup set the following goals before beginning SQV development:

Measure | SQS (%) | CSL (%)
--- | --- | ---
Sensitivity | 80-90 | 75-85
Efficiency | 70-80 | 75-85
Predicted Hit Reliability | 70-80 | 75-85
Predicted No-Hit Reliability | 80-90 | 75-85

Overall Reliability: 80-90

  • Slide 33

[No text on this slide]

  • Slide 34

Freshwater Standards Reliability

Values are averages across relevant assays.

[Figure panels: No Effect Level; Minor Effect Level]

  • Slide 35

Comparative Reliability Analysis

- East side vs. west side vs. combined
- TPH vs. PAH vs. combined
- Microtox: include?
- Hyalella growth: include Portland Harbor?
- Ammonia and sulfides issues
- N-qualified pesticides
- Blank-correction standardization
- Comparison to control vs. reference

  • Slide 36

SQV Validation

- RSET made a decision early on to not withhold part of the data set for validation
- For such a large heterogeneous area, we needed all the data to develop the best possible model
- Most other SQV sets have followed the same approach, with independent validation following after
- Independent validation will require a large, representative data set, not just a few projects
- Other validation methods available?

  • Slide 37

Question 4: Data Interpretation and Regulatory Decision-Making

- Is the overall framework and selection of final SQVs consistent with the SMS and marine SQVs?
- Is the approach used to select final SQVs scientifically defensible?

  • Slide 38

Challenges: Criteria Selection

Expectation: SQS values clustered below CSL values

[Figure labels: SQS, CSL]

  • Slide 39

Challenges: Criteria Selection

Reality: Differences between bioassays were much greater than differences between endpoints.

Try species sensitivity distribution approach.

  • Slide 40

- ">" values: no toxicity observed for that endpoint up to the listed concentration. Sample concentrations at or above this level should undergo toxicity testing.
- Approach for selection of CSL: next significantly different value

  • Slide 41

Questions?
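
The comparison-to-control interpretation on Slide 12 and the designation rule on Slide 7 can be sketched together as follows. This is an illustrative reading, not regulatory text. It assumes the mortality criteria are differences between test and control mortality in percentage points, the growth criteria are test-to-control ratios, the statistical significance test is performed elsewhere and passed in as a flag, and "2+ SQS level hits; 1+ CSL level hit" means either condition is enough for a CSL designation. The names `Criterion`, `classify_hit`, and `designate` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Criterion:
    kind: str   # "mortality" (T - C, percentage points) or "growth" (T / C)
    sqs: float
    csl: float


# Thresholds as listed on Slide 12.
CRITERIA: Dict[str, Criterion] = {
    "hyalella_10d_mortality": Criterion("mortality", 15.0, 25.0),
    "hyalella_28d_mortality": Criterion("mortality", 10.0, 25.0),
    "hyalella_28d_growth": Criterion("growth", 0.75, 0.6),
    "chironomus_10d_mortality": Criterion("mortality", 20.0, 30.0),
    "chironomus_10d_growth": Criterion("growth", 0.8, 0.7),
    "chironomus_20d_mortality": Criterion("mortality", 15.0, 25.0),
    "chironomus_20d_growth": Criterion("growth", 0.75, 0.6),
}


def classify_hit(endpoint: str, test: float, control: float,
                 significant: bool) -> str:
    """Return "CSL", "SQS", or "none" for one endpoint."""
    crit = CRITERIA[endpoint]
    if not significant:
        return "none"  # a hit must differ significantly from control
    if crit.kind == "mortality":
        diff = test - control
        if diff > crit.csl:
            return "CSL"
        if diff > crit.sqs:
            return "SQS"
        return "none"
    if control <= 0:
        raise ValueError("control growth must be positive")
    ratio = test / control
    if ratio < crit.csl:
        return "CSL"
    if ratio < crit.sqs:
        return "SQS"
    return "none"


def designate(hits: Iterable[str]) -> str:
    """Combine endpoint results for one station (assumed 'or' reading)."""
    hits = list(hits)
    if "CSL" in hits or hits.count("SQS") >= 2:
        return "exceeds CSL"
    if hits.count("SQS") == 1:
        return "exceeds SQS"
    return "passes"


station = [
    classify_hit("hyalella_10d_mortality", test=28.0, control=10.0, significant=True),
    classify_hit("chironomus_10d_growth", test=0.55, control=0.60, significant=True),
    classify_hit("hyalella_28d_growth", test=0.30, control=0.32, significant=False),
]
print(station, designate(station))
```

The control QA limits from Slide 12 are not checked here; a fuller version would reject a test batch whose control fails those limits before scoring any hits.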

  • Slide 42

Petroleum Toxicity

PAHs contribute to the greatest number of errors:

- PAHs do not sufficiently capture petroleum toxicity
- No form of normalizing or summing solves the problem
- Addressed through side-by-side PAH/TPH & combined model runs
- Best results when both were included
- May be legacy issues with not enough TPH data

  • Slide 43

Summation of Chemical Classes

- PAHs/TPH classes
- PCB Aroclors
- Dioxins/Furans
- Chlordanes
- DDT, DDE, DDD isomers
- Heptachlors
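
Summing a chemical class before screening might look like the sketch below. The class memberships shown are illustrative examples rather than the lists used for the standards, and the handling of non-detects (skipped, recorded here as `None`) is an assumption, since the slides do not say how they were treated. The names `CLASSES` and `sum_classes` are hypothetical.

```python
from typing import Dict, Mapping, Optional, Sequence

# Illustrative memberships only; the actual class definitions may differ.
CLASSES: Dict[str, Sequence[str]] = {
    "Total PCB Aroclors": ("Aroclor 1242", "Aroclor 1254", "Aroclor 1260"),
    "Total DDDs": ("2,4'-DDD", "4,4'-DDD"),
}


def sum_classes(sample: Mapping[str, Optional[float]],
                classes: Mapping[str, Sequence[str]] = CLASSES) -> Dict[str, float]:
    """Replace individual class members with one summed value per class."""
    members = {name for names in classes.values() for name in names}
    # Keep detected chemicals that do not belong to any summed class.
    out = {k: v for k, v in sample.items() if k not in members and v is not None}
    for cls, names in classes.items():
        detected = [sample[n] for n in names if sample.get(n) is not None]
        if detected:  # omit the class when no member was detected
            out[cls] = sum(detected)
    return out


raw = {"Aroclor 1242": None, "Aroclor 1254": 12.0, "Aroclor 1260": 8.0, "Zinc": 150.0}
summed = sum_classes(raw)
print(summed)
```

Summed values like these would then be screened as a single analyte, in the same way as any individual chemical.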