Why quality control and quality assurance is important for the legacy of GEOTRACES through its database?

Adam Leadbetter ([email protected]), British Oceanographic Data Centre


Outline

- Data matter!

- Why compatible data?

- The GEOTRACES database

- A data intensive future…


1. Data Matter!

Presenter
Presentation Notes
Some quotes to set the context for my talk

Data matter!

“A scholar’s positive contribution is measured by the sum of the original data that he contributes. Hypotheses come and go but data remain.”

Santiago Ramón y Cajal (Nobel Prize winner, 1906), in Advice to a Young Investigator (1897)


Data matter!

“You are not finished until you have done the research, published the results, and published the data, receiving formal credit for everything. Preserve or Perish”

Mark Parsons, US National Snow and Ice Data Center, Data Management for the International Polar Year (2006)


2. Why compatible data?


Why compatible data?

“If HTML and the [World Wide] Web made all the online documents look like one huge book, [compatibility] will make all the data in the world look like one huge database.”

Sir Tim Berners-Lee, W3C, Weaving the Web (1999)


Why compatible data?

- The Linked Data cloud is built on compatible data

- Similarly, the GEOTRACES database builds on compatible data

- How?


Why compatible data?

- Intercalibration (IC) for QC / QA

- Only on the legacy database

- A distinction must be made where IC has not happened

- There may be older “compliant data” which do not meet the standards


Why compatible data?

- Standards

- Metadata
  - Bottle – type & make
  - Filter – type & make
  - Analytical method

- Parameter codes

- Allows data merging & long-term data archiving

Presenter
Presentation Notes
GEOTRACES sets metadata standards. These are based on BODC’s standards, which feed in to European / global standards (SeaDataNet / ISO / INSPIRE). Enough metadata are needed to load the data into the system: who, what, when, where, why (project / cruise reports). QC / QA means you’ve met the standard and the data have been screened. More detail in the sections below.
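
As a rough illustration of the metadata standard described on this slide, the Python sketch below checks that a submitted record carries its parameter code, bottle, filter and analytical method details before it is loaded. The class name, field names and example values are all hypothetical; they are not the BODC or GEOTRACES schema.

```python
from dataclasses import dataclass, fields
from typing import List


@dataclass
class SampleMetadata:
    """Hypothetical metadata record; field names are illustrative only."""
    parameter_code: str
    bottle_type: str
    bottle_make: str
    filter_type: str
    filter_make: str
    analytical_method: str


def missing_fields(record: SampleMetadata) -> List[str]:
    """Return the names of fields left empty or filled with whitespace only."""
    return [f.name for f in fields(record) if not getattr(record, f.name).strip()]


# Made-up submission with the bottle make left blank.
record = SampleMetadata(
    parameter_code="EXAMPLE_CODE",
    bottle_type="example bottle type",
    bottle_make="",
    filter_type="example filter type",
    filter_make="example filter make",
    analytical_method="example method",
)

gaps = missing_fields(record)
if gaps:
    print("Record incomplete, missing:", ", ".join(gaps))
else:
    print("Record has all required metadata")
```

A check of this kind only confirms that the fields are present; whether the values themselves meet the standard would still need screening by a person or a controlled vocabulary.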

Why compatible data?

- Merging

- Allows easy management of “crossover stations”

- Marked as “fixed stations” in the db

- Enables comparison of data between cruises
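
The comparison of cruises at a crossover station can be sketched in Python as follows. This is only an illustration, assuming each cruise reports one value per depth for the same parameter at the same fixed station. The function name, the numbers and the 10 % tolerance are made up, and the slides do not describe the actual intercalibration procedure.

```python
from typing import Dict, List, Tuple


def compare_crossover(
    cruise_a: Dict[float, float],
    cruise_b: Dict[float, float],
    tolerance: float,
) -> List[Tuple[float, float, bool]]:
    """For each depth sampled by both cruises, return
    (depth, relative difference, within tolerance)."""
    results = []
    for depth in sorted(set(cruise_a) & set(cruise_b)):
        a, b = cruise_a[depth], cruise_b[depth]
        mean = (a + b) / 2
        # Guard against division by zero when both values are zero.
        rel_diff = abs(a - b) / mean if mean != 0 else 0.0
        results.append((depth, rel_diff, rel_diff <= tolerance))
    return results


# Made-up profiles: depth in metres -> concentration in arbitrary units.
cruise_a = {100.0: 0.50, 500.0: 0.60, 1000.0: 0.70}
cruise_b = {100.0: 0.50, 500.0: 0.90, 2000.0: 0.75}

results = compare_crossover(cruise_a, cruise_b, tolerance=0.10)
for depth, rel_diff, ok in results:
    status = "agree" if ok else "check"
    print(f"{depth:7.1f} m  relative difference {rel_diff:.2f}  {status}")
```

Only depths present in both profiles are compared, which is why shared parameter codes and a common "fixed station" marker matter: without them the two cruises cannot be lined up at all.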


Why compatible data?

- Mantra

“To make the data accessible and usable in 5, 10, 30… years’ time without the need to contact the data originator.”

Presenter
Presentation Notes
Does not take the credit away from the originator. But, on a bad day I can’t remember what I was doing that morning… Ultimate goal of QA / QC-ing data that are submitted.

3. The GEOTRACES database


http://www.bodc.ac.uk/geotraces/

Presenter
Presentation Notes
Global coverage. Yellow = cruises which have happened; black = cruises from IPY years; red = planned sections.

The GEOTRACES database

- Key parameters
  - Trace elements
  - Stable isotopes
  - Radioactive isotopes
  - Radiogenic isotopes
  - Others to allow future work to be done

- Supporting parameters
  - Salinity, temperature, O2, nutrients

http://www.bodc.ac.uk/geotraces/

Presenter
Presentation Notes
Trace elements – e.g. Fe (essential micronutrient), Mn (tracer of Fe inputs and redox cycling)
Stable isotopes – δ¹⁵N, δ¹³C
Radioactive isotopes – ²³⁰Th, ²³¹Pa
Radiogenic isotopes – Pb, Nd
Particles / aerosols
Nutrients – nitrate, phosphate, silicic acid

The GEOTRACES database

- 2014: Intermediate data product

- It will only include
  - Submitted data (get your data in by 2013)
  - Intercalibrated data
  - Data passed by the IC committee

Presenter
Presentation Notes
Data product led by Reiner Schlitzer at AWI
Presenter
Presentation Notes
DOIs come back to the Parsons quote. The full data lifecycle is achieved… Data publications: e.g. ESSD; RMetS/Wiley GeoScience Data Journal; Data letters in G3 (Geochemistry, Geophysics, Geosystems).

4. A data intensive future

Presenter
Presentation Notes
A few thoughts on the future - Many ideas borrowed from Fox (RPI) and Diviacco (OGS, Trieste)

A data intensive future

“We know more than we can tell.”

Michael Polanyi, Fellow of the Royal Society, The Tacit Dimension (1967)

Presenter
Presentation Notes
So how can we tell more? We have to define where data fit into our scientific lives. And maybe even examine the way in which we conduct science.

A data intensive future

[Diagram: Data → Information → Knowledge, spanning Producers to Consumers. Labels: Context; Presentation, Organization; Integration, Conversation; Creation, Gathering; Experience.]

A data intensive future

- Induction: Observation → Pattern → Tentative hypothesis → Theory

- Deduction: Theory → Hypothesis → Observation → Confirmation

A data intensive future

Abduction

Abduction is a method of logical inference, introduced by C. S. Peirce, which comes prior to induction and deduction; the colloquial name for it is to have a “hunch”.

- Starts when an inquirer considers a set of seemingly unrelated facts

- armed with an intuition that they are somehow connected and …

- But data intensive!!

- And this can be a job for visualization!!!

Presenter
Presentation Notes
Supported by comparable, compatible data. The GEOTRACES project database is a perfect platform for abductive reasoning.

Conclusions

- Data matter – and increasingly so!

- The GEOTRACES data assembly centre aids in making data compatible

- The GEOTRACES database will be a big legacy

- Who knows how it may end up being used?


Conclusions

- Low quality data have higher costs

- High quality data require communication

- Need a planned QA & QC strategy

- Investment in training

- Best practices

- Use appropriate tooling

- Extensive metadata to prevent “data entropy”

Robinson, Meyer & Lenhardt (2012). Eos 93(19), 189


Thank you

[email protected], @AdamLeadbetter