OBIS standards and formatting

Data Management and Administration

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Page 2: OBIS standards and formatting

Darwin Core: standards, terms and definitions

Data in OBIS: biodiversity records, together with the environmental data and sampling methods associated with those records, are assembled into up to three different data files.

● OCCURRENCES: taxa present in a specific place and time.

● EVENTS: something that occurs at a place and time.

● MEASUREMENTS OR FACTS: biotic and abiotic data.

[Diagram: data files, IPT, OBIS]

Page 3: OBIS standards and formatting

Darwin Core: standards, terms and definitions

Depending on the structure of your data, you will be able to enter your data into OBIS in three ways:

1) Entering only the occurrence data set, whether the data come from field collections, museum taxonomic lists, literature, etc.

2) Entering the occurrence data set together with the environmental data and the measurements made on each taxon (e.g. sizes, wet weight).

3) Entering the occurrence data set, the environmental data and the measurements made on each taxon, as well as the details of how the taxa were collected (e.g. collection methods, tools).

Page 4: OBIS standards and formatting

Darwin Core: standards, terms and definitions

● To publish in OBIS, you need to organize your data according to certain standards and structures.

● Darwin Core is a standard designed to create a common language for documenting and publishing data about species occurrence records (field observations or preserved specimens in a collection).

● It builds on the standards of the Dublin Core Metadata Initiative (DCMI) and is maintained by TDWG (Biodiversity Information Standards, formerly the International Working Group on Taxonomic Databases).

Page 5: OBIS standards and formatting

Darwin Core Archive

Darwin Core Archive (DwC-A) is a standard for publishing biodiversity data using the Darwin Core format. Darwin Core archives contain text files which are logically arranged in a star schema. This means that there is one core file and (optionally) multiple extension files.

For example:

➔The species occurrence recorded in a research field trip = core file.

➔Environmental data, sampling methodology, etc. = extension file.

OBIS accepts 2 types of core files: Occurrence Core and Event Core.
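The star schema can be pictured with a small sketch: every row of an extension file points back to one row of the core file through a shared identifier. The example below is only an illustration under stated assumptions: the file contents are made up (including the record MCNUSB-135 and all measurement values), and `read_rows` and `attach_extension` are hypothetical helper names, not part of any OBIS or IPT tool.

```python
import csv
import io

# Hypothetical core file: one row per occurrence.
OCCURRENCE_CORE = """occurrenceID,scientificName
MCNUSB-134,Voluta musica
MCNUSB-135,Vermetidae
"""

# Hypothetical extension file: zero or more rows per core record,
# each pointing back to the core through occurrenceID.
MEASUREMENTS = """occurrenceID,measurementType,measurementValue,measurementUnit
MCNUSB-134,shell length,62,mm
MCNUSB-134,wet weight,35,g
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_extension(core_rows, extension_rows, key="occurrenceID"):
    """Group extension rows under the core record they point to."""
    by_key = {row[key]: {**row, "extension": []} for row in core_rows}
    for row in extension_rows:
        # A KeyError here means the extension row has no matching core record.
        by_key[row[key]]["extension"].append(row)
    return by_key


archive = attach_extension(read_rows(OCCURRENCE_CORE), read_rows(MEASUREMENTS))
for occurrence_id, record in archive.items():
    print(occurrence_id, record["scientificName"], len(record["extension"]))
```

A core record may have no extension rows at all, which is why the extension files are optional.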

Page 6: OBIS standards and formatting

Darwin Core Archive

When to use Occurrence Core:

● When the dataset does not result from sampling activities, such as datasets based on museum collections, citations of occurrences from literature, or individual sightings.

● When the dataset does result from sampling activities, but no information is available on how the data were sampled or processed.

When to use Event Core:

● When the dataset contains abiotic measurements, or other measurements which are related to a sample.

● When specific details are known about how a biological sample was taken and processed.
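The criteria above can be condensed into a small decision helper. This is a minimal sketch, not an official OBIS rule set: the function name `choose_core` and its two flags are assumptions introduced for illustration.

```python
def choose_core(has_sample_measurements, sampling_details_known):
    """Suggest a core type from the two Event Core criteria.

    has_sample_measurements: the dataset contains abiotic measurements or
        other measurements related to a sample.
    sampling_details_known: it is known how the biological sample was
        taken and processed.
    """
    if has_sample_measurements or sampling_details_known:
        return "Event Core"
    # Museum collections, literature citations, individual sightings, or
    # sampling data without any information on the sampling itself.
    return "Occurrence Core"


print(choose_core(has_sample_measurements=False, sampling_details_known=False))
print(choose_core(has_sample_measurements=True, sampling_details_known=False))
```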

Page 7: OBIS standards and formatting

Darwin Core Archive

[Figure: structure of a Darwin Core Archive. Solid lines: eventID links; dashed lines: occurrenceID links.]

Page 8: OBIS standards and formatting

Darwin Core: standards, terms and definitions

A template has been designed for you to fill in, using the terms that fit your data. Definitions and applications for each term are described in the Darwin Core Excel file provided in this course. This is a simplified example of an Occurrence data sheet, with the terms placed in columns:

Page 9: OBIS standards and formatting

Formatting the taxon occurrence records data set

Page 10: OBIS standards and formatting

data processing: occurrence records

To publish your data in OBIS, you need to follow a few steps to format your data collection and make it suitable for uploading into the OBIS database through the IPT.

For the simplest case, in which you have only taxon occurrence records, you will need to create one occurrence core file.
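As a rough illustration, an occurrence core file is a plain delimited text file with one Darwin Core term per column and one occurrence per row. The sketch below writes such a file as CSV with Python's standard library. The column names are the required terms described on the following slides; the row values are illustrative only, the scientificNameID is a deliberate placeholder, and `write_occurrence_core` is a hypothetical helper name.

```python
import csv
import io

# Column names are Darwin Core terms; the values below are illustrative.
FIELDNAMES = [
    "occurrenceID", "scientificName", "scientificNameID", "eventDate",
    "decimalLatitude", "decimalLongitude", "occurrenceStatus", "basisOfRecord",
]

records = [
    {
        "occurrenceID": "MCNUSB-134",
        "scientificName": "Voluta musica",
        # Placeholder: the real identifier has to be looked up in WoRMS.
        "scientificNameID": "<identifier from WoRMS>",
        "eventDate": "2015-02-25T15:30:00",
        "decimalLatitude": "12.2354",
        "decimalLongitude": "-68.357",
        "occurrenceStatus": "Present",
        "basisOfRecord": "PreservedSpecimen",
    },
]


def write_occurrence_core(rows, handle):
    """Write a header line followed by one line per occurrence record."""
    writer = csv.DictWriter(handle, fieldnames=FIELDNAMES)
    writer.writeheader()
    writer.writerows(rows)


buffer = io.StringIO()
write_occurrence_core(records, buffer)
print(buffer.getvalue())
```

To produce a real file, pass a handle opened with `open("occurrence.csv", "w", newline="", encoding="utf-8")` instead of the in-memory buffer; the file name is also an assumption.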

Page 11: OBIS standards and formatting

data processing: occurrence records

There are 8 required terms that MUST be completed to publish your data in OBIS. However, we recommend completing as much of the table as possible.

For the example shown, the contents of the 8 required terms are:

1. scientificName: the full scientific name, at any taxon level. For example: Voluta musica (species name), Vermetidae (family name), Polychaeta (class name).

2. scientificNameID: an identifier for the nomenclatural details of a scientific name. It is assigned by the World Register of Marine Species (WoRMS), and you have to look it up in the web portal (www.marinespecies.org).

Page 12: OBIS standards and formatting

data processing: occurrence records

3. occurrenceID: an identifier for the occurrence. For the example table, you can build the occurrenceID by combining the code of the collection or catalog (MCN) with the code of your institution (USB, Universidad Simón Bolívar) and a number for the record: MCNUSB-134.

4. eventDate: the date-time or interval during which an event occurred. Must be in ISO 8601 format: YYYY-MM-DDThh:mm:ss. For example: 2015-02-25T15:30:00.

5. decimalLatitude: the geographic latitude of the occurrence record. Must be in decimal degrees. For example: 12.2354 (Northern Hemisphere); -12.2354 (Southern Hemisphere).

Page 13: OBIS standards and formatting

data processing: occurrence records

6. decimalLongitude: the geographic longitude of the occurrence record. Must be in decimal degrees. For example: 68.357 (Eastern Hemisphere); -68.357 (Western Hemisphere). If the locality is known but not the exact coordinates, you can look them up with a geocoding service such as Marine Regions or Google Maps.

7. occurrenceStatus: a statement about the presence or absence of a taxon at a location. Use “Present” or “Absent”.

8. basisOfRecord: the specific nature of the data record. The vocabulary is controlled by TDWG; valid values are “PreservedSpecimen”, “FossilSpecimen”, “LivingSpecimen”, “HumanObservation” and “MachineObservation”.
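Before uploading, the eight required terms can be checked with a short script. The following is a minimal sketch based only on the rules listed above, not the validation that OBIS or the IPT actually performs. All function names are hypothetical, the date check only covers plain dates and full date-times (ISO 8601 intervals are not handled), and the WoRMS identifier in the sample record is a placeholder.

```python
from datetime import datetime

REQUIRED_TERMS = (
    "scientificName", "scientificNameID", "occurrenceID", "eventDate",
    "decimalLatitude", "decimalLongitude", "occurrenceStatus", "basisOfRecord",
)
OCCURRENCE_STATUS = {"Present", "Absent"}
BASIS_OF_RECORD = {
    "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen",
    "HumanObservation", "MachineObservation",
}
COORDINATE_LIMITS = {"decimalLatitude": 90.0, "decimalLongitude": 180.0}


def is_blank(value):
    return value is None or str(value).strip() == ""


def is_supported_date(value):
    """Accept YYYY-MM-DD or YYYY-MM-DDThh:mm:ss; intervals are not handled.

    Note: strptime is lenient about zero padding (e.g. month "2").
    """
    for pattern in ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d"):
        try:
            datetime.strptime(str(value), pattern)
            return True
        except ValueError:
            continue
    return False


def validate_record(record):
    """Return a list of problems found in one occurrence record."""
    problems = [f"missing {term}" for term in REQUIRED_TERMS
                if is_blank(record.get(term))]
    present = {t: v for t, v in record.items() if not is_blank(v)}
    if "eventDate" in present and not is_supported_date(present["eventDate"]):
        problems.append("eventDate: unsupported format")
    for term, limit in COORDINATE_LIMITS.items():
        if term in present:
            try:
                in_range = -limit <= float(present[term]) <= limit
            except (TypeError, ValueError):
                in_range = False
            if not in_range:
                problems.append(f"{term}: expected decimal degrees within +/-{limit}")
    if present.get("occurrenceStatus", "Present") not in OCCURRENCE_STATUS:
        problems.append("occurrenceStatus: use Present or Absent")
    if present.get("basisOfRecord", "HumanObservation") not in BASIS_OF_RECORD:
        problems.append("basisOfRecord: not in the controlled vocabulary")
    return problems


good = {
    "scientificName": "Voluta musica",
    "scientificNameID": "<identifier from WoRMS>",  # placeholder
    "occurrenceID": "MCNUSB-134",
    "eventDate": "2015-02-25T15:30:00",
    "decimalLatitude": "12.2354",
    "decimalLongitude": "-68.357",
    "occurrenceStatus": "Present",
    "basisOfRecord": "PreservedSpecimen",
}
bad = dict(good, decimalLatitude="95", occurrenceStatus="present",
           eventDate="25/02/2015")

print(validate_record(good))
print(validate_record(bad))
```

The function returns a list rather than raising an error so that every problem in a row can be reported at once.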

Page 14: OBIS standards and formatting

data processing: occurrence records

● PreservedSpecimen: when the specimen is deposited in a collection (please add institutionCode, collectionCode and catalogNumber).

● FossilSpecimen: it is important to distinguish the collection date from the geological age of the fossil.

● LivingSpecimen: an intentionally kept or cultivated living specimen, e.g. in an aquarium or culture collection.

● HumanObservation: e.g. a bird sighting, or a benthic sample whose specimens were discarded after counting.

● MachineObservation: records made by sensors, e.g. DNA sequencers or image recognition.
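The request to add institutionCode, collectionCode and catalogNumber for preserved specimens can be checked in the same spirit. This is a small sketch under stated assumptions: `missing_specimen_terms` is a hypothetical helper, and the catalog number "134" is an invented value used only for the example.

```python
SPECIMEN_TERMS = ("institutionCode", "collectionCode", "catalogNumber")


def missing_specimen_terms(record):
    """List the specimen terms that a PreservedSpecimen record still lacks."""
    if record.get("basisOfRecord") != "PreservedSpecimen":
        # The extra terms are only requested for preserved specimens.
        return []
    return [term for term in SPECIMEN_TERMS
            if not str(record.get(term) or "").strip()]


specimen = {
    "basisOfRecord": "PreservedSpecimen",
    "institutionCode": "USB",
    "collectionCode": "MCN",
    "catalogNumber": "134",  # invented value for the example
}
sighting = {"basisOfRecord": "HumanObservation"}

print(missing_specimen_terms(specimen))
print(missing_specimen_terms({"basisOfRecord": "PreservedSpecimen"}))
print(missing_specimen_terms(sighting))
```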

Page 15: OBIS standards and formatting

data processing: occurrence records

If you have additional information, you can add more Darwin Core terms to your occurrence core (see the OBIS manual: http://iobis.org/manual/darwincore/).

Record Level terms

● institutionCode: identifies the institution that holds the data. In the example, the collected specimens are stored in the Natural Science Museum of Universidad Simón Bolívar: USB.

● collectionCode: identifies the collection or dataset within that institution. In the example: MCN.

Page 16: OBIS standards and formatting

data processing: occurrence records

Taxonomy and Identification

● scientificNameAuthorship: the author of the taxon. Example: Voluta musica (scientificName) Linnaeus, 1758 (scientificNameAuthorship).

● identificationQualifier: in case of uncertain identifications, qualifiers such as cf. or aff. should be placed here.

Page 17: OBIS standards and formatting

data processing: occurrence records

● taxonRank: helps identify the taxon that scientificName refers to and avoids linking to homonyms, although it is not necessary when a scientificNameID is provided. It is very useful when a scientificNameID has not been created yet, e.g. when a new taxon is described.

Location

● locality: the specific description or name of the place. In our example: Isla Caribe, Isla Margarita.

Page 18: OBIS standards and formatting

data processing: occurrence records

● locationID: an identifier for the set of location information. You can find it with the Marine Regions search tool, which gives you the MRGID. Copy and paste it into your data set Excel file.

Page 19: OBIS standards and formatting

data processing: occurrence records + biotic measurements

To search for a specific location using Marine Regions:

1) Go to the "Gazetteer" section.

2) Find the "Geographic name" search box and type the name of the location (e.g. Margarita).

3) Click "search".

Page 20: OBIS standards and formatting

data processing: occurrence records + biotic measurements

If the locality is not found in the Marine Regions portal, you may:

1) Write to them to let them know that the location is not registered in the portal.

2) Use the Getty Thesaurus of Geographic Names.

If none of your searches gives a result, you may leave the cell empty.

● locationAccordingTo: enter the source of the location reference.

Page 21: OBIS standards and formatting

data processing: occurrence records

Occurrence

● recordedBy: the primary collector or observer, e.g. Carolina Peralta (collector name).

● preparations: documents the preparation and preservation methods of stored specimens.

● catalogNumber: applies to stored specimens; it is the identifier for the record in the collection.

Page 22: OBIS standards and formatting

Next section:

Formatting the taxon occurrence records + events + measurements or facts…