OBIS standards and formatting
Data Management and Administration
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Darwin Core: standards, terms and definitions
Data in OBIS: biodiversity records, environmental data and the sampling methods associated with the records are assembled into up to three different data files.
OCCURRENCES: taxa present in a specific place and time.
EVENTS: something that occurs at a place and time.
MEASUREMENTS OR FACTS: biotic and abiotic data.
[Diagram: the data files are published to OBIS through the IPT.]
Depending on the structure of your data, you can enter your data into OBIS in one of three ways:
1) Entering only the occurrence data set, whether the data come from field collections, taxonomic lists of museums, literature, etc.
2) Entering the occurrence data set together with the environmental data and the measurements made on each taxon (e.g. sizes, wet weight).
3) Entering the occurrence data set, the environmental data and the measurements made on each taxon, as well as the details of how the taxa were collected (e.g. collection methods, tools).
● To publish in OBIS, you need to organize your data according to certain standards and structures.
● Darwin Core is a standard designed to create a common language for documenting and publishing data about species records (field observations or preserved specimens in a collection).
● It is based on the standards of the Dublin Core Metadata Initiative (DCMI) and is maintained by TDWG (Biodiversity Information Standards, formerly the Taxonomic Databases Working Group).
Darwin Core Archive
Darwin Core Archive (DwC-A) is a standard for publishing biodiversity data using the Darwin Core format. Darwin Core archives contain text files that are logically arranged in a star schema: there is one core file and, optionally, multiple extension files.
For example:
➔The species occurrence recorded in a research field trip = core file.
➔Environmental data, sampling methodology, etc. = extension file.
OBIS accepts 2 types of core files: Occurrence Core and Event Core.
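The star schema can be pictured with a small sketch. This is only an illustration, not the way the IPT builds an archive: the two CSV snippets, the second identifier and the measurement values are hypothetical, and in a real archive the link between core and extension files is declared in the archive's descriptor file rather than in code.

```python
import csv
import io

# Hypothetical miniature archive: one core file and one extension file.
CORE_CSV = """occurrenceID,scientificName
MCNUSB-134,Voluta musica
MCNUSB-135,Vermetidae
"""

# Hypothetical measurements; the values are invented for illustration only.
EXTENSION_CSV = """occurrenceID,measurementType,measurementValue
MCNUSB-134,wet weight,12.5
MCNUSB-134,length,61
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_extension(core_rows, extension_rows, key):
    """Group extension rows under the core row that shares the same key."""
    by_key = {}
    for row in extension_rows:
        by_key.setdefault(row[key], []).append(row)
    # Every core row gets an entry, even when it has no extension rows.
    return {row[key]: by_key.get(row[key], []) for row in core_rows}


linked = attach_extension(
    read_rows(CORE_CSV), read_rows(EXTENSION_CSV), "occurrenceID"
)
for occurrence_id, measurements in linked.items():
    print(occurrence_id, len(measurements), "extension row(s)")
```

Each core row is the centre of the star; any number of extension rows, including none, can point back to it through the shared identifier.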
When to use Occurrence Core:
● When the dataset does not result from sampling activities, such as datasets based on museum collections, citations of occurrences from literature, or individual sightings.
● When the dataset does result from sampling activities, but no information on how the data were sampled or processed is available.
When to use Event Core:
● When the dataset contains abiotic measurements, or other measurements which are related to a sample.
● When specific details are known about how a biological sample was taken and processed.
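The choice between the two core types can be written as a simple rule. The sketch below is one possible reading of the criteria; the function name and its two flags are hypothetical and are not part of OBIS or the IPT.

```python
def choose_core(has_sampling_details, has_sample_measurements):
    """Suggest a core type for a dataset.

    has_sampling_details: True when it is known how samples were taken
        and processed.
    has_sample_measurements: True when the dataset holds abiotic or other
        measurements that relate to a sample.
    """
    if has_sampling_details or has_sample_measurements:
        return "Event Core"
    return "Occurrence Core"


# A museum collection with no sampling information:
print(choose_core(False, False))
# A field survey with documented sampling gear and temperature readings:
print(choose_core(True, True))
```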
Full lines: eventID links; Dashed lines: occurrenceID links.
A template has been designed to be filled in with the terms that fit your data. Definitions and applications for each term are described in the Darwin Core Excel file provided in this course.
[Table image: simplified example of an Occurrence data sheet, with the terms placed in columns.]
Formatting the taxon occurrence records data set
data processing: occurrence records
To publish your data in OBIS, you need to follow a few steps to format your data collection and make it suitable for uploading into the OBIS database through the IPT.
For the simplest case, in which you have only taxon occurrence records, you will need to create one Occurrence Core file.
There are 8 required terms that MUST be completed to publish your data in OBIS. However, we recommend completing the table as fully as possible.
For the example shown, the contents of the 8 required terms are:
1. scientificName: the full scientific name, at any taxon level. For example: Voluta musica (species name), Vermetidae (family name), Polychaeta (class name).
2. scientificNameID: an identifier for the nomenclatural details of a scientific name, assigned by the World Register of Marine Species (WoRMS). Search for it in the web portal (www.marinespecies.org).
3. occurrenceID: an identifier for the occurrence. For the example table, combining the code of your institution (USB, Universidad Simón Bolívar) with the code of the collection or catalog (MCN) gives an occurrenceID such as MCNUSB-134.
4. eventDate: the date-time or interval during which an event occurred. Must be in ISO 8601 format: YYYY-MM-DDThh:mm:ss. For example: 2015-02-25T15:30:00
5. decimalLatitude: the geographic latitude of the occurrence record. Must be in decimal degrees. For example: 12.2354 (Northern Hemisphere); -12.2354 (Southern Hemisphere).
6. decimalLongitude: the geographic longitude of the occurrence record. Must be in decimal degrees. For example: 68.357 (Eastern Hemisphere); -68.357 (Western Hemisphere). If the locality is known but not the exact coordinates, you can look them up in a geocoding service such as Marine Regions or Google Maps.
7. occurrenceStatus: a statement about the presence or absence of a taxon at a location. Use “Present” or “Absent”.
8. basisOfRecord: the specific nature of the data record. The vocabulary is controlled by TDWG; valid values are “PreservedSpecimen”, “FossilSpecimen”, “LivingSpecimen”, “HumanObservation”, “MachineObservation”.
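Before uploading, the required terms can be checked with a short script. The following is a minimal sketch, not an OBIS or IPT tool: the function name is hypothetical, the date check only accepts a single date or date-time (ISO 8601 intervals are not handled), and the scientificNameID in the sample record is a placeholder rather than a real WoRMS identifier.

```python
from datetime import datetime

REQUIRED_TERMS = [
    "scientificName", "scientificNameID", "occurrenceID", "eventDate",
    "decimalLatitude", "decimalLongitude", "occurrenceStatus", "basisOfRecord",
]

# Assumption: only these two ISO 8601 shapes are accepted by this sketch.
DATE_FORMATS = ("%Y-%m-%dT%H:%M:%S", "%Y-%m-%d")


def _text(record, term):
    """Return the value of a term as stripped text ('' when absent)."""
    value = record.get(term)
    return "" if value is None else str(value).strip()


def _is_valid_date(text):
    for fmt in DATE_FORMATS:
        try:
            datetime.strptime(text, fmt)
            return True
        except ValueError:
            continue
    return False


def check_record(record):
    """Return a list of problems found in one occurrence record (a dict)."""
    problems = [f"missing {t}" for t in REQUIRED_TERMS if not _text(record, t)]

    date_text = _text(record, "eventDate")
    if date_text and not _is_valid_date(date_text):
        problems.append("eventDate is not YYYY-MM-DD or YYYY-MM-DDThh:mm:ss")

    for term, limit in (("decimalLatitude", 90.0), ("decimalLongitude", 180.0)):
        text = _text(record, term)
        if not text:
            continue
        try:
            value = float(text)
        except ValueError:
            problems.append(f"{term} is not a decimal number")
            continue
        if not (-limit <= value <= limit):
            problems.append(f"{term} must be between {-limit:g} and {limit:g}")
    return problems


sample = {
    "scientificName": "Voluta musica",
    "scientificNameID": "<WoRMS identifier goes here>",  # placeholder only
    "occurrenceID": "MCNUSB-134",
    "eventDate": "2015-02-25T15:30:00",
    "decimalLatitude": "12.2354",
    "decimalLongitude": "-68.357",
    "occurrenceStatus": "Present",
    "basisOfRecord": "PreservedSpecimen",
}
print(check_record(sample))
```

Running the same function over every row of the spreadsheet gives a quick list of cells to fix before publishing.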
PreservedSpecimen: when the specimen is deposited in a collection (please add institutionCode, collectionCode and catalogNumber).
FossilSpecimen: it is important to distinguish the collection date from the geological time period.
LivingSpecimen: an intentionally kept or cultivated living specimen, e.g. in an aquarium or culture collection.
HumanObservation: e.g. a bird sighting, or a benthic sample whose specimens were discarded after counting.
MachineObservation: sensors, e.g. DNA sequencers, image recognition.
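The controlled vocabulary lends itself to an automatic check. The sketch below is an illustration under stated assumptions: the function name is hypothetical, values are compared exactly as spelled in the list of valid values, and the only extra rule applied is the request to add institutionCode, collectionCode and catalogNumber for preserved specimens.

```python
BASIS_OF_RECORD = {
    "PreservedSpecimen", "FossilSpecimen", "LivingSpecimen",
    "HumanObservation", "MachineObservation",
}
COLLECTION_TERMS = ("institutionCode", "collectionCode", "catalogNumber")


def check_basis_of_record(record):
    """Return a list of problems related to basisOfRecord for one record."""
    problems = []
    basis = (record.get("basisOfRecord") or "").strip()
    if basis not in BASIS_OF_RECORD:
        problems.append(f"basisOfRecord {basis!r} is not a valid value")
    elif basis == "PreservedSpecimen":
        # Preserved specimens should say where the specimen is kept.
        for term in COLLECTION_TERMS:
            if not (record.get(term) or "").strip():
                problems.append(f"PreservedSpecimen without {term}")
    return problems


specimen = {
    "basisOfRecord": "PreservedSpecimen",
    "institutionCode": "USB",
    "collectionCode": "MCN",
}
print(check_basis_of_record(specimen))
```

A sighting recorded as "HumanObservation" passes without the collection terms, while a misspelled or lowercased value is reported.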
If you have additional information, you can add further Darwin Core terms to your Occurrence Core (see the OBIS manual: http://iobis.org/manual/darwincore/).
Record Level terms
● institutionCode: identifies the institution that owns the data. In the example, the specimens collected are stored in the Natural Science Museum of Simón Bolívar University: USB.
● collectionCode: identifies the collection or dataset within that institution: MCN.
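These two codes are also the building blocks of the occurrenceID in the example (MCNUSB-134). The sketch below shows one way to assemble such identifiers and to spot repeated ones. The function names are hypothetical, and treating the trailing 134 as the specimen's catalog number is an assumption made for illustration.

```python
from collections import Counter


def make_occurrence_id(institution_code, collection_code, catalog_number):
    """Build an identifier shaped like 'MCNUSB-134'.

    Pattern assumed here: collection code + institution code + '-' + number.
    """
    return f"{collection_code}{institution_code}-{catalog_number}"


def duplicated_ids(occurrence_ids):
    """Return the identifiers that appear more than once, sorted."""
    counts = Counter(occurrence_ids)
    return sorted(oid for oid, n in counts.items() if n > 1)


# Hypothetical catalog numbers; 134 is repeated on purpose.
ids = [make_occurrence_id("USB", "MCN", number) for number in (134, 135, 134)]
print(ids)
print(duplicated_ids(ids))
```

Checking for duplicates matters because each occurrenceID has to point to exactly one record in the dataset.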
Taxonomy and Identification
● scientificNameAuthorship: the author of the taxon. Example: Voluta musica (scientificName) Linnaeus, 1758 (scientificNameAuthorship).
● identificationQualifier: in case of uncertain identifications, qualifiers such as cf. or aff. should be placed here.
● taxonRank: helps identify the taxon that scientificName refers to and avoids linking to homonyms, although it is not necessary when a scientificNameID is provided. Very useful when a scientificNameID has not been created yet, e.g. when a new taxon is described.
Location
● locality: the specific description or name of the place. In our example: Isla Caribe, Isla Margarita.
● locationID: an identifier for the set of location information. You can find it with the Marine Regions search tool, which returns the MRGID. Copy and paste it into your data set Excel file.
data processing: occurrence records + biotic measurements
To search for a specific location using Marine Regions:
1) Go to the "Gazetteer" section.
2) Find the "Geographic name" search box and type the name of the location (e.g. Margarita).
3) Click "search".
If the locality is not found in the Marine Regions portal, you may:
1) Write to them to report that the location is not registered in the portal.
2) Use the Getty Thesaurus of Geographic Names.
If none of your searches is successful, you may leave the cell empty.
● locationAccordingTo: must be completed with the source of the location reference.
Occurrence
● recordedBy: the primary collector or observer, e.g. Carolina Peralta (collector name).
● preparations: documents the preparation and preservation methods of stored specimens.
● catalogNumber: applies to stored specimens; it is the identifier for the record in the collection.
Next section:
Formatting the taxon occurrence records + events + Measurements or facts…