NMBU, Ås, January 2015
Status 27. January 2015
GBIF enables free and open access to biodiversity data online. We are an international government-initiated and -funded initiative focused on making biodiversity data available to all and anyone, for scientific research, conservation and sustainable development.
GBIF provides a data discovery system
global registry data portal
that is dependent on resolvable stable identifiers for efficient functionality
1. Information infrastructure – an Internet-based index of a globally distributed network of interoperable databases that contain primary biodiversity data.
2. Community-developed tools, standards and protocols – the tools data providers need to format and share their data.
3. Capacity-building and training – and access to a global expert community.
Map of GBIF Country Participants
31 DEC 2014
NB! The low membership coverage in Asia and Africa is an important gap!
GBIF Secretariat in Copenhagen with 20 staff members
Node team at UiO NHM:
Dag Endresen, Node Manager
Christian Svindseth, Database manager
Fridtjof Mehlum, Research Director
Einar Timdal, Associate Professor
Geir Søli, Associate Professor
Artsdatabanken Trondheim:
Wouter Koch, Advisor
Nils Valland, Senior advisor
The Research Council of Norway:
Per Backe-Hansen, Head of delegation
Artskart provides the Norwegian portal for species occurrences.
A subset of the same data as in GBIF.
Why GBIF?
OECD Global Science Forum (1999): “establish and support a distributed system of interlinked and interoperable modules (databases, software and networking tools, search engines, analytical algorithms, etc.) that together will form a Global Biodiversity Information Facility (GBIF)”. [First global GBIF meeting in 2001; Secretariat in Copenhagen 2003]
The Millennium Ecosystem Assessment showed that human actions often lead to irreversible losses in the diversity of life, and these losses have been more rapid in the past 50 years than ever before in human history. Biological diversity is key to resilience – the ability of natural and social systems to adapt to change, and is essential for nearly every aspect of human well-being. Because human threats to biodiversity occur across large spatial and temporal scales, biodiversity and ecosystem monitoring, forecasting, and risk assessments require data to be organized in a globally-accessible, integrated infrastructure.
GBIF provides this infrastructure.
(Wilson, 2002; Worm et al., 2006; Duke et al., 2007)
GBIF and GEO Intergovernmental group on earth observations
Data Integration & Interoperability
GBIF provides the infrastructure delivering species occurrence data in GEO.
GEO BON Biodiversity observation network
GIASIP Global Invasive Alien Species Information Partnership
GBIF provides the infrastructure delivering species occurrence data in GIASIP.
GBIF and IPBES (Naturpanelet) Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES)
IPBES provides a platform to support policy decisions based on biodiversity research results. GBIF provides the infrastructure delivering species occurrence data in IPBES.
Science
Policy
Biodiversity
Data, information and knowledge
IPBES GBIF
Data publication
Data distribution in GBIF
Density of georeferenced species occurrence records published through GBIF (see http://www.gbif.org/occurrence) Last updated: 2014-07-09
Data published through GBIF.org
http://www.gbif.org | 16 JAN 2015
[Figure: Trend in primary biodiversity records (millions); axis range 100 to 550]
Data published — by GBIF participant
NOTE: Datasets are assigned to countries according to the location of the publishing institution, including aggregated datasets with contributors from many other countries. http://www.gbif.org | 16 JAN 2015

Number of new records published, top 10 Participant countries (1 Jan to 31 Dec 2014):
1. United States 67,332,382
2. Netherlands 15,659,739
3. Germany 6,988,553
4. United Kingdom 5,564,923
5. Australia 5,351,016
6. Sweden 5,165,053
7. Norway 4,845,994
8. Finland 2,506,681
9. Belgium 2,492,458
10. Canada 1,512,676

Total number of records published, top 10 Participant countries (as of 31 Dec 2014):
1. United States 209,492,282
2. Sweden 49,346,620
3. United Kingdom 47,237,309
4. Australia 36,653,791
5. Netherlands 21,268,595
6. Germany 18,733,051
7. Finland 18,511,977
8. France 17,503,770
9. Norway 17,338,833
10. Spain 10,194,958
GBIF portal:
18.0 million occurrences are located in Norway. Published from 31 countries worldwide.
GBIF portal:
17.2 million occurrences published from Norwegian institutes. Covering 201 countries worldwide.
Status for Nordic GBIF nodes (data hosted by…), Jan 2015
Country    Datasets    Occurrences
Denmark    53          9 384 792
Finland    57          18 514 033
Iceland    4           458 705
Norway     93          17 188 892
Sweden     35          50 083 140
http://www.gbif.org/country/NO
Download data
GBIF Portal – download data
• Before downloading species occurrence data from GBIF, please take the time to register.
– http://www.gbif.org/user/register
• Downloads from the GBIF portal are packaged as a Darwin Core Archive (DwC-A).
– http://www.gbif.org/faq/datause
• The species occurrence data are found in the “occurrence.txt” data file.
• This tab-delimited text file can e.g. be imported to a spreadsheet such as Excel or to a database.
• NOTE: the data files can become very large! So look at the file size before you open them in MS Excel.
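As an alternative to opening a large file in a spreadsheet, the download can be read row by row. The following is a minimal sketch, assuming the archive is a zip file that contains a tab-delimited "occurrence.txt" with a header row, as described above. The function name `read_occurrences` and the `limit` parameter are illustrative and not part of any GBIF tool.

```python
import csv
import io
import zipfile


def read_occurrences(archive, limit=5):
    """Return up to `limit` rows of occurrence.txt as dicts keyed by column name.

    `archive` is a path or a file-like object pointing at a downloaded
    Darwin Core Archive (a zip file).
    """
    rows = []
    with zipfile.ZipFile(archive) as zf:
        # Stream the member instead of extracting it; the file can be very large.
        with zf.open("occurrence.txt") as raw:
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            # Tab-delimited, and quote characters are treated as ordinary data.
            reader = csv.DictReader(text, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row in reader:
                rows.append(row)
                if len(rows) >= limit:
                    break
    return rows
```

Because only `limit` rows are kept in memory, this is a cheap way to inspect the column names and a few records before deciding how to load the full file.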
Data download requests, by country
Requests for download do not necessarily result in data actually being downloaded. Based on country indicated by user login | 16 JAN 2015
1. United States 22,539
2. Mexico 11,354
3. Spain 6,229
4. Denmark 5,432
5. Brazil 4,132
6. China 2,886
7. United Kingdom 2,873
8. Costa Rica 2,869
9. Colombia 2,685
10. Australia 2,635
Total of 84,951 requests from users in 106 countries, islands and territories, 1 Jan 2014 – 31 Dec 2014
Data use in research
Use citations, by country of authors
15 JAN 2015
Total 2014: Number of research publications from January to December 2014 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 10 countries shown.
1. United States 114
2. Spain 41
3. United Kingdom 40
4. Germany 36
5. Australia 32
6. Italy 22
7. Mexico 20
8. Brazil 19
9. France 18
10. South Africa 17
Relationship line represents collaboration between authors affiliated in different countries.

Dec 2014: Number of research publications in December 2014 citing use of GBIF-mediated data, ranked by country according to affiliation of author. Top 9 countries shown.
1. United States 22
2. United Kingdom 9
3. Spain 8
4. Germany 5
4. Italy 5
4. South Africa 5
7. Switzerland 4
7. China 4
7. Mexico 4
GBIF citation in research 2008-2014
Last updated: 2014-09-02
[Figure: number of peer-reviewed publications per year, 2008 to 2014 (Jan-Aug), in three series: GBIF mentioned, GBIF discussed, GBIF-mediated data used]
Scientists from Norwegian institutes are using GBIF-mediated data:
Darwin Core
Unifying species data
Integrated access for records of the occurrence of any species: • What? • When? • Where? • What evidence? • Data owner? • Link to full record
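The questions above can be read as fields of a single occurrence record. The sketch below pairs each question with one Darwin Core term. The pairing is an assumption made for illustration, and all values in the example record are hypothetical.

```python
from dataclasses import dataclass, asdict


@dataclass
class OccurrenceRecord:
    """One occurrence, using Darwin Core term names as attribute names."""

    scientificName: str      # What?
    eventDate: str           # When? (ISO 8601 date string)
    decimalLatitude: float   # Where?
    decimalLongitude: float  # Where?
    basisOfRecord: str       # What evidence?
    institutionCode: str     # Data owner?
    occurrenceID: str        # Link to full record


# Hypothetical example values, not taken from any real dataset.
record = OccurrenceRecord(
    scientificName="Rangifer tarandus",
    eventDate="2014-06-15",
    decimalLatitude=60.0,
    decimalLongitude=10.0,
    basisOfRecord="PreservedSpecimen",
    institutionCode="EXAMPLE",
    occurrenceID="urn:example:occurrence:1",
)
```

Calling `asdict(record)` gives a plain dictionary whose keys are the term names, which is convenient when writing a row of a tab-delimited file.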
Presence only data
Collections
Ecological Monitoring Genomics
Darwin Core
2015: Survey data compatible with existing Darwin Core data, plus:
• Which species were recorded together?
• Which sets of data are directly comparable?
• Which species were most abundant in each sample?
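The three questions above become answerable once each record carries a sample identifier and a count. The following is a rough sketch, assuming records are dictionaries with the Darwin Core terms `eventID`, `scientificName` and `individualCount`. Using `eventID` as the sample identifier is an assumption, and the function names are illustrative.

```python
from collections import defaultdict


def summarize_samples(records):
    """Group records by eventID and return relative abundance per species.

    Result shape: {eventID: {scientificName: share of individuals in that sample}}
    """
    counts = defaultdict(lambda: defaultdict(int))
    for rec in records:
        counts[rec["eventID"]][rec["scientificName"]] += int(rec["individualCount"])

    summary = {}
    for event_id, species_counts in counts.items():
        total = sum(species_counts.values())
        summary[event_id] = {
            name: (n / total if total else 0.0)
            for name, n in species_counts.items()
        }
    return summary


def most_abundant(summary, event_id):
    """Return the species with the largest share in one sample."""
    shares = summary[event_id]
    return max(shares, key=shares.get)
```

The keys of `summary[event_id]` list the species recorded together in that sample. Whether two samples are directly comparable still depends on their sampling protocol and effort, which this sketch does not inspect.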
Presence/absence
Darwin Core + Core Survey
Fields
Sample Id Method Id
Relative abundance ...
Slide by Donald Hobern, 2012
Darwin Core – a vocabulary of terms
Wieczorek J, Bloom D, Guralnick R, Blum S, Döring M, De Giovanni R, Robertson T, and Vieglais D (2012) Darwin Core: An Evolving Community-Developed Biodiversity Data Standard. PLoS ONE 7(1): e29715. (doi:10.1371/journal.pone.0029715)
http://rs.tdwg.org/dwc/terms/
Darwin Core Archive (DwC-A)
• DwC-A publishes DwC records, including terms from DwC-A extensions.
• Simple text-based format.
• Zipped single-file archive.
Germplasm.txt
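A Darwin Core Archive normally carries a small descriptor file, meta.xml, that says which text file is the core and which column holds which term. The sketch below reads that descriptor with the Python standard library. It assumes the descriptor uses the Darwin Core text namespace `http://rs.tdwg.org/dwc/text/`, and the function name `describe_core` is illustrative.

```python
import xml.etree.ElementTree as ET

# XML namespace used by Darwin Core Archive descriptors (meta.xml).
DWC_TEXT_NS = "{http://rs.tdwg.org/dwc/text/}"


def describe_core(meta_xml):
    """Return the core row type, data file name and column-to-term mapping."""
    root = ET.fromstring(meta_xml)
    core = root.find(DWC_TEXT_NS + "core")
    location = core.find(f"{DWC_TEXT_NS}files/{DWC_TEXT_NS}location").text

    fields = {}
    for field in core.findall(DWC_TEXT_NS + "field"):
        index = field.get("index")
        # Fields that only declare a default value have no column index.
        if index is not None:
            fields[int(index)] = field.get("term")

    return {"rowType": core.get("rowType"), "location": location, "fields": fields}
```

Extension files such as the "Germplasm.txt" shown on the slide would be described by `extension` elements in the same descriptor. The sketch only looks at the core.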
Survey & plot data GBIF priority in 2015
Wiser SK, Spencer N, De Caceres M, Kleikamp M, Boyle B & Peet RK (2011). Veg-X – an exchange standard for plot-based vegetation data. Journal of Vegetation Science 22 (2011) 598–609. DOI:10.1111/j.1654-1103.2010.01245.x
“A primary technical impediment to large-scale sharing of vegetation data is the lack of a recognized international exchange standard for linking the panoply of tools and database implementations that exist” (…) “The specimen-based standards cited above [Darwin Core and ABCD], however, are not adequate for community sampling because the information required goes beyond specimen and occurrence data” (Wiser et al. 2011).
http://terms.tdwg.org/wiki/Vegetation_Survey
Survey & plot data (priority in 2015)
Vegetation plot data
Image credit: Onar Michelsen, Norwegian University of Science and Technology
Identifiers
Record-level Terms
dcterms:type | dcterms:modified | dcterms:language | dcterms:rights | dcterms:rightsHolder | dcterms:accessRights | dcterms:bibliographicCitation | dcterms:references | institutionID | collectionID | datasetID | institutionCode | collectionCode | datasetName | ownerInstitutionCode | basisOfRecord | informationWithheld | dataGeneralizations | dynamicProperties

Occurrence
occurrenceID | catalogNumber | occurrenceRemarks | recordNumber | recordedBy | individualID | individualCount | sex | lifeStage | reproductiveCondition | behavior | establishmentMeans | occurrenceStatus | preparations | disposition | otherCatalogNumbers | previousIdentifications | associatedMedia | associatedReferences | associatedOccurrences | associatedSequences | associatedTaxa

MaterialSample
materialSampleID

Event
eventID | samplingProtocol | samplingEffort | eventDate | eventTime | startDayOfYear | endDayOfYear | year | month | day | verbatimEventDate | habitat | fieldNumber | fieldNotes | eventRemarks

dcterms:Location
locationID | higherGeographyID | higherGeography | continent | waterBody | islandGroup | island | country | countryCode | stateProvince | county | municipality | locality | verbatimLocality | verbatimElevation | minimumElevationInMeters | maximumElevationInMeters | verbatimDepth | minimumDepthInMeters | maximumDepthInMeters | minimumDistanceAboveSurfaceInMeters | maximumDistanceAboveSurfaceInMeters | locationAccordingTo | locationRemarks | verbatimCoordinates | verbatimLatitude | verbatimLongitude | verbatimCoordinateSystem | verbatimSRS | decimalLatitude | decimalLongitude | geodeticDatum | coordinateUncertaintyInMeters | coordinatePrecision | pointRadiusSpatialFit | footprintWKT | footprintSRS | footprintSpatialFit | georeferencedBy | georeferencedDate | georeferenceProtocol | georeferenceSources | georeferenceVerificationStatus | georeferenceRemarks

GeologicalContext
geologicalContextID | earliestEonOrLowestEonothem | latestEonOrHighestEonothem | earliestEraOrLowestErathem | latestEraOrHighestErathem | earliestPeriodOrLowestSystem | latestPeriodOrHighestSystem | earliestEpochOrLowestSeries | latestEpochOrHighestSeries | earliestAgeOrLowestStage | latestAgeOrHighestStage | lowestBiostratigraphicZone | highestBiostratigraphicZone | lithostratigraphicTerms | group | formation | member | bed

Identification
identificationID | identifiedBy | dateIdentified | identificationReferences | identificationVerificationStatus | identificationRemarks | identificationQualifier | typeStatus

Taxon
taxonID | scientificNameID | acceptedNameUsageID | parentNameUsageID | originalNameUsageID | nameAccordingToID | namePublishedInID | taxonConceptID | scientificName | acceptedNameUsage | parentNameUsage | originalNameUsage | nameAccordingTo | namePublishedIn | namePublishedInYear | higherClassification | kingdom | phylum | class | order | family | genus | subgenus | specificEpithet | infraspecificEpithet | taxonRank | verbatimTaxonRank | scientificNameAuthorship | vernacularName | nomenclaturalCode | taxonomicStatus | nomenclaturalStatus | taxonRemarks

ResourceRelationship (Auxiliary Terms)
resourceRelationshipID | resourceID | relatedResourceID | relationshipOfResource | relationshipAccordingTo | relationshipEstablishedDate | relationshipRemarks

MeasurementOrFact (Auxiliary Terms)
measurementID | measurementType | measurementValue | measurementAccuracy | measurementUnit | measurementDeterminedDate | measurementDeterminedBy | measurementMethod | measurementRemarks
The purpose of identifiers …is to name things, making it possible to refer to them.
http – PURL – UUID
http://purl.org/nhmuio/id/41d9cbb4-4590-4265-8079-ca44d46d27c3
Including machine readable formats
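The http, PURL and UUID pattern shown above can be sketched in a few lines. This is an illustration only: it assumes identifiers are formed by appending a random (version 4) UUID to the PURL base taken from the slide's example, and the function names are hypothetical. A newly generated identifier resolves only after the owner of the PURL domain has registered it.

```python
import uuid
from urllib.parse import urlparse

# Base taken from the example identifier on the slide.
PURL_BASE = "http://purl.org/nhmuio/id/"


def mint_identifier(base=PURL_BASE):
    """Build a new http identifier by appending a random UUID to the base."""
    return base + str(uuid.uuid4())


def extract_uuid(identifier):
    """Return the UUID in the last path segment; raises ValueError if it is not one."""
    last_segment = urlparse(identifier).path.rstrip("/").rsplit("/", 1)[-1]
    return uuid.UUID(last_segment)
```

Keeping the UUID as the final path segment means the same value can be stored in a local database as a plain UUID and published in `occurrenceID` as a resolvable http identifier.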
Citizen science Transcription
http://gbif.no/dugnad/
Custom data portals
Data paper
• Peer review option for biodiversity datasets.
• Authors get scientific credit for data publication.
• Meeting concerns over data quality.
• Meeting concerns over data citation mechanism.
http://www.gbif.org/publishingdata/datapapers
Metadata requirements
• Dataset description
• Project description
• People and Organizations (including roles)
• Coverage
– Taxonomic coverage
– Geographic coverage
– Temporal coverage
• Methods
• Intellectual property rights, licensing
• Keywords
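The requirements above can double as a checklist before submitting a data paper. The sketch below assumes the metadata is held in a dictionary. The key names are hypothetical labels for the sections listed above and do not correspond to element names of any metadata schema.

```python
# Hypothetical keys, one per metadata section listed on the slide.
REQUIRED_SECTIONS = [
    "dataset_description",
    "project_description",
    "people_and_organizations",
    "taxonomic_coverage",
    "geographic_coverage",
    "temporal_coverage",
    "methods",
    "rights_and_licensing",
    "keywords",
]


def missing_sections(metadata):
    """Return the required sections that are absent or empty in `metadata`."""
    return [key for key in REQUIRED_SECTIONS if not metadata.get(key)]
```

An empty return value means every section has some content. It says nothing about the quality of that content, which is what peer review is for.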
Data paper workshop
• The first Norwegian data paper writing workshop was held in Oslo, 2nd to 3rd December 2014, with 11 participants. http://goo.gl/GtW1Vx
• A second data paper workshop will be organized in Trondheim, 24th to 25th March 2015, with 20-25 participants. http://goo.gl/Ef1ZAy
Dimitri Brosens, GBIF Belgium
Publish your own data!
Many species occurrence data are “hidden” in reports and documents produced by universities, research institutes, public agencies and the university museums. Publish your biodiversity data!
Photo by: Niklas Bildhauer
Publish and archive your own species occurrence data
• You can always publish your species occurrence data by sending an email to gbif-[email protected]
• The GBIF Norway helpdesk will assist with data publishing (to GBIF and Artskart)!
• You can install data publishing software such as the GBIF Integrated Publishing Toolkit (IPT).
• Citizen Science portals such as Artsobservasjoner, iNaturalist, Anymals + Plants, …
• You can also use a data archiving platform such as B2SHARE (EUDAT) or NorStore (Norwegian research data, EUDAT).
http://artsobservasjoner.no/
CC-‐BY Dag Endresen
Published to GBIF
Archive your own data!
Work in progress…!
Grants for biodiversity data preparation
Small grant to support data preparation
• GBIF Norway has some funds for supporting new data providers with preparation of existing biodiversity datasets.
• To assist data owners to start publishing data.
• The application form and conditions can be requested by email from gbif-[email protected]
Thanks for listening!
Dag Endresen [email protected]
Christian Svindseth
christi[email protected]
gbif-[email protected]