Building the Biodiversity Knowledge Graph

Slides from the 4th Global Online Biodiversity Informatics Seminar: https://plus.google.com/events/clvk6nd14d9fhh7e4a6oe5mt9s0

@rdmpage

http://iphylo.blogspot.com

• There are known knowns, things we know that we know

• There are known unknowns, things we now know we don’t know

• But there are also unknown unknowns, things we do not know we don’t know

known

unknown

knowns

unknowns

Things we don’t know that we know

Melissotarsus insularis

Melissotarsus insularis: no hit

CASENT0107663-D01 DQ176312

Melissotarsus sp. BLF m1 DQ176312

CASENT0107663-D01 Melissotarsus insularis

Melissotarsus insularis = Melissotarsus sp. BLF m1

We have a vast amount of “old stuff”

Numbers of new animal names (chart annotations: 1923, WWI, WWII)

We are learning new stuff

“New” and “old” are disconnected

Dark taxa

http://iphylo.blogspot.co.uk/2011/04/dark-taxa-genbank-in-post-taxonomic.html

Mammals in GenBank

Proper Linnaean names

Aus sp.

Mammals

Proper Linnaean names

Aus sp.

“Invertebrates”

BOLD

Challenge: linking things together

(sticky data)

Data is good

More data is better…

…but this data is not sticky

Location

name

name

Tags

Name

name

Identifiers

Shared identifiers are sticky

Identifiers

• Globally unique

• Resolvable (for humans and machines)

• Use other people’s identifiers to link things together
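A minimal sketch of why shared identifiers are "sticky", using the Melissotarsus example from the earlier slides. The record layout and the field names (`accession`, `voucher`, `specimen_code`, `name`) are assumptions made for illustration; they are not the actual schema of any sequence or specimen database.

```python
# Hypothetical, simplified records; the field names are illustrative only.
sequences = [
    {"accession": "DQ176312", "name": "Melissotarsus sp. BLF m1",
     "voucher": "CASENT0107663-D01"},
]
specimens = [
    {"specimen_code": "CASENT0107663-D01", "name": "Melissotarsus insularis"},
]


def join_on_name(sequences, specimens):
    """Link records whose name strings match exactly."""
    by_name = {s["name"]: s for s in specimens}
    return [(q, by_name[q["name"]]) for q in sequences if q["name"] in by_name]


def join_on_identifier(sequences, specimens):
    """Link records that share a specimen identifier."""
    by_code = {s["specimen_code"]: s for s in specimens}
    return [(q, by_code[q["voucher"]]) for q in sequences
            if q.get("voucher") in by_code]


name_links = join_on_name(sequences, specimens)
id_links = join_on_identifier(sequences, specimens)

# The name strings differ, so only the identifier join finds the link.
print(len(name_links), len(id_links))
for seq, spec in id_links:
    print(f'{seq["name"]} = {spec["name"]} (via {spec["specimen_code"]})')
```

Joining on names finds nothing because the two records use different name strings; joining on the shared specimen code recovers the link between them.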

Human and machine readable

machine

human

{"author": [ { "family": "Page", "given": "Roderic D.M." } ], "container-title": "PeerJ", "reference-count": 60, "page": "e190", "deposited": { "date-parts": [ [ 2013, 11, 18 ] ], "timestamp": 1384732800000 }, "title": "BioNames: linking taxonomy, texts, and trees", "type": "journal-article", "DOI": "10.7717/peerj.190", "ISSN": [ "2167-8359" ], "URL": "http://dx.doi.org/10.7717/peerj.190"}
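One way a machine might obtain a record like this is DOI content negotiation: the same DOI URL that sends a browser to the article page can return metadata when the request asks for a different media type. The sketch below uses only the Python standard library. The function names `build_csl_request` and `fetch_csl` are hypothetical helpers, and the `https://doi.org/` resolver address is the current form of the `http://dx.doi.org/` address shown on the slides.

```python
import json
import urllib.parse
import urllib.request

# Media type for CSL JSON, requested through the HTTP Accept header.
CSL_JSON = "application/vnd.citationstyles.csl+json"


def build_csl_request(doi: str) -> urllib.request.Request:
    """Build a request asking the DOI resolver for CSL JSON metadata."""
    url = "https://doi.org/" + urllib.parse.quote(doi, safe="/")
    return urllib.request.Request(url, headers={"Accept": CSL_JSON})


def fetch_csl(doi: str, timeout: float = 10.0) -> dict:
    """Resolve the DOI over the network and parse the JSON response."""
    with urllib.request.urlopen(build_csl_request(doi), timeout=timeout) as resp:
        return json.load(resp)
```

Calling `fetch_csl("10.7717/peerj.190")` needs network access and should return a dictionary similar to the record above; the exact fields depend on what the registration agency returns at the time of the request.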

Using other people’s identifiers is hard work and scary

• Hard work - you have to find their identifiers

• Scary - what happens if the other person breaks their identifiers?

• Solution: make it easy to find them, and make them robust (e.g., CrossRef and DOIs)

http://dx.doi.org/10.7717/peerj.190

DOI (Digital Object Identifier)

Biodiversity Knowledge Graph (linking things together)

Our questions are “paths” in this network
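The idea of a question as a path can be sketched with a small breadth-first search. Every node identifier below (`paper:P1`, `sequence:S1`, and so on) is a hypothetical placeholder; only the link types ("publishes", "has voucher", "cites") echo labels used elsewhere in these slides. This is an illustration under those assumptions, not a description of how any existing service stores its links.

```python
from collections import deque

# Hypothetical edges: (subject, link type, object).
edges = [
    ("paper:P1", "publishes", "sequence:S1"),
    ("sequence:S1", "has voucher", "specimen:V1"),
    ("paper:P2", "cites", "specimen:V1"),
    ("paper:P2", "publishes", "name:N1"),
]


def find_path(edges, start, goal):
    """Return the shortest chain of nodes from start to goal, or None."""
    # Treat links as traversable in both directions.
    neighbours = {}
    for subject, _label, obj in edges:
        neighbours.setdefault(subject, []).append(obj)
        neighbours.setdefault(obj, []).append(subject)

    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal:
            return path
        for nxt in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None


# "Is the sequence published in paper P1 connected to a taxonomic name?"
path = find_path(edges, "paper:P1", "name:N1")
print(" -> ".join(path) if path else "no path")
```

In this toy graph the answer exists only because the two papers share a specimen identifier; remove the "cites" link and the search returns `None`.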

Phylogeography

Taxonomy

GenBank records from Spain

MESH term

PMID:948206

http://biostor.org/reference/102054

http://data.gbif.org/occurrences/215921922/

BHL and GBIF as biomedical databases

http://iphylo.blogspot.co.uk/2012/03/bhl-and-gbif-as-biomedical-databases.html

Metrics (counting links in the knowledge graph)

In an attempt to live up to that increasing demand for documentation, the leadership of the Natural History Museum of Denmark has issued an order to its curatorial staff - The staff members are requested to document which publications from 2011, written entirely by external scientists, that in one way or another are based on material in the collections of the Museum.

http://markmail.org/message/opv2we7fkmro2nen @TAXACOM

https://twitter.com/#!/search/10.1371%252Fjournal.pone.0036881

https://twitter.com/edwbaker/status/205595933159858176

http://www.museum-analytics.org/

Cited, linkable specimens

NMNH Vertebrate Zoology Herpetology Collections: 11194

CAS Herpetology Collection Catalog: 9619

MCZ Herpetology Collection: 6720

Herpetology Collection (University of Kansas Biodiversity Research Center): 5818

http://iphylo.blogspot.co.uk/2012/02/gbif-specimens-in-biostor-who-are-top.html

Annotation (everyone can make the knowledge graph)

http://bionames.org/labs/bookmarklet/

How many people view annotation

Data

Fix me!

Annotation as fixing errors

Annotation as building the knowledge graph

paper specimen

paper

sequence

taxonomic name

specimen

cites

publishes

has voucher

OK, but if the biodiversity knowledge graph is so cool, why haven’t we made it already?

Open question:

Who will build the biodiversity knowledge graph?
