LSIDs suck (sadly)
“suck” is a technical term
DOIs suck ($ € £)
Unique identifiers
Taxonomic names aren’t enough
Names have too much information
Cherie Booth = Cherie Blair
Names can change when circumstances change
Jonathan Roughgarden = Joan Roughgarden
Names carry meaning
Semantically opaque
identifier has no meaning
trouble with meaning
POK erythroid myeloid ontogenic factor
Pokemon causes cancer
Explicit metadata and data access
Have to fuss with DNS
Reliant on Internet address
What about DOIs?
doi:10.1080/10635150490264996
Have subscription?
What, no subscription?
Human-readable documents
Can’t predict what you get
Handles might be useful
GUID resolving to metadata
Resource Description Framework
Simple format (e.g., XML)
Everything is a resource…
supports inference
underpins Semantic Web
subject: http://www.w3.org
property: dc:publisher
object: World Wide Web Consortium
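The triple on this slide can be written down in a few lines. This is only a sketch: the `to_ntriples` helper is a hypothetical name, the `dc:` prefix is assumed to expand to the Dublin Core elements namespace, and the serialisation follows the N-Triples line format without escaping special characters in the literal.

```python
# Dublin Core elements namespace (assumed expansion of the "dc:" prefix).
DC = "http://purl.org/dc/elements/1.1/"

# One RDF statement as a (subject, property, object) tuple.
triple = ("http://www.w3.org", DC + "publisher", "World Wide Web Consortium")


def to_ntriples(subject, prop, obj):
    """Serialise a triple with a URI subject, URI property and plain literal object.

    Hypothetical helper: it does not escape quotes or backslashes in the literal.
    """
    return f'<{subject}> <{prop}> "{obj}" .'


line = to_ntriples(*triple)
print(line)
```

Subject and property are URIs; the object here is a literal string, but it could equally be another URI.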
RDF is everywhere
Existing vocabularies
Basic metadata (Dublin Core)
Geography (WGS 84)
Publications (PRISM)
Rights (Creative Commons)
Requires you to have URIs for objects
RDF documents can be independent
Can be as small as one triple
Aggregate triples from different sources
Store in a triple store
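Aggregation can be illustrated with a toy in-memory store. This is a minimal sketch, not a real triple store: the two sources, the `urn:example:doc1` identifier and its publisher value are made up for illustration, and `match` is a hypothetical helper name.

```python
# Two independent RDF documents, each a set of (subject, property, object) tuples.
# Both sources and their contents are hypothetical.
source_a = {
    ("http://www.w3.org", "dc:publisher", "World Wide Web Consortium"),
}
source_b = {
    ("http://www.w3.org", "dc:publisher", "World Wide Web Consortium"),  # same statement again
    ("urn:example:doc1", "dc:publisher", "Example Publisher"),           # made-up statement
}

# Aggregating is a set union: identical triples from different sources collapse.
store = source_a | source_b


def match(store, s=None, p=None, o=None):
    """Return triples matching the pattern; None acts as a wildcard."""
    return sorted(
        t for t in store
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


print(len(store))
print(match(store, s="urn:example:doc1"))
```

Because a triple carries its own subject URI, the documents need no shared schema or coordination before they are merged.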
Make new inferences
There are known knowns, things we know that we know
There are known unknowns, things we now know we don’t know
But there are also unknown unknowns, things we do not know we don’t know
things we don’t know we know
International Plant Names Index (IPNI)
Poissonia heterantha -- basionym --> Tephrosia heterantha
Coursetia heterantha -- basionym --> Tephrosia heterantha
Poissonia heterantha and Coursetia heterantha share a basionym
IPNI “knows” these names are synonyms
But doesn’t know it knows it
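The inference IPNI is missing can be sketched as a single rule: two names with the same basionym are synonyms. The code below assumes the two basionym links shown on the slides, written as plain tuples; `infer_synonyms` is a hypothetical helper name, not part of any IPNI service.

```python
from collections import defaultdict
from itertools import combinations

# (name, "basionym", basionym) statements, as read off the IPNI slides.
triples = [
    ("Poissonia heterantha", "basionym", "Tephrosia heterantha"),
    ("Coursetia heterantha", "basionym", "Tephrosia heterantha"),
]


def infer_synonyms(triples):
    """Pair up names that point at the same basionym."""
    by_basionym = defaultdict(set)
    for name, prop, basionym in triples:
        if prop == "basionym":
            by_basionym[basionym].add(name)
    pairs = []
    for names in by_basionym.values():
        # Every unordered pair of names sharing a basionym is a synonym pair.
        pairs.extend(combinations(sorted(names), 2))
    return sorted(pairs)


print(infer_synonyms(triples))
```

Neither record mentions the other name; the synonymy only appears once both statements sit in the same store and the rule is applied.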
Melissotarsus insularis: no hit
CASENT0107663-D01, DQ176312
DQ176312: Melissotarsus sp. BLF m1
CASENT0107663-D01: Melissotarsus insularis
Melissotarsus insularis = Melissotarsus sp. BLF m1
[Figure: graph linking CASENT0107663-D01, DQ176312, TaxId:342313 (label: Melissotarsus sp. BLF m1) and HNS30687 (label: Melissotarsus insularis) through "source", "subject" and "label" edges]
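The Melissotarsus example amounts to following links between identifiers. The sketch below uses the identifiers from the slide, but the direction of each edge is an assumption reconstructed from the figure labels (sequence "source" specimen, resource "subject" taxon, taxon "label" name); `objects`, `names_for` and `linked_names` are hypothetical helper names.

```python
# Assumed edge directions; the slide only shows the labels "source", "subject", "label".
triples = {
    ("DQ176312", "source", "CASENT0107663-D01"),       # sequence came from this specimen
    ("DQ176312", "subject", "TaxId:342313"),           # sequence is about this taxon
    ("TaxId:342313", "label", "Melissotarsus sp. BLF m1"),
    ("CASENT0107663-D01", "subject", "HNS30687"),      # specimen is about this name record
    ("HNS30687", "label", "Melissotarsus insularis"),
}


def objects(subject, prop):
    """All objects of triples with the given subject and property."""
    return {o for (s, p, o) in triples if s == subject and p == prop}


def names_for(resource):
    """Labels of every taxon the resource is 'about'."""
    return {label for taxon in objects(resource, "subject")
            for label in objects(taxon, "label")}


def linked_names(sequence):
    """Names attached to a sequence directly or via its source specimen."""
    names = set(names_for(sequence))
    for specimen in objects(sequence, "source"):
        names |= names_for(specimen)
    return names


print(sorted(linked_names("DQ176312")))
```

A search for "Melissotarsus insularis" finds no sequence by name alone, yet walking the graph shows that DQ176312 belongs to a specimen carrying that name.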
GUIDs are trivial
It’s about metadata
It’s about inference
Lessons from DOIs
“reference backbone”
add value to electronic publications
Crossref assigns DOI prefix
Crossref stores metadata
Crossref can be searched
“Oh, the vision thing” George Bush (Snr), 1987
GBIF assigns handles
Data sources provide identifiers
Data sources provide metadata and data
GBIF has metadata standards
Data sources supply metadata to GBIF
Low barrier to entry
Data sources install Java Handle System (or whatever)
If we use RDF, then GBIF can support inference
We learn something we didn’t know
Are we ready to do this?
Quality of service
Metadata is what counts…
…and persistence