Globally Unique Identifiers in Biodiversity Informatics Kevin Richards Landcare Research NZ TDWG 2008




Page 1: Globally Unique Identifiers  in  Biodiversity Informatics

Globally Unique Identifiers in Biodiversity Informatics

Kevin Richards, Landcare Research NZ

TDWG 2008


Introduction

GUID (Globally Unique IDentifier)

– What, Why, Which, How
– LSIDs
– Issues


What are GUIDs

Globally Unique IDentifier
• A short name for a complex entity on the web
• Each name identifies only one entity
• Examples:
  – UUID, e.g. 3E9D6B68-A08C-4F15-BC8A-1265F15D30E2
  – DOI, e.g. doi:10.1006/jmbi.1998.2354
  – Handle, e.g. hdl:123.456/abc
  – LSID, e.g. urn:lsid:indexfungorum.org:names:213645
  – PURL, e.g. http://purl.oclc.org/abc/123
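
The example identifiers above differ mainly in their syntax, so the scheme can often be guessed from the string alone. The following is a minimal sketch of that idea, not a validator: the function name `guid_scheme` is hypothetical, and treating any host that starts with `purl.` as a PURL is an assumption made only for this illustration.

```python
import re
import uuid


def guid_scheme(identifier: str) -> str:
    """Guess the identifier scheme from syntax alone (no network access)."""
    text = identifier.strip()
    lowered = text.lower()
    if lowered.startswith("urn:lsid:"):
        return "LSID"
    if lowered.startswith("doi:"):
        return "DOI"
    if lowered.startswith("hdl:"):
        return "Handle"
    # Assumption for this sketch: PURL hosts start with "purl."
    if re.match(r"https?://purl\.", lowered):
        return "PURL"
    try:
        uuid.UUID(text)  # raises ValueError if the text is not a UUID
        return "UUID"
    except ValueError:
        return "unknown"


for example in [
    "3E9D6B68-A08C-4F15-BC8A-1265F15D30E2",
    "doi:10.1006/jmbi.1998.2354",
    "hdl:123.456/abc",
    "urn:lsid:indexfungorum.org:names:213645",
    "http://purl.oclc.org/abc/123",
]:
    print(guid_scheme(example), example)
```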


What is a GUID

– Properties
  • Persistent
  • Opaque
  • Resolvable, sometimes - useful for locating information about the entity


Why use GUIDs

Data at Provider 1
BOOK: “The three little pigs”, 3 copies

Data at Provider 2
BOOK: “Three little pigs”, 2 copies

Data Consumer
BOOKS:
“Three little pigs” … (2)
“The three little pigs” … (3)


… but with GUIDs …

Data at Provider 1 (ID = P1)
BOOK: “The three little pigs”, ID (e.g. ISBN) = A123, 3 copies

Data at Provider 2 (ID = P2)
BOOK: “Three little pigs”, ID (e.g. ISBN) = A123, 2 copies

Data Consumer
BOOKS:
ID : A123 : “The three little pigs” … (5)

BOOK Titles:
ID A123 : Provider P1 : “The three little pigs”
ID A123 : Provider P2 : “Three little pigs”
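
The book example can be sketched in a few lines of code. This is only an illustration of the idea: the record layout and the function names `merge_by_title` and `merge_by_id` are hypothetical and not part of any TDWG tool.

```python
from collections import defaultdict

# The two provider records from the slides.
provider_records = [
    {"provider": "P1", "id": "A123", "title": "The three little pigs", "copies": 3},
    {"provider": "P2", "id": "A123", "title": "Three little pigs", "copies": 2},
]


def merge_by_title(records):
    """Without a shared ID, the consumer can only group on the title string."""
    totals = defaultdict(int)
    for record in records:
        totals[record["title"]] += record["copies"]
    return dict(totals)


def merge_by_id(records):
    """With a shared ID, copies add up and each provider's title is kept."""
    merged = {}
    for record in records:
        entry = merged.setdefault(record["id"], {"copies": 0, "titles": {}})
        entry["copies"] += record["copies"]
        entry["titles"][record["provider"]] = record["title"]
    return merged


print(merge_by_title(provider_records))  # two separate books
print(merge_by_id(provider_records))     # one book, five copies
```

Grouping on the title leaves two entries of 2 and 3 copies; grouping on the ID gives one entry of 5 copies and still records which provider used which spelling.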


Example in our domain

ConsensusId : urn:lsid:compositae.org:names:45240C9B-D419-4B6F-93A5-D0A6DEAB4C81
Name : Anthemis gaudium-solis Velen.

Provider   Id                                        Taxon Name
IPNI       urn:lsid:ipni.org:names:177325-1:1.1      Anthemis gaudium-solis Vel.
Tropicos   50163035                                  Anthemis goudium-solis Velen.
Euro+Med   133202                                    Anthemis gaudium-solis Velen.
Govaerts   {29FFBEDC-19F5-4899-BCB3-05EE2C7816C8}    Anthemis gaudiumsolis Velen.


GUIDs are vital to TDWG architecture


Which GUID

• GUID Subgroup Recommendations:
• Use LSIDs for identifying biodiversity data
• Reuse GUIDs where they already exist
  – GUID type
  – Existing assignments
• See GUID Report - http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUID2Report&show_comments=1

Also Canberra LSID Workshop report: http://www.tdwg.org/fileadmin/subgroups/guid/LSID_policy_workshop_Report_Canberra.pdf


What is an LSID?

• Life Science IDentifier
• Developed by the Object Management Group & I3C (Interoperable Informatics Infrastructure Consortium)
• Implemented by the team at IBM
• Used for data objects, datasets, images, files


LSID Format

urn:lsid:bioguid.org:taxon:1122:v1

• Prefix (urn) - indicates that this is a URN
• URN type (lsid) - indicates that it’s an LSID-type URN
• Authority (bioguid.org) - the authority who issued the LSID
• Namespace (taxon) - internal to that authority
• Object identifier (1122) - unique within that authority’s namespace
• Version (v1) - optional
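
The colon-separated layout above can be split into its parts mechanically. The following is a minimal sketch of such a parser, not the API of any LSID code stack: the `LSID` tuple and `parse_lsid` function are hypothetical names, and the only checks made are the ones the format description implies.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    version: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split urn:lsid:authority:namespace:object[:version] into its parts."""
    parts = text.strip().split(":")
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated parts: {text!r}")
    prefix, urn_type, authority, namespace, object_id = parts[:5]
    if prefix.lower() != "urn" or urn_type.lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if not (authority and namespace and object_id):
        raise ValueError(f"empty component in LSID: {text!r}")
    # The version is optional; treat a missing or empty one as None.
    version = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, version)


print(parse_lsid("urn:lsid:bioguid.org:taxon:1122:v1"))
print(parse_lsid("urn:lsid:indexfungorum.org:names:213645"))
```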


LSID Rules

• Data doesn’t change (byte identical)
• Always available for resolution
  – Hand over to another authority if necessary
• At least some basic metadata


Pros of LSIDs

• Not tied to physical addresses (as URLs are)
• Comparison can be done without resolving the ID
  – e.g. for cases like “does object a = object b”
• Do not require any central registration or central service
• Quick to adopt
• Encourage thought and planning before they are allocated
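
Comparison without resolution amounts to comparing two strings after normalising them. The sketch below assumes one reading of the LSID specification: that the `urn`, `lsid` and authority parts are case-insensitive while the namespace, object identifier and version are case-sensitive. Check that assumption against the specification before relying on it. The function names are hypothetical.

```python
def normalize_lsid(text: str) -> str:
    """Lower-case the parts assumed to be case-insensitive; keep the rest as-is."""
    parts = text.strip().split(":")
    if len(parts) < 5 or parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    # Assumption: only "urn", "lsid" and the authority ignore case.
    head = [part.lower() for part in parts[:3]]
    return ":".join(head + parts[3:])


def same_object(a: str, b: str) -> bool:
    """Answer 'does object a = object b' without contacting any resolver."""
    return normalize_lsid(a) == normalize_lsid(b)


print(same_object("URN:LSID:IPNI.ORG:names:177325-1:1.1",
                  "urn:lsid:ipni.org:names:177325-1:1.1"))
print(same_object("urn:lsid:ipni.org:names:177325-1:1.1",
                  "urn:lsid:ipni.org:names:177325-1:1.2"))
```

Two LSIDs that differ only in version are treated as different objects here, which matches the rule that the data behind an LSID does not change.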


Cons of LSIDs

However …

• Require a DNS SRV record
• Require specialised software to resolve an LSID (not built in to most software)
• The restriction “LSID data cannot change” can be difficult to satisfy
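
The DNS SRV requirement means a client has to turn the LSID's authority into a DNS name before it can find the resolution service. The sketch below only builds that name; it assumes the `_lsid._tcp.<authority>` naming convention used for LSID resolution, and it does not perform the lookup, which would need a DNS library outside the Python standard library. The function name is hypothetical.

```python
def srv_query_name(lsid: str) -> str:
    """Build the DNS name a client would query for an SRV record."""
    parts = lsid.strip().split(":")
    if len(parts) < 5 or [part.lower() for part in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {lsid!r}")
    authority = parts[2].lower()
    # Assumption: the service label is "_lsid" and the protocol label is "_tcp".
    return f"_lsid._tcp.{authority}"


print(srv_query_name("urn:lsid:ipni.org:names:30000959-2:1.1.2.1"))
```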


How

• Decide what data/objects to apply IDs to
• Decide on
  – Authority
  – Namespace
  – Local IDs (new vs existing)
• Issue LSIDs
• Set up a resolver
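
Once the authority, namespace and local ID policy are decided, issuing an LSID is string assembly. The following is a minimal sketch under stated assumptions: `example.org` and `names` are placeholder values, `issue_lsid` is a hypothetical function, and using a fresh UUID when no local ID exists is one possible policy rather than a recommendation from the GUID subgroup.

```python
import uuid
from typing import Optional

AUTHORITY = "example.org"  # placeholder: a domain name the issuer controls
NAMESPACE = "names"        # placeholder: chosen by the issuing authority


def issue_lsid(local_id: Optional[object] = None,
               version: Optional[str] = None,
               authority: str = AUTHORITY,
               namespace: str = NAMESPACE) -> str:
    """Reuse an existing local ID if given, otherwise mint a new UUID."""
    if local_id is not None:
        object_id = str(local_id)
    else:
        object_id = str(uuid.uuid4()).upper()
    for part in (authority, namespace, object_id):
        if not part or ":" in part:
            raise ValueError(f"invalid LSID component: {part!r}")
    lsid = f"urn:lsid:{authority}:{namespace}:{object_id}"
    if version:
        lsid += f":{version}"
    return lsid


print(issue_lsid(213645))         # existing local ID
print(issue_lsid(version="1.1"))  # new object with a version
```

Reusing an existing local ID keeps the LSID traceable to the source database; a UUID avoids exposing internal keys. Which fits better depends on the dataset.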


LSID Code

• Current Code Stacks
  – Open Source (sourceforge.net)
  – Java, C++, Perl (IBM)
  – Microsoft .NET (Myself)
  – TAPIR LSID configuration


LSID Tools

• IBM LSID Launchpad
• Firefox LSID Browser
• LSID Tester (Rod Page)
• Web based resolver – http://lsid.tdwg.org/
  – http://lsid.tdwg.org/urn:lsid... to get LSID metadata
  – http://lsid.tdwg.org/summary/urn:lsid... to get summary info of LSID object
• Example LSID servers:
  – Index Fungorum - urn:lsid:indexfungorum.org:names:213649
  – IPNI - urn:lsid:ipni.org:names:30000959-2:1.1.2.1
  – uBio - urn:lsid:ubio.org:namebank:11815
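
The two URL patterns of the web based resolver can be built by appending the LSID to the resolver address. This sketch only constructs the URLs following the patterns listed above; it does not fetch them, and it makes no claim about whether the service is currently reachable. The function names are hypothetical.

```python
RESOLVER = "http://lsid.tdwg.org/"


def metadata_url(lsid: str) -> str:
    """URL pattern for LSID metadata, as listed on the slide."""
    return RESOLVER + lsid


def summary_url(lsid: str) -> str:
    """URL pattern for summary info about the LSID object."""
    return RESOLVER + "summary/" + lsid


for lsid in [
    "urn:lsid:indexfungorum.org:names:213649",
    "urn:lsid:ipni.org:names:30000959-2:1.1.2.1",
    "urn:lsid:ubio.org:namebank:11815",
]:
    print(metadata_url(lsid))
    print(summary_url(lsid))
```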


Issues to think about

• Who assigns new LSIDs?

• Who maintains LSID resolvers?

• What to assign LSIDs to:
  – Physical or Digital
  – Granularity
  – Only objects that need to be resolved / identified externally
  – Is there any data, or only metadata?


Issues to think about

• When to resolve LSIDs
  – Every time an LSID is encountered, or only when a client requests it?
• TDWG standards for metadata
  – Which ones?
  – Consistent application
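
One way to picture the "when to resolve" question is a client that records every LSID it encounters but resolves only on request, and at most once. This is a sketch of that option, not a recommendation: the class `LazyLsidCache` is hypothetical, and `fake_fetch` stands in for a real resolver call. Keeping a result indefinitely is safer for LSID data, which must not change, than for metadata, which may.

```python
from typing import Callable, Dict, List, Set


class LazyLsidCache:
    """Remember every LSID seen, but resolve only on request and only once."""

    def __init__(self, fetch: Callable[[str], dict]) -> None:
        self._fetch = fetch
        self._seen: Set[str] = set()
        self._resolved: Dict[str, dict] = {}

    def encounter(self, lsid: str) -> None:
        """Record an LSID without resolving it."""
        self._seen.add(lsid)

    def resolve(self, lsid: str) -> dict:
        """Resolve on first request; later requests reuse the stored result."""
        self._seen.add(lsid)
        if lsid not in self._resolved:
            self._resolved[lsid] = self._fetch(lsid)
        return self._resolved[lsid]

    def unresolved(self) -> Set[str]:
        """LSIDs that were encountered but never resolved."""
        return self._seen - set(self._resolved)


calls: List[str] = []


def fake_fetch(lsid: str) -> dict:
    # Stand-in for a network call to an LSID resolver.
    calls.append(lsid)
    return {"lsid": lsid, "title": "placeholder metadata"}


cache = LazyLsidCache(fake_fetch)
cache.encounter("urn:lsid:ubio.org:namebank:11815")
cache.encounter("urn:lsid:indexfungorum.org:names:213649")
cache.resolve("urn:lsid:ubio.org:namebank:11815")
cache.resolve("urn:lsid:ubio.org:namebank:11815")
print(len(calls), "resolver call(s);", len(cache.unresolved()), "LSID(s) not yet resolved")
```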


References

• LSID Source Forge - http://lsids.sourceforge.net/

• LSID .NET Source Forge - http://sourceforge.net/projects/lsid-dotnet

• LSID Tutorial - http://www-128.ibm.com/developerworks/opensource/library/os-lsid/

• LSID Specification - http://www.omg.org/cgi-bin/doc?dtc/04-05-01

• LSID Tester - http://linnaeus.zoology.gla.ac.uk/~rpage/lsid/tester/

• LSID Launchpad - http://www-124.ibm.com/developerworks/downloads/detail.php?group_id=124&what=rele&id=553

• GUID Subgroup - http://www.tdwg.org/activities/guid/

• GUID Subgroup Reports

– http://wiki.gbif.org/guidwiki/wikka.php?wakka=GUID2Report&show_comments=1

– http://wiki.tdwg.org/twiki/pub/TIP/TipDocuments/GUID1Report.pdf

• Firefox LSID developer site - http://lsid.mozdev.org/