Life Science Identifiers Chris Wroe (based on material from myGrid team and IBM Life Sciences)

Page 1: Life Science Identifiers

Life Science Identifiers

Chris Wroe

(based on material from myGrid team and IBM Life Sciences)

Page 2: Life Science Identifiers

The Issue

• Each database on the web has:

  – Different policies for assigning and maintaining identifiers, dealing with versioning, etc.

  – A different mechanism for retrieving an item given an ID.

• LSID is designed to harmonise the retrieval of data.

Page 3: Life Science Identifiers

What is an LSID?

• An OMG standard: urn:lsid:AuthorityID:NamespaceID:ObjectID:RevisionID

• Example: urn:lsid:ncbi.nlm.nih.gov:GenBank:T48601:2

• LSID Designator: A mandatory prefix (urn:lsid) that marks the item being identified as a life science-specific resource

• Authority Identifier: An Internet domain owned by the organization that assigns an LSID to a resource

• Namespace Identifier: The name of the resource (e.g., a database) chosen by the assigning organization

• Object Identifier: The unique name of an item (e.g., a gene name or a publication tracking number) as defined within the context of a given database

• Revision Identifier: An optional parameter to keep track of different versions of the same item
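The five parts listed above can be separated mechanically. The following is a minimal sketch, not code from the IBM toolkit: the names `LSID` and `parse_lsid` are hypothetical, and the validation is deliberately simpler than the full OMG specification.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    """Hypothetical container for the parts that follow the urn:lsid designator."""
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None  # the revision is optional

    def __str__(self) -> str:
        parts = ["urn", "lsid", self.authority, self.namespace, self.object_id]
        if self.revision is not None:
            parts.append(self.revision)
        return ":".join(parts)


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its components (simplified check only)."""
    parts = text.split(":")
    # urn + lsid + authority + namespace + object, plus an optional revision
    if len(parts) not in (5, 6):
        raise ValueError(f"expected 5 or 6 colon-separated fields: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"missing urn:lsid designator: {text!r}")
    if any(not part for part in parts[2:]):
        raise ValueError(f"empty component in: {text!r}")
    return LSID(*parts[2:])


lsid = parse_lsid("urn:lsid:ncbi.nlm.nih.gov:GenBank:T48601:2")
print(lsid.authority, lsid.namespace, lsid.object_id, lsid.revision)
```

Because the authority is an Internet domain, a client only needs the `authority` field to know which organization to ask about the identifier.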

Page 4: Life Science Identifiers

How is data retrieved?

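Components: Application, LSID client, PDB Authority @ pdb.org, PDB Data resolver, PDB Metadata resolver, PDB database

1. Application to LSID client: Get me info for urn:lsid:pdb.org:1AFT

2. LSID client to PDB Authority @ pdb.org: Where can I get data and metadata for urn:lsid:pdb.org:1AFT?

3. LSID client to PDB Data resolver and PDB Metadata resolver: Get me the data and metadata for urn:lsid:pdb.org:1AFT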
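The message sequence on this slide can be mimicked with a small in-memory simulation. This is only a sketch under stated assumptions: `Resolver`, `Authority` and `LSIDClient` are hypothetical classes, no network calls are made, the stored strings are placeholders rather than real PDB content, and the real toolkit's API may look quite different.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class Resolver:
    """Hypothetical resolver: wraps a lookup table standing in for the database."""
    records: Dict[str, str]

    def get(self, lsid: str) -> str:
        return self.records[lsid]


@dataclass
class Authority:
    """Hypothetical authority: holds no data, only pointers to the resolvers."""
    data_resolver: Resolver
    metadata_resolver: Resolver

    def locate(self, lsid: str) -> Tuple[Resolver, Resolver]:
        return self.data_resolver, self.metadata_resolver


class LSIDClient:
    def __init__(self, authorities: Dict[str, Authority]) -> None:
        # Stands in for locating the authority from its Internet domain.
        self.authorities = authorities

    def resolve(self, lsid: str) -> Tuple[str, str]:
        # The authority is the field right after the "urn:lsid" designator.
        authority_id = lsid.split(":")[2]
        authority = self.authorities[authority_id]
        # Ask the authority where the data and metadata live.
        data_resolver, metadata_resolver = authority.locate(lsid)
        # Fetch both from the resolvers it pointed to.
        return data_resolver.get(lsid), metadata_resolver.get(lsid)


example = "urn:lsid:pdb.org:1AFT"
pdb = Authority(
    data_resolver=Resolver({example: "<placeholder structure data>"}),
    metadata_resolver=Resolver({example: "<placeholder metadata>"}),
)
client = LSIDClient({"pdb.org": pdb})
data, metadata = client.resolve(example)
print(data, metadata)
```

The point of the indirection is that the application never needs to know where PDB keeps its data; it only hands the identifier to the client.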

Page 5: Life Science Identifiers

Authority Commitments

• Data returned for a given LSID must always be the same

• An authority must always be maintained at the authority domain (e.g. pdb.org) that can point to data and metadata resolvers.
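One way a provider might enforce the first commitment is to refuse any attempt to change the bytes behind an existing LSID, so that changed content has to be published under a new revision. The sketch below is an assumption for illustration, not something described in the talk: `ImmutableLSIDStore` is a hypothetical class and the `example.org` identifiers are made up.

```python
from typing import Dict


class ImmutableLSIDStore:
    """Hypothetical store in which an LSID, once published, never changes."""

    def __init__(self) -> None:
        self._data: Dict[str, bytes] = {}

    def publish(self, lsid: str, content: bytes) -> None:
        existing = self._data.get(lsid)
        if existing is not None and existing != content:
            # Changing the data behind an LSID would break the commitment.
            raise ValueError(f"{lsid} is already bound to different data")
        self._data[lsid] = content

    def get(self, lsid: str) -> bytes:
        return self._data[lsid]


store = ImmutableLSIDStore()
store.publish("urn:lsid:example.org:demo:item1:1", b"first version")
# A correction is published as revision 2, leaving revision 1 untouched.
store.publish("urn:lsid:example.org:demo:item1:2", b"corrected version")
```

With this rule, anyone holding the revision 1 identifier keeps getting exactly the bytes they originally saw.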

Page 6: Life Science Identifiers

LSID Components

• IBM built client and server implementations in Perl, Java and C++: http://www-124.ibm.com/developerworks/oss/lsid/

• Fairly simple to wrap an existing database as a source of data or metadata

• Client also simple to use

• LSID Launchpad adds LSID resolution to Internet Explorer

Page 7: Life Science Identifiers

LSID Launchpad

Page 8: Life Science Identifiers

Use within myGrid

• Needed an identifier for things such as workflows, experiments, new data results etc

• Everything identified with LSIDs!

• LSID saves us having to invent our own conventions and code.

• Can pass references to data around and be reassured the other party will know how to resolve that reference

Page 9: Life Science Identifiers

Haystack demonstrator

• GenBank record

• Provenance of results

• Managing collection of sequences for review

Page 10: Life Science Identifiers

Caveats

• To be successful across the community, LSID will require widespread adoption by data providers such as GenBank, UniProt, etc.

• Status of such adoption unclear