ORCID Technical Report May 18, 2011




Development Approach


Alpha
• Completed Spring 2010
• Self-claim oriented
• Limited light integration with a few participant services
• Currently in use for demonstration

Phase 1
• In progress 2011
• Will be developed by ORCID, Inc.
• Will provide core for future production service
• Will focus on currently active researchers
• Development headed by Geoff Bilder

Phase 2
• Development 2012+
• Will address assertions by wide group of third parties
• Will extend capabilities for alternate roles and other types of contributions
• Will provide mechanisms for automatic de-duplication of third party donated records


ORCID Alpha Available for Demo


…This Alpha provides a test environment for illustrating use cases and gathering feedback from the community for services ORCID may provide. The Alpha does not represent ORCID's live system…


ORCID Alpha Features


Links to Published Work via DOI

Institutional Affiliations


ORCID Alpha Features


Crossref Lookup for Publications

Profile Management and Privacy Controls


Phase 1 Scope


• ORCID will build a central registry of unique identifiers for researchers and scholars with the following scope:
  • ORCID will focus on currently active researchers.
  • Data will come from individuals and universities.
  • ORCID will be a hybrid system of self- and organization-asserted identity.
  • Data collected will be those needed for disambiguation - extra data for optionally creating full CV-like profiles might be added in the future.
  • The system will provide basic matching and disambiguation of names.
  • The ORCID system will, from the start, enable 3rd parties to build value added services using ORCID infrastructure.
  • ORCID services will be developed based on the needs of the ORCID community.


Phase 1 Functionality


The development of the ORCID “alpha” and subsequent discussions with stakeholders have identified a number of additional changes that would need to be made to the system in order to meet common requirements. These include:

• Incorporate OAuth2 & profile exchange
• Privacy mechanism to support three-level control (private/protected/public) at field level (needed to support profile exchange)
• Authentication/authorization mechanism to support “delegated” management of profiles (e.g. a researcher can grant permission to a departmental secretary or librarian to edit a profile on their behalf)
• Include production-level publication lookup feature from CrossRef
• Expose minimal provenance information for metadata records
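A minimal sketch of what field-level privacy control could look like. This is an illustration only, not the ORCID design: the `Visibility` ordering, the sample profile, and the `visible_fields` helper are all assumptions, as is the reading of "protected" as "visible to parties the researcher has approved".

```python
from enum import IntEnum


class Visibility(IntEnum):
    """Assumed ordering: a lower value means a more restricted field."""
    PRIVATE = 0    # owner (and any delegates) only
    PROTECTED = 1  # parties approved by the owner (assumed meaning)
    PUBLIC = 2     # anyone


# Hypothetical profile: every field carries its own visibility level.
profile = {
    "name": ("Jane Researcher", Visibility.PUBLIC),
    "affiliation": ("Example University", Visibility.PROTECTED),
    "email": ("jane@example.org", Visibility.PRIVATE),
}


def visible_fields(profile, access):
    """Return the fields a requester with the given access level may see.

    An anonymous requester has PUBLIC access, an approved party has
    PROTECTED access, and the owner has PRIVATE access (sees everything).
    """
    return {
        field: value
        for field, (value, level) in profile.items()
        if level >= access
    }
```

Under these assumptions, a profile exchange request from an approved third party would call `visible_fields(profile, Visibility.PROTECTED)` and receive the name and affiliation but not the email address.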


Phase 2: Issues of Assertion


• Consider disambiguation as a collection of “claims” by different “parties”

• Evaluating the duplication, contradiction, and uniqueness of claims can indicate the credibility of a record
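One way to picture the claims model is to group claims by record and field, then check whether the asserting parties agree. The sketch below is hypothetical: the tuple layout, the party names, and the three status labels are assumptions made for illustration, not part of any ORCID specification.

```python
from collections import defaultdict

# Hypothetical claims: (asserting party, record id, field, value).
claims = [
    ("researcher", "rec-1", "affiliation", "Example University"),
    ("publisher-a", "rec-1", "affiliation", "Example University"),
    ("publisher-b", "rec-1", "affiliation", "Another Institute"),
    ("researcher", "rec-1", "name", "J. Researcher"),
]


def summarize(claims):
    """Classify each (record, field) by how its claims relate.

    "contradicted": parties assert different values
    "corroborated": several parties assert the same value (duplication)
    "unique":       a single party asserts a single value
    """
    by_field = defaultdict(lambda: defaultdict(set))
    for party, record, field, value in claims:
        by_field[(record, field)][value].add(party)

    summary = {}
    for key, values in by_field.items():
        if len(values) > 1:
            summary[key] = "contradicted"
        elif len(next(iter(values.values()))) > 1:
            summary[key] = "corroborated"
        else:
            summary[key] = "unique"
    return summary
```

How such statuses would translate into a credibility score is left open here, as the slide does not specify it.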


Ongoing Research: Profile Exchange Group


• The work of the Profile Exchange sub-group:
  • Recommend a technological approach to reliably and efficiently merge researcher profiles from different databases
• Approach
  • Create a Gold Standard of test data comprising a set of records with a number of known True Positive matches
  • In terms of creating the test set, a True Positive is determined by matching the md5 hash of lowercase email addresses, where the email address contains a partial match with the name of the author (this is to avoid the known problem of generic email addresses such as “lab@”, “admin@”, “info@”, etc.). We are not at all proposing that email/email hash would be used in a final system.
  • The test data will be loaded into a system, whereby contributors with matching technology can pull down the records and apply their methods.
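The True Positive rule described above can be sketched as follows. The record layout, the helper names, and the three-letter minimum for name tokens are assumptions; the slide gives only the general rule (matching md5 of the lowercased email, plus a partial name match to exclude generic addresses).

```python
import hashlib
import re

# Generic mailbox names mentioned on the slide; a real list would be longer.
GENERIC_LOCAL_PARTS = {"lab", "admin", "info"}


def email_hash(email):
    """md5 hex digest of the trimmed, lowercased email address."""
    return hashlib.md5(email.strip().lower().encode("utf-8")).hexdigest()


def email_resembles_name(email, name):
    """True if the part before '@' contains some token of the author name.

    Tokens shorter than three letters (e.g. initials) are ignored; this
    threshold is an assumption, not taken from the slide.
    """
    local = email.split("@", 1)[0].lower()
    if local in GENERIC_LOCAL_PARTS:
        return False
    tokens = [t for t in re.split(r"[^a-z]+", name.lower()) if len(t) >= 3]
    return any(t in local for t in tokens)


def is_true_positive(rec_a, rec_b):
    """Two records match if the email hashes agree and both names fit the email."""
    return (
        email_hash(rec_a["email"]) == email_hash(rec_b["email"])
        and email_resembles_name(rec_a["email"], rec_a["name"])
        and email_resembles_name(rec_b["email"], rec_b["name"])
    )
```

For example, records for "Jane Smith" and "Smith, J." that share the address jane.smith@example.org (in any letter case) would count as a True Positive, while two records sharing info@example.org would not. As the slide stresses, this rule only serves to build the test set.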


Profile Exchange Group Progress


• Mike Taylor is leading the R&D work in this space
  • Research Specialist at Elsevier Labs
  • Was responsible for ORCID Alpha Scopus integration
• Workspace created for Profile Exchange R&D
  • Elsevier has provided a server to host data
  • Access Innovations / Data Harmony donated an instance of their XIS XML document repository
• Datasets
  • Focus on High Energy Physics and Computer Science
  • Data contributed from Elsevier (Scopus) and IEEE
• DTD
  • Published DTD to ORCID working group wiki
  • Normalizing data onto common DTD to ensure like-to-like comparisons
• Current Activities
  • Research on datasets to develop clustering algorithms and relationships