Integrating the DOI with Intra-organization Legacy Systems


WWW8 Conference - DOI Workshop

Toronto, May 11, 1999

Andy Stevens

John Wiley & Sons, Inc., New York

stevens@wileynpt.com

Topics for discussion

• Business and technical reasons for using DOIs

• Choosing numbering schemes, granularity, bookkeeping

• Technical implementation issues

Wiley as an example

• Wiley is a medium-size independent publisher with annual revenue around US$450 million

• We mostly publish technical and scientific books and journals:

– 1000 new book titles per year

– 400 academic journals

– various Web sites, CD-ROMs, software

Business reasons for using DOIs

• Need a way to keep track of e-commerce. How can you track sales for an online item unless you have a way to track the item itself? Similar reasoning applies for tracking rights and royalties.

• Facilitates rebundling of online content

• The DOI cuts across product lines. For Wiley, this means journals, books, Internet-only content.

Technical reasons for using DOIs

• Persistence of URLs is a problem. For example, as our online journals mature, they get shuffled from system to system.

• We need a way to do bookkeeping on our own content

• We need a way to link to other people’s content (central metadata database)

Applying DOIs to Wiley journals

• We initially chose to concentrate on our journal content due to the well-regimented production process and also the existence of our online journals project (Wiley InterScience)

• Our journals were already assigning identifiers to content (SICI numbers)

Applying DOIs to Wiley books

• More difficult because the book production process is much more varied within the company: many different platforms (Quark, FrameMaker, etc.), many different vendors and compositors

• Not much of an online presence for book content (yet), though this is changing (e.g. sample chapters for Amazon)

Choosing a numbering scheme

• Q: What’s a SICI number?

• A: a serial item and contribution identifier, of course! NISO Z39.56-1996

1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J

– compact identifier containing ISSN, publication date, volume, issue, page, component, format, check digit
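As a rough illustration of how the example above breaks down into the fields listed, here is a minimal Python sketch. The regular expression is fitted to this one example rather than to the full Z39.56 grammar, and the group names (including the split of the trailing "3.3.CO;2-J" segment into control codes, format, version, and check character) are assumptions made for illustration.

```python
import re

# Group names follow the slide's field list. The layout assumed for the
# trailing control segment ("3.3.CO;2-J") is an illustration only.
SICI_PATTERN = re.compile(
    r"^(?P<issn>\d{4}-\d{3}[\dX])"        # ISSN
    r"\((?P<date>\d+)\)"                  # publication date, e.g. 199612
    r"(?P<volume>\d+):(?P<issue>\d+)"     # volume:issue
    r"<(?P<page>\d+):(?P<title_code>[^:>]*):(?P<component>[^>]*)>"
    r"(?P<control>\d\.\d)\."              # structure and part codes
    r"(?P<format>[A-Z]{2});"              # format code
    r"(?P<version>\d)-(?P<check>[0-9A-Z#])$"
)


def parse_sici(sici: str) -> dict:
    """Split a SICI shaped like the slide's example into named fields."""
    match = SICI_PATTERN.match(sici)
    if match is None:
        raise ValueError(f"not a recognised SICI: {sici!r}")
    return match.groupdict()


fields = parse_sici("1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J")
print(fields["issn"], fields["date"], fields["volume"], fields["issue"], fields["page"])
```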

Choosing a numbering scheme (cont.)

• We are assigning SICI numbers to all of our abstracts, articles, and issues anyway, so we might as well just use them for DOIs:

10.1002/(SICI)1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J

• For the journals themselves, just use ISSN:

10.1002/(ISSN)1097-4571

• Don’t forget: the DOI is a dumb number!
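The scheme on this slide amounts to simple string construction, which might be sketched as follows. The function names are hypothetical; only the prefix 10.1002 and the "(SICI)" and "(ISSN)" markers come from the examples above. In keeping with the "dumb number" rule, `split_doi` separates only prefix from suffix and treats the suffix as opaque.

```python
WILEY_PREFIX = "10.1002"  # prefix used in the slide's examples


def doi_from_sici(sici: str, prefix: str = WILEY_PREFIX) -> str:
    """DOI for an abstract, article, or issue that already has a SICI."""
    return f"{prefix}/(SICI){sici}"


def doi_from_issn(issn: str, prefix: str = WILEY_PREFIX) -> str:
    """DOI for a journal as a whole, built from its ISSN."""
    return f"{prefix}/(ISSN){issn}"


def split_doi(doi: str) -> tuple:
    """Split at the first slash only; the suffix is not interpreted."""
    prefix, suffix = doi.split("/", 1)
    return prefix, suffix


article_doi = doi_from_sici("1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J")
journal_doi = doi_from_issn("1097-4571")
print(article_doi)
print(journal_doi)
```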

Choosing a numbering scheme (cont.)

• For books, use BICI number or some similar scheme (undecided as of yet)

• Will probably be finalized soon as encyclopedias come online

Granularity

• For our journals, we chose abstract, article, issue (TOC), journal. No need to get more specific, as there is currently no business need to identify down to the paragraph or diagram level

• For books, we will automatically assign DOIs down to the chapter level, and then have a manual procedure for creating “finer” DOIs as needed

The need for a new system

• Given that legacy systems generally do not cross product lines (e.g. books, journals, and rights are all produced/accounted in separate systems), we decided to create a new system to maintain DOIs.

• The “Wiley DOI Server” has open network-based APIs (via HTTP or FTP) for integration with external systems, old or new.

Wiley DOI Server

• What is it?

– Initially, a Sun Sparc 5 with mSQL database on my desktop (Summer ’97)

– Currently, a dedicated Sun Sparc E450 with Sybase and Verity (Fall ’98)

– Eventually, just another application on a big Sun server in our IT department

Resolution of Wiley DOIs

[Diagram] Components: CNRI Directory (dx.doi.org), Wiley DOI Server (doi.wileynpt.com), Content Web Server

1. Request Wiley DOI

2. Redirect to URL for response page on Wiley DOI Server

3. Request DOI response page for requested Wiley DOI

4. Link user to content (if available)

5. Request content associated with requested DOI

6. Return content
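One practical detail of the first step: SICI-based DOIs contain characters such as "<", ">", and ";" that are not safe in a URL, so a client would probably need to percent-encode the DOI before requesting it from the directory. A minimal sketch, assuming the directory accepts the DOI as the URL path (the `resolver_url` helper is hypothetical):

```python
from urllib.parse import quote, unquote

RESOLVER = "http://dx.doi.org/"  # directory host named in the diagram


def resolver_url(doi: str) -> str:
    """Build a request URL for step 1, percent-encoding unsafe characters.

    The slash between DOI prefix and suffix is kept as-is.
    """
    return RESOLVER + quote(doi, safe="/")


doi = "10.1002/(SICI)1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J"
url = resolver_url(doi)
print(url)

# Decoding the path gives back the original DOI unchanged.
assert unquote(url[len(RESOLVER):]) == doi
```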

Functions of Wiley DOI Server

• Provides response page for DOI requests

– from user perspective: interruption

– from publisher perspective: opportunity to do authentication and branding, and possibility to provide additional services (make the interruption a feature, not a bug).

• Provides FTP site for new DOIs as they are created by other Wiley systems. Current delivery format: XML files
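The slides do not show the XML delivery format, so the element and attribute names below (`dois`, `doi`, `id`, `url`) and the example.com URLs are invented purely for illustration. The sketch shows how a receiving system might read such a file, and that the "<" and ">" in SICI-based DOIs have to be escaped inside XML.

```python
import xml.etree.ElementTree as ET

# Hypothetical file layout; not the actual Wiley format.
SAMPLE = """<dois>
  <doi id="10.1002/(ISSN)1097-4571">
    <url>http://example.com/journal/1097-4571</url>
  </doi>
  <doi id="10.1002/(SICI)1097-4571(199612)47:12&lt;896::AID-ASI3&gt;3.3.CO;2-J">
    <url>http://example.com/abstract/896</url>
  </doi>
</dois>"""


def read_new_dois(xml_text: str) -> dict:
    """Return a {DOI: URL} mapping from one delivery file."""
    root = ET.fromstring(xml_text)
    return {el.get("id"): el.findtext("url") for el in root.iter("doi")}


new_dois = read_new_dois(SAMPLE)
for doi_id, target in new_dois.items():
    print(doi_id, "->", target)
```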

Functions of Wiley DOI Server (cont.)

• Provides place to do maintenance

– ping test to check for dead URLs

– register new DOIs in central directory

• Stores metadata about DOIs to allow reverse DOI lookup. Hot Topic!!

– Currently, we use a custom metadata field set geared to Wiley journals

– defined API for batch lookups (similar to PubMed)
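The ping test listed under maintenance could look something like the sketch below. It is not the server's actual implementation: the function names are hypothetical, a HEAD request is assumed to be enough to tell a live URL from a dead one, and the `opener` parameter exists only so the check can be exercised without network access.

```python
import urllib.error
import urllib.request


def is_alive(url: str, opener=urllib.request.urlopen, timeout: float = 10) -> bool:
    """Return True if a HEAD request to the URL succeeds."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 400
    except (urllib.error.URLError, OSError):
        # HTTP errors, DNS failures, refused connections, and timeouts
        # all count as a dead link here.
        return False


def find_dead_dois(doi_to_url: dict, opener=urllib.request.urlopen) -> list:
    """List the DOIs whose registered URL no longer responds."""
    return [doi for doi, url in doi_to_url.items() if not is_alive(url, opener)]
```

A nightly job could run `find_dead_dois` over the stored DOI-to-URL mapping and report the result for manual follow-up.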

The need for a metadata database

• For the DOI to be useful as a linking mechanism, we need to be able to look up DOIs based on item metadata (just as we would look up an ISBN for a given book title).

• Several initiatives underway to develop technical infrastructure and to decide field set(s).
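To make the idea of reverse lookup concrete, here is a small in-memory sketch. The field set (ISSN, volume, issue, first page) and the class name are assumptions for illustration; the field sets under discussion in the initiatives above are not specified in these slides.

```python
from typing import Optional


class MetadataIndex:
    """Toy reverse index: citation metadata -> DOI."""

    def __init__(self):
        self._by_key = {}

    def add(self, doi: str, issn: str, volume: int, issue: int, page: int) -> None:
        self._by_key[(issn, volume, issue, page)] = doi

    def lookup(self, issn: str, volume: int, issue: int, page: int) -> Optional[str]:
        """Return the DOI for one item, or None if it is unknown."""
        return self._by_key.get((issn, volume, issue, page))

    def lookup_batch(self, queries) -> list:
        """Resolve many (issn, volume, issue, page) tuples in one call."""
        return [self.lookup(*query) for query in queries]


index = MetadataIndex()
index.add(
    "10.1002/(SICI)1097-4571(199612)47:12<896::AID-ASI3>3.3.CO;2-J",
    issn="1097-4571", volume=47, issue=12, page=896,
)
results = index.lookup_batch([("1097-4571", 47, 12, 896), ("1097-4571", 47, 12, 1)])
print(results)
```

An exact-match key like this only works when both sides agree on the field values, which is one reason the choice of a shared field set matters.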

Ongoing issues

• Support from management is essential

• Authentication: most DOI-assigned content is controlled by some kind of online authentication system. How do we gracefully deal with multiple logins without burdening the user too much?

• Metadata and links are valuable. How do we police them?

Ongoing issues (cont.)

• Still need to implement some features of Wiley DOI server:

– automatic registration of new DOIs into central directory

– dual ISSNs for journals (print and electronic) can confuse searching

– foreign characters currently confuse searching
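One possible way to attack the last two items is to normalize search keys before comparing them. The sketch below is an assumption about how that might be done, not a description of the Wiley DOI Server: accented characters are folded to their base letters, and print ISSNs are mapped to a single canonical ISSN. The print ISSN shown is a made-up placeholder.

```python
import unicodedata


def fold(text: str) -> str:
    """Lowercase the text and strip combining accents for matching.

    Letters without a decomposition (for example "ø") pass through
    unchanged and would need an explicit mapping.
    """
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    return stripped.casefold()


# Placeholder print ISSN mapped to the electronic ISSN from the slides.
ISSN_ALIASES = {"1234-5678": "1097-4571"}


def canonical_issn(issn: str) -> str:
    """Map either ISSN of a journal to one canonical value."""
    return ISSN_ALIASES.get(issn, issn)


print(fold("Müller"), canonical_issn("1234-5678"))
```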

Summary

• For the DOI to succeed within a company, the business side needs to recognize that it is useful and needed.

• Coexistence with legacy systems is possible in a networked environment.

• Many cross-linking opportunities exist, but depend on a good metadata database.
