NISO/NFAIS Joint Virtual Conference: Connecting the Library to the Wider World: Successful...


Smart Content applications at Elsevier. Michael Lauruhn, Disruptive Technology Director, Elsevier


Michael Lauruhn

December 3, 2014

Smart Content applications at Elsevier

NISO/NFAIS Virtual Conference: Connecting the Library to the Wider World - Successful Applications of Linked Data


Introduction & Agenda

Smart Content & Linked Data at Elsevier

• Background
• Key Components of Smart Content
• Current Examples
• Project Planning Considerations


Introduction: Smart Content & Linked Data

Elsevier Content

Componentized text

Data

Multimedia

3rd Party Linked data

Web Open data

Vocabulary


Smart Content infrastructure in practice

Trial: NCT00623103

Serious Adverse events:

Atrial fibrillation

med:drugs Rivastigmine

Elsevier

Delirium treatment: An unmet challenge

Rivastigmine, a cholinesterase inhibitor, has been used to treat delirium in elderly patients with stroke. 1 A biologically plausible premise—that impaired cholinergic transmission might either cause or worsen delirium—led to a randomised, placebo-controlled, double-blind trial by Maarten van Eijk and colleagues 2 in The Lancet in which they added rivastigmine or placebo to usual treatment of patients in intensive care. The trial was halted at 104 patients by the drug safety and monitoring board (DSMB) because of increased mortality (12/54 in the rivastigmine group, 4/50 in the placebo group; p=0·07) and a worse outcome. The rivastigmine group …

foaf:page

owl:sameAs

owl:sameAs
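The diagram on this slide connects a clinical trial record, a drug concept, and an Elsevier article through shared identifiers. A minimal sketch of that idea follows, using plain Python tuples rather than an RDF library. Every identifier, and the `ex:`, `trial:`, `thirdparty:`, `opendata:` and `article:` prefixes, are hypothetical placeholders; only `owl:sameAs`, `foaf:page` and `med:drugs` come from the slide.

```python
# Illustrative triples only: every subject/object identifier is made up.
TRIPLES = [
    ("trial:NCT00623103", "ex:seriousAdverseEvent", "ex:AtrialFibrillation"),
    ("trial:NCT00623103", "ex:studiesDrug", "med:drugs/Rivastigmine"),
    ("med:drugs/Rivastigmine", "owl:sameAs", "thirdparty:Rivastigmine"),
    ("thirdparty:Rivastigmine", "owl:sameAs", "opendata:Rivastigmine"),
    ("med:drugs/Rivastigmine", "foaf:page",
     "article:delirium-treatment-an-unmet-challenge"),
]


def same_as_closure(node, triples):
    """Return every identifier reachable from node via owl:sameAs (either direction)."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in triples:
            if p != "owl:sameAs":
                continue
            # owl:sameAs is symmetric, so follow the link both ways.
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen


def pages_for(node, triples):
    """Collect foaf:page targets for node and all of its owl:sameAs equivalents."""
    equivalents = same_as_closure(node, triples)
    return sorted(o for s, p, o in triples
                  if p == "foaf:page" and s in equivalents)
```

Starting from the third-party or open-data identifier for the drug, `pages_for` walks the `owl:sameAs` links back to the Elsevier concept and returns the article attached to it.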


Smart Content as Infrastructure

Product Development & Enhancement
• More accurate search results
• Faceted navigation
• Improved content discoverability

Content Analytics
• New insights and abilities to take inventory about what we publish
• Identification of co-occurring terms
• Link to related external content & data

Personalization
• Individual content recommendations
• Targeted individual marketing

Editorial Productivity
• Flexible product types – new collections, image banks, etc.
• Increased speed to market
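As one illustration of the content analytics bullet on co-occurring terms, the sketch below counts how often pairs of annotated terms appear in the same document. The slide does not describe how Elsevier computes this; the function name and the toy documents are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(doc_terms):
    """Count, for each unordered pair of terms, the documents containing both."""
    counts = Counter()
    for terms in doc_terms:
        # De-duplicate and sort so each pair has one canonical key per document.
        for pair in combinations(sorted(set(terms)), 2):
            counts[pair] += 1
    return counts


# Toy input: each inner list stands for the terms annotated on one document.
DOCS = [
    ["delirium", "rivastigmine", "stroke"],
    ["delirium", "rivastigmine"],
    ["stroke", "atrial fibrillation"],
]

COUNTS = cooccurrence_counts(DOCS)
```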

Key Components of Smart Content


Vocabulary Example: EMMeT

EMMeT

Sources:
• UMLS
• SNOMED
• ICD9
• ICD10
• MeSH
• LOINC
• Gold Standard (Drugs)
• Elsevier Custom Resources

Multi-language taxonomy:
• >1 million concepts
• >3 million synonyms

Classes include:
• Anatomy
• Diseases
• Drugs
• Symptoms
• Procedures

Sourced from several standardized vocabularies


Medical Name: Malignant Neoplasm of the Breast

Consumer Friendly Name: Breast Cancer

Synonyms: Malignant Tumor of Breast; Malignant Breast Neoplasm; Breast Ca

Codes: ICD9 – 174.9; MeSH – D001943; SNOMED-CT – 190121004

Semantic Type/Group: Neoplastic Process/Disease

• Breast Disorders
• Cancer of the Thorax
• Mammary Neoplasms
• More…

• Breast Sarcoma
• Familial Breast Cancer
• Malignant lymphoma of the Breast
• Malignant Neoplasm of the breast outer quadrant
• More…

Semantic Relationships

Symptoms: Breast Lump, Nipple Retraction, …
Diagnostic Procedures: Mammography, Breast Biopsy, …
Treatment Procedures: Chemotherapy, Mastectomy, …
Medications: Tamoxifen, Doxorubicin, …
Risk Factors: Family History, Genetics, Predisposition, …
Prevention: Screening, Preemptive Mastectomy, …
Complications: Metastatic Cancer, …
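The concept record on this slide can be pictured as a small data structure. The sketch below is only an illustration: the `Concept` class, its field names and `build_label_index` are hypothetical and do not reflect EMMeT's actual data model; the values are the ones shown on the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """Hypothetical container for one vocabulary concept."""
    medical_name: str
    consumer_name: str
    synonyms: list = field(default_factory=list)
    codes: dict = field(default_factory=dict)        # source vocabulary -> code
    semantic_type: str = ""
    relations: dict = field(default_factory=dict)    # relationship -> related terms

    def labels(self):
        """All names under which this concept may appear in text."""
        return [self.medical_name, self.consumer_name, *self.synonyms]


breast_cancer = Concept(
    medical_name="Malignant Neoplasm of the Breast",
    consumer_name="Breast Cancer",
    synonyms=["Malignant Tumor of Breast", "Malignant Breast Neoplasm", "Breast Ca"],
    codes={"ICD9": "174.9", "MeSH": "D001943", "SNOMED-CT": "190121004"},
    semantic_type="Neoplastic Process/Disease",
    relations={
        "Symptoms": ["Breast Lump", "Nipple Retraction"],
        "Medications": ["Tamoxifen", "Doxorubicin"],
    },
)


def build_label_index(concepts):
    """Map every case-folded label to its concept so synonyms resolve to one record."""
    index = {}
    for concept in concepts:
        for label in concept.labels():
            index[label.casefold()] = concept
    return index


INDEX = build_label_index([breast_cancer])
```

With such an index, a search for the consumer-friendly name, the medical name, or any synonym resolves to the same record and therefore to the same codes and relationships.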


Vocabulary Example: EMMeT

EMMeT

Sources: UMLS, SNOMED, ICD9, ICD10, MeSH, LOINC, Gold Standard (Drugs), Elsevier Custom Resources

FrEMMeT

SpEMMeT


Linked Data Repository


• Knowledgebase of semantic data
• Large scale integration of related sources of medical and scientific content and data
• High performance service layer APIs for integration into end-user products and internal platforms

Editorial & Author Keywords

Classic subject metadata

Componentized text

Robust Data models

Entity extraction

Linked Data Environment

Full-text Indexing

Semantic Annotations
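Entity extraction and semantic annotation, as listed above, can be pictured as matching vocabulary labels in text and recording which concept each match refers to. The sketch below is a simple dictionary-based matcher, offered only as an illustration; the slides do not say which extraction technique Elsevier uses, and the `VOCAB` entries and concept IDs are made up.

```python
import re

# Hypothetical label -> concept ID mapping; keys are lower-case.
VOCAB = {
    "breast cancer": "concept:0001",
    "malignant neoplasm of the breast": "concept:0001",
    "tamoxifen": "concept:0002",
}


def annotate(text, vocab):
    """Return (start, end, matched_text, concept_id) for each vocabulary label found."""
    # Try longer labels first so the most specific label wins at a given position.
    labels = sorted(vocab, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    return [
        (m.start(), m.end(), m.group(0), vocab[m.group(0).lower()])
        for m in pattern.finditer(text)
    ]


ANNOTATIONS = annotate("Tamoxifen is used to treat breast cancer.", VOCAB)
```

Each annotation keeps character offsets, so it can be stored alongside the componentized text rather than inside it.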

Current Examples


ClinicalKey search


SciVal Funders Vocabulary

• Supports the FundRef initiative, facilitated by CrossRef, to provide a standard way of reporting funding sources for published scholarly research.
• SciVal Funding is an online solution that provides targeted recommendations on grants, making it easier for researchers to discover funding opportunities related to their area of research.


Similar Methods for Neuroscience

• System to extract and index the Methods sections of articles from 100 Elsevier neuroscience journals
• Built a comparison and recommendation system so readers can find and evaluate articles with “Similar Methods” to the ones presented in the current article


Similar Methods for Neuroscience

The search process targets these factors:
• what brain regions are being studied
• what organism is being used
• what methodologies are being employed
• what disease model is the focus of the study
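One way to picture a comparison over these four factors is to score the overlap between two articles facet by facet. The sketch below uses Jaccard overlap averaged across facets purely as a stand-in; the slides do not state how the Similar Methods system computes similarity, and the facet keys and sample articles are hypothetical.

```python
FACETS = ("brain_regions", "organisms", "methodologies", "disease_models")


def jaccard(a, b):
    """Size of the intersection divided by size of the union (0.0 if both are empty)."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def methods_similarity(article_a, article_b):
    """Average the Jaccard overlap across the four facets."""
    total = sum(jaccard(article_a.get(f, ()), article_b.get(f, ())) for f in FACETS)
    return total / len(FACETS)


def most_similar(current, candidates, top_n=3):
    """Rank candidate articles by similarity to the current one, best first."""
    ranked = sorted(candidates,
                    key=lambda c: methods_similarity(current, c),
                    reverse=True)
    return ranked[:top_n]


# Made-up facet values extracted from three Methods sections.
CURRENT = {"id": "A", "brain_regions": {"hippocampus"}, "organisms": {"mouse"},
           "methodologies": {"patch clamp"}, "disease_models": {"epilepsy"}}
ARTICLE_B = {"id": "B", "brain_regions": {"hippocampus"}, "organisms": {"mouse"},
             "methodologies": {"fMRI"}, "disease_models": {"stroke"}}
ARTICLE_C = {"id": "C", "brain_regions": {"cerebellum"}, "organisms": {"rat"},
             "methodologies": {"EEG"}, "disease_models": {"ataxia"}}
```

An unweighted average treats all four facets as equally important; a real system would likely tune this against reader feedback.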


Leveraging Wikipedia for Neuroscience

• Pilot project that identifies concepts from a Neuroscience topics vocabulary
• Provides Wikipedia definitions to add context around the article’s significant concepts

Additional context for Energy terms


• A ‘dictionary app’ using portions of the Encyclopedia of Energy (1818 terms)
• Available for articles from Applied Energy and Energy Conversion and Management; additional pilots planned.

Example: http://www.sciencedirect.com/science/article/pii/S0306261913001888

Terms from the dictionary are highlighted in the article. When the reader clicks on a term, its definition from the dictionary is shown in the feature (right-hand pane).
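The highlighting behaviour described above can be sketched as wrapping each dictionary term found in the article text in a marker that a page script could react to. This is an assumption-laden illustration, not the ScienceDirect implementation: the `DEFINITIONS` entries are placeholders, and the `term` class and `data-term` attribute are invented for the example.

```python
import html
import re

# Placeholder entries; the real dictionary is drawn from the Encyclopedia of Energy.
DEFINITIONS = {
    "heat exchanger": "placeholder definition text",
    "biomass": "placeholder definition text",
}


def highlight_terms(text, definitions):
    """Return HTML in which each dictionary term is wrapped in a <span>."""
    labels = sorted(definitions, key=len, reverse=True)  # longest label first
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(label) for label in labels) + r")\b",
        re.IGNORECASE,
    )
    parts = []
    last = 0
    for match in pattern.finditer(text):
        parts.append(html.escape(text[last:match.start()]))
        key = match.group(1).lower()  # lookup key for the definition pane
        parts.append('<span class="term" data-term="{}">{}</span>'.format(
            html.escape(key, quote=True), html.escape(match.group(1))))
        last = match.end()
    parts.append(html.escape(text[last:]))
    return "".join(parts)
```

A click handler could then read `data-term` and look up `DEFINITIONS[key]` to fill the right-hand pane.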

Project Planning Considerations


Get to a Use Case early

• Get stakeholders invested
• Think about what users currently do… and what they can do better
• Focus on use cases to stay centered and to identify priorities in decision making

Particularly helpful when introducing a new infrastructure to an organization


Quality & Reliability of Resources

• Integration with third-party content, data models and vocabularies requires a vetting process:
    Are they accurate?
    Are they trustworthy?
    Are they current?
    Are they sustainable?

Warning: Some of the more attractive resources on the web are one-off projects that are no longer maintained


Ongoing Maintenance

• As knowledge models and vocabularies grow, resources are needed to keep them current
• Governance policy should account for sources of new concepts, terminology and relations:
    New content types
    Search logs
    New trends & discoveries

These require resources (people’s time) that need to be factored into the total cost of ownership


Quality & Testing

• Applying semantic web technologies to applications is not exclusively an IT solution:
    Sponsors, stakeholders and subject experts need to contribute to and shape the vocabularies and the application functionality
    The fine-tuning for some of these applications can be surprisingly manual
    It’s important not to get distracted by the outliers and corner cases

Installing and implementing these technologies out of the box (OOTB) is getting easier… Quality is where it gets hard


Quality & Testing

• Test sets are essential
    Real content
    Real use cases
    Scores that show accuracy and measure improvement

Our SMEs: Our Application:
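Scores that show accuracy and measure improvement are commonly computed by comparing the application's output against annotations made by subject matter experts. The sketch below computes precision, recall and F1 for one document. The slides do not name the metrics Elsevier uses, so this choice and the function name `score` are assumptions.

```python
def score(predicted, gold):
    """Compare predicted concept IDs with expert (gold) concept IDs for one document."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    # Precision: share of predictions the experts agree with.
    precision = true_positives / len(predicted) if predicted else 0.0
    # Recall: share of expert annotations the application found.
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Made-up concept IDs: the application found four, the experts marked three.
RESULT = score({"c1", "c2", "c3", "c4"}, {"c1", "c2", "c5"})
```

Re-running the same test set after each vocabulary or tuning change shows whether the scores actually improve.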


Other lessons & observations

• Don’t forget to look at opportunities for internal applications
    Consider internal workflows
    Look for efficiency enhancements
    Look for discovery opportunities
• Start small
    Get some early proofs of concept that you can share with stakeholders before tackling bigger challenges

Michael Lauruhn

m.lauruhn@elsevier.com

@MikeLauruhn

@ElsevierLabs

Thank You
