Encyclopedia of Life Redefining Publication and Access to the Primary Literature David P. Shorthouse Marine Biological Laboratory Woods Hole, MA

Page 1: Shorthouse

Encyclopedia of Life

Redefining Publication and Access to the Primary Literature

David P. Shorthouse

Marine Biological Laboratory

Woods Hole, MA

Page 2: Shorthouse

Imagine an electronic page for each species of organism on Earth, available everywhere by single access on command. The page contains the scientific name of the species, a pictorial or genomic presentation of the primary type specimen on which its name is based, and a summary of its diagnostic traits. The page opens out directly or by linkage with other databases such as ARKive, Ecoport, and GenBank. It comprises a summary of everything known about the species’ genome, proteome, geographic distribution, phylogenetic position, habitat, ecological relationships, and, not least, its perceived practical importance for humanity.

Page 3: Shorthouse

Steering Committee

Biodiversity Heritage Library

Executive Smithsonian

Marine Biological Laboratory

Biodiversity Informatics

Atlas of Living Australia

Missouri Botanical Garden: Plants

Harvard University: Education and outreach

Field Museum: Research Community

MacArthur Foundation

Sloan Foundation

Page 4: Shorthouse

Not the first time…

Tree of Life

Catalogue of Life

SpeciesBase

Discover Life

Page 5: Shorthouse

What makes this distinct…

Grandeur of the vision

Taxonomically intelligent, names-based cyberinfrastructure

Aggregation of content

Participatory

Open source content and software

…not just a website

Page 6: Shorthouse

David “Paddy” Patterson

Peter Mangiafico

Patrick Leary

David Shorthouse

Kristen Lans

Pam Fournier

Alexey Shipunov

Vitthal Kudal

Jeremy Rice

Dmitry Mozzherin

Anne Thessen

Page 7: Shorthouse

Exemplar Pages

Anne Pringle • Brian Farrell • Alta Buden • Margaret Thayer • Michael Ashburner • Christy Geraci • Lilibeth Miranda • Senjie Lin

Rick Wilkerson

Jonathan Losos • David Langor • David Shorthouse • Mary Hennen • Judy Stoffer • George Yatskievych • Kendra Buresch

Tonia Hsieh • David Patterson • Christian Thompson • Rod Eastwood • Jerry Louton • Seth Bordenstein

Rich Pyle • Roger Hanlon • Tamara Clark • Wendy Applequist • Grace Servat • Bob Magill • Sandy Knapp • Vicki Funk

Page 8: Shorthouse

Exemplar Process

Current Rate ≈ 100 pages / year

Page 10: Shorthouse

Biodiversity Heritage Library

Missouri Botanical Garden

New York Botanical Garden

Royal Botanic Gardens, Kew

Field Museum

Natural History Museum (London)

Smithsonian Institution

American Museum of Natural History

Botany Libraries, Harvard University

Ernst Mayr Library of the Museum of Comparative Zoology, Harvard University

Marine Biological Laboratory / Woods Hole Oceanographic Institution Library (MBL/WHOI)

Page 11: Shorthouse

WHAT?

Digitize the core literature of biodiversity. Full works, not bits & pieces.

Open Access: all content can be repurposed, reused, reformatted.

Congruent: must fit into a dynamic knowledge ecology.

Page 12: Shorthouse

BHL Status

9.2M pages

Challenges: metadata extraction & search trajectories
– Penn State collaborations
– Honing name-matching algorithms, natural language processing

Page 13: Shorthouse

Names are Messy

Aa paleacea

Limulus polyphemus

Kiwa hirsuta

Osedax frankpressi

Kingia australis

Pieris japonica

Pieris rapae

Trypanosoma brucei

Homo sapiens

Page 14: Shorthouse

More than One Meaning (Polysemes)

Aotus trivirgatus

Aotus Illiger 1811

Aotus

Aotus Smith 1805

Aotus ericoides

Resolve with intelligent disambiguation: authority, species, contextual data

Contextual data: primate, monkey, eyes, food, Panama, Aotus nancymaae

Contextual data: legume, plant, flower, Mirbelieae, Australia, Aotus mollis

Anorexia nervosa

Habeas corpus

Etcetera etcetera
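
The disambiguation idea on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not EOL's actual algorithm: the `SENSES` table, the choice of cue words (taken from the slide), and the `disambiguate` function are all hypothetical. Each candidate sense of a homonym is scored by how many of its contextual cue words occur in the surrounding text.

```python
import re

# Hypothetical lookup table: each sense of the homonym "Aotus" with its
# authority and contextual cue words taken from the slide.
SENSES = {
    "Aotus": [
        {"authority": "Illiger 1811", "group": "primate",
         "cues": {"primate", "monkey", "eyes", "food", "panama",
                  "nancymaae", "trivirgatus"}},
        {"authority": "Smith 1805", "group": "legume",
         "cues": {"legume", "plant", "flower", "australia",
                  "mollis", "ericoides"}},
    ]
}


def disambiguate(name, context):
    """Return the sense of `name` whose cue words best overlap `context`.

    Returns None when the name is unknown or no cue word occurs at all.
    """
    words = set(re.findall(r"[a-z]+", context.lower()))
    best, best_score = None, 0
    for sense in SENSES.get(name, []):
        score = len(sense["cues"] & words)
        if score > best_score:  # ties keep the earlier sense
            best, best_score = sense, score
    return best
```

For example, `disambiguate("Aotus", "The night monkey Aotus nancymaae has large eyes")` selects the primate sense, while a context mentioning a plant from Australia selects the legume sense. A real system would also need to weigh authority strings and reject Latin-looking non-names.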

Page 15: Shorthouse

Many names for one species…

Koko

Горилла

Guerilla

Eastern Lowland Gorilla

Gorilla graueri

Gorilla berengei

Gorilla beringei Matschie

Gorilla beringei mikenensis

King kong

Gorilla gorilla

Virunga

Gorila

Gorille

Mountain gorilla

大猩猩

ゴリラ
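
One way to picture the reconciliation problem is a lookup that maps name strings, including misspellings, onto a single accepted name. The sketch below is a toy example and not the EOL names infrastructure: the `NAME_INDEX` contents, the `reconcile` function, and the similarity cutoff are assumptions chosen for illustration. It uses `difflib.get_close_matches` from the Python standard library for the fuzzy step.

```python
import difflib

# Hypothetical index from normalized name strings to one accepted name.
NAME_INDEX = {
    "gorilla beringei": "Gorilla beringei",
    "mountain gorilla": "Gorilla beringei",
}


def reconcile(raw, cutoff=0.85):
    """Map a raw name string to an accepted name, or None if nothing fits.

    Exact matches on the normalized string are tried first; otherwise the
    closest indexed string above `cutoff` similarity is used.
    """
    key = " ".join(raw.lower().split())  # case-fold and collapse whitespace
    if key in NAME_INDEX:
        return NAME_INDEX[key]
    close = difflib.get_close_matches(key, list(NAME_INDEX), n=1, cutoff=cutoff)
    return NAME_INDEX[close[0]] if close else None
```

With this index, `reconcile("Gorilla berengei")` resolves the misspelling to `Gorilla beringei`, whereas `reconcile("King kong")` returns `None`. Vernacular names in other languages would need their own index entries.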

Page 16: Shorthouse

EOL the Aggregator

“Content partner” schema
– Media elements, species profile model
– Attribution, licensing

Pyle, R. L., J. L. Earle and B. D. Greene. 2008. Five new species of the damselfish genus Chromis (Perciformes: Labroidei: Pomacentridae) from deep coral reefs in the tropical western Pacific. Zootaxa. 1671: 3–31.

Page 17: Shorthouse

Licensing Policy for Content Partners

To the greatest extent possible, the Encyclopedia of Life promotes an open-source, open-access approach.

All information currently in the public domain will remain in the public domain.

Content providers are required to adopt a Creative Commons license for the information that they serve through the EOL. Except for public-domain content, the default and preferred license is CC-BY.

Content providers who request some restrictions on re-use of their information may select: CC-BY-SA, CC-BY-NC, CC-BY-NC-SA.

The EOL will provide attribution information for all content that it serves. EOL will also indicate the Creative Commons license attached to each object (text, structured data, graphics, multimedia, etc.).

V5.0 5 April 2008
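
As an illustration of how an aggregator might apply this policy when ingesting partner content, the sketch below checks each object's license against the licenses named in the policy. The function name, the `"public-domain"` label, the dictionary-based object format, and the fallback to CC-BY when no license is supplied are all assumptions; none of them comes from an EOL schema.

```python
# Licenses named in the policy; CC-BY is the default and preferred one.
DEFAULT_LICENSE = "CC-BY"
ALLOWED_LICENSES = {"CC-BY", "CC-BY-SA", "CC-BY-NC", "CC-BY-NC-SA"}
PUBLIC_DOMAIN = "public-domain"  # hypothetical label for public-domain content


def resolve_license(obj):
    """Return the license to display for a content object (a dict).

    Falls back to the default license when none is given (an assumption)
    and raises ValueError for anything the policy does not list.
    """
    license_ = obj.get("license") or DEFAULT_LICENSE
    if license_ == PUBLIC_DOMAIN or license_ in ALLOWED_LICENSES:
        return license_
    raise ValueError(f"license not permitted by policy: {license_!r}")
```

Calling `resolve_license({"license": "CC-BY-NC"})` returns the license unchanged, while a license outside the list, such as a no-derivatives variant, raises `ValueError`.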

Page 18: Shorthouse

EOL the Enabler

Curate species page

Partial funds for post-doctoral positions

Page 19: Shorthouse

Cybertaxonomy Drivers

Pool of active taxonomists is evaporating

Shift to online workflow must:
– Be meaningful (foster engagement with organisms)
– Attract funding
– Provide personal and institutional visibility
– Be scholarly (e.g. citation metrics)
– Be simple and task-oriented
– Federate workloads

Hine, Christine. 2008. Systematics as Cyberscience: Computers, Change, and Continuity in Science. MIT Press. Cambridge, Massachusetts. 307pp.

Page 20: Shorthouse

Nearctic Spider Database: Can This be a Template?

Meaningful

Provides personal & institutional visibility

Simple, task-oriented

Shares the workload

Page 21: Shorthouse

What’s in My Backyard?

<?xml version="1.0" encoding="windows-1252"?>
<!--Zoom Search Engine Version 5.0 (1002) PRO-->
<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/" xmlns:zoom="http://www.wrensoft/zoom/response/5.0/schema/">
<channel>
<title>Nearctic Spider Database</title>
<description>Search species pages in The Nearctic Spider Database</description>
<link>http://canadianarachnology.dyndns.org/data/canada_spiders/</link>
<opensearch:link rel="search" href="./data/canada_spiders/search/search.xml" type="application/opensearchdescription+xml" />
<zoom:searchquery>pardosa moesta</zoom:searchquery>
<zoom:searchcategory>All</zoom:searchcategory>
<opensearch:totalResults>27</opensearch:totalResults>
<opensearch:startIndex>10</opensearch:startIndex>
<opensearch:itemsPerPage>10</opensearch:itemsPerPage>
<item> ............

Can I Share or Get Help?

Can I Track My Searches?

OpenSearch

Can I Grab That Image?

HTML (JavaScript) & bbCode Gadgets
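
Because the search results are exposed as RSS with OpenSearch elements, a client can read the paging information with a standard XML parser. The sketch below uses Python's `xml.etree.ElementTree`. The `SAMPLE` document is a shortened stand-in for the truncated response on this slide (its closing tags are assumed, and the items are omitted), and `paging_info` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# OpenSearch 1.1 namespace, as declared in the response on the slide.
NS = {"opensearch": "http://a9.com/-/spec/opensearch/1.1/"}

# Shortened stand-in for the response; closing tags are assumed.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0" xmlns:opensearch="http://a9.com/-/spec/opensearch/1.1/">
  <channel>
    <title>Nearctic Spider Database</title>
    <opensearch:totalResults>27</opensearch:totalResults>
    <opensearch:startIndex>10</opensearch:startIndex>
    <opensearch:itemsPerPage>10</opensearch:itemsPerPage>
  </channel>
</rss>"""


def paging_info(rss_text):
    """Extract the OpenSearch paging counters from an RSS search response."""
    channel = ET.fromstring(rss_text).find("channel")
    return {
        key: int(channel.findtext(f"opensearch:{key}", namespaces=NS))
        for key in ("totalResults", "startIndex", "itemsPerPage")
    }
```

`paging_info(SAMPLE)` yields the three counters as integers, which is enough for a client to decide whether to request another page of results.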

Page 22: Shorthouse

Taxa-Centric Approach

Page 24: Shorthouse

http://lifedesk.eol.org

Mid-December alpha testing

Customization (skinning)

Image gallery

Facile names & classification management

Species Page creation:
– Images, “chapters”

Licensing and attribution

Granular roles and permissions

Page 26: Shorthouse

Why Data Centricity?

Observe some features

Of some individuals

On a few occasions

In a few places

Record them incompletely

Convert un-interpreted data into interpreted assertions

Construct a narrative

Loss of data

Page 27: Shorthouse

The Future

Raw data

Triple store

Correlations

Filters:

Faceted searches: What was that tree with pink flowers that we saw in Washington last May?

Visualizations
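
The faceted question on this slide can be made concrete with a toy triple store. This is a minimal in-memory sketch, not a description of any planned EOL infrastructure: the observation identifiers, predicate names, sample data, and the `facet_search` and `values` functions are all invented for illustration.

```python
# Each fact is a (subject, predicate, object) triple; the data are made up.
TRIPLES = [
    ("obs1", "kind", "tree"), ("obs1", "flower_color", "pink"),
    ("obs1", "place", "Washington"), ("obs1", "month", "May"),
    ("obs1", "taxon", "Cercis canadensis"),
    ("obs2", "kind", "tree"), ("obs2", "flower_color", "white"),
    ("obs2", "place", "Washington"), ("obs2", "month", "May"),
    ("obs2", "taxon", "Cornus florida"),
]


def facet_search(triples, **facets):
    """Return the subjects whose triples match every predicate=value facet."""
    facts = set(triples)
    subjects = {s for s, _, _ in triples}
    return sorted(
        s for s in subjects
        if all((s, p, o) in facts for p, o in facets.items())
    )


def values(triples, subject, predicate):
    """Return all objects recorded for a given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]
```

Here `facet_search(TRIPLES, kind="tree", flower_color="pink", place="Washington", month="May")` narrows the observations to `obs1`, and `values(TRIPLES, "obs1", "taxon")` then retrieves the recorded name.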

Page 28: Shorthouse

EOL is…

NOT the mother of all catalogues

A prelude to biocentric data management
– Enable a shift from narrative taxonomy to datacentric taxonomy

A web hosting infrastructure for taxa-centric pursuits