Vincent S. Smith: Small pieces loosely joined. Towards a unified theory of biodiversity for the web


Vincent S. Smith

Small pieces loosely joined: Towards a unified theory of biodiversity for the web

Macro taxonomy: The big picture of taxonomic research

Goal…
• Inventory the Earth’s species
• Document their relationships
• “Publish” these data

Data set…
• 1.8M described spp. (10M names)
• 300M pages (over last 250 years)
• 1.5-3B specimens

People…
• 4-6,000 scientists
• 30-40,000 “pro-amateurs”
• Many more citizen scientists?

Micro taxonomy: The practice of taxonomic research

How do we integrate micro & macro taxonomy for the Web?

Sociology…
• Parochial
• Specialized experts
• Fragmented & distributed

Methodology…
• Different (domain specific)
• Communities of practice
• Non transferable skills

Output…
• Heterogeneous & scattered
• High volume, low impact
• Hard to find (use)

http://Scratchpads.eu

What is a Scratchpad? A website for you & your community

1. Your data
2. Uploaded & tagged
3. Published & reviewed on your site

Fast, intuitive, fit for use

What can Scratchpads do? Import, manage, search & browse:
• Taxonomy
• DNA & Phylogenies
• Specimens
• Literature
• Images

What can Scratchpads do? Integration & connectivity within & between sites

What can Scratchpads do? In summary:

What can Scratchpads do? Visual task guide

+ Administration
  - Change your site information
  - Change your front page
  - Change your logo
  - Activity and access logs
+ Backup
  - Backing up your data
  - Restoring your data
+ Bibliography
  - Creating a record
  - Importing from a ref. manager
  - Exporting to a reference manager
+ Blog
  - Creating and adding a blog
+ Custom Content
  - Defining a CCK
  - Importing from a spreadsheet
  - Creating a custom view
+ Fileshare
  - Creating and using a fileshare
+ Forum
  - Altering the forum settings
  - Creating a container for a forum
  - Creating a new forum
  - Creating a new topic inside a forum
+ Groups
  - Creating a group
  - Subscribing to a group
+ Image
  - Uploading & basic annotation
  - Linking image & location records
  - Linking image & specimen records
  - Linking image & publication records
  - Overlay annotations on images
+ Layout
  - Change your theme
  - Menus
  - Blocks and sidebars
+ Locations
  - Creating a record
  - Importing from a spreadsheet
+ Pages
  - Creating, editing, cloning & deleting
  - Configuring the panels template
+ Panels
  - Adding & configuring content
  - Creating a new panel
  - Citing a Panels page
+ Phylogeny
  - Adding a phylogenetic tree
+ Specimens
  - Creating a record
  - Importing from a spreadsheet
  - Linking specimen & location records
  - Linking specimen & pub. records
+ Tasks
  - Creating a tasklist
+ Taxonomy
  - Importing from a spreadsheet
  - Importing from ClassificationBank
  - Starting from scratch
  - Taxonomy manager
  - Displaying a classification
  - Adding names
  - Deleting names
  - Taxonomy & panels
+ Users
  - Your settings
  - Adding a new user
  - User roles and permissions
  - Adding and editing user profile fields
  - Logging in
+ Webform
  - Creating and using webforms

Current Scratchpads

Ants, Bees, Beetles, Big-headed flies, Birds, Blackflies, Ciliates, Cockroaches, Dragon Trees, Dung Beetles, False Buttonweed, Flat worms, Flies, Foraminifera, Fossil Insects, Fungus Gnats, Holometabola, Leaf-miner Flies, Lice, Lichens of Bermuda, Malvaceae, Megalastrum ferns, Milichiid flies, Mosquitoes, Mosses, Nannotax fossils, Nepticuloid moths, Palms, Pearl oysters, Polychaete worms, Scaleworms, Stick insects, Sulawesi Ferns, Termites, Triticid grasses, Weevils, Wood Ferns

Sites: 70+
Users: 850+
Pages: 130k
Since March 2007

Scratchpad visitors: Tracking visitors across sites

Key monthly statistics (average per month, 2008)
- 50,000 page views
- 6,000 visitors
- 8 minutes on site
- 50% returning visits

Scratchpad applications: A multipurpose, flexible technology

eBooks
4th Edition Howard & Moore, Birds of the world
(fact checking, data compilation, 2010, funding)

eJournals
European Mosquito Bulletin (ISSN 1460-6127), Phasmid Studies (ISSN 0966-0011)
(submission, review, & dissemination of articles)

Image galleries
Nanno fossils, Cockroaches, Stick insects, Flatworms, Grasses, Lichens & many more…
(rapid upload, annotation, & display of images)

How do Scratchpads work? Getting a Scratchpad

Requirements
• Biological focus
• Agree to T&Cs (click-thru)
• CC license “by-nc-sa”

Application
• Maintainer
• Scope/Mission/API Keys
• (Sub)domain name

Content
• Unrestricted (overlapping)
• No branding (focus on authors)
• Value added

http://scratchpads.eu/apply

How do Scratchpads work? Using a Scratchpad

Management
• User categories (maintainer, editor, contributor)
• Public / private content (flexible groups)
• Admin. page (site settings & behavior)

Data Input
• Content types (biblio, maps, “page” etc.)
• Forms, managers, Excel, EndNote etc.
• Custom content (add or extend data types)

Tagging (indexing)
• Taxonomy terms (2M+)
• Multiple classifications
• Auto-tagging

Autotagging: Indexing data to make it findable

1. Create content (e.g. reference): Journal citation mentions taxon name
2. Find terms (Autotag): Matches taxonomy term (Drag & Drop)
3. Submit (Index): Page tagged (indexed) with taxon name
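
The three autotagging steps can be illustrated with a minimal Python sketch. It is not the Scratchpads implementation: the term list, the term IDs and the function names `find_terms` and `submit` are all hypothetical, and matching is assumed to be a simple whole-word search. The sample citation is the Palma & Pilgrim title used later in these slides.

```python
import re

# Hypothetical mini classification: taxon name -> term ID (illustrative values only).
TAXONOMY_TERMS = {
    "Naubates": 101,
    "Philopteridae": 102,
    "Phthiraptera": 103,
}


def find_terms(text, terms=TAXONOMY_TERMS):
    """Step 2 (Autotag): return the taxonomy terms mentioned in the text."""
    matches = {}
    for name, term_id in terms.items():
        # Whole-word, case-sensitive match on the taxon name.
        if re.search(r"\b" + re.escape(name) + r"\b", text):
            matches[name] = term_id
    return matches


def submit(record, index):
    """Step 3 (Index): file the record under every matched term."""
    tags = find_terms(record["text"])
    for name in tags:
        index.setdefault(name, []).append(record["id"])
    return sorted(tags)


# Step 1 (Create content): a bibliographic reference that mentions taxon names.
index = {}
reference = {
    "id": 1,
    "text": "A revision of the genus Naubates (Insecta: Phthiraptera: Philopteridae).",
}
tags = submit(reference, index)
```

After `submit` runs, `index` maps each matched taxon name to the IDs of the records tagged with it, which is what makes the record findable by browsing the taxonomy.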

How do Scratchpads work? Integrating data & “publishing” in a Scratchpad

• Tagged data can be presented differently
• For example as part of a traditional bibliography
• Or as small windows or “panels” of data

How do Scratchpads work? Types of Scratchpad Panel… Built with “tagged data”

• Taxonomic hierarchies
• Files and documents
• Phylogenetic trees
• Customized content
• Specimen records
• Photographs & illustrations
• Personalized instructions
• Common names
• Bibliographic literature
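
As a rough illustration of how panels could be assembled from tagged data, the Python sketch below groups every record tagged with a taxon into one panel per content type. The record fields, the sample records and the function name `build_taxon_page` are assumptions made for this example, not part of Scratchpads itself.

```python
from collections import defaultdict

# Hypothetical tagged records; field names and contents are illustrative only.
RECORDS = [
    {"type": "bibliography", "title": "A revision of the genus Naubates", "tags": ["Naubates"]},
    {"type": "image", "title": "Habitus photograph", "tags": ["Naubates"]},
    {"type": "specimen", "title": "Specimen record 1", "tags": ["Naubates", "Philopteridae"]},
    {"type": "image", "title": "Family plate", "tags": ["Philopteridae"]},
]


def build_taxon_page(taxon, records, panel_types=None):
    """Group every record tagged with `taxon` into one panel per content type."""
    panels = defaultdict(list)
    for record in records:
        if taxon not in record["tags"]:
            continue  # record is not tagged with this taxon
        if panel_types is not None and record["type"] not in panel_types:
            continue  # this panel type was not selected for display
        panels[record["type"]].append(record["title"])
    return dict(panels)


page = build_taxon_page("Naubates", RECORDS)
images_only = build_taxon_page("Philopteridae", RECORDS, panel_types={"image"})
```

Because the page is computed from tags at request time, adding a newly tagged record changes the page without anyone editing the page itself.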

How do Scratchpads work? Integrating data & “publishing” in a Scratchpad

• Dynamically built species pages
• Browsed through a taxonomy
• Including 3rd party content
• With data curation tools
• Listing all “authors”
• Dated, permanent & citable

How do Scratchpads work? Adjusting the panels layout

• Choose which panels to display

How do Scratchpads work? An example based on the Catalogue of Life classification

• 2 million taxon pages
• Open curation at http://catlife.myspecies.info

The informatics landscape

Biodiversity on the Web

Scratchpads are personalizing biodiversity science

A unified theory of biodiversity? BHL, EOL and scholarly journals

Biodiversity Heritage Library
• Digitising heritage literature

Encyclopedia of Life
• A web page for every species

Scholarly Journals
• Traditional publishing

Biodiversity Heritage Library: “Digitizing biodiversity literature”

• Biodiversity publications since 1469
  - 5.4 million books
  - 800,000 monographs
  - 40,000 periodicals
• Held by Natural History libraries
  E.g., NHM holds more than 1M books, 250k monographs & periodicals, 0.5M artworks
• BHL partnership of 10 Nat. Hist. libraries
• Sharing the digitisation of contents
• Focus on out of copyright materials
• Partnership with “Internet Archive”
• Make the contents “findable”

Biodiversity Heritage Library: “Digitizing biodiversity literature”

1. Scan (photograph)
   1 scribe machine, 3,500 pages per shift per day
   34 scribe machines now in operation
2. Extract text (OCR)
3. Find keywords
   - Taxonomic names
   - Author names
   - Citations
   - Collection data
   - Morphological data
   - Descriptions
   - Identification keys
   - Illustrations
   - Photographs
   Palma, R.L., and R.L.C. Pilgrim. 2002. A revision of the genus Naubates (Insecta: Phthiraptera: Philopteridae). J. R. Soc. N.Z. 32:7-60.
4. Index
5. Put on the web
6. 10M pp. to date
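
Steps 3 to 5 of the digitisation workflow (find keywords, index, look up on the web) can be sketched in Python as an inverted index from names to page numbers. This is only an illustration under stated assumptions: the OCR text for pages 8 and 9, the `KNOWN_NAMES` list and all function names are invented for the example, and BHL's actual name-finding tools are not shown here.

```python
import re
from collections import defaultdict

# Hypothetical OCR output (step 2): page number -> extracted text.
OCR_PAGES = {
    7: "A revision of the genus Naubates (Insecta: Phthiraptera: Philopteridae).",
    8: "Key to the species of Naubates.",
    9: "Acknowledgements and references.",
}

# Assumed list of known names; a real service would consult a far larger name list.
KNOWN_NAMES = ["Naubates", "Phthiraptera", "Philopteridae"]


def find_keywords(text, names=KNOWN_NAMES):
    """Step 3: return the known names that occur as whole words in one page."""
    words = set(re.findall(r"[A-Za-z]+", text))
    return [name for name in names if name in words]


def build_index(pages):
    """Step 4: build an inverted index, name -> ascending list of page numbers."""
    index = defaultdict(list)
    for page_number in sorted(pages):
        for name in find_keywords(pages[page_number]):
            index[name].append(page_number)
    return dict(index)


def lookup(index, name):
    """Step 5: answer a web query for every page that mentions a name."""
    return index.get(name, [])


index = build_index(OCR_PAGES)
naubates_pages = lookup(index, "Naubates")
```

The index is built once per scanned volume, so a later query only needs a dictionary lookup rather than a scan of the full text.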

Scratchpads and BHL: Creating a community built virtual taxonomic library

Scratchpads as a tool to add articles (and markup) to BHL?


Encyclopedia of Life: “A web page for every species”

• A web page for all 1.8M species
• $25m funding (5 years)
  - MacArthur and Sloan Foundations
• Megascience mashup
  - Aggregating data from the web
• Multiple audiences
  - Science & outreach
• 10 years to complete
  - First draft 2008, “finished” 2017!
• Struggling to find an identity?
  - Competition, vetting, growth, credit
• A possible publishing platform?
  - LifeDesks / Scratchpads

Biodiversity Journals: Scholarly communication in taxonomy & systematics

Journals / Articles
• Fragmented
• Mostly commercial
• Data poor
• Fixed audience
  - Hard to repurpose
• Possible role for EoL?
  - Web publishing platform (cf. Wikipedia)
• Zootaxa
  - 15% n. spp; 50 spp. a week!
• Scratchpads / EoL / Zootaxa
  - MS Word Template (markup)
  - Simultaneous publication

Summary: “Small pieces loosely joined”

1. Bringing data together
   Biodiversity studies are data rich, poorly archived & ever changing
2. Bringing people together
   Biodiversity researchers are few in number, fragmented & highly distributed
3. Bringing science together
   Biodiversity science demands a different approach to addressing BIG questions

BIG IS DIFFERENT: New opportunities & new challenges!

Thanks…
Simon Rycroft, Dave Roberts, Kehan Harman, Ben Scott, Vladimir Blagoderov, Irina Brake, Edward Baker

Questions?

Scratchpad management: Scalable & sustainable technology

Hardware, software & user support
Virtual machine, open-source software, self-archiving, backed-up, multi-site configuration
(easy to move & upgrade, secure & reliable, citable, screencasts, low admin., low marginal costs)