BHL DEVELOPMENTS
BHL-EUROPE MEETING
NÁRODNÍ MUZEUM, PRAGUE
16 NOV 2009
Chris Freeland, Technical Director, BHL
Kai in STL, describing a metadata format
We like to have fun while BHLing…
Blame the scotch
Biodiversity Heritage Library: http://biodiversitylibrary.org
Stats: Now Online
Last week: 15,000 titles, 40,000 volumes, 16.4 million pages
Today: 34,636 titles, 66,544 volumes, 25.2 million pages
BHL Partner Libraries
BHL + >100 other libraries with open access content at archive.org
Stats: Usage
Jan – Sep 2009: 266,000 visitors, 436,000 visits, 2.1 million pageviews
Daily average, Jan – Sep 2009: 970 visitors / day, 1,600 visits / day, 7,700 pageviews / day
Launch to 30 Sep 2009
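The daily averages quoted for Jan – Sep 2009 are consistent with dividing the period totals by the 273 days from 1 January to 30 September. A small sketch of that arithmetic, assuming the totals are the rounded figures given on the slide (the variable names are illustrative only):

```python
from datetime import date

# Days from 1 Jan 2009 through 30 Sep 2009 inclusive
days = (date(2009, 10, 1) - date(2009, 1, 1)).days

# Rounded period totals as quoted on the slide
totals = {"visitors": 266_000, "visits": 436_000, "pageviews": 2_100_000}

# Simple per-day averages over the period
daily = {name: total / days for name, total in totals.items()}

for name, value in daily.items():
    print(f"{name}: about {value:.0f} per day")
```

The results land within rounding distance of the 970 / 1,600 / 7,700 figures on the slide.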
New Color Scheme: To be released this week
http://github.com/openlibrary/bookreader
Cloud storage & computing
Global, coordinated development
Building a community of developers
Funded & volunteer
RubyBHL: http://github.com/mjy/rubyBHL
PyBHL: http://linux.softpedia.com/get/Programming/Libraries/pybhl-51612.shtml
Programmers from China & Australia committed to project
New partners, new content, new possibilities
Open Software & Development
BHL Bits: Portal code, utilities, services
http://code.google.com/p/bhl-bits/
Taxonomic Literature Group: Google Group for discussion of “taxonomic literature & the services required to make literature interoperable within biodiversity research and biodiversity informatics.”
http://groups.google.com/group/taxonlit
Open Data
Downloads: Simple tab-delimited exports of core data
http://www.biodiversitylibrary.org/data/BHLExportSchema.pdf
Data model: DB schema as ERD
http://bhl-bits.googlecode.com/files/20090930_BHLDataModel.pdf
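Because the exports are plain tab-delimited text, they can be read with standard tooling. The sketch below assumes a header row; the column names (TitleID, ShortTitle, StartYear) and the sample rows are hypothetical placeholders, and the actual fields are those documented in BHLExportSchema.pdf.

```python
import csv
import io

# Hypothetical sample standing in for a downloaded export file.
# Real column names are defined in BHLExportSchema.pdf.
sample = (
    "TitleID\tShortTitle\tStartYear\n"
    "1\tExample title A\t1800\n"
    "2\tExample title B\t1850\n"
)

def read_export(handle):
    """Read a tab-delimited export into a list of dicts keyed by header."""
    # QUOTE_NONE treats quote characters in titles as ordinary text
    reader = csv.DictReader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    return list(reader)

rows = read_export(io.StringIO(sample))
for row in rows:
    print(row["TitleID"], row["ShortTitle"], row["StartYear"])
```

For a real download, the `io.StringIO` object would be replaced by an open file handle.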
Services
Names Service: Return all occurrences of a name throughout BHL digitized corpus
Documentation: http://bit.ly/2e6sg9
Access to 51 million name strings using TaxonFinder
1.4 million unique names
OpenURL: Facilitate links to citations: protologues, articles, references
Documentation: http://www.biodiversitylibrary.org/openurlhelp.aspx
Useful to Nomenclators, Reference Systems: IPNI, Tropicos
Services: OpenURL
http://www.biodiversitylibrary.org/openurl?pid=title:3934&volume=14&issue=&spage=301&date=1879
http://www.tropicos.org/Name/1200408
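A minimal sketch of how a client such as a nomenclator might assemble the OpenURL query shown on this slide. The parameter names (pid, volume, issue, spage, date) are taken from that URL; the helper name `build_bhl_openurl` is hypothetical, and the sketch does not claim to cover every parameter the resolver accepts (see the OpenURL documentation page for those).

```python
from urllib.parse import urlencode

BASE = "http://www.biodiversitylibrary.org/openurl"

def build_bhl_openurl(title_id, volume="", issue="", spage="", date=""):
    """Build an OpenURL query for a BHL title (hypothetical helper)."""
    params = {
        "pid": f"title:{title_id}",
        "volume": volume,
        "issue": issue,   # empty values are sent as "issue="
        "spage": spage,
        "date": date,
    }
    # safe=":" keeps the colon in "title:3934" unescaped, as on the slide
    return f"{BASE}?{urlencode(params, safe=':')}"

url = build_bhl_openurl(3934, volume=14, spage=301, date=1879)
print(url)
```

Leaving a field empty rather than omitting it mirrors the slide's example, where `issue=` is present with no value.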
Services: OpenURL Disambiguation
Looking for:
BHL returns:
Services: OpenURL Results
Encyclopedia of Life
522,000 species pages linked to BHL
#1 referring site
Other Consumers
EarthCape Labs: Sort/Search capabilities with harvested names
YouTube demo: http://www.youtube.com/watch?v=qw7qw87JTOs
BioGUID / iPhylo: BHL Name Timeline & Comparison
http://bioguid.info/bhl/
http://bioguid.info/bhl/compare.php
New Viewer Tagging So much cool stuff we can’t keep up!
http://iphylo.blogspot.com/search/label/BHL
@rdmpage
http://bioguid.info/bhl/compare.php?name1=Physeter+catodon&name2=Physeter+macrocephalus
Crowdsourced Articles
http://www.biodiversitylibrary.org/pdfgen/17298
Demo: http://youtube.com/watch?v=oidf3b26jVs
Crowdsourced Articles
12,000 PDFs generated through September 2009
4,900 submitted with article metadata
Analysis: http://bit.ly/4Jqu9
Great, but how to…
display / manage?
meet community demands for bibliography / citation management?
build from more open source tools?
Development goals re: citations
Create a repository for community-vetted taxonomic bibliographies.
Ability to ingest, display, download, and index articles so that the BHL can operate as an article repository.
Identify article boundaries in BHL digitized content using contributed bibliographies & algorithms.
Build from existing community of work around Drupal / Biblio. In use by collaborators.
“something like GenBank or NameBank for citations…”
So, CitationBank…or CiteBank (savs chars)
Need…
Crowdsourced Articles
PDFs from BHL pushed into Drupal/Biblio:
http://citebank.biodiversitylibrary.org/search
http://citebank.biodiversitylibrary.org/node/47423
CiteBank boundaries
Book
Citation
Pageturning UI, PDF, OCR
eBook/Kindle
Stored *somewhere* & retrievable via HTTP URI
Citation, Citation, Citation
Bibliography
CiteBank
BHL Data Flow – Sep 2009
CiteBank
Points of discussion @ TDWG09…
Linked Literature and the Biodiversity Heritage Library
http://www.tdwg.org/proceedings/article/view/548
Who can upload & edit?
Trusted repositories? Approved specialists? BHL Librarians? People in this session? Citizen scientists? 6th graders? Rod Page?
Discussion: Session participants thought it important that BHL get as many citations as possible, then find ways of implementing trust mechanisms for users such as iSpot (Drupal module), ratings systems, ways of tagging inappropriate materials.
What about duplicates?
3 bibliographies had Syst. Nat.
All 3 in different reference manager formats
All 3 had variant forms of title:
Syst. Nat.
Systema Naturae
Systema naturae per regna tria naturae
Library catalogues: Caroli Linnaei...Systema naturae per regna tria naturae: secundum classes, ordines, genera, species, cum characteribus, differentiis, synonymis, locis.
Discussion: Important to have all the ways in which materials have been referred to over time, then have algorithms & people aggregate titles/articles (translations) into reconciliation groups, resulting in a master index.
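One way the algorithmic half of this reconciliation could start is sketched below. It is a deliberately naive illustration, not CiteBank's method: two titles are treated as variants when, word by word, one is a prefix of the other (so "Syst." matches "Systema"). The function names are hypothetical, and forms with leading author text, such as the library catalogue entry, would still need people or better algorithms to place.

```python
import re

def tokens(title):
    """Lowercase ASCII word tokens with punctuation stripped."""
    return re.findall(r"[a-z]+", title.lower())

def is_variant(a, b):
    """True if, position by position, each word of one title is a
    prefix of the matching word of the other (abbreviation test)."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return False
    # zip stops at the shorter title, so a short form can match a longer one
    return all(x.startswith(y) or y.startswith(x) for x, y in zip(ta, tb))

def reconcile(titles):
    """Greedily place each title in the first group containing a variant."""
    groups = []
    for title in titles:
        for group in groups:
            if any(is_variant(title, member) for member in group):
                group.append(title)
                break
        else:
            groups.append([title])
    return groups

groups = reconcile([
    "Syst. Nat.",
    "Systema Naturae",
    "Systema naturae per regna tria naturae",
    "Species Plantarum",
])
for group in groups:
    print(group)
```

Each resulting group keeps every variant form, which fits the point made in the discussion that all the ways a work has been cited should be retained alongside the reconciled entry.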
Accuracy
How clean is clean? How dirty is dirty? What’s good enough?
How to Rank Gold/Platinum?
Dirty Bucket/Clean Bucket?
Discussion: Let users decide which is the “right” form for use; may differ from project to project. BHL should take it all in, then refine using our libraries’ collected knowledge + involvement from domain specialists.
Right technologies?
“But Drupal’s awful…just ask ___ for their bad experience.”
“Drupal’s great!”
“MySQL won’t scale” “MySQL’s great!”
Discussion: Drupal has limitations, but a large community of developers & implementers. There may be a “Montpellier Declaration” to centralize efforts within biodiversity informatics around the framework. Drupal/Biblio is a good starting point for CiteBank, needs further evaluation after more data are loaded & site is used.
…BHL keeps growing & growing & growing…
New projects
Darwin’s Library
AMNH, NHM, CUL, BHL (MOBOT)
Funded by NEH/JISC
Digitization of Darwin’s personal library, with annotations
New interfaces for recording, indexing, displaying annotations
Review “Dannotate” technology from ALA: http://metadata.net/sfprojects/dannotate.html
BHL Take Away
Content now available in EPUB format
Used by Stanza, transferable to Kindle
Blog post by John Mignault (NYBG): http://john.mignault.net/blog/2009/10/28/first-bhl-e-book-experiments/
Next steps
Bring hardware online at MBL
Have one point of redundancy by Q1 2010
Bring BHL-Europe & other nodes online
In conjunction with DuraCloud & other solutions
Release CiteBank for beta & sandbox testing
Beta at http://citebank.biodiversitylibrary.org
Sandbox at http://sandcite.biodiversitylibrary.org
Production release by Q2 2010
Integration of BHL-Europe tools & content
Global BHL Coordination
Thanks!
Chris Freeland
Technical Director, BHL
Director, Center for Biodiversity Informatics, Missouri Botanical Garden
chris.freeland@mobot.org
http://twitter.com/chrisfreeland