How Did BHL Get to Big Data? 3 October 2017 TDWG 2017 | Ottawa Martin R. Kalfatovic Twitter @ BHLProgDirector Biodiversity Heritage Library


Page 1: How Did BHL Get to Big Data?

How Did BHL Get to Big Data?

3 October 2017

TDWG 2017 | Ottawa

Martin R. Kalfatovic

Twitter @BHLProgDirector

Biodiversity Heritage Library

Page 2: How Did BHL Get to Big Data?

A Science/Library/Technology Project

Page 3: How Did BHL Get to Big Data?

“The cultivation of natural history cannot be efficiently carried out without reference to an extensive library.”

Charles Darwin, et al. (1847)

Page 4: How Did BHL Get to Big Data?

BHL encompasses

Technology Libraries Science

Page 5: How Did BHL Get to Big Data?

Built on a foundation of 250+ years of library collecting in the field of natural history...

Page 6: How Did BHL Get to Big Data?

Focusing on collection strengths at founding partner institutions, BHL worked in core biodiversity areas ...

Page 7: How Did BHL Get to Big Data?

The Internet Archive provided a robust and low-cost platform to work with partners around the world ...

Page 8: How Did BHL Get to Big Data?

9. Page View

Page 9: How Did BHL Get to Big Data?

A Collaboration of Many Content Providers

Page 10: How Did BHL Get to Big Data?

No single partner library held all the content, so to ramp up quickly, BHL built on strengths:

• Botany (Botanical Gardens)
• Entomology (Smithsonian)
• Large run serial publications (NHM London, MBL WHOI)
• Vertebrate Zoology (Harvard MCZ and AMNH)

Page 11: How Did BHL Get to Big Data?
Page 12: How Did BHL Get to Big Data?

[Figure: Growth of BHL content by year. x-axis: Year (1 to 12); y-axis: Pages (scale up to 18,000,000).]

Page 13: How Did BHL Get to Big Data?

53+ MILLION PAGES

128,000+ TITLES

213,000+ VOLUMES

178+ MILLION INSTANCES OF TAXONOMIC NAMES

645+ IN-COPYRIGHT TITLES LICENSED FOR BHL

AGREEMENTS WITH 275+ LICENSORS

*Stats as of October 2017

Page 14: How Did BHL Get to Big Data?

Robust and Sustainable Funding Strategies

Page 15: How Did BHL Get to Big Data?

Core funding in 2007 from the MacArthur Foundation through the Encyclopedia of Life

Page 16: How Did BHL Get to Big Data?

Biodiversity Heritage Library

Synthesis Center: Field Museum

Secretariat: Smithsonian

Education & Outreach

Smithsonian/Harvard

Informatics: Marine Biological Laboratory

Page 17: How Did BHL Get to Big Data?

*As of September 2017

MEMBERS

• American Museum of Natural History Library
• BHL Australia
• BHL México
• Cornell University Library
• Field Museum of Natural History Library
• Harvard University Botany Libraries
• Harvard University, Museum of Comparative Zoology, Ernst Mayr Library
• Library of Congress
• The LuEsther T. Mertz Library, The New York Botanical Garden
• Missouri Botanical Garden, Peter H. Raven Library
• Muséum national d’Histoire naturelle
• National Library Board, Singapore
• Natural History Museum Library, London
• Royal Botanic Gardens, Kew, Library, Art & Archives
• Smithsonian Libraries
• United States Department of Agriculture, National Agricultural Library
• United States Geological Survey Libraries Program
• University Library, University of Illinois Urbana-Champaign
• University of Toronto Libraries

Page 18: How Did BHL Get to Big Data?

*As of September 2017

AFFILIATES

• Academy of Natural Sciences of Drexel University, Library and Archives
• BHL Africa
• BHL China
• BHL Egypt
• BHL SciELO (Brazil)
• Bibliothèque cantonale et universitaire - Lausanne
• California Academy of Sciences Library
• Canadian Museum of Nature
• Chicago Botanic Garden, Lenhardt Library
• Internet Archive
• Los Angeles County Arboretum & Botanic Garden
• Marine Biological Laboratory/Woods Hole Oceanographic Institution Library (MBLWHOI Library)
• Mendel Museum
• Narodni Museum (National Museum, Prague)
• Natural History Museum Los Angeles County
• Naturalis Biodiversity Center
• Oak Spring Garden Foundation
• Smithsonian Institution Archives

Page 19: How Did BHL Get to Big Data?

Finances

2006 – 2016 Grants Received (by year)

Page 20: How Did BHL Get to Big Data?

FUNDING SOURCES

• Federal Funding
  • Federal allocation to Smithsonian Libraries
• Member and Affiliate Dues
• Institutional Endowments
• Grants
  • Alfred P. Sloan Foundation
  • Arcadia Fund
  • Council on Library & Information Resources
  • Gordon & Betty Moore Foundation
  • Institute of Museum & Library Services
  • JRS Foundation
  • MacArthur Foundation
  • Mellon Foundation
  • National Endowment for the Humanities
  • National Science Foundation (NSF)
  • Richard Lounsbery Foundation
• Donations
• Product Development
• Institutional Subventions
• In-Kind Contributions

Page 21: How Did BHL Get to Big Data?

CASH & IN-KIND CONTRIBUTIONS

VALUE OF MEMBER & AFFILIATE CONTRIBUTIONS 2016

• DIRECT STAFF: $1,424,792.54
• OTHER: $392,751.28

TOTAL IN-KIND CONTRIBUTIONS, 2015 VS 2016

• 2015: $1,358,908.20
• 2016: $1,817,543.82

27.26 TOTAL MEMBER & AFFILIATE FTEs WORKING ON BHL IN 2016
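
The contribution figures on this slide can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original presentation: the variable names are illustrative, and the year-over-year percentage is derived here rather than quoted from the slide. It checks that the two 2016 components add up to the 2016 total and computes the change from 2015.

```python
from decimal import Decimal

# Figures copied from the slide (USD); variable names are illustrative.
direct_staff_2016 = Decimal("1424792.54")
other_2016 = Decimal("392751.28")
total_2015 = Decimal("1358908.20")
total_2016 = Decimal("1817543.82")

# The two 2016 components should sum to the reported 2016 total.
computed_2016 = direct_staff_2016 + other_2016

# Year-over-year change (derived here, not stated on the slide).
change = total_2016 - total_2015
pct_change = (change / total_2015 * 100).quantize(Decimal("0.01"))

print(f"Computed 2016 total: ${computed_2016:,}")
print(f"Change 2015 to 2016: ${change:,} ({pct_change}%)")
```

Decimal is used instead of floating point so that the cent-level sums compare exactly.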

Page 22: How Did BHL Get to Big Data?

Growth Drivers

Page 23: How Did BHL Get to Big Data?

Permissions for In Copyright Material

Thanks to the work of the Expanding Access to Biodiversity Literature team (Mariah Lewis and Patrick Randall) and Bianca Crowley, BHL had a successful year with 164 newly licensed titles and 83 licensors since our last meeting.

• Licensed titles in CY 2016: 164
• Licensors in CY 2016: 83

Page 24: How Did BHL Get to Big Data?

Permissions for In Copyright Material

Page 25: How Did BHL Get to Big Data?

BHL is a Global Consortium

19 MEMBERS

18 AFFILIATES

60+ WORLDWIDE PARTNERS

AS OF SEPTEMBER 2017

Page 26: How Did BHL Get to Big Data?

International Focus

Page 27: How Did BHL Get to Big Data?

Biodiversity Heritage Library Field Notes Project

• Funded by a Digitizing Hidden Special Collections and Archives grant from the Council on Library and Information Resources (CLIR)
• Two-year award for 491,713 USD.
• Collaborative effort to digitize field notes, assign metadata, and publish online through BHL & Internet Archive
• Lead Institutions: Smithsonian Libraries and Smithsonian Institution Archives.
• Participating Institutions:
  • American Museum of Natural History
  • The Field Museum of Natural History Library
  • Harvard University Botany Libraries
  • Harvard University, Museum of Comparative Zoology, Ernst Mayr Library
  • LuEsther T. Mertz Library, The New York Botanical Garden
  • Missouri Botanical Garden, Peter H. Raven Library
  • Museum of Vertebrate Zoology at the University of California, Berkeley
  • Yale Peabody Museum Archives
  • Internet Archive

Page 28: How Did BHL Get to Big Data?

Smithsonian Field Book Project

• Currently funded by the Arcadia Foundation, UK. Initiated with funding from the Council on Library and Information Resources and previously supported by the Smithsonian Women’s Committee and the National Park Service’s Save America’s Treasures.
• Arcadia’s two-year award funded at 511,200 USD.
• Coordinating work to catalog, conserve and digitize scientists’ field notes from the collections of the Smithsonian.
• Content will be made available through the Smithsonian’s Collection Search Center at collections.si.edu and the Biodiversity Heritage Library at biodiversitylibrary.org, as well as international aggregator sites such as the Internet Archive and the Digital Public Library of America.

Page 29: How Did BHL Get to Big Data?

Expanding Access to Biodiversity Literature

• Funded by the Institute of Museum and Library Services (IMLS) in 2015 as part of the National Leadership Grants for Libraries program.
• Two-year award for 846,457 USD.
• EABL is helping libraries, museums, and natural history societies make their content more widely available by providing the tools and support necessary to facilitate contribution to the Digital Public Library of America (DPLA) through BHL.
• Lead Institution: The New York Botanical Garden.
• Participating Institutions: Harvard Ernst Mayr Library of the Museum of Comparative Zoology (MCZ), Missouri Botanical Garden (MBG), and Smithsonian Libraries (SIL).
• Progress to date: 3,578 volumes (479 titles; 393,063 pages); 127 in copyright titles from 59 contributors.

Page 30: How Did BHL Get to Big Data?

116,500+ IMAGES IN FLICKR

34,500+ TOTAL IMAGES TAGGED

256+ MILLION TOTAL VIEWS ON IMAGES

30% OF TOTAL FLICKR COLLECTION TAGGED

18,000+ TAGGED IMAGES IN EOL

BHL FLICKR NAMED 1 OF WIRED’S 27 MUST-FOLLOW FEEDS IN THE WORLD OF SCIENCE

*Stats as of June 2017

WWW.FLICKR.COM/BIODIVLIBRARY

Page 31: How Did BHL Get to Big Data?

Connecting with Users

Page 32: How Did BHL Get to Big Data?

6.5+ MILLION TOTAL USERS TO DATE

113,000+ AVERAGE MONTHLY USERS

12+ MILLION TOTAL WEBSITE VISITS TO DATE

192,000+ AVERAGE MONTHLY VISITS

VISITS FROM 243 COUNTRIES & TERRITORIES

*Stats as of September 2017

Page 33: How Did BHL Get to Big Data?

1. London
2. New York
3. Mexico City
4. Paris
5. Sydney
6. Berlin
7. Washington
8. Melbourne
9. New Delhi
10. Sao Paulo

Top 10 Cities by Sessions, CY 2016

Page 34: How Did BHL Get to Big Data?

124,295 users

February 2016

CY 2016

2.123m sessions

1.162m users

96,862 users/month

2007-2016

Page 35: How Did BHL Get to Big Data?

Mobile Sessions CY 2015: 8.51% of sessions

Mobile Sessions CY 2016: 10.45% of sessions

Mobile sessions increased by 34.43% over the past year
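
The two share figures and the 34.43% figure measure different things, and a short calculation makes the distinction concrete. This is a minimal sketch with illustrative variable names. It assumes, although the slide does not say so, that 34.43% describes growth in the number of mobile sessions, while 8.51% and 10.45% are mobile's share of all sessions.

```python
# Mobile share of all sessions, as reported on the slide (percent).
share_2015 = 8.51
share_2016 = 10.45

# Change in share, in percentage points.
point_change = share_2016 - share_2015

# Relative change in share (derived here, not stated on the slide).
relative_change = point_change / share_2015 * 100

print(f"Share change: {point_change:.2f} percentage points")
print(f"Relative change in share: {relative_change:.1f}%")
```

The relative change in share comes out below the 34.43% reported on the slide, which fits the assumption that the slide's figure counts sessions rather than share.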

Page 36: How Did BHL Get to Big Data?

A Commitment to Open Access…

BHL is a charter signatory of the Bouchout Declaration for Open Biodiversity Knowledge Management.

Fundamental principles of the Declaration:

• Free & Open Use
• Policies to Foster Free & Open Access
• Persistent Identifiers
• Tracking Identifiers to Ensure Attribution
• Infrastructure, Standards & Protocols to Improve Access
• Linked Data
• Sustainable Knowledge Management
• Registers for Content & Services

Page 37: How Did BHL Get to Big Data?

“Science is all about disseminating knowledge and building upon what has come before, yet so much of our knowledge of plants and animals has remained inaccessible to those who could make use of it.”

Dr. John Sullivan

Evolutionary Biologist

Academy of Natural Sciences, Philadelphia

Cornell University

Page 38: How Did BHL Get to Big Data?

BHL: A Source for Big Data Analysis

Presenter: Mike Lichtenberg

11:00 AM - 12:30 PM, Ballroom A

4 October 2017 (Wednesday)

Using Big Data Techniques to Cross Dataset Boundaries - Integration and Analysis of Multiple Datasets

Organizers: Matthew Collins, Robert Guralnick, Martin R. Kalfatovic

Page 39: How Did BHL Get to Big Data?
Page 40: How Did BHL Get to Big Data?

Expanding Access to Biodiversity Literature

Presenter: Mariah Lewis

Scientific Names: Linking the Past to Provide Context for Knowledge

Presenter: Thomas M. Orrell

A path to continuous reindexing of scientific names appearing in Biodiversity Heritage Library data

Presenter: Dmitry Mozzherin

Crowdsourcing Data Enhancements to Improve Named Entity Recognition in the Biodiversity Heritage Library

Presenter: Katie Mika

BHL’s Feedback Tools and User Surveys: Investigating User Needs for Data in Digital Libraries

Presenter: Carolyn A. Sheffield

Page 41: How Did BHL Get to Big Data?

Thank You!

Twitter @ BHLProgDirector