Working out the Future of Library Resource Discovery
Marshall Breeding
Independent Consultant, Founder and Publisher, Library Technology Guides
http://www.librarytechnology.org/
http://twitter.com/mbreeding
October 5, 2015
NISO Event: Future of Library Resource Discovery
http://www.niso.org/publications/white_papers/discovery/
Description
Marshall Breeding will highlight some of the key findings of the white paper he developed for the NISO Discovery to Delivery topic committee. The presentation will include some updated information on the state of the current arena of commercial and open source discovery services, including trends in adoption and new technical and functional capabilities. Looking forward, Breeding will mention some longer-term possibilities and opportunities for discovery services to move beyond the current models of centralized indexes, including greater reliance on semantic technologies and linked data.
Library Discovery Past
Bound Catalog
National Library of Colombia
Card Catalog
National Library of Argentina
Online Catalog
Books, Journals, and Media at the Title Level
Not in scope: Articles, Book Chapters, Digital objects
Scope of Search
Search:
Search Results
ILS Data
NOTIS: MDAS
Multiple Database Access System
Released in 1989
Article-level indexing (mostly Wilson databases)
Grant supported by Pew Charitable Trusts
Development partners: NOTIS and Vanderbilt University
See: Steffey, RJ. “NOTIS multiple database access system: a look behind the scenes.” Online, v14 n5 p46-49, Sep 1990
Next-gen Catalogs or Discovery Interface
Single search box
Query tools
  Did you mean
  Type-ahead
Relevance ranked results
Faceted navigation
Enhanced visual displays
  Cover art
  Summaries, reviews
Recommendation services
Books, Journals, and Media at the Title Level
Other local and open access content
Not in scope: Articles, Book Chapters, Digital objects
Scope of Search
Discovery Interface search model
Diagram components:
Search
Metasearch Engine (real-time query and responses)
Targets: Digital Collections, ProQuest, EBSCOhost, …, MLA Bibliography, ABC-CLIO
Local Index: ILS Data
Search Results
Web-scale Index-based Discovery
Diagram components:
Search
Consolidated Index (pre-built harvesting and indexing)
Indexed sources: Digital Collections, Web Site Content, Institutional Repositories, …, E-Journals, Reference Sources, ILS Data, Aggregated Content packages, Open Access
Usage-generated Data
Customer Profile
Search Results
(2009-present)
Evaluating the Performance of Index-based Discovery Services
Intense competition: how well the index covers the body of scholarly content stands as a key differentiator
Difficult to evaluate based on numbers of items indexed alone.
Important to ascertain how your library’s content packages are represented by the discovery service.
Important to know what items are indexed by citation, which are full text, and how A&I content is handled
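One way a library might check how its own content packages are represented is sketched below. This is a minimal illustration only: the ISSNs, the indexing-level labels, and the function name `coverage_report` are hypothetical and do not come from any discovery vendor's reporting format.

```python
from collections import Counter

# Hypothetical sample data: titles the library subscribes to (by ISSN), and
# the indexing level a discovery service reports for each title it covers.
library_holdings = {"1234-5678", "2345-6789", "3456-7890", "4567-8901"}
index_coverage = {
    "1234-5678": "full_text",
    "2345-6789": "citation",
    "9999-0000": "full_text",  # indexed, but not held by this library
}

def coverage_report(holdings, coverage):
    """Return the share of the library's own titles at each indexing level."""
    levels = Counter(coverage.get(issn, "not_indexed") for issn in holdings)
    total = len(holdings)
    return {level: count / total for level, count in levels.items()}

report = coverage_report(library_holdings, index_coverage)
print(report)
```

The point of the sketch is that the denominator is the library's holdings, not the total number of items in the index, which is why raw index size alone says little about fit.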
Open Discovery Initiative
Libraries
Publishers
Service Providers
Marshall Breeding, Vanderbilt University
Jamene Brooks-Kieffer, Kansas State University
Laura Morse, Harvard University
Ken Varnum, University of Michigan
Sara Brownmiller, University of Oregon
Lucy Harrison, College Center for Library Automation (D2D liaison/observer)
Michele Newberry
Lettie Conrad, SAGE Publications
Roger Schonfeld, ITHAKA/JSTOR/Portico
Jeff Lang, Thomson Reuters
Linda Beebe, American Psychological Association
Aaron Wood, Alexander Street Press
Jenny Walker, Ex Libris Group
John Law, Serials Solutions
Michael Gorrell, EBSCO Information Services
David Lindahl, University of Rochester (XC)
Jeff Penka, OCLC (D2D liaison/observer)
The Context for ODI
Based on a meeting at the ALA Annual Conference in New Orleans on Sunday, June 26, 2011.
Recognition of the following trends and issues:
Emergence of library discovery services solutions
  Based on index of a wide range of content
  Commercial and open access
  Primary journal literature, e-books, and more
Adopted by thousands of libraries around the world, and impact millions of users
Agreements between content providers and discovery providers are ad hoc, not representative of all content, and opaque to customers.
ODI deliverables
Standard vocabulary
NISO Recommended Practice:
  Data format & transfer
  Communicating content rights
  Levels of indexing, content availability
  Linking to content
  Usage statistics
  Evaluate compliance
Inform and Promote Adoption
ODI Recommended Practices
Published June 25, 2014
NISO RP-19-2014
http://www.niso.org/workrooms/odi/
http://www.niso.org/workrooms/odi/publications/rp/rp-19-2014
Metadata elements for content providers to contribute to discovery service providers
Content providers disclose extent to which they participate with each discovery service
Discovery Service providers disclose what content is represented in index
Discovery services disclose any bias in search results or relevancy relative to business relationships
Discovery services provide use statistics
NISO Discovery White Paper
Commissioned by NISO Discovery to Delivery Topic Committee
First Draft Nov 2014
Revised based on feedback from D2D
Published Feb 20, 2015
Launched at ER&L
NISO Discovery Paper Outline
General Background
Integration between Discovery Services and Management Systems
Linked Data
Gap Analysis
Opportunities for Future Enhancements in Discovery
Discovery Beyond Library-provided Interfaces
Open Discovery Initiative: recommendations for Phase II
Longer term prospects
Library Discovery Present
Library Perspective
Strategic investments in subscriptions
Strategic investments in Discovery Solutions to provide access to their collections
Expect comprehensive representation of resources in discovery indexes
  Problem with access to resources not represented in index
  Encourage all publishers to participate and to lower thresholds of technical involvement and clarify the business rules associated with involvement
Need to be able to evaluate the coverage and performance of competing index-based discovery products
Value and Economy
Academic and research libraries spend far more of their budgets on content than resource management or discovery technologies
Discovery represents essential infrastructure to maximize impact of library collections
Resource management represents essential infrastructure to assemble and assess optimal collection to support library mission
Ever increasing costs of content exert pressure on budgets and demand more effective discovery and more efficient management
Role of the library in discovery
Acquisition and management of resources
Integrate content into campus enterprise infrastructure and information architecture
Provide general and specialized interfaces
Participate in production and publication
Participate more deeply in research process
Manage content on behalf of the institution in ways that optimize access and discovery.
Web-scale Index-based Discovery
Diagram components:
Search
Consolidated Index (pre-built harvesting and indexing)
Indexed sources: Digital Collections, Web Site Content, Institutional Repositories, …, E-Journals, Reference Sources, ILS Data, Aggregated Content packages, Open Access
Usage-generated Data
Customer Profile
Search Results
(2009-present)
Bento Box Discovery Model
Diagram components:
Search
Consolidated Index (pre-built harvesting and indexing)
Indexed sources: Digital Collections, Web Site Content, Institutional Repositories, E-Journals, ILS Data, Aggregated Content packages, Open Access
VuFind / Blacklight
Search Results
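The bento box idea can be sketched in a few lines: one query goes to several sources, and each source keeps its own compartment instead of being merged into a single ranked list. The search functions below are hypothetical stand-ins, not VuFind or Blacklight APIs.

```python
def bento_search(query, sources, per_box=3):
    """Run one query against every source and keep each result set separate."""
    boxes = {}
    for label, search_fn in sources.items():
        # Only the top few hits are shown in each compartment.
        boxes[label] = search_fn(query)[:per_box]
    return boxes

# Hypothetical stand-ins for a catalog search and a central article index.
def search_catalog(query):
    return [f"Book about {query}"]

def search_articles(query):
    return [f"Article on {query} #{n}" for n in range(1, 6)]

boxes = bento_search(
    "linked data",
    {"Catalog": search_catalog, "Articles": search_articles},
)
for label, hits in boxes.items():
    print(label, hits)
```

Because the boxes are never interleaved, this layout sidesteps the problem of ranking books against articles in one list, at the cost of asking the user to scan several short lists.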
State of Discovery indexes
Very strong coverage of primary publishers of scholarly materials
  Especially English and other Western languages
Weaker coverage of scholarly content in other international regions
  Asian languages, Arabic, etc.
Mixed coverage of A&I resources
Mixed coverage of non-textual resources
Some Key Areas for Publishers
1. Expose content appropriately
2. Trust that access to material will be controlled consistent with subscription terms
3. “Fair” Linking
4. Materials not disadvantaged or underrepresented in library discovery implementations
5. Usage reporting
Representation of A&I
Important to understand how a discovery service incorporates A&I resources
  Does it receive content from the A&I provider directly and make use of value-added terminology?
  If not: citations or full-text indexing of some portion of the titles represented in the A&I product
  NOT the same, and possibly misleading
A&I Content in Discovery Services
What is the place for A&I services in the discovery ecosystem?
Are there technology solutions capable of substituting for A&I content?
  Specialized and scoped search methodologies
  Clustering, term extraction, etc.?
Specialized vocabulary and other metadata make positive contributions to the discovery process
Researchers value A&I tools
ODI Standing Committee
Libraries
Publishers
Service Providers
Marshall Breeding, Independent Consultant
Laura Morse, Harvard University
Jason Price, SCELC
Ken Varnum, University of Michigan
Dave Whisenant, Florida Virtual Campus
Lettie Conrad, SAGE Publications
Michael McFarland, Credo Reference
Jill O’Neill, NFAIS
Elise Sassone, Springer
Aaron Wood, Ingram Content Group
Julie Zhu, IEEE
Scott Bernier, EBSCO Information Services
Steven Guttman, ProQuest
Rachel Kessler, Ex Libris
John McCullough, OCLC
ODI Standing Committee
The Open Discovery Initiative Standing Committee was formed following approval of the Recommended Practice published by NISO on June 25, 2014
We are charged with the following tasks:
• Promotion and education of ODI Recommended Practice for all stakeholders
• Provide support for content providers and discovery service providers during adoption and completion of conformance checklists
• Provide a forum for ongoing discussion related to all aspects of discovery platforms for all stakeholders
• Consider next steps for items deemed out of scope from the original ODI Work Group Recommended Practice
• Identify emerging needs in the open discovery space and determine appropriate courses of action
• Make recommendations to the D2D topic committee on further work items required to fulfill the goals of the Open Discovery Initiative
Current issues and areas of development
Challenge for Relevancy
Technically feasible to index hundreds of millions or billions of records through Lucene or Solr
Difficult to order records in ways that make sense
Expectation that relevancy be neutral relative to content source or publisher
Many fairly equivalent candidates returned for any given query
Must rely on use-based and social factors to improve relevancy rankings
Relevancy
Ever-improving, yet flaws remain
Increased use of usage data and personalized context to identify and order search results
State of the art improving via more sophisticated search and retrieval technology, increased use of aggregated contextualized data, and other factors
Socially-powered discovery
Leverage use data to increase effectiveness of discovery
Usage data can identify important or popular materials to inform relevancy engines
Identify related materials that may not otherwise be uncovered through keyword matching
Be careful to avoid introducing bias loops
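A minimal sketch of how usage data might inform relevancy while limiting feedback loops follows. The logarithmic dampening, the `usage_weight` value, and the sample scores are all illustrative assumptions, not the method of any particular discovery product.

```python
import math

def rerank(results, usage_counts, usage_weight=0.05):
    """Re-order (record_id, text_score) pairs with a dampened usage boost."""
    def score(item):
        record_id, text_score = item
        # log1p flattens very large counts, so heavy past use adds only a
        # modest boost and cannot swamp the text match on its own.
        boost = math.log1p(usage_counts.get(record_id, 0))
        return text_score + usage_weight * boost
    return sorted(results, key=score, reverse=True)

# Hypothetical keyword-match scores and usage counts.
results = [("a", 1.00), ("b", 0.95), ("c", 0.50)]
usage = {"b": 50, "c": 10000}

ranked = rerank(results, usage)
print([record_id for record_id, _ in ranked])  # ['b', 'a', 'c']
```

Here record b, a close text match with moderate use, moves ahead of a, while c stays last despite very heavy use. Without dampening, items that are already popular would keep rising simply because they are shown first, which is the bias loop the slide warns about.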
Externalizing functionality
Provide tools and widgets in course management platforms
Reading list management
Improving presentation via mobile devices
Open access content
Only a minority of scholarly resources available through open access licenses
Difficult to identify open access versions available
Users are often presented with proprietary content even when an open access version is also available
Interoperability of Discovery Services and Management Platforms
Discovery and Management solutions offered as matched sets
  Ex Libris: Primo / Alma
  ProQuest: Summon / Intota
  OCLC: WorldCat Discovery Service / WorldShare Platform
Independent Discovery and Management
  Kuali OLE: no discovery component
  EBSCO Discovery Service: works with any resource management system
Both product categories depend on an ecosystem of interrelated knowledge bases
APIs exposed to mix and match, but are efficiencies and synergies lost?
Recommendation to explore expectations regarding interoperability between these two product categories
Discovery Service Installations
Product 2007 2008 2009 2010 2011 2012 2013 2014 Installed
EBSCO EDS 1774 2634 8246
Primo 12 37 53 506 111 101 98 88 1528
AquaBrowser 55 339 64 69 74 58 81 6 89
Encore 72 72 109 56 72 36 346
BiblioCommons 41 ~200
Summon 50 164 214 158 238 195 697
Enterprise 16 75 100 102 123 150 538
Infor Iguana 18 74
Axiell Arena 61 57 33 35 95 404
Gap Analysis
Many resources still not addressed in central indexes
  Especially A&I products
Better coverage of open access materials
Better support for internationalization and multilingual search and retrieval
Improved capabilities for precise search, known items, browsing
Improved and more transparent relevancy rankings
Non-textual content and retrieval mechanisms
Better integration with learning management systems
Opportunities for Enhancements in Discovery
Improved delivery of APIs
  More coherent ecosystem of APIs among discovery services and with resource management systems
Social features and scholarly collaboration
Address research data
Special Collections and archival materials: hierarchical discovery and browsing
Expanded Analytics and Altmetrics
Beyond Index-based Discovery
Library Discovery Future
The future of Resource Discovery
More comprehensive discovery indexes
Stronger technologies for search and retrieval
Discovery beyond library-provided interfaces
Linked Data to supplement discovery indexes
Universal participation
Barriers to participation soften as mutual interest prevails over competitive conditions
Advantage to content providers to maximize exposure of resources
Discovery providers gain value in functionality as metadata becomes increasingly commoditized
Essential to preserve value of indexing and abstracting services
Content providers see discovery as an essential channel for distribution
More Distributed Discovery
Address the reality that discovery takes place outside of library provided interfaces
Optimized exposure in the ecosystem of search engines and social networks
Not concentrated on the library web site
Expression of discovery services via other campus tools and portals and beyond
Multi-layered discovery
Native interfaces of specialized content services
Disciplinary aggregations
General library discovery tools
Global Internet-based discovery
Discovery beyond Library Interfaces
Improved performance of library content through Google Scholar
  Same expectations for transparency?
Better exposure of library-oriented content
  Schema.org or other microdata formats
Better exposure of scholarly resources
  Open access & proprietary
Embedded tools in other campus interfaces
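As one illustration of exposing library-oriented content with Schema.org vocabulary, the sketch below builds a description of a book and wraps it for embedding in a web page. It uses JSON-LD rather than microdata attributes, and the title, author, and ISBN are placeholder values; the function name `book_to_jsonld` is hypothetical.

```python
import json

def book_to_jsonld(title, author, isbn, year):
    """Describe a book using the schema.org Book type, serialized as JSON-LD."""
    return {
        "@context": "https://schema.org",
        "@type": "Book",
        "name": title,
        "author": {"@type": "Person", "name": author},
        "isbn": isbn,
        "datePublished": str(year),
    }

# Placeholder bibliographic values, for illustration only.
record = book_to_jsonld("An Example Title", "A. Author", "9780000000002", 2015)

# Wrap the description in a script element that a catalog page could include.
markup = (
    '<script type="application/ld+json">'
    + json.dumps(record, indent=2)
    + "</script>"
)
print(markup)
```

A catalog record page carrying markup of this kind gives general search engines structured data to work with, which is the sort of exposure beyond library interfaces that the slide describes.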
Part of the General Internet Infrastructure
Scholarly content will be promoted via similar mechanisms as commercial content
Additional levels of infrastructure to protect privacy
Resource management and/or discovery tools expose content items as open linked data
Library opts out of Discovery
Utrecht University Library
Decision to not implement a discovery service but to rely entirely on Google Scholar and other general and scholarly search engines
http://www.uu.nl/en/university-library/searching-for-literature/searching-for-articles-books-theses
Kortekaas , Simone. “Thinking the unthinkable: a library without a catalogue — Reconsidering the future of discovery tools for Utrecht University library.” LIBER General Annual Conference 2012
Linked Data
Major trend toward information systems based on linked data
Many projects now based on linked data
Area of peak interest for Library of Congress, OCLC, etc.
  BIBFRAME
Potential to transform how libraries approach discovery
Likely interim hybrid models: central indexes + Linked Data
Current opportunities in making library content more discoverable
Library adoption of Linked data architecture
Not yet a fully operational method for library-oriented content
  Increasing representation of bibliographic resources
  BIBFRAME stands to make great impact
Universe of scholarly resources not well represented
Will current expectations for content providers to make metadata or full text available for discovery expand to exposure as open linked data?
Hybrid models
Can index-based search tools be improved through Linked Data?
  Browse to related resources
  Add additional hierarchies of structure to search results
Will linked data models prevail?
  Possibility that open linked data may eventually supplant index-based products?
  Index technology supplements fundamental architecture based on linked data
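A hybrid model might look roughly like the sketch below: a keyword index supplies the initial hits, and a small set of linked data statements is then used to browse to related resources. The triples, identifiers, and predicate names are hypothetical and do not follow BIBFRAME or any published vocabulary.

```python
# Hypothetical (subject, predicate, object) statements about three works.
triples = [
    ("work:1", "hasAuthor", "person:10"),
    ("work:2", "hasAuthor", "person:10"),
    ("work:3", "hasAuthor", "person:11"),
    ("work:1", "hasSubject", "topic:discovery"),
    ("work:3", "hasSubject", "topic:discovery"),
]

def related_works(work_id, triples):
    """Find other works that share an author or a subject with work_id."""
    shared = {(p, o) for s, p, o in triples if s == work_id}
    return sorted(
        {s for s, p, o in triples if s != work_id and (p, o) in shared}
    )

# Pretend this list came back from a conventional keyword index.
index_hits = ["work:1"]

enriched = {hit: related_works(hit, triples) for hit in index_hits}
print(enriched)  # {'work:1': ['work:2', 'work:3']}
```

Neither related work needs to match the user's keywords: work:2 is reached through a shared author and work:3 through a shared subject, which is the kind of connection an index alone would not surface.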
Possibilities for Open Access discovery index
Open source tools exist for discovery interfaces:
  VuFind
  Blacklight
No open access discovery indexes
High threshold of expense and difficulty to build index
  Platform costs
  Software development
  Publisher relations
  Billions of content items to index and maintain
Current model requires massive resources
Threshold of resources required currently too high for open access central discovery index
Assessment might change if options narrowed
Opportunities to lower barriers to entry?
More open model more likely to come through linked data discovery model
Commoditization of Central Indexes
Knowledgebases of e-resource coverage commoditized via KBART and other factors
Central index content likewise will eventually become commoditized
Limited number of discovery service platforms?
Value found in the synergies between library resource management and optimized discovery and delivery
Value in open scholarship
Hopefully the future will be based on open access to scholarly research
Mandates from funding organizations will transform scholarly communications
Current discovery models based on preponderance of proprietary content
Future discovery must assume dominance of open access publishing and underlying data sets
Future of discovery service products
Remain one of the essential components of library technology infrastructure
Loosely or tightly tied to resource management
Increased sophistication in direct discovery and delivery functionality
Increased expectation to syndicate content to local and global discovery context
Investments made in creation of discovery service platforms will provide leverage into each next phase of scholarly information infrastructure
Scholarly publishing arena may change dramatically in next decade.
Open Discovery Initiative: recommendations for Phase II
Address A&I concerns to improve participation
Data exchange mechanisms: metadata + content
Lower threshold of participation
Interoperability with resource management systems
Potential Opportunities for NISO
Convene a second phase of the Open Discovery Initiative
Launch research project on open linked data in scholarly publishing sector to facilitate new models of discovery and access
Expand scope of Altmetrics group to address their integration in discovery service ecosystem
Possible new workgroup to explore recommended practices for improving discoverability of resources via open linked data, schema.org, and other mechanisms.
Longer term prospects
Opportunities for discovery directly tied to realities in scholarly publishing
Dominance of proprietary publishing requires index-based discovery
A future shift to open access and exposure as open linked data will enable additional models of discovery
An ongoing conversation
Now at a critical point for discovery
Current products evolve
  Reaching limits of the prevailing architecture?
Current set of products and services an interim step
Important for stakeholders to engage in defining the future of library resource discovery
Future products must address expected changes in scholarly publishing, library priorities, and institutional strategies.
Questions and discussion