DESCRIPTION
In these trying financial times, libraries and cultural heritage institutions in general face difficult resource allocation decisions: for example, do you spend hundreds of thousands of dollars on proprietary software, or do you hire a few good software developers and library professionals who can lead the design of applications and platforms specific to your needs? For some, leveraging open source software and the communities that form around it helps solve some of these problems.

The University of Virginia Library is a key partner in the collaborative and open source project known as "Hydra"; the goal of the Hydra Project is to create a comprehensive set of open source repository workflow tools that allow librarians and scholars to manage, describe, deliver, reuse, and preserve digital information. U.Va.'s commitment to the project includes the definition of metadata standards, the creation of search and discovery interfaces, and the development and implementation of multiple Hydra "heads" such as the interface and workflow in use for the U.Va. institutional repository. U.Va. is also a key contributor to the Blacklight project; Blacklight is an open source discovery interface, or "next-generation catalog", and can be seen powering the newly updated U.Va. OPAC, Virgo.

This talk will provide a brief overview of both the Hydra and Blacklight projects and the tools under development, describe some of the processes and challenges for development teams working within a library setting, and show some of the ways that open source software works (and where it gets tricky) within this setting.
Developing and Deploying Open Source Tools in the Library: Hydra, Blacklight, and Beyond
Julie Meloni, University of Virginia Library
NYPL Brown Bag Lunch Talk // 26 August [email protected] // @jcmeloni
The Million-Dollar Question
Do you spend hundreds of thousands of dollars on proprietary software (licensing, maintenance contracts, support contracts, etc.) that performs one set of tasks, or do you hire a few good software developers and library professionals who can lead the design of applications and platforms specific to your needs?
The Answer …
• The people cost more.
• The people can also do more, especially when committed to open source wherever possible.
• In turn, other institutions benefit as well.
• This approach will not work for every institution.
• This approach does work for University of Virginia Library.
Problems with Proprietary Software
• Expensive in terms of
  • licensing
  • hardware
  • maintenance
• Vendor lock-in
  • dependencies make switching costs too great
Problems with Open Source Software
• Expensive in terms of
  • Human resources (learning, collaboration, and commitment to a community take a lot of time!)
• No vendor support
  • Reliance on internal resources and a community that may have different goals than your own.
Where Does that Leave Us?
• OSS is no panacea
• Know what you're getting into
• Philosophies are difficult to implement wholesale
• Implementations must serve the greater goals of the library
• The process of testing, implementing, and testing again, and working with a community to achieve goals, takes time but is worth the effort for stability and scalability.
OSS at UVa Library
• Fedora (Flexible Extensible Digital Object Repository Architecture): a solid, modular architecture on which to build repositories, archives, and related systems
  • 2001 Mellon grant to Cornell & UVa enabled development
• Blacklight: creating, implementing, and maintaining an open source OPAC (& related collaborations)
  • Developed originally within the Scholars’ Lab and UVa Library as a skunkworks project
• Embracing the Hydra philosophy that
  • no single application can meet the full range of needs
  • no single institution can handle development and maintenance
  • what is required is a common repository infrastructure; flexible, atomic data models; modular services and configurable components
Up Next…
• The Hydra Project: what we do, what we get out of it, and what we contribute back to the community
• How using an open source discovery interface has allowed us to quickly address the needs of our institution and its patrons
• How working with open source has allowed more Library staff outside of the development team to have a say in the design, development, and deployment of our products
The Hydra Project
• Collaborative effort between University of Virginia, Stanford University, University of Hull, Fedora Commons/DuraSpace, and MediaShelf.
• Working group created in 2008 to fill a need to develop an end-to-end, flexible, extensible, workflow-driven Fedora application kit.
  • Technical Framework
  • Community Framework
• No direct funding of the Hydra Project itself.
Hydra Project Assumption #1
• No single application can meet the full range of digital asset management needs, but there are shared primitive functions:
  • Deposit simple or multipart objects, singly or in bulk
  • Manage object’s content, metadata, and permissions
  • Search both full text and fielded search in support of user discovery and administration
  • Browse objects sequentially by collection, attribute, or ad-hoc filtering
  • Delivery of objects for viewing, downloading, and dissemination through user and machine interfaces
Hydra Project Response
One body, many heads.
Hydra is designed to support tailored applications and workflows for different content types, contexts, and interactions by building from:
• a common repository infrastructure,
• flexible, atomic data models, and
• modular services and configurable components
Hydra Technical Framework
• Fedora as repository layer for persisting and managing digital objects.
• An abstraction layer sits between Fedora and the Hydra heads, insulating applications from changes in the repository structure
• ActiveFedora is a Ruby gem for creating and managing objects in Fedora
• Solr indexes provide fast access to information
• Blacklight for faceted searching, browsing and tailored views on objects
• The Hydra-Head plugin itself: a Ruby on Rails library that works with ActiveFedora to provide create, update and delete actions against objects in the repository
Hydra Project Assumption #2
• No single institution or provider can resource the development or maintenance of a full set of solutions for the same needs.
  • Problems with proprietary software include expense in terms of licensing, hardware, and maintenance, plus potential vendor lock-in
  • Problems with open source software include the expense of human resources and a lack of vendor support, which forces reliance on internal resources and a community that may have different goals than your own.
Hydra Project Response
“If you want to go fast, go alone. If you want to go far, go together.”
• Hydra Steering Group
  • Collaborative roadmapping, resource allocation and coordination, governance of the technology core
• Hydra Managers
  • Shape and fund work, commission “heads”, create functional requirements and specifications, UI/UX design, documentation, training, evangelism
• Hydra Developers
  • Define technical architecture, commit code, integration and release, testing, testing, testing.
Hydra Community Framework
• Conceived and executed as a collaborative, open source effort from the start
• An open architecture, with many contributors to the core
• Collaboratively built “solution bundles” that can be adapted and modified to suit local needs
• Hydra heads as reference implementations
• Ultimate objective of the Hydra Project is to effectively intertwine its technical and community threads of development, producing a community-sourced, sustainable application framework.
http://projecthydra.org/
Great, But…
WHAT DID YOU BUILD???
We built Libra: an unmediated, self-deposit, institutional repository for scholarly material.
http://libra.virginia.edu/
Why Did We Need Libra?
• In February 2010, the University of Virginia Faculty Senate passed an Open Access resolution:
  • All faculty encouraged to “reserve a nonexclusive, irrevocable, non-commercial, global license to exercise any and all rights under copyright relating to each of her or his scholarly articles in any medium, and to authorize others to do the same.”
• NSF requirements for preservation and access of data used in or resulting from researchers’ grant-funded projects.
• Discovery, access, and preservation of our students’ electronic theses and dissertations.
How Did We Get Libra?
• Given institutional commitment to these University-wide problems, resources were allocated from both the University Library and Information Technology & Communication.
• UVa was already committed to the Hydra Project, and to assisting in the development of an end-to-end, flexible, extensible, workflow-driven Fedora application kit.
  • The solution to our problems clearly required such an application toolkit…good thing the Hydra Project had one in development.
  • Hydra offerings ARE NOT turnkey institutional repository solutions, but frameworks for depositing, managing, searching, browsing, and delivering digital content.
  • We built on that.
Libra Development Principles
• Our solution should:
  • Be unmediated
  • Provide sustainable access to and discovery of scholarly materials
  • Enable collection of depositor-designated metadata
  • Manage depositor-designated access permissions
• Work with internal stakeholders to gather requirements and user stories, as this is their repository.
• Work with Hydra partners to move the common code base forward while still developing our own application in our own branch.
The Result: A Highly Customized Application
http://libra.virginia.edu
Works With Multiple Item Types
All Discoverable
…and detailed
Sustainable Access to Scholarly Work
Open Source in Practice
• Blacklight is an open source discovery interface that can be used as a front end for a digital repository, or as a single-search interface to aggregate digital content that would otherwise be siloed.
  • customizable and removable for ultimate flexibility
  • many core developers part of the Hydra Project (Bess Sadler, now at Stanford, Bob Haschart at UVa, etc.)
  • Continued development by a core group of committers governed by developer norms.
http://projectblacklight.org/
Basic Blacklight
Customized Blacklight
Even More Customizations
Good, Broad, Requirements Gathering
• Functional requirements define the functionality of the system, in terms of inputs, behaviors, outputs.
  • What is the system supposed to accomplish?
• Functional requirements come from stakeholders (users), not (necessarily) developers.
  • stakeholder request -> feature -> use case -> business rule
• Developers can/should/will help stakeholders work through functional requirements.
  • Functional requirements should be written in a non-technical way.
Non-Technical Folk Write Epics and Stories
• An epic is a long story that can be broken into smaller stories.
• It is a narrative; it describes interactions between people and a system
  • WHO the actors are
  • WHAT the actors are trying to accomplish
  • The OUTPUT at the end
• Narrative should:
  • Be chronological
  • Be complete (the who, what, AND the why)
  • NOT reference specific software or other tools
  • NOT describe a user interface
Non-Technical Folk Write Epics and Stories
• Stories are the pieces of an epic that begin to get to the heart of the matter.
• Still written in non-technical language, but move toward a technical structure.
• Given/When/Then scenarios
  • GIVEN the system is in a known state WHEN an action is performed THEN these outcomes should exist
  • EXAMPLE:
    • GIVEN one thing
    • AND another thing
    • AND yet another thing
    • WHEN I open my eyes
    • THEN I see something
    • BUT I don't see something else
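The Given/When/Then structure is regular enough that a program can read it. As a rough illustration only, the plain-Ruby sketch below turns scenario lines into structured steps, folding each AND/BUT line into the keyword it continues. `Step` and `parse_scenario` are hypothetical names invented for this example; they are not part of Hydra, Blacklight, or any testing library, and the talk does not describe the team's tooling at this level.

```ruby
# Hypothetical sketch: parse Given/When/Then lines into structured steps.
# Step and parse_scenario are illustrative names, not a real library API.
Step = Struct.new(:keyword, :text)

KEYWORDS = %w[GIVEN WHEN THEN].freeze
CONTINUATIONS = %w[AND BUT].freeze

def parse_scenario(lines)
  current = nil # the most recent GIVEN/WHEN/THEN keyword seen
  lines.each_with_object([]) do |line, steps|
    word, rest = line.strip.split(/\s+/, 2)
    next if word.nil? # skip blank lines

    word = word.upcase
    if KEYWORDS.include?(word)
      current = word
    elsif !CONTINUATIONS.include?(word) || current.nil?
      # Either an unknown keyword, or an AND/BUT with nothing to continue.
      raise ArgumentError, "unexpected step: #{line.inspect}"
    end
    # AND/BUT steps inherit the keyword they continue.
    steps << Step.new(current, rest.to_s)
  end
end

scenario = [
  "GIVEN one thing",
  "AND another thing",
  "WHEN I open my eyes",
  "THEN I see something",
  "BUT I don't see something else"
]

steps = parse_scenario(scenario)
steps.each { |step| puts "#{step.keyword.ljust(5)} #{step.text}" }
```

Treating AND and BUT as continuations keeps the non-technical wording intact while giving developers three unambiguous phases (setup, action, expected outcome) to map onto code.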
Actual Story Example
• Scenario: User attempting to add an object
  • GIVEN I am logged in
    • AND I have selected the “add” form
    • AND I am attempting to upload a file
  • WHEN I invoke the file upload button
  • THEN validate file type on client side
    • AND return alert message if not valid
    • AND continue if valid
  • THEN validate file type on server side
    • AND return alert message if not valid
    • AND finish process if valid
Writing Code From Stories
• Developers involved at the story level
  • Writing stories
  • Validating stories
  • Throwing rocks at stories
  • Getting at the real nitty-gritty of the task request
• Moving from story to actual code
  • Stories written in step definitions become Ruby code
  • Tests are part of this code
  • Code is tested from the time it is written
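To make the move from story to code concrete, here is a minimal plain-Ruby sketch of the server-side half of the file upload story, with each THEN outcome expressed as a check that runs alongside the code. It is an illustration under stated assumptions, not Libra's implementation: the allowed file types, `validate_file_type`, `UploadResult`, and the `check` helper are all hypothetical, and a real project would more likely use a step-definition framework than a hand-rolled helper.

```ruby
# Hypothetical allow-list: the talk does not say which file types Libra accepts.
ALLOWED_EXTENSIONS = %w[.pdf .doc .docx .txt].freeze

UploadResult = Struct.new(:valid, :message)

# Server-side step from the story: "THEN validate file type on server side".
def validate_file_type(filename)
  ext = File.extname(filename.to_s).downcase
  if ALLOWED_EXTENSIONS.include?(ext)
    UploadResult.new(true, nil)
  else
    # "AND return alert message if not valid"
    UploadResult.new(false, "File type not allowed: #{filename}")
  end
end

# Tiny stand-in for a step-definition framework: each story outcome
# becomes an executable check that fails loudly if the code regresses.
def check(description)
  raise "FAILED: #{description}" unless yield

  puts "ok - #{description}"
end

check("THEN a valid file finishes the process") do
  validate_file_type("thesis.pdf").valid
end

check("THEN an invalid file returns an alert message") do
  result = validate_file_type("script.exe")
  !result.valid && result.message.include?("script.exe")
end
```

Because the checks are written from the story's own wording, a stakeholder who edits the story can see which check, and therefore which code, the change touches.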
Never Stop Communicating
• Watch out for the butterfly effect…
  • When one change in a complex system has large effects elsewhere, through a sensitive dependence on initial conditions.
• Epics and stories do not have to be golden, but changes should be carefully considered
  • Developers illuminate the potential effects of changes
  • The cycle of epic, story, coding begins again
    • This includes any story that touches the changed story
We Never Think We’re Finished
• Each release comes with a list of known issues and potential areas of improvement
• We go through the cycle of epic, story, coding/testing, user testing, story editing, coding/testing (etc.) again and again.
• Products are organic and grow upward and outward
  • …but if you want to lop off part of that tree, expect there will be systematic changes
  • developers are there to ensure the tree doesn’t fall on your house
We Never Ignore the User
• Work closely with the UX team to ensure that wireframes and prototypes are put in front of users before we take action.
  • Patrons vet the stakeholder requests just like developers do, but from a user’s perspective rather than a technical one.
  • In some notable instances, patron desires have differed tremendously from what stakeholders believe they want.
• The story of integrating a discovery service: how and why we didn’t blend results.
  • User testing produced clear requests, different from librarian assumptions.
  • Open source flexibility allowed us to go from requirements gathering to user testing to requirements changing to development and deployment in four months.
We Will…
• NEVER return to using proprietary software and solutions (when we can help it).
• ALWAYS try to find an open source solution, or build one if it doesn’t exist.
• SHARE everything we possibly and legally can, with anyone who wants to use it.
• HOPE that any of you considering the use of open source versus proprietary software will consider it and ask questions…