RepoMMan: using Web Services and BPEL to facilitate workflow interaction with a digital repository
Richard Green
Overview
• RepoMMan Project overview
• User and University needs
• A solution: BPEL and Web Services
• Example workflows: deposit and metadata
• Problems? What problems?
• Future work
JISC Workflows activity and implementation meeting
Aston Business School
Tuesday 13th February 2007
RepoMMan project
Repository, Metadata and Management project
JISC-funded for two years to end May 2007...
...to develop a BPEL-based, standards-compliant, Web Services based, workflow tool for Fedora
Closely aligned with the University’s commitment to deploy an institutional repository
RepoMMan
Two strands:
• research – user needs, documentation etc (BPEL, surveys, Beginner’s guide etc)
• technical – development
Surface tool in University portal and/or Sakai C&LE
Hull’s vision for a DR
Hull’s vision for an Institutional Repository is an extremely broad one
Conventional view was exposure of completed objects
Hull’s view encompasses storage, access, management and preservation of a wide range of file types from concept to completion
User needs
Survey and interviews revealed very wide range of file types potentially to deal with
Users need:
• Storage (backed up)
• Access (from anywhere)
• Management (sharing, locking, versioning etc)
Some want preservation
Tool must fit users’ expectations and environment
Assist/improve doing what they already do
University needs
Flexible: wide range of content, some public, some private, appropriate UIs
Standards-based: Web Services, standards (BPEL, JSR168 etc)
Scalable: initially 2500 potential depositors
Open source software
Effective search & discovery: need good metadata
The repository grand plan
Summary
The repository will be a working environment as well as a showcase
Aim is to provide storage, access, management and potential preservation painlessly
Development based on user needs
Standards compliant, Web Service approach
Solution
Toolset to manage repository workflows for users
BPEL - Business Process Execution Language (Active Endpoints Open Source version)
Web Services
Fedora repository software
• SOAP, REST
• scales to millions of flexible objects (NSDL = 6m+)
BUT
Fedora is not an ‘out-of-the-box’ solution
Provides the repository functionality for YOU to use flexibly
Basic depositor workflows
Put object into repository including some sort of structuring
Get object from repository
Delete including at version level
Add metadata
Publish = promote to public space
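The five workflows above can be sketched as a small in-memory model. This is only an illustration under stated assumptions: the class and method names (`PrivateSpace`, `RepoObject`, `put`, `get`, and so on) are hypothetical and are not part of Fedora or the RepoMMan tool, and a plain dictionary stands in for the repository.

```python
from dataclasses import dataclass, field


@dataclass
class RepoObject:
    """A repository object: an ordered list of versions plus metadata."""
    versions: list = field(default_factory=list)   # oldest first
    metadata: dict = field(default_factory=dict)
    public: bool = False


class PrivateSpace:
    """Hypothetical stand-in for one user's private repository space."""

    def __init__(self):
        # collection name -> {object name -> RepoObject}
        self.collections = {}

    def put(self, collection, name, content):
        """Put: store content, adding a new version if the object exists."""
        objects = self.collections.setdefault(collection, {})
        obj = objects.setdefault(name, RepoObject())
        obj.versions.append(content)
        return len(obj.versions)          # 1-based number of the new version

    def get(self, collection, name, version=None):
        """Get: return the latest version, or a specific 1-based version."""
        versions = self.collections[collection][name].versions
        return versions[-1] if version is None else versions[version - 1]

    def delete(self, collection, name, version=None):
        """Delete: remove one version, or the whole object if none is given."""
        objects = self.collections[collection]
        if version is None:
            del objects[name]
        else:
            del objects[name].versions[version - 1]

    def add_metadata(self, collection, name, **fields):
        """Add metadata: merge the given fields into the object's record."""
        self.collections[collection][name].metadata.update(fields)

    def publish(self, collection, name):
        """Publish: promote the object to the public space."""
        self.collections[collection][name].public = True
```

In this sketch the "structuring" of a deposit is reduced to choosing a collection name; a real deployment would map each call onto repository Web Services.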
The RepoMMan deposit tool
‘Putting’ and ‘getting’ a file
Left hand side of screen gives browse capability on user’s hard drive
Right hand side of screen shows the user’s private repository space like a directory structure
(‘Folders’ are in fact repository collections)
Folders can be expanded and new folders created
Objects can be further expanded to show versions
Three-tier stack
Model View Controller layer providing user interface
BPEL orchestrating Web Services (Fedora and other) to move files and objects around
Fedora drawing on ID Management System and University Storage Area Network
UML diagram for simple deposit
Three-tier stack: simple deposit
‘Putting’ a file
Part of the BPEL process diagram (Active Endpoints visualisation software)
- switch depending on whether object already exists
- the left hand side branch creates a new object
- the right hand side modifies an existing one
- each of the globes with a ‘swirl’ round it is a Web Service call
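The switch described above can be expressed in a few lines. This is a minimal sketch, not the project's BPEL process: the function name `deposit` and the dictionary standing in for the repository are assumptions, and "modify" is modelled here simply as appending a new version.

```python
def deposit(repository, object_id, content):
    """Mirror the switch in the process: create or modify, as appropriate.

    `repository` is a plain dict standing in for Fedora (an assumption);
    in the real process each branch would be one or more Web Service calls.
    """
    if object_id not in repository:
        # Left-hand branch: the object does not exist yet, so create it.
        repository[object_id] = [content]
        return "created"
    # Right-hand branch: the object exists, so modify it with a new version.
    repository[object_id].append(content)
    return "modified"
```

Keeping the existence check in the orchestration layer means the caller makes the same request whether the file is new or a revision.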
BPEL and web services
BPEL can draw on any available Web Service
Access to an expanding range of data made available through Web Services at Hull (and, of course, elsewhere)
Which brings us to metadata!!!
Making public
Good metadata at the heart of effective search and discovery
When a user wants to ‘make an object public’ we will automatically add metadata
Potentially complex: BPEL to orchestrate
• Has object already got metadata?
• Derive metadata using tool(s)
• Extract appropriate parts
• Combine with contextual metadata
• Allow user to edit
• Create datastream(s) (DC, Rich metadata (UVa), preservation metadata, etc)
• Add to digital object
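The sequence of steps above could be orchestrated roughly as follows. This is a sketch under assumptions: every name in it (`make_public_metadata`, the dictionary layout of `obj`, the callables passed in) is hypothetical, only a single "DC" datastream is produced, and ordinary Python functions stand in for the Web Services that BPEL would call.

```python
def make_public_metadata(obj, derive_tools, wanted_fields, context, edit):
    """Walk through the 'make public' metadata steps for one object.

    obj           -- dict with a 'content' entry and an optional 'datastreams' dict
    derive_tools  -- callables taking the content and returning a dict of metadata
    wanted_fields -- names of the derived fields worth keeping
    context       -- contextual metadata (for example, about the author)
    edit          -- callable letting the user amend the record before saving
    """
    datastreams = obj.setdefault("datastreams", {})

    # 1. Has the object already got metadata? If so, start from it.
    record = dict(datastreams.get("DC", {}))

    # 2. Derive metadata using the tool(s).
    derived = {}
    for tool in derive_tools:
        derived.update(tool(obj["content"]))

    # 3. Extract the appropriate parts.
    extracted = {k: v for k, v in derived.items() if k in wanted_fields}

    # 4. Combine with contextual metadata; existing values are not overwritten.
    for source in (extracted, context):
        for key, value in source.items():
            record.setdefault(key, value)

    # 5. Allow the user to edit.
    record = edit(record)

    # 6. Create the datastream and add it to the digital object.
    datastreams["DC"] = record
    return obj
```

The choice that existing metadata wins over derived values is one possible policy, not something the project states.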
Metadata from web services
Much contextual metadata about the ‘author’ can be derived from University systems’ Web Services or context
Technical metadata can be derived by using tools such as JHOVE (as a Web Service)
Descriptive metadata about the content is the Holy Grail?
Data Fountains (UC Riverside & NSDL)
Metadata from context & web svcs
Contextual metadata from environment (LDAP, Portal, Sakai)
Technical metadata (JHOVE etc)
Additional descriptive metadata from content (Data Fountains etc)
Allow user to edit / tweak
Acknowledgements to the Arrow consortium for the design idea
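As an illustration of how the three metadata sources and the user's edits might end up in a single Dublin Core datastream, here is a minimal sketch. The function name `build_dc_datastream` and the precedence rule (later sources win, user edits last) are assumptions; the two namespace URIs are the standard Dublin Core and oai_dc ones.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
OAI_DC_NS = "http://www.openarchives.org/OAI/2.0/oai_dc/"


def build_dc_datastream(contextual, technical, descriptive, user_edits=None):
    """Merge the metadata sources and serialise them as an oai_dc record.

    Later sources win on a clash, so user edits always take precedence
    (an assumed policy for this sketch).
    """
    merged = {}
    for source in (contextual, technical, descriptive, user_edits or {}):
        merged.update(source)

    # Use the conventional prefixes when the record is written out.
    ET.register_namespace("dc", DC_NS)
    ET.register_namespace("oai_dc", OAI_DC_NS)

    root = ET.Element(f"{{{OAI_DC_NS}}}dc")
    for name in sorted(merged):
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = str(merged[name])
    return ET.tostring(root, encoding="unicode")
```

For example, passing `{"creator": "A. Author"}` as contextual metadata and `{"format": "application/pdf"}` as technical metadata yields a record with one `dc:creator` and one `dc:format` element.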
Problems? What problems?
Not a smooth journey!
Lessons:
• Don’t be a pioneer?
• Puppies don’t come free?
Active BPEL was/is fine
Fedora Web Services (v2.0, v2.1, v2.1.1) an issue
• Web Services were rpc/encoded
• BPEL and AXIS couldn’t always successfully validate responses
Fedora Web Services (v2.2) OK!
• Web Services now document/literal
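To show what the difference between the two styles looks like on the wire, the sketch below contrasts two hand-written SOAP 1.1 responses. The operation `getObjectResponse`, its `label` element and the `http://example.org/repository` namespace are invented for illustration and are not Fedora's actual API. In rpc/encoded messages the type information travels inline via SOAP encoding (`encodingStyle`, `xsi:type`), whereas a document/literal body is a plain XML document that can be validated against the schema in the WSDL.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"
XSI = "http://www.w3.org/2001/XMLSchema-instance"

# Hypothetical rpc/encoded response: types are declared inline.
RPC_ENCODED = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/"
                  xmlns:xsd="http://www.w3.org/2001/XMLSchema"
                  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
  <soapenv:Body>
    <ex:getObjectResponse xmlns:ex="http://example.org/repository"
        soapenv:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
      <label xsi:type="xsd:string">Chapter 1</label>
    </ex:getObjectResponse>
  </soapenv:Body>
</soapenv:Envelope>"""

# Hypothetical document/literal response: a schema-defined XML document.
DOCUMENT_LITERAL = """\
<soapenv:Envelope xmlns:soapenv="http://schemas.xmlsoap.org/soap/envelope/">
  <soapenv:Body>
    <ex:getObjectResponse xmlns:ex="http://example.org/repository">
      <ex:label>Chapter 1</ex:label>
    </ex:getObjectResponse>
  </soapenv:Body>
</soapenv:Envelope>"""


def is_soap_encoded(message):
    """Heuristic: True if any element carries encodingStyle or xsi:type."""
    root = ET.fromstring(message)
    markers = (f"{{{SOAP_ENV}}}encodingStyle", f"{{{XSI}}}type")
    return any(m in el.attrib for el in root.iter() for m in markers)
```

The check is only a heuristic, since `xsi:type` can legitimately appear in literal messages too; it is here to make the structural difference visible, not to diagnose the validation failures the project encountered.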
Next stages
Complete the RepoMMan deposit tool (!!!)
Internal testing
Trialling with users
Small-scale trials
Complete the RepoMMan metadata work
Post-project, develop further and deploy
Develop workflow tools for the admin side of the ‘make public’ process, records management, preservation...
In time switch to DROID (for MIME type)? Shibboleth?
Project website and contacts
RepoMMan: http://www.hull.ac.uk/esig/repomman
Contacts:
[email protected] (Manager)
[email protected] (Integration architect)
[email protected] (Director)
[email protected] (Software developer)