The Planets Interoperability Framework Rainer Schmidt AIT Austrian Institute of Technology [email protected] 1st DPIF Symposium, April 21-23, 2010, Dresden, Germany. Integrated Access to Preservation Tools

Page 1: The Planets Interoperability Framework

The Planets Interoperability Framework

Rainer Schmidt

AIT Austrian Institute of Technology

[email protected]

1st DPIF Symposium, April 21-23, 2010, Dresden, Germany.

Integrated Access to Preservation Tools

Page 2: The Planets Interoperability Framework

DPIF Symposium, April 21-23, 2010, Dresden

Outline

Overview of the Integrated Environment

Main Objectives and Architecture

Planets Preservation Services

Digital Objects and Metadata

Integrating Repositories

The Workflow Execution Engine (WEE)

Conclusions & Lessons Learned

Page 3: The Planets Interoperability Framework

Planets Project

“Permanent Long-term Access through NETworked Services”

Addresses the problem of digital preservation

driven by National Libraries and Archives

Project instrument: FP6 Integrated Project, 5th IST Call

Consortium: 16 organisations from 7 countries

Duration: 48 months, June 2006 – May 2010

Budget: 14 Million Euro

http://www.planets-project.eu/

Page 4: The Planets Interoperability Framework

The Planets Interoperability Framework

An integrated system for the development and evaluation of preservation strategies.

Uniform access mechanisms to a broad range of “commodity” tools, e.g. for characterization, migration, emulation.

Integration of existing repositories, data/metadata formats.

Specification, execution, recording of preservation workflows.

Integration with end-user applications for preservation planning and the evaluation of tools/strategies: the Planets Preservation Planning Tool and the Testbed.

Page 5: The Planets Interoperability Framework

Agents and Activities

[Diagram. Agents: Preservation Expert; IF Gateway Server; Digital Library/Repository; Preservation Services.]

[Interactions: <<create experiment>>; <<retrieve objects>>; <<apply object>>; <<migrate>>; <<characterize>>; <<compare>>.]

[Activities: Application Provisioning; Provenance; Experiment Repository; Data Model Mapping; Service Orchestration; Access Pres. Applications; Service Registration; Data Transfer; Deposit Result; User Management; Export Digital Objects.]

Page 6: The Planets Interoperability Framework

Service-Oriented Architecture

XML Web Services (SOAP, WSDL, WS-*)

Platform, Language, and Location Independence

Homogeneous interfaces for preservation activities, data management, workflow execution.

Remotely access repositories and data.

Discover and dynamically utilize tools in a workflow.

Supports distributed and cross-organizational deployments

Shared hardware, software, maintenance

Browser-based access to a large number of resources

Page 7: The Planets Interoperability Framework

Service Gateway Architecture

[Architecture diagram. Layers: User Applications; Portal Services; Application Execution and Data Services; Physical Resources, Computers, Networks.]

[Components: Preservation Planning Tool; Experimentation Testbed Application; Notification and Logging System; Workflow Execution UI; Workflow Execution and Monitoring; Experiment Data and Metadata Repository; Service and Tool Registry; Application Services; Execution Services; Data Access Services; Administration UI; Authentication and Authorization.]

Page 8: The Planets Interoperability Framework

Preservation Interfaces (the Verbs)

Define atomic preservation activities (level-one)

Concentrates on low-level concepts and actions

• Bit-stream operations, no data management

Designed to be light-weight and easy to implement

Independent from a specific tool, language, or content type

E.g. Characterize, Migrate, Compare, CreateView

>50 Tools wrapped/provided as Planets Services

Provides the basic abstractions for assembling workflows.
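
As a rough illustration of what such level-one verbs might look like, the Python sketch below defines two tool-independent interfaces that operate only on bit-streams. All names (`Migrate`, `Characterize`, `UppercaseMigrate`, `ByteCountCharacterize`, `run_migration`) and the format strings are hypothetical; they are not the Planets API, which is exposed as SOAP web services.

```python
from abc import ABC, abstractmethod


class Migrate(ABC):
    """Hypothetical level-one verb: convert a bit-stream between formats."""

    @abstractmethod
    def migrate(self, data: bytes, source_format: str, target_format: str) -> bytes:
        ...


class Characterize(ABC):
    """Hypothetical level-one verb: extract properties from a bit-stream."""

    @abstractmethod
    def characterize(self, data: bytes) -> dict:
        ...


class UppercaseMigrate(Migrate):
    """Toy tool wrapper: 'migrates' ASCII text by upper-casing it."""

    def migrate(self, data, source_format, target_format):
        return data.upper()


class ByteCountCharacterize(Characterize):
    """Toy tool wrapper: reports the size of the bit-stream."""

    def characterize(self, data):
        return {"size_bytes": len(data)}


def run_migration(tool: Migrate, checker: Characterize, data: bytes) -> dict:
    # The caller depends only on the verb interfaces, not on a concrete tool.
    # The format identifiers are made up for this example.
    result = tool.migrate(data, "text/plain", "text/x-upper")
    return {"output": result, "properties": checker.characterize(result)}
```

Because the caller sees only the two interfaces, a different wrapped tool can be substituted without changing the workflow code, which is the point of keeping the verbs small and free of data management.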

Page 10: The Planets Interoperability Framework

Digital Objects (the Nouns)

Generic data abstraction for modeling digital entities.

Encapsulates content and metadata

Consumed and/or produced by Planets preservation services

Provides a minimal and generic model for data management

Stored in Object Repository

Does not prescribe a serialization schema

May be created from a DC/ORE RDF record and serialized using METS/PREMIS schemas.

Page 11: The Planets Interoperability Framework

Digital Objects (the Nouns)

[Diagram: structure of a Digital Object]

Content: embedded data or repository URL

Properties: Creator, Title, Description, Format, …

Events: Type, Time, Agent, Service, Result, …

Metadata: tagged, uninterpreted metadata chunks

contains_object, fragment: relationships (possibly associated with an event)
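
The parts of the model named on this slide could be written down roughly as follows. This is a minimal sketch using Python dataclasses; the class and field names are assumptions chosen to mirror the slide labels and are not the actual Planets classes.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Event:
    # Fields follow the slide: Type, Time, Agent, Service, Result.
    type: str
    time: str
    agent: str
    service: str
    result: str


@dataclass
class Content:
    # Either embedded bytes or a reference (repository URL) is expected.
    data: Optional[bytes] = None
    url: Optional[str] = None

    def is_by_reference(self) -> bool:
        return self.data is None and self.url is not None


@dataclass
class DigitalObject:
    title: str
    content: Content
    format: Optional[str] = None
    events: list = field(default_factory=list)
    metadata: dict = field(default_factory=dict)   # tag -> uninterpreted chunk
    contains: list = field(default_factory=list)   # child DigitalObject instances
```

An object can then carry its content by reference and accumulate events as services are applied, for example `DigitalObject("scan-001", Content(url="http://repository.example/scan-001"))`, where the URL is a placeholder.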

Page 12: The Planets Interoperability Framework

Digital Object Managers

Individual adapters for retrieving (& storing) Planets DOs

Provide access to existing repositories.

Map metadata records to Planets DOs

Ingest digital objects to Planets data repositories

Current implementations retrieve OAI-PMH records, BL digitized newspapers, Web resources, Amazon S3 buckets, …

Planets Data Registry services (ingesting DOs) are based on Apache Jackrabbit and Fedora Commons.
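
The adapter idea can be sketched as an interface with one implementation per repository type. The example below is illustrative only: `DigitalObjectManager`, `list_ids`, `retrieve`, and the in-memory record layout are hypothetical names, and the toy adapter reads from a dictionary rather than from a real OAI-PMH endpoint or S3 bucket.

```python
from abc import ABC, abstractmethod


class DigitalObjectManager(ABC):
    """Hypothetical adapter between a repository and generic digital objects."""

    @abstractmethod
    def list_ids(self) -> list:
        """Return identifiers of the objects the repository exposes."""

    @abstractmethod
    def retrieve(self, identifier: str) -> dict:
        """Return one object mapped to a generic record."""


class InMemoryRecordManager(DigitalObjectManager):
    """Toy read-only adapter over records with repository-specific field names."""

    def __init__(self, records: dict):
        self._records = records

    def list_ids(self):
        return sorted(self._records)

    def retrieve(self, identifier):
        raw = self._records[identifier]
        # Map the source's own metadata fields onto the generic model.
        return {
            "title": raw["dc:title"],
            "format": raw.get("dc:format"),
            "content_url": raw["location"],
        }
```

Each additional repository type would get its own subclass with its own field mapping, while workflow code keeps calling `list_ids` and `retrieve`.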

Page 14: The Planets Interoperability Framework

Data Registry

A service to deposit, access, and organize Planets digital objects, based on a bi-directional Digital Object Manager.

Accessible to Workflow Execution Engine

Records Experiment and Preservation Metadata

Supports Export of Experiment Results

A Repository that implements the Planets Digital Object Model and naming scheme (Planets URIs).

Supports asynchronous pass-by-reference and direct access to binary Content (Content Resolver)
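
Pass-by-reference can be pictured as follows: a deposit returns a URI, services exchange that URI, and binary content is fetched through a resolver only when it is needed. The sketch is a synchronous, in-memory toy; `DataRegistry`, its methods, and the `registry://example/` URI scheme are invented for illustration and do not reflect the real Planets URIs or the asynchronous behaviour mentioned on the slide.

```python
import uuid


class DataRegistry:
    """Toy in-memory registry; the URI scheme is invented for illustration."""

    def __init__(self):
        self._metadata = {}
        self._content = {}

    def deposit(self, title: str, data: bytes) -> str:
        uri = f"registry://example/{uuid.uuid4()}"
        self._metadata[uri] = {"title": title, "size": len(data)}
        self._content[uri] = data
        return uri

    def describe(self, uri: str) -> dict:
        # Cheap call: returns metadata only, never the binary content.
        return dict(self._metadata[uri])

    def resolve_content(self, uri: str) -> bytes:
        # Content resolver: the only path that moves the actual bytes.
        return self._content[uri]


def byte_length_service(registry: DataRegistry, uri: str) -> int:
    # A service receives the reference and pulls content only when needed.
    return len(registry.resolve_content(uri))
```

Keeping `describe` separate from `resolve_content` lets a workflow inspect and route objects without transferring large files between services.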

Page 17: The Planets Interoperability Framework

Workflow Orchestration

Separation of concerns:

Fragments of complex workflow logic (templates) are implemented by <<workflow developers>>.

<<Experimenters>> select from predefined templates, configure them, and execute individual processes.

Templates implement abstract and reusable process definitions based on level-one operations (API) and decision logic.

Execute in a trusted environment (level-two)

handle digital objects in the metadata repository

provide the basis for recording provenance and preservation information
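
A template in this sense combines level-one operations with decision logic and records what happened. The function below is a minimal sketch under that reading; `migrate_and_check`, its parameters, and the event layout are hypothetical, and the level-one operations are passed in as plain callables rather than as Planets services.

```python
from datetime import datetime, timezone


def migrate_and_check(data, migrate, characterize, min_size=1):
    """Hypothetical template: migrate, characterize the result, decide, record events."""
    events = []

    def record(kind, result):
        events.append({
            "type": kind,
            "time": datetime.now(timezone.utc).isoformat(),
            "result": result,
        })

    migrated = migrate(data)
    record("migrate", "done")

    properties = characterize(migrated)
    record("characterize", properties)

    # Decision logic: accept the migration only if the output is large enough.
    accepted = properties["size_bytes"] >= min_size
    record("decision", "accepted" if accepted else "rejected")

    output = migrated if accepted else None
    return output, events
```

The experimenter's configuration corresponds here to the arguments (`migrate`, `characterize`, `min_size`), while the recorded events form the raw material for provenance information.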

Page 18: The Planets Interoperability Framework

Workflow Execution Engine (WEE) Service

[Diagram. Actors: Workflow Developer; Experimenter.]

[Components: WEE Template Rep. Service; WEE Execution Service; Workflow Client Application.]

[Steps: <<1: register>> (Template); <<2: select>>; <<3: configure>> (XML); <<4: execute>>.]
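
The four numbered steps can be walked through with toy stand-ins for the two services. Everything in this sketch is an assumption made for illustration: the class names, the method signatures, the `truncate` template, and the use of a Python dictionary in place of the XML configuration shown on the slide.

```python
class TemplateRepository:
    """Toy stand-in for a template repository service."""

    def __init__(self):
        self._templates = {}

    def register(self, name, func):
        # Step 1: a workflow developer registers a template.
        self._templates[name] = func

    def select(self, name):
        # Step 2: an experimenter selects a registered template.
        return self._templates[name]


class ExecutionService:
    """Toy stand-in for an execution service."""

    def execute(self, template, config, payload):
        # Step 4: run the selected template with the experimenter's settings.
        return template(payload, **config)


def truncate(payload: bytes, limit: int) -> bytes:
    """Hypothetical template body: keep the first `limit` bytes."""
    return payload[:limit]


repo = TemplateRepository()
repo.register("truncate", truncate)
template = repo.select("truncate")
config = {"limit": 4}  # Step 3: configure (a dict here, XML on the slide)
result = ExecutionService().execute(template, config, b"preservation")
```

Separating registration from execution mirrors the split between the two roles: the developer's code enters the system once, and experimenters reuse it with different configurations.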

Page 20: The Planets Interoperability Framework

Summary

Research infrastructure for

integrating a variety of tools and repositories

executing defined preservation operations

recording provenance and preservation metadata

Not necessarily an “out-of-the-box” solution

Extensible network of services,

Public deployment,

Allows sharing of resources and results.

Downloadable package available for local installation of selected preservation tools/services.

Page 21: The Planets Interoperability Framework

Conclusions (1) - Preservation Actions

Defined interfaces for Preservation Actions required

Prerequisite for QA and other complex pres. strategies (workflows)

Preservation strategy often trivial (complexity within the tool)

Automation and Quality Control are key issues

Verifiability of technical interoperability is crucial

Depends much on communication method (native, DSL)

• keep as simple as possible

Semantic interop. requires well defined properties and metrics

• often domain dependent

• defined tests and benchmarks required

Page 22: The Planets Interoperability Framework

Conclusions (2) - Component Framework

The Planets IF provides an environment for preservation components to run and interact

Distributed system required for extensibility and integration

Service interfaces specified at exchange language level (HTTP, SOAP, WS-* specs)

Interoperability often not a problem of specification but of inconsistencies in different implementations

3rd party tools impose multiple levels of indirection

OS calls, different languages, different middleware stacks

Supporting (proprietary) tools may impact hosting environment and factors like performance, robustness, and fault tolerance.

Page 23: The Planets Interoperability Framework

Conclusions (3) - Repository Integration

Planets provides a flexible approach for bridging access to heterogeneous repository systems.

Diverse APIs, metadata representation, data access

Standards exist (OAI-ORE, RDF) but are not yet adopted

Missing standards for integration of digital preservation actions with digital repository systems

(a) Defined Methods for Access, Re-Ingest, Versioning

(b) Entirely integrated with repository

• can improve performance, may affect trustworthiness

Considerable efforts required to adapt data management systems in place

Page 24: The Planets Interoperability Framework

Fin