Perseus’ Archiving Needs And What They Mean For Librarians

Preserving Perseus

Data and Behaviors

• What does Perseus have to lose?

• Data
  – If lost, we cannot do anything.
  – The primary text is primary.

• Behavior
  – We lose the ability to make associations.

Structure of the Talk

• Perseus’ current and future options for archiving/preserving its data and behaviors

• Use these options to motivate the new skills required of librarians and the new roles emerging for them

Perseus’ Preservation Options…

• Be Open
  – Hard to maintain a black box

• Distribute for Redundancy
  – Library of Alexandria: Don’t put all your eggs in one basket.

• Use Institutions for Reliability/Quality
  – Library of Alexandria: Lots of quality content

Be Open

Be Open: Data

• Data formats
  – Non-binary for text
    • Images are different
  – Application-independent
  – Easily transformable when possible
    • XML

• Licensing
  – Can other people use this data?
  – Are other people able to create derivative works?
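
To make "easily transformable" concrete, here is a minimal sketch in Python that turns a small XML fragment into a table keyed by book and line. The fragment and its element names (`text`, `div`, `l`) are hypothetical placeholders for illustration, not the markup Perseus actually ships, and `lines_by_citation` is a made-up helper name.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified fragment; placeholder text, not Perseus data.
SAMPLE = """<text>
  <div type="book" n="1">
    <l n="1">sample line one</l>
    <l n="2">sample line two</l>
  </div>
</text>"""


def lines_by_citation(xml_string):
    """Return {"book.line": text} for every <l> inside a <div type="book">."""
    root = ET.fromstring(xml_string)
    result = {}
    for book in root.iter("div"):
        if book.get("type") != "book":
            continue  # skip divisions that are not books
        for line in book.iter("l"):
            key = f"{book.get('n')}.{line.get('n')}"
            # itertext() also collects text inside nested inline elements
            result[key] = "".join(line.itertext()).strip()
    return result


table = lines_by_citation(SAMPLE)
for ref, text in table.items():
    print(ref, text)
```

Because the text is plain, application-independent XML, the same few lines of standard-library code would work on any copy of the file, with no dependence on the software that produced it.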

Be Open: Behaviors

• Protocol Specifications
  – What does Perseus mean? (semantics)
  – Defining behaviors
    • Browsing by logical citation scheme: CTS protocol

• Perseus’ APIs
  – Open source implementations
  – Let people download these implementations
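
As an illustration of browsing by logical citation, the sketch below splits a CTS-style URN into its namespace, work hierarchy, and passage reference. It assumes the general `urn:cts:namespace:work:passage` layout and deliberately ignores ranges and subreferences; `CtsUrn` and `parse_cts_urn` are hypothetical names, not part of any Perseus API.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class CtsUrn:
    namespace: str           # e.g. "greekLit"
    work: List[str]          # text group, work, optional version, ...
    passage: Optional[str]   # logical citation such as "1.1", or None


def parse_cts_urn(urn: str) -> CtsUrn:
    """Simplified parser: no support for ranges or subreferences."""
    parts = urn.split(":")
    if len(parts) not in (4, 5) or parts[0] != "urn" or parts[1] != "cts":
        raise ValueError(f"not a CTS URN: {urn!r}")
    # The passage component is optional; an empty one counts as absent.
    passage = parts[4] if len(parts) == 5 and parts[4] else None
    return CtsUrn(namespace=parts[2], work=parts[3].split("."), passage=passage)


urn = parse_cts_urn("urn:cts:greekLit:tlg0012.tlg001:1.1")
print(urn.namespace, urn.work, urn.passage)
```

The point of the logical scheme is that "1.1" identifies book 1, line 1 regardless of which edition, file, or server holds the text, which is what makes the behavior specifiable independently of any one implementation.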

Distribute For Redundancy

Distributing Data

• Leveraging Geographic Distribution
  – SRB/iRODS
    • Desktop/Web-based GUI

• The more copies, the safer our data will be
  – Perseus lets people download raw data
    • Creative Commons
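
Many copies only help if someone can tell when one of them has drifted. A minimal sketch of that check, assuming each site's copy is available as bytes: hash every copy and report the sites that disagree with the majority. The site names and `find_divergent_copies` are hypothetical; this is not how SRB or iRODS perform replication checks.

```python
import hashlib
from collections import Counter
from typing import Dict, List


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def find_divergent_copies(copies: Dict[str, bytes]) -> List[str]:
    """Return the sites whose copy differs from the most common digest."""
    digests = {site: sha256_of(blob) for site, blob in copies.items()}
    majority_digest, _count = Counter(digests.values()).most_common(1)[0]
    return sorted(site for site, d in digests.items() if d != majority_digest)


# Hypothetical mirrors holding the same file; site_c has lost a character.
copies = {
    "site_a": b"<text>alpha</text>",
    "site_b": b"<text>alpha</text>",
    "site_c": b"<text>alpha</text",
}
divergent = find_divergent_copies(copies)
print(divergent)
```

With only two copies there is no majority to appeal to, which is one more argument for the slide's claim that more copies are safer.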

Distribute Your Behaviors

• Mirror sites
  – Enables distribution of behaviors

• Distributed computing power
  – Performance gain

• For Perseus’ mission: the more copies, the better!
  – Let people download your specs and implementations.
    • GPL license

Use Institutions For Reliability & Quality

Give Institutions Your Data

• Quality
  – Policies for ingest ensure a standard for the data and metadata

• Leverage Expertise
  – Their job is to archive and preserve data

Give Institutions Your Behaviors

• Institutional repositories can preserve behaviors
  – Fedora

• Forces documentation
  – Specification
  – Implementation

• If using a different implementation
  – Is the specification really implementation-independent?
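
One way to probe whether a specification is really implementation-independent is to run the same spec-derived cases against two unrelated implementations. The sketch below assumes a hypothetical one-line spec ("`resolve(ref)` returns the text at `book.line` and raises `KeyError` for unknown references"); both classes and the test cases are invented for illustration and have nothing to do with Fedora's actual interfaces.

```python
# Cases derived from the (hypothetical) specification, not from either implementation.
SPEC_CASES = [("1.1", "first line"), ("1.2", "second line")]
UNKNOWN_REF = "9.9"


class FlatImplementation:
    """Stores passages in a flat dict keyed by the citation string."""

    def __init__(self):
        self._data = {"1.1": "first line", "1.2": "second line"}

    def resolve(self, ref):
        return self._data[ref]


class NestedImplementation:
    """Stores passages as book -> line -> text, a different internal layout."""

    def __init__(self):
        self._books = {1: {1: "first line", 2: "second line"}}

    def resolve(self, ref):
        book, line = (int(part) for part in ref.split("."))
        return self._books[book][line]


def conforms(impl) -> bool:
    """Check an implementation against the spec cases only."""
    for ref, expected in SPEC_CASES:
        if impl.resolve(ref) != expected:
            return False
    try:
        impl.resolve(UNKNOWN_REF)
    except KeyError:
        return True   # the spec requires KeyError for unknown references
    return False


results = {type(i).__name__: conforms(i) for i in (FlatImplementation(), NestedImplementation())}
print(results)
```

If a second implementation can only pass by copying quirks of the first, the specification is leaning on undocumented behavior and needs more work before it is handed to a repository.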

Skills Perseus Needs from Future Librarians

• Data formats
  – XML

• Manipulating the data
  – XSLT
  – Basic Scripting: Perl, Python, Groovy

• Licensing agreements
  – Creative Commons
  – GPL

• Grid/Distributed Computing

• Investigate Institutional Repositories
  – Fedora