A strategic view of document and digital object management for the University of the Witwatersrand, Johannesburg
Prof Derek W. Keats
Deputy Vice Chancellor (Knowledge & Information Management)
The University of the Witwatersrand, Johannesburg
http://[email protected]
What are documents?
How does the computer 'see' them?
The storage view
The manipulation view
The structural view
The operational view
Require software that understands the 'document' and knows how to present it.
The storage view
The operational view
The manipulation view
The structural view
[Diagram over time, from today into the future. Threats to survival: physical deterioration, digital obsolescence (devices, file formats), accidental damage, loss of metadata.]
A major threat to proprietary file formats common in proprietary systems
Device obsolescence
File format obsolescence
● Software supporting the format fails in the marketplace or is bought by a competitor and withdrawn
● Software upgrades fail to support legacy files
● The format itself is superseded by another or evolves in complexity
● The format "take up" is low or industry fails to create compatible software
● The format fails, stagnates, or is no longer compatible with the current environment
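One way to gauge exposure to file format obsolescence is to inventory which formats a collection actually holds. The sketch below is not from the slides. It assumes that file extensions are a reasonable proxy for format, and the function name `count_extensions` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper (not from the slides): count files per extension
# under a directory, as a rough inventory of the formats in a collection.
count_extensions() {
    local dir="$1"
    # Find regular files whose names contain a dot, strip the directory
    # part, keep only the text after the last dot, and drop empty results.
    find "$dir" -type f -name '?*.*' -print \
        | sed -e 's|.*/||' -e 's/.*\.//' -e '/^$/d' \
        | tr '[:upper:]' '[:lower:]' \
        | sort | uniq -c | sort -k1,1nr -k2,2 \
        | awk '{ print $2, $1 }'
}
```

Calling `count_extensions /path/to/collection` prints one "extension count" pair per line, most common first. Extensions with few files, or ones tied to discontinued software, are candidates for a closer look.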
A small subset of commonly used media formats!
Media
If you don't have the software, even a perfectly preserved document is of no use.
[Diagram: digital assets over time, analogue to digital. Labels: Digitization, Document management, Born digital, Digital recovery, Digital archiving, Digital preservation.]
Risk without long term planning
As a component of how we manage our digital assets
Why digital asset management?
● We are a knowledge organization
● Knowledge workers spend 30-40% of their time on document-related tasks
● This increases significantly when other digital assets are taken into consideration
● Digital assets are increasing and increasingly easy to lose
● Digital assets form the basis of much of our research
Digital archiving and preservation
● Institutional papers and documents
Other digital assets
● Historical papers
● Library collections
● Various history projects
● Rock art collections
● Video and audio collections
● e.g. Wits TV
● Donations of significant collections from industry
● History of human evolution research
● Research output and theses
● Research data
The curse of the born-analogue
Capture
Create
Classify
Share
Archive
Destroy
Protect
Retain
Find & use
Preserve
Route
Creating semantic and socially connected
document stores
archives
repositories
museums
herbaria
21st Century
Chisimba
Semantic and social 'X'
● Fedora Commons
● Fedora Commons SWORD API
● Chisimba
[Diagram labels: Fedora Commons, SWORD API, Chisimba API, XMPP, eLearning, 'Portals']
Workflow
WEWE
WeWe Basics
● Rules-driven workflow engine
● Rules represented in XML
● Sequential event support
● Conditional Return support
● Written in Perl
● Uses PostgreSQL Database
● Open Source
● Originally developed for The University of the Witwatersrand, Johannesburg
● Multiple Management interfaces
WeWe Designer
● Web-based design tool for designing workflows
● Supports multiple events with multiple return types/states
● Drag and drop interface
● Written in jQuery
● Open Source Interface
● Adapt from Design "Template" support
WeWe Developer
● Developers create Rules Modules
● Modules can be written in Perl or any other language that can be executed from the Linux command line
● API
● Command line interface
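Because modules may be written in any language that runs from the Linux command line, a rules module could be as small as a shell function. The sketch below checks whether an incoming file looks like a PDF and reports a state. The real WeWe module interface is not shown in the slides, so the argument convention, the state names, the exit codes, and the name `check_pdf` are all assumptions made for illustration.

```shell
#!/bin/bash
# Hypothetical rules module (the real WeWe module interface is not shown
# in the slides). Assumed convention: the engine passes a file path as the
# first argument, reads a state name on stdout, and inspects the exit code.
check_pdf() {
    local path="$1"
    if [ ! -f "$path" ]; then
        echo "missing"
        return 2
    fi
    # PDF files begin with the magic bytes "%PDF-".
    if [ "$(head -c 5 "$path")" = "%PDF-" ]; then
        echo "accepted"
        return 0
    fi
    echo "rejected"
    return 1
}
```

Saved in a script that ends with `check_pdf "$1"`, this could be run as `./check_pdf.sh scan.pdf`. A workflow with conditional returns could then branch on the printed state or on the exit code.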
Workflow Process
Enterprise document management
An approach using private cloud
Folder server, WEWE, Chisimba
Private cloud infrastructure
[Diagram labels: Site, Ingest, Born digital, Shared folder, Network, WEWE, WWW, Hosted services, Digital archive]
Workflow managed by WEWE layer
Private cloud infrastructure
[Diagram labels: Virtualization, Chisimba, Fedora, Other, Wits portals, eLearning, Zimbra, iRODS, SOA layer, OS: Open Solaris, WEWE, Remote site (four)]
Compute cloud
Hierarchical storage: robotic tape library, spinning disks, flash memory
Private cloud infrastructure: use in establishing digital archive
[Diagram labels: Compute cloud, Storage cloud, Robotic tape library, First tier storage, Digital archive, Fedora, WEWE, Chisimba, Archon, SOA layer, OS: Open Solaris. At each remote site: source artifacts, digital conversion, born digital (docs, audio, video, etc.), WEWE rules, ingest.]
Scanning & assembly
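#!/bin/bash
# Usage: pass the output name (without .pdf) as the first argument.
# Scan in the pages (scanadf writes one image file per page)
scanadf --mode "Black & White" --resolution 200
# Convert each page to a pdf file
for file in image*; do
    convert "$file" "$file.pdf"
    rm "$file"
done
# Concatenate all the individual pdf files
pdftk image*.pdf cat output "$1.pdf"
rm image*.pdf
# Hand the assembled document to the monitored folder
mv "$1.pdf" "/home/$USER/monitored/outgoing/"
exit 0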
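The script above ends by dropping the PDF into a monitored folder. What happens next is not shown in the slides. As a sketch under stated assumptions, the receiving side could move each PDF into an ingest directory and record a checksum so that later accidental damage can be detected. The function name `ingest_outgoing`, the directory layout, and the manifest file name are hypothetical, and `sha256sum` is assumed to be the GNU coreutils tool.

```shell
#!/bin/bash
# Hypothetical receiving side of the monitored folder (not from the slides).
# Moves each PDF from the watched directory to an ingest directory and
# appends its SHA-256 checksum to a manifest kept alongside the files.
ingest_outgoing() {
    local watched="$1" ingest="$2"
    local pdf name
    mkdir -p "$ingest"
    for pdf in "$watched"/*.pdf; do
        # Skip the unexpanded pattern when the folder holds no PDFs.
        [ -e "$pdf" ] || continue
        name=$(basename "$pdf")
        mv "$pdf" "$ingest/$name"
        # Run sha256sum inside the ingest directory so the manifest
        # stores bare file names rather than absolute paths.
        (cd "$ingest" && sha256sum "$name" >> manifest.sha256)
    done
}
```

Running `sha256sum -c manifest.sha256` inside the ingest directory later re-checks every recorded file against its stored checksum.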
The real challenge is getting the document scanned and into a PDF and sent off to somewhere meaningful.
That's why we need expensive document imaging software.
Right?
Let's have one digital asset management project for Wits and let us create the synergy that leads to innovation.
Attribution file: http://www.dkeats.com/usrfiles/users/1563080430/attribution/attrib.txt