
Document Processing (Documentverwerking), P11: Document Management. Prof. Dr. ir. Patrick P. Bergmans, Faculty of Engineering (Faculteit Ingenieurswetenschappen), Universiteit Gent


Document Management (1)

• Document Management controls the “life cycle” of documents in an organization; how they are
    • Created
    • Reviewed
    • Published
    • Consumed
    • Disposed of, or retained
• Management implies a “top-down” approach, but Document Management Systems are not always implemented top-down
    • Documents must reflect the culture of a company
    • But they must also introduce a more formal approach to document usage

Document Management (2)

• Tools used for Document Management should be
    • Formal and rigorous when required
    • Yet maximize flexibility
• Document Management Systems should
    • Promote finding and sharing information easily
    • Organize content in a logical, retrievable way
    • Standardize the representation of content
    • Help an organization meet its legal responsibilities
    • Support systems for collaboration
    • Provide systems for efficient archiving

Document Management (3)

• An effective Document Management System specifies:
    • What types of documents and other content can be created within an organization
    • What templates to use for each type of document
    • What metadata to provide for each type of document
    • Where to store documents at each stage of a document's life cycle
    • How to control access to a document at each stage of its life cycle
    • How to move documents within the organization as team members contribute to the documents’ creation, review, approval, publication, and disposition

Document Management (4)

• An effective Document Management System specifies:
    • …
    • What policies to apply to documents so that document-related actions are audited, documents are retained or disposed of properly, and content important to the organization is protected
    • How documents are converted as they transition from one stage to another during their life cycles
    • How documents are treated as corporate records, which must be retained according to legal requirements and corporate guidelines

Planning Document Management

• Planning steps
    • Identify Document Management participants and stakeholders
    • Analyze document usage
    • Plan document libraries
    • Plan content types
    • Plan versioning, content approval, check-out procedures
    • Plan workflows for documents
    • Plan information management policies

Document Management Stakeholders

• Identify Document Management participants and stakeholders (WHO?)
    • Who creates documents in the organization
    • Who reviews documents
    • Who edits documents
    • Who uses documents
    • Who approves the publication of documents
    • Who designs Web sites used for hosting documents
    • Who manages “records”
    • Who deploys and maintains document servers

Analyze Document Usage

• Analyze Document Usage
    • Document types
    • Specific and detailed description of the usage of a document, or of classes of documents
        • Simplify documents with little use
    • Author of all documents
    • Format of the document; format conversions should also be recorded
    • Describe users of all documents (individuals, teams, departments)
    • Location of the documents

Plan Document Libraries

• Plan Document Libraries
    • Library in a team site
        • Less formal documents, ideas, proposals
    • Library in a portal area (Intranet site)
        • Legal documents, templates, active contracts
    • Library in a Document Center site
        • Centrally managed documents; best practices
    • Library in a Records Repository
        • Document archival, long-term legal requirements, corporate records
    • Translation management document library
    • Slide and presentation library

Plan Content Types

• What are content types?
    • Properties of the document type
    • Workflows associated with the type
    • Information management policies associated with the type
    • Document templates
    • Document conversions
    • Custom features
• When creating a “new” document of a specific type, all properties of the type are automatically inherited
• Document type is stored in the document and cannot be changed by the document author
    • May be enforced by a DTD or XML Schema approach
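The inheritance rule for content types can be sketched in a few lines of Python. This is only an illustration under assumptions: the names `ContentType`, `Document` and `new_document`, and the example template and workflow names, are hypothetical and do not come from any particular Document Management System.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ContentType:
    """Hypothetical content type: everything a new document inherits."""
    name: str
    template: str
    workflow: str
    policies: tuple = ()
    properties: tuple = ()  # metadata fields every document of this type carries


@dataclass
class Document:
    title: str
    _content_type: ContentType
    metadata: dict = field(default_factory=dict)

    @property
    def content_type(self) -> ContentType:
        # Read-only: there is no setter, so the author cannot change the type.
        return self._content_type


def new_document(title: str, ctype: ContentType) -> Document:
    """Create a document that inherits the metadata fields of its type."""
    doc = Document(title, ctype)
    doc.metadata = {name: None for name in ctype.properties}
    return doc


contract = ContentType(
    name="Contract",
    template="contract-template",
    workflow="legal-review",
    policies=("retain-10-years",),
    properties=("author", "counterparty"),
)
doc = new_document("Supply agreement", contract)
```

Assigning to `doc.content_type` raises `AttributeError`, which mirrors the rule that the type is fixed once the document exists.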

Plan Versioning, Content Approval, Check-out

• Types of versioning
    • None
        • Only the last version of the document is kept, and no log of changes or edits is available
        • Only use for unimportant documents
    • Major versions only
        • Uses a simple versioning scheme (1, 2, 3, ...)
        • All versions normally read-accessible to all stakeholders
    • Major and minor versions
        • Most often a double version scheme: [major.minor] (1.2, 2.5, ...)
        • Allows a storage policy based on both digits
        • For example: retain all current minor releases, and 2 former major releases (5.0, 6.0, 7.0, 7.1, 7.2, ...)
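The example storage policy can be sketched as a small Python function. It rests on one reading of the slide that is an assumption: a "former major release" is taken to be the x.0 version of each of the two most recent earlier majors. The function name `versions_to_retain` is hypothetical.

```python
def versions_to_retain(versions, former_majors=2):
    """Return the (major, minor) versions to keep, in ascending order.

    Keeps every version of the current (highest) major, plus the x.0
    release of the `former_majors` most recent earlier majors.
    """
    if not versions:
        return []
    current = max(major for major, _ in versions)
    earlier = sorted({m for m, _ in versions if m < current}, reverse=True)
    kept_majors = set(earlier[:former_majors])
    keep = []
    for major, minor in sorted(set(versions)):
        if major == current or (major in kept_majors and minor == 0):
            keep.append((major, minor))
    return keep


history = [(4, 0), (4, 1), (5, 0), (5, 1), (5, 2),
           (6, 0), (6, 3), (7, 0), (7, 1), (7, 2)]
retained = versions_to_retain(history)
# retained == [(5, 0), (6, 0), (7, 0), (7, 1), (7, 2)]
```

The result matches the slide's example (5.0, 6.0, 7.0, 7.1, 7.2); everything else would be eligible for disposal.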

Plan Versioning, Content Approval, Check-out

• Content Approval: before making a version “official”, its content must be approved.
• No version control: when an (approved) document is being edited, or has been edited, but its content has not yet been approved, no approved (earlier) version is available
• Major version control: an approved version may be edited, and pending the approval of the new version, the previous (major) version remains available
• Minor version control: an approved version may be edited, and the author has the choice to create a new major or minor version; the official approved version is optionally the latest minor or major version

Document Workflows

• A document workflow describes the various steps a document goes through during its life cycle
    • With a formal workflow model (graphic, procedural, scripted)
    • Includes creation, iterative edits, and approvals
    • Authorizations and signatures
    • Collection of document metadata
    • Checking in and checking out
    • Includes starts, pauses and stops in the workflow
    • Messaging (sending e-mails)
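A scripted workflow model can be sketched as a small state machine. The states, actions and the `notify` hook below are assumptions chosen for illustration; a real system would define its own steps and would send e-mail where this sketch only calls a callback.

```python
# Allowed transitions: (current state, action) -> next state (hypothetical).
TRANSITIONS = {
    ("draft", "submit"): "in_review",
    ("in_review", "reject"): "draft",      # iterative edits
    ("in_review", "approve"): "approved",
    ("approved", "publish"): "published",
}


class DocumentWorkflow:
    def __init__(self, notify=None):
        self.state = "draft"
        self.history = []  # audit trail: (user, action, from_state, to_state)
        self._notify = notify or (lambda message: None)

    def apply(self, action, user):
        """Perform an action if the model allows it in the current state."""
        key = (self.state, action)
        if key not in TRANSITIONS:
            raise ValueError(f"'{action}' is not allowed in state '{self.state}'")
        previous, self.state = self.state, TRANSITIONS[key]
        self.history.append((user, action, previous, self.state))
        self._notify(f"{user}: {previous} -> {self.state}")  # messaging step
        return self.state


sent = []
wf = DocumentWorkflow(notify=sent.append)
wf.apply("submit", "ann")
wf.apply("reject", "bob")
wf.apply("submit", "ann")
wf.apply("approve", "bob")
wf.apply("publish", "carol")
```

Keeping the transitions in one table makes the model easy to review, and the `history` list doubles as a record of authorizations.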

Information Management Policies

• Implementation of Information Access Rights as they apply to documents
    • Create, modify, read rights
        • By individuals
        • By department
    • Print rights
• Control and management of confidential information contained in documents
    • Control of digital copies
    • Control of physical (paper) copies, with automatic registered numbering, overprinting of watermarks, etc.
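Access rights granted by individual or by department can be sketched as a simple lookup. The class `AccessPolicy`, the right names and the users are hypothetical; real systems usually delegate this to a directory service.

```python
from dataclasses import dataclass, field


@dataclass
class AccessPolicy:
    """Hypothetical per-document policy: rights by user and by department."""
    by_user: dict = field(default_factory=dict)        # user -> set of rights
    by_department: dict = field(default_factory=dict)  # department -> set of rights

    def allows(self, user, department, right):
        """A right is granted if the user or the user's department holds it."""
        return (right in self.by_user.get(user, set())
                or right in self.by_department.get(department, set()))


policy = AccessPolicy(
    by_user={"ann": {"create", "modify", "read", "print"}},
    by_department={"sales": {"read"}},
)
```

Here `policy.allows("bob", "sales", "read")` is true through the department grant, while `policy.allows("bob", "sales", "print")` is false, so print rights stay separate from read rights.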

Special Functional Components

• Metadata collection and generation
• Indexing
• Summarizing
• Auto-translation
• Terminology Control
• Conversions
• Integration with other systems

Metadata Collection and Generation

• Metadata are data that are associated with the document, but that are not part of a document’s content
    • Name of the author
    • Date of creation and trail of revisions
    • Authoring application of the document
    • Statistical information about the document
    • Digital document rights
    • Custom structural data (XML)
    • Headers, footers, watermarks for printing
• Metadata can be indexed and searched
• Metadata can be input manually, or automatically generated by the authoring application

Indexing

• Documents may be indexed for efficient retrieval
    • Keywords for the index may be manually added, or automatically generated
    • A few selected keywords, or
    • The whole document may be indexed
• Indexing should be matched or adapted to the search mechanism associated with the Document Management System
• Indexing may also be “semantic” (not just words but their general meaning is indexed, using categories)
    • Use of categorization systems
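Whole-document indexing with automatically generated keywords can be sketched as an inverted index. This is a minimal illustration, not a description of any product: the function names and sample documents are assumptions, and the search simply requires every query word to occur in a document.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case the text and split it into alphanumeric words."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(docs):
    """Map each word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return ids of documents containing all words of the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


docs = {
    "d1": "Active contract with supplier",
    "d2": "Contract template for suppliers",
    "d3": "Meeting notes and proposals",
}
index = build_index(docs)
```

Because the index and the search share the same `tokenize` function, the indexing is matched to the search mechanism, which is the point the slide makes.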

Summarizing

• Summarization of documents can be performed at the time a document is stored
    • Completely automatically
    • User assisted
• There are three types of automatic summarization
    • Statistical
    • Semantic
    • Mixed
• Summary may be
    • Included in the document as metadata
    • Saved on the system as a separate document, linked to the original
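One simple form of statistical summarization scores each sentence by how frequent its words are in the whole document and keeps the best-scoring sentences. The sketch below is an assumption about what "statistical" could mean here, not the method of any specific system; the stop-word list and the function name `summarize` are illustrative.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real one would be much longer.
STOPWORDS = {"the", "a", "an", "of", "and", "is", "are", "to", "in", "it", "for"}


def content_words(text):
    return [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS]


def summarize(text, max_sentences=1):
    """Keep the sentences whose words are most frequent in the document."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(content_words(text))

    def score(sentence):
        words = content_words(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore original order
    return " ".join(sentences[i] for i in chosen)


text = ("Documents are archived. "
        "Archived documents stay searchable in the archive index. "
        "Lunch is at noon.")
summary = summarize(text, max_sentences=1)
```

The returned string could then be stored as metadata or saved as a separate, linked document.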

Auto-Translation

• Automatic translation of documents, or sections of documents, can be performed by some (experimental) Document Management Systems
• Essentially two systems
    • Complete analysis of sentences, and synthesis in the other language
    • Translation memories, storing pre-translated snippets of documents
• Translation mostly not perfect, but often usable
    • Especially if the document is very domain-specific
        • Pharmaceutical documents
        • User and service manuals
    • But not for legal texts
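A translation memory can be sketched as a store of pre-translated segments with exact and approximate lookup. The class below is hypothetical; it uses `difflib.get_close_matches` from the Python standard library for the fuzzy match, and the 0.8 similarity cutoff is an arbitrary assumption.

```python
import difflib


class TranslationMemory:
    """Hypothetical store of source segments and their stored translations."""

    def __init__(self):
        self._segments = {}

    def add(self, source, target):
        self._segments[source] = target

    def lookup(self, source, cutoff=0.8):
        """Return the stored translation of an identical or similar segment."""
        if source in self._segments:
            return self._segments[source]
        close = difflib.get_close_matches(
            source, list(self._segments), n=1, cutoff=cutoff)
        return self._segments[close[0]] if close else None


tm = TranslationMemory()
tm.add("Press the power button.", "Druk op de aan/uit-knop.")
```

A fuzzy hit returns the translation of the closest stored segment, so a human reviewer would still need to check it. That fits the slide's remark that the output is usable rather than perfect.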

Terminology Control

• Terminology Control or Management is used when the meaning of words, used in the document, must be accurately monitored
    • Delete = erase = scratch a file; in software manuals; which should be used (in all instances)?
    • Gas = gasoline = fuel = petrol tank; which one should be used in the user manual of a car; in the US? In the UK?
• Terminology management systems rely on a central terminology server with terminology usage rules
    • Terminology database maintained by dedicated utilities
    • Can be used interactively when authoring documents
    • Can be used to screen pre-authored texts for correct terminology usage
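Screening a pre-authored text against terminology rules can be sketched as follows. The rule table is an assumption built from the slide's own example (prefer "delete" over "erase" and "scratch"); a real system would fetch such rules from the central terminology server.

```python
import re

# Hypothetical usage rules: preferred term -> variants that should not be used.
RULES = {"delete": {"erase", "scratch"}}


def screen(text, rules):
    """Return (position, found_term, preferred_term) for each rule violation."""
    findings = []
    for preferred, variants in rules.items():
        for variant in sorted(variants):
            pattern = rf"\b{re.escape(variant)}\b"
            for match in re.finditer(pattern, text, flags=re.IGNORECASE):
                findings.append((match.start(), match.group(0), preferred))
    return sorted(findings)


findings = screen("Erase the file, then delete the folder.", RULES)
# findings == [(0, "Erase", "delete")]
```

The same function could run interactively on each sentence as the author types, or in batch over finished manuals.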

Conversions

• Conversions between formats (see introduction)
    • ML → RTF → PDL → Bitmap
    • Bitmap → PDL → RTF → ML
    • Any format to HTML for viewing with browser
    • Also conversions at “constant semantic level” (jpg, tif, png, bmp, etc.)
    • Compression and decompression
• Simultaneous tracking of versions at several levels
    • Automatically delete PDF version, if source (ML or RTF) is being edited
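The rule that a derived PDF is deleted when its source is edited can be sketched with a small in-memory repository. Everything here is hypothetical: the `Repository` class, its method names, and the `convert` callable, which stands in for a real format converter supplied by the caller.

```python
class Repository:
    """Hypothetical store tracking a source and its derived PDF rendition."""

    def __init__(self):
        self.sources = {}     # name -> (version, content)
        self.renditions = {}  # name -> (source_version, pdf_data)

    def save_source(self, name, content):
        """Store a new source version and drop any now-stale PDF."""
        version = self.sources.get(name, (0, None))[0] + 1
        self.sources[name] = (version, content)
        self.renditions.pop(name, None)
        return version

    def render_pdf(self, name, convert):
        """Create the PDF rendition from the current source version."""
        version, content = self.sources[name]
        self.renditions[name] = (version, convert(content))
        return self.renditions[name][1]


def fake_convert(content):
    # Placeholder for a real converter; it only tags the content.
    return "PDF:" + content


repo = Repository()
repo.save_source("manual", "first draft")
repo.render_pdf("manual", fake_convert)
had_pdf = "manual" in repo.renditions
repo.save_source("manual", "second draft")
has_pdf_after_edit = "manual" in repo.renditions
```

Storing the source version next to each rendition is one way to track versions at several levels at once.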

Integration with other Systems

• Many document management systems attempt to integrate document management directly into other applications
    • Authoring applications: so that users may access documents (read-modify-write) directly in the document management system repository, without leaving the application; such integration is commonly available for office suites and e-mail software.
    • Management applications, such as CRM systems, ERP systems, MRP systems, accounting packages, etc.
• Integration often uses open standards such as ODMA, LDAP, WebDAV and SOAP to allow integration with other software and compliance with internal controls.