ForgetIT Project, GA 600826
Towards Concise Preservation by Managed Forgetting
Research Seminar, University of Caen, 12 November 2013
Nattiya Kanhabua, Claudia Niederée, and Wolf Siberski
L3S Research Center / Leibniz Universität Hannover, Hannover, Germany

Preservation and Forgetting: Friends or Foes?

2

ForgetIT Project Consortium

An interdisciplinary team of experts in:
– Preservation, information management, information extraction
– Multimedia analysis, storage computing, cognitive psychology

3

Outline

• Motivation & Vision
• Pilot Applications
• Approaches: Overview
• Integration Framework

4

Inspiration

However, we are facing
– dramatic increase in content creation (e.g. digital photography)
– information overload and changing professional + private lives
– increasing storage costs for long-term storage (>10 years)
– increasing use of mobile devices with restricted capacity
– inadvertent forgetting for lack of systematic preservation

And: Forgetting plays a crucial role for human remembering and life in general (focus, stress on important information, forgetting of details)

A computer that forgets? Intentionally??

And in context of preservation???

So: “Shouldn’t there be something like forgetting in digital memories as well?”

ForgetIT

5

Complementing Human Memory

V. Mayer-Schönberger. Delete: The Virtue of Forgetting in the Digital Age. Princeton University Press, 2009.

6

Motivation

Opportunities
• major progress in preservation technology
• maturing information extraction technology
• storage as a service (e.g. clouds)

Needs
• increasing amount of digital content handled over decades
• more or less systematic backup strategies used
• non-paper practices for long-term perspective required

Major Obstacles
• large gap for adoption
• high up-front cost
• no established practices
• lack of understanding of benefit
• reluctance to invest

7

Vision: Building a Bridge

Opportunities
• major progress in preservation technology
• maturing information extraction technology
• storage as a service (e.g. clouds)

Needs
• increasing amount of digital content handled over decades
• more or less systematic backup strategies used
• non-paper practices for long-term perspective required

ForgetIT
• Enabling smooth transition to preservation
• Creating immediate benefit + reducing effort
• Opening alternatives to “keep it all” and “forgetting by accident”
• Easing interpretation in the long run
• Taking inspiration from and complementing human memory

Major Obstacles
• large gap for adoption
• high up-front cost
• no established practices
• lack of understanding of benefit
• reluctance to invest

8

Building the Bridge

Managed Forgetting
• inspired by human forgetting
• as opposed to the current “forgetting by accident”

Synergetic Preservation
• couples information management and preservation management

Contextualized Remembering
• bringing back information into active use in a meaningful way

9

Simple Example: Holidays

Trip
• Trip to Paris with friends
• Thousands of pictures

After trip
• High awareness of trip details
• Showing of pictures
• Sorting out redundant pictures
• Sub-grouping and sorting

+1 month
• Life goes on
• Pictures go out of focus
• Creation of a small diverse subset for showing occasionally

+1 year
• Creation of summary page (e.g. “February 2015, Paris, Team: Me, Mary Christine, Tom”)
• Addition of context info
• Further reduction of redundancy
• Rest of pictures into archive

+5-10 years
• Changes in life (e.g. marriage)
• Addition/update of context information (e.g. girlfriend → wife)
• Dealing with preservation issues

+20 years
• Revisiting of trip photos
• Re-integration into overall photo collection (link into context)

10

Application: Personal Preservation

Starting point:
• Tremendous growth of information in personal sphere
• Diversity and fast evolution of devices, platforms and formats
• Keeping info sustainably available: only ad hoc solutions for mid-term, long-term solutions

ForgetIT approach: Preservation solution for personal information space
• Based on concept of Semantic Desktop
• Consideration of social web content, multimedia content, other types of personal content, knowledge structures
• Additional short/mid-term benefit: de-cluttering information space by managed forgetting
• Consideration of multi-level infrastructures (e.g. mobile, PC, cloud)

Dissemination/Exploitation: Personal Preservation as a service (e.g. to customers of a telco company)

11

Application: Organizational Preservation

Starting point: existing and popular CMS (TYPO3)
• Sophisticated workflows for content creation and publication
• But: separation of publication and preservation/archival
• Access to archived content is difficult and costly
• Obsolete and even outdated information stays online

ForgetIT approach:
• Preservation as integral part (binary model → gradual managed forgetting)
• Bolder attitude towards removing content possible
• Automated support of cleaning-up processes
• Support of many stages of archiving, e.g. offline but still in index, aggregates online / content in archive, only aggregates kept, etc.

Dissemination/Exploitation: Involvement of TYPO3 community, TYPO3 with preservation extension as open source project to TYPO3 community

12

Variables & Dimensions

Dimensions compared for two settings: Personal and Organization

Scenarios
• Personal events (years at school, holidays, social events, graduations, marriage, etc.)
• Public events
• Work-related events (project starts/closing, business trips, new products, etc.)

Data Type (Personal)
• Local: photos, contacts, SMS
• Online: user-generated content
• Features: 1. user behaviors, 2. social context

Data Type (Organization)
• Local: textual documents, files
• Online: web pages
• Features: 1. user roles, 2. policies

Interaction
• search/retrieve, re-find, organize, explore

Action
• summarization, aggregation, delete

13

Information Value Assessment

Memory Buoyancy
• Short-term relevance/interests
• E.g. current meeting documents

Preservation Value
• Long-term interests
• E.g. important life events

Subjective metrics
+ usage logs (views, edits, modifies)
+ social context, influences

Objective metrics
+ diversity, coverage, quality
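As a rough illustration of how the two values might be derived from the metrics above, the following Python sketch computes memory buoyancy from usage logs with an exponential decay, and preservation value as a plain average of objective metrics. The decay model, the event weights, the 30-day half-life, and the mapping of objective metrics to preservation value are all assumptions made for illustration; the slides do not define any formula.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class UsageEvent:
    kind: str        # "view", "edit" or "modify", as in the usage logs
    when: datetime


# Hypothetical weights and half-life; not taken from the slides.
EVENT_WEIGHTS = {"view": 1.0, "edit": 2.0, "modify": 2.0}
HALF_LIFE_DAYS = 30.0


def memory_buoyancy(events, now):
    """Short-term value: recent usage counts more than old usage."""
    score = 0.0
    for event in events:
        age_days = (now - event.when).total_seconds() / 86400.0
        decay = 0.5 ** (age_days / HALF_LIFE_DAYS)   # halves every 30 days
        score += EVENT_WEIGHTS.get(event.kind, 0.0) * decay
    return score


def preservation_value(diversity, coverage, quality):
    """Long-term value: plain average of objective metrics in [0, 1]."""
    return (diversity + coverage + quality) / 3.0
```

With these assumptions, a resource that is no longer viewed or edited loses half of its buoyancy every 30 days, while its preservation value stays unchanged until the objective metrics are recomputed.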

Managed Forgetting

Inspired by central role of human forgetting

Aim:
– help in identifying and focusing on relevant information
– supporting preservation content selection

Will replace inadvertent forgetting

Managed forgetting ≠ automatic deletion

Instead: range of forgetting options, e.g.
– resource condensation
– change of indexing & ranking
– reduction of redundancy

Based on:
– Careful information value assessment
– Forgetting strategies via policies
– Forgetting options to integrate final manual checking before deletion
– Combination with multi-tier storage solution possible
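The idea of choosing among forgetting options via a policy, with a manual check before any deletion, could be sketched as follows. The threshold values, the `Policy` and `Decision` types, and the order of the rules are hypothetical; the slides name the options but not how a policy selects between them.

```python
from dataclasses import dataclass
from enum import Enum


class ForgettingOption(Enum):
    KEEP = "keep as is"
    DOWNRANK = "change of indexing & ranking"
    DEDUPLICATE = "reduction of redundancy"
    CONDENSE = "resource condensation"
    PROPOSE_DELETION = "deletion after manual check"


@dataclass
class Policy:
    # Hypothetical thresholds on values normalized to [0, 1].
    high_buoyancy: float = 0.5
    low_buoyancy: float = 0.2
    low_preservation: float = 0.2


@dataclass
class Decision:
    option: ForgettingOption
    needs_manual_check: bool = False


def decide(buoyancy, preservation, is_redundant, policy=None):
    """Map assessed values to one forgetting option under a policy."""
    policy = policy or Policy()
    if buoyancy >= policy.high_buoyancy:
        return Decision(ForgettingOption.KEEP)
    if buoyancy >= policy.low_buoyancy:
        return Decision(ForgettingOption.DOWNRANK)
    if is_redundant:
        return Decision(ForgettingOption.DEDUPLICATE)
    if preservation >= policy.low_preservation:
        return Decision(ForgettingOption.CONDENSE)
    # Never delete automatically: deletion is only proposed to the user.
    return Decision(ForgettingOption.PROPOSE_DELETION, needs_manual_check=True)
```

Deletion is only ever proposed, never executed, which keeps the sketch consistent with "managed forgetting ≠ automatic deletion".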

14

Automatic Deletion?

[Figure: use of storage tiers with decreasing memory buoyancy]

15

Contextualized Remembering

Aim:
– bringing back information into active use in a meaningful way, even if a lot of time has passed
– aiming for semantic level of preservation

Based on:
– taking into account relevant parts of context when moving to archive
– increasing contextualization of preserved content
– considering context evolution over time (evolution-aware contextualization)
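One simple way to picture evolution-aware contextualization is to store context as time-stamped facts and resolve them for a given date, so that archived content can be read in the context valid at creation time or in today's context. The sketch below is an illustration only: the `ContextFact` and `EvolvingContext` types, the relation name, and the dates are hypothetical. It borrows the holiday example's change from girlfriend to wife.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class ContextFact:
    subject: str
    relation: str
    value: str
    valid_from: date


@dataclass
class EvolvingContext:
    facts: list = field(default_factory=list)

    def add(self, subject, relation, value, valid_from):
        self.facts.append(ContextFact(subject, relation, value, valid_from))

    def as_of(self, subject, relation, when):
        """Return the value valid at `when`, or None if nothing is known yet."""
        candidates = [
            f for f in self.facts
            if f.subject == subject and f.relation == relation
            and f.valid_from <= when
        ]
        if not candidates:
            return None
        # The most recent fact that had already started at `when` wins.
        return max(candidates, key=lambda f: f.valid_from).value


# Hypothetical data: the person and the dates are made up.
context = EvolvingContext()
context.add("Mary", "relation_to_owner", "girlfriend", date(2015, 2, 1))
context.add("Mary", "relation_to_owner", "wife", date(2022, 6, 1))

at_trip = context.as_of("Mary", "relation_to_owner", date(2015, 2, 14))
today = context.as_of("Mary", "relation_to_owner", date(2035, 1, 1))
```

Here `at_trip` resolves to the context at archiving time and `today` to the evolved context, which is the information a re-contextualization step would need to present old photos meaningfully.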

Evolution-aware Contextualization & Re-contextualization

16

Context of Interpretation

[Figure: context of interpretation over time t. Labels: Information System with document D and contexts C, C‘, C‘‘, C‘‘‘; Archival Information System with Pres(D‘), Pres(C‘), Pres(C‘‘); processes: Contextualization, Context-aware Preservation, Evolution-aware Contextualization, Re-contextualization, Semantic Evolution Detection; change factors: human forgetting, change in focus, structural changes; evolution types: semantic evolution, structural evolution, terminology evolution.]

ForgetIT Project, GA600826 - Kickoff Meeting, Hannover, February 2013

17

Synergetic Preservation

• smooth and step-wise transition between active information use and preservation
• enables rich information flow in both directions
• supports more informed preservation decisions
• eases preservation adoption

Integration Framework

[Figure: the Preserve-or-Forget Framework sits between an Information Management System and an Archival Information System.]
• Information Management System: Content Management, Access, Authoring, Administration
• Preserve-or-Forget Framework: Adapter Layer, Synergetic Preservation, Extraction & Contextualization, Re-Contextualization, Managed Forgetting, Information Assessment, Condensation
• Archival Information System: Ingest, Data Management, Descr. Info., Archival Storage, AIPs, Access, Administration, Preservation Planning

18

Information Management System

• Resources + metadata:
  – ResourceID
  – Content (size, tags, aging, geo)
  – Context (folder/file usage)
  – Social features
  – Resource neighbours (graph)

Forgettor
• Assessor calculates:
  + Memory Buoyancy
  + Preservation Value
• Analyzer:
  1. Classification of resources w.r.t. strategies
  2. Triggers forgetting actions
• Stores: Strategies (forgetting strategies for different types of resources), Values, Statistics

Data flow
• Information Management System → Forgettor: resources meta-info
• Forgettor → Information Management System: resources values + decisions

Processing steps
1. Input: strategy, meta-information (content, context, neighbours), previous values
2. Processing resources based on strategies and information values
3. Storing the new values and sending them back to the IMS

Archives
• Store & access data
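The round trip between the Information Management System and the Forgettor could look roughly like the sketch below: the IMS sends resource meta-information, the Assessor computes new values from it and from the previously stored values, and a per-type strategy turns the values into a decision that is sent back. All class, field, and function names here are hypothetical, and the Assessor and the strategies are passed in as plain callables because the slides do not specify how they work.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class ResourceMetaInfo:
    resource_id: str
    resource_type: str                               # selects the strategy
    content: dict = field(default_factory=dict)      # size, tags, aging, geo
    context: dict = field(default_factory=dict)      # folder/file usage
    neighbours: List[str] = field(default_factory=list)


@dataclass
class ResourceValues:
    memory_buoyancy: float
    preservation_value: float


@dataclass
class Result:
    resource_id: str
    values: ResourceValues
    decision: str


# Assessor: (meta-info, previous values or None) -> new values
Assessor = Callable[[ResourceMetaInfo, Optional[ResourceValues]], ResourceValues]
# Strategy: values -> name of the forgetting action to trigger
Strategy = Callable[[ResourceValues], str]


class Forgettor:
    def __init__(self, assessor: Assessor, strategies: Dict[str, Strategy]):
        self.assessor = assessor
        self.strategies = strategies                  # resource type -> strategy
        self.values: Dict[str, ResourceValues] = {}   # stored values per resource

    def process(self, batch: List[ResourceMetaInfo]) -> List[Result]:
        """Assess each resource, store the new values, return decisions."""
        results = []
        for meta in batch:
            previous = self.values.get(meta.resource_id)
            new_values = self.assessor(meta, previous)
            self.values[meta.resource_id] = new_values
            # Resource types without a strategy are simply kept.
            strategy = self.strategies.get(meta.resource_type, lambda v: "keep")
            results.append(Result(meta.resource_id, new_values, strategy(new_values)))
        return results
```

Keeping the stored values inside the Forgettor mirrors the "previous values" input on the slide: each assessment can build on the last one instead of starting from scratch.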

19

Thank you

http://ForgetIT-Project.eu/