Archiving - Fundamentals


  • 8/2/2019 Archiving - Fundamentals

    1/48


    Topics

    Copyright SvalTech, Inc., 2009

    SvalTech

    Database Archiving Definitions

    Database Archiving Application Profiles

    Elements of a Successful Implementation

    Solution Comparisons

    Business Case Basics


    Database Archiving Definitions


    Definition

    Document Archiving: Word, PDF, Excel, XML
    File Archiving: structured files, source code, reports
    Email Archiving: Outlook, Lotus Notes
    Database Archiving: DB2, IMS, Oracle, SAP, PeopleSoft
    Physical Documents: application forms, mortgage papers, prescriptions
    Multi-media files: pictures, sound, telemetry

    Database Archiving: the process of removing from operational databases selected data items that are not expected to be referenced again, and storing them in an archive database where they can be retrieved if needed.


    Data Domain: Business Records

    A business record is the data captured and maintained for a single business event or to describe a single real-world object.

    Databases are collections of Business Records.

    Database Archiving is Records Retention.

    Examples: customer, employee, stock trade, purchase order, deposit, loan payment


    Drivers

        Overloaded operational databases
        Longer Data Retention requirements
        Expanded Business
        Mergers and Acquisitions
        Operational problems
        Data Governance, e-Records Retention, e-Discovery Readiness concerns
        Application Changes


    Data Retention

    The requirement to keep data for a business object for a specified period of time. The object cannot be destroyed until after the time for all such requirements applicable to it has passed.

        Business Requirements
        Regulatory Requirements

    The Data Retention requirement is the longest of all requirement lines.
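    The "longest of all requirement lines" rule can be sketched in a few lines of Python. This is a minimal illustration, not part of the original deck: the class and function names are hypothetical, and it assumes every requirement is expressed in whole years counted from the record's creation date.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class RetentionRequirement:
    source: str  # illustrative label, e.g. "business" or "regulatory"
    years: int   # how long the object must be kept (assumed whole years)


def effective_retention_years(requirements):
    """Return the longest period among all applicable requirements."""
    if not requirements:
        raise ValueError("at least one retention requirement is needed")
    return max(req.years for req in requirements)


def earliest_discard_date(created, requirements):
    """Creation date plus the longest applicable retention period."""
    target_year = created.year + effective_retention_years(requirements)
    try:
        return created.replace(year=target_year)
    except ValueError:
        # Created on Feb 29 and the target year is not a leap year.
        return date(target_year, 3, 1)


requirements = [
    RetentionRequirement("business", 3),
    RetentionRequirement("regulatory", 7),
]
print(effective_retention_years(requirements))
print(earliest_discard_date(date(2009, 8, 2), requirements))
```

    Taking the maximum, rather than any single requirement, is what guarantees the object is not destroyed while some requirement still applies.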


    Data Retention

    Retention requirements vary by business object type

    Retention requirements from regulations are exceeding business requirements

    Retention requirements will vary by country

    Retention requirements imply the obligation to maintain the authenticity of the data throughout the retention period

    Retention requirements imply the requirement to faithfully render the data on demand in a common business form understandable to the requestor

    The most important business objects tend to have the longest retention periods

    The data with the longest retention periods tends to accumulate the largest number of instances

    Retention requirements often exceed 10 years. Requirements exist for 25, 50, 70 and more years for some applications


    Data Time Lines

    For a single instance of a business record:

    create event -> operational phase -> reference phase -> inactive phase -> discard event

        operational phase: can be updated, can be deleted, may participate in processes that create or update other data

        reference phase: used for business reporting, extracted into business intelligence or analytic databases, anticipated queries

        inactive phase: no expectation of being used again, no known business value, being retained solely for the purpose of satisfying retention requirements. Must be available on request in the rare event a need arises.


    Data Time Lines

    Example activity for a single instance of a business record, by phase:

        operational: Create PO, Update PO, Create Invoice, Backorder, Create Financial Record, Update on Ship, Update on Ack

        reference: Weekly Sales Report, Quarterly Sales Report, Extract for data warehouse, Extract for business analysis, Common customer queries, Common business queries

        inactive: Ad hoc requests, Lawsuit e-Discovery requests, Investigation data gathering

    The retention requirement spans all three phases.


    Data Time Lines

    Some objects exit the operational phase almost immediately (financial records)

    Some objects never exit the operational phase (customer name and address)

    Most transaction data has an operational phase of less than 10% of the retention requirement and a reference phase of less than 20% of the retention requirement

    Inactive data generally does not require access to application programs: only access to ad hoc search and extract tools
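    To make the phase proportions concrete, here is a hedged Python sketch that assigns a phase to a record from its age alone. The 10% and 20% figures come from the slide, but treating them as fixed cut-offs is an assumption made for illustration; real applications would decide phase from business rules (some objects never leave the operational phase). The `DISCARDABLE` state and all names are hypothetical.

```python
from enum import Enum


class Phase(Enum):
    OPERATIONAL = "operational"
    REFERENCE = "reference"
    INACTIVE = "inactive"
    DISCARDABLE = "discardable"  # retention requirement has passed


def classify_phase(age_years, retention_years,
                   operational_fraction=0.10, reference_fraction=0.20):
    """Assign a phase using age-based cut-offs (an illustrative simplification)."""
    operational_end = retention_years * operational_fraction
    reference_end = operational_end + retention_years * reference_fraction
    if age_years < operational_end:
        return Phase.OPERATIONAL
    if age_years < reference_end:
        return Phase.REFERENCE
    if age_years < retention_years:
        return Phase.INACTIVE
    return Phase.DISCARDABLE


# With a 10-year retention requirement the record is inactive from year 3 on.
for age in (0.5, 2, 5, 10):
    print(age, classify_phase(age, retention_years=10).value)
```

    Under these assumed cut-offs a record spends roughly 70% of its retention period in the inactive phase, which is the portion a database archive is meant to hold.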


    Application Segments

    [Figure: segment strings plotted against time, with a marker for each major metadata break. Labels: S1, S2, OS1, OS2.]

    case 1
        Application: customer stock transactions
        Source 1 = Trades, All Stock Trades

    case 2
        Application: customer stock transactions
        Source 1 = Stock Trades, North American Division
        Source 2 = Stock Trades, Western Europe Division


    Application Segments

    [Figure: segment strings plotted against time, with a marker for each major metadata break. Labels: S1, S2, S3, OS1, OS2, OS3, OS4.]

    case 3
        Application: customer stock transactions
        Source 1 = Stock Trades, North American Division, application X
        Source 2 = Stock Trades, Western Europe Division, application Y
        Source 3 = acquisition of Trader Joe: merged with Source 1 on 7/15/2009
        Source 4 = acquisition of Trader Pete: merged with Source 1 on 8/15/2009


    Application Segments

    A well-designed database archive preserves application segments
        Data is always kept in segment format
        Metadata is preserved at the segment level

    The archive administrative catalog shows
        Segments
            Segment version number
            Time period covered
            System generated from
        Time order of consecutive segment strings
        Parallel segment strings for the same application
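    The catalog contents listed above can be modelled as a small data structure. The following Python sketch is illustrative only: the field names, the `string_id` notion, and the idea that the version number increases at each major metadata break are assumptions, not details given in the deck.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SegmentEntry:
    application: str
    string_id: str      # hypothetical label for a segment string, e.g. "S1"
    version: int        # assumed to increase at each major metadata break
    start: date         # time period covered
    end: date
    source_system: str  # system the segment was generated from


def segment_strings(catalog, application):
    """Group an application's segments by string, each ordered by time."""
    strings = defaultdict(list)
    for entry in catalog:
        if entry.application == application:
            strings[entry.string_id].append(entry)
    return {sid: sorted(entries, key=lambda e: e.start)
            for sid, entries in strings.items()}


def segments_covering(catalog, application, day):
    """Return every segment (across parallel strings) that covers a given day."""
    return [e for e in catalog
            if e.application == application and e.start <= day <= e.end]


catalog = [
    SegmentEntry("stock trades", "S1", 2, date(2008, 1, 1), date(2009, 12, 31), "OS1"),
    SegmentEntry("stock trades", "S1", 1, date(2005, 1, 1), date(2007, 12, 31), "OS1"),
    SegmentEntry("stock trades", "S2", 1, date(2006, 1, 1), date(2009, 12, 31), "OS2"),
]
strings = segment_strings(catalog, "stock trades")
print([e.version for e in strings["S1"]])
print(len(segments_covering(catalog, "stock trades", date(2008, 6, 1))))
```

    Keeping one entry per segment, rather than one per application, is what lets the catalog show both the time order within a string and the parallel strings that exist for the same application.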


    Database Archiving Application Profiles


    Overloaded Operational Database

    Transaction data
    Lots of data
        Hundreds of millions of rows
    High daily transaction rate
    24/7 operational availability requirement
    Long retention period (7 years or more)
    Short useful active life (less than 2 years)
    Low access requirements during the inactive period
        Very low access frequency
        Response time not critical
        Access requirements are simple, easily satisfied with ad hoc tools


    If You Don't Archive

    Continue to keep all data in the operational database.

    Inactive data will impact operational performance
        Harder to tune
        Scans take longer
    Utility functions will take longer to execute
        Backups
        Database reorganizations
    Recovery operations take longer
        Outage recoveries
        Disaster recoveries
    System costs will escalate
        Need more expensive online storage
        Need system upgrades
        Pay more for application and DBMS software
    Older data will become less reliable


    Retired Application

    Merger of companies results in an operational application being duplicated
    Data structures are not compatible
        One keeps data elements not in other
        One encodes data elements differently
        One designed for different OS/DBMS than other
    Decision is made to use one system and abandon the other one
    Meets all requirements of an operational application


    If You Don't Archive

    Must retain old application environment to access data
        Old System
        Old Application Program
        Old DBMS
    Must keep knowledgeable staff to access
        Application experts
        System experts
        DBA function
    Result: pay the high cost of the old application environment and staff until the last record reaches the end of its retention period.

    Or, must merge data into active application
        Higher cost and time of conversion
            Data conversion problems
            Data loss
            Resolution of data quality issues
        Resulting database is huge
            Operational problems
            Lengthy utility runs
            Lengthy recovery periods
            Escalating system costs


    Application Renovation Project

    Application is undergoing major change
        Replaced with packaged application
        Legacy modernization
        Legacy termination
        Rewritten to be web-centric
        Need to satisfy new requirements
    Old data structures are out of date
        Legacy DBMS
        Legacy file system
    Data meets all other requirements for archiving operational application


    If You Don't Archive

    Must convert all data in one system to other system
        More expensive and complex design phase
            Must accommodate old data in new design
            May compromise new design
        Higher cost and longer conversion period
            Data conversion problems
            Data loss
            Resolution of data quality issues
        Resulting data is less reliable


    Elements of a Successful Implementation


    Architecture of Database Archiving

    [Figure: architecture diagram. Components shown:]

        Operational System: OP DB, application program, archive extractors
        Archive Server: Archive Administrator, Archive Designer, Archive Data Manager, Archive Access Manager, archive catalog, archive storage


    Archive Designer

    Metadata
        Capture current metadata
        Validate it
        Enhance it
        Design archive storage format

    Data
        Define business records to be archived
        Define source of data
        Define data structures within operational system
        Define reference data needed to include with it
        Define archive format of data

    Policies
        Define extract policy (when a record becomes inactive)
        Define operational disposal policy (when to remove from operational database)
        Define storage policy (how to protect data in archive)
        Define discard policy (when to remove from archive)
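    The four policies could be captured as one declarative object per application. The sketch below is a hypothetical Python representation: the field names, the day-based thresholds, and the example values are assumptions chosen for illustration, not a format defined by the deck or by any product.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ArchivePolicies:
    extract_after_days: int   # extract policy: idle time before a record is inactive
    delete_on_extract: bool   # operational disposal policy: delete (True) or only mark
    storage_copies: int       # storage policy: number of protected copies to keep
    discard_after_days: int   # discard policy: age at which the archive may drop it

    def __post_init__(self):
        if self.discard_after_days < self.extract_after_days:
            raise ValueError("discard must not precede extract")

    def is_extractable(self, last_activity, today):
        return today - last_activity >= timedelta(days=self.extract_after_days)

    def is_discardable(self, created, today):
        return today - created >= timedelta(days=self.discard_after_days)


policies = ArchivePolicies(
    extract_after_days=2 * 365,
    delete_on_extract=True,
    storage_copies=2,
    discard_after_days=7 * 365,
)
today = date(2009, 8, 2)
print(policies.is_extractable(date(2007, 1, 1), today))
print(policies.is_discardable(date(2007, 1, 1), today))
```

    Holding the policies as data, separate from the extractor code, means the extract and discard jobs can be driven by the design rather than hard-coded per application.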


    Archive Extractors

    Extractor process
        Verify consistency with design metadata
        Extract data as defined in designer
        Mark or delete from operational database as defined in designer
        Pass data to archive data manager
        Keep audit records on everything done

    Do not impact operational performance
        Support interruptions with transaction level recovery
        Support restart
        Finish scans within acceptable time periods

    Scheduling
        Establish periodic executions
        Find non-disruptive periods
        Be consistent
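    A minimal sketch of a restartable extractor, using Python's standard `sqlite3` module with two in-memory databases standing in for the operational database and the archive. The table names, columns, and batch size are hypothetical. Each batch is committed to the archive before it is deleted from the operational side, so an interruption between the two steps leaves the rows in both places and a rerun simply picks them up again.

```python
import sqlite3


def extract_inactive(op_db, archive_db, cutoff, batch_size=100):
    """Move orders created before `cutoff` into the archive, one small batch at a time."""
    moved = 0
    while True:
        rows = op_db.execute(
            "SELECT id, created, amount FROM orders "
            "WHERE created < ? ORDER BY id LIMIT ?",
            (cutoff, batch_size),
        ).fetchall()
        if not rows:
            return moved
        # Step 1: write to the archive and record an audit entry in one transaction.
        with archive_db:
            archive_db.executemany(
                "INSERT OR IGNORE INTO orders_archive VALUES (?, ?, ?)", rows)
            archive_db.execute(
                "INSERT INTO audit (action, row_count) VALUES ('extract', ?)",
                (len(rows),))
        # Step 2: only then delete the same rows from the operational database.
        with op_db:
            op_db.executemany(
                "DELETE FROM orders WHERE id = ?", [(row[0],) for row in rows])
        moved += len(rows)


op_db = sqlite3.connect(":memory:")
archive_db = sqlite3.connect(":memory:")
op_db.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, created TEXT, amount REAL)")
archive_db.execute(
    "CREATE TABLE orders_archive (id INTEGER PRIMARY KEY, created TEXT, amount REAL)")
archive_db.execute("CREATE TABLE audit (action TEXT, row_count INTEGER)")
with op_db:
    op_db.executemany(
        "INSERT INTO orders VALUES (?, ?, ?)",
        [(1, "2001-03-01", 10.0), (2, "2002-06-15", 20.0), (3, "2009-01-10", 30.0)])

moved = extract_inactive(op_db, archive_db, cutoff="2007-01-01", batch_size=1)
print(moved)
```

    Small batches keep each transaction short, which is one simple way to limit the impact on operational work; a rerun after an interruption may write a duplicate audit entry for the repeated batch, which a production design would need to reconcile.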


    Archive Extractors: Physical vs. Application Extractors

    [Figure: operational system with the OP DB, an application program, and two archive extractors]

    Physical Extractor
        Gets/deletes data directly from the database
        Works with tables, rows, columns

    Application Extractor
        Gets/deletes data from an application API
        Works with virtual tables, rows, columns (via the application program)


    Archive Access

    Query Capability
        Determine applicability based on archive segment versions of metadata
        SQL based is best, if possible
        Employ external indexes to determine which archive segments to look into
        Employ internal indexes to avoid reading all of an archive segment

    Support standard access tools
        Report generation (such as Crystal Reports)
        Generic query tools
        JDBC interface

    Support metadata version browsing

    Support generation of load files based on query results

    Support generation of load files formatted for the original data source, based on query results
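    The two index levels can be illustrated with a toy Python example. Here the "external index" is just each segment's date range, used to skip segments that cannot match, and the "internal index" is a sorted key list searched with `bisect`, so a segment is not read end to end. The class and function names are hypothetical, and a real archive would likely use richer index structures.

```python
from bisect import bisect_left, bisect_right
from dataclasses import dataclass
from datetime import date


@dataclass
class ArchiveSegment:
    name: str
    days: list      # sorted trade dates; serves as a simple internal index
    payloads: list  # payloads[i] belongs to days[i]

    @property
    def first_day(self):
        return self.days[0]

    @property
    def last_day(self):
        return self.days[-1]


def candidate_segments(segments, start, end):
    """External index: keep only segments whose date range overlaps [start, end]."""
    return [s for s in segments if s.first_day <= end and s.last_day >= start]


def query(segments, start, end):
    """Return (day, payload) pairs in [start, end], touching only relevant segments."""
    results = []
    for seg in candidate_segments(segments, start, end):
        lo = bisect_left(seg.days, start)   # internal index: binary search
        hi = bisect_right(seg.days, end)
        results.extend(zip(seg.days[lo:hi], seg.payloads[lo:hi]))
    return results


segments = [
    ArchiveSegment("trades-2007", [date(2007, 1, 5), date(2007, 6, 30)], ["t1", "t2"]),
    ArchiveSegment("trades-2008", [date(2008, 2, 1), date(2008, 11, 20)], ["t3", "t4"]),
]
hits = query(segments, date(2008, 1, 1), date(2008, 6, 30))
print(hits)
```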


    Archive Administration

    Manage Archive Catalog
        Application archive designs
        Audit trails
        Results logs

    Manage Archive Storage Systems
        Ensure periodic readability checks
        Maintain access audit trails

    Manage Archive Access
        Authorizations for users
        Authorizations for specific events
            Unloads
        Ensure audit records are created for all access

    Manage e-Discovery requests

    Ensure Extract and Discard processes are run when they are supposed to

    Manage Metadata Change Process


    Solution Comparisons


    How Archive Data is Stored

        Database LOAD files, saved image copies
        Parallel databases, partitions of operational db
        Reformatted archive segments stored as files (load files, XML files, special files)

    [Figure labels: typically homegrown solutions; typically vendor solutions]


    Storage Comparisons

    Backup Solutions (image copies, unload files)
        Requires restaging data to access
        Not searchable in archive
        Problems handling metadata changes

    DB Solutions (parallel, partitioned, db arrays)
        Don't get cost savings
        Requires database administration
        Problems handling metadata changes

    Non-DB Special Files (XML, load files plus, proprietary)
        Indexed
        Direct access via SQL
        Separated by archive segments
        Metadata resolution across archive segments
        Can exploit storage subsystem capabilities
        Can use hosted storage


    Data Structure Comparisons

    Things to look for
        Is metadata maintained in archive?
        Is metadata validated?
        Is metadata enhanced?
        Is data restructured to achieve source independence?
            from application programs
            from DBMS type
            from source OS/hardware
        Is reference information captured in archive?
        Is data maintained in original form in archive forever?
        Can user see data form prior to conversions?


    Data Access in the Archive

    Things to look for
        Can requests be satisfied directly from the archive?
        Can common generic tools be used?
            JDBC
            Report generators
        Can data be unloaded in forms for re-platforming?
        Can data be accessed efficiently?
            Is it indexed?
        Are representations consistent?
        Are metadata differences accounted for?


    Administration of the Archive

    Things to look for
        Is there a full-time administrator?
        Is there an archive catalog database?
            what is in the archive
            where is it stored
        Is security maintained?
            different from operational
        Are actions and events logged?


    A Myth

    Homegrown Solutions are good enough.

    Truth:

    They do solve the problem of getting inactive data out of operational databases

    However,

    They do not realize maximum cost savings

    They generally do not realize any cost savings

    They generally cannot be directly accessed

    They often require original application environment

    They are never indexed

    They often compromise data integrity across metadata changes

    They often offer less protection from data loss


    A Myth

    Homegrown Solutions are cheaper and faster to implement.

    Truth:

    A good vendor solution will guide you through the process and get done quickly

    Managing the archive is easier and cheaper than managing databases


    Business Case Basics


    Reason for Archiving

    [Figure: database size today, operational only vs. operational plus archive]

    All data in operational db
        most expensive system
        most expensive storage
        most expensive software

    Inactive data in archive db
        least expensive system
        least expensive storage
        least expensive software

    In a typical op db, 60-80% of data is inactive. This percentage is growing.


    Cost Saving Elements

    Look for and compute difference in storage costs
        front-line vs archive storage
        byte count differences between operational and archive

    Look for and compute difference in system costs
        operational vs archive systems
        are operational system upgrades avoided
        are software upgrades avoided
        can systems be eliminated for application
        can software be eliminated for application

    Look for savings on people costs
        can people be eliminated or redirected for retired applications

    Potential savings on changes/application renovations
        simplification of design
        elimination of data conversions
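    The storage part of this comparison is simple arithmetic. The Python sketch below shows one way to compute it; every figure in the example (database size, per-GB costs, and the ratio of archive bytes to operational bytes) is a made-up placeholder, and only the inactive share is chosen to fall inside the 60-80% range quoted earlier in the deck.

```python
def annual_storage_savings(total_gb, inactive_fraction,
                           operational_cost_per_gb, archive_cost_per_gb,
                           archive_size_ratio=1.0):
    """Yearly storage cost before archiving minus yearly cost after archiving.

    archive_size_ratio is the byte count in the archive divided by the byte
    count the same data occupied in the operational database (assumed input).
    """
    inactive_gb = total_gb * inactive_fraction
    before = total_gb * operational_cost_per_gb
    after = ((total_gb - inactive_gb) * operational_cost_per_gb
             + inactive_gb * archive_size_ratio * archive_cost_per_gb)
    return before - after


# Hypothetical inputs: 1000 GB database, 75% inactive, front-line storage at
# 10 per GB per year, archive storage at 2 per GB per year, archive copy half the size.
savings = annual_storage_savings(
    total_gb=1000,
    inactive_fraction=0.75,
    operational_cost_per_gb=10.0,
    archive_cost_per_gb=2.0,
    archive_size_ratio=0.5,
)
print(savings)
```

    System, software, and people costs would be added as further terms in the same before-and-after comparison.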


    Operational Efficiency Impacts

    Will operational performance be enhanced with less data?

    Will utility time periods be reduced (backup, reorganization)?
        fewer occurrences needed
        less data to process each time

    Will recovery times be reduced, and what is that worth?
        interruption recoveries
        disaster recoveries

    Will implementation of data structure changes be improved?
        avoided
        reduced amount of data to unload/modify/reload


    Business Case Summary

    Database Archiving solutions generally provide for lower cost software, can use lower cost storage more efficiently, and run on smaller machines.

    Each business case is different
        Many factors can be used in building a business case
        Applications have been justified on storage costs alone
        Applications have been justified on disaster recovery time alone
        Applications have been justified on better data security alone

    Each organization will have many potential applications
        Having a database archiving practice can create synergies across many applications, thus adding more value


    Final Thoughts

    Database Archiving is coming

    Database Archiving is good
        Reduces cost
        Improves operational efficiency
        Reduces risk

    Need a complete solution to be effective

    Need professional staff
        Educated
        Full-time