Backup and Recovery Concepts

Backup and recovery concepts

Backup and recovery conceptsA backup of a database is a database copy, together with control information, ready to be restored in the event of a failure. A database backup minimizes data loss and gives you the ability to reconstruct the failed database from the backup copy using the recovery process.

There are many types of failure that cause a database to need recovery. However, not all types of failure require human interaction. Here are various types of failures that you may encounter:

Statement failures

A statement failure occurs when there is a logic misstep in an application. This failure could have a number of different causes, for example a statement running in an infinite loop, a user not having the correct privleges needed to perform a certain task (causing a valid statement to fail) or insert failure due to running out of space.

In Oracle, statement failures usually don't require an administrator, except to perform minimal tasks such as adding more disk space, fixing the application logic, or assigning proper user privileges. In DB2 UDB, statement failures are handled similarly. An administrator is not needed to solve a statement failure in terms of backup and recovery.

User errors

Valid yet destructive statements such as deleting an entire payroll table, dropping a claims table or even inadvertently dropping an entire database can cause extended down time for your database applications. You will need more than proper education and guidance in order to avoid these errors. Data definition language (DDL) statements are not statements that can be rolled back. So if a user error has involved DDL, proper measures will need to be taken to bring back your database.

For example, if a table is mistakenly dropped, as an administrator you have two choices. You can either use a copy of your backup image and roll it forward to the time right before the table was dropped (point-in-time recovery), or you can restore a logical backup through the exp utility. Both options could potentially result in lost data.

When a user drops a table in DB2 UDB, you can either do a database level point-in-time recovery, rolling forward to the point before the table is dropped; or, better still, use a table space level roll forward operation. That way, you do not have to bring the database down, and data is still accessible to your users. We shall see this in the scenario section.

Process failures

Failure of user, server or background processes is due to abnormal termination or disconnection from these processes. A failed process will cause a task to be unable to continue.

For a process failure on Oracle, no human intervention is required, as the Oracle PMON background process will periodically monitor for these aborted processes. For user and server processes, PMON will release the locks held by users, roll back every uncommitted transaction, and release any resources this process is using. If an Oracle background process is aborted, often a restart of the instance is required.

On DB2 UDB, process failures such as a dead db2agent will cause a crash recovery. In crash recovery, transactions that are not committed will be rolled back while committed data will be flushed to disk. To enable this automatic crash recovery feature, verify that the database configuration parameter autorestart is set to ON. By default, autorestart is ON.

Database instance failures

A database instance failure is typically caused by an operating system crash or a power outage.

When the database instance fails on Oracle, there is no need for human intervention as the Oracle SMON background process will detect the problem as soon as you restart your machine. SMON will initiate the instance recovery.

In DB2, when database manager and memory structures go down due to power failure, disk corruption or network failure, crash recovery will be needed to recover DB2 to a consistent and usable state. To enable this automatic crash recovery feature, verify that the database configuration parameter autorestart is set to ON. By default, autorestart is ON. You can also issue the command RESTART DATABASE to manually restart the database. The RESTART DATABASE command uses active logs to roll back changes that are not yet committed. Changes that are already committed but not yet flushed to disk will be flushed to disk.

Media failure

Media failure occurs when someone accidentally deletes a database file from the file system, the entire hard disk goes down with its data files, causing data to be inaccessible, or just pure data corruption. This is a more severe kind of failure than the types we've already discussed.

In Oracle, media failure does require an administrator to recover. Usually media failure is caused by either the complete loss of a data file or an SCN timestamp being out of sync with the rest of the database. Recovery from this kind of failure varies from one situation to the other, primarily depending on the mode of archive and the data files that have been corrupted.

These failures are handled much the same way in DB2 UDB and Oracle, by restoring either offline or online backups and using roll forward operations to recover the logs.

Because media failure is the most complex of all the types of failure, you must have a well thought-out strategy to plan for it. The rest of the article will be devoted to planning for and recovering from this last form of failure.

Back to top

Data structures used for backup and recoveryOracle and DB2 UDB both have a set of components to make up the backup and recovery mechanism. The architecture diagrams you see in Figure 1 and Figure 2 are overviews of the primary components used to provide the backup and recovery for Oracle and DB2.

Figure 1. Oracle architecture

Oracle memory structure Buffer Cache Memory allocated for storing data blocks read from physical data files.

Shared Pool Memory allocated for storing parsed SQL statements, PL/SQL procedures and data dictionary information.

Log Buffer Memory allocated for storing before and after image of changed data. The entry is sequential.

Large Pool Memory allocated for use by RMAN for backup and recovery.

Oracle background processes Database Writer (DBWr) Writes to physical data files the dirty bits of buffer asynchronously.

Log Writer (LGWr) Writes entries from log buffer to redo the log files.

CheckPoint (CKPt) Synchronizes the headers of data files and control files with the current redo log and checkpoint numbers.

Archiver (ARCH) Automates the copying of redo logs. Manual archive of redo logs will be required if this is not turned on.

Oracle database structures Data files Physical files where data is stored.

Control files Files that contain the physical structure and status of the database such as names of data file and log files with the absolute paths, size of files, block size, online or offline status of data files etc. It contains also the names and paths, file size and block size of log files.

Redo logs Files that contain the before and after images of changed data. Redo logs are needed for recovery.

Parameter files Files that store the parameters for instance startup. You can have multiple ways of starting an instance. Oracle first searches if the spfile_SID.ora exists. If it does not, Oracle searchs for the spfile.ora parameter file. If neither spfile_SID.ora nor spfile.ora exist, Oracle will use the init_SID.ora parameter file.

Archive logs Physical copies of the online redo logs. Archived logs are needed in online recovery.

Next we'll look at the DB2 UDB architechture and structures.

Figure 2. DB2 UDB architecture

DB2 UDB memory structure Package Cache Memory allocated to store both static and dynamic SQL statements.

Buffer Pool Memory allocated to store data before flushing it to disks

Log Buffer Memory used to store all changes to database before it is flushed to the logs on disk.

Figure 3. DB2 UDB database structure

Drive / Directory The drive or directory specified on the CREATE DATABASE command

DB2 Instance Name The name of the DB2 Instance owner

NODE0000 The partition number of the database. 0 for a non-partitioned database

SQL00001 Database ID starting from 1

SQLOGDIR The default log directory for the database

SQLT0000.0 The catalog table space, SYSCATSPACE

SQLT0001.0 The temporary table space, TEMPSPACE1

SQLT0002.0 The user table space, USERSPACE1

Back to top

Backup and recovery optionsThere are two modes of backup and recovery for both Oracle and DB2 UDB databases: offline and online. Either mode can recover with either complete or incomplete recovery.

Offline backup requires that all applications disconnect from the database, whereas online backup allows transactions to carry on while backup is in progress. There are implications of in terms of restoration of the backup mode chosen, in that the backup mode determines the recovery mode. We shall see this in the following sections.

Complete recovery, as the name implied, recovers all committed transactions, whereas incomplete recovery brings back transactions with some lost of data. Both Oracle and DB2 UDB allow you to either recover to the current time with no loss of data, or to a time previous to the current time, with some loss of data.

Often, the goal is to reach a compromise between business and operational needs with the recovery mode chosen. For instance, if the database is not mission-critical and 24x7, some downtime and loss of data may be acceptable, and re-keying of data may be an acceptable option should media failure occurs. Depending on the available backups and logs available, it is possible that sometimes there is no other choice but to do an incomplete recovery.

There are two types of DB2 logging, and each logging method has an implications for recovery. The two types of logging are circular and archive logging. When you choose to use circular logging (the default), the only choice is to do an offline backup and version recovery. If you choose to have archive logging, and do an online backup and roll forward recovery, you can recover to a point in time or to the end of the logs for the minimum loss of data.

We'll look at these two modes of backup and recovery in detail below.

Offline (Cold) BackupIn Oracle, the simplest form of all is to do an offline or cold backup. Offline backup is the backup of all the physical database files after a normal database shutdown. This backup method is the most common. A database using this form of backup is running in NOARCHIVELOG mode, where any changes made to the database after the backup will be irrecoverable should a media recovery is needed. As the name NOARCHIVELOG implies, there is no archiving of redo logs, and thus no rolling forward beyond the online redo logs. In case of media failure to any part of the database files, simply restore all the database files, and of course all the changes made after the backup are irrecoverable. Administrators are advised to do an offline backup whenever the schema of the database changes, or changes are made such as adding or removing data files, and adding or removing of table spaces.

The database files can be copied from disk to disk or from disk to tape, depending on the backup procedure chosen. Operating system command such as cpio, tar and dd for Unix, and backup manager or ocopy for Windows can be used to copy physical database files to a backup media. Database files consist of

Data files

Control files

Online redo log

You can loocate these files if you have adhered to Oracle's Optimal Flexible Architecture (OFA). See Figure 4.

Figure 4. Optimal flexible architecture

In this example, ORACLE_BASE would be set to the value /u01/app/oracle, and ORACLE_HOME would be set to either /u01/app/oracle/product/8.0.4 or /u01/app/oracle/product/8.1.7.

The steps to perform an offline backup on Oracle are:

Shut down database in normal mode

Copy all data files, control files and online redo logs file to backup media

Offline backup is also the simplest form of backup in DB2 UDB. Offline backup requires that a full database backup be taken and, obviously, that the database be offline while the backup is in process. In other words, while doing an offline backup, users are not able to access the database.

If you are using circular logging, offline backup is the only type of backup supported. This is the default logging method when you first create a database. For circular logging, both log retain for recovery status and user exit for logging status are set to NO. Two parameters LOGRETAIN and USEREXIT are both set to OFF.

Figure 5. LOGRETAIN OFF

In circular logging, logs are used in a round robin manner much the same way as Oracle uses redo logs in NoArchiveLog mode. As long as transactions in the logs are flushed to disks, logs can be reused. However, during a long running transaction, a log full situation may occur, where all the primary logs are active. When this happens, secondary logs will be allocated dynamically. The DB2 configuration parameter LOGSECOND specifies the maximum number (up to 254) of secondary logs that can be allocated dynamically when needed. Secondary logs will only be removed when the database is deactivated.

Figure 6. Primary and secondary logs

Figure 7. Database configuration LOGSEC

Be aware that if you decide to use circular logging and offline recovery as your backup strategy, all transactions after the last offline backup will be lost.

You can perform an offline backup either from the Control Center or using command line processor. To do a Control Center backup, you will

Bring up the Control Center.

Go to the desired database (for example Sample) and right click on Backup, as shown in Figure 8. Figure 8. Backing up in DB2 UDB from the Control Center

Note that the status of your logging is displayed.

Figure 9. Circular logging

Click Next to cotinue. Choose FileSystem as the media type. Click Add and fill in the path (eg. e:\tmp). Click Next.

Figure 10. Choosing media type

Note that your only option, Full Backup, is shown. Click Next. Figure 11. Backup option screen

Take the defaults. ClickNext.

Figure 12. Choosing backup parameters

Enter the user id and password. Leave the rest at the defaults.

Figure 13. Choosing to run immediately or run as a task

If you choose to run as a task, you will see the screen shown in Figure 14:

Figure 14. Task creation summary

You can either run it immediately from the DB2 Task Manager, or let it be triggered according to a scheduled time. The completed task will be displayed as you see in Figure 15:

Figure 15. Tasks in Task Center

If you choose to run the backup from the DB2 command line, the syntax for command DB2 backup is:

Listing 1. Command line Backup Syntax BACKUP DATABASE database-alias [USER username [USING password]]

[TABLESPACE (tblspace-name [ {,tblspace-name} ... ])] [ONLINE]

[INCREMENTAL [DELTA]] [USE {TSM | XBSA} [OPEN num-sess SESSIONS]] |

TO dir/dev [ {,dir/dev} ... ] | LOAD lib-name [OPEN num-sess SESSIONS]]

[WITH num-buff BUFFERS] [BUFFER buffer-size] [PARALLELISM n]

[WITHOUT PROMPTING]

To do a backup in offline mode, simply issue a DB2 command:

Listing 2. Simple offline backup example backup database sample to e:\tmp

Or you can have more options such as

Listing 3. Offline backup example with some parameters backup database sample to e:\tmp

backup database sample to e:\tmp1, e:\tmp2

with 4 buffers

buffer 4096

parallelism 2

When the backup command is issued, a backup image will be created. Note that the backup file names are slightly different for UNIX and Windows. For Windows, your path and filenames will have this format:

Figure 16. Backup files in Windows

UNIX backup files will look something like this:

Figure 17. UNIX backup files

Table 1 shows similarities and differences between Oracle and DB2 UDB for offline backup.

Table 1. Comparing offline backup: Oracle and DB2 UDBOffline backup (Oracle) Offline backup (DB2 UDB)

No particular authority needed

Operating in NoArchiveLog mode (default)

Circular logs contain both committed and uncommitted data

Logs are written round robin manner namely when the last log is full, it will overwrite log 1

For a long running transaction when all logs are active, log writing will overwrite in round robin manner

You can roll forward to do recovery as long as redo logs are not overwritten

Need to shutdown clean before backing up

All transactions since the last backup will be lost

Use OS level utilities to copy database files to other disks or media SYSADM, SYSCTRL or SYSMAINT authority is required

Operating in LOGRETAIN and USEREXIT = OFF

Redo logs contain both committed and uncommitted data

Logs are written round robin manner namely when the last log is full, it will overwrite log 1

For a long running transaction when all logs are active, secondary logs will be allocated according to parameter LOGSECOND

No rolling forward allowed

No connection allowed. You will get a SQL1035N: The database is currently in use. SQLSTATE=57019 error if there is an active connection

All transactions since the last backup will be lost

Issue Backup Database command to backup

Recovery from offline (cold) backup in OracleThere is nothing magical about restoring an offline backup in Oracle. You simply copy back the physical database files into the right locations. In cases where you need to relocate the data files, you need to update the control file by issuing the following command.

Listing 4. Command for relocating data files in Oracle alter database rename file to

For DB2 UDB, each backup will create a timestamp. Therefore, the restore will be according to the backup timestamp to ensure the latest backup image is restored in order to minimize the loss of data. Note, the more frequently you take a backup, the less data will be lost.

Figure 18 illustrates an example. At unit of work n, a media failure renders a directory to be corrupted and inaccessible. You can either restore to Backup DB3 or earlier. Imagine, if Backup DB3 isnt taken, you have no choice but to get earlier than backup that was taken at 2004062510.

Figure 18. Offline backup and restore

For DB2 UDB, there are again two ways you can opt to do a restore. You can either use the Control Center or command line processor (CLP). To restore from the Control Center follow these steps:

From the Control Center, select the database (for example, Sample) and right click to perform a restore. You have an few options as shown below. Click Next.

Figure 19. Offline restore options

Highlight a backup image and move to the right panel. Click Next.

Figure 20. Backup image selections

You may choose to redirect the containers if desired:

Figure 21. Container redirect options

Note that your only option, Full Backup, is shown. Click Next

Figure 22. Choosing restore parameters

Take the defaults. Click Next.

Figure 23. Choosing further restore parameters

Take the defaults or choose to run immediately. Enter user id and password. Click Next.

Figure 24. Task scheduling

Click Finish.

Once completed restoring, you will see something like this in Task Manager:


To do a CLP restore of an offline backup, the command syntax is as shown in Listing 5:

Listing 5. Command for restore database RESTORE DATABASE source-database-alias

{ restore-options | CONTINUE | ABORT }

restore-options:

[USER username [USING password]] [{TABLESPACE [ONLINE] |

TABLESPACE (tblspace-name [ {,tblspace-name} ... ]) [ONLINE] |

HISTORY FILE [ONLINE]}] [INCREMENTAL [AUTOMATIC | ABORT]]

[{USE {TSM | XBSA} [OPEN num-sess SESSIONS] |

FROM dir/dev [ {,dir/dev} ... ] | LOAD shared-lib

[OPEN num-sess SESSIONS]}] [TAKEN AT date-time] [TO target-directory]

[INTO target-database-alias] [NEWLOGPATH directory]

[WITH num-buff BUFFERS] [BUFFER buffer-size]

[DLREPORT file-name] [REPLACE EXISTING] [REDIRECT] [PARALLELISM n]

[WITHOUT ROLLING FORWARD] [WITHOUT DATALINK] [WITHOUT PROMPTING]

For example, to restore a particular image from E:\tmp, you will have to specify the backup image to be restored. The command in Listing 6 will restore an image taken at June 17 around 10:00 AM.

Listing 6. Simple restore database command db2 restore database sample from e:\tmp taken at 2004061710

Figure 26. Restore database using CLP

There are times when restoring to the original directories is not possible, for example with media failure. You can use a RESTORE DATABASEREDIRECT to use different containers. Directory and file containers are automatically created if they do not already exist. For example, if my tablespace 0 needed to be redirected, I would issue these commands:

Listing 7. Containers redirected using CLP db2 restore database sample from e:\tmp taken at 2004061710 replace existing

db2 set tablespace containers for 0 using (path 'E:\tmp\0')

(note if its a DMS tablespace, you will need a set tablespace containers for 0

using (file 'e:\tmp\0.dms' )

db2 restore database sample continue

Table 2 compares offline restore for DB2 UDB and Oracle.

Table 2. Comparison of offline restore for Oracle and DB2 UDBOffline backup (Oracle) Offline backup (DB2 UDB)

No particular authority needed

Administrators doing backups will have to label them correctly so that during restore, a wrong copy of backup will not be restored.

Data files can be renamed and relocated using Alter Database Rename File to DDL statement

Cannot restore to different database SYSADM, SYSCTRL or SYSMAINT authority is required to restore to an existing database. Restoring to a new database will require SYSCTRL or SYSADM authority

During restore, administrators are able to pick backups from certain timestamps. Backup command both from Control Center and CLP create timestamp for backups to ease the restore.

You can redefine table space containers by invoking the RESTORE DATABASE command and specifying the REDIRECT parameter

You can either restore to an existing database or a new database

Online (hot) backupOracle online backup

In Oracle, online backup requires that the database operate in ARCHIVELOG mode. At shops where the database must stay up 24x7, it is impossible to shut out users from accessing the database for an offline backup. In this kind of scenario, the database should run in ARCHIVELOG mode, where transactions keep running while the backup process is taking place.

Unlike offline backup, online backup only requires that data files and control files be backed up. The unit of backup for offline database is the entire database, whereas the unit of backup for online backup is some or all table spaces.

The entire notion of online backup in Oracle is that as users carry out transactions, all changes made to the database, either committed or uncommitted, are stored to the redo log buffer and subsequently are written to the online redo log files by the LGWr process. Redo logs are written in a round-robin manner; therefore, before they are overwritten, the redo logs need to be archived either manually or automatically by kick-starting the ARCH process. When all transactions are being logged and later archived preferably with multiplexing (using parameter LOG_ARCHIVE_DUPLEX_DEST), a recovery can be carried out using these archived redo logs should media recovery is needed.

Note that using online backup itself doesnt guarantee that you will not lose data. Steps such as immunizing ther database by multiplexing (putting in different locations) control files, using online redo logs and archiving redo logs are necessary to avoid a single point of failure.

To switch to an online with auto archive from the default offline backup in Oracle, you need to do the following:

Fill in the parameters LOG_ARCHIVE_START, LOG_ARCHIVE_DEST, LOG_ARCHIVE_FORMAT and LOG_ARCHIVE_DUPLEX_DEST, LOG_ARCHIVE_DEST_N in the init.ora with the appropriate values

Shutdown and Startup mount

Alter database archivelog

Alter database open

Archive log list (to verify)

Archive log all

Backup all newly created logs

Online backup occurs either when any or all online table spaces are put into online backup mode by issuing a command Alter Tablespace ts_name Begin Backup. When this command is issued, all data files that are in the online backup mode will be issued checkpoint SCN. In other words, all dirty bits of the data buffer will be flushed to the data files. No checkpoint SCN will be issued after the initial checkpoint SCN. For those data files not in the online backup mode, subsequent checkpoint will increase the SCN in their headers. When Alter Tablespace ts_name End Backup is issued, again a checkpoint SCN is recorded at the headers of online backup data files. These Begin and End backup command allows Oracle to know which redo to apply and when to stop during rolling forward.

Oracle recommends the minimum time elapsed between Begin and End backup commands. Also, minimum user activities should be allowed during this time for the fact that Oracle logs the entire block images on the first change to the block. This will result in enormous redo logs to be generated for online backup data files. Practically it means when Begin backup is issued, an operating system copy should follow, and then quickly end by an End backup. The backup is restricted to sequential backup meaning after backing up data files in tablespace1 then go on for data files in tablespace2.

For online backup, special care needs to be taken for the SYSTEM table space and rollback segments table space. You need to also make sure that the control files are backed up once there are schema changes to the database. The backup scripts should also reflect the latest schema.

DB2 UDB online backup

The mechanism for DB2 UDB to do recovery to the point in time is similar to Oracle. To understand the DB2 UDB online backup and recovery, you need to understand archive logging. There are three definitions of logs that we need to familiar with:

Active logs Logs contain transactions that are not committed or roll backed or committed transactions yet not flushed to disk.

Online archive logs Logs contain information for committed and flushed transactions.

Offline archive logs Archive logs that are being moved from active log directory to either another directory in another media or tape.

There are numerous configuration parameters that are related to logging.

LOGFILSIZ Total number of 4KB size to be allocated (default is 250)

LOGPRIMARY Total of primary logs (default is 3)

LOGSECOND Total number of secondary logs that can be allocated when the primary logs fill up. When this is set to 1 (version 8), infinite active log is available

NEWLOGPATH - To change the location where the future log files are to be stored. Will only be effective once database is reactivated.

MIRRORLOGPATH A secondary path to write the logs to avoid single point of failure.

OVERFLOWLOGPATH Specify the location of the logs during a rollforward operation to allow access logs in multiple locations

USEREXIT To specify to use userexit for automatic archiving of archive logs.

BLK_LOG_DSK_FUL - prevent disk full errors from being generated when DB2 cannot create a new log file in the active log path.

In order to use online backup, archive logging must be turned on. You can turn on archive logging by setting LOGRETAIN to ON.

Once archive logging is the logging mechanism, it differs from circular logging in the sense that circular logging overwrites logs that are committed, whereas archive logging archives logs that are committed.

In archive logging, there is no reuse of logs that have become archive logs. To move those online archived logs, you will either move them manually or use a user exit to move them. Figure 27 illustrates the archive logging mechanism:

Figure 27. Active, online and archived logging

At LOG (n+1), which is the active log, log (n) and log (n-i) (where ibr/> Figure 30. Modifying LOGRETAIN using Control Center

You will also need to do a full offline backup. If you dont you will receive the error below trying to connect to the database.

Figure 31. Backup pending message

Now, do a full offline backup using CLP command.

Figure 32. Full offline backup

Navigate to database sample, and right click to do a backup. The first screen you see will be as below. Choose Backup selected table spaces and click Next.

Figure 33. Online backup choices

Select table space Userspace1 to backup. Click Next

Figure 34. Selecting tablespaces to back up

For testing purpose, choose file system. Add the path, for example E:\tmp. Click Next.

Figure 35. Media type selection

Take the defaults and click Next.

Figure 36. Choosing online backup

Take the defaults and click Next.

Figure 37. Specifying backup parameters

Take the defaults. Enter user id and password. Click Next.

Figure 38. Running immediately or save as tasks

Click Finish.


However, if you decide to do a database level backup, which is ideal for a smaller size database, you will do the following.

Navigate to database sample, right click to do a backup. The first screen you see will be as below. Choose Backup entire database and click Next.

Figure 40. Backup the entire database

Choose the path. Click Next.

Figure 41. Choosing the media type


Figure 42. Online backup


Figure 43. Backup parameters

Take the defaults. Enter user and password. Click Next.

Figure 44. Schedule the task

Click Finish.

To do an online table space and database level backup via CLP command prompt, follow the steps below.

Turn on LOGRETAIN. Shutdown and start up the database again to make the configuration change effective.

Figure 45. Turning LOGRETAIN ON

Once the configuration parameter is effective, you will see LOGRETAIN = RECOVERY.

Figure 46. LOGRETAIN = RECOVERY

You will also need to do a full offline backup. If you dont, you will receive the error below trying to connect to the database.

Figure 47. Backup pending

Now, do a full offline backup using CLP command.

Figure 48. Offline full backup

To do a tables space level online backup, issue the command:

Listing 9. Backup a Tablespace Online via CLP db2 backup database sample tablespace(userspace1) online to e:\tmp

Figure 49. Online tablespace backup via CLP

To do a database level online backup, issue the command:

Listing 10. Backup a database online via CLP db2 backup database sample online to e:\tmp

Figure 50. Online database backup via CLP

Table 3compares DB2 UDB and Oracle online backup.

Table 3. Comparison of online backup for Oracle and DB2 UDBOnline backup (Oracle) Online backup (DB2 UDB)

SYSDBA privilege is required

Operating in ArchiveLog mode

Redo logs contain both committed and uncommitted data. Oracle uses archived logs to roll forward to consistent state and Undo table space to rollback any uncommitted data.

Online Redo Logs are written round robin manner namely when the last log is full, it will overwrite log 1. During log switch, online redo logs will be archived.

For a long running transaction when all logs are active, online redo logs will only be overwritten if either you manually archive the non-active online redo or that you have the ARCH process to do it for you. Failure to do the either of the above will cause the Oracle to hang.

LOG_ARCHIVE_DUPLEX_DEST, LOG_ARCHIVE_DEST_N can be used to specify more than one location Oracle write logs to. Maximum value for LOG_ARCHIVE_DEST_N is 10. When one path is down, Oracle will ignore it and continue functioning as long as there is one path that works. However, a message will be logged in audit trail. No synchronization of the paths will happen.

Enabling ArchiveLog mode will require a database rebounce, and a close and clean full backup.

Control files contain the name of the database, log sequence and checkpoint information, the physical file structure etc.

Table space level backup

Backup all data blocks, both changed and unchanged. Not suitable for huge warehouse environment

No Compression in Backup.

Database restore and roll forward does not necessary require the database be shutdown.

Sequential Backup - Backing up table space one after the other.

Issue Alter Tablespace ts_name Begin Backup SYSADM, SYSCTRL or SYSMAINT authority is required to restore to an existing database. To restore to a new database you will need either SYSADM and SYSCTRL

Operating in LOGRETAIN=ON

Logs contain both committed and uncommitted data. In DB2, both roll forward and rollback using logs

Logs are not written in round robin manner. Archived logs will not be overwritten. An active log will become an archived log upon closing by DB2

For a long running transaction when all logs are active, secondary logs will be allocated according to parameter LOGSECOND

MIRRORLOGPATH configuration parameter is used to multiplex logs. Maximum value for MIRROLOGPATH is 1. When one path is down, DB2 will write to DB2 administration notification log and continue functioning as long as there is one path that works. However, DB2 will continually trying to write to the bad path when the current log is completed. No synchronization of the paths will happen.

Enabling LOGRETAIN will only be effective after a DB2 database restart. You will need a full offline backup

Recover History File contains info such as backup and restore of database or tables paces, roll forward recovery from database or table space, altering and quiescing a table space, REORG and RUNSTATS, dropping a table etc.

Both tables pace and database level backups

Backup all data blocks, both changed and unchanged. Not suitable for huge warehouse environment

Backup compression has been added as an option to the BACKUP DATABASE command and the db2Backup APIs.

Database restore and roll forward must be done offline. Restore and roll forward for table space level does not require an offline

Parallelism allowed

Issue Backup Database command to backup. Eg Backup database sample tablespace(userspace1) online to e:\tmpOr Backup database sample online to e:\tmp

Recovery from Online (Hot) BackupOracle online recovery depends a great deal on what is actually lost, for example, losing a group on online redo logs, losing a rollback segment data file, or corrumption of a data file for the SYSTEM table space. There are basically two situations where data can either be recovered completely or partially. Refer to the Case scenarios section for more detailed examples.

To do a roll forward recovery in DB2 UDB, archive logging must be used. During a restore of database, exclusive use is required. You can either restore either at database or table space level. You will restore the backup image and roll forward to complete the process, or leave the database in a pending state. There are two types of rolling forward:

To the end of logs

To a point in time

In DB2 UDB, there are two options to do a restore: the Control Center and CLP. The following steps show you how to perform a restore of an entire database.

In the Control Center, right click on the Sample database and select Restore.

Figure 51. Using restore from Control Center

Note that there are three options: restore to an existing database, to a new database or just restore the RHF. In this case, we just restore to the existing database. Click Next.

Figure 52. Restore alternatives

To restore the entire database, take the default. Click Next.

Figure 53. Restore entire database

Pick the backup image to restore.

Figure 54. Pick backup image to restore

Leave the default. Redirect is need when a container is damaged. Choosing redirect will put your database in restore pending state. Click Next.

Figure 55. Pick backup image to restore

You can either restore without rolling forward, or restore with roll forward. You can either pick from the default log directory or overflow path. In our case, we restore to local time.

Figure 56. Roll forward options

You can leave it either in pending state or active state. Pending state is good for container redirection. We choose active state. Click Next.

Figure 57. Final state of database


Figure 58. Parameter options


Figure 59. Some more parameter options

Run now without saving task history. Click Finish. Note that if the database is not in exclusive mode, you will see the error below

Figure 60. Non-exclusive use during online restore

In the CLP, you will issue the following commands to do a restore and roll forward a full database.

Listing 11. Restore and roll forward a full database RESTORE DATABASE SAMPLE FROM "e:\tmp" TAKEN AT 20040629152131

WITHOUT PROMPTING;

ROLLFORWARD DATABASE SAMPLE TO 2004-06-29-15.23.09.000000 USING

LOCAL TIME AND COMPLETE;

However, sometimes it is more economical to do tables space level backup for larger size databases. Also, restoring a table space will only require exclusive use for that table space. The remaining table spaces can still be accessed. You can restore more than one table space at one time. The following steps show how to restore from a table space (assuming that USERSPACE1 is corrupted):

Instead of choosing entire database to be restored, choose Restored selected table spaces. Click Next.

Figure 61. Choosing to restore tablespaces instead of a full database

Choose the backup image to use. Click Next.

Figure 62. Choose backup image to restore

The rest of the steps remain the same as restoring from a full database.

At the page Restore Option, you can restore the table spaces online. Remember, a full database restore will require exclusive use for the entire restore operation, whereas online restore of a table space will only require that tablespace to be in exclusive mode.

Figure 63. Choosing restore options

In the CLP, you will issue the following to do a restore and roll forward a table space:

Listing 12. Restore and roll forward a full database RESTORE DATABASE SAMPLE TABLESPACE (USERSPACE1)

ONLINE FROM "e:\tmp" TAKEN AT 20040629152131 WITHOUT PROMPTING;

ROLLFORWARD DATABASE SAMPLE TO END OF LOGS AND COMPLETE

TABLESPACE (USERSPACE1) ONLINE;

Table 4 compares DB2 UDB and Oracle online restore

Table 4. Comparison of online restore for Oracle and DB2 UDBOnline recover (Oracle) Online restore (DB2 UDB)

Restore from database files

Both database and table space recover are supported

Recover database UNTIL will be used for incomplete recovery Restore from the backup image

Both database and table space restore are supported

Both complete and incomplete recovery will be using backup images with or without the roll forward enabled

Incremental backup and recoveryAs database size grows into terabyte and petabyte, the time and hardware resources required will grow substantially. Often times, its not feasible to back up the entire database or even few table spaces at one time. For example, in a data warehouse environment, there may be few changes in the data. In this case, it would be better to just back up the changed pages instead of the entire database or table spaces.

In Oracle, Recovery Manager is used to do the incremental backup. There are two types of incremental backups provided by Oracle.

Differential incremental (default)

Cumulative incremental

Differential incremental backup backs up all blocks changed after the most recent backup at level n or lower. You can specify levels 0-4 for differential incremental. The diagram below illustrates the concept of differential incremental backup.

Figure 64. Oracle differential incremental backup

Here is the explanation:

For the first Sunday, a base level 0 backup is taken, all data blocks are copied.

Monday uses level 2 and since 0 is lower than 2, RMAN will back up chances since Sunday.

Tuesday uses level 2 and since Monday is at same level, RMAN will back up changes since Monday.

Wednesday uses level 2 and since Tuesday is at same level, RMAN will back up changes since Tuesday.

Thursday is uses level 1 and since Monday, Tuesday and Wednesday are all higher than level 1, RMAN will back up chances since Sunday because it is at level 0 which is a lower level.

Friday uses level 2 and since Thursday uses level 1, a lower level, RMAN will backup changes since Thursday.

Saturday uses level 2 and RMAN will backup changes since Friday.

Second Sunday, where level 0 is specified. RMAN will backup since last level 0 because there is nothing equal or less than level 0 other than the level 0 since last Sunday.

Repeat the process

Cumulative incremental backup on the other hand, backs up all the blocks used since the most recent backup at level n-1 or lower. Figure 65 illustrates the concept of cumulative incremental backup.

Figure 65. Oracle cumulative incremental backup

Here is the explanation:

The first Sunday uses level 0. RMAN will backup all data blocks that have ever been in use in database.

Monday uses level 2. RMAN will backup all changes since Sunday for level 0 is lower than 2.

Tuesday uses level 2. RMAN will backup all changes since Sunday. This backup includes those blocks that were copied on Monday.

Wednesday uses level 2. RMAN will backup all changes since Sunday. This backup includes those blocks that were copied on Monday and Tuesday.

Thursday uses level 1. RMAN will backup all changes since Sunday.

Friday uses level 2. RMAN will backup all changes since Thursday.

Saturday uses level 2. RMAN will backup all changes since Thursday.

Second Sunday where level 0 is specified, there is no level lower than 0 other than the level 0 on previous Sunday. Therefore, all changes since previous Sunday will be backup.

Repeat the process.

DB2 UDB has a similar incremental backup and recovery strategy. There are two types of incremental backups provided.

Incremental

Delta

An incremental backup is a backup of all changes since the most recent, successful, full backup. This form of backup is also known as cumulative backup. Each successive incremental image contains the entire contents of the previous incremental image, and the changes since the previous full backup. In the case of a failure for example on Saturday after the incremental backup is taken, you can just restore the first Sunday full backup and apply the incremental on Saturday. See the diagram below.

Figure 66. DB2 UDB incremental backup

A delta backup is a backup of changes since the last successful, full, incremental or delta backup. In the case of a failure on Saturday, you could restore the first Sunday full backup and apply the delta backups from Monday until Saturday. See the diagram below:

Figure 67. DB2 UDB delta backup

The configuration parameter, TRACKMOD tracks the database updates. When set to NO (the default), there is no tracking for database page updates. When this parameter is set to "Yes", the database manager tracks database modifications so that the backup utility can detect which subsets of the database pages must be examined by an incremental backup and potentially included in the backup image. After setting this parameter to "Yes", you must take a full database backup in order to have a baseline against which incremental backups can be taken.

Note that the combination database and table space incremental backups are permitted. For table space tracking, a flag for each table space indicates whether a table space is dirty and need to be back up.

The steps below show you how to do an incremental backup and restore. Assuming we back up as depicted above Sunday full backup, Monday to Saturday we do incremental backups.

Turn on TRACKMOD. Make sure you disconnect all applications before reconnect to take effect. Issue the command:

Listing 13. Turning TRACKMOD ON update db cfg using trackmod on

Figure 68. Turning TRACKMOD ON

Do a full database backup as a base for the incremental. This is the backup on Sunday. Issue the command:

Listing 14. Backup database online db2 backup database sample to e:\tmp

Figure 69. Online backup of database

For Monday, do an incremental backup. Issue the command

Listing 15. Backup database incrementally db2 backup database sample incremental

Figure 70. Incremental backup of database

For Tuesday to Saturday, repeat the incremental backup process.

Say, on the next Sunday early morning, media failure happens. To restore without losing data, you will restore from the first Sunday backup, followed by incremental backup taken on Saturday and roll forward. Note that you can restore either a full backup manually, apply the Saturday incremental backup image and roll forward. You can use automatic keyword to automatic restore from the full backup and apply the Saturday backup image. At this moment, you will still be in roll forward pending mode (SQL1117N) since there will be transactions to be roll forward.

Listing 16. Restoring from incremental backup automatically db2 restore database sample incremental automatic

from e:\tmp taken at 20040701112735 without prompting

Figure 71. Incremental restore

Listing 17. Roll forward to end of logs db2 rollforward database sample to end of logs and complete

Figure 72. Roll forward to end of logs

For Delta backup and restore, in the same scenario, you will restore each day's delta backup image (Monday through Saturday) and roll forward to complete the recovery. DB2 UDB provides a utility, db2ckrst to query the database history and generates a list of timestamps for the backup image that are required for an incremental restore.

Back to top

Backup and recovery utilitiesBoth Oracle and DB2 UDB provide utilities to safeguard the backup image. The table below lists some of them.

Table 5. Table comparing utilities to safeguard backup image and database filesOracle utilities DB2 UDB utilities

Export - Oracle uses export to do a logical backup. Export is especially good to recover from a dropped table

Recovery Manager (RMAN) - checks before copying each modified block

DB_BLOCK_CHECKSUM - this parameter stores checksum at every block header. However, setting to true will have an impact on performance

LOG_BLOCK_CHECKSUM - Enables the ARCH process to checksum the archived log files. However, setting to true will have an impact on performance

DBVerify - Check the integrity of data files. Issue the command, dbv file=c:\oracle\oradata\file1.dbf logfile=c:\temp\out.log Not applicable

Not applicable

Not applicable

Not applicable

db2ckbkp - checks the integrity of the entire backup image. db2dart - examines databases for architectural correctness and reports any encountered errors.

Back to top

Recovery case scenariosComplete recoveryA complete recovery is a recovery that involves rolling forward the logs to bring the database to the most current time with no lost of data.

Oracle complete recoveryIn the Oracle world, there are numerous ways to accomplish a complete recovery. Typically this recovery is achieved when database is running in ARCHIVELOG mode and backup online. Complete recovery can be done at the database level or data file and tablespace level. There are cases where database level recovery desirable. To do a database level recovery, you need to do the following:

Mount the database

Online all the data files

Copy back the lost data file or the entire database

Apply the redo (both online and archived)

There are times you want to do a data file and tablespace level recovery. In this case you have to do the following:

Offline the data file or tablespace

Copy back the data files to be recovered

Roll forward

For Oracle, depending on the severity of a media failure and the backup strategy deployed, restoring to a consistent state with no lost of data can sometimes be impossible. There are few scenarios (not exhaustive) such as below:

Recovering from the loss of a multiplexed control file member

Scenario 1 : Disk and filesystem is intact

Shutdown abort

Copy good controlfile to original location of bad controlfile:- cp /path/blah_good.ctl /path/blah_bad.ctl

Scenario 2 : Disk and filesystem is no good

Shutdown abort

Copy good controlfile to a new location:- cp /path/blah_good.ctl /new_good_path/blah.ctl

Change the CONTROL_FILES param in init.ora to reflect the new good path

Startup

Recovering from the loss of one member of an online redo log group

No need to shut down

To investigate what status is the missing redo log is - Select * from v$logfile shows which group and member is INVALID

From step 1, you will have the full path of the member that went corrupted. Drop this member - Alter database drop logfile member 'fullpath/log_filename.log'

Add a new member to the group - Alter database add logfile member '/fullpath/log_filename.log' reuse to group n; where log_filename.log and group n follow step 1

Recovering from a situation where there is no data file backup

Do a shutdown abort

Startup mount

Get the Path and Size of the missing datafile - select df.file#, df.status, df.enabled, df.create_bytes, df.name from v$recover_file rf, v$datafile df where rf.file#=df.file# and rf.error = "FILE NOT FILE" (note the path and size of the missing file)

Create the datafile - alter database create datafile '/path/filename.dbf' as '/path/filename.dbf' size xxx reuse (Make sure that the path and size are the same as step 4)

If the file is offline, then bring in online - alter database datafile '/path/filename.dbf' online;

Recover the database - recover database

Open the database - alter database open

Recovering from the loss of a data file that belongs to an indexes-only tablespace

Restore a good copy of datafile

Mount the database - startup mount

Recover the datafile - recover datafile 'fullpath/filename'

You will be prompted for archived log. Confirm until you receive "Media Recovery Complete"

Recovering from the loss of a datafile that belongs to a temporary tablespace

Offline the tablespace

In archivelog - alter database datafile xxx offline immediate

In nonarchivelog - alter database datafile xxx offline drop

Drop the tablespace

Drop tablespace xxx

Remove physical files of the tablespace and recreate them

Recovering from the loss of a datafile that belongs to a readonly tablespace

To recover a loss datafile that belongs to READ-ONLY tablespace is an easy task indeed. Since READ-ONLY tablespace is never modified, simply restore the datafile to its original location shall do the job. However, if you change from read-only to read-write vice-versa since last backup, you have to restore the file and do a media recovery on it.

Recover datafile xxx

Apply the logs until you see "Media Recovery Complete"

Recovering from the loss of an inactive redo log group

To investigate what status is the missing redo log is - a. select * from v$logfile shows which group is INVALID. b. select * from v$log shows the status of the invalid group

Now that you are confirmed that the lost redo file is an INACTIVE redo log, shutdown the database - shutdown immediate

Mount the database - Startup mount

Since the redo log group is inactive and is archived, just clear the redo log - alter database clear logfile group N (where N is the group # of the lost redo log)

Open the database - Alter database open

Check status - select * from v$log shows a new redo with UNUSED status

Do a full backup if you want

Recovering from the loss of a data file that belongs to the system tablespace

Shutdown abort the database if the database is still up

Copy the corrupted or lost datafile from backup to the original location

Startup mount

select v1.group#, member, sequence#, first_change# from v$log v1, v$logfile v2 where v1.group#=v2.group#;

Recover the database by Recover datafile '/path/filename.dbf'

Logs will be prompted. Confirm it until you see "Media Recovery Complete". If you are asked to enter a non-existence archived log, enter the full path of a member of the redo group where the sequence number matches the one being prompted (from step 4) until you see "Media Recovery Complete".

Alter database open

Recovering from the loss of a data file that belongs to a traditional rollback segment tablespace

The database was cleanly shut down(All the committed data are written to disks)

Comment out the ROLLBACK_SEGMENTS entry in init.ora

Startup restrict mount

Alter database datafile '/path/filename.dbf' offline drop;

Alter database open

Drop tablespace tablespace_name including contents;

Recreate the rollback tablespace with all of its rollback segments. The segment name should correspond to ROLLBACK_SEGMENTS in init.ora

Shutdown immediate

Uncomment the ROLLBACK_SEGMENTS in init.ora

Startup

select segment_name, status from dba_rollback_segs just to make sure all rollback segments are online

The database was not cleanly shut down (there are active transactions in the rollback segments)

Restore the corrupted/lost file from backup using OS cp command

Startup mount

Check the status of the datafile: select name, status from v$datafile; Online the datafile if it's OFFLINE by Alter database datafile '/path/filename.dbf' ONLINE


Recover datafile '/path/fileame.dbf'

Logs will be prompted. Confirm it until you see "Media Recovery Complete". If you are asked to enter a non-existence archived log, enter the full path of a member of the redo group where the sequence number matches the one being prompted (from step 4) until you see "Media Recovery Complete".

Alter database open

The database is up and running (Simpler)

Create few additional rollback segments to handle the database activities. Eg. Create tablespace rbstemp datafile '/path/rbstemp01.dbf' size 50M'. Create rollback segment xxx tablespace rbstemp

Offline the lost datafile: Alter database datafile '/path/filename.dbf' offline

Restore the lost datafile from backup using OS cp


Recover datafile '/path/filename.dbf'

Logs will be prompted. Confirm it until you see "Media Recovery Complete". If you are asked to enter a non-existence archived log, enter the full path of a member of the redo group where the sequence number matches the one being prompted (from step 4) until you see "Media Recovery Complete"

Online the datafile : Alter database datafile '/path/filename.dbf' online

Recovery from the above situation has to be taken care of in a case by case manner if you are using the traditional ALTER TABLESPACEBACKUP BEGIN. A good plan is to guard against any of the types of loss mentioned above. However, since Oracle 8i, Oracle recommends that Recovery Manager (RMAN) be used to do the online backup. There are advantages of RMAN such as

Incremental backup

Redo inflation with traditional backup will not happen

Integrity check for data to avoid data corruption

Parallel backup allowed

Automate the backup and restore process

DB2 complete recoveryHere are the types of recovery without loss of data:

Online restore for database with a good backup image. Roll forward to the end of logs and complete. Note that this includes all kind of media failure, including containers corrupted, and so on. Refer to the steps above to do a database level recovery.

Online restore for table space with a good backup image. Roll forward to the end of logs and complete. Note that this includes all kind of media failures, container corruption, and so on. Refer to the steps above to do a table space level recovery.

Dropping a table accidentally. To recover a dropped table, do the following.

You have to turn on the DROP_RECOVERY option to Y for table spaces are enabled for dropped table recovery initially. Issue the commands below to check whether a table space is enabled for dropped table recovery and if not, turn it on.

db2 select tbspace, drop_recovery from syscat.tablespaces

db2 alter tablespace userspace1 dropped table recovery on

Make sure your LOGRETAIN is ON. If not issue

db2 update db cfg using logretain on (you need to disconnect all applications to take effect)

Do a full offline backup

db2 backup database sample to e:\tmp

Now at certain point in time, you find out that one of your tables was accidentally dropped by an amateur DBA, or more precisely a trainee DBA. Do the following:

list the dropped table history by issuing the command. You will notice few things; the backup ID (eg 000000000000bb000002000d), dropped table start and end time (eg. 20040705150008) and the DDL for the dropped table. You will need the backup ID and the DDL for the dropped table in the subsequent steps.

db2 list history dropped table all for sample

Restore table space using the backup image without rolling forward

db2 restore database sample tablespace (userspace1) from e:\tmp taken at 2004070515 without rolling forward

Roll forward using complete recover dropped table to e:\tmp. You will see a ROLLFORWARD command complete successfully prompt.

db2 rollforward database sample to end of logs and complete recover dropped table 000000000000bb000002000d to e:\tmp.

Now, create the dropped table from the list dropped table DDL statement.

Import the exported data, for example:

db2 import from d:\tmp\node0000\data of del method p (1,2,3,4,5,6,7,8,9,10,11,12,13,14) messages d:\tmp\imp.loog insert into employee(empno, firstnme, midinit, lastname, workdept, phoneno, hiredate, job, edlevel, sex, birthdate, salary, bonus, comm)

Incomplete recoveryAn incomplete recovery is a recovery that involves rolling forward the redo logs to bring the database to timestamp as far as the roll forward allows you with some lost of data. It is as the name implied, a partial recovery.

Oracle incomplete recoveryIn Oracle world, incomplete recovery is typically caused by

Lost of a table

Lost of online redo logs before they are archived

Lost of archived redo logs preventing a roll forward beyond that point

Lost of a control file with no backup

The underlying difference for complete and incomplete recovery lies in the inability to proceed beyond certain timestamp due to various reasons listed above. Incomplete recovery will require a reset of the logs by issuing Alter Database Open Resetlogs command. Unlike complete recovery, you have to restore all data files, instead of only the selected data files to be recovered. When doing incomplete recovery, you have to make sure data files are online.

Change-based recovery. Change-based recovery is a recovery that recovers until a specified SCN. Eg. Recover database until change 87643

Time-based recovery. Time-base recovery is a recovery that recovers until a specified SCN. Eg. Recover database until time 2004-06-05:15:20:00

Cancel-based recovery. Cancel-based recovery is a recovery that recovers until user key in cancel.Eg. Recover database until cancel. At this point in time roll forward will go on until users decided to halt by keying cancel.

Log sequence-based recovery Log sequence-based recovery is a recovery that recovers until the specified log sequence number (only available when using Recovery Manager).

Note that all transactions after the until will be lost. This is the nature of incomplete recovery. Typically, after an incomplete recovery is carried out, there is a need to do an offline backup immediately. All the redo logs after the alter database open resetlogs will render useless and will not be good for recovery. Resetlogs simply means the sequence number for the redo logs is reset.

We will also see some incomplete recovery case scenarios though not an exhaustive one later in Recovery Case Scenarios. This article is not meant to address how to recover from every single failure due to media failure.

Here are some example scenarios that could cause incomplete recovery in Oracle:

Recovering from the loss of an unarchived online log file

To investigate what status is the missing redo log is - a. select * from v$logfile shows which group is INVALID b. select * from v$log shows the archived status of the invalid group

Now that you are confirmed that the lost redo file is an unarchived redo log (archived=NO), shutdown the database - shutdown immediate

Restore all datafiles from backup


If the media is corrupted where the default filesystem cannot be written, move the file to other location - a. In OS, find a directory to host the new file and touch the file (same same) b. alter database rename file '/oldpath/filename.log' to '/newpath/filename.log'

Do a cancel based incomplete recovery - a. recover database until cancel b. Press Enter til you see ora-00308 and ora-27037 c. Rerun recover database until cancel but type CANCEL at the prompt this time

Open the database and reset the logs - alter database open resetlogs (Note after opening using resetlogs will create online log files)

Do a user managed backups of the entire database

Recovering from the loss of an only active redo log group


Now that you are confirmed that the lost redo file is an ACTIVE redo log, shutdown the database - shutdown immediate

Restore the backup


If the media is corrupted where the default filesystem cannot be written, move the file to other location - a. In OS, find a directory to host the new file and touch the file (same name). b. alter database rename file '/oldpath/filename.log' to '/newpath/filename.log'. If the original filesystem is good, skip this step.

Do a cancel based incomplete recovery - a. recover database until cancel. b. Press Enter until you see ora-00308 and ora-27037. c. Rerun recover database until cancel but type CANCEL at the prompt this time

Open the database and reset the logs - alter database open resetlogs


Recovering from the loss of the current redo log group


Now that you are confirmed that the lost redo file is a CURRENT redo log, shutdown the database - shutdown immediate

Restore the backup


If the media is corrupted where the default filesystem cannot be written, move the file to other location - a. In OS, find a directory to host the new file and touch the file (same name). b. alter database rename file '/oldpath/filename.log' to '/newpath/filename.log'. If the original filesystem is good, skip this step.

Do a cancel based incomplete recovery - a. recover database until cancel. b. Press Enter until you see ora-00308 and ora-27037. c. Rerun recover database until cancel but type CANCEL at the prompt this time.

Open the database and reset the logs - alter database open resetlogs


DB2 incomplete recoveryIn DB2, an incomplete recovery may be caused by:

The loss of the active log. This is probably the worst case scenario. It is therefore imperative to use the database configuration parameter MIRRORLOGPATH to mirror the active log.

Seek help from IBM support.

The loss of the archived log. In this scenario, you can only roll forward to a point in time, to the last good archived log file, for example:

db2 rollforward db sample to 2004-06-28-10.30.50.000000

Back to top

ConclusionBackup and recovery are essential for any business environment where data loss must be minimized if not totally avoided. Oracle and DB2 UDB both have mechanisms to backup and restore both small and large databases. For small and medium databases, Oracle provides both offline and conventional online backup (using Alter Tablespace Backup Begin command). DB2 UDB provides offline and online backup as well. For offline backup, the exclusive use is required to do a backup whereas online backup allow users to connect while backup is in progress. The restore for table spaces or database can be done by restoring a backup image and rolling forward. Both databases provide the roll forward until a certain point in time and to the end of the logs.

Dealing with huge databases such as those in data warehouse environment, both Oracle and DB2 provide incremental and differential backup to avoid copying unchanged data pages. To use incremental backup or automate the online backup, Oracle recommends Recovery Manager to be used.

Oracle and DB2 UDB both provide effective mechanisms and capabilities for securing data from loss due to media corruption or human error. Using the information provided in this article, you should find it easy to transfer your Oracle skills to the DB2 UDB environment.

Resources IBM DB2 UDB Information Center provides in-depth details about all aspects of IBM DB2 UDB.

IBM DB2 UDB Data Recovery and High Availability Guide and Reference discusses IBM DB2 UDB Backup and Recovery in depth.

The developerWorks Migration Station provides a multitude of resources for migrating your databases and applications to DB2 UDB.

The IBM DB2 Migration Toolkit is a downloadable toolkit to help you migrate to DB2.

The book High Availability Guide for DB2 will answer many of your questions about database availability.

Setting Up a User Exit on DB2 UDB V8 to Archive Database Logs

Document options

Print this page

E-mail this pageRate this page

Help us improve this contentLevel: Introductory

David Kline, DB2 Technical Support representative, IBM

02 Jul 2003

This article explains how to use a user exit program to manage the archiving and retrieving of database log files for DB2 Universal Database.

2003 International Business Machines Corporation. All rights reserved.

IntroductionDB2 Universal DatabaseTM uses log files as a primary means of ensuring the consistency and recoverability of data. In an earlier article, An Overview of Transactional Logging in DB2 Universal Database, I introduced transactional logging concepts and explained how logs are managed. In this article I'll take you a step further, and explain how DB2 UDB can use a user exit program to archive and retrieve database log files. In addition, I'll provide a step-by-step example to explain how to modify, compile, and test the db2uext2.cdisk sample user exit program that comes with your installed DB2 UDB server product. You'll find links to additional resources covering user exit programming with DB2 UDB at the bottom of the article.

This article covers DB2 V8.1 Enterprise Server Edition fix pack 1 (single-partition) server on Windows 2000 and AIX 5L. Although other sample user exit programs are available such as user exit to tape or Tivoli Storage Manager, I do not cover them here.

Back to topHow does a user exit work with DB2 UDB?The basic idea behind using a user exit program with DB2 UDB is to provide a method for archiving and retrieving database log files to enable log redundancy and movement away from volatile media. It is important to note that you can implement other operations besides archiving and retrieving logs within the user exit, based on your specific needs.

Having a user exit strategy with DB2 UDB will not recover 100% of your transactions if a database needs to be recovered using archived log files. A user exit program is merely a means to provide more protection for your existing log files by copying them off to a safe location. It's one part of your data integrity strategy, but an important part.

After compiling a user exit program, the db2uext2 executable is placed in a directory where the database manager can find it. This directory is /sqllib/adm on UNIX and \Program Files\IBM\SQLLIB\BIN on Windows.

The database manager will not call db2uext2 unless it is aware that a user exit program is available. The only way the database manager will know that the db2uext2 can be called is by setting the database configuration parameter userexit to on. Once this parameter is set and the DB2 instance is recycled, the database manager will make a call every five minutes to the user exit program to check for log files that can be archived to the program-specified archive directory.

If a database recovery is necessary, the database manager will call db2uext2 during the roll forward operation to copy the archived log files back into the active log directory. The log files are then re-applied to the restored database.

Let's take a look at the calling format made by the database manager to the user exit program. Note that this information can also be found in the comments sections of the user exit sample program:

db2uext2 -OS -RL

-RQ -DB -NN

-LP -LN [-AP]

where: os = operating system release = DB2 release request = 'ARCHIVE' or 'RETRIEVE' dbname = database name nodenumber = node number logpath = log file path logname = log file name logsize = log file size (optional) startingpage = starting offset in 4K page unit (optional) adsmpasswd = ADSM password (optional)

Note: logsize and startingpage are only used when logpath is a raw device.

Log files are archived and retrieved from disk with the following naming convention: archive: archive path + database name + node number + log file name retrieve: retrieve path + database name + node number + log file name

For example: If the archive path is "c:\mylogs", the retrieve path is "c:\mylogs", the database name is "SAMPLE", the node number is NODE0000, file name is S0000001.LOG, the log file would be: archived to - c:\mylogs\SAMPLE\NODE0000\S0000001.LOG retrieved from - c:\mylogs\SAMPLE\NODE0000\S0000001.LOG

Here's how the logic flows within the user exit: 1) Install signal handlers. 2) Verify the number of parameters passed. 3) Verify the action requested. 4) Start the audit trail ( if requested ). 5) Take one of the following paths based on the action requested: a) If the requested action is to archive a file, copy the log file from the log path to the archive path. i) If the log file is not found proceed to point 6. b) If the requested action is to retrieve a file, copy the log file from the retrieve path to the log path. i) If the log file is not found proceed to point 6. 6) Log errors (if requested and required). 7) End the audit trail (if requested). 8) Exit with the appropriate return code.

It is possible for you to manually call the user exit program to archive a log file, but it's better to use the ARCHIVE LOG command so that mistakes are not made on your behalf when specifying the above parameters. A link for the ARCHIVE LOG command can be found at the bottom of this article.

Back to topLog file terminologyThe basic function of a user exit program with DB2 is to copy log files to and from the active log directory. It is worth mentioning some terminology here to clarify where the active log directory is located and the states of database log files.

Active log directory:This directory is located in your database directory. On Windows, if a single database called SAMPLE is created on C:\ and the instance name is db2inst1, the following directory structure will exist:

C:\DB2INST1\NODE0000\SQL000001\SQLOGDIR

SQL00001 is the database directory for the SAMPLE database and SQLOGDIR is the active log directory.

Figure 1 below shows the active log directory on a Windows operating system:

Figure 1. Active log directory

Back to topLog files statesWithin the active log directory, log files can be either active logs or online archived logs. Active logs are those logs needed by DB2 for current transaction processing and crash recovery. Online archived logs are logs no longer needed by DB2 UDB for regular processing but may be needed during a database recovery. When implementing a user exit program, these online archived logs should eventually show up as a copy in your archive log directory.

Since the purpose of a user exit program with DB2 UDB is to copy database logs to an archive directory, you will end up with duplicate log files in your active log directory, by default, SQLOGDIR. You may consider removing these duplicate online archived logs to free up file system space. Be very careful to verify that these logs were successfully copied to the archive directory before removing them from the database directory. You must also make sure they are no longer needed by the database manager for crash recovery. To determine which log files are no longer needed for normal processing in your active log directory, check the database configuration with the following command:

db2 "get db cfg for sample"

The database configuration output from this command will include the first active log file, for example:

First active log file = S000009.LOG

The log file S000009.LOG, shown in the output above, is the currently active log for the database. Any log files that are numbered less than this log file are considered online archived logs.

Here is an example:

In this scenario, the active log directory holds log files S000000.LOG - S000009.LOG and the archive log directory holds S000000.LOG - S000008.LOG. Because S000009.LOG is the first active log file, S000001.LOG - S000008.LOG can be deleted from the active log directory to free up disk space. The S000009.LOG file must be left in the active log directory since it is still being used for current transactions.

The database history file can also be checked to see which log files are no longer required in the active log directory. The following command will list database backup information:

db2 "list history backup all for database sample"

Here's an example of output from that command:

List History File for sample

Number of matching file entries = 4

Op Obj Timestamp+Sequence Type Dev Earliest Log Current Log

-- --- ------------------ ---- --- ------------------------

B D 20030416162026001 F D S0000010.LOGS0000014.LOG

------------------------------------------------------------

Contains 2 tablespace(s):

00001 SYSCATSPACE

00002 USERSPACE1

------------------------------------------------------------

In the above output, the earliest log above will signify that any log after and including the S0000010.LOG is needed. Any log before S00000010.LOG can be safely deleted. Again, it is important to verify that a copy of the log files exist in the archive log directory before deleting those logs from the active log directory.

Although manually deleting log files from the active log directory is possible, a safer way to remove online archived log files is through the prune logfile command. This command can be used to delete log files in the active log directory. In the example below, the following command will delete log files S000000.LOG - S000008.LOG:

db2 "prune logfile prior to S000009.LOG"

Note: Depending on your recovery strategy, there may be scenarios where previous roll forward operations are performed on the database. Old log files in your archive directory may be overwritten with newer log files with the same name, thereby preventing point in time recovery of a database using old log files. It is important for the programmer of the user exit program to take situations such as this into account.

Back to topSetting up a user exitIn this article, we will be using the db2uext2.cdisk sample c program provided by DB2, which is located in your c directory. On Unix, the c directory is located at /sqllib/samples. On Windows, this directory is located in Program Files/IBM//samples.

Setting up a user exit program on WindowsModify and compile user exit program

1. Create a directory called C:\mylogs

2. Copy the C:\Program files\IBM\SQLLIB\samples\c\db2uext2.cdisk to a working directory.

3. For this example, the following part of the user exit program should be verified to reflect the path c:\\mylogs\\.

#define ARCHIVE_PATH "c:\\mylogs\\"

#define RETRIEVE_PATH "c:\\mylogs\\"

#define AUDIT_ACTIVE 1 /* enable audit trail logging */

#define ERROR_ACTIVE 1 /* enable error trail logging */

#define AUDIT_ERROR_PATH "c:\\mylogs\\"

/* path must end with a slash */

#define AUDIT_ERROR_ATTR "a" /* append to text file */

#define BUFFER_SIZE 32

/* # of 4K pages for output buffer */

4. Make sure a supported C compiler is installed on your system (i.e. Microsoft Visual Studio) and your environment has the compiler's path.

5. From the command line, rename the db2uext2.cdisk to db2uext2.c and build it:

cl db2uext2.c

Once the program is compiled, db2uext2.exe and db2uext2.obj files will be created.

6. Place the db2uext2.exe executable in the /SQLLIB/BIN directory so that the database manager can find and execute it to archive and retrieve logs.

Create and prepare the database for user exit

7. Create the SAMPLE database with the db2sampl command at the DB2 command window. This will let you use the sample tables for the examples below.

db2sampl

8. Update the database configuration file so that user exit is turned on for the database. Note that only one database can be assigned the user exit program since the bin directory is shared by all DB2 instances.

db2 "update db cfg for sample using userexit on"

db2stop force db2start

Note that the logretain database configuration parameter does not have to be on to enable roll forward recovery of log files since the userexit parameter also does this.

9. For testing purposes, you can encourage log files to be archived by decreasing the logfilsiz database configuration parameter.

db2 "update db cfg for sample using logfilsiz 200"

The value 200 is the number of 4k pages for each log file. In our example, the size of each log file will be initialized to 800k (200 * 4k).

10. Since the userexit database configuration parameter is set to YES, a full database backup is required before a connection to the database can be made. Make a note of the timestamp produced from the backup command as this will be used during the database restore.

db2 "backup db sample to c:\backups"

At this point, your user exit program is enabled and ready to be used. The user exit program will be invoked every 5 minutes to check the active log directory for log files that need to be archived. Note that the larger the log file specified with the logfilsiz database configuration parameter, the longer it will take to fill a log file thereby making that log file more vulnerable to disk failure, corruption, etc. Depending on your transaction loads, you should only specify a log file size that will fill up in a reasonable period of time so that the user exit program can archive the log file off to a safe directory.

Under a more ideal scenario it is best to archive the log files to a different disk on a separate system. This reduces I/O on the main disk and adds a second level of protection if the disk or system crashes.

Setting up a user exit program on AIXModify and compile user exit program

1. Create the following directory structure:

/mylogs/SAMPLE/NODE0000

Then give recursive permissions to each directory in the structure:

chmod -R 777 /mylogs

2. Copy the db2uext2.cdisk to a working directory and give the file permissions:

cp /home/db2v8_32/sqllib/samples/c/db2uext2.cdisk

/home/db2v8_32/db2uext2.c

chmod 777 /home/db2v8_32/db2uext2.c

3. For this example, the following part of the user exit program should be updated to reflect the path /mylogs/.

#define ARCHIVE_PATH "/mylogs/"


#define RETRIEVE_PATH "/mylogs/"


#define AUDIT_ACTIVE 1

/* enable audit trail logging */

#define ERROR_ACTIVE 1

/* enable error trail logging */

#define AUDIT_ERROR_PATH "/mylogs/"


#define AUDIT_ERROR_ATTR "a"

/* append to text file */

#define BUFFER_SIZE 32

/* # of 4K pages for output buffer */

4. Make sure you have a C compiler installed on your system and your environment has the compiler's library and path in it.

5. From the command line, build the db2uext2.c program:

cc -o db2uext2 db2uext2.c

Once the program is compiled, a db2uext2 executable will be created.

6. Move the db2uext2 executable to the sqllib/adm directory.

mv db2uext2 /home/db2v8_32/sqllib/adm/

Create and prepare the database for user exit

7. Create the SAMPLE database with the db2sampl command. This will allow you use the sample tables for the examples below.

db2sampl

8. Update the database configuration file so that user exit is turned on for the database.

db2 "update

db cfg for sample using userexit on"

db2stop force db2start

Note that the logretain database configuration parameter does not have to be on to enable roll forward recovery of log files since the userexit parameter also does this.

9. For testing purposes, you can encourage log files to be archived by decreasing the logfilsiz database configuration parameter.

db2 "update db cfg for sample using logfilsiz 200"

The value 200 is the number of 4k pages for each log file specified by logprimary and logsecond parameter.

10. Since the userexit database configuration parameter is set to YES, a full database backup is required before a connection to the database can be made. Make a note of the timestamp produced from the backup command as this will be used during the database restore.

db2 "backup db sample to /home/db2v8_32/backups"

At this point, your user exit program is enabled and ready to be used. The user exit program will be invoked every 5 minutes to check the active log directory for log files that need to be archived. Note that the larger the log file specified with the logfilsiz database configuration parameter, the longer it will take to fill a log file thereby making that log file more vulnerable to disk failure, corruption, etc. Depending on your transaction loads, you should only specify a log file size that will fill up in a reasonable period of time so that the user exit program can archive the log file off to a safe directory.

Under a more ideal scenario it is best to archive the log files to a different disk on a separate system. This reduces I/O on the main disk and adds a second level of protection if the disk or system crashes.

Testing the user exitIn the scenarios below, Windows is used to provide a visual representation for the location of specific files associated with the user exit program. After changing the paths, we also tested the steps below on AIX.

Note: For Windows users, upon first archival with the user exit program, the archive directory structure \SAMPLE\NODE0000 will be created under the mylogs directory. The archive logs path on AIX, which is created manually, is similar with the following: /sample/NODE0000 (see Setting up a user exit program on AIX).

Promote the Archival of log files

Scenario 1:

db2 connect to sample

db2 "insert into staff (select * from staff)"

db2 connect reset

(all database connections must be released so that the database frees up resources and closes log files)

Check the c:\mylogs\SAMPLE\NODE0000 directory.

The above scenario will truncate the currently active log file and copy it off to the c:\mylogs directory. The size of a truncated log file will be less than 800k in order to prevent wasted space.

Figure 2 below shows the truncated log file that is moved to the archive directory.

Figure 2. Truncated log file

Scenario 2:

You can also test by filling your log file with data. The database manager will check for full log files and copy them off to the archive directory.

db2 "connect to sample"

db2 "insert into staff (select * from staff)"

(Perform this insert 10 times to fill the log file)

Remember that you updated the database configuration parameter logfilsiz to 200 4k pages (800k) in order to fill the log file quickly.

Now you can check the c:\mylogs directory.

Figure 3 below shows the placement of the archived logs.

Figure 3. Placement of archived logs

If your user exit program is working correctly, you should see an ARCHIVE.LOG created under c:\mylogs and the one or more log files should be copied to c:\mylogs\SAMPLE\NODE0000\.

Figure 4 below shows the placement of the ARCHIVE.LOG file:

Figure 4. Placement of ARCHIVE. LOG file

Retrieving the archived log files during database recovery

Now we will perform a database restore. The restore will invoke the user exit program to retrieve the log files from the c:\mylogs directory back into the active log directory where they can be applied to the database during roll forward recovery.

db2 connect reset

db2 "restore database sample from

vc:\backups taken at 20030416114420"

SQL2539W Warning! Restoring to an existing

database that is the same as the backup image

database. The database files will be deleted.

Do you want to continue? (y/n)

The value 20030416114420 in the restore command is the timestamp taken from the previous database backup. Since we are restoring on top of our existing sample database, you will receive the above warning message. Select Y and press Enter.

db2 "rollforward db sample to end of logs and stop"

Rollforward Status

Input database alias = sample

Number of nodes have returned status = 1

Node number

Documents

Backup and Recovery Concepts