Ingres® Backup and Recovery

Bruno Bompar, Senior Manager Customer Support

Abstract

Proper backup is crucial in any production DBMS installation, and Ingres is no exception. Backups are useless, however, unless you can recover from them. This session explains how Ingres backup and recovery work, and covers some ideas on how best to take regular backups and how to perform a safe recovery.

Agenda

• Why backup and recovery?

• Disaster scenarios

• Ingres features

• Housekeeping

• Customisation

• Issues to Consider

• Tips and cautions

Why backup and recovery?

• Insurance

• What if?

• Cost to business

• Critical functionality

• One part of overall process

Scenarios to Consider

• System Crash

• Database Corruption

• Lost Table

• Accidental Transaction

System Crash

• Automated Recovery

• After a crash Ingres will
  • Scan the transaction log file
  • Roll back uncompleted transactions
  • Apply completed transactions

• Databases will be consistent
  • Depends on the crash
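
The scan, roll back, and apply steps can be pictured with a toy model. This is a minimal sketch and not how Ingres implements recovery: the log record layout and the `recover` function are invented purely for illustration.

```python
def recover(disk, log):
    """Toy crash recovery over a simple transaction log.

    disk: dict with whatever values reached disk before the crash
    log:  list of records, in order:
          ("begin", tx), ("write", tx, key, old_value, new_value), ("commit", tx)
    """
    # Scan the log to find which transactions completed.
    committed = {rec[1] for rec in log if rec[0] == "commit"}
    state = dict(disk)

    # Apply completed transactions: redo their writes in log order.
    for rec in log:
        if rec[0] == "write" and rec[1] in committed:
            _, _, key, _old, new = rec
            state[key] = new

    # Roll back uncompleted transactions: undo their writes, newest first.
    for rec in reversed(log):
        if rec[0] == "write" and rec[1] not in committed:
            _, _, key, old, _new = rec
            state[key] = old
    return state


log = [
    ("begin", "T1"), ("write", "T1", "a", 0, 1), ("commit", "T1"),
    ("begin", "T2"), ("write", "T2", "b", 2, 99),   # T2 never committed
]
# At crash time T1's write had not reached disk, but T2's had.
disk_after_crash = {"a": 0, "b": 99}
print(recover(disk_after_crash, log))
```

In this model the recovered state contains the committed change to `a` and the original value of `b`, whichever of the two writes had reached disk before the crash.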

Database Corruption

• Databases can be recovered

• Only if valid Ingres backup is available!

• ckpdb command to backup

• rollforwarddb to recover

Backup Mechanisms

• OS backup
  • invalid unless done with Ingres shut down cleanly
  • important for backing up Ingres installation, journals, checkpoints, dumps
  • useless for backing up databases unless you can guarantee a clean shutdown

• unloaddb
  • an archiving or porting tool, not a backup tool
  • no way to ensure a consistent snapshot without locking out all users (an "offline" archive)

Backup Mechanisms

• In order to get the most out of a backup mechanism, two things are needed:
  • a way to take a static snapshot of the database without interfering too greatly with active users
  • a way to record incremental changes since that static snapshot

• Ingres does both via checkpoints and journals
  • a checkpoint is the static backup or snapshot
  • the journals are the ongoing change records

Backup Mechanisms

• Terminology note! Ingres differs from other DBMSs in its use of the word "checkpoint"

• Ingres:
  • a checkpoint is a backup snapshot
  • a consistency point (CP) is a buffer and log flush

• Other DBMSs:
  • a checkpoint means a buffer flush
  • a backup is just called a backup

Database Checkpoints

• Back up the whole database

• Online or Offline

• Enable / Disable journaling

• Can be performed in parallel

• Written to
  • Tape
  • Disk

• Don’t forget iidbdb!!

Online versus Offline

• Offline
  • Requires exclusive access to database

• Online
  • Users carry on working
  • No DDL statements
  • Slower than offline
  • Can cause transaction log file to fill

Online Checkpointing

• An online checkpoint (the ckpdb command) has three phases:
  • quiescing the database
  • file copying with change logging
  • completion recording

Online Checkpointing

• File copying is controlled by the checkpoint template (cktmpl.def)
  • can be modified by Ingres administrator
  • change copy command, add file compression, etc
  • amazing things are possible

• DML allowed during file copying
  • but not DDL - no file creation/deletion

• Changes during file copying are specially logged
  • before-images sent to dump files

Checkpointing

• After copying is complete, the checkpoint success or failure is recorded in the database config file
  • aaaaaaaa.cnf
  • another copy left in cnnnnnnn.dmp in dump location
  • note that the checkpoint itself does not contain a record of the checkpoint completion

• Config file records last N checkpoint attempts
  • successful or not
  • N = 99 for recent releases of Ingres
  • N = 16 for older versions (2.0 and older)

Online Checkpointing

• When it's all over, you have
  • one or more checkpoint files (one for each data location)
    • in disk checkpoint area, or on tape
  • zero or more dump files containing changes made while file-copying
  • an updated database config file
    • plus an updated copy in the dump location
  • a new set of journal files
    • a fresh journal file is started at the end of the database quiescent phase

Checkpointing

• What to save after the checkpoint completes:
  • the checkpoint and dump locations
    • you need both
  • infodb output (human readable listing of the database config file)
  • output of: select * from iifile_info
    • for manual table level recovery and emergencies
    • optional but recommended

Journals

• Audit trail of all changes made to selected tables
  • written in batches by the archiver (dmfacp)

• Default for tables is journaling ON
  • journaling also needs to be enabled for the database using ckpdb +j
  • this is an offline checkpoint; no users allowed

• Journal files grow to a target size, then a new one is started
  • current expected size and sequence number are stored in the database config file
  • each checkpoint starts a fresh set of journal files

Database Checkpoint - Examples

• Command line
  • Online checkpoint
      ckpdb dbname
  • Offline checkpoint, enabling journaling
      ckpdb +j dbname '#m3'
  • Offline checkpoint, disabling journaling
      ckpdb -j dbname
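
For scripted checkpoints, command lines like these can be assembled programmatically. The following is a minimal sketch: the helper names `ckpdb_args` and `checkpoint_plan` are hypothetical, the database name `sales` is made up, and only the `+j` and `-j` flags shown on this slide are used.

```python
def ckpdb_args(dbname, journaling=None):
    """Build a ckpdb argument list using only the flags shown on the slide.

    journaling=None  -> ckpdb dbname
    journaling=True  -> ckpdb +j dbname  (enable journaling)
    journaling=False -> ckpdb -j dbname  (disable journaling)
    """
    args = ["ckpdb"]
    if journaling is True:
        args.append("+j")
    elif journaling is False:
        args.append("-j")
    args.append(dbname)
    return args


def checkpoint_plan(databases):
    """Return one ckpdb argument list per database, always including iidbdb."""
    names = list(databases)
    if "iidbdb" not in names:
        names.append("iidbdb")   # the master database is easy to forget
    return [ckpdb_args(name) for name in names]


for args in checkpoint_plan(["sales"]):
    print(" ".join(args))
```

Building argument lists rather than shell strings keeps quoting out of the picture when the lists are later passed to a process launcher.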

Database Checkpoint - Examples

• Visual DBA

Recovery

• Recovery is a two step process
  • one command (rollforwarddb) with two distinct phases

• First, restore the database to a point in time (a checkpoint)

• Second, replay journals
  • optional
  • all journals, or stop at a given time
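
The two phases can be pictured with a toy model. This is only a sketch of the idea, not of Ingres internals: the `roll_forward` function, the dictionary "database", and the journal tuple layout are all assumptions made for illustration.

```python
from datetime import datetime


def roll_forward(checkpoint, journal, stop_at=None):
    """Toy recovery: restore a snapshot, then optionally replay changes.

    checkpoint: dict holding the snapshot contents
    journal:    list of (timestamp, key, value) tuples in time order
    stop_at:    optional datetime; entries later than this are not replayed
    """
    db = dict(checkpoint)                 # phase 1: restore the checkpoint
    for when, key, value in journal:      # phase 2: replay journals
        if stop_at is not None and when > stop_at:
            break
        db[key] = value
    return db


checkpoint = {"balance": 100}
journal = [
    (datetime(2002, 5, 10, 9, 0), "balance", 150),
    (datetime(2002, 5, 10, 12, 40), "balance", 0),   # the change to avoid
]

print(roll_forward(checkpoint, []))                                   # checkpoint only
print(roll_forward(checkpoint, journal))                              # all journals
print(roll_forward(checkpoint, journal, datetime(2002, 5, 10, 12, 32)))  # stop at a time
```

Passing an empty journal corresponds to restoring the checkpoint alone, and `stop_at` corresponds to stopping the replay at a given time.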

Recovery

• The database must exist before it can be recovered

• All required data locations must exist

• A valid config file must be available
  • recovery looks in the data location first, then the dump location
  • config file is renamed to aaaaaaaa.rfc

• The last checkpoint must be valid
  • can ask for an earlier checkpoint with #cn option

When Recovery Is Needed

• Stay calm!
  • you have practiced recovery, right?
  • haste makes mistakes
  • turn off the mobile phone, pager, etc
  • the database will be ready when it's ready

• Save your current database config
  • ideally, make a copy of the dump location and the data location aaaaaaaa.cnf
  • as a minimum save aaaaaaaa.cnf
  • allows you to try again if something goes wrong
  • if you have time, save everything in sight

Database Recovery

• Point in time recovery
  • Last checkpoint only
  • Last checkpoint + 10 hours work
  • 5 checkpoints ago

• Based on available files

Database Recovery - Examples

• Command Line
  • Last checkpoint only, no journals
      rollforwarddb +c -j dbname
  • Last checkpoint, journals to 12:32 on 10/05/02
      rollforwarddb +c +j dbname -e10-may-2002:12:32:00
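
When the stop time is computed by a script, the `-e` argument has to be rendered in the same day-month-year form as the example. A minimal sketch, assuming the format is exactly the one shown on this slide; the helper names `end_time_flag` and `rollforward_args` are hypothetical, and the month names are spelled out to avoid locale-dependent formatting.

```python
from datetime import datetime

MONTHS = ("jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec")


def end_time_flag(when):
    """Render a datetime as -edd-mmm-yyyy:hh:mm:ss, matching the slide's example."""
    return "-e{:02d}-{}-{:04d}:{:02d}:{:02d}:{:02d}".format(
        when.day, MONTHS[when.month - 1], when.year,
        when.hour, when.minute, when.second)


def rollforward_args(dbname, when):
    """Argument list for: last checkpoint, journals replayed up to `when`."""
    return ["rollforwarddb", "+c", "+j", dbname, end_time_flag(when)]


print(" ".join(rollforward_args("dbname", datetime(2002, 5, 10, 12, 32, 0))))
```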

Database Recovery - Examples

• Visual DBA
  • Last checkpoint only, no journals

Database Recovery - Examples

• Visual DBA
  • Last checkpoint, journals to 12:32 on 10/05/02

Recovery Scenarios

• Data area is lost
  • shut down Ingres if it's not down
  • restore data directories with db config file
  • restart Ingres
    • transaction log contents can be moved to journals only if a valid config file is available!
  • rollforwarddb
  • up-to-the-minute recovery should be possible

Recovery Scenarios

• Transaction log is lost
  • wasn't it mirrored?
  • recreate transaction log
  • rollforwarddb
  • most recent transactions not moved to journals will be lost

Recovery Scenarios

• Checkpoint or dump location is lost
  • recreate location directories
  • take fresh checkpoint
  • loss of checkpoint area should not affect running database

Recovery Scenarios

• Journal location is lost
  • installation will continue to run until transaction log fills up
  • recreate journal directory
  • alterdb -disable_journaling to halt journaling
  • restart archiver which will have stopped due to inability to write journals
  • ckpdb +j to restart journaling

Recovery Scenarios

• Software or human error is discovered

• If mistake is discovered immediately:
  • crash/restart Ingres, or remove all user sessions
  • rollforwarddb with -e option to replay journals, stopping short of the time of mistake

• If mistake isn't discovered until later, recovery is more complicated
  • Ingres Journal Analyzer (IJA) can help

Accidental Transaction

• AuditDB
  • Filter against
    • Table
    • Users
    • Time

• Scan Journal files
  • Generate SQL
  • Execute

Accidental Transaction

• Ingres Journal Analyzer
  • Auditdb with Knobs on…
  • Connect to remote servers
  • Force Log Flush
  • Point and Click

Recovery Scenarios

• Disaster

• Use OS backups to restore Ingres system directories, all data, work, checkpoint, dump, journal directories

• rollforwarddb iidbdb
  • you have been checkpointing iidbdb, right?
  • restores users, locations, database privileges, etc

• rollforwarddb databases

Recovery Scenarios

• Rollforwarddb failure
  • restore the config or dump info you saved before attempting rollforwarddb
  • rename aaaaaaaa.rfc back to aaaaaaaa.cnf if it exists
  • cure any other rollforwarddb complaints
  • try again

• Last checkpoint didn't work
  • use rollforwarddb #cn to restore an older one
  • you do have more than one checkpoint around, right?

Lost Table

• Table can be recovered

• From table checkpoint only

• Enforce logical consistency

• Journaling must be enabled

Table Checkpoints - Examples

• Command line
  • Checkpoint table t1
      ckpdb dbname -table=t1
  • Checkpoint table t1 and t2
      ckpdb dbname -table=t1,t2

Table Recovery - Examples

• From table checkpoint only

• Command line
  • Recover table t1
      rollforwarddb dbname -table=t1
  • Recover table t1 and t2
      rollforwarddb dbname -table=t1,t2

Housekeeping Ingres

• Infodb

• Checkpoints

• Dumps

• Journals

Infodb / aaaaaaaa.cnf

• Shows meta-data about database
  • Locations
  • Checkpoint sequence
    • Valid / Invalid
  • Dump / Journal sequence
  • Counters
    • Last table id
    • Last valid checkpoint

Infodb / aaaaaaaa.cnf

• Info stored in aaaaaaaa.cnf

• Three copies
  • Primary database location
  • Dump location as aaaaaaaa.cnf
  • Dump location as cxxxx.dmp

• Infodb reads CNF file in database area

• Copy to dump area with every change
  • II_DUMP
  • database's own dump area

Checkpoint files

• Stored in 1 location
  • II_CHECKPOINT
  • Database defined checkpoint area

• One file for each location

• Format depends on archiver used

Dump files

• Changes during ONLINE checkpoint

• Required for recovery

• Single location
  • II_DUMP
  • Database defined dump area

Journal Files

• Record of changes
  • Table configuration

• Facilitates point in time recovery

• Files stored in single location
  • II_JOURNAL
  • Database defined journal area

Backing up the backup files

• OFFLINE Checkpoint
  • Database aaaaaaaa.cnf
  • Dump aaaaaaaa.cnf
  • Output from infodb
  • Checkpoint
  • Journals

• ONLINE Checkpoint
  • All above
  • Dump files

Cleaning up

• ckpdb -d
  • All but the last checkpoint
  • Dump, journal files deleted as well

• alterdb -delete_oldest_ckp
  • Oldest checkpoint only
  • Maintain set of checkpoints
  • Dump, journal files deleted as well

Customisation

• cktmpl.def
  • $II_SYSTEM/ingres/files

• Defines actions
  • Before / During / After
  • Tape
  • Disk

• II_CKTMPL_FILE
  • ingsetenv only

• Most common entries to change:
  • WSDD: work phase of regular checkpoint
  • WRDD: work phase of regular rollforward

• Some things you can do:
  • add compression/decompression
  • use a different utility (eg star instead of tar)
  • wild and crazy stuff

• Test both checkpoint and restore after modifying the template

Issues To Consider

• Files
  • Ingres supports large files
  • OS archiver utility may not

• POSIX standard
  • tar
  • cpio

Tips and Cautions

• Hardware "solutions" aren't solutions
  • "I don’t need to backup, I have magic solution of the moment"
  • RAID 5, mirroring, whatever
  • you aren't protected against software failures
  • you aren't protected against human failures
  • you aren't protected against disasters
  • you may not be protected against multiple hardware failures
  • you are putting all your eggs in one basket

Tips and Cautions

• Backups are no good if they don't work
  • make sure that ckpdb works
  • automatic verification is better than manual verification
    • not ensuring that checkpoints are working may be the #1 cause of recovery failure

• Automate as much as possible
  • error checking
  • disk space checking
  • old-checkpoint deletion
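
The error-checking and disk-space items can be sketched as a small wrapper script. This is a minimal sketch under stated assumptions: the helper names `enough_space` and `run_checked` are hypothetical, the wrapper assumes the checkpoint command signals failure through a non-zero exit status, and the space threshold is something each site would have to choose for itself.

```python
import shutil
import subprocess


def enough_space(path, needed_bytes):
    """Return True if the file system holding `path` has at least needed_bytes free."""
    return shutil.disk_usage(path).free >= needed_bytes


def run_checked(args):
    """Run a command and report (succeeded, combined output).

    Assumption: a non-zero exit status means the command failed.
    The captured output should still be kept and scanned for error messages.
    """
    result = subprocess.run(args, capture_output=True, text=True)
    return result.returncode == 0, result.stdout + result.stderr
```

A nightly job could call `enough_space` on the checkpoint area before starting, then pass the checkpoint command as an argument list to `run_checked` and raise an alert whenever the first element of the result is false.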

Tips and Cautions

• A choice of checkpoints is better than just one
  • avoid ckpdb -d (delete all prior checkpoints)
  • alterdb -delete_oldest_ckp is better
  • manual (or scripted) deletion of old checkpoints is often best
    • maintains checkpoint history in the config file

• Keep as many checkpoints as you can
  • gives you more recovery options
  • don't skimp on checkpoint disk space (disks are cheap!)
  • you can delete checkpoints but keep journals
  • it's all on OS backups, right??

Tips and Cautions

• Be wary of checkpointing to tape
  • nasty, unreliable devices they are
  • "oops, there wasn't a tape in the drive"
  • if you must use tape, verify your backups regularly
    • tape drives have been known to write unreadable tapes

• Keep checkpoint and dump locations together
  • on the same file system or drive
  • keep them on the same OS backup schedule
  • checkpoints are worthless without the dump info

Tips and Cautions

• Practice is essential
  • not just once, but regularly
  • practice on look-alike installation if production is not available
  • practice on production at least occasionally
    • clean Ingres shutdown
    • OS backup everything in sight
    • verify the OS backup, then run your recovery tests
  • you need hardware resources to support your recovery practice

Tips and Cautions

• Document your recovery procedures
  • let someone else do a trial recovery
  • keep the procedures up to date
  • make sure that more than one person knows how to do a recovery
  • make sure that more than one person knows where to find the documentation
  • keep a copy offsite or in a safe place

Tips and Cautions

• Backing up and archiving are different
  • a backup has a short useful lifetime
  • an archive (unload) is good indefinitely

• Backup planning and disaster recovery planning are different
  • recoverable backups are just one aspect of a complete disaster recovery plan

More Information

• Ingres DBA guide
  • Chapter 15 (2.6)

• Ingres Command Reference Guide

• Compressed Checkpoints
  • Servicedesk Doc ID 409751

Summary

• Backups deserve more than lip service

• Ensuring 100% recoverable backups takes time, effort, and money

• Ingres checkpoint and rollforward capabilities are simple yet powerful and customisable

• With proper practice and procedures, a recovery is nothing to be afraid of

Questions & Answers

?