
Front cover

DB2 Universal Database Advanced Administration Workshop (Course Code CF45)

Student Notebook ERC 7.3

IBM Certified Course Material


November 2004 Edition

The information contained in this document has not been submitted to any formal IBM test and is distributed on an “as is” basis without any warranty either express or implied. The use of this information or the implementation of any of these techniques is a customer responsibility and depends on the customer’s ability to evaluate and integrate them into the customer’s operational environment. While each item may have been reviewed by IBM for accuracy in a specific situation, there is no guarantee that the same or similar results will result elsewhere. Customers attempting to adapt these techniques to their own environments do so at their own risk.

© Copyright International Business Machines Corporation 1997, 2004. All rights reserved.
This document may not be reproduced in whole or in part without the prior written permission of IBM.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.


Contents

Trademarks . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xi

Course Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xiii

Agenda . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xv

Unit 1. Automatic Computing . . . . . 1-1
  Unit Objectives . . . . . 1-2
  1.1 Automatic Maintenance . . . . . 1-3
    Prerequisites for Automatic Computing . . . . . 1-4
    DB Configuration Automatic Maintenance . . . . . 1-5
    Runstats and Profiling . . . . . 1-7
    Configure Automatic Maintenance . . . . . 1-10
  1.2 Control Center, Activity Monitor, and Memory Visualization . . . . . 1-13
    Control Center . . . . . 1-14
    Activity Monitor . . . . . 1-16
    Memory Visualization . . . . . 1-18
  1.3 Health Center . . . . . 1-19
    Health Monitor . . . . . 1-20
    Health Indicator Settings . . . . . 1-22
    Health Center - Alerts View . . . . . 1-25
    Alarm Details . . . . . 1-27
  Unit Summary . . . . . 1-30

Unit 2. Remote Administration . . . . . 2-1
  Unit Objectives . . . . . 2-2
  2.1 DB2 UDB Administration Server (DAS) . . . . . 2-3
    DAS Administration Server and Tools Catalog . . . . . 2-4
    Using the Administration Server . . . . . 2-6
  2.2 Discovery . . . . . 2-11
    Configuration Assistant . . . . . 2-12
    Using the Configuration Assistant . . . . . 2-14
    Using the Configuration Assistant . . . . . 2-16
    Discovery Parameters . . . . . 2-18
  2.3 Instance-Level Commands . . . . . 2-21
    Instance Attachment versus DB Connection . . . . . 2-22
    Explicit/Implicit ATTACH/CONNECT . . . . . 2-23
    DB2 UDB Directory Structure - CONNECT . . . . . 2-25
    Remote Administration - ATTACH . . . . . 2-27
    ATTACH to Local Node . . . . . 2-28
    ATTACH to Remote Node . . . . . 2-29
    Connectivity Checklist (TCP/IP) . . . . . 2-30
  Unit Summary . . . . . 2-32


Unit 3. The Governor and Audit . . . . . 3-1
  Unit Objectives . . . . . 3-2
  3.1 Establishing Rules with the Governor . . . . . 3-3
    DB2 UDB Governor . . . . . 3-4
    Starting the Governor . . . . . 3-5
    What Does the Governor Do? . . . . . 3-7
    Governor Configuration File Format . . . . . 3-9
    Governor Clauses . . . . . 3-11
    Elements Checked . . . . . 3-13
    Actions Taken . . . . . 3-15
    Example Governor Configuration File . . . . . 3-17
    The Governor Log File Structure . . . . . 3-20
    Querying Governor Logs . . . . . 3-22
    Table to Hold Accounting Records (UNIX) . . . . . 3-23
    Why Do We Need Auditing? . . . . . 3-24
  3.2 Audit Facility . . . . . 3-27
    How Does the DB2 Audit Facility Work? . . . . . 3-28
    DB2audit Performance Impact . . . . . 3-30
    Audit Facility . . . . . 3-32
    db2audit Command - How It Works . . . . . 3-34
    Auditing Flow . . . . . 3-37
    Configuring db2audit . . . . . 3-38
    Extracting Audit Record . . . . . 3-40
    Audit Record - Connect (1 of 2) . . . . . 3-42
    Audit Record - Connect (2 of 2) . . . . . 3-43
    Audit Record - Create Table (1 of 2) . . . . . 3-44
    Audit Record - Create Table (2 of 2) . . . . . 3-45
    How to Clean Up Audit Logs? . . . . . 3-46
    How to Optimize Work with db2audit . . . . . 3-48
  Unit Summary . . . . . 3-53

Unit 4. Problem Determination Tools and Techniques . . . . . 4-1
  Unit Objectives . . . . . 4-2
    db2diag.log and Administration Log DBM CFG . . . . . 4-3
    Administration Notification Log . . . . . 4-5
    Administration Notification Log . . . . . 4-7
    Interpreting Log Entries . . . . . 4-8
    Interpreting Log Entries . . . . . 4-10
    Log File Example (1) . . . . . 4-11
    Log File Example (2) . . . . . 4-12
    db2diag db2diag.log Analysis Tool Command . . . . . 4-13
  4.1 Inspect . . . . . 4-17
    Online Database Checking Tool - INSPECT . . . . . 4-18
    Inspect Syntax (1) . . . . . 4-20
    Inspect Syntax (2) . . . . . 4-22
  4.2 db2support . . . . . 4-25
    db2support Overview . . . . . 4-26


    db2support Syntax . . . . . 4-28
  4.3 db2trace . . . . . 4-31
    DB2 Trace . . . . . 4-32
    DB2 Trace to Memory . . . . . 4-33
  4.4 Problem Determination Tips . . . . . 4-35
    Information Needed . . . . . 4-36
    Miscellaneous Troubleshooting Tools . . . . . 4-38
  Unit Summary . . . . . 4-42

Unit 5. Parallelism, SMP Enablement, and Process Model . . . . . 5-1
  Unit Objectives . . . . . 5-2
  5.1 Parallelism and SMP Enablement . . . . . 5-3
    Why Use Parallelism . . . . . 5-4
    Parallelism in DB2 UDB . . . . . 5-6
    More DB2 UDB Parallelism . . . . . 5-8
    SQL Query Parallelism Overview . . . . . 5-10
    Subsection Pieces . . . . . 5-11
    Intra-Query Parallelism (1) . . . . . 5-13
    Intra-Query Parallelism (2) . . . . . 5-15
    Intra-Query Parallelism . . . . . 5-16
    Data Parallelism for SMP . . . . . 5-17
    Parallel Configuration Parameters . . . . . 5-19
    Which Degree of Parallelism Is Used? . . . . . 5-23
    Intra-Partition Parallelism Recommendations . . . . . 5-24
    Process Model . . . . . 5-26
    Process Model on SMP . . . . . 5-28
    Agent States . . . . . 5-34
    Assigning Subagents . . . . . 5-36
    Controlling the Number of Subagents . . . . . 5-38
    Connection Concentrator . . . . . 5-40
    Utility Parallelism for SMP Overview . . . . . 5-41
    Load Parallelism . . . . . 5-42
    Index Create Parallelism . . . . . 5-44
    Backup Parallelism with SMP . . . . . 5-46
    Restore Parallelism with SMP . . . . . 5-47
  Unit Summary . . . . . 5-50

Unit 6. Advanced Utility Topics . . . . . 6-1
  Unit Objectives . . . . . 6-2
  6.1 Using db2move . . . . . 6-3
    How db2move Works . . . . . 6-4
    db2move Syntax . . . . . 6-5
    db2look . . . . . 6-10
  6.2 db2ocat . . . . . 6-15
    db2ocat Utility . . . . . 6-16
    How db2ocat Works . . . . . 6-18
    db2cfexp . . . . . 6-20


    db2cfimp . . . . . 6-22
    List Utilities . . . . . 6-23
    Change the Priority for Utilities . . . . . 6-25
  6.3 Capacity Management . . . . . 6-27
    Estimate Size GUI . . . . . 6-28
    Storage Management . . . . . 6-29
    Database Container Operations . . . . . 6-31
    DMS Table Space Container Management (Drop Container) . . . . . 6-33
    DMS Table Space Container Management (Reduce Size) . . . . . 6-34
    DMS Container Management (Add New/No Rebalance) . . . . . 6-35
    ALTER TABLESPACE Syntax . . . . . 6-36
  6.4 High Availability Monitor . . . . . 6-39
    High Availability Monitor on UNIX (db2fm) . . . . . 6-40
  Unit Summary . . . . . 6-44

Unit 7. Online Table and Index Reorganization . . . . . 7-1
  Unit Objectives . . . . . 7-2
  7.1 Table Reorganization . . . . . 7-3
    Table Reorganization Overview . . . . . 7-4
    Online (INPLACE) Table Reorg . . . . . 7-5
    Online Table Reorganization - Inside . . . . . 7-7
    Online Table Reorganization - Characteristics . . . . . 7-8
    Online Reorg Syntax - Table or Index . . . . . 7-10
    Online Table Reorganization - Usage Tips . . . . . 7-14
    Classic Table Reorg . . . . . 7-15
    Online Index Create and Reorganization . . . . . 7-17
    Online Index Create Overview . . . . . 7-18
    Online Index Create - Usage Tips . . . . . 7-19
    Online Index Reorganization - Overview . . . . . 7-21
    Online Index Reorganization - Usage Tips . . . . . 7-22
    “Online Index Reorganization” now “Online Index Defragmentation of Leaf Pages” . . . . . 7-24
  Unit Summary . . . . . 7-25

Unit 8. Multidimensional Clustering . . . . . 8-1
  Unit Objectives . . . . . 8-2
  8.1 Multidimensional Clustering . . . . . 8-3
    Overview . . . . . 8-4
    Single-Dimensional Data Clustering . . . . . 8-6
    How Does Single-Dimensional Clustering Work? . . . . . 8-7
    MDC - How It Works . . . . . 8-8
    Multidimensional Clustering . . . . . 8-9
    Background - Extents (1 of 2) . . . . . 8-11
    Background - Extents (2 of 2) . . . . . 8-12
    Terminology: Dimension . . . . . 8-13
    Terminology: Slice (1 of 3) . . . . . 8-14
    Terminology: Slice (2 of 3) . . . . . 8-15


    Terminology: Slice (3 of 3) . . . . . 8-16
    Terminology: Cell . . . . . 8-17
    Block Indexes and Dimension Block Indexes . . . . . 8-18
    Dimension Block Indexes . . . . . 8-19
    Benefit: Clustering in Multiple Dimensions . . . . . 8-20
    Insert . . . . . 8-21
    Insert: The Composite Block Index . . . . . 8-22
    Insert Processing . . . . . 8-23
    Benefit: Guaranteed Clustering . . . . . 8-24
    Block Management . . . . . 8-25
    The Block Map . . . . . 8-26
    Update . . . . . 8-27
    Benefit: Reduced Overhead and Logging . . . . . 8-28
    Simple and Flexible Syntax (1 of 3) . . . . . 8-29
    Simple and Flexible Syntax (2 of 3) . . . . . 8-30
    Simple and Flexible Syntax (3 of 3) . . . . . 8-32
    Query Processing - Example (1 of 3) . . . . . 8-33
    Query Processing - Example (2 of 3) . . . . . 8-34
    Query Processing - Example (3 of 3) . . . . . 8-35
    Benefit: Faster Queries . . . . . 8-36
    Considerations for Dimension Selection . . . . . 8-37
    MDC and Generated Columns - Integration . . . . . 8-38
    Caution: The Importance of Monotonicity (1 of 4) . . . . . 8-39
    Caution: The Importance of Monotonicity (2 of 4) . . . . . 8-40
    Caution: The Importance of Monotonicity (3 of 4) . . . . . 8-41
    Caution: The Importance of Monotonicity (4 of 4) . . . . . 8-42
    MDC and Generated Columns . . . . . 8-43
    Fast and Efficient Data Roll-in . . . . . 8-44
    MDC Load Example . . . . . 8-45
    MDC Integration . . . . . 8-47
    In Conclusion . . . . . 8-48
  Unit Summary . . . . . 8-49

Unit 9. Advanced Load . . . . . 9-1
  Unit Objectives . . . . . 9-2
  9.1 Benefits of Online Load . . . . . 9-3
    LOAD Syntax . . . . . 9-4
    Online Load Example . . . . . 9-9
    Locking - Offline versus Online Load . . . . . 9-10
    Online Load - Index Rebuild . . . . . 9-15
    Online Load - Incremental Indexing . . . . . 9-18
  9.2 Load File Type Modifiers . . . . . 9-19
    Load File Type Modifiers . . . . . 9-20
    Free Space in Index and Data Pages . . . . . 9-23
    Load From Cursor - Example . . . . . 9-25
    LOAD ... HOLD QUIESCE . . . . . 9-27
    Load and Load Performance Considerations . . . . . 9-28


    Non-Recoverable Load Scenario . . . . . 9-30
    Non-Recoverable Load Considerations (1 of 2) . . . . . 9-32
    Non-Recoverable Load Considerations (2 of 2) . . . . . 9-34
  9.3 Additional Load Utility Options . . . . . 9-37
    LOAD - TEMPFILES PATH Option . . . . . 9-38
    LOAD QUERY Command . . . . . 9-39
    LOAD QUERY - Example . . . . . 9-41
    LOAD RESTART . . . . . 9-43
    LOAD - TERMINATE . . . . . 9-45
    LOAD - Incremental Indexing . . . . . 9-46
  9.4 Abnormal Load Termination . . . . . 9-49
    Abnormal Load Termination Cleanup Problem . . . . . 9-50
    Abnormal Load Termination Cleanup Steps . . . . . 9-52
  Unit Summary . . . . . 9-54

Unit 10. Distributed Management . . . . . 10-1
  Unit Objectives . . . . . 10-2
  10.1 Distributed Management . . . . . 10-3
    What Is a Distributed Unit of Work? . . . . . 10-4
    Levels of Access . . . . . 10-5
    Federated Database - Distributed Request . . . . . 10-6
    Application-Directed DUOW . . . . . 10-8
    Application Design . . . . . 10-10
    DUOW Examples (1 of 2) . . . . . 10-11
    DUOW Examples (2 of 2) . . . . . 10-13
    Setting DUOW Options with CLP . . . . . 10-15
    Multisite Update with Two-Phase Commit . . . . . 10-17
    What Is Involved in a Two-Phase Commit? . . . . . 10-19
    Two-Phase Commit in DB2 UDB . . . . . 10-21
    GUI - Multisite Update Wizard . . . . . 10-23
    Configure Multisite Update Wizard (1 of 2) . . . . . 10-24
    Configure Multisite Update Wizard (2 of 2) . . . . . 10-26
    Test Multisite Update . . . . . 10-27
    The Resynchronization Process . . . . . 10-28
    DUOW Database Configuration . . . . . 10-31
    Selecting Your TM Database . . . . . 10-32
    Lock Timeout Avoidance . . . . . 10-34
  Unit Summary . . . . . 10-36

Unit 11. Federated Databases . . . . . 11-1
  Unit Objectives . . . . . 11-2
  11.1 Federated Databases . . . . . 11-3
    Distributed Queries - General . . . . . 11-4
    DB2 UDB Federated Systems . . . . . 11-6
    Federated Database . . . . . 11-9
    DB2 UDB Federated Database GUI . . . . . 11-11
    Sample Scenario DML (1) . . . . . 11-12


    Sample Scenario DML (2) . . . . . 11-13
    Sample Scenario DML (3) . . . . . 11-15
    Performance Issues in Federated Databases . . . . . 11-17
    Additional Education . . . . . 11-19
  Unit Summary . . . . . 11-22

Unit 12. Replication - Optional . . . . . 12-1
  Unit Objectives . . . . . 12-2
  12.1 Replication . . . . . 12-3
    Replication Overview . . . . . 12-4
    UDB Replication Concept . . . . . 12-6
    Replication Center (1) . . . . . 12-8
    Replication Center (2) . . . . . 12-10
    Additional Courses . . . . . 12-12
  Unit Summary . . . . . 12-14

Appendix A. Checkpoint Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1

Bibliography. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .X-1


Trademarks

The reader should recognize that the following terms, which appear in the content of this training document, are official trademarks of IBM or other companies:

IBM® is a registered trademark of International Business Machines Corporation.

The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both:

AIX®, AS/400®, CICS®, CICS/6000®, DataJoiner®, DataPropagator™, DB2®, DB2 Connect™, DB2 Universal Database™, DRDA®, Encina®, HACMP™, Informix®, iSeries™, MQSeries®, MVS™, OS/2®, OS/390®, OS/400®, Redbooks™, Red Brick™, WebSphere®, and z/OS®

Java and all Java-based trademarks are trademarks of Sun Microsystems, Inc. in the United States, other countries, or both.

Microsoft, Windows and Windows NT are trademarks of Microsoft Corporation in the United States, other countries, or both.

Intel is a trademark of Intel Corporation in the United States, other countries, or both.

UNIX is a registered trademark of The Open Group in the United States and other countries.

Linux is a trademark of Linus Torvalds in the United States, other countries, or both.

Other company, product and service names may be trademarks or service marks of others.


Course Description

DB2 Universal Database Advanced Administration Workshop

Duration: 4 days

Purpose

This course teaches how to perform advanced database administration tasks using DB2 Universal Database. These tasks include client administration, automatic maintenance features, problem determination, advanced load usage, parallelism and SMP exploitation, DB2 Governor, audit facility, distributed data management, remote administration, multidimensional data clustering, online reorganizations, use of the Health Center, and federated database support.

Audience

System administrators, database administrators, and technical personnel who are involved in planning, implementing, and maintaining DB2 Universal Databases.

Prerequisites

Before taking this course, you should have a working knowledge of DB2 Universal Database. You can develop this knowledge through job experience or by taking the following courses:

• DB2 UDB Database Administration Workshop for UNIX (CF21)

• DB2 UDB Database Administration Workshop for Linux (CF20)

• DB2 UDB Database Administration Workshop for Windows (CF23)

• DB2 UDB Enterprise - Extended Edition for UNIX Administration Workshop (CF24)

• DB2 UDB Database Administration Workshop for Sun Solaris (CF27)

• DB2 UDB for Experienced Relational DBAs (CF28)

• DB2 UDB Multi-Partition Environment for Single-Partition DBAs (CG24)


Objectives

After completing this course, you should be able to:

• Effectively apply advanced techniques to administer a DB2 Universal Database

• Explore parallelism and symmetric multiprocessor (SMP) enablement as well as the process model

• Explore problem determination tools and techniques

• Use online index as well as online table reorganization

• Explore multidimensional clustering

• Explore advanced load utilities

• Explore federated databases

• Configure the DB2 Governor to enforce time and CPU restrictions

• Manage a distributed data environment

• Explore the automatic maintenance features

Contents

• Automatic computing
• Remote administration
• The Governor and audit
• Problem determination tools and techniques
• Parallelism and SMP enablement
• Advanced utility topics
• Online table and index reorganization
• Multidimensional clustering
• Advanced load
• Distributed management
• Federated databases


Agenda

Day 1

Welcome
Automatic Computing
Lab: Automatic Computing
Remote Administration
Lab: Remote Administration
The Governor and Audit
Lab: Using the Governor and Audit

Day 2

Problem Determination Tools and Techniques
Lab: Problem Determination Tools and Techniques
Parallelism and SMP Enablement
Lab: Parallelism and SMP Enablement
Advanced Utility Topics
Lab: Advanced Utility Topics

Day 3

Online Table and Index Reorganization
Lab: Online Table and Index Reorganization
Multidimensional Clustering
Lab: Multidimensional Clustering
Advanced Load

Day 4

Distributed Management
Lab: Distributed Management
Federated Databases
Lab: Federated Databases


Unit 1. Automatic Computing

What This Unit Is About

This unit teaches you how to manage jobs using the automatic computing capability of DB2 UDB.

What You Should Be Able to Do

After completing this unit, you should be able to use and be informed about:

• Automated backup with policy
• Simplified memory configuration
• Automatic RUNSTATS and sampling
• Automated table REORG with policy
• Control Center customization and navigation
• Health Center
• Health Center Recommendation Advisor
• Activity Monitor

How You Will Check Your Progress

Accountability:

• Checkpoint questions
• An exercise

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Administration Guide: Performance


Figure 1-1. Unit Objectives CF457.3

After completing this unit, you should be able to:

• Identify and use the automatic capabilities of DB2 UDB:
  - Backup/Restore/Log File Management
  - Memory Configuration
  - Runstats
  - Reorg
• Use and configure the Health Center
• Use the Activity Monitor


1.1 Automatic Maintenance


Figure 1-2. Prerequisites for Automatic Computing CF457.3

Notes:

The following database configuration parameters (which are all turned off by default) allow you to control the automatic maintenance activities of several DB2 UDB utilities:

• auto_maint is the parent of all other automatic maintenance configuration parameters. When this parameter is turned off, all of its child parameters are also disabled, but their settings do not change. In this way, automatic maintenance can be enabled or disabled globally.
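
The parent switch and its children are ordinary database configuration parameters, so they can also be set without the GUI. A minimal sketch, written as a DB2 CLP script fragment and assuming a database named SAMPLE (the database name is only an example):

    -- turn all automatic maintenance on (or off) globally for this database
    UPDATE DB CFG FOR sample USING AUTO_MAINT ON;

    -- review the automatic maintenance section of the database configuration
    GET DB CFG FOR sample;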


Figure 1-3. DB Configuration Automatic Maintenance CF457.3

Notes:

auto_db_backup enables or disables automatic backup operations for a database. A backup policy (a defined set of rules or guidelines) can be used to specify the automated behavior. The objective of the backup policy is to ensure that the database is being backed up regularly. The backup policy is created automatically when the DB2 Health Monitor first runs. To be enabled, this parameter must be set ON.

auto_tbl_maint is the parent of all table maintenance parameters (auto_runstats, auto_stats_prof, auto_prof_upd, and auto_reorg). When this parameter is disabled, all of its child parameters are also disabled, but their settings do not change. When this parameter is enabled (ON), the recorded values for the child parameters take effect. With this parameter, you can enable or disable table maintenance globally.

auto_runstats is the automated table maintenance parameter which enables (ON) or disables (OFF) automatic table runstats. A runstats policy (a defined set of rules or guidelines) can be used to specify the automated behavior. Runstats is used by the optimizer to determine the most efficient plan for accessing the physical data.


auto_stats_prof - Enabling this parameter (ON) turns on statistical profile generation, which is designed to improve applications whose workloads include complex queries, many predicates, joins, and grouping operations over several tables.

auto_prof_upd is a child of auto_stats_prof. When enabled (ON), it specifies that the runstats profile is to be updated with recommendations. When disabled (OFF), recommendations are stored in the opt_feedback_ranking table which you can inspect when manually updating the runstats profile.

auto_reorg enables (ON) or disables (OFF) automatic table and index reorganization for a database. A reorganization policy (a defined set of rules or guidelines) can be used to specify the automated behavior.
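
Putting the hierarchy together, the following CLP script sketch (again assuming the SAMPLE database) enables automatic backup and the table maintenance children described above. Keep in mind that a child setting only takes effect while its parent is ON:

    -- parent switches
    UPDATE DB CFG FOR sample USING AUTO_MAINT ON AUTO_TBL_MAINT ON;

    -- automatic backup, statistics collection, and reorganization
    UPDATE DB CFG FOR sample USING AUTO_DB_BACKUP ON AUTO_RUNSTATS ON AUTO_REORG ON;

    -- automatic statistics profiling and profile updates
    UPDATE DB CFG FOR sample USING AUTO_STATS_PROF ON AUTO_PROF_UPD ON;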


Figure 1-4. Runstats and Profiling CF457.3

Notes:

Having out-of-date or incomplete statistics for a table or an index could lead the optimizer to select a plan that is not optimal and to slow down query execution. Determining which statistics to collect for a given workload is complex, and keeping these statistics up to date is time-consuming.

Using automatic statistics collection, you can let DB2 determine which statistics are required by your workload and which statistics need to be updated. When enabled, DB2 UDB automatically runs the RUNSTATS utility in the background to ensure that the correct statistics are collected and maintained.

The performance impact is minimized in several ways:

• Statistics collection is performed using throttled RUNSTATS. Throttling controls the amount of resources consumed by the RUNSTATS utility based on current database activity: when database activity increases, the utility runs more slowly by reducing its resource demands. (A CLP sketch of this throttling mechanism follows this list.)


• Only the minimal set of statistics for optimizing performance is collected. This is achieved through the use of statistics profiling, which uses information about previous database activity to determine which statistics are required by the workload and how quickly those statistics will become out of date.

• Only tables with high levels of activity (measured through the number of updates, deletes, and inserts) are considered for statistics collection. Large tables (of more than 4,000 pages) are also sampled to determine whether the high table activity has indeed changed the statistics. Statistics for these large tables are only collected if warranted.

• The RUNSTATS utility is automatically scheduled to execute during the optimal maintenance window specified in your maintenance policy definition. This policy also specifies a set of tables that are within the scope of automatic statistics collection, and minimizes unnecessary resource consumption.

• While the collection of automated statistics is being performed, the affected tables are still available for regular database activity as if RUNSTATS were not running on the tables.
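
The throttling used by automatic statistics collection is the same utility throttling that you can request manually. A CLP sketch under assumed values (the table DB2INST1.SALES, the 10 percent limit, and the priority of 50 are only examples): util_impact_lim in the database manager configuration caps the impact of all throttled utilities, and UTIL_IMPACT_PRIORITY marks an individual RUNSTATS invocation as throttled:

    -- cap the impact of throttled utilities at roughly 10 percent of the workload
    UPDATE DBM CFG USING UTIL_IMPACT_LIM 10;

    CONNECT TO sample;

    -- collect table and index statistics as a throttled, lower-priority utility
    RUNSTATS ON TABLE db2inst1.sales WITH DISTRIBUTION AND DETAILED INDEXES ALL
        UTIL_IMPACT_PRIORITY 50;

    CONNECT RESET;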

Statistics profiles

Statistics profiles can also be generated automatically. When this feature is enabled, information about database activity is collected and stored in a query feedback warehouse. Based on this data, a statistics profile is generated. Enabling this feature can alleviate the problem of uncertainty about which statistics are relevant to a particular workload, and permits the minimal set of statistics to be collected.

This feature can be used with the automatic statistics collection feature, which schedules statistics maintenance based on the information contained within the profile.

Note: Automatic statistics profile generation can only be activated in DB2 serial mode, and is blocked for queries in federated, SMP, or MPP environments.

There are different ways to use this feature:

• In a test environment where the performance overhead and runtime monitoring can be easily tolerated. When the test system uses realistic data and queries, this will allow the proper settings of statistics parameters for RUNSTATS to be determined. The generated profiles can be transferred to the production system where queries can benefit without incurring any monitoring overhead.

• To address performance issues of specific queries in a production environment, AUTO_STATS_PROF can be turned on for a certain period of time. Automatic statistics profiling will analyze the query feedback and create a recommendation in the SYSTOOLS.OPT_FEEDBACK_RANKING tables. You can inspect these recommendations and define the appropriate profiles.

Note: There is some performance overhead associated with monitoring the queries and storing the feedback data in the feedback warehouse.
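
Statistics profiles themselves are registered and reused through RUNSTATS options; a brief CLP sketch (the table name and the chosen statistics clauses are only examples):

    CONNECT TO sample;

    -- record a statistics profile for the table without collecting statistics now
    RUNSTATS ON TABLE db2inst1.sales WITH DISTRIBUTION AND INDEXES ALL SET PROFILE ONLY;

    -- later, collect statistics according to the stored profile
    RUNSTATS ON TABLE db2inst1.sales USE PROFILE;

    CONNECT RESET;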

To use automatic statistics profiling, you must first create the query feedback warehouse, which consists of three tables in the SYSTOOLS schema: OPT_FEEDBACK_QUERY, OPT_FEEDBACK_PREDICATE, and OPT_FEEDBACK_PREDICATE_COLUMN.

To create these tables, you can use the SYSINSTALLOBJECTS stored procedure, which is the common stored procedure for creating and dropping objects in the SYSTOOLS schema.

You can invoke this stored procedure as follows:

call SYSINSTALLOBJECTS (toolname, action, tablespacename, schemaname)

where:

• toolname specifies the name of the tool whose objects are to be created or dropped. In this case, use ASP or AUTO STATS PROFILING.

• action is C for create, or D for drop.

• tablespacename is the name of the table space in which the feedback warehouse tables will be created. If not specified, the default will be the user space.

• schemaname is the name of the schema with which the objects will be created or dropped. This parameter is currently not used.

Example: To create the feedback warehouse in the table space MYTS, enter:

call SYSINSTALLOBJECTS ('ASP', 'C', 'MYTS', '')
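
After the call completes, a simple catalog query (a sketch, assuming the SAMPLE database) can confirm that the three feedback warehouse tables now exist under the SYSTOOLS schema:

    CONNECT TO sample;

    -- list the feedback warehouse tables created by SYSINSTALLOBJECTS
    SELECT TABNAME FROM SYSCAT.TABLES
        WHERE TABSCHEMA = 'SYSTOOLS' AND TABNAME LIKE 'OPT_FEEDBACK%';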


Figure 1-5. Configure Automatic Maintenance CF457.3

Notes:

To set up your database for automatic maintenance, you can use the graphical user interface.

• From the Control Center, right-click a database object, or, from the Health Center, right-click the database instance that you want to configure. Select Configure Automatic Maintenance from the popup window.

• On the Type screen you can change the automation setting or disable automation. This screen also shows how the current automatic maintenance settings are configured.

• On the Timing screen you can specify the maintenance window during which automatic maintenance can be performed. You will be able to specify an online or an offline maintenance window, and whether the maintenance should take place inside or outside this window.

• On the Notification screen you can add or remove contacts from the health notification contact list.


• On the Activities screen you can select the maintenance activities and configure them. You can also determine whether each activity should be automated, and whether notification should be sent.

- Backup Database: You can configure the backup criteria (more frequent backups, a balance of database recoverability and performance, less frequent backups, or a customized policy). You will also see the criteria details, such as the maximum time between backups and the log space used between backups. Furthermore, you will be able to specify the backup location as well as the type of backup (online or offline).

- REORG: You can determine whether all tables (excluding those whose schema starts with SYS) or only selected tables should be reorganized. The selection of tables is done by a select on SYSCAT.TABLES.

- RUNSTATS: You can determine whether all tables (excluding those whose schema starts with SYS) or only selected tables should have their statistics collected. The selection of tables is done by a select on SYSCAT.TABLES (a sample selection query is shown after this list).

• On the Summary screen you can verify your settings with all the relevant details.
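
The table selection mentioned for REORG and RUNSTATS is essentially a filter over SYSCAT.TABLES. The exact predicate is generated by the wizard, but a simplified sketch of that kind of query is:

    -- user tables only, excluding the SYS schemas, as candidates for automatic maintenance
    SELECT TABSCHEMA, TABNAME
        FROM SYSCAT.TABLES
        WHERE TYPE = 'T' AND TABSCHEMA NOT LIKE 'SYS%';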


1.2 Control Center, Activity Monitor, and Memory Visualization


Figure 1-6. Control Center CF457.3

Notes:

The Control Center can be opened using different default views, and you will be asked which of them to use when opening for the first time. This can be changed at any time subsequently, using Tools from the menu bar and then Customize Control Center from the popup menu. In this chapter, we will not talk about the basic view but about the two others:

• Custom: here you can define which folders and objects you will see in the Object Tree pane, such as All databases, Tables, Views, and so on. Furthermore, if you select one of the folders or objects, you will be able to select the actions listed in the popup menus when the folder or object is right-clicked from the Control Center.

• Advanced: is the default, and lists the standard set of objects that are traditionally used.

The Object Detail pane is dependent on the object selected in the Object Tree pane or the Contents pane, and also on the status of the object (for example, if a database is active or not). If the object is not active, you will see actions that could be performed. If the object is active, you will receive additional information, such as the database object’s status timestamp, DBM state, last backup timestamp, size and capacity, health information, and maintenance (automatic). In addition, you have a direct link there for further actions, such as for backup, managing storage, Health Monitor, or automatic maintenance setup.

When you select, for example, a table in the Contents pane, you will see in the Object Detail pane the columns and definitions (including key information) of the table, as well as actions such as to open, query, or show related objects.

In addition, by selecting View from the menu bar, you can again specify which details (columns) you want to see in the Contents pane, and also save your customized view.

You can apply further operations to the views, such as sorting and filtering the relevant data.


Figure 1-7. Activity Monitor CF457.3

Notes:

You can invoke the activity monitor by right-clicking the instance and selecting Activity Monitor.

First, you must select the database to be monitored.

Then, you need to select a monitoring task:

• Resolving a general database system slowdown

• Resolving the performance degradation of an application

• Resolving an application locking situation

• Tuning the dynamic SQL statement cache

• Or, creating your own task

Then you determine any filters to apply, such as authorization ID, application name, or agent ID. There might well be some performance impact associated with collecting data for the selected reports.


Once finished, you can select a report to view, as well as choose how many SQL statements should be shown, and also obtain some recommendations regarding problem solving.

The information you will see might include the agent ID, application name, authorization ID, total CPU time, user CPU time, or system CPU time, depending on the view selected.
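The reports are built on DB2 snapshot monitor data. As a rough command-line counterpart (the database name SAMPLE is a placeholder, and some columns are only populated if the corresponding monitor switches are on):

   db2 get snapshot for applications on sample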


Figure 1-8. Memory Visualization CF457.3

Notes:

By right-clicking the instance, you can choose to view memory usage. Then select the relevant database, and Memory Visualizer will be opened.

Here you see all relevant memory resources and parameters, which you can assign to be shown in the memory usage plot.

The resources shown are DBM shared memory, such as backup and restore, buffer pools, locks, and so on, and agent private memory, such as application heap, sortheap, and so on.

You will also have an overview of the actual utilization, the parameter value, and warning or alarm thresholds, if any. These thresholds can be updated directly in the Memory Visualizer.
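For a quick, non-graphical look at memory allocation, DB2 UDB V8 also provides the db2mtrk memory tracker; a hedged example that reports instance-level and database-level memory in verbose form:

   db2mtrk -i -d -v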


1.3 Health Center


Figure 1-9. Health Monitor CF457.3

Notes:

The Health Monitor is a server-side tool that constantly monitors the health of the instance, even without user interaction. If the Health Monitor finds that a defined threshold has been exceeded (for example, the available log space is not sufficient), or if it detects an abnormal state for an object (for example, an instance is down), the Health Monitor will raise an alert.

A health indicator is a system characteristic that the Health Monitor checks. The Health Monitor comes with a set of predefined thresholds for these health indicators. The Health Monitor checks the state of your system against these health indicator thresholds, when determining whether to issue an alert. Using the Health Center, commands, or APIs, you can customize the threshold settings of the health indicators, and define who should be notified and what script or task should be run if an alert is issued.

The Health Center provides the graphical interface to the Health Monitor. You use it to configure the Health Monitor, and to see the rolled up alert state of your instances and database objects.


The Health Center and Control Center are integrated through Health Beacons. Health Beacons in the Control Center provide notifications about new alerts in the Health Center. Beacons are implemented on all Control Center windows and notebooks; simply click a Health Beacon to access the Health Center.

The Health Monitor gathers information about the health of the system using interfaces that do not impose a performance penalty. It does not turn on any snapshot switches to collect information. The Health Monitor is enabled by default when an instance is created; you can deactivate it using the database manager configuration parameter health_mon.
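For example, health monitoring can be switched off and back on through the database manager configuration; a minimal sketch:

   db2 update dbm cfg using HEALTH_MON OFF
   db2 update dbm cfg using HEALTH_MON ON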

DB2 UDB also provides a Web Health Center that can be used to access the Health Monitor information from a Web browser or PDA.

You can also use DB2 commands and APIs to retrieve health information from the Health Monitor, allowing you to integrate DB2 Health Monitoring with existing system-wide monitoring functions. For example, to obtain the current configuration settings for an instance, database, table space, or container you can use the GET ALERT CFG command.

Example:

db2 get alert cfg for database on sample using db.spilled_sorts

Alert Configuration
     Indicator Name                 = db.spilled_sorts
     Type                           = Threshold-based
     Warning                        = 30
     Alarm                          = 50
     Sensitivity                    = 0
     Formula                        = (db.sort_overflows/db.total_sorts)*100;
     Actions                        = Disabled
     Threshold or State checking    = Enabled

SQL22004N  Cannot find the requested configuration for the given object.
Returning default configuration for "databases".

Note that, for this example, the database SAMPLE did not have specific settings; it was using the default settings for all databases.


Figure 1-10. Health Indicator Settings CF457.3

Notes:

Health indicators are used by the Health Monitor to evaluate specific aspects of database manager or database performance. A health indicator is a specific measurement that gauges the healthiness of some aspect of a particular class of database objects, for example table spaces. Health indicators measure either a finite set of distinct states or a continuous range of values to determine whether the state is healthy or unhealthy. The state defines whether or not the database object or resource is operating normally. If the change in the state is determined to be unhealthy, the Health Monitor issues an alert through the specified reporting channels.

An alert is generated in response to either a change to a non-normal state or to the value of the health indicator falling into a warning or alarm zone based on defined threshold boundaries. For health indicators measuring distinct states, an attention type alert is issued if a non-normal state is registered. For health indicators measuring a continuous range of values, threshold values define boundaries or zones for normal, warning, and alarm states. If, for example, the value enters the threshold range of values that defines an alarm zone, an alarm type alert is issued to indicate that the problem needs immediate attention. There are three types of alerts: alarm, warning, and attention.


Health Monitor information can be accessed through the Health Center, Web Health Center, the CLP, or APIs. Health indicator configuration is available through these same tools.

The following are the categories of health indicators:

• Table space storage
• Sorting
• Database management system
• Database
• Logging
• Application concurrency
• Package and catalog caches, and workspaces
• Memory


Command line example:

db2 get alert cfg for database on sample using db.spilled_sorts

Alert Configuration

     Indicator Name                 = db.spilled_sorts
     Type                           = Threshold-based
     Warning                        = 30
     Alarm                          = 50
     Sensitivity                    = 0
     Formula                        = (db.sort_overflows/db.total_sorts)*100;
     Actions                        = Disabled
     Threshold or State checking    = Enabled

SQL22004N  Cannot find the requested configuration for the given object.
Returning default configuration for "databases".

db2 get description for health indicator db.spilled_sorts

DESCRIPTION FOR db.spilled_sorts

Sorting is considered healthy if there is sufficient heap space in which to perform sorting and sorts do not overflow unnecessarily. Sorts that overflow to disk can cause significant performance degradation. If this occurs, an alert may be generated. The indicator is calculated using the formula: (db.sort_overflows /db.total_sorts)*100. The system monitor data element db.sort_overflows is the total number of sorts that ran out of sort heap and may have required disk space for temporary storage. The data element db.total_sorts is the total number of sorts that have been executed.

db2 update alert cfg for database on sample using db.spilled_sorts set alarm 60, warning 40

DB20000I  The UPDATE ALERT CONFIGURATION command completed successfully.

db2 get alert cfg for database on sample using db.spilled_sorts

Alert Configuration

     Indicator Name                 = db.spilled_sorts
     Type                           = Threshold-based
     Warning                        = 40
     Alarm                          = 60
     Sensitivity                    = 0
     Formula                        = (db.sort_overflows/db.total_sorts)*100;
     Actions                        = Disabled
     Threshold or State checking    = Enabled

Notice that there is now a specific setting for the SAMPLE database, so the SQL22004 message is no longer returned.


Figure 1-11. Health Center - Alerts View CF457.3

Notes:

When an alert is raised, two things can occur:

• Alert notifications can be sent by e-mail or to a pager address, allowing you to contact whomever is responsible for a system.

• Preconfigured actions can be taken. For example, a script or a task (implemented from the new Task Center) can be run.

Using the Health Monitor's drill-down capability, you can access details about current alerts and obtain a list of recommended actions that describe how to resolve the alert.

db2 get health snapshot for database on sample

Database Health Snapshot

Snapshot timestamp = 01-24-2004 19:55:09.783435

Database name                               = SAMPLE
Database path                               = C:\DB2\NODE0000\SQL00002\
Input database alias                        = SAMPLE
Operating system running at database server = NT
Location of the database                    = Local
Database highest severity alert state       = Alarm

Health Indicators:

     Indicator Name         = db.spilled_sorts
     Value                  = 100
     Evaluation timestamp   = 01-24-2004 19:50:32.078000
     Alert state            = Alarm

     Indicator Name         = db.sort_shrmem_util
     Value                  = 0
     Evaluation timestamp   = 01-24-2004 19:50:32.078000
     Alert state            = Normal

     Indicator Name         = db.db_op_status
     Value                  = 0
     Evaluation timestamp   = 01-24-2004 19:50:32.078000
     Alert state            = Normal

You can also click the recommendation advisor, which guides you through the problem solving depending on your requirements (if you would like to do a full investigation or to have an immediate solution to the problem).


Figure 1-12. Alarm Details CF457.3

Notes:

You can follow one of the recommended actions to address the alert. If the recommended action is to make a database or database manager configuration change, a new value will be recommended, and you can implement the recommendation by clicking a button. In other cases, the recommendation will be to investigate the problem further by launching a tool, such as the CLP or the Memory Visualizer.

db2 get recommendations for health indicator db.spilled_sorts

RECOMMENDATIONS FOR db.spilled_sorts

Increase sort heap

From the Control Center:
1. Expand the object tree until you find the Databases folder and expand the folder until you find the database that you want.
2. Right-click the database, and click Configure... from the pop-up menu. The Configure Database dialog opens.
3. On the "Performance" tab, update the "sort heap" parameter as recommended above and click OK to apply the update.

From the Command Line Processor, type as shown in the following example:
     CONNECT TO DATABASE name
     UPDATE DATABASE CONFIGURATION FOR DATABASE name USING "SORTHEAP" size
     CONNECT RESET
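As a concrete, purely illustrative version of that sequence for the SAMPLE database, raising the sort heap to 512 pages:

   db2 connect to sample
   db2 update db cfg for sample using SORTHEAP 512
   db2 connect reset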

Tune workload

You can run the Design Advisor to tune the database performance for your workload by adding indexes and materialized query tables. This can help reduce the need for sorting. You will need to provide your query workload and database name. The advisor will evaluate the existing indexes and materialized query tables in terms of the workload and recommend any new objects required.

From the Control Center:
1. Expand the object tree until you find the Databases folder and expand the folder until you find the database that you want.
2. Right-click the database, and click Design Advisor... from the pop-up menu. The Design Advisor opens.
3. Follow the steps of the wizard to optimize query performance and apply the recommendations of the wizard.

This function is also available from the command line by typing:

db2advis
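A typical invocation might look like the following, assuming the workload statements have been collected in a file (the file name workload.sql is hypothetical):

   db2advis -d sample -i workload.sql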


Checkpoint

Exercise — Unit Checkpoint

1. What is the prerequisite to be able to run automatic RUNSTATS?

__________________________________________________

2. Which content in the Control Center is dependent on the selected object or folder?

a. Object Tree pane

b. Contents pane

c. Object Detail pane

__________________________________________________

3. The Health Center can be configured and started only through the GUI.

a. True

b. False

__________________________________________________


Figure 1-13. Unit Summary CF457.3

Notes:


Unit Summary

Having completed this unit, you should be able to:

Identify and use the automatic capabilities of DB2 UDB:
  - Backup/Restore/Log File Management
  - Memory Configuration
  - Runstats
  - Reorg

Use and configure the Health Center

Use the Activity Monitor


Unit 2. Remote Administration

What This Unit Is About

This unit provides information on remote administration using several methods: using the DB2 UDB Administration Server (DAS), using Discovery, and working at the instance level with the ATTACH and DETACH commands.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Administer a remote database using:

- DB2 UDB Administration Server (DAS)
- Discovery
- The instance-level commands ATTACH and DETACH

• Identify when an instance attachment is required

• Attach to or detach from an instance

• Complete the connectivity check list

How You Will Check Your Progress

Accountability:

• Machine lab

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Command Reference


Figure 2-1. Unit Objectives CF457.3

Notes:


Unit Objectives

After completing this unit, you should be able to:

Administer a remote database using:

DB2 Administration Server (DAS)

Configuration Assistant

Instance-level commands

Identify when an instance attachment is required

Attach to or detach from an instance

Complete the connectivity check list


2.1 DB2 UDB Administration Server (DAS)


Figure 2-2. DAS Administration Server and Tools Catalog CF457.3

Notes:

The DB2 Administration Server (DAS) is a control point used only to assist with tasks on DB2 servers. You must have DAS running if you want to use available tools like Configuration Assistant, Control Center, or the Development Center. DAS is used for:

• Enabling remote administration of DB2 UDB servers

• Providing the facility for job management, including the ability to schedule the running of both DB2 UDB and operating system command scripts.

• Defining the scheduling of jobs, viewing the results, and performing other administrative tasks against jobs using the Task Center.

• Providing discovery of information about the configuration of DB2 instances, databases, and other DB2 administration servers, in conjunction with the DB2 Discovery utility. This information is used by the Configuration Assistant and the Control Center to simplify and automate the configuration of client connections to DB2 UDB databases.

You can only have one DAS on a machine; this includes USS (UNIX System Services) on OS/390 or z/OS, where the DAS is packaged and delivered as part of the DB2 Management Clients feature.

The visual shows the DB2 tools (Control Center, Command Editor, Development Center, Data Warehouse Center, and Command Line Processor) communicating over TCP/IP with the DB2 UDB Administration Server, which uses its scheduler and the tools catalog database to administer DB2 UDB instances on Windows and UNIX systems.

The DAS on Windows and UNIX includes a scheduler to run tasks defined using the Task Center.

Task information, such as commands, schedule, notification, completion actions, and run results associated with the task, is stored in a DB2 database called the Tools Catalog database. The Tools Catalog contains metadata required for the DB2 tools and the scheduler. You can create the Tools Catalog database as part of the setup, from the Control Center, or through the CLP using the CREATE TOOLS CATALOG command.

Also, DAS has FFDC (first failure data capture) which includes an administration notification log (db2dasdiag.log), dump files, and trap files. They are located in the db2path\DB2DAS00\dump directory on Windows systems, and in $DASHOME/das/dump on UNIX-based systems. We will talk about the FFDC in the Problem Determination unit in more detail.


Figure 2-3. Using the Administration Server CF457.3

Notes:

When you install and configure the DB2 UDB database product, the Administration Server is automatically created. You will be asked for the userid and the password. If the userid does not already exist under Windows, the system will create it for you.

You can also manually create an Administration Server, and start, stop, list, and remove an Administration Server instance.

To create a DAS, you must have root authority on UNIX platforms, or be using an account that has the correct authorization to create a service. After you create the DAS, you can establish or modify its ownership by providing a user account name and user password to the db2admin setid command.

To create an Administration Server, the syntax is as follows:

(Intel)                   db2admin create user:username password:password
(UNIX)                    instance subdirectory/dascrt -u DASUser (you must have root authority)
(AIX)                     /usr/opt/db2_08_02/instance/dascrt -u DASUser
(HP-UX, Solaris, Linux)   /opt/IBM/db2/V8.2/instance/dascrt -u DASUser


To manually start or stop the DAS, on Windows you must first log on to the machine using an account or user ID that belongs to the Administrators, Server Operators, or Power Users group. On UNIX, the account or user ID must be made part of the dasadm_group. The dasadm_group is specified in the DAS configuration parameters.

To start or stop the DAS on Windows, use the following:

db2admin start
db2admin stop

When working on UNIX operating systems, you must do the following:

• Run the startup script using one of the following:

- . DASHOME/das/dasprofile (for Bourne or Korn shell)

- source DASHOME/das/dascshrc (for C shell)

- db2admin start

- db2admin stop

• For both cases under UNIX, the person using these commands must be logged on with the authorization ID of the DAS owner. The user needs to belong to the dasadm_group to issue a db2admin start or db2admin stop command.

The Administration Server is automatically started after each system reboot. The default startup behavior of the DAS can be altered using the dasauto command.

To obtain the name of the Administration Server on your system, execute the following command at a command prompt.

db2admin

To see the values for the DB2 administration server configuration parameters, enter:

db2 get admin cfg

This will show you the current values of the DAS configuration parameters.

To update individual entries in the DAS configuration file, or to reset it, enter:

db2 update admin cfg using ...

To reset the configuration parameters to the recommended defaults, enter:

db2 reset admin cfg

Some of the DAS configuration parameters can be changed online, while others would require the DAS to be stopped (db2admin stop) and then started (db2admin start) to become effective. For details about all configured parameters, please refer to the DB2 UDB Administration Guide: Performance.

If you want to remove the Administration Server, first you must stop it. Next, if you want to preserve scheduled jobs or journaled results, you must back up the files in the sqllib subdirectory under the home directory of the DAS (UNIX), or under the DB2DAS00 subdirectory under the sqllib subdirectory (Intel). The instance directory is indicated by the DB2INSTPROF registry variable. Log out as the Administration Server owner. On UNIX servers, you must then log in as root; under Windows NT/2000 you must have SYSADM, SYSCTRL, or SYSMAINT authority. Remove the Administration Server instance using the command as follows:

(Intel)  db2admin drop
(UNIX)   dasdrop

On UNIX, the drop command removes the sqllib directory under the home directory of the Administration Server.

Tools catalog database and DAS scheduler:

The tools catalog database contains task information created by the Task Center and Control Center. These tasks are run by the DB2 administration server’s scheduler. The scheduler is a specific piece of the DB2 administration server that acts as an agent to read the tools catalog database and runs the tasks at their respective times.

It is a prerequisite for creating the Tools Catalog database that the DB2 administration server is installed.

To create the DB2 tools catalog tables (in an existing or a new database), you must issue the CREATE TOOLS CATALOG command. The database must be local and the command is not valid on a DB2 client. You need SYSADM or SYSCTRL authority as well as the DASADM authority to be able to update the DAS server configuration. Examples of the command (details can be found in the Command Reference):

To create a tools catalog with the name cc on a new database with the name toolsdb, you issue:

db2 create tools catalog cc create new database toolsdb

To create a tools catalog using an existing database toolsdb (which must be deactivated and activated after creation), issue:

db2 create tools catalog cc use existing database toolsdb force

The DAS scheduler requires a Java virtual machine (JVM) to access the tools catalog information. The JVM information is specified using the jdk_path DB2 DAS configuration parameter.

The Control Center and Task Center access the tools catalog database directly from the client. The tools catalog database therefore needs to be cataloged at the client. The Control Center provides the means to automatically retrieve information about the tools catalog database and to create the necessary directory entries. The only communication protocol supported for this automatic cataloging is TCP/IP.

To update the DAS to work with the tools catalog database, use the following command:

db2 update admin cfg using toolscat_inst INSTANCENAME toolscat_db DBNAME toolscat_schema SCHEMANAME
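For example, assuming (purely for illustration) an instance db2inst1, the toolsdb database created above, and the schema cc, the command might be:

   db2 update admin cfg using toolscat_inst db2inst1 toolscat_db toolsdb toolscat_schema cc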


If the tools catalog database is remote to the DB2 Administration Server, you need to issue db2admin setschedid sched-user/sched-password to establish the logon account used by the scheduler to connect to the tools catalog database.

There are two DAS configuration parameters used to enable notifications by the scheduler or the Health Monitor:

• smtp_server is used to identify the Simple Mail Transfer Protocol (SMTP) server used by the scheduler to send e-mail and pager notifications as part of task execution completion actions as defined through the Task Center or by the Health Monitor to send alert notifications.

• contact_host specifies the location where the contact information by the scheduler and health monitor for notification is stored (for example, the DAS server’s TCP/IP hostname).

E-mail and pager notifications from the DB2 administration server can be local or remote. A contact list is required to ensure that notifications are sent to the correct hostname.


2.2 Discovery


Figure 2-4. Configuration Assistant CF457.3

Notes:

The Configuration Assistant is the GUI tool that is used to configure access to remote databases. It can be invoked from the DB2 UDB Desktop folder, or from the command line with the db2ca command.

Use the Configuration Assistant to configure your clients. It also can be used as a lightweight alternative to the Control Center when you do not want to install the complete set of GUI tools.

Clients must be configured so that they can work with the available objects:

• To access an instance or database on another server system, you must catalog that system in the node directory.

• To access a database, you must catalog the database information in the database directory.
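The CLP equivalents of these two catalog steps look roughly like the following (the node name, host name, port, and database names are placeholders; the commands are discussed in detail later in this unit):

   db2 catalog tcpip node mynode remote myhost.example.com server 50000
   db2 catalog database remotedb as remalias at node mynode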


From the Configuration Assistant, you can perform various tasks, including:

• Add new database objects

• Work with existing database objects

• Bind applications

• Set database manager configuration parameters

• Import and export configuration information

• Set DB2 registry parameters

• Test connections

• Change passwords

• Configure CLI parameters

The graphical interface makes these complex tasks easier, through:

• Wizards to help perform certain tasks

• Dynamic fields that are activated based on your input choices

• Hints that help you make configuration decisions

• DB2 Discovery, which can retrieve information about selected database objects


Figure 2-5. Using the Configuration Assistant CF457.3

Notes:

The Configuration Assistant displays a list of the databases to which your applications can connect. Each database is identified by its database alias. You can use the Add Database wizard to add databases or the Change Database wizard to alter the information that is associated with the databases in the list.

From the View menu you can select an advanced view, which uses a notebook to organize connection information by objects: Systems, Instance Nodes, Databases, Database Connection Services (DCS), and Data Sources. These notebook pages can be used to perform object-specific actions. In addition, you can customize your view.

From the Configure menu, you can choose to:

• Configure the DBM configuration
• Configure the DB2 registry
• Import profile
• Export profile
• Configure another instance


From the Selected menu, you will be able to:

• Add database using wizard
• Change database
• Remove database
• Bind applications
• Change password
• Test connection
• Set CLI setting

From the Edit menu you can, for example, edit the profile offline.


Figure 2-6. Using the Configuration Assistant CF457.3

Notes:

The graphic shows an example of using the Configuration Assistant (CA). To configure a client connection, click Selected and then Add Database Using Wizard from the popup menu, and the Add Database Wizard appears. You must first decide on the method that you want to use when setting up the connection. There are several options:

• Use a profile.

- A file containing all the necessary information for accessing a remote server. This profile can be created with the Configuration Assistant and also be imported directly without use of the wizard.

• Search the network. This uses DB2 UDB Discovery.

- Known System: use this if the system you want is not yet listed, but you know the communication protocol for the administration server on that system. Click Add System and specify the communication protocol and the server-specific information. After the server is found, you will be able to select the database you want to add, define an alias, and specify whether you want to connect directly or through a gateway (such as OS/390 or z/OS).


- Search the network; this option can be used to search the local network for DB2 servers. From the servers presented, you can choose the one with the database you want to add, and proceed as described above.

• Manually configure a connection to a database.

- Here you have to know all the information necessary to connect to the desired database. This includes the protocols supported by the remote server, the connection configuration information, and the name of the database.

Notice that the last two options require that you know the specific protocol information. In order to use the discovery facility to search the network for remote databases, you must do the following:

1. Ensure that the protocol stack(s) on the client and server are preinstalled and configured so that they are fully functional at discovery time.

2. On every server, the DB2 UDB Administration Server service must be installed and configured on the server workstation to support one or more protocols.

3. On every server, the Administration Server database manager configuration parameter of DISCOVER must be set to either KNOWN or SEARCH.

4. On every server, the instance database manager configuration parameter of DISCOVER must be set to either KNOWN or SEARCH.

5. On every server, the instance database manager configuration parameter of DISCOVER_INST must be enabled for each instance to be discovered.

6. On every server, the instance database manager configuration parameter of DISCOVER_COMM must be set for one or more protocols that are installed on that server.

7. On every server, the database configuration parameter DISCOVER_DB must be enabled for each database to be discovered.

8. The code that enables discovery on the client must be installed on the client workstation to support one or more protocols.

Note that the discovery facility assumes that all of the information returned by DB2 UDB Administration Server is valid.


Figure 2-7. Discovery Parameters CF457.3

Notes:

A DAS must reside on each physical partition. When a DAS is created on the partition, the DB2SYSTEM name is configured to the TCP/IP hostname, and the discover setting is defaulted to search.

In the Database Manager Configuration File, there are two parameters:

• discover, which is the Discovery Mode configuration parameter. From a client perspective, the following can occur:

- discover=SEARCH, the client can issue search discovery requests to find the DB2 server systems. If SEARCH is specified, both SEARCH and KNOWN are supported. SEARCH is the default.

- discover=KNOWN, only known discovery requests can be issued from the client.

- discover=DISABLE, discovery is disabled.

• discover_comm, which is the Discovery Communication Protocols configuration parameter.


- Default is None; any combination of TCP/IP is allowed. The protocols defined here must also be specified in the DB2COMM registry variable.

The database configuration file contains one parameter, discover_db, which is used to prevent information about a database from being returned to a client when Discovery is used:

• ENABLE is the default. By changing it to DISABLE, it is possible to hide a database with sensitive data from the discovery process.

The parameters for DB2 UDB Discovery can be set at the server at three different levels: Administration Server, instance, and database. You can override a parameter's setting by setting a lower level parameter. For example, you can set the DISCOVER variable at the Administration Server to search (DISCOVER=SEARCH) to allow remote clients access to a specific system. If there are multiple DB2 UDB instances at the Administration Server, you can prevent the access of remote clients to each DB2 UDB instance by disabling the DISCOVER_INST parameter in the dbm configuration file of the instance you want to protect.
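As a hedged sketch of that layering (the database name SAMPLE and the values shown are illustrative):

   db2 update admin cfg using DISCOVER SEARCH
   db2 update dbm cfg using DISCOVER_INST DISABLE
   db2 update db cfg for sample using DISCOVER_DB DISABLE

Here the Administration Server allows search discovery of the system, while the second and third commands hide one instance and one database, respectively.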


2.3 Instance-Level Commands


Figure 2-8. Instance Attachment versus DB Connection CF457.3

Notes:

Certain tasks can only be done at an instance level, such as creating a database, forcing off applications, monitoring a database, or updating the database manager configuration.

Certain tasks require a database connection, such as Data Manipulation Language (DML), Data Control Language (DCL), Data Definition Language (DDL), Load, Export, Import, Bind, and Precompile. If a database connection does not exist, an error will occur.

DB2 UDB provides administration of a remote instance and NOT remote administration of client nodes. Catalog/uncatalog/list operations on database, node, or dcs directories at a remote instance are not allowed. These commands refer to local directories only -- those associated with the current setting of DB2INSTANCE.

INSTANCE ATTACHMENT
  - Create/drop databases
  - Get/update/reset database manager and database configuration file
  - Database monitor
  - Backup/restore/roll forward database
  - Force application

DATABASE CONNECTION
  - DML, DDL, DCL
  - Precompile/bind applications
  - Load/export/import

Only catalog/uncatalog/list node/database/dcs directories associated with the DB2INSTANCE variable.

Figure 2-9. Explicit/Implicit ATTACH/CONNECT CF457.3

Notes:

The capability to complete tasks at the instance level is provided via the ATTACH function. The ATTACH function is enabled through explicit or implicit instance attachment capabilities. The instance that will be attached to by default is specified by the DB2INSTANCE variable.

To attach explicitly, a node name is used that acts as an alias for a database manager instance, and is expected to have a matching entry in the local node directory. The only exception is the local instance, which may be specified as an object of the attach, but which cannot appear as an instance alias in the local node directory.

If ATTACH is issued with no arguments, the current attachment status is returned. If ATTACH is issued with arguments, one of the following will happen:

1. If there is no current instance attachment, an attempt to establish one will be made.

2. Otherwise, if the application is already attached to an instance, then the current attachment will be detached and an attempt will be made to establish the new attachment.

INSTANCE ATTACHMENT
  Implicit: DB2INSTANCE=
  Explicit: db2 ATTACH TO nodename [USER ... USING ...]

DATABASE CONNECTION
  Implicit: DB2DBDFT=
  Explicit: db2 CONNECT TO db-alias [USER ... USING ...]

If an instance level function is specified without being preceded by an ATTACH command, the instance name contained in the DB2INSTANCE environment variable is used.

Note: The DB2INSTANCE environment variable on a client-only node does not point to an instance to which an attachment can be made. In this case, DB2INSTANCE acts as a pointer to the directories and configuration files -- not a pointer to an instance.

The capability to complete tasks at the database level is provided via the CONNECT function. The CONNECT function is enabled through explicit or implicit database connection capabilities. The default database connection is specified in the DB2DBDFT variable.

Invoking the DETACH function will remove the logical instance attachment and terminate the physical communication connection.
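A short usage sequence, with the node name and credentials purely hypothetical, might be:

   db2 attach to rmtinst user adm1 using secretpw
   db2 get dbm cfg
   db2 detach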


Figure 2-10. DB2 UDB Directory Structure - CONNECT CF457.3

Notes:

There are three different directories that are used for attachment or connects:

• System database directory

• Local database directory

• Node directory

It is important to know this structure and to understand how these directories work together, because, for example, the GUI tools may not be available to make relevant updates, or there may be a problem determination situation.

A system database directory resides in the file sqllib/sqldbdir/sqldbdir (UNIX) or sqllib\instnn\sqldbdir\sqldbdir (Intel) of the instance owner. It is used to access local databases and remote databases. Each system database directory entry contains the database name, the alias database name (used in the CONNECT statement), the database type (INDIRECT for local databases and REMOTE for remote databases), and a nodename if the database type is remote. The nodename is a pointer to an entry in the node directory which specifies how to locate the remote database.


If the type in the system database directory is indirect, a local database directory entry is used to access the local database. A local database directory resides in every subdirectory that contains a database. Each local database directory entry contains the database name, the database alias name, the entry type, and the name of the file system where the database files are stored.

If the type in the system database directory is remote, a node directory is used. It is recorded in the file sqllib/sqlnodir/sqlnodir on UNIX platforms or sqllib\instnn\sqlnodir\sqlnodir on Intel platforms, of the instance owner. The node directory contains entries for all nodes that the database client can access. The node directory is used to obtain communication information for network connections when the database being referenced is remote.

A database connection services directory (dcs) is only created if DB2 UDB Connect is installed on the system. It can be found in the file sqllib/sqlgwdir/sqlgwdir on UNIX platforms or sqllib\instnn\sqlgwdir\sqlgwdir on Intel platforms, of the instance owner. The dcs directory stores information used by the database manager to access remote databases on a remote DRDA computer.
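These directories can be inspected from the CLP; a minimal sketch (the path in the third command is a placeholder for the drive or directory where a database resides, and the last command applies only when DB2 Connect is installed):

   db2 list node directory
   db2 list database directory
   db2 list database directory on /database/path
   db2 list dcs directory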


Figure 2-11. Remote Administration - ATTACH CF457.3

Notes:

When you attempt to attach to an instance that is not your default instance (DB2INSTANCE), the node directory on the local client is used to determine how to communicate with the instance. Instances are cataloged as being on local or remote nodes in the node directory. The ATTACH command enables an application to specify the instance at which instance-level commands are to be executed. This instance may be the current instance, another instance on the same workstation, or an instance on a remote workstation. The nodename specified in the ATTACH command is the alias of the instance to which the user wants to attach. The nodename must have a matching value in the node directory on the local client workstation. The only exception to this is the instance name specified by the DB2INSTANCE environment variable which may be specified as the object of an attach but cannot be specified as a nodename in the node directory. If nodename is omitted from the ATTACH command, information about the current state of attachment is returned. If ATTACH has not been executed, instance-level commands are executed against the current instance, specified by the DB2INSTANCE environment variable.


Figure 2-12. ATTACH to Local Node CF457.3

Notes:

If a local node is cataloged, applications can attach to the instance name associated with the local node name. Applications can also attach to the instance name in DB2INSTANCE.

The ATTACH statement will cause the NODE directory to be searched. Information taken from the directory will be used to determine whether the instance is local or remote.

When a database is created, an entry is cataloged in the system database directory of the local DB2INSTANCE instance. This entry states that the database is INDIRECT (local). When a database is created, an entry is also cataloged in the local database directory, specifying on which file system directory the database files are located. Once the database is created, a local user can specify the system database alias name in a CONNECT statement and connect to the database. (In UNIX, the local userid must also set DB2INSTANCE and DB2PATH to values which indicate the instance where this database was created.)

The visual shows the following sequence, with DB2INSTANCE=inst1 and linst3 as the user-defined nodename:

   db2 CATALOG LOCAL NODE linst3 INSTANCE inst3
   db2 ATTACH TO linst3
   db2 CREATE DATABASE DB5 ...
   db2 DETACH
   db2 CONNECT TO DB5
   db2 CREATE TABLE T1 ...

Figure 2-13. ATTACH to Remote Node CF457.3

Notes:

If a remote node is cataloged, applications can attach to the instance name associated with the remote node name.

The ATTACH statement will cause the NODE directory to be searched. Information taken from the directory will be used to determine whether the instance is local or remote.

The following commands may be used to catalog remote nodes:

• CATALOG TCPIP NODE
• CATALOG NETBIOS NODE
• CATALOG APPN NODE
• CATALOG LDAP NODE
• CATALOG LOCAL NODE

Once a remote node is cataloged (pointing to an instance), you may then catalog a system database directory entry (pointing to a database in the remote instance). You may then connect to the remote database.

The visual shows the following sequence, with DB2INSTANCE=inst1. In the CATALOG TCPIP NODE command, wkstn2 is the user-defined nodename recorded in the node directory, the hostname sys2 points to Workstation2's IP address, and the service name inst2 points to inst2's main port numbers. In the CATALOG DATABASE command, rem1 is Workstation2's system DB alias name and remdb1 is the alias used in the CONNECT.

   db2 CATALOG TCPIP NODE wkstn2 REMOTE sys2 SERVER inst2
   db2 CATALOG DATABASE rem1 AS remdb1 AT NODE wkstn2
   db2 CONNECT TO remdb1

   db2 ATTACH TO wkstn2
   db2 RESTORE DATABASE REM1 ...
   db2 DETACH

Figure 2-14. Connectivity Checklist (TCP/IP) CF457.3

Notes:

The above checklist helps to:

• Collect all relevant information for connectivity via TCP/IP

• Update all relevant files and configurations

• Identify and correct communication errors
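Much of the checklist below can be filled in or verified directly from the command line; a hedged sketch of commands that display the relevant entries:

   db2set DB2COMM                   (DB2 registry, server)
   db2 get dbm cfg                  (check the svcename entry, server)
   db2 list node directory          (node directory, client)
   db2 list database directory      (system database directory, client and server)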

Client:
   Connect to ..........
   Node directory:         Node, Protocol, Hostname, Service Name
   System DB directory:    DB Alias, DB Name, DB Type, Node, Authentication
   hosts file:             hostname, IP Address
   services file:          service_name, port numbers, protocol

Server:
   DB2 Registry:           DB2COMM=TCPIP
   DBM CFG File:           svcename ............
   System DB directory:    DB Alias, DB Name, DB Type, Node, Authentication
   hosts file:             hostname, IP Address
   services file:          service_name, port numbers, protocol

Checkpoint

Exercise — Unit Checkpoint

1. Remote administration may be used to do all of the following functions, EXCEPT:

a. Create/drop databases

b. Get/update/reset database manager and database configuration files

c. Catalog directories on client

d. Database monitoring

__________________________________________________

2. Write the command to catalog a local instance called inst1.

__________________________________________________

3. Write the command to make an attachment to inst1.

__________________________________________________


Figure 2-15. Unit Summary CF457.3

Notes:


Unit Summary

Having completed this unit, you should be able to:

Administer a remote database using:

DB2 Administration Server (DAS)

Configuration Assistant

Instance-level commands

Identify when an instance attachment is required

Attach to or detach from an instance

Complete the connectivity check list


Unit 3. The Governor and Audit

What This Unit Is About

This unit describes how to set up usage rules for a database that are enforced by the DB2 UDB governor. It shows how to stop and start the governor, how to control it through a configuration file, and how to interpret its log file. This unit also shows how Audit works, and how to set up, start, and stop this utility.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the DB2 Governor functionality
• Configure the Governor
• Start and stop the Governor
• Read the Governor Log File
• Describe the DB2 Audit functionality
• Configure the Audit
• Start and stop the Audit
• Read the Audit Log File
• Describe the impact of these utilities

How You Will Check Your Progress

Accountability:

• Checkpoint questions
• Machine exercise

References

IBM DB2 Universal Database Administration Guide: Performance

IBM DB2 Universal Database Command Reference

IBM DB2 Universal Database Administration Guide: Implementation


Figure 3-1. Unit Objectives CF457.3

Notes:


Unit Objectives

After completing this unit, you should be able to:

Describe the DB2 Governor functionality

Configure the Governor

Start and stop the Governor

Read the Governor Log File

Describe the DB2 Audit functionality

Configure the Audit

Start and stop the Audit

Read the Audit Log File

Describe the impact of these utilities


3.1 Establishing Rules with the Governor


Figure 3-2. DB2 UDB Governor CF457.3

Notes:

The DB2 UDB Governor is used to monitor user and application activity and, if required, take appropriate actions.

It collects statistics on a regular basis defined by a configuration file.

The governor then evaluates the statistics against a set of rules that are also defined in the configuration file. Based on these rules, the governor may change an application's priority, or force an application or user off the database. The governor also records the actions it has taken in a log file.

In an environment with many database applications, long-running queries are quite common. However, you may want to limit their use to certain portions of the day, or reduce their priority so that they do not negatively impact any other work on the system.

© Copyright IBM Corporation 2004

DB2 UDB Governor

Used to monitor user activity

If a user exceeds resource limits, the Governor can:

Force the user's application

Reduce the priority of the user's application

Resource limits may be set on:

CPU time, rows read, rows selected, number of locks held,connect time, elapsed time

Resource limits can be set to apply:

During certain time periods, to certain users, or on certainapplications

All the actions taken by the Governor are logged


Figure 3-3. Starting the Governor CF457.3

Notes:

Before you start the governor, a configuration file must exist.

To start the governor, issue the command db2gov start, and specify the database, the configuration file name, and the log file name. By default, a governor daemon is started on every partition in a partitioned database, but you can specify the database partition on which to start the governor.

Stopping the governor is similar, but without the need to specify the configuration file name or log file name.

SYSADM or SYSCTRL authority is required to use the governor.

In UNIX, if one of the actions requires changing the priority of a process, the root userid must be defined as a trusted user on all machines in the instance. The root userid also needs to be in the user list associated with the SYSADM group.

The parameters are as follows:

database The name of the database for which the governor is being started or stopped.


Starting the Governor

db2gov start database [dbpartitionnum db-partition-number] config-file log-file

db2gov stop database [dbpartitionnum db-partition-number]


dbpartitionnum db-partition-number Specifies the database partition on which to start or stop the governor daemon. The number specified must be the same as that specified in the database partition configuration file (db2nodes.cfg).

config-file Specifies where the rules for monitoring the database can be found. The default location of the configuration file is the sqllib directory. If the file is not in the default location, you must include the path as well as the file name.

log-file Specifies the base name of the log file to which the governor will write. The log file is stored in the sqllib/log directory. The number of the database partition on which the governor is running is automatically appended to the log file name (for example: mylog.0, mylog.1, mylog.2).
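For example, a hypothetical invocation for a database named SAMPLE, using a configuration file mygov.cfg in the sqllib directory and log files named mygovlog.x, would be:

   db2gov start sample mygov.cfg mygovlog

and the corresponding command to stop it:

   db2gov stop sample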


Figure 3-4. What Does the Governor Do? CF457.3

Notes:

The term daemon is often used to describe the behavior of the governor. A daemon is a process that intermittently awakens and performs a chore. The governor is like a daemon because it periodically awakens, checks priorities for the database that you have established, and realigns applications accordingly.

The governor daemon is started by the db2gov front-end utility (or simply wakes up, if it is already running) and runs in a loop.

The first task it does is to check whether its configuration file has changed or has not yet been read. If either condition is true, the daemon reads the rules in the file. This allows you to change the behavior of the governor daemon while it is running.

After this, the governor daemon issues a snapshot request to obtain statistics for each application and agent working on the database.

The governor then checks the statistics for each application against the rules in the governor configuration file. When the governor finishes checking all applications, it sleeps


Runs in a loop after waking up

Configuration file is checked

Takes snapshots for each

Application

Agent

Checks the statistics

Changes or forces application (if needed)

What Does the Governor Do?


for the interval specified in the configuration file. Once this time has elapsed, the governor wakes up and begins the execution loop again.

If a rule applies to an application, the governor can:

• Force the application

• Change the application's priority, which indirectly changes the agent priorities working for it on the database partition (if an error situation occurs, the governor resets the priority back).

• Change the schedule for the application, which indirectly changes the agent priorities working on the application.

The governor writes a record of any action it takes to its log file. If an error occurs, a record is written to the db2diag.log file.


Figure 3-5. Governor Configuration File Format CF457.3

Notes:

When you start the governor, you specify the name of the configuration file that contains the rules to be used to govern applications running against the database. If your rule requirements change, you edit the configuration file without stopping the governor. Each daemon detects that the file has changed, and re-reads it.

In a partitioned database, the configuration file must reside in a directory that is mounted across all the database partitions, because the governor daemon on each partition must be able to read the same configuration file.

The file consists of rules and comments. Most entries can be specified in uppercase, lowercase, or mixed case characters. The exception is applname, which is case-sensitive.

You delimit comments within the {} braces.

The rules include:

• The database to which the rules apply.

• How often the rules should be applied (how often a snapshot should be done).


Governor Configuration File Format

Must be in a directory mounted across all database partitions

Config File Structure

{Comments}

interval x; dbname SAMPLE; account y;

desc "rule description"

"clauses" "checks" "actions";

"clauses" "checks" "actions";

. . .


• The rules that specify how to govern the applications. These rules are made of smaller components: rule clauses.

Each rule in the file must be followed by a semicolon (;).

Each of the following rules is only specified once in the file.

interval Specifies when the daemon wakes up and applies the governor's rules. Interval is specified in seconds. If no interval is specified, an interval of 120 seconds is used.

dbname The name or alias of the database to be monitored.

account Specifies when records are written containing CPU usage statistics for each connection. Account is specified in minutes.

Note: Account is not available on Windows platforms.
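As a minimal sketch (the values shown are hypothetical, not recommendations), the top of a configuration file that uses all three of these rules might read:

   { check applications every 60 seconds, write accounting records every 30 minutes }
   interval 60; dbname sample; account 30;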


Figure 3-6. Governor Clauses CF457.3

Notes:

The following clauses are combined to form a rule. The rule, not the clause, is followed by a semicolon. Clauses can be specified only once in a rule, but they can be specified in more than one rule. Clauses must be specified in the order shown.

• desc Specifies a text description of the rule. The description must be enclosed by either single or double quotation marks. This clause is optional.

• time Specifies the time period during which the rule is to be evaluated. The time period must be specified in the form hh:mm hh:mm. If this clause does not appear in a rule, the rule is valid 24 hours a day (00:00 23:59).

• authid Specifies one or more authorization IDs under which the application is executing. Multiple IDs must be separated by a comma (,). Example: authid gary, melanie, patti, jiri. If this clause does not appear in a rule, the rule applies to all authorization IDs.


Governor Clauses

desc <text description>

time <time period>

authid <authorization ID>

applname <application name>

case-sensitive

Example: desc "limit use of CLP by user novice"

time 8:30 17:30

authid novice

applname db2bp


• applname Specifies the name of the executable (or object file) that makes the connection to the database.

Multiple application names must be separated by a comma (,). Example: applname db2bp, batch, geneprog. If this clause does not appear in a rule, the rule applies to all application names.

Note: Application names are case-sensitive.
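Putting these clauses together, a sketch of one complete rule might look like the following (the values are hypothetical; the setlimit and action clauses are described on the next pages):

   desc "limit use of CLP by user novice during business hours"
   time 8:30 17:30 authid novice applname db2bp
   setlimit rowssel 100000
   action force;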


Figure 3-7. Elements Checked CF457.3

Notes:

The governor can check the following elements (introduced with setlimit):

1. cpu nnn Specifies the number of CPU seconds that can be consumed by an application. If you specify -1, the governor does not limit the application's CPU usage.

Note: This option is not available in Windows environments.

2. rowsread nnn Specifies the number of rows an application can read to develop the qualifying rows. If you specify -1, there is no limit on the number of rows the application can read.

Note: This limit is not the same as rowssel. The difference is that rowsread is the count of the total number of rows that have to be read in order to return the result set. The number of rows read includes reads of the catalog tables by the engine and may be diminished when indexes are used.

3. rowssel nnn Specifies the number of rows that are returned to the application. In a partitioned


Elements Checked

CPU time (cpu)

Read rows (rowsread)

Selected rows (rowssel)

Locks

Connect time (idle)

Elapsed time (uowtime)

Example: setlimit cpu 5 locks 1000 rowssel 500


database environment, this value will only be non-zero at the coordinator partition. If you specify -1, the governor does not limit the number of rows that can be selected.

4. locks nnn Specifies the number of locks that an application can hold. If you specify -1, the governor does not limit the number of locks held by the application.

5. idle nnn Specifies the number of idle seconds allowed for a connection before an action is taken. If you specify -1, the connection's idle time is not limited.

6. uowtime nnn Specifies the number of seconds that can elapse from the time that a unit of work (UOW) first becomes active. If you specify -1, the elapsed time is not limited.
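For instance, a hypothetical rule fragment that forces connections idle for more than 30 minutes, or units of work running longer than an hour, could be written as:

   setlimit idle 1800 uowtime 3600 action force;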


Figure 3-8. Actions Taken CF457.3

Notes:

The action clause specifies the action to take if one or more of the specified limits is exceeded. You can specify the following actions:

• nice nnn

Specifies to change the priority of agents working for the application. Valid values are from -20 to +20. For this parameter to be effective:

- On UNIX-based platforms, the agentpri database manager parameter must be set to the default value; otherwise, it overrides the priority clause.

- On Windows platforms, the agentpri database manager parameter and priority action may be used together.

-1 specifies no special priority.

For this parameter to be effective:

- The executables for the governor front-end utility and the governor daemon must have an operating system level of authority to be able to set agent priorities.


Actions Taken

Priority change

Force application

Schedule

Example: action priority 3

setlimit CPU 3600

action schedule class


• force

Specifies to force the agent that is servicing the application (issues a FORCE APPLICATION to terminate the coordinator agent).

• schedule [class]

Scheduling improves the priorities of the agents working on the applications with the goal of minimizing the average response times while maintaining fairness across all applications.

The governor enforces its schedule by setting priorities for the agents working on the applications, using query cost estimates from the DB2 UDB internal query compiler. If the class option is specified, all applications chosen by the rule are scheduled among themselves only. If this option is not specified, the governor uses one or more classes, with scheduling done within each class.

Within each class, how an application is prioritized is based on:

The number of locks held by the application within the class. (An application holding up many other applications due to locking is given a high priority.)

The application's age. (An application in the system for a long time is given a high priority.)

The application's estimated remaining running time. (An application close to finishing is given a high priority.)

Applications that are not covered by any schedule run with the highest authority.

Note: If you used the sqlmon (Database System Monitor Switch) API to deactivate the statement switch, this will affect the ability of the governor to govern applications based on the statement elapsed time.

Note: If a limit is exceeded and the action clause is not specified, the governor reduces the priority of agents working for the application.

If more than one rule applies to an application, the rule that is closest to the end of the configuration file is applied to the application. An exception occurs if -1 is specified for a clause in a rule. In this situation, the value specified for the clause in the subsequent rule can only override the value previously specified for the same clause: other clauses in the previous rule are still operative. For example, one rule indicates that the priority of an application is to be decreased if its elapsed time is greater than 1 hour, or if it selects more than 100,000 rows (that is, rowssel 100000 uowtime 3600). A subsequent rule indicates that the same application can have unlimited elapsed time (that is, uowtime -1). In this situation, if the application runs for more than 1 hour, its priority won't be changed (that is, uowtime -1 overrides uowtime 3600), but if it selects more than 100,000 rows, its priority will be lowered (as rowssel 100000 is still valid).
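Written as configuration file rules, the example just described might look like the following sketch (the application name is hypothetical):

   { lower the priority of long or large queries }
   applname bigquery
   setlimit rowssel 100000 uowtime 3600;

   { allow this application unlimited elapsed time; rowssel 100000 from the previous rule still applies }
   applname bigquery
   setlimit uowtime -1;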


Figure 3-9. Example Governor Configuration File CF457.3

Notes:

The configuration file contains the rules to govern applications executed against the database. It can be changed without stopping the governor; each governor daemon will detect if it has changed and re-read it. In a partitioned database, the configuration file must be in a directory that is mounted across all of the database partitions.

Comments are specified within curly braces ({}). Rules that must be specified include the name of the database, how often the governor should check the applications, and the rules on how to govern the applications. Each rule must end with a semicolon (;).

The rules that govern the application may include a description of the rule, the time period during which the rule is to be evaluated, a specific authid to check for, a specific applname to check for, the limits to apply to this particular time period or authid or application, and the action that should be taken should one or more of the limits be exceeded.

The governor has two intervals:

1. Snapshot interval - specified in seconds in the interval rule.
2. Accounting interval - specified in minutes in the account rule.


Example Governor Configuration File

{ Wake up every minute, the database is dss, account every 10 minutes }
interval 60; dbname dss; account 10;

desc "CPU restrictions apply 24 hours a day to everyone"
setlimit cpu 6000 rowssel 1000000;

{ Allow no UOW to run for more than an hour }
setlimit uowtime 3600 action force;

desc "Slow down a subset of applications"
applname jointA, jointB, jointC, quryA
setlimit cpu 3 locks 1000 rowssel 500;

{ Slow down the use of db2 CLP by the novice user }
authid novice
applname db2bp.exe
setlimit cpu 5 locks 100 rowssel 250;

{ During day hours do not let anyone run for more than 10 seconds }
time 8:30 17:00 setlimit cpu 10 action force;

{ Some people should not be limited - database administrator and a few others.
  As this is the last specification in the file, it will override what came before. }
authid melanie, tony, calene, patti setlimit cpu -1 locks -1 rowssel -1;

{ Increase the priority of an important application so it always completes quickly }
applname pgm1 setlimit cpu 1 locks 1 rowssel 1 action priority +2;


A snapshot is taken at every snapshot interval, and accounting records containing statistics for each connection are written when an application termination is detected or at each accounting interval. The accounting interval cannot be shorter than the snapshot interval. For a short connect session that occurs entirely within the snapshot interval, no log record is written.

The limits can indicate a maximum for CPU seconds, locks, rows read, rows selected, connect time, or time exhausted by the UOW. The actions can include forcing the application, or changing its priority. Changing its priority is the default action if none is specified.
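For example, a hypothetical rule that omits the action clause relies on this default behavior:

   { no action specified: the governor reduces the priority of the application's agents }
   authid batchuser
   setlimit rowsread 500000;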

If the NUM_POOLAGENTS database manager configuration parameter is not 0, there is a chance that the application will reuse a database agent whose priority has been changed. To avoid this effect, the first line of the configuration file should read:

setlimit cpu 1 locks 1 rowsread 1 action priority 0;

A sample governor configuration file follows:

{********************************************************************}

{*******SAMPLE Governor-Configuration for CF45***********************}

{********************************************************************}

interval 2;

dbname sample;

account 10;

{**************Clause # 1********************************************}

desc"**Description of all possible clauses(more detail selection) ****"

{********************************************************************}

{

time 08:00 18:00 ( if not specified, 24 hours a day )

authid inst00,inst01,inst02,user01,user02

applname db2bp

}

{*** Active clauses *************************************************}

authid inst00,root

{********************************************************************}

{****Description of all available limits ****************************}

{

setlimit

cpu 10 ( not on Windows )

locks 5

uowtime 10

idle 60

rowsread 10000

rowssel 100;

}

{*** Active Limits :************************************************}


setlimit idle 60

{***Description of all available actions ***************************}

{

action

(no value --> reduce the priority of agents working for the application )

priority nnn (AIX: 41 - 125)

priority nnn (UNIX: 41 - 128)

priority n (NT/2000: 1 - 6)

schedule class

force

}

{***Active Actions**************************************************}

action force

{***** END of clause # 1 *******************************************}

;

{***** Active clause # 2 plus Limits and Action ********************}

authid user01

setlimit cpu 60

action force

{***** END of clause # 2 *******************************************}

;


Figure 3-10. The Governor Log File Structure CF457.3

Notes:

A separate log file exists for each daemon. The log files are stored in the /sqllib/instname/log directory. On Windows, the log subdirectory is under the instance directory.

The records in the log file are as follows:

Date yyyy-mm-dd

Time hh.mm.ss

NodeNum The number of the database partition on which the governor is running.

RecType Specifies the type of log record written. The possible types are:

• ACCOUNT: Indicates the application's account statistics. The fields in the Message field for an ACCOUNT record type include: authid, appl_id, appl_con_time, written_usr_cpu, and written_sys_cpu.

• FORCE: An application was forced. You will see the appl_name, auth_id, appl_id, coord_partition (if applicable), cfg_line (line number in the


The Governor Log File Structure

Date   Time   NodeNum   RecType   Message

RecType values: ACCOUNT, FORCE, ERROR, NICE, READCFG, WARNING, START, STOP, SCHEDULE


configuration file where the rule causing the force is located), and restriction_exceeded.

• NICE: Indicates that the priority of an application was changed.

• SCHEDGRP: Indicates that a change in agent priorities occurred.

• ERROR: Indicates an error.


• READCFG: The governor read the configuration file.

• WARNING: Indicates a warning.

• START: The governor has started.

• STOP: The governor has stopped.


Message Provides additional information.


Figure 3-11. Querying Governor Logs CF457.3

Notes:

When the governor daemon performs actions such as forcing an application, reading the governor configuration file, or changing the priority of an application, it can encounter a warning or an error. Also, sometimes it is started or stopped while performing these actions. In these cases, it writes a record to the log file.

You can query the log file using db2govlg. The log-file parameter is the name of the log file (without its directory path) that was specified on the db2gov start command.

If you are only interested in certain log records, you can specify the record type. The valid record types are: START, ACCOUNT, FORCE, NICE, ERROR, WARNING, READCFG, and STOP.
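For example, assuming the governor for database SAMPLE was started with a log file base name of mygovlog, the following hypothetical command lists only the FORCE records:

   db2govlg sample mygovlog rectype FORCE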

Attention: It is necessary to restart the governor if an ERROR is reported in the governor log file!


Querying Governor Logs

db2govlg database log-file [dbpartitionnum db-partition-number] [rectype record-type]

2004-02-26-11.29.47  9 START   Database = DSS
2004-02-26-11.29.47  9 READCFG Config = /db2home/peuser14/sqllib/myrules
2004-02-26-11.29.47 11 START   Database = DSS
2004-02-26-11.29.47 11 READCFG Config = /db2home/peuser14/sqllib/myrules
2004-02-26-11.29.47 13 START   Database = DSS
2004-02-26-11.29.47 13 READCFG Config = /db2home/peuser14/sqllib/myrules
2004-02-26-11.29.50 15 START   Database = DSS
2004-02-26-11.29.50 15 READCFG Config = /db2home/peuser14/sqllib/myrules

2004-02-26-14.40.57 111 ACCOUNT WINCH-N "LOCAL.DB2.040226143917 db2bp 2004-02-26-14.39.17 56
2004-02-26-14.40.57 111 ACCOUNT WINCH-N "LOCAL.DB2.040226143917

2004-02-26-11.30.28 9 FORCE applname 'db2bp' authid 'USER14' applid '*LOCAL.DB2.040226153017' coord9 (line 5) Rows 379

......


Figure 3-12. Table to Hold Accounting Records (UNIX) CF457.3

Notes:

When ACCOUNT log records are written, they contain CPU statistics which reflect CPU usage since the previous log record.

To load the accounting records from the governor into the table, use db2govlg to pipe the data into LOAD. The above steps assume that gov.log is the name of the log file.

A sample output for an account record in the governor log file:

SNAPSHOT_TIME         NODE_NUM            USER_AUTHID
2004-03-13-11.13.02   0         ACCOUNT   ROOT

APPL_ID (IP name in hex, timestamp)
09151418.2A04.010313151254

APPL_NAME    CONNECT_TIME          USER_CPU   SYS_CPU
db2bp.exe    2004-03-13-11.12.54   0          0


Table to Hold Accounting Records (UNIX)

(1) db2 CREATE TABLE ACCOUNT_DATA ( SNAPSHOT_TIME TIMESTAMP NOT NULL, NODE_NUM SMALLINT NOT NULL, USER_AUTHID CHAR(20) NOT NULL, APPL_ID CHAR(32) NOT NULL, APPL_NAME CHAR(32) NOT NULL, CONNECT_TIME TIMESTAMP NOT NULL, USER_CPU INT NOT NULL, SYS_CPU INT NOT NULL ) IN TS_NAME;

(2) mknod acctpipe p (create pipe)

(3) db2 -f govacct.load & (load starts reading the pipe)

    LOAD FROM acctpipe OF ASC MODIFIED BY noheader METHOD L (1 19, 21 24, 37 56, 58 59, 91 110, 112 130, 132 141, 143 152) INSERT INTO ACCOUNT_DATA

(4) db2govlg sample gov.log rectype ACCOUNT > acctpipe (db2govlg starts writing to the pipe)

(5) rm acctpipe (remove pipe after completion)


Figure 3-13. Why Do We Need Auditing? CF457.3

Notes:

The DB2 UDB audit facility allows the DBA to implement the following security mechanisms. It can:

• Allow the review of patterns of access to DB2 UDB objects, access histories of specific processes and individuals, and the use of the protection mechanisms supported by DB2 UDB and their effectiveness.

• Allow discovery of repeated attempts to bypass DB2 UDB's protection mechanisms.

• Allow discovery of any use of privileges that may occur when a user attempts to perform a function with privileges greater than his or her own (for example, a database user trying to execute system administrator commands).

• Act as a deterrent against repeated attempts to bypass the system protection mechanisms.

• Supply an additional form of assurance to database administrators which allows recording and discovery of attempts to bypass the protection mechanisms.


To allow DBAs to implement security mechanisms such as:

Capturing access patterns

Verifying DB2 protection mechanisms

Discovering attempts to bypass DB2 security mechanisms

Discovering attempts to perform non-privileged actions

Acting as deterrent against users trying to bypass security mechanisms

To help DBAs improve system security

Why Do We Need Auditing?


• The Audit facility generates and maintains an audit trail for a series of predefined audit events. The audit trail, also referred to as the audit log, contains records for each occurrence of any auditable event. These audit records can be used to detect and deter penetration of a computer system and to reveal usage patterns that identify system misuse.


3.2 Audit Facility


Figure 3-14. How Does the DB2 Audit Facility Work? CF457.3

Notes:

The DB2 UDB audit facility generates, and allows you to maintain, an audit trail for a series of predefined database events. The records generated from this facility are kept in an audit log file. The analysis of these records can reveal usage patterns which would identify system misuse. Once identified, actions can be taken to reduce or eliminate such system misuse.

The audit facility acts at an instance level, recording all instance level activities and database level activities.

Note for partitioned databases: When working in a partitioned database environment, many of the auditable events occur at the partition at which the user is connected (the coordinator partition) or at the catalog partition (if they are not the same partition). The implication of this is that audit records can be generated by more than one partition. Part of each audit record contains information on the coordinator partition and originating partition identifiers.

The audit log (db2audit.log) and the audit configuration file (db2audit.cfg) are located in the instance's security subdirectory. At the time you create an instance,


Predefined events generate records in an audit log file

Acts at instance level

On more than one partition

SYSADM authority required

Independent of DB2 server

Impact on the database performance

How Does the DB2 Audit Facility Work?


read/write permissions are set on these files, where possible, by the operating system. By default, the permissions are read/write for the instance owner only. It is recommended that you do not change these permissions.

Note: Users of the audit facility administrator tool, db2audit, must have SYSADM authority/privileges.

The audit facility must be stopped and started explicitly. When starting, the audit facility uses existing audit configuration information. Since the audit facility is independent of the DB2 UDB server, it will remain active even if the instance is stopped. In fact, when the instance is stopped, an audit record may be generated in the audit log.

The timing of the writing of audit records to the audit log can have a significant impact on the performance of databases in the instance. The writing of the audit records can take place synchronously or asynchronously with the occurrence of the events causing the generation of those records.

The audit facility records auditable events including those affecting database instances. For this reason, the audit facility is an independent part of DB2 UDB that can operate even if the DB2 UDB instance is stopped. If the audit facility is active, then when a stopped instance is started, auditing of database events in the instance resumes.


Figure 3-15. DB2audit Performance Impact CF457.3

Notes:

• Synchronous or Asynchronous

The timing of the writing of audit records to the audit log can have a significant impact on the performance of databases in the instance. The writing of the audit records can take place synchronously or asynchronously with the occurrence of the events causing the generation of those records.

• AUDIT_BUF_SZ

The value of the AUDIT_BUF_SZ database manager configuration parameter determines when the writing of audit records is done. The range of values of this parameter is from 0 to 65,000 4K pages. The default value for AUDIT_BUF_SZ is 0.

If the value of this parameter is zero (0), the writing is done synchronously. The event generating the audit record will wait until the record is written to disk. The wait associated with each record causes the performance of DB2 UDB to degrade.

If the value of AUDIT_BUF_SZ is greater than zero, the record writing is done asynchronously.

DB2audit Performance Impact

[Diagram: an application's audit events are written to db2audit.log either synchronously (AUDIT_BUF_SZ = 0) or asynchronously through the audit buffer (AUDIT_BUF_SZ > 0); with errortype = audit, a failed audit write returns a negative SQLCODE to the application, while errortype = normal returns only the SQLCODE of the operation itself.]


The value of the AUDIT_BUF_SZ when it is greater than zero is the number of 4 KB pages used to create an internal buffer. The internal buffer is used to keep a number of audit records before writing a group of them out to disk.

The statement generating the audit record as a result of an audit event will not wait until the record is written to disk, and can continue its operation.

In the asynchronous case, it could be possible for audit records to remain in an unfilled buffer for some time. To prevent this from happening for an extended period, the database manager will force the writing of the audit records regularly. An authorized user of the audit facility may also flush the audit buffer with an explicit request.

- db2audit flush
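As an illustration only (the buffer size shown is arbitrary, not a recommendation), the buffer could be enabled through the database manager configuration; depending on the DB2 release, the instance may have to be recycled for the new value to take effect:

   db2 update dbm cfg using AUDIT_BUF_SZ 8
   db2stop
   db2start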

• ERRORTYPE = normal | audit

There are differences when an error occurs dependent on whether there is synchronous or asynchronous record writing. In asynchronous mode, there may be some records lost because the audit records are buffered before being written to disk. In synchronous mode, there may be one record lost because the error could only prevent at most one audit record from being written.

The setting of the ERRORTYPE audit facility parameter controls how errors are managed between DB2 UDB and the audit facility. When the audit facility is active, if the setting of the ERRORTYPE audit facility parameter is AUDIT, then the audit facility is treated in the same way as any other part of DB2 UDB. An audit record must be written (to disk in synchronous mode; or to the audit buffer in asynchronous mode) for an audit event associated with a statement to be considered successful. Whenever an error in the audit facility is encountered when running in this mode, a negative SQLCODE is returned to the application for the statement generating an audit record. If the error type is set to NORMAL, then any error from db2audit is ignored and the operation's SQLCODE is returned.


Figure 3-16. Audit Facility CF457.3

Notes:

You can specify the scope of the audit facility, defining which types of events are of interest, when configuring the audit facility. You can audit failures, successes, or both.

ALL: Represents all categories; see following details.

AUDIT: Events corresponding to changes in the state of auditing on the instance. For example, an event would get logged when audit was started, or stopped, or when the audit log file is pruned.

CHECKING: Events corresponding to places in the database engine where authority checking is performed. For example, select from table, catalog database.

OBJECT MAINTENANCE: Events corresponding to creation and deletion of database objects, for example database creation, dropping of tables.


Audit Facility

Audit Facility Scope:

AUDIT: Changes in the state of auditing

CHECKING: Authority checking

OBJMAINT: Creation and deletion of DB2 objects

SECMAINT: Overall security (GRANT, REVOKE, and so forth)

SYSADMIN: Action requiring SYSADM authority

VALIDATE: User validation, retrieving user information

CONTEXT: Operation context (SQL statement, and so forth)


SECURITY MAINTENANCE: Events corresponding to the overall security present in the database (from the engine point of view). For example, grant and revoke.

SYSADMIN EVENTS: Events corresponding to actions on the database that can be performed only by system administrators. For example, update database manager configuration, catalog database.

VALIDATE EVENTS: Events corresponding to either user validation, or for when the database goes to the operating system to retrieve user information. You'll see records such as these during connect.

CONTEXT EVENTS: This is the only category of audit events that is disabled by default. Context events can be used to associate a group of audit records back to an event such as a "create table". If you audit this set of events by themselves, you get a kind of "trace" as things occur at the server. Not a trace like a db2trc trace, but more a brief trace of records as they come into the server.

Events can be associated back to a context through a field called the correlation_id. This is an integer field that has the same value for all records in a single context.


Figure 3-17. db2audit Command - How It Works CF457.3

Notes:

Authorized users are users in the SYSADM group. Even if you are the instance owner, you must still be a member of the SYSADM group to be able to control the audit facility by using the db2audit command.

db2audit start:

• This parameter causes the audit facility to begin auditing events based on the contents of the db2audit.cfg file. In a partitioned DB2 instance, auditing will begin on all partitions when this clause is specified. If the "audit" category of events has been specified for auditing, then an audit record will be logged when the audit facility is started.

db2audit stop:

• This parameter causes the audit facility to stop auditing events. In a partitioned DB2 instance, auditing will be stopped on all partitions when this clause is specified. If the "audit" category of events has been specified for auditing, then an audit record will be logged when the audit facility is stopped.


db2audit Command - How It Works

Starting audit facility - db2audit start

Stopping audit facility - db2audit stop

Configuring audit facility - db2audit configure

Finding how audit facility is currently configured - db2audit describe

Forcing buffered audit record to disk - db2audit flush

Extracting information from the log files - db2audit extract

Pruning the audit log - db2audit prune


db2audit configure:

• This parameter allows the modification of the db2audit.cfg configuration file in the instance's security subdirectory. Updates to this file can occur even when the instance is shut down. Updates occurring when the instance is active dynamically affect the auditing being done by DB2 across all partitions. The configure action on the configuration file causes the creation of an audit record if the audit facility has been started and the audit category of auditable events is being audited.

The following are the possible actions on the configuration file:

• RESET This action causes the configuration file to revert to the initial configuration (where SCOPE is all of the categories except CONTEXT, STATUS is FAILURE, ERRORTYPE is NORMAL, and AUDIT is OFF). This action will create a new audit configuration file if the original has been lost or damaged.

• SCOPE This action specifies which category or categories of events are to be audited. This action also allows a particular focus for auditing and reduces the growth of the log. It is recommended that the number and type of events being logged be limited as much as possible, otherwise the audit log will grow rapidly. Note: Please notice that the default SCOPE is all categories except CONTEXT and may result in records being generated rapidly. In conjunction with the mode (synchronous or asynchronous), the selection of the categories may result in a significant performance reduction and significantly increased disk requirements.

• STATUS This action specifies whether only successful or failing events, or both successful and failing events, should be logged. Note: Context events occur before the status of an operation is known. Therefore, such events are logged regardless of the value associated with this parameter.

• ERRORTYPE This action specifies whether audit errors are returned to the user or are ignored. The value for this parameter can be:

- AUDIT All errors including errors occurring within the audit facility are managed by DB2 and all negative SQLCODEs are reported back to the caller.

- NORMAL Any errors generated by db2audit are ignored and only the SQLCODEs for the errors associated with the operation being performed are returned to the application.

db2audit describe:

• This parameter displays to standard output the current audit configuration information and status.

db2audit extract:

• This parameter allows the movement of audit records from the audit log to an indicated destination. If no optional clauses are specified, all of the audit records are extracted and placed in a flat report file. If output_file already exists, an error message is returned. The following are the possible options:


- FILE The extracted audit records are placed in a file (output_file). If no file name is specified, records are written to the db2audit.out file in the security subdirectory of sqllib. If no directory is specified, output_file is written to the current working directory.

- DELASC The extracted audit records are placed in a delimited ASCII format suitable for loading into DB2 UDB relational tables. The output is placed in separate files, one for each category.

db2audit prune:

• Prunes the audit log. This parameter allows for the deletion of audit records from the audit log. If the audit facility is active and the "audit" category of events has been specified for auditing, then an audit record will be logged after the audit log is pruned.

The following are the possible options that can be used when pruning:

- ALL All of the audit records in the audit log are to be deleted.

- DATE yyyymmddhh. The user can specify that all audit records that occurred on or before the date/time specified are to be deleted from the audit log. The user may optionally supply a pathname which the audit facility will use as a temporary space when pruning the audit log. This temporary space allows for the pruning of the audit log when the disk it resides on is full and does not have enough space to allow for a pruning operation.

db2audit flush:

This parameter forces any pending audit records to be written to the audit log. Also, the audit state is reset in the engine from "unable to log" to a state of "ready to log" if the audit facility is in an error state.
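To tie these subcommands together, a hypothetical housekeeping sequence (the file name, date, and path are examples only) could flush any buffered records, extract them for review, and then prune older entries:

   db2audit flush
   db2audit extract file /tmp/audit_report.out
   db2audit prune date 2004103123 pathname /tmp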


Figure 3-18. Auditing Flow CF457.3

Notes:

1. Configure: This parameter allows the modification of the db2audit.cfg configuration file in the instance’s security subdirectory.

2. Describe: This parameter displays the current audit configuration information and status.

3. Start: This parameter causes the audit facility to begin auditing events based on the contents of the db2audit.cfg file. In a partitioned instance, auditing will begin on all partitions.

4. Flush: This parameter forces any pending audit records to be written to the audit log.

5. Stop: This parameter causes the audit facility to stop auditing events. In a partitioned instance, auditing will be stopped on all partitions.

6. Extract: This parameter allows the movement of audit records from the audit log to an indicated destination. If no optional clauses are specified, then all of the audit records are extracted and placed in a flat report file.

7. Prune: This parameter allows for the deletion of audit records from the audit log.

Auditing Flow

[Diagram: (1) db2audit CONFIGURE updates db2audit.cfg to set the audit scope; (2) db2audit DESCRIBE confirms the configured scope; (3) db2audit START makes auditing active so that audit records are generated in memory (AUDIT_BUF); (4) db2audit FLUSH forces buffered records to db2audit.log; (5) db2audit STOP ends auditing; (6) db2audit EXTRACT writes records from db2audit.log to a flat file or to delimited ASCII files for LOAD or IMPORT into DB2 UDB; (7) db2audit PRUNE deletes records from the audit log.]


Figure 3-19. Configuring db2audit CF457.3

Notes:

The db2audit configure command allows the modification of the db2audit.cfg configuration file in the instance's security subdirectory. Updates to this file can occur even when the instance is shut down.

Intel: X:\sqllib\db2\security

UNIX: $HOME/instxx/security

Note: In a multiple partitioned database environment, updates occurring when the instance is active dynamically affect the auditing being done by DB2 UDB across all partitions.

The following are the possible actions on the configuration file:

RESET

• This action causes the configuration file to revert to the initial configuration (where SCOPE is all of the categories except CONTEXT, STATUS is FAILURE, ERRORTYPE is NORMAL, and AUDIT is OFF). This action will create a new audit configuration file if the original has been lost or damaged.


db2audit configure

reset (Revert config file to initial configuration)

scope (Specify category to be audited)

all, audit, checking, objmaint, secmaint, sysadmin,validate, context

status (Logged successful or failed events)

both

success

failure

errortype (Return errors which occurred within audit facility)

audit

normal

Sample:

db2audit configure scope all status both errortype normal

Configuring db2audit


SCOPE

• This action specifies which category or categories of events are to be audited. This action also allows a particular focus for auditing and reduces the growth of the log. It is recommended that the number and type of events being logged be limited as much as possible, otherwise the audit log will grow rapidly.

Note: Please notice that the default SCOPE is all categories except CONTEXT, and may result in records being generated rapidly. In conjunction with the mode (synchronous or asynchronous), the selection of the categories may result in a significant performance reduction and significantly increased disk requirements.

STATUS

• This action specifies whether only successful events, or only failing events, or both successful and failing events, should be logged.

Note: Context events occur before the status of an operation is known. Therefore, such events are logged regardless of the value associated with this parameter.

ERRORTYPE

• This action specifies whether audit errors are returned to the user or are ignored. The value for this parameter can be:

- AUDIT All errors including errors occurring within the audit facility are managed by DB2 UDB and all negative SQLCODEs are reported to the caller.

- NORMAL Any errors generated by db2audit are ignored and only the SQLCODEs for the errors associated with the operation being performed are returned to the application.
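Combining these options, a configuration narrower than the sample on the foil might look like the following sketch (the choice of categories is hypothetical):

   db2audit configure scope audit,checking,secmaint status failure errortype normal
   db2audit describe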


Figure 3-20. Extracting Audit Record CF457.3

Notes:

The db2audit extract allows the movement of audit records from the audit log to an indicated destination. If no optional clauses are specified, then all of the audit records are extracted and placed in a flat report file called:

Intel: X:\sqllib\db2\security\db2audit.out
UNIX: $HOME/sqllib/security/db2audit.out

If the output file already exists, an error message is returned.

Note: You can use only one option, FILE or DELASC. The following are the possible options that can be used when extracting:

FILE:

• The extracted audit records are placed in a file.

db2audit extract file d:\cf45\db2audit.out


Extracting Audit Record

db2audit.log --> db2audit extract file filename --> Flat File
db2audit.log --> db2audit extract delasc --> Delimited ASCII files --> LOAD or IMPORT

Audit records can be written in two types of format:

db2audit extract file filename - flat report file

db2audit extract delasc - delimited ASCII file

Specify event category, database, status to be extracted:

db2audit extract file db2audit.out category validate \
  database sample status success


DELASC:

• The extracted audit records are placed in a delimited ASCII format suitable for loading into DB2 UDB relational database tables. The output is placed in separate files, one for each category. The filenames are: audit.del, checking.del, objmaint.del, secmaint.del, sysadmin.del, validate.del, context.del.

CATEGORY:

• The audit records for the specified categories of audit events are to be extracted. If not specified, all categories are eligible for extraction.

DATABASE:

• The audit records for a specified database are to be extracted. If not specified, all databases are eligible for extraction.

STATUS:

• The audit records for the specified status (success or failure) are to be extracted. If not specified, all records are eligible for extraction.

Recommendation: When extracting audit records in a delimited ASCII format suitable for loading into a DB2 UDB relational table, you should be clear regarding the character string delimiter used within the statement text field. The default character string delimiter is '0xff'. You can use other delimiters when extracting the delimited ASCII file using the following syntax:

db2audit extract delasc delimiter <delimiter>

The character string delimiter can be a single character (such as ") or a four-byte string representing a hexadecimal value (such as "0xff"). Examples of valid commands are:

db2audit extract delasc
db2audit extract delasc delimiter ]
db2audit extract delasc delimiter 0xff

Load data into a table: If you have not specified a character string delimiter, or have used anything other than the default load delimiter (") as the delimiter when extracting, you should use the MODIFIED BY option on the LOAD command. A partial example of the LOAD command with "0xff" used as the delimiter follows:

db2 load from context.del of del modified by chardel0xff replace into ...

This will override the default load character string delimiter with "0xff".


Figure 3-21. Audit Record - Connect (1 of 2) CF457.3

Notes:

This is an example of audited records when a CONNECT statement is issued.

1. Connect statement is issued. The event correlator shows you that these records were generated by the same event.

2. Authentication is done at the client and succeeded.

3. Check whether the authenticated user is in the SYSADM group; this user is SYSADM.

4. Check whether the authenticated user is in the SYSCTRL group; this user is not SYSCTRL.

5. Check whether the authenticated user is in the SYSMAINT group; this user is not SYSMAINT.

6. Check whether the authenticated user may connect to the database; this user can connect because he/she is a SYSADM user.

7. COMMIT is issued.


User authentication

Indirect SYSCTRL

Indirect SYSADM

Execute CONNECT

tetsuya is not SYSCTRL

tetsuya is SYSADM

Example - db2 connect to sample

timestamp=2004-02-18-11.59.20.797000;category=CONTEXT;audit event=CONNECT; event correlator=2; database=SAMPLE;application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J;

timestamp=2004-02-18-11.59.21.078000;category=VALIDATE;auditevent=AUTHENTICATION; event correlator=2;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA;execution id=DB2ADMIN;application id=*LOCAL.DB2.040218175920;application name=db2bp.exe117254683 ; auth type=CLIENT;package schema=NULLID;packagename=SQLC2B6J;

timestamp=2004-02-18-11.59.22.339000;category=VALIDATE;auditevent=CHECK_GROUP_MEMBERSHIP; event correlator=2;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA;execution id=DB2ADMIN;application id=*LOCAL.DB2.040218175920;application name=db2bp.exe117254683 ; auth type=CLIENT;

timestamp=2004-02-18-11.59.22.349000;category=VALIDATE;audit event=CHECK_GROUP_MEMBERSHIP; event correlator=2;event status=1092;database=SAMPLE;userid=tetsuya;authid=TETSUYA;execution id=DB2ADMIN;application id=*LOCAL.DB2.04021875920;application name=db2bp.exe 117254683 ; auth type=CLIENT;

Audit Record - Connect (1 of 2)


Figure 3-22. Audit Record - Connect (2 of 2) CF457.3

Notes:


Example - db2 connect to sample (continued)

SYSCAT.DBAUTH

CONNECTAUTH

COMMIT

Indirect SYSMAINT

tetsuya is not SYSMAINT

tetsuya can connect

timestamp=2004-02-18-11.59.22.359000;category=VALIDATE;auditevent=CHECK_GROUP_MEMBERSHIP; event correlator=2;eventstatus=1092;database=SAMPLE;userid=tetsuya;authid=TETSUYA;executionid=DB2ADMIN; application id=*LOCAL.DB2.040218175920;applicationname=db2bp.exe 117254683 ; auth type=CLIENT;

timestamp=2004-02-18-11.59.22.570000;category=CHECKING;auditevent=CHECKING_OBJECT; event correlator=2;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA; applicationid=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;object name=SAMPLE; object type=DATABASE; access approvalreason=SYSADM;access attempted=CONNECT;

timestamp=2004-02-18-11.59.23.231000;category=CONTEXT;auditevent=COMMIT; event correlator=2;database=SAMPLE;userid=tetsuya;authid=TETSUYA; applicationid=*LOCAL.DB2.04021875920;application name=db2bp.exe 117254683 ;

Audit Record - Connect (2 of 2)


Figure 3-23. Audit Record - Create Table (1 of 2) CF457.3

Notes:

This is an example of audited records when the CREATE TABLE statement is issued.

1. CREATE TABLE statement is issued.

2. Check if the user has authorization to create a table TEST1 in SAMPLE database.

3. Check if the user has authorization to create a new object in the schema TETSUYA.

4. Implicit grant on the table TEST1 to the creator TETSUYA.

5. Table TEST1 is created.

6. Commit is issued.


Example - db2 create table test1 (col0 char(1))

(Slide flow: CREATE TABLE issued -> SYSCAT.DBAUTH CREATETAB check (tetsuya can create a table) -> SYSCAT.SCHEMAAUTH CREATEINAUTH check (tetsuya can create an object in this schema))

timestamp=2004-02-18-12.26.04.774000;category=CONTEXT;audit event=EXECUTE_IMMEDIATE; event correlator=3;database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J; package section=203;text=create table test1 (col0 char(1));

timestamp=2004-02-18-12.26.04.954000;category=CHECKING;audit event=CHECKING_OBJECT; event correlator=3;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J; package section=0;object schema=TETSUYA;object name=TEST1;object type=TABLE; access approval reason=SYSADM;access attempted=CREATE;

timestamp=2004-02-18-12.26.05.064000;category=CHECKING;audit event=CHECKING_OBJECT; event correlator=3;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J; package section=0;object name=TETSUYA;object type=SCHEMA; access approval reason=SYSADM;access attempted=CREATEIN;

Audit Record - Create Table (1 of 2)


Figure 3-24. Audit Record - Create Table (2 of 2) CF457.3

Notes:


Example - db2 create table test1 (col0 char(1)) (cont.)

Table is created

Implicit grant to the creator

COMMIT

timestamp=2004-02-18-12.26.05.645000;category=SECMAINT;audit event=IMPLICIT_GRANT; event correlator=3;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J; package section=0;object schema=TETSUYA; object name=TEST1; object type=TABLE;grantor=SYSIBM;grantee=TETSUYA;grantee type=USER;privilege=CONTROL,ALTER_WITH_GRANT,DELETE_WITH_GRANT,INDEX_WITH_GRANT,INSERT_WITH_GRANT,SELECT_WITH_GRANT,UPDATE_WITH_GRANT,REFERENCE_WITH_GRANT;

timestamp=2004-02-18-12.26.05.865000;category=OBJMAINT;audit event=CREATE_OBJECT; event correlator=3;event status=0;database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ;package schema=NULLID;package name=SQLC2B6J; package section=0;object schema=TETSUYA;object name=TEST1;object type=TABLE;

timestamp=2004-02-18-12.26.05.975000;category=CHECKING;audit event=COMMIT; event correlator=3; database=SAMPLE;userid=tetsuya;authid=TETSUYA; application id=*LOCAL.DB2.040218175920;application name=db2bp.exe 117254683 ; package schema=NULLID;package name=SQLC2B6J;

Audit Record - Create Table (2 of 2)


Figure 3-25. How to Clean Up Audit Logs? CF457.3

Notes:

db2audit prune

• Prunes the db2audit.log file.

• The user can specify removal of all records, or records occurring before a certain date.

• Allows for the deletion of audit records from the audit log. If the audit facility is active and the "audit" category of events has been specified for auditing, then an audit record will be logged after the audit log is pruned.

- The following are the possible options that can be used when pruning:

• ALL. All of the audit records in the audit log are to be deleted.

• DATE yyyymmddhh. The user can specify that all audit records that occurred on or before the date/time specified are to be deleted from the audit log. The user may optionally supply a pathname which the audit facility will use as a temporary space when pruning the audit log.


(Slide: db2audit prune acts on db2audit.log with the options all or date; extract output files shown: db2audit.out, audit.del, checking.del, objmaint.del, secmaint.del, validate.del, context.del)

How to Clean Up Audit Logs?


This temporary space allows for the pruning of the audit log when the disk it resides on is full and does not have enough space to allow for a pruning operation.
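For example, the two prune forms described above could be issued as follows (the date/time value and the temporary pathname are illustrative only):

db2audit prune all

db2audit prune date 2004021812 pathname /tmp/auditprune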

Recommendations:

The files that are created during the extract process (db2audit extract ...) are not automatically erased, so you have to delete them manually or with a script. If the extract command is run a second time, the output files are not overwritten; instead, you get an error message stating that the output file already exists.
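A simple cleanup sketch, assuming the default extract output file names listed in the figure above and that the files are in the current directory (adjust the names and path to your environment):

rm db2audit.out audit.del checking.del objmaint.del secmaint.del validate.del context.del (UNIX)

del db2audit.out audit.del checking.del objmaint.del secmaint.del validate.del context.del (Windows)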


Figure 3-26. How to Optimize Work with db2audit CF457.3

Notes:

To protect the integrity of the information that you retrieve from the audit facility, you should limit access to the directory where the db2audit files reside.

Note: On Windows, it is recommended that you limit access to the db2audit directory using file permissions.

Intel: X:\Program Files\sqllib\db2\security\

UNIX: $HOME/sqllib/security/

1. Generate only audit records that you need:

db2audit configure scope audit, checking, objmaint, secmaint, sysadmin, validate status failure

2. Extract data:

db2audit extract delasc delimiter ]


Make db2audit directory secure

Use the audit facility only when needed

Generate only necessary records

Performance, disk space

Extract data to delimited ASCII files

Load extracted data to tables

Use SQL to analyze audit events

Easier to analyze with SQL


How to Optimize Work with db2audit


3. Create DB and tables and load data:

db2 load from context.del of del modified by chardel] replace into audit.context

(The character following chardel must match the delimiter used on the extract, in this case ].)

4. Create an audit database, and use the table layout from the DB2 UDB Administration Implementation Guide with all relevant tables. It is recommended first to create a separate schema and then all the relevant tables (to protect them from unauthorized access).

5. Use SQL for analysis.

6. Use SQL to retrieve only the required information and to narrow down the values returned. In a multiple-user environment, it is recommended that you look for specific events such as CHECKING and details such as "access approval reason = DENIED", or for particular SQLCODEs such as -552 or -551 (see the example below).
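A sketch of such a query, assuming the CHECKING records were loaded into a table AUDIT.CHECKING created from the layout in the DB2 UDB Administration Guide: Implementation; the column names used here (AUTHID, OBJNAME, ACCESSATT, ACCESSAPP) are assumptions and must match the DDL you actually used:

db2 "select timestamp, authid, objname, accessatt from audit.checking where accessapp = 'DENIED'"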


Checkpoint

Exercise — Unit Checkpoint

1. If a user exceeds the defined resource limits, what actions can the governor take?

a. Force off the user's query.

b. Reduce the priority of the user's query.

c. Either one.

__________________________________________________

2. SYSADM authority is required to use the governor. True or False?

__________________________________________________

3. What would these limits mean in a configuration file?

- Setlimit

- uowtime 3600

- locks -1

- rowssel 100000

a. A unit of work is limited to 36 minutes, no resources can be locked by an application, and up to 1,000 rows can be selected by an application.

b. A unit of work is limited to one hour, any number of locks may be held by an application, and up to 100,000 rows can be returned to an application.

c. A unit of work is limited to one hour, any number of locks may be held by an application, and up to 100,000 rows can be read by an application.

__________________________________________________

4. Which is the minimum authority to run auditing?

a. DBADM

b. SYSADM

c. SYSMAINT

__________________________________________________


5. db2audit is used to:

a. Reduce workload on the server

b. Trace user behavior

c. Improve system security

__________________________________________________

6. The audit log is in a readable format. True or False?

__________________________________________________


Figure 3-27. Unit Summary CF457.3

Notes:


Unit Summary

Having completed this unit, you should be able to:

Describe the DB2 Governor functionality

Configure the Governor

Start and stop the Governor

Read the Governor Log File

Describe the DB2 Audit functionality

Configure the Audit

Start and stop the Audit

Read the Audit Log File

Describe the impact of these utilities


Unit 4. Problem Determination Tools and Techniques

What This Unit Is About

This unit teaches you how to interpret the notification log files for problem determination and how to use the inspect, db2support, and trace utilities, and gives you some additional information on DB2 UDB problem determination.

What You Should Be Able to Do

After completing this unit you should be able to:

• Interpret administration log and db2diag.log messages

• Use the inspect, db2support, and trace utilities

• Implement problem determination tips

How You Will Check Your Progress

Accountability:

• Checkpoint questions

• An exercise

References

IBM DB2 Universal Database Command Reference

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Guide for GUI Tools for Administration and Development


Figure 4-1. Unit Objectives CF457.3

Notes:


Unit Objectives

After completing this unit, you should be able to:

Interpret administration log and db2diag.log messages

Use the inspect, db2support, and trace utilities

Implement problem determination tips


Figure 4-2. db2diag.log and Administration Log DBM CFG CF457.3

Notes:

The DBM configuration parameter DIAGPATH specifies the fully qualified path in which DB2 puts the first failure data capture information. The default value for DIAGPATH is a null string.

By default, the first failure data capture (FFDC) information is placed in the following locations:

• For Windows systems:

- If the DB2INSTPROF environment variable is not set: db2path\db2instance (where db2path is the path referenced in the DB2PATH environment variable, and db2instance is the environment variable containing the ID of the instance owner).

- If the DB2INSTPROF environment variable is set: x:db2instprof\db2instance, where x is the drive referenced in the DB2PATH environment variable, db2instprof is the instance profile directory, and db2instance is the environment variable containing the ID of the instance owner.


db2diag.log and Administration Log DBM CFG

DIAGPATH - valid directory

Diagnostic data directory containing:

DB2DIAG.LOG - First Failure Service Log File

DB2ALERT.LOG - Alert log file

PID.DMP(s) - Dump files containing extra debug information

tPID.000 - traceback files

DIAGLEVEL - (0-4)

0 - No error logging

1 - Severe errors

2 - Severe and non-severe errors

3 - Severe, non-severe, and warning messages (DEFAULT)

4 - Severe, non-severe, warning, and informational messages

NOTIFYLEVEL - (0-4)

0 - No notification logging

1 - Fatal or unrecoverable errors

2 - Immediate action required

3 - Important information, no immediate action required (DEFAULT)

4 - Informational messages


- On Windows NT, Windows 2000, and Windows XP systems, the DB2 administration notification log is found in the event log and can be reviewed through the Windows Event Viewer. On other operating systems, the administration notification log for the instance is called instance_name.nfy. The diagnostic log (db2diag.log) is still located in the DIAGPATH.

• For UNIX operating systems:

- $HOME/sqllib/db2dump, where $HOME is the home directory of the instance owner.

DIAGLEVEL is a DBM configuration parameter that is configurable online. The following values are possible:

• 0 - No diagnostic data captured

• 1 - Severe errors only

• 2 - All errors

• 3 - All errors and warnings (this is the default)

• 4 - All errors, warnings, and informational messages

NOTIFYLEVEL is a DBM configuration parameter that is configurable online. The following values are possible:

• 0 - No administration notification message captured (not recommended)

• 1 - Fatal or unrecoverable errors

• 2 - Immediate action required. Conditions are logged that require immediate attention from the system administrator or the database administrator. If the condition is not resolved, it could lead to a fatal error. Notification of very significant, non-error activities (such as recovery) may also be logged at this level. This level will capture health monitor alarms.

• 3 - Important information, no immediate action required. Conditions are logged that are non-threatening and do not require immediate action but may indicate a non-optimal system. This level will capture health monitor alarms, health monitor warnings, and health monitor attentions. This is the default value.

• 4 - Informational messages

The administration notification log includes messages having values up to and including the value of notifylevel. For example, setting notifylevel to 3 will include messages applicable to levels 1, 2, and 3. In order for a user application to be able to write to the notification file or the Windows Event Log, it must call the db2AdminMsgWrite API.
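For example, both parameters can be changed with UPDATE DBM CFG and, as stated above, the changes take effect online:

db2 update dbm cfg using DIAGLEVEL 4

db2 update dbm cfg using NOTIFYLEVEL 3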


Figure 4-3. Administration Notification Log CF457.3

Notes:

Use the Notification Log page of the Journal to display a history of notifications that have been generated on an instance. You can view health monitor notifications only or all notifications for an instance. From this page, you can also display additional details of each notification.

When you open the Journal, go to the Notification Log tab. Here you have the option to choose the instance.

With the Notification Log Filter, you will determine which and how many log entries you will see:

• Type of notification record to display

- Health monitor notification only

- All notifications (last 50 records is the default)

• Criteria

- Read all records of selected type from the end of the file


Administration Notification Log


- Maximum number of records displayed

- Read from specified record to the end of the file

- Read records from specified range

You are also able to customize your view using the View tab in the lower right corner.


Figure 4-4. Administration Notification Log CF457.3

Notes:

By double-clicking a displayed record, or by using Selected and Show Details from the menu bar, you will get the details regarding this entry:

• Error code

• Timestamp of the entry

• Instance name

• Node

• TID

• PID

• Application ID

• Additional information, such as what the error was and how to solve the problem


Administration Notification Log


Figure 4-5. Interpreting Log Entries CF457.3

Notes:

This example shows the header information for a sample log entry with all the parts of the log identified.

Not every log entry will contain all of these parts.

1. Timestamp for the message

2. Name of the instance generating the message

3. For multi-partition systems, the partition generating the message. In a nonpartitioned database, the value is 000.

4. DB2 component that is writing the message. For messages written by user applications using the db2AdminMsgWrite API, the component will read “User Application”.

5. Identification of the application for which the process is working. In this example, the process generating the message is working on behalf of an application with the ID *LOCAL.db2.040305091303. To identify more about a particular application ID, either:


Interpreting Log Entries

2004-03-13-03.15.39.020344 [1]  Instance: db2 [2]  Node: 000 [3]

PID:89198 (db2agent (MUSICDB)) [4]  Appid: *LOCAL.db2.040305091303 [5]

recovery manager [6]  sqlpresr [7]  Probe: 1 [8]  Database: MUSICDB [9]

ADM1530E [10]  Crash recovery has been initiated. [11]


a. Use the db2 list applications command on a DB2 UDB server or db2 list dcs applications on a DB2 UDB Connect gateway to view a list of application IDs. From this list you can determine information about the client experiencing the error, such as its node name and its TCP/IP address.

b. Use the db2 get snapshot for application command to view a list of application IDs.

6. DB2 component that is writing the message.

7. Name of the function that is providing the message. This function operates within the DB2 subcomponent that is writing the message. To find out more about the activity performed by a function, look at the fourth letter of its name. In this example, the letter “p” in the function sqlpresr indicates a data protection problem (for example, a damaged log). Some of the letters in the fourth position indicate:

- b Buffer pools

- c Communication between clients and servers

- d Data management

- e Engine processes

- o Operating system calls (such as opening and closing files)

- p Data protection (such as locking and logging)

- r Relational database services

- s Sorting

- x Indexing

8. Unique internal identifier. This number allows DB2 customer support and development to locate the point in the DB2 source code that reported the message.

9. The database on which the error occurred.

10. When available, a message indicating the error type and number as a hexadecimal code.

11. When available, a message text explaining the logged event.


Figure 4-6. Interpreting Log Entries CF457.3

Notes:

For severe errors, an SQLCA structure is dumped into the db2diag.log.

1. Beginning of the SQLCA entry.

2. The SQL code (when negative, an error has occurred).

3. Any reason codes associated with the SQL error code.

4. Sometimes there are several errors leading to the final SQL error code. These errors are shown in sequence in the sqlerrd area.

5. The hexadecimal representation of an SQL error.


Interpreting Log Entries

2004-03-13-03.15.39.020344 Instance: db2 Node: 000

PID:44829 (db2agent (MUSICDB)) Appid:*LOCAL.db2.040305091303

relation_data_serv sqlrerlg Probe:17 Database:MUSICDB

DIA9999E An internal return code occurred. Report the following:

"0xFFFFE101"

Data Title:SQLCA pid(14358)

sqlcaid : SQLCA sqlcabc:136 sqlcode: -980 sqlerrml: 0

sqlerrmc :

sqlerrp : sqlrita

sqlerrd : (1) 0xFFFFE101 (2) 0x00000000 (3) 0x00000000

(4) 0x00000000 (5) 0x00000000 (6) 0x00000000

sqlwarn : (1) (2) (3) (4) (5) (6)

(7) (8) (9) (10) (11)

sqlstate :



Figure 4-7. Log File Example (1) CF457.3

Notes:

A power outage causes your DB2 server machine to reboot. While rebooting, some of the file systems do not remount properly.

You want to check on your database’s integrity after the outage. You start the instance and connect to the database. The connection is successful. In the administration notification log you see the entries shown on this and the next page.

• The Table Space “TS1” has been put offline and is marked roll-forward pending. This has probably happened because TS1 was located on one of the file systems that did not get remounted during the machine’s reboot.


Log File Example (1)

2004-03-13-03.14.33.576365 Instance: db2 Node: 000

PID:140546(db2star2) Appid:none

base sys utilities startdbm Probe:911

ADM7513W Database manager has started.

*

2004-03-13-03.14.38.559911 Instance: db2 Node: 000

PID:89198(db2agent(MUSICDB)) Appid:*LOCAL.db2.040205091435

buffer pool services sqlbStartPoolsErrorHandling Probe: 39

ADM6080E The Tablespace "TS1" (ID "3") was put OFFLINE and in

ROLLFORWARD_PENDING. Tablespace state is 0x"00004080".


Figure 4-8. Log File Example (2) CF457.3

Notes:

• The name of the database being recovered is MUSICDB.

• A notification that database crash recovery has been initiated.

• Crash recovery indicates that the database was recovered but one or more table spaces were not recovered because they were offline. The table space that was offline is indicated in an earlier message.

• A notification that the database crash recovery was successful.

From here, the db2 list tablespace containers command can identify the file systems associated with the table space TS1. The table space can be put back online by remounting the file systems and performing a roll-forward.
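A minimal sketch of these steps for this example (the table space ID 3 comes from the ADM6080E message shown earlier; adjust the roll-forward end point to your own recovery strategy):

db2 list tablespace containers for 3

db2 "rollforward database MUSICDB to end of logs and stop tablespace (TS1) online"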


Log File Example (2)

2004-03-13-03.14.39.020766 Instance: db2 Node: 000

PID:89198(db2agent (MUSICDB)) Appid:*LOCAL.db2.040205091435

recovery manager sqlpresr Probe:1 Database: MUSICDB

ADM1530E Crash recovery has been initiated.

*

2004-03-13-03.14.44.524546 Instance: db2 Node: 000

PID:89198(db2agent(MUSICDB)) Appid:*LOCAL.db2.040205091435

recovery manager sqlpresr Probe: 350 Database: MUSICDB

ADM1533W Database has recovered. However, one or more tablespaces

are offline.

*

2004-03-13-03.14.44.956773 Instance: db2 Node: 000

PID:89198(db2agent(MUSICDB)) Appid:*LOCAL.db2.040205091435

recovery manager sqlpresr Probe: 370 Database: MUSICDB

ADM1531E Crash recovery has completed successfully.


Figure 4-9. db2diag db2diag.log Analysis Tool Command CF457.3

Notes:

The db2diag command filters and formats the db2diag.log file. You do not need any special authorization for this command. The graphic above shows only part of the command syntax; the details are explained in the student notes.

Available options are:

• filename specifies one or more space-separated path names of DB2 diagnostic logs to be processed. If the file name is omitted, the db2diag.log file from the current directory is processed. If the file is not found, a directory set by the DIAGPATH variable is searched.

• -h/?/-help displays help information. If this option is specified, all other options are ignored. You can also specify an optionList to get help for a specific option. If you need more than one option, separate them by commas. With optionList you will also be able to display more information about the tool with the following switches:

- brief displays all options without examples

- tutorial displays examples that describe advanced features


db2diag db2diag.log Analysis Tool Command

(Slide: partial db2diag syntax showing the filename argument and the options -h / ? / -help, -g / -filter, -gi, -gv, -giv, -gvi, -pid, -tid, -e, -fmt, -o, -H, and -t)


- notes displays usage notes and restrictions

- all displays complete information

• -fmt format-string formats the db2diag output using a format string. The following fields (case-sensitive) are currently available (the full list of available fields can be found in the Command Reference):

- %timestamp/%ts Timestamp (can be divided into parts like %tsyear, %tsmonth, and so on)

- %timezone/%tz Number of minutes’ difference from UTC (Universal Coordinated Time)

- %recordid/%recid Unique record ID

- %audience E indicates external users, I internal users, and D debugging information for developers

- %level Severity of the message, such as Info, Warning, Error, and so on

- %source Location from which the logged error originated

- %instance/%inst Instance name

- %node Database partition number

- %database/%db Database name

- %pid Process ID

- %tid Thread ID

- %process Name associated with the process ID in double quotation marks, for example “db2sysc.exe”

- %product Product name, for example DB2 UDB

- %component Component name

- %probe Probe number

- %function Function name

- %appid Application ID

• -g fieldPatternList is a comma-separated list of field pattern pairs in the format fieldName operator searchPattern. The operator can be:

= for only those records that contain matches

:= for those records that contain matches in which a search pattern can be a part of a larger expression

!= for non-matching lines

!:= for non-matching lines in which the search pattern can be a part of a larger expression

^= for records for which the field value starts with the search pattern

!^= for records for which the field value does not start with the search pattern


• -gi fieldPatternList has the same function as -g but is case-insensitive

• -gv fieldPatternList is for messages that do not match the specified pattern

• -gvi/-giv filedPatternList has the same function as -gv but is case-insensitive

• -pid processIDList filters only log messages with the process IDs listed

• -tid threadIDList filters only log messages with the thread IDs listed

• -n/-node nodeList displays only the database partition numbers listed

• -e/-error errorList is for log messages with the error numbers listed

• -l/-level levelList displays only messages with the severity levels indicated

• -c/-count shows the number of records found

• -v/-invert is used to invert the pattern matching to select all records that do not match the specified pattern

• -strict uses only one field: value pair per line; all empty fields are skipped

• -V/-verbose displays all fields, including the empty ones

• -exist defines how fields in a record are processed when a search is requested; with this option, a field must exist in order to be processed

• -cbe is for Common Base Event (CBE) Canonical Situation Data

• -o/-output pathName is used to save the output to a specified file (fully qualified path name)

• -f/-follow If the input file is a regular file, specifies that the tool will not terminate after the last record of the file has been processed. It sleeps for a specified interval of time and attempts to read and process further records.

• -H/-history Displays the history of logged messages for the specified time interval, which can be historyPeriod starting from the most recent record, or historyPeriod:historyBegin starting from a specified date/time

• -t/-time timestamp value for either startTime or endTime

• -A/-archive dirName archives a diagnostic log file. When this option is specified, all other options are ignored. You can specify a directory but no file name.

• -rc rcList/switch displays descriptions of DB2 internal error return codes for a space-separated list

Examples:

To display all severe error messages produced by the process with the process id 53989 and on database partition 1,2 or 3:

db2diag -g level=Severe,pid=53989 -n 1,2,3

To display all messages containing database MUSICDB and instance INST1:

db2diag -g db=MUSICDB, instance=INST1


To display all severe error messages containing the database field:

db2diag -g db:= -gi level=severe

To display severe errors logged for the last three days:

db2diag -gi "level=severe" -H 3d
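To archive the current db2diag.log into a separate directory before starting a new investigation (the directory name is only an example; as noted above, -A accepts a directory but not a file name):

db2diag -A /tmp/db2diag_archive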

Each option can appear only once. They can be specified in any order and can have optional parameters.

By default, db2diag looks for the db2diag.log file in the current directory. If the file is not found, the directory set by the DIAGPATH variable is searched next. If the file is not found, db2diag returns an error and exits.

For further details and complete information, please refer to the Command Reference.


4.1 Inspect


Figure 4-10. Online Database Checking Tool - INSPECT CF457.3

Notes:

• DB2 has an inspection and repair utility called DB2DART. This utility can be used to inspect databases, table spaces, tables, and schemas in a DB2 database for their architectural integrity.

• This utility can be run online and can be called by the INSPECT command. This feature is very beneficial since the greater the amount of data in the database, the longer DB2DART would take to complete and the longer the database would be unavailable for use. Also, its performance is improved, since it is in the engine and can now take advantage of DB2 resources and features like buffer pools and prefetchers.

• INSPECT has corresponding APIs: db2Inspect() and input structure db2InspectStruct.

• INSPECT has granularity. It can run the inspection to check the whole database, or one can be more specific on what to check (like a single table space and its tables, or even a single table, and so forth).

• An inspection on an entire database will process all the objects of a table when the parent data object is found; this includes index, long field, or LOB objects that could be


Online Database Checking Tool - INSPECT

Inspect database for architectural integrity

Checks to ensure that the structures of table objects and table spaces are valid, checking the pages of the database for page consistency

INSPECT is online, whereas DB2DART is an offline utility

Integration into the engine lets INSPECT take advantage of available DB2 resources like buffer pools and prefetchers for better performance

Will not lock objects

Scope

In a single-partition system, the scope is that single partition only; in a multi-partition system, it is the collection of all logical partitions defined in the node configuration file, db2nodes.cfg

Objects

Granularity: databases, table spaces, and tables can be specified

Table space processing for INSPECT CHECK will only process the objects that reside in the table space

Inspect processing will access database objects using the uncommitted read isolation level

Inspect check processing will write out unformatted inspection data results to a specified results file

To see inspection details after check processing completes, you must format the result data using the DB2INSPF utility


located in other table spaces. The INSPECT check processing on a single table space will only process the objects that reside in the table space. If a data object resides in this table space and a corresponding index object in another table space, then only the data object is processed (and vice versa). The same is true for LOB-based data.

DB2DART had different behavior since it was offline. It would check all corresponding INDEX and LOB data in other table spaces automatically. However, in order to do cascade checks, the utility had to be run on an object that contained the actual data. In other words, running DB2DART on an INDEX table space would not seek out the parent data.


Figure 4-11. Inspect Syntax (1) CF457.3

Notes:

Inspect database for architectural integrity, checking the pages of the database for consistency. The inspection checks that the structures of table objects and table spaces are valid.

In a single-partition system, the scope is on that single partition. In a partitioned database system, the scope is the collection of all logical partitions defined in db2nodes.cfg.

You must have SYSADM, DBADM, SYSCTRL, or SYSMAINT authority to run INSPECT, or have the CONTROL privilege if it should be used against a single table.

The command parameters are:

• CHECK specifies the processing

• DATABASE specifies a whole database to be inspected

• BEGIN TBSPACEID n specifies the table space ID to begin with

• BEGIN TBSPACEID n OBJECTID n specifies to begin from a table with the object ID within a given table space ID


Inspect Syntax (1)

(Slide: INSPECT CHECK syntax diagram covering DATABASE (optionally BEGIN TBSPACEID n [OBJECTID n]), TABLESPACE (NAME tablespace-name or TBSPACEID n, optionally BEGIN OBJECTID n), TABLE (NAME table-name with SCHEMA schema-name, or TBSPACEID n OBJECTID n), CATALOG TO TABLESPACE CONSISTENCY, FOR ERROR STATE ALL, LIMIT ERROR TO DEFAULT / n / ALL, the Level Clause, RESULTS [KEEP] filename, and the On Database Partition Clause)


• TABLESPACE

- NAME tablespace-name specifies a single table space by given name

- TBSPACEID n specifies a single table space by given table space ID

- BEGIN OBJECTID n specifies the table to begin from by object ID

• TABLE

- NAME table-name specifies the table by name

- SCHEMA schema-name specifies the schema name for the specified table name in a single table operation

- TBSPACEID n OBJECTID n specifies the table within a given table space ID by object ID

• CATALOG TO TABLESPACE CONSISTENCY specifies to include checking for consistency of physical tables in the table space to the tables listed in the catalog

• FOR ERROR STATE ALL For a table object with an internal state already indicating error state, the check will just report this status and not scan through the object. Specifying this option will result in a processing scan through the object even if the internal state already shows error state.

• LIMIT ERROR TO n Limits reporting to this number of pages in error for an object. When this limit is reached, the processing will discontinue the check on the rest of the object.

• LIMIT ERROR TO DEFAULT Limits the number of pages in error for an object. This value is the extent size of the object. This parameter is the default.

• LIMIT ERROR TO ALL No limit on the number of reported pages in error.

• RESULTS specifies the resulting output file. The file will be written out to the diagnostic data directory path. If no error is found by the check processing, the resulting output file will be erased at the end of INSPECT.

- KEEP specifies to always keep the resulting output file (even if no errors have been found).

- file-name specifies the name for the resulting output file.


Figure 4-12. Inspect Syntax (2) CF457.3

Notes:

Level Clause:

• EXTENTMAP

- NORMAL specifies that the processing level normal for an extent map. This is the default.

- NONE specifies that the processing level is none for an extent map.

- LOW specifies that the processing level is low for an extent map.

• DATA

- NORMAL specifies that the processing level is normal for a data object. This is the default.

- NONE specifies that the processing level is none for a data object.

- LOW specifies that the processing level is low for a data object.


Inspect Syntax (2)

(Slide: Level Clause syntax showing EXTENTMAP, DATA, BLOCKMAP, INDEX, LONG, and LOB, each with NORMAL (the default), NONE, or LOW; On Database Partition Clause syntax showing ON DBPARTITIONNUM / DBPARTITIONNUMS, ALL DBPARTITIONNUMS, and the EXCEPT clause)


• BLOCKMAP

- NORMAL specifies that the processing level is normal for a block map object. This is the default.

- NONE specifies that the processing level is none for a block map object.

- LOW specifies that the processing level is low for a block map object.

• INDEX

- NORMAL specifies that the processing level is normal for an index object. This is the default.

- NONE specifies that the processing level is none for an index object.

- LOW specifies that the processing level is low for an index object.

• LONG

- NORMAL specifies that the processing level is normal for a long object. This is the default.

- NONE specifies that the processing level is none for a long object.

- LOW specifies that the processing level is low for a long object.

• LOB

- NORMAL specifies that the processing level is normal for a LOB object. This is the default.

- NONE specifies that the processing level is none for a LOB object.

- LOW specifies that the processing level is low for a LOB object.

On Database Partition Clause

• ALL DBPARTITIONNUMS specifies that the operation is to be done on all database partitions specified in the db2nodes.cfg file. This is the default.

• EXCEPT specifies that the operation is to be done on all database partitions specified in the db2nodes.cfg file except those specified in the node list.

• ON DBPARTITIONNUM / ON DBPARTITIONNUMS performs the operation on a set of database partitions.

• db-partition-number1 specifies a database partition number in the database partition list.

• db-partition-number2 specifies the second database partition number, so that all database partitions from db-partition-number1 up to and including db-partition-number2 are included in the database partition list.

Notes: For check operations on table objects, the level of processing can be specified. The default is NORMAL; specifying NONE for an object excludes it. Specifying LOW will do a subset of the checks that are done for NORMAL.


Check database can be specified to start from a specific table space or from a specific table by specifying the ID value of the table space or table.

Check table space can be specified to start from a specific table, by ID.

The processing of table spaces will affect only the objects that reside in the table space.

Online inspect processing will access database objects using isolation level uncommitted read. COMMIT processing will be done during INSPECT processing. It is advisable to end the unit of work by issuing a COMMIT or ROLLBACK before invoking INSPECT.

Online inspect check processing will write out unformatted inspection data results to the results file, which is placed in the diagnostic data directory path (by default under sqllib\instname). After check processing completes, the inspection result data must be formatted with the db2inspf utility in order to see the inspection details. The results file will have a file extension of the database partition number. In a partitioned database environment, each database partition will generate its own results output file, with an extension corresponding to its database partition number.

If the name of a file that already exists is specified, the operation will not be processed. The file will have to be removed before that file name can be specified.
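A minimal sketch of a full-database check followed by the formatting step (the database and file names are illustrative; the unformatted results file is created in the diagnostic data directory with the partition number as its extension, .000 on a single-partition system, so run db2inspf there or supply the full path):

db2 connect to MUSICDB

db2 inspect check database results keep musicdb.chk

db2inspf musicdb.chk.000 musicdb_inspect.txt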


4.2 db2support


Figure 4-13. db2support Overview CF457.3

Notes:

The db2support tool will collect data from the machine on which the tool runs. In a client/server environment, database-related information will be from the machine where the database resides via an instance attachment or a connection to the database. For example, operating system or hardware information (the -s option) and files from the diagnostic directory (DIAGPATH) will be from the local machine where the tool is running. Data such as buffer pool information, database configuration, and tablespace information will be from the machine where the database physically resides. All collected information will be stored in an output file.

Collected information includes:

• Basic operating and hardware information

• System files (DB2 and operating system)

• System resource information (disk, CPU, memory)

• Operating system and level

• JDK level


db2support Overview

(Diagram: db2support gathers data from the database/server and stores it as collected information)


• DB2 release information

• Registry information

• Node configuration

• DBM configuration

• DAS configuration

• Database directory information

• Node directory information

• DCS directory information


Figure 4-14. db2support Syntax CF457.3

Notes:

db2support collects environment data about either a client or a server machine, and places the files containing system data into a compressed file archive. You can also collect basic data about the nature of a problem through an interactive question and answer process with the user.

The most complete output will be generated if this tool is invoked by the instance owner. Other users with limited privileges on the system can run this tool, but some of the data collection actions will result in reduced reporting and reduced output.

Command parameters:

• output path specifies the path where the archived library is to be created. This is the directory where user-created files must be placed for inclusion in the archive.

• -f or -flow ignores pauses when requests are made for the user to press <Enter> key to continue. This option is useful when running or calling the db2support tool via a script or some other automated procedure where unattended execution is desired.

• -a or -all_core specifies that all core files are to be captured.


db2support Syntax

(Slide: db2support syntax diagram showing output path, -d database name with -c, -u userid, and -p password, and the options -f, -a, -r, -g, -h, -l, -m, -n, -q, -s, -v, and -x)


• -r or -recent_core specifies that the most recent core files are to be captured. This is ignored if the -a option is specified.

• -d database_name or -database database_name specifies the name of the database for which data is collected.

• -c or -connect specifies an attempt to be made to connect to the specified database.

• -u userid or -user userid specifies the ID to connect to the database.

• -p password or -password password specifies the password for the ID.

• -g or -get_dump specifies that all files in a dump directory, excluding core files, are to be captured.

• -h or -help displays help information. When this option is used, all other options are ignored.

• -l or -logs specifies that active logs are to be captured.

• -m or -html specifies that all system output is dumped into HTML format files. By default, all system-related information is dumped into flat text files if this parameter is not used.

• -n or -number specifies the problem management report (PMR) number or identifier for the current problem.

• -q or -question_response specifies that interactive problem analysis mode is to be used.

• -s or -system_detail specifies that detailed hardware and operating system information is to be gathered.

• -v or -verbose specifies that verbose output is to be used while this tool is running.

• -x or -xml_generate specifies that an XML document containing the entire decision tree logic used during the interactive problem analysis mode (-q mode) is to be generated.

Note: In order to protect the security of business data, this tool does not collect table data, schema (DDL), or logs. Some of the options do allow the inclusion of some aspects of schema and data (such as archived logs). Options that expose database schema or data should be used carefully. When you invoke this tool, a message is displayed that indicates how sensitive data is dealt with.

Data collected from the db2support tool will be from the machine where the tool runs. In a client/server environment, database-related information will be from the machine where the database resides via an instance attachment or connection to the database. As an example: operating system or hardware information (the -s option) and files from the DIAGPATH will be from the local machine where db2support is running; whereas data such as buffer pool information, database configuration, and table space information will be from the machine where the database physically resides.
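For example, a typical invocation that collects data for a database, connects to it, gathers detailed hardware and operating system information, and produces HTML output (the output path and database name are illustrative):

db2support /tmp/support -d SAMPLE -c -s -m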


4.3 db2trace


Figure 4-15. DB2 Trace CF457.3

Notes:

After a problem has been reported, IBM may ask you to run a trace in order to get detailed information.

The trace facility records information about operations and formats this information into readable form. Enabling the trace facility may impact your system’s performance. As a result, only use the trace facility when directed by a DB2 support representative.


DB2 Trace

Must reproduce the error in order to take the trace

Since the trace logs all actions being performed along with the parameter values at various steps in the process:

DB2 Trace does impact performance

Timing-related problems may not reoccur

Trace information grows rapidly

Capture only the error

Avoid other activities


Figure 4-16. DB2 Trace to Memory CF457.3

Notes:

You can specify that the trace operation will be performed on the instance (db2), which is the default, or on the DB2 administration server (das).

Traces must be turned on (db2trc on), and then trace information is continuously written to the specified file (the -f option, which can generate an extremely large dump file) until db2trc is turned off.

With the -p option, you can specify to run the trace facility for a process ID (pid) or thread ID (tid). A maximum of five pid.tid combinations is supported. For example, to trace processes 10, 20, and 30, you must issue db2trc on -p 10,20,30. To enable tracing only for thread 33 of process 100, the syntax is: db2trc on -p 100.33

With -l and -i you can specify the size and behavior of the trace buffer. -l specifies that the last trace records are retained (that is, the first records are overwritten when the buffer is full). -i specifies that the initial trace records are retained (that is, no more records are written to the buffer once it is full). The buffer size can be specified in either bytes or megabytes. For example, to specify a buffer size of four megabytes, use: db2trc on -l 4m


DB2 Trace to Memory

(Slide: db2trc acts on db2 (the instance, the default) or das (the Administration Server), writing either to a file (-f) or to a memory buffer (-l / -i), optionally filtered by PID)


The buffer size must be a power of 2.

Note: The db2trc command must be issued several times to turn tracing on, produce a dump file, format the dump file, and turn tracing off.
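A typical sequence, shown only as a sketch (the buffer size and file names are illustrative):

db2trc on -l 8m

(reproduce the problem)

db2trc dump trace.dmp

db2trc off

db2trc flw trace.dmp trace.flw

db2trc fmt trace.dmp trace.fmt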


4.4 Problem Determination Tips


Figure 4-17. Information Needed CF457.3

Notes:

To be able to solve a problem, it is crucial to understand it, which means having a complete picture of the problem situation. The points mentioned above are hints for being able to understand the problem.

The nature of the problem leads you to determine, for example, that DB2 could not be started, an application produces an error, a user cannot connect, and so on. Answering these questions leads you in the correct direction for further investigation. One indicator for answering this is the SQLCODE, as DB2 messages are always returned in the form CCCnnnnnS. The CCC identifies the DB2 component returning the message, the nnnnn is a four- or five-digit error code, and the S is a severity indicator. The component identifier could be:

• SQL Database Manager messages

• DB2 Command Line Processor messages

• ASN Replication messages

• CLI Call Level Interface messages


Information Needed

What is the problem?

Where is the problem happening?

When does the problem happen?

Under which conditions does the problem happen?

Is the problem reproducible?

What are the symptoms?


• SQJ Embedded SQLJ in Java messages

• SPM Synch Point Manager messages

• DBI Installation or configuration messages

• DBA Control Center and Database Administration Utility messages

• CCA Configuration Assistant messages

• DWC Data Warehouse Center messages

• FLG Information Catalog Manager messages

• STA Satellite messages

The full text of a DB2 message can be displayed from the DB2 command line by entering the db2 command followed by a question mark and the message identifier.
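For example, to display the full text of the SQLCODEs mentioned earlier in the audit discussion:

db2 ? SQL0551

db2 ? SQL0552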

Where the problem is happening leads you to determine if it is on the server machine, on a particular database partition, or, for example, is it an application problem, is this application running locally or from a client, and so on.

When the problem happens gives answers like: while connecting to a database, during selects or inserts, or during a maintenance activity, and so on. Or, for instance: always when application x and application y are connected at the same time.

Under which conditions does the problem happen? This could be: if it is an application problem, is this the only application running, or does the problem happen if two different applications are running at the same time? Is it a problem that occurs only once, or does it recur, and in which situations, and so on.

Could the problem be reproduced? It might be helpful to check it on a test system or to reproduce it to be able to run a trace if needed.

What are the symptoms? Here it is necessary to isolate exactly what happens if the problem occurs. Things like error messages, or that no other applications can connect. Or, regarding applications, for instance the application can make 500 updates but then all is rolled back. Also, in this context, note what would be expected instead of the error situation.

Once you have answered all these questions, you have a very good starting point for solving your problem, or, if needed, you have all the necessary information (together with the configuration information) to contact and inform DB2 support personnel.


Figure 4-18. Miscellaneous Troubleshooting Tools CF457.3

Notes:

Here is a listing of useful tools for problem determination. The exact syntax for these tools can be found in the DB2 UDB Command Reference.

• db2bfd (Bind File Description Tool): displays the contents of a bind file. It can be used to examine and verify the SQL statements within a bind file, as well as to display the precompile options used to create it. This can be helpful for problem determination related to an application's bind file.
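For example, assuming a bind file named myapp.bnd (the file name is illustrative), the SQL statements it contains can be listed with:

db2bfd -s myapp.bnd

The -b option displays the bind file header instead, and -v the host variable declarations.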

• db2dart (Database Analysis and Reporting Tool): examines databases for architectural correctness and reports any errors encountered. It can inspect the entire database, single tables, or table space files and containers. db2dart must be run with no users connected to the database, and requires SYSADM authorization.
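In its simplest form the tool takes only the database name (SAMPLE below is illustrative); more selective checks are controlled through the action options described in the Command Reference:

db2dart SAMPLE

db2dart writes its findings to a report file rather than to the screen.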

• db2flsn (Find Log Sequence Number): returns the name of the file that contains the log record identified by a specified log sequence number (LSN). The log header control file (SQLOGCTL.LFH) must reside in the current directory. Since this file is located in the database directory, the tool can be run from the database directory, or the control file can be copied to the directory from which the tool will be run.


The tool can only be used with databases that have logretain set to RECOVER or userexit set to ON.
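For example, assuming the log control file is available in the current directory (the LSN value is purely illustrative):

db2flsn 000000BF0030

The output names the log file, for example S0000002.LOG, that contains the given LSN.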

• db2drdat (DRDA Trace): allows you to capture the DRDA data stream exchanged between a DRDA Application Requester (AR) and the DB2 UDB DRDA Application Server (AS). This tool can also be used to determine how many sends and receives are required to execute an application. db2trc should not be used while db2drdat is active.

• db2_call_stack: generates EDU call stacks. EDU stands for "engine dispatchable unit" and refers to a thread (Windows) or process (UNIX) that is doing work on behalf of DB2. The call stack shows the processing path that an EDU is currently in. This command is mainly used when it appears that the DB2 engine has become "hung". On UNIX, the db2_call_stack command generates call stacks for both single- and multi-partition instances; the call stacks are placed in files in the diagnostic directory.

• db2level (Show DB2 Service Level): displays the current version and service level of the installed DB2 product. Output from this command goes to the console by default. This information is helpful, for example, when searching for APARs, where it is essential to know what service level your instance is running at.

Sample db2level output:

DB21085I Instance “DB2” uses DB2 code release “SQL08020” with level identifier “03010106” and informational tokens “DB2 v8.1.7.328”, “s040415” and “WR21306”, and FixPak “7”. Product is installed at “C:\SQLLIB”.

- The last token in the message, WR21306, contains the PTF number and can be used to identify the FixPak number on the DB2 Technical Support Web site. The token prior to that, s040415, shows the date when the product was built, and is used by DB2 service personnel and development when diagnosing problems.

• db2look (DB2 Statistics and DDL Extraction Tool): extracts the DDL statements required to reproduce the database objects of a database. It can also generate the UPDATE statements needed to replicate the statistics on those objects, as well as the UPDATE DATABASE CONFIGURATION and UPDATE DATABASE MANAGER CONFIGURATION commands and the db2set statements needed to reproduce the registry variable and configuration parameter settings (for example, so that a test system can be brought in line with the configuration of the live system).
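A typical invocation (the database name sample and the output file name are illustrative) extracts the DDL together with the statistics UPDATE statements into a script that can be replayed on another system:

db2look -d sample -e -m -o sample.ddl

Further options are described in the Command Reference, for example -f to extract configuration parameters and registry settings.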

• db2ckbkp (Check Backup): tests the integrity of a backup image and determines whether or not the image can be restored. It can also be used to display the metadata stored in the backup header. If the complete backup consists of multiple objects, the validation will only succeed if db2ckbkp is used to validate all of the objects at the same time.
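For example, the header of a backup image can be displayed with the -h option (the image name below follows the usual naming convention but is illustrative only):

db2ckbkp -h SAMPLE.0.DB2.NODE0000.CATN0000.20041115120000.001

Running db2ckbkp against the same image without options performs the integrity check and reports whether the image can be restored.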

• db2sql92 (SQL92-compliant SQL Statement Processor): reads SQL statements from either a flat file or standard input, dynamically describes and prepares the statements, and returns an answer set. It supports concurrent connections to multiple databases. This tool requires SYSADM authorization.


• db2tbst (Get Table Space State): accepts a hexadecimal table space state value and returns the state. The state value is part of the output from LIST TABLESPACES.
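For example (the state value is taken from the output of LIST TABLESPACES):

db2tbst 0x0000

returns “State = Normal”; non-zero values decode to states such as Backup Pending or Quiesced.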

• db2untag (Release Container Tag): removes the DB2 tag on a table space container. The tag is used to prevent DB2 from reusing a container in more than one table space. db2untag displays information about the container tag, identifying the database with which the container is associated. It is useful when it is necessary to release a container last used by a database that has since been deleted; if the tag were left behind, DB2 would be prevented from reusing the resource in the future. Use this tool with care!


Checkpoint

Exercise — Unit Checkpoint

1. db2diag.log is the only possible means to receive additional information regarding an error situation. True or False?

__________________________________________________

2. What can you do to receive all the necessary information about your DB2 UDB environment and system?

a. Use the Control Center

b. Use the Health Center

c. Use the db2support tool

__________________________________________________

3. What can you use to check the integrity of your DB2 UDB database while it remains online?

a. db2dart

b. inspect

c. db2flsn

__________________________________________________


Figure 4-19. Unit Summary CF457.3

Notes:


Unit Summary

Having completed this unit, you should be able to:

• Interpret administration log and db2diag.log messages

• Use the inspect, db2support, and trace utilities

• Implement problem determination tips


Unit 5. Parallelism, SMP Enablement, and Process Model

What This Unit Is About

This unit describes the parallelism and SMP enablement in DB2 UDB.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the parallel options for:
  - SQL processing
  - Backup and restore
  - Load utility
  - Create index processing

• Describe how the DB2 UDB process model supports the use of parallelism

• List the benefits of the parallel options when used in SMP or partitioned database environments

• Configure DB2 UDB to use the parallel features

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Command Reference

IBM DB2 Universal Database SQL Reference, Volume 1

IBM DB2 Universal Database SQL Reference, Volume 2

IBM DB2 Universal Database Administration Guide: Performance


Figure 5-1. Unit Objectives CF457.3

Notes:


Unit Objectives

After completing this unit, you should be able to:

• Describe the parallel options for:
  - SQL processing
  - Backup and restore
  - Load utility
  - Create index processing

• Describe how the DB2 process model supports the use of parallelism

• List the benefits of the parallel options when used in SMP or partitioned database environments

• Configure DB2 to use the parallel features


5.1 Parallelism and SMP Enablement


Figure 5-2. Why Use Parallelism CF457.3

Notes:

What is parallelism?

Parallelism refers to the use of multiple resources to perform one task. We sometimes use parallelism to coordinate what may be many independent resources as one to accomplish the task at hand.

Why use parallelism?

There are many driving forces behind the use of parallelism in relational databases, many of which relate to the introduction of new technology or hardware becoming available, or to improvements in database theory and implementation. The most compelling reason for the use of parallelism comes from the ever-increasing demands of relational database users. These demands can be summarized as follows:

• Improved performance

Users expect a fast response time even if the queries generate complex reports, search gigabytes of data, or make a selection on the web page of their local video store.


• Database growth and scalability

Businesses have a growing appetite for information with many companies now considering a terabyte of data not unusual. Companies want to maintain more customer records, with more detailed information for longer periods of time, and to add to that data using image, video, and sound. RDBMSs need to support these ever-increasing volumes and be able to grow and change with the business. If a business starts performing more analysis on the customer information, the database needs to be able to handle the additional processing while maintaining satisfactory response times.

• Cost efficient computing

RDBMSs used as a business tool are required to add value by exploiting the data in a company in a cost-effective manner. There is also a requirement for database systems to be scalable. This means that, if a business needs to increase processing workloads or to manage a much greater volume of data, additional hardware may be added to their existing systems (such as more CPUs, disk, or even complete machines), and the RDBMSs must be able to efficiently exploit the additional resources.


Figure 5-3. Parallelism in DB2 UDB CF457.3

Notes:

Parallelism in DB2 UDB

Many forms of parallelism existed in DB2 prior to DB2 Universal Database SMP support. The notes here are supplied as background information to help clarify the terms and to distinguish the functionality from the SMP functionality documented in this lecture.

Inter-Query Parallelism

Inter-query parallelism is achieved when multiple different transactions are executed simultaneously against one database. This is accomplished by assigning different transactions to different processes or threads, which execute simultaneously on different CPUs within the same computer or on separate partitions in a shared nothing parallel database. This type of parallelism is beneficial when there are many concurrent transactions which are neither heavily computational nor interdependent. DB2 UDB Workgroup Edition and DB2 Enterprise Server Edition both support forms of inter-query parallelism.


Intra-Query Parallelism

With intra-query parallelism, a single query is split across many processors. The benefit of intra-query parallelism is a speed-up in processing time. The elapsed time for performing a query may be reduced. This type of parallelism enables more complicated and/or more computation-intensive operations to be performed in a reasonable time. Intra-query parallelism can be achieved in two forms:

• Data parallelism (also referred to as partition parallelism)

• Functional parallelism (also referred to as pipeline parallelism)

Remember, intra-query parallelism applies to queries only: you do not get intra-query parallelism with UPDATE, INSERT, or DELETE, only with SELECT.

“Data Parallelism” (within a single query)

With data parallelism, a single query is split into many clones, or subqueries, each of which works with a subset of the data. This type of parallelism is particularly suitable for compute-intensive operations. This is supported simultaneously on different database partitions, each working on its own portion of the database.

“Functional Parallelism” (within a single query)

Another form of intra-query parallelism is commonly referred to as "functional parallelism". It is also known as intra-query pipelined parallelism, or simply pipelining. A single query is divided into a series of operators which are executed simultaneously on different CPUs. DB2 ESE supports functional parallelism with simultaneously executing operators (for example, sort and join).

Depending on the hardware architecture type (shared nothing, shared disk, shared memory), a parallel database can perform an operation in two ways:

• Function shipping

• I/O shipping

Function Shipping

Function shipping is normally used in shared nothing systems. The query is divided into subqueries which are moved to where the data resides and executed there. This helps to reduce communication overhead, as shipping the function to the data is normally less resource-intensive than shipping the data to the function.

I/O Shipping

I/O shipping is implemented by shipping the data from one database partition to another database partition to be used by a query or subquery executing at that partition. In the case of a shared memory architecture, the data is shared rather than shipped over a communication link.


Figure 5-4. More DB2 UDB Parallelism CF457.3

Notes:

In DB2 Universal Database, parallelism support also exploits the symmetric multiprocessor (SMP) systems. This functionality exploits SMP shared memory architecture to speed up individual SQL queries and utilities. The utilities also support parallel I/O servers for driving multiple disk drives, similar to that already supported with SQL requests. The SMP support builds upon the existing parallelism support that existed in prior versions of DB2 UDB, including:

• Parallel transactions: each user's request could be run using separate processes or threads

• Parallel computers: using DB2 ESE to support partitioned database machines with multiple parallel database partitions

• Parallel I/O: with the ability to define multiple processes or threads to manage the I/O activity for different physical drives

• Parallel disks: DB2 UDB would allow databases to be created over many disk drives allowing concurrent physical access to data


The SMP parallelism support in DB2 UDB can be used in these environments:

• A single system where users' processing needs can be efficiently and cost-effectively satisfied by a single SMP machine which, over time, is upgraded by adding CPUs.

• A shared nothing cluster or partitioned database environment which provides large-scale, shared nothing parallelism.


Figure 5-5. SQL Query Parallelism Overview CF457.3

Notes:

The fact that disks, memory, and CPUs can be shared uniformly by multiple processes or threads in an SMP environment provides two important features that DB2 UDB exploits:

• It allows workloads to be divided more evenly among the processes or threads, thereby achieving better scalability. This is possible since all processes or threads can work on all or part of the data. This is analogous to a single queue serviced by multiple bank tellers.

• It provides flexibility in designing query execution strategies that are not possible or do not perform well in a shared nothing environment.

This functionality starts in the optimizer, where the query is broken into subqueries, or parts of a query, which may be run in parallel. The process model starts and manages the different processes or threads needed to support these subqueries in DB2 UDB, so that the subqueries can be dispatched to different CPUs and truly execute parts of a query in parallel. DB2 UDB will attempt to determine the SMP or partitioned database environment it is running on, so as to fully exploit the available hardware.


Figure 5-6. Subsection Pieces CF457.3

Notes:

Subsection Pieces

For the optimizer to exploit SMP parallelism, it must build SQL access plans which fully exploit an SMP computer. This has been achieved in DB2 UDB by extending the optimizer to consider access plans which, instead of running the complete query as a whole, break the query down into different parts. This is often called query decomposition. In a non-SMP environment, a query is a sequence of operators which is sent to a single process or thread to be executed. In an SMP environment, DB2 UDB considers different groupings of the operators which can be sent to different processes or threads to be executed. These parts are called subsection pieces (SSPs), with one SQL statement consisting of one or many subsection pieces. The use of subsection pieces, or SSPs, will only be considered by the optimizer if parallelism support has been enabled in DB2 UDB. Enabling parallelism support is covered later in this unit.

Subsection Piece (SSP)

A subsection piece is a sequence of one or more database operators belonging to the same SQL query which, when executed, complete part of the processing of that query.


A subsection piece must be executed in one DB2 UDB process or thread. A subsection piece may receive data from other DB2 UDB subsection pieces, feed data to another subsection piece, or return data back to the user. Subsection pieces can be cloned or duplicated by the optimizer (this is discussed later) to speed up processing.

Subsection

A subsection is the smallest unit of an SQL request which may be shipped between database partitions. In contrast, a subsection piece is a subset of (or a complete) subsection, and is the smallest unit which can be shipped between processes or threads on the same computer or database partition. When running DB2 Universal Database ESE in a multi-partition database environment, the optimizer is able to decompose a single SQL request to run over a number of database partitions. The optimizer performs the decomposition of the SQL request into subsections which, like subsection pieces, are a sequence of database operators. This decomposition of an SQL request happens before it is decomposed further into subsection pieces. Subsections do not exist in a single-partition environment, as an SQL request is broken down directly into subsection pieces.

Operator

An operator is the smallest step in the execution of a SQL request externalized to users by DB2 UDB. Operators perform a distinct action on the data being processed, for example a sort operator. By combining operators into a sequence of steps, DB2 UDB is able to fully resolve an SQL request, manipulating its source tables into the answer set required by the user.


Figure 5-7. Intra-Query Parallelism (1) CF457.3

Notes:

DB2 UDB has a configurable parameter known as DEGREE OF PARALLELISM. The degree of parallelism limits the number of subsection pieces (SSPs) into which a query can be broken down. In the diagram above, the query is broken into four SSPs, and therefore the degree of parallelism is 4. When a query is broken into multiple SSPs, a table queue is built to coordinate the return of data from the SSPs.

The optimizer handles all of this behind the scenes, and therefore the user does not need to change anything in their application to utilize SMP parallelism, other than to set the degree of parallelism > 1. Users can explicitly set the degree of parallelism, either for the DB2 UDB instance, the database, the application, or the statement. In addition, if a section is built by the optimizer with a certain degree of parallelism, it can be reduced if the instance default degree of parallelism is lower.

When breaking a query into multiple SSPs, the optimizer will also determine if the SSPs will work on the same data or if the data can be partitioned among the SSPs. In determining the optimal way to split the section into SSPs and to partition the data, the optimizer will be influenced mainly by the degree of parallelism and the cardinality of the data.


In an SMP environment, the SSPs will be duplicates or clones of each other. DB2 UDB will then execute the SSPs in parallel and return the results quicker than if the query was run on one processor. Each SSP copy will work on a subset of the data. As determined by the optimizer, this subset may be based on the data values or on equal divisions of the data based on the number of rows.

Note: If the degree of parallelism is set to a value > 1, the number of SSPs created is equal to the value of "degree of parallelism". When setting the degree of parallelism for your system, you should match this to the number of CPUs in the SMP system. (This would be 1 in a uniprocessor environment.)

If the degree of parallelism is set to ANY, then the optimizer will decide the degree of parallelism for each SQL query. For instance, on a 4-way SMP machine, a simple insert statement will gain no advantage from being run in parallel across the CPUs, so using the ANY setting will result in this type of statement running on only 1 CPU (a degree of parallelism of 1).

The INTRA_PARALLEL setting should be enabled or disabled depending on the typical workload and the number of users. For complex SQL with relatively few users (OLAP/DSS), enable intra-partition parallelism. For relatively simple SQL with a large number of concurrent queries (OLTP), do not enable intra-partition parallelism.


Figure 5-8. Intra-Query Parallelism (2) CF457.3

Notes:

In a partitioned database environment with SMP partitions, the coordinating partition may ship subsections to the partitions to be processed. If the partitions have a degree of parallelism greater than one (> 1), the query may be broken into (degrees of parallelism) SSPs on each partition. This way, each partition can process the data within the partition in parallel. As can be seen above, there can now be (number of partitions) X (degrees of parallelism) SSPs executing at the same time to satisfy one request.


Figure 5-9. Intra-Query Parallelism CF457.3

Notes:

The diagram above is an example of both functional, or pipeline, and data parallelism. There is one SMP machine in the diagram. The data from the lineitem table is dynamically partitioned between the four CPUs. Each CPU then performs a series of operations, for example a sort, against its portion of the lineitem table. This is an example of "data parallelism".

Each CPU (CPU 1-4) copies its sorted data set into one of four table queues, marked t1-t4 in the diagram. CPU5 then reads the top page from each of the table queues and sorts the data in these pages. If there is a page filled in each of the table queues, CPU5 is able to perform an overall sort even though the table queues may not contain all of the data from the four previous sorts. This is an example of functional or pipeline parallelism. In this manner, the final sort done by CPU5 is working simultaneously with the sorts on CPUs 1-4. The output from the sorts from CPUs 1-4 is pipelined to CPU5.


Figure 5-10. Data Parallelism for SMP CF457.3

Notes:

DB2 UDB supports data parallelism on an SMP system. Data parallelism supports SCAN and SORT operators within an SMP environment.

DB2 UDB now performs data partitioning dynamically at runtime before supplying the data or index information to parallel table or index scans. It assigns a range of index or data rows or pages decided by the optimizer to each scan operation. This allows the load on CPUs in an SMP system to be balanced evenly, and can also help to improve the performance of the scan operator by grouping like data together to enhance the operation on the data.

To take advantage of the shared memory architecture found in SMP systems and dynamic data partitioning, DB2 UDB has four types of sorts or sort operators:

• Round robin sort

• Partitioned round robin sort

• Replicated sort

• Dynamic shared memory sort


In DB2 UDB, the logic used by prefetchers handles prefetch requests from multiple SSPs at the same time. The SSPs issue prefetch requests, and the prefetchers will scan the requests and group together the requests whenever possible.


Figure 5-11. Parallel Configuration Parameters CF457.3

Notes:

The visual summarizes the database manager parameters as follows:

• INTRA_PARALLEL (dbm level): NO, YES, -1. Defaults to NO. A value of -1 causes the parameter to be set to YES or NO based on the hardware on which the database manager is running. If changed, packages already bound will automatically be rebound at their next execution. Not configurable online.

• MAX_QUERYDEGREE (dbm level): 1-32767, -1. Defaults to -1, which allows the optimizer to choose the degree of parallelism based on cost. No SQL executed on a database in this instance can use a degree of parallelism higher than this value. Configurable online.


There are two parameters in the database manager configuration that affect parallelism:

• INTRA_PARALLEL - Can be set to YES, NO, or -1, and defaults to NO at install time. If the INTRA_PARALLEL flag is YES, intra-partition parallelism is allowed for that DB2 UDB instance. If it is -1, the parameter is set to YES or NO depending on the hardware on which the database manager is running. If this parameter is changed, all bound packages will be marked as invalid and will be implicitly rebound on their next execution.

• MAX_QUERYDEGREE - Allowed values 1 - 32767, -1. The default is -1, which allows the optimizer to choose the optimal degree of parallelism based on the SQL statements, the number of CPUs, and the data cardinality. This sets the maximum degree of parallelism for the instance. No SQL executed on databases within the instance can have a degree of parallelism greater than this setting.
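Both parameters are set with UPDATE DBM CFG. A minimal sketch (the values are illustrative; because INTRA_PARALLEL is not configurable online, the instance must be recycled for the change to take effect):

db2 update dbm cfg using INTRA_PARALLEL YES
db2 update dbm cfg using MAX_QUERYDEGREE -1
db2stop
db2start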

There is a parameter in the database configuration file that affects parallelism:

• DFT_DEGREE - Allowed values 1 - 32767, -1. The default is 1, that is, no parallelism. This provides the default value for the DEGREE option on bind, as well as setting the default for the CURRENT DEGREE special register. If an application connects to two or more databases, the DFT_DEGREE of each database controls the degree of parallelism for the queries against it. For example, the queries accessing database1 may have a degree of parallelism of 2, while the queries accessing database2 may have a degree of parallelism of 4.

The visual summarizes the remaining controls as follows:

• DFT_DEGREE (db level): 1-32767, -1. Defaults to 1 (no parallelism). Provides the default value for the CURRENT DEGREE special register and the DEGREE bind option. Configurable online.

• CURRENT DEGREE (special register): 1-32767, -1. Sets the degree of parallelism for dynamic SQL. Defaults to DFT_DEGREE.

• DEGREE (precompile or bind option): 1-32767, -1. Sets the degree of parallelism for static SQL. Defaults to DFT_DEGREE. To change it: PREP STATIC.SQL DEGREE 2.

• RUNTIME DEGREE (SET RUNTIME DEGREE command): 1-32767, -1. Sets the degree of parallelism for running applications. To change it: SET RUNTIME DEGREE FOR (100) TO 4. This only affects queries issued after SET RUNTIME DEGREE is executed.

• DB2DEGREE (CLI keyword): 0-32767, -1. Defaults to 0. Sets the degree of parallelism for CLI applications; the CLI layer issues a SET CURRENT DEGREE statement after the database connection is established.
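For instance, the database-level default degree could be raised for a reporting database as follows (the database name sample and the value are illustrative; DFT_DEGREE is configurable online):

db2 update db cfg for sample using DFT_DEGREE 4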

The following are options to control the degree of parallelism in static and dynamic SQL:

• CURRENT DEGREE - Allowed values 1 - 32767, -1. The default is the DFT_DEGREE setting. This sets the degree of parallelism which is used when executing dynamic SQL. The value can be changed for different SQL statements within a dynamic SQL program. To change its value, use:

SET CURRENT DEGREE='X' where X is between 1 and 32767.

• DEGREE BIND OPTION - Allowed values 1 - 32767, ANY. The default is the DFT_DEGREE setting. This sets the degree of parallelism for all static SQL statements within the package being bound to the database. To change its value, use:

For prep and bind in one step:

prep program.sqc degree X

Or, for prep and bind in two steps:

prep program.sqc bindfile
bind program.bnd degree X

where X is between 1 and 32767, or ANY.

One important point to consider is that INTRA_PARALLEL must be set to YES in order for the optimizer to break sections into SSPs and distribute them among the processors. Also, the degree of parallelism for a package, dynamic SQL statement, or executing package can be limited by the MAX_QUERYDEGREE dbm parameter.

This parameter affects the degree of parallelism for active applications:

• SET RUNTIME DEGREE - Allowed values: 1 - 32767, -1.

Sets the maximum degree of parallelism for an active application. If an active application is executing with a degree of parallelism less than this setting, nothing happens. If the application has a degree of parallelism higher than this setting, the application will have its degree of parallelism reduced for all queries within the application. This does not affect currently executing queries, and will only affect queries in the application started after the SET RUNTIME DEGREE command has been executed. If a SYSADM or DBADM user notices performance degradation due to an application using a degree of parallelism inappropriate for the environment, they can set this value to reduce the number of SSPs (and thus DB2 UDB agents) created by the optimizer for each query within the application. To use this option, you must specify the application ID of the application on which to reduce the degree of parallelism. For example:

LIST APPLICATIONS
SET RUNTIME DEGREE FOR (350) TO 4


This parameter affects the degree of parallelism for CLI applications:

DB2DEGREE keyword in db2cli.ini. If the value specified is anything other than 0 (the default), then DB2 CLI will issue the following SQL statement after a successful connection: SET CURRENT DEGREE value. This specifies the degree of parallelism for the execution of the SQL statements. The database manager will determine the degree of parallelism if you specify -1.
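The keyword is placed in the db2cli.ini file under the section for the data source. A minimal sketch (the database alias SAMPLE and the value 4 are illustrative):

[SAMPLE]
DB2DEGREE=4

With this setting, DB2 CLI issues the corresponding SET CURRENT DEGREE statement after each successful connection to SAMPLE.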


Figure 5-12. Which Degree of Parallelism Is Used? CF457.3

Notes:

As we have shown, there are a number of parameters which can influence the degree of parallelism used when executing SQL queries. The actual degree of parallelism used is:

• 1, if INTRA_PARALLEL is set to NO

• If INTRA_PARALLEL is set to YES, the lowest of:
  - MAX_QUERYDEGREE
  - DFT_DEGREE
  - CURRENT DEGREE (dynamic SQL)
  - DEGREE bind option (static SQL)
  - RUNTIME DEGREE (if set)
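A worked example (the values are illustrative): with INTRA_PARALLEL=YES, MAX_QUERYDEGREE=4, and DFT_DEGREE=-1 (ANY), an application that issues

SET CURRENT DEGREE = '8'

will still have its dynamic queries limited to a degree of parallelism of 4, because MAX_QUERYDEGREE caps the value for the whole instance.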

For a partitioned database environment, the DB2 UDB optimizer makes some basic assumptions and adopts some basic rules. The optimizer assumes that the database partitions are roughly equal. Since it cannot determine the exact configuration (number of CPUs, memory, I/O speed, buffered settings, and so on) for each system, it assumes that they are all the same. DB2 UDB uses the coordinator partition as the model for the system, and uses a degree of parallelism based on the configuration of the coordinator partition.


Figure 5-13. Intra-Partition Parallelism Recommendations CF457.3

Notes:

Intra-partition parallelism is most beneficial in workloads with long-running, complex queries. Intra-partition parallelism can actually hurt OLTP workloads, so it is recommended to set INTRA_PARALLEL to NO if the workload is OLTP.

For a mixed OLTP/DSS workload, set INTRA_PARALLEL to YES, but set the DFT_DEGREE database parameter to 1. For the DSS queries, the SET CURRENT DEGREE statement can then be used to set a degree of parallelism greater than 1, as shown in the sketch below.
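A minimal configuration sketch for such a mixed workload (the database name sample and the degree value are illustrative):

db2 update dbm cfg using INTRA_PARALLEL YES
db2 update db cfg for sample using DFT_DEGREE 1

Then, in the DSS application or script, before the complex query is issued:

SET CURRENT DEGREE = '4'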

Parallelism load is the number of active users multiplied by the degree of parallelism (for example, if you have 50 active users with a degree of parallelism of 4, the parallelism load is 200). If the parallelism load is much larger than the number of CPUs available, excess context switching will burn up CPU cycles. If only 4 CPUs are available, then, using the previous example, there could potentially be up to 200 processes in the run queue but only 4 CPUs to work off the workload. The recommendation is that over-subscription can be beneficial, up to a point: a parallelism load of between 1.5 and 2.0 times the number of available CPUs seems to work well.


If the number of active users is more than twice the number of CPUs, leave INTRA_PARALLEL set to NO, even in a DSS environment.

The INTRA_PARALLEL setting should be enabled or disabled depending on the typical workload and the number of users. For complex SQL with relatively few users (OLAP/DSS), enable intra-partition parallelism. For relatively simple SQL with a large number of concurrent queries (OLTP), do not enable intra-partition parallelism.


Figure 5-14. Process Model CF457.3

Notes:

The above graphic illustrates an overview of the process model.

Coordinating Agent

Process name - db2agent

This process or thread handles all requests from a user or application, including SQL and DB2 UDB commands. In a partitioned database environment, there will only be one coordinating agent for an application, and it will exist at the partition where the user connected. The maximum number of coordinating agents which can exist on an SMP computer or at a single partition in a partitioned database is controlled by the MAX_COORDAGENTS instance-level configuration parameter.

Subagents

Process name - db2agntp

These processes handle all the parallel database tasks issued by a coordinating agent. In a partitioned database environment, they are also known as parallel agents. A pool of subagents can be created at the startup of an instance.


Table Queue

A table queue is an area of memory used to pass data between agents (coordinating or subagents). Often table queues are used as pipes, that is, some agents may be writing to a table queue while others are reading from it simultaneously.


Figure 5-15. Process Model on SMP CF457.3

Notes:

The process model used by all DB2 Universal Databases represents the communication that occurs between database servers and client and local applications. It ensures that database applications are isolated from critical database resources such as database control blocks and critical database files. In Intel operating systems, such as Windows NT or 2000, communication occurs via threads; in UNIX-based systems, via processes. For each database being accessed, various processes or threads are started to deal with the various database tasks (for example, logging). There is also a one-to-one mapping of the processes or threads of client applications to coordinating agents that operate on a database. A coordinating agent works on behalf of an application, and communicates to other agents or subagents using Inter-Process Communication (IPC) techniques or remote communication protocols. The firewall is used to protect the database and the database manager from applications, unfenced stored procedures, and user-defined functions (UDFs). A firewall maintains the integrity of data in the databases, because an application programming error cannot overwrite an internal buffer or file of the database manager. It also improves the reliability of the database manager, because an application programming error cannot crash the database manager.


When client programs connect, a remote listener (or IPC listener) allocates a coordinating agent (db2agent) to represent the user's application within the database manager. The coordinating agent will perform any work, or arrange another process or thread to perform the work on its behalf. All data and return code information is passed back to the user's application via the coordinating agent. There is a pool of spare or idle agents kept by DB2 UDB to allow for the quick allocation of new agents for an application. When SQL requests are passed to the coordinating agent which involve parallel pieces, the coordinating agent will pass the work out to a number of subagents (db2agntp) to perform this work on its behalf. As each subagent is a separate process or thread, they can be executing simultaneously if there are adequate CPUs in the machine to support the number of subagents.

The processes are visible on your machine, and also appear, for example, in the db2diag.log file. It can therefore be important to know which process is doing what. Here is a brief overview of the processes and their purposes (an example of how to display them follows the lists below):

Per instance, no connection no active databases:

• db2agent is a coordinating agent which coordinates the work on behalf on an application and communicates to other agents using interprocess communication (IPC).

• db2agntp are subagents which perform the requests for the application (when intra_parallel dbm cfg is enabled) on behalf of the coordinating agent.

• db2cart determines when a log file can be archived and invokes the user exit. Available on all platforms and one per instance (if at least one database has USEREXIT enabled).

• db2chkau is used by the DB2 audit facility to log entries to the Audit log. Available on all platforms.

• db2ckpw checks the userid and password on the server. Available on UNIX or Linux only.

• db2disp is the agent dispatcher process; it dispatches connections between the logical agents assigned to applications and the available coordinating agents when the connection concentrator is enabled. Available on all platforms.

• db2fcmd is the Fast Communication Manager daemon for handling interpartition communication (one per server, per partition). Available in a multipartitioned environment only.

• db2fmcd is the Fault Monitor Coordinator daemon process. One per physical machine. Available on UNIX only.

• db2fmd is the Fault Monitor daemon process that is started for every instance of DB2 monitored by the fault monitor. Available on UNIX only.

• db2fmtlg preallocates log files in the log path when LOGRETAIN is ON and USEREXIT is OFF. Available on all platforms.


• db2gds is the DB2 Global Daemon Spawner process that starts all DB2 EDUs on UNIX. One per instance or database partition. Available on UNIX only.

• db2glock is the global deadlock detector. Coordinates information gathered from the db2dlock process on each database partition to check for deadlock conditions. Runs on the catalog partition of a partitioned database only.

• db2govd is the DB2 Governor. Available on all platforms.

• db2panic is the panic agent. It handles urgent requests after agent limits have been reached on any of the database partitions. Available in a partitioned database environment only.

• db2pbdc is the Parallel Database Controller (PDB) and handles parallel requests from remote nodes. Available in partitioned database environment only.

• db2rebal is the rebalancer process, and is called when containers are added to an existing table space and a rebalance of the existing data is required. Available on all platforms.

• db2resyn is the resync manager process used to support applications that are using two-phase commit. Available on all platforms.

• db2srvlst is used to manage lists of addresses for systems such as OS/390 or z/OS. Available on all platforms.

• db2sysc is the main DB2 system controller or engine. Without this process, the database server cannot function. Available on all platforms.

• db2syslog is the system logger process, and writes to the operating system error log facility. On UNIX, this must be enabled by editing the syslog.conf file. On Windows, DB2 will automatically write the Windows event log. Available on all platforms.

• db2wdog is the DB2 watchdog. This is required since processes in UNIX can only track their parent process ID. Each time a new process is started, the db2gds notifies the DB2 watchdog. Available on UNIX only.

Per instance and per connection:

• db2agent is a coordinating agent which coordinates the work on behalf on an application and communicates to other agents using interprocess communication (IPC).

• db2agntp are subagents which perform the requests for the application (when intra_parallel dbm cfg is enabled) on behalf of the coordinating agent.

• db2agentg is the gateway agent for DRDA Application Requesters. Available on all platforms.

• db2agnsc is the parallel recovery agent used during roll-forward and restart recovery to perform actions from the logs in parallel. Available on all platforms.

• db2agnta is an idle subagent. Available on all platforms.


• db2ipccm is the IPC communication manager. One per database partition. This is the interprocess communication listener for local client connections. Available on all platforms.

• db2tcpcm is the TCP communication manager and works as a communication listener for TCP/IP connection requests. Available on all platforms.

• db2tcpdm is the communication listener for TCP/IP discovery requests. Available on all platforms.

Per Instance and per active database:

• db2dlock is the local deadlock detector, one per database partition. Available on all platforms.

• db2estor is used to copy pages between the database buffer pools and extended storage. Available on all platforms.

• db2event is the event monitor process, one per active event monitor. Available on all platforms.

• db2loggr is the database log reader and reads the logs during transaction processing, restart recovery, and roll-forward operations. Available on all platforms.

• db2loggw is the database log writer and flushes log records from the log buffer to the log files. Available on all platforms.

• db2logts is used for collecting historical information about which logs are active when a tablespace is modified. Available on all platforms.

• db2pclnr is the buffer pool page cleaner. Available on all platforms.

• db2pfchr is the buffer pool prefetcher. Available on all platforms.

Other processes:

• db2bm is the backup/restore buffer manipulator. Available on all platforms.

• db2fmp is used for fenced processes to run user code on the server outside a firewall. Available on all platforms.

• db2lbs is the LOAD LOB scanner. Available on all platforms.

• db2lbmX is the LOAD buffer manipulator. The X indicates one or more. Available on all platforms.

• db2frmx is the LOAD formatter process, where the X indicates one or more. Available on all platforms.

• db2lfs are the processes used when the table being loaded has LONG VARCHAR columns. Available on all platforms.

• db2lmr is the LOAD Media Reader process which reads the LOAD input file. Available on all platforms.


• db2lmwX are the LOAD media writer processes; the X represents one or more. These processes are used with the COPY YES option. Available on all platforms.

• db2lrid performs the index sort and builds the RID during the load. Available on all platforms.

• db2ltsc is the LOAD table scanner and scans the data object for the table being loaded during a load append operation. Available on all platforms.

• db2linit is the LOAD initialization subagent which acquires the resources required on the database partitions and serializes the reply back to the load catalog subagent. Available on partitioned databases only.

• db2lcata is the LOAD catalog subagent which is executed on the catalog partition, and is responsible for spawning the initialization subagents, processing their replies, and storing the lock information at the catalog partition. Available on partitioned databases only.

• db2lpprt is the load pre-partition subagent, and partitions the input data into multiple output streams, one for each partition. Available on partitioned databases only.

• db2lpart is the load partition subagent; it partitions the input data into multiple output streams, one for each partition. Available on partitioned databases only.

• db2lmibm is the load mini buffer manipulator. Available on partitioned databases only.

• db2lload is the load subagent process, and is responsible for carrying out the loading on each database partition. Available on partitioned databases only.

• db2lrdfl is the load read-file subagent which reads the message file on a partition. Available on partitioned databases only.

• db2llqcl is the load query cleanup subagent which removes all load temporary files from a given partition. Available on partitioned databases only.

• db2lmitk is the load ini-task subagent which frees all LOB locators used in a load. Available on partitioned databases only.

• db2lurex is the load user-exit subagent. Available on partitioned databases only.

• db2lmctk is the process used to hold, release, or downgrade locks held on the catalog partition. Available on partitioned databases only.

• db2med is the process reading from and/or writing to the database table spaces for LOAD, backup, and restore. Available on all platforms.

• db2reorg is the process used to perform online in-place reorg. Available on all platforms.

Some commonly-used executables:

• db2 is the Command Line Processor foreground process. Available on all platforms.


• db2bp is the persistent background process for the Command Line Processor. Available on all platforms.

• db2cmd is similar to db2 but for Windows. Available on Windows only.

• db2start2 is the db2start program. Available on all platforms.

• db2stop2 is the db2stop program. Available on all platforms.

Other Windows services or processes:

• db2dasrm.exe is the Admin Server process.

• db2dasstm.exe is the Admin Server tools DB manager process.

• db2fmp.exe handles or executes all fenced stored procedures and UDFs.

• db2rcmd.exe is the DB2 Remote Command Service (inter-partition administration communications)

• db2jds.exe is the JDBC applet server service.

• db2licd.exe is the license daemon.

• db2sec.exe is used to check the userid and password.

• db2syscs.exe is the main DB2 system controller or engine.


Figure 5-16. Agent States CF457.3

Notes:

Database agents are Engine Dispatchable Unit (EDU) processes or threads. Database agents do the work in the Database Manager that applications request.

As we have discussed previously, when DB2 UDB receives a connection request, it assigns a coordinating agent to handle the requests for the connection. When the coordinating agent subsequently receives an SQL query to be executed, the coordinating agent passes the package to a subagent for execution. In the event that the degree of parallelism is greater than 1 and the optimizer has decided to break the section into two or more SSPs, the coordinating agent then passes each of the SSPs to a subagent for execution.

Each of the subagents is either a process or a thread, depending on the operating system. In the case of UNIX, they are processes; for Windows environments, they are threads. For every process or thread executing on the system, there is overhead in terms of memory and CPU. There is also some overhead involved in starting a new process or thread. This overhead is more for a process than for a thread, but it is not negligible in either case.

The Agent States figure shows a pool of idle agents serving Applications 1, 2, and 3, and lists the four agent states: Coordinating, Active, Idle Associated, and Idle Unassociated.


To help overcome the overhead of starting the process or thread for a DB2 UDB subagent, DB2 UDB maintains a "pool" of already started subagents. These agents can be in one of four states:

1. A coordinating agent: An agent which has been assigned to handle the requests for a particular connection.

2. An active and associated subagent: A subagent which is currently executing a query or SSP for a coordinating agent.

3. An idle and associated subagent: A subagent which has executed a query or SSP for a coordinating agent in the past but is not currently executing. The agent maintains data about the last coordinating agent it "worked" for.

4. An idle and unassociated subagent: A subagent which has not executed a query or SSP for a coordinating agent in the past, or the connection has been reset for the last coordinating agent it serviced.


Figure 5-17. Assigning Subagents CF457.3

Notes:

The coordinating agent has a preset method for determining which subagent it will use to execute on its behalf. We will explain the method used by examining Application 2 in the above diagram:

• Look in the pool for subagents which have been used by Application 2 previously. These are called idle and associated subagents. They require the least amount of processing to be switched to active for Application 2.

• Look in the pool for subagents which are idle and unassociated. These are subagents not currently in use by an application, and have either never been used, or the last application to use them has now disconnected.

• Steal an idle subagent which is associated to another application. This saves having to create a new process or thread, but the subagent has to be initialized to Application 2.

• If we have not reached maxagents, then create a new subagent to be used by Application 2.

The Assigning Subagents figure shows the idle agent pool: NUM_INITAGENTS idle agents are created at db2start, and NUM_POOLAGENTS sets the maximum size of the idle agent pool. The steps shown for assigning a subagent are: (1) try agents that are idle and associated to your application; (2) try idle and unassociated agents; (3) if MAXAGENTS has not been reached, create a new subagent; (4) steal a subagent that is idle and associated to another application.


When the subagent has completed processing the request for the coordinating agent, it is returned to the pool or terminated. The subagent is:

• Freed and marked as idle, if the maximum number of idle agents has not been reached. It remains in the pool as idle and is associated to Application 2.

• Terminated and its storage freed if the maximum number of idle agents has been reached.

For coordinating agents, when a client disconnects from (or detaches from) a database, the coordinating agent is:

• Freed and marked as idle if the maximum number of idle agents has not been reached.

• Terminated and its storage freed if the maximum number of idle agents has been reached.


Notes:

• Degree of parallelism - The number of subagents which will be used to process a request for a coordinating agent is determined by the degree of parallelism of the application. The degree of parallelism controls the number of SSPs the queries will be broken down into, and therefore, controls the number of subagents required to execute the query.

• MAXCAGENTS - Database Manager Configuration. Limits the number of currently active concurrent agents for the DB2 UDB instance. In most cases, the default value of this parameter will be acceptable.

• MAXAGENTS - Database Manager Configuration. Limits the total number of agents (whether coordinator agents or subagents) which can be created within the DB2 UDB instance. The value of maxagents should be at least the sum of the values for maxappls in each database allowed to be accessed concurrently.

• MAX_COORDAGENTS - Database Manager Configuration. Limits the total number of coordinating agents for the DB2 UDB instance (active plus inactive). For partitioned database environments and environments in which intra_parallel is set to yes, the default is maxagents minus num_initagents. Otherwise the default is maxagents.

• NUM_INITAGENTS - Database Manager Configuration. Specifies the initial number of idle agents to create within the agent pool at db2start time.

• NUM_POOLAGENTS - Database Manager Configuration. Specifies the number of agents in the pool. When the concentrator is off, that is, when max_connections is equal to max_coordagents, this parameter determines the maximum size of the idle agent pool. Idle agents can be used as parallel subagents or as coordinator agents. If more agents are created than is indicated by the value of this parameter, they will be terminated when they finish executing the current request. When the concentrator is on, that is, when max_connections is greater than max_coordagents, this parameter will be used as a guideline for how large the agent pool will be when the system workload is low. An agent will always be returned to the pool, no matter what the value of this parameter is.

Recommendation: In a decision support environment in which few applications connect concurrently, set num_poolagents to a small value to avoid having an agent pool that is full of idle agents. In a transaction processing environment in which many applications are concurrently connected, increase the value to avoid the cost associated with the frequent creation and termination of agents.
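As a hedged illustration (the parameter values below are examples only, not recommendations), the agent-related settings could be changed with UPDATE DBM CFG and made effective by recycling the instance:

db2 update dbm cfg using NUM_INITAGENTS 5 NUM_POOLAGENTS 100 MAXAGENTS 400
db2stop
db2start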


Figure 5-19. Connection Concentrator CF457.3

Notes:

For Internet applications, for example, with many relatively transient connections, or for similar kinds of applications, the connection concentrator improves performance by allowing many more client connections to be processed efficiently. It also reduces memory use for each connection and decreases the number of context switches.

The Connection Concentrator is enabled when the value of MAX_CONNECTIONS is greater than the value of MAX_COORDAGENTS.

In an environment that requires many simultaneous user connections, you can enable the connection concentrator for more efficient use of system resources. This feature incorporates advantages formerly found only in DB2 connection pooling.

After the first connection, the connection concentrator reduces the connection time to a host. When disconnection from a host is requested, the connection with the client is dropped, but the connection to the database by the agent is kept in a pool. When a new request is made to connect to the database, DB2 tries to reuse an existing agent connection from the pool.
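As a sketch (the values are hypothetical), the concentrator could be turned on by making MAX_CONNECTIONS larger than MAX_COORDAGENTS:

db2 update dbm cfg using MAX_COORDAGENTS 100 MAX_CONNECTIONS 1000
db2stop
db2start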

The Connection Concentrator figure highlights that the concentrator supports a large number of concurrently connected users, introduces the new concept of logical agents and database agents, and uses an n:m architecture: many client connections (application control blocks with logical coordinator agents and logical subagents) are served by a smaller pool of coordinator and subagent worker agents through the listener, scheduler, wait queues, and agent pool.


Figure 5-20. Utility Parallelism for SMP Overview CF457.3

Notes:

DB2 UDB's utilities take advantage of SMP parallelism.

The LOAD utility supports the parsing and formatting of the input data in parallel. Also, the LOAD can use parallel I/O servers to write the data to the containers in parallel. LOAD takes advantage of both CPU and disk parallelism.

During index creation, the scanning and subsequent sorting of the data can occur in parallel. This will help to speed up index creation during a create index command, during restart (if an index is marked invalid), and also during reorg processing.

During backup and restore processing, DB2 UDB can also take advantage of running buffer manipulators in parallel to read/write data to or from the database. DB2 UDB backup/restore can take advantage of disk parallelism and also take advantage of multiple CPUs by assigning the buffer manipulators among the CPUs.

The Utility Parallelism for SMP Overview figure summarizes:

• Load: parallel parsing and formatting of records, and parallel I/O servers to write data to the containers

• Create index: parallel scanning and sorting of rows; also improves the REORG utility and restart for corrupt indexes

• Backup database/table space: parallel buffer manipulators to read data from table spaces

• Restore database/table space: parallel buffer manipulators to write data to table spaces


Figure 5-21. Load Parallelism CF457.3

Notes:

You do not have to have INTRA_PARALLEL = ON for the Load CPU_PARALLELISM option to be in effect.

The LOAD utility takes advantage of both CPU and disk parallelism to improve the speed of loading the data. The LOAD syntax allows you to specify both CPU and disk parallelism. The CPU_PARALLELISM parameter controls the number of subagents used for data parsing and sorting.

The load utility CPU_PARALLELISM n - Use this parameter to exploit intra-partition parallelism (if this is part of your machine’s capability) and significantly improve load performance. The parameter specifies the number of processes or threads used by the load utility to parse, convert, and format data records. The maximum number allowed is 30. If there is insufficient memory to support the specified value, the utility adjusts the value. If this parameter is not specified, the load utility selects a default value that is based on the number of CPUs on the system. Although use of this parameter is not restricted to symmetric multiprocessor (SMP) hardware, you may not obtain any discernible performance benefit from using it in a non-SMP environment. The CPU_PARALLELISM n parameter does not have any relation to the INTRA_PARALLEL parameter used for queries in an SMP-type environment.

The Load Parallelism figure shows the data flow: a media reader reads the input media into shared memory buffers (controlled by the number of buffers and buffer size), parsing and record-formatting subagents (controlled by CPU_PARALLELISM) assign record IDs, and container I/O processes (controlled by DISK_PARALLELISM) write the data to the table space containers, all under a coordinator agent.

The DISK_PARALLELISM n parameter specifies the number of processes or threads that the load utility will spawn for writing data to table space containers. If it is not specified, an intelligent default is chosen based on the number of table space containers and the characteristics of the table.

LOAD also takes advantage of disk parallelism on its input, as it can process input data from multiple locations.

If the ANYORDER modifier is used, then the preservation of source data order is not maintained when loading rows, yielding a significant additional performance benefit on SMP systems.
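A minimal example (the input file and table names are hypothetical) combining these options:

db2 load from /data/trans.del of del modified by anyorder insert into payroll.trans cpu_parallelism 4 disk_parallelism 2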


Figure 5-22. Index Create Parallelism CF457.3

Notes:

When creating an index on SMP machines, DB2 UDB will consider using multiple scan and sort operators to gather the information to build an index. When used, these multiple operators will speed up the initial processing stages for the index creation, though the actual index creation later is performed using a single process. Speeding up index creation will also improve performance for the reorg utility, creation of primary and unique key constraints, and database restart commands where corrupt indexes are involved.

When creating an index, if INTRA_PARALLEL is YES, and the table is large enough to benefit from the use of multiple processes or threads, then the degree of parallelism used will be the number of CPUs plus one. The actual processing will be performed using a coordinating agent and subagents similar to an SQL request. The coordinator dispatches a number of subagents which scan the data in parallel. Each subagent performs an independent sort of the data it reads. The subagents then insert the data, in sorted order, into a sorted table queue. The sorted table queue consolidates the data from the subagents and gives it to the coordinator agent in sorted order. The coordinator agent then builds the index using the sorted data.

The Index Create Parallelism figure notes that INTRA_PARALLEL must be ON, that the table must be large enough to benefit from parallelism, and that reorg and the addition of primary or unique key constraints also benefit. It shows subagents (number of CPUs + 1) performing a parallel scan and sort of the table space containers into a table queue, from which the coordinating agent creates the index.


The degree of parallelism used by create index is not limited by the MAX_QUERYDEGREE configuration parameter. Also, the degree of parallelism is not affected by the DEGREE option of the BIND command or by the CURRENT DEGREE special register.
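For example (the database, table, and index names are hypothetical), intra-partition parallelism could be enabled before building a large index:

db2 update dbm cfg using INTRA_PARALLEL YES
db2stop
db2start
db2 connect to sample
db2 create index payroll.ix_empno on payroll.emp (empno)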


Figure 5-23. Backup Parallelism with SMP CF457.3

Notes:

During backup processing, DB2 UDB can take advantage of running the buffer manipulators in parallel to read the data from disk into the memory buffers. The media writers can subsequently run in parallel, reading data from the memory buffers and writing to the backup devices if the backup devices are not on the same device.

You do not have to have INTRA_PARALLEL=ON for backup parallelism to be enabled.

The PARALLELISM parameter defines the number of processes or threads that are started when reading data from the database. Each process or thread is assigned to a specific table space. When it finishes backing up this table space, it requests another.
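A hedged example (the database name, paths, and sizes are hypothetical) of a backup using two buffer manipulators and two target paths:

db2 backup db sample to /backup/path1, /backup/path2 with 4 buffers buffer 1024 parallelism 2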

The Backup Parallelism with SMP figure shows buffer manipulators reading from the table spaces and containers into shared memory buffers (controlled by the # BUFFERS and BUFFER size options) and media writers writing from the buffers to the backup devices (TO dev1, dev2, dev3), all under a coordinator agent. The PARALLELISM x option defaults to the degree of parallelism (DFT_DEGREE).


Figure 5-24. Restore Parallelism with SMP CF457.3

Notes:

During restore processing, DB2 UDB can take advantage of running the media readers in parallel if the backup is on multiple disks or devices. The buffer manipulators can subsequently be run in parallel reading from the memory buffers and writing to the database.

You do not have to have INTRA_PARALLEL=ON for restore parallelism to be enabled.

With the restore command, there is no affiliation between a table space and a thread or process, as in backup. Parallelism should be no greater than the number of source images. The buffer size must be a multiple of the buffer size used during backup.
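A hedged example (the database name, paths, and timestamp are hypothetical) of the corresponding restore:

db2 restore db sample from /backup/path1, /backup/path2 taken at 20041115120000 with 4 buffers buffer 1024 parallelism 2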

The Restore Parallelism with SMP figure shows media readers reading from the backup devices (FROM dev1, dev2, dev3) into shared memory buffers (controlled by the # BUFFERS and BUFFER size options) and buffer manipulators writing from the buffers to the table spaces and containers, all under a coordinator agent. The PARALLELISM x option defaults to the degree of parallelism (DFT_DEGREE).


Checkpoint

Exercise — Unit Checkpoint

1. Parallelism is only applicable to those with a partitioned database. True or False?

___________________________________________________

2. The database configuration parameter for the degree of parallelism is DFT_DEGREE. If DFT_DEGREE is set to 2, which of the following statements is true?

a. By default, the degree of parallelism will be two or more subsection pieces. For example, a query could be broken into four subsection pieces for parallel processing.

b. By default, the degree of parallelism will be limited to two subsection pieces. For example, a query might not have any subsections for parallel processing.

c. By default, the degree of parallelism will always be 2. For example, a query will always be divided into two subsection pieces.

___________________________________________________

3. If the DFT_DEGREE database configuration parameter is set to 2 for one database you are connected to, and to ANY for another database you are connected to, then which statement is true?

a. By default, queries to the first database will have a degree of parallelism of 2, and queries to the second database will have a degree of parallelism determined by DB2 UDB.

b. By default, queries to either database will have a degree of parallelism of 2.

c. By default, queries to either database will have a degree of parallelism determined by DB2 UDB.

___________________________________________________


Figure 5-25. Unit Summary CF457.3

Notes:

Having completed this unit, you should be able to:

• Describe the parallel options for: SQL processing, backup and restore, the Load utility, and create index processing

• Describe how the DB2 process model supports the use of parallelism

• List the benefits of the parallel options when used in SMP or MPP environments

• Configure DB2 to use the parallel features


Unit 6. Advanced Utility Topics

What This Unit Is About

This unit focuses on some additional utilities that may be of use to you with your database server. Specifically, we will look at a tool to help you move large numbers of tables from one database to another (db2move), a tool which will help you build a customized catalog for ODBC applications (db2ocat), and a tool that will allow you to estimate the storage requirements for your data as it grows (the Estimate Size GUI). We will also look at the use of db2look for DDL extraction, db2cfexp and db2cfimp to export and import configuration profiles, the changing of DMS containers, and the high availability monitor.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe how to use db2move
• Describe how to use db2look
• Describe how to use Estimate Size GUI
• Describe the rebalancing container functionality
• Describe the High Availability Monitor
• Describe db2look, db2cfexp, db2cfimp

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Command Reference


Figure 6-1. Unit Objectives CF457.3

Notes:

After completing this unit, you should be able to:

• Describe how to use db2move

• Describe how to use db2ocat

• Describe how to use Estimate Size GUI

• Describe the rebalancing container functionality

• Describe the High Availability Monitor

• Describe db2look, db2cfexp, db2cfimp


6.1 Using db2move


Figure 6-2. How db2move Works CF457.3

Notes:

db2move is a tool that can help move large numbers of tables between DB2 UDB databases that are located on workstations.

db2move queries the system catalog tables for a particular database and compiles a list of all user tables. The tool then exports these tables in PC/IXF format. The PC/IXF files can be imported or loaded to another local DB2 UDB database on the same system, or can be transferred to another workstation platform and imported or loaded to a DB2 UDB database on that platform.

Note: Tables with structured type columns are not moved when this tool is used.

db2move calls the DB2 UDB export, import, and load APIs depending on the action that is requested by the user. Therefore, the requesting user ID must have the correct authorization required by the APIs or the request will fail. Also, db2move inherits the limitations and restrictions of the APIs.

The How db2move Works figure shows user tables being exported from one finance database with db2move finance export and imported into another finance database with db2move finance import.


Figure 6-3. db2move Syntax CF457.3

Notes:

Command parameters:

dbname Name of the database

action Must be EXPORT, IMPORT, or LOAD

-tc Table creators, used for export only. The default is all creators. When specifying multiple creators, each must be separated by commas and no blanks are allowed between the IDs. The maximum number of creators that can be specified is 10. An asterisk (*) can be used as a wildcard that can be placed anywhere in the string. This option can be used together with the -tn option.

-tn Table names, used for export only. The default is all user tables. When specifying multiple table names, each must be separated by a comma; no blanks are allowed between the table names. The maximum number of table names is 10. This option can be used together with the -tc option. An asterisk (*) can be used as a wildcard that can be placed anywhere in the string.

db2move dbname action [-tc table-creators] [-tn table-names] [-sn schema-names] [-ts tablespace-names] [-tf file-name] [-io import-option] [-lo load-option] [-l lobpaths] [-u userid] [-p password] [-aw]


-sn Schema names. The default is all schemas. If specified, only those tables whose schema names match exactly will be exported. You can use the wildcard asterisk (*) in the schema names; it will be changed to a percent sign (%) and used in the LIKE predicate of the WHERE clause. The schema names must be separated by commas and no blanks are allowed. Here, too, the maximum number of schema names is 10. Note: Schema names of fewer than eight characters are padded to eight characters in length. So, for example, for schema name ‘fred’ you will have to specify -sn fr*d* instead of -sn fr*d when using the asterisk.

-ts Table space names. The default is all table spaces. This is an EXPORT action only. If this option is specified, only those tables that reside in the specified table space will be exported. If the asterisk wildcard character (*) is used in the table space name, it will be changed to a percent sign (%) and the table space name (with the percent sign) will be used in the LIKE predicate in the WHERE clause. If the -ts option is not specified, the default is to use all table spaces. If multiple table space names are specified, they must be separated by commas; no blanks are allowed between table space names. The maximum number of table space names that can be specified is 10. Note: Table space names of fewer than eight characters are padded to eight characters in length. So, for example, for table space name ‘mytb’ you will have to specify -ts my*b* instead of -ts my*b when using the asterisk.

-tf Filename. This is an EXPORT action only. If specified, only the tables listed in the given file will be exported. The tables should be listed one per line and each table should be fully qualified.

-io Import option. REPLACE_CREATE is the default. Valid options are: INSERT, INSERT_UPDATE, REPLACE, CREATE, and REPLACE_CREATE.

-lo Load option. INSERT is the default. Valid options are INSERT or REPLACE.

-l lobpaths. The current directory is the default. This option specifies the absolute path names where LOB files are created (for EXPORT) or searched for (for IMPORT or LOAD).

-u Userid. The default is the userid with which you are logged on. Both userid and password are optional. However, if one is specified the other must be specified too.

-p Password. The default is the password you used to log on.

-aw Allow warnings. When -aw is not specified, tables that experience warnings during export are not included in the db2move.lst file (although that table’s .ixf file and .msg file are still generated). In some scenarios (such as data truncation) the user may wish to allow such tables to be included in the db2move.lst file. Specifying this option allows tables which receive warnings during export to be included in the .lst file.


Examples:

• db2move sample export

This will export all tables in sample; the defaults are used for all options.

• db2move sample export -tc userid1,us*rid2 -tn tbname1,*tbname2

This will export all tables created by userid1 or user IDs LIKE _%rid2; and table name is tbname1 or table names LIKE %tbname2.

• db2move sample import -l D:\LOBPATH1,C:\LOBPATH2

This example is applicable for Windows operating systems only. This will import all tables in sample; any LOB files are to be searched for using lobpaths D:\LOBPATH1 and C:\LOBPATH2.

• db2move sample load -l /home/userid/lobpath,/tmp

This example is applicable for UNIX-based platforms only. This will load all tables in sample; any LOB files are to be searched for using the lobpath subdirectory in the userid subdirectory of the home directory or in the tmp subdirectory.

• db2move sample import -io replace -u userid -p password

This will import all tables in sample in REPLACE mode; the userid and password are used.

Usage notes:

1. This tool exports, imports, or loads user-created tables. If you want to duplicate a database from one platform to another platform, db2move only helps you to move the tables. You need to consider moving all other objects associated with the tables, such as: aliases, views, triggers, user-defined functions, and so on.

db2look can help you move some of these objects by extracting the data definition language (DDL) statements from the database.

If the import utility with the REPLACE_CREATE option is used to create the tables on the target database, then the limitations outlined in using import to recreate an exported table are imposed. If unexpected errors are encountered during the db2move import phase when the REPLACE_CREATE option is used, examine the appropriate tabnnn.msg message file and consider that the errors might be the result of the limitations on table creation.

2. When EXPORT, IMPORT, or LOAD APIs are called by db2move, the FileTypeMod parameter is set to lobsinfile. That is, LOB data is kept in separate files from PC/IXF files. There are 26,000 file names available for LOB files.

3. LOAD action must be run locally on the machine where the database and data file reside. When the LOAD API is called by db2move, the CopyTargetList parameter is set to NULL. That is, no copying is done. If logretain is on, the LOAD cannot be rolled forward later on. The table space where the loaded tables reside is placed in backup pending state and is not accessible. A full database backup or a table space backup is required to take the table space out of the backup pending state.

The db2move LOAD action is not supported in DB2 Universal Database partitioned databases.

Note: db2move import performance may be improved by altering the default buffer pool, IBMDEFAULTBP, and by updating the configuration parameters sortheap, util_heap_sz, logfilsiz, and logprimary. Please refer to the Administration Guide: Performance for detailed information.

Notes when using EXPORT:

• Input: None.

• Output:

EXPORT.out The summarized result of the EXPORT action.

db2move.lst The list of original table names, their corresponding PC/IXF file names (tabnnn.ixf), and message file names (tabnnn.msg). This list, the exported PC/IXF files, and LOB files (tabnnnc.yyy) are used as input to the db2move IMPORT or LOAD action.

tabnnn.ixf The exported PC/IXF file of a table.

tabnnn.msg The export message file of the corresponding table.

tabnnnc.yyy The exported LOB files of a table.

These files are created only if the table being exported contains LOB data. If created, these LOB files are placed in the lobpath directories. There are a total of 26,000 possible names for the LOB files.

system.msg The message file containing system messages for creating or deleting file or directory commands. This is only used if the action is EXPORT and a lobpath is specified.

Notes when using IMPORT:

• Input:

db2move.lst An output file from the EXPORT action.

tabnnn.ixf An output file from the EXPORT action.

tabnnnc.yyy An output file from the EXPORT action.

• Output:

IMPORT.out The summarized result of the IMPORT action.

tabnnn.msg The import message file of the corresponding table.


Notes when using LOAD:

• Input:

db2move.lst An output file from the EXPORT action.

tabnnn.ixf An output file from the EXPORT action.

tabnnnc.yyy An output file from the EXPORT action.

• Output:

LOAD.out The summarized result of the LOAD action.

tabnnn.msg The LOAD message file of the corresponding table.


Figure 6-4. db2look CF457.3

Notes:

db2look is used to extract the DDL statements needed to reproduce database objects, for example to reproduce the objects of a production database on a test database. You can also generate the UPDATE statements to replicate the statistics on the objects in a test database, as well as the statements to update the database configuration and database manager configuration parameters and the db2set statements, so that registry variables and configuration parameter settings on the test database match those of the production database.

The authorization required is SELECT privilege on the system catalogs. db2look establishes a database connection.

Note: The DDL generated might not exactly reproduce all characteristics of the original SQL objects. Check the DDL generated by db2look.

In some cases, such as generating table space container DDL (which calls the APIs sqlbotcq, sqlbftcq, and sqlbctcq), you will require one of the following: SYSADM, SYSCTRL, SYSMAINT, or DBADM.

No connection to the database is needed; it will be established by the command.

db2look -d DBname [-e] [-u Creator] [-z Schema] [-t Tname(s)] [-tw Tname] [-v Vname(s)] [-h] [-o Fname] [-a] [-m] [-c] [-r] [-l] [-x] [-xd] [-f] [-s] [-g] [-p] [-td delimiter] [-noview] [-i userid] [-w password] [-wrapper Wname] [-server Sname] [-nofed]


Command parameters:

-d DBname Alias name of the database that is to be queried. DBname can be the name of a DB2 UDB for UNIX, Windows, or OS/390 and z/OS database. If it is an OS/390 or z/OS database, db2look will extract the DDL and UPDATE statistics statements for the relevant object. These statements are applicable to DB2 UDB on UNIX or Windows and not to DB2 for OS/390 or z/OS. This is useful to extract OS/390 or z/OS objects and recreate them in a DB2 UDB for UNIX or Windows. If DBname is an OS/390 or z/OS database, then the output is limited to the following:

• Generate DDL for tables, indexes, views, and user-defined distinct types

• Generate UPDATE statistics statements for tables, columns, column distributions, and indexes

-e Extract DDL statements for the following database objects:

• Tables • Views • Materialized Query Tables (MQT) • Aliases • Indexes • Triggers • Sequences • User-defined distinct types • Primary key, referential integrity, and check constraints • User-defined structured types, functions, methods, and transforms

Note: DDL can be used to recreate user-defined functions successfully. However the user source code that the function references (the EXTERNAL NAME) must be available in order for the function to be usable.

• Wrappers • Servers • User mappings • Nicknames • Type mappings • Function templates • Function mappings • Index specifications • Stored procedures

-u Creator Limits output to objects with this creator ID. If option -a is specified, this parameter is ignored. If neither -u nor -a is specified, the environment variable USER is used.

-z Schema Limits the output to objects with this schema name. If option -a is specified, this parameter is ignored. If this parameter is not specified, objects with all schema names are extracted.


-t Tname1 Tname2 .... TnameN Limits the output to particular tables in the table list. The maximum number of tables is 30. Table names are separated by a blank space.

-tw Tname Generates DDL for table names that match the pattern criteria specified by Tname and all dependent objects of all returned tables. The underscore (_) represents any single character, the percent (%) represents a strip of zero or more characters. If -tw is specified, then the -t option is ignored.

-v Vname1 Vname2 .... Vname N. Same function as -t Tname, but used for views.

-h Help Option

-o Fname Filename to be used for the output (plain text with .txt, otherwise filename.sql)

-a With this option, the output is not limited to the objects created under a particular creator ID. All objects created by all users are considered.

-m Generates the UPDATE statements to replicate the statistics on tables, columns, and indexes. The -p, -g, and -s options are ignored.

• -c When specified in conjunction with -m, the COMMIT, CONNECT and CONNECT RESET statements are not generated by db2look.

• -r When specified in conjunction with -m, the RUNSTATS command is not generated by db2look

-l DDL for user-defined tables spaces, database partition groups, and buffer pools will be generated.

-x With this option, authorization DCL will be generated.

-xd This option generates all authorization DCL, including for objects whose authorizations were granted by SYSIBM at object creation time.

-f This option extracts the configuration parameters and registry variables that affect the query optimizer. (DBM CFG parameters like cpuspeed, intra_parallel, nodetype, federated; DB CFG parameters like locklist, dft_degree, maxlocks, avg_appls, stmtheap, and dft_queryopt)

-td delimiter Specifies the statement delimiter for SQL statements. The default is the semicolon (;). It is recommended to use this option if the -e option is specified (extracted objects might contain triggers or SQL routines).

-p This option will use plain text format.

-s Generates a PostScript file.

-g Uses a graph to show fetch page pairs for indexes. Note: this option generates a filename.ps file as well as a LaTeX file.

-noview With this option, no CREATE VIEW DDL will be extracted.


-i userid This option can be used when working with a remote database. Note: If working with a remote database, it must be the same version as the local database. The db2look utility does not have down-level or up-level support.

-w password Used in conjunction with the -i option.

-wrapper Wname Generates DDL statements for federated objects that apply to this wrapper.

-server Sname Generates DDL statements for federated objects that apply to this server.

-nofed Specifies that no federated DDL statements will be generated.

Example: Generate the DDL statements for objects created by user Melanie in the database DEPARTMENT. The output is sent to file db2look.sql:

db2look -d department -u Melanie -e -o db2look.sql

Generate the UPDATE statements to replicate the statistics for the tables and indexes created by user Melanie in the database DEPARTMENT. The output is sent to file db2look.sql:

db2look -d department -u Melanie -m -o db2look.sql

Generate the DDL statements for objects in the DEPARTMENT database with tables that have names beginning with ‘abc’ and send the output to the db2look.sql file:

db2look -d department -e -tw abc% -o db2look.sql

Note: db2look command line options can be specified in any order. All command line options are optional, except the -d option which must be followed by a valid database alias name.


6.2 db2ocat


Figure 6-5. db2ocat Utility CF457.3

Notes:

Many applications written using the ODBC or DB2 UDB CLI interfaces make heavy use of the system catalog. While this does not usually present a problem for databases with a small number of database objects (tables, views, and so forth) it can lead to performance problems when using these applications with larger DB2 UDB databases. This performance degradation can be attributed to two main factors: the amount of information that has to be returned to the calling application, and the length of time that locks are held on the catalog tables.

The DB2 UDB ODBC Catalog Optimizer Tool solves both problems by creating alternative catalog tables that are optimized for ODBC access by a particular application. This utility helps database administrators to identify the subset of catalog information that is needed for a particular application, and it creates an ODBC-optimized catalog that the application can use. As a result, fewer locks are placed on the base system catalog tables, and catalog query times can be reduced substantially, together with the amount of data returned as a result of such queries.

DB2 ODBC Catalog Optimizer Tool

• Creates optimized catalog tables for ODBC, CLI, and JDBC access

• Results in fewer locks on base system catalog tables

• Supported utility, available from ftp://ftp.software.ibm.com/ps/products/db2/tools/db2ocat.zip


db2ocat is a supported utility that we make available for download from ftp://ftp.software.ibm.com/ps/products/db2/tools/db2ocat.zip to all DB2 UDB Connect and DB2 UDB customers.


Figure 6-6. How db2ocat Works CF457.3

Notes:

• Step 1: Create ODBC-optimized catalogs on the host

To create an ODBC-optimized catalog, the db2ocat utility should be run by a DBA on a Windows 95, 98, or NT workstation. db2ocat provides a wizard that takes the DBA through a process of naming the catalog and specifying the tables and stored procedures that should be available to a particular application. Once the tables and stored procedures are identified, it creates 10 new tables on the target database server (for example, DB2 UDB for OS/390) that comprise an ODBC-optimized catalog that describes data objects (tables, views, procedures, and so forth) available to the application. These 10 tables have a qualifier (creator), that is, the name of the ODBC-optimized catalog.

This process needs to be repeated for each application that would benefit from an ODBC-optimized catalog.

Two-step process:

1. Create ODBC-optimized catalog(s) on the server

2. Configure each workstation to point to the right ODBC-optimized catalog(s) and keep the ODBC-optimized catalog(s) current


• Step 2: Configure each workstation to point to the right ODBC-optimized catalog(s)

By default, applications query the real DB2 UDB system catalog to obtain metadata. IBM DB2 UDB ODBC, CLI, and JDBC drivers provide a parameter that can be used to point applications to a different source of metadata that is optimized for access by the ODBC, DB2 UDB CLI, and JDBC applications. This parameter, CLISCHEMA, is set in the DB2CLI.INI file in the subdirectory where DB2 UDB products are installed (typically \sqllib). This parameter is specific to a Data Source Name (DSN), and applies only to the DSN for which it is set. Any application that uses a DSN for which CLISCHEMA has been set will obtain metadata from the ODBC-optimized catalog named by the CLISCHEMA.

The CLISCHEMA parameter can be set by manually editing the DB2CLI.INI file or by using the db2ocat tool at each of the user workstations. Because DB2CLI.INI is stored on each workstation, administrators should consider strategies for distributing DB2CLI.INI to users, especially if the number of users is large.

• Additional step: Keep the ODBC-optimized catalog(s) current

Because ODBC-optimized catalogs created with the db2ocat utility create a separate copy of the data extracted from the DB2 UDB system catalog (SYSIBM), it is important to have a procedure that will propagate future additions and changes in the system catalog to the ODBC-optimized catalogs. At this point in time, db2ocat provides a manual procedure that requires that, when catalog changes occur, a DBA starts the db2ocat utility on a Windows 95, 98, or NT workstation and presses the Refresh button for each ODBC-optimized catalog that may be affected by the changes.

Another option for keeping ODBC-optimized catalogs current is under development. This option will be available to DB2 UDB for OS/390 users who have IBM DataPropagator Relational Capture and Apply products installed on their host system. This DataPropagator option will allow DBAs to automatically replicate all changes in the real system catalog to all ODBC-optimized catalogs.


Figure 6-7. db2cfexp CF457.3

Notes:

db2cfexp exports connectivity configuration information to an export profile (configuration profile). The resulting profile will contain only configuration information associated with the current DB2 UDB instance. It is a non-interactive utility that packages all of the configuration information needed according to the export options specified. Items that can be exported are:

• Database information (including DCS and ODBC information)

• Node information

• Protocol information

• Database Manager Configuration settings

• UDB registry settings

• Common ODBC/CLI settings

db2cfexp filename TEMPLATE | BACKUP | MAINTAIN


This utility is especially useful for exporting connectivity configuration information to workstations that do not have the DB2 Configuration Assistant installed, and in situations where multiple similar remote clients are to be installed.

You need SYSADM or SYSCTRL authorization to use this utility.

filename Specifies the fully qualified name of the target export file (Configuration Profile)

TEMPLATE Creates a configuration profile that is used as a template for other instances of the same instance type. Information included is:

• All databases, including related ODBC and DCS information

• All nodes associated with the exported databases

• Common ODBC/CLI settings

• Common client settings in the Database Manager Configuration

• Common clients settings in the DB2 UDB registry

BACKUP Creates a configuration profile of the DB2 UDB instance for backup purposes. This file contains all of the instance configuration information, including information of a specific nature relevant only to this instance (like all protocol information).

MAINTAIN Creates a profile containing only database- and node-related information for maintaining or updating other instances.
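For example (the file name is hypothetical), a template profile could be exported from the current instance with:

db2cfexp /tmp/clientprofile.ini TEMPLATE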


Figure 6-8. db2cfimp CF457.3

Notes:

db2cfimp imports connectivity configuration information from the configuration profile. It is a non-interactive utility that will import all information found in the configuration profile, such as:

• Database information (including DB2 Connect and ODBC information) • Node information • Protocol information • Database Manager Configuration settings • UDB registry settings • Common ODBC/CLI settings

With this utility you can duplicate connectivity information, such as when multiple similar remote clients are to be installed, configured, and maintained.

You need SYSADM or SYSCTRL authorization to use this utility.

filename specifies the fully qualified name of the configuration profile to be imported.

db2cfimp filename
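For example (the file name is hypothetical), the exported profile could then be imported on another workstation with:

db2cfimp /tmp/clientprofile.ini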


Figure 6-9. List Utilities CF457.3

Notes:

LIST UTILITIES generates, to standard output, the list of active utilities on the instance. The description of each utility can include attributes such as start time, description, and throttling priority (if applicable), as well as progress monitoring information.

In a partitioned database, the information returned is for the partition on which it is issued only.

You need SYSADM, SYSCTRL, or SYSMAINT authorization to run this command.

With LIST UTILITIES, the following information will be shown:

• ID • Type of utility (such as RUNSTATS or BACKUP) • Database name • Description (additional details) • Start time • Priority



If you add SHOW DETAIL to this command, you will receive additional information, such as progress monitoring, work metrics, total work units, and completed work units.

This command could be used to monitor the status of running utilities, for example the progress of an online backup. Another possibility is to determine which utilities are running if you are experiencing performance problems. If you identify a utility that is suspected of being responsible for degrading performance, you can then use its ID in the SET UTIL_IMPACT_PRIORITY command, which we will discuss on the next page.
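For example:

db2 list utilities show detail

lists all active utilities on the current instance together with their IDs and progress information.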


Figure 6-10. Change the Priority for Utilities CF457.3

Notes:

This utility changes the impact setting for a running utility. With this command you are able to:

• Throttle a utility that was invoked in unthrottled mode

• Unthrottle a throttled utility (disable throttling)

• Reprioritize a throttled utility

You need SYSADM, SYSCTRL, or SYSMAINT authorization.

utility-id is the ID whose impact setting must be changed. IDs can be obtained with the LIST UTILITIES command.

TO priority specifies an instance-level limit on the impact associated with running a utility. A value of 100 represents the highest priority and 1 represents the lowest priority. The value 0 will force a throttled utility to continue unthrottled.

Throttling requires having an impact policy defined by setting the util_impact_lim configuration parameter.

SET UTIL_IMPACT_PRIORITY FOR utility-id TO priority
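A sketch (the utility ID 2 is hypothetical; real IDs are obtained with LIST UTILITIES), assuming an impact policy has been defined with util_impact_lim:

db2 update dbm cfg using UTIL_IMPACT_LIM 10
db2 set util_impact_priority for 2 to 50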


6.3 Capacity Management


Figure 6-11. Estimate Size GUI CF457.3

Notes:

The Estimate Size window can be opened from the table view of the Control Center: select the relevant table, click the right mouse button, and choose Estimate Size from the pop-up menu.

When you create an index in the Control Center, you can also launch Estimate Size from there.

The Estimate Size window shows the current statistics of a table, including information about the associated indexes. To get the current values, click the "Run statistics..." button at the bottom of the window. These values are the basis of the size calculations. The size can be shown in pages, KB, or MB.

A "New total number of rows" amount as well as a new average row length can be specified to estimate future size requirements. The "Refresh" button starts the recalculation, and the new size estimates are shown.


Figure 6-12. Storage Management CF457.3

Notes:

The Storage Management GUI lets you see detailed storage information for your table spaces.

If you launch it for the first time from the Control Center (right mouse button on DATABASE -> Manage Storage), a wizard guides you through the steps necessary to use the GUI:

• Specify snapshot storage where the storage management tables will be created and stored (existing or new table space). Once defined, it is not possible to change the table space at a later time.

• Specify Threshold Settings allows you to specify warning and alarm thresholds for the following criteria: space usage, data skew, and cluster ratio. Space usage measures the amount of disk space used. Data skew measures the balance of data between the database partitions of a database, partition group, or table. The cluster ratio measures how well the table data is clustered with respect to a given index. Default warning and alarm threshold values are provided and can be changed at any time.


• Specify Snapshot Schedule allows you to schedule an initial snapshot as well as recurring snapshots. In order to launch the Storage Management view, you must have at least one valid snapshot.

When all prerequisites are met, you can launch the Storage Management view.

Here you can capture a snapshot, save it to the Task Center, and set up recurring snapshots. The timestamp of the current snapshot is shown at the top of the GUI.

For each table space you will find the following information:

• Containers with container name, accessibility, type, total size, and total as well as usable pages

• Tables in this table space, with table name, schema, estimated size, number of rows, NPages as well as FPages, column count, and additional information such as the index or LOB table space and the partition number

• Indexes, with information such as index name, index schema, table name, table schema, cluster ratio and cluster factor, cluster ratio warning and alarm thresholds, and column count

By selecting one line (container, table, or index) on the right side of the pane with your right mouse button, you can show historical snapshots. Historical snapshots can also be shown by selecting a specific table space on the left side of the GUI with your right mouse button.


Figure 6-13. Database Container Operations CF457.3

Notes:

If you use database-managed table spaces, you can drop a container from a table space, reduce the size of existing containers, and add new containers to a table space such that a rebalance does not occur.

If a DMS table space was initially over-allocated, this can now be corrected.

• If the amount of data that resides in a table space has decreased significantly and the resulting waste in space is permanent, this extra space can now be reduced.

• These operations can be accomplished online with full access to the table space, so there is no need to disconnect users.

• When a table space is created, its table space map is created and all of the initial containers are lined up such that they all start in stripe 0. This means that data will be striped evenly across all of the table space containers until the individual containers fill up.

From the visual (Database Container Operations):

• Drop existing containers from a DMS table space
• Reduce the size of existing containers in a DMS table space
• Add new containers to a DMS table space such that a rebalance does not occur
• Monitor the progress of a rebalance

ALTER TABLESPACE myts
  REDUCE (FILE 'cont1' 2500,
          FILE 'cont2' 2500)

ALTER TABLESPACE myts
  DROP (FILE 'cont6',
        FILE 'cont7')

ALTER TABLESPACE myts
  BEGIN NEW STRIPE SET (FILE 'cont3' 2500,
                        FILE 'cont4' 2500)

Add or extend DMS table space

• The ALTER TABLESPACE statement lets you add a container to an existing table space or extend a container to increase its storage capacity. You can add containers to a DMS table space using the BEGIN NEW STRIPE SET option of the ALTER TABLESPACE statement, and a rebalance operation will not occur. Space added in this way is immediately available for use. Adding a container that is smaller than the existing containers results in an uneven distribution of data. This can cause parallel I/O operations, such as prefetching data, to perform less efficiently than they otherwise could on containers of equal size. When new containers are added to a table space, or existing containers are extended, a rebalance of the table space data may occur.

Drop or reduce DMS table space containers

With a DMS table space, you can drop a container from the table space or reduce the size of a container. You use the ALTER TABLESPACE statement to accomplish this. Dropping or reducing a container will only be allowed if the number of extents being dropped by the operation is less than or equal to the number of free extents above the high-water mark in the table space. This is necessary because page numbers cannot be changed by the operation, and therefore all extents up to and including the high-water mark must sit in the same logical position within the table space. Therefore, the resulting table space must have enough space to hold all of the data up to and including the high-water mark. In the situation where there is not enough free space, you will receive an error immediately upon execution of the statement. When containers are dropped or reduced, a rebalance will occur if data resides in the space being dropped from the table space.
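As a hedged illustration of monitoring a rebalance from the CLP (the database name SAMPLE is only an example, and the exact element names in the output may vary by release), a table space snapshot can be taken:

db2 GET SNAPSHOT FOR TABLESPACES ON SAMPLE

The rebalancer-related data elements described in this unit (for example, the extents already moved and the extents remaining) appear in the output for the table space that is being rebalanced.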


Figure 6-14. DMS Table Space Container Management (Drop Container) CF457.3

Notes:

• To drop a container, use the DROP option of the ALTER TABLESPACE SQL statement. There must be enough free extents above the table space high water mark in the containers that will remain to hold the active extents of the container being dropped, or the operation will not be allowed to proceed.

• For example, in the top diagram we have three containers: container f0 contains two extents and both are active, and containers f1 and f2 have five extents each with the first three of each being active. Extent number 7 in container f2 contains the high water mark. A request to DROP container f0 will succeed, as there are enough free extents (those marked with a green X) to hold the two active extents in container f0.

A rebalance will be initiated to move the extents; if you are using CLP to DROP the container, the command will return once it has been determined that the request can proceed, but the space will not be available for further use until the rebalance operation has completed. The progress of a rebalance operation can now be monitored using the snapshot monitor facility of DB2. Once the rebalance has been completed, the space map will be as shown in the lower diagram.

From the visual (DMS Table Space Container Management - Drop Container):

ALTER TABLESPACE ts1 DROP (FILE 'f0')

• Drop a container: DROP option of ALTER TABLESPACE.
• There must be enough extents above the high water mark in the remaining containers to hold the active extents in the dropped container.
• A rebalance will be required, and must complete before the container is successfully dropped.

The diagram shows the stripe map of containers f0, f1, and f2 before the drop and of f1 and f2 after the rebalance; X represents an unused extent.

Figure 6-15. DMS Table Space Container Management (Reduce Size) CF457.3

Notes:

• To reduce the size of an existing container, use the REDUCE or RESIZE option of the ALTER TABLESPACE SQL statement. There must be enough free extents above the table space high water mark in the containers that will remain to hold the active extents of the container being reduced, or the operation will not be allowed to proceed.

• For example, in the top diagram we have three containers: container f0 contains two extents and both are active, and containers f1 and f2 have five extents each, with the first three of each being active. Extent number 7 in container f2 contains the high water mark. A request to REDUCE container f1 to two extents will succeed, as there are enough free extents (those marked with a green X) to hold the one active extent (extent 6) in container f1 that will have to be moved.

A rebalance will be initiated to move the extents; if you are using CLP to REDUCE the container, the command will return once it has been determined that the request can proceed, but the space will not be available for further use until the rebalance operation has completed. The progress of a rebalance operation can now be monitored using the snapshot monitor facility of DB2. Once the rebalance has been completed, the space map will be as shown in the lower diagram.

From the visual (DMS Table Space Container Management - Reduce Size):

ALTER TABLESPACE ts2 REDUCE (FILE 'f1' 128)
or
ALTER TABLESPACE ts2 RESIZE (FILE 'f1' 96)
(assuming an extent size of 32 pages - one extent for the container tag)

• Reduce the size of an existing container: REDUCE or RESIZE option of ALTER TABLESPACE.
• There must be enough extents above the high water mark in the remaining containers to hold any dropped extents that were active.
• A rebalance will be required, and must complete before the container is successfully reduced in size.

The diagram shows the stripe maps of containers f0, f1, and f2 before and after the reduce; X represents an unused extent.

Figure 6-16. DMS Container Management (Add New/No Rebalance) CF457.3

Notes:

• To add a container without triggering a rebalance operation, use the BEGIN NEW STRIPE SET option of the ALTER TABLESPACE SQL statement. The number of the stripe set added will be one more than the highest existing stripe set in the table space.

• For example, in the diagram we had one stripe set, number 0. A request to add a container with three extents (container f3 in the diagram) to the table space was processed. Since it used the BEGIN NEW STRIPE SET option of the ALTER TABLESPACE statement, this resulted in the addition of stripe set number 1 to the table space. Because DB2 only stripes data across the extents in a stripe set, there is no need to move data from stripe set number 0 to stripe set number 1, and a rebalance operation is not needed.

When you later add another container (container f4 in the diagram) with four extents to stripe set number 1 using the ADD TO STRIPE SET option, a rebalance will be initiated to stripe the active extents in container f3 evenly across the extents in containers f3 and f4.

From the visual (DMS Container Management - Add New/No Rebalance):

Add a new container without rebalancing: BEGIN NEW STRIPE SET option of ALTER TABLESPACE.

ALTER TABLESPACE ts3 BEGIN NEW STRIPE SET (FILE 'f3' 128)
ALTER TABLESPACE ts3 ADD TO STRIPE SET 1 (FILE 'f4' 160)

Advantage: if all containers were full when adding 'f3', the new space is immediately available.

The diagram shows containers f0 through f4, with stripe set 0 spanning f0 to f2 and stripe set 1 spanning f3 and f4; X represents an unused extent.

Figure 6-17. ALTER TABLESPACE Syntax CF457.3

Notes:

The syntax above shows only the parts of the ALTER TABLESPACE syntax discussed here.

tablespace-name is a one-part name. It is a long SQL identifier (either ordinary or delimited).

BEGIN NEW STRIPE SET specifies that a new stripe set is to be created in the table space, and that one or more containers are to be added to this new stripe set. Containers that are subsequently added using the ADD option will be added to this new stripe set unless TO STRIPE SET is specified.

DROP specifies that one or more containers are to be dropped.

EXTEND specifies that existing containers are to be increased in size. The size specified is the one by which the existing container is increased. If the all-containers clause is specified, all containers in the table space will be increased by this size.

From the visual (ALTER TABLESPACE Syntax):

ALTER TABLESPACE tablespace-name
  { BEGIN NEW STRIPE SET database-container-clause |
    DROP drop-container-clause |
    { EXTEND | REDUCE | RESIZE } database-container-clause }

REDUCE specifies that existing containers are to be reduced in size. The size specified is the one by which the existing container is decreased. If the all-containers clause is specified, all containers in the table space will be decreased by this size.

RESIZE specifies that the size of existing containers is to be changed. The size specified is the new size for the container. If the all-containers clause is specified, all containers in the table space will be changed to this size. If the operation affects more than one container, these containers must all either increase in size or decrease in size. It is not possible to increase some while decreasing others (SQLSTATE 429BC).

In addition, the on-db-partitions clause can be specified to indicate one or more partitions for the corresponding container operation. For compatibility with versions earlier than Version 8, the keyword NODE can be substituted for DBPARTITIONNUM.
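To make the EXTEND versus RESIZE distinction concrete, here is a minimal sketch; the table space name, container path, and page counts are hypothetical:

ALTER TABLESPACE payroll EXTEND (FILE '/db2/cont0' 1000)   (grows the container by an additional 1000 pages)
ALTER TABLESPACE payroll RESIZE (FILE '/db2/cont0' 5000)   (sets the container size to 5000 pages in total)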

Rules:

• The BEGIN NEW STRIPE SET clause cannot be specified in the same statement as ADD, DROP, EXTEND, REDUCE, and RESIZE, unless those clauses are being directed to different partitions (SQLSTATE 429BC).

• The stripe set value specified with the TO STRIPE SET clause must be within the valid range for the table space being altered (SQLSTATE 42615).

• When adding or removing space from the table space, the following rules must be followed.

- EXTEND and RESIZE can be used in the same statement, provided that the size of each container is increasing.

- REDUCE and RESIZE can be used in the same statement, provided that the size of each container is decreasing.

- EXTEND and REDUCE cannot be used in the same statement, unless they are being directed to different partitions.

- ADD cannot be used with REDUCE or DROP in the same statement, unless they are being directed to different partitions.

- DROP cannot be used with EXTEND or ADD in the same statement, unless they are being directed to different partitions.

You can now monitor the progress of a rebalance operation. The information returned by the table space snapshot monitor has been extended to include the data elements on the visual and others. Please refer to the table space activity data elements in the System Monitor Guide and Reference for details.

With the addition of the ability to drop containers or reduce their size, the rebalancer will now operate in two directions: forwards when it has been initiated as a result of a container addition or extension, or backwards as a result of a dropped container or reducing the size of a container.

V8.1 also provides you with SQL access to the snapshot monitor data; it uses a table function. The SQL can be embedded in an application or issued from CLP.


You can create a table to store results from the snapshot SQL.

Note: LAST_EXTEND_MOVED is how the "column" is spelled (note EXTEND, not EXTENT).

Example: SELECT * FROM TABLE (SNAPSHOT_TBS_CFG('SAMPLE', -1)) AS SNAPSHOT_TBS_CFG
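A minimal sketch of storing these results in a table, assuming the SNAPSHOT_TBS_CFG table function and SAMPLE database used above and the CREATE TABLE ... AS (fullselect) DEFINITION ONLY form; the table name TBS_CFG_HIST is hypothetical:

CREATE TABLE tbs_cfg_hist AS
  (SELECT * FROM TABLE (SNAPSHOT_TBS_CFG('SAMPLE', -1)) AS T) DEFINITION ONLY

INSERT INTO tbs_cfg_hist
  SELECT * FROM TABLE (SNAPSHOT_TBS_CFG('SAMPLE', -1)) AS T

Rerunning the INSERT at intervals builds up a simple history of table space configuration snapshots.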


6.4 High Availability Monitor


Figure 6-18. High Availability Monitor on UNIX (db2fm) CF457.3

Notes:

On UNIX based systems, the Fault Monitor Facility improves the availability of non-clustered DB2 UDB environments through a sequence of processes that work together to ensure that DB2 UDB is running. That is, the init daemon monitors the Fault Monitor Coordinator (FMC), the FMC monitors the fault monitors, and the fault monitors monitor DB2 UDB.

The db2fm command controls the DB2 fault monitor daemon and can be used to configure the fault monitor. This command can be used on UNIX platforms only.

Command parameters:

-m module-path defines the full path of the fault monitor shared library for the product being monitored. The default is $INSTANCEHOME/sqllib/lib/libdb2gcf.

-t service gives the unique text descriptor for a service.

-i instance defines the instance of the service.

-u brings the service up

The visual shows the command syntax:

db2fm -t servicename -m module-path
      { -u | -d | -s | -k | -U | -D | -S | -K | -a | -f | -T | -l | -R | -n | -H | -? }


-U brings the fault monitor daemon up

-d brings the service down

-D brings the fault monitor daemon down

-k kills the service

-K kills the fault monitor daemon

-s returns the status of the service

-S returns the status of the fault monitor daemon (not properly installed, installed properly but not active, alive but not available (maintenance), available or unknown)

-f on/off turns the fault monitor on or off. If this option is set off, the fault monitor daemon will not be started or the daemon will exit if it was running.

-a on/off activates or deactivates fault monitoring. If this option is set off, the fault monitor will not be actively monitoring, which means that, if the service goes down, it will not try to bring it back.

-T T1/T2 overwrites the start and stop timeout.

-I R1/R2 sets the status interval and timeout

-R R1/R2 sets the number of retries for the status method and action before giving up

-n sets the e-mail address for notification of events

-h prints usage

-? prints usage
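A hedged example of a typical sequence, assuming an instance named db2inst1 (the instance name is illustrative only):

db2fm -i db2inst1 -S      (query the status of the fault monitor daemon)
db2fm -i db2inst1 -U      (bring the fault monitor daemon up)
db2fm -i db2inst1 -f on   (turn the fault monitor on)
db2fm -i db2inst1 -u      (bring the monitored DB2 service up)

These are the flags described above; combine them according to the syntax shown on the visual.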


Checkpoint

Exercise — Unit Checkpoint

1. Which command exports the glamor table and all tables with the word money in them from the hollywood database? The tables are owned by the cruise and pfeifer userids.

a. db2move hollywood export -tc cruise, pfeifer -tn glamor, *money

b. db2move hollywood export -tc cruise, pfeifer -tn glamor

c. db2move hollywood export -tc cruise, pfeifer -tn glamor, LIKE money

__________________________________________________

2. Which tool can be used to export all relevant information of an instance?

a. db2ocat

b. db2fimp

c. db2fexp

___________________________________________________

3. Can you reduce the size of containers?

a. Yes

b. Yes, for DMS table spaces only

c. No

___________________________________________________


Figure 6-19. Unit Summary CF457.3

Notes:

Unit Summary

Having completed this unit, you should be able to:

• Describe how to use db2move
• Describe how to use db2ocat
• Describe how to use the Estimate Size GUI
• Describe the rebalancing container functionality
• Describe the High Availability Monitor
• Describe db2look, db2cfexp, db2cfimp


Unit 7. Online Table and Index Reorganization

What This Unit Is About

This unit explains the use of online table and index reorganization and its advantages.

What You Should Be Able to Do

After completing this unit, you should be able to describe:

• Configurable online configuration parameters

• Online (INPLACE) table reorganization

- Characteristics

- Syntax

- Usage tips

• Online index creation and reorganization

- Characteristics

- Syntax

- Usage tips

References

IBM DB2 Universal Database Administration: Performance

IBM DB2 Universal Database Command Reference


Figure 7-1. Unit Objectives CF457.3

Notes:

Unit Objectives

After completing this unit, you should be able to describe:

• Table reorganization
• Online (INPLACE) table reorganization: characteristics, syntax, usage tips
• Online index creation and reorganization: characteristics, syntax, usage tips


7.1 Table Reorganization


Figure 7-2. Table Reorganization Overview CF457.3

Notes:

• By default, the table is available read-only during an offline reorganization. In warehousing environments, this can significantly improve the availability of the data to the users. Long and LOB data stored in a large table space is no longer reorganized by default, which significantly improves reorganization performance when there is a lot of long and LOB data. If the data is well clustered and reclaiming space is the primary objective, naming the clustering index on the INDEX clause and specifying INDEXSCAN avoids a sort during the reorganization process. This can reduce the overall duration of the reorganization.

• All reorganizations provide improved support for partitioned database environments. For both offline and online reorganizations, you can control the partitions of the table that are reorganized.

From the visual (Table Reorganization Overview):

• Online (INPLACE) table reorganization: covered on the charts that follow
• "Offline" table reorganization enhancements: default is ALLOW READ ACCESS, option is ALLOW NO ACCESS
• You can specify INDEXSCAN to reorder the table: this avoids a sort of the data and may be the best option if the data is fairly well clustered
• Default is not to reorganize long and LOB data: specify LONGLOBDATA to reorganize this data
• Partition subset support for partitioned database environments: all partitions; a partition or a range of partitions; except a partition or a range of partitions
• Applies to both offline and online reorganizations

Figure 7-3. Online (INPLACE) Table Reorg CF457.3

Notes:

In-place table reorganization

• The in-place method is slower and does not ensure perfectly ordered data, but it can allow applications to access the table during the reorganization. In addition, in-place table reorganization can be paused and resumed later by anyone with the appropriate authority by using the schema and table name.

• Note: In-place table reorganization is allowed only on tables with type 2 indexes and without extended indexes.

• Consider the following tradeoffs:

- Imperfect index reorganization: you might need to reorganize indexes later to reduce index fragmentation and to reclaim index object space.

- Longer time to complete: when required, in-place reorganization defers to concurrent applications. This means that long-running statements or RR and RS readers in long-running applications can slow the reorganization progress. In-place reorganization might be faster in an OLTP environment in which many small transactions occur.

From the visual (Online (INPLACE) Table Reorg):

• Application access during reorg
• Must have type 2 indexes
• Multipartition support
• Imperfect index reorganization, longer time to complete, requires more log space

The diagram shows the reorg working through the table over time: VACATE PAGE RANGE (MOVE & CLEAN to make space), then FILL PAGE RANGE (MOVE & CLEAN to fill space), leaving free space behind.


- Requires more log space: because in-place table reorganization logs its activities so that recovery is possible after an unexpected failure, it requires more log space than classic reorganization.

- It is possible that in-place reorganization will require log space equal to several times the size of the reorganized table. The amount of required space depends on the number of rows that are moved and the number and size of the indexes on the table.

• Recommendation: Choose in-place table reorganization for 24x7 operations with minimal maintenance windows.

• Information about the current progress of table reorganization is written to the history file for database activity. The history file contains a record for each reorganization event.

• You can also use table snapshots to monitor the progress of table reorganization, because the data is recorded regardless of the Database Monitor Table Switch setting.

• If an error occurs, an SQLCA dump is written to the history file. For an in-place table reorganization, the status recorded is PAUSED.


Figure 7-4. Online Table Reorganization - Inside CF457.3

Notes:

• Online reorganization uses an in-place technique. Rows are reorganized within the existing table object to reestablish clustering, reestablish free space, and eliminate overflow rows.

• Unlike offline reorganization, online reorganization does not require additional disk storage. As it proceeds through the table, the benefits of the reorganization are seen immediately for the portions of the table that have been processed. And there is no switch over at the end, which avoids a table quiesce that prevents access.

• The design allows for concurrent index and table scans and for updates to occur while the reorganization is in process. Furthermore, the design guarantees that an application won't miss a row, or see a row twice.

From the visual (Online Table Reorganization - Inside):

• Table available for full S/I/U/D access during reorg
• Online table reorganization uses an in-place technique: rows are moved within the existing table object to reestablish clustering, reestablish free space, and eliminate overflows
• Attributes: minimal extra storage requirement; incremental (benefit of effects seen immediately); no quiesce for "switch over" at end; allows concurrent index and table scans, and updates to occur

Figure 7-5. Online Table Reorganization - Characteristics CF457.3

Notes:

• An online reorganization processes the table incrementally. Small sets of rows from a range of pages are processed as a batch. When one batch has been processed, work is started on the next. This continues until the entire table has been reorganized.

• An online reorganization will always recluster using the clustering index if one is defined. It must do this to work within the clustering index definition. If there is no clustering index, and you specify an index using the INDEX option, then the table will be reclustered using the specified index.

• If there is no clustering index and you do not specify an index, the online reorganization only reclaims space. The reclamation process starts at the end of the table and moves rows to fill holes earlier in the table. Since fewer rows are moved, it is faster than a reclustering.

• Online reorganization runs asynchronously. Once the command or API completes basic checking, it returns. The reorganization continues in the background. Its status can be monitored using the tables snapshot monitor facility.

From the visual (Online Table Reorganization - Characteristics):

• Incremental: small sets of rows are reorganized in batches; batches are from a range of pages
• Two policies, recluster or reclaim space: with a clustering index defined, it always reclusters using that index; with no clustering index, you can specify an INDEX and it will recluster using that index; with no clustering index and no index specified, it will reclaim space
• Asynchronous: runs in the background; the command or API call completes after basic checking
• Status can be tracked through the snapshot monitor (tables snapshot)
• There are pause, resume, and stop options
• Recoverable: redo work from the log
• Resumable: after a DBMS restart

• An online reorganization can be paused and then resumed later, or stopped completely. The table data will have been reorganized as far as the utility got before stopping. (The batches that have been reorganized will not be rolled back.)

• The operation is recoverable in the event of a failure, and can be resumed after the database has been restarted successfully.


Figure 7-6. Online Reorg Syntax - Table or Index CF457.3

Notes:

Command parameters:

• TABLE table-name Specifies the table to reorganize. The table can be in a local or a remote database. The name or alias in the form: schema.table-name may be used. The schema is the user name under which the table was created. If you omit the schema name, the default schema is assumed.

- Note: For typed tables, the specified table name must be the name of the hierarchy’s root table. You cannot specify an index for the reorganization of a multidimensional clustering (MDC) table. Also note that in-place reorganization of tables cannot be used for MDC tables.

• INDEXES ALL FOR TABLE table-name

- Specifies the table whose indexes are to be reorganized. The table can be in a local or a remote database.

From the visual (Online Reorg Syntax - Table or Index):

REORG { TABLE table-name Table-Clause |
        INDEXES ALL FOR TABLE table-name Index-Clause }
      { database-partition-clause }

Table Clause is either the offline options:
  [ INDEX index-name ] [ ALLOW { READ | NO } ACCESS ]
  [ USE tbspace-name ] [ INDEXSCAN ] [ LONGLOBDATA ]

or the inplace options:
  INPLACE { [ ALLOW { WRITE | READ } ACCESS ]
            [ NOTRUNCATE TABLE ] [ START | RESUME ] |
            { STOP | PAUSE } }

Index Clause:
  [ ALLOW { NO | READ | WRITE } ACCESS ]
  [ { CLEANUP ONLY [ ALL | PAGES ] | CONVERT } ]

Table Clause:

• INDEX index-name Specifies the index to use when reorganizing the table. If you do not specify the fully qualified name in the form: schema.index-name, the default schema is assumed. The schema is the user name under which the index was created. The database manager uses the index to physically reorder the records in the table it is reorganizing.

• For an inplace table reorg, if a clustering index is defined on the table and an index is specified, it must be the clustering index. If the inplace option is not specified, any index specified will be used. If you do not specify the name of an index, the records are reorganized without regard to order. If the table has a clustering index defined, however, and no index is specified, then the clustering index is used to cluster the table. You cannot specify an index if you are reorganizing an MDC table.

• USE tbspace-name Specifies the name of a system temporary table space in which to store a temporary copy of the table being reorganized. If you do not provide a table space name, the database manager stores a working copy of the table in the table spaces that contain the table being reorganized. For an 8 KB, 16 KB, or 32 KB table object, the page size of any system temporary table space that you specify must match the page size of the table spaces in which the table data resides.

• INDEXSCAN For a clustering REORG, an index scan will be used to reorder table records. Reorganize table rows by accessing the table through an index. The default method is to scan the table and sort the result to reorganize the table, using temporary table spaces as necessary. Even though the index keys are in sort order, scanning and sorting is typically faster than fetching rows by first reading the row identifier from an index.

• LONGLOBDATA Long field and LOB data are to be reorganized. This is not required even if the table contains long or LOB columns. The default is to avoid reorganizing these objects, because it is time-consuming and does not improve clustering.

• INPLACE Reorganize the table while permitting user access. In-place table reorganization is allowed only on tables with type-2 indexes and without extended indexes. Inplace table reorganization takes place asynchronously and might not be effective immediately.

- ALLOW READ ACCESS Allow only read access to the table during reorganization.

- ALLOW WRITE ACCESS Allow write access to the table during reorganization. This is the default behavior.

- NOTRUNCATE TABLE Do not truncate the table after inplace reorganization. During truncation, the table is S-locked.

- START Start the inplace REORG processing. Because this is the default, this keyword is optional.

- RESUME Continue or resume a previously paused inplace table reorganization.

- STOP Stop the inplace REORG processing at its current point.


- PAUSE Suspend or pause inplace REORG for the time being.
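As a hedged illustration of the inplace options (the schema and table names are hypothetical), typical CLP invocations would be:

db2 "REORG TABLE payroll.employee INPLACE ALLOW WRITE ACCESS START"
db2 "REORG TABLE payroll.employee INPLACE PAUSE"
db2 "REORG TABLE payroll.employee INPLACE RESUME"
db2 "REORG TABLE payroll.employee INPLACE STOP"

Because the reorganization runs asynchronously, each command returns quickly; progress is tracked with the tables snapshot monitor, as described earlier.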

Index Clause:

• ALLOW READ ACCESS Specifies that other users can have read-only access to the table while the indexes are being reorganized.

• ALLOW NO ACCESS Specifies that no other users can access the table while the indexes are being reorganized. This is the default.

• ALLOW WRITE ACCESS Specifies that other users can read from and write to the table while the indexes are being reorganized.

• CLEANUP ONLY When CLEANUP ONLY is requested, a cleanup rather than a full reorganization will be done. The indexes will not be rebuilt and any pages freed up will be available for reuse by indexes defined on this table only.

The CLEANUP ONLY PAGES option will search for and free committed pseudo-empty pages. A committed pseudo-empty page is one where all the keys on the page are marked as deleted and all these deletions are known to be committed. The number of pseudo-empty pages in an index can be determined by running runstats and looking at the NUM EMPTY LEAFS column in SYSCAT.INDEXES. The PAGES option will clean the NUM EMPTY LEAFS if they are determined to be committed.

The CLEANUP ONLY ALL option will free committed pseudo-empty pages, as well as remove committed pseudo-deleted keys from pages that are not pseudo-empty. This option will also try to merge adjacent leaf pages if doing so will result in a merged leaf page that has at least PCTFREE free space on the merged leaf page, where PCTFREE is the percent free space defined for the index at index creation time. The default PCTFREE is ten percent. If two pages can be merged, one of the pages will be freed. The number of pseudo-deleted keys in an index, excluding those on pseudo-empty pages, can be determined by running runstats and then selecting the NUMRIDS DELETED from SYSCAT.INDEXES. The ALL option will clean the NUMRIDS DELETED and the NUM EMPTY LEAFS if they are determined to be committed. (A short example follows the index clause descriptions below.)

- ALL Specifies that indexes should be cleaned up by removing committed pseudo-deleted keys and committed pseudo-empty pages.

- PAGES Specifies that committed pseudo-empty pages should be removed from the index tree. This will not clean up pseudo-deleted keys on pages that are not pseudo-empty. Since it is only checking the pseudo-empty leaf pages, it is considerably faster than using the ALL option in most cases.

• CONVERT If you are not sure whether the table you are operating on has a type 1 or type 2 index, but want type 2 indexes, you can use the CONVERT option. If the index is type 1, this option will convert it into type 2. If the index is already type 2, this option has no effect.

All indexes created by DB2 prior to Version 8 are type 1 indexes. All indexes created by Version 8 are type 2 indexes, except when you create an index on a table that already has a type 1 index. In this case the new index will also be of type 1.


REORG INDEXES will always convert type 1 indexes to type 2 indexes unless you use the CLEANUP option.

Using the INSPECT command to determine the index type can be slow. CONVERT allows you to ensure that the new index will be type 2 without your needing to determine its original type.

Use the ALLOW READ ACCESS or ALLOW WRITE ACCESS option to allow other transactions either read-only or read-write access to the table while the indexes are being reorganized. Note that, while ALLOW READ ACCESS and ALLOW WRITE ACCESS allow access to the table, during the period in which the reorganized copies of the indexes are made available, no access to the table is allowed.
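A hedged example tying the index clause options together (the schema and table names are hypothetical; the catalog columns correspond to the NUM EMPTY LEAFS and NUMRIDS DELETED values mentioned above, written with underscores): first check the counters after a RUNSTATS, then clean up the indexes without rebuilding them:

SELECT INDNAME, NUM_EMPTY_LEAFS, NUMRIDS_DELETED
  FROM SYSCAT.INDEXES
  WHERE TABSCHEMA = 'PAYROLL' AND TABNAME = 'EMPLOYEE'

REORG INDEXES ALL FOR TABLE payroll.employee ALLOW WRITE ACCESS CLEANUP ONLY PAGES

Using CLEANUP ONLY ALL instead would also remove committed pseudo-deleted keys and attempt leaf page merges, as described above.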

db2Reorg API

• Parameters include:

- reorgType - Specifies the type of reorganization
- reorgFlags - Reorganization options
- nodeListFlag - Specifies which nodes to reorganize


Figure 7-7. Online Table Reorganization - Usage Tips CF457.3

Notes:

• An online reorganization connects to the database. To stop the database manager, you will either have to pause the online reorganization and then issue db2stop, or you will have to use a db2stop force. If you use db2stop force, any uncommitted work done by the online reorganization will be backed out during database activation. Once the database is available, the online reorganization can be started again.

• Online reorganization requires that all indexes on the table be type 2 indexes. This index type is new in V8.

• In order to make the online reorganization process recoverable, many of its actions are logged. If you will be using online reorganizations, you should increase your primary log space to accommodate the additional log records.

• While every effort has been made to minimize the impact that online reorganization has on other work in the database, there will be some impact, especially on systems with heavy update workloads. You may want to consider running online reorganizations outside of prime shift in these environments.
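For example, a hedged sketch of the shutdown scenario described above (the database, schema, and table names are hypothetical):

db2 connect to sample
db2 "REORG TABLE payroll.employee INPLACE PAUSE"
db2 connect reset
db2stop
db2start
db2 connect to sample
db2 "REORG TABLE payroll.employee INPLACE RESUME"

The paused reorganization picks up where it left off once the database is active again.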

From the visual (Online Table Reorganization - Usage Tips):

• Online reorg connects to the database: must pause it before db2stop, or use db2stop force
• Indexes on the table must support pseudo-deletes (type 2 indexes); method to convert to type 2 indexes: REORG INDEXES ... CONVERT
• To make the operation recoverable, many actions are logged: increase primary log space
• There will be some limited impact on other work, especially on a system with heavy updates; consider running off-shift for 24 x 7 environments

Figure 7-8. Classic Table Reorg CF457.3

Notes:

• This method provides the fastest table reorganization, especially if you do not need to reorganize LOB or LONG data. In addition, indexes are rebuilt in perfect order after the table is reorganized. Read-only applications can access the original copy of the table except during the last phases of the reorganization, in which the permanent table replaces the shadow copy of the table and the indexes are rebuilt.

• On the other hand, consider the following possible disadvantages:

- Large space requirement: because classic table reorganization creates the shadow copy of the table, it can require twice as much space as the original table. If the reorganized table is larger than the original, reorganization can require more than twice as much space as the original. The shadow copy can be built in a temporary table space if the table space is not large enough, but the replace phase performs best in the same DMS table space. Tables in SMS table spaces must always store the shadow copy in temporary space.

- Limited table access: even read-only access is limited to the first phases of the reorganization process.

From the visual (Classic Table Reorg):

Offline reorg is still available, for example:
db2 reorg table employee index empid allow no access indexscan longlobdata

• Faster
• LONG not REORGed unless requested
• Indexes rebuilt in perfect order
• Read-only applications can access until the last phase if ALLOW READ ACCESS is specified
• Disadvantages: large space requirement; limited table access; all-or-nothing process; within control of the application that invokes it

- All or nothing process: if the reorganization fails at any point, it must be restarted from the beginning on the nodes where it failed.

- Performed within the control of the application that invoked it: the reorganization can be stopped only by that application or by a user who understands how to stop the process and who has authority to execute the FORCE command for the application.

• Recommendation: Choose this method if you can reorganize tables during a maintenance window.


Figure 7-9. Online Index Create and Reorganization CF457.3

Notes:

• DB2 UDB adopts an index creation technique that uses shadow objects to allow full read and write access to a table during index creation.

• It also supports the reorganization of an index, using the same shadow object technique. During index reorganization, the table and existing indexes can be read and written to.

From the visual (Online Index Create and Reorganization):

• Table available for full S/I/U/D access during create/reorg (until the final switch over)
• DB2 UDB provides full read/write access to the table during index creation, and full read/write access to the table and index during index reorganization (re-creation)
• Uses a shadow object technique to achieve this

Figure 7-10. Online Index Create Overview CF457.3

Notes:

• All indexes created in V8 will be created using the shadow object technique. For index creation the shadow object is a ghost index. To build a ghost index, the base table is scanned.

• All new indexes will be type 2, unless the table has type 1 indexes that were created with Version 7 or a prior version. Type 1 and type 2 indexes cannot coexist on a table. If you create an index on a table that has type 1 indexes, the new index on the table will be type 1. Type 1 indexes are converted to type 2 indexes during an index reorganization. Details on index reorganization follow. Type 2 indexes support pseudo-delete and are required for the online reorganization, online load, and multidimensional clustering facilities.

• Altering a table can cause indexes to be created, for example, if you add a primary key or a unique constraint. In V8, these indexes will be created online using the shadow object technique.

• No user action is required to take advantage of this new index creation technique; continue to use CREATE INDEX and ALTER TABLE as you do today.

From the visual (Online Index Create Overview):

• Indexes will be constructed using a shadow object technique
• All new indexes will be type 2 (pseudo-delete) unless the table has type 1 indexes: type 1 and type 2 indexes cannot coexist on the same table; type 1 indexes can be converted to type 2 via an index reorg (see the REORG INDEX topic)
• Altering a table can cause indexes to be created, for example, when adding a primary key or unique specification; these indexes will be created using the online technique
• How to enable: no user actions required; continue to use the CREATE INDEX and ALTER TABLE statements as you do today

Figure 7-11. Online Index Create - Usage Tips CF457.3

Notes:

• Creation of an index involves the use of a shadow object, in this case a ghost index. Before the ghost index can be made available for use, it must be enabled. The table is temporarily unavailable while DB2 is switching over to use the new index.

• Don't forget, CREATE INDEX and ALTER TABLE are SQL statements and must be committed to free locks.

• Should your system fail during index creation, any shadow objects that exist for a table are discarded during the restart of the database. If you were creating an index or running an alter statement that forced indexes to be created, you would have to run the SQL statement again.

• Tip: If you alter a table to add a primary key, a unique constraint or another attribute that requires an index, create the required index first. The table will be available during the index creation. When you issue the ALTER TABLE statement, if the required indexes exist, the ALTER TABLE will complete with a minimum of delay, and the catalog lock time will be minimized.
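A minimal sketch of this tip, with hypothetical table and column names and assuming EMPNO is defined NOT NULL:

CREATE UNIQUE INDEX emp_pk_ix ON employee (empno)
COMMIT
ALTER TABLE employee ADD PRIMARY KEY (empno)
COMMIT

Because a unique index on EMPNO already exists, the ALTER TABLE can use it for the primary key rather than building a new index while holding catalog locks.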

From the visual (Online Index Create - Usage Tips):

• The table is temporarily unavailable while DB2 is switching over to use the new index
• CREATE INDEX is an SQL statement: must commit work to free locks
• During crash recovery, any shadow objects are discarded: run the CREATE INDEX or ALTER TABLE statement again
• Technique to alter a table to add a primary key, a unique constraint, or another item that needs an index: create the required indexes first (the table will be available during index creation); ALTER TABLE checks for indexes that can be used for added constraints before it creates them; the ALTER TABLE statement will complete with a minimum of delay and catalog lock time will be minimized
• Offline index create can be done: LOCK TABLE (share or exclusive) before CREATE INDEX, then issue a commit to release the lock

• An offline index create can still be done. First, you must acquire a lock on the table, either shared or exclusive, using the LOCK TABLE statement. Then, issue your CREATE INDEX statement. Once the index has been created, don't forget to commit your work to free the table lock.
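A hedged sketch of the offline approach (the table, column, and index names are hypothetical):

LOCK TABLE employee IN EXCLUSIVE MODE
CREATE INDEX emp_name_ix ON employee (lastname)
COMMIT

Using IN SHARE MODE instead would still allow readers while blocking writers during the build.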


Figure 7-12. Online Index Reorganization - Overview CF457.3

Notes:

• When you reorganize the indexes on a table, all the indexes are reorganized.

• By default, no user access is allowed on the table during index reorganization. Optionally you can permit read access, or read/write access.

• Free space is always restored. If inserts have consumed the previously allocated free space, the index may grow in size.

• Reorganization always converts existing type 1 indexes to type 2 indexes, plus it reorganizes the index. Type 2 indexes require slightly more space than type 1 indexes, so the space required for the reorganized index may grow. Type 2 indexes are required for online table reorganization, online load, and multidimensional clustering.

• If you have type 2 indexes, there is an option to compress the indexes and not reorganize them. This is accomplished by using the CLEANUP ONLY option. With PAGES, the reorg will delete pseudo-empty pages. With ALL, the reorg will delete pseudo-empty pages and pseudo-deleted keys.

• Reorganize index provides support for partitioned database environments. You can control the partitions of the table that have their indexes reorganized.

From the visual (Online Index Reorganization - Overview):

• Online index reorganization is supported in V8
• All indexes on a table are rebuilt
• The default is to not allow access to the table, effectively running it offline; options allow read-only or read/write access
• Free space is restored
• Reorganization always converts type 1 indexes to type 2, plus it reorganizes
• For type 2 indexes: REORG INDEXES ... CLEANUP ONLY { ALL | PAGES } provides options to delete pseudo-empty pages and pseudo-deleted keys; this compresses the indexes, but does not reorganize them
• Partition subset support for partitioned database environments: all partitions; a partition or a range of partitions; except a partition or a range of partitions

Figure 7-13. Online Index Reorganization - Usage Tips CF457.3

Notes:

• The default for online index reorganization is to allow no access to the table during the reorganization of the indexes. In this case, the existing indexes are marked for rebuild, and the new indexes are built in the same space. When the indexes have been rebuilt, they are marked good. Should the system fail during the reorganization process, the indexes will remain marked for rebuild. When they are rebuilt depends on the setting of the database configuration parameter indexrec; they will either be rebuilt when the database is restarted, lengthening the restart, or on the first access to the table, which will cause a long response time for the user.

• When you specify read or read/write access, the reorganization of indexes will involve the use of shadow objects, in this case shadow indexes. Before the shadow indexes can be made available for use, they must be enabled. The table is temporarily unavailable while DB2 is switching over to use the new indexes.

The shadow indexes are created in the same table space as the existing indexes. You must have enough space to hold two sets of indexes.

From the visual (Online Index Reorganization - Usage Tips):

• REORG INDEXES using ALLOW NO ACCESS (the default): old indexes are marked for rebuild and new ones are built in the same space. If the system crashes during the REORG, rebuilding of the indexes will be done either at restart or at first access to the table; check the indexrec DB configuration parameter to determine the rebuild type.
• REORG INDEXES using ALLOW READ/WRITE ACCESS: the table is temporarily unavailable while DB2 switches over to use the new indexes. During crash recovery, any shadow indexes are discarded; run the REORG INDEXES ALL FOR TABLE command again. You must have enough space for two sets of indexes on the table; indexes can be placed in large table spaces (formerly called long table spaces) to allow more space for shadow indexes.
• REORG INDEXES places progress messages in db2diag and the administration notification log: at the beginning, at the start of the switch over to use the new shadow indexes, and when finished.

Large table spaces (formerly called long table spaces) can now be used for indexes, providing additional space for the two sets of indexes.

• To monitor the progress of an online index reorganization, messages are written to the db2diag log and the administration notification log at the beginning of the reorganization, when the switch to the new indexes starts, and when the process is finished. The administration notification log is a new log added for administrators.


Figure 7-14. “Online Index Reorganization” now “Online Index Defragmentation of Leaf Pages” CF457.3

Notes:

Online index defragmentation is enabled by the user-definable threshold for the maximum amount of free space on an index leaf page. When an index key is deleted from a leaf page and the threshold is exceeded, the neighboring index leaf pages are checked to determine if two leaf pages can be merged. If there is sufficient space on a page for a merge of two neighboring pages to take place, the merge occurs immediately in the background.

Online defragmentation is only possible with indexes created in Version 6 or later. If existing indexes require the ability to be merged online, they must be dropped and then re-created with the MINPCTUSED clause. Set the MINPCTUSED value to less than 100. The recommended value for MINPCTUSED is less than 50 because the goal is to merge two neighboring index leaf pages. A value of zero for MINPCTUSED (which is the default) disables online defragmentation.
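As a sketch (the index and table names are hypothetical), an existing index can be dropped and re-created with a MINPCTUSED threshold of 30 to enable online merging of its leaf pages:

db2 drop index db2inst1.ix1
db2 create index db2inst1.ix1 on db2inst1.sales (sku) minpctused 30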

Before Version 8, the term online index reorganization was used to describe the process of merging index leaf pages while the index was online. This function is now referred to as online index defragmentation of leaf pages.

© Copyright IBM Corporation 2004

"Online Index Reorganization" now "Online Index Defragmentation of Leaf Pages"

Root Node

Non-leafNode

LeafNode

DELETE

CREATE INDEX IX1

....MINPCTUSED 30

Root Node

Non-leafNode

LeafNode


Figure 7-15. Unit Summary CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Summary

Having completed this unit, you should be able to describe:

Table reorganization

Online (INPLACE) table reorganization: characteristics, syntax, usage tips

Online index creation and reorganization: characteristics, syntax, usage tips


Unit 8. Multidimensional Clustering

What This Unit Is About

This unit introduces you to the capability of DB2 UDB multidimensional clustering.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Identify the advantages of multidimensional clustering (MDC) over single-dimensional clustering

• Define cell, slice, and dimension

• Describe the performance impact of using MDC

• Identify the block indexes that will be built automatically when an MDC table is created

• Compare and contrast block indexes and RID indexes

• Identify considerations when choosing dimensions

References

IBM DB2 UDB Administration Guide: Performance


Figure 8-1. Unit Objectives CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Objectives

After completing this unit, you should be able to:

Identify advantages of multidimensional clustering (MDC) over single-dimensional clustering

Define cell, slice, and dimension

Describe performance impact of using MDC

Identify the block indexes that will be built automatically when an MDC table is created

Compare and contrast block indexes and RID indexes

Identify considerations when choosing dimensions


8.1 Multidimensional Clustering


Figure 8-2. Overview CF457.3

Notes:

• Multidimensional clustering (MDC) is primarily intended for data warehousing and large database environments. It provides an elegant method for flexible, continuous, and automatic clustering of data along multiple dimensions. This results in a significant improvement in query performance, as well as a significant reduction in the overhead of data maintenance operations such as reorganization, and of index maintenance during INSERT, UPDATE, and DELETE.

• MDC enables a table to be physically clustered on more than one dimension simultaneously (rather like having multiple clustered indexes). In particular, an MDC table ensures that rows are organized on disk in blocks of consecutive pages, such that all rows within any block have the same dimension values. All blocks contain the same number of pages, and multiple blocks can have the same dimension values when there are enough values corresponding to a dimension to warrant this. Dimensions of an MDC table are specified when the table is created. A block index is then automatically created for each dimension, and a composite block index is created for the entire set of dimensions. Dimensions are not restricted to being columns or column sequences. In

© Copyright IBM Corporation 2004

Overview

Single-dimensional clustering

Multidimensional clustering (MDC):
- Background and concepts
- How it works
- Insert, update, and delete details
- Performance impact
- Considerations when choosing dimensions
- Generated columns and monotonicity
- Load and rollin/rollout

Summary


particular, a dimension can be an expression with arithmetic operators, scalar functions, and so forth.

• To help you with MDC, you can also use the Design Advisor in the Control Center (right-click Database -> Design Advisor). There you can indicate a particular workload and let DB2 UDB advise you how to cluster your data.


Figure 8-3. Single-Dimensional Data Clustering CF457.3

Notes:

• The syntax for creating a clustering index is:

create index i1 on SALES(SKU) CLUSTER

• In order to cluster future inserts, your table should be set up with sufficient space on each page. This can be done by using the PCTFREE option:

ALTER TABLE SALES PCTFREE 10

• The load and reorg utilities honor the PCTFREE value specified, and will leave 10% free space on pages in this example.
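Putting these pieces together, the following hedged sequence (table, index, and schema names are illustrative) reserves free space, re-establishes clustering with a classic reorg on the clustering index, and refreshes the statistics:

db2 alter table sales pctfree 10
db2 reorg table sales index i1
db2 runstats on table db2inst1.sales and detailed indexes all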

© Copyright IBM Corporation 2004

Single-Dimensional Data Clustering

(Figure: a clustering index on SKU and a second index on date over the sales table; the data pages are clustered according to the SKU clustering index.)

Benefits:
- Physically cluster data on insert according to the order of a single "clustering" index
- Improves performance of range queries and prefetching

Data Page 1
SKU  Store  Date   Qty  Amt
101  21     04/02  1    1.50
101  21     04/02  1    1.50
101  7      04/02  2    3.00
101  7      04/01  6    8.11

Data Page 2
SKU  Store  Date   Qty  Amt
101  7      04/02  1    1.50
101  21     04/02  3    4.10
101  7      04/01  2    3.00


Figure 8-4. How Does Single-Dimensional Clustering Work? CF457.3

Notes:

In order to cluster future inserts, your table should be set up with sufficient space on each page. This can be done by using the PCTFREE option. The load and reorg utilities honor the PCTFREE value specified, and will leave the appropriate percentage free space on pages.

During an insert operation, clustering is used to store the data on the same page where the data for the previous or next key is stored. If there is no more free space on the target page, DB2 UDB searches for the next “nearby” free space. If there is still no free space, the normal insert algorithm is used.

The advantage of having an index and data in the same order and in the right sequence on the pages is that, in the case of a select, for instance, fewer I/Os are needed to read the relevant information.

You can have only one clustered index on a table. In order to keep the clustering “up to date”, reorganizations are necessary.
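One way to see how well the clustering is being maintained is to check the catalog statistics after RUNSTATS; this query is only a sketch and assumes a SALES table in a hypothetical schema DB2INST1:

SELECT INDNAME, CLUSTERRATIO, CLUSTERFACTOR
FROM SYSCAT.INDEXES
WHERE TABSCHEMA = 'DB2INST1' AND TABNAME = 'SALES'

CLUSTERRATIO is populated when basic index statistics are collected; CLUSTERFACTOR is populated instead when detailed index statistics are gathered.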

© Copyright IBM Corporation 2004

How Does Single-Dimensional Clustering Work?

(Figure: a clustering index on SKU and a second index on date over the sales table.)

How it works:
- On insert, the clustering index is used to find the location of the same or next key, and the insert is attempted on the same page.
- If there is insufficient space on the target page, search in spiral fashion through the target free space map page (covers 500 data pages).
- If no page is found in the target free space map page, search other free space map pages using the normal insert algorithm but with a "worst-fit" search: pages with the most space instead of the first found.

Drawbacks:
- Clustering in a single dimension only
- All other indexes are unclustered
- Clustering degrades over time, requiring reorg
- Only record-based indexes, often very large


Figure 8-5. MDC - How It Works CF457.3

Notes:

• In this example, we have an MDC table with two dimensions: region and year.

• When data is inserted into the table, records having different dimension values are put into separate extents.

• In this way, each extent contains data that has a particular combination of dimension values; and a particular set of dimension values will ONLY be found in a subset of extents of the table.

• Note that there may be more records having a particular set of dimension values than fit in a single extent. Multiple extents can be assigned to a particular set of dimension values.

• Dimension values are indexed with BLOCK indexes, that is, indexes which point to extents instead of individual records.
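A table like the one in this example could be created as follows; this is a sketch only, and the column list is hypothetical. DB2 then automatically creates the dimension block indexes on region and year, plus a composite block index over both:

CREATE TABLE SALES_MDC
  ( region  CHAR(10),
    year    INT,
    amount  DECIMAL(9,2) )
  ORGANIZE BY DIMENSIONS (region, year)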

© Copyright IBM Corporation 2004

MDC - How It Works

In multidimensional clustered (MDC) tables, data is organized along extent boundaries according to dimension (clustering) values.

(Figure: the extents making up an MDC table with dimensions region and year. A dimension block index on Region and a dimension block index on Year point to the extents holding the data for each combination of values, for example East 1993, East 1996, North 1996, North 1997, North 1998, and South 1999.)


Figure 8-6. Multidimensional Clustering CF457.3

Notes:

• Multidimensional clustering (MDC) provides an elegant method for flexible, continuous, and automatic clustering of data along multiple dimensions. This results in significant improvement in the performance of queries, as well as significant reduction in the overhead of data maintenance operations, such as reorganization, and index maintenance operations during insert, update, and delete operations. Multidimensional clustering is primarily intended for data warehousing and large database environments, and it can also be used in online transaction processing (OLTP) environments.

• This kind of clustering is similar to the Red Brick Star Join and to ROLAP star schemas. It is useful when queries frequently request data rows sorted and/or summarized on the same group of keys. Usually, the data is inserted into the data pages in the order they are received based entirely on the primary key value. In our example, the primary key is the stock-keeping-unit (product ID). However, in this retail example, the rows are also indexed on store number and date, foreign keys that are frequently used to organize reports. In the past, we would have to keep an index on all three columns. Prior to using MDC, when a request would come in for store 21 analysis of product 101 sales, DB2 would have to access many data blocks to find all the occurrences of store 21. With

© Copyright IBM Corporation 2004

Multidimensional Clustering

Data pages - with MDC (rows grouped by dimension key values):

SKU  Store  Date   Qty  Amt
101  21     04/02  1    1.50
101  21     04/02  1    1.50
101  21     04/02  3    4.10

SKU  Store  Date   Qty  Amt
101  7      04/01  6    8.11
101  7      04/01  2    3.00

SKU  Store  Date   Qty  Amt
101  7      04/02  2    3.00
101  7      04/02  1    1.50

Data pages - without MDC (rows stored in the order they are received):

SKU  Store  Date   Qty  Amt
101  21     04/02  1    1.50
101  21     04/02  1    1.50
101  7      04/02  2    3.00
101  7      04/01  6    8.11

SKU  Store  Date   Qty  Amt
101  7      04/02  1    1.50
101  21     04/02  3    4.10
101  7      04/01  2    3.00


MDC, DB2 sorts the rows by all three key values and stores them in separate data pages or blocks. Only rows with the same value in all three keys are stored onto that page. Now, when the request comes for store 21 analysis of SKU 101, DB2 only has to retrieve one data block. All this is maintained for the DBA by DB2 every time a row is inserted.

• Data blocks are also organized sequentially. If one data block is not enough to hold all similar rows, multiple contiguous blocks are allocated so that prefetching scans ahead and delivers data before the first block of records has finished processing.

• Another important point is that the index now does not have to hold an index entry for every record in the table. Instead, it only has to hold one item for the entire data block of similar keyed rows. This dramatically shrinks the index size. This has two effects: it reduces disk space significantly and it speeds up queries since fewer index pages must be read to find the rows.

• A more descriptive term for this feature might have been row collocation.

• MDC enables a table to be physically clustered on more than one key (or dimension) simultaneously. Prior to Version 8, DB2 only supported single-dimensional clustering of data, through clustering indexes. Using a clustering index, DB2 maintains the physical order of data on pages in the key order of the index, as records are inserted and updated in the table. Clustering indexes greatly improve the performance of range queries that have predicates containing one or more keys of the clustering index. With good clustering, only a portion of the table needs to be accessed and, when the pages are sequential, more efficient prefetching can be performed.

• With MDC, these benefits are extended to more than one dimension, or clustering key. In terms of query performance, range queries involving any combination of specified dimensions of the table will benefit from clustering. Not only will these queries access only those pages having records with the correct dimension values, but these qualifying pages will be grouped by extents. Furthermore, although a table with a clustering index can become unclustered over time as space fills up in the table, an MDC table is able to maintain its clustering over all dimensions automatically and continuously, thus eliminating the need to reorganize the table to restore the physical order of the data.

• Multidimensional clustering (MDC) enables Materialized Query Tables (MQTs) to be physically clustered and indexed on more than one dimension simultaneously. The physical order of data based on the index is maintained by DB2 as data records are inserted and updated to the MQTs. It greatly improves the performance of OLAP queries as only a portion of the MQT needs to be accessed. It also reduces the size of the index needed to achieve the level of query performance.


Figure 8-7. Background - Extents (1 of 2) CF457.3

Notes:

• SMS table spaces store data in operating system files. Data is striped, by extent (an extent is a group of consecutive pages) across the containers. Each table gets its own file name which is used in all containers. The file extension denotes the type of data stored in the file. In this example, all of T1's data pages are stored in files with the name SQL00002.DAT across the two containers. (00002 is the 'fid' from SYSTABLES.)

• The starting extent for each table is “round-robined” through the containers (that's why T1's first extent is /mydir1, whereas T2's is /mydir2). This helps spread the space requirement evenly across the containers, which is important especially when there are a large number of small tables.

• It isn't shown on the visual, but if an index exists for T1, it would be stored in SQL00002.INX and would be spread across extents by extent just like the data pages. Similarly, SQL00002.LF would store all T1's LONG VARCHARS, and SQL00002.LB all its LOBs. If SQL00002.LB existed, so would SQL00002.LBA. It would store an allocation map, which is internal metadata that helps DB2 UDB quickly find and manage free space in SQL00002.LB.
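To relate these file names back to the catalog, the numeric identifiers can be queried; this is a sketch using the T1 and T2 tables from the example above:

SELECT TABNAME, TBSPACEID, TABLEID
FROM SYSCAT.TABLES
WHERE TABNAME IN ('T1', 'T2')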

© Copyright IBM Corporation 2004

Background - Extents (1 of 2)

What is an extent? A set of contiguous pages on disk, whose size is specified at table space creation time.

SMS table spaces - what happens on disk after the following:

db2 create tablespace tb1 managed by system using ('/mydir1','/mydir2') extentsize 4
db2 create table t1 in tb1
db2 create table t2 in tb1

(Figure: each container holds one file per table - ./SQL00002.DAT for T1 and ./SQL00003.DAT for T2 in both /mydir1 and /mydir2. Extents of four data pages are striped across the two containers: T1's first extent of data pages is placed in /mydir1, its second extent in /mydir2, and so on, while T2's extents start in /mydir2.)


Figure 8-8. Background - Extents (2 of 2) CF457.3

Notes:

• In SMS table spaces, because each table's extents are in separate files from those of other tables, the logical and physical page numbers are the same.

• In DMS table spaces, because the extents of a table are interspersed with the extents of other objects in the same table space, the logical pages and physical page numbers (relative to the table space) are not the same.

• Note that indexes always point to the physical page number, not the logical page number. This is so that, in DMS table spaces, we don't have to do a lookup in the extent map, but we can point directly to the data where it resides in the table space.

© Copyright IBM Corporation 2004

Background - Extents (2 of 2)

In both SMS and DMS table spaces, a table is logically made up of a set of extents, each of which is a set of consecutive pages on disk.

Extents making up a table:

Logical page #s:          0-3     4-7       12-15      16-19    20-23
Physical page #s (DMS):   84-87   112-115   1000-1003  444-447  872-875


Figure 8-9. Terminology: Dimension CF457.3

Notes:

• In this example, we have three dimensions: nation, color, and year.

• The table can be thought of as being organized into a three-dimensional cube.

• Data having particular dimension values can be found via that dimension's axis in the grid.

• This cube is simply a way of conceptualizing how the data is organized in an MDC table having three dimensions.

© Copyright IBM Corporation 2004

Terminology: Dimension

Dimension: An axis along which data is physically organized in an MDC table

(Figure: a cube with a Year dimension, a Color dimension, and a Nation dimension; each cell is labeled with a combination of values such as 1997, Canada, blue or 1998, Mexico, yellow.)


Figure 8-10. Terminology: Slice (1 of 3) CF457.3

Notes:

Continuing with the three-dimensional table example:

• Any value of a particular dimension will define for us a “slice” of the table.

• That slice contains all data in the table having that value for that dimension, and only that data.

• In this case, we show the slice for nation = Canada.

© Copyright IBM Corporation 2004

Terminology: Slice (1 of 3)

Slice: Portion of the table containing data having a certain key value of one of the dimensions

(Figure: the same cube, highlighting the Nation slice for nation = Canada.)


Figure 8-11. Terminology: Slice (2 of 3) CF457.3

Notes:

• Here, we further slice up the table to find those records having color = Yellow.

• Note that this slice partly overlaps the previous slice.

• This indicates that some records have both a nation of Canada and a color of yellow. Others don't.

© Copyright IBM Corporation 2004

Terminology: Slice (2 of 3)

Slice: Portion of the table containing data having a certain key value of one of the dimensions

(Figure: the cube highlighting the Color slice for color = yellow, which partly overlaps the previous Nation slice.)


Figure 8-12. Terminology: Slice (3 of 3) CF457.3

Notes:

• Again, we find the slice for year = 1997.

• And again, there is some overlap with the previous slices.

© Copyright IBM Corporation 2004

Terminology: Slice (3 of 3)

Slice: Portion of the table containing data having a certain key value of one of the dimensions

(Figure: the cube highlighting the Year slice for year = 1997, again partly overlapping the previous slices.)


Figure 8-13. Terminology: Cell CF457.3

Notes:

• A cell is the overlap or intersection of slices from each of the dimensions.

• There is a logical cell in the table for each unique combination of existing dimension values.

• Each cell is physically made up of one or more extents or blocks, which themselves contain data having the cell's dimension values.

© Copyright IBM Corporation 2004

Terminology: Cell

Cell: Portion of the table containing data having a unique set of dimension values; the intersection formed by taking a slice from each dimension

(Figure: the cube highlighting the cell for one (nation, color, year) combination; each cell contains one or more blocks.)


Figure 8-14. Block Indexes and Dimension Block Indexes CF457.3

Notes:

• Block indexes are considered by the optimizer just as RID indexes are, in determining possible access plans for queries.

• They are treated just as RID indexes are during processing. That is, they can be ANDed and ORed with each other and other RID indexes; reverse scans can be done on them, and so on.

• Dimension block indexes are automatically created when the table is created. They cannot be dropped. However, they can be renamed, since a system-generated name is assigned to them for you.
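The system-generated block indexes can be listed from the catalog; this is a sketch that assumes a table named MDCTABLE and that the INDEXTYPE column distinguishes dimension block indexes ('DIM') from the composite block index ('BLOK'):

SELECT INDSCHEMA, INDNAME, INDEXTYPE, COLNAMES
FROM SYSCAT.INDEXES
WHERE TABNAME = 'MDCTABLE'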

© Copyright IBM Corporation 2004

Block Indexes and Dimension Block Indexes

Block indexes:
- Structurally the same as regular RID indexes, but point to blocks instead of records
- Much smaller than RID indexes: smaller by a factor of (block size * average # records per page), where block size = # pages in an extent (2-256)

Dimension block indexes:
- Facilitate the determination of which blocks comprise a slice
- Automatically created for each dimension of the table at table creation time
- System required


Figure 8-15. Dimension Block Indexes CF457.3

Notes:

• This diagram attempts to illustrate that each slice and cell in the table actually contains a number of extents or blocks containing data associated with those slices or cells.

• A slice corresponds to a key and its list of IDs in a block index.

• Note that the index key has a list of block IDs; a block ID is made up of the first pool relative page of the block and a dummy (0) slot.

• Compare this to a RID in a RID index, which is made up of the page number and slot number of a record in the table.

• Since the BID and RID structure is so similar, index manager treats both the same until the block or record is actually accessed.

© Copyright IBM Corporation 2004

Dimension Block Indexes

Block index on the nation dimension, with keys Canada and Mexico. Each key value corresponds to a different slice of the table, and each key has a list of BIDs (Block IDs), one for each block belonging to that slice of the table.

Key for Canada: (4,0) (12,0) (48,0) (52,0) (76,0) (100,0) (216,0) (292,0) (304,0) (444,0)

Key part BID (Block ID) = <first pool relative page of block, 0>. For example, if the extent size is 4 pages, Block ID (4,0) represents pages 4, 5, 6, and 7.


Figure 8-16. Benefit: Clustering in Multiple Dimensions CF457.3

Notes:

Once DB2 has a BID from a block index, it has the first pool page of that block; DB2 can go directly to the first pool page of the block, and scan the entire block of pages. DB2 knows that every record found in these pages will have the dimension value of the block index key found.

© Copyright IBM Corporation 2004

Benefit: Clustering in Multiple Dimensions

Range scans on any dimension index Provides clustered data access, since each BID corresponds to a set ofsequential pages in the table guaranteed to contain data having thatdimension value

Dimensions/slices can be accessed independently from eachother

Through their block indexes without compromising the cluster factor ofany other

Block index scans Can also be combined (ANDed, ORed) and the resulting list of blocks toscan also provide clustered data access

Access to the clustered data is much faster than would havebeen with a clustering index

There is one pointer per qualifying block of pages versus one pointerper qualifying row

Given a BID from a block index, DB2 can do a very efficientscan of the corresponding block in the table

Much faster than accessing each row via a RID


Figure 8-17. Insert CF457.3

Notes:

Inserts - the hard way.

© Copyright IBM Corporation 2004

Insert

When inserting a new record to the table, how does DB2most efficiently determine where to store it?

INSERT INTO mdcTable Values(1997, 'Canada', 'yellow', 5000, 3.843, ....)

In order to maintain clustering, DB2 needs to find the uniquecell for the dimension values: year=1997, nation='Canada'and color= 'yellow'

Each dimension's slice could be found; the cell could bedetermined from the intersection of the slices

DB2 searches each dimension index for the list of blockscorresponding to the respective key value and does anANDing of the BID lists to find the set of blocks in the tablefor this specific cell

However, there is a better way: the composite block index


Figure 8-18. Insert: The Composite Block Index CF457.3

Notes:

• This index is created automatically when you create the table.

• It is treated by the optimizer exactly like any other block index, and so it can be used for query processing.

• Because of this, you may be interested in influencing the order of the set of columns for the key definition of this index. The index may be more or less useful for query processing depending on the order of its key parts.

• The order is determined by the order of columns encountered by the parser when parsing the dimensions specified in the ORGANIZE BY clause.

• Keep this in mind when the syntax is indicated later.

© Copyright IBM Corporation 2004

Insert: The Composite Block Index

In addition, there is a composite block index which maps cell values to the list of blocks for each cell.

Its key is made up of all columns involved in the dimensions.

So, in the previous example, an additional block index on (year, nation, color) will be created at table creation time.

It is used to very quickly determine whether a particular cell exists, and if so, exactly which blocks contain those cell values.

Block index on (Year, Nation, Color):
1997, Canada, Blue    ->  (4,0), (84,0), (444,0)
1997, Canada, Yellow  ->  (52,0), (292,0)
1997, Mexico, Blue    ->  (124,0), (128,0)
1997, Mexico, Yellow  ->  (96,0), (3340,0)


Figure 8-19. Insert Processing CF457.3

Notes:

DB2 UDB probes the composite block index for the cell corresponding to the dimension values of the record. Once the cell’s key is found, its list of BIDs gives DB2 UDB the complete list of blocks in the table.

If the cell key is not found in the index, or the extents containing these values are full, a new block must be assigned to that cell.

© Copyright IBM Corporation 2004

Insert Processing

On insert, DB2 probes the composite block index forthe cell corresponding to the dimension values ofthe record to be inserted.

If the cell's key is found in the index, its list of BIDsgives DB2 the complete list of blocks in the tablehaving the cell's dimension values.

Limits the number of extents of the table to search for spaceto insert.

If the cell's key is not found in the index, or if theextents containing these values are full, a new blockmust be assigned to the cell.

If possible, DB2 reuses an empty block in the table first, beforeextending the table by another new extent of pages.


Figure 8-20. Benefit: Guaranteed Clustering CF457.3

Notes:

• As far as the statistics are concerned, all block indexes are 100% clustered.

• Reorg is never needed to recluster the data.

• However, reorg can still be used to reclaim space, if cells have many sparse blocks where data could fit on fewer blocks, for example.
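If space reclamation is needed, for example after mass deletes leave many sparse blocks, a classic offline reorg can be run; a minimal sketch with a hypothetical table name:

db2 reorg table db2inst1.mdctable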

© Copyright IBM Corporation 2004

Benefit: Guaranteed Clustering

Data having particular dimension values are guaranteed tobe found in a set of blocks that contain only and all recordshaving those values.

Blocks are consecutive pages on disk, so access to theserecords is sequential, with minimal I/O.

Clustering is automatically maintained over time by insertingto existing blocks having record's dimension values.

When existing blocks in a cell are full, DB2 reuses orallocates a block and adds it to the set of blocks for that cell.

When a block is emptied of data, the BID is removed fromthe block indexes and thus disassociated with any cell values

It can be reused for another cell in the future.

REORG is no longer needed to re-cluster data.


Figure 8-21. Block Management CF457.3

Notes:

• We remove empty blocks from the block indexes so that they can be reused, and to reduce the processing involved in searching empty blocks via the index. This also keeps the indexes small.

• However, once we remove a BID from the block indexes, we don't have a quick way to find it again, and we don't want to have to resort to a table scan in order to find it again for reuse.

© Copyright IBM Corporation 2004

Block Management

When a block is emptied, DB2 disassociates it with itscurrent cell values so that it can be reused by anothercell when needed.

Its BID is removed from the block indexes when theblock is emptied.

When a new block is needed, DB2 needs to findpreviously emptied blocks quickly, without having tosearch the table for them.

Solution: The block map


Figure 8-22. The Block Map CF457.3

Notes:

In addition to the dimension block indexes and the composite block index, MDC tables maintain a block map containing a bitmap that indicates the availability status of each block. The following attributes are coded in the bitmap list:

• X (reserved): The first block contains only system information for the table.
• U (in use): This block is used and associated with a dimension block index.
• L (loaded): This block has been loaded by a current load operation.
• C (check constraint): This block is set by the load operation to specify incremental constraint checking during the load.
• T (refresh table): This block is set by the load operation to specify that AST maintenance is required.
• F (free): If no other attribute is set, the block is considered free.

Because each block has an entry in the block map file, the file grows as the table grows. This file is stored as a separate object. In an SMS table space, it is a new file type. In a DMS table space, it has a new object descriptor in the object table.

© Copyright IBM Corporation 2004

The Block Map

A new structure which stores the status of each blockof the table.

Stored as a separate object:In SMS: As a separate .BMP file

In DMS: As a new object descriptor in the object table

Composed of an array containing an entry for eachblock of the table, where each entry is a set of statusbits for a block.

Status bits:In-use: Currently contains data and block is assigned to a cell

Load: Recently loaded; not yet visible by scans

Constraint: Recently loaded; constraint checking still to be done

Refresh Table: Recently loaded; MQTs still need to be refreshed

Free: If no other attribute is set, the block is considered free


Figure 8-23. Update CF457.3

Notes:

The update of values which are not a part of a dimension key takes place as a “regular” update.

The update of dimension values may result in the need to move to a different cell. If this is the case, the update would be converted into a delete and then an insert of the changed record.

© Copyright IBM Corporation 2004

Update

Update of non-dimension valuesUpdate takes place as in regular tables

If variable length and no longer fits on page, search for another page with space

First search within same block

If space not found in block, use insert algorithm to find another

No need to update block indexes unless the move required a newly allocated block

Update of dimension valuesNeed to move to different cell

Converted into delete then insert of changed record

Block indexes do need to be updated since different cellUnless new cell creation required, or old cell was completely emptied


Figure 8-24. Benefit: Reduced Overhead and Logging CF457.3

Notes:

Block indexes reduces the need for logging, and also the relevant overhead caused, for example, by maintenance of an index.

© Copyright IBM Corporation 2004

Benefit: Reduced Overhead and Logging

Block indexes need only be updated when inserting firstrecord in block or deleting last record from block.

Reduces index overhead for maintenance, and logging.

For every block index that would have otherwise beena RID index, this overhead and logging is reducedenormously.

Reduction by a factor of cell cardinality.


Figure 8-25. Simple and Flexible Syntax (1 of 3) CF457.3

Notes:

• In MDC, only the dimension columns (or key definitions) need to be defined.

• With a competitor's range partitioning (the closest thing out there to MDC), each range boundary must be explicitly defined and eventually altered as new data is added that exceeds the upper bound, for example.

• Compare to a competitor's partition definition:

CREATE TABLE PART1 (Year INT, Nation CHAR(25), Color VARCHAR ...) PARTITION BY RANGE ( Year )(PARTITION cell1 VALUES LESS THAN (1994) TABLESPACE TB1, PARTITION cell2 VALUES LESS THAN (1995) TABLESPACE TB2, PARTITION cell3 VALUES LESS THAN (1996) TABLESPACE TB3,PARTITION cell4 VALUES LESS THAN (1997) TABLESPACE TB4, ... )

• MDC is much more dynamic and flexible, and this is reflected in the syntax.

© Copyright IBM Corporation 2004

Simple and Flexible Syntax (1 of 3)

No need to plan for or define explicit range boundaries.

Cells (logical partitions) and blocks automaticallyadded or removed based on actual data.

Example:

CREATE TABLE MDCTABLE ( Year INT, Nation CHAR(25), Color VARCHAR(10),... )ORGANIZE BY( Year, Nation, Color )


Figure 8-26. Simple and Flexible Syntax (2 of 3) CF457.3

Notes:

ORGANIZE BY DIMENSIONS (column-name,...)

Specifies a dimension for each column or group of columns used to cluster the table data. The use of parentheses within the dimension list specifies that a group of columns is to be treated as one dimension. The DIMENSIONS keyword is optional.

A clustering block index is automatically maintained for each specified dimension, and a block index, consisting of all columns used in the clause, is maintained if none of the clustering block indexes includes them all. The set of columns used in the ORGANIZE BY clause must follow the rules for the CREATE INDEX statement.

Each column name specified in the ORGANIZE BY clause must be defined for the table, and a dimension cannot occur more than once in the dimension list.

Pages of the table are arranged in blocks of equal size, which is the extent size of the table space, and all rows of each block contain the same combination of dimension values.

The order of key parts in the composite block index may affect its use or applicability for query processing. The order of its key parts is determined by the order of columns found in

© Copyright IBM Corporation 2004

Simple and Flexible Syntax (2 of 3)

A dimension is an index key definition, so it can containmultiple columns.

Example:

CREATE TABLE MDCTABLE2 ( Year INT, Nation CHAR(25), Color VARCHAR(10), ... )ORGANIZE BY( (Year, Nation), Color )

This MDC table will have two dimension block indexes:One on (Year, Nation), another on Color

And a composite block index on (Year, Nation, Color)


the entire ORGANIZE BY [DIMENSIONS] clause used when creating the MDC table. For example, if a table is created using:

CREATE TABLE t1 (c1 int,c2 int,c3 int,c4 int) ORGANIZE BY DIMENSIONS (c1,c4,(c3,c1),c2)

then the composite block index will be created on columns (c4,c3,c1,c2). Although c1 is specified twice in the DIMENSIONS clause, it is used only once as a key part for the composite block index, and in the order in which it is last found.

The order of key parts in the composite block index makes no difference for insert processing, but may do so for query processing.


Figure 8-27. Simple and Flexible Syntax (3 of 3) CF457.3

Notes:

• The number of dimensions doesn't have an explicit limit.

- However, because of the composite block index that is created, the number of columns involved in all dimensions cannot exceed the column limit for an index.

• Generated columns can be made using a wide variety of expressions, from simple arithmetic expressions, to built-in functions, to case statements.

© Copyright IBM Corporation 2004

Simple and Flexible Syntax (3 of 3)

A dimension can be created on a GENERATED column,which is a column built from an expression on a differentcolumn in the table.

Example:

CREATE TABLE MDCTABLE2 ( Date DATE, Nation CHAR(25), Color VARCHAR(10), YearAndMonth INT GENERATED ALWAYS AS (INTEGER(Date)/100), ... ) ORGANIZE BY ( YearAndMonth, Color )

This MDC table will have two dimension block indexes:One on YearAndMonth, another on Color

And a composite block index on (YearAndMonth, Color)

This provides a very powerful and flexible way to organizeand cluster data on expressions.


Figure 8-28. Query Processing - Example (1 of 3) CF457.3

Notes:

DB2 can perform index ANDing and index ORing using block and RID indexes.
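To verify which plan the optimizer chose for a query like the one on this page, the access plan can be captured; this is a sketch that assumes the explain tables have already been created (EXPLAIN.DDL) and a database named SAMPLE:

db2 set current explain mode explain
db2 "SELECT * FROM MDCTABLE WHERE COLOR='BLUE' AND NATION='USA'"
db2 set current explain mode no
db2exfmt -d sample -1 -o mdc_plan.txt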

© Copyright IBM Corporation 2004

Query Processing - Example (1 of 3)

Dimension block index lookup, followed by a mini-relation scan of the resulting blocks in the table.

Example: Block ANDing

SELECT * FROM MDCTABLE WHERE COLOR='BLUE' AND NATION='USA'

Key from dimension block index on Color (Blue):   (4,0) (12,0) (48,0) (52,0) (76,0) (100,0) (216,0)
Key from dimension block index on Nation (USA):   (12,0) (76,0) (92,0) (100,0) (112,0) (216,0) (276,0)
Resulting BID list of blocks to scan (AND):       (12,0) (76,0) (100,0) (216,0)


Figure 8-29. Query Processing - Example (2 of 3) CF457.3

Notes:

© Copyright IBM Corporation 2004

Query Processing - Example (2 of 3)

Dimensions: color, year, nation; RID index on part#

Example: ANDing block and RID indexes

SELECT * FROM MDCTABLE WHERE COLOR='BLUE' AND part# < 1000

Key from dimension block index on Color (Blue):   (4,0) (12,0) (48,0) (52,0) (76,0) (100,0) (216,0)
RIDs from RID index on Part #:                    (8,12) (6,4) (50,1) (77,3) (107,0) (115,0) (219,5) (276,9)
Resulting RIDs to fetch (AND):                    (6,4) (50,1) (77,3) (219,5)

Result is only those RIDs belonging to qualifying blocks.


Figure 8-30. Query Processing - Example (3 of 3) CF457.3

Notes:

© Copyright IBM Corporation 2004

Query Processing - Example (3 of 3)

Dimensions: color, year, nation; RID index on part#

Example: ORing block and RID indexes

SELECT * FROM MDCTABLE WHERE COLOR='BLUE' OR part# < 1000

Key from dimension block index on Color (Blue):   (4,0) (12,0) (48,0) (52,0) (76,0) (100,0) (216,0)
RIDs from RID index on Part #:                    (8,12) (6,4) (50,1) (77,3) (107,0) (115,0) (219,5) (276,9)
Resulting blocks and RIDs to fetch (OR):          blocks (4,0) (12,0) (48,0) (52,0) (76,0) (100,0) (216,0) plus RIDs (8,12) (107,0) (115,0) (276,9)

Result is all records in qualifying blocks, plus additional RIDs outside of those blocks.


Figure 8-31. Benefit: Faster Queries CF457.3

Notes:

Queries can benefit by using the block indexes.

© Copyright IBM Corporation 2004

Benefit: Faster Queries

Take advantage of block indexes.

Quickly and easily narrow down a portion of the tablehaving particular dimension values or ranges of values(block elimination).

Very fast index lookups as block indexes are small.

Relation scans of blocks faster than RID-based retrieval.

Index ANDing and ORing can be done at a block level,and mixed with RIDs.

Data guaranteed to be clustered on extents, so dataretrieval much faster.

Block indexes provide additional access plans to choosefrom, and do not prevent the use of traditional accessplans (RID scans, joins, table scans, and so forth).


Figure 8-32. Considerations for Dimension Selection CF457.3

Notes:

• For example, if you are clustering on three dimensions: date, nation, color:

- If the number of years spanned is 20 and there are 100 different nations and 5 colors

- The potential number of cells is 20*100*5*365=3,650,000

• If the skew of the data is such that some cells will be sparsely populated while others will be densely so, and space consumption is a concern, you may wish to have a smaller block size, so that the sparse cells will not take up as much wasted space.

• On the other hand, if you were to cluster on year instead of date, this would increase the density of cells, where the potential number of cells becomes: 20*100*5=10,000

• As already mentioned, the Design Advisor from the Control Center could be used to determine the right dimension keys.
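One of the two levers discussed above is the extent (block) size. As a sketch (table space name, file path, sizes, and columns are all illustrative), a table space with a small extent size can be created for a table whose cells are expected to be sparsely populated, with the date dimension rolled up to a generated year column:

CREATE TABLESPACE MDCSMALL
  MANAGED BY DATABASE USING (FILE '/db2/mdcsmall' 10000)
  EXTENTSIZE 2

CREATE TABLE SALES_FACT
  ( sale_date  DATE,
    nation     CHAR(25),
    color      VARCHAR(10),
    amt        DECIMAL(9,2),
    sale_year  INT GENERATED ALWAYS AS (YEAR(sale_date)) )
  IN MDCSMALL
  ORGANIZE BY DIMENSIONS (sale_year, nation, color)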

© Copyright IBM Corporation 2004

Considerations for Dimension Selection

When choosing dimensions for a table, consider:

First, which queries will benefit from block-level clustering:
- Columns in equality or range queries
- Columns with coarse granularity
- Foreign key columns in fact tables

Second, the expected density of cells based on expected data:
- # possible cells = cartesian product of dimension cardinalities
- Possibility of sparsely populated blocks/cells

Two factors to manipulate:
- Extent size - reduce if many sparse cells
- Roll up a dimension to a larger granularity with generated columns


Figure 8-33. MDC and Generated Columns - Integration CF457.3

Notes:

However, it is important to be aware of the restrictions; see the next page.

© Copyright IBM Corporation 2004

MDC and Generated Columns - Integration

Given an MDC table with dimension on generated columnmonth, where YearAndMonth = INTEGER(date)/100

For queries on the dimension (YearAndMonth), block index range scans can be used.

For queries on the base column (date), block index range scans canalso be done to narrow down which blocks to scan and then applythe predicates on date to the rows in those blocks only.

The compiler generates the additional dimension predicates to use.

Example: For the query

select * from MDCTABLE where date > '1999/03/03' and date < '2000/01/01'

The compiler generates the additional predicates which can be usedas range predicates for a block index scan

YearAndMonth >= 199903 and YearAndMonth <= 200001

This gives a list of blocks to be scanned, and the original predicatesare applied to the rows in those blocks.


Figure 8-34. Caution: The Importance of Monotonicity (1 of 4) CF457.3

Notes:

Restrictions on query rewrite.

© Copyright IBM Corporation 2004

Caution: The Importance of Monotonicity (1 of 4)

Range scans can only be done on derived predicates,as in the previous example, when the expression usedin the generated column definition is MONOTONIC

Monotonic means: if A > B then expr(A) >= expr(B) and if (A < B) then expr(A) <= expr(B)

Examples of monotonic operations include: A+B, A*B, A/B (where A,B>0), INTEGER(A), and so forth

Examples of non-monotonic operations include: A-B, month(A), day(A), and so forth

If the compiler cannot determine the monotonicity of anexpression, or if it determines that an expression is notmonotonic, only equality predicates can be used on thegenerated column (or dimension if it is a dimension)


Figure 8-35. Caution: The Importance of Monotonicity (2 of 4) CF457.3

Notes:

Examples of monotonic expressions.

© Copyright IBM Corporation 2004

Caution:The Importance of Monotonicity (2 of 4)

Example of a monotonic expression: B = A/100

Values of A Corresponding values of B

1 0

10 0

103 1

143 1

199 1

250 2

378 3

... ...

We can determine that as A increases in value, B neverdecreases in value


Figure 8-36. Caution: The Importance of Monotonicity (3 of 4) CF457.3

Notes:

Examples of non-monotonic expressions.

© Copyright IBM Corporation 2004

Caution:The Importance of Monotonicity (3 of 4)

Example of a non monotonic expression: B = month(date)

Values of Date month(date)

1999/03/03 03

1999/05/17 05

1999/12/25 12

2000/02/01 02

2001/05/24 05

... ...

We can determine that as A increases in value, B can bothincrease and decrease in value

Range predicates cannot be generated for B from predicateson the base column of date


Figure 8-37. Caution: The Importance of Monotonicity (4 of 4) CF457.3

Notes:

© Copyright IBM Corporation 2004

Caution:The Importance of Monotonicity (4 of 4)

Comparing non monotonic expression: B = month(date) tomonotonic expression: B = integer(date)/100

Values of Date    month(date)    integer(date)/100
1999/03/03        03             199903
1999/05/17        05             199905
1999/12/25        12             199912
2000/02/01        02             200002
2001/05/24        05             200105
...               ...            ...

If the dimension column, B, is defined as integer(date)/100,then range predicates against DATE could be translated torange predicates against the dimension.


Figure 8-38. MDC and Generated Columns CF457.3

Notes:

In the second example, scans on the RID index (base column) are the common case, but even if the block index isn't chosen for the plan, the RID index benefits from the clustering imposed by the block index.

Here is an example using a calculation:

create table MDC2 (custNo INT, clustCustNo INT generated always as (custno/100)) organize by (clustCustNo)

© Copyright IBM Corporation 2004

MDC and Generated Columns

YearAndMonth

Scans on date or YearAndMonthcan benefit from block index scan

Scans on custNo requiremuch smaller pageworking set

clustCustNo

41 2 31

custNo

DSS example:create table MDC1 (

date DATE,

province CHAR(2),

YearAndMonth INT generated always

as (INTEGER(date)/100) )

organize by (YearAndMonth)

OLTP example:create table MDC2 (

custNo INT,

clustCustNo INT generated always

as (case when custNo < 100 then 1

when custNo < 200 then 2

when custNo < 300 then 3

else 4 end))

organize by (clustCustNo)


Figure 8-39. Fast and Efficient Data Roll-in CF457.3

Notes:

Benefits for warehouses with rolling windows of data.

© Copyright IBM Corporation 2004

Fast and Efficient Data Roll-in

Load used for data roll-in

Faster:Very efficient algorithm for organizing data along dimension lines(previously, a need for separate full sort, to cluster data initially)

Less updates and logging for block indexes versus regularindexes

Better space management:Roll-in of new slice can reuse freed blocks from previouslyemptied blocks (load for regular table can only append)

Load uses block map to determine which blocks are free

Offline or online Load marks block status to "newly-loaded" so that table scansskip these blocks during online load


Figure 8-40. MDC Load Example CF457.3

Notes:

• The following restrictions apply to multidimensional clustering (MDC) tables:

- The SAVECOUNT option of the LOAD command is not supported. - The TOTALFREESPACE file type modifier is not supported since these tables

manage their own free space.

• When using the LOAD command with MDC, violations of unique constraints will be handled as follows:

- If the table included a unique key prior to the load operation, and duplicate records are loaded into the table, the original record will remain and the new records will be deleted during the delete phase.

- If the table did not include a unique key prior to the load operation, and both a unique key and duplicate records are loaded into the table, only one of the records with the unique key will be loaded, and the others will be deleted during the delete phase.

Note: There is no explicit technique for determining which record will be loaded and which will be deleted.

© Copyright IBM Corporation 2004

MDC Load Example

(Figure: an MDC table with cells for the months Jan through May of the years 2000, 2001, and 2002. Input records such as 2002-03-24, "BrandX", "VCR"; 2002-03-24, "BrandY", "Radio"; 2002-03-24, "BrandZ", "TV"; and 2002-02-02, "BrandU", "Gloves" are loaded into the cells matching their dimension values.)


Performance Considerations

• To improve the performance of the load utility when loading MDC tables, the UTIL_HEAP_SZ database configuration parameter should be set to a value that is 10 to 15% higher than usual. This will reduce disk I/O during the clustering of data that is performed during the load phase. When the DATA BUFFER option of the LOAD command is specified, its value should also be increased by 10-15%. If the LOAD command is being used to load several MDC tables concurrently, the UTIL_HEAP_SZ configuration parameter should be increased accordingly.

• MDC load operations will always have a build phase since all MDC tables have block indexes.

• During the load phase, extra logging for the maintenance of the block map will be performed. There are approximately two extra log records per extent allocated. To ensure good performance, the LOGBUFSZ database configuration parameter should be set to a value that takes this into account.

• A system temporary table with an index is used to load data into MDC tables. The size of the table is proportional to the number of distinct cells loaded. The size of each row in the table is proportional to the size of the MDC dimension key. To minimize disk I/O caused by the manipulation of this table during a load operation, ensure that the buffer pool for the temporary table space is large enough.
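As an illustrative sketch of the utility heap and data buffer guidance above (the database name, table name, and values are assumptions only):

-- Raise the utility heap roughly 10-15% before loading MDC tables
UPDATE DB CFG FOR sample USING UTIL_HEAP_SZ 60000

-- Give the load a correspondingly larger data buffer (in 4 KB pages)
LOAD FROM sales.del OF DEL
  INSERT INTO sales
  DATA BUFFER 8000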


Figure 8-41. MDC Integration CF457.3

Notes:

All the same characteristics apply as for regular tables, except that you do not need to reorg to maintain the cluster.


MDC Integration

MDC tables work just like regular tables

Can have RID indexes

All block and RID indexes in same object

Scalable: can exist in serial, SMP, MPP environments

Can have MQTs on MDC tables; MQTs can be MDC tables

Can be replicated

Tools can be used on them: reorg, load, import, runstats, Control Center, and so forth

Can define triggers, constraints, primary/foreign keys


Figure 8-42. In Conclusion CF457.3

Notes:


In Conclusion

MDC provides a unique and powerful solution for large database performance and high availability.

MDC benefits include:

Extending the performance advantages of clustering to multiple dimensions.

Clustering automatically and dynamically maintained over time.

Reorganization reduced to space reclamation only.

Data organization provides benefits of partition elimination.

Block-based indexes provide additional high-performance access plans and block elimination in queries.

Block index size results in faster scans and much less overhead for logging and maintenance.

Simple, flexible syntax makes it easy to set up and maintain.


Figure 8-43. Unit Summary CF457.3

Notes:


Unit Summary

Having completed this unit, you should be able to:

Identify advantages of multidimensional clustering (MDC) over single-dimensional clustering

Define cell, slice, and dimension

Describe performance impact of using MDC

Identify the block indexes that will be built automatically when an MDC table is created

Compare and contrast block indexes and RID indexes

Identify considerations when choosing dimensions


Unit 9. Advanced Load

What This Unit Is About

This unit focuses on some of the advanced features available in the Load utility.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Identify the benefits of online Load

• Identify the benefits of various Load options

• Identify the possibilities of Load from Cursor

• Describe the advantage of the Index Free Space parameter

• Use the Load terminate option

• Describe the Incremental Indexing mode of the Load utility

References

Data Movement Utilities Guide and Reference

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Command Reference


Figure 9-1. Unit Objectives CF457.3

Notes:


Unit Objectives

After completing this unit, you should be able to:

Identify the benefits of online Load

Identify the benefits of various Load options

Describe the advantage of the Index Free Space parameter

Use the Load terminate option

Describe the Incremental Indexing mode of the Load utility


9.1 Benefits of Online Load


Figure 9-2. LOAD Syntax CF457.3

Notes:

ALLOW NO ACCESS

Load will lock the target table for exclusive access during the load. The table state will be set to LOAD IN PROGRESS during the load. ALLOW NO ACCESS is the default behavior. It is the only valid option for LOAD REPLACE. When there are constraints on the table, the table state will be set to CHECK PENDING as well as LOAD IN PROGRESS. The SET INTEGRITY command must be used to take the table out of CHECK PENDING.
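As a minimal sketch of the default behavior described above (table and file names are hypothetical):

-- Default offline load; the table is locked exclusively for the duration
LOAD FROM trans.del OF DEL
  REPLACE INTO mydba.transactions
  ALLOW NO ACCESS

-- If the table has constraints, take it out of CHECK PENDING afterwards
SET INTEGRITY FOR mydba.transactions IMMEDIATE CHECKED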

ALLOW READ ACCESS

Load will lock the target table in a share mode. The table state will be set to both LOAD IN PROGRESS and READ ACCESS. Readers may access the non-delta portion of the data while the table is being loaded. In other words, data that existed before the start of the load will be accessible by readers to the table; data that is being loaded is not available until the load is complete. LOAD TERMINATE or LOAD RESTART of an ALLOW READ ACCESS load may use this option; LOAD TERMINATE or LOAD RESTART of an ALLOW NO ACCESS load may not use this option.


LOAD Syntax

LOAD FROM file/pipe/dev/cursor-name [ {,file/pipe/dev}...] OF {ASC | DEL | IXF | CURSOR}

[SAVECOUNT n] [ROWCOUNT n] [WARNINGCOUNT n]

[MESSAGES msg-file] [TEMPFILES PATH pathname]

{INSERT | REPLACE | RESTART | TERMINATE}

INTO table-name [( insert-column [ {,insert-column} ... ] )]

[FOR EXCEPTION table-name]

[CHECK PENDING CASCADE {DEFERRED | IMMEDIATE}]

[ALLOW NO ACCESS | ALLOW READ ACCESS [USE tblspace-name]] [LOCK WITH FORCE]

[[PARTITIONED DB CONFIG] partitioned-db-option [{partitioned-db-option}...]]


Furthermore, this option is not valid if the indexes on the target table are marked as requiring a rebuild.

When there are constraints on the table, the table state will be set to CHECK PENDING as well as LOAD IN PROGRESS, and READ ACCESS. At the end of the load the table state LOAD IN PROGRESS state will be removed but the table states CHECK PENDING and READ ACCESS will remain. The SET INTEGRITY command must be used to take the table out of CHECK PENDING. While the table is in CHECK PENDING and READ ACCESS, the non-delta portion of the data is still accessible to readers; the new (delta) portion of the data will remain inaccessible until the SET INTEGRITY command has completed. A user may perform multiple loads on the same table without issuing a SET INTEGRITY command. Only the original (checked) data will remain visible, however, until the SET INTEGRITY command is issued.

ALLOW READ ACCESS also supports the following modifiers:

USE tablespace-name

If the indexes are being rebuilt, a shadow copy of the index is built in table space tablespace-name and copied over to the original table space at the end of the load during an INDEX COPY PHASE. Only system temporary table spaces can be used with this option. If not specified then the shadow index will be created in the same table space as the index object. If the shadow copy is created in the same table space as the index object, the copy of the shadow index object over the old index object is instantaneous. If the shadow copy is in a different table space from the index object, a physical copy is performed. This could involve considerable I/O and time. The copy happens while the table is offline at the end of a load during the INDEX COPY PHASE.

Without this option the shadow index is built in the same table space as the original. Since both the original index and shadow index by default reside in the same table space simultaneously, there may be insufficient space to hold both indexes within one table space. Using this option ensures that you retain enough table space for the indexes.

This option is ignored if the user does not specify INDEXING MODE REBUILD or INDEXING MODE AUTOSELECT. This option will also be ignored if INDEXING MODE AUTOSELECT is chosen and LOAD chooses to incrementally update the index.
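A sketch of the combination described above; the table, file, and table space names are hypothetical, and TEMPSPACE1 must be a system temporary table space with the same page size as the index table space:

-- Build the shadow index in a system temporary table space instead of
-- the index table space, then copy it back during the INDEX COPY PHASE
LOAD FROM sales.del OF DEL
  INSERT INTO mydba.sales
  INDEXING MODE REBUILD
  ALLOW READ ACCESS USE tempspace1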

CHECK PENDING CASCADE

If LOAD puts the table into a check pending state, the CHECK PENDING CASCADE option allows the user to specify whether or not the check pending state of the loaded table is immediately cascaded to all descendents (including descendent foreign key tables, descendent immediate materialized query tables, and descendent immediate staging tables).


IMMEDIATE

Indicates that the check pending state (read or no access mode) for foreign key constraints is immediately extended to all descendent foreign key tables. If the table has descendent immediate materialized query tables or descendent immediate staging tables, the check pending state is extended immediately to the materialized query tables and the staging tables. Note that, for a LOAD INSERT operation, the check pending state is not extended to descendent foreign key tables even if the IMMEDIATE option is specified.

When the loaded table is later checked for constraint violations (using the IMMEDIATE CHECKED option of the SET INTEGRITY statement), descendent foreign key tables that were placed in check pending read state will be put into check pending no access state.

DEFERRED

Indicates that only the loaded table will be placed in the check pending state (read or no access mode). The states of the descendent foreign key tables, descendent immediate materialized query tables, and descendent immediate staging tables will remain unchanged.

Descendent foreign key tables may later be implicitly placed in the check pending no access state when their parent tables are checked for constraint violations (using the IMMEDIATE CHECKED option of the SET INTEGRITY statement).

Descendent immediate materialized query tables and descendent immediate staging tables will be implicitly placed in the check pending no access state when one of their underlying tables is checked for integrity violations. A warning will be issued to indicate that dependent tables have been placed in the check pending state.

If the CHECK PENDING CASCADE option is not specified:

- Only the loaded table will be placed in the check pending state. The state of descendent foreign key tables, descendent immediate materialized query tables, and descendent immediate staging tables will remain unchanged, and may later be implicitly put into the check pending state when the loaded table is checked for constraint violations.

If LOAD does not put the target table into check pending state, the CHECK PENDING CASCADE option is ignored.
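For illustration (hypothetical names), cascading the check pending state to descendent tables at load time:

LOAD FROM parent.del OF DEL
  REPLACE INTO mydba.parent_table
  CHECK PENDING CASCADE IMMEDIATE

-- Later, validate the constraints and bring the tables back to normal
SET INTEGRITY FOR mydba.parent_table IMMEDIATE CHECKED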

LOCK WITH FORCE

The utility acquires various locks, including table locks, in the process of loading. Rather than wait, and possibly time out, when acquiring a lock, this option allows load to force off other applications that hold conflicting locks. Forced applications will roll back and release the locks the load utility needs. The load utility can then proceed. This option requires the same authority as the FORCE APPLICATIONS command (SYSADM or SYSCTRL).


ALLOW NO ACCESS loads may force applications holding conflicting locks at the start of the load operation. At the start of the load the utility may force applications that are attempting to either query or modify the table.

ALLOW READ ACCESS loads may force applications holding conflicting locks at the start or end of the load operation. At the start of the load, the load utility may force applications that are attempting to modify the table. At the end of the load, it may force applications that are attempting to either query or modify the table.
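A sketch of the option (hypothetical names; requires SYSADM or SYSCTRL authority):

-- Force off applications holding conflicting locks instead of waiting
LOAD FROM orders.del OF DEL
  INSERT INTO mydba.orders
  ALLOW READ ACCESS
  LOCK WITH FORCE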

PARTITIONED DB CONFIG

Allows you to execute a load into a partitioned table. The PARTITIONED DB CONFIG parameter allows you to specify partitioned database-specific configuration options. The partitioned-db-option values may be any of the following:

HOSTNAME x
FILE_TRANSFER_CMD x
PART_FILE_LOCATION x
OUTPUT_DBPARTNUMS x
PARTITIONING_DBPARTNUMS x
MODE x
MAX_NUM_PART_AGENTS x
ISOLATE_PART_ERRS x
STATUS_INTERVAL x
PORT_RANGE x
CHECK_TRUNCATION
MAP_FILE_INPUT x
MAP_FILE_OUTPUT x
TRACE x
NEWLINE
DISTFILE x
OMIT_HEADER
RUN_STAT_DBPARTNUM x

- If load is executed from within a non-partitioned environment, it will behave as usual. If a partitioned database configuration option is specified, this will result in SQL Error 27959, reason code 1.

- Without any further qualification, in a partitioned database environment, the MODE option will default to PARTITION_AND_LOAD except when the DB2_PARTITIONEDLOAD_DEFAULT registry variable is set to NO. In this case, the following defaults will apply: the MODE will be LOAD_ONLY, OUTPUT_DBPARTNUMS will be a list containing the single database partition to which the user is currently connected, and PART_FILE_LOCATION will be the current working path of the client, if the load input file name is not fully qualified and the client and server are on the same physical machine, or the path prefix of the load input file if that file name is fully qualified. The purpose of this registry variable is to preserve the pre-Version 8 behavior of the LOAD utility in a partitioned database environment.

- In a partitioned database environment, if LOAD is used with the MODE option set to PARTITION_ONLY, the input file will be partitioned, resulting in the creation of a file on each output partition containing a partition map header and data for only that partition. For all file types except CURSOR, the name of the file that is created on each output partition is < filename >.< xxx >, where < filename > is the name of the input file specified in the LOAD command, and < xxx > is the number of the partition on which the file resides. Furthermore, the location of the file on each output partition is indicated by the PART_FILE_LOCATION option, if it is specified. If this option is not specified, the location of the input file is taken to be the current working directory. If the file type is CURSOR, the PART_FILE_LOCATION option is required and must specify a fully qualified basename. In this case, the name of the file created on each partition will be this basename appended with the appropriate partition number.

- In a partitioned database environment, if LOAD is used with the MODE option set to LOAD_ONLY, the files to be loaded are assumed to exist on each output partition and are assumed to contain a valid partition map header. For all file types except CURSOR, the name of the file on each partition is expected to be < filename >.< xxx >, where < filename > is the name of the input file specified in the load command, and < xxx > is the number of the partition on which the file resides. Furthermore, the location of the file on each partition is indicated by the PART_FILE_LOCATION option, if it is specified. If this option is not specified, the files will be read from the location indicated by the path prefix of the input file name, if that name is fully qualified, or the current working directory, if the input file name is not fully qualified. If the file type is CURSOR, the PART_FILE_LOCATION option is required and must specify a fully qualified basename. In this case, the name of the file on each partition is expected to be this basename appended with the appropriate partition number.

- In a partitioned database environment, if LOAD is used with the MODE option set to LOAD_ONLY_VERIFY_PART, the files to be loaded are assumed to exist on each output partition and are also assumed to contain no partition map header. Load will verify that the data in each file is on the proper partition. Rows that are not on the correct partition will be rejected and sent to a dumpfile, if one is specified. The name and location of the file on each output partition follows the same rules as the filename for the LOAD_ONLY mode.

Note: The LOAD_ONLY_VERIFY_PART mode is not supported when the file type is CURSOR.

If the CLIENT keyword of LOAD is specified, a remote load will be permitted. Only the PARTITION_AND_LOAD and PARTITION_ONLY modes are supported for loads where CLIENT is specified.
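As a sketch only (partition numbers, paths, and names are assumptions that depend on your partition layout):

-- Split the input file into per-partition files without loading them
LOAD FROM sales.del OF DEL
  INSERT INTO mydba.sales
  PARTITIONED DB CONFIG
    MODE PARTITION_ONLY
    PART_FILE_LOCATION /db2/splitfiles
    OUTPUT_DBPARTNUMS (1, 2, 3)

-- Later, load the pre-split files that now exist on each output partition
LOAD FROM sales.del OF DEL
  INSERT INTO mydba.sales
  PARTITIONED DB CONFIG
    MODE LOAD_ONLY
    PART_FILE_LOCATION /db2/splitfiles
    OUTPUT_DBPARTNUMS (1, 2, 3)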


Figure 9-3. Online Load Example CF457.3

Notes:

• Concurrent read/write table space access
  - Space map updates through buffer pool
  - Two log records written per extent allocation (DMS only, similar to V7 nonrecoverable)
  - Load no longer uses quiesce or the load pending/delete pending table space states
  - COPY NO — loads continue to place the table space in backup pending; they are also incompatible with online backup

• Concurrent Table Access
  - By default, no access allowed; table Z-locked
  - New ALLOW READ ACCESS clause — read access to pre-existing data, load append only

• Online Load Table Locking Behavior
  - NO ACCESS — table locked super-exclusive
  - READ ACCESS — table locked in U mode, briefly upgraded to Z prior to commit. Drains old readers prior to start (see next page).


Online Load Example

Insert into T values (1,‘aaaa’)
Insert into T values (2,‘bbbb’)
Insert into T values (3,‘cccc’)
Load from data.del of del insert into T allow read access

Select * from T

C1          C2
----------- ----
          1 aaaa
          2 bbbb
          3 cccc

  3 record(s) selected.

Insert a few rows into a table, then call Load with the "allow read access" option.

While Load is running, query the table content from a separate connection. Only rows existing prior to the start of the load will appear.


Figure 9-4. Locking - Offline versus Online Load CF457.3

Notes:

• The load utility provides two options that control the amount of access other applications have to a table being loaded. The ALLOW NO ACCESS option locks the table exclusively and allows no access to the table data while the table is being loaded. This is the default behavior. The ALLOW READ ACCESS option prevents all write access to the table by other applications, but allows read access to preloaded data. This section deals with the ALLOW READ ACCESS option.

• Table data and index data that exists prior to the start of a load operation are visible to queries while the load operation is in progress.

• The ALLOW READ ACCESS option is very useful when loading large amounts of data because it gives users access to table data at all times, even when the load operation is in progress or after a load operation has failed. The behavior of a load operation in ALLOW READ ACCESS mode is independent of the isolation level of the application. That is, readers with any isolation level can always read the preexisting data, but they will not be able to read the newly loaded data until the load operation has finished.


Locking - Offline versus Online Load

[Slide graphic: two timelines. "Load allows no access": the Z-lock is requested and granted at the start of the load, so concurrent read/write access stops until the load commit, after which read/write access resumes. "Load allows read access": a drain is requested and granted at the start, readers keep read access throughout the load, and the Z-lock is requested and granted only just before the load commit.]


• Read access is provided throughout the load operation except at the very end. Before data is committed, the load utility acquires an exclusive lock (Z-lock) on the table. The load utility will wait until all applications that have locks on the table release them. This may cause a delay before the data can be committed. The LOCK WITH FORCE option may be used to force off conflicting applications, and allow the load operation to proceed without having to wait.

• Usually, a load operation in ALLOW READ ACCESS mode acquires an exclusive lock for a short amount of time; however, if the USE <tablespaceName> option is specified, the exclusive lock will last for the entire period of the index copy phase.

Notes:

1. If a load operation is aborted, it remains at the same access level that was specified when the load operation was issued. So, if a load operation in ALLOW NO ACCESS mode aborts, the table data is inaccessible until a load terminate or a load restart is issued. If a load operation in ALLOW READ ACCESS mode aborts, the preloaded table data is still accessible for read access.

2. If the ALLOW READ ACCESS option was specified for an aborted load operation, it can also be specified for the load restart or load terminate operation. However, if the aborted load operation specified the ALLOW NO ACCESS option, the ALLOW READ ACCESS option cannot be specified for the load restart or load terminate operation.

• The ALLOW READ ACCESS option is not supported if:

- The REPLACE option is specified. Since a load replace operation truncates the existing table data before loading the new data, there is no preexisting data to query until after the load operation is complete.

- The indexes have been marked invalid and are waiting to be rebuilt. Indexes can be marked invalid in some rollforward scenarios or through the use of the db2dart command.

- The INDEXING MODE DEFERRED option is specified. This mode marks the indexes as requiring a rebuild.

- An ALLOW NO ACCESS load operation is being restarted or terminated. Until it is brought fully online, a load operation in ALLOW READ ACCESS mode cannot take place on the table.

- A load operation is taking place to a table that is in check pending state and is not in read access state. This is also the case for multiple load operations on tables with constraints. A table is not brought online until the SET INTEGRITY statement is issued.

• Generally, if table data is taken offline, read access is not available during a load operation until the table data is back online.

• In most cases, the load utility uses table level locking to restrict access to tables. The load utility does not quiesce the table spaces involved in the load operation, and uses table space states only for load operations with the COPY NO option specified. The level of locking depends on whether or not the load operation allows read access. A load operation in ALLOW NO ACCESS mode will use an exclusive lock (Z-lock) on the table for the duration of the load. A load operation in ALLOW READ ACCESS mode acquires and maintains a share lock (S-lock) for the duration of the load operation, and upgrades the lock to an exclusive lock (Z-lock) when data is being committed.

• Before a load operation in ALLOW READ ACCESS mode begins, the load utility will wait for all applications that began before the load operation to release locks on the target table. Since locks are not persistent, they are supplemented by table states that will remain even if a load operation is aborted. These states can be checked by using the LOAD QUERY command. By using the LOCK WITH FORCE option, the load utility will force applications holding conflicting locks off the table that it is trying to load into.

Locking Behavior for Load Operations in ALLOW READ ACCESS Mode

• At the beginning of a load operation, the load utility acquires a share lock (S-lock) on the table. It holds this lock until the data is being committed. The share lock allows applications with compatible locks to access the table during the load operation. For example, applications that use read-only queries will be able to access the table, while applications that try to insert data into the table will be denied. When the load utility acquires the share lock on the table, it will wait for all applications that hold locks on the table prior to the start of the load operation to release them, even if they have compatible locks. Since the load utility upgrades the share lock to an exclusive (Z-lock) when the data is being committed, there may be some delay in commit time while the load utility waits for applications with conflicting locks to finish.

• Note: The load operation will not time out while it waits for the applications to release their locks on the table.

LOCK WITH FORCE Option

• The LOCK WITH FORCE option can be used to force off applications holding conflicting locks on a table so that the load operation can proceed. If an application is forced off the system by the load utility, it will lose its database connection and an error will be returned (SQL1224N).

• For a load operation in ALLOW NO ACCESS mode, all applications holding table locks that exist at the start of the load operation will be forced.

• For a load operation in ALLOW READ ACCESS mode applications holding the following locks will be forced:

- Table locks that conflict with a table share lock (for example, import or insert).

- All table locks that exist at the commit phase of the load operation.

• When the COPY NO option is specified for a load operation on a recoverable database, all objects in the target table space will be locked in share mode before the table space is placed in backup pending state. This will occur regardless of the access mode. If the LOCK WITH FORCE option is specified, all applications holding locks on objects in the table space that conflict with a share lock will be forced off.


Table States

• In addition to locks, the load utility uses table states to control access to tables. A table state can be checked by using the LOAD QUERY command. The states returned by the LOAD QUERY command are as follows:

- Normal - No table states affect the table.

- Check Pending - The table has constraints which have not yet been verified. Use the SET INTEGRITY statement to take the table out of check pending state. The load utility places a table in the check pending state when it begins a load operation on a table with constraints.

- Load in Progress - There is a load operation in progress on this table.

- Load Pending - A load operation has been active on this table but has been aborted before the data could be committed. Issue a LOAD TERMINATE, LOAD RESTART, or LOAD REPLACE command to bring the table out of this state.

- Read Access Only - The table data is available for read access queries. Load operations using the ALLOW READ ACCESS option place the table in read access only state.

- Unavailable - The table is unavailable. The table may only be dropped or restored from a backup. Rolling forward through a non-recoverable load operation will place a table in the unavailable state.

- Not Load Restartable - The table is in a partially loaded state that will not allow a load restart operation. The table will also be in load pending state. Issue a LOAD TERMINATE or a LOAD REPLACE command to bring the table out of the not load restartable state. A table is placed in not load restartable state when a rollforward operation is performed after a failed load operation that has not been successfully restarted or terminated, or when a restore operation is performed from an online backup that was taken while the table was in load in progress or load pending state. In either case, the information required for a load restart operation is unreliable, and the not load restartable state prevents a load restart operation from taking place.

- Unknown - The LOAD QUERY command is unable to determine the table state.

• A table can be in several states at the same time. For example, if data is loaded into a table with constraints and the ALLOW READ ACCESS option is specified, the table state would be:

Tablestate:
  Check Pending
  Load in Progress
  Read Access Only


• After the load operation but before issuing the SET INTEGRITY statement, the table state would be:

Tablestate:
  Check Pending
  Read Access Only

• After the SET INTEGRITY statement has been issued, the table state would be:

Tablestate:
  Normal
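A minimal sketch of checking these states from the CLP (the table name and message file are hypothetical):

-- Report the table state (and load progress, if a load is running)
LOAD QUERY TABLE mydba.sales TO /tmp/sales_loadquery.msg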

Table Space States when COPY NO is Specified

• If a load operation with the COPY NO option is executed in a recoverable database, the table spaces associated with the load operation are placed in the backup pending table space state and the load in progress table space state. This takes place at the beginning of the load operation. The load operation may be delayed at this point while locks are acquired on the tables within the table space.

• When a table space is in backup pending state, it is still available for read access. The table space can only be taken out of backup pending state by taking a backup of the table space. Even if the load operation is aborted, the table space will remain in backup pending state because the table space state is changed at the beginning of the load operation, and cannot be rolled back if it fails. The load in progress table space state prevents online backups of a load operation with the COPY NO option specified while data is being loaded. The load in progress state is removed when the load operation is completed or aborts.

During a rollforward operation through a LOAD command with the COPY NO option specified, the associated table spaces are placed in restore pending state. To remove the table spaces from restore pending state, a restore operation must be performed. A rollforward operation will only place a table space in the restore pending state if the load operation completed successfully.
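As a sketch of the sequence described above (database, table space, table, and path names are hypothetical):

-- COPY NO load in a recoverable database puts the table space
-- into backup pending (and load in progress) state
LOAD FROM sales.del OF DEL
  INSERT INTO mydba.sales
  COPY NO

-- A table space backup clears the backup pending state
BACKUP DATABASE sample TABLESPACE (userspace1) TO /backups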


Figure 9-5. Online Load - Index Rebuild CF457.3

Notes:

• Indexes are built during the build phase of a load operation. There are four indexing modes that can be specified in the LOAD command:

1. REBUILD. All indexes will be rebuilt. The utility must have sufficient resources to sort all index key parts for both old and appended table data.

2. INCREMENTAL. Indexes will be extended with new data. This approach consumes index free space. It only requires enough sort space to append index keys for the inserted records. This method is only supported in cases where the index object is valid and accessible at the start of the load operation. If this mode is specified but not supported due to the state of the index, a warning is returned and the load operation continues in REBUILD mode. Similarly, if a load restart operation is begun in the load build phase, INCREMENTAL mode is not supported. Incremental mode is not supported when all of the following conditions are true (to bypass this restriction it is recommended that indexes are placed in a separate table space):

• The LOAD COPY option is specified


Online Load – Index Rebuild

[Slide graphic: the data table space holds the table data that existed prior to the Load plus the data appended by the Load; the index table space holds the pre-Load index version plus a shadow index, built during the Load, that contains both new and old keys. During the Load a reader can only see preexisting data and accesses the original index. After the Load finishes, the shadow becomes the new original, and the reader can see old and new data after the switch.]


• The table resides in a DMS table space

• The index object resides in a table space that is shared by other table objects belonging to the table being loaded.

3. AUTOSELECT. The load utility will automatically decide between REBUILD or INCREMENTAL mode. This is the default.

Note: You may decide to explicitly choose an indexing mode, because the behaviors of the REBUILD and INCREMENTAL modes are quite different.

4. DEFERRED. The load utility will not attempt index creation if this mode is specified. Indexes will be marked as needing a refresh, and a rebuild may be forced the first time they are accessed. This option is not compatible with the ALLOW READ ACCESS option, because it does not maintain the indexes, and index scanners require a valid index.
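For illustration (hypothetical names), an online load that explicitly chooses incremental index maintenance rather than leaving the choice to AUTOSELECT:

LOAD FROM delta.del OF DEL
  INSERT INTO mydba.sales
  INDEXING MODE INCREMENTAL
  ALLOW READ ACCESS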

• Load operations that specify the ALLOW READ ACCESS option require special consideration in terms of space usage and logging depending on the type of indexing mode chosen. When the ALLOW READ ACCESS option is specified, the load utility keeps indexes available for queries even while they are being rebuilt.

• When a load operation in ALLOW READ ACCESS mode specifies the INDEXING MODE REBUILD option, new indexes are built as a shadow either in the same table space as the original index or in a system temporary table space. The original indexes remain intact and are available during the load operation, and are only replaced by the new indexes at the end of the load operation while the table is exclusively locked. If the load operation fails and the transaction is rolled back, the original indexes will remain intact.

Building New Indexes in the Same Table Space as the Original

• By default, the shadow index is built in the same table space as the original index. Since both the original index and the new index are maintained simultaneously, there must be sufficient table space to hold both indexes at the same time. If the load operation is aborted, the extra space used to build the new index is released. If the load operation commits, the space used for the original index is released and the new index becomes the current index.

• When the new indexes are built in the same table space as the original indexes, replacing the original indexes will take place almost instantaneously. If the indexes are built in a DMS table space, the new shadow index cannot be seen by the user. If the indexes are built within an SMS table space, the user may see index files in the table space directory with the .IN1 suffix and the .INX suffix. These suffixes do not indicate which is the original index and which is the shadow index.

Building New Indexes in a System Temporary Table Space

• The new index can be built in a system temporary table space to avoid running out of space in the original table space. The USE <tablespaceName> option allows the indexes to be rebuilt in a system temporary table space when using the INDEXING


MODE REBUILD and ALLOW READ ACCESS options. The system temporary table may be an SMS or a DMS table space, but the page size of the system temporary table space must match the page size of the original index table space.

• The USE <tablespaceName> option is ignored if the load operation is not in ALLOW READ ACCESS mode, or if the indexing mode is incompatible. The USE <tablespaceName> option is only supported for the INDEXING MODE REBUILD or INDEXING MODE AUTOSELECT options. If the INDEXING MODE AUTOSELECT option is specified and the load utility selects incremental maintenance of the indexes, the USE <tablespaceName> option will be ignored.

• A load restart operation may use an alternate table space for building an index even if the original load operation did not use an alternate table space. A load restart operation cannot be issued in ALLOW READ ACCESS mode if the original load operation was not issued in ALLOW READ ACCESS mode.

• Load terminate operations do not rebuild indexes, so the USE <tablespaceName> option will be ignored. During the build phase of the load operation, the indexes are built in the system temporary table space. Then, during the index copy phase, the index is copied from the system temporary table space to the original index table space. To make sure that there is sufficient space in the original index table space for the new index, space is allocated in the original table space during the build phase. So, if the load operation is going to run out of index space, it will do it during the build phase. If this happens, the original index will not be lost.

• The index copy phase occurs after the build and delete phases. Before the index copy phase begins, the table is locked exclusively. That is, it is unavailable for read access throughout the index copy phase. Since the index copy phase is a physical copy, the table may be unavailable for a significant amount of time.

Note: If either the system temporary table space or the index table space is a DMS table space, the read from the system temporary table space can cause random I/O on the system temporary table space and can cause a delay. The write to the index table space is still optimized, and the DISK_PARALLELISM values will be used.


Figure 9-6. Online Load - Incremental Indexing CF457.3

Notes:

• When a load operation in ALLOW READ ACCESS mode specifies the INDEXING MODE INCREMENTAL option, the load utility will write some log records that protect the integrity of the index tree. The number of log records written is a fraction of the number of inserted keys and is a number considerably less than would be needed by a similar SQL insert operation. A load operation in ALLOW NO ACCESS mode with the INDEXING MODE INCREMENTAL option specified writes only a small log record beyond the normal space allocation logs.


Online Load – Incremental Indexing

[Slide graphic: during the build phase of an ALLOW READ ACCESS online Load with incremental indexing, keys for the newly appended portion of the table are inserted into the index, but they remain invisible until the end of the Load; the index is maintained incrementally during the build phase of the Load.]


9.2 Load File Type Modifiers


Figure 9-7. Load File Type Modifiers CF457.3

Notes:

The MODIFIED BY options are used either to improve performance of the LOAD operation or to modify the file input. They depend on the file format. Let’s have a look at the most important options.

ANYORDER is used in conjunction with the CPU_PARALLELISM parameter and specifies that preservation of the source data order is not required. If SAVECOUNT is > 0, this option is not supported, since crash recovery after a consistency point requires that data be loaded in sequence.

FASTPARSE reduces syntax checking on column values. Use this modifier with clean data only, because, for example, if a value of 456zuv8 should be loaded into an integer column, no syntax error is detected and an arbitrary number is loaded. This option is not supported in conjunction with the CURSOR or IXF file types.

GENERATEDIGNORE informs the load utility that data for all generated columns is present in the data file but should be ignored.


Load File Type Modifiers

ANYORDER

FASTPARSE

GENERATEDIGNORE

GENERATEDMISSING

GENERATEDOVERRIDE

IDENTITYIGNORE

IDENTITYMISSING

IDENTITYOVERRIDE

INDEXFREESPACE=X

LOBSINFILE

PAGEFREESPACE=X

TOTALFREESPACE=X

USEDEFAULTS

DATEFORMAT=X

DUMPFILE=X

IMPLIEDDECIMAL

TIMEFORMAT=X

BINARYNUMERICS

NOCHECKLENGTHS

RECLEN=X

STRIPBLANKS


GENERATEDMISSING indicates that the input data file contains no data for the generated column (not even NULLs).

GENERATEDOVERRIDE instructs the load utility to accept user-supplied data for all generated columns in the table (contrary to the normal rules for these types of columns).

IDENTITYIGNORE informs the load utility that data for the identity column is present in the data file but should be ignored.

IDENTITYMISSING assumes that the input data file has no values for the identity column (not even NULLs).

IDENTITYOVERRIDE should be used only when an identity column defined as GENERATED ALWAYS is present in the table. It instructs the utility to accept explicit, non-NULL data for such a column (contrary to the normal rules for these types of identity columns).

INDEXFREESPACE=X (X= an integer between 0 and 99) is interpreted as the percentage of each index page that is to be left as free space when load rebuilds the index.

LOBSINFILE indicates that LOB values are stored in separate files; the path to the files containing the LOB data is given by the LOBS FROM clause of the LOAD command.

PAGEFREESPACE=X (X is an integer between 0 and 100) and is interpreted as the percentage of each data page that is to be left as free space.

TOTALFREESPACE=X (X is an integer) is interpreted as the percentage of the total pages in the table that is to be appended to the end of the table as free space.

USEDEFAULTS loads default values if a source column for a target table column has been specified but contains no data for one or more rows.

DATEFORMAT=X where X is the format of the date in the source file.

DUMPFILE=X where X is the fully qualified name (according to the server database partition) of an exception file to which rejected rows are written. The file will be created and owned by the instance owner. To override the default file permissions, use the dumpfileaccessall file type modifier.

DUMPFILEACCESSALL=x grants read access to “others” when a dump file is created.

IMPLIEDDECIMAL indicates that the location of an implied decimal point is determined by the column definition; it is no longer assumed to be at the end of the value. For example, the value 12345 is loaded into a DECIMAL(8,2) column as 123.45 and not as 12345.00.

TIMEFORMAT=X where X is the format of the time in the source file.

BINARYNUMERICS indicates that the numeric (not decimal) data is in binary form to avoid costly conversions.

NOCHECKLENGTHS makes an attempt to load each row, even if the source data has a column definition that exceeds the size of the target table column.

RECLEN=X where X is an integer with a maximum value of 32,767. X characters are read for each row, and a new-line character is not used to indicate the end of the row.


STRIPBLANKS truncates any trailing blank spaces when loading data into a variable-length field.

Detailed information, as well as the rest of the MODIFIED BY options, can be found in the DB2 UDB COMMAND Reference and the DB2 UDB Data Movement Utilities Guide and Reference.
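As a sketch that combines several of these modifiers (the file, table, and format strings are hypothetical):

LOAD FROM trans.del OF DEL
  MODIFIED BY FASTPARSE ANYORDER USEDEFAULTS
              DATEFORMAT="YYYY-MM-DD" TIMEFORMAT="HH:MM:SS"
              DUMPFILE=/home/db2inst1/trans.dmp
  INSERT INTO mydba.transactions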


Figure 9-8. Free Space in Index and Data Pages CF457.3

Notes:

There are several MODIFIED BY parameters that can be used to control the amount of free space available after a table is loaded. These parameters can be used individually or together to provide free space to allow for INSERT and UPDATE growth to the table following the completion of the LOAD.

Using this free space for subsequent inserts and updates keeps related rows close together, and generally keeps table performance better than if no free space were provided.

• Totalfreespace allows you to append empty data pages to the end of the loaded table. The number used with this parameter is the percentage of the total pages in the table that is to be appended to the end of the table as free space. Therefore, if you used 20 with this parameter, and the table has 100 data pages, 20 additional empty pages are appended. The total number of data pages in the table will then be 120.

• Pagefreespace allows you to control the amount of free space allowed on each loaded data page. The number used with this parameter is the percentage of each data page that is to be left as free space. The first row in a page is added without restriction.


Free Space in Index and Data Pages

[Slide graphic: example settings. With pagefreespace = 25, each loaded data page is filled to 75% and keeps 25% free; with totalfreespace = 20, a 100-page table has 20% additional empty pages appended, for 120 pages in total; with indexfreespace = 20, each loaded leaf page of the index keeps 20% free, while non-leaf pages keep at most about 10% free.]


Therefore, with very large rows and a large number used with this parameter, there may be less space left free on each page than indicated by the value used with this parameter.

• Indexfreespace allows you to control the amount of free space allowed on each loaded index page. The number used with this parameter is the percentage of each index page that is to be left as free space.

- The first index entry in a page is added without restriction.

- Additional index entries are placed in the index page, provided that the percent free space threshold can be maintained.

The default value is the one used at CREATE INDEX time.

This value takes precedence over the PCTFREE value specified in the CREATE INDEX statement.

If you determine to use pagefreespace and you have an index on the table, you should consider using indexfreespace. When deciding on the amount of free space to leave for each, consider that the size of each row being inserted into the table will likely be larger than the size of the associated key to be inserted into the index.

In addition, the page size of the table spaces for the table and the index may be different.

• If it is set to an invalid value, the default is set to 10%.

• Only multiples of 10% are then valid, up to a maximum of 60%.
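A sketch combining the three free-space modifiers discussed above (names and percentages are illustrative only):

LOAD FROM sales.del OF DEL
  MODIFIED BY PAGEFREESPACE=25 INDEXFREESPACE=20 TOTALFREESPACE=20
  REPLACE INTO mydba.sales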


Figure 9-9. Load From Cursor - Example CF457.3

Notes:

• By specifying the CURSOR file type when using the LOAD command, you can load the results of an SQL query directly into a target table without creating an intermediate exported file. By referencing a nickname within the SQL query, the load utility can also load data from another database in a single step.

• To execute a load from cursor operation from the CLP, a cursor must first be declared against an SQL query. Once this is done, you can issue the LOAD command using the declared cursor’s name as the cursorname and CURSOR as the file type.

• For example:

Table ABC.TABLE1 has 3 columns: ONE INT, TWO CHAR(10), THREE DATE
Table ABC.TABLE2 has 3 columns: ONE VARCHAR, TWO INT, THREE DATE

• Executing the following CLP commands will load all the data from ABC.TABLE1 into ABC.TABLE2:

DECLARE mycurs CURSOR FOR SELECT TWO,ONE,THREE FROM abc.table1
LOAD FROM mycurs OF cursor INSERT INTO abc.table2


Load From Cursor - Example

Create nickname sales for oracledb.salesdata.sales …
Create nickname employee for oracledb.salesdata.cashier …
Declare C cursor for select sales.* from sales, employee where sales.cashierid = employee.id and employee.name = ‘John Smith’ …
Load from C of cursor insert into johns_sales


Notes:

1. The above example shows how to load from an SQL query through the CLP. However, loading from an SQL query can also be done through the db2Load API, by properly defining the piSourceList and piFileType values of the db2LoadStruct structure.

2. As demonstrated above, the source column types of the SQL query do not need to be identical to their target column types, although they do have to be compatible.

This is a new powerful tool for moving data between tables or systems. It supports arbitrary SELECT statements: single table, joins, even nicknames!


Figure 9-10. LOAD ... HOLD QUIESCE CF457.3

Notes:

The HOLD QUIESCE option specifies that the LOAD utility should leave the table space in a quiesced exclusive state after the load completes. The QUIESCE TABLESPACES FOR TABLE <table-name> RESET command must be used to reset the table space state to normal.


LOAD ... HOLD QUIESCE

Specifies that the LOAD command leave the table space quiesced in exclusive state after the load completes

QUIESCE TABLESPACES FOR TABLE ... RESET sets the state back to normal
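A minimal sketch (table, file, and schema names are hypothetical; see the LOAD syntax diagram in the Command Reference for exact clause placement):

LOAD FROM parts.del OF DEL INSERT INTO mydba.parts HOLD QUIESCE

-- Later, return the table space to its normal state
QUIESCE TABLESPACES FOR TABLE mydba.parts RESET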


Figure 9-11. Load and Load Performance Considerations CF457.3

Notes:

The performance of the load utility depends on the nature and the quantity of the data, the number of indexes, and the load options specified.

Unique indexes reduce load performance if duplicates are encountered. For the performance of index creation during a load operation, the sortheap database configuration parameter is important. If an index is too large to be sorted in memory, a sort spill occurs: the data is divided among several sort runs and stored in a temporary table space. If the sortheap parameter cannot be increased, it is therefore important that the buffer pools for the temporary table spaces are large enough. Also, consider using high-performance sorting libraries, such as those from third-party vendors; the DB2SORT environment variable can be used to specify the location of the sorting library.

The load utility attempts to deliver the best performance possible by determining optimal values for DISK_PARALLELISM (the number of processes or threads used by the load utility to write data records to disk), CPU_PARALLELISM (intra-partition parallelism), and DATA BUFFER (the total amount of memory (util_heap_sz) allocated to the load utility).


Load and Load Performance Considerations

SORTHEAP / DB2SORT

INDEXES

ALLOW READ ACCESS

SAVECOUNT

STATISTICS YES

USE <Tablespace name>

NONRECOVERABLE


These values are determined only if the parameters have not been specified by the user, and the choice is based on the size of, and the free space available in, the utility heap.

ALLOW READ ACCESS will enable users to query a table (data that existed in the table prior to the load operation) while a load operation is in progress.

SAVECOUNT is used to set an interval for the establishment of consistency points during a load. If done too frequently, there will be a noticeable reduction in load performance.

STATISTICS YES is used to collect data distribution and index statistics, and allows applications to use new access paths based on the latest statistics. Using STATISTICS YES is more effective than running the runstats utility after completion of the load.

USE <tablespace name> allows an index to be rebuilt in a system temporary table space and copied back to the index table space. If ALLOW READ ACCESS is used, new indexes are built as a shadow and replace the original indexes at the end of the load operation. By default, the shadow index is built in the same table space as the original index.

Additionally, in a partitioned database, remember the MODE option, which specifies the mode in which the load operation will take place (for example, PARTITION_AND_LOAD).

The NONRECOVERABLE option will be discussed in detail later in this unit.
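As an illustrative sketch pulling several of these options together (table, file names, and values are assumptions only):

LOAD FROM bigfile.del OF DEL
  MODIFIED BY FASTPARSE ANYORDER
  INSERT INTO mydba.bigtable
  STATISTICS YES WITH DISTRIBUTION AND DETAILED INDEXES ALL
  DATA BUFFER 16000
  CPU_PARALLELISM 4
  DISK_PARALLELISM 4
  INDEXING MODE AUTOSELECT
  ALLOW READ ACCESS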


Figure 9-12. Non-Recoverable Load Scenario CF457.3

Notes:

One of the options for the load utility is the NONRECOVERABLE option. This option allows you to perform a non-recoverable load of a table without affecting the recoverability of all other tables in the database. This option can be used when forward recovery (archival logging) has been enabled and the default COPY NO load option is used. The NONRECOVERABLE option specifies that the load transaction is to be marked as non-recoverable. To recover the table, you must either reissue the load command or use import.

One of the scenarios where this option could be used is shown on the graphic. If you need to perform multiple load operations in the same table space, you could use the NONRECOVERABLE option on all but the last table. The last table would be loaded without using the NONRECOVERABLE option, causing the table space to be placed in a backup pending state. The benefit of this method is that only one backup is necessary after all the loads are completed.

Another use of this option could be for loading a read-only table, especially if the table is large. In this case, it is not important that the table is non-recoverable, as, in the event of a failure, a load could be used to restore the table.
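A sketch of the multiple-load scenario (schema, table space, and path names are hypothetical):

-- Tables t1 through tn-1: no backup pending, but the loads themselves
-- are not recoverable through roll-forward
LOAD FROM t1.del OF DEL INSERT INTO mydba.t1 NONRECOVERABLE
LOAD FROM t2.del OF DEL INSERT INTO mydba.t2 NONRECOVERABLE

-- Last table: a normal COPY NO load puts the table space in backup pending
LOAD FROM tn.del OF DEL INSERT INTO mydba.tn

-- One backup after all the loads clears backup pending
BACKUP DATABASE sample TABLESPACE (userspace1) TO /backups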


Non-Recoverable Load Scenario

[Slide graphic: a timeline with archival logging enabled, loading multiple tables (t1, t2, ... tn) in the same table space. Tables t1 through tn-1 are loaded with "load from t1.del of del insert into t1 NONRECOVERABLE" (backup not required), while the last table is loaded normally ("load from tn.del of del insert into tn") to force backup pending, so a single backup is taken once, after the nth table is loaded.]


The next several pages illustrate the considerations necessary for using a non-recoverable load.


Figure 9-13. Non-Recoverable Load Considerations (1 of 2) CF457.3

Notes:

There are some important considerations to be aware of when using a non-recoverable load. These considerations are explained in the following example:

1. Log retain is enabled (LOGRETAIN=YES) so that DB2 UDB will retain all the database recovery logs.

2. At the beginning of the day, a "full database backup" is taken, which includes table t1.

3. Throughout the morning, insert transactions are performed, and the number of rows in table t1 grows. All transactions against the table are logged.

4. At midday, a non-recoverable load is performed which inserts a large amount of new data into table t1:

load from t1.del of del insert into t1 nonrecoverable

5. Once the load is completed, the table is usable since the table space was not placed in BACKUP PENDING state.


6. Throughout the afternoon, more inserts are performed in table t1. This is in addition to the rows inserted in the morning and during the load.

7. Late in the afternoon, the system experiences a disk crash which makes a database recovery necessary (restore with a roll-forward).

8. Now let us look at what happens to table t1 when we try to recover.


Figure 9-14. Non-Recoverable Load Considerations (2 of 2) CF457.3

Notes:

When a load is run with the NONRECOVERABLE option, the load transaction is marked as non-recoverable. It will not be possible to recover this load with a subsequent roll-forward action. When processing a roll-forward recovery, the load transaction will be bypassed. The table into which the data was loaded will be marked as "invalid". The roll-forward will also ignore any subsequent transactions against that table (updates/inserts/deletes and so forth). After the roll-forward completes, a non-recoverable table can only be dropped or replaced with a Load or Import. The NONRECOVERABLE option gives the user more flexibility to mix tables that need a roll-forward recovery strategy, with tables that do not, in the same database. This would also be useful when loading a large read-only table into a table space which already has updateable tables which require full recovery.

We will now resume the example shown on the previous notes page. The following steps are used to recover t1, and other tables in the table space(s):

1. A restore (either table space in which t1 resides, or database) is performed to restore t1 and other tables.


2. A roll-forward to end-of-logs is issued to replay all the transactions after the backup was taken. All the inserts from the morning are reapplied to table t1 and the other tables.

3. When the roll-forward reaches the transaction containing the non-recoverable load, it does not reapply this transaction. The table is put into an inaccessible state.

4. The roll-forward recovery will continue to process all other inserts from the afternoon period. Since table t1 is marked invalid, any transactions which change the t1 table are ignored.

5. The roll-forward completes. All tables except t1 will contain data that is current with the log files.

6. The other tables will now be accessible by users.

This example shows the dangers associated with non-recoverable load. This type of load should be used with caution.

• The only way to recover table t1 is to drop it, recreate it, and reload the data, or to use IMPORT with the CREATE option if the data is available in IXF format.
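Under the example's assumptions, the recovery sequence might look like the following sketch (database SAMPLE, table space TS1, DDL file create_t1.ddl, and data file t1.del are illustrative names):

db2 "restore db sample tablespace (ts1)"
db2 "rollforward db sample to end of logs and complete tablespace (ts1)"
db2 "drop table t1"
db2 -tvf create_t1.ddl
db2 "load from t1.del of del insert into t1"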


9.3 Additional Load Utility Options


Figure 9-15. LOAD - TEMPFILES PATH Option CF457.3

Notes:

TEMPFILES PATH specifies the name of the path to be used when creating temporary files during a load operation, and it should be fully qualified according to the server database partition. Temporary files take up file system space; sometimes this space requirement is quite substantial.

Concurrent LOAD invocations are allowed from the same directory:

• Remote message files are no longer written to the current working directory by default. This allows for simpler concurrent LOAD invocations from the same directory.

• The default name of the remote file is changed from db2utmp to db2load.

• By default, remote files are written to a subdirectory of the database path, namely: <fully qualified database path>/LFB<pool_id/TID>/L<obj_id/FID>.

How much space is required for the temporary files:

• 4 bytes for each duplicate or rejected row containing DATALINK values

• 136 bytes for each message that the load utility generates

• 15 KB overhead if the data file contains long field data or LOBs. This quantity can grow significantly if the INSERT option is specified.
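A minimal sketch of specifying the option (the STAFF table, the staffbig.del input file, and the /u/ltempdir/staff path are borrowed from the example that follows; the clause order follows the Version 8 LOAD syntax):

db2 "load from staffbig.del of del messages staff.msg tempfiles path /u/ltempdir/staff insert into staff"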

The graphic explains the remote file: it is used by the LOAD QUERY command to shadow the message file; it is not created in the current directory, which allows concurrent LOAD invocations from the same directory; by default it is named db2load and written to a subdirectory of the database path (<fully qualified database path>/LFB<pool_id/TID>/L<obj_id/FID>); and the TEMPFILES PATH option on the LOAD command (with INSERT, REPLACE, RESTART, or TERMINATE) is used to specify its path:

LOAD ... TEMPFILES PATH temp-pathname

Figure 9-16. LOAD QUERY Command CF457.3

Notes:

LOAD QUERY checks the status of a load during processing and returns the table state. A connection to the database and a separate CLP session are required to successfully invoke this command either locally or remotely.

TABLE table-name is the name of the table where data is currently being loaded. If an unqualified table name is used, the table will be qualified with the CURRENT SCHEMA.

TO local-message-file specifies the destination for warning and error messages. This file cannot be the message-file specified for the LOAD command. If the file already exists, all messages are appended to it.

NOSUMMARY indicates that no load summary information, such as rows read, rows skipped, rows loaded, rows rejected, rows deleted, rows committed, or number of warnings, will be reported.

SUMMARYONLY indicates that only the summary information is to be reported.

SHOWDELTA indicates that only new information since the last invocation of the LOAD QUERY command is reported.

The graphic shows the LOAD QUERY syntax: specify the table being loaded (or the directory of the temporary files) and a local message file, and optionally request only summary information, suppress the summary, or display only updated information:

LOAD QUERY TABLE table-name TO local-message-file [NOSUMMARY | SUMMARYONLY] [SHOWDELTA]

The LOAD QUERY command also can be used to determine the table state. Possible table states are:

• NORMAL

• CHECK PENDING

• LOAD IN PROGRESS

• LOAD PENDING

• READ ACCESS ONLY

• UNAVAILABLE (here the table may only be dropped or it may be restored from a backup. For example, roll-forward through a non-recoverable load will put a table into the unavailable state.)

• NO LOAD RESTARTABLE (the table is partially loaded and will not allow a load restart. The table is also in a LOAD PENDING state. Issue a load terminate or a load replace to bring the table out of this state)

• UNKNOWN

The progress of a load operation can also be monitored with the LIST UTILITIES command.
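For example, from a second CLP session you could issue the following while the load runs (a sketch; the exact output format depends on the release):

db2 list utilities show detail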


Figure 9-17. LOAD QUERY - Example CF457.3

Notes:

• LOAD always writes the remote message file to a subdirectory of the system database directory by default.

• The syntax of LOAD QUERY is extended to handle the choice of query remote-file/query table-name. The table name may or may not be fully qualified on the LOAD QUERY command.

• A LOAD QUERY request from a down-level client has no problem executing on a higher level server. However, a higher-level client cannot take advantage of the added LOAD QUERY functionality on a down-level server.

The example on the graphic checks the status of a load into the STAFF table that was started with TEMPFILES PATH /u/ltempdir/staff on the LOAD command. Either the TEMPFILES PATH value or the table name can be specified:

db2 load query /u/ltempdir/staff to /u/mydir/staff.tempmsg

db2 load query table STAFF to /u/mydir/staff.tempmsg

The output file might look like the following:

SQL3501W The table space(s) in which the table resides will not be placed in backup pending state since forward recovery is disabled for the database.

SQL3109N The utility is beginning to load data from file “/u/mydir/data/staffbig.del”

SQL3500W The utility is beginning the “LOAD” phase at time “03-13-2004 16:05:16.456073”.

SQL3519W Begin Load Consistency Point. Input record count = “0”.

SQL3520W Load Consistency Point was successful.

SQL3519W Begin Load Consistency Point. Input record count = “1032180”.

SQL3520W Load Consistency Point was successful.

SQL3519W Begin Load Consistency Point. Input record count = “206313”.

SQL3520W Load Consistency Point was successful.

SQL3519W Begin Load Consistency Point. Input record count = “309015”.

SQL3520W Load Consistency Point was successful.

SQL3532I The Load utility is currently in the “LOAD” phase.

Number of rows read      = 325843
Number of rows skipped   = 0
Number of rows loaded    = 325843
Number of rows rejected  = 0
Number of rows deleted   = 0
Number of rows committed = 309015
Number of warnings       = 0
Tablestate:
  Load in Progress


Figure 9-18. LOAD RESTART CF457.3

Notes:

Load RESTART option:

• Users do not need to check the message file in order to determine where to restart. Users need only to specify the RESTART keyword, and LOAD collects all the information needed on its own.

Performance improvement:

• The possibility of breaking down each build and delete phase into smaller sections is automatically considered, and then LOAD is restarted incrementally from the last completed section.

External changes for LOAD:

• One more temporary file is created when LOAD proceeds with the SAVECOUNT option. This additional log file is used for shadowing. During restart, LOAD validates and reads information from this new temporary file for recovery.

• Two log files of fixed size are preallocated at setup time to make sure that LOAD does not run out of log file space.


• During restart, LOAD determines where to start without the user's intervention. LOAD always restarts from the last successful consistency point if LOAD has not completed the load phase. If LOAD crashed or failed in the build phase, it is possible to restart only from the last incomplete index instead of rebuilding all the indexes again. If LOAD crashed or failed in the delete phase, it is possible to establish save points at every fixed number of rows deleted so that LOAD can start processing the RID file at some commit point instead of always from the beginning.

• All load restart operations will choose the REBUILD indexing mode even if the INCREMENTAL option is specified.

Restarting an ALLOW READ ACCESS Load

• A load operation specified with ALLOW READ ACCESS can be restarted using either ALLOW READ ACCESS or ALLOW NO ACCESS. A load operation specified with ALLOW NO ACCESS cannot be restarted using ALLOW READ ACCESS.

• If an index object is unavailable or marked invalid, a restart operation in ALLOW READ ACCESS is not permitted.

• If the load operation aborted in the index copy phase, a restart with ALLOW READ ACCESS is not permitted because the index may be corrupt.

• If the abort of the load with ALLOW READ ACCESS was in the load phase, the restart will be in the load phase. If it was aborted in any phase other than the load phase, it will restart in the build phase.
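A minimal sketch of restarting a failed load (the same input file, table, and options as the original invocation should generally be specified; names assumed):

db2 "load from staff.del of del restart into staff"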


Figure 9-19. LOAD - TERMINATE CF457.3

Notes:

• LOAD TERMINATE will roll back a previously interrupted (or crashed) load to the point in time when it started, even if consistency points were passed. The states of the table spaces involved go back to normal, and all table objects are consistent (index objects may be marked as invalid, in which case an index rebuild automatically takes place at the next access).

• LOAD TERMINATE will not remove a backup pending state from table spaces.

• For a table with DATALINK columns, LOAD TERMINATE is not supported.

• If the load being terminated is a LOAD REPLACE, the table will be truncated to an empty table.

• If the load being terminated is a LOAD INSERT, the table will retain all of its original records.
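A minimal sketch of terminating an interrupted load (table and file names assumed; the input file must still be named on the command):

db2 "load from staff.del of del terminate into staff"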


Figure 9-20. LOAD - Incremental Indexing CF457.3

Notes:

You can specify the INDEXING MODE option for the LOAD utility. The INDEXING MODE options are:

• AUTOSELECT

- The load utility automatically decides between REBUILD and INCREMENTAL mode; this is the default indexing mode. The load utility uses a heuristic to model the cost of building the index with each method (rebuild versus incremental). Refer to the notes on the INCREMENTAL method for when that mode of indexing is not supported.

• REBUILD

- Forces all indexes to be rebuilt. This technique rebuilds the table indexes and constructs the indexes with their associated definitions for free space. When indexes are built using this technique, the utility must have sufficient resources to sort all index keys for both old and appended table data.


• INCREMENTAL

- Indexes extended with new data. This technique consumes index-free space. This technique only requires sort space sufficient for appending index keys for the inserted records. This method is only supported in cases where the index object is valid and accessible at the time load starts. (For example, it is not valid immediately following a load with DEFERRED.) If this method is indicated by the user, but not supported due to the state of the index, then a warning will be generated, and the load will continue with indexing mode REBUILD. Similarly, if RESTARTing a LOAD using LOAD RESTART, and the RESTART is in the BUILD PHASE, then INCREMENTAL is not supported.

• DEFERRED

- Indexes will be marked as needing a refresh. The first access to such indexes that is unrelated to a load operation may force a rebuild, or indexes may be rebuilt when the database is restarted. This approach requires enough sort space for all key parts for the largest index. The total time taken for index construction in this mode is longer than that required in REBUILD mode. Therefore, when performing multiple load operations with deferred indexing, it is advisable (from a performance viewpoint) to let the last load operation in the sequence perform an index rebuild, rather than allow indexes to be rebuilt at first non-load access. This option is not compatible with the ALLOW READ ACCESS option, because it does not maintain the indexes, and index scanners require a valid index.

- Deferred indexing is only supported for tables with non-unique indexes, so that duplicate keys inserted during the load phase are not persistent after the load operation.

• Limitation:

- Incremental indexing is not supported for load when the LOAD COPY option is specified (log retain logging is on), and the table resides in a DMS table space, and the index object resides in a table space shared with other table objects (DAT, or LONG, and so forth) belonging to the table being loaded. To bypass this restriction, it is recommended that indexes be placed in a separate table space.
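A minimal sketch of requesting a specific indexing mode on the LOAD command (names assumed):

db2 "load from staff.del of del insert into staff indexing mode incremental"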


By default, the shadow index is built in the same table space as the original. Since both the original and the new index are maintained at the same time, there must be sufficient space in that table space to hold both indexes at once. If the load aborts, the space used for the new index is released. If the load commits, the space used for the original index is released, and the new index becomes the current one.

If the indexes are in a DMS table space, the new shadow index cannot be seen; in an SMS table space you might see index files with the .IN1 suffix alongside the .INX suffix.

The new index can be built in a system temporary table space to avoid running out of space. The USE <tablespace name> option allows this when using INDEXING MODE REBUILD and ALLOW READ ACCESS. The page size of the system temporary table space must match the page size of the original index table space. The USE <tablespace name> option is ignored if the load is not in ALLOW READ ACCESS mode or if the indexing mode is incompatible (only REBUILD or AUTOSELECT is supported).
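A minimal sketch of rebuilding the shadow indexes in a system temporary table space (TEMPSPACE1 is an assumed name, and its page size must match that of the index table space):

db2 "load from staff.del of del insert into staff indexing mode rebuild allow read access use tempspace1"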

The indexing modes, what they do to the indexes, and when the indexes are created:

• AUTOSELECT — automatically decided (REBUILD or INCREMENTAL); indexes are created during the LOAD.

• REBUILD — all indexes are rebuilt; during the LOAD.

• INCREMENTAL — all indexes are extended incrementally with the new data; during the LOAD.

• DEFERRED — if INDEXREC=ACCESS, all indexes on the table are rebuilt when the table is accessed; if INDEXREC=RESTART, all indexes in the database are rebuilt when RESTART is issued.


9.4 Abnormal Load Termination


Figure 9-21. Abnormal Load Termination Cleanup Problem CF457.3

Notes:

The following notes describe the cleanup process required for loads that abnormally terminate.

Load does not automatically remove its index sort temporary files when it is forced to terminate abnormally via operating system commands; this cleanup must be done manually. The following only applies to loads into tables that have at least one index. These files should be removed because they are never cleaned up and may eventually cause a disk full condition in the temporary sort file system.

Load file cleanup process:

The following notes describe the steps required to clean up temporary sort files after a load has abnormally terminated (that is, kill -9 to load process, or db2_kill). These steps should not be necessary if the load is terminated correctly (that is, db2 force application).

Sort files may be created in the DEFAULT location if no temp subdirectory is specified:


DEFAULT location:

UNIX: "<instance_owner>/sqllib/tmp"

Windows: X:\sqllib\db2\tmp

SORT FILE naming convention:

UNIX: <process_id>.<index_num>.<num>

Windows: <thread_id in hex>.<index_num in hex><num>

The process_id is the process identification number of the load process. The index_num represents the index that is being sorted; if there are four indexes, this number ranges from 0 to 3, one value for each index.

(that is, UNIX: 27940.1.1, 27940.2.1, 27940.3.1 Windows: 1db.001, 1db.011, 1db.021)

If indexes are defined on the table, the load utility will create and sort the index keys during the LOAD phase. The files will be created long before the index rebuild phase, meaning that the files are created and grow almost immediately after the load is started.



Figure 9-22. Abnormal Load Termination Cleanup Steps CF457.3

Notes:

Cleanup steps

Ensure that all loads have completed or have been stopped:

db2 list applications show detail | grep -i load

or

ps -aef | grep db2lrid (UNIX), or use the Task Manager (Windows)

Remove all files that follow the load sort temporary file naming convention:

rm <sort_dir>/<process_id>.<index_num>.<num>

Restart the load.


Checkpoint

Exercise — Unit Checkpoint

1. If you use the NONRECOVERABLE load option, you must specify the NONRECOVERABLE parameter when restoring your table from a backup. True or False?

__________________________________________________

2. A load that was started with ALLOW NO ACCESS must be restarted. Is this possible?

a. No, it is not possible.

b. Yes, but only with the ALLOW NO ACCESS option

c. Yes, with the ALLOW NO ACCESS option or with the ALLOW READ ACCESS option

__________________________________________________

3. FASTPARSE improves the run-time performance of the load utility by doing what?

a. Increasing validation checks

b. Checking if your system has additional processors, and utilizing them automatically

c. Reducing validation checks

__________________________________________________


Figure 9-23. Unit Summary CF457.3

Notes:

Having completed this unit, you should be able to:

• Identify the benefits of online Load

• Identify the benefits of various Load options

• Describe the advantage of the Index Free Space parameter

• Use the Load terminate option

• Describe the Incremental Indexing mode of the Load utility

Unit 10. Distributed Management

What This Unit Is About

This unit provides information about the differences between distributed unit of work, remote unit of work, and distributed request. Options will be discussed that enable or disable multisite read or multisite update capabilities.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the difference between RUOW and DUOW

• Identify design considerations for DUOW applications

• Implement components involved in two-phase commit

• Differentiate between type 1 and type 2 connects

• Describe one-phase versus two-phase syncpoint requirements

• Identify how federated databases may be used for distributed request applications

How You Will Check Your Progress

Accountability:

• Machine lab

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Command Reference


Figure 10-1. Unit Objectives CF457.3

Notes:

After completing this unit, you should be able to:

• Describe the difference between RUOW and DUOW

• Identify design considerations for DUOW applications

• Implement components involved in two-phase commit

• Differentiate between type 1 and type 2 Connects

• Describe one-phase versus two-phase syncpoint requirements

• Identify how federated systems may be used

10.1 Distributed Management


Figure 10-2. What Is a Distributed Unit of Work? CF457.3

Notes:

Distributed Unit of Work (DUOW) is a unit of work or transaction that can access more than one database at a time. Such a transaction is also known as a global transaction.

Although previous versions of DB2 UDB provided the ability to access local and remote databases, a single transaction was limited to accessing one database. This is known as Remote Unit of Work (RUOW). RUOW allowed only one database connection in one unit of work. With RUOW, if information was needed from multiple databases, a commit work was required first. After the commit work, the original database connection would be destroyed when you switched from one database to another.

DUOW allows you to access multiple databases with a single unit of work. Removing the one database connection per unit of work restriction achieves this. If you need multiple database connections, the existing database connection is not destroyed when you switch from one database to another. The existing database connection is placed in a dormant state.
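A hedged CLP sketch of a single distributed unit of work spanning two databases (the TEXAS and GERMANY aliases come from the graphic; the UPDATE statements are illustrative, and two-phase commit additionally requires the transaction manager database to be configured):

db2 set client connect 2 syncpoint twophase
db2 connect to texas
db2 "update accounts set balance = balance - 100 where acct_id = 1"
db2 connect to germany
db2 "update accounts set balance = balance + 100 where acct_id = 1"
db2 commit work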

The graphic contrasts the two models using databases TEXAS and GERMANY. With RUOW, the application connects to TEXAS, issues SELECT, INSERT, and UPDATE statements, and commits (one unit of work), then connects to GERMANY, issues more statements, and commits again (a second unit of work). With DUOW, the application connects to TEXAS and then to GERMANY, issues statements against both, and a single COMMIT WORK ends one unit of work that spans both databases.

Figure 10-3. Levels of Access CF457.3

Notes:

Remote Unit of Work (RUOW).

• Several SQL statements, one database

Distributed Unit of Work (DUOW)

• Several SQL statements, several databases

Distributed Request

• One SQL statement, several databases

DUOW and RUOW are supported by default by DB2 UDB. Distributed Request is supported by the federated database.


Figure 10-4. Federated Database - Distributed Request CF457.3

Notes:

Federated database support is available in DB2 UDB for DB2 UDB databases and Oracle databases. It provides:

• Smart global optimization, which is needed to provide acceptable performance.

• Multi-location views, which are a key to preserving the simplicity, flexibility, and high degree of data independence that the relational database approach promises its users.

• A global catalog, which is a prerequisite to true location transparency.

• Protection against unauthorized access, data integrity, and recovery from failure.

• A single SQL API, which compensates for language differences between database managers, and which therefore enhances language transparency and contributes to the simplicity of the user interface.

Federated database functionality is delivered with DB2 UDB.

The graphic shows a client application connected to federated database X issuing a single query (SELECT * FROM DEPT, PAY WHERE ...). The federated database's global catalog (SYSIBM.SYSTABLES, SYSIBM.SYSCOLUMNS, SYSIBM.SYSINDEXES, and SYSIBM.SYSSERVERS) records that DEPT resides on server ORACLE1 (an Oracle data source) and PAY on server DB2MVS (a DB2 for MVS database named NYDB2), along with table, index, and server statistics. The global optimizer uses this catalog information to decide where DEPT and PAY are, where to join them, and which strategy to use.

In DB2 UDB for UNIX and Windows the supported data sources are:

• DB2 UDB for UNIX and Windows

• DB2 UDB for z/OS and OS/390

• DB2 UDB for iSeries

• DB2 Server for VM and VSE

• Informix

Together with relational Connect:

• ODBC

• OLE DB

• Oracle

• Microsoft SQL Server

• Sybase

Together with DB2 Life Sciences Data Connect:

• BLAST

• Documentum

• Microsoft Excel

• Table-structured files

• XML

Data sources are semi-autonomous. For example, the federated server can send queries to Oracle data sources at the same time that Oracle applications access these data sources. A DB2 federated system does not monopolize or restrict access to the other data sources, beyond integrity and locking constraints.
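Once nicknames for the remote tables exist, a distributed request is written like any other query; a hedged sketch (the federated database X, the DEPT and PAY nicknames, and the column names are assumptions):

db2 connect to x
db2 "select d.deptno, p.salary from dept d, pay p where d.deptno = p.deptno"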

We will discuss further details of federated databases in the unit Federated Database.


Figure 10-5. Application-Directed DUOW CF457.3

Notes:

DUOW is application-directed. That is, the application directs its SQL statements to the correct database server by using an SQL CONNECT statement and some new SQL extensions to establish or switch database connections. Previous versions of DB2 used the TYPE 1 CONNECT statements which allow one database connection.

The CONNECT statements which are available with DUOW are known as TYPE 2 CONNECT statements. They allow multiple database connections. SQL syntax extensions have been included to allow you to switch connections between databases and to selectively disconnect from databases. These extensions and their use are shown in the above table.

The DRDA Syncpoint Manager (SPM) is a component of DDCS (Distributed Database Connection Services) Multi-User, which allows an SQL application to update multiple remote DRDA2 Application Server (DRDA2 AS) databases with two-phase commit.

With SPM, DDCS is also enabled to support two-phase commit in an X/Open XA environment where a transaction may be coordinated by an XA-compliant transaction manager such as CICS/6000 and Encina for AIX.

The graphic maps each desired action to the SQL statement needed:

• Establish a connection — CONNECT TO <dbname>

• Switch connections to another database (reestablish a dormant connection) — SET CONNECTION <dbname>

• Mark a connection for release at the next successful commit — RELEASE <ALL | CURRENT | dbname>

• Sever a connection — DISCONNECT <ALL | CURRENT | dbname>

The CONNECT TO <dbname> or CONNECT TO <dbname> USER userid USING password statements establish a connection.

The SET CONNECTION <dbname> statement changes the state of a database connection from dormant to current. It makes the specified location the current server.

If the CONNECT TO statement or SET CONNECTION statement is successful:

• A connection to the application server is either created or made non-dormant and placed into the current and held state.

• The CURRENT SERVER special register and the SQLCA are updated.

If the CONNECT TO statement or SET CONNECTION statement is unsuccessful:

• The connection states of the application process and the states of its connections are unchanged, no matter what the reason for failure.

• The SQLERRP field of the SQLCA is set to the product identifier of the database that detected the error.

The RELEASE statement places one or more connections in the release-pending state.

The DISCONNECT statement destroys one or more connections when there is no active unit of work (after COMMIT or ROLLBACK WORK). If it is used within a unit of work, an error (SQLSTATE 25000) is raised. If the DISCONNECT is successful, each identified connection is destroyed. If the current connection is destroyed, the application is placed in the connectable and unconnected state. If the DISCONNECT is unsuccessful, the connection state of the application process and the states of its connections are unchanged. If DISCONNECT is used to destroy the current connection, the next executed SQL statement should be CONNECT or SET CONNECTION. Connections can also be destroyed during a COMMIT operation when the db2 precompile option DISCONNECT (AUTOMATIC | CONDITIONAL | EXPLICIT) is in effect.

The valid RELEASE or DISCONNECT statement options are:

• ALL

• CURRENT

• A host variable

• The name of a database server


Figure 10-6. Application Design CF457.3

Notes:

A CONNECT 1 option allows a RUOW application. CONNECT 1 states that you can only connect to one database per unit of work.

A CONNECT 2 option allows a DUOW application. CONNECT 2 states that you can connect to multiple databases in the same unit of work.

DUOW applications can require a synchronization point of either one-phase or two-phase commit. The commit process for a multisite update application (two-phase commit) is different from the normal commit process (one-phase commit).

Because more than one database is updated, the commit process has to be enhanced to coordinate and control the commits in each database involved. This will ensure that all databases commit or roll back the unit of work, thus maintaining data integrity. This commit process is called two-phase commit. There is an overhead in communication costs associated with two-phase commit. Therefore, use one-phase commit whenever possible. The two-phase commit process is not required unless multiple databases are updated in the same unit of work.
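For embedded SQL applications, these choices can also be made at precompile time; a hedged sketch (myapp.sqc is an assumed source file, and a database connection must already exist for PREP to run):

db2 connect to texas
db2 prep myapp.sqc bindfile connect 2 syncpoint twophase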

The graphic relates the connection settings to the application types:

• RUOW — CONNECT 1 SYNCPOINT ONEPHASE: single-site update, single-site read.

• DUOW — CONNECT 2 SYNCPOINT ONEPHASE: multisite read, or multisite read with single-site update.

• DUOW — CONNECT 2 SYNCPOINT TWOPHASE: multisite read with multisite update.

Figure 10-7. DUOW Examples (1 of 2) CF457.3

Notes:

The SET CLIENT CLP command specifies connection settings for the application process.

CLP1 has connection settings of SYNCPOINT TWOPHASE. CONNECT 2 is required when SYNCPOINT TWOPHASE is used. These options allow CLP1 to use DUOW with multisite update in the same unit of work (UOW). Since CLP1 uses the default SQLRULES DB2, either a CONNECT TO or SET CONNECTION statement may be used to make the dormant connection (D1) current. RELEASE ALL marks all dormant and current connections to be released at the next successful COMMIT. Note that, even though a database has been placed in release pending state, you may still CONNECT TO or SET CONNECTION to the database before the COMMIT.

CLP2 has connection settings of SYNCPOINT TWOPHASE, CONNECT 2, and SQLRULES STD. SQLRULES STD requires that a SET CONNECTION statement be used to make the dormant connection (D1) current.

CLP3 has connection settings of SYNCPOINT TWOPHASE, CONNECT 2, and SQLRULES DB2. RELEASE CURRENT marks D2 to be released at the next successful COMMIT. RELEASE D1 marks D1 to be released at the next successful COMMIT.

The graphic shows four CLP scripts. CLP1 and CLP3 run under the settings db2 set client syncpoint twophase connect 2 sqlrules db2; CLP2 and CLP4 run under db2 set client syncpoint twophase connect 2 sqlrules std.

CLP1: CONNECT TO D1; SELECT ...; UPDATE ...; CONNECT TO D2; INSERT ...; CONNECT TO D1; SELECT ...; RELEASE ALL; CONNECT TO D2; UPDATE ...; COMMIT

CLP2: CONNECT TO D1; SELECT ...; UPDATE ...; CONNECT TO D2; INSERT ...; SET CONNECTION D1; SELECT ...; RELEASE ALL; COMMIT

CLP3: CONNECT TO D1; SELECT ...; UPDATE ...; CONNECT TO D2; INSERT ...; RELEASE CURRENT; SET CONNECTION D1; SELECT ...; RELEASE D1; COMMIT

CLP4: CONNECT TO D1; SELECT ...; UPDATE ...; CONNECT TO D2; INSERT ...; SET CONNECTION D1; SELECT ...; COMMIT; DISCONNECT ALL

Note that, even though SQLRULES DB2 was used, a SET CONNECTION statement may be used to make a dormant connection active.

CLP4 has connection settings of SYNCPOINT TWOPHASE, CONNECT 2, and SQLRULES STD. DISCONNECT ALL statement destroys all the dormant and active connections when there is no active unit of work. The DISCONNECT statement may not be executed within a UOW. If the DISCONNECT ALL is successful, all the database connections are destroyed. If the DISCONNECT ALL is unsuccessful, the connection state of the application process and connection states are unchanged.


Figure 10-8. DUOW Examples (2 of 2) CF457.3

Notes:

CLP5 has connection settings of SYNCPOINT ONEPHASE. CONNECT 1 allows only SYNCPOINT ONEPHASE: the default. These options allow CLP5 to use only RUOW. With CONNECT 1, the CONNECT RESET statement disconnects the current connection to D1 and places the application in an unconnected and connectable state. Since a CONNECT TO statement is not executed before the next SQL statement, the UPDATE SQL statement implicitly connects the application to the default database and executes the UPDATE statement. The default database is specified in the DB2DBDFT environment variable. The default value of DB2DBDFT is SAMPLE at installation on UNIX platforms. The default value of DB2DBDFT is null on Intel platforms. If the DB2DBDFT is not an alias in the system database directory, an error will occur when the UPDATE is executed.

CLP6 has connection settings of SYNCPOINT TWOPHASE. CONNECT 2 is required if SYNCPOINT equals TWOPHASE. These options allow CLP6 to use DUOW with multisite update. With CONNECT 2, the CONNECT RESET statement places the current connection to D1 in a dormant state and immediately connects the application to the default database specified in DB2DBDFT. This places the application in a connected and connectable state.

The graphic shows two CLP scripts that issue the same statements, with the registry variable DB2DBDFT set to D2:

CONNECT TO D1; INSERT ...; COMMIT; CONNECT RESET; UPDATE ...; COMMIT

CLP5 runs under db2 set client connect 1: CONNECT RESET disconnects the current connection, and the subsequent UPDATE implicitly connects to D2.

CLP6 runs under db2 set client syncpoint twophase connect 2: CONNECT RESET immediately connects to D2 and leaves D1 dormant.

Since another CONNECT TO or SET CONNECTION statement is not executed before the next SQL statement, the UPDATE SQL statement updates the default database. The default value of DB2DBDFT is SAMPLE at installation on UNIX platforms. If DB2DBDFT is not an alias in the system database directory, an error will occur when the CONNECT RESET is executed.

Please be aware that coding applications like this is not really recommended. There are two possible drawbacks:

• If DB2DBDFT is changed, your connection points to another database

• Maintenance and readability are very hard, especially when the application has more lines than in our above example.


Figure 10-9. Setting DUOW Options with CLP CF457.3

Notes:

QUERY CLIENT is a command that returns current connection settings for an application process. A connect is not required. The connection settings of an application process can be queried at any time during execution.

SET CLIENT specifies connection settings for the application process.

CONNECT 1 specifies that a CONNECT command is processed as a type 1 CONNECT (RUOW). CONNECT 2 specifies that a CONNECT command is processed as a type 2 CONNECT (DUOW).

SQLRULES DB2 specifies that the type 2 CONNECT is processed according to the DB2 rules: either a CONNECT TO or a SET CONNECTION statement may be used to reestablish a dormant connection. SQLRULES STD specifies that the type 2 CONNECT is processed according to the standard ISO/ANS SQL92 rules: only a SET CONNECTION statement may be used to reestablish a dormant connection.

SYNCPOINT specifies how commits or rollbacks are coordinated among multiple database connections. ONEPHASE, TWOPHASE, or NONE can be specified.

The graphic shows the commands for querying and setting the connection options (the defaults are CONNECT 1, SQLRULES DB2, SYNCPOINT ONEPHASE, and DISCONNECT EXPLICIT):

db2 query client

db2 set client CONNECT {1 | 2}
               SQLRULES {DB2 | STD}
               SYNCPOINT {ONEPHASE | TWOPHASE | NONE}
               DISCONNECT {EXPLICIT | CONDITIONAL | AUTOMATIC}

db2 terminate

ONEPHASE specifies that no Transaction Manager (TM) is used to perform a two-phase commit. A one-phase commit is used to commit the work done by each database in multiple database transactions. It is appropriate to use a one-phase commit when in a multisite read, single-site update transaction.

TWOPHASE specifies that the TM is required to coordinate two-phase commits among those databases that support this protocol.

NONE specifies that no TM is used to perform a two-phase commit and does not enforce single update, multiple reader. A COMMIT is sent to each participating database. The application is responsible for recovery if any of the commits fail.

DISCONNECT EXPLICIT specifies that only database connections that have been explicitly marked for release by the RELEASE statement are disconnected at commit. DISCONNECT CONDITIONAL specifies that the database connections that have been marked for release by the RELEASE statement or have no open WITH HOLD cursors are disconnected at commit. DISCONNECT AUTOMATIC specifies that all database connections are disconnected at commit.

If the DB2 UDB server supports NetBIOS clients, the MAX_NETBIOS_CONNECTIONS value may be used to specify the maximum number of concurrent connections that can be made using a NETBIOS adapter in an application. The maximum value is 254. This parameter must be set before the first NetBIOS connection is made. Changes subsequent to the first connection are ignored. NetBIOS clients support Distributed Unit of Work (DUOW) across a Local Area Network (LAN) through the NetBIOS protocol.

CONNECT_DBPARTITIONNUM (partitioned database environment only) specifies the database partition to which a connect is to be made.

SET CLIENT cannot be issued if one or more connections are active. If SET CLIENT is successful, the connections in the subsequent units of work will use the connection settings specified. If SET CLIENT is unsuccessful, the connection settings of the application process are unchanged.

You specify your connection with the CLP with db2 connect to dbname, where dbname maps to the alias name specified in the System Database Directory. After the connect is issued, all SQL requests are executed against the database to which you are connected.

If CONNECT=1 (RUOW) db2 connect reset terminates the connection, and a subsequent SQL statement will cause connection to the default database, if it is defined in the environment variable DB2DBDFT. If CONNECT=2 (DUOW) db2 connect reset puts the current connection in a dormant state and establishes a connection with the default database if it is defined.

The quit command ends the input mode and returns the user to the command prompt, but quit does not terminate the CLP nor disconnect the database connection.

The db2 terminate issues a disconnect and also terminates the CLP back-end process so that, for instance, the connection settings will revert to the default values.
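A hedged CLP sketch of inspecting and changing the settings (issue it with no active connections; the db2 terminate at the end drops the back-end process, so the settings revert to their defaults):

db2 query client
db2 set client connect 2 sqlrules std syncpoint twophase disconnect explicit
db2 query client
db2 terminate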


Figure 10-10. Multisite Update with Two-Phase Commit CF457.3

Notes:

If you want an update to be committed on one database only if updates to other databases are also committed, you would want to use a DUOW multisite update application to maintain data integrity across all the databases. This requires CONNECT 2 and SYNCPOINT TWOPHASE connection settings.

To maintain data integrity across databases, updates to multiple databases in the same unit of work need to be all or nothing. Either all databases commit their work or all databases roll back their work. There is no way that some databases commit while others roll back.

With DUOW support, the coordination of databases involved in a global (DUOW) transaction is done using a two-phase commit protocol. The commit is centrally controlled by a coordinator.


As its name suggests, a two-phase commit requires two steps where:

1. The coordinator informs all participating databases that a commit is about to occur. Each database in turn informs the coordinator if it is able to commit or not. This first phase is known as the prepare phase.

2. If all databases respond positively, the coordinator sends the actual commit to all participating databases. This second step is known as the commit phase.


Figure 10-11. What Is Involved in a Two-Phase Commit? CF457.3

Notes:

Two-phase commit is the key to maintaining data integrity in your database with DUOW applications. There are many safety mechanisms that are built around the process. If a database crashes or communications are lost, there must be some recovery mechanism in place, otherwise the objective of data integrity is lost. The two-phase commit provides the needed mechanism.

First, let us define what is involved in a two-phase commit process.

• Your Application (AP): When the application issues a commit, the two-phase commit process is started.

• Resource Manager (RM): An RM is a participant in a two-phase commit. An RM can be a database manager such as DB2 UDB or any other external resource such as a file. A DB2 UDB database is an RM capable of two-phase commit.

• Transaction Manager (TM): The transaction manager (TM) assigns identifiers to transactions, monitors their progress, and takes responsibility for transaction completion and failure.


DB2 UDB provides transaction manager functions that can be used to coordinate the updating of several DB2 UDB common server databases within a single unit of work. (Syncpoint manager (SPM) is an IBM term for TM.)

The TM coordinates the two-phase commit among multiple RMs. Because of its role as a coordinator in a DUOW, the TM is crucial in maintaining data integrity in your distributed database environment.

The TM provides the following services:

- Keeping track of the beginning and end of each transaction.

- Logging the status of the commit process.

- Recovering RMs involved in the global transaction to ensure data integrity.

The DB2 UDB TM has the following components:

- Coordinator: The coordinator function is linked into the application at compile time. Your application invokes the coordinator in the two-phase commit process.

- Data Protection Services: DPS is responsible for transaction management within each database. The DPS in each database keeps track of all transactions in progress (or active transactions) for that database via a data structure that is called a transaction table. The maximum number of transaction table entries is based on the maximum number of applications configured in the MAXAPPLS database configuration parameter. Each application can only have one transaction active at any time in a database.

- TM database: The TM database is a DB2 UDB database used for the logging and recovery of the DUOW TM.

- Resynchronization Manager: The Resync Manager is responsible for carrying out crash recovery when the two-phase commit process does not complete successfully.
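A hedged sketch of pointing an instance at a TM database through the tm_database database manager configuration parameter (1ST_CONN, the default, uses the first database that the application connects to in the transaction):

db2 update dbm cfg using tm_database 1ST_CONN
db2 get dbm cfg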


Figure 10-12. Two-Phase Commit in DB2 UDB CF457.3

Notes:

The first phase of a two-phase commit is also known as the prepare phase. All the RMs involved in the transaction need to be informed that a commit will occur. This phase starts when the application issues a commit. The diagram outlines the steps involved in the prepare phase. They are:

1. The Client Application sends the prepare request to all participating RMs and waits for their replies.

2. The RM performs a write of the prepared record, containing the name of the TM database. This information is used in the event of a crash.

3. Each RM informs the Client of its status. If the transaction did not perform an update on the RM, the RM sends a read-only reply. Otherwise, it sends a yes if it was successful in the prepare and a no if it was not successful in the prepare.

4. The Client sends a prepare request to the TM database.

5. The TM database performs a write of the TM prepared record to its log file. This information is used in case of a crash.


6. The TM database informs the Client of its status.

If all RMs respond with a yes vote, the second phase begins. The second phase is called the commit phase. Referring to the diagram, the second phase has the following steps:

7. The Client sends a commit request to all participating RMs except those that replied with a read-only vote.

8. Each RM performs a write of the committed log record and commits the transaction.

9. Each RM sends an acknowledgment back to the Client that this was done successfully.

10. The Client sends a commit request to the TM database.

11. The TM database performs a write of the committed log record.

12. The TM database informs the Client of its status.

If anything goes wrong in the second phase, the databases will have to be resynchronized.


Figure 10-13. GUI - Multisite Update Wizard CF457.3

Notes:

This GUI is activated in one of two ways.

1. When adding a DCS database in the Control Center, a button launches the Multisite Update Wizard.

2. In the Control Center, click the instance with the right mouse button and select Multisite Update and then Configure.

a. You can configure whether or not to use a TP monitor.

b. You can specify the sync point manager (SPM) settings.

GUI - Multisite Update Wizard

Simplified SPM configuration:

• Specify the values of the sync point manager (SPM) parameters

• Configure a DB2 UDB server or DB2 Connect server instance to enable the SPM

• Configure a client to use an SPM using Discovery or Import

• In addition, you can test from a client that a multisite update using an SPM is possible


Figure 10-14. Configure Multisite Update Wizard (1 of 2) CF457.3

Notes:

What you can specify:

• Do not use a TP monitor

• TP monitor, which could be:

- If applications are run in a WebSphere Enterprise Edition CICS environment, this parameter should be set to "CICS"

- If applications are run in a WebSphere Enterprise Edition Encina environment, this parameter should be set to "ENCINA"

- If applications are run in a WebSphere Enterprise Edition Component Broker environment, this parameter should be set to "CB"

- If applications are run in an IBM MQSeries environment, this parameter should be set to "MQ"

- If applications are run in a BEA Tuxedo environment, this parameter should be set to "TUXEDO"


- If applications are run in an IBM San Francisco environment, this parameter should be set to "SF"

- IBM WebSphere EJB and Microsoft Transaction Server users do not need to configure any value for this parameter.


Figure 10-15. Configure Multisite Update Wizard (2 of 2) CF457.3

Notes:

• Specify the sync point manager (SPM) settings, such as sync point manager name, log file size, log path, and maximum number of resync agents.

• Default values are provided.


Figure 10-16. Test Multisite Update CF457.3

Notes:

Testing the Multisite Update feature:

1. Select the instance with the right mouse button and choose the Multisite Update > Test menu option from the pop-up menu. The Test Multisite Update window opens.

2. Select the databases you wish to test from the available databases in the left sub-window. You can use the arrow buttons in the middle to move selections to and from the Selected Databases sub-window.

3. When you have finished your selection, click the Test... button at the bottom of the window. The Multisite Update Test Result window opens.

4. The Multisite Update Test Result window shows which of the databases you selected succeeded or failed the update test. The window will show SQL codes and error messages for those that failed.


Figure 10-17. The Resynchronization Process CF457.3

Notes:

If communication is lost, or any of the systems goes down after the first phase (prepare) and before or during the second phase (commit), resolution of the transaction must take place. The database may have to be restored to a consistent state depending on when the problems occurred. This process, called resynchronization or resync, is handled by the resync agent, which is created during db2start. The transactions that were able to prepare but unable to commit are called indoubt transactions.

Transactions are left indoubt when the TM or at least one RM becomes unavailable after successfully completing the first phase (prepare) of the two-phase protocol. An indoubt transaction is a two-phase transaction which has been prepared but not committed or rolled back. The prepare phase is done when the RM writes the log pages to disk so that it can respond with either a COMMIT or a ROLLBACK.

RMs do not know whether to COMMIT or ROLLBACK their portion of the transaction until the TM can consolidate its own log with the indoubt status information from the RMs when they become available again.

The Resynchronization Process

• Can be initiated by either the TM database or an RM
  - RM responsible - resync rollback
  - TM responsible - resync commit

• RESYNC_INTERVAL
  - DBM configuration parameter
  - 180 seconds is the default

• Indoubt transactions
  - db2 LIST INDOUBT TRANSACTIONS [WITH PROMPTING]


The transaction resynchronization interval parameter, RESYNC_INTERVAL, specifies the time interval in seconds for a TM or RM to retry the recovery of any outstanding indoubt transactions found in the TM or RM. This parameter is only applicable to transactions in a DUOW environment.

If a COMMIT is to be done, but some RMs fail, then the TM retries the COMMIT on the failed RMs after the resync interval has passed. The default value is 180 seconds. The recommendation is to increase the value of this parameter if indoubt transactions will not interfere with other transactions against your database.
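For illustration only (the 300-second value is an arbitrary example, not a recommendation), the parameter could be checked and changed from the CLP with:

db2 get dbm cfg
db2 update dbm cfg using RESYNC_INTERVAL 300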

Resynchronization can be initiated either by the TM database or by any of the RMs.

• If an RM is down before the prepare, the RM transaction will be rolled back automatically. The coordinator will roll back other RMs on receiving the result of the prepare from all RMs. No resync is required.

• If an RM is down after the prepare but before the commit, and the result of the transaction is a commit, it is the TM’s responsibility to resync.

• If all the RMs prepare successfully, the TM database prepares successfully (that is, the prepare log is written) and then the TM database fails or communications are lost, both the TM and RM can initiate the resync depending on which database is connected first.

Basically, the RM is responsible for resync rollback, while the TM is responsible for resync commit. If the TM database is corrupted or unavailable, resynchronization may need to be user-initiated.

The indoubt transactions at an RM or TM can be determined using the list indoubt transaction command. You should use this command with extreme caution and as a last resort. The best solution is to wait for the transaction manager to drive the resynchronization process. You could experience data integrity problems if you manually commit or roll back a transaction in one of the participating databases and the opposite action is taken for another of the databases. This manual process is sometimes called a heuristic decision. Recovering from data integrity problems requires you to understand the application logic and if the unit of work should be committed or rolled back.

If you cannot wait for the transaction manager to initiate the resynchronization process and you must release the resource tied up by an indoubt transaction, then heuristic operations are necessary.

db2 LIST INDOUBT TRANSACTIONS [WITH PROMPTING]

The authorization required is DBADM. The command parameters WITH PROMPTING indicate that indoubt transactions are to be processed. If this parameter is not specified, indoubt transactions are written to standard output, and the interactive dialog mode is not initiated. The interactive dialog mode permits the user to:

• l   - list all indoubt transactions
• l x - list indoubt transaction numbered 'x'
• q   - quit
• c x - commit indoubt transaction numbered 'x'
• r x - roll back indoubt transaction numbered 'x'
• f x - forget the indoubt transaction numbered 'x'

You can only commit transactions whose status is indoubt (i), rollback transactions whose status is indoubt (i) or ended (e), or forget transactions whose status is committed (c) or rolled back (r).

Because the status of an indoubt transaction can change as a result of external activity while the dialog is open, use the q option to quit and then reissue the command to refresh the list.

If the AUTORESTART database configuration parameter is OFF (default is ON) and there are indoubt transactions in either the TM or RM databases, the RESTART DATABASE command/routine is required in order to start up the resynchronization process. If issuing the RESTART DATABASE command from the command line processor, use different sessions. If you restart a different database from the same session, the connection established by the previous restart database command will be dropped. The database will need to be restarted again if the last connection to it is dropped. Issue db2 terminate to drop the connection after there are no indoubt transactions listed by the db2 list indoubt transactions command.
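As a sketch of that sequence (the database alias SAMPLE is only a placeholder), the resynchronization could be driven and verified from the CLP with:

db2 restart database sample
db2 list indoubt transactions
db2 terminate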

You can also use a GUI which is accessible in Windows from Start -> Programs -> IBM DB2 -> Monitoring Tools -> Indoubt Transaction Manager.


Figure 10-18. DUOW Database Configuration CF457.3

Notes:

When setting up the database involved in your DUOW environment, remember:

• The TM database that you select must connect to all RMs either directly or through a gateway.

DB2 Universal Databases, when acting as a DRDA Application Server (AS), support the TCP/IP communications protocol with a two-phase commit.

DB2 Universal Databases, when acting as a DRDA Application Requester (AR), support the TCP/IP communications protocol with a two-phase commit.

The transaction manager database can be any database, although for operational and administration reasons, you should use a database that does not contain user data. Your application program should not refer to this transaction manager database.
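The application itself must also request two-phase commit. As a sketch only (the program name myapp.sqc and database alias MYDB are placeholders), an embedded SQL program could be precompiled with the CONNECT 2 and SYNCPOINT TWOPHASE options, and a CLP session could request the same behavior with SET CLIENT:

db2 connect to mydb
db2 prep myapp.sqc connect 2 syncpoint twophase

db2 set client connect 2 syncpoint twophase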

DUOW Database Configuration

• The TM database must be able to connect to all RM databases directly or through a gateway.

• The transaction manager database can be:
  - Your local DB2 UDB database
  - Another DB2 UDB database
  - A DB2 for OS/390 and z/OS V5 or later database
  - A DB2 for OS/400 database


Figure 10-19. Selecting Your TM Database CF457.3

Notes:

The TM database can be a local or remote database, and it can be a DRDA database. You assign a TM database per DB2 UDB instance by setting the database manager configuration parameter TM_DATABASE to the name of the database of your choice. However, if you want the application to use the database it first connects to, then set TM_DATABASE to 1ST_CONN.
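As a sketch (TMDB is a placeholder for a database alias that is already cataloged in the system database directory), the parameter could be set on the client instance with either of the following:

db2 update dbm cfg using TM_DATABASE TMDB

db2 update dbm cfg using TM_DATABASE 1ST_CONN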

Keep the following criteria in mind when selecting your TM database:

• Security: The TM database should be on a secure machine because it contains data for the integrity of the transactions.

• Configuration: In order for resync to complete, all RM instances should be able to connect to the TM database, and the TM database should be able to connect to all RM databases.

• Performance: The TM database must have enough resources to handle being involved in every two-phase commit transaction.

Selecting Your TM Database

• TM_DATABASE
  - DBM configuration parameter on the client machine
  - A specific database alias in the system database directory, or 1ST_CONN

• Other criteria
  - Security
  - Configuration
  - Performance


The DBM configuration parameter, TM_DATABASE, on the client from which the application is executing, defines the TM database for the application. There are certain considerations associated with the TM_DATABASE parameter:

• Can be local or remote database
• Should be a secure and robust machine
• This is a system database for use by DB2 UDB
• May want to create dummy database without data
• Used as a logger and coordinator
• Performs recovery of indoubt transactions
• Should experiment with the workload

Specifying TM_DATABASE as 1ST_CONN:

• Implies that the application always uses one of the RMs as the TM.
• The danger is that this might be an insecure or non-robust machine.


Figure 10-20. Lock Timeout Avoidance CF457.3

Notes:

The locktimeout database configuration parameter specifies the number of seconds that an application will wait to obtain a lock. This helps avoid global deadlocks for applications. If you set the locktimeout parameter to -1 (which is the default), lock timeout detection is turned off and the application will wait forever or until the lock is released.

In a transaction processing environment, you can use an initial starting value of 30 seconds. In a query-only environment, you could start with a higher value. In both cases, you should use benchmarking techniques to tune this parameter.

This value should be set to quickly detect waits that are occurring because of an abnormal situation, such as a transaction that is stalled. You should set it high enough so that valid lock requests do not time out because of peak workload, during which time there is more waiting for locks.

You may use the database system monitor to help you track the number of times an application (connection) experienced a lock timeout, or the number of timeout situations that the database detected for all connected applications. This monitor element is called lock_timeouts. Related parameters are locklist and maxlocks.
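For example (the database alias SAMPLE and the 30-second value are illustrative only, and the relevant monitor switches may need to be enabled to see all lock elements), you could change the parameter and then watch the lock timeout counter in the database snapshot:

db2 update db cfg for sample using LOCKTIMEOUT 30
db2 get snapshot for database on sample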

Lock Timeout Avoidance

• Avoid global deadlocks with a lock wait timeout

• Special concern with DUOW applications

• LOCKTIMEOUT=nnnnn
  - Database configuration parameter
  - Specified in seconds
  - Default is -1 (waits forever)


Checkpoint

Exercise — Unit Checkpoint

1. Distributed Unit of Work (DUOW) is a unit of work or transaction that can access more than one database at a time. True or False?

___________________________________________________

2. Which option allows you to change a dormant connection to an active connection state using a "CONNECT" statement?

a. SQLRULES SQL

b. SQLRULES DB2

c. SQLRULES STD

___________________________________________________

3. Which options allow you to update at multiple databases in the same unit of work?

a. CONNECT 1 SYNCPOINT TWOPHASE

b. CONNECT 2 SYNCPOINT ONEPHASE

c. CONNECT 2 SYNCPOINT TWOPHASE

___________________________________________________


Figure 10-21. Unit Summary CF457.3

Notes:

Unit Summary

Having completed this unit, you should be able to:

• Describe the difference between RUOW and DUOW
• Identify design considerations for DUOW applications
• Implement components involved in two-phase commit
• Differentiate between type 1 and type 2 Connects
• Describe one-phase versus two-phase syncpoint requirements
• Identify how federated systems may be used


Unit 11. Federated Databases

What This Unit Is About

This unit provides information about accessing data from different databases by using the federated database feature.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe a heterogeneous distributed query

• Describe the functions available with federated databases

How You Will Check Your Progress

Accountability:

• Checkpoint • Machine exercises

References

IBM DB2 Universal Database Installation and Configuration Guide

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Universal Database Administration Guide: Performance

IBM DB2 Universal Database SQL Reference, Volume 1

IBM DB2 Universal Database SQL Reference, Volume 2

IBM DB2 Universal Database Federated Systems Guide


Figure 11-1. Unit Objectives CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Objectives

After completing this unit, you should be able to:

Describe a heterogeneous distributed query

Describe the functions available with federated databases


11.1 Federated Databases


Figure 11-2. Distributed Queries - General CF457.3

Notes:

• A federated system is designed to make it easy to access data, regardless of where it is actually stored. This is accomplished by creating nicknames for all data source objects that are needed.

• Applications or users submit SQL statements referencing two or more DBMSs or databases in a single statement.

• Transparent access to data in multiple data sources. This transparency permits the location of the remote source to be changed without needing to change the SQL code.

• Local DB2 UDB database is defined as a federated database.

• When you create an SQL statement, you don’t have to be concerned with issues like:

- Name of the table at the data source
- Server on which it resides
- Type of DBMS on which the table resides (such as Informix or Oracle)
- Query language or SQL dialect that the DBMS uses
- Data type mappings between the data source and DB2 UDB

(Slide: Distributed Queries - General. One SQL statement uses 'tables' T1, T2, T3, and T4, which are nicknames in the federated DB2 UDB database for UNIX and Windows; each nickname maps to a remote table or view in one of multiple heterogeneous sources, such as DB2 UDB for OS/390 and z/OS, Oracle, and others.)


All the underlying metadata is stored in the federated database catalog after the relevant setup and configuration steps, providing the federated server with the information it needs to process the queries.


Figure 11-3. DB2 UDB Federated Systems CF457.3

Notes:

The advantages of a DB2 federated system are that you can:

• Join data from local tables and remote data sources as if all the data is local

• Take advantage of the data source processing strengths by sending distributed requests to the data sources for processing.

• Compensate for SQL limitations at the data source by processing parts of a distributed request at the federated server.

To be able to use a federated system, the DBM CFG parameter FEDERATED must be set to YES.
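As a sketch, the parameter can be enabled from the CLP; the change typically takes effect only after the instance is stopped and started again:

db2 update dbm cfg using FEDERATED YES
db2stop
db2start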

The federated server in a federated system is a DB2 instance configured accordingly. Any number of DB2 instances can be configured to function as federated servers.

The federated server often sends parts of the requests it receives to the data sources for processing. A pushdown operation is processed remotely; in such a case the federated server acts as a client.

(Slide: DB2 UDB Federated Systems. DB2 UDB clients connect to a DB2 UDB federated server, whose federated database and global catalog provide access to DB2 family sources (DB2 UDB for Windows and UNIX, DB2 UDB for iSeries, DB2 UDB for z/OS and OS/390, DB2 for VM & VSE) and Informix, to Oracle, Sybase, Microsoft SQL Server, ODBC, and OLE DB sources through DB2 Relational Connect, and to BLAST, Documentum, Microsoft Excel, table-structured files, and XML sources through DB2 Life Sciences Data Connect.)


The federated server is a database manager instance to which application processes connect and submit requests. Two main features distinguish it from other application servers:

It is configured to receive requests that might be partially or entirely intended for data sources, and it distributes these requests to the data sources.

Like other application servers, a federated server uses DRDA communication protocols (such as TCP/IP) to communicate with the DB2 family. However, a federated server also uses other protocols to communicate with non-DB2 family instances.

Supported data source versions and access methods are:

• DB2 UDB for UNIX and Windows: versions 6.1, 7.1, 7.2, 8.1, 8.2; access method: DRDA; directly integrated in DB2 Version 8

• DB2 UDB for z/OS and OS/390: version 5 with PTF PQ07537, or later; access method: DRDA; directly integrated in DB2 Version 8

• DB2 UDB for iSeries: version 4.2 or later; access method: DRDA; directly integrated in DB2 Version 8

• DB2 Server for VM & VSE: version 3.3 or later; access method: DRDA; directly integrated in DB2 Version 8

• Informix: versions 7, 8, 9; access method: Informix Client SDK; directly integrated in DB2 Version 8

• ODBC: access method: ODBC 3.0 driver; requires DB2 Relational Connect

• OLE DB: OLE DB 2.0 or later; directly integrated in DB2 Version 8

• Oracle: versions 7.x, 8.x, 9.x; access method: SQL*Net or Net8 client software; requires DB2 Relational Connect

• Microsoft SQL Server: versions 6.5, 7.0, 2000; access method: on Windows, the Microsoft SQL Server Client ODBC 3.0 or higher driver; on UNIX, the Data Direct Technologies (formerly MERANT) Connect ODBC 3.6 driver; requires DB2 Relational Connect

• Sybase: versions 10.0, 11.0, 11.1, 11.5, 11.9, 12.0; access method: Sybase Open Client; requires DB2 Relational Connect

• BLAST: version 2.1.2; access method: BLAST daemon (supplied with the wrapper); requires DB2 Life Sciences Data Connect

• Documentum: Documentum Server EDMS 98 (also referred to as Version 3) and 4i; access method: Documentum Client API/Library; requires DB2 Life Sciences Data Connect

• Microsoft Excel: versions 97, 2000; access method: none; requires DB2 Life Sciences Data Connect

• Table-structured files: access method: none; requires DB2 Life Sciences Data Connect

• XML: 1.0 specification; access method: none; requires DB2 Life Sciences Data Connect


Figure 11-4. Federated Database CF457.3

Notes:

The federated database contains catalog entries that have information about data sources and their characteristics. The federated server consults the information stored in the federated database system catalog and the data source wrapper to determine the best plan for SQL processing.

The global catalog contains information about the entire federated system, such as:

• Objects in the federated database • Objects in the data sources • Column names • Column data types • Indexes • Column default values • Connect information to the sources • Mapping of user authorizations

An example of remote catalog information is the name used by the data source; an example of local catalog information is the name used by the federated database.

(Slide: Federated Database. The global catalog of the DB2 UDB federated database stores the wrappers, server definitions and options, user mappings, and nicknames that are used by the DB2 UDB optimizer.)


The Optimizer analyzes a query, and decides on the access plan and processing of a query, which could be:

• Processed by the data sources • Processed by the federated server • Processed partly by the data sources and partly by the federated server

Wrappers are routines stored in a library called a wrapper module, and allow the federated server to perform operations such as connection to a data source and retrieving data. For each type of data source, a wrapper must be created. Server definitions and nicknames (name, location, and so forth) are used to identify the specifics of each data source. Some wrappers have default wrapper names. If you use these default names, the federated server automatically picks up the data source library associated with the wrapper (DRDA for DB2 UDB sources, INFORMIX for Informix sources, SQLNet or Net8 for Oracle sources, and DJXMSSQL3 or MSSQLODBC3 for Microsoft SQL Server sources). A wrapper performs tasks such as:

• Connecting to the data source (standard connection API of the data source)
• Submitting queries to the data source
• Receiving result sets from the data source (standard APIs of the data source)
• Responding to federated server queries about the default data types
• Responding to federated server queries about default function mappings

Server definitions must be created for the data sources, so, for example, if the data source is an RDBMS, this information includes:

• Type and version of the RDBMS
• Database name for the data source
• Metadata for this specific RDBMS

The name and other information that the instance owner supplies to the federated server are collectively called the server definition.

User mapping and user options are used for the connection to the data source. By default the user ID and password with which the user connects to DB2 UDB is used. If on the data source another user ID and password is needed, you must define an association between the two authorizations, which is called user mapping.

Nicknames are identifiers that are used to reference an object located at the data source. The objects that nicknames identify are called data source objects (primarily tables and views). Nicknames are not alternative names for data source objects like aliases; they are pointers by which the federated server references these objects. The user is able to submit a distributed request without specifying the data source. The mappings eliminate the need to qualify the nicknames by data source names. When a nickname is created, metadata about the object is added to the global catalog.


Figure 11-5. DB2 UDB Federated Database GUI CF457.3

Notes:

Before you can use a federated database from the Control Center, you must set the FEDERATED parameter to YES.

From the Control Center, you are able to perform the following tasks:

• Create and alter wrappers

• Create and alter the server definitions

• Create and alter the user mappings

• Create and alter the nicknames

In addition, in the Control Center help text is available that guides you through the relevant steps.


Figure 11-6. Sample Scenario DML (1) CF457.3

Notes:

To be able to issue the SELECT statement which references two tables in different DB2 UDB databases, you first need to perform the necessary steps, such as create wrapper, create server, and so on. Let’s assume for this example that we have two DB2 UDB databases for Windows or UNIX.

To create the wrapper, the server definition, and the other federated objects, you can use the GUI in the Control Center, or you can perform these steps in the CLP (Command Line Processor).

Sample Scenario DML (1)

select a.order, b.supplier from Orders a, Supplier b
where a.order = b.order and article > 3000

DB2 UDB MYDB: table creator: Melanie, table name: Orders
columns: ORDER CHAR(6), ARTICLE SMALLINT

ORDER   ARTICLE
2ABA    1010
3ABA    2056
4ABA    5560
5ABA    1200
6ABA    9005

DB2 UDB YOURDB: table creator: Jiri, table name: Supplier
columns: ORDER CHAR(6), SUPPLIER VARCHAR(20)

ORDER   SUPPLIER
2ABA    SUP BROTHERS
3ABA    GM ARTICLES
4ABA    GMC CO
5ABA    SUP BROTHERS
6ABA    CROWFISH


Figure 11-7. Sample Scenario DML (2) CF457.3

Notes:

First you need to connect to a database:

connect to mydb

Then, create the wrapper for DB2 UDB data sources:

create wrapper drda

As the next step, create the server mapping for the data source. Because, in this example, we are connected to MYDB, we only need to create a server for the other database, YOURDB. If you use a separate database for your federated objects, then you need to create a server for MYDB as well:

create server otherdb type db2/UDB version 8.1 wrapper drda authid jiri password ****** options (dbname yourdb)

Now we create the user mapping (if needed). Also, if you use a separate database, you might need to create the user mapping for both databases:

© Copyright IBM Corporation 2004

Sample Scenario DML (2)

create wrapper "DRDA"

create server "OTHERDB" type DB2/UDB VERSION '8.1' wrapper "DRDA" authid "JIRI" password "******" options ( dbname 'YOURDB' );

create user mapping for "MELANIE" server "OTHERDB" options( remote_authid 'JIRI', remote_password '******')

create nickname "MELANIE"."SUPPLIER" for "OTHERDB"."JIRI"."SUPPLIER";


create user mapping for melanie server otherdb options (remote_authid jiri, remote_password ******)

If needed, as the next step the relevant type mappings should be done. In this example, we don’t need any type mapping.

As the last step, you create the nicknames. For this last point, the same applies as for create server or create user mapping; if you use a separate database, then you need to create nicknames for all tables accessed:

create nickname melanie.supplier for otherdb.jiri.supplier

Now, if a user or an application is connected to MYDB it can issue the SQL:

select a.order, b.supplier from orders a, supplier b where a.order = b.order and article > 3000
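If you want to confirm what was created, a sketch of catalog queries is shown below; it assumes the SYSCAT.WRAPPERS and SYSCAT.SERVERS views and the 'N' (nickname) type in SYSCAT.TABLES, so check the SQL Reference for the exact catalog view columns in your release:

db2 "select wrapname from syscat.wrappers"
db2 "select servername, servertype, serverversion from syscat.servers"
db2 "select tabschema, tabname from syscat.tables where type = 'N'"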


Figure 11-8. Sample Scenario DML (3) CF457.3

Notes:

Establish a connection to a DB2 UDB database.

Create wrappers to access the Oracle and DB2 for OS/390 data sources:

create wrapper sqlnet; create wrapper drda;

Create server mappings for the Oracle and DB2 for OS/390 data sources:

create server oracle1
   wrapper sqlnet
   type oracle
   version 7.2
   options (database "FRUIT", node "mvpdb0")

create server DB21
   wrapper drda
   type db2/390
   version 6.0
   options (node "MVS1DB2F", fold_pw "U")

Sample Scenario DML (3)

select a.fruit, a.quantity, b.supplier from produce a, supplier b
where a.fruit = b.fruit and quantity > 10

Oracle: table creator: J15USER1, table name: PRODUCE
columns: FRUIT CHAR(6), QUANTITY INT

FRUIT    QUANTITY
Apple    10
Pear     2
Banana   5
Orange   12
Lemon    9

DB2 for OS/390: table creator: smith, table name: supplier
columns: FRUIT CHAR(6), SUPPLIER VARCHAR(20)

FRUIT    SUPPLIER
Apple    Cosentino Orchards
Lemon    GM Foods
Orange   Seville
Pear     Sunkist
Banana   Windward


Presume that a user mapping is needed:

create user mapping from user
   to server oracle1
   options (remote_id j15user1, remote_pw j15user1)

create user mapping from user
   to server DB21
   options (remote_id "SMITH", remote_pw "SMITH")

The default data type mapping for an Oracle "number" data type is to a UDB "float" data type (but we want integer).

create type mapping oraint from integer to server oracle1 type number(9,0)

Create nicknames for the Oracle and DB2/390 tables.

create nickname o_produce for oracle1.j15user1.produce
create nickname s_supplier for db21.smith.supplier

A user or application can now issue a join between the nicknames:

select a.fruit, a.quantity, b.supplier from o_produce a, s_supplier b
where a.fruit = b.fruit and quantity > 10


Figure 11-9. Performance Issues in Federated Databases CF457.3

Notes:

To be able to execute the SQL statement, the SQL compiler (optimizer) consults information in the global catalog and the data source wrapper. This includes information about connecting to the data source, server attributes, mappings, index information, and processing statistics.

The compiler develops alternative strategies, called access plans, for processing the query, which might be:

• processed by the data sources • processed by the federated server • processed partly by the data sources and by the federated server

Important for this evaluation is the information about the data source capabilities and the data. DB2 composes a query into segments called query fragments. Typically it is more efficient to push down a query fragment to a data source to process the fragment (performed only on relational data sources). Factors influencing this are:

(Slide: Performance Issues in Federated Databases. The SQL query passes through the compiler phases Parse Query, Check Semantics, Rewrite Query, Pushdown Analysis, Optimize Access Plan, Remote SQL Generation, and Generate Executable Code before the plan is executed; the query graph model and resulting access plan can be examined through the Explain tables, the db2expln tool, Visual Explain, or the db2exfmt tool.)


• The amount of data that needs to be processed
• The processing speed of the data source
• The amount of data that the fragment will return
• The communication bandwidth

The query optimizer generates local and remote access plans for processing a query fragment, based on resource cost. DB2 UDB then chooses the plan it believes will process the query with the least resource cost.
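To see how much of a federated query is pushed down, you can capture and format the access plan. The following is a sketch only; it assumes the explain tables have already been created, uses MYDB and the nicknames from the earlier sample scenario as placeholders, and the db2exfmt options shown may vary by release. The remote portions of the plan appear as remote query fragments in the formatted output:

db2 connect to mydb
db2 set current explain mode explain
db2 "select a.fruit, a.quantity, b.supplier from o_produce a, s_supplier b where a.fruit = b.fruit and quantity > 10"
db2 set current explain mode no
db2exfmt -d mydb -1 -o fedplan.txt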


Figure 11-10. Additional Education CF457.3

Notes:

Develop your skills in this challenging hands-on experience that includes setting up, using, monitoring, and troubleshooting a federated database environment. Learn about the configuration for all supported DB2 UDB and DB2 Relational Connect data sources. Complete hands-on exercises between DB2 UDB for Windows, DB2 UDB for z/OS and OS/390, DB2 UDB for AIX, Microsoft SQL Server, and Oracle databases. Perform a five-way join between these data sources from a single SQL statement.

You will:

• Describe the advantages of using DB2 Federated Database technology for your systems and application integration project needs

• Describe the architectural and product components of an integrated DW or OLTP e-business solution related to the DB2 federated database approach, including the DB2 Relational Connect, DB2 UDB for UNIX, Windows and OS/2, and DB2 Connect products, in addition to the back-end data sources that are part of the environment.

© Copyright IBM Corporation 2004

Additional Education

CF47, DB2 Federated Database: Integrating Diverse Data

3.0 days

Key topics:

Configuring the federated environment

Security

Advanced stuff that you need

The Federated Catalog tables

The Federated Distributed Relational Database Architecture (DRDA) server

Tuning the federated environment


• Use the DB2 UDB Control Center to configure an operational federated database environment between any member of the DB2 UDB family and supported non-IBM sources, Oracle and Microsoft SQL Server

• Recognize key factors that influence query performance in an integrated federated database environment

• Use DB2 Visual Explain and DB2 monitoring tools to expose federated database performance problems

• Explain how to implement security to protect an integrated federated environment


Checkpoint

Exercise - Unit Checkpoint

1. What do you need to access (query) an Oracle database?

a. DB2 Connect

b. DB2 Common Server

c. DB2 Relational Connect

__________________________________________________

2. What are the necessary steps for federated database setup?

a. Create tables, links, and protocols

b. Create DB2 instance and sample database

c. Create wrappers, server definitions, user mapping, and nicknames

__________________________________________________


Figure 11-11. Unit Summary CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Summary

Having completed this unit, you should be able to:

Describe a heterogeneous distributed query

Describe the functions available with federated databases


Unit 12. Replication - Optional

What This Unit Is About

This unit describes how to replicate (or copy) data from one database to another using DB2 UDB’s replication capabilities.

What You Should Be Able to Do

After completing this unit, you should be able to:

• Describe the replication tools • List additional education opportunities

References

IBM DB2 Universal Database Administration Guide: Implementation

IBM DB2 Replication Guide and Reference


Figure 12-1. Unit Objectives CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Objectives

After completing this unit, you should be able to:

Describe the replication tools

List additional education opportunities


12.1 Replication


Figure 12-2. Replication Overview CF457.3

Notes:

Replication can be used to satisfy several needs, including moving data from a production system to another production system, consolidating data from various distributed databases, and moving data into a data warehouse environment for decision support activities.

The IBM UDB replication tools capture source data from the DB2 UDB logs or triggers or by applications. Captured data for copying is placed into a staging table called a changed data (CD) table.

A single staging table can serve as a source for multiple subscriptions, or multiple staging tables can be used for a single source, depending on the requirements. Typically, the staging table or tables reside on the same system as the source tables.

The Apply program reads the staging tables and applies the changes to the target tables using standard SQL statements.

One or more Apply programs can subscribe to a CD table and replicate data to one or more target tables.

© Copyright IBM Corporation 2004

Replication Overview

DPropR Functionality implemented in UDB

Major parts:

Capture: Grabs information from UDB logs

Apply: SQL routine to apply changes

Replication Center GUI support

Possibility of multiple targets and options


The Apply program handles column and row subsetting, performs SQL transformations, and manages commit scope based on subscription sets and table versus transaction consistent delivery.

You can tailor or enhance data as it is copied, and deliver detailed, subset, summarized, or derived data when and where it is needed.

IBM UDB replication consists of the replication administration features of the Replication Center, and two tools: the Capture and Apply programs. The Replication Center provides administration support for the replication environment with objects and actions that define and manage source and target table definitions.

The Capture and Apply programs are responsible for capturing the updates to the source tables and applying the changes to the target tables.


Figure 12-3. UDB Replication Concept CF457.3

Notes:

The Capture program is the replication tool that captures the changed data and makes the changed data available for replication. It runs at the source server database. The Capture program usually runs continuously, but you can bring it down while running utilities or changing replication sources. The Capture program runs independently of the Replication Center, but uses control information that is created by the Replication Center.

DB2 UDB records every transaction in a log file for diagnostic and recovery purposes. The Capture program monitors the DB2 UDB log to detect change records from source tables that both have the DATA CAPTURE CHANGES attribute and are defined as replication sources. The Capture program retrieves change and commit information from the active and archive logs on DB2 for MVS 4.1 or higher and DB2 Universal Database. These records contain a before-image and an after-image of the table row. The Capture program captures these changes in the CD tables. The Capture program also maintains information about committed units of work in the unit-of-work (UOW) table. This table is joined with the CD table to identify and replicate committed updates. The Apply program can then read the CD table, copy the changes to the target site, and apply them to copies of the source table.
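Registering a source through the Replication Center normally sets this attribute for you, but as a sketch (the table name melanie.orders is only a placeholder), the attribute that lets the Capture program see a table's changes in the log can also be set directly:

db2 alter table melanie.orders data capture changes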

(Slide: UDB Replication Concept. Changed data reaches the Capture program from the DB2 UDB log, from triggers, or from an application; Capture writes the changes to staging tables, and the Apply program, driven by control tables maintained through the administration interface, applies the changes to DB2 UDB or other target databases.)


The figure shows the relationship between the Capture program, the DB2 UDB log, the source table, and the control tables.

The Apply program is the replication tool that replicates copies of the source table data to the target table. You can run it at any server, but it is generally run at the target server database.

The Apply program reads the changed data that was previously captured and stored in a CD table, and applies the changes to target tables. The Apply program also reads data directly from source tables when copying the entire source table data for a full refresh to the target table.

The Apply program generally runs at the target server. It can run at any server in your network as long as it can connect to the source, control, and target servers. Several Apply program instances can run on the same or different servers.

The Apply program runs independently of the Replication Center, but uses control information that the Replication Center creates. The control information that the Apply program uses is stored in tables at the control server.

When the Apply program reads the changed data that is stored in CD tables, it applies it to target tables at either local or remote servers. It can also apply column functions, such as SUM and AVG, to the source table or CD table, and append the result to the target tables. The Apply program can run at any server that can connect through the SQL CONNECT statement to each database server where source and target tables reside. The above figure shows the Apply program's relationship with the source server control tables, the subscription definition control tables, and the target table. The control tables that are used by the Apply Program are described in the book below.

For more information, look in: Replication Guide and Reference.


Figure 12-4. Replication Center (1) CF457.3

Notes:

When launching the Replication Center, you can use the Replication Center Launchpad. Here, you will be guided through the following actions:

• Create the Capture control tables (control tables on the Capture control server)

• Register a source table (tables on the source server from which to copy data)

• Create the Apply control tables (control tables for the subscription sets on the Apply control server)

• Create a subscription set (group sources with targets; each source-target pair in a set is called a member)

• Start the Capture program (begin capturing DB2 changed data)

• Start the Apply program (replicate the changed data)

The Replication Center Launchpad also guides you through the different process steps and gives related information to help you finish the relevant tasks. The launchpad does not require that you perform the tasks in sequence.


The launchpad could be invoked at any time by selecting Launchpad from the Replication Center menu or by right-clicking the Replication Center folder in the navigator and selecting Start Launchpad.


Figure 12-5. Replication Center (2) CF457.3

Notes:

To start the Replication Center, use the db2rc command from a command window. On Windows systems, you can also start the Replication Center by using the Windows Start menu: Start -> Programs -> IBM DB2 -> General Administration Tools -> Replication Center. When the Control Center is running, you can start the Replication Center by selecting Replication Center from the Tools menu or by clicking the Replication Center icon.

If you use the Replication Center to operate the Capture, Apply, or Replication Alert Monitor Programs on remote systems, the DB2 Administration Server (DAS) must run on the local system that is running the Replication Center and on each of the remote DB2 systems that will run the Capture or Apply programs.

You can use the Replication Center to:

• Define the default in profiles for creating control tables, source objects, and target objects

• Create replication control tables


• Register replication sources

• Create subscription sets and add subscription set members to subscription sets

• Operate the Capture program

• Operate the Apply program

• Monitor the replication process

• Perform basic troubleshooting for replication

You can also use the Replication Center to perform many other replication administration tasks.

The Replication Center must be able to connect to many database servers, source servers, Capture control servers, Apply control servers, Monitor control servers, and target servers. For all remote databases and systems, a valid user ID and password are needed to connect. The Replication Center allows you to specify and manage user IDs and passwords. To manage IDs and passwords, right-click the Replication Center folder and select Manage Passwords for Replication Center.

Tasks performed with the Replication Center include:

• Creating replication profiles
• Creating control table profiles
• Creating source object profiles
• Creating target object profiles
• Creating replication control tables
  - Creating Capture control tables
  - Creating Apply control tables
  - Creating Monitor control tables
• Adding servers to the Replication Center
• Enabling a database for change capture
• Registering sources
• Creating subscription sets
• Defining the information for the subscription set
• Mapping sources to targets
• Scheduling the subscription set
• Adding SQL statements or stored procedures to the subscription set
• Activating or deactivating subscription sets
• Promoting replication objects
• Promoting registered tables or views
• Promoting subscription sets
• Forcing a full refresh of target tables
• Removing or deleting replication definitions
• Operating the Capture program
• Operating the Apply program
• Operating the Replication Alert Monitor
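The Capture and Apply programs can also be started outside the GUI. As a rough sketch only (the server names, Capture schema, and Apply qualifier are placeholders, and the exact parameters are documented in the Replication Guide and Reference), typical command-line invocations look like:

asncap capture_server=MYDB capture_schema=ASN
asnapply control_server=MYDB apply_qual=AQ00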


Figure 12-6. Additional Courses CF457.3

Notes:

Detailed information about these classes can be found under:

http://www.ibm.com/services/learning/

© Copyright IBM Corporation 2004

Additional Courses

DW14, Data Replication: Basic Usage

Duration: 2 days

DW15, Data Replication: Advanced Topics

Duration: 2 days

DW24, Q Replication for WebSphere MQ Administration

Duration: 3 days


Checkpoint

Exercise — Unit Checkpoint

1. What are the major components for data propagation?

__________________________________________________

__________________________________________________

2. For administering data propagation, you need to know many commands. True or False?

__________________________________________________


Figure 12-7. Unit Summary CF457.3

Notes:

© Copyright IBM Corporation 2004

Unit Summary

Having completed this unit, you should be able to:

Describe the replication tools

List additional education opportunities


Appendix A. Checkpoint Solutions

Unit 1 Checkpoint Solutions

1. What is the prerequisite to be able to run automatic RUNSTATS?

Auto_maint must be ON.

2. Which content in the Control Center is dependent on the selected object or folder?

a. Object Tree pane

b. Contents pane

c. Object Detail pane

3. The Health Center can be configured and started only through the GUI.

a. True

b. False

Unit 2 Checkpoint Solutions

1. Remote administration may be used to do all of the following functions, EXCEPT:

catalog directories on client

2. Write the command to catalog a local instance called inst1.

db2 CATALOG LOCAL NODE node-name INSTANCE INST1

3. Write the command to make an attachment to inst1.

db2 ATTACH TO node-name USER userid USING password

Unit 3 Checkpoint Solutions

1. If a user exceeds the defined resource limits, what actions can the governor take?

Either one.

2. SYSADM authority is required to use the governor. True or False?

True

3. What would these limits mean in a configuration file?


setlimit uowtime 3600 locks -1 rowssel 100000

A unit of work is limited to one hour, any number of locks may be held by an application, and up to 100,000 rows can be returned to an application.

4. Which is the minimum authority to run auditing?

SYSADM

5. db2audit is used to:

Trace user behavior

Improve system security

6. The audit log is in a readable format. True or False?

False

Unit 4 Checkpoint Solutions

1. db2diag.log is the only possible means to receive additional information regarding an error situation. True or False?

False.

2. What can you do to receive all the necessary information about your DB2 UDB environment and system?

c) Use the db2support tool

3. What you can use to check the integrity of your DB2 UDB online?

b) inspect
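
A sketch of an online integrity check, using illustrative file names:

db2 CONNECT TO sample
db2 INSPECT CHECK DATABASE RESULTS KEEP sample.ins
db2inspf sample.ins sample.txt

INSPECT writes its unformatted results file to the diagnostic data directory; db2inspf then converts it into readable text.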

Unit 5 Checkpoint Solutions

1. Parallelism is only applicable to those with a partitioned database. True or False?

False

2. The database configuration parameter for the degree of parallelism is DFT_DEGREE. If DFT_DEGREE is set to 2, which of the following statements is true?

a. By default, the degree of parallelism will be two or more subsection pieces. For example, a query could be broken into four subsection pieces for parallel processing.

b. By default, the degree of parallelism will be limited to two subsection pieces. For example, a query might not have any subsections for parallel processing.

c. By default, the degree of parallelism will always be 2. For example, a query will always be divided into two subsection pieces.

b. By default, the degree of parallelism will be limited to two subsection pieces. For example, a query might not have any subsections for parallel processing.

3. If the DFT_DEGREE database configuration parameter is set to 2 for one database you are connected to, and to ANY for another database you are connected to, then which statement is true?

a. By default, queries to the first database will have a degree of parallelism of 2, and queries to the second database will have a degree of parallelism determined by DB2 UDB.

b. By default, queries to either database will have a degree of parallelism of 2.

c. By default, queries to either database will have a degree of parallelism determined by DB2 UDB.

a. By default, queries to the first database will have a degree of parallelism of 2, and queries to the second database will have a degree of parallelism determined by DB2 UDB.
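
As an illustration (the database names are placeholders), intra-partition parallelism is enabled at the instance level, the default degree is set per database, and an application can still override it for its own session:

db2 UPDATE DBM CFG USING INTRA_PARALLEL YES
db2 UPDATE DB CFG FOR salesdb USING DFT_DEGREE 2
db2 UPDATE DB CFG FOR testdb USING DFT_DEGREE ANY
db2 "SET CURRENT DEGREE = '4'"

Changing INTRA_PARALLEL requires the instance to be stopped and restarted before it takes effect.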

Unit 6 Checkpoint Solutions

1. Which command exports the glamor table and all tables with the word money in them from the hollywood database? The tables are owned by the cruise and pfeifer userids.

a. db2move hollywood export -tc cruise, pfeifer -tn glamor, *money

b. db2move hollywood export -tc cruise, pfeifer -tn glamor, LIKE money

__________________________________________________

Correct Answer: a)
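
To bring the exported tables back into another database, the matching import step might look like this (run from the directory containing the db2move output files; the target database name is illustrative):

db2move hollywood2 import -io REPLACE_CREATE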

2. Which tool can be used to export all relevant information of an instance?

a. db2ocat

b. db2cfimp

c. db2cfexp

___________________________________________________

Correct Answer: c)

3. Can you reduce the size of containers?

a. Yes

b. Yes, for DMS table spaces only

c. No

___________________________________________________

Correct Answer: b)
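
For example, a DMS file container could be shrunk with ALTER TABLESPACE (the table space name, container path, and size in pages are illustrative):

db2 "ALTER TABLESPACE dms_ts1 REDUCE (FILE '/db2/cont0' 5000)"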

No checkpoint questions in units 7 or 8

Unit 9 Checkpoint Solutions

1. If you use the NONRECOVERABLE load option, you must specify the NONRECOVERABLE parameter when restoring your table from a backup. True or False?

False

2. A LOAD started with ALLOW NO ACCESS should be restarted. Is this possible?

a. No, it is not possible

b. Yes, but only with the ALLOW NO ACCESS option

c. Yes, with the ALLOW NO ACCESS or with the ALLOW READ ACCESS option

b. Yes, but only with the ALLOW NO ACCESS option

3. FASTPARSE improves the run-time performance of the load utility by doing what?

a. Increasing validation checks

b. Checking if your system has additional processors, and utilizing them automatically

c. Reducing validation checks

c. Reducing validation checks
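
Putting the three answers together, a non-recoverable load that uses FASTPARSE, and its restart after an interruption, might be written as follows (file and table names are illustrative):

db2 "LOAD FROM movies.del OF DEL MODIFIED BY fastparse INSERT INTO hollywood.glamor NONRECOVERABLE ALLOW NO ACCESS"
db2 "LOAD FROM movies.del OF DEL MODIFIED BY fastparse RESTART INTO hollywood.glamor NONRECOVERABLE ALLOW NO ACCESS"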

Unit 10 Checkpoint Solutions

1. Distributed Unit of Work (DUOW) is a unit of work or transaction that can access more than one database at a time. True or False?

True

2. Which option allows you to change a dormant connection to an active connection state using a "CONNECT" statement?

a. SQLRULES SQL

b. SQLRULES DB2

c. SQLRULES STD

b. SQLRULES DB2

3. Which options allow you to update at multiple databases in the same unit of work?

a. CONNECT 1 SYNCPOINT TWOPHASE

b. CONNECT 2 SYNCPOINT ONEPHASE

c. CONNECT 2 SYNCPOINT TWOPHASE

c. CONNECT 2 SYNCPOINT TWOPHASE
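
A minimal CLP sketch of a two-phase-commit unit of work across two databases (the database and table names are placeholders, and the instance must already be configured for two-phase commit, for example with a transaction manager database):

db2 SET CLIENT CONNECT 2 SQLRULES DB2 SYNCPOINT TWOPHASE
db2 CONNECT TO salesdb
db2 "UPDATE branch SET status = 'OPEN'"
db2 CONNECT TO hqdb
db2 "UPDATE summary SET status = 'OPEN'"
db2 COMMIT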

Unit 11 Checkpoint Solutions

1. What do you need to access (query) an Oracle database?

a. DB2 Connect

b. DB2 Common Server

c. DB2 Relational Connect

c. DB2 Relational Connect

2. What are the necessary steps for federated database setup?

a. Create tables, links, and protocols

b. Create DB2 instance and sample database

c. Create wrappers, server definitions, user mapping, and nicknames

c. Create wrappers, server definitions, user mapping, and nicknames
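
The four steps in answer c map directly onto four DDL statements; a sketch for an Oracle source follows (the server, node, schema, and authentication values are illustrative, and federated support must be enabled in the database manager configuration):

db2 "CREATE WRAPPER NET8"
db2 "CREATE SERVER ora_srv TYPE ORACLE VERSION 9 WRAPPER NET8 OPTIONS (NODE 'ora_tns')"
db2 "CREATE USER MAPPING FOR USER SERVER ora_srv OPTIONS (REMOTE_AUTHID 'scott', REMOTE_PASSWORD 'tiger')"
db2 "CREATE NICKNAME ora.emp FOR ora_srv.SCOTT.EMP"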

Unit 12 Checkpoint Solutions

1. What are the major components for data propagation?

Capture

Apply

Replication Center for administration and control tables

2. For administering data propagation, you need to know many commands. True or False?

False. You can use the Replication Center.
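
For example, once the Replication Center has created the control tables and defined the registrations and subscription sets, the Capture and Apply programs are typically started from the command line (the server names, Capture schema, and Apply qualifier are illustrative):

asncap capture_server=SAMPLE capture_schema=ASN
asnapply control_server=SAMPLE apply_qual=AQ00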

Bibliography

Books

Please refer to the Quick Beginnings manual for your platform for a listing of the books available.

Additional DB2 Information

Where Can I Go For Additional DB2 Information?

World Wide Web

IBM's Internet Connection Web site

http://www.ibm.net

Internet "Requests for Comments" Editor

http://www.isi.edu/rfc-editor/

IBM IT Education Services Web site

http://www-3.ibm.com/services/learning

IBM DB2 and Business Intelligence Technical Conference Web site

http://www-3.ibm.com/services/learning/conf/db2

ITSO Redbooks Web site

http://www.redbooks.ibm.com

DB2 Support Web site

http://www-4.ibm.com/cgi-bin/db2www/data/db2/udb/winos2unix/support/index.d2w/report

DB2 Information Web site

http://www-4.software.ibm.com/data/db2

DB2 Technical Library Web site

http://www-4.software.ibm.com/data/db2/library

DB2 Certification Web site

http://www-1.ibm.com/certify

CompuServe

IBM DB2 Family Forum (GO IBMDB2)

Internet News Groups

comp.databases.ibm-db2
bit.listserv.db2-l

Anonymous FTP Site

ftp.software.ibm.com (directory /ps/products/db2)

DB2 Magazine

IBM has contracted with Miller Freeman Inc. to produce and distribute DB2 Magazine. You can subscribe to DB2 Magazine by going to the IBM Data Management page at http://www.software.ibm.com/data/pubs and using the link under "Newsletters".
