sudhakarsoorna
DB2 - IBM’s Relational DBMS
Prerequisite for this course
Participants should already be familiar with :
• IBM Mainframe Concepts
• COBOL and File Handling Concepts
• VSAM
Day 1 - Session 1
Topics to be covered in this session
• Introduction to databases - covers their advantages and the types of databases
• Relational database concepts - covers Properties, Terminology, Normalization, Integrity rules, CODD’s Relational Rules and the E-R model
Introduction to Databases
What is Data ?
‘A representation of facts or instructions in a form suitable for communication’ - IBM Dictionary
What is a Database ?
‘A repository for stored data’ - C.J. Date
Introduction to Databases (contd...)
What is a database system ?
An integrated and shared repository for stored data, or a collection of stored operational data used by the application systems of some particular enterprise.
Or
‘Nothing more than a computer-based record keeping system’.
Advantages of DBMS over File Management Systems
• Reduced data redundancy
• Multiple views
• Shared data
• Data independence (logical/physical)
• Data dictionary
• Search versatility
• Cost effective
• Security & Control
• Recovery, restart & backup
• Concurrency
TYPES OF DATABASES (or Models)
• Hierarchical Model
• Network Model
• Relational Model
• Object-Oriented Model
Types of Databases (contd...)
HIERARCHICAL
• Top down structure resembling an upside-down tree
• Parent child relationship
• First logical database model
• Available on most of the Mainframe computers
• Example - IMS
Types of Databases (contd...)
NETWORK
• Does not distinguish between parent and child. Any record type can be associated with any number of arbitrary record types
• Designed to overcome the limitations of the hierarchical model; in practice the two differ little, because both have been enhanced frequently
• Example - IDMS
Types of Databases (contd...)
RELATIONAL
• Data is stored in tables, each consisting of multiple rows and columns.
• Examples - DB2, Oracle, Sybase, Ingres etc.
OBJECT-ORIENTED MODEL
• Data attributes and methods that operate on those attributes are encapsulated in structures called objects
RELATIONAL DB CONCEPTS
Relational Properties
• Why relational ? - ‘Relation’ is the mathematical term for a table, so a relational database is one that users perceive as a set of tables.
• All data values are atomic.
• Entries in a column are all from the same domain
• Sequence of rows (top to bottom) is insignificant
• Each row is unique
• Sequence of columns (left to right) is insignificant
Relational Concepts (Terminology)
• Relation : A table (loosely, a file)
• Tuple : A row; it contains an entry for each attribute
• Attribute : A column, i.e. a characteristic that describes the entity
• Domain : The pool of values from which a column's entries are drawn
• Entity : Some object about which we wish to store information
• Null : Represents an unknown or missing value
• Atomic value : The smallest unit of data; an individual data value
Relational Concepts (contd...)
• Candidate key : An attribute (or set of attributes) that uniquely identifies each row (tuple) in the relation (table)
• Primary key : The candidate key chosen to uniquely identify each row.
• Alternate key : The remaining candidate keys that were not chosen as the primary key
• Foreign key : An attribute (or set of attributes) of one relation whose values match the primary key of another relation.
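The key types above can be sketched in DDL. This is a minimal illustration and not part of the course material: the DEPT and EMP tables, their columns and their data types are assumptions chosen only to show a primary key, an alternate key and a foreign key side by side.

```sql
-- Hypothetical tables; names and column sizes are illustrative only.
CREATE TABLE DEPT
  (DEPT_NO    CHAR(3)      NOT NULL,
   DEPT_NAME  VARCHAR(30)  NOT NULL,
   PRIMARY KEY (DEPT_NO));

CREATE TABLE EMP
  (EMP_NO     INTEGER      NOT NULL,   -- candidate key chosen as the primary key
   TAX_ID     CHAR(9)      NOT NULL,   -- second candidate key, kept as an alternate key
   EMP_NAME   VARCHAR(30)  NOT NULL,
   DEPT_NO    CHAR(3),                 -- foreign key; nullable
   PRIMARY KEY (EMP_NO),
   UNIQUE (TAX_ID),
   FOREIGN KEY (DEPT_NO) REFERENCES DEPT (DEPT_NO));
```

Some DB2 releases may treat a table with a PRIMARY KEY or UNIQUE clause as incomplete until a unique index exists on those columns, so a CREATE UNIQUE INDEX step may also be needed.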
Normalization (1NF - 5NF)
Normalization brings the design of a database into a standard form.
• 1NF : All entities must have a unique identifier, or key, that can be composed of one or more attributes. All attributes must be atomic and non repeating.
• 2NF : Partial functional dependencies removed - all attributes that are not a part of the key must depend on the entire key for that entity.
Normalization (contd...)
• 3NF : Transitive dependencies removed - attributes that are not a part of the key must not depend on any other non-key attribute.
• 4NF : Multi-valued dependencies removed
• 5NF : Join dependencies removed, eliminating the remaining anomalies
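A small worked decomposition may help make the first three normal forms concrete. The schema below is hypothetical (a student enrolment example that does not appear elsewhere in this course), and the data types are assumptions.

```sql
-- Hypothetical unnormalized design (shown as a comment only):
--   ENROLL_FLAT (STUDENT_NO, COURSE_NO, STUDENT_NAME,
--                DEPT_NO, DEPT_NAME, COURSE_TITLE, GRADE)
--   key = (STUDENT_NO, COURSE_NO)
-- STUDENT_NAME and DEPT_NO depend on STUDENT_NO alone -> partial dependency (removed for 2NF)
-- COURSE_TITLE depends on COURSE_NO alone             -> partial dependency (removed for 2NF)
-- DEPT_NAME depends on DEPT_NO, a non-key attribute   -> transitive dependency (removed for 3NF)

CREATE TABLE DEPARTMENT
  (DEPT_NO      CHAR(3)     NOT NULL,
   DEPT_NAME    VARCHAR(30) NOT NULL,
   PRIMARY KEY (DEPT_NO));

CREATE TABLE STUDENT
  (STUDENT_NO   INTEGER     NOT NULL,
   STUDENT_NAME VARCHAR(30) NOT NULL,
   DEPT_NO      CHAR(3),
   PRIMARY KEY (STUDENT_NO),
   FOREIGN KEY (DEPT_NO) REFERENCES DEPARTMENT (DEPT_NO));

CREATE TABLE COURSE
  (COURSE_NO    CHAR(6)     NOT NULL,
   COURSE_TITLE VARCHAR(40) NOT NULL,
   PRIMARY KEY (COURSE_NO));

-- Only GRADE depends on the whole key (STUDENT_NO, COURSE_NO).
CREATE TABLE ENROLLMENT
  (STUDENT_NO   INTEGER     NOT NULL,
   COURSE_NO    CHAR(6)     NOT NULL,
   GRADE        CHAR(1),
   PRIMARY KEY (STUDENT_NO, COURSE_NO),
   FOREIGN KEY (STUDENT_NO) REFERENCES STUDENT (STUDENT_NO),
   FOREIGN KEY (COURSE_NO)  REFERENCES COURSE (COURSE_NO));
```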
Types of Integrity
• Entity Integrity : A state in which no column that is part of a primary key contains a null value.
• Referential Integrity : A state in which every foreign key value in the first table either matches a primary key value in the second table or is wholly null
• Domain Integrity : Every value in a column belongs to the set of values allowed for that column
CODD's RELATIONAL RULES
1. All information in a relational database is represented explicitly at the logical level and in exactly one way - by values in tables
2. Each and every datum(atomic value) in a relational database is guaranteed to be logically accessible by resorting to a combination of tablename, primary key value, and column name
CODD's RELATIONAL RULES (contd...)
3. Null values are supported for representing missing information in a systematic way irrespective of the datatype.
4. The database description is represented at the logical level in the same way as ordinary data, so that authorized users can apply the same relational language to its interrogation as they apply to the regular data.
CODD's RELATIONAL RULES (contd...)
5. A relational system may support several languages and various modes of terminal use. However, there must be at least one language whose statements can express all of the following items: (1) data definitions (2) view definitions (3) data manipulation (interactive and by program) (4) integrity constraints (5) authorization (6) transaction boundaries (begin, commit, rollback)
CODD's RELATIONAL RULES (contd...)
6. All views that are theoretically updatable are also updatable by the system
7. The capability of handling a base relation or a derived relation (view) as a single operand applies not only to the retrieval of data but also to the insertion, update and deletion of data
CODD's RELATIONAL RULES (contd...)
8. Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods
9. Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables.
CODD's RELATIONAL RULES (contd...)
10. Integrity constraints specific to a particular relational database must be definable in the relational data sublanguage and storable in the catalog, not in the application programs.
11. The data manipulation sublanguage of a relational DBMS must enable application programs and inquiries to remain logically the same whether and whenever data are physically centralized or distributed.
CODD's RELATIONAL RULES (contd...)
12. If a relational system has a low-level (single-record-at-a-time) language, that low-level language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level relational language (multiple-records-at-a-time)
Entity Relationship Model
• E-R model is a logical representation of data for a business area
• Represented as entities, relationships between entities, and attributes of both relationships and entities
• E-R models are outputs of the analysis phase, i.e. they are conceptual data models expressed in the form of an E-R diagram
Example of a Relational Structure
CUSTOMER Places ORDERS
ORDERS Has PRODUCTS
The above relations can be interpreted as follows :
• Each order relates to only one customer (many-to-one)
• Many orders can contain many products (many-to-many)
• A CUSTOMER can place any number of orders (one-to-many)
Entity Relationship Model (contd...)
• In the above example CUSTOMER, ORDERS & PRODUCTS are called ENTITIES.
• An Entity may transform into table(s).
• The unique identifier for the information stored in an ENTITY is called a PRIMARY KEY. E.g. CUST_NO uniquely identifies each customer
Entity Relationship Model (contd...)
A table essentially consists of
• Attributes, which define the characteristics of the table
• Primary key, which uniquely identifies each row of data stored in a table
• Secondary & Foreign Keys/indexes
Entity Relationship Model (contd...)
Table Definition :
Table ‘CUSTOMER’ -
Attributes - CUST_NO, CUST_NAME, CUST_LOCATION, CUST_ID, ORDER_NO...
Primary Key - CUST_NO
Secondary Key - CUST_ID
Foreign Key - ORDER_NO
Entity Relationship Model (contd...)
• The relationships transform into foreign keys. E.g. CUSTOMER is related to ORDERS through ‘ORDER_NO’, which is the foreign key in CUSTOMER and the primary key in ORDERS. So the relationship ‘Places’ is implemented through ORDER_NO.
• As per entity integrity, the primary key ORDER_NO of the table ‘ORDERS’ can never be null, while as a foreign key in the table ‘CUSTOMER’ it can be.
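The CUSTOMER and ORDERS definitions described above could be written roughly as follows. The column names come from the slides; the data types, lengths and the NO_PRODUCTS column placement are assumptions made for the sketch.

```sql
-- Parent table: ORDER_NO is the primary key and therefore NOT NULL.
CREATE TABLE ORDERS
  (ORDER_NO      INTEGER      NOT NULL,
   NO_PRODUCTS   SMALLINT,
   PRIMARY KEY (ORDER_NO));

-- Dependent table: ORDER_NO is a foreign key here and may be null.
CREATE TABLE CUSTOMER
  (CUST_NO       INTEGER      NOT NULL,
   CUST_NAME     VARCHAR(30)  NOT NULL,
   CUST_LOCATION VARCHAR(30),
   CUST_ID       CHAR(10),
   ORDER_NO      INTEGER,
   PRIMARY KEY (CUST_NO),
   FOREIGN KEY (ORDER_NO) REFERENCES ORDERS (ORDER_NO));
```

With the foreign key placed in CUSTOMER, each customer row can point to only one order. A design that has to record many orders per customer would more commonly carry CUST_NO as a foreign key in ORDERS instead.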
Entity Relationship Model (contd...)
• Tables exist in Tablespaces. A tablespace can contain one or more tables
• Apart from the Primary Key, a table can have many secondary keys/indexes, which exist in Indexspaces.
• These tablespaces and indexspaces together exist in a Database
Entity Relationship Model (contd...)
• To carry out the transformations described above we need a tool for creating tables, manipulating the data in them, and creating relationships, indexes, tablespaces, indexspaces and so on. DB2 provides SQL, which performs these functions. The next part deals briefly with SQL and its functions; a detailed explanation is taken up later.
Day 1 - Session 2
Topics to be covered in this session
• SQL - all creation, manipulation and use of data objects involves SQL statements.
• DB2 objects - Databases, Tablespaces & Indexspaces - creation & use, and other terminology associated with databases.
• DDL - Data Definition Language
An introduction to SQL
SQL or Structured Query Language is
• A powerful language that performs the functions of data manipulation (DML), data definition (DDL) and data control or data authorization (DAL/DCL).
• A non-procedural language - it acts on a set of data, and the user does not need to know how the data is retrieved. A single SQL statement can do the work of more than one procedure.
• The de facto standard query language for RDBMS
• Very flexible
Introduction to SQL (contd...)
SQL - Features :-
• Unlike COBOL or 4GLs, SQL is coded without data-navigational instructions. The optimal access paths are determined by the DBMS. This is advantageous because the DBMS knows better than the user how it has stored the data.
• You specify what you want, not how to get it
• Set level processing & multiple row processing
SQL - Types (based on the functionality)
• Data Definition Language (DDL)
- Create, Alter and Drop
• Data Manipulation Language (DML)
- Select, Insert, Update and Delete
• Data Control Language (DCL)
- Grant and Revoke
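One statement of each type, as a rough illustration. The PRODUCTS table follows the entity named earlier in the course, but its columns, data types and the authorization ID USER1 are assumptions.

```sql
-- DDL: define an object
CREATE TABLE PRODUCTS
  (PROD_ID    INTEGER      NOT NULL,
   PROD_NAME  VARCHAR(30),
   PRIMARY KEY (PROD_ID));

-- DML: change and read data
INSERT INTO PRODUCTS (PROD_ID, PROD_NAME) VALUES (1, 'NUTS');
SELECT PROD_ID, PROD_NAME FROM PRODUCTS WHERE PROD_NAME = 'NUTS';

-- DCL: give and take away authority (USER1 is a hypothetical authorization ID)
GRANT SELECT ON TABLE PRODUCTS TO USER1;
REVOKE SELECT ON TABLE PRODUCTS FROM USER1;
```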
SQL - Types (Others)
• Static or Dynamic SQL
• Embedded or Stand-alone SQL
The following operations can be performed by SQL on database tables :
• Select
• Project
• Union
• Intersection
• Difference
• Join
• Divide
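The operations listed above map onto SQL roughly as sketched here. CUSTOMER and ORDERS follow the course example; CUSTOMER_EAST and CUSTOMER_WEST are hypothetical tables with the same layout, introduced only so that the set operations have two inputs. INTERSECT and EXCEPT are standard SQL but are not available in every DB2 release; where they are missing, an IN or NOT EXISTS subquery gives the same result. Divide has no single SQL operator and is usually written with nested NOT EXISTS subqueries, so it is left out.

```sql
-- Select (restriction): a subset of rows
SELECT * FROM CUSTOMER WHERE CUST_NO > 1000;

-- Project: a subset of columns, duplicates removed
SELECT DISTINCT CUST_NAME FROM CUSTOMER;

-- Union: rows that appear in either table
SELECT CUST_NO FROM CUSTOMER_EAST
UNION
SELECT CUST_NO FROM CUSTOMER_WEST;

-- Intersection: rows that appear in both tables
SELECT CUST_NO FROM CUSTOMER_EAST
INTERSECT
SELECT CUST_NO FROM CUSTOMER_WEST;

-- Difference: rows in the first table that are not in the second
SELECT CUST_NO FROM CUSTOMER_EAST
EXCEPT
SELECT CUST_NO FROM CUSTOMER_WEST;

-- Join: combine rows of two tables on matching column values
SELECT C.CUST_NO, C.CUST_NAME, O.NO_PRODUCTS
  FROM CUSTOMER C, ORDERS O
 WHERE C.ORDER_NO = O.ORDER_NO;
```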
Topics dealt with in DB2 objects
• Stogroup, Databases, Tablespaces (types, creation and modification)
• Indexspaces (creation and modification)
• Some more terms associated with tablespaces
DB2 Objects
• The DB2 Object Hierarchy
Stogroup
• It is a collection of direct access volumes, all of the same device type
• The stogroup to use is specified as part of the tablespace definition
• When a given space needs to be extended, storage is acquired from the appropriate stogroup
Database
• A collection of logically related objects - like Tablespaces, Indexspaces, Tables etc.
• Not a physical object - its contents may be spread over more than one disk
• A STOGROUP & a BUFFERPOOL (a buffer area used to hold recently accessed table and index pages) must be defined for each database.
• Stogroup-defined and user-defined VSAM are the two storage allocation methods for a DB2 dataset.
Database (contd...)
• In a given database, all the spaces need not have the same stogroup
• Stogroups are, in a sense, the most physical of the various storage objects in DB2
• More than one volume can be defined in a stogroup. DB2 keeps track of which volume was defined first & uses that volume.
Tablespaces
• Logical address space on secondary storage to hold one or more tables
• A ‘SPACE’ is basically an extendable collection of pages, with each page 4K or 32K bytes in size.
• It is the storage unit for recovery and reorganization purposes
• Three types of Tablespaces - Simple, Partitioned & Segmented
Simple Tablespace
• Can contain more than one stored table
• Depending on the application, storing more than one table might enable faster retrieval for joins using these tables
• Usually only one table is preferred. This is because a single page can contain rows from all the tables defined in the tablespace.
• LOAD with the REPLACE option deletes all data in the tablespace
Segmented Tablespaces
• Can contain more than one stored table, but in a segmented space
• A ‘Segment’ consists of a logically contiguous set of ‘n’ pages.
• The SEGSIZE parameter sets the number of pages in each segment
• No segment is allowed to contain records for more than one table
• Sequential access to a particular table is more efficient
Segmented Tablespaces (contd...)
• Mass Delete is much more efficient than in any other Tablespace
• Reorganizing the tablespace will restore every table to its clustered order
• Lock Table on table locks only the table, not the entire tablespace
• If a table is dropped, the space for that table can be reclaimed with minimum reorg
Partitioned Tablespaces
• Primarily used for very large tables
• Only one table in a partitioned TS; 1 to 64 partitions/TS
• The NUMPARTS parameter specifies the no. of partitions
• It is partitioned in accordance with value ranges for single or a combination of columns. Hence these column(s) cannot be updated
• Individual partitions can be independently recovered and reorganized
• Different partitions can be stored on different storage groups for efficient access.
Tablespace parameters to be specified for TS creation
• LOCKSIZE - indicates the type of locking DB2 performs for the given TS
- Page
- Table
- Tablespace
- ANY - DB2 decides the lock size, usually starting with page locks
Tablespace parameters (contd...)
• USING - method of storage allocation - Stogroup or VCAT
• PCTFREE - % of each page left free for future inserts
• FREEPAGE - no. of pages after which an empty page is left
• BUFFERPOOL - BP0, BP1, BP2 & BP32K
• CLOSE - Yes/No - whether the underlying VSAM datasets are closed when the table is not in use. The max no. of datasets that can be open in DB2 at a time is 10,000
Tablespace parameters (contd...)
• ERASE - Yes/No - whether the physical DASD where the TS resides is overwritten with binary zeros when the TS is dropped
• NUMPARTS - For Partitioned Tablespaces
• SEGSIZE - For Segmented Tablespaces
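The parameters above come together in the CREATE statements for the storage objects. The sketch below uses the syntax style of the DB2 for MVS releases these slides describe; every object name (CUSTSG, CUSTCAT, CUSTDB, CUSTTS), the volume serials and the space quantities are assumptions, and current DB2 releases may restrict or deprecate some of these tablespace types.

```sql
-- Storage group: a set of DASD volumes plus the VSAM catalog to use
CREATE STOGROUP CUSTSG
  VOLUMES (VOL001, VOL002)
  VCAT CUSTCAT;

-- Database: names the default stogroup and bufferpool for its spaces
CREATE DATABASE CUSTDB
  STOGROUP CUSTSG
  BUFFERPOOL BP0;

-- Segmented tablespace (SEGSIZE given); NUMPARTS would be used instead
-- for a partitioned tablespace
CREATE TABLESPACE CUSTTS
  IN CUSTDB
  USING STOGROUP CUSTSG
    PRIQTY 720          -- primary space, in KB
    SECQTY 72           -- secondary space, in KB
    ERASE NO
  PCTFREE 10            -- leave 10% of each page free
  FREEPAGE 15           -- leave an empty page after every 15 pages
  SEGSIZE 32            -- 32 pages per segment
  LOCKSIZE ANY          -- let DB2 choose the lock size
  BUFFERPOOL BP0
  CLOSE NO;
```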
VCAT Option
• User-defined VSAM datasets have to be defined explicitly using the AMS utility IDCAMS
• Two types of VSAM datasets are used - ESDS & LDS. The Linear Data Set is used more efficiently by DB2
• VSAM datasets defined here differ from plain VSAM datasets - they can be accessed only through the VSAM Media Manager
Data Definition Language
CREATE
This statement is used to create objects
Syntax : For Creating a Table
CREATE TABLE <tabname> (Col Definitions,
PRIMARY KEY (Columns),
FOREIGN KEY (Columns) (referential constraint),
UNIQUE (Colname))
[LIKE Table name / View name]
[IN Database.Tablespace Name]
Data Definition Language (contd...)
• Foreign key syntax : FOREIGN KEY (Columns) REFERENCES table ON DELETE ‘delete rule’
• Table1 references Table2 (the target) - Table2’s primary key is the foreign key defined in Table1
• The delete rules that can be used are CASCADE, RESTRICT & SET NULL (the referential constraint in the foreign key definition)
• Inserting (or updating) a row in the referencing table requires a foreign key value that matches a primary key value in the target or is null; a primary key value in the target can be updated only if no rows in the referencing table match it
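The three delete rules can be sketched as below. ORDER_ITEM is a hypothetical table added for the illustration, and the sketch assumes ORDERS, PRODUCTS and CUSTOMER tables with the primary keys used earlier in the course.

```sql
-- Hypothetical table linking orders to products
CREATE TABLE ORDER_ITEM
  (ORDER_NO  INTEGER  NOT NULL,
   PROD_ID   INTEGER  NOT NULL,
   QTY       SMALLINT,
   PRIMARY KEY (ORDER_NO, PROD_ID),
   -- CASCADE: deleting an order also deletes its items
   FOREIGN KEY (ORDER_NO) REFERENCES ORDERS (ORDER_NO)
     ON DELETE CASCADE,
   -- RESTRICT: a product cannot be deleted while items refer to it
   FOREIGN KEY (PROD_ID) REFERENCES PRODUCTS (PROD_ID)
     ON DELETE RESTRICT);

-- SET NULL: deleting an order sets CUSTOMER.ORDER_NO to null
-- (the foreign key column must be nullable for this rule)
ALTER TABLE CUSTOMER
  ADD FOREIGN KEY (ORDER_NO) REFERENCES ORDERS (ORDER_NO)
    ON DELETE SET NULL;
```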
Data Definition Language (contd...)
ALTER
This statement is used for altering DB2 objects
Syntax : For altering a Table
ALTER TABLE <Tablename>
ADD Column Data-type [NOT NULL WITH DEFAULT]
• Alter allows primary & Foreign key specifications to be changed
• It does not support changes to width or data type of a column or dropping a column
Data Definition Language (contd...)
DROP
This statement is used for dropping all DB2 objects
Syntax : For dropping a table
DROP TABLE <Tablename>
Some general rules for RI & Table Parameters
• Avoid nulls in columns participating in arithmetic logic or comparisons
• Primary key columns cannot be null
• Limit referential structures to no more than three levels in a direction
• Use DB2’s inherent features rather than program-coded RI.
Day 2 - Session 1
Topics to be covered in this session
• More SQL - Insight into the DML statement Select
- Simple Queries
- Functions
- Complex Queries
• Other DML statements Insert, Update and Delete
• Dynamic SQL Vs Static SQL
• More on DB2 Objects (Indexes, Views, Alias etc...)
SQL - Selection & Projection
• Selection retrieves a specified subset of rows (but all columns) from a table
• Projection retrieves a specified subset of columns (but all rows) from the table
E.g. : SELECT CUST_NO, CUST_NAME FROM CUSTOMER;
• The WHERE clause defines the predicates for the SQL operation.
• A WHERE clause can combine multiple conditions using AND & OR.
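A short sketch of selection and projection combined, using the CUSTOMER table from the course example. The location values are made up for the illustration.

```sql
-- Projection only: two columns, every row
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER;

-- Selection and projection: two columns, only the rows that satisfy the predicates.
-- AND binds more tightly than OR, so the OR pair is parenthesized.
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER
 WHERE (CUST_LOCATION = 'PUNE' OR CUST_LOCATION = 'DELHI')
   AND CUST_NO > 1000;
```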
Other Clauses
Several other predicates can be coded in the WHERE clause, some of which are :-
• Between / Not Between
• In / Not In
• Like / Not Like
• IS NULL / IS NOT NULL
SELECT using a range :
Between Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR FROM CUSTOMER
WHERE CUST_NO BETWEEN 1000 AND 2000;
In Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR FROM CUSTOMER
WHERE CUST_NO IN(1000, 1001,1002);
Select clause (contd...)
Like Clause
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR
FROM CUSTOMER
WHERE CUST_ID [NOT] LIKE '425%'
Note :- ‘_’ matches a single char ; ‘%’ matches a string of zero or more chars
ESCAPE ‘\’ - names an escape char; when it precedes ‘_’ or ‘%’ it overrides their special meaning
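The wildcard and escape rules can be illustrated as follows. The CUST_ID values being searched for are hypothetical. There is no default escape character, so the ESCAPE clause has to name one whenever a literal ‘_’ or ‘%’ is needed.

```sql
-- CUST_ID values whose second character is A
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER
 WHERE CUST_ID LIKE '_A%';

-- CUST_ID values that begin with the literal text 10%
-- (\% is a literal percent sign, the final % is the wildcard)
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER
 WHERE CUST_ID LIKE '10\%%' ESCAPE '\';
```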
Select clause (contd...)
NULL Clause : To check for null the syntax is ‘IS NULL’
E.g. SELECT CUST_NO, CUST_NAME, ORDER_NO
FROM CUSTOMER
WHERE ORDER_NO IS NULL;
However, if ORDER_NO contains null values, any other comparison against it (e.g. ORDER_NO = 5) always evaluates as ‘Not True’ for those rows.
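The effect of nulls on comparisons is easy to miss, so a brief sketch may help. It assumes the CUSTOMER table from the course example, with some rows holding a null ORDER_NO.

```sql
-- Rows with a null ORDER_NO are NOT returned here: comparing null
-- with 5 gives 'unknown', which is treated as not true
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER
 WHERE ORDER_NO <> 5;

-- To include them, test for null explicitly
SELECT CUST_NO, CUST_NAME
  FROM CUSTOMER
 WHERE ORDER_NO <> 5
    OR ORDER_NO IS NULL;
```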
Order by and Group by clauses :
• Order by sorts the retrieved data in the specified order; it is applied to the rows selected by the WHERE clause
• Group by causes the table represented by the FROM clause to be rearranged into groups, such that within one group all rows have the same value for the Group by column (conceptually, not physically in the database). The Select clause is applied to the grouped data and not to the original table.
Here ‘HAVING’ is used to eliminate groups, just as WHERE is used to eliminate rows.
Order by and Group by clauses (contd...)
E.g. SELECT ORDER_NO, SUM(NO_PRODUCTS)
FROM ORDERS
GROUP BY ORDER_NO
HAVING AVG(NO_PRODUCTS) < 10
ORDER BY ORDER_NO ;
Functions
There are two types :
• Column Function
• Scalar Function
Column Functions
• Compute an aggregate value for the specified column(s) from a group of rows
• AVG, COUNT, MAX, MIN, SUM
Scalar Functions
• Are applied to a column or expression and operate on a single value.
• CHAR, DATE, DAY(S), DECIMAL, DIGITS, FLOAT, HEX, HOUR, INTEGER, LENGTH, MICROSECOND, MINUTE, MONTH, SECOND, SUBSTR, TIME, TIMESTAMP, VALUE, VARGRAPHIC, YEAR
Complex SQL’s
• A SQL statement is termed complex when the data to be retrieved comes from more than one table
• SQL provides two ways of coding a complex SQL
- Subqueries and
- Joins
Subqueries
• Nested Select statements
• Specified using the IN (or NOT IN) predicate, the equality or non-equality predicates (‘=’ or ‘<>’) and the comparison operators (<, <=, >, >=)
• When using the equality, non-equality or comparative operators, the inner query should return only a single value
Subqueries (contd...)
E.g. SELECT CUST_NO, CUST_NAME
FROM CUSTOMER
WHERE ORDER_NO IN (SELECT ORDER_NO FROM ORDERS
WHERE NO_PRODUCTS < 5);
E.g. SELECT CUST_NO, CUST_ADDR
FROM CUSTOMER
WHERE ORDER_NO =
(SELECT ORDER_NO FROM ORDERS
WHERE NO_PRODUCTS = 5);
Subqueries (contd...)
• Nested select statements give the user the flexibility to query multiple tables
• A specialized form is the Correlated Subquery - the nested select statement refers back to columns in the outer select statements
• It works in Top-Bottom-Top fashion
• A Non-correlated Subquery works in Bottom-to-Top fashion
Correlated Subquery
E.g. SELECT A.CUST_NAME, A.CUST_ADDR
FROM CUSTOMER A WHERE A.ORDER_NO IN
(SELECT ORDER_NO
FROM CUSTOMER B
WHERE A.CUST_ID = B.CUST_ID)
ORDER BY A.CUST_ID, A.CUST_NO ;
Correlated Subquery using EXISTS clause :
E.g. SELECT CUST_NO, CUST_NAME
FROM CUSTOMER A
WHERE EXISTS
(SELECT * FROM ORDERS B
WHERE B.ORDER_NO = A.ORDER_NO
AND B.ORDER_NO = 5);
Multiple levels of Subquery
E.g. SELECT CUST_NO, CUST_NAME, CUST_ADDR
FROM CUSTOMER
WHERE ORDER_NO IN
(SELECT ORDER_NO FROM ORDERS
WHERE PROD_ID IN
(SELECT PROD_ID
FROM PRODUCTS
WHERE PROD_NAME = 'NUTS'));
Joins
OUTER JOIN : For one or more of the tables being joined, both matching and non-matching rows are returned. Duplicate columns may be eliminated
In non-matching rows, the columns from the other table contain nulls.
INNER JOIN : Only rows that match in the tables being joined are returned, so one or more rows from either or both tables may be missing from the result table
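The difference between the two join types, sketched with the CUSTOMER and ORDERS tables from the course example. The explicit JOIN ... ON syntax is assumed here; older DB2 releases accept only the comma form with the join condition in the WHERE clause, which gives an inner join.

```sql
-- Inner join: customers that have a matching order; customers with a
-- null or unmatched ORDER_NO do not appear
SELECT C.CUST_NO, C.CUST_NAME, O.NO_PRODUCTS
  FROM CUSTOMER C
       INNER JOIN ORDERS O
       ON C.ORDER_NO = O.ORDER_NO;

-- Left outer join: every customer appears; NO_PRODUCTS is null for
-- customers without a matching order
SELECT C.CUST_NO, C.CUST_NAME, O.NO_PRODUCTS
  FROM CUSTOMER C
       LEFT OUTER JOIN ORDERS O
       ON C.ORDER_NO = O.ORDER_NO;
```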
Other DML Statements
INSERT
E.g. : INSERT INTO Tablename (column1, column2,
column3, ...)
VALUES (value1, value2, value3, ...)
If a column is omitted from the INSERT statement and is defined NOT NULL, the INSERT fails; if the column is nullable, it is set to null
DML statements (contd...)
• If the column is defined as NOT NULL WITH DEFAULT, it is set to its default value
• Omitting the list of columns is equivalent to listing all the columns, so a value must be supplied for each
• INSERT using SELECT
E.g. INSERT INTO TEMP (A#, B)
SELECT A#, SUM(B)
FROM TEMP1 GROUP BY A# ;
DML statements (contd...)
UPDATE
E.g. UPDATE tablename
SET Columnname(s) = scalar expression
[WHERE condition]
• Single or Multiple row updates
• Update with a Subquery
DML statements (contd...)
DELETE
E.g. DELETE FROM Tablename
[WHERE condition];
• Single or multiple row delete or deletion of all rows
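Concrete forms of the three statements, assuming the CUSTOMER and ORDERS tables from the course example. The data values are invented for the illustration.

```sql
-- Insert one row; columns left out of the list must be nullable or have a default
INSERT INTO CUSTOMER (CUST_NO, CUST_NAME, CUST_ID)
VALUES (1001, 'A KUMAR', '4251');

-- Single-row update (the predicate is on the primary key)
UPDATE CUSTOMER
   SET CUST_NAME = 'A KUMAR AND CO'
 WHERE CUST_NO = 1001;

-- Multiple-row update driven by a subquery
UPDATE CUSTOMER
   SET ORDER_NO = NULL
 WHERE ORDER_NO IN (SELECT ORDER_NO
                      FROM ORDERS
                     WHERE NO_PRODUCTS = 0);

-- Single-row delete; without a WHERE clause every row would be deleted
DELETE FROM CUSTOMER
 WHERE CUST_NO = 1001;
```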
Day 2 - Session 2
Static SQL
• Hard-coded into an application program
• Cannot be modified during the program’s execution except for changes to the values assigned to the host variables
• Cursors are used to access set-level data (i.e. when a SQL SELECT returns more than 1 row)
• The general form is
EXEC SQL
[SQL statements]
END-EXEC.
Dynamic SQL
• Statements can change throughout the program’s execution
• When the SQL is bound, the application plan or package that is created does not contain the same information as that for a static SQL program
• The access paths cannot be determined before execution
Indexes
What is an Index ?
‘An index is an ordered set of pointers to rows of a base table’.
Or
‘An Index is a balanced B-tree structure that orders the values of columns in a table’
Why an Index ?
‘One can access data directly and more efficiently’
Indexes (contd...)
• Each index is based on the values of data in one or more columns. An index is an object that is separate from the data in the table.
• When you define an index using the CREATE INDEX statement, DB2 builds this structure and maintains it automatically.
• Indexes can be used by DB2 to improve performance and ensure uniqueness.
• In most cases, access to data is faster with an index.
• A table with a unique index cannot have rows with identical keys.
Indexes (contd...)
Syntax : For creation of an Index
CREATE INDEX <indexname> ON <tabname>
(colname asc/desc)
Index Parameters for Creation
• CLUSTER
• USING STOGROUP/VCAT (the corresponding name)
• FREEPAGE
• PCTFREE
• PRIQTY / SECQTY
• BUFFERPOOL
• CLOSE - Yes/No
• ERASE Yes/No
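A sketch of CREATE INDEX with the parameters listed above, in the syntax style of the DB2 releases these slides describe. The index names XCUST01 and XCUST02, the stogroup CUSTSG and the space quantities are assumptions.

```sql
-- Unique clustering index on the primary key column
CREATE UNIQUE INDEX XCUST01
  ON CUSTOMER (CUST_NO ASC)
  USING STOGROUP CUSTSG
    PRIQTY 72
    SECQTY 36
    ERASE NO
  FREEPAGE 0
  PCTFREE 10
  CLUSTER
  BUFFERPOOL BP0
  CLOSE NO;

-- Non-unique index on the foreign key column, all parameters defaulted
CREATE INDEX XCUST02
  ON CUSTOMER (ORDER_NO ASC);
```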
Index Guidelines - What to do ?
1. Consider indexing on columns used in UNION, DISTINCT, GROUP BY, ORDER BY & WHERE clauses.
2. Limit the indexing of frequently updated columns
3. Explicitly create a clustering index
4. Create a unique index on the primary key and indexes on foreign keys
Index Guidelines (contd...)
5. Consider overloading an index (adding further columns to it) when the row length of the table to be accessed is short
6. At least one index must be defined for a table with more than 100 pages
7. Use a multicolumn index rather than multiple indexes (application dependent); the latter also require more DASD.
Index Guidelines (contd...)
8. Create indexes before loading the table.
9. Clustering reduces I/O; DB2 optimizer usually tries to use an index on clustered column before using the other indexes.
10. Specify Indexspace freespace the same as tablespace freespace
Index Guidelines (contd...)
11. Use the DEFER option while creating the index. RECOVER INDEX utility can then be used to populate the index. Recover utility populates index entries faster.
12. Use different STOGROUP’s for Tablespaces & indexspaces
13. Create Critical indexes in a different bufferpool than the tablespaces.
Index Guidelines - What Not to do ?
1. Avoid indexing on variable-length columns
2. Limit the number of indexes on partitioned TS
3. Avoid indexes if
• the table is very small (< 10 pages)
• it has heavy inserts and deletes and is relatively small (< 20 pages)
• it is accessed with a scan.
4. Avoid defining redundant indexes
Other DB2 Objects
VIEWS
• A view is a logical derivation of a table from other table/tables. A view does not exist in its own right.
• They provide a certain amount of logical independence
• They allow the same data to be seen by different users in different ways
• In DB2 a view that is to accept an update must be derived from a single base table
DB2 Objects (contd...)
Aliases
• An alias is ‘another name’ for a table.
• Aliases are used basically for accessing remote tables (in distributed data processing), whose names carry a location prefix.
• Using an alias gives such a table a shorter name.
Synonym
• Also means another name for the table, but is private to the user who created it.
DB2 Objects (contd...)
Syntax:
CREATE VIEW <Viewname> (<columns>)
AS Subquery (Subquery - SELECT FROM other Table(s))
CREATE ALIAS <Aliasname> FOR <Tablename>
CREATE SYNONYM <Synonymname> FOR <Tablename>
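Filled-in versions of the three statements above, as a sketch only. The view name CUST_WITH_ORDER, the location REMOTELOC and the owner PRODOWNER are hypothetical.

```sql
-- View over a single base table, so it can accept updates
CREATE VIEW CUST_WITH_ORDER (CUST_NO, CUST_NAME, ORDER_NO)
  AS SELECT CUST_NO, CUST_NAME, ORDER_NO
       FROM CUSTOMER
      WHERE ORDER_NO IS NOT NULL;

-- Alias: a short name for a table at a remote location
CREATE ALIAS RCUST FOR REMOTELOC.PRODOWNER.CUSTOMER;

-- Synonym: a private name, usable only by the user who creates it
CREATE SYNONYM MYCUST FOR PRODOWNER.CUSTOMER;
```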
SQL Guidelines
- Refer handout - Mullins, chapter 2
Day 3 - Session 1
Topics to be covered in this session
• Application programming using DB2
• Steps to write a DB2 application
• Cursors
• QMF and SPUFI
• Some Hints
Application programming using DB2
Application environments supporting DB2 :
• IMS (Batch/Online), CICS, TSO (Batch/Online)
• CAF - Call Attach Facility
• All DB2 application types can execute concurrently
• Host language support - COBOL, PL/I, C, Fortran or Assembler
Steps involved in creating a DB2 application
• Code the application
- using Embedded SQL
- using Host variables (DCLGEN)
- using SQLCA
• Pre-compile the program
• Compile & link edit the program
• Bind
Note : Cursors can also be used
Embedded SQL statements
• Embedded SQL is used much like file I/O
• Normally the embedded SQL statements contain the host variables coded with the INTO clause of the SELECT statement.
• They are delimited with EXEC SQL ...... END-EXEC.
• E.g. EXEC SQL
SELECT Empno, Empname INTO :H-empno, :H-empname
FROM EMPLOYEE
WHERE empno = 1001
END-EXEC.
Host Variables
• These are variables (or rather areas of storage) defined in the host language and referenced in SQL statements, for example in predicates on the columns of a DB2 table.
• A means of moving data from and to DB2 tables
• DCLGEN produces host variables, the same as the columns of the table
Host Variables (contd...)
Host variables can be used
• In WHERE Clause of Select, Insert, Update & Delete
• ‘INTO’ Clause of Select & Fetch statements
• As input of ‘SET’ Clause of Update Statements
• As Input for the ‘VALUES’ Clause of Insert statements
• As Literals in Select list of a Select Statement
Host Variables (contd...)
E.g. SELECT Cust_No, Cust_name, Cust_addr
INTO :H-CUST-NO, :H-CUST-NAME,
:H-CUST-ADDR
FROM CUSTOMER
WHERE CUST_NO = :H-CUST-NO;
DCLGEN
• Issued for a single table
• Prepares the structure of the table in a COBOL copybook
• The copybook contains a ‘SQL DECLARE TABLE’ statement along with a working storage host variable definition for the table
SQLCA
• An SQLCA is a structure or collection of variables that is updated after each SQL statement executes.
• An application program that contains executable SQL statements must provide exactly one SQLCA.
SQLCA (contd...)
Structure of the SQLCA (for COBOL)
01 SQLCA.
    05 SQLCAID PIC X(8).
    05 SQLCABC PIC S9(9) COMP.
    05 SQLCODE PIC S9(9) COMP.
    05 SQLERRM.
    :
    05 SQLWARN.
        10 SQLWARN0 PIC X(1).
        :
        10 SQLWARNA PIC X(1).
        10 SQLSTATE PIC X(5).
Day 3 - Session 2
113DB2
Cursors
• Used when a query may return more than one row, so that the rows can be processed one at a time
• Can be likened to a pointer to the current row of the result table
• Can be used for modifying data using the ‘FOR UPDATE OF’ clause
114DB2
Cursors (contd...)
The four (4) Cursor control statements are -
• Declare : assigns a name to the cursor and associates it with a particular SELECT statement
• Open : readies the cursor for row retrieval; sometimes builds the result table. However it does not assign values to the host variables
• Fetch : returns data from the results table one row at a time and assigns the value to specified host variables
• Close : releases all resources used by the cursor
115DB2
Cursors (contd...)
DECLARE
E.g. - For the Declare statement
EXEC SQL
DECLARE EMPCUR CURSOR FOR
SELECT Empno, Empname, Dept, Job
FROM EMP
WHERE Dept = 'D11'
FOR UPDATE OF Job
END-EXEC.
116DB2
Cursors (contd...)
OPEN
E.g. - For the Open statement
EXEC SQL OPEN EMPCUR END-EXEC.
117DB2
Cursors (contd...)
FETCH
E.g. - For the Fetch statement
EXEC SQL
FETCH EMPCUR
INTO :Empno, :Empname, :Dept, :Job
END-EXEC.
118DB2
Cursors (contd...)
CLOSE
E.g. - For the Close statement
EXEC SQL
CLOSE EMPCUR
END-EXEC.
119DB2
Cursors (contd...)
WHENEVER
E.g. - For the Whenever Clause
EXEC SQL
WHENEVER NOT FOUND
GO TO Close-EMPCUR
END-EXEC.
Note :- WHENEVER is not recommended for use in application programs; testing SQLCODE explicitly after each SQL statement keeps the control flow visible
120DB2
Cursors (contd...)
UPDATE
E.g. - For the Update statement using cursors
EXEC SQL
UPDATE EMP
SET Job = :New-job
WHERE CURRENT OF EMPCUR
END-EXEC.
121DB2
Cursors (contd...)
DELETE
E.g. - For the Delete statement using cursors
EXEC SQL
DELETE FROM EMP
WHERE CURRENT OF EMPCUR
END-EXEC.
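The cursor statements shown individually above are normally combined into a fetch loop. The fragment below is a sketch of one possible arrangement, not a complete program. It assumes that EMPCUR is declared as in the DECLARE example, that the SQLCA is included, and that the host variables H-EMPNO, H-EMPNAME, H-DEPT, H-JOB and H-NEW-JOB (names chosen for illustration) are declared in WORKING-STORAGE.

```cobol
           EXEC SQL OPEN EMPCUR END-EXEC.
      * The loop is skipped if OPEN fails, and ends on the first
      * non-zero SQLCODE from either FETCH or UPDATE.
           PERFORM UNTIL SQLCODE NOT = 0
               EXEC SQL
                   FETCH EMPCUR
                    INTO :H-EMPNO, :H-EMPNAME, :H-DEPT, :H-JOB
               END-EXEC
               IF SQLCODE = 0
      *            Positioned update of the row just fetched.
                   EXEC SQL
                       UPDATE EMP
                          SET JOB = :H-NEW-JOB
                        WHERE CURRENT OF EMPCUR
                   END-EXEC
               END-IF
           END-PERFORM.
      * +100 means the result table is exhausted; anything else
      * at this point is an error or warning worth reporting.
           IF SQLCODE NOT = +100
               DISPLAY 'CURSOR PROCESSING FAILED, SQLCODE = ' SQLCODE
           END-IF.
           EXEC SQL CLOSE EMPCUR END-EXEC.
```

Testing SQLCODE in the loop condition is used here in place of WHENEVER NOT FOUND, so that the exit path stays in the same paragraph as the FETCH.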
122DB2
Application development guidelines
• Code modular DB2 programs and make them as small as possible
• Use unqualified table names in SQL statements; the qualifier can then be supplied at bind time, which eases movement from one environment to another (test to production)
• Never use ‘Select *’ in an embedded SQL program; name the required columns explicitly
• Use joins rather than subqueries
123DB2
Application development guidelines (contd...)
• Use the WHERE clause to filter out unwanted data in DB2 rather than in the program
• Use cursors when fetching multiple rows, though they add overhead
• Use FOR UPDATE OF clause for UPDATE or DELETE with cursor - this ensures data integrity.
• Use Inserts minimally; use the LOAD utility instead of INSERT if the inserts are not application dependent
124DB2
QMF - Query Management Facility
• It is an MVS- and VM-based query tool
• Allows end users to enter SQL queries and produce a variety of reports and graphs from the query results
• QMF queries can be formulated in several ways : by direct SQL statements, by means of the relational prompted query interface or by query-by-example (QBE). QBE is similar to SQL in some ways but more user friendly
125DB2
SPUFI - SQL Processing Using File Input
• Supports the online execution of SQL statements from a TSO terminal
• Used by developers to check SQL statements or view table details
• The SPUFI menu specifies the input file in which the SQL statements are coded, options for default settings and editing, and the output file.
126DB2
Day 4 - Session 1
127DB2
Topics to be covered in this session
• Program Preparation
• Precompile, Compile, Link-edit and Bind
• Plan & Packages
128DB2
Precompile
• Searches the program for all SQL statements and DB2-related INCLUDE members, and comments out every SQL statement
• The SQL statements are replaced by a CALL to the DB2 runtime interface module, along with parameters.
• All SQL statements are extracted and put in a Database Request Module (DBRM)
129DB2
Precompile (contd...)
• Places a timestamp in the modified source and the DBRM so that the two are tied together. If the timestamps do not match at run time, SQLCODE -818 (timestamp mismatch) is returned
• All DB2-related INCLUDE statements must be placed between the EXEC SQL & END-EXEC keywords for the precompiler to recognize them
130DB2
Compile & Link
• The modified COBOL source produced by the precompiler is compiled
• The compiled object module is link-edited into an executable load module
• The appropriate DB2 host language interface module should also be included in the link-edit step (e.g. DSNELI)
131DB2
Bind
• A type of compiler for SQL statements
• It reads the SQL statements from the DBRM and produces a mechanism to access data (in an efficient manner) as directed by the SQL statements being bound
• Checks syntax, checks for correctness of table & column definitions against the catalog information & performs authorization validation
132DB2
Bind Types
• BIND PLAN : accepts as input one or more DBRMs and outputs an application plan containing executable logic representing optimized access paths to DB2 data.
• BIND PACKAGE : accepts as input a single DBRM and produces a single package containing the optimized access path. The PLAN in this case contains a reference to the physical location of the package(s).
133DB2
What is a Package ?
• It is a single bound DBRM with optimized access paths
• It also contains a location identifier, a collection identifier and a package identifier
• A package can have multiple versions, each with its own version identifier
134DB2
Advantages of Packages
• Reduced bind time
• Can specify bind options at the programmer level
• Versioning
• Provides remote data access (in DB2 V2.3 or higher)
135DB2
What is a Plan ?
• An application plan contains one or both of the following elements:
• A list of package names
• The bound form of SQL statements taken from one or more DBRMs.
• Every DB2 application requires an application plan.
• Plans are created using the DB2 subcommand BIND PLAN
136DB2
Refer to the handout for the following
• List of common SQL return codes and solutions
137DB2
Day 4 - Session 2
138DB2
Topics to be covered in this Session
• DB2 Utilities
139DB2
DB2 System administration
DB2 UTILITIES
• Check
• Copy/Mergecopy
• Recover
• Load
• Reorg
• Runstats
• Explain
140DB2
Check
• Checks the integrity of DB2 data structures
• Checks the referential integrity between two tables and also checks DB2 indexes for consistency
• Can delete invalid rows and copy them to an exception table
• Use CHECK DATA after loading a table without specifying the ‘ENFORCE CONSTRAINTS’ option or after the partial recovery of tablespaces in a referential set
141DB2
Copy
• Used to create an image copy of the complete tablespace or of a partition of the tablespace - either a full image copy or an incremental image copy
• Every successful execution of the COPY utility places at least one row in the table SYSIBM.SYSCOPY, indicating the status of the image copy
142DB2
Mergecopy
• The MERGECOPY utility combines multiple incremental image copy data sets into a new full or incremental image copy data set
143DB2
Recover
• Restores DB2 tablespaces and indexes to a specific point in time
• Data can be recovered for a single page, pages that contain I/O errors, a single partition or an entire tablespace
• Indexes are always recovered from the actual table data, not from image copy and log data, as in the case of tablespace recovery
• Standard unit of recovery is a Tablespace
144DB2
Load
• Performs bulk inserts into DB2 tables
• Can replace the current data or append to it, i.e. LOAD DATA REPLACE or LOAD DATA RESUME YES
• If a job terminates in any phase of LOAD REPLACE, the utility has to be terminated and rerun
145DB2
Load (contd...)
• If a job terminates in any phase other than UTILINIT (which sets up and initializes the LOAD utility) and the LOG NO option of LOAD was specified, the tablespace must first be restored using a full RECOVER. After the tablespace is restored, the error is corrected, the utility is terminated and the job is rerun.
146DB2
Reorg
• Reorganizes DB2 tables and indexes, thereby improving the efficiency of access
• Re-clusters data, resets free space to the amount specified in the ‘create DDL’ statement, and deletes and redefines the underlying VSAM datasets for stogroup-defined objects
147DB2
Runstats
• Collects statistical information for DB2 tables, tablespaces, partitions, indexes, and columns.
• It can place this information in the catalog tables with DB2 optimizer statistics or DBA monitoring statistics or with all statistics that have been gathered
• It can be used on specific SQL queries without updating the current usable statistics
148DB2
Reorg Job stream
• The total reorg schedule should include a Runstats job or step : to record current tablespace and index statistics in the DB2 catalog
• Two copy steps for each tablespace being reorganized : so that data is recoverable. The second copy step is required after the REORG if it was performed with the LOG NO option
149DB2
Reorg Job stream (contd...)
• After a REORG is run with the LOG NO option, DB2 turns on the copy pending status flag for the tablespaces specified in the REORG.
• When the LOG NO parameter is specified, it is better to take an image copy of the tablespace being reorganized immediately after the reorg
• A REBIND job for all plans using tables in any of the tablespaces being reorganized
150DB2
Explain
• This feature can be used to obtain the details about the access paths chosen by the DB2 optimizer for SQL statements.
• Used specifically for performance monitoring.
• When EXPLAIN is requested, the access paths that DB2 chooses are put in coded format into the table PLAN_TABLE, which is created in the default database.
151DB2
Explain (contd...)
• To EXPLAIN a single SQL statement, precede that SQL statement with the EXPLAIN statement
EXPLAIN ALL SET QUERYNO = integer
FOR SQL statement
• The other method is to specify EXPLAIN(YES) on the BIND command
• PLAN_TABLE is then queried to get the required information.
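The two steps described above, explaining a statement and then reading PLAN_TABLE, might look as follows when run interactively, for example through SPUFI. This is a sketch under assumptions: a PLAN_TABLE must already exist under the current authorization ID, and the query number 100 and the EMP query are arbitrary illustrative choices.

```sql
-- Record the access path for one query under a chosen number.
EXPLAIN ALL SET QUERYNO = 100 FOR
  SELECT EMPNO, EMPNAME
    FROM EMP
   WHERE DEPT = 'D11';

-- Read back the rows that EXPLAIN wrote for that query.
SELECT QUERYNO, QBLOCKNO, PLANNO, METHOD,
       TNAME, ACCESSTYPE, MATCHCOLS, ACCESSNAME, INDEXONLY
  FROM PLAN_TABLE
 WHERE QUERYNO = 100
 ORDER BY QBLOCKNO, PLANNO;
```

ACCESSTYPE and ACCESSNAME show whether an index was chosen and which one, while METHOD indicates the join method used for each step.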
152DB2
Explain (contd...)
• The information provided includes the type of access to particular tables used in the SQL or Package or Plan, the order in which the tables are joined in a JOIN, whether a SORT is required and so on
• Since the EXPLAIN results depend on the DB2 catalog, it is better to run RUNSTATS before running an EXPLAIN
153DB2
Day 5 - Session 1
154DB2
Topics to be covered in this Session
• DB2 Security and DCL
• DB2 Locking
155DB2
Data Control Language
• DB2 provides its own internal security, implemented through the DCL
• The two (2) DCL statements used are
• Grant
• Revoke
156DB2
Data Control Language (contd...)
GRANT
• Grants privileges on different DB2 objects such as Tables, Views, Plans, Packages, Databases etc. to the required set of users.
• Is used to grant USE privileges to users as required
• Is also used to grant system privileges to a select few users
• A user with the SYSADM privilege is responsible for overall control of the system
157DB2
Data Control Language (contd...)
Syntax : GRANT <privileges> [ON <object>] TO <users/PUBLIC>
[WITH GRANT OPTION]
E.g. GRANT SELECT, UPDATE(NAME, NO)
ON TABLE EMPL TO A, B, C (or PUBLIC);
GRANT EXECUTE ON PLAN PLANA TO USER;
158DB2
Data Control Language (contd...)
• Some table (or View) privileges are
• Select, Update, Delete and Insert
• Privileges specific to Tables are
• Alter & Index (create)
• There are no specific DROP privileges; the table can be dropped by its owner or a SYSADM
• A user who has the authority to grant a privilege to another user also has the authority to grant it WITH GRANT OPTION
159DB2
Data Control Language (contd...)
REVOKE
• Revoke is primarily used to revoke the privileges given to a user on specific Objects.
• The user who granted a privilege also has the authority to revoke it.
• It is not possible to be column specific when revoking an Update privilege
160DB2
Data Control Language (contd...)
Syntax : REVOKE <privileges> [ON <object>] FROM <user/PUBLIC>
E.g. REVOKE ALL ON TABLE EMPL
FROM A, B, C (or PUBLIC);
REVOKE BIND ON PLAN PLANA FROM USER;
161DB2
DB2 Locking
Why Locking ?
‘Locking is used to provide multiple user access to the same system’
How does DB2 manage locking ?
DB2 uses locking services provided by an MVS subsystem called the IMS Resource Lock Manager (IRLM).
162DB2
DB2 Locking (contd...)
• Locking is based on transaction processing - the system component that provides it is
‘A TRANSACTION MANAGER’
• COMMIT & ROLLBACK are the key statements for implementing this
163DB2
Explicit locking facilities
• The SQL statement LOCK TABLE
• The ISOLATION parameter on the BIND PACKAGE command - the two possible values are RR(‘Repeatable Read’) & CS(‘Cursor Stability’).
• CS is normally the value specified when the application program runs in an online environment.
• The tablespace LOCKSIZE parameter - physically, DB2 locks data in units of pages, tables or tablespaces. The unit is chosen with the ‘LOCKSIZE’ option of ‘CREATE or ALTER Tablespace’. The options are ‘Tablespace’, ‘Table’, ‘Page’ or ‘Any’
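As an illustration of the first facility, the fragment below sketches how a batch program might take an explicit table lock before a mass update and then end the unit of work. It assumes the SQLCA is included, that H-NEW-JOB is a declared host variable (a name chosen for illustration), and that the program runs in an environment where SQL COMMIT and ROLLBACK are allowed, such as TSO batch.

```cobol
      * One explicit table lock in place of many page locks.
           EXEC SQL
               LOCK TABLE EMP IN EXCLUSIVE MODE
           END-EXEC.
           EXEC SQL
               UPDATE EMP
                  SET JOB = :H-NEW-JOB
                WHERE DEPT = 'D11'
           END-EXEC.
      * End the unit of work: keep the changes on success
      * (+100 means no rows qualified), otherwise undo them.
           IF SQLCODE = 0 OR SQLCODE = +100
               EXEC SQL COMMIT END-EXEC
           ELSE
               EXEC SQL ROLLBACK END-EXEC
           END-IF.
```

When the explicit lock is released depends on the RELEASE bind parameter: with RELEASE(COMMIT) it would be given up at the COMMIT or ROLLBACK.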
164DB2
Explicit locking facilities (contd...)
• The ACQUIRE/RELEASE parameters on the BIND PLAN command specify when table locks (which are implicitly acquired by DB2) are to be acquired and released.
• Types :
• ACQUIRE : Use or Allocate
• RELEASE : Commit or Deallocate
165DB2
Day 5 - Session 2
166DB2
Topics to be covered in this Session
• DB2 Catalog & Directory
• Optimizer
• Performance tuning
167DB2
Catalog Tables & the DB2 directory
• Repository for all DB2 objects - contains 43 tables
• Each table maintains data about an aspect of the DB2 environment
• The data covers tablespaces, tables, indexes, privileges, utilities run on DB2 and so on, e.g. SYSIBM.SYSTABLES, SYSIBM.SYSINDEXES, SYSIBM.SYSCOLUMNS ......
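Because the catalog is itself a set of DB2 tables, it can be read with ordinary SELECT statements. The queries below are a small sketch of this; the authorization ID 'USER1' and the table name 'EMP' are placeholders, not values from the course material.

```sql
-- List the tables created under one authorization ID.
SELECT NAME, CREATOR, DBNAME, TSNAME, COLCOUNT
  FROM SYSIBM.SYSTABLES
 WHERE CREATOR = 'USER1'
   AND TYPE = 'T'
 ORDER BY NAME;

-- List the columns of one of those tables in definition order.
SELECT NAME, COLTYPE, LENGTH, NULLS
  FROM SYSIBM.SYSCOLUMNS
 WHERE TBCREATOR = 'USER1'
   AND TBNAME = 'EMP'
 ORDER BY COLNO;
```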
168DB2
Catalog Tables & the DB2 directory (contd...)
• When standard DB2 SQL is used, the DB2 catalog is either accessed or updated, e.g. when a ‘CREATE TABLE’ statement is issued, the catalog tables SYSIBM.SYSTABLES, SYSIBM.SYSCOLUMNS & SYSIBM.SYSFIELDS are updated.
• However, the DB2 catalog is only semi-active. This is because statistics such as the number of rows and the physical order of the rows for a set of keys are updated only when the RUNSTATS utility is run
• DB2 catalog is integrated - DB2 catalog and DB2 DBMS are inherently bound together
169DB2
Catalog Tables & the DB2 directory (contd...)
• It is nonsubvertible - the DB2 catalog cannot be updated behind DB2’s back, i.e. if a table of 10 columns is created, it is not possible to change the number of columns to 15 directly in the catalog. The change has to be made using standard SQL statements, by dropping and recreating the table
170DB2
DB2 Optimizer
• Analyzes the SQL statements and determines the most efficient way to access data - gives Physical data independence
• It evaluates the following factors : CPU cost, I/O cost, DB2 catalog statistics & the SQL statement
• It estimates CPU time, cost involved in applying predicates, traversing pages and sorting
171DB2
DB2 Optimizer (contd...)
• It estimates the cost of physically retrieving and writing the data
• The information pertaining to the state of the tables that will be accessed by the SQL statements is provided by the Catalog
172DB2
Performance Tuning
• The performance of an application can be monitored and enhanced at the application level as well as at the database level
• On the application side, the SQL statements can be tuned to make them more efficient and to avoid redundancy
• It is better to structure the SQL statements so that they perform only the necessary operations
173DB2
Performance Tuning (contd...)
• On the database side, the major enhancements can be made to the definitions of tables and indexes & to the distribution of tablespaces and indexspaces
• The application run statistics are obtained from EXPLAIN or a DB2PM (DB2 Performance Monitor) report
174DB2
Thank You