WE CREATE THE FUTURE OF IT….
Don’t think twice. Join. The benefits are yours.

Db2 for z/OS JC Greatest Hits, War Stories, and Best Practice 2019
S203 - Part 1, S211 - Part 2

John Campbell
Distinguished Engineer, Db2 for z/OS Development
Email: [email protected]



Objectives
• Learn recommended best practice
• Learn from the positive and negative experiences from other installations


Agenda
• Large size real memory page frames
• PGSTEAL(NONE) buffer pool and FRAMESIZE(2G)
• Pointer-Overflow Pairs (indirect references)
• Running out of basic 6-byte log RBA addressing range
• Db2 Connect and Continuous Delivery (Db2 12)
• Diagnosing and resolving slow-downs and hangs
• Hung Db2 threads
• New ZSA keyword on HIPER and PE APARs
• Insert free space search algorithm
• Insert with APPEND
• Fast Un-clustered insert (Db2 12)
• Hidden ROWID support to partition


Agenda …
• Cost management using CPU capping
• Fast REORG with SORTDATA NO RECLUSTER NO
• Requirement for increased RID Pool size (Db2 12)
• Increased EDM Pool memory usage for DBDs (Db2 12)
• Setting initial STATISTICS PROFILE (Db2 12)
• System parameter REALSTORAGE_MANAGEMENT
• Transaction level workload balancing for DRDA traffic
• Use of High-Performance DBATs
• Overuse of UTS PBG tablespaces and MAXPARTS
• IRLM query requests for package break-in
• Running CHECK utilities with SHRLEVEL CHANGE


Large size real memory page frames
• Benefit of large (1M or 2G) size real memory page frames
  – The Translation Lookaside Buffer (TLB) is a cache used to speed up the conversion of virtual memory addresses into real memory addresses
  – With the introduction of 64-bit real and virtual addressing, TLB coverage has shrunk dramatically, leading to CPU performance degradation
  – Large size page frames help increase TLB coverage without having to enlarge the TLB
  – Result: better CPU performance through fewer TLB misses
• Common problem
  – LFAREA is grossly over-configured, which might result in a shortage of 4K size frames and lead to expensive breakdown of 1M size large frames, expensive page movement for 4K page fixes, premature paging, CPU burn, and loss of the LPAR


Large size real memory page frames …
• LFAREA – 1M/2G large frame area
  – Fixed 1M/2G page frames
  – Defined in IEASYSxx parmlib member
  – ‘Old’ syntax (still supported)
    • LFAREA = (xM | xG | xT | x%)
    • Pct formula: (x% of online memory available at IPL) – 2G
    • Max LFAREA is (80% * online real memory available at IPL) – 2G
  – New syntax
    • LFAREA = (1M=(a [,b]) | 1M=(a% [,b%]) | 2G=(a [,b]) | 2G=(a% [,b%]))
    • Pct formula: x% of (online memory available at IPL – 4G)
    • Max LFAREA is 80% of (online memory available at IPL time – 4G)
  – Only changeable by IPL
  – If the LFAREA is overcommitted, Db2 will use 4K and/or 1M size page frames
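
The two percentage formulas above give different results for the same percentage. The following is a minimal sketch of that arithmetic only, assuming sizes are expressed in GB; the function names and the 100 GB LPAR are hypothetical, and the clamp at zero is added purely for illustration.

```python
def lfarea_pct_old_gb(online_gb: float, pct: float) -> float:
    """'Old' syntax: (pct% of online memory at IPL) - 2G."""
    # Clamp at zero for illustration; a negative LFAREA has no meaning.
    return max(0.0, online_gb * pct / 100.0 - 2.0)


def lfarea_pct_new_gb(online_gb: float, pct: float) -> float:
    """New syntax: pct% of (online memory at IPL - 4G)."""
    return max(0.0, (online_gb - 4.0) * pct / 100.0)


if __name__ == "__main__":
    online = 100.0  # hypothetical LPAR with 100 GB online at IPL
    for pct in (20, 50, 80):  # 80% is the stated maximum
        print(pct, lfarea_pct_old_gb(online, pct), lfarea_pct_new_gb(online, pct))
```

At the 80% maximum the old formula yields a slightly larger area than the new one for the same LPAR, which is one more reason to prefer an absolute value over a percentage.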



Large size real memory page frames …
• Quad area
  – 12.5% of online memory at IPL time
• PLArea – Pageable 1M large frame area
  – Pageable 1M page frames
  – Allocated on SCM-capable machines (zEC12/zBC12 and above)
    • If Flash Express is installed, these large pages may be paged to and from SCM
    • If Flash Express is not installed, then if those pages are ever paged out, they will be demoted to 4K size page frames and will remain 4K size until the next IPL
  – System-defined size
    • Approximately 12.5% of online memory at IPL time – adjusted to what fits after Quad and LFArea are built
  – Pageable 1M frames overflow into the LFAREA when PLArea is depleted
• Quad and Pageable 1M areas grow proportionally with additional real memory


Large size real memory page frames …
• Let’s do some maths …
  – Starting position

      Online memory (GB)     50.0
      LFAREA (GB)             0
      QUAD (GB)               6.3
      1MB PAGEABLE (GB)       6.3
      4KB FRAMES (GB)        37.4

  – If you were to add 100GB to the LPAR and define it all as LFAREA

      Online memory (GB)    150.0
      LFAREA (GB)           100.0
      QUAD (GB)              18.8
      1MB PAGEABLE (GB)      18.8
      4KB FRAMES (GB)        12.4

• Do not forget that Quad area and Pageable 1M area grow proportionally with additional REAL memory!
• Probably not enough 4K frames to handle the 4K workload needs, including taking dumps quickly, without having to break down free 1M frames
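
The figures in this example can be reproduced with a short calculation. This is only a sketch, assuming Quad and the pageable 1M area are each exactly 12.5% of online memory (the slides say "approximately" and note the system adjusts to what fits); the function name is hypothetical.

```python
def frame_breakdown_gb(online_gb: float, lfarea_gb: float) -> dict:
    """Split online real memory (GB) into LFAREA, Quad, pageable 1M and 4K frames."""
    quad = 0.125 * online_gb          # Quad area: 12.5% of online memory
    pageable_1m = 0.125 * online_gb   # PLArea: approximately 12.5% (assumed exact here)
    return {
        "online": online_gb,
        "lfarea": lfarea_gb,
        "quad": quad,
        "pageable_1m": pageable_1m,
        "frames_4k": online_gb - lfarea_gb - quad - pageable_1m,
    }


if __name__ == "__main__":
    print(frame_breakdown_gb(50.0, 0.0))     # starting position
    print(frame_breakdown_gb(150.0, 100.0))  # +100 GB, all defined as LFAREA
```

Unrounded, the 4K remainder is 37.5 GB before and 12.5 GB after; the slide values of 37.4 and 12.4 come from rounding Quad and PLArea to one decimal place first. Either way, adding 100 GB as LFAREA cuts the 4K frames to about a third.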


Large size real memory page frames …
• Estimating ‘optimal’ LFAREA
  – Total of
    • (Sum of VPSIZE*page size from candidate local buffer pools) * 1.05
    • Plus 20MB for z/OS usage
    • Plus log output buffer size (OUTBUFF) if running Db2 11 or later
    • Plus non-Db2 usage e.g., Java heap sizes
    • Plus any overflow from PLArea (Pageable Large Area)
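
The estimate above is a simple sum, sketched below in MB. The `BufferPool` class, the function name, and the sample pool sizes are hypothetical and exist only to make the arithmetic concrete; the terms themselves are the ones listed on the slide.

```python
from dataclasses import dataclass


@dataclass
class BufferPool:
    name: str
    vpsize: int        # number of buffers (VPSIZE)
    page_size_kb: int  # buffer page size in KB: 4, 8, 16 or 32


def estimate_lfarea_mb(pools, outbuff_mb=0.0, non_db2_mb=0.0, plarea_overflow_mb=0.0):
    """Estimate LFAREA in MB from the terms listed on the slide."""
    pool_mb = sum(p.vpsize * p.page_size_kb for p in pools) / 1024.0
    return (
        pool_mb * 1.05          # candidate buffer pools plus 5%
        + 20.0                  # z/OS usage
        + outbuff_mb            # log output buffer (Db2 11 or later)
        + non_db2_mb            # e.g. Java heaps
        + plarea_overflow_mb    # overflow from the pageable large area
    )


if __name__ == "__main__":
    pools = [BufferPool("BP1", 262144, 4), BufferPool("BP8K0", 131072, 8)]
    print(estimate_lfarea_mb(pools, outbuff_mb=400.0))
```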


Large size real memory page frames …
• Recommendations
  – Factor in that Quad and PLArea areas grow proportionally with additional real memory added to the LPAR
  – Have enough 4K frames “above the bar” to avoid RSM breaking down free 1M pages and paging or page movement for 4K page fixes
  – Include RSM needs for memory mapping - 1/64 of total online real memory at IPL
  – Factor in system address space memory usage (CICS, Db2, etc.)
  – Include enough spare 4K frames for taking dumps quickly
  – Define the LFAREA based on what you can actually afford within the remaining online real memory budget; the outcome is then either:
    • Db2 buffer pools are spread across 4K and 1M/2G size frames
      – No availability issues
      – No performance regression compared to using all 4K frames
      – Only some loss of the incremental benefit of using 1M/2G frames
    • Provision additional REAL memory
  – Specify LFAREA as an absolute value as opposed to a percentage value


Large size real memory page frames …
• Minimum starting point for 4K requests is the sum of all of the following:
  – 1/64 of total real memory operating system requirement (64 bytes per 4K page, 1M page counts as 256*4K), plus
  – 1/8 for QUAD area - can be used for 4K fixed requests, plus
  – 1/8 for Pageable 1M frames - can be used for 4K pageable or fixed requests, plus
  – 2G for below the 2G bar, plus
  – 2G for above the 2G bar, plus
  – DUMPSRV/MAXSPACE requirement (general ROT is 16G)
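
As a worked example of the sum above, the sketch below adds the listed terms for a given amount of online real memory. The function name and the 128 GB LPAR are hypothetical; the 16 GB default is the rule of thumb quoted on the slide.

```python
def min_4k_start_gb(online_gb: float, maxspace_gb: float = 16.0) -> float:
    """Minimum starting point (GB) for 4K requests, per the terms on the slide."""
    return (
        online_gb / 64.0   # operating system requirement
        + online_gb / 8.0  # Quad area
        + online_gb / 8.0  # pageable 1M frames
        + 2.0              # below the 2G bar
        + 2.0              # above the 2G bar
        + maxspace_gb      # DUMPSRV/MAXSPACE requirement
    )


if __name__ == "__main__":
    print(min_4k_start_gb(128.0))  # hypothetical 128 GB LPAR
```

What remains after subtracting this figure from online real memory is the budget that LFAREA can reasonably be sized within.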


PGSTEAL(NONE) buffer pool and FRAMESIZE(2G)
• Incompatible change when migrating from Db2 11 to Db2 12
• For Db2 11, Db2 can use 2G size frames for PGSTEAL(NONE) buffer pools
• For Db2 12, contiguous buffer pools (PGSTEAL(NONE)) will NOT use 2G size frames
  – Request to use 2G size frames is not honoured
  – Buffer pool will still be allocated, but in 4K size frames
  – DSNB548I message will be issued when
    • Allocating buffer pool which has PGSTEAL(NONE) and FRAMESIZE(2G) specified
    • ALTER BUFFERPOOL command changes either attribute with the result being PGSTEAL(NONE) FRAMESIZE(2G)
• Why be concerned?
  – If the contiguous buffer pools are very large, this can lead to a shortage of 4K frames on the LPAR, with consequences
    • Penalty of page movement or paging I/O overhead with corresponding CPU burn in RASP address space
    • Worst case the LPAR will crash out!
• Recommendation
  – If using 2G size frames with PGSTEAL(NONE) buffer pools under Db2 11, then switch to using 1M size frames before leaving Db2 11


Pointer-Overflow Pairs (indirect references)
• What is a pointer-overflow pair and how is it created?
  – Row increases in size as a result of update
  – Row no longer fits in the space available in the data page
  – Row is then moved to a new data page location (overflow record)
  – Pointer to the overflow location placed in the original spot
• Problem
  – Unique to data sharing and GBP-dependent object
  – ‘Timing window’ where an ISO(UR) scanner running on a different Db2 member failed to get a row in an overflow record
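
The creation of a pointer-overflow pair can be illustrated with a toy model. This is not how Db2 lays out pages: sizes are abstract units, the `Table` class and its methods are hypothetical, pointers are assumed to take no space, and a second update of an already relocated row is not modelled.

```python
class Table:
    """Toy model of data pages; illustrative only, not the Db2 page format."""

    def __init__(self, page_capacity: int):
        self.page_capacity = page_capacity
        self.pages = [{}]  # list index = page number; each page maps slot -> entry

    def _free(self, page_no: int) -> int:
        used = sum(e["size"] for e in self.pages[page_no].values())
        return self.page_capacity - used

    def _place(self, kind: str, size: int):
        # Use the first page with room, otherwise add a new page at the end.
        for page_no in range(len(self.pages)):
            if self._free(page_no) >= size:
                break
        else:
            self.pages.append({})
            page_no = len(self.pages) - 1
        slot = len(self.pages[page_no])
        self.pages[page_no][slot] = {"kind": kind, "size": size}
        return page_no, slot

    def insert(self, size: int):
        return self._place("row", size)

    def update(self, rid, new_size: int):
        page_no, slot = rid
        entry = self.pages[page_no][slot]
        if new_size - entry["size"] <= self._free(page_no):
            entry["size"] = new_size  # row grows in place
            return rid
        # Row no longer fits: move it and leave a pointer in the original spot.
        entry.update(kind="pointer", size=0)
        entry["target"] = self._place("overflow", new_size)
        return entry["target"]


if __name__ == "__main__":
    t = Table(page_capacity=100)
    a = t.insert(60)
    t.insert(30)
    print(t.update(a, 90), t.pages[0][0])
```

Any reader that arrives via the original location now needs a second page access to follow the pointer, which is why these pairs are also called indirect references.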


Pointer-Overflow Pairs (indirect references) …
• Db2 solution introduced with APAR PM82279
  – A synchronous write forces the new data page location to the CF GBP structure, preceded by a forced synchronous log write
  – Performance penalty which could be significant
  – Aggravating factor
    • There may not be enough contiguous free space for the overflow record in the new data page location, so the data page is compacted
    • If, after compaction, the data page still does not have enough committed free space to insert the overflow record, the page is released and Db2 moves on to find another data page location
    • Even though the overflow record was not inserted into this compacted page, another synchronous write of the page to the CF GBP structure occurs, preceded by a forced synchronous log write
• Recommendation: generously allocate PCTFREE FOR UPDATE to limit the number of new rows and overflow records inserted into a data page, allow updated rows to expand, and reduce the number of pointer-overflow pairs


Running out of basic 6-byte log RBA addressing range
• Background
  – Increasingly common for the basic 6-byte log RBA (256 TB) addressing range to be exhausted for Db2 subsystems
  – Tiny number of installations close to exhausting the basic 6-byte LRSN addressing range for a data sharing group
  – BSDS must be converted to extended 10-byte RBA format before migrating to Db2 12
• Problem areas
  – After BSDS conversion to extended 10-byte log RBA, non-data sharing Db2 subsystems will accelerate with increased velocity towards end of basic 6-byte log RBA addressing range!
  – After converting the BSDS, Db2 stops generating the DSNJ032I warning messages, even if there is imminent danger of reaching the 6-byte RBA soft limit (non-data sharing) or 6-byte LRSN soft limit (data sharing) for table spaces or index spaces in 6-byte basic format
  – Many installations embarked on an aggressive but unnecessary “crash project” to reorganise Catalog/Directory and application database objects to convert to extended 10-byte RBA or LRSN format
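
To judge how urgent a conversion really is, the remaining 6-byte range can be estimated from the current RBA and the logging rate. The sketch below is a rough projection only: the function name, the sample RBA and the logging rate are hypothetical, and it measures against the hard end of the range (2^48 bytes, the 256 TB quoted above) rather than the earlier soft limit.

```python
RBA_6_BYTE_LIMIT = 2 ** 48  # bytes addressable with a 6-byte RBA (256 TB)


def days_until_exhaustion(current_rba_hex: str, log_bytes_per_day: float) -> float:
    """Days left before a 6-byte RBA wraps, at a constant logging rate."""
    current = int(current_rba_hex, 16)
    if current >= RBA_6_BYTE_LIMIT:
        return 0.0
    return (RBA_6_BYTE_LIMIT - current) / log_bytes_per_day


if __name__ == "__main__":
    # Hypothetical subsystem at RBA X'F00000000000', logging 1 TB per day.
    print(days_until_exhaustion("F00000000000", 2 ** 40))
```

Re-running the projection after a BSDS conversion on a non-data sharing subsystem would show the increased velocity described above as a shorter time to exhaustion.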


Running out of basic 6-byte log RBA addressing range …
• Recommendations
  – Non-data sharing - getting to extended 10-byte log RBA format
    • Must convert the Catalog, Directory and all application objects via REORG to extended 10-byte log RBA format
    • Must convert BSDS of the problem Db2 subsystem to extended 10-byte log RBA format
    • Leave converting the BSDS to the very end of the overall conversion process, otherwise the subsystem will accelerate towards the end of the basic 6-byte log RBA addressing range with increased velocity
  – Data sharing - getting to extended 10-byte log RBA format for a specific Db2 member
    • Just convert BSDS of the problem Db2 member
    • No need to convert Catalog, Directory and application objects via REORG to extended 10-byte LRSN format
  – Data sharing – getting to the extended 10-byte LRSN format
    • Must convert the Catalog, Directory and application objects via REORG to extended 10-byte LRSN format
    • Must convert BSDS of each Db2 member to extended 10-byte log RBA/LRSN format
    • Convert the BSDS at the start of the overall conversion process to get potential incremental performance benefit from “LRSN spin” avoidance as Catalog, Directory and application objects are converted to extended 10-byte LRSN format


Reaching the limit: conversion strategies
[Figure: three timelines across Db2 10 NFM, Db2 11 CM and Db2 11 NFM, one per scenario]
• Non-data sharing – end of log RBA problem: convert Db2 objects, then convert BSDS
• Data sharing – end of log RBA problem only on one or more members: convert BSDS
• Data sharing – end of LRSN problem: convert BSDS, then convert Db2 objects


Db2 Connect and Continuous Delivery (Db2 12)
• Db2 Connect – Situation until recently prior to applying PTF for APAR PH08482
  – Any in-support level of Db2 Connect drivers should work with Db2 12 for z/OS, both before and after new function is activated (FL500) with no behavior change
  – Data server clients and drivers must be at the following levels to exploit Db2 for z/OS function-level application compatibility (APPLCOMPAT) of V12R1M501 or greater:
    • IBM® Data Server Driver for JDBC and SQLJ: Versions 3.72 and 4.22, or later
    • Other IBM data server clients and drivers: Db2 for Linux, UNIX, and Windows Version 11.1 Fix Pack 1, or later
  – New ClientApplCompat (ODBC) and clientApplcompat (JDBC) property setting allows you to control the capability of the client when updated drivers ship changes to enable new server capability
    • You might want specific control of driver capability when:
      – Db2 client driver introduces new behavior currently not controlled by Db2 application compatibility
      – Change needs to be controlled at the application level to ensure compatibility with new behavior
  – ClientApplCompat/clientApplcompat setting of V12R1M500 is absolutely required to exploit Db2 12 for z/OS Server capability shipped after GA at function levels beyond Db2 12 for z/OS FL=V12R1M500
  – Db2 Connect Server gateway does NOT support ClientApplCompat/clientApplcompat


Db2 Connect and Continuous Delivery (Db2 12) …
• Db2 Connect – new behavior after applying PTF for APAR PH08482
  – Makes setting of ClientApplCompat/clientApplcompat property optional
    • Customers no longer forced to have the setting
  – No changes are required in Db2 Connect level
  – As before, all customers must upgrade to Db2 Connect V11.1 FP1 or higher in order to run DRDA applications where packages have APPLCOMPAT > FL500
  – Db2 Connect Server gateways will need to be upgraded to at least Db2 Connect V11.1 FP1 to access packages using an APPLCOMPAT > FL500
  – When ClientApplCompat/clientApplcompat is set, Db2 will perform validation checking where there are changes in DRDA message flows i.e., check the underlying infrastructure and avoid application incompatibilities


Db2 Connect and Continuous Delivery (Db2 12) …
• Db2 Connect – new behavior after applying PTF for APAR PH08482 …
  – Going forward, not setting ClientApplCompat/clientApplcompat carries risk if DRDA flows change for existing applications
    • So far in Db2 12 there are no changes in DRDA flows and none are in plan, but at some point it will likely happen
    • Applications may break in the future when DRDA transactions run with packages where APPLCOMPAT > FL500
    • If an application breaks then Db2 Development will not provide server support to allow these broken applications to run i.e., no more new DDF_COMPATIBILITY zparm settings
  – So what are your options?
    1. Rebind driver packages in the NULLID collection and back level the APPLCOMPAT setting
       This is a "one size fits all" solution to fall back to an earlier APPLCOMPAT
    2. “Penalty Box” the problem applications
       – Switch the problem applications out to use the driver packages in a different collection which has a back levelled APPLCOMPAT setting, or
       – Switch all the good applications out into a new collection using driver packages with the new APPLCOMPAT setting and leave the problem applications still using the driver packages in the NULLID or different collection but with the driver packages running a back levelled APPLCOMPAT setting


Db2 Connect and Continuous Delivery (Db2 12) …
• General best practice recommendation
  – When migrating to Db2 12 - all DRDA applications should continue to use the driver packages in the NULLID collection
  – These packages can have an APPLCOMPAT setting of V10R1, V11R1, V12R1M100 or V12R1M500 depending on where you are in the migration process
  – The APPLCOMPAT setting for the driver packages in the NULLID collection should not advance beyond V12R1M500
  – When specific applications and their application servers want to use new function requiring APPLCOMPAT setting > FL500, these application servers should switch away from using the driver packages in the NULLID collection to a new collection (e.g., V12R1M503) where the driver packages are bound with a higher APPLCOMPAT setting (e.g., driver packages bound with APPLCOMPAT V12R1M503 in collection V12R1M503)
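
The collection choice described above reduces to a small decision rule, sketched below. The helper names are hypothetical, and naming the new collection after the APPLCOMPAT level simply follows the slide's own example; it is a convention, not a Db2 requirement.

```python
import re

NULLID_CEILING = 500  # NULLID driver packages should not advance beyond V12R1M500


def function_level(applcompat: str) -> int:
    """Return the Db2 12 modification level (e.g. 503), or 0 for V10R1/V11R1."""
    match = re.fullmatch(r"V12R1M(\d{3})", applcompat)
    if match:
        return int(match.group(1))
    if applcompat in ("V10R1", "V11R1"):
        return 0
    raise ValueError(f"unrecognised APPLCOMPAT value: {applcompat}")


def collection_for(required_applcompat: str) -> str:
    """Pick the driver package collection an application server should use."""
    if function_level(required_applcompat) <= NULLID_CEILING:
        return "NULLID"
    # Assumed convention from the slide's example: collection named after the level.
    return required_applcompat


if __name__ == "__main__":
    for level in ("V11R1", "V12R1M500", "V12R1M503"):
        print(level, "->", collection_for(level))
```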


Diagnosing and resolving slow-downs and hangs
• Very rare condition, with little or no customer operational experience and confidence to handle
• Techniques
  – Issue Db2 and IRLM commands for each Db2 member and, if a Db2 member/IRLM does not respond, take that Db2 member out even if it means using the z/OS CANCEL command
    • Sample commands to be issued
      – -DIS THD(*) SERVICE(WAIT) to each Db2 member
      – MODIFY xxxxIRLM,STATUS,ALLD to each IRLM
      – -DIS THD(*) SERVICE(WAIT) SCOPE(GROUP) to each Db2 member
  – Manually trigger rebuild of the LOCK1 structure into alternate CF based on the PREFERENCE LIST in CFRM policy
    • Issue SETXCF START,REBUILD,STRNAME=DSNxxxx_LOCK1,LOCATION=OTHER
    • Structure rebuild may clear the condition
    • However if the structure rebuild fails, find the connector which did not respond and IPL the non-responsive z/OS LPAR


Hung Db2 threads
• Things are hung, do not cancel … at least not yet
• Take a dump first, cancel changes the layout of the control blocks and may hide the problem
• Bring out the smallest size “hammer” first
  – CANCEL the thread in Db2
  – FORCEPURGE CICS TRAN or CANCEL BATCH JOB
  – CANCEL Db2
    • CANCEL IRLM
    • FORCE MSTR
    • FORCE DBM1
  – IPL z/OS LPAR


New ZSA keyword on HIPER and PE APARs
• Introduced to provide a “rating” to help customers determine when to apply a particular HIPER/PE PTF
• The ZSA keyword is under APAR Error Description, usually as an 'Additional Keyword'
• ZSA ratings, ZSA1 (low), ZSA2, ZSA3, ZSA4, ZSA45, ZSA5 (highest) described below
  – System Outage: 4.5 and 5 are an example of factoring in the number of customers who have hit the problem; the same should apply to the other HIPER categories
      4: system outage should automatically get a 4
      4.5: if 1-5 customers have already hit the problem
      5: if more than 10 customers have already hit the problem
  – Data Loss:
      4: non-recoverable or pervasive, common
      3: recoverable, incorrout output, but with few conditions
      2: recoverable, incorrout, but fairly rare to hit it
      1: super rare cases
  – Function Loss:
      4: pervasive causing application outages
      3: likely common
      2/1: rare
  – Performance:
      4: looping indefinitely
      3: degrades >=5%
      2: degrades <5%
      1: not noticeable
  – Pervasive:
      4: automatically
  – MSysplex:
      4: automatically
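
The Performance category is the only one with numeric thresholds, so it lends itself to a small lookup. The sketch below is illustrative: the function name is hypothetical, and treating a degradation of zero as "not noticeable" is an assumption, since the slide gives no threshold for that level.

```python
def performance_zsa(looping: bool, degradation_pct: float) -> int:
    """Map a performance problem to the ZSA rating listed under 'Performance'."""
    if looping:
        return 4          # looping indefinitely
    if degradation_pct >= 5.0:
        return 3          # degrades >= 5%
    if degradation_pct > 0.0:
        return 2          # degrades < 5%
    return 1              # not noticeable (assumed to mean no measurable degradation)


if __name__ == "__main__":
    print(performance_zsa(False, 7.5), performance_zsa(False, 2.0))
```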


Insert free space search
• Insert performance is a trade off across
  – Minimising CPU resource consumption
  – Maximising throughput
  – Maintaining data row clustering
  – Reusing space from deleted rows and minimising space growth
• Warning: insert space search algorithm is subject to change and it does


Insert free space search …
• For UTS and classic partitioned table space:
  1. Index Manager will identify the candidate page (next lowest key rid) based on the clustering index
     • If page is full or locked, skip to Step 2
  2. Search adjacent pages (+/-) within the segment containing the candidate page
     • For classic partitioned it is +/-16 pages
  3. Search the end of pageset/partition without extend
  4. Search from the space map page that contains the lowest segment that has free space to the last space map page, up to 50 space map pages, then skip to Step 5
     • This is called "smart exhaustive space search"
  5. Search the end of pageset/partition with extend until PRIQTY or SECQTY reached
  6. Perform "exhaustive space search" from front to back of pageset/partition when PRIQTY or SECQTY reached
     • Very expensive i.e., space map with lowest segment with free space all the way through
• For classic segmented table space, steps are very similar except the sequence:
  – 1->2->3->5->4->6
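
The difference between the two sequences is easiest to see when the step order is written down as data. The sketch below only models the ordering; the names are hypothetical, and whether a step finds space is supplied by the caller rather than simulated.

```python
STEPS = {
    1: "candidate page from clustering index",
    2: "adjacent pages",
    3: "end of pageset/partition without extend",
    4: "smart exhaustive space search (up to 50 space map pages)",
    5: "end of pageset/partition with extend",
    6: "exhaustive space search, front to back",
}

SEARCH_ORDER = {
    "UTS": [1, 2, 3, 4, 5, 6],
    "classic partitioned": [1, 2, 3, 4, 5, 6],
    "classic segmented": [1, 2, 3, 5, 4, 6],
}


def first_successful_step(table_space_type: str, steps_with_space: set):
    """Return the first step, in search order, that would find space (or None)."""
    for step in SEARCH_ORDER[table_space_type]:
        if step in steps_with_space:
            return step
    return None


if __name__ == "__main__":
    # Suppose only steps 4 and 5 could succeed for this insert.
    for kind in SEARCH_ORDER:
        step = first_successful_step(kind, {4, 5})
        print(kind, "->", step, STEPS[step])
```

In that situation a UTS reuses existing free space via step 4, whereas a classic segmented table space extends the data set first via step 5.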


Insert – free space search steps (UTS and classic partitioned tablespace)
[Figure: steps 1 to 6 drawn along the pageset/partition up to its end. Labels recovered: (2) search adjacent, (3) search the end without physical extend, (4) search from free space to the end with physical extend, (5) search the end with extend, (6) exhaustive search]


INSERT with APPEND
• With APPEND
  – Always insert into the end
  – No attempt to use free space created by deleting rows
  – Space search will always start on last data page of the last space map page
  – Search (minus) n pages prior to the last segment of the last space map page
  – n pages determined by SEGSIZE (use large SEGSIZE)
• Ideal for application journal and event logging tables with little or no update or delete
• APPEND can be beneficial as it avoids the “normal” space search
• With high concurrency may not see any improvement due to space map contention at the end of table
  – So in data sharing environment with high concurrency, APPEND should be combined with MEMBER CLUSTER for tables which require faster insert and do not need to maintain the data row clustering
    • Similar to V8 MEMBER CLUSTER, PCTFREE 0, FREEPAGE 0 for classic partitioned tablespace
• With LOCKSIZE PAGE|ANY, space growth can be excessive with partially filled data pages
  – With LOCKSIZE ROW, the space growth should be slower
• When using LOAD with DUMMY, do NOT use PREFORMAT option
  – DASD space size will be reduced after LOAD
• When using REORG (with DISCARD), do NOT use PREFORMAT option


Fast Un-clustered INSERT
• Insert workloads are amongst the most prevalent and performance critical
• Performance bottleneck will vary across different insert workloads
  – Index maintenance?
  – Log write I/O?
  – Data space search (space map and page contention, false leads)
  – Format write during dataset extend
  – PPRC disk mirroring
  – Network latency
  – etc.
• Common that index maintenance and/or log write I/O time may dominate and mask any insert speed bottleneck on table space


Fast Un-clustered INSERT …
• Officially referred to as “Insert Algorithm 2 (IAG2)”
• Potentially delivers significant improvement for un-clustered inserts (e.g., journal table pattern) where
  – Heavy concurrent insert activity (many concurrent threads)
  – Space search and false leads on data is the constraint on overall insert throughput
• Applies to any UTS table space defined with MEMBER CLUSTER
  – Applies to both tables defined as APPEND YES or NO
• Implemented advanced new insert algorithm to streamline space search and space utilisation
  – Eliminates page contention and false leads
    • Space is preallocated/preassigned in order to fill up pipes which are pulled from
    • Space allocation occurs at pageset open and real time when there is space shortage in each individual pipe
  – Default is to use the new fast insert algorithm for qualifying table spaces
    • DEFAULT_INSERT_ALGORITHM system parameter can change the default
    • INSERT ALGORITHM table space attribute can override system parameter
• It is NOT a replacement for the existing insert algorithm (IAG1)!


Fast Un-clustered INSERT …
• Your mileage will vary
  – Many insert workloads will see no improvement, and this is to be expected
  – Will probably not see much difference/improvement when there is only one insert per commit scope
  – Some specific insert workloads may see significant improvement
  – Less benefit as more indexes are added to the respective table
• Will shift the bottleneck to the next constraining factor
• LOAD SHRLEVEL CHANGE can also use Fast Un-clustered INSERT
• Fast Un-clustered INSERT will be disabled when lock escalation occurs or SQL LOCK TABLE is used
• First available after new function activation (FL=V12R1M500)
  – Open to change in behavior for insert processing at new function activation


Fast Un-clustered INSERT …
• Must be very aggressive on applying preventative service tagged with “IAG2” for robustness and serviceability
• Strongly recommend applying the PTF for APAR PH02052 (Closed), which implements automatic re-enablement with retry logic
• Current JC point-in-time recommendation
  – One size probably does not fit all tablespaces
  – Change system-wide default: set system parameter DEFAULT_INSERT_ALGORITHM = 1 (old basic insert algorithm)
  – Use INSERT ALGORITHM 2 (new fast insert algorithm) selectively at individual table space level to override the system-wide default
  – Additional benefit may be gained where there is effective CICS and DRDA thread reuse coupled with running packages bound with RELEASE(DEALLOCATE)


Fast un-clustered INSERT – Shifting The Bottleneck …

[Chart: Insert Algorithm 2, comparing Db2 11 for z/OS and Db2 12 for z/OS on Application Response, Db2 elapsed, Class 2 CPU and Getpage; axis scale 0 to 45]


Fast un-clustered INSERT – Shifting The Bottleneck …

[Chart: Throughput (insert/sec) and TOTAL CPU per commit (us), Db2 11 for z/OS vs. Db2 12 for z/OS; axis scale 10000 to 60000; data labels as extracted: 27089453 (possibly two fused labels), 65417, 1460; annotations: 24x, 1/64]

Test configuration: UTS PBG with MEMBER CLUSTER, RLL, 400 bytes per row, one index, 800 concurrent threads, 10 inserts per commit


Hidden ROWID support to partition
• ROWID can be used as a partitioning column
• Application impact if ROWID cannot be hidden
  – APARs to support hidden ROWID
    • PI76972, PI77310, PI77302 (Db2 12)
    • PI77718, PI77719, PI77360 (Db2 11)
• Benefits
  – Allows table to be partitioned where no natural partitioning key exists or candidate partitioning keys do not provide a good spread across partitions
  – Transparent to the application
  – Improved insert throughput
    • Less lock/latch contention on index and data

CREATE TABLE PRDA.ZJSCNTP0
  ( CLIENT VARGRAPHIC(3)  NOT NULL,
    WI_ID  VARGRAPHIC(12) NOT NULL,
    LENGTH SMALLINT,
    DATA   VARCHAR(1000),
    ROW_ID ROWID NOT NULL IMPLICITLY HIDDEN GENERATED ALWAYS
  )
  PARTITION BY (ROW_ID)
  ( PARTITION 1  ENDING AT (X'0FFF'),
    PARTITION 2  ENDING AT (X'1FFF'),
    PARTITION 3  ENDING AT (X'2FFF'),
    PARTITION 4  ENDING AT (X'3FFF'),
    :
    PARTITION 14 ENDING AT (X'DFFF'),
    PARTITION 15 ENDING AT (X'EFFF'),
    PARTITION 16 ENDING AT (MAXVALUE))
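The evenly spaced limit keys in the sample DDL can be generated rather than typed by hand. The following is a minimal Python sketch, not part of the original deck. It assumes that generated ROWID values spread evenly over their two leading bytes (which is what the sample limit keys imply) and that the partition count divides 65536. The helper names `rowid_partition_limits` and `partition_clauses` are hypothetical.

```python
def rowid_partition_limits(num_parts: int = 16) -> list[str]:
    """Return one ENDING AT value per partition, splitting the leading
    two ROWID bytes (X'0000'..X'FFFF') into equal ranges.
    Assumption: ROWID values are evenly distributed over those bytes."""
    if num_parts < 1 or 0x10000 % num_parts != 0:
        raise ValueError("num_parts must divide 65536 evenly")
    step = 0x10000 // num_parts
    # The last value of each range is the limit key; the final partition
    # takes MAXVALUE, as in the sample DDL.
    limits = [f"X'{i * step - 1:04X}'" for i in range(1, num_parts)]
    limits.append("MAXVALUE")
    return limits


def partition_clauses(num_parts: int = 16) -> str:
    """Format the limit keys as PARTITION n ENDING AT (...) clauses."""
    limits = rowid_partition_limits(num_parts)
    lines = [
        f"PARTITION {n} ENDING AT ({limit})"
        for n, limit in enumerate(limits, start=1)
    ]
    return ",\n".join(lines)


print(partition_clauses(16))
```

With 16 partitions the sketch reproduces the limit keys shown in the sample DDL, including the partitions elided by the ":" line.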


Performance cost management using CPU capping
• There are different types of caps, based on different technologies and therefore with different impact on workloads when capping kicks in
• Concerns
  – WLM policy may not be defined in a way to defensively protect the service when CPU is constrained
  – Increased risk of experiencing erratic performance and sporadic system slowdowns and worse …
• Recommended best practice for WLM setup with the right prioritisation - not related to CPU capping
  – IRLM mapped to service class SYSSTC
  – Use Importance 1 and a very high velocity goal that is achievable for MSTR, DBM1, DIST, WLM-managed stored procedure system address spaces
  – Use Importance 2-5 for all business applications starting in I=2 for high priority IMS, CICS, DRDA workloads
  – Use CPUCRITICAL for Db2 system address spaces to prevent lower importance work climbing
  – Ensure WLM Blocked Workload is enabled and set BLWLINTHD (the threshold time interval for which a blocked address space or enclave must wait before being considered for promotion) to <= 5 seconds
  – Generously provision zIIP capacity
  – Do not allow any Db2 application workload to run as discretionary


Performance cost management using CPU capping …
• CPU capping …
  – There are no problems with caps provided the applications can tolerate the cap hitting and there is an escape mechanism when contention exists
• If NOT, there can be sympathy sickness across a Db2 subsystem or across a whole data sharing group (sysplex wide) due to lock and/or latch contention
  – Multiple customer outages have been caused by capping
  – Cap hits and can freeze Db2 at the wrong time (latch)
• Normal Db2 safeguards prove ineffective (Db2 targeted boost via WLM services, WLM Blocked Workload)
• Caps are easy to misconfigure in scope (accidental use of a SYSPLEX-wide cap instead of an LPAR cap)
• WLM Resource Group Capping is a cap due to which work can become non-dispatchable, and it can therefore have more severe impacts on workloads, such as limiting the efficiency of the very valuable WLM Blocked Workload promotion


Performance cost management using CPU capping …
• CPU capping …
  – There are tools from various vendors which perform “soft” capping (i.e., cap the MSUs for a particular workload), some claiming intelligence and being automatic, others used by customers to micro-manage with frequent intervention
  – Every installation is different and many factors influence the outcome, but there are some aggravating factors and/or side effects
    • Engine parking and unparking with HiperDispatch
    • Overuse of I=1 work for application work requiring aggressive SLA (i.e., >= 70% capacity and little or no lower priority work to pre-empt)
    • Specialised LPARs: online vs. batch LPARs i.e., only online interactive workload on an LPAR
    • Low and high priority work touching the same tables and the same pages
    • Frequent changes in PR/SM topology, especially when very small partitions are present


Fast REORG with SORTDATA NO RECLUSTER NO
• Use cases:
  – Materialising pending alters, converting to extended 10-byte LRSN/LRBA, conversion to UTS
• Sample control statement
    REORG TABLESPACE SHRLEVEL CHANGE
    SORTDATA NO RECLUSTER NO
    STATISTICS TABLE(ALL) SAMPLE 1 REPORT YES UPDATE NONE
• Considerations and risks when using RECLUSTER NO
  – Recalculation of CLUSTERRATIO
    • If no explicit STATISTICS option is provided, REORG will collect implicit statistics
    • This can probably lead to bad CLUSTERRATIO statistics
  – Reset of RTS statistics
    • REORGINSERTS and REORGUNCLUSTINS will be set to 0 even though no data rows have been reclustered
• Recommendations
  – Only use SORTDATA NO RECLUSTER NO for objects with very highly clustered data
    • Alternative is to update RTS columns after REORG and restore the before values which have been saved away
  – Use explicit STATISTICS with REPORT YES UPDATE NONE to avoid update to CLUSTERRATIO
    • SAMPLE 1 is to reduce CPU resource consumption


Fast REORG with SORTDATA NO RECLUSTER NO …
• Have to use tablespace level REORG to go through the process of converting classic partitioned tablespaces to UTS PBR/PBR RPN (PBR2)
• Db2 has supported data unload/reload partition parallelism on classic partitioned and UTS PBR tablespace since Db2 9 GA
• However, this parallelism is not enabled when REORG converts a classic partitioned tablespace to UTS PBR tablespace
  – Restriction has been removed with APAR PI72455 (Db2 12) and APAR PI71930 (Db2 11) which enables REORG to perform data partition parallelism in the conversion case
• Can also experience very long elapsed times with most of the elapsed time spent rebuilding the clustered index because the keys were not sorted first
  – If the data is not well clustered to start with, you end up with “death by random I/O”
  – At the same time, REORG already uses sort for the not-clustered index keys
  – With APAR PI90801, available for both Db2 11 and Db2 12, REORG will now sort the clustering index keys into key order for this scenario


Requirement for increased RID Pool size (Db2 12)
• Increased space requirement for RID Pool as a result of RID value increase from 5-byte to 8-byte value
  – Internally Db2 for z/OS uses a normalised 8-byte RID value to allow for future expansion
  – More RID blocks will be used for the same query because each RIDLIST holds fewer RIDs
  – RID Pool memory usage will be roughly 60% higher (for smaller lists it will be up to 2x higher)
  – Should plan on increasing MAXRBLK (RID Pool size) by up to 60% before leaving Db2 11
  – Data Manager logical limit (RIDMAP/RIDLIST) reduced from 26M RIDs to 16M RIDs
  – More RID Pool overflow to workfile is to be expected
• Increased space requirement for RID Pool as a result of potentially more use of list prefetch
  – Enhancement to the Optimizer cost model to more closely reflect the true cost (and benefit) of list prefetch
  – Expected to see an increase in list prefetch (and potentially hybrid join)
  – But not necessarily changes in the access plan where Db2 would previously have chosen a sort avoidance plan
  – Db2 for z/OS trying to be careful not to select list prefetch (with sort) as an access path when there was an alternative access path that could use an index to avoid a sort i.e., for pagination type SQL
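The "roughly 60% higher" figure follows from the RID size ratio: 8 / 5 = 1.6. A minimal Python sketch of the sizing arithmetic, assuming MAXRBLK is expressed in KB and that the full 60% uplift is applied; the function name and the sample value are hypothetical and not taken from the deck.

```python
OLD_RID_BYTES = 5  # RID size before Db2 12, per the slide
NEW_RID_BYTES = 8  # normalised RID size in Db2 12, per the slide


def planned_maxrblk_kb(current_kb: int) -> int:
    """Scale a RID pool size by 8/5 (60% growth), rounded up to a whole KB.
    Integer arithmetic avoids floating-point rounding surprises."""
    return -(-current_kb * NEW_RID_BYTES // OLD_RID_BYTES)


# Hypothetical example: a 400000 KB RID pool under Db2 11.
print(planned_maxrblk_kb(400000))
```

The slide says "up to 60%", so treat the result as an upper planning figure rather than a required value.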


Increased EDM Pool memory usage for DBDs (Db2 12)
• In Db2 12 there will be increased demand for EDM pool memory for DBDs
  – How much depends on the size and number of OBDRECs (tables) and the number of OBDPSETs (table spaces)
• Extra memory comes from “puffing” of table OBDREC and tablespace OBDPSET to new DBD format in Db2 12
  – Pre-Db2 12 “puffed” size under Db2 12 > Db2 12 size (after ALTER/REPAIR) > Db2 11 size
• The “puffed” OBD (OBDREC and OBDPSET) will not be written out to DBD01, which means that “puffing” to Db2 12 format will occur every time the DBD is read from DBD01
• Before APAR PH05624, the “puffing” code is invoked even when the OBD is already in Db2 12 format
  – No further additional memory requirement since OBD (OBDREC and OBDPSET) is already in Db2 12 format
• Some customers experienced significant performance impact after migration to Db2 12
• Solutions
  – Double size of EDM DBDC pool before leaving Db2 11
  – Apply PTF for APAR PH05624
  – Any DDL or REPAIR DBD REBUILD will result in update of OBD to Db2 12 format and new format will be persisted in DBD01


Setting initial STATISTICS PROFILE (Db2 12)
• Statistics profile is automatically created on first BIND/REBIND/PREPARE/EXPLAIN after entry to Db2 12 for z/OS
• Statistics profile created based on all the existing statistics in the Catalog
  – Stale or inconsistent “specialised” statistics may exist in the Catalog that have not been collected for a long time
  – Statistics no longer collected because either too expensive to collect with RUNSTATS or were ineffective in solving access path problems, and not ever cleaned out
• “Specialised” statistics will now be MERGED into the respective statistics profile and will then be collected again on every execution of RUNSTATS with USE PROFILE
• After the initial profile create, cannot tell from the subject statistics profile what statistics are the ones that were the older/inconsistent statistics
• Stale or inconsistent statistics may already be causing sub-optimal access path choices prior to Db2 12


Setting initial STATISTICS PROFILE …
• Use the following sample query to identify the inconsistent statistics

-- SYSCOLDIST entries older than one month that are "specialised"
-- (TYPE 'C' or 'H', or multi-column) or older than one year ...
SELECT TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME,
       MIN(STATSTIME), COUNT(*)
  FROM SYSIBM.SYSCOLDIST CD
 WHERE STATSTIME < CURRENT TIMESTAMP - 1 MONTH
   AND (TYPE IN ('C', 'H')
        OR NUMCOLUMNS > 1
        OR STATSTIME < CURRENT TIMESTAMP - 1 YEAR)
-- ... and not collected within 8 days of the index statistics ...
   AND NOT EXISTS
       (SELECT 1
          FROM SYSIBM.SYSINDEXES I
         WHERE I.TBCREATOR = CD.TBOWNER
           AND I.TBNAME = CD.TBNAME
           AND CD.STATSTIME BETWEEN I.STATSTIME - 8 DAYS
                                AND I.STATSTIME + 8 DAYS)
-- ... nor within 8 days of the table statistics
   AND NOT EXISTS
       (SELECT 1
          FROM SYSIBM.SYSTABLES T
         WHERE T.CREATOR = CD.TBOWNER
           AND T.NAME = CD.TBNAME
           AND CD.STATSTIME BETWEEN T.STATSTIME - 8 DAYS
                                AND T.STATSTIME + 8 DAYS)
 GROUP BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
 ORDER BY TYPE, NUMCOLUMNS, TBOWNER, TBNAME, NAME
  WITH UR;


Setting initial STATISTICS PROFILE (Db2 12) …
• So it is important to clean up any (SYSCOLDIST) statistics that you do not intend to regularly collect before first BIND/REBIND, PREPARE or EXPLAIN after entry to Db2 12 for z/OS
  – Simplest way to find these is to look for tables with rows having different STATSTIME values in SYSCOLDIST
  – Reset access path statistics back to default using RUNSTATS with RESET ACCESSPATH option
    • Sets relevant Catalog column values to -1, and clears out entries in SYSCOLDIST
  – Run “regular” RUNSTATS after the RESET operation


System parameter REALSTORAGE_MANAGEMENT
• The frequency of contraction mode is controlled by system parameter REALSTORAGE_MANAGEMENT
• Db2 uses DISCARD with KEEPREAL(YES) = “Soft Discard”
• Unless you have ample real memory headroom and can tolerate some memory growth, recommend running with REALSTORAGE_MANAGEMENT=AUTO (default)
  – With RSM=AUTO, Db2 will regularly discard unused REAL memory frames
    • RSM=AUTO with no paging (AUTO-OFF): “Soft Discard” at Thread Deallocation or after 120 commits
    • RSM=AUTO with paging (AUTO-ON): “Soft Discard” at Thread Deallocation or after 30 commits – STACK also DISCARDED
  – Pros of the “Soft Discard”
    • Reduced REAL memory use
    • Reduced exposure to SPIN lock contention
  – Cons:
    • CPU overhead in MSTR/DIST SRB time (which can be greatly minimised with good thread reuse)
      – Worse on IBM z13 and z14 hardware
    • 64-bit shared and common real memory counters are not accurate until paging occurs
      – The memory is only “virtually freed”
      – RSM flags the page as freed or unused, but the frame is still charged against Db2


System parameter REALSTORAGE_MANAGEMENT …
• REALSTORAGE_MANAGEMENT = OFF
  – Do not enter ‘contraction mode’ unless the system parameter REALSTORAGE_MAX boundary is approaching OR z/OS has notified Db2 that there is a critical aux shortage
• But … provided an installation is certain about the generous allocation of REAL memory and avoiding any paging at all times, then setting to OFF can generate CPU savings where there is poor thread reuse and/or a surge of DBAT termination
  – CPU savings can be seen in Db2 MSTR and DIST system address spaces
• Setting system parameter REALSTORAGE_MANAGEMENT = OFF requires a long-term commitment to maintaining generous REAL memory “white space” at all times


Transaction level workload balancing for DRDA traffic
• Re-balancing mechanism to adjust the workload distribution across Db2 members to meet the goal
• Workload distribution can be non-uniform, can be spiky, can see “sloshing” effect
• Drivers up to today’s latest available level have two main issues
  – Balancing tends to favor highest weighted member(s)
    • Less load: Driver starts traversing through server list and always selects the first Db2 member in the server list, because it is available to take a new transaction based on a non-zero weight
    • Heavy load: Driver starts traversing through server list from first Db2 member and checks whether the Db2 member is available to take a new transaction based on the weight
  – Failover to another Db2 member has caused client driver to excessively retry
• Client driver team have made the following changes to address issues:
  – Driver knows what is the last Db2 member used, so it starts from next Db2 member
    • For first iteration through server list, driver will use every Db2 member once
    • Second iteration onwards the driver chooses the Db2 member based on the weight and number of transactions already run
  – Failover to another Db2 member will only be attempted once and the DataSource address will be used
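The revised member selection described above can be illustrated with a small model. This is a Python sketch under stated assumptions, not the driver's actual code: the slide only says that later choices are "based on the weight and number of transactions already run", so the rule used here (pick the member with the lowest transactions-to-weight ratio, scanning from the member after the last one used) is an assumption, and the names `Member` and `MemberSelector` are hypothetical.

```python
from dataclasses import dataclass
from fractions import Fraction


@dataclass
class Member:
    name: str
    weight: int            # weight taken from the server list
    transactions: int = 0  # transactions already routed to this member


class MemberSelector:
    def __init__(self, members):
        self.members = list(members)
        self.last = -1                            # index of last member used
        self.first_pass_left = len(self.members)  # picks left in first iteration

    def pick(self) -> Member:
        n = len(self.members)
        # Per the slide: start from the member after the last one used.
        order = [(self.last + 1 + k) % n for k in range(n)]
        if self.first_pass_left > 0:
            # First iteration: every member is used once.
            self.first_pass_left -= 1
            idx = order[0]
        else:
            # ASSUMPTION: lowest transactions/weight ratio wins; ties go to
            # the first candidate in scan order.
            eligible = [i for i in order if self.members[i].weight > 0]
            if not eligible:
                raise RuntimeError("no member has a non-zero weight")
            idx = min(
                eligible,
                key=lambda i: Fraction(
                    self.members[i].transactions, self.members[i].weight
                ),
            )
        self.members[idx].transactions += 1
        self.last = idx
        return self.members[idx]


selector = MemberSelector([Member("A", 3), Member("B", 1)])
print([selector.pick().name for _ in range(8)])
```

Under this assumed rule the routed transactions converge on the weight ratio instead of always landing on the first member in the server list.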


Transaction level workload balancing for DRDA traffic …
• Recommendations (prioritised) to improve the workload distribution across both members
  – Validate whether or not the goal of DDF transactions is being met under normal operating conditions and adjust the WLM policy accordingly (adjust the goal so it can be met and/or break down the DDF transactions into multiple service classes)
  – Upgrade to new Db2 Connect driver level 11.1 M4 FP4 which will provide a new workload balancing distribution algorithm which is much better
  – Only if necessary consider setting RTPIFACTOR in IEAOPTxx (e.g. 50%) to reduce the possibility of a sudden redirection of workload to the alternate Db2 member when PI > 1
• Supporting recommendations
  – With sysplex workload balancing (sysplexWLB) enabled, CONDBAT on each member should be set higher than the sum of all possible connections across all connection pools from all application servers plus 20% to avoid reduction in health of Db2 member
    • 80% of CONDBAT=Health/2, 90% of CONDBAT=Health/4
  – MAXDBAT should be set to a value which permits normal workload levels, AND allows for peaks, AND possible Db2 member outage … but not so high as to allow a ‘tsunami’ of work into the system
  – MAXCONQN can be used as the “vehicle” to limit queuing and force early redirection of connections wanting to do work to another Db2 member
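The CONDBAT guidance above is simple arithmetic and can be sketched in Python. This is an illustration only: the function names and pool sizes are hypothetical, and whether the 80% and 90% thresholds are inclusive is an assumption, since the slide does not say.

```python
def min_condbat(pool_max_connections: list[int], headroom_pct: int = 20) -> int:
    """Smallest integer strictly greater than the sum of all possible
    connections plus the headroom percentage (20% per the slide)."""
    total = sum(pool_max_connections)
    return total * (100 + headroom_pct) // 100 + 1


def health_divisor(connections_in_use: int, condbat: int) -> int:
    """Factor by which member health is reduced, per the slide:
    80% of CONDBAT halves it, 90% of CONDBAT quarters it.
    ASSUMPTION: thresholds are treated as inclusive (>=)."""
    if connections_in_use * 10 >= condbat * 9:
        return 4
    if connections_in_use * 10 >= condbat * 8:
        return 2
    return 1


# Hypothetical estate: three application server pools.
print(min_condbat([200, 200, 100]))
print(health_divisor(850, 1000))
```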


Use of High-Performance DBATs
• High-Performance DBATs have the potential to provide a significant opportunity for CPU reduction by
  – Avoiding connection going inactive and switching back to active later
  – Avoiding DBAT being pooled at transaction end i.e., going back into DDF thread pool for reuse by probably a different connection
  – Supporting true RELEASE(DEALLOCATE) execution for static SQL packages to avoid repeated package and statement block allocation/deallocation
• Very powerful performance option, but can be dangerous … if not used intelligently


Use of High-Performance DBATs …
• Important operational considerations
  – Do not use RELEASE(DEALLOCATE) on common widely shared packages across distributed workloads as it will considerably drive up the requirement for MAXDBAT
    • Risk of not having enough DBATs available for DRDA workloads that are not using High-Performance DBATs
    • Worst case running out of DBATs completely or escalating thread management costs
  – Check that all the ODBC/JDBC driver packages in collection NULLID are bound as RELEASE(COMMIT) – otherwise they must be rebound with RELEASE(COMMIT)
    • WARNING: Since V9.7 FP3a, default BIND option for Db2 client driver packages has been RELEASE(DEALLOCATE) !
  – Be very careful about DDF workloads re-using common static SQL packages used by CICS and/or batch workloads bound with RELEASE(DEALLOCATE)
  – Do not over-inflate the application server connection pool definitions, otherwise it will considerably drive up the demand for High-Performance DBATs


Use of High-Performance DBATs …
• Recommendations
  – Consider rebinding with RELEASE(DEALLOCATE) the static SQL packages that are heavily-used and unique to specific high-volume DDF transactions
  – Enable High-Performance DBATs by using the -MODIFY DDF PKGREL(BNDOPT|BNDPOOL) command
  – Be prepared to use -MODIFY DDF PKGREL(COMMIT)
    • Allow BIND, REBIND, DDL, online REORG materialising a pending ALTER to break in
    • Switch off High-Performance DBATs at first signs of DBAT congestion i.e. overuse of DBATs


Overuse of UTS PBG tablespace and MAXPARTS
• Primary driver for developing the UTS PBG tablespace was the removal of the 64GB limit for classic segmented tablespace and avoiding the disruptive migration to classic partitioned tablespace
• Some considerations
  – All indexes are going to be NPIs
  – Limited partition independence for utilities (REORG, LOAD)
  – No partition parallelism for utilities (LOAD, UNLOAD, REORG)
  – Partitioning not used for query parallelism
  – Degraded insert performance (free space search) as the number of partitions grows
  – If REORG a partition list/range, it may encounter undetected deadlock between applications and REORG during the SWITCH phase (i.e. drain and claim in different order)
  – REORG PART will fail for a full UTS PBG partition if FREEPAGE or PCTFREE are non-zero
  – Setting system parameter REORG_DROP_PBG_PARTS = ENABLE could lead to operational issues if the number of PARTs is pruned back
    • No point-in-time recovery prior to the REORG that prunes partitions
    • Cannot use DSN1COPY to move data between Db2 systems
• Should not be using UTS PBG as the design default for all tables (with large number of partitions)


Overuse of UTS PBG tablespace and MAXPARTS …
• General recommendations for use of UTS PBG tablespace
  – Only use UTS PBG tablespace as the alternative and replacement for classic segmented tablespace
  – A table greater than 60GB in size should be created as a UTS PBR tablespace
  – Good reasons to limit number of partitions - should have as few partitions as possible - ideally only 1
  – DSSIZE and SEGSIZE should be consistent with the target size of the object e.g.
    • Small size object: DSSIZE = 2GB and SEGSIZE = 4
    • Medium size object: DSSIZE = 4GB and SEGSIZE = 32
    • Large size object: DSSIZE = 64GB and SEGSIZE = 64
  – REORG at the table space level unless do not have sufficient DASD space for sort
  – Set system parameter REORG_DROP_PBG_PARTS = DISABLE
    • If required to prune back the number of partitions
      – Use online system parameter to temporarily enable for controlled use
    • Better still, in Db2 12, use the DROP_PART YES option of REORG
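The sizing guidance above can be expressed as a small lookup. This Python sketch is illustrative only: the slide gives the three DSSIZE/SEGSIZE pairs and the 60GB cut-over to UTS PBR, but not the boundaries between "small", "medium" and "large". The boundaries used here (the object should fit in a single partition of the chosen DSSIZE) are an assumption, and the function name is hypothetical.

```python
# (DSSIZE in GB, SEGSIZE) pairs taken from the slide, smallest first.
PBG_TIERS = [(2, 4), (4, 32), (64, 64)]

PBR_THRESHOLD_GB = 60  # per the slide: above this, create as UTS PBR


def suggest_pbg_attributes(target_size_gb: float):
    """Return (DSSIZE_GB, SEGSIZE) for a UTS PBG table space, or None when
    the slide's guidance is to use UTS PBR instead.
    ASSUMPTION: choose the smallest tier that holds the object in one
    partition, in line with 'ideally only 1' partition."""
    if target_size_gb > PBR_THRESHOLD_GB:
        return None
    for dssize_gb, segsize in PBG_TIERS:
        if target_size_gb <= dssize_gb:
            return dssize_gb, segsize
    return None


for size_gb in (1, 3, 40, 100):
    print(size_gb, suggest_pbg_attributes(size_gb))
```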


IRLM query requests for package break-in
• An IRLM query request is used to see if there are X-package lock waiters on a package during commit as part of commit phase 1 broadcast
• The break-in at commit is charged to Acctg Class 2 CPU Time for both local allied threads and DBATs, regardless of whether they run read-only or update transactions
• The IRLM query request is “fast path” and a single query request has little or no performance impact
• There is an IRLM query request per RELEASE(DEALLOCATE) package per commit (per lock token basis)
• If there are many RELEASE(DEALLOCATE) packages loaded by commit time in a long-running thread, then there will be an IRLM query request for each one of those packages at that time
• It is possible to accumulate many distinct RELEASE(DEALLOCATE) packages across 1000 successive commits on each CICS-Db2 thread i.e., REUSELIMIT is 1000 in CICS definitions, and many different transactions running against the same DB2ENTRY in CICS definitions, and multiple packages per commit
• A big multiplier exaggerates the CPU overhead of IRLM query requests
• Likely to be a problem when applications are using many fine-grained packages for IO routines e.g., CA Gen (CoolGen)
• Solution: set system parameter PKGREL_COMMIT=NO and toggle on/off when needed
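The "big multiplier" can be made concrete with a simple model. This is a Python sketch with invented workload numbers, not a measurement: it assumes each commit adds a fixed number of new distinct RELEASE(DEALLOCATE) packages to the thread until a ceiling is reached, and that every commit issues one IRLM query request per accumulated package, as the slide describes. The function name and parameters are hypothetical.

```python
def irlm_query_requests(commits: int,
                        new_packages_per_commit: int,
                        distinct_packages_cap: int) -> int:
    """Total IRLM query requests issued by one reused thread.
    Each commit queries once for every RELEASE(DEALLOCATE) package
    accumulated on the thread so far (one request per package per commit)."""
    total = 0
    accumulated = 0
    for _ in range(commits):
        accumulated = min(accumulated + new_packages_per_commit,
                          distinct_packages_cap)
        total += accumulated
    return total


# Hypothetical thread: reused for 1000 commits (REUSELIMIT 1000), touching
# 2 new packages per commit out of a population of 200 distinct packages.
print(irlm_query_requests(1000, 2, 200))
```

Even though each request is cheap, the per-thread total grows with both the reuse limit and the number of distinct packages, which is why fine-grained package designs are the most exposed.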


Running CHECK utilities with SHRLEVEL CHANGE
• Db2 CHECK utilities are critical diagnosis tools in the event of data corruption
  – Identify objects that need repairing/recovering and assess the extent of the damage
• Many installations not currently set up to run the CHECK utilities non-disruptively
  – System parameter CHECK_FASTREPLICATION = PREFERRED
  – System parameter FLASHCOPY_PPRC = blank
• Having system parameter CHECK_FASTREPLICATION = PREFERRED may become an availability exposure
  – CHECK with SHRLEVEL CHANGE would be allowed to run, but would not be able to use FlashCopy to create the shadow objects, so objects could be left in RO status for many minutes whilst they are being copied with standard I/O and this would likely be very disruptive, causing an application outage
  – Will you take an application outage, or delay the discovery of the damage and the extent thereof?
• Recommendations
  – As an immediate defensive measure, set ZPARM CHECK_FASTREPLICATION = REQUIRED (default since Db2 10)
    • Protection against misuse of CHECK SHRLEVEL CHANGE
    • CHECK utility will fail if FlashCopy cannot be used


Running CHECK utilities with SHRLEVEL CHANGE …
• Recommendations …
  – Better still exploit data set FlashCopy to run the CHECK utilities non-disruptively
    • Requires only a small investment in temporary disk storage to create shadow objects
    • Optional: system parameter UTIL_TEMP_STORCLAS can be used to specify a storage class for the shadow data sets (if left blank, the shadow data sets will be defined in the same storage class as the production pageset)
    • Some special considerations apply when using DASD-based replication
      – Shadow data sets must be allocated on a pool of volumes outside of async mirror (Global Mirror, z/OS Global Mirror, XRC)
      – With sync mirror (Metro Mirror, PPRC) and assuming Remote Pair FlashCopy
        » Set system parameter FLASHCOPY_PPRC = REQUIRED (default Db2 10)
        » Or shadow data sets must be allocated on a pool of volumes outside of the mirror


Questions?
