DB2 for z/OS: JC's Greatest Hits, War Stories and Best

DB2 for z/OS: JC's Greatest Hits, War Stories and Best Practice 2016 - Parts 1 and 2
John Campbell, DB2 for z/OS Development
Session Code: Z6 and Z7
Thursday, September 15th, 8:30-9:30 & 9:45-10:45 | Platform: z/OS

Objectives

• Learn recommended best practice
• Learn from the positive and negative experiences of other installations

Topics
• System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting
• Automated DB2 restart and misuse of “Restart Light”
• REAL memory planning LFAREA IEASYSxx parameter
• ALTER BUFFERPOOL bpname immediate
• Global Lock Manager (GLM) cleanup and recovery in data sharing
• Fast REORG with SORTDATA NO RECLUSTER NO
• Housekeeping: REORG INDEX and REORG TABLESPACE
• Conversion to table-controlled partitioning
• Alter limit key with table-controlled partitioning
• Value of rebind for migrated static SQL packages
• Multi-site data sharing group and extended distance
• zIIP capacity
• Avoid Auto-bind after release migrations and schema changes
• Memory management design changes
• Tutorial on insert space search

System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting

• Background
  • All DB2 CF structures MUST be purged before attempting a disaster restart
    • Otherwise, guaranteed logical data corruption
  • Required even if using Metro Mirror with FREEZE/GO policy
    • No consistency between ‘frozen’ secondary DASD copy and content of the CF structures
  • Necessary actions and checks should be automated
  • New DB2 system parameter introduced in V10 at customer request
    • DEL_CFSTRUCTS_ON_RESTART=YES|NO
    • Triggers auto-delete of DB2 CF structures on restart if no active connections exist
  • GRECP/LPL recovery will be required as a result of deleting the GBPs
    • Likely to be the largest contributor to DB2 restart time

System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting...

• Problem
  • Many customers use the same system parameter settings for production and disaster recovery
  • If DEL_CFSTRUCTS_ON_RESTART=YES this can result in unnecessary DB2 group restart in certain local failure scenarios and it could take a long time
• Recommendations
  • Do NOT set DEL_CFSTRUCTS_ON_RESTART=YES
  • Much safer to upgrade the GDPS procedures to purge the DB2 CF structures before DB2 restart when restarting off secondary volumes that have drifted behind the currency of the CF content

D XCF,STRUCTURE,STRNAME=grpname*

For the LOCK1 structure and any failed-persistent group buffer pools:
SETXCF FORCE,CONNECTION,STRNAME=strname,CONNAME=ALL

For the LOCK1 structure and the SCA, it is also necessary to force the structures out:
SETXCF FORCE,STRUCTURE,STRNAME=strname

D XCF,STRUCTURE,STRNAME=grpname* (to confirm)

Automated DB2 restart and misuse of Restart Light

• Tuning for fast DB2 restart
  • Take frequent system checkpoints
    • Every 2-5 minutes generally works well
  • Commit frequently!
  • Use DB2 Consistent restart (Postponed Abort) to limit backout of long-running URs
    • Controlled via ZPARM
    • LBACKOUT=AUTO|LIGHTAUTO
      • LIGHTAUTO is a new option to allow Restart Light with postponed abort
      • See PM34474 (V9) and PM37675 (V10)
    • BACKODUR=5 (interval is 500K log records if time-based checkpoint frequency)
  • Postponed Abort is not a ‘get-out-of-jail-free’ card
    • Some retained locks persist through restart when postponed backout processing is active
      • Retained locks held on page sets for which backout work has not been completed
      • Retained locks held on tables, pages, rows, or LOBs of those table spaces or partitions
    • If on shared critical resources, these retained locks can prevent applications from running properly

Automated DB2 restart and misuse of Restart Light…

• Tuning for fast DB2 restart…
  • Track and eliminate long-running URs
    • Long-running URs can have a big impact on overall system performance and availability
      • Elongated DB2 restart and recovery times
      • Reduced availability due to retained locks held for a long time (data sharing)
      • Potential long rollback times in case of application abends
      • Lock contention due to extended lock duration >> timeouts for other applications
      • Ineffective lock avoidance (data sharing)
        • A single long-running UR running anywhere in the data sharing group can reduce the effectiveness of lock avoidance for applications accessing GBP-dependent page sets by stopping the Global CLSN value from moving forward
      • Problems getting DB2 utilities executed
    • DB2 provides two system parameters to help identify ‘rogue’ applications
      • URCHKTH for long running URs
      • URLGWTH for heavy updaters that do not commit


• Tuning for fast DB2 restart…
  • Track and eliminate long-running URs…
    • Aggressively monitor long-running URs
      • Start conservatively and adjust the ZPARM values downwards progressively
      • Initial recommendations
        • URLGWTH=10 (K log records)
        • URCHKTH=5 (system checkpoints) - based on a 3-minute checkpoint interval
      • Automatically capture warning messages
        • DSNJ031I (URLGWTH)
        • DSNR035I (URCHKTH)
        • and/or post process IFCID 0313 records (if Statistics Class 3 is on)
    • Get badly-behaved applications upgraded so that they commit more frequently
      • Need management ownership and process
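As one way to automate the capture of these warnings, the sketch below counts DSNJ031I and DSNR035I occurrences in a list of log lines. The message IDs and their ZPARM mapping come from the slide; the log line layout, the function name, and the sample message text are hypothetical and only serve the illustration.

```python
import re
from collections import Counter

# Message IDs and their ZPARM mapping are from the slide.
# The log line layout used in the sample below is an assumption.
WARNING_IDS = {"DSNJ031I": "URLGWTH", "DSNR035I": "URCHKTH"}
_PATTERN = re.compile(r"\b(DSNJ031I|DSNR035I)\b")


def count_ur_warnings(log_lines):
    """Count long-running UR warnings, keyed by the ZPARM that triggered them."""
    counts = Counter()
    for line in log_lines:
        match = _PATTERN.search(line)
        if match:
            counts[WARNING_IDS[match.group(1)]] += 1
    return counts


# Hypothetical log excerpt; the text after each message ID is invented.
sample = [
    "10:01:02 DSNJ031I -DB1A hypothetical text for a heavy updater",
    "10:04:10 DSNR035I -DB1A hypothetical text for a long-running UR",
    "10:05:00 DSNR035I -DB1A hypothetical text for a long-running UR",
    "10:06:00 DSN9022I -DB1A unrelated message",
]
print(dict(count_ur_warnings(sample)))
```

A real implementation would more likely hook into the site's automation product than scan text, but the counting logic would look similar.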



• DB2 restart after failure
  • Use automation to detect failures quickly and restart DB2 immediately
    • Objective: Release retained locks as soon as possible
    • NO manual intervention
    • NO management decision involved
    • Automated Restart Manager (ARM) or automated operators (e.g. Tivoli System Automation family) can be used
  • Two failure scenarios to distinguish
    • DB2 or IRLM failure: restart DB2 in place using a normal DB2 warm restart
    • LPAR or CEC failure: restart DB2 on an alternative LPAR using DB2 Restart Light

** Very basic ARM Policy **
RESTART_GROUP(DB2)
  ELEMENT(DSNDB10DB1A)
    TERMTYPE(ALLTERM)
    RESTART_METHOD(ELEMTERM,PERSIST)
    RESTART_METHOD(SYSTERM,STC,'-DB1A STA DB2,LIGHT(NOINDOUBTS)')


• DB2 restart after failure…
  • “Restart Light” = -START DB2 LIGHT(YES|NOINDOUBTS|CASTOUT)
    • Design point of Restart Light is ONLY for cross-system restart following LPAR/CEC failure
    • Small memory footprint
      • Used to avoid ECSA/CSA virtual storage shortage on the alternate LPAR by forcing IRLM PC=YES
        • No longer an issue because, as from V8, IRLM PC=YES is always forced
      • Avoids real memory shortage on the alternate LPAR by severely constraining pool sizes
    • Simplified management
      • Does not allow new workload to come in
      • Can shut down DB2 member automatically
  • BUT…. Restart Light is likely to be slower than normal DB2 crash restart
  • Restart Light is NOT intended for restart in place on the ‘home’ LPAR or for DR crash restart


• DB2 restart after failure…
  • Restart Light and postponed Abort
    • ZPARM LBACKOUT=AUTO setting (default) is ignored and treated as LBACKOUT=NO
      • Processing of Abort URs will not be postponed
      • There will be a wait to complete processing of Abort URs successfully
      • Restart could take a long time to complete if there are very long running URs to abort
    • Recommendation: LBACKOUT=LIGHTAUTO to enable postponed Abort for Restart Light
  • Restart Light and in-doubt URs
    • With LIGHT(YES), DB2 will not terminate automatically until all in-doubt URs are resolved
    • Option #1: Provided there is sufficient spare REAL memory available on the alternate LPAR, could use automation to restart CICS and/or IMS/TM infrastructure on the same alternate LPAR as the DB2 member being restarted
    • Option #2: Use option LIGHT(NOINDOUBTS)
      • DB2 will terminate automatically and not resolve in-doubts
      • But keep in mind that retained locks cannot be freed until the in-doubt UR is resolved


• DB2 restart after failure…
  • Restart Light and retained locks
    • START DB2 LIGHT(YES|NOINDOUBTS) will not clean up all retained locks
      • Should expect to see message DSNT804I at the end of restart
      • Retained transaction locks and X-mode page set P-locks are released
      • But retained page set P-locks in IX or SIX mode will NOT be released
        • No SQL DML processing will be blocked - exception is mass delete
        • No DB2 utility that is a claimer will be blocked e.g. COPY SHRLEVEL(CHANGE) will work
        • But drainers will be blocked e.g. REORG including REORG SHRLEVEL(CHANGE), SQL DDL
      • Will be cleared once the DB2 is restarted normally in its ‘home’ LPAR
    • New option START DB2 LIGHT(CASTOUT) in V11
      • Kicks off castout processing
      • After castout, page sets become non-GBP dependent
      • Page set P-locks in IX or SIX mode are released
      • Will take longer to complete than LIGHT(YES|NOINDOUBTS) option

DSNT804I - THERE ARE MODIFY LOCKS OWNED BY THIS DB2 THAT HAVE BEEN RETAINED

REAL memory planning LFAREA IEASYSxx parameter

• The large frame area is used for
  • Fixed 1M size large pages
  • “Spillover” for Pageable 1M size large pages
• Using large size pages can improve CPU performance by reducing the overhead of dynamic address translation
  • Improved hit ratio in the Translation Look-aside Buffer (TLB) because of smaller number of entries
  • CPU reduction will vary; based on customer experience, 0 to 4%
• Buffer pools must be defined as PGFIX=YES to use 1M size pages
• Size of the large page frame area is specified by LFAREA in IEASYSnn parmlib member and only changeable by IPL
  • Involves partitioning real memory into 4K and 1M size pages
  • Must make sure there is sufficient total REAL memory to fully back the demand for both enough 4K and enough 1M size pages

REAL memory planning LFAREA IEASYSxx parameter...

• LFAREA should be sized and managed based on total workload demand for fixed 1M pages
  • Setting LFAREA to the maximum setting of 80% is not where to start
  • There are consequences to overallocation of LFAREA and shortage of 4K pages in terms of penalty of page movement or paging I/O overhead
  • Page movement and paging I/O is done by SRBs processing under the RASP address space and there will be a corresponding increase on CPU usage in RASP
• Define LFAREA that you can afford based on what is left over after considering your total real memory and workload demands for 4K frames above 2G
  • Must consider operating system storage needs
    • RSM requirement for memory mapping which is approximately 1/64 total online real at IPL e.g., 4G for 256G system
    • System address spaces memory usage (DB2, CICS, etc)
  • Must also include enough spare 4K size frames for taking dumps quickly

REAL memory planning LFAREA IEASYSxx parameter…

• LFAREA and INCLUDE1MAFC option
  • Available frame count (AFC) is used to determine when storage management should begin paging frames when there is a real memory shortage
  • Two options when defining the LFAREA: INCLUDE1MAFC(YES|NO)
    • YES means AFC includes the LFAREA 1M pages
      • Paging is delayed
      • Frames will be converted from the 1M LFAREA to 4K frames for 4K non-preferred requests
    • NO means AFC does not include LFAREA 1M pages
      • Paging will be instigated
      • Fixed 1M frames will eventually be broken down into 4K frames for 4K non-preferred requests
  • Recommendation is to use INCLUDE1MAFC(YES) so that unused 1M pages can be broken down ASAP to satisfy demand for 4K non-preferred requests


• Recommendations
  • Have enough 4K frames “above the bar” to avoid RSM breaking down free 1M pages and paging or page movement for 4K page fixes
    • Include RSM needs for memory mapping - 1/64 total online real at IPL
    • Factor in system address space memory usage
    • Include enough spare 4K frames for taking dumps quickly
  • Define the LFAREA based on what you can afford within the available remaining online real memory, the results are either:
    • DB2 buffer pools are spread across 4K and 1M frames
      • No availability issues
      • No performance regression compared to using all 4K frames
      • Only some loss to incremental benefit of using 1M frames
    • Provision additional REAL memory
  • Constrain DFSORT use of REAL memory to what the configuration can afford
  • Both QUAD and Pageable 1M areas grow proportionally with additional REAL memory

REAL memory planning LFAREA IEASYSxx parameter…

• Minimum starting point for 4K requests is the sum of all of the following:
  • 1/64 of total real memory operating system requirement (64 bytes per 4K page, 1M page counts as 256 * 4K), plus
  • 1/8 for QUAD area - should not rely on this area being used as contingency for satisfying 4K fixed requests, plus
  • 1/8 for Pageable 1M frames - should not rely on this area being used as contingency for satisfying 4K pageable or fixed requests, plus
  • 2G for below the 2G bar, plus
  • 2G for above the 2G bar, plus
  • DUMPSRV/MAXSPACE requirement (general ROT is 16G)
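The sum above can be worked through for a given machine size. The following is a minimal sketch that simply adds the listed components; the function name, the dictionary keys, and the choice of GB as the unit are illustrative assumptions, and the result is only the starting point the slide describes, not a full sizing (system address space usage still has to be added).

```python
def min_4k_demand_gb(total_real_gb, maxspace_gb=16.0):
    """Add up the slide's components for the minimum 4K-frame starting point (GB)."""
    components = {
        "rsm_mapping": total_real_gb / 64,   # 1/64 of total real memory
        "quad_area": total_real_gb / 8,      # 1/8 for the QUAD area
        "pageable_1m": total_real_gb / 8,    # 1/8 for pageable 1M frames
        "below_2g_bar": 2.0,                 # 2G for below the 2G bar
        "above_2g_bar": 2.0,                 # 2G for above the 2G bar
        "dump_maxspace": maxspace_gb,        # DUMPSRV MAXSPACE, general ROT 16G
    }
    return sum(components.values()), components


# Example: the 256G system used earlier in the deck for the 1/64 rule.
total, parts = min_4k_demand_gb(256)
print(total, parts["rsm_mapping"])
```

For a 256G system this adds 4 + 32 + 32 + 2 + 2 + 16, so LFAREA would have to be planned out of what remains after this and the address space demand.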


ALTER BUFFERPOOL bpname immediate

• Prior to DB2 10, user had to perform a STOP and START around the ALTER
• DB2 10 allows subject buffer pool change to now be "immediate" as long as the page size is not changing
• This enhancement is not well advertised
• DB2 10 performs
  • DRAIN ALL operation on the object, which drives physical close on all DB2 members
  • Physical close by the last DB2 member drives all of the GBP castout and purge processing
  • Take an X mode lock on skeleton cursor and package tables
• Must be submitted during quiet period

Global Lock Manager (GLM) cleanup and recovery in data sharing

• Context
  • When a DB2 member terminates, management responsibility for global lock contention must be transferred to other DB2 members of the subject data sharing group
  • The DB2 lock structure of the subject data sharing group must be quiesced during some of the cleanup activities
  • During the period of time when the DB2 lock structure is quiesced, no DB2 global locking activity can occur for the subject DB2 data sharing group
  • As part of this recovery a significant amount of XCF signaling activity is required to re-establish the global lock management position
  • Enhancements have been delivered in the service stream to improve recovery performance
  • Optimized XCF signaling is essential to reduce overall recovery time
• Problems
  • IRLM CPU burn following DB2 member shutdown and abnormal DB2 member termination
  • Many seconds to small number of minutes when the data sharing group appears to be hung

Global Lock Manager (GLM) cleanup and recovery in data sharing…


• Recommendations
  • Apply relevant PTF maintenance
    • APAR OA43748 (Feb 2014)
      • Fixes a queuing scalability issue for GLM recovery
    • APAR OA43388 (Aug 2014 HIPER)
      • Redesign recovery to
        • Minimise the time the DB2 lock structure is quiesced
        • Achieve better balancing of the GLM recovery across the surviving members
    • APAR OA46423 (Feb 2015 HIPER, PE fix for APAR OA43748)
      • Eliminates potential for hangs and delays introduced by APAR OA43748
  • If global false lock contention is excessive (i.e., >1%)
    • Increase number of LTEs for CF LOCK1 structure to reduce global lock management


• Recommendations…
  • Avoid XCF signaling delays
    • Insufficient message buffers will lead to rejected messages, signals having to be resent, resulting in delays
    • Delays elongate the amount of time that the DB2 lock structure remains quiesced
    • Make sure a 4K size XCF transport class is defined
    • Assign dedicated buffer space (MAXMSG) of at least 10000 for 4K size XCF transport class
    • Significantly increase buffer space on COUPLE SYSPLEX statement to at least 10000
    • Proactively monitor and tune for near zero ‘messages rejected’

Fast REORG with SORTDATA NO RECLUSTER NO

• Use cases: pending alters, LRBA/LRSN conversion, conversion to UTS
• Sample control statement

REORG TABLESPACE SHRLEVEL CHANGE SORTDATA NO RECLUSTER NO STATISTICS TABLE(ALL) SAMPLE 1 REPORT YES UPDATE NONE

• Considerations and risks when using RECLUSTER NO
  1. Recalculation of CLUSTERRATIO
     • If you remove STATISTICS, REORG will collect implicit statistics
     • This can lead to possibly bad CLUSTERRATIO statistics
  2. Reset of RTS statistics
     • REORGUNCLUSTINS and REORGINSERTS will be set to 0 even though no data rows have been reclustered
• Recommendations
  • Only use SORTDATA NO RECLUSTER NO for objects with very highly clustered data
    • Alternative is to update RTS after REORG and restore the before values
  • Use explicit STATISTICS with REPORT YES UPDATE NONE
    • SAMPLE 1 is to reduce CPU resource consumption

Housekeeping: REORG INDEX and REORG TABLESPACE

• Collection of some common findings from customer studies
  • Too many REORG TABLESPACE scheduled vs. too few REORG INDEX scheduled
  • Indexes are only ever reorganised as part of REORG TABLESPACE
  • Collecting inline statistics for REORG INDEX
    • Time drift across statistics collected: REORG TABLESPACE vs. REORG INDEX
    • Degraded CLUSTERRATIO for subject index -> potential change of access path
  • High number of indexes that:
    • Are very disorganised due to index leaf page splits
    • Have high percentage of pseudo-deleted index entries
  • NPIs experiencing level of disorganisation following REORG TABLESPACE PART with SORTNPSI NO

Housekeeping: REORG INDEX and REORG TABLESPACE…

• Recommendations for REORG INDEX
  • Implement criteria for indexes based on RTS table SYSINDEXSPACESTATS
    • % of index page splits: REORGLEAFFAR / NACTIVE > 10%
    • % of pseudo-deleted index entries: REORGPSEUDODELETES / TOTALENTRIES > 5%
    • % of pseudo-empty pages: NPAGES > 5 AND NPAGES / NLEAF > 10%
  • Build an index list and merge with the table space list
  • If the index qualifies for REORG, but not the table space, run REORG INDEX
  • Do not collect inline statistics as part of REORG INDEX (there are always a few exceptions)
  • Set ZPARM REORG_PART_SORT_NPSI=NO -> AUTO (V11 default)
    • Avoid NPI disorganisation at end of REORG TABLESPACE PART
  • Set ZPARM INDEX_CLEANUP_THREADS=10 (V11 default) for autonomic cleanup of pseudo deleted index entries
    • APAR PI31459: Suppress the timeout messages where a cleanup task is the victim
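The three index criteria above translate directly into a small check. The sketch below assumes one row of SYSINDEXSPACESTATS has already been fetched into a dictionary keyed by column name; the function name, the reason labels, and the sample values are hypothetical, while the column names and thresholds are the ones on the slide.

```python
def index_reorg_reasons(stats):
    """Return which of the slide's REORG INDEX criteria one RTS row meets."""
    reasons = []
    # % of index page splits: REORGLEAFFAR / NACTIVE > 10%
    if stats["NACTIVE"] > 0 and stats["REORGLEAFFAR"] / stats["NACTIVE"] > 0.10:
        reasons.append("leaf page splits")
    # % of pseudo-deleted entries: REORGPSEUDODELETES / TOTALENTRIES > 5%
    if (stats["TOTALENTRIES"] > 0
            and stats["REORGPSEUDODELETES"] / stats["TOTALENTRIES"] > 0.05):
        reasons.append("pseudo-deleted entries")
    # % of pseudo-empty pages: NPAGES > 5 AND NPAGES / NLEAF > 10%
    if (stats["NPAGES"] > 5 and stats["NLEAF"] > 0
            and stats["NPAGES"] / stats["NLEAF"] > 0.10):
        reasons.append("pseudo-empty pages")
    return reasons


# Hypothetical RTS values for a single index space.
row = {
    "NACTIVE": 1000, "REORGLEAFFAR": 150,
    "TOTALENTRIES": 100000, "REORGPSEUDODELETES": 2000,
    "NPAGES": 3, "NLEAF": 900,
}
print(index_reorg_reasons(row))
```

An index with a non-empty result would go on the REORG INDEX list unless its table space already qualifies for REORG TABLESPACE.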


Housekeeping: REORG INDEX and REORG TABLESPACE…

• Recommendations for REORG TABLESPACE
  • Extend criteria based on % of rows inserted/deleted to include qualification on size
  • Add a criterion on % of unclustered insert operations:
    • REORGCLUSTERSENS > 0 AND REORGUNCLUSTINS / TOTALROWS > 10%
  • Periodically review rules and thresholds against IBM-provided sample SQL stored procedure DSNACCOX
  • Exploit the information collected on REORG trigger conditions to drive tuning decisions such as distributed free space (PCTFREE/FREEPAGE) adjustments
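The unclustered-insert criterion above can be expressed as a one-line predicate. This is a minimal sketch; the function and parameter names are illustrative, and the inputs are assumed to be the REORGCLUSTERSENS, REORGUNCLUSTINS and TOTALROWS values already read from real-time statistics.

```python
def unclustered_insert_trigger(reorgclustersens, reorgunclustins, totalrows,
                               threshold=0.10):
    """True when clustering-sensitive reads exist and unclustered inserts exceed 10%."""
    if totalrows <= 0:
        return False  # nothing to judge on an empty object
    return reorgclustersens > 0 and reorgunclustins / totalrows > threshold


print(unclustered_insert_trigger(5, 1500, 10000))
```

Keeping the threshold as a parameter makes it easy to revisit when the rules are reviewed against DSNACCOX.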


Conversion to table-controlled partitioning

• Introduced back in V8 ahead of introduction of UTS PBR in V9
• History
  1. In V8, all columns, not just the significant columns, were used during the conversion (no system parameter to control behavior)
  2. V9 APAR PM45829 introduced new system parameter to govern which columns are used in the conversion process: IX_TB_PART_CONV_EXCLUDE
     • NO = all columns (default)
     • YES = only significant columns
  3. V9 APAR PM90893 changed default to YES
• Many customers have made progress over time in converting over to classic partitioned table space en-route to UTS PBR
• Problem
  • Some PIs created after conversion may be created as DPSIs, which may lead to suboptimal access path selections…

Conversion to table-controlled partitioning…

CREATE TABLESPACE SAMS SEGSIZE 0 NUMPARTS 3 ...
CREATE UNIQUE INDEX SAMX (COLA ASC, COLB ASC) ...
  PART 0001 VALUES ( 00010 ) ,
  PART 0002 VALUES ( 00020 ) ,
  PART 0003 VALUES ( 00030 ) ...

DSNT404I SQLCODE = 20272, WARNING: TABLE SPACE SAMS HAS BEEN CONVERTED TO USE TABLE-CONTROLLED PARTITIONING INSTEAD OF INDEX-CONTROLLED PARTITIONING, ADDITIONAL INFORMATION: 30

SYSIBM.SYSTABLEPART with IX_TB_PART_CONV_EXCLUDE=YES
-------------------
! LIMITKEY_INTERNAL
-------------------
! 10

SYSIBM.SYSTABLEPART with IX_TB_PART_CONV_EXCLUDE=NO
-------------------
! LIMITKEY_INTERNAL
-------------------
! 10, MAXVALUE

CREATE INDEX SAMX2 ... (COLA ASC , COLC ASC) PARTITIONED ...
CREATE INDEX SAMX3 ... (COLA ASC , COLB ASC LOCATION ASC) PARTITIONED ...

SYSIBM.SYSINDEXES with IX_TB_PART_CONV_EXCLUDE=YES
+----------------------+
! NAME  ! INDEXTYPE !
+----------------------+
! SAMX  ! P         !
! SAMX2 ! P         !
! SAMX3 ! P         !
+----------------------+

SYSIBM.SYSINDEXES with IX_TB_PART_CONV_EXCLUDE=NO
+----------------------+
! NAME  ! INDEXTYPE !
+----------------------+
! SAMX  ! P         !
! SAMX2 ! D         !
! SAMX3 ! P         !
+----------------------+

• Recommendations
  • Set system parameter IX_TB_PART_CONV_EXCLUDE=YES
  • Investigate whether such cases exist e.g.,
    • Index on A,B,C but partitioned on A only -> table-controlled partitioning on A,B,C
    • Alternate index on A,D was a PI before conversion -> becomes a DPSI after conversion
  • Best way to convert
    • ALTER INDEX ... NOT CLUSTER and then ALTER INDEX ... CLUSTER on the partitioning index in the same commit scope
    • Note: After APAR PI36213, dependent packages will be invalidated!

Alter limit key with table-controlled partitioning

• Existing limit key values for partitioned table

PARTITION 001 ENDING AT ('','01'),
PARTITION 002 ENDING AT ('','02'),
PARTITION 003 ENDING AT ('','03'),
PARTITION 004 ENDING AT ('','99');

• Old syntax for index-controlled partitioned table

ALTER INDEX DB.XDBUTS11
  PARTITION 001 ENDING AT ('','02'),
  PARTITION 002 ENDING AT ('','03'),
  PARTITION 003 ENDING AT ('','04'),
  PARTITION 004 ENDING AT ('','99');

• Multiple partitions are allowed in one alter statement
• DDL statement valid as long as the resulting values are consistent

Alter limit key with table-controlled partitioning…

• Syntax for table-controlled partitioned table and only one partition allowed for each ALTER statement:

ALTER TABLE DB.TDB_UTSR_DB1 ALTER PARTITION 1 ENDING AT ('','02') INCLUSIVE;

• The very first alter results in SQLCODE -636 as it conflicts with original value of part 2
• Only way to handle this specific case is to do the alters in reverse order
• For every case you need to analyse the sequence in which to issue the alters
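The sequencing analysis can be sketched as a simple ordering rule: raise limit keys from the highest partition downwards, then lower limit keys from the lowest partition upwards, so that the keys stay ascending after every single-partition ALTER. This rule is an illustration derived from the example above, not documented DB2 behaviour; the function names are hypothetical, both key lists are assumed to be strictly ascending, and Python tuple comparison stands in for DB2's own key ordering (which may differ, for example under EBCDIC collation).

```python
def alter_order(old_keys, new_keys):
    """Order partition numbers so the limit keys stay ascending after each ALTER.

    Assumes old_keys and new_keys are both strictly ascending lists of tuples.
    """
    parts = range(len(old_keys))
    raised = [p for p in parts if new_keys[p] > old_keys[p]]
    lowered = [p for p in parts if new_keys[p] < old_keys[p]]
    # Raise from the highest partition down, then lower from the lowest up.
    return [p + 1 for p in sorted(raised, reverse=True) + sorted(lowered)]


def render_alters(table, old_keys, new_keys):
    """Build one ALTER statement per changed partition, in a safe order."""
    statements = []
    for part in alter_order(old_keys, new_keys):
        values = ",".join("'%s'" % v for v in new_keys[part - 1])
        statements.append(
            "ALTER TABLE %s ALTER PARTITION %d ENDING AT (%s) INCLUSIVE;"
            % (table, part, values)
        )
    return statements


# Limit keys from the slides: existing values and the desired new values.
old = [("", "01"), ("", "02"), ("", "03"), ("", "99")]
new = [("", "02"), ("", "03"), ("", "04"), ("", "99")]
for stmt in render_alters("DB.TDB_UTSR_DB1", old, new):
    print(stmt)
```

For the slide's example this yields partition 3 first, then 2, then 1, which matches the reverse order recommended above; unchanged partitions produce no statement.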


Value of rebind for migrated static SQL packages

1. Improved performance from new runtime
   • Fast column processing routines re-enabled (SPROC, IPROC, UPROC, GENPROC)
   • Avoid “puffing” under new release when executing migrated package
2. Expose static applications to new DB2 release while fallback is possible via REBIND SWITCH
3. Get exposure to new query optimization and runtime enhancements
   • New access path choices
4. Reduce exposure to problems with migrated packages from earlier releases
   • INCORROUTs
   • Thread abends
5. Preparation for further usage of plan management in new releases and beyond
   • APREUSE(ERROR|WARN), …

Multi-site data sharing group and extended distance

• Some customers considering moving to a multi-site parallel sysplex over longer distances, while still having active DB2 members at each site
• General considerations
  • ‘First order’ effect is an increase in CF service times, which is easy to calculate
    • Service times for distances greater than around 3km will be approximately:
      • For host CPU efficiency, all CF operations will be sent as async, which will add around 70 usec
      • Then, for every 1km of distance, the round trip time is about 10 usec (speed of light)
      • For example: at 15km you could expect about 70 + (15 x 10) = 220 usec for CF operation
  • From an application response time standpoint, the elongation of CF service times will have hard-to-predict potential ‘second order’ effects
    • Elongated transaction response time could aggravate lock contention
    • Effects very dependent on customer application structure and mix
  • Any customer considering going beyond 10km distance should do some simulated distance testing with realistic application workloads (including batch workloads and DB2 utilities) to prove viability
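The ‘first order’ estimate above is plain arithmetic and can be wrapped in a helper for comparing candidate distances. The function and parameter names are illustrative; the 70 usec async overhead and 10 usec per km are the approximate figures quoted on the slide, which apply to distances greater than around 3km.

```python
def cf_service_time_usec(distance_km, async_overhead_usec=70.0, usec_per_km=10.0):
    """Rough first-order CF service time: async overhead plus round trip per km."""
    return async_overhead_usec + distance_km * usec_per_km


# Compare a few candidate site separations.
for km in (5, 10, 15):
    print(km, cf_service_time_usec(km))
```

This covers only the first-order effect; the second-order effects on lock contention still need the simulated distance testing recommended above.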


Multi-site data sharing group and extended distance…

• Considerations using system-managed duplexing (SMD) of LOCK1 structure over extended distance
  • Every lock request, from site 1 or site 2, is duplexed and penalised
    • Including peer-to-peer CF round trip
  • Increased latency may be perfectly acceptable for most well-behaved online transactions that achieve good lock avoidance
  • But some workloads will be impacted from a performance viewpoint:
    • Insert/update/delete-intensive batch work running on either site
    • Read-only work that is lock-intensive due to poor lock avoidance running on either site
  • More likely to aggravate lock contention as distance is stretched
  • Limited benefit from creating a locality of reference by routing work to one site
  • Use of SMD will greatly reduce the viable distance
    • Planned z/OS enhancement to deliver async lock duplexing and for DB2 12 to exploit it should increase the distance at which SMD is viable


• Consideration using DB2-managed duplexing of GBP structures over extended distance
  • Only changed pages are duplexed
    • Async request to the secondary GBP structure overlaps with the sync request to the primary GBP structure (which may itself be converted to async)
    • But both requests need to complete successfully before the processing can continue
  • Increased latency may be perfectly acceptable for applications with high read-to-write ratio
  • But some workloads will be impacted from a performance viewpoint:
    • Update-intensive batch work (insert/update/delete) running on either site
    • Read-only work that is CF intensive (page registration) and running away from Primary GBP
      • e.g. Image copy cycles, UNLOADs, reporting programs
  • Benefit from creating a locality of reference for read-only work that is CF intensive by routing work to the site where the primary GBP is allocated
    • Increasing size of local BPs also has the potential to reduce CF requests to the primary GBP
  • Limited benefit from creating a locality of reference for update-intensive work

• Parallel Sysplex configurations - Scenario 1 (‘campus solution’)
  • Protection against single component failure (DB2, LPAR, CEC, CF) in Site 1 and Site 2
  • If Site 1 fails, always require a full group restart at Site 2 + GRECP/LPL recovery
  • Some performance impact for applications running on Site 2
  • Opportunities to mitigate severe performance regression should it occur by affinity routing the problem workload to Site 1

[Figure: Scenario 1 across Site 1 and Site 2. Components shown: DB2A, DB2B, DB2C, DB2D, GBP-P, GBP-S, SCA, LOCK1. LOCK1 and SCA allocated in a failure-isolated CF. Primary and secondary GBPs in Site 1.]

• Parallel Sysplex configurations - Scenario 2
  • ‘Fast’ group restart without GRECP/LPL recovery on site failure is possible but not guaranteed
  • Benefit from creating a locality of reference for read-only work that is CF intensive by routing work to Site 1 where the primary GBPs are allocated
    • No benefit from creating a locality of reference for update-intensive work
  • Limited opportunities to mitigate severe performance regression should it occur, other than being ready to take actions to revert to scenario 1, and then affinity route the problem workload

[Figure: Scenario 2 across Site 1 and Site 2. Components shown: DB2A, DB2B, DB2C, DB2D, GBP-P, GBP-S, SCA, LOCK1. LOCK1 and SCA allocated in a failure-isolated CF. Primary GBPs in Site 1. Secondary GBPs in Site 2.]

• Parallel Sysplex configurations - Scenario 3
  • Continuous availability on site failure is possible but not guaranteed
  • Limited opportunities to mitigate severe performance regression should it occur, other than being ready to take actions to revert to scenario 2 or even scenario 1, and then affinity route the problem workload

[Figure: Scenario 3 across Site 1 and Site 2. Components shown: DB2A, DB2B, DB2C, GBP-P, GBP-S, SCA-P, LOCK1-P, SCA-S, LOCK1-S. Primary LOCK1 and SCA in Site 1. Primary GBPs in Site 1. Secondary GBPs in Site 2. Secondary LOCK1 and SCA in Site 1.]

zIIP capacity

• History
  1. Originally it was only DRDA application work that was zIIP-eligible
  2. IBM message about loading up the zIIPs and spilling over to GCP
  3. IIPHONORPRIORITY=NO was implemented to stop zIIP eligible work switching over to GCP on sub-capacity CECs
  4. zAAP activity running on zIIP
  5. Lazy switch with queuing was implemented to protect TCO
  6. DB2 started sending system task activity over to run on zIIPs
     • System tasks do not use much CPU, but they have a high performance low latency requirement
     • Cannot afford to queue for zIIP engines with a lazy switch
• In V10, prefetch and deferred write engines are now eligible for zIIP offload - in V11, all SRB-mode system agents (except P-lock negotiation) are eligible for zIIP offload
  • These DB2 tasks must be dispatched very quickly
  • Any delays could result in
    • Significant elapsed time degradation for some batch and DB2 utility jobs
    • Very high count for 'prefetch engine not available' in the DB2 Statistics Trace

zIIP capacity…

• The number of zIIPs on the LPAR influences how hard you can run them
  • It is down to queuing theory:
    • So 1-2 zIIPs we would recommend no more than 50% busy
    • 3-4 zIIPs no more than 70% busy
    • >4 zIIPs you can run them close to 90% busy
• But the busy is really a 'shorthand' and an artifact from GCP management
  • The real driver of the need for additional zIIP capacity is not a specific utilisation number but rather the IIPCP amount
  • IIPCP time is really the metric to be watching and not how busy
• Need to proactively monitor zIIP CPU usage
  • Using the % of workload redirect (APPL% IIPCP in RMF) - not just the zIIP % utilisation
• zIIP engines should be treated as ‘assist-processors’
  • Target utilisation for zIIP engines is likely to be much lower than the limit used for GCP usage due to the time-critical nature of the work likely to be offloaded there
  • Should not be running zIIP resource very hard at high utilisations
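The busy-percentage shorthand above can be captured as a small lookup for use in capacity reports. The function name is hypothetical and the returned percentages are the slide's rules of thumb, not hard limits; as the slide stresses, IIPCP time remains the metric to watch.

```python
def max_recommended_ziip_busy(ziip_count):
    """Rule-of-thumb ceiling (% busy) by number of zIIPs on the LPAR."""
    if ziip_count < 1:
        raise ValueError("at least one zIIP is required")
    if ziip_count <= 2:
        return 50   # 1-2 zIIPs: no more than 50% busy
    if ziip_count <= 4:
        return 70   # 3-4 zIIPs: no more than 70% busy
    return 90       # more than 4 zIIPs: close to 90% busy


for n in (1, 3, 6):
    print(n, max_recommended_ziip_busy(n))
```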


zIIP capacity…

• Sample RMF statistics

zIIP capacity…

• Summary
  1. Generously allocate zIIP resources or dedicate zIIP resources to select production LPARs
  2. Design default and defensive choice is to set IIPHONORPRIORITY=YES
  3. Assuming a given customer has generously allocated zIIP resource, then there is no issue about running with IIPHONORPRIORITY=NO i.e., there is plenty of zIIP capacity and no zIIP CPU starvation
  4. Only use IIPHONORPRIORITY=NO on sub-capacity CECs provided #3 has been achieved
  5. zIIP engines should not be run as hard as GCPs as the hardware costs less and does not carry MSU charges
  6. For V11, if IIPHONORPRIORITY=NO then DB2 will not schedule system engines (prefetch, deferred write, physical log read and write, etc) to run on zIIPs

Avoid Auto-bind after release migration and schema changes

• No pre-V10 bound plans and packages allowed in V12
  • Removal of 31-bit runtime to deliver some performance improvement
  • Avoid plan and package auto-binds as they can be disruptive
• Avoid PLAN auto-binds
  • REBIND PLANs older than V10 to avoid V12 auto-bind
  • For online migration, plan may be in-use by V11 members
  • REBIND avoids PLAN auto-bind in/after migration…
• Avoid PACKAGE auto-binds
  • Packages are also subject to auto-bind in V12
    • Auto-bind means no plan management, no REBIND SWITCH capability
    • Even if copy was saved, auto-bind occurs because DB2 will not use pre-DB2 10 package copies
  • REBIND PACKAGEs older than V10 prior to V12 migration using PLANMGMT(EXTENDED)
    • REBIND SWITCH in V11 can use V9 package copies if needed
    • Eliminates “prior to V10” package auto-binds from migration
  • Future REBIND under V12, V11 package copy usable with REBIND SWITCH

Memory management design changes

• Before APAR PM88804, DB2 uses DISCARDDATA KEEPREAL(NO) on storage contraction without taking into account the setting of DB2 system parameter REALSTORAGE_MANAGEMENT
  • KEEPREAL(NO) tells RSM to free and reclaim the page immediately, and it takes CPU to do the free
  • RSM of z/OS has to get LPAR level serialisation to manage those pages that are being freed immediately
  • Incurs CPU spin at the LPAR level to get that serialisation
• After APAR PM88804
  • If there are no real memory shortage issues, DB2 stops discard completely when REALSTORAGE_MANAGEMENT=OFF, and for REALSTORAGE_MANAGEMENT=AUTO with no paging
    • Reduces CPU resource consumption
    • Reduces the chances of CPU spin
    • Increases real memory use and ultimately may cause paging
  • DB2 will continue to discard if real memory shortage reaches a critical level

Memory management design changes…

• After APAR PM99575
  • DB2 now uses KEEPREAL(YES) on discard
  • Storage is now only "virtually freed"
  • RSM flags the page as freed or unused, but the storage is still in real memory with the data and charged against DB2
  • If RSM needs the page, it can steal it and the RSM accounting that goes with it is done
  • Some customers experienced that DB2 was then consuming more CPU, especially in MSTR and/or DIST preemptible SRB time
  • But gave up some of the CPU reduction that was in APAR PM88804
• Solutions to reduce CPU resource consumption
  • Set REALSTORAGE_MANAGEMENT=OFF provided real memory is generously allocated and can tolerate some growth in real memory use
  • Reduce the impact of REALSTORAGE_MANAGEMENT=AUTO, by designing for increased thread reuse to minimise thread deallocations

Memory management design changes…

After APAR PM88804
• ENF 55 signal means DISCARD KEEPREAL=NO
• RSM=OFF means No DISCARD
• RSM=AUTO with no paging means No DISCARD at Thread Deallocation or 120 commits
• RSM=AUTO with paging or RSM=ON means DISCARD with KEEPREAL=NO at Deallocation or 30 commits. STACK also DISCARDED
• REALSTORAGE_MAX means DISCARD KEEPREAL=NO at 80%

After APAR PM99575
• ENF 55 signal means DISCARD KEEPREAL=NO
• RSM=OFF means No DISCARD
• RSM=AUTO with no paging means DISCARD with KEEPREAL=YES at Thread Deallocation or 120 commits
• RSM=AUTO with paging or RSM=ON means DISCARD with KEEPREAL=YES at Deallocation or 30 commits. STACK also DISCARDED
• REALSTORAGE_MAX means DISCARD KEEPREAL=NO at 100%

Tutorial on Insert Space Search

• Performance is a trade off across
  • Minimising CPU resource consumption
  • Maximising throughput
  • Maintaining data row clustering
  • Reusing space from deleted rows and minimising space growth
• Note: insert space search algorithm is subject to change and it does

Tutorial on Insert Space Search…

• For UTS and Classic partitioned table space:
  1. Index Manager will identify the candidate page (next lowest key rid)
     • If page is full or locked, skip to Step 2
  2. Search adjacent pages (+/-) within the segment containing the candidate page
     • For classic partitioned it is +/-16 pages
  3. Search the end of page set/partition without extend
  4. Search the space map page that contains lowest segment that has free space to the last space map page up to 50 space map pages
     • This is "smart space exhaustive search"
  5. Search the end of page set/partition with extend until PRIQTY or SECQTY reached
  6. Perform "exhaustive space search" from front to back of page set/partition when PRIQTY or SECQTY reached
     • Very expensive i.e., space map with lowest segment all the way through
• For classic segmented table space, steps are very similar except for the sequence:
  • 1->2->3->5->4->6
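The two step sequences above can be laid side by side in a small table-driven sketch. It only encodes the ordering stated on the slide; the dictionary names, the short step labels, and the table space type labels are illustrative, and the real algorithm is internal to DB2 and subject to change.

```python
# Short labels for the six steps listed on the slide.
STEP_NAMES = {
    1: "candidate page from index manager",
    2: "adjacent pages within the segment",
    3: "end of page set/partition without extend",
    4: "smart space exhaustive search (up to 50 space map pages)",
    5: "end of page set/partition with extend",
    6: "exhaustive space search, front to back",
}

# Step order per table space type, as given on the slide.
SEARCH_ORDER = {
    "UTS": (1, 2, 3, 4, 5, 6),
    "classic partitioned": (1, 2, 3, 4, 5, 6),
    "classic segmented": (1, 2, 3, 5, 4, 6),
}


def search_sequence(table_space_type):
    """Return the ordered step descriptions for one table space type."""
    return [STEP_NAMES[step] for step in SEARCH_ORDER[table_space_type]]


for name in search_sequence("classic segmented"):
    print(name)
```

The only difference for classic segmented is that the extend attempt (step 5) comes before the smart exhaustive search (step 4).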


Tutorial on Insert Space Search…

• Each member maintains for each page set/partition:
  • First/lowest space map page with free space
  • Current space map page with free space
  • Last space map page
• In general, each step has 3 pages for false lead and 100 for lock failure threshold
  • In step 3 (search at end of page set/partition without extend) there is a little more search than just 3 pages false lead to tolerate the hot spot case
• For the search step where cannot extend at end of page set/partition
  • Search from the first space map page to the end
  • No false lead or lock threshold
  • Try every possible space before failure
• Elsewhere for the “smart exhaustive search” before extend, it will follow the "lowest segment has free space" going forward 50 space map pages then it will go to extend
  • The 3 pages false lead and 100 page lock failures are applied to each space map page
• If there are multiple inserts in the same UR, then 2nd insert will not search the previous repeated unsuccessful space map pages if the previous insert has already gone through more than 25 pages

JohnCampbellDB2forz/[email protected]

SessionZ6andZ7DB2forz/OS:JC'sGreatestHits,WarStoriesandBestPractice2016- Parts1and2

Please fill out your session evaluation before leaving!
