DB2 for z/OS: JC's Greatest Hits, War Stories and Best Practice 2016 - Parts 1 and 2
John Campbell, DB2 for z/OS Development
Session Code: Z6 and Z7
Thursday, September 15th, 8:30-9:30 & 9:45-10:45 | Platform: z/OS
Objectives
• Learn recommended best practice
• Learn from the positive and negative experiences of other installations
Topics
• System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting
• Automated DB2 restart and misuse of “Restart Light”
• REAL memory planning LFAREA IEASYSxx parameter
• ALTER BUFFERPOOL bpname immediate
• Global Lock Manager (GLM) cleanup and recovery in data sharing
• Fast REORG with SORTDATA NO RECLUSTER NO
• Housekeeping: REORG INDEX and REORG TABLESPACE
• Conversion to table-controlled partitioning
• Alter limit key with table-controlled partitioning
• Value of rebind for migrated static SQL packages
• Multi-site data sharing group and extended distance
• zIIP capacity
• Avoid Auto-bind after release migrations and schema changes
• Memory management design changes
• Tutorial on insert space search
System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting
• Background
  • All DB2 CF structures MUST be purged before attempting a disaster restart
    • Otherwise, guaranteed logical data corruption
  • Required even if using Metro Mirror with FREEZE/GO policy
    • No consistency between ‘frozen’ secondary DASD copy and content of the CF structures
  • Necessary actions and checks should be automated
  • New DB2 system parameter introduced in V10 at customer request
    • DEL_CFSTRUCTS_ON_RESTART=YES|NO
    • Triggers auto-delete of DB2 CF structures on restart if no active connections exist
  • GRECP/LPL recovery will be required as a result of deleting the GBPs
    • Likely to be the largest contributor to DB2 restart time
System parameter DEL_CFSTRUCTS_ON_RESTART=YES setting...
• Problem
  • Many customers use the same system parameter settings for production and disaster recovery
  • If DEL_CFSTRUCTS_ON_RESTART=YES, this can result in an unnecessary DB2 group restart in certain local failure scenarios, and it could take a long time
• Recommendations
  • Do NOT set DEL_CFSTRUCTS_ON_RESTART=YES
  • Much safer to upgrade the GDPS procedures to purge the DB2 CF structures before DB2 restart when restarting off secondary volumes that have drifted behind the currency of the CF content

D XCF,STRUCTURE,STRNAME=grpname*

For the LOCK1 structure and any failed-persistent group buffer pools:
SETXCF FORCE,CONNECTION,STRNAME=strname,CONNAME=ALL

For the LOCK1 structure and the SCA, it is also necessary to force the structures out:
SETXCF FORCE,STRUCTURE,STRNAME=strname

D XCF,STRUCTURE,STRNAME=grpname* (to confirm)
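The command sequence above lends itself to automation. The following is a minimal sketch, not a GDPS procedure: it only builds the command strings in the order shown on the slide. The function name, the group name DSNDB10, and the GBP0 structure are illustrative assumptions; the structure names follow the usual groupname_LOCK1 and groupname_SCA convention.

```python
def purge_commands(grpname, failed_persistent_gbps=()):
    """Build the operator commands, in slide order, to purge DB2 CF structures.

    grpname and failed_persistent_gbps are supplied by the caller; this sketch
    does not query XCF, it only formats the commands.
    """
    lock1 = f"{grpname}_LOCK1"
    sca = f"{grpname}_SCA"
    display = f"D XCF,STRUCTURE,STRNAME={grpname}*"

    commands = [display]
    # Force connections for LOCK1 and any failed-persistent group buffer pools.
    for name in [lock1, *failed_persistent_gbps]:
        commands.append(f"SETXCF FORCE,CONNECTION,STRNAME={name},CONNAME=ALL")
    # LOCK1 and the SCA must additionally be forced out as structures.
    for name in [lock1, sca]:
        commands.append(f"SETXCF FORCE,STRUCTURE,STRNAME={name}")
    # Display again to confirm the structures are gone.
    commands.append(display)
    return commands


if __name__ == "__main__":
    # Hypothetical group name and GBP, for illustration only.
    for cmd in purge_commands("DSNDB10", ["DSNDB10_GBP0"]):
        print(cmd)
```

In practice the list of failed-persistent group buffer pools would come from the output of the first display command rather than being hard-coded.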
Automated DB2 restart and misuse of Restart Light
• Tuning for fast DB2 restart
  • Take frequent system checkpoints
    • Every 2-5 minutes generally works well
  • Commit frequently!
  • Use DB2 Consistent restart (Postponed Abort) to limit backout of long-running URs
    • Controlled via ZPARM
      • LBACKOUT=AUTO|LIGHTAUTO
        • LIGHTAUTO is a new option to allow Restart Light with postponed abort
        • See PM34474 (V9) and PM37675 (V10)
      • BACKODUR=5 (interval is 500K log records if time-based checkpoint frequency)
  • Postponed Abort is not a ‘get-out-of-jail-free’ card
    • Some retained locks persist through restart when postponed backout processing is active
      • Retained locks held on page sets for which backout work has not been completed
      • Retained locks held on tables, pages, rows, or LOBs of those table spaces or partitions
    • If on shared critical resources, these retained locks can prevent applications from running properly
Automated DB2 restart and misuse of Restart Light…
• Tuning for fast DB2 restart…
  • Track and eliminate long-running URs
    • Long-running URs can have a big impact on overall system performance and availability
      • Elongated DB2 restart and recovery times
      • Reduced availability due to retained locks held for a long time (data sharing)
      • Potential long rollback times in case of application abends
      • Lock contention due to extended lock duration >> timeouts for other applications
      • Ineffective lock avoidance (data sharing)
        • A single long-running UR running anywhere in the data sharing group can reduce the effectiveness of lock avoidance for applications accessing GBP-dependent page sets by stopping the Global CLSN value from moving forward
      • Problems getting DB2 utilities executed
    • DB2 provides two system parameters to help identify ‘rogue’ applications
      • URCHKTH for long-running URs
      • URLGWTH for heavy updaters that do not commit
Automated DB2 restart and misuse of Restart Light…
• Tuning for fast DB2 restart…
  • Track and eliminate long-running URs…
    • Aggressively monitor long-running URs
      • Start conservatively and adjust the ZPARM values downwards progressively
      • Initial recommendations
        • URLGWTH=10 (K log records)
        • URCHKTH=5 (system checkpoints), based on a 3-minute checkpoint interval
      • Automatically capture warning messages
        • DSNJ031I (URLGWTH)
        • DSNR035I (URCHKTH)
        • and/or post-process IFCID 0313 records (if Statistics Class 3 is on)
    • Get badly-behaved applications upgraded so that they commit more frequently
      • Need management ownership and process
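To make the initial recommendations concrete, the sketch below converts the two ZPARM values into the thresholds they imply. It assumes that URLGWTH counts log records in units of 1000 and that checkpoints are taken on a fixed time interval; the function name and dictionary keys are illustrative only.

```python
def ur_warning_thresholds(urlgwth_k=10, urchkth=5, checkpoint_minutes=3):
    """Translate URLGWTH/URCHKTH settings into the thresholds they imply.

    Assumes URLGWTH is expressed in thousands of log records and that the
    system checkpoint frequency is time-based.
    """
    return {
        # DSNJ031I is expected after this many log records without a commit.
        "log_records_without_commit": urlgwth_k * 1000,
        # DSNR035I is expected after this many minutes without a commit.
        "minutes_without_commit": urchkth * checkpoint_minutes,
    }


if __name__ == "__main__":
    print(ur_warning_thresholds())
```

With the slide's starting values, a UR is flagged after roughly 15 minutes without a commit or after 10,000 log records, whichever comes first.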
Automated DB2 restart and misuse of Restart Light…
• DB2 restart after failure
  • Use automation to detect failures quickly and restart DB2 immediately
    • Objective: Release retained locks as soon as possible
    • NO manual intervention
    • NO management decision involved
    • Automated Restart Manager (ARM) or automated operators (e.g. Tivoli System Automation family) can be used
  • Two failure scenarios to distinguish
    • DB2 or IRLM failure: restart DB2 in place using a normal DB2 warm restart
    • LPAR or CEC failure: restart DB2 on an alternative LPAR using DB2 Restart Light

** Very basic ARM Policy **
RESTART_GROUP(DB2)
  ELEMENT(DSNDB10DB1A)
    TERMTYPE(ALLTERM)
    RESTART_METHOD(ELEMTERM,PERSIST)
    RESTART_METHOD(SYSTERM,STC,'-DB1A STA DB2,LIGHT(NOINDOUBTS)')
Automated DB2 restart and misuse of Restart Light…
• DB2 restart after failure…
  • “Restart Light” = -START DB2 LIGHT(YES|NOINDOUBTS|CASTOUT)
    • Design point of Restart Light is ONLY for cross-system restart following LPAR/CEC failure
    • Small memory footprint
      • Used to avoid ECSA/CSA virtual storage shortage on the alternate LPAR by forcing IRLM PC=YES
        • No longer an issue because, as of V8, IRLM PC=YES is always forced
      • Avoids real memory shortage on the alternate LPAR by severely constraining pool sizes
    • Simplified management
      • Does not allow new workload to come in
      • Can shut down DB2 member automatically
  • BUT… Restart Light is likely to be slower than normal DB2 crash restart
  • Restart Light is NOT intended for restart in place on the ‘home’ LPAR or for DR crash restart
Automated DB2 restart and misuse of Restart Light…
• DB2 restart after failure…
  • Restart Light and postponed Abort
    • ZPARM LBACKOUT=AUTO setting (default) is ignored and treated as LBACKOUT=NO
      • Processing of Abort URs will not be postponed
      • There will be a wait to complete processing of Abort URs successfully
      • Restart could take a long time to complete if there are very long-running URs to abort
    • Recommendation: LBACKOUT=LIGHTAUTO to enable postponed Abort for Restart Light
  • Restart Light and in-doubt URs
    • With LIGHT(YES), DB2 will not terminate automatically until all in-doubt URs are resolved
    • Option #1: Provided there is sufficient spare REAL memory available on the alternate LPAR, could use automation to restart CICS and/or IMS/TM infrastructure on the same alternate LPAR as the DB2 member being restarted
    • Option #2: Use option LIGHT(NOINDOUBTS)
      • DB2 will terminate automatically and not resolve in-doubts
      • But keep in mind that retained locks cannot be freed until the in-doubt UR is resolved
Automated DB2 restart and misuse of Restart Light…
• DB2 restart after failure…
  • Restart Light and retained locks
    • START DB2 LIGHT(YES|NOINDOUBTS) will not clean up all retained locks
      • Should expect to see message DSNT804I at the end of restart
      • Retained transaction locks and X-mode page set P-locks are released
      • But retained page set P-locks in IX or SIX mode will NOT be released
        • No SQL DML processing will be blocked (exception is mass delete)
        • No DB2 utility that is a claimer will be blocked, e.g. COPY SHRLEVEL(CHANGE) will work
        • But drainers will be blocked, e.g. REORG including REORG SHRLEVEL(CHANGE), SQL DDL
      • Will be cleared once the DB2 is restarted normally in its ‘home’ LPAR
    • New option START DB2 LIGHT(CASTOUT) in V11
      • Kicks off castout processing
      • After castout, page sets become non-GBP-dependent
      • Page set P-locks in IX or SIX mode are released
      • Will take longer to complete than LIGHT(YES|NOINDOUBTS) option

DSNT804I - THERE ARE MODIFY LOCKS OWNED BY THIS DB2 THAT HAVE BEEN RETAINED
REAL memory planning LFAREA IEASYSxx parameter
• The large frame area is used for
  • Fixed 1M size large pages
  • “Spillover” for Pageable 1M size large pages
• Using large size pages can improve CPU performance by reducing the overhead of dynamic address translation
  • Improved hit ratio in the Translation Look-aside Buffer (TLB) because of smaller number of entries
• CPU reduction will vary; based on customer experience, 0 to 4%
• Buffer pools must be defined as PGFIX=YES to use 1M size pages
• Size of the large page frame area is specified by LFAREA in IEASYSnn parmlib member and only changeable by IPL
• Involves partitioning real memory into 4K and 1M size pages
• Must make sure there is sufficient total REAL memory to fully back the demand for both enough 4K and enough 1M size pages
REAL memory planning LFAREA IEASYSxx parameter...
• LFAREA should be sized and managed based on total workload demand for fixed 1M pages
  • Setting LFAREA to the maximum setting of 80% is not where to start
  • There are consequences to overallocation of LFAREA and shortage of 4K pages in terms of penalty of page movement or paging I/O overhead
  • Page movement and paging I/O is done by SRBs processing under the RASP address space and there will be a corresponding increase in CPU usage in RASP
• Define LFAREA that you can afford based on what is left over after considering your total real memory and workload demands for 4K frames above 2G
  • Must consider operating system storage needs
    • RSM requirement for memory mapping, which is approximately 1/64 of total online real at IPL, e.g., 4G for 256G system
    • System address spaces memory usage (DB2, CICS, etc.)
  • Must also include enough spare 4K size frames for taking dumps quickly
REAL memory planning LFAREA IEASYSxx parameter…
• LFAREA and INCLUDE1MAFC option
  • Available frame count (AFC) is used to determine when storage management should begin paging frames when there is a real memory shortage
  • Two options when defining the LFAREA: INCLUDE1MAFC(YES|NO)
    • YES means AFC includes the LFAREA 1M pages
      • Paging is delayed
      • Frames will be converted from the 1M LFAREA to 4K frames for 4K non-preferred requests
    • NO means AFC does not include LFAREA 1M pages
      • Paging will be instigated
      • Fixed 1M frames will eventually be broken down into 4K frames for 4K non-preferred requests
  • Recommendation is to use INCLUDE1MAFC(YES) so that unused 1M pages can be broken down ASAP to satisfy demand for 4K non-preferred requests
REAL memory planning LFAREA IEASYSxx parameter…
• Recommendations
  • Have enough 4K frames “above the bar” to avoid RSM breaking down free 1M pages and paging or page movement for 4K page fixes
    • Include RSM needs for memory mapping - 1/64 of total online real at IPL
    • Factor in system address space memory usage
    • Include enough spare 4K frames for taking dumps quickly
  • Define the LFAREA based on what you can afford within the available remaining online real memory; the results are either:
    • DB2 buffer pools are spread across 4K and 1M frames
      • No availability issues
      • No performance regression compared to using all 4K frames
      • Only some loss to incremental benefit of using 1M frames
  • Provision additional REAL memory
  • Constrain DFSORT use of REAL memory to what the configuration can afford
  • Both QUAD and Pageable 1M areas grow proportionally with additional REAL memory
REAL memory planning LFAREA IEASYSxx parameter…
• Minimum starting point for 4K requests is the sum of all of the following:
  • 1/64 of total real memory operating system requirement (64 bytes per 4K page, 1M page counts as 256 * 4K), plus
  • 1/8 for QUAD area - should not rely on this area being used as contingency for satisfying 4K fixed requests, plus
  • 1/8 for Pageable 1M frames - should not rely on this area being used as contingency for satisfying 4K pageable or fixed requests, plus
  • 2G for below the 2G bar, plus
  • 2G for above the 2G bar, plus
  • DUMPSRV/MAXSPACE requirement (general ROT is 16G)
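The sum above can be worked through for a given machine size. The sketch below is an illustration only: it assumes that both 1/8 fractions are taken of total online real memory (the slide does not state the base explicitly), and the function and key names are hypothetical.

```python
def min_4k_starting_point_gb(total_real_gb, maxspace_gb=16.0):
    """Sum the slide's components for the minimum 4K frame starting point (GB).

    Assumption: the QUAD and Pageable 1M fractions are 1/8 of total real memory.
    maxspace_gb defaults to the slide's general rule of thumb of 16G.
    """
    components = {
        "rsm_memory_mapping": total_real_gb / 64,
        "quad_area": total_real_gb / 8,
        "pageable_1m_frames": total_real_gb / 8,
        "below_2g_bar": 2.0,
        "above_2g_bar": 2.0,
        "dumpsrv_maxspace": maxspace_gb,
    }
    components["total"] = sum(components.values())
    return components


if __name__ == "__main__":
    for name, gb in min_4k_starting_point_gb(256).items():
        print(f"{name:20s} {gb:6.1f} GB")
```

For a 256G system this gives 4 + 32 + 32 + 2 + 2 + 16 = 88G before any workload demand for 4K frames is added; LFAREA is then sized from what remains after that demand is also accounted for.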
ALTER BUFFERPOOL bpname immediate
• Prior to DB2 10, user had to perform a STOP and START around the ALTER
• DB2 10 allows subject buffer pool change to now be "immediate" as long as the page size is not changing
• This enhancement is not well advertised
• DB2 10 performs
  • DRAIN ALL operation on the object, which drives physical close on all DB2 members
  • Physical close by the last DB2 member drives all of the GBP castout and purge processing
  • Take an X-mode lock on skeleton cursor and package tables
• Must be submitted during quiet period
Global Lock Manager (GLM) cleanup and recovery in data sharing
• Context
  • When a DB2 member terminates, management responsibility for global lock contention must be transferred to other DB2 members of the subject data sharing group
  • The DB2 lock structure of the subject data sharing group must be quiesced during some of the cleanup activities
  • During the period of time when the DB2 lock structure is quiesced, no DB2 global locking activity can occur for the subject DB2 data sharing group
  • As part of this recovery a significant amount of XCF signaling activity is required to re-establish the global lock management position
  • Enhancements have been delivered in the service stream to improve recovery performance
  • Optimized XCF signaling is essential to reduce overall recovery time
• Problems
  • IRLM CPU burn following DB2 member shutdown and abnormal DB2 member termination
  • Many seconds to small number of minutes when the data sharing group appears to be hung
Global Lock Manager (GLM) cleanup and recovery in data sharing…
• Recommendations
  • Apply relevant PTF maintenance
    • APAR OA43748 (Feb 2014)
      • Fixes a queuing scalability issue for GLM recovery
    • APAR OA43388 (Aug 2014 HIPER)
      • Redesign recovery to
        • Minimise the time the DB2 lock structure is quiesced
        • Achieve better balancing of the GLM recovery across the surviving members
    • APAR OA46423 (Feb 2015 HIPER, PE fix for APAR OA43748)
      • Eliminates potential for hangs and delays introduced by APAR OA43748
  • If global false lock contention is excessive (i.e., >1%)
    • Increase number of LTEs for CF LOCK1 structure to reduce global lock management
Global Lock Manager (GLM) cleanup and recovery in data sharing…
• Recommendations…
  • Avoid XCF signaling delays
    • Insufficient message buffers will lead to rejected messages and signals having to be resent, resulting in delays
    • Delays elongate the amount of time that the DB2 lock structure remains quiesced
    • Make sure a 4K size XCF transport class is defined
    • Assign dedicated buffer space (MAXMSG) of at least 10000 for 4K size XCF transport class
    • Significantly increase buffer space on COUPLE SYSPLEX statement to at least 10000
    • Proactively monitor and tune for near zero ‘messages rejected’
Fast REORG with SORTDATA NO RECLUSTER NO
• Use cases: pending alters, LRBA/LRSN conversion, conversion to UTS
• Sample control statement
    REORG TABLESPACE SHRLEVEL CHANGE
      SORTDATA NO RECLUSTER NO
      STATISTICS TABLE(ALL) SAMPLE 1 REPORT YES UPDATE NONE
• Considerations and risks when using RECLUSTER NO
  1. Recalculation of CLUSTERRATIO
    • If you remove STATISTICS, REORG will collect implicit statistics
    • This can lead to bad CLUSTERRATIO statistics
  2. Reset of RTS statistics
    • REORGUNCLUSTINS and REORGINSERTS will be set to 0 even though no data rows have been reclustered
• Recommendations
  • Only use SORTDATA NO RECLUSTER NO for objects with very highly clustered data
    • Alternative is to update RTS after REORG and restore the before values
  • Use explicit STATISTICS with REPORT YES UPDATE NONE
    • SAMPLE 1 is to reduce CPU resource consumption
Housekeeping: REORG INDEX and REORG TABLESPACE
• Collection of some common findings from customer studies
  • Too many REORG TABLESPACE scheduled vs. too few REORG INDEX scheduled
  • Indexes are only ever reorganised as part of REORG TABLESPACE
  • Collecting inline statistics for REORG INDEX
    • Time drift across statistics collected: REORG TABLESPACE vs. REORG INDEX
    • Degraded CLUSTERRATIO for subject index -> potential change of access path
  • High number of indexes that:
    • Are very disorganised due to index leaf page splits
    • Have high percentage of pseudo-deleted index entries
  • NPIs experiencing level of disorganisation following REORG TABLESPACE PART with SORTNPSI NO
Housekeeping: REORG INDEX and REORG TABLESPACE…
• Recommendations for REORG INDEX
  • Implement criteria for indexes based on RTS table SYSINDEXSPACESTATS
    • % of index page splits: REORGLEAFFAR / NACTIVE > 10%
    • % of pseudo-deleted index entries: REORGPSEUDODELETES / TOTALENTRIES > 5%
    • % of pseudo-empty pages: NPAGES > 5 AND NPAGES / NLEAF > 10%
  • Build an index list and merge with the table space list
    • If the index qualifies for REORG, but not the table space, run REORG INDEX
  • Do not collect inline statistics as part of REORG INDEX (there are always a few exceptions)
  • Set ZPARM REORG_PART_SORT_NPSI=NO → AUTO (V11 default)
    • Avoid NPI disorganisation at end of REORG TABLESPACE PART
  • Set ZPARM INDEX_CLEANUP_THREADS=10 (V11 default) for autonomic cleanup of pseudo-deleted index entries
    • APAR PI31459: Suppress the timeout messages where a cleanup task is the victim
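The three REORG INDEX criteria can be expressed as a simple filter over rows read from SYSINDEXSPACESTATS. This is a sketch under assumptions: each row is represented as a plain dictionary keyed by the column names used on the slide, and the function name and reason labels are illustrative. How the rows are fetched is left out.

```python
def reorg_index_reasons(row):
    """Return the slide's REORG INDEX criteria that a SYSINDEXSPACESTATS row meets.

    `row` is assumed to be a dict with the RTS column names used on the slide.
    Zero denominators are treated as 'criterion not met'.
    """
    reasons = []
    if row["NACTIVE"] > 0 and row["REORGLEAFFAR"] / row["NACTIVE"] > 0.10:
        reasons.append("index page splits > 10%")
    if row["TOTALENTRIES"] > 0 and row["REORGPSEUDODELETES"] / row["TOTALENTRIES"] > 0.05:
        reasons.append("pseudo-deleted entries > 5%")
    if row["NPAGES"] > 5 and row["NLEAF"] > 0 and row["NPAGES"] / row["NLEAF"] > 0.10:
        reasons.append("pseudo-empty pages > 10%")
    return reasons


if __name__ == "__main__":
    # Made-up numbers, for illustration only.
    sample = {"NACTIVE": 1000, "REORGLEAFFAR": 150, "TOTALENTRIES": 200000,
              "REORGPSEUDODELETES": 4000, "NPAGES": 3, "NLEAF": 900}
    print(reorg_index_reasons(sample))
```

An index with a non-empty result is a REORG INDEX candidate; whether it is reorganised on its own depends on whether its table space also qualifies, as noted in the list above.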
Housekeeping: REORG INDEX and REORG TABLESPACE…
• Recommendations for REORG TABLESPACE
  • Extend criteria based on % of rows inserted/deleted to include qualification on size
  • Add a criterion on % of unclustered insert operations:
    • REORGCLUSTERSENS > 0 AND REORGUNCLUSTINS / TOTALROWS > 10%
  • Periodically review rules and thresholds against IBM-provided sample SQL stored procedure DSNACCOX
  • Exploit the information collected on REORG trigger conditions to drive tuning decisions such as distributed free space (PCTFREE/FREEPAGE) adjustments
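The unclustered-insert criterion can be checked the same way. A minimal sketch, assuming the three values have already been read from the table space real-time statistics; the function name and the default threshold parameter are illustrative.

```python
def unclustered_insert_trigger(reorgclustersens, reorgunclustins, totalrows,
                               threshold=0.10):
    """Apply the slide's criterion for unclustered inserts.

    True only when some access is sensitive to clustering
    (REORGCLUSTERSENS > 0) and the share of unclustered inserts
    exceeds the threshold.
    """
    if totalrows <= 0:
        return False
    return reorgclustersens > 0 and reorgunclustins / totalrows > threshold


if __name__ == "__main__":
    # Made-up numbers: 15% unclustered inserts, clustering-sensitive access seen.
    print(unclustered_insert_trigger(42, 150_000, 1_000_000))
```

The REORGCLUSTERSENS test keeps table spaces that are badly clustered but never read in clustering order out of the REORG schedule.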
Conversion to table-controlled partitioning
• Introduced back in V8 ahead of introduction of UTS PBR in V9
• History
  1. In V8, all columns, not just the significant columns, were used during the conversion (no system parameter to control behavior)
  2. V9 APAR PM45829 introduced new system parameter to govern which columns are used in the conversion process: IX_TB_PART_CONV_EXCLUDE
    • NO = all columns (default)
    • YES = only significant columns
  3. V9 APAR PM90893 changed default to YES
• Many customers have made progress over time in converting over to classic partitioned table space en route to UTS PBR
• Problem
  • Some PIs created after conversion may be created as DPSIs, which may lead to suboptimal access path selections…
Conversion to table-controlled partitioning…

CREATE TABLESPACE SAMS SEGSIZE 0 NUMPARTS 3 ...
CREATE UNIQUE INDEX SAMX (COLA ASC, COLB ASC) ...
  PART 0001 VALUES ( 00010 ),
  PART 0002 VALUES ( 00020 ),
  PART 0003 VALUES ( 00030 ) ...

DSNT404I SQLCODE = 20272, WARNING: TABLE SPACE SAMS HAS BEEN CONVERTED TO USE TABLE-CONTROLLED PARTITIONING INSTEAD OF INDEX-CONTROLLED PARTITIONING, ADDITIONAL INFORMATION: 30

SYSIBM.SYSTABLEPART
+-------------------------+-------------------+
! IX_TB_PART_CONV_EXCLUDE ! LIMITKEY_INTERNAL !
+-------------------------+-------------------+
! YES                     ! 10                !
! NO                      ! 10, MAXVALUE      !
+-------------------------+-------------------+
Conversion to table-controlled partitioning…

CREATE INDEX SAMX2 ... (COLA ASC, COLC ASC) PARTITIONED ...
CREATE INDEX SAMX3 ... (COLA ASC, COLB ASC, LOCATION ASC) PARTITIONED ...

SYSIBM.SYSINDEXES
+-------+-----------------------------+-----------------------------+
! NAME  ! INDEXTYPE                   ! INDEXTYPE                   !
!       ! IX_TB_PART_CONV_EXCLUDE=YES ! IX_TB_PART_CONV_EXCLUDE=NO  !
+-------+-----------------------------+-----------------------------+
! SAMX  ! P                           ! P                           !
! SAMX2 ! P                           ! D                           !
! SAMX3 ! P                           ! P                           !
+-------+-----------------------------+-----------------------------+
Conversion to table-controlled partitioning…
• Recommendations
  • Set system parameter IX_TB_PART_CONV_EXCLUDE=YES
  • Investigate whether such cases exist, e.g.,
    • Index on A, B, C but partitioned on A only → table-controlled partitioning on A, B, C
    • Alternate index on A, D was a PI before conversion → becomes a DPSI after conversion
  • Best way to convert
    • ALTER INDEX ... NOT CLUSTER and then ALTER INDEX ... CLUSTER on the partitioning index in the same commit scope
    • Note: After APAR PI36213, dependent packages will be invalidated!
Alter limit key with table-controlled partitioning
• Existing limit key values for partitioned table
    PARTITION 001 ENDING AT ('', '01'),
    PARTITION 002 ENDING AT ('', '02'),
    PARTITION 003 ENDING AT ('', '03'),
    PARTITION 004 ENDING AT ('', '99');
• Old syntax for index-controlled partitioned table
    ALTER INDEX DB.XDBUTS11
      PARTITION 001 ENDING AT ('', '02'),
      PARTITION 002 ENDING AT ('', '03'),
      PARTITION 003 ENDING AT ('', '04'),
      PARTITION 004 ENDING AT ('', '99');
  • Multiple partitions are allowed in one alter statement
  • DDL statement valid as long as the resulting values are consistent
Alter limit key with table-controlled partitioning…
• Syntax for table-controlled partitioned table, with only one partition allowed for each ALTER statement:
    ALTER TABLE DB.TDB_UTSR_DB1 ALTER PARTITION 1 ENDING AT ('', '02') INCLUSIVE;
    ALTER TABLE DB.TDB_UTSR_DB1 ALTER PARTITION 2 ENDING AT ('', '03') INCLUSIVE;
    ALTER TABLE DB.TDB_UTSR_DB1 ALTER PARTITION 3 ENDING AT ('', '04') INCLUSIVE;
    ALTER TABLE DB.TDB_UTSR_DB1 ALTER PARTITION 4 ENDING AT ('', '99') INCLUSIVE;
• The very first alter results in SQLCODE -636 as it conflicts with the original value of part 2
• The only way to handle this specific case is to do the alters in reverse order
• For every case you need to analyse the sequence in which to issue the alters
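The ordering problem can be reproduced with a small simulation. This is a simplified model, not DB2's actual validation: it assumes the only rule is that limit keys must remain strictly ascending after each individual alter, and the function name is hypothetical.

```python
def apply_alters(limit_keys, alters):
    """Apply one-partition-at-a-time limit key changes, checking after each one.

    limit_keys: list of limit keys in partition order (tuples of column values).
    alters: list of (partition_number, new_key) in the order they are issued.
    Raises ValueError, standing in for SQLCODE -636, when an alter would leave
    the keys out of strictly ascending order.
    """
    keys = list(limit_keys)
    for part, new_key in alters:
        trial = keys.copy()
        trial[part - 1] = new_key
        if any(low >= high for low, high in zip(trial, trial[1:])):
            raise ValueError(f"partition {part}: {new_key!r} conflicts with a neighbour")
        keys = trial
    return keys


if __name__ == "__main__":
    original = [("", "01"), ("", "02"), ("", "03"), ("", "99")]
    forward = [(1, ("", "02")), (2, ("", "03")), (3, ("", "04")), (4, ("", "99"))]
    try:
        apply_alters(original, forward)
    except ValueError as err:
        print("forward order fails:", err)
    print("reverse order:", apply_alters(original, list(reversed(forward))))
```

Running the alters from partition 4 down to partition 1 succeeds in this model because each new key is raised into a gap that the previous alter has just opened.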
Value of rebind for migrated static SQL packages
1. Improved performance from new runtime
  • Fast column processing routines re-enabled (SPROC, IPROC, UPROC, GENPROC)
  • Avoid “puffing” under new release when executing migrated package
2. Expose static applications to new DB2 release while fallback is possible via REBIND SWITCH
3. Get exposure to new query optimization and runtime enhancements
  • New access path choices
4. Reduce exposure to problems with migrated packages from earlier releases
  • INCORROUTs
  • Thread abends
5. Preparation for further usage of plan management in new releases and beyond
  • APREUSE(ERROR|WARN), …
Multi-site data sharing group and extended distance
• Some customers considering moving to a multi-site parallel sysplex over longer distances, while still having active DB2 members at each site
• General considerations
  • ‘First order’ effect is an increase in CF service times, which is easy to calculate
    • Service times for distances greater than around 3 km will be approximately:
      • For host CPU efficiency, all CF operations will be sent as async, which will add around 70 usec
      • Then, for every 1 km of distance, the round trip time is about 10 usec (speed of light)
      • For example: at 15 km you could expect about 70 + (15 x 10) = 220 usec for a CF operation
  • From an application response time standpoint, the elongation of CF service times will have hard-to-predict potential ‘second order’ effects
    • Elongated transaction response time could aggravate lock contention
    • Effects very dependent on customer application structure and mix
  • Any customer considering going beyond 10 km distance should do some simulated distance testing with realistic application workloads (including batch workloads and DB2 utilities) to prove viability
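The first-order estimate is simple enough to tabulate for candidate distances. A minimal sketch of the slide's rule of thumb; the function name and parameter names are illustrative, and the result says nothing about the second-order effects described above.

```python
def cf_service_time_usec(distance_km, async_overhead_usec=70.0, usec_per_km=10.0):
    """First-order CF service time estimate for distances beyond about 3 km.

    Uses the slide's figures: about 70 usec for async conversion plus about
    10 usec of round trip time per km of distance.
    """
    return async_overhead_usec + distance_km * usec_per_km


if __name__ == "__main__":
    for km in (5, 10, 15, 20):
        print(f"{km:3d} km -> about {cf_service_time_usec(km):.0f} usec per CF operation")
```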
Multi-site data sharing group and extended distance…
• Considerations using system-managed duplexing (SMD) of LOCK1 structure over extended distance
  • Every lock request, from site 1 or site 2, is duplexed and penalised
    • Including peer-to-peer CF round trip
  • Increased latency may be perfectly acceptable for most well-behaved online transactions that achieve good lock avoidance
  • But some workloads will be impacted from a performance viewpoint:
    • Insert/update/delete-intensive batch work running on either site
    • Read-only work that is lock-intensive due to poor lock avoidance running on either site
  • More likely to aggravate lock contention as distance is stretched
  • Limited benefit from creating a locality of reference by routing work to one site
  • Use of SMD will greatly reduce the viable distance
    • Planned z/OS enhancement to deliver async lock duplexing, and for DB2 12 to exploit it, should increase the distance at which SMD is viable
Multi-site data sharing group and extended distance…
• Consideration using DB2-managed duplexing of GBP structures over extended distance
  • Only changed pages are duplexed
    • Async request to the secondary GBP structure overlaps with the sync request to the primary GBP structure (which may itself be converted to async)
    • But both requests need to complete successfully before the processing can continue
  • Increased latency may be perfectly acceptable for applications with high read-to-write ratio
  • But some workloads will be impacted from a performance viewpoint:
    • Update-intensive batch work (insert/update/delete) running on either site
    • Read-only work that is CF intensive (page registration) and running away from Primary GBP
      • e.g. Image copy cycles, UNLOADs, reporting programs
  • Benefit from creating a locality of reference for read-only work that is CF intensive by routing work to the site where the primary GBP is allocated
    • Increasing size of local BPs also has the potential to reduce CF requests to the primary GBP
  • Limited benefit from creating a locality of reference for update-intensive work
Multi-site data sharing group and extended distance…
• Parallel Sysplex configurations - Scenario 1 (‘campus solution’)
  • Protection against single component failure (DB2, LPAR, CEC, CF) in Site 1 and Site 2
  • If Site 1 fails, always require a full group restart at Site 2 + GRECP/LPL recovery
  • Some performance impact for applications running on Site 2
  • Opportunities to mitigate severe performance regression should it occur by affinity routing the problem workload to Site 1

[Diagram: Site 1 and Site 2 with members DB2A, DB2B, DB2C, DB2D. LOCK1 and SCA allocated in a failure-isolated CF. Primary and secondary GBPs (GBP-P, GBP-S) in Site 1.]
Multi-site data sharing group and extended distance…
• Parallel Sysplex configurations - Scenario 2
  • ‘Fast’ group restart without GRECP/LPL recovery on site failure is possible but not guaranteed
  • Benefit from creating a locality of reference for read-only work that is CF intensive by routing work to Site 1 where the primary GBPs are allocated
    • No benefit from creating a locality of reference for update-intensive work
  • Limited opportunities to mitigate severe performance regression should it occur, other than being ready to take actions to revert to scenario 1, and then affinity route the problem workload

[Diagram: Site 1 and Site 2 with members DB2A, DB2B, DB2C, DB2D. LOCK1 and SCA allocated in a failure-isolated CF. Primary GBPs (GBP-P) in Site 1. Secondary GBPs (GBP-S) in Site 2.]
Multi-site data sharing group and extended distance…
• Parallel Sysplex configurations - Scenario 3
  • Continuous availability on site failure is possible but not guaranteed
  • Limited opportunities to mitigate severe performance regression should it occur, other than being ready to take actions to revert to scenario 2 or even scenario 1, and then affinity route the problem workload

[Diagram: Site 1 and Site 2 with members DB2A, DB2B, DB2C. Primary LOCK1 and SCA (LOCK1-P, SCA-P) in Site 1. Primary GBPs (GBP-P) in Site 1. Secondary GBPs (GBP-S) in Site 2. Secondary LOCK1 and SCA (LOCK1-S, SCA-S) labelled "in Site 1" on the slide.]
zIIP capacity
• History
  1. Originally it was only DRDA application work that was zIIP-eligible
  2. IBM message about loading up the zIIPs and spilling over to GCP
  3. IIPHONORPRIORITY=NO was implemented to stop zIIP-eligible work switching over to GCP on sub-capacity CECs
  4. zAAP activity running on zIIP
  5. Lazy switch with queuing was implemented to protect TCO
  6. DB2 started sending system task activity over to run on zIIPs
    • System tasks do not use much CPU, but they have a high-performance, low-latency requirement
    • Cannot afford to queue for zIIP engines with a lazy switch
• In V10, prefetch and deferred write engines are now eligible for zIIP offload; in V11, all SRB-mode system agents (except P-lock negotiation) are eligible for zIIP offload
  • These DB2 tasks must be dispatched very quickly
  • Any delays could result in
    • Significant elapsed time degradation for some batch and DB2 utility jobs
    • Very high count for 'prefetch engine not available' in the DB2 Statistics Trace
zIIP capacity…
• The number of zIIPs on the LPAR influences how hard you can run them
• It is down to queuing theory:
  • So 1-2 zIIPs we would recommend no more than 50% busy
  • 3-4 zIIPs no more than 70% busy
  • >4 zIIPs you can run them close to 90% busy
• But the busy is really a 'shorthand' and an artifact from GCP management
• The real driver of the need for additional zIIP capacity is not a specific utilisation number but rather the IIPCP amount
  • IIPCP time is really the metric to be watching and not how busy
• Need to proactively monitor zIIP CPU usage
  • Using the % of workload redirect (APPL% IIPCP in RMF), not just the zIIP % utilisation
• zIIP engines should be treated as ‘assist-processors’
  • Target utilisation for zIIP engines is likely to be much lower than the limit used for GCP usage due to the time-critical nature of the work likely to be offloaded there
  • Should not be running zIIP resource very hard at high utilisations
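The busy-percentage guidance can be captured as a small lookup for use in capacity reports. This sketch only encodes the three bands quoted on the slide; the function name is hypothetical, and, as the slide stresses, the IIPCP amount remains the metric to watch rather than this ceiling.

```python
def max_recommended_ziip_busy(n_ziips):
    """Return the slide's rule-of-thumb ceiling on zIIP busy, as a fraction."""
    if n_ziips < 1:
        raise ValueError("at least one zIIP is required")
    if n_ziips <= 2:
        return 0.50
    if n_ziips <= 4:
        return 0.70
    return 0.90  # "close to 90% busy" for more than 4 zIIPs


if __name__ == "__main__":
    for n in (1, 3, 6):
        print(f"{n} zIIP(s): no more than {max_recommended_ziip_busy(n):.0%} busy")
```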
zIIP capacity…
• Summary
  1. Generously allocate zIIP resources or dedicate zIIP resources to select production LPARs
  2. Design default and defensive choice is to set IIPHONORPRIORITY=YES
  3. Assuming a given customer has generously allocated zIIP resource, then there is no issue about running with IIPHONORPRIORITY=NO, i.e., there is plenty of zIIP capacity and no zIIP CPU starvation
  4. Only use IIPHONORPRIORITY=NO on sub-capacity CECs provided #3 has been achieved
  5. zIIP engines should not be run as hard as GCPs as the hardware costs less and does not carry MSU charges
  6. For V11, if IIPHONORPRIORITY=NO then DB2 will not schedule system engines (prefetch, deferred write, physical log read and write, etc.) to run on zIIPs
Avoid Auto-bind after release migration and schema changes
• No pre-V10 bound plans and packages allowed in V12
  • Removal of 31-bit runtime to deliver some performance improvement
• Avoid plan and package auto-binds as they can be disruptive
• Avoid PLAN auto-binds
  • REBIND PLANs older than V10 to avoid V12 auto-bind
  • For online migration, plan may be in use by V11 members
  • REBIND avoids PLAN auto-bind in/after migration…
• Avoid PACKAGE auto-binds
  • Packages are also subject to auto-bind in V12
    • Auto-bind means no plan management, no REBIND SWITCH capability
    • Even if copy was saved, auto-bind occurs because DB2 will not use pre-DB2 10 package copies
  • REBIND PACKAGEs older than V10 prior to V12 migration using PLANMGMT(EXTENDED)
    • REBIND SWITCH in V11 can use V9 package copies if needed
    • Eliminates “prior to V10” package auto-binds from migration
  • Future REBIND under V12, V11 package copy usable with REBIND SWITCH
Memory management design changes
• Before APAR PM88804, DB2 uses DISCARDDATA KEEPREAL(NO) on storage contraction without taking into account the setting of DB2 system parameter REALSTORAGE_MANAGEMENT
  • KEEPREAL(NO) tells RSM to free and reclaim the page immediately, and it takes CPU to do the free
  • RSM of z/OS has to get LPAR-level serialisation to manage those pages that are being freed immediately
  • Incurs CPU spin at the LPAR level to get that serialisation
• After APAR PM88804
  • If there are no real memory shortage issues, DB2 stops discard completely when REALSTORAGE_MANAGEMENT=OFF, and for REALSTORAGE_MANAGEMENT=AUTO with no paging
    • Reduces CPU resource consumption
    • Reduces the chances of CPU spin
    • Increases real memory use and ultimately may cause paging
  • DB2 will continue to discard if real memory shortage reaches a critical level
Memory management design changes…
• After APAR PM99575
  • DB2 now uses KEEPREAL(YES) on discard
  • Storage is now only "virtually freed"
  • RSM flags the page as freed or unused, but the storage is still in real memory with the data and charged against DB2
  • If RSM needs the page, it can steal it and the RSM accounting that goes with it is done
  • Some customers experienced that DB2 was then consuming more CPU, especially in MSTR and/or DIST preemptible SRB time
    • But gave up some of the CPU reduction that was in APAR PM88804
• Solutions to reduce CPU resource consumption
  • Set REALSTORAGE_MANAGEMENT=OFF provided real memory is generously allocated and can tolerate some growth in real memory use
  • Reduce the impact of REALSTORAGE_MANAGEMENT=AUTO by designing for increased thread reuse to minimise thread deallocations
Memory management design changes…

After APAR PM88804
• ENF 55 signal means DISCARD KEEPREAL=NO
• RSM=OFF means No DISCARD
• RSM=AUTO with no paging means No DISCARD at Thread Deallocation or 120 commits
• RSM=AUTO with paging or RSM=ON means DISCARD with KEEPREAL=NO at Deallocation or 30 commits. STACK also DISCARDED
• REALSTORAGE_MAX means DISCARD KEEPREAL=NO at 80%

After APAR PM99575
• ENF 55 signal means DISCARD KEEPREAL=NO
• RSM=OFF means No DISCARD
• RSM=AUTO with no paging means DISCARD with KEEPREAL=YES at Thread Deallocation or 120 commits
• RSM=AUTO with paging or RSM=ON means DISCARD with KEEPREAL=YES at Deallocation or 30 commits. STACK also DISCARDED
• REALSTORAGE_MAX means DISCARD KEEPREAL=NO at 100%
Tutorial on Insert Space Search
• Performance is a trade-off across
  • Minimising CPU resource consumption
  • Maximising throughput
  • Maintaining data row clustering
  • Reusing space from deleted rows and minimising space growth
• Note: the insert space search algorithm is subject to change, and it does change
Tutorial on Insert Space Search…
• For UTS and Classic partitioned table space:
  1. Index Manager will identify the candidate page (next lowest key rid)
    • If page is full or locked, skip to Step 2
  2. Search adjacent pages (+/-) within the segment containing the candidate page
    • For classic partitioned it is +/- 16 pages
  3. Search the end of page set/partition without extend
  4. Search the space map page that contains lowest segment that has free space to the last space map page, up to 50 space map pages
    • This is "smart space exhaustive search"
  5. Search the end of page set/partition with extend until PRIQTY or SECQTY reached
  6. Perform "exhaustive space search" from front to back of page set/partition when PRIQTY or SECQTY reached
    • Very expensive, i.e., space map with lowest segment all the way through
• For classic segmented table space, steps are very similar except the sequence:
  • 1->2->3->5->4->6
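The difference between the two sequences is easier to see side by side. The sketch below only encodes the step order listed above; it does not model the search itself, and the dictionary keys and function name are illustrative. As the tutorial notes, the real algorithm is subject to change.

```python
# Step descriptions, abbreviated from the list above.
STEPS = {
    1: "candidate page from Index Manager",
    2: "adjacent pages within the segment",
    3: "end of page set/partition without extend",
    4: "smart space exhaustive search (up to 50 space map pages)",
    5: "end of page set/partition with extend",
    6: "exhaustive space search from front to back",
}

# Order in which the steps are tried, per table space type.
SEQUENCE = {
    "uts_or_classic_partitioned": [1, 2, 3, 4, 5, 6],
    "classic_segmented": [1, 2, 3, 5, 4, 6],
}


def search_sequence(table_space_type):
    """Return the ordered step descriptions for the given table space type."""
    return [STEPS[n] for n in SEQUENCE[table_space_type]]


if __name__ == "__main__":
    for kind in SEQUENCE:
        print(kind)
        for position, step in enumerate(search_sequence(kind), start=1):
            print(f"  {position}. {step}")
```

The only difference is that classic segmented table spaces try to extend before running the smart space exhaustive search.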
Tutorial on Insert Space Search…
• Each member maintains for each page set/partition:
  • First/lowest space map page with free space
  • Current space map page with free space
  • Last space map page
• In general, each step has 3 pages for false lead and 100 for lock failure threshold
  • In step 3 (search at end of page set/partition without extend) there is a little more search than just 3 pages false lead to tolerate the hot spot case
• For the search step where DB2 cannot extend at end of page set/partition
  • Search from the first space map page to the end
  • No false lead or lock threshold
  • Try every possible space before failure
• Elsewhere, for the “smart exhaustive search” before extend, it will follow the "lowest segment that has free space" going forward 50 space map pages, then it will go to extend
  • The 3 pages false lead and 100 page lock failures are applied to each space map page
• If there are multiple inserts in the same UR, then the 2nd insert will not search the previous repeated unsuccessful space map pages if the previous insert has already gone through more than 25 pages
JohnCampbellDB2forz/[email protected]
SessionZ6andZ7DB2forz/OS:JC'sGreatestHits,WarStoriesandBestPractice2016- Parts1and2
Please fill out your session evaluation before leaving!