Understanding MLC Cost Impact of Performance and Capacity Management

St. Louis CMG, Oct 4th 2016

Donald Zeunert, BMC Software



Survey Questions

Hardware upgrades

– driven by online / batch?

MLC Costs - 4hr Rolling Average

– online, batch or both?

4HRA MSU peak hour

– SCRT (CEC, LPAR, Product sources)?


Topics

What is important to reduce IT Costs

– MIPS / MSUs reduction?

– ISV SWLCs reduction?

– Specialty Engines usage?

– JIT Hardware upgrades?

Why do I need to manage my 4 hour MSU rolling average?

How can I manage the 4HRA w/o impacting SLAs?


What is important to reducing IT costs?

CIO objectives and IT budget cost sources (%)

Bigger benefits with less effort if the focus is on managing MLC costs

Focus on software costs

MLC 25-30% of IT budget

– Save 10% of MLC = 2.5% of IT budget

ISVs 7-10% of IT budget

– Save 10% of ISV = 0.7% of IT budget

Meeting CIO objectives with the 80/20 rule: focus 20% of effort to get 80% of benefit

Hardware is no longer a $ issue
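The savings figures on this slide are simple products of budget share and percentage reduction. A minimal sketch of that arithmetic, assuming the low end of each quoted range (the function name is illustrative, not from the presentation):

```python
def budget_saving(share_of_budget: float, reduction: float) -> float:
    """Fraction of the total IT budget saved by trimming one cost source."""
    return share_of_budget * reduction


# Low end of the ranges quoted on the slide (an assumption for illustration).
mlc_saving = budget_saving(0.25, 0.10)  # MLC is 25-30% of budget
isv_saving = budget_saving(0.07, 0.10)  # ISV software is 7-10% of budget

print(f"10% MLC reduction saves {mlc_saving:.1%} of the IT budget")
print(f"10% ISV reduction saves {isv_saving:.1%} of the IT budget")
```

The same 10% effort applied to MLC returns roughly three to four times what it returns on ISV costs, which is the 80/20 argument for focusing on MLC.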


Performance & Availability Rules of Thumb

Keep CEC & MVS 100% busy

– An unused MIP is lost forever

– See USAA Session 18345

PR/SM Configuration

– Ensure production can steal from test and vice versa

– Over-configure the logical-to-physical ratio so MIPS > guaranteed % can be used

Upgrade thresholds

– Online workloads peak at 70-80% of capacity

Decrease batch window

– Complete ASAP for maximum window to recover from failures or schedule planned outages

– Start batch as soon as possible

Are you managing P&A using most of these rules?


IBM SCRT Report – LPARs contribution to CPC 4HRA

The CPC 4HRA used for z/OS MLC charges falls on a different date than any of the LPAR detail breakdowns

PRD2, the largest LPAR, did peak at the same hour the two days before

None of the other LPARs peaked around that date / time

2016 SCRT version has a detail mode to show all hours

Data in the SCRT report is insufficient to understand what caused the CPC 4HRA peak for the month


Goal of Usage (4HRA) based MLC

Billed at 4hr rolling average, not peak usage

Buy extra capacity - meet SLAs, pay for less than used

“Ideal” workload

• Heavy online with multiple peaks

• Little or no batch within 4hrs of peak

• Batch consumption peak below the onlines' 4HRA

[Figure: “ideal” workload chart with 4hr windows marked, time axis 9:00 AM to 5:00 PM, and an area labeled FREE]

Q: Is my CPC ideal?
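To make the gap between peak usage and the billed rolling average concrete, here is a minimal sketch. It assumes one MSU sample per hour and a simple trailing mean over four samples; the hourly figures are invented to resemble the “ideal” workload (short online peaks, low batch) and are not taken from the presentation.

```python
from collections import deque


def rolling_average(hourly_msus, window=4):
    """Trailing mean over the last `window` hourly samples.

    The first few hours average over fewer samples because no
    earlier history is available in this sketch.
    """
    recent = deque(maxlen=window)
    averages = []
    for msu in hourly_msus:
        recent.append(msu)
        averages.append(sum(recent) / len(recent))
    return averages


# Hypothetical 24 hours: quiet overnight batch, spiky daytime onlines.
hourly_msus = [30] * 8 + [60, 100, 60, 100, 50, 60, 100, 60, 50] + [30] * 7

peak_usage = max(hourly_msus)
peak_4hra = max(rolling_average(hourly_msus))

print(f"peak hourly usage: {peak_usage} MSUs")
print(f"peak 4HRA:         {peak_4hra} MSUs")
```

In this made-up day the online peaks are brief, so the highest rolling average stays below the highest single hour, and the difference is capacity that was used but not billed.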


"Indy, they're digging in the wrong place!”

When you have all of the information you can focus your efforts in the right place

Modification to formula: take back one kadam

The SCRT report doesn’t have enough information to tell if our CPCs are “Ideal”. We need to rely on performance monitors, capacity planning reports, and other tools.


Is my workload “Ideal”? – Understand 4HRA Drivers

[Figure: daily MSU chart annotated with: start of online day; online peak 3pm; 4HRA peak; batch 7:30PM; backup / reorgs; 4HRA risk from batch spike; white space consumed?]

Investigate

• Time of the CPC billed MSUs

• Which LPAR is contributing to the CPC max

• What workload on the LPAR

• Is it from a spike, and is the spike necessary?


Typical ISSUEs – 4HRA not when expected

BATCH is source of problem

– Started too early – averages w/ end of online peak usage

– Finishes too late – averages w/ start of online peak usage

Source of peak MSUs is the 4HRA

– Finishes with hours to spare and is not either above condition

• Needs to be controlled

– Drops, then peaks again, barely finishes on time

[Figure: MSU chart annotated with: daytime onlines peak 4HRA 56 MSUs; onlines volume complete; 10pm: why not started earlier?; MSUs to finish on time; 3:00AM; why not CAP to online peak?]


Batch needs 100% capacity to complete on time

Need to begin sooner

Shorten duration

– Make (batch, sorts, backups, DB reorgs, etc.) more efficient

– Use less GCP or use specialty engines

Delay subset

– Find discretionary batch / work that can complete after 8am

Batch starts 9pm, finishes 7:30am

Onlines start 8am

Need all MSUs to complete

Resource limiting not an option


Batch Multiple Peaks

Need to smooth and CAP

– Batch starts 7pm

– Drops at 8:15 as waiting for something

– Picks up at 9:30, drops off till midnight

– Small spike

– 3am something huge (reorgs / backups) creates 4hr MSU peak for billing

• Doesn’t run long; done 6:15am, ready for 7am online start

Resource limit second peak


Monitor / Manage / Plan - 4hr Avg MSU

Monitor demand and 4HRA MSUs

– Catch loops and anomalies to exclude from SCRT report

– Understand normal LPARs / CEC usage patterns

– Track current against normal 4hr MSU patterns

Manage potential SLA impact of capping

– Max MSUs consumed

– Average MSUs consumed

Plan your corrective actions

– Understand what to sacrifice to stay under expected max

• What actions can be taken to allow sacrifice


Capping types and Pro/Cons

Manage to current consumption

Can’t exceed even if no impact to 4HRA

PR/SM Initial Capping (aka Hard Capping)

– Relative to LPAR's current weight, # of CPs worth; doesn’t take “white space”

Absolute Capping (hard cap, new on EC12)

– Not relative to share; specified in terms of 1/100ths of a processor

WLM Resource Groups

– CPU SU/SEC a service class can use across the Sysplex

– Sysplex scope may not match the CPC or align with SCRT billing

Manage to 4hr MSU average

May exceed MSU cap until it impacts the 4HRA limit

4HRA not allowed to exceed cap

– Will never cap lower than Defined Capacity

– Even if 4HRA exceeds cap, not billed for it

Types

• Defined Capacity – soft capping; # of MSUs to give LPAR in 1 MSU increments

• Group Capacity – set capacity for a group of LPARs in a CEC to a subset of CEC capacity, but in PR/SM ratio entitlements


Growth LPAR – Rolling and Using

Most monitors will let you view a CEC's rolling 4hr MSU

But this is not enough

• Alarms – Not just 4HRA (too late); look at MSU spikes

• Actions – What on-the-fly actions are allowed?

• Impact SLAs?

Peak 4HRA is a long time in the making. What actions now?

[Figure: rolling 4hr MSU chart; the purple line was the 4HRA cap]


Ease of identifying drivers of 4HRA MSUs

Easier

Homogeneous software on LPARs

Very few LPARs

Very few LPARs / CEC

Hard

Heterogeneous software on LPARs

– New work on z LPARs just WAS and / or DB2

– IMS, CICS not on all LPARs

– DB2 not on all LPARs

– DB2 only LPARs (SAP)

Numerous LPARs with different time of day peaks and different application mixes, typically from large customers / acquisitions

– Banks

– Insurance

– Service Bureaus


Sub-capacity Pricing – Example Scenario

CPC Peak 4HRA: 1108 MSUs

LPAR1 Peak 4HRA: 550 MSUs

LPAR2 Peak 4HRA: 400 MSUs

LPAR3 Peak 4HRA: 450 MSUs

[Figure: Monthly MSU 4 Hour Rolling Average, MSU axis 0 to 1000, series LPAR1, LPAR2, LPAR3, Total CEC (LPARs 1-3)]

LPAR1 (Onlines): z/OS, CICS, DB2, IMS

LPAR2 (Onlines): z/OS, CICS, DB2, IMS

LPAR3 (Batch): z/OS, CICS, IMS, MQ

MQ: 450 MSUs

Largest LPAR peak not what is billed

Bill - z/OS, CICS, IMS: 1108 MSUs
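The scenario above can be sketched as a small calculation: for each product, sum the 4HRA of the LPARs where it runs hour by hour, then take the highest combined hour. The six-hour series below are hypothetical, chosen only so that the per-LPAR peaks (550, 400, 450) and the CPC peak (1108) match the slide; the DB2 figure that falls out is therefore illustrative, not a number from the presentation.

```python
# Hypothetical 4HRA values (MSUs) for six consecutive hours per LPAR.
lpar_4hra = {
    "LPAR1": [300, 550, 500, 420, 350, 300],
    "LPAR2": [250, 300, 400, 380, 300, 250],
    "LPAR3": [200, 200, 208, 300, 450, 400],
}

# Which LPARs run each product, as listed on the slide.
product_lpars = {
    "z/OS": ["LPAR1", "LPAR2", "LPAR3"],
    "CICS": ["LPAR1", "LPAR2", "LPAR3"],
    "IMS": ["LPAR1", "LPAR2", "LPAR3"],
    "DB2": ["LPAR1", "LPAR2"],
    "MQ": ["LPAR3"],
}


def billable_msus(lpars):
    """Peak of the hour-by-hour sum across the LPARs running a product."""
    per_hour = zip(*(lpar_4hra[name] for name in lpars))
    return max(sum(hour) for hour in per_hour)


bill = {product: billable_msus(lpars) for product, lpars in product_lpars.items()}
sum_of_lpar_peaks = sum(max(series) for series in lpar_4hra.values())

for product, msus in bill.items():
    print(f"{product}: {msus} MSUs")
print(f"sum of individual LPAR peaks: {sum_of_lpar_peaks} MSUs")
```

Because the LPARs peak in different hours, the combined peak is lower than the sum of the individual peaks, and a product confined to one LPAR is billed on that LPAR's peak alone.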


Need for LPAR and Workload balancing


4HRA Peaks at Same basic time on CEC777

Different peak times on CEC888


Complexity: different products on LPARs / CPCs

LPARs on different CPCs peak at different times

If subsystem MSUs not = z/OS MSUs, then not licensed on all LPARs

SCRT record type data

– Bill by product date / time

[Figure: charts for CPC 1 and CPC 2]


PR/SM – Easy changes lead to big rewards


Controlling 4HRA Monthly Max

Using PR/SM

– Capping

– Max share via #LPs

Using WLM (w/ PR/SM)

– Capping

• LPAR

• Service class – for lower importance work

– Discretionary can be source of 4HRA peak


CEC and LPAR – Rolling 4hr MSUs

Reality: workloads tend to consume 100% of capacity provided

LPARs configured for stealing for latent demand

Contribution by LPAR

4HRA free above the DC line

LPARs with latent demand expand to take up others' contractions


Logical / Physical – Guaranteed Minimum

Everyone's share < 10% of 16 CPs, so guaranteed < 1.6 CPs

Using white space / more than guaranteed share: impacting 4HRA?

If no LPARs have work to compete with production batch, it can exceed its guaranteed share, and therefore easily exceed (2x) the daytime 4HRA (3 vs 1.5 CPs)

Weights do not create a maximum other than 100% of # of LPs
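A minimal sketch of the guaranteed-minimum versus ceiling arithmetic described on this slide. It assumes the guaranteed share is the LPAR's weight as a fraction of the total weight times the physical CP count, and that an uncapped LPAR's ceiling is its logical CP count; the weights below are hypothetical, picked to reproduce the slide's 1.5 vs 3 CP comparison on a 16 CP box.

```python
def guaranteed_cps(weight, total_weight, physical_cps):
    """CPs an LPAR is entitled to when every LPAR is competing for CPU."""
    return weight / total_weight * physical_cps


def ceiling_cps(logical_cps, physical_cps):
    """Most CPs an uncapped LPAR can consume when nothing else competes."""
    return min(logical_cps, physical_cps)


PHYSICAL_CPS = 16

# Hypothetical LPAR: weight 75 out of 800 (under 10%), 3 logical CPs.
guaranteed = guaranteed_cps(75, 800, PHYSICAL_CPS)
ceiling = ceiling_cps(3, PHYSICAL_CPS)

print(f"guaranteed: {guaranteed} CPs, ceiling: {ceiling} CPs")
print(f"ceiling is {ceiling / guaranteed:.1f}x the guaranteed share")
```

With these assumed numbers the LPAR can run at twice its guaranteed share overnight, which is how uncontested batch can set a 4HRA well above the daytime level.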


Avoid Batch SLA impact

Reduce batch GCP CPU

– zIIP enabled sorts, reorgs, etc.

– Delta / changed backups vs full

– Reorgs by need instead of all

Balance LPARs across CECs

• Avoid simultaneous peaks

• Use scheduling environments

Reduce LPAR software footprint

• Consider batch only LPARs; eliminate non-essential software

Reduce batch CPU

Tune application and subsystems

• DB2 V10 zIIP enabled sequential prefetch (batch, online)

• Verify adequate zIIP capacity

Reduce batch elapsed time

Convert serial processing to parallel processing

• N-way data sharing (VSAM RLS)

• Batch pipes or equivalent


MSUs and Relative Nest Index (RNI)


High RNI = Higher MSU for same workload.

Drives demand MSUs, which drives 4HRA and MLC costs


Summary - New Rules of Thumb

Manage 4 hour rolling average MSU consumption without impacting Service Level Agreements

Don’t use CEC capacity if SLAs can be met

– Don’t always let unimportant workloads consume “spare” MIPS

• Group CAP box to production monthly peak + N% for SLA

– Let others consume 100% of what the bill will be anyway

– Be careful how many LPs you give an LPAR if there are no competing LPARs

– Balance LPARs across CECs if they peak at different times

Don’t let batch be the 4HRA peak by default

– If it can complete with an acceptable window for recovery

• Don’t start batch ASAP if it creates a 4HRA max

• Don’t let batch finish late if it creates a 4HRA max in the online window before the online peak

Don’t run software on all LPARs

– If not needed, since it is billed even if not used


Discussion Topic

Mobile Workload Pricing

– 60% discount on CICS, IMS, DB2, MQ, WAS MSUs if customer can prove request initiated from a “mobile” device (phone, tablet)

• So if the same URL is used for “mobile” and PC and you can’t tell the difference, you can’t get the discount

– Do you currently track “mobile” MSU consumption?

• How do you do it?

On May 6, 2014 IBM announced a new reporting tool (MWRT), available June 30th, allowing workload on z/OS from mobile devices (cell / tablets) to pay 60% reduced MLC

Replaced w/ new Java-based SCRT tool

WLM w/ CICS & IMS support MWLC via SMF70s and 72s
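As a rough sketch of what the 60% figure means for a single hour, the snippet below subtracts 60% of the mobile-attributed MSUs from the hour's total. This is an illustration of the discount arithmetic only, under the assumption that the mobile share can be measured; the actual MWRT / SCRT calculation may differ in detail, and the function name and figures are hypothetical.

```python
MOBILE_DISCOUNT = 0.60  # reduction quoted on the slide


def adjusted_msus(total_msus, mobile_msus, discount=MOBILE_DISCOUNT):
    """MSUs left for billing after discounting the mobile-initiated portion."""
    if not 0 <= mobile_msus <= total_msus:
        raise ValueError("mobile MSUs must be between 0 and the total")
    return total_msus - discount * mobile_msus


# Hypothetical hour: 1000 MSUs in total, 200 of them provably mobile.
print(adjusted_msus(1000, 200))
```

If mobile and PC requests cannot be told apart, the mobile figure is effectively zero and the adjusted value equals the total, which is the point the slide makes about shared URLs.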


Bring IT to Life.™

Thank You


Contact: [email protected]