Budapest University of Technology and Economics, Department of Measurement and Information Systems. Risk assessment based cloudification. Szilárd Bozóki, Gábor Koronka. Supervisor: Prof. András Pataricza, Department of Measurement and Information Systems

Risk Assessment Based Cloudification


Page 1: Risk Assessment Based Cloudification

Budapest University of Technology and Economics
Department of Measurement and Information Systems

Risk assessment based cloudification

Szilárd Bozóki, Gábor Koronka

Supervisor: Prof. András Pataricza
Department of Measurement and Information Systems

Page 2: Risk Assessment Based Cloudification


Cloud computing around the globe

Source: Cisco Global Cloud Index: Forecast and Methodology 2013–2018

Page 3: Risk Assessment Based Cloudification

Mission critical cloud computing

August 31, 2015
  o Federal Aviation Administration
  o $108 million now, $1 billion in 10 years
  o Source: CSC news

Network Functions Virtualization
  o The "telco" cloud
  o Source: NFV

Page 4: Risk Assessment Based Cloudification

Problem statement

Moving to the cloud, cloudification:
  o "How large a pool of VMs is needed?"
  o It depends…
    • SLA?
    • Application and platform characteristics

Aim: mission critical and high value applications

Approach:
  o Modeling distinctive cloud features
  o Revisit decades old modeling and fault tolerance techniques leveraging distinctive cloud features

Page 5: Risk Assessment Based Cloudification

From dependability to risk

A definition of dependability
  o "the ability of a system to avoid service failures that are more frequent or more severe than is acceptable."

Core idea:
  o Mission insurance ? mission incident probability * value of asset
  o e.g. $3 (<, > or =) 50% * $10
  o Expected value of loss (risk): $5
  o While insurance < expected value of loss: good insurance

Insurance: investment in redundancy
  o Diminishing returns
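The insurance comparison on this slide can be checked with a few lines of Python. This is a minimal sketch: the function names `expected_loss` and `is_good_insurance` are illustrative and not from the slides; the numbers are the slide's own example.

```python
def expected_loss(incident_probability: float, asset_value: float) -> float:
    """Risk as the expected value of loss: probability times value at stake."""
    return incident_probability * asset_value


def is_good_insurance(premium: float, incident_probability: float,
                      asset_value: float) -> bool:
    """Insurance pays off while it costs less than the expected loss."""
    return premium < expected_loss(incident_probability, asset_value)


# Slide example: 50% incident probability on a $10 asset, $3 insurance.
print(expected_loss(0.5, 10.0))            # 5.0
print(is_good_insurance(3.0, 0.5, 10.0))   # True
```

A premium of $6 against the same $5 expected loss would fail the test, which is where spending more on redundancy stops paying off.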

Page 6: Risk Assessment Based Cloudification

Risk interpretation

Risk: the expected value of loss due to potential service outages
  o Failure frequency
    • Cloud fault model
      – Fault tree, or reliability block diagram
  o Severity
    • Cost of application downtime

Risk mitigation
  o Design for resiliency, cloudification
  o Reduction of failure frequency or severity
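One possible way to write the frequency and severity decomposition as a formula is sketched below. The symbols are assumptions introduced here, not taken from the slides: $U$ is the unavailability obtained from the fault model, $T$ the observation period, and $c$ the cost of one unit of downtime.

```latex
\[
  \text{Risk} \;=\; \mathbb{E}[\text{loss}]
  \;=\; \underbrace{U \cdot T}_{\text{expected downtime}}
  \cdot \underbrace{c}_{\text{cost per unit of downtime}}
\]
```

With purely illustrative numbers, $U = 10^{-3}$, $T = 8760$ hours and $c = \$1000$ per hour give an expected downtime of 8.76 hours and a risk of $8760 per year. Mitigation then lowers either $U$ (failure frequency) or $c$ (severity).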

Page 7: Risk Assessment Based Cloudification

Critical services over ordinary clouds?

Environment
  o HW/SW stack
  o Cloud service models

Research objective
  o Carrier grade IaaS
  o Fast resiliency (SDN)
  o Redundancy architectural pattern
    • HW, VM, App, else
  o Measures
    • Availability (downtime)
    • Cost

[Figure: HW/SW stack (Hardware, VMM (Hypervisor) + optional Host OS, VM, Guest OS, Container, Application) with the scopes of the IaaS, PaaS and SaaS providers; redundancy at hardware level (HW 1, HW 2, HW 3), replication at VM level (VM 1, VM 2, VM 3) and at application level (APP 1, APP 2, APP 3)]

Page 8: Risk Assessment Based Cloudification

Physical cloud model

A distinctive cloud feature to begin with.

[Figure: physical cloud layout with Client, Internet, Data Center, Availability Zone, HW cluster, HW server and VM]

Page 9: Risk Assessment Based Cloudification

Related work

SLA requirement from provider view
  o Provider manages and optimizes the SLA portfolio

Concrete application specific cloud user view
  o Concentrates on queuing and scheduler based job execution service

Deploying a scientific grid on a private cloud
  o Profitability analysis

Our focus: user view on an IaaS public cloud

Page 10: Risk Assessment Based Cloudification

Cost resilience trade-off

In order to reduce total cost, how many redundant virtual machines are needed?

Risk
  o Failure frequency
    • Cloud fault model based on the physical layout of IaaS
  o Severity
    • Cost of application downtime (SLA)

Risk mitigation: failure frequency reduction through redundant VMs (hot-running)

Cost: overhead of VMs (VM price/hour)
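The trade-off can be sketched numerically: more hot-running VMs raise the rental bill but shrink the expected downtime cost. The sketch below is not the authors' model. It assumes that VMs fail independently and that the service is down only when all of them are down, and every number in the usage example is hypothetical.

```python
HOURS_PER_YEAR = 8760


def annual_total_cost(n_vms, vm_price_per_hour, p_vm_down, downtime_cost_per_hour):
    """Annual VM rental cost plus expected annual downtime cost."""
    vm_cost = n_vms * vm_price_per_hour * HOURS_PER_YEAR
    # Assumption: independent VM failures, service down only if all VMs are down.
    p_system_down = p_vm_down ** n_vms
    expected_downtime_cost = p_system_down * HOURS_PER_YEAR * downtime_cost_per_hour
    return vm_cost + expected_downtime_cost


def cheapest_vm_count(max_vms, vm_price_per_hour, p_vm_down, downtime_cost_per_hour):
    """Return the VM count in 1..max_vms with the lowest annual total cost."""
    return min(
        range(1, max_vms + 1),
        key=lambda n: annual_total_cost(
            n, vm_price_per_hour, p_vm_down, downtime_cost_per_hour
        ),
    )


# Hypothetical inputs: $0.10 per VM hour, each VM down 1% of the time,
# $10,000 lost per hour of application downtime.
print(cheapest_vm_count(8, 0.10, 0.01, 10_000))  # 3
```

Beyond the optimum, each extra VM costs more in rental than it saves in expected downtime, which is the diminishing return of redundancy.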

Page 11: Risk Assessment Based Cloudification

Abstract cloud model based on the physical cloud

Level 1: Client
Level 2: Region
Level 3: AV zone
Level 4: VM

[Figure: abstract cloud model; the elements are labeled with the functions $f_1(t), f_2(t), f_3(t), f_4(t)$, $g_1(t), g_2(t), g_3(t)$ and $h_1(t), h_2(t)$]

Page 12: Risk Assessment Based Cloudification

Top level fault tree - basic model

[Figure: fault tree; top event "System down" (all paths down); each "Path down" event has the inputs "Region down", "AV zone down" and "VM down"]

Level 1: Client
Level 2: Region
Level 3: AV zone
Level 4: VM
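Under the usual fault tree reading, a path is down if any element on it is down (OR gate) and the system is down if all paths are down (AND gate). The formula below is a sketch under that reading. It additionally assumes statistically independent events and paths that share no region or zone. The symbols $p_R$, $p_Z$, $p_V$ (probabilities that a region, zone or VM is down) and $n$ (number of paths) are introduced here for illustration.

```latex
\[
  P(\text{path down}) = 1 - (1 - p_R)(1 - p_Z)(1 - p_V),
  \qquad
  P(\text{system down}) = \prod_{i=1}^{n} P(\text{path}_i \text{ down})
\]
```

When paths share a region or an availability zone, the product no longer holds and the tree has to be evaluated level by level.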

Page 13: Risk Assessment Based Cloudification

N+7 replication based redundancy - basic model

[Figure: fault tree, structure listed below]

System failure (all paths down)
  Region 1 path fail
    Region 1 fail
    Total AVzone path fail
      AVzone 1 path fail
        AVZone 1 fail
        All VMs fail
          VM 1 fail
          VM 2 fail
      AVzone 2 path fail
        AVZone 2 fail
        All VMs fail
          VM 1 fail
          VM 2 fail
  Region 2 path fail
    Region 2 fail
    Total AVzone path fail
      AVzone 1 path fail
        AVZone 1 fail
        All VMs fail
          VM 1 fail
          VM 2 fail
      AVzone 2 path fail
        AVZone 2 fail
        All VMs fail
          VM 1 fail
          VM 2 fail

Level 1: Client
Level 2: Region
Level 3: AV zone
Level 4: VM
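The replication tree on this slide can be evaluated bottom-up. The sketch below is one reading of it, not the authors' tool: "All VMs fail", "Total AVzone path fail" and "System failure" are treated as AND gates, the "path fail" events as OR gates, and all basic events are assumed independent with identical probabilities per level. Function and parameter names are illustrative.

```python
import math


def p_all(probabilities):
    """AND gate: every input event occurs (independent events)."""
    return math.prod(probabilities)


def p_any(probabilities):
    """OR gate: at least one input event occurs (independent events)."""
    return 1.0 - math.prod(1.0 - p for p in probabilities)


def system_failure(p_region, p_zone, p_vm,
                   regions=2, zones_per_region=2, vms_per_zone=2):
    """Probability of the top event for a symmetric region/zone/VM layout."""
    all_vms_fail = p_all([p_vm] * vms_per_zone)
    zone_path_fail = p_any([p_zone, all_vms_fail])
    total_zone_path_fail = p_all([zone_path_fail] * zones_per_region)
    region_path_fail = p_any([p_region, total_zone_path_fail])
    return p_all([region_path_fail] * regions)


# Hypothetical basic-event probabilities, for illustration only.
print(system_failure(p_region=0.001, p_zone=0.01, p_vm=0.1))
```

Evaluating level by level keeps shared elements correct: a region failure takes down every zone and VM below it exactly once, instead of being counted separately per path.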

Page 14: Risk Assessment Based Cloudification

Variance of resource quality

A distinctive cloud feature with impact.

If any resource underperforms, the VM counts as down
  o Interference
  o Noisy neighbors

[Figure: extended fault tree at Level 4 (VM); the event "VM down" has the inputs "VM down", "Compute down" and "Network down"]
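A minimal sketch of how this extension could enter the numbers: estimate how often a resource underperforms from measurements, then combine it with the plain VM failure probability through an OR gate. The threshold, the sample values and both function names are hypothetical, and independence of the three events is an assumption.

```python
def fraction_above(samples, threshold):
    """Share of measurements that exceed the threshold (empirical probability)."""
    if not samples:
        raise ValueError("need at least one sample")
    return sum(1 for s in samples if s > threshold) / len(samples)


def p_vm_down_extended(p_vm_fail, p_compute_slow, p_network_slow):
    """VM counts as down if it fails or if compute or network underperforms."""
    p_up = (1.0 - p_vm_fail) * (1.0 - p_compute_slow) * (1.0 - p_network_slow)
    return 1.0 - p_up


# Hypothetical response times in milliseconds, not measured data.
network_ms = [20, 22, 21, 250, 23, 19, 24, 300, 22, 21]
p_network_slow = fraction_above(network_ms, threshold=100)
print(p_network_slow)  # 0.2
print(p_vm_down_extended(0.01, 0.05, p_network_slow))
```

Because the extended event can only be more likely than the basic one, the same number of VMs yields lower availability in the extended model.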

Page 15: Risk Assessment Based Cloudification


Network resource quality variation

Source: Gorbenko, A., Kharchenko, V., Mamutov, S., Tarasyuk, O., Romanovsky, A. “Exploring Uncertainty of Delays as a Factor in End-to-End Cloud Response Time.” 2012

Page 16: Risk Assessment Based Cloudification


Compute resource quality variation

Source: Gorbenko, A., Kharchenko, V., Mamutov, S., Tarasyuk, O., Romanovsky, A. “Exploring Uncertainty of Delays as a Factor in End-to-End Cloud Response Time.” 2012

Page 17: Risk Assessment Based Cloudification

Results – availability (number of 9s)

[Figure: number of 9s (y axis, 0 to 12) versus number of VMs (x axis, 0 to 7); series: A basic, A extended]
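The "number of 9s" on the y axis is the usual logarithmic way of stating availability: 0.999 is three 9s. The conversion is sketched below; the function names are illustrative, and the yearly downtime figure assumes a year of 8760 hours.

```python
import math

HOURS_PER_YEAR = 8760


def number_of_nines(availability: float) -> float:
    """Availability expressed as a number of 9s, e.g. 0.999 gives about 3."""
    if not 0.0 <= availability < 1.0:
        raise ValueError("availability must be in [0, 1)")
    return -math.log10(1.0 - availability)


def annual_downtime_hours(availability: float) -> float:
    """Expected hours of downtime per year for a given availability."""
    return (1.0 - availability) * HOURS_PER_YEAR


print(round(number_of_nines(0.999), 6))        # 3.0
print(round(annual_downtime_hours(0.999), 2))  # 8.76
```

Each additional 9 divides the expected downtime by ten, which is why the cost plots use a logarithmic axis.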

Page 18: Risk Assessment Based Cloudification

Results – availability (number of 9s)

[Figure: number of 9s (y axis, 0 to 12) versus number of VMs (x axis, 0 to 7); series: A basic, A extended, B basic, B extended, C basic, C extended]

Page 19: Risk Assessment Based Cloudification

Results – annual expected total cost

[Figure: annual total cost (y axis, logarithmic, 100.00 to 100,000,000.00) versus number of VMs (x axis, 0 to 7); series: A basic model, A extended]

Page 20: Risk Assessment Based Cloudification

Results – annual expected total cost

[Figure: annual total cost (y axis, logarithmic, 100.00 to 100,000,000.00) versus number of VMs (x axis, 0 to 7); series: A basic model, A extended, B basic, B extended, C basic, C extended]

Page 21: Risk Assessment Based Cloudification

Concluding remarks

Aim: mission critical and high value applications

Modeling distinctive cloud features
  o Risk based interpretation of dependability
  o Fault tree based on the physical cloud

Revisit decades old modeling and fault tolerance techniques leveraging distinctive cloud features
  o Redundant VMs on the cloud

Numerical analysis
  o Redundant VMs, N+(4-6 VMs): extra availability, reduction of risk, reduction of total cost

Future work
  o Model refinement
  o Data acquisition, measurement
  o Dynamic replication for critical mission phases