Scaling the Cloud - Cloud Security

DESCRIPTION

Presented at ISSA CISO Executive Forum 2012. Comments/Questions: bill.burns@netflix.com. (3/8: Replaced Keynote with PDF version for compatibility.) An adjunct to Jason Chan's Practical Cloud Security presentation: http://www.slideshare.net/jason_chan/practical-cloud-security

Scaling the Cloud

Bill Burns, Sr. Manager, Networks & Security

CISO Executive Forum, February 26, 2012

Thursday, March 8, 12

Agenda

• Netflix Background and Culture

• Why We Moved to the Cloud

• InfoSec Challenges in an IaaS Cloud

• InfoSec Perspective: Running In The Cloud

Netflix Business

• 24+ million members globally

• Streaming in 47 countries

• Watch on more than 700 devices

• 33% of US peak evening Internet traffic

(c) 2011 Sandvine

Background and Context

• High Performance Culture

• Fail Fast, Learn Fast ... Get Results

• Core Value: “Freedom & Responsibility”

Engineering-Centric Culture

• Sought the Cloud for Availability, Capacity

• ...and also found Agility

• DevOps / NoOps means engineering teams own:

  • New deployments and upgrades

  • Capacity planning & procurement

Freedom & Responsibility

Why Cloud?

• Transforming Netflix’s Core Business

• Availability, Capacity, Consistency

• Lower operational effort

• Mission Focus

• Agility

Demand vs Capacity

37x growth in 13 months

Data Center Capacity

Cloud: On-Demand Capacity

1. Demand: Customer requests typically rise and fall over time

2. Reaction: The system automatically adds servers to, or removes them from, the application pool

3. Result: Overall utilization stays constant

[Figure panels: (1) Demand, (2) # Servers, (3) Utilization]
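The demand, reaction, result loop above can be sketched in a few lines. This is only an illustration, not Netflix's scaling logic: the function names, the per-server capacity of 100 requests per second, and the 60% utilization target are all assumptions made for the example.

```python
def servers_needed(load, per_server_capacity, target_pct):
    """Smallest pool that keeps utilization at or below target_pct percent."""
    if load <= 0:
        return 1  # assumption: never scale the pool to zero
    usable = per_server_capacity * target_pct   # usable capacity per server, x100
    return max(1, -(-load * 100 // usable))     # integer ceiling division


def simulate(demand, per_server_capacity=100, target_pct=60):
    """Return a (servers, utilization) pair for each demand sample."""
    timeline = []
    for load in demand:
        servers = servers_needed(load, per_server_capacity, target_pct)
        utilization = load / (servers * per_server_capacity)
        timeline.append((servers, utilization))
    return timeline


if __name__ == "__main__":
    demand = [600, 1200, 2400, 1200, 600]  # hypothetical requests per second
    for load, (servers, utilization) in zip(demand, simulate(demand)):
        print(f"demand={load:5d}  servers={servers:3d}  utilization={utilization:.2f}")
```

As demand doubles and halves, the pool size follows it while utilization stays at or just under the target, which is the behavior the three panels describe.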

InfoSec Challenges In An IaaS Cloud

• Utility

• Authenticity

• Possession

• Confidentiality

• Integrity

• Availability

InfoSec Challenge in an IaaS Cloud :: Confidentiality

InfoSec Challenge in an IaaS Cloud :: Integrity

InfoSec Challenge in an IaaS Cloud :: Availability

InfoSec Challenge in an IaaS Cloud :: Possession/Control

InfoSec Challenge in an IaaS Cloud :: Authenticity

Running In The Cloud :: InfoSec Perspective

InfoSec In The Cloud :: Harder

1. “Your host attacked me yesterday. Please stop!”

2. Dealing with other people’s traffic at your front door

3. Herding ephemeral instances with vendor applications

4. Trusting endpoints, infrastructure

5. Key management

InfoSec In The Cloud :: Easier

1. Reacting to business velocity

2. Detecting instance changes

3. Application ownership, management

4. Patching, updating

5. Availability, in a failure-prone environment

6. Embedding security controls

7. Least privilege enforcement

8. Testing/auditing for conformance

9. Consistency, conformity in build and launch

Old IT way: Hand-Crafted configuration

(C) courtesy: Flickr (piper, viamoi)

New: Automation

Change Controls :: Patching

• Goal: Running instances do not get patched

• Alternative:

  • Bake a new AMI for any change

  • Launch new instances in parallel

  • Kill the old instances

Change Controls :: Upgrades

• Bake a new AMI for any change

• Launch new instances in parallel

• Kill the old instances

Lesson Learned: Make the secure, consistent behavior the easier alternative.
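The bake, launch, kill sequence used for both patching and upgrades can be sketched as a small in-memory model. Nothing here calls a real cloud API: `Instance`, `Pool`, `bake_image`, and `rolling_replace` are hypothetical names for illustration, and the image names are made up.

```python
import itertools
from dataclasses import dataclass, field

_ids = itertools.count(1)


@dataclass
class Instance:
    image: str
    instance_id: int = field(default_factory=lambda: next(_ids))


@dataclass
class Pool:
    instances: list


def bake_image(base_image, change):
    """Model baking: every change yields a new, distinct image name."""
    return f"{base_image}+{change}"


def rolling_replace(pool, new_image):
    """Launch replacements alongside the old set, then retire the old set."""
    old = list(pool.instances)
    new = [Instance(new_image) for _ in old]   # launch new instances in parallel
    pool.instances = old + new                 # both generations coexist briefly
    pool.instances = [i for i in pool.instances if i.image == new_image]  # kill old
    return old


if __name__ == "__main__":
    pool = Pool([Instance("base-v1") for _ in range(3)])
    patched = bake_image("base-v1", "openssl-patch")   # hypothetical change
    retired = rolling_replace(pool, patched)
    print([i.image for i in pool.instances], "retired:", len(retired))
```

No running instance is ever modified in this model; a patch only exists as a new image, which is one way to make the consistent behavior the easy path.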

Availability :: Never Launch One of Anything

• Chaos Monkey induces failures, helps us practice recovery

• Balance across Availability Zones

• Applications automatically scale out, regenerate

• Conformity Monkey detects differences, improper settings

(c) Courtesy Flickr - Winton
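A toy simulation can show how induced failures, zone balancing, and regeneration fit together. It is a sketch under stated assumptions, not the real Chaos Monkey: the zone names, the group size of six, and the rule of launching into the least-loaded zone are all invented for the example.

```python
import random
from collections import Counter

ZONES = ["zone-a", "zone-b", "zone-c"]  # hypothetical zone names


def zone_counts(group):
    """Count instances per zone, including zones that are currently empty."""
    counts = Counter({zone: 0 for zone in ZONES})
    counts.update(group.values())
    return counts


def launch(group, instance_id):
    """Place a new instance in the least-loaded zone to keep the group balanced."""
    counts = zone_counts(group)
    group[instance_id] = min(ZONES, key=lambda zone: counts[zone])


def kill_random_instance(group, rng):
    """Induce a failure by terminating one randomly chosen instance."""
    victim = rng.choice(sorted(group))
    del group[victim]
    return victim


def heal(group, desired, next_id):
    """Regenerate instances until the group is back at its desired size."""
    while len(group) < desired:
        launch(group, f"i-{next_id}")
        next_id += 1
    return next_id


if __name__ == "__main__":
    rng = random.Random(7)
    group = {}
    next_id = heal(group, desired=6, next_id=1)
    for _ in range(10):
        kill_random_instance(group, rng)
        next_id = heal(group, desired=6, next_id=next_id)
    print(dict(zone_counts(group)))
```

Because the group never holds just one of anything and always refills the emptiest zone, each induced failure is absorbed and the zone balance is restored.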

Identity Challenges :: Vendors Lagging

• Cloud instances are ephemeral

• Customers cannot necessarily pick their IP addresses, ranges

• Instances need to base context on apps, services, tagging (not IPs)

• Vendors need better support for ephemeral licensing, stateless instances, self-config

• Machine capacity is no longer a CapEx friction item.
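One way to picture identity based on apps and tags rather than IPs is a policy lookup keyed on tags. The sketch below is illustrative only: the `app` tag, the allowed flows, and the instance records are hypothetical and do not describe any particular vendor or provider feature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceIdentity:
    instance_id: str
    ip_address: str    # ephemeral: changes whenever the instance is replaced
    tags: frozenset    # set of (key, value) pairs


def make_instance(instance_id, ip_address, **tags):
    return InstanceIdentity(instance_id, ip_address, frozenset(tags.items()))


def tag(instance, key):
    return dict(instance.tags).get(key)


# Hypothetical policy: which application may talk to which.
ALLOWED_FLOWS = {("web", "api"), ("api", "db")}


def is_allowed(src, dst):
    """Decide on application tags only; IP addresses are never consulted."""
    return (tag(src, "app"), tag(dst, "app")) in ALLOWED_FLOWS


if __name__ == "__main__":
    api = make_instance("i-1", "10.0.0.5", app="api")
    db = make_instance("i-2", "10.0.1.9", app="db")
    replacement_api = make_instance("i-3", "10.0.7.42", app="api")  # new IP, same role
    print(is_allowed(api, db), is_allowed(replacement_api, db), is_allowed(db, api))
```

A replacement instance with a different address gets the same decision as the one it replaced, which is the property IP-based rules cannot offer when instances are ephemeral.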

Conformity & Consistency

Automation = Conformity & Consistency

• All apps, tiers are Highly Available

• Secure defaults applied automatically

• Replacement instances look just like the originals

Baked-In Security Controls :: Netflix Simian Army

• Cloud Ready Dashboard

• Identify and test common failure modes

• Continuous, aggressive monitoring, testing

• Mostly opt-in

• Chaos Monkey - Randomly kills instances

• Conformity Monkey - Various policy checks

• Latency Monkey - Induces random latency

• Janitor Monkey - Kills orphaned instances

• Security Monkey - Various security checks

• Exploit Monkey - Vuln scans / pen tests

• Unnamed - File integrity monitoring, HIDS
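A policy check in the spirit of the Conformity Monkey might look like the sketch below. The slides do not say which policies are checked, so the two rules here (at least two instances per app, spread over at least two zones) and the inventory format are assumptions chosen to match the "never launch one of anything" guidance.

```python
from collections import defaultdict


def conformity_findings(instances, min_instances=2, min_zones=2):
    """Return (app, problem) pairs for every app that violates the policy."""
    groups = defaultdict(list)
    for instance in instances:
        groups[instance["app"]].append(instance)

    findings = []
    for app, members in sorted(groups.items()):
        zones = {member["zone"] for member in members}
        if len(members) < min_instances:
            findings.append((app, f"fewer than {min_instances} instances"))
        if len(zones) < min_zones:
            findings.append((app, f"fewer than {min_zones} zones"))
    return findings


if __name__ == "__main__":
    inventory = [  # hypothetical inventory records
        {"id": "i-1", "app": "web", "zone": "zone-a"},
        {"id": "i-2", "app": "web", "zone": "zone-b"},
        {"id": "i-3", "app": "batch", "zone": "zone-a"},
    ]
    for app, problem in conformity_findings(inventory):
        print(f"{app}: {problem}")
```

Running such a check continuously against the live inventory, rather than at audit time, is what turns a written policy into an embedded control.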

Embedded Security Controls

• Controls baked into the “base AMI”

• Controls placed near the data

• Applied as machines die and are reborn

• Security controls are “Data Center agnostic”

• Provide a “single pane of glass” awareness

• Span all regions, data centers

CISO Forum Take-Aways

1. The public cloud / IaaS is not just a technology.

2. Cloud IaaS is disruptive to Operations, Engineering, Vendors, Auditors.

3. Your Data is your new perimeter.

4. Design for failures in everything.

5. IaaS providers care about their infrastructure.

6. Public cloud Information Security is still about the basics, but in a new context.

7. There’s still plenty left to resolve, like trusted infrastructure, strong key management, COTS support.

Questions
