@jonahhorowitz Production Ready Services Increasing Microservice Availability at Netflix

Production Ready Services at Netflix

Page 1: Production Ready Services at Netflix

@jonahhorowitz

Production Ready Services

Increasing Microservice Availability at Netflix

Page 2: Production Ready Services at Netflix

Netflix Architecture

Page 3: Production Ready Services at Netflix

Page 4: Production Ready Services at Netflix

Determining Microservice Availability

Service A Service B

?
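
The slide asks how available Service B is from Service A's point of view but does not say how the number is calculated. One common approach, shown here only as a minimal sketch, is to divide the calls that succeeded by all calls observed. The function name and the request counts are hypothetical and are not taken from the talk.

```python
def availability(successes: int, failures: int) -> float:
    """Fraction of calls from Service A to Service B that succeeded."""
    total = successes + failures
    if total == 0:
        # With no traffic there is nothing to measure.
        raise ValueError("no calls observed")
    return successes / total


# Hypothetical counts for calls from Service A to Service B.
print(f"{availability(999_500, 500):.4%}")
```

Measuring from the caller's side captures failures the callee never sees, such as timeouts and connection errors.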

Page 5: Production Ready Services at Netflix

Microservice Availability Report

Page 6: Production Ready Services at Netflix

Microservice Availability Report

Page 7: Production Ready Services at Netflix

Production Ready

Page 11: Production Ready Services at Netflix

Characteristics of a Production Ready Service

● Reliable
  ○ Chaos Monkey Enabled
  ○ Automated Deployment Pipelines
  ○ Automated Canary Analysis
● Scalable
  ○ Proactive autoscaling
  ○ Reactive autoscaling
● Performant
  ○ Automated Canary Analysis
  ○ Proper Instance Type
  ○ Well-Tuned GC
  ○ Well-Tuned connections, threads (ezconfig)
● Monitored
  ○ High Quality Alerts
    ■ Upstream Failure %
    ■ Downstream Failure %
  ○ Dashboards
    ■ Monitoring Releases
    ■ Troubleshooting
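
The "High Quality Alerts" item names two signals, Upstream Failure % and Downstream Failure %, without defining them. The sketch below assumes each is a failed-request percentage over some window and that an alert fires when it crosses a threshold. The threshold, the minimum request count, and all function names are illustrative assumptions, not values from the talk.

```python
def failure_percent(failed: int, total: int) -> float:
    """Percentage of requests in a window that failed."""
    if total == 0:
        return 0.0
    return 100.0 * failed / total


def should_alert(failed: int, total: int,
                 threshold_percent: float = 1.0,
                 min_requests: int = 100) -> bool:
    """Alert only when there is enough traffic to trust the percentage."""
    if total < min_requests:
        return False
    return failure_percent(failed, total) > threshold_percent


# Hypothetical counts for one window.
print(should_alert(failed=30, total=2000))  # 1.5% failures
print(should_alert(failed=5, total=2000))   # 0.25% failures
```

The minimum request count is one way to keep a handful of failures on a quiet service from paging anyone.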

Page 12: Production Ready Services at Netflix

Team Engagements

Page 13: Production Ready Services at Netflix

Drill Sergeant

Automated Checks

● Deployment Pipelines
● Operating System
● Canary Analysis
● Chaos Monkey

Manual Checks

● Alerts
● Dashboards
● Autoscaling Rules (for now)
● Java/Tomcat/Apache Tuning (for now)
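
The slide lists what Drill Sergeant checks but not how it works. As a rough illustration of the automated/manual split, the sketch below runs the four automated checks against a service record and reports which ones fail. The `Service` fields and the meaning of each check (for example, that "Operating System" means the OS is current) are assumptions made for this example and do not describe the real tool.

```python
from dataclasses import dataclass


@dataclass
class Service:
    # Hypothetical fields; the real tool's data model is not shown in the talk.
    name: str
    has_deployment_pipeline: bool
    os_is_current: bool
    canary_analysis_enabled: bool
    chaos_monkey_enabled: bool


# Automated checks from the slide, mapped to assumed predicates.
AUTOMATED_CHECKS = {
    "Deployment Pipelines": lambda s: s.has_deployment_pipeline,
    "Operating System": lambda s: s.os_is_current,
    "Canary Analysis": lambda s: s.canary_analysis_enabled,
    "Chaos Monkey": lambda s: s.chaos_monkey_enabled,
}

# Checks the slide says are still done by hand.
MANUAL_CHECKS = [
    "Alerts",
    "Dashboards",
    "Autoscaling Rules",
    "Java/Tomcat/Apache Tuning",
]


def failed_checks(service: Service) -> list[str]:
    """Names of automated checks the service does not pass."""
    return [name for name, check in AUTOMATED_CHECKS.items()
            if not check(service)]


svc = Service("example-service", True, True, False, True)
print("Failed automated checks:", failed_checks(svc))
print("Still needs manual review:", MANUAL_CHECKS)
```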

Page 14: Production Ready Services at Netflix

Jonah Horowitz
Senior Site Reliability Engineer

@jonahhorowitz

https://netflix.github.io/
https://jobs.netflix.com/

Velocity Ignite Talk tomorrow: 7:00pm – 8:30pm, Santa Clara Convention Center