Pre-Cloud
Traditional DC
N-Tier Web Applications
Relational Databases
Java, Linux, Apache
Cloud
Cloud infrastructure - AWS
NoSQL
High Availability
Scale
Performance
Bitbucket At Netflix
Total LOC: 1,610,528,526
Total Commits: 5,554,379
Builds / Day: 7,132
Jenkins Slaves: 335
Developers: 950
Cloud Infrastructure
Amazon Machine Image (AMI)
Elastic Compute Cloud (EC2) Instance
Elastic Block Storage Volume (EBS)
Relational Database (RDS)
Elastic Load Balancer (ELB)
Amazon Machine Image (AMI)
1. Stock Atlassian AMI
nginx, Stash, PostgreSQL
setups & configs
2. Custom AMI
Apache/Tomcat
Metrics, Sidecars
Organizational
How to Make?
Cloud Infrastructure
EC2 Instance: c3.8xlarge, 32 vCPU, 60 GB RAM, 2x320 GB SSD ephemeral
EBS Volume: General Purpose GP2, 1 TB, Stash Home
RDS: db.m3.xlarge, 4 vCPU, 16 GB RAM, 100 GB storage
ELB, DNS
Auto Scaling Group?
Easy, drop-in metrics
Completely stand-alone WAR
CPU, memory, sessions, JDBC
Built-in charting
Easy, drop-in metrics
Completely stand-alone WAR
System (CPU, etc.), HTTP sessions
Built-in charting
Prana
Sidecar “platform”
Standalone JVM
Logstash, Elasticsearch, Kibana
Ship and index log files
Visualize with Kibana
Response times, error rates
Bitbucket DIY Backup Scripts
1. EBS Snapshots
2. Database backups*
RDS Database
*EBS Instance only
custom scripting
maint mode: short, < 30s
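The backup flow on this slide can be sketched roughly as follows. This is only an illustration, not the actual DIY backup scripts: the four callables (`lock`, `unlock`, `snapshot_ebs`, `snapshot_rds`) are hypothetical stand-ins for whatever puts Stash into maintenance mode and starts the EBS and RDS snapshots. The sketch assumes the lock is only needed while the snapshots are being started, which is what would keep the maintenance window short.

```python
import time
from typing import Callable, Dict


def run_backup(
    lock: Callable[[], None],
    unlock: Callable[[], None],
    snapshot_ebs: Callable[[], str],
    snapshot_rds: Callable[[], str],
) -> Dict[str, object]:
    """Start both snapshots while the application is locked, then unlock.

    All four callables are hypothetical hooks supplied by the caller.
    """
    started = time.monotonic()
    lock()  # enter maintenance mode so home directory and database stay in sync
    try:
        ebs_id = snapshot_ebs()  # start the EBS snapshot of the home volume
        rds_id = snapshot_rds()  # start the snapshot of the RDS database
    finally:
        unlock()  # always leave maintenance mode, even if a snapshot call raised
    return {
        "ebs_snapshot": ebs_id,
        "rds_snapshot": rds_id,
        "locked_seconds": time.monotonic() - started,
    }
```

Passing the steps in as callables keeps the ordering logic testable without touching AWS; the returned `locked_seconds` value can be compared against the 30 second target.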
Metrics, Monitoring, Backups
Load Testing
2. Bake an AMI (Aminator)
3. Create Launch Configs and ASG
4. Scale up ASG to generate load
Lessons Learned
RDS Manual Snapshot Limit
max 50 snapshots
snapshot error -> kept Stash in maint mode
backup script more resilient to errors
Janitor Monkey
Janitor cleans unused AWS infrastructure
new rule to clean old RDS snapshots
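A cleanup rule of this kind could look something like the sketch below. It is not the Janitor Monkey rule itself: the snapshot records are plain dictionaries with hypothetical `id`, `type`, and `created` keys, and the `headroom` parameter is an assumption added so that the next backup still fits under the limit of 50 mentioned above.

```python
from datetime import datetime
from typing import Dict, List


def snapshots_to_delete(
    snapshots: List[Dict[str, object]],
    limit: int = 50,
    headroom: int = 5,
) -> List[str]:
    """Return ids of the oldest manual snapshots that should be deleted.

    After deletion at most (limit - headroom) manual snapshots remain.
    Automated snapshots are ignored in this sketch.
    """
    keep = max(limit - headroom, 0)
    manual = [s for s in snapshots if s["type"] == "manual"]
    manual.sort(key=lambda s: s["created"])  # oldest first
    excess = len(manual) - keep
    if excess <= 0:
        return []
    return [str(s["id"]) for s in manual[:excess]]


if __name__ == "__main__":
    sample = [
        {"id": "snap-old", "type": "manual", "created": datetime(2015, 1, 1)},
        {"id": "snap-new", "type": "manual", "created": datetime(2015, 6, 1)},
    ]
    # With room for only one manual snapshot, the older one is selected.
    print(snapshots_to_delete(sample, limit=1, headroom=0))
```

Running the selection before each backup, rather than after a failure, is one way to avoid hitting the limit while Stash is in maintenance mode.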
Volume mount dupe disaster
snapshot, attach to new instance
two Stash instances connected to same database
test Stash migrated the prod database
tables mismatched with code
prod immediately failed hard
Lessons Learned
Volume mount dupe disaster
context: populate test with prod data
prod: 3.5, test: 3.8
started test instance (prod configs!)
connected to prod database
prod: 3.5ish
Lessons Learned
prod: 3.5, test: 3.8
shut down test
ad-hoc SQL to stabilize database
did not restore database from backups
analyze Liquibase code -> roll-back script
upgrade: roll-back then roll-forward
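One way to guard against a copied volume carrying prod configs into a test instance is a pre-start check like the sketch below. This is not something the slides describe; it is a hypothetical safeguard. The environment label, the set of prod database hosts, and the example host names are all assumptions, and the JDBC URL is assumed to have the usual `jdbc:postgresql://host:port/db` shape.

```python
from typing import Iterable
from urllib.parse import urlparse


def jdbc_host(jdbc_url: str) -> str:
    """Extract the host from a URL such as jdbc:postgresql://host:5432/stash."""
    prefix = "jdbc:"
    if not jdbc_url.startswith(prefix):
        raise ValueError("not a JDBC URL: %r" % jdbc_url)
    # Drop the "jdbc:" prefix so urlparse sees an ordinary scheme://host URL.
    return urlparse(jdbc_url[len(prefix):]).hostname or ""


def check_database_target(
    environment: str, jdbc_url: str, prod_hosts: Iterable[str]
) -> str:
    """Refuse to continue when a non-prod instance points at a prod database."""
    host = jdbc_host(jdbc_url)
    known_prod = {h.lower() for h in prod_hosts}
    if environment != "prod" and host in known_prod:
        raise RuntimeError(
            "%s instance is configured for prod database host %s" % (environment, host)
        )
    return host


if __name__ == "__main__":
    # Hypothetical host names, for illustration only.
    print(check_database_target(
        "test",
        "jdbc:postgresql://test-db.example.internal:5432/stash",
        ["prod-db.example.internal"],
    ))
```

A check like this would run before the application starts, so that a test instance launched from a prod snapshot stops before it can connect and migrate the schema.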
Bitbucket in AWS - Takeaways
Embrace cloud infrastructure
Include monitoring and metrics
Learn from our mistakes
External resources
netflix.github.io