Migrating a multi-tenant app to Azure (war biopic)

Migrating Multi-tenant App to Azure

Akshay Surve, CTO DeltaX

Twitter: @ak47suve Email: akshay@deltax.com

About Me

● 10+ years

Shipping Ideas, Making Mistakes, GTD Marathons / Hackathons / *-athon :)

● Co-founded DeltaX in 2013

Ad-tech / Product Startup / 30+ team

300+ advertisers across India, APAC and US.

What to Expect?

● Our journey to the Cloud

Migrating our 1.2TB+ DB, 100-odd tenants, application and services from bare-metal dedicated infrastructure to the cloud

● Key Considerations & Pointers

● Learnings & Opinions

Context (What we do?)

● SaaS platform for advertising analytics and tracking.

● We had 2 parallel stacks

○ Auto-scaled HA ad-serving + tracking system - AWS

Node.js / Route53 > ELB > EC2 (auto-scale) > AWS S3 > CloudFront

○ Always ON Mini-datawarehouse per tenant - Bare-metal Servers

This is the stack we migrated to Azure

Previous Setup (Infrastructure stack)

DB Machine

Dual Intel® Xeon® E5-2600 v3 @ 2.10GHz octa-core incl. RAM 128+23GB DDR4 ECC RAM

2 x 480 GB 6 Gb/s SSD Datacenter Edition
6 x 960 GB 6 Gb/s SSD Datacenter Edition
1 x 32 GB DDR4 Reg. ECC RAM

Web Box

Single Intel® Xeon® E5-2600 v3 @ 2.10GHz octa-core incl. 64 GB DDR4 ECC RAM

Services Box

Single Intel® Xeon® E5-2600 v3 @ 2.10GHz octa-core incl. 64 GB DDR4 ECC RAM

Bottlenecks

● Tenant Isolation

Small vs Big tenants; Access to their own data;

● Dev Ops / Managing Infrastructure / DB growth

2 years - 0 GB -> 200 GB
1 year - 200 GB -> 600 GB
6 months - 600 GB -> 1.2 TB

At around 80-100 clients, managing the bare-metal servers started becoming a task. We had already had our fair share of war stories: database going down, disk failure with RAID, log disk getting full, etc.

● Exploding Service > Workers (120)

Multi-threaded / Memory hungry / Isolation
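The DB growth figures in this list imply an accelerating rate rather than steady growth. A quick back-of-the-envelope check, using Python purely for the arithmetic (the period lengths and sizes are read off the bullet; treating 1.2 TB as 1200 GB is a simplifying assumption):

```python
# (period length in months, start size in GB, end size in GB),
# taken from the DB growth bullet. 1.2 TB is approximated as 1200 GB.
periods = [
    (24, 0, 200),
    (12, 200, 600),
    (6, 600, 1200),
]


def monthly_growth(months, start_gb, end_gb):
    """Average growth in GB per month over one period."""
    return (end_gb - start_gb) / months


rates = [monthly_growth(*period) for period in periods]

for (months, start_gb, end_gb), rate in zip(periods, rates):
    print(f"{start_gb} GB -> {end_gb} GB over {months} months: ~{rate:.1f} GB/month")
```

Each period grows several times faster per month than the one before it, which is the kind of curve that makes fixed bare-metal capacity hard to plan for.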

Previous Setup (Software stack)

● Database - MS SQL 2014 (1 Big DB)

Typical star schema; mixed load (OLTP / OLAP)

● ASP.NET MVC / Knockout.js / WCF / Web API

● Over 80-90 always-running background services processing data from Google AdWords, Yahoo Bing, Facebook and other external services, including our own ad-serving and tracking infrastructure.

Considerations

● Multiple providers

AWS vs. Microsoft Azure

● Cloud vendor Lock-in

Yes that is a term!

● Capacity planning / Cost Estimation / Benchmark

The cloud is a complex beast. Pay per use feels liberating, but for consistently high usage/loads is cloud even apt?

AWS vs Azure (Considerations contd.)

● You are bound to be confused

Amazon Web Services

Microsoft Azure

Google Cloud

Bakasur Thali

AWS vs Azure (Considerations contd.)

Azure

● Microsoft Azure was initially all over the place, but in the last 2 years it has caught up and is heading in a promising direction (same can be said about Microsoft)

● Our software stack was a ‘first class’ citizen in the Azure ecosystem

- Azure SQL Database Pool
- Azure App Service

AWS

● 3+ years of experience running a horizontally auto-scaled ad-server and event tracking system with close to 99.998% uptime

● Exposed to the full stack of AWS right from Route53, ELB, EC2, DynamoDB, S3 and CDN

Azure SQL Database Elastic Pool

● Simple, cost-effective solution for managing and scaling multiple databases that have varying and unpredictable usage demands

● Shared pricing plan

● Ideal for multi-tenant applications - large differences between peak utilization and average utilization per database.
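The point about peak versus average utilization can be made concrete with a small sketch. The tenant names and the usage numbers below are hypothetical, and the "resource units" are abstract rather than any real Azure pricing unit. The idea is only that when tenants peak at different times, a shared pool needs to cover the peak of the combined load, not the sum of each tenant's individual peak.

```python
from typing import Dict, List

# Hypothetical utilization samples per tenant, in abstract resource units,
# taken at the same four points in time. These numbers are illustrative only.
usage: Dict[str, List[float]] = {
    "tenant_a": [5, 80, 5, 5],
    "tenant_b": [5, 5, 70, 5],
    "tenant_c": [10, 10, 10, 60],
}


def sum_of_peaks(usage: Dict[str, List[float]]) -> float:
    """Capacity needed if every database is sized for its own peak."""
    return sum(max(samples) for samples in usage.values())


def peak_of_sum(usage: Dict[str, List[float]]) -> float:
    """Capacity needed if all databases share one pool."""
    return max(sum(samples) for samples in zip(*usage.values()))


dedicated = sum_of_peaks(usage)
pooled = peak_of_sum(usage)
print(f"sized individually: {dedicated}, sized as a pool: {pooled}")
```

With these made-up numbers the pooled figure is well under half of the individually sized one. Real pool sizing also has to account for per-database limits and storage, which this sketch ignores.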

AWS vs Azure (Considerations contd.)

Azure App Service

● Multiple languages and frameworks - App Service has first-class support for ASP.NET, Node.js, Java, PHP, and Python.

● DevOps optimization - Set up continuous integration and deployment with Visual Studio Team Services, GitHub, or BitBucket. Promote updates through test and staging environments

● Global scale with high availability - Scale up or out manually or automatically.

Cloud vendor lock-in (Considerations contd.)

● If you are working on business-critical applications, then it’s an important consideration

● You ideally want the best of what the cloud provider offers without getting locked in, unless it suits your needs to a T.

● Having a popular and widely adopted service available as a managed service is a great advantage. Happy that slowly and steadily Azure offers such services, e.g. Redis, HDInsight, Azure SQL

Capacity / Cost / Benchmarking (Considerations contd.)

● Pay-per-use is great for on-demand or sporadic usage; for always-on infrastructure, benchmarking loads and eventually estimating costs is difficult to get right

● We had cases where the Azure DB capacity planning tool failed us, leaving us without the right plan size. We split each tenant into multiple DBs of its own

● We migrated tenants in batches, so we were able to benchmark and plan capacity as we went
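Migrating in batches, as described in this section, might be sketched as follows. The tenant names, sizes, and the `plan_batches` helper are hypothetical. Ordering the smallest tenants first is an assumption made here so that early batches yield benchmark data at low risk; the talk does not say how batches were chosen.

```python
from typing import Dict, List


def plan_batches(tenant_sizes_gb: Dict[str, float], batch_size: int) -> List[List[str]]:
    """Group tenants into migration batches, smallest tenants first.

    Smallest-first is an assumption of this sketch, not a documented rule.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ordered = sorted(tenant_sizes_gb, key=tenant_sizes_gb.get)
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]


# Hypothetical tenants and data sizes in GB.
tenants = {"t1": 40, "t2": 5, "t3": 300, "t4": 12, "t5": 90}
batches = plan_batches(tenants, batch_size=2)

for number, batch in enumerate(batches, start=1):
    total_gb = sum(tenants[name] for name in batch)
    print(f"batch {number}: {batch} ({total_gb} GB)")
```

After each batch goes live, the observed load can feed the capacity estimate for the next, larger batch.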

Current Setup (Infrastructure)

Azure SQL
- 80 tenants on 1 SQL DB > 480 DBs (4x)
- Hovering around 1 TB; to 2.8TB

Azure App Service
- Web app
- API - WCF

Service / Worker VMs
- From 1 big box VM to 25 VMs

Misc Services
- Redis
- CDN
- Cloud storage

Timeline of Migration

Discussing and evaluating various ideas - April 2016

Declaration of WAR - June 2016

First live tenant - Mid August

New tenants started using the cloud stack - Sept 2016

All tenants migrated with close to 12 months of data - Dec 2016

New Challenges

● Tooling / Logging / Performance monitoring

We had one problem earlier; now the problem is compounded and spread out

● Keeping all DB schemas in sync (migrations)

● Exception logging; actionable SMS alerts

● Our costs increased 3-5x; the number of tenants also grew in that time. But we were able to scale
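Keeping many per-tenant schemas in sync is commonly handled with versioned migrations applied to every database in turn. The sketch below is one possible shape, not the approach used in the talk. It uses in-memory SQLite as a stand-in for Azure SQL so that it runs anywhere, and the `campaigns` table, the `schema_version` table, and the tenant names are all hypothetical.

```python
import sqlite3

# Ordered list of (version, statement). Table and column names are hypothetical.
MIGRATIONS = [
    (1, "CREATE TABLE campaigns (id INTEGER PRIMARY KEY, name TEXT)"),
    (2, "ALTER TABLE campaigns ADD COLUMN budget REAL"),
]


def current_version(conn: sqlite3.Connection) -> int:
    """Return the highest applied migration version, or 0 for a fresh database."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def migrate(conn: sqlite3.Connection) -> int:
    """Apply every migration newer than the database's recorded version."""
    version = current_version(conn)
    for number, statement in MIGRATIONS:
        if number > version:
            with conn:  # commits on success, rolls back on error
                conn.execute(statement)
                conn.execute("INSERT INTO schema_version (version) VALUES (?)", (number,))
            version = number
    return version


# One in-memory database per hypothetical tenant.
tenant_dbs = {name: sqlite3.connect(":memory:") for name in ("tenant_a", "tenant_b")}
versions = {name: migrate(conn) for name, conn in tenant_dbs.items()}
print(versions)
```

Because each database records its own version, rerunning `migrate` is a no-op for up-to-date tenants, and a tenant that was skipped or failed midway catches up on the next run.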

“The cloud is not a silver bullet”

silver bullet ~ noun

‘a simple and seemingly magical solution to a complicated problem’
