[AWSKRUG&JAWS-UG Meetup #1] 70% Cost Reduction with On-demand resizing

70% Cost Reduction with On-demand resizing

VCNC, Myungbo Kim

2016.05.21

Speaker

• Myungbo Kim (Andrew)

• Developer of Between at VCNC

• Server / Ops / Data analysis

• Spends a lot of money on AWS

Between

• Mobile Service for couples

• iPhone, Android Application

• Chatting, Anniversary, Photo Albums, Calendar, etc…

• 17M+ downloads globally (as of 2016.05)

• http://between.us

• http://engineering.vcnc.co.kr

Today …

• Photo Architecture migration in Between

• Thumbnail generation -> On-demand resizing

• Save about 70% of S3 cost!

Old Architecture

Between users like photos

Different sized thumbnails

Conventional wisdom

• Generate thumbnails when images are uploaded/created

Photo upload process

• Generate 4~6 thumbnails per photo

• Store them in an S3 bucket along with the original image

A lot of photos!

• 1.1 B+ original photos ( as of 2016.03 )

• 6.6 B+ w/ thumbnails

• 738 TB
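
A rough sanity check of these numbers, as a back-of-the-envelope sketch. It uses the rounded figures from the slide and assumes decimal units (1 TB = 10^12 bytes), which the slide does not state.

```python
# Rounded figures taken from the slide; the real counts are slightly higher ("+").
original_photos = 1.1e9
total_objects = 6.6e9
total_bytes = 738e12  # assumption: decimal terabytes

objects_per_photo = total_objects / original_photos
thumbnails_per_photo = objects_per_photo - 1  # subtract the original itself
avg_object_kb = total_bytes / total_objects / 1e3

print(f"objects per photo:    {objects_per_photo:.1f}")
print(f"thumbnails per photo: {thumbnails_per_photo:.1f}")
print(f"average object size:  {avg_object_kb:.0f} KB")
```

That works out to about 5 thumbnails per photo, consistent with the 4~6 generated at upload time.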

Very Expensive !!!

Could it be more efficient?

Low fan-out

• Photos are only shared within a couple: fan-out of 2

• Unlike web services, which can have a fan-out of 10k+

Client screen size

• Some thumbnails might not be used at all (depends on the client screen size)

High cache hit rate

• Clients have an LRU-based file cache

• CloudFront gives some extra cache hits
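
For readers unfamiliar with LRU eviction, here is a minimal sketch of the idea. It is not the Between client's code: the class name is hypothetical, and capacity is counted in entries rather than bytes to keep it short.

```python
from collections import OrderedDict


class LRUCache:
    """Tiny LRU cache keyed by thumbnail name; values stand in for file bytes."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # miss: the client would fetch from CloudFront / server
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used


cache = LRUCache(capacity=2)
cache.put("a.jpg", b"A")
cache.put("b.jpg", b"B")
cache.get("a.jpg")         # touch a.jpg so b.jpg becomes the oldest entry
cache.put("c.jpg", b"C")   # evicts b.jpg
print(cache.get("b.jpg"))  # None
```

Recently viewed thumbnails stay on the device, so repeat views never reach S3 at all.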

S3 - Thumbnail Recall Rate

• Analyze the recall rate of photos on S3 using the S3 access logs

• Use Spark cluster + Zeppelin
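
The slides do not define "recall rate". The sketch below assumes it means the fraction of stored thumbnails that are requested at least once in the logged period. It uses plain Python instead of Spark, and a simplified `(operation, key)` record instead of full S3 access log lines; the function name and sample keys are made up.

```python
def recall_rate(stored_keys, log_records):
    """Fraction of stored keys that were fetched at least once."""
    stored = set(stored_keys)
    fetched = {key for op, key in log_records
               if op == "REST.GET.OBJECT" and key in stored}
    return len(fetched) / len(stored) if stored else 0.0


stored_keys = ["p1_small.jpg", "p1_medium.jpg", "p1_large.jpg", "p2_small.jpg"]
log_records = [
    ("REST.PUT.OBJECT", "p1_small.jpg"),
    ("REST.GET.OBJECT", "p1_small.jpg"),
    ("REST.GET.OBJECT", "p1_small.jpg"),  # repeated GETs count once
    ("REST.GET.OBJECT", "p2_small.jpg"),
]
rate = recall_rate(stored_keys, log_records)
print(f"{rate:.0%} of stored thumbnails were ever requested")
```

A low rate would mean most pre-generated thumbnails are stored and paid for but never read.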

New Architecture

On-demand Resizing

• No need to store thumbnails on S3

• But on-demand thumbnail resizing can be very slow

Resizing during upload

On-demand Resizing

• Skia is a 2D graphics library used in Chrome, Android, and Firefox

• Very fast resizing / format conversion

• Well optimized at the CPU instruction level

• 4x faster than ImageMagick

Skia Benchmark

• Resize 3264x2448 JPEG -> 1000x1000 JPEG

            ImageMagick   Skia
time (ms)   620           151

• 75.64% less time (about 4.1x faster) !!

WebP

• We also want the original images to take less space

• WebP is an open image format made by Google

• 26% smaller than JPEG at comparable quality

New architecture

• Save the original image as WebP

• Resize to thumbnail size when a client requests it
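
The request path can be sketched as follows. This is an illustration only: the function names, the key layout, and the "longest side" sizing rule are assumptions, and the resize step is a stand-in so the sketch runs without an image library (in the talk that work is done by Skia).

```python
def fit_within(width, height, max_side):
    """Scale (width, height) so the longer side is at most max_side."""
    scale = min(1.0, max_side / max(width, height))  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


def handle_thumbnail_request(key, max_side, originals, resize):
    """Serve a thumbnail without ever storing it.

    originals: mapping of key -> (width, height, data), standing in for S3.
    resize:    callable doing the real pixel work.
    """
    width, height, data = originals[key]
    target = fit_within(width, height, max_side)
    return resize(data, target)


def fake_resize(data, size):
    """Stand-in for the real resizer; only records what was asked for."""
    return {"size": size, "source_bytes": len(data)}


originals = {"couple42/photo1.webp": (3264, 2448, b"\x00" * 1024)}
thumb = handle_thumbnail_request("couple42/photo1.webp", 1000, originals, fake_resize)
print(thumb["size"])
```

Only the original lives in storage; every thumbnail size is derived per request and left to the client and CDN caches.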

Architecture Comparison

Old Architecture

New Architecture

Resizing Latency

• Resize a 3-megapixel WebP (90% quality, 145 KB) to a JPEG of 45 KB on average

• Download to server: avg 59 ms

• Resizing: avg 37 ms

• The download takes longer than the resizing!

Migration

Architecture Migration

Old Architecture

New Architecture

Old Image Migration

• Total 1.1B original images (JPEG, PNG)

• 6.6B original + thumbnails (JPEG)

• Encode original images to WebP

• Remove old thumbnails

• Must not lose any user images

Old Image Migration

• Image encoding is CPU intensive

• Use SQS and an Auto Scaling group of Spot Instances to run the migration cost-efficiently

SQS Auto-Scaling Spot Instance

Old Image Migration Steps

1. Split the migration into one task per couple and put the tasks in SQS

2. Set up a Spot Auto Scaling group of workers to process all images and put the results on S3

3. When the S3 upload completes, mark it in the DB

4. Remove old thumbnails

5. When the thumbnails are removed, mark it in the DB
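
Step 1 can be sketched like this. The message body shape (`couple_id`) and the function name are hypothetical; the batch size of 10 reflects the SQS `SendMessageBatch` limit of 10 entries per call.

```python
import json


def build_batches(couple_ids, batch_size=10):
    """Group per-couple migration tasks into SQS-style batch entries."""
    batches = []
    for start in range(0, len(couple_ids), batch_size):
        chunk = couple_ids[start:start + batch_size]
        entries = [
            # "Id" only has to be unique within a single batch.
            {"Id": str(i), "MessageBody": json.dumps({"couple_id": cid})}
            for i, cid in enumerate(chunk)
        ]
        batches.append(entries)
    return batches


batches = build_batches([f"couple-{n}" for n in range(25)])
print([len(b) for b in batches])
```

Each list of entries has the shape expected by boto3's `send_message_batch(QueueUrl=..., Entries=entries)`; the actual call is left out so the sketch runs without AWS credentials.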

Old Image Migration

• Every step should be idempotent

• Exceptions can occur anywhere

• Spot instances can be taken away by a higher bidder
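
A minimal sketch of what an idempotent worker could look like, with plain dicts standing in for S3 and the DB. The key layout, the flag names, and the choice to keep the original JPEG are all assumptions made for illustration, not details from the talk.

```python
def migrate_couple(couple_id, photos, s3, db):
    """Process one queue message; safe to run any number of times."""
    state = db.setdefault(
        couple_id, {"webp_uploaded": False, "thumbs_removed": False}
    )

    if not state["webp_uploaded"]:
        for name in photos:
            # Writing the same key with the same content again is harmless.
            s3[f"{couple_id}/{name}.webp"] = f"webp({name})"
        state["webp_uploaded"] = True  # mark only after all uploads finished

    if not state["thumbs_removed"]:
        for key in [k for k in s3 if k.startswith(f"{couple_id}/thumb/")]:
            s3.pop(key, None)  # deleting an already-deleted key is a no-op
        state["thumbs_removed"] = True


s3 = {"c1/thumb/a_100.jpg": "t", "c1/thumb/a_200.jpg": "t", "c1/a.jpg": "orig"}
db = {}
migrate_couple("c1", ["a"], s3, db)
snapshot = dict(s3)
migrate_couple("c1", ["a"], s3, db)  # redelivered message: nothing changes
print(s3 == snapshot, sorted(s3))
```

If a worker dies between the upload and the DB mark, the message is simply processed again: the upload is repeated with identical content and the flags end up in the same state.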

Old Image Migration

• Use compute-optimized instances

• c3.4xlarge, c4.2xlarge

• Use the CloudWatch metric for DB CPU utilization as the scaling alarm

• Most of the work ran from night until dawn
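
The idea of scaling workers on DB load can be sketched as a simple rule. The thresholds and step size below are made-up values for illustration; only the cap of 140 comes from the talk (the peak instance count). In practice this logic would live in CloudWatch alarms and scaling policies rather than application code.

```python
def desired_workers(current, db_cpu_percent,
                    low=40.0, high=70.0, step=10, max_workers=140):
    """Return the next worker count given the DB CPU utilization."""
    if db_cpu_percent > high:
        return max(0, current - step)            # DB is struggling: back off
    if db_cpu_percent < low:
        return min(max_workers, current + step)  # DB has headroom: add workers
    return current                               # in between: hold steady


print(desired_workers(50, 25.0))   # 60
print(desired_workers(50, 85.0))   # 40
print(desired_workers(135, 10.0))  # 140
```

Scaling on the DB metric rather than on queue length keeps the migration from starving production traffic, which presumably is also why most of the work ran overnight.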

Old Image Migration

All the Spot Instances are mine !!!

Old Image Migration

• Took 4 days

• Used up to 140 instances

• 6,767 instance-hours

• 303,933 ECU-hours

• Only $1.8 per 1M WebP encodings

Result of Migration

Migration Cost

              Usage                    Cost ($)
EC2 spot      6,767 hours              1,959.11
SQS           188,204,104 requests        89.59
S3 Put/Get    2,492,466,860 requests   5,608.34
Total                                  7,657.04
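
The cost figures can be cross-checked with a few lines of arithmetic. The image count is the rounded 1.1B from the earlier slides, so the per-million values are approximate.

```python
costs = {"EC2 spot": 1959.11, "SQS": 89.59, "S3 Put/Get": 5608.34}
total = round(sum(costs.values()), 2)

images = 1.1e9  # rounded count of original images
ec2_per_million = costs["EC2 spot"] / (images / 1e6)
total_per_million = total / (images / 1e6)

print(total)
print(f"EC2 only:  ${ec2_per_million:.2f} per 1M images")
print(f"All costs: ${total_per_million:.2f} per 1M images")
```

The EC2-only figure comes to roughly $1.78 per 1M images, so the "$1.8 / 1M" number appears to cover the encoding compute alone, with S3 requests being the larger part of the total.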

Migration Result

                  Before Migration   After Migration   Reduction (%)
S3 # of objects   6.65 B             1.17 B            82.40
S3 Storage        738 TB             184 TB            75.06
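
The reduction column follows from the usual formula (before - after) / before. A quick check with the rounded values from the table gives results within about 0.01 percentage points of the slide, the small gap presumably coming from rounding of the inputs.

```python
def reduction_percent(before, after):
    """Relative reduction from before to after, in percent."""
    return (before - after) / before * 100


objects = reduction_percent(6.65e9, 1.17e9)
storage = reduction_percent(738, 184)
print(f"objects: {objects:.2f}%  storage: {storage:.2f}%")
```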

Cost Reduction

• 68% cost reduction (S3 + resizing)

Conclusion

• Migrated the architecture to reduce cost

• Protected the user experience with faster resizing

• Processed 1.1B+ images with Spot + Auto Scaling

• Reduced S3 cost by 68%

Q & A
