CDNs have become a core part of internet infrastructure, and application owners are building them into development and product roadmaps for improved efficiency, transparency and performance. In his talk, Hooman shares recent learnings about the world of CDNs, how they're changing, and how Devs, Ops, and DevOps can integrate with them for optimal deployment and performance. Hooman Beheshti is VP of Technology at Fastly, where he develops web performance services for the world's smartest CDN platform. A pioneer in the application acceleration space, Hooman helped design one of the original load balancers while at Radware and has held senior technology positions with Strangeloop Networks and Crescendo Networks. He has worked on the core technologies that make the Internet work faster for nearly 20 years and is an expert and frequent speaker on the subjects of load balancing, application performance, and content delivery networks.
What we can learn from CDNs about Web Development,
Deployment, and Performance
Hooman Beheshti, VP Technology
WebPerf Meetup, NYC, May 13, 2014
Who am I?
• Early Load Balancing Vendors
– Radware
– Crescendo Networks
• Front End Optimization
– Strangeloop Networks
• Off the grid for a year!
• Joined Fastly 6 weeks ago
So, really…
What I’ve Learned From Working at a CDN Company
for 6 Weeks!!
Lesson:
CDNs Are Not Solved!
We Don’t Cache As Much As We Should!
• HTML and other dynamic content
• Worse cache hit rate than we think – Especially for long tail content
• Mobile Apps, APIs, etc
Making Changes SUCKS!!
• Configuration changes take way too long
– People are used to making changes in real time
– CDNs aren’t classically good at this
– Phone??
• Purging is a real problem
– Slow
– Difficult
– Not granular enough
Lots of Room
• New Demands from Customers
• Plenty of room for differentiation
• Can’t take some things for granted:
– DNS
– Routing
– TCP
– SCALE!
• Plus: lots of room to be creative at the edge!
Lesson:
There’s More to the Web Than the Web!
Non “web” Traffic
• Video
– HLS (HTTP Live Streaming)
– HTTP-based small video chunks
– Unique by URL
• APIs
– Instant purging can let API calls be cacheable
– Another example of dynamic content cached at the edge
• Mobile Apps
Lesson:
People Use Their CDNs Wrong
CDNs offer a toolset
• The black box approach isn’t always good
• Configuration isn’t trivial
– And a lot still depends on configuration
• Can’t depend on the CDN to solve all your problems
• Don’t exacerbate your problems!
http://bigqueri.es/t/sites-that-deliver-images-using-gzip-deflate-encoding/220
Gzipping Images
• Not a very good thing for performance
– Extra bytes
– Extra work for the browser
• But was this the Surrogate’s fault?
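The “extra bytes” point is easy to check: gzip cannot shrink data that is already compressed, so it only adds framing overhead. A minimal sketch, assuming random bytes are a fair stand-in for an already-compressed image body (no real image is used here) and repetitive markup is a fair stand-in for HTML:

```python
import gzip
import os


def gzip_delta(payload: bytes) -> int:
    """Bytes added (positive) or saved (negative) by gzipping the payload."""
    return len(gzip.compress(payload)) - len(payload)


# Stand-in for an already-compressed image: high-entropy bytes.
image_like = os.urandom(50_000)
# Stand-in for compressible content: repetitive markup.
html_like = b"<li class='item'>hello world</li>\n" * 1_500

image_delta = gzip_delta(image_like)
html_delta = gzip_delta(html_like)

print(f"image-like payload: {image_delta:+d} bytes after gzip")
print(f"html-like payload:  {html_delta:+d} bytes after gzip")
```

The image-like payload comes out slightly larger, and the browser still has to inflate it before decoding the image, which is the “extra work” on the slide.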
More Examples
• Bad caching headers
– max-age, s-maxage have a lot of power!
• Bad TCP connection management at origin
• Not Gzipping (actual, compressible content) for origin fetches
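To make the max-age / s-maxage point concrete: in standard HTTP caching, s-maxage applies only to shared caches (such as a CDN) and overrides max-age there, while browsers use max-age. The sketch below is a simplified illustration, not a full Cache-Control parser; the function names are hypothetical, and it treats no-cache as “do not reuse” even though the real directive means “revalidate before reuse”.

```python
def parse_cache_control(header: str) -> dict:
    """Split a Cache-Control header into {directive: value-or-None}."""
    directives = {}
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def ttl_seconds(header: str, shared_cache: bool) -> int:
    """Freshness lifetime in seconds for a browser or a shared cache (CDN)."""
    d = parse_cache_control(header)
    # Simplification: no-store and no-cache are both treated as "not reusable".
    if "no-store" in d or "no-cache" in d:
        return 0
    # Shared caches must not store responses marked private.
    if shared_cache and "private" in d:
        return 0
    # s-maxage wins over max-age, but only for shared caches.
    if shared_cache and d.get("s-maxage") is not None:
        return int(d["s-maxage"])
    if d.get("max-age") is not None:
        return int(d["max-age"])
    return 0


header = "public, max-age=60, s-maxage=3600"
browser_ttl = ttl_seconds(header, shared_cache=False)
cdn_ttl = ttl_seconds(header, shared_cache=True)
print(f"browser keeps it {browser_ttl}s, CDN keeps it {cdn_ttl}s")
```

With one header the origin can keep browsers on a short leash while letting the edge hold the object much longer, which is why a wrong value here hurts so much.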
With Great Power…
Lesson:
Dynamic Content Is Really Interesting!
What Is Dynamic Content?
• Stuff that’s not static!
• With web traffic, generally the base HTML
– Big deal because it’s blocking
– And sometimes the largest object; longer download
• Some AJAX
• More…
Blocking
Classically, with dynamic content…
Caching
Caching vs. Invalidation
We tried…
Dynamic Content Caching Problems
• Serving stale pages
– Lack of good invalidation framework
• Visibility
• Logging
CDNs and Dynamic Content
• Generally, handling dynamic content has been a matter of transport
– Middle mile optimizations
– TCP tweaks
• Some edge micro caching, but not easy
• ESI
Actually…
• Dynamic content is more cacheable than we think
• Static for short periods of time
• Unpredictable invalidation
– Standard HTTP caching rules aren’t good enough
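“Static for short periods of time” is the idea behind micro caching: hold a rendered page for a few seconds, and drop it early when the content changes. A minimal in-memory sketch, assuming a hypothetical `MicroCache` class and a caller-supplied render function; a real edge cache is far more involved.

```python
import time


class MicroCache:
    """Cache rendered responses for a short TTL, with explicit invalidation."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable so the demo can control time
        self._store = {}        # key -> (expires_at, body)

    def get(self, key, render):
        """Return (body, 'HIT' or 'MISS'); call render() only on a miss."""
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and now < entry[0]:
            return entry[1], "HIT"
        body = render()
        self._store[key] = (now + self.ttl, body)
        return body, "MISS"

    def invalidate(self, key):
        """Drop a cached object right away, without waiting for the TTL."""
        self._store.pop(key, None)


renders = []

def render_home():
    renders.append(1)
    return f"<html>render #{len(renders)}</html>"

now = [0.0]  # fake clock for a deterministic demo
cache = MicroCache(ttl_seconds=5, clock=lambda: now[0])

first = cache.get("/", render_home)    # miss: nothing cached yet
second = cache.get("/", render_home)   # hit: still inside the 5 s window
now[0] = 6.0
third = cache.get("/", render_home)    # miss: TTL expired
cache.invalidate("/")
fourth = cache.get("/", render_home)   # miss: invalidated by the application
print(first[1], second[1], third[1], fourth[1], "renders:", len(renders))
```

The TTL covers the predictable case; the `invalidate` call is what handles the unpredictable one, and it is exactly the part that plain HTTP caching headers cannot express.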
So Many Benefits!
• Performance
– Faster time to first byte
– Faster start render
– Happy users!
• Offload
– Less work for our servers
– Less bandwidth at origin
What Would Make It Better?
• Programmatic Invalidation
– Granular
– Instantaneous
• Control at the edge, and not just for web pages
– Real-time log files
– Imagine terminating beacons at the edge!
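One way to picture granular, programmatic invalidation is to tag cached objects and purge by tag, so a single call removes every page that depends on a changed item and nothing else. The sketch below is an assumption for illustration only: `TaggedCache`, its methods, and the tag names are hypothetical and do not describe any particular CDN's API.

```python
from collections import defaultdict


class TaggedCache:
    """Toy cache where each object can carry tags and be purged by tag."""

    def __init__(self):
        self._objects = {}                    # key -> body
        self._keys_by_tag = defaultdict(set)  # tag -> keys carrying that tag

    def put(self, key, body, tags=()):
        self._objects[key] = body
        for tag in tags:
            self._keys_by_tag[tag].add(key)

    def get(self, key):
        return self._objects.get(key)

    def purge_tag(self, tag):
        """Remove every object carrying the tag; return how many were removed."""
        removed = 0
        for key in self._keys_by_tag.pop(tag, set()):
            if key in self._objects:
                del self._objects[key]
                removed += 1
        return removed


cache = TaggedCache()
cache.put("/product/42", "<html>product 42</html>", tags=["product-42", "catalog"])
cache.put("/category/shoes", "<html>shoes</html>", tags=["product-42", "category-shoes"])
cache.put("/about", "<html>about</html>")

# Product 42 changed: purge only the pages that show it.
purged = cache.purge_tag("product-42")
print("purged:", purged, "| /about still cached:", cache.get("/about") is not None)
```

The application already knows when product 42 changes, so it is the right place to trigger the purge; the cache only needs to know which objects depend on it.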
Lesson:
Integrating is Awesome!
The Influence of Clouds
• DevOps people like programmability and integration
• The CDN is no longer a black box mechanism, necessarily
• Cliché Alert: Content as a Service!
Real Time Integration
• Tap in to the CDN:
– Instantaneous configuration changes
– Instantaneous cache purge and invalidation
– Real time stats and logs
• Infrastructure as code
– Expect extensive APIs
– Apps need to naturally extend to the CDN
– Your content => you need control
About Time!!
Lesson:
Measurement is Still Hard
“Still”
• In the world of FEO
– Webpagetest.org
– RUM
– Synthetic testing vendors
• In the world of CDNs
– Same as far as client performance goes
– Some new things…
Client-side Measurement in CDNs
• Cache hit ratio
– How do you test and measure?
• Long tail content?
• DNS and edge node selection
• TTFB out of datacenter
– Memory hit vs disk hit vs mid-tier hit vs miss
• RUM and synthetic (Cedexis, Catchpoint, etc)
• There’s still gaming going on!
Let’s Test It!
• 3 Objects on the same CDN (anonymous)
– Cedexis object
– Small image from Alexa 5000 site
– Long tail object: ~40 times every 3-4 hours
• Use Catchpoint last mile clients in US
– Test every 15 minutes
– ~11,500 total tests across all test nodes
• Focus measurement on:
– Connect time (TCP)
– Wait time (TTFB)
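For readers who want to see what those two numbers are, here is a rough sketch of measuring them by hand with a raw socket. It is not how Catchpoint measures; the function name is hypothetical, it speaks plain HTTP only (no TLS), and it takes “wait” to be the time from finishing the request to receiving the first response byte.

```python
import socket
import time


def measure_connect_and_wait(host, port=80, path="/", timeout=5.0):
    """Return TCP connect time and wait time (TTFB) in milliseconds."""
    t_start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=timeout)
    t_connected = time.perf_counter()
    try:
        request = (
            f"GET {path} HTTP/1.1\r\n"
            f"Host: {host}\r\n"
            "Connection: close\r\n"
            "\r\n"
        )
        sock.sendall(request.encode("ascii"))
        t_sent = time.perf_counter()
        first_byte = sock.recv(1)   # blocks until the first response byte
        t_first_byte = time.perf_counter()
    finally:
        sock.close()
    return {
        "connect_ms": (t_connected - t_start) * 1000,
        "wait_ms": (t_first_byte - t_sent) * 1000,
        "first_byte": first_byte,
    }
```

Calling `measure_connect_and_wait("example.com")` from several locations and taking medians gives numbers comparable in spirit to the ones that follow, though a single client on one network says little about last mile performance.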
Object        Connect (median)   Wait (median)   Wait vs. row above
Cedexis       14ms               19ms
Alexa 5000    14ms               24ms            +26%
Long Tail     15ms               29ms            +20%
        Cedexis Object          Alexa 5000              Long Tail
        Count    TCP    TTFB    Count    TCP    TTFB    Count    TCP    TTFB
Mem     11,074   14ms   19ms    6,741    14ms   20ms    481      14ms   19ms
Disk    428      12ms   24ms    4,692    14ms   31ms    9,626    15ms   28ms
Miss    1        6ms    38ms    28       13ms   45ms    1,355    16ms   51ms

Cache hit ratio:
Cedexis Object: 99.99% (Mem: 96.27%, Disk: 3.72%)
Alexa 5000: 99.76% (Mem: 58.82%, Disk: 40.94%)
Long Tail: 88.17% (Mem: 4.19%, Disk: 83.98%)
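The percentages follow directly from the Mem/Disk/Miss counts. A short sketch of the arithmetic, assuming the middle column group belongs to the Alexa 5000 object and the last one to the long tail object (which matches their median wait times):

```python
# Test counts per object, taken from the Mem / Disk / Miss rows.
counts = {
    "Cedexis":    {"mem": 11074, "disk": 428,  "miss": 1},
    "Alexa 5000": {"mem": 6741,  "disk": 4692, "miss": 28},
    "Long Tail":  {"mem": 481,   "disk": 9626, "miss": 1355},
}


def hit_breakdown(c):
    """Overall hit ratio plus the share served from memory and from disk."""
    total = c["mem"] + c["disk"] + c["miss"]
    return {
        "hit_pct": 100 * (c["mem"] + c["disk"]) / total,
        "mem_pct": 100 * c["mem"] / total,
        "disk_pct": 100 * c["disk"] / total,
    }


breakdown = {name: hit_breakdown(c) for name, c in counts.items()}
for name, b in breakdown.items():
    print(
        f"{name:<11} hit {b['hit_pct']:.2f}%  "
        f"mem {b['mem_pct']:.2f}%  disk {b['disk_pct']:.2f}%"
    )
```

The computed values agree with the slide figures to within roughly 0.01 percentage points. The interesting part is the split: the popular object is almost always a memory hit, while the long tail object is mostly served from disk even when it counts as a hit.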
Measurement!
• Not only do I care about:
– Cache hit rate
– Long tail
– Measuring the right thing
• Fetching from disk could suck!
– SSDs!
• Caching ≠ Caching
Lesson:
It’s Not Only About…
…Performance!!!
Security
• Certificate management
• Perimeter security
• DDoS protection <- benefit of scale!
Flexibility, Visibility, and Control
• Integration
• Programmability
• APIs
• Instant purging
• Real time logs
Fun at the Edge!
• Synthetic responses
– Example: node ID for Cedexis measurements
• Cookie manipulation
– Remove/inject/replace/recall
• Beacon termination
– 204 responses
– Real time logs
– Awesome!
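Beacon termination boils down to: take the data out of the request, log it, and answer 204 No Content without bothering the origin. A minimal sketch as a WSGI app, standing in for edge logic; the names `beacon_app` and `beacon_log` are hypothetical, and an in-memory list plays the role of the real-time log stream.

```python
import time
from urllib.parse import parse_qs

beacon_log = []  # stand-in for a real-time log stream


def beacon_app(environ, start_response):
    """Record the beacon's query parameters, then reply 204 with no body."""
    beacon_log.append({
        "ts": time.time(),
        "path": environ.get("PATH_INFO", ""),
        "params": parse_qs(environ.get("QUERY_STRING", "")),
    })
    # Beacons carry data in the request; the response should never be cached.
    start_response("204 No Content", [("Cache-Control", "no-store")])
    return []
```

To try it locally, the standard library's `wsgiref.simple_server.make_server("", 8000, beacon_app).serve_forever()` will serve it on port 8000; requesting `/beacon?t_done=1234` then appends one entry to `beacon_log`.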
PERFORMANCE!! It’s still pretty damn important!
Thank you!