Drongo: Speeding Up CDNs with Subnet Assimilation from the Client
Authors: Marc Anthony Warrior, Uri Klarman, Marcel Flores, Aleksandar Kuzmanovic
CoNEXT ’17, Incheon, South Korea (CDN & Caching Session)
Bird’s Eye View
● What is Drongo?
● Why we need Drongo
● Performance Analysis
● Thoughts & Conclusions
● Questions
What is Drongo?
It’s a bird!
It’s also a system that allows end users to enhance the QoS (quality of service) they get from Content Distribution Networks (CDNs).
(in this talk, QoS = latency)
Why Latency?
● Latency is time
● Latency is money
  ○ Google (Marissa Mayer), Amazon (Greg Linden)
  ○ Web 2.0 Summit, glinden.blogspot.com
● Latency is the bottom line
  ○ “What we have found running our applications at Google is that latency is as important, or more important, for our applications than relative bandwidth,” Amin Vahdat (Google)
Drongo helps you (the end user) lower your own latency!
Drongo’s Effect on Latency
[Figure: latency improvements across six providers: Google, Amazon, Alibaba, CDNetworks, ChinaNetCenter, CubeCDN]
ONLY client-side changes
Example Scenario
● Provider wants to serve client
● Client is far
● CDN = more replica locations

DNS Redirection
Which replica serves the client? Choose the “closest” server. This choice is nontrivial!
Often Suboptimal Choices!
Maybe just a far LDNS...
[Chen - SIGCOMM ’15; Huang - SIGCOMM CCR ’12; Alzoubi - WWW ’13; Rula - SIGCOMM ’14; ...]

Ordinary DNS Query
[Figure: the DNS query carries only the LDNS IP, which places the client “somewhere in California”]
EDNS0 Client-Subnet extension (ECS)
[Figure: with ECS, the DNS query carries the client subnet alongside the LDNS IP, revealing that the client is actually somewhere in New York, not California]
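As a concrete sketch of what ECS puts on the wire, the option body defined in RFC 7871 can be built by hand. The function name and the subnets below (IPv4 documentation prefixes) are illustrative, not part of any DNS library or of Drongo itself:

```python
import struct
import ipaddress

def build_ecs_option(subnet: str) -> bytes:
    """Encode an EDNS0 Client-Subnet (ECS) option per RFC 7871.

    Layout: OPTION-CODE (8 = ECS), OPTION-LENGTH, then FAMILY,
    SOURCE PREFIX-LENGTH, SCOPE PREFIX-LENGTH, and the address
    truncated to the bytes covered by the prefix.
    """
    net = ipaddress.ip_network(subnet)
    family = 1 if net.version == 4 else 2   # 1 = IPv4, 2 = IPv6
    source_prefix = net.prefixlen           # prefix length announced to the CDN
    scope_prefix = 0                        # always 0 in queries
    n_bytes = (source_prefix + 7) // 8
    addr = net.network_address.packed[:n_bytes]
    data = struct.pack("!HBB", family, source_prefix, scope_prefix) + addr
    return struct.pack("!HH", 8, len(data)) + data

# A client can send its own subnet, or "assimilate" some other subnet:
own = build_ecs_option("203.0.113.0/24")     # default: client's own /24
other = build_ecs_option("198.51.100.0/24")  # e.g. a subnet seen on a traceroute hop
```

Subnet assimilation, as used throughout this talk, amounts to placing a subnet other than the client’s own in this field.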
(ECS User)
We used ECS: suboptimal replica choices still happen... frequently.
Really? ... YES! We measured it!
How did we measure it?
● Find subnets directed to different replicas
● Perform pings and downloads to each replica
● Identify which subnet resulted in the “best” replica

Subnet Assimilation
[Figure: the DNS query carries the LDNS IP and, in place of the client subnet, some other subnet]
1. Get “Default” Choice (use client’s own subnet for ECS)
2. Traceroute to default choice
3. Get Hop Subnet Choices (use hops’ subnets for ECS)
4. Measure Latencies
Steps 1-4: a “trial”
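The four steps above can be sketched as a single function. The measurement primitives (`resolve`, `traceroute`, `ping`) are passed in as plain callables standing in for real ECS DNS queries, traceroutes, and pings, so this is an illustrative skeleton, not Drongo’s code:

```python
# Hypothetical sketch of one Drongo "trial" (steps 1-4).
def run_trial(client_subnet, resolve, traceroute, ping):
    default_replica = resolve(client_subnet)             # 1. get "default" choice
    hops = traceroute(default_replica)                   # 2. traceroute to it
    hop_choices = {h: resolve(h) for h in hops}          # 3. hop subnet choices
    hop_rtts = {h: ping(r) for h, r in hop_choices.items()}  # 4. measure latencies
    return ping(default_replica), hop_rtts

# Toy stand-ins: two hop subnets, one of which steers the CDN to a closer replica.
resolve = {"client/24": "R1", "hopA/24": "R1", "hopB/24": "R2"}.__getitem__
traceroute = lambda replica: ["hopA/24", "hopB/24"]
ping = {"R1": 40.0, "R2": 24.0}.__getitem__
default_rtt, hop_rtts = run_trial("client/24", resolve, traceroute, ping)
# default replica R1 at 40 ms; hopB's subnet steers to R2 at 24 ms
```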
Latency Ratio
Normalize each measurement to the default choice’s RTT: Latency Ratio = (hop replica RTT) / (default replica RTT)
We’re looking for valleys. Valley = better choice from hop subnet.
[Figure: RTT from client to replica (0 ms to 100 ms) along the traceroute path; each point is the replica choice for one hop’s subnet, and a dip below the default’s ratio of 1 is a valley]
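A minimal sketch of valley detection under these definitions; the names and numbers are illustrative, not Drongo’s implementation:

```python
def find_valleys(default_rtt, hop_rtts, vthresh=1.0):
    """Return the hop subnets whose replica beat the default replica.

    Each hop's RTT is normalized to the default choice's RTT; a ratio
    below vthresh is a "valley" (the hop subnet steered the CDN to a
    better replica than the client's own subnet did).
    """
    ratios = {hop: rtt / default_rtt for hop, rtt in hop_rtts.items()}
    return {hop: r for hop, r in ratios.items() if r < vthresh}

# One trial: default replica at 40 ms, three hop-subnet replicas measured.
valleys = find_valleys(40.0, {"hop1": 62.0, "hop2": 24.0, "hop3": 40.0})
# only hop2's ratio (0.6) is below 1.0, so hop2 is a valley
```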
PlanetLab Sees Valleys!
● Google: 20.24%
● Amazon: 14.02%
● Alibaba: 33.68%
● CDNetworks: 15.61%
● ChinaNetCenter: 27.42%
● CubeCDN: 38.58%
Room for improvement!
5. Use best subnet for ECS: get best mapping!
Are Valleys Predictable?
● Trials are not “fast”
● We want valleys “on the fly”
● We need to find valley-prone subnets
Testing Persistence
[Figure: a timeline of 20 consecutive trials; compare single trials (Trial A vs Trial B) and windows of trials (Window A vs Window B, Window A vs Window C), with windows separated by up to 15 hours]

Latency Ratio Difference Over Time
Latency Ratio = (hop replica RTT) / (default replica RTT)
[Figure: distribution of latency ratio differences between windows as time between them grows]
SURPRISE! The Internet is crazy!
Filter: at least one valley
Subnet A: {0,0,0,0,0,V,0,0,0,0,0,0,V}
Subnet B: {0,0,0,0,0,0,0,0,0,0,0,0,0}
Subnet C: {V,V,V,V,0,0,0,0,V,V,V,0,V}
Keep only subnets with at least one valley in their trial history (Subnets A and C); drop Subnet B.
Latency Ratio = (hop replica RTT) / (default replica RTT)
[Figure: after this filter, the latency ratio difference over time is very flat and close to zero]
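The filter from the slide, sketched in Python over the trial histories shown above (V = trial containing a valley, 0 = trial without one):

```python
# Trial histories as shown on the slide.
histories = {
    "Subnet A": "00000V000000V",
    "Subnet B": "0000000000000",
    "Subnet C": "VVVV0000VVV0V",
}

def at_least_one_valley(histories):
    """Keep only subnets whose history contains at least one valley."""
    return {subnet: h for subnet, h in histories.items() if "V" in h}

kept = at_least_one_valley(histories)   # Subnet B is dropped
```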
Parameter Exploration
How deep are the valleys from useful subnets?
Vthresh: a valley counts only if its latency ratio falls below Vthresh.
[Figure: latency ratios for candidate replicas A, B, and C plotted against a Vthresh line just below 1 (e.g. 0.9); only replicas below the line count as valleys]
How often do valleys occur in
useful subnets?
Vfreq: the minimum fraction of trials in the training window that must contain a valley.
[Figure: a training window of five trials in which two contain valleys, an observed valley frequency of 2/5; whether that makes the subnet valley-prone or NOT valley-prone depends on the Vfreq setting]
Overview of Drongo:
1. Collect training window
2. Count the # of sufficiently deep valleys
3. Apply subnet assimilation when:
   a. Training window is already complete
   b. Both parameters met
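The decision in steps 1-3 can be sketched as follows. The defaults use the best-performing parameter settings reported later in the talk (Vfreq = 1.0, Vthresh = 0.95); the function itself is an illustrative sketch, not Drongo’s implementation:

```python
def should_assimilate(window, vthresh=0.95, vfreq=1.0):
    """Decide whether to redirect via a candidate subnet.

    `window` is that subnet's training window: one latency ratio per trial
    (hop replica RTT / default replica RTT). A trial counts as a valley
    only if its ratio is below vthresh ("sufficiently deep"); assimilation
    is applied only if the fraction of valley trials meets vfreq.
    """
    if not window:
        return False                      # training window not yet complete
    deep = sum(1 for ratio in window if ratio < vthresh)
    return deep / len(window) >= vfreq

should_assimilate([0.6, 0.8, 0.9, 0.7])   # every trial a deep valley -> True
should_assimilate([0.6, 1.2, 0.9, 0.7])   # one shallow trial, vfreq=1.0 -> False
```

With Vfreq = 1.0, a single trial without a deep valley disqualifies the subnet, which matches the conservative switching the talk describes.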
System Wide Performance
[Figure: system-wide latency improvement (lower is better) as the parameters are swept; the best setting shown is Vfreq = 1.0 with Vthresh = 0.95]
Switch Quality
[Figure: switch quality for Google, Amazon, Alibaba, CDNetworks, ChinaNetCenter, and CubeCDN, comparing global parameters against per-provider parameters]
Conclusion & Insights
● CDNs have a lot of room for improvement
● Clients can help
● Low requirements
● Can provide 50% improvement
Questions?
# Clients Affected
[Figure: number of clients affected, with the “better” direction indicated]

Per Provider Overall Performance

Performance of Drongo’s choices