DNS AND CDN
What we have learned so far…
Socket programming, Internet
Process and Thread
Concurrent programming
RPC
Logical Time
(Distributed) Transactions and concurrency control
What we will learn…
ACID vs BASE.
We will learn how real systems work:
DNS, CDN
Distributed file system (NFS, AFS, Dropbox)
MapReduce/Hadoop
SSH, PKI
Midterm
[Figure: two score histograms, one for the midterm (AVG=77) and one for the total of PA1 + PA2 (part I) + Midterm; the score axis runs from 100 down to 50.]
Today's Lecture
DNS: one of the world’s largest distributed systems
CDNs: More than 60% of Internet traffic is video. More than half is served by CDNs.
What is DNS?
Domain Name System
Translates human readable names into IP addresses.
Challenge
How do we scale these to the wide area and to many users?
How do we make this efficient?
How do we support many updates?
This is what DNS gives you (dig)
dongsuh@ina:~/public_html$ dig ina.kaist.ac.kr
; <<>> DiG 9.8.1-P1 <<>> ina.kaist.ac.kr
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3211
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 2
;; QUESTION SECTION:
;ina.kaist.ac.kr. IN A
;; ANSWER SECTION:
ina.kaist.ac.kr. 7200 IN A 143.248.56.236
;; AUTHORITY SECTION:
kaist.ac.kr. 7200 IN NS ns1.kaist.ac.kr.
kaist.ac.kr. 7200 IN NS ns.kaist.ac.kr.
;; ADDITIONAL SECTION:
ns.kaist.ac.kr. 7200 IN A 143.248.1.177
ns1.kaist.ac.kr. 7200 IN A 143.248.2.177
;; Query time: 2 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Nov 7 08:14:11 2013
;; MSG SIZE rcvd: 116
; <<>> DiG 9.8.1-P1 <<>> www.google.com
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 45520
;; flags: qr rd ra; QUERY: 1, ANSWER: 6, AUTHORITY: 4, ADDITIONAL: 4
;; QUESTION SECTION:
;www.google.com. IN A
;; ANSWER SECTION:
www.google.com. 44 IN A 74.125.128.103
www.google.com. 44 IN A 74.125.128.105
www.google.com. 44 IN A 74.125.128.104
www.google.com. 44 IN A 74.125.128.106
www.google.com. 44 IN A 74.125.128.147
www.google.com. 44 IN A 74.125.128.99
;; AUTHORITY SECTION:
google.com. 268411 IN NS ns3.google.com.
google.com. 268411 IN NS ns1.google.com.
google.com. 268411 IN NS ns4.google.com.
google.com. 268411 IN NS ns2.google.com.
;; ADDITIONAL SECTION:
ns2.google.com. 265844 IN A 216.239.34.10
ns1.google.com. 265844 IN A 216.239.32.10
ns3.google.com. 265844 IN A 216.239.36.10
ns4.google.com. 265844 IN A 216.239.38.10
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Thu Nov 7 08:14:32 2013
;; MSG SIZE rcvd: 264
Why is DNS a distributed system?
Why can’t it be centralized?
Single point of failure
Large amount of requests
Must handle large updates!
Does not scale.
Why not use distributed transactions?
ACID approach.
Domain Name System Goals
Basically a wide-area distributed database
Scalability
Decentralized maintenance
Robustness
Global scope
Names mean the same thing everywhere
Don’t need
Atomicity
Strong consistency
Typical Resolution
Steps for resolving www.google.com
Application calls gethostbyname() (RESOLVER)
Resolver contacts local name server (S1)
S1 queries root server (S2) for (www.google.com)
S2 returns NS record for google.com. (S3) ns1.google.com
What about A record for S3?
This is what the additional information section is for (PREFETCHING)
S1 queries S3 (ns1.google.com) for www.google.com
S3 returns A record for www.google.com
Can return multiple A records. What does this mean?
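The resolution steps above can be sketched as a toy model in which the local name server (S1) follows referrals starting from the root. This is only an illustration: the SERVERS table, query() and resolve() are hypothetical names, the server contents are hand-written (the addresses are copied from the dig output earlier in the lecture), and no real DNS traffic is sent.

```python
# Toy model of referral-following resolution. Nothing here talks to real DNS.
# Each "server" holds referrals (NS name + glue A record) and final answers.
SERVERS = {
    "root": {
        "referrals": {"google.com.": ("ns1.google.com.", "216.239.32.10")},
        "answers": {},
    },
    "216.239.32.10": {
        "referrals": {},
        "answers": {"www.google.com.": ["74.125.128.103", "74.125.128.105"]},
    },
}


def query(server, name):
    """Return ("answer", [addresses]) or ("referral", ns_name, ns_address)."""
    zone = SERVERS[server]
    if name in zone["answers"]:
        return ("answer", zone["answers"][name])
    for suffix, (ns_name, ns_addr) in zone["referrals"].items():
        if name == suffix or name.endswith("." + suffix):
            # The NS record names the next server; its A record travels in
            # the additional section, so no extra lookup is needed.
            return ("referral", ns_name, ns_addr)
    return ("answer", [])


def resolve(name, max_hops=8):
    """What S1 does: start at the root and follow referrals to an answer."""
    server, path = "root", []
    for _ in range(max_hops):
        path.append(server)
        reply = query(server, name)
        if reply[0] == "answer":
            return reply[1], path
        server = reply[2]  # address taken from the additional section
    raise RuntimeError("too many referrals")


addresses, path = resolve("www.google.com.")
print(addresses, path)
```

Returning a list of addresses mirrors the multiple A records in the last step: the client may use any of them.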
Lookup Methods
Recursive query:
Server goes out and searches for more info (recursive)
Only returns final answer or “not found”
Iterative query:
Server responds with as much as it knows (iterative)
“I don’t know this name, but ask this server”
Workload impact on choice?
Local server typically does recursive
Root/distant server does iterative
[Figure: iterated query. Requesting host surf.eurecom.fr asks its local name server dns.eurecom.fr to resolve gaia.cs.umass.edu; steps 1-8 involve the root name server, the intermediate name server dns.umass.edu, and the authoritative name server dns.cs.umass.edu.]
Workload and Caching
Are all servers/names likely to be equally popular?
Why might this be a problem? How can we solve this problem?
DNS responses are cached
Quick response for repeated translations
Other queries may reuse some parts of lookup
NS records for domains
DNS negative queries are cached
Don’t have to repeat past mistakes
E.g. misspellings, search strings in resolv.conf
Cached data periodically times out
Lifetime (TTL) of data controlled by owner of data
TTL passed with every record
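A minimal sketch of the caching behaviour described above, assuming a simple in-memory table. DnsCache and its methods are hypothetical names; the negative entry's name and its 60-second TTL are made up for illustration, while the positive entry reuses the record and TTL from the dig output shown earlier. The clock is injectable so that expiry can be shown without waiting.

```python
import time


class DnsCache:
    """TTL cache sketch holding positive and negative answers."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # name -> (expires_at, addresses or None)

    def put(self, name, addresses, ttl):
        # addresses=None records a negative answer ("no such name").
        self._entries[name] = (self._clock() + ttl, addresses)

    def get(self, name):
        """Return (hit, addresses); a hit with None is a cached negative."""
        entry = self._entries.get(name)
        if entry is None:
            return (False, None)
        expires_at, addresses = entry
        if self._clock() >= expires_at:
            del self._entries[name]  # TTL ran out: forget it and re-resolve
            return (False, None)
        return (True, addresses)


now = [0.0]  # fake clock, in seconds
cache = DnsCache(clock=lambda: now[0])
cache.put("ina.kaist.ac.kr.", ["143.248.56.236"], ttl=7200)
cache.put("no-such-host.kaist.ac.kr.", None, ttl=60)  # hypothetical name

fresh = cache.get("ina.kaist.ac.kr.")
negative = cache.get("no-such-host.kaist.ac.kr.")
now[0] = 7200.0  # the owner's TTL has elapsed
expired = cache.get("ina.kaist.ac.kr.")
print(fresh, negative, expired)
```

Because the owner of the data sets the TTL, the cache only has to obey it; it needs no coordination with the authoritative server.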
Typical Resolution
[Figure: to resolve www.cs.cmu.edu, the client asks its local DNS server, which contacts the root & edu DNS server, the ns1.cmu.edu DNS server, and the ns1.cs.cmu.edu DNS server.]
Subsequent Lookup Example
[Figure: subsequent lookup of ftp.cs.cmu.edu, showing the client, the local DNS server, the root & edu DNS server, the cmu.edu DNS server, and the cs.cmu.edu DNS server.]
dig
dongsuh@ina:~/Documents/sdn-cdn-doc/nsdi$ dig www.google.com +trace
; <<>> DiG 9.8.1-P1 <<>> www.google.com +trace
;; global options: +cmd
. 436533 IN NS k.root-servers.net.
. 436533 IN NS d.root-servers.net.
. 436533 IN NS m.root-servers.net.
. 436533 IN NS f.root-servers.net.
. 436533 IN NS l.root-servers.net.
. 436533 IN NS j.root-servers.net.
. 436533 IN NS h.root-servers.net.
. 436533 IN NS b.root-servers.net.
. 436533 IN NS c.root-servers.net.
. 436533 IN NS e.root-servers.net.
. 436533 IN NS a.root-servers.net.
. 436533 IN NS i.root-servers.net.
. 436533 IN NS g.root-servers.net.
;; Received 420 bytes from 127.0.0.1#53(127.0.0.1) in 11 ms
com. 172800 IN NS d.gtld-servers.net.
com. 172800 IN NS m.gtld-servers.net.
com. 172800 IN NS g.gtld-servers.net.
com. 172800 IN NS j.gtld-servers.net.
com. 172800 IN NS i.gtld-servers.net.
com. 172800 IN NS a.gtld-servers.net.
com. 172800 IN NS f.gtld-servers.net.
com. 172800 IN NS e.gtld-servers.net.
com. 172800 IN NS k.gtld-servers.net.
com. 172800 IN NS b.gtld-servers.net.
com. 172800 IN NS l.gtld-servers.net.
com. 172800 IN NS h.gtld-servers.net.
com. 172800 IN NS c.gtld-servers.net.
;; Received 492 bytes from 192.112.36.4#53(192.112.36.4) in 1275 ms
google.com. 172800 IN NS ns2.google.com.
google.com. 172800 IN NS ns1.google.com.
google.com. 172800 IN NS ns3.google.com.
google.com. 172800 IN NS ns4.google.com.
;; Received 168 bytes from 192.35.51.30#53(192.35.51.30) in 210 ms
www.google.com. 300 IN A 74.125.128.99
www.google.com. 300 IN A 74.125.128.147
www.google.com. 300 IN A 74.125.128.106
Switch gTLD and ccTLD (incorrect figure)
But you get the idea
Root name servers (13 servers)
Actually many physical root name servers
Several root servers have multiple physical servers
They share a single IP address
Packets routed to “nearest” server by anycast
http://www.root-servers.org/
Reverse DNS
Task
Given IP address, find its name
Method
Maintain separate hierarchy based on IP addresses
Write 128.2.194.242 as 242.194.128.2.in-addr.arpa
Why is the address reversed?
Managing
Authority manages IP addresses assigned to it
E.g., CMU manages name space 2.128.in-addr.arpa (addresses 128.2.x.x)
[Figure: name hierarchy under the unnamed root. One branch is edu > cmu > cs > cmcl > kittyhawk (128.2.194.242); the other is arpa > in-addr > 128 > 2 > 194 > 242.]
.arpa Name Server Hierarchy
At each level of hierarchy, have group of servers that are authorized to handle that region of hierarchy
[Figure: servers along the path in-addr.arpa > 128 > 2 > 194 > kittyhawk (128.2.194.242). Servers shown: a.root-servers.net ... m.root-servers.net; chia.arin.net (dill, henna, indigo, epazote, figwort, ginseng); cucumber.srv.cs.cmu.edu, t-ns1.net.cmu.edu, t-ns2.net.cmu.edu; mango.srv.cs.cmu.edu (peach, banana, blueberry).]
Prefetching
Name servers can add additional data to response
Typically used for prefetching
CNAME/MX/NS typically point to another host name
Responses include address of host referred to in “additional section”
Mail Addresses
MX records point to mail exchanger for a name
E.g. mail.acm.org is MX for acm.org
Addition of MX record type proved to be a challenge
How to get mail programs to look up MX record for mail delivery?
Needed critical mass of such mailers
DNS (Summary)
Motivations: large distributed database
Scalability
Independent update
Robustness
Hierarchical database structure
Zones
How is a lookup done
Caching/prefetching and TTLs
Reverse name lookup
HTTP Caching
Clients often cache documents
Challenge: update of documents
If-Modified-Since requests to check
HTTP 0.9/1.0 used just date
HTTP 1.1 has an opaque “entity tag (ETAG)” (could be a file signature, etc.) as well
When/how often should the original be checked for changes?
Check every time?
Check each session? Day? Etc?
Use Expires header
If no Expires, often use Last-Modified as estimate
Example Cache Check Request
GET / HTTP/1.1
Accept: */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
If-Modified-Since: Mon, 29 Jan 2001 17:54:18 GMT
If-None-Match: "7a11f-10ed-3a75ae4a"
User-Agent: Mozilla/4.0 (compatible; MSIE 5.5; Windows NT 5.0)
Host: www.intel-iris.net
Connection: Keep-Alive
Example Cache Check Response
HTTP/1.1 304 Not Modified
Date: Tue, 27 Mar 2001 03:50:51 GMT
Server: Apache/1.3.14 (Unix) (Red-Hat/Linux) mod_ssl/2.7.1 OpenSSL/0.9.5a DAV/1.0.2 PHP/4.0.1pl2 mod_perl/1.24
Connection: Keep-Alive
Keep-Alive: timeout=15, max=100
ETag: "7a11f-10ed-3a75ae4a"
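The cache check in the request and response above can be sketched as a small server-side decision function. cache_check_status() is a hypothetical name, and the resource's Last-Modified value is an assumption (the slides do not show it). The sketch does plain string comparison of entity tags and ignores weak validators, so it simplifies what a real server does.

```python
from email.utils import parsedate_to_datetime


def cache_check_status(request_headers, current_etag, last_modified):
    """Return 304 if the client's cached copy is still valid, else 200.

    last_modified is an HTTP-date string, as a server would send it in a
    Last-Modified header.
    """
    if_none_match = request_headers.get("If-None-Match")
    if if_none_match is not None:
        # When an entity tag is supplied, it takes precedence over the date.
        tags = [t.strip() for t in if_none_match.split(",")]
        return 304 if current_etag in tags or "*" in tags else 200

    if_modified_since = request_headers.get("If-Modified-Since")
    if if_modified_since is not None:
        since = parsedate_to_datetime(if_modified_since)
        modified = parsedate_to_datetime(last_modified)
        return 200 if modified > since else 304

    return 200  # unconditional request: send the full document


request = {
    "If-Modified-Since": "Mon, 29 Jan 2001 17:54:18 GMT",
    "If-None-Match": '"7a11f-10ed-3a75ae4a"',
}
status = cache_check_status(
    request,
    current_etag='"7a11f-10ed-3a75ae4a"',
    last_modified="Mon, 29 Jan 2001 17:54:18 GMT",  # assumed value
)
print(status)  # 304
```

A 304 response carries no body, so the client reuses its cached copy and only the headers cross the network.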
Content Distribution Networks (CDNs)
The content providers are the CDN customers.
Content replication
CDN company installs hundreds of CDN servers throughout Internet
Close to users
CDN replicates its customers’ content in CDN servers. When provider updates content, CDN updates servers
[Figure: origin server in North America, a CDN distribution node, and CDN servers in S. America, Europe, and Asia.]
Server Selection
Service is replicated in many places in network
How do we direct clients to a particular server?
As part of routing: anycast, cluster load balancing
As part of application: HTTP redirect
As part of naming: DNS
Which server?
Lowest load: to balance load on servers
Best performance: to improve client performance
Based on Geography? RTT? Throughput? Load?
Any alive node: to provide fault tolerance
Routing Based
Anycast
Give service a single IP address
Each node implementing service advertises route to address
Packets get routed from client to “closest” service node
Closest is defined by routing metrics
May not mirror performance/application needs
What about the stability of routes?
Application Based
HTTP supports a simple way to indicate that a Web page has moved
Server gets GET request from client
Decides which server is best suited for particular client and object
Returns HTTP redirect to that server
Can make informed application-specific decision
May introduce additional overhead: multiple connection setups, name lookups, etc.
While a good solution in general, HTTP redirect has some design flaws, especially with current browsers
Naming Based
Client does name lookup for service
Name server chooses appropriate server address
A-record returned is “best” one for the client
What information can name server base decision on?
Server load/location must be collected
Information in the name lookup request
The name server’s client is typically the local name server of the end client, not the end client itself
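A minimal sketch of naming-based selection, under loud assumptions: the replica table, the loads, the resolver-to-region mapping, and choose_a_record() are all hypothetical, and the addresses come from the reserved documentation ranges. It illustrates that the name server only sees the address of the requesting local name server, and that load and location data must be collected out of band.

```python
# Hypothetical replica table: address -> (region, current load in [0, 1]).
REPLICAS = {
    "192.0.2.10": ("asia", 0.30),
    "192.0.2.20": ("asia", 0.80),
    "198.51.100.10": ("europe", 0.10),
}

# Hypothetical mapping from a local name server's address to its region.
RESOLVER_REGION = {"203.0.113.5": "asia", "203.0.113.9": "europe"}


def choose_a_record(resolver_ip):
    """Pick the replica address to return in the A record."""
    region = RESOLVER_REGION.get(resolver_ip)
    nearby = [ip for ip, (r, _) in REPLICAS.items() if r == region]
    candidates = nearby or list(REPLICAS)  # unknown region: consider all
    # Among the candidates, prefer the least loaded replica.
    return min(candidates, key=lambda ip: REPLICAS[ip][1])


print(choose_a_record("203.0.113.5"))
```

If the end client is far from its local name server, the region inferred this way can be wrong, which is one limitation of deciding from the lookup request alone.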
How Akamai Works
Clients fetch html document from primary server
E.g. fetch index.html from cnn.com
URLs for replicated content are replaced in html
E.g. <img src="http://cnn.com/af/x.gif"> replaced with <img src="http://a73.g.akamaitech.net/7/23/cnn.com/af/x.gif">
Client is forced to resolve aXYZ.g.akamaitech.net hostname, where XYZ is the hash of the URL
How Akamai Works
How is content replicated?
Akamai only replicates static content (*)
Modified name contains original file name
Akamai server is asked for content
First checks local cache
If not in cache, requests file from primary server and caches file
* (At least, the version we’re talking about today. Akamai actually lets sites write code that can run on Akamai’s servers, but that’s a pretty different beast)
How Akamai Works
Root server gives NS record for akamai.net
Akamai.net name server returns NS record for g.akamaitech.net
Name server chosen to be in region of client’s name server
TTL is large
g.akamaitech.net name server chooses server in region
Should try to choose server that has file in cache. How to choose?
Uses aXYZ name and hash
TTL is small. Why?
Simple Hashing
Given document XYZ, we need to choose a server to use
Suppose we use modulo
Number servers from 1…n
Place document XYZ on server (XYZ mod n)
What happens when a server fails? n becomes n-1
Same if different people have different measures of n
Why might this be bad?
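To make the question concrete, the sketch below counts how many documents land on a different server when n drops from 10 to 9 under modulo placement. The document names, the choice of SHA-256 as the hash, and the helper names are assumptions for illustration; servers are numbered 0…n-1 here rather than 1…n.

```python
import hashlib


def doc_hash(name):
    """Stable integer hash of a document name (SHA-256 is an arbitrary choice)."""
    return int.from_bytes(hashlib.sha256(name.encode()).digest()[:8], "big")


def server_for(name, n):
    """Modulo placement: document goes to server (hash mod n)."""
    return doc_hash(name) % n


docs = [f"doc{i}" for i in range(10000)]  # hypothetical document names
n = 10
moved = sum(server_for(d, n) != server_for(d, n - 1) for d in docs)
fraction_moved = moved / len(docs)
print(f"{fraction_moved:.0%} of documents change servers when n goes {n} -> {n - 1}")
```

With a well-mixed hash, a document keeps its server only when its hash gives the same remainder mod 10 and mod 9, which happens for about one value in ten, so roughly 90% of documents move. Almost every cache is cold after a single failure, and clients that disagree on n disagree on where nearly every document lives.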
Consistent Hash
“view” = subset of all hash buckets that are visible
Desired features
Balanced – in any one view, load is equal across buckets
Smoothness – little impact on hash bucket contents when buckets are added/removed
Spread – small set of hash buckets that may hold an object regardless of views
Load – across all views # of objects assigned to hash bucket is small
Consistent Hash – Example
Smoothness: addition of bucket does not cause movement between existing buckets
Spread & Load: small set of buckets that lie near object
Balance: no bucket is responsible for large number of objects
Construction
• Assign each of C hash buckets to random points on a mod 2^n circle, where hash key size = n
• Map object to random position on circle
• Hash of object = closest clockwise bucket
[Figure: hash circle with positions 0, 4, 8, 12 marked and a bucket at position 14.]
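The construction can be sketched as a sorted list of bucket positions searched with bisect. This is a minimal illustration, not Akamai's implementation: the Ring class, the bucket and object names, the 32-bit circle, and the use of SHA-256 to get pseudo-random positions are all assumptions, and collisions between bucket positions are ignored.

```python
import bisect
import hashlib


def point(key, bits=32):
    """Map a string to a pseudo-random point on the mod 2^bits circle."""
    digest = hashlib.sha256(key.encode()).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Ring:
    def __init__(self, buckets=()):
        self._points = []  # sorted bucket positions on the circle
        self._owner = {}   # position -> bucket name
        for b in buckets:
            self.add(b)

    def add(self, bucket):
        p = point(bucket)
        bisect.insort(self._points, p)
        self._owner[p] = bucket

    def lookup(self, obj):
        """Return the first bucket at or clockwise after the object's point."""
        i = bisect.bisect_left(self._points, point(obj))
        return self._owner[self._points[i % len(self._points)]]  # wrap around


ring = Ring(["bucket-a", "bucket-b", "bucket-c"])
objs = [f"obj{i}" for i in range(1000)]
before = {o: ring.lookup(o) for o in objs}
ring.add("bucket-d")
after = {o: ring.lookup(o) for o in objs}
moved = [o for o in objs if before[o] != after[o]]
print(f"{len(moved)} of {len(objs)} objects moved, all of them to bucket-d")
```

Every object that changes owner moves to the new bucket, and nothing moves between the existing buckets, which is the smoothness property. With only one point per bucket the load can be uneven; placing each bucket at several points on the circle is a common way to improve balance.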
How Akamai Works
[Figure: numbered request flow (steps 1-12) among the end-user, cnn.com (content provider), the DNS root server, the Akamai high-level DNS server, the Akamai low-level DNS server, and a nearby matching Akamai server. Labeled requests: "Get index.html", "Get /cnn.com/foo.jpg", and "Get foo.jpg".]
Akamai – Subsequent Requests
[Figure: the same flow for later requests, with only steps 1, 2, and 7-10 remaining, among the end-user, cnn.com (content provider), the Akamai low-level DNS server, and a nearby matching Akamai server. Labeled requests: "Get index.html" and "Get /cnn.com/foo.jpg".]
Important Lessons
Akamai CDN illustrates a range of ideas
BASE (not ACID) design
Weak consistency
Naming of objects: location translation
Consistent hashing
Why are these the right design choices for this application?