48
The Domain Name System (DNS) J. C. Allen March 14, 2004

The Domain Name System (DNS) J. C. Allen March 14, 2004

  • View
    215

  • Download
    2

Embed Size (px)

Citation preview

The Domain Name System (DNS)

J. C. AllenMarch 14, 2004

History of DNS

● Mid- to late-1960s:– U. S. Department of Defense Advanced Research

Projects Agency (ARPA)– 1968. RFQ DAHC15 69 Q 0002, dated July 29, 1968

(“ARPANET”)– 1969. ARPANET was commissioned:

● IMPs (Bolt Beranek and Newman)● Network topology (Network Analysis Corp.)● Network measurement system (UCLA)

History (Continued)

● The goal of the ARPANET: Connect important research centers in the United States, allowing government contractors to share computing resources to promote research and joint development

History (Continued)

● Early 1980s– TCP/IP developed– Quickly became de facto standard

● Included in BSD Unix● BSD Unix free to universities

– ARPANET rapidly grew from hundreds of hosts to tens of thousands

– In time, ARPANET became the backbone of the Internet

The Rapid Growth of the Internet Created a Problem

● 1970s– ARPANET small with only a few hundred hosts– All routing information (name-address mappings)

contained in single file: HOSTS.TXT● HOSTS.TXT maintained by SRI NIC (“the NIC”)● HOSTS.TXT distributed from a single host: SRI-NIC

– Network administrators● Update (email)● Download most current version (FTP)

Rapid Growth (Continued)

● As hosts were added to the network, this scheme proved to be unworkable– Size of HOSTS.TXT grew with the size of

ARPANET– Traffic generated by update process increased with

each additional host. Each additional host:● was another host updating from SRI-NIC● required another line in HOSTS.TXT

Rapid Growth (Continued)

● 1980s– ARPANET moved to TCP/IP– Size of the network exploded– Immediate problems:

● Network traffic/load on SRI-NIC● Bandwidth consumed distributing a new version

proportional to the square of the number of hosts● Name collision. The NIC could guarantee unique

addresses, but had no authority over host names● HOSTS.TXT obsolete by the time it reached the most

remote hosts

Rapid Growth (Continued)

● HOSTS.TXT update process not scalable● ARPANET chartered search for replacement.

System selected should:– Allow local administration of data– Make name-address mappings available globally– Use a hierarchical namespace to ensure names are

unique

Rapid Growth (Continued)

● 1984. Paul Mockapetris released RFCs 882 and 883, which describe the DNS

● RFCs 882 and 883 later superseded by RFCs 1034 and 1035

The Design Goals of the DNS (RFC 1034)

● The primary goal is a consistent name space which will be used for referring to resources. In order to avoid the problems caused by ad hoc encodings, names should not be required to contain network identifiers, addresses, routes, or similar information as part of the name.

Design Goals (Continued)

● The sheer size of the database and frequency of updates suggest that it must be maintained in a distributed manner, with local caching to improve performance.

● Where there tradeoffs between the cost of acquiring data, the speed of updates, and the accuracy of caches, the source of the data should control the tradeoff.

Design Goals (Continued)

● The costs of implementing such a facility dictate that it be generally useful, and not restricted to a single application.

● Because we want the name space to be useful in dissimilar networks and applications, we provide the ability to use the same name space with different protocol families or management.

Design Goals (Continued)

● We want name server transactions to be independent of the communications system that carries them.

● The system should be useful across a wide spectrum of host capabilities. Both personal computers and large timeshared hosts should be able to use the system, though perhaps in different ways.

Definitions

● Domain Name Space. The domain name space is represented as an inverse tree representing all possible domains. Each node and leaf of the domain name space provides a label for a set of information, which may be empty. Query operations attempt to extract specific types of information from a particular set.

Definitions (Continued)

● Node. A node is the point at which the tree branches.

● Domain Name. Each node has a text label. The full domain name of a given node is the list of labels on the path from that node to the root node. Domain names are always read from the node toward the root node. A domain name uniquely identifies a single node in the domain name space.

Definitions (Continued)

● Fully Qualified Domain Name (FQDN). The NULL (zero-length) domain name is reserved for the root node. Therefore, a domain name which ends in a dot actually ends in a dot and the root node's label. This is interpreted as an indicator that the domain name is absolute and unambiguously specifies the node's location in the domain name space. An absolute domain name is referred to as a fully qualified domain name.

Definitions (Continued)

● Domain. A domain is simply a subtree of the domain name space. The domain name of a domain is the same as the domain name of the node at the top of the domain.

● Subdomain. A subdomain is a subtree of a subtree. A domain is a subdomain of another domain if the root of the subdomain is contained within the domain.

Definitions (Continued)

● Resource Records.– owner. The owner of a resource record is the domain

name where the resource record is found.– class. Resource records are divided into classes.

Each class corresponds to a protocol family or instance of a protocol. For example:

● "IN" describes internets based on TCP/IP.● "CH" describes internets based on the Chaosnet protocol.

Definitions (Continued)

● Each record type of a given class is defined by a particular syntax. All resource records of that class must adhere to the syntax.

– type. Different classes define different record types. The type specifies the type of resource in the resource record. Some types are common to more than one class.

– TTL (time-to-live). Time to live describes the amount of time a resource record can be cached before it should be discarded.

Definitions (Continued)

– RDATA. The type and sometimes class dependent data that describes the resource.

● Examples:– localhost.cnu.edu IN A 127.0.0.1– lh IN CNAME localhost.cnu.edu

Definitions (Continued)

● Top-Level Domain (TLD) or Generic Top-Level Domain (gTLD). A TLD is a child of the root node.

● Delegation. Delegation is the process by which responsibility for a subdomain is assigned to another organization.

Definitions (Continued)

● Name Servers. Name server, in this sense, is used to refer to a server application that stores information about the domain name space's structure and set information. Name servers have complete information about a subset of the domain name space, called a zone, and pointers to other name servers that lead to information from any part of the domain name space.

Definitions (Continued)

● Primary Master. A primary master reads the data for the zone from a file on its host. A primary master is authoritative.

● Secondary Master. A secondary master retrieves zone data from another master name server that is authoritative at some interval, which may be selected. This is frequently, but not necessarily, the primary master. A secondary master is authoritative.

Definitions (Continued)

● Zone. A zone is a smaller unit than a domain, and may be, but is not necessarily, a subdomain. A domain is broken up into zones to make it more manageable by delegation.

Definitions (Continued)

● Zone Transfer. The process by which a secondary master contacts its master server and, if necessary, transfers the zone data files. Changes are made to the zone data file on the primary master. Secondary masters periodically check for changes and obtain new copies of the zone data files when necessary.

Definitions (Continued)

● Zone Data Files ("data files" or "database files"). Zone data files are files from which the primary master name servers load their zone data. Secondary masters back up zone data to data files. If killed, secondary masters check their backup first. If the zone data is current, there is no need to transfer files from the primary master. Zone data files contain the resource records that describe a zone.

Definitions (Continued)

● Resolver. Resolvers are programs that access name servers. A resolver must be able to access at least one name server and use that name server's information to answer a query directly, or pursue the query using referrals to other name servers. Resolvers provide a service by querying name servers, interpreting the response, and returning information to the application that requested it.

Definitions (Continued)

● Resolution. Resolution is the process by which name servers satisfy client requests.

● Root Name Server. A root name server knows where the authoritative name servers for each TLD are. There are thirteen (13) root name servers spread across different parts of the network.

Definitions (Continued)

● Recursion. Recursion, or recursive resolution, is the name for the process used by name servers which receive recursive queries. The name server takes on the task of locating the requested resource. If it does not already know the answer, it queries other name servers, following any referrals it receives, until an authoritative answer to the query is received. Recursion is the simplest mode for the client.

Definitions (Continued)

● Iteration. Iteration, or iterative resolution, is the name for the process used by name servers which receive iterative queries. The name server returns the best answer that it already knows to the requester using only information available locally. The response to an iterative query contains an error, the answer, or a referral to some other server closer to the answer. Iteration is the simplest mode for the name server.

Definitions (Continued)

● Caching. Caching refers to the process by which the information returned by DNS queries is locally replicated, eliminating the need to re-query each time the same information is requested.

Features

● DNS uses query-response to locate resources in the domain name space– Applications use resolvers (stub vs. “smart”)– Resolvers construct queries and send them to name

servers– Response from name server either:

● provides the information requested,● refers the requester to another name server, or● signals some error has occurred

Features (Continued)

● DNS queries and responses have a standard format:– header containing required information– four sections containing specific parameters:

● Question (query name)● Answer (answer the question)● Authority (describe other authoritative servers)● Additional (may be helpful)

Features (Continued)

● Standard query (SQUERY) specifies:– target domain name (QNAME)– type (QTYPE)– class (QCLASS)

● Resolver constructs query, sends to name server● Name server searches for matching resource

records

Features (Continued)

● If no exact match, name server– returns a pointer to a name server “closer” to the

answer, or– information that may be useful,– depending on type of query: recursive or iterative– Recursion if resolver and name server agree to its use,

iteration otherwise.

Features (Continued)

● Example query:

+---------------------------------------------------+ Header | OPCODE=SQUERY | +---------------------------------------------------+ Question | QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A | +---------------------------------------------------+ Answer | <empty> | +---------------------------------------------------+ Authority | <empty> | +---------------------------------------------------+ Additional | <empty> | +---------------------------------------------------+

Features (Continued)

● Example response:

+---------------------------------------------------+ Header | OPCODE=SQUERY, RESPONSE, AA | +---------------------------------------------------+ Question | QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A | +---------------------------------------------------+ Answer | SRI-NIC.ARPA. 86400 IN A 26.0.0.73 | | 86400 IN A 10.0.0.51 | +---------------------------------------------------+ Authority | <empty> | +---------------------------------------------------+ Additional | <empty> | +---------------------------------------------------+

Features (Continued)

● Another example response:

+---------------------------------------------------+ Header | OPCODE=SQUERY,RESPONSE | +---------------------------------------------------+ Question | QNAME=SRI-NIC.ARPA., QCLASS=IN, QTYPE=A | +---------------------------------------------------+ Answer | SRI-NIC.ARPA. 1777 IN A 10.0.0.51 | | 1777 IN A 26.0.0.73 | +---------------------------------------------------+ Authority | <empty> | +---------------------------------------------------+ Additional | <empty> | +---------------------------------------------------+

Structure

● The DNS has three major components:– The domain name space and resource records

describe the inverse tree structure of the name space and any associated data. Each node of the domain tree describes a set of data which is a subset of the whole tree.

Structure (Continued)

– Name servers hold information about the domain tree's structure and set information. A name server may maintain a cache of structure or set information about any part of the domain tree. A name server has complete information about some subset of the domain name space for which it is authoritative, called a zone, and pointers to other name servers that have information about any other part of the domain tree.

Structure (Continued)

– Resolvers are programs or library routines that provide a service to user programs. Resolvers attempt to extract information from name servers by sending queries to a name server and handling the response. Queries attempt to extract specific types of information from a particular set of data. A query provides the domain name of interest and describes the type of information that is being requested.

Implementation

● Implementation details are beyond the scope of this presentation. In general (using BIND):– The zone administrator creates the zone data files

which describe the zone. These include a file containing the resource records which provide information on specific types of information which are available in the zone, a file which describes the loopback address hosts use to direct traffic to themselves, and a file which contains the locations of the name servers for the root zone and which is obtained from InterNIC via anonymous FTP.

Implementation (Continued)

– Next, the administrator configures BIND. The BIND configuration file, called named.boot (older versions of BIND) or named.conf, provides the locations of the zone data files on the name server and the values of any options to be passed to BIND.

– Finally, the name server is started with the /usr/sbin/named command.

Implementation (Continued)

– The process for implementing a secondary master name server is similar. Again, specific details in the zone data files are changed to indicate that the name server is a secondary master. When the secondary master is started it reads zone data from these files. However, it receives all updates from the primary master server as described previously via a zone transfer.

Significance

● A distributed system is defined as one in which components at networked computers communicate and coordinate their actions only by passing messages. This is clearly the case with the DNS.– Name servers and resolvers pass messages back and

forth in the form of query and response to coordinate the location of resources on a global network.

Significance (Continued)

● In addition, the DNS has the following characteristics:– Data available through client-server interaction.– Data replicated to provide global access to data– Data cached to minimize network traffic and load.– Data distributed so that the name server most likely to

know its location is authoritative.– DNS is scalable.– DNS is transparent

Summary

The DNS is a distributed database structured as an inverse tree. Each node in the tree describes a set of data. The data is requested by user programs. The user program uses a resolver to construct the query and send it to a name server. A name server provides a response to client (resolver) requests. Through client-server interaction, the location of every resource on the Internet may be determined. The decentralized nature of the DNS reduces network traffic and server load and increases ease of administration. Applications exist to implement DNS which are dependent on the underlying hardware and operating system.

References

1. P. Albitz and C. Liu, DNS and BIND, Fourth Ed. (O'Reilly & Associates, Inc., Sebastopol, CA, 2001)

2. P. Mockapetris, RFC-1034 Domain Names - Concepts and Facilities (Information Sciences Institute, November 1987)

3. P. Mockapetris, RFC-1035 Domain Names - Implementation and Specification (Information Sciences Institute, November 1987)

4. G. Coulouris, J. Dollimore, and T. Kindberg, Distributed Systems, Concepts and Design, Third Ed. (Pearson Education Ltd., Harlow, 2001)

5. B. Leiner et al., A Brief History of the Internet, version 3.32 (Internet Society, www.isoc.org/internet/history/brief.shtml, 2003)