1. Scalability



    Distributed Operating Systems

    Amdahl's law

    From Wikipedia, the free encyclopedia

    Amdahl's law, also known as Amdahl's argument,[1] is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors.

    The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimum execution time cannot be less than that critical 1 hour. Hence the speedup is limited to at most 20, as the diagram illustrates.

    The speedup of a program using multiple processors in parallel computing is limited by the sequential fraction of the program. For example, if 95% of the program can be parallelized, the theoretical maximum speedup using parallel computing would be 20, as shown in the diagram, no matter how many processors are used.

    http://en.wikipedia.org/wiki/Amdahl's_law
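    In formula form, if p is the fraction of the program that can be parallelized and N is the number of processors, the speedup is S(N) = 1 / ((1 - p) + p / N), which approaches 1 / (1 - p) as N grows. The short Python sketch below (our own illustration; the processor counts are arbitrary) reproduces the limit of 20 for the 95% example above:

        def amdahl_speedup(p, n):
            # Amdahl's law: p = parallelizable fraction, n = number of processors.
            return 1.0 / ((1.0 - p) + p / n)

        # The example from the text: 1 hour serial, 19 hours parallelizable (p = 0.95).
        for n in (1, 2, 4, 64, 1024, 65536):
            print(n, "processors ->", round(amdahl_speedup(0.95, n), 2), "x speedup")

        # No matter how many processors are used, the speedup stays below 1 / (1 - p) = 20.
        print("upper bound:", round(1.0 / (1.0 - 0.95), 2))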

    Scalability

    Definition


    Scalability indicates the capability of a system to increase performance under an increased load when resources (typically hardware) are added. -- Wikipedia

    A computer system (HW + SW) is called scalable if it can scale up (improve its resources) to accommodate ever-increasing performance and functionality demand and/or scale down (decrease resources) to reduce cost. -- Wang, Xu 98

    A system is scalable if it works well for very large and very small numbers.

    Partitioning

    Partitioning is the process of splitting a system into parts that can operate independently to a large extent.

    Replication

    The process of creating and managing duplicate versions of a database. Replication not only

    copies a database but also synchronizes a set of replicas so that changes made to one replica

    are reflected in all the others. The beauty of replication is that it enables many users to work

    with their own local copy of a database but have the database updated as if they were working

    on a single, centralized database. For database applications where users are geographically

    widely distributed, replication is often the most efficient method of database access.

    The Lotus Notes system was one of the first to make replication a central component of its

    design, which has been one of the main reasons for its success.

    Replication (pronounced rehp-lih-KA-shun) is the process of making a replica (a copy) of

    something. A replication (noun) is a copy. The term is used in fields as varied as

    microbiology (cell replication), knitwear (replication of knitting patterns), and information

    distribution (CD-ROM replication).

    On the Internet, a Web site that has been replicated in its entirety and put on another site is

    called a mirror site.

    Replication (computer science)

    Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices, or computation replication if the same computing task is executed many times. A computational task is typically replicated in space, i.e. executed on separate devices, or it could be replicated in time, if it is executed repeatedly on a single device.
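    As a toy illustration of data replication (our own sketch, not any particular system's design), every write can be applied synchronously to several independent stores, so a read can then be served by any replica:

        class ReplicatedStore:
            # Toy data replication: every write goes to all replicas; any replica can serve reads.

            def __init__(self, num_replicas=3):
                self.replicas = [dict() for _ in range(num_replicas)]

            def write(self, key, value):
                # Propagate the change to every replica so they stay consistent.
                for replica in self.replicas:
                    replica[key] = value

            def read(self, key, replica_id=0):
                # All replicas hold the same data, so any of them can answer.
                return self.replicas[replica_id].get(key)

        store = ReplicatedStore()
        store.write("user:42", {"name": "Ada"})
        print(store.read("user:42", replica_id=2))   # -> {'name': 'Ada'}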


    Load balancing is different from task replication, since it distributes a load of different (not

    the same) computations across machines, and allows a single computation to be dropped in

    case of failure. Load balancing, however, sometimes uses data replication (esp. multi-master)

    internally, to distribute its data among machines.

    Backup is different from replication, since it saves a copy of data unchanged for a long period of time. Replicas, on the other hand, are frequently updated and quickly lose any historical state.

    Goals:

    Fault tolerance

    Locality of queries -> shorten interpretation route

    Load Balancing especially for higher levels which receive many queries


    Cache

    In computer science, a cache is a collection of data duplicating original values stored

    elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer

    access time) or to compute, compared to the cost of reading the cache.

    In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, it can be used in the future by accessing the cached copy rather than re-fetching or recomputing the original data.
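    A minimal in-memory cache along these lines (a sketch of the idea, not any specific product; the slow fetch is simulated) checks the cache first and only falls back to the expensive fetch on a miss:

        import time

        cache = {}

        def expensive_fetch(key):
            # Stand-in for a slow fetch or computation (simulated with a delay).
            time.sleep(1)
            return key.upper()

        def get(key):
            # Serve from the cache when possible; otherwise fetch and remember the result.
            if key not in cache:
                cache[key] = expensive_fetch(key)
            return cache[key]

        print(get("hello"))   # slow: the first access populates the cache
        print(get("hello"))   # fast: served from the cached copy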

    What is Caching?

    Introduction

    When you view a website for the first time your browser downloads all the various page

    elements (images, text, style sheets etc.) to your desktop computer's hard drive. This is your

    local 'cached copy' of the web page. The next time you visit the site, your browser first looks in the cache and displays the local copy rather than going to the bother of downloading it all again.

    This makes web browsing much quicker; for example, if you press your 'back' button to a

    page you just visited it will appear almost instantly, without having to download all those

    images again.

    That's the theory anyway, and it's generally a good system for most users. But we - and our

    clients - are not 'most users'. We're special!

    Why is it a problem?


    Okay, so a site you've visited previously can load super-fast because it is, in effect, sitting on

    your own computer. But what if the site has changed since you last visited? You could be

    looking at something that's out of date. Your browser has a system of checking for page

    elements that have been updated, but in practice this doesn't always work - especially in a

    fast-moving website development situation.

    During website development and maintenance things can be changing all the time; new

    pictures, changed version of logos, text updates. We make the change, upload it to the web

    server and ask the client to check it's all as they requested.

    Thanks to caching it's usually about this time we'll get a phone call asking why the change

    hasn't been made. The client has visited the site to check the work and can't see any difference

    because what they are actually viewing is their local copy of the page, cached before the

    changes were implemented.

    Note: Pages that are served dynamically (for example using ASP or PHP) suffer less from this issue, as the text content tends to be drawn from a database, requiring a new call to the server every time the page is requested. Items such as graphics or Flash elements tend not to be stored in the database and will be cached. In fact, it's changes to images that most commonly cause a problem.

    How to get around browser caching

    Generally speaking, caching is a 'good thing' as it will speed up your day-to-day browsing

    experience. But like I said, you and we are special, and when we're working on developing a

    website it's a good idea to know how to get around your browser's cache.

    Replication vs. Caching

    Both Cascade File System and Cascade Proxy use caching to speed up access to slow

    repositories. The first time someone accesses a file, it is downloaded from the appropriate

    server. Subsequent accesses pull the file from the cache instead. Eventually the cache will

    fill up, and as new files are downloaded, old files are automatically evicted from the cache.

    Caching is not the only way to speed up access to a slow repository. Another option is

    replication: mirroring the contents of a repository on multiple servers. You might set up a

    master repository at your main office and create replicas at each remote site. As changes

    are committed to the master repository, they are mirrored over to the remote replicas.

    To put it another way, caching is a pull model (data is pulled as it is requested), whereas replication is a push model (data is pushed as it becomes available, regardless of whether it has been requested).

    http://www.conifersystems.com/whitepapers/replication/

    Replication's Advantages

    Replication has one big performance advantage over caching: it accelerates the first access to

    a file, not just subsequent accesses. Replication has a number of disadvantages, however, and

    this advantage is not as clear-cut as it may seem. Combining caching with prefetching has

    much the same effect.

    For example, if your developers start to come in to the office at 8AM, you might kick off a prefetch of all the files they typically use at 7AM, and they'll all be locally cached before anyone arrives. Or, if you work from home, you could start a prefetch each day in the afternoon, and by the time you get home most of the files you need will already be cached. You don't need to prefetch everything, just the most important files, and this sort of prefetching can be automated using tools like cron.
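    Concretely, such a prefetch job could be nothing more than a script that requests the important files once so the cache is already warm; the URLs below are hypothetical, and in practice the script would be scheduled by cron (say, at 7AM) rather than run by hand:

        import urllib.request

        # Hypothetical list of the files developers typically need first thing in the morning.
        IMPORTANT_FILES = [
            "http://cache.example.internal/repo/src/main.c",
            "http://cache.example.internal/repo/docs/spec.pdf",
        ]

        def prefetch(urls):
            # Requesting each file once makes the caching proxy pull it into its cache.
            for url in urls:
                try:
                    urllib.request.urlopen(url, timeout=30).read()
                except OSError as err:
                    print("prefetch failed for", url, "-", err)

        if __name__ == "__main__":
            prefetch(IMPORTANT_FILES)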

    Another important feature of replication is that it doubles as an efficient way to do backups.

    If you have an entire copy of your repository offsite, the chances that you will lose all your

    data are slim. A cache may allow you to recover some files after a loss of data, but it is not a

    replacement for a real backup system.

    Replication also is ideal for disconnected operation. If you lose all network connectivity,

    having a full replica means you still have access to all the data. In practice, however,

    disconnected operation is becoming increasingly less important, with Internet and wireless

    connectivity nearly ubiquitous.

    Replication's Disadvantages

    On the flip side, replication has quite a few disadvantages that (if you are not using it to perform backups) usually outweigh its advantages, especially for large projects.

    Building a New Replica

    Starting from scratch, it can take a very long time to build a new replica. In effect, you must replay each commit to the repository starting from the beginning. For sufficiently large projects, it may not even be realistic to build an offsite replica purely over a network; you may be forced to build a replica at your main site, then physically ship the disks to the remote site.

    Caching, on the other hand, has no such upfront costs. The cache can be populated gradually

    over time, and the speedup from using the cache will grow as more files are populated into it.

    Disk Space Cost

    Each replica consumes the same amount of disk space as the main repository. The more

    replicas you need, and the larger your repository, the more you will need to spend on disks to

    store the replicas. This is usually acceptable for small projects, but once a repository grows

    large enough that it cannot typically fit on a single commodity, off-the-shelf hard drive, this

    starts to become troublesome. (Among other things, it becomes impractical for developers

    who work at home to mirror the repository.)


    Caching has no such disk space requirements proportional to the size of the repository. Larger caches can store more files, but even a modestly sized cache can have large performance benefits. It is practical to set up caches not just at a site-wide level, but also on an individual LAN.

    Network Bandwidth Cost

    Replication mirrors every change, whether it is needed or not. As such, a replica is constantly consuming network bandwidth. This can overload a remote office's WAN link. In the limit, it is even possible for replication to break down altogether if changes are being committed faster than the data can be mirrored. Also, the mirroring places extra load on the master repository's server.

    Caching, on the other hand, will almost always decrease, not increase, WAN bandwidth

    usage. A file is not downloaded unless it is really needed.

    Replication Lag

    Replication is not immediate. It takes time for a change to propagate from the master repository to the replicas. Sometimes the lag may be small, but it may spike if several large changes are committed in a short period of time. When you are working off a replica, you may think you are using the very latest top-of-tree source code, when in fact you may be any number of changes behind. Depending on how it's set up, the replica server might claim that the missing changes don't even exist: if you ask it to check out a revision number that hasn't replicated yet, it may give you an error message rather than waiting until the replication catches up to that revision number.

    Conclusion

    For accelerating offsite development, caching, especially when combined with intelligent

    prefetching, provides most of the advantages of replication without its many disadvantages.

    Setting up caches is cheap and easy. Replication is best suited for offsite backups, not for

    accelerating offsite development.

    Caching vs. Replication

    Jeff Darcy March 22, 2008 14:13

    Cache: a data location created/deployed to provide lower request latency than the main data store (either by being located nearer to requesters or by using faster components).

    Replica: a data store, separate from that where a request is served, that is created/deployed to continue service after a failure.

    In short, a cache exists to improve performance and a replica exists to improve resilience. A cache that doesn't improve performance is a failure, as is a replica that doesn't improve resilience, but the possibility of failure doesn't turn one thing into another. Since defining things in terms of purpose or intent often leaves things unclear, here are some practical implications of the difference.

    http://pl.atyp.us/wordpress/?p=1313

    Caches need not be current or complete. They may return stale data, or no data at all, although many caches are designed to avoid stale data, and transparent caches will re-request data from the main store instead of requiring that the requester do so after a miss.

    Replicas must be both current and complete (perhaps not perfectly, but always within defined limits), and authoritative, or at least capable of becoming authoritative. Authoritative means that they may not be contradicted by alternative sources of information; if a conflict exists, the authoritative source is unconditionally given precedence over any non-authoritative one. (Authority loses its meaning if authorities disagree, of course, but that's a philosophical issue best left for another time. For now, assume that authorities always agree.)

    Caches exist to improve request latency, but replication might actually degrade request latency at the nearer data store as messages are exchanged with the further one to preserve the required replica behavior.

    Replicas exist to improve resilience, but caching might degrade resilience as the number of components (the caches and extra data paths) and logical complexity both increase.

    Caching is done in caching servers, e.g. the servers that your ISP directs you to use for lookups, and in some cases in the resolvers local to the client machines. The reason it takes time for changes to propagate is that every DNS record is tagged with a Time To Live (TTL) value. This tells caching servers how long they are allowed to hold on to that record before they must check again with one of the authoritative servers for the domain. If the TTL is 1 day (a pretty common setting), and someone's server cached the record 1 minute before you changed it, it will take 23 hours 59 minutes before that server will notice the change (actually, it could take a bit longer, because it might have been cached from one of the slave servers, and replication takes time).
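    To see the TTL that caching servers will honour for a record, you can query it directly; the sketch below assumes the third-party dnspython package (version 2 or later) is installed, and www.example.com is just a placeholder name:

        import dns.resolver   # third-party package: dnspython

        # Look up the A record for a name and show how long caches may keep it.
        answer = dns.resolver.resolve("www.example.com", "A")
        print("addresses:", [record.address for record in answer])
        print("TTL (seconds):", answer.rrset.ttl)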

    Domain Name System

    The Domain Name System (DNS) is a hierarchical naming system built on a distributed database for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities. Most importantly, it translates domain names meaningful to humans into the numerical identifiers associated with networking equipment for the purpose of locating and addressing these devices worldwide.

    An often-used analogy to explain the Domain Name System is that it serves as the phone book for the Internet by translating human-friendly computer hostnames into IP addresses. For example, the domain name www.example.com translates to the addresses 192.0.32.10 (IPv4) and 2620:0:2d0:200::10 (IPv6).

    How Does DNS Work?

    2.1 The Domain Name Space

    DNS's distributed database is indexed by domain names. Each domain name is essentially just a path in a large inverted tree, called the domain name space. The tree's hierarchical structure, shown in Figure 2.1, is similar to the structure of the UNIX filesystem. The tree has a single


    root at the top.[1] In the UNIX filesystem, this is called the root directory, represented by a

    slash ("/"). DNS simply calls it "the root." Like a filesystem, DNS's tree can branch any

    number of ways at each intersection point, called a node. The depth of the tree is limited to

    127 levels (a limit you're not likely to reach).

    2.1.3 Resource Records

    The data associated with domain names are contained in resource records, or RRs. Records are divided into classes, each of which pertains to a type of network or software. Currently, there are classes for internets (any TCP/IP-based internet), networks based on the Chaosnet protocols, and networks that use Hesiod software. (Chaosnet is an old network of largely historic significance.)

    2.4 Name Servers and Zones

    The programs that store information about the domain name space are called name servers.

    Name servers generally have complete information about some part of the domain name space, called a zone, which they load from a file or from another name server. The name

    server is then said to have authority for that zone. Name servers can be authoritative for

    multiple zones, too.

    The difference between a zone and a domain is important, but subtle. All top-level domains, and many domains at the second level and lower, like berkeley.edu and hp.com, are broken into smaller, more manageable units by delegation. These units are called zones. The edu domain, shown in Figure 2.8, is divided into many zones, including the berkeley.edu zone, the purdue.edu zone, and the nwu.edu zone. At the top of the domain, there's also an edu zone. It's natural that the folks who run edu would break up the edu domain: otherwise, they'd have to manage the berkeley.edu subdomain themselves. It makes much more sense to delegate berkeley.edu to Berkeley. What's left for the folks who run edu? The edu zone, which would contain mostly delegation information to subdomains of edu.


    Figure 2.8: The edu domain broken into zones

    The berkeley.edu subdomain is, in turn, broken up into multiple zones by delegation, as shown in Figure 2.9. There are delegated subdomains called cc, cs, ce, me, and more. Each of these subdomains is delegated to a set of name servers, some of which are also authoritative for berkeley.edu. However, the zones are still separate, and may have a totally different group of authoritative name servers.


    2.6 Resolution

    Name servers are adept at retrieving data from the domain name space. They have to be,

    given the limited intelligence of some resolvers. Not only can they give you data about zones

    for which they're authoritative, they can also search through the domain name space to find

    data for which they're not authoritative. This process is called name resolution or simply

    resolution.

    2.6.1 Root Name Servers

    The root name servers know where there are authoritative name servers for each of the top-level domains. (In fact, most of the root name servers are authoritative for the generic top-level domains.) Given a query about any domain name, the root name servers can at least provide the names and addresses of the name servers that are authoritative for the top-level domain that the domain name is in. And the top-level name servers can provide the list of name servers that are authoritative for the second-level domain that the domain name is in.

    Each name server queried gives the querier information about how to get "closer" to the

    answer it's seeking, or it provides the answer itself.

    The root name servers are clearly important to resolution. Because they're so important, DNS provides mechanisms (such as caching, which we'll discuss a little later) to help offload the root name servers. But in the absence of other information, resolution has to start at the root

    name servers. This makes the root name servers crucial to the operation of DNS; if all the

    Internet root name servers were unreachable for an extended period, all resolution on the

    Internet would fail. To protect against this, the Internet has thirteen root name servers (as of

    this writing) spread across different parts of the network. Two are on the MILNET, the U.S.

    military's portion of the Internet; one is on SPAN, NASA's internet; two are in Europe; and

    one is in Japan.

    Being the focal point for so many queries keeps the roots busy; even with thirteen, the traffic

    to each root name server is very high. A recent informal poll of root name server

    administrators showed some roots receiving thousands of queries per second.


    Despite the load placed on root name servers, resolution on the Internet works quite well.

    Figure 2.12 shows the resolution process for the address of a real host in a real domain,

    including how the process corresponds to traversing the domain name space tree.

    The local name server queries a root name server for the address of girigiri.gbrmpa.gov.au and is referred to the au name servers. The local name server asks an au name server the same question, and is referred to the gov.au name servers. The gov.au name server refers the local name server to the gbrmpa.gov.au name servers. Finally, the local name server asks a gbrmpa.gov.au name server for the address and gets the answer.


    A resolver queries a local name server, which then queries a number of other name servers in pursuit of an answer for the resolver. Each name server it queries refers it to another name server that is authoritative for a zone further down in the name space and closer to the domain name sought. Finally, the local name server queries the authoritative name server, which returns an answer.
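    The referral chain just described can be modelled with a toy table of delegations (the data below is made up purely for illustration); the resolver starts at the root and follows each referral until it reaches a zone that holds the answer:

        # Toy model of iterative resolution: each zone either refers the query to a
        # child zone or answers it itself. Illustrative data only, not real DNS content.
        DELEGATIONS = {".": "au", "au": "gov.au", "gov.au": "gbrmpa.gov.au"}
        ANSWERS = {"gbrmpa.gov.au": {"girigiri.gbrmpa.gov.au": "192.0.2.10"}}

        def resolve(name):
            zone = "."
            while True:
                if name in ANSWERS.get(zone, {}):
                    print(zone, "is authoritative; answer:", ANSWERS[zone][name])
                    return ANSWERS[zone][name]
                referral = DELEGATIONS[zone]
                print(zone, "refers the query to the", referral, "name servers")
                zone = referral

        resolve("girigiri.gbrmpa.gov.au")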

    2.7 Caching

    The whole resolution process may seem awfully convoluted and cumbersome to someone

    accustomed to simple searches through the host table. Actually, it's usually quite fast. One of

    the features that speeds it up considerably is caching.

    A name server processing a recursive query may have to send out quite a few queries to find

    an answer. However, it discovers a lot of information about the domain name space as it does so. Each time it's referred to another list of name servers, it learns that those name servers are

    authoritative for some zone, and it learns the addresses of those servers. And, at the end of the

    resolution process, when it finally finds the data the original querier sought, it can store that

    data for future reference, too. With version 4.9 and all version 8 BINDs, name servers even

    implement negative caching: if an authoritative name server responds to a query with an

    answer that says the domain name or data type in the query doesn't exist, the local name

    server will temporarily cache that information, too. Name servers cache all of this data to help

    speed up successive queries. The next time a resolver queries the name server for data about a

    domain name the name server knows something about, the process is shortened quite a bit.

    The name server may have cached the answer, positive or negative, in which case it simply

    returns the answer to the resolver. Even if it doesn't have the answer cached, it may have learned the identities of the name servers that are authoritative for the zone the domain name

    is in and be able to query them directly.


    For example, say our name server has already looked up the address of eecs.berkeley.edu. In the process, it cached the names and addresses of the eecs.berkeley.edu and berkeley.edu name servers (plus eecs.berkeley.edu's IP address). Now if a resolver were to query our name server for the address of baobab.cs.berkeley.edu, our name server could skip querying the root name servers. Recognizing that berkeley.edu is the closest ancestor of baobab.cs.berkeley.edu that it knows about, our name server would start by querying a berkeley.edu name server, as shown in Figure 2.16. On the other hand, if our name server had discovered that there was no address for eecs.berkeley.edu, the next time it received a query for the address, it could simply have responded appropriately from its cache.

    In addition to speeding up resolution, caching prevents us from having to query the root name

    servers again. This means that we're not as dependent on the roots, and they won't suffer as

    much from all our queries.

    2.7.1 Time to Live

    Name servers can't cache data forever, of course. If they did, changes to that data on the

    authoritative name servers would never reach the rest of the network. Remote name servers

    would just continue to use cached data. Consequently, the administrator of the zone that

    contains the data decides on a time to live, or TTL, for the data. The time to live is the amount

    of time that any name server is allowed to cache the data. After the time to live expires, the

    name server must discard the cached data and get new data from the authoritative name

    servers. This also applies to negatively cached data; a name server must time out a negative

    answer after a period, too, in case new data has been added on the authoritative name servers.

    However, the time to live for negatively cached data isn't tunable by the domain administrator; it's hardcoded to ten minutes.

    Deciding on a time to live for your data is essentially deciding on a tradeoff between

    performance and consistency. A small TTL will help ensure that data about your domain is

    consistent across the network, because remote name servers will time it out more quickly and

    be forced to query your authoritative name servers more often for new data. On the other

    hand, this will increase the load on your name servers and lengthen resolution time for

    information in your domain, on the average.

    A large TTL will shorten the average time it takes to resolve information in your domain

    because the data can be cached longer. The drawback is that your information will be inconsistent for a longer time if you make changes to your data on your name servers.


    Naming Service: mapping of logical names to physical addresses or object references. Logical names are given by the user (location independent); the service returns the addresses (location dependent).

    Directory Service: management of attributes of the named instances. Attribute-based search for named instances, similar to the Yellow Pages search model.
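    A toy contrast between the two services (our own sketch, with made-up entries): a naming service maps one logical name to an address, while a directory service searches by attributes, Yellow Pages style:

        # Illustrative registry shared by both services (entries are invented).
        REGISTRY = {
            "print-service": {"address": "10.0.0.5:631",  "building": "A", "color": True},
            "scan-service":  {"address": "10.0.0.7:9100", "building": "B", "color": False},
        }

        def naming_lookup(name):
            # Naming service: logical name -> physical address.
            return REGISTRY[name]["address"]

        def directory_search(**attributes):
            # Directory service: attribute-based search over the named instances.
            return [name for name, entry in REGISTRY.items()
                    if all(entry.get(k) == v for k, v in attributes.items())]

        print(naming_lookup("print-service"))               # -> 10.0.0.5:631
        print(directory_search(building="A", color=True))   # -> ['print-service']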

    3.1 Getting BIND

    If you plan to set up your own domain and run name servers for it, you'll need the BIND

    software first. Even if you're planning on having someone else run your domain, it's helpful to

    have the software around. For example, you can use your local name server to test your data

    files before giving them to your remote domain administrator.

    What is BIND and what does it do?

    BIND is an implementation of the Domain Name System (DNS) protocols. The name BIND

    stands for "Berkeley Internet Name Domain", because the software originated in the early

    1980s at the University of California at Berkeley. In recent years, the word BIND has become, like "radar" and "snafu" and "laser" and "scuba", more word than acronym.

    The DNS protocols are part of the core Internet standards. They specify the process by which

    one computer can find another computer on the basis of its name.

    What it means to say "BIND is an implementation of the DNS protocols" is that the BIND software distribution contains all of the software needed both to ask name service questions and to answer such questions.

    The BIND software distribution contains three parts:

    A Domain Name System server. This is a program called "named", which is

    pronounced "name-dee" and stands for "name daemon". It answers questions that are


    sent to it, following the rules specified in the DNS protocol standards. You can

    provide DNS service on the internet by installing this software on a server computer

    and giving it correct information about your domain names.

    A Domain Name System "resolver library". A "resolver" is a program that resolves

    questions about names by sending those questions to appropriate servers and

    responding appropriately to the servers' replies. A "resolver library" is a collection of software components that a programmer can add to software being developed, which

    will give that software the ability to resolve names. For example, a programmer who

    was programming a new web browser does not need to create the part of it that looks

    up names in DNS; he or she can plug in the resolver library and then send questions to

    the library software components. This saves time (the programmer does not need to re-

    invent that particular wheel) and helps ensure that the new browser correctly follows

    the DNS standards. (A short example of calling the system's resolver from Python appears after this list.)

    Software tools for testing servers. These are the tools that we use for testing, and we

    include them in the distribution in case you would like to do your own testing, perhaps

    to make sure your server configuration is working properly.
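    As an everyday illustration of relying on a resolver library instead of speaking DNS directly, a Python program can simply call the standard socket module, which hands the question to the platform's resolver:

        import socket

        # Resolve a hostname through the system resolver library rather than
        # implementing the DNS protocol ourselves.
        for family, _type, _proto, _canonname, sockaddr in socket.getaddrinfo(
                "www.example.com", 80, proto=socket.IPPROTO_TCP):
            print(family.name, "->", sockaddr[0])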

    Berkeley Internet Name Domain-BIND

    BIND (Berkeley Internet Name Domain) is an implementation of the DNS protocols and provides an openly redistributable reference implementation of the major components of the Domain Name System, including:

    Domain Name System server

    Domain Name System resolver library

    Tools for managing and verifying the proper operation of the DNS server

    The BIND DNS Server is used on the vast majority of name serving machines on the Internet,

    providing a robust and stable architecture on top of which an organization's naming architecture can

    be built.

    The resolver library included in the BIND distribution provides the standard APIs for translation

    between domain names and Internet addresses and is intended to be linked with applications requiring

    name service.

    BIND version 9 is a major rewrite of nearly all aspects of the underlying BIND architecture. Some of the important features of BIND 9 are DNS Security (DNSSEC, TSIG), IPv6, DNS Protocol Enhancements (IXFR, DDNS, DNS Notify, EDNS0), Views, Multiprocessor Support, and an Improved Portability Architecture.

    Today, BIND version 4 is officially deprecated and BIND version 8 development is considered

    maintenance-only in favor of BIND version 9. No additional development will be done on BIND version

    4 or BIND version 8 other than for security related patches. ISC encourages all BIND users to upgrade

    to version 9 at their earliest convenience.
