1. Scalability



    Distributed Operating Systems

    Amdahl's law

    From Wikipedia, the free encyclopedia

    Amdahl's law, also known as Amdahl's argument,[1] is named after computer architect Gene Amdahl, and is used to find the maximum expected improvement to an overall system when only part of the system is improved. It is often used in parallel computing to predict the theoretical maximum speedup using multiple processors.

    The speedup of a program using multiple processors in parallel computing is limited by the time needed for the sequential fraction of the program. For example, if a program needs 20 hours using a single processor core, and a particular portion of 1 hour cannot be parallelized, while the remaining 19 hours (95%) can be parallelized, then regardless of how many processors we devote to a parallelized execution of this program, the minimum execution time cannot be less than that critical 1 hour. Hence the speedup is limited to at most 20, as the diagram illustrates.

    The speedup of a program using multiple processors in parallel computing is limited by the sequential fraction of the program. For example, if 95% of the program can be parallelized, the theoretical maximum speedup using parallel computing would be 20, as shown in the diagram, no matter how many processors are used.

    http://en.wikipedia.org/wiki/Amdahl's_law
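    In formula form, if p is the fraction of the program that can be parallelized and N is the number of processors, the speedup is S(N) = 1 / ((1 - p) + p / N), which approaches 1 / (1 - p) as N grows. The short Python sketch below (our own illustration; the processor counts are arbitrary) reproduces the limit of 20 for the 95% example above:

        def amdahl_speedup(p, n):
            # Amdahl's law: p = parallelizable fraction, n = number of processors.
            return 1.0 / ((1.0 - p) + p / n)

        # The example from the text: 1 hour serial, 19 hours parallelizable (p = 0.95).
        for n in (1, 2, 4, 64, 1024, 65536):
            print(n, "processors ->", round(amdahl_speedup(0.95, n), 2), "x speedup")

        # No matter how many processors are used, the speedup stays below 1 / (1 - p) = 20.
        print("upper bound:", round(1.0 / (1.0 - 0.95), 2))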

    Scalability

    Definition


    Scalability indicates the capability of a system to increase performance under an increased load when resources (typically hardware) are added. -- Wikipedia

    A computer system (HW + SW) is called scalable if it can scale up (improve its resources) to accommodate ever-increasing performance and functionality demand and/or scale down (decrease resources) to reduce cost. -- Wang, Xu 98

    A system is scalable if it works well for very large and very small numbers.

    Partitioning

    Partitioning is the process of splitting a system into parts that can operate independently to a large extent.

    Replication

    The process of creating and managing duplicate versions of a database. Replication not only

    copies a database but also synchronizes a set of replicas so that changes made to one replica

    are reflected in all the others. The beauty of replication is that it enables many users to work

    with their own local copy of a database but have the database updated as if they were working

    on a single, centralized database. For database applications where users are geographically

    widely distributed, replication is often the most efficient method of database access.

    The Lotus Notes system was one of the first to make replication a central component of its

    design, which has been one of the main reasons for its success.

    Replication (pronounced rehp-lih-KA-shun) is the process of making a replica (a copy) of

    something. A replication (noun) is a copy. The term is used in fields as varied as

    microbiology (cell replication), knitwear (replication of knitting patterns), and information

    distribution (CD-ROM replication).

    On the Internet, a Web site that has been replicated in its entirety and put on another site is

    called a mirror site.

    Replication (computer science)

    Replication is the process of sharing information so as to ensure consistency between redundant resources, such as software or hardware components, to improve reliability, fault-tolerance, or accessibility. It could be data replication if the same data is stored on multiple storage devices, or computation replication if the same computing task is executed many times. A computational task is typically replicated in space, i.e. executed on separate devices, or it could be replicated in time, if it is executed repeatedly on a single device.
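    As a toy illustration of data replication (our own sketch, not any particular system's design), every write can be applied synchronously to several independent stores, so a read can then be served by any replica:

        class ReplicatedStore:
            # Toy data replication: every write goes to all replicas; any replica can serve reads.

            def __init__(self, num_replicas=3):
                self.replicas = [dict() for _ in range(num_replicas)]

            def write(self, key, value):
                # Propagate the change to every replica so they stay consistent.
                for replica in self.replicas:
                    replica[key] = value

            def read(self, key, replica_id=0):
                # All replicas hold the same data, so any of them can answer.
                return self.replicas[replica_id].get(key)

        store = ReplicatedStore()
        store.write("user:42", {"name": "Ada"})
        print(store.read("user:42", replica_id=2))   # -> {'name': 'Ada'}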


    Load balancing is different from task replication, since it distributes a load of different (not

    the same) computations across machines, and allows a single computation to be dropped in

    case of failure. Load balancing, however, sometimes uses data replication (esp. multi-master)

    internally, to distribute its data among machines.

    Backup is different from replication, since it saves a copy of data unchanged for a long period of time. Replicas, on the other hand, are frequently updated and quickly lose any historical state.

    Goals:

    Fault tolerance

    Locality of queries -> shorten interpretation route

    Load Balancing especially for higher levels which receive many queries


    Cache

    In computer science, a cache is a collection of data duplicating original values stored

    elsewhere or computed earlier, where the original data is expensive to fetch (owing to longer

    access time) or to compute, compared to the cost of reading the cache.

    In other words, a cache is a temporary storage area where frequently accessed data can be stored for rapid access. Once the data is stored in the cache, it can be used in the future by accessing the cached copy rather than re-fetching or recomputing the original data.
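    A minimal in-memory cache along these lines (a sketch of the idea, not any specific product; the slow fetch is simulated) checks the cache first and only falls back to the expensive fetch on a miss:

        import time

        cache = {}

        def expensive_fetch(key):
            # Stand-in for a slow fetch or computation (simulated with a delay).
            time.sleep(1)
            return key.upper()

        def get(key):
            # Serve from the cache when possible; otherwise fetch and remember the result.
            if key not in cache:
                cache[key] = expensive_fetch(key)
            return cache[key]

        print(get("hello"))   # slow: the first access populates the cache
        print(get("hello"))   # fast: served from the cached copy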

    What is Caching?

    Introduction

    When you view a website for the first time your browser downloads all the various page

    elements (images, text, style sheets etc.) to your desktop computer's hard drive. This is your

    local 'cached copy' of the web page. The next time you visit the site, your browser first looks in the cache and displays the local copy rather than going to the bother of downloading it all again.

    This makes web browsing much quicker; for example, if you press your 'back' button to a

    page you just visited it will appear almost instantly, without having to download all those

    images again.

    That's the theory anyway, and it's generally a good system for most users. But we - and our

    clients - are not 'most users'. We're special!

    Why is it a problem?


    Okay, so a site you've visited previously can load super-fast because it is, in effect, sitting on

    your own computer. But what if the site has changed since you last visited? You could be

    looking at something that's out of date. Your browser has a system of checking for page

    elements that have been updated, but in practice this doesn't always work - especially in a

    fast-moving website development situation.

    During website development and maintenance things can be changing all the time; new

    pictures, changed version of logos, text updates. We make the change, upload it to the web

    server and ask the client to check it's all as they requested.

    Thanks to caching it's usually about this time we'll get a phone call asking why the change

    hasn't been made. The client has visited the site to check the work and can't see any difference

    because what they are actually viewing is their local copy of the page, cached before the

    changes were implemented.

    Note: Pages that are served dynamically (for example using ASP or PHP) suffer less from this issue, as the text content tends to be drawn from a database, requiring a new call to the server every time the page is requested. Items such as graphics or Flash elements tend not to be stored in the database and will be cached. In fact, it's changes to images that most commonly cause a problem.

    How to get around browser caching

    Generally speaking, caching is a 'good thing' as it will speed up your day-to-day browsing

    experience. But like I said, you and we are special, and when we're working on developing a

    website it's a good idea to know how to get around your browser's cache.

    Replication vs. Caching

    Both Cascade File System and Cascade Proxy use caching to speed up access to slow

    repositories. The first time someone accesses a file, it is downloaded from the appropriate

    server. Subsequent accesses pull the file from the cache instead. Eventually the cache will

    fill up, and as new files are downloaded, old files are automatically evicted from the cache.

    Caching is not the only way to speed up access to a slow repository. Another option is

    replication: mirroring the contents of a repository on multiple servers. You might set up a

    master repository at your main office and create replicas at each remote site. As changes

    are committed to the master repository, they are mirrored over to the remote replicas.

    To put it another way, caching is a pull model (data is pulled as it is requested), whereas replication is a push model (data is pushed as it becomes available, regardless of whether it has been requested).

    http://www.conifersystems.com/whitepapers/replication/

    Replication's Advantages

    Replication has one big performance advantage over caching: it accelerates the first access to

    a file, not just subsequent accesses. Replication has a number of disadvantages, however, and

    this advantage is not as clear-cut as it may seem. Combining caching with prefetching has

    much the same effect.

    For example, if your developers start to come in to the office at 8AM, you might kick off a prefetch of all the files they typically use at 7AM, and they'll all be locally cached before anyone arrives. Or, if you work from home, you could start a prefetch each day in the afternoon, and by the time you get home most of the files you need will already be cached. You don't need to prefetch everything, just the most important files, and this sort of prefetching can be automated using tools like cron.
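    Concretely, such a prefetch job could be nothing more than a script that requests the important files once so the cache is already warm; the URLs below are hypothetical, and in practice the script would be scheduled by cron (say, at 7AM) rather than run by hand:

        import urllib.request

        # Hypothetical list of the files developers typically need first thing in the morning.
        IMPORTANT_FILES = [
            "http://cache.example.internal/repo/src/main.c",
            "http://cache.example.internal/repo/docs/spec.pdf",
        ]

        def prefetch(urls):
            # Requesting each file once makes the caching proxy pull it into its cache.
            for url in urls:
                try:
                    urllib.request.urlopen(url, timeout=30).read()
                except OSError as err:
                    print("prefetch failed for", url, "-", err)

        if __name__ == "__main__":
            prefetch(IMPORTANT_FILES)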

    Another important feature of replication is that it doubles as an efficient way to do backups.

    If you have an entire copy of your repository offsite, the chances that you will lose all your

    data are slim. A cache may allow you to recover some files after a loss of data, but it is not a

    replacement for a real backup system.

    Replication also is ideal for disconnected operation. If you lose all network connectivity,

    having a full replica means you still have access to all the data. In practice, however,

    disconnected operation is becoming increasingly less important, with Internet and wireless

    connectivity nearly ubiquitous.

    Replication's Disadvantages

    On the flip side, replication has quite a few disadvantages that (if you are not using it to perform backups) usually outweigh its advantages, especially for large projects.

    Building a New Replica

    Starting from scratch, it can take a very long time to build a new replica. In effect, you must replay each commit to the repository starting from the beginning. For sufficiently large projects, it may not even be realistic to build an offsite replica purely over a network; you may be forced to build a replica at your main site, then physically ship the disks to the remote site.

    Caching, on the other hand, has no such upfront costs. The cache can be populated gradually

    over time, and the speedup from using the cache will grow as more files are populated into it.

    Disk Space Cost

    Each replica consumes the same amount of disk space as the main repository. The more

    replicas you need, and the larger your repository, the more you will need to spend on disks to

    store the replicas. This is usually acceptable for small projects, but once a repository grows

    large enough that it cannot typically fit on a single commodity, off-the-shelf hard drive, this

    starts to become troublesome. (Among other things, it becomes impractical for developers

    who work at home to mirror the repository.)


    Caching has no such disk space requirements proportional to the size of the repository. Larger caches can store more files, but even a modestly sized cache can have large performance benefits. It is practical to set up caches not just at a site-wide level, but also on an individual LAN.

    Network Bandwidth Cost

    Replication mirrors every change, whether it is needed or not. As such, a replica is constantly consuming network bandwidth. This can overload a remote office's WAN link. In the limit, it is even possible for replication to break down altogether if changes are being committed faster than the data can be mirrored. Also, the mirroring places extra load on the master repository's server.

    Caching, on the other hand, will almost always decrease, not increase, WAN bandwidth

    usage. A file is not downloaded unless it is really needed.

    Replication Lag

    Replication is not immediate. It takes time for a change to propagate from the master repository to the replicas. Sometimes the lag may be small, but it may spike if several large changes are committed in a short period of time. When you are working off a replica, you may think you are using the very latest top-of-tree source code, when in fact you may be any number of changes behind. Depending on how it's set up, the replica server might claim that the missing changes don't even exist: if you ask it to check out a revision number that hasn't replicated yet, it may give you an error message rather than waiting until the replication catches up to that revision number.

    Conclusion

    For accelerating offsite development, caching, especially when combined with intelligent

    prefetching, provides most of the advantages of replication without its many disadvantages.

    Setting up caches is cheap and easy. Replication is best suited for offsite backups, not for

    accelerating offsite development.

    Caching vs. Replication

    Jeff Darcy March 22, 2008 14:13

    Cache: a data location created/deployed to provide lower request latency than the main data store (either by being located nearer to requesters or by using faster components).

    Replica: a data store, separate from that where a request is served, that is created/deployed to continue service after a failure.

    In short, a cache exists to improve performance and a replica exists to improve resilience. A cache that doesn't improve performance is a failure, as is a replica that doesn't improve resilience, but the possibility of failure doesn't turn one thing into another. Since defining things in terms of purpose or intent often leaves things unclear, here are some practical implications of the difference.

    http://pl.atyp.us/wordpress/?p=1313

    Caches need not be current or complete. They may return stale data, or no data at all, although many caches are designed to avoid stale data, and transparent caches will re-request data from the main store instead of requiring that the requester do so after a miss.

    Replicas must be both current and complete (perhaps not perfectly, but always within defined limits), and authoritative, or at least capable of becoming authoritative. Authoritative means that they may not be contradicted by alternative sources of information; if a conflict exists, the authoritative source is unconditionally given precedence over any non-authoritative one. (Authority loses its meaning if authorities disagree, of course, but that's a philosophical issue best left for another time. For now, assume that authorities always agree.)

    Caches exist to improve request latency, but replication might actually degrade request latency at the nearer data store as messages are exchanged with the further one to preserve the required replica behavior.

    Replicas exist to improve resilience, but caching might degrade resilience as the number of components (the caches and extra data paths) and logical complexity both increase.

    Caching is done in caching servers, e.g. the servers that your ISP directs you to use for lookups, and in some cases in the resolvers local to the client machines. The reason it takes time for changes to propagate is that every DNS record is tagged with a Time To Live (TTL) value. This tells caching servers how long they are allowed to hold on to that record before they must check again with one of the authoritative servers for the domain. If the TTL is 1 day (a pretty common setting), and someone's server cached the record 1 minute before you changed it, it will take 23 hours 59 minutes before that server will notice the change (actually, it could take a bit longer, because it might have been cached from one of the slave servers, and replication takes time).
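    To see the TTL that caching servers will honour for a record, you can query it directly; the sketch below assumes the third-party dnspython package (version 2 or later) is installed, and www.example.com is just a placeholder name:

        import dns.resolver   # third-party package: dnspython

        # Look up the A record for a name and show how long caches may keep it.
        answer = dns.resolver.resolve("www.example.com", "A")
        print("addresses:", [record.address for record in answer])
        print("TTL (seconds):", answer.rrset.ttl)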

    Domain Name System

    The Domain Name System (DNS) is a hierarchical naming system built on a distributed database for computers, services, or any resource connected to the Internet or a private network. It associates various information with domain names assigned to each of the participating entities. Most importantly, it translates domain names meaningful to humans into the numerical identifiers associated with networking equipment for the purpose of locating and addressing these devices worldwide.

    An often-used analogy to explain the Domain Name System is that it serves as the phone book for the Internet by translating human-friendly computer hostnames into IP addresses. For example, the domain name www.example.com translates to the addresses 192.0.32.10 (IPv4) and 2620:0:2d0:200::10 (IPv6).

    How Does DNS Work?

    2.1 The Domain Name Space

    DNS's distributed database is indexed by domain names. Each domain name is essentially just a path in a large inverted tree, called the domain name space. The tree's hierarchical structure, shown in Figure 2.1, is similar to the structure of the UNIX filesystem. The tree has a single


    root at the top.[1] In the UNIX filesystem, this is called the root directory, represented by a

    slash ("/"). DNS simply calls it "the root." Like a filesystem, DNS's tree can branch any

    number of ways at each intersection point, called a node. The depth of the tree is limited to

    127 levels (a limit you're not likely to reach).

    2.1.3 Resource Records

    The data associated with domain names are contained in resource records, or RRs. Records are divided into classes, each of which pertains to a type of network or software. Currently, there are classes for internets (any TCP/IP-based internet), networks based on the Chaosnet protocols, and networks that use Hesiod software. (Chaosnet is an old network of largely historic significance.)

    2.4 Name Servers and Zones

    The programs that store information about the domain name space are called name servers.

    Name servers generally have complete information about some part of the domain name space, called a zone, which they load from a file or from another name server. The name

    server is then said to have authority for that zone. Name servers can be authoritative for

    multiple zones, too.

    The difference between a zone and a domain is important, but subtle. All top-level domains, and many domains at the second level and lower, like berkeley.edu and hp.com, are broken into smaller, more manageable units by delegation. These units are called zones. The edu domain, shown in Figure 2.8, is divided into many zones, including the berkeley.edu zone, the purdue.edu zone, and the nwu.edu zone. At the top of the domain, there's also an edu zone. It's natural that the folks who run edu would break up the edu domain: otherwise, they'd have to manage the berkeley.edu subdomain themselves. It makes much more sense to delegate berkeley.edu to Berkeley. What's left for the folks who run edu? The edu zone, which would contain mostly delegation information to subdomains of edu.


    Figure 2.8: The edu domain broken into zones

    The berkeley.edu subdomain is, in turn, broken up into multiple zones by delegation, as shown in Figure 2.9. There are delegated subdomains called cc, cs, ce, me, and more. Each of these subdomains is delegated to a set of name servers, some of which are also authoritative for berkeley.edu. However, the zones are still separate, and may have a totally different group of authoritative name servers.


    2.6 Resolution

    Name servers are adept at retrieving data from the domain name space. They have to be,

    given the limited intelligence of some resolvers. Not only can they give you data about zones

    for which they're authoritative, they can also search through the domain name space to find

    data for which they're not authoritative. This process is called name resolution or simply

    resolution.

    2.6.1 Root Name Servers

    The root name servers know where there are authoritative name servers for each of the top-level domains. (In fact, most of the root name servers are authoritative for the generic top-level domains.) Given a query about any domain name, the root name servers can at least provide the names and addresses of the name servers that are authoritative for the top-level domain that the domain name is in. And the top-level name servers can provide the list of name servers that are authoritative for the second-level domain that the domain name is in.

    Each name server queried gives the querier information about how to get "closer" to the

    answer it's seeking, or it provides the answer itself.

    The root name servers are clearly important to resolution. Because they're so important, DNS provides mechanisms (such as caching, which we'll discuss a little later) to help offload the root name servers. But in the absence of other information, resolution has to start at the root

    name servers. This makes the root name servers crucial to the operation of DNS; if all the

    Internet root name servers were unreachable for an extended period, all resolution on the

    Internet would fail. To protect against this, the Internet has thirteen root name servers (as of

    this writing) spread across different parts of the network. Two are on the MILNET, the U.S.

    military's portion of the Internet; one is on SPAN, NASA's internet; two are in Europe; and

    one is in Japan.

    Being the focal point for so many queries keeps the roots busy; even with thirteen, the traffic

    to each root name server is very high. A recent informal poll of root name server

    administrators showed some roots receiving thousands of queries per second.


    Despite the load placed on root name servers, resolution on the Internet works quite well.

    Figure 2.12 shows the resolution process for the address of a real host in a real domain,

    including how the process corresponds to traversing the domain name space tree.

    The local name server queries a root name server for the address of girigiri.gbrmpa.gov.au and is referred to the au name servers. The local name server asks an au name server the same question, and is referred to the gov.au name servers. The gov.au name server refers the local name server to the gbrmpa.gov.au name servers. Finally, the local name server asks a gbrmpa.gov.au name server for the address and gets the answer.


    A resolver queries a local name server, which then queries a number of other name servers in pursuit of an answer for the resolver. Each name server it queries refers it to another name server that is authoritative for a zone further down in the name space and closer to the domain name sought. Finally, the local name server queries the authoritative name server, which returns an answer.
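    The referral chain just described can be modelled with a toy table of delegations (the data below is made up purely for illustration); the resolver starts at the root and follows each referral until it reaches a zone that holds the answer:

        # Toy model of iterative resolution: each zone either refers the query to a
        # child zone or answers it itself. Illustrative data only, not real DNS content.
        DELEGATIONS = {".": "au", "au": "gov.au", "gov.au": "gbrmpa.gov.au"}
        ANSWERS = {"gbrmpa.gov.au": {"girigiri.gbrmpa.gov.au": "192.0.2.10"}}

        def resolve(name):
            zone = "."
            while True:
                if name in ANSWERS.get(zone, {}):
                    print(zone, "is authoritative; answer:", ANSWERS[zone][name])
                    return ANSWERS[zone][name]
                referral = DELEGATIONS[zone]
                print(zone, "refers the query to the", referral, "name servers")
                zone = referral

        resolve("girigiri.gbrmpa.gov.au")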

    2.7 Caching

    The whole resolution process may seem awfully convoluted and cumbersome to someone

    accustomed to simple searches through the host table. Actually, it's usually quite fast. One of

    the features that speeds it up considerably is caching.

    A name server processing a recursive query may have to send out quite a few queries to find

    an answer. However, it discovers a lot of information about the domain name space as it does so. Each time it's referred to another list of name servers, it learns that those name servers are

    authoritative for some zone, and it learns the addresses of those servers. And, at the end of the

    resolution process, when it finally finds the data the original querier sought, it can store that

    data for future reference, too. With version 4.9 and all version 8 BINDs, name servers even

    implement negative caching: if an authoritative name server responds to a query with an

    answer that says the domain name or data type in the query doesn't exist, the local name

    server will temporarily cache that information, too. Name servers cache all of this data to help

    speed up successive queries. The next time a resolver queries the name server for data about a

    domain name the name server knows something about, the process is shortened quite a bit.

    The name server may have cached the answer, positive or negative, in which case it simply

    returns the answer to the resolver. Even if it doesn't have the answer cached, it may have learned the identities of the name servers that are authoritative for the zone the domain name

    is in and be able to query them directly.


    For example, say our name server has already looked up the address of eecs.berkeley.edu. In the process, it cached the names and addresses of the eecs.berkeley.edu and berkeley.edu name servers (plus eecs.berkeley.edu's IP address). Now if a resolver were to query our name server for the address of baobab.cs.berkeley.edu, our name server could skip querying the root name servers. Recognizing that berkeley.edu is the closest ancestor of baobab.cs.berkeley.edu that it knows about, our name server would start by querying a berkeley.edu name server, as shown in Figure 2.16. On the other hand, if our name server had discovered that there was no address for eecs.berkeley.edu, the next time it received a query for the address, it could simply have responded appropriately from its cache.

    In addition to speeding up resolution, caching prevents us from having to query the root name

    servers again. This means that we're not as dependent on the roots, and they won't suffer as

    much from all our queries.

    2.7.1 Time to Live

    Name servers can't cache data forever, of course. If they did, changes to that data on the

    authoritative name servers would never reach the rest of the network. Remote name servers

    would just continue to use cached data. Consequently, the administrator of the zone that

    contains the data decides on a time to live, or TTL, for the data. The time to live is the amount

    of time that any name server is allowed to cache the data. After the time to live expires, the

    name server must discard the cached data and get new data from the authoritative name

    servers. This also applies to negatively cached data; a name server must time out a negative

    answer after a period, too, in case new data has been added on the authoritative name servers.

    However, the time to live for negatively cached data isn't tunable by the domain administrator; it's hardcoded to ten minutes.

    Deciding on a time to live for your data is essentially deciding on a tradeoff between

    performance and consistency. A small TTL will help ensure that data about your domain is

    consistent across the network, because remote name servers will time it out more quickly and

    be forced to query your authoritative name servers more often for new data. On the other

    hand, this will increase the load on your name servers and lengthen resolution time for

    information in your domain, on the average.

    A large TTL will shorten the average time it takes to resolve information in your domain

    because the data can be cached longer. The drawback is that your information will be inconsistent for a longer time if you make changes to your data on your name servers.


    Naming Service: mapping of logical names to physical addresses or object references. Logical names are given by the user (location independent); the service returns the addresses (location dependent).

    Directory Service: management of attributes of the named instances. Attribute-based search for named instances, similar to the Yellow Pages search model.
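    A toy contrast between the two services (our own sketch, with made-up entries): a naming service maps one logical name to an address, while a directory service searches by attributes, Yellow Pages style:

        # Illustrative registry shared by both services (entries are invented).
        REGISTRY = {
            "print-service": {"address": "10.0.0.5:631",  "building": "A", "color": True},
            "scan-service":  {"address": "10.0.0.7:9100", "building": "B", "color": False},
        }

        def naming_lookup(name):
            # Naming service: logical name -> physical address.
            return REGISTRY[name]["address"]

        def directory_search(**attributes):
            # Directory service: attribute-based search over the named instances.
            return [name for name, entry in REGISTRY.items()
                    if all(entry.get(k) == v for k, v in attributes.items())]

        print(naming_lookup("print-service"))               # -> 10.0.0.5:631
        print(directory_search(building="A", color=True))   # -> ['print-service']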

    3.1 Getting BIND

    If you plan to set up your own domain and run name servers for it, you'll need the BIND

    software first. Even if you're planning on having someone else run your domain, it's helpful to

    have the software around. For example, you can use your local name server to test your data

    files before giving them to your remote domain administrator.

    What is BIND and what does it do?

    BIND is an implementation of the Domain Name System (DNS) protocols. The name BIND

    stands for "Berkeley Internet Name Domain", because the software originated in the early

    1980s at the University of California at Berkeley. In recent years, the word BIND has become, like "radar" and "snafu" and "laser" and "scuba", more word than acronym.

    The DNS protocols are part of the core Internet standards. They specify the process by which

    one computer can find another computer on the basis of its name.

    What it means to say "BIND is an implementation of the DNS protocols" is that the BIND software distribution contains all of the software needed both to ask name service questions and to answer such questions.

    The BIND software distribution contains three parts:

    A Domain Name System server. This is a program called "named", which is

    pronounced "name-dee" and stands for "name daemon". It answers questions that are


    sent to it, following the rules specified in the DNS protocol standards. You can

    provide DNS service on the internet by installing this software on a server computer

    and giving it correct information about your domain names.

    A Domain Name System "resolver library". A "resolver" is a program that resolves

    questions about names by sending those questions to appropriate servers and

    responding appropriately to the servers' replies. A "resolver library" is a collection of software components that a programmer can add to software being developed, which

    will give that software the ability to resolve names. For example, a programmer who

    was programming a new web browser does not need to create the part of it that looks

    up names in DNS; he or she can plug in the resolver library and then send questions to

    the library software components. This saves time (the programmer does not need to re-

    invent that particular wheel) and helps ensure that the new browser correctly follows

    the DNS standards. (A short example of calling the system's resolver from Python appears after this list.)

    Software tools for testing servers. These are the tools that we use for testing, and we

    include them in the distribution in case you would like to do your own testing, perhaps

    to make sure your server configuration is working properly.
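    As an everyday illustration of relying on a resolver library instead of speaking DNS directly, a Python program can simply call the standard socket module, which hands the question to the platform's resolver:

        import socket

        # Resolve a hostname through the system resolver library rather than
        # implementing the DNS protocol ourselves.
        for family, _type, _proto, _canonname, sockaddr in socket.getaddrinfo(
                "www.example.com", 80, proto=socket.IPPROTO_TCP):
            print(family.name, "->", sockaddr[0])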

    Berkeley Internet Name Domain-BIND

    BIND (Berkeley Internet Name Domain) is an implementation of the DNS protocols and provides an openly redistributable reference implementation of the major components of the Domain Name System, including:

    Domain Name System server

    Domain Name System resolver library

    Tools for managing and verifying the proper operation of the DNS server

    The BIND DNS Server is used on the vast majority of name serving machines on the Internet,

    providing a robust and stable architecture on top of which an organization's naming architecture can

    be built.

    The resolver library included in the BIND distribution provides the standard APIs for translation

    between domain names and Internet addresses and is intended to be linked with applications requiring

    name service.

    BIND version 9 is a major rewrite of nearly all aspects of the underlying BIND architecture. Some of the important features of BIND 9 are DNS Security (DNSSEC, TSIG), IPv6, DNS Protocol Enhancements (IXFR, DDNS, DNS Notify, EDNS0), Views, Multiprocessor Support, and an Improved Portability Architecture.

    Today, BIND version 4 is officially deprecated and BIND version 8 development is considered

    maintenance-only in favor of BIND version 9. No additional development will be done on BIND version

    4 or BIND version 8 other than for security related patches. ISC encourages all BIND users to upgrade

    to version 9 at their earliest convenience.
