
ADM100 apache-administration-sample-content


Apache is configured by placing directives in plain text configuration files. The main configuration file is usually called httpd.conf. The location of this file is set at

compile-time, but may be overridden with the -f command line flag. In addition, other configuration files may be added using the Include directive, and wildcards can be

used to include many configuration files. Any directive may be placed in any of

these configuration files. Changes to the main configuration files are only

recognized by Apache when it is started or restarted.

The server also reads a file containing mime document types; the filename is set by the TypesConfig directive, and is mime.types by default.

Apache configuration files contain one directive per line. The backslash "\" may be

used as the last character on a line to indicate that the directive continues onto the

next line. There must be no other characters or white space between the backslash

and the end of the line.

Directives in the configuration files are case-insensitive, but arguments to directives

are often case sensitive. Lines that begin with the hash character "#" are considered

comments, and are ignored. Comments may not be included on a line after a

configuration directive. Blank lines and white space occurring before a directive are

ignored, so you may indent directives for clarity.

The values of shell environment variables can be used in configuration file lines

using the syntax ${ENVVAR}. If "ENVVAR" is the name of a valid environment

variable, the value of that variable is substituted into that spot in the configuration

file line, and processing continues as if that text were found directly in the

configuration file. (If the ENVVAR variable is not found, the characters "${ENVVAR}"

are left unchanged for use by later stages in the config file processing.)

The maximum length of a line in the configuration file, after environment-variable

substitution, joining any continued lines and removing leading and trailing white

space, is 8192 characters.

You can check your configuration files for syntax errors without starting the server by using apachectl configtest or the -t command line option.
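The syntax rules above can be seen together in a short fragment. This is a minimal sketch rather than part of the course material: the paths and the LOG_DIR environment variable are assumptions chosen only for illustration.

```apache
# A comment must sit on its own line; it cannot follow a directive.
ServerRoot "/usr/local/apache2"

# A backslash as the very last character continues the directive
# on the next line (no white space may follow it).
LogFormat "%h %l %u %t \"%r\" %>s %b" \
    common

# ${LOG_DIR} is replaced by the value of the shell environment
# variable LOG_DIR (a hypothetical name), if it is set when httpd starts.
ErrorLog "${LOG_DIR}/error_log"
```

A file like this can be checked with apachectl configtest before the server is restarted.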


Syntax

This indicates the format of the directive as it would appear in a configuration file.

This syntax is extremely directive-specific, and is described in detail in the

directive's definition. Generally, the directive name is followed by a series of one or

more space-separated arguments. If an argument contains a space, the argument

must be enclosed in double quotes. Optional arguments are enclosed in square

brackets. Where an argument can take on more than one possible value, the

possible values are separated by vertical bars "|". Literal text is presented in the

default font, while argument-types for which substitution is necessary are

emphasized. Directives which can take a variable number of arguments will end in

"..." indicating that the last argument is repeated.

Default

If the directive has a default value (i.e., if you omit it from your configuration entirely,

the Apache Web server will behave as though you set it to a particular value), it is

described here. If there is no default value, this section should say "None". Note that

the default listed here is not necessarily the same as the value the directive takes in

the default httpd.conf distributed with the server.

Override

This directive attribute indicates which configuration override must be active in order

for the directive to be processed when it appears in a .htaccess file. If the directive's

context doesn't permit it to appear in .htaccess files, then no context will be listed.

Overrides are activated by the AllowOverride directive, and apply to a particular

scope (such as a directory) and all descendants, unless further modified by other

AllowOverride directives at lower levels. The documentation for that directive also

lists the possible override names available.

Status

This indicates how tightly bound into the Apache Web server the directive is; in

other words, you may need to recompile the server with an enhanced set of

modules in order to gain access to the directive and its functionality.


The ServerRoot directive sets the directory in which the server lives. Typically it

will contain the subdirectories conf/ and logs/. Relative paths in other

configuration directives (such as Include or LoadModule, for example) are taken

as relative to this directory.

The Listen directive instructs Apache to listen to only specific IP addresses or ports;

by default it responds to requests on all IP interfaces. Listen is now a required

directive. If it is not in the config file, the server will fail to start. This is a change from

previous versions of Apache.

The Listen directive tells the server to accept incoming requests on the specified

port or address-and-port combination. If only a port number is specified, the server

listens to the given port on all interfaces. If an IP address is given as well as a port,

the server will listen on the given port and interface.

Multiple Listen directives may be used to specify a number of addresses and ports

to listen to. The server will respond to requests from any of the listed addresses and

ports.

The ServerAdmin directive sets the contact address that the server includes in any error messages it returns to the client. If httpd doesn't recognize the supplied argument as a URL, it assumes that it is an email address and prepends mailto: to it in hyperlink targets. However, an email address is recommended, since many CGI scripts make that assumption. If you want to use a URL, it should point to another server under your control; otherwise users may not be able to contact you in case of errors.

The ServerAlias directive sets the alternate names for a host, for use with name-based virtual hosts. ServerAlias may include wildcards, if appropriate.

The ServerName directive sets the request scheme, hostname and port that the

server uses to identify itself. This is used when creating redirection URLs.
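The directives described above might be combined as follows. This is a sketch under assumptions: the install path, the address 192.0.2.10, and the example.com names are placeholders, and the virtual host block assumes a server version that does not need a separate NameVirtualHost line.

```apache
ServerRoot "/usr/local/apache2"

# Listen on port 80 on all interfaces, and on port 8080 of one address.
Listen 80
Listen 192.0.2.10:8080

# Contact address shown in server-generated error pages.
ServerAdmin webmaster@example.com

# Scheme, hostname and port the server uses to identify itself.
ServerName www.example.com:80

<VirtualHost *:80>
    ServerName www.example.com
    # Alternate names for this host; wildcards are allowed.
    ServerAlias example.com *.example.com
</VirtualHost>
```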


The DocumentRoot directive sets the directory from which httpd will serve files. Unless matched by a directive like Alias, the server appends the path from the requested URL to the document root to make the path to the document. Example:

DocumentRoot /usr/web

With this setting, an access to http://www.my.host.com/index.html refers to /usr/web/index.html. If the directory-path is not absolute, it is assumed to be relative to the ServerRoot.

The LoadModule directive links in the object file or library filename and adds the

module structure named module to the list of active modules. Module is the name of

the external variable of type module in the file, and is listed as the Module Identifier

in the module documentation. Example:

LoadModule status_module modules/mod_status.so

loads the named module from the modules subdirectory of the ServerRoot.

The Include directive allows inclusion of other configuration files from within the server configuration files. Shell-style (fnmatch()) wildcard characters can be used to include several files at once, in alphabetical order. In addition, if Include points to a directory rather than a file, Apache will read all files in that directory and any subdirectory. Including entire directories is not recommended, however, because it is easy to accidentally leave temporary files in a directory that can cause httpd to fail.

Examples:

Include /usr/local/apache2/conf/ssl.conf

Include /usr/local/apache2/conf/vhosts/*.conf

The file path specified may be an absolute path, or may be relative to the ServerRoot directory.


Directives use a great number of different argument types. A few common ones are defined below:

URL :

A complete Uniform Resource Locator including a scheme, hostname, and optional

pathname as in http://www.example.com/path/to/file.html

URL-path :

The part of a url which follows the scheme and hostname as in /path/to/file.html. The

url-path represents a web-view of a resource, as opposed to a file-system view.

file-path :

The path to a file in the local file-system beginning with the root directory as in

/usr/local/apache/htdocs/path/to/file.html. Unless otherwise specified, a file-path

which does not begin with a slash will be treated as relative to the ServerRoot.

directory-path :

The path to a directory in the local file-system beginning with the root directory as in

/usr/local/apache/htdocs/path/to/.

filename :

The name of a file with no accompanying path information as in file.html.
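To make the argument types concrete, the following sketch pairs each one with a directive that accepts it. The directives are standard ones, but the specific paths and URLs are illustrative assumptions.

```apache
# URL-path followed by a directory-path
Alias /icons/ "/usr/local/apache2/icons/"

# URL-path followed by a complete URL
Redirect /old/file.html http://www.example.com/path/to/file.html

# file-path without a leading slash, so relative to the ServerRoot
ErrorLog logs/error_log

# filename with no path information
AccessFileName .htaccess
```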


The AuthType directive selects the type of user authentication for a directory. The authentication types available are Basic (implemented by mod_auth_basic) and Digest (implemented by mod_auth_digest).

To implement authentication, you must also use the AuthName and Require directives. In addition, the server must have an authentication-provider module such as mod_authn_file and an authorization module such as mod_authz_user.

The AuthName directive sets the name of the authorization realm for a directory. This realm is given to the client so that the user knows which username and password to send. AuthName takes a single argument; if the realm name contains spaces, it must be enclosed in quotation marks. It must be accompanied by AuthType and Require directives, and by directives such as AuthUserFile and AuthGroupFile, to work.

For example:

AuthName "Top Secret"

The string provided for the AuthName is what will appear in the password dialog

provided by most browsers.

The AuthUserFile directive sets the name of a textual file containing the list of

users and passwords for user authentication. File-path is the path to the user file. If it is not absolute, it is treated as relative to the ServerRoot.

Each line of the user file contains a username followed by a colon, followed by the

encrypted password. If the same user ID is defined multiple times, mod_authn_file will use the first occurrence to verify the password.

The htpasswd utility, which is installed as part of the binary distribution or can be found in src/support, is used to maintain the password file for HTTP Basic Authentication.

The Require directive selects which authenticated users can access a resource. The restrictions are processed by authorization modules.


Some of the allowed syntaxes provided by mod_authz_user, mod_authz_host, and mod_authz_groupfile are:

Require user userid [userid] ...
Only the named users can access the resource.

Require group group-name [group-name] ...
Only users in the named groups can access the resource.

Require valid-user
All valid users can access the resource.

Require ip 10 172.20 192.168.2
Clients in the specified IP address ranges can access the resource.

Other authorization modules that implement require options include mod_authnz_ldap, mod_authz_dbm, mod_authz_dbd, mod_authz_owner

and mod_ssl. In most cases, for a complete authentication and authorization

configuration, Require must be accompanied by AuthName, AuthType and

AuthBasicProvider or AuthDigestProvider directives, and directives such

as AuthUserFile and AuthGroupFile (to define users and groups) in order to

work correctly.
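A complete Basic authentication setup using the directives above could look like the sketch below. The directory and the password file location (conf/passwords, relative to the ServerRoot) are assumptions; the password file itself would be created with htpasswd.

```apache
<Directory "/usr/local/apache2/htdocs/private">
    AuthType Basic
    # Realm name shown in the browser's password dialog.
    AuthName "Top Secret"
    # Look users up in a plain text file (mod_authn_file).
    AuthBasicProvider file
    AuthUserFile "conf/passwords"
    # Any user who authenticates successfully is admitted.
    Require valid-user
</Directory>
```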

Access controls which are applied in this way are effective for all methods. This is

what is normally desired. If you wish to apply access controls only to specific methods, while leaving other methods unprotected, then place the Require

statement into a <Limit> section.

The result of the Require directive may be negated through the use of the not

option. As with the other negated authorization directive <RequireNone>, when

the Require directive is negated it can only fail or return a neutral result, and

therefore may never independently authorize a request.
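Because a negated Require cannot authorize a request on its own, it is normally combined with a positive requirement. The sketch below assumes the <RequireAll> container of mod_authz_core, which is not covered in this section, and uses a hypothetical user name guest and an assumed directory and password file.

```apache
<Directory "/usr/local/apache2/htdocs/members">
    AuthType Basic
    AuthName "Members"
    AuthBasicProvider file
    AuthUserFile "conf/passwords"
    <RequireAll>
        # Positive requirement: any authenticated user ...
        Require valid-user
        # ... except the user "guest" (negated, so it can only veto).
        Require not user guest
    </RequireAll>
</Directory>
```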


The Order directive, along with the Allow and Deny directives, controls a three-

pass access control system. The first pass processes either all Allow or all Deny

directives, as specified by the Order directive. The second pass parses the rest of

the directives (Deny or Allow). The third pass applies to all requests which do not

match either of the first two.

The Allow directive affects which hosts can access an area of the server. Access

can be controlled by hostname, IP address, IP address range, or by other

characteristics of the client request captured in environment variables.

The first argument to this directive is always from. The subsequent arguments can take three different forms. If Allow from all is specified, then all hosts are allowed

access, subject to the configuration of the Deny and Order directive.

The Deny directive allows access to the server to be restricted based on hostname,

IP address, or environment variables. The arguments for the Deny directive are

identical to the arguments for the Allow directive.

Note that all Allow and Deny directives are processed, unlike a typical firewall,

where only the first match is used. The last match is effective (also unlike a typical

firewall). Additionally, the order in which lines appear in the configuration files is not significant -- all Allow lines are processed as one group, all Deny lines are

considered as another, and the default state is considered by itself.

Ordering is one of :

Allow,Deny

First, all Allow directives are evaluated; at least one must match, or the request is

rejected. Next, all Deny directives are evaluated. If any matches, the request is

rejected. Last, any requests which do not match an Allow or a Deny directive are

denied by default.

Deny,Allow

First, all Deny directives are evaluated; if any match, the request is denied unless it

also matches an Allow directive. Any requests which do not match any Allow or

Deny directives are permitted.
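The two orderings can be illustrated with a pair of short sections. This is a sketch with assumed directory names, host names, and address ranges; note that on newer servers these directives are provided by a compatibility module (mod_access_compat) rather than by the core authorization modules.

```apache
# Deny,Allow: everything is denied except clients in 192.168.2.x,
# because a request matching both Deny and Allow is allowed.
<Directory "/usr/local/apache2/htdocs/intranet">
    Order Deny,Allow
    Deny from all
    Allow from 192.168.2
</Directory>

# Allow,Deny: hosts in example.org are allowed, except foo.example.org;
# all other hosts match neither list and are denied by default.
<Directory "/usr/local/apache2/htdocs/docs">
    Order Allow,Deny
    Allow from example.org
    Deny from foo.example.org
</Directory>
```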


The ErrorLog directive sets the name of the file to which the server will log any

errors it encounters. If the file-path is not absolute then it is assumed to be relative to the ServerRoot. Example :

ErrorLog /var/log/httpd/error_log

The LogFormat directive can take one of two forms. In the first form, where only

one argument is specified, this directive sets the log format which will be used by logs specified in subsequent TransferLog directives.

The second form of the LogFormat directive associates an explicit format with a

nickname. This nickname can then be used in subsequent LogFormat or

CustomLog directives rather than repeating the entire format string.

The CustomLog directive is used to log requests to the server. A log format is

specified, and the logging can optionally be made conditional on request

characteristics using environment variables. The first argument, which specifies the

location to which the logs will be written, can take one of the following two types of

values:

file : A filename, relative to the ServerRoot.

pipe : The pipe character "|", followed by the path to a program to receive the log

information on its standard input.

The TransferLog directive has exactly the same arguments and effect as the CustomLog directive, except that it does not allow the log format to be specified explicitly and does not support conditional logging of requests. Instead, the log format is determined by the most recently specified LogFormat directive which does not define a nickname. Common Log Format is used if no other format has been specified.

When entering a file path on non-Unix platforms, care should be taken to make sure

that only forward slashes are used even though the platform may allow the use of

back slashes. In general it is a good idea to always use forward slashes throughout the configuration files.
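The logging directives above might be used together as in the following sketch. The log paths, the dontlog variable name, and the rotatelogs location are assumptions; SetEnvIf comes from mod_setenvif, which is not described in this section.

```apache
# Second form of LogFormat: bind a format string to the nickname "common".
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Log every request to a file relative to the ServerRoot.
CustomLog logs/access_log common

# Conditional logging: skip requests for GIF images.
# SetEnvIf sets the hypothetical variable "dontlog" for such requests.
SetEnvIf Request_URI "\.gif$" dontlog
CustomLog logs/no_images_log common env=!dontlog

# Pipe form: hand log lines to a program on its standard input.
CustomLog "|/usr/local/apache2/bin/rotatelogs /var/log/httpd/access_log 86400" common

ErrorLog logs/error_log
```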


The Keep-Alive extension to HTTP/1.0 and the persistent connection feature of

HTTP/1.1 provide long-lived HTTP sessions which allow multiple requests to be

sent over the same TCP connection. In some cases this has been shown to result in

an almost 50% speedup in latency times for HTML documents with many images. To enable Keep-Alive connections, set KeepAlive On.

The MaxConnectionsPerChild directive sets the limit on the number of

connections that an individual child server process will handle. After MaxConnectionsPerChild connections, the child process will die. If

MaxConnectionsPerChild is 0, then the process will never expire.

Setting MaxConnectionsPerChild to a non-zero value limits the amount of

memory that process can consume by (accidental) memory leakage.

The MaxMemFree directive sets the maximum number of free Kbytes that every

allocator is allowed to hold without calling free(). In threaded MPMs, every thread

has its own allocator. When set to zero, the threshold will be set to unlimited.

The MaxRequestWorkers directive sets the limit on the number of simultaneous

requests that will be served. Any connection attempts over the MaxRequestWorkers limit will normally be queued, up to a number based on the

ListenBacklog directive. Once a child process is freed at the end of a different

request, the connection will then be serviced.

For non-threaded servers (i.e., prefork), MaxRequestWorkers translates into the

maximum number of child processes that will be launched to serve requests. The default value is 256; to increase it, you must also raise ServerLimit.

For threaded and hybrid servers (e.g. event or worker) MaxRequestWorkers

restricts the total number of threads that will be available to serve clients. For hybrid MPMs the default value is 16 (ServerLimit) multiplied by the value of 25

(ThreadsPerChild). Therefore, to increase MaxRequestWorkers to a value that

requires more than 16 processes, you must also raise ServerLimit.

MaxRequestWorkers was called MaxClients before version 2.3.13. The old

name is still supported.
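As a sketch of how these limits relate for a hybrid MPM, the fragment below spells out the default arithmetic explicitly. The MaxConnectionsPerChild value is an arbitrary illustration, not a recommendation.

```apache
KeepAlive On

<IfModule mpm_event_module>
    # 16 processes x 25 threads = 400 simultaneous requests
    ServerLimit             16
    ThreadsPerChild         25
    MaxRequestWorkers      400
    # Recycle each child after 10000 connections to bound memory leaks.
    MaxConnectionsPerChild 10000
</IfModule>
```

Raising MaxRequestWorkers above 400 in this example would require raising ServerLimit (or ThreadsPerChild) as well.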


The HostnameLookups directive enables DNS lookups so that host names can be logged (and passed to CGIs/SSIs in REMOTE_HOST). The value Double refers to

doing double-reverse DNS lookup. That is, after a reverse lookup is performed, a

forward lookup is then performed on that result.

Regardless of the setting, when mod_authz_host is used for controlling access by

hostname, a double reverse lookup will be performed. This is necessary for security.

Note that the result of this double-reverse isn't generally available unless you set HostnameLookups Double.

The default is Off in order to save the network traffic for those sites that don't truly

need the reverse lookups done. It is also better for the end users because they don't

have to suffer the extra latency that a lookup entails. Heavily loaded sites should

leave this directive Off, since DNS lookups can take considerable amounts of time. The utility logresolve, compiled by default to the bin subdirectory of your

installation directory, can be used to look up host names from logged IP addresses

offline.

The StartServers directive sets the number of child server processes created on startup. As the number of processes is dynamically controlled depending on the load, there is usually little reason to adjust this parameter. The default value differs from MPM to MPM: for worker the default is StartServers 3, for prefork it is 5, and for mpmt_os2 it is 2.

The StartThreads directive sets the number of threads created on startup. As the number of threads is dynamically controlled depending on the load, there is usually little reason to adjust this parameter. For mpm_netware the default is StartThreads 50 and, since there is only a single process, this is the total number of threads created at startup to serve requests.

The ThreadLimit directive sets the maximum configured value for ThreadsPerChild for the lifetime of the Apache process. Any attempts to change this directive during a restart will be ignored, but ThreadsPerChild can be modified during a restart up to the value of this directive.
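A sketch of these startup settings for the worker MPM follows. The ThreadLimit and ThreadsPerChild values are illustrative assumptions; only the StartServers value is taken from the defaults quoted above.

```apache
# Leave reverse DNS lookups off; resolve names later with logresolve.
HostnameLookups Off

<IfModule mpm_worker_module>
    StartServers      3
    # ThreadLimit caps ThreadsPerChild for the life of the process.
    ThreadLimit      64
    ThreadsPerChild  25
</IfModule>
```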


The most commonly used configuration section containers are the ones that change

the configuration of particular places in the filesystem or webspace. First, it is

important to understand the difference between the two.

The filesystem is the view of your disks as seen by your operating system. For example, in a default install, Apache resides at "C:/Program Files/Apache Software Foundation/Apache2.2" in the Windows filesystem. (Note that forward slashes should always be used as the path separator in Apache, even for Windows.)

In contrast, the webspace is the view of your site as delivered by the web server and seen by the client. So the path /dir/ in the webspace corresponds to the path

/usr/local/apache2/htdocs/dir/ in the filesystem of a default Apache

install on Unix. The webspace need not map directly to the filesystem, since

webpages may be generated dynamically from databases or other locations.

The <Directory> and <Files> directives, along with their regex counterparts,

apply directives to parts of the filesystem. Directives enclosed in a <Directory>

section apply to the named filesystem directory and all subdirectories of that

directory.

The <Location> directive and its regex counterpart, on the other hand, change

the configuration for content in the webspace. For example, the following configuration prevents access to any URL-path that begins in /private. In

particular, it will apply to requests for http://yoursite.example.com/private,

http://yoursite.example.com/private123, and

http://yoursite.example.com/private/dir/file.html as well as any

other requests starting with the /private string.

<Location /private>

Order Allow,Deny

Deny from all

</Location>


There are two basic types of containers. Most containers are evaluated for each

request. The enclosed directives are applied only for those requests that match the containers. The <IfDefine>, <IfModule>, and <IfVersion> containers, on

the other hand, are evaluated only at server startup and restart. If their conditions

are true at startup, then the enclosed directives will apply to all requests. If the

conditions are not true, the enclosed directives will be ignored.

The <IfDefine> directive encloses directives that will only be applied if an

appropriate parameter is defined on the httpd command line. For example, with

the following configuration, all requests will be redirected to another site only if the server is started using httpd -D ClosedForNow:

<IfDefine ClosedForNow>

Redirect / http://otherserver.example.com/

</IfDefine>

The <IfModule> directive is very similar, except it encloses directives that will

only be applied if a particular module is available in the server.

In the following example, the MimeMagicFile directive will be applied only if mod_mime_magic is available.

<IfModule mod_mime_magic.c>
    MimeMagicFile conf/magic
</IfModule>

The <IfVersion> directive is very similar to <IfDefine> and <IfModule>, except it encloses directives that will only be applied if a particular version of the server is executing. The module that provides it, mod_version, is designed for use in test suites and large networks which have to deal with different httpd versions and different configurations.

<IfVersion >= 2.1>
    # this happens only in versions greater than or
    # equal to 2.1.0.
</IfVersion>


<Directory> and </Directory> are used to enclose a group of directives that

will apply only to the named directory and sub-directories of that directory. Any

directive that is allowed in a directory context may be used. Directory-path is either

the full path to a directory, or a wild-card string using Unix shell-style matching. In a

wild-card string, ? matches any single character, and * matches any sequences of

characters. You may also use [] character ranges. None of the wildcards match a "/" character, so <Directory /*/public_html> will not match

/home/user/public_html, but <Directory /home/*/public_html> will

match. Example:

<Directory /usr/local/httpd/htdocs>

Options Indexes FollowSymLinks

</Directory>

Be careful with the directory-path arguments: They have to literally match the

filesystem path which Apache uses to access the files. Directives applied to a particular <Directory> will not apply to files accessed from that same directory

via a different path, such as via different symbolic links. Regular expressions can

also be used, with the addition of the ~ character. For example:

<Directory ~ "^/www/.*/[0-9]{3}">

would match directories below /www/ whose names begin with three digits.


The <Location> directive limits the scope of the enclosed directives by URL. It is

similar to the <Directory> directive, and starts a subsection which is terminated

with a </Location> directive. <Location> sections are processed in the order they

appear in the configuration file, after the <Directory> sections and .htaccess

files are read, and after the <Files> sections.

<Location> sections operate completely outside the filesystem. This has several

consequences. Most importantly, <Location> directives should not be used to

control access to filesystem locations. Since several different URLs may map to the

same filesystem location, such access controls may be circumvented.

When to use <Location>

Use <Location> to apply directives to content that lives outside the filesystem.

For content that lives in the filesystem, use <Directory> and <Files>. An

exception is <Location />, which is an easy way to apply a configuration to the

entire server.

For all origin (non-proxy) requests, the URL to be matched is a URL-path of the

form /path/. No scheme, hostname, port, or query string may be included. For proxy

requests, the URL to be matched is of the form scheme://servername/path, and you

must include the prefix.

The URL may use wildcards. In a wild-card string, ? matches any single character,

and * matches any sequences of characters. Neither wildcard character matches a /

in the URL-path.
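A typical case of content that lives outside the filesystem is a handler-generated page. The sketch below assumes mod_status is loaded and uses the conventional /server-status URL-path; the address restriction is an illustrative choice.

```apache
# /server-status does not correspond to any file on disk,
# so <Location> is the appropriate container.
<Location /server-status>
    SetHandler server-status
    Order Deny,Allow
    Deny from all
    Allow from 127.0.0.1
</Location>
```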


The <Files> directive limits the scope of the enclosed directives by filename. It is

comparable to the <Directory> and <Location> directives. It should be

matched with a </Files> directive. The directives given within this section will be

applied to any object with a basename (last component of filename) matching the specified filename. <Files> sections are processed in the order they appear in the

configuration file, after the <Directory> sections and .htaccess files are read,

but before <Location> sections. Note that <Files> can be nested inside

<Directory> sections to restrict the portion of the filesystem they apply to.

Note that unlike <Directory> and <Location> sections, <Files> sections

can be used inside .htaccess files. This allows users to control access to their

own files, at a file-by-file level.

The <FilesMatch> directive limits the scope of the enclosed directives by

filename, just as the <Files> directive does. However, it accepts a regular

expression. For example :

<FilesMatch "\.(gif|jpe?g|png)$">

would match most common Internet graphics formats.
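The following sketch shows both forms in use. The directory and the file name private.html are assumptions for illustration.

```apache
# <Files> nested in <Directory>: applies only to private.html
# in this directory and its subdirectories.
<Directory "/usr/local/apache2/htdocs">
    <Files "private.html">
        Order Allow,Deny
        Deny from all
    </Files>
</Directory>

# <FilesMatch> with a regular expression: refuse to serve any file
# whose name starts with ".ht" (for example .htaccess, .htpasswd).
<FilesMatch "^\.ht">
    Order Allow,Deny
    Deny from all
</FilesMatch>
```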


.htaccess files (or "distributed configuration files") provide a way to make

configuration changes on a per-directory basis. A file, containing one or more

configuration directives, is placed in a particular document directory, and the

directives apply to that directory, and all subdirectories thereof.

In general, .htaccess files use the same syntax as the main configuration files.

What you can put in these files is determined by the AllowOverride directive.

This directive specifies, in categories, what directives will be honored if they are found in a .htaccess file. If a directive is permitted in a .htaccess file, the

documentation for that directive will contain an Override section, specifying what

value must be in AllowOverride in order for that directive to be permitted.

For example, if you look at the documentation for the AddDefaultCharset

directive, you will find that it is permitted in .htaccess files. The Override line

reads FileInfo. Thus, you must have at least AllowOverride FileInfo in

order for this directive to be honored in .htaccess files.

In general, you should never use .htaccess files unless you don't have access to

the main server configuration file. .htaccess files should be used in a case where

the content providers need to make configuration changes to the server on a per-

directory basis, but do not have root access on the server system. In the event that

the server administrator is not willing to make frequent configuration changes, it

might be desirable to permit individual users to make these changes in .htaccess

files for themselves. This is particularly true, for example, in cases where ISPs are

hosting multiple user sites on a single machine, and want their users to be able to

alter their configuration.

However, in general, use of .htaccess files should be avoided when possible.

Any configuration that you would consider putting in a .htaccess file, can just as effectively be made in a <Directory> section in your main server configuration

file.
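The equivalence can be sketched as follows. The directory /www/htdocs/example and the charset value are assumptions; the block shows the contents of two different files, separated by comments.

```apache
# --- Option 1: main configuration plus a .htaccess file ---
# In httpd.conf: honor FileInfo directives found in .htaccess files.
<Directory "/www/htdocs/example">
    AllowOverride FileInfo
</Directory>

# In /www/htdocs/example/.htaccess:
AddDefaultCharset utf-8

# --- Option 2: main configuration only, same effect ---
<Directory "/www/htdocs/example">
    AllowOverride None
    AddDefaultCharset utf-8
</Directory>
```

With the second option the server does not need to look for .htaccess files in that directory tree on each request.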


The configuration directives found in a .htaccess file are applied to the directory

in which the .htaccess file is found, and to all subdirectories thereof. However, it

is important to also remember that there may have been .htaccess files in

directories higher up. Directives are applied in the order that they are found. Therefore, a .htaccess file in a particular directory may override directives found

in .htaccess files found higher up in the directory tree. And those, in turn, may

have overridden directives found yet higher up, or in the main server configuration

file itself.

AllowOverride Options must be in effect to permit the use of the Options directive in .htaccess files.

.htaccess files can override the <Directory> sections for the corresponding

directory, but will be overridden by other types of configuration sections from the

main configuration files. This fact can be used to enforce certain configurations, even in the presence of a liberal AllowOverride setting. For example, to prevent

script execution while allowing anything else to be set in .htaccess you can use:

<Directory />
    AllowOverride All
</Directory>

<Location />
    Options +IncludesNoExec -ExecCGI
</Location>

In the event of a problem or error, Apache can be configured to do one of four

things:

• Output a simple hardcoded error message

• Output a customized message

• Redirect to a local URL-path to handle the problem/error

• Redirect to an external URL to handle the problem/error

The first option is the default, while options 2-4 are configured using the ErrorDocument directive, which is followed by the HTTP response code and a

URL or a message. Apache will sometimes offer additional information regarding

the problem/error. URLs can begin with a slash (/) for local web-paths (relative to the DocumentRoot), or be a full URL which the client can resolve. Alternatively, a

message can be provided to be displayed by the browser. Examples:

ErrorDocument 500 http://foo.example.com/cgi-bin/tester

ErrorDocument 404 /cgi-bin/bad_urls.pl

ErrorDocument 401 /subscription_info.html

ErrorDocument 403 "Sorry can't allow you access today"

Additionally, the special value default can be used to specify Apache's simple

hardcoded message. While not required under normal circumstances, default will

restore Apache's simple hardcoded message for configurations that would otherwise inherit an existing ErrorDocument.

ErrorDocument 404 /cgi-bin/bad_urls.pl

<Directory /web/docs>

ErrorDocument 404 default

</Directory>

The DirectoryIndex directive sets the list of resources to look for, when the

client requests an index of the directory by specifying a / at the end of the directory

name. Local-url is the (%-encoded) URL of a document on the server relative to the

requested directory; it is usually the name of a file in the directory. Several URLs

may be given, in which case the server will return the first one that it finds. If none of

the resources exist and the Indexes option is set, the server will generate its own

listing of the directory.

DirectoryIndex index.html

Then a request for http://myserver/docs/ would return

http://myserver/docs/index.html if it exists, or would list the directory if it

did not.
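The fallback behaviour described above can be sketched as follows. The directory path and the second file name are assumptions chosen for illustration.

```apache
# For a request ending in "/", try index.html first, then index.htm.
# If neither exists, Options +Indexes lets the server generate a listing.
<Directory "/web/docs">
    Options +Indexes
    DirectoryIndex index.html index.htm
</Directory>
```

Without the Indexes option, a directory that has none of the listed index files would normally produce an error response instead of a listing.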

The index of a directory can come from one of two sources:

• A file written by the user, typically called index.html. The DirectoryIndex directive sets the name of this file. This is controlled by mod_dir.

• Otherwise, a listing generated by the server. The other directives control the format of this listing. The AddIcon, AddIconByEncoding and AddIconByType directives are used to set a list of icons to display for various file types; for each file listed, the first icon listed that matches the file is displayed. These are controlled by mod_autoindex.
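A short sketch of how the icon directives might be combined for a server-generated listing. The icon paths assume the stock /icons/ alias that ships with many Apache installations, and /web/docs is a placeholder directory.

```apache
<Directory "/web/docs">
    Options +Indexes
    # Icons are only shown in the "fancy" listing format.
    IndexOptions FancyIndexing
</Directory>

# Matched in order; the first icon that matches a file is displayed.
AddIconByType (IMG,/icons/image2.gif) image/*
AddIconByEncoding (CMP,/icons/compressed.gif) x-compress x-gzip
AddIcon /icons/binary.gif .bin .exe
AddIcon /icons/dir.gif ^^DIRECTORY^^
# Used when nothing above matches.
DefaultIcon /icons/unknown.gif
```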

Apache runs as a permanent background task: a daemon (UNIX) or service

(Windows). Start-up is a slow and expensive operation, so for an operational server,

it is usual for Apache to start at system boot and remain permanently up.

The Apache HTTP Server comprises a relatively small core, together with a number

of modules. Modules may be compiled statically into the server or, more commonly,

held in a /modules/ or /libexec/ directory and loaded dynamically at runtime. In

addition, the server relies on the Apache Portable Runtime (APR) libraries, which

provide a cross-platform operating system layer and utilities, so that modules don't

have to rely on non-portable operating system calls. A special-purpose module, the

Multi-Processing Module (MPM), serves to optimize Apache for the underlying

operating system. The MPM should normally be the only module to access the

operating system other than through the APR.

Apache operation proceeds in two phases: start-up and operational. System start-up

takes place as root, and includes parsing the configuration file(s), loading modules,

and initializing system resources such as log files, shared memory segments, and

database connections. For normal operation, Apache relinquishes its system

privileges and runs as an unprivileged user before accepting and processing

connections from clients over the network. This basic security measure helps to

prevent a simple bug in Apache (or a module or script) from becoming a devastating

system vulnerability, like those exploited by malware such as "Code Red" and

"Nimda" in MS IIS.

This two-stage operation has some implications for applications architecture. First,

anything that requires system privileges must be run at system start-up. Second, it

is good practice to run as much initialization as possible at start-up, so as to

minimize the processing required to service each request.

One non-intuitive quirk of the architecture is that the configuration code is, in fact,

executed twice at start-up (although not at restart). The first time through checks

that the configuration is valid (at least to the point that Apache can successfully

start); the second pass is "live" and leads into the operational phase. Most modules

can ignore this behavior (standard use of APR pools ensures that it doesn't cause a

resource leak), but it may have implications for some modules.

The Apache Portable Runtime (APR) is a supporting library for the Apache web

server. It provides a set of APIs that map to the underlying operating system. Where

the OS doesn't support a particular function, APR will provide a replacement. Thus,

the APR can be used to make a program truly portable across platforms.

APR was originally a part of Apache, but has now been spun off into a separate

project of the Apache Software Foundation, and is used by other applications to

achieve platform independence.

The range of platform-independent functionality provided by APR includes:

• Memory allocation and memory pool functionality

• Atomic operations

• Dynamic library handling

• File I/O

• Command argument parsing

• Locking

• Hash tables and arrays

• Mmap functionality

• Network sockets and protocols

• Thread, process and mutex functionality

• Shared memory functionality

• Time routines

• User and group ID services

At the end of the start-up phase, after the configuration has been read, overall

control of Apache passes to a Multi-Processing Module. The MPM provides the

interface between the running Apache server and the underlying operating system.

Its primary role is to optimize Apache for each platform, while ensuring the server

runs efficiently and securely. Uniquely among modules, every Apache instance must contain

exactly one MPM, which is selected at build-time.

Why MPMs?

The old NCSA server, and Apache 1, grew up in a UNIX environment. It was a

multiprocess server, where each client would be serviced by one server instance. If

there were more concurrent clients than server processes, Apache would fork

additional server processes to deal with them. Under normal operation, Apache

would maintain a pool of available server processes to deal with incoming requests.

Whereas this scheme works well on UNIX-family systems, it is an inefficient solution

on platforms such as Windows, where forking a process is an expensive operation.

So making Apache truly cross-platform required another solution. The approach

adopted for Apache 2 is to turn the core processing into a pluggable module, the

MPM, which can be optimized for different environments. The MPM architecture

also allows different Apache models to coexist even within a single operating

system, thus providing users with options for different usages.

In practice, only UNIX-family operating systems offer a useful choice: other

supported platforms (Windows, Netware, OS/2, BeOS) have a single MPM

optimized for each platform. UNIX has two production-quality MPMs (Prefork and

Worker) available as standard, a third (Event) that is thought to be stable for non-

SSL uses in Apache 2.2, and several experimental options unsuitable for production

use. Third-party MPMs are also available.
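Each MPM is tuned through its own directives. As a hedged illustration, a Prefork pool in Apache 2.2 could be sized as below; the numbers are placeholder values in the style of the sample configuration, not recommendations.

```apache
# Applies only when the server was built with the Prefork MPM.
<IfModule mpm_prefork_module>
    # Child processes created at start-up.
    StartServers          5
    # Bounds on the pool of idle children kept ready for new requests.
    MinSpareServers       5
    MaxSpareServers      10
    # Upper limit on simultaneous child processes (and so on clients).
    MaxClients          150
    # 0 means a child is never recycled after a fixed number of requests.
    MaxRequestsPerChild   0
</IfModule>
```

Wrapping the settings in <IfModule> keeps the same configuration file usable with a server built around a different MPM.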

Start of the module list:

mod_actions : This module provides for executing CGI scripts based on media type or request method.

mod_alias : Provides for mapping different parts of the host filesystem in the document tree and for URL redirection

mod_asis : Sends files that contain their own HTTP headers

mod_auth_basic : Basic authentication

mod_auth_digest : User authentication using MD5 Digest Authentication.

mod_authn_alias : Provides the ability to create extended authentication providers based on actual providers

mod_authn_anon : Allows "anonymous" user access to authenticated areas

mod_authn_dbd : User authentication using an SQL database

mod_authn_dbm : User authentication using DBM files

mod_authn_default : Authentication fallback module

mod_authn_file : User authentication using text files

mod_authnz_ldap : Allows an LDAP directory to be used to store the database for HTTP Basic authentication.

mod_authz_dbm : Group authorization using DBM files

mod_authz_default : Authorization fallback module

mod_authz_groupfile : Group authorization using plaintext files

mod_authz_host : Group authorizations based on host (name or IP address)

mod_authz_owner : Authorization based on file ownership

mod_authz_user : User Authorization

mod_autoindex : Generates directory indexes, automatically, similar to the Unix ls command or the Win32 dir shell command

mod_cache : Content cache keyed to URIs.

mod_actions :

This module provides for executing CGI scripts based on media type or request

method.

mod_alias :

Provides for mapping different parts of the host filesystem in the document tree and

for URL redirection

mod_asis :

Sends files that contain their own HTTP headers

mod_authn_dbd :

Provides authentication front-ends such as mod_auth_digest and

mod_auth_basic to authenticate users by looking up users in SQL tables.

mod_authn_dbm :

Provides authentication front-ends such as mod_auth_digest and

mod_auth_basic to authenticate users by looking up users in dbm password files

mod_rewrite :

Rule-based rewriting engine (based on a regular-expression parser) to rewrite

requested URLs on the fly. It supports an unlimited number of rules and an unlimited

number of attached rule conditions for each rule, to provide a really flexible and

powerful URL manipulation mechanism.

mod_alias :

The directives contained in this module allow for manipulation and control of URLs as requests arrive at the server. The Alias and ScriptAlias directives are used

to map between URLs and filesystem paths.
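To make the difference between mod_alias and mod_rewrite more concrete, here is a small sketch in server (not .htaccess) context. All paths and host names are illustrative assumptions.

```apache
# mod_alias: map URL prefixes onto filesystem locations.
Alias /image /ftp/pub/image
# Same idea, but the target directory is treated as containing CGI scripts.
ScriptAlias /cgi-bin/ /web/cgi-bin/
# Simple prefix-based redirect to another server.
Redirect /service http://foo2.example.com/service

# mod_rewrite: rule-based rewriting with an attached condition.
RewriteEngine On
# Only act when the request was not addressed to the canonical host name.
RewriteCond %{HTTP_HOST} !^www\.example\.com$ [NC]
# Send the client to the same path on the canonical host.
RewriteRule ^/(.*)$ http://www.example.com/$1 [R=301,L]
```

For plain prefix mappings and redirects, the mod_alias directives are usually the simpler choice; mod_rewrite earns its complexity when the decision depends on conditions or pattern captures.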

mod_env :

Allows for control of the environment that will be provided to CGI scripts and SSI

pages. Environment variables may be passed from the shell which invoked the httpd

process.

mod_dir :

Provides for "trailing slash" redirects and serving directory index files

mod_include :

This module provides a filter which will process files before they are sent to the

client. The processing is controlled by specially formatted SGML comments,

referred to as elements. These elements allow conditional text, the inclusion of other

files or programs, as well as the setting and printing of environment variables.

mod_log_config :

Provides for flexible logging of client requests. Logs are written in a customizable

format, and may be written directly to a file, or to an external program. Conditional

logging is provided so that individual requests may be included or excluded from the

logs based on characteristics of the request.

mod_example :

Some files in the modules/experimental directory under the Apache distribution

directory tree are provided as an example to those that wish to write modules that

use the Apache API.
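A brief sketch of mod_env and mod_log_config in use. The variable name APP_MODE and the log file path are assumptions; SetEnvIf comes from mod_setenvif and is used here only to mark requests for conditional logging.

```apache
# mod_env: pass one variable through from the shell that started httpd,
# and set another explicitly for CGI scripts and SSI pages.
PassEnv LD_LIBRARY_PATH
SetEnv APP_MODE production

# mod_log_config: define a named log format.
LogFormat "%h %l %u %t \"%r\" %>s %b" common

# Mark requests for GIF images, then leave them out of the access log.
SetEnvIf Request_URI "\.gif$" dontlog
CustomLog logs/access_log common env=!dontlog
```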

The purpose of Apache's start-up phase is to read the configuration, load modules

and libraries, and initialize required resources. Each module may have its own

resources, and has the opportunity to initialize those resources. At start-up, Apache

runs as a single-process, single-thread program and has full system privileges.

Apache's main configuration file is normally called httpd.conf. The httpd.conf

configuration file is a plain text file and is parsed line-by-line at server start-up. The

contents of httpd.conf comprise directives, containers, and comments. Blank lines

and leading whitespace are also allowed, but will be ignored.

At the end of the start-up phase, control passes to the Multi-Processing Module. The

MPM is responsible for managing Apache's operation at a systems level. It typically

does so by maintaining a pool of worker processes and/or threads, as appropriate to

the operating system and other applicable constraints (such as optimization for a

particular usage scenario). The original process remains as "master," maintaining a

pool of worker children. These workers are responsible for servicing incoming

connections, while the parent process deals with creating new children, removing

surplus ones as necessary, and communicating signals such as "shut down" or

"restart."

Because of the MPM architecture, it is not possible to describe the operational

phase in definite terms. Whereas the standard MPMs use worker children in some

manner, they are not constrained to work in only one way. Thus another MPM

could, in principle, implement an entirely different server architecture at the system

level. There is no shutdown phase as such. Instead, anything that needs to be done on

shutdown is registered as a cleanup. When Apache stops, all registered cleanups

are run.

The Apache 2.2 release brings a number of improvements and is a painless upgrade from

the previous version. It's an incremental update from Apache 2.0: it adds new

features and consolidates existing capabilities, but preserves the underlying

architecture and API. Some of the most exciting changes serve to make Apache

altogether more scalable for the most demanding users. That's not to say it wasn't

already scalable: system administrators in various roles such as news sites and big

download sites report a single server sustaining well over 20,000 concurrent

connections at full performance.

Content Generation

The simplest possible formulation of a webserver is a program that listens for HTTP

requests and returns a response when it receives one. In Apache, this job is

fundamentally the business of a content generator, the core of the webserver.

Most, though by no means all, modules are concerned with some aspect of

processing an HTTP request. But there is rarely, if ever, a reason for a module to concern itself with every aspect of HTTP—that is the business of the httpd. The

advantage of a modular approach is that a module can easily focus on a particular

task but ignore aspects of HTTP that are not relevant to it.

Exactly one content generator must be run for every HTTP request. Any module

may register content generators, normally by defining a function referenced by a

handler that can be configured using the SetHandler or AddHandler directives in

httpd.conf.

The default generator, which is used when no specific generator is defined by any

module, simply returns a file, mapped directly from the request to the filesystem.

Modules that implement content generators are sometimes known as "content

generator" or "handler" modules.
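The two directives mentioned above select a content generator in slightly different ways. A minimal sketch, assuming mod_status and a CGI module are loaded:

```apache
# SetHandler: every request under this URL is answered by the
# "server-status" handler registered by mod_status.
<Location /server-status>
    SetHandler server-status
</Location>

# AddHandler: files whose names end in .cgi are passed to the
# "cgi-script" handler instead of the default file handler.
AddHandler cgi-script .cgi
```

Requests that match neither directive fall through to the default generator, which returns the mapped file as-is.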

In principle, a content generator can handle all the functions of a webserver. For

example, a CGI program gets the request and produces the response, and it can

take full control of what happens between them. Like other webservers, Apache

splits the request into different phases. For example, it checks whether the user is

authorized to do something before the content generator does that thing.

Several request phases precede the content generator. These serve to examine

and perhaps manipulate the request headers, and to determine what to do with the

request. For example:

• The request URL will be matched against the configuration, to determine which content generator should be used.

• The request URL will normally be mapped to the filesystem. The mapping may be to a static file, a CGI script, or whatever else the content generator may use.

• If content negotiation is enabled, mod_negotiation will find the version of the resource that best matches the browser's preference. For example, the Apache manual pages are served in the language requested by the browser.

• Access and authentication modules will enforce the server's access rules, and determine whether the user is permitted what has been requested.

• mod_alias or mod_rewrite may change the effective URL in the request.

Nonstandard Request Processing

Request processing may sometimes be diverted from the standard processing:

• A module may divert processing into a new request or error document at any point before the response has been sent.

• A module may define additional phases and enable other modules to hook their own processing in.

access_checker : Apache checks whether access to the requested resource is

permitted according to the server configuration (httpd.conf). A module can add to or

replace Apache's standard logic, which implements the Allow/Deny From directives

in mod_access (httpd 1.x and 2.0) or mod_authz_host (httpd 2.2).

check_user_id : If any authentication method is in use, Apache will apply the

relevant authentication and set the username field r->user. A module may

implement an authentication method with this hook.

auth_checker : This hook checks whether the requested operation is permitted to

the authenticated user.

type_checker : This hook applies rules related to the MIME type (where

applicable) of the requested resource, and determines the content handler to use (if

not already set). Standard modules implementing this hook include mod_negotiation

and mod_mime.

fixups : This general-purpose hook enables modules to run any necessary

processing after the preceding hooks but before the content generator. Like

post_read_request, it is something of a catch-all, and is one of the most commonly

used hooks.

handler : This is the content generator hook. It is responsible for sending an

appropriate response to the client. If there are input data, the handler is also

responsible for reading them. Unlike the other hooks, where zero or many functions

may be involved in processing a request, every request is processed by exactly one

handler.

log_transaction : This hook logs the transaction after the response has been

returned to the client. A module may modify or replace Apache's standard logging.

A module may hook its own handlers into any of these processing phases. The

module provides a callback function and hooks it in, and Apache calls the function

during the appropriate processing phase. Modules that concern themselves with the

phases before content generation are sometimes known as metadata modules.
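From the administrator's side, the access, authentication and authorization hooks are driven by ordinary configuration. The sketch below uses Apache 2.2 syntax; the directory, network range and password file path are placeholders.

```apache
<Directory "/web/docs/private">
    # access_checker phase (mod_authz_host): host-based rules.
    Order Deny,Allow
    Deny from all
    Allow from 192.168.1.0/24

    # check_user_id phase (mod_auth_basic with a text-file provider):
    # establishes who the user is.
    AuthType Basic
    AuthName "Restricted Area"
    AuthUserFile /usr/local/apache2/passwd/passwords

    # auth_checker phase (mod_authz_user): decides whether that user may proceed.
    Require valid-user
</Directory>
```

With the default "Satisfy All" behaviour, a request must pass both the host check and the user check before the content generator runs.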

Apache 2 Filters are handlers for processing data of the request and the response.

They have a common interface and are interchangeable. In the figure on the slide

you see two example filter chains: The input filter chain to process the data of the

request and the output filter chain to process the data of the response (provided by

the content handler). The "Request processing" agent triggers the input filter chain

while reading the request. An important use of the input filter chain is the SSL

module providing secure HTTP (HTTPS) communication. Besides separating filters

into input and output filters, three different categories can be distinguished:

Resource and Content Set Filters

Resource Filters alter the content that is passed through them. Server Side Includes

(SSI) or PHP scripting are typical examples.

Content Set Filters alter the content as a whole, for example to compress or

decompress it (Deflate).

Protocol and Transcode Filters

Protocol Filters are used to implement the protocol's behavior (HTTP, POP, ...).

That way future versions of HTTP could be supported.

Transcode Filters alter the transport encoding of request or response. For example,

the chunk output filter splits a resource into data chunks which it sends to the client

one after another.

Connection and Network Filters

Connection Filters deal with establishing and releasing a connection. For example,

establishing an HTTPS connection requires a special handshake between client and

server. They may also alter the content, in the HTTPS example by encrypting and

decrypting data.

Network Filters are responsible for interacting with the operating system to establish

network connections and complete associated tasks. To support protocols other

than TCP/IP, only a module implementing an input and output filter for the specific

connection protocol is needed.
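Output filters from the first two categories are typically attached through configuration. A hedged sketch, assuming mod_include and mod_deflate are loaded and that .shtml is the extension used for SSI pages:

```apache
# Resource filter: run Server Side Includes over .shtml files.
Options +Includes
AddOutputFilter INCLUDES .shtml

# Content-set filter: compress responses of the listed media types.
AddOutputFilterByType DEFLATE text/html text/plain text/css
```

Protocol, connection and network filters are normally installed by the server and its modules rather than by directives like these.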

Each Apache module can provide a handler function for any of the request

processing phases. There are four types of return value possible for every handler:

• DECLINED means the module declined to handle this phase; Apache moves to the next module in the module list.

• OK means that this phase has been processed; Apache will move on to the next phase without giving any more modules an opportunity to handle this phase.

• An error return (which is any HTTP [7] error constant) will cause Apache to produce an error page and jump to the Logging phase.

• A special value of DONE means the whole request has been serviced; Apache will jump to the Logging phase.

The DECLINED return is somewhat deceiving, because many modules actually

perform some action and then return DECLINED to give other modules an

opportunity to handle the phase. The example below illustrates how the DECLINED

return can be used in a handler that inserts a silly reply header into every request:

from mod_python import apache

def fixup(req):
    # Add a reply header, then decline so other modules can still handle this phase.
    req.headers_out["X-Grok-this"] = "Python-Psychobabble"
    return apache.DECLINED
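A handler like this still has to be attached to a phase in the server configuration. The fragment below is a sketch only: the module name grok, the directory /web/python that holds grok.py, and the document directory are all hypothetical.

```apache
LoadModule python_module modules/mod_python.so

<Directory "/web/docs">
    # Make the (hypothetical) directory holding grok.py importable.
    PythonPath "sys.path + ['/web/python']"
    # Run the function "fixup" from module "grok" during the fixups phase.
    PythonFixupHandler grok::fixup
</Directory>
```

The "::fixup" suffix names the function explicitly; without it, mod_python would look for a function called fixuphandler in the module.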

At this point it should be a bit clearer how this functionality differs from the CGI

environment. Comparing CGI with mod_python is not very meaningful, because the

scope of CGI is much narrower. One difference is that CGI is intended exclusively

for dynamic content generation, which is not a requirement for mod_python scripts.

For example, consider a mod_python script that implements a custom logging

mechanism for the entire server, which plays no role in content generation.

Apache is configured by placing directives in plain text configuration files. The main configuration file is usually called httpd.conf. In addition, other configuration files

may be added using the Include directive, and wildcards can be used to include

many configuration files. Any directive may be placed in any of these configuration

files.

Changes to the main configuration files are only recognized by Apache when it is

started or restarted.

The server also reads a file containing MIME document types; the filename is set by the TypesConfig directive, and is mime.types by default.

Apache configuration files contain one directive per line. The back-slash "\" may be

used as the last character on a line to indicate that the directive continues onto the

next line. There must be no other characters or white space between the back-slash

and the end of the line.

Directives in the configuration files are case-insensitive, but arguments to directives

are often case sensitive. Lines that begin with the hash character "#" are considered

comments, and are ignored. Comments may not be included on a line after a

configuration directive. Blank lines and white space occurring before a directive are

ignored, so you may indent directives for clarity.
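The syntax rules above can be seen together in a short sketch. The host name and file paths are placeholders.

```apache
# A comment must occupy a line of its own.
ServerName www.example.com

    # Leading white space is ignored, so directives may be indented.
    TypesConfig conf/mime.types

# Directive names are case-insensitive; arguments often are not.
documentroot "/web/docs"

# A trailing backslash continues the directive onto the next line.
DirectoryIndex index.html \
    index.htm

# Pull in additional configuration files with a wildcard.
Include conf/extra/*.conf
```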

.htaccess files (or "distributed configuration files") provide a way to make

configuration changes on a per-directory basis. A file, containing one or more

configuration directives, is placed in a particular document directory, and the

directives apply to that directory, and all subdirectories thereof.

If you want to call your .htaccess file something else, you can change the name

of the file using the AccessFileName directive. For example, if you would rather

call the file .config then you can put the following in your server configuration file:

AccessFileName .config

In general, .htaccess files use the same syntax as the main configuration files.

What you can put in these files is determined by the AllowOverride directive.

This directive specifies, in categories, what directives will be honored if they are found in a .htaccess file. If a directive is permitted in a .htaccess file, the

documentation for that directive will contain an Override section, specifying what value must be in AllowOverride in order for that directive to be permitted.

For example, if you look at the documentation for the AddDefaultCharset

directive, you will find that it is permitted in .htaccess files. The Override line

reads FileInfo. Thus, you must have at least AllowOverride FileInfo in

order for this directive to be honored in .htaccess files.
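Put together, the AddDefaultCharset case might look like the sketch below. The directory /web/docs is an assumed example path; the two parts live in different files, as the comments indicate.

```apache
# --- In httpd.conf ---
# Permit only FileInfo-class directives in .htaccess files under /web/docs.
<Directory "/web/docs">
    AllowOverride FileInfo
</Directory>

# --- In /web/docs/.htaccess ---
# Honored because AddDefaultCharset belongs to the FileInfo override class.
AddDefaultCharset utf-8
```

A directive from another class, such as Options, placed in the same .htaccess file would not be accepted under this setting.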

There are two main reasons to avoid the use of .htaccess files.

The first of these is performance. When AllowOverride is set to allow the use of

.htaccess files, Apache will look in every directory for .htaccess files. Thus,

permitting .htaccess files causes a performance hit, whether or not you actually

even use them! Also, the .htaccess file is loaded every time a document is

requested.

Further note that Apache must look for .htaccess files in all higher-level

directories, in order to have a full complement of directives that it must apply.

The second consideration is one of security. You are permitting users to modify

server configuration, which may result in changes over which you have no control.

Carefully consider whether you want to give your users this privilege.

Virtual hosting is a method that servers such as web servers use to host more

than one domain name on the same computer, sometimes on the same IP address.

Virtual web hosting is one of the most popular hosting options available at the

moment—probably because it is one of the most cost effective options on the

market. Also known as shared web hosting, virtual hosting allows a website owner

to have a site hosted on a web server that is shared with other websites. In simple

terms, the virtual hosting company's server will allocate out hosting services and

bandwidth to more than one website. Virtual web hosting is a cheaper hosting option

because you won't have to pay for a dedicated server to host just your website.

Virtual web hosting is a good solution for small- to medium-sized (and even some

larger) websites that aren't constantly being visited or that have reasonable

bandwidth needs.

You can use the Apache HTTP Server's virtual hosts capability to run different

servers for different IP addresses, different host names, or different ports on the

same server.

Virtual web hosting is often used on large scale in companies whose business

model is to provide low cost website hosting for customers. The vast majority of

such web hosting service customer websites worldwide are hosted on shared

servers, using virtual hosting technology.

Many businesses use virtual servers for internal purposes, where there is a

technical or administrative reason to keep several separate websites, such as a

customer extranet, an employee extranet, an internal intranet, and intranets for

different departments. If there are no security concerns in the website architectures,

they can be merged into a single server using virtual hosting technology, which

reduces management and administrative overhead and the number of separate

servers required to support the business.

With the NameVirtualHost directive you specify the IP address on which the

server will receive requests for the name-based virtual hosts. This will usually be the

address to which your name-based virtual host names resolve. In cases where a

firewall or other proxy receives the requests and forwards them on a different IP

address to the server, you must specify the IP address of the physical interface on

the machine which will be servicing the requests. If you have multiple name-based

hosts on multiple addresses, repeat the directive for each address.

The term Virtual Host refers to the practice of running more than one web site (such

as www.company1.com and www.company2.com) on a single machine. Virtual

hosts can be "IP-based", meaning that you have a different IP address for every

web site, or "name-based", meaning that you have multiple names running on each

IP address. The fact that they are running on the same physical server is not

apparent to the end user.

Apache was one of the first servers to support IP-based virtual hosts right out of the

box. Versions 1.1 and later of Apache support both IP-based and name-based

virtual hosts (vhosts). The latter variant of virtual hosts is sometimes also called

host-based or non-IP virtual hosts.
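A minimal sketch of the IP-based variant, assuming the machine has two addresses configured and the server is already listening on port 80. The addresses (taken from a documentation-only range) and the document roots are placeholders.

```apache
# Each site is selected purely by the address the client connected to.
<VirtualHost 192.0.2.10:80>
    ServerName www.company1.com
    DocumentRoot /www/company1
</VirtualHost>

<VirtualHost 192.0.2.20:80>
    ServerName www.company2.com
    DocumentRoot /www/company2
</VirtualHost>
```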

Prior to 2.3.11, NameVirtualHost was required to instruct the server that a particular

IP address and port combination was usable as a name-based virtual host. In 2.3.11

and later, any time an IP address and port combination is used in multiple virtual

hosts, name-based virtual hosting is automatically enabled for that address.

In 2.3.11 and later, the NameVirtualHost directive therefore has no effect.

Name-based virtual hosts use multiple host names for the same webserver IP address.

With web browsers that support HTTP/1.1, as nearly all now do, upon connecting to

a webserver, the browsers send the address that the user typed into their browser's

address bar, the URL. The server can use this information to determine which web

site, as well as page, to show the user. The browser specifies the address by setting

the Host HTTP header with the host specified by the user. The Host header is

required in all HTTP/1.1 requests.

For instance, a server could be receiving requests for two domains, www.site1.com and www.site2.com, both of which resolve to the same IP

address. For www.site1.com, the server would send the HTML file from the

directory /var/www/user/Joe/site/, while requests for www.site2.com

would make the server serve pages from /var/www/user/Mary/site/.
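The two-site scenario just described could be configured roughly as follows. This is a sketch in Apache 2.2 style; the wildcard address *:80 is an assumption, and the document roots are the directories named in the text.

```apache
# Needed before 2.3.11 to mark this address/port as name-based.
NameVirtualHost *:80

# The first block also acts as the default when no Host header matches.
<VirtualHost *:80>
    ServerName www.site1.com
    DocumentRoot /var/www/user/Joe/site
</VirtualHost>

<VirtualHost *:80>
    ServerName www.site2.com
    DocumentRoot /var/www/user/Mary/site
</VirtualHost>
```

The server compares the Host header of each request with the ServerName values to decide which block applies.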

If the Domain Name System (DNS) is not properly functioning, it becomes much

harder to access a virtually-hosted website. The user could try to fall back to using

the IP address to contact the system, as in http://10.23.45.67/. The web browser

doesn't know which hostname to use when this happens; moreover, since the web

server relies on the web browser client telling it what server name (vhost) to use, the

server will respond with a default website—often not the site the user expects.

A workaround in this case is to add the IP address and hostname to the client

system's hosts file. Accessing the server with the domain name should work again.

Users should be careful when doing this, however, as any changes to the true

mapping between hostname and IP address will be overridden by the local setting.

This workaround is not really useful for an average web user, but may be of some

use to a site administrator while fixing DNS records.

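Another issue with name-based virtual hosting has traditionally been the inability
to host multiple secure websites running Secure Sockets Layer (SSL). Because the
SSL handshake takes place before the expected hostname is sent to the server in the
HTTP request, the server doesn't know which certificate to present when the
connection is made, unless the client supplies the hostname during the handshake
through the Server Name Indication (SNI) TLS extension.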


The hosts file is one of several system facilities to assist in addressing network

nodes in a computer network. It is a common part of an operating system's Internet

Protocol (IP) implementation, and serves the function of translating human-friendly

hostnames into numeric protocol addresses, called IP addresses, that identify and

locate a host in an IP network.

In some operating systems, the hosts file content is used preferentially over other

methods, such as the Domain Name System (DNS), but many systems

implement name service switches (e.g., nsswitch.conf) to provide customization.

Unlike the DNS, the hosts file is under the direct control of the local computer's

administrator.

The hosts file contains lines of text consisting of an IP address in the first text field

followed by one or more hostnames, each field separated by white space (blanks or

tabulation characters). Comment lines may be included; they are indicated by a

hash character (#) in the first position of such lines. Entirely blank lines in the file are

ignored. For example a typical hosts file may contain the following:

127.0.0.1 localhost # designate 127.0.0.1 as the local host; leave this line as is.
# Usually the end of the default hosts file. The rest is examples.
102.54.94.97 rhino.acme.com # source server
38.25.63.10 x.acme.com # x client host
# Entries of this type block specific sites.
127.0.0.1 ibm.com # resolves to localhost, so ibm.com cannot be reached
# This entry blocks banner ads served from ads.acme.com.
127.0.0.1 ads.acme.com

An IP address may have multiple hostnames, and a hostname may be mapped

to several IP addresses.


In IP-based virtual hosting, each virtual host has its own IP address. You

will need a new IP address for each virtual host you want to set up, either from your

existing allocation or by obtaining more from your service provider. Once you have

extra IP addresses, you tell your machine to handle them.

On some operating systems, you can give a single ethernet interface multiple

addresses (typically with an ifconfig alias command). On other systems you will

have to have a different physical interface for each IP address (typically by buying

extra ethernet cards).
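Once the machine answers on the extra addresses, each one can be tied to its own virtual host. The following is a minimal sketch: the addresses (taken from a range reserved for documentation), host names, and directories are all hypothetical placeholders, not values from this course.

```apache
# Sketch only: IP-based virtual hosts, one address per site.
# All addresses, names, and paths below are placeholders.
# Omit the Listen line if the main configuration already has one.
Listen 80

<VirtualHost 192.0.2.10:80>
    ServerName www.example.com
    DocumentRoot /var/www/example-com
</VirtualHost>

<VirtualHost 192.0.2.11:80>
    ServerName www.example.org
    DocumentRoot /var/www/example-org
</VirtualHost>
```

Here Apache selects the virtual host from the address the connection arrived on, so the choice does not depend on the Host header.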

IP addresses are a resource that costs money and are increasingly difficult to get

hold of, so modern browsers can now also use 'non-IP' virtual hosts. This lets you

use the same IP address for multiple host names. When the server receives an

incoming Web connection, it does not know the hostname that was used in the

URL; however, the HTTP/1.1 specification adds a facility where the browser

must tell the server the hostname it is using, in the Host: header. If an older

browser connects to a non-IP virtual host, it will not send the Host: header, so the

server will have to respond with a list of possible virtual hosts. Apache provides

some help for configuring a site for both old and new browsers.

Picking a Hostname and Updating the DNS

Having selected an IP address, the next stage is to update the DNS so that

browsers can convert the hostname into the right address. The DNS is the system

that every machine connected to the Internet uses to find the IP address of host

names. If your hostname is not in the DNS, no-one will be able to connect to your

server (except by the unfriendly IP address).

If the virtual host name you are going to use is under your existing domain, you can

just add the record into your own DNS server. If the virtual host name is in someone

else's domain, you will need to get them to add it to their DNS server files. In some

cases, you will want to use a domain not yet used on the internet, in which case you

will have to register the domain name with a domain registrar (historically the InterNIC) and set up the primary and

secondary DNS servers for it, before adding the entry for your virtual host.


Sometimes there is a need to have more than one IP address assigned to a PC
or a server. If you have multiple services running on a server, such as a webserver
and a mail server, and want to assign each of these services a separate IP address,
multiple IP addresses will help. This does not necessarily require more than one
Network Interface Card (NIC) on the server.

You can add multiple IP Addresses to a single interface card. However, these IP

Addresses should be in the same logical network although in different subnets.

The following procedure shows how to add multiple IP addresses on Windows:

1. Open Control Panel and Network Connections.

2. Right-click the LAN connection and select properties.

3. Select Internet Protocol (TCP/IP) and click properties.

4. Select "Use the following IP Address" and enter the primary IP Address, Subnet

mask and Gateway and the primary and secondary DNS Servers.

5. Click Advanced then Add under IP Addresses.

6. Add the IP Address and Subnet mask. Repeat the procedure if there are

additional IP Addresses to be added.

7. Click Add under "Default Gateways" and add the gateway addresses.

NOTE: The system will always use the first available gateway and will use the
additional gateways only if the primary gateway is not reachable.

8. Click OK three times to save the changes.

9. To confirm that the multiple IP addresses are set, open a command prompt (Start,
Run, then type CMD) and run the ipconfig command. You should see multiple IP
addresses on one Network Interface Card.


In addition to using names and IP addresses, Apache can listen for traffic on other

TCP ports and serve up content to requests that come in to that port. Port-based

virtual servers are often used in conjunction with load-balancing equipment. The

load balancers are configured with the DNS name and the TCP port that serves the

content. This TCP port matches the Apache virtual server configuration. The site is

often deployed on multiple servers to provide for redundancy and high availability,

hence the load-balancing equipment.

Traditionally, you could not use name-based virtual hosts with SSL, because the SSL
handshake occurs before the HTTP request that identifies the appropriate name-based
virtual host. Unless both the server and its clients support the Server Name
Indication (SNI) TLS extension, name-based virtual hosts will only work with your
non-secure Web server.
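Where the server build and the connecting clients support Server Name Indication (SNI), the client names the host during the handshake, so each name-based virtual host can present its own certificate. The following is a hedged sketch, not a tested configuration: it assumes mod_ssl is loaded and SNI-capable, and the host names, document roots, and certificate paths are hypothetical.

```apache
# Sketch only: name-based SSL virtual hosts, assuming SNI support.
# Host names, document roots, and certificate paths are placeholders.
# Omit the Listen line if the main configuration already has one.
# On Apache versions before 2.3.11, also add the line:
#   NameVirtualHost *:443
Listen 443

<VirtualHost *:443>
    ServerName www.site1.com
    DocumentRoot /var/www/site1
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/site1.crt
    SSLCertificateKeyFile /etc/ssl/private/site1.key
</VirtualHost>

<VirtualHost *:443>
    ServerName www.site2.com
    DocumentRoot /var/www/site2
    SSLEngine on
    SSLCertificateFile /etc/ssl/certs/site2.crt
    SSLCertificateKeyFile /etc/ssl/private/site2.key
</VirtualHost>
```

A client that does not send SNI is served by the first virtual host listed for the address and will see that host's certificate, so the ordering of the blocks still matters.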


Port-based virtual hosting follows on from IP-based hosting. The main advantage of

this technique is that it makes it possible for a webmaster to test a lot of sites using

only one IP address/hostname or, in a pinch, host a large number of sites without

using name-based hosts and without using lots of IP numbers. Unfortunately, most

ordinary users don't like their web server having a funny port number, but this can

also be very useful for testing or staging sites.

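User webuser
Group webgroup

# Accept connections on both ports.
Listen 80
Listen 8080

# Port 80: the customers' site.
<VirtualHost 192.168.123.2:80>
    ServerName www.butterthlies.com
    ServerAdmin [email protected]
    DocumentRoot /usr/www/site.virtual/htdocs/customers
</VirtualHost>

# Port 8080: the salespeople's site.
# The second host name is given as a ServerAlias, because a repeated
# ServerName would simply replace the first one.
<VirtualHost 192.168.123.2:8080>
    ServerName sales-IP.butterthlies.com
    ServerAlias sales.butterthlies.com
    ServerAdmin [email protected]
    DocumentRoot /usr/www/site.virtual/htdocs/salesmen
</VirtualHost>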

The Listen directives tell Apache to watch ports 80 and 8080. If you set Apache

going and access http://www.butterthlies.com, you arrive on port 80, the default, and

see the customers' site; if you access http://www.butterthlies.com:8080, you get the

salespeople's site. If you forget the port and go to http://sales.butterthlies.com, you

arrive on the customers' site, because the two share an IP address in our dummied

DNS.
