8/14/2019 Introduction to mod_rewrite
Introduction to mod_rewrite
Rich Bowen, Web Guy, Asbury [email protected]
http://people.apache.org/~rbowen/
ApacheCon US, 2006

Outline
Regex basics
RewriteRule
RewriteCond
RewriteMap
The evils of .htaccess files
Assorted odds and ends

mod_rewrite is not magic
Fear, more than complexity, makes mod_rewrite difficult

Although, it is complex
``The great thing about mod_rewrite is it gives you all the configurability and flexibility of Sendmail. The downside to mod_rewrite is that it gives you all the configurability and flexibility of Sendmail.''
-- Brian Behlendorf

And let's not forget voodoo!
``Despite the tons of examples and docs, mod_rewrite is voodoo. Damned cool voodoo, but still voodoo.''
-- Brian Moore

Line noise
"Regular expressions are just line noise. I hate them!"
(Heard 20 times per day on IRC)
When you hear it often enough, you start to believe it

Now that that's out of the way
Regular expressions are not magic
They are an algebraic expression of text patterns
Once you get over the mysticism, it can still be hard, but it's no longer mysterious

Vocabulary
We're going to start with a very small vocabulary, and work up from there
Most of the time, this vocabulary is all that you'll need

.
. matches any character
a.b matches acb, axb, a@b, and so on
It also matches Decalb and Marbelized

+
+ means that something needs to appear one or more times
a+ matches a, aa, aaa, and Stellaaaaaaaaaa!
The thing that is repeating isn't necessarily just a single character

*
* means that the previous thingy needs to match zero or more times
This is subtly different from + and some folks miss the distinction
giraf*e matches giraffe and girafe
It also matches girae

?
? means that the previous thingy needs to match zero or one times
In other words, it makes it optional
colou?r matches color and colour

^
^ is called an anchor
It requires that the string start with the pattern
^A matches ANDY but it does not match CANDY
Pronounced "hat" or "caret" or "circumflex" or "pointy-up thingy"

$
$ is the other anchor
It requires that the string end with the pattern
a$ matches canada but not afghanistan

( )
( ) allows you to group several characters into one thingy
This allows you to apply repetition characters (*, +, and ?) to a larger group of characters.
(ab)+ matches ababababababab

( ), continued
( ) allows you to capture a match so that you can use it later.
The value of the matched bit is stored in a variable called a backreference
It might be called $1 or %1 depending on the context
The second match is called $2 (or %2) and so on
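
To make the two backreference families concrete, here is a minimal sketch in Apache configuration syntax. It looks ahead to RewriteCond and RewriteRule, which are covered later in the talk. The hostname pattern and the /users/ and /home/ paths are hypothetical, and server (httpd.conf) context is assumed.

```apache
# Sketch only: example.com, /users/ and /home/ are made-up names.
RewriteEngine On
# %1 is the first ( ) group captured by the RewriteCond pattern.
RewriteCond %{HTTP_HOST} ^([^.]+)\.example\.com$ [NC]
# $1 is the first ( ) group captured by the RewriteRule pattern.
RewriteRule ^/users/(.*)$ /home/%1/$1 [L]
```

The only difference between the two is where the capture happened: % refers to the most recently matched condition, $ to the rule itself.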

[ ]
[ ] defines a character class
[abc] matches a or b or c
c[uoa]t matches cut, cot, or cat
It also matches cote
It does not match coat

NOT
In mod_rewrite regular expressions, a leading ! negates the whole match
In a character class, ^ negates the character class
[^ab] matches any character except for a or b.
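
The two kinds of negation can be sketched side by side as follows. The paths, the script name, and the variable name "dynamic" are hypothetical, and server context is assumed.

```apache
# Sketch only: /archive/, archive.php and /static/ are made-up names.
RewriteEngine On
# [^/]+ is a negated character class: one or more characters that are not a slash.
RewriteRule ^/archive/([^/]+)/?$ /archive.php?year=$1 [PT]
# A leading ! negates the whole pattern: this rule applies to every URL
# that does NOT start with /static/. "-" means "leave the URL unchanged".
RewriteRule !^/static/ - [E=dynamic:1]
```

Note that a negated pattern cannot supply $1, $2 and so on, because there is no match to capture from.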

So, what does this have to do with Apache?
mod_rewrite lets you match URLs (or other things) and transform the target of the URL based on that match.
    RewriteEngine On
    # Burninate ColdFusion!
    RewriteRule (.*)\.cfm$ $1.php [PT]
    # And there was much rejoicing. Yaaaay.

RewriteEngine
RewriteEngine On enables the mod_rewrite rewriting engine
No rewrite rules will be performed unless this is enabled in the active scope
It never hurts to say it again

RewriteLog
    RewriteLog /www/logs/rewrite_log
    RewriteLogLevel 9
You should turn on the RewriteLog before you do any troubleshooting.

RewriteRule
    RewriteRule pattern target [flags]
The pattern part is the regular expression that you want to look for in the URL
If they try to go HERE send them HERE instead.
The behavior can be further modified by the use of one or more flags
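
As a rough illustration of the three parts of a RewriteRule, the sketch below labels each one. The URLs /about and /about.html are hypothetical, and server context is assumed.

```apache
# Sketch only: /about and /about.html are made-up URLs.
RewriteEngine On
#           pattern    target        flags
RewriteRule ^/about$   /about.html   [PT]
```

A request for /about is served from /about.html, and any request that does not match the pattern passes through untouched.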

Example 1
SEO - "Search Engine Optimization"
Frequently based on misconceptions about how search engines work
Typical strategy is to make "clean URLs" - avoid ?argument=value&xyz=123

URL beautification
A URL looks like:
We would prefer that it looked like
    http://example.com/book/bowen/apache
It's easier to type, and easier to remember

Example 1, cont'd
    RewriteRule ^/book/(.*)/(.*) \
        /cgi-bin/book.cgi?topic=$1&author=$2 [PT]
User does not notice that the transformation has been made
Used $1 and $2 to capture what was requested
Slight oversimplification. Should probably use ([^/]+) instead.
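
The suggestion to use ([^/]+) could look something like this. It is a sketch that keeps the same book.cgi target and parameter names as the rule above and assumes server context. The optional trailing slash and the $ anchor are additions of mine, not part of the original example.

```apache
RewriteEngine On
# ([^/]+) matches one path segment and cannot run across a slash,
# so /book/a/b/c no longer matches the way it would with (.*).
RewriteRule ^/book/([^/]+)/([^/]+)/?$ \
    /cgi-bin/book.cgi?topic=$1&author=$2 [PT]
```

Because (.*) is greedy, the original rule would put "a/b" in $1 for /book/a/b/c. The stricter version simply does not match such a request.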

Flags
Flags can modify the behavior of a RewriteRule
I used a flag in the example, and didn't tell you what it meant
So, here are the flags

By the way ...
Default is to treat the rewrite target as a file path
If the target starts in http:// or https:// then it is treated as a URL, and a [R] is assumed (Redirect)
In a .htaccess file, or in <Directory> scope, the file path is assumed to be relative to that scope

RewriteRule flags
[Flag] appears at end of RewriteRule
More than one flag separated by commas
I recommend using flags even when the default is what you want - it makes it easier to read later
Each flag has a longer form, which you can use for greater readability.
There's *lots* of flags

Chain
[C] or [Chain]
Rules are considered as a whole. If one fails, the entire chain is abandoned
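
A chained pair might look like the following sketch. The /legacy/ and /archive/ paths are hypothetical and server context is assumed.

```apache
# Sketch only: /legacy/ and /archive/ are made-up paths.
RewriteEngine On
# If this rule does not match, the chained rule below is skipped as well.
RewriteRule ^/legacy/(.*)$ /archive/$1 [C]
# Runs only when the rule above matched, and sees the URL it produced.
RewriteRule ^/archive/(.*)\.htm$ /archive/$1.html [PT]
```

Without [C], the second rule would also fire for requests that arrived at /archive/ directly.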

Cookie
[CO=NAME:Value:Domain[:lifetime[:path]]]
Long form [cookie=...]
Sets a cookie
    RewriteRule ^/index.html - [CO=frontdoor:yes:.example.com]
In this case, the default values for path (/) and lifetime (session) are assumed.
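
For comparison, here is a sketch of the same rule with the optional lifetime and path fields filled in. The values 1440 and / are illustrative choices of mine; the lifetime field is given in minutes.

```apache
RewriteEngine On
# name=frontdoor, value=yes, domain=.example.com,
# lifetime=1440 minutes (24 hours), path=/
RewriteRule ^/index\.html$ - [CO=frontdoor:yes:.example.com:1440:/]
```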

Env
[E=var:val]
Long form [env=...]
Sets environment variable
Note that most of the time, SetEnvIf works just fine
    RewriteRule \.jpg$ - [env=dontlog:1]

Forbidden
[F] or [Forbidden] forces a 403 Forbidden response
Consider mod_security instead for pattern-based URL blocking
    RewriteEngine On
    RewriteRule (cmd|root)\.exe - [F]
You could use this in conjunction with [E] to avoid logging that stuff
    RewriteRule (cmd|root)\.exe - [F,E=dontlog:1]
    CustomLog /var/log/apache/access_log combined \
        env=!dontlog

Handler
[H=application/x-httpd-php]
Forces the use of a particular handler to handle the resulting URL
Can often be replaced with using [PT] but is quite a bit faster
Available in Apache 2.2

Last
[L] indicates that you've reached the end of the current ruleset
Any rules following this will be considered as a completely new ruleset
It's a good idea to use it, even when it would otherwise be default behavior. It helps make rulesets more readable.
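
One common use of [L] is to exempt some URLs from the rules that follow. A minimal sketch, assuming server context, a hypothetical /images/ directory, and a hypothetical index.php front controller:

```apache
# Sketch only: /images/ and index.php are made-up names.
RewriteEngine On
# "-" leaves the URL unchanged; [L] stops processing for image requests.
RewriteRule ^/images/ - [L]
# Everything else is handed to the front controller.
RewriteRule ^/(.*)$ /index.php?page=$1 [PT,L]
```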

Next
The [N] or [Next] flag is a good way to get yourself into an infinite loop
It tells mod_rewrite to run the entire ruleset again from the beginning
Can be useful for doing "global search and replace" stuff
I find RewriteMap much more useful in those situations

NoCase
[NC] or [nocase] makes the RewriteRule case insensitive
Regular expressions are case-sensitive by default
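
As a small sketch of [NC], with hypothetical /downloads/ and /files/ paths and server context assumed:

```apache
# Sketch only: /downloads/ and /files/ are made-up paths.
RewriteEngine On
# With [NC], /downloads/tool.zip, /Downloads/Tool.ZIP and /DOWNLOADS/tool.Zip all match.
RewriteRule ^/downloads/(.*)\.zip$ /files/$1.zip [NC,PT]
```

The captured $1 keeps whatever case the client sent; [NC] affects only the matching.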

NoEscape
[NE] or [noescape] disables the default behavior of escaping (url-encoding) special characters like #, ?, and so on
Useful for redirecting to a page #anchor
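
The #anchor case can be sketched like this. The /faq/install and /docs/faq.html URLs are hypothetical and server context is assumed.

```apache
# Sketch only: the URLs are made-up.
RewriteEngine On
# Without [NE], the # would be escaped to %23 and the browser
# would ask for a file literally named "faq.html#install".
RewriteRule ^/faq/install$ /docs/faq.html#install [R,NE]
```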

NoSubreq
[NS] or [nosubreq] ensures that the rule won't run on subrequests
Subrequests are things like SSI evaluations
Image and css requests are NOT subrequests

Proxy
[P] rules are served through a proxy subrequest
mod_proxy must be installed for this flag to work
    RewriteEngine On
    RewriteRule (.*)\.(jpg|gif|png) \
        http://images.example.com/

Passthrough
[PT] or [passthrough]
Hands it back to the URL mapping phase
Treat this as though this was the original request

QSAppend
[QSA] or [qsappend] appends to the query string, rather than replacing it.
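
The effect of [QSA] is easiest to see when the target already carries a query string of its own. A sketch, with a hypothetical /pages/ prefix and page.php script, assuming server context:

```apache
# Sketch only: /pages/ and page.php are made-up names.
RewriteEngine On
# /pages/about?lang=en is rewritten to /page.php?name=about&lang=en
# Without [QSA], the original "lang=en" would be discarded.
RewriteRule ^/pages/(.+)$ /page.php?name=$1 [QSA,PT]
```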

Redirect
[R] or [redirect] forces a 302 Redirect
Note that in this case, the user will see the new URL in their browser
This is the default behavior when the target starts with http:// or https://
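
The flag also accepts an explicit status code, which is useful when the move is permanent. A sketch with hypothetical /old/ and /new/ paths, assuming server context:

```apache
# Sketch only: the paths and hostname are made-up.
RewriteEngine On
# R=301 sends "Moved Permanently" instead of the default 302.
RewriteRule ^/old/(.*)$ http://www.example.com/new/$1 [R=301,L]
```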

Skip
[S=n] or [skip=n] skips the next n RewriteRules
This is very weird
I've never used this in the real world. Could be used as a sort of inverse RewriteCond (viz WordPress)
    RewriteCond %{REQUEST_FILENAME} -f
    RewriteRule .* - [S=15]

Type
[T=text/html]
Forces the MIME type on the resulting URL
Used to do this instead of [H] in some contexts
Good to ensure that file-path redirects are handled correctly
    RewriteRule ^(.+\.php)s$ $1 [T=application/x-httpd-php-source]

RewriteCond
Causes a rewrite to be conditional
Can check the value of any variable and make the rewrite conditional on that.
    RewriteCond TestString Pattern [Flags]
The test string can be just about anything
Env vars, headers, or a literal string
Backreferences become %1, %2, etc
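
Here is a sketch of a condition on a request header, with its capture reused in the rule. The user-agent strings and the welcome.html target are hypothetical, and server context is assumed.

```apache
# Sketch only: the agent names and target page are made-up.
RewriteEngine On
# TestString is a header value; the ( ) group is available as %1.
RewriteCond %{HTTP_USER_AGENT} (Googlebot|Slurp) [NC]
# The rule runs only when the condition above matched.
RewriteRule ^/$ /welcome.html?bot=%1 [PT]
```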

Looping
Looping occurs when the target of a rewrite rule matches the pattern
This results in an infinite loop of rewrites
    RewriteCond %{REQUEST_URI} !^/example\.html
    RewriteRule ^/example /example.html [PT]

Conditional rewrites
Rewrites conditional on some arbitrary thingy
Only first Rule is dependent
    RewriteEngine on
    RewriteCond %{TIME_HOUR}%{TIME_MIN} >0700
    RewriteCond %{TIME_HOUR}%{TIME_MIN}

SSL Rewrites
Redirect requests to https:// if the request was for http
(In a .htaccess file)
    RewriteCond %{HTTPS} !=on
    RewriteRule (.*) https://%{HTTP_HOST}/$1 [R]
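
For comparison, the same https redirect placed in httpd.conf (server or virtual-host context) might look like the sketch below. This is an assumption-laden variant of mine rather than the slide's rule: it tests the port instead of %{HTTPS}, uses a permanent redirect, and assumes SSL is served on port 443.

```apache
RewriteEngine On
# Apply only to requests that did not arrive on the SSL port.
RewriteCond %{SERVER_PORT} !^443$
# In server context the pattern sees the leading slash, so it is matched
# outside the group to avoid a double slash in the target.
RewriteRule ^/(.*)$ https://%{HTTP_HOST}/$1 [R=301,L]
```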

RewriteMap
Call an external program, or map file, to perform the rewrite
Useful for very complex rewrites, or perhaps ones that rely on something outside of Apache

RewriteMap - file
File of one-to-one relationships
    RewriteMap docsmap txt:/www/conf/docsmap.txt
    RewriteRule /docs/(.*) ${docsmap:$1} [R,NE]
Where docsmap.txt contains:
... etc
Requests for http://example.com/docs/something now get redirected to the Apache docs site for "something". [NE] makes the #anchor bit work.

Poor-man's load balancing
Random selection of server for "load balancing"
    RewriteMap servers rnd:/www/conf/servers.txt
    RewriteRule (.*) http://${servers:loadbalance}$1 [P,NS]
servers.txt contains:
    loadbalance mars|jupiter|saturn|neptune
Requests are now randomly distributed between the four servers. The NS ensures that the proxied URL doesn't get re-rewritten.

dbm
Convert a one-to-one text mapping to a dbm file
httxt2dbm utility (2.0)
    RewriteMap asbury \
        dbm:/usr/local/apache/conf/aliases.map

RewriteMap - program
Call an external program to do the rewrite
Perl is a common choice here, due to its skill at handling text.
    RewriteMap dash2score \
        prg:/usr/local/apache/conf/dash2score.pl
    RewriteEngine On
    RewriteRule (.*-.*) ${dash2score:$1} [PT]

dash2score.pl
    #!/usr/bin/perl
    $| = 1; # Turn off buffering
    while (<STDIN>) {
        s/-/_/g; # Replace - with _ globally
        print $_;
    }
Turning off buffering is necessary because we need the output immediately for each line we feed it.
Apache starts the script on server startup, and keeps it running for the life of the server process

SQL (in 2.3-HEAD)
Just committed on Monday
Have a SQL statement in the RewriteMap directive which returns the mapping

.htaccess files
.htaccess files are evil
However, a lot of people have no choice
So ...
In .htaccess files, or <Directory> scope, everything is assumed to be relative to that current scope
So, that scope is removed from the RewriteRule
^/index.html in httpd.conf becomes ^index.html in a .htaccess file or <Directory> scope
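
The difference in what the pattern sees can be sketched with the same rule written for both contexts. The welcome.html target is hypothetical, and the .htaccess file is assumed to live in the document root.

```apache
# In httpd.conf (server or virtual-host context):
# the pattern is matched against the full URL path, leading slash included.
RewriteEngine On
RewriteRule ^/index\.html$ /welcome.html [PT]

# In DocumentRoot/.htaccess (a separate file):
# the directory prefix, including the leading slash, has been stripped.
RewriteEngine On
RewriteRule ^index\.html$ welcome.html [L]
```

A pattern beginning with ^/ never matches in a .htaccess file, which is a frequent cause of rules that "do nothing" after being copied from httpd.conf.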

.htaccess files, continued
RewriteLog is particularly useful when trying to get .htaccess file RewriteRules working.
However, you can't turn on RewriteLog in a .htaccess file, and presumably you're using .htaccess files because you don't have access to the main server config.
It's a good idea to set up a test server on your home PC and test there with RewriteLog enabled
The rewrite pattern is relative to the current directory
The rewrite target is also relative to the current directory
In httpd.conf, the rewrite target is assumed to be a file path. In .htaccess files, that file path is relative to the current directory, so it seems to be a URI redirect.

Further resources
http://rewrite.drbacchus.com/
"Definitive Guide to mod_rewrite" by Rich Bowen, from APress
http://httpd.apache.org/docs-2.2/rewrite/

Questions?

Bonus slides - Recipes

Redirect everything to a central handler
    RewriteEngine On
    RewriteCond %{REQUEST_URI} !handler.php
    RewriteRule (.*) /handler.php?$1 [PT,L,NE]
All requests are sent to handler.php
The request is passed as a QUERY_STRING argument to handler.php so that it knows what was requested.

Virtual Hosts
Rewrite a request to a directory based on the requested hostname.
    RewriteEngine On
    RewriteCond %{HTTP_HOST} (.*)\.example\.com [NC]
    RewriteRule (.*) /home/%1/www$1
The hostname ends up in %1
The requested path is in $1 - includes leading slash
Will probably have to do special things for handlers (like .php files)

.phps source handler
Syntax-highlighted code rendering of any .php file
    RewriteRule (.*)\.phps \
        $1.php [H=application/x-httpd-php-source]