mod_perl
High speed dynamic content
Definitions
Apache – OpenSource httpd server
Perl – OpenSource interpreted programming language
mod_perl – OpenSource interface between Apache and Perl
Scenario
Unix hosted web-site using CGI Perl scripts to create web pages with dynamic content.
As the site becomes larger and more popular, the speed slows
The web servers are not resource constrained
How can we speed the delivery of content?
Topics covered
Why use mod_perl?
How mod_perl works
Installation highlights
Commonly encountered problems when porting from CGI to mod_perl
Take home messages
use strict;
Use the PerlTaintCheck directive
mod_perl enabled scripts are fast
Possibly faster than C code…
Effective use of mod_perl is not trivial
Why use mod_perl?
CGI scripts written in Perl and served from Apache can be slow
Generally the more dynamic the content, the slower the page return
As sites grow, their content usually becomes more dynamic
Sites ported to mod_perl show request return rates 200% to 2000% higher!!!
Evolution of a web-site
Why is CGI slow?
For each CGI script called:
Apache must load the Perl interpreter
Perl must load the CGI script including all used modules
The CGI script must initialize itself on each load
Database initialization is a common slow-down
Dynamic content options
[Chart: dynamic content options (CGI, FastCGI, Server API, Server Side Includes, DHTML, Client-side Java, Embedded Interpreter, Integrated System) rated for Portability, Performance, Simplicity and Power on a scale of 0 to 14. Taken from "Writing Apache Modules with Perl and C".]
The embedded interpreter
There are increasingly sophisticated solutions:
Only load the Perl interpreter once (PerlRun)
Only load the CGI scripts and modules once (PerlRegistry)
Only do initializations once (Code re-factoring)
The mod_perl version
Each Apache instance loads the Perl interpreter when it initializes (PerlRun)
mod_perl loads CGI scripts the first time they are executed (PerlRegistry)
We can move CGI initialization code into a separate script that is executed when each Apache instance starts (Code re-factoring)
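The "only do initializations once" idea can be sketched in plain Perl. This is a minimal illustration, not mod_perl code: expensive_init and get_resource are hypothetical names, and the hash returned by expensive_init stands in for something slow such as a database connection.

```perl
use strict;
use warnings;

my $init_count = 0;   # counts how often the expensive setup really runs
my $resource;         # cached result, kept for the life of the process

# Hypothetical stand-in for slow setup work such as opening a database connection.
sub expensive_init {
    $init_count++;
    return { connected_at => time };
}

# Called on every request; only the first call pays the setup cost.
sub get_resource {
    $resource ||= expensive_init();
    return $resource;
}

get_resource() for 1 .. 5;    # simulate five requests served by one process
print "requests: 5, initializations: $init_count\n";
```

Under plain CGI every request starts a new process, so the cache is always empty; in a long-lived Apache process it survives between requests.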
How does mod_perl work?
It’s invoked by Apache so… How does Apache work?
It’s running Perl code so… How does Perl work?
The mod_perl handler
How does Perl work?
Interpreted language
To run a Perl program:
The Perl interpreter is loaded
The Program is compiled by the Perl interpreter
The Program is executed
Perl’s @INC array
Analogous to the LD_LIBRARY_PATH environment variable
This array contains the paths that Perl will search when loading libraries or modules
With mod_perl this is set during Apache’s initialization phase and cannot be changed later
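As a small illustration in plain Perl, the lib pragma prepends a directory to @INC at compile time. The path /my/perl/lib is only the example path used in this deck's perl.conf, and it does not need to exist for the sketch to run.

```perl
use strict;
use warnings;

# 'use lib' runs at compile time and puts the directory at the front of @INC.
use lib '/my/perl/lib';

# Perl searches these directories, in order, when it meets 'use' or 'require'.
print "first search path: $INC[0]\n";
print "total search paths: ", scalar(@INC), "\n";
```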
Perl’s %INC hash
After Perl has loaded a file with require or use, it is added to %INC: the key is the file name as requested (for example CGI.pm) and the value is the path of the file that was actually loaded
Before loading a program, Perl searches %INC for a match. If it finds one, it uses the %INC entry and does not re-load the program
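The %INC lookup can be observed in plain Perl. This sketch uses the core module File::Basename only as a convenient example; any module would do.

```perl
use strict;
use warnings;

# First require: the file is located via @INC, compiled, and recorded in %INC.
require File::Basename;
my $loaded_from = $INC{'File/Basename.pm'};

# Second require: Perl finds the key in %INC and does not load the file again.
require File::Basename;

print "File/Basename.pm was loaded from $loaded_from\n";
```

This is also why a changed module is not picked up by a running server until something removes its %INC entry or the server restarts.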
How does Apache work?
Apache serves documents requested via HTTP
A sample request:
The Apache request cycle
The mod_perl handlers
You can interact with the Apache request cycle at any point with a handler
A handler is a program that is defined for a Directory, Location, Files or FilesMatch directive in Apache’s configuration files
The Apache handler API
Apache calls the handler, passes information about the current transaction and server state and expects an integer status code in return
If the handler returns an error, Apache short-circuits to the logging phase
The PerlHandler: part 1
Called prior to serving the requested document
Two default handlers:
PerlRun
PerlRegistry
The PerlHandler: part 2
Unless you are using a custom handler, your CGI programs are executed as subroutines inside the handler subroutine (nested subroutines)
Since Apache never exits, the CGI programs are never reaped by Perl’s garbage collection
A handler subroutine
sub handler {
my $request = shift(@_);
my $status = &do_something($request);
return $status;
}
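To make the "integer status code" contract concrete, here is a sketch that runs outside Apache. Everything beyond the handler shape above is an assumption for illustration: MockRequest is a hypothetical stand-in for the request object, the OK and NOT_FOUND constants are defined locally (under mod_perl they would come from Apache::Constants), and the URI test in do_something is invented.

```perl
use strict;
use warnings;

# Local stand-ins for constants that Apache::Constants exports under mod_perl.
use constant OK        => 0;
use constant NOT_FOUND => 404;

# Hypothetical mock of the request object, exposing only a uri() method.
package MockRequest;
sub new { my ($class, $uri) = @_; return bless { uri => $uri }, $class }
sub uri { return $_[0]{uri} }

package main;

# Hypothetical helper: choose a status from the requested URI.
sub do_something {
    my ($request) = @_;
    return $request->uri =~ m{^/cgi-bin/} ? OK : NOT_FOUND;
}

sub handler {
    my $request = shift;
    my $status  = do_something($request);
    return $status;   # the caller expects an integer status code
}

print handler(MockRequest->new('/cgi-bin/hello.pl')), "\n";
print handler(MockRequest->new('/missing')), "\n";
```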
Installation highlights
Install Apache and mod_perl together
Configure Apache to use mod_perl
Port your CGI scripts
Installation: part 1
(The following is for Unix; Win32 installs are more difficult)
Download Apache http://www.apache.org
current version is 1.3.12
Download mod_perl http://search.cpan.org
current version is 1.24
Check perl version (mod_perl 1.24 or earlier works best with Perl 5.00503—not 5.6.0)
Installation: part 2
By default building mod_perl will also build Apache
>perl Makefile.PL APACHE_SRC=../apache.1.3.12 \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1
>make;make test;make install
>cd ../apache.1.3.12; make install
Configuration
Configure Apache to use mod_perl:
modify httpd.conf
<IfModule mod_perl.c>
Include conf/perl.conf
</IfModule>
perl.conf: part 1
PerlSetEnv PERL5LIB "/my/perl/lib"
PerlRequire conf/startup.pl
Alias /cgi-bin "/my/cgi/dir"
PerlTaintCheck On
perl.conf: part 2
<Location /cgi-bin>
SetHandler perl-script
PerlHandler Apache::Registry
Options ExecCGI
</Location>
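A script placed under /my/cgi/dir might look like the sketch below. It is an assumption-laden illustration: build_page is a hypothetical helper, and the script relies on the MOD_PERL environment variable, which mod_perl sets, to report how it is being run. The same file also works as a plain CGI script.

```perl
use strict;
use warnings;

# Build the whole response as a string so the logic is easy to check in isolation.
sub build_page {
    my ($env) = @_;
    my $mode = $env->{MOD_PERL} ? 'mod_perl' : 'plain CGI';
    return "Content-type: text/plain\n\n" . "Hello from $mode\n";
}

print build_page(\%ENV);
```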
perl.conf: part 3
A good initial check of mod_perl
<Location /mod_perl-status>
SetHandler perl-script
PerlHandler Apache::Status
</Location>
startup.pl
Loads commonly used modules
Initializes database connections
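# startup.pl is loaded once per Apache instance via PerlRequire
use lib qw( /home/matt/perl/Modules
            /opt/apps/lib );
use DBI ();
use vars qw( @EXPORT_OK $dbh $db $user $passwd );  # $db, $user, $passwd are set elsewhere
@EXPORT_OK = qw( $dbh );
$dbh = DBI->connect($db, $user, $passwd);  # package variable; a 'my' variable could not be exported
1;  # a file loaded with PerlRequire must return a true value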
Porting CGI scripts to mod_perl
‘use strict;’
During porting, ‘use diagnostics;’ and check the error.log
Run the Apache server in single-process mode (httpd -X)
Add the following to perl.conf
PerlInitHandler Apache::StatINC
Common porting problems: part 1
As someone pointed out during the presentation, all of these problems could be solved by using re-entrant coding techniques
If you used them when you wrote the CGI scripts, porting will be easy
That being said, most CGI scripts are not written by people familiar with such techniques…
Common porting problems
Run-time attempts to add paths to @INC
Implicit variable passing
Assigning anonymous sub to $SIG{__DIE__}
Compiled regular expressions
No closure of file or dbm handles
Adding paths to @INC
This will not work because @INC is fixed after the initialization of Apache
Use startup.pl to add paths
BEGIN blocks only get executed at compile time
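The compile-time nature of BEGIN can be shown in plain Perl. In this sketch script_body is a hypothetical stand-in for a script that the server compiles once and then calls for every request.

```perl
use strict;
use warnings;

our $compiled;        # no initializer: an assignment here would run after BEGIN
my $runs = 0;

sub script_body {
    BEGIN { $compiled++ }   # executed once, while this file is being compiled
    $runs++;                # executed on every call
    return;
}

script_body() for 1 .. 3;   # simulate three requests against the cached code
print "BEGIN ran $compiled time(s), body ran $runs time(s)\n";
```

Code in a BEGIN block that adjusts @INC therefore runs only when the script is first compiled, not on later requests.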
Implicit variable passing
The value of the variable in subroutines will not be updated after the first call
Perl considers the two variables to be separate
Solutions:
Fully qualify the variable in the subroutine
Pass by value or reference
Make the variable global
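The effect can be reproduced in plain Perl by nesting a named subroutine inside another one, which is roughly the situation a script ends up in when it is wrapped inside a handler subroutine. The names wrapper_broken, greet_broken, wrapper_fixed and greet_fixed are hypothetical; the fix shown is the "pass by value" option.

```perl
use strict;
use warnings;
no warnings 'closure';   # silence "Variable will not stay shared" for this demo

# Broken: the named inner sub is bound to the first instance of $name only.
sub wrapper_broken {
    my $name = shift;
    sub greet_broken { return "hello $name" }
    return greet_broken();
}

# Fixed: the value is passed explicitly on every call.
sub greet_fixed   { my ($name) = @_; return "hello $name" }
sub wrapper_fixed { my $name = shift; return greet_fixed($name) }

my @broken = map { wrapper_broken($_) } qw(alice bob);
my @fixed  = map { wrapper_fixed($_) }  qw(alice bob);

print "broken: $broken[0] / $broken[1]\n";   # the second call still sees alice
print "fixed:  $fixed[0] / $fixed[1]\n";
```

Leaving warnings enabled during porting is useful here: Perl reports this situation at compile time.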
$SIG{__DIE__} problems
$SIG{__DIE__} is global
Your anonymous subroutine will be called by any ‘die’ statement
Solutions:
Write your own exception method
Error.pm (try, throw, catch)
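One possible shape for a home-grown exception method, using only core Perl (eval and die with an object) rather than Error.pm. MyError and risky are hypothetical names; the local assignment keeps any globally installed $SIG{__DIE__} handler out of the protected block.

```perl
use strict;
use warnings;

package MyError;   # hypothetical exception class
sub new     { my ($class, %args) = @_; return bless {%args}, $class }
sub throw   { my $class = shift; die $class->new(@_) }
sub message { return $_[0]{message} }

package main;

# Hypothetical operation that fails by throwing an exception object.
sub risky { MyError->throw(message => 'database unavailable') }

my $caught;
{
    local $SIG{__DIE__} = 'DEFAULT';   # restored automatically at the end of the block
    eval { risky(); 1 } or do {
        my $err = $@;
        $caught = (ref($err) && $err->isa('MyError'))
                ? $err->message
                : "unexpected: $err";
    };
}
print "caught: $caught\n";
```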
Compiled regular expressions
Only compiled for the first pattern used
Solution:
Add an eval block
eval q{
    foreach (@list) {
        &mysub($_) if ( /$pattern/o );
    }
};
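The stale-pattern effect is easy to reproduce in plain Perl. In this sketch count_with_o and count_with_qr are hypothetical helpers; the second one uses qr//, a different technique from the eval block above, which compiles the pattern once per call instead of once per process.

```perl
use strict;
use warnings;

# /o freezes the pattern the first time this match is executed.
sub count_with_o {
    my ($pattern, @list) = @_;
    my $n = 0;
    for (@list) { $n++ if /$pattern/o }
    return $n;
}

# qr// builds a fresh compiled pattern for each call.
sub count_with_qr {
    my ($pattern, @list) = @_;
    my $re = qr/$pattern/;
    my $n  = 0;
    for (@list) { $n++ if $_ =~ $re }
    return $n;
}

my @words = qw(apple banana cherry);
my $first = count_with_o('an', @words);    # pattern 'an' is compiled and frozen
my $stale = count_with_o('e',  @words);    # still matches against 'an'
my $fresh = count_with_qr('e', @words);    # matches against 'e'

print "first=$first stale=$stale fresh=$fresh\n";
```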
No closure of file or dbm handles
Common to use ‘close on exit’ fh management
Now the script does not exit….
Solution:
Close your file and dbm handles
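A sketch of explicit handle management in plain Perl. append_line is a hypothetical helper and the temporary file from the core File::Temp module exists only to make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file used only for this demonstration; removed when the program ends.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close $tmp_fh or die "cannot close temp file: $!";

# Open, write, and close within one call; nothing is left for process exit to clean up.
sub append_line {
    my ($file, $line) = @_;
    open my $fh, '>>', $file or die "cannot open $file: $!";
    print {$fh} "$line\n";
    close $fh or die "cannot close $file: $!";
    return;
}

append_line($path, "request $_") for 1 .. 3;   # simulate three requests

open my $in, '<', $path or die "cannot read $path: $!";
my @lines = <$in>;
close $in or die "cannot close $path: $!";

print scalar(@lines), " lines written\n";
```

Lexical handles such as $fh are also closed when they go out of scope, which gives a second line of defence in a process that never exits.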
Conclusions
mod_perl will speed up CGI scripts by 200% to 2000%
use strict;
Use the PerlTaintCheck directive
Effective use of mod_perl is not trivial, plan accordingly
Further Reading
Writing Apache Modules with Perl and C, Lincoln Stein and Doug MacEachern
mod_perl Guide, Stas Bekman, http://perl.apache.org/guide
FAQ, Frank Cringle, http://perl.apache.org/faq
Apache %INC hash detail
The handler subroutine changes the name of the nested subroutines to the fully qualified filename plus the prefix Apache::ROOT, for example Apache::ROOT::My::Dir::script_2epl
This reduces the likelihood of namespace collisions, but frustrates tampering
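The naming scheme can be approximated in a few lines of Perl. This is a simplified sketch and not the actual Apache::Registry code: registry_package_name is a hypothetical function, and the rule it applies (hex-escape characters that are not valid in identifiers, turn slashes into ::) is inferred from the example name above.

```perl
use strict;
use warnings;

# Hypothetical approximation of how a script path becomes a package name.
sub registry_package_name {
    my ($path) = @_;
    # Escape characters that cannot appear in a package name, e.g. '.' becomes '_2e'.
    (my $name = $path) =~ s/([^A-Za-z0-9_\/])/sprintf('_%02x', ord $1)/ge;
    # Each run of slashes becomes a package separator.
    $name =~ s{/+}{::}g;
    return 'Apache::ROOT' . $name;
}

print registry_package_name('/My/Dir/script.pl'), "\n";
```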