mod_perl

High speed dynamic content

Definitions

Apache – OpenSource httpd server

Perl – OpenSource interpreted programming language

mod_perl – OpenSource interface between Apache and Perl

Scenario

Unix hosted web-site using CGI Perl scripts to create web pages with dynamic content.

As the site becomes larger and more popular, the speed slows

The web servers are not resource constrained

How can we speed the delivery of content?

Topics covered

Why use mod_perl?

How mod_perl works

Installation highlights

Commonly encountered problems when porting from CGI to mod_perl

Take home messages

use strict;

Use the PerlTaintCheck directive

mod_perl enabled scripts are fast

Possibly faster than C code…

Effective use of mod_perl is not trivial

Why use mod_perl?

CGI scripts written in Perl and served from Apache can be slow

Generally the more dynamic the content, the slower the page return

As sites grow, their content usually becomes more dynamic

Sites ported to mod_perl show request return rates 200% to 2000% higher!!!

Evolution of a web-site

Why is CGI slow?

For each CGI script called:

Apache must load the Perl interpreter

Perl must load the CGI script, including all used modules

The CGI script must initialize itself on each load

Database initialization is a common slow-down

Dynamic content options

[Chart, taken from "Writing Apache Modules with Perl and C": the options CGI, FastCGI, Server API, Server Side Includes, DHTML, Client-side Java, Embedded Interpreter and Integrated System, each scored for Portability, Performance, Simplicity and Power]

The embedded interpreter

There are increasingly sophisticated solutions:

Only load the Perl interpreter once (PerlRun)

Only load the CGI scripts and modules once (PerlRegistry)

Only do initializations once (Code re-factoring)

The mod_perl version

Each Apache instance loads the Perl interpreter when it initializes (PerlRun)

mod_perl loads CGI scripts the first time they are executed (PerlRegistry)

We can move CGI initialization code into a separate script that is executed when each Apache instance starts (Code re-factoring)
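The "only do initializations once" idea can be sketched in plain Perl. This is a minimal illustration that runs without Apache: expensive_connect and get_handle are hypothetical names, and the hash reference stands in for something like a database handle.

```perl
use strict;
use warnings;

my $connect_count = 0;    # counts how often the expensive step runs

# Hypothetical stand-in for an expensive setup step such as DBI->connect.
sub expensive_connect {
    $connect_count++;
    return { connected_at => time };
}

{
    my $handle;    # survives between calls because the process stays alive

    # Run the expensive step on the first call only, then reuse the result.
    sub get_handle {
        $handle ||= expensive_connect();
        return $handle;
    }
}

get_handle() for 1 .. 3;
print "connections opened: $connect_count\n";    # prints: connections opened: 1
```

Under plain CGI the process exits after every request, so the cached handle would be lost each time; in a long-lived Apache child it is reused.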

How does mod_perl work?

It’s invoked by Apache, so… How does Apache work?

It’s running Perl code, so… How does Perl work?

The mod_perl handler

How does Perl work?

Interpreted language

To run a Perl program:

The Perl interpreter is loaded

The Program is compiled by the Perl interpreter

The Program is executed

Perl’s @INC array

Analogous to the LD_LIBRARY_PATH environment variable

This array contains the paths that Perl will search when loading libraries or modules

With mod_perl this is set during Apache’s initialization phase and cannot be changed later
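A small plain-Perl sketch of how a search path reaches @INC. The directory /opt/apps/lib is the one used in the startup.pl slide and does not need to exist for the sketch to run.

```perl
use strict;
use warnings;

# 'use lib' runs at compile time and puts the directory at the front of @INC.
use lib '/opt/apps/lib';

# Perl searches these directories, in this order, for modules.
print "$_\n" for @INC;
```

Under mod_perl the same 'use lib' line would normally live in startup.pl, so that it runs while Apache is initializing.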

Perl’s %INC hash

After Perl has loaded and compiled a file with require or use, it adds an entry to %INC: the key is the filename as it was requested (for example My/Module.pm) and the value is the full path of the file that was found

Before loading a program, Perl searches %INC for a match. If it finds one, it uses the %INC entry and does not re-load the program
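The %INC lookup can be observed from plain Perl. The sketch below uses File::Basename only because it ships with Perl; any module would do.

```perl
use strict;
use warnings;

require File::Basename;

# The key is the requested filename; the value is where Perl found the file.
my $key = 'File/Basename.pm';
print "$key => $INC{$key}\n";

# A second require finds the %INC entry and does not load the file again.
require File::Basename;
```

This is why a changed module is not picked up by a running mod_perl server until the %INC entry is cleared or the server restarts.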

How does Apache work?

Apache serves documents requested via HTTP

A sample request:

The Apache request cycle

The mod_perl handlers

You can interact with the Apache request cycle at any point with a handler

A handler is a program that is defined for a Directory, Location, Files or FilesMatch directive in Apache’s configuration files

The Apache handler API

Apache calls the handler, passes information about the current transaction and server state and expects an integer status code in return

If the handler returns an error, Apache short-circuits to the logging phase

The PerlHandler: part 1

Called prior to serving the requested document

Two default handlers:

PerlRun

PerlRegistry

The PerlHandler: part 2

Unless you are using a custom handler, your CGI programs are executed as subroutines inside the handler subroutine (nested subroutines)

Since Apache never exits, the CGI programs are never reaped by Perl’s garbage collection

A handler subroutine

sub handler {

my $request = shift(@_);

my $status = &do_something($request);

return $status;

}
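To make the calling convention concrete, here is a sketch that runs without Apache. My::FakeRequest is a hypothetical stand-in for the request object Apache would pass in, and the OK and DECLINED constants are defined locally with the values that Apache::Constants uses in mod_perl 1.x.

```perl
use strict;
use warnings;

# Local stand-ins for Apache::Constants (OK is 0, DECLINED is -1 there),
# defined here so the sketch runs without Apache.
use constant OK       => 0;
use constant DECLINED => -1;

# Hypothetical minimal request object; a real handler receives an
# Apache request object from the server instead.
package My::FakeRequest;

sub new {
    my ($class, %args) = @_;
    return bless { uri => $args{uri}, out => '' }, $class;
}
sub uri   { return $_[0]{uri} }
sub print { my $self = shift; $self->{out} .= join '', @_; return 1 }

package main;

# Inspect the request, produce output, and return an integer status code.
sub handler {
    my $request = shift;
    return DECLINED unless $request->uri =~ m{^/hello};
    $request->print("Hello from a handler\n");
    return OK;
}

my $r = My::FakeRequest->new(uri => '/hello/world');
print "status: ", handler($r), "\n";
```

A real handler would be placed in its own package and named in the Apache configuration with a PerlHandler line.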

Installation highlights

Install Apache and mod_perl together

Configure Apache to use mod_perl

Port your CGI scripts

Installation: part 1

(The following is for Unix, Win32 installs are more difficult)

Download Apache http://www.apache.org

current version is 1.3.12

Download mod_perl http://search.cpan.org

current version is 1.24

Check your Perl version (mod_perl 1.24 or earlier works best with Perl 5.005_03, not 5.6.0)

Installation: part 2

By default building mod_perl will also build Apache

>perl Makefile.PL APACHE_SRC=../apache.1.3.12 \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

>make;make test;make install

>cd ../apache.1.3.12; make install

Configuration

Configure Apache to use mod_perl:

Modify httpd.conf:

<IfModule mod_perl.c>

Include conf/perl.conf

</IfModule>

perl.conf: part 1

PerlSetEnv PERL5LIB "/my/perl/lib"

PerlRequire conf/startup.pl

Alias /cgi-bin "/my/cgi/dir"

PerlTaintCheck On

perl.conf: part 2

<Location /cgi-bin>

SetHandler perl-script

PerlHandler Apache::Registry

Options ExecCGI

</Location>

perl.conf: part 3

A good initial check of mod_perl

<Location /mod_perl-status>

SetHandler perl-script

PerlHandler Apache::Status

</Location>

startup.pl

Loads commonly used modules

Initializes database connections

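use lib qw( /home/matt/perl/Modules
            /opt/apps/lib );

use DBI ();

# $dbh must be a package variable: a 'my' variable cannot be exported
use vars qw( @EXPORT_OK $dbh );

@EXPORT_OK = qw( $dbh );

$dbh = DBI->connect($db, $user, $passwd);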

Porting CGI scripts to mod_perl

‘use strict;’

During porting, add ‘use diagnostics;’ and check the error log

Run the Apache server single threaded (single-process mode, httpd -X)

Add the following to perl.conf:

PerlInitHandler Apache::StatINC

Common porting problems: part 1

As someone pointed out during the presentation, all of these problems could be solved by using re-entrant coding techniques

If you used them when you wrote the CGI scripts, porting will be easy

That being said, most CGI scripts are not written by people familiar with such techniques…

Common porting problems

Run-time attempts to add paths to @INC

Implicit variable passing

Assigning anonymous sub to $SIG{__DIE__}

Compiled regular expressions

No closure of file or dbm handles

Adding paths to @INC

This will not work because @INC is fixed after the initialization of Apache

Use startup.pl to add paths

BEGIN blocks only get executed at compile time

Implicit variable passing

The value of the variable in subroutines will not be updated after the first call

Perl considers the two variables to be separate

Solutions:

Fully qualify the variable in the subroutine

Pass by value or reference

Make the variable global
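The effect can be reproduced in plain Perl by nesting a named subroutine inside another one, which is roughly what happens when a script is wrapped in the handler subroutine. The wrapper and greet names below are hypothetical.

```perl
use strict;
use warnings;
no warnings 'closure';    # silence "Variable will not stay shared" for the demo

# Stand-in for the wrapper that the handler puts around a CGI script.
sub wrapper_broken {
    my $name = shift;
    # A named sub is compiled once and keeps the first $name it saw.
    sub greet_broken { return "Hello, $name" }
    return greet_broken();
}

# Fix: pass the value explicitly instead of relying on the outer variable.
sub greet_fixed { my $name = shift; return "Hello, $name" }

sub wrapper_fixed {
    my $name = shift;
    return greet_fixed($name);
}

print wrapper_broken('alice'), "\n";    # Hello, alice
print wrapper_broken('bob'),   "\n";    # Hello, alice (stale value)
print wrapper_fixed('alice'),  "\n";    # Hello, alice
print wrapper_fixed('bob'),    "\n";    # Hello, bob
```

With warnings left on, Perl reports the problem at compile time as "Variable will not stay shared", which is one reason to watch the error log while porting.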

$SIG{__DIE__} problems

$SIG{__DIE__} is global

Your anonymous subroutine will be called by any ‘die’ statement

Solutions:

Write your own exception method

Error.pm (try, throw, catch)
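One way to keep error handling local is an eval block that switches the global handler off for its own scope. This is a swapped-in illustration in plain Perl, not the Error.pm interface, and risky is a hypothetical function.

```perl
use strict;
use warnings;

my @log;

# A global handler of the kind a ported CGI script might install.
$SIG{__DIE__} = sub { push @log, "global handler saw: $_[0]" };

sub risky { die "database unavailable\n" }

my $ok = eval {
    local $SIG{__DIE__};    # no __DIE__ handler inside this block
    risky();
    1;
};
my $error = $ok ? '' : $@;

print "caught: $error";
print "global handler calls: ", scalar(@log), "\n";
```

Because the handler is localized, the die inside the eval is caught without triggering code that another script sharing the same interpreter installed.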

Compiled regular expressions

Only compiled for the first pattern used

Solution:

Add an eval block

eval q{

foreach (@list) {

&mysub($_) if ( /$pattern/o );

}

}
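The problem, and an alternative to the eval block, can be sketched in plain Perl. The qr// operator used here is a swapped-in technique rather than the fix shown on the slide, and the function names are hypothetical.

```perl
use strict;
use warnings;

# With /o the pattern is interpolated once, on the first call, and then reused.
sub match_once {
    my ($pattern, @list) = @_;
    return grep { /$pattern/o } @list;
}

# qr// builds a fresh compiled pattern on every call.
sub match_qr {
    my ($pattern, @list) = @_;
    my $re = qr/$pattern/;
    return grep { $_ =~ $re } @list;
}

my @words = qw(apple banana cherry);

print join(',', match_once('an',  @words)), "\n";    # banana
print join(',', match_once('err', @words)), "\n";    # banana (still matching 'an')
print join(',', match_qr('err',   @words)), "\n";    # cherry
```

In a CGI script the stale pattern never shows up, because each request starts a new process; under mod_perl the first pattern persists for the life of the Apache child.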

No closure of file or dbm handles

Common to use ‘close on exit’ fh management

Now the script does not exit….

Solution:

Close your file and dbm handles
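A minimal sketch of explicit handle management in plain Perl. It assumes a Perl recent enough for lexical filehandles and three-argument open, and first_line is a hypothetical helper.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a small scratch file to read from; it is removed at program exit.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
print {$tmp_fh} "line one\n";
close($tmp_fh) or die "close failed: $!";

# Open, use, and close the handle inside the same subroutine, so nothing
# stays open when the script is kept alive between requests.
sub first_line {
    my $file = shift;
    open(my $fh, '<', $file) or die "cannot open $file: $!";
    my $line = <$fh>;
    close($fh) or die "cannot close $file: $!";
    return $line;
}

print first_line($path);
```

Checking the return value of close also surfaces write errors that "close on exit" would hide.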

Conclusions

mod_perl will speed up CGI scripts by 200% to 2000%

use strict;

Use the PerlTaintCheck directive

Effective use of mod_perl is not trivial, plan accordingly

Further Reading

Writing Apache Modules with Perl and C, Lincoln Stein and Doug MacEachern

mod_perl Guide, Stas Bekman, http://perl.apache.org/guide

FAQ, Frank Cringle, http://perl.apache.org/faq

Apache %INC hash detail

The handler subroutine changes the name of the nested subroutines to the fully qualified filename plus the prefix Apache::ROOT, for example:

Apache::ROOT::My::Dir::script_2epl

This reduces the likelihood of namespace collisions, but frustrates tampering
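The naming scheme can be approximated in a few lines of plain Perl. This is a sketch of the idea only, not the code Apache::Registry actually runs; registry_package_name is a hypothetical helper and the input path is taken from the example name above.

```perl
use strict;
use warnings;

# Approximation: escape characters that are not legal in a package name as
# "_" plus their hex code, turn "/" into "::", and add the prefix.
sub registry_package_name {
    my $path = shift;
    (my $name = $path) =~ s/([^A-Za-z0-9\/])/sprintf("_%2x", ord $1)/eg;
    $name =~ s{/+}{::}g;
    return "Apache::ROOT$name";
}

print registry_package_name('/My/Dir/script.pl'), "\n";
# prints: Apache::ROOT::My::Dir::script_2epl
```

The "." in script.pl becomes _2e because 0x2e is its character code, which explains the odd-looking suffix in the example.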
