mod_perl High speed dynamic content

mod_perl

High speed dynamic content

Definitions

Apache – OpenSource httpd server

Perl – OpenSource interpreted programming language

mod_perl – OpenSource interface between Apache and Perl

Scenario

Unix hosted web-site using CGI Perl scripts to create web pages with dynamic content.

As the site becomes larger and more popular, the speed slows

The web servers are not resource constrained

How can we speed the delivery of content?

Topics covered

Why use mod_perl?

How mod_perl works

Installation highlights

Commonly encountered problems when porting from CGI to mod_perl

Take home messages

use strict;

Use the PerlTaintCheck directive

mod_perl enabled scripts are fast

Possibly faster than C code…

Effective use of mod_perl is not trivial

Why use mod_perl?

CGI scripts written in Perl and served from Apache can be slow

Generally the more dynamic the content, the slower the page return

As sites grow, their content usually becomes more dynamic

Sites ported to mod_perl show request return rates 200% to 2000% higher!!!

Evolution of a web-site

Why is CGI slow?

For each CGI script called:

Apache must start a new process that loads the Perl interpreter

Perl must load the CGI script, including all modules it uses

The CGI script must initialize itself on each load

Database initialization is a common slow-down

Dynamic content options

[Chart: the options CGI, FastCGI, Server API, Server Side Includes, DHTML, Client-side Java, Embedded Interpreter and Integrated System compared on Portability, Performance, Simplicity and Power; axis scale 0 to 14. Taken from "Writing Apache Modules with Perl and C".]

The embedded interpreter

There are increasingly sophisticated solutions:

Only load the Perl interpreter once (Apache::PerlRun)

Only load the CGI scripts and modules once (Apache::Registry)

Only do initializations once (code re-factoring)

The mod_perl version

Each Apache instance loads the Perl interpreter when it initializes (Apache::PerlRun)

mod_perl compiles CGI scripts the first time they are executed and keeps them in memory (Apache::Registry)

We can move CGI initialization code into a separate script that is executed when each Apache instance starts (Code re-factoring)

How does mod_perl work?

It’s invoked by Apache, so… how does Apache work?

It’s running Perl code, so… how does Perl work?

The mod_perl handler

How does Perl work?

Interpreted language

To run a Perl program:

The Perl interpreter is loaded

The program is compiled by the Perl interpreter

The program is executed

Perl’s @INC array

Analogous to the LD_LIBRARY_PATH environment variable

This array contains the paths that Perl will search when loading libraries or modules

With mod_perl this is set during Apache’s initialization phase and cannot be changed later
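
As a minimal sketch of how a path reaches @INC at compile time, the snippet below uses the `use lib` pragma in a plain Perl program, which is the kind of statement a startup.pl would typically contain. The path /opt/apps/lib is only an illustration and does not need to exist.

```perl
use strict;
use warnings;

# 'use lib' runs at compile time and puts the path at the front of @INC.
# /opt/apps/lib is an illustrative path; it does not have to exist.
use lib '/opt/apps/lib';

my $found = grep { $_ eq '/opt/apps/lib' } @INC;

print "Search path:\n";
print "  $_\n" for @INC;
print "Custom path present: ", ($found ? "yes" : "no"), "\n";
```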

Perl’s %INC hash

After Perl has loaded a file with ‘require’ or ‘use’, it is added to %INC: the key is the name as requested (e.g. My/Module.pm) and the value is the path of the file that was found

Before loading a file, Perl searches %INC for a match. If it finds one, it uses the %INC entry and does not re-load the file
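
The behaviour can be observed in a plain Perl program. In this sketch the %INC key is the module path as requested and the value is the file that was found; File::Basename is used only because it is a core module.

```perl
use strict;
use warnings;
use File::Basename ();    # any core module will do for the demonstration

my $key = 'File/Basename.pm';
print "$key => $INC{$key}\n";

# A second require finds the key in %INC and returns without re-reading the file.
my $before = $INC{$key};
require File::Basename;
my $unchanged = ($INC{$key} eq $before) ? 1 : 0;
print "Entry unchanged after second require: $unchanged\n";
```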

How does Apache work?

Apache serves documents requested via HTTP

A sample request:

The Apache request cycle

The mod_perl handlers

You can interact with the Apache request cycle at any point with a handler

A handler is a Perl subroutine that is assigned to a <Directory>, <Location>, <Files> or <FilesMatch> section in Apache’s configuration files

The Apache handler API

Apache calls the handler, passes information about the current transaction and server state and expects an integer status code in return

If the handler returns an error, Apache short-circuits to the logging phase

The PerlHandler: part 1

Called during the content phase to generate the response for the requested document

Two standard handlers:

Apache::PerlRun

Apache::Registry

The PerlHandler: part 2

Unless you are using a custom handler, your CGI programs are executed as subroutines inside the handler subroutine (nested subroutines)

Since Apache never exits, the CGI programs are never reaped by Perl’s garbage collection

A handler subroutine

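package My::Handler;    # hypothetical package name

use strict;

sub handler {
    my $request = shift;                    # the Apache request object
    my $status  = do_something($request);   # do_something() is a placeholder for your own code
    return $status;                         # an integer status code, e.g. OK
}

1;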
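
To make the calling convention concrete, here is a sketch that can be run outside Apache. Mock::Request is a stand-in invented for this example; under mod_perl 1 the real request object is supplied by Apache and OK would be imported from Apache::Constants. The method names content_type, send_http_header and print mirror the mod_perl 1 request API; the package name My::Hello is hypothetical.

```perl
use strict;
use warnings;

# Stand-in for the Apache request object (an assumption of this sketch).
package Mock::Request;
sub new              { return bless { out => '', type => '' }, shift }
sub content_type     { my ($self, $type) = @_; $self->{type} = $type; return }
sub send_http_header { return }
sub print            { my $self = shift; $self->{out} .= join '', @_; return }

package My::Hello;    # hypothetical handler package
use constant OK => 0; # same value as Apache::Constants::OK

sub handler {
    my $r = shift;                    # request object passed in by the server
    $r->content_type('text/html');
    $r->send_http_header;
    $r->print("<html><body>Hello from a handler</body></html>\n");
    return OK;                        # integer status code expected by Apache
}

package main;

my $r      = Mock::Request->new;
my $status = My::Hello::handler($r);
print "status=$status\n", $r->{out};
```

Under a real server the handler would be attached with a PerlHandler directive, as in the perl.conf slides.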

Installation highlights

Install Apache and mod_perl together

Configure Apache to use mod_perl

Port your CGI scripts

Installation: part 1

(The following is for Unix; Win32 installs are more difficult)

Download Apache http://www.apache.org

current version is 1.3.12

Download mod_perl http://search.cpan.org

current version is 1.24

Check the Perl version (mod_perl 1.24 or earlier works best with Perl 5.005_03, not 5.6.0)

Installation: part 2

By default building mod_perl will also build Apache

>perl Makefile.PL APACHE_SRC=../apache.1.3.12 \
    DO_HTTPD=1 USE_APACI=1 EVERYTHING=1

>make && make test && make install

>cd ../apache.1.3.12; make install

Configuration

Configure Apache to use mod_perl:

Modify httpd.conf:

<IfModule mod_perl.c>
    Include conf/perl.conf
</IfModule>

perl.conf: part 1

PerlSetEnv PERL5LIB "/my/perl/lib"

PerlRequire conf/startup.pl

Alias /cgi-bin "/my/cgi/dir"

PerlTaintCheck On

perl.conf: part 2

<Location /cgi-bin>

SetHandler perl-script

PerlHandler Apache::Registry

Options ExecCGI

</Location>

perl.conf: part 3

A good initial check of mod_perl

<Location /mod_perl-status>

SetHandler perl-script

PerlHandler Apache::Status

</Location>

startup.pl

Loads commonly used modules

Initializes database connections

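use lib qw( /home/matt/perl/Modules
            /opt/apps/lib );
use DBI ();
use vars qw( $dbh );    # package global; a 'my' variable would be invisible to the scripts

# $db, $user and $passwd must be defined before this point
$dbh = DBI->connect($db, $user, $passwd);

1;    # PerlRequire expects a true return value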

Porting CGI scripts to mod_perl

‘use strict;’

During porting, ‘use diagnostics;’ and check the error_log

Run the Apache server in single-process mode (httpd -X)

Add the following to perl.conf

PerlInitHandler Apache::StatINC

Common porting problems: part 1

As someone pointed out during the presentation, all of these problems could be solved by using re-entrant coding techniques

If you used them when you wrote the CGI scripts, porting will be easy

That being said, most CGI scripts are not written by people familiar with such techniques…

Common porting problems

Run-time attempts to add paths to @INC

Implicit variable passing

Assigning anonymous sub to $SIG{__DIE__}

Compiled regular expressions

No closure of file or dbm handles

Adding paths to @INC

This will not work because @INC is fixed after the initialization of Apache

Use startup.pl to add paths

BEGIN blocks only get executed at compile time

Implicit variable passing

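A subroutine that reads one of the script’s ‘my’ variables directly, instead of receiving it as an argument, keeps the value from the first call

After that first call, Perl considers the script’s variable and the subroutine’s copy to be separate variables

Solutions:

Fully qualify the variable in the subroutine

Pass by value or reference

Make the variable a package global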
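
The effect can be reproduced in plain Perl. In this sketch, run_script() stands in for the wrapper subroutine that the handler places around a CGI script; all subroutine names here are made up for the illustration.

```perl
use strict;
use warnings;
no warnings 'closure';    # silence "Variable will not stay shared" for the demo

# run_script() plays the role of the handler's wrapper around a CGI script.
sub run_script {
    my $name = shift;     # the script's own lexical variable

    # Named nested sub: bound to the $name of the FIRST call only.
    sub greet_implicit { return "Hello $name" }

    # Passing the value explicitly avoids the problem.
    sub greet_explicit { my $who = shift; return "Hello $who" }

    return (greet_implicit(), greet_explicit($name));
}

my @first  = run_script('alice');
my @second = run_script('bob');

print "first request:  $first[0] / $first[1]\n";
print "second request: $second[0] / $second[1]\n";
```

On the second call the implicit version still reports the first caller's name, while the explicit version sees the current value.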

$SIG{__DIE__} problems

$SIG{__DIE__} is global

Your anonymous subroutine will be called by any ‘die’ statement

Solutions:

Write your own exception method

Error.pm (try, throw, catch)
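
One possible shape for a home-grown exception method is sketched below, using only eval and a localized $SIG{__DIE__}. The helper name try_run is hypothetical; it is not part of Error.pm or mod_perl.

```perl
use strict;
use warnings;

# Hypothetical helper: run a code block and report failure instead of dying.
sub try_run {
    my ($code) = @_;

    # 'local' confines the change to this call, so the global handler
    # seen by other scripts in the same process is left untouched.
    local $SIG{__DIE__} = 'DEFAULT';

    my $ok    = eval { $code->(); 1 };
    my $error = $ok ? '' : $@;
    return ($ok ? 1 : 0, $error);
}

my ($ok, $error) = try_run(sub { die "database unavailable\n" });
print "ok=$ok error=$error";
```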

Compiled regular expressions

With the /o modifier, the pattern is only compiled for the first value of $pattern

Solution:

Wrap the code in a string eval, so the match is recompiled each time

eval q{
    foreach (@list) {
        &mysub($_) if ( /$pattern/o );
    }
};
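
The stale-pattern effect can be shown in a few lines of plain Perl. The second subroutine uses qr//, which is a different technique from the string eval on the slide: it compiles the pattern once per call rather than once per process. The subroutine names are made up for this sketch.

```perl
use strict;
use warnings;

# /o compiles the pattern the first time this match is executed and
# never again for the life of the process.
sub match_once {
    my ($pattern, @list) = @_;
    return grep { /$pattern/o } @list;
}

# qr// builds a fresh compiled pattern on every call.
sub match_fresh {
    my ($pattern, @list) = @_;
    my $re = qr/$pattern/;
    return grep { $_ =~ $re } @list;
}

my @words = qw(apple banana cherry);

my @once_first   = match_once('an', @words);
my @once_second  = match_once('ch', @words);    # still matches against 'an'
my @fresh_second = match_fresh('ch', @words);

print "with /o, second pattern: @once_second\n";
print "with qr, second pattern: @fresh_second\n";
```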

No closure of file or dbm handles

It is common to rely on ‘close on exit’ filehandle management

Now the script does not exit…

Solution:

Close your file and dbm handles
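
A minimal sketch of explicit handle management follows. It uses lexical filehandles, which need Perl 5.6 or later, and a temporary file from the core File::Temp module so that it runs anywhere; append_line is a hypothetical helper name.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Temporary file so the sketch does not touch real data.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
close($tmp_fh) or die "Cannot close $path: $!";

# Open, write and close within one call, so nothing stays open
# between requests in a long-running process.
sub append_line {
    my ($file, $line) = @_;
    open(my $fh, '>>', $file) or die "Cannot open $file: $!";
    print {$fh} "$line\n";
    close($fh) or die "Cannot close $file: $!";
    return;
}

append_line($path, 'first request');
append_line($path, 'second request');

open(my $in, '<', $path) or die "Cannot open $path: $!";
my @lines = <$in>;
close($in) or die "Cannot close $path: $!";

print "lines written: ", scalar(@lines), "\n";
```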

Conclusions

mod_perl can speed up CGI scripts by 200% to 2000%

use strict;

Use the PerlTaintCheck directive

Effective use of mod_perl is not trivial; plan accordingly

Further Reading

Writing Apache Modules with Perl and C, Lincoln Stein and Doug MacEachern

mod_perl Guide, Stas Bekman, http://perl.apache.org/guide

FAQ, Frank Cringle, http://perl.apache.org/faq

Apache %INC hash detail

The handler subroutine changes the name of the nested subroutines to the fully qualified filename plus the prefix Apache::ROOT

Example: Apache::ROOT::My::Dir::script_2epl

This reduces the likelihood of namespace collisions, but frustrates tampering
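
The naming scheme can be approximated in a few lines. This is a simplified sketch, not the handler's actual code: it assumes that characters outside letters, digits and underscore are replaced by an underscore plus their hex code (which is how ‘.’ becomes ‘_2e’) and that path separators become ‘::’. The function name registry_package is hypothetical.

```perl
use strict;
use warnings;

# Simplified approximation of the name mangling (an assumption of this sketch).
sub registry_package {
    my $path = shift;
    # Encode anything that is not valid in a package name, e.g. '.' -> '_2e'.
    $path =~ s/([^A-Za-z0-9_\/])/sprintf("_%02x", ord($1))/eg;
    # Path separators become package separators.
    $path =~ s{/+}{::}g;
    return "Apache::ROOT$path";
}

my $package = registry_package('/My/Dir/script.pl');
print "$package\n";
```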