Suppose you have to implement a long-running, business-critical application
with a huge code base and a long startup time, which you absolutely can't
restart: either because downtime means loss of $$, or because your phone is
going to glow, or because you're going to be flooded with emails from
complaining customers. Or think of something that is crucial in a permanent
workflow. Think perlmonks ;-)
You know that the first release will be running forever, but you will have to
update your application dynamically: your first release will be far from
perfect, and requirements you cannot foresee will evolve during the
application's runtime. In short: keep it running, but be able to update it.
As there certainly will be server downtimes due to maintenance, security
fixes and upgrades of other packages, into which your updates could be
scheduled, this scenario may seem hypothetical. But given a long load time
of your app, you might as well have a framework that needs this kind of
dynamic code change at development time; you may want to avoid e.g. killing
fastcgi servers or reloading apache. For the sake of making my point I'll
just bless $hypothetical, $real. (Please accept my apologies if what I'm
meditating over has been presented elsewhere and beaten to death already.)
What are your choices?
IPC
You could break your app up into separate processes. You would have one
immutable process which, for doing whatever(), queues requests and sends
messages to and receives responses from external processes via some IPC
mechanism. Those external processes you could kill off and restart for
upgrades. This adds complexity to your code for relatively little gain.
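To make the trade-off concrete, here is a minimal sketch of that layout, assuming a Unix-like system and the core module IPC::Open2. The names start_worker, ask and stop_worker, and the two one-line worker programs, are hypothetical and only serve the illustration: the parent stays up while the worker is replaced by a newer version.

```perl
use strict;
use warnings;
use IPC::Open2;

# Start a worker process running the given Perl source (hypothetical helper).
# The worker is expected to answer each request line with one reply line.
sub start_worker {
    my ($source) = @_;
    my $pid = open2(my $from, my $to, $^X, '-e', $source);
    return { pid => $pid, to => $to, from => $from };
}

# Send one request line and wait for the one-line reply.
sub ask {
    my ($w, $request) = @_;
    print { $w->{to} } "$request\n";      # open2 turns autoflush on for this handle
    my $reply = readline($w->{from});
    chomp $reply if defined $reply;
    return $reply;
}

# Closing the worker's STDIN makes its read loop end; then reap it.
sub stop_worker {
    my ($w) = @_;
    close $w->{to};
    waitpid $w->{pid}, 0;
    close $w->{from};
}

# Two "releases" of the worker code.
my $v1 = '$| = 1; while (<STDIN>) { chomp; print uc($_), "\n" }';
my $v2 = '$| = 1; while (<STDIN>) { chomp; print scalar(reverse $_), "\n" }';

my $worker = start_worker($v1);
print ask($worker, 'hello'), "\n";        # answered by release 1
stop_worker($worker);

$worker = start_worker($v2);              # the "upgrade": same parent, new worker
print ask($worker, 'hello'), "\n";        # answered by release 2
stop_worker($worker);
```

Even this toy needs a wire format, flushing discipline and process bookkeeping, which is the added complexity the paragraph above refers to.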
source file timestamps
A good idea.
You could periodically check the time stamps of your source files in the main
loop of your application, and throw away the %INC entries of updated modules,
so that they get compiled again on the next require.
Some reload modules, e.g. Apache2::Reload, do that.
But in doing so, you must be careful. You can't just throw away the symbol
table, as your objects get orphaned. The orphaned object bug is fixed only in
very recent perl releases: referencing orphaned objects doesn't cause a
segmentation fault anymore; instead they get attached to the empty __ANON__
package. That doesn't help much either, because the object's methods are
lost.
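A minimal sketch of the timestamp approach, to be called from the application's main loop. The helper name reload_changed and the %mtime bookkeeping are hypothetical, and this is not how Apache2::Reload is implemented; it only shows the idea of dropping the %INC entry and requiring the file again. The symbol table is left alone, so existing objects keep their package; subs that were removed from the source file stay defined.

```perl
use strict;
use warnings;

my %mtime;    # module file name as in %INC => mtime seen at the last check

# Re-require every module from %INC whose source file changed on disk.
# Returns the list of module file names that were reloaded.
sub reload_changed {
    my @reloaded;
    for my $module (sort keys %INC) {
        my $path = $INC{$module};
        next unless defined $path and !ref $path and -f $path;
        my $now = (stat $path)[9];
        my $old = $mtime{$module};
        $mtime{$module} = $now;
        next if !defined $old or $old == $now;    # first sighting, or unchanged

        delete $INC{$module};                     # forget it, so require compiles it again
        # silence "Subroutine ... redefined" warnings from the reloaded file
        local $SIG{__WARN__} = sub {
            warn @_ unless $_[0] =~ /^Subroutine .* redefined/;
        };
        eval { require $module; 1 }
            or warn "reload of $module failed: $@";
        push @reloaded, $module;
    }
    return @reloaded;
}
```

The first call only records the time stamps; later calls reload what changed in between.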
Packages may be too coarse a grouping anyway. You might want to update single
functions and/or methods, not an entire package.
But the functions or methods of a package live in a single file. Or do they?
Let's see what we have at hand.
lazy loading
Perl has a well-known mechanism for lazy loading of functions, implemented
via the module pair AutoSplit and AutoLoader. Splitting up a
module into an immediate part and delayed code is easy: you put the functions
suitable for autoloading after the __END__ token. After running AutoSplit on
your module, the immediate code part will reside in your Module.pm,
while the auto-loadable functions will reside in auto/Module/ in
per-function files, e.g. function.al.
If you use your module, only the immediate code part is compiled
and run. Calls to autosplit functions will be handled via AutoLoader, whose
AUTOLOAD subroutine will require the function.al file the first
time function($foo) is called, and replace its own stack frame with that
of the just-compiled function via the magic goto &function.
This may greatly reduce startup time, but other than that it doesn't give more
flexibility. Once a function is loaded from an autosplit file, it is defined.
AUTOLOAD will not be called again for that function, since now it can
be found in the symbol table.
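The "loaded once, then never asked again" behaviour can be seen in a few lines. This is a hand-rolled sketch, not AutoLoader's actual code: the package Lazy::Demo, the %source hash (standing in for the auto/Module/*.al files) and the call counter are made up for the illustration.

```perl
use strict;
use warnings;

package Lazy::Demo;

our $AUTOLOAD;
our $autoload_calls = 0;    # how often AUTOLOAD actually ran

# Stand-in for the per-function .al files: source text keyed by function name.
my %source = (
    greet => 'sub { "hello, $_[0]" }',
);

sub AUTOLOAD {
    my $name = $AUTOLOAD;
    (my $func = $name) =~ s/.*:://;
    return if $func eq 'DESTROY';
    my $text = $source{$func}
        or die "Undefined subroutine &$name called";
    $autoload_calls++;
    my $code = eval $text or die $@;    # compile on first use
    no strict 'refs';
    *$name = $code;                     # install it into the symbol table
    goto &$name;                        # replace AUTOLOAD's frame with the real sub
}

package main;

print Lazy::Demo::greet('monks'), "\n";
print Lazy::Demo::greet('again'), "\n";
print "AUTOLOAD ran $Lazy::Demo::autoload_calls time(s)\n";
```

The second call finds greet in the symbol table and bypasses AUTOLOAD, so a later change to the source is never noticed.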
The subroutine or method "doesn't care for itself". If it were to do so, calling
it would trigger some check subroutine that looks at some value particular to
the function's source file (timestamp, size, ownership, flags, md5sum, ...) and
decides from there whether the function has to reload itself.
Having the check code outside of the sub means either a) shoehorning every call
to a monitored subroutine through a dispatching sub that checks the disk files,
or b) checking asynchronously, e.g. with an alarm handler that periodically goes
over all files.
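A sub that does "care for itself" can be sketched as a closure around the payload. The helper name reloading_sub is hypothetical, and the sketch assumes the source file returns an anonymous subroutine as its last statement and is given by absolute path (newer perls no longer search the current directory for do FILE).

```perl
use strict;
use warnings;

# Wrap the anonymous sub returned by $file in a closure which, on every
# call, compares a check value (default: mtime) against the remembered one
# and re-reads the file if the value changed.
sub reloading_sub {
    my ($file, $checksub) = @_;
    $checksub ||= sub { (stat $_[0])[9] };
    my ($inner, $seen);
    my $load = sub {
        my $code = do $file;
        die "can't load $file: ", ($@ || $!) unless ref($code) eq 'CODE';
        ($inner, $seen) = ($code, $checksub->($file));
    };
    $load->();
    return sub {
        my $now = $checksub->($file);
        # string comparison, so checksums work as check values too
        $load->() if defined $now and $now ne $seen;
        goto &$inner;    # hand over @_ and the stack frame to the payload
    };
}
```

Called as my $sub = reloading_sub('/path/to/function.al'); $sub->(@args), the payload is replaced on the first call after the file's mtime changes; the price is one stat per call.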
introducing Sub::Auto
This module provides for lazy loading and reloading of monitored subroutines.
=head1 NAME

Sub::Auto - Lazy loading and reloading of anonymous subroutines

=head1 SYNOPSIS

    use Sub::Auto;
    my $sub = Sub::Auto -> new ($file, $checksub, $autoprefix);
    $result = $sub -> (@args);
    $sub -> check (0);            # turn source file checking off for $sub
    $sub -> checksub ($coderef);  # provide alternative checking routine

    use Sub::Auto qw (AUTOLOAD);
    Sub::Auto -> check (1);       # turn source file checking on
    $result = somefunc (@args);
    *somefunc{CODE}->check(0);    # turn off checking for this named sub

=head1 DESCRIPTION

Sub::Auto provides lazy loading like AutoLoader, but also for function
files which return an anonymous subroutine upon require (as their last
evaluated statement).

Before that file is required, it is checked via some subroutine returning
a value (default is mtime). The returned value is remembered. At each
call to that sub the check subroutine is run again if this subroutine's
check flag is set; and if the returned value changed, the source file is
reloaded.

Importing the AUTOLOAD method provides for lazy loading of anonsubs as
named subs. The wrapped anonsub will be assigned to a symbol table entry
named after the filename root of the function source file.

=head1 METHODS

=over 4

=item new ($file, $checksubref, $autoprefix)

Subroutine constructor. $file can be the path to some function file or
a function name which will be expanded to $autoprefix/__PACKAGE__/$function.al
and searched for in @INC. $checksubref and $autoprefix are optional.
If they are not provided, the default class settings are used.

=item auto ($autoprefix)

Set or get the default autoprefix. Default is 'auto', just as with AutoLoader:
for e.g. POSIX::rand the source file would be auto/POSIX/rand.al . Sub::Auto
lets you replace the 'auto' part of the path with something else. Class method
(for now).

=item suffix ($suffix)

Set or get the suffix of your autoloaded files (e.g. '.al', '.pl', '.tmpl')
as a package variable.

=item check (1)

Set the check flag. Turn checking on by setting this to some true value.
Default is off. Class and object method, i.e. Sub::Auto->check(1) sets the
default to on, $sub->check(1) sets checking for a subroutine. For now, there's
no way to impose the class default on subs with a private check flag.

=item checksub ($coderef)

Set the checking subroutine. Class and object method. This subroutine will be
invoked with a subroutine's source filename (full path) every time the sub for
which it is configured is called - but only if check for that subroutine is
true -, and should return some value special to that file.
Default is 'sub { (stat $_[0]) [9] }', i.e. mtime.

=back

=head1 SEE ALSO

AutoLoader, AutoSplit, autouse, DBIx::VersionedSubs

=head1 TODO

=over 4

=item eliminate paranoia

Make this module truly subclassable. Turn lexical private subs into our() vars
or into named subs. Make the %AL hash accessible. All that means rethinking
code calling semantics and uses of __PACKAGE__ .

=item provide for more path changes and access methods of subroutines

The 'auto' part of a subroutine should be changeable, as well as the full path
to a subroutine source file. Then, a subroutine's access method should be made
more flexible, e.g. reading code from some database, retrieving it via LWP,
or something else.

=back

=head1 BUGS

Sub::Auto subroutines are always reported as __ANON__ (e.g. with Carp::cluck),
even if they are assigned to a symbol table entry. Which might not be a bug.

There might be others.

=head1 AUTHOR

shmem <gm@cruft.de>

=head1 COPYRIGHT

Copyright 2007 by shmem <gm@cruft.de>

This program is free software; you can redistribute it and/or modify it
under the same terms as Perl itself.

=cut

package Sub::Auto;
use Exporter qw(import);
use strict;
use warnings;
use Scalar::Util;
use File::Spec;
our $VERSION = 0.01;
our @EXPORT_OK = qw (AUTOLOAD);
my $Debug = 0;
our ($gensub, $load);
our %AL; # hash holding all info about subs
sub new {
    my $class  = shift;
    my $caller = caller;
    my $sub    = $gensub -> ($caller,@_);
    bless $sub, $class;
}
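sub auto {
    # drop the invocant if called as a method; a bare __PACKAGE__ is
    # always true, so compare the first argument against it instead
    shift if @_ and (ref($_[0]) or $_[0] eq __PACKAGE__
                                or $_[0] eq (caller(0))[0]);
    $AL {'auto'} = shift if @_;
    $AL {'auto'};
}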
sub check {
    my $self = shift;
    if(ref($self)) {
        ${ $AL {Sub} -> {Scalar::Util::refaddr($self)} -> {'check'} } = shift;
    } else {
        $AL {'check'} = shift;
    }
}
sub checksub {
    my $self = shift;
    if(ref($self)) {
        ${ $AL{Sub} -> {Scalar::Util::refaddr($self)} -> {'checksub'} } = shift;
    } else {
        $AL {'checksub'} = shift;
    }
}
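sub suffix {
    # drop the invocant if called as a method; a bare __PACKAGE__ is
    # always true, so compare the first argument against it instead
    shift if @_ and (ref($_[0]) or $_[0] eq __PACKAGE__
                                or $_[0] eq (caller(0))[0]);
    $AL {'suffix'} = shift if @_;
    $AL {'suffix'};
}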
checksub ( __PACKAGE__, sub { (stat $_[0]) [9] } ); # default check subroutine
check    ( __PACKAGE__, 0);                         # default is not checking
# $gensub - returns an anonymous subroutine.
# Parameters:
# if one: filename (full path)
# if more: package, filename [, checkfuncref [, auto ]]
$gensub = sub {
    my $package = scalar(@_) == 1 ? caller : shift;
    my $file    = shift;
    my $chkfunc = shift || $AL {'checksub'};
    my $auto    = shift || $AL {'auto'} || 'auto';
    my $function;
    {
        ($function = pop (@{[ File::Spec->splitpath($file) ]}) ) =~ s/\..*//;
        $file .= $AL {'suffix'} || '.al' unless $file =~ /\.\w+$/;
        unless (-e $file) {
            my ($filename, $seen);
            {
                $filename = File::Spec -> catfile ($auto, $package, $file);
                foreach my $d ('.',@INC) { # check current working dir first
                    my $f = File::Spec -> catfile ($d,$filename);
                    #warn "checking for $f\n";
                    if (-e $f) {
                        $file = $f;
                        #warn "got it! $file\n";
                        last;
                    }
                }
                # redo the search with a truncated filename
                last if $seen;
                unless (-e $file) {
                    $file =~ s/(\w{12,})(\.\w+)$/substr($1,0,11).$2/e;
                    $seen++;
                    redo;
                }
            }
            die
              "Can't locate function file '$filename' for package '$package'\n"
                unless -e $file;
        }
    }
    if (my $addr = $AL {'Inc'} -> {"$package\::$function"} ) {
        return $AL {'Sub'} -> {$addr} -> {'outer'};
    } else {
        # file not known yet
        my $inner;
        my $h       = {};
        my $cr      = $chkfunc -> ($file);
        my $subname = "$package\::$function";
        $h = {
            file     => $file,
            check    => \$AL {'check'},
            checksub => \$chkfunc,
            checkref => \$cr,
            function => $subname,
        };
        my $outer     = $load -> ($package, $file, $h) or die $@;
        my $outeraddr = Scalar::Util::refaddr ($outer);
        $h -> {'outer'} = $outer;
        Scalar::Util::weaken ($h -> {outer});
        $AL{Sub} -> {$outeraddr} = $h;
        $AL{Inc} -> {$subname}   = $outeraddr;
        return bless $outer, __PACKAGE__;
    }
};
$load = sub {
    my ($package, $file, $h) = @_;
    delete $INC {$file};
    my $ref = eval "package $package; require '$file'";
    # warn $@ if $@;
    return undef if $@;
    {
        # just in case the require didn't return a ref -
        # then it's likely a named subroutine has been loaded
        # see chromatic's note below
        # UNIVERSAL::isa($ref,'CODE') or $ref = \&{$h -> {'function'}};
        Scalar::Util::reftype($ref) and Scalar::Util::reftype($ref) eq 'CODE'
            or $ref = \&{$h -> {'function'}};
        ${$h->{inner}} = $ref;
        my $sub = sub {
            my $cr = $h -> {'checkref'};
            # compare as strings, so that non-numeric check values
            # (e.g. an md5sum) work as well as the default mtime
            if( ${ $h -> {'check'} } and ${ $h-> {'checksub'} }
                and
                ( my $c = ${ $h->{checksub} } -> ($file) ) ne $$cr) {
                warn "reloading $file" if $Debug;
                $$cr = $c;
                $load -> ($package, $file, $h);
            }
            goto ${ $h -> {inner} };
        };
    }
};
sub DESTROY {
    my $outeraddr = Scalar::Util::refaddr ($_[0]);
    my $h = $AL {Sub} -> {$outeraddr};
    delete $AL {Inc} -> { $h -> {function}};
    delete $AL {Sub} -> {$outeraddr};
}
sub AUTOLOAD {
    no strict;
    my $sub = $AUTOLOAD;
    my ($pkg, $func, $filename);
    {
        ($pkg, $func) = ($sub =~ /(.*)::([^:]+)$/);
        $pkg = File::Spec -> catdir (split /::/, $pkg);
    }
    my $save = $@;
    local $!; # Do not munge the value.
    my $ref;
    eval {
        local $SIG{__DIE__};
        $ref = $gensub -> ($pkg, $func, '', $AL{'auto'} || 'auto');
    };
    if ($@) {
        if (substr ($sub,-9) eq '::DESTROY') {
            no strict 'refs';
            *$sub = sub {};
            $@ = undef;
        }
        if ($@){
            my $error = $@;
            require Carp;
            Carp::croak($error);
        }
    }
    $@ = $save;
    no warnings 'redefine';
    *$AUTOLOAD = $ref;
    goto $ref;
}
sub unimport {
    my $callpkg = caller;
    no strict 'refs';
    my $symname = $callpkg . '::AUTOLOAD';
    undef *{ $symname } if \&{ $symname } == \&AUTOLOAD;
    *{ $symname } = \&{ $symname };
}
1;
__END__
If used as
use Sub::Auto qw(AUTOLOAD);
it is a drop-in replacement for AutoLoader (it handles named subroutines
also) - with two caveats:
- it doesn't look for a package's autosplit file to pre-define subroutines for the caller upon import() execution
- references to a pre-declared named sub change after loading the respective autoload function file
<update>
As of now, it works (for me :-), but some AutoLoader tests fail when run against this
module:
- autoload function files with truncated file names fail to load
- currently no unimport (no Sub::Auto; not implemented)
</update>
With a few changes and enhancements, AutoLoader could do the job.
All that would be necessary is
- extend AutoLoader's import to take a hash reference resembling %AL
- make it use a $gensub routine if provided, otherwise use its standard require
- check for the returned value from loading to choose the right form of goto
performance
Since the payload subroutines are wrapped into references which in turn are looked up
from a hash by code that performs more checks and is wrapped into another sub (the outer
sub, visible to the caller), there's significant overhead. You will not want to use this
module to wrap tiny recursive functions where the call itself takes up much of their
overall running time.
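A rough way to see that overhead on your own machine is the core Benchmark module. This sketch does not use Sub::Auto itself: $payload, $wrapped and the temporary file are hypothetical stand-ins that mimic a check flag, a stat and a goto in front of a tiny function.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);
use File::Temp qw(tempfile);

# Hypothetical stand-ins: a tiny payload and a source file to stat.
my ($fh, $file) = tempfile(UNLINK => 1);
my $payload = sub { $_[0] + 1 };
my $seen    = (stat $file)[9];
my $check   = 1;

# Outer sub in the style described above: check flag, stat, then goto.
my $wrapped = sub {
    if ($check and (stat $file)[9] != $seen) {
        $seen = (stat $file)[9];    # a real wrapper would reload here
    }
    goto &$payload;
};

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    direct  => sub { $payload->(1) },
    wrapped => sub { $wrapped->(1) },
});
```

The numbers depend on perl version, file system and caching, so treat the resulting table as an indication only. Setting $check to 0 shows how much of the cost is the stat and how much is the extra call layer.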
at the end
Do you find this module useful? Does it have the right name? Should I release it to CPAN?
Or should I write a patch for AutoLoader / AutoSplit?
update: uploaded to CPAN as AutoReloader.
Comments, critiques and enhancement suggestions welcome.
update: replaced UNIVERSAL::isa with Scalar::Util::reftype as chromatic suggested, added autouse to the SEE ALSO section as per diotalevi's comment.
update: added unimport, fixed file searching, added proper die() statements - runs against the AutoLoader test file, failing only test 4 (can() returns ref to regular installed sub) - see above.
update: added AutoReloader for searchability, since that's its name on CPAN
--shmem
_($_=" "x(1<<5)."?\n".q·/)Oo. G°\ /
/\_¯/(q /
---------------------------- \__(m.====·.(_("always off the crowd"))."·
");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}