http://www.perlmonks.org?node_id=1050246

isync has asked for the wisdom of the Perl Monks concerning the following question:

I've got a module where I precompile regexes for speed/later use. But I can't decide what's best practice in terms of how I handle these module constants.
package Foo::Bar;
use Regexp::Assemble;
our $VERSION = '0.01';

sub new {
    my $class = shift;
    my $self  = bless({@_}, $class);
    ## precompile regexes
    $self->{regex} = Regexp::Assemble->new->add( 'some-stuff' )->re;
    return $self;
}

sub function {
    my $self = shift;
    ...
    $var =~ $self->{regex};    # the regex lives in the object here
}
vs.
package Foo::Bar;
use Regexp::Assemble;
our $VERSION = '0.01';

our $regex = Regexp::Assemble->new->add( 'some-stuff' )->re;

sub function {
    ...
    $var =~ $regex;
}

I don't hesitate to declare module-wide constants in the on-load/BEGIN part of a module, but it somehow feels wrong when these constants require some computation on load, such as precompiling regexes here.

I would favour the latter version. Doing a clean OO-style init and having the $obj var around is a bit cumbersome for the simple module I'm working on.
Arg, noticed that? I used the word "clean" to describe having these 'constants' in $self...

So please, Monks, give me some advice. Is it just a matter of style or design principles, or are there actual problems to consider when my constants do some computing on load?

(One difference I can think of is that this computation introduces overhead, and my choice decides for the module user where the precompilation overhead ends up:
  "fast load" + "mandatory new(), slowed" + "fast method" -- OO-style
  vs.
  "slower load" + "no mandatory new()" + "fast method" -- on-load style)

Anything except that?

Replies are listed 'Best First'.
Re: Modules: computing a constant, "on load" or in new()?
by BrowserUk (Patriarch) on Aug 20, 2013 at 21:16 UTC

    Putting them in new() means doing the same work over again for every object. That's a waste of CPU and a violation of the DRY principle.

    There is nothing 'dirty' about setting up the 'constant' requirements of your module at load time.

    Indeed, that is exactly what you are doing when you declare your functions.

    sub XYZ { ... } is effectively the same as *{ __PACKAGE__ . '::XYZ' } = sub { ... }; it is a load-time assignment to a symbol in your package stash.
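
    As a rough illustration of that comparison, here is a minimal sketch that installs one sub by declaration and one by glob assignment. The package name Demo::Stash and both sub names are made up for the example. One difference worth noting: the named sub is installed when the code is compiled, while the glob assignment happens when that statement runs; for a module pulled in with use, both are done by the time the caller's code starts running.

```perl
package Demo::Stash;    # hypothetical package name, for illustration only
use strict;
use warnings;

# Named declaration: installed in the package stash at compile time.
sub named { return 'named' }

# Roughly equivalent glob assignment: installed when this statement executes.
{
    no strict 'refs';    # the glob is looked up by a string name
    *{ __PACKAGE__ . '::assigned' } = sub { return 'assigned' };
}

package main;

# Both subs are now reachable through the same package stash.
print Demo::Stash->can('named')->(),    "\n";
print Demo::Stash->can('assigned')->(), "\n";
```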

Re: Modules: computing a constant, "on load" or in new()?
by tobyink (Canon) on Aug 20, 2013 at 21:56 UTC

    Personally I'd do something like this:

    package Foo::Bar;
    use Regexp::Assemble;
    our $VERSION = '0.01';

    my $regex;    # shared by all objects, filled in on first use

    sub new {
        my $class = shift;
        my $self  = bless({@_}, $class);
        return $self;
    }

    sub function {
        my $self = shift;
        ...
        # build the regex only once, the first time it is needed
        $regex ||= Regexp::Assemble->new->add( 'some-stuff' )->re;
        $var =~ $regex;
    }

    That is, delay the compilation to the last possible moment, and then keep the result for future invocations to use.

    Type::Tiny does this sort of thing all over the place, and it's fast. :-)
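
    To make the "build once, on first use" behaviour visible, here is a minimal sketch under a few assumptions: a plain qr// stands in for the Regexp::Assemble call so the example has no non-core dependencies, the package name Demo::Lazy and the build counter are invented for the demonstration, and //= (Perl 5.10 or later) is used in place of ||=.

```perl
package Demo::Lazy;    # hypothetical package name, for illustration only
use strict;
use warnings;

my $regex;             # file-scoped cache shared by every call
my $builds = 0;        # counts how often the pattern is really built

sub _regex {
    # //= assigns only while $regex is still undefined
    $regex //= do { $builds++; qr/some-stuff/ };
    return $regex;
}

sub matches {
    my ($class, $text) = @_;
    return $text =~ _regex() ? 1 : 0;
}

sub builds { return $builds }

package main;

print Demo::Lazy->builds, "\n";    # nothing has been built at load time
Demo::Lazy->matches('some-stuff here');
Demo::Lazy->matches('nothing to see');
print Demo::Lazy->builds, "\n";    # still built only once after two calls
```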

      "... delay the compilation to the last possible moment, and then keep the result for future invocations to use."

      If one were dealing with a lexical variable as in your example, and if the class was defined entirely within a file, and if the file/module was loaded only via a use statement, what would be the advantage of postponing initialization? If all the above conditions held, there could be no "order of evaluation" effects. Why would the following not be better because run-time (Update: well, 'run-time' is not quite the right term here, but you know what I mean) evaluation is entirely avoided? (In fact, it might even be better than the approach I posted here because you would be dealing with a 'pure' lexical variable without the interpolation problems associated with constant entities. However, methods or functions within the class could still change it, so it could not be considered a 'pure' constant.)

      package Foo::Bar;
      ...
      my $regex = qr{ hello }xms;
      ...
      sub new { ... }
      ...
      sub method {
          my $self = shift;
          ...
          if ($self->{bar} =~ m{ \b $regex \b }xms) { ... }
          ...
      }
      ...
      1;

        "If one were dealing with a lexical variable as in your example, and if the class was defined entirely within a file, and if the file/module was loaded only via a use statement, what would be the advantage of postponing initialization?"

        There might be a few dozen such items, each of which is reasonably costly to compile, and in a particular invocation of the program, only one or two (or perhaps even none in the case where the program has been invoked with the --help parameter) actually need to be used.
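
        One way to picture the "few dozen items, only one or two used" case is a table of builders with a cache in front of it. This is only a sketch under assumptions: the package name Demo::Patterns, the pattern names, and the trivial patterns themselves are invented, and real builders would be the costly ones.

```perl
package Demo::Patterns;    # hypothetical package name, for illustration only
use strict;
use warnings;

# Defining a builder is cheap; none of them runs until it is asked for.
my %builder = (
    word   => sub { qr/\b[a-z]+\b/i },
    number => sub { qr/\b\d+\b/ },
);
my %cache;                 # holds only the patterns that were requested

sub pattern {
    my ($class, $name) = @_;
    die "unknown pattern '$name'\n" unless $builder{$name};
    # build on first request, reuse the cached result afterwards
    return $cache{$name} //= $builder{$name}->();
}

# Names of the patterns that have actually been built so far.
sub built { return sort keys %cache }

package main;

print "matched\n" if '42 apples' =~ Demo::Patterns->pattern('number');
print join(',', Demo::Patterns->built), "\n";    # 'word' was never built
```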

Re: Modules: computing a constant, "on load" or in new()?
by AnomalousMonk (Archbishop) on Aug 20, 2013 at 22:28 UTC

    What BrowserUk said.
    I would look at it slightly differently. If we are talking about a 'class' constant, i.e., something that is invariant over all class and object methods and all 'free' functions of the class, make it a constant as soon as possible, computed or otherwise. If it's an 'object' constant, i.e., something invariant only over the lifetime of a given object, then, of course, it must be somehow created or defined in the constructor: how else? As an example of the first possibility:

    package Foo::Bar;
    ...
    use constant RX => qr{ hello }xms;
    ...
    sub new { ... }
    ...
    sub method {
        my $self = shift;
        ...
        if ($self->{bar} =~ RX) { ... }
        ...
    }
    ...
    1;
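
    The class-constant versus object-constant distinction can be put side by side in one small class. This is a sketch with assumed names (Demo::Greeter, the name argument, the name_rx slot), not code from the module under discussion: RX is fixed for the whole class, while the per-object pattern depends on a constructor argument and so can only be built in new().

```perl
package Demo::Greeter;    # hypothetical package name, for illustration only
use strict;
use warnings;

# Class constant: the same for every object, defined once at compile time.
use constant RX => qr{ hello }xms;

sub new {
    my ($class, %args) = @_;
    my $self = bless {}, $class;
    # Object constant: depends on the arguments, so it is built per object.
    $self->{name_rx} = qr{ \b \Q$args{name}\E \b }xms;
    return $self;
}

sub greets {
    my ($self, $text) = @_;
    return 0 unless $text =~ RX;                  # class-wide check
    return $text =~ $self->{name_rx} ? 1 : 0;     # per-object check
}

package main;

my $greeter = Demo::Greeter->new(name => 'isync');
print $greeter->greets('hello isync'), "\n";
print $greeter->greets('hello world'), "\n";
```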
Re: Modules: computing a constant, "on load" or in new()?
by isync (Hermit) on Aug 21, 2013 at 00:08 UTC
    Again, quality and in-depth feedback.
    Lesson learned. Thank you guys!