asz has asked for the wisdom of the Perl Monks concerning the following question:


i recently had to use $', so reading perlvar more carefully, i noticed it warns about using $':
The use of this variable anywhere in a program imposes a considerable performance penalty on all regular expression matches. See "BUGS".
...but in the BUGS section it states that:
Due to an unfortunate accident of Perl's implementation, use English imposes a considerable performance penalty on all regular expression matches in a program,[...]
now i'm confused :) ... i don't understand whether it is use English; (i.e. $POSTMATCH) that reduces performance, or simply using $'.
thank you for your time!


Re: use English; and performance
by duff (Parson) on Mar 02, 2006 at 16:04 UTC

    Both. Read the section on performance in the English module's documentation.

    Basically, using any of $`, $&, or $' anywhere in a program imposes a performance penalty on all regular expression matches. Because the English module makes use of these vars, it too imposes the same performance penalty (unless you do as the docs say to avoid it).
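
    A minimal sketch of avoiding the penalty while still getting the text after a match. It relies on perlvar's statement that $' is the same as substr($var, $+[0]); the sample string and the variable names ($text, $rest) are made up for illustration.

    ```perl
    use strict;
    use warnings;
    use English qw(-no_match_vars);   # long names, but no $PREMATCH/$MATCH/$POSTMATCH

    my $text = 'name = some value';   # hypothetical config line
    my $rest;
    if ($text =~ /\s*=\s*/) {
        # $+[0] is the offset just past the end of the last successful match,
        # so this is the same text $' would hold, without ever mentioning $'
        $rest = substr($text, $+[0]);
    }
    print "$rest\n";
    ```

    Here $rest should end up holding everything after the "=" and its surrounding whitespace.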

      i'm trying to create a string tokenizer for a config file parser and the best that i've managed to think of is this:
      #!/usr/bin/perl
      use strict;
      use Data::Dumper;

      my $line = q[keyword1 value keyword2 "value with spaces" keyword3 value];
      print Dumper tokenize_line($line);

      sub tokenize_line {
          my $line = shift;
          my @tokens;
          while ($line =~ /(\S+)/g) {
              # every non-space match is a token
              push @tokens, $1;
              # anything in double-quotes is a single token
              if ($line =~ /\G\s*"(.+?)"/) {
                  push @tokens, $1;
                  # continue from this last match
                  $line = $';
              }
          }
          return \@tokens;
      }
      which outputs this:
      $VAR1 = [
                'keyword1',
                'value',
                'keyword2',
                'value with spaces',
                'keyword3',
                'value'
              ];
      i know it's an ugly hack, replacing the original string with whatever follows the match ($line = $';), but in my previous attempts i would use split and substr to achieve the same results... and it was very ugly :)
      what would be a better way to write this? i will be parsing some hundred lines from a config file, so i don't think i want a performance penalty. thank you all for your time and advice!


        Use /gc in your speculative match. /c prevents pos() from being reset on match failure.
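
        A rough sketch of what that could look like, not a drop-in replacement: every match is anchored with \G and uses /gc, so a failed attempt leaves pos() where it was and the next alternative is tried from the same spot. The token rules are an assumption (a double-quoted string is one token, any other run of non-space characters is one token), and there is no handling of escaped quotes.

        ```perl
        use strict;
        use warnings;

        sub tokenize_line {
            my ($line) = @_;
            my @tokens;
            while (1) {
                $line =~ /\G\s+/gc;              # skip whitespace; /c keeps pos() if none
                last if $line =~ /\G\z/gc;       # end of string reached
                if ($line =~ /\G"([^"]*)"/gc) {  # speculative match: quoted token
                    push @tokens, $1;
                }
                elsif ($line =~ /\G(\S+)/gc) {   # fallback: plain token
                    push @tokens, $1;
                }
                else {
                    last;                        # defensive; should not be reached
                }
            }
            return \@tokens;
        }

        my $line = q[keyword1 value keyword2 "value with spaces" keyword3 value];
        print join('|', @{ tokenize_line($line) }), "\n";
        ```

        For the sample line this should give the same six tokens as the original, with no $' anywhere and no need to reassign $line.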

        Makeshifts last the longest.

Re: use English; and performance
by monkey_boy (Priest) on Mar 02, 2006 at 16:09 UTC
    so.. to use it without the performance issues.. straight from the docs:
    use English qw(-no_match_vars);

    UPDATE!: removed my over-zealous quoting, cheers ikegami.
    use English -no_match_vars;

    This is not a Signature...

      The unary minus quotes identifiers, so the following is sufficient and cuter:

      use English -no_match_vars;