
Re^7: Speeding up named capture buffer access

by JadeNB (Chaplain)
on Dec 01, 2009 at 19:34 UTC (#810441)

in reply to Re^6: Speeding up named capture buffer access
in thread Speeding up named capture buffer access

As an example, in one place in Date::Manip, I match a set of related regular expressions that match various date strings, and there are 23 different possibilities containing 65 different matches between them (NOT all in the same order)
You've mentioned several times the need to work around the fact that you don't know which of many alternatives matched. Would it be possible, instead of
    $string =~ /$re1|$re2/ and ( $h, $m, $s ) = ...
to do
    $string =~ $re1 and ( $h, $m, $s ) = ... or $string =~ $re2 and ( $h, $m, $s ) = ...
and just have to worry about the capture order for each individual regex (rather than trying to find one order that works for all regexes)? Or does that also fall afoul of the maintainability requirement? Note that with this approach, introducing one new regex involves one simple counting problem, rather than one big counting problem that could interfere with all the old counts.
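The suggestion above could be sketched roughly as follows. The two patterns, the input strings, and the `parse_time` helper are hypothetical stand-ins (not Date::Manip's actual regexes); the point is only that each branch handles its own capture order.

```perl
use strict;
use warnings;

# Hypothetical patterns, each with its own capture order.
my $re1 = qr/^(\d\d):(\d\d):(\d\d)$/;        # hours, minutes, seconds
my $re2 = qr/^(\d\d)s (\d\d)m (\d\d)h$/;     # seconds, minutes, hours

# Try each regex in turn; only the matching branch needs to know
# which numbered capture holds which field.
sub parse_time {
    my ($string) = @_;
    my ( $h, $m, $s );
    if ( $string =~ $re1 ) {
        ( $h, $m, $s ) = ( $1, $2, $3 );
    }
    elsif ( $string =~ $re2 ) {
        ( $h, $m, $s ) = ( $3, $2, $1 );
    }
    else {
        return;    # no pattern matched
    }
    return ( $h, $m, $s );
}

my @hms = parse_time('12:34:56');
print "@hms\n";    # prints "12 34 56"
```

Adding a third format here means adding one branch and counting the captures of that one regex only.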

Re^8: Speeding up named capture buffer access
by SBECK (Chaplain) on Dec 01, 2009 at 20:18 UTC

    That's how I had it originally... and when you've got 23 different possibilities, it adds unnecessary complexity. There are already 23 possibilities wherever I create the regular expressions, but now there are 23 possibilities wherever I use them as well.

    Worse is that some of the regular expressions are used in multiple places. When I modify a regular expression, I'd like to do it in one place (wherever the regexp is created) and not have to worry about it in some other place or places (wherever it's used). As it stands now, I can add new ways to express a date in one place (the routine where I create all my regexps), and they'll automatically go into effect in the various places they might be used.
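As a rough illustration of that "change it in one place" property, here is a minimal sketch using named captures, assuming Perl 5.10 or later. The builder `build_time_re`, the formats, and `parse_time` are hypothetical, not the Date::Manip code. Whether the `%+` lookups are fast enough is a separate question.

```perl
use strict;
use warnings;
use 5.010;    # named captures and %+ need Perl 5.10 or later

# Hypothetical builder: the only place that knows the individual formats.
# A new format is added here and nowhere else.
sub build_time_re {
    my @formats = (
        qr/(?<h>\d\d):(?<m>\d\d):(?<s>\d\d)/,
        qr/(?<s>\d\d)s (?<m>\d\d)m (?<h>\d\d)h/,
    );
    my $alt = join '|', @formats;
    return qr/^(?:$alt)$/;
}

# Hypothetical consumer: it never needs to know which alternative matched,
# because it asks for the fields by name rather than by position.
sub parse_time {
    my ( $string, $re ) = @_;
    return unless $string =~ $re;
    my ( $h, $m, $s ) = ( $+{h}, $+{m}, $+{s} );
    return ( $h, $m, $s );
}

my $time_re = build_time_re();
print join( ' ', parse_time( '56s 34m 12h', $time_re ) ), "\n";    # prints "12 34 56"
```

When the same name appears in several alternatives, `$+{name}` refers to the leftmost group of that name that actually took part in the match, which is what lets the consumer stay ignorant of the alternatives.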

    Not a big problem of course... but I'm a huge fan of Larry's principle of laziness.

      If you are willing to trade a few globals to obtain speed, then a dispatch table might allow for reasonable maintenance:

      my( $hrs, $mins, $secs, $day, $month, $year, ... );
      my %res = (
          qr[(\d\d):(\d\d):(\d\d)]          => sub{ ($hrs,$mins,$secs) = ($1,$2,$3) },
          qr[(\d\d)/(\d\d)/(\d\d(?:\d\d)*)] => sub{ ($day,$month,$year) = ($1,$2,$3) },
          ...,
          '' => sub{ die "Unknown date format" },
      );
      ...
      $maybeTime =~ $_ and $res{ $_ }->() for keys %res;
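A runnable variation on the dispatch-table idea, offered as a sketch rather than as the code above made literal. It swaps the hash for an ordered array of pattern/handler pairs, for three reasons: hash keys are stored as strings, so the compiled regex objects are lost; `keys` returns them in no particular order; and an empty pattern reuses the last successfully matched regex rather than acting as a catch-all. The `@dispatch` and `parse_date` names are hypothetical, and the captures are passed to the handlers as arguments.

```perl
use strict;
use warnings;

my ( $hrs, $mins, $secs, $day, $month, $year );

# Ordered list of [ pattern, handler ] pairs; the first match wins.
my @dispatch = (
    [ qr[(\d\d):(\d\d):(\d\d)],          sub { ( $hrs, $mins, $secs ) = @_ } ],
    [ qr[(\d\d)/(\d\d)/(\d\d(?:\d\d)*)], sub { ( $day, $month, $year ) = @_ } ],
);

sub parse_date {
    my ($string) = @_;
    for my $entry (@dispatch) {
        my ( $re, $handler ) = @$entry;
        # In list context a successful match returns the captured groups.
        if ( my @captures = $string =~ $re ) {
            $handler->(@captures);
            return 1;
        }
    }
    # Reached only when no pattern matched.
    die "Unknown date format: $string\n";
}

parse_date('12:34:56');
parse_date('01/12/2009');
print "$hrs:$mins:$secs $day/$month/$year\n";    # prints "12:34:56 01/12/2009"
```

Adding a format is still a one-line change to the table, and the fallback lives in one place after the loop.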

      Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
      "Science is about questioning the status quo. Questioning authority".
      In the absence of evidence, opinion is indistinguishable from prejudice.
