Beefy Boxes and Bandwidth Generously Provided by pair Networks
Syntactic Confectionery Delight
 
PerlMonks  

Re: Regex matching on grid alignment

by rjt (Curate)
on Sep 08, 2013 at 23:16 UTC ( [id://1052937]=note: print w/replies, xml ) Need Help??


in reply to Regex matching on grid alignment

I think you'll find this will do the trick:

$rem = $W - 3; qr/^ (?:.{$W})* .{0,$rem} (.)\1\1/x

The idea here is to minimize the branching the RE engine will have to do. The logic is pretty similar to what you might do if you had the string split into lines; just skip 0..$H1 rows, and we know we're at the beginning of a row, so from there we just match 0..$W-3 characters followed by a repeating sequence of 3 with your original regex.

Performance is the same (a few % better actually) as the plain /(.)\1\1/, and several times faster than anything I tried with unpack or split.

Edit: You can get another ~25% or so if your character set really is small like [ABCD] by unrolling (.)\1\1 into (?:AAA|BBB|CCC|DDD). If you're not just using this as a boolean test and still need the character in $1, use (AAA|BBB|CCC|DDD) instead and use substr($1,0,1) to grab the first character if you get a match. The idea here is to push the more expensive operations out of the hot loop that's called millions of times.

___________
1. $H-1, actually, but it makes no measurable difference to the efficiency, or correctness. Edit: Changed {0,$H} to *, to shave a few keystrokes. Thanks LanX!

use strict; use warnings; omitted for brevity.

Replies are listed 'Best First'.
Re^2: Regex matching on grid alignment
by LanX (Saint) on Sep 08, 2013 at 23:37 UTC
    > 1. $H-1, actually, but it makes no measurable difference to the efficiency, or correctness.

    so why not just * instead of {0,$H} ?

    Cheers Rolf

    ( addicted to the Perl Programming Language)

      Because {0,$H} is a disfigured emoticon in need of love just like anyone else. Emoticons with physical defects are people, too, you know...

      (Other than that, no, no particular reason, although I'm sure it made perfect sense to me at the time.)

Re^2: Regex matching on grid alignment
by Anonymous Monk on Sep 09, 2013 at 00:16 UTC

    Many thanks rjt! If it performs as well as /(.)\1\1/ like you say, it will be more than fast enough.

Re^2: Regex matching on grid alignment
by Anonymous Monk on Sep 08, 2013 at 23:34 UTC

    Performance is

    A mystery :)

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: note [id://1052937]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others admiring the Monastery: (7)
As of 2024-04-16 07:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found