PerlMonks  

Re^5: Why is "any" slow in this case?

by LanX (Saint)
on Jul 28, 2025 at 14:14 UTC


in reply to Re^4: Why is "any" slow in this case?
in thread Why is "any" slow in this case?

> Wow, thanks. I'll refactor with this, then.

Please test thoroughly; I just hacked the code into my mobile as an example. Also be careful about the numbering of the captures, or use (?:...) to make the group in the negative list non-capturing.
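A minimal sketch of the capture-numbering pitfall mentioned above. The sample line and the excluded values 15 and 16 are made up for illustration and are not taken from the original data:

```perl
use v5.14;
use warnings;

my $line = "17 42";    # hypothetical sample input

# Capturing parentheses inside the lookahead occupy slot 1,
# so the two numbers end up in $2 and $3.
my @shifted = $line =~ /^(?!(15|16)\b)(\d+) (\d+)/;

# With (?:...) the exclusion list takes no slot:
# the numbers stay in $1 and $2.
my @plain = $line =~ /^(?!(?:15|16)\b)(\d+) (\d+)/;

say scalar(@shifted), " values: ", join ", ", map { $_ // "undef" } @shifted;
say scalar(@plain),   " values: ", join ", ", @plain;
```

The first match returns three values, the first of them undef, because the group inside a negative lookahead never holds a value when the overall match succeeds.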

> simply generating a list of few hundred captures is slower,

I can't follow. Since you are using the /g modifier, each iteration will only capture 2 groups and then continue where it left off.

Hence my ( $c, $r ) = ( $data =~ /^(\d+) (\d+)/mg ) should do nicely.

(Haven't tested the performance, but every statement normally counts)

> I'll refactor

You initially said that performance wasn't an issue and you were just curious.

I'd rather recommend going for the clearest code, not the fastest, because in the long run maintenance costs you the most.

Cheers Rolf
(addicted to the Perl Programming Language :)
see Wikisyntax for the Monastery

Re^6: Why is "any" slow in this case?
by ysth (Canon) on Jul 28, 2025 at 16:17 UTC
    Re "a few hundred": the data/code was simplified for this question; I'm guessing the real code does have hundreds.
Re^6: Why is "any" slow in this case?
by Anonymous Monk on Jul 28, 2025 at 20:20 UTC
    > I can't follow, since you are using the /g modifier, each iteration will only capture 2 groups and then continue where it left off

    /g in list context consumes to the end of the string; iterating with "while" is an infinite loop (all but two captures are thrown away each time). Or did I misunderstand completely? OK, never mind.

      > (all but two captures are thrown away each time)

      You are right, I misremembered the /g, /c, pos(), \G mechanisms, which only work in scalar context. Mea culpa.
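A minimal sketch of the difference, using the same kind of input as the examples in this thread. In list context one evaluation returns every capture at once; in scalar context each evaluation finds a single match and resumes at pos(). The variable names are illustrative only:

```perl
use v5.14;
use warnings;

my $str = join " ", 11 .. 19;

# List context: a single evaluation returns the captures of ALL matches.
my @all = $str =~ m/(\d+) (\d+)/g;
say scalar(@all), " captures returned in list context";

# Scalar context: one match per evaluation, continuing from pos($str).
my ( @pairs, @positions );
while ( $str =~ m/(\d+) (\d+)/g ) {
    push @pairs,     "$1 - $2";
    push @positions, pos($str);
}
say "$pairs[$_] (pos is now $positions[$_])" for 0 .. $#pairs;
```

The scalar-context loop terminates on its own, because the failed final match ends the while condition.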

      But you can collect all @matches and post-process them in pairs, like

      use v5.14;
      use warnings;
      say "INPUT: ", my $str = join " ", 11 .. 19;
      my @matches = $str =~ m/(\d+) (\d+)/g;
      while ( my ($c, $r) = splice @matches, 0, 2 ) {
          say "$c - $r";
      }

      INPUT: 11 12 13 14 15 16 17 18 19
      11 - 12
      13 - 14
      15 - 16
      17 - 18

      If you want only two captures per string, you can also use /gc to avoid an infinite loop.

      use v5.14;
      use warnings;
      say "INPUT: ", my $str = join " ", 1 .. 9;
      my $stop = 5;
      while ( my ($c, $r) = ( $str =~ m/(\d+) (\d)/gc ) ) {
          say "$c $r";
          die unless $stop--;
      }

      INPUT: 1 2 3 4 5 6 7 8 9
      1 2
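A small sketch of why /c makes that loop stop, as far as I understand the mechanism: without /c, the failed attempt that ends a list-context /g match resets pos(), so the next evaluation starts over from the beginning. With /c, pos() stays behind the last successful match and the next evaluation finds nothing left. The variable names are illustrative:

```perl
use v5.14;
use warnings;

my $str = join " ", 1 .. 9;

# Without /c: pos() is reset once the list-context match runs out of matches.
my @first     = $str =~ m/(\d+) (\d+)/g;
my $pos_plain = pos($str);

# With /c: pos() stays behind the last successful match ("7 8").
my @second   = $str =~ m/(\d+) (\d+)/gc;
my $pos_kept = pos($str);

# The next /gc evaluation resumes from pos() and only sees " 9".
my @third = $str =~ m/(\d+) (\d+)/gc;

say "without /c: pos is ", $pos_plain // "undef";
say "with /c:    pos is ", $pos_kept  // "undef";
say "captures on the follow-up match: ", scalar @third;
```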

      Cheers Rolf

Re^6: Why is "any" slow in this case?
by LanX (Saint) on Jul 30, 2025 at 22:18 UTC
    > please test thoroughly, I just hacked the code into my mobile as an example ...

    As expected, the first approach was buggy.

    This approach seems to pass the tests (using 1 instead of 0 to test more edge cases).

    Hint: Anchoring is tricky when dealing with variable length strings and look-aheads.

    use v5.14;
    use warnings;
    say "INPUT: ", my $str = join " ", 0 .. 34;
    my %DONT;
    @DONT{ 1, 15, 16, 31 } = ();
    my $stop = 15;
    my $donts = join "|", keys %DONT;
    say my $dont_re = "(?!(?:$donts)\\b)";
    while ( $str =~ m/\b$dont_re(\d+) \b$dont_re(\d+)/g ) {
        say "$1 $2";
        die unless $stop--;
        exists $DONT{$_} and die "$_ forbidden" for $1, $2;
    }

    INPUT: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34
    (?!(?:31|1|15|16)\b)
    2 3      # <-- no 0,1
    4 5
    6 7
    8 9
    10 11
    12 13    # <-- no 14,15,16
    17 18
    19 20
    21 22
    23 24
    25 26
    27 28
    29 30    # <-- no 31
    32 33    # <-- no 34
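To illustrate why the trailing \b inside the lookahead matters, here is a small sketch with a made-up number list. Without the word boundary, excluding 1 also throws out every number that merely starts with 1:

```perl
use v5.14;
use warnings;

my @nums = ( 1, 10, 12, 15, 150, 2 );    # hypothetical test values

# No \b: the lookahead fires on any number that merely STARTS with 1 or 15.
my @loose = grep { /^(?!(?:1|15))\d+$/ } @nums;

# With \b: only the exact numbers 1 and 15 are rejected.
my @strict = grep { /^(?!(?:1|15)\b)\d+$/ } @nums;

say "loose:  @loose";     # loose:  2
say "strict: @strict";    # strict: 10 12 150 2
```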

    Cheers Rolf

    UPDATE

    had to improve the regex again... :/

    A negative lookbehind is probably easier and faster.
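One way the lookbehind idea might look, as an untested-for-speed sketch rather than the version that was actually meant here. The helper name allowed_pairs is hypothetical. It uses one fixed-width lookbehind per forbidden number, so it does not depend on variable-length lookbehind support, and it assumes the forbidden values are plain digit strings (anything else would need quotemeta):

```perl
use v5.14;
use warnings;

# Hypothetical helper: returns "c r" strings for each adjacent pair in $str
# in which neither number appears in @dont.
sub allowed_pairs {
    my ( $str, @dont ) = @_;

    # One fixed-width negative lookbehind per forbidden number. The \b inside
    # each assertion makes "15" reject 15 but not 115.
    my $not_dont = join "", map "(?<!\\b$_)", @dont;

    # The \b after (\d+) keeps the engine from backtracking into a shorter
    # prefix of a forbidden number (e.g. taking "3" out of "31").
    my $num = "\\b(\\d+)\\b$not_dont";

    my @pairs;
    while ( $str =~ m/$num $num/g ) {
        push @pairs, "$1 $2";
    }
    return @pairs;
}

say for allowed_pairs( join( " ", 0 .. 34 ), 1, 15, 16, 31 );
```

On the 0 .. 34 input with the same forbidden numbers it yields the same pairs as the lookahead version; whether it is actually faster would need a benchmark.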
