PerlMonks  

Some odd ambiguity in this regex

by misterperl (Pilgrim)
on Jun 05, 2020 at 15:10 UTC (#11117718=perlquestion)

misterperl has asked for the wisdom of the Perl Monks concerning the following question:

Perl 5.16, I'm trying to count digits in a scalar
  DB<34> x $_
0  '1223w3433.45+34'

...the expression inexplicably drops the 1st digit (the 1) the first time I evaluate the regex:

  DB<30> x /\d\D*/g
0  2
1  2
2  '3w'
3  3
4  4
5  3
6  3.
7  4
8  '5+'
9  3
10  4

But after that first time, the 1 is back? I changed nothing- just examined it again...

  DB<31> x /\d\D*/g
0  1
1  2
2  2
3  '3w'
4  3
5  4
6  3
7  3.
8  4
9  '5+'
10  3
11  4
  DB<32> x /\d\D*/g
0  1
1  2
2  2
3  '3w'
4  3
5  4
6  3
7  3.
8  4
9  '5+'
10  3
11  4
  DB<33> x /\d\D*/g
0  1
1  2
2  2
3  '3w'
4  3
5  4
6  3
7  3.
8  4
9  '5+'
10  3
11  4
  DB<34> x $_
0  '1223w3433.45+34'

I changed the dot and plus to non-interpolated chars like W and Z, and it finds all the digits consistently. So I'm thinking it's some kind of interpolation thing, BUT why would it interpolate differently on trial 1 than on subsequent trials? And if it IS some interpolation thing, why remove the first digit when the interpolated chars are in the middle?

It's perplexing! I also tried /{expression}/ and /\Qexpression\E/, and using $x instead of $_; none of these worked.

How can I turn off interpolation, assuming that's the issue? I suppose one *fix* is to simply do the same evaluation 2X, but I can't imagine that's what Larry would suggest!

TYVM

Replies are listed 'Best First'.
Re: Some odd ambiguity in this regex
by hippo (Bishop) on Jun 05, 2020 at 15:25 UTC

    Your DB counters are in the 30s (and skip backwards). My guess is it's something in the steps you are not showing us. Here's my SSCCE:

    use strict;
    use warnings;
    use Test::More tests => 2;

    my @matches;
    $_ = '1223w3433.45+34';
    @matches = (/\d\D*/g);
    is $matches[0], 1, 'First time';
    @matches = (/\d\D*/g);
    is $matches[0], 1, 'Second time';

    Which passes, as you would expect.

Re: Some odd ambiguity in this regex
by johngg (Canon) on Jun 05, 2020 at 17:09 UTC
    I'm trying to count digits in a scalar

    Perhaps you could use tr///, see Quote Like Operators.

    johngg@shiraz:~/perl/Monks$ perl -E '$_ = q{1223w3433.45+34}; say tr{0-9}{};'
    12

    I hope this is of interest.

    Cheers,

    JohnGG

      ... I'm trying to count digits in a scalar

      That's what the OP says, but the  /\d\D*/g regex suggests misterperl is trying to count occurrences of some kind of digit-group pattern. If that's the case, I can, offhand, think of some  s/// approaches (in the vein of a  tr/// operation) that would do the trick as well as the  m// approach that misterperl seems to favor:

      c:\@Work\Perl\monks>perl -wMstrict -le
      "$_ = '+++1223w3433.45+34***';
       ;;
       my $ndg;
       ;;
       $ndg =()= /\d\D*/g;
       print qq{A: $ndg digit grps.; m// change unpossible '$_'};
       ;;
       $ndg = do { (my $r = $_) =~ s/\d\D*//g };
       print qq{B: $ndg digit grps.; s/// string unchanged '$_'};
       ;;
       $ndg = s/(\d\D*)/$1/g;
       print qq{C: $ndg digit grps.; s/// string unchanged '$_'};
       ;;
       $ndg = s/\d\D*//g;
       print qq{D: $ndg digit grps.; s/// string CHANGED '$_'};
       "
      A: 12 digit grps.; m// change unpossible '+++1223w3433.45+34***'
      B: 12 digit grps.; s/// string unchanged '+++1223w3433.45+34***'
      C: 12 digit grps.; s/// string unchanged '+++1223w3433.45+34***'
      D: 12 digit grps.; s/// string CHANGED '+++'

      My own preference would be for the  m// approach.


      Give a man a fish:  <%-{-{-{-<

Re: Some odd ambiguity in this regex
by Paladin (Vicar) on Jun 05, 2020 at 15:29 UTC
    What version of 5.16 are you using, and what OS? It worked fine for me.
    [~]$ perl -v

    This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux
    (with 1 registered patch, see perl -V for more detail)

    Copyright 1987-2012, Larry Wall

    Perl may be copied only under the terms of either the Artistic License or the
    GNU General Public License, which may be found in the Perl 5 source kit.

    Complete documentation for Perl, including FAQ lists, should be found on
    this system using "man perl" or "perldoc perl". If you have access to the
    Internet, point your browser at http://www.perl.org/, the Perl Home Page.

    [~]$ perl -de1

    Loading DB routines from perl5db.pl version 1.37
    Editor support available.

    Enter h or 'h h' for help, or 'man perldebug' for more help.

    main::(-e:1):   1
      DB<1> $_ = '1223w3433.45+34'

      DB<2> x /\d\D*/g
    0  1
    1  2
    2  2
    3  '3w'
    4  3
    5  4
    6  3
    7  3.
    8  4
    9  '5+'
    10  3
    11  4
      DB<3>

      This is perl 5, version 16, subversion 3 (v5.16.3) built for x86_64-linux-thread-multi (with 33 registered patches, see perl -V for more detail)
        Ok. Can you run the exact same commands I did above and see if it has the same or different output?
Re: Some odd ambiguity in this regex (updated)
by AnomalousMonk (Bishop) on Jun 05, 2020 at 16:17 UTC

    Further to hippo's post:

    ... something in the steps you are not showing us.
    misterperl:   It's easy to contrive that scenario entirely separate from debug mode:
    c:\@Work\Perl\monks>perl -wMstrict -le
    "$_ = '1223w3433.45+34';
     ;;
     /\d\D*/g;
     printf qq{'$&' } while /\d\D*/g;
     "
    '2' '2' '3w' '3' '4' '3' '3.' '4' '5+' '3' '4'

    m//g in scalar or void context "remembers" the point at which a previous match ended and continues matching from that point. See pos.

    Update: See what happens if the side-effect-producing, void context
        /\d\D*/g;
    statement is changed to
        () = /\d\D*/g;
    (i.e., if list context is imposed).
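
The effect described above can be reproduced outside the debugger. What follows is only a minimal sketch: the stray void-context match stands in for whatever earlier step might have left a position on $_, which is an assumption, since the original session before DB<30> is not shown. The variable names are illustrative.

```perl
use strict;
use warnings;

$_ = '1223w3433.45+34';

# A scalar- or void-context m//g matches once and leaves pos($_) set.
/\d\D*/g;
my $pos_after_stray = pos($_);    # offset just past the leading '1'

# A list-context m//g starts from pos($_), so the leading '1' is skipped.
my @first = /\d\D*/g;

# The list-context match ran until it failed at the end of the string,
# which reset pos($_); the same expression now starts at offset 0 again.
my @second = /\d\D*/g;

printf "pos after stray match: %d\n", $pos_after_stray;
printf "first:  %d matches, starting with '%s'\n", scalar @first,  $first[0];
printf "second: %d matches, starting with '%s'\n", scalar @second, $second[0];
```

The first list has 11 elements beginning with '2' and the second has 12 beginning with '1', which is the same shape as the DB<30> and DB<31> output in the question.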


    Give a man a fish:  <%-{-{-{-<

      It's often the result of using the nonsensical if (/.../g)

      $ perl -e'
         $_ = "abc";
         if (/\w/g) {
            CORE::say for /(\w)/g;
         }
      '
      b
      c

      I don't think I "contrived" anything. I did "R" and the first line was:
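
As a rough side-by-side sketch of that point (the array names are made up for illustration), dropping the /g from the condition leaves pos($_) untouched, so the list-context match inside the block sees the whole string:

```perl
use strict;
use warnings;

$_ = 'abc';

my @with_g;
if (/\w/g) {                 # /g in scalar context advances pos($_) to 1
    @with_g = /(\w)/g;       # so this list match starts after the 'a'
}

pos($_) = undef;             # start clean for the comparison

my @without_g;
if (/\w/) {                  # no /g: pos($_) is left alone
    @without_g = /(\w)/g;    # all three characters are returned
}

print "with /g in the test:    @with_g\n";
print "without /g in the test: @without_g\n";
```

The first line lists b and c; the second lists a, b and c.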
      $_= '1223w3433.45+34';
      then I did the "x". If the debugger works as you suggest, then it's not working right. It should interpret the expression correctly; if something isn't reset (which doesn't make sense, since this happens on line 2) then it needs to get reset.

      Larry?

      But, you're correct it IS a debugger thing because the interpretation is correct outside dbg...

        Please apply this single line and show us the result:

          DB<5> x $_='12a3b'; /\d\D*/g
        0  1
        1  '2a'
        2  '3b'
          DB<6>

        > But, you're correct it IS a debugger thing

        Very unlikely.

        And you've already been quick to blame Perl in the past.

        Edit

        We need a way to reproduce your problem before we go hunting for a hypothetical bug.

        So please reproduce, and tell us the versions of perl and the debugger.

        Cheers Rolf
        (addicted to the Perl Programming Language :)
        Wikisyntax for the Monastery

Re: Some odd ambiguity in this regex
by LanX (Sage) on Jun 05, 2020 at 16:31 UTC
    I can't reproduce your problem

    $ perl -de0

    Loading DB routines from perl5db.pl version 1.53
    ...
      DB<1> $_='1223w3433.45+34'

      DB<2> x /\d\D*/g
    0  1
    1  2
    2  2
    3  '3w'
    4  3
    5  4
    6  3
    7  3.
    8  4
    9  '5+'
    10  3
    11  4
      DB<3> p $]
    5.028002
      DB<4>

    I concur with the others that you most likely changed pos in a previous step.
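
If a leftover match position is indeed the cause, one way to guard against it is to clear pos explicitly before a list-context match. This is only a sketch under that assumption: the stray /\d/g below is a stand-in for the unknown earlier step, and the variable names are illustrative.

```perl
use strict;
use warnings;

$_ = '1223w3433.45+34';

/\d/g;                        # stand-in for a forgotten earlier step
my $stale = pos($_);          # defined, so a position is pending

pos($_) = undef;              # discard the remembered position
my @groups = /\d\D*/g;        # now starts at the beginning of the string

my $count = () = /\d\D*/g;    # count the matches without keeping them

print "stale pos was $stale\n";
print "groups: @groups\n";
print "count:  $count\n";
```

In a debugger session, inspecting pos($_) before the first x command would presumably show whether a position was pending.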

    Cheers Rolf
    (addicted to the Perl Programming Language :)
    Wikisyntax for the Monastery

Node Type: perlquestion [id://11117718]
Approved by Paladin
Front-paged by haukex