Beefy Boxes and Bandwidth Generously Provided by pair Networks
XP is just a number
 
PerlMonks  

When greedy constructs do battle, can I choose the winner?

by amarquis (Curate)
on Sep 20, 2007 at 18:15 UTC ( #640202=perlquestion: print w/replies, xml ) Need Help??

amarquis has asked for the wisdom of the Perl Monks concerning the following question:

Context: I'm working on Project Euler problem 160, which asks you for the last 5 digits of a number, sans any trailing zeros.

My question is this: If I have two greedy constructs in my regex, can I make the latter one be more greedy? What seems to be happening now is that the first construct (Which says "Grab at least 1 but at most 5 digits, greedy) is taking everything it can, but I want the second construct (Which says "Grab as many ending zeros as you can") to have precedence. Here's the regex as is:

$running_total =~ s/(\d{1,5})0*$/$1/;

It's not difficult to change that to be "0 to 4 of any digit, one nonzero digit, and then some zeros," which should fix the problem, but it got me thinking. Is there any way to make that 0* beat up the \d{1,5} and take its lunch money?

Replies are listed 'Best First'.
Re: When greedy constructs do battle, can I choose the winner?
by ikegami (Pope) on Sep 20, 2007 at 18:55 UTC

    Yes, by placing it first.

    my ($last_five) = map scalar reverse, map /^(?>0*)(\d{1,5})/, map scalar reverse, $running_total;

    But note that I had to add (?>...) in order to prevent backtracking in the situation where the input is "0000". It's simpler to make sure the last of 5 isn't a 0.

    my ($last_five) = $running_total =~ /(\d{0,4}[^0\D])0*$/;

    or

    my ($last_five) = $running_total =~ /(\d{1,5}(?<!0))0*$/;

    In all of the above solutions, $last_five will be undef if there's no match.

    Tested.

Re: When greedy constructs do battle, can I choose the winner?
by duff (Parson) on Sep 20, 2007 at 18:27 UTC

    It sounds like you just want to make {1,5} not greedy to me.

    $running_total =~ s/(\d{1,5}?)0*$/$1/;
    Is this what you want?

    update: Looking at your question again, perhaps your problem is also that you're doing a substitution (which I just parroted) and you really want to do something more like this:

    $running_total = $1 if $running_total =~ /(\d{1,5}?)0*$/;

      For a number such as 243210890000, the non greedy {1,5} grabs the least it can, the 9, correct? I do want the {1,5} to grab as many as it can, up to 5 digits, but I don't want it grabbing any of the trailing zeros. Examples of what I want:

      Input $1 28400300 84003 100 1

      As it stands, if the input is "10" the {1,5} grabs itself some zeros, and I want those zeros to go into the 0* (I don't want 'em captured).

      I've rewritten the regex so that it matches "(zero to four digits, one non-zero digit) and then the 0*" but the issue got me curious to see if there was some way you could make the right-most greedy construct "beat out" one to its left.

        For a number such as 243210890000, the non greedy {1,5} grabs the least it can, the 9, correct?

        Incorrect. The string is processed left to right, so the non-greedy {1,5} will match one character then try to match zeroes to the end of the string. Since this fails the overall match, the RE engine will backtrack and try two characters, then three and four and five. When the five character version fails, backtracking will cause the starting point of the {1,5} match to move over one character and try again. This continues until either there's a match or the RE engine determines that it can't match.

Re: When greedy constructs do battle, can I choose the winner?
by johngg (Canon) on Sep 20, 2007 at 18:29 UTC
    You could use a look-ahead.

    use strict; use warnings; my @nos = qw{ 12304500 123 1234567000000 987045027000 }; print map { sprintf qq{%15s : %5s\n}, @$_ } map { [ $_, m{(\d{1,5})(?=0*\z)} ] } @nos;

    Here's the output.

    12304500 : 23045 123 : 123 1234567000000 : 34567 987045027000 : 45027

    I hope this is of use.

    Cheers,

    JohnGG

    Update: No, that doesn't work unless you make the \d{1,5} non-greedy, as suggested by duff

Re: When greedy constructs do battle, can I choose the winner?
by andreas1234567 (Vicar) on Sep 20, 2007 at 18:41 UTC
    perlretut says:
    Principle 3: If there are two or more elements in a regexp, the leftmost greedy quantifier, if any, will match as much of the string as possible while still allowing the whole regexp to match. The next leftmost greedy quantifier, if any, will try to match as much of the string remaining available to it as possible, while still allowing the whole regexp to match. And so on, until all the regexp elements are satisfied.
    --
    Andreas

      What I had figured, after reading that in the docs, was "Welp, you can do just about anything in Perl just about any way you want to, I wonder if I can break that behavior somehow?"

      While we're on the subject, let me run my interpretation of what's going on "behind the scenes" with the engine by y'all, to see if I understand it correctly. Let's take the simple pattern:

      "444440000" =~ /(/d*)(0{3,10})/

      So, the first thing it tries to do is give the /d* as much as possible, so the /d* is trying to match the whole string. Then there aren't enough zeros to match the second part, so it gives up and backs up, trying to put one less character into (/d*). It continues to fail and back up one character less until it comes out with a match.

      So in the above example, it tries things out like the following?

      /d* 0{3,10} 444440000 nothing 44444000 0 4444400 00 444440 000 (Success!)

        Yep. That's it. (Aside from your slash leaning the wrong direction).

Re: When greedy constructs do battle, can I choose the winner?
by RMGir (Prior) on Sep 20, 2007 at 18:29 UTC
    While I like the idea of bullying regexes battling over lunch money, I don't know how you can pull that off...

    But what I'd suggest is just doing

    $running_total =~ s/(\d{0,4}[1-9])0*$/$1/;
    I think that should get you as many digits as possible up to 5 before the trailing 0's, without anyone's lunch money being at risk. :)

    Mike

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://640202]
Approved by kyle
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others meditating upon the Monastery: (4)
As of 2021-11-27 12:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found

    Notices?