
Regex not behaving as expected

by Popcorn Dave (Abbot)
on Feb 01, 2003 at 00:19 UTC

Popcorn Dave has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I know I'm probably going to slap my forehead after I get a response for this but...

I've got a very simple regex which I want to just pull sequential digits out of a filename which is passed in - e.g. sfytd13.txt should return 13.

I'm using the following code:

    use strict;
    my $file = 'file013.txt';
    my $sk = $file =~ m/\d+/;
    print $sk;

Very simple, but the problem is that I'm only getting the '1' back. I realize that if I use capturing parentheses $1 will return '013', but I'm curious as to why my code doesn't return 13.

My understanding is that \d+ is going to match a digit followed by any number of digits until I hit a non-digit. What am I missing here? Is something getting clobbered in the assignment?

Thanks in advance!

There is no emoticon for what I'm feeling now.

Replies are listed 'Best First'.
Re: Regex not behaving as expected
by Enlil (Parson) on Feb 01, 2003 at 00:38 UTC
    This works (assuming you want a series of digits, disregarding leading zeros):

        use strict;
        my $file = 'file013.txt';
        my ($sk) = $file =~ m/0*(\d+)/;
        print $sk;

    Your version returns 1 not because of the digit 1 in 013, but because the match succeeded: in scalar context m// returns a true value (i.e., there was a match).
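
A minimal sketch of the difference, reusing the file013.txt name from the question. The variable names $matched, $digits and $trimmed are only for illustration:

```perl
use strict;
use warnings;

my $file = 'file013.txt';

# Scalar assignment: the match runs in scalar context and yields a
# true/false result, not the matched text.
my $matched = $file =~ m/\d+/;

# Parentheses on the left give list context, and the capture group in
# the pattern supplies the value that gets assigned.
my ($digits) = $file =~ m/(\d+)/;

# Enlil's variant: 0* consumes leading zeros before the capture starts.
my ($trimmed) = $file =~ m/0*(\d+)/;

print "scalar context: $matched\n";   # 1
print "list context:   $digits\n";    # 013
print "zeros skipped:  $trimmed\n";   # 13
```

Both halves matter here: the parentheses around $sk change the context, and the parentheses in the pattern decide what comes back.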


      Yes that's exactly what I was after. Thanks for that, I think it's been too long of a day. I can't believe I didn't catch that. :|

      Also thank you for the ($sk) bit. I never realized that you didn't have to use $1, $2, etc... in matching.


Re: Regex not behaving as expected
by Paladin (Vicar) on Feb 01, 2003 at 00:33 UTC
    m// in scalar context returns true if the match succeeds and false if it fails. In list context without /g it returns the list ($1, $2, ...) if you have used () to capture anything. In short, you probably want to use:

        use strict;
        my $file = 'file013.txt';
        my ($sk) = $file =~ m/(\d+)/;  # list context to get what () captured
        print $sk;

    See perldoc perlop for more info on various combinations of m//, /g and what they return in list and scalar context.

    Update: Corrected typo.
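
As a rough illustration of the combinations perlop describes, here is a sketch that runs m// with capture groups in list context, in list context with /g, and in scalar context with /g. The filename log2003-02.txt and the 'a=1,b=2' string are made-up inputs, not anything from the original question:

```perl
use strict;
use warnings;

my $name = 'log2003-02.txt';   # hypothetical filename

# List context, no /g: one element per capture group ($1, $2).
my ($year, $month) = $name =~ m/(\d+)-(\d+)/;

# List context with /g: the captures from every match, flattened.
my @pairs = 'a=1,b=2' =~ m/(\w)=(\d)/g;

# Scalar context with /g: each evaluation resumes after the previous match.
my @found;
while ($name =~ m/(\d+)/g) {
    push @found, $1;
}

print "year=$year month=$month\n";   # year=2003 month=02
print "pairs: @pairs\n";             # pairs: a 1 b 2
print "found: @found\n";             # found: 2003 02
```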

Re: Regex not behaving as expected
by pizza_milkshake (Monk) on Feb 01, 2003 at 12:15 UTC
    Context. Your example has $sk in scalar context. You need list context, i.e. ($sk), along with capturing parentheses in the pattern.

    perl -e'$_=q#: 13_2: 12/"{>: 8_4) (_4: 6/2"-2; 3;-2"\2: 5/7\_/\7: 12m m::#;s#:#\n#g;s#(\D)(\d+)#$1x$2#ge;print'

        Not necessarily. With the /g modifier, if there are no capture groups then the list context result behaves as if there were an implicit group around the entire regex.

        Seeking Green geeks in Minnesota
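
One caveat worth checking for yourself: as far as perlop documents it, the implicit grouping only applies when /g is present. Without /g, a pattern with no capture groups returns the list (1) on success. A small sketch, with illustrative variable names:

```perl
use strict;
use warnings;

my $file = 'file013.txt';

# List context, no capture groups, no /g: the result is the list (1).
my ($plain) = $file =~ m/\d+/;

# List context, no capture groups, with /g: the matched substrings come
# back as if the whole pattern were wrapped in parentheses.
my ($global) = $file =~ m/\d+/g;

# Explicit capture group: returns the captured text without needing /g.
my ($captured) = $file =~ m/(\d+)/;

print "no group, no /g: $plain\n";      # 1
print "no group, /g:    $global\n";     # 013
print "capture group:   $captured\n";   # 013
```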

(elbie): Regex not behaving as expected
by elbie (Curate) on Feb 01, 2003 at 15:44 UTC

    Your solution looks very close to a common trick I use, which is to assign a variable to another variable, then do a regex search and replace on the new one:

        my $file = 'file013.txt';
        (my $sk = $file) =~ s/^.*?(\d+).*$/$1/;

    For this problem that is very similar to what others are proposing, but the method is much more useful when I want to keep most of the string rather than a portion of it.

    For example, to emphasise the word keep above:

        my $boring_string = '...when I want to keep most of the string...';
        # braces as delimiters, since the replacement itself contains a /
        (my $exciting_string = $boring_string) =~ s{(keep)}{<b>$1</b>};
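
For what it's worth, Perl 5.14 and later offer the /r modifier, which returns the modified copy directly and so gives much the same effect as the copy-then-substitute trick. A minimal sketch, assuming a new enough perl; $copied and $returned are illustrative names:

```perl
use 5.014;      # /r needs Perl 5.14 or later; this also enables strict
use warnings;

my $boring_string = '...when I want to keep most of the string...';

# Copy first, then substitute on the copy; the original is untouched.
(my $copied = $boring_string) =~ s{(keep)}{<b>$1</b>};

# /r returns the modified string instead of changing its target.
my $returned = $boring_string =~ s{(keep)}{<b>$1</b>}r;

print "$copied\n";
print "$returned\n";
```

Both print the same line, and $boring_string keeps its original value either way.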


      If you use .* as "everything else", then be sure to use the /s modifier:

          s/^.*?(\d+).*$/$1/s

      Beware that if there is no digit in $file, the substitution fails and $sk won't change! This is bad. It might look like you need to change \d+ to \d*, but that would make it match "" at the beginning, so $1 would be empty, thus erasing the whole string.

      But this problem can be solved. Rewriting the pattern into a more natural (?) form leads to the solution. First, your pattern can be rewritten as

          s/\D*(\d+).*/$1/s

      The anchors are removed, as they're unnecessary. They're unnecessary in your pattern too. Now we can change \d+ to \d*, and $sk will be empty if no number was matched. So the result is

          (my $sk = $file) =~ s/\D*(\d*).*/$1/s;

      But this still isn't fully analogous to

          my ($sk) = $file =~ /(\d+)/;

      since $sk will be the empty string in the former, and the undefined value in the latter. So in extraction situations I often stay away from that trick, and simply use the latter.
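
To make the three outcomes concrete, here is a small sketch comparing the patterns discussed above on a name with digits and one without. The nodigits.txt name and the %result hash are made up for the demonstration:

```perl
use strict;
use warnings;

my %result;
for my $file ('file013.txt', 'nodigits.txt') {
    # Original trick: a failed substitution leaves the copy unchanged.
    (my $kept = $file) =~ s/^.*?(\d+).*$/$1/s;

    # Rewritten pattern: always matches, so no digits gives ''.
    (my $emptied = $file) =~ s/\D*(\d*).*/$1/s;

    # Plain extraction: no digits gives undef.
    my ($extracted) = $file =~ /(\d+)/;

    $result{$file} = [$kept, $emptied, $extracted];

    printf "%-13s kept='%s' emptied='%s' extracted=%s\n",
        $file, $kept, $emptied,
        defined $extracted ? "'$extracted'" : 'undef';
}
```

For file013.txt all three give 013. For nodigits.txt the first leaves the whole filename in place, the second gives the empty string, and the third gives undef, which is the easiest of the three to test for.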

